This is an archive of the discontinued Mercurial Phabricator instance.

revlog: use named attributes on revlog index entries
AcceptedPublic

Authored by indygreg on Dec 26 2017, 7:36 PM.

Details

Reviewers
durin42
Group Reviewers
hg-reviewers
Summary

Now that we're using a dedicated type for index entries everywhere,
we can access fields in them by name instead of integer offset.

This change has a significant impact on performance on the Firefox
repository. (Values are from before refactor, parent commit, this
commit.)

$ HGMODULEPOLICY=c hg perfrevset '::tip'
! wall 0.755869 comb 0.750000 user 0.750000 sys 0.000000 (best of 13)
! wall 1.011107 comb 1.010000 user 1.010000 sys 0.000000 (best of 10)
! wall 0.864906 comb 0.860000 user 0.860000 sys 0.000000 (best of 11)

$ HGMODULEPOLICY=py hg perfrevset '::tip'
! wall 1.366281 comb 1.360000 user 1.360000 sys 0.000000 (best of 7)
! wall 1.860068 comb 1.860000 user 1.860000 sys 0.000000 (best of 5)
! wall 1.672432 comb 1.680000 user 1.670000 sys 0.010000 (best of 6)

And with PyPy:

! wall 0.163141 comb 0.160000 user 0.160000 sys 0.000000 (best of 56)
! wall 0.154325 comb 0.160000 user 0.160000 sys 0.000000 (best of 54)
! wall 0.155534 comb 0.160000 user 0.160000 sys 0.000000 (best of 53)

The C extension is ~15% slower than before the refactor. This is a
bit unfortunate. That is a bit steep price to pay for more readable
code and the establishment of a formal interface for index entries.

However, all is not lost! Because we're now using a custom type and
conforming to a yet-to-be-formally-defined interface, there's nothing
preventing us from implementing a backed-by-C custom type for revlog
entries. This type could lazily resolve PyObject instances, which
should result in a massive performance boost for operations that
don't need to access all fields of the index entry. This is easier
done *after* we drop all uses of tuples and their API for referring
to index entries.

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

indygreg created this revision.Dec 26 2017, 7:36 PM
yuja added a subscriber: yuja.Dec 28 2017, 9:04 AM

Because we're now using a custom type and conforming to
a yet-to-be-formally-defined interface, there's nothing
preventing us from implementing a backed-by-C custom type
for revlog entries.

Not reviewed the series yet, but can't we just start with a custom type
backed by C so we don't have to import back IndexV1Entry from Python?

In D1773#30283, @yuja wrote:

Not reviewed the series yet, but can't we just start with a custom type
backed by C so we don't have to import back IndexV1Entry from Python?

That's possible. I do have a work-in-progress patch to the C code to implement such a type.

However, it is quite large and I suspect it will take a bit more effort to finish it. The end result of this series gets us into a state where the C type can be swapped in with minimal effort. I have a soft preference for implementing the type in C as a follow-up. But I can refactor the series if wanted. Or someone can review this series without accepting it and I'll add the C type later.

yuja added a comment.Jan 5 2018, 12:43 AM

However, it is quite large and I suspect it will take a bit more effort to finish it.

Isn't that something like dirstateTupleType? If we had a native type, we wouldn't
need D1767, D1768, and probably D1769. If it can get rid of refcounting business,
reviewing this series will be much fun.

indygreg edited the summary of this revision. (Show Details)Jan 6 2018, 3:26 AM
In D1773#30628, @yuja wrote:

However, it is quite large and I suspect it will take a bit more effort to finish it.

Isn't that something like dirstateTupleType? If we had a native type, we wouldn't
need D1767, D1768, and probably D1769. If it can get rid of refcounting business,
reviewing this series will be much fun.

I don't feel strongly about the approach. My bias is to land this whole series as-is. The "lazy struct" should be fairly easily done as a followup.

The refcounting of the type looks okay to me.

durin42 accepted this revision.Jan 10 2018, 5:54 PM
This revision is now accepted and ready to land.Jan 10 2018, 5:54 PM