bundle2: inline struct operations
ClosedPublic

Authored by indygreg on Nov 14 2017, 1:26 AM.

Details

Summary

Before, we were calling struct.unpack() (via an alias) on every
loop iteration. I'm not sure what Python does under the hood, but
it would have to look at the struct format and determine what to
do.

This commit establishes a struct.Struct instance and reuses it for
struct reading.
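
As a rough illustration of the pattern (not the actual bundle2 code), the sketch below reads a stream of fixed-size headers two ways: once calling struct.unpack() with a format string on every iteration, and once through a precompiled struct.Struct instance. The format '>i' and the function names are assumptions made up for this example. CPython's documentation notes that the module-level functions cache recently compiled formats, so the per-call cost being avoided is probably the cache lookup and argument handling rather than a full re-parse of the format.

```python
import io
import struct

# Hypothetical header format for this example: one big-endian signed 32-bit int.
HEADER_FORMAT = '>i'


def read_sizes_unpack(fh):
    """Pass the format string to struct.unpack() on every loop iteration."""
    size = struct.calcsize(HEADER_FORMAT)
    sizes = []
    while True:
        data = fh.read(size)
        if len(data) < size:
            break  # end of stream (or a truncated trailing header)
        sizes.append(struct.unpack(HEADER_FORMAT, data)[0])
    return sizes


def read_sizes_struct(fh):
    """Compile the format once and reuse the Struct instance in the loop."""
    header = struct.Struct(HEADER_FORMAT)
    unpack = header.unpack  # bound method, looked up once outside the loop
    size = header.size
    sizes = []
    while True:
        data = fh.read(size)
        if len(data) < size:
            break
        sizes.append(unpack(data)[0])
    return sizes
```

Both functions return the same values for the same input; only the per-iteration overhead differs, and how much that matters depends on how hot the loop is.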

We can see the impact from running hg perfbundleread on a Firefox
bundle:

! read(8k)
! wall 0.679730 comb 0.680000 user 0.140000 sys 0.540000 (best of 15)
! read(16k)
! wall 0.577228 comb 0.570000 user 0.080000 sys 0.490000 (best of 17)
! read(32k)
! wall 0.516060 comb 0.520000 user 0.040000 sys 0.480000 (best of 20)
! read(128k)
! wall 0.496378 comb 0.490000 user 0.010000 sys 0.480000 (best of 20)
! bundle2 iterparts()
! wall 3.056811 comb 3.050000 user 2.340000 sys 0.710000 (best of 4)
! wall 2.992605 comb 2.990000 user 2.260000 sys 0.730000 (best of 4)
! bundle2 iterparts() seekable
! wall 4.007676 comb 4.000000 user 3.170000 sys 0.830000 (best of 3)
! wall 3.863810 comb 3.860000 user 3.000000 sys 0.860000 (best of 3)
! bundle2 part seek()
! wall 6.267110 comb 6.250000 user 3.480000 sys 2.770000 (best of 3)
! wall 6.213387 comb 6.200000 user 3.350000 sys 2.850000 (best of 3)
! bundle2 part read(8k)
! wall 3.404164 comb 3.400000 user 2.650000 sys 0.750000 (best of 3)
! wall 3.241099 comb 3.250000 user 2.560000 sys 0.690000 (best of 3)
! bundle2 part read(16k)
! wall 3.197972 comb 3.200000 user 2.490000 sys 0.710000 (best of 4)
! wall 3.003930 comb 3.000000 user 2.270000 sys 0.730000 (best of 4)
! bundle2 part read(32k)
! wall 3.060557 comb 3.060000 user 2.340000 sys 0.720000 (best of 4)
! wall 2.904695 comb 2.900000 user 2.160000 sys 0.740000 (best of 4)
! bundle2 part read(128k)
! wall 2.952209 comb 2.950000 user 2.230000 sys 0.720000 (best of 4)
! wall 2.776140 comb 2.780000 user 2.070000 sys 0.710000 (best of 4)

Profiling now says most remaining time is spent in util.chunkbuffer.
I already heavily optimized that data structure several releases ago.
So we're unlikely to get much more performance out of bundle2 reading
as long as it keeps using util.chunkbuffer.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

indygreg created this revision. Nov 14 2017, 1:26 AM

FWIW, when I last looked at adding stream clones to bundle2, I held off because of the performance overhead. With the performance work in this series, I suspect bundle2's I/O is fast enough to support stream clones with minimal performance degradation. I may have a go at that once this series is queued because it is something I've always wanted to do.

durin42 accepted this revision. Nov 20 2017, 6:42 PM
This revision is now accepted and ready to land. Nov 20 2017, 6:42 PM

This series is great stuff.

This revision was automatically updated to reflect the committed changes.