Profiling revealed that repeated calls to indebug() were
consuming a fair amount of CPU during bundle2 reading, with
most of the time spent in ui.configbool().

Inlining indebug() and avoiding extra attribute lookups speeds
things up substantially. Using hg perfbundleread with a Firefox
bundle (where an entry lists two timings, the first is before
this change and the second is after):

! read(8k)
! wall 0.679730 comb 0.680000 user 0.140000 sys 0.540000 (best of 15)
! read(16k)
! wall 0.577228 comb 0.570000 user 0.080000 sys 0.490000 (best of 17)
! read(32k)
! wall 0.516060 comb 0.520000 user 0.040000 sys 0.480000 (best of 20)
! read(128k)
! wall 0.496378 comb 0.490000 user 0.010000 sys 0.480000 (best of 20)
! bundle2 iterparts()
! wall 6.983756 comb 6.980000 user 6.220000 sys 0.760000 (best of 3)
! wall 3.460903 comb 3.460000 user 2.760000 sys 0.700000 (best of 3)
! bundle2 iterparts() seekable
! wall 8.132131 comb 8.110000 user 7.160000 sys 0.950000 (best of 3)
! wall 4.312722 comb 4.310000 user 3.480000 sys 0.830000 (best of 3)
! bundle2 part seek()
! wall 10.860942 comb 10.840000 user 7.790000 sys 3.050000 (best of 3)
! wall 6.754764 comb 6.740000 user 3.970000 sys 2.770000 (best of 3)
! bundle2 part read(8k)
! wall 7.258035 comb 7.260000 user 6.470000 sys 0.790000 (best of 3)
! wall 3.668004 comb 3.660000 user 2.960000 sys 0.700000 (best of 3)
! bundle2 part read(16k)
! wall 7.099891 comb 7.080000 user 6.310000 sys 0.770000 (best of 3)
! wall 3.489196 comb 3.480000 user 2.750000 sys 0.730000 (best of 3)
! bundle2 part read(32k)
! wall 6.964685 comb 6.950000 user 6.130000 sys 0.820000 (best of 3)
! wall 3.388569 comb 3.380000 user 2.640000 sys 0.740000 (best of 3)
! bundle2 part read(128k)
! wall 6.852867 comb 6.850000 user 6.060000 sys 0.790000 (best of 3)
! wall 3.276415 comb 3.270000 user 2.560000 sys 0.710000 (best of 4)
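
The inlining described above can be illustrated with a minimal,
self-contained sketch. This is not the actual bundle2 code: FakeUI,
read_chunks_slow() and read_chunks_fast() are hypothetical names made
up for illustration, and the ("devel", "bundle2.debug") config key is
assumed here as the switch that the debug helper checks. The idea is
only to show the config lookup and the ui.debug attribute lookup being
hoisted out of a hot loop.

```python
class FakeUI:
    """Hypothetical stand-in for a ui object (not Mercurial's real class)."""

    def __init__(self, debug_enabled=False):
        # Assumed config key; used here purely for illustration.
        self._config = {("devel", "bundle2.debug"): debug_enabled}
        self.lookups = 0      # counts configbool() calls
        self.messages = []    # collects debug output

    def configbool(self, section, name):
        self.lookups += 1
        return bool(self._config.get((section, name), False))

    def debug(self, msg):
        self.messages.append(msg)


def indebug(ui, message):
    """Helper in the style of the original: one config lookup per call."""
    if ui.configbool("devel", "bundle2.debug"):
        ui.debug("bundle2-input: %s\n" % message)


def read_chunks_slow(ui, chunks):
    """Before: call the helper on every iteration of the hot loop."""
    total = 0
    for chunk in chunks:
        indebug(ui, "payload chunk size: %d" % len(chunk))
        total += len(chunk)
    return total


def read_chunks_fast(ui, chunks):
    """After: inline the helper and hoist the lookups out of the loop."""
    dolog = ui.configbool("devel", "bundle2.debug")  # looked up once
    debug = ui.debug                                 # bound method cached
    total = 0
    for chunk in chunks:
        if dolog:
            debug("bundle2-input: payload chunk size: %d\n" % len(chunk))
        total += len(chunk)
    return total
```

With debugging disabled, the inlined version pays for a single
configbool() call and a cheap local boolean test per chunk, and it
skips formatting the message entirely. One trade-off of this pattern
is that a config change made while the loop is running is no longer
observed, which is assumed to be acceptable for a debug switch.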