( )⚙ D4150 linelog: optimize replacelines

This is an archive of the discontinued Mercurial Phabricator instance.

linelog: optimize replacelines
ClosedPublic

Authored by quark on Aug 6 2018, 10:00 PM.

Details

Summary

The optimization to avoid calling annotate inside replacelines is significant
for practical use patterns.

Before this patch:

hg perflinelogedits
! wall 6.778478 comb 6.710000 user 6.700000 sys 0.010000 (best of 3)

After this patch:

hg perflinelogedits
! wall 0.136573 comb 0.140000 user 0.130000 sys 0.010000 (best of 63)

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

quark created this revision.Aug 6 2018, 10:00 PM
quark edited the summary of this revision. (Show Details)Aug 6 2018, 10:35 PM
quark edited the summary of this revision. (Show Details)Aug 6 2018, 11:05 PM
quark edited the summary of this revision. (Show Details)Aug 7 2018, 8:22 PM
quark updated this revision to Diff 10067.
durin42 accepted this revision.Aug 9 2018, 1:56 PM
durin42 added a subscriber: durin42.

Thanks for the optimization! I had gotten too worn out tracing things in the original code to figure out how the splicing worked, and I had something that worked so I moved along with it. This makes a lot of sense (possibly just because I'm less tired than I was when I finished understanding linelog the first time.)

This revision is now accepted and ready to land.Aug 9 2018, 1:56 PM
This revision was automatically updated to reflect the committed changes.
quark added a comment.EditedAug 9 2018, 6:12 PM

To clarify, I do think stateless API is better. It can be done by keeping _lastannotate as a private cache inaccessible from other APIs, move annotateresult to the return value of annotate, then add arev to replacelines to verify the cache. The C code use brev instead of rev as the parameter name for a reason.

The original annotateresult is used as an attempt to reduce C code to be reviewed. Reviewing C code took too long.

The choice of C was because I pursued performance, C is not bad for this particular work (ex. linelog.c has similar LOC), and C is also friendly for the Git community.

To provide more context, mpm commented internally on how the linelog format might be improved. That is to add a LINESPAN instruction replacing a range of continuous LINE instructions. Again, not implemented because of C. But it seems much easier with Python now.