This is an archive of the discontinued Mercurial Phabricator instance.

config: use copy-on-write to improve copy performance
ClosedPublic

Authored by quark on Sep 24 2017, 1:58 AM.

Details

Summary

Previously, chg's verify call could take 30+ms loading and checking new
config files. With one socket redirection, that adds up to around 70ms,
which is a lot for fast commands (ex. bookmark --hidden).

When investigating closer, A lot of time was spent on actually spent on ui
copying, which is mainly about config.config and dict copying.

This patch makes that 20x faster by adopting copy-on-write. The
copy-on-write is performed at config section level.

Before:

In [1]: %timeit ui.copy()
100 loops, best of 3: 2.32 ms per loop

After:

In [1]: %timeit ui.copy()
10000 loops, best of 3: 128 us per loop

2ms may look not that bad, but it adds up pretty quickly with multiple
calls. A typical chg run may call it 4 times, which is about 10ms.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

quark created this revision.Sep 24 2017, 1:58 AM
quark edited the summary of this revision. (Show Details)Sep 24 2017, 2:10 AM
mbthomas requested changes to this revision.Sep 25 2017, 12:17 PM
mbthomas added a subscriber: mbthomas.

Requesting changes for the doc comment typo.

Otherwise looks correct. Up to you whether to follow up on the suggestions.

mercurial/util.py
592

Should be copy-on-write.

602

Using a boolean means both sides have to copy when they each write - the original dict becomes frozen and can't be re-used. If we use a count of outstanding copies instead, the last copy can just be used as-is when the count goes to 0.

I don't know how much modification of the original occurs after copy, so maybe this isn't important.

637

Any way we can check/enforce this? I'm guessing not without impacting perf.

This revision now requires changes to proceed.Sep 25 2017, 12:17 PM
quark edited the summary of this revision. (Show Details)Sep 25 2017, 7:47 PM
quark updated this revision to Diff 2072.
quark added inline comments.Sep 25 2017, 7:47 PM
mercurial/util.py
602

Doing refcount adds 10% overhead for ui.copy() benchmark. I guess it's fine. If we really care about it, we can use a native cow class.

637

I think we want performance here. If it becomes a concern, we can add extra checks if devel.something is set.

Side note: having a native class in inheritance hierarchy is important for CPython performance according to what I learned. I thought pure_obj.__contains__ = native_obj.__contains__ may have a reasonable perf, but the former is about 4x slower. So class pure(native) is preferred for perf (but probably not design pattern's point of view).

quark updated this revision to Diff 2073.Sep 25 2017, 7:51 PM
mbthomas accepted this revision.Sep 26 2017, 1:06 PM
durin42 accepted this revision as: durin42.Sep 27 2017, 7:48 PM
durin42 added a subscriber: durin42.

A couple of nitpicks. I'm not super in love with this, but it seems to be enough of a huge win we should do it anyway. Sigh.

mercurial/util.py
659

you can (and should) omit pass when you have a docstring

666

you can (and should) omit pass when you also have a docstring

quark added a comment.EditedSep 27 2017, 8:55 PM

I'm also sad about this. But we have a strong desire to make commands like status have native-like performance. So every 1ms counts. If we cannot deliver that performance, the team might eventually end up with a native-rewrite approach. I'll be even sadder if we have to do that.

I'm still optimistic though, as you can see the recent alias change removes 30ms overhead. The next big thing will be config caching - if statuses are not changed, do not re-parse the config. That's another 30-60ms for chg overhead.

quark updated this revision to Diff 2135.Sep 27 2017, 9:11 PM
durin42 accepted this revision.Sep 30 2017, 9:25 AM
This revision is now accepted and ready to land.Sep 30 2017, 9:25 AM
This revision was automatically updated to reflect the committed changes.
yuja added a subscriber: yuja.Sep 30 2017, 1:06 PM
yuja added inline comments.
mercurial/config.py
180

Perhaps we need preparewrite() here?

yuja added inline comments.Oct 1 2017, 8:32 AM
mercurial/config.py
180

This won't be problem as long as we parse/read file only once per config object,
so accepted. Please send a follow up.