This is an archive of the discontinued Mercurial Phabricator instance.

merge: Removed sorting in casefolding detection, for a slight performance win
ClosedPublic

Authored by alex_gaynor on Jul 10 2017, 4:59 PM.

Details

Summary

It was not required for the correctness of the algorithm.

Diff Detail

Repository
rHG Mercurial
Branch
performance-changes (bookmark) on default (branch)
Lint
No Linters Available
Unit
No Unit Test Coverage

Event Timeline

alex_gaynor created this revision.Jul 10 2017, 4:59 PM
dsp requested changes to this revision.Jul 11 2017, 11:19 AM
dsp added a subscriber: dsp.

I think for the correctness of the algorithm sorted() is not needed. However it is there to ensure that the error is deterministic, as iterating over set is in arbitrary order. So running the same merge with multiple case conflicts can potentially lead to two different errors. sorted makes the iteration deterministic. Fujiwara might have more insights.

I am -1 on the patch, and will for the moment request changes unless someone from hg reviewers approves it.

This revision now requires changes to proceed.Jul 11 2017, 11:19 AM
phillco removed a subscriber: phillco.

Is it important that the error be deterministic?

I think we can improve the situation if you have exactly two collisions, by sorting (f, foldmap[fold]) and (lastfull, f), but I don't think it can easily be preserved for multiple collisions.

quark added a subscriber: quark.Jul 14 2017, 12:26 PM

Is it important that the error be deterministic?

Good question. Deterministic error is just to make tests stable AFAIK. If the perf win is noticable, I personally prefer perf win. We can use (glob) and (?) in tests to make it support non-deterministic outputs.

durin42 edited edge metadata.Jul 14 2017, 12:27 PM
In D30#1098, @quark wrote:

Is it important that the error be deterministic?

Good question. Deterministic error is just to make tests stable AFAIK. If the perf win is noticable, I personally prefer perf win. We can use (glob) and (?) in tests to make it support non-deterministic outputs.

Historically we've biased to keeping merge deterministic so it's easy to reproduce bugs. I'd very much like to keep that property.

Whether or not an error occurs is deterministic, the exact message is the only non-deterministic bit (assuming there's no bugs, heh).

Ah, I see. Alex, do you have any idea what the performance numbers on this are like, anecdotally on your mozilla repo?

Not quite sure this is science, but ....

First set of numbers is tip, second is with my patch:

~/p/mozilla-central ❯❯❯ time ~/projects/hg/hg up consolidate-sandbox-disable; time ~/projects/hg/hg up central
4324 files updated, 0 files merged, 491 files removed, 0 files unresolved
(activating bookmark consolidate-sandbox-disable)
        9.49 real        10.23 user         3.13 sys
4654 files updated, 0 files merged, 161 files removed, 0 files unresolved
(leaving bookmark consolidate-sandbox-disable)
        4.74 real         7.59 user         2.73 sys
~/p/mozilla-central ❯❯❯ time ~/projects/hg/hg up consolidate-sandbox-disable; time ~/projects/hg/hg up central
4324 files updated, 0 files merged, 491 files removed, 0 files unresolved
(activating bookmark consolidate-sandbox-disable)
        4.45 real         7.32 user         2.66 sys
4654 files updated, 0 files merged, 161 files removed, 0 files unresolved
(leaving bookmark consolidate-sandbox-disable)
        4.83 real         7.50 user         2.66 sys
~/p/mozilla-central ❯❯❯ time ~/projects/hg/hg up consolidate-sandbox-disable; time ~/projects/hg/hg up central
4324 files updated, 0 files merged, 491 files removed, 0 files unresolved
(activating bookmark consolidate-sandbox-disable)
        4.64 real         7.55 user         2.69 sys
4654 files updated, 0 files merged, 161 files removed, 0 files unresolved
(leaving bookmark consolidate-sandbox-disable)
        5.09 real         7.87 user         2.67 sys
~/p/mozilla-central ❯❯❯ time ~/projects/hg/hg up consolidate-sandbox-disable; time ~/projects/hg/hg up central
4324 files updated, 0 files merged, 491 files removed, 0 files unresolved
(activating bookmark consolidate-sandbox-disable)
        5.39 real         8.13 user         2.74 sys
4654 files updated, 0 files merged, 161 files removed, 0 files unresolved
(leaving bookmark consolidate-sandbox-disable)
        5.27 real         8.00 user         2.70 sys
~/p/mozilla-central ❯❯❯ time ~/projects/hg/hg up consolidate-sandbox-disable; time ~/projects/hg/hg up central
4324 files updated, 0 files merged, 491 files removed, 0 files unresolved
(activating bookmark consolidate-sandbox-disable)
        4.45 real         7.68 user         2.58 sys
4654 files updated, 0 files merged, 161 files removed, 0 files unresolved
(leaving bookmark consolidate-sandbox-disable)
        4.93 real         7.72 user         2.61 sys
~/p/mozilla-central ❯❯❯
~/p/mozilla-central ❯❯❯
~/p/mozilla-central ❯❯❯
~/p/mozilla-central ❯❯❯
~/p/mozilla-central ❯❯❯
~/p/mozilla-central ❯❯❯
~/p/mozilla-central ❯❯❯
~/p/mozilla-central ❯❯❯
~/p/mozilla-central ❯❯❯
~/p/mozilla-central ❯❯❯
~/p/mozilla-central ❯❯❯
~/p/mozilla-central ❯❯❯
~/p/mozilla-central ❯❯❯ time ~/projects/hg/hg up consolidate-sandbox-disable; time ~/projects/hg/hg up central
4324 files updated, 0 files merged, 491 files removed, 0 files unresolved
(activating bookmark consolidate-sandbox-disable)
        4.90 real         6.91 user         2.81 sys
4654 files updated, 0 files merged, 161 files removed, 0 files unresolved
(leaving bookmark consolidate-sandbox-disable)
        3.99 real         6.81 user         2.52 sys
~/p/mozilla-central ❯❯❯ time ~/projects/hg/hg up consolidate-sandbox-disable; time ~/projects/hg/hg up central
4324 files updated, 0 files merged, 491 files removed, 0 files unresolved
(activating bookmark consolidate-sandbox-disable)
        4.02 real         7.00 user         2.61 sys
4654 files updated, 0 files merged, 161 files removed, 0 files unresolved
(leaving bookmark consolidate-sandbox-disable)
        4.82 real         7.42 user         2.69 sys
~/p/mozilla-central ❯❯❯ time ~/projects/hg/hg up consolidate-sandbox-disable; time ~/projects/hg/hg up central
4324 files updated, 0 files merged, 491 files removed, 0 files unresolved
(activating bookmark consolidate-sandbox-disable)
        4.60 real         7.80 user         2.78 sys
4654 files updated, 0 files merged, 161 files removed, 0 files unresolved
(leaving bookmark consolidate-sandbox-disable)
        4.11 real         7.77 user         2.73 sys
~/p/mozilla-central ❯❯❯ time ~/projects/hg/hg up consolidate-sandbox-disable; time ~/projects/hg/hg up central
4324 files updated, 0 files merged, 491 files removed, 0 files unresolved
(activating bookmark consolidate-sandbox-disable)
        4.24 real         7.74 user         2.77 sys
4654 files updated, 0 files merged, 161 files removed, 0 files unresolved
(leaving bookmark consolidate-sandbox-disable)
        4.35 real         7.80 user         2.78 sys

Going to note that I'm less confident in the correctness of the second change than I am the first.

quark added a comment.Aug 9 2017, 5:34 PM

@durin42 The case-folding collision might be a small feature that Rust is meaningful.

mercurial/merge.py
765

I think this would be incorrect, since foldprefix is mutated and used in the loop.

alex_gaynor added inline comments.Aug 10 2017, 9:10 AM
mercurial/merge.py
765

I think you're right, removing the first sorted should be safe though.

alex_gaynor edited edge metadata.Aug 10 2017, 9:12 AM
alex_gaynor updated this revision to Diff 737.

Undo a change which was incorrect

quark accepted this revision.Aug 10 2017, 9:02 PM

Looks good to me. I'd prefer perf to error message stability. If the latter is a concern, we can re-run the loop if there is at least one case-folding collision detected.

If you want error message stability I'm pretty sure I can get that by changing the error message formatting to be tuple(sorted([f, foldmap[fold])). Let me know if you'd like me to add that.

This revision was automatically updated to reflect the committed changes.