This is an archive of the discontinued Mercurial Phabricator instance.

mdiff: prefer xdiff for diff calculation
AbandonedPublic

Authored by quark on Mar 2 2018, 7:25 PM.

Details

Reviewers
ryanmce
Group Reviewers
hg-reviewers
Summary

Let's switch to xdiff for its better diff quality and faster performance!

bdiff is still used as a fallback for cases xdiff isn't built, or the pure
Python version.

Test Plan

Added the "patience" test case mentioned in previous commit. It fails
with a huge diff output before this change.

I expected some annotate/diff output changes due to diff blocks being
shifted around. Let's see what sandcastle says. I'll make adjustments
accordingly.

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

ryanmce created this revision.Mar 2 2018, 7:25 PM
durin42 added inline comments.
mercurial/mdiff.py
34

Rather than this, let's try to get the xdiffblocks function (and if it's not there set it to none), and then we could rework the below code to use xdiffblocks instead of blocks if experimental.xdiff is set to true - that'll make it easier to do some testing of both codepaths.

quark added a subscriber: quark.Mar 3 2018, 12:35 PM
quark added inline comments.
mercurial/mdiff.py
34

The diff functions below do not have an ui object. Would you be okay if the config is global? ex. something like

# mdiff.py
def setdiffalgo(name):
    global blocks
    if name == 'xdiff':
        blocks = ...
    else: ...
# dispatch.py
    ui = ....
    mdiff.setdiffalgo(...)
ryanmce added inline comments.Mar 3 2018, 2:14 PM
mercurial/mdiff.py
34

I chatted with people here at the sprint. I'm going to see if we can pass in the diff algo in from calling functions, falling back to bdiff if none is specified.

indygreg added inline comments.
mercurial/mdiff.py
34

global state is evil. Especially on functionality that calls out to C and may release the GIL. If we have multi-threaded applications, things could get in a very bad state.

I'd encourage you to plumb in the diff algorithm to use into the diffopts. Even if the diff algorithm is only visible for user-visible diffs (as opposed to say storage diffs), that's a win.

quark commandeered this revision.

Took the diffopts approach - note: merge conflicts improvements are gone since mdiff.get_matching_blocks is not covered.

quark abandoned this revision.Mar 3 2018, 3:10 PM