( )⚙ D1056 context: add a fast-comparision path between arbitraryfilectx and workingfilectx

This is an archive of the discontinued Mercurial Phabricator instance.

context: add a fast-comparision path between arbitraryfilectx and workingfilectx
ClosedPublic

Authored by phillco on Oct 13 2017, 3:45 PM.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

phillco created this revision.Oct 13 2017, 3:45 PM
durin42 accepted this revision.Oct 14 2017, 12:54 AM
This revision is now accepted and ready to land.Oct 14 2017, 12:54 AM
This revision was automatically updated to reflect the committed changes.

ruh-roh

mercurial/context.py
2566

Ugh, haha

2567

Why is this sufficient? Can't the contents be the same even if the paths are different?

I think you can only fastpath if the paths are the same, otherwise you have to fall back to data comparison.

This is already queued, but I think we need to drop it if I'm right here.

phillco added inline comments.Oct 16 2017, 1:34 PM
mercurial/context.py
2567

Ryan and I talked offline -- but surprisingly, the default filectx.cmp function only compares contents:

1:~/1$ repo['.']['A'].cmp(repo['.']['B'])
Out[1]: False

2:~/1$ repo['.']['A'].cmp(repo['.']['A'])
Out[2]: False

3:~/1$ repo['.']['A'].cmp(repo['.']['C'])
Out[3]: True

(here, A and B have the same content).

And filecmp seems to behave the same way:

7:~/1$ filecmp.cmp('A', 'B')
Out[7]: True

8:~/1$ filecmp.cmp('A', 'A')
Out[8]: True

9:~/1$ filecmp.cmp('A', 'C')
Out[9]: False

That doesn't mean that it's wrong, but I think it's consistent.

phillco added inline comments.Oct 16 2017, 2:03 PM
mercurial/context.py
2567

Ah, but some experimenting revealed that filecmp does follow symlinks, but our filectx comparators do not. Otherwise, both filecmp and filectx.cmp ignore any flag differences.

Thus, you can demonstrate a discrepancy:

A contains "foo"
real_A contains "A"
sym_A link to A

repo['.']['real_A'].cmp(repo['.']['sym_A']) # claims the same
filecmp.cmp('real_A', 'sym_A') # claims a difference, because "foo" != "A"

Note this simpler case doesn't trigger the discrepancy, because the linked file is otherwise identical:

A contains "A"
sym_A link to A

repo['.']['A'].cmp(repo['.']['sym_A']) # claims the same
filecmp.cmp('A', 'sym_A') # claims the same

The easiest fix is just to skip the fast-comparison path if either side is a symlink, and that's what I'll do unless others have other ideas. (@ryanmce thought we should think about what this API _should_ do).

For now, send a follow-up to not do that fast-path if it's a symlink, and we can reason more carefully about this API during the freeze with an eye towards landing something better in 4.5...