treedirstate: implement efficient case collision detection
Closed, Public

Authored by mbthomas on Nov 14 2017, 12:39 PM.

Details

Summary

Add a mechanism to the dirstate trees to allow lookups based on filtered views
of the keys. For a given filtering function, this returns one (if any) of the
keys for which filter(key) matches the input. The filtered values in each
directory node are cached to improve subsequent lookups.
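
The mechanism described above can be sketched roughly as follows. This is a minimal illustration in Python, not the Rust code from this diff: the `DirNode` class, its flat (single-directory) layout, and the `get_filtered` name are assumptions made for the example.

```python
class DirNode:
    """One directory's entries (hypothetical and flat, for brevity)."""

    def __init__(self):
        self.entries = set()    # names stored in this directory
        self._filtered = None   # cache: filter(name) -> one matching name

    def add(self, name):
        self.entries.add(name)
        self._filtered = None   # any mutation drops the cached view

    def get_filtered(self, filtered_name, filter_fn):
        """Return one name whose filter_fn(name) equals filtered_name, or None."""
        if self._filtered is None:
            cache = {}
            # sorted() only makes the choice among colliding names deterministic
            for name in sorted(self.entries):
                cache.setdefault(filter_fn(name), name)
            self._filtered = cache
        return self._filtered.get(filtered_name)


node = DirNode()
node.add("README")
node.add("src")
found = node.get_filtered("readme", str.lower)
```

The first lookup pays for filtering every entry in the directory; later lookups in the same directory are plain dictionary hits until the directory changes.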

Diff Detail

Repository
rFBHGX Facebook Mercurial Extensions
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

mbthomas created this revision. Nov 14 2017, 12:39 PM
Herald added a reviewer: Restricted Project. Nov 14 2017, 12:39 PM
mbthomas updated this revision to Diff 3524. Nov 15 2017, 4:14 AM
mbthomas updated this revision to Diff 3603. Nov 17 2017, 9:45 AM
mbthomas updated this revision to Diff 3658. Nov 20 2017, 12:21 PM
durham added a subscriber: durham. Nov 21 2017, 11:48 AM

There's enough Rust in here that I'm not comfortable accepting just yet. Also, a question inline.

rust/treedirstate/src/python.rs
371

Unrelated to this diff, but I think we may want to move more of the dirstate business logic into pure Rust (not the Python interop layer) in the near future, so we can get to a pure native-code hg status.

rust/treedirstate/src/tree.rs
51

We might want to be explicit that only one filter can be used with a tree at a time, since we don't invalidate the cache based on the filter.
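
A small sketch of the hazard, assuming a cache that is built on first use and is not tied to the filter that built it. `CachedDir` is a hypothetical stand-in written for this example, not the type in tree.rs.

```python
class CachedDir:
    """Hypothetical directory node whose filtered cache ignores which filter built it."""

    def __init__(self, names):
        self.names = list(names)
        self._cache = None  # built once, never keyed by filter

    def get_filtered(self, key, filter_fn):
        if self._cache is None:
            self._cache = {filter_fn(n): n for n in self.names}
        return self._cache.get(key)


d = CachedDir(["README"])
first = d.get_filtered("readme", str.lower)   # builds the cache using str.lower
second = d.get_filtered("README", str.upper)  # silently reuses the str.lower cache
```

In this sketch the second lookup misses even though `str.upper("README")` matches the key, because the cached mapping was computed with a different filter.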

treedirstate/__init__.py
468

If we didn't have _newfiles, why would calling it twice cause a problem? Doesn't the file that was added to the dirstate the first time have the same case as the file being added the second time, and therefore not conflict?

mbthomas added inline comments. Nov 21 2017, 1:29 PM
rust/treedirstate/src/python.rs
371

In this case the casefolder is a Python function, which is actually a bit complicated on some platforms (in particular Cygwin; see posix.py). If we do move the casefolding business logic into Rust, we'll need to handle that somehow (possibly by not supporting Cygwin).

treedirstate/__init__.py
468

Nothing gets added to the dirstate until after all candidates have been audited. The collision we're protecting against is in the self._newfilesfolded map, which contains the normalized forms only, not in the dirstate.
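
To illustrate the point, here is a minimal sketch of an auditor that keeps both sets. The `CollisionAuditor` class, its `audit` method, and the use of `str.lower` as the fold function are assumptions for the example; only the `_newfiles` and `_newfilesfolded` names come from the discussion above, and the dirstate lookup itself is left out.

```python
class CollisionAuditor:
    """Hypothetical auditor: tracks audited paths before anything reaches the dirstate."""

    def __init__(self, fold=str.lower):
        self._fold = fold
        self._newfiles = set()        # exact paths already audited
        self._newfilesfolded = set()  # normalized forms only

    def audit(self, path):
        """Return True if path is acceptable, False if it collides by case."""
        if path in self._newfiles:
            # Same exact path audited again: without this check, its own
            # folded form in _newfilesfolded would look like a collision.
            return True
        folded = self._fold(path)
        if folded in self._newfilesfolded:
            return False
        self._newfiles.add(path)
        self._newfilesfolded.add(folded)
        return True


auditor = CollisionAuditor()
results = [auditor.audit(p) for p in ["README", "README", "readme"]]
```

Because `_newfilesfolded` holds only normalized forms, it cannot tell a repeat of the same path from a different path that folds to the same value; `_newfiles` is what distinguishes the two cases here.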

mbthomas updated this revision to Diff 3733. Nov 21 2017, 1:35 PM
mbthomas updated this revision to Diff 3774. Nov 22 2017, 1:14 PM
mbthomas updated this revision to Diff 3827. Nov 24 2017, 11:53 AM
mbthomas updated this revision to Diff 3844. Nov 24 2017, 3:18 PM
mbthomas updated this revision to Diff 3868. Nov 27 2017, 8:00 AM
durham accepted this revision. Nov 27 2017, 1:00 PM
This revision is now accepted and ready to land. Nov 27 2017, 1:00 PM
This revision was automatically updated to reflect the committed changes.
jsgf added a subscriber: jsgf. Dec 6 2017, 5:26 PM
jsgf added inline comments.
rust/treedirstate/src/python.rs
7–8

This could just be one line:

use errors::{self, ErrorKind};