( )⚙ D10983 dirstate-item: also build DistateItem in dirstate.directories()

This is an archive of the discontinued Mercurial Phabricator instance.

dirstate-item: also build DistateItem in dirstate.directories()
ClosedPublic

Authored by marmoute on Jul 5 2021, 5:34 AM.

Details

Summary

The rust code was building tuple.

Diff Detail

Repository
rHG Mercurial
Branch
default
Lint
No Linters Available
Unit
No Unit Test Coverage

Event Timeline

marmoute created this revision.Jul 5 2021, 5:34 AM

I think future changes of DirstateItem internals will be easier if state == 'd' is never something we need to represent in DirstateItem. We can do that by changing directories to return a list of (path, mtime) instead of (path, DirstateItem), and updating the two callers (debugcommands.debugdirstate and dirstate.foldmap).

I think future changes of DirstateItem internals will be easier if state == 'd' is never something we need to represent in DirstateItem. We can do that by changing directories to return a list of (path, mtime) instead of (path, DirstateItem), and updating the two callers (debugcommands.debugdirstate and dirstate.foldmap).

I am not too convinced that treating directory in a different way is win. What's your motivation there ?

Currently a dirstatetuple / dirstateitem only exists for paths that are present in dirstate-v1 and for which dirstate.__contains__ returns true. It contains a state whose value is always n, a, r, or m. However we want to move towards not having this state at all other than in dirstate-v1 parsing and serialization. Adding more cases to handle makes that more difficult and isn’t necessary.

In Rust code the DirstateEntry struct corresponds to dirstatetuple / dirstateitem. Current in the tree dirstate map, "directory" nodes don’t have a DirstateEntry at all. (And soon entry-less nodes won’t be just directories but will also potentially include ignored or unknown files.) When adding this directories iterator I only made it create tuples that look kinda like dirstatetuple as a shortcut because that’s what debugcommands.debugstate was already handling.

Currently a dirstatetuple / dirstateitem only exists for paths that are present in dirstate-v1 and for which dirstate.__contains__ returns true. It contains a state whose value is always n, a, r, or m. However we want to move towards not having this state at all other than in dirstate-v1 parsing and serialization. Adding more cases to handle makes that more difficult and isn’t necessary.
In Rust code the DirstateEntry struct corresponds to dirstatetuple / dirstateitem. Current in the tree dirstate map, "directory" nodes don’t have a DirstateEntry at all. (And soon entry-less nodes won’t be just directories but will also potentially include ignored or unknown files.) When adding this directories iterator I only made it create tuples that look kinda like dirstatetuple as a shortcut because that’s what debugcommands.debugstate was already handling.

Given that we are moving toward adding more thing in the dirstate(map) in general, having an explicit entry for all of them, instead of various special case, seems simpler. I expect us to introduce dedicated boolean property to differenciate the one Mercurial will directly care about (tracked in wc on in parent) and the other ones.

SimonSapin accepted this revision.Jul 6 2021, 11:18 AM

Alright, we’re not gonna decide right now the future internals of dirstateitem so let’s take this patch and we can always change things again later. We’ll probably need to separate the two users of this iterator anyway:

  • Now that tree-based hasdir is based on "the counter of descendants with an entry is zero" instead of just "this node does not have an entry", foldmap should be changed in a same way
  • debugdirstate wants all nodes with the maximum of information. That could be a single iterator instead of having it combine values from two iterators.
rust/hg-cpython/src/dirstate.rs
79

https://dgrunwald.github.io/rust-cpython/doc/cpython/macro.py_capsule.html says "with a layer of caching" so it should be fine.

baymax updated this revision to Diff 28838.Jul 6 2021, 1:35 PM

✅ refresh by Heptapod after a successful CI run (🐙 💚)

baymax updated this revision to Diff 29009.Jul 8 2021, 11:04 AM

✅ refresh by Heptapod after a successful CI run (🐙 💚)

baymax updated this revision to Diff 29021.Jul 8 2021, 2:27 PM

✅ refresh by Heptapod after a successful CI run (🐙 💚)

Alphare accepted this revision.Jul 9 2021, 11:09 AM
Alphare added a subscriber: Alphare.
Alphare added inline comments.
rust/hg-cpython/src/dirstate.rs
79

Yeah, it's using a OnceCell, so this is at most a very tiny difference to performance.

This revision is now accepted and ready to land.Jul 9 2021, 11:09 AM