This is an archive of the discontinued Mercurial Phabricator instance.

dirstate-v2: Introduce a docket file
ClosedPublic

Authored by SimonSapin on Jul 12 2021, 11:52 AM.

Details

Summary

.hg/dirstate now only contains some metadata to point to a separate data file
named .hg/dirstate.{}.d with a random hexadecimal identifier. For now every
update creates a new data file and removes the old one, but later we’ll
(usually) append to an existing file.

Separating into two files allows doing the "write to a temporary file then
atomically rename into destination" dance with only a small docket file,
without always rewriting a lot of data.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

SimonSapin created this revision.Jul 12 2021, 11:52 AM
baymax updated this revision to Diff 29214.Jul 12 2021, 4:57 PM

✅ refresh by Heptapod after a successful CI run (🐙 💚)

baymax updated this revision to Diff 29218.Jul 12 2021, 6:39 PM

✅ refresh by Heptapod after a successful CI run (🐙 💚)

This seems good overall, but I have some questions.

mercurial/debugcommands.py
995

How does this work once we move to append-only (I'm not seeing it in the next changesets)?

1004

nit: hash is a builtin function in Python

mercurial/dirstateutils/docket.py
43

Why pass around nodeconstants? Shouldn't they be directly imported?

rust/rhg/src/commands/status.rs
176

The opening of the dirstate maybe should be consolidated somehow so we don't duplicate the docket indirection in rhg and hg-cpython, but that's not that big of a deal.

Alphare added inline comments.Jul 13 2021, 11:10 AM
mercurial/debugcommands.py
995

Scratch that, it's in the 3rd one.

SimonSapin marked 2 inline comments as done.Jul 13 2021, 11:25 AM
SimonSapin added inline comments.
mercurial/debugcommands.py
1004

Yes. Personally I don’t really mind variable shadowing, though once upon a time I worked on a code base where a linter complained about it.

I’ll rename to hash_bytes.

rust/rhg/src/commands/status.rs
176

Yes, rhg is in sore need of refactoring to have reusable APIs for various things such as opening the dirstate, but that will be for another patch.

The hg-cpython case is slightly different since (so far) we do all the I/O and docket handling on the Python side of FFI.

SimonSapin updated this revision to Diff 29235.Jul 13 2021, 3:50 PM
Alphare added inline comments.Jul 16 2021, 5:21 AM
mercurial/debugcommands.py
1005

You probably want ui.write here

marmoute added inline comments.
mercurial/debugcommands.py
997

do you mean or nothing for distate-v1 ?

mercurial/dirstatemap.py
637

We should restrict that amount of data-read to docket.data_size, should we not ?

mercurial/dirstateutils/docket.py
31

We should not use .d extension here. There are too many part of Mercurial that assum .d is attached to a revlog and that will eventually lead to confusion.

marmoute added inline comments.Jul 16 2021, 6:45 AM
mercurial/dirstatemap.py
637

never mind, this is done in the next changesets.

Alphare added inline comments.Jul 16 2021, 7:29 AM
mercurial/dirstateutils/docket.py
31

why not .docket simply?

Alphare added inline comments.Jul 16 2021, 7:30 AM
mercurial/dirstateutils/docket.py
31

This is the data, my bad.

baymax updated this revision to Diff 29606.Jul 20 2021, 3:41 PM

✅ refresh by Heptapod after a successful CI run (🐙 💚)

This revision was not accepted when it landed; it landed in state Needs Review.
This revision was automatically updated to reflect the committed changes.
mharbison72 added inline comments.
mercurial/dirstatemap.py
666–667

Should this be a try/finally (and also the one in the else)? And is the may-or-may-not-be-closed state handled properly by the callers?

tests/test-help.t
1011–1012

For some reason, the next line of the help text is missing. Not sure if that means it's malformed.