This is an archive of the discontinued Mercurial Phabricator instance.

rust-discovery: partial switch to typestate pattern
Needs RevisionPublic

Authored by gracinet on Dec 24 2019, 8:49 AM.

Details

Reviewers
martinvonz
Group Reviewers
hg-reviewers
Summary

The PartialDiscovery object owns two data fields that are
lazily evaluated: the set of undecided revisions, and the
children cache. That laziness is known to be essential for
performance.

In the previous version, we were using Option<T>, which led
us to methods such as ensure_undecided() followed by calls to
self.undecided.as_ref().unwrap(), as it was the simplest way
to avoid reference sharing problems, but that wasn't
satisfying. Human readers knew that panicking was indeed
impossible, but that wasn't enforced by the compiler.

We had something similar yet less pervasive with the early
release of target_heads.

The reviewer, Kevin Cox, then suggested to use a code pattern
known as typestate: different types to represent the successive
stages, with the second one always having an undecided set.

This is what we are doing here. Now we have two state types:
OnlyCommon and WithUndecided.
Only the first has the target_heads member; only the second
has the undecided member.

It makes the code a bit longer, because we have to make
PartialDiscovery a wrapper enum for identical consumption
within hg-cpython and reexpose the public interface.
But it makes the inner code more focused, clearer and
better checked by the compiler. A few further simplifications
will be made in following changesets thanks to this.

A key point that we didn't know how to solve
in the first version was the in-place mutation of that wrapper
enum, provided by the mutate_undecided() method.

This is partial because we don't address the children cache.
We'll do that in a follow-up also.

Also some methods and doc-comments have been kept on the inner
structs for readability of this changesets, but they will
be factorized on the wrapper enum in a next move

Diff Detail

Repository
rHG Mercurial
Branch
default
Lint
No Linters Available
Unit
No Unit Test Coverage

Event Timeline

gracinet created this revision.Dec 24 2019, 8:49 AM

@durin42 this Diffrential and its descendants are what I've told you about in our latest chat: implementing suggestions by @kevincox for a much cleaner an robust code.
I intend to provide the upcoming rust-nodemap series using the same kind of pattern. I believe @Alphare wants to clean up Dirstate related code in similar ways.

martinvonz requested changes to this revision.Dec 27 2019, 12:42 PM
martinvonz added a subscriber: martinvonz.

Did you consider encoding the state in a type parameter to PartialDiscovery? I think that would make the call sites simpler. See the last variation on http://cliffle.com/blog/rust-typestate/ for what I mean.

This revision now requires changes to proceed.Dec 27 2019, 12:42 PM

@martinvonz yes, I've thinked of it after I made this one, but somehow I thought this was enough for a first try.

Yes, we could have PartialDiscovery<Start>, and PartialDiscovery<Computed>, with Start being a pure marker, and Computed actually holding the undecided set and the children cache (anticipating on a later changeset), that would be more elegant and it would better organize the boiler plate.

Don't hesitate if you come up with better namings.

@martinvonz yes, I've thinked of it after I made this one, but somehow I thought this was enough for a first try.

It seems like it would be a lot of churn to take this patch and then a follow-up that rewrites it. I'd be much happier to take a patch that cleaner version right away (and it sounds like we both think it would be cleaner).

Yes, we could have PartialDiscovery<Start>, and PartialDiscovery<Computed>, with Start being a pure marker, and Computed actually holding the undecided set and the children cache (anticipating on a later changeset), that would be more elegant and it would better organize the boiler plate.
Don't hesitate if you come up with better namings.

I haven't even reviewed in enough detail to know what the states represent, so I don't have any suggestions yet :)

I'm gonna give it a try, yes. About the namings, I'm thinking of <Cheap> and <Expensive>. It doesn't describe the contents, but it's really clear about the intent.

cc @Alphare is this still relevant ?

cc @Alphare is this still relevant ?

It is. @gracinet told me he would want to come back to this series when he has time. Since it's a cleanup series it's not urgent in any way, but we would all like to have this upstream some day.