This is an archive of the discontinued Mercurial Phabricator instance.

rust: translation of missingancestors
ClosedPublic

Authored by gracinet on Dec 12 2018, 10:11 AM.

Details

Summary

This is as direct as possible a translation of the ancestor.missingancestors
Python class in pure Rust. The goal for this changeset is to make it easy
to compare with the Python version.

We also add to Python tests the cases that helped us develop and debug
this implementation.

Some possible optimizations are marked along the way as TODO comments

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

gracinet created this revision.Dec 12 2018, 10:11 AM
gracinet updated this revision to Diff 12832.Dec 12 2018, 11:26 AM
kevincox added inline comments.Dec 12 2018, 1:59 PM
rust/hg-core/src/ancestors.rs
154

Why not just write:

self.bases.iter().any(|b| != NULL_REVISION)

It is much clearer what you mean and I suspect the performance is very similar. See the below example. If you want the performance then I prefer c which is even a bit faster looking.

https://rust.godbolt.org/z/-JJ61P

161

Would it be possible to return a more abstract type such as Interator<Item=Revision>? I would also prefer to just enclose this completely as IIUC we are somewhat lazily computing the bases so this function has some unspecified preconditions.

278

use if revs_visit.remove(&curr) in the condition to avoid duplication in the code.

442

It tool me a while to get this abbreviation. Maybe it should just be spelt out.

443

I'm surprised that the .iter().cloned() is required with the argument being an impl IntoIterator

gracinet marked an inline comment as done.Dec 14 2018, 12:59 PM
gracinet updated this revision to Diff 12855.
gracinet marked 2 inline comments as done.Dec 14 2018, 1:00 PM

Thanks for the useful tips!

rust/hg-core/src/ancestors.rs
154

I like the any() variant. Performance for that method isn't really critical anyway.

161

I agree it's not really pretty, even with Rust mutability rules having our back, but I'm not sure: the only caller candidate besides unit tests (Python setdiscovery module) is happy to read from the bases attribute directly, and hence cope with the fact that it is a moving target.

One advantage of returning &HashSet is that we'll get a straightforward conversion to Python sets for free as soon as the bindings support them (see also https://github.com/dgrunwald/rust-cpython/pull/165)

It's quite possible that we'll see some refactoring of setdiscovery and these missing ancestors in the future. I believe it'll be a good time to revise this if that becomes an issue.

443

I believe that is because IntoIterator is implemented only for references to arrays, (with Item being references, of course). A bit frustrating (after all vec! was simpler, but it doesn't matter much in the context of this test).

yuja added a subscriber: yuja.Dec 14 2018, 10:19 PM

Queued this, thanks.

+ / Add rev's parents to self.bases
+ #[inline]
+ fn add_parents(&mut self, rev: Revision) -> Result<(), GraphError> {
+
No need to bother the set with inserting NULL_REVISION over and
+ // over
+ for p in self.graph.parents(rev)?.iter().cloned() {
+ if p != NULL_REVISION {
+ self.bases.insert(p);
+ }
+ }
+ Ok(())
+ }

The comment sounds like we don't care the NULL_REVISION (i.e. the bases could
be extend()-ed), but we do explicitly avoid doing that.

+ let mut missing: Vec<Revision> = Vec::new();
+ for curr in (0..=start).map(|i| start - i) {

Nit: can be (0..=start).rev() ?

+ if revs_visit.is_empty() {
+ break;
+ }
+ if both_visit.contains(&curr) {
+ curr's parents might have made it into revs_visit through
+
another path
+ TODO optim: Rust's HashSet.remove returns a boolean telling
+
if it happened. This will spare us one set lookup
+ both_visit.remove(&curr);
+ for p in self.graph.parents(curr)?.iter().cloned() {
+ if p == NULL_REVISION {
+ continue;
+ }
+ revs_visit.remove(&p);
+ bases_visit.insert(p);
+ both_visit.insert(p);
+ }
+ continue;
+ }
+ in Rust, one can't just use mutable variables assignation
+
to be more straightforward. Instead of Python's
+ thisvisit and othervisit, we'll differentiate with a boolean
+ let this_visit_is_revs = {
+ if revs_visit.remove(&curr) {
+ missing.push(curr);
+ true
+ } else if bases_visit.contains(&curr) {
+ false
+ } else {
+
not an ancestor of revs or bases: ignore
+ continue;
+ }
+ };

Maybe this could be extracted to a function
visit_parents(&mut revs_visit, &mut bases_visit, select_this, select_other)
where select_this and select_other are FnMut(&mut, &mut) -> &mut, but
I don't know if that will improve the readability.

gracinet marked an inline comment as done.Dec 14 2018, 10:20 PM
This revision was automatically updated to reflect the committed changes.
yuja added a comment.Dec 19 2018, 8:57 AM
>  +            // in Rust, one can't just use mutable variables assignation
>  +            // to be more straightforward. Instead of Python's
>  +            // thisvisit and othervisit, we'll differentiate with a boolean
>  +            let this_visit_is_revs = {
>  +                if revs_visit.remove(&curr) {
>  +                    missing.push(curr);
>  +                    true
>  +                } else if bases_visit.contains(&curr) {
>  +                    false
>  +                } else {
>  +                    // not an ancestor of revs or bases: ignore
>  +                    continue;
>  +                }
>  +            };
Maybe this could be extracted to a function
`visit_parents(&mut revs_visit, &mut bases_visit, select_this, select_other)`
where `select_this` and `select_other` are `FnMut(&mut, &mut) -> &mut`, but
I don't know if that will improve the readability.

Actually it looks better if I copy-paste the visit loop. I'll send follow-up
patches.

if both_visit.contains(&curr) {
    for p in ...
} else if revs_visit.remove(&curr) {
    missing.push(curr);
    for p in ...
} else if bases_visit.contains(&curr) {
    for p in ...
}