( )⚙ D1396 treedirstate: add vecmap implementation

This is an archive of the discontinued Mercurial Phabricator instance.

treedirstate: add vecmap implementation
ClosedPublic

Authored by mbthomas on Nov 14 2017, 12:39 PM.
Tags
None
Subscribers

Details

Reviewers
jsgf
stash
durham
Group Reviewers
Restricted Project
Commits
rFBHGX3d6ea6138e86: treedirstate: add vecmap implementation
Summary

This adds an implementation of an ordered map that uses a vector pairs, sorted
by the key.

This is largely compatibly with std::collections::BTreeMap, but has performance
characteristics more suited for use in treedirstate.

Diff Detail

Repository
rFBHGX Facebook Mercurial Extensions
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

mbthomas created this revision.Nov 14 2017, 12:39 PM
Herald added a reviewer: Restricted Project. · View Herald TranscriptNov 14 2017, 12:39 PM
durham added a subscriber: durham.Nov 15 2017, 12:24 PM

Mostly noob questions

rust/treedirstate/src/vecmap.rs
71

Maybe a rust-noob question, but why is '?Sized' needed here?

120

Is this a standard function? If not, maybe a doc comment explaining what 'start' means. Took me a couple reads of the code to realize.

126

Could we have returned if start { 0 } else { self.vec.len() } for this instead? Seems like that would hide the special case from the caller. Or do you intentionally return None because it allows us to detect the special case and do Iter(self.vec.iter()) without slicing later?

127

Since self.vec.binary_search_by(|e| e.0.borrow().cmp(q)) is a common pattern (I see it at least 3 times above), should there be a function to do that for us? I think it'd clean up the one dense line in each of these functions.

184

rust-noob question: What is self.0 here? I thought self was an Iter instance, and I thought self.0 was something used on a tuple instance. So I'm a little confused.

Also, I thought .next() would return the next value. But I thought .map() would only operate on some sort of generator or collection?

mbthomas added inline comments.Nov 15 2017, 12:55 PM
rust/treedirstate/src/vecmap.rs
71

Q is allowed to be a reference to a generic type for which the size is not known (e.g. a trait type). The important thing is that it's a thing that can be compared with a borrowed Key. This allows storing String or Vec[u8] keys, but then accessing/removing the entries using a &str or &[u8] borrowed value from elsewhere.

Rust defaults to assuming Sized for type parameters, so you have to remove it when you don't need it.

120

No, this is a private function to make implementing range and range_mut simpler.

126

That would probably make sense and make the other function simpler. I'll rework it.

127

I'm not sure how easy it will be to extract (there's a lot of type inferencing going on here), but I'll give it a shot.

184

vecmap::Iter is a wrapper around std::slice::Iter (see line 15) that convert its API to one that works in a map-like way. The difference is subtle:

std::slice::Iter for a Vec<Item> yields Option<&Item>, which is Option<&(K, V)> for this module. To be able to iterate over the key/value pairs, we need to convert this to Option<(&K, &V)>. This is more important for IterMut where we want to map Option<&mut (K, V)> to Option<(&K, &mut V)> to prevent the caller mutating the key.

To answer the first question, vecmap::Iter is a 1-tuple containing a std::slice::Iter, so we access the inner iter through self.0. I think this is a common Rust pattern.

For the second question, .next() returns Option<&(K, V)>, which we want to map to Option<(&K, &V)>. The map call here is over the Option type.

mbthomas updated this revision to Diff 3566.Nov 16 2017, 11:42 AM
mbthomas marked 10 inline comments as done.Nov 16 2017, 11:43 AM
mbthomas updated this revision to Diff 3593.Nov 17 2017, 9:44 AM
jsgf requested changes to this revision.Nov 17 2017, 12:30 PM
jsgf added a subscriber: jsgf.

Nits mostly.

rust/treedirstate/src/vecmap.rs
53–54

"that value is >replaced and< returned" for maximum clarity.

55

I prefer to avoid putting mut v: V into signatures, because its a little confusing for readers - it's not obvious that mut has local effect and doesn't affect the function's interface.

Instead:

pub fn insert(&mut self, k: K, v: V) -> Option<V> {
    let mut v = v;
68–70

Is this common enough to just always do as part of insert()?

134–136

How does this look if you just split it into separate range_index_start/range_index_end rather than toggling on start? It would still have some amount of duplication of the overall function and the calls to find_index, but it might be clearer in the end.

Or alternatively take an instance of RangeArgument and handle both at once, returning (usize, usize).

167

RangeArgument? It's already implemented for (Bound<T>, Bound<T>), but also allows using ../..= notation.

199

I prefer:

fn next(&mut self) -> Option<Self::Item> {

to be very explicit.

200

Pattern matching might be a little clearer: .map(|&(ref k, ref v)| (k, v))

208

likewise

209

ditto pattern match

This revision now requires changes to proceed.Nov 17 2017, 12:30 PM
mbthomas added inline comments.Nov 20 2017, 12:19 PM
rust/treedirstate/src/vecmap.rs
167

RangeArgument is nightly-only experimental still. I'm trying to keep to stable rust features.

mbthomas updated this revision to Diff 3648.Nov 20 2017, 12:20 PM
mbthomas marked 9 inline comments as done.Nov 20 2017, 12:22 PM
mbthomas added inline comments.
rust/treedirstate/src/vecmap.rs
68–70

Possibly. I'll see if I can think of a way to measure it.

stash accepted this revision.Nov 21 2017, 5:12 AM
jsgf accepted this revision.Nov 22 2017, 5:47 PM
This revision is now accepted and ready to land.Nov 22 2017, 5:47 PM
mbthomas updated this revision to Diff 3818.Nov 24 2017, 11:52 AM
durham accepted this revision.Nov 27 2017, 12:23 PM
This revision was automatically updated to reflect the committed changes.