This is an archive of the discontinued Mercurial Phabricator instance.

Differential D7716

rust-discovery: partial switch to typestate pattern
Needs Revision, Public

Authored by gracinet on Dec 24 2019, 8:49 AM.

Details

Reviewers

martinvonz

Group Reviewers

hg-reviewers

Summary

The PartialDiscovery object owns two data fields that are
lazily evaluated: the set of undecided revisions, and the
children cache. That laziness is known to be essential for
performance.

In the previous version, we were using Option<T>, which led
us to methods such as ensure_undecided() followed by calls to
self.undecided.as_ref().unwrap(), as it was the simplest way
to avoid reference sharing problems, but that wasn't
satisfying. Human readers knew that panicking was indeed
impossible, but that wasn't enforced by the compiler.
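For readers unfamiliar with the previous shape of the code, here is a minimal sketch of the `Option<T>` plus `ensure_…()` idiom described above. The `LazyDiscovery` type, its `i32` revisions and the placeholder computation are assumptions made for illustration; this is not the actual hg-core code.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for the old discovery object.
struct LazyDiscovery {
    target_heads: Option<Vec<i32>>,
    undecided: Option<HashSet<i32>>,
}

impl LazyDiscovery {
    fn new(target_heads: Vec<i32>) -> Self {
        LazyDiscovery {
            target_heads: Some(target_heads),
            undecided: None,
        }
    }

    /// Compute `undecided` on first use; later calls are no-ops.
    fn ensure_undecided(&mut self) {
        if self.undecided.is_some() {
            return;
        }
        // Placeholder computation (assumption): the real code asks the
        // graph for the missing ancestors of the target heads.
        let heads = self.target_heads.take().unwrap();
        self.undecided = Some(heads.into_iter().collect());
    }

    fn undecided_len(&mut self) -> usize {
        self.ensure_undecided();
        // This unwrap cannot panic, but only by convention: the compiler
        // would not notice if the `ensure_undecided()` call were removed.
        self.undecided.as_ref().unwrap().len()
    }
}
```

The weakness is visible in `undecided_len()`: the safety of the `unwrap()` depends on a call order that only human readers can check.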

We had something similar yet less pervasive with the early
release of target_heads.

The reviewer, Kevin Cox, then suggested using a code pattern
known as typestate: different types to represent the successive
stages, with the second one always having an undecided set.

This is what we are doing here. Now we have two state types:
OnlyCommon and WithUndecided.
Only the first has the target_heads member; only the second
has the undecided member.
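The two-stage split can be sketched as follows. Only the type names `OnlyCommon` and `WithUndecided` and the `compute_undecided()` transition come from this changeset; the `i32` revisions, the reduced field set and the placeholder computation are assumptions for illustration.

```rust
use std::collections::HashSet;

/// First stage: only common revisions are tracked.
struct OnlyCommon {
    target_heads: Vec<i32>,
    common: HashSet<i32>,
}

/// Second stage: the undecided set always exists, so no `Option` is needed.
struct WithUndecided {
    common: HashSet<i32>,
    undecided: HashSet<i32>,
}

impl OnlyCommon {
    fn new(target_heads: Vec<i32>) -> Self {
        OnlyCommon {
            target_heads,
            common: HashSet::new(),
        }
    }

    fn add_common(&mut self, rev: i32) {
        self.common.insert(rev);
    }

    /// Taking `self` by value makes the transition one-way: the first-stage
    /// value no longer exists once the second stage has been built.
    fn compute_undecided(self) -> WithUndecided {
        // Placeholder (assumption): target heads not yet known as common.
        let undecided: HashSet<i32> = self
            .target_heads
            .iter()
            .cloned()
            .filter(|r| !self.common.contains(r))
            .collect();
        WithUndecided {
            common: self.common,
            undecided,
        }
    }
}

impl WithUndecided {
    fn is_complete(&self) -> bool {
        // No `unwrap()` needed: the field is not optional in this stage.
        self.undecided.is_empty()
    }
}
```

With this layout, calling `is_complete()` before the undecided set exists is a compile-time error rather than a potential panic.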

It makes the code a bit longer, because we have to make
PartialDiscovery a wrapper enum for identical consumption
within hg-cpython and reexpose the public interface.
But it makes the inner code more focused, clearer and
better checked by the compiler. A few further simplifications
will be made in following changesets thanks to this.

A key point that we didn't know how to solve
in the first version was the in-place mutation of that wrapper
enum, provided by the mutate_undecided() method.
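The in-place mutation can be illustrated with a reduced sketch. The `Discovery`, `Cheap` and `Expensive` types below are hypothetical stand-ins, and the sketch uses `mem::take` with a derived `Default` (Rust 1.40 or later) where the changeset itself uses `mem::replace` with a hand-written default; error handling is omitted.

```rust
use std::mem;

/// Hypothetical first-stage state; `Default` gives a cheap placeholder.
#[derive(Default)]
struct Cheap {
    heads: Vec<i32>,
}

/// Hypothetical second-stage state.
struct Expensive {
    undecided: Vec<i32>,
}

enum Discovery {
    Com(Cheap),
    Und(Expensive),
}

impl Discovery {
    /// Make sure `self` is in the `Und` variant, then apply `mutator`.
    fn mutate_undecided<R>(
        &mut self,
        transitor: impl FnOnce(Cheap) -> Expensive,
        mutator: impl FnOnce(&mut Expensive) -> R,
    ) -> R {
        match self {
            Discovery::Com(cheap) => {
                // We only hold `&mut self`, so the old state cannot be moved
                // out directly: swap in a cheap placeholder and take ownership.
                let owned = mem::take(cheap);
                let mut exp = transitor(owned);
                let res = mutator(&mut exp);
                // Overwrite the whole enum; the placeholder is dropped here.
                *self = Discovery::Und(exp);
                res
            }
            Discovery::Und(exp) => mutator(exp),
        }
    }
}
```

The caller keeps a single `&mut Discovery` throughout and never has to re-match on the first-stage variant after the transition.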

This is partial because we don't address the children cache.
We'll do that in a follow-up also.

Also, some methods and doc-comments have been kept on the inner
structs for readability of this changeset, but they will
be factored out onto the wrapper enum in a follow-up.

Diff Detail

Repository

rHG Mercurial

Branch

default

Lint

No Linters Available

Unit

No Unit Test Coverage

Event Timeline

gracinet created this revision. Dec 24 2019, 8:49 AM

Herald added a reviewer: hg-reviewers. Dec 24 2019, 8:49 AM

Herald added subscribers: mercurial-devel, kevincox, durin42.

gracinet added a child revision: D7717: rust-discovery: restoring add_missing cheap early return. Dec 24 2019, 8:49 AM

@durin42 this Differential and its descendants are what I told you about in our latest chat: implementing suggestions by @kevincox for much cleaner and more robust code.
I intend to provide the upcoming rust-nodemap series using the same kind of pattern. I believe @Alphare wants to clean up Dirstate-related code in similar ways.

Did you consider encoding the state in a type parameter to PartialDiscovery? I think that would make the call sites simpler. See the last variation on http://cliffle.com/blog/rust-typestate/ for what I mean.

This revision now requires changes to proceed. Dec 27 2019, 12:42 PM

@martinvonz yes, I thought of it after I made this one, but somehow I thought this was enough for a first try.

Yes, we could have PartialDiscovery<Start> and PartialDiscovery<Computed>, with Start being a pure marker and Computed actually holding the undecided set and the children cache (anticipating a later changeset). That would be more elegant and would better organize the boilerplate.

Don't hesitate if you come up with better namings.
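For illustration, the type-parameter variant discussed in this exchange could look roughly like the sketch below. Only the names `PartialDiscovery`, `Start` and `Computed` come from the comment; the fields, the `i32` revisions, the `compute()` method name and the placeholder computation are assumptions, and this is not code from the series.

```rust
use std::collections::HashSet;

/// Pure marker for the first stage: carries no data.
struct Start;

/// Second stage: holds whatever is expensive to compute.
struct Computed {
    undecided: HashSet<i32>,
}

/// Fields common to both stages live here once, instead of being repeated.
struct PartialDiscovery<S> {
    target_heads: Vec<i32>,
    common: HashSet<i32>,
    state: S,
}

impl PartialDiscovery<Start> {
    fn new(target_heads: Vec<i32>) -> Self {
        PartialDiscovery {
            target_heads,
            common: HashSet::new(),
            state: Start,
        }
    }

    fn add_common(&mut self, rev: i32) {
        self.common.insert(rev);
    }

    /// One-way transition, consuming the first-stage value.
    fn compute(self) -> PartialDiscovery<Computed> {
        // Placeholder (assumption): target heads not yet known as common.
        let undecided: HashSet<i32> = self
            .target_heads
            .iter()
            .cloned()
            .filter(|r| !self.common.contains(r))
            .collect();
        PartialDiscovery {
            target_heads: self.target_heads,
            common: self.common,
            state: Computed { undecided },
        }
    }
}

impl PartialDiscovery<Computed> {
    fn is_complete(&self) -> bool {
        self.state.undecided.is_empty()
    }
}

/// Methods valid in every stage are written once.
impl<S> PartialDiscovery<S> {
    fn has_info(&self) -> bool {
        !self.common.is_empty()
    }
}
```

Since the transition changes the static type, a caller that must keep one object behind a single reference across the transition would presumably still need some wrapper around the two instantiations.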

In D7716#114043, @gracinet wrote:

@martinvonz yes, I thought of it after I made this one, but somehow I thought this was enough for a first try.

It seems like it would be a lot of churn to take this patch and then a follow-up that rewrites it. I'd be much happier to take the cleaner version right away (and it sounds like we both think it would be cleaner).

Yes, we could have PartialDiscovery<Start> and PartialDiscovery<Computed>, with Start being a pure marker and Computed actually holding the undecided set and the children cache (anticipating a later changeset). That would be more elegant and would better organize the boilerplate.
Don't hesitate if you come up with better namings.

I haven't even reviewed in enough detail to know what the states represent, so I don't have any suggestions yet :)

I'm gonna give it a try, yes. About the namings, I'm thinking of <Cheap> and <Expensive>. It doesn't describe the contents, but it's really clear about the intent.

cc @Alphare is this still relevant?

Herald added a subscriber: mercurial-patches. Apr 22 2020, 11:54 AM

In D7716#126417, @marmoute wrote:

cc @Alphare is this still relevant?

It is. @gracinet told me he would want to come back to this series when he has time. Since it's a cleanup series it's not urgent in any way, but we would all like to have this upstream some day.

Revision Contents
Changeset List

			Path	Packages
M			rust/hg-core/src/discovery.rs (381 lines)

Commit	Parents	Author	Summary	Date
cbae7be51ffe	d01ada698fb1	Georges Racinet		Dec 23 2019, 1:43 PM

Status	Author	Revision
Needs Revision	gracinet	D7722 rust-discovery: simplifying add_missing_revisions()
Needs Revision	gracinet	D7721 rust-discovery: postponing random generator init
Needs Revision	gracinet	D7720 rust-discovery: moved some methods to the wrapper enum
Needs Revision	gracinet	D7719 rust-discovery: children cache as typestate transition
Needs Revision	gracinet	D7718 rust-directory: simplify bidirectional sampling
Needs Revision	gracinet	D7717 rust-discovery: restoring add_missing cheap early return
Needs Revision	gracinet	D7716 rust-discovery: partial switch to typestate pattern
Closed	gracinet	D7715 rust-discovery: type alias for random generator seed

Diff 18924

rust/hg-core/src/discovery.rs

	//! `mercurial.setdiscovery`			//! `mercurial.setdiscovery`

	use super::{Graph, GraphError, Revision, NULL_REVISION};			use super::{Graph, GraphError, Revision, NULL_REVISION};
	use crate::{ancestors::MissingAncestors, dagops, FastHashMap};			use crate::{ancestors::MissingAncestors, dagops, FastHashMap};
	use rand::seq::SliceRandom;			use rand::seq::SliceRandom;
	use rand::{thread_rng, RngCore, SeedableRng};			use rand::{thread_rng, RngCore, SeedableRng};
	use std::cmp::{max, min};			use std::cmp::{max, min};
	use std::collections::{HashSet, VecDeque};			use std::collections::{HashSet, VecDeque};
				use std::mem;

	type Rng = rand_pcg::Pcg32;			type Rng = rand_pcg::Pcg32;
	type Seed = [u8; 16];			type Seed = [u8; 16];

	pub struct PartialDiscovery<G: Graph + Clone> {			pub enum PartialDiscovery<G: Graph + Clone> {
	target_heads: Option<Vec<Revision>>,			Com(OnlyCommon<G>),
				Und(WithUndecided<G>),
				}

				pub struct OnlyCommon<G: Graph + Clone> {
				target_heads: Vec<Revision>,
	graph: G, // plays the role of self._repo			graph: G, // plays the role of self._repo
	common: MissingAncestors<G>,			common: MissingAncestors<G>,
	undecided: Option<HashSet<Revision>>,			respect_size: bool,
				randomize: bool,
				seed: Seed,
				}

				pub struct WithUndecided<G: Graph + Clone> {
				graph: G, // plays the role of self._repo
				common: MissingAncestors<G>,
				undecided: HashSet<Revision>,
	children_cache: Option<FastHashMap<Revision, Vec<Revision>>>,			children_cache: Option<FastHashMap<Revision, Vec<Revision>>>,
	missing: HashSet<Revision>,			missing: HashSet<Revision>,
	rng: Rng,			rng: Rng,
	respect_size: bool,			respect_size: bool,
	randomize: bool,			randomize: bool,
	}			}

	pub struct DiscoveryStats {			pub struct DiscoveryStats {
	self.cur += 1;			self.cur += 1;
	if rev == NULL_REVISION {			if rev == NULL_REVISION {
	return self.next();			return self.next();
	}			}
	Some(rev)			Some(rev)
	}			}
	}			}

				use PartialDiscovery::{Com, Und};

	impl<G: Graph + Clone> PartialDiscovery<G> {			impl<G: Graph + Clone> PartialDiscovery<G> {
	/// Create a PartialDiscovery object, with the intent			/// Create a PartialDiscovery object, with the intent
	/// of comparing our `::<target_heads>` revset to the contents of another			/// of comparing our `::<target_heads>` revset to the contents of another
	/// repo.			/// repo.
	///			///
	/// For now `target_heads` is passed as a vector, and will be used			/// For now `target_heads` is passed as a vector, and will be used
	/// at the first call to `ensure_undecided()`.			/// at the first call to `ensure_undecided()`.
	///			///

	pub fn new_with_seed(			pub fn new_with_seed(
	graph: G,			graph: G,
	target_heads: Vec<Revision>,			target_heads: Vec<Revision>,
	seed: Seed,			seed: Seed,
	respect_size: bool,			respect_size: bool,
	randomize: bool,			randomize: bool,
	) -> Self {			) -> Self {
	PartialDiscovery {			Com(OnlyCommon::new(
	undecided: None,			graph,
	children_cache: None,			target_heads,
	target_heads: Some(target_heads),			seed,
				respect_size,
				randomize,
				))
				}

				/// Do we have any information about the peer?
				pub fn has_info(&self) -> bool {
				match self {
				Com(c) => c.common.has_bases(),
				Und(u) => u.has_info(),
				}
				}

				/// Did we acquire full knowledge of our Revisions that the peer has?
				pub fn is_complete(&self) -> bool {
				match self {
				Com(_) => false,
				Und(u) => u.is_complete(),
				}
				}

				/// Return the heads of the currently known common set of revisions.
				///
				/// If the discovery process is not complete (see `is_complete()`), the
				/// caller must be aware that this is an intermediate state.
				///
				/// On the other hand, if it is complete, then this is currently
				/// the only way to retrieve the end results of the discovery process.
				///
				/// We may introduce in the future an `into_common_heads` call that
				/// would be more appropriate for normal Rust callers, dropping `self`
				/// if it is complete.
				pub fn common_heads(&self) -> Result<HashSet<Revision>, GraphError> {
				match self {
				Com(c) => c.common.bases_heads(),
				Und(u) => u.common_heads(),
				}
				}

				/// Provide statistics about the current state of the discovery process
				pub fn stats(&self) -> DiscoveryStats {
				match self {
				Com(c) => c.stats(),
				Und(u) => u.stats(),
				}
				}

				/// Register revisions known as being common
				pub fn add_common_revisions(
				&mut self,
				common: impl IntoIterator<Item = Revision>,
				) -> Result<(), GraphError> {
				match self {
				Com(oc) => Ok(oc.add_common_revisions(common)),
				Und(wu) => wu.add_common_revisions(common),
				}
				}

				/// Register revisions known as being missing in remote
				pub fn add_missing_revisions(
				&mut self,
				missing: impl IntoIterator<Item = Revision>,
				) -> Result<(), GraphError> {
				self.mutate_undecided(
				|oc| oc.compute_undecided(),
				|wu| wu.add_missing_revisions(missing),
				)
				}

				/// Mutate into a `WithUndecided` and apply the `mutator` closure.
				///
				/// If `self` is still in the `OnlyCommons` stage, this applies first
				/// the `transitor` closure to produce a `WithUndecided`.
				///
				/// The `mutator` closure is applied in all cases.
				///
				/// The advantage for the caller is not to have to re-consider the
				/// `OnlyCommon` variant after performing a mutation to `WithUndecided`.
				fn mutate_undecided<R, T, M>(
				&mut self,
				transitor: T,
				mutator: M,
				) -> Result<R, GraphError>
				where
				T: FnOnce(OnlyCommon<G>) -> Result<WithUndecided<G>, GraphError>,
				M: FnOnce(&mut WithUndecided<G>) -> Result<R, GraphError>,
				{
				match self {
				Com(com) => {
				// here we could use `mem::take` if we were on Rust 1.40
				// and `OnlyCommon` was truly implementing the `Default` trait.
				let com = mem::replace(com, OnlyCommon::default(&com.graph));
				let mut und = transitor(com)?;
				let res = mutator(&mut und);
				mem::replace(self, Und(und));
				res
				}
				Und(und) => mutator(und),
				}
				}

				pub fn take_quick_sample(
				&mut self,
				headrevs: impl IntoIterator<Item = Revision>,
				size: usize,
				) -> Result<Vec<Revision>, GraphError> {
				self.mutate_undecided(
				|oc| oc.compute_undecided(),
				|wu| wu.take_quick_sample(headrevs, size),
				)
				}

				pub fn take_full_sample(
				&mut self,
				size: usize,
				) -> Result<Vec<Revision>, GraphError> {
				self.mutate_undecided(
				|oc| oc.compute_undecided(),
				|wu| wu.take_full_sample(size),
				)
				}
				}

				impl<G: Graph + Clone> OnlyCommon<G> {
				/// In this first stage of the Discovery process, we gather information
				/// about common revisions only, and we don't need to compute an
				/// undecided set.
				///
				/// The more common revisions are known, the less the computation of
				/// the undecided set is expensive. Therefore, we delay it as much
				/// as possible.
				fn new(
				graph: G,
				target_heads: Vec<Revision>,
				seed: Seed,
				respect_size: bool,
				randomize: bool,
				) -> Self {
				OnlyCommon {
	graph: graph.clone(),			graph: graph.clone(),
	common: MissingAncestors::new(graph, vec![]),			target_heads: target_heads,
	missing: HashSet::new(),
	rng: Rng::from_seed(seed),
	respect_size: respect_size,			respect_size: respect_size,
	randomize: randomize,			randomize: randomize,
				seed: seed,
				common: MissingAncestors::new(graph, vec![]),
				}
				}

				/// Provide the cheapest possible valid object of type `Self`
				///
				/// If later on we can insist on `G` implementing `Default`, then
				/// we can drop the `graph` parameter and move this into an `impl Default`
				/// block. Currently, our concrete type for `G` besides in tests is outside
				/// of this crate (wrapper around index Python object).
				/// We don't want to implement `Default` for it right away.
				fn default(graph: &G) -> Self {
				Self::new(graph.clone(), Vec::new(), [0; 16], false, false)
				}

				/// Register revisions known as being common
				pub fn add_common_revisions(
				&mut self,
				common: impl IntoIterator<Item = Revision>,
				) {
				self.common.add_bases(common);
				}

				pub fn stats(&self) -> DiscoveryStats {
				DiscoveryStats { undecided: None }
				}

				fn compute_undecided(mut self) -> Result<WithUndecided<G>, GraphError> {
				self.common
				.missing_ancestors(self.target_heads.iter().cloned())
				.map(|undecided| WithUndecided::new(self, undecided))
				}
				}

				impl<G: Graph + Clone> WithUndecided<G> {
				fn new(
				disco: OnlyCommon<G>,
				undecided: impl IntoIterator<Item = Revision>,
				) -> Self {
				WithUndecided {
				undecided: undecided.into_iter().collect(),
				children_cache: None,
				rng: Rng::from_seed(disco.seed),
				graph: disco.graph.clone(),
				missing: HashSet::new(),
				respect_size: disco.respect_size,
				randomize: disco.randomize,
				common: disco.common,
	}			}
	}			}

	/// Extract at most `size` random elements from sample and return them			/// Extract at most `size` random elements from sample and return them
	/// as a vector			/// as a vector
	fn limit_sample(			fn limit_sample(
	&mut self,			&mut self,
	mut sample: Vec<Revision>,			mut sample: Vec<Revision>,
	&mut self,			&mut self,
	common: impl IntoIterator<Item = Revision>,			common: impl IntoIterator<Item = Revision>,
	) -> Result<(), GraphError> {			) -> Result<(), GraphError> {
	let before_len = self.common.get_bases().len();			let before_len = self.common.get_bases().len();
	self.common.add_bases(common);			self.common.add_bases(common);
	if self.common.get_bases().len() == before_len {			if self.common.get_bases().len() == before_len {
	return Ok(());			return Ok(());
	}			}
	if let Some(ref mut undecided) = self.undecided {			self.common.remove_ancestors_from(&mut self.undecided)
	self.common.remove_ancestors_from(undecided)?;
	}
	Ok(())
	}			}

	/// Register revisions known as being missing			/// Register revisions known as being missing
	///			///
	/// # Performance note			/// # Performance note
	///			///
	/// Except in the most trivial case, the first call of this method has			/// Except in the most trivial case, the first call of this method has
	/// the side effect of computing `self.undecided` set for the first time,			/// the side effect of computing `self.undecided` set for the first time,
	/// and the related caches it might need for efficiency of its internal			/// and the related caches it might need for efficiency of its internal
	/// computation. This is typically faster if more information is			/// computation. This is typically faster if more information is
	/// available in `self.common`. Therefore, for good performance, the			/// available in `self.common`. Therefore, for good performance, the
	/// caller should avoid calling this too early.			/// caller should avoid calling this too early.
	pub fn add_missing_revisions(			pub fn add_missing_revisions(
	&mut self,			&mut self,
	missing: impl IntoIterator<Item = Revision>,			missing: impl IntoIterator<Item = Revision>,
	) -> Result<(), GraphError> {			) -> Result<(), GraphError> {
	let mut tovisit: VecDeque<Revision> = missing.into_iter().collect();			let mut tovisit: VecDeque<Revision> = missing.into_iter().collect();
	if tovisit.is_empty() {			if tovisit.is_empty() {
	return Ok(());			return Ok(());
	}			}
	self.ensure_children_cache()?;			self.ensure_children_cache()?;
	self.ensure_undecided()?; // for safety of possible future refactors
	let children = self.children_cache.as_ref().unwrap();			let children = self.children_cache.as_ref().unwrap();
	let mut seen: HashSet<Revision> = HashSet::new();			let mut seen: HashSet<Revision> = HashSet::new();
	let undecided_mut = self.undecided.as_mut().unwrap();			let undecided_mut = &mut self.undecided;
	while let Some(rev) = tovisit.pop_front() {			while let Some(rev) = tovisit.pop_front() {
	if !self.missing.insert(rev) {			if !self.missing.insert(rev) {
	// either it's known to be missing from a previous			// either it's known to be missing from a previous
	// invocation, and there's no need to iterate on its			// invocation, and there's no need to iterate on its
	// children (we now they are all missing)			// children (we now they are all missing)
	// or it's from a previous iteration of this loop			// or it's from a previous iteration of this loop
	// and its children have already been queued			// and its children have already been queued
	continue;			continue;

	/// Do we have any information about the peer?			/// Do we have any information about the peer?
	pub fn has_info(&self) -> bool {			pub fn has_info(&self) -> bool {
	self.common.has_bases()			self.common.has_bases()
	}			}

	/// Did we acquire full knowledge of our Revisions that the peer has?			/// Did we acquire full knowledge of our Revisions that the peer has?
	pub fn is_complete(&self) -> bool {			pub fn is_complete(&self) -> bool {
	self.undecided.as_ref().map_or(false, |s| s.is_empty())			self.undecided.is_empty()
	}			}

	/// Return the heads of the currently known common set of revisions.			/// Return the heads of the currently known common set of revisions.
	///			///
	/// If the discovery process is not complete (see `is_complete()`), the			/// If the discovery process is not complete (see `is_complete()`), the
	/// caller must be aware that this is an intermediate state.			/// caller must be aware that this is an intermediate state.
	///			///
	/// On the other hand, if it is complete, then this is currently			/// On the other hand, if it is complete, then this is currently
	/// the only way to retrieve the end results of the discovery process.			/// the only way to retrieve the end results of the discovery process.
	///			///
	/// We may introduce in the future an `into_common_heads` call that			/// We may introduce in the future an `into_common_heads` call that
	/// would be more appropriate for normal Rust callers, dropping `self`			/// would be more appropriate for normal Rust callers, dropping `self`
	/// if it is complete.			/// if it is complete.
	pub fn common_heads(&self) -> Result<HashSet<Revision>, GraphError> {			pub fn common_heads(&self) -> Result<HashSet<Revision>, GraphError> {
	self.common.bases_heads()			self.common.bases_heads()
	}			}

	/// Force first computation of `self.undecided`
	///
	/// After this, `self.undecided.as_ref()` and `.as_mut()` can be
	/// unwrapped to get workable immutable or mutable references without
	/// any panic.
	///
	/// This is an imperative call instead of an access with added lazyness
	/// to reduce easily the scope of mutable borrow for the caller,
	/// compared to undecided(&'a mut self) -> &'a… that would keep it
	/// as long as the resulting immutable one.
	fn ensure_undecided(&mut self) -> Result<(), GraphError> {
	if self.undecided.is_some() {
	return Ok(());
	}
	let tgt = self.target_heads.take().unwrap();
	self.undecided =
	Some(self.common.missing_ancestors(tgt)?.into_iter().collect());
	Ok(())
	}

	fn ensure_children_cache(&mut self) -> Result<(), GraphError> {			fn ensure_children_cache(&mut self) -> Result<(), GraphError> {
	if self.children_cache.is_some() {			if self.children_cache.is_some() {
	return Ok(());			return Ok(());
	}			}
	self.ensure_undecided()?;

	let mut children: FastHashMap<Revision, Vec<Revision>> =			let mut children: FastHashMap<Revision, Vec<Revision>> =
	FastHashMap::default();			FastHashMap::default();
	for &rev in self.undecided.as_ref().unwrap() {			for rev in self.undecided.iter() {
	for p in ParentsIterator::graph_parents(&self.graph, rev)? {			for p in ParentsIterator::graph_parents(&self.graph, *rev)? {
	children.entry(p).or_insert_with(|| Vec::new()).push(rev);			children.entry(p).or_insert_with(|| Vec::new()).push(*rev);
	}			}
	}			}
	self.children_cache = Some(children);			self.children_cache = Some(children);
	Ok(())			Ok(())
	}			}

	/// Provide statistics about the current state of the discovery process			/// Provide statistics about the current state of the discovery process
	pub fn stats(&self) -> DiscoveryStats {			pub fn stats(&self) -> DiscoveryStats {
	DiscoveryStats {			DiscoveryStats {
	undecided: self.undecided.as_ref().map(|s| s.len()),			undecided: Some(self.undecided.len()),
	}			}
	}			}

	pub fn take_quick_sample(			pub fn take_quick_sample(
	&mut self,			&mut self,
	headrevs: impl IntoIterator<Item = Revision>,			headrevs: impl IntoIterator<Item = Revision>,
	size: usize,			size: usize,
	) -> Result<Vec<Revision>, GraphError> {			) -> Result<Vec<Revision>, GraphError> {
	self.ensure_undecided()?;
	let mut sample = {			let mut sample = {
	let undecided = self.undecided.as_ref().unwrap();			let undecided = &self.undecided;
	if undecided.len() <= size {			if undecided.len() <= size {
	return Ok(undecided.iter().cloned().collect());			return Ok(undecided.iter().cloned().collect());
	}			}
	dagops::heads(&self.graph, undecided.iter())?			dagops::heads(&self.graph, undecided.iter())?
	};			};
	if sample.len() >= size {			if sample.len() >= size {
	return Ok(self.limit_sample(sample.into_iter().collect(), size));			return Ok(self.limit_sample(sample.into_iter().collect(), size));
	}			}
	/// No effort is being made to complete or limit the sample to `size`			/// No effort is being made to complete or limit the sample to `size`
	/// but this method returns another interesting size that it derives			/// but this method returns another interesting size that it derives
	/// from its knowledge of the structure of the various sets, leaving			/// from its knowledge of the structure of the various sets, leaving
	/// to the caller the decision to use it or not.			/// to the caller the decision to use it or not.
	fn bidirectional_sample(			fn bidirectional_sample(
	&mut self,			&mut self,
	size: usize,			size: usize,
	) -> Result<(HashSet<Revision>, usize), GraphError> {			) -> Result<(HashSet<Revision>, usize), GraphError> {
	self.ensure_undecided()?;
	{			{
	// we don't want to compute children_cache before this			// we don't want to compute children_cache before this
	// but doing it after extracting self.undecided takes a mutable			// but doing it after extracting self.undecided takes a mutable
	// ref to self while a shareable one is still active.			// ref to self while a shareable one is still active.
	let undecided = self.undecided.as_ref().unwrap();			if self.undecided.len() <= size {
	if undecided.len() <= size {			return Ok((self.undecided.clone(), size));
	return Ok((undecided.clone(), size));
	}			}
	}			}

	self.ensure_children_cache()?;			self.ensure_children_cache()?;
	let revs = self.undecided.as_ref().unwrap();			let revs = &self.undecided;
	let mut sample: HashSet<Revision> = revs.clone();			let mut sample: HashSet<Revision> = revs.clone();

	// it's possible that leveraging the children cache would be more			// it's possible that leveraging the children cache would be more
	// efficient here			// efficient here
	dagops::retain_heads(&self.graph, &mut sample)?;			dagops::retain_heads(&self.graph, &mut sample)?;
	let revsheads = sample.clone(); // was again heads(revs) in python			let revsheads = sample.clone(); // was again heads(revs) in python

	// update from heads			// update from heads
	size: usize,			size: usize,
	) {			) {
	let sample_len = sample.len();			let sample_len = sample.len();
	if size <= sample_len {			if size <= sample_len {
	return;			return;
	}			}
	let take_from: Vec<Revision> = self			let take_from: Vec<Revision> = self
	.undecided			.undecided
	.as_ref()
	.unwrap()
	.iter()			.iter()
	.filter(|&r| !sample.contains(r))			.filter(|&r| !sample.contains(r))
	.cloned()			.cloned()
	.collect();			.collect();
	sample.extend(self.limit_sample(take_from, size - sample_len));			sample.extend(self.limit_sample(take_from, size - sample_len));
	}			}

	pub fn take_full_sample(			pub fn take_full_sample(
	SampleGraph,			SampleGraph,
	vec![10, 11, 12, 13],			vec![10, 11, 12, 13],
	[0; 16],			[0; 16],
	true,			true,
	true,			true,
	)			)
	}			}

				fn full_disco_with_undecided() -> WithUndecided<SampleGraph> {
				OnlyCommon::new(SampleGraph, vec![10, 11, 12, 13], [0; 16], true, true)
				.compute_undecided()
				.unwrap()
				}

	/// A PartialDiscovery as for pushing the 12 head of `SampleGraph`			/// A PartialDiscovery as for pushing the 12 head of `SampleGraph`
	///			///
	/// To avoid actual randomness in tests, we give it a fixed random seed.			/// To avoid actual randomness in tests, we give it a fixed random seed.
	fn disco12() -> PartialDiscovery<SampleGraph> {			fn disco12() -> PartialDiscovery<SampleGraph> {
	PartialDiscovery::new_with_seed(			PartialDiscovery::new_with_seed(
	SampleGraph,			SampleGraph,
	vec![12],			vec![12],
	[0; 16],			[0; 16],
	true,			true,
	true,			true,
	)			)
	}			}

				fn unwrap_disco_with_undecided(
				disco: &PartialDiscovery<SampleGraph>,
				) -> &WithUndecided<SampleGraph> {
				match disco {
				Com(_) => {
				panic!("Unexpected variant");
				}
				Und(wu) => wu,
				}
				}

				fn force_undecided(
				disco: &mut PartialDiscovery<SampleGraph>,
				undecided: impl IntoIterator<Item = Revision>,
				) {
				let undecided_set: HashSet<Revision> = undecided.into_iter().collect();
				disco
				.mutate_undecided(
				|oc| Ok(WithUndecided::new(oc, Vec::new())),
				|wu| {
				wu.undecided = undecided_set;
				Ok(())
				},
				)
				.unwrap()
				}

	fn sorted_undecided(			fn sorted_undecided(
	disco: &PartialDiscovery<SampleGraph>,			disco: &PartialDiscovery<SampleGraph>,
	) -> Vec<Revision> {			) -> Vec<Revision> {
	let mut as_vec: Vec<Revision> =			let mut as_vec: Vec<Revision> = unwrap_disco_with_undecided(disco)
	disco.undecided.as_ref().unwrap().iter().cloned().collect();			.undecided
				.iter()
				.cloned()
				.collect();
	as_vec.sort();			as_vec.sort();
	as_vec			as_vec
	}			}

	fn sorted_missing(disco: &PartialDiscovery<SampleGraph>) -> Vec<Revision> {			fn sorted_missing(disco: &PartialDiscovery<SampleGraph>) -> Vec<Revision> {
	let mut as_vec: Vec<Revision> =			let mut as_vec: Vec<Revision> = unwrap_disco_with_undecided(disco)
	disco.missing.iter().cloned().collect();			.missing
				.iter()
				.cloned()
				.collect();
	as_vec.sort();			as_vec.sort();
	as_vec			as_vec
	}			}

	fn sorted_common_heads(			fn sorted_common_heads(
	disco: &PartialDiscovery<SampleGraph>,			disco: &PartialDiscovery<SampleGraph>,
	) -> Result<Vec<Revision>, GraphError> {			) -> Result<Vec<Revision>, GraphError> {
	let mut as_vec: Vec<Revision> =			let mut as_vec: Vec<Revision> =
	disco.common_heads()?.iter().cloned().collect();			disco.common_heads()?.iter().cloned().collect();
	as_vec.sort();			as_vec.sort();
	Ok(as_vec)			Ok(as_vec)
	}			}

				fn assert_disco_is_only_commons(
				disco: &PartialDiscovery<SampleGraph>,
				) -> () {
				if let Com(_) = disco {
				return;
				}
				panic!("Not a discovery::OnlyCommon");
				}

				fn assert_disco_is_b(disco: &PartialDiscovery<SampleGraph>) -> () {
				if let Und(_) = disco {
				return;
				}
				panic!("Not a discovery::WithUndecided");
				}

	#[test]			#[test]
	fn test_add_common_get_undecided() -> Result<(), GraphError> {			fn test_add_common_get_undecided() -> Result<(), GraphError> {
	let mut disco = full_disco();			let mut disco = full_disco();
	assert_eq!(disco.undecided, None);
	assert!(!disco.has_info());			assert!(!disco.has_info());
	assert_eq!(disco.stats().undecided, None);			assert_eq!(disco.stats().undecided, None);

	disco.add_common_revisions(vec![11, 12])?;			disco.add_common_revisions(vec![11, 12])?;
	assert!(disco.has_info());			assert!(disco.has_info());
	assert!(!disco.is_complete());			assert!(!disco.is_complete());
	assert!(disco.missing.is_empty());

	// add_common_revisions did not trigger a premature computation			// add_common_revisions did not trigger a premature computation
	// of `undecided`, let's check that and ask for them			// of `undecided`, let's check that, force the mutation and
	assert_eq!(disco.undecided, None);			// ask for them
	disco.ensure_undecided()?;			assert_disco_is_only_commons(&disco);
				disco.add_missing_revisions(Vec::new())?;
				assert_disco_is_b(&disco);
	assert_eq!(sorted_undecided(&disco), vec![5, 8, 10, 13]);			assert_eq!(sorted_undecided(&disco), vec![5, 8, 10, 13]);
	assert_eq!(disco.stats().undecided, Some(4));			assert_eq!(disco.stats().undecided, Some(4));
	Ok(())			Ok(())
	}			}

	/// in this test, we pretend that our peer misses exactly (8+10)::			/// in this test, we pretend that our peer misses exactly (8+10)::
	/// and we're comparing all our repo to it (as in a bare push)			/// and we're comparing all our repo to it (as in a bare push)
	#[test]			#[test]
	Ok(())			Ok(())
	}			}

	#[test]			#[test]
	fn test_add_missing_early_continue() -> Result<(), GraphError> {			fn test_add_missing_early_continue() -> Result<(), GraphError> {
	eprintln!("test_add_missing_early_stop");			eprintln!("test_add_missing_early_stop");
	let mut disco = full_disco();			let mut disco = full_disco();
	disco.add_common_revisions(vec![13, 3, 4])?;			disco.add_common_revisions(vec![13, 3, 4])?;
	disco.ensure_children_cache()?;
	// 12 is grand-child of 6 through 9			// 12 is grand-child of 6 through 9
	// passing them in this order maximizes the chances of the			// passing them in this order maximizes the chances of the
	// early continue to do the wrong thing			// early continue to do the wrong thing
	disco.add_missing_revisions(vec![6, 9, 12])?;			disco.add_missing_revisions(vec![6, 9, 12])?;
	assert_eq!(sorted_undecided(&disco), vec![5, 7, 10, 11]);			assert_eq!(sorted_undecided(&disco), vec![5, 7, 10, 11]);
	assert_eq!(sorted_missing(&disco), vec![6, 9, 12]);			assert_eq!(sorted_missing(&disco), vec![6, 9, 12]);
	assert!(!disco.is_complete());			assert!(!disco.is_complete());
	Ok(())			Ok(())
	}			}

	#[test]			#[test]
	fn test_limit_sample_no_need_to() {			fn test_limit_sample_no_need_to() {
	let sample = vec![1, 2, 3, 4];			let sample = vec![1, 2, 3, 4];
	assert_eq!(full_disco().limit_sample(sample, 10), vec![1, 2, 3, 4]);			assert_eq!(
				full_disco_with_undecided().limit_sample(sample, 10),
				vec![1, 2, 3, 4]
				);
	}			}

	#[test]			#[test]
	fn test_limit_sample_less_than_half() {			fn test_limit_sample_less_than_half() {
	assert_eq!(full_disco().limit_sample((1..6).collect(), 2), vec![4, 2]);			assert_eq!(
				full_disco_with_undecided().limit_sample((1..6).collect(), 2),
				vec![4, 2]
				);
	}			}

	#[test]			#[test]
	fn test_limit_sample_more_than_half() {			fn test_limit_sample_more_than_half() {
	assert_eq!(full_disco().limit_sample((1..4).collect(), 2), vec![3, 2]);			assert_eq!(
				full_disco_with_undecided().limit_sample((1..4).collect(), 2),
				vec![3, 2]
				);
	}			}

	#[test]			#[test]
	fn test_limit_sample_no_random() {			fn test_limit_sample_no_random() {
	let mut disco = full_disco();			let mut disco = full_disco_with_undecided();
	disco.randomize = false;			disco.randomize = false;
	assert_eq!(			assert_eq!(
	disco.limit_sample(vec![1, 8, 13, 5, 7, 3], 4),			disco.limit_sample(vec![1, 8, 13, 5, 7, 3], 4),
	vec![1, 3, 5, 7]			vec![1, 3, 5, 7]
	);			);
	}			}

	#[test]			#[test]
	fn test_quick_sample_enough_undecided_heads() -> Result<(), GraphError> {			fn test_quick_sample_enough_undecided_heads() -> Result<(), GraphError> {
	let mut disco = full_disco();			let mut disco = full_disco();
	disco.undecided = Some((1..=13).collect());			force_undecided(&mut disco, 1..=13);

	let mut sample_vec = disco.take_quick_sample(vec![], 4)?;			let mut sample_vec = disco.take_quick_sample(vec![], 4)?;
	sample_vec.sort();			sample_vec.sort();
	assert_eq!(sample_vec, vec![10, 11, 12, 13]);			assert_eq!(sample_vec, vec![10, 11, 12, 13]);
	Ok(())			Ok(())
	}			}

	#[test]			#[test]
	fn test_quick_sample_climbing_from_12() -> Result<(), GraphError> {			fn test_quick_sample_climbing_from_12() -> Result<(), GraphError> {
	let mut disco = disco12();			let mut disco = disco12();
	disco.ensure_undecided()?;

	let mut sample_vec = disco.take_quick_sample(vec![12], 4)?;			let mut sample_vec = disco.take_quick_sample(vec![12], 4)?;
	sample_vec.sort();			sample_vec.sort();
	// r12's only parent is r9, whose unique grand-parent through the			// r12's only parent is r9, whose unique grand-parent through the
	// diamond shape is r4. This ends there because the distance from r4			// diamond shape is r4. This ends there because the distance from r4
	// to the root is only 3.			// to the root is only 3.
	assert_eq!(sample_vec, vec![4, 9, 12]);			assert_eq!(sample_vec, vec![4, 9, 12]);
	Ok(())			Ok(())
	}			}

	#[test]			#[test]
	fn test_children_cache() -> Result<(), GraphError> {			fn test_children_cache() -> Result<(), GraphError> {
	let mut disco = full_disco();			let mut disco = full_disco_with_undecided();
	disco.ensure_children_cache()?;			disco.ensure_children_cache()?;

	let cache = disco.children_cache.unwrap();			let cache = disco.children_cache.unwrap();
	assert_eq!(cache.get(&2).cloned(), Some(vec![4]));			assert_eq!(cache.get(&2).cloned(), Some(vec![4]));
	assert_eq!(cache.get(&10).cloned(), None);			assert_eq!(cache.get(&10).cloned(), None);

	let mut children_4 = cache.get(&4).cloned().unwrap();			let mut children_4 = cache.get(&4).cloned().unwrap();
	children_4.sort();			children_4.sort();
	assert_eq!(children_4, vec![5, 6, 7]);			assert_eq!(children_4, vec![5, 6, 7]);

	let mut children_7 = cache.get(&7).cloned().unwrap();			let mut children_7 = cache.get(&7).cloned().unwrap();
	children_7.sort();			children_7.sort();
	assert_eq!(children_7, vec![9, 11]);			assert_eq!(children_7, vec![9, 11]);

	Ok(())			Ok(())
	}			}

	#[test]			#[test]
	fn test_complete_sample() {			fn test_complete_sample() {
	let mut disco = full_disco();			let mut disco = full_disco_with_undecided();
	let undecided: HashSet<Revision> =			disco.undecided = vec![4, 7, 9, 2, 3].into_iter().collect();
	[4, 7, 9, 2, 3].iter().cloned().collect();
	disco.undecided = Some(undecided);

	let mut sample = vec![0];			let mut sample = vec![0];
	disco.random_complete_sample(&mut sample, 3);			disco.random_complete_sample(&mut sample, 3);
	assert_eq!(sample.len(), 3);			assert_eq!(sample.len(), 3);

	let mut sample = vec![2, 4, 7];			let mut sample = vec![2, 4, 7];
	disco.random_complete_sample(&mut sample, 1);			disco.random_complete_sample(&mut sample, 1);
	assert_eq!(sample.len(), 3);			assert_eq!(sample.len(), 3);
	}			}

	#[test]			#[test]
	fn test_bidirectional_sample() -> Result<(), GraphError> {			fn test_bidirectional_sample() -> Result<(), GraphError> {
	let mut disco = full_disco();			let mut disco = full_disco_with_undecided();
	disco.undecided = Some((0..=13).into_iter().collect());			disco.undecided = (0..=13).collect();

	let (sample_set, size) = disco.bidirectional_sample(7)?;			let (sample_set, size) = disco.bidirectional_sample(7)?;
	assert_eq!(size, 7);			assert_eq!(size, 7);
	let mut sample: Vec<Revision> = sample_set.into_iter().collect();			let mut sample: Vec<Revision> = sample_set.into_iter().collect();
	sample.sort();			sample.sort();
	// our DAG is a bit too small for the results to be really interesting			// our DAG is a bit too small for the results to be really interesting
	// at least it shows that			// at least it shows that
	// - we went both ways			// - we went both ways
	// - we didn't take all Revisions (6 is not in the sample)			// - we didn't take all Revisions (6 is not in the sample)
	assert_eq!(sample, vec![0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13]);			assert_eq!(sample, vec![0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13]);
	Ok(())			Ok(())
	}			}
	}			}
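
The tests above build their fixtures in one of the two states named in the summary (OnlyCommon and WithUndecided) and rely on the wrapper switching from the first to the second in place. What follows is a minimal sketch of that idea, not the code of this changeset. The field layout, the rule used to compute the undecided set, the Transitioning placeholder variant, and the helper names into_with_undecided and demo are all assumptions made for illustration. Only the names OnlyCommon, WithUndecided, PartialDiscovery and mutate_undecided come from the summary.

```rust
use std::collections::HashSet;

type Revision = i32;

/// First stage: only common revisions and target heads are known.
struct OnlyCommon {
    common: HashSet<Revision>,
    target_heads: Vec<Revision>,
}

/// Second stage: the undecided set always exists, so no Option is needed.
struct WithUndecided {
    common: HashSet<Revision>,
    undecided: HashSet<Revision>,
}

impl OnlyCommon {
    /// Consume the first stage and build the second one.
    /// Hypothetical rule: every revision up to the highest target head
    /// that is not already common is undecided.
    fn into_with_undecided(self) -> WithUndecided {
        let max = self.target_heads.iter().copied().max().unwrap_or(-1);
        let undecided: HashSet<Revision> =
            (0..=max).filter(|r| !self.common.contains(r)).collect();
        WithUndecided {
            common: self.common,
            undecided,
        }
    }
}

/// Wrapper enum giving callers a single type to hold.
enum PartialDiscovery {
    OnlyCommon(OnlyCommon),
    WithUndecided(WithUndecided),
    /// Assumed placeholder, present only while a transition is running.
    Transitioning,
}

impl PartialDiscovery {
    /// Switch to the second stage in place if needed, then hand out
    /// mutable access to it.
    fn mutate_undecided(&mut self) -> &mut WithUndecided {
        if matches!(*self, PartialDiscovery::OnlyCommon(_)) {
            // Take ownership of the old state so it can be consumed.
            let old = std::mem::replace(self, PartialDiscovery::Transitioning);
            if let PartialDiscovery::OnlyCommon(oc) = old {
                *self = PartialDiscovery::WithUndecided(oc.into_with_undecided());
            }
        }
        match self {
            PartialDiscovery::WithUndecided(wu) => wu,
            _ => unreachable!("the transition always ends in WithUndecided"),
        }
    }
}

/// Usage example: start in the first stage, force the transition,
/// and return the undecided revisions in sorted order.
fn demo() -> Vec<Revision> {
    let mut disco = PartialDiscovery::OnlyCommon(OnlyCommon {
        common: vec![0, 1].into_iter().collect(),
        target_heads: vec![4],
    });
    let mut undecided: Vec<Revision> =
        disco.mutate_undecided().undecided.iter().copied().collect();
    undecided.sort();
    undecided
}
```

In this sketch the remaining unreachable!() is confined to one method instead of being spread over every access to the undecided set, and the inner WithUndecided methods can use the set directly. Whether the actual patch uses a placeholder variant or another technique for the in-place switch is not shown in this part of the diff.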

Diff	ID	Base	Description	Created	Lint	Unit
Diff 1	18924			Dec 24 2019, 8:49 AM	★	★