This is an archive of the discontinued Mercurial Phabricator instance.

Differential D11616

rhg: stop manifest traversal when no more files are needed
ClosedPublic

Authored by aalekseyev on Oct 5 2021, 10:16 AM.

Details

Reviewers

martinvonz

Group Reviewers

hg-reviewers

Commits

rHG0cc69017d47f: rhg: stop manifest traversal when no more files are needed

Summary

Stopping the traversal early can skip a significant part
of the manifest, avoiding some of the traversal cost.

The worst-case benchmarks are favorable, as well.
Running [hg cat] on the last file in the manifest of
a large repo, I'm seeing a ~4ms improvement (150ms -> 146ms),
so this time is now almost indistinguishable from the
baseline ("brute force") implementation.

Running [hg cat] on ~220 files together with the last file
of the repo is further improved by ~5ms or so.

I suspect the raw performance improvements are caused by splitting
the manifest search and the file data access into separate phases,
instead of interleaving them.
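
The two-phase idea described above can be sketched as follows. This is a minimal illustration, not the code from the patch: `find_sorted` is a hypothetical name, plain `&str` paths and `u32` values stand in for `HgPath` and the node bytes, and `std::iter::Peekable` is used in place of the `itertools::PutBack` wrapper the patch relies on. It assumes both the manifest and the requested paths are sorted and free of duplicates.

```rust
use std::cmp::Ordering;

/// Phase 1 only: walk a sorted manifest once and resolve each sorted
/// `wanted` path. The manifest is not read past the point where the last
/// wanted path is resolved. Returns the found entries, the missing paths,
/// and how many manifest entries were consumed (for illustration).
fn find_sorted<'a, 'b, I>(
    manifest: I,
    wanted: &[&'b str],
) -> (Vec<(&'a str, u32)>, Vec<&'b str>, usize)
where
    I: Iterator<Item = (&'a str, u32)>,
{
    let mut manifest = manifest.peekable();
    let mut found = Vec::new();
    let mut missing = Vec::new();
    let mut consumed = 0;

    for &path in wanted {
        loop {
            match manifest.peek() {
                // Manifest exhausted: the path cannot be present.
                None => {
                    missing.push(path);
                    break;
                }
                Some(&(entry_path, _)) => match path.cmp(entry_path) {
                    // Manifest entry sorts before the path: skip it.
                    Ordering::Greater => {
                        manifest.next();
                        consumed += 1;
                    }
                    // Manifest is already past the path: it is missing.
                    // The entry stays in the iterator for the next path.
                    Ordering::Less => {
                        missing.push(path);
                        break;
                    }
                    Ordering::Equal => {
                        // `peek` just returned `Some`, so `next` cannot fail.
                        let entry = manifest.next().unwrap();
                        consumed += 1;
                        found.push(entry);
                        break;
                    }
                },
            }
        }
    }
    (found, missing, consumed)
}
```

File data would then be read in a second pass over the returned `found` entries, so the manifest search and the file data access are no longer interleaved.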

Diff Detail

Repository

rHG Mercurial

Branch

default

Lint

No Linters Available

Unit

No Unit Test Coverage

Event Timeline

aalekseyev created this revision. Oct 5 2021, 10:16 AM

Herald added a reviewer: hg-reviewers. Oct 5 2021, 10:16 AM

Herald added a subscriber: mercurial-patches.

aalekseyev edited the summary of this revision. Oct 5 2021, 10:44 AM

aalekseyev updated this revision to Diff 30679. Oct 5 2021, 10:50 AM

I think this can be done more simply. I've sent D11622 to show what I mean. Does that make sense? How does it perform in your test repo?

This revision now requires changes to proceed. Oct 8 2021, 8:17 PM

martinvonz added a child revision: D11622: dirstate: simplify cat operation. Oct 8 2021, 8:42 PM

aalekseyev updated this revision to Diff 30700. Oct 11 2021, 6:51 AM

In D11616#178090, @martinvonz wrote:

I think this can be done more simply. I've sent D11622 to show what I mean. Does that make sense? How does it perform in your test repo?

Yeah, using a hashtable was the first thing I tried, but hashtable lookup is slow compared to plain comparison, and since it's done in a tight loop it really does add up.
I've run some benchmarks in different scenarios, ordered roughly by complexity.

orig: 1b0f8aafedea
hashtbl: 8143112baaa4 (your patch, minus the early termination)
merge-itertools: 030cbc80308a
hashtbl-stop-early: e0e7ca2b9334 (your patch)
merge-manual: d83c7db2f91f (my patch)

one:
                orig    139 (+ 0.0%) (err 0.7%)
             hashtbl    148 (+ 6.6%) (err 0.4%)
     merge-itertools    143 (+ 3.2%) (err 0.5%)
  hashtbl-stop-early    144 (+ 3.8%) (err 0.6%)
        merge-manual    135 (- 3.0%) (err 1.7%)

two:
                orig    141 (+ 0.0%) (err 1.8%)
             hashtbl    150 (+ 6.0%) (err 0.5%)
     merge-itertools    145 (+ 2.6%) (err 0.4%)
  hashtbl-stop-early    146 (+ 3.4%) (err 1.7%)
        merge-manual    136 (- 4.0%) (err 1.0%)

last:
                orig    140 (+ 0.0%) (err 0.7%)
             hashtbl    149 (+ 6.7%) (err 0.8%)
     merge-itertools    145 (+ 3.8%) (err 1.4%)
  hashtbl-stop-early    156 (+11.8%) (err 0.6%)
        merge-manual    142 (+ 1.4%) (err 0.6%)

many:
                orig    213 (+ 0.0%) (err 0.6%)
             hashtbl    183 (-14.5%) (err 0.4%)
     merge-itertools    175 (-17.9%) (err 0.4%)
  hashtbl-stop-early    178 (-16.6%) (err 0.6%)
        merge-manual    166 (-22.2%) (err 0.9%)

many+last:
                orig    215 (+ 0.0%) (err 0.8%)
             hashtbl    184 (-14.4%) (err 0.9%)
     merge-itertools    177 (-17.7%) (err 0.8%)
  hashtbl-stop-early    191 (-10.8%) (err 1.0%)
        merge-manual    174 (-18.9%) (err 1.8%)

large:
                orig    149 (+ 0.0%) (err 0.7%)
             hashtbl    158 (+ 6.5%) (err 0.9%)
     merge-itertools    156 (+ 4.7%) (err 0.7%)
  hashtbl-stop-early    157 (+ 5.4%) (err 0.6%)
        merge-manual    148 (- 0.8%) (err 1.2%)

These show that hashtbl loses to merge-itertools in pretty much all cases, with or without the early termination.
I can hand-craft a case where early termination gives it some extra advantage, but the existing "one" and "many" benchmarks already terminate at ~5% of the traversal, so we're already in an optimistic case.
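
For comparison, a hashtable-based lookup with early termination might look like the sketch below. This is an assumption-laden illustration rather than the code from D11622: `find_hashed` is a hypothetical name, and `&str` paths and `u32` values stand in for `HgPath` and the node bytes. Every visited manifest entry costs one hash lookup, which is the per-entry overhead discussed above, and the loop can only stop early once every requested path has been found.

```rust
use std::collections::HashSet;

/// Scan the manifest, probing a hash set of the paths still wanted for every
/// entry. Returns the found entries and the number of manifest entries
/// visited (for illustration).
fn find_hashed<'a>(
    manifest: impl Iterator<Item = (&'a str, u32)>,
    wanted: &[&str],
) -> (Vec<(&'a str, u32)>, usize) {
    let mut remaining: HashSet<&str> = wanted.iter().copied().collect();
    let mut found = Vec::new();
    let mut visited = 0;

    // Nothing requested: no need to touch the manifest at all.
    if remaining.is_empty() {
        return (found, visited);
    }

    for (path, node) in manifest {
        visited += 1;
        // One hash lookup per manifest entry, inside the tight loop.
        if remaining.remove(path) {
            found.push((path, node));
            // Early termination: every requested path has been found.
            if remaining.is_empty() {
                break;
            }
        }
    }
    (found, visited)
}
```

In this sketch a single requested path that is absent from the manifest forces a full traversal, because the set never becomes empty.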

In D11616#178120, @aalekseyev wrote:
...

Wow, thanks for taking the time to check so carefully! I had only tried it in the hg repo and I couldn't measure any difference at all there. I tried just now in the mozilla-unified repo. The differences are not measurable there either (see below). I built with HGWITHRUSTEXT=cpython make local and benchmarked with hyperfine. Do you think your repo is different or did I not test correctly?

browser/moz.build:
yours: 204.8 ms ±   5.5 ms
mine:  203.2 ms ±   3.5 ms


xpfe/appshell/moz.build:
yours: 204.6 ms ±   5.3 ms
mine:  203.1 ms ±   2.6 ms

set:**/moz.build
yours: 649.9 ms ±   8.0 ms
mine:  647.5 ms ±   7.4 ms

I tried just now in the mozilla-unified repo. The differences are not measurable there either (see below).

Interesting. I'll try to reproduce that.

I built with HGWITHRUSTEXT=cpython make local

I was building with [make build-rhg], but I'm guessing that's the same.

Do you think your repo is different?

Mine has ~260k files. I don't know how many the mozilla repo has, but I'll re-run my benchmarks on that tomorrow to see if I'm getting the same results.

set:**/moz.build

I think rhg does not support the set language, so it must be falling back to python in this case?

In D11616#178122, @aalekseyev wrote:

set:**/moz.build

I think rhg does not support the set language, so it must be falling back to python in this case?

That was just a lazy description by me. What I actually did was files=$(hg files 'set:**/moz.build'); hyperfine "rhg cat $files".

In D11616#178122, @aalekseyev wrote:

...
Mine has ~260k files. I don't know how many the mozilla repo has, but I'll re-run my benchmarks on that tomorrow to see if I'm getting the same results.

Actually the clone finished before I left, so posting the results now:

all-moz.build:
                orig    152 (+ 0.0%) (err 10.2%)
             hashtbl    156 (+ 2.4%) (err 2.8%)
     merge-itertools    148 (- 2.6%) (err 1.0%)
  hashtbl-stop-early    163 (+ 7.2%) (err 2.8%)
        merge-manual    143 (- 5.8%) (err 1.1%)

browser:
                orig    140 (+ 0.0%) (err 1.0%)
             hashtbl    154 (+10.1%) (err 1.1%)
     merge-itertools    145 (+ 4.0%) (err 0.9%)
  hashtbl-stop-early    112 (-20.0%) (err 1.6%)
        merge-manual    111 (-20.5%) (err 1.1%)

xpfe:
                orig    139 (+ 0.0%) (err 1.9%)
             hashtbl    153 (+ 9.6%) (err 0.8%)
     merge-itertools    146 (+ 4.5%) (err 1.0%)
  hashtbl-stop-early    161 (+15.4%) (err 2.1%)
        merge-manual    142 (+ 1.8%) (err 0.6%)

In D11616#178124, @aalekseyev wrote:

...

I wonder why we see such different results. Are those the times reported by hyperfine in your case?

Possibly relevant info about my setup:
rustc version: rustc 1.57.0-nightly (8f8092cc3 2021-09-28)
CPU: Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz
OS: Debian

I guess we can also ask @Alphare and @SimonSapin which version they prefer. (I'm personally unlikely to have to maintain this code.)

In D11616#178125, @martinvonz wrote:

I wonder why we see such different results. Are those the times reported by hyperfine in your case?

No, I collected the times with a custom OCaml script, but I've been seeing similar results when measuring by hand with time.
I haven't tried using hyperfine.

rustc version: rustc 1.57.0-nightly (8f8092cc3 2021-09-28)
CPU: Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz
OS: Debian

I've got:

rustc 1.55.0 (Red Hat 1.55.0-1.el7)
CPU: AMD EPYC 7702P
OS: CentOS 7

I found one more difference: my all-moz.build benchmark is only concatenating 42 files, instead of 1897, because my glob was not recursive. Whoops.
Correcting for that, I'm still seeing a 15-20ms advantage of the merge-based algorithm over the hashtable-based one.
The absolute time is 220ms vs 240ms, by the way, which is surprisingly far off from the 600+ms you're seeing.

I'm getting ~630ms on **/moz.build when I run the original version (1b0f8aafedea), by the way. Could it be that your ~650ms measurement corresponds to that?

In D11616#178599, @aalekseyev wrote:

I'm getting ~630ms on **/moz.build when I run the original version (1b0f8aafedea), by the way. Could it be that your ~650ms measurement corresponds to that?

Yeah, it's very possible I messed up my measurements somehow. I'll check again tomorrow, then I'll probably queue your version.

In D11616#178600, @martinvonz wrote:

In D11616#178599, @aalekseyev wrote:

I'm getting ~630ms on **/moz.build when I run the original version (1b0f8aafedea), by the way. Could it be that your ~650ms measurement corresponds to that?

Yeah, it's very possible I messed up my measurements somehow. I'll check again tomorrow, then I'll probably queue your version.

Actually, I'll skip checking that. The important thing is that you and others working on this are happy.

This revision is now accepted and ready to land. Oct 14 2021, 11:08 AM

aalekseyev added a commit: rHG0cc69017d47f: rhg: stop manifest traversal when no more files are needed. Oct 14 2021, 11:16 AM

Closed by commit rHG0cc69017d47f: rhg: stop manifest traversal when no more files are needed (authored by aalekseyev).

This revision was automatically updated to reflect the committed changes.

Thanks!

aalekseyev mentioned this in D11679: rhg: internally, return a structured representation from hg cat. Oct 18 2021, 6:07 AM

Revision Contents
Changeset List

			Path	Packages
M			rust/hg-core/src/operations/cat.rs (82 lines)

Diff	ID	Description	Created	Lint	Unit
Base		Base
Diff 1	30678		Oct 5 2021, 10:16 AM	★	★
Diff 2	30679		Oct 5 2021, 10:50 AM	★	★
Diff 3	30700		Oct 11 2021, 6:50 AM	★	★
Diff 4	30807	rHG0cc69017d47f1344ff0408dc344293e39342bade	Oct 5 2021, 10:10 AM	★	★

Commit	Parents	Author	Summary	Date
d1ce01a81b87	b7fd92e29705	Arseniy Alekseyev		Oct 5 2021, 10:10 AM

Status	Author	Revision
Abandoned	martinvonz	D11622 dirstate: simplify cat operation
Closed	aalekseyev	D11616 rhg: stop manifest traversal when no more files are needed
Closed	aalekseyev	D11615 rhg: faster hg cat when many files are requested

Diff 30678

rust/hg-core/src/operations/cat.rs

	// list_tracked_files.rs			// list_tracked_files.rs
	//			//
	// Copyright 2020 Antoine Cezar <antoine.cezar@octobus.net>			// Copyright 2020 Antoine Cezar <antoine.cezar@octobus.net>
	//			//
	// This software may be used and distributed according to the terms of the			// This software may be used and distributed according to the terms of the
	// GNU General Public License version 2 or any later version.			// GNU General Public License version 2 or any later version.

	use crate::repo::Repo;			use crate::repo::Repo;
	use crate::revlog::revlog::RevlogError;			use crate::revlog::revlog::RevlogError;
	use crate::revlog::Node;			use crate::revlog::Node;

				use crate::utils::hg_path::HgPath;
	use crate::utils::hg_path::HgPathBuf;			use crate::utils::hg_path::HgPathBuf;

	use itertools::EitherOrBoth::{Both, Left, Right};			use itertools::put_back;
	use itertools::Itertools;			use itertools::PutBack;
				use std::cmp::Ordering;

	pub struct CatOutput {			pub struct CatOutput {
	/// Whether any file in the manifest matched the paths given as CLI			/// Whether any file in the manifest matched the paths given as CLI
	/// arguments			/// arguments
	pub found_any: bool,			pub found_any: bool,
	/// The contents of matching files, in manifest order			/// The contents of matching files, in manifest order
	pub concatenated: Vec<u8>,			pub concatenated: Vec<u8>,
	/// Which of the CLI arguments did not match any manifest file			/// Which of the CLI arguments did not match any manifest file
	pub missing: Vec<HgPathBuf>,			pub missing: Vec<HgPathBuf>,
	/// The node ID that the given revset was resolved to			/// The node ID that the given revset was resolved to
	pub node: Node,			pub node: Node,
	}			}

				// Find an item in an iterator over a sorted collection.
				fn find_item<'a, 'b, 'c, D, I: Iterator<Item = (&'a HgPath, D)>>(
				i: &mut PutBack<I>,
				needle: &'b HgPath,
				) -> Option<I::Item> {
				loop {
				match i.next() {
				None => return None,
				Some(val) => match needle.as_bytes().cmp(val.0.as_bytes()) {
				Ordering::Less => {
				i.put_back(val);
				return None;
				}
				Ordering::Greater => continue,
				Ordering::Equal => return Some(val),
				},
				}
				}
				}

				fn find_files_in_manifest<
				'a,
				'b,
				'c,
				D,
				I: Iterator<Item = (&'a HgPath, D)>,
				J: Iterator<Item = &'b HgPath>,
				>(
				i: I,
				j: J,
				) -> (Vec<(&'a HgPath, D)>, Vec<&'b HgPath>) {
				let mut manifest_iterator = put_back(i);
				let mut res = vec![];
				let mut missing = vec![];

				for file in j {
				match find_item(&mut manifest_iterator, file) {
				None => missing.push(file),
				Some(item) => res.push(item),
				}
				}
				return (res, missing);
				}

	/// Output the given revision of files			/// Output the given revision of files
	///			///
	/// * `root`: Repository root			/// * `root`: Repository root
	/// * `rev`: The revision to cat the files from.			/// * `rev`: The revision to cat the files from.
	/// * `files`: The files to output.			/// * `files`: The files to output.
	pub fn cat<'a>(			pub fn cat<'a>(
	repo: &Repo,			repo: &Repo,
	revset: &str,			revset: &str,
	mut files: Vec<HgPathBuf>,			mut files: Vec<HgPathBuf>,
	) -> Result<CatOutput, RevlogError> {			) -> Result<CatOutput, RevlogError> {
	let rev = crate::revset::resolve_single(revset, repo)?;			let rev = crate::revset::resolve_single(revset, repo)?;
	let manifest = repo.manifest_for_rev(rev)?;			let manifest = repo.manifest_for_rev(rev)?;
	let node = *repo			let node = *repo
	.changelog()?			.changelog()?
	.node_from_rev(rev)			.node_from_rev(rev)
	.expect("should succeed when repo.manifest did");			.expect("should succeed when repo.manifest did");
	let mut bytes = vec![];			let mut bytes: Vec<u8> = vec![];
	let mut found_any = false;			let mut found_any = false;

	files.sort_unstable();			files.sort_unstable();

	let mut missing = vec![];			let (found, missing) = find_files_in_manifest(
				manifest.files_with_nodes(),
				files.iter().map(|f| f.as_ref()),
				);

	for entry in manifest			for (manifest_file, node_bytes) in found {
	.files_with_nodes()
	.merge_join_by(files.iter(), |(manifest_file, _), file| {
	manifest_file.cmp(&file.as_ref())
	})
	{
	match entry {
	Left(_) => (),
	Right(path) => missing.push(path),
	Both((manifest_file, node_bytes), _) => {
	found_any = true;			found_any = true;
	let file_log = repo.filelog(manifest_file)?;			let file_log = repo.filelog(manifest_file)?;
	let file_node = Node::from_hex_for_repo(node_bytes)?;			let file_node = Node::from_hex_for_repo(node_bytes)?;
	let entry = file_log.data_for_node(file_node)?;			bytes.extend(file_log.data_for_node(file_node)?.data()?);
	bytes.extend(entry.data()?)
	}
	}
	}			}

	// make the order of the [missing] files			// make the order of the [missing] files
	// match the order they were specified on the command line			// match the order they were specified on the command line
	let missing: Vec<_> = files			let missing: Vec<_> = files
	.iter()			.iter()
	.filter(|file| missing.contains(file))			.filter(|file| missing.contains(&file.as_ref()))
	.map(|file| file.clone())			.map(|file| file.clone())
	.collect();			.collect();
	Ok(CatOutput {			Ok(CatOutput {
	found_any,			found_any,
	concatenated: bytes,			concatenated: bytes,
	missing,			missing,
	node,			node,
	})			})
	}			}