This is an archive of the discontinued Mercurial Phabricator instance.

dirstate-tree: optimize HashMap lookups with raw_entry_mut
Closed Public

Authored by SimonSapin on Mar 3 2022, 2:59 PM.

Details

Summary

This switches to using HashMap from the hashbrown crate,
in order to use its raw_entry_mut method.
The standard library’s HashMap is also based on this same crate,
but raw_entry_mut is not yet stable there:
https://github.com/rust-lang/rust/issues/56167

This uses hashbrown version 0.9, because 0.10 is yanked and 0.11 requires Rust 1.49.

In DirstateMap::get_or_insert_node, this replaces a call to
HashMap<K, V>::entry, where K = WithBasename<Cow<'on_disk, HgPath>>.

entry takes and consumes an "owned" key: K parameter, in case a new entry
ends up being inserted. This key is produced by to_cow from a value that
borrows for the 'path lifetime.

When this function is called by Dirstate::new_v1, 'path is in fact
the same as 'on_disk so to_cow can return an owned key that contains
Cow::Borrowed.
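
As a rough illustration of the two cases, here is a minimal sketch that uses
plain str in place of HgPath. The function names are hypothetical and are not
Mercurial's to_cow; the point is only that a key can borrow when the input
already lives for 'on_disk, and must own a copy otherwise.

```rust
use std::borrow::Cow;

/// Hypothetical helper: the path already lives for 'on_disk,
/// so the key can borrow it and no allocation is needed.
fn key_from_on_disk<'on_disk>(path: &'on_disk str) -> Cow<'on_disk, str> {
    Cow::Borrowed(path)
}

/// Hypothetical helper: the path borrows from something shorter-lived,
/// so the key has to own a copy, which allocates on the heap.
fn key_from_short_lived<'on_disk>(path: &str) -> Cow<'on_disk, str> {
    Cow::Owned(path.to_owned())
}

/// Reports whether a key borrows its data instead of owning it.
fn is_borrowed(key: &Cow<'_, str>) -> bool {
    matches!(key, Cow::Borrowed(_))
}
```

Both helpers return a Cow<'on_disk, str>, so callers can store either kind of
key in the same map; only the second one pays for an allocation.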

For other callers, to_cow needs to create a Cow::Owned and thus make
a costly heap memory allocation. This is wasteful if the key is already
present in the map. Even when inserting a new node, its ancestor nodes are
typically already present (assuming most directories have numerous descendants).
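
To make the cost concrete, here is a minimal sketch using only the standard
library, with String keys standing in for WithBasename<Cow<'on_disk, HgPath>>.
Node, to_owned_key and both get_or_insert functions are hypothetical names for
illustration, not the code in this diff. The second function is also not
raw_entry_mut: it approximates the same saving by looking up with a borrowed
key first, at the price of hashing the key more than once, which raw_entry_mut
avoids.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a dirstate node; not Mercurial's real type.
#[derive(Default, Debug)]
struct Node {
    visits: u32,
}

/// Builds an owned key and counts how often that happens, standing in
/// for the allocation made when to_cow has to return Cow::Owned.
fn to_owned_key(key: &str, allocations: &mut usize) -> String {
    *allocations += 1;
    key.to_owned()
}

/// entry consumes an owned key, so the conversion runs on every call,
/// even when the key is already present in the map.
fn get_or_insert_with_entry<'m>(
    map: &'m mut HashMap<String, Node>,
    key: &str,
    allocations: &mut usize,
) -> &'m mut Node {
    map.entry(to_owned_key(key, allocations)).or_default()
}

/// Looks up with the borrowed key first and builds an owned key only
/// on a miss. This hashes the key more than once per call.
fn get_or_insert_lookup_first<'m>(
    map: &'m mut HashMap<String, Node>,
    key: &str,
    allocations: &mut usize,
) -> &'m mut Node {
    if !map.contains_key(key) {
        map.insert(to_owned_key(key, allocations), Node::default());
    }
    map.get_mut(key).expect("key was just found or inserted")
}

/// Returns (owned keys built via entry, owned keys built via lookup-first)
/// after visiting every path once with each strategy.
fn count_allocations(paths: &[&str]) -> (usize, usize) {
    let (mut with_entry, mut lookup_first) = (0, 0);
    let mut a = HashMap::new();
    let mut b = HashMap::new();
    for path in paths {
        get_or_insert_with_entry(&mut a, path, &mut with_entry).visits += 1;
        get_or_insert_lookup_first(&mut b, path, &mut lookup_first).visits += 1;
    }
    (with_entry, lookup_first)
}
```

Calling count_allocations with a list in which the same directory appears many
times shows the gap: the entry version builds one owned key per call, while the
lookup-first version builds one per distinct key.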

Diff Detail

Repository
rHG Mercurial

Event Timeline

SimonSapin created this revision. Mar 3 2022, 2:59 PM
Alphare accepted this revision. Mar 9 2022, 4:39 AM
Alphare added a subscriber: Alphare.
Alphare added inline comments.
rust/hg-core/src/lib.rs, line 60

Looks like AHash has a good chance of being faster than XxHash, but we will have to benchmark it later.

This revision is now accepted and ready to land. Mar 9 2022, 4:39 AM