This is an archive of the discontinued Mercurial Phabricator instance.

dirstate-tree: Avoid BTreeMap double-lookup when inserting a dirstate entry
ClosedPublic

Authored by SimonSapin on May 3 2021, 6:25 AM.

Details

Summary

The child nodes of a given node in the tree-shaped dirstate are kept in a
BTreeMap where keys are file names as strings. Finding or inserting a value
in the map takes O(log(n)) string comparisons, which adds up when constructing
the tree.

The entry API allows finding a "spot" in the map that may or may not be
occupied and then access that value or insert a new one without doing map
lookup again. However the current API is limited in that calling entry
requires an owned key (and so a memory allocation), even if it ends up not
being used in the case where the map already has a value with an equal key.

This is still a win, with 4% better end-to-end time for hg status measured
here with hyperfine:

Benchmark #1: ../hg2/hg status -R $REPO --config=experimental.dirstate-tree.in-memory=1
  Time (mean ± σ):      1.337 s ±  0.018 s    [User: 892.9 ms, System: 437.5 ms]
  Range (min … max):    1.316 s …  1.373 s    10 runs

Benchmark #2: ./hg status -R $REPO --config=experimental.dirstate-tree.in-memory=1
  Time (mean ± σ):      1.291 s ±  0.008 s    [User: 853.4 ms, System: 431.1 ms]
  Range (min … max):    1.283 s …  1.309 s    10 runs

Summary
  './hg status -R $REPO --config=experimental.dirstate-tree.in-memory=1' ran
    1.04 ± 0.02 times faster than '../hg2/hg status -R $REPO --config=experimental.dirstate-tree.in-memory=1'
  • ./hg is this revision
  • ../hg2/hg is its parent
  • $REPO is an old snapshot of mozilla-central

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.