This is an archive of the discontinued Mercurial Phabricator instance.

commands: don't violate storage abstractions in `manifest --all`
ClosedPublic

Authored by indygreg on Apr 5 2018, 12:39 AM.

Details

Summary

Previously, we asked the store to emit its data files. For modern
repos, this would use fncache to resolve the set of files then would
stat() each file. For my copy of the mozilla-unified repository, this
took 3.3-10s depending on the state of my filesystem cache to render
449,790 items.

The previous behavior was a massive layering violation because it
assumed tracked files would have specific filenames in specific
directories. Alternate storage backends would violate this assumption.

The new behavior scans the changelog entries for the set of files
changed by each commit. It aggregates them into a set and then
sorts and prints the result. This reliably takes ~16.3s on my
machine. ~80% of the time is spent in zlib decompression.

The performance regression is unfortunate. If we want to claw it
back, we can create a proper storage API to query for the set of
tracked files. I'm not opposed to doing that. But I'm in no hurry
because I suspect ~0 people care about the performance of
hg manifest --all.

.. perf::

`hg manifest --all` is likely slower due to changing its
implementation to respect storage interface boundaries. If you
are impacted by this regression in a meaningful way, please make
noise on the development mailing list and it can be dealt with.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

indygreg created this revision.Apr 5 2018, 12:39 AM
durin42 accepted this revision.Apr 6 2018, 8:49 PM
This revision is now accepted and ready to land.Apr 6 2018, 8:49 PM
This revision was automatically updated to reflect the committed changes.