Following on from Jun Wu's patch last October[1], we have found that using mmap
for the revlog index in repos with large revlogs gives a noticeable performance
improvement (~110ms on each hg invocation), particularly for commands that don't
touch the index very much.
This changeset adds mmap access as an opt-in behavior, controlled by a new
experimental config option so that it can be enabled on a per-repo basis. The
configuration option specifies the index size threshold at which Mercurial
switches to using mmap to access the index.
If the configuration option is not specified, the default remains to load the
full file, which seems to be the best option for smaller repos.
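
The threshold behavior described above could be sketched roughly as follows. This is a minimal illustration, not the code in this changeset: the function name `load_index` and the parameter `mmap_threshold` are hypothetical, and the threshold is assumed to be a size in bytes. The file is only stat'ed when the threshold is set, so the default path adds no extra system call.

```python
import mmap
import os


def load_index(path, mmap_threshold=None):
    """Return the index contents as a bytes-like object.

    `mmap_threshold` is a hypothetical size in bytes. When it is None
    (the default), the whole file is read into memory as before.
    """
    with open(path, "rb") as fp:
        if mmap_threshold is not None:
            # Only stat the file when the option is set, so the default
            # code path performs no additional system call.
            size = os.fstat(fp.fileno()).st_size
            # An empty file cannot be mapped, so fall back to read().
            if size > 0 and size >= mmap_threshold:
                # The mapping remains valid after fp is closed.
                return mmap.mmap(fp.fileno(), 0, access=mmap.ACCESS_READ)
        return fp.read()
```

Both return types support slicing and `len()`, so a caller that only reads ranges of the index does not need to know which one it received.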
Some initial performance numbers, averaged over 5 invocations of hg log -l 5,
for different cache states:
| Repo | HG | FB |
|---|---|---|
| Index size | 2.3MB | much bigger |
| read (warm) | 237ms | 432ms |
| mmap (warm) | 227ms | 321ms |
| | (-3%) | (-26%) |
| read (cold) | 397ms | 696ms |
| mmap (cold) | 410ms | 888ms |
| | (+3%) | (+28%) |
[1] https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-October/088737.html
As you wrote in the commit message, the benefits of memory mapping likely kick in at some threshold and we'll want to make this behavior conditional on index size. It is tempting to implement that today to facilitate experimentation. If you can do that without introducing extra system calls for the default file open case, do it.