This is an archive of the discontinued Mercurial Phabricator instance.

dirstate: avoid reading the map when possible (issue5713) (issue5717)
ClosedPublic

Authored by durham on Oct 26 2017, 7:25 PM.

Details

Summary

Before the recent refactor, we would not load the entire map until it was
accessed. As part of the refactor, that got lost and even just trying to load
the dirstate parents would load the whole map. This caused a perf regression
(issue5713) and a regression with static http serving (issue5717).

Making it lazy loaded again fixes both.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

durham created this revision.Oct 26 2017, 7:25 PM

This makes Windows mostly happy again, but now I'm getting the following diff. Any ideas?

diff --git a/tests/test-static-http.t b/tests/test-static-http.t

  • a/tests/test-static-http.t

+++ b/tests/test-static-http.t
@@ -221,34 +221,31 @@

$ cat server.log | sed -n -e 's|.*GET \(/[^ ]*\).*|\1|p' | sort -u
/.hg/bookmarks
/.hg/bookmarks.current
  • /.hg/cache/hgtagsfnodes1 /.hg/requires /.hg/store/00changelog.i /.hg/store/00manifest.i /.hg/store/data/%7E2ehgsub.i /.hg/store/data/%7E2ehgsubstate.i /.hg/store/data/a.i

+ /.hg/cache/hgtagsfnodes1 (glob)

/notarepo/.hg/00changelog.i
/notarepo/.hg/requires
/remote-with-names/.hg/bookmarks
/remote-with-names/.hg/bookmarks.current
  • /remote-with-names/.hg/cache/branch2-served
  • /remote-with-names/.hg/cache/hgtagsfnodes1
  • /remote-with-names/.hg/cache/tags2-served /remote-with-names/.hg/localtags /remote-with-names/.hg/requires /remote-with-names/.hg/store/00changelog.i /remote-with-names/.hg/store/00manifest.i /remote-with-names/.hg/store/data/%7E2ehgtags.i /remote-with-names/.hg/store/data/foo.i

+ /remote-with-names/.hg/cache/branch2-base (glob)
+ /remote-with-names/.hg/cache/branch2-immutable (glob)
+ /remote-with-names/.hg/cache/branch2-served (glob)
+ /remote-with-names/.hg/cache/hgtagsfnodes1 (glob)
+ /remote-with-names/.hg/cache/rbc-names-v1 (glob)
+ /remote-with-names/.hg/cache/tags2-served (glob)

/remote/.hg/bookmarks
/remote/.hg/bookmarks.current
  • /remote/.hg/cache/branch2-base
  • /remote/.hg/cache/branch2-immutable
  • /remote/.hg/cache/branch2-served
  • /remote/.hg/cache/hgtagsfnodes1
  • /remote/.hg/cache/rbc-names-v1
  • /remote/.hg/cache/tags2-served /remote/.hg/localtags /remote/.hg/requires /remote/.hg/store/00changelog.i

@@ -257,6 +254,12 @@

/remote/.hg/store/data/%7E2ehgtags.i
/remote/.hg/store/data/bar.i
/remote/.hg/store/data/quux.i

+ /remote/.hg/cache/branch2-base (glob)
+ /remote/.hg/cache/branch2-immutable (glob)
+ /remote/.hg/cache/branch2-served (glob)
+ /remote/.hg/cache/hgtagsfnodes1 (glob)
+ /remote/.hg/cache/rbc-names-v1 (glob)
+ /remote/.hg/cache/tags2-served (glob)

/remotempty/.hg/bookmarks
/remotempty/.hg/bookmarks.current
/remotempty/.hg/requires

@@ -264,9 +267,9 @@

/remotempty/.hg/store/00manifest.i
/sub/.hg/bookmarks
/sub/.hg/bookmarks.current
  • /sub/.hg/cache/hgtagsfnodes1 /sub/.hg/requires /sub/.hg/store/00changelog.i /sub/.hg/store/00manifest.i /sub/.hg/store/data/%7E2ehgtags.i /sub/.hg/store/data/test.i

+ /sub/.hg/cache/hgtagsfnodes1 (glob)

Try that again without the reformatting

diff --git a/tests/test-static-http.t b/tests/test-static-http.t
--- a/tests/test-static-http.t
+++ b/tests/test-static-http.t
@@ -221,34 +221,31 @@
   $ cat server.log | sed -n -e 's|.*GET \(/[^ ]*\).*|\1|p' | sort -u
   /.hg/bookmarks
   /.hg/bookmarks.current
-  /.hg/cache/hgtagsfnodes1
   /.hg/requires
   /.hg/store/00changelog.i
   /.hg/store/00manifest.i
   /.hg/store/data/%7E2ehgsub.i
   /.hg/store/data/%7E2ehgsubstate.i
   /.hg/store/data/a.i
+  /.hg/cache/hgtagsfnodes1 (glob)
   /notarepo/.hg/00changelog.i
   /notarepo/.hg/requires
   /remote-with-names/.hg/bookmarks
   /remote-with-names/.hg/bookmarks.current
-  /remote-with-names/.hg/cache/branch2-served
-  /remote-with-names/.hg/cache/hgtagsfnodes1
-  /remote-with-names/.hg/cache/tags2-served
   /remote-with-names/.hg/localtags
   /remote-with-names/.hg/requires
   /remote-with-names/.hg/store/00changelog.i
   /remote-with-names/.hg/store/00manifest.i
   /remote-with-names/.hg/store/data/%7E2ehgtags.i
   /remote-with-names/.hg/store/data/foo.i
+  /remote-with-names/.hg/cache/branch2-base (glob)
+  /remote-with-names/.hg/cache/branch2-immutable (glob)
+  /remote-with-names/.hg/cache/branch2-served (glob)
+  /remote-with-names/.hg/cache/hgtagsfnodes1 (glob)
+  /remote-with-names/.hg/cache/rbc-names-v1 (glob)
+  /remote-with-names/.hg/cache/tags2-served (glob)
   /remote/.hg/bookmarks
   /remote/.hg/bookmarks.current
-  /remote/.hg/cache/branch2-base
-  /remote/.hg/cache/branch2-immutable
-  /remote/.hg/cache/branch2-served
-  /remote/.hg/cache/hgtagsfnodes1
-  /remote/.hg/cache/rbc-names-v1
-  /remote/.hg/cache/tags2-served
   /remote/.hg/localtags
   /remote/.hg/requires
   /remote/.hg/store/00changelog.i
@@ -257,6 +254,12 @@
   /remote/.hg/store/data/%7E2ehgtags.i
   /remote/.hg/store/data/bar.i
   /remote/.hg/store/data/quux.i
+  /remote/.hg/cache/branch2-base (glob)
+  /remote/.hg/cache/branch2-immutable (glob)
+  /remote/.hg/cache/branch2-served (glob)
+  /remote/.hg/cache/hgtagsfnodes1 (glob)
+  /remote/.hg/cache/rbc-names-v1 (glob)
+  /remote/.hg/cache/tags2-served (glob)
   /remotempty/.hg/bookmarks
   /remotempty/.hg/bookmarks.current
   /remotempty/.hg/requires
@@ -264,9 +267,9 @@
   /remotempty/.hg/store/00manifest.i
   /sub/.hg/bookmarks
   /sub/.hg/bookmarks.current
-  /sub/.hg/cache/hgtagsfnodes1
   /sub/.hg/requires
   /sub/.hg/store/00changelog.i
   /sub/.hg/store/00manifest.i
   /sub/.hg/store/data/%7E2ehgtags.i
   /sub/.hg/store/data/test.i
+  /sub/.hg/cache/hgtagsfnodes1 (glob)
lothiraldan accepted this revision.Oct 27 2017, 10:27 AM
lothiraldan added a subscriber: lothiraldan.

It looks good to me.

I did some quick and dirty performance check and I see a performance improvement. On the mozilla-central repository, without this change, I get these results:

$ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
.....................
command: Mean +- std dev: 154 ms +- 2 ms
$ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
.....................
command: Mean +- std dev: 154 ms +- 2 ms
$ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
.....................
command: Mean +- std dev: 156 ms +- 3 ms

And with this changeset I get better results:

$ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
.....................
command: Mean +- std dev: 119 ms +- 3 ms
$ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
.....................
command: Mean +- std dev: 120 ms +- 3 ms
$ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
.....................
command: Mean +- std dev: 123 ms +- 7 ms

Where perf is https://perf.readthedocs.io/en/latest/

yuja added a subscriber: yuja.Oct 27 2017, 11:17 AM

(I don't read the patch content yet.)

I just made static-http repo not see dirstate because of the issue5717.
I don't know the issue5713, but a fewer loading over HTTP should be generally better.

Here are some measures 0865d25e8a8a:

$ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
.....................
command: Mean +- std dev: 109 ms +- 3 ms
$ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
.....................
command: Mean +- std dev: 110 ms +- 2 ms
$ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
.....................
command: Mean +- std dev: 111 ms +- 3 ms

And here are the mesaures on c36c3fa7d35b:

$ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
.....................
command: Mean +- std dev: 166 ms +- 5 ms
$ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
.....................
command: Mean +- std dev: 165 ms +- 3 ms
$ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
.....................
command: Mean +- std dev: 166 ms +- 5 ms

So it's seems that this patch solves most of the performance regression, but with so many changesets since c36c3fa7d35b I'm not sure we can do better.

@mharbison72 your test output looks like the sort order changed? Does it consistently repro before and after this patch? The sort order is from the sort -u in the test, so I can't imagine this affecting it.

yuja accepted this revision.Oct 28 2017, 3:46 AM
This revision is now accepted and ready to land.Oct 28 2017, 3:46 AM
yuja added a comment.Oct 28 2017, 3:54 AM

Queued up to this patch, thanks.

mercurial/dirstate.py
132

Perhaps this can be moved to init() since calling dirstatemap()
doesn't do any IO now.

1209

Nit: dirstatemap.read() could be a private function.

This revision was automatically updated to reflect the committed changes.

After rereading the numbers, I realize that with this patch we are still 10ms (around 10%) slower than before. The numbers before 0865d25e8a8a were [109, 110, 111]ms and with this patch we are at [119, 120, 123]ms.

We should take a look at it.

@durham It did turn out to be a sort issue: some paths had '\' and were (partially?) converted to '/' by the test runner _after_ the sort. Piping through sed to fix the paths beforehand reduces it to this:

--- c:/Users/Matt/projects/hg/tests/test-static-http.t
+++ c:/Users/Matt/projects/hg/tests/test-static-http.t.err
@@ -232,8 +232,11 @@
   /notarepo/.hg/requires
   /remote-with-names/.hg/bookmarks
   /remote-with-names/.hg/bookmarks.current
+  /remote-with-names/.hg/cache/branch2-base
+  /remote-with-names/.hg/cache/branch2-immutable
   /remote-with-names/.hg/cache/branch2-served
   /remote-with-names/.hg/cache/hgtagsfnodes1
+  /remote-with-names/.hg/cache/rbc-names-v1
   /remote-with-names/.hg/cache/tags2-served
   /remote-with-names/.hg/localtags
   /remote-with-names/.hg/requires

Not sure why yet, but this patch isn't the problem.