This is an archive of the discontinued Mercurial Phabricator instance.

overlayworkingctx: remove unused `nofilechanges()` and `_compact()` methods
ClosedPublic

Authored by mjacob on Jul 11 2020, 5:12 PM.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

mjacob created this revision.Jul 11 2020, 5:12 PM
mjacob updated this revision to Diff 21869.Jul 11 2020, 7:12 PM
mjacob added inline comments.Jul 11 2020, 10:35 PM
mercurial/context.py
2549

D8732 removed the call leading through this code, thus removing the prefetch side effect. Would it be a good idea to do a prefetch in self.files()?

martinvonz added inline comments.
mercurial/context.py
2549

I'm more concerned that we're losing https://phab.mercurial-scm.org/D1243, but it's unclear what the purpose of that was. If you feel like it, you could run the entire test suite with in-memory rebase enabled and see if anything changes before and after the patch that removed the callers of this code.

mjacob added inline comments.Jul 13 2020, 11:35 AM
mercurial/context.py
2549

Before D1243, in-memory rebase didn’t work at all:

[...]
  File "/home/manu/vcs/hg/hgext/rebase.py", line 1025, in concludememorynode
    if wctx.isempty() and not repo.ui.configbool('ui', 'allowemptycommit'):
  File "/home/manu/vcs/hg/mercurial/context.py", line 2222, in isempty
    self._compact()
AttributeError: 'overlayworkingctx' object has no attribute '_compact'

I didn’t check at in-between revisions, but probably _compact() was only ever called from isempty(). (There’s another method with the same name on _lazymanifest, but it seems unrelated).

This revision was not accepted when it landed; it landed in state Needs Review.
This revision was automatically updated to reflect the committed changes.
martinvonz added inline comments.Jul 27 2020, 2:02 PM
mercurial/context.py
2549

It seems that this was indeed a problem -- one of Google's internal tests failed when we import D8732. The test case passes again if I add back a call to _compact() before we call overlayworkingctx.files() (though I'm not sure if that's the best place to call it. I'll try to add a test case that reproduces the problem without our internal extensions. Then I'll most likely roll back this patch and add back a call to _compact() somewhere.

martinvonz added inline comments.Jul 27 2020, 3:25 PM
mercurial/context.py
2549

Actually, running the test suite as I suggested comes up with examples we already had. I ran the test suite before your patch with --extra-config-opt rebase.experimental.inmemory=1 finds 43 failing tests. I then re-ran those with -j1 to get stable output and sent stderr to a file. Then I ran the same 43 tests in the same way after your patch. The diff of the outputs shows that test-rebase-named-branches.t and test-rebase-parameters.t change behavior. If you add the self._compact() call in files(), then change in test-rebase-parameters.t goes away. I haven't analyzed it further, but that seems to suggest that the change in test-rebase-named-branches.t might be a correct fix, but the change in test-rebase-parameters.t is undesirable.

mjacob added inline comments.Jul 27 2020, 6:21 PM
mercurial/context.py
2549

Maybe it’s a good idea to run larger parts of the rebase test suite both with and without in-memory by default. But that’s a separate discussion for after the 5.5 release.

test-rebase-named-branches.t failed the same way with in-memory rebases before and after applying the whole series, so it’s a separate issue for which I’ll open a issue in Bugzilla. However, from a0192a03216d (inclusive) to 0ecb3b11fcad (exclusive), the test was failing in an additional way, because the former unmasked a bug which the latter fixed.

test-rebase-parameters.t shows a bug in D8732 and later. It could be fixed by calling _compact() in modified(), added() and removed(), or fixing those methods to work with a non-compacted cache. Do you have a preference?

mjacob marked an inline comment as not done.Jul 27 2020, 6:21 PM
martinvonz added inline comments.Jul 27 2020, 7:39 PM
mercurial/context.py
2549

Maybe it’s a good idea to run larger parts of the rebase test suite both with and without in-memory by default. But that’s a separate discussion for after the 5.5 release.

Yes, and it would be even nicer to get in-memory rebase to the point where we can use it by default and eventually delete the on-disk rebase.

test-rebase-named-branches.t failed the same way with in-memory rebases before and after applying the whole series, so it’s a separate issue for which I’ll open a issue in Bugzilla. However, from a0192a03216d (inclusive) to 0ecb3b11fcad (exclusive), the test was failing in an additional way, because the former unmasked a bug which the latter fixed.

Ah, I see. Thanks for looking into it.

test-rebase-parameters.t shows a bug in D8732 and later. It could be fixed by calling _compact() in modified(), added() and removed(), or fixing those methods to work with a non-compacted cache. Do you have a preference?

It seems like _compact() is a little expensive, so maybe we should avoid calling it too many times. I wonder if it would make sense to remove the need for it by having overlayctx.write() check if the content matches and otherwise remove the record from the cache (effectively doing a single-file compaction).

mjacob added inline comments.Jul 27 2020, 8:14 PM
mercurial/context.py
2549

Here’s the issue for the other in-memory difference you observed: https://bz.mercurial-scm.org/show_bug.cgi?id=6391

If we do single-file compaction (it seems like that should be in _markdirty()), we would change _cache from being a cache to recording only dirty paths. Would that be an acceptable change for stable a few days before the release? What about the previous prefetching behavior?

martinvonz added inline comments.Jul 28 2020, 12:14 AM
mercurial/context.py
2549

If we do single-file compaction (it seems like that should be in _markdirty()), we would change _cache from being a cache to recording only dirty paths.

That's actually what I would have thought it was supposed to be anyway :)

Would that be an acceptable change for stable a few days before the release?

Good point. I didn't realize that a release had been cut since this patch was queued. The safest thing for stable is perhaps to roll back this patch and then add a call to _compact() in rebase.commitmemorynode() since that's where the old call to nofilechanges() was. Then we can do a better fix on default later.

What about the previous prefetching behavior?

Perhaps tomemctx() would be a good place for that?

mjacob added inline comments.
mercurial/context.py
2549

The temporary fix was added in D8843.

What about the previous prefetching behavior?

Perhaps tomemctx() would be a good place for that?

It seems like the prefetch was added (in f18f665b1424) specifically to speed up comparing the files with the parent. If we do file-by-file comparison, we lose the advantage of prefetching all files at once. On the other hand, maybe the file from the parent is mostly already in the cache when writing the new file to overlayworkingctx.

@spectral Since you added the prefetching code in f18f665b1424, do you have an opinion?

martinvonz added inline comments.Jul 29 2020, 1:16 AM
mercurial/context.py
2549

It seems like the prefetch was added (in f18f665b1424) specifically to speed up comparing the files with the parent. If we do file-by-file comparison, we lose the advantage of prefetching all files at once. On the other hand, maybe the file from the parent is mostly already in the cache when writing the new file to overlayworkingctx.

Good point, but I wonder if @spectral added the prefetching in _compact() simply because that's what otherwise triggered the fetching. Maybe if we do the compaction on a per-file basis in overlayworkingctx.write(), we'll find a better place to do it. Maybe merge._prefetchfiles() could be modified to accept the wctx too in order to prefetch some files from wctx.p1().

@spectral Since you added the prefetching code in f18f665b1424, do you have an opinion?

Leaving that line quoted for @spectral to reply to :)

spectral added inline comments.Jul 29 2020, 8:17 PM
mercurial/context.py
2549

I only have vague memories of adding this, but I believe it's because the call to underlying.data() was triggering a content fetch from the remotefilelog servers, so I added it here :)

Switching to a per-file _compact would, I believe, be a pessimization (basically undoing that commit, returning to one-by-one fetches) unless there's some other place we can do a prefetch. merge._prefetchfiles() seems like a totally reasonable place to teach about fetching from wctx.p1(), though I think I'd probably implement it by having a list of revs to prefetch for, and we just add wctx.p1() instead of passing in wctx and making merge._prefetchfiles() reason about it.

I haven't checked, it's also possible that this prefetching isn't actually necessary anymore, due to additional prefetching done elsewhere, or removing the need for the underlying data entirely (such as by comparing hashes or tracking the need for compaction separately).