This is an archive of the discontinued Mercurial Phabricator instance.

branchmap: explicitly warm+write all subsets of the branchmap caches
ClosedPublic

Authored by spectral on Aug 2 2019, 9:26 PM.

Details

Summary

'full' claims it will warm all of the caches that are known about, but this was
not the case - it did not actually warm the branchmap caches for subsets that we
haven't requested, or for subsets that are still considered "valid". By
explicitly writing them to disk, we can force the subsets for ex: "served" to be
written ("immutable" and "base"), making it cheaper to calculate "served" the
next time it needs to be updated.

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

spectral created this revision.Aug 2 2019, 9:26 PM
spectral planned changes to this revision.Aug 2 2019, 9:30 PM

Needs some test updates.

Overall principle seems good. I made couple of inline comment.

mercurial/localrepo.py
2199–2204

Should we have this list explicitly stored in a list next to the filtermap ? That would seems more robust to future changes.

2227

Why the explicite write here ? We don't seems to need it for the previous section. Is this because if the cache of the previous subset is valid, the write would be skipped ?
If so, consider clarifying it in your comment.

spectral marked an inline comment as done.Aug 5 2019, 9:07 PM
spectral retitled this revision from branchmap: properly refresh/warm all branchmap caches to branchmap: explicitly warm+write all subsets of the branchmap caches.
spectral edited the summary of this revision. (Show Details)
spectral updated this revision to Diff 16129.
spectral added inline comments.Aug 5 2019, 9:12 PM
mercurial/localrepo.py
2227

Actually it's because the documentation for the function states that it will "warm the caches", "even the ones usually loaded more lazily".

If nothing in hg actually explicitly requests the subset, it won't be written:

$ hg init; echo hi > foo; hg ci -qAm foo; ls .hg/cache
branch2-served  evoext-obscache-00  rbc-names-v1  rbc-revs-v1

This would have, I'd thought, written out -served, -immutable, and -base, since -immutable and -base are subsets of -served, but that doesn't seem to happen. Even if I run hg debugupdatecache (before this change) they don't get written:

branch2-served  evoext-obscache-00  hgtagsfnodes1  rbc-names-v1  rbc-revs-v1  tags2  tags2-served

If the intent of hg debugupdatecache is to actually warm all levels of cache, it should probably warm -immutable and -base, so that they're kept up to date? Or is that undesirable for some reason (maybe it causes additional computation every time the cache for -served is updated if -immutable and -base exist, since they'd also possibly have to be updated? I'd think it'd be the opposite (-base is very cheap to calculate, and unlikely to go stale, can be used to make calculating immutable quicker, and that can be used to make calculating served quicker.. without them, then served has to start from scratch each time; this seems to be the *reason* for the subsettable :)), but I'm not super familiar with the caching code (and uses of it) to know if this is actually true in practice.

That said, I agree that these are two separate concerns, and the number of tests that need to be changed is pretty significant for this one, so I've split this change into two.

Forcing this write seems like a good idea. Having it in its own
changeset seems like a good idea (and please add a comment about forcing
the write).

Forcing this write seems like a good idea. Having it in its own
changeset seems like a good idea (and please add a comment about forcing
the write).

I've split the 'full' change from the one changing what subsets we invalidate already, this one will be used for the 'full' change since it had the most comments. Comment has been added. Please take another look at the whole stack.

marmoute accepted this revision.Aug 8 2019, 12:50 PM

We could warm them in increasing order to improve efficiency. However this is for the full cache warming so this looks good enough. (consider doing them in order in a follow up)

Note: this change seems independant from the previous one, so one might be able to take it on its own

pulkit accepted this revision.Aug 8 2019, 4:42 PM
This revision is now accepted and ready to land.Aug 8 2019, 4:42 PM
pulkit added a comment.Aug 8 2019, 4:42 PM

Thanks @marmoute for the review.