This is an archive of the discontinued Mercurial Phabricator instance.

branchmap: explicitly warm+write all subsets of the branchmap caches
ClosedPublic

Authored by spectral on Aug 2 2019, 9:26 PM.

Details

Reviewers

marmoute
pulkit

Group Reviewers

hg-reviewers

Commits

rHGcdf0e9523de1: branchmap: explicitly warm+write all subsets of the branchmap caches

Summary

'full' claims it will warm all of the caches that are known about, but this was
not the case: it did not actually warm the branchmap caches for subsets that we
haven't requested, or for subsets that are still considered "valid". By
explicitly writing them to disk, we can force the subsets of, for example,
"served" ("immutable" and "base") to be written, making it cheaper to calculate
"served" the next time it needs to be updated.
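
As a rough illustration of the behavior described in the summary, here is a toy model in Python. It is not Mercurial's code: `SUBSETTABLE`, `subset_chain`, and `warm` are hypothetical names, and the mapping is an assumption based only on the filter names that appear in this review. It models the claim that only explicitly requested filters get written unless `full` forces a write for every filter.

```python
# Toy model only: these names are illustrative, not Mercurial's real API.
# Assumed mapping from each filter to the smaller subset it is built from,
# using the filter names that appear in this review.
SUBSETTABLE = {
    "visible-hidden": "visible",
    "visible": "served",
    "served.hidden": "served",
    "served": "immutable",
    "immutable": "base",
}
ALL_FILTERS = set(SUBSETTABLE) | set(SUBSETTABLE.values())


def subset_chain(filt):
    """Return `filt` followed by every subset it is (transitively) built from."""
    chain = [filt]
    while chain[-1] in SUBSETTABLE:
        chain.append(SUBSETTABLE[chain[-1]])
    return chain


def warm(requested, full=False):
    """Return the sorted list of filters whose cache ends up written."""
    written = set(requested)    # lazy behavior: only requested filters
    if full:
        written |= ALL_FILTERS  # the change: force a write for every filter
    return sorted(written)


print(subset_chain("served"))
print(warm(["served"]))
print(warm(["served"], full=True))
```

In this model, requesting only "served" leaves "immutable" and "base" unwritten even though "served" is built from them, which is the gap the summary describes.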

Diff Detail

Repository

rHG Mercurial

Lint

Lint Skipped

Unit

Unit Tests Skipped

Event Timeline

spectral created this revision.Aug 2 2019, 9:26 PM

Herald added a reviewer: hg-reviewers.Aug 2 2019, 9:26 PM

Herald added a subscriber: mercurial-devel.

spectral added a child revision: D6711: branchheads: store wdir-dependent caches in wcache (issue6181).Aug 2 2019, 9:26 PM

Needs some test updates.

Overall principle seems good. I made a couple of inline comments.

mercurial/localrepo.py
2199–2204	Should we have this list explicitly stored in a list next to the filtermap? That would seem more robust to future changes.
2227	Why the explicit write here? We don't seem to need it for the previous section. Is this because, if the cache of the previous subset is valid, the write would be skipped? If so, consider clarifying it in your comment.

spectral marked an inline comment as done.Aug 5 2019, 9:07 PM

spectral added a parent revision: D6719: branchmap: refresh all "heads" of the branchmap subsets.

spectral retitled this revision from branchmap: properly refresh/warm all branchmap caches to branchmap: explicitly warm+write all subsets of the branchmap caches.

spectral edited the summary of this revision.

spectral updated this revision to Diff 16129.

spectral added inline comments.Aug 5 2019, 9:12 PM

mercurial/localrepo.py
2227	Actually it's because the documentation for the function states that it will "warm the caches", "even the ones usually loaded more lazily". If nothing in hg actually explicitly requests the subset, it won't be written:

  $ hg init; echo hi > foo; hg ci -qAm foo; ls .hg/cache
  branch2-served  evoext-obscache-00  rbc-names-v1  rbc-revs-v1

This would have, I'd thought, written out -served, -immutable, and -base, since -immutable and -base are subsets of -served, but that doesn't seem to happen. Even if I run `hg debugupdatecache` (before this change) they don't get written:

  branch2-served  evoext-obscache-00  hgtagsfnodes1  rbc-names-v1  rbc-revs-v1  tags2  tags2-served

If the intent of `hg debugupdatecache` is to actually warm all levels of cache, it should probably warm -immutable and -base, so that they're kept up to date? Or is that undesirable for some reason (maybe it causes additional computation every time the cache for -served is updated if -immutable and -base exist, since they'd also possibly have to be updated)? I'd think it'd be the opposite (-base is very cheap to calculate and unlikely to go stale, it can be used to make calculating -immutable quicker, and that can be used to make calculating -served quicker; without them, -served has to start from scratch each time; this seems to be the reason for the subsettable :)), but I'm not familiar enough with the caching code (and uses of it) to know if this is actually true in practice.

That said, I agree that these are two separate concerns, and the number of tests that need to be changed is pretty significant for this one, so I've split this change into two.
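
The `ls .hg/cache` check in the comment above can also be scripted. The following is a minimal sketch, assuming cache files are named `branch2-<filter>` as in the listings in this review; `missing_branch_caches` is a hypothetical helper, not part of Mercurial, and it does not handle the `%`-suffixed names that appear in tests/test-server-view.t.

```python
import os
import tempfile


def missing_branch_caches(cache_dir, filters):
    """Return the filters that have no `branch2-<filter>` file in cache_dir."""
    present = set(os.listdir(cache_dir))
    return [f for f in filters if "branch2-" + f not in present]


# Demo: a temporary directory stands in for .hg/cache, populated with the
# branch/rbc files listed in the comment for a freshly committed repository.
with tempfile.TemporaryDirectory() as cache_dir:
    for name in ("branch2-served", "rbc-names-v1", "rbc-revs-v1"):
        open(os.path.join(cache_dir, name), "w").close()
    print(missing_branch_caches(cache_dir, ["served", "immutable", "base"]))
```

Pointed at a real repository's `.hg/cache` directory, the same helper would report which subset caches have not been written yet.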

Forcing this write seems like a good idea. Having it in its own
changeset seems like a good idea (and please add a comment about forcing
the write).

In D6710#98322, @marmoute wrote:

Forcing this write seems like a good idea. Having it in its own
changeset seems like a good idea (and please add a comment about forcing
the write).

I've already split the 'full' change from the one changing which subsets we invalidate; this revision will be used for the 'full' change since it had the most comments. A comment has been added. Please take another look at the whole stack.

We could warm them in increasing order to improve efficiency. However this is for the full cache warming so this looks good enough. (consider doing them in order in a follow up)

Note: this change seems independent from the previous one, so one might be able to take it on its own.
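
The suggestion above to warm the subsets in increasing order could be sketched as follows. This is only an illustration, not the follow-up change itself: `SUBSETTABLE` is an assumed mapping built from the filter names in this review, and `increasing_order` is a hypothetical helper. The idea is that each filter is visited after the smaller subset it is built from, so that subset is already warm when it is needed.

```python
# Assumed mapping (illustrative): filter -> the smaller subset it is built from.
SUBSETTABLE = {
    "visible-hidden": "visible",
    "visible": "served",
    "served.hidden": "served",
    "served": "immutable",
    "immutable": "base",
}


def increasing_order(subsettable):
    """Order filters so that each one comes after the subset it is built from."""

    def depth(filt):
        # Number of steps from `filt` down to a filter with no smaller subset.
        steps = 0
        while filt in subsettable:
            filt = subsettable[filt]
            steps += 1
        return steps

    filters = set(subsettable) | set(subsettable.values())
    # Sort by depth first; the name breaks ties so the result is deterministic.
    return sorted(filters, key=lambda f: (depth(f), f))


print(increasing_order(SUBSETTABLE))
```

With this ordering, "base" is visited first and "visible-hidden" last, instead of the arbitrary dictionary order that iterating over the filter table gives.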

pulkit accepted this revision.Aug 8 2019, 4:42 PM

This revision is now accepted and ready to land.Aug 8 2019, 4:42 PM

Thanks @marmoute for the review.

spectral added a commit: rHGcdf0e9523de1: branchmap: explicitly warm+write all subsets of the branchmap caches.Aug 8 2019, 6:18 PM

Closed by commit rHGcdf0e9523de1: branchmap: explicitly warm+write all subsets of the branchmap caches (authored by spectral).

This revision was automatically updated to reflect the committed changes.

Revision Contents
Changeset List

			Path	Packages
M			mercurial/localrepo.py (10 lines)
M			tests/test-debugcommands.t (5 lines)
M			tests/test-server-view.t (5 lines)

Diff	ID	Description	Created	Lint	Unit
Base		Base
Diff 1	16115		Aug 2 2019, 9:26 PM	★	★
Diff 2	16129		Aug 5 2019, 9:07 PM	★	★
Diff 3	16163	rHGcdf0e9523de12a98b9395192b7a25108a7d0b36d	Aug 5 2019, 4:31 PM	★	★

Commit	Parents	Author	Summary	Date
79309dd3cbdc	5c97168bce98	Kyle Lippincott		Aug 5 2019, 4:31 PM

Status	Author	Revision
Abandoned	spectral	D6711 branchheads: store wdir-dependent caches in wcache (issue6181)
Closed	spectral	D6710 branchmap: explicitly warm+write all subsets of the branchmap caches
Abandoned	spectral	D6719 branchmap: refresh all "heads" of the branchmap subsets

Diff 16129

mercurial/localrepo.py

	# During strip, many caches are invalid but			# During strip, many caches are invalid but
	# later call to `destroyed` will refresh them.			# later call to `destroyed` will refresh them.
	return			return

	if tr is None or tr.changes['origrepolen'] < len(self):			if tr is None or tr.changes['origrepolen'] < len(self):
	# Updating all of the "heads" of the cache hierarchy should cause			# Updating all of the "heads" of the cache hierarchy should cause
	# all others that are stale to be updated.			# all others that are stale to be updated.
	self.ui.debug('updating the branch cache\n')			self.ui.debug('updating the branch cache\n')
	for filt in repoviewutil.subsettableheads:			for filt in repoviewutil.subsettableheads:
	# Refreshing visible-hidden caused a lot of test failures, so			# Refreshing visible-hidden caused a lot of test failures, so
	# we'll only do 'visible' for now.			# we'll only do 'visible' for now.
	if filt == 'visible-hidden':			if filt == 'visible-hidden':
	filt = 'visible'			filt = 'visible'
	self.filtered(filt).branchmap()			self.filtered(filt).branchmap()

	if full:			if full:
	unfi = self.unfiltered()			unfi = self.unfiltered()
	rbc = unfi.revbranchcache()			rbc = unfi.revbranchcache()
	for r in unfi.changelog:			for r in unfi.changelog:
	rbc.branchinfo(r)			rbc.branchinfo(r)
	rbc.write()			rbc.write()

	# ensure the working copy parents are in the manifestfulltextcache			# ensure the working copy parents are in the manifestfulltextcache
	for ctx in self['.'].parents():			for ctx in self['.'].parents():
	ctx.manifest() # accessing the manifest is enough			ctx.manifest() # accessing the manifest is enough

	# accessing fnode cache warms the cache			# accessing fnode cache warms the cache
	tagsmod.fnoderevs(self.ui, unfi, unfi.changelog.revs())			tagsmod.fnoderevs(self.ui, unfi, unfi.changelog.revs())
	# accessing tags warm the cache			# accessing tags warm the cache
	self.tags()			self.tags()
	self.filtered('served').tags()			self.filtered('served').tags()

				# The `full` arg is documented as updating even the lazily-loaded
				# caches immediately, so we're forcing a write to cause these caches
				# to be warmed up even if they haven't explicitly been requested
				# yet (if they've never been used by hg, they won't ever have been
				# written, even if they're a subset of another kind of cache that
				# has been used).
				for filt in repoview.filtertable.keys():
				filtered = self.filtered(filt)
				filtered.branchmap().write(filtered)

	def invalidatecaches(self):			def invalidatecaches(self):

	if r'_tagscache' in vars(self):			if r'_tagscache' in vars(self):
	# can't use delattr on proxy			# can't use delattr on proxy
	del self.__dict__[r'_tagscache']			del self.__dict__[r'_tagscache']

	self._branchcaches.clear()			self._branchcaches.clear()
	self.invalidatevolatilesets()			self.invalidatevolatilesets()

tests/test-debugcommands.t

	$ hg debugupdatecaches --debug			$ hg debugupdatecaches --debug
	updating the branch cache			updating the branch cache
	$ ls -r .hg/cache/*			$ ls -r .hg/cache/*
	.hg/cache/tags2-served			.hg/cache/tags2-served
	.hg/cache/tags2			.hg/cache/tags2
	.hg/cache/rbc-revs-v1			.hg/cache/rbc-revs-v1
	.hg/cache/rbc-names-v1			.hg/cache/rbc-names-v1
	.hg/cache/hgtagsfnodes1			.hg/cache/hgtagsfnodes1
				.hg/cache/branch2-visible-hidden
				.hg/cache/branch2-visible
				.hg/cache/branch2-served.hidden
	.hg/cache/branch2-served			.hg/cache/branch2-served
				.hg/cache/branch2-immutable
				.hg/cache/branch2-base

	Test debugcolor			Test debugcolor

	#if no-windows			#if no-windows
	$ hg debugcolor --style --color always | egrep 'mode|style|log\.'			$ hg debugcolor --style --color always | egrep 'mode|style|log\.'
	color mode: 'ansi'			color mode: 'ansi'
	available style:			available style:
	\x1b[0;33mlog.changeset\x1b[0m: \x1b[0;33myellow\x1b[0m (esc)			\x1b[0;33mlog.changeset\x1b[0m: \x1b[0;33myellow\x1b[0m (esc)

tests/test-server-view.t

	tag: tip			tag: tip
	user: debugbuilddag			user: debugbuilddag
	date: Thu Jan 01 00:00:00 1970 +0000			date: Thu Jan 01 00:00:00 1970 +0000
	summary: r0			summary: r0

	$ hg -R test --config experimental.extra-filter-revs='not public()' debugupdatecache			$ hg -R test --config experimental.extra-filter-revs='not public()' debugupdatecache
	$ ls -1 test/.hg/cache/			$ ls -1 test/.hg/cache/
	branch2-base%89c45d2fa07e			branch2-base%89c45d2fa07e
				branch2-immutable%89c45d2fa07e
	branch2-served			branch2-served
				branch2-served%89c45d2fa07e
				branch2-served.hidden%89c45d2fa07e
				branch2-visible%89c45d2fa07e
				branch2-visible-hidden%89c45d2fa07e
	hgtagsfnodes1			hgtagsfnodes1
	rbc-names-v1			rbc-names-v1
	rbc-revs-v1			rbc-revs-v1
	tags2			tags2
	tags2-served%89c45d2fa07e			tags2-served%89c45d2fa07e

	cleanup			cleanup

	$ cat errors.log			$ cat errors.log
	$ killdaemons.py			$ killdaemons.py