This is an archive of the discontinued Mercurial Phabricator instance.

Differential D9677

upgrade: use copy+delete instead of rename while creating backup
Changes Planned · Public

Authored by pulkit on Dec 31 2020, 11:43 AM.

Details

Reviewers

marmoute

Group Reviewers

hg-reviewers

Summary

Often we do an upgrade operation which does not touch every part of the
store. But right now, we have blind logic which processes everything.
To selectively upgrade parts of the repository, we need to preserve the existing
data which is untouched.

However, while creating the backup of the current repository, we rename the whole
store, leaving no option to preserve untouched files.

We switch to copy+delete so that we can delete only the data files which are changed
by the operation and leave the rest untouched.

Diff Detail

Repository

rHG Mercurial

Branch

default

Lint

No Linters Available

Unit

No Unit Test Coverage

Event Timeline

pulkit created this revision. Dec 31 2020, 11:43 AM

Herald added a reviewer: hg-reviewers. Dec 31 2020, 11:43 AM

Herald added a subscriber: mercurial-patches.

I am worried about having a much wider inconsistency window if we don't use rename. I'll have a deeper look later.

As pointed out in a previous comment, this is problematic because it does not preserve the same consistency as the previous code.

This revision now requires changes to proceed. Jan 7 2021, 9:52 PM

In D9677#146243, @marmoute wrote:

As pointed out in a previous comment, this is problematic because it does not preserve the same consistency as the previous code.

While the copy consistency can be maintained here, future changes (WIP) will break the consistency more. In upcoming work, we want to touch only the parts of the repository which need to be updated. This will lead us to selectively move data from the upgraded repository to the current one.

pulkit edited parent revisions, added: D9695: upgrade: don't perform anything if nothing to do; removed: D9676: upgrade: migrated -> upgraded in ui messages. Jan 8 2021, 1:45 PM

pulkit updated this revision to Diff 24650.

In D9677#146367, @pulkit wrote:

In D9677#146243, @marmoute wrote:

As pointed out in a previous comment, this is problematic because it does not preserve the same consistency as the previous code.

While the copy consistency can be maintained here, future changes (WIP) will break the consistency more. In upcoming work, we want to touch only the parts of the repository which need to be updated. This will lead us to selectively move data from the upgraded repository to the current one.

@marmoute are you OK with this explanation?

My question is, would it be better to hardlink and then delete to avoid the copy overhead? (Of course not every filesystem supports that, but I think it falls back to a copy in that case.)

pulkit updated this revision to Diff 24774. Jan 13 2021, 5:31 AM

In D9677#147085, @mharbison72 wrote:

In D9677#146367, @pulkit wrote:

In D9677#146243, @marmoute wrote:

As pointed out in a previous comment, this is problematic because it does not preserve the same consistency as the previous code.

While the copy consistency can be maintained here, future changes (WIP) will break the consistency more. In upcoming work, we want to touch only the parts of the repository which need to be updated. This will lead us to selectively move data from the upgraded repository to the current one.

@marmoute are you OK with this explanation?

Not at all :-)

Currently we have:

(1) a fully consistent repository in place (while the upgrade runs)
(2) a missing store for a split second (the old store was renamed)
(3) a fully consistent repository in place (the upgraded store was renamed into place)

The process ensures a very narrow inconsistency window and a very clear invalid state: "the store is either valid or missing."

The change here moves step (2) to a much longer operation where elements get copied one by one. As a result we get a much longer window of inconsistency, with a much wonkier inconsistent state, as we get "mixed" files within the resulting store.

So overall this change is going in the wrong direction.

We are currently looking for a lighter upgrade for two features, "persistent nodemap" and "share-safe", and I don't think either of them needs this kind of change. An upgrade process for persistent nodemap could be:

(1) write new persistent nodemaps (data file, then docket) that nobody will read without the requirement
(2) add the new requirement

The downgrade would be the reverse (remove the requirement, then remove the nodemaps).

And something similar could be done for the share-safe upgrade:

(1) write a new .hg/store/requirements that nobody will read yet
(2) rewrite .hg/requirements
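
The ordering described for the persistent nodemap case (write the new files first, add the requirement last, and reverse the order on downgrade) can be sketched as follows. This is only an illustration with the standard library: the `requires` file layout, the `toy-nodemap` requirement and the `nodemap.data` file name are assumptions made up for the example, and none of Mercurial's own APIs are used.

```python
import os
import tempfile


def read_requirements(repo_dir):
    """Return the requirement names listed in the toy `requires` file."""
    with open(os.path.join(repo_dir, "requires")) as fh:
        return {line.strip() for line in fh if line.strip()}


def write_requirements(repo_dir, requirements):
    """Rewrite `requires` through a temporary file and an atomic replace."""
    final = os.path.join(repo_dir, "requires")
    tmp = final + ".tmp"
    with open(tmp, "w") as fh:
        fh.writelines(name + "\n" for name in sorted(requirements))
    os.replace(tmp, final)


def upgrade(repo_dir, requirement, new_files):
    """Write files nobody reads yet, then advertise them via the requirement."""
    for name, payload in new_files.items():
        with open(os.path.join(repo_dir, name), "wb") as fh:
            fh.write(payload)
    write_requirements(repo_dir, read_requirements(repo_dir) | {requirement})


def downgrade(repo_dir, requirement, file_names):
    """Reverse order: stop advertising first, then remove the files."""
    write_requirements(repo_dir, read_requirements(repo_dir) - {requirement})
    for name in file_names:
        os.remove(os.path.join(repo_dir, name))


def demo():
    """Run an upgrade followed by a downgrade in a scratch directory."""
    with tempfile.TemporaryDirectory() as repo_dir:
        write_requirements(repo_dir, {"store"})
        upgrade(repo_dir, "toy-nodemap", {"nodemap.data": b"\x00"})
        after_upgrade = read_requirements(repo_dir)
        downgrade(repo_dir, "toy-nodemap", ["nodemap.data"])
        return after_upgrade, read_requirements(repo_dir), os.listdir(repo_dir)
```

At every point in this sketch a reader that checks the requirement before touching the new files sees either the old layout or a complete new one, which is the property the comment above is asking for.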

This revision now requires changes to proceed. Jan 13 2021, 5:34 AM

In D9677#147172, @marmoute wrote:

In D9677#147085, @mharbison72 wrote:

In D9677#146367, @pulkit wrote:

In D9677#146243, @marmoute wrote:

As pointed out in a previous comment, this is problematic because it does not preserve the same consistency as the previous code.

While the copy consistency can be maintained here, future changes (WIP) will break the consistency more. In upcoming work, we want to touch only the parts of the repository which need to be updated. This will lead us to selectively move data from the upgraded repository to the current one.

@marmoute are you OK with this explanation?

Not at all :-)

Currently we have:

(1) a fully consistent repository in place (while the upgrade runs)
(2) a missing store for a split second (the old store was renamed)
(3) a fully consistent repository in place (the upgraded store was renamed into place)

The process ensures a very narrow inconsistency window and a very clear invalid state: "the store is either valid or missing."

The change here moves step (2) to a much longer operation where elements get copied one by one. As a result we get a much longer window of inconsistency, with a much wonkier inconsistent state, as we get "mixed" files within the resulting store.

Nope, this patch does not do that. We have three stores here: the backup store (containing a backup of the current store), the current store, and the upgraded store. Below is a comparison of the "split second" mentioned in part (2) above.

Before:

1) Move current store to backup store
2) Move upgraded store to current store

Store inconsistency starts when step 1) starts and remains until step 2) finishes, since both are move operations.

After this patch:

1) Copy current store to backup store
2) Move upgraded store to current store

In this one, store inconsistency happens only at step 2), since step 1) is a copy.

So, this patch reduces the time during which the store is inconsistent.
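
The two sequences being compared can be sketched with the standard library alone. This is an illustration, not the patch: the directory and helper names are hypothetical and Mercurial's vfs and util helpers are not used. The diff under review also removes the current store with `rmtree` before the final rename, so that step is included here.

```python
import os
import shutil
import tempfile


def make_store(path, files):
    """Create a toy store directory holding the given files."""
    os.makedirs(path)
    for name, data in files.items():
        with open(os.path.join(path, name), "w") as fh:
            fh.write(data)


def replace_by_rename(current, upgraded, backup):
    """Before the patch: two directory renames."""
    os.rename(current, backup)    # the current store is missing from here...
    os.rename(upgraded, current)  # ...until this rename completes


def replace_by_copy(current, upgraded, backup):
    """After the patch: copy to the backup, delete, then one rename."""
    shutil.copytree(current, backup)  # the current store stays readable
    shutil.rmtree(current)            # files disappear one by one from here...
    os.rename(upgraded, current)      # ...until this rename completes


def demo(replace):
    """Run one strategy and return (live content, backed-up content)."""
    with tempfile.TemporaryDirectory() as root:
        current = os.path.join(root, "store")
        upgraded = os.path.join(root, "upgraded-store")
        backup = os.path.join(root, "backup-store")
        make_store(current, {"00changelog.i": "old"})
        make_store(upgraded, {"00changelog.i": "new"})
        replace(current, upgraded, backup)
        with open(os.path.join(current, "00changelog.i")) as fh:
            live = fh.read()
        with open(os.path.join(backup, "00changelog.i")) as fh:
            saved = fh.read()
        return live, saved
```

Both strategies end in the same final state. They differ only in what a concurrent reader can observe in between, which is the point under discussion in this thread.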

Note that renaming a directory is instantaneous on any decent file system. Copying a full hierarchy is not.

I am afraid your explanation is now confusing me. Can you try to explain what this patch is doing again?

pulkit added a child revision: D9770: upgrade: don't create store backup if `--no-backup` is passed. Jan 14 2021, 7:54 AM

pulkit updated this revision to Diff 24849.

Dropping this from stack for now to focus on getting other work pushed.

pulkit mentioned this in D9770: upgrade: don't create store backup if `--no-backup` is passed. Feb 6 2021, 2:26 PM

Djokovich edited the summary of this revision. Apr 17 2021, 8:59 AM

marmoute edited the summary of this revision. Apr 17 2021, 9:41 AM

Revision Contents
Changeset List

			Path	Packages
M			mercurial/upgrade_utils/engine.py (36 lines)
M			tests/test-upgrade-repo.t (4 lines)

Commit	Parents	Author	Summary	Date
8722b9cd27de	4d79375a5c41	Pulkit Goyal		Dec 31 2020, 11:07 AM

Status	Author	Revision
Closed	pulkit	D9775 upgrade: update only requirements if we can
Closed	pulkit	D9774 engine: add `if True` to prepare for next patch
Closed	pulkit	D9773 test: unquiet few tests to demonstrate changes in upcoming patches
Closed	pulkit	D9772 upgrade: mark sharesafe improvement as only touching requirements
Closed	pulkit	D9771 actions: calculate what all parts does the operation touches
Closed	pulkit	D9770 upgrade: don't create store backup if `--no-backup` is passed
Changes Planned	pulkit	D9677 upgrade: use copy+delete instead of rename while creating backup
Closed	pulkit	D9695 upgrade: don't perform anything if nothing to do
Closed	pulkit	D9676 upgrade: migrated -> upgraded in ui messages
Closed	pulkit	D9675 upgrade: remove unnecessary `is None` check
Closed	pulkit	D9674 engine: refactor code to replace stores in separate function
Closed	pulkit	D9694 upgrade: demonstrate that a no-op upgrade still performs everything
Closed	pulkit	D9693 downgrade: if a compression is removed, consider that too
Closed	pulkit	D9673 engine: prevent a function call for each store file
Closed	pulkit	D9672 engine: make hook point for extension a public function
Closed	pulkit	D9669 engine: prevent multiple checking of re-delta-multibase
Closed	pulkit	D9668 engine: pass upgrade operation inside `_perform_clone()`
Closed	pulkit	D9667 engine: pass upgrade operation inside _clonerevlogs()
Closed	pulkit	D9666 actions: store deltareuse mode of whole operation in UpgradeOperation
Closed	pulkit	D9665 engine: refactor how total dstsize is calculated
Closed	pulkit	D9619 upgrade: introduce post upgrade and downgrade message for improvements
Closed	pulkit	D9618 actions: introduce function to calculate downgrades
Closed	pulkit	D9617 debugupgraderepo: minor documentation fix
Closed	pulkit	D9616 upgrade: rename actions to upgrade_actions
Closed	pulkit	D9615 upgrade: move optimization addition to determineactions()
Closed	pulkit	D9614 upgrade: drop support for old style optimization names
Closed	pulkit	D9583 upgrade: add a missing space in status message
Closed	pulkit	D9664 actions: rename DEFICIENCY constant to FORMAT_VARIANT
Closed	pulkit	D9582 upgrade: rename finddeficiences() to find_format_upgrades()
Closed	pulkit	D9580 engine: unwrap a hard to understand for loop
Closed	pulkit	D9579 engine: refactor actual cloning code into separate function
Closed	pulkit	D9578 upgrade: move printing of unused optimizations to UpgradeOperation class
Closed	pulkit	D9577 upgrade: move `printrequirements()` to UpgradeOperation class
Closed	pulkit	D9576 upgrade: move `printoptimisations() to UpgradeOperation class
Closed	pulkit	D9575 upgrade: move `printupgradeactions()` to UpgradeOperation class
Closed	pulkit	D9574 upgrade: move `print_affected_revlogs()` to UpgradeOperation class

Diff 24849

mercurial/upgrade_utils/engine.py

     currentrepo: repo object of current repository
     upgradedrepo: repo object of the upgraded data
     backupvfs: vfs object for the backup path
     upgrade_op: upgrade operation object
     to be used to decide what all is upgraded
     """
     # TODO: don't blindly rename everything in store
     # There can be upgrades where store is not touched at all
-    util.rename(currentrepo.spath, backupvfs.join(b'store'))
+    backupstorevfs = vfsmod.vfs(backupvfs.join(b'store'))
+    util.makedirs(backupstorevfs.base)
+    for path, kind, st in sorted(currentrepo.store.vfs.readdir(b'', stat=True)):
+        # Skip transaction related files.
+        if path.startswith(b'undo'):
+            continue
+        # Only copy regular files.
+        if kind != stat.S_IFREG:
+            continue
+        # Skip other skipped files.
+        if path in (b'lock',):
+            continue
+        src = currentrepo.store.rawvfs.join(path)
+        dst = backupstorevfs.join(path)
+        util.copyfile(src, dst, copystat=True)
+    if currentrepo.svfs.exists(b'data'):
+        util.copyfiles(
+            currentrepo.svfs.join(b'data'),
+            backupstorevfs.join(b'data'),
+            hardlink=False,
+        )
+    if currentrepo.svfs.exists(b'meta'):
+        util.copyfiles(
+            currentrepo.svfs.join(b'meta'),
+            backupstorevfs.join(b'meta'),
+            hardlink=False,
+        )
+
+    currentrepo.vfs.rmtree(b'store', forcibly=True)
     util.rename(upgradedrepo.spath, currentrepo.spath)


 def finishdatamigration(ui, srcrepo, dstrepo, requirements):
     """Hook point for extensions to perform additional actions during upgrade.

     This function is called after revlogs and store files have been copied but
     before the new store is swapped into the original location.
 ...
     ui.status(
         _(
             b'finalizing requirements file and making repository readable '
             b'again\n'
         )
     )
     scmutil.writereporequirements(srcrepo, upgrade_op.new_requirements)

-    # The lock file from the old store won't be removed because nothing has a
-    # reference to its new location. So clean it up manually. Alternatively, we
-    # could update srcrepo.svfs and other variables to point to the new
-    # location. This is simpler.
-    backupvfs.unlink(b'store/lock')
-
     return backuppath
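
The filter applied to top-level store files in this diff (skip `undo*`, skip anything that is not a regular file, skip `lock`) also accounts for the test change, where the `undo*` entries no longer appear in the backup listing. Below is a standalone sketch of that filter, assuming plain `os` and `shutil` calls in place of Mercurial's vfs and util helpers. The function name and the file names in `demo()` are illustrative only.

```python
import os
import shutil
import stat
import tempfile


def backup_top_level_files(store_dir, backup_dir):
    """Copy regular, non-transaction files from store_dir into backup_dir."""
    os.makedirs(backup_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(store_dir)):
        # Skip transaction related files.
        if name.startswith("undo"):
            continue
        # Only copy regular files; directories are handled separately.
        mode = os.lstat(os.path.join(store_dir, name)).st_mode
        if not stat.S_ISREG(mode):
            continue
        # Skip the lock file.
        if name == "lock":
            continue
        shutil.copy2(os.path.join(store_dir, name), os.path.join(backup_dir, name))
        copied.append(name)
    return copied


def demo():
    """Build a toy store and return the names that end up in the backup."""
    with tempfile.TemporaryDirectory() as root:
        store = os.path.join(root, "store")
        os.makedirs(os.path.join(store, "data"))
        for name in ("00changelog.i", "fncache", "undo", "undo.phaseroots", "lock"):
            with open(os.path.join(store, name), "w") as fh:
                fh.write(name)
        return backup_top_level_files(store, os.path.join(root, "backup"))
```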

tests/test-upgrade-repo.t

   $ ls -d .hg/upgradebackup.*/
   .hg/upgradebackup.*/ (glob)
   $ ls .hg/upgradebackup.*/store
   00changelog.i
   00manifest.i
   data
   fncache
   phaseroots
-  undo
-  undo.backup.fncache
-  undo.backupfiles
-  undo.phaseroots

 unless --no-backup is passed

   $ rm -rf .hg/upgradebackup.*/
   $ hg debugupgraderepo --run --no-backup
   upgrade will perform the following actions:

   requirements

Diff	ID	Description	Created	Lint	Unit
Base		Base
Diff 1	24564		Dec 31 2020, 11:43 AM	★	★
Diff 2	24650		Jan 8 2021, 1:45 PM	★	★
Diff 3	24774		Jan 13 2021, 5:31 AM	★	★
Diff 4	24849		Jan 14 2021, 7:54 AM	★	★