This is an archive of the discontinued Mercurial Phabricator instance.

Differential D9698

shelve: trust caller of shelvedfile.opener() to check that the file exists
Closed, Public

Authored by martinvonz on Jan 8 2021, 3:36 PM.

Details

Reviewers

pulkit

Group Reviewers

hg-reviewers

Commits

rHGb2a8ff736ecf: shelve: trust caller of shelvedfile.opener() to check that the file exists
rHGdb2c6ce1d2cf: shelve: trust caller of shelvedfile.opener() to check that the file exists
rHGef740217d2e9: shelve: trust caller of shelvedfile.opener() to check that the file exists

Summary

The only place we call shelvedfile.opener() is when we're about to
apply a bundle. The file should always exist. If it doesn't, the
.hg/ directory is corrupt and we don't provide any guarantees about
supporting corrupt repos (besides, telling the user that the shelve
doesn't exist when hg shelve --list lists it is not very helpful).
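
As a rough illustration of the behavior change described in the summary, the following self-contained sketch contrasts an opener that translates a missing file into a "not found" abort with one that lets the error propagate. The names `Abort`, `opener_before`, `opener_after`, and `describe` are hypothetical stand-ins for this example only; the real code goes through a Mercurial vfs object rather than the built-in `open()`.

```python
import errno
import os
import tempfile


class Abort(Exception):
    """Hypothetical stand-in for Mercurial's error.Abort."""


def opener_before(path, name):
    # Behavior before this change: a missing file becomes a "not found" abort.
    try:
        return open(path, "rb")
    except IOError as err:
        if err.errno != errno.ENOENT:
            raise
        raise Abort("shelved change '%s' not found" % name)


def opener_after(path):
    # Behavior after this change: no special case, the caller is trusted
    # to have checked that the file exists, so ENOENT simply propagates.
    return open(path, "rb")


def describe(call):
    """Return the name of the exception raised by call(), or 'ok'."""
    try:
        call().close()
    except Exception as exc:
        return type(exc).__name__
    return "ok"


with tempfile.TemporaryDirectory() as tmp:
    # Simulate a corrupt shelf: the bundle file junk1.hg is missing.
    missing = os.path.join(tmp, "junk1.hg")
    before = describe(lambda: opener_before(missing, "junk1"))
    after = describe(lambda: opener_after(missing))
    print(before, after)
```

In this sketch the first variant hides the underlying cause behind a message that contradicts what a listing would show, while the second reports the missing path directly.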

Diff Detail

Repository

rHG Mercurial

Lint

Automatic diff as part of commit; lint not applicable.

Unit

Automatic diff as part of commit; unit tests not applicable.

Event Timeline

martinvonz created this revision. Jan 8 2021, 3:36 PM

Herald added a reviewer: hg-reviewers. Jan 8 2021, 3:36 PM

Herald added a subscriber: mercurial-patches.

martinvonz added a child revision: D9699: shelve: raise more specific errors. Jan 8 2021, 3:36 PM

Testcase:

mkdir -p .hg/shelved
touch .hg/shelved/default-xxx.patch
hg shelve -l
hg unshelve default-xxx

It is listed and the error mode changed. I'm not against the change, but I think we should test for this case? Alternatively, "hg shelve -l" should test for both ".patch" and ".hg" to exist?

I generally like the rest of the series. I would recommend making "hg shelve -l" and "hg unshelve" tighter first by considering a shelf valid only if both .hg and .patch exist. The refactoring otherwise works fine?

In D9698#146561, @joerg.sonnenberger wrote:
Testcase:
mkdir -p .hg/shelved
touch .hg/shelved/default-xxx.patch
hg shelve -l
hg unshelve default-xxx
It is listed and the error mode changed.

Right, that's a test for a corrupt .hg/ directory, and behaves just like I said in the parenthetical part. As I said, I think that we don't generally try hard to handle corrupt .hg/.

I'm not against the change, but I think we should test for this case? Alternatively, "hg shelve -l" should test for both ".patch" and ".hg" to exist?

That should be easier to do after this series. We could perhaps be nice and error out with something like "abort: shelve foo is corrupt" and tell the user to delete it (and make sure that hg unshelve --delete works on corrupt states like that).
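
A minimal sketch of the stricter rule discussed here, assuming a shelf counts as valid only when both its `.patch` and `.hg` files exist. The function `classify_shelves` and the file names used in the demo are hypothetical and are not part of Mercurial; other files that a real shelf may have are ignored for simplicity.

```python
import os
import tempfile


def classify_shelves(shelved_dir):
    """Split shelf names into (valid, corrupt) sorted lists.

    Assumption: a shelf is valid only if both <name>.patch and <name>.hg
    exist. Files with any other extension, or none, are ignored.
    """
    required = {".patch", ".hg"}
    by_name = {}
    for fname in os.listdir(shelved_dir):
        name, ext = os.path.splitext(fname)
        if ext in required:
            by_name.setdefault(name, set()).add(ext)
    valid = sorted(n for n, exts in by_name.items() if exts == required)
    corrupt = sorted(n for n, exts in by_name.items() if exts != required)
    return valid, corrupt


with tempfile.TemporaryDirectory() as tmp:
    # One complete shelf and three kinds of junk entries.
    for fname in ("good.patch", "good.hg", "junk1.patch", "junk2.hg", "junk3"):
        open(os.path.join(tmp, fname), "w").close()
    valid, corrupt = classify_shelves(tmp)
    print("valid:", valid)
    for name in corrupt:
        print("shelve %s is corrupt" % name)
```

A listing command could then show only the valid names, while an unshelve command could abort early with a "corrupt" message for the others instead of failing while opening the bundle.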

In D9698#146562, @joerg.sonnenberger wrote:

I generally like the rest of the series. I would recommend making "hg shelve -l" and "hg unshelve" tighter first by considering a shelf valid only if both .hg and .patch exist. The refactoring otherwise works fine?

Our messages crossed. I said in the other message that it would be easier to do it after this series. I don't consider it a regression to change the handling of corrupt state from "misleading" to "traceback". I'm fine with adding that change on top, if that's okay with you.

The big question for me would be what to do with junk entries. I would make "hg shelve -l" list them with a note to remove them manually, but otherwise reject interacting with them with shelve or unshelve. If that is the acceptable behavior, it seems better to adjust (and test) the semantic tightening first and then refactor the logic. If the intention is to allow removing the junk with "hg shelve -d", it seems easier to refactor first and adjust the behavior afterwards.

In D9698#146566, @joerg.sonnenberger wrote:

The big question for me would be what to do with junk entries. I would make "hg shelve -l" list them with a note to remove them manually, but otherwise reject interacting with them with shelve or unshelve. If that is the acceptable behavior, it seems better to adjust (and test) the semantic tightening first and then refactor the logic. If the intention is to allow removing the junk with "hg shelve -d", it seems easier to refactor first and adjust the behavior afterwards.

I consider it a new feature to better support corrupt entries. For example, before this series:

# Only .hg file does not get listed, but can be deleted (moved to backup).
$ touch .hg/shelved/junk.hg
$ hg shelve -l | grep junk
$ hg shelve -d junk
$ ls .hg/shelved/junk.hg
ls: cannot access '.hg/shelved/junk.hg': No such file or directory

# Only .patch file gets listed, but then `hg unshelve` says it doesn't exist. It can be successfully deleted.
$ touch .hg/shelved/junk.patch
$ hg shelve -l | grep junk
junk            (5s ago)
$ hg unshelve
unshelving change 'junk'
abort: shelved change 'junk' not found
$ hg shelve -d junk
$ hg shelve -l | grep junk
<empty>

# Other files in the directory lead to a crash and the user cannot recover without manually deleting the file.
$ touch .hg/shelved/junk
$ hg shelve -l
[...]
ValueError: not enough values to unpack (expected 2, got 1)
$ hg shelve -d junk
abort: shelved change 'junk' not found

I suspect the code I'm removing in this patch was added before the higher-level check that we now have (search for "not found" in the file). I don't think the intent was ever to support corrupt .hg/shelved/ directories.
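
The `ValueError` in the third case above is the message Python 3 produces when a two-way tuple unpack receives a single value. A plausible cause, stated here as an assumption rather than a reading of the actual shelve code, is splitting a file name on its last `.` and unpacking the result. The sketch below reproduces the message and shows a tolerant variant; `parse_strict` and `parse_tolerant` are hypothetical names.

```python
def parse_strict(fname):
    # Assumed shape of the failing code: unpack the result of a split.
    # A name without any "." yields a one-element list and the unpack fails.
    name, ext = fname.rsplit(".", 1)
    return name, ext


def parse_tolerant(fname):
    # Return None for names without an extension instead of raising.
    parts = fname.rsplit(".", 1)
    if len(parts) != 2:
        return None
    return tuple(parts)


try:
    parse_strict("junk")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
print(parse_tolerant("junk"))
print(parse_tolerant("junk.patch"))
```

Skipping entries for which the tolerant variant returns None would let a listing continue past stray files rather than crash on them.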

martinvonz updated this revision to Diff 24705. Jan 11 2021, 1:04 PM

I added tests for those cases early in the stack (inserted D9718 at the bottom). I also made it so hg shelve --list only lists valid shelves at the end of the stack.

pulkit accepted this revision. Jan 16 2021, 4:15 AM

This revision is now accepted and ready to land. Jan 16 2021, 4:15 AM

martinvonz added a commit: rHGef740217d2e9: shelve: trust caller of shelvedfile.opener() to check that the file exists. Jan 16 2021, 5:56 AM

Closed by commit rHGef740217d2e9: shelve: trust caller of shelvedfile.opener() to check that the file exists (authored by martinvonz).

This revision was automatically updated to reflect the committed changes.

martinvonz added a commit: rHGdb2c6ce1d2cf: shelve: trust caller of shelvedfile.opener() to check that the file exists. Jan 16 2021, 1:31 PM

martinvonz added a commit: rHGb2a8ff736ecf: shelve: trust caller of shelvedfile.opener() to check that the file exists. Jan 17 2021, 2:21 AM

Revision Contents
Changeset List

			Path	Packages
M			mercurial/shelve.py (7 lines)
M			tests/test-shelve2.t (2 lines)

Diff	ID	Description	Created	Lint	Unit
Base		Base
Diff 1	24654		Jan 8 2021, 3:36 PM	★	★
Diff 2	24705		Jan 11 2021, 1:04 PM	★	★
Diff 3	24933	rHGef740217d2e945cb0af3647ff2cfbc3f68771b73	Jan 7 2021, 3:58 PM	★	★

Status	Author	Revision
Closed	martinvonz	D9744 shelve: move listshelves() to new ShelfDir class, so caller need not pass vfs
Closed	martinvonz	D9743 shelve: also create class representing whole directory of shelves
Closed	martinvonz	D9742 shelve: add a method for deleting shelf to new shelf class
Closed	martinvonz	D9741 shelve: inline ".patch" constant now that it's only used in the Shelf class
Closed	martinvonz	D9740 shelve: use listshelves() in cleanupoldbackups()
Closed	martinvonz	D9739 shelve: make listshelves() list shelves in a given vfs
Closed	martinvonz	D9738 shelve: replace repo instance in Shelf class by vfs instance
Closed	martinvonz	D9737 shelve: use listdir() instead of readdir() when we don't need stat information
Closed	martinvonz	D9720 shelve: don't crash on file with unexpected extension in .hg/shelved/
Closed	martinvonz	D9719 shelve: don't include invalid shelves in `hg shelve --list`
Closed	martinvonz	D9714 shelve: extract some repeated creation of shelf instances to variables
Closed	martinvonz	D9713 shelve: teach new shelf class to check if .shelve file exists
Closed	martinvonz	D9712 shelve: move method for creating backup to new shelf class
Closed	martinvonz	D9711 shelve: make gennames() helper generate relative backup paths
Closed	martinvonz	D9710 shelve: use listshelves() in cleanup function
Closed	martinvonz	D9709 shelve: inline shelvedfile.filename() since there are no callers outside class
Closed	martinvonz	D9708 shelve: make listshelves() return shelf names instead of filenames
Closed	martinvonz	D9707 shelve: move method for getting stat (mtime) to new shelf class
Closed	martinvonz	D9706 shelve: open patch using new shelf class instead of open()
Closed	martinvonz	D9705 shelve: move function for opening .patch file to new shelf class
Closed	martinvonz	D9704 shelve: move method for reading .hg to new shelf class
Closed	martinvonz	D9703 shelve: move method for writing bundle to new shelf class
Closed	martinvonz	D9702 shelve: move method for reading .shelve file to new shelf class
Closed	martinvonz	D9701 shelve: move method for writing .shelve to new shelf class
Closed	martinvonz	D9700 shelve: introduce class representing a shelf
Closed	martinvonz	D9699 shelve: raise more specific errors
Closed	martinvonz	D9698 shelve: trust caller of shelvedfile.opener() to check that the file exists
Closed	martinvonz	D9697 shelve: rewrite check for unknown shelf to delete
Closed	martinvonz	D9696 shelve: remove a bundlerepo method
Closed	martinvonz	D9718 tests: add tests for corrupt .hg/shelved/ directory

Diff 24933

mercurial/shelve.py

         if not self.backupvfs.isdir():
             self.backupvfs.makedir()
         util.rename(self.filename(), self.backupfilename())

     def stat(self):
         return self.vfs.stat(self.fname)

     def opener(self, mode=b'rb'):
-        try:
-            return self.vfs(self.fname, mode)
-        except IOError as err:
-            if err.errno != errno.ENOENT:
-                raise
-            raise error.Abort(_(b"shelved change '%s' not found") % self.name)
+        return self.vfs(self.fname, mode)

     def applybundle(self, tr):
         fp = self.opener()
         try:
             targetphase = phases.internal
             if not phases.supportinternal(self.repo):
                 targetphase = phases.secret
             gen = exchange.readbundle(self.repo.ui, fp, self.fname, self.vfs)

tests/test-shelve2.t

   $ mkdir .hg/shelved

 # A (corrupt) .patch file without a .hg file
   $ touch .hg/shelved/junk1.patch
   $ hg shelve -l
   junk1 (* ago) (glob)
   $ hg unshelve
   unshelving change 'junk1'
-  abort: shelved change 'junk1' not found
+  abort: $ENOENT$: '$TESTTMP/corrupt-shelves/.hg/shelved/junk1.hg'
   [255]
   $ hg shelve -d junk1
   $ find .hg/shelve*
   .hg/shelve-backup
   .hg/shelve-backup/junk1.patch
   .hg/shelved

 # A .hg file without a .patch file