This is an archive of the discontinued Mercurial Phabricator instance.

Differential D9952

revlog: add a mechanism to verify expected file position before appending
Closed, Public

Authored by spectral on Feb 3 2021, 8:16 PM.

Details

Reviewers

indygreg

Group Reviewers

hg-reviewers

Commits

rHGe9901d01d135: revlog: add a mechanism to verify expected file position before appending
rHG51f6c4fd4dd9: revlog: add a mechanism to verify expected file position before appending
rHGa909d4e327ac: revlog: add a mechanism to verify expected file position before appending

Summary

If someone uses hg debuglocks, or some non-hg process writes to the .hg
directory without respecting the locks, or if the repo's on a networked
filesystem, it's possible for the revlog code to write out corrupted data.

The form of this corruption can vary depending on what data was written and how
that happened. We are in the "networked filesystem" case (though I've had users
also do this to themselves with the "hg debuglocks" scenario), and most often
see this with the changelog. What ends up happening is we produce two items
(let's call them rev1 and rev2) in the .i file that have the same linkrev,
baserev, and offset into the .d file, while the data in the .d file is appended
properly. rev2's compressed_size is accurate for rev2, but when we go to
decompress the data in the .d file, we use the offset that's recorded in the
index file, which is the same as rev1, and attempt to decompress
rev2.compressed_size bytes of rev1's data. This usually does not succeed. :)
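
The failure described above can be reproduced in miniature. The following is only a sketch under simplifying assumptions: the index is a Python list of dicts rather than a real `.i` file, each entry holds just an offset and a compressed size, and `append_rev`/`read_rev` are hypothetical helpers, not Mercurial functions. The writer computes its offset from a stale entry count, as a process would after another process appended behind its back.

```python
import zlib


def append_rev(index, data, text, known_len):
    """Append one revision, trusting a possibly stale entry count."""
    if known_len:
        last = index[known_len - 1]
        offset = last["offset"] + last["compressed_size"]
    else:
        offset = 0
    chunk = zlib.compress(text)
    data.extend(chunk)  # the data itself still lands at the real end of file
    index.append({"offset": offset, "compressed_size": len(chunk)})


def read_rev(index, data, rev):
    """Decompress a revision using only what the index entry records."""
    entry = index[rev]
    start = entry["offset"]
    end = start + entry["compressed_size"]
    return zlib.decompress(bytes(data[start:end]))


index, data = [], bytearray()
append_rev(index, data, b"rev0", known_len=0)

stale_len = len(index)  # process A looks at the index, then pauses
rev1_text = b"rev1 " + bytes(range(256))
append_rev(index, data, rev1_text, known_len=len(index))  # process B commits
append_rev(index, data, b"rev2", known_len=stale_len)  # process A resumes

# Both entries now point at the same offset in the data.
same_offset = index[1]["offset"] == index[2]["offset"]
try:
    read_rev(index, data, 2)
    rev2_readable = True
except zlib.error:
    # rev2's size was applied to rev1's bytes, giving a truncated stream.
    rev2_readable = False
```

In this toy version rev1 stays readable while rev2 does not, which matches the description: the data was appended correctly, but the index entry for rev2 points at rev1's bytes.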

When using inline data, this also fails, though I haven't investigated why too
closely. This shows up as a "patch decode" error. I believe what's happening
there is that we're basically ignoring the offset field, getting the data
properly, but since baserev != rev, it thinks this is a delta based on rev
(instead of a full text) and can't actually apply it as such.

For now, I'm going to make this an optional component and default it to entirely
off. I may increase the default severity of this in the future, once I've
enabled it for my users and we gain more experience with it. Luckily, most of my
users have a versioned filesystem and can roll back to before the corruption has
been written, it's just a hassle to do so and not everyone knows how (so it's a
support burden). Users on other filesystems will not have that luxury, and this
can cause them to have a corrupted repository that they are unlikely to know how
to resolve, and they'll see this as a data-loss event. Refusing to create the
corruption is a much better user experience.

This mechanism is not perfect. There may be false-negatives (racy writes that
are not detected). There should not be any false-positives (non-racy writes that
are detected as such). This is not a mechanism that makes putting a repo on a
networked filesystem "safe" or "supported", just *less* likely to cause
corruption.
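
As a rough illustration of the check itself, here is a standalone sketch. It is not the Mercurial API: `make_checker` and its string modes are stand-ins for the `get_checker(ui, ...)` function in the diff, which works with a `ui` object and bytes. The point it shows is the asymmetry: a file that is shorter than expected is tolerated (writes may be delayed), while a file that is longer than expected means someone else appended.

```python
import io


def make_checker(mode):
    """Return a position checker, or None when the mode disables checking."""
    if mode not in ("warn", "fail"):
        return None

    def check(fh, name, expected):
        actual = fh.tell()
        if actual <= expected:
            return None  # shorter is fine: data may not be flushed yet
        msg = "%s: file cursor at position %d, expected %d" % (
            name,
            actual,
            expected,
        )
        if mode == "fail":
            raise RuntimeError(msg)
        return msg  # "warn": hand the message back to the caller

    return check


# A file that has grown to 249 bytes while the writer expected 121.
fh = io.BytesIO(b"x" * 249)
fh.seek(0, io.SEEK_END)

disabled = make_checker("")
warning = make_checker("warn")(fh, "00changelog.i", 121)
try:
    make_checker("fail")(fh, "00changelog.i", 121)
    failed = False
except RuntimeError:
    failed = True
```

Returning `None` when the feature is off lets the caller skip the call entirely, so no `tell()` cost is paid unless the check is enabled.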

Diff Detail

Repository

rHG Mercurial

Lint

Automatic diff as part of commit; lint not applicable.

Unit

Automatic diff as part of commit; unit tests not applicable.

Event Timeline

spectral created this revision. Feb 3 2021, 8:16 PM

Herald added a reviewer: indygreg. Feb 3 2021, 8:16 PM

Herald added a reviewer: hg-reviewers.

Herald added a subscriber: mercurial-patches.

spectral added a child revision: D9953: tests: add a comment in a test that will hopefully save someone some time. Feb 3 2021, 8:16 PM

This still adds all of the function call overhead even when the feature is not used. I also don't like that this check is done repeatedly e.g. during an unbundle. I don't think I would mind checking the size once per revlog on the first write, but not repeatedly.

spectral updated this revision to Diff 25481. Feb 4 2021, 2:46 PM

In D9952#150755, @joerg.sonnenberger wrote:

This still adds all of the function call overhead even when the feature is not used. I also don't like that this check is done repeatedly e.g. during an unbundle. I don't think I would mind checking the size once per revlog on the first write, but not repeatedly.

Switched to a plain boolean check instead of a dummy function. Also cleaned up the formatting issues; it turns out I didn't have black installed correctly.

I agree that in a "tight" loop like during a pull it's much less useful, and would be open to having the checks disabled for those operations if/when they're ever enabled by default. However, there are commands like split or rebase (with a merge tool specified) that can have several pauses while hg is waiting on the user (writing a description, choosing what to split, resolving conflicts) where the files have already been written to once, but the user had a chance to "forget" about what they were doing. If there was an easy way of having any user interaction reset the state so that the next write caused the checks to happen again, I might be open to doing that.
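
One way to picture the compromise discussed here (check once per file, but check again after the user has had a chance to interfere) is a small wrapper around the checker. This is purely a hypothetical sketch: `RearmableChecker` and `rearm` are invented names, nothing like this is part of the patch, and where `rearm()` would be called from (for example after an editor or merge tool returns) is an assumption.

```python
class RearmableChecker:
    """Run the wrapped checker only on the first write to each file.

    Calling rearm() makes the next write to every file be checked again.
    """

    def __init__(self, checker):
        self._checker = checker
        self._checked = set()

    def __call__(self, fh, name, expected):
        if name in self._checked:
            return  # already verified since the last rearm(); skip the cost
        self._checked.add(name)
        self._checker(fh, name, expected)

    def rearm(self):
        self._checked.clear()


# Usage with a stand-in checker that just records its calls.
calls = []
checker = RearmableChecker(lambda fh, name, expected: calls.append(name))

checker(None, "00changelog.i", 0)  # first write: checked
checker(None, "00changelog.i", 64)  # tight loop: skipped
checker.rearm()  # e.g. after user interaction
checker(None, "00changelog.i", 128)  # checked again
```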

Sorry, I'm not super familiar with the Phabricator workflow. Should I be doing something more here?

spectral added a commit: rHGa909d4e327ac: revlog: add a mechanism to verify expected file position before appending. Feb 24 2021, 11:19 AM

This revision was not accepted when it landed; it landed in state Needs Review.

Closed by commit rHGa909d4e327ac: revlog: add a mechanism to verify expected file position before appending (authored by spectral).

This revision was automatically updated to reflect the committed changes.

spectral added a commit: rHG51f6c4fd4dd9: revlog: add a mechanism to verify expected file position before appending. Feb 24 2021, 12:12 PM

Seems like this patch made test-git-interop.t more broken than it was before. I'll append the following patch in flight to make it just as broken as it was before:

diff --git a/hgext/git/__init__.py b/hgext/git/__init__.py
--- a/hgext/git/__init__.py
+++ b/hgext/git/__init__.py
@@ -90,7 +90,7 @@ class gitstore(object):  # store.basicst
             return os.path.join(self.path, b'..', b'.hg', f)
         raise NotImplementedError(b'Need to pick file for %s.' % f)

-    def changelog(self, trypending):
+    def changelog(self, trypending, concurrencychecker):
         # TODO we don't have a plan for trypending in hg's git support yet
         return gitlog.changelog(self.git, self._db)

spectral added a commit: rHGe9901d01d135: revlog: add a mechanism to verify expected file position before appending. Feb 24 2021, 12:53 PM

@spectral I've noticed some flakiness on the new test introduced in this change: https://foss.heptapod.net/octobus/mercurial-devel/-/jobs/187438

Could you take a look at it?

I'm unable to reproduce. I've run the test over 10,000 times (I added a #testcases a b c d e f g h i j k l m n o p q r s t u v w x y z so it ran 26 times each run, and I've run over 300 instances of that like run-tests.py -j26 -l --chg test-racy-mutations.t, over 100 with -j108, and other combinations: with and without the added testcases, with and without --chg, etc.).

I'm on a rather beefy Linux machine, based on Debian testing, using Python 3.9, on a commit descended from 856820b4.

It hasn't failed a single time, let alone like that. I also don't know what would cause that kind of failure. :(

In D9952#159118, @spectral wrote:

I'm unable to reproduce. I've run the test over 10,000 times (I added a #testcases a b c d e f g h i j k l m n o p q r s t u v w x y z so it ran 26 times each run, and I've run over 300 instances of that like run-tests.py -j26 -l --chg test-racy-mutations.t, over 100 with -j108, and other combinations: with and without the added testcases, with and without --chg, etc.).
I'm on a rather beefy Linux machine, based on Debian testing, using Python 3.9, on a commit descended from 856820b4.
It hasn't failed a single time, let alone like that. I also don't know what would cause that kind of failure. :(

Mhhh. It may be possible that your machine is too fast to cause the race. Our runners are not the fastest machines in the world and they are under quite a bit of load; it wouldn't be the first time I've seen that happen. Could you try to artificially load up your CPU while you run the tests with chg? Otherwise we'll have to dig deeper.

The error seems to hint at an attempted cleanup of a not fully initialized dirstateguard, so it might be a symptom of another failure that probably has a narrow window for happening.

The test does not use a Python extension, so it looks like that racy window exists in the main code. @spectral, what is the range of code you want to ensure a race with? And how is the synchronization happening to reach it?

FWIW, it happened to me again (for now 100% of my pushes, with an impressive 2 pushes), so probably not a one-off. If you need access to Heptapod CI to figure that out, please say so.

spectral mentioned this in D10504: dirstateguard: use mktemp-like functionality to generate the backup filenames. Apr 20 2021, 4:08 PM

In D9952#159267, @marmoute wrote:

@spectral, what is the range of code you want to ensure a race with? And how is the synchronization happening to reach it?

I'm attempting to reproduce a timeline like:

1. process A starts
2. process A reads changelog, identifies it as having let's say 10 entries
3. process A acquires locks (note: I think steps 2 and 3 can be swapped)
4. process A starts the editor (for a commit message or whatever)
5. user deletes the lock file (maybe they don't realize why it's still held - it's in a screen session or something)
6. process B starts and runs to completion, appending to the changelog. Changelog now has 11 entries.
7. user finds process A's editor, quits it
8. process A appends to changelog, thinking there were only 10 entries but it actually had 11 due to process B writing to it.

In the test, the synchronization happens between steps 4 and 5 (the "editor" signals that it has started, which lets step 5 execute) and again between steps 6 and 7 (the "editor" has been waiting for a signal saying that step 6 has finished and, upon receiving it, exits 'normally').
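
The two handshakes can be sketched in Python with sentinel files, which is the same idea the test's shell "editor" uses. This is an illustration only: process A is a thread here rather than a real `hg commit`, `wait_for_file` and `run_timeline` are made-up helpers, and the file names merely echo the `EDITOR_STARTED` and `MISCHIEF_MANAGED` variables from the test.

```python
import os
import tempfile
import threading
import time


def wait_for_file(path, timeout=5.0, poll=0.05):
    """Block until path exists (as a file or a symlink) or the timeout hits."""
    deadline = time.monotonic() + timeout
    while not os.path.lexists(path):
        if time.monotonic() > deadline:
            raise TimeoutError("%s was not created in %s seconds" % (path, timeout))
        time.sleep(poll)


def run_timeline():
    events = []
    with tempfile.TemporaryDirectory() as tmp:
        editor_started = os.path.join(tmp, "editor_started")
        mischief_managed = os.path.join(tmp, "mischief_managed")

        def process_a():
            # Step 4: the "editor" starts and announces itself.
            events.append("A: editor started")
            open(editor_started, "w").close()
            # Steps 7-8 only happen once the other side says it is done.
            wait_for_file(mischief_managed)
            events.append("A: editor exits, commit appends")

        worker = threading.Thread(target=process_a)
        worker.start()

        # First handshake: do not break the lock before the editor is up.
        wait_for_file(editor_started)
        events.append("B: lock removed, second commit completed")  # steps 5-6
        # Second handshake: wake the editor.
        open(mischief_managed, "w").close()
        worker.join()
    return events


timeline = run_timeline()
```

Because each side only proceeds after the other has created its sentinel file, the ordering of the three events is fixed no matter how fast or loaded the machine is.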

The only way I can think of for this to break is if something is causing the dirstate.backup files to be deleted either in step 5 (debuglocks being used to delete the locks deletes the backup files) or step 6 (process B just blindly deletes all the backup files).

My guess: my machine has some security hardening thing enabled that your test runner doesn't? Assuming you're using CPython, the number on the end of these files is the memory address of the dirstateguard object, and because of chg, we probably actually stand a rather high likelihood that it'll be the same between processes. I don't know why I wasn't able to reproduce, but if I change id(self) to instead be a constant string, it does reproduce.

I haven't tried to figure out how to run the tests on the test runner yet, but I'm relatively confident that this is the cause. I've sent D10504 to address.
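
To make the naming problem concrete, here is a small sketch contrasting a name derived from a number that two processes may share with a name obtained from a mktemp-like call. It is not the code from D10504: `address_based_name` and `unique_backup_name` are hypothetical helpers, and the `dirstate.backup.` prefix is used only for illustration.

```python
import os
import tempfile


def address_based_name(obj_id):
    # A name built from a number such as id(self); two processes that end up
    # with the same number produce the same file name.
    return "dirstate.backup.%d" % obj_id


def unique_backup_name(directory, prefix="dirstate.backup."):
    # mkstemp creates the file exclusively, so it never hands back a name
    # that already exists in the directory.
    fd, path = tempfile.mkstemp(prefix=prefix, dir=directory)
    os.close(fd)
    return os.path.basename(path)


with tempfile.TemporaryDirectory() as tmp:
    # Same "address" in two processes means the same backup file name.
    colliding = address_based_name(140000) == address_based_name(140000)
    first = unique_backup_name(tmp)
    second = unique_backup_name(tmp)
```

With a shared name, whichever process finishes first can remove or overwrite the other's backup file; with exclusively created names, each guard only ever touches its own file.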

Thanks for looking into it!

If you want to run this through the heptapod-CI simply put your change in a topic and push to the heptapod repositories. This will trigger a CI run.

spectral mentioned this in rHG222a42ac5b2d: dirstateguard: use mktemp-like functionality to generate the backup filenames. Apr 29 2021, 10:57 AM

Revision Contents
Changeset List

		Path
M		mercurial/changelog.py (6 lines)
M		mercurial/configitems.py (5 lines)
M		mercurial/localrepo.py (10 lines)
M		mercurial/revlog.py (23 lines)
A	M	mercurial/revlogutils/concurrency_checker.py (38 lines)
M		mercurial/store.py (8 lines)
A	M	tests/test-racy-mutations.t (102 lines)

Diff	ID	Description	Created	Lint	Unit
Base		Base
Diff 1	25478		Feb 3 2021, 8:15 PM	★	★
Diff 2	25481		Feb 4 2021, 2:46 PM	★	★
Diff 3	25885	rHGa909d4e327ac146e2f1b95162e472dc490ea7bcb	Feb 3 2021, 7:33 PM	★	★

	Status	Author	Revision
	Closed	spectral	D9953 tests: add a comment in a test that will hopefully save someone some time
	Closed	spectral	D9952 revlog: add a mechanism to verify expected file position before appending

Diff 25885

mercurial/changelog.py


	@property			@property
	def branchinfo(self):			def branchinfo(self):
	extra = self.extra			extra = self.extra
	return encoding.tolocal(extra.get(b"branch")), b'close' in extra			return encoding.tolocal(extra.get(b"branch")), b'close' in extra


	class changelog(revlog.revlog):			class changelog(revlog.revlog):
	def __init__(self, opener, trypending=False):			def __init__(self, opener, trypending=False, concurrencychecker=None):
	"""Load a changelog revlog using an opener.			"""Load a changelog revlog using an opener.

	If ``trypending`` is true, we attempt to load the index from a			If ``trypending`` is true, we attempt to load the index from a
	``00changelog.i.a`` file instead of the default ``00changelog.i``.			``00changelog.i.a`` file instead of the default ``00changelog.i``.
	The ``00changelog.i.a`` file contains index (and possibly inline			The ``00changelog.i.a`` file contains index (and possibly inline
	revision) data for a transaction that hasn't been finalized yet.			revision) data for a transaction that hasn't been finalized yet.
	It exists in a separate file to facilitate readers (such as			It exists in a separate file to facilitate readers (such as
	hooks processes) accessing data before a transaction is finalized.			hooks processes) accessing data before a transaction is finalized.

				``concurrencychecker`` will be passed to the revlog init function, see
				the documentation there.
	"""			"""
	if trypending and opener.exists(b'00changelog.i.a'):			if trypending and opener.exists(b'00changelog.i.a'):
	indexfile = b'00changelog.i.a'			indexfile = b'00changelog.i.a'
	else:			else:
	indexfile = b'00changelog.i'			indexfile = b'00changelog.i'

	datafile = b'00changelog.d'			datafile = b'00changelog.d'
	revlog.revlog.__init__(			revlog.revlog.__init__(
	self,			self,
	opener,			opener,
	indexfile,			indexfile,
	datafile=datafile,			datafile=datafile,
	checkambig=True,			checkambig=True,
	mmaplargeindex=True,			mmaplargeindex=True,
	persistentnodemap=opener.options.get(b'persistent-nodemap', False),			persistentnodemap=opener.options.get(b'persistent-nodemap', False),
				concurrencychecker=concurrencychecker,
	)			)

	if self._initempty and (self.version & 0xFFFF == revlog.REVLOGV1):			if self._initempty and (self.version & 0xFFFF == revlog.REVLOGV1):
	# changelogs don't benefit from generaldelta.			# changelogs don't benefit from generaldelta.

	self.version &= ~revlog.FLAG_GENERALDELTA			self.version &= ~revlog.FLAG_GENERALDELTA
	self._generaldelta = False			self._generaldelta = False

mercurial/configitems.py

	default=False,			default=False,
	)			)
	coreconfigitem(			coreconfigitem(
	b'debug',			b'debug',
	b'dirstate.delaywrite',			b'dirstate.delaywrite',
	default=0,			default=0,
	)			)
	coreconfigitem(			coreconfigitem(
				b'debug',
				b'revlog.verifyposition.changelog',
				default=b'',
				)
				coreconfigitem(
	b'defaults',			b'defaults',
	b'.*',			b'.*',
	default=None,			default=None,
	generic=True,			generic=True,
	)			)
	coreconfigitem(			coreconfigitem(
	b'devel',			b'devel',
	b'all-warnings',			b'all-warnings',

mercurial/localrepo.py

	)			)

	from .utils import (			from .utils import (
	hashutil,			hashutil,
	procutil,			procutil,
	stringutil,			stringutil,
	)			)

	from .revlogutils import constants as revlogconst			from .revlogutils import (
				concurrency_checker as revlogchecker,
				constants as revlogconst,
				)

	release = lockmod.release			release = lockmod.release
	urlerr = util.urlerr			urlerr = util.urlerr
	urlreq = util.urlreq			urlreq = util.urlreq

	# set of (path, vfs-location) tuples. vfs-location is:			# set of (path, vfs-location) tuples. vfs-location is:
	# - 'plain for vfs relative paths			# - 'plain for vfs relative paths
	# - '' for svfs relative paths			# - '' for svfs relative paths
	@storecache(b'obsstore')			@storecache(b'obsstore')
	def obsstore(self):			def obsstore(self):
	return obsolete.makestore(self.ui, self)			return obsolete.makestore(self.ui, self)

	@storecache(b'00changelog.i')			@storecache(b'00changelog.i')
	def changelog(self):			def changelog(self):
	# load dirstate before changelog to avoid race see issue6303			# load dirstate before changelog to avoid race see issue6303
	self.dirstate.prefetch_parents()			self.dirstate.prefetch_parents()
	return self.store.changelog(txnutil.mayhavepending(self.root))			return self.store.changelog(
				txnutil.mayhavepending(self.root),
				concurrencychecker=revlogchecker.get_checker(self.ui, b'changelog'),
				)

	@storecache(b'00manifest.i')			@storecache(b'00manifest.i')
	def manifestlog(self):			def manifestlog(self):
	return self.store.manifestlog(self, self._storenarrowmatch)			return self.store.manifestlog(self, self._storenarrowmatch)

	@repofilecache(b'dirstate')			@repofilecache(b'dirstate')
	def dirstate(self):			def dirstate(self):
	return self._makedirstate()			return self._makedirstate()

mercurial/revlog.py

	If mmaplargeindex is True, and an mmapindexthreshold is set, the			If mmaplargeindex is True, and an mmapindexthreshold is set, the
	index will be mmapped rather than read if it is larger than the			index will be mmapped rather than read if it is larger than the
	configured threshold.			configured threshold.

	If censorable is True, the revlog can have censored revisions.			If censorable is True, the revlog can have censored revisions.

	If `upperboundcomp` is not None, this is the expected maximal gain from			If `upperboundcomp` is not None, this is the expected maximal gain from
	compression for the data content.			compression for the data content.

				`concurrencychecker` is an optional function that receives 3 arguments: a
				file handle, a filename, and an expected position. It should check whether
				the current position in the file handle is valid, and log/warn/fail (by
				raising).
	"""			"""

	_flagserrorclass = error.RevlogError			_flagserrorclass = error.RevlogError

	def __init__(			def __init__(
	self,			self,
	opener,			opener,
	indexfile,			indexfile,
	datafile=None,			datafile=None,
	checkambig=False,			checkambig=False,
	mmaplargeindex=False,			mmaplargeindex=False,
	censorable=False,			censorable=False,
	upperboundcomp=None,			upperboundcomp=None,
	persistentnodemap=False,			persistentnodemap=False,
				concurrencychecker=None,
	):			):
	"""			"""
	create a revlog object			create a revlog object

	opener is a function that abstracts the file opening operation			opener is a function that abstracts the file opening operation
	and can be used to implement COW semantics or the like.			and can be used to implement COW semantics or the like.

	"""			"""
	# custom flags.			# custom flags.
	self._flagprocessors = dict(flagutil.flagprocessors)			self._flagprocessors = dict(flagutil.flagprocessors)

	# 2-tuple of file handles being used for active writing.			# 2-tuple of file handles being used for active writing.
	self._writinghandles = None			self._writinghandles = None

	self._loadindex()			self._loadindex()

				self._concurrencychecker = concurrencychecker

	def _loadindex(self):			def _loadindex(self):
	mmapindexthreshold = None			mmapindexthreshold = None
	opts = self.opener.options			opts = self.opener.options

	if b'revlogv2' in opts:			if b'revlogv2' in opts:
	newversionflags = REVLOGV2 | FLAG_INLINE_DATA			newversionflags = REVLOGV2 | FLAG_INLINE_DATA
	elif b'revlogv1' in opts:			elif b'revlogv1' in opts:
	newversionflags = REVLOGV1 | FLAG_INLINE_DATA			newversionflags = REVLOGV1 | FLAG_INLINE_DATA
	else:			else:
	fh = dfh			fh = dfh

	btext = [rawtext]			btext = [rawtext]

	curr = len(self)			curr = len(self)
	prev = curr - 1			prev = curr - 1
	offset = self.end(prev)			offset = self.end(prev)

				if self._concurrencychecker:
				    if self._inline:
				        # offset is "as if" it were in the .d file, so we need to add on
				        # the size of the entry metadata.
				        self._concurrencychecker(
				            ifh, self.indexfile, offset + curr * self._io.size
				        )
				    else:
				        # Entries in the .i are a consistent size.
				        self._concurrencychecker(
				            ifh, self.indexfile, curr * self._io.size
				        )
				        self._concurrencychecker(dfh, self.datafile, offset)

	p1r, p2r = self.rev(p1), self.rev(p2)			p1r, p2r = self.rev(p1), self.rev(p2)

	# full versions are inserted when the needed deltas			# full versions are inserted when the needed deltas
	# become comparable to the uncompressed text			# become comparable to the uncompressed text
	if rawtext is None:			if rawtext is None:
	# need rawtext size, before changed by flag processors, which is			# need rawtext size, before changed by flag processors, which is
	# the non-raw size. use revlog explicitly to avoid filelog's extra			# the non-raw size. use revlog explicitly to avoid filelog's extra
	# logic that might remove metadata size.			# logic that might remove metadata size.

mercurial/revlogutils/concurrency_checker.py

This file was added.

from ..i18n import _
from .. import error


def get_checker(ui, revlog_name=b'changelog'):
    """Get a function that checks file handle position is as expected.

    This is used to ensure that files haven't been modified outside of our
    knowledge (such as on a networked filesystem, if `hg debuglocks` was used,
    or writes to .hg that ignored locks happened).

    Due to revlogs supporting a concept of buffered, delayed, or diverted
    writes, we're allowing the files to be shorter than expected (the data may
    not have been written yet), but they can't be longer.

    Please note that this check is not perfect; it can't detect all cases (there
    may be false-negatives/false-OKs), but it should never claim there's an
    issue when there isn't (false-positives/false-failures).
    """

    vpos = ui.config(b'debug', b'revlog.verifyposition.' + revlog_name)
    # Avoid any `fh.tell` cost if this isn't enabled.
    if not vpos or vpos not in [b'log', b'warn', b'fail']:
        return None

    def _checker(fh, fn, expected):
        if fh.tell() <= expected:
            return

        msg = _(b'%s: file cursor at position %d, expected %d')
        # Always log if we're going to warn or fail.
        ui.log(b'debug', msg + b'\n', fn, fh.tell(), expected)
        if vpos == b'warn':
            ui.warn((msg + b'\n') % (fn, fh.tell(), expected))
        elif vpos == b'fail':
            raise error.RevlogError(msg % (fn, fh.tell(), expected))

    return _checker

mercurial/store.py

	if filefilter(f, kind, st):			if filefilter(f, kind, st):
	n = util.pconvert(fp[striplen:])			n = util.pconvert(fp[striplen:])
	l.append((decodedir(n), n, st.st_size))			l.append((decodedir(n), n, st.st_size))
	elif kind == stat.S_IFDIR and recurse:			elif kind == stat.S_IFDIR and recurse:
	visit.append(fp)			visit.append(fp)
	l.sort()			l.sort()
	return l			return l

	def changelog(self, trypending):			def changelog(self, trypending, concurrencychecker=None):
	return changelog.changelog(self.vfs, trypending=trypending)			return changelog.changelog(
				self.vfs,
				trypending=trypending,
				concurrencychecker=concurrencychecker,
				)

	def manifestlog(self, repo, storenarrowmatch):			def manifestlog(self, repo, storenarrowmatch):
	rootstore = manifest.manifestrevlog(self.vfs)			rootstore = manifest.manifestrevlog(self.vfs)
	return manifest.manifestlog(self.vfs, repo, rootstore, storenarrowmatch)			return manifest.manifestlog(self.vfs, repo, rootstore, storenarrowmatch)

	def datafiles(self, matcher=None):			def datafiles(self, matcher=None):
	return self._walk(b'data', True) + self._walk(b'meta', True)			return self._walk(b'data', True) + self._walk(b'meta', True)

tests/test-racy-mutations.t

This file was added.

#testcases skip-detection fail-if-detected

Test situations that "should" only be reproducible:
- on networked filesystems, or
- user using `hg debuglocks` to eliminate the lock file, or
- something (that doesn't respect the lock file) writing to the .hg directory
while we're running

  $ hg init a
  $ cd a

  $ cat > "$TESTTMP/waitlock_editor.sh" <<EOF
  > [ -n "\${WAITLOCK_ANNOUNCE:-}" ] && touch "\${WAITLOCK_ANNOUNCE}"
  > f="\${WAITLOCK_FILE}"
  > start=\`date +%s\`
  > timeout=5
  > while [ \\( ! -f \$f \\) -a \\( ! -L \$f \\) ]; do
  >     now=\`date +%s\`
  >     if [ "\`expr \$now - \$start\`" -gt \$timeout ]; then
  >         echo "timeout: \$f was not created in \$timeout seconds (it is now \$(date +%s))"
  >         exit 1
  >     fi
  >     sleep 0.1
  > done
  > if [ \$# -gt 1 ]; then
  >     cat "\$@"
  > fi
  > EOF
  $ chmod +x "$TESTTMP/waitlock_editor.sh"

Things behave differently if we don't already have a 00changelog.i file when
this all starts, so let's make one.

  $ echo r0 > r0
  $ hg commit -qAm 'r0'

Start an hg commit that will take a while
  $ EDITOR_STARTED="$(pwd)/.editor_started"
  $ MISCHIEF_MANAGED="$(pwd)/.mischief_managed"
  $ JOBS_FINISHED="$(pwd)/.jobs_finished"

#if fail-if-detected
  $ cat >> .hg/hgrc << EOF
  > [debug]
  > revlog.verifyposition.changelog = fail
  > EOF
#endif

  $ echo foo > foo
  $ (WAITLOCK_ANNOUNCE="${EDITOR_STARTED}" \
  > WAITLOCK_FILE="${MISCHIEF_MANAGED}" \
  > HGEDITOR="$TESTTMP/waitlock_editor.sh" \
  > hg commit -qAm 'r1 (foo)' --edit foo > .foo_commit_out 2>&1 ; touch "${JOBS_FINISHED}") &

Wait for the "editor" to actually start
  $ WAITLOCK_FILE="${EDITOR_STARTED}" "$TESTTMP/waitlock_editor.sh"

Break the locks, and make another commit.
  $ hg debuglocks -LW
  $ echo bar > bar
  $ hg commit -qAm 'r2 (bar)' bar
  $ hg debugrevlogindex -c
     rev linkrev nodeid       p1           p2
       0       0 222799e2f90b 000000000000 000000000000
       1       1 6f124f6007a0 222799e2f90b 000000000000

Awaken the editor from that first commit
  $ touch "${MISCHIEF_MANAGED}"
And wait for it to finish
  $ WAITLOCK_FILE="${JOBS_FINISHED}" "$TESTTMP/waitlock_editor.sh"

#if skip-detection
(Ensure there was no output)
  $ cat .foo_commit_out
And observe a corrupted repository -- rev 2's linkrev is 1, which should never
happen for the changelog (the linkrev should always refer to itself).
  $ hg debugrevlogindex -c
     rev linkrev nodeid       p1           p2
       0       0 222799e2f90b 000000000000 000000000000
       1       1 6f124f6007a0 222799e2f90b 000000000000
       2       1 ac80e6205bb2 222799e2f90b 000000000000
#endif

#if fail-if-detected
  $ cat .foo_commit_out
  transaction abort!
  rollback completed
  note: commit message saved in .hg/last-message.txt
  note: use 'hg commit --logfile .hg/last-message.txt --edit' to reuse it
  abort: 00changelog.i: file cursor at position 249, expected 121
And no corruption in the changelog.
  $ hg debugrevlogindex -c
     rev linkrev nodeid       p1           p2
       0       0 222799e2f90b 000000000000 000000000000
       1       1 6f124f6007a0 222799e2f90b 000000000000
And, because of transactions, there's none in the manifestlog either.
  $ hg debugrevlogindex -m
     rev linkrev nodeid       p1           p2
       0       0 7b7020262a56 000000000000 000000000000
       1       1 ad3fe36d86d9 7b7020262a56 000000000000
#endif
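
The positions in the abort message above can be checked against the formula in the revlog.py hunk. The sketch below rests on assumptions that the page does not state outright: that the changelog in the test is inline, that the paused commit saw exactly one existing revision (r0), and that an index entry has the 64-byte revlog v1 layout written out in the struct format. The 57-byte figure for r0's stored data is inferred from the message, not taken from the test, and `expected_index_position` is a hypothetical helper.

```python
import struct

# Assumed revlog v1 index entry layout; what matters here is its size.
INDEX_ENTRY = struct.Struct(">Qiiiiii20s12x")


def expected_index_position(curr, data_end, inline):
    """Mirror of the expected-position arithmetic in the revlog.py hunk.

    curr is the number of existing revisions and data_end the end offset of
    the last revision's data, "as if" it lived in a separate .d file.
    """
    if inline:
        # Inline revlogs interleave entries and data in the .i file.
        return data_end + curr * INDEX_ENTRY.size
    return curr * INDEX_ENTRY.size


entry_size = INDEX_ENTRY.size

# The paused commit expected position 121 with one revision on disk, so
# r0's stored data must end at 121 - 64 = 57 (inferred).
r0_data_end = 121 - 1 * entry_size
expected = expected_index_position(curr=1, data_end=r0_data_end, inline=True)

# The cursor was found at 249, so the racing commit added 128 bytes.
extra_bytes = 249 - expected
```

Under these assumptions the 128 extra bytes are consistent with one more 64-byte index entry plus 64 bytes of data written by the second commit.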