This is an archive of the discontinued Mercurial Phabricator instance.

Differential D8244

copies: fix the changeset based algorithm regarding merge
ClosedPublic

Authored by marmoute on Mar 5 2020, 1:14 PM.


Details

Reviewers

None

Group Reviewers

hg-reviewers

Commits

rHG45f3f35cefe7: copies: fix the changeset based algorithm regarding merge

Summary

In 99ebde4fec99, we changed the list of files stored in the files field.
This caused the changeset-centric copy algorithm to break in various
situations involving merges. Older information could reach the merge through
p1, and even though the information from p2 was strictly fresher, it would get
overwritten anyway.

We now record more details about which revision introduced each piece of rename
information. This helps us make the right decision when merging.
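
To illustrate the failure mode described above, here is a minimal Python sketch. It is not Mercurial code: the file names, the revision numbers, and the helpers `old_merge`, `is_ancestor`, and `pick` are hypothetical, and the ancestry table is hard-coded. It only shows why a plain dictionary update lets stale data from p1 win, and how tagging each record with the revision that introduced it makes the fresher record identifiable. The file-merge special case handled by the real patch is left out here.

```python
# Toy model only; names and revision numbers are assumptions for illustration.

def old_merge(p1_copies, p2_copies):
    """Pre-patch rule: p1 data is applied last, so it always wins."""
    merged = dict(p2_copies)
    merged.update(p1_copies)
    return merged


# Copy maps are {destination: source}. Revision 2 recorded "a -> d"; revision 5,
# a descendant of 2 on the p2 side, later recorded "h -> d".
stale_via_p1 = {"d": "a"}
fresh_via_p2 = {"d": "h"}
old_result = old_merge(stale_via_p1, fresh_via_p2)
print(old_result)  # the stale record from p1 survives

# Post-patch idea: store (introducing revision, source) instead of just source.
annotated_p1 = {"d": (2, "a")}
annotated_p2 = {"d": (5, "h")}

ANCESTORS = {5: {2}}  # hard-coded toy ancestry: 2 is an ancestor of 5


def is_ancestor(low, high):
    return low in ANCESTORS.get(high, set())


def pick(p1_entry, p2_entry):
    """Prefer p1 unless its record is strictly older than p2's."""
    if is_ancestor(p1_entry[0], p2_entry[0]):
        return p2_entry
    return p1_entry


new_result = pick(annotated_p1["d"], annotated_p2["d"])
print(new_result)  # the fresher record from p2 is kept
```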

We now run a more comprehensive test suite that includes this kind of
situation. The behavior differs slightly from the filelog-based implementation
in a couple of instances. There are mostly two distinct cases:

- There is conflicting rename information in a merge (a different rename
  history on each side). In this case, the filelog-based implementation
  arbitrarily picks a side based on the file revision number, so the result
  depends on a local factor. The changeset-centric algorithm uses a
  deterministic approach, picking the information coming from the first parent
  of the merge. This is stable across different clones.

- There is rename information related to a file that exists in both source and
  destination. The filelog-based implementation does not even try to detect
  these; however, the changeset-centric one gets them for "free" (it is simpler
  to detect them than not).

The new implementation focuses on correctness. Performance improvements will
come later.

Diff Detail

Repository

rHG Mercurial

Lint

Automatic diff as part of commit; lint not applicable.

Unit

Automatic diff as part of commit; unit tests not applicable.

Event Timeline

marmoute created this revision. Mar 5 2020, 1:14 PM

Herald added a reviewer: hg-reviewers. Mar 5 2020, 1:14 PM

Herald added a subscriber: mercurial-devel.

marmoute mentioned this in D8078: copies: add a new test dedicated to testing chain of changeset with merge. Mar 6 2020, 5:08 AM

marmoute updated this revision to Diff 20551. Mar 6 2020, 5:21 AM

In 99ebde4fec99, we changed the list of files stored into the files field.
This lead to the changeset centric copy algorithm to break in various merge
situation involving merge.

Could you explain why it broke? It's hard to review this patch without really knowing what the problem or the solution is.

The new implementation focus on correctness. Performance improvement will come
later.

How much slower is it? Could you run some of those benchmarks you have run on previous patches touching this code? How do you plan to improve it?

mercurial/copies.py
287	s/had/add/?
tests/test-copies-chain-merge.t
524–525	nit: combine into `(no-filelog !)`?

In D8244#122806, @martinvonz wrote:

In 99ebde4fec99, we changed the list of files stored into the files field.
This lead to the changeset centric copy algorithm to break in various merge
situation involving merge.

Could you explain why it broke? It's hard to review this patch without really knowing what the problem or the solution is.

Outdated information from p1 could overwrite newer information from p2.

The new implementation focus on correctness. Performance improvement will come
later.

How much slower is it? Could you run some of those benchmarks you have run on previous patches touching this code? How do you plan to improve it?

There are two new calls that might degrade performance: the isancestor call, which is easy to cache in memory, and the "ismerged" logic, which is easy to cache on disk with the rest of the copy-related information (the case is rare).

There is some gain to be had in Python, but the main gain will come from moving to the Rust algorithm (which needs to be updated with the new logic). Moving to Rust gives a very large performance boost on the slow cases (usually over 10x, sometimes 100x IIRC). That is the one I care about.

Series updated after Martin's feedback.

marmoute added a child revision: D8258: copies-tests: remove spurious `]` in the template. Mar 6 2020, 6:52 PM

marmoute marked 2 inline comments as done. Mar 6 2020, 7:48 PM

In reply to D8244#122821 (@marmoute's comment above):

I'll make the performance impact more concrete myself. I picked two quite arbitrary tags in the mozilla-unified repo and this is what I saw:

Before this patch:

$ python3 ~/hg/hg perfpathcopies FIREFOX_BETA_44_END FIREFOX_BETA_54_END
! wall 5.279230 comb 5.270000 user 5.250000 sys 0.020000 (best of 3)

After this patch:

$ python3 ~/hg/hg perfpathcopies FIREFOX_BETA_44_END FIREFOX_BETA_54_END
! wall 8.277523 comb 8.280000 user 8.170000 sys 0.110000 (best of 3)

Could you share some more benchmarking data? I know you had a set of commits that you've used before (and that you've asked me to use for benchmarking my patches to copies.py against). It's quite a significant slowdown for the case I tested above, but I'm fine with it since it fixes a bug. I'd just like to see how it behaves in other cases.
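
For reference, the slowdown implied by the two timings above can be computed directly from the `perfpathcopies` output lines. The helper name `wall_seconds` is hypothetical; the two input strings are copied from the measurements quoted in this comment.

```python
import re


def wall_seconds(line):
    """Extract the wall-clock time from a perf output line."""
    match = re.search(r"wall (\d+\.\d+)", line)
    if match is None:
        raise ValueError("no wall time found in: %r" % line)
    return float(match.group(1))


before = "! wall 5.279230 comb 5.270000 user 5.250000 sys 0.020000 (best of 3)"
after = "! wall 8.277523 comb 8.280000 user 8.170000 sys 0.110000 (best of 3)"

ratio = wall_seconds(after) / wall_seconds(before)
print("slowdown: %.2fx" % ratio)  # slowdown: 1.57x
```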

mercurial/copies.py
287	Not actually done, it seems, but not a big deal anyway.

In reply to D8244#123560 (@martinvonz's comment above):

We now have an automatic benchmark setup for copy tracing; however, we don't have any reference repositories with the necessary data for changeset-centric copy tracing. Building a reference takes some manual work and a lot of CPU time, so I am planning to build some once the format is more finalized.

I am not too worried about the current performance numbers because there are multiple easy optimizations. In addition, the Rust version of the previous algorithm proved massively more efficient, so I have good hope for this one too.

The existing references can still be used to gather various useful pairs of revisions to run manual benchmarks on.

You can see them using the following command in a set-up scmperf repo:

$ grep -A 3 copies repos/*.benchrepo

To set up an scmperf repo, you can use:

$ hg clone https://foss.heptapod.net/mercurial/scmperf
$ cd scmperf
$ ./script/setup-repos default.repos

marmoute updated this revision to Diff 21029. Apr 10 2020, 1:43 PM

Herald added a subscriber: mercurial-patches. Apr 21 2020, 3:32 AM

marmoute updated this revision to Diff 21180. Apr 22 2020, 12:11 PM

Gentle ping on this. This is still ready for review.

marmoute updated this revision to Diff 21181. Apr 22 2020, 1:24 PM

In D8244#126461, @marmoute wrote:

Gentle ping on this. This is still ready for review.

I'm still a bit worried about the performance regression and I was hoping to get more information about that. Now is the beginning of the next cycle, so I guess it's a good time to queue this now and test it on Google users. I'll review this soon.

In reply to D8244#126503 (@martinvonz's comment above):

I started poking at the performance and, [spoiler], this will be fine :-) Do not hold your breath, because there is a long road between "prototype to get numbers" and "clean implementation". If you can queue this, I would appreciate building on more solid ground.

For example, on the worst pypy case I have been playing with:

Before the fix: ! wall 1.765737 comb 1.760000 user 1.690000 sys 0.070000 (median of 6)
After the fix:  ! wall 29.983194 comb 29.880000 user 29.780000 sys 0.100000 (median of 3)
With current state of speedup:   ! wall 2.135641 comb 2.140000 user 2.120000 sys 0.020000 (median of 5)

marmoute added a commit: rHG45f3f35cefe7: copies: fix the changeset based algorithm regarding merge. Apr 28 2020, 2:53 PM

This revision was not accepted when it landed; it landed in state Needs Review.

Closed by commit rHG45f3f35cefe7: copies: fix the changeset based algorithm regarding merge (authored by marmoute).

This revision was automatically updated to reflect the committed changes.

Revision Contents
Changeset List

Change	Path
M	mercurial/copies.py (106 lines)
M	tests/test-copies-chain-merge.t (65 lines)

Diff	ID	Description	Created	Lint	Unit
Base		Base
Diff 1	20534		Mar 5 2020, 1:14 PM	★	★
Diff 2	20551		Mar 6 2020, 5:21 AM	★	★
Diff 3	20596		Mar 6 2020, 6:52 PM	★	★
Diff 4	21029		Apr 10 2020, 1:43 PM	★	★
Diff 5	21180		Apr 22 2020, 12:11 PM	★	★
Diff 6	21181		Apr 22 2020, 1:24 PM	★	★
Diff 7	21235	rHG45f3f35cefe7d67b63ba55de2bf688d54f33bcac	Mar 5 2020, 11:55 AM	★	★

Status	Author	Revision
Closed	marmoute	D8258 copies-tests: remove spurious `]` in the template
Closed	marmoute	D8244 copies: fix the changeset based algorithm regarding merge
Abandoned	marmoute	D8243 copies: stop recording buggy file merge when new file overwrite an old one
Closed	marmoute	D8242 copies-tests: add a case where with merge with an overwritten files
Closed	marmoute	D8241 copies-tests: add a case where a file is deleted/added but with a merge
Closed	marmoute	D8240 copies-tests: add a test with a rename overwriting another file
Closed	marmoute	D8239 copies-tests: add a `h` to the root commit (for chain merge tests)
Closed	marmoute	D8257 copies-tests: remove the final summary
Closed	marmoute	D8238 copies-tests: clarify the description of the EA/AE cases
Closed	marmoute	D8237 copies-tests: update the analysis of the BD/DB cases
Closed	marmoute	D8247 copies-tests: swap two branch description

Diff 21235

mercurial/copies.py

	def _revinfogetter(repo):			def _revinfogetter(repo):
	"""return a function that return multiple data given a <rev>"i			"""return a function that return multiple data given a <rev>"i

	* p1: revision number of first parent			* p1: revision number of first parent
	* p2: revision number of first parent			* p2: revision number of first parent
	* p1copies: mapping of copies from p1			* p1copies: mapping of copies from p1
	* p2copies: mapping of copies from p2			* p2copies: mapping of copies from p2
	* removed: a list of removed files			* removed: a list of removed files
				* ismerged: a callback to know if file was merged in that revision
	"""			"""
	cl = repo.changelog			cl = repo.changelog
	parents = cl.parentrevs			parents = cl.parentrevs

    def get_ismerged(rev):
        ctx = repo[rev]

        def ismerged(path):
            if path not in ctx.files():
                return False
            fctx = ctx[path]
            parents = fctx._filelog.parents(fctx._filenode)
            nb_parents = 0
            for n in parents:
                if n != node.nullid:
                    nb_parents += 1
            return nb_parents >= 2

        return ismerged
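
The parent-counting test inside `ismerged` can be tried in isolation with the sketch below. It assumes file parents are given as a pair of 20-byte node ids; `NULLID` mirrors the value of Mercurial's null node (twenty zero bytes), while `is_file_merge`, `NODE_A`, and `NODE_B` are hypothetical names used only for this illustration.

```python
NULLID = b"\0" * 20  # same value as Mercurial's null node: twenty zero bytes


def is_file_merge(file_parents):
    """Return True when a file revision has two non-null parents."""
    return sum(1 for n in file_parents if n != NULLID) >= 2


# Made-up node ids standing in for real filelog nodes.
NODE_A = b"\x0a" * 20
NODE_B = b"\x0b" * 20

root_revision = (NULLID, NULLID)    # brand-new file, no parents
linear_revision = (NODE_A, NULLID)  # ordinary edit, one parent
merge_revision = (NODE_A, NODE_B)   # file-level merge, two parents

for parents in (root_revision, linear_revision, merge_revision):
    print(is_file_merge(parents))
```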

	if repo.filecopiesmode == b'changeset-sidedata':			if repo.filecopiesmode == b'changeset-sidedata':
	changelogrevision = cl.changelogrevision			changelogrevision = cl.changelogrevision
	flags = cl.flags			flags = cl.flags

	# A small cache to avoid doing the work twice for merges			# A small cache to avoid doing the work twice for merges
	#			#
	# In the vast majority of cases, if we ask information for a revision			# In the vast majority of cases, if we ask information for a revision
	# about 1 parent, we'll later ask it for the other. So it make sense to			# about 1 parent, we'll later ask it for the other. So it make sense to
	# many first parent are met before any second parent is reached. In			# many first parent are met before any second parent is reached. In
	# that case the cache could grow. If this even become an issue one can			# that case the cache could grow. If this even become an issue one can
	# safely introduce a maximum cache size. This would trade extra CPU/IO			# safely introduce a maximum cache size. This would trade extra CPU/IO
	# time to save memory.			# time to save memory.
	merge_caches = {}			merge_caches = {}

	def revinfo(rev):			def revinfo(rev):
	p1, p2 = parents(rev)			p1, p2 = parents(rev)
				value = None
	if flags(rev) & REVIDX_SIDEDATA:			if flags(rev) & REVIDX_SIDEDATA:
	e = merge_caches.pop(rev, None)			e = merge_caches.pop(rev, None)
	if e is not None:			if e is not None:
	return e			return e
	c = changelogrevision(rev)			c = changelogrevision(rev)
	p1copies = c.p1copies			p1copies = c.p1copies
	p2copies = c.p2copies			p2copies = c.p2copies
	removed = c.filesremoved			removed = c.filesremoved
	if p1 != node.nullrev and p2 != node.nullrev:			if p1 != node.nullrev and p2 != node.nullrev:
	# XXX some case we over cache, IGNORE			# XXX some case we over cache, IGNORE
	merge_caches[rev] = (p1, p2, p1copies, p2copies, removed)			value = merge_caches[rev] = (
				p1,
				p2,
				p1copies,
				p2copies,
				removed,
				get_ismerged(rev),
				)
	else:			else:
	p1copies = {}			p1copies = {}
	p2copies = {}			p2copies = {}
	removed = []			removed = []
	return p1, p2, p1copies, p2copies, removed
				if value is None:
				value = (p1, p2, p1copies, p2copies, removed, get_ismerged(rev))
				return value

	else:			else:

	def revinfo(rev):			def revinfo(rev):
	p1, p2 = parents(rev)			p1, p2 = parents(rev)
	ctx = repo[rev]			ctx = repo[rev]
	p1copies, p2copies = ctx._copies			p1copies, p2copies = ctx._copies
	removed = ctx.filesremoved()			removed = ctx.filesremoved()
	return p1, p2, p1copies, p2copies, removed			return p1, p2, p1copies, p2copies, removed, get_ismerged(rev)

	return revinfo			return revinfo


	def _changesetforwardcopies(a, b, match):			def _changesetforwardcopies(a, b, match):
	if a.rev() in (node.nullrev, b.rev()):			if a.rev() in (node.nullrev, b.rev()):
	return {}			return {}

	repo = a.repo().unfiltered()			repo = a.repo().unfiltered()
	children = {}			children = {}
	revinfo = _revinfogetter(repo)			revinfo = _revinfogetter(repo)

	cl = repo.changelog			cl = repo.changelog
				isancestor = cl.isancestorrev # XXX we should had chaching to this.
				[inline comment, martinvonz, Done] s/had/add/?
				[inline comment, martinvonz, Not Done] Not actually done, it seems, but not a big deal anyway.
	missingrevs = cl.findmissingrevs(common=[a.rev()], heads=[b.rev()])			missingrevs = cl.findmissingrevs(common=[a.rev()], heads=[b.rev()])
	mrset = set(missingrevs)			mrset = set(missingrevs)
	roots = set()			roots = set()
	for r in missingrevs:			for r in missingrevs:
	for p in cl.parentrevs(r):			for p in cl.parentrevs(r):
	if p == node.nullrev:			if p == node.nullrev:
	continue			continue
	if p not in children:			if p not in children:
	cl.reachableroots(min_root, [b.rev()], list(roots), includepath=True)			cl.reachableroots(min_root, [b.rev()], list(roots), includepath=True)
	)			)

	iterrevs = set(from_head)			iterrevs = set(from_head)
	iterrevs &= mrset			iterrevs &= mrset
	iterrevs.update(roots)			iterrevs.update(roots)
	iterrevs.remove(b.rev())			iterrevs.remove(b.rev())
	revs = sorted(iterrevs)			revs = sorted(iterrevs)
	return _combinechangesetcopies(revs, children, b.rev(), revinfo, match)			return _combinechangesetcopies(
				revs, children, b.rev(), revinfo, match, isancestor
				)


	def _combinechangesetcopies(revs, children, targetrev, revinfo, match):			def _combinechangesetcopies(
				revs, children, targetrev, revinfo, match, isancestor
				):
	"""combine the copies information for each item of iterrevs			"""combine the copies information for each item of iterrevs

	revs: sorted iterable of revision to visit			revs: sorted iterable of revision to visit
	children: a {parent: [children]} mapping.			children: a {parent: [children]} mapping.
	targetrev: the final copies destination revision (not in iterrevs)			targetrev: the final copies destination revision (not in iterrevs)
	revinfo(rev): a function that return (p1, p2, p1copies, p2copies, removed)			revinfo(rev): a function that return (p1, p2, p1copies, p2copies, removed)
	match: a matcher			match: a matcher

	It returns the aggregated copies information for `targetrev`.			It returns the aggregated copies information for `targetrev`.
	"""			"""
	all_copies = {}			all_copies = {}
	alwaysmatch = match.always()			alwaysmatch = match.always()
	for r in revs:			for r in revs:
	copies = all_copies.pop(r, None)			copies = all_copies.pop(r, None)
	if copies is None:			if copies is None:
	# this is a root			# this is a root
	copies = {}			copies = {}
	for i, c in enumerate(children[r]):			for i, c in enumerate(children[r]):
	p1, p2, p1copies, p2copies, removed = revinfo(c)			p1, p2, p1copies, p2copies, removed, ismerged = revinfo(c)
	if r == p1:			if r == p1:
	parent = 1			parent = 1
	childcopies = p1copies			childcopies = p1copies
	else:			else:
	assert r == p2			assert r == p2
	parent = 2			parent = 2
	childcopies = p2copies			childcopies = p2copies
	if not alwaysmatch:			if not alwaysmatch:
	childcopies = {			childcopies = {
	dst: src for dst, src in childcopies.items() if match(dst)			dst: src for dst, src in childcopies.items() if match(dst)
	}			}
	newcopies = copies			newcopies = copies
	if childcopies:			if childcopies:
	newcopies = _chain(newcopies, childcopies)			newcopies = copies.copy()
	# _chain makes a copies, we can avoid doing so in some			for dest, source in pycompat.iteritems(childcopies):
	# simple/linear cases.			prev = copies.get(source)
				if prev is not None and prev[1] is not None:
				source = prev[1]
				newcopies[dest] = (c, source)
	assert newcopies is not copies			assert newcopies is not copies
	for f in removed:			for f in removed:
	if f in newcopies:			if f in newcopies:
	if newcopies is copies:			if newcopies is copies:
	# copy on write to avoid affecting potential other			# copy on write to avoid affecting potential other
	# branches. when there are no other branches, this			# branches. when there are no other branches, this
	# could be avoided.			# could be avoided.
	newcopies = copies.copy()			newcopies = copies.copy()
	del newcopies[f]			newcopies[f] = (c, None)
	othercopies = all_copies.get(c)			othercopies = all_copies.get(c)
	if othercopies is None:			if othercopies is None:
	all_copies[c] = newcopies			all_copies[c] = newcopies
	else:			else:
	# we are the second parent to work on c, we need to merge our			# we are the second parent to work on c, we need to merge our
	# work with the other.			# work with the other.
	#			#
	# Unlike when copies are stored in the filelog, we consider
	# it a copy even if the destination already existed on the
	# other branch. It's simply too expensive to check if the
	# file existed in the manifest.
	#
	# In case of conflict, parent 1 take precedence over parent 2.			# In case of conflict, parent 1 take precedence over parent 2.
	# This is an arbitrary choice made anew when implementing			# This is an arbitrary choice made anew when implementing
	# changeset based copies. It was made without regards with			# changeset based copies. It was made without regards with
	# potential filelog related behavior.			# potential filelog related behavior.
	if parent == 1:			if parent == 1:
	othercopies.update(newcopies)			_merge_copies_dict(
				othercopies, newcopies, isancestor, ismerged
				)
	else:			else:
	newcopies.update(othercopies)			_merge_copies_dict(
				newcopies, othercopies, isancestor, ismerged
				)
	all_copies[c] = newcopies			all_copies[c] = newcopies
	return all_copies[targetrev]
				final_copies = {}
				for dest, (tt, source) in all_copies[targetrev].items():
				if source is not None:
				final_copies[dest] = source
				return final_copies


def _merge_copies_dict(minor, major, isancestor, ismerged):
    """merge two copies-mapping together, minor and major

    In case of conflict, value from "major" will be picked.

    - `isancestors(low_rev, high_rev)`: callable return True if `low_rev` is an
                                        ancestors of `high_rev`,

    - `ismerged(path)`: callable return True if `path` have been merged in the
                        current revision,
    """
    for dest, value in major.items():
        other = minor.get(dest)
        if other is None:
            minor[dest] = value
        else:
            new_tt = value[0]
            other_tt = other[0]
            if value[1] == other[1]:
                continue
            # content from "major" wins, unless it is older
            # than the branch point or there is a merge
            if (
                new_tt == other_tt
                or not isancestor(new_tt, other_tt)
                or ismerged(dest)
            ):
                minor[dest] = value
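
To see how the three conditions interact, the following sketch runs the function on toy inputs. The loop body is the one from the diff above; everything else (`toy_isancestor`, `never_merged`, `always_merged`, the file names, and the revision numbers) is hypothetical. Entries are `(introducing revision, source)` pairs, and a source of `None` marks a removal, as in the final filtering step of `_combinechangesetcopies`.

```python
def _merge_copies_dict(minor, major, isancestor, ismerged):
    """Merge `major` into `minor` in place (loop body as in the diff above)."""
    for dest, value in major.items():
        other = minor.get(dest)
        if other is None:
            minor[dest] = value
        else:
            new_tt = value[0]
            other_tt = other[0]
            if value[1] == other[1]:
                continue
            # content from "major" wins, unless it is older
            # than the branch point or there is a merge
            if (
                new_tt == other_tt
                or not isancestor(new_tt, other_tt)
                or ismerged(dest)
            ):
                minor[dest] = value


def toy_isancestor(low, high):
    # Assumed ancestry: revision 2 is an ancestor of revision 5, nothing else.
    return (low, high) == (2, 5)


def never_merged(path):
    return False


def always_merged(path):
    return True


# 1. major is fresher than minor: major wins.
case1 = {"d": (2, "a")}
_merge_copies_dict(case1, {"d": (5, "h")}, toy_isancestor, never_merged)

# 2. major is strictly older and the file was not merged: minor is kept.
case2 = {"d": (5, "h")}
_merge_copies_dict(case2, {"d": (2, "a")}, toy_isancestor, never_merged)

# 3. same as 2, but the file was merged in this revision: major wins again.
case3 = {"d": (5, "h")}
_merge_copies_dict(case3, {"d": (2, "a")}, toy_isancestor, always_merged)

# 4. removals travel as (rev, None) and are dropped only at the very end.
case4 = {"d": (5, "h")}
_merge_copies_dict(case4, {"f": (7, None)}, toy_isancestor, never_merged)
final_copies = {dst: src for dst, (_rev, src) in case4.items() if src is not None}

print(case1, case2, case3, final_copies)
```

Case 2 is the situation the summary describes: stale data arriving through the "major" (first parent) side no longer overwrites a fresher record from the other side.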


	def _forwardcopies(a, b, base=None, match=None):			def _forwardcopies(a, b, base=None, match=None):
	"""find {dst@b: src@a} copy mapping where a is an ancestor of b"""			"""find {dst@b: src@a} copy mapping where a is an ancestor of b"""

	if base is None:			if base is None:
	base = a			base = a
	match = a.repo().narrowmatch(match)			match = a.repo().narrowmatch(match)

tests/test-copies-chain-merge.t

				#testcases filelog compatibility sidedata

	=====================================================			=====================================================
	Test Copy tracing for chain of copies involving merge			Test Copy tracing for chain of copies involving merge
	=====================================================			=====================================================

	This test files covers copies/rename case for a chains of commit where merges			This test files covers copies/rename case for a chains of commit where merges
	are involved. It cheks we do not have unwanted update of behavior and that the			are involved. It cheks we do not have unwanted update of behavior and that the
	different options to retrieve copies behave correctly.			different options to retrieve copies behave correctly.


	Setup			Setup
	=====			=====

	use git diff to see rename			use git diff to see rename

	$ cat << EOF >> $HGRCPATH			$ cat << EOF >> $HGRCPATH
	> [diff]			> [diff]
	> git=yes			> git=yes
	> [ui]			> [ui]
	> logtemplate={rev} {desc}\n			> logtemplate={rev} {desc}\n
	> EOF			> EOF

				#if compatibility
				$ cat >> $HGRCPATH << EOF
				> [experimental]
				> copies.read-from = compatibility
				> EOF
				#endif

				#if sidedata
				$ cat >> $HGRCPATH << EOF
				> [format]
				> exp-use-side-data = yes
				> exp-use-copies-side-data-changeset = yes
				> EOF
				#endif


	$ hg init repo-chain			$ hg init repo-chain
	$ cd repo-chain			$ cd repo-chain

	Add some linear rename initialy			Add some linear rename initialy

	$ touch a b h			$ touch a b h
	$ hg ci -Am 'i-0 initial commit: a b h'			$ hg ci -Am 'i-0 initial commit: a b h'
	adding a			adding a
	0dd616bc7ab1a111921d95d76f69cda5c2ac539c 644 f			0dd616bc7ab1a111921d95d76f69cda5c2ac539c 644 f
	$ hg manifest --debug --rev 'desc("e-2")' \| grep '644 f'			$ hg manifest --debug --rev 'desc("e-2")' \| grep '644 f'
	6da5a2eecb9c833f830b67a4972366d49a9a142c 644 f			6da5a2eecb9c833f830b67a4972366d49a9a142c 644 f
	$ hg debugindex f			$ hg debugindex f
	rev linkrev nodeid p1 p2			rev linkrev nodeid p1 p2
	0 4 0dd616bc7ab1 000000000000 000000000000			0 4 0dd616bc7ab1 000000000000 000000000000
	1 10 6da5a2eecb9c 000000000000 000000000000			1 10 6da5a2eecb9c 000000000000 000000000000
	2 19 eb806e34ef6b 0dd616bc7ab1 6da5a2eecb9c			2 19 eb806e34ef6b 0dd616bc7ab1 6da5a2eecb9c

				# Here the filelog based implementation is not looking at the rename
				# information (because the file exist on both side). However the changelog
				# based on works fine. We have different output.

	$ hg status --copies --rev 'desc("a-2")' --rev 'desc("mAEm-0")'			$ hg status --copies --rev 'desc("a-2")' --rev 'desc("mAEm-0")'
	M f			M f
				b (no-filelog !)
	R b			R b
	$ hg status --copies --rev 'desc("a-2")' --rev 'desc("mEAm-0")'			$ hg status --copies --rev 'desc("a-2")' --rev 'desc("mEAm-0")'
	M f			M f
				b (no-filelog !)
	R b			R b
	$ hg status --copies --rev 'desc("e-2")' --rev 'desc("mAEm-0")'			$ hg status --copies --rev 'desc("e-2")' --rev 'desc("mAEm-0")'
	M f			M f
				d (no-filelog !)
	R d			R d
	$ hg status --copies --rev 'desc("e-2")' --rev 'desc("mEAm-0")'			$ hg status --copies --rev 'desc("e-2")' --rev 'desc("mEAm-0")'
	M f			M f
				d (no-filelog !)
	R d			R d
	$ hg status --copies --rev 'desc("i-2")' --rev 'desc("a-2")'			$ hg status --copies --rev 'desc("i-2")' --rev 'desc("a-2")'
	A f			A f
	d			d
	R d			R d
	$ hg status --copies --rev 'desc("i-2")' --rev 'desc("e-2")'			$ hg status --copies --rev 'desc("i-2")' --rev 'desc("e-2")'
	A f			A f
	b			b
	R b			R b

				# From here, we run status against revision where both source file exists.
				#
				# The filelog based implementation picks an arbitrary side based on revision
				# numbers. So the same side "wins" whatever the parents order is. This is
				# sub-optimal because depending on revision numbers means the result can be
				# different from one repository to the next.
				#
				# The changeset based algorithm use the parent order to break tie on conflicting
				# information and will have a different order depending on who is p1 and p2.
				# That order is stable accross repositories. (data from p1 prevails)

	$ hg status --copies --rev 'desc("i-2")' --rev 'desc("mAEm-0")'			$ hg status --copies --rev 'desc("i-2")' --rev 'desc("mAEm-0")'
	A f			A f
	d			d
	R b			R b
	R d			R d
	$ hg status --copies --rev 'desc("i-2")' --rev 'desc("mEAm-0")'			$ hg status --copies --rev 'desc("i-2")' --rev 'desc("mEAm-0")'
	A f			A f
	d			d (filelog !)
				b (no-filelog !)
	R b			R b
				[inline comment, martinvonz, Done] nit: combine into `(no-filelog !)`?
	R d			R d
	$ hg status --copies --rev 'desc("i-0")' --rev 'desc("mAEm-0")'			$ hg status --copies --rev 'desc("i-0")' --rev 'desc("mAEm-0")'
	A f			A f
	a			a
	R a			R a
	R b			R b
	$ hg status --copies --rev 'desc("i-0")' --rev 'desc("mEAm-0")'			$ hg status --copies --rev 'desc("i-0")' --rev 'desc("mEAm-0")'
	A f			A f
	a			a (filelog !)
				b (no-filelog !)
	R a			R a
	R b			R b


	Note:			Note:
	\| In this case, one of the merge wrongly record a merge while there is none.			\| In this case, one of the merge wrongly record a merge while there is none.
	\| This lead to bad copy tracing information to be dug up.			\| This lead to bad copy tracing information to be dug up.

	$ hg status --copies --rev 'desc("i-0")' --rev 'desc("mFBm-0")'			$ hg status --copies --rev 'desc("i-0")' --rev 'desc("mFBm-0")'
	M b			M b
	A d			A d
	h			h
	R a			R a
	R h			R h
	$ hg status --copies --rev 'desc("b-1")' --rev 'desc("mBFm-0")'			$ hg status --copies --rev 'desc("b-1")' --rev 'desc("mBFm-0")'
	M d			M d
				h (no-filelog !)
	R h			R h
	$ hg status --copies --rev 'desc("f-2")' --rev 'desc("mBFm-0")'			$ hg status --copies --rev 'desc("f-2")' --rev 'desc("mBFm-0")'
	M b			M b
	$ hg status --copies --rev 'desc("f-1")' --rev 'desc("mBFm-0")'			$ hg status --copies --rev 'desc("f-1")' --rev 'desc("mBFm-0")'
	M b			M b
	M d			M d
				i (no-filelog !)
	R i			R i
	$ hg status --copies --rev 'desc("b-1")' --rev 'desc("mFBm-0")'			$ hg status --copies --rev 'desc("b-1")' --rev 'desc("mFBm-0")'
	M d			M d
				h (no-filelog !)
	R h			R h
	$ hg status --copies --rev 'desc("f-2")' --rev 'desc("mFBm-0")'			$ hg status --copies --rev 'desc("f-2")' --rev 'desc("mFBm-0")'
	M b			M b
	$ hg status --copies --rev 'desc("f-1")' --rev 'desc("mFBm-0")'			$ hg status --copies --rev 'desc("f-1")' --rev 'desc("mFBm-0")'
	M b			M b
	M d			M d
				i (no-filelog !)
	R i			R i

	The following graphlog is wrong, the "a -> c -> d" chain was overwritten and should not appear.			The following graphlog is wrong, the "a -> c -> d" chain was overwritten and should not appear.

	$ hg log -Gfr 'desc("mBFm-0")' d			$ hg log -Gfr 'desc("mBFm-0")' d
	o 22 f-2: rename i -> d			o 22 f-2: rename i -> d
	\|			\|
	o 21 f-1: rename h -> i			o 21 f-1: rename h -> i
	o \| 7 d-1 delete d			o \| 7 d-1 delete d
	\|/			\|/
	o 2 i-2: c -move-> d			o 2 i-2: c -move-> d
	\|			\|
	o 1 i-1: a -move-> c			o 1 i-1: a -move-> c
	\|			\|
	o 0 i-0 initial commit: a b h			o 0 i-0 initial commit: a b h

				One side of the merge have a long history with rename. The other side of the
				merge point to a new file with a smaller history. Each side is "valid".

				(and again the filelog based algorithm only explore one, with a pick based on
				revision numbers)

	$ hg status --copies --rev 'desc("i-0")' --rev 'desc("mDGm-0")'			$ hg status --copies --rev 'desc("i-0")' --rev 'desc("mDGm-0")'
	A d			A d
	a			a (filelog !)
	R a			R a
	$ hg status --copies --rev 'desc("i-0")' --rev 'desc("mGDm-0")'			$ hg status --copies --rev 'desc("i-0")' --rev 'desc("mGDm-0")'
	A d			A d
	a			a
	R a			R a
	$ hg status --copies --rev 'desc("d-2")' --rev 'desc("mDGm-0")'			$ hg status --copies --rev 'desc("d-2")' --rev 'desc("mDGm-0")'
	M d			M d
	$ hg status --copies --rev 'desc("d-2")' --rev 'desc("mGDm-0")'			$ hg status --copies --rev 'desc("d-2")' --rev 'desc("mGDm-0")'
	o 2 i-2: c -move-> d			o 2 i-2: c -move-> d
	\|			\|
	o 1 i-1: a -move-> c			o 1 i-1: a -move-> c
	\|			\|
	o 0 i-0 initial commit: a b h			o 0 i-0 initial commit: a b h

	$ hg status --copies --rev 'desc("i-0")' --rev 'desc("mFGm-0")'			$ hg status --copies --rev 'desc("i-0")' --rev 'desc("mFGm-0")'
	A d			A d
	a			h (no-filelog !)
				a (filelog !)
	R a			R a
	R h			R h
	$ hg status --copies --rev 'desc("i-0")' --rev 'desc("mGFm-0")'			$ hg status --copies --rev 'desc("i-0")' --rev 'desc("mGFm-0")'
	A d			A d
	a			a
	R a			R a
	R h			R h
	$ hg status --copies --rev 'desc("f-2")' --rev 'desc("mFGm-0")'			$ hg status --copies --rev 'desc("f-2")' --rev 'desc("mFGm-0")'
	M d			M d
	$ hg status --copies --rev 'desc("f-2")' --rev 'desc("mGFm-0")'			$ hg status --copies --rev 'desc("f-2")' --rev 'desc("mGFm-0")'
	M d			M d
	$ hg status --copies --rev 'desc("f-1")' --rev 'desc("mFGm-0")'			$ hg status --copies --rev 'desc("f-1")' --rev 'desc("mFGm-0")'
	M d			M d
				i (no-filelog !)
	R i			R i
	$ hg status --copies --rev 'desc("f-1")' --rev 'desc("mGFm-0")'			$ hg status --copies --rev 'desc("f-1")' --rev 'desc("mGFm-0")'
	M d			M d
				i (no-filelog !)
	R i			R i
	$ hg status --copies --rev 'desc("g-1")' --rev 'desc("mFGm-0")'			$ hg status --copies --rev 'desc("g-1")' --rev 'desc("mFGm-0")'
	M d			M d
				h (no-filelog !)
	R h			R h
	$ hg status --copies --rev 'desc("g-1")' --rev 'desc("mGFm-0")'			$ hg status --copies --rev 'desc("g-1")' --rev 'desc("mGFm-0")'
	M d			M d
				h (no-filelog !)
	R h			R h

	$ hg log -Gfr 'desc("mFGm-0")' d			$ hg log -Gfr 'desc("mFGm-0")' d
	o 28 mFGm-0 simple merge - one way			o 28 mFGm-0 simple merge - one way
	\|\			\|\
	\| o 25 g-1: update d			\| o 25 g-1: update d
	\| \|			\| \|
	o \| 22 f-2: rename i -> d			o \| 22 f-2: rename i -> d