scmutil: make cleanupnodes optionally also fix the phase
ClosedPublic

Authored by martinvonz on Jun 19 2018, 5:16 PM.

Details

Summary

We have had multiple bugs where the phase wasn't correctly carried
forward to a rewritten changeset (for example: phabricator, split,
evolve, fix). Handling the phase update in cleanupnodes() makes it
less likely to happen again, especially once we have made it fix the
phase by default (perhaps in the next release cycle).

This patch also updates all applicable callers so we get some testing
of it.

Note that rebase and histedit can't be fixed yet because they call
cleanupnodes() only at the end and the phase may have been changed by
the user when the rebase/histedit was interrupted (due to merge
conflicts). I think we should make them write one commit at a time (as
it already does), along with associated obsmarkers, bookmark moves,
etc. When that's done, we can switch them over to cleanupnodes().

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
martinvonz created this revision.Jun 19 2018, 5:16 PM
martinvonz added inline comments.Jun 19 2018, 5:20 PM
contrib/phabricator.py
593

This line means that if you use phabricator from your hg repo, you'll need to remember to run ./hg phabsend (otherwise it will crash when trying to add URL in the commit message and you may end up with duplicate reviews like I did)

martinvonz edited the summary of this revision. (Show Details)Jun 19 2018, 7:42 PM
yuja added a subscriber: yuja.Jun 20 2018, 9:44 AM

+def cleanupnodes(repo, replacements, operation, moves=None, metadata=None,
+ fixphase=False):

"""do common cleanups when old nodes are replaced by new nodes
  
That includes writing obsmarkers or stripping nodes, and moving bookmarks.

@@ -825,11 +826,34 @@

    newnode = newnodes[0]
moves[oldnode] = newnode

+ allnewnodes = [n for ns in replacements.values() for n in ns]
+ toretract = {}
+ toadvance = {}
+ if fixphase:
+ precursors = {}
+ for oldnode, newnodes in replacements.items():
+ for newnode in newnodes:
+ precursors.setdefault(newnode, []).append(oldnode)
+
+ allnewnodes.sort(key=lambda n: repo[n].rev())
+ newphase = {}
+ def phase(ctx):
+ return newphase.get(ctx.node(), ctx.phase())
+ for newnode in allnewnodes:
+ ctx = repo[newnode]
+ oldphase = max(repo[oldnode].phase()
+ for oldnode in precursors[newnode])
+ parentphase = max(phase(p) for p in ctx.parents())
+ newphase[newnode] = max(oldphase, parentphase)
+ if newphase[newnode] > ctx.phase():
+ toretract.setdefault(newphase[newnode], []).append(newnode)
+ elif newphase[newnode] < ctx.phase():
+ toadvance.setdefault(newphase[newnode], []).append(newnode)
+

with repo.transaction('cleanup') as tr:
    # Move bookmarks
    bmarks = repo._bookmarks
    bmarkchanges = []
  • allnewnodes = [n for ns in replacements.values() for n in ns] for oldnode, newnode in moves.items(): oldbmarks = repo.nodebookmarks(oldnode) if not oldbmarks: @@ -850,6 +874,11 @@ if bmarkchanges: bmarks.applychanges(repo, tr, bmarkchanges)

    + for phase, nodes in toretract.items(): + phases.retractboundary(repo, tr, phase, nodes) + for phase, nodes in toadvance.items(): + phases.advanceboundary(repo, tr, phase, nodes)

The logic looks good to me, but I'm not sure if recomputing a proper phase
move here is better than the original config override.

    • a/hgext/fix.py +++ b/hgext/fix.py @@ -158,7 +158,7 @@ del filedata[rev]

      replacements = {prec: [succ] for prec, succ in replacements.iteritems()}
  • scmutil.cleanupnodes(repo, replacements, 'fix') + scmutil.cleanupnodes(repo, replacements, 'fix', fixphase=True)

Perhaps this has to be wrapped by a transaction. Otherwise, I think a secret
changeset could be temporarily exposed until cleanupnodes().

I assume this was inspired at least in part by https://bz.mercurial-scm.org/show_bug.cgi?id=5918?

When I was considering how to do this, I was thinking pass in an explicit phase to handle hg amend --secret and similar cases, and None if the phase should just be inherited from the old node(s). Are there cases where nodes are replaced, but you don't want to change the phase at all?

I assume this was inspired at least in part by https://bz.mercurial-scm.org/show_bug.cgi?id=5918?

It was inspired by D2016, but I had also seen your bug report.

When I was considering how to do this, I was thinking pass in an explicit phase to handle hg amend --secret and similar cases, and None if the phase should just be inherited from the old node(s).

Oh, I had not seen that you mentioned cleanupnodes() in that bug (or I did and forgot) :) That sounds like a good idea.

Are there cases where nodes are replaced, but you don't want to change the phase at all?

Yes, unfortunately: rebase and histedit (as mentioned in commit message).

When I originally started working on this right after Kyle sent D2016, I wanted to first fix rebase and histedit so they made the changes incrementally and we only ever had to call cleanupnodes() with one precursor. See comments on D2013 for my plans. It turned out to be (more work than I had hoped)^2 (i.e. more than I had hoped after I figured out it was more than I had hoped). The issue with making rebase and histedit create obsmarkers and move bookmarks incrementally is that we need to keep track of those things so we can undo them on --abort. Then I started thinking that maybe the first step towards that goal would be to add a generic hg undo.

Anyway, I like the idea of being able to use cleanupnodes() for amend as well. Perhaps something like this:

fixphase=False, targetphase=None => leave phases alone (used by rebase and histedit)
fixphase=True, targetphase=None => set phases based just on the precursors' phase (used by most other commands)
fixphase=True, targetphase=<phase> => set phase to targetphase unless parent has higher phase (used by hg amend --secret)
fixphase=False, targetphase=<phase> => assertion error

In D3818#59695, @yuja wrote:

The logic looks good to me, but I'm not sure if recomputing a proper phase
move here is better than the original config override.

The goal is not simplification, but to reduce the risk of bugs. We've had many bugs in this area: phabricator, fix, split, evolve, and others. If we just point extension writers to cleanupnodes(), it seems less likely that they'll make the mistake. Especially in a cycle or two when we've made fixphase=True the default. I'll update the commit message, since I admit that justification was completely missing.

    • a/hgext/fix.py +++ b/hgext/fix.py @@ -158,7 +158,7 @@ del filedata[rev]

      replacements = {prec: [succ] for prec, succ in replacements.iteritems()}
  • scmutil.cleanupnodes(repo, replacements, 'fix') + scmutil.cleanupnodes(repo, replacements, 'fix', fixphase=True)

Perhaps this has to be wrapped by a transaction. Otherwise, I think a secret
changeset could be temporarily exposed until cleanupnodes().

Good point. Addressed in D3823.

martinvonz updated this revision to Diff 9240.Jun 20 2018, 5:04 PM
martinvonz edited the summary of this revision. (Show Details)

Anyway, I like the idea of being able to use cleanupnodes() for amend as well. Perhaps something like this:

fixphase=False, targetphase=None => leave phases alone (used by rebase and histedit)
fixphase=True, targetphase=None => set phases based just on the precursors' phase (used by most other commands)
fixphase=True, targetphase=<phase> => set phase to targetphase unless parent has higher phase (used by hg amend --secret)
fixphase=False, targetphase=<phase> => assertion error

It always bugs me when an API can allow a bad combination of arguments, but I don't see a better way, and I think the consistent use of cleanupnodes() is more important.

Anyway, I like the idea of being able to use cleanupnodes() for amend as well. Perhaps something like this:

fixphase=False, targetphase=None => leave phases alone (used by rebase and histedit)
fixphase=True, targetphase=None => set phases based just on the precursors' phase (used by most other commands)
fixphase=True, targetphase=<phase> => set phase to targetphase unless parent has higher phase (used by hg amend --secret)
fixphase=False, targetphase=<phase> => assertion error

It always bugs me when an API can allow a bad combination of arguments, but I don't see a better way, and I think the consistent use of cleanupnodes() is more important.

I agree. It's also unfortunate to take up two arguments for what should be one. After a few years of Python, I've still not quite gotten used to arguments of mixed types, but perhaps {None, 'auto', <integer phase>} would be better?

quark added a subscriber: quark.Jun 21 2018, 2:40 AM

Not directly related to this patch. On API complexity: One of the unimplemented ideas is to require a transaction and make operation optional - default to the transaction name.

In D3818#59716, @quark wrote:

Not directly related to this patch. On API complexity: One of the unimplemented ideas is to require a transaction and make operation optional - default to the transaction name.

Good idea. I like passing around the transaction explicitly.

I also hope to get rid of the "moves" argument once we have made rebase and histedit add obsmarkers and move bookmarks incrementally.

yuja added a comment.Jun 21 2018, 8:43 AM
> Not directly related to this patch. On API complexity: One of the unimplemented ideas is to require a transaction and make `operation` optional - default to the transaction name.


Good idea. I like passing around the transaction explicitly.

+1. An explicit tr argument will make sure to wrap cleanupnodes() with
transaction.

yuja added a comment.Jun 21 2018, 8:43 AM

Queued, thanks.

+ ctx = unfi[newnode]
+ if targetphase is None:
+ oldphase = max(unfi[oldnode].phase()
+ for oldnode in precursors[newnode])
+ parentphase = max(phase(p) for p in ctx.parents())
+ newphase = max(oldphase, parentphase)
+ else:
+ newphase = targetphase

Maybe better to do max(targetphase, parentphase)?

I think it can be a source of bug that the targetphase may advance ancestors
to public/draft.

This revision was automatically updated to reflect the committed changes.
In D3818#59722, @yuja wrote:

Queued, thanks.

+ ctx = unfi[newnode]
+ if targetphase is None:
+ oldphase = max(unfi[oldnode].phase()
+ for oldnode in precursors[newnode])
+ parentphase = max(phase(p) for p in ctx.parents())
+ newphase = max(oldphase, parentphase)
+ else:
+ newphase = targetphase

Maybe better to do max(targetphase, parentphase)?

I think it can be a source of bug that the targetphase may advance ancestors
to public/draft.

Oh, I definitely agree. Good catch. Will send a followup.