This is an archive of the discontinued Mercurial Phabricator instance.

copies: simplify merging of copy dicts on merge commits
ClosedPublic

Authored by martinvonz on Jun 20 2019, 5:09 PM.

Details

Summary

After we removed some filtering in 35d674a3d5db (copies: don't filter
out copy targets created on other side of merge commit, 2019-04-18),
we will always include all entries from "copies1", so we can simplify
the code based on that.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

martinvonz created this revision.Jun 20 2019, 5:09 PM
This revision was not accepted when it landed; it landed in state Needs Review.
This revision was automatically updated to reflect the committed changes.
yuja added a subscriber: yuja.Jun 25 2019, 7:28 PM
while work:
  • r, i1, copies1 = heapq.heappop(work)

+ r, i1, copies = heapq.heappop(work)

if work and work[0][0] == r:
    # We are tracing copies from both parents
    r, i2, copies2 = heapq.heappop(work)
  • copies = {}
  • allcopies = set(copies1) | set(copies2)
  • for dst in allcopies:

+ for dst, src in copies2.items():

    1. Unlike when copies are stored in the filelog, we consider
    2. it a copy even if the destination already existed on the
    3. other branch. It's simply too expensive to check if the
    4. file existed in the manifest.
  • if dst in copies1:
  • # If it was copied on the p1 side, mark it as copied from

+ if dst not in copies:
+ # If it was copied on the p1 side, leave it as copied from

    1. that side, even if it was also copied on the p2 side.
  • copies[dst] = copies1[dst]
  • else: copies[dst] = copies2[dst]

Are we sure there's no copies alias held by later work?

I'm just asking because we've optimized some copies.copy()s away at
5ceb91136ebe. I don't know if it's safe or not to mutate copies at this
point.

In D6561#95834, @yuja wrote:
while work:
  • r, i1, copies1 = heapq.heappop(work)

+ r, i1, copies = heapq.heappop(work)

if work and work[0][0] == r:
    # We are tracing copies from both parents
    r, i2, copies2 = heapq.heappop(work)
  • copies = {}
  • allcopies = set(copies1) | set(copies2)
  • for dst in allcopies:

+ for dst, src in copies2.items():

    1. Unlike when copies are stored in the filelog, we consider
    2. it a copy even if the destination already existed on the
    3. other branch. It's simply too expensive to check if the
    4. file existed in the manifest.
  • if dst in copies1:
  • # If it was copied on the p1 side, mark it as copied from

+ if dst not in copies:
+ # If it was copied on the p1 side, leave it as copied from

    1. that side, even if it was also copied on the p2 side.
  • copies[dst] = copies1[dst]
  • else: copies[dst] = copies2[dst]

Are we sure there's no copies alias held by later work?
I'm just asking because we've optimized some copies.copy()s away at
5ceb91136ebe. I don't know if it's safe or not to mutate copies at this
point.

I *think* it should be safe. The policy is that each instance can only appear in one place in the queue. I should send a patch to document that policy. Let me know if you see a violation of that policy.