( )⚙ D6561 copies: simplify merging of copy dicts on merge commits

This is an archive of the discontinued Mercurial Phabricator instance.

copies: simplify merging of copy dicts on merge commits
ClosedPublic

Authored by martinvonz on Jun 20 2019, 5:09 PM.

Details

Summary

After we removed some filtering in 35d674a3d5db (copies: don't filter
out copy targets created on other side of merge commit, 2019-04-18),
we will always include all entries from "copies1", so we can simplify
the code based on that.

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

martinvonz created this revision.Jun 20 2019, 5:09 PM
This revision was not accepted when it landed; it landed in state Needs Review.
This revision was automatically updated to reflect the committed changes.
yuja added a subscriber: yuja.Jun 25 2019, 7:28 PM
while work:
  • r, i1, copies1 = heapq.heappop(work)

+ r, i1, copies = heapq.heappop(work)

if work and work[0][0] == r:
    # We are tracing copies from both parents
    r, i2, copies2 = heapq.heappop(work)
  • copies = {}
  • allcopies = set(copies1) | set(copies2)
  • for dst in allcopies:

+ for dst, src in copies2.items():

    1. Unlike when copies are stored in the filelog, we consider
    2. it a copy even if the destination already existed on the
    3. other branch. It's simply too expensive to check if the
    4. file existed in the manifest.
  • if dst in copies1:
  • # If it was copied on the p1 side, mark it as copied from

+ if dst not in copies:
+ # If it was copied on the p1 side, leave it as copied from

    1. that side, even if it was also copied on the p2 side.
  • copies[dst] = copies1[dst]
  • else: copies[dst] = copies2[dst]

Are we sure there's no copies alias held by later work?

I'm just asking because we've optimized some copies.copy()s away at
5ceb91136ebe. I don't know if it's safe or not to mutate copies at this
point.

In D6561#95834, @yuja wrote:
while work:
  • r, i1, copies1 = heapq.heappop(work)

+ r, i1, copies = heapq.heappop(work)

if work and work[0][0] == r:
    # We are tracing copies from both parents
    r, i2, copies2 = heapq.heappop(work)
  • copies = {}
  • allcopies = set(copies1) | set(copies2)
  • for dst in allcopies:

+ for dst, src in copies2.items():

    1. Unlike when copies are stored in the filelog, we consider
    2. it a copy even if the destination already existed on the
    3. other branch. It's simply too expensive to check if the
    4. file existed in the manifest.
  • if dst in copies1:
  • # If it was copied on the p1 side, mark it as copied from

+ if dst not in copies:
+ # If it was copied on the p1 side, leave it as copied from

    1. that side, even if it was also copied on the p2 side.
  • copies[dst] = copies1[dst]
  • else: copies[dst] = copies2[dst]

Are we sure there's no copies alias held by later work?
I'm just asking because we've optimized some copies.copy()s away at
5ceb91136ebe. I don't know if it's safe or not to mutate copies at this
point.

I *think* it should be safe. The policy is that each instance can only appear in one place in the queue. I should send a patch to document that policy. Let me know if you see a violation of that policy.