The default copytrace implementation is very slow because it finds all the new
files that were added from the merge base up to the head commit, and for each
file it checks whether it is a copied or moved version of a different file.
The copytrace extension in fb-hgext has a heuristic implementation of copy
tracing that is faster than the default one. The heuristic limits the search
for copies to files that are either:
- Renamed within the same directory
- Moved to another directory under the same name
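As a rough illustration of the two rules above, the following sketch filters
candidate source files for a newly added file. The function name
`heuristic_candidates` and the sample paths are hypothetical and are not taken
from the fb-hgext code; the sketch only shows the matching idea and assumes
paths use `/` separators.

```python
import posixpath


def heuristic_candidates(newfile, oldfiles):
    """Return old files that could be the source of newfile.

    Hypothetical helper: a candidate either lives in the same directory as
    newfile (a rename in place) or has the same basename in another
    directory (a move).
    """
    newdir, newbase = posixpath.split(newfile)
    candidates = []
    for old in oldfiles:
        if old == newfile:
            continue  # a file is never its own copy source
        olddir, oldbase = posixpath.split(old)
        if olddir == newdir or oldbase == newbase:
            candidates.append(old)
    return candidates


# Made-up file list, for illustration only.
old = ["lib/util.py", "lib/io.py", "docs/readme.txt", "src/main.py"]
print(heuristic_candidates("lib/helpers.py", old))  # same-directory candidates
print(heuristic_candidates("tools/main.py", old))   # same-basename candidate
```

Every other file is skipped without inspecting its contents, which is where
the speedup over checking each new file against all files would come from.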
Stash@fb analyzed the above heuristics and found that among 2,443,768
moves/copies there are only 32,234 moves/copies that do not fall under the
above heuristics, which is approx. 1.3% of the total copies.
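The ratio quoted above can be checked directly from the two counts. The
variable names are only for illustration:

```python
total_copies = 2443768      # all moves/copies in the analysis
outside_heuristic = 32234   # moves/copies the heuristic would miss

fraction = outside_heuristic / total_copies
print("fraction missed: %.4f" % fraction)
print("percent missed:  %.2f%%" % (fraction * 100))
```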
If experimental.disablecopytrace = yes, then the experimental.fastcopytrace flag
is ignored, as the user explicitly disabled copytracing.
If experimental.disablecopytrace = no, then the experimental.fastcopytrace flag
is considered, and if it is set to true, the fastcopytrace heuristic
implementation is used.
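The precedence between the two flags can be sketched as below. This is only
an illustration of the rule described above: the function name
`copytrace_mode` and the returned strings are hypothetical, and the booleans
stand in for the already-parsed config values.

```python
def copytrace_mode(disablecopytrace, fastcopytrace):
    """Pick a copytracing mode from the two experimental flags.

    Hypothetical helper: disablecopytrace wins over fastcopytrace.
    """
    if disablecopytrace:
        # The user explicitly disabled copytracing; fastcopytrace is ignored.
        return "off"
    if fastcopytrace:
        return "heuristics"
    return "full"


for disable in (True, False):
    for fast in (True, False):
        print(disable, fast, copytrace_mode(disable, fast))
```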
There are two more flags added by the implementation:
- experimental.fastcopytrace.sourcecommitlimmit
This flag limits the number of commits to be traversed for the heuristics in
the source branch, i.e. the branch that is rebased or merged. Copytracing can be
slow if there are too many commits in the source branch, so this flag helps
by limiting the number of commits.
- experimental.fastcopytrace.maxmovescandidatestocheck
This flag limits the number of heuristically found move candidates to check.
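The description above does not say what happens once a limit is reached. The
sketch below assumes the simplest reading, namely that only the first N move
candidates are examined; the actual implementation may behave differently.
The name `cap_candidates` and the sample data are hypothetical.

```python
from itertools import islice


def cap_candidates(candidates, maxcandidates):
    """Return at most maxcandidates items from an iterable of candidates.

    Assumption: candidates beyond the limit are simply not checked.
    """
    return list(islice(candidates, maxcandidates))


# Made-up candidate list, for illustration only.
found = ["a/foo.c", "b/foo.c", "c/foo.c", "d/foo.c"]
print(cap_candidates(found, 2))
```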
The extension also supports fast copytracing during amends, which will be moved
in follow-up patches.
This is also used elsewhere. I think it's worth defining in the localrepository class.
cc @stash Maybe it's more accurate to check paths.default first and fall back to origroot. If a user clones ssh://x/repo-x to repo-x-2, we probably want repo-x as the repo name. What do you think?