This is an archive of the discontinued Mercurial Phabricator instance.

copies: introduce a basic Rust function for `combine_changeset_copies`

Authored by marmoute on Nov 12 2020, 10:16 AM.



This new function mirror the python code. This first implementation does a lot
of data copies and is therefore quite slow. However my goal here is to create a
simple "frame" from where to start adding optimization.

This patch focus on the hg-core part of this work. Coming patches will do the
necessary hg-cpython work to be able to use this from Python.

Diff Detail

rHG Mercurial
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

marmoute created this revision.Nov 12 2020, 10:16 AM
Alphare accepted this revision.Nov 13 2020, 6:03 AM
Alphare added a subscriber: Alphare.

Mostly nits, but I'd like not to break clippy when we can help it.

The logic seems sound, which is not surprising since this is a re-write of Python.


Nit: for loops already desugar to calling into_iter(). This applies to multiple other places in the file


Simply use let (parent, child_copies) = if ...


let (minor, major) = if ...


You should add #[allow(clippy::if_same_then_else)] to this function because clippy is not happy with your explicit identical if/else cases.


let overwrite = if

The rest of the feedback have been applied. update to the diff to come.


The logic here is enough of a headache, that I really prefer to be explicit here, with each close to be fairly easy to understand on its own.

marmoute updated this revision to Diff 23492.Nov 13 2020, 7:05 PM
Alphare accepted this revision.Nov 16 2020, 6:10 AM
Alphare added inline comments.

Sure, sounds good.

SimonSapin added inline comments.

Could be worth documenting which of the key or value are source v.s. destination. At first I assumed the opposite of what the code seems to be doing.

Speaking of, can multiple keys have the same value? If so, what does that represent?


What does None represent here?


unreachable! can be avoided by using a custom enum (which can be defined inline in the function) instead of plain integers, or using a bool and renaming parent to something like is_parent_1


to_owned here copies a TimeStampedPathCopies value, but the original is not needed anymore so remove could be used instead of get.

marmoute added inline comments.Nov 25 2020, 9:06 AM

This use the same direction than the expect return (and the python implementation). But you are right, this could use a comment.

Multiple key can have a same value. It means that the file have been copied to multiple other file at the same time. The common use case for that is when a file is split in multiple others.


A file deletion. (I assume this is a hint that a comment could help).


I agrre with you so much that I am introducing an enum doing this further down the road :-)


Good point.

marmoute updated this revision to Diff 23714.Nov 27 2020, 7:30 AM
pulkit accepted this revision.Nov 28 2020, 4:39 AM
This revision is now accepted and ready to land.Nov 28 2020, 4:39 AM
marmoute updated this revision to Diff 23762.Nov 28 2020, 5:26 AM