( )⚙ D680 scmutil: handle conflicting files and dirs in origbackuppath

This is an archive of the discontinued Mercurial Phabricator instance.

scmutil: handle conflicting files and dirs in origbackuppath
ClosedPublic

Authored by mbthomas on Sep 11 2017, 2:03 PM.

Details

Summary

When ui.origbackuppath is set, .orig files are stored outside of the working
copy. However conflicts can occur when files or directories end up having the
same name. These conflicts cause Mercurial to abort, even if they've been
created as a result of different backups.

Make sure we always replace files or directories in the origbackuppath if
they conflict with another file or directory.

Test Plan

Add new unit test for conflicting paths.

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

mbthomas created this revision.Sep 11 2017, 2:03 PM
ryanmce accepted this revision.Sep 11 2017, 2:21 PM

looks good

durham accepted this revision.Sep 12 2017, 12:01 PM
durham added a subscriber: durham.
durham added inline comments.
mercurial/scmutil.py
582

I tried to think of where this could go wrong or be abused. I think it's fine though.

Like if someone set ui.origbackuppath to empty string I think the wjoin above would result in paths inside the working copy, and rmtree would then delete parts of the working copy. So maybe the if origbackuppath is None: in line 559 should be if not origbackuppath:

yuja requested changes to this revision.Sep 13 2017, 10:54 AM
yuja added a subscriber: yuja.
yuja added inline comments.
mercurial/scmutil.py
574

Perhaps fullorigpath may be a Windows path, but util.finddirs()
can only handle internal path.

578

It shouldn't delete files out of the origbackuppath directory
even if the configured path points to (or under) a file.

Maybe we could use a new vfs dedicated to backup directory to
audit operations.

This revision now requires changes to proceed.Sep 13 2017, 10:54 AM
mbthomas updated this revision to Diff 1892.Sep 19 2017, 5:46 AM
mbthomas planned changes to this revision.Sep 19 2017, 5:47 AM
mbthomas updated this revision to Diff 1893.Sep 19 2017, 6:17 AM
mbthomas updated this revision to Diff 1894.
mbthomas marked 2 inline comments as done.Sep 19 2017, 6:21 AM
mbthomas added inline comments.
mercurial/scmutil.py
574

I've changed it to use util.normpath, which I think converts back to / as separator on windows (windows.normpath calls windows.pconvert), and is probably a good idea anyway. Is this sufficient?

yuja added inline comments.Sep 19 2017, 11:12 AM
mercurial/scmutil.py
574

That's probably okay now.

I'm not sure if it conflicts with the long paths plan. IIRC, forward
slash isn't allowed in \\?\ path.

Calling in windows path expert @ikostia

ikostia requested changes to this revision.Sep 19 2017, 11:36 AM

Please wait until the util.finddirs() is tweaked and remove normpath call.

mercurial/scmutil.py
574

I think we should adjust finddirs to split on os.path.sep, not hardcoded /, then we can get rid of util.normpath a couple of lines above.

I am long overdue with re-sending my series, I will add this change there as well.

This revision now requires changes to proceed.Sep 19 2017, 11:36 AM
mbthomas added inline comments.Sep 20 2017, 9:12 AM
mercurial/scmutil.py
574

At this point the path is a vfs-path within the repo, so it should be using / as a separator.

In the case of UNC paths, os.path.relpath seems to do the right thing, so os.path.relpath(r'\\?\c:\repo\dir\file', r'\\?\c:\repo') gives dir\file, which util.normpath converts to dir/file.

It's a bit of a shame that scmutil.origpath takes an absolute path rather than a vfs path within the repo, as it means we have to jump through these hoops. It doesn't look like we can easily change it, though.

quark added a subscriber: quark.Sep 27 2017, 6:31 PM

If it's just file and directory conflict, maybe we can encode the path so it becomes impossible to have such conflict. See _encodedir in store.py for an example. It requires a suffix so partially reverts D679 though.

mbthomas updated this revision to Diff 2208.Oct 1 2017, 5:32 AM

Just as an FYI: I revoke my previous comments, as stuff started working for non-prefixed paths and I no longer intend to land anything related to this.

I am concerned that this seems to rely even more on not having conflicts inside origbackuppath from files/dirs with same name but in different directories. If we have a backup file handling, it must be 100% reliable. That seems to be what this series is trying to fix - then it seems unfortunate it just moves the problem elsewhere.

mercurial/scmutil.py
561

Now, this seems like a separate trivial change, fixing the problem that origbackuppath couldn't be overridden.

With ui.origbackuppath set to some value, backup should be completely reliable. The file at $origbackuppath/path/to/file will be the most recent version of that file that was backed up. It will be overwritten/deleted if a backup of any file named $origbackuppath/path/to/file, $origbackuppath/path/to or $origbackuppath/path is subsequently made. This seems reasonable to me.

If ui.origbackuppath not set, then behaviour is still unreliable in the presence of files also named something.orig, but that is not changed behaviour. We discussed this at the sprint and decided that we didn't want to change the default behaviour of creating .orig files.

mercurial/scmutil.py
561

Happy to split this out, but this stack is already pretty tall.

mbthomas updated this revision to Diff 2350.Oct 2 2017, 5:14 PM
ryanmce accepted this revision.Oct 5 2017, 9:46 AM

It looks like all concerns have been addressed. This looks good to me.

mercurial/scmutil.py
567–570

In the future, we may want to move this to the repo class like other vfs instances. That will mean we don't have to re-instantiate this each time. However, these instantiations look cheap so I'm not worried and it's outside the scope of this change.

574

I agree that it seems out of scope to fix this in this patch series. Might be worth opening a task in bugzilla to track improving this, but I think this series is already fairly long and too important to hold up on a small thing like this. Since it works, I think we should move towards shipping it.

578

@mbthomas and I had talked about adding a new vfs; thanks for pushing us in that direction. Getting a dedicated vfs feels like a good win here.

This revision was automatically updated to reflect the committed changes.