This is an archive of the discontinued Mercurial Phabricator instance.

lfs: autoload the extension when cloning from repo with lfs enabled
ClosedPublic

Authored by mharbison72 on Oct 17 2018, 12:35 AM.

Details

Summary

This is based on a patch by Gregory Szorc. I made small adjustments to
clean up the messaging when the server has the extension enabled, but the
client has it disabled (to prevent autoloading). Additionally, I added
a second server capability to distinguish between the server having the
extension enabled, and the server having LFS commits. This helps prevent
unnecessary requirement propagation- the client shouldn't add a requirement
that the server doesn't have, just because the server had the extension
loaded. The TODO I had about advertising a capability when the server can
natively serve up blobs isn't relevant anymore (we've had 2 releases that
support this), so I dropped it.

Currently, we lazily add the "lfs" requirement to a repo when we first
encounter LFS data. Due to a pretxnchangegroup hook that looks for LFS
data, this can happen at the end of clone.

Now that we have more control over how repositories are created, we can
do better.

This commit adds a repo creation option to add the "lfs" requirement.
hg.clone() sets this creation option if the remote peer is advertising
lfs usage (as opposed to just support needed to push).

So, what this change effectively does is have cloned repos
automatically inherit the "lfs" requirement.

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

mharbison72 created this revision.Oct 17 2018, 12:35 AM
This revision was automatically updated to reflect the committed changes.

Thank you for finishing what I started and implementing this properly (by adding a separate server-side capability to distinguish between LFS states).

As a follow-up, it would be nice to have the LFS wire protocol capabilities documented in mercurial/help/internals/wireprotocol.txt, especially since it isn't obvious how the now 2 capabilities are different.

As a follow-up, it would be nice to have the LFS wire protocol capabilities documented in mercurial/help/internals/wireprotocol.txt, especially since it isn't obvious how the now 2 capabilities are different.

Will do. How hard do you think it will be to do the same thing for pull? Also, I need to see if this can be made to work with cloning from a local repo.

! In D5130#76736, @mharbison72 wrote:
Will do. How hard do you think it will be to do the same thing for pull?

I don't think it should be too difficult to implement. But I have possibly controversial opinions about the UI. e.g. I'm not a fan of mutating repo requirements after repo creation without the user's explicit consent. (But not being able to pull from a remote that recently added LFS is a bad experience as well.)

Also, I need to see if this can be made to work with cloning from a local repo.

I'm really not a fan of local / hardlink clones. They are very brittle. When you start throwing extra repo features like LFS into the mix, it becomes even more dangerous. Especially when we start talking about adding modifying .hg/requires automatically. The fundamental problem is the hardlink clone is strictly separate but it "inherits" some state from its parent. If we start varying behavior of the two repos, interaction between them can be wonky. In general, for advanced features, we should go through the "hg clone --pull" interface to create a clean break and minimize potential for wonkiness.

One thing I forgot to mention in the commit comment is that hg clone --rev REV, where REV is some commit prior to committing any lfs blobs, will cause the requirement to be added to the clone. That didn't happen previously with the transaction hook. But that's probably enough of a corner case to not really care.

! In D5130#76736, @mharbison72 wrote:
Also, I need to see if this can be made to work with cloning from a local repo.

I'm really not a fan of local / hardlink clones. They are very brittle. When you start throwing extra repo features like LFS into the mix, it becomes even more dangerous. Especially when we start talking about adding modifying .hg/requires automatically. The fundamental problem is the hardlink clone is strictly separate but it "inherits" some state from its parent. If we start varying behavior of the two repos, interaction between them can be wonky. In general, for advanced features, we should go through the "hg clone --pull" interface to create a clean break and minimize potential for wonkiness.

What's brittle about it, assuming clone implies exact copy? I'm vaguely aware of filesystem bugs around detecting hardlinks, but I thought that was basically solved by the whitelist of filesystems.

I'm fine with picking this up next cycle, but I was tinkering with trying to address @yuja's concerns in D4713, and it looks like this already works with a local clone. It looks like local clone doesn't call localrepo.createrepository(), rather it just calls localrepo.makerepository() twice for the source and destination, and reads the existing requirements file. So I don't see an obvious way to warn about autoloading largefiles when cloning, and not being obnoxious and warning when any command is run. (I do see the different behavior below with clone --pull.)

diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -456,6 +456,7 @@ def makelocalrepository(baseui, path, in
     else:
         extensions.loadall(ui)

+    print('makerepo for %s, requirements are %s' % (path, requirements))
     # Set of module names of extensions loaded for this repository.
     extensionmodulenames = {m.__name__ for n, m in extensions.extensions(ui)}

@@ -2809,6 +2810,7 @@ def undoname(fn):
 def instance(ui, path, create, intents=None, createopts=None):
     localpath = util.urllocalpath(path)
     if create:
+        print('*** create repo on %s' % localpath)
         createrepository(ui, localpath, createopts=createopts)

     return makelocalrepository(ui, localpath, intents=intents)
diff --git a/tests/test-largefiles-small-disk.t b/tests/test-largefiles-small-disk.t
--- a/tests/test-largefiles-small-disk.t
+++ b/tests/test-largefiles-small-disk.t
@@ -28,14 +28,20 @@ Test how largefiles abort in case the di
   > util.oslink = oslink
   > EOF

+  $ cp $HGRCPATH $HGRCPATH.orig
+
   $ echo "[extensions]" >> $HGRCPATH
   $ echo "largefiles =" >> $HGRCPATH

   $ hg init alice
+  *** create repo on alice
+  makerepo for alice, requirements are set(['dotencode', 'generaldelta', 'fncache', 'store', 'revlogv1'])
   $ cd alice
   $ echo "this is a very big file" > big
   $ hg add --large big
+  makerepo for $TESTTMP\alice, requirements are set(['dotencode', 'generaldelta', 'fncache', 'store', 'revlogv1'])
   $ hg commit --config extensions.criple=$TESTTMP/criple.py -m big
+  makerepo for $TESTTMP\alice, requirements are set(['dotencode', 'generaldelta', 'fncache', 'store', 'revlogv1'])
   abort: No space left on device
   [255]

@@ -52,12 +58,16 @@ The user cache is not even created:
 Make the commit with space on the device:

   $ hg commit -m big
+  makerepo for $TESTTMP\alice, requirements are set(['dotencode', 'generaldelta', 'fncache', 'store', 'revlogv1'])

 Now make a clone with a full disk, and make sure lfutil.link function
 makes copies instead of hardlinks:

   $ cd ..
   $ hg --config extensions.criple=$TESTTMP/criple.py clone --pull alice bob
+  makerepo for alice, requirements are set(['dotencode', 'largefiles', 'fncache', 'store', 'generaldelta', 'revlogv1'])
+  *** create repo on bob
+  makerepo for bob, requirements are set(['dotencode', 'generaldelta', 'fncache', 'store', 'revlogv1'])
   requesting all changes
   adding changesets
   adding manifests
@@ -73,3 +83,30 @@ The largefile is not created in .hg/larg

   $ ls bob/.hg/largefiles
   dirstate
+
+  $ cp -f $HGRCPATH.orig $HGRCPATH
+.  $ cat >> alice/.hg/hgrc <<EOF
+.  > [extensions]
+.  > largefiles =
+.  > EOF
+
+  $ hg clone alice charlie
+  makerepo for alice, requirements are set(['dotencode', 'largefiles', 'fncache', 'store', 'generaldelta', 'revlogv1'])
+  makerepo for charlie, requirements are set(['dotencode', 'largefiles', 'fncache', 'store', 'generaldelta', 'revlogv1'])
+  updating to branch default
+  getting changed largefiles
+  1 largefiles updated, 0 removed
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+  $ cd charlie
+  $ grep largefiles .hg/requires
+  largefiles
+  $ hg init nested
+  *** create repo on nested
+  makerepo for nested, requirements are set(['dotencode', 'generaldelta', 'fncache', 'store', 'revlogv1'])
+  $ cat nested/.hg/requires
+  dotencode
+  fncache
+  generaldelta
+  revlogv1
+  store