This is an archive of the discontinued Mercurial Phabricator instance.

share: introduce config option to store requires in .hg/store
ClosedPublic

Authored by pulkit on Jun 15 2020, 11:03 AM.

Details

Summary

This introduces a config option which enabled stores the requirements on a
repository in store instead.

When enabled, .hg/requires will contain the share-safe requirement which
marks that the requirements are present in the store.
This is done so that repository requirements can be shared with shares made
using hg share command.

After this patch, hg share checks whether the source repository has
share-safe requirement, if yes, it does not copy the requirements.

Test for the new functionality is added and a test case in exitsing share tests
is also added.

Diff Detail

Repository
rHG Mercurial
Branch
default
Lint
No Linters Available
Unit
No Unit Test Coverage

Event Timeline

pulkit created this revision.Jun 15 2020, 11:03 AM
pulkit edited the summary of this revision. (Show Details)Jun 25 2020, 4:52 AM
pulkit updated this revision to Diff 21702.
pulkit updated this revision to Diff 21740.Jul 1 2020, 6:00 AM
marmoute requested changes to this revision.Jul 15 2020, 1:58 PM
marmoute added a subscriber: marmoute.
marmoute added inline comments.
mercurial/localrepo.py
455

Until the semantic is fully implemented, this should be exp-sharesafe otherwise a non complying client could accessing without respecting the semantic.

537–554

This looks like a lot of duplicated code, am I missing something ? Could we do less duplication?

3797–3810

It looks like share-safe should requires store, can we make them dependent of each other.

This revision now requires changes to proceed.Jul 15 2020, 1:58 PM
pulkit marked 2 inline comments as done.Jul 17 2020, 10:24 AM
pulkit added inline comments.
mercurial/localrepo.py
455

Done.

537–554

Refactored in next patch.

3797–3810

Does not share extension implies that store is present?

marmoute added inline comments.Jul 20 2020, 1:05 PM
mercurial/localrepo.py
537–554

I see this is done in D8655 thanks !

3797–3810

Maybe, can you check ?

In all cases, I would rather make the safe-share requirement consistent from the start. Can we have some code either setup store automatically, or reject with some error (whichever solution you prefer).

pulkit updated this revision to Diff 22068.Jul 23 2020, 10:52 AM
pulkit updated this revision to Diff 22171.Aug 1 2020, 10:36 AM

I need to think about the implications of this change a bit more: the requires file plays an important part in how clients open repositories and there are significant backwards and forwards compatibility concerns.

One immediate request is to document semantics in mercurial/helptext/internals/requirements.txt as part of this change.

FWIW I thought I had typed up an internals document on how repositories are opened. But maybe I never submitted it for review or maybe it never landed.

I left a handful of comments.

Changing repo opening semantics and requirements handling is a bit scary.

I'm generally in favor of separating working directory requirements from non-working directory requirements so I think this patch is a step in the right direction. There are a few alternative implementations worth considering. (Although I'm unsure if they are superior.)

  1. Write the new requirements file in the .hg directory as .hg/store-requires (or similar). Conceptually, we're talking about working directory and backend requirements. The "store" is just today's layout for storing backend data [which can be shared between working directories].
  2. Write a new requirements file for working directory specific requirements.
mercurial/localrepo.py
545

I'm reasonably certain this logic is duplicated from somewhere. And opening shared repos is quirky enough that I'd feel better if there was a helper function for resolving the store vfs given a vfs for a .hg directory.

553

The swalloing of ENOENT doesn't sit well with me.

The repo opening semantics are such that .hg/requires is read first and all subsequent repo opening is dictated by what is found inside that file.

I think that a .hg/store/requires file presence should be directly indicated via an entry in .hg/requires - say store-requires - and if that entry is present, the file must be present: a missing file would be an error.

So I think the request here would be to rename the requirement to [exp-]store-requires instead of [exp-]share-safe. If we wanted to add additional functionality governing how shared repos are opened, we should consider a separate requirement for that, independent of the one that indicates if a .hg/store/requires exists.

Put another way, we want repo opening semantics to be strictly defined with little room for interpretation. And we want requires entries to be narrowly scoped so there isn't ambiguity about what they mean.

3672

It's really weird that this requirement has share in its name but it isn't dependent on being a shared repo.

3808

This whole block needs more comments because what it is trying to accomplish is important and nuanced.

I would also consider rewriting the code so that we populate variables indicating the set of requirements to populate in .hg/requires and .hg/store/requires. That way we have at most 2 calls to writerequires(), which I think will improve readability.

3808

What this is doing is blindly moving all requirements except SHARESAFE_REQUIREMENT to .hg/store/requires. The movement of most requirements there makes sense because they govern the "store." However, there are requirements like exp-sparse which I believe only concern the working directory and should not be in the "store."

If we want to go the approach of moving store-specific requirements into .hg/store/requires, I think we need to annotate whether requirements belong to the store or the non-store and write/move requirements accordingly.

mercurial/scmutil.py
1478

The presence of this conditional here feels a bit weird to me: I feel like it might be better if we explicitly passed in 2 sets of requirements to read and this function were "dumber."

1479

Writing of the requirements file should be extremely rare. I don't think we need this lock here.

mercurial/store.py
378

Bonus points if you refactor this to be a proper data structure as part of this series. A space delimited list/set makes no sense to me.

450

Wait - why was requires here before?

I left a handful of comments.
Changing repo opening semantics and requirements handling is a bit scary.
I'm generally in favor of separating working directory requirements from non-working directory requirements so I think this patch is a step in the right direction. There are a few alternative implementations worth considering. (Although I'm unsure if they are superior.)

  1. Write the new requirements file in the .hg directory as .hg/store-requires (or similar). Conceptually, we're talking about working directory and backend requirements. The "store" is just today's layout for storing backend data [which can be shared between working directories].
  2. Write a new requirements file for working directory specific requirements.

I would advocate for having a file for working-directory specific requirements. It is safer to assume all requirements are store specific unless explicitly specified otherwise.

marmoute added inline comments.Aug 7 2020, 4:40 AM
mercurial/localrepo.py
545

I had the same comment, and it turns out @pulkit is cleaning the duplication in later patches.

553

The swalloing of ENOENT doesn't sit well with me.

The repo opening semantics are such that .hg/requires is read first and all subsequent repo opening is dictated by what is found inside that file.

I think that a .hg/store/requires file presence should be directly indicated via an entry in .hg/requires - say store-requires - and if that entry is present, the file must be present: a missing file would be an error.

+1 for having some explicit error. If we go the route of using a wc specific files, we should be strict on having it too.

I don't have a strong opinion on having the requires file in the .hg/store/ directory yet. It seems cleaner/clearer to do so.

553

So I think the request here would be to rename the requirement to [exp-]store-requires instead of [exp-]share-safe. If we wanted to add additional functionality governing how shared repos are opened, we should consider a separate requirement for that, independent of the one that indicates if a .hg/store/requires exists.

Having a galaxy on interdependent requirements can become a nightmare, because we either have to start dealing with all the corner case of some being enabled without the other or to add extra logic to make sure they are not being used without the other (and error out on bad configuration). So having simple requirement adding a consistent improvement seems like a better idea.

The share-safe requirement comes with other semantic beside the store one, for example regarding the reading of the configuration. So store-requires would be too narrow.

3672

The requirements is named "share-safe" because after this change, the repository is safe to be shared. A repository do not need to be shared to "share-safe" the same way a watch does not need to be in the submerged to be water-proof.

3808

Defaulting to "requirements is a store requirement" and explicitly flagging requirement that are working copy specific seems the safest.

mercurial/scmutil.py
1479

I agreed that the requirements should be written at repository creation time (or upgrade time) and that is all. Maybe we should update the acquiring of the lock with some code checking we are actually holding a lock while doing this?

The upgrade usecase is a good argument in favor of moving the requirement file inside the store directory.

mercurial/store.py
378

I can add an extra bonus point to the stake :-)

pulkit marked 3 inline comments as done.Aug 7 2020, 8:59 AM
pulkit planned changes to this revision.

I left a handful of comments.
Changing repo opening semantics and requirements handling is a bit scary.
I'm generally in favor of separating working directory requirements from non-working directory requirements so I think this patch is a step in the right direction. There are a few alternative implementations worth considering. (Although I'm unsure if they are superior.)

  1. Write the new requirements file in the .hg directory as .hg/store-requires (or similar). Conceptually, we're talking about working directory and backend requirements. The "store" is just today's layout for storing backend data [which can be shared between working directories].
  2. Write a new requirements file for working directory specific requirements.

Hi, thanks a lot for the detailed review. I think we can have .hg/requires as working directory specific and .hg/store/requires as store requirements.

I have not yet implemented all the fixes you mentioned. Hence marking this as planned changes. However I have added some parents patches which should be good to review.

pulkit updated this revision to Diff 22381.Aug 10 2020, 11:26 AM
pulkit updated this revision to Diff 22397.Aug 11 2020, 5:06 AM
pulkit marked 2 inline comments as done.Aug 11 2020, 5:25 AM

I left a handful of comments.
Changing repo opening semantics and requirements handling is a bit scary.
I'm generally in favor of separating working directory requirements from non-working directory requirements so I think this patch is a step in the right direction. There are a few alternative implementations worth considering. (Although I'm unsure if they are superior.)

  1. Write the new requirements file in the .hg directory as .hg/store-requires (or similar). Conceptually, we're talking about working directory and backend requirements. The "store" is just today's layout for storing backend data [which can be shared between working directories].
  2. Write a new requirements file for working directory specific requirements.

Hi, thanks a lot for the detailed review. I think we can have .hg/requires as working directory specific and .hg/store/requires as store requirements.
I have not yet implemented all the fixes you mentioned. Hence marking this as planned changes. However I have added some parents patches which should be good to review.

@indygreg I think I have taken care of all fixes you suggested except one (the finegrained store-requires one). This patch should be good for an another look now.

mercurial/localrepo.py
553

Fixed the swallowing of ENOENT.

About a fine grained store-requires requirement, I am not sure what's the best way out.

Right now exp-share-safe implies two things, requirements in store and source .hgrc be shared. We definitely can have dedicated store-requires but that introduces a spider of inter-dependent requirements.

Does exp-share-safe implies store-requires or need it? etc.

For now, I have not changed anything on this front in this patch.

3672

What @marmoute said, the requirement is named share-safe because after this change, the repository is safe to be shared.

3808

The whole code block is now much simpler, thanks to filterrequirements introduced in previous patches.

mercurial/scmutil.py
1478

This code is much cleaner now after introducing filterrequirements in previous patches.

mercurial/store.py
450

A line of code dating back to 2008. Removing it does not make any tests fail. Will send a patch for it separately.

pulkit marked 2 inline comments as done.Aug 11 2020, 5:27 AM

I need to think about the implications of this change a bit more: the requires file plays an important part in how clients open repositories and there are significant backwards and forwards compatibility concerns.
One immediate request is to document semantics in mercurial/helptext/internals/requirements.txt as part of this change.

Since this patch does not implement all the things which are covered by this requirement, I appended a patch (D8914) at end of series for this.

FWIW I thought I had typed up an internals document on how repositories are opened. But maybe I never submitted it for review or maybe it never landed.

pulkit updated this revision to Diff 22444.Aug 25 2020, 7:39 AM
indygreg accepted this revision.Aug 25 2020, 9:33 PM

I'm approving this as an experimental quality feature. I think the technical direction is sound and the splitting of requirements corrects some long-standing soundness issues and will lead to a better end-user experience for shared repositories.

Before the experimental label is ripped off, I think we should consider:

  • Whether store requirement belongs in .hg/requires or .hg/store/requires. It's weird for it to be in .hg/store since that requirement essentially says files are in store/.
  • The name of the requirement having share in it still doesn't sit well with me. The thing the requirement is saying is that additional requirements exist in store/requires. A side-effect of that is that repository store sharing is more robust. I'd rather the .hg/requires file contain storerequires. If .hg/requires is using a shared store, it can have sharedstorerequires to indicate that special variant. But we could also use storerequires in this case as well: it all depends how specific we want to be with requires entries! Another way of looking at this is that it is weird an unshared repository has a requirement with share in it. That's a good indicator the naming is off!
mercurial/localrepo.py
535

As written, this code will accept the share safe requirement without the store requirement. The store requirement should be mandatory and the share safe requirement without store is nonsensical.

It would be nice if there were code somewhere that rejected this invalid combination of requirements. But the chances of this happening should be small and it is probably OK to live without it in the repo opening code. However, we should look for it when writing the requirement, since it would be possible to coerce config options to produce such a repository.

540

I'm not sure cacheaudited provides any benefit for a single file read. But it should be harmless since we throw away the vfs instance after a single use.

3685

I'm on the fence about whether we should drop or abort here. Either way is probably fine.

3805

sharedrepo wants to be converted to a constant.

I was going to push this but there's a test failure:

--- /home/gps/src/hg-committed/tests/test-remotefilelog-share.t
+++ /home/gps/src/hg-committed/tests/test-remotefilelog-share.t#safe.err
@@ -28,6 +28,7 @@


   $ hgcloneshallow ssh://user@dummy/master source --noupdate -q
+  devel-warn: write with no lock: "requires" at: /home/gps/src/hg-committed/mercurial/scmutil.py:1505 (writerequires)
   $ hg share source dest
   updating working directory
   1 files updated, 0 files merged, 0 files removed, 0 files unresolved
marmoute accepted this revision.Aug 26 2020, 3:16 AM

I'm approving this as an experimental quality feature. I think the technical direction is sound and the splitting of requirements corrects some long-standing soundness issues and will lead to a better end-user experience for shared repositories.
Before the experimental label is ripped off, I think we should consider:

  • Whether store requirement belongs in .hg/requires or .hg/store/requires. It's weird for it to be in .hg/store since that requirement essentially says files are in store/.

the store requirements is clearly something that needs to be shared between the source repo and the potential shares, so this argue for having it in .hg/store/requires, however as you point out if we are reading .hg/store/requirements we obviously already aware that we have a store. Maybe sharesafe should imply store? Or maybe that special case is not special enough and we can live with the funny inconsistency?

  • The name of the requirement having share in it still doesn't sit well with me. The thing the requirement is saying is that additional requirements exist in store/requires.

The requirements also change the configuration reading logic so that the configuration from the source repository is also read by the shares. So the sharesafe requirement is really about fixing various issue that were making repository unsafe to share. And use multiple behavior/layout change to achieve this.

Another way of looking at this is that it is weird an unshared repository has a requirement with share in it. That's a good indicator the naming is off!

I really don't understand your reasoning here. as I pointed before they are countless example of such naming in real life. For example, waterproof item does not imply they are (or will) submerged, fireproof item does not imply they are (or will) be on fire and military grade hardware does not requires you to be in the military to use them. So having a repository to be "share safe" does not requires it to be shared to me. Can you elaborate on why you find that naming suspicious ?

This revision is now accepted and ready to land.Aug 26 2020, 3:16 AM
pulkit marked 2 inline comments as done.Aug 26 2020, 4:44 AM
pulkit added inline comments.
mercurial/localrepo.py
535

The SHARESAFE_REQUIREMENT implies that store is present. There are two ways SHARESAFE_REQUIREMENT can be introduced:

  1. Creating new repos with config option enabled
  2. Using upgrade

For 1, we check that store is present. Refer checkrequirementscompat() a bit down in this file.

2 is not yet implemented and we will add the check there too when implemented.

So until the user is not manually editing the requires file, we can be sure that if SHARESAFE_REQUIREMENT is present, store will be present.

However, we should look for it when writing the requirement, since it would be possible to
coerce config options to produce such a repository.

Definitely, this patch has a test case for that in test-share.t.

540

Seems like copy-paste leftover, removed it.

3685

I am bit more inclined on aborting here but realized that this whole needs a separate discussion as unfortunately we didn't have such handling yet.

3805

This is bit messy here. sharedrepo here is createopts key rather than a requirement name.

pulkit marked 2 inline comments as done.Aug 26 2020, 6:05 AM
pulkit updated this revision to Diff 22450.

I was going to push this but there's a test failure:

--- /home/gps/src/hg-committed/tests/test-remotefilelog-share.t
+++ /home/gps/src/hg-committed/tests/test-remotefilelog-share.t#safe.err
@@ -28,6 +28,7 @@
   $ hgcloneshallow ssh://user@dummy/master source --noupdate -q
+  devel-warn: write with no lock: "requires" at: /home/gps/src/hg-committed/mercurial/scmutil.py:1505 (writerequires)
   $ hg share source dest
   updating working directory
   1 files updated, 0 files merged, 0 files removed, 0 files unresolved

Oops, turns out shallow-cloning in remotefilelog does not acquire a lock. I added D8952 in start of series to acquire lock while writing requires in shallow-cloning in remotefilelog.

pulkit updated this revision to Diff 22654.Sep 16 2020, 7:56 AM
indygreg accepted this revision.Sep 17 2020, 9:13 PM