Details
- Reviewers
- Alphare - marmoute 
- Group Reviewers
- hg-reviewers 
- Commits
- rHGb74ee41addee: sparse: lock the store when updating requirements config
Diff Detail
- Repository
- rHG Mercurial
- Branch
- stable
- Lint
- No Linters Available 
- Unit
- No Unit Test Coverage 
Event Timeline
Looks like this breaks another test: https://foss.heptapod.net/mercurial/mercurial-devel/-/jobs/269970
Wow, looks I just didn't run all tests here. Sorry!
Investigating this, I went pretty deep into a rabbit hole and may need rescuing.
The bug seems to go like this:
When cloning:
- hg.py:868: lock is obtained and stored into destlock variable
- a few lines later, the repo is actually created and the python object representing it is made by destpeer = peer(srcrepo, peeropts, dest) (which makes it unaware of the lock being held)
- hg.py:1038: _update is called on a repo that still doesn't know that it's locked; since _update is where sparse hooks in, updateconfig runs and deadlocks
Note that a few lines later (hg.py:1041-1047) the repo actually "learns" that it's locked (in a racy way).
Intuitively it seems good to "tell" the repo that it's locked from the very start, when calling peer or immediately after. But since this is the first time I see all of this code I can't make that call.
Another confusing thing is that during the clone we don't take wlock at all (other than in sparse), even though it seems reasonable to expect everything to be locked during the clone.
Taking wlock after the lock is supposed to be wrong (deadlock-prone), and yet here we are effectively doing it (clone takes the lock from the very start, and then sparse does it during clone).
Any advice would be appreciated.
- I agree that we should take the wlock in addition to the lock as we will be running an upgrade. (this should be a different commit, prior to the other one)
- the best way to get out of your trouble is to make the "new" peer (created after status) aware of the locking. Something like the logic below seems "suitable", Since this only happens in this specific cases, it seems like a bad idea to alter too much of the main peer API (we could still have a dedicated method on the repository for that). Same would have to be done for the wlock too
(I did not run this code, and it is not intended to be the final state.)
diff --git a/mercurial/hg.py b/mercurial/hg.py --- a/mercurial/hg.py +++ b/mercurial/hg.py @@ -873,6 +873,13 @@ def clone( # we need to re-init the repo after manually copying the data # into it destpeer = peer(srcrepo, peeropts, dest) + + # make the peer aware that is it already locked + # + # important: + # + # We still need to release that lock at the end of the function + destpeer.local()._lockref = weakref.ref(destlock) srcrepo.hook( b'outgoing', source=b'clone', node=srcrepo.nodeconstants.nullhex )
I tried this, and ran into another problem: dirstate is linked to wlock, so if you share a wlock then you must also share the dirstate.
I wrote a patch that does this, but I can't tell how safe that is. All tests pass, at least.
@marmoute, please have a look at the patch.
I see, the wlock's releasefn refer to the original dirstate.
I wrote a patch that does this, but I can't tell how safe that is. All tests pass, at least.
That seems "fine", it test are happy, lets try this route.
@marmoute, please have a look at the patch.
We all came to the same conclusion then. This is a hack, but it's very localized and clearly documented.