This is an archive of the discontinued Mercurial Phabricator instance.

sparse: override dirstate.walk() instead of dirstate._ignore
ClosedPublic

Authored by martinvonz on Jul 11 2017, 8:01 PM.

Details

Summary

Instead of treating files that are outside the sparse config as
ignored, this makes it so we list only those that are within the
sparse config by passing the sparse matcher to dirstate.walk().

Once we add support for narrow (sparseness applied to history, not
just working copy), we will need to do a similar restriction of the
walk over manifests, so this will be more consistent then. It also
simplifies the code a bit.

Note that a side-effect of this change is that files outside the
sparse config used to be listed as ignored, but they will now not be
listed at all. This can be seen in the test case where "hg purge" no
longer has any effect because it doesn't see that the files outside
the space config exist. To fix that, I think we should add an option
to dirstate.walk() to walk outside the sparse config. We might expose
that to the user as --no-sparse flag to e.g. "hg status" and "hg
purge", but that's work for another day.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

martinvonz created this revision.Jul 11 2017, 8:01 PM
dsp added a subscriber: dsp.Jul 12 2017, 11:06 AM

-1 on this.

I feel this can lead to confusing behavior for users and require too much understanding of the underlying behaviors. See the following example

  1. create a sparse profile
[include]
foo/
[exclude]
  1. create directory bar/ and file bar/test
  2. hg status returns an empty set.

My understanding from basic testing is that in the old behavior, bar/test would be returned as 'Ignored'.

I think the overall idea is good, but we might want to sort out a better user behavior (maybe introducing a new state?) before we are changing behavior.

In D59#733, @dsp wrote:

-1 on this.
I feel this can lead to confusing behavior for users and require too much understanding of the underlying behaviors. See the following example

  1. create a sparse profile
[include]
foo/
[exclude]
  1. create directory bar/ and file bar/test
  2. hg status returns an empty set.

That's true before and after this patch.

My understanding from basic testing is that in the old behavior, bar/test would be returned as 'Ignored'.

Ignored files are not included in the output of "hg status" by default. You need to pass -i or -A to see them. I think it's very rare to use those (I'm not sure I ever have outside of testing).

I think the overall idea is good, but we might want to sort out a better user behavior (maybe introducing a new state?) before we are changing behavior.

As I said in the commit message, I think presenting these files as ignored is wrong. It seems better to treat them as untracked, so they always show up. I agree that treating them as neither (not listing them at all) seems even worse, but it's a pretty small regression IMO. Adding support for another type of status (the one for files outside the sparse checkout) is a much bigger task that I don't feel like I have time for right now.

durham added a subscriber: durham.EditedJul 12 2017, 12:18 PM

My only concern would be if tools ever use hg status -A to get the full state of the working copy (like what files exist). But I think that's pretty minor.

durham added a reviewer: Restricted Project.Jul 12 2017, 12:18 PM
durham added inline comments.Jul 12 2017, 12:25 PM
hgext/sparse.py
224

Do we know if the ignore wrapper was ever accessed outside of the walk? It might be worth hacking in a temporary message that prints if the ignorewrapper is called while walk is not in the traceback, then run the tests to see if it's ever hit.

martinvonz added inline comments.Jul 12 2017, 1:24 PM
hgext/sparse.py
224

Good idea to check that. Turns out that _ignore is accessed not only from walk(). Here's where else:

  1. dirstate.status()

This is to make decide whether to report a file not in the dirstate as ignored or untracked. Since the input is the list of walked files, it won't make a difference whether files outside sparse config are in the ignore matcher or not.

  1. merge._checkunknownfiles()

If I'm reading this right, the difference here is just whether merge.checkignored or merge.checkunknown config options will be used for the affected files. I'd say this is probably a (tiny) change for the better: if I have turned on merge.checkunknown but not merge.checkignored, it might be because I have tools that generate the ignored files and I can regenerate them, but unknown files can not be regenerated. I'm not sure that's why one normally would set those values differently, though.

  1. hgignore() fileset

Not much to say about this. Will obviously change as expected.

  1. debugignore command

Same as 3.

durham accepted this revision.Jul 14 2017, 12:26 PM
This revision is now accepted and ready to land.Jul 14 2017, 12:26 PM
This revision was automatically updated to reflect the committed changes.