The earlier behaviour of grep was quite counter intuitive as it searches
history by default. It throws up matches from files which have been
deleted and commited out. This patch handles only the default behaviour
by iterating on the files being tracked by the workingctx are
not marked for removal. The node is set by useing scmutil.binnode(ctx).
All the tests of hg grep are updated to reflected this change.
Additional tests have been added to show the new behaviour.
Details
- Reviewers
sangeet259 - Group Reviewers
hg-reviewers
Diff Detail
- Repository
- rHG Mercurial
- Lint
Lint Skipped - Unit
Unit Tests Skipped
Event Timeline
No need to add Edit1, Edit2 in commit message. If something changes from one version to another that deserves mention, include that in commit message as normal.
mercurial/commands.py | ||
---|---|---|
2583 | Nit: no need to add bool() calls. | |
2584 | The above if condition wants opts['rev'] to be empty and here you are using it's value. | |
tests/test-grep.t | ||
341–366 | For better testing, please add untracked files which can be probable match. Also I suspect there will be existing tests for hg grep which should break because of new behavior. | |
355 | what's this None here? |
Perhaps we can start with adding an experimental option to grep files
including unchanged ones?
IIUC, the new default behavior is something like hg grep -r "wdir()" --all-files,
which is basically s/ctx.files()/ctx/.
(needless to say --all-files is a bad name, and the following patch is just a hack.)
diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py --- a/mercurial/cmdutil.py +++ b/mercurial/cmdutil.py @@ -1835,7 +1835,7 @@ def walkchangerevs(repo, match, opts, pr if not revs: return [] wanted = set() - slowpath = match.anypats() or (not match.always() and opts.get('removed')) + slowpath = True fncache = {} change = repo.changectx @@ -1889,7 +1889,7 @@ def walkchangerevs(repo, match, opts, pr else: self.revs.discard(value) ctx = change(value) - matches = [f for f in ctx.files() if match(f)] + matches = [f for f in ctx if match(f)] if matches: fncache[value] = matches self.set.add(value) @@ -1939,7 +1939,7 @@ def walkchangerevs(repo, match, opts, pr ctx = change(rev) if not fns: def fns_generator(): - for f in ctx.files(): + for f in ctx: if match(f): yield f fns = fns_generator()
mercurial/commands.py | ||
---|---|---|
2485 | This line looks identical to the one later on, with Pulkit's comment about usage of bool(). | |
2487–2490 | This block is not indented enough, but see my other comment below. | |
2494 | Since the difference in both branches for this if-else block seems to be only in this line, and just one value, it probably makes sense to store said value in a variable and use it in place of False/True and reduce duplication. |
@yuja Can you please clarify this a bit more?
Perhaps we can start with adding an experimental option to grep files
including unchanged ones?"
mercurial/commands.py | ||
---|---|---|
2474 | Sure, pushing the first approach now. Currently the filenode of workingctx returns this: return self._fileinfo(path)[0] . Shall I change that to returning wdirid? Also can you please highlight the slowpath part? Any reference or links to mailing lists, where I can learn more about it. | |
2475 | Done. Any reason why that is better? | |
2494 | Great. Thanks ! | |
2584 | I could have made the call by simply passing '' instead of opts.get('rev') but I chose this because I am planning to build upon this to handle revisions as well. | |
tests/test-grep.t | ||
355 | The None was a result of displaying revision in the ui.write. Removed that. I had forgotten to update this result. |
This patch appears to do 3 separate things by adding a big "if" branch.
- fix grep -r 'wdir()'
- add mode to include unmodified files
- change the default
It's probably better to write one (or more) patches for each change, in a way
not duplicating code for each combination of possible options.
mercurial/commands.py | ||
---|---|---|
2474 |
That isn't easy to answer because wctx.filenode() can return
It's quite old, so I have no concrete idea. But maybe reusing filelog objects is the key. | |
2475 | Extensions may override it to provide a faster implementation. |
What exactly do you mean by fixing 'grep -r wdir()' ?
Make it return the same result as
$ hg ci -m 'this was wdir()' $ hg grep -r .
(except the revision number/hash)
@yuja
So I was working on it, but the way I tried to correct "wdir()" is by handling it as a special case, so introducing an "if" block again similar to this.
But you were asking to eliminate the if/else block altogether, I wonder if there is way ?
So I was working on it, but the way I tried to correct "wdir()" is by handling it as a special case, so introducing an "if" block again similar to this. But you were asking to eliminate the if/else block altogether, I wonder if there is way ?
IIRC, my point was to avoid adding *big* if/else branch which will be likely
to introduce untested code paths.
I don't think there would be a way to go without if/else or try/except as
long as we use the low-level filelog API for performance reasons. Perhaps,
we'll have to catch WdirUnsupported error to fall back to the filectx API.
try: flog.read(fnode) except error.WdirUnsupported: ctx[fn].data()
Perhaps, we'll have to catch WdirUnsupported error to fall back to the filectx API.
This seems a good idea.
Trying it.
Hey @yuja
Can you explain why you did this ?
- slowpath = match.anypats() or (not match.always() and opts.get('removed'))
+ slowpath = True
Hey @yuja Can you explain why you did this ? > - slowpath = match.anypats() or (not match.always() and opts.get('removed')) + slowpath = True
From my vague memory, I guess I would think that if we do s/ctx.files()/ctx/
(which means we're looking for revisions where files exist, not files change),
the use of walkfilerevs() is not valid since filelogs only contain "changes."
To be clear, I'm not saying that will be the desired behavior of hg grep of
"--all-files" of multiple revisions. It might be better to skip intermediate
revisions where files are unchanged.
@yuja I am adding an --unmodified flag to change to change the mode to grep on the unmodified files as well.
What I am trying to do is ::
def fns_generator(): if --unmodified: for f in ctx: for f in ctx: if match(f): yield f else : for f in ctx.files(): for f in ctx: if match(f): yield f
Currently, this would work for a single revision, so I am modifying the prep function in grep.
Shall I continue and work on sending the required patch ?
mercurial/commands.py | ||
---|---|---|
2581–2591 | nit: can you change 'if part' to 'if-part' or '"if" part' to make it easier to parse? Also drop the trailing backslash. Actually, the comment doesn't seem to add any useful information, so I'd suggest either removing it or changing it to something like "When neither -r nor --all is passed, search the working copy" (if that's accurate) | |
2583 | drop one of the spaces before "or" | |
2584 | I agree with Pulkit: it's clearer to change this to ctx = repo[None] m = scmutil.match(ctx, pats, opts, default='relglob') for now and start calling revsingle() in a later patch | |
2591 | can you drop "rev is None" for now (since it's always true)? | |
2591 | It's surprising to me that ctx.matches() can include files that are not tracked. I'll see if I can send a patch to change that. (No action expected from you) | |
tests/test-grep.t | ||
293–294 | Perhaps it's better to preserve the old behavior here (by adding --all, I think)? I'm not sure what the reason for this call is, though, so perhaps it could instead be removed? |
@martinvonz, thanks for the reviews, but @yuja has asked me to break this patch into three patches. So I guess this is not going to be merged anyway.
Currently, I am trying to add a flag like --all_files or --unmodified which will enable one to grep all the files that were tracked by that revision and not just the changed files.
@yuja I am adding an --unmodified flag to change to change the mode to grep on the unmodified files as well. What I am trying to do is :: def fns_generator(): if --unmodified: for f in ctx: for f in ctx: if match(f): yield f else : for f in ctx.files(): for f in ctx: if match(f): yield f
Nit: if can be narrowed to ctx/ctx.files().
if --unmodified: fiter = iter(ctx) else: fiter = ctx.files()
Currently, this would work for a single revision, so I am modifying the `prep` function in grep. Shall I continue and send the patch ?
Seems fine. It should be generally encouraged to send patches than discussing
for long without a working code.
but @yuja has asked me to break this patch into three patches. So I guess
this is not going to be merged anyway.
To be clear, I suggested to split patches because the original code seemed
to add a single knob to switch two more features at once.
Better to test if ctx is a workingctx (i.e. ctx.rev() is None).
Another idea is to make wctx.filenode() return wdirid and catch
WdirUnsupported exception
to fall back to the slow path.