This is an archive of the discontinued Mercurial Phabricator instance.

grep: add MULTIREV support to --allfiles flag
ClosedPublic

Authored by sangeet259 on Jul 25 2018, 3:44 AM.

Details

Summary

This patch facilitates passing multiple revisions with all-files flag.
It's assumed that if you are passing multiple revisions to --allfiles,
you want hits from all of them.

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

sangeet259 created this revision.Jul 25 2018, 3:44 AM
sangeet259 edited the summary of this revision. (Show Details)Jul 25 2018, 7:56 AM
sangeet259 updated this revision to Diff 9663.
sangeet259 updated this revision to Diff 9664.Jul 25 2018, 8:10 AM

@yuja can you review this one

yuja added a comment.Aug 3 2018, 10:17 AM

+ # the files dictionary stores all the files that have been looked at
+ # in the allfiles mode
+ files ={}

I don't think we should omit files that were seen in earlier revisions,
because --all-files is supposed to scan files of any states. I also expect
hg grep --all-files -r0+1 foo will show matches from both rev 0 and 1.

What do you think?

+ for f in ctx:
+ if match(f):
+ if f not in files:
+ files[f] = ctx

Keeping all ctx objects might use too much memory.

I don't think we should omit files that were seen in earlier revisions

If I don't skip that would mean the same file in the same state being searched across all the revisions, and getting repetitive and redundant hits.

One way I think to circumvent this is by using something like : if f not in files or f in ctx.files() and then checking if the new change corresponds to any match. But then, this makes it more similar to --diff.

Keeping all ctx objects might use too much memory.

True I will change it to files[f] = True

I also expect hg grep --all-files -r0+1 foo will show matches from both rev 0 and 1.

Suppose there are ten hits in 0 and the same 10 hits in 1, do you mean we print out all the 20 results, What purpose that would serve?

yuja added a comment.Aug 4 2018, 3:24 AM
>   I also expect hg grep --all-files -r0+1 foo will show matches from both rev 0 and 1.
Suppose there are ten hits in 0 and the same 10 hits in 1, do you mean we print out all the 20 results, What purpose that would serve?

Yes. I think that's the least surprising behavior for --all-files, which is
the option to ignore file status. Say a file is modified at both rev 0 and 1,
which revisions should be grepped?

a. only rev 0
b. rev 0 and rev 1 (because a file is changed at rev 1)
c. rev 0 and rev 1 (no matter if a file is changed or not)
d. --all-files for rev 0, and --diff for rev 1

(a) is the awkward behavior of the current "hg grep" which we're trying to
fix. (b) might sound sensible, but why are unchanged lines in rev 1 displayed
again? (c) shows redundant matches, but is consistent. (d) seems a bit tricky,
but will be useful.

So my proposal was (c). If we take (d), which I think is also good, we'll
need to find more appropriate option name than --all-files.

@yuja Cool, sending the option C now, will try out option D and send a patch if something comes up.

sangeet259 edited the summary of this revision. (Show Details)Aug 5 2018, 5:09 PM
sangeet259 retitled this revision from grep: add MULTIREV support to --all-files flag to grep: add MULTIREV support to --allfiles flag.
sangeet259 updated this revision to Diff 9937.
yuja added a comment.Aug 6 2018, 9:24 AM

Looks mostly good. One nit.

@@ -2748,7 +2749,7 @@

for fn in sorted(revfiles.get(rev, [])):
    states = matches[rev][fn]
    copy = copies.get(rev, {}).get(fn)
  • if fn in skip:

+ if fn in skip and not all_files:

if copy:
    skip[copy] = True
continue

Instead of ignoring skip[fn], it's probably better to not set skip[fn]
at all. That's what we do at a couple of lines below, if r and not diff.

@@ -1983,6 +1980,7 @@

it = iter(revs)
stopiteration = False

+

for windowsize in increasingwindows():
    nrevs = []
    for i in xrange(windowsize):

@@ -1997,14 +1995,18 @@

ctx = change(rev)
if not fns:
    def fns_generator():

+

if allfiles:
    fiter = iter(ctx)
else:
  • fiter = ctx.files()

+ fiter = iter(ctx.files())

for f in fiter:
    if match(f):
        yield f

+
+

fns = fns_generator()

+

Can you undo these unrelated changes?

sangeet259 updated this revision to Diff 10057.Aug 7 2018, 10:24 AM
sangeet259 updated this revision to Diff 10058.Aug 7 2018, 10:32 AM
sangeet259 updated this revision to Diff 10059.Aug 7 2018, 10:41 AM
yuja added a comment.Aug 7 2018, 10:50 AM

Queued with minor cleanup, thanks.

As a follow up, can you fix hg grep --all-files -rREVS FILE to scan
unchanged revisions?

This revision was automatically updated to reflect the committed changes.
In D3976#64274, @yuja wrote:

Queued with minor cleanup, thanks.
As a follow up, can you fix hg grep --all-files -rREVS FILE to scan
unchanged revisions?

Sure