This is an archive of the discontinued Mercurial Phabricator instance.

templatefilters: add commonprefix
ClosedPublic

Authored by joerg.sonnenberger on May 5 2018, 9:13 PM.

Details

Summary

The commonprefix filter takes a list of files names like files() and
returns the longest directory name common to all elements.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

yuja added a subscriber: yuja.May 6 2018, 8:29 AM

Can you add test for failure cases? such as

  • str|commonprefix
  • int|commonprefix
  • nothing common (e.g. ["/foo", "bar"])
  • empty list
  • paths normalized by os.path.normcase()

os.path.commonprefix() isn't what we want, but it might provide some hints.

yuja added a comment.May 7 2018, 9:09 AM

"os.path.commonprefix() isn't what we want, but it might provide some hints."

Perhaps os.path.commonprefix() could be used with conditional
fixup by os.path.dirname()?

prefix = os.path.commonprefix(files)
if prefix isn't a valid path:
    prefix = os.path.dirname(prefix)

This way, we can remove normpath() which has a side effect s|/|\\|g
on Windows.

joerg.sonnenberger retitled this revision from template filters: add commonprefix to templatefilters: add commonprefix.May 8 2018, 11:18 AM
joerg.sonnenberger updated this revision to Diff 8549.
yuja added a comment.May 9 2018, 8:46 AM

Please add tests of edge cases.

  • str|commonprefix
  • infinite loop (e.g. ["/foo", "bar"])
  • exact match (e.g. ["/foo"] and ["/foo", "/foo"])
  • empty list

os.path.commonprefix() will provide some hints to avoid comparison of
all list elements by using min()/max().

{str|commonprefix} is not really interesting since it is naturally an iterable of text. The others are covered, the routine tries to optimize a couple of common cases as well now.

yuja added a comment.May 10 2018, 8:40 AM

+ $ hg debugtemplate '{"foo/bar\nfoo/bar\n"|splitlines|commonprefix}\n'
+ foo

should be "foo/bar"

+ $ hg debugtemplate '{"/foo/bar\n/foo/bar\n"|splitlines|commonprefix}\n'
+ foo

should be "/foo/bar"

+ $ hg debugtemplate '{"/foo\n/foo\n"|splitlines|commonprefix}\n'
+

should e "/foo"

FWIW, I doubt if scanning all elements with the seen set is faster than
calling min()/max() to pick the farthest pair because Python is slow.

{str|commonprefix} is not really interesting since it is naturally an
iterable of text.

I think it's better to error out, but that could be addressed later.

This revision was automatically updated to reflect the committed changes.
yuja added a comment.Jun 2 2018, 11:35 PM

For the record, I still believe the commonprefix of a pair of identical paths
should be the path itself as it is the longest sub-component of these paths.

+ $ hg debugtemplate '{"foo/bar\nfoo/bar\n"|splitlines|commonprefix}\n'
+ foo

should be "foo/bar"

And removing the leading slash is awkward.

+ $ hg debugtemplate '{"/foo/bar\n/foo/bar\n"|splitlines|commonprefix}\n'
+ foo

should be "/foo/bar"

Do you think we should call it commonpath() or commondir() instead in case we want commonprefix() to work for any string in the future (and not care about path separators)?

Do you think we should call it commonpath() or commondir() instead in case we want commonprefix() to work for any string in the future (and not care about path separators)?

commondir() would be fine as well. commonpath() doesn't feel better. I'm ambivalent about either name.

Do you think we should call it commonpath() or commondir() instead in case we want commonprefix() to work for any string in the future (and not care about path separators)?

commondir() would be fine as well. commonpath() doesn't feel better. I'm ambivalent about either name.

I'd vote for changing it then. Both because I think it might make it clearer that it's the common prefix of "foo/bar" and "food" is not "foo" and because, as I said, we might want such a function later and it would be unfortunate if the name was taken. Btw, I just checked and we already have dirname() and stripdir() functions that would match commondir().

yuja added a comment.Jun 13 2018, 8:31 AM
> `commondir()` would be fine as well. `commonpath()` doesn't feel better. I'm ambivalent about either nam>
I'd vote for changing it then. Both because I think it might make it clearer that it's the common prefix of "foo/bar" and "food" is not "foo" and because, as I said, we might want such a function later and it would be unfortunate if the name was taken. Btw, I just checked and we already have `dirname()` and `stripdir()` functions that would match `commondir()`.

+1 for commondir(). It matches the actual behavior, which is the last filename
part is always stripped.