( )⚙ D2904 templatefuncs: add mailmap template function

This is an archive of the discontinued Mercurial Phabricator instance.

templatefuncs: add mailmap template function
ClosedPublic

Authored by sheehan on Mar 20 2018, 1:56 PM.

Details

Summary

This commit adds a template function to support the .mailmap file
in Mercurial repositories. The .mailmap file comes from git, and
can be used to map new emails and names for old commits. The general
use case is that someone may change their name or author commits
under different emails and aliases, which would make these
commits appear as though they came from different persons. The
file allows you to specify the correct name that should be used
in place of the author field specified in the commit.

The mailmap file has 4 possible formats used to map old "commit"
names to new "proper" names:

  1. <proper@email.com> <commit@email.com>
  2. Proper Name <commit@email.com>
  3. Proper Name <proper@email.com> <commit@email.com>
  4. Proper Name <proper@email.com> Commit Name <commit@email.com>

Essentially there is a commit email present in each mailmap entry,
that maps to either an updated name, email, or both. The final
possible format allows commits authored by a person who used
both an old name and an old email to map to a new name and email.

To parse the file, we split by spaces and build a name out
of every element that does not start with "<". Once we find an element
that does start with "<" we concatenate all the name elements that preceded
and add that as a parsed name. We then add the email as the first
parsed email. We repeat the process until the end of the line, or
a comment is found. We will be left with all parsed names in a list,
and all parsed emails in a list, with the 0 index being the proper
values and the 1 index being the commit values (if they were specified
in the entry).

The commit values are added as the keys to a dict, and with the proper
fields as the values. The mapname function takes the mapping object and
the commit author field and attempts to look for a corresponding entry.
To do so we try (commit name, commit email) first, and if no results are
returned then (None, commit email) is also looked up. This is due to
format 4 from above, where someone may have a mailmap entry with both
name and email, and if they don't it is possible they have an entry that
uses only the commit email.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

sheehan created this revision.Mar 20 2018, 1:56 PM
yuja requested changes to this revision.Mar 21 2018, 9:31 AM
yuja added a subscriber: yuja.

Perhaps parsemailmap() and mapname() can be moved to new
utility module. And it's probably better to add some unit tests or doctests.

mercurial/templatefuncs.py
173

Current trend is to ditch namedtuple and use @attr.s instead.

177

Nit: avoid using map as variable name since there's map() function.

234

Perhaps util.email() and templatefilters.person() can be used instead.
In which case, person() should probably be moved to new utility
module.

247

Please use '%' formatting because bytes.format() isn't available
on Python 3.

253

Nit: **kwargs is unnecessary.

266

Perhaps repo.wvfs.tryread() can be used instead. That's what
hgext.lfs is doing to read .hglfs file.

273

No need to invalidate by mtime since the cache will be discarded
with the templating session .

And I think reading a file is much expensive than parsing.

tests/test-mailmap.t
8

Nit: -u can be used instead of HGUSER=

This revision now requires changes to proceed.Mar 21 2018, 9:31 AM
sheehan marked 5 inline comments as done.Mar 27 2018, 12:04 PM
sheehan updated this revision to Diff 7345.
sheehan marked 3 inline comments as done.Mar 29 2018, 10:10 AM
sheehan updated this revision to Diff 7361.
yuja requested changes to this revision.Mar 30 2018, 8:49 AM
yuja added inline comments.
mercurial/templatefuncs.py
185

Nit: .exists() isn't needed. tryread() handles it.

188

Still reading the file no matter if it is cached or not. This should be:

if 'mailmap' not in cache:
    data = repo.wvfs.tryread(...)
    cache['mailmap'] = parsemailmap(data)
193

Nit: try-except isn't needed thanks to tryread().

This revision now requires changes to proceed.Mar 30 2018, 8:49 AM
sheehan updated this revision to Diff 7381.Mar 30 2018, 11:11 AM
yuja accepted this revision.Mar 30 2018, 9:13 PM
This revision is now accepted and ready to land.Mar 30 2018, 9:13 PM
yuja added a comment.Mar 30 2018, 9:21 PM

Queued, thanks. Can you send a followup?

mercurial/templatefuncs.py
188

Perhaps the last or author wouldn't be necessary because that's
the default of mapname().

mercurial/utils/stringutil.py
511

An example to make it crash: b'>@<' (which is nice emoticon btw.)

Perhaps we can remove this any(...) and add if not email: continue instead.

This revision was automatically updated to reflect the committed changes.
In D2904#48406, @yuja wrote:

Queued, thanks. Can you send a followup?

Thanks for the review, @yuja! I'll send a followup shortly. :)