This is an archive of the discontinued Mercurial Phabricator instance.

mail: fix _encode to be more correct on Python 3
ClosedPublic

Authored by durin42 on Jul 16 2018, 7:16 PM.

Details

Summary

This code appears to be on the wrong side of the law in Python 2, at
least some of the time. In Python 3, it's definitely wrong in places,
but fortunately that's easy to fix.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

durin42 created this revision. Jul 16 2018, 7:16 PM
indygreg accepted this revision. Aug 9 2018, 10:58 PM
indygreg added a subscriber: indygreg.
indygreg added inline comments.
mercurial/mail.py
249–272

Maybe this should be renamed _encodelossy to avoid surprises?

This revision is now accepted and ready to land. Aug 9 2018, 10:58 PM
This revision was automatically updated to reflect the committed changes.
yuja added a subscriber: yuja. Aug 11 2018, 10:17 PM

def _encode(ui, s, charsets):
    '''Returns (converted) string, charset tuple.
    Finds out best charset by cycling through sendcharsets in descending
    order. Tries both encoding and fallbackencoding for input. Only as
    last resort send as is in fake ascii.
    Caveat: Do not use for mail parts containing patches!'''

> Maybe this should be renamed _encodelossy to avoid surprises?

It isn't lossy, in that input bytes are never dropped. If no reasonable
charset is found, it falls back to 'us-ascii' and sends the bytes unchanged.
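
The fallback described above can be sketched roughly as follows. This is
only an illustration of the behaviour stated in the docstring, not the code
in mercurial/mail.py: the function name pick_charset and its parameters are
hypothetical, and the loop order is an assumption.

```python
def pick_charset(raw, input_encodings, send_charsets):
    """Return (bytes, charset) for the first send charset that fits.

    Hypothetical helper: `raw` is the input bytes, `input_encodings` are
    the encodings to try when decoding it, and `send_charsets` are the
    preferred outgoing charsets, best first.
    """
    # Pure 7-bit input needs no conversion at all.
    try:
        raw.decode('ascii')
        return raw, 'us-ascii'
    except UnicodeDecodeError:
        pass
    for in_enc in input_encodings:
        try:
            text = raw.decode(in_enc)
        except (UnicodeDecodeError, LookupError):
            continue  # wrong guess for the input encoding, try the next
        for out_cs in send_charsets:
            try:
                return text.encode(out_cs), out_cs
            except (UnicodeEncodeError, LookupError):
                continue  # this charset cannot represent the text
    # Last resort: pass the bytes through untouched, labelled as ASCII.
    # Nothing is dropped, but the charset label is no longer truthful.
    return raw, 'us-ascii'
```

In this sketch the last-resort branch returns the original bytes object, so
no input is lost even when the declared charset does not match the content.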