D4231 mail: always fall back to iso-8859-1 if us-ascii won't work (BC)

This is an archive of the discontinued Mercurial Phabricator instance.

Closed, Public

Authored by durin42 on Aug 9 2018, 10:07 PM.

Details

Summary

It looks like this was a well-intentioned backwards compat hack for
previewing the output of hg email in a stable way. Unfortunately I
think this hack's time has come, because Python 3 does a much better
job of ensuring it actually emits *valid* email messages. In
particular, Python 2 would blindly trust us that the bytes we handed
it were valid for the encoding we claimed, but Python 3 has some more
sniff-tests that we end up failing.

As a result, if we're going to print an email to the terminal, try
us-ascii first, but if that fails go straight to iso-8859-1 which
should be reasonably readable for ascii-compatible patch bodies. This
*will* be a breaking change for ascii-incompatible textual patch
content, but I don't think that's avoidable if we want to continue
using the email library from the stdlib.

.. bc::

   Emails from the patchbomb extension will always be printed as though
   they are iso-8859-1 if they're not valid us-ascii. Previously,
   previewed emails were always claimed to be us-ascii and might
   contain invalid byte sequences.
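
As a rough illustration of the fallback described in the summary, the sketch below tries us-ascii first and otherwise labels the body as iso-8859-1. It is not the code from this revision: `choose_display_charset` and `build_preview` are hypothetical names invented here, and only the standard library's `email.mime.text.MIMEText` is assumed.

```python
from email.mime.text import MIMEText


def choose_display_charset(body: bytes) -> str:
    """Pick the charset to claim when printing a message (hypothetical helper)."""
    try:
        # If every byte is 7-bit, us-ascii is an honest label.
        body.decode('ascii')
        return 'us-ascii'
    except UnicodeDecodeError:
        # iso-8859-1 assigns a character to every byte value, so decoding
        # with it cannot fail, whatever encoding the bytes really use.
        return 'iso-8859-1'


def build_preview(body: bytes) -> MIMEText:
    """Build a text/plain message for display (hypothetical helper)."""
    charset = choose_display_charset(body)
    text = body.decode(charset)
    return MIMEText(text, 'plain', charset)


if __name__ == '__main__':
    ascii_msg = build_preview(b'plain ascii patch body\n')
    print(ascii_msg.get_content_charset())

    latin_msg = build_preview(b'caf\xe9\n')
    print(latin_msg.get_content_charset())
```

Because the second decode never raises, the message always carries a charset that matches its payload. The cost is the one the summary names: bytes that are really in some other encoding are shown as if they were iso-8859-1, so only their ASCII-compatible parts stay readable.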

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

durin42 created this revision. Aug 9 2018, 10:07 PM
indygreg accepted this revision. Aug 9 2018, 11:29 PM
indygreg added a subscriber: indygreg.

This seems reasonable to me. It's only changing the email that is displayed. So I don't think we need to scrutinize this too much.

This revision is now accepted and ready to land. Aug 9 2018, 11:29 PM
This revision was automatically updated to reflect the committed changes.