This is an archive of the discontinued Mercurial Phabricator instance.

json: reject unicode on py2 as well
ClosedPublic

Authored by martinvonz on May 12 2018, 12:35 AM.

Details

Summary

This makes it consistent with the behavior on py3.

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

martinvonz created this revision.May 12 2018, 12:35 AM
indygreg requested changes to this revision.May 12 2018, 12:43 AM
indygreg added a subscriber: indygreg.

unicode is not a type on Python 3. I think this should be changed to (str, pycompat.unicode). That will evaluate to (str, unicode) on Python 2 and (str, str) on Python 3. If you go this route, please rephrase the comment to denote the Python 2 behavior only triggering on unicode because of str == bytes.

This revision now requires changes to proceed.May 12 2018, 12:43 AM
yuja added a subscriber: yuja.May 12 2018, 1:52 AM
`unicode` is not a type on Python 3. I think this should be changed to `(str, pycompat.unicode)`.

Alternatively, it could be type(u''). I have no preference though.

martinvonz updated this revision to Diff 8657.May 12 2018, 2:12 AM

unicode is not a type on Python 3. I think this should be changed to (str, pycompat.unicode). That will evaluate to (str, unicode) on Python 2 and (str, str) on Python 3. If you go this route, please rephrase the comment to denote the Python 2 behavior only triggering on unicode because of str == bytes.

Done, I believe.

yuja added a comment.May 12 2018, 2:52 AM
  • elif isinstance(obj, str):
  • # This branch is unreachable on Python 2, because bytes == str
  • # and we'll return in the next-earlier block in the elif
  • # ladder. On Python 3, this helps us catch bugs before they
  • # hurt someone.

+ elif isinstance(obj, (str, pycompat.unicode)):
+ # On Python 2, we'll only get here if obj is unicode, because
+ # str == bytes, so str is handled above.

Doh, I never thought we already had pycompat.unicode to be imported
automagically by the code transformer, and it's unavailable on Python 2!
Since we'll eventually ditch the transformer, we'll have to add
unicode = __builtin__.unicode to the py2 side of pycompat.

Nit: (str, can be removed as pycompat.unicode is str on Python 3.

In D3536#56051, @yuja wrote:
  • elif isinstance(obj, str):
  • # This branch is unreachable on Python 2, because bytes == str
  • # and we'll return in the next-earlier block in the elif
  • # ladder. On Python 3, this helps us catch bugs before they
  • # hurt someone.

+ elif isinstance(obj, (str, pycompat.unicode)):
+ # On Python 2, we'll only get here if obj is unicode, because
+ # str == bytes, so str is handled above.

Doh, I never thought we already had pycompat.unicode to be imported
automagically by the code transformer, and it's unavailable on Python 2!
Since we'll eventually ditch the transformer, we'll have to add
unicode = __builtin__.unicode to the py2 side of pycompat.

Yep, tests caught that too (once I ran them :P). I switched to type(u'') instead and I'll leave the fix of pycompat out of this patch (since pycompat probably needs further changes for xrange and others anyway).

Nit: (str, can be removed as pycompat.unicode is str on Python 3.

Good point. I fixed that and removed the comment because it it didn't seem to add anything anymore.

martinvonz updated this revision to Diff 8717.May 16 2018, 1:49 PM
This revision was automatically updated to reflect the committed changes.