pycompat.py has functions strurl and bytesurl which decodes and encodes the url
passed on Python 3 respectively. In some cases, strurl gets a url which is
already str and bytesurl gets a url which is already bytes. Let's prevent
encoding or decoding the values again if not required.
Details
- Reviewers
yuja durin42 - Group Reviewers
hg-reviewers - Commits
- rHGe2b87e19c6ef: pycompat: prevent encoding or decoding values if not required
Diff Detail
- Repository
- rHG Mercurial
- Lint
Lint Skipped - Unit
Unit Tests Skipped
Event Timeline
I'm 0 for this move.
I'm not a fan of making function inputs and outputs obscure because we'll
be likely to have a hard time to fix it in future. Mixing bytes and unicodes is
horrible. But if it's painful to be strict on url type, I'll say go for it.
mercurial/pycompat.py | ||
---|---|---|
198 | Need to return an unmodified value. |
I agree with @yuja here. I think we should strive to convert URLs to bytes (like most other internal types). Although I do realize that lots of stdlib code has... expectations. Is there a particular test failure you are tracking down with this fix? I've kind of got a lot of the wire protocol code paged in right now. I could potentially offer some advice.
test-http-bundle1.t failures. I think this current patch is probably the "least bad" option (barring getting pytype working, which I think would let us try and constrain some types in more principled ways) since some of these URL-relevant functions are called by both hg and urllib guts. The former wants to use bytes, and the latter can only use unicodes.
I've poked at this on and off a few times and something like this fix is all I've come up with. I'm not happy about it either.
The other option is probably to always treat URLs as unicodes, though that seems like it'll probably come with its own bag of hurt throughout the codebase.
Need to return an unmodified value.