This is an archive of the discontinued Mercurial Phabricator instance.

hgweb: fix websub regex flag syntax on Python 3
ClosedPublic

Authored by sheehan on Sep 6 2019, 11:43 AM.

Details

Summary

The websub config section for hgweb is broken under Python 3
when using regex flags syntax (i.e. the optional i in the example
from hg help config.websub):

patternname = s/SEARCH_REGEX/REPLACE_EXPRESSION/[i]

Flags are pulled out of the specified byte-string using a regular
expression, and uppercased. The flags are then iterated over and
passed to the re module using re.__dict__[item], to get the
object attribute of the same name from the re module. So on Python
2 if the il flags are passed, this transition looks like:

`'il'` -> `'IL'` -> `'I'` -> `re.__dict__['I']` -> `re.I`

However, on Python 3 these are bytes objects. When we iterate over
a bytes object in Python 3, instead of getting the individual characters
in the string as string objects of length one, we get the integer
value corresponding to each byte. So the same transition looks like:

`b'il'` -> `b'IL'` -> `73` -> `re.__dict__[73]` -> `KeyError`
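
The failure can be reproduced outside hgweb with a minimal sketch. The function name `lookup_flags_broken` is hypothetical and only mirrors the lookup described above; it is not the code in webutil.py.

```python
import re

flagin = b'il'  # flags as they would be captured from the config byte-string

# On Python 3, iterating over bytes yields integers, not 1-character strings.
items = list(flagin.upper())


def lookup_flags_broken(flag_bytes):
    """Hypothetical helper: look up each flag in the re module's namespace."""
    flags = 0
    for item in flag_bytes.upper():
        # item is an int such as 73, which is not a key in re.__dict__
        flags |= re.__dict__[item]
    return flags


try:
    lookup_flags_broken(flagin)
    error = None
except KeyError as exc:
    error = exc
```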

This commit fixes the type mismatch by converting the bytes to a
system string before iterating over each element to pass to re.
The transition will now look like:

`b'il'` -> `u'IL'` -> `u'I'` -> `re.__dict__[u'I']` -> `re.I`
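
A minimal sketch of the corrected lookup follows. It assumes Python 3 and uses a plain `decode('latin-1')` as a stand-in for the bytes-to-system-string conversion; `lookup_flags` is a hypothetical name, not the function in webutil.py.

```python
import re


def lookup_flags(flag_bytes):
    """Hypothetical helper: combine re flags named by a byte-string."""
    flags = 0
    # Decode first so that iteration yields 1-character str objects,
    # which are valid keys in re.__dict__ (assumed stand-in conversion).
    for item in flag_bytes.upper().decode('latin-1'):
        flags |= re.__dict__[item]
    return flags


combined = lookup_flags(b'il')

# Usage: a case-insensitive substitution on a byte-string.
replaced = re.compile(b'foo', lookup_flags(b'i')).sub(b'bar', b'FOO')
```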

In addition we expand test-websub.t to cover the regex flag case
(for both the websub section and interhg).

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

sheehan created this revision. Sep 6 2019, 11:43 AM
indygreg requested changes to this revision. Sep 7 2019, 12:50 PM
indygreg added a subscriber: indygreg.

Nice catch!

mercurial/hgweb/webutil.py
794

I think pycompat.sysstr() is more appropriate here, since it won't convert the str to unicode on Python 2.
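
For readers unfamiliar with the helper, the behaviour described in this comment can be approximated as below. This is an illustrative sketch, not Mercurial's implementation: `sysstr_sketch` is a hypothetical name, and the latin-1 decoding on Python 3 is an assumption.

```python
import sys


def sysstr_sketch(s):
    """Hypothetical stand-in for a bytes -> native str helper."""
    if sys.version_info[0] < 3:
        # Python 2: str is already the byte-string type, so return it as is
        # rather than converting to unicode.
        return s
    if isinstance(s, str):
        return s
    # Python 3: assumed latin-1 decoding, which maps each byte to one character.
    return s.decode('latin-1')


flag_chars = list(sysstr_sketch(b'IL'))
```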

This revision now requires changes to proceed. Sep 7 2019, 12:50 PM
sheehan updated this revision to Diff 16466. Sep 9 2019, 1:25 PM
sheehan marked an inline comment as done. Sep 9 2019, 1:25 PM
sheehan edited the summary of this revision. Sep 9 2019, 1:45 PM
indygreg accepted this revision. Sep 9 2019, 9:51 PM
This revision is now accepted and ready to land. Sep 9 2019, 9:51 PM
This revision was automatically updated to reflect the committed changes.