This is an archive of the discontinued Mercurial Phabricator instance.

match: remove obsolete catching of OverflowError
Closed, Public

Authored by martinvonz on Nov 28 2018, 1:26 PM.

Details

Summary

Since 0f6a1bdf89fb (match: handle large regexes, 2007-08-19), we catch
an OverflowError from the regex engine and split up the regex if that
happens. In 59a9dc9562e2 (ignore: split up huge patterns, 2008-02-11),
that was extended to raise an OverflowError in our code even if the
regex engine doesn't raise it. It's unclear whether there was a range
of regex sizes that stayed below the limit we added in our code but
for which the regex engine would still raise OverflowError. Either
way, both limitations were probably removed in Python 2.7.4, when the
regex code width was extended from a 16-bit to a 32-bit (or Py_UCS4)
integer (thanks to Yuya for finding that out).

If at least the first limitation was removed, we should no longer be
using OverflowError for flow control, so this patch changes that.
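
As a rough illustration of the change in control flow, the sketch below uses an explicit length check to decide between compiling one regex and splitting the pattern list, with no exception involved. It is not Mercurial's actual code: `build_matcher`, `MAX_REGEX_LEN`, the halving strategy, and the `ValueError` are assumptions chosen for this example.

```python
import re

# Hypothetical size limit for this sketch; the diff under review uses 20000.
MAX_REGEX_LEN = 20000


def build_matcher(patterns, limit=MAX_REGEX_LEN):
    """Return a predicate that is true if any of `patterns` matches.

    `patterns` is a list of regex source strings. If the combined regex is
    longer than `limit`, the list is split in half and the two sub-matchers
    are combined, instead of raising and catching OverflowError.
    """
    regex = '(?:%s)' % '|'.join(patterns)
    if len(regex) <= limit:
        compiled = re.compile(regex)
        return lambda s: compiled.match(s) is not None
    if len(patterns) < 2:
        # A single pattern that is too long cannot be split any further.
        raise ValueError('pattern is too long (%d characters)' % len(regex))
    mid = len(patterns) // 2
    first = build_matcher(patterns[:mid], limit)
    second = build_matcher(patterns[mid:], limit)
    return lambda s: first(s) or second(s)


# Usage: a tiny limit forces one split, but matching behaves the same.
matcher = build_matcher(['foo', 'bar', 'baz'], limit=12)
print(matcher('baz'), matcher('qux'))
```

The size decision here is an ordinary branch, so an OverflowError raised for some unrelated reason would propagate instead of being silently treated as "regex too large".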

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

martinvonz created this revision. Nov 28 2018, 1:26 PM
yuja added a subscriber: yuja. Nov 29 2018, 6:59 AM

Queued, thanks.

      regex = '(?:%s)' % '|'.join([_regex(k, p, globsuffix)
                                   for (k, p, s) in kindpats])
-     if len(regex) > 20000:
-         raise OverflowError
-     return regex, _rematcher(regex)
- except OverflowError:
+     if len(regex) < 20000:
+         return regex, _rematcher(regex)

s/</<=/ to make sure no behavior changed.
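
To see why the comparison operator matters, this minimal sketch evaluates the old and new conditions around the 20000 threshold. The function names `old_accepts`, `new_accepts_lt`, and `new_accepts_le` are made up for this illustration and do not appear in Mercurial.

```python
LIMIT = 20000  # the threshold used in the diff above


def old_accepts(n):
    # Old code: OverflowError was raised only when len(regex) > 20000.
    return not (n > LIMIT)


def new_accepts_lt(n):
    # Condition as written in the reviewed diff.
    return n < LIMIT


def new_accepts_le(n):
    # Condition after the suggested s/</<=/.
    return n <= LIMIT


for n in (LIMIT - 1, LIMIT, LIMIT + 1):
    print(n, old_accepts(n), new_accepts_lt(n), new_accepts_le(n))
```

Only a regex of exactly 20000 characters is treated differently: the old code and the `<=` variant compile it as a single regex, while the `<` variant would send it down the splitting path.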

This revision was automatically updated to reflect the committed changes.