( )⚙ D11389 rhg: Port Python’s `ui.configlist` as `Config::get_list`

This is an archive of the discontinued Mercurial Phabricator instance.

rhg: Port Python’s `ui.configlist` as `Config::get_list`
ClosedPublic

Authored by SimonSapin on Sep 3 2021, 10:38 AM.

Details

Summary

This new method is not used yet outside of its own unit tests,
so this changeset should make no observable change.

The Rust parser implementation attempts to exactly replicate the behavior of
the Python one, even in edge cases where that behavior is… surprising.
New unit tests capture some of these edge cases.

This started as a line-by-line port. The main changes are:

  • Pass around a parser mode enum instead of parser functions
  • Inline the whole parser into one function
  • Use [u8]::get which returns an Option, instead of indexing after explicitly checking the length.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

SimonSapin created this revision.Sep 3 2021, 10:38 AM

Dear reviewer, when trying to find cases where this code has different behavior than that of mercurial/utils/stringutil.py (or convince yourself that it’s the same), I advise for your own mental well-being not to question too much what that behavior is.

Some of it is arguably bugs, but this patch does not get into what existing config file could potentially rely on those bugs and would be broken if fix the parsing algorithm.

Inline the whole parser into one function

Almost, I had to split back up the recursive part. But _parse_plain and _parse_quote are still inlined, which simplifies things slightly.

SimonSapin updated this revision to Diff 30183.Sep 4 2021, 3:25 AM

Updated manually rather than trough Baymax since the windows-py3-pyox job in https://foss.heptapod.net/octobus/mercurial-devel/-/pipelines/26551 failed 4 times in a row with different error messages.

Alphare accepted this revision.Sep 8 2021, 8:48 AM
Alphare added a subscriber: Alphare.

The Rust replication seems good to me, I don't think the performance aspect is too important either. I'm not glad we have to have this, but I don't have a better solution!

rust/hg-core/src/config/values.rs
241

nit: I find that these are more readable by using r#"this kind of string"#

This revision is now accepted and ready to land.Sep 8 2021, 8:48 AM
SimonSapin added inline comments.Sep 9 2021, 9:48 AM
rust/hg-core/src/config/values.rs
241

Yes, not having to mentally do backslash-unescaping would certainly be easier for human readers. However these raw strings literals do not support any backslash-escaping so \t and \x0b would have to be replaced by their literal characters whose difference is less visible to human readers. We could have only those test cases use non-raw literals, but keeping all test cases consistent with each other feels more valuable to me.