( )⚙ D10296 rhg: read [paths] for `--repository` value

This is an archive of the discontinued Mercurial Phabricator instance.

rhg: read [paths] for `--repository` value
ClosedPublic

Authored by pulkit on Apr 2 2021, 6:46 AM.

Details

Summary

hg parses -R and --repository CLI arguments "early" in order to know which
local repository to load config from. (Config can then affect whether or how to
fall back.)

The value of of those arguments can be not only a filesystem path, but also an
alias configured in the [paths] section. This part was missing in rhg and
this patch implements that.

The current patch still lacks functionality to read config of current repository
if we are not at root of repo. That will be fixed in upcoming patches.

A new crate home is added to get path of home directory.

Diff Detail

Repository
rHG Mercurial
Branch
default
Lint
No Linters Available
Unit
No Unit Test Coverage

Event Timeline

pulkit created this revision.Apr 2 2021, 6:46 AM
Alphare requested changes to this revision.Apr 3 2021, 12:02 PM
Alphare added a subscriber: Alphare.
Alphare added inline comments.
rust/rhg/src/main.rs
175

The nested ifs can be replaced by:

match config_val {
    Ok(Some(val)) if val.len() > 0 => Some(get_path_from_bytes(val.as_bytes())),
    _ => Some(get_path_from_bytes(&repo_arg)),
}

Also, you're using get_str, when you should probably just be using get, which already returns bytes (and does not unnecessarily force UTF8). That would remove the need for as_bytes() in the above suggested code.

This revision now requires changes to proceed.Apr 3 2021, 12:02 PM
pulkit updated this revision to Diff 26659.Apr 5 2021, 4:33 PM
pulkit marked an inline comment as done.Apr 5 2021, 4:34 PM
Alphare accepted this revision.Apr 6 2021, 4:59 AM

Looks good, but there seems to be yet another edge case within the edge case… *sigh*. See inline comment.

tests/test-globalopts.t
73–80

This tests suggests that relative paths in [paths] are resolved relative to the user’s $HOME directory, not relative to the the config file nor the processes current directory. To be honest I find this really weird and unexpected, but it does appear to be the case when I change HOME=/somethingelse in the test and I do see root = os.path.expanduser(b'~') in mercurial/ui.py before _applyconfig calls fixconfig which looks at [paths] items and makes them absolute.

When run with --rhg this test ends up passing but not quite for the right reason. rhg resolves path a relative to its current directory which is the repository c. That repository happens to contains a file named a, so Repo::find returns HgError::unsupported("bundle repository") which causes fallback to Python. Fallback is what we want, but only because the identify command is not implemented (yet).

It looks like os.path.expanduser returns its input unchanged when getting the home directory fails, so the code above should be something like Some(home::home_dir().unwrap_or_else(|| PathBuf::new("~")).join(get_path_from_bytes(val)))

pulkit updated this revision to Diff 26679.Apr 7 2021, 6:21 PM
pulkit added inline comments.Apr 7 2021, 6:25 PM
tests/test-globalopts.t
73–80

Ugh!

Btw, how did you figured out that we are falling back with HgError:unsupported("bundle ...")? I mean is there some config or something which shows these debug prints or is there any other way?

I updated the patch after fighting with borrow checker a bit, does seems bit ugly to me, kindly have a look.

SimonSapin accepted this revision.Apr 8 2021, 4:06 PM

This version of the patch looks good, thanks!

Re debugging, there isn’t an easy way unfortunately. --config rhg.on-unsupported=abort prints the string passed to HgError::unsupported, but also disables fallback. We could add a fourth mode that keeps fallback but also has a debug log line printed to stderr.

What I did was add a use of the dbg! macro in main.rs to print the error value: https://doc.rust-lang.org/std/macro.dbg.html Since the value contains a byte string, the derived implementation of the Debug trait prints an array of integers. I copy-pasted that into a Python REPL and passed it to bytes() decode it as ASCII. Really not a good debugging experience.

Re debugging, there isn’t an easy way unfortunately. --config rhg.on-unsupported=abort prints the string passed to HgError::unsupported, but also disables fallback. We could add a fourth mode that keeps fallback but also has a debug log line printed to stderr.

Why don't we add a simple log::debug (with a possible rhg target in case we have a log consumer that allows for filtering by targets)? I've done this (albeit with a log::trace in rust/hg-cpython/src/dirstate/status.rs:89, and it's handy enough to use RUST_LOG=debug to get the debug entries printed.

pulkit retitled this revision from rhg: try read [paths] for `--repository` value to rhg: read [paths] for `--repository` value.Apr 10 2021, 5:15 PM
pulkit edited the summary of this revision. (Show Details)
pulkit updated this revision to Diff 26787.

@SimonSapin I updated this patch to also include reading .hg/hgrc and .hg/hgrc-not-shared of current dir. Kindly have a re-look. Thanks!

This version of the patch looks good, thanks!
Re debugging, there isn’t an easy way unfortunately. --config rhg.on-unsupported=abort prints the string passed to HgError::unsupported, but also disables fallback. We could add a fourth mode that keeps fallback but also has a debug log line printed to stderr.

+1 for fourth mode. Will help a lot in case of testing new code.

Note: I suspect we are lacking a test coverage for --repository FOO, where FOO is defined as a path://.

That’s quite possible, but we have code a few lines above this in main.rs that falls back to Python for any --repository value that looks like a URL. So I’d expect that case to be handled correctly (if not quickly).

That’s quite possible, but we have code a few lines above this in main.rs that falls back to Python for any --repository value that looks like a URL. So I’d expect that case to be handled correctly (if not quickly).

ha ! "great" then.

Alphare requested changes to this revision.May 20 2021, 9:34 AM

Sorry, this seems to have been forgotten somehow. @SimonSapin please also take a look since you requested changes.

rust/rhg/src/main.rs
172

Use repo_arg.is_empty() instead

178

if current_dir.is_err() is simpler. However, you're using current_dir.unwrap() right after, so in the end you could simply do:

if let Ok(current_dir_path) = current_dir {
    ...
} else {
    None
}
192

You could use config.ok() here.

211

Why do we unwrap here?

230

Same question as above

236

Could use config_val.or(Some(get_path_from_bytes(&repo_arg).to_path_buf()))

This revision now requires changes to proceed.May 20 2021, 9:34 AM

Sorry for being silent for long, catching up now. Will update this patch.

pulkit marked 3 inline comments as done.May 24 2021, 6:59 AM
pulkit edited the summary of this revision. (Show Details)
pulkit updated this revision to Diff 28192.
pulkit added inline comments.May 24 2021, 7:01 AM
rust/rhg/src/main.rs
192

You mean if config.ok()?

211

.canonicalize() returns a Result.

Alphare added inline comments.May 25 2021, 4:28 AM
rust/rhg/src/main.rs
192

No, I'm referencing https://doc.rust-lang.org/std/result/enum.Result.html#method.ok, which could replace the entire if expression

211

Well sure, but what happens if it's an Err? Mercurial will just panic and throw a nasty traceback in Python context, and a Rust panic message in Rust ones.

pulkit updated this revision to Diff 28201.May 25 2021, 6:01 PM
Alphare added inline comments.May 26 2021, 4:06 AM
rust/rhg/Cargo.toml
15

I forgot when first reading this patch, but we usually give a reason for adding a new dependency in the commit message.

rust/rhg/src/main.rs
203

I'm assuming this means that we silently ignore this error in the Python implementation as well?

221

This is canpath.unwrap_or(non_repo_config_val)

pulkit added inline comments.May 26 2021, 4:04 PM
rust/rhg/Cargo.toml
15

Will add.

rust/rhg/src/main.rs
203

Not sure what exact part in python will mean same as this one. But I see python code will raise IOError if it fails to read config from a file path passed.

221

but we are puting canpath.unwrap in Some() again, will it work?

baymax edited the summary of this revision. (Show Details)May 26 2021, 4:48 PM
baymax updated this revision to Diff 28206.

✅ refresh by Heptapod after a successful CI run (🐙 💚)

Alphare added inline comments.May 27 2021, 3:30 AM
rust/rhg/src/main.rs
203

This should probably do the same thing then.

221

Ah sorry, I misread. This is canpath.ok().or(non_repo_config_val). But you can use a match expression if you find it clearer also. unwrap raises an immediate alarm of "this can panic if not handled correctly or if the code changes in the future", so we try to avoid doing it in favor of letting the type system check all code paths for us.

baymax updated this revision to Diff 28504.Jun 7 2021, 7:15 AM

✅ refresh by Heptapod after a successful CI run (🐙 💚)

SimonSapin added inline comments.Jun 7 2021, 10:59 AM
rust/rhg/src/main.rs
187

.ok() here together with the handling of None later silently ignore config parse errors, where Python-based hg aborts. However:

  • This is a rather niche corner case
  • Handling such errors in this particular function is tricky because we can’t just use the ? operator
  • This patch still improves things already, and has been open long enough

So please add a // TODO: handle errors from load_from_explicit_sources` comment and let’s follow up later.

Similarly please add a TODO comment above for handling errors from current_dir

pulkit updated this revision to Diff 28513.Jun 9 2021, 3:51 AM
Alphare accepted this revision.Jun 9 2021, 5:10 AM
This revision is now accepted and ready to land.Jun 9 2021, 5:10 AM
This revision was automatically updated to reflect the committed changes.