Page MenuHomePhabricator

rust: enable on macOS
Needs ReviewPublic

Authored by danchr on Aug 19 2020, 11:52 AM.

Details

Reviewers
Alphare
marmoute
Group Reviewers
hg-reviewers
Summary

AFAICT this works quite well, apart from one issue related to the
dirstate mishandling removed files. In particular,
tests/test-casefolding.t will have a lot of errors like this:

File ".../mercurial/dirstate.py", line 1888, in filefoldmap
  return self._rustmap.filefoldmapasdict()

TypeError: unhashable type: 'list'

Rust has fairly good support for obtaining the canonical path to a
file, so I just used that. With this change, the test still fails, but
now it just fails due to a subtle difference between the two
implementations:

  • With the Python implementation, doing mv thefile TheFile does not affect hg status.
  • With the Rust implementation, the results are the same as on a case-sensitive file system; one file is missing and the other is unknown.

Please note that this assumes a case preserving, but case insensitive
file system -- excluding FAT, I suspect.

Most of the Rust tests pass as well, apart from one PathAuditor test.

Diff Detail

Repository
rHG Mercurial
Branch
default
Lint
No Linters Available
Unit
No Unit Test Coverage

Event Timeline

danchr created this revision.Aug 19 2020, 11:52 AM
Alphare requested changes to this revision.Aug 21 2020, 4:24 AM
Alphare added a subscriber: Alphare.

What do you mean by "mishandling removed files"? The fold map is there to give a mapping between the normalized (internal) representation of file paths and the actual ones on disk. As much as I hate the dirstate code, this is a useful part of the software that is triggered depending on your filesystem and paths used.

I don't have a machine running MacOS to run the tests on, all I know is that if the repo root is on a FS that is case-insensitive, the Rust code will not be run. Also, the PathAuditor failing is not a good sign.

I am currently rewriting some of the dirstate code, hopefully my experiment can make it upstream and improve the code, but I don't think "official" support for MacOS can come before we get a CI running all the tests and at least someone dogfooding it a little.

This revision now requires changes to proceed.Aug 21 2020, 4:24 AM
danchr added a comment.EditedAug 21 2020, 11:49 AM

What do you mean by "mishandling removed files"? The fold map is there to give a mapping between the normalized (internal) representation of file paths and the actual ones on disk. As much as I hate the dirstate code, this is a useful part of the software that is triggered depending on your filesystem and paths used.

Yeah, after posting this patch, I looked at testruns with and without this patch. I had one extra failure: test-casefolding.t. So it's obviously not correct ๐Ÿ˜•

I don't have a machine running MacOS to run the tests on, all I know is that if the repo root is on a FS that is case-insensitive, the Rust code will not be run. Also, the PathAuditor failing is not a good sign.
I am currently rewriting some of the dirstate code, hopefully my experiment can make it upstream and improve the code, but I don't think "official" support for MacOS can come before we get a CI running all the tests and at least someone dogfooding it a little.

I've been trying out, and as far as I can tell, the main issue is with the PathAuditor. Other than that, everything appears to work. I'd like to investigate further, but I'm not sure how to debug the Rust code. The error I get is this:

--- .../tests/test-casefolding.t
+++ .../tests/test-casefolding.t.err
@@ -34,13 +34,226 @@
 
   $ hg mv a A
   $ hg mv A a
+  ** unknown exception encountered, please report by visiting
+  ** https://mercurial-scm.org/wiki/BugTracker
+  ** Python 3.8.5 (default, Jul 21 2020, 18:31:18) [Clang 11.0.3 (clang-1103.0.32.62)]
+  ** Mercurial Distributed SCM (version 5.4.1+168-120ca13cb67c+20200821)
+  ** Extensions loaded: 
+  Traceback (most recent call last):
[snip]
+    File ".../mercurial/dirstate.py", line 602, in normalize
+      return self._normalize(path, isknown, ignoremissing)
+    File ".../mercurial/dirstate.py", line 569, in _normalize
+      folded = self._map.filefoldmap.get(normed, None)
+    File ".../mercurial/util.py", line 1744, in __get__
+      result = self.func(obj)
+    File ".../mercurial/dirstate.py", line 1888, in filefoldmap
+      return self._rustmap.filefoldmapasdict()
+  TypeError: unhashable type: 'list'
+  [1]

Could you perhaps offer some pointers on how I might investigate what's going wrong?

danchr edited the summary of this revision. (Show Details)Aug 24 2020, 3:52 PM
danchr updated this revision to Diff 22434.

I'd like to investigate further, but I'm not sure how to debug the Rust code. The error I get is this:

--- .../tests/test-casefolding.t
+++ .../tests/test-casefolding.t.err
@@ -34,13 +34,226 @@
   $ hg mv a A
   $ hg mv A a
+  ** unknown exception encountered, please report by visiting
+  ** https://mercurial-scm.org/wiki/BugTracker
+  ** Python 3.8.5 (default, Jul 21 2020, 18:31:18) [Clang 11.0.3 (clang-1103.0.32.62)]
+  ** Mercurial Distributed SCM (version 5.4.1+168-120ca13cb67c+20200821)
+  ** Extensions loaded: 
+  Traceback (most recent call last):
[snip]
+    File ".../mercurial/dirstate.py", line 602, in normalize
+      return self._normalize(path, isknown, ignoremissing)
+    File ".../mercurial/dirstate.py", line 569, in _normalize
+      folded = self._map.filefoldmap.get(normed, None)
+    File ".../mercurial/util.py", line 1744, in __get__
+      result = self.func(obj)
+    File ".../mercurial/dirstate.py", line 1888, in filefoldmap
+      return self._rustmap.filefoldmapasdict()
+  TypeError: unhashable type: 'list'
+  [1]

Could you perhaps offer some pointers on how I might investigate what's going wrong?

Some of the code infrastructure needed to support case-insensitive filesystems such as the filefoldmap is in place, but only because it mean little to no effort on my part or because some Python code expected the attribute to be there. Having never run the tests on a case-insensitive FS (or really ever been logged in to a machine with one in the last 10 years), I cannot say whether the current code works, it's more of a best effort. It's not that much code: I did not write the normalization routines for example.

The error you're having looks like should be fixed by the change you've proposed. I have no simple way of testing that, though. There is a util in rust/hg-cpython/src/utils.py called print_python_trace that could help you figure out where the Rust code is being called from that I've used from time to time. Other than that I've mostly looked at the Python/C code to replicate/adapt what was being done and looked at the test output for failures.

To recap: there is quite a bit of code that is missing in order to make the code work on case-insensitive filesystems. You should grep for "TODO" in Rust code, I tried to add one any time I skipped some logic.

marmoute requested changes to this revision.Aug 26 2020, 12:58 PM
marmoute added a subscriber: marmoute.

Looks like we should take the change to dirstate_map.rs (in its own changesets) and delay the change to rust/hg-core/src/lib.rs, right? @danchr can you split this?

@Alphare friendly reminder that your employer provide you access to a Mac OS machine.

This revision now requires changes to proceed.Aug 26 2020, 12:58 PM
danchr edited the summary of this revision. (Show Details)Aug 26 2020, 1:31 PM
danchr updated this revision to Diff 22464.
pulkit added a subscriber: pulkit.Sep 8 2020, 3:21 AM

@Alphare seems like this one needs a re-look.

Thanks for taking the time to do this.

I however still have reservations as to adding MacOS as a supported platform since it currently is tested in no CI that runs even semi-frequently on Rust code.
Also, what do you mean by "most tests pass"? What is needed to make the rest pass? And finally, which is the correct behavior from the Python version or the Rust one (according to your new commit message)?

I will take time to take a look at the code in detail after this becomes clearer for me, just so I can understand what compromise, if any, you're making right now.

rust/hg-cpython/src/dirstate/dirstate_map.rs
353

As stated by @marmoute, this should be its own patch, because the code is just broken without it and it can be accepted without the weight of the rest.

Thanks for taking the time to do this.
I however still have reservations as to adding MacOS as a supported platform since it currently is tested in no CI that runs even semi-frequently on Rust code.
Also, what do you mean by "most tests pass"? What is needed to make the rest pass? And finally, which is the correct behavior from the Python version or the Rust one (according to your new commit message)?
I will take time to take a look at the code in detail after this becomes clearer for me, just so I can understand what compromise, if any, you're making right now.

@Alphare do you think you could produce a list of necessary step to be taken before we can run this

@Alphare gentle ping to provide a todo list here.

@Alphare gentle ping to provide a todo list here.

I'm not sure what you mean, I've asked questions precisely because I don't have a clear idea of what's going on.