This is an archive of the discontinued Mercurial Phabricator instance.

rust-dirstate: add "dirs" rust-cpython binding
ClosedPublic

Authored by Alphare on May 17 2019, 6:10 AM.

Details

Summary

These bindings have an obvious performance and memory cost on larger repos,
since they copy and allocate everything at once on each round trip between
Python and Rust. As in the previous patch series, this is only temporary and
will improve once we no longer pass large data structures to and from Python.

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

Alphare created this revision. May 17 2019, 6:10 AM
Alphare updated this revision to Diff 15342. Jun 5 2019, 12:23 PM
kevincox accepted this revision. Jun 10 2019, 9:06 AM
kevincox added inline comments.
rust/hg-cpython/src/dirstate.rs:221
let dirs_map = if ...

Then just let the if statement evaluate to this value.
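The suggestion relies on `if` being an expression in Rust, so a binding can take its value directly from the branches. Here is a minimal sketch; the function `build_dirs_map` and its parameters are hypothetical stand-ins, since the code at line 221 of `dirstate.rs` is not reproduced in this thread:

```rust
use std::collections::HashMap;

// Hypothetical helper, for illustration only. It counts how many times
// each path occurs when `enabled` is true, and returns an empty map otherwise.
fn build_dirs_map(paths: &[&str], enabled: bool) -> HashMap<String, u32> {
    // The `if` expression evaluates to the map, so there is no need to
    // declare a mutable variable first and assign to it in each branch.
    let dirs_map = if enabled {
        let mut map = HashMap::new();
        for p in paths {
            *map.entry(p.to_string()).or_insert(0) += 1;
        }
        map
    } else {
        HashMap::new()
    };
    dirs_map
}
```

Each branch ends with an expression of the same type, which is what lets the compiler accept the single `let`.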

Alphare added inline comments. Jun 17 2019, 11:36 AM
rust/hg-cpython/src/dirstate.rs:221

Ah yes, I always forget... Done in a future patch.

Alphare updated this revision to Diff 15671. Jun 27 2019, 9:16 AM
This revision was not accepted when it landed; it landed in state Needs Review.
This revision was automatically updated to reflect the committed changes.
yuja added a subscriber: yuja. Jun 29 2019, 11:41 PM

+    def contains(&self, item: PyObject) -> PyResult<bool> {
+        Ok(self
+            .dirs_map(py)
+            .borrow()
+            .get(&item.extract::<PyBytes>(py)?.data(py).to_owned())
+            .is_some())

.contains_key(..) and .as_ref() instead of &...to_owned().

I'm surprised by the use of Deref in the previous patch. Is it legit
to leverage Deref to expose the inner type?
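To illustrate the first point with plain standard-library types (no rust-cpython), here is a sketch that assumes the map is keyed by `Vec<u8>`. The function names are hypothetical. Because `Vec<u8>` implements `Borrow<[u8]>`, the lookup can take a borrowed byte slice and skip the allocation that `to_owned()` performs:

```rust
use std::collections::HashMap;

// Mirrors the shape of the quoted patch: the key is copied into a fresh
// Vec<u8> only to be used for a lookup, then dropped.
fn contains_alloc(map: &HashMap<Vec<u8>, u32>, key: &[u8]) -> bool {
    map.get(&key.to_owned()).is_some()
}

// Same result without the copy: `contains_key` accepts any borrowed form
// of the key type, and `Vec<u8>: Borrow<[u8]>` makes `&[u8]` acceptable.
fn contains_borrowed(map: &HashMap<Vec<u8>, u32>, key: &[u8]) -> bool {
    map.contains_key(key)
}
```

Both functions return the same answer for every input; only the second avoids a heap allocation per call.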

In D6394#96115, @yuja wrote:

+    def contains(&self, item: PyObject) -> PyResult<bool> {
+        Ok(self
+            .dirs_map(py)
+            .borrow()
+            .get(&item.extract::<PyBytes>(py)?.data(py).to_owned())
+            .is_some())

.contains_key(..) and .as_ref() instead of &...to_owned().
I'm surprised by the use of Deref in the previous patch. Is it legit
to leverage Deref to expose the inner type?

Ah yes, I don't know what I was thinking. I'll write a follow-up.
To my eyes, the point of Deref is to make the struct act as a "fat pointer" from the outer interface code, but maybe this does not really help that much?

yuja added a comment. Jul 1 2019, 8:42 AM
To my eyes, the point of `Deref` is to make the `struct` act as a "fat pointer" from the outer interface code, but maybe this does not really help that much?

I feel DirsMultiset isn't a HashMap, but is implemented by using a HashMap.
For example, inner.get().unwrap() returns the refcount of the path, which
is an implementation detail.

So I think it's better to define methods explicitly instead of proxying
everything to HashMap.
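A rough sketch of what explicit methods could look like. It uses a simplified, hypothetical `DirsMultiset` that is not the actual type from the patch series. The `HashMap` and its reference counts stay private, and callers only see set-like operations:

```rust
use std::collections::HashMap;

// Simplified stand-in, for illustration only: a multiset of directory
// paths backed by a map from path bytes to a reference count.
pub struct DirsMultiset {
    inner: HashMap<Vec<u8>, u32>,
}

impl DirsMultiset {
    pub fn new() -> Self {
        DirsMultiset { inner: HashMap::new() }
    }

    /// Adds one reference to `dir`.
    pub fn add(&mut self, dir: &[u8]) {
        *self.inner.entry(dir.to_vec()).or_insert(0) += 1;
    }

    /// Drops one reference to `dir`. Returns false if it was not present.
    pub fn remove(&mut self, dir: &[u8]) -> bool {
        let remaining = match self.inner.get_mut(dir) {
            Some(count) => {
                *count -= 1;
                *count
            }
            None => return false,
        };
        // Forget the entry once the last reference is gone.
        if remaining == 0 {
            self.inner.remove(dir);
        }
        true
    }

    /// Membership test; the refcount itself is never exposed.
    pub fn contains(&self, dir: &[u8]) -> bool {
        self.inner.contains_key(dir)
    }

    /// Number of distinct directories.
    pub fn len(&self) -> usize {
        self.inner.len()
    }
}
```

With no `Deref` to `HashMap`, code outside the type cannot call `get()` and come to depend on the stored counts, so the representation can change later without breaking callers.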

In D6394#96189, @yuja wrote:
To my eyes, the point of `Deref` is to make the `struct` act as a "fat pointer" from the outer interface code, but maybe this does not really help that much?

I feel DirsMultiset isn't a HashMap, but is implemented by using a HashMap.
For example, inner.get().unwrap() returns the refcount of the path, which
is an implementation detail.
So I think it's better to define methods explicitly instead of proxying
everything to HashMap.

I see your point. I'll follow up later today.