rust-cpython: binding for LazyAncestors
ClosedPublic

Authored by gracinet on Dec 15 2018, 6:42 AM.

Details

Summary

The mercurial.rustext.ancestor module will not in the foreseeable
future be a drop-in replacement for the pure mercurial.ancestor, because the
Rust variants take the index at instantiation whereas the Python ones
take a parents function. From the Python side, using the index from ancestor
would leak internal details out of mercurial.revlog, and that's unwanted.
Therefore, given that classes defined in
rust-cpython have the same names in both language, we keep the Rust naming
convention (CamelCase).

Eventually, though, the ancestor module can be placed under control of
mercurial.policy, but it will still be up to revlog to be aware of
that and play the role of a factory for instantiation.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
gracinet created this revision.Dec 15 2018, 6:42 AM
kevincox accepted this revision.Dec 15 2018, 8:14 AM
kevincox added inline comments.
rust/hg-cpython/src/ancestors.rs
83

It should be fine with tbe GIL. RefCell is basically a single-thread lock. In this case it should be fine to have no renterance.

tests/test-rust-ancestor.py
93

list(lazy)

yuja added a subscriber: yuja.Dec 15 2018, 11:45 PM
m.add_class::<AncestorsIterator>(py)?;

+ m.add_class::<lazyancestors>(py)?;

Nit: While it's correct per our naming convention, I prefer calling it
as LazyAncestors in Rust, and export as lazyancestors.

gracinet updated this revision to Diff 12952.Dec 22 2018, 11:24 AM
kevincox accepted this revision.Dec 22 2018, 11:57 AM
kevincox added inline comments.
rust/hg-cpython/src/ancestors.rs
109
.map_err(|e| GraphError::new(py, format!("{:?}", e)))

Or even better is to implement Into so that you can use ?

gracinet added inline comments.Dec 22 2018, 2:48 PM
rust/hg-cpython/src/ancestors.rs
83

Yes, I'll remove these comments, same with the ones for RW lock, it's pretty clear to me now that such would be relevant only for code meant to be executed in a allow_threads closure.

109

Ah yes, forgot to upgrade that one, thanks. The current good spelling would be the one used in __contains__, where GraphError::pynew encloses the conversion details.

I believe implementing Into is not readily possible with the rust-cpython way of creating Python exceptions : we can't do it without the py marker (it's actually listed as one of the advantages of PyO3). And since Python<'p> is not defined by our crate, we'd need to enclose in a wrapper in the excepions module:

pub struct ExceptionConverter<'p>(pub Python<'p>, pub hg::GraphError);

impl <'p> From<ExceptionConverter<'p>> for PyErr {

    fn from(ec: ExceptionConverter<'p>) -> Self {
        GraphError::pynew(ec.0, ec.1)
    }
}

and then

def __new__(_cls, index: PyObject, initrevs: PyObject, stoprev: Revision,
            inclusive: bool) -> PyResult<Self> {
    let initvec = reviter_to_revvec(py, initrevs)?;

    let lazy = CoreLazy::new(
        Index::new(py, index)?, initvec, stoprev, inclusive).map_err(|e| ExceptionConverter(py, e))?;
    Self::create_instance(py, RefCell::new(Box::new(lazy)))
    }

This does compile (did not think hard about the namings), and I'd be surprised if it didn't the right thng, but that's a lot of effort for a result that's only a bit better. What do you think ?

yuja added a comment.Dec 23 2018, 11:57 PM

+/ The purpose of this module is to hide identifiers from other Rust users
+
/
+/ Some of the identifiers we are defining are meant for consumption
+
/ from Python with other naming conventions. For instance, py_class!
+/ does not let us make a distinction between the Python and the Rust name,
+
/ and that name ends up in particular in the __class__ attribute, so that
+/ renaming it inside its Python module is not enough.
+
/
+/ For Rust consumers, we will reexport these definitions, following the
+
/ appropriate naming convention.
+mod concealed_detail {
+ use super::*;
+
+ py_class!(pub class lazyancestors |py| {

Does the class name actually matter? Personally I don't care if
lazyancestors() function returns a LazyAncestors object. We'll anyway
need a wrapper function to make pure ancestors and rustext ancestors
compatible.

gracinet marked 2 inline comments as done.Dec 27 2018, 2:04 PM

Does the class name actually matter? Personally I don't care if
lazyancestors() function returns a LazyAncestors object. We'll anyway
need a wrapper function to make pure ancestors and rustext ancestors
compatible.

Yes, that's in line with your other comments, whereas I was pursuing the
goal of putting the whole ancestor module under responsibility of
policy.importmod : in that case, it would have been necessary to have
the same name.

I think I still prefer to put all the dispatching in the revlog module
(ie have it play the factory role) rather than put a factory function
inside ancestor. Unless you disagree, I'm going to make a new revision
going in that direction, ie

  • the class would be called LazyAncestors
  • the dispatching would stay in revlog
  • maybe for clarity, ancestor.rustlazyancestors should be renamed

and in the next series (bindings for MissingAncestors), similar
dispatching would occur in incrementalmissingrevs

For now, revlog can choose according to whether rustext is (fully)
importable (some other comment about that).

Cheers,

yuja added a comment.Dec 28 2018, 12:46 AM
> Does the class name actually matter? Personally I don't care if
>  lazyancestors() function returns a LazyAncestors object. We'll anyway
>  need a wrapper function to make pure ancestors and rustext ancestors
>  compatible.

Yes, that's in line with your other comments, whereas I was pursuing the
goal of putting the whole `ancestor` module under responsibility of
`policy.importmod` : in that case, it would have been necessary to have
the same name.

I was wondering if it would be possible to export a module attribute in all
lower-caps without renaming the type. (i.e. lazyancestors = LazyAncestors.)

Anyway, I just thought using CamelCase in Rust would be slightly nicer and
simpler. If that makes things complicated, I'll drop my idea and stick to
anything that is simpler.

I think I still prefer to put all the dispatching in the `revlog` module
(ie have it play the factory role) rather than put a factory function
inside `ancestor`. Unless you disagree, I'm going to make a new revision
going in that direction, ie

- the class would be called `LazyAncestors`
- the dispatching would stay in `revlog`

Both sound good.

  • maybe for clarity, ancestor.rustlazyancestors should be renamed

I think rustlazyancestors will be removed in favor of rust-cpython
implementation.

In D5441#81222, @yuja wrote:

Anyway, I just thought using CamelCase in Rust would be slightly nicer and
simpler. If that makes things complicated, I'll drop my idea and stick to
anything that is simpler.

I think using the Rust naming convention in Rust and the Python convention in Python is probably the best. It does make it slightly harder to grep the codebase for these things but since you would find the explicit registration linking the two names I don't think it is a major issue.

I believe implementing Into is not readily possible with the rust-cpython way of creating Python exceptions : we can't do it without the py marker

Ah right, I missed this. In this case I think the manual conversion is fine.

gracinet updated this revision to Diff 13007.Jan 4 2019, 1:30 PM
gracinet marked an inline comment as done.
gracinet edited the summary of this revision. (Show Details)

@yuja this new version exports the LazyAncestors name to Python, expecting revlog.ancestors to do the right thing (as a factory then). I agree that the destiny of the older rustlazyancestors is to disappear, but I'd prefer to wait a bit before doing that
@kevincox I'd agree in principle, but since we only have one caller, whose role is precisely to abstract those differencies, we decided that we actually don't need to maintain that convention (and rustext can ne considered a foreign module, so that this is not really a breaking of the naming convention for new classes within hg)

This revision was automatically updated to reflect the committed changes.