This is an archive of the discontinued Mercurial Phabricator instance.

pytype: add a (very slow) test that executes pytype
Needs RevisionPublic

Authored by durin42 on Nov 6 2019, 5:59 PM.

Details

Reviewers
indygreg
marmoute
Group Reviewers
hg-reviewers
Summary

This doesn't actually pass yet, but I wanted to share it early so
others can play with pytype.

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

durin42 created this revision.Nov 6 2019, 5:59 PM
indygreg requested changes to this revision.Nov 8 2019, 11:52 AM
indygreg added a subscriber: indygreg.

This can't land per commit message, so setting status accordingly.

I'm extremely supportive of adding type checking, whether it be pytype or mypy. I don't think it matters much. And TBH I wouldn't be surprised if we end up with both once we drop Python 2 support and start using inline annotations heavily!

As for how to start testing, I think we should do this like Python 3 and try to make individual files "clean." Maintain a list of clean files and we ratchet from there.

Since type checking is slow (but there are state files we can reuse to speed things up), we'll need to figure out how to make this work in CI. But I have no doubt we can figure something out. Out of curiosity, how long does pytype take to run in a clean source directory, without any state files?

This revision now requires changes to proceed.Nov 8 2019, 11:52 AM
dlax added a subscriber: dlax.Nov 11 2019, 12:41 PM
mharbison72 added inline comments.
tests/test-check-pytype.t
23

Needs a trailing slash here.

durin42 updated this revision to Diff 18242.Nov 19 2019, 1:13 PM

Since type checking is slow (but there are state files we can reuse to speed things up), we'll need to figure out how to make this work in CI. But I have no doubt we can figure something out. Out of curiosity, how long does pytype take to run in a clean source directory, without any state files?

It takes 15-20 minutes on a 2018 Mac mini. In addition to the disabled files here, I've disabled:

mercurial/httppeer.py
mercurial/repoview.py
mercurial/localrepo.py
mercurial/revlog.py
mercurial/merge.py
mercurial/testing/storage.py

But I've also added back these, and sprinkled various disable comments around:

mercurial/bundlerepo.py
mercurial/error.py
mercurial/exchange.py
mercurial/policy.py
mercurial/pycompat.py
mercurial/scmwindows.py
mercurial/windows.py
mercurial/wireprotoframing.py

Even with the state files, it seems to take 10-15 minutes or so.

Since type checking is slow (but there are state files we can reuse to speed things up), we'll need to figure out how to make this work in CI. But I have no doubt we can figure something out. Out of curiosity, how long does pytype take to run in a clean source directory, without any state files?

It takes 15-20 minutes on a 2018 Mac mini. In addition to the disabled files here, I've disabled:

mercurial/httppeer.py
mercurial/repoview.py
mercurial/localrepo.py
mercurial/revlog.py
mercurial/merge.py
mercurial/testing/storage.py

But I've also added back these, and sprinkled various disable comments around:

mercurial/bundlerepo.py
mercurial/error.py
mercurial/exchange.py
mercurial/policy.py
mercurial/pycompat.py
mercurial/scmwindows.py
mercurial/windows.py
mercurial/wireprotoframing.py

Even with the state files, it seems to take 10-15 minutes or so.

Ooh, can you send your extra suppressions? For the most part I've been tackling this as a side project when I'm waiting on compiles.

As to doing individual files: that's well and good, and might be a viable approach for mypy. That said, my sense has been that mypy is faster and significantly less useful, since we don't get cross-file analysis of what's going on. I _think_ pytype gets faster as we add more annotations, so my rough thought thus far as been to try and get some subset of the codebase clean, transitively, and then try to start stamping some types on important functions and interfaces to constrain things. Where I'd _really_ like to get is to be able to migrate the @command decorators and friends to use strings instead of bytes, and be able to recommend typechecking as a way for extension authors to clean things up, but that'll take a while.

This seems useful, and the stabilisation seems to take a long while. Maybe we could take it in its current state (with "expected" broken output) and iterate from there. With the test in the repo.

What do you all think ?

quark added a subscriber: quark.EditedJan 31 2020, 12:15 AM

Have pyre been considered? It seems pyre only takes 10 seconds to check all the .py files.

EDIT: 10s on a machine with many CPU cores. 1 minute on laptop.

In D7295#118785, @quark wrote:

Have pyre been considered?

No, initially because I'd never heard of it, but now because I don't want to deal with ocaml. :)

It seems pyre only takes 10 seconds to check all the .py files.
EDIT: 10s on a machine with many CPU cores. 1 minute on laptop.

I'm dubious that it's doing anywhere near the checking of pytype if it manages to run to completion without errors. What does the output look like?

quark added a comment.EditedJan 31 2020, 2:58 PM

No, initially because I'd never heard of it, but now because I don't want to deal with ocaml. :)

The pip install version ships with a pre-compiled binary so an ocaml compiler is not needed.

I'm dubious that it's doing anywhere near the checking of pytype if it manages to run to completion without errors. What does the output look like?

Good question. I wasn't aware about the differences. I read pytype's README and the examples apply to pyre too. Pytype does type inference while pyre only checks functions that already have type annotations.

Pyre does complete (output). I wasn't able to get pytype past 44/251 (output). For files with type annotations like mail.py, it seems pyre output is useful:

# on commit d844202324924919bc517691052d39c520e077eb
mercurial/mail.py:57:4 Inconsistent override [15]: `mail.STARTTLS.starttls` overrides method defined in `smtplib.SMTP` inconsistently. The overriding method is not annotated but should return a subtype of `typing.Tuple[int, bytes]`.
mercurial/mail.py:387:22 Incompatible variable type [9]: charsets is declared to have type `List[str]` but is used as type `None`.
mercurial/mail.py:392:28 Incompatible parameter type [6]: Expected `bytes` for 2nd anonymous parameter to call `_encode` but got `Union[bytes, str]`.
mercurial/mail.py:394:33 Incompatible parameter type [6]: Expected `bytes` for 1st anonymous parameter to call `encoding.strfromlocal` but got `Union[bytes, str]`.
mercurial/mail.py:397:35 Incompatible variable type [9]: charsets is declared to have type `List[str]` but is used as type `None`.
mercurial/mail.py:399:4 Incompatible variable type [9]: addr is declared to have type `str` but is used as type `bytes`.
mercurial/mail.py:402:30 Incompatible parameter type [6]: Expected `typing.Optional[str]` for 1st anonymous parameter to call `str.split` but got `bytes`.
mercurial/mail.py:405:8 Incompatible variable type [9]: addr is declared to have type `str` but is used as type `bytes`.
mercurial/mail.py:414:63 Incompatible parameter type [6]: Expected `bytes` for 1st anonymous parameter to call `encoding.strfromlocal` but got `str`.
mercurial/mail.py:417:31 Incompatible variable type [9]: charsets is declared to have type `List[str]` but is used as type `None`.
mercurial/mail.py:426:30 Incompatible variable type [9]: charsets is declared to have type `List[str]` but is used as type `None`.
mercurial/mail.py:446:22 Incompatible variable type [9]: charsets is declared to have type `List[str]` but is used as type `None`.
mercurial/mail.py:499:52 Incompatible parameter type [6]: Expected `Union[email.header.Header, str]` for 1st anonymous parameter to call `email.header.decode_header` but got `Union[bytes, email.header.Header]`.

It seems to me that a combined approach might be interesting - use pytype to generate pyi files and pyre for faster checking.

marmoute requested changes to this revision.Apr 17 2020, 2:36 PM

According to IRC discussion with Augie, this is not ready yet.

This revision now requires changes to proceed.Apr 17 2020, 2:36 PM