This is an archive of the discontinued Mercurial Phabricator instance.

contrib: new script to read events from a named pipe and emit catapult traces
ClosedPublic

Authored by durin42 on Aug 21 2018, 5:39 PM.

Details

Summary

I'm starting to get more serious about getting some insight into where
we're spending our time, both in hg itself but also in the test
suite. As a first pass, I'm going to try and produce catapult
traces[0] that can be viewed with Chrome's about:tracing tool.

0: https://docs.google.com/document/d/1CvAClvFfyA5R-PhYUmn5OOQtYMH4h6I0nSsKchNAySU/edit#heading=h.nso4gcezn7n1
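As background, a catapult trace is a JSON document in the Trace Event Format that about:tracing can load directly. A minimal sketch (the event names here are illustrative, not the ones catapipe.py emits): each event carries a name, a phase ("B" for begin, "E" for end, or "X" for a complete event), a timestamp in microseconds, and process/thread ids.

```python
import json

# A hypothetical pair of begin/end events for one operation.
# "ts" is microseconds relative to the start of the trace.
events = [
    {"name": "example-op", "ph": "B", "ts": 0, "pid": 1, "tid": 1},
    {"name": "example-op", "ph": "E", "ts": 12345, "pid": 1, "tid": 1},
]

with open("trace.json", "w") as f:
    json.dump({"traceEvents": events}, f)
```

The resulting trace.json can be opened from Chrome's about:tracing "Load" button.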

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

durin42 created this revision. Aug 21 2018, 5:39 PM

This series is RFC-ish: I'd very much like to land it (and find additional things we could have emit events!), but I also am not really in love with any of the names involved.

It looks like it'll be useful to debug some slowness problems at Google, at the very least.

Overall this series seems pretty reasonable and exciting! You say it is RFC-ish. But aside from the possibly-too-early import in hg, I'm tempted to queue this.


That's honestly fine with me, with the addendum that the script-level trace event is pretty critical, because otherwise we can't identify _anything_ that happens before dispatch.run(), and that turns out to be a fair amount.


Oh, and I'm really not in love with the catapipe.py name - got any suggestions for something better?


Nothing definitive. But since this is related to tracing, maybe have tracing in the name?

I'm a bit worried about using wall clocks computed from the trace reader here. But none of this has BC concerns, so we could change that.

I'll look at the remainder of the series and consider queuing this. Perfect is very much the enemy of done here.

contrib/catapipe.py
58–60

This is taking the start time before the readline(). So, the time will be misreported if it takes a while for an event to arrive.

My experience with profiling is that attempting to measure time when buffers (like pipes) are involved will result in various data misrepresentation. I find it best to record times in the producer and propagate those to consumers. Of course, this introduces a system call in the producer to get the current time, which adds overhead.
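The producer-side approach the comment describes could look roughly like this (a sketch only; the `emit` helper and the line format are hypothetical, not the actual catapipe.py protocol): the producer stamps each event with its own clock before writing to the pipe, so buffering between producer and consumer cannot skew the recorded time.

```python
import time

def emit(pipe, kind, event):
    # Take the timestamp in the producer, immediately before writing,
    # so any delay in the pipe or in the reader does not distort it.
    ts = time.time()
    pipe.write('%s %s %f\n' % (kind, event, ts))
    pipe.flush()  # push the line through the pipe promptly
```

The consumer then parses the timestamp out of each line instead of calling its own clock, at the cost of one extra system call per event on the producer side.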

63

Do we need to use wall time here? A monotonic timer would be preferred for computing deltas. The best we have in Python 2 is time.clock(). If we use Python 3, we could use time.perf_counter().
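The monotonic-timer point can be illustrated with a small sketch: deltas computed from `time.perf_counter()` (Python 3) are guaranteed non-negative, whereas wall-clock time can jump backward if the system clock is adjusted mid-measurement.

```python
import time

# Measure an elapsed interval with a monotonic, high-resolution clock.
# Unlike time.time(), this cannot go backward under NTP adjustments.
start = time.perf_counter()
total = sum(range(100000))  # stand-in for the work being timed
elapsed = time.perf_counter() - start
```

`time.monotonic()` would also work here; `perf_counter()` simply has the highest available resolution.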

This revision was automatically updated to reflect the committed changes.

This is taking the start time before the readline(). So, the time will be misreported if it takes a while for an event to arrive.

Gloriously, this doesn’t actually matter! As long as we have a fixed reference time for the entire event stream it isn’t a big deal, and the trace viewer makes that easy enough to understand.
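The "fixed reference time" idea can be sketched as follows (names are illustrative): if every timestamp in the trace is expressed as an offset from one reference instant, the absolute value of that reference never appears in the output, so it does not matter where the reference falls relative to the first readline().

```python
import time

# One reference instant for the whole event stream.
reference = time.time()

def offset_us(ts):
    # Catapult timestamps are microseconds relative to trace start;
    # only the offset from `reference` survives into the trace.
    return int((ts - reference) * 1e6)
```

Every event in the stream shifts by the same constant, so relative durations between events are preserved exactly.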