I'm starting to get more serious about getting some insight into where
we're spending our time, both in hg itself and in the test
suite. As a first pass, I'm going to try to produce catapult
traces[0] that can be viewed with Chrome's about:tracing tool.
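For context, a catapult trace is just a JSON file in the Trace Event format that about:tracing loads. Here's a minimal sketch of emitting one; the field names ("name", "ph", "pid", "tid", "ts") come from that format, while the event names, pid/tid values, and timestamps are made up for illustration:

```python
import json

def duration_event(name, phase, ts_us):
    # Phase "B" marks the beginning of a duration, "E" its end.
    # "ts" is in microseconds, relative to a fixed reference time.
    # pid/tid are arbitrary labels here; a real tool would use
    # actual process and thread ids.
    return {'name': name, 'ph': phase, 'pid': 1, 'tid': 1, 'ts': ts_us}

events = [
    duration_event('hg-command', 'B', 0),
    duration_event('hg-command', 'E', 250000),  # ends 250ms later
]

with open('trace.json', 'w') as fp:
    json.dump({'traceEvents': events}, fp)
```

Loading the resulting trace.json in about:tracing renders each B/E pair as a bar on a timeline.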
Details
- Reviewers: None
- Group Reviewers: hg-reviewers
- Commits: rHG9a81f126f9fa: contrib: new script to read events from a named pipe and emit catapult traces

Diff Detail
- Repository: rHG Mercurial
- Lint: Skipped
- Unit Tests: Skipped

Event Timeline
This series is RFC-ish: I'd very much like to land it (and find additional things we could have emit events!), but I also am not really in love with any of the names involved.
It looks like it'll be useful to debug some slowness problems at Google, at the very least.
Overall this series seems pretty reasonable and exciting! You say it is RFC-ish, but aside from the possibly-too-early import in hg, I'm tempted to queue this.
That's honestly fine with me, with the addendum that the script-level trace event is pretty critical, because otherwise we can't identify _anything_ that happens before dispatch.run(), and that turns out to be a fair amount.
Oh, and I'm really not in love with the catapipe.py name - got any suggestions for something better?
Nothing definitive. But since this is related to tracing, maybe have tracing in the name?
I'm a bit worried about using wall clocks computed from the trace reader here. But none of this has BC concerns, so we could change that.
I'll look at the remainder of the series and consider queuing this. Perfect is very much the enemy of done here.
contrib/catapipe.py

| Line(s) | Comment |
|---|---|
| 57–59 | This is taking the start time before the readline(), so the time will be misreported if it takes a while for an event to arrive. My experience with profiling is that attempting to measure time when buffers (like pipes) are involved results in various kinds of data misrepresentation. I find it best to record times in the producer and propagate them to consumers. Of course, this introduces a system call in the producer to get the current time, which adds overhead. |
| 62 | Do we need to use wall time here? A monotonic timer would be preferred for computing deltas. The best we have in Python 2 is time.clock(). If we use Python 3, we could use time.perf_counter(). |
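To illustrate the monotonic-timer point: wall time (time.time()) can jump backwards under NTP or manual clock adjustments, whereas time.perf_counter() on Python 3 only moves forward, so deltas computed from it are always meaningful. A small sketch:

```python
import time

# Time a piece of work with a monotonic, high-resolution counter
# (Python 3's time.perf_counter) instead of wall time.
start = time.perf_counter()
total = sum(range(100000))  # some work to time
delta = time.perf_counter() - start

# A monotonic delta is guaranteed non-negative; a wall-clock delta
# computed across a clock adjustment is not.
assert delta >= 0
```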
> This is taking the start time before the readline(). So, the time will be misreported if it takes a while for an event to arrive.

Gloriously, this doesn't actually matter! As long as we have a fixed reference time for the entire event stream it isn't a big deal, and the trace viewer makes that easy enough to understand.
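The suggestion to record times in the producer and propagate them to the consumer could look roughly like the following. The wire format ("<timestamp> <phase> <event-name>") is an assumption for illustration only, not catapipe.py's actual format:

```python
import time

def make_event_line(phase, name):
    # Producer side: one extra system call stamps the event at the
    # moment it happens, before it ever enters the pipe.
    return '%.6f %s %s\n' % (time.time(), phase, name)

def parse_event_line(line):
    # Consumer side: trust the embedded timestamp rather than calling
    # time.time() after readline(), which would fold pipe-buffering
    # delay into the measurement.
    ts, phase, name = line.split(' ', 2)
    return float(ts), phase, name.strip()

line = make_event_line('START', 'dispatch.run')
ts, phase, name = parse_event_line(line)
```

The trade-off is exactly the one noted above: the producer pays for a clock read per event, in exchange for timestamps that reflect when events occurred rather than when they were drained from the pipe.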