This is an archive of the discontinued Mercurial Phabricator instance.

Differential D4342

contrib: new script to read events from a named pipe and emit catapult traces
ClosedPublic

Authored by durin42 on Aug 21 2018, 5:39 PM.

Details

Reviewers

None

Group Reviewers

hg-reviewers

Commits

rHG9a81f126f9fa: contrib: new script to read events from a named pipe and emit catapult traces

Summary

I'm starting to get more serious about getting some insight into where
we're spending our time, both in hg itself but also in the test
suite. As a first pass, I'm going to try and produce catapult
traces[0] that can be viewed with Chrome's about:tracing tool.

0: https://docs.google.com/document/d/1CvAClvFfyA5R-PhYUmn5OOQtYMH4h6I0nSsKchNAySU/edit#heading=h.nso4gcezn7n1

Diff Detail

Repository

rHG Mercurial

Lint

Lint Skipped

Unit

Unit Tests Skipped

Event Timeline

durin42 created this revision. · Aug 21 2018, 5:39 PM

Herald added a reviewer: hg-reviewers. · Aug 21 2018, 5:39 PM

Herald added a subscriber: mercurial-devel.

durin42 added a child revision: D4343: tests: add support for emitting trace events to run-tests. · Aug 21 2018, 5:39 PM

This series is RFC-ish: I'd very much like to land it (and find additional things we could have emit events!), but I also am not really in love with any of the names involved.

It looks like it'll be useful to debug some slowness problems at Google, at the very least.

Overall this series seems pretty reasonable and exciting! You say it is RFC-ish. But aside from the possibly-too-early import in hg, I'm tempted to queue this.

In D4342#66808, @indygreg wrote:

Overall this series seems pretty reasonable and exciting! You say it is RFC-ish. But aside from the possibly-too-early import in hg, I'm tempted to queue this.

That's honestly fine with me, with the addendum that the script-level trace event is pretty critical, because otherwise we can't identify _anything_ that happens before dispatch.run(), and that turns out to be a fair amount.

In D4342#66810, @durin42 wrote:

In D4342#66808, @indygreg wrote:

Overall this series seems pretty reasonable and exciting! You say it is RFC-ish. But aside from the possibly-too-early import in hg, I'm tempted to queue this.

That's honestly fine with me, with the addendum that the script-level trace event is pretty critical, because otherwise we can't identify _anything_ that happens before dispatch.run(), and that turns out to be a fair amount.

Oh, and I'm really not in love with the catapipe.py name - got any suggestions for something better?

In D4342#66811, @durin42 wrote:

Oh, and I'm really not in love with the catapipe.py name - got any suggestions for something better?

Nothing definitive. But since this is related to tracing, maybe have tracing in the name?

I'm a bit worried about using wall clocks computed from the trace reader here. But none of this has BC concerns, so we could change that.

I'll look at the remainder of the series and consider queuing this. Perfect is very much the enemy of done here.

contrib/catapipe.py
57–59	This is taking the start time before the `readline()`. So, the time will be misreported if it takes a while for an event to arrive. My experience with profiling is that attempting to measure time when buffers (like pipes) are involved will result in various data misrepresentation. I find it best to record times in the producer and propagate those to consumers. Of course, this introduces a system call in the producer to get the current time, which adds overhead.
62	Do we need to use wall time here? A monotonic timer would be preferred for computing deltas. The best we have in Python 2 is `time.clock()`. If we use Python 3, we could use `time.perf_counter()`.
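
As a rough illustration of the monotonic-timer suggestion, here is a minimal sketch assuming Python 3, where `time.perf_counter()` is available. The helper name `elapsed_micros` is hypothetical and is not part of catapipe.py.

```python
import time


def elapsed_micros(start, now):
    """Convert two time.perf_counter() readings into elapsed microseconds."""
    return (now - start) * 1000000


# perf_counter() never goes backwards, unlike the wall clock, which can be
# stepped in either direction by NTP or a manual adjustment.
start = time.perf_counter()
time.sleep(0.01)
now = time.perf_counter()

# Same shape as the ts_micros value the script computes from datetime deltas.
ts_micros = elapsed_micros(start, now)
```

The readings are only meaningful as differences within one process, which is all the reader needs in order to compute offsets from its own start time.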

indygreg mentioned this in D4344: tracing: new module to make tracing events in hg easier. · Aug 24 2018, 1:02 PM

Closed by commit rHG9a81f126f9fa: contrib: new script to read events from a named pipe and emit catapult traces (authored by durin42). · Aug 24 2018, 1:14 PM

This revision was automatically updated to reflect the committed changes.

This is taking the start time before the readline(). So, the time will be misreported if it takes a while for an event to arrive.

Gloriously, this doesn’t actually matter! As long as we have a fixed reference time for the entire event stream it isn’t a big deal, and the trace viewer makes that easy enough to understand.
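
One way to read the point about a fixed reference time: the spacing between a begin event and its matching end event is a difference of two timestamps, so a constant offset cancels out. A small sketch with made-up arrival times follows; `to_trace_ts` is a hypothetical helper, not code from the patch.

```python
def to_trace_ts(arrivals, reference):
    """Return microsecond timestamps relative to a fixed reference time."""
    return [(t - reference) * 1000000 for t in arrivals]


# Made-up arrival times, in seconds, for a START event and its END event.
arrivals = [12.0, 12.5]

# Reference taken well before the first event vs. shortly before it.
early = to_trace_ts(arrivals, reference=4.0)
late = to_trace_ts(arrivals, reference=11.0)

# The absolute timestamps differ, but the span has the same duration.
duration_early = early[1] - early[0]
duration_late = late[1] - late[0]
```

This only covers the choice of reference point. It does not address delays introduced by the pipe itself, which is the separate concern about buffering raised in the inline comment.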

Revision Contents
Changeset List

			Path	Packages
A	M		contrib/catapipe.py (85 lines)

Commit	Parents	Author	Summary	Date
		Augie Fackler		Aug 21 2018, 3:01 PM

Status	Author	Revision
Closed	durin42	D4350 util: make timedcm require the label
Closed	durin42	D4349 cleanup: make all uses of timedcm specify what they're timing
Closed	durin42	D4348 util: make timedcm context manager also emit trace events
Closed	durin42	D4347 demandimport: instrument python 2 code with trace events
Closed	durin42	D4346 hg: wrap the highest layer in the `hg` script possible in trace event
Closed	durin42	D4345 dispatch: have dispatch.dispatch and dispatch._runcatch emit trace events
Closed	durin42	D4344 tracing: new module to make tracing events in hg easier
Closed	durin42	D4343 tests: add support for emitting trace events to run-tests
Closed	durin42	D4342 contrib: new script to read events from a named pipe and emit catapult traces

Diff 10492

contrib/catapipe.py

This file was added.

    #!/usr/bin/env python3
    #
    # Copyright 2018 Google LLC.
    #
    # This software may be used and distributed according to the terms of the
    # GNU General Public License version 2 or any later version.
    """Tool read primitive events from a pipe to produce a catapult trace.

    For now the event stream supports

      START $SESSIONID ...

    and

      END $SESSIONID ...

    events. Everything after the SESSIONID (which must not contain spaces)
    is used as a label for the event. Events are timestamped as of when
    they arrive in this process and are then used to produce catapult
    traces that can be loaded in Chrome's about:tracing utility. It's
    important that the event stream into this process stay simple,
    because we have to emit it from the shell scripts produced by
    run-tests.py.

    Typically you'll want to place the path to the named pipe in the
    HGCATAPULTSERVERPIPE environment variable, which both run-tests and hg
    understand.
    """
    from __future__ import absolute_import, print_function

    import argparse
    import datetime
    import json
    import os

    _TYPEMAP = {
        'START': 'B',
        'END': 'E',
    }

    _threadmap = {}

    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument('pipe', type=str, nargs=1,
                            help='Path of named pipe to create and listen on.')
        parser.add_argument('output', default='trace.json', type=str, nargs='?',
                            help='Path of named pipe to create and listen on.')
        parser.add_argument('--debug', default=False, action='store_true',
                            help='Print useful debug messages')
        args = parser.parse_args()
        fn = args.pipe[0]
        os.mkfifo(fn)
        try:
            with open(fn) as f, open(args.output, 'w') as out:
                out.write('[\n')
                start = datetime.datetime.now()
                while True:
                    ev = f.readline().strip()
                    if not ev:
                        continue
                    now = datetime.datetime.now()
                    if args.debug:
                        print(ev)
                    verb, session, label = ev.split(' ', 2)
                    if session not in _threadmap:
                        _threadmap[session] = len(_threadmap)
                    pid = _threadmap[session]
                    ts_micros = (now - start).total_seconds() * 1000000
                    out.write(json.dumps(
                        {
                            "name": label,
                            "cat": "misc",
                            "ph": _TYPEMAP[verb],
                            "ts": ts_micros,
                            "pid": pid,
                            "tid": 1,
                            "args": {}
                        }))
                    out.write(',\n')
        finally:
            os.unlink(fn)

    if __name__ == '__main__':
        main()
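
For context on the producer side, the following is a minimal sketch of how a Python process might emit the `START`/`END` lines described in the docstring. It is not the code from this series: `trace_event` and `open_trace_pipe` are hypothetical names, and only the line format and the `HGCATAPULTSERVERPIPE` variable come from the script above.

```python
import contextlib
import os


@contextlib.contextmanager
def trace_event(pipe, session, label):
    """Write a START line on entry and a matching END line on exit.

    Hypothetical helper: `pipe` is any writable text file object, and
    `session` must not contain spaces, per the catapipe.py docstring.
    """
    pipe.write('START %s %s\n' % (session, label))
    # Flush right away: the reader timestamps events when they arrive.
    pipe.flush()
    try:
        yield
    finally:
        pipe.write('END %s %s\n' % (session, label))
        pipe.flush()


def open_trace_pipe():
    """Open the pipe named by HGCATAPULTSERVERPIPE, or return None if unset."""
    path = os.environ.get('HGCATAPULTSERVERPIPE')
    if not path:
        return None
    # Opening a FIFO for writing blocks until a reader has it open.
    return open(path, 'w')
```

With catapipe.py listening on the pipe, wrapping a block in `with trace_event(pipe, 'mysession', 'expensive step'):` would be expected to yield one `B` and one `E` record for that label in the output file.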

Diff	ID	Description	Created	Lint	Unit
Base		Base
Diff 1	10492		Aug 21 2018, 5:39 PM	★	★
Diff 2	10551	rHG9a81f126f9fa653b078ca8bc8380eb938fe4452d	Aug 21 2018, 3:01 PM	★	★