( )⚙ D2392 debugcommands: add debugwireproto command

This is an archive of the discontinued Mercurial Phabricator instance.

debugcommands: add debugwireproto command
ClosedPublic

Authored by indygreg on Feb 22 2018, 7:50 PM.

Details

Summary

We currently don't have a low-level mechanism for sending
arbitrary wire protocol commands. Having a generic and robust
mechanism for sending wire protocol commands, examining wire
data, etc would make it vastly easier to test the wire protocol
and debug server operation. This is a problem I've wanted a
solution for numerous times, especially recently as I've been
hacking on a new version of the wire protocol.

This commit establishes a hg debugwireproto command for sending
data to a peer.

The command invents a mini language for specifying actions to take.
This will enable a lot of flexibility for issuing commands and testing
variations for how commands are sent.

Right now, we only support low-level raw sends and receives. These
are probably the least valuable commands to intended users of this
command. But they are the most useful commands to implement to
bootstrap the feature (I've chosen to reimplement test-ssh-proto.t
using this command to prove its usefulness).

My eventual goal of hg debugwireproto is to allow calling wire
protocol commands with a human-friendly interface. Essentially,
people can type in a command name and arguments and
hg debugwireproto will figure out how to send that on the wire.
I'd love to eventually be able to save the server's raw response
to a file. This would allow us to e.g. call "getbundle" wire
protocol commands easily.

test-ssh-proto.t has been updated to use the new command in lieu
of piping directly to a server process. As part of the transition,
test behavior improved. Before, we piped all request data to the
server at once. Now, we have explicit control over the ordering of
operations. e.g. we can send one command, receive its response,
then send another command. This will allow us to more robustly
test race conditions, buffering behavior, etc.

There were some subtle changes in test behavior. For example,
previous behavior would often send trailing newlines to the server.
The new mechanism doesn't treat literal newlines specially and
requires newlines be escaped in the payload.

Because the new logging code is very low level, it is easy to
introduce race conditions in tests. For example, the number of bytes
returned by a read() may vary depending on load. This is why tests
make heavy use of "readline" for consuming data: the result of
that operation should be deterministic and not subject to race
conditions. There are still some uses of "readavailable." However,
those are only for reading from stderr. I was able to reproduce
timing issues with my system under load when using "readavailable"
globally. But if I "readline" to grab stdout, "readavailable"
appears to work deterministically for stderr. I think this is
because the server writes to stderr first. As long as the OS
delivers writes to pipes in the same order they were made, this
should work. If there are timing issues, we can introduce a
mechanism to readline from stderr.

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

indygreg created this revision.Feb 22 2018, 7:50 PM
indygreg added a subscriber: sid0.Feb 22 2018, 8:09 PM

@sid0: I reckon we could find a way to take what I did in this commit and turn it into a wire protocol conformance tester. i.e. if we could refactor the test so it instantiates the server via alternate mechanisms, we could point the test at a separate server implementation as a means to tease out implementation differences. Just thought you'd like to know in case Mononoke would be interested in this.

indygreg planned changes to this revision.Feb 23 2018, 9:34 PM
indygreg added inline comments.
mercurial/debugcommands.py
2630

I'm having second thoughts about this.

The reason is that from a testing perspective (which is the primary driver behind this work), read() isn't very reliable because of timing issues. Depending on operating system settings, system performance, etc, operations like read(-1) can return a variable number of bytes because they return only what's available on the wire.

write(), however, is more reliable. When you write() to something in Python, Python makes as many system calls as necessary to ensure all bytes are delivered. So a write() at the Python level is mostly deterministic.

I think the concerns around process separation here aren't that significant. So I think I'm going to rework this (yet again) to spawn the SSH server in process and to only monitor I/O operations that are deterministic. This may mean only monitoring write() calls on pipes and *possibly* monitoring readline() and read(N). But if we monitor write() on both peers since they are both in-process, then read() monitoring is redundant. That could be useful to debug behavior. But for tests demonstrating the wire protocol exchange, it's less useful.

indygreg edited the summary of this revision. (Show Details)Feb 26 2018, 4:16 PM
indygreg updated this revision to Diff 6139.
yuja requested changes to this revision.Mar 1 2018, 6:04 AM
yuja added a subscriber: yuja.

This is OT, but I couldn't find a way to phabread this long series. :D2462 (the topmost patch
in yadda) appeared to lack many changes, and :D2392 missed some dependency patches.
And I gave up.

mercurial/debugcommands.py
2631

Perhaps util.hgcmd() is more appropriate because we don't run the command
by the system shell.

2677

util.unescapestr() can be used if we want to allow only string literals.

This revision now requires changes to proceed.Mar 1 2018, 6:04 AM
In D2392#40790, @yuja wrote:

This is OT, but I couldn't find a way to phabread this long series. :D2462 (the topmost patch
in yadda) appeared to lack many changes, and :D2392 missed some dependency patches.
And I gave up.

hg phabsend doesn't appear to update parent revisions when re-submitting???

Anyway, I updated parent revisions manually through the web UI and I think things are all cleaned up.

indygreg updated this revision to Diff 6250.Mar 1 2018, 11:30 AM
durin42 accepted this revision.Mar 1 2018, 4:35 PM
This revision was automatically updated to reflect the committed changes.