This is an archive of the discontinued Mercurial Phabricator instance.

wireproto: define human output side channel frame
ClosedPublic

Authored by indygreg on Mar 15 2018, 1:19 AM.

Details

Summary

Currently, the SSH protocol delivers output tailored for people over
the stderr file descriptor. The HTTP protocol doesn't have this
file descriptor (because it only has an input and output pipe). So
it encodes textual output intended for humans within the protocol
responses. Some response types have a facility for capturing output
to be printed to users. Some don't. And sometimes the implementation
of how that output is conveyed is super hacky.

On top of that, bundle2 has an "output" part that is used to store
output that should be printed when this part is encountered.
bundle2 also has the concept of "interrupt" chunks, which can be
used to signal that the regular bundle2 stream is to be
preempted by an out-of-band part that should be processed immediately.
This "interrupt" part can be an "output" part and can be used to
print data on the receiver.

The status quo is inconsistent and insane. We can do better.

This commit introduces a dedicated frame type on the frame-based
protocol for denoting textual data that should be printed on the
receiver. This frame type effectively constitutes a side-channel
by which textual data can be printed on the receiver without
interfering with other in-progress transmissions, such as the
transmission of command responses.

But wait - there's more! Previous implementations that transferred
textual data basically instructed the client to "print these bytes."
This suffered from a few problems.

First, the text data that was transmitted and eventually printed
originated from a server with a specific i18n configuration. This
meant that clients would see text using whatever the i18n settings
were on the server. Someone in France could connect to a server in
Japan and see illegible Japanese glyphs - or maybe even mojibake.

Second, the normalization of all text data originating on servers
resulted in the loss of the ability to apply formatting to that
data. Local Mercurial clients can apply specific formatting
settings to individual atoms of text. For example, a revision can
be colored differently from a commit message. With data over the
wire, the potential for this rich formatting was lost. The best you
could do (without parsing the text to be printed) was to apply a
universal label to it and e.g. color it specially.

The new mechanism for instructing the peer to print data does
not have these limitations.

Frames instructing the peer to print text are composed of a
formatting string plus arguments. This means receivers can
plug the formatting string into the i18n database to see if a local
translation is available. In addition, each atom to be printed
has a series of "labels" associated with it. These labels
can be mapped to the Mercurial UI's labels so locally configured
coloring, styling, etc. settings can be applied.
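
As a rough illustration of the receiver side described above, here is a
minimal sketch in Python. It is not the encoding or API added by this
commit: the atom layout (formatting string, arguments, labels), the
CATALOG_FR and STYLES tables, and the render_atom / render_frame helpers
are all hypothetical stand-ins for the client's i18n database and label
configuration.

```python
# Hypothetical sketch; none of these names come from Mercurial itself.

# Stand-in for the client's i18n database: formatting string -> translation.
CATALOG_FR = {
    "adding %s changesets\n": "ajout de %s changesets\n",
}

# Stand-in for the client's label -> terminal style configuration.
STYLES = {"ui.status": "\033[32m"}
RESET = "\033[0m"


def render_atom(fmt, args, labels, catalog, styles):
    """Localize one atom, substitute its arguments, then apply label styles."""
    # Use the local translation if one exists, else the original string.
    localized = catalog.get(fmt, fmt)
    text = localized % tuple(args)
    # Only labels the client knows about contribute styling.
    prefix = "".join(styles[label] for label in labels if label in styles)
    return prefix + text + RESET if prefix else text


def render_frame(atoms, catalog, styles):
    """Render every (fmt, args, labels) atom in a frame, in order."""
    return "".join(
        render_atom(fmt, args, labels, catalog, styles)
        for fmt, args, labels in atoms
    )
```

Because the formatting string itself travels over the wire, a client
without a matching translation simply falls back to the original text,
and a client with no style configured for a label prints the text
unstyled.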

What this all means is that textual messages originating on servers
can be localized on the client and richly formatted, all while
respecting the client's settings. This is slightly more complicated
than "print these bytes." But it is vastly more user friendly.

FWIW, I'm not aware of other protocols that attempt to encode
i18n and textual styling in this manner. You could level the
claim that this feature is over-engineered. However, if I were to
sit in the shoes of a non-English speaker learning how to use
version control, I think I would *love* this feature because
it would enable me to see richly formatted text in my chosen
locale.

Anyway, we only implement support for encoding frames of this
type and basic tests for that encoding. We'll still need to
hook up the server and its ui instance to emit these frames.
I recognize this feature may be a bit more controversial than
other aspects of the wire protocol because it is a bit
"radical." So I figured I'd start small to test the waters and
see if others feel this feature is worthwhile.

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

indygreg created this revision. Mar 15 2018, 1:19 AM
indygreg updated this revision to Diff 7148. Mar 19 2018, 7:59 PM
durin42 accepted this revision. Mar 21 2018, 9:33 PM
This revision is now accepted and ready to land. Mar 21 2018, 9:33 PM
This revision was automatically updated to reflect the committed changes.
yuja added a subscriber: yuja. Mar 25 2018, 12:52 AM
yuja added inline comments.
mercurial/wireprotoframing.py, line 318

It's probably better to require everything to be ASCII if the formatting string is supposed to be fed to _().

It's a disaster to mix utf-8 bytes and local-encoding bytes in the codebase.