This is an archive of the discontinued Mercurial Phabricator instance.

wireprotov2: define semantics for content redirects
Closed, Public

Authored by indygreg on Sep 26 2018, 9:09 PM.

Details

Summary

When I implemented the clonebundles feature and deployed it on
hg.mozilla.org using Amazon S3 as a content server, server-side CPU
and bandwidth usage dropped off a cliff, and a ton of server scaling
headaches went away almost as soon as clonebundles-capable clients
were rolled out to Firefox CI.

An obvious takeaway from that experience was that offloading server
load to scalable file servers - potentially backed by a CDN - is a
really good idea. Another takeaway was that Mercurial's wire protocol
wasn't in a good position to support data offload generally.

Wire protocol version 1 has no mechanism for saying "grab the data
from over here instead." For HTTP, we could teach the client to follow
HTTP redirects, or we could invent a media type that encodes redirects
inline. But for SSH we were pretty much out of luck, because that
protocol wasn't very flexible.

Wire protocol version 2 offers the opportunity to do something better.

The recent generic server-side content caching layer in the wire
protocol version 2 server demonstrated that it is possible to have
drop-in caching of responses to command requests. This by itself
adds tons of value and already makes the built-in server much more
scalable. But I don't want to stop there.

The existing server-side caching implementation has a big weakness:
it requires the server to send data to the client. This means that
the Mercurial server is potentially sending gigabytes of data to
thousands of clients. This is problematic because compared to scaling
static file servers, scaling dynamic servers is *hard*.

A solution to this is to "offload" serving of content to something
that isn't the Mercurial server. By offloading content serving, you
turn the Mercurial server from a centralized, monolithic service into
a distributed, mostly-indexing service. Assuming high rates of content
offload, this should drastically reduce the total work performed by
the Mercurial server, both in terms of CPU and data transfer. This
will make Mercurial servers vastly easier to scale.

This commit defines the semantics for "content redirects" in wire
protocol version 2. Essentially:

  • Servers advertise the set of locations a response could be served from.
  • When making requests, clients advertise the set of locations they are willing to fetch content from.
  • Servers can then replace the inline response with one that says "get the response from over here instead" (see the sketch below).
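
To make that list more concrete, here is a rough, non-authoritative
sketch of the negotiation in terms of the CBOR-style maps that wire
protocol version 2 exchanges. The field names, the command name and
the "cdn" target below are placeholders invented for the example, not
a definitive description of the encoding this series settles on.

  # Illustrative only: these dictionaries sketch the shape of the
  # redirect negotiation. Field names and values are placeholders.

  # 1) The server advertises the redirect targets it can serve
  #    content from, e.g. a CDN-backed blob store.
  server_capabilities = {
      b'redirect': {
          b'targets': [
              {
                  b'name': b'cdn',
                  b'protocol': b'https',
                  b'uris': [b'https://cdn.example.com/hg/'],
              },
          ],
      },
  }

  # 2) When making a request, the client says which of the advertised
  #    targets it is willing to fetch content from.
  command_request = {
      b'name': b'somecommand',   # hypothetical command name
      b'args': {},               # command arguments omitted
      b'redirect': {
          b'targets': [b'cdn'],
      },
  }

  # 3) Instead of sending the payload inline, the server may reply
  #    with a small object telling the client where to fetch the
  #    real content.
  redirected_response = {
      b'location': b'https://cdn.example.com/hg/cache/KEY',  # KEY derived from the request
      b'size': 123456789,
  }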

This feature - when fully implemented - will allow extending the
server-side caching layer to facilitate such things as integrating
your server-side cache with a scalable blob store (such as S3 or
a CDN) and offloading most data transfer to that external service.

This feature could also be leveraged for load balancing. For example,
requests could come into a central server and then get redirected to
an available mirror depending on server availability or locality.
There's tons of potential :)

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

indygreg created this revision. Sep 26 2018, 9:09 PM

@sheehan: patches in this series implement content caching and "redirect" support for Mercurial servers. We'll want to implement an S3-based content cache for use at Mozilla so we can cache things in S3 and then make that data available by CDN, just like we do with clonebundles. You expressed interest in possibly authoring that extension. So if you want to do that, there should be enough in this series to get going. It will probably help to look at tests/wireprotosimplecache.py (added later in this series) to get an idea for what a cache extension looks like. An MVP S3-backed cache should take fewer than 100 lines of code. Adding support for e.g. IP-based filtering so it can target the local AWS region's S3 bucket (like what we do for clonebundles at Mozilla) could take a bit more work.
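
For anyone picking this up, below is a rough, hedged sketch of what
the storage half of such an S3-backed cache extension might look like.
The S3Cache class, its lookup()/store() method names and the bucket
layout are assumptions made purely for illustration; the actual cacher
interface an extension must implement is defined by this series
(tests/wireprotosimplecache.py has a working in-memory example), and
only the boto3 calls are standard.

  # Minimal, illustrative sketch of the storage half of an S3-backed
  # cache. The S3Cache class and its method names are placeholders,
  # not the cacher interface Mercurial actually expects; see
  # tests/wireprotosimplecache.py for a working in-memory example of
  # that interface.
  import boto3

  class S3Cache(object):
      def __init__(self, bucket, prefix='hgcache/'):
          self._s3 = boto3.client('s3')
          self._bucket = bucket
          self._prefix = prefix

      def _objectkey(self, cachekey):
          # The server derives a cache key from the command request;
          # we simply namespace it inside the bucket.
          return self._prefix + cachekey

      def lookup(self, cachekey):
          """Return the cached payload for ``cachekey``, or None on a miss."""
          try:
              res = self._s3.get_object(Bucket=self._bucket,
                                        Key=self._objectkey(cachekey))
          except self._s3.exceptions.NoSuchKey:
              return None
          return res['Body'].read()

      def store(self, cachekey, data):
          """Store ``data`` so later requests can be served from the bucket."""
          self._s3.put_object(Bucket=self._bucket,
                              Key=self._objectkey(cachekey),
                              Body=data)

Pairing a bucket like this with a CDN in front of it and advertising
the CDN URL as a redirect target is where the data offload described
in the summary would come from.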

This revision was automatically updated to reflect the committed changes.