With wire protocol version two introducing command response caching
and enabling content redirect responses, it is possible to store
response objects in an arbitrary blob store and send clients to
the store to retrieve large responses. This commit adds an extension
which implements such wire protocol caching in Amazon S3.
Servers add their AWS access key ID and secret access key to an hgrc config,
and specify the name of the S3 bucket which holds the objects.
When a cache lookup request comes in, the cacher sends a HEAD
request to S3 which will return a 404 if the object does not
exist (i.e. a cache miss). If the request is a cache hit, a presigned
URL for the object is generated and used to issue a content
redirect response which is sent to the client. If the response
indicates a cache miss, the response is generated by the server
and buffered in the cache until onfinished is called. During
onfinished, we calculate the size of the response and can
optionally avoid caching if the response is below a configured
minimum threshold. Otherwise we insert the object into the
cache bucket using the put_object API.
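
The lookup and insert flow described above can be sketched roughly as
follows. This is an illustrative sketch only, not the extension's actual
code: the class name S3CacheSketch, the lookup and write methods, and the
minimum_size and expires_in parameters are hypothetical. The client
argument is assumed to behave like a boto3 S3 client (head_object,
generate_presigned_url, put_object), and a missing object is assumed to
surface as an exception whose response carries the error code "404".

```python
import io


class S3CacheSketch:
    """Hypothetical cacher illustrating the flow in this commit message.

    `client` is assumed to expose the boto3 S3 client methods
    head_object, generate_presigned_url and put_object.
    """

    def __init__(self, client, bucket, minimum_size=0, expires_in=3600):
        self.client = client
        self.bucket = bucket
        self.minimum_size = minimum_size  # hypothetical size threshold
        self.expires_in = expires_in  # presigned URL lifetime in seconds
        self._buffer = io.BytesIO()

    def lookup(self, key):
        """Return a presigned URL on a cache hit, or None on a miss."""
        try:
            # HEAD request: cheap existence check, no body is transferred.
            self.client.head_object(Bucket=self.bucket, Key=key)
        except Exception as exc:
            # Assumption: a missing object is reported with code "404",
            # as botocore's ClientError does for HEAD requests.
            response = getattr(exc, "response", None) or {}
            if response.get("Error", {}).get("Code") == "404":
                return None
            raise
        return self.client.generate_presigned_url(
            "get_object",
            Params={"Bucket": self.bucket, "Key": key},
            ExpiresIn=self.expires_in,
        )

    def write(self, chunk):
        """Buffer a chunk of the server-generated response."""
        self._buffer.write(chunk)

    def onfinished(self, key):
        """Insert the buffered response unless it is below the threshold."""
        data = self._buffer.getvalue()
        if len(data) < self.minimum_size:
            return False
        self.client.put_object(Bucket=self.bucket, Key=key, Body=data)
        return True
```

On a hit, the caller would wrap the returned URL in a content redirect
response; on a miss, it would feed response chunks to write() and call
onfinished() once the response is complete.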
To test this extension, we require the moto mock AWS library.
Specifically, we use the "standalone server" functionality,
which creates a Flask application that imitates S3. A new hghave
predicate is added to check for this functionality before
testing.
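
A predicate of this kind might look something like the sketch below. The
function names modules_available and has_moto_server are hypothetical,
and the sketch deliberately does not use hghave's own registration
machinery; it only shows the underlying availability check, assuming
that the standalone server needs both the moto and flask packages to be
importable.

```python
import importlib.util


def modules_available(*names):
    """Return True if every named top-level module can be located."""
    return all(importlib.util.find_spec(name) is not None for name in names)


def has_moto_server():
    """Hypothetical check: moto's standalone server needs moto and Flask."""
    return modules_available("moto", "flask")
```

Using find_spec avoids importing the packages just to probe for them,
which keeps the check cheap when the test suite evaluates many
predicates.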
This is needed for determinism in testing, but there is likely a
better way to avoid it than checking for an alternative endpoint URL.