This is an archive of the discontinued Mercurial Phabricator instance.

wireprotov2: define and implement "manifestdata" command
ClosedPublic

Authored by indygreg on Sep 5 2018, 12:20 PM.

Details

Summary

The added command can be used for obtaining manifest data.
Given a manifest path and set of manifest nodes, data about
manifests can be retrieved.
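
As a rough illustration of the inputs described above, here is a minimal sketch of what a request might look like before wire encoding. The argument names "tree" and "nodes", the helper build_manifestdata_request(), and the use of an empty path for the root manifest are assumptions made for this example, not details taken from this commit.

```python
# Illustrative only: "tree" and "nodes" are assumed argument names.
NODE_LEN = 20  # length in bytes of a binary SHA-1 node


def build_manifestdata_request(tree, nodes):
    """Build a request for specific manifest revisions under one path."""
    nodes = list(nodes)
    for node in nodes:
        if not isinstance(node, bytes) or len(node) != NODE_LEN:
            raise ValueError("each node must be a %d-byte binary hash" % NODE_LEN)
    return {
        "name": "manifestdata",
        "args": {"tree": tree, "nodes": nodes},
    }


# Assumption: an empty path stands for the root manifest.
request = build_manifestdata_request(b"", [b"\x11" * 20, b"\x22" * 20])
```

The point of the sketch is only that the client names both the manifest path and every node it wants; the wire encoding itself is not shown.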

Unlike changeset data, we wish to emit deltas to describe
manifest revisions. So the command uses the relatively new
API for building delta requests and emitting them.

The code calls into deltaparent(), which I'm not very keen on.
There's still work to be done in delta generation land so
implementation details of storage (e.g. exactly one delta
is stored/available) don't creep into higher levels. But we
can worry about this later (there is already a TODO on
imanifeststorage tracking this).

On the subject of parent deltas, the server assumes parent revisions
exist on the receiving end. This is obviously wrong for shallow
clone. I've added TODOs to add a mechanism to the command to
allow clients to specify desired behavior. This shouldn't be
too difficult to implement.
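
To make the shallow clone concern concrete, here is a hedged sketch of one way a server could decide between a delta and a full revision once it knows which nodes the client has. The function choose_emission() and its client_nodes parameter are hypothetical; the command as committed simply assumes the parent is present.

```python
def choose_emission(node, delta_parent, client_nodes):
    """Decide how to send one manifest revision.

    delta_parent is the node the stored delta is based on, or None when
    the revision is stored as a fulltext. client_nodes is the set of
    nodes the client is known to have (a hypothetical input).
    """
    if delta_parent is not None and delta_parent in client_nodes:
        # Safe to send a delta: the client can reconstruct from its base.
        return ("delta", delta_parent)
    # The base is missing (e.g. shallow clone), so send the full text.
    return ("fulltext", None)


base = b"\xaa" * 20
rev = b"\xbb" * 20
full_clone = choose_emission(rev, base, {base})
shallow_clone = choose_emission(rev, base, set())
```

Here full_clone selects a delta against base, while shallow_clone falls back to the full revision because the base is not on the receiving end.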

Another big change is that the client must explicitly request
manifest nodes to retrieve. This is a major departure from
"getbundle," where the server derives relevant manifests as it
iterates changesets and sends them automatically. As implemented,
the client must transmit each requested node to the server. At
20 bytes per node, we're looking at 2 MB per 100,000 nodes, plus
wire encoding overhead. This isn't ideal for clients with limited
upload bandwidth. I plan to address this in the future by allowing
alternate mechanisms for defining the revisions to retrieve. One
idea is to define a range of changeset revisions whose manifest
revisions to retrieve (similar to how "changesetdata" works).
We almost certainly want an API to look up an individual manifest
by node. And that's where I've chosen to start with the implementation.
Again, a theme of this early exchangev2 work is I want to start by
building primitives for accessing raw repository data first and see
how far we can get with those before we need more complexity.
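
The upload cost quoted above is simple arithmetic, sketched here for reference. The per_node_overhead parameter is a placeholder for wire encoding cost; its real value is not stated in this review, so it defaults to zero.

```python
NODE_SIZE = 20  # bytes in a binary SHA-1 node


def request_payload_bytes(node_count, per_node_overhead=0):
    """Estimate bytes uploaded when every requested node is enumerated.

    per_node_overhead is a hypothetical knob for encoding overhead.
    """
    return node_count * (NODE_SIZE + per_node_overhead)


raw = request_payload_bytes(100_000)
print(raw)  # 2000000 bytes, i.e. 2 MB before encoding overhead
```

The cost grows linearly with the number of requested nodes, which is why a range-based way of naming revisions would help clients with limited upload bandwidth.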

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

indygreg created this revision. Sep 5 2018, 12:20 PM

Could you add a TODO about the overhead of having to enumerate every node required? That'll be prohibitive fairly quickly I think.

There are already TODOs in wireprotocolv2.txt.

And, yes, the overhead is prohibitive fairly quickly and I plan to add more "bulk querying" capabilities in future commits. I'm trying to get the granular data access in first then optimize later.

indygreg updated this revision to Diff 10967. Sep 12 2018, 1:11 PM
This revision was automatically updated to reflect the committed changes.