( )⚙ D2834 wireproto: support /api/* URL space for exposing APIs

This is an archive of the discontinued Mercurial Phabricator instance.

wireproto: support /api/* URL space for exposing APIs
ClosedPublic

Authored by indygreg on Mar 12 2018, 8:07 PM.

Details

Summary

I will soon be introducing a new version of the HTTP wire protocol.
One of the things I want to change with it is the URL routing.
I want to rely on URL paths to define endpoints rather than the
"cmd" query string argument. That should be pretty straightforward.

I was thinking about what URL space to reserve for the new protocol.
We /could/ put everything at a top-level path. e.g.
/wireproto/* or /http-v2-wireproto/*. However, these constrain us
a bit because they assume there will only be 1 API: version 2 of
the HTTP wire protocol. I think there is room to grow multiple
APIs. For example, there may someday be a proper JSON API to query
or even manipulate the repository. And I don't think we should have
to create a new top-level URL space for each API nor should we
attempt to shoehorn each future API into the same shared URL space:
that would just be too chaotic.

This commits reserves the /api/* URL space for all our future API
needs. Essentially, all requests to /api/* get routed to a new WSGI
handler. By default, it 404's the entire URL space unless the
"api server" feature is enabled. When enabled, requests to "/api"
list available APIs. URLs of the form /api/<name>/* are reserved for
a particular named API. Behavior within each API is left up to that
API. So, we can grow new APIs easily without worrying about URL
space conflicts.

APIs can be registered by adding entries to a global dict. This allows
extensions to provide their own APIs should they choose to do so.
This is probably a premature feature. But IMO the code is easier
to read if we're not dealing with API-specific behavior like config
option querying inline.

To prove it works, we implement a very basic API for version 2
of the HTTP wire protocol. It does nothing of value except
facilitate testing of the /api/* URL space.

We currently emit plain text responses for all /api/* endpoints.
There's definitely room to look at Accept and other request headers
to vary the response format. But we have to start somewhere.

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

indygreg created this revision.Mar 12 2018, 8:07 PM

Will this cause conflicts with hg serve -S or hgwebdir with subrepo/virtual paths that start with 'api'? (This was the reason I used the LFS URI starting with '.hg/', knowing that isn't committable.)

Will this cause conflicts with hg serve -S or hgwebdir with subrepo/virtual paths that start with 'api'? (This was the reason I used the LFS URI starting with '.hg/', knowing that isn't committable.)

Bleh.

So what I'm hearing is that because of hgwebdir's virtual paths and support for serving subrepos, pretty much the entire URL space is dangerous because it can conflict with the path of a virtual repo or subrepo. I suppose we already have problems with existing URLs, like /rev/.

I agree that putting things under .hg/ in the URL space seems to be a viable workaround.

Changing the URL prefix isn't a big deal as far as the code is concerned. We can make that change at any time since we have no BC guarantees at this juncture.

Will this cause conflicts with hg serve -S or hgwebdir with subrepo/virtual paths that start with 'api'? (This was the reason I used the LFS URI starting with '.hg/', knowing that isn't committable.)

Bleh.
So what I'm hearing is that because of hgwebdir's virtual paths and support for serving subrepos, pretty much the entire URL space is dangerous because it can conflict with the path of a virtual repo or subrepo. I suppose we already have problems with existing URLs, like /rev/.

I think so. I remember when Yuya suggested putting the /index virtual file into hgwebdir a year ago, I had to dance around the fact that 'index' could have been a repo being served up. I'm guessing /rev/ et al. is less of a problem, because presumably they've existed since the beginning?

I agree that putting things under .hg/ in the URL space seems to be a viable workaround.
Changing the URL prefix isn't a big deal as far as the code is concerned. We can make that change at any time since we have no BC guarantees at this juncture.

I'm fine with landing this as-is and changing it later if needed. I haven't checked this myself, I was just going on a vague memory of issues that I ran into, and wondered if you had considered it.

indygreg updated this revision to Diff 6998.Mar 13 2018, 3:11 PM
indygreg updated this revision to Diff 7009.Mar 13 2018, 9:16 PM

So, assuming we can't introduce new top-level URLs without risking conflicts with subrepos or hgwebdir virtual repos, for at least the wire protocol, assuming we keep the ?cmd=capabilities mechanism for the initial server handshake, we can add something that defines where API requests should be routed. We can then allow server operators to reconfigure the base URL for these requests.

durin42 accepted this revision.Mar 21 2018, 3:18 PM
This revision is now accepted and ready to land.Mar 21 2018, 3:18 PM
indygreg updated this revision to Diff 7194.Mar 21 2018, 6:19 PM
durin42 accepted this revision.Mar 21 2018, 6:21 PM
This revision was automatically updated to reflect the committed changes.
yuja added a subscriber: yuja.Mar 23 2018, 11:22 AM

Can we expect the BC of hgwebdir will be fixed before the release?

I like the /api/ namespace, but breaking existing hgweb instances
wouldn't be acceptable.

In D2834#47334, @yuja wrote:

Can we expect the BC of hgwebdir will be fixed before the release?
I like the /api/ namespace, but breaking existing hgweb instances
wouldn't be acceptable.

D2936