This is an archive of the discontinued Mercurial Phabricator instance.

gitgetmeta: enable serving only missing hg-git map data
ClosedPublic

Authored by singhsrb on Sep 12 2017, 6:50 PM.
Tags
None
Subscribers

Details

Reviewers
durham
Group Reviewers
Restricted Project
Commits
rFBHGX24832600d3d2: gitgetmeta: enable serving only missing hg-git map data
Summary

The gitlookup extension allows for only requesting the complete file
from the server. This leads to download of large files (hg-git map file
specifically in this case) when only the missing chucks of the file would have
sufficed.

This commit makes changes to allow for downloading only the missing parts of
the map file. A config option 'onlymapdelta' has also been added to allow for
switching between the two different modes of operation: serving complete file
and serving only the missing chunks. The tests were subsequently updated to
cater to the new scenario and also, checking for interoperability of the two
modes.

Test Plan

Ran all the tests.

Diff Detail

Repository
rFBHGX Facebook Mercurial Extensions
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

singhsrb created this revision.Sep 12 2017, 6:50 PM
Herald added a reviewer: Restricted Project. · View Herald TranscriptSep 12 2017, 6:50 PM
singhsrb added inline comments.Sep 12 2017, 6:58 PM
tests/test-git-getmeta.t
2

will change these to onlymapdelta.true and onlymapdelta.false to be consistent with the config being a bool.

singhsrb edited the summary of this revision. (Show Details)Sep 12 2017, 7:53 PM
durham requested changes to this revision.Sep 13 2017, 12:32 PM
durham added a subscriber: durham.

Looks good! The only real issue is the pickle stuff

hgext3rd/gitlookup.py
159

Usually missinghashes will be a small list. Maybe we should break the iteration over mapfile once we've found all the missinghashes? Might save some time.

204

Generally we avoid using pickle as a serialization format. Unpickling something can run arbitrary code, so it's a bit of a security hole to pass it over the wire. You can use json if you want something simple.

This revision now requires changes to proceed.Sep 13 2017, 12:32 PM
singhsrb added inline comments.Sep 13 2017, 12:54 PM
hgext3rd/gitlookup.py
159

I wasn't sure that will very useful because even though the missing hashes is a small list, we don't really know where we will find the hashes in the file. But I don't see any harm in returning when we finish anyways as it can potentially save time.

204

I wasn't aware of that. Thanks for the information! I will update this to use json instead.

singhsrb updated this revision to Diff 1793.Sep 13 2017, 3:10 PM
singhsrb added a comment.EditedSep 13 2017, 3:23 PM

FYI: I also added some tests and additional error checking/logging along with the suggested changes. Also, modified the pattern

if not len(listtype):

to

if not listtype:
hgext3rd/gitlookup.py
168

Fixed this to use mapfile.name instead of gitmapfile

durham accepted this revision.Sep 13 2017, 4:19 PM
This revision is now accepted and ready to land.Sep 13 2017, 4:19 PM
This revision was automatically updated to reflect the committed changes.