This is an archive of the discontinued Mercurial Phabricator instance.

revlog: introduce a `sidedata` method
ClosedPublic

Authored by marmoute on Sep 7 2019, 5:27 AM.

Details

Summary

The method give access to extra information related to the revision. Such data
will not be part of the hash be strongly related to the revision. Having them
stored at the revlog level helps the storage consistency story and simplify
various things.

Example of data we could store there:

  • copy tracing related informations
  • graph structure related information (useful for discovery)
  • unresolved conflict data

The full implementation will be introduced gradually in the coming changesets.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

marmoute created this revision.Sep 7 2019, 5:27 AM

I think I see where you are going with this and it seems potentially very useful.

I do wish the flag processors code could have been refactored without involving sidedata, as I would have taken the patches later in this series to remove the temporary mixin in a heartbeat. But since sidedata is involved, I want to have others weigh in.

It would be extremely useful to see an actual use case for sidedata. As it stands, I have questions like whether we'll need to further evolve the storage APIs to accommodate things like cherry-picking which pieces of sidedata are desired. Also, incorporating sidedata in the revlog write APIs (in later patches) seems a bit odd, as the revlog format is rather stable and I'm not sure how additional data will be incorporated without introducing a new flag processor to accommodate extra data next to the revision. Again, I'd really like to see a real world use case for sidedata to help give me confidence this is the correct storage abstraction.

I think I see where you are going with this and it seems potentially very useful.
I do wish the flag processors code could have been refactored without involving sidedata, as I would have taken the patches later in this series to remove the temporary mixin in a heartbeat. But since sidedata is involved, I want to have others weigh in.

The point of the refactoring is to introduce sidedata ;-). I could rework the series to get rid of the mixin sooner, but after discussing with Augie on IRC it seemed fine to just do a final cleanup (the mixin discussion arrived after I wrote the other patches).

It would be extremely useful to see an actual use case for sidedata.

My first target is to use it for copy tracing. After the current series I have:

  1. a series to add actual storage for sidedata (using a flag processors)
  2. a series to refactor access to copy tracing information
  3. a series to use sidedata to store copy tracing information

(1) is almost in a visible state (2) and (3) need a bit more work. Which one are you the most earger to see ?

As it stands, I have questions like whether we'll need to further evolve the storage APIs to accommodate things like cherry-picking which pieces of sidedata are desired.

I feel like the API in this patch offer enough room for that already. By all mean, this is the early stage of implementation, it will be easy to adjust the API once we are further down the road.

Also, incorporating sidedata in the revlog write APIs (in later patches) seems a bit odd, as the revlog format is rather stable and I'm not sure how additional data will be incorporated without introducing a new flag processor to accommodate extra data next to the revision.

It will be done by introduce a flag processors ;-)
Having the sidedata specified at addrevision time seems okay to me. It highlight the fact these data needs to be immutable property of the revision.

Again, I'd really like to see a real world use case for sidedata to help give me confidence this is the correct storage abstraction.

FWIW I'm +0.5 on this because it seems like a reasonable abstraction and could serve a variety of purposes.

marmoute updated this revision to Diff 16474.Sep 9 2019, 5:22 PM
durin42 accepted this revision.Sep 17 2019, 2:05 PM
This revision is now accepted and ready to land.Sep 17 2019, 2:05 PM
This revision was automatically updated to reflect the committed changes.