revlog: initial version of phash index [POC]
Needs Revision, Public

Authored by joerg.sonnenberger on Jan 26 2021, 4:01 PM.


The integration is still somewhat hackish, but it is a reasonable start
for a PoC. Comparing it to Rust is not entirely fair, as the additional
Python function in between dominates the runtime. The generator is pure
Python at the moment and at least a factor of 25 slower than the
comparable C extension. There is no fallback path yet from the current
hash function to a real universal hash; the code just assumes the node
hash is well distributed enough. The most interesting baseline: a pure
Python lookup via the hash function is a factor of 20 slower than the
current dictionary lookup. Index generation takes 37s for 2 * 570k
revisions (manifest + changelog). Further testing is necessary to
measure the incremental cache update cost, since initial testing shows
a negative cost overall. It can be estimated to be bounded by ~65ms on
the test platform.
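
To make the idea concrete, here is a minimal pure-Python sketch of a
perfect hash index over node hashes. It is not the code in this diff:
the hash-and-displace scheme, the names (build_index, lookup,
MAX_DISPLACEMENT), the table sizes, and the on-disk-free layout are all
assumptions chosen for illustration. Like the PoC, it assumes the node
bytes are already well distributed and uses them directly as hash
values; where a real universal hash fallback would be needed, the
sketch simply raises an error.

```python
import hashlib

# Hypothetical bound on the displacement search. A real index would
# switch to a different (universal) hash function instead of failing.
MAX_DISPLACEMENT = 1 << 16


def _words(node):
    """Split the first 12 bytes of a node into three 32-bit integers."""
    return (
        int.from_bytes(node[0:4], "big"),
        int.from_bytes(node[4:8], "big"),
        int.from_bytes(node[8:12], "big") | 1,  # force an odd step
    )


def _slot(node, displacement, size):
    """Slot for a node under a given per-bucket displacement."""
    _, a, b = _words(node)
    return ((a + displacement * b) & 0xFFFFFFFF) % size


def build_index(nodes):
    """Build (displacements, slots); a node's list position is its rev."""
    n = len(nodes)
    size = max(1, 2 * n)        # assumed load factor of 0.5
    nbuckets = max(1, n // 4)   # assumed ~4 keys per bucket
    buckets = [[] for _ in range(nbuckets)]
    for rev, node in enumerate(nodes):
        buckets[_words(node)[0] % nbuckets].append(rev)

    displacements = [0] * nbuckets
    slots = [-1] * size
    # Place the largest buckets first, while the table is still empty.
    for b in sorted(range(nbuckets), key=lambda i: -len(buckets[i])):
        revs = buckets[b]
        if not revs:
            continue
        for d in range(MAX_DISPLACEMENT):
            candidate = [_slot(nodes[rev], d, size) for rev in revs]
            if len(set(candidate)) == len(candidate) and all(
                slots[s] == -1 for s in candidate
            ):
                break
        else:
            raise ValueError("no displacement found; fallback hash needed")
        displacements[b] = d
        for rev, s in zip(revs, candidate):
            slots[s] = rev
    return displacements, slots


def lookup(nodes, index, node):
    """Return the revision number for node, or None if it is unknown."""
    displacements, slots = index
    d = displacements[_words(node)[0] % len(displacements)]
    rev = slots[_slot(node, d, len(slots))]
    # The hash only picks a candidate; the node itself must be compared.
    if rev != -1 and nodes[rev] == node:
        return rev
    return None
```

A lookup costs two table reads and one node comparison, independent of
the number of revisions, which is the property that makes such an index
attractive compared to rebuilding a dictionary on every load. The
expensive part is build_index, which matches the observation above that
generation time dominates.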

Diff Detail

rHG Mercurial
No Linters Available
No Unit Test Coverage

Event Timeline

baymax requested changes to this revision. May 25 2021, 5:32 AM

There seems to have been no activity on this Diff for the past 3 months.

By policy, we are automatically moving it out of the need-review state.

Please move it back to need-review without hesitation if this Diff should still be discussed.


This revision now requires changes to proceed. May 25 2021, 5:32 AM