revlog: initial version of phash index [POC]
Needs Revision
Public

Authored by joerg.sonnenberger on Jan 26 2021, 4:01 PM.

Details

Reviewers
indygreg
baymax
Group Reviewers
hg-reviewers
Summary

Integration is still somewhat hackish, but it is a reasonable start for
a PoC. Comparing it to Rust is not entirely fair, as the additional
Python function in between dominates the runtime. The generator is pure
Python at the moment and at least a factor of 25 slower than the
comparable C extension. There is no fallback path yet from the current
hash function to a real universal hash; the code simply assumes the node
hash is well distributed enough. Most interesting baseline: pure Python
lookup via the hash function is a factor of 20 slower than the current
dictionary lookup. The index generation takes 37s for 2 * 570k revisions
(manifest + changelog). Further testing is necessary to measure the
incremental cache update cost, since initial testing shows a negative
cost overall. It can be estimated to be bounded by ~65ms on the test
platform.
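
For readers unfamiliar with the idea, the following is a minimal sketch of
a perfect hash index over nodes, not the code in this diff. It uses a
simple bucket-and-displacement construction (a swapped-in textbook
technique) and, like the summary describes, assumes the node bytes are
already well distributed instead of applying a universal hash. The names
build_index, lookup and keys_per_bucket are hypothetical.

```python
import hashlib
import struct


def _parts(node):
    # Assumption: the node is itself a well-distributed hash, so three
    # integers are sliced out of its first 16 bytes without rehashing.
    return struct.unpack_from(">QII", node)


def _slot(node, d, size):
    _, b, c = _parts(node)
    # size is a power of two and (c | 1) is odd, so varying d walks
    # through every slot of the table for a single key.
    return (b + d * (c | 1)) % size


def build_index(nodes, keys_per_bucket=4):
    """Return (size, displacements, slots) mapping each node to its rev."""
    n = len(nodes)
    size = 1
    while size < 2 * max(n, 1):
        size *= 2  # power of two, load factor at most 0.5
    nbuckets = max(1, n // keys_per_bucket)
    buckets = [[] for _ in range(nbuckets)]
    for rev, node in enumerate(nodes):
        buckets[_parts(node)[0] % nbuckets].append(rev)

    displacements = [0] * nbuckets
    slots = [None] * size
    # Place the largest buckets first, while the table is mostly empty.
    for bucket_id in sorted(range(nbuckets), key=lambda i: -len(buckets[i])):
        revs = buckets[bucket_id]
        if not revs:
            continue
        for d in range(size):
            picked = [_slot(nodes[r], d, size) for r in revs]
            if len(set(picked)) == len(picked) and all(
                slots[s] is None for s in picked
            ):
                break
        else:
            # This is where a fallback to a real universal hash would go.
            raise ValueError("no collision-free displacement found")
        displacements[bucket_id] = d
        for r, s in zip(revs, picked):
            slots[s] = r
    return size, displacements, slots


def lookup(index, nodes, node):
    """Return the rev of node, or None if it is not in the index."""
    size, displacements, slots = index
    d = displacements[_parts(node)[0] % len(displacements)]
    rev = slots[_slot(node, d, size)]
    # A perfect hash maps unknown keys to arbitrary slots, so the stored
    # node must be compared before the rev is trusted.
    if rev is not None and nodes[rev] == node:
        return rev
    return None


# Usage with synthetic 20-byte nodes standing in for real revisions.
nodes = [hashlib.sha1(b"%d" % i).digest() for i in range(1000)]
index = build_index(nodes)
assert lookup(index, nodes, nodes[42]) == 42
```

Each lookup touches one displacement entry and one slot, which is what
makes such an index attractive as an on-disk cache compared to rebuilding
a dictionary; the final node comparison is still required to reject
unknown nodes.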

Diff Detail

Repository
rHG Mercurial
Branch
default
Lint
No Linters Available
Unit
No Unit Test Coverage

Event Timeline

baymax requested changes to this revision. May 25 2021, 5:32 AM

There seems to have been no activity on this Diff for the past 3 months.

By policy, we are automatically moving it out of the need-review state.

Please move it back to need-review without hesitation if this diff should still be discussed.

This revision now requires changes to proceed. May 25 2021, 5:32 AM