This is an archive of the discontinued Mercurial Phabricator instance.

rhg: faster hg cat when many files are requested
ClosedPublic

Authored by aalekseyev on Oct 5 2021, 10:16 AM.

Details

Summary

With this patch I'm seeing a ~39ms improvement (220ms -> 181ms) when
running [hg cat] on ~220 files in a ~260k-file repo.

The timing for [hg cat] on an individual file becomes slightly worse
(losing 5ms: 145ms -> 150ms).
A follow-up commit is intended to improve that.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

aalekseyev created this revision.Oct 5 2021, 10:16 AM
aalekseyev edited the summary of this revision. (Show Details)Oct 5 2021, 10:44 AM
martinvonz requested changes to this revision.Oct 8 2021, 5:21 PM
martinvonz added a subscriber: martinvonz.
martinvonz added inline comments.
rust/hg-core/src/operations/cat.rs
71

nit: This is not actually the order they were specified, right? I think it makes the order be sorted. So you could do missing.sort_unstable() instead, I suppose. Or maybe they're already sorted and you don't need to do anything here? (I suspect the code initially didn't have the sorting on line 47.)

This revision now requires changes to proceed.Oct 8 2021, 5:21 PM
aalekseyev added inline comments.Oct 11 2021, 6:39 AM
rust/hg-core/src/operations/cat.rs
71

Wow, I made two mistakes here:

  • Assumed that the existing behavior of rhg cat should be preserved, whereas in fact it differs from python hg, so should probably be changed. The python hg actually reports these in a sorted order.
  • Forgot that I sorted files already

The mistakes "cancel out" in some sense, so you're right that this should just be simplified. I'm pushing a change.

aalekseyev updated this revision to Diff 30699.Oct 11 2021, 6:50 AM

Even though my measurement above talks about concatenation of 200 files, a real production use case we have is concatenating around 10000 files.
The time of that operation is improved from ~3.5 seconds to ~0.4 seconds by this change.

aalekseyev marked an inline comment as done.Oct 14 2021, 5:30 AM
This revision was not accepted when it landed; it landed in state Needs Review.
This revision was automatically updated to reflect the committed changes.