Page MenuHomePhabricator

rhg: more efficient `HgPath::join`

Authored by aalekseyev on Oct 26 2021, 2:49 PM.



This commit makes HgPath::join slightly more efficient
by avoiding one copy.

It also avoids a particularly inefficient (quadratic) use of
HgPath::join by using a new mutating function HgPathBuf::push instead.

The name for HgPathBuf::push is chosen by analogy to PathBuf::push.

Diff Detail

rHG Mercurial
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

aalekseyev created this revision.Oct 26 2021, 2:49 PM
aalekseyev updated this revision to Diff 30990.Oct 26 2021, 2:52 PM
Alphare accepted this revision.Oct 27 2021, 3:19 AM
Alphare added a subscriber: Alphare.

Thanks! That quadratic code looked bad indeed (I can say it, I've written it ;) ), thanks for spotting it.

Out of curiosity, have you observed any measurable difference in commands?

This revision is now accepted and ready to land.Oct 27 2021, 3:19 AM
aalekseyev edited the summary of this revision. (Show Details)Oct 27 2021, 6:16 AM
aalekseyev updated this revision to Diff 30991.

Whoops, this last version didn't even compile (I compiled the wrong workspace when I made the tweak). I pushed a version that compiles now.
I saw the performance difference on .hgignore parsing (saw a 1ms improvement or so), but I don't think there's a command that makes that operation in isolation.
I'm looking at adding rhg debugignore.

I just realized that the extra copying that I removed was potentially useful. After a call to extend the buffer has spare capacity, so the call to HgPathBuf::from_bytes was trimming the unnecessary bytes. Removal this trimming saves time, but wastes some memory.
I'm assuming this is a fine tradeoff if the standard library is taking it, but I imagine we could get the best of both worlds if we pre-allocated a vector of the right capacity from the start.

The command hg debugignorerhg I introduced in on a repo with a ~3k line hgignore takes ~31ms after this patch and ~32.5 before this patch.

I looked at the benefit of pre-allocating the HgPathBuf of the right size from the start, but I couldn't measure much improvement, if any. (possibly because I don't have a sensitive benchmark)
The patch looks like this:

diff --git a/rust/hg-core/src/utils/ b/rust/hg-core/src/utils/
--- a/rust/hg-core/src/utils/
+++ b/rust/hg-core/src/utils/
@@ -172,6 +172,13 @@ impl HgPath {
             inner: self.inner.to_owned(),
+    fn to_hg_path_buf_with_spare_capacity(&self, spare_capacity : usize) -> HgPathBuf {
+        let mut vec = Vec::with_capacity(self.len() + spare_capacity);
+        vec.extend(&self.inner);
+        HgPathBuf {
+            inner: vec,
+        }
+    }
     pub fn bytes(&self) -> std::slice::Iter<u8> {
@@ -222,7 +229,7 @@ impl HgPath {
     pub fn join(&self, path: &HgPath) -> HgPathBuf {
-        let mut buf = self.to_owned();
+        let mut buf = self.to_hg_path_buf_with_spare_capacity(path.len() + 1);

I'm happy to push that if you think that's better.

Alphare accepted this revision.Oct 28 2021, 5:23 AM

I think this is a better change because of the ergonomics (them being closer to stdlib) and because it removes the quadratic code, but the optimizer is probably smart enough to figure some of this out. I don't think a 1ms difference in a 30ms run is very significant (unless you have the most stable of machines).

pulkit accepted this revision.Tue, Nov 9, 9:49 AM
This revision was automatically updated to reflect the committed changes.