This is an archive of the discontinued Mercurial Phabricator instance.

revlog: use node tree (native code) for shortest() calculation
ClosedPublic

Authored by martinvonz on May 8 2018, 2:07 PM.

Details

Summary

I want to rewrite revlog.shortest() to disambiguate only among hex
nodeids and then disambiguate the result with revnums at a higher
level (in scmutil). However, that would slow down `hg log -T
'{shortest(node,1)}\n'` from 5.0s to 6.8s, which I wasn't sure would
be acceptable. So this patch makes revlog.shortest() use the node tree
for finding the length of the shortest prefix that's unambiguous among
nodeids. Once that has been found, it makes it longer until it is also
not ambiguous with a revnum.

This speeds up hg log -T '{shortest(node,1)}\n' from 5.0s to 4.0s.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

martinvonz created this revision.May 8 2018, 2:07 PM
yuja added a subscriber: yuja.May 10 2018, 9:08 AM

Looks generally good, but can you fix your editor to not do Google indent?

I'm not sure if we should bump the module version in this case, but I would
do that just for sanity.

+static int nt_shortest(indexObject *self, const char *node)
+{
+ if (nt_init(self) == -1)
+ return -3;
+ if (nt_populate(self) == -1)
+ return -3;
+
+ int level, off;

Declaration has to be moved to top.

+ for (level = off = 0; level < 40; level++) {
+ int k = nt_level(node, level);
+ nodetree *n = &self->nt[off];
+ int v = n->children[k];
+ if (v < 0) {
+ v = -(v + 1);
+ const char *n = index_node(self, v);

Here, too.

In D3499#55772, @yuja wrote:

Looks generally good, but can you fix your editor to not do Google indent?

Wow, we still use tabs?! Okay, will change.

I'm not sure if we should bump the module version in this case, but I would
do that just for sanity.

Done, I think, but please check that I did it right.

+static int nt_shortest(indexObject *self, const char *node)
+{
+ if (nt_init(self) == -1)
+ return -3;
+ if (nt_populate(self) == -1)
+ return -3;
+
+ int level, off;

Declaration has to be moved to top.

Done.

+ for (level = off = 0; level < 40; level++) {
+ int k = nt_level(node, level);
+ nodetree *n = &self->nt[off];
+ int v = n->children[k];
+ if (v < 0) {
+ v = -(v + 1);
+ const char *n = index_node(self, v);

Here, too.

Done.

martinvonz updated this revision to Diff 8624.May 10 2018, 12:01 PM
yuja added a comment.May 11 2018, 10:20 AM

+static int nt_shortest(indexObject *self, const char *node)
+{
+ int level, off;
+
+ if (nt_init(self) == -1)
+ return -3;
+ if (nt_populate(self) == -1)
+ return -3;
+
+ for (level = off = 0; level < 40; level++) {
+ int k, v;
+ nodetree *n = &self->nt[off];
+ k = nt_level(node, level);
+ v = n->children[k];
+ if (v < 0) {
+ v = -(v + 1);
+ const char *n = index_node(self, v);

Perhaps we should check if n == NULL. index_node_existing() might be more
appropriate.

Can you send a followup?

In D3499#55925, @yuja wrote:

+static int nt_shortest(indexObject *self, const char *node)
+{
+ int level, off;
+
+ if (nt_init(self) == -1)
+ return -3;
+ if (nt_populate(self) == -1)
+ return -3;
+
+ for (level = off = 0; level < 40; level++) {
+ int k, v;
+ nodetree *n = &self->nt[off];
+ k = nt_level(node, level);
+ v = n->children[k];
+ if (v < 0) {
+ v = -(v + 1);
+ const char *n = index_node(self, v);

Perhaps we should check if n == NULL. index_node_existing() might be more
appropriate.

Oops, funny how I missed that after just having added index_node_existing. Thanks for noticing.

Can you send a followup?

Will do. Feel free to fold in.

This revision was automatically updated to reflect the committed changes.