This is an archive of the discontinued Mercurial Phabricator instance.

revset: remove "small" argument from "_optimize"
ClosedPublic

Authored by quark on Sep 7 2017, 12:16 PM.

Details

Summary

_optimize calculates weights of subtrees. "small" affects some weight
calculation (either 1 or 0.5). The weights are now only useful in and
optimization where we might swap two arguments and use andsmally.

In the real world, it seems unlikely that revsets with weight of 0.5 or 1
matters the and order optimization. I think the important thing is to get
weights of expensive revsets right (ex. contains).

This patch removes the small argument to simplify the interface.

As for choosing between 0.5 vs 1, things returning a single revision
(ancestor, string) has a weight of 0.5. Things returning multiple
revisions returns 1. This could be sometimes useful in the andsmally
optimization, ex.

(((:)-2) & expensive()) & ((1-2) & expensive())
  ^^^                       ^
 ^^^^^^^                   ^^^^^
^^^^^^^^^^^^^^^^^^^^^^^   ^^^^^^^^^^^^^^^^^^^^^
  weight=1                 weight=0.5

would have an andsmally optimization so 1-2 gets executed first, which
seems to be desirable.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

quark created this revision.Sep 7 2017, 12:16 PM
quark edited the summary of this revision. (Show Details)
yuja added a subscriber: yuja.Sep 8 2017, 9:49 AM
yuja added inline comments.
mercurial/revsetlang.py
421

Any reason to choose 1 here and 0.5 for the others?

quark added inline comments.Sep 8 2017, 1:06 PM
mercurial/revsetlang.py
421

I think there are 2 factors deciding weight:

  1. Time complexity of the revset implementation
  2. The returned revset size (maybe more important for and optimization)

Here, : returns a larger revset so its weight is larger.

Maybe we can change all weight numbers to get rid of float numbers. ex. multiple by 10 for everything.

quark edited the summary of this revision. (Show Details)Sep 18 2017, 7:46 PM
quark updated this revision to Diff 1881.
yuja accepted this revision.Sep 19 2017, 10:51 AM

Queued, thanks.

(copied from the email thread)

On Fri, 8 Sep 2017 17:06:12 +0000, quark (Jun Wu) wrote:

yuja wrote in revsetlang.py:421
Any reason to choose 1 here and 0.5 for the others?

I think there are 2 factors deciding weight:

  1. Time complexity of the revset implementation
  2. The returned revset size (maybe more important for and optimization)

Here, : returns a larger revset so its weight is larger.

I don't think keeping (2) makes much sense because this patch effectively
removes the concept about the computed size. And it's far from being accurate
anyway. In theory, the computed size would be an input to the weight function
of the subsequent query, e.g.

W_contains(subset) = K_contains * estimated_size(subset)

That said, I don't have strong preference about this. So can you update the
commit message to include why 0.5 or 1.

This revision is now accepted and ready to land.Sep 19 2017, 10:51 AM
quark added a comment.EditedSep 19 2017, 1:35 PM
In D656#12304, @yuja wrote:

(copied from the email thread)

I saw your email (and the next one). The commit message was updated accordingly. I think Phabricator in-bound email now works as long as the "From" address is known (i.e. listed at https://phab.mercurial-scm.org/settings/user/yuja/page/email/).

In D656#12341, @quark wrote:
In D656#12304, @yuja wrote:

(copied from the email thread)

I saw your email (and the next one). The commit message was updated accordingly. I think Phabricator in-bound email now works as long as the "From" address is known (i.e. listed at https://phab.mercurial-scm.org/settings/user/yuja/page/email/).

If Phabricator in-bound email are working, could we update https://bz.mercurial-scm.org/show_bug.cgi?id=5666 accordingly?

This revision was automatically updated to reflect the committed changes.