This is an archive of the discontinued Mercurial Phabricator instance.

chg: move only first time relevant if condition out of loop
AbandonedPublic

Authored by singhsrb on Oct 4 2017, 7:22 PM.

Details

Reviewers
yuja
Group Reviewers
hg-reviewers
Summary

The loop needlessly checks the condition after the first iteration.
Therefore, moving this condition outside the loop. Compiler can potentially
optimize this but not depending on it.

Test Plan

Ran 'test-chg.t' with '--chg' option.

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

singhsrb created this revision.Oct 4 2017, 7:22 PM
yuja added subscribers: quark, yuja.Oct 5 2017, 9:55 AM
yuja requested changes to this revision.

I prefer the original code since it looks simpler and there wouldn't be
measurable performance win.

@quark which do you like?

contrib/chg/chg.c
381–382

Nit: this line is over 80 chars.

This revision now requires changes to proceed.Oct 5 2017, 9:55 AM
singhsrb updated this revision to Diff 2468.Oct 5 2017, 2:19 PM

Just to be clear: this change was also for readability apart from performance improvement. I personally prefer conditional code inside the loops to not keep checking if its the first or last iteration because then one can say that the loop checks for conditions A, B and so on in each iteration rather than the loops checks for condition A, B and so on in each iteration but also checks for C in first iteration. But in the end, its just a subjective personal preference more than anything else. So, I am happy to revert to the old code if you guys feel so :).

quark added a comment.Oct 5 2017, 3:12 PM

I think the general rule of extracting one-time condition to outside a loop is correct. But the i == 0 check may be subject to change - if we do want to support hg -R ... serve for example, i == 0 may be changed to iscmdname or something. And this diff would make that change harder. Maybe we can change i == 0 to iscmdname now and leave some TODO notes about iscmdname to clarify the intention and current defect.

Performance-wise I'd expect compiler to do smart things. Unfortunately it seems gcc and clang are not that smart as of today.

In my opinion, just checking for hg -R ... serve would not solve the problem. If we go through that route, we would probably want to always detect the serve command rather than mostly detecting the serve command. However, if we do go in that direction and it makes sense to get rid of the special case for i == 0, then sure we should go ahead and delete this code whenever we do that.

yuja added a comment.Oct 6 2017, 11:20 AM

What I don't like is this change increases complexity of the loop condition,
checking argc at two places, and incrementing i at two places, conditionally.

singhsrb updated this revision to Diff 2506.Oct 6 2017, 1:14 PM
singhsrb added a comment.EditedOct 6 2017, 1:21 PM

I just refactored the code a little more to take care of redundant loop execution when argc == 0. I am guessing you mean you don't like the code in terms of readability because at runtime, this commit is strictly at least equal or better than the earlier code except for the case when the first argument is "--" (which i am guessing is not very common). Effectively, the extra argc check replaces the i == 0 check in each iteration and the seemingly extra ++i essentially mimics the ++i at the end of the first iteration.

simpkins added inline comments.
contrib/chg/chg.c
368–397

I agree with Yuya that the original code is more readable, and that this won't
have any measurable perf impact. Users presumably aren't running commands with
many thousands of arguments that we have to loop through here (and even if they
were, the CPU branch predictor would do a good job of handling the i == 0
check).

If you wanted to simplify this function, I personally think this would be better:

int i;
for (i = 0; i < argc; ++i) {
        if (strcmp(argv[i], "--") == 0) {
                break;
        } else if (strcmp("-d", argv[i]) == 0 ||
            strcmp("--daemon", argv[i]) == 0) {
                if (strcmp("serve", argv[0]) == 0)
                        return 1;
        } else if (strcmp("--time", argv[i]) == 0) {
                return 1;
        }
}
return 0;

or perhaps

if (argc == 0)
        return 0;

int isserve = (strcmp(argv[0], "serve") == 0);
for (i = 0; i < argc; ++i) {
        if (strcmp(argv[i], "--") == 0)
                break;
        if (isserve && (strcmp("-d", argv[i]) == 0 ||
                        strcmp("--daemon", argv[i]) == 0))
                return 1;
        if (strcmp("--time", argv[i]) == 0)
                return 1;
}
return 0;

@simpkins I definitely prefer

int i;
for (i = 0; i < argc; ++i) {
        if (strcmp(argv[i], "--") == 0) {
                break;
        } else if (strcmp("-d", argv[i]) == 0 ||
            strcmp("--daemon", argv[i]) == 0) {
                if (strcmp("serve", argv[0]) == 0)
                        return 1;
        } else if (strcmp("--time", argv[i]) == 0) {
                return 1;
        }
}
return 0;

as it avoids the need to check for "--serve" if we never saw "--daemon". Also, avoids the early return in the second option. I did not notice that we were only checking for the SERVEDAEMON and hence, this was also an option :).

singhsrb abandoned this revision.Oct 6 2017, 2:49 PM

Thanks everyone for their inputs!

I think @simpkins suggestions are definitely better than the main idea of this commit (which is to move out code from the loop). So, I am abandoning this commit. If we simplify this function, I am inclined to take the approach mentioned in my previous comment.