Page MenuHomePhabricator

remotefilelog: using fastlog as a fast path to fix linkrev
ClosedPublic

Authored by shivramk on Sep 4 2017, 12:30 PM.

Details

Summary

The _adjustlinknode function in remotefilectx currently has a slow case where
it must prefetch history from the server to lookup the correct linkcode.
This change makes use of fastlog (if enabled for the repo) to speed up the
linknode lookup as fastlog will already have the correct linkrev. If fastlog
is not enabled or the fastlog call fails for some reason then it will fall
back to the old method.

Test Plan

Added new test cases in test-remotefilelog-linknodes.t

Diff Detail

Repository
rFBHGX Facebook Mercurial Extensions
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

shivramk created this revision.Sep 4 2017, 12:30 PM
Herald added a reviewer: Restricted Project. · View Herald TranscriptSep 4 2017, 12:30 PM
pulkit added a subscriber: pulkit.Sep 5 2017, 11:15 AM
pulkit added inline comments.
remotefilelog/remotefilectx.py
28

You should add a return after this otherwise, you may hit an AttributeError.

durham requested changes to this revision.Sep 5 2017, 11:58 AM
durham added a subscriber: durham.

How much faster is this than before?

remotefilelog/remotefilectx.py
55

If initialization fails, then conduit will still be None and this logic will get invoked every single time this code path his hit, and print a ton of warnings. Maybe we should do something like set conduit=False if initialization fails, then in this function we say if conduit is None: initialize, elif conduit is False: return None

347

I'd explicitly return None here. We generally shouldn't rely on pythons default return None behavior for functions with actual return values.

tests/test-remotefilelog-linknodes.t
231

Why set debug to true here if there is no output?

This revision now requires changes to proceed.Sep 5 2017, 11:58 AM

I haven't measured the actual performance improvement. Will try on one of our big repos.

remotefilelog/remotefilectx.py
55

Makes sense. Will set conduit to False if initialization fails

shivramk edited edge metadata.Sep 5 2017, 12:59 PM
shivramk updated this revision to Diff 1614.

addressed review comments

ikostia edited edge metadata.Sep 6 2017, 5:21 AM

This looks good to me, but I'd like @durham to take a final look.
One of the things that is missing in this diff however is the timeout. Were you going to publish another diff for that?

ryanmce added inline comments.Sep 6 2017, 8:44 AM
remotefilelog/remotefilectx.py
14

I'm not a fan of making conduit global when it could just be a property of the class. Why make it global?

ryanmce requested changes to this revision.Sep 6 2017, 8:44 AM

If there's a good reason, I'll reconsider.

This revision now requires changes to proceed.Sep 6 2017, 8:44 AM
shivramk added inline comments.Sep 6 2017, 8:51 AM
remotefilelog/remotefilectx.py
14

Well the idea was to have a single conduit shared by all remotefilelog instances and have it be initialized only once.

Having said it, if its not common to have multiple instances of remotefilectx in a single invocation of hg then it would indeed be better to have it as an instance variable.

Ah good point. There will be plenty of remotefilectx, but probably there's only one remotefileserver. Perhaps the conduit can live there?

durham added a comment.Sep 6 2017, 1:10 PM

Since the goal of this is performance, I'd still like to know some perf numbers (keeping in mind this code review is open to the public so don't include any internal numbers like repo size). Preferably a comparison of before and after, but at least how fast it is when the code path is executed.

stash edited edge metadata.Sep 12 2017, 12:40 PM

Since the goal of this is performance, I'd still like to know some perf numbers (keeping in mind this code review is open to the public so don't include any internal numbers like repo size).

I can give it a go tomorrow. But I'm pretty sure it won't be worse than the remotefilelog prefetch heuristic we had before.

@shivramk, @ikostia just told me that you were planning to do the measurements yourself. Let me know if you have any problems with it

stash added a comment.Sep 13 2017, 1:09 PM

I've found a bug in implementation and I need some time to figure out how to fix it.

shivramk updated this revision to Diff 1877.Sep 18 2017, 6:22 PM

Moved the fastlog call out of the for loop

The previous version had a bug in that we were calling fastlog inside the for
loop which iterates over ancestor revisions. This would always return the latest
rev. This change moves it out of the for loop. We now call fastlog to retrieve
all revs where the file was modified, and then return the first rev which had
the same content.

I tested the changes with a file in one of our big repos and here are the
performance numbers.

Time taken for hg log with fastlog disabled:

time hg log strings.xml > /dev/null

hg log > /dev/null 10.08s user 0.64s system 99% cpu 10.728 total

With fastlog enabled:

time hg log strings.xml >/dev/null

hg log > /dev/null 1.14s user 0.47s system 59% cpu 2.710 total

stash requested changes to this revision.Sep 19 2017, 4:35 AM
stash added inline comments.
remotefilelog/remotefilectx.py
243–248

Sorry, I just realized I forgot to take draft phases into account :(.
In that case fastlog will be called even on draft revisions. Draft revisions are present only on your host, and that means that fastlog call will fail and it will slow things down.

To fix it we need to move this piece of code in the inner loop below, where we know that we are working with public revision.
See my comments below.

303

Fastlog call should be here, and it should be called only once, so I suggest to have

if seenpublic == False and repo.ui.configbool('fastlog', 'enabled'):
            lnode = self._linknodeviafastlog(repo, path, ancrev, fnode, cl, mfl)
            if lnode:
                return lnode

seenpublic = True

(note ancrev here self._linknodeviafastlog(repo, path, ancrev, fnode, cl, mfl))

One more important thing - in "_linknodeviafastlog" you are currently doing "for anc in results[1:]:". Since now we are checking only public revisions, we don't need to skip the first element, so it should be "for anc in results".

Also I think that we need to have a test that covers this behavior. Lmk if you need help with it!

P. S.
Adding a comment about what we are doing is also a good idea.

339

To route error messages correctly, you need to do:

repo.ui.log('fastloglinkrevfixup', _('fastlog returned 0 results\n'))

Can you please change all of the invocations of repo.ui.log?

341

See above: you don't need to skip first element if you query only public commits.

This revision now requires changes to proceed.Sep 19 2017, 4:35 AM

@shivramk should a test have caught the bug you found? Seems like a correctness issue, so I'm surprised nothing failed. Do we need to add a test?

shivramk updated this revision to Diff 1903.Sep 19 2017, 8:11 PM

Fixed handling of draft commits and logging calls

stash accepted this revision.Sep 20 2017, 5:58 AM

@shivramk It looks good to me, I'm going to accept the revision, but please address the comments, especially two most important ones about timeout and about calling fastlog only once.

Heads up @durham @ryanmce.

I just tested the change again. We'll see speed up on hg log FILENAME for certain files that are affected by linkrev issue. On my host I can reliably repro 7 secs speed up for one of such files - 11 secs down to 4 secs. It should also speed up other commands such as amend, maybe rebases etc.

What puzzled me for a long time is why we see this speed up in hg log command, while we already have fastlog.py extension. Turned out that fastlog.py extension works only for directories. From fastlog.py comment:

# bail on symlinks, and also bail on files for now
# with follow behavior, for files, we are supposed
# to track copies / renames, but it isn't convenient
# to do this through scmquery
remotefilelog/remotefilectx.py
297

Let's run fastlog only once - please run it only if seenpublic is False:

if not seenpublic and repo.ui.configbool('fastlog', 'enabled'):
298–299

Formatting is wrong:

lnode = self._linknodeviafastlog(repo, path, ancrev, fnode,
                                 cl, mfl)

I suggest to use Nuclide - it highlights such problems automatically

301

Can you also log successful fastlog calls:

if lnode:
  repo.ui.log('fastloglinkrevfixup', _('success\n'))
  return lnode
330–337

I completely forgot about timeout, but luckily the change is simple:

Please add these two lines -

diff --git a/remotefilelog/remotefilectx.py b/remotefilelog/remotefilectx.py
--- a/remotefilelog/remotefilectx.py
+++ b/remotefilelog/remotefilectx.py
@@ -325,8 +325,10 @@
             return None
         try:
             srchex = repo[srcrev].hex()
+            FASTLOG_TIMEOUT_IN_SECS = 0.5
             results = self._conduit.call_conduit(
                 'scmquery.log_v2',
+                timeout=FASTLOG_TIMEOUT_IN_SECS,
                 repo=reponame,
                 scm_type='hg',
                 rev=srchex,

Also I added depended revision in phabricator that adds timeout functionality - D742. So if you want to test locally, then first you need to do hgarc patch D742 to get my changes, then rebase your changes on top.

But don't worry, I tested them myself.

stash added a comment.EditedSep 21 2017, 10:01 AM

@shivramk I just pushed diff with timeout. Please address the comments and feel free to land it!

shivramk added inline comments.Sep 21 2017, 1:56 PM
remotefilelog/remotefilectx.py
297

The outer if condition is already checking for not seenpublic. Wouldn't this check be redundant?

shivramk updated this revision to Diff 1972.Sep 21 2017, 2:15 PM

Addressed review comments

stash added inline comments.Sep 22 2017, 5:15 AM
remotefilelog/remotefilectx.py
297

Right, makes sense

stash accepted this revision.Sep 22 2017, 5:16 AM
This revision was automatically updated to reflect the committed changes.