This is an archive of the discontinued Mercurial Phabricator instance.

store: append to fncache if there are only new files to write
ClosedPublic

Authored by pulkit on Nov 26 2018, 7:02 AM.

Details

Summary

Before this patch, adding a new entry to the fncache meant rewriting the whole
fncache, which is slow on large fncaches with millions of entries. Adding a new
entry is a common operation when pulling new files or committing a new file.

This patch adds a new fncache.addls set which keeps track of the additions.
When we write the fncache, we just read the addls set and append those entries
at the end of the fncache.
We make sure that the entries are new by loading the fncache and checking that
the entry does not exist there. In the future, if we can check whether an entry
is new without loading the fncache, that will speed things up further.
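The idea above can be sketched roughly as follows. This is a minimal illustration, not the patch itself: TinyFnCache, its plain-file storage, and demo() are hypothetical names, and the sketch leaves out Mercurial's vfs layer, transactions, and filename encoding.

```python
import os
import tempfile


class TinyFnCache:
    """Hypothetical stand-in for the fncache object, not Mercurial's real class."""

    def __init__(self, path):
        self.path = path
        self.entries = None  # loaded lazily from disk
        self.addls = set()   # entries added since the last write

    def _load(self):
        self.entries = set()
        if os.path.exists(self.path):
            with open(self.path, "rb") as fp:
                self.entries = set(fp.read().decode("utf-8").splitlines())

    def add(self, fn):
        # Loading is still needed to prove that the entry is new.
        if self.entries is None:
            self._load()
        if fn not in self.entries:
            self.addls.add(fn)

    def write(self):
        if not self.addls:
            return
        # Append only the new entries instead of rewriting the whole file.
        with open(self.path, "ab") as fp:
            fp.write(("\n".join(sorted(self.addls)) + "\n").encode("utf-8"))
        self.entries = None  # force a reload on next use
        self.addls = set()


def demo():
    """Seed a one-entry file, add one old and one new name, return the file lines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "fncache")
        with open(path, "wb") as fp:
            fp.write(b"data/a.i\n")
        cache = TinyFnCache(path)
        cache.add("data/a.i")  # already present, ignored
        cache.add("data/b.i")  # new, queued in addls
        cache.write()
        with open(path, "rb") as fp:
            return fp.read().decode("utf-8").splitlines()
```

The cost of write() here depends on the number of new entries rather than on the size of the existing file, which is where the saving on large fncaches would come from. The load in add() remains, as noted above.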

Performance numbers for committing a new file:

mercurial repo
before: 0.08784651756286621
after: 0.08474504947662354

mozilla-central
before: 1.83314049243927
after: 1.7054164409637451

netbeans
before: 0.7953150272369385
after: 0.7202838659286499

pypy
before: 0.17805707454681396
after: 0.13431048393249512

In our internal repo, the performance improvement is on the order of seconds.

I used octobus's ASV-based perf benchmarks to get the above numbers. I also
see some minute perf improvements when creating a new commit without a new
file, but I believe that's just noise.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

pulkit created this revision. Nov 26 2018, 7:02 AM
This revision was automatically updated to reflect the committed changes.
yuja added a subscriber: yuja. Nov 27 2018, 7:24 AM

@@ -479,32 +481,45 @@

                 fp.write(encodedir('\n'.join(self.entries) + '\n'))
             fp.close()
             self._dirty = False
+        if self.addls:
+            # if we have just new entries, let's append them to the fncache
+            tr.addbackup('fncache')
+            fp = self.vfs('fncache', mode='ab', atomictemp=True)
+            if self.addls:
+                fp.write(encodedir('\n'.join(self.addls) + '\n'))
+            fp.close()
+            self.entries = None
+            self.addls = set()

It's probably better to write entries | addls at once if there are both
adds and removes. Appending to an atomictemp file means the entire file
is copied first.
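One way to read this suggestion, as a rough sketch only: write_fncache and its arguments are hypothetical, a temp file plus os.replace stands in for an atomictemp write, and a plain in-place append stands in for a non-atomictemp one. Transaction backups (tr.addbackup) and filename encoding are left out.

```python
import os
import tempfile


def write_fncache(path, entries, addls, dirty):
    """Hypothetical helper; returns the name of the strategy it used."""
    if dirty:
        # Adds and removes both happened: write entries | addls in one pass.
        merged = entries | addls
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
        with os.fdopen(fd, "wb") as fp:
            if merged:
                fp.write(("\n".join(sorted(merged)) + "\n").encode("utf-8"))
        os.replace(tmp, path)  # atomic swap of the fully written file
        return "rewrite"
    if addls:
        # Only additions: append in place so the existing file is not copied.
        with open(path, "ab") as fp:
            fp.write(("\n".join(sorted(addls)) + "\n").encode("utf-8"))
        return "append"
    return "noop"
```

In the dirty case the file is written exactly once. In the append-only case the sketch gives up the atomic swap in exchange for not copying the file; whether that trade-off is acceptable inside a transaction is a question for the real patch, not something this sketch settles.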