This is an archive of the discontinued Mercurial Phabricator instance.

revlog: move the nodemap into the index object (for pure)
ClosedPublic

Authored by marmoute on Nov 8 2019, 4:28 AM.

Details

Summary

This make the pure code closer to the C extension one. The ultimate goal is to
merge the two into a single object and offer a unified API. This changeset
focus on gathering the data on the same object.

For now the code for revlogoldindex and BaseIndexObject index object are
quite similar. However, there will be larger divergence later on, so I don't
think is worth doing a base case.

This work is part of a refactoring to unify the revlog index and the nodemap.
This unification prepare the use of a persistent nodemap.

Diff Detail

Repository
rHG Mercurial
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

marmoute created this revision.Nov 8 2019, 4:28 AM

This seems OK to me aside from the use of range(), which we don't want since len() can be massive and this could create perf problems on Python 2.

mercurial/pure/parsers.py
53

We want this to be pycompat.xrange.

mercurial/revlog.py
211

pycompat.xrange.

This seems OK to me aside from the use of range(), which we don't want since len() can be massive and this could create perf problems on Python 2.

I am not too convinced by a switch to range(). I've two main reason:

  1. this is the "pure" version, so people are unlikely to use it on large repository, they would get in other large performance issue before that.
  2. we are about to allocate a full nodemap, this will be significantly slower than the associated range() call.

Do you still cant the pycompat.xrange change ?

This revision was not accepted when it landed; it landed in state Needs Review.
This revision was automatically updated to reflect the committed changes.
yuja added a subscriber: yuja.Nov 10 2019, 4:36 AM

@@ -662,29 +665,10 @@

except TypeError:
    raise
except error.RevlogError:
  • if not isinstance(self._nodecache, revlogutils.NodeMap):
  • # parsers.c radix tree lookup failed
  • if node == wdirid or node in wdirfilenodeids:
  • raise error.WdirUnsupported
  • raise error.LookupError(node, self.indexfile, _(b'no node'))
  • else:
  • # pure python cache lookup failed
  • n = self._nodecache
  • i = self.index
  • p = self._nodepos
  • if p is None:
  • p = len(i) - 1
  • else:
  • assert p < len(i)
  • for r in pycompat.xrange(p, -1, -1):
  • v = i[r][7]
  • n[v] = r
  • if v == node:
  • self._nodepos = r - 1
  • return r
  • if node == wdirid or node in wdirfilenodeids:
  • raise error.WdirUnsupported
  • raise error.LookupError(node, self.indexfile, _(b'no node'))

+ # parsers.c radix tree lookup failed
+ if node == wdirid or node in wdirfilenodeids:
+ raise error.WdirUnsupported
+ raise error.LookupError(node, self.indexfile, _(b'no node'))

So revlog._nodepos is now useless since the node map is no longer built
incrementally.