remotefilelog/basepack.py
92–109	This is basically @singhsrb's refactor to `_getavailablepackfiles` in D1217 with some nore goodies.

remotefilelog/basepack.py
92–109	*more

remotefilelog/basepack.py
61	I don't see what the general convention is, but I would prefer to see `for filepath, _mtime, _size in self._getavailablepackfilessorted():`
105	I removed the `id` from `ids` at this point in my refactor. I did not confirm if it actually helps or not but maybe something worth considering.
121	`size -> _size`
136	`file->_file`, `mtime->_mtime`
143	Hmm, the number of packs in `self.packs` can differ from the number that `gettotalsize` sees, because `self.packs` may not include the most recent packs.

remotefilelog/basepack.py
61	Why prefix them?

remotefilelog/basepack.py
61	We talked offline -- it's to mark it as unused. I like the idea of marking unused variables, but `_size` looks too much to me like an instance variable (and the next person might not rename it). Since `_` is used by the localization framework, I humbly propose `__`. It doesn't look _too_ weird, although it's not been done before.
105	My hunch is the combined time of removing the items is greater than the time saved in subsequent lookups -- but I could be wrong. It's probably still a good idea to reduce memory usage.
143	This is a great point, the two could be inconsistent.
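
The pairing logic discussed in the line 105 comments can be sketched outside the repository as follows. This is a minimal sketch, assuming the directory listing arrives as `(filename, size, mtime)` tuples and that the suffixes are `.dataidx` and `.datapack`; `pairpackfiles` and its parameter are hypothetical names, not part of remotefilelog. It drops each ID from the pending set once its second file is seen, which is the variant suggested for line 105.

```python
from collections import defaultdict

INDEXSUFFIX = ".dataidx"   # assumed suffix, for illustration only
PACKSUFFIX = ".datapack"   # assumed suffix, for illustration only


def pairpackfiles(entries):
    """Yield (id, newest mtime, combined size) once both files of a pack are seen.

    ``entries`` is an iterable of (filename, size, mtime) tuples.
    """
    pending = set()                 # IDs seen exactly once so far
    sizes = defaultdict(int)
    mtimes = defaultdict(list)
    for filename, size, mtime in entries:
        if filename.endswith(INDEXSUFFIX):
            packid = filename[:-len(INDEXSUFFIX)]
        elif filename.endswith(PACKSUFFIX):
            packid = filename[:-len(PACKSUFFIX)]
        else:
            continue                # not a pack file; ignore it
        sizes[packid] += size
        mtimes[packid].append(mtime)
        if packid in pending:
            # Second file of the pair: drop the ID so the set only holds
            # packs that are still waiting for their partner file.
            pending.discard(packid)
            yield packid, max(mtimes[packid]), sizes[packid]
        else:
            pending.add(packid)


listing = [
    ("a.dataidx", 10, 100.0),
    ("tmp123", 7, 50.0),
    ("a.datapack", 90, 105.0),
    ("b.datapack", 40, 200.0),  # no index file yet, so never yielded
]
result = list(pairpackfiles(listing))
```

Whether the removal saves time is not measured here; it mainly keeps the set small.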

remotefilelog/basepack.py
61	Does it make more sense to actually get the tuple and extract stuff out of it instead of expanding in the loop in this case?

remotefilelog/basepack.py
61	Would that be to eliminate the unused variables?

remotefilelog/basepack.py
143	I ended up combining the calculation of both into a `gettotalsizeandcount` (ick) to ensure consistency.
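
A single-pass version of that combined calculation might look like the sketch below. It assumes an iterable of `(path, mtime, size)` tuples such as `_getavailablepackfiles()` yields in this diff; the standalone function and its argument are illustrative, not the committed implementation.

```python
def gettotalsizeandcount(packfiles):
    """Return (total size in bytes, number of packs) from one pass.

    ``packfiles`` is an iterable of (path, mtime, size) tuples. Deriving both
    numbers from the same listing keeps them consistent with each other.
    """
    totalsize = 0
    count = 0
    for _path, _mtime, size in packfiles:
        totalsize += size
        count += 1
    return totalsize, count


packfiles = [("/packs/a", 105.0, 100), ("/packs/b", 200.0, 40)]
totalsize, count = gettotalsizeandcount(packfiles)
metrics = {"numpacks": count, "totalsize": totalsize}
```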

remotefilelog/basepack.py
61	I was thinking of something like:
	    for pack in self._getavailablepackfilessorted():
	        filepath = pack.filepath
	        . .
143	That's what I was mainly concerned about. Thanks for taking care of this!
remotefilelog/shallowutil.py
98	I am not sure `sum` is the right word here mainly because of its use in the string case. Perhaps, `add` or `aggregate` is better.
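
For the line 61 thread in `basepack.py`, the record-style iteration suggested there could be sketched with a `namedtuple`. The `packfileinfo` type, the `newestfirst` helper, and the sample data are hypothetical; the diff under review yields plain tuples.

```python
from collections import namedtuple

# Hypothetical record type; field names mirror the tuple documented in the diff.
packfileinfo = namedtuple("packfileinfo", ["filepath", "mtime", "size"])


def newestfirst(packfiles):
    """Return pack records sorted by mtime, newest first."""
    return sorted(packfiles, key=lambda p: p.mtime, reverse=True)


packfiles = [
    packfileinfo("/packs/a", 105.0, 100),
    packfileinfo("/packs/b", 200.0, 40),
]
# Callers read only the fields they need, so no unused loop variables appear.
paths = [pack.filepath for pack in newestfirst(packfiles)]
```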

remotefilelog/basepack.py
61	I misunderstood the discussion last week (I thought it was choosing between `_name` vs `__`). But if it's between `_name` and `name`, I don't think it has to be that strict as long as pyflakes does not complain. The unprefixed version will cause less code churn if the variables are used in a later patch, which makes the blame output tidier. I remember that the Mercurial community strongly preferred less code churn when mpm was still there - for example, there was an AST transformer to convert `'str'` to `b'str'` for Py3 compatibility, instead of doing a codemod.

remotefilelog/basepack.py
61	It looks like the 2nd and 3rd values returned from `_getavailablepackfilessorted` are never used. Can we just not return them?
105	If we're going to remove it, I'd add a comment explaining why. I don't think we really need that optimization, since most of the time this loop will only be over 10's of items.
139	Count is not incremented
remotefilelog/shallowrepo.py
80–81	I'd drop the underscore and have the helper function add it. So the caller doesn't have to be aware of formatting. I'd also not use [] in this case and just pass the list as a list, since that's how it's consumed later. I know we use the pattern in other places in remotefilelog, but I've come to regret it. If you did want to use stores in the function, then you'd just pass `, 'filestore_', packcontentstore, packmetadatastore)` here. Same for sumdicts, where we pass in metrics, and receive out dicts. Just drop the
remotefilelog/shallowutil.py
106	The default of 0 here means it won't work with anything other than numbers.
	    >>> from collections import defaultdict
	    >>> foo = defaultdict(lambda: 0)
	    >>> foo['bar'] += ['a', 'b', 'c']
	    Traceback (most recent call last):
	      File "<stdin>", line 1, in <module>
	    TypeError: unsupported operand type(s) for +=: 'int' and 'list'
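
The calling convention proposed in the `shallowrepo.py` comment in this batch (pass the stores as a plain list and let the helper add the underscore) might look like this sketch. `fakeui` and `fakestore` are stand-ins invented for the example, and `fakeui.log` only mimics the call shape used in the diff.

```python
class fakeui(object):
    """Stand-in that records log calls; not Mercurial's ui class."""

    def __init__(self):
        self.logged = []

    def log(self, event, msg, **opts):
        self.logged.append((event, msg, opts))


class fakestore(object):
    """Stand-in store that returns fixed metrics."""

    def __init__(self, metrics):
        self._metrics = metrics

    def getmetrics(self):
        return self._metrics


def reportpackmetrics(ui, prefix, stores):
    """Sum numeric metrics over ``stores`` and log them as '<prefix>_<key>'."""
    totals = {}
    for store in stores:
        for key, value in store.getmetrics().items():
            totals[key] = totals.get(key, 0) + value
    # The helper owns the separator, so callers pass a bare prefix.
    prefixed = {"%s_%s" % (prefix, k): v for k, v in totals.items()}
    ui.log(prefix + "_packsizes", "", **prefixed)


ui = fakeui()
stores = [
    fakestore({"numpacks": 2, "totalsize": 140}),
    fakestore({"numpacks": 1, "totalsize": 60}),
]
reportpackmetrics(ui, "filestore", stores)
```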

remotefilelog/basepack.py
139	Well definitely needed another pair of eyes :).

remotefilelog/basepack.py
61	I think it's more confusing if we don't, because that function claims to otherwise be identical to `_getavailablepackfilessorted()`.
139	whoops, thanks
remotefilelog/shallowutil.py
106	hm, derp. I'll just drop the strings array comment for now. We could always set the default conditionally based on the types of the values later on, if we really want that use case.
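
If the list-valued use case were ever wanted, one alternative to choosing a default by type is to seed each key with the first value seen. This is a sketch of that idea, not what the diff does; the numeric-only `defaultdict(lambda: 0)` version remains the reviewed code.

```python
def sumdicts(*dicts):
    """Add the values of ``dicts`` together key by key.

    Works for any values that support ``+`` with each other (numbers,
    lists, strings), because no default value is assumed.
    """
    result = {}
    for d in dicts:
        for k, v in d.items():
            if k in result:
                # ``a + b`` builds a new object, so input lists are not mutated.
                result[k] = result[k] + v
            else:
                result[k] = v
    return result


numbers = sumdicts({"a": 4, "b": 2}, {"b": 3, "c": 1})
first = {"names": ["bob", "sue"]}
names = sumdicts(first, {"names": ["jim"]})
```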

remotefilelog/basepack.py
61	*`_getavailablepackfiles()`

Diff	ID	Description	Created	Lint	Unit
Base		Base
Diff 1	3260		Nov 3 2017, 12:12 PM	★	★
Diff 2	3261		Nov 3 2017, 12:13 PM	★	★
Diff 3	3288		Nov 5 2017, 9:25 PM	★	★
Diff 4	3290		Nov 5 2017, 9:33 PM	★	★
Diff 5	3298		Nov 6 2017, 4:31 PM	★	★
Diff 6	3301		Nov 6 2017, 9:08 PM	★	★

	Status	Author	Revision
	Accepted	phillco	D1312 packfiles: add hg debugpackstatus
	Closed	phillco	D1309 packs: improve packfile metrics

Diff 3260

cstore/py-datapackstore.h

	}			}
	}			}

	static PyObject uniondatapackstore_markforrefresh(py_uniondatapackstore self) {			static PyObject uniondatapackstore_markforrefresh(py_uniondatapackstore self) {
	self->uniondatapackstore->markForRefresh();			self->uniondatapackstore->markForRefresh();
	Py_RETURN_NONE;			Py_RETURN_NONE;
	}			}

				static PyObject uniondatapackstore_getmetrics(py_uniondatapackstore self) {
				// TODO: implement
				printf("crap");
				return NULL;
				}

	// --------- UnionDatapackStore Declaration ---------			// --------- UnionDatapackStore Declaration ---------

	static PyMethodDef uniondatapackstore_methods[] = {			static PyMethodDef uniondatapackstore_methods[] = {
	{"get", (PyCFunction)uniondatapackstore_get, METH_VARARGS, ""},			{"get", (PyCFunction)uniondatapackstore_get, METH_VARARGS, ""},
	{"getdeltachain", (PyCFunction)uniondatapackstore_getdeltachain, METH_VARARGS, ""},			{"getdeltachain", (PyCFunction)uniondatapackstore_getdeltachain, METH_VARARGS, ""},
	{"getmissing", (PyCFunction)uniondatapackstore_getmissing, METH_O, ""},			{"getmissing", (PyCFunction)uniondatapackstore_getmissing, METH_O, ""},
	{"markforrefresh", (PyCFunction)uniondatapackstore_markforrefresh, METH_NOARGS, ""},			{"markforrefresh", (PyCFunction)uniondatapackstore_markforrefresh, METH_NOARGS, ""},
				{"getmetrics", (PyCFunction)uniondatapackstore_getmetrics, METH_NOARGS, ""},
	{NULL, NULL}			{NULL, NULL}
	};			};

	static PyTypeObject uniondatapackstoreType = {			static PyTypeObject uniondatapackstoreType = {
	PyObject_HEAD_INIT(NULL)			PyObject_HEAD_INIT(NULL)
	0, /* ob_size */			0, /* ob_size */
	"cstore.uniondatapackstore", /* tp_name */			"cstore.uniondatapackstore", /* tp_name */
	sizeof(py_uniondatapackstore), /* tp_basicsize */			sizeof(py_uniondatapackstore), /* tp_basicsize */

remotefilelog/basepack.py

	from __future__ import absolute_import			from __future__ import absolute_import

	import errno, hashlib, mmap, os, struct, time			import errno, hashlib, mmap, os, struct, time

				from collections import defaultdict
	from mercurial import policy, pycompat, util			from mercurial import policy, pycompat, util
	from mercurial.i18n import _			from mercurial.i18n import _
	from mercurial import vfs as vfsmod			from mercurial import vfs as vfsmod

	from . import shallowutil			from . import shallowutil

	osutil = policy.importmod(r'osutil')			osutil = policy.importmod(r'osutil')

	class basepackstore(object):			class basepackstore(object):
	def __init__(self, ui, path):			def __init__(self, ui, path):
	self.ui = ui			self.ui = ui
	self.path = path			self.path = path
	self.packs = []			self.packs = []
	# lastrefesh is 0 so we'll immediately check for new packs on the first			# lastrefesh is 0 so we'll immediately check for new packs on the first
	# failure.			# failure.
	self.lastrefresh = 0			self.lastrefresh = 0
	for filepath in self._getavailablepackfiles():			for filepath in self._getavailablepackfilessorted():
				singhsrb (Done): I don't see what the general convention is, but I would prefer to see `for filepath, _mtime, _size in self._getavailablepackfilessorted():`
				phillco (Author, Not Done): Why prefix them?
				phillco (Author, Not Done): We talked offline -- it's to mark it as unused. I like the idea of marking unused variables, but `_size` looks too much to me like an instance variable (and the next person might not rename it). Since `_` is used by the localization framework, I humbly propose `__`. It doesn't look _too_ weird, although it's not been done before.
				singhsrb (Not Done): Does it make more sense to actually get the tuple and extract stuff out of it instead of expanding in the loop in this case?
				phillco (Author, Not Done): Would that be to eliminate the unused variables?
				singhsrb (Not Done): I was thinking of something like: `for pack in self._getavailablepackfilessorted(): filepath = pack.filepath . .`
				quark (Not Done): I misunderstood the discussion last week (I thought it was choosing between `_name` vs `__`). But if it's between `_name` and `name`, I don't think it has to be that strict as long as pyflakes does not complain. The unprefixed version will cause less code churn if the variables are used in a later patch, which makes the blame output tidier. I remember that the Mercurial community strongly preferred less code churn when mpm was still there - for example, there was an AST transformer to convert `'str'` to `b'str'` for Py3 compatibility, instead of doing a codemod.
				durham (Not Done): It looks like the 2nd and 3rd values returned from `_getavailablepackfilessorted` are never used. Can we just not return them?
				phillco (Author, Not Done): I think it's more confusing if we don't, because that function claims to otherwise be identical to `_getavailablepackfilessorted()`.
				phillco (Author, Not Done): *`_getavailablepackfiles()`
	try:			try:
	pack = self.getpack(filepath)			pack = self.getpack(filepath)
	except Exception as ex:			except Exception as ex:
	# An exception may be thrown if the pack file is corrupted			# An exception may be thrown if the pack file is corrupted
	# somehow. Log a warning but keep going in this case, just			# somehow. Log a warning but keep going in this case, just
	# skipping this pack file.			# skipping this pack file.
	#			#
	# If this is an ENOENT error then don't even bother logging.			# If this is an ENOENT error then don't even bother logging.
	# Someone could have removed the file since we retrieved the			# Someone could have removed the file since we retrieved the
	# list of paths.			# list of paths.
	if getattr(ex, 'errno', None) != errno.ENOENT:			if getattr(ex, 'errno', None) != errno.ENOENT:
	ui.warn(_('unable to load pack %s: %s\n') % (filepath, ex))			ui.warn(_('unable to load pack %s: %s\n') % (filepath, ex))
	continue			continue
	self.packs.append(pack)			self.packs.append(pack)

	def _getavailablepackfiles(self):			def _getavailablepackfiles(self):
	suffixlen = len(self.INDEXSUFFIX)			"""For each pack file (an index/data file combo), yields:
				(full path without extension, mtime, size)

	totalsize = 0			mtime will be the mtime of the index/data file (whichever is newer)
	files = []			size is the combined size of index/data file
	filenames = set()			"""
				indexsuffixlen = len(self.INDEXSUFFIX)
				packsuffixlen = len(self.PACKSUFFIX)

				ids = set()
				sizes = defaultdict(lambda: 0)
				mtimes = defaultdict(lambda: [])
	try:			try:
	for filename, size, stat in osutil.listdir(self.path, stat=True):			for filename, size, stat in osutil.listdir(self.path, stat=True):
	files.append((stat.st_mtime, filename))			id = None
	filenames.add(filename)			if filename[-indexsuffixlen:] == self.INDEXSUFFIX:
	totalsize += size			id = filename[:-indexsuffixlen]
				elif filename[-packsuffixlen:] == self.PACKSUFFIX:
				id = filename[:-packsuffixlen]

				# Since we expect to have two files corresponding to each ID
				# (the index file and the pack file), we can yield once we see
				# it twice.
				if id:
				sizes[id] += size # Sum both files' sizes together
				mtimes[id].append(stat.st_mtime)
				if id in ids:
				yield (os.path.join(self.path, id), max(mtimes[id]),
				singhsrb (Done): I removed the `id` from `ids` at this point in my refactor. I did not confirm if it actually helps or not but maybe something worth considering.
				phillco (Author, Done): My hunch is the combined time of removing the items is greater than the time saved in subsequent lookups -- but I could be wrong. It's probably still a good idea to reduce memory usage.
				durham (Done): If we're going to remove it, I'd add a comment explaining why. I don't think we really need that optimization, since most of the time this loop will only be over 10's of items.
				sizes[id])
				else:
				ids.add(id)
	except OSError as ex:			except OSError as ex:
				phillco (Author, Done): This is basically @singhsrb's refactor to `_getavailablepackfiles` in D1217 with some nore goodies.
				phillco (Author, Done): *more
	if ex.errno != errno.ENOENT:			if ex.errno != errno.ENOENT:
	raise			raise

	numpacks = len(filenames)			def _getavailablepackfilessorted(self):
	self.ui.log("packsizes", "packstore %s has %d packs totaling %s\n" %			"""Like `_getavailablepackfiles`, but also sorts the files by mtime,
	(self.path, numpacks, util.bytecount(totalsize)),			yielding newest files first.
	numpacks=numpacks,
	totalsize=totalsize)			This is desirable, since it is more likely newer packfiles have more
	# Put most recent pack files first since they contain the most recent			desirable data.
	# info.			"""
				files = []
				for path, mtime, size in self._getavailablepackfiles():
				singhsrb (Not Done): `size -> _size`
				files.append((mtime, path))
	files = sorted(files, reverse=True)			files = sorted(files, reverse=True)
	for mtime, filename in files:			for mtime, path in files:
	packfilename = '%s%s' % (filename[:-suffixlen], self.PACKSUFFIX)			yield path
	if (filename[-suffixlen:] == self.INDEXSUFFIX
	and packfilename in filenames):			def gettotalsize(self):
	yield os.path.join(self.path, filename)[:-suffixlen]			"""Returns the total disk size (in bytes) of all the pack files in
				this store.

				(This might be smaller than the total size of the ``self.path``
				directory, since this only considers fully-written pack files, and not
				temporary files or other detritus on the directory.)
				"""
				totalsize = 0
				for file, mtime, size in self._getavailablepackfiles():
				singhsrb (Done): `file->_file`, `mtime->_mtime`
				totalsize += size
				return totalsize

				durham (Done): Count is not incremented
				singhsrb (Done): Well definitely needed another pair of eyes :).
				phillco (Author, Done): whoops, thanks
				def getmetrics(self):
				"""Returns metrics on the state of this store."""
				return {
				'numpacks': len(self.packs),
				singhsrb (Done): Hmm, the number of packs in `self.packs` can differ from the number that `gettotalsize` sees, because `self.packs` may not include the most recent packs.
				phillco (Author, Done): This is a great point, the two could be inconsistent.
				phillco (Author, Done): I ended up combining the calculation of both into a `gettotalsizeandcount` (ick) to ensure consistency.
				singhsrb (Done): That's what I was mainly concerned about. Thanks for taking care of this!
				'totalsize': self.gettotalsize(),
				}

	def getpack(self, path):			def getpack(self, path):
	raise NotImplemented()			raise NotImplemented()

	def getmissing(self, keys):			def getmissing(self, keys):
	missing = keys			missing = keys
	for pack in self.packs:			for pack in self.packs:
	missing = pack.getmissing(missing)			missing = pack.getmissing(missing)
	# new objects), let's only actually check disk for new stuff every once			# new objects), let's only actually check disk for new stuff every once
	# in a while. Generally this code path should only ever matter when a			# in a while. Generally this code path should only ever matter when a
	# repack is going on in the background, and that should be pretty rare			# repack is going on in the background, and that should be pretty rare
	# to have that happen twice in quick succession.			# to have that happen twice in quick succession.
	newpacks = []			newpacks = []
	if now > self.lastrefresh + REFRESHRATE:			if now > self.lastrefresh + REFRESHRATE:
	self.lastrefresh = now			self.lastrefresh = now
	previous = set(p.path for p in self.packs)			previous = set(p.path for p in self.packs)
	new = set(self._getavailablepackfiles()) - previous			for filepath in self._getavailablepackfilessorted():
				if filepath not in previous:
	for filepath in new:
	newpacks.append(self.getpack(filepath))			newpacks.append(self.getpack(filepath))
	self.packs.extend(newpacks)			self.packs.extend(newpacks)

	return newpacks			return newpacks

	class versionmixin(object):			class versionmixin(object):
	# Mix-in for classes with multiple supported versions			# Mix-in for classes with multiple supported versions
	VERSION = None			VERSION = None
	SUPPORTED_VERSIONS = [0]			SUPPORTED_VERSIONS = [0]

remotefilelog/contentstore.py

	"""Returns the metadata dict for given node."""			"""Returns the metadata dict for given node."""
	for store in self.stores:			for store in self.stores:
	try:			try:
	return store.getmeta(name, node)			return store.getmeta(name, node)
	except KeyError:			except KeyError:
	pass			pass
	raise KeyError((name, hex(node)))			raise KeyError((name, hex(node)))

				def getmetrics(self):
				metrics = [s.getmetrics() for s in self.stores]
				return shallowutil.sumdicts(*metrics)

	def _getpartialchain(self, name, node):			def _getpartialchain(self, name, node):
	"""Returns a partial delta chain for the given name/node pair.			"""Returns a partial delta chain for the given name/node pair.

	A partial chain is a chain that may not be terminated in a full-text.			A partial chain is a chain that may not be terminated in a full-text.
	"""			"""
	for store in self.stores:			for store in self.stores:
	try:			try:
	return store.getdeltachain(name, node)			return store.getdeltachain(name, node)

remotefilelog/metadatastore.py

	if missing:			if missing:
	missing = store.getmissing(missing)			missing = store.getmissing(missing)
	return missing			return missing

	def markledger(self, ledger):			def markledger(self, ledger):
	for store in self.stores:			for store in self.stores:
	store.markledger(ledger)			store.markledger(ledger)

				def getmetrics(self):
				metrics = [s.getmetrics() for s in self.stores]
				return shallowutil.sumdicts(*metrics)

	class remotefilelogmetadatastore(basestore.basestore):			class remotefilelogmetadatastore(basestore.basestore):
	def getancestors(self, name, node, known=None):			def getancestors(self, name, node, known=None):
	"""Returns as many ancestors as we're aware of.			"""Returns as many ancestors as we're aware of.

	return value: {			return value: {
	node: (p1, p2, linknode, copyfrom),			node: (p1, p2, linknode, copyfrom),
	...			...
	}			}

remotefilelog/shallowrepo.py

	packcontentstore = datapackstore(			packcontentstore = datapackstore(
	repo.ui,			repo.ui,
	packpath,			packpath,
	usecdatapack=repo.ui.configbool('remotefilelog', 'fastdatapack'))			usecdatapack=repo.ui.configbool('remotefilelog', 'fastdatapack'))
	packmetadatastore = historypackstore(repo.ui, packpath)			packmetadatastore = historypackstore(repo.ui, packpath)

	repo.shareddatastores.append(packcontentstore)			repo.shareddatastores.append(packcontentstore)
	repo.sharedhistorystores.append(packmetadatastore)			repo.sharedhistorystores.append(packmetadatastore)
				shallowutil.reportpackmetrics(repo.ui, 'filestore_', *[packcontentstore,
				packmetadatastore])
				durham (Not Done): I'd drop the underscore and have the helper function add it. So the caller doesn't have to be aware of formatting. I'd also not use [] in this case and just pass the list as a list, since that's how it's consumed later. I know we use the pattern in other places in remotefilelog, but I've come to regret it. If you did want to use stores in the function, then you'd just pass `, 'filestore_', packcontentstore, packmetadatastore)` here. Same for sumdicts, where we pass in metrics, and receive out dicts. Just drop the
	return packcontentstore, packmetadatastore			return packcontentstore, packmetadatastore

	def makeunionstores(repo):			def makeunionstores(repo):
	"""Union stores iterate the other stores and return the first result."""			"""Union stores iterate the other stores and return the first result."""
	repo.shareddatastores = []			repo.shareddatastores = []
	repo.sharedhistorystores = []			repo.sharedhistorystores = []

	packcontentstore, packmetadatastore = makepackstores(repo)			packcontentstore, packmetadatastore = makepackstores(repo)

remotefilelog/shallowutil.py

	# shallowutil.py -- remotefilelog utilities			# shallowutil.py -- remotefilelog utilities
	#			#
	# Copyright 2014 Facebook, Inc.			# Copyright 2014 Facebook, Inc.
	#			#
	# This software may be used and distributed according to the terms of the			# This software may be used and distributed according to the terms of the
	# GNU General Public License version 2 or any later version.			# GNU General Public License version 2 or any later version.
	from __future__ import absolute_import			from __future__ import absolute_import

	import errno, hashlib, os, stat, struct, tempfile			import errno, hashlib, os, stat, struct, tempfile

				from collections import defaultdict
	from mercurial import filelog, revlog, util, error			from mercurial import filelog, revlog, util, error
	from mercurial.i18n import _			from mercurial.i18n import _

	from . import constants			from . import constants

	if os.name != 'nt':			if os.name != 'nt':
	import grp			import grp

	def parsemeta(text):			def parsemeta(text):
	"""parse mercurial filelog metadata"""			"""parse mercurial filelog metadata"""
	meta, size = filelog.parsemeta(text)			meta, size = filelog.parsemeta(text)
	if text.startswith('\1\n'):			if text.startswith('\1\n'):
	s = text.index('\1\n', 2)			s = text.index('\1\n', 2)
	text = text[s + 2:]			text = text[s + 2:]
	return meta or {}, text			return meta or {}, text

				def sumdicts(*dicts):
				singhsrb (Done): I am not sure `sum` is the right word here mainly because of its use in the string case. Perhaps, `add` or `aggregate` is better.
				"""Adds all the values of *dicts together into one dictionary. This assumes
				the values in *dicts are all summable.

				e.g. [{'a': 4, 'b': 2}, {'b': 3, 'c': 1}] -> {'a': 4, 'b': 5, 'c': 1}
				or, {'names': ['bob', 'sue']}, {'names': ['jim']} ->
				{'names': ['bob', 'sue', 'jim']}
				"""
				result = defaultdict(lambda: 0)
				durham (Done): The default of 0 here means it won't work with anything other than numbers. After `from collections import defaultdict` and `foo = defaultdict(lambda: 0)`, running `foo['bar'] += ['a', 'b', 'c']` raises `TypeError: unsupported operand type(s) for +=: 'int' and 'list'`.
				phillco (Author, Done): hm, derp. I'll just drop the strings array comment for now. We could always set the default conditionally based on the types of the values later on, if we really want that use case.
				for dict in dicts:
				for k, v in dict.iteritems():
				result[k] += v
				return result

				def prefixkeys(dict, prefix):
				"""Returns ``dict`` with ``prefix`` prepended to all its keys."""
				result = {}
				for k, v in dict.iteritems():
				result[prefix + k] = v
				return result

				def reportpackmetrics(ui, prefix, *stores):
				dicts = [s.getmetrics() for s in stores]
				dict = prefixkeys(sumdicts(*dicts), prefix)
				ui.log(prefix + "packsizes", "", **dict)

	def _parsepackmeta(metabuf):			def _parsepackmeta(metabuf):
	"""parse datapack meta, bytes (<metadata-list>) -> dict			"""parse datapack meta, bytes (<metadata-list>) -> dict

	The dict contains raw content - both keys and values are strings.			The dict contains raw content - both keys and values are strings.
	Upper-level business may want to convert some of them to other types like			Upper-level business may want to convert some of them to other types like
	integers, on their own.			integers, on their own.

	raise ValueError if the data is corrupted			raise ValueError if the data is corrupted

treemanifest/__init__.py

	localpackpath = shallowutil.getlocalpackpath(repo.svfs.vfs.base,			localpackpath = shallowutil.getlocalpackpath(repo.svfs.vfs.base,
	PACK_CATEGORY)			PACK_CATEGORY)

	# Data store			# Data store
	if repo.ui.configbool("treemanifest", "usecunionstore"):			if repo.ui.configbool("treemanifest", "usecunionstore"):
	datastore = cstore.datapackstore(packpath)			datastore = cstore.datapackstore(packpath)
	localdatastore = cstore.datapackstore(localpackpath)			localdatastore = cstore.datapackstore(localpackpath)
	# TODO: can't use remotedatastore with cunionstore yet			# TODO: can't use remotedatastore with cunionstore yet
				# TODO make reportmetrics work with cstore
	mfl.datastore = cstore.uniondatapackstore([localdatastore, datastore])			mfl.datastore = cstore.uniondatapackstore([localdatastore, datastore])
	else:			else:
	datastore = datapackstore(repo.ui, packpath, usecdatapack=usecdatapack)			datastore = datapackstore(repo.ui, packpath, usecdatapack=usecdatapack)
	localdatastore = datapackstore(repo.ui, localpackpath,			localdatastore = datapackstore(repo.ui, localpackpath,
	usecdatapack=usecdatapack)			usecdatapack=usecdatapack)
	stores = [datastore, localdatastore]			stores = [datastore, localdatastore]
	remotedatastore = remotetreedatastore(repo)			remotedatastore = remotetreedatastore(repo)
	if repo.ui.configbool("treemanifest", "demanddownload", True):			if repo.ui.configbool("treemanifest", "demanddownload", True):
	mfl.localhistorystores = [			mfl.localhistorystores = [
	localhistorystore,			localhistorystore,
	]			]
	mfl.historystore = unionmetadatastore(			mfl.historystore = unionmetadatastore(
	sharedhistorystore,			sharedhistorystore,
	localhistorystore,			localhistorystore,
	writestore=localhistorystore,			writestore=localhistorystore,
	)			)
				shallowutil.reportpackmetrics(repo.ui, 'treestore_', *[mfl.datastore,
				mfl.historystore])

	class treemanifestlog(manifest.manifestlog):			class treemanifestlog(manifest.manifestlog):
	def __init__(self, opener, treemanifest=False):			def __init__(self, opener, treemanifest=False):
	assert treemanifest is False			assert treemanifest is False
	cachesize = 4			cachesize = 4

	opts = getattr(opener, 'options', None)			opts = getattr(opener, 'options', None)
	if opts is not None:			if opts is not None:
	raise RuntimeError("cannot add to a remote store")			raise RuntimeError("cannot add to a remote store")

	def getmissing(self, keys):			def getmissing(self, keys):
	return keys			return keys

	def markledger(self, ledger):			def markledger(self, ledger):
	pass			pass

				def getmetrics(self):
				return {}

	def serverrepack(repo, incremental=False):			def serverrepack(repo, incremental=False):
	packpath = repo.vfs.join('cache/packs/%s' % PACK_CATEGORY)			packpath = repo.vfs.join('cache/packs/%s' % PACK_CATEGORY)

	dpackstore = datapackstore(repo.ui, packpath)			dpackstore = datapackstore(repo.ui, packpath)
	revlogstore = manifestrevlogstore(repo)			revlogstore = manifestrevlogstore(repo)
	datastore = unioncontentstore(dpackstore, revlogstore)			datastore = unioncontentstore(dpackstore, revlogstore)

	hpackstore = historypackstore(repo.ui, packpath)			hpackstore = historypackstore(repo.ui, packpath)

			Path	Packages
M			cstore/py-datapackstore.h (7 lines)
M			remotefilelog/basepack.py (90 lines)
M			remotefilelog/contentstore.py (4 lines)
M			remotefilelog/metadatastore.py (4 lines)
M			remotefilelog/shallowrepo.py (3 lines)
M			remotefilelog/shallowutil.py (28 lines)
M			treemanifest/__init__.py (6 lines)

This is an archive of the discontinued Mercurial Phabricator instance.

packs: improve packfile metrics
ClosedPublic
