This is an archive of the discontinued Mercurial Phabricator instance.

Differential D2626

xdiff: use xxhash in xdiff
AbandonedPublic

Authored by indygreg on Mar 3 2018, 8:37 PM.

Download Raw Diff

Details

Reviewers

quark

Group Reviewers

hg-reviewers

Summary

Diffing functions split inputs into blocks, typically lines. When
inspecting each block, the diffing function needs to see if two blocks
are identical. Because repeated memcmp() can be expensive, diff
implementations will hash each block to an integer and then do a simple
integer/register compare to check for block equivalence.

It looks like xxhash was using djb2 for this hash. This hash was fine
for its use. But the hash was operating on single bytes at a time
and the output from a previous byte was needed before feeding in a
new byte. So our CPU pipeline was very constricted.

This commit changes the hash to xxhash. This hashing function takes
a start address and a length. Internally, it operates on multiple
bytes at a time. This requires fewer assembly instructions and makes
the hash faster.

It's worth noting we are using the 32-bit version of xxhash. The 64-bit
version is faster on 64-bit hardware and we should ideally use it.
But the 64-bit version returns an unsigned long long and xdiff wants
the hash to be 32 bits. We could probably truncate the hash. Let's
use the 32-bit hash for now.

This change yields a minor perf win on the mozilla-central repo:

$ hg perfbdiff --alldata -c --count 10 --blocks --xdiff 400000
! wall 0.845893 comb 0.840000 user 0.840000 sys 0.000000 (best of 12)
! wall 0.796796 comb 0.790000 user 0.790000 sys 0.000000 (best of 13)

$ hg perfbdiff -m --count 100 --blocks --xdiff 400000
! wall 10.026769 comb 10.030000 user 9.010000 sys 1.020000 (best of 3)
! wall  9.450092 comb  9.460000 user 8.470000 sys 0.990000 (best of 3)

Taken on its own, this isn't a significant improvement. But it does
open the door to more efficient line searching...

Diff Detail

Repository

rHG Mercurial

Lint

Lint Skipped

Unit

Unit Tests Skipped

Event Timeline

indygreg created this revision.Mar 3 2018, 8:37 PM

Herald added a reviewer: hg-reviewers. · View Herald TranscriptMar 3 2018, 8:37 PM

Herald added a subscriber: mercurial-devel. · View Herald Transcript

indygreg added a child revision: D2627: xdiff: use memchr instead of character scanning.Mar 3 2018, 8:37 PM

I think the perf win comes from less hash collisions.

It seems xxhash can make things slower if the line is short.

## test case
open('a','w').write(''.join('%s\n' % (i % 100000) for i in xrange(10000000)))
open('b','w').write(''.join('%s\n' % (i % 100000) for i in xrange(10000001)))

## before xxhash
./xdiff /tmp/[ab] &> /dev/null  0.67s user 0.21s system 99% cpu 0.875 total
## after xxhash
./xdiff /tmp/[ab] &> /dev/null  2.86s user 0.43s system 98% cpu 3.315 total

The memchr change does not change much - it's still about 3 seconds on my machine.

Code I use for testing:

// contrib/xdiff/xdiff.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include "mercurial/thirdparty/xdiff/xdiff.h"

#define abort(...)                                                             \
	{                                                                      \
		fprintf(stderr, __VA_ARGS__);                                  \
		exit(-1);                                                      \
	}

char buf[4096000];
void readfile(const char *path, mmfile_t *file)
{
	memset(file, 0, sizeof(*file));
	FILE *fp = fopen(path, "r");
	if (!fp) {
		abort("cannot open %s\n", path);
	}
	while (!feof(fp)) {
		size_t size = fread(buf, 1, sizeof buf, fp);
		if (size > 0) {
			size_t new_size = file->size + size;
			file->ptr = realloc(file->ptr, new_size);
			if (!file->ptr) {
				abort("cannot allocate\n");
			}
			memcpy(file->ptr + file->size, buf, size);
			file->size = new_size;
		}
	}
	fclose(fp);
}

static int xdiff_outf(void *priv_, mmbuffer_t *mb, int nbuf)
{
	int i;
	for (i = 0; i < nbuf; i++) {
		write(STDOUT_FILENO, mb[i].ptr, mb[i].size);
	}
	return 0;
}

int main(int argc, char const *argv[])
{
	if (argc < 3) {
		abort("usage: %s FILE1 FILE2\n", argv[0]);
	}

	mmfile_t a, b;

	readfile(argv[1], &a);
	readfile(argv[2], &b);

	xpparam_t xpp = {
	    0,    /* flags */
	    NULL, /* anchors */
	    0,    /* anchors_nr */
	};
	xdemitconf_t xecfg = {
	    3,    /* ctxlen */
	    0,    /* interhunkctxlen */
	    0,    /* flags */
	    NULL, /* find_func */
	    NULL, /* find_func_priv */
	    NULL, /* hunk_consume_func */
	};
	xdemitcb_t ecb = {
	    0,           /* priv */
	    &xdiff_outf, /* outf */
	};

	xdl_diff(&a, &b, &xpp, &xecfg, &ecb);

	free(a.ptr);
	free(b.ptr);

	return 0;
}

## contrib/xdiff/Makefile
xdiff: ../../mercurial/thirdparty/xdiff/*.c xdiff.c
	gcc -O2 -g -std=c99 -I../.. -I../../mercurial/thirdparty/xdiff -I../../mercurial -o $@ $^

clean:
	-rm -f xdiff

This revision now requires changes to proceed.Mar 3 2018, 9:14 PM

I can reproduce the results. xxhash needs more than 4-8 bytes to outperform the old hasher. I also suspect more hash collisions are present with xxhash. Although I haven't verified.

The input showing the regression effectively turns the code into a hash table benchmark.

But my results on actual repo data show an improvement.

FWIW, memchr+djb hash perform about the same as the original code on the perfbdiff -m case.

indygreg abandoned this revision.Mar 4 2018, 2:30 PM

Revision Contents
Changeset List

			Path	Packages
M			mercurial/thirdparty/xdiff/xutils.c (18 lines)
M			setup.py (8 lines)

Status	Author	Revision
Abandoned	indygreg	D2627 xdiff: use memchr instead of character scanning
Abandoned	indygreg	D2626 xdiff: use xxhash in xdiff
Abandoned	indygreg	D2625 xxhash: vendor external library
Closed	indygreg	D2624 perf: teach perfbdiff to call blocks() and to use xdiff

Diff 6525

mercurial/thirdparty/xdiff/xutils.c

	* <http://www.gnu.org/licenses/>.			* <http://www.gnu.org/licenses/>.
	*			*
	* Davide Libenzi <davidel@xmailserver.org>			* Davide Libenzi <davidel@xmailserver.org>
	*			*
	*/			*/

	#include <limits.h>			#include <limits.h>
	#include <assert.h>			#include <assert.h>
	#include "xinclude.h"


				/* Define XXHASH functions in a custom namespace */
				#define XXH_NAMESPACE XDIFF_
				#define XXH_PRIVATE_API
				#include "thirdparty/xxhash/xxhash.h"

				#include "xinclude.h"

	long xdl_bogosqrt(long n) {			long xdl_bogosqrt(long n) {
	long i;			long i;

	/*			/*
	* Classical integer square root approximation using shifts.			* Classical integer square root approximation using shifts.
	*/			*/
	for (i = 1; n > 0; n >>= 2)			for (i = 1; n > 0; n >>= 2)
	ha ^= (unsigned long) *ptr;			ha ^= (unsigned long) *ptr;
	}			}
	*data = ptr < top ? ptr + 1: ptr;			*data = ptr < top ? ptr + 1: ptr;

	return ha;			return ha;
	}			}

	unsigned long xdl_hash_record(char const *data, char const top, long flags) {			unsigned long xdl_hash_record(char const *data, char const top, long flags) {
	unsigned long ha = 5381;			XXH32_hash_t h;
	char const ptr = data;			char const ptr = data;

	if (flags & XDF_WHITESPACE_FLAGS)			if (flags & XDF_WHITESPACE_FLAGS)
	return xdl_hash_record_with_whitespace(data, top, flags);			return xdl_hash_record_with_whitespace(data, top, flags);

	for (; ptr < top && *ptr != '\n'; ptr++) {			for (; ptr < top && *ptr != '\n'; ptr++) { }
	ha += (ha << 5);			h = XDIFF_XXH32(data, ptr - data + 1, 0);
	ha ^= (unsigned long) *ptr;
	}
	*data = ptr < top ? ptr + 1: ptr;			*data = ptr < top ? ptr + 1: ptr;
				return h;
	return ha;
	}			}

	unsigned int xdl_hashbits(unsigned int size) {			unsigned int xdl_hashbits(unsigned int size) {
	unsigned int val = 1, bits = 0;			unsigned int val = 1, bits = 0;

	for (; val < size && bits < CHAR_BIT * sizeof(unsigned int); val <<= 1, bits++);			for (; val < size && bits < CHAR_BIT * sizeof(unsigned int); val <<= 1, bits++);
	return bits ? bits: 1;			return bits ? bits: 1;
	}			}

setup.py

	'mercurial/thirdparty/xdiff/xutils.h',			'mercurial/thirdparty/xdiff/xutils.h',
	]			]

	extmodules = [			extmodules = [
	Extension('mercurial.cext.base85', ['mercurial/cext/base85.c'],			Extension('mercurial.cext.base85', ['mercurial/cext/base85.c'],
	include_dirs=common_include_dirs,			include_dirs=common_include_dirs,
	depends=common_depends),			depends=common_depends),
	Extension('mercurial.cext.bdiff', ['mercurial/bdiff.c',			Extension('mercurial.cext.bdiff', ['mercurial/bdiff.c',
	'mercurial/cext/bdiff.c'] + xdiff_srcs,			'mercurial/cext/bdiff.c',
				'mercurial/thirdparty/xxhash/xxhash.c',
				] + xdiff_srcs,
	include_dirs=common_include_dirs,			include_dirs=common_include_dirs,
	depends=common_depends + ['mercurial/bdiff.h'] + xdiff_headers),			depends=common_depends + ['mercurial/bdiff.h',
				'mercurial/thirdparty/xxhash/xxhash.h',
				] + xdiff_headers),
	Extension('mercurial.cext.diffhelpers', ['mercurial/cext/diffhelpers.c'],			Extension('mercurial.cext.diffhelpers', ['mercurial/cext/diffhelpers.c'],
	include_dirs=common_include_dirs,			include_dirs=common_include_dirs,
	depends=common_depends),			depends=common_depends),
	Extension('mercurial.cext.mpatch', ['mercurial/mpatch.c',			Extension('mercurial.cext.mpatch', ['mercurial/mpatch.c',
	'mercurial/cext/mpatch.c'],			'mercurial/cext/mpatch.c'],
	include_dirs=common_include_dirs,			include_dirs=common_include_dirs,
	depends=common_depends),			depends=common_depends),
	Extension('mercurial.cext.parsers', ['mercurial/cext/charencode.c',			Extension('mercurial.cext.parsers', ['mercurial/cext/charencode.c',

Diff	ID	Base	Description	Created	Lint	Unit
Base			Base
Diff 1	6525			Mar 3 2018, 8:37 PM	★	★