This is an archive of the discontinued Mercurial Phabricator instance.

xdiff: use memchr instead of character scanning
Abandoned
Public

Authored by indygreg on Mar 3 2018, 8:37 PM.

Details

Reviewers
quark
Group Reviewers
hg-reviewers
Summary

Compilers (at least in the configuration used to build Mercurial)
don't seem to optimize a loop that scans a buffer for a byte
very well.

After removing hashing from the loop in our previous commit, we
no longer have a good reason to use a loop at all: we can instead
use memchr() to find a byte value in a memory range. memchr() will
often be backed by platform-optimal, hand-written assembly that will
perform better than anything a compiler can emit.
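
As a rough illustration of the change described above (not the actual
xdiff code), the sketch below puts a byte-at-a-time newline scan next to
a memchr()-based equivalent. The helper names find_eol_loop,
find_eol_memchr, and count_lines are hypothetical and exist only for
this example; the real xdiff functions and their signatures may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte-at-a-time scan, the kind of loop being replaced.
 * Returns a pointer to the first '\n' in [cur, top), or top if there is none. */
static const char *find_eol_loop(const char *cur, const char *top)
{
    while (cur < top && *cur != '\n')
        cur++;
    return cur;
}

/* Hypothetical helper with the same contract, delegating the scan to memchr(). */
static const char *find_eol_memchr(const char *cur, const char *top)
{
    const char *nl;

    assert(cur <= top);
    nl = memchr(cur, '\n', (size_t)(top - cur));
    return nl ? nl : top;
}

/* Count lines in a buffer; a final line without a trailing newline still counts. */
static size_t count_lines(const char *buf, size_t len)
{
    const char *cur = buf;
    const char *top = buf + len;
    size_t n = 0;

    while (cur < top) {
        const char *eol = find_eol_memchr(cur, top);
        n++;
        /* Step past the newline, or stop if the buffer ended without one. */
        cur = (eol < top) ? eol + 1 : top;
    }
    return n;
}
```

Both helpers return the same pointer for any input. The only difference
is that the second one leaves the search to the C library, which is
where a platform-specific optimized implementation would come into play.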

Using memchr() to scan for newlines makes xdiff faster, cutting wall
time by roughly 26% and 22% in the two benchmarks below. On the
mozilla-central repository (before, then after, in each pair):

$ hg perfbdiff --alldata -c --count 10 --blocks --xdiff 400000
! wall 0.796796 comb 0.790000 user 0.790000 sys 0.000000 (best of 13)
! wall 0.589753 comb 0.590000 user 0.590000 sys 0.000000 (best of 17)

$ hg perfbdiff -m --count 100 --blocks --xdiff 400000
! wall 9.450092 comb 9.460000 user 8.470000 sys 0.990000 (best of 3)
! wall 7.364899 comb 7.360000 user 6.430000 sys 0.930000 (best of 3)

Diff Detail

Repository
rHG Mercurial
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

indygreg created this revision. Mar 3 2018, 8:37 PM
quark accepted this revision. Mar 3 2018, 8:47 PM
quark added a comment. Edited Mar 3 2018, 8:53 PM

It seems mdiff.textdiff used by perfbdiff won't actually use xdiff. Maybe perfunidiff --config experimental.xdiff=1 should be used instead?

EDIT: Didn't see the previous commits. Will have a look.

indygreg abandoned this revision. Mar 4 2018, 2:30 PM