Compilers (at least in the configuration used to build Mercurial)
don't seem to do a good job of optimizing a loop that searches for a
byte in a buffer.

After removing hashing from the loop in our previous commit, we
no longer have a good reason to use a loop at all: we can instead
use memchr() to find a byte value in a memory range. memchr() will
often be backed by platform-optimal, hand-written assembly that will
perform better than anything a compiler can emit.
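
As a rough illustration of the idea (a minimal sketch, not the actual
xdiff code; next_line_end() and count_lines() are hypothetical names
invented for this example), scanning for newlines with memchr() could
look something like this:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from xdiff: return a pointer one past
 * the end of the line that starts at cur. end points one past the last
 * byte of the buffer. A final line with no trailing newline ends at end.
 */
static const char *next_line_end(const char *cur, const char *end)
{
    const char *nl;

    assert(cur <= end);

    /* memchr() examines at most (end - cur) bytes looking for '\n'. */
    nl = memchr(cur, '\n', (size_t)(end - cur));

    return nl ? nl + 1 : end;
}

/* Count the lines in buf[0..len) by hopping from newline to newline. */
static size_t count_lines(const char *buf, size_t len)
{
    const char *cur = buf;
    const char *end = buf + len;
    size_t n = 0;

    while (cur < end) {
        cur = next_line_end(cur, end);
        n++;
    }
    return n;
}
```

The only loop left is the outer per-line one; the per-byte search is
delegated entirely to the C library.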

Using memchr() to scan for newlines makes xdiff a bit faster. On
the mozilla-central repository:

$ hg perfbdiff --alldata -c --count 10 --blocks --xdiff 400000
! wall 0.796796 comb 0.790000 user 0.790000 sys 0.000000 (best of 13)
! wall 0.589753 comb 0.590000 user 0.590000 sys 0.000000 (best of 17)

$ hg perfbdiff -m --count 100 --blocks --xdiff 400000
! wall 9.450092 comb 9.460000 user 8.470000 sys 0.990000 (best of 3)
! wall 7.364899 comb 7.360000 user 6.430000 sys 0.930000 (best of 3)