Implementing SSE-optimized variants of string routines in libc for FreeBSD/amd64

There is a subpage for each routine. Most routines have multiple variants (e.g. plain SSE2 vs AVX). For each routine, the variants were evaluated using micro-benchmarks on a variety of processors.

A note about the benchmark numbers: I chose to report the minimum (min) because, during earlier development, it tracked the effect of code changes most closely. That said, in most cases the median is quite close to the min (if not identical), while the average is often skewed by outliers resulting from preemption, timer interrupts, etc. I may revisit the results to include median numbers, either in addition to or instead of min.
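To illustrate why min and median behave differently from the average, here is a minimal sketch in C that summarizes a set of timing samples. It is not the harness used for these results; `struct bench_summary`, `cmp_u64` and `summarize` are hypothetical names, and the samples are assumed to be cycle or nanosecond counts stored as `uint64_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Summary of one benchmark run (hypothetical type, for illustration). */
struct bench_summary {
	uint64_t min;
	uint64_t median;
	double mean;
};

/* qsort(3) comparator for uint64_t, written to avoid overflow. */
static int
cmp_u64(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return ((x > y) - (x < y));
}

/* Sorts the samples in place, then computes min, median and mean. */
static struct bench_summary
summarize(uint64_t *samples, size_t n)
{
	struct bench_summary s;
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	qsort(samples, n, sizeof(samples[0]), cmp_u64);
	for (i = 0; i < n; i++)
		sum += (double)samples[i];

	s.min = samples[0];
	if (n % 2 == 1) {
		s.median = samples[n / 2];
	} else {
		/* Even count: midpoint of the two middle samples. */
		s.median = samples[n / 2 - 1] +
		    (samples[n / 2] - samples[n / 2 - 1]) / 2;
	}
	s.mean = sum / (double)n;
	return (s);
}
```

For the samples 102, 100, 101, 100 and 5000, where the last one stands in for a run inflated by preemption, this gives a min of 100 and a median of 101, while the mean is 1080.6.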


Last modified on Oct 8, 2014, 2:16:23 PM