| Version 1 (modified by john, 12 years ago) |
|---|
strlen
Variants
| Name | Description |
|---|---|
| stock | MD amd64 version (rep stosq) |
| SSE2 | movdqu for block-store |
| SSE2 aligned | movaps for aligned block-store and movdqu for unaligned |
| AVX 128 | 128-bit vmovdqu for block-store |
| AVX 256 | 256-bit vmovdqu for block-store |
| ERMS | rep stosb for machines with ERMS |
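To make the "block-store" idea concrete, here is a minimal sketch of what the unaligned SSE2 variant could look like when written with compiler intrinsics. It is not the code that was measured on this page; the function name `memset_sse2_sketch` is hypothetical, and the assumption is that `_mm_storeu_si128` compiles to a `movdqu` store. When the compiler does not target SSE2, the sketch falls back to a plain byte loop.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics: _mm_set1_epi8, _mm_storeu_si128 */
#endif

/* Hypothetical illustration; not the implementation benchmarked here. */
void *
memset_sse2_sketch(void *dst, int c, size_t len)
{
    unsigned char *p = dst;

#if defined(__SSE2__)
    /* Broadcast the fill byte into all 16 bytes of an XMM register. */
    const __m128i fill = _mm_set1_epi8((char)c);

    /* Unaligned 16-byte block stores; p needs no particular alignment. */
    while (len >= 16) {
        _mm_storeu_si128((__m128i *)p, fill);
        p += 16;
        len -= 16;
    }
#endif

    /* Store whatever is left (0..15 bytes on the SSE2 path) bytewise. */
    for (; len > 0; len--)
        *p++ = (unsigned char)c;

    return dst;
}
```

An aligned variant would presumably first store bytes until `p` reaches a 16-byte boundary and then switch to aligned stores for the main loop; that step is omitted here.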
Machines Tested
| CPU | Speed (GHz) | Notes |
|---|---|---|
| AMD FX-8120 | 3.11 | 1 x 8 zoo.freebsd.org |
| AMD Opteron 6328 | 3.20 | 2 x 8 Supermicro H8DG6/H8DGi |
| Intel Xeon X5365 | 3.00 | 2 x 4 Supermicro X7DBU |
| Intel Xeon X5482 | 3.20 | 2 x 4 Supermicro X7DWN+ |
| Intel Xeon X5675 | 3.07 | Westmere 2 x 6 Supermicro X8DTU |
| Intel Core i5-2520M | 2.50 | Sandy Bridge 1 x 4 Thinkpad X220 (4286) |
| Intel Core i5-2500K | 3.30 | Sandy Bridge 1 x 4 MSI Z77A-G45 (MS-7752) |
| Intel Xeon E5-2680 | 2.70 | Romley 2 x 8 Supermicro X9DRW |
| Intel Xeon E5-2667 v2 | 3.30 | Romley V2 2 x 8 Supermicro X9DRW (supports ERMS) |
Test Cases
| Name | Description |
|---|---|
| page | set page to 0xa5 |
| short | set aligned 15 bytes to 0xa5 |
| short2 | set aligned 32 bytes to 0xa5 |
| short3 | set aligned 48 bytes to 0xa5 |
| offset | set misaligned (+4) 128 bytes to 0 |
| offset2 | set misaligned (+7) 97 bytes to 0 |
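The test cases above can be written down as a small C table, which makes the offsets, lengths, and fill values explicit. This is only a sketch: the struct, the array, and `run_memset_case` are hypothetical names, and the page size of 4096 bytes is an assumption (the usual amd64 base page), not something stated on this page.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096   /* assumption: 4 KiB amd64 base page */

struct memset_case {
    const char *name;
    size_t offset;   /* bytes past an aligned base address */
    size_t len;      /* number of bytes to set */
    int fill;        /* fill value */
};

static const struct memset_case memset_cases[] = {
    { "page",    0, SKETCH_PAGE_SIZE, 0xa5 },
    { "short",   0, 15,               0xa5 },
    { "short2",  0, 32,               0xa5 },
    { "short3",  0, 48,               0xa5 },
    { "offset",  4, 128,              0    },
    { "offset2", 7, 97,               0    },
};

#define N_MEMSET_CASES (sizeof(memset_cases) / sizeof(memset_cases[0]))

/*
 * Apply one case with the C library memset. aligned_base must point to
 * at least tc->offset + tc->len writable bytes.
 */
void
run_memset_case(const struct memset_case *tc, unsigned char *aligned_base)
{
    memset(aligned_base + tc->offset, tc->fill, tc->len);
}
```

A harness would loop over `memset_cases` and substitute each variant under test for the `memset` call.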
Results
Each number is the minimum of the measured distribution, where each sample is the TSC (time-stamp counter) delta across a single invocation of the test.
Bold indicates the lowest time among the variants for a given test and CPU combination. Green text marks times faster than the stock implementation; red text marks times slower than it.
| CPU | page / stock | page / SSE2 | page / SSE2 aligned | page / AVX 128 | page / AVX 256 | page / ERMS | short / stock | short / SSE2 | short / SSE2 aligned | short / AVX 128 | short / AVX 256 | short / ERMS | short2 / stock | short2 / SSE2 | short2 / SSE2 aligned | short2 / AVX 128 | short2 / AVX 256 | short2 / ERMS | short3 / stock | short3 / SSE2 | short3 / SSE2 aligned | short3 / AVX 128 | short3 / AVX 256 | short3 / ERMS | offset / stock | offset / SSE2 | offset / SSE2 aligned | offset / AVX 128 | offset / AVX 256 | offset / ERMS | offset2 / stock | offset2 / SSE2 | offset2 / SSE2 aligned | offset2 / AVX 128 | offset2 / AVX 256 | offset2 / ERMS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AMD FX-8120 | ||||||||||||||||||||||||||||||||||||
| AMD Opteron 6328 | ||||||||||||||||||||||||||||||||||||
| Intel Xeon X5365 | ||||||||||||||||||||||||||||||||||||
| Intel Xeon X5482 | ||||||||||||||||||||||||||||||||||||
| Intel Xeon X5675 | ||||||||||||||||||||||||||||||||||||
| Intel Core i5-2520M | ||||||||||||||||||||||||||||||||||||
| Intel Core i5-2500K | ||||||||||||||||||||||||||||||||||||
| Intel Xeon E5-2680 | ||||||||||||||||||||||||||||||||||||
| Intel Xeon E5-2667 v2 | ||||||||||||||||||||||||||||||||||||
