Changes between Initial Version and Version 1 of LibCSSE/strlen


Timestamp:
Aug 8, 2014, 4:09:52 PM (12 years ago)
Author:
john
Comment:

--

  • LibCSSE/strlen

    v1 v1  
     1= strlen =
     2
     3== Variants ==
     4
     5||= '''Name''' =||= '''Description''' =||
     6|| stock || machine-independent (MI) C version ||
     7|| SSE2 || {{{pcmpeqb}}} and {{{pmovmskb}}} ||
     8|| SSE4.2 || {{{pcmpestri}}} and {{{pcmpestrm}}} ||
     9|| AVX || 128-bit {{{vpcmpeqb}}} and {{{vpmovmskb}}} ||
     10|| ERMS || {{{repne scasb}}} for machines with ERMS ||
     11
     12'''Note:''' clang was too smart and optimized plain {{{strlen}}} calls away, so I had to create a copy of the C version called {{{strlen_mi()}}} to fool it.
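
As a rough illustration of the workaround in the note, a copy of the C version under a different name might look like the sketch below. The body and the {{{noinline}}} attribute are assumptions for illustration, not the code actually used in LibCSSE; the attribute is a GCC/clang extension, and a compiler may still recognise the loop, so treat this as a starting point only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the page's strlen_mi(): a plain byte loop under
 * a non-library name, so call sites are not treated as the strlen builtin
 * and folded to a constant at compile time.
 * noinline (GCC/clang extension) is an assumption added here to keep the
 * call visible to the benchmark. */
__attribute__((noinline))
size_t strlen_mi(const char *s)
{
    assert(s != NULL);

    const char *p = s;
    while (*p != '\0')      /* advance to the terminating NUL */
        p++;
    return (size_t)(p - s); /* distance from start = length */
}
```

Calling {{{strlen_mi(buf)}}} on a buffer whose contents the compiler cannot see at compile time gives the benchmark a real call to time.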
     13
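To make the SSE2 row of the Variants table concrete, here is a minimal sketch of the {{{pcmpeqb}}} / {{{pmovmskb}}} technique written with compiler intrinsics. The function name {{{strlen_sse2_sketch}}} is hypothetical and the code is not taken from LibCSSE. It assumes an x86 target with SSE2 and a GCC/clang-style {{{__builtin_ctz}}}. It also relies on the common practice of reading the whole aligned 16-byte block that contains the start of the string, which stays within one page but is outside what strict ISO C guarantees.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: scan 16 bytes at a time for the NUL terminator. */
size_t strlen_sse2_sketch(const char *s)
{
    assert(s != NULL);

    const __m128i zero = _mm_setzero_si128();

    /* Round down to a 16-byte boundary so every load is aligned and
     * therefore never crosses a page boundary. */
    const char *p = (const char *)((uintptr_t)s & ~(uintptr_t)15);
    unsigned skip = (unsigned)(s - p);   /* bytes in the block before s */

    /* pcmpeqb: 0xFF in each byte lane equal to zero.
     * pmovmskb: collect the top bit of each lane into a 16-bit mask. */
    __m128i chunk = _mm_load_si128((const __m128i *)p);
    unsigned mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));
    mask &= 0xFFFFu << skip;             /* ignore matches before s */

    while (mask == 0) {
        p += 16;
        chunk = _mm_load_si128((const __m128i *)p);
        mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));
    }

    /* The lowest set bit is the offset of the first NUL in this block. */
    return (size_t)((p + __builtin_ctz(mask)) - s);
}
```

The masking step on the first block is what lets the same loop handle both the aligned test cases and the offset ones.
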
     14== Machines Tested ==
     15
     16||= '''CPU''' =||= '''Speed (GHz)''' =||= '''Notes''' =||
     17|| Xeon X5365 || 3.00 || 2 x 4 Supermicro X7DBU ||
     18|| Xeon X5482 || 3.20 || 2 x 4 Supermicro X7DWN+ ||
     19|| Xeon X5675 || 3.07 || Westmere 2 x 6 Supermicro X8DTU ||
     20|| Core i5-2520M || 2.50 || Sandy Bridge 1 x 4 Thinkpad X220 (4286) ||
     21|| Xeon E5-2680 || 2.70 || Romley 2 x 8 Supermicro X9DRW ||
     22|| Xeon E5-2667 v2 || 3.30 || Romley V2 2 x 8 Supermicro X9DRW ||
     23
     24== Test Cases ==
     25
     26||= '''Name''' =||= '''Description''' =||
     27|| page || aligned string one page - 1 long ||
     28|| short || aligned string 14 characters long ||
     29|| short2 || aligned string 32 characters long ||
     30|| short3 || aligned string 48 characters long ||
     31|| offset || 4 byte offset string 126 characters long ||
     32|| offset2 || 7 byte offset string 95 characters long ||
     33
     34== Results ==
     35
     36Each number is the minimum of a distribution of TSC deltas, where each delta is measured across a single invocation of the test.
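
The measurement described above could be sketched roughly as follows. The helper name {{{min_tsc_delta}}} and its parameters are hypothetical, not the harness that produced these results. It assumes an x86 target and the {{{__rdtsc()}}} intrinsic from the GCC/clang {{{x86intrin.h}}} header, and it omits refinements such as serialising instructions or pinning to one core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>     /* strlen, used in the usage example */
#include <x86intrin.h>  /* __rdtsc (GCC/clang) */

/* Hypothetical helper: time `runs` single invocations of fn(s) with the
 * time-stamp counter and return the smallest delta observed. */
uint64_t min_tsc_delta(size_t (*fn)(const char *), const char *s, int runs)
{
    assert(fn != NULL && s != NULL && runs > 0);

    uint64_t best = UINT64_MAX;
    volatile size_t sink = 0;        /* keeps the call from being dropped */

    for (int i = 0; i < runs; i++) {
        uint64_t t0 = __rdtsc();     /* TSC before the invocation */
        sink = fn(s);
        uint64_t t1 = __rdtsc();     /* TSC after the invocation */

        uint64_t delta = t1 - t0;
        if (delta < best)
            best = delta;            /* keep the minimum of the distribution */
    }
    (void)sink;
    return best;
}
```

For example, {{{min_tsc_delta(strlen, buf, 1000)}}} would report the best of 1000 single-call timings for the library {{{strlen}}} on {{{buf}}}. Taking the minimum filters out runs disturbed by interrupts or cache misses.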
     37
     38{{{#!th rowspan=3
     39'''CPU'''
     40}}}
     41{{{#!th colspan=30
     42'''Test / Variant'''
     43}}}
     44|--
     45{{{#!th colspan=5
     46'''page'''
     47}}}
     48{{{#!th colspan=5
     49'''short'''
     50}}}
     51{{{#!th colspan=5
     52'''short2'''
     53}}}
     54{{{#!th colspan=5
     55'''short3'''
     56}}}
     57{{{#!th colspan=5
     58'''offset'''
     59}}}
     60{{{#!th colspan=5
     61'''offset2'''
     62}}}
     63|--
     64|| '''stock''' || '''SSE2''' || '''SSE4.2''' || '''AVX''' || '''ERMS''' || \
     65|| '''stock''' || '''SSE2''' || '''SSE4.2''' || '''AVX''' || '''ERMS''' || \
     66|| '''stock''' || '''SSE2''' || '''SSE4.2''' || '''AVX''' || '''ERMS''' || \
     67|| '''stock''' || '''SSE2''' || '''SSE4.2''' || '''AVX''' || '''ERMS''' || \
     68|| '''stock''' || '''SSE2''' || '''SSE4.2''' || '''AVX''' || '''ERMS''' || \
     69|| '''stock''' || '''SSE2''' || '''SSE4.2''' || '''AVX''' || '''ERMS''' ||
     70