= memset =

== Variants ==
||= '''Name''' =||= '''Description''' =||
|| stock || machine-dependent (MD) amd64 version using {{{rep stosq}}} ||
|| SSE2 || {{{movdqu}}} for block-store ||
|| SSE2 aligned || {{{movaps}}} for aligned block-store and {{{movdqu}}} for unaligned ||
|| AVX 128 || 128-bit {{{vmovdqu}}} for block-store ||
|| AVX 256 || 256-bit {{{vmovdqu}}} for block-store ||
|| ERMS || {{{rep stosb}}} for machines with ERMS ||

'''Note:''' clang was smart enough to inline all the short {{{memset}}} calls, so I had to create a copy of the C version called {{{memset_()}}} to fool it.

== Machines Tested ==
||= '''CPU''' =||= '''Speed (GHz)''' =||= '''Notes''' =||
|| AMD FX-8120 || 3.11 || 1 x 8 zoo.freebsd.org ||
|| AMD Opteron 6328 || 3.20 || 2 x 8 Supermicro H8DG6/H8DGi ||
|| Intel Xeon X5365 || 3.00 || 2 x 4 Supermicro X7DBU ||
|| Intel Xeon X5482 || 3.20 || 2 x 4 Supermicro X7DWN+ ||
|| Intel Xeon X5675 || 3.07 || Westmere 2 x 6 Supermicro X8DTU ||
|| Intel Core i5-2520M || 2.50 || Sandy Bridge 1 x 4 Thinkpad X220 (4286) ||
|| Intel Core i5-2500K || 3.30 || Sandy Bridge 1 x 4 MSI Z77A-G45 (MS-7752) ||
|| Intel Xeon E5-2680 || 2.70 || Romley 2 x 8 Supermicro X9DRW ||
|| Intel Xeon E5-2667 v2 || 3.30 || Romley V2 2 x 8 Supermicro X9DRW (supports ERMS) ||

== Test Cases ==
||= '''Name''' =||= '''Description''' =||
|| page || set page to 0xa5 ||
|| short || set aligned 15 bytes to 0xa5 ||
|| short2 || set aligned 32 bytes to 0xa5 ||
|| short3 || set aligned 48 bytes to 0xa5 ||
|| offset || set misaligned (+4) 128 bytes to 0 ||
|| offset2 || set misaligned (+7) 97 bytes to 0 ||

== Results ==
Each number is the minimum of a distribution of TSC deltas, where each delta is measured across a single invocation of the test. Bold marks the lowest time among the variants for a given test and CPU combination. Green text marks times faster than the stock implementation, and red text marks times slower than the stock implementation. A {{{--}}} marks a variant that was not measured on that CPU.
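The measurement described above (the minimum TSC delta over many single invocations) can be sketched roughly as follows. This is an illustrative harness, not the code that produced the numbers on this page: the names {{{read_ticks}}}, {{{min_ticks}}}, and {{{memset_fn}}} are hypothetical, and the fallback to {{{clock_gettime}}} on non-x86 machines is an assumption added only so the sketch compiles elsewhere.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Read the time-stamp counter (what the results call "TSC"). */
static uint64_t read_ticks(void)
{
    return (uint64_t)__rdtsc();
}
#else
/* Assumption: no TSC here, so fall back to a nanosecond clock. */
static uint64_t read_ticks(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
#endif

/* Hypothetical type: any function with the memset() signature. */
typedef void *(*memset_fn)(void *, int, size_t);

/*
 * Time `runs` single invocations of fn(buf, c, len) and return the
 * smallest tick delta observed, i.e. the min of the distribution.
 */
static uint64_t min_ticks(memset_fn fn, void *buf, int c, size_t len, int runs)
{
    uint64_t best = UINT64_MAX;

    assert(runs > 0);
    for (int i = 0; i < runs; i++) {
        uint64_t t0 = read_ticks();
        fn(buf, c, len);
        uint64_t t1 = read_ticks();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}
```

For instance, {{{min_ticks(memset, buf + 7, 0, 97, 100000)}}} would correspond to the offset2 case. A real harness also has to keep the compiler from inlining or eliding the call, as the note about clang points out.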
{{{#!th rowspan=3
'''CPU'''
}}}
{{{#!th colspan=36
'''Test / Variant'''
}}}
|--
{{{#!th colspan=6
'''page'''
}}}
{{{#!th colspan=6
'''short'''
}}}
{{{#!th colspan=6
'''short2'''
}}}
{{{#!th colspan=6
'''short3'''
}}}
{{{#!th colspan=6
'''offset'''
}}}
{{{#!th colspan=6
'''offset2'''
}}}
|--
||= '''stock''' =||= '''SSE2''' =||= '''SSE2 aligned''' =||= '''AVX 128''' =||= '''AVX 256''' =||= '''ERMS''' =|| \
||= '''stock''' =||= '''SSE2''' =||= '''SSE2 aligned''' =||= '''AVX 128''' =||= '''AVX 256''' =||= '''ERMS''' =|| \
||= '''stock''' =||= '''SSE2''' =||= '''SSE2 aligned''' =||= '''AVX 128''' =||= '''AVX 256''' =||= '''ERMS''' =|| \
||= '''stock''' =||= '''SSE2''' =||= '''SSE2 aligned''' =||= '''AVX 128''' =||= '''AVX 256''' =||= '''ERMS''' =|| \
||= '''stock''' =||= '''SSE2''' =||= '''SSE2 aligned''' =||= '''AVX 128''' =||= '''AVX 256''' =||= '''ERMS''' =|| \
||= '''stock''' =||= '''SSE2''' =||= '''SSE2 aligned''' =||= '''AVX 128''' =||= '''AVX 256''' =||= '''ERMS''' =||
|| ''AMD FX-8120'' || \
|| 663|| [[span(601, style=color: green)]]|| '''[[span(404, style=color: green)]]'''|| [[span(405, style=color: green)]]|| [[span(1886, style=color:red)]]|| [[span(1018, style=color:red)]]|| \
|| '''49'''|| [[span(98, style=color:red)]]|| [[span(64, style=color:red)]]|| [[span(64, style=color:red)]]|| [[span(95, style=color:red)]]|| [[span(153, style=color:red)]]|| \
|| 49|| [[span(52, style=color:red)]]|| '''[[span(38, style=color: green)]]'''|| [[span(39, style=color: green)]]|| [[span(55, style=color:red)]]|| [[span(243, style=color:red)]]|| \
|| 49|| [[span(55, style=color:red)]]|| 49|| '''[[span(43, style=color: green)]]'''|| [[span(72, style=color:red)]]|| [[span(292, style=color:red)]]|| \
|| 51|| [[span(56, style=color:red)]]|| '''[[span(43, style=color: green)]]'''|| [[span(44, style=color: green)]]|| [[span(90, style=color:red)]]|| [[span(469, style=color:red)]]|| \
|| 52|| [[span(74, style=color:red)]]|| 52|| '''[[span(49, style=color: green)]]'''|| [[span(88, style=color:red)]]|| [[span(469, style=color:red)]]||
|| ''AMD Opteron 6328'' || \
|| 482|| [[span(443, style=color: green)]]|| '''[[span(424, style=color: green)]]'''|| [[span(461, style=color: green)]]|| [[span(2454, style=color:red)]]|| [[span(449, style=color: green)]]|| \
|| '''69'''|| [[span(106, style=color:red)]]|| [[span(106, style=color:red)]]|| [[span(106, style=color:red)]]|| [[span(106, style=color:red)]]|| [[span(106, style=color:red)]]|| \
|| '''68'''|| [[span(87, style=color:red)]]|| [[span(88, style=color:red)]]|| [[span(86, style=color:red)]]|| [[span(87, style=color:red)]]|| [[span(128, style=color:red)]]|| \
|| '''66'''|| [[span(90, style=color:red)]]|| [[span(92, style=color:red)]]|| [[span(90, style=color:red)]]|| [[span(89, style=color:red)]]|| [[span(151, style=color:red)]]|| \
|| 102|| '''[[span(89, style=color: green)]]'''|| [[span(95, style=color: green)]]|| [[span(95, style=color: green)]]|| [[span(99, style=color: green)]]|| [[span(230, style=color:red)]]|| \
|| 104|| '''[[span(92, style=color: green)]]'''|| [[span(93, style=color: green)]]|| [[span(93, style=color: green)]]|| [[span(98, style=color: green)]]|| [[span(226, style=color:red)]]||
|| ''Intel Xeon X5365'' || \
|| 657|| [[span(1197, style=color:red)]]|| '''[[span(378, style=color: green)]]'''|| -- || -- || [[span(720, style=color:red)]]|| \
|| '''63'''|| [[span(144, style=color:red)]]|| [[span(144, style=color:red)]]|| -- || -- || [[span(144, style=color:red)]]|| \
|| '''63'''|| [[span(90, style=color:red)]]|| [[span(90, style=color:red)]]|| -- || -- || [[span(162, style=color:red)]]|| \
|| '''63'''|| [[span(99, style=color:red)]]|| [[span(90, style=color:red)]]|| -- || -- || [[span(171, style=color:red)]]|| \
|| 243|| '''[[span(135, style=color: green)]]'''|| '''[[span(135, style=color: green)]]'''|| -- || -- || [[span(252, style=color:red)]]|| \
|| 126|| '''[[span(108, style=color: green)]]'''|| [[span(117, style=color: green)]]|| -- || -- || [[span(225, style=color:red)]]||
|| ''Intel Xeon X5482'' || \
|| 624|| [[span(1144, style=color:red)]]|| '''[[span(312, style=color: green)]]'''|| -- || -- || [[span(696, style=color:red)]]|| \
|| '''32'''|| [[span(112, style=color:red)]]|| [[span(112, style=color:red)]]|| -- || -- || [[span(112, style=color:red)]]|| \
|| '''32'''|| [[span(64, style=color:red)]]|| [[span(64, style=color:red)]]|| -- || -- || [[span(128, style=color:red)]]|| \
|| '''32'''|| [[span(72, style=color:red)]]|| [[span(64, style=color:red)]]|| -- || -- || [[span(144, style=color:red)]]|| \
|| '''56'''|| [[span(120, style=color:red)]]|| [[span(120, style=color:red)]]|| -- || -- || [[span(224, style=color:red)]]|| \
|| '''56'''|| [[span(96, style=color:red)]]|| [[span(104, style=color:red)]]|| -- || -- || [[span(200, style=color:red)]]||
|| ''Intel Xeon X5675'' || \
|| 352|| '''[[span(296, style=color: green)]]'''|| [[span(300, style=color: green)]]|| -- || -- || [[span(428, style=color:red)]]|| \
|| '''24'''|| [[span(100, style=color:red)]]|| [[span(100, style=color:red)]]|| -- || -- || [[span(96, style=color:red)]]|| \
|| '''46'''|| [[span(83, style=color:red)]]|| [[span(92, style=color:red)]]|| -- || -- || [[span(120, style=color:red)]]|| \
|| '''24'''|| [[span(83, style=color:red)]]|| [[span(48, style=color:red)]]|| -- || -- || [[span(136, style=color:red)]]|| \
|| 99|| '''[[span(56, style=color: green)]]'''|| [[span(106, style=color:red)]]|| -- || -- || [[span(192, style=color:red)]]|| \
|| '''48'''|| [[span(99, style=color:red)]]|| [[span(106, style=color:red)]]|| -- || -- || [[span(160, style=color:red)]]||
|| ''Intel Core i5-2520M'' || \
|| 1812|| [[span(962, style=color: green)]]|| [[span(962, style=color: green)]]|| '''[[span(950, style=color: green)]]'''|| [[span(1400, style=color:red)]]|| [[span(13100, style=color:red)]]|| \
|| '''87'''|| [[span(350, style=color:red)]]|| [[span(350, style=color:red)]]|| [[span(350, style=color:red)]]|| [[span(350, style=color:red)]]|| [[span(337, style=color:red)]]|| \
|| '''87'''|| [[span(162, style=color:red)]]|| [[span(162, style=color:red)]]|| [[span(150, style=color:red)]]|| [[span(600, style=color:red)]]|| [[span(400, style=color:red)]]|| \
|| '''87'''|| [[span(162, style=color:red)]]|| [[span(162, style=color:red)]]|| [[span(150, style=color:red)]]|| [[span(562, style=color:red)]]|| [[span(450, style=color:red)]]|| \
|| '''87'''|| [[span(187, style=color:red)]]|| [[span(187, style=color:red)]]|| [[span(187, style=color:red)]]|| [[span(625, style=color:red)]]|| [[span(700, style=color:red)]]|| \
|| '''87'''|| [[span(187, style=color:red)]]|| [[span(187, style=color:red)]]|| [[span(187, style=color:red)]]|| [[span(637, style=color:red)]]|| [[span(612, style=color:red)]]||
|| ''Intel Core i5-2500K'' || \
|| 734|| 635|| 635|| 627|| 924|| 907|| \
|| 222|| 231|| 231|| 231|| 231|| 222|| \
|| 156|| 107|| 107|| 99|| 396|| 264|| \
|| 156|| 107|| 107|| 99|| 404|| 297|| \
|| 453|| 123|| 123|| 123|| 412|| 420|| \
|| 156|| 123|| 123|| 123|| 420|| 354||
|| ''Intel Xeon E5-2680'' || \
|| 356|| [[span(308, style=color: green)]]|| [[span(308, style=color: green)]]|| '''[[span(304, style=color: green)]]'''|| [[span(452, style=color:red)]]|| [[span(440, style=color:red)]]|| \
|| '''28'''|| [[span(112, style=color:red)]]|| [[span(112, style=color:red)]]|| [[span(112, style=color:red)]]|| [[span(112, style=color:red)]]|| [[span(112, style=color:red)]]|| \
|| '''28'''|| [[span(52, style=color:red)]]|| [[span(52, style=color:red)]]|| [[span(48, style=color:red)]]|| [[span(196, style=color:red)]]|| [[span(132, style=color:red)]]|| \
|| '''28'''|| [[span(52, style=color:red)]]|| [[span(52, style=color:red)]]|| [[span(48, style=color:red)]]|| [[span(196, style=color:red)]]|| [[span(148, style=color:red)]]|| \
|| '''56'''|| [[span(60, style=color:red)]]|| [[span(60, style=color:red)]]|| [[span(60, style=color:red)]]|| [[span(200, style=color:red)]]|| [[span(204, style=color:red)]]|| \
|| '''52'''|| [[span(60, style=color:red)]]|| [[span(60, style=color:red)]]|| [[span(60, style=color:red)]]|| [[span(204, style=color:red)]]|| [[span(176, style=color:red)]]||
|| ''Intel Xeon E5-2667 v2'' || \
|| 428|| [[span(344, style=color: green)]]|| [[span(344, style=color: green)]]|| [[span(340, style=color: green)]]|| [[span(494, style=color:red)]]|| '''[[span(292, style=color: green)]]'''|| \
|| '''24'''|| [[span(60, style=color:red)]]|| [[span(60, style=color:red)]]|| [[span(60, style=color:red)]]|| [[span(60, style=color:red)]]|| [[span(60, style=color:red)]]|| \
|| '''24'''|| [[span(84, style=color:red)]]|| [[span(84, style=color:red)]]|| [[span(84, style=color:red)]]|| [[span(228, style=color:red)]]|| [[span(60, style=color:red)]]|| \
|| '''24'''|| [[span(84, style=color:red)]]|| [[span(88, style=color:red)]]|| [[span(84, style=color:red)]]|| [[span(228, style=color:red)]]|| [[span(56, style=color:red)]]|| \
|| '''52'''|| [[span(96, style=color:red)]]|| [[span(96, style=color:red)]]|| [[span(92, style=color:red)]]|| [[span(232, style=color:red)]]|| [[span(60, style=color:red)]]|| \
|| '''48'''|| [[span(64, style=color:red)]]|| [[span(64, style=color:red)]]|| [[span(64, style=color:red)]]|| [[span(208, style=color:red)]]|| [[span(60, style=color:red)]]||

== Conclusions ==