| 7 | | || SSE2 || {{{movdqu}}} for block-copy || |
| 8 | | || SSE2 aligned || align source to always use {{{movaps}}}, then use {{{movaps}}} for aligned destination and {{{movdqu}}} for unaligned destination || |
| 9 | | || AVX || 256-bit {{{vmovdqu}}} for block-copy with 128-byte block as common loop || |
| | 7 | || SSE2 || {{{movups}}} for block-copy || |
| | 8 | || SSE2 aligned || align source to always use {{{movaps}}}, then use {{{movaps}}} for aligned destination and {{{movups}}} for unaligned destination || |
| | 9 | || AVX || 256-bit {{{vmovups}}} for block-copy with 128-byte block as common loop || |