Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

arm64: crypto: increase AES interleave to 4x

This patch increases the interleave factor for parallel AES modes
from 2x to 4x. This improves CBC decryption throughput on Cortex-A57
by roughly 30-35% (see the measurements below). This is due to the
3-cycle latency of AES instructions on the A57's relatively deep
pipeline (compared to Cortex-A53, where the AES instruction latency
is only 2 cycles).
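
The latency argument above can be illustrated with a toy model. This is
only a sketch under simplifying assumptions that are not from the patch:
one AES instruction issues per cycle, each block's AES instructions form
a dependent chain, and loads, stores and all other pipeline effects are
ignored. The function name cycles_per_block is hypothetical.

```python
def cycles_per_block(latency: int, interleave: int) -> float:
    """Toy model: cycles spent per block for one dependent AES step.

    Assumes one AES instruction can issue per cycle and that the next
    instruction for a given block must wait `latency` cycles for the
    previous result. Interleaving `interleave` independent blocks fills
    the waiting cycles with useful work from the other blocks.
    """
    if latency < 1 or interleave < 1:
        raise ValueError("latency and interleave must be positive")
    # A group of `interleave` blocks needs at least `interleave` issue
    # slots, and cannot advance faster than the instruction latency.
    return max(latency, interleave) / interleave


if __name__ == "__main__":
    # Latencies as stated in the commit message: 3 cycles on
    # Cortex-A57, 2 cycles on Cortex-A53.
    for core, latency in (("Cortex-A57", 3), ("Cortex-A53", 2)):
        for interleave in (1, 2, 4):
            cost = cycles_per_block(latency, interleave)
            print(f"{core}: {interleave}x interleave -> "
                  f"{cost:.2f} cycles per block per step")
```

In this model a 2x interleave leaves the 3-cycle pipeline idle one cycle
in three, while 4x keeps it full; with a 2-cycle latency, 2x is already
enough. The model predicts an upper bound of 1.5x, so it explains the
direction of the gain rather than its exact size.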

At the same time, disable inline expansion of the core AES functions,
as the performance benefit of this feature is negligible.

Measured on AMD Seattle (using tcrypt.ko mode=500 sec=1):

Baseline (2x interleave, inline expansion)
------------------------------------------
testing speed of async cbc(aes) (cbc-aes-ce) decryption
test 4 (128 bit key, 8192 byte blocks): 95545 operations in 1 seconds
test 14 (256 bit key, 8192 byte blocks): 68496 operations in 1 seconds

This patch (4x interleave, no inline expansion)
-----------------------------------------------
testing speed of async cbc(aes) (cbc-aes-ce) decryption
test 4 (128 bit key, 8192 byte blocks): 124735 operations in 1 seconds
test 14 (256 bit key, 8192 byte blocks): 92328 operations in 1 seconds
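
The relative gain implied by the tcrypt figures above can be checked
with a few lines of arithmetic. The operation counts are copied from the
output above; the helper name speedup_percent is made up for this
illustration.

```python
# Operations per second reported by tcrypt for 8192-byte blocks:
# (baseline with 2x interleave, this patch with 4x interleave).
RESULTS = {
    "test 4 (128 bit key)": (95545, 124735),
    "test 14 (256 bit key)": (68496, 92328),
}


def speedup_percent(baseline: int, patched: int) -> float:
    """Return the throughput improvement of `patched` over `baseline` in percent."""
    return (patched / baseline - 1.0) * 100.0


if __name__ == "__main__":
    for name, (baseline, patched) in RESULTS.items():
        gain = speedup_percent(baseline, patched)
        print(f"{name}: {baseline} -> {patched} ops/s ({gain:.1f}% faster)")
```

This gives about 30.6% for the 128-bit key case and about 34.8% for the
256-bit key case.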

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Authored by Ard Biesheuvel and committed by Catalin Marinas
0eee0fbd 6910fa16

+1 -1
arch/arm64/crypto/Makefile
@@ -29,7 +29,7 @@
 obj-$(CONFIG_CRYPTO_AES_ARM64_NEON_BLK) += aes-neon-blk.o
 aes-neon-blk-y := aes-glue-neon.o aes-neon.o
 
-AFLAGS_aes-ce.o := -DINTERLEAVE=2 -DINTERLEAVE_INLINE
+AFLAGS_aes-ce.o := -DINTERLEAVE=4
 AFLAGS_aes-neon.o := -DINTERLEAVE=4
 
 CFLAGS_aes-glue-ce.o := -DUSE_V8_CRYPTO_EXTENSIONS