Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

lib/crypto: mips/chacha: Fix clang build and remove unneeded byteswap

The MIPS32r2 ChaCha code has never been buildable with the clang
assembler. First, clang doesn't support the 'rotl' pseudo-instruction:

error: unknown instruction, did you mean: rol, rotr?

Second, clang requires that both operands of the 'wsbh' instruction be
explicitly given:

error: too few operands for instruction

To fix this, align the code with the real instruction set by (1) using
the real instruction 'rotr' instead of the nonstandard pseudo-
instruction 'rotl', and (2) explicitly giving both operands to 'wsbh'.

To make removing the use of 'rotl' a bit easier, also remove the
unnecessary special-casing for big endian CPUs at
.Lchacha_mips_xor_bytes. The tail handling is actually
endian-independent since it processes one byte at a time. On big endian
CPUs the old code byte-swapped SAVED_X, then iterated through it in
reverse order. But the byteswap and reverse iteration canceled out.

Tested with chacha20poly1305-selftest in QEMU using "-M malta" with both
little endian and big endian mips32r2 kernels.

Fixes: 49aa7c00eddf ("crypto: mips/chacha - import 32r2 ChaCha code from Zinc")
Cc: stable@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505080409.EujEBwA0-lkp@intel.com/
Link: https://lore.kernel.org/r/20250619225535.679301-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

+7 -13
lib/crypto/mips/chacha-core.S
···
 #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
 #define MSB 0
 #define LSB 3
-#define ROTx rotl
-#define ROTR(n) rotr n, 24
 #define CPU_TO_LE32(n) \
-	wsbh	n; \
+	wsbh	n, n; \
 	rotr	n, 16;
 #else
 #define MSB 3
 #define LSB 0
-#define ROTx rotr
 #define CPU_TO_LE32(n)
-#define ROTR(n)
 #endif
···
 	xor	X(W), X(B); \
 	xor	X(Y), X(C); \
 	xor	X(Z), X(D); \
-	rotl	X(V), S; \
-	rotl	X(W), S; \
-	rotl	X(Y), S; \
-	rotl	X(Z), S;
+	rotr	X(V), 32 - S; \
+	rotr	X(W), 32 - S; \
+	rotr	X(Y), 32 - S; \
+	rotr	X(Z), 32 - S;

 .text
 .set	reorder
···
 	/* First byte */
 	lbu	T1, 0(IN)
 	addiu	$at, BYTES, 1
-	CPU_TO_LE32(SAVED_X)
-	ROTR(SAVED_X)
 	xor	T1, SAVED_X
 	sb	T1, 0(OUT)
 	beqz	$at, .Lchacha_mips_xor_done
 	/* Second byte */
 	lbu	T1, 1(IN)
 	addiu	$at, BYTES, 2
-	ROTx	SAVED_X, 8
+	rotr	SAVED_X, 8
 	xor	T1, SAVED_X
 	sb	T1, 1(OUT)
 	beqz	$at, .Lchacha_mips_xor_done
 	/* Third byte */
 	lbu	T1, 2(IN)
-	ROTx	SAVED_X, 8
+	rotr	SAVED_X, 8
 	xor	T1, SAVED_X
 	sb	T1, 2(OUT)
 	b	.Lchacha_mips_xor_done