[ARM] 3252/1: help gcc do the best with ___arch__swab32

Patch from Nicolas Pitre

Depending on the gcc version, the current C-only implementation can
produce suboptimal code, ranging from a poor register selection that
forces an additional mov instruction to a failure to merge the eor and
the ror into a single instruction. With a small inline asm hint for the
non-constant case, gcc always produces the best code.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

authored by Nicolas Pitre and committed by Russell King a3e49436 b016450f

+10 -1
include/asm-arm/byteorder.h
@@ -22,7 +22,16 @@
 {
 	__u32 t;
 
-	t = x ^ ((x << 16) | (x >> 16)); /* eor r1,r0,r0,ror #16 */
+	if (__builtin_constant_p(x)) {
+		t = x ^ ((x << 16) | (x >> 16)); /* eor r1,r0,r0,ror #16 */
+	} else {
+		/*
+		 * The compiler needs a bit of a hint here to always do the
+		 * right thing and not screw it up to different degrees
+		 * depending on the gcc version.
+		 */
+		asm ("eor\t%0, %1, %1, ror #16" : "=r" (t) : "r" (x));
+	}
 	x = (x << 24) | (x >> 8);	/* mov r0,r0,ror #8      */
 	t &= ~0x00FF0000;		/* bic r1,r1,#0x00FF0000 */
 	x ^= (t >> 8);			/* eor r0,r0,r1,lsr #8   */
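
For readers who want to try the sequence outside the kernel tree, the following is a minimal standalone sketch of the patched logic, not the kernel code itself. The name swab32_sketch, the use of uint32_t in place of __u32, and the preprocessor guard are assumptions made for illustration. The guard restricts the asm path to gcc-compatible compilers targeting ARM (non-Thumb) state, and every other target falls back to the plain C expression, which computes the same value.

```c
#include <stdint.h>

/* Hypothetical standalone version of the patched byte swap. */
static inline uint32_t swab32_sketch(uint32_t x)
{
	uint32_t t;

#if defined(__GNUC__) && defined(__arm__) && !defined(__thumb__)
	if (__builtin_constant_p(x)) {
		/* Constant input: stay in C so the compiler can fold it. */
		t = x ^ ((x << 16) | (x >> 16));
	} else {
		/* Runtime input: emit the combined eor/ror instruction directly. */
		__asm__ ("eor\t%0, %1, %1, ror #16" : "=r" (t) : "r" (x));
	}
#else
	/* Assumed portable fallback: same value, codegen left to the compiler. */
	t = x ^ ((x << 16) | (x >> 16));
#endif
	x = (x << 24) | (x >> 8);	/* rotate right by 8 */
	t &= ~(uint32_t)0x00FF0000;	/* clear bits 16..23 of t */
	x ^= (t >> 8);			/* fix up the two misplaced bytes */
	return x;
}
```

Tracing x = 0x12345678 through the steps: t = 0x444C444C after the eor, x = 0x78123456 after the rotate, t = 0x4400444C after the bic, and the final eor with t >> 8 = 0x00440044 yields 0x78563412. The __builtin_constant_p test matters because an asm statement is opaque to the optimizer, so keeping the C expression for constant arguments lets gcc still evaluate the swap at compile time.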