Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

ARCv2: lib: memcpy: fix doing prefetchw outside of buffer

ARCv2 optimized memcpy uses the PREFETCHW instruction to prefetch the
next cache line, but doesn't ensure that the line is not past the end
of the buffer. PREFETCHW changes the line's ownership and marks it
dirty, which can cause data corruption if this area is used for DMA IO.

Fix the issue by avoiding the PREFETCHW. This leads to performance
degradation, but that is OK as we'll introduce a new memcpy
implementation optimized for unaligned memory access.

We also drop all PREFETCH instructions, as they are quite useless
here:
 * we call PREFETCH right before the LOAD instruction.
 * we copy 16 or 32 bytes of data (depending on CONFIG_ARC_HAS_LL64)
   in the main loop, so we call PREFETCH 4 times (or 2 times) per
   L1 cache line (with the default 64B L1 cache line). Obviously
   this is not optimal.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

Authored by Eugeniy Paltsev, committed by Vineet Gupta
f8a15f97 252f6e8e

arch/arc/lib/memcpy-archs.S (14 deletions)
--- a/arch/arc/lib/memcpy-archs.S
+++ b/arch/arc/lib/memcpy-archs.S
···
 #endif

 #ifdef CONFIG_ARC_HAS_LL64
-# define PREFETCH_READ(RX)	prefetch    [RX, 56]
-# define PREFETCH_WRITE(RX)	prefetchw   [RX, 64]
 # define LOADX(DST,RX)		ldd.ab	DST, [RX, 8]
 # define STOREX(SRC,RX)		std.ab	SRC, [RX, 8]
 # define ZOLSHFT		5
 # define ZOLAND			0x1F
 #else
-# define PREFETCH_READ(RX)	prefetch    [RX, 28]
-# define PREFETCH_WRITE(RX)	prefetchw   [RX, 32]
 # define LOADX(DST,RX)		ld.ab	DST, [RX, 4]
 # define STOREX(SRC,RX)		st.ab	SRC, [RX, 4]
 # define ZOLSHFT		4
···
 #endif

 ENTRY_CFI(memcpy)
-	prefetch [r1]	; Prefetch the read location
-	prefetchw [r0]	; Prefetch the write location
 	mov.f	0, r2
 ;;; if size is zero
 	jz.d	[blink]
···
 	lpnz	@.Lcopy32_64bytes
 	;; LOOP START
 	LOADX (r6, r1)
-	PREFETCH_READ (r1)
-	PREFETCH_WRITE (r3)
 	LOADX (r8, r1)
 	LOADX (r10, r1)
 	LOADX (r4, r1)
···
 	lpnz	@.Lcopy8bytes_1
 	;; LOOP START
 	ld.ab	r6, [r1, 4]
-	prefetch [r1, 28]	;Prefetch the next read location
 	ld.ab	r8, [r1,4]
-	prefetchw [r3, 32]	;Prefetch the next write location

 	SHIFT_1	(r7, r6, 24)
 	or	r7, r7, r5
···
 	lpnz	@.Lcopy8bytes_2
 	;; LOOP START
 	ld.ab	r6, [r1, 4]
-	prefetch [r1, 28]	;Prefetch the next read location
 	ld.ab	r8, [r1,4]
-	prefetchw [r3, 32]	;Prefetch the next write location

 	SHIFT_1	(r7, r6, 16)
 	or	r7, r7, r5
···
 	lpnz	@.Lcopy8bytes_3
 	;; LOOP START
 	ld.ab	r6, [r1, 4]
-	prefetch [r1, 28]	;Prefetch the next read location
 	ld.ab	r8, [r1,4]
-	prefetchw [r3, 32]	;Prefetch the next write location

 	SHIFT_1	(r7, r6, 8)
 	or	r7, r7, r5