Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

lib: Prepare zstd for preboot environment, improve performance

These changes are necessary to get the build to work in the preboot
environment, and to get reasonable performance:

- Remove a duplicate definition of the CHECK_F macro that appears when
the zstd library is amalgamated.

- Switch ZSTD_copy8() to __builtin_memcpy(), because in the preboot
environment on x86, gcc can't inline memcpy() otherwise.

- Limit the gcc hack in ZSTD_wildcopy() to the broken gcc version. See
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81388.

ZSTD_copy8() and ZSTD_wildcopy() are at the core of the zstd hot loop,
so outlining these calls to memcpy() and adding an extra branch are both
very detrimental to performance.
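
To make the hot-loop argument concrete, here is a minimal sketch of the two helpers. The names sketch_copy8() and sketch_wildcopy() are hypothetical stand-ins, not the kernel's functions, and the ptrdiff_t length parameter is an assumption. The sketch only shows why an inlined 8-byte copy matters: the loop body is nothing but that copy.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for ZSTD_copy8(): copy exactly 8 bytes.
 * __builtin_memcpy() is a GCC/Clang builtin, so the compiler can expand
 * it inline even when plain memcpy() is not treated as a builtin
 * (for example under -ffreestanding).
 */
static inline void sketch_copy8(void *dst, const void *src)
{
	__builtin_memcpy(dst, src, 8);
}

/*
 * Hypothetical stand-in for ZSTD_wildcopy(): copy in 8-byte steps.
 * It may write up to 7 bytes past dst + length (8 bytes if length == 0),
 * so the caller must provide that much slack in both buffers.
 */
static inline void sketch_wildcopy(void *dst, const void *src, ptrdiff_t length)
{
	const uint8_t *ip = (const uint8_t *)src;
	uint8_t *op = (uint8_t *)dst;
	uint8_t *const oend = op + length;

	do {
		sketch_copy8(op, ip);
		op += 8;
		ip += 8;
	} while (op < oend);
}
```

With length == 10 the loop runs twice and writes 16 bytes, which is the over-copy the original comment warns about.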

Signed-off-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200730190841.2071656-2-nickrterrell@gmail.com

Authored by Nick Terrell and committed by Ingo Molnar
6d25a633 92ed3019

Total: +13 -10

lib/zstd/fse_decompress.c (+1 -8)
···
 ****************************************************************/
 #include "bitstream.h"
 #include "fse.h"
+#include "zstd_internal.h"
 #include <linux/compiler.h>
 #include <linux/kernel.h>
 #include <linux/string.h> /* memcpy, memset */
···
 	{ \
 		enum { FSE_static_assert = 1 / (int)(!!(c)) }; \
 	} /* use only *after* variable declarations */
-
-/* check and forward error code */
-#define CHECK_F(f) \
-	{ \
-		size_t const e = f; \
-		if (FSE_isError(e)) \
-			return e; \
-	}
 
 /* **************************************************************
 *  Templates
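
For readers unfamiliar with the "check and forward" pattern that the removed macro implements, the following is a rough sketch. Every name here (SKETCH_CHECK_F, sketch_is_error(), the error range) is hypothetical and does not reflect zstd's actual error encoding. The sketch also wraps the body in do { } while (0), which differs from the bare braces used in the removed definition.

```c
#include <stddef.h>

/*
 * Assumption for this sketch only: the top 64 values of size_t are
 * treated as error codes. This is not zstd's real encoding.
 */
#define SKETCH_ERROR_FLOOR ((size_t)-64)

static inline int sketch_is_error(size_t code)
{
	return code >= SKETCH_ERROR_FLOOR;
}

/*
 * Check and forward: evaluate f once and return early from the
 * enclosing function if the result is an error code.
 */
#define SKETCH_CHECK_F(f) \
	do { \
		size_t const e_ = (f); \
		if (sketch_is_error(e_)) \
			return e_; \
	} while (0)

/* Hypothetical step that fails on zero input. */
static size_t sketch_step(size_t input)
{
	return input == 0 ? (size_t)-1 : input * 2;
}

/* The error from sketch_step() is forwarded unchanged to the caller. */
static size_t sketch_pipeline(size_t input)
{
	SKETCH_CHECK_F(sketch_step(input));
	return 0;
}
```

A helper macro like this needs exactly one definition per translation unit, which is why a single shared header is a natural home for it once several source files are compiled as one.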
lib/zstd/zstd_internal.h (+12 -2)
···
 *  Shared functions to include for inlining
 *********************************************/
 ZSTD_STATIC void ZSTD_copy8(void *dst, const void *src) {
-	memcpy(dst, src, 8);
+	/*
+	 * zstd relies heavily on gcc being able to analyze and inline this
+	 * memcpy() call, since it is called in a tight loop. Preboot mode
+	 * is compiled in freestanding mode, which stops gcc from analyzing
+	 * memcpy(). Use __builtin_memcpy() to tell gcc to analyze this as a
+	 * regular memcpy().
+	 */
+	__builtin_memcpy(dst, src, 8);
 }
 /*! ZSTD_wildcopy() :
 *   custom version of memcpy(), can copy up to 7 bytes too many (8 bytes if length==0) */
···
 	const BYTE* ip = (const BYTE*)src;
 	BYTE* op = (BYTE*)dst;
 	BYTE* const oend = op + length;
-	/* Work around https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81388.
+#if defined(GCC_VERSION) && GCC_VERSION >= 70000 && GCC_VERSION < 70200
+	/*
+	 * Work around https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81388.
 	 * Avoid the bad case where the loop only runs once by handling the
 	 * special case separately. This doesn't trigger the bug because it
 	 * doesn't involve pointer/integer overflow.
 	 */
 	if (length <= 8)
 		return ZSTD_copy8(dst, src);
+#endif
 	do {
 		ZSTD_copy8(op, ip);
 		op += 8;
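
The version guard in the second hunk can be read as a half-open range check on a packed version number. The sketch below assumes the usual packing of major * 10000 + minor * 100 + patchlevel. All SKETCH_ and sketch_ names are hypothetical and exist only for illustration.

```c
/*
 * Hypothetical helper: pack a compiler version as
 * major * 10000 + minor * 100 + patchlevel.
 */
#define SKETCH_VERSION(major, minor, patch) \
	((major) * 10000 + (minor) * 100 + (patch))

/* Only defined for real gcc; clang also defines __GNUC__, so exclude it. */
#if defined(__GNUC__) && !defined(__clang__)
#define SKETCH_GCC_VERSION \
	SKETCH_VERSION(__GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__)
#endif

/* Same range as the guard in the diff: 7.0.0 inclusive to 7.2.0 exclusive. */
static inline int sketch_version_is_affected(int version)
{
	return version >= SKETCH_VERSION(7, 0, 0) &&
	       version < SKETCH_VERSION(7, 2, 0);
}

/* Reports whether this build would compile the workaround branch in. */
static inline int sketch_workaround_compiled_in(void)
{
#if defined(SKETCH_GCC_VERSION) && SKETCH_GCC_VERSION >= 70000 && SKETCH_GCC_VERSION < 70200
	return 1;
#else
	return 0;
#endif
}
```

Because the check happens in the preprocessor, compilers outside the range pay nothing for it: the extra length <= 8 branch is simply absent from the hot loop.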