Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

ASoC: Intel: avs: Allow for NHLT configuration

Merge series from Cezary Rojewski <cezary.rojewski@intel.com>

A small set of changes providing a new feature which the driver is already
utilizing in the field - for its Long-Term-Support (LTS) devices.

The goal is to cover systems which shipped with an invalid Non-HDAudio Link
Table (NHLT) - not just invalid descriptors (headers), but cases where the
hardware configuration is invalid too. The table is part of the ACPI
tree, and forcing BIOS updates is not a feasible solution. With the
override, the topology file can carry the hardware configuration
instead.

From the AudioDSP perspective, only gateway-related modules, e.g. the
Copier, care about the procedure. To ensure the correct order of
operations when initializing such modules, the overrides take precedence
over what is currently present in the NHLT.

+4267 -1466
+4 -2
.mailmap
··· 206 206 David Brownell <david-b@pacbell.net> 207 207 David Collins <quic_collinsd@quicinc.com> <collinsd@codeaurora.org> 208 208 David Heidelberg <david@ixit.cz> <d.okias@gmail.com> 209 + David Hildenbrand <david@kernel.org> <david@redhat.com> 209 210 David Rheinsberg <david@readahead.eu> <dh.herrmann@gmail.com> 210 211 David Rheinsberg <david@readahead.eu> <dh.herrmann@googlemail.com> 211 212 David Rheinsberg <david@readahead.eu> <david.rheinsberg@gmail.com> ··· 427 426 Kenneth Westfield <quic_kwestfie@quicinc.com> <kwestfie@codeaurora.org> 428 427 Kiran Gunda <quic_kgunda@quicinc.com> <kgunda@codeaurora.org> 429 428 Kirill Tkhai <tkhai@ya.ru> <ktkhai@virtuozzo.com> 430 - Kirill A. Shutemov <kas@kernel.org> <kirill.shutemov@linux.intel.com> 429 + Kiryl Shutsemau <kas@kernel.org> <kirill.shutemov@linux.intel.com> 431 430 Kishon Vijay Abraham I <kishon@kernel.org> <kishon@ti.com> 432 431 Konrad Dybcio <konradybcio@kernel.org> <konrad.dybcio@linaro.org> 433 432 Konrad Dybcio <konradybcio@kernel.org> <konrad.dybcio@somainline.org> ··· 606 605 Oleksij Rempel <o.rempel@pengutronix.de> <ore@pengutronix.de> 607 606 Oliver Hartkopp <socketcan@hartkopp.net> <oliver.hartkopp@volkswagen.de> 608 607 Oliver Hartkopp <socketcan@hartkopp.net> <oliver@hartkopp.net> 609 - Oliver Upton <oliver.upton@linux.dev> <oupton@google.com> 608 + Oliver Upton <oupton@kernel.org> <oupton@google.com> 609 + Oliver Upton <oupton@kernel.org> <oliver.upton@linux.dev> 610 610 Ondřej Jirman <megi@xff.cz> <megous@megous.com> 611 611 Oza Pawandeep <quic_poza@quicinc.com> <poza@codeaurora.org> 612 612 Pali Rohár <pali@kernel.org> <pali.rohar@gmail.com>
+2 -2
Documentation/userspace-api/netlink/intro-specs.rst
··· 13 13 Kernel comes with a simple CLI tool which should be useful when 14 14 developing Netlink related code. The tool is implemented in Python 15 15 and can use a YAML specification to issue Netlink requests 16 - to the kernel. Only Generic Netlink is supported. 16 + to the kernel. 17 17 18 18 The tool is located at ``tools/net/ynl/pyynl/cli.py``. It accepts 19 - a handul of arguments, the most important ones are: 19 + a handful of arguments, the most important ones are: 20 20 21 21 - ``--spec`` - point to the spec file 22 22 - ``--do $name`` / ``--dump $name`` - issue request ``$name``
+21 -19
MAINTAINERS
··· 915 915 ALPHA PORT 916 916 M: Richard Henderson <richard.henderson@linaro.org> 917 917 M: Matt Turner <mattst88@gmail.com> 918 + M: Magnus Lindholm <linmag7@gmail.com> 918 919 L: linux-alpha@vger.kernel.org 919 920 S: Odd Fixes 920 921 F: arch/alpha/ ··· 4400 4399 M: Jens Axboe <axboe@kernel.dk> 4401 4400 L: linux-block@vger.kernel.org 4402 4401 S: Maintained 4403 - T: git git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git 4402 + T: git git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux.git 4404 4403 F: Documentation/ABI/stable/sysfs-block 4405 4404 F: Documentation/block/ 4406 4405 F: block/ ··· 9210 9209 R: Jeffle Xu <jefflexu@linux.alibaba.com> 9211 9210 R: Sandeep Dhavale <dhavale@google.com> 9212 9211 R: Hongbo Li <lihongbo22@huawei.com> 9212 + R: Chunhai Guo <guochunhai@vivo.com> 9213 9213 L: linux-erofs@lists.ozlabs.org 9214 9214 S: Maintained 9215 9215 W: https://erofs.docs.kernel.org ··· 11529 11527 HUGETLB SUBSYSTEM 11530 11528 M: Muchun Song <muchun.song@linux.dev> 11531 11529 M: Oscar Salvador <osalvador@suse.de> 11532 - R: David Hildenbrand <david@redhat.com> 11530 + R: David Hildenbrand <david@kernel.org> 11533 11531 L: linux-mm@kvack.org 11534 11532 S: Maintained 11535 11533 F: Documentation/ABI/testing/sysfs-kernel-mm-hugepages ··· 13662 13660 13663 13661 KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64) 13664 13662 M: Marc Zyngier <maz@kernel.org> 13665 - M: Oliver Upton <oliver.upton@linux.dev> 13663 + M: Oliver Upton <oupton@kernel.org> 13666 13664 R: Joey Gouly <joey.gouly@arm.com> 13667 13665 R: Suzuki K Poulose <suzuki.poulose@arm.com> 13668 13666 R: Zenghui Yu <yuzenghui@huawei.com> ··· 13736 13734 M: Christian Borntraeger <borntraeger@linux.ibm.com> 13737 13735 M: Janosch Frank <frankja@linux.ibm.com> 13738 13736 M: Claudio Imbrenda <imbrenda@linux.ibm.com> 13739 - R: David Hildenbrand <david@redhat.com> 13737 + R: David Hildenbrand <david@kernel.org> 13740 13738 L: kvm@vger.kernel.org 13741 13739 S: Supported 13742 
13740 T: git git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git ··· 16223 16221 F: drivers/devfreq/tegra30-devfreq.c 16224 16222 16225 16223 MEMORY HOT(UN)PLUG 16226 - M: David Hildenbrand <david@redhat.com> 16224 + M: David Hildenbrand <david@kernel.org> 16227 16225 M: Oscar Salvador <osalvador@suse.de> 16228 16226 L: linux-mm@kvack.org 16229 16227 S: Maintained ··· 16248 16246 16249 16247 MEMORY MANAGEMENT - CORE 16250 16248 M: Andrew Morton <akpm@linux-foundation.org> 16251 - M: David Hildenbrand <david@redhat.com> 16249 + M: David Hildenbrand <david@kernel.org> 16252 16250 R: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> 16253 16251 R: Liam R. Howlett <Liam.Howlett@oracle.com> 16254 16252 R: Vlastimil Babka <vbabka@suse.cz> ··· 16304 16302 16305 16303 MEMORY MANAGEMENT - GUP (GET USER PAGES) 16306 16304 M: Andrew Morton <akpm@linux-foundation.org> 16307 - M: David Hildenbrand <david@redhat.com> 16305 + M: David Hildenbrand <david@kernel.org> 16308 16306 R: Jason Gunthorpe <jgg@nvidia.com> 16309 16307 R: John Hubbard <jhubbard@nvidia.com> 16310 16308 R: Peter Xu <peterx@redhat.com> ··· 16320 16318 16321 16319 MEMORY MANAGEMENT - KSM (Kernel Samepage Merging) 16322 16320 M: Andrew Morton <akpm@linux-foundation.org> 16323 - M: David Hildenbrand <david@redhat.com> 16321 + M: David Hildenbrand <david@kernel.org> 16324 16322 R: Xu Xin <xu.xin16@zte.com.cn> 16325 16323 R: Chengming Zhou <chengming.zhou@linux.dev> 16326 16324 L: linux-mm@kvack.org ··· 16336 16334 16337 16335 MEMORY MANAGEMENT - MEMORY POLICY AND MIGRATION 16338 16336 M: Andrew Morton <akpm@linux-foundation.org> 16339 - M: David Hildenbrand <david@redhat.com> 16337 + M: David Hildenbrand <david@kernel.org> 16340 16338 R: Zi Yan <ziy@nvidia.com> 16341 16339 R: Matthew Brost <matthew.brost@intel.com> 16342 16340 R: Joshua Hahn <joshua.hahnjy@gmail.com> ··· 16376 16374 16377 16375 MEMORY MANAGEMENT - MISC 16378 16376 M: Andrew Morton <akpm@linux-foundation.org> 16379 - M: David Hildenbrand 
<david@redhat.com> 16377 + M: David Hildenbrand <david@kernel.org> 16380 16378 R: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> 16381 16379 R: Liam R. Howlett <Liam.Howlett@oracle.com> 16382 16380 R: Vlastimil Babka <vbabka@suse.cz> ··· 16464 16462 MEMORY MANAGEMENT - RECLAIM 16465 16463 M: Andrew Morton <akpm@linux-foundation.org> 16466 16464 M: Johannes Weiner <hannes@cmpxchg.org> 16467 - R: David Hildenbrand <david@redhat.com> 16465 + R: David Hildenbrand <david@kernel.org> 16468 16466 R: Michal Hocko <mhocko@kernel.org> 16469 16467 R: Qi Zheng <zhengqi.arch@bytedance.com> 16470 16468 R: Shakeel Butt <shakeel.butt@linux.dev> ··· 16477 16475 16478 16476 MEMORY MANAGEMENT - RMAP (REVERSE MAPPING) 16479 16477 M: Andrew Morton <akpm@linux-foundation.org> 16480 - M: David Hildenbrand <david@redhat.com> 16478 + M: David Hildenbrand <david@kernel.org> 16481 16479 M: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> 16482 16480 R: Rik van Riel <riel@surriel.com> 16483 16481 R: Liam R. Howlett <Liam.Howlett@oracle.com> ··· 16501 16499 16502 16500 MEMORY MANAGEMENT - SWAP 16503 16501 M: Andrew Morton <akpm@linux-foundation.org> 16502 + M: Chris Li <chrisl@kernel.org> 16503 + M: Kairui Song <kasong@tencent.com> 16504 16504 R: Kemeng Shi <shikemeng@huaweicloud.com> 16505 - R: Kairui Song <kasong@tencent.com> 16506 16505 R: Nhat Pham <nphamcs@gmail.com> 16507 16506 R: Baoquan He <bhe@redhat.com> 16508 16507 R: Barry Song <baohua@kernel.org> 16509 - R: Chris Li <chrisl@kernel.org> 16510 16508 L: linux-mm@kvack.org 16511 16509 S: Maintained 16512 16510 F: Documentation/mm/swap-table.rst ··· 16522 16520 16523 16521 MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE) 16524 16522 M: Andrew Morton <akpm@linux-foundation.org> 16525 - M: David Hildenbrand <david@redhat.com> 16523 + M: David Hildenbrand <david@kernel.org> 16526 16524 M: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> 16527 16525 R: Zi Yan <ziy@nvidia.com> 16528 16526 R: Baolin Wang <baolin.wang@linux.alibaba.com> ··· 16624 16622 
M: Andrew Morton <akpm@linux-foundation.org> 16625 16623 M: Liam R. Howlett <Liam.Howlett@oracle.com> 16626 16624 M: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> 16627 - M: David Hildenbrand <david@redhat.com> 16625 + M: David Hildenbrand <david@kernel.org> 16628 16626 R: Vlastimil Babka <vbabka@suse.cz> 16629 16627 R: Jann Horn <jannh@google.com> 16630 16628 L: linux-mm@kvack.org ··· 27091 27089 27092 27090 VIRTIO BALLOON 27093 27091 M: "Michael S. Tsirkin" <mst@redhat.com> 27094 - M: David Hildenbrand <david@redhat.com> 27092 + M: David Hildenbrand <david@kernel.org> 27095 27093 L: virtualization@lists.linux.dev 27096 27094 S: Maintained 27097 27095 F: drivers/virtio/virtio_balloon.c ··· 27246 27244 F: include/uapi/linux/virtio_iommu.h 27247 27245 27248 27246 VIRTIO MEM DRIVER 27249 - M: David Hildenbrand <david@redhat.com> 27247 + M: David Hildenbrand <david@kernel.org> 27250 27248 L: virtualization@lists.linux.dev 27251 27249 S: Maintained 27252 27250 W: https://virtio-mem.gitlab.io/ ··· 27853 27851 F: arch/x86/kernel/unwind_*.c 27854 27852 27855 27853 X86 TRUST DOMAIN EXTENSIONS (TDX) 27856 - M: Kirill A. Shutemov <kas@kernel.org> 27854 + M: Kiryl Shutsemau <kas@kernel.org> 27857 27855 R: Dave Hansen <dave.hansen@linux.intel.com> 27858 27856 R: Rick Edgecombe <rick.p.edgecombe@intel.com> 27859 27857 L: x86@kernel.org
+1 -1
Makefile
··· 2 2 VERSION = 6 3 3 PATCHLEVEL = 18 4 4 SUBLEVEL = 0 5 - EXTRAVERSION = -rc5 5 + EXTRAVERSION = -rc6 6 6 NAME = Baby Opossum Posse 7 7 8 8 # *DOCUMENTATION*
+5 -2
arch/arm64/include/asm/alternative.h
··· 26 26 bool alternative_is_applied(u16 cpucap); 27 27 28 28 #ifdef CONFIG_MODULES 29 - void apply_alternatives_module(void *start, size_t length); 29 + int apply_alternatives_module(void *start, size_t length); 30 30 #else 31 - static inline void apply_alternatives_module(void *start, size_t length) { } 31 + static inline int apply_alternatives_module(void *start, size_t length) 32 + { 33 + return 0; 34 + } 32 35 #endif 33 36 34 37 void alt_cb_patch_nops(struct alt_instr *alt, __le32 *origptr,
+1 -2
arch/arm64/include/asm/kfence.h
··· 10 10 11 11 #include <asm/set_memory.h> 12 12 13 - static inline bool arch_kfence_init_pool(void) { return true; } 14 - 15 13 static inline bool kfence_protect_page(unsigned long addr, bool protect) 16 14 { 17 15 set_memory_valid(addr, 1, !protect); ··· 23 25 { 24 26 return !kfence_early_init; 25 27 } 28 + bool arch_kfence_init_pool(void); 26 29 #else /* CONFIG_KFENCE */ 27 30 static inline bool arm64_kfence_can_set_direct_map(void) { return false; } 28 31 #endif /* CONFIG_KFENCE */
+11 -4
arch/arm64/include/asm/percpu.h
··· 77 77 " stxr" #sfx "\t%w[loop], %" #w "[tmp], %[ptr]\n" \ 78 78 " cbnz %w[loop], 1b", \ 79 79 /* LSE atomics */ \ 80 - #op_lse "\t%" #w "[val], %[ptr]\n" \ 80 + #op_lse "\t%" #w "[val], %" #w "[tmp], %[ptr]\n" \ 81 81 __nops(3)) \ 82 82 : [loop] "=&r" (loop), [tmp] "=&r" (tmp), \ 83 83 [ptr] "+Q"(*(u##sz *)ptr) \ ··· 124 124 PERCPU_RW_OPS(16) 125 125 PERCPU_RW_OPS(32) 126 126 PERCPU_RW_OPS(64) 127 - PERCPU_OP(add, add, stadd) 128 - PERCPU_OP(andnot, bic, stclr) 129 - PERCPU_OP(or, orr, stset) 127 + 128 + /* 129 + * Use value-returning atomics for CPU-local ops as they are more likely 130 + * to execute "near" to the CPU (e.g. in L1$). 131 + * 132 + * https://lore.kernel.org/r/e7d539ed-ced0-4b96-8ecd-048a5b803b85@paulmck-laptop 133 + */ 134 + PERCPU_OP(add, add, ldadd) 135 + PERCPU_OP(andnot, bic, ldclr) 136 + PERCPU_OP(or, orr, ldset) 130 137 PERCPU_RET_OP(add, add, ldadd) 131 138 132 139 #undef PERCPU_RW_OPS
+1 -1
arch/arm64/include/asm/scs.h
··· 53 53 EDYNSCS_INVALID_CFA_OPCODE = 4, 54 54 }; 55 55 56 - int __pi_scs_patch(const u8 eh_frame[], int size); 56 + int __pi_scs_patch(const u8 eh_frame[], int size, bool skip_dry_run); 57 57 58 58 #endif /* __ASSEMBLY __ */ 59 59
+1
arch/arm64/include/asm/spectre.h
··· 117 117 __le32 *origptr, __le32 *updptr, int nr_inst); 118 118 void spectre_bhb_patch_clearbhb(struct alt_instr *alt, 119 119 __le32 *origptr, __le32 *updptr, int nr_inst); 120 + void spectre_print_disabled_mitigations(void); 120 121 121 122 #endif /* __ASSEMBLY__ */ 122 123 #endif /* __ASM_SPECTRE_H */
+1 -7
arch/arm64/kernel/acpi.c
··· 197 197 */ 198 198 void __init acpi_boot_table_init(void) 199 199 { 200 - int ret; 201 - 202 200 /* 203 201 * Enable ACPI instead of device tree unless 204 202 * - ACPI has been disabled explicitly (acpi=off), or ··· 250 252 * behaviour, use acpi=nospcr to disable console in ACPI SPCR 251 253 * table as default serial console. 252 254 */ 253 - ret = acpi_parse_spcr(earlycon_acpi_spcr_enable, 255 + acpi_parse_spcr(earlycon_acpi_spcr_enable, 254 256 !param_acpi_nospcr); 255 - if (!ret || param_acpi_nospcr || !IS_ENABLED(CONFIG_ACPI_SPCR_TABLE)) 256 - pr_info("Use ACPI SPCR as default console: No\n"); 257 - else 258 - pr_info("Use ACPI SPCR as default console: Yes\n"); 259 257 260 258 if (IS_ENABLED(CONFIG_ACPI_BGRT)) 261 259 acpi_table_parse(ACPI_SIG_BGRT, acpi_parse_bgrt);
+12 -7
arch/arm64/kernel/alternative.c
··· 139 139 } while (cur += d_size, cur < end); 140 140 } 141 141 142 - static void __apply_alternatives(const struct alt_region *region, 143 - bool is_module, 144 - unsigned long *cpucap_mask) 142 + static int __apply_alternatives(const struct alt_region *region, 143 + bool is_module, 144 + unsigned long *cpucap_mask) 145 145 { 146 146 struct alt_instr *alt; 147 147 __le32 *origptr, *updptr; ··· 166 166 updptr = is_module ? origptr : lm_alias(origptr); 167 167 nr_inst = alt->orig_len / AARCH64_INSN_SIZE; 168 168 169 - if (ALT_HAS_CB(alt)) 169 + if (ALT_HAS_CB(alt)) { 170 170 alt_cb = ALT_REPL_PTR(alt); 171 - else 171 + if (is_module && !core_kernel_text((unsigned long)alt_cb)) 172 + return -ENOEXEC; 173 + } else { 172 174 alt_cb = patch_alternative; 175 + } 173 176 174 177 alt_cb(alt, origptr, updptr, nr_inst); 175 178 ··· 196 193 bitmap_and(applied_alternatives, applied_alternatives, 197 194 system_cpucaps, ARM64_NCAPS); 198 195 } 196 + 197 + return 0; 199 198 } 200 199 201 200 static void __init apply_alternatives_vdso(void) ··· 282 277 } 283 278 284 279 #ifdef CONFIG_MODULES 285 - void apply_alternatives_module(void *start, size_t length) 280 + int apply_alternatives_module(void *start, size_t length) 286 281 { 287 282 struct alt_region region = { 288 283 .begin = start, ··· 292 287 293 288 bitmap_fill(all_capabilities, ARM64_NCAPS); 294 289 295 - __apply_alternatives(&region, true, &all_capabilities[0]); 290 + return __apply_alternatives(&region, true, &all_capabilities[0]); 296 291 } 297 292 #endif 298 293
+6
arch/arm64/kernel/cpufeature.c
··· 95 95 #include <asm/vectors.h> 96 96 #include <asm/virt.h> 97 97 98 + #include <asm/spectre.h> 98 99 /* Kernel representation of AT_HWCAP and AT_HWCAP2 */ 99 100 static DECLARE_BITMAP(elf_hwcap, MAX_CPU_FEATURES) __read_mostly; 100 101 ··· 3876 3875 */ 3877 3876 if (system_uses_ttbr0_pan()) 3878 3877 pr_info("emulated: Privileged Access Never (PAN) using TTBR0_EL1 switching\n"); 3878 + 3879 + /* 3880 + * Report Spectre mitigations status. 3881 + */ 3882 + spectre_print_disabled_mitigations(); 3879 3883 } 3880 3884 3881 3885 void __init setup_system_features(void)
+17 -4
arch/arm64/kernel/module.c
··· 489 489 int ret; 490 490 491 491 s = find_section(hdr, sechdrs, ".altinstructions"); 492 - if (s) 493 - apply_alternatives_module((void *)s->sh_addr, s->sh_size); 492 + if (s) { 493 + ret = apply_alternatives_module((void *)s->sh_addr, s->sh_size); 494 + if (ret < 0) { 495 + pr_err("module %s: error occurred when applying alternatives\n", me->name); 496 + return ret; 497 + } 498 + } 494 499 495 500 if (scs_is_dynamic()) { 496 501 s = find_section(hdr, sechdrs, ".init.eh_frame"); 497 502 if (s) { 498 - ret = __pi_scs_patch((void *)s->sh_addr, s->sh_size); 499 - if (ret) 503 + /* 504 + * Because we can reject modules that are malformed 505 + * so SCS patching fails, skip dry run and try to patch 506 + * it in place. If patching fails, the module would not 507 + * be loaded anyway. 508 + */ 509 + ret = __pi_scs_patch((void *)s->sh_addr, s->sh_size, true); 510 + if (ret) { 500 511 pr_err("module %s: error occurred during dynamic SCS patching (%d)\n", 501 512 me->name, ret); 513 + return -ENOEXEC; 514 + } 502 515 } 503 516 } 504 517
+2 -1
arch/arm64/kernel/mte.c
··· 476 476 477 477 folio = page_folio(page); 478 478 if (folio_test_hugetlb(folio)) 479 - WARN_ON_ONCE(!folio_test_hugetlb_mte_tagged(folio)); 479 + WARN_ON_ONCE(!folio_test_hugetlb_mte_tagged(folio) && 480 + !is_huge_zero_folio(folio)); 480 481 else 481 482 WARN_ON_ONCE(!page_mte_tagged(page) && !is_zero_page(page)); 482 483
+1 -1
arch/arm64/kernel/pi/map_kernel.c
··· 104 104 105 105 if (enable_scs) { 106 106 scs_patch(__eh_frame_start + va_offset, 107 - __eh_frame_end - __eh_frame_start); 107 + __eh_frame_end - __eh_frame_start, false); 108 108 asm("ic ialluis"); 109 109 110 110 dynamic_scs_is_enabled = true;
+6 -4
arch/arm64/kernel/pi/patch-scs.c
··· 225 225 return 0; 226 226 } 227 227 228 - int scs_patch(const u8 eh_frame[], int size) 228 + int scs_patch(const u8 eh_frame[], int size, bool skip_dry_run) 229 229 { 230 230 int code_alignment_factor = 1; 231 231 bool fde_use_sdata8 = false; ··· 277 277 } 278 278 } else { 279 279 ret = scs_handle_fde_frame(frame, code_alignment_factor, 280 - fde_use_sdata8, true); 280 + fde_use_sdata8, !skip_dry_run); 281 281 if (ret) 282 282 return ret; 283 - scs_handle_fde_frame(frame, code_alignment_factor, 284 - fde_use_sdata8, false); 283 + 284 + if (!skip_dry_run) 285 + scs_handle_fde_frame(frame, code_alignment_factor, 286 + fde_use_sdata8, false); 285 287 } 286 288 287 289 p += sizeof(frame->size) + frame->size;
+1 -1
arch/arm64/kernel/pi/pi.h
··· 27 27 void init_feature_override(u64 boot_status, const void *fdt, int chosen); 28 28 u64 kaslr_early_init(void *fdt, int chosen); 29 29 void relocate_kernel(u64 offset); 30 - int scs_patch(const u8 eh_frame[], int size); 30 + int scs_patch(const u8 eh_frame[], int size, bool skip_dry_run); 31 31 32 32 void map_range(phys_addr_t *pte, u64 start, u64 end, phys_addr_t pa, 33 33 pgprot_t prot, int level, pte_t *tbl, bool may_use_cont,
+4 -1
arch/arm64/kernel/probes/kprobes.c
··· 49 49 addr = execmem_alloc(EXECMEM_KPROBES, PAGE_SIZE); 50 50 if (!addr) 51 51 return NULL; 52 - set_memory_rox((unsigned long)addr, 1); 52 + if (set_memory_rox((unsigned long)addr, 1)) { 53 + execmem_free(addr); 54 + return NULL; 55 + } 53 56 return addr; 54 57 } 55 58
+18 -17
arch/arm64/kernel/proton-pack.c
··· 91 91 92 92 static bool spectre_v2_mitigations_off(void) 93 93 { 94 - bool ret = __nospectre_v2 || cpu_mitigations_off(); 95 - 96 - if (ret) 97 - pr_info_once("spectre-v2 mitigation disabled by command line option\n"); 98 - 99 - return ret; 94 + return __nospectre_v2 || cpu_mitigations_off(); 100 95 } 101 96 102 97 static const char *get_bhb_affected_string(enum mitigation_state bhb_state) ··· 416 421 */ 417 422 static bool spectre_v4_mitigations_off(void) 418 423 { 419 - bool ret = cpu_mitigations_off() || 420 - __spectre_v4_policy == SPECTRE_V4_POLICY_MITIGATION_DISABLED; 421 - 422 - if (ret) 423 - pr_info_once("spectre-v4 mitigation disabled by command-line option\n"); 424 - 425 - return ret; 424 + return cpu_mitigations_off() || 425 + __spectre_v4_policy == SPECTRE_V4_POLICY_MITIGATION_DISABLED; 426 426 } 427 427 428 428 /* Do we need to toggle the mitigation state on entry to/exit from the kernel? */ ··· 1032 1042 1033 1043 if (arm64_get_spectre_v2_state() == SPECTRE_VULNERABLE) { 1034 1044 /* No point mitigating Spectre-BHB alone. */ 1035 - } else if (!IS_ENABLED(CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY)) { 1036 - pr_info_once("spectre-bhb mitigation disabled by compile time option\n"); 1037 - } else if (cpu_mitigations_off() || __nospectre_bhb) { 1038 - pr_info_once("spectre-bhb mitigation disabled by command line option\n"); 1039 1045 } else if (supports_ecbhb(SCOPE_LOCAL_CPU)) { 1040 1046 state = SPECTRE_MITIGATED; 1041 1047 set_bit(BHB_HW, &system_bhb_mitigations); ··· 1185 1199 pr_err("WARNING: %s", EBPF_WARN); 1186 1200 } 1187 1201 #endif 1202 + 1203 + void spectre_print_disabled_mitigations(void) 1204 + { 1205 + /* Keep a single copy of the common message suffix to avoid duplication. */ 1206 + const char *spectre_disabled_suffix = "mitigation disabled by command-line option\n"; 1207 + 1208 + if (spectre_v2_mitigations_off()) 1209 + pr_info("spectre-v2 %s", spectre_disabled_suffix); 1210 + 1211 + if (spectre_v4_mitigations_off()) 1212 + pr_info("spectre-v4 %s", spectre_disabled_suffix); 1213 + 1214 + if (__nospectre_bhb || cpu_mitigations_off()) 1215 + pr_info("spectre-bhb %s", spectre_disabled_suffix); 1216 + }
+7 -2
arch/arm64/kvm/hyp/nvhe/ffa.c
··· 479 479 struct ffa_mem_region_attributes *ep_mem_access; 480 480 struct ffa_composite_mem_region *reg; 481 481 struct ffa_mem_region *buf; 482 - u32 offset, nr_ranges; 482 + u32 offset, nr_ranges, checked_offset; 483 483 int ret = 0; 484 484 485 485 if (addr_mbz || npages_mbz || fraglen > len || ··· 516 516 goto out_unlock; 517 517 } 518 518 519 - if (fraglen < offset + sizeof(struct ffa_composite_mem_region)) { 519 + if (check_add_overflow(offset, sizeof(struct ffa_composite_mem_region), &checked_offset)) { 520 + ret = FFA_RET_INVALID_PARAMETERS; 521 + goto out_unlock; 522 + } 523 + 524 + if (fraglen < checked_offset) { 520 525 ret = FFA_RET_INVALID_PARAMETERS; 521 526 goto out_unlock; 522 527 }
+28
arch/arm64/kvm/hyp/nvhe/mem_protect.c
··· 367 367 return kvm_pgtable_stage2_unmap(pgt, addr, BIT(pgt->ia_bits) - addr); 368 368 } 369 369 370 + /* 371 + * Ensure the PFN range is contained within PA-range. 372 + * 373 + * This check is also robust to overflows and is therefore a requirement before 374 + * using a pfn/nr_pages pair from an untrusted source. 375 + */ 376 + static bool pfn_range_is_valid(u64 pfn, u64 nr_pages) 377 + { 378 + u64 limit = BIT(kvm_phys_shift(&host_mmu.arch.mmu) - PAGE_SHIFT); 379 + 380 + return pfn < limit && ((limit - pfn) >= nr_pages); 381 + } 382 + 370 383 struct kvm_mem_range { 371 384 u64 start; 372 385 u64 end; ··· 789 776 void *virt = __hyp_va(phys); 790 777 int ret; 791 778 779 + if (!pfn_range_is_valid(pfn, nr_pages)) 780 + return -EINVAL; 781 + 792 782 host_lock_component(); 793 783 hyp_lock_component(); 794 784 ··· 819 803 u64 size = PAGE_SIZE * nr_pages; 820 804 u64 virt = (u64)__hyp_va(phys); 821 805 int ret; 806 + 807 + if (!pfn_range_is_valid(pfn, nr_pages)) 808 + return -EINVAL; 822 809 823 810 host_lock_component(); 824 811 hyp_lock_component(); ··· 906 887 u64 size = PAGE_SIZE * nr_pages; 907 888 int ret; 908 889 890 + if (!pfn_range_is_valid(pfn, nr_pages)) 891 + return -EINVAL; 892 + 909 893 host_lock_component(); 910 894 ret = __host_check_page_state_range(phys, size, PKVM_PAGE_OWNED); 911 895 if (!ret) ··· 923 901 u64 phys = hyp_pfn_to_phys(pfn); 924 902 u64 size = PAGE_SIZE * nr_pages; 925 903 int ret; 904 + 905 + if (!pfn_range_is_valid(pfn, nr_pages)) 906 + return -EINVAL; 926 907 927 908 host_lock_component(); 928 909 ret = __host_check_page_state_range(phys, size, PKVM_PAGE_SHARED_OWNED); ··· 968 943 int ret; 969 944 970 945 if (prot & ~KVM_PGTABLE_PROT_RWX) 946 + return -EINVAL; 947 + 948 + if (!pfn_range_is_valid(pfn, nr_pages)) 971 949 return -EINVAL; 972 950 973 951 ret = __guest_check_transition_size(phys, ipa, nr_pages, &size);
+38 -33
arch/arm64/kvm/sys_regs.c
··· 2595 2595 .val = 0, \ 2596 2596 } 2597 2597 2598 - /* sys_reg_desc initialiser for known cpufeature ID registers */ 2599 - #define AA32_ID_SANITISED(name) { \ 2600 - ID_DESC(name), \ 2601 - .visibility = aa32_id_visibility, \ 2602 - .val = 0, \ 2603 - } 2604 - 2605 2598 /* sys_reg_desc initialiser for writable ID registers */ 2606 2599 #define ID_WRITABLE(name, mask) { \ 2607 2600 ID_DESC(name), \ 2608 2601 .val = mask, \ 2602 + } 2603 + 2604 + /* 2605 + * 32bit ID regs are fully writable when the guest is 32bit 2606 + * capable. Nothing in the KVM code should rely on 32bit features 2607 + * anyway, only 64bit, so let the VMM do its worse. 2608 + */ 2609 + #define AA32_ID_WRITABLE(name) { \ 2610 + ID_DESC(name), \ 2611 + .visibility = aa32_id_visibility, \ 2612 + .val = GENMASK(31, 0), \ 2609 2613 } 2610 2614 2611 2615 /* sys_reg_desc initialiser for cpufeature ID registers that need filtering */ ··· 3132 3128 3133 3129 /* AArch64 mappings of the AArch32 ID registers */ 3134 3130 /* CRm=1 */ 3135 - AA32_ID_SANITISED(ID_PFR0_EL1), 3136 - AA32_ID_SANITISED(ID_PFR1_EL1), 3131 + AA32_ID_WRITABLE(ID_PFR0_EL1), 3132 + AA32_ID_WRITABLE(ID_PFR1_EL1), 3137 3133 { SYS_DESC(SYS_ID_DFR0_EL1), 3138 3134 .access = access_id_reg, 3139 3135 .get_user = get_id_reg, 3140 3136 .set_user = set_id_dfr0_el1, 3141 3137 .visibility = aa32_id_visibility, 3142 3138 .reset = read_sanitised_id_dfr0_el1, 3143 - .val = ID_DFR0_EL1_PerfMon_MASK | 3144 - ID_DFR0_EL1_CopDbg_MASK, }, 3139 + .val = GENMASK(31, 0) }, 3145 3140 ID_HIDDEN(ID_AFR0_EL1), 3146 - AA32_ID_SANITISED(ID_MMFR0_EL1), 3147 - AA32_ID_SANITISED(ID_MMFR1_EL1), 3148 - AA32_ID_SANITISED(ID_MMFR2_EL1), 3149 - AA32_ID_SANITISED(ID_MMFR3_EL1), 3141 + AA32_ID_WRITABLE(ID_MMFR0_EL1), 3142 + AA32_ID_WRITABLE(ID_MMFR1_EL1), 3143 + AA32_ID_WRITABLE(ID_MMFR2_EL1), 3144 + AA32_ID_WRITABLE(ID_MMFR3_EL1), 3150 3145 3151 3146 /* CRm=2 */ 3152 - AA32_ID_SANITISED(ID_ISAR0_EL1), 3153 - AA32_ID_SANITISED(ID_ISAR1_EL1), 3154 - 
AA32_ID_SANITISED(ID_ISAR2_EL1), 3155 - AA32_ID_SANITISED(ID_ISAR3_EL1), 3156 - AA32_ID_SANITISED(ID_ISAR4_EL1), 3157 - AA32_ID_SANITISED(ID_ISAR5_EL1), 3158 - AA32_ID_SANITISED(ID_MMFR4_EL1), 3159 - AA32_ID_SANITISED(ID_ISAR6_EL1), 3147 + AA32_ID_WRITABLE(ID_ISAR0_EL1), 3148 + AA32_ID_WRITABLE(ID_ISAR1_EL1), 3149 + AA32_ID_WRITABLE(ID_ISAR2_EL1), 3150 + AA32_ID_WRITABLE(ID_ISAR3_EL1), 3151 + AA32_ID_WRITABLE(ID_ISAR4_EL1), 3152 + AA32_ID_WRITABLE(ID_ISAR5_EL1), 3153 + AA32_ID_WRITABLE(ID_MMFR4_EL1), 3154 + AA32_ID_WRITABLE(ID_ISAR6_EL1), 3160 3155 3161 3156 /* CRm=3 */ 3162 - AA32_ID_SANITISED(MVFR0_EL1), 3163 - AA32_ID_SANITISED(MVFR1_EL1), 3164 - AA32_ID_SANITISED(MVFR2_EL1), 3157 + AA32_ID_WRITABLE(MVFR0_EL1), 3158 + AA32_ID_WRITABLE(MVFR1_EL1), 3159 + AA32_ID_WRITABLE(MVFR2_EL1), 3165 3160 ID_UNALLOCATED(3,3), 3166 - AA32_ID_SANITISED(ID_PFR2_EL1), 3161 + AA32_ID_WRITABLE(ID_PFR2_EL1), 3167 3162 ID_HIDDEN(ID_DFR1_EL1), 3168 - AA32_ID_SANITISED(ID_MMFR5_EL1), 3163 + AA32_ID_WRITABLE(ID_MMFR5_EL1), 3169 3164 ID_UNALLOCATED(3,7), 3170 3165 3171 3166 /* AArch64 ID registers */ ··· 5609 5606 5610 5607 guard(mutex)(&kvm->arch.config_lock); 5611 5608 5612 - if (!(static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif) && 5613 - irqchip_in_kernel(kvm) && 5614 - kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)) { 5615 - kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] &= ~ID_AA64PFR0_EL1_GIC_MASK; 5616 - kvm->arch.id_regs[IDREG_IDX(SYS_ID_PFR1_EL1)] &= ~ID_PFR1_EL1_GIC_MASK; 5609 + if (!irqchip_in_kernel(kvm)) { 5610 + u64 val; 5611 + 5612 + val = kvm_read_vm_id_reg(kvm, SYS_ID_AA64PFR0_EL1) & ~ID_AA64PFR0_EL1_GIC; 5613 + kvm_set_vm_id_reg(kvm, SYS_ID_AA64PFR0_EL1, val); 5614 + val = kvm_read_vm_id_reg(kvm, SYS_ID_PFR1_EL1) & ~ID_PFR1_EL1_GIC; 5615 + kvm_set_vm_id_reg(kvm, SYS_ID_PFR1_EL1, val); 5617 5616 } 5618 5617 5619 5618 if (vcpu_has_nv(vcpu)) {
+12 -4
arch/arm64/kvm/vgic/vgic-debug.c
··· 64 64 static int iter_mark_lpis(struct kvm *kvm) 65 65 { 66 66 struct vgic_dist *dist = &kvm->arch.vgic; 67 + unsigned long intid, flags; 67 68 struct vgic_irq *irq; 68 - unsigned long intid; 69 69 int nr_lpis = 0; 70 + 71 + xa_lock_irqsave(&dist->lpi_xa, flags); 70 72 71 73 xa_for_each(&dist->lpi_xa, intid, irq) { 72 74 if (!vgic_try_get_irq_ref(irq)) 73 75 continue; 74 76 75 - xa_set_mark(&dist->lpi_xa, intid, LPI_XA_MARK_DEBUG_ITER); 77 + __xa_set_mark(&dist->lpi_xa, intid, LPI_XA_MARK_DEBUG_ITER); 76 78 nr_lpis++; 77 79 } 80 + 81 + xa_unlock_irqrestore(&dist->lpi_xa, flags); 78 82 79 83 return nr_lpis; 80 84 } ··· 86 82 static void iter_unmark_lpis(struct kvm *kvm) 87 83 { 88 84 struct vgic_dist *dist = &kvm->arch.vgic; 85 + unsigned long intid, flags; 89 86 struct vgic_irq *irq; 90 - unsigned long intid; 91 87 92 88 xa_for_each_marked(&dist->lpi_xa, intid, irq, LPI_XA_MARK_DEBUG_ITER) { 93 - xa_clear_mark(&dist->lpi_xa, intid, LPI_XA_MARK_DEBUG_ITER); 89 + xa_lock_irqsave(&dist->lpi_xa, flags); 90 + __xa_clear_mark(&dist->lpi_xa, intid, LPI_XA_MARK_DEBUG_ITER); 91 + xa_unlock_irqrestore(&dist->lpi_xa, flags); 92 + 93 + /* vgic_put_irq() expects to be called outside of the xa_lock */ 94 94 vgic_put_irq(kvm, irq); 95 95 } 96 96 }
+13 -3
arch/arm64/kvm/vgic/vgic-init.c
··· 53 53 { 54 54 struct vgic_dist *dist = &kvm->arch.vgic; 55 55 56 - xa_init(&dist->lpi_xa); 56 + xa_init_flags(&dist->lpi_xa, XA_FLAGS_LOCK_IRQ); 57 57 } 58 58 59 59 /* CREATION */ ··· 71 71 int kvm_vgic_create(struct kvm *kvm, u32 type) 72 72 { 73 73 struct kvm_vcpu *vcpu; 74 + u64 aa64pfr0, pfr1; 74 75 unsigned long i; 75 76 int ret; 76 77 ··· 162 161 163 162 kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF; 164 163 165 - if (type == KVM_DEV_TYPE_ARM_VGIC_V2) 164 + aa64pfr0 = kvm_read_vm_id_reg(kvm, SYS_ID_AA64PFR0_EL1) & ~ID_AA64PFR0_EL1_GIC; 165 + pfr1 = kvm_read_vm_id_reg(kvm, SYS_ID_PFR1_EL1) & ~ID_PFR1_EL1_GIC; 166 + 167 + if (type == KVM_DEV_TYPE_ARM_VGIC_V2) { 166 168 kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF; 167 - else 169 + } else { 168 170 INIT_LIST_HEAD(&kvm->arch.vgic.rd_regions); 171 + aa64pfr0 |= SYS_FIELD_PREP_ENUM(ID_AA64PFR0_EL1, GIC, IMP); 172 + pfr1 |= SYS_FIELD_PREP_ENUM(ID_PFR1_EL1, GIC, GICv3); 173 + } 174 + 175 + kvm_set_vm_id_reg(kvm, SYS_ID_AA64PFR0_EL1, aa64pfr0); 176 + kvm_set_vm_id_reg(kvm, SYS_ID_PFR1_EL1, pfr1); 169 177 170 178 if (type == KVM_DEV_TYPE_ARM_VGIC_V3) 171 179 kvm->arch.vgic.nassgicap = system_supports_direct_sgis();
+8 -10
arch/arm64/kvm/vgic/vgic-its.c
··· 78 78 { 79 79 struct vgic_dist *dist = &kvm->arch.vgic; 80 80 struct vgic_irq *irq = vgic_get_irq(kvm, intid), *oldirq; 81 + unsigned long flags; 81 82 int ret; 82 83 83 84 /* In this case there is no put, since we keep the reference. */ ··· 89 88 if (!irq) 90 89 return ERR_PTR(-ENOMEM); 91 90 92 - ret = xa_reserve(&dist->lpi_xa, intid, GFP_KERNEL_ACCOUNT); 91 + ret = xa_reserve_irq(&dist->lpi_xa, intid, GFP_KERNEL_ACCOUNT); 93 92 if (ret) { 94 93 kfree(irq); 95 94 return ERR_PTR(ret); ··· 104 103 irq->target_vcpu = vcpu; 105 104 irq->group = 1; 106 105 107 - xa_lock(&dist->lpi_xa); 106 + xa_lock_irqsave(&dist->lpi_xa, flags); 108 107 109 108 /* 110 109 * There could be a race with another vgic_add_lpi(), so we need to ··· 115 114 /* Someone was faster with adding this LPI, lets use that. */ 116 115 kfree(irq); 117 116 irq = oldirq; 118 - 119 - goto out_unlock; 117 + } else { 118 + ret = xa_err(__xa_store(&dist->lpi_xa, intid, irq, 0)); 120 119 } 121 120 122 - ret = xa_err(__xa_store(&dist->lpi_xa, intid, irq, 0)); 121 + xa_unlock_irqrestore(&dist->lpi_xa, flags); 122 + 123 123 if (ret) { 124 124 xa_release(&dist->lpi_xa, intid); 125 125 kfree(irq); 126 - } 127 126 128 - out_unlock: 129 - xa_unlock(&dist->lpi_xa); 130 - 131 - if (ret) 132 127 return ERR_PTR(ret); 128 + } 133 129 134 130 /* 135 131 * We "cache" the configuration table entries in our struct vgic_irq's.
+2 -1
arch/arm64/kvm/vgic/vgic-v3.c
··· 301 301 return; 302 302 303 303 /* Hide GICv3 sysreg if necessary */ 304 - if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V2) { 304 + if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V2 || 305 + !irqchip_in_kernel(vcpu->kvm)) { 305 306 vgic_v3->vgic_hcr |= (ICH_HCR_EL2_TALL0 | ICH_HCR_EL2_TALL1 | 306 307 ICH_HCR_EL2_TC); 307 308 return;
+15 -8
arch/arm64/kvm/vgic/vgic.c
··· 28 28 * kvm->arch.config_lock (mutex) 29 29 * its->cmd_lock (mutex) 30 30 * its->its_lock (mutex) 31 - * vgic_dist->lpi_xa.xa_lock 31 + * vgic_dist->lpi_xa.xa_lock must be taken with IRQs disabled 32 32 * vgic_cpu->ap_list_lock must be taken with IRQs disabled 33 33 * vgic_irq->irq_lock must be taken with IRQs disabled 34 34 * ··· 141 141 void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq) 142 142 { 143 143 struct vgic_dist *dist = &kvm->arch.vgic; 144 + unsigned long flags; 144 145 145 - if (irq->intid >= VGIC_MIN_LPI) 146 - might_lock(&dist->lpi_xa.xa_lock); 146 + /* 147 + * Normally the lock is only taken when the refcount drops to 0. 148 + * Acquire/release it early on lockdep kernels to make locking issues 149 + * in rare release paths a bit more obvious. 150 + */ 151 + if (IS_ENABLED(CONFIG_LOCKDEP) && irq->intid >= VGIC_MIN_LPI) { 152 + guard(spinlock_irqsave)(&dist->lpi_xa.xa_lock); 153 + } 147 154 148 155 if (!__vgic_put_irq(kvm, irq)) 149 156 return; 150 157 151 - xa_lock(&dist->lpi_xa); 158 + xa_lock_irqsave(&dist->lpi_xa, flags); 152 159 vgic_release_lpi_locked(dist, irq); 153 - xa_unlock(&dist->lpi_xa); 160 + xa_unlock_irqrestore(&dist->lpi_xa, flags); 154 161 } 155 162 156 163 static void vgic_release_deleted_lpis(struct kvm *kvm) 157 164 { 158 165 struct vgic_dist *dist = &kvm->arch.vgic; 159 - unsigned long intid; 166 + unsigned long flags, intid; 160 167 struct vgic_irq *irq; 161 168 162 - xa_lock(&dist->lpi_xa); 169 + xa_lock_irqsave(&dist->lpi_xa, flags); 163 170 164 171 xa_for_each(&dist->lpi_xa, intid, irq) { 165 172 if (irq->pending_release) 166 173 vgic_release_lpi_locked(dist, irq); 167 174 } 168 175 169 - xa_unlock(&dist->lpi_xa); 176 + xa_unlock_irqrestore(&dist->lpi_xa, flags); 170 177 } 171 178 172 179 void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu)
+10
arch/arm64/mm/fault.c
··· 969 969 970 970 void tag_clear_highpage(struct page *page) 971 971 { 972 + /* 973 + * Check if MTE is supported and fall back to clear_highpage(). 974 + * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and 975 + * post_alloc_hook() will invoke tag_clear_highpage(). 976 + */ 977 + if (!system_supports_mte()) { 978 + clear_highpage(page); 979 + return; 980 + } 981 + 972 982 /* Newly allocated page, shouldn't have been tagged yet */ 973 983 WARN_ON_ONCE(!try_page_mte_tagging(page)); 974 984 mte_zero_clear_page_tags(page_address(page));
+80 -31
arch/arm64/mm/mmu.c
··· 708 708 return ret; 709 709 } 710 710 711 + static inline bool force_pte_mapping(void) 712 + { 713 + const bool bbml2 = system_capabilities_finalized() ? 714 + system_supports_bbml2_noabort() : cpu_supports_bbml2_noabort(); 715 + 716 + if (debug_pagealloc_enabled()) 717 + return true; 718 + if (bbml2) 719 + return false; 720 + return rodata_full || arm64_kfence_can_set_direct_map() || is_realm_world(); 721 + } 722 + 723 + static inline bool split_leaf_mapping_possible(void) 724 + { 725 + /* 726 + * !BBML2_NOABORT systems should never run into scenarios where we would 727 + * have to split. So exit early and let calling code detect it and raise 728 + * a warning. 729 + */ 730 + if (!system_supports_bbml2_noabort()) 731 + return false; 732 + return !force_pte_mapping(); 733 + } 734 + 711 735 static DEFINE_MUTEX(pgtable_split_lock); 712 736 713 737 int split_kernel_leaf_mapping(unsigned long start, unsigned long end) ··· 739 715 int ret; 740 716 741 717 /* 742 - * !BBML2_NOABORT systems should not be trying to change permissions on 743 - * anything that is not pte-mapped in the first place. Just return early 744 - * and let the permission change code raise a warning if not already 745 - * pte-mapped. 718 + * Exit early if the region is within a pte-mapped area or if we can't 719 + * split. For the latter case, the permission change code will raise a 720 + * warning if not already pte-mapped. 
746 721 */ 747 - if (!system_supports_bbml2_noabort()) 722 + if (!split_leaf_mapping_possible() || is_kfence_address((void *)start)) 748 723 return 0; 749 724 750 725 /* ··· 781 758 return ret; 782 759 } 783 760 784 - static int __init split_to_ptes_pud_entry(pud_t *pudp, unsigned long addr, 785 - unsigned long next, 786 - struct mm_walk *walk) 761 + static int split_to_ptes_pud_entry(pud_t *pudp, unsigned long addr, 762 + unsigned long next, struct mm_walk *walk) 787 763 { 764 + gfp_t gfp = *(gfp_t *)walk->private; 788 765 pud_t pud = pudp_get(pudp); 789 766 int ret = 0; 790 767 791 768 if (pud_leaf(pud)) 792 - ret = split_pud(pudp, pud, GFP_ATOMIC, false); 769 + ret = split_pud(pudp, pud, gfp, false); 793 770 794 771 return ret; 795 772 } 796 773 797 - static int __init split_to_ptes_pmd_entry(pmd_t *pmdp, unsigned long addr, 798 - unsigned long next, 799 - struct mm_walk *walk) 774 + static int split_to_ptes_pmd_entry(pmd_t *pmdp, unsigned long addr, 775 + unsigned long next, struct mm_walk *walk) 800 776 { 777 + gfp_t gfp = *(gfp_t *)walk->private; 801 778 pmd_t pmd = pmdp_get(pmdp); 802 779 int ret = 0; 803 780 804 781 if (pmd_leaf(pmd)) { 805 782 if (pmd_cont(pmd)) 806 783 split_contpmd(pmdp); 807 - ret = split_pmd(pmdp, pmd, GFP_ATOMIC, false); 784 + ret = split_pmd(pmdp, pmd, gfp, false); 808 785 809 786 /* 810 787 * We have split the pmd directly to ptes so there is no need to ··· 816 793 return ret; 817 794 } 818 795 819 - static int __init split_to_ptes_pte_entry(pte_t *ptep, unsigned long addr, 820 - unsigned long next, 821 - struct mm_walk *walk) 796 + static int split_to_ptes_pte_entry(pte_t *ptep, unsigned long addr, 797 + unsigned long next, struct mm_walk *walk) 822 798 { 823 799 pte_t pte = __ptep_get(ptep); 824 800 ··· 827 805 return 0; 828 806 } 829 807 830 - static const struct mm_walk_ops split_to_ptes_ops __initconst = { 808 + static const struct mm_walk_ops split_to_ptes_ops = { 831 809 .pud_entry = split_to_ptes_pud_entry, 832 810 
.pmd_entry = split_to_ptes_pmd_entry, 833 811 .pte_entry = split_to_ptes_pte_entry, 834 812 }; 813 + 814 + static int range_split_to_ptes(unsigned long start, unsigned long end, gfp_t gfp) 815 + { 816 + int ret; 817 + 818 + arch_enter_lazy_mmu_mode(); 819 + ret = walk_kernel_page_table_range_lockless(start, end, 820 + &split_to_ptes_ops, NULL, &gfp); 821 + arch_leave_lazy_mmu_mode(); 822 + 823 + return ret; 824 + } 835 825 836 826 static bool linear_map_requires_bbml2 __initdata; 837 827 ··· 881 847 * PTE. The kernel alias remains static throughout runtime so 882 848 * can continue to be safely mapped with large mappings. 883 849 */ 884 - ret = walk_kernel_page_table_range_lockless(lstart, kstart, 885 - &split_to_ptes_ops, NULL, NULL); 850 + ret = range_split_to_ptes(lstart, kstart, GFP_ATOMIC); 886 851 if (!ret) 887 - ret = walk_kernel_page_table_range_lockless(kend, lend, 888 - &split_to_ptes_ops, NULL, NULL); 852 + ret = range_split_to_ptes(kend, lend, GFP_ATOMIC); 889 853 if (ret) 890 854 panic("Failed to split linear map\n"); 891 855 flush_tlb_kernel_range(lstart, lend); ··· 1034 1002 memblock_clear_nomap(kfence_pool, KFENCE_POOL_SIZE); 1035 1003 __kfence_pool = phys_to_virt(kfence_pool); 1036 1004 } 1005 + 1006 + bool arch_kfence_init_pool(void) 1007 + { 1008 + unsigned long start = (unsigned long)__kfence_pool; 1009 + unsigned long end = start + KFENCE_POOL_SIZE; 1010 + int ret; 1011 + 1012 + /* Exit early if we know the linear map is already pte-mapped. */ 1013 + if (!split_leaf_mapping_possible()) 1014 + return true; 1015 + 1016 + /* Kfence pool is already pte-mapped for the early init case. 
*/ 1017 + if (kfence_early_init) 1018 + return true; 1019 + 1020 + mutex_lock(&pgtable_split_lock); 1021 + ret = range_split_to_ptes(start, end, GFP_PGTABLE_KERNEL); 1022 + mutex_unlock(&pgtable_split_lock); 1023 + 1024 + /* 1025 + * Since the system supports bbml2_noabort, tlb invalidation is not 1026 + * required here; the pgtable mappings have been split to pte but larger 1027 + * entries may safely linger in the TLB. 1028 + */ 1029 + 1030 + return !ret; 1031 + } 1037 1032 #else /* CONFIG_KFENCE */ 1038 1033 1039 1034 static inline phys_addr_t arm64_kfence_alloc_pool(void) { return 0; } 1040 1035 static inline void arm64_kfence_map_pool(phys_addr_t kfence_pool, pgd_t *pgdp) { } 1041 1036 1042 1037 #endif /* CONFIG_KFENCE */ 1043 - 1044 - static inline bool force_pte_mapping(void) 1045 - { 1046 - bool bbml2 = system_capabilities_finalized() ? 1047 - system_supports_bbml2_noabort() : cpu_supports_bbml2_noabort(); 1048 - 1049 - return (!bbml2 && (rodata_full || arm64_kfence_can_set_direct_map() || 1050 - is_realm_world())) || 1051 - debug_pagealloc_enabled(); 1052 - } 1053 1038 1054 1039 static void __init map_mem(pgd_t *pgdp) 1055 1040 {
+2
arch/loongarch/include/asm/cpu-features.h
··· 67 67 #define cpu_has_hypervisor cpu_opt(LOONGARCH_CPU_HYPERVISOR) 68 68 #define cpu_has_ptw cpu_opt(LOONGARCH_CPU_PTW) 69 69 #define cpu_has_lspw cpu_opt(LOONGARCH_CPU_LSPW) 70 + #define cpu_has_msgint cpu_opt(LOONGARCH_CPU_MSGINT) 70 71 #define cpu_has_avecint cpu_opt(LOONGARCH_CPU_AVECINT) 72 + #define cpu_has_redirectint cpu_opt(LOONGARCH_CPU_REDIRECTINT) 71 73 72 74 #endif /* __ASM_CPU_FEATURES_H */
+5 -1
arch/loongarch/include/asm/cpu.h
··· 101 101 #define CPU_FEATURE_HYPERVISOR 26 /* CPU has hypervisor (running in VM) */ 102 102 #define CPU_FEATURE_PTW 27 /* CPU has hardware page table walker */ 103 103 #define CPU_FEATURE_LSPW 28 /* CPU has LSPW (lddir/ldpte instructions) */ 104 - #define CPU_FEATURE_AVECINT 29 /* CPU has AVEC interrupt */ 104 + #define CPU_FEATURE_MSGINT 29 /* CPU has MSG interrupt */ 105 + #define CPU_FEATURE_AVECINT 30 /* CPU has AVEC interrupt */ 106 + #define CPU_FEATURE_REDIRECTINT 31 /* CPU has interrupt remapping */ 105 107 106 108 #define LOONGARCH_CPU_CPUCFG BIT_ULL(CPU_FEATURE_CPUCFG) 107 109 #define LOONGARCH_CPU_LAM BIT_ULL(CPU_FEATURE_LAM) ··· 134 132 #define LOONGARCH_CPU_HYPERVISOR BIT_ULL(CPU_FEATURE_HYPERVISOR) 135 133 #define LOONGARCH_CPU_PTW BIT_ULL(CPU_FEATURE_PTW) 136 134 #define LOONGARCH_CPU_LSPW BIT_ULL(CPU_FEATURE_LSPW) 135 + #define LOONGARCH_CPU_MSGINT BIT_ULL(CPU_FEATURE_MSGINT) 137 136 #define LOONGARCH_CPU_AVECINT BIT_ULL(CPU_FEATURE_AVECINT) 137 + #define LOONGARCH_CPU_REDIRECTINT BIT_ULL(CPU_FEATURE_REDIRECTINT) 138 138 139 139 #endif /* _ASM_CPU_H */
+2 -2
arch/loongarch/include/asm/hw_breakpoint.h
··· 134 134 /* Determine number of BRP registers available. */ 135 135 static inline int get_num_brps(void) 136 136 { 137 - return csr_read64(LOONGARCH_CSR_FWPC) & CSR_FWPC_NUM; 137 + return csr_read32(LOONGARCH_CSR_FWPC) & CSR_FWPC_NUM; 138 138 } 139 139 140 140 /* Determine number of WRP registers available. */ 141 141 static inline int get_num_wrps(void) 142 142 { 143 - return csr_read64(LOONGARCH_CSR_MWPC) & CSR_MWPC_NUM; 143 + return csr_read32(LOONGARCH_CSR_MWPC) & CSR_MWPC_NUM; 144 144 } 145 145 146 146 #endif /* __KERNEL__ */
+4 -1
arch/loongarch/include/asm/io.h
··· 14 14 #include <asm/pgtable-bits.h> 15 15 #include <asm/string.h> 16 16 17 - extern void __init __iomem *early_ioremap(u64 phys_addr, unsigned long size); 17 + extern void __init __iomem *early_ioremap(phys_addr_t phys_addr, unsigned long size); 18 18 extern void __init early_iounmap(void __iomem *addr, unsigned long size); 19 19 20 20 #define early_memremap early_ioremap ··· 25 25 static inline void __iomem *ioremap_prot(phys_addr_t offset, unsigned long size, 26 26 pgprot_t prot) 27 27 { 28 + if (offset > TO_PHYS_MASK) 29 + return NULL; 30 + 28 31 switch (pgprot_val(prot) & _CACHE_MASK) { 29 32 case _CACHE_CC: 30 33 return (void __iomem *)(unsigned long)(CACHE_BASE + offset);
+2
arch/loongarch/include/asm/loongarch.h
··· 128 128 #define CPUCFG6_PMNUM GENMASK(7, 4) 129 129 #define CPUCFG6_PMNUM_SHIFT 4 130 130 #define CPUCFG6_PMBITS GENMASK(13, 8) 131 + #define CPUCFG6_PMBITS_SHIFT 8 131 132 #define CPUCFG6_UPM BIT(14) 132 133 133 134 #define LOONGARCH_CPUCFG16 0x10 ··· 1138 1137 #define IOCSRF_FLATMODE BIT_ULL(10) 1139 1138 #define IOCSRF_VM BIT_ULL(11) 1140 1139 #define IOCSRF_AVEC BIT_ULL(15) 1140 + #define IOCSRF_REDIRECT BIT_ULL(16) 1141 1141 1142 1142 #define LOONGARCH_IOCSR_VENDOR 0x10 1143 1143
+1 -1
arch/loongarch/include/asm/pgalloc.h
··· 88 88 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long address) 89 89 { 90 90 pud_t *pud; 91 - struct ptdesc *ptdesc = pagetable_alloc(GFP_KERNEL & ~__GFP_HIGHMEM, 0); 91 + struct ptdesc *ptdesc = pagetable_alloc(GFP_KERNEL, 0); 92 92 93 93 if (!ptdesc) 94 94 return NULL;
+8 -3
arch/loongarch/include/asm/pgtable.h
··· 424 424 425 425 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) 426 426 { 427 + if (pte_val(pte) & _PAGE_DIRTY) 428 + pte_val(pte) |= _PAGE_MODIFIED; 429 + 427 430 return __pte((pte_val(pte) & _PAGE_CHG_MASK) | 428 431 (pgprot_val(newprot) & ~_PAGE_CHG_MASK)); 429 432 } ··· 550 547 551 548 static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot) 552 549 { 553 - pmd_val(pmd) = (pmd_val(pmd) & _HPAGE_CHG_MASK) | 554 - (pgprot_val(newprot) & ~_HPAGE_CHG_MASK); 555 - return pmd; 550 + if (pmd_val(pmd) & _PAGE_DIRTY) 551 + pmd_val(pmd) |= _PAGE_MODIFIED; 552 + 553 + return __pmd((pmd_val(pmd) & _HPAGE_CHG_MASK) | 554 + (pgprot_val(newprot) & ~_HPAGE_CHG_MASK)); 556 555 } 557 556 558 557 static inline pmd_t pmd_mkinvalid(pmd_t pmd)
+4
arch/loongarch/kernel/cpu-probe.c
··· 157 157 c->options |= LOONGARCH_CPU_TLB; 158 158 if (config & CPUCFG1_IOCSR) 159 159 c->options |= LOONGARCH_CPU_IOCSR; 160 + if (config & CPUCFG1_MSGINT) 161 + c->options |= LOONGARCH_CPU_MSGINT; 160 162 if (config & CPUCFG1_UAL) { 161 163 c->options |= LOONGARCH_CPU_UAL; 162 164 elf_hwcap |= HWCAP_LOONGARCH_UAL; ··· 333 331 c->options |= LOONGARCH_CPU_EIODECODE; 334 332 if (config & IOCSRF_AVEC) 335 333 c->options |= LOONGARCH_CPU_AVECINT; 334 + if (config & IOCSRF_REDIRECT) 335 + c->options |= LOONGARCH_CPU_REDIRECTINT; 336 336 if (config & IOCSRF_VM) 337 337 c->options |= LOONGARCH_CPU_HYPERVISOR; 338 338 }
+1 -1
arch/loongarch/kernel/kexec_efi.c
··· 42 42 { 43 43 int ret; 44 44 unsigned long text_offset, kernel_segment_number; 45 - struct kexec_buf kbuf; 45 + struct kexec_buf kbuf = {}; 46 46 struct kexec_segment *kernel_segment; 47 47 struct loongarch_image_header *h; 48 48
+1 -1
arch/loongarch/kernel/kexec_elf.c
··· 59 59 int ret; 60 60 unsigned long text_offset, kernel_segment_number; 61 61 struct elfhdr ehdr; 62 - struct kexec_buf kbuf; 62 + struct kexec_buf kbuf = {}; 63 63 struct kexec_elf_info elf_info; 64 64 struct kexec_segment *kernel_segment; 65 65
-22
arch/loongarch/kernel/machine_kexec.c
··· 39 39 static unsigned long start_addr; 40 40 static unsigned long first_ind_entry; 41 41 42 - static void kexec_image_info(const struct kimage *kimage) 43 - { 44 - unsigned long i; 45 - 46 - pr_debug("kexec kimage info:\n"); 47 - pr_debug("\ttype: %d\n", kimage->type); 48 - pr_debug("\tstart: %lx\n", kimage->start); 49 - pr_debug("\thead: %lx\n", kimage->head); 50 - pr_debug("\tnr_segments: %lu\n", kimage->nr_segments); 51 - 52 - for (i = 0; i < kimage->nr_segments; i++) { 53 - pr_debug("\t segment[%lu]: %016lx - %016lx", i, 54 - kimage->segment[i].mem, 55 - kimage->segment[i].mem + kimage->segment[i].memsz); 56 - pr_debug("\t\t0x%lx bytes, %lu pages\n", 57 - (unsigned long)kimage->segment[i].memsz, 58 - (unsigned long)kimage->segment[i].memsz / PAGE_SIZE); 59 - } 60 - } 61 - 62 42 int machine_kexec_prepare(struct kimage *kimage) 63 43 { 64 44 int i; 65 45 char *bootloader = "kexec"; 66 46 void *cmdline_ptr = (void *)KEXEC_CMDLINE_ADDR; 67 - 68 - kexec_image_info(kimage); 69 47 70 48 kimage->arch.efi_boot = fw_arg0; 71 49 kimage->arch.systable_ptr = fw_arg2;
+1 -1
arch/loongarch/kernel/machine_kexec_file.c
··· 143 143 unsigned long initrd_load_addr = 0; 144 144 unsigned long orig_segments = image->nr_segments; 145 145 char *modified_cmdline = NULL; 146 - struct kexec_buf kbuf; 146 + struct kexec_buf kbuf = {}; 147 147 148 148 kbuf.image = image; 149 149 /* Don't allocate anything below the kernel */
+3 -4
arch/loongarch/kernel/mem.c
··· 13 13 void __init memblock_init(void) 14 14 { 15 15 u32 mem_type; 16 - u64 mem_start, mem_end, mem_size; 16 + u64 mem_start, mem_size; 17 17 efi_memory_desc_t *md; 18 18 19 19 /* Parse memory information */ ··· 21 21 mem_type = md->type; 22 22 mem_start = md->phys_addr; 23 23 mem_size = md->num_pages << EFI_PAGE_SHIFT; 24 - mem_end = mem_start + mem_size; 25 24 26 25 switch (mem_type) { 27 26 case EFI_LOADER_CODE: ··· 30 31 case EFI_PERSISTENT_MEMORY: 31 32 case EFI_CONVENTIONAL_MEMORY: 32 33 memblock_add(mem_start, mem_size); 33 - if (max_low_pfn < (mem_end >> PAGE_SHIFT)) 34 - max_low_pfn = mem_end >> PAGE_SHIFT; 35 34 break; 36 35 case EFI_PAL_CODE: 37 36 case EFI_UNUSABLE_MEMORY: ··· 46 49 } 47 50 } 48 51 52 + max_pfn = PFN_DOWN(memblock_end_of_DRAM()); 53 + max_low_pfn = min(PFN_DOWN(HIGHMEM_START), max_pfn); 49 54 memblock_set_current_limit(PFN_PHYS(max_low_pfn)); 50 55 51 56 /* Reserve the first 2MB */
+2 -21
arch/loongarch/kernel/numa.c
··· 272 272 node_mem_init(node); 273 273 node_set_online(node); 274 274 } 275 - max_low_pfn = PHYS_PFN(memblock_end_of_DRAM()); 275 + max_pfn = PFN_DOWN(memblock_end_of_DRAM()); 276 + max_low_pfn = min(PFN_DOWN(HIGHMEM_START), max_pfn); 276 277 277 278 setup_nr_node_ids(); 278 279 loongson_sysconf.nr_nodes = nr_node_ids; ··· 283 282 } 284 283 285 284 #endif 286 - 287 - void __init paging_init(void) 288 - { 289 - unsigned int node; 290 - unsigned long zones_size[MAX_NR_ZONES] = {0, }; 291 - 292 - for_each_online_node(node) { 293 - unsigned long start_pfn, end_pfn; 294 - 295 - get_pfn_range_for_nid(node, &start_pfn, &end_pfn); 296 - 297 - if (end_pfn > max_low_pfn) 298 - max_low_pfn = end_pfn; 299 - } 300 - #ifdef CONFIG_ZONE_DMA32 301 - zones_size[ZONE_DMA32] = MAX_DMA32_PFN; 302 - #endif 303 - zones_size[ZONE_NORMAL] = max_low_pfn; 304 - free_area_init(zones_size); 305 - } 306 285 307 286 int pcibus_to_node(struct pci_bus *bus) 308 287 {
+4 -3
arch/loongarch/kernel/perf_event.c
··· 845 845 846 846 static int __init init_hw_perf_events(void) 847 847 { 848 - int counters; 848 + int bits, counters; 849 849 850 850 if (!cpu_has_pmp) 851 851 return -ENODEV; 852 852 853 853 pr_info("Performance counters: "); 854 - counters = ((read_cpucfg(LOONGARCH_CPUCFG6) & CPUCFG6_PMNUM) >> 4) + 1; 854 + bits = ((read_cpucfg(LOONGARCH_CPUCFG6) & CPUCFG6_PMBITS) >> CPUCFG6_PMBITS_SHIFT) + 1; 855 + counters = ((read_cpucfg(LOONGARCH_CPUCFG6) & CPUCFG6_PMNUM) >> CPUCFG6_PMNUM_SHIFT) + 1; 855 856 856 857 loongarch_pmu.num_counters = counters; 857 858 loongarch_pmu.max_period = (1ULL << 63) - 1; ··· 868 867 on_each_cpu(reset_counters, NULL, 1); 869 868 870 869 pr_cont("%s PMU enabled, %d %d-bit counters available to each CPU.\n", 871 - loongarch_pmu.name, counters, 64); 870 + loongarch_pmu.name, counters, bits); 872 871 873 872 perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW); 874 873
+2 -3
arch/loongarch/kernel/setup.c
··· 294 294 295 295 early_init_dt_scan(fdt_pointer, __pa(fdt_pointer)); 296 296 early_init_fdt_reserve_self(); 297 - 298 - max_low_pfn = PFN_PHYS(memblock_end_of_DRAM()); 299 297 #endif 300 298 } 301 299 ··· 388 390 static void __init arch_mem_init(char **cmdline_p) 389 391 { 390 392 /* Recalculate max_low_pfn for "mem=xxx" */ 391 - max_pfn = max_low_pfn = PHYS_PFN(memblock_end_of_DRAM()); 393 + max_pfn = PFN_DOWN(memblock_end_of_DRAM()); 394 + max_low_pfn = min(PFN_DOWN(HIGHMEM_START), max_pfn); 392 395 393 396 if (usermem) 394 397 pr_info("User-defined physical RAM map overwrite\n");
+2 -2
arch/loongarch/kernel/traps.c
··· 1131 1131 tlbrentry = (unsigned long)exception_handlers + 80*VECSIZE; 1132 1132 1133 1133 csr_write64(eentry, LOONGARCH_CSR_EENTRY); 1134 - csr_write64(eentry, LOONGARCH_CSR_MERRENTRY); 1135 - csr_write64(tlbrentry, LOONGARCH_CSR_TLBRENTRY); 1134 + csr_write64(__pa(eentry), LOONGARCH_CSR_MERRENTRY); 1135 + csr_write64(__pa(tlbrentry), LOONGARCH_CSR_TLBRENTRY); 1136 1136 } 1137 1137 1138 1138 void per_cpu_trap_init(int cpu)
+1 -1
arch/loongarch/kvm/intc/eiointc.c
··· 439 439 spin_lock_irqsave(&s->lock, flags); 440 440 switch (type) { 441 441 case KVM_DEV_LOONGARCH_EXTIOI_CTRL_INIT_NUM_CPU: 442 - if (val >= EIOINTC_ROUTE_MAX_VCPUS) 442 + if (val > EIOINTC_ROUTE_MAX_VCPUS) 443 443 ret = -EINVAL; 444 444 else 445 445 s->num_cpu = val;
+1 -1
arch/loongarch/kvm/mmu.c
··· 857 857 858 858 if (writeable) { 859 859 prot_bits = kvm_pte_mkwriteable(prot_bits); 860 - if (write) 860 + if (write || !kvm_slot_dirty_track_enabled(memslot)) 861 861 prot_bits = kvm_pte_mkdirty(prot_bits); 862 862 } 863 863
+2
arch/loongarch/kvm/timer.c
··· 4 4 */ 5 5 6 6 #include <linux/kvm_host.h> 7 + #include <asm/delay.h> 7 8 #include <asm/kvm_csr.h> 8 9 #include <asm/kvm_vcpu.h> 9 10 ··· 96 95 * and set CSR TVAL with -1 97 96 */ 98 97 write_gcsr_timertick(0); 98 + __delay(2); /* Wait cycles until timer interrupt injected */ 99 99 100 100 /* 101 101 * Writing CSR_TINTCLR_TI to LOONGARCH_CSR_TINTCLR will clear
+9 -10
arch/loongarch/kvm/vcpu.c
··· 132 132 * Clear KVM_LARCH_PMU if the guest is not using PMU CSRs when 133 133 * exiting the guest, so that the next time trap into the guest. 134 134 * We don't need to deal with PMU CSRs contexts. 135 + * 136 + * Otherwise set the request bit KVM_REQ_PMU to restore guest PMU 137 + * before entering guest VM 135 138 */ 136 139 val = kvm_read_sw_gcsr(csr, LOONGARCH_CSR_PERFCTRL0); 137 140 val |= kvm_read_sw_gcsr(csr, LOONGARCH_CSR_PERFCTRL1); ··· 142 139 val |= kvm_read_sw_gcsr(csr, LOONGARCH_CSR_PERFCTRL3); 143 140 if (!(val & KVM_PMU_EVENT_ENABLED)) 144 141 vcpu->arch.aux_inuse &= ~KVM_LARCH_PMU; 142 + else 143 + kvm_make_request(KVM_REQ_PMU, vcpu); 145 144 146 145 kvm_restore_host_pmu(vcpu); 147 - } 148 - 149 - static void kvm_restore_pmu(struct kvm_vcpu *vcpu) 150 - { 151 - if ((vcpu->arch.aux_inuse & KVM_LARCH_PMU)) 152 - kvm_make_request(KVM_REQ_PMU, vcpu); 153 146 } 154 147 155 148 static void kvm_check_pmu(struct kvm_vcpu *vcpu) ··· 298 299 vcpu->arch.aux_inuse &= ~KVM_LARCH_SWCSR_LATEST; 299 300 300 301 if (kvm_request_pending(vcpu) || xfer_to_guest_mode_work_pending()) { 301 - kvm_lose_pmu(vcpu); 302 + if (vcpu->arch.aux_inuse & KVM_LARCH_PMU) { 303 + kvm_lose_pmu(vcpu); 304 + kvm_make_request(KVM_REQ_PMU, vcpu); 305 + } 302 306 /* make sure the vcpu mode has been written */ 303 307 smp_store_mb(vcpu->mode, OUTSIDE_GUEST_MODE); 304 308 local_irq_enable(); ··· 1605 1603 /* Restore timer state regardless */ 1606 1604 kvm_restore_timer(vcpu); 1607 1605 kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu); 1608 - 1609 - /* Restore hardware PMU CSRs */ 1610 - kvm_restore_pmu(vcpu); 1611 1606 1612 1607 /* Don't bother restoring registers multiple times unless necessary */ 1613 1608 if (vcpu->arch.aux_inuse & KVM_LARCH_HWCSR_USABLE)
-2
arch/loongarch/mm/init.c
··· 60 60 return memblock_is_memory(addr) && !memblock_is_reserved(addr); 61 61 } 62 62 63 - #ifndef CONFIG_NUMA 64 63 void __init paging_init(void) 65 64 { 66 65 unsigned long max_zone_pfns[MAX_NR_ZONES]; ··· 71 72 72 73 free_area_init(max_zone_pfns); 73 74 } 74 - #endif /* !CONFIG_NUMA */ 75 75 76 76 void __ref free_initmem(void) 77 77 {
+1 -1
arch/loongarch/mm/ioremap.c
··· 6 6 #include <asm/io.h> 7 7 #include <asm-generic/early_ioremap.h> 8 8 9 - void __init __iomem *early_ioremap(u64 phys_addr, unsigned long size) 9 + void __init __iomem *early_ioremap(phys_addr_t phys_addr, unsigned long size) 10 10 { 11 11 return ((void __iomem *)TO_CACHE(phys_addr)); 12 12 }
+1
arch/powerpc/Kconfig
··· 137 137 select ARCH_HAS_DMA_OPS if PPC64 138 138 select ARCH_HAS_FORTIFY_SOURCE 139 139 select ARCH_HAS_GCOV_PROFILE_ALL 140 + select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS 140 141 select ARCH_HAS_KCOV 141 142 select ARCH_HAS_KERNEL_FPU_SUPPORT if PPC64 && PPC_FPU 142 143 select ARCH_HAS_MEMBARRIER_CALLBACKS
-1
arch/powerpc/platforms/Kconfig.cputype
··· 423 423 config PPC_RADIX_MMU 424 424 bool "Radix MMU Support" 425 425 depends on PPC_BOOK3S_64 426 - select ARCH_HAS_GIGANTIC_PAGE 427 426 default y 428 427 help 429 428 Enable support for the Power ISA 3.0 Radix style MMU. Currently this
+1 -1
arch/riscv/Kconfig
··· 367 367 systems to handle cache management. 368 368 369 369 config AS_HAS_INSN 370 - def_bool $(as-instr,.insn r 51$(comma) 0$(comma) 0$(comma) t0$(comma) t0$(comma) zero) 370 + def_bool $(as-instr,.insn 0x100000f) 371 371 372 372 config AS_HAS_OPTION_ARCH 373 373 # https://github.com/llvm/llvm-project/commit/9e8ed3403c191ab9c4903e8eeb8f732ff8a43cb4
+1 -16
arch/riscv/Makefile
··· 134 134 CHECKFLAGS += -D__riscv -D__riscv_xlen=$(BITS) 135 135 136 136 # Default target when executing plain make 137 - boot := arch/riscv/boot 138 - ifeq ($(CONFIG_XIP_KERNEL),y) 139 - KBUILD_IMAGE := $(boot)/xipImage 140 - else 141 - ifeq ($(CONFIG_RISCV_M_MODE)$(CONFIG_SOC_CANAAN_K210),yy) 142 - KBUILD_IMAGE := $(boot)/loader.bin 143 - else 144 - ifeq ($(CONFIG_EFI_ZBOOT),) 145 - KBUILD_IMAGE := $(boot)/Image.gz 146 - else 147 - KBUILD_IMAGE := $(boot)/vmlinuz.efi 148 - endif 149 - endif 150 - endif 151 - 152 137 boot := arch/riscv/boot 153 138 boot-image-y := Image 154 139 boot-image-$(CONFIG_KERNEL_BZIP2) := Image.bz2 ··· 144 159 boot-image-$(CONFIG_KERNEL_ZSTD) := Image.zst 145 160 boot-image-$(CONFIG_KERNEL_XZ) := Image.xz 146 161 ifdef CONFIG_RISCV_M_MODE 147 - boot-image-$(CONFIG_ARCH_CANAAN) := loader.bin 162 + boot-image-$(CONFIG_SOC_CANAAN_K210) := loader.bin 148 163 endif 149 164 boot-image-$(CONFIG_EFI_ZBOOT) := vmlinuz.efi 150 165 boot-image-$(CONFIG_XIP_KERNEL) := xipImage
+14 -2
arch/riscv/kvm/aia_imsic.c
··· 689 689 */ 690 690 691 691 read_lock_irqsave(&imsic->vsfile_lock, flags); 692 - if (imsic->vsfile_cpu > -1) 693 - ret = !!(csr_read(CSR_HGEIP) & BIT(imsic->vsfile_hgei)); 692 + if (imsic->vsfile_cpu > -1) { 693 + /* 694 + * This function is typically called from kvm_vcpu_block() via 695 + * kvm_arch_vcpu_runnable() upon WFI trap. The kvm_vcpu_block() 696 + * can be preempted and the blocking VCPU might resume on a 697 + * different CPU. This means it is possible that current CPU 698 + * does not match the imsic->vsfile_cpu hence this function 699 + * must check imsic->vsfile_cpu before accessing HGEIP CSR. 700 + */ 701 + if (imsic->vsfile_cpu != vcpu->cpu) 702 + ret = true; 703 + else 704 + ret = !!(csr_read(CSR_HGEIP) & BIT(imsic->vsfile_hgei)); 705 + } 694 706 read_unlock_irqrestore(&imsic->vsfile_lock, flags); 695 707 696 708 return ret;
+2 -23
arch/riscv/kvm/mmu.c
··· 171 171 enum kvm_mr_change change) 172 172 { 173 173 hva_t hva, reg_end, size; 174 - gpa_t base_gpa; 175 174 bool writable; 176 175 int ret = 0; 177 176 ··· 189 190 hva = new->userspace_addr; 190 191 size = new->npages << PAGE_SHIFT; 191 192 reg_end = hva + size; 192 - base_gpa = new->base_gfn << PAGE_SHIFT; 193 193 writable = !(new->flags & KVM_MEM_READONLY); 194 194 195 195 mmap_read_lock(current->mm); 196 196 197 197 /* 198 198 * A memory region could potentially cover multiple VMAs, and 199 - * any holes between them, so iterate over all of them to find 200 - * out if we can map any of them right now. 199 + * any holes between them, so iterate over all of them. 201 200 * 202 201 * +--------------------------------------------+ 203 202 * +---------------+----------------+ +----------------+ ··· 206 209 */ 207 210 do { 208 211 struct vm_area_struct *vma; 209 - hva_t vm_start, vm_end; 212 + hva_t vm_end; 210 213 211 214 vma = find_vma_intersection(current->mm, hva, reg_end); 212 215 if (!vma) ··· 222 225 } 223 226 224 227 /* Take the intersection of this VMA with the memory region */ 225 - vm_start = max(hva, vma->vm_start); 226 228 vm_end = min(reg_end, vma->vm_end); 227 229 228 230 if (vma->vm_flags & VM_PFNMAP) { 229 - gpa_t gpa = base_gpa + (vm_start - hva); 230 - phys_addr_t pa; 231 - 232 - pa = (phys_addr_t)vma->vm_pgoff << PAGE_SHIFT; 233 - pa += vm_start - vma->vm_start; 234 - 235 231 /* IO region dirty page logging not allowed */ 236 232 if (new->flags & KVM_MEM_LOG_DIRTY_PAGES) { 237 233 ret = -EINVAL; 238 234 goto out; 239 235 } 240 - 241 - ret = kvm_riscv_mmu_ioremap(kvm, gpa, pa, vm_end - vm_start, 242 - writable, false); 243 - if (ret) 244 - break; 245 236 } 246 237 hva = vm_end; 247 238 } while (hva < reg_end); 248 - 249 - if (change == KVM_MR_FLAGS_ONLY) 250 - goto out; 251 - 252 - if (ret) 253 - kvm_riscv_mmu_iounmap(kvm, base_gpa, size); 254 239 255 240 out: 256 241 mmap_read_unlock(current->mm);
+1 -1
arch/riscv/kvm/vcpu.c
··· 212 212 213 213 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu) 214 214 { 215 - return (kvm_riscv_vcpu_has_interrupts(vcpu, -1UL) && 215 + return (kvm_riscv_vcpu_has_interrupts(vcpu, -1ULL) && 216 216 !kvm_riscv_vcpu_stopped(vcpu) && !vcpu->arch.pause); 217 217 } 218 218
+5 -7
arch/s390/include/asm/pgtable.h
··· 1154 1154 #define IPTE_NODAT 0x400 1155 1155 #define IPTE_GUEST_ASCE 0x800 1156 1156 1157 - static __always_inline void __ptep_rdp(unsigned long addr, pte_t *ptep, 1158 - unsigned long opt, unsigned long asce, 1159 - int local) 1157 + static __always_inline void __ptep_rdp(unsigned long addr, pte_t *ptep, int local) 1160 1158 { 1161 1159 unsigned long pto; 1162 1160 1163 1161 pto = __pa(ptep) & ~(PTRS_PER_PTE * sizeof(pte_t) - 1); 1164 - asm volatile(".insn rrf,0xb98b0000,%[r1],%[r2],%[asce],%[m4]" 1162 + asm volatile(".insn rrf,0xb98b0000,%[r1],%[r2],%%r0,%[m4]" 1165 1163 : "+m" (*ptep) 1166 - : [r1] "a" (pto), [r2] "a" ((addr & PAGE_MASK) | opt), 1167 - [asce] "a" (asce), [m4] "i" (local)); 1164 + : [r1] "a" (pto), [r2] "a" (addr & PAGE_MASK), 1165 + [m4] "i" (local)); 1168 1166 } 1169 1167 1170 1168 static __always_inline void __ptep_ipte(unsigned long address, pte_t *ptep, ··· 1346 1348 * A local RDP can be used to do the flush. 1347 1349 */ 1348 1350 if (cpu_has_rdp() && !(pte_val(*ptep) & _PAGE_PROTECT)) 1349 - __ptep_rdp(address, ptep, 0, 0, 1); 1351 + __ptep_rdp(address, ptep, 1); 1350 1352 } 1351 1353 #define flush_tlb_fix_spurious_fault flush_tlb_fix_spurious_fault 1352 1354
+2 -2
arch/s390/mm/pgtable.c
··· 274 274 preempt_disable(); 275 275 atomic_inc(&mm->context.flush_count); 276 276 if (cpumask_equal(mm_cpumask(mm), cpumask_of(smp_processor_id()))) 277 - __ptep_rdp(addr, ptep, 0, 0, 1); 277 + __ptep_rdp(addr, ptep, 1); 278 278 else 279 - __ptep_rdp(addr, ptep, 0, 0, 0); 279 + __ptep_rdp(addr, ptep, 0); 280 280 /* 281 281 * PTE is not invalidated by RDP, only _PAGE_PROTECT is cleared. That 282 282 * means it is still valid and active, and must not be changed according
+5 -5
arch/x86/events/core.c
··· 2789 2789 return; 2790 2790 } 2791 2791 2792 - if (perf_callchain_store(entry, regs->ip)) 2793 - return; 2794 - 2795 - if (perf_hw_regs(regs)) 2792 + if (perf_hw_regs(regs)) { 2793 + if (perf_callchain_store(entry, regs->ip)) 2794 + return; 2796 2795 unwind_start(&state, current, regs, NULL); 2797 - else 2796 + } else { 2798 2797 unwind_start(&state, current, NULL, (void *)regs->sp); 2798 + } 2799 2799 2800 2800 for (; !unwind_done(&state); unwind_next_frame(&state)) { 2801 2801 addr = unwind_get_return_address(&state);
+5
arch/x86/include/asm/ftrace.h
··· 56 56 return &arch_ftrace_regs(fregs)->regs; 57 57 } 58 58 59 + #define arch_ftrace_partial_regs(regs) do { \ 60 + regs->flags &= ~X86_EFLAGS_FIXED; \ 61 + regs->cs = __KERNEL_CS; \ 62 + } while (0) 63 + 59 64 #define arch_ftrace_fill_perf_regs(fregs, _regs) do { \ 60 65 (_regs)->ip = arch_ftrace_regs(fregs)->regs.ip; \ 61 66 (_regs)->sp = arch_ftrace_regs(fregs)->regs.sp; \
+1
arch/x86/include/uapi/asm/vmx.h
··· 93 93 #define EXIT_REASON_TPAUSE 68 94 94 #define EXIT_REASON_BUS_LOCK 74 95 95 #define EXIT_REASON_NOTIFY 75 96 + #define EXIT_REASON_SEAMCALL 76 96 97 #define EXIT_REASON_TDCALL 77 97 98 #define EXIT_REASON_MSR_READ_IMM 84 98 99 #define EXIT_REASON_MSR_WRITE_IMM 85
+1 -1
arch/x86/kernel/acpi/cppc.c
··· 196 196 break; 197 197 } 198 198 199 - for_each_present_cpu(cpu) { 199 + for_each_online_cpu(cpu) { 200 200 u32 tmp; 201 201 int ret; 202 202
+7
arch/x86/kernel/cpu/amd.c
··· 1037 1037 1038 1038 static const struct x86_cpu_id zen5_rdseed_microcode[] = { 1039 1039 ZEN_MODEL_STEP_UCODE(0x1a, 0x02, 0x1, 0x0b00215a), 1040 + ZEN_MODEL_STEP_UCODE(0x1a, 0x08, 0x1, 0x0b008121), 1040 1041 ZEN_MODEL_STEP_UCODE(0x1a, 0x11, 0x0, 0x0b101054), 1042 + ZEN_MODEL_STEP_UCODE(0x1a, 0x24, 0x0, 0x0b204037), 1043 + ZEN_MODEL_STEP_UCODE(0x1a, 0x44, 0x0, 0x0b404035), 1044 + ZEN_MODEL_STEP_UCODE(0x1a, 0x44, 0x1, 0x0b404108), 1045 + ZEN_MODEL_STEP_UCODE(0x1a, 0x60, 0x0, 0x0b600037), 1046 + ZEN_MODEL_STEP_UCODE(0x1a, 0x68, 0x0, 0x0b608038), 1047 + ZEN_MODEL_STEP_UCODE(0x1a, 0x70, 0x0, 0x0b700037), 1041 1048 {}, 1042 1049 }; 1043 1050
+1
arch/x86/kernel/cpu/microcode/amd.c
··· 224 224 case 0xb1010: return cur_rev <= 0xb101046; break; 225 225 case 0xb2040: return cur_rev <= 0xb204031; break; 226 226 case 0xb4040: return cur_rev <= 0xb404031; break; 227 + case 0xb4041: return cur_rev <= 0xb404101; break; 227 228 case 0xb6000: return cur_rev <= 0xb600031; break; 228 229 case 0xb6080: return cur_rev <= 0xb608031; break; 229 230 case 0xb7000: return cur_rev <= 0xb700031; break;
+7 -1
arch/x86/kernel/ftrace_64.S
··· 354 354 UNWIND_HINT_UNDEFINED 355 355 ANNOTATE_NOENDBR 356 356 357 + /* Restore return_to_handler value that got eaten by previous ret instruction. */ 358 + subq $8, %rsp 359 + UNWIND_HINT_FUNC 360 + 357 361 /* Save ftrace_regs for function exit context */ 358 362 subq $(FRAME_SIZE), %rsp 359 363 360 364 movq %rax, RAX(%rsp) 361 365 movq %rdx, RDX(%rsp) 362 366 movq %rbp, RBP(%rsp) 367 + movq %rsp, RSP(%rsp) 363 368 movq %rsp, %rdi 364 369 365 370 call ftrace_return_to_handler ··· 373 368 movq RDX(%rsp), %rdx 374 369 movq RAX(%rsp), %rax 375 370 376 - addq $(FRAME_SIZE), %rsp 371 + addq $(FRAME_SIZE) + 8, %rsp 372 + 377 373 /* 378 374 * Jump back to the old return address. This cannot be JMP_NOSPEC rdi 379 375 * since IBT would demand that contain ENDBR, which simply isn't so for
+15 -9
arch/x86/kvm/svm/avic.c
··· 216 216 * This function is called from IOMMU driver to notify 217 217 * SVM to schedule in a particular vCPU of a particular VM. 218 218 */ 219 - int avic_ga_log_notifier(u32 ga_tag) 219 + static int avic_ga_log_notifier(u32 ga_tag) 220 220 { 221 221 unsigned long flags; 222 222 struct kvm_svm *kvm_svm; ··· 788 788 struct kvm_vcpu *vcpu = &svm->vcpu; 789 789 790 790 INIT_LIST_HEAD(&svm->ir_list); 791 - spin_lock_init(&svm->ir_list_lock); 791 + raw_spin_lock_init(&svm->ir_list_lock); 792 792 793 793 if (!enable_apicv || !irqchip_in_kernel(vcpu->kvm)) 794 794 return 0; ··· 816 816 if (!vcpu) 817 817 return; 818 818 819 - spin_lock_irqsave(&to_svm(vcpu)->ir_list_lock, flags); 819 + raw_spin_lock_irqsave(&to_svm(vcpu)->ir_list_lock, flags); 820 820 list_del(&irqfd->vcpu_list); 821 - spin_unlock_irqrestore(&to_svm(vcpu)->ir_list_lock, flags); 821 + raw_spin_unlock_irqrestore(&to_svm(vcpu)->ir_list_lock, flags); 822 822 } 823 823 824 824 int avic_pi_update_irte(struct kvm_kernel_irqfd *irqfd, struct kvm *kvm, ··· 855 855 * list of IRQs being posted to the vCPU, to ensure the IRTE 856 856 * isn't programmed with stale pCPU/IsRunning information. 857 857 */ 858 - guard(spinlock_irqsave)(&svm->ir_list_lock); 858 + guard(raw_spinlock_irqsave)(&svm->ir_list_lock); 859 859 860 860 /* 861 861 * Update the target pCPU for IOMMU doorbells if the vCPU is ··· 972 972 * up-to-date entry information, or that this task will wait until 973 973 * svm_ir_list_add() completes to set the new target pCPU. 
974 974 */ 975 - spin_lock_irqsave(&svm->ir_list_lock, flags); 975 + raw_spin_lock_irqsave(&svm->ir_list_lock, flags); 976 976 977 977 entry = svm->avic_physical_id_entry; 978 978 WARN_ON_ONCE(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK); ··· 997 997 998 998 avic_update_iommu_vcpu_affinity(vcpu, h_physical_id, action); 999 999 1000 - spin_unlock_irqrestore(&svm->ir_list_lock, flags); 1000 + raw_spin_unlock_irqrestore(&svm->ir_list_lock, flags); 1001 1001 } 1002 1002 1003 1003 void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu) ··· 1035 1035 * or that this task will wait until svm_ir_list_add() completes to 1036 1036 * mark the vCPU as not running. 1037 1037 */ 1038 - spin_lock_irqsave(&svm->ir_list_lock, flags); 1038 + raw_spin_lock_irqsave(&svm->ir_list_lock, flags); 1039 1039 1040 1040 avic_update_iommu_vcpu_affinity(vcpu, -1, action); 1041 1041 ··· 1059 1059 1060 1060 svm->avic_physical_id_entry = entry; 1061 1061 1062 - spin_unlock_irqrestore(&svm->ir_list_lock, flags); 1062 + raw_spin_unlock_irqrestore(&svm->ir_list_lock, flags); 1063 1063 } 1064 1064 1065 1065 void avic_vcpu_put(struct kvm_vcpu *vcpu) ··· 1242 1242 amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier); 1243 1243 1244 1244 return true; 1245 + } 1246 + 1247 + void avic_hardware_unsetup(void) 1248 + { 1249 + if (avic) 1250 + amd_iommu_register_ga_log_notifier(NULL); 1245 1251 }
+7 -13
arch/x86/kvm/svm/nested.c
··· 677 677 */ 678 678 svm_copy_lbrs(vmcb02, vmcb12); 679 679 vmcb02->save.dbgctl &= ~DEBUGCTL_RESERVED_BITS; 680 - svm_update_lbrv(&svm->vcpu); 681 - 682 - } else if (unlikely(vmcb01->control.virt_ext & LBR_CTL_ENABLE_MASK)) { 680 + } else { 683 681 svm_copy_lbrs(vmcb02, vmcb01); 684 682 } 683 + svm_update_lbrv(&svm->vcpu); 685 684 } 686 685 687 686 static inline bool is_evtinj_soft(u32 evtinj) ··· 832 833 svm->soft_int_next_rip = vmcb12_rip; 833 834 } 834 835 835 - vmcb02->control.virt_ext = vmcb01->control.virt_ext & 836 - LBR_CTL_ENABLE_MASK; 837 - if (guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV)) 838 - vmcb02->control.virt_ext |= 839 - (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK); 836 + /* LBR_CTL_ENABLE_MASK is controlled by svm_update_lbrv() */ 840 837 841 838 if (!nested_vmcb_needs_vls_intercept(svm)) 842 839 vmcb02->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK; ··· 1184 1189 kvm_make_request(KVM_REQ_EVENT, &svm->vcpu); 1185 1190 1186 1191 if (unlikely(guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) && 1187 - (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) { 1192 + (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) 1188 1193 svm_copy_lbrs(vmcb12, vmcb02); 1189 - svm_update_lbrv(vcpu); 1190 - } else if (unlikely(vmcb01->control.virt_ext & LBR_CTL_ENABLE_MASK)) { 1194 + else 1191 1195 svm_copy_lbrs(vmcb01, vmcb02); 1192 - svm_update_lbrv(vcpu); 1193 - } 1196 + 1197 + svm_update_lbrv(vcpu); 1194 1198 1195 1199 if (vnmi) { 1196 1200 if (vmcb02->control.int_ctl & V_NMI_BLOCKING_MASK)
+39 -49
arch/x86/kvm/svm/svm.c
··· 806 806 vmcb_mark_dirty(to_vmcb, VMCB_LBR); 807 807 } 808 808 809 + static void __svm_enable_lbrv(struct kvm_vcpu *vcpu) 810 + { 811 + to_svm(vcpu)->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK; 812 + } 813 + 809 814 void svm_enable_lbrv(struct kvm_vcpu *vcpu) 810 815 { 811 - struct vcpu_svm *svm = to_svm(vcpu); 812 - 813 - svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK; 816 + __svm_enable_lbrv(vcpu); 814 817 svm_recalc_lbr_msr_intercepts(vcpu); 815 - 816 - /* Move the LBR msrs to the vmcb02 so that the guest can see them. */ 817 - if (is_guest_mode(vcpu)) 818 - svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr); 819 818 } 820 819 821 - static void svm_disable_lbrv(struct kvm_vcpu *vcpu) 820 + static void __svm_disable_lbrv(struct kvm_vcpu *vcpu) 822 821 { 823 - struct vcpu_svm *svm = to_svm(vcpu); 824 - 825 822 KVM_BUG_ON(sev_es_guest(vcpu->kvm), vcpu->kvm); 826 - svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK; 827 - svm_recalc_lbr_msr_intercepts(vcpu); 828 - 829 - /* 830 - * Move the LBR msrs back to the vmcb01 to avoid copying them 831 - * on nested guest entries. 832 - */ 833 - if (is_guest_mode(vcpu)) 834 - svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb); 835 - } 836 - 837 - static struct vmcb *svm_get_lbr_vmcb(struct vcpu_svm *svm) 838 - { 839 - /* 840 - * If LBR virtualization is disabled, the LBR MSRs are always kept in 841 - * vmcb01. If LBR virtualization is enabled and L1 is running VMs of 842 - * its own, the MSRs are moved between vmcb01 and vmcb02 as needed. 843 - */ 844 - return svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK ? 
svm->vmcb : 845 - svm->vmcb01.ptr; 823 + to_svm(vcpu)->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK; 846 824 } 847 825 848 826 void svm_update_lbrv(struct kvm_vcpu *vcpu) 849 827 { 850 828 struct vcpu_svm *svm = to_svm(vcpu); 851 829 bool current_enable_lbrv = svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK; 852 - bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & DEBUGCTLMSR_LBR) || 830 + bool enable_lbrv = (svm->vmcb->save.dbgctl & DEBUGCTLMSR_LBR) || 853 831 (is_guest_mode(vcpu) && guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) && 854 832 (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK)); 855 833 856 - if (enable_lbrv == current_enable_lbrv) 857 - return; 834 + if (enable_lbrv && !current_enable_lbrv) 835 + __svm_enable_lbrv(vcpu); 836 + else if (!enable_lbrv && current_enable_lbrv) 837 + __svm_disable_lbrv(vcpu); 858 838 859 - if (enable_lbrv) 860 - svm_enable_lbrv(vcpu); 861 - else 862 - svm_disable_lbrv(vcpu); 839 + /* 840 + * During nested transitions, it is possible that the current VMCB has 841 + * LBR_CTL set, but the previous LBR_CTL had it cleared (or vice versa). 842 + * In this case, even though LBR_CTL does not need an update, intercepts 843 + * do, so always recalculate the intercepts here. 
844 + */ 845 + svm_recalc_lbr_msr_intercepts(vcpu); 863 846 } 864 847 865 848 void disable_nmi_singlestep(struct vcpu_svm *svm) ··· 903 920 static void svm_hardware_unsetup(void) 904 921 { 905 922 int cpu; 923 + 924 + avic_hardware_unsetup(); 906 925 907 926 sev_hardware_unsetup(); 908 927 ··· 2707 2722 msr_info->data = svm->tsc_aux; 2708 2723 break; 2709 2724 case MSR_IA32_DEBUGCTLMSR: 2710 - msr_info->data = svm_get_lbr_vmcb(svm)->save.dbgctl; 2725 + msr_info->data = svm->vmcb->save.dbgctl; 2711 2726 break; 2712 2727 case MSR_IA32_LASTBRANCHFROMIP: 2713 - msr_info->data = svm_get_lbr_vmcb(svm)->save.br_from; 2728 + msr_info->data = svm->vmcb->save.br_from; 2714 2729 break; 2715 2730 case MSR_IA32_LASTBRANCHTOIP: 2716 - msr_info->data = svm_get_lbr_vmcb(svm)->save.br_to; 2731 + msr_info->data = svm->vmcb->save.br_to; 2717 2732 break; 2718 2733 case MSR_IA32_LASTINTFROMIP: 2719 - msr_info->data = svm_get_lbr_vmcb(svm)->save.last_excp_from; 2734 + msr_info->data = svm->vmcb->save.last_excp_from; 2720 2735 break; 2721 2736 case MSR_IA32_LASTINTTOIP: 2722 - msr_info->data = svm_get_lbr_vmcb(svm)->save.last_excp_to; 2737 + msr_info->data = svm->vmcb->save.last_excp_to; 2723 2738 break; 2724 2739 case MSR_VM_HSAVE_PA: 2725 2740 msr_info->data = svm->nested.hsave_msr; ··· 2987 3002 if (data & DEBUGCTL_RESERVED_BITS) 2988 3003 return 1; 2989 3004 2990 - svm_get_lbr_vmcb(svm)->save.dbgctl = data; 3005 + if (svm->vmcb->save.dbgctl == data) 3006 + break; 3007 + 3008 + svm->vmcb->save.dbgctl = data; 3009 + vmcb_mark_dirty(svm->vmcb, VMCB_LBR); 2991 3010 svm_update_lbrv(vcpu); 2992 3011 break; 2993 3012 case MSR_VM_HSAVE_PA: ··· 5375 5386 5376 5387 svm_hv_hardware_setup(); 5377 5388 5378 - for_each_possible_cpu(cpu) { 5379 - r = svm_cpu_init(cpu); 5380 - if (r) 5381 - goto err; 5382 - } 5383 - 5384 5389 enable_apicv = avic_hardware_setup(); 5385 5390 if (!enable_apicv) { 5386 5391 enable_ipiv = false; ··· 5418 5435 svm_set_cpu_caps(); 5419 5436 5420 5437 
kvm_caps.inapplicable_quirks &= ~KVM_X86_QUIRK_CD_NW_CLEARED; 5438 + 5439 + for_each_possible_cpu(cpu) { 5440 + r = svm_cpu_init(cpu); 5441 + if (r) 5442 + goto err; 5443 + } 5444 + 5421 5445 return 0; 5422 5446 5423 5447 err:
+2 -2
arch/x86/kvm/svm/svm.h
··· 329 329 * back into remapped mode). 330 330 */ 331 331 struct list_head ir_list; 332 - spinlock_t ir_list_lock; 332 + raw_spinlock_t ir_list_lock; 333 333 334 334 struct vcpu_sev_es_state sev_es; 335 335 ··· 805 805 ) 806 806 807 807 bool __init avic_hardware_setup(void); 808 - int avic_ga_log_notifier(u32 ga_tag); 808 + void avic_hardware_unsetup(void); 809 809 void avic_vm_destroy(struct kvm *kvm); 810 810 int avic_vm_init(struct kvm *kvm); 811 811 void avic_init_vmcb(struct vcpu_svm *svm, struct vmcb *vmcb);
+1 -1
arch/x86/kvm/vmx/common.h
··· 98 98 error_code |= (exit_qualification & EPT_VIOLATION_PROT_MASK) 99 99 ? PFERR_PRESENT_MASK : 0; 100 100 101 - if (error_code & EPT_VIOLATION_GVA_IS_VALID) 101 + if (exit_qualification & EPT_VIOLATION_GVA_IS_VALID) 102 102 error_code |= (exit_qualification & EPT_VIOLATION_GVA_TRANSLATED) ? 103 103 PFERR_GUEST_FINAL_MASK : PFERR_GUEST_PAGE_MASK; 104 104
+8
arch/x86/kvm/vmx/nested.c
··· 6728 6728 case EXIT_REASON_NOTIFY: 6729 6729 /* Notify VM exit is not exposed to L1 */ 6730 6730 return false; 6731 + case EXIT_REASON_SEAMCALL: 6732 + case EXIT_REASON_TDCALL: 6733 + /* 6734 + * SEAMCALL and TDCALL unconditionally VM-Exit, but aren't 6735 + * virtualized by KVM for L1 hypervisors, i.e. L1 should 6736 + * never want or expect such an exit. 6737 + */ 6738 + return false; 6731 6739 default: 6732 6740 return true; 6733 6741 }
+8
arch/x86/kvm/vmx/vmx.c
··· 6032 6032 return 1; 6033 6033 } 6034 6034 6035 + static int handle_tdx_instruction(struct kvm_vcpu *vcpu) 6036 + { 6037 + kvm_queue_exception(vcpu, UD_VECTOR); 6038 + return 1; 6039 + } 6040 + 6035 6041 #ifndef CONFIG_X86_SGX_KVM 6036 6042 static int handle_encls(struct kvm_vcpu *vcpu) 6037 6043 { ··· 6163 6157 [EXIT_REASON_ENCLS] = handle_encls, 6164 6158 [EXIT_REASON_BUS_LOCK] = handle_bus_lock_vmexit, 6165 6159 [EXIT_REASON_NOTIFY] = handle_notify, 6160 + [EXIT_REASON_SEAMCALL] = handle_tdx_instruction, 6161 + [EXIT_REASON_TDCALL] = handle_tdx_instruction, 6166 6162 [EXIT_REASON_MSR_READ_IMM] = handle_rdmsr_imm, 6167 6163 [EXIT_REASON_MSR_WRITE_IMM] = handle_wrmsr_imm, 6168 6164 };
+29 -19
arch/x86/kvm/x86.c
··· 3874 3874 3875 3875 /* 3876 3876 * Returns true if the MSR in question is managed via XSTATE, i.e. is context 3877 - * switched with the rest of guest FPU state. Note! S_CET is _not_ context 3878 - * switched via XSTATE even though it _is_ saved/restored via XSAVES/XRSTORS. 3879 - * Because S_CET is loaded on VM-Enter and VM-Exit via dedicated VMCS fields, 3880 - * the value saved/restored via XSTATE is always the host's value. That detail 3881 - * is _extremely_ important, as the guest's S_CET must _never_ be resident in 3882 - * hardware while executing in the host. Loading guest values for U_CET and 3883 - * PL[0-3]_SSP while executing in the kernel is safe, as U_CET is specific to 3884 - * userspace, and PL[0-3]_SSP are only consumed when transitioning to lower 3885 - * privilege levels, i.e. are effectively only consumed by userspace as well. 3877 + * switched with the rest of guest FPU state. 3878 + * 3879 + * Note, S_CET is _not_ saved/restored via XSAVES/XRSTORS. 3886 3880 */ 3887 3881 static bool is_xstate_managed_msr(struct kvm_vcpu *vcpu, u32 msr) 3888 3882 { ··· 3899 3905 * MSR that is managed via XSTATE. Note, the caller is responsible for doing 3900 3906 * the initial FPU load, this helper only ensures that guest state is resident 3901 3907 * in hardware (the kernel can load its FPU state in IRQ context). 3908 + * 3909 + * Note, loading guest values for U_CET and PL[0-3]_SSP while executing in the 3910 + * kernel is safe, as U_CET is specific to userspace, and PL[0-3]_SSP are only 3911 + * consumed when transitioning to lower privilege levels, i.e. are effectively 3912 + * only consumed by userspace as well. 3902 3913 */ 3903 3914 static __always_inline void kvm_access_xstate_msr(struct kvm_vcpu *vcpu, 3904 3915 struct msr_data *msr_info, ··· 11806 11807 /* Swap (qemu) user FPU context for the guest FPU context. 
*/ 11807 11808 static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu) 11808 11809 { 11810 + if (KVM_BUG_ON(vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm)) 11811 + return; 11812 + 11809 11813 /* Exclude PKRU, it's restored separately immediately after VM-Exit. */ 11810 11814 fpu_swap_kvm_fpstate(&vcpu->arch.guest_fpu, true); 11811 11815 trace_kvm_fpu(1); ··· 11817 11815 /* When vcpu_run ends, restore user space FPU context. */ 11818 11816 static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu) 11819 11817 { 11818 + if (KVM_BUG_ON(!vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm)) 11819 + return; 11820 + 11820 11821 fpu_swap_kvm_fpstate(&vcpu->arch.guest_fpu, false); 11821 11822 ++vcpu->stat.fpu_reload; 11822 11823 trace_kvm_fpu(0); ··· 12142 12137 int r; 12143 12138 12144 12139 vcpu_load(vcpu); 12145 - if (kvm_mpx_supported()) 12146 - kvm_load_guest_fpu(vcpu); 12147 - 12148 12140 kvm_vcpu_srcu_read_lock(vcpu); 12149 12141 12150 12142 r = kvm_apic_accept_events(vcpu); ··· 12158 12156 12159 12157 out: 12160 12158 kvm_vcpu_srcu_read_unlock(vcpu); 12161 - 12162 - if (kvm_mpx_supported()) 12163 - kvm_put_guest_fpu(vcpu); 12164 12159 vcpu_put(vcpu); 12165 12160 return r; 12166 12161 } ··· 12787 12788 { 12788 12789 struct fpstate *fpstate = vcpu->arch.guest_fpu.fpstate; 12789 12790 u64 xfeatures_mask; 12791 + bool fpu_in_use; 12790 12792 int i; 12791 12793 12792 12794 /* ··· 12811 12811 BUILD_BUG_ON(sizeof(xfeatures_mask) * BITS_PER_BYTE <= XFEATURE_MAX); 12812 12812 12813 12813 /* 12814 - * All paths that lead to INIT are required to load the guest's FPU 12815 - * state (because most paths are buried in KVM_RUN). 12814 + * Unload guest FPU state (if necessary) before zeroing XSTATE fields 12815 + * as the kernel can only modify the state when its resident in memory, 12816 + * i.e. when it's not loaded into hardware. 12817 + * 12818 + * WARN if the vCPU's desire to run, i.e. 
whether or not its in KVM_RUN, 12819 + * doesn't match the loaded/in-use state of the FPU, as KVM_RUN is the 12820 + * only path that can trigger INIT emulation _and_ loads FPU state, and 12821 + * KVM_RUN should _always_ load FPU state. 12816 12822 */ 12817 - kvm_put_guest_fpu(vcpu); 12823 + WARN_ON_ONCE(vcpu->wants_to_run != fpstate->in_use); 12824 + fpu_in_use = fpstate->in_use; 12825 + if (fpu_in_use) 12826 + kvm_put_guest_fpu(vcpu); 12818 12827 for_each_set_bit(i, (unsigned long *)&xfeatures_mask, XFEATURE_MAX) 12819 12828 fpstate_clear_xstate_component(fpstate, i); 12820 - kvm_load_guest_fpu(vcpu); 12829 + if (fpu_in_use) 12830 + kvm_load_guest_fpu(vcpu); 12821 12831 } 12822 12832 12823 12833 void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
+37 -14
drivers/acpi/acpi_mrrm.c
··· 152 152 153 153 static __init int add_boot_memory_ranges(void) 154 154 { 155 - struct kobject *pkobj, *kobj; 155 + struct kobject *pkobj, *kobj, **kobjs; 156 156 int ret = -EINVAL; 157 - char *name; 157 + char name[16]; 158 + int i; 158 159 159 160 pkobj = kobject_create_and_add("memory_ranges", acpi_kobj); 161 + if (!pkobj) 162 + return -ENOMEM; 160 163 161 - for (int i = 0; i < mrrm_mem_entry_num; i++) { 162 - name = kasprintf(GFP_KERNEL, "range%d", i); 163 - if (!name) { 164 - ret = -ENOMEM; 165 - break; 166 - } 167 - 168 - kobj = kobject_create_and_add(name, pkobj); 169 - 170 - ret = sysfs_create_groups(kobj, memory_range_groups); 171 - if (ret) 172 - return ret; 164 + kobjs = kcalloc(mrrm_mem_entry_num, sizeof(*kobjs), GFP_KERNEL); 165 + if (!kobjs) { 166 + kobject_put(pkobj); 167 + return -ENOMEM; 173 168 } 174 169 170 + for (i = 0; i < mrrm_mem_entry_num; i++) { 171 + scnprintf(name, sizeof(name), "range%d", i); 172 + kobj = kobject_create_and_add(name, pkobj); 173 + if (!kobj) { 174 + ret = -ENOMEM; 175 + goto cleanup; 176 + } 177 + 178 + ret = sysfs_create_groups(kobj, memory_range_groups); 179 + if (ret) { 180 + kobject_put(kobj); 181 + goto cleanup; 182 + } 183 + kobjs[i] = kobj; 184 + } 185 + 186 + kfree(kobjs); 187 + return 0; 188 + 189 + cleanup: 190 + for (int j = 0; j < i; j++) { 191 + if (kobjs[j]) { 192 + sysfs_remove_groups(kobjs[j], memory_range_groups); 193 + kobject_put(kobjs[j]); 194 + } 195 + } 196 + kfree(kobjs); 197 + kobject_put(pkobj); 175 198 return ret; 176 199 } 177 200
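The `add_boot_memory_ranges()` rewrite above follows the classic create-then-rollback idiom: keep a record of every object created so far, so a failure partway through can unwind all of them before returning. A minimal userspace sketch of the same pattern — `create_obj()`/`destroy_obj()` are invented stand-ins here, not the kernel's kobject API:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for kobject_create_and_add()/kobject_put(). */
struct obj { char name[16]; };

static struct obj *create_obj(const char *name, int fail)
{
	struct obj *o;

	if (fail)
		return NULL;		/* simulate an allocation failure */
	o = malloc(sizeof(*o));
	if (o)
		strncpy(o->name, name, sizeof(o->name) - 1);
	return o;
}

static void destroy_obj(struct obj *o)
{
	free(o);
}

/*
 * Create 'count' objects, failing at index 'fail_at' (-1 means no failure).
 * On error, every object created so far is destroyed before returning,
 * mirroring the cleanup: label in the hunk above.
 */
static int create_all(struct obj **objs, int count, int fail_at)
{
	int i;

	for (i = 0; i < count; i++) {
		objs[i] = create_obj("range", i == fail_at);
		if (!objs[i])
			goto cleanup;
	}
	return 0;

cleanup:
	for (int j = 0; j < i; j++) {
		destroy_obj(objs[j]);
		objs[j] = NULL;
	}
	return -1;
}
```

The point of tracking the created objects in an array (the `kobjs` allocation in the hunk) is that each one needs an individual teardown call; a single bulk free would leak the per-object resources.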
+3 -3
drivers/acpi/cppc_acpi.c
··· 460 460 if (acpi_disabled) 461 461 return false; 462 462 463 - for_each_present_cpu(cpu) { 463 + for_each_online_cpu(cpu) { 464 464 cpc_ptr = per_cpu(cpc_desc_ptr, cpu); 465 465 if (!cpc_ptr) 466 466 return false; ··· 476 476 struct cpc_desc *cpc_ptr; 477 477 int cpu; 478 478 479 - for_each_present_cpu(cpu) { 479 + for_each_online_cpu(cpu) { 480 480 cpc_ptr = per_cpu(cpc_desc_ptr, cpu); 481 481 desired_reg = &cpc_ptr->cpc_regs[DESIRED_PERF]; 482 482 if (!CPC_IN_SYSTEM_MEMORY(desired_reg) && ··· 1435 1435 { 1436 1436 int cpu; 1437 1437 1438 - for_each_present_cpu(cpu) { 1438 + for_each_online_cpu(cpu) { 1439 1439 struct cpc_register_resource *ref_perf_reg; 1440 1440 struct cpc_desc *cpc_desc; 1441 1441
+25 -21
drivers/acpi/numa/hmat.c
··· 874 874 } 875 875 } 876 876 877 - static void hmat_register_target(struct memory_target *target) 877 + static void hmat_hotplug_target(struct memory_target *target) 878 878 { 879 879 int nid = pxm_to_node(target->memory_pxm); 880 880 881 + /* 882 + * Skip offline nodes. This can happen when memory marked EFI_MEMORY_SP, 883 + * "specific purpose", is applied to all the memory in a proximity 884 + * domain leading to * the node being marked offline / unplugged, or if 885 + * memory-only "hotplug" node is offline. 886 + */ 887 + if (nid == NUMA_NO_NODE || !node_online(nid)) 888 + return; 889 + 890 + guard(mutex)(&target_lock); 891 + if (target->registered) 892 + return; 893 + 894 + hmat_register_target_initiators(target); 895 + hmat_register_target_cache(target); 896 + hmat_register_target_perf(target, ACCESS_COORDINATE_LOCAL); 897 + hmat_register_target_perf(target, ACCESS_COORDINATE_CPU); 898 + target->registered = true; 899 + } 900 + 901 + static void hmat_register_target(struct memory_target *target) 902 + { 881 903 /* 882 904 * Devices may belong to either an offline or online 883 905 * node, so unconditionally add them. ··· 917 895 } 918 896 mutex_unlock(&target_lock); 919 897 920 - /* 921 - * Skip offline nodes. This can happen when memory 922 - * marked EFI_MEMORY_SP, "specific purpose", is applied 923 - * to all the memory in a proximity domain leading to 924 - * the node being marked offline / unplugged, or if 925 - * memory-only "hotplug" node is offline. 
926 - */ 927 - if (nid == NUMA_NO_NODE || !node_online(nid)) 928 - return; 929 - 930 - mutex_lock(&target_lock); 931 - if (!target->registered) { 932 - hmat_register_target_initiators(target); 933 - hmat_register_target_cache(target); 934 - hmat_register_target_perf(target, ACCESS_COORDINATE_LOCAL); 935 - hmat_register_target_perf(target, ACCESS_COORDINATE_CPU); 936 - target->registered = true; 937 - } 938 - mutex_unlock(&target_lock); 898 + hmat_hotplug_target(target); 939 899 } 940 900 941 901 static void hmat_register_targets(void) ··· 943 939 if (!target) 944 940 return NOTIFY_OK; 945 941 946 - hmat_register_target(target); 942 + hmat_hotplug_target(target); 947 943 return NOTIFY_OK; 948 944 } 949 945
+1 -1
drivers/acpi/numa/srat.c
··· 237 237 struct acpi_srat_generic_affinity *p = 238 238 (struct acpi_srat_generic_affinity *)header; 239 239 240 - if (p->device_handle_type == 0) { 240 + if (p->device_handle_type == 1) { 241 241 /* 242 242 * For pci devices this may be the only place they 243 243 * are assigned a proximity domain
+13 -11
drivers/bluetooth/btrtl.c
··· 50 50 51 51 #define RTL_CHIP_SUBVER (&(struct rtl_vendor_cmd) {{0x10, 0x38, 0x04, 0x28, 0x80}}) 52 52 #define RTL_CHIP_REV (&(struct rtl_vendor_cmd) {{0x10, 0x3A, 0x04, 0x28, 0x80}}) 53 - #define RTL_SEC_PROJ (&(struct rtl_vendor_cmd) {{0x10, 0xA4, 0x0D, 0x00, 0xb0}}) 53 + #define RTL_SEC_PROJ (&(struct rtl_vendor_cmd) {{0x10, 0xA4, 0xAD, 0x00, 0xb0}}) 54 54 55 55 #define RTL_PATCH_SNIPPETS 0x01 56 56 #define RTL_PATCH_DUMMY_HEADER 0x02 ··· 534 534 { 535 535 struct rtl_epatch_header_v2 *hdr; 536 536 int rc; 537 - u8 reg_val[2]; 538 537 u8 key_id; 539 538 u32 num_sections; 540 539 struct rtl_section *section; ··· 548 549 .len = btrtl_dev->fw_len - 7, /* Cut the tail */ 549 550 }; 550 551 551 - rc = btrtl_vendor_read_reg16(hdev, RTL_SEC_PROJ, reg_val); 552 - if (rc < 0) 553 - return -EIO; 554 - key_id = reg_val[0]; 555 - 556 - rtl_dev_dbg(hdev, "%s: key id %u", __func__, key_id); 557 - 558 - btrtl_dev->key_id = key_id; 552 + key_id = btrtl_dev->key_id; 559 553 560 554 hdr = rtl_iov_pull_data(&iov, sizeof(*hdr)); 561 555 if (!hdr) ··· 1062 1070 u16 hci_rev, lmp_subver; 1063 1071 u8 hci_ver, lmp_ver, chip_type = 0; 1064 1072 int ret; 1073 + int rc; 1074 + u8 key_id; 1065 1075 u8 reg_val[2]; 1066 1076 1067 1077 btrtl_dev = kzalloc(sizeof(*btrtl_dev), GFP_KERNEL); ··· 1174 1180 goto err_free; 1175 1181 } 1176 1182 1183 + rc = btrtl_vendor_read_reg16(hdev, RTL_SEC_PROJ, reg_val); 1184 + if (rc < 0) 1185 + goto err_free; 1186 + 1187 + key_id = reg_val[0]; 1188 + btrtl_dev->key_id = key_id; 1189 + rtl_dev_info(hdev, "%s: key id %u", __func__, key_id); 1190 + 1177 1191 btrtl_dev->fw_len = -EIO; 1178 1192 if (lmp_subver == RTL_ROM_LMP_8852A && hci_rev == 0x000c) { 1179 1193 snprintf(fw_name, sizeof(fw_name), "%s_v2.bin", ··· 1204 1202 goto err_free; 1205 1203 } 1206 1204 1207 - if (btrtl_dev->ic_info->cfg_name) { 1205 + if (btrtl_dev->ic_info->cfg_name && !btrtl_dev->key_id) { 1208 1206 if (postfix) { 1209 1207 snprintf(cfg_name, sizeof(cfg_name), "%s-%s.bin", 1210 1208 
btrtl_dev->ic_info->cfg_name, postfix);
+6 -7
drivers/bluetooth/btusb.c
··· 4361 4361 4362 4362 hci_unregister_dev(hdev); 4363 4363 4364 + if (data->oob_wake_irq) 4365 + device_init_wakeup(&data->udev->dev, false); 4366 + if (data->reset_gpio) 4367 + gpiod_put(data->reset_gpio); 4368 + 4364 4369 if (intf == data->intf) { 4365 4370 if (data->isoc) 4366 4371 usb_driver_release_interface(&btusb_driver, data->isoc); ··· 4376 4371 usb_driver_release_interface(&btusb_driver, data->diag); 4377 4372 usb_driver_release_interface(&btusb_driver, data->intf); 4378 4373 } else if (intf == data->diag) { 4379 - usb_driver_release_interface(&btusb_driver, data->intf); 4380 4374 if (data->isoc) 4381 4375 usb_driver_release_interface(&btusb_driver, data->isoc); 4376 + usb_driver_release_interface(&btusb_driver, data->intf); 4382 4377 } 4383 - 4384 - if (data->oob_wake_irq) 4385 - device_init_wakeup(&data->udev->dev, false); 4386 - 4387 - if (data->reset_gpio) 4388 - gpiod_put(data->reset_gpio); 4389 4378 4390 4379 hci_free_dev(hdev); 4391 4380 }
+4 -5
drivers/cpufreq/intel_pstate.c
··· 603 603 { 604 604 u64 misc_en; 605 605 606 - if (!cpu_feature_enabled(X86_FEATURE_IDA)) 607 - return true; 608 - 609 606 rdmsrq(MSR_IA32_MISC_ENABLE, misc_en); 610 607 611 608 return !!(misc_en & MSR_IA32_MISC_ENABLE_TURBO_DISABLE); ··· 2103 2106 u32 vid; 2104 2107 2105 2108 val = (u64)pstate << 8; 2106 - if (READ_ONCE(global.no_turbo) && !READ_ONCE(global.turbo_disabled)) 2109 + if (READ_ONCE(global.no_turbo) && !READ_ONCE(global.turbo_disabled) && 2110 + cpu_feature_enabled(X86_FEATURE_IDA)) 2107 2111 val |= (u64)1 << 32; 2108 2112 2109 2113 vid_fp = cpudata->vid.min + mul_fp( ··· 2269 2271 u64 val; 2270 2272 2271 2273 val = (u64)pstate << 8; 2272 - if (READ_ONCE(global.no_turbo) && !READ_ONCE(global.turbo_disabled)) 2274 + if (READ_ONCE(global.no_turbo) && !READ_ONCE(global.turbo_disabled) && 2275 + cpu_feature_enabled(X86_FEATURE_IDA)) 2273 2276 val |= (u64)1 << 32; 2274 2277 2275 2278 return val;
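The intel_pstate change above moves the `X86_FEATURE_IDA` check from the MSR probe to the two call sites that set bit 32 of the pstate control value. As a hedged illustration of that encoding decision written as a pure function — `encode_pstate()` is a made-up name, but the bit layout follows the hunk:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch of the hunk's logic: the target pstate occupies bits 8..15 of
 * the control value, and bit 32 is set only when all three conditions
 * hold, including the relocated "CPU has IDA/turbo" feature check.
 */
static uint64_t encode_pstate(unsigned int pstate, bool no_turbo,
			      bool turbo_disabled, bool has_ida)
{
	uint64_t val = (uint64_t)pstate << 8;

	if (no_turbo && !turbo_disabled && has_ida)
		val |= (uint64_t)1 << 32;

	return val;
}
```

Keeping the feature test at the point where the bit is written means a CPU without the feature can never be asked to toggle it, regardless of what the global flags say.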
+2
drivers/crypto/hisilicon/qm.c
··· 3871 3871 pdev = container_of(dev, struct pci_dev, dev); 3872 3872 if (pci_physfn(pdev) != qm->pdev) { 3873 3873 pci_err(qm->pdev, "the pdev input does not match the pf!\n"); 3874 + put_device(dev); 3874 3875 return -EINVAL; 3875 3876 } 3876 3877 3877 3878 *fun_index = pdev->devfn; 3879 + put_device(dev); 3878 3880 3879 3881 return 0; 3880 3882 }
+2
drivers/cxl/core/region.c
··· 3702 3702 if (validate_region_offset(cxlr, offset)) 3703 3703 return -EINVAL; 3704 3704 3705 + offset -= cxlr->params.cache_size; 3705 3706 rc = region_offset_to_dpa_result(cxlr, offset, &result); 3706 3707 if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) { 3707 3708 dev_dbg(&cxlr->dev, ··· 3735 3734 if (validate_region_offset(cxlr, offset)) 3736 3735 return -EINVAL; 3737 3736 3737 + offset -= cxlr->params.cache_size; 3738 3738 rc = region_offset_to_dpa_result(cxlr, offset, &result); 3739 3739 if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) { 3740 3740 dev_dbg(&cxlr->dev,
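The core of the cxl/region fix above is a rebase: when a cache fronts the region, the incoming host-physical offset includes the cache span, so `params.cache_size` must be subtracted before the offset is translated to a device address. A hypothetical userspace model of that arithmetic — the explicit range check is invented here for illustration, not copied from `validate_region_offset()`:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rebase a host-physical region offset that includes a leading cache
 * span of 'cache_size' bytes. Returns the device-relative offset, or
 * -1 if the input falls outside [cache_size, cache_size + region_size).
 */
static int64_t rebase_region_offset(uint64_t offset, uint64_t cache_size,
				    uint64_t region_size)
{
	if (offset < cache_size || offset >= cache_size + region_size)
		return -1;

	return (int64_t)(offset - cache_size);
}
```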
+17 -5
drivers/edac/altera_edac.c
··· 1184 1184 if (ret) 1185 1185 return ret; 1186 1186 1187 - /* Verify OCRAM has been initialized */ 1187 + /* 1188 + * Verify that OCRAM has been initialized. 1189 + * During a warm reset, OCRAM contents are retained, but the control 1190 + * and status registers are reset to their default values. Therefore, 1191 + * ECC must be explicitly re-enabled in the control register. 1192 + * Error condition: if INITCOMPLETEA is clear and ECC_EN is already set. 1193 + */ 1188 1194 if (!ecc_test_bits(ALTR_A10_ECC_INITCOMPLETEA, 1189 - (base + ALTR_A10_ECC_INITSTAT_OFST))) 1190 - return -ENODEV; 1195 + (base + ALTR_A10_ECC_INITSTAT_OFST))) { 1196 + if (!ecc_test_bits(ALTR_A10_ECC_EN, 1197 + (base + ALTR_A10_ECC_CTRL_OFST))) 1198 + ecc_set_bits(ALTR_A10_ECC_EN, 1199 + (base + ALTR_A10_ECC_CTRL_OFST)); 1200 + else 1201 + return -ENODEV; 1202 + } 1191 1203 1192 1204 /* Enable IRQ on Single Bit Error */ 1193 1205 writel(ALTR_A10_ECC_SERRINTEN, (base + ALTR_A10_ECC_ERRINTENS_OFST)); ··· 1369 1357 .ue_set_mask = ALTR_A10_ECC_TDERRA, 1370 1358 .set_err_ofst = ALTR_A10_ECC_INTTEST_OFST, 1371 1359 .ecc_irq_handler = altr_edac_a10_ecc_irq, 1372 - .inject_fops = &altr_edac_a10_device_inject2_fops, 1360 + .inject_fops = &altr_edac_a10_device_inject_fops, 1373 1361 }; 1374 1362 1375 1363 #endif /* CONFIG_EDAC_ALTERA_ETHERNET */ ··· 1459 1447 .ue_set_mask = ALTR_A10_ECC_TDERRA, 1460 1448 .set_err_ofst = ALTR_A10_ECC_INTTEST_OFST, 1461 1449 .ecc_irq_handler = altr_edac_a10_ecc_irq, 1462 - .inject_fops = &altr_edac_a10_device_inject2_fops, 1450 + .inject_fops = &altr_edac_a10_device_inject_fops, 1463 1451 }; 1464 1452 1465 1453 #endif /* CONFIG_EDAC_ALTERA_USB */
+13 -11
drivers/edac/versalnet_edac.c
··· 605 605 length = result[MSG_ERR_LENGTH]; 606 606 offset = result[MSG_ERR_OFFSET]; 607 607 608 + /* 609 + * The data can come in two stretches. Construct the regs from two 610 + * messages. The offset indicates the offset from which the data is to 611 + * be taken. 612 + */ 613 + for (i = 0 ; i < length; i++) { 614 + k = offset + i; 615 + j = ERROR_DATA + i; 616 + mc_priv->regs[k] = result[j]; 617 + } 618 + 608 619 if (result[TOTAL_ERR_LENGTH] > length) { 609 620 if (!mc_priv->part_len) 610 621 mc_priv->part_len = length; 611 622 else 612 623 mc_priv->part_len += length; 613 - /* 614 - * The data can come in 2 stretches. Construct the regs from 2 615 - * messages the offset indicates the offset from which the data is to 616 - * be taken 617 - */ 618 - for (i = 0 ; i < length; i++) { 619 - k = offset + i; 620 - j = ERROR_DATA + i; 621 - mc_priv->regs[k] = result[j]; 622 - } 624 + 623 625 if (mc_priv->part_len < result[TOTAL_ERR_LENGTH]) 624 626 return 0; 625 627 mc_priv->part_len = 0; ··· 707 705 /* Convert to bytes */ 708 706 length = result[TOTAL_ERR_LENGTH] * 4; 709 707 log_non_standard_event(sec_type, &amd_versalnet_guid, mc_priv->message, 710 - sec_sev, (void *)&result[ERROR_DATA], length); 708 + sec_sev, (void *)&mc_priv->regs, length); 711 709 712 710 return 0; 713 711 }
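The versalnet_edac hunk above makes the driver accumulate each message's (offset, length) chunk into `regs[]` unconditionally, so a register dump split across two messages is fully reassembled before being logged. A small self-contained model of that reassembly — the struct, word count, and function name are invented here, not the driver's types:

```c
#include <assert.h>
#include <stdint.h>

#define REGS_WORDS 64

/*
 * Each message carries 'length' words of a logical register dump,
 * destined for regs[offset..offset+length). Chunks are accumulated
 * until 'total_len' words have arrived. Returns 1 once the dump is
 * complete, 0 while parts are still outstanding.
 */
struct reassembler {
	uint32_t regs[REGS_WORDS];
	uint32_t part_len;
};

static int reassemble(struct reassembler *r, const uint32_t *data,
		      uint32_t offset, uint32_t length, uint32_t total_len)
{
	for (uint32_t i = 0; i < length && offset + i < REGS_WORDS; i++)
		r->regs[offset + i] = data[i];

	if (total_len > length) {
		r->part_len += length;
		if (r->part_len < total_len)
			return 0;	/* wait for the next stretch */
	}
	r->part_len = 0;
	return 1;			/* dump complete */
}
```

This also shows why the second half of the fix matters: the logger must read from the accumulated `regs[]` buffer, not from the last message's payload, or the first stretch of a split dump is silently dropped.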
+2
drivers/firewire/core-card.c
··· 577 577 INIT_LIST_HEAD(&card->transactions.list); 578 578 spin_lock_init(&card->transactions.lock); 579 579 580 + spin_lock_init(&card->topology_map.lock); 581 + 580 582 card->split_timeout.hi = DEFAULT_SPLIT_TIMEOUT / 8000; 581 583 card->split_timeout.lo = (DEFAULT_SPLIT_TIMEOUT % 8000) << 19; 582 584 card->split_timeout.cycles = DEFAULT_SPLIT_TIMEOUT;
+2 -1
drivers/firewire/core-topology.c
··· 441 441 const u32 *self_ids, int self_id_count) 442 442 { 443 443 __be32 *map = buffer; 444 + u32 next_generation = be32_to_cpu(buffer[1]) + 1; 444 445 int node_count = (root_node_id & 0x3f) + 1; 445 446 446 447 memset(map, 0, buffer_size); 447 448 448 449 *map++ = cpu_to_be32((self_id_count + 2) << 16); 449 - *map++ = cpu_to_be32(be32_to_cpu(buffer[1]) + 1); 450 + *map++ = cpu_to_be32(next_generation); 450 451 *map++ = cpu_to_be32((node_count << 16) | self_id_count); 451 452 452 453 while (self_id_count--)
+1 -1
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
··· 236 236 r = amdgpu_xcp_select_scheds(adev, hw_ip, hw_prio, fpriv, 237 237 &num_scheds, &scheds); 238 238 if (r) 239 - goto cleanup_entity; 239 + goto error_free_entity; 240 240 } 241 241 242 242 /* disable load balance if the hw engine retains context among dependent jobs */
+12
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
··· 82 82 struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); 83 83 struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev); 84 84 85 + /* 86 + * Disable peer-to-peer access for DCC-enabled VRAM surfaces on GFX12+. 87 + * Such buffers cannot be safely accessed over P2P due to device-local 88 + * compression metadata. Fallback to system-memory path instead. 89 + * Device supports GFX12 (GC 12.x or newer) 90 + * BO was created with the AMDGPU_GEM_CREATE_GFX12_DCC flag 91 + * 92 + */ 93 + if (amdgpu_ip_version(adev, GC_HWIP, 0) >= IP_VERSION(12, 0, 0) && 94 + bo->flags & AMDGPU_GEM_CREATE_GFX12_DCC) 95 + attach->peer2peer = false; 96 + 85 97 if (!amdgpu_dmabuf_is_xgmi_accessible(attach_adev, bo) && 86 98 pci_p2pdma_distance(adev->pdev, attach->dev, false) < 0) 87 99 attach->peer2peer = false;
+2
drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c
··· 280 280 if (ret) 281 281 return ret; 282 282 283 + /* Ensure *bo is NULL so a new BO will be created */ 284 + *bo = NULL; 283 285 ret = amdgpu_bo_create_kernel(adev, 284 286 size, 285 287 ISP_MC_ADDR_ALIGN,
+3 -2
drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
··· 151 151 { 152 152 struct amdgpu_userq_fence *userq_fence, *tmp; 153 153 struct dma_fence *fence; 154 + unsigned long flags; 154 155 u64 rptr; 155 156 int i; 156 157 157 158 if (!fence_drv) 158 159 return; 159 160 161 + spin_lock_irqsave(&fence_drv->fence_list_lock, flags); 160 162 rptr = amdgpu_userq_fence_read(fence_drv); 161 163 162 - spin_lock(&fence_drv->fence_list_lock); 163 164 list_for_each_entry_safe(userq_fence, tmp, &fence_drv->fences, link) { 164 165 fence = &userq_fence->base; 165 166 ··· 175 174 list_del(&userq_fence->link); 176 175 dma_fence_put(fence); 177 176 } 178 - spin_unlock(&fence_drv->fence_list_lock); 177 + spin_unlock_irqrestore(&fence_drv->fence_list_lock, flags); 179 178 } 180 179 181 180 void amdgpu_userq_fence_driver_destroy(struct kref *ref)
+1
drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c
··· 878 878 .get_rptr = jpeg_v5_0_1_dec_ring_get_rptr, 879 879 .get_wptr = jpeg_v5_0_1_dec_ring_get_wptr, 880 880 .set_wptr = jpeg_v5_0_1_dec_ring_set_wptr, 881 + .parse_cs = amdgpu_jpeg_dec_parse_cs, 881 882 .emit_frame_size = 882 883 SOC15_FLUSH_GPU_TLB_NUM_WREG * 6 + 883 884 SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 8 +
+6 -6
drivers/gpu/drm/amd/amdkfd/kfd_queue.c
··· 297 297 goto out_err_unreserve; 298 298 } 299 299 300 - if (properties->ctx_save_restore_area_size != topo_dev->node_props.cwsr_size) { 301 - pr_debug("queue cwsr size 0x%x not equal to node cwsr size 0x%x\n", 300 + if (properties->ctx_save_restore_area_size < topo_dev->node_props.cwsr_size) { 301 + pr_debug("queue cwsr size 0x%x not sufficient for node cwsr size 0x%x\n", 302 302 properties->ctx_save_restore_area_size, 303 303 topo_dev->node_props.cwsr_size); 304 304 err = -EINVAL; 305 305 goto out_err_unreserve; 306 306 } 307 307 308 - total_cwsr_size = (topo_dev->node_props.cwsr_size + topo_dev->node_props.debug_memory_size) 309 - * NUM_XCC(pdd->dev->xcc_mask); 308 + total_cwsr_size = (properties->ctx_save_restore_area_size + 309 + topo_dev->node_props.debug_memory_size) * NUM_XCC(pdd->dev->xcc_mask); 310 310 total_cwsr_size = ALIGN(total_cwsr_size, PAGE_SIZE); 311 311 312 312 err = kfd_queue_buffer_get(vm, (void *)properties->ctx_save_restore_area_address, ··· 352 352 topo_dev = kfd_topology_device_by_id(pdd->dev->id); 353 353 if (!topo_dev) 354 354 return -EINVAL; 355 - total_cwsr_size = (topo_dev->node_props.cwsr_size + topo_dev->node_props.debug_memory_size) 356 - * NUM_XCC(pdd->dev->xcc_mask); 355 + total_cwsr_size = (properties->ctx_save_restore_area_size + 356 + topo_dev->node_props.debug_memory_size) * NUM_XCC(pdd->dev->xcc_mask); 357 357 total_cwsr_size = ALIGN(total_cwsr_size, PAGE_SIZE); 358 358 359 359 kfd_queue_buffer_svm_put(pdd, properties->ctx_save_restore_area_address, total_cwsr_size);
+2
drivers/gpu/drm/amd/amdkfd/kfd_svm.c
··· 3687 3687 svm_range_apply_attrs(p, prange, nattr, attrs, &update_mapping); 3688 3688 /* TODO: unmap ranges from GPU that lost access */ 3689 3689 } 3690 + update_mapping |= !p->xnack_enabled && !list_empty(&remap_list); 3691 + 3690 3692 list_for_each_entry_safe(prange, next, &remove_list, update_list) { 3691 3693 pr_debug("unlink old 0x%p prange 0x%p [0x%lx 0x%lx]\n", 3692 3694 prange->svms, prange, prange->start,
+11
drivers/gpu/drm/amd/display/modules/freesync/freesync.c
··· 1260 1260 update_v_total_for_static_ramp( 1261 1261 core_freesync, stream, in_out_vrr); 1262 1262 } 1263 + 1264 + /* 1265 + * If VRR is inactive, set vtotal min and max to nominal vtotal 1266 + */ 1267 + if (in_out_vrr->state == VRR_STATE_INACTIVE) { 1268 + in_out_vrr->adjust.v_total_min = 1269 + mod_freesync_calc_v_total_from_refresh(stream, 1270 + in_out_vrr->max_refresh_in_uhz); 1271 + in_out_vrr->adjust.v_total_max = in_out_vrr->adjust.v_total_min; 1272 + return; 1273 + } 1263 1274 } 1264 1275 1265 1276 unsigned long long mod_freesync_calc_nominal_field_rate(
+2 -2
drivers/gpu/drm/clients/drm_client_setup.c
··· 13 13 static char drm_client_default[16] = CONFIG_DRM_CLIENT_DEFAULT; 14 14 module_param_string(active, drm_client_default, sizeof(drm_client_default), 0444); 15 15 MODULE_PARM_DESC(active, 16 - "Choose which drm client to start, default is" 17 - CONFIG_DRM_CLIENT_DEFAULT "]"); 16 + "Choose which drm client to start, default is " 17 + CONFIG_DRM_CLIENT_DEFAULT); 18 18 19 19 /** 20 20 * drm_client_setup() - Setup in-kernel DRM clients
+6 -1
drivers/gpu/drm/i915/display/intel_psr.c
··· 585 585 struct intel_display *display = to_intel_display(intel_dp); 586 586 int ret; 587 587 588 + /* TODO: Enable Panel Replay on MST once it's properly implemented. */ 589 + if (intel_dp->mst_detect == DRM_DP_MST) 590 + return; 591 + 588 592 ret = drm_dp_dpcd_read_data(&intel_dp->aux, DP_PANEL_REPLAY_CAP_SUPPORT, 589 593 &intel_dp->pr_dpcd, sizeof(intel_dp->pr_dpcd)); 590 594 if (ret < 0) ··· 892 888 { 893 889 struct intel_display *display = to_intel_display(intel_dp); 894 890 u32 current_dc_state = intel_display_power_get_current_dc_state(display); 895 - struct drm_vblank_crtc *vblank = &display->drm->vblank[intel_dp->psr.pipe]; 891 + struct intel_crtc *crtc = intel_crtc_for_pipe(display, intel_dp->psr.pipe); 892 + struct drm_vblank_crtc *vblank = drm_crtc_vblank_crtc(&crtc->base); 896 893 897 894 return (current_dc_state != DC_STATE_EN_UPTO_DC5 && 898 895 current_dc_state != DC_STATE_EN_UPTO_DC6) ||
+18
drivers/gpu/drm/panthor/panthor_gem.c
··· 288 288 289 289 panthor_gem_debugfs_set_usage_flags(bo, 0); 290 290 291 + /* If this is a write-combine mapping, we query the sgt to force a CPU 292 + * cache flush (dma_map_sgtable() is called when the sgt is created). 293 + * This ensures the zero-ing is visible to any uncached mapping created 294 + * by vmap/mmap. 295 + * FIXME: Ideally this should be done when pages are allocated, not at 296 + * BO creation time. 297 + */ 298 + if (shmem->map_wc) { 299 + struct sg_table *sgt; 300 + 301 + sgt = drm_gem_shmem_get_pages_sgt(shmem); 302 + if (IS_ERR(sgt)) { 303 + ret = PTR_ERR(sgt); 304 + goto out_put_gem; 305 + } 306 + } 307 + 291 308 /* 292 309 * Allocate an id of idr table where the obj is registered 293 310 * and handle has the id what user can see. ··· 313 296 if (!ret) 314 297 *size = bo->base.base.size; 315 298 299 + out_put_gem: 316 300 /* drop reference from allocate - handle holds it now. */ 317 301 drm_gem_object_put(&shmem->base); 318 302
+15 -1
drivers/gpu/drm/vmwgfx/vmwgfx_cursor_plane.c
··· 100 100 if (vmw->has_mob) { 101 101 if ((vmw->capabilities2 & SVGA_CAP2_CURSOR_MOB) != 0) 102 102 return VMW_CURSOR_UPDATE_MOB; 103 + else 104 + return VMW_CURSOR_UPDATE_GB_ONLY; 103 105 } 104 - 106 + drm_warn_once(&vmw->drm, "Unknown Cursor Type!\n"); 105 107 return VMW_CURSOR_UPDATE_NONE; 106 108 } 107 109 ··· 141 139 { 142 140 switch (update_type) { 143 141 case VMW_CURSOR_UPDATE_LEGACY: 142 + case VMW_CURSOR_UPDATE_GB_ONLY: 144 143 case VMW_CURSOR_UPDATE_NONE: 145 144 return 0; 146 145 case VMW_CURSOR_UPDATE_MOB: ··· 626 623 if (!surface || vps->cursor.legacy.id == surface->snooper.id) 627 624 vps->cursor.update_type = VMW_CURSOR_UPDATE_NONE; 628 625 break; 626 + case VMW_CURSOR_UPDATE_GB_ONLY: 629 627 case VMW_CURSOR_UPDATE_MOB: { 630 628 bo = vmw_user_object_buffer(&vps->uo); 631 629 if (bo) { ··· 741 737 vmw_cursor_plane_atomic_update(struct drm_plane *plane, 742 738 struct drm_atomic_state *state) 743 739 { 740 + struct vmw_bo *bo; 744 741 struct drm_plane_state *new_state = 745 742 drm_atomic_get_new_plane_state(state, plane); 746 743 struct drm_plane_state *old_state = ··· 766 761 break; 767 762 case VMW_CURSOR_UPDATE_MOB: 768 763 vmw_cursor_update_mob(dev_priv, vps); 764 + break; 765 + case VMW_CURSOR_UPDATE_GB_ONLY: 766 + bo = vmw_user_object_buffer(&vps->uo); 767 + if (bo) 768 + vmw_send_define_cursor_cmd(dev_priv, bo->map.virtual, 769 + vps->base.crtc_w, 770 + vps->base.crtc_h, 771 + vps->base.hotspot_x, 772 + vps->base.hotspot_y); 769 773 break; 770 774 case VMW_CURSOR_UPDATE_NONE: 771 775 /* do nothing */
+1
drivers/gpu/drm/vmwgfx/vmwgfx_cursor_plane.h
··· 33 33 enum vmw_cursor_update_type { 34 34 VMW_CURSOR_UPDATE_NONE = 0, 35 35 VMW_CURSOR_UPDATE_LEGACY, 36 + VMW_CURSOR_UPDATE_GB_ONLY, 36 37 VMW_CURSOR_UPDATE_MOB, 37 38 }; 38 39
+5
drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
··· 3668 3668 3669 3669 3670 3670 cmd_id = header->id; 3671 + if (header->size > SVGA_CMD_MAX_DATASIZE) { 3672 + VMW_DEBUG_USER("SVGA3D command: %d is too big.\n", 3673 + cmd_id + SVGA_3D_CMD_BASE); 3674 + return -E2BIG; 3675 + } 3671 3676 *size = header->size + sizeof(SVGA3dCmdHeader); 3672 3677 3673 3678 cmd_id -= SVGA_3D_CMD_BASE;
+5 -7
drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c
··· 32 32 33 33 /** 34 34 * struct vmw_bo_dirty - Dirty information for buffer objects 35 + * @ref_count: Reference count for this structure. Must be first member! 35 36 * @start: First currently dirty bit 36 37 * @end: Last currently dirty bit + 1 37 38 * @method: The currently used dirty method 38 39 * @change_count: Number of consecutive method change triggers 39 - * @ref_count: Reference count for this structure 40 40 * @bitmap_size: The size of the bitmap in bits. Typically equal to the 41 41 * nuber of pages in the bo. 42 42 * @bitmap: A bitmap where each bit represents a page. A set bit means a 43 43 * dirty page. 44 44 */ 45 45 struct vmw_bo_dirty { 46 + struct kref ref_count; 46 47 unsigned long start; 47 48 unsigned long end; 48 49 enum vmw_bo_dirty_method method; 49 50 unsigned int change_count; 50 - unsigned int ref_count; 51 51 unsigned long bitmap_size; 52 52 unsigned long bitmap[]; 53 53 }; ··· 221 221 int ret; 222 222 223 223 if (dirty) { 224 - dirty->ref_count++; 224 + kref_get(&dirty->ref_count); 225 225 return 0; 226 226 } 227 227 ··· 235 235 dirty->bitmap_size = num_pages; 236 236 dirty->start = dirty->bitmap_size; 237 237 dirty->end = 0; 238 - dirty->ref_count = 1; 238 + kref_init(&dirty->ref_count); 239 239 if (num_pages < PAGE_SIZE / sizeof(pte_t)) { 240 240 dirty->method = VMW_BO_DIRTY_PAGETABLE; 241 241 } else { ··· 274 274 { 275 275 struct vmw_bo_dirty *dirty = vbo->dirty; 276 276 277 - if (dirty && --dirty->ref_count == 0) { 278 - kvfree(dirty); 277 + if (dirty && kref_put(&dirty->ref_count, (void *)kvfree)) 279 278 vbo->dirty = NULL; 280 - } 281 279 } 282 280 283 281 /**
+1
drivers/gpu/drm/xe/regs/xe_gt_regs.h
··· 168 168 169 169 #define XEHP_SLICE_COMMON_ECO_CHICKEN1 XE_REG_MCR(0x731c, XE_REG_OPTION_MASKED) 170 170 #define MSC_MSAA_REODER_BUF_BYPASS_DISABLE REG_BIT(14) 171 + #define FAST_CLEAR_VALIGN_FIX REG_BIT(13) 171 172 172 173 #define XE2LPM_CCCHKNREG1 XE_REG(0x82a8) 173 174
+11
drivers/gpu/drm/xe/xe_wa.c
··· 679 679 }, 680 680 { XE_RTP_NAME("14023061436"), 681 681 XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3000, 3001), 682 + FUNC(xe_rtp_match_first_render_or_compute), OR, 683 + GRAPHICS_VERSION_RANGE(3003, 3005), 682 684 FUNC(xe_rtp_match_first_render_or_compute)), 683 685 XE_RTP_ACTIONS(SET(TDL_CHICKEN, QID_WAIT_FOR_THREAD_NOT_RUN_DISABLE)) 684 686 }, ··· 917 915 { XE_RTP_NAME("22021007897"), 918 916 XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3000, 3003), ENGINE_CLASS(RENDER)), 919 917 XE_RTP_ACTIONS(SET(COMMON_SLICE_CHICKEN4, SBE_PUSH_CONSTANT_BEHIND_FIX_ENABLE)) 918 + }, 919 + { XE_RTP_NAME("14024681466"), 920 + XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3000, 3005), ENGINE_CLASS(RENDER)), 921 + XE_RTP_ACTIONS(SET(XEHP_SLICE_COMMON_ECO_CHICKEN1, FAST_CLEAR_VALIGN_FIX)) 922 + }, 923 + { XE_RTP_NAME("15016589081"), 924 + XE_RTP_RULES(GRAPHICS_VERSION(3000), GRAPHICS_STEP(A0, B0), 925 + ENGINE_CLASS(RENDER)), 926 + XE_RTP_ACTIONS(SET(CHICKEN_RASTER_1, DIS_CLIP_NEGATIVE_BOUNDING_BOX)) 920 927 }, 921 928 }; 922 929
+26 -28
drivers/hwmon/gpd-fan.c
··· 12 12 * Copyright (c) 2024 Cryolitia PukNgae 13 13 */ 14 14 15 - #include <linux/acpi.h> 16 15 #include <linux/dmi.h> 17 16 #include <linux/hwmon.h> 17 + #include <linux/io.h> 18 18 #include <linux/ioport.h> 19 19 #include <linux/kernel.h> 20 20 #include <linux/module.h> ··· 276 276 return (u16)high << 8 | low; 277 277 } 278 278 279 - static void gpd_win4_init_ec(void) 280 - { 281 - u8 chip_id, chip_ver; 282 - 283 - gpd_ecram_read(0x2000, &chip_id); 284 - 285 - if (chip_id == 0x55) { 286 - gpd_ecram_read(0x1060, &chip_ver); 287 - gpd_ecram_write(0x1060, chip_ver | 0x80); 288 - } 289 - } 290 - 291 - static int gpd_win4_read_rpm(void) 292 - { 293 - int ret; 294 - 295 - ret = gpd_generic_read_rpm(); 296 - 297 - if (ret == 0) 298 - // Re-init EC when speed is 0 299 - gpd_win4_init_ec(); 300 - 301 - return ret; 302 - } 303 - 304 279 static int gpd_wm2_read_rpm(void) 305 280 { 306 281 for (u16 pwm_ctr_offset = GPD_PWM_CTR_OFFSET; ··· 295 320 static int gpd_read_rpm(void) 296 321 { 297 322 switch (gpd_driver_priv.drvdata->board) { 323 + case win4_6800u: 298 324 case win_mini: 299 325 case duo: 300 326 return gpd_generic_read_rpm(); 301 - case win4_6800u: 302 - return gpd_win4_read_rpm(); 303 327 case win_max_2: 304 328 return gpd_wm2_read_rpm(); 305 329 } ··· 581 607 .info = gpd_fan_hwmon_channel_info 582 608 }; 583 609 610 + static void gpd_win4_init_ec(void) 611 + { 612 + u8 chip_id, chip_ver; 613 + 614 + gpd_ecram_read(0x2000, &chip_id); 615 + 616 + if (chip_id == 0x55) { 617 + gpd_ecram_read(0x1060, &chip_ver); 618 + gpd_ecram_write(0x1060, chip_ver | 0x80); 619 + } 620 + } 621 + 622 + static void gpd_init_ec(void) 623 + { 624 + // The buggy firmware won't initialize EC properly on boot. 625 + // Before its initialization, reading RPM will always return 0, 626 + // and writing PWM will have no effect. 627 + // Initialize it manually on driver load. 
628 + if (gpd_driver_priv.drvdata->board == win4_6800u) 629 + gpd_win4_init_ec(); 630 + } 631 + 584 632 static int gpd_fan_probe(struct platform_device *pdev) 585 633 { 586 634 struct device *dev = &pdev->dev; ··· 629 633 if (IS_ERR(hwdev)) 630 634 return dev_err_probe(dev, PTR_ERR(hwdev), 631 635 "Failed to register hwmon device\n"); 636 + 637 + gpd_init_ec(); 632 638 633 639 return 0; 634 640 }
+7 -4
drivers/infiniband/hw/mlx5/cq.c
··· 1020 1020 if (cq->create_flags & IB_UVERBS_CQ_FLAGS_IGNORE_OVERRUN) 1021 1021 MLX5_SET(cqc, cqc, oi, 1); 1022 1022 1023 + if (udata) { 1024 + cq->mcq.comp = mlx5_add_cq_to_tasklet; 1025 + cq->mcq.tasklet_ctx.comp = mlx5_ib_cq_comp; 1026 + } else { 1027 + cq->mcq.comp = mlx5_ib_cq_comp; 1028 + } 1029 + 1023 1030 err = mlx5_core_create_cq(dev->mdev, &cq->mcq, cqb, inlen, out, sizeof(out)); 1024 1031 if (err) 1025 1032 goto err_cqb; 1026 1033 1027 1034 mlx5_ib_dbg(dev, "cqn 0x%x\n", cq->mcq.cqn); 1028 - if (udata) 1029 - cq->mcq.tasklet_ctx.comp = mlx5_ib_cq_comp; 1030 - else 1031 - cq->mcq.comp = mlx5_ib_cq_comp; 1032 1035 cq->mcq.event = mlx5_ib_cq_event; 1033 1036 1034 1037 INIT_LIST_HEAD(&cq->wc_list);
+2 -1
drivers/irqchip/irq-riscv-intc.c
··· 166 166 static const struct irq_domain_ops riscv_intc_domain_ops = { 167 167 .map = riscv_intc_domain_map, 168 168 .xlate = irq_domain_xlate_onecell, 169 - .alloc = riscv_intc_domain_alloc 169 + .alloc = riscv_intc_domain_alloc, 170 + .free = irq_domain_free_irqs_top, 170 171 }; 171 172 172 173 static struct fwnode_handle *riscv_intc_hwnode(void)
+1 -1
drivers/mmc/host/Kconfig
··· 950 950 config MMC_WMT 951 951 tristate "Wondermedia SD/MMC Host Controller support" 952 952 depends on ARCH_VT8500 || COMPILE_TEST 953 - default y 953 + default ARCH_VT8500 954 954 help 955 955 This selects support for the SD/MMC Host Controller on 956 956 Wondermedia WM8505/WM8650 based SoCs.
+2 -2
drivers/mmc/host/dw_mmc-rockchip.c
··· 42 42 */ 43 43 static int rockchip_mmc_get_internal_phase(struct dw_mci *host, bool sample) 44 44 { 45 - unsigned long rate = clk_get_rate(host->ciu_clk); 45 + unsigned long rate = clk_get_rate(host->ciu_clk) / RK3288_CLKGEN_DIV; 46 46 u32 raw_value; 47 47 u16 degrees; 48 48 u32 delay_num = 0; ··· 85 85 86 86 static int rockchip_mmc_set_internal_phase(struct dw_mci *host, bool sample, int degrees) 87 87 { 88 - unsigned long rate = clk_get_rate(host->ciu_clk); 88 + unsigned long rate = clk_get_rate(host->ciu_clk) / RK3288_CLKGEN_DIV; 89 89 u8 nineties, remainder; 90 90 u8 delay_num; 91 91 u32 raw_value;
+18 -38
drivers/mmc/host/pxamci.c
··· 652 652 host->clkrt = CLKRT_OFF; 653 653 654 654 host->clk = devm_clk_get(dev, NULL); 655 - if (IS_ERR(host->clk)) { 656 - host->clk = NULL; 657 - return PTR_ERR(host->clk); 658 - } 655 + if (IS_ERR(host->clk)) 656 + return dev_err_probe(dev, PTR_ERR(host->clk), 657 + "Failed to acquire clock\n"); 659 658 660 659 host->clkrate = clk_get_rate(host->clk); 661 660 ··· 702 703 703 704 platform_set_drvdata(pdev, mmc); 704 705 705 - host->dma_chan_rx = dma_request_chan(dev, "rx"); 706 - if (IS_ERR(host->dma_chan_rx)) { 707 - host->dma_chan_rx = NULL; 706 + host->dma_chan_rx = devm_dma_request_chan(dev, "rx"); 707 + if (IS_ERR(host->dma_chan_rx)) 708 708 return dev_err_probe(dev, PTR_ERR(host->dma_chan_rx), 709 709 "unable to request rx dma channel\n"); 710 - } 711 710 712 - host->dma_chan_tx = dma_request_chan(dev, "tx"); 713 - if (IS_ERR(host->dma_chan_tx)) { 714 - dev_err(dev, "unable to request tx dma channel\n"); 715 - ret = PTR_ERR(host->dma_chan_tx); 716 - host->dma_chan_tx = NULL; 717 - goto out; 718 - } 711 + 712 + host->dma_chan_tx = devm_dma_request_chan(dev, "tx"); 713 + if (IS_ERR(host->dma_chan_tx)) 714 + return dev_err_probe(dev, PTR_ERR(host->dma_chan_tx), 715 + "unable to request tx dma channel\n"); 719 716 720 717 if (host->pdata) { 721 718 host->detect_delay_ms = host->pdata->detect_delay_ms; 722 719 723 720 host->power = devm_gpiod_get_optional(dev, "power", GPIOD_OUT_LOW); 724 - if (IS_ERR(host->power)) { 725 - ret = PTR_ERR(host->power); 726 - dev_err(dev, "Failed requesting gpio_power\n"); 727 - goto out; 728 - } 721 + if (IS_ERR(host->power)) 722 + return dev_err_probe(dev, PTR_ERR(host->power), 723 + "Failed requesting gpio_power\n"); 729 724 730 725 /* FIXME: should we pass detection delay to debounce? 
*/ 731 726 ret = mmc_gpiod_request_cd(mmc, "cd", 0, false, 0); 732 - if (ret && ret != -ENOENT) { 733 - dev_err(dev, "Failed requesting gpio_cd\n"); 734 - goto out; 735 - } 727 + if (ret && ret != -ENOENT) 728 + return dev_err_probe(dev, ret, "Failed requesting gpio_cd\n"); 736 729 737 730 if (!host->pdata->gpio_card_ro_invert) 738 731 mmc->caps2 |= MMC_CAP2_RO_ACTIVE_HIGH; 739 732 740 733 ret = mmc_gpiod_request_ro(mmc, "wp", 0, 0); 741 - if (ret && ret != -ENOENT) { 742 - dev_err(dev, "Failed requesting gpio_ro\n"); 743 - goto out; 744 - } 734 + if (ret && ret != -ENOENT) 735 + return dev_err_probe(dev, ret, "Failed requesting gpio_ro\n"); 736 + 745 737 if (!ret) 746 738 host->use_ro_gpio = true; 747 739 ··· 749 759 if (ret) { 750 760 if (host->pdata && host->pdata->exit) 751 761 host->pdata->exit(dev, mmc); 752 - goto out; 753 762 } 754 763 755 - return 0; 756 - 757 - out: 758 - if (host->dma_chan_rx) 759 - dma_release_channel(host->dma_chan_rx); 760 - if (host->dma_chan_tx) 761 - dma_release_channel(host->dma_chan_tx); 762 764 return ret; 763 765 } 764 766 ··· 773 791 774 792 dmaengine_terminate_all(host->dma_chan_rx); 775 793 dmaengine_terminate_all(host->dma_chan_tx); 776 - dma_release_channel(host->dma_chan_rx); 777 - dma_release_channel(host->dma_chan_tx); 778 794 } 779 795 } 780 796
+1 -1
drivers/mmc/host/sdhci-of-dwcmshc.c
··· 94 94 #define DLL_TXCLK_TAPNUM_DEFAULT 0x10 95 95 #define DLL_TXCLK_TAPNUM_90_DEGREES 0xA 96 96 #define DLL_TXCLK_TAPNUM_FROM_SW BIT(24) 97 - #define DLL_STRBIN_TAPNUM_DEFAULT 0x8 97 + #define DLL_STRBIN_TAPNUM_DEFAULT 0x4 98 98 #define DLL_STRBIN_TAPNUM_FROM_SW BIT(24) 99 99 #define DLL_STRBIN_DELAY_NUM_SEL BIT(26) 100 100 #define DLL_STRBIN_DELAY_NUM_OFFSET 16
+3 -2
drivers/net/bonding/bond_main.c
··· 2120 2120 /* check for initial state */ 2121 2121 new_slave->link = BOND_LINK_NOCHANGE; 2122 2122 if (bond->params.miimon) { 2123 - if (netif_carrier_ok(slave_dev)) { 2123 + if (netif_running(slave_dev) && netif_carrier_ok(slave_dev)) { 2124 2124 if (bond->params.updelay) { 2125 2125 bond_set_slave_link_state(new_slave, 2126 2126 BOND_LINK_BACK, ··· 2665 2665 bond_for_each_slave_rcu(bond, slave, iter) { 2666 2666 bond_propose_link_state(slave, BOND_LINK_NOCHANGE); 2667 2667 2668 - link_state = netif_carrier_ok(slave->dev); 2668 + link_state = netif_running(slave->dev) && 2669 + netif_carrier_ok(slave->dev); 2669 2670 2670 2671 switch (slave->link) { 2671 2672 case BOND_LINK_UP:
+2
drivers/net/ethernet/freescale/fec_main.c
··· 1835 1835 ndev->stats.rx_packets++; 1836 1836 pkt_len = fec16_to_cpu(bdp->cbd_datlen); 1837 1837 ndev->stats.rx_bytes += pkt_len; 1838 + if (fep->quirks & FEC_QUIRK_HAS_RACC) 1839 + ndev->stats.rx_bytes -= 2; 1838 1840 1839 1841 index = fec_enet_get_bd_index(bdp, &rxq->bd); 1840 1842 page = rxq->rx_skb_info[index].page;
+20 -3
drivers/net/ethernet/mellanox/mlx5/core/cq.c
··· 66 66 tasklet_schedule(&ctx->task); 67 67 } 68 68 69 - static void mlx5_add_cq_to_tasklet(struct mlx5_core_cq *cq, 70 - struct mlx5_eqe *eqe) 69 + void mlx5_add_cq_to_tasklet(struct mlx5_core_cq *cq, 70 + struct mlx5_eqe *eqe) 71 71 { 72 72 unsigned long flags; 73 73 struct mlx5_eq_tasklet *tasklet_ctx = cq->tasklet_ctx.priv; ··· 95 95 if (schedule_tasklet) 96 96 tasklet_schedule(&tasklet_ctx->task); 97 97 } 98 + EXPORT_SYMBOL(mlx5_add_cq_to_tasklet); 98 99 100 + static void mlx5_core_cq_dummy_cb(struct mlx5_core_cq *cq, struct mlx5_eqe *eqe) 101 + { 102 + mlx5_core_err(cq->eq->core.dev, 103 + "CQ default completion callback, CQ #%u\n", cq->cqn); 104 + } 105 + 106 + #define MLX5_CQ_INIT_CMD_SN cpu_to_be32(2 << 28) 99 107 /* Callers must verify outbox status in case of err */ 100 108 int mlx5_create_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq, 101 109 u32 *in, int inlen, u32 *out, int outlen) ··· 129 121 cq->arm_sn = 0; 130 122 cq->eq = eq; 131 123 cq->uid = MLX5_GET(create_cq_in, in, uid); 124 + 125 + /* Kernel CQs must set the arm_db address prior to calling 126 + * this function, allowing for the proper value to be 127 + * initialized. User CQs are responsible for their own 128 + * initialization since they do not use the arm_db field. 129 + */ 130 + if (cq->arm_db) 131 + *cq->arm_db = MLX5_CQ_INIT_CMD_SN; 132 + 132 133 refcount_set(&cq->refcount, 1); 133 134 init_completion(&cq->free); 134 135 if (!cq->comp) 135 - cq->comp = mlx5_add_cq_to_tasklet; 136 + cq->comp = mlx5_core_cq_dummy_cb; 136 137 /* assuming CQ will be deleted before the EQ */ 137 138 cq->tasklet_ctx.priv = &eq->tasklet_ctx; 138 139 INIT_LIST_HEAD(&cq->tasklet_ctx.list);
+1 -1
drivers/net/ethernet/mellanox/mlx5/core/devlink.c
··· 541 541 max_num_channels = mlx5e_get_max_num_channels(mdev); 542 542 if (val32 > max_num_channels) { 543 543 NL_SET_ERR_MSG_FMT_MOD(extack, 544 - "Requested num_doorbells (%u) exceeds maximum number of channels (%u)", 544 + "Requested num_doorbells (%u) exceeds max number of channels (%u)", 545 545 val32, max_num_channels); 546 546 return -EINVAL; 547 547 }
+2 -1
drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
··· 804 804 goto err_xfrm; 805 805 } 806 806 807 - if (mlx5_eswitch_block_mode(priv->mdev)) 807 + err = mlx5_eswitch_block_mode(priv->mdev); 808 + if (err) 808 809 goto unblock_ipsec; 809 810 810 811 if (x->props.mode == XFRM_MODE_TUNNEL &&
+28 -5
drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
··· 595 595 struct mlx5_core_dev *mdev = priv->mdev; 596 596 u8 max_bw_value[IEEE_8021QAZ_MAX_TCS]; 597 597 u8 max_bw_unit[IEEE_8021QAZ_MAX_TCS]; 598 - __u64 upper_limit_mbps = roundup(255 * MLX5E_100MB, MLX5E_1GB); 598 + __u64 upper_limit_mbps; 599 + __u64 upper_limit_gbps; 599 600 int i; 601 + struct { 602 + int scale; 603 + const char *units_str; 604 + } units[] = { 605 + [MLX5_100_MBPS_UNIT] = { 606 + .scale = 100, 607 + .units_str = "Mbps", 608 + }, 609 + [MLX5_GBPS_UNIT] = { 610 + .scale = 1, 611 + .units_str = "Gbps", 612 + }, 613 + }; 600 614 601 615 memset(max_bw_value, 0, sizeof(max_bw_value)); 602 616 memset(max_bw_unit, 0, sizeof(max_bw_unit)); 617 + upper_limit_mbps = 255 * MLX5E_100MB; 618 + upper_limit_gbps = 255 * MLX5E_1GB; 603 619 604 620 for (i = 0; i <= mlx5_max_tc(mdev); i++) { 605 621 if (!maxrate->tc_maxrate[i]) { 606 622 max_bw_unit[i] = MLX5_BW_NO_LIMIT; 607 623 continue; 608 624 } 609 - if (maxrate->tc_maxrate[i] < upper_limit_mbps) { 625 + if (maxrate->tc_maxrate[i] <= upper_limit_mbps) { 610 626 max_bw_value[i] = div_u64(maxrate->tc_maxrate[i], 611 627 MLX5E_100MB); 612 628 max_bw_value[i] = max_bw_value[i] ? 
max_bw_value[i] : 1; 613 629 max_bw_unit[i] = MLX5_100_MBPS_UNIT; 614 - } else { 630 + } else if (max_bw_value[i] <= upper_limit_gbps) { 615 631 max_bw_value[i] = div_u64(maxrate->tc_maxrate[i], 616 632 MLX5E_1GB); 617 633 max_bw_unit[i] = MLX5_GBPS_UNIT; 634 + } else { 635 + netdev_err(netdev, 636 + "tc_%d maxrate %llu Kbps exceeds limit %llu\n", 637 + i, maxrate->tc_maxrate[i], 638 + upper_limit_gbps); 639 + return -EINVAL; 618 640 } 619 641 } 620 642 621 643 for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) { 622 - netdev_dbg(netdev, "%s: tc_%d <=> max_bw %d Gbps\n", 623 - __func__, i, max_bw_value[i]); 644 + netdev_dbg(netdev, "%s: tc_%d <=> max_bw %u %s\n", __func__, i, 645 + max_bw_value[i] * units[max_bw_unit[i]].scale, 646 + units[max_bw_unit[i]].units_str); 624 647 } 625 648 626 649 return mlx5_modify_port_ets_rate_limit(mdev, max_bw_value, max_bw_unit);
-1
drivers/net/ethernet/mellanox/mlx5/core/en_main.c
··· 2219 2219 mcq->set_ci_db = cq->wq_ctrl.db.db; 2220 2220 mcq->arm_db = cq->wq_ctrl.db.db + 1; 2221 2221 *mcq->set_ci_db = 0; 2222 - *mcq->arm_db = 0; 2223 2222 mcq->vector = param->eq_ix; 2224 2223 mcq->comp = mlx5e_completion_event; 2225 2224 mcq->event = mlx5e_cq_error_event;
+7 -8
drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.c
··· 421 421 __be64 *pas; 422 422 u32 i; 423 423 424 + conn->cq.mcq.cqe_sz = 64; 425 + conn->cq.mcq.set_ci_db = conn->cq.wq_ctrl.db.db; 426 + conn->cq.mcq.arm_db = conn->cq.wq_ctrl.db.db + 1; 427 + *conn->cq.mcq.set_ci_db = 0; 428 + conn->cq.mcq.vector = 0; 429 + conn->cq.mcq.comp = mlx5_fpga_conn_cq_complete; 430 + 424 431 cq_size = roundup_pow_of_two(cq_size); 425 432 MLX5_SET(cqc, temp_cqc, log_cq_size, ilog2(cq_size)); 426 433 ··· 475 468 if (err) 476 469 goto err_cqwq; 477 470 478 - conn->cq.mcq.cqe_sz = 64; 479 - conn->cq.mcq.set_ci_db = conn->cq.wq_ctrl.db.db; 480 - conn->cq.mcq.arm_db = conn->cq.wq_ctrl.db.db + 1; 481 - *conn->cq.mcq.set_ci_db = 0; 482 - *conn->cq.mcq.arm_db = 0; 483 - conn->cq.mcq.vector = 0; 484 - conn->cq.mcq.comp = mlx5_fpga_conn_cq_complete; 485 471 tasklet_setup(&conn->cq.tasklet, mlx5_fpga_conn_cq_tasklet); 486 - 487 472 mlx5_fpga_dbg(fdev, "Created CQ #0x%x\n", conn->cq.mcq.cqn); 488 473 489 474 goto out;
-7
drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c
··· 873 873 return err; 874 874 } 875 875 876 - static void hws_cq_complete(struct mlx5_core_cq *mcq, 877 - struct mlx5_eqe *eqe) 878 - { 879 - pr_err("CQ completion CQ: #%u\n", mcq->cqn); 880 - } 881 - 882 876 static int hws_send_ring_alloc_cq(struct mlx5_core_dev *mdev, 883 877 int numa_node, 884 878 struct mlx5hws_send_engine *queue, ··· 895 901 mcq->cqe_sz = 64; 896 902 mcq->set_ci_db = cq->wq_ctrl.db.db; 897 903 mcq->arm_db = cq->wq_ctrl.db.db + 1; 898 - mcq->comp = hws_cq_complete; 899 904 900 905 for (i = 0; i < mlx5_cqwq_get_size(&cq->wq); i++) { 901 906 cqe = mlx5_cqwq_get_wqe(&cq->wq, i);
+7 -21
drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_send.c
··· 1049 1049 return 0; 1050 1050 } 1051 1051 1052 - static void dr_cq_complete(struct mlx5_core_cq *mcq, 1053 - struct mlx5_eqe *eqe) 1054 - { 1055 - pr_err("CQ completion CQ: #%u\n", mcq->cqn); 1056 - } 1057 - 1058 1052 static struct mlx5dr_cq *dr_create_cq(struct mlx5_core_dev *mdev, 1059 1053 struct mlx5_uars_page *uar, 1060 1054 size_t ncqe) ··· 1083 1089 cqe->op_own = MLX5_CQE_INVALID << 4 | MLX5_CQE_OWNER_MASK; 1084 1090 } 1085 1091 1092 + cq->mcq.cqe_sz = 64; 1093 + cq->mcq.set_ci_db = cq->wq_ctrl.db.db; 1094 + cq->mcq.arm_db = cq->wq_ctrl.db.db + 1; 1095 + *cq->mcq.set_ci_db = 0; 1096 + cq->mcq.vector = 0; 1097 + cq->mdev = mdev; 1098 + 1086 1099 inlen = MLX5_ST_SZ_BYTES(create_cq_in) + 1087 1100 sizeof(u64) * cq->wq_ctrl.buf.npages; 1088 1101 in = kvzalloc(inlen, GFP_KERNEL); ··· 1113 1112 pas = (__be64 *)MLX5_ADDR_OF(create_cq_in, in, pas); 1114 1113 mlx5_fill_page_frag_array(&cq->wq_ctrl.buf, pas); 1115 1114 1116 - cq->mcq.comp = dr_cq_complete; 1117 - 1118 1115 err = mlx5_core_create_cq(mdev, &cq->mcq, in, inlen, out, sizeof(out)); 1119 1116 kvfree(in); 1120 1117 1121 1118 if (err) 1122 1119 goto err_cqwq; 1123 - 1124 - cq->mcq.cqe_sz = 64; 1125 - cq->mcq.set_ci_db = cq->wq_ctrl.db.db; 1126 - cq->mcq.arm_db = cq->wq_ctrl.db.db + 1; 1127 - *cq->mcq.set_ci_db = 0; 1128 - 1129 - /* set no-zero value, in order to avoid the HW to run db-recovery on 1130 - * CQ that used in polling mode. 1131 - */ 1132 - *cq->mcq.arm_db = cpu_to_be32(2 << 28); 1133 - 1134 - cq->mcq.vector = 0; 1135 - cq->mdev = mdev; 1136 1120 1137 1121 return cq; 1138 1122
+38 -15
drivers/net/ethernet/ti/am65-cpsw-qos.c
··· 276 276 /* The number of wireside clocks contained in the verify 277 277 * timeout counter. The default is 0x1312d0 278 278 * (10ms at 125Mhz in 1G mode). 279 + * The frequency of the clock depends on the link speed 280 + * and the PHY interface. 279 281 */ 280 - val = 125 * HZ_PER_MHZ; /* assuming 125MHz wireside clock */ 282 + switch (port->slave.phy_if) { 283 + case PHY_INTERFACE_MODE_RGMII: 284 + case PHY_INTERFACE_MODE_RGMII_ID: 285 + case PHY_INTERFACE_MODE_RGMII_RXID: 286 + case PHY_INTERFACE_MODE_RGMII_TXID: 287 + if (port->qos.link_speed == SPEED_1000) 288 + val = 125 * HZ_PER_MHZ; /* 125 MHz at 1000Mbps*/ 289 + else if (port->qos.link_speed == SPEED_100) 290 + val = 25 * HZ_PER_MHZ; /* 25 MHz at 100Mbps*/ 291 + else 292 + val = (25 * HZ_PER_MHZ) / 10; /* 2.5 MHz at 10Mbps*/ 293 + break; 281 294 295 + case PHY_INTERFACE_MODE_QSGMII: 296 + case PHY_INTERFACE_MODE_SGMII: 297 + val = 125 * HZ_PER_MHZ; /* 125 MHz */ 298 + break; 299 + 300 + default: 301 + netdev_err(port->ndev, "selected mode does not supported IET\n"); 302 + return -EOPNOTSUPP; 303 + } 282 304 val /= MILLIHZ_PER_HZ; /* count per ms timeout */ 283 305 val *= verify_time_ms; /* count for timeout ms */ 284 306 ··· 317 295 u32 ctrl, status; 318 296 int try; 319 297 320 - try = 20; 298 + try = 3; 299 + 300 + /* Reset the verify state machine by writing 1 301 + * to LINKFAIL 302 + */ 303 + ctrl = readl(port->port_base + AM65_CPSW_PN_REG_IET_CTRL); 304 + ctrl |= AM65_CPSW_PN_IET_MAC_LINKFAIL; 305 + writel(ctrl, port->port_base + AM65_CPSW_PN_REG_IET_CTRL); 306 + 307 + /* Clear MAC_LINKFAIL bit to start Verify. 
*/ 308 + ctrl = readl(port->port_base + AM65_CPSW_PN_REG_IET_CTRL); 309 + ctrl &= ~AM65_CPSW_PN_IET_MAC_LINKFAIL; 310 + writel(ctrl, port->port_base + AM65_CPSW_PN_REG_IET_CTRL); 311 + 321 312 do { 322 - /* Reset the verify state machine by writing 1 323 - * to LINKFAIL 324 - */ 325 - ctrl = readl(port->port_base + AM65_CPSW_PN_REG_IET_CTRL); 326 - ctrl |= AM65_CPSW_PN_IET_MAC_LINKFAIL; 327 - writel(ctrl, port->port_base + AM65_CPSW_PN_REG_IET_CTRL); 328 - 329 - /* Clear MAC_LINKFAIL bit to start Verify. */ 330 - ctrl = readl(port->port_base + AM65_CPSW_PN_REG_IET_CTRL); 331 - ctrl &= ~AM65_CPSW_PN_IET_MAC_LINKFAIL; 332 - writel(ctrl, port->port_base + AM65_CPSW_PN_REG_IET_CTRL); 333 - 334 313 msleep(port->qos.iet.verify_time_ms); 335 314 336 315 status = readl(port->port_base + AM65_CPSW_PN_REG_IET_STATUS); ··· 353 330 netdev_dbg(port->ndev, "MAC Merge verify error\n"); 354 331 return -ENODEV; 355 332 } 356 - } while (try-- > 0); 333 + } while (--try > 0); 357 334 358 335 netdev_dbg(port->ndev, "MAC Merge verify timeout\n"); 359 336 return -ETIMEDOUT;
+4 -1
drivers/net/phy/mdio_bus.c
··· 73 73 return err; 74 74 75 75 err = mdiobus_register_reset(mdiodev); 76 - if (err) 76 + if (err) { 77 + gpiod_put(mdiodev->reset_gpio); 78 + mdiodev->reset_gpio = NULL; 77 79 return err; 80 + } 78 81 79 82 /* Assert the reset signal */ 80 83 mdio_device_reset(mdiodev, 1);
+6 -6
drivers/net/phy/micrel.c
··· 4380 4380 { 4381 4381 struct kszphy_priv *lan8814 = phydev->priv; 4382 4382 4383 - /* Reset the PHY */ 4384 - lanphy_modify_page_reg(phydev, LAN8814_PAGE_COMMON_REGS, 4385 - LAN8814_QSGMII_SOFT_RESET, 4386 - LAN8814_QSGMII_SOFT_RESET_BIT, 4387 - LAN8814_QSGMII_SOFT_RESET_BIT); 4388 - 4389 4383 /* Disable ANEG with QSGMII PCS Host side */ 4390 4384 lanphy_modify_page_reg(phydev, LAN8814_PAGE_PORT_REGS, 4391 4385 LAN8814_QSGMII_PCS1G_ANEG_CONFIG, ··· 4465 4471 addr, sizeof(struct lan8814_shared_priv)); 4466 4472 4467 4473 if (phy_package_init_once(phydev)) { 4474 + /* Reset the PHY */ 4475 + lanphy_modify_page_reg(phydev, LAN8814_PAGE_COMMON_REGS, 4476 + LAN8814_QSGMII_SOFT_RESET, 4477 + LAN8814_QSGMII_SOFT_RESET_BIT, 4478 + LAN8814_QSGMII_SOFT_RESET_BIT); 4479 + 4468 4480 err = lan8814_release_coma_mode(phydev); 4469 4481 if (err) 4470 4482 return err;
+11 -5
drivers/net/virtio_net.c
··· 2631 2631 return; 2632 2632 } 2633 2633 2634 - /* 1. Save the flags early, as the XDP program might overwrite them. 2634 + /* About the flags below: 2635 + * 1. Save the flags early, as the XDP program might overwrite them. 2635 2636 * These flags ensure packets marked as VIRTIO_NET_HDR_F_DATA_VALID 2636 2637 * stay valid after XDP processing. 2637 2638 * 2. XDP doesn't work with partially checksummed packets (refer to 2638 2639 * virtnet_xdp_set()), so packets marked as 2639 2640 * VIRTIO_NET_HDR_F_NEEDS_CSUM get dropped during XDP processing. 2640 2641 */ 2641 - flags = ((struct virtio_net_common_hdr *)buf)->hdr.flags; 2642 2642 2643 - if (vi->mergeable_rx_bufs) 2643 + if (vi->mergeable_rx_bufs) { 2644 + flags = ((struct virtio_net_common_hdr *)buf)->hdr.flags; 2644 2645 skb = receive_mergeable(dev, vi, rq, buf, ctx, len, xdp_xmit, 2645 2646 stats); 2646 - else if (vi->big_packets) 2647 + } else if (vi->big_packets) { 2648 + void *p = page_address((struct page *)buf); 2649 + 2650 + flags = ((struct virtio_net_common_hdr *)p)->hdr.flags; 2647 2651 skb = receive_big(dev, vi, rq, buf, len, stats); 2648 - else 2652 + } else { 2653 + flags = ((struct virtio_net_common_hdr *)buf)->hdr.flags; 2649 2654 skb = receive_small(dev, vi, rq, buf, ctx, len, xdp_xmit, stats); 2655 + } 2650 2656 2651 2657 if (unlikely(!skb)) 2652 2658 return;
+3
drivers/net/wireless/ath/ath11k/wmi.c
··· 5961 5961 dma_unmap_single(ar->ab->dev, skb_cb->paddr, msdu->len, DMA_TO_DEVICE); 5962 5962 5963 5963 info = IEEE80211_SKB_CB(msdu); 5964 + memset(&info->status, 0, sizeof(info->status)); 5965 + info->status.rates[0].idx = -1; 5966 + 5964 5967 if ((!(info->flags & IEEE80211_TX_CTL_NO_ACK)) && 5965 5968 !tx_compl_param->status) { 5966 5969 info->flags |= IEEE80211_TX_STAT_ACK;
+1 -6
drivers/net/wireless/intel/iwlwifi/mld/link.c
··· 708 708 iwl_mld_get_chan_load_from_element(struct iwl_mld *mld, 709 709 struct ieee80211_bss_conf *link_conf) 710 710 { 711 - struct ieee80211_vif *vif = link_conf->vif; 712 711 const struct cfg80211_bss_ies *ies; 713 712 const struct element *bss_load_elem = NULL; 714 713 const struct ieee80211_bss_load_elem *bss_load; 715 714 716 715 guard(rcu)(); 717 716 718 - if (ieee80211_vif_link_active(vif, link_conf->link_id)) 719 - ies = rcu_dereference(link_conf->bss->beacon_ies); 720 - else 721 - ies = rcu_dereference(link_conf->bss->ies); 722 - 717 + ies = rcu_dereference(link_conf->bss->beacon_ies); 723 718 if (ies) 724 719 bss_load_elem = cfg80211_find_elem(WLAN_EID_QBSS_LOAD, 725 720 ies->data, ies->len);
+3 -10
drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c
··· 938 938 939 939 u16 iwl_mvm_mac_ctxt_get_beacon_flags(const struct iwl_fw *fw, u8 rate_idx) 940 940 { 941 + u16 flags = iwl_mvm_mac80211_idx_to_hwrate(fw, rate_idx); 941 942 bool is_new_rate = iwl_fw_lookup_cmd_ver(fw, BEACON_TEMPLATE_CMD, 0) > 10; 942 - u16 flags, cck_flag; 943 - 944 - if (is_new_rate) { 945 - flags = iwl_mvm_mac80211_idx_to_hwrate(fw, rate_idx); 946 - cck_flag = IWL_MAC_BEACON_CCK; 947 - } else { 948 - cck_flag = IWL_MAC_BEACON_CCK_V1; 949 - flags = iwl_fw_rate_idx_to_plcp(rate_idx); 950 - } 951 943 952 944 if (rate_idx <= IWL_LAST_CCK_RATE) 953 - flags |= cck_flag; 945 + flags |= is_new_rate ? IWL_MAC_BEACON_CCK 946 + : IWL_MAC_BEACON_CCK_V1; 954 947 955 948 return flags; 956 949 }
+7 -7
drivers/net/wireless/intel/iwlwifi/mvm/time-event.c
··· 463 463 if (!aux_roc_te) /* Not a Aux ROC time event */ 464 464 return -EINVAL; 465 465 466 - iwl_mvm_te_check_trigger(mvm, notif, te_data); 466 + iwl_mvm_te_check_trigger(mvm, notif, aux_roc_te); 467 467 468 468 IWL_DEBUG_TE(mvm, 469 469 "Aux ROC time event notification - UID = 0x%x action %d (error = %d)\n", ··· 475 475 /* End TE, notify mac80211 */ 476 476 ieee80211_remain_on_channel_expired(mvm->hw); 477 477 iwl_mvm_roc_finished(mvm); /* flush aux queue */ 478 - list_del(&te_data->list); /* remove from list */ 479 - te_data->running = false; 480 - te_data->vif = NULL; 481 - te_data->uid = 0; 482 - te_data->id = TE_MAX; 478 + list_del(&aux_roc_te->list); /* remove from list */ 479 + aux_roc_te->running = false; 480 + aux_roc_te->vif = NULL; 481 + aux_roc_te->uid = 0; 482 + aux_roc_te->id = TE_MAX; 483 483 } else if (le32_to_cpu(notif->action) == TE_V2_NOTIF_HOST_EVENT_START) { 484 484 set_bit(IWL_MVM_STATUS_ROC_AUX_RUNNING, &mvm->status); 485 - te_data->running = true; 485 + aux_roc_te->running = true; 486 486 ieee80211_ready_on_channel(mvm->hw); /* Start TE */ 487 487 } else { 488 488 IWL_DEBUG_TE(mvm,
+9 -3
drivers/net/wireless/intel/iwlwifi/mvm/utils.c
··· 159 159 160 160 u8 iwl_mvm_mac80211_idx_to_hwrate(const struct iwl_fw *fw, int rate_idx) 161 161 { 162 - return (rate_idx >= IWL_FIRST_OFDM_RATE ? 163 - rate_idx - IWL_FIRST_OFDM_RATE : 164 - rate_idx); 162 + if (iwl_fw_lookup_cmd_ver(fw, TX_CMD, 0) > 8) 163 + /* In the new rate legacy rates are indexed: 164 + * 0 - 3 for CCK and 0 - 7 for OFDM. 165 + */ 166 + return (rate_idx >= IWL_FIRST_OFDM_RATE ? 167 + rate_idx - IWL_FIRST_OFDM_RATE : 168 + rate_idx); 169 + 170 + return iwl_fw_rate_idx_to_plcp(rate_idx); 165 171 } 166 172 167 173 u8 iwl_mvm_mac80211_ac_to_ucode_ac(enum ieee80211_ac_numbers ac)
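The split in `iwl_mvm_mac80211_idx_to_hwrate` relies on mac80211's legacy rate table laying out CCK before OFDM, so the new-API index is just the mac80211 index rebased per modulation family. A minimal sketch of that mapping (the value `IWL_FIRST_OFDM_RATE == 4` is an assumption taken from the driver's rate tables, not stated in this diff):

```c
#include <assert.h>

#define IWL_FIRST_OFDM_RATE 4	/* assumption: CCK rates occupy indexes 0..3 */

/* New rate API: legacy rates are indexed 0-3 for CCK and 0-7 for OFDM. */
static int idx_to_fw_rate(int rate_idx)
{
	return rate_idx >= IWL_FIRST_OFDM_RATE ?
	       rate_idx - IWL_FIRST_OFDM_RATE : rate_idx;
}
```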
+66 -5
drivers/net/wireless/marvell/mwl8k.c
··· 2966 2966 /*
2967 2967 * CMD_SET_BEACON.
2968 2968 */
2969 +
2970 + static bool mwl8k_beacon_has_ds_params(const u8 *buf, int len)
2971 + {
2972 + const struct ieee80211_mgmt *mgmt = (const void *)buf;
2973 + int ies_len;
2974 +
2975 + if (len <= offsetof(struct ieee80211_mgmt, u.beacon.variable))
2976 + return false;
2977 +
2978 + ies_len = len - offsetof(struct ieee80211_mgmt, u.beacon.variable);
2979 +
2980 + return cfg80211_find_ie(WLAN_EID_DS_PARAMS, mgmt->u.beacon.variable,
2981 + ies_len) != NULL;
2982 + }
2983 +
2984 + static void mwl8k_beacon_copy_inject_ds_params(struct ieee80211_hw *hw,
2985 + u8 *buf_dst, const u8 *buf_src,
2986 + int src_len)
2987 + {
2988 + const struct ieee80211_mgmt *mgmt = (const void *)buf_src;
2989 + static const u8 before_ds_params[] = {
2990 + WLAN_EID_SSID,
2991 + WLAN_EID_SUPP_RATES,
2992 + };
2993 + const u8 *ies;
2994 + int hdr_len, left, offs, pos;
2995 +
2996 + ies = mgmt->u.beacon.variable;
2997 + hdr_len = offsetof(struct ieee80211_mgmt, u.beacon.variable);
2998 +
2999 + offs = ieee80211_ie_split(ies, src_len - hdr_len, before_ds_params,
3000 + ARRAY_SIZE(before_ds_params), 0);
3001 +
3002 + pos = hdr_len + offs;
3003 + left = src_len - pos;
3004 +
3005 + memcpy(buf_dst, buf_src, pos);
3006 +
3007 + /* Inject a DSSS Parameter Set after SSID + Supp Rates */
3008 + buf_dst[pos + 0] = WLAN_EID_DS_PARAMS;
3009 + buf_dst[pos + 1] = 1;
3010 + buf_dst[pos + 2] = hw->conf.chandef.chan->hw_value;
3011 +
3012 + memcpy(buf_dst + pos + 3, buf_src + pos, left);
3013 + }
2969 3014 struct mwl8k_cmd_set_beacon {
2970 3015 struct mwl8k_cmd_pkt_hdr header;
2971 3016 __le16 beacon_len;
··· 3020 2975 static int mwl8k_cmd_set_beacon(struct ieee80211_hw *hw,
3021 2976 struct ieee80211_vif *vif, u8 *beacon, int len)
3022 2977 {
2978 + bool ds_params_present = mwl8k_beacon_has_ds_params(beacon, len);
3023 2979 struct mwl8k_cmd_set_beacon *cmd;
3024 - int rc;
2980 + int rc, final_len = len;
3025 2981
3026 - cmd = kzalloc(sizeof(*cmd) + len, GFP_KERNEL);
2982 + if (!ds_params_present) {
2983 + /*
2984 + * mwl8k firmware requires a DS Params IE with the current
2985 + * channel in AP beacons. If mac80211/hostapd does not
2986 + * include it, inject one here. IE ID + length + channel
2987 + * number = 3 bytes.
2988 + */
2989 + final_len += 3;
2990 + }
2991 +
2992 + cmd = kzalloc(sizeof(*cmd) + final_len, GFP_KERNEL);
3027 2993 if (cmd == NULL)
3028 2994 return -ENOMEM;
3029 2995
3030 2996 cmd->header.code = cpu_to_le16(MWL8K_CMD_SET_BEACON);
3031 - cmd->header.length = cpu_to_le16(sizeof(*cmd) + len);
3032 - cmd->beacon_len = cpu_to_le16(len);
3033 - memcpy(cmd->beacon, beacon, len);
2997 + cmd->header.length = cpu_to_le16(sizeof(*cmd) + final_len);
2998 + cmd->beacon_len = cpu_to_le16(final_len);
2999 +
3000 + if (ds_params_present)
3001 + memcpy(cmd->beacon, beacon, len);
3002 + else
3003 + mwl8k_beacon_copy_inject_ds_params(hw, cmd->beacon, beacon,
3004 + len);
3034 3005
3035 3006 rc = mwl8k_post_pervif_cmd(hw, vif, &cmd->header);
3036 3007 kfree(cmd);
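The DS Params injection above boils down to splicing a three-byte element (ID, length, channel) into the beacon at the split point returned by `ieee80211_ie_split`. A userspace sketch of just that splice (the `WLAN_EID_DS_PARAMS` value of 3 comes from IEEE 802.11 element numbering; the buffer layout here is illustrative):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WLAN_EID_DS_PARAMS 3	/* IEEE 802.11 DSSS Parameter Set element ID */

/* Copy src into dst, injecting a one-byte DS Params element at pos;
 * dst must have room for src_len + 3 bytes. */
static void inject_ds_params(uint8_t *dst, const uint8_t *src, int src_len,
			     int pos, uint8_t channel)
{
	memcpy(dst, src, pos);
	dst[pos + 0] = WLAN_EID_DS_PARAMS;
	dst[pos + 1] = 1;	/* element payload length */
	dst[pos + 2] = channel;
	memcpy(dst + pos + 3, src + pos, src_len - pos);
}
```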
+10 -4
drivers/net/wireless/virtual/mac80211_hwsim.c
··· 2003 2003 struct ieee80211_sta *sta = control->sta; 2004 2004 struct ieee80211_bss_conf *bss_conf; 2005 2005 2006 + /* This can happen in case of monitor injection */ 2007 + if (!vif) { 2008 + ieee80211_free_txskb(hw, skb); 2009 + return; 2010 + } 2011 + 2006 2012 if (link != IEEE80211_LINK_UNSPECIFIED) { 2007 - bss_conf = rcu_dereference(txi->control.vif->link_conf[link]); 2013 + bss_conf = rcu_dereference(vif->link_conf[link]); 2008 2014 if (sta) 2009 2015 link_sta = rcu_dereference(sta->link[link]); 2010 2016 } else { ··· 2071 2065 return; 2072 2066 } 2073 2067 2074 - if (txi->control.vif) 2075 - hwsim_check_magic(txi->control.vif); 2068 + if (vif) 2069 + hwsim_check_magic(vif); 2076 2070 if (control->sta) 2077 2071 hwsim_check_sta_magic(control->sta); 2078 2072 2079 2073 if (ieee80211_hw_check(hw, SUPPORTS_RC_TABLE)) 2080 - ieee80211_get_tx_rates(txi->control.vif, control->sta, skb, 2074 + ieee80211_get_tx_rates(vif, control->sta, skb, 2081 2075 txi->control.rates, 2082 2076 ARRAY_SIZE(txi->control.rates)); 2083 2077
+2
drivers/pci/pci.h
··· 958 958 void pci_restore_aspm_l1ss_state(struct pci_dev *dev); 959 959 960 960 #ifdef CONFIG_PCIEASPM 961 + void pcie_aspm_remove_cap(struct pci_dev *pdev, u32 lnkcap); 961 962 void pcie_aspm_init_link_state(struct pci_dev *pdev); 962 963 void pcie_aspm_exit_link_state(struct pci_dev *pdev); 963 964 void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked); ··· 966 965 void pci_configure_ltr(struct pci_dev *pdev); 967 966 void pci_bridge_reconfigure_ltr(struct pci_dev *pdev); 968 967 #else 968 + static inline void pcie_aspm_remove_cap(struct pci_dev *pdev, u32 lnkcap) { } 969 969 static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { } 970 970 static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { } 971 971 static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked) { }
+17 -8
drivers/pci/pcie/aspm.c
··· 814 814 static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist) 815 815 { 816 816 struct pci_dev *child = link->downstream, *parent = link->pdev; 817 - u32 parent_lnkcap, child_lnkcap; 818 817 u16 parent_lnkctl, child_lnkctl; 819 818 struct pci_bus *linkbus = parent->subordinate; 820 819 ··· 828 829 * If ASPM not supported, don't mess with the clocks and link, 829 830 * bail out now. 830 831 */ 831 - pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, &parent_lnkcap); 832 - pcie_capability_read_dword(child, PCI_EXP_LNKCAP, &child_lnkcap); 833 - if (!(parent_lnkcap & child_lnkcap & PCI_EXP_LNKCAP_ASPMS)) 832 + if (!(parent->aspm_l0s_support && child->aspm_l0s_support) && 833 + !(parent->aspm_l1_support && child->aspm_l1_support)) 834 834 return; 835 835 836 836 /* Configure common clock before checking latencies */ ··· 841 843 * read-only Link Capabilities may change depending on common clock 842 844 * configuration (PCIe r5.0, sec 7.5.3.6). 843 845 */ 844 - pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, &parent_lnkcap); 845 - pcie_capability_read_dword(child, PCI_EXP_LNKCAP, &child_lnkcap); 846 846 pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &parent_lnkctl); 847 847 pcie_capability_read_word(child, PCI_EXP_LNKCTL, &child_lnkctl); 848 848 ··· 860 864 * given link unless components on both sides of the link each 861 865 * support L0s. 
862 866 */ 863 - if (parent_lnkcap & child_lnkcap & PCI_EXP_LNKCAP_ASPM_L0S) 867 + if (parent->aspm_l0s_support && child->aspm_l0s_support) 864 868 link->aspm_support |= PCIE_LINK_STATE_L0S; 865 869 866 870 if (child_lnkctl & PCI_EXP_LNKCTL_ASPM_L0S) ··· 869 873 link->aspm_enabled |= PCIE_LINK_STATE_L0S_DW; 870 874 871 875 /* Setup L1 state */ 872 - if (parent_lnkcap & child_lnkcap & PCI_EXP_LNKCAP_ASPM_L1) 876 + if (parent->aspm_l1_support && child->aspm_l1_support) 873 877 link->aspm_support |= PCIE_LINK_STATE_L1; 874 878 875 879 if (parent_lnkctl & child_lnkctl & PCI_EXP_LNKCTL_ASPM_L1) ··· 1525 1529 return __pci_enable_link_state(pdev, state, true); 1526 1530 } 1527 1531 EXPORT_SYMBOL(pci_enable_link_state_locked); 1532 + 1533 + void pcie_aspm_remove_cap(struct pci_dev *pdev, u32 lnkcap) 1534 + { 1535 + if (lnkcap & PCI_EXP_LNKCAP_ASPM_L0S) 1536 + pdev->aspm_l0s_support = 0; 1537 + if (lnkcap & PCI_EXP_LNKCAP_ASPM_L1) 1538 + pdev->aspm_l1_support = 0; 1539 + 1540 + pci_info(pdev, "ASPM: Link Capabilities%s%s treated as unsupported to avoid device defect\n", 1541 + lnkcap & PCI_EXP_LNKCAP_ASPM_L0S ? " L0s" : "", 1542 + lnkcap & PCI_EXP_LNKCAP_ASPM_L1 ? " L1" : ""); 1543 + 1544 + } 1528 1545 1529 1546 static int pcie_aspm_set_policy(const char *val, 1530 1547 const struct kernel_param *kp)
+7
drivers/pci/probe.c
··· 1656 1656 if (reg32 & PCI_EXP_LNKCAP_DLLLARC) 1657 1657 pdev->link_active_reporting = 1; 1658 1658 1659 + #ifdef CONFIG_PCIEASPM 1660 + if (reg32 & PCI_EXP_LNKCAP_ASPM_L0S) 1661 + pdev->aspm_l0s_support = 1; 1662 + if (reg32 & PCI_EXP_LNKCAP_ASPM_L1) 1663 + pdev->aspm_l1_support = 1; 1664 + #endif 1665 + 1659 1666 parent = pci_upstream_bridge(pdev); 1660 1667 if (!parent) 1661 1668 return;
+21 -19
drivers/pci/quirks.c
··· 2494 2494 */
2495 2495 static void quirk_disable_aspm_l0s(struct pci_dev *dev)
2496 2496 {
2497 - pci_info(dev, "Disabling L0s\n");
2498 - pci_disable_link_state(dev, PCIE_LINK_STATE_L0S);
2497 + pcie_aspm_remove_cap(dev, PCI_EXP_LNKCAP_ASPM_L0S);
2499 2498 }
2500 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10a7, quirk_disable_aspm_l0s);
2501 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10a9, quirk_disable_aspm_l0s);
2502 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10b6, quirk_disable_aspm_l0s);
2503 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10c6, quirk_disable_aspm_l0s);
2504 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10c7, quirk_disable_aspm_l0s);
2505 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10c8, quirk_disable_aspm_l0s);
2506 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10d6, quirk_disable_aspm_l0s);
2507 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10db, quirk_disable_aspm_l0s);
2508 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10dd, quirk_disable_aspm_l0s);
2509 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10e1, quirk_disable_aspm_l0s);
2510 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10ec, quirk_disable_aspm_l0s);
2511 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f1, quirk_disable_aspm_l0s);
2512 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f4, quirk_disable_aspm_l0s);
2513 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1508, quirk_disable_aspm_l0s);
2499 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10a7, quirk_disable_aspm_l0s);
2500 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10a9, quirk_disable_aspm_l0s);
2501 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10b6, quirk_disable_aspm_l0s);
2502 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10c6, quirk_disable_aspm_l0s);
2503 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10c7, quirk_disable_aspm_l0s);
2504 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10c8, quirk_disable_aspm_l0s);
2505 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10d6, quirk_disable_aspm_l0s);
2506 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10db, quirk_disable_aspm_l0s);
2507 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10dd, quirk_disable_aspm_l0s);
2508 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10e1, quirk_disable_aspm_l0s);
2509 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10ec, quirk_disable_aspm_l0s);
2510 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10f1, quirk_disable_aspm_l0s);
2511 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x10f4, quirk_disable_aspm_l0s);
2512 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1508, quirk_disable_aspm_l0s);
2514 2513
2515 2514 static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
2516 2515 {
2517 - pci_info(dev, "Disabling ASPM L0s/L1\n");
2518 - pci_disable_link_state(dev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1);
2516 + pcie_aspm_remove_cap(dev,
2517 + PCI_EXP_LNKCAP_ASPM_L0S | PCI_EXP_LNKCAP_ASPM_L1);
2519 2518 }
2520 2519
2521 2520 /*
··· 2522 2523 * upstream PCIe root port when ASPM is enabled. At least L0s mode is affected;
2523 2524 * disable both L0s and L1 for now to be safe.
2524 2525 */
2525 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
2526 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
2527 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_FREESCALE, 0x0451, quirk_disable_aspm_l0s_l1);
2528 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PASEMI, 0xa002, quirk_disable_aspm_l0s_l1);
2529 + DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HUAWEI, 0x1105, quirk_disable_aspm_l0s_l1);
2526 2530
2527 2531 /*
2528 2532 * Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain
+11 -2
drivers/pmdomain/arm/scmi_pm_domain.c
··· 41 41 42 42 static int scmi_pm_domain_probe(struct scmi_device *sdev) 43 43 { 44 - int num_domains, i; 44 + int num_domains, i, ret; 45 45 struct device *dev = &sdev->dev; 46 46 struct device_node *np = dev->of_node; 47 47 struct scmi_pm_domain *scmi_pd; ··· 108 108 scmi_pd_data->domains = domains; 109 109 scmi_pd_data->num_domains = num_domains; 110 110 111 + ret = of_genpd_add_provider_onecell(np, scmi_pd_data); 112 + if (ret) 113 + goto err_rm_genpds; 114 + 111 115 dev_set_drvdata(dev, scmi_pd_data); 112 116 113 - return of_genpd_add_provider_onecell(np, scmi_pd_data); 117 + return 0; 118 + err_rm_genpds: 119 + for (i = num_domains - 1; i >= 0; i--) 120 + pm_genpd_remove(domains[i]); 121 + 122 + return ret; 114 123 } 115 124 116 125 static void scmi_pm_domain_remove(struct scmi_device *sdev)
+2
drivers/pmdomain/imx/gpc.c
··· 536 536 return; 537 537 } 538 538 } 539 + 540 + of_node_put(pgc_node); 539 541 } 540 542 541 543 static struct platform_driver imx_gpc_driver = {
+14 -15
drivers/pmdomain/samsung/exynos-pm-domains.c
··· 92 92 { }, 93 93 }; 94 94 95 - static const char *exynos_get_domain_name(struct device_node *node) 95 + static const char *exynos_get_domain_name(struct device *dev, 96 + struct device_node *node) 96 97 { 97 98 const char *name; 98 99 99 100 if (of_property_read_string(node, "label", &name) < 0) 100 101 name = kbasename(node->full_name); 101 - return kstrdup_const(name, GFP_KERNEL); 102 + return devm_kstrdup_const(dev, name, GFP_KERNEL); 102 103 } 103 104 104 105 static int exynos_pd_probe(struct platform_device *pdev) ··· 116 115 if (!pd) 117 116 return -ENOMEM; 118 117 119 - pd->pd.name = exynos_get_domain_name(np); 118 + pd->pd.name = exynos_get_domain_name(dev, np); 120 119 if (!pd->pd.name) 121 120 return -ENOMEM; 122 121 123 122 pd->base = of_iomap(np, 0); 124 - if (!pd->base) { 125 - kfree_const(pd->pd.name); 123 + if (!pd->base) 126 124 return -ENODEV; 127 - } 128 125 129 126 pd->pd.power_off = exynos_pd_power_off; 130 127 pd->pd.power_on = exynos_pd_power_on; 131 128 pd->local_pwr_cfg = pm_domain_cfg->local_pwr_cfg; 129 + 130 + /* 131 + * Some Samsung platforms with bootloaders turning on the splash-screen 132 + * and handing it over to the kernel, requires the power-domains to be 133 + * reset during boot. 134 + */ 135 + if (IS_ENABLED(CONFIG_ARM) && 136 + of_device_is_compatible(np, "samsung,exynos4210-pd")) 137 + exynos_pd_power_off(&pd->pd); 132 138 133 139 on = readl_relaxed(pd->base + 0x4) & pd->local_pwr_cfg; 134 140 ··· 154 146 pr_info("%pOF has as child subdomain: %pOF.\n", 155 147 parent.np, child.np); 156 148 } 157 - 158 - /* 159 - * Some Samsung platforms with bootloaders turning on the splash-screen 160 - * and handing it over to the kernel, requires the power-domains to be 161 - * reset during boot. As a temporary hack to manage this, let's enforce 162 - * a sync_state. 163 - */ 164 - if (!ret) 165 - of_genpd_sync_state(np); 166 149 167 150 pm_runtime_enable(dev); 168 151 return ret;
+1
drivers/regulator/fixed.c
··· 334 334 ret = dev_err_probe(&pdev->dev, PTR_ERR(drvdata->dev), 335 335 "Failed to register regulator: %ld\n", 336 336 PTR_ERR(drvdata->dev)); 337 + gpiod_put(cfg.ena_gpiod); 337 338 return ret; 338 339 } 339 340
+11 -4
drivers/spi/spi-imx.c
··· 519 519 { 520 520 u32 reg; 521 521 522 - reg = readl(spi_imx->base + MX51_ECSPI_CTRL); 523 - reg |= MX51_ECSPI_CTRL_XCH; 524 - writel(reg, spi_imx->base + MX51_ECSPI_CTRL); 522 + if (spi_imx->usedma) { 523 + reg = readl(spi_imx->base + MX51_ECSPI_DMA); 524 + reg |= MX51_ECSPI_DMA_TEDEN | MX51_ECSPI_DMA_RXDEN; 525 + writel(reg, spi_imx->base + MX51_ECSPI_DMA); 526 + } else { 527 + reg = readl(spi_imx->base + MX51_ECSPI_CTRL); 528 + reg |= MX51_ECSPI_CTRL_XCH; 529 + writel(reg, spi_imx->base + MX51_ECSPI_CTRL); 530 + } 525 531 } 526 532 527 533 static void mx51_ecspi_disable(struct spi_imx_data *spi_imx) ··· 765 759 writel(MX51_ECSPI_DMA_RX_WML(spi_imx->wml - 1) | 766 760 MX51_ECSPI_DMA_TX_WML(tx_wml) | 767 761 MX51_ECSPI_DMA_RXT_WML(spi_imx->wml) | 768 - MX51_ECSPI_DMA_TEDEN | MX51_ECSPI_DMA_RXDEN | 769 762 MX51_ECSPI_DMA_RXTDEN, spi_imx->base + MX51_ECSPI_DMA); 770 763 } 771 764 ··· 1524 1519 dmaengine_submit(desc_tx); 1525 1520 reinit_completion(&spi_imx->dma_tx_completion); 1526 1521 dma_async_issue_pending(controller->dma_tx); 1522 + 1523 + spi_imx->devtype_data->trigger(spi_imx); 1527 1524 1528 1525 transfer_timeout = spi_imx_calculate_timeout(spi_imx, transfer->len); 1529 1526
+1 -1
drivers/spi/spi-xilinx.c
··· 300 300 301 301 /* Read out all the data from the Rx FIFO */ 302 302 rx_words = n_words; 303 - stalled = 10; 303 + stalled = 32; 304 304 while (rx_words) { 305 305 if (rx_words == n_words && !(stalled--) && 306 306 !(sr & XSPI_SR_TX_EMPTY_MASK) &&
+12
drivers/spi/spi.c
··· 2851 2851 acpi_set_modalias(adev, acpi_device_hid(adev), spi->modalias, 2852 2852 sizeof(spi->modalias)); 2853 2853 2854 + /* 2855 + * This gets re-tried in spi_probe() for -EPROBE_DEFER handling in case 2856 + * the GPIO controller does not have a driver yet. This needs to be done 2857 + * here too, because this call sets the GPIO direction and/or bias. 2858 + * Setting these needs to be done even if there is no driver, in which 2859 + * case spi_probe() will never get called. 2860 + * TODO: ideally the setup of the GPIO should be handled in a generic 2861 + * manner in the ACPI/gpiolib core code. 2862 + */ 2863 + if (spi->irq < 0) 2864 + spi->irq = acpi_dev_gpio_irq_get(adev, 0); 2865 + 2854 2866 acpi_device_set_enumerated(adev); 2855 2867 2856 2868 adev->power.flags.ignore_parent = true;
+2 -4
drivers/vdpa/mlx5/net/mlx5_vnet.c
··· 573 573 vcq->mcq.set_ci_db = vcq->db.db; 574 574 vcq->mcq.arm_db = vcq->db.db + 1; 575 575 vcq->mcq.cqe_sz = 64; 576 + vcq->mcq.comp = mlx5_vdpa_cq_comp; 577 + vcq->cqe = num_ent; 576 578 577 579 err = cq_frag_buf_alloc(ndev, &vcq->buf, num_ent); 578 580 if (err) ··· 614 612 if (err) 615 613 goto err_vec; 616 614 617 - vcq->mcq.comp = mlx5_vdpa_cq_comp; 618 - vcq->cqe = num_ent; 619 - vcq->mcq.set_ci_db = vcq->db.db; 620 - vcq->mcq.arm_db = vcq->db.db + 1; 621 615 mlx5_cq_arm(&mvq->cq.mcq, MLX5_CQ_DB_REQ_NOT, uar_page, mvq->cq.mcq.cons_index); 622 616 kfree(in); 623 617 return 0;
+3 -1
fs/btrfs/inode.c
··· 177 177 return ret; 178 178 } 179 179 ret = paths_from_inode(inum, ipath); 180 - if (ret < 0) 180 + if (ret < 0) { 181 + btrfs_put_root(local_root); 181 182 goto err; 183 + } 182 184 183 185 /* 184 186 * We deliberately ignore the bit ipath might have been too small to
+2
fs/btrfs/scrub.c
··· 2203 2203 ret = btrfs_map_block(fs_info, BTRFS_MAP_WRITE, full_stripe_start, 2204 2204 &length, &bioc, NULL, NULL); 2205 2205 if (ret < 0) { 2206 + bio_put(bio); 2206 2207 btrfs_put_bioc(bioc); 2207 2208 btrfs_bio_counter_dec(fs_info); 2208 2209 goto out; ··· 2213 2212 btrfs_put_bioc(bioc); 2214 2213 if (!rbio) { 2215 2214 ret = -ENOMEM; 2215 + bio_put(bio); 2216 2216 btrfs_bio_counter_dec(fs_info); 2217 2217 goto out; 2218 2218 }
+1 -1
fs/btrfs/tree-log.c
··· 7122 7122 * a power failure unless the log was synced as part of an fsync 7123 7123 * against any other unrelated inode. 7124 7124 */ 7125 - if (inode_only != LOG_INODE_EXISTS) 7125 + if (!ctx->logging_new_name && inode_only != LOG_INODE_EXISTS) 7126 7126 inode->last_log_commit = inode->last_sub_trans; 7127 7127 spin_unlock(&inode->lock); 7128 7128
+28 -32
fs/btrfs/zoned.c
··· 1317 1317 if (!btrfs_dev_is_sequential(device, info->physical)) { 1318 1318 up_read(&dev_replace->rwsem); 1319 1319 info->alloc_offset = WP_CONVENTIONAL; 1320 + info->capacity = device->zone_info->zone_size; 1320 1321 return 0; 1321 1322 } 1322 1323 ··· 1523 1522 u64 last_alloc) 1524 1523 { 1525 1524 struct btrfs_fs_info *fs_info = bg->fs_info; 1525 + u64 stripe_nr = 0, stripe_offset = 0; 1526 + u32 stripe_index = 0; 1526 1527 1527 1528 if ((map->type & BTRFS_BLOCK_GROUP_DATA) && !fs_info->stripe_root) { 1528 1529 btrfs_err(fs_info, "zoned: data %s needs raid-stripe-tree", ··· 1532 1529 return -EINVAL; 1533 1530 } 1534 1531 1532 + if (last_alloc) { 1533 + u32 factor = map->num_stripes; 1534 + 1535 + stripe_nr = last_alloc >> BTRFS_STRIPE_LEN_SHIFT; 1536 + stripe_offset = last_alloc & BTRFS_STRIPE_LEN_MASK; 1537 + stripe_nr = div_u64_rem(stripe_nr, factor, &stripe_index); 1538 + } 1539 + 1535 1540 for (int i = 0; i < map->num_stripes; i++) { 1536 1541 if (zone_info[i].alloc_offset == WP_MISSING_DEV) 1537 1542 continue; 1538 1543 1539 1544 if (zone_info[i].alloc_offset == WP_CONVENTIONAL) { 1540 - u64 stripe_nr, full_stripe_nr; 1541 - u64 stripe_offset; 1542 - int stripe_index; 1543 1545 1544 - stripe_nr = div64_u64(last_alloc, map->stripe_size); 1545 - stripe_offset = stripe_nr * map->stripe_size; 1546 - full_stripe_nr = div_u64(stripe_nr, map->num_stripes); 1547 - div_u64_rem(stripe_nr, map->num_stripes, &stripe_index); 1548 - 1549 - zone_info[i].alloc_offset = 1550 - full_stripe_nr * map->stripe_size; 1546 + zone_info[i].alloc_offset = btrfs_stripe_nr_to_offset(stripe_nr); 1551 1547 1552 1548 if (stripe_index > i) 1553 - zone_info[i].alloc_offset += map->stripe_size; 1549 + zone_info[i].alloc_offset += BTRFS_STRIPE_LEN; 1554 1550 else if (stripe_index == i) 1555 - zone_info[i].alloc_offset += 1556 - (last_alloc - stripe_offset); 1551 + zone_info[i].alloc_offset += stripe_offset; 1557 1552 } 1558 1553 1559 1554 if (test_bit(0, active) != test_bit(i, active)) { 
··· 1575 1574 u64 last_alloc)
1576 1575 {
1577 1576 struct btrfs_fs_info *fs_info = bg->fs_info;
1577 + u64 stripe_nr = 0, stripe_offset = 0;
1578 + u32 stripe_index = 0;
1578 1579
1579 1580 if ((map->type & BTRFS_BLOCK_GROUP_DATA) && !fs_info->stripe_root) {
1580 1581 btrfs_err(fs_info, "zoned: data %s needs raid-stripe-tree",
1581 1582 btrfs_bg_type_to_raid_name(map->type));
1582 1583 return -EINVAL;
1584 + }
1585 +
1586 + if (last_alloc) {
1587 + u32 factor = map->num_stripes / map->sub_stripes;
1588 +
1589 + stripe_nr = last_alloc >> BTRFS_STRIPE_LEN_SHIFT;
1590 + stripe_offset = last_alloc & BTRFS_STRIPE_LEN_MASK;
1591 + stripe_nr = div_u64_rem(stripe_nr, factor, &stripe_index);
1583 1592 }
1584 1593
1585 1594 for (int i = 0; i < map->num_stripes; i++) {
··· 1605 1594 }
1606 1595
1607 1596 if (zone_info[i].alloc_offset == WP_CONVENTIONAL) {
1608 - u64 stripe_nr, full_stripe_nr;
1609 - u64 stripe_offset;
1610 - int stripe_index;
1611 -
1612 - stripe_nr = div64_u64(last_alloc, map->stripe_size);
1613 - stripe_offset = stripe_nr * map->stripe_size;
1614 - full_stripe_nr = div_u64(stripe_nr,
1615 - map->num_stripes / map->sub_stripes);
1616 - div_u64_rem(stripe_nr,
1617 - (map->num_stripes / map->sub_stripes),
1618 - &stripe_index);
1619 -
1620 - zone_info[i].alloc_offset =
1621 - full_stripe_nr * map->stripe_size;
1597 + zone_info[i].alloc_offset = btrfs_stripe_nr_to_offset(stripe_nr);
1622 1598
1623 1599 if (stripe_index > (i / map->sub_stripes))
1624 - zone_info[i].alloc_offset += map->stripe_size;
1600 + zone_info[i].alloc_offset += BTRFS_STRIPE_LEN;
1625 1601 else if (stripe_index == (i / map->sub_stripes))
1626 - zone_info[i].alloc_offset +=
1627 - (last_alloc - stripe_offset);
1602 + zone_info[i].alloc_offset += stripe_offset;
1628 1603 }
1629 1604
1630 1605 if ((i % map->sub_stripes) == 0) {
··· 1680 1683 set_bit(BLOCK_GROUP_FLAG_SEQUENTIAL_ZONE, &cache->runtime_flags);
1681 1684
1682 1685 if (num_conventional > 0) {
1683 - /* Zone capacity is always zone size in emulation */
1684 - cache->zone_capacity = cache->length;
1685 1686 ret = calculate_alloc_pointer(cache, &last_alloc, new);
1686 1687 if (ret) {
1687 1688 btrfs_err(fs_info,
··· 1688 1693 goto out;
1689 1694 } else if (map->num_stripes == num_conventional) {
1690 1695 cache->alloc_offset = last_alloc;
1696 + cache->zone_capacity = cache->length;
1691 1697 set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &cache->runtime_flags);
1692 1698 goto out;
1693 1699 }
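Both zoned-btrfs variants above now share one decomposition: shift the last-allocated byte offset down by the stripe size to get a stripe number, mask for the intra-stripe offset, then divide by the number of data stripes to find which device holds the partial stripe. A standalone sketch of that arithmetic (the 64 KiB `BTRFS_STRIPE_LEN` is an assumption matching btrfs's usual fixed stripe length):

```c
#include <assert.h>
#include <stdint.h>

#define BTRFS_STRIPE_LEN	(64 * 1024UL)	/* assumption: 64 KiB stripes */
#define BTRFS_STRIPE_LEN_SHIFT	16
#define BTRFS_STRIPE_LEN_MASK	(BTRFS_STRIPE_LEN - 1)

/* Full stripes written on every data stripe before last_alloc;
 * factor is the number of data stripes. */
static uint64_t full_stripe_nr(uint64_t last_alloc, uint32_t factor)
{
	return (last_alloc >> BTRFS_STRIPE_LEN_SHIFT) / factor;
}

/* Index of the data stripe holding the partially filled stripe. */
static uint32_t partial_stripe_index(uint64_t last_alloc, uint32_t factor)
{
	return (last_alloc >> BTRFS_STRIPE_LEN_SHIFT) % factor;
}

/* Bytes used inside the partially filled stripe. */
static uint64_t stripe_offset(uint64_t last_alloc)
{
	return last_alloc & BTRFS_STRIPE_LEN_MASK;
}
```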
+7 -4
fs/erofs/decompressor_zstd.c
··· 172 172 dctx.bounce = strm->bounce; 173 173 174 174 do { 175 - dctx.avail_out = out_buf.size - out_buf.pos; 176 175 dctx.inbuf_sz = in_buf.size; 177 176 dctx.inbuf_pos = in_buf.pos; 178 177 err = z_erofs_stream_switch_bufs(&dctx, &out_buf.dst, ··· 187 188 in_buf.pos = dctx.inbuf_pos; 188 189 189 190 zerr = zstd_decompress_stream(stream, &out_buf, &in_buf); 190 - if (zstd_is_error(zerr) || (!zerr && rq->outputsize)) { 191 + dctx.avail_out = out_buf.size - out_buf.pos; 192 + if (zstd_is_error(zerr) || 193 + ((rq->outputsize + dctx.avail_out) && (!zerr || (zerr > 0 && 194 + !(rq->inputsize + in_buf.size - in_buf.pos))))) { 191 195 erofs_err(sb, "failed to decompress in[%u] out[%u]: %s", 192 196 rq->inputsize, rq->outputsize, 193 - zerr ? zstd_get_error_name(zerr) : "unexpected end of stream"); 197 + zstd_is_error(zerr) ? zstd_get_error_name(zerr) : 198 + "unexpected end of stream"); 194 199 err = -EFSCORRUPTED; 195 200 break; 196 201 } 197 - } while (rq->outputsize || out_buf.pos < out_buf.size); 202 + } while (rq->outputsize + dctx.avail_out); 198 203 199 204 if (dctx.kout) 200 205 kunmap_local(dctx.kout);
+8
fs/nfs/client.c
··· 338 338 /* Match the xprt security policy */ 339 339 if (clp->cl_xprtsec.policy != data->xprtsec.policy) 340 340 continue; 341 + if (clp->cl_xprtsec.policy == RPC_XPRTSEC_TLS_X509) { 342 + if (clp->cl_xprtsec.cert_serial != 343 + data->xprtsec.cert_serial) 344 + continue; 345 + if (clp->cl_xprtsec.privkey_serial != 346 + data->xprtsec.privkey_serial) 347 + continue; 348 + } 341 349 342 350 refcount_inc(&clp->cl_count); 343 351 return clp;
+4 -3
fs/nfs/dir.c
··· 2268 2268 return -ENAMETOOLONG; 2269 2269 2270 2270 if (open_flags & O_CREAT) { 2271 - file->f_mode |= FMODE_CREATED; 2272 2271 error = nfs_do_create(dir, dentry, mode, open_flags); 2273 - if (error) 2272 + if (!error) { 2273 + file->f_mode |= FMODE_CREATED; 2274 + return finish_open(file, dentry, NULL); 2275 + } else if (error != -EEXIST || open_flags & O_EXCL) 2274 2276 return error; 2275 - return finish_open(file, dentry, NULL); 2276 2277 } 2277 2278 if (d_in_lookup(dentry)) { 2278 2279 /* The only flags nfs_lookup considers are
+12 -6
fs/nfs/inode.c
··· 718 718 struct nfs_fattr *fattr; 719 719 loff_t oldsize = i_size_read(inode); 720 720 int error = 0; 721 + kuid_t task_uid = current_fsuid(); 722 + kuid_t owner_uid = inode->i_uid; 721 723 722 724 nfs_inc_stats(inode, NFSIOS_VFSSETATTR); 723 725 ··· 741 739 if (nfs_have_delegated_mtime(inode) && attr->ia_valid & ATTR_MTIME) { 742 740 spin_lock(&inode->i_lock); 743 741 if (attr->ia_valid & ATTR_MTIME_SET) { 744 - nfs_set_timestamps_to_ts(inode, attr); 745 - attr->ia_valid &= ~(ATTR_MTIME|ATTR_MTIME_SET| 742 + if (uid_eq(task_uid, owner_uid)) { 743 + nfs_set_timestamps_to_ts(inode, attr); 744 + attr->ia_valid &= ~(ATTR_MTIME|ATTR_MTIME_SET| 746 745 ATTR_ATIME|ATTR_ATIME_SET); 746 + } 747 747 } else { 748 748 nfs_update_timestamps(inode, attr->ia_valid); 749 749 attr->ia_valid &= ~(ATTR_MTIME|ATTR_ATIME); ··· 755 751 attr->ia_valid & ATTR_ATIME && 756 752 !(attr->ia_valid & ATTR_MTIME)) { 757 753 if (attr->ia_valid & ATTR_ATIME_SET) { 758 - spin_lock(&inode->i_lock); 759 - nfs_set_timestamps_to_ts(inode, attr); 760 - spin_unlock(&inode->i_lock); 761 - attr->ia_valid &= ~(ATTR_ATIME|ATTR_ATIME_SET); 754 + if (uid_eq(task_uid, owner_uid)) { 755 + spin_lock(&inode->i_lock); 756 + nfs_set_timestamps_to_ts(inode, attr); 757 + spin_unlock(&inode->i_lock); 758 + attr->ia_valid &= ~(ATTR_ATIME|ATTR_ATIME_SET); 759 + } 762 760 } else { 763 761 nfs_update_delegated_atime(inode); 764 762 attr->ia_valid &= ~ATTR_ATIME;
+121 -108
fs/nfs/localio.c
··· 42 42 /* Begin mostly DIO-specific members */ 43 43 size_t end_len; 44 44 short int end_iter_index; 45 - short int n_iters; 45 + atomic_t n_iters; 46 46 bool iter_is_dio_aligned[NFSLOCAL_MAX_IOS]; 47 - loff_t offset[NFSLOCAL_MAX_IOS] ____cacheline_aligned; 48 - struct iov_iter iters[NFSLOCAL_MAX_IOS]; 47 + struct iov_iter iters[NFSLOCAL_MAX_IOS] ____cacheline_aligned; 49 48 /* End mostly DIO-specific members */ 50 49 }; 51 50 ··· 313 314 init_sync_kiocb(&iocb->kiocb, file); 314 315 315 316 iocb->hdr = hdr; 317 + iocb->kiocb.ki_pos = hdr->args.offset; 316 318 iocb->kiocb.ki_flags &= ~IOCB_APPEND; 319 + iocb->kiocb.ki_complete = NULL; 317 320 iocb->aio_complete_work = NULL; 318 321 319 322 iocb->end_iter_index = -1; ··· 389 388 return true; 390 389 } 391 390 391 + static void 392 + nfs_local_iter_setup(struct iov_iter *iter, int rw, struct bio_vec *bvec, 393 + unsigned int nvecs, unsigned long total, 394 + size_t start, size_t len) 395 + { 396 + iov_iter_bvec(iter, rw, bvec, nvecs, total); 397 + if (start) 398 + iov_iter_advance(iter, start); 399 + iov_iter_truncate(iter, len); 400 + } 401 + 392 402 /* 393 403 * Setup as many as 3 iov_iter based on extents described by @local_dio. 394 404 * Returns the number of iov_iter that were setup. 395 405 */ 396 406 static int 397 407 nfs_local_iters_setup_dio(struct nfs_local_kiocb *iocb, int rw, 398 - unsigned int nvecs, size_t len, 408 + unsigned int nvecs, unsigned long total, 399 409 struct nfs_local_dio *local_dio) 400 410 { 401 411 int n_iters = 0; ··· 414 402 415 403 /* Setup misaligned start? 
*/ 416 404 if (local_dio->start_len) { 417 - iov_iter_bvec(&iters[n_iters], rw, iocb->bvec, nvecs, len); 418 - iters[n_iters].count = local_dio->start_len; 419 - iocb->offset[n_iters] = iocb->hdr->args.offset; 420 - iocb->iter_is_dio_aligned[n_iters] = false; 405 + nfs_local_iter_setup(&iters[n_iters], rw, iocb->bvec, 406 + nvecs, total, 0, local_dio->start_len); 421 407 ++n_iters; 422 408 } 423 409 424 - /* Setup misaligned end? 425 - * If so, the end is purposely setup to be issued using buffered IO 426 - * before the middle (which will use DIO, if DIO-aligned, with AIO). 427 - * This creates problems if/when the end results in a partial write. 428 - * So must save index and length of end to handle this corner case. 410 + /* 411 + * Setup DIO-aligned middle, if there is no misaligned end (below) 412 + * then AIO completion is used, see nfs_local_call_{read,write} 429 413 */ 430 - if (local_dio->end_len) { 431 - iov_iter_bvec(&iters[n_iters], rw, iocb->bvec, nvecs, len); 432 - iocb->offset[n_iters] = local_dio->end_offset; 433 - iov_iter_advance(&iters[n_iters], 434 - local_dio->start_len + local_dio->middle_len); 435 - iocb->iter_is_dio_aligned[n_iters] = false; 436 - /* Save index and length of end */ 437 - iocb->end_iter_index = n_iters; 438 - iocb->end_len = local_dio->end_len; 439 - ++n_iters; 440 - } 441 - 442 - /* Setup DIO-aligned middle to be issued last, to allow for 443 - * DIO with AIO completion (see nfs_local_call_{read,write}). 
444 - */ 445 - iov_iter_bvec(&iters[n_iters], rw, iocb->bvec, nvecs, len); 446 - if (local_dio->start_len) 447 - iov_iter_advance(&iters[n_iters], local_dio->start_len); 448 - iters[n_iters].count -= local_dio->end_len; 449 - iocb->offset[n_iters] = local_dio->middle_offset; 414 + nfs_local_iter_setup(&iters[n_iters], rw, iocb->bvec, nvecs, 415 + total, local_dio->start_len, local_dio->middle_len); 450 416 451 417 iocb->iter_is_dio_aligned[n_iters] = 452 418 nfs_iov_iter_aligned_bvec(&iters[n_iters], ··· 432 442 433 443 if (unlikely(!iocb->iter_is_dio_aligned[n_iters])) { 434 444 trace_nfs_local_dio_misaligned(iocb->hdr->inode, 435 - iocb->hdr->args.offset, len, local_dio); 445 + local_dio->start_len, local_dio->middle_len, local_dio); 436 446 return 0; /* no DIO-aligned IO possible */ 437 447 } 448 + iocb->end_iter_index = n_iters; 438 449 ++n_iters; 439 450 440 - iocb->n_iters = n_iters; 451 + /* Setup misaligned end? */ 452 + if (local_dio->end_len) { 453 + nfs_local_iter_setup(&iters[n_iters], rw, iocb->bvec, 454 + nvecs, total, local_dio->start_len + 455 + local_dio->middle_len, local_dio->end_len); 456 + iocb->end_iter_index = n_iters; 457 + ++n_iters; 458 + } 459 + 460 + atomic_set(&iocb->n_iters, n_iters); 441 461 return n_iters; 442 462 } 443 463 ··· 473 473 } 474 474 len = hdr->args.count - total; 475 475 476 + /* 477 + * For each iocb, iocb->n_iters is always at least 1 and we always 478 + * end io after first nfs_local_pgio_done call unless misaligned DIO. 
479 + */ 480 + atomic_set(&iocb->n_iters, 1); 481 + 476 482 if (test_bit(NFS_IOHDR_ODIRECT, &hdr->flags)) { 477 483 struct nfs_local_dio local_dio; 478 484 479 485 if (nfs_is_local_dio_possible(iocb, rw, len, &local_dio) && 480 - nfs_local_iters_setup_dio(iocb, rw, v, len, &local_dio) != 0) 486 + nfs_local_iters_setup_dio(iocb, rw, v, len, &local_dio) != 0) { 487 + /* Ensure DIO WRITE's IO on stable storage upon completion */ 488 + if (rw == ITER_SOURCE) 489 + iocb->kiocb.ki_flags |= IOCB_DSYNC|IOCB_SYNC; 481 490 return; /* is DIO-aligned */ 491 + } 482 492 } 483 493 484 494 /* Use buffered IO */ 485 - iocb->offset[0] = hdr->args.offset; 486 495 iov_iter_bvec(&iocb->iters[0], rw, iocb->bvec, v, len); 487 - iocb->n_iters = 1; 488 496 } 489 497 490 498 static void ··· 512 504 hdr->task.tk_start = ktime_get(); 513 505 } 514 506 515 - static void 516 - nfs_local_pgio_done(struct nfs_pgio_header *hdr, long status) 507 + static bool 508 + nfs_local_pgio_done(struct nfs_local_kiocb *iocb, long status, bool force) 517 509 { 510 + struct nfs_pgio_header *hdr = iocb->hdr; 511 + 518 512 /* Must handle partial completions */ 519 513 if (status >= 0) { 520 514 hdr->res.count += status; ··· 527 517 hdr->res.op_status = nfs_localio_errno_to_nfs4_stat(status); 528 518 hdr->task.tk_status = status; 529 519 } 520 + 521 + if (force) 522 + return true; 523 + 524 + BUG_ON(atomic_read(&iocb->n_iters) <= 0); 525 + return atomic_dec_and_test(&iocb->n_iters); 530 526 } 531 527 532 528 static void ··· 563 547 queue_work(nfsiod_workqueue, &iocb->work); 564 548 } 565 549 566 - static void 567 - nfs_local_read_done(struct nfs_local_kiocb *iocb, long status) 550 + static void nfs_local_read_done(struct nfs_local_kiocb *iocb) 568 551 { 569 552 struct nfs_pgio_header *hdr = iocb->hdr; 570 553 struct file *filp = iocb->kiocb.ki_filp; 554 + long status = hdr->task.tk_status; 571 555 572 556 if ((iocb->kiocb.ki_flags & IOCB_DIRECT) && status == -EINVAL) { 573 557 /* Underlying FS will return -EINVAL 
if misaligned DIO is attempted. */ ··· 580 564 */ 581 565 hdr->res.replen = 0; 582 566 583 - if (hdr->res.count != hdr->args.count || 584 - hdr->args.offset + hdr->res.count >= i_size_read(file_inode(filp))) 567 + /* nfs_readpage_result() handles short read */ 568 + 569 + if (hdr->args.offset + hdr->res.count >= i_size_read(file_inode(filp))) 585 570 hdr->res.eof = true; 586 571 587 572 dprintk("%s: read %ld bytes eof %d.\n", __func__, 588 573 status > 0 ? status : 0, hdr->res.eof); 574 + } 575 + 576 + static inline void nfs_local_read_iocb_done(struct nfs_local_kiocb *iocb) 577 + { 578 + nfs_local_read_done(iocb); 579 + nfs_local_pgio_release(iocb); 589 580 } 590 581 591 582 static void nfs_local_read_aio_complete_work(struct work_struct *work) ··· 600 577 struct nfs_local_kiocb *iocb = 601 578 container_of(work, struct nfs_local_kiocb, work); 602 579 603 - nfs_local_pgio_release(iocb); 580 + nfs_local_read_iocb_done(iocb); 604 581 } 605 582 606 583 static void nfs_local_read_aio_complete(struct kiocb *kiocb, long ret) ··· 608 585 struct nfs_local_kiocb *iocb = 609 586 container_of(kiocb, struct nfs_local_kiocb, kiocb); 610 587 611 - nfs_local_pgio_done(iocb->hdr, ret); 612 - nfs_local_read_done(iocb, ret); 588 + /* AIO completion of DIO read should always be last to complete */ 589 + if (unlikely(!nfs_local_pgio_done(iocb, ret, false))) 590 + return; 591 + 613 592 nfs_local_pgio_aio_complete(iocb); /* Calls nfs_local_read_aio_complete_work */ 614 593 } 615 594 ··· 621 596 container_of(work, struct nfs_local_kiocb, work); 622 597 struct file *filp = iocb->kiocb.ki_filp; 623 598 const struct cred *save_cred; 599 + bool force_done = false; 624 600 ssize_t status; 601 + int n_iters; 625 602 626 603 save_cred = override_creds(filp->f_cred); 627 604 628 - for (int i = 0; i < iocb->n_iters ; i++) { 605 + n_iters = atomic_read(&iocb->n_iters); 606 + for (int i = 0; i < n_iters ; i++) { 629 607 if (iocb->iter_is_dio_aligned[i]) { 630 608 iocb->kiocb.ki_flags |= 
IOCB_DIRECT; 631 - iocb->kiocb.ki_complete = nfs_local_read_aio_complete; 632 - iocb->aio_complete_work = nfs_local_read_aio_complete_work; 633 - } 609 + /* Only use AIO completion if DIO-aligned segment is last */ 610 + if (i == iocb->end_iter_index) { 611 + iocb->kiocb.ki_complete = nfs_local_read_aio_complete; 612 + iocb->aio_complete_work = nfs_local_read_aio_complete_work; 613 + } 614 + } else 615 + iocb->kiocb.ki_flags &= ~IOCB_DIRECT; 634 616 635 - iocb->kiocb.ki_pos = iocb->offset[i]; 636 617 status = filp->f_op->read_iter(&iocb->kiocb, &iocb->iters[i]); 637 618 if (status != -EIOCBQUEUED) { 638 - nfs_local_pgio_done(iocb->hdr, status); 639 - if (iocb->hdr->task.tk_status) 619 + if (unlikely(status >= 0 && status < iocb->iters[i].count)) 620 + force_done = true; /* Partial read */ 621 + if (nfs_local_pgio_done(iocb, status, force_done)) { 622 + nfs_local_read_iocb_done(iocb); 640 623 break; 624 + } 641 625 } 642 626 } 643 627 644 628 revert_creds(save_cred); 645 - 646 - if (status != -EIOCBQUEUED) { 647 - nfs_local_read_done(iocb, status); 648 - nfs_local_pgio_release(iocb); 649 - } 650 629 } 651 630 652 631 static int ··· 765 736 fattr->du.nfs3.used = stat.blocks << 9; 766 737 } 767 738 768 - static void 769 - nfs_local_write_done(struct nfs_local_kiocb *iocb, long status) 739 + static void nfs_local_write_done(struct nfs_local_kiocb *iocb) 770 740 { 771 741 struct nfs_pgio_header *hdr = iocb->hdr; 772 - struct inode *inode = hdr->inode; 742 + long status = hdr->task.tk_status; 773 743 774 744 dprintk("%s: wrote %ld bytes.\n", __func__, status > 0 ? 
status : 0); 775 745 ··· 787 759 nfs_set_pgio_error(hdr, -ENOSPC, hdr->args.offset); 788 760 status = -ENOSPC; 789 761 /* record -ENOSPC in terms of nfs_local_pgio_done */ 790 - nfs_local_pgio_done(hdr, status); 762 + (void) nfs_local_pgio_done(iocb, status, true); 791 763 } 792 764 if (hdr->task.tk_status < 0) 793 - nfs_reset_boot_verifier(inode); 765 + nfs_reset_boot_verifier(hdr->inode); 766 + } 767 + 768 + static inline void nfs_local_write_iocb_done(struct nfs_local_kiocb *iocb) 769 + { 770 + nfs_local_write_done(iocb); 771 + nfs_local_vfs_getattr(iocb); 772 + nfs_local_pgio_release(iocb); 794 773 } 795 774 796 775 static void nfs_local_write_aio_complete_work(struct work_struct *work) ··· 805 770 struct nfs_local_kiocb *iocb = 806 771 container_of(work, struct nfs_local_kiocb, work); 807 772 808 - nfs_local_vfs_getattr(iocb); 809 - nfs_local_pgio_release(iocb); 773 + nfs_local_write_iocb_done(iocb); 810 774 } 811 775 812 776 static void nfs_local_write_aio_complete(struct kiocb *kiocb, long ret) ··· 813 779 struct nfs_local_kiocb *iocb = 814 780 container_of(kiocb, struct nfs_local_kiocb, kiocb); 815 781 816 - nfs_local_pgio_done(iocb->hdr, ret); 817 - nfs_local_write_done(iocb, ret); 782 + /* AIO completion of DIO write should always be last to complete */ 783 + if (unlikely(!nfs_local_pgio_done(iocb, ret, false))) 784 + return; 785 + 818 786 nfs_local_pgio_aio_complete(iocb); /* Calls nfs_local_write_aio_complete_work */ 819 787 } 820 788 ··· 827 791 struct file *filp = iocb->kiocb.ki_filp; 828 792 unsigned long old_flags = current->flags; 829 793 const struct cred *save_cred; 794 + bool force_done = false; 830 795 ssize_t status; 796 + int n_iters; 831 797 832 798 current->flags |= PF_LOCAL_THROTTLE | PF_MEMALLOC_NOIO; 833 799 save_cred = override_creds(filp->f_cred); 834 800 835 801 file_start_write(filp); 836 - for (int i = 0; i < iocb->n_iters ; i++) { 802 + n_iters = atomic_read(&iocb->n_iters); 803 + for (int i = 0; i < n_iters ; i++) { 837 804 if 
(iocb->iter_is_dio_aligned[i]) { 838 805 iocb->kiocb.ki_flags |= IOCB_DIRECT; 839 - iocb->kiocb.ki_complete = nfs_local_write_aio_complete; 840 - iocb->aio_complete_work = nfs_local_write_aio_complete_work; 841 - } 842 - retry: 843 - iocb->kiocb.ki_pos = iocb->offset[i]; 806 + /* Only use AIO completion if DIO-aligned segment is last */ 807 + if (i == iocb->end_iter_index) { 808 + iocb->kiocb.ki_complete = nfs_local_write_aio_complete; 809 + iocb->aio_complete_work = nfs_local_write_aio_complete_work; 810 + } 811 + } else 812 + iocb->kiocb.ki_flags &= ~IOCB_DIRECT; 813 + 844 814 status = filp->f_op->write_iter(&iocb->kiocb, &iocb->iters[i]); 845 815 if (status != -EIOCBQUEUED) { 846 - if (unlikely(status >= 0 && status < iocb->iters[i].count)) { 847 - /* partial write */ 848 - if (i == iocb->end_iter_index) { 849 - /* Must not account partial end, otherwise, due 850 - * to end being issued before middle: the partial 851 - * write accounting in nfs_local_write_done() 852 - * would incorrectly advance hdr->args.offset 853 - */ 854 - status = 0; 855 - } else { 856 - /* Partial write at start or buffered middle, 857 - * exit early. 858 - */ 859 - nfs_local_pgio_done(iocb->hdr, status); 860 - break; 861 - } 862 - } else if (unlikely(status == -ENOTBLK && 863 - (iocb->kiocb.ki_flags & IOCB_DIRECT))) { 864 - /* VFS will return -ENOTBLK if DIO WRITE fails to 865 - * invalidate the page cache. Retry using buffered IO. 
866 - */ 867 - iocb->kiocb.ki_flags &= ~IOCB_DIRECT; 868 - iocb->kiocb.ki_complete = NULL; 869 - iocb->aio_complete_work = NULL; 870 - goto retry; 871 - } 872 - nfs_local_pgio_done(iocb->hdr, status); 873 - if (iocb->hdr->task.tk_status) 816 + if (unlikely(status >= 0 && status < iocb->iters[i].count)) 817 + force_done = true; /* Partial write */ 818 + if (nfs_local_pgio_done(iocb, status, force_done)) { 819 + nfs_local_write_iocb_done(iocb); 874 820 break; 821 + } 875 822 } 876 823 } 877 824 file_end_write(filp); 878 825 879 826 revert_creds(save_cred); 880 827 current->flags = old_flags; 881 - 882 - if (status != -EIOCBQUEUED) { 883 - nfs_local_write_done(iocb, status); 884 - nfs_local_vfs_getattr(iocb); 885 - nfs_local_pgio_release(iocb); 886 - } 887 828 } 888 829 889 830 static int
+12 -2
fs/nfs/nfs3client.c
··· 2 2 #include <linux/nfs_fs.h> 3 3 #include <linux/nfs_mount.h> 4 4 #include <linux/sunrpc/addr.h> 5 + #include <net/handshake.h> 5 6 #include "internal.h" 6 7 #include "nfs3_fs.h" 7 8 #include "netns.h" ··· 99 98 .net = mds_clp->cl_net, 100 99 .timeparms = &ds_timeout, 101 100 .cred = mds_srv->cred, 102 - .xprtsec = mds_clp->cl_xprtsec, 101 + .xprtsec = { 102 + .policy = RPC_XPRTSEC_NONE, 103 + .cert_serial = TLS_NO_CERT, 104 + .privkey_serial = TLS_NO_PRIVKEY, 105 + }, 103 106 .connect_timeout = connect_timeout, 104 107 .reconnect_timeout = connect_timeout, 105 108 }; ··· 116 111 cl_init.hostname = buf; 117 112 118 113 switch (ds_proto) { 114 + case XPRT_TRANSPORT_TCP_TLS: 115 + if (mds_clp->cl_xprtsec.policy != RPC_XPRTSEC_NONE) 116 + cl_init.xprtsec = mds_clp->cl_xprtsec; 117 + else 118 + ds_proto = XPRT_TRANSPORT_TCP; 119 + fallthrough; 119 120 case XPRT_TRANSPORT_RDMA: 120 121 case XPRT_TRANSPORT_TCP: 121 - case XPRT_TRANSPORT_TCP_TLS: 122 122 if (mds_clp->cl_nconnect > 1) 123 123 cl_init.nconnect = mds_clp->cl_nconnect; 124 124 }
+12 -2
fs/nfs/nfs4client.c
··· 11 11 #include <linux/sunrpc/xprt.h> 12 12 #include <linux/sunrpc/bc_xprt.h> 13 13 #include <linux/sunrpc/rpc_pipe_fs.h> 14 + #include <net/handshake.h> 14 15 #include "internal.h" 15 16 #include "callback.h" 16 17 #include "delegation.h" ··· 984 983 .net = mds_clp->cl_net, 985 984 .timeparms = &ds_timeout, 986 985 .cred = mds_srv->cred, 987 - .xprtsec = mds_srv->nfs_client->cl_xprtsec, 986 + .xprtsec = { 987 + .policy = RPC_XPRTSEC_NONE, 988 + .cert_serial = TLS_NO_CERT, 989 + .privkey_serial = TLS_NO_PRIVKEY, 990 + }, 988 991 }; 989 992 char buf[INET6_ADDRSTRLEN + 1]; 990 993 ··· 997 992 cl_init.hostname = buf; 998 993 999 994 switch (ds_proto) { 995 + case XPRT_TRANSPORT_TCP_TLS: 996 + if (mds_srv->nfs_client->cl_xprtsec.policy != RPC_XPRTSEC_NONE) 997 + cl_init.xprtsec = mds_srv->nfs_client->cl_xprtsec; 998 + else 999 + ds_proto = XPRT_TRANSPORT_TCP; 1000 + fallthrough; 1000 1001 case XPRT_TRANSPORT_RDMA: 1001 1002 case XPRT_TRANSPORT_TCP: 1002 - case XPRT_TRANSPORT_TCP_TLS: 1003 1003 if (mds_clp->cl_nconnect > 1) { 1004 1004 cl_init.nconnect = mds_clp->cl_nconnect; 1005 1005 cl_init.max_connect = NFS_MAX_TRANSPORTS;
+6 -3
fs/nfs/nfs4proc.c
··· 4715 4715 }; 4716 4716 unsigned short task_flags = 0; 4717 4717 4718 - if (NFS_SERVER(inode)->flags & NFS_MOUNT_SOFTREVAL) 4718 + if (server->flags & NFS_MOUNT_SOFTREVAL) 4719 4719 task_flags |= RPC_TASK_TIMEOUT; 4720 + if (server->caps & NFS_CAP_MOVEABLE) 4721 + task_flags |= RPC_TASK_MOVEABLE; 4720 4722 4721 4723 args.bitmask = nfs4_bitmask(server, fattr->label); 4722 4724 4723 4725 nfs_fattr_init(fattr); 4726 + nfs4_init_sequence(&args.seq_args, &res.seq_res, 0, 0); 4724 4727 4725 4728 dprintk("NFS call lookupp ino=0x%lx\n", inode->i_ino); 4726 - status = nfs4_call_sync(clnt, server, &msg, &args.seq_args, 4727 - &res.seq_res, task_flags); 4729 + status = nfs4_do_call_sync(clnt, server, &msg, &args.seq_args, 4730 + &res.seq_res, task_flags); 4728 4731 dprintk("NFS reply lookupp: %d\n", status); 4729 4732 return status; 4730 4733 }
+35 -31
fs/nfs/pnfs_nfs.c
··· 809 809 unsigned int retrans) 810 810 { 811 811 struct nfs_client *clp = ERR_PTR(-EIO); 812 + struct nfs_client *mds_clp = mds_srv->nfs_client; 813 + enum xprtsec_policies xprtsec_policy = mds_clp->cl_xprtsec.policy; 812 814 struct nfs4_pnfs_ds_addr *da; 813 815 unsigned long connect_timeout = timeo * (retrans + 1) * HZ / 10; 816 + int ds_proto; 814 817 int status = 0; 815 818 816 819 dprintk("--> %s DS %s\n", __func__, ds->ds_remotestr); ··· 837 834 .xprtsec = clp->cl_xprtsec, 838 835 }; 839 836 840 - if (da->da_transport != clp->cl_proto && 841 - clp->cl_proto != XPRT_TRANSPORT_TCP_TLS) 842 - continue; 843 - if (da->da_transport == XPRT_TRANSPORT_TCP && 844 - mds_srv->nfs_client->cl_proto == XPRT_TRANSPORT_TCP_TLS) 837 + if (xprt_args.ident == XPRT_TRANSPORT_TCP && 838 + clp->cl_proto == XPRT_TRANSPORT_TCP_TLS) 845 839 xprt_args.ident = XPRT_TRANSPORT_TCP_TLS; 846 840 847 - if (da->da_addr.ss_family != clp->cl_addr.ss_family) 841 + if (xprt_args.ident != clp->cl_proto) 842 + continue; 843 + if (xprt_args.dstaddr->sa_family != 844 + clp->cl_addr.ss_family) 848 845 continue; 849 846 /* Add this address as an alias */ 850 847 rpc_clnt_add_xprt(clp->cl_rpcclient, &xprt_args, 851 - rpc_clnt_test_and_add_xprt, NULL); 848 + rpc_clnt_test_and_add_xprt, NULL); 852 849 continue; 853 850 } 854 - if (da->da_transport == XPRT_TRANSPORT_TCP && 855 - mds_srv->nfs_client->cl_proto == XPRT_TRANSPORT_TCP_TLS) 856 - da->da_transport = XPRT_TRANSPORT_TCP_TLS; 857 - clp = get_v3_ds_connect(mds_srv, 858 - &da->da_addr, 859 - da->da_addrlen, da->da_transport, 860 - timeo, retrans); 851 + 852 + ds_proto = da->da_transport; 853 + if (ds_proto == XPRT_TRANSPORT_TCP && 854 + xprtsec_policy != RPC_XPRTSEC_NONE) 855 + ds_proto = XPRT_TRANSPORT_TCP_TLS; 856 + 857 + clp = get_v3_ds_connect(mds_srv, &da->da_addr, da->da_addrlen, 858 + ds_proto, timeo, retrans); 861 859 if (IS_ERR(clp)) 862 860 continue; 863 861 clp->cl_rpcclient->cl_softerr = 0; ··· 884 880 u32 minor_version) 885 881 { 886
882 struct nfs_client *clp = ERR_PTR(-EIO); 883 + struct nfs_client *mds_clp = mds_srv->nfs_client; 884 + enum xprtsec_policies xprtsec_policy = mds_clp->cl_xprtsec.policy; 887 885 struct nfs4_pnfs_ds_addr *da; 886 + int ds_proto; 888 887 int status = 0; 889 888 890 889 dprintk("--> %s DS %s\n", __func__, ds->ds_remotestr); ··· 915 908 .data = &xprtdata, 916 909 }; 917 910 918 - if (da->da_transport != clp->cl_proto && 919 - clp->cl_proto != XPRT_TRANSPORT_TCP_TLS) 920 - continue; 921 - if (da->da_transport == XPRT_TRANSPORT_TCP && 922 - mds_srv->nfs_client->cl_proto == 923 - XPRT_TRANSPORT_TCP_TLS) { 911 + if (xprt_args.ident == XPRT_TRANSPORT_TCP && 912 + clp->cl_proto == XPRT_TRANSPORT_TCP_TLS) { 924 913 struct sockaddr *addr = 925 914 (struct sockaddr *)&da->da_addr; 926 915 struct sockaddr_in *sin = ··· 947 944 xprt_args.ident = XPRT_TRANSPORT_TCP_TLS; 948 945 xprt_args.servername = servername; 949 946 } 950 - if (da->da_addr.ss_family != clp->cl_addr.ss_family) 947 + if (xprt_args.ident != clp->cl_proto) 948 + continue; 949 + if (xprt_args.dstaddr->sa_family != 950 + clp->cl_addr.ss_family) 951 951 continue; 952 952 953 953 /** ··· 964 958 if (xprtdata.cred) 965 959 put_cred(xprtdata.cred); 966 960 } else { 967 - if (da->da_transport == XPRT_TRANSPORT_TCP && 968 - mds_srv->nfs_client->cl_proto == 969 - XPRT_TRANSPORT_TCP_TLS) 970 - da->da_transport = XPRT_TRANSPORT_TCP_TLS; 971 - clp = nfs4_set_ds_client(mds_srv, 972 - &da->da_addr, 973 - da->da_addrlen, 974 - da->da_transport, timeo, 975 - retrans, minor_version); 961 + ds_proto = da->da_transport; 962 + if (ds_proto == XPRT_TRANSPORT_TCP && 963 + xprtsec_policy != RPC_XPRTSEC_NONE) 964 + ds_proto = XPRT_TRANSPORT_TCP_TLS; 965 + 966 + clp = nfs4_set_ds_client(mds_srv, &da->da_addr, 967 + da->da_addrlen, ds_proto, 968 + timeo, retrans, minor_version); 976 969 if (IS_ERR(clp)) 977 970 continue; 978 971 ··· 982 977 clp = ERR_PTR(-EIO); 983 978 continue; 984 979 } 985 - 986 980 } 987 981 } 988 982
+1
fs/nfs/sysfs.c
··· 189 189 return p; 190 190 191 191 kobject_put(&p->kobject); 192 + kobject_put(&p->nfs_net_kobj); 192 193 } 193 194 return NULL; 194 195 }
+49 -19
fs/nfsd/nfs4state.c
··· 1542 1542 release_all_access(stp); 1543 1543 if (stp->st_stateowner) 1544 1544 nfs4_put_stateowner(stp->st_stateowner); 1545 - WARN_ON(!list_empty(&stid->sc_cp_list)); 1545 + if (!list_empty(&stid->sc_cp_list)) 1546 + nfs4_free_cpntf_statelist(stid->sc_client->net, stid); 1546 1547 kmem_cache_free(stateid_slab, stid); 1547 1548 } 1548 1549 ··· 3487 3486 struct nfsd4_slot *slot = resp->cstate.slot; 3488 3487 unsigned int base; 3489 3488 3490 - dprintk("--> %s slot %p\n", __func__, slot); 3489 + /* 3490 + * RFC 5661 Section 2.10.6.1.2: 3491 + * 3492 + * Any time SEQUENCE ... returns an error ... [t]he replier MUST NOT 3493 + * modify the reply cache entry for the slot whenever an error is 3494 + * returned from SEQUENCE ... 3495 + * 3496 + * Because nfsd4_store_cache_entry is called only by 3497 + * nfsd4_sequence_done(), nfsd4_store_cache_entry() is called only 3498 + * when a SEQUENCE operation was part of the COMPOUND. 3499 + * nfs41_check_op_ordering() ensures SEQUENCE is the first op. 3500 + */ 3501 + if (resp->opcnt == 1 && resp->cstate.status != nfs_ok) 3502 + return; 3491 3503 3492 3504 slot->sl_flags |= NFSD4_SLOT_INITIALIZED; 3493 3505 slot->sl_opcnt = resp->opcnt; ··· 4363 4349 return true; 4364 4350 } 4365 4351 4352 + /* 4353 + * Note that the response is constructed here both for the case 4354 + * of a new SEQUENCE request and for a replayed SEQUENCE request. 4355 + * We do not cache SEQUENCE responses as SEQUENCE is idempotent.
4356 + */ 4357 + static void nfsd4_construct_sequence_response(struct nfsd4_session *session, 4358 + struct nfsd4_sequence *seq) 4359 + { 4360 + struct nfs4_client *clp = session->se_client; 4361 + 4362 + seq->maxslots_response = max(session->se_target_maxslots, 4363 + seq->maxslots); 4364 + seq->target_maxslots = session->se_target_maxslots; 4365 + 4366 + switch (clp->cl_cb_state) { 4367 + case NFSD4_CB_DOWN: 4368 + seq->status_flags = SEQ4_STATUS_CB_PATH_DOWN; 4369 + break; 4370 + case NFSD4_CB_FAULT: 4371 + seq->status_flags = SEQ4_STATUS_BACKCHANNEL_FAULT; 4372 + break; 4373 + default: 4374 + seq->status_flags = 0; 4375 + } 4376 + if (!list_empty(&clp->cl_revoked)) 4377 + seq->status_flags |= SEQ4_STATUS_RECALLABLE_STATE_REVOKED; 4378 + if (atomic_read(&clp->cl_admin_revoked)) 4379 + seq->status_flags |= SEQ4_STATUS_ADMIN_STATE_REVOKED; 4380 + } 4381 + 4366 4382 __be32 4367 4383 nfsd4_sequence(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, 4368 4384 union nfsd4_op_u *u) ··· 4442 4398 dprintk("%s: slotid %d\n", __func__, seq->slotid); 4443 4399 4444 4400 trace_nfsd_slot_seqid_sequence(clp, seq, slot); 4401 + 4402 + nfsd4_construct_sequence_response(session, seq); 4403 + 4445 4404 status = check_slot_seqid(seq->seqid, slot->sl_seqid, slot->sl_flags); 4446 4405 if (status == nfserr_replay_cache) { 4447 4406 status = nfserr_seq_misordered; ··· 4542 4495 } 4543 4496 4544 4497 out: 4545 - seq->maxslots = max(session->se_target_maxslots, seq->maxslots); 4546 - seq->target_maxslots = session->se_target_maxslots; 4547 - 4548 - switch (clp->cl_cb_state) { 4549 - case NFSD4_CB_DOWN: 4550 - seq->status_flags = SEQ4_STATUS_CB_PATH_DOWN; 4551 - break; 4552 - case NFSD4_CB_FAULT: 4553 - seq->status_flags = SEQ4_STATUS_BACKCHANNEL_FAULT; 4554 - break; 4555 - default: 4556 - seq->status_flags = 0; 4557 - } 4558 - if (!list_empty(&clp->cl_revoked)) 4559 - seq->status_flags |= SEQ4_STATUS_RECALLABLE_STATE_REVOKED; 4560 - if (atomic_read(&clp->cl_admin_revoked)) 4561
- seq->status_flags |= SEQ4_STATUS_ADMIN_STATE_REVOKED; 4562 4498 trace_nfsd_seq4_status(rqstp, seq); 4563 4499 out_no_session: 4564 4500 if (conn)
+2 -3
fs/nfsd/nfs4xdr.c
··· 5073 5073 return nfserr; 5074 5074 /* Note slotid's are numbered from zero: */ 5075 5075 /* sr_highest_slotid */ 5076 - nfserr = nfsd4_encode_slotid4(xdr, seq->maxslots - 1); 5076 + nfserr = nfsd4_encode_slotid4(xdr, seq->maxslots_response - 1); 5077 5077 if (nfserr != nfs_ok) 5078 5078 return nfserr; 5079 5079 /* sr_target_highest_slotid */ ··· 5925 5925 */ 5926 5926 warn_on_nonidempotent_op(op); 5927 5927 xdr_truncate_encode(xdr, op_status_offset + XDR_UNIT); 5928 - } 5929 - if (so) { 5928 + } else if (so) { 5930 5929 int len = xdr->buf->len - (op_status_offset + XDR_UNIT); 5931 5930 5932 5931 so->so_replay.rp_status = op->status;
+1
fs/nfsd/nfsd.h
··· 458 458 #define NFSD4_2_SUPPORTED_ATTRS_WORD2 \ 459 459 (NFSD4_1_SUPPORTED_ATTRS_WORD2 | \ 460 460 FATTR4_WORD2_MODE_UMASK | \ 461 + FATTR4_WORD2_CLONE_BLKSIZE | \ 461 462 NFSD4_2_SECURITY_ATTRS | \ 462 463 FATTR4_WORD2_XATTR_SUPPORT | \ 463 464 FATTR4_WORD2_TIME_DELEG_ACCESS | \
+3 -3
fs/nfsd/nfsfh.c
··· 269 269 dentry); 270 270 } 271 271 272 - fhp->fh_dentry = dentry; 273 - fhp->fh_export = exp; 274 - 275 272 switch (fhp->fh_maxsize) { 276 273 case NFS4_FHSIZE: 277 274 if (dentry->d_sb->s_export_op->flags & EXPORT_OP_NOATOMIC_ATTR) ··· 289 292 if (exp->ex_flags & NFSEXP_V4ROOT) 290 293 goto out; 291 294 } 295 + 296 + fhp->fh_dentry = dentry; 297 + fhp->fh_export = exp; 292 298 293 299 return 0; 294 300 out:
+2 -1
fs/nfsd/xdr4.h
··· 574 574 struct nfs4_sessionid sessionid; /* request/response */ 575 575 u32 seqid; /* request/response */ 576 576 u32 slotid; /* request/response */ 577 - u32 maxslots; /* request/response */ 577 + u32 maxslots; /* request */ 578 578 u32 cachethis; /* request */ 579 + u32 maxslots_response; /* response */ 579 580 u32 target_maxslots; /* response */ 580 581 u32 status_flags; /* response */ 581 582 };
+6 -1
fs/nilfs2/segment.c
··· 2768 2768 2769 2769 if (sci->sc_task) { 2770 2770 wake_up(&sci->sc_wait_daemon); 2771 - kthread_stop(sci->sc_task); 2771 + if (kthread_stop(sci->sc_task)) { 2772 + spin_lock(&sci->sc_state_lock); 2773 + sci->sc_task = NULL; 2774 + timer_shutdown_sync(&sci->sc_timer); 2775 + spin_unlock(&sci->sc_state_lock); 2776 + } 2772 2777 } 2773 2778 2774 2779 spin_lock(&sci->sc_state_lock);
+9 -3
fs/proc/generic.c
··· 698 698 } 699 699 } 700 700 701 + static void pde_erase(struct proc_dir_entry *pde, struct proc_dir_entry *parent) 702 + { 703 + rb_erase(&pde->subdir_node, &parent->subdir); 704 + RB_CLEAR_NODE(&pde->subdir_node); 705 + } 706 + 701 707 /* 702 708 * Remove a /proc entry and free it if it's not currently in use. 703 709 */ ··· 726 720 WARN(1, "removing permanent /proc entry '%s'", de->name); 727 721 de = NULL; 728 722 } else { 729 - rb_erase(&de->subdir_node, &parent->subdir); 723 + pde_erase(de, parent); 730 724 if (S_ISDIR(de->mode)) 731 725 parent->nlink--; 732 726 } ··· 770 764 root->parent->name, root->name); 771 765 return -EINVAL; 772 766 } 773 - rb_erase(&root->subdir_node, &parent->subdir); 767 + pde_erase(root, parent); 774 768 775 769 de = root; 776 770 while (1) { ··· 782 776 next->parent->name, next->name); 783 777 return -EINVAL; 784 778 } 785 - rb_erase(&next->subdir_node, &de->subdir); 779 + pde_erase(next, de); 786 780 de = next; 787 781 continue; 788 782 }
+3 -1
fs/smb/client/fs_context.c
··· 1435 1435 cifs_errorf(fc, "Unknown error parsing devname\n"); 1436 1436 goto cifs_parse_mount_err; 1437 1437 } 1438 + kfree(ctx->source); 1438 1439 ctx->source = smb3_fs_context_fullpath(ctx, '/'); 1439 1440 if (IS_ERR(ctx->source)) { 1440 1441 ctx->source = NULL; 1441 1442 cifs_errorf(fc, "OOM when copying UNC string\n"); 1442 1443 goto cifs_parse_mount_err; 1443 1444 } 1445 + kfree(fc->source); 1444 1446 fc->source = kstrdup(ctx->source, GFP_KERNEL); 1445 1447 if (fc->source == NULL) { 1446 1448 cifs_errorf(fc, "OOM when copying UNC string\n"); ··· 1470 1468 break; 1471 1469 } 1472 1470 1473 - if (strnlen(param->string, CIFS_MAX_USERNAME_LEN) > 1471 + if (strnlen(param->string, CIFS_MAX_USERNAME_LEN) == 1474 1472 CIFS_MAX_USERNAME_LEN) { 1475 1473 pr_warn("username too long\n"); 1476 1474 goto cifs_parse_mount_err;
+3
fs/smb/client/smbdirect.c
··· 290 290 break; 291 291 292 292 case SMBDIRECT_SOCKET_CREATED: 293 + sc->status = SMBDIRECT_SOCKET_DISCONNECTED; 294 + break; 295 + 293 296 case SMBDIRECT_SOCKET_CONNECTED: 294 297 sc->status = SMBDIRECT_SOCKET_ERROR; 295 298 break;
+1 -1
fs/smb/client/transport.c
··· 830 830 if (!server || server->terminate) 831 831 continue; 832 832 833 - if (CIFS_CHAN_NEEDS_RECONNECT(ses, i)) 833 + if (CIFS_CHAN_NEEDS_RECONNECT(ses, cur)) 834 834 continue; 835 835 836 836 /*
+13 -1
fs/smb/server/transport_rdma.c
··· 334 334 break; 335 335 336 336 case SMBDIRECT_SOCKET_CREATED: 337 + sc->status = SMBDIRECT_SOCKET_DISCONNECTED; 338 + break; 339 + 337 340 case SMBDIRECT_SOCKET_CONNECTED: 338 341 sc->status = SMBDIRECT_SOCKET_ERROR; 339 342 break; ··· 1886 1883 static int smb_direct_prepare_negotiation(struct smbdirect_socket *sc) 1887 1884 { 1888 1885 struct smbdirect_recv_io *recvmsg; 1886 + bool recv_posted = false; 1889 1887 int ret; 1890 1888 1891 1889 WARN_ON_ONCE(sc->status != SMBDIRECT_SOCKET_CREATED); ··· 1903 1899 pr_err("Can't post recv: %d\n", ret); 1904 1900 goto out_err; 1905 1901 } 1902 + recv_posted = true; 1906 1903 1907 1904 ret = smb_direct_accept_client(sc); 1908 1905 if (ret) { ··· 1913 1908 1914 1909 return 0; 1915 1910 out_err: 1916 - put_recvmsg(sc, recvmsg); 1911 + /* 1912 + * If the recv was never posted, return it to the free list. 1913 + * If it was posted, leave it alone so disconnect teardown can 1914 + * drain the QP and complete it (flush) and the completion path 1915 + * will unmap it exactly once. 1916 + */ 1917 + if (!recv_posted) 1918 + put_recvmsg(sc, recvmsg); 1917 1919 return ret; 1918 1920 } 1919 1921
+4 -1
fs/smb/server/transport_tcp.c
··· 290 290 } 291 291 } 292 292 up_read(&conn_list_lock); 293 - if (ret == -EAGAIN) 293 + if (ret == -EAGAIN) { 294 + /* Per-IP limit hit: release the just-accepted socket. */ 295 + sock_release(client_sk); 294 296 continue; 297 + } 295 298 296 299 skip_max_ip_conns_limit: 297 300 if (server_conf.max_connections &&
+1 -1
include/linux/dma-mapping.h
··· 90 90 */ 91 91 #define DMA_MAPPING_ERROR (~(dma_addr_t)0) 92 92 93 - #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1)) 93 + #define DMA_BIT_MASK(n) GENMASK_ULL(n - 1, 0) 94 94 95 95 struct dma_iova_state { 96 96 dma_addr_t addr;
+1 -1
include/linux/entry-virt.h
··· 32 32 */ 33 33 static inline int arch_xfer_to_guest_mode_handle_work(unsigned long ti_work); 34 34 35 - #ifndef arch_xfer_to_guest_mode_work 35 + #ifndef arch_xfer_to_guest_mode_handle_work 36 36 static inline int arch_xfer_to_guest_mode_handle_work(unsigned long ti_work) 37 37 { 38 38 return 0;
+1 -1
include/linux/ethtool.h
··· 492 492 }; 493 493 494 494 #define ETHTOOL_MAX_LANES 8 495 - /** 495 + /* 496 496 * IEEE 802.3ck/df defines 16 bins for FEC histogram plus one more for 497 497 * the end-of-list marker, total 17 items 498 498 */
+20
include/linux/filter.h
··· 901 901 cb->data_end = skb->data + skb_headlen(skb); 902 902 } 903 903 904 + static inline int bpf_prog_run_data_pointers( 905 + const struct bpf_prog *prog, 906 + struct sk_buff *skb) 907 + { 908 + struct bpf_skb_data_end *cb = (struct bpf_skb_data_end *)skb->cb; 909 + void *save_data_meta, *save_data_end; 910 + int res; 911 + 912 + save_data_meta = cb->data_meta; 913 + save_data_end = cb->data_end; 914 + 915 + bpf_compute_data_pointers(skb); 916 + res = bpf_prog_run(prog, skb); 917 + 918 + cb->data_meta = save_data_meta; 919 + cb->data_end = save_data_end; 920 + 921 + return res; 922 + } 923 + 904 924 /* Similar to bpf_compute_data_pointers(), except that save orginal 905 925 * data in cb->data and cb->meta_data for restore. 906 926 */
+9 -1
include/linux/ftrace.h
··· 193 193 #if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) || \ 194 194 defined(CONFIG_HAVE_FTRACE_REGS_HAVING_PT_REGS) 195 195 196 + #ifndef arch_ftrace_partial_regs 197 + #define arch_ftrace_partial_regs(regs) do {} while (0) 198 + #endif 199 + 196 200 static __always_inline struct pt_regs * 197 201 ftrace_partial_regs(struct ftrace_regs *fregs, struct pt_regs *regs) 198 202 { ··· 206 202 * Since arch_ftrace_get_regs() will check some members and may return 207 203 * NULL, we can not use it. 208 204 */ 209 - return &arch_ftrace_regs(fregs)->regs; 205 + regs = &arch_ftrace_regs(fregs)->regs; 206 + 207 + /* Allow arch specific updates to regs. */ 208 + arch_ftrace_partial_regs(regs); 209 + return regs; 210 210 } 211 211 212 212 #endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_FTRACE_REGS_HAVING_PT_REGS */
+3
include/linux/gfp.h
··· 7 7 #include <linux/mmzone.h> 8 8 #include <linux/topology.h> 9 9 #include <linux/alloc_tag.h> 10 + #include <linux/cleanup.h> 10 11 #include <linux/sched.h> 11 12 12 13 struct vm_area_struct; ··· 463 462 #endif 464 463 /* This should be paired with folio_put() rather than free_contig_range(). */ 465 464 #define folio_alloc_gigantic(...) alloc_hooks(folio_alloc_gigantic_noprof(__VA_ARGS__)) 465 + 466 + DEFINE_FREE(free_page, void *, free_page((unsigned long)_T)) 466 467 467 468 #endif /* __LINUX_GFP_H */
+23 -32
include/linux/huge_mm.h
··· 376 376 int folio_split(struct folio *folio, unsigned int new_order, struct page *page, 377 377 struct list_head *list); 378 378 /* 379 - * try_folio_split - try to split a @folio at @page using non uniform split. 379 + * try_folio_split_to_order - try to split a @folio at @page to @new_order using 380 + * non uniform split. 380 381 * @folio: folio to be split 381 - * @page: split to order-0 at the given page 382 - * @list: store the after-split folios 382 + * @page: split to @new_order at the given page 383 + * @new_order: the target split order 383 384 * 384 - * Try to split a @folio at @page using non uniform split to order-0, if 385 - * non uniform split is not supported, fall back to uniform split. 385 + * Try to split a @folio at @page using non uniform split to @new_order, if 386 + * non uniform split is not supported, fall back to uniform split. After-split 387 + * folios are put back to LRU list. Use min_order_for_split() to get the lower 388 + * bound of @new_order. 386 389 * 387 390 * Return: 0: split is successful, otherwise split failed. 
388 391 */ 389 - static inline int try_folio_split(struct folio *folio, struct page *page, 390 - struct list_head *list) 392 + static inline int try_folio_split_to_order(struct folio *folio, 393 + struct page *page, unsigned int new_order) 391 394 { 392 - int ret = min_order_for_split(folio); 393 - 394 - if (ret < 0) 395 - return ret; 396 - 397 - if (!non_uniform_split_supported(folio, 0, false)) 398 - return split_huge_page_to_list_to_order(&folio->page, list, 399 - ret); 400 - return folio_split(folio, ret, page, list); 395 + if (!non_uniform_split_supported(folio, new_order, /* warns= */ false)) 396 + return split_huge_page_to_list_to_order(&folio->page, NULL, 397 + new_order); 398 + return folio_split(folio, new_order, page, NULL); 401 399 } 402 400 static inline int split_huge_page(struct page *page) 403 401 { 404 - struct folio *folio = page_folio(page); 405 - int ret = min_order_for_split(folio); 406 - 407 - if (ret < 0) 408 - return ret; 409 - 410 - /* 411 - * split_huge_page() locks the page before splitting and 412 - * expects the same page that has been split to be locked when 413 - * returned. split_folio(page_folio(page)) cannot be used here 414 - * because it converts the page to folio and passes the head 415 - * page to be split. 
416 - */ 417 - return split_huge_page_to_list_to_order(page, NULL, ret); 402 + return split_huge_page_to_list_to_order(page, NULL, 0); 418 403 } 419 404 void deferred_split_folio(struct folio *folio, bool partially_mapped); 420 405 ··· 582 597 return -EINVAL; 583 598 } 584 599 600 + static inline int min_order_for_split(struct folio *folio) 601 + { 602 + VM_WARN_ON_ONCE_FOLIO(1, folio); 603 + return -EINVAL; 604 + } 605 + 585 606 static inline int split_folio_to_list(struct folio *folio, struct list_head *list) 586 607 { 587 608 VM_WARN_ON_ONCE_FOLIO(1, folio); 588 609 return -EINVAL; 589 610 } 590 611 591 - static inline int try_folio_split(struct folio *folio, struct page *page, 592 - struct list_head *list) 612 + static inline int try_folio_split_to_order(struct folio *folio, 613 + struct page *page, unsigned int new_order) 593 614 { 594 615 VM_WARN_ON_ONCE_FOLIO(1, folio); 595 616 return -EINVAL;
+1
include/linux/map_benchmark.h
··· 27 27 __u32 dma_dir; /* DMA data direction */ 28 28 __u32 dma_trans_ns; /* time for DMA transmission in ns */ 29 29 __u32 granule; /* how many PAGE_SIZE will do map/unmap once a time */ 30 + __u8 expansion[76]; /* For future use */ 30 31 }; 31 32 #endif /* _KERNEL_DMA_BENCHMARK_H */
+1
include/linux/mlx5/cq.h
··· 183 183 complete(&cq->free); 184 184 } 185 185 186 + void mlx5_add_cq_to_tasklet(struct mlx5_core_cq *cq, struct mlx5_eqe *eqe); 186 187 int mlx5_create_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq, 187 188 u32 *in, int inlen, u32 *out, int outlen); 188 189 int mlx5_core_create_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq,
+10 -3
include/linux/mm.h
··· 2074 2074 return folio_large_nr_pages(folio); 2075 2075 } 2076 2076 2077 - #if !defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE) 2077 + #if !defined(CONFIG_HAVE_GIGANTIC_FOLIOS) 2078 2078 /* 2079 2079 * We don't expect any folios that exceed buddy sizes (and consequently 2080 2080 * memory sections). ··· 2087 2087 * pages are guaranteed to be contiguous. 2088 2088 */ 2089 2089 #define MAX_FOLIO_ORDER PFN_SECTION_SHIFT 2090 - #else 2090 + #elif defined(CONFIG_HUGETLB_PAGE) 2091 2091 /* 2092 2092 * There is no real limit on the folio size. We limit them to the maximum we 2093 - * currently expect (e.g., hugetlb, dax). 2093 + * currently expect (see CONFIG_HAVE_GIGANTIC_FOLIOS): with hugetlb, we expect 2094 + * no folios larger than 16 GiB on 64bit and 1 GiB on 32bit. 2095 + */ 2096 + #define MAX_FOLIO_ORDER get_order(IS_ENABLED(CONFIG_64BIT) ? SZ_16G : SZ_1G) 2097 + #else 2098 + /* 2099 + * Without hugetlb, gigantic folios that are bigger than a single PUD are 2100 + * currently impossible. 2094 2101 */ 2095 2102 #define MAX_FOLIO_ORDER PUD_ORDER 2096 2103 #endif
+2
include/linux/pci.h
··· 412 412 u16 l1ss; /* L1SS Capability pointer */ 413 413 #ifdef CONFIG_PCIEASPM 414 414 struct pcie_link_state *link_state; /* ASPM link state */ 415 + unsigned int aspm_l0s_support:1; /* ASPM L0s support */ 416 + unsigned int aspm_l1_support:1; /* ASPM L1 support */ 415 417 unsigned int ltr_path:1; /* Latency Tolerance Reporting 416 418 supported from root to here */ 417 419 #endif
+5
include/net/bluetooth/hci.h
··· 2783 2783 __u8 data[]; 2784 2784 } __packed; 2785 2785 2786 + #define HCI_EV_LE_PA_SYNC_LOST 0x10 2787 + struct hci_ev_le_pa_sync_lost { 2788 + __le16 handle; 2789 + } __packed; 2790 + 2786 2791 #define LE_PA_DATA_COMPLETE 0x00 2787 2792 #define LE_PA_DATA_MORE_TO_COME 0x01 2788 2793 #define LE_PA_DATA_TRUNCATED 0x02
+3
include/uapi/linux/io_uring/query.h
··· 36 36 __u64 enter_flags; 37 37 /* Bitmask of all supported IOSQE_* flags */ 38 38 __u64 sqe_flags; 39 + /* The number of available query opcodes */ 40 + __u32 nr_query_opcodes; 41 + __u32 __pad; 39 42 }; 40 43 41 44 #endif
+6
include/uapi/sound/intel/avs/tokens.h
··· 21 21 AVS_TKN_MANIFEST_NUM_BINDINGS_U32 = 8, 22 22 AVS_TKN_MANIFEST_NUM_CONDPATH_TMPLS_U32 = 9, 23 23 AVS_TKN_MANIFEST_NUM_INIT_CONFIGS_U32 = 10, 24 + AVS_TKN_MANIFEST_NUM_NHLT_CONFIGS_U32 = 11, 24 25 25 26 /* struct avs_tplg_library */ 26 27 AVS_TKN_LIBRARY_ID_U32 = 101, ··· 125 124 AVS_TKN_MOD_KCONTROL_ID_U32 = 1707, 126 125 AVS_TKN_MOD_INIT_CONFIG_NUM_IDS_U32 = 1708, 127 126 AVS_TKN_MOD_INIT_CONFIG_ID_U32 = 1709, 127 + AVS_TKN_MOD_NHLT_CONFIG_ID_U32 = 1710, 128 128 129 129 /* struct avs_tplg_path_template */ 130 130 AVS_TKN_PATH_TMPL_ID_U32 = 1801, ··· 162 160 AVS_TKN_INIT_CONFIG_ID_U32 = 2401, 163 161 AVS_TKN_INIT_CONFIG_PARAM_U8 = 2402, 164 162 AVS_TKN_INIT_CONFIG_LENGTH_U32 = 2403, 163 + 164 + /* struct avs_tplg_nhlt_config */ 165 + AVS_TKN_NHLT_CONFIG_ID_U32 = 2501, 166 + AVS_TKN_NHLT_CONFIG_SIZE_U32 = 2502, 165 167 }; 166 168 167 169 #endif
+2
io_uring/query.c
··· 20 20 e->ring_setup_flags = IORING_SETUP_FLAGS; 21 21 e->enter_flags = IORING_ENTER_FLAGS; 22 22 e->sqe_flags = SQE_VALID_FLAGS; 23 + e->nr_query_opcodes = __IO_URING_QUERY_MAX; 24 + e->__pad = 0; 23 25 return sizeof(*e); 24 26 } 25 27
+9 -7
io_uring/rsrc.c
··· 943 943 struct req_iterator rq_iter; 944 944 struct io_mapped_ubuf *imu; 945 945 struct io_rsrc_node *node; 946 - struct bio_vec bv, *bvec; 947 - u16 nr_bvecs; 946 + struct bio_vec bv; 947 + unsigned int nr_bvecs = 0; 948 948 int ret = 0; 949 949 950 950 io_ring_submit_lock(ctx, issue_flags); ··· 965 965 goto unlock; 966 966 } 967 967 968 - nr_bvecs = blk_rq_nr_phys_segments(rq); 969 - imu = io_alloc_imu(ctx, nr_bvecs); 968 + /* 969 + * blk_rq_nr_phys_segments() may overestimate the number of bvecs 970 + * but avoids needing to iterate over the bvecs 971 + */ 972 + imu = io_alloc_imu(ctx, blk_rq_nr_phys_segments(rq)); 970 973 if (!imu) { 971 974 kfree(node); 972 975 ret = -ENOMEM; ··· 980 977 imu->len = blk_rq_bytes(rq); 981 978 imu->acct_pages = 0; 982 979 imu->folio_shift = PAGE_SHIFT; 983 - imu->nr_bvecs = nr_bvecs; 984 980 refcount_set(&imu->refs, 1); 985 981 imu->release = release; 986 982 imu->priv = rq; 987 983 imu->is_kbuf = true; 988 984 imu->dir = 1 << rq_data_dir(rq); 989 985 990 - bvec = imu->bvec; 991 986 rq_for_each_bvec(bv, rq, rq_iter) 992 - *bvec++ = bv; 987 + imu->bvec[nr_bvecs++] = bv; 988 + imu->nr_bvecs = nr_bvecs; 993 989 994 990 node->buf = imu; 995 991 data->nodes[index] = node;
+3
io_uring/rw.c
··· 463 463 464 464 void io_readv_writev_cleanup(struct io_kiocb *req) 465 465 { 466 + struct io_async_rw *rw = req->async_data; 467 + 466 468 lockdep_assert_held(&req->ctx->uring_lock); 469 + io_vec_free(&rw->vec); 467 470 io_rw_recycle(req, 0); 468 471 } 469 472
+9
kernel/Kconfig.kexec
··· 109 109 to keep data or state alive across the kexec. For this to work, 110 110 both source and target kernels need to have this option enabled. 111 111 112 + config KEXEC_HANDOVER_DEBUG 113 + bool "Enable Kexec Handover debug checks" 114 + depends on KEXEC_HANDOVER 115 + help 116 + This option enables extra sanity checks for the Kexec Handover 117 + subsystem. Since, KHO performance is crucial in live update 118 + scenarios and the extra code might be adding overhead it is 119 + only optionally enabled. 120 + 112 121 config CRASH_DUMP 113 122 bool "kernel crash dumps" 114 123 default ARCH_DEFAULT_CRASH_DUMP
+1
kernel/Makefile
··· 83 83 obj-$(CONFIG_KEXEC_FILE) += kexec_file.o 84 84 obj-$(CONFIG_KEXEC_ELF) += kexec_elf.o 85 85 obj-$(CONFIG_KEXEC_HANDOVER) += kexec_handover.o 86 + obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) += kexec_handover_debug.o 86 87 obj-$(CONFIG_BACKTRACE_SELF_TEST) += backtracetest.o 87 88 obj-$(CONFIG_COMPAT) += compat.o 88 89 obj-$(CONFIG_CGROUPS) += cgroup/
+15 -11
kernel/bpf/helpers.c
··· 4169 4169 } 4170 4170 4171 4171 /** 4172 - * bpf_task_work_schedule_signal - Schedule BPF callback using task_work_add with TWA_SIGNAL mode 4172 + * bpf_task_work_schedule_signal_impl - Schedule BPF callback using task_work_add with TWA_SIGNAL 4173 + * mode 4173 4174 * @task: Task struct for which callback should be scheduled 4174 4175 * @tw: Pointer to struct bpf_task_work in BPF map value for internal bookkeeping 4175 4176 * @map__map: bpf_map that embeds struct bpf_task_work in the values ··· 4179 4178 * 4180 4179 * Return: 0 if task work has been scheduled successfully, negative error code otherwise 4181 4180 */ 4182 - __bpf_kfunc int bpf_task_work_schedule_signal(struct task_struct *task, struct bpf_task_work *tw, 4183 - void *map__map, bpf_task_work_callback_t callback, 4184 - void *aux__prog) 4181 + __bpf_kfunc int bpf_task_work_schedule_signal_impl(struct task_struct *task, 4182 + struct bpf_task_work *tw, void *map__map, 4183 + bpf_task_work_callback_t callback, 4184 + void *aux__prog) 4185 4185 { 4186 4186 return bpf_task_work_schedule(task, tw, map__map, callback, aux__prog, TWA_SIGNAL); 4187 4187 } 4188 4188 4189 4189 /** 4190 - * bpf_task_work_schedule_resume - Schedule BPF callback using task_work_add with TWA_RESUME mode 4190 + * bpf_task_work_schedule_resume_impl - Schedule BPF callback using task_work_add with TWA_RESUME 4191 + * mode 4191 4192 * @task: Task struct for which callback should be scheduled 4192 4193 * @tw: Pointer to struct bpf_task_work in BPF map value for internal bookkeeping 4193 4194 * @map__map: bpf_map that embeds struct bpf_task_work in the values ··· 4198 4195 * 4199 4196 * Return: 0 if task work has been scheduled successfully, negative error code otherwise 4200 4197 */ 4201 - __bpf_kfunc int bpf_task_work_schedule_resume(struct task_struct *task, struct bpf_task_work *tw, 4202 - void *map__map, bpf_task_work_callback_t callback, 4203 - void *aux__prog) 4198 + __bpf_kfunc int bpf_task_work_schedule_resume_impl(struct task_struct *task, 4199 + struct bpf_task_work *tw, void *map__map, 4200 + bpf_task_work_callback_t callback, 4201 + void *aux__prog) 4204 4202 { 4205 4203 return bpf_task_work_schedule(task, tw, map__map, callback, aux__prog, TWA_RESUME); 4206 4204 } ··· 4380 4376 #if defined(CONFIG_BPF_LSM) && defined(CONFIG_CGROUPS) 4381 4377 BTF_ID_FLAGS(func, bpf_cgroup_read_xattr, KF_RCU) 4382 4378 #endif 4383 - BTF_ID_FLAGS(func, bpf_stream_vprintk, KF_TRUSTED_ARGS) 4384 - BTF_ID_FLAGS(func, bpf_task_work_schedule_signal, KF_TRUSTED_ARGS) 4385 - BTF_ID_FLAGS(func, bpf_task_work_schedule_resume, KF_TRUSTED_ARGS) 4379 + BTF_ID_FLAGS(func, bpf_stream_vprintk_impl, KF_TRUSTED_ARGS) 4380 + BTF_ID_FLAGS(func, bpf_task_work_schedule_signal_impl, KF_TRUSTED_ARGS) 4381 + BTF_ID_FLAGS(func, bpf_task_work_schedule_resume_impl, KF_TRUSTED_ARGS) 4386 4382 BTF_KFUNCS_END(common_btf_ids) 4387 4383 4388 4384 static const struct btf_kfunc_id_set common_kfunc_set = {
+2 -1
kernel/bpf/stream.c
··· 355 355 * Avoid using enum bpf_stream_id so that kfunc users don't have to pull in the 356 356 * enum in headers. 357 357 */ 358 - __bpf_kfunc int bpf_stream_vprintk(int stream_id, const char *fmt__str, const void *args, u32 len__sz, void *aux__prog) 358 + __bpf_kfunc int bpf_stream_vprintk_impl(int stream_id, const char *fmt__str, const void *args, 359 + u32 len__sz, void *aux__prog) 359 360 { 360 361 struct bpf_bprintf_data data = { 361 362 .get_bin_args = true,
-5
kernel/bpf/trampoline.c
··· 479 479 * BPF_TRAMP_F_SHARE_IPMODIFY is set, we can generate the 480 480 * trampoline again, and retry register. 481 481 */ 482 - /* reset fops->func and fops->trampoline for re-register */ 483 - tr->fops->func = NULL; 484 - tr->fops->trampoline = 0; 485 - 486 - /* free im memory and reallocate later */ 487 482 bpf_tramp_image_free(im); 488 483 goto again; 489 484 }
+10 -8
kernel/bpf/verifier.c
··· 8866 8866 struct bpf_verifier_state *cur) 8867 8867 { 8868 8868 struct bpf_func_state *fold, *fcur; 8869 - int i, fr; 8869 + int i, fr, num_slots; 8870 8870 8871 8871 reset_idmap_scratch(env); 8872 8872 for (fr = old->curframe; fr >= 0; fr--) { ··· 8879 8879 &fcur->regs[i], 8880 8880 &env->idmap_scratch); 8881 8881 8882 - for (i = 0; i < fold->allocated_stack / BPF_REG_SIZE; i++) { 8882 + num_slots = min(fold->allocated_stack / BPF_REG_SIZE, 8883 + fcur->allocated_stack / BPF_REG_SIZE); 8884 + for (i = 0; i < num_slots; i++) { 8883 8885 if (!is_spilled_reg(&fold->stack[i]) || 8884 8886 !is_spilled_reg(&fcur->stack[i])) 8885 8887 continue; ··· 12261 12259 KF_bpf_res_spin_lock_irqsave, 12262 12260 KF_bpf_res_spin_unlock_irqrestore, 12263 12261 KF___bpf_trap, 12264 - KF_bpf_task_work_schedule_signal, 12265 - KF_bpf_task_work_schedule_resume, 12262 + KF_bpf_task_work_schedule_signal_impl, 12263 + KF_bpf_task_work_schedule_resume_impl, 12266 12264 }; 12267 12265 12268 12266 BTF_ID_LIST(special_kfunc_list) ··· 12333 12331 BTF_ID(func, bpf_res_spin_lock_irqsave) 12334 12332 BTF_ID(func, bpf_res_spin_unlock_irqrestore) 12335 12333 BTF_ID(func, __bpf_trap) 12336 - BTF_ID(func, bpf_task_work_schedule_signal) 12337 - BTF_ID(func, bpf_task_work_schedule_resume) 12334 + BTF_ID(func, bpf_task_work_schedule_signal_impl) 12335 + BTF_ID(func, bpf_task_work_schedule_resume_impl) 12338 12336 12339 12337 static bool is_task_work_add_kfunc(u32 func_id) 12340 12338 { 12341 - return func_id == special_kfunc_list[KF_bpf_task_work_schedule_signal] || 12342 - func_id == special_kfunc_list[KF_bpf_task_work_schedule_resume]; 12339 + return func_id == special_kfunc_list[KF_bpf_task_work_schedule_signal_impl] || 12340 + func_id == special_kfunc_list[KF_bpf_task_work_schedule_resume_impl]; 12343 12341 } 12344 12342 12345 12343 static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta)
+1 -1
kernel/crash_core.c
··· 373 373 old_res->start = 0; 374 374 old_res->end = 0; 375 375 } else { 376 - crashk_res.end = ram_res->start - 1; 376 + old_res->end = ram_res->start - 1; 377 377 } 378 378 379 379 crash_free_reserved_phys_range(ram_res->start, ram_res->end);
+3 -1
kernel/gcov/gcc_4_7.c
··· 18 18 #include <linux/mm.h> 19 19 #include "gcov.h" 20 20 21 - #if (__GNUC__ >= 14) 21 + #if (__GNUC__ >= 15) 22 + #define GCOV_COUNTERS 10 23 + #elif (__GNUC__ >= 14) 22 24 #define GCOV_COUNTERS 9 23 25 #elif (__GNUC__ >= 10) 24 26 #define GCOV_COUNTERS 8
+58 -37
kernel/kexec_handover.c
··· 8 8 9 9 #define pr_fmt(fmt) "KHO: " fmt 10 10 11 + #include <linux/cleanup.h> 11 12 #include <linux/cma.h> 12 13 #include <linux/count_zeros.h> 13 14 #include <linux/debugfs.h> ··· 23 22 24 23 #include <asm/early_ioremap.h> 25 24 25 + #include "kexec_handover_internal.h" 26 26 /* 27 27 * KHO is tightly coupled with mm init and needs access to some of mm 28 28 * internal APIs. ··· 69 67 * Keep track of memory that is to be preserved across KHO. 70 68 * 71 69 * The serializing side uses two levels of xarrays to manage chunks of per-order 72 - * 512 byte bitmaps. For instance if PAGE_SIZE = 4096, the entire 1G order of a 73 - * 1TB system would fit inside a single 512 byte bitmap. For order 0 allocations 74 - * each bitmap will cover 16M of address space. Thus, for 16G of memory at most 75 - * 512K of bitmap memory will be needed for order 0. 70 + * PAGE_SIZE byte bitmaps. For instance if PAGE_SIZE = 4096, the entire 1G order 71 + * of a 8TB system would fit inside a single 4096 byte bitmap. For order 0 72 + * allocations each bitmap will cover 128M of address space. Thus, for 16G of 73 + * memory at most 512K of bitmap memory will be needed for order 0. 76 74 * 77 75 * This approach is fully incremental, as the serialization progresses folios 78 76 * can continue be aggregated to the tracker. The final step, immediately prior ··· 80 78 * successor kernel to parse. 
81 79 */ 82 80 83 - #define PRESERVE_BITS (512 * 8) 81 + #define PRESERVE_BITS (PAGE_SIZE * 8) 84 82 85 83 struct kho_mem_phys_bits { 86 84 DECLARE_BITMAP(preserve, PRESERVE_BITS); 87 85 }; 86 + 87 + static_assert(sizeof(struct kho_mem_phys_bits) == PAGE_SIZE); 88 88 89 89 struct kho_mem_phys { 90 90 /* ··· 135 131 .finalized = false, 136 132 }; 137 133 138 - static void *xa_load_or_alloc(struct xarray *xa, unsigned long index, size_t sz) 134 + static void *xa_load_or_alloc(struct xarray *xa, unsigned long index) 139 135 { 140 - void *elm, *res; 136 + void *res = xa_load(xa, index); 141 137 142 - elm = xa_load(xa, index); 143 - if (elm) 144 - return elm; 138 + if (res) 139 + return res; 145 140 146 - elm = kzalloc(sz, GFP_KERNEL); 141 + void *elm __free(free_page) = (void *)get_zeroed_page(GFP_KERNEL); 142 + 147 143 if (!elm) 148 144 return ERR_PTR(-ENOMEM); 149 145 146 + if (WARN_ON(kho_scratch_overlap(virt_to_phys(elm), PAGE_SIZE))) 147 + return ERR_PTR(-EINVAL); 148 + 150 149 res = xa_cmpxchg(xa, index, NULL, elm, GFP_KERNEL); 151 150 if (xa_is_err(res)) 152 - res = ERR_PTR(xa_err(res)); 153 - 154 - if (res) { 155 - kfree(elm); 151 + return ERR_PTR(xa_err(res)); 152 + else if (res) 156 153 return res; 157 - } 158 154 159 - return elm; 155 + return no_free_ptr(elm); 160 156 } 161 157 162 158 static void __kho_unpreserve(struct kho_mem_track *track, unsigned long pfn, ··· 171 167 const unsigned long pfn_high = pfn >> order; 172 168 173 169 physxa = xa_load(&track->orders, order); 174 - if (!physxa) 175 - continue; 170 + if (WARN_ON_ONCE(!physxa)) 171 + return; 176 172 177 173 bits = xa_load(&physxa->phys_bits, pfn_high / PRESERVE_BITS); 178 - if (!bits) 179 - continue; 174 + if (WARN_ON_ONCE(!bits)) 175 + return; 180 176 181 177 clear_bit(pfn_high % PRESERVE_BITS, bits->preserve); 182 178 ··· 220 216 } 221 217 } 222 218 223 - bits = xa_load_or_alloc(&physxa->phys_bits, pfn_high / PRESERVE_BITS, 224 219 - sizeof(*bits)); 219 + bits = xa_load_or_alloc(&physxa->phys_bits, pfn_high / PRESERVE_BITS); 225 220 if (IS_ERR(bits)) 226 221 return PTR_ERR(bits); 227 222 ··· 348 345 static struct khoser_mem_chunk *new_chunk(struct khoser_mem_chunk *cur_chunk, 349 346 unsigned long order) 350 347 { 351 - struct khoser_mem_chunk *chunk; 348 + struct khoser_mem_chunk *chunk __free(free_page) = NULL; 352 349 353 - chunk = kzalloc(PAGE_SIZE, GFP_KERNEL); 350 + chunk = (void *)get_zeroed_page(GFP_KERNEL); 354 351 if (!chunk) 355 - return NULL; 352 + return ERR_PTR(-ENOMEM); 353 + 354 + if (WARN_ON(kho_scratch_overlap(virt_to_phys(chunk), PAGE_SIZE))) 355 + return ERR_PTR(-EINVAL); 356 + 356 357 chunk->hdr.order = order; 357 358 if (cur_chunk) 358 359 KHOSER_STORE_PTR(cur_chunk->hdr.next, chunk); 359 - return chunk; 360 + return no_free_ptr(chunk); 360 361 } 361 362 362 363 static void kho_mem_ser_free(struct khoser_mem_chunk *first_chunk) ··· 381 374 struct khoser_mem_chunk *chunk = NULL; 382 375 struct kho_mem_phys *physxa; 383 376 unsigned long order; 377 + int err = -ENOMEM; 384 378 385 379 xa_for_each(&ser->track.orders, order, physxa) { 386 380 struct kho_mem_phys_bits *bits; 387 381 unsigned long phys; 388 382 389 383 chunk = new_chunk(chunk, order); 390 - if (!chunk) 384 + if (IS_ERR(chunk)) { 385 + err = PTR_ERR(chunk); 391 386 goto err_free; 387 + } 392 388 393 389 if (!first_chunk) 394 390 first_chunk = chunk; ··· 401 391 402 392 if (chunk->hdr.num_elms == ARRAY_SIZE(chunk->bitmaps)) { 403 393 chunk = new_chunk(chunk, order); 404 - if (!chunk) 394 + if (IS_ERR(chunk)) { 395 + err = PTR_ERR(chunk); 405 396 goto err_free; 397 + } 406 398 } 407 399 408 400 elm = &chunk->bitmaps[chunk->hdr.num_elms]; ··· 421 409 422 410 err_free: 423 411 kho_mem_ser_free(first_chunk); 424 - return -ENOMEM; 412 + return err; 425 413 } 426 414 427 415 static void __init deserialize_bitmap(unsigned int order, ··· 477 465 * area for early allocations that happen before page allocator is 478 466 * initialized. 
479 467 */ 480 - static struct kho_scratch *kho_scratch; 481 - static unsigned int kho_scratch_cnt; 468 + struct kho_scratch *kho_scratch; 469 + unsigned int kho_scratch_cnt; 482 470 483 471 /* 484 472 * The scratch areas are scaled by default as percent of memory allocated from ··· 764 752 const unsigned int order = folio_order(folio); 765 753 struct kho_mem_track *track = &kho_out.ser.track; 766 754 755 + if (WARN_ON(kho_scratch_overlap(pfn << PAGE_SHIFT, PAGE_SIZE << order))) 756 + return -EINVAL; 757 + 767 758 return __kho_preserve_order(track, pfn, order); 768 759 } 769 760 EXPORT_SYMBOL_GPL(kho_preserve_folio); ··· 789 774 unsigned long pfn = start_pfn; 790 775 unsigned long failed_pfn = 0; 791 776 int err = 0; 777 + 778 + if (WARN_ON(kho_scratch_overlap(start_pfn << PAGE_SHIFT, 779 + nr_pages << PAGE_SHIFT))) { 780 + return -EINVAL; 781 + } 792 782 793 783 while (pfn < end_pfn) { 794 784 const unsigned int order = ··· 882 862 return NULL; 883 863 } 884 864 885 - static void kho_vmalloc_unpreserve_chunk(struct kho_vmalloc_chunk *chunk) 865 + static void kho_vmalloc_unpreserve_chunk(struct kho_vmalloc_chunk *chunk, 866 + unsigned short order) 886 867 { 887 868 struct kho_mem_track *track = &kho_out.ser.track; 888 869 unsigned long pfn = PHYS_PFN(virt_to_phys(chunk)); 889 870 890 871 __kho_unpreserve(track, pfn, pfn + 1); 891 872 892 - for (int i = 0; chunk->phys[i]; i++) { 873 + for (int i = 0; i < ARRAY_SIZE(chunk->phys) && chunk->phys[i]; i++) { 893 874 pfn = PHYS_PFN(chunk->phys[i]); 894 - __kho_unpreserve(track, pfn, pfn + 1); 875 + __kho_unpreserve(track, pfn, pfn + (1 << order)); 895 876 } 896 877 } 897 878 ··· 903 882 while (chunk) { 904 883 struct kho_vmalloc_chunk *tmp = chunk; 905 884 906 - kho_vmalloc_unpreserve_chunk(chunk); 885 + kho_vmalloc_unpreserve_chunk(chunk, kho_vmalloc->order); 907 886 908 887 chunk = KHOSER_LOAD_PTR(chunk->hdr.next); 909 888 free_page((unsigned long)tmp); ··· 1013 992 while (chunk) { 1014 993 struct page *page; 1015 994 
1016 - for (int i = 0; chunk->phys[i]; i++) { 995 + for (int i = 0; i < ARRAY_SIZE(chunk->phys) && chunk->phys[i]; i++) { 1017 996 phys_addr_t phys = chunk->phys[i]; 1018 997 1019 998 if (idx + contig_pages > total_pages)
+25
kernel/kexec_handover_debug.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + /* 3 + * kexec_handover_debug.c - kexec handover optional debug functionality 4 + * Copyright (C) 2025 Google LLC, Pasha Tatashin <pasha.tatashin@soleen.com> 5 + */ 6 + 7 + #define pr_fmt(fmt) "KHO: " fmt 8 + 9 + #include "kexec_handover_internal.h" 10 + 11 + bool kho_scratch_overlap(phys_addr_t phys, size_t size) 12 + { 13 + phys_addr_t scratch_start, scratch_end; 14 + unsigned int i; 15 + 16 + for (i = 0; i < kho_scratch_cnt; i++) { 17 + scratch_start = kho_scratch[i].addr; 18 + scratch_end = kho_scratch[i].addr + kho_scratch[i].size; 19 + 20 + if (phys < scratch_end && (phys + size) > scratch_start) 21 + return true; 22 + } 23 + 24 + return false; 25 + }
+20
kernel/kexec_handover_internal.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + #ifndef LINUX_KEXEC_HANDOVER_INTERNAL_H 3 + #define LINUX_KEXEC_HANDOVER_INTERNAL_H 4 + 5 + #include <linux/kexec_handover.h> 6 + #include <linux/types.h> 7 + 8 + extern struct kho_scratch *kho_scratch; 9 + extern unsigned int kho_scratch_cnt; 10 + 11 + #ifdef CONFIG_KEXEC_HANDOVER_DEBUG 12 + bool kho_scratch_overlap(phys_addr_t phys, size_t size); 13 + #else 14 + static inline bool kho_scratch_overlap(phys_addr_t phys, size_t size) 15 + { 16 + return false; 17 + } 18 + #endif /* CONFIG_KEXEC_HANDOVER_DEBUG */ 19 + 20 + #endif /* LINUX_KEXEC_HANDOVER_INTERNAL_H */
+13 -9
kernel/power/swap.c
··· 635 635 }; 636 636 637 637 /* Indicates the image size after compression */ 638 - static atomic_t compressed_size = ATOMIC_INIT(0); 638 + static atomic64_t compressed_size = ATOMIC_INIT(0); 639 639 640 640 /* 641 641 * Compression function that runs in its own thread. ··· 664 664 d->ret = crypto_acomp_compress(d->cr); 665 665 d->cmp_len = d->cr->dlen; 666 666 667 - atomic_set(&compressed_size, atomic_read(&compressed_size) + d->cmp_len); 667 + atomic64_add(d->cmp_len, &compressed_size); 668 668 atomic_set_release(&d->stop, 1); 669 669 wake_up(&d->done); 670 670 } ··· 689 689 ktime_t start; 690 690 ktime_t stop; 691 691 size_t off; 692 - unsigned thr, run_threads, nr_threads; 692 + unsigned int thr, run_threads, nr_threads; 693 693 unsigned char *page = NULL; 694 694 struct cmp_data *data = NULL; 695 695 struct crc_data *crc = NULL; 696 696 697 697 hib_init_batch(&hb); 698 698 699 - atomic_set(&compressed_size, 0); 699 + atomic64_set(&compressed_size, 0); 700 700 701 701 /* 702 702 * We'll limit the number of threads for compression to limit memory ··· 877 877 stop = ktime_get(); 878 878 if (!ret) 879 879 ret = err2; 880 - if (!ret) 880 + if (!ret) { 881 + swsusp_show_speed(start, stop, nr_to_write, "Wrote"); 882 + pr_info("Image size after compression: %lld kbytes\n", 883 + (atomic64_read(&compressed_size) / 1024)); 881 884 pr_info("Image saving done\n"); 882 - swsusp_show_speed(start, stop, nr_to_write, "Wrote"); 883 - pr_info("Image size after compression: %d kbytes\n", 884 - (atomic_read(&compressed_size) / 1024)); 885 + } else { 886 + pr_err("Image saving failed: %d\n", ret); 887 + } 885 888 886 889 out_clean: 887 890 hib_finish_batch(&hb); ··· 902 899 } 903 900 vfree(data); 904 901 } 905 - if (page) free_page((unsigned long)page); 902 + if (page) 903 + free_page((unsigned long)page); 906 904 907 905 return ret; 908 906 }
+6 -6
kernel/time/posix-timers.c
··· 475 475 if (!kc->timer_create) 476 476 return -EOPNOTSUPP; 477 477 478 - new_timer = alloc_posix_timer(); 479 - if (unlikely(!new_timer)) 480 - return -EAGAIN; 481 - 482 - spin_lock_init(&new_timer->it_lock); 483 - 484 478 /* Special case for CRIU to restore timers with a given timer ID. */ 485 479 if (unlikely(current->signal->timer_create_restore_ids)) { 486 480 if (copy_from_user(&req_id, created_timer_id, sizeof(req_id))) ··· 483 489 if ((unsigned int)req_id > INT_MAX) 484 490 return -EINVAL; 485 491 } 492 + 493 + new_timer = alloc_posix_timer(); 494 + if (unlikely(!new_timer)) 495 + return -EAGAIN; 496 + 497 + spin_lock_init(&new_timer->it_lock); 486 498 487 499 /* 488 500 * Add the timer to the hash table. The timer is not yet valid
+45 -15
kernel/trace/ftrace.c
··· 1971 1971 */ 1972 1972 static int __ftrace_hash_update_ipmodify(struct ftrace_ops *ops, 1973 1973 struct ftrace_hash *old_hash, 1974 - struct ftrace_hash *new_hash) 1974 + struct ftrace_hash *new_hash, 1975 + bool update_target) 1975 1976 { 1976 1977 struct ftrace_page *pg; 1977 1978 struct dyn_ftrace *rec, *end = NULL; ··· 2007 2006 if (rec->flags & FTRACE_FL_DISABLED) 2008 2007 continue; 2009 2008 2010 - /* We need to update only differences of filter_hash */ 2009 + /* 2010 + * Unless we are updating the target of a direct function, 2011 + * we only need to update differences of filter_hash 2012 + */ 2011 2013 in_old = !!ftrace_lookup_ip(old_hash, rec->ip); 2012 2014 in_new = !!ftrace_lookup_ip(new_hash, rec->ip); 2013 - if (in_old == in_new) 2015 + if (!update_target && (in_old == in_new)) 2014 2016 continue; 2015 2017 2016 2018 if (in_new) { ··· 2024 2020 if (is_ipmodify) 2025 2021 goto rollback; 2026 2022 2027 - FTRACE_WARN_ON(rec->flags & FTRACE_FL_DIRECT); 2023 + /* 2024 + * If this is called by __modify_ftrace_direct() 2025 + * then it is only changing where the direct 2026 + * pointer is jumping to, and the record already 2027 + * points to a direct trampoline. If it isn't, 2028 + * then it is a bug to update ipmodify on a direct 2029 + * caller. 
2030 + */ 2031 + FTRACE_WARN_ON(!update_target && 2032 + (rec->flags & FTRACE_FL_DIRECT)); 2028 2033 2029 2034 /* 2030 2035 * Another ops with IPMODIFY is already ··· 2089 2076 if (ftrace_hash_empty(hash)) 2090 2077 hash = NULL; 2091 2078 2092 - return __ftrace_hash_update_ipmodify(ops, EMPTY_HASH, hash); 2079 + return __ftrace_hash_update_ipmodify(ops, EMPTY_HASH, hash, false); 2093 2080 } 2094 2081 2095 2082 /* Disabling always succeeds */ ··· 2100 2087 if (ftrace_hash_empty(hash)) 2101 2088 hash = NULL; 2102 2089 2103 - __ftrace_hash_update_ipmodify(ops, hash, EMPTY_HASH); 2090 + __ftrace_hash_update_ipmodify(ops, hash, EMPTY_HASH, false); 2104 2091 } 2105 2092 2106 2093 static int ftrace_hash_ipmodify_update(struct ftrace_ops *ops, ··· 2114 2101 if (ftrace_hash_empty(new_hash)) 2115 2102 new_hash = NULL; 2116 2103 2117 - return __ftrace_hash_update_ipmodify(ops, old_hash, new_hash); 2104 + return __ftrace_hash_update_ipmodify(ops, old_hash, new_hash, false); 2118 2105 } 2119 2106 2120 2107 static void print_ip_ins(const char *fmt, const unsigned char *p) ··· 5966 5953 free_ftrace_hash(fhp); 5967 5954 } 5968 5955 5956 + static void reset_direct(struct ftrace_ops *ops, unsigned long addr) 5957 + { 5958 + struct ftrace_hash *hash = ops->func_hash->filter_hash; 5959 + 5960 + remove_direct_functions_hash(hash, addr); 5961 + 5962 + /* cleanup for possible another register call */ 5963 + ops->func = NULL; 5964 + ops->trampoline = 0; 5965 + } 5966 + 5969 5967 /** 5970 5968 * register_ftrace_direct - Call a custom trampoline directly 5971 5969 * for multiple functions registered in @ops ··· 6072 6048 ops->direct_call = addr; 6073 6049 6074 6050 err = register_ftrace_function_nolock(ops); 6051 + if (err) 6052 + reset_direct(ops, addr); 6075 6053 6076 6054 out_unlock: 6077 6055 mutex_unlock(&direct_mutex); ··· 6106 6080 int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr, 6107 6081 bool free_filters) 6108 6082 { 6109 - struct ftrace_hash *hash = 
ops->func_hash->filter_hash; 6110 6083 int err; 6111 6084 6112 6085 if (check_direct_multi(ops)) ··· 6115 6090 6116 6091 mutex_lock(&direct_mutex); 6117 6092 err = unregister_ftrace_function(ops); 6118 - remove_direct_functions_hash(hash, addr); 6093 + reset_direct(ops, addr); 6119 6094 mutex_unlock(&direct_mutex); 6120 - 6121 - /* cleanup for possible another register call */ 6122 - ops->func = NULL; 6123 - ops->trampoline = 0; 6124 6095 6125 6096 if (free_filters) 6126 6097 ftrace_free_filter(ops); ··· 6127 6106 static int 6128 6107 __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr) 6129 6108 { 6130 - struct ftrace_hash *hash; 6109 + struct ftrace_hash *hash = ops->func_hash->filter_hash; 6131 6110 struct ftrace_func_entry *entry, *iter; 6132 6111 static struct ftrace_ops tmp_ops = { 6133 6112 .func = ftrace_stub, ··· 6148 6127 return err; 6149 6128 6150 6129 /* 6130 + * Call __ftrace_hash_update_ipmodify() here, so that we can call 6131 + * ops->ops_func for the ops. This is needed because the above 6132 + * register_ftrace_function_nolock() worked on tmp_ops. 6133 + */ 6134 + err = __ftrace_hash_update_ipmodify(ops, hash, hash, true); 6135 + if (err) 6136 + goto out; 6137 + 6138 + /* 6151 6139 * Now the ftrace_ops_list_func() is called to do the direct callers. 6152 6140 * We can safely change the direct functions attached to each entry. 6153 6141 */ 6154 6142 mutex_lock(&ftrace_lock); 6155 6143 6156 - hash = ops->func_hash->filter_hash; 6157 6144 size = 1 << hash->size_bits; 6158 6145 for (i = 0; i < size; i++) { 6159 6146 hlist_for_each_entry(iter, &hash->buckets[i], hlist) { ··· 6176 6147 6177 6148 mutex_unlock(&ftrace_lock); 6178 6149 6150 + out: 6179 6151 /* Removing the tmp_ops will add the updated direct callers to the functions */ 6180 6152 unregister_ftrace_function(&tmp_ops); 6181 6153
+16 -14
lib/maple_tree.c
··· 64 64 #define CREATE_TRACE_POINTS 65 65 #include <trace/events/maple_tree.h> 66 66 67 + #define TP_FCT tracepoint_string(__func__) 68 + 67 69 /* 68 70 * Kernel pointer hashing renders much of the maple tree dump useless as tagged 69 71 * pointers get hashed to arbitrary values. ··· 2758 2756 MA_STATE(l_mas, mas->tree, mas->index, mas->last); 2759 2757 MA_STATE(r_mas, mas->tree, mas->index, mas->last); 2760 2758 2761 - trace_ma_op(__func__, mas); 2759 + trace_ma_op(TP_FCT, mas); 2762 2760 2763 2761 /* 2764 2762 * Rebalancing occurs if a node is insufficient. Data is rebalanced ··· 2999 2997 MA_STATE(prev_l_mas, mas->tree, mas->index, mas->last); 3000 2998 MA_STATE(prev_r_mas, mas->tree, mas->index, mas->last); 3001 2999 3002 - trace_ma_op(__func__, mas); 3000 + trace_ma_op(TP_FCT, mas); 3003 3001 3004 3002 mast.l = &l_mas; 3005 3003 mast.r = &r_mas; ··· 3174 3172 return false; 3175 3173 } 3176 3174 3177 - trace_ma_write(__func__, wr_mas->mas, wr_mas->r_max, entry); 3175 + trace_ma_write(TP_FCT, wr_mas->mas, wr_mas->r_max, entry); 3178 3176 return true; 3179 3177 } 3180 3178 ··· 3418 3416 * of data may happen. 3419 3417 */ 3420 3418 mas = wr_mas->mas; 3421 - trace_ma_op(__func__, mas); 3419 + trace_ma_op(TP_FCT, mas); 3422 3420 3423 3421 if (unlikely(!mas->index && mas->last == ULONG_MAX)) 3424 3422 return mas_new_root(mas, wr_mas->entry); ··· 3554 3552 } else { 3555 3553 memcpy(wr_mas->node, newnode, sizeof(struct maple_node)); 3556 3554 } 3557 - trace_ma_write(__func__, mas, 0, wr_mas->entry); 3555 + trace_ma_write(TP_FCT, mas, 0, wr_mas->entry); 3558 3556 mas_update_gap(mas); 3559 3557 mas->end = new_end; 3560 3558 return; ··· 3598 3596 mas->offset++; /* Keep mas accurate. */ 3599 3597 } 3600 3598 3601 - trace_ma_write(__func__, mas, 0, wr_mas->entry); 3599 + trace_ma_write(TP_FCT, mas, 0, wr_mas->entry); 3602 3600 /* 3603 3601 * Only update gap when the new entry is empty or there is an empty 3604 3602 * entry in the original two ranges. 
··· 3719 3717 mas_update_gap(mas); 3720 3718 3721 3719 mas->end = new_end; 3722 - trace_ma_write(__func__, mas, new_end, wr_mas->entry); 3720 + trace_ma_write(TP_FCT, mas, new_end, wr_mas->entry); 3723 3721 return; 3724 3722 } 3725 3723 ··· 3733 3731 { 3734 3732 struct maple_big_node b_node; 3735 3733 3736 - trace_ma_write(__func__, wr_mas->mas, 0, wr_mas->entry); 3734 + trace_ma_write(TP_FCT, wr_mas->mas, 0, wr_mas->entry); 3737 3735 memset(&b_node, 0, sizeof(struct maple_big_node)); 3738 3736 mas_store_b_node(wr_mas, &b_node, wr_mas->offset_end); 3739 3737 mas_commit_b_node(wr_mas, &b_node); ··· 5064 5062 { 5065 5063 MA_WR_STATE(wr_mas, mas, entry); 5066 5064 5067 - trace_ma_write(__func__, mas, 0, entry); 5065 + trace_ma_write(TP_FCT, mas, 0, entry); 5068 5066 #ifdef CONFIG_DEBUG_MAPLE_TREE 5069 5067 if (MAS_WARN_ON(mas, mas->index > mas->last)) 5070 5068 pr_err("Error %lX > %lX " PTR_FMT "\n", mas->index, mas->last, ··· 5165 5163 } 5166 5164 5167 5165 store: 5168 - trace_ma_write(__func__, mas, 0, entry); 5166 + trace_ma_write(TP_FCT, mas, 0, entry); 5169 5167 mas_wr_store_entry(&wr_mas); 5170 5168 MAS_WR_BUG_ON(&wr_mas, mas_is_err(mas)); 5171 5169 mas_destroy(mas); ··· 5884 5882 MA_STATE(mas, mt, index, index); 5885 5883 void *entry; 5886 5884 5887 - trace_ma_read(__func__, &mas); 5885 + trace_ma_read(TP_FCT, &mas); 5888 5886 rcu_read_lock(); 5889 5887 retry: 5890 5888 entry = mas_start(&mas); ··· 5927 5925 MA_STATE(mas, mt, index, last); 5928 5926 int ret = 0; 5929 5927 5930 - trace_ma_write(__func__, &mas, 0, entry); 5928 + trace_ma_write(TP_FCT, &mas, 0, entry); 5931 5929 if (WARN_ON_ONCE(xa_is_advanced(entry))) 5932 5930 return -EINVAL; 5933 5931 ··· 6150 6148 void *entry = NULL; 6151 6149 6152 6150 MA_STATE(mas, mt, index, index); 6153 - trace_ma_op(__func__, &mas); 6151 + trace_ma_op(TP_FCT, &mas); 6154 6152 6155 6153 mtree_lock(mt); 6156 6154 entry = mas_erase(&mas); ··· 6487 6485 unsigned long copy = *index; 6488 6486 #endif 6489 6487 6490 - 
trace_ma_read(__func__, &mas); 6488 + trace_ma_read(TP_FCT, &mas); 6491 6489 6492 6490 if ((*index) > max) 6493 6491 return NULL;
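The maple-tree change swaps `__func__` for `tracepoint_string(__func__)`, which hands trace consumers one stable, exported address per function-name string. A toy userspace model of that guarantee is string interning; note the kernel actually uses a dedicated ELF section (`__tracepoint_str`) rather than a lookup table like this:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy interner: repeated lookups of equal strings return one stable
 * pointer, which is the property trace tooling relies on when it
 * resolves recorded string addresses back to names. */
#define MAX_INTERNED 64
static const char *interned[MAX_INTERNED];
static int n_interned;

static char *dupstr(const char *s)
{
	size_t n = strlen(s) + 1;
	char *d = malloc(n);

	memcpy(d, s, n);
	return d;
}

static const char *intern(const char *s)
{
	for (int i = 0; i < n_interned; i++)
		if (strcmp(interned[i], s) == 0)
			return interned[i];
	assert(n_interned < MAX_INTERNED);
	return interned[n_interned++] = dupstr(s);
}
```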
+3
lib/test_kho.c
··· 301 301 phys_addr_t fdt_phys; 302 302 int err; 303 303 304 + if (!kho_is_enabled()) 305 + return 0; 306 + 304 307 err = kho_retrieve_subtree(KHO_TEST_FDT, &fdt_phys); 305 308 if (!err) 306 309 return kho_test_restore(fdt_phys);
+7
mm/Kconfig
··· 908 908 config PGTABLE_HAS_HUGE_LEAVES 909 909 def_bool TRANSPARENT_HUGEPAGE || HUGETLB_PAGE 910 910 911 + # 912 + # We can end up creating gigantic folio. 913 + # 914 + config HAVE_GIGANTIC_FOLIOS 915 + def_bool (HUGETLB_PAGE && ARCH_HAS_GIGANTIC_PAGE) || \ 916 + (ZONE_DEVICE && HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) 917 + 911 918 # TODO: Allow to be enabled without THP 912 919 config ARCH_SUPPORTS_HUGE_PFNMAP 913 920 def_bool n
+6 -3
mm/damon/stat.c
··· 46 46 47 47 static struct damon_ctx *damon_stat_context; 48 48 49 + static unsigned long damon_stat_last_refresh_jiffies; 50 + 49 51 static void damon_stat_set_estimated_memory_bandwidth(struct damon_ctx *c) 50 52 { 51 53 struct damon_target *t; ··· 132 130 static int damon_stat_damon_call_fn(void *data) 133 131 { 134 132 struct damon_ctx *c = data; 135 - static unsigned long last_refresh_jiffies; 136 133 137 134 /* avoid unnecessarily frequent stat update */ 138 - if (time_before_eq(jiffies, last_refresh_jiffies + 135 + if (time_before_eq(jiffies, damon_stat_last_refresh_jiffies + 139 136 msecs_to_jiffies(5 * MSEC_PER_SEC))) 140 137 return 0; 141 - last_refresh_jiffies = jiffies; 138 + damon_stat_last_refresh_jiffies = jiffies; 142 139 143 140 aggr_interval_us = c->attrs.aggr_interval; 144 141 damon_stat_set_estimated_memory_bandwidth(c); ··· 211 210 err = damon_start(&damon_stat_context, 1, true); 212 211 if (err) 213 212 return err; 213 + 214 + damon_stat_last_refresh_jiffies = jiffies; 214 215 call_control.data = damon_stat_context; 215 216 return damon_call(damon_stat_context, &call_control); 216 217 }
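Hoisting `last_refresh_jiffies` out of damon_stat_damon_call_fn() lets the start path reseed it, so a restart no longer inherits a stale function-static timestamp. The throttle itself is simple; here it is modeled with a plain counter standing in for jiffies (names illustrative):

```c
#include <assert.h>
#include <stdbool.h>

#define REFRESH_PERIOD 5000UL	/* stand-in for 5 * MSEC_PER_SEC in jiffies */

static unsigned long last_refresh;

/* Reseed on start, as the hunk does in damon_stat_start(). */
static void stat_start(unsigned long now)
{
	last_refresh = now;
}

/* Returns true when a refresh actually runs; mirrors the
 * time_before_eq() skip in the kernel code. */
static bool stat_refresh(unsigned long now)
{
	if (now <= last_refresh + REFRESH_PERIOD)
		return false;	/* too soon, skip the update */
	last_refresh = now;
	return true;
}
```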
+7 -3
mm/damon/sysfs.c
··· 1552 1552 return ctx; 1553 1553 } 1554 1554 1555 + static unsigned long damon_sysfs_next_update_jiffies; 1556 + 1555 1557 static int damon_sysfs_repeat_call_fn(void *data) 1556 1558 { 1557 1559 struct damon_sysfs_kdamond *sysfs_kdamond = data; 1558 - static unsigned long next_update_jiffies; 1559 1560 1560 1561 if (!sysfs_kdamond->refresh_ms) 1561 1562 return 0; 1562 - if (time_before(jiffies, next_update_jiffies)) 1563 + if (time_before(jiffies, damon_sysfs_next_update_jiffies)) 1563 1564 return 0; 1564 - next_update_jiffies = jiffies + 1565 + damon_sysfs_next_update_jiffies = jiffies + 1565 1566 msecs_to_jiffies(sysfs_kdamond->refresh_ms); 1566 1567 1567 1568 if (!mutex_trylock(&damon_sysfs_lock)) ··· 1607 1606 return err; 1608 1607 } 1609 1608 kdamond->damon_ctx = ctx; 1609 + 1610 + damon_sysfs_next_update_jiffies = 1611 + jiffies + msecs_to_jiffies(kdamond->refresh_ms); 1610 1612 1611 1613 repeat_call_control->fn = damon_sysfs_repeat_call_fn; 1612 1614 repeat_call_control->data = kdamond;
+20 -8
mm/filemap.c
··· 3681 3681 static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf, 3682 3682 struct folio *folio, unsigned long start, 3683 3683 unsigned long addr, unsigned int nr_pages, 3684 - unsigned long *rss, unsigned short *mmap_miss) 3684 + unsigned long *rss, unsigned short *mmap_miss, 3685 + bool can_map_large) 3685 3686 { 3686 3687 unsigned int ref_from_caller = 1; 3687 3688 vm_fault_t ret = 0; ··· 3697 3696 * The folio must not cross VMA or page table boundary. 3698 3697 */ 3699 3698 addr0 = addr - start * PAGE_SIZE; 3700 - if (folio_within_vma(folio, vmf->vma) && 3699 + if (can_map_large && folio_within_vma(folio, vmf->vma) && 3701 3700 (addr0 & PMD_MASK) == ((addr0 + folio_size(folio) - 1) & PMD_MASK)) { 3702 3701 vmf->pte -= start; 3703 3702 page -= start; ··· 3812 3811 unsigned long rss = 0; 3813 3812 unsigned int nr_pages = 0, folio_type; 3814 3813 unsigned short mmap_miss = 0, mmap_miss_saved; 3814 + bool can_map_large; 3815 3815 3816 3816 rcu_read_lock(); 3817 3817 folio = next_uptodate_folio(&xas, mapping, end_pgoff); 3818 3818 if (!folio) 3819 3819 goto out; 3820 3820 3821 - if (filemap_map_pmd(vmf, folio, start_pgoff)) { 3821 + file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1; 3822 + end_pgoff = min(end_pgoff, file_end); 3823 + 3824 + /* 3825 + * Do not allow to map with PTEs beyond i_size and with PMD 3826 + * across i_size to preserve SIGBUS semantics. 3827 + * 3828 + * Make an exception for shmem/tmpfs that for long time 3829 + * intentionally mapped with PMDs across i_size. 
3830 + */ 3831 + can_map_large = shmem_mapping(mapping) || 3832 + file_end >= folio_next_index(folio); 3833 + 3834 + if (can_map_large && filemap_map_pmd(vmf, folio, start_pgoff)) { 3822 3835 ret = VM_FAULT_NOPAGE; 3823 3836 goto out; 3824 3837 } ··· 3844 3829 folio_put(folio); 3845 3830 goto out; 3846 3831 } 3847 - 3848 - file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1; 3849 - if (end_pgoff > file_end) 3850 - end_pgoff = file_end; 3851 3832 3852 3833 folio_type = mm_counter_file(folio); 3853 3834 do { ··· 3861 3850 else 3862 3851 ret |= filemap_map_folio_range(vmf, folio, 3863 3852 xas.xa_index - folio->index, addr, 3864 - nr_pages, &rss, &mmap_miss); 3853 + nr_pages, &rss, &mmap_miss, 3854 + can_map_large); 3865 3855 3866 3856 folio_unlock(folio); 3867 3857 } while ((folio = next_uptodate_folio(&xas, mapping, end_pgoff)) != NULL);
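The filemap hunk gates large mappings on where i_size falls relative to the folio. Its arithmetic, mirrored in userspace (the shmem/tmpfs exemption is left out; this models only the index math):

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SIZE 4096UL
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* file_end is the index of the last page backed by i_size; a
 * multi-page mapping is only allowed when the folio does not reach
 * past it, preserving SIGBUS on accesses beyond EOF. */
static bool can_map_large(unsigned long i_size, unsigned long folio_index,
			  unsigned long folio_nr_pages)
{
	unsigned long file_end = DIV_ROUND_UP(i_size, PAGE_SIZE) - 1;
	unsigned long folio_next_index = folio_index + folio_nr_pages;

	return file_end >= folio_next_index;
}
```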
+27 -14
mm/huge_memory.c
··· 214 214 if (likely(atomic_inc_not_zero(&huge_zero_refcount))) 215 215 return true; 216 216 217 - zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE, 217 + zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) & 218 + ~__GFP_MOVABLE, 218 219 HPAGE_PMD_ORDER); 219 220 if (!zero_folio) { 220 221 count_vm_event(THP_ZERO_PAGE_ALLOC_FAILED); ··· 3264 3263 caller_pins; 3265 3264 } 3266 3265 3266 + static bool page_range_has_hwpoisoned(struct page *page, long nr_pages) 3267 + { 3268 + for (; nr_pages; page++, nr_pages--) 3269 + if (PageHWPoison(page)) 3270 + return true; 3271 + return false; 3272 + } 3273 + 3267 3274 /* 3268 3275 * It splits @folio into @new_order folios and copies the @folio metadata to 3269 3276 * all the resulting folios. ··· 3279 3270 static void __split_folio_to_order(struct folio *folio, int old_order, 3280 3271 int new_order) 3281 3272 { 3273 + /* Scan poisoned pages when split a poisoned folio to large folios */ 3274 + const bool handle_hwpoison = folio_test_has_hwpoisoned(folio) && new_order; 3282 3275 long new_nr_pages = 1 << new_order; 3283 3276 long nr_pages = 1 << old_order; 3284 3277 long i; 3285 3278 3279 + folio_clear_has_hwpoisoned(folio); 3280 + 3281 + /* Check first new_nr_pages since the loop below skips them */ 3282 + if (handle_hwpoison && 3283 + page_range_has_hwpoisoned(folio_page(folio, 0), new_nr_pages)) 3284 + folio_set_has_hwpoisoned(folio); 3286 3285 /* 3287 3286 * Skip the first new_nr_pages, since the new folio from them have all 3288 3287 * the flags from the original folio. 3289 3288 */ 3290 3289 for (i = new_nr_pages; i < nr_pages; i += new_nr_pages) { 3291 3290 struct page *new_head = &folio->page + i; 3292 - 3293 3291 /* 3294 3292 * Careful: new_folio is not a "real" folio before we cleared PageTail. 3295 3293 * Don't pass it around before clear_compound_head(). 
··· 3337 3321 #endif 3338 3322 (1L << PG_dirty) | 3339 3323 LRU_GEN_MASK | LRU_REFS_MASK)); 3324 + 3325 + if (handle_hwpoison && 3326 + page_range_has_hwpoisoned(new_head, new_nr_pages)) 3327 + folio_set_has_hwpoisoned(new_folio); 3340 3328 3341 3329 new_folio->mapping = folio->mapping; 3342 3330 new_folio->index = folio->index + i; ··· 3442 3422 if (folio_test_anon(folio)) 3443 3423 mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1); 3444 3424 3445 - folio_clear_has_hwpoisoned(folio); 3446 - 3447 3425 /* 3448 3426 * split to new_order one order at a time. For uniform split, 3449 3427 * folio is split to new_order directly. ··· 3522 3504 /* order-1 is not supported for anonymous THP. */ 3523 3505 VM_WARN_ONCE(warns && new_order == 1, 3524 3506 "Cannot split to order-1 folio"); 3525 - return new_order != 1; 3507 + if (new_order == 1) 3508 + return false; 3526 3509 } else if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && 3527 3510 !mapping_large_folio_support(folio->mapping)) { 3528 3511 /* ··· 3554 3535 if (folio_test_anon(folio)) { 3555 3536 VM_WARN_ONCE(warns && new_order == 1, 3556 3537 "Cannot split to order-1 folio"); 3557 - return new_order != 1; 3538 + if (new_order == 1) 3539 + return false; 3558 3540 } else if (new_order) { 3559 3541 if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && 3560 3542 !mapping_large_folio_support(folio->mapping)) { ··· 3673 3653 3674 3654 min_order = mapping_min_folio_order(folio->mapping); 3675 3655 if (new_order < min_order) { 3676 - VM_WARN_ONCE(1, "Cannot split mapped folio below min-order: %u", 3677 - min_order); 3678 3656 ret = -EINVAL; 3679 3657 goto out; 3680 3658 } ··· 4004 3986 4005 3987 int split_folio_to_list(struct folio *folio, struct list_head *list) 4006 3988 { 4007 - int ret = min_order_for_split(folio); 4008 - 4009 - if (ret < 0) 4010 - return ret; 4011 - 4012 - return split_huge_page_to_list_to_order(&folio->page, list, ret); 3989 + return split_huge_page_to_list_to_order(&folio->page, list, 0); 4013 3990 } 4014 3991 4015 3992 
/*
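The huge_memory.c change rescans each new-order chunk during a split so only the resulting folios that actually contain a poisoned page keep `has_hwpoisoned`, instead of the flag being dropped (or smeared) wholesale. The per-chunk scan, modeled with a bool array standing in for struct page:

```c
#include <assert.h>
#include <stdbool.h>

/* True if any page in [poisoned, poisoned + nr) is poisoned;
 * mirrors page_range_has_hwpoisoned() in the hunk. */
static bool range_has_poison(const bool *poisoned, long nr)
{
	for (long i = 0; i < nr; i++)
		if (poisoned[i])
			return true;
	return false;
}

/* Should chunk k (of new_nr pages each) carry has_hwpoisoned after
 * the split? Each chunk decides only from its own pages. */
static bool chunk_flag(const bool *poisoned, long new_nr, long k)
{
	return range_has_poison(poisoned + k * new_nr, new_nr);
}
```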
-3
mm/kmsan/core.c
··· 72 72 73 73 nr_entries = stack_trace_save(entries, KMSAN_STACK_DEPTH, 0); 74 74 75 - /* Don't sleep. */ 76 - flags &= ~(__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM); 77 - 78 75 handle = stack_depot_save(entries, nr_entries, flags); 79 76 return stack_depot_set_extra_bits(handle, extra); 80 77 }
+4 -2
mm/kmsan/hooks.c
··· 84 84 if (s->ctor) 85 85 return; 86 86 kmsan_enter_runtime(); 87 - kmsan_internal_poison_memory(object, s->object_size, GFP_KERNEL, 87 + kmsan_internal_poison_memory(object, s->object_size, 88 + GFP_KERNEL & ~(__GFP_RECLAIM), 88 89 KMSAN_POISON_CHECK | KMSAN_POISON_FREE); 89 90 kmsan_leave_runtime(); 90 91 } ··· 115 114 kmsan_enter_runtime(); 116 115 page = virt_to_head_page((void *)ptr); 117 116 KMSAN_WARN_ON(ptr != page_address(page)); 118 - kmsan_internal_poison_memory((void *)ptr, page_size(page), GFP_KERNEL, 117 + kmsan_internal_poison_memory((void *)ptr, page_size(page), 118 + GFP_KERNEL & ~(__GFP_RECLAIM), 119 119 KMSAN_POISON_CHECK | KMSAN_POISON_FREE); 120 120 kmsan_leave_runtime(); 121 121 }
+1 -1
mm/kmsan/shadow.c
··· 208 208 return; 209 209 kmsan_enter_runtime(); 210 210 kmsan_internal_poison_memory(page_address(page), page_size(page), 211 - GFP_KERNEL, 211 + GFP_KERNEL & ~(__GFP_RECLAIM), 212 212 KMSAN_POISON_CHECK | KMSAN_POISON_FREE); 213 213 kmsan_leave_runtime(); 214 214 }
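All three kmsan hunks apply the same mask at the call sites instead of stripping the reclaim bits centrally in kmsan_internal_chain_origin(). The pattern, with toy flag values (the real `__GFP_*` constants in gfp_types.h differ):

```c
#include <assert.h>

/* Illustrative bit values only. */
#define __GFP_DIRECT_RECLAIM	0x400u
#define __GFP_KSWAPD_RECLAIM	0x800u
#define __GFP_RECLAIM		(__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL		(__GFP_RECLAIM | 0x40u | 0x80u)

/* Clear the reclaim bits so the poisoning path can never sleep,
 * matching GFP_KERNEL & ~(__GFP_RECLAIM) in the hunks. */
static unsigned int nosleep(unsigned int gfp)
{
	return gfp & ~__GFP_RECLAIM;
}
```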
+104 -9
mm/ksm.c
··· 2455 2455 return true; 2456 2456 } 2457 2457 2458 + struct ksm_next_page_arg { 2459 + struct folio *folio; 2460 + struct page *page; 2461 + unsigned long addr; 2462 + }; 2463 + 2464 + static int ksm_next_page_pmd_entry(pmd_t *pmdp, unsigned long addr, unsigned long end, 2465 + struct mm_walk *walk) 2466 + { 2467 + struct ksm_next_page_arg *private = walk->private; 2468 + struct vm_area_struct *vma = walk->vma; 2469 + pte_t *start_ptep = NULL, *ptep, pte; 2470 + struct mm_struct *mm = walk->mm; 2471 + struct folio *folio; 2472 + struct page *page; 2473 + spinlock_t *ptl; 2474 + pmd_t pmd; 2475 + 2476 + if (ksm_test_exit(mm)) 2477 + return 0; 2478 + 2479 + cond_resched(); 2480 + 2481 + pmd = pmdp_get_lockless(pmdp); 2482 + if (!pmd_present(pmd)) 2483 + return 0; 2484 + 2485 + if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) && pmd_leaf(pmd)) { 2486 + ptl = pmd_lock(mm, pmdp); 2487 + pmd = pmdp_get(pmdp); 2488 + 2489 + if (!pmd_present(pmd)) { 2490 + goto not_found_unlock; 2491 + } else if (pmd_leaf(pmd)) { 2492 + page = vm_normal_page_pmd(vma, addr, pmd); 2493 + if (!page) 2494 + goto not_found_unlock; 2495 + folio = page_folio(page); 2496 + 2497 + if (folio_is_zone_device(folio) || !folio_test_anon(folio)) 2498 + goto not_found_unlock; 2499 + 2500 + page += ((addr & (PMD_SIZE - 1)) >> PAGE_SHIFT); 2501 + goto found_unlock; 2502 + } 2503 + spin_unlock(ptl); 2504 + } 2505 + 2506 + start_ptep = pte_offset_map_lock(mm, pmdp, addr, &ptl); 2507 + if (!start_ptep) 2508 + return 0; 2509 + 2510 + for (ptep = start_ptep; addr < end; ptep++, addr += PAGE_SIZE) { 2511 + pte = ptep_get(ptep); 2512 + 2513 + if (!pte_present(pte)) 2514 + continue; 2515 + 2516 + page = vm_normal_page(vma, addr, pte); 2517 + if (!page) 2518 + continue; 2519 + folio = page_folio(page); 2520 + 2521 + if (folio_is_zone_device(folio) || !folio_test_anon(folio)) 2522 + continue; 2523 + goto found_unlock; 2524 + } 2525 + 2526 + not_found_unlock: 2527 + spin_unlock(ptl); 2528 + if (start_ptep) 2529 + 
pte_unmap(start_ptep); 2530 + return 0; 2531 + found_unlock: 2532 + folio_get(folio); 2533 + spin_unlock(ptl); 2534 + if (start_ptep) 2535 + pte_unmap(start_ptep); 2536 + private->page = page; 2537 + private->folio = folio; 2538 + private->addr = addr; 2539 + return 1; 2540 + } 2541 + 2542 + static struct mm_walk_ops ksm_next_page_ops = { 2543 + .pmd_entry = ksm_next_page_pmd_entry, 2544 + .walk_lock = PGWALK_RDLOCK, 2545 + }; 2546 + 2458 2547 static struct ksm_rmap_item *scan_get_next_rmap_item(struct page **page) 2459 2548 { 2460 2549 struct mm_struct *mm; ··· 2631 2542 ksm_scan.address = vma->vm_end; 2632 2543 2633 2544 while (ksm_scan.address < vma->vm_end) { 2545 + struct ksm_next_page_arg ksm_next_page_arg; 2634 2546 struct page *tmp_page = NULL; 2635 - struct folio_walk fw; 2636 2547 struct folio *folio; 2637 2548 2638 2549 if (ksm_test_exit(mm)) 2639 2550 break; 2640 2551 2641 - folio = folio_walk_start(&fw, vma, ksm_scan.address, 0); 2642 - if (folio) { 2643 - if (!folio_is_zone_device(folio) && 2644 - folio_test_anon(folio)) { 2645 - folio_get(folio); 2646 - tmp_page = fw.page; 2647 - } 2648 - folio_walk_end(&fw, vma); 2552 + int found; 2553 + 2554 + found = walk_page_range_vma(vma, ksm_scan.address, 2555 + vma->vm_end, 2556 + &ksm_next_page_ops, 2557 + &ksm_next_page_arg); 2558 + 2559 + if (found > 0) { 2560 + folio = ksm_next_page_arg.folio; 2561 + tmp_page = ksm_next_page_arg.page; 2562 + ksm_scan.address = ksm_next_page_arg.addr; 2563 + } else { 2564 + VM_WARN_ON_ONCE(found < 0); 2565 + ksm_scan.address = vma->vm_end - PAGE_SIZE; 2649 2566 } 2650 2567 2651 2568 if (tmp_page) {
+19 -1
mm/memory.c
··· 65 65 #include <linux/gfp.h> 66 66 #include <linux/migrate.h> 67 67 #include <linux/string.h> 68 + #include <linux/shmem_fs.h> 68 69 #include <linux/memory-tiers.h> 69 70 #include <linux/debugfs.h> 70 71 #include <linux/userfaultfd_k.h> ··· 5502 5501 return ret; 5503 5502 } 5504 5503 5504 + if (!needs_fallback && vma->vm_file) { 5505 + struct address_space *mapping = vma->vm_file->f_mapping; 5506 + pgoff_t file_end; 5507 + 5508 + file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE); 5509 + 5510 + /* 5511 + * Do not allow to map with PTEs beyond i_size and with PMD 5512 + * across i_size to preserve SIGBUS semantics. 5513 + * 5514 + * Make an exception for shmem/tmpfs that for long time 5515 + * intentionally mapped with PMDs across i_size. 5516 + */ 5517 + needs_fallback = !shmem_mapping(mapping) && 5518 + file_end < folio_next_index(folio); 5519 + } 5520 + 5505 5521 if (pmd_none(*vmf->pmd)) { 5506 - if (folio_test_pmd_mappable(folio)) { 5522 + if (!needs_fallback && folio_test_pmd_mappable(folio)) { 5507 5523 ret = do_set_pmd(vmf, folio, page); 5508 5524 if (ret != VM_FAULT_FALLBACK) 5509 5525 return ret;
+1 -1
mm/mm_init.c
··· 2469 2469 panic("Failed to allocate %s hash table\n", tablename); 2470 2470 2471 2471 pr_info("%s hash table entries: %ld (order: %d, %lu bytes, %s)\n", 2472 - tablename, 1UL << log2qty, ilog2(size) - PAGE_SHIFT, size, 2472 + tablename, 1UL << log2qty, get_order(size), size, 2473 2473 virt ? (huge ? "vmalloc hugepage" : "vmalloc") : "linear"); 2474 2474 2475 2475 if (_hash_shift)
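The mm_init.c fix matters because `ilog2(size) - PAGE_SHIFT` rounds down while `get_order(size)` rounds up, so the printed order was off by one whenever a hash table's byte size was not an exact power of two. A userspace mirror of both computations:

```c
#include <assert.h>

#define PAGE_SHIFT 12

/* Floor log2, as ilog2() computes for nonzero input. */
static int ilog2_ul(unsigned long v)
{
	int r = -1;

	while (v) {
		v >>= 1;
		r++;
	}
	return r;
}

/* Smallest order such that (1 << order) pages cover size bytes,
 * matching the kernel's get_order() for size > 0. */
static int get_order(unsigned long size)
{
	if (size <= (1UL << PAGE_SHIFT))
		return 0;
	return ilog2_ul(size - 1) - PAGE_SHIFT + 1;
}
```

For a 12 KiB table, get_order() reports 2 (four pages reserved) where the old expression printed 1.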
+1 -1
mm/mremap.c
··· 187 187 if (!folio || !folio_test_large(folio)) 188 188 return 1; 189 189 190 - return folio_pte_batch(folio, ptep, pte, max_nr); 190 + return folio_pte_batch_flags(folio, NULL, ptep, &pte, max_nr, FPB_RESPECT_WRITE); 191 191 } 192 192 193 193 static int move_ptes(struct pagetable_move_control *pmc,
+1 -1
mm/secretmem.c
··· 82 82 __folio_mark_uptodate(folio); 83 83 err = filemap_add_folio(mapping, folio, offset, gfp); 84 84 if (unlikely(err)) { 85 - folio_put(folio); 86 85 /* 87 86 * If a split of large page was required, it 88 87 * already happened when we marked the page invalid 89 88 * which guarantees that this call won't fail 90 89 */ 91 90 set_direct_map_default_noflush(folio_page(folio, 0)); 91 + folio_put(folio); 92 92 if (err == -EEXIST) 93 93 goto retry; 94 94
+6 -3
mm/shmem.c
··· 1882 1882 struct shmem_inode_info *info = SHMEM_I(inode); 1883 1883 unsigned long suitable_orders = 0; 1884 1884 struct folio *folio = NULL; 1885 + pgoff_t aligned_index; 1885 1886 long pages; 1886 1887 int error, order; 1887 1888 ··· 1896 1895 order = highest_order(suitable_orders); 1897 1896 while (suitable_orders) { 1898 1897 pages = 1UL << order; 1899 - index = round_down(index, pages); 1900 - folio = shmem_alloc_folio(gfp, order, info, index); 1901 - if (folio) 1898 + aligned_index = round_down(index, pages); 1899 + folio = shmem_alloc_folio(gfp, order, info, aligned_index); 1900 + if (folio) { 1901 + index = aligned_index; 1902 1902 goto allocated; 1903 + } 1903 1904 1904 1905 if (pages == HPAGE_PMD_NR) 1905 1906 count_vm_event(THP_FILE_FALLBACK);
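The shmem fix keeps the caller's index intact across failed high-order attempts by committing the rounded-down value only when the allocation succeeds. Reduced to its arithmetic (the allocation outcome is passed in as a flag; `pick_index` is an illustrative name, not a kernel helper):

```c
#include <assert.h>
#include <stdbool.h>

/* Round index down to the order's alignment only on success; on
 * failure the original index survives for the next, smaller-order
 * attempt, which is the bug the hunk fixes. */
static unsigned long pick_index(unsigned long index, int order,
				bool alloc_succeeded)
{
	unsigned long aligned = index & ~((1UL << order) - 1);

	return alloc_succeeded ? aligned : index;
}
```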
+11 -3
mm/slub.c
··· 2046 2046 if (slab_exts) { 2047 2047 unsigned int offs = obj_to_index(obj_exts_slab->slab_cache, 2048 2048 obj_exts_slab, obj_exts); 2049 - /* codetag should be NULL */ 2049 + 2050 + if (unlikely(is_codetag_empty(&slab_exts[offs].ref))) 2051 + return; 2052 + 2053 + /* codetag should be NULL here */ 2050 2054 WARN_ON(slab_exts[offs].ref.ct); 2051 2055 set_codetag_empty(&slab_exts[offs].ref); 2052 2056 } ··· 6336 6332 6337 6333 if (unlikely(!slab_free_hook(s, p[i], init, false))) { 6338 6334 p[i] = p[--size]; 6339 - if (!size) 6340 - goto flush_remote; 6341 6335 continue; 6342 6336 } 6343 6337 ··· 6349 6347 6350 6348 i++; 6351 6349 } 6350 + 6351 + if (!size) 6352 + goto flush_remote; 6352 6353 6353 6354 next_batch: 6354 6355 if (!local_trylock(&s->cpu_sheaves->lock)) ··· 6406 6401 size -= batch; 6407 6402 goto next_batch; 6408 6403 } 6404 + 6405 + if (remote_nr) 6406 + goto flush_remote; 6409 6407 6410 6408 return; 6411 6409
+13
mm/swap_state.c
··· 748 748 749 749 blk_start_plug(&plug); 750 750 for (addr = start; addr < end; ilx++, addr += PAGE_SIZE) { 751 + struct swap_info_struct *si = NULL; 752 + 751 753 if (!pte++) { 752 754 pte = pte_offset_map(vmf->pmd, addr); 753 755 if (!pte) ··· 763 761 continue; 764 762 pte_unmap(pte); 765 763 pte = NULL; 764 + /* 765 + * Readahead entry may come from a device that we are not 766 + * holding a reference to, try to grab a reference, or skip. 767 + */ 768 + if (swp_type(entry) != swp_type(targ_entry)) { 769 + si = get_swap_device(entry); 770 + if (!si) 771 + continue; 772 + } 766 773 folio = __read_swap_cache_async(entry, gfp_mask, mpol, ilx, 767 774 &page_allocated, false); 775 + if (si) 776 + put_swap_device(si); 768 777 if (!folio) 769 778 continue; 770 779 if (page_allocated) {
+31 -6
mm/truncate.c
··· 177 177 return 0; 178 178 } 179 179 180 + static int try_folio_split_or_unmap(struct folio *folio, struct page *split_at, 181 + unsigned long min_order) 182 + { 183 + enum ttu_flags ttu_flags = 184 + TTU_SYNC | 185 + TTU_SPLIT_HUGE_PMD | 186 + TTU_IGNORE_MLOCK; 187 + int ret; 188 + 189 + ret = try_folio_split_to_order(folio, split_at, min_order); 190 + 191 + /* 192 + * If the split fails, unmap the folio, so it will be refaulted 193 + * with PTEs to respect SIGBUS semantics. 194 + * 195 + * Make an exception for shmem/tmpfs that for long time 196 + * intentionally mapped with PMDs across i_size. 197 + */ 198 + if (ret && !shmem_mapping(folio->mapping)) { 199 + try_to_unmap(folio, ttu_flags); 200 + WARN_ON(folio_mapped(folio)); 201 + } 202 + 203 + return ret; 204 + } 205 + 180 206 /* 181 207 * Handle partial folios. The folio may be entirely within the 182 208 * range if a split has raced with us. If not, we zero the part of the ··· 220 194 size_t size = folio_size(folio); 221 195 unsigned int offset, length; 222 196 struct page *split_at, *split_at2; 197 + unsigned int min_order; 223 198 224 199 if (pos < start) 225 200 offset = start - pos; ··· 250 223 if (!folio_test_large(folio)) 251 224 return true; 252 225 226 + min_order = mapping_min_folio_order(folio->mapping); 253 227 split_at = folio_page(folio, PAGE_ALIGN_DOWN(offset) / PAGE_SIZE); 254 - if (!try_folio_split(folio, split_at, NULL)) { 228 + if (!try_folio_split_or_unmap(folio, split_at, min_order)) { 255 229 /* 256 230 * try to split at offset + length to make sure folios within 257 231 * the range can be dropped, especially to avoid memory waste ··· 276 248 if (!folio_trylock(folio2)) 277 249 goto out; 278 250 279 - /* 280 - * make sure folio2 is large and does not change its mapping. 281 - * Its split result does not matter here. 
282 - */ 251 + /* make sure folio2 is large and does not change its mapping */ 283 252 if (folio_test_large(folio2) && 284 253 folio2->mapping == folio->mapping) 285 - try_folio_split(folio2, split_at2, NULL); 254 + try_folio_split_or_unmap(folio2, split_at2, min_order); 286 255 287 256 folio_unlock(folio2); 288 257 out:
+79 -32
net/bluetooth/6lowpan.c
··· 53 53 static struct l2cap_chan *listen_chan; 54 54 static DEFINE_MUTEX(set_lock); 55 55 56 + enum { 57 + LOWPAN_PEER_CLOSING, 58 + LOWPAN_PEER_MAXBITS 59 + }; 60 + 56 61 struct lowpan_peer { 57 62 struct list_head list; 58 63 struct rcu_head rcu; ··· 66 61 /* peer addresses in various formats */ 67 62 unsigned char lladdr[ETH_ALEN]; 68 63 struct in6_addr peer_addr; 64 + 65 + DECLARE_BITMAP(flags, LOWPAN_PEER_MAXBITS); 69 66 }; 70 67 71 68 struct lowpan_btle_dev { ··· 296 289 local_skb->pkt_type = PACKET_HOST; 297 290 local_skb->dev = dev; 298 291 292 + skb_reset_mac_header(local_skb); 299 293 skb_set_transport_header(local_skb, sizeof(struct ipv6hdr)); 300 294 301 295 if (give_skb_to_upper(local_skb, dev) != NET_RX_SUCCESS) { ··· 927 919 928 920 BT_DBG("peer %p chan %p", peer, peer->chan); 929 921 922 + l2cap_chan_lock(peer->chan); 930 923 l2cap_chan_close(peer->chan, ENOENT); 924 + l2cap_chan_unlock(peer->chan); 931 925 932 926 return 0; 933 927 } ··· 966 956 } 967 957 968 958 static int get_l2cap_conn(char *buf, bdaddr_t *addr, u8 *addr_type, 969 - struct l2cap_conn **conn) 959 + struct l2cap_conn **conn, bool disconnect) 970 960 { 971 961 struct hci_conn *hcon; 972 962 struct hci_dev *hdev; 963 + int le_addr_type; 973 964 int n; 974 965 975 966 n = sscanf(buf, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx %hhu", ··· 981 970 if (n < 7) 982 971 return -EINVAL; 983 972 973 + if (disconnect) { 974 + /* The "disconnect" debugfs command has used different address 975 + * type constants than "connect" since 2015. Let's retain that 976 + * for now even though it's obviously buggy... 
977 + */ 978 + *addr_type += 1; 979 + } 980 + 981 + switch (*addr_type) { 982 + case BDADDR_LE_PUBLIC: 983 + le_addr_type = ADDR_LE_DEV_PUBLIC; 984 + break; 985 + case BDADDR_LE_RANDOM: 986 + le_addr_type = ADDR_LE_DEV_RANDOM; 987 + break; 988 + default: 989 + return -EINVAL; 990 + } 991 + 984 992 /* The LE_PUBLIC address type is ignored because of BDADDR_ANY */ 985 993 hdev = hci_get_route(addr, BDADDR_ANY, BDADDR_LE_PUBLIC); 986 994 if (!hdev) 987 995 return -ENOENT; 988 996 989 997 hci_dev_lock(hdev); 990 - hcon = hci_conn_hash_lookup_le(hdev, addr, *addr_type); 998 + hcon = hci_conn_hash_lookup_le(hdev, addr, le_addr_type); 991 999 hci_dev_unlock(hdev); 992 1000 hci_dev_put(hdev); 993 1001 ··· 1023 993 static void disconnect_all_peers(void) 1024 994 { 1025 995 struct lowpan_btle_dev *entry; 1026 - struct lowpan_peer *peer, *tmp_peer, *new_peer; 1027 - struct list_head peers; 996 + struct lowpan_peer *peer; 997 + int nchans; 1028 998 1029 - INIT_LIST_HEAD(&peers); 1030 - 1031 - /* We make a separate list of peers as the close_cb() will 1032 - * modify the device peers list so it is better not to mess 1033 - * with the same list at the same time. 999 + /* l2cap_chan_close() cannot be called from RCU, and lock ordering 1000 + * chan->lock > devices_lock prevents taking write side lock, so copy 1001 + * then close. 
1034 1002 */ 1035 1003 1036 1004 rcu_read_lock(); 1037 - 1038 - list_for_each_entry_rcu(entry, &bt_6lowpan_devices, list) { 1039 - list_for_each_entry_rcu(peer, &entry->peers, list) { 1040 - new_peer = kmalloc(sizeof(*new_peer), GFP_ATOMIC); 1041 - if (!new_peer) 1042 - break; 1043 - 1044 - new_peer->chan = peer->chan; 1045 - INIT_LIST_HEAD(&new_peer->list); 1046 - 1047 - list_add(&new_peer->list, &peers); 1048 - } 1049 - } 1050 - 1005 + list_for_each_entry_rcu(entry, &bt_6lowpan_devices, list) 1006 + list_for_each_entry_rcu(peer, &entry->peers, list) 1007 + clear_bit(LOWPAN_PEER_CLOSING, peer->flags); 1051 1008 rcu_read_unlock(); 1052 1009 1053 - spin_lock(&devices_lock); 1054 - list_for_each_entry_safe(peer, tmp_peer, &peers, list) { 1055 - l2cap_chan_close(peer->chan, ENOENT); 1010 + do { 1011 + struct l2cap_chan *chans[32]; 1012 + int i; 1056 1013 1057 - list_del_rcu(&peer->list); 1058 - kfree_rcu(peer, rcu); 1059 - } 1060 - spin_unlock(&devices_lock); 1014 + nchans = 0; 1015 + 1016 + spin_lock(&devices_lock); 1017 + 1018 + list_for_each_entry_rcu(entry, &bt_6lowpan_devices, list) { 1019 + list_for_each_entry_rcu(peer, &entry->peers, list) { 1020 + if (test_and_set_bit(LOWPAN_PEER_CLOSING, 1021 + peer->flags)) 1022 + continue; 1023 + 1024 + l2cap_chan_hold(peer->chan); 1025 + chans[nchans++] = peer->chan; 1026 + 1027 + if (nchans >= ARRAY_SIZE(chans)) 1028 + goto done; 1029 + } 1030 + } 1031 + 1032 + done: 1033 + spin_unlock(&devices_lock); 1034 + 1035 + for (i = 0; i < nchans; ++i) { 1036 + l2cap_chan_lock(chans[i]); 1037 + l2cap_chan_close(chans[i], ENOENT); 1038 + l2cap_chan_unlock(chans[i]); 1039 + l2cap_chan_put(chans[i]); 1040 + } 1041 + } while (nchans); 1061 1042 } 1062 1043 1063 1044 struct set_enable { ··· 1091 1050 1092 1051 mutex_lock(&set_lock); 1093 1052 if (listen_chan) { 1053 + l2cap_chan_lock(listen_chan); 1094 1054 l2cap_chan_close(listen_chan, 0); 1055 + l2cap_chan_unlock(listen_chan); 1095 1056 l2cap_chan_put(listen_chan); 1096 1057 } 1097 
1058 ··· 1146 1103 buf[buf_size] = '\0'; 1147 1104 1148 1105 if (memcmp(buf, "connect ", 8) == 0) { 1149 - ret = get_l2cap_conn(&buf[8], &addr, &addr_type, &conn); 1106 + ret = get_l2cap_conn(&buf[8], &addr, &addr_type, &conn, false); 1150 1107 if (ret == -EINVAL) 1151 1108 return ret; 1152 1109 1153 1110 mutex_lock(&set_lock); 1154 1111 if (listen_chan) { 1112 + l2cap_chan_lock(listen_chan); 1155 1113 l2cap_chan_close(listen_chan, 0); 1114 + l2cap_chan_unlock(listen_chan); 1156 1115 l2cap_chan_put(listen_chan); 1157 1116 listen_chan = NULL; 1158 1117 } ··· 1185 1140 } 1186 1141 1187 1142 if (memcmp(buf, "disconnect ", 11) == 0) { 1188 - ret = get_l2cap_conn(&buf[11], &addr, &addr_type, &conn); 1143 + ret = get_l2cap_conn(&buf[11], &addr, &addr_type, &conn, true); 1189 1144 if (ret < 0) 1190 1145 return ret; 1191 1146 ··· 1316 1271 debugfs_remove(lowpan_control_debugfs); 1317 1272 1318 1273 if (listen_chan) { 1274 + l2cap_chan_lock(listen_chan); 1319 1275 l2cap_chan_close(listen_chan, 0); 1276 + l2cap_chan_unlock(listen_chan); 1320 1277 l2cap_chan_put(listen_chan); 1321 1278 } 1322 1279
+19 -14
net/bluetooth/hci_conn.c
··· 769 769 d->count++; 770 770 } 771 771 772 - static int hci_le_big_terminate(struct hci_dev *hdev, u8 big, struct hci_conn *conn) 772 + static int hci_le_big_terminate(struct hci_dev *hdev, struct hci_conn *conn) 773 773 { 774 774 struct iso_list_data *d; 775 775 int ret; 776 776 777 - bt_dev_dbg(hdev, "big 0x%2.2x sync_handle 0x%4.4x", big, conn->sync_handle); 777 + bt_dev_dbg(hdev, "hcon %p big 0x%2.2x sync_handle 0x%4.4x", conn, 778 + conn->iso_qos.bcast.big, conn->sync_handle); 778 779 779 780 d = kzalloc(sizeof(*d), GFP_KERNEL); 780 781 if (!d) 781 782 return -ENOMEM; 782 783 783 - d->big = big; 784 + d->big = conn->iso_qos.bcast.big; 784 785 d->sync_handle = conn->sync_handle; 785 786 786 - if (test_and_clear_bit(HCI_CONN_PA_SYNC, &conn->flags)) { 787 + if (conn->type == PA_LINK && 788 + test_and_clear_bit(HCI_CONN_PA_SYNC, &conn->flags)) { 787 789 hci_conn_hash_list_flag(hdev, find_bis, PA_LINK, 788 790 HCI_CONN_PA_SYNC, d); 789 791 ··· 802 800 if (!d->count) 803 801 d->big_sync_term = true; 804 802 } 803 + 804 + if (!d->pa_sync_term && !d->big_sync_term) 805 + return 0; 805 806 806 807 ret = hci_cmd_sync_queue(hdev, big_terminate_sync, d, 807 808 terminate_big_destroy); ··· 857 852 858 853 hci_le_terminate_big(hdev, conn); 859 854 } else { 860 - hci_le_big_terminate(hdev, conn->iso_qos.bcast.big, 861 - conn); 855 + hci_le_big_terminate(hdev, conn); 862 856 } 863 857 } 864 858 ··· 998 994 conn->mtu = hdev->le_mtu ? hdev->le_mtu : hdev->acl_mtu; 999 995 break; 1000 996 case CIS_LINK: 1001 - case BIS_LINK: 1002 - case PA_LINK: 1003 997 /* conn->src should reflect the local identity address */ 1004 998 hci_copy_identity_address(hdev, &conn->src, &conn->src_type); 1005 999 1006 - /* set proper cleanup function */ 1007 - if (!bacmp(dst, BDADDR_ANY)) 1008 - conn->cleanup = bis_cleanup; 1009 - else if (conn->role == HCI_ROLE_MASTER) 1000 + if (conn->role == HCI_ROLE_MASTER) 1010 1001 conn->cleanup = cis_cleanup; 1011 1002 1012 - conn->mtu = hdev->iso_mtu ? 
hdev->iso_mtu : 1013 - hdev->le_mtu ? hdev->le_mtu : hdev->acl_mtu; 1003 + conn->mtu = hdev->iso_mtu; 1004 + break; 1005 + case PA_LINK: 1006 + case BIS_LINK: 1007 + /* conn->src should reflect the local identity address */ 1008 + hci_copy_identity_address(hdev, &conn->src, &conn->src_type); 1009 + conn->cleanup = bis_cleanup; 1010 + conn->mtu = hdev->iso_mtu; 1014 1011 break; 1015 1012 case SCO_LINK: 1016 1013 if (lmp_esco_capable(hdev))
+36 -20
net/bluetooth/hci_event.c
··· 5843 5843 le16_to_cpu(ev->supervision_timeout)); 5844 5844 } 5845 5845 5846 + static void hci_le_pa_sync_lost_evt(struct hci_dev *hdev, void *data, 5847 + struct sk_buff *skb) 5848 + { 5849 + struct hci_ev_le_pa_sync_lost *ev = data; 5850 + u16 handle = le16_to_cpu(ev->handle); 5851 + struct hci_conn *conn; 5852 + 5853 + bt_dev_dbg(hdev, "sync handle 0x%4.4x", handle); 5854 + 5855 + hci_dev_lock(hdev); 5856 + 5857 + /* Delete the pa sync connection */ 5858 + conn = hci_conn_hash_lookup_pa_sync_handle(hdev, handle); 5859 + if (conn) { 5860 + clear_bit(HCI_CONN_BIG_SYNC, &conn->flags); 5861 + clear_bit(HCI_CONN_PA_SYNC, &conn->flags); 5862 + hci_disconn_cfm(conn, HCI_ERROR_REMOTE_USER_TERM); 5863 + hci_conn_del(conn); 5864 + } 5865 + 5866 + hci_dev_unlock(hdev); 5867 + } 5868 + 5846 5869 static void hci_le_ext_adv_term_evt(struct hci_dev *hdev, void *data, 5847 5870 struct sk_buff *skb) 5848 5871 { ··· 7024 7001 continue; 7025 7002 } 7026 7003 7027 - if (ev->status != 0x42) { 7004 + if (ev->status != 0x42) 7028 7005 /* Mark PA sync as established */ 7029 7006 set_bit(HCI_CONN_PA_SYNC, &bis->flags); 7030 - /* Reset cleanup callback of PA Sync so it doesn't 7031 - * terminate the sync when deleting the connection. 
7032 - */ 7033 - conn->cleanup = NULL; 7034 - } 7035 7007 7036 7008 bis->sync_handle = conn->sync_handle; 7037 7009 bis->iso_qos.bcast.big = ev->handle; ··· 7069 7051 struct sk_buff *skb) 7070 7052 { 7071 7053 struct hci_evt_le_big_sync_lost *ev = data; 7072 - struct hci_conn *bis, *conn; 7073 - bool mgmt_conn; 7054 + struct hci_conn *bis; 7055 + bool mgmt_conn = false; 7074 7056 7075 7057 bt_dev_dbg(hdev, "big handle 0x%2.2x", ev->handle); 7076 7058 7077 7059 hci_dev_lock(hdev); 7078 7060 7079 - /* Delete the pa sync connection */ 7080 - bis = hci_conn_hash_lookup_pa_sync_big_handle(hdev, ev->handle); 7081 - if (bis) { 7082 - conn = hci_conn_hash_lookup_pa_sync_handle(hdev, 7083 - bis->sync_handle); 7084 - if (conn) 7085 - hci_conn_del(conn); 7086 - } 7087 - 7088 7061 /* Delete each bis connection */ 7089 7062 while ((bis = hci_conn_hash_lookup_big_state(hdev, ev->handle, 7090 7063 BT_CONNECTED, 7091 7064 HCI_ROLE_SLAVE))) { 7092 - mgmt_conn = test_and_clear_bit(HCI_CONN_MGMT_CONNECTED, &bis->flags); 7093 - mgmt_device_disconnected(hdev, &bis->dst, bis->type, bis->dst_type, 7094 - ev->reason, mgmt_conn); 7065 + if (!mgmt_conn) { 7066 + mgmt_conn = test_and_clear_bit(HCI_CONN_MGMT_CONNECTED, 7067 + &bis->flags); 7068 + mgmt_device_disconnected(hdev, &bis->dst, bis->type, 7069 + bis->dst_type, ev->reason, 7070 + mgmt_conn); 7071 + } 7095 7072 7096 7073 clear_bit(HCI_CONN_BIG_SYNC, &bis->flags); 7097 7074 hci_disconn_cfm(bis, ev->reason); ··· 7200 7187 hci_le_per_adv_report_evt, 7201 7188 sizeof(struct hci_ev_le_per_adv_report), 7202 7189 HCI_MAX_EVENT_SIZE), 7190 + /* [0x10 = HCI_EV_LE_PA_SYNC_LOST] */ 7191 + HCI_LE_EV(HCI_EV_LE_PA_SYNC_LOST, hci_le_pa_sync_lost_evt, 7192 + sizeof(struct hci_ev_le_pa_sync_lost)), 7203 7193 /* [0x12 = HCI_EV_LE_EXT_ADV_SET_TERM] */ 7204 7194 HCI_LE_EV(HCI_EV_LE_EXT_ADV_SET_TERM, hci_le_ext_adv_term_evt, 7205 7195 sizeof(struct hci_evt_le_ext_adv_set_term)),
+1 -1
net/bluetooth/hci_sync.c
··· 6999 6999 7000 7000 hci_dev_lock(hdev); 7001 7001 7002 - if (!hci_conn_valid(hdev, conn)) 7002 + if (hci_conn_valid(hdev, conn)) 7003 7003 clear_bit(HCI_CONN_CREATE_PA_SYNC, &conn->flags); 7004 7004 7005 7005 if (!err)
+1
net/bluetooth/l2cap_core.c
··· 497 497 498 498 kref_get(&c->kref); 499 499 } 500 + EXPORT_SYMBOL_GPL(l2cap_chan_hold); 500 501 501 502 struct l2cap_chan *l2cap_chan_hold_unless_zero(struct l2cap_chan *c) 502 503 {
+1
net/bluetooth/mgmt.c
··· 9497 9497 cancel_delayed_work_sync(&hdev->discov_off); 9498 9498 cancel_delayed_work_sync(&hdev->service_cache); 9499 9499 cancel_delayed_work_sync(&hdev->rpa_expired); 9500 + cancel_delayed_work_sync(&hdev->mesh_send_done); 9500 9501 } 9501 9502 9502 9503 void mgmt_power_on(struct hci_dev *hdev, int err)
+5 -2
net/core/netpoll.c
··· 811 811 if (!npinfo) 812 812 return; 813 813 814 + /* At this point, there is a single npinfo instance per netdevice, and 815 + * its refcnt tracks how many netpoll structures are linked to it. We 816 + * only perform npinfo cleanup when the refcnt decrements to zero. 817 + */ 814 818 if (refcount_dec_and_test(&npinfo->refcnt)) { 815 819 const struct net_device_ops *ops; 816 820 ··· 824 820 825 821 RCU_INIT_POINTER(np->dev->npinfo, NULL); 826 822 call_rcu(&npinfo->rcu, rcu_cleanup_netpoll_info); 827 - } else 828 - RCU_INIT_POINTER(np->dev->npinfo, NULL); 823 + } 829 824 830 825 skb_pool_flush(np); 831 826 }
+4 -2
net/dsa/tag_brcm.c
··· 176 176 /* Remove Broadcom tag and update checksum */ 177 177 skb_pull_rcsum(skb, BRCM_TAG_LEN); 178 178 179 - dsa_default_offload_fwd_mark(skb); 179 + if (likely(!is_link_local_ether_addr(eth_hdr(skb)->h_dest))) 180 + dsa_default_offload_fwd_mark(skb); 180 181 181 182 return skb; 182 183 } ··· 251 250 /* Remove Broadcom tag and update checksum */ 252 251 skb_pull_rcsum(skb, len); 253 252 254 - dsa_default_offload_fwd_mark(skb); 253 + if (likely(!is_link_local_ether_addr(eth_hdr(skb)->h_dest))) 254 + dsa_default_offload_fwd_mark(skb); 255 255 256 256 dsa_strip_etype_header(skb, len); 257 257
+1
net/handshake/tlshd.c
··· 259 259 260 260 out_cancel: 261 261 genlmsg_cancel(msg, hdr); 262 + nlmsg_free(msg); 262 263 out: 263 264 return ret; 264 265 }
+4 -1
net/hsr/hsr_device.c
··· 320 320 } 321 321 322 322 hsr_stag = skb_put(skb, sizeof(struct hsr_sup_tag)); 323 + skb_set_network_header(skb, ETH_HLEN + HSR_HLEN); 324 + skb_reset_mac_len(skb); 325 + 323 326 set_hsr_stag_path(hsr_stag, (hsr->prot_version ? 0x0 : 0xf)); 324 327 set_hsr_stag_HSR_ver(hsr_stag, hsr->prot_version); 325 328 ··· 337 334 } 338 335 339 336 hsr_stag->tlv.HSR_TLV_type = type; 340 - /* TODO: Why 12 in HSRv0? */ 337 + /* HSRv0 has 6 unused bytes after the MAC */ 341 338 hsr_stag->tlv.HSR_TLV_length = hsr->prot_version ? 342 339 sizeof(struct hsr_sup_payload) : 12; 343 340
+15 -7
net/hsr/hsr_forward.c
··· 262 262 return skb; 263 263 } 264 264 265 - static void hsr_set_path_id(struct hsr_ethhdr *hsr_ethhdr, 265 + static void hsr_set_path_id(struct hsr_frame_info *frame, 266 + struct hsr_ethhdr *hsr_ethhdr, 266 267 struct hsr_port *port) 267 268 { 268 269 int path_id; 269 270 270 - if (port->type == HSR_PT_SLAVE_A) 271 - path_id = 0; 272 - else 273 - path_id = 1; 271 + if (port->hsr->prot_version) { 272 + if (port->type == HSR_PT_SLAVE_A) 273 + path_id = 0; 274 + else 275 + path_id = 1; 276 + } else { 277 + if (frame->is_supervision) 278 + path_id = 0xf; 279 + else 280 + path_id = 1; 281 + } 274 282 275 283 set_hsr_tag_path(&hsr_ethhdr->hsr_tag, path_id); 276 284 } ··· 312 304 else 313 305 hsr_ethhdr = (struct hsr_ethhdr *)pc; 314 306 315 - hsr_set_path_id(hsr_ethhdr, port); 307 + hsr_set_path_id(frame, hsr_ethhdr, port); 316 308 set_hsr_tag_LSDU_size(&hsr_ethhdr->hsr_tag, lsdu_size); 317 309 hsr_ethhdr->hsr_tag.sequence_nr = htons(frame->sequence_nr); 318 310 hsr_ethhdr->hsr_tag.encap_proto = hsr_ethhdr->ethhdr.h_proto; ··· 338 330 (struct hsr_ethhdr *)skb_mac_header(frame->skb_hsr); 339 331 340 332 /* set the lane id properly */ 341 - hsr_set_path_id(hsr_ethhdr, port); 333 + hsr_set_path_id(frame, hsr_ethhdr, port); 342 334 return skb_clone(frame->skb_hsr, GFP_ATOMIC); 343 335 } else if (port->dev->features & NETIF_F_HW_HSR_TAG_INS) { 344 336 return skb_clone(frame->skb_std, GFP_ATOMIC);
+5
net/ipv4/route.c
··· 607 607 oldest_p = fnhe_p; 608 608 } 609 609 } 610 + 611 + /* Clear oldest->fnhe_daddr to prevent this fnhe from being 612 + * rebound with new dsts in rt_bind_exception(). 613 + */ 614 + oldest->fnhe_daddr = 0; 610 615 fnhe_flush_routes(oldest); 611 616 *oldest_p = oldest->fnhe_next; 612 617 kfree_rcu(oldest, rcu);
+11 -3
net/mac80211/iface.c
··· 223 223 if (netif_carrier_ok(sdata->dev)) 224 224 return -EBUSY; 225 225 226 + /* if any stations are set known (so they know this vif too), reject */ 227 + if (sta_info_get_by_idx(sdata, 0)) 228 + return -EBUSY; 229 + 226 230 /* First check no ROC work is happening on this iface */ 227 231 list_for_each_entry(roc, &local->roc_list, list) { 228 232 if (roc->sdata != sdata) ··· 246 242 ret = -EBUSY; 247 243 } 248 244 245 + /* 246 + * More interface types could be added here but changing the 247 + * address while powered makes the most sense in client modes. 248 + */ 249 249 switch (sdata->vif.type) { 250 250 case NL80211_IFTYPE_STATION: 251 251 case NL80211_IFTYPE_P2P_CLIENT: 252 - /* More interface types could be added here but changing the 253 - * address while powered makes the most sense in client modes. 254 - */ 252 + /* refuse while connecting */ 253 + if (sdata->u.mgd.auth_data || sdata->u.mgd.assoc_data) 254 + return -EBUSY; 255 255 break; 256 256 default: 257 257 ret = -EOPNOTSUPP;
+7 -3
net/mac80211/rx.c
··· 5360 5360 if (WARN_ON(!local->started)) 5361 5361 goto drop; 5362 5362 5363 - if (likely(!(status->flag & RX_FLAG_FAILED_PLCP_CRC))) { 5363 + if (likely(!(status->flag & RX_FLAG_FAILED_PLCP_CRC) && 5364 + !(status->flag & RX_FLAG_NO_PSDU && 5365 + status->zero_length_psdu_type == 5366 + IEEE80211_RADIOTAP_ZERO_LEN_PSDU_NOT_CAPTURED))) { 5364 5367 /* 5365 - * Validate the rate, unless a PLCP error means that 5366 - * we probably can't have a valid rate here anyway. 5368 + * Validate the rate, unless there was a PLCP error which may 5369 + * have an invalid rate, or the PSDU was not captured and may be 5370 + * missing rate information. 5367 5371 */ 5368 5372 5369 5373 switch (status->encoding) {
+4 -2
net/mptcp/protocol.c
··· 61 61 62 62 static const struct proto_ops *mptcp_fallback_tcp_ops(const struct sock *sk) 63 63 { 64 + unsigned short family = READ_ONCE(sk->sk_family); 65 + 64 66 #if IS_ENABLED(CONFIG_MPTCP_IPV6) 65 - if (sk->sk_prot == &tcpv6_prot) 67 + if (family == AF_INET6) 66 68 return &inet6_stream_ops; 67 69 #endif 68 - WARN_ON_ONCE(sk->sk_prot != &tcp_prot); 70 + WARN_ON_ONCE(family != AF_INET); 69 71 return &inet_stream_ops; 70 72 } 71 73
+8
net/mptcp/subflow.c
··· 2144 2144 tcp_prot_override = tcp_prot; 2145 2145 tcp_prot_override.release_cb = tcp_release_cb_override; 2146 2146 tcp_prot_override.diag_destroy = tcp_abort_override; 2147 + #ifdef CONFIG_BPF_SYSCALL 2148 + /* Disable sockmap processing for subflows */ 2149 + tcp_prot_override.psock_update_sk_prot = NULL; 2150 + #endif 2147 2151 2148 2152 #if IS_ENABLED(CONFIG_MPTCP_IPV6) 2149 2153 /* In struct mptcp_subflow_request_sock, we assume the TCP request sock ··· 2184 2180 tcpv6_prot_override = tcpv6_prot; 2185 2181 tcpv6_prot_override.release_cb = tcp_release_cb_override; 2186 2182 tcpv6_prot_override.diag_destroy = tcp_abort_override; 2183 + #ifdef CONFIG_BPF_SYSCALL 2184 + /* Disable sockmap processing for subflows */ 2185 + tcpv6_prot_override.psock_update_sk_prot = NULL; 2186 + #endif 2187 2187 #endif 2188 2188 2189 2189 mptcp_diag_subflow_init(&subflow_ulp_ops);
+2 -4
net/sched/act_bpf.c
··· 47 47 filter = rcu_dereference(prog->filter); 48 48 if (at_ingress) { 49 49 __skb_push(skb, skb->mac_len); 50 - bpf_compute_data_pointers(skb); 51 - filter_res = bpf_prog_run(filter, skb); 50 + filter_res = bpf_prog_run_data_pointers(filter, skb); 52 51 __skb_pull(skb, skb->mac_len); 53 52 } else { 54 - bpf_compute_data_pointers(skb); 55 - filter_res = bpf_prog_run(filter, skb); 53 + filter_res = bpf_prog_run_data_pointers(filter, skb); 56 54 } 57 55 if (unlikely(!skb->tstamp && skb->tstamp_type)) 58 56 skb->tstamp_type = SKB_CLOCK_REALTIME;
+7 -5
net/sched/act_connmark.c
··· 195 195 const struct tcf_connmark_info *ci = to_connmark(a); 196 196 unsigned char *b = skb_tail_pointer(skb); 197 197 const struct tcf_connmark_parms *parms; 198 - struct tc_connmark opt = { 199 - .index = ci->tcf_index, 200 - .refcnt = refcount_read(&ci->tcf_refcnt) - ref, 201 - .bindcnt = atomic_read(&ci->tcf_bindcnt) - bind, 202 - }; 198 + struct tc_connmark opt; 203 199 struct tcf_t t; 200 + 201 + memset(&opt, 0, sizeof(opt)); 202 + 203 + opt.index = ci->tcf_index; 204 + opt.refcnt = refcount_read(&ci->tcf_refcnt) - ref; 205 + opt.bindcnt = atomic_read(&ci->tcf_bindcnt) - bind; 204 206 205 207 rcu_read_lock(); 206 208 parms = rcu_dereference(ci->parms);
+7 -5
net/sched/act_ife.c
··· 644 644 unsigned char *b = skb_tail_pointer(skb); 645 645 struct tcf_ife_info *ife = to_ife(a); 646 646 struct tcf_ife_params *p; 647 - struct tc_ife opt = { 648 - .index = ife->tcf_index, 649 - .refcnt = refcount_read(&ife->tcf_refcnt) - ref, 650 - .bindcnt = atomic_read(&ife->tcf_bindcnt) - bind, 651 - }; 647 + struct tc_ife opt; 652 648 struct tcf_t t; 649 + 650 + memset(&opt, 0, sizeof(opt)); 651 + 652 + opt.index = ife->tcf_index; 653 + opt.refcnt = refcount_read(&ife->tcf_refcnt) - ref; 654 + opt.bindcnt = atomic_read(&ife->tcf_bindcnt) - bind; 653 655 654 656 spin_lock_bh(&ife->tcf_lock); 655 657 opt.action = ife->tcf_action;
+2 -4
net/sched/cls_bpf.c
··· 97 97 } else if (at_ingress) { 98 98 /* It is safe to push/pull even if skb_shared() */ 99 99 __skb_push(skb, skb->mac_len); 100 - bpf_compute_data_pointers(skb); 101 - filter_res = bpf_prog_run(prog->filter, skb); 100 + filter_res = bpf_prog_run_data_pointers(prog->filter, skb); 102 101 __skb_pull(skb, skb->mac_len); 103 102 } else { 104 - bpf_compute_data_pointers(skb); 105 - filter_res = bpf_prog_run(prog->filter, skb); 103 + filter_res = bpf_prog_run_data_pointers(prog->filter, skb); 106 104 } 107 105 if (unlikely(!skb->tstamp && skb->tstamp_type)) 108 106 skb->tstamp_type = SKB_CLOCK_REALTIME;
+5
net/sched/sch_api.c
··· 1599 1599 NL_SET_ERR_MSG(extack, "Failed to find specified qdisc"); 1600 1600 return -ENOENT; 1601 1601 } 1602 + if (p->flags & TCQ_F_INGRESS) { 1603 + NL_SET_ERR_MSG(extack, 1604 + "Cannot add children to ingress/clsact qdisc"); 1605 + return -EOPNOTSUPP; 1606 + } 1602 1607 q = qdisc_leaf(p, clid, extack); 1603 1608 if (IS_ERR(q)) 1604 1609 return PTR_ERR(q);
+10 -7
net/sched/sch_generic.c
··· 180 180 static void try_bulk_dequeue_skb(struct Qdisc *q, 181 181 struct sk_buff *skb, 182 182 const struct netdev_queue *txq, 183 - int *packets) 183 + int *packets, int budget) 184 184 { 185 185 int bytelimit = qdisc_avail_bulklimit(txq) - skb->len; 186 + int cnt = 0; 186 187 187 188 while (bytelimit > 0) { 188 189 struct sk_buff *nskb = q->dequeue(q); ··· 194 193 bytelimit -= nskb->len; /* covers GSO len */ 195 194 skb->next = nskb; 196 195 skb = nskb; 197 - (*packets)++; /* GSO counts as one pkt */ 196 + if (++cnt >= budget) 197 + break; 198 198 } 199 + (*packets) += cnt; 199 200 skb_mark_not_on_list(skb); 200 201 } 201 202 ··· 231 228 * A requeued skb (via q->gso_skb) can also be a SKB list. 232 229 */ 233 230 static struct sk_buff *dequeue_skb(struct Qdisc *q, bool *validate, 234 - int *packets) 231 + int *packets, int budget) 235 232 { 236 233 const struct netdev_queue *txq = q->dev_queue; 237 234 struct sk_buff *skb = NULL; ··· 298 295 if (skb) { 299 296 bulk: 300 297 if (qdisc_may_bulk(q)) 301 - try_bulk_dequeue_skb(q, skb, txq, packets); 298 + try_bulk_dequeue_skb(q, skb, txq, packets, budget); 302 299 else 303 300 try_bulk_dequeue_skb_slow(q, skb, packets); 304 301 } ··· 390 387 * >0 - queue is not empty. 391 388 * 392 389 */ 393 - static inline bool qdisc_restart(struct Qdisc *q, int *packets) 390 + static inline bool qdisc_restart(struct Qdisc *q, int *packets, int budget) 394 391 { 395 392 spinlock_t *root_lock = NULL; 396 393 struct netdev_queue *txq; ··· 399 396 bool validate; 400 397 401 398 /* Dequeue packet */ 402 - skb = dequeue_skb(q, &validate, packets); 399 + skb = dequeue_skb(q, &validate, packets, budget); 403 400 if (unlikely(!skb)) 404 401 return false; 405 402 ··· 417 414 int quota = READ_ONCE(net_hotdata.dev_tx_weight); 418 415 int packets; 419 416 420 - while (qdisc_restart(q, &packets)) { 417 + while (qdisc_restart(q, &packets, quota)) { 421 418 quota -= packets; 422 419 if (quota <= 0) { 423 420 if (q->flags & TCQ_F_NOLOCK)
+9 -4
net/sctp/transport.c
··· 486 486 487 487 if (tp->rttvar || tp->srtt) { 488 488 struct net *net = tp->asoc->base.net; 489 + unsigned int rto_beta, rto_alpha; 489 490 /* 6.3.1 C3) When a new RTT measurement R' is made, set 490 491 * RTTVAR <- (1 - RTO.Beta) * RTTVAR + RTO.Beta * |SRTT - R'| 491 492 * SRTT <- (1 - RTO.Alpha) * SRTT + RTO.Alpha * R' ··· 498 497 * For example, assuming the default value of RTO.Alpha of 499 498 * 1/8, rto_alpha would be expressed as 3. 500 499 */ 501 - tp->rttvar = tp->rttvar - (tp->rttvar >> net->sctp.rto_beta) 502 - + (((__u32)abs((__s64)tp->srtt - (__s64)rtt)) >> net->sctp.rto_beta); 503 - tp->srtt = tp->srtt - (tp->srtt >> net->sctp.rto_alpha) 504 - + (rtt >> net->sctp.rto_alpha); 500 + rto_beta = READ_ONCE(net->sctp.rto_beta); 501 + if (rto_beta < 32) 502 + tp->rttvar = tp->rttvar - (tp->rttvar >> rto_beta) 503 + + (((__u32)abs((__s64)tp->srtt - (__s64)rtt)) >> rto_beta); 504 + rto_alpha = READ_ONCE(net->sctp.rto_alpha); 505 + if (rto_alpha < 32) 506 + tp->srtt = tp->srtt - (tp->srtt >> rto_alpha) 507 + + (rtt >> rto_alpha); 505 508 } else { 506 509 /* 6.3.1 C2) When the first RTT measurement R is made, set 507 510 * SRTT <- R, RTTVAR <- R/2.
+1
net/smc/smc_clc.c
··· 890 890 return SMC_CLC_DECL_CNFERR; 891 891 } 892 892 pclc_base->hdr.typev1 = SMC_TYPE_N; 893 + ini->smc_type_v1 = SMC_TYPE_N; 893 894 } else { 894 895 pclc_base->iparea_offset = htons(sizeof(*pclc_smcd)); 895 896 plen += sizeof(*pclc_prfx) +
+1 -1
net/strparser/strparser.c
··· 238 238 strp_parser_err(strp, -EMSGSIZE, desc); 239 239 break; 240 240 } else if (len <= (ssize_t)head->len - 241 - skb->len - stm->strp.offset) { 241 + (ssize_t)skb->len - stm->strp.offset) { 242 242 /* Length must be into new skb (and also 243 243 * greater than zero) 244 244 */
+1 -2
net/sunrpc/Kconfig
··· 18 18 19 19 config RPCSEC_GSS_KRB5 20 20 tristate "Secure RPC: Kerberos V mechanism" 21 - depends on SUNRPC 21 + depends on SUNRPC && CRYPTO 22 22 default y 23 23 select SUNRPC_GSS 24 - select CRYPTO 25 24 select CRYPTO_SKCIPHER 26 25 select CRYPTO_HASH 27 26 help
+2
net/tipc/net.c
··· 145 145 { 146 146 struct tipc_net *tn = container_of(work, struct tipc_net, work); 147 147 148 + rtnl_lock(); 148 149 tipc_net_finalize(tipc_link_net(tn->bcl), tn->trial_addr); 150 + rtnl_unlock(); 149 151 } 150 152 151 153 void tipc_net_stop(struct net *net)
+11 -3
net/unix/garbage.c
··· 145 145 }; 146 146 147 147 static unsigned long unix_vertex_unvisited_index = UNIX_VERTEX_INDEX_MARK1; 148 + static unsigned long unix_vertex_max_scc_index = UNIX_VERTEX_INDEX_START; 148 149 149 150 static void unix_add_edge(struct scm_fp_list *fpl, struct unix_edge *edge) 150 151 { ··· 154 153 if (!vertex) { 155 154 vertex = list_first_entry(&fpl->vertices, typeof(*vertex), entry); 156 155 vertex->index = unix_vertex_unvisited_index; 156 + vertex->scc_index = ++unix_vertex_max_scc_index; 157 157 vertex->out_degree = 0; 158 158 INIT_LIST_HEAD(&vertex->edges); 159 159 INIT_LIST_HEAD(&vertex->scc_entry); ··· 491 489 scc_dead = unix_vertex_dead(v); 492 490 } 493 491 494 - if (scc_dead) 492 + if (scc_dead) { 495 493 unix_collect_skb(&scc, hitlist); 496 - else if (!unix_graph_maybe_cyclic) 497 - unix_graph_maybe_cyclic = unix_scc_cyclic(&scc); 494 + } else { 495 + if (unix_vertex_max_scc_index < vertex->scc_index) 496 + unix_vertex_max_scc_index = vertex->scc_index; 497 + 498 + if (!unix_graph_maybe_cyclic) 499 + unix_graph_maybe_cyclic = unix_scc_cyclic(&scc); 500 + } 498 501 499 502 list_del(&scc); 500 503 } ··· 514 507 unsigned long last_index = UNIX_VERTEX_INDEX_START; 515 508 516 509 unix_graph_maybe_cyclic = false; 510 + unix_vertex_max_scc_index = UNIX_VERTEX_INDEX_START; 517 511 518 512 /* Visit every vertex exactly once. 519 513 * __unix_walk_scc() moves visited vertices to unix_visited_vertices.
+1 -1
rust/Makefile
··· 298 298 -fno-inline-functions-called-once -fsanitize=bounds-strict \ 299 299 -fstrict-flex-arrays=% -fmin-function-alignment=% \ 300 300 -fzero-init-padding-bits=% -mno-fdpic \ 301 - --param=% --param asan-% 301 + --param=% --param asan-% -fno-isolate-erroneous-paths-dereference 302 302 303 303 # Derived from `scripts/Makefile.clang`. 304 304 BINDGEN_TARGET_x86 := x86_64-linux-gnu
+8 -6
scripts/decode_stacktrace.sh
··· 277 277 fi 278 278 done 279 279 280 - if [[ ${words[$last]} =~ ^[0-9a-f]+\] ]]; then 281 - words[$last-1]="${words[$last-1]} ${words[$last]}" 282 - unset words[$last] spaces[$last] 283 - last=$(( $last - 1 )) 284 - fi 285 - 286 280 # Extract info after the symbol if present. E.g.: 287 281 # func_name+0x54/0x80 (P) 288 282 # ^^^ ··· 285 291 local info_str="" 286 292 if [[ ${words[$last]} =~ \([A-Z]*\) ]]; then 287 293 info_str=${words[$last]} 294 + unset words[$last] spaces[$last] 295 + last=$(( $last - 1 )) 296 + fi 297 + 298 + # Join module name with its build id if present, as these were 299 + # split during tokenization (e.g. "[module" and "modbuildid]"). 300 + if [[ ${words[$last]} =~ ^[0-9a-f]+\] ]]; then 301 + words[$last-1]="${words[$last-1]} ${words[$last]}" 288 302 unset words[$last] spaces[$last] 289 303 last=$(( $last - 1 )) 290 304 fi
+2 -1
scripts/gendwarfksyms/gendwarfksyms.c
··· 138 138 error("no input files?"); 139 139 } 140 140 141 - symbol_read_exports(stdin); 141 + if (!symbol_read_exports(stdin)) 142 + return 0; 142 143 143 144 if (symtypes_file) { 144 145 symfile = fopen(symtypes_file, "w");
+1 -1
scripts/gendwarfksyms/gendwarfksyms.h
··· 123 123 typedef void (*symbol_callback_t)(struct symbol *, void *arg); 124 124 125 125 bool is_symbol_ptr(const char *name); 126 - void symbol_read_exports(FILE *file); 126 + int symbol_read_exports(FILE *file); 127 127 void symbol_read_symtab(int fd); 128 128 struct symbol *symbol_get(const char *name); 129 129 void symbol_set_ptr(struct symbol *sym, Dwarf_Die *ptr);
+3 -1
scripts/gendwarfksyms/symbols.c
··· 128 128 return for_each(name, NULL, NULL) > 0; 129 129 } 130 130 131 - void symbol_read_exports(FILE *file) 131 + int symbol_read_exports(FILE *file) 132 132 { 133 133 struct symbol *sym; 134 134 char *line = NULL; ··· 159 159 160 160 free(line); 161 161 debug("%d exported symbols", nsym); 162 + 163 + return nsym; 162 164 } 163 165 164 166 static void get_symbol(struct symbol *sym, void *arg)
+2 -2
sound/hda/codecs/hdmi/nvhdmi-mcp.c
··· 350 350 static const struct hda_codec_ops nvhdmi_mcp_codec_ops = { 351 351 .probe = nvhdmi_mcp_probe, 352 352 .remove = snd_hda_hdmi_simple_remove, 353 - .build_controls = nvhdmi_mcp_build_pcms, 354 - .build_pcms = nvhdmi_mcp_build_controls, 353 + .build_pcms = nvhdmi_mcp_build_pcms, 354 + .build_controls = nvhdmi_mcp_build_controls, 355 355 .init = nvhdmi_mcp_init, 356 356 .unsol_event = snd_hda_hdmi_simple_unsol_event, 357 357 };
+9
sound/hda/codecs/realtek/alc269.c
··· 6694 6694 SND_PCI_QUIRK(0x103c, 0x8e60, "HP Trekker ", ALC287_FIXUP_CS35L41_I2C_2), 6695 6695 SND_PCI_QUIRK(0x103c, 0x8e61, "HP Trekker ", ALC287_FIXUP_CS35L41_I2C_2), 6696 6696 SND_PCI_QUIRK(0x103c, 0x8e62, "HP Trekker ", ALC287_FIXUP_CS35L41_I2C_2), 6697 + SND_PCI_QUIRK(0x103c, 0x8ed5, "HP Merino13X", ALC245_FIXUP_TAS2781_SPI_2), 6698 + SND_PCI_QUIRK(0x103c, 0x8ed6, "HP Merino13", ALC245_FIXUP_TAS2781_SPI_2), 6699 + SND_PCI_QUIRK(0x103c, 0x8ed7, "HP Merino14", ALC245_FIXUP_TAS2781_SPI_2), 6700 + SND_PCI_QUIRK(0x103c, 0x8ed8, "HP Merino16", ALC245_FIXUP_TAS2781_SPI_2), 6701 + SND_PCI_QUIRK(0x103c, 0x8ed9, "HP Merino14W", ALC245_FIXUP_TAS2781_SPI_2), 6702 + SND_PCI_QUIRK(0x103c, 0x8eda, "HP Merino16W", ALC245_FIXUP_TAS2781_SPI_2), 6703 + SND_PCI_QUIRK(0x103c, 0x8f40, "HP Lampas14", ALC287_FIXUP_TXNW2781_I2C), 6704 + SND_PCI_QUIRK(0x103c, 0x8f41, "HP Lampas16", ALC287_FIXUP_TXNW2781_I2C), 6705 + SND_PCI_QUIRK(0x103c, 0x8f42, "HP LampasW14", ALC287_FIXUP_TXNW2781_I2C), 6697 6706 SND_PCI_QUIRK(0x1043, 0x1032, "ASUS VivoBook X513EA", ALC256_FIXUP_ASUS_MIC_NO_PRESENCE), 6698 6707 SND_PCI_QUIRK(0x1043, 0x1034, "ASUS GU605C", ALC285_FIXUP_ASUS_GU605_SPI_SPEAKER2_TO_DAC1), 6699 6708 SND_PCI_QUIRK(0x1043, 0x103e, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC),
+1 -1
sound/soc/codecs/lpass-va-macro.c
··· 1674 1674 if (ret) 1675 1675 goto err_clkout; 1676 1676 1677 - va->fsgen = clk_hw_get_clk(&va->hw, "fsgen"); 1677 + va->fsgen = devm_clk_hw_get_clk(dev, &va->hw, "fsgen"); 1678 1678 if (IS_ERR(va->fsgen)) { 1679 1679 ret = PTR_ERR(va->fsgen); 1680 1680 goto err_clkout;
+7 -2
sound/soc/codecs/tas2781-i2c.c
··· 1958 1958 { 1959 1959 struct i2c_client *client = (struct i2c_client *)tas_priv->client; 1960 1960 unsigned int dev_addrs[TASDEVICE_MAX_CHANNELS]; 1961 - int i, ndev = 0; 1961 + int ndev = 0; 1962 + int i, rc; 1962 1963 1963 1964 if (tas_priv->isacpi) { 1964 1965 ndev = device_property_read_u32_array(&client->dev, ··· 1970 1969 } else { 1971 1970 ndev = (ndev < ARRAY_SIZE(dev_addrs)) 1972 1971 ? ndev : ARRAY_SIZE(dev_addrs); 1973 - ndev = device_property_read_u32_array(&client->dev, 1972 + rc = device_property_read_u32_array(&client->dev, 1974 1973 "ti,audio-slots", dev_addrs, ndev); 1974 + if (rc != 0) { 1975 + ndev = 1; 1976 + dev_addrs[0] = client->addr; 1977 + } 1975 1978 } 1976 1979 1977 1980 tas_priv->irq =
+9 -4
sound/soc/intel/avs/path.c
··· 210 210 continue; 211 211 } 212 212 213 - blob = avs_nhlt_config_or_default(adev, module_template); 214 - if (IS_ERR(blob)) 215 - continue; 213 + if (!module_template->nhlt_config) { 214 + blob = avs_nhlt_config_or_default(adev, module_template); 215 + if (IS_ERR(blob)) 216 + continue; 217 + } 216 218 217 219 rlist[i] = path_template->fe_fmt->sampling_freq; 218 220 clist[i] = path_template->fe_fmt->num_channels; ··· 384 382 struct acpi_nhlt_config *blob; 385 383 size_t gtw_size; 386 384 387 - blob = avs_nhlt_config_or_default(adev, t); 385 + if (t->nhlt_config) 386 + blob = t->nhlt_config->blob; 387 + else 388 + blob = avs_nhlt_config_or_default(adev, t); 388 389 if (IS_ERR(blob)) 389 390 return PTR_ERR(blob); 390 391
+110 -3
sound/soc/intel/avs/topology.c
··· 350 350 AVS_DEFINE_PTR_PARSER(modcfg_ext, struct avs_tplg_modcfg_ext, modcfgs_ext); 351 351 AVS_DEFINE_PTR_PARSER(pplcfg, struct avs_tplg_pplcfg, pplcfgs); 352 352 AVS_DEFINE_PTR_PARSER(binding, struct avs_tplg_binding, bindings); 353 + AVS_DEFINE_PTR_PARSER(nhlt_config, struct avs_tplg_nhlt_config, nhlt_configs); 353 354 354 355 static int 355 356 parse_audio_format_bitfield(struct snd_soc_component *comp, void *elem, void *object, u32 offset) ··· 418 417 419 418 avs_ssp_sprint(val, SNDRV_CTL_ELEM_ID_NAME_MAXLEN, tuple->string, ssp_port, tdm_slot); 420 419 420 + return 0; 421 + } 422 + 423 + static int avs_parse_nhlt_config_size(struct snd_soc_component *comp, void *elem, void *object, 424 + u32 offset) 425 + { 426 + struct snd_soc_tplg_vendor_value_elem *tuple = elem; 427 + struct acpi_nhlt_config **blob = (struct acpi_nhlt_config **)((u8 *)object + offset); 428 + u32 size; 429 + 430 + size = le32_to_cpu(tuple->value); 431 + *blob = devm_kzalloc(comp->card->dev, struct_size(*blob, capabilities, size), GFP_KERNEL); 432 + if (!*blob) 433 + return -ENOMEM; 434 + 435 + (*blob)->capabilities_size = size; 421 436 return 0; 422 437 } 423 438 ··· 1201 1184 .offset = offsetof(struct avs_tplg_module, num_config_ids), 1202 1185 .parse = avs_parse_byte_token, 1203 1186 }, 1187 + { 1188 + .token = AVS_TKN_MOD_NHLT_CONFIG_ID_U32, 1189 + .type = SND_SOC_TPLG_TUPLE_TYPE_WORD, 1190 + .offset = offsetof(struct avs_tplg_module, nhlt_config), 1191 + .parse = avs_parse_nhlt_config_ptr, 1192 + }, 1204 1193 }; 1205 1194 1206 1195 static const struct avs_tplg_token_parser init_config_parsers[] = { ··· 1674 1651 1675 1652 static int avs_tplg_parse_initial_configs(struct snd_soc_component *comp, 1676 1653 struct snd_soc_tplg_vendor_array *tuples, 1677 - u32 block_size) 1654 + u32 block_size, u32 *offset) 1678 1655 { 1679 1656 struct avs_soc_component *acomp = to_avs_soc_component(comp); 1680 1657 struct avs_tplg *tplg = acomp->tplg; 1681 1658 int ret, i; 1659 + 1660 + *offset = 0; 
1682 1661 1683 1662 /* Parse tuple section telling how many init configs there are. */ 1684 1663 ret = parse_dictionary_header(comp, tuples, (void **)&tplg->init_configs, ··· 1691 1666 return ret; 1692 1667 1693 1668 block_size -= le32_to_cpu(tuples->size); 1669 + *offset += le32_to_cpu(tuples->size); 1694 1670 /* With header parsed, move on to parsing entries. */ 1695 1671 tuples = avs_tplg_vendor_array_next(tuples); 1696 1672 ··· 1707 1681 */ 1708 1682 tmp = avs_tplg_vendor_array_next(tuples); 1709 1683 esize = le32_to_cpu(tuples->size) + le32_to_cpu(tmp->size); 1684 + *offset += esize; 1710 1685 1711 1686 ret = parse_dictionary_entries(comp, tuples, esize, config, 1, sizeof(*config), 1712 1687 AVS_TKN_INIT_CONFIG_ID_U32, ··· 1719 1692 /* handle raw data section */ 1720 1693 init_config_data = (void *)tuples + esize; 1721 1694 esize = config->length; 1695 + *offset += esize; 1722 1696 1723 1697 config->data = devm_kmemdup(comp->card->dev, init_config_data, esize, GFP_KERNEL); 1724 1698 if (!config->data) ··· 1727 1699 1728 1700 tuples = init_config_data + esize; 1729 1701 block_size -= esize; 1702 + } 1703 + 1704 + return 0; 1705 + } 1706 + 1707 + static const struct avs_tplg_token_parser mod_nhlt_config_parsers[] = { 1708 + { 1709 + .token = AVS_TKN_NHLT_CONFIG_ID_U32, 1710 + .type = SND_SOC_TPLG_TUPLE_TYPE_WORD, 1711 + .offset = offsetof(struct avs_tplg_nhlt_config, id), 1712 + .parse = avs_parse_word_token, 1713 + }, 1714 + { 1715 + .token = AVS_TKN_NHLT_CONFIG_SIZE_U32, 1716 + .type = SND_SOC_TPLG_TUPLE_TYPE_WORD, 1717 + .offset = offsetof(struct avs_tplg_nhlt_config, blob), 1718 + .parse = avs_parse_nhlt_config_size, 1719 + }, 1720 + }; 1721 + 1722 + static int avs_tplg_parse_nhlt_configs(struct snd_soc_component *comp, 1723 + struct snd_soc_tplg_vendor_array *tuples, 1724 + u32 block_size) 1725 + { 1726 + struct avs_soc_component *acomp = to_avs_soc_component(comp); 1727 + struct avs_tplg *tplg = acomp->tplg; 1728 + int ret, i; 1729 + 1730 + /* Parse the 
header section to know how many entries there are. */ 1731 + ret = parse_dictionary_header(comp, tuples, (void **)&tplg->nhlt_configs, 1732 + &tplg->num_nhlt_configs, 1733 + sizeof(*tplg->nhlt_configs), 1734 + AVS_TKN_MANIFEST_NUM_NHLT_CONFIGS_U32); 1735 + if (ret) 1736 + return ret; 1737 + 1738 + block_size -= le32_to_cpu(tuples->size); 1739 + /* With the header parsed, move on to parsing entries. */ 1740 + tuples = avs_tplg_vendor_array_next(tuples); 1741 + 1742 + for (i = 0; i < tplg->num_nhlt_configs && block_size > 0; i++) { 1743 + struct avs_tplg_nhlt_config *config; 1744 + u32 esize; 1745 + 1746 + config = &tplg->nhlt_configs[i]; 1747 + esize = le32_to_cpu(tuples->size); 1748 + 1749 + ret = parse_dictionary_entries(comp, tuples, esize, config, 1, sizeof(*config), 1750 + AVS_TKN_NHLT_CONFIG_ID_U32, 1751 + mod_nhlt_config_parsers, 1752 + ARRAY_SIZE(mod_nhlt_config_parsers)); 1753 + if (ret) 1754 + return ret; 1755 + /* With tuples parsed, the blob shall be allocated. */ 1756 + if (!config->blob) 1757 + return -EINVAL; 1758 + 1759 + /* Consume the raw data and move to the next entry. */ 1760 + memcpy(config->blob->capabilities, (u8 *)tuples + esize, 1761 + config->blob->capabilities_size); 1762 + esize += config->blob->capabilities_size; 1763 + 1764 + block_size -= esize; 1765 + tuples = avs_tplg_vendor_array_at(tuples, esize); 1730 1766 } 1731 1767 1732 1768 return 0; ··· 2100 2008 tuples = avs_tplg_vendor_array_at(tuples, offset); 2101 2009 2102 2010 /* Initial configs dictionary. 
*/ 2103 - ret = avs_tplg_parse_initial_configs(comp, tuples, remaining); 2011 + ret = avs_tplg_parse_initial_configs(comp, tuples, remaining, &offset); 2104 2012 if (ret < 0) 2105 2013 return ret; 2106 2014 2107 - return 0; 2015 + remaining -= offset; 2016 + tuples = avs_tplg_vendor_array_at(tuples, offset); 2017 + 2018 + ret = avs_tplg_vendor_array_lookup(tuples, remaining, 2019 + AVS_TKN_MANIFEST_NUM_NHLT_CONFIGS_U32, &offset); 2020 + if (ret == -ENOENT) 2021 + return 0; 2022 + if (ret) { 2023 + dev_err(comp->dev, "NHLT config lookup failed: %d\n", ret); 2024 + return ret; 2025 + } 2026 + 2027 + tuples = avs_tplg_vendor_array_at(tuples, offset); 2028 + 2029 + /* NHLT configs dictionary. */ 2030 + return avs_tplg_parse_nhlt_configs(comp, tuples, remaining); 2108 2031 } 2109 2032 2110 2033 enum {
+8
sound/soc/intel/avs/topology.h
··· 37 37 u32 num_condpath_tmpls; 38 38 struct avs_tplg_init_config *init_configs; 39 39 u32 num_init_configs; 40 + struct avs_tplg_nhlt_config *nhlt_configs; 41 + u32 num_nhlt_configs; 40 42 41 43 struct list_head path_tmpl_list; 42 44 }; ··· 177 175 void *data; 178 176 }; 179 177 178 + struct avs_tplg_nhlt_config { 179 + u32 id; 180 + struct acpi_nhlt_config *blob; 181 + }; 182 + 180 183 struct avs_tplg_path { 181 184 u32 id; 182 185 ··· 223 216 u32 ctl_id; 224 217 u32 num_config_ids; 225 218 u32 *config_ids; 219 + struct avs_tplg_nhlt_config *nhlt_config; 226 220 227 221 struct avs_tplg_pipeline *owner; 228 222 /* Pipeline modules management. */
+1 -2
sound/soc/renesas/rcar/ssiu.c
··· 509 509 int rsnd_ssiu_probe(struct rsnd_priv *priv) 510 510 { 511 511 struct device *dev = rsnd_priv_to_dev(priv); 512 - struct device_node *node; 512 + struct device_node *node __free(device_node) = rsnd_ssiu_of_node(priv); 513 513 struct rsnd_ssiu *ssiu; 514 514 struct rsnd_mod_ops *ops; 515 515 const int *list = NULL; ··· 522 522 * see 523 523 * rsnd_ssiu_bufsif_to_id() 524 524 */ 525 - node = rsnd_ssiu_of_node(priv); 526 525 if (node) 527 526 nr = rsnd_node_count(priv, node, SSIU_NAME); 528 527 else
+2 -1
sound/soc/sdca/sdca_functions.c
··· 950 950 return ret; 951 951 } 952 952 953 - control->values = devm_kzalloc(dev, hweight64(control->cn_list), GFP_KERNEL); 953 + control->values = devm_kcalloc(dev, hweight64(control->cn_list), 954 + sizeof(int), GFP_KERNEL); 954 955 if (!control->values) 955 956 return -ENOMEM; 956 957
+5
sound/usb/endpoint.c
··· 1362 1362 ep->sample_rem = ep->cur_rate % ep->pps; 1363 1363 ep->packsize[0] = ep->cur_rate / ep->pps; 1364 1364 ep->packsize[1] = (ep->cur_rate + (ep->pps - 1)) / ep->pps; 1365 + if (ep->packsize[1] > ep->maxpacksize) { 1366 + usb_audio_dbg(chip, "Too small maxpacksize %u for rate %u / pps %u\n", 1367 + ep->maxpacksize, ep->cur_rate, ep->pps); 1368 + return -EINVAL; 1369 + } 1365 1370 1366 1371 /* calculate the frequency in 16.16 format */ 1367 1372 ep->freqm = ep->freqn;
+2
sound/usb/mixer.c
··· 3086 3086 int i; 3087 3087 3088 3088 assoc = usb_ifnum_to_if(dev, ctrlif)->intf_assoc; 3089 + if (!assoc) 3090 + return -EINVAL; 3089 3091 3090 3092 /* Detect BADD capture/playback channels from AS EP descriptors */ 3091 3093 for (i = 0; i < assoc->bInterfaceCount; i++) {
+8
sound/usb/quirks.c
··· 2022 2022 case USB_ID(0x16d0, 0x09d8): /* NuPrime IDA-8 */ 2023 2023 case USB_ID(0x16d0, 0x09db): /* NuPrime Audio DAC-9 */ 2024 2024 case USB_ID(0x16d0, 0x09dd): /* Encore mDSD */ 2025 + case USB_ID(0x16d0, 0x0ab1): /* PureAudio APA DAC */ 2026 + case USB_ID(0x16d0, 0xeca1): /* PureAudio Lotus DAC5, DAC5 SE, DAC5 Pro */ 2025 2027 case USB_ID(0x1db5, 0x0003): /* Bryston BDA3 */ 2026 2028 case USB_ID(0x20a0, 0x4143): /* WaveIO USB Audio 2.0 */ 2027 2029 case USB_ID(0x22e1, 0xca01): /* HDTA Serenade DSD */ ··· 2269 2267 QUIRK_FLAG_FIXED_RATE), 2270 2268 DEVICE_FLG(0x0fd9, 0x0008, /* Hauppauge HVR-950Q */ 2271 2269 QUIRK_FLAG_SHARE_MEDIA_DEVICE | QUIRK_FLAG_ALIGN_TRANSFER), 2270 + DEVICE_FLG(0x1038, 0x1294, /* SteelSeries Arctis Pro Wireless */ 2271 + QUIRK_FLAG_MIXER_PLAYBACK_MIN_MUTE), 2272 2272 DEVICE_FLG(0x1101, 0x0003, /* Audioengine D1 */ 2273 2273 QUIRK_FLAG_GET_SAMPLE_RATE), 2274 2274 DEVICE_FLG(0x12d1, 0x3a07, /* Huawei Technologies Co., Ltd. */ ··· 2301 2297 QUIRK_FLAG_IGNORE_CLOCK_SOURCE), 2302 2298 DEVICE_FLG(0x1686, 0x00dd, /* Zoom R16/24 */ 2303 2299 QUIRK_FLAG_TX_LENGTH | QUIRK_FLAG_CTL_MSG_DELAY_1M), 2300 + DEVICE_FLG(0x16d0, 0x0ab1, /* PureAudio APA DAC */ 2301 + QUIRK_FLAG_DSD_RAW), 2302 + DEVICE_FLG(0x16d0, 0xeca1, /* PureAudio Lotus DAC5, DAC5 SE and DAC5 Pro */ 2303 + QUIRK_FLAG_DSD_RAW), 2304 2304 DEVICE_FLG(0x17aa, 0x1046, /* Lenovo ThinkStation P620 Rear Line-in, Line-out and Microphone */ 2305 2305 QUIRK_FLAG_DISABLE_AUTOSUSPEND), 2306 2306 DEVICE_FLG(0x17aa, 0x104d, /* Lenovo ThinkStation P620 Internal Speaker + Front Headset */
+1
tools/arch/x86/include/uapi/asm/vmx.h
··· 93 93 #define EXIT_REASON_TPAUSE 68 94 94 #define EXIT_REASON_BUS_LOCK 74 95 95 #define EXIT_REASON_NOTIFY 75 96 + #define EXIT_REASON_SEAMCALL 76 96 97 #define EXIT_REASON_TDCALL 77 97 98 #define EXIT_REASON_MSR_READ_IMM 84 98 99 #define EXIT_REASON_MSR_WRITE_IMM 85
+1 -1
tools/bpf/bpftool/Documentation/bpftool-prog.rst
··· 182 182 183 183 bpftool prog tracelog { stdout | stderr } *PROG* 184 184 Dump the BPF stream of the program. BPF programs can write to these streams 185 - at runtime with the **bpf_stream_vprintk**\ () kfunc. The kernel may write 185 + at runtime with the **bpf_stream_vprintk_impl**\ () kfunc. The kernel may write 186 186 error messages to the standard error stream. This facility should be used 187 187 only for debugging purposes. 188 188
+2 -2
tools/build/feature/Makefile
··· 107 107 __BUILD = $(CC) $(CFLAGS) -MD -Wall -Werror -o $@ $(patsubst %.bin,%.c,$(@F)) $(LDFLAGS) 108 108 BUILD = $(__BUILD) > $(@:.bin=.make.output) 2>&1 109 109 BUILD_BFD = $(BUILD) -DPACKAGE='"perf"' -lbfd -ldl 110 - BUILD_ALL = $(BUILD) -fstack-protector-all -O2 -D_FORTIFY_SOURCE=2 -ldw -lelf -lnuma -lelf -lslang $(FLAGS_PERL_EMBED) $(FLAGS_PYTHON_EMBED) -DPACKAGE='"perf"' -lbfd -ldl -lz -llzma -lzstd 110 + BUILD_ALL = $(BUILD) -fstack-protector-all -O2 -D_FORTIFY_SOURCE=2 -ldw -lelf -lnuma -lelf -lslang $(FLAGS_PERL_EMBED) $(FLAGS_PYTHON_EMBED) -ldl -lz -llzma -lzstd 111 111 112 112 __BUILDXX = $(CXX) $(CXXFLAGS) -MD -Wall -Werror -o $@ $(patsubst %.bin,%.cpp,$(@F)) $(LDFLAGS) 113 113 BUILDXX = $(__BUILDXX) > $(@:.bin=.make.output) 2>&1 ··· 115 115 ############################### 116 116 117 117 $(OUTPUT)test-all.bin: 118 - $(BUILD_ALL) || $(BUILD_ALL) -lopcodes -liberty 118 + $(BUILD_ALL) 119 119 120 120 $(OUTPUT)test-hello.bin: 121 121 $(BUILD)
+13 -13
tools/lib/bpf/bpf_helpers.h
··· 315 315 ___param, sizeof(___param)); \ 316 316 }) 317 317 318 - extern int bpf_stream_vprintk(int stream_id, const char *fmt__str, const void *args, 319 - __u32 len__sz, void *aux__prog) __weak __ksym; 318 + extern int bpf_stream_vprintk_impl(int stream_id, const char *fmt__str, const void *args, 319 + __u32 len__sz, void *aux__prog) __weak __ksym; 320 320 321 - #define bpf_stream_printk(stream_id, fmt, args...) \ 322 - ({ \ 323 - static const char ___fmt[] = fmt; \ 324 - unsigned long long ___param[___bpf_narg(args)]; \ 325 - \ 326 - _Pragma("GCC diagnostic push") \ 327 - _Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \ 328 - ___bpf_fill(___param, args); \ 329 - _Pragma("GCC diagnostic pop") \ 330 - \ 331 - bpf_stream_vprintk(stream_id, ___fmt, ___param, sizeof(___param), NULL);\ 321 + #define bpf_stream_printk(stream_id, fmt, args...) \ 322 + ({ \ 323 + static const char ___fmt[] = fmt; \ 324 + unsigned long long ___param[___bpf_narg(args)]; \ 325 + \ 326 + _Pragma("GCC diagnostic push") \ 327 + _Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \ 328 + ___bpf_fill(___param, args); \ 329 + _Pragma("GCC diagnostic pop") \ 330 + \ 331 + bpf_stream_vprintk_impl(stream_id, ___fmt, ___param, sizeof(___param), NULL); \ 332 332 }) 333 333 334 334 /* Use __bpf_printk when bpf_printk call has 3 or fewer fmt args
+12
tools/net/ynl/pyynl/ynl_gen_c.py
··· 861 861 return [f"{member} = {self.c_name};", 862 862 f"{presence} = n_{self.c_name};"] 863 863 864 + def free_needs_iter(self): 865 + return self.sub_type == 'nest' 866 + 867 + def _free_lines(self, ri, var, ref): 868 + lines = [] 869 + if self.sub_type == 'nest': 870 + lines += [ 871 + f"for (i = 0; i < {var}->{ref}_count.{self.c_name}; i++)", 872 + f'{self.nested_render_name}_free(&{var}->{ref}{self.c_name}[i]);', 873 + ] 874 + lines += f"free({var}->{ref}{self.c_name});", 875 + return lines 864 876 865 877 class TypeNestTypeValue(Type): 866 878 def _complex_member_type(self, ri):
+2 -3
tools/perf/Makefile.config
··· 354 354 355 355 FEATURE_CHECK_LDFLAGS-libaio = -lrt 356 356 357 - FEATURE_CHECK_LDFLAGS-disassembler-four-args = -lbfd -lopcodes -ldl 358 - FEATURE_CHECK_LDFLAGS-disassembler-init-styled = -lbfd -lopcodes -ldl 359 - 360 357 CORE_CFLAGS += -fno-omit-frame-pointer 361 358 CORE_CFLAGS += -Wall 362 359 CORE_CFLAGS += -Wextra ··· 927 930 928 931 ifeq ($(feature-libbfd), 1) 929 932 EXTLIBS += -lbfd -lopcodes 933 + FEATURE_CHECK_LDFLAGS-disassembler-four-args = -lbfd -lopcodes -ldl 934 + FEATURE_CHECK_LDFLAGS-disassembler-init-styled = -lbfd -lopcodes -ldl 930 935 else 931 936 # we are on a system that requires -liberty and (maybe) -lz 932 937 # to link against -lbfd; test each case individually here
+2
tools/perf/builtin-lock.c
··· 1867 1867 eops.sample = process_sample_event; 1868 1868 eops.comm = perf_event__process_comm; 1869 1869 eops.mmap = perf_event__process_mmap; 1870 + eops.mmap2 = perf_event__process_mmap2; 1870 1871 eops.namespaces = perf_event__process_namespaces; 1871 1872 eops.tracing_data = perf_event__process_tracing_data; 1872 1873 session = perf_session__new(&data, &eops); ··· 2024 2023 eops.sample = process_sample_event; 2025 2024 eops.comm = perf_event__process_comm; 2026 2025 eops.mmap = perf_event__process_mmap; 2026 + eops.mmap2 = perf_event__process_mmap2; 2027 2027 eops.tracing_data = perf_event__process_tracing_data; 2028 2028 2029 2029 perf_env__init(&host_env);
+9 -5
tools/perf/tests/shell/lock_contention.sh
··· 13 13 rm -f ${perfdata} 14 14 rm -f ${result} 15 15 rm -f ${errout} 16 - trap - EXIT TERM INT 16 + trap - EXIT TERM INT ERR 17 17 } 18 18 19 19 trap_cleanup() { 20 + if (( $? == 139 )); then #SIGSEGV 21 + err=1 22 + fi 20 23 echo "Unexpected signal in ${FUNCNAME[1]}" 21 24 cleanup 22 25 exit ${err} 23 26 } 24 - trap trap_cleanup EXIT TERM INT 27 + trap trap_cleanup EXIT TERM INT ERR 25 28 26 29 check() { 27 30 if [ "$(id -u)" != 0 ]; then ··· 148 145 fi 149 146 150 147 # the perf lock contention output goes to the stderr 151 - perf lock con -a -b -g -E 1 -q -- perf bench sched messaging -p > /dev/null 2> ${result} 148 + perf lock con -a -b --lock-cgroup -E 1 -q -- perf bench sched messaging -p > /dev/null 2> ${result} 152 149 if [ "$(cat "${result}" | wc -l)" != "1" ]; then 153 150 echo "[Fail] BPF result count is not 1:" "$(cat "${result}" | wc -l)" 154 151 err=1 ··· 274 271 return 275 272 fi 276 273 277 - perf lock con -a -b -g -E 1 -F wait_total -q -- perf bench sched messaging -p > /dev/null 2> ${result} 274 + perf lock con -a -b --lock-cgroup -E 1 -F wait_total -q -- perf bench sched messaging -p > /dev/null 2> ${result} 278 275 if [ "$(cat "${result}" | wc -l)" != "1" ]; then 279 276 echo "[Fail] BPF result should have a cgroup result:" "$(cat "${result}")" 280 277 err=1 ··· 282 279 fi 283 280 284 281 cgroup=$(cat "${result}" | awk '{ print $3 }') 285 - perf lock con -a -b -g -E 1 -G "${cgroup}" -q -- perf bench sched messaging -p > /dev/null 2> ${result} 282 + perf lock con -a -b --lock-cgroup -E 1 -G "${cgroup}" -q -- perf bench sched messaging -p > /dev/null 2> ${result} 286 283 if [ "$(cat "${result}" | wc -l)" != "1" ]; then 287 284 echo "[Fail] BPF result should have a result with cgroup filter:" "$(cat "${cgroup}")" 288 285 err=1 ··· 341 338 test_cgroup_filter 342 339 test_csv_output 343 340 341 + cleanup 344 342 exit ${err}
+2 -8
tools/perf/util/header.c
··· 1022 1022 1023 1023 down_read(&env->bpf_progs.lock); 1024 1024 1025 - if (env->bpf_progs.infos_cnt == 0) 1026 - goto out; 1027 - 1028 1025 ret = do_write(ff, &env->bpf_progs.infos_cnt, 1029 1026 sizeof(env->bpf_progs.infos_cnt)); 1030 - if (ret < 0) 1027 + if (ret < 0 || env->bpf_progs.infos_cnt == 0) 1031 1028 goto out; 1032 1029 1033 1030 root = &env->bpf_progs.infos; ··· 1064 1067 1065 1068 down_read(&env->bpf_progs.lock); 1066 1069 1067 - if (env->bpf_progs.btfs_cnt == 0) 1068 - goto out; 1069 - 1070 1070 ret = do_write(ff, &env->bpf_progs.btfs_cnt, 1071 1071 sizeof(env->bpf_progs.btfs_cnt)); 1072 1072 1073 - if (ret < 0) 1073 + if (ret < 0 || env->bpf_progs.btfs_cnt == 0) 1074 1074 goto out; 1075 1075 1076 1076 root = &env->bpf_progs.btfs;
+38
tools/perf/util/libbfd.c
··· 38 38 asymbol **syms; 39 39 }; 40 40 41 + static bool perf_bfd_lock(void *bfd_mutex) 42 + { 43 + mutex_lock(bfd_mutex); 44 + return true; 45 + } 46 + 47 + static bool perf_bfd_unlock(void *bfd_mutex) 48 + { 49 + mutex_unlock(bfd_mutex); 50 + return true; 51 + } 52 + 53 + static void perf_bfd_init(void) 54 + { 55 + static struct mutex bfd_mutex; 56 + 57 + mutex_init_recursive(&bfd_mutex); 58 + 59 + if (bfd_init() != BFD_INIT_MAGIC) { 60 + pr_err("Error initializing libbfd\n"); 61 + return; 62 + } 63 + if (!bfd_thread_init(perf_bfd_lock, perf_bfd_unlock, &bfd_mutex)) 64 + pr_err("Error initializing libbfd threading\n"); 65 + } 66 + 67 + static void ensure_bfd_init(void) 68 + { 69 + static pthread_once_t bfd_init_once = PTHREAD_ONCE_INIT; 70 + 71 + pthread_once(&bfd_init_once, perf_bfd_init); 72 + } 73 + 41 74 static int bfd_error(const char *string) 42 75 { 43 76 const char *errmsg; ··· 165 132 bfd *abfd; 166 133 struct a2l_data *a2l = NULL; 167 134 135 + ensure_bfd_init(); 168 136 abfd = bfd_openr(path, NULL); 169 137 if (abfd == NULL) 170 138 return NULL; ··· 322 288 bfd *abfd; 323 289 u64 start, len; 324 290 291 + ensure_bfd_init(); 325 292 abfd = bfd_openr(debugfile, NULL); 326 293 if (!abfd) 327 294 return -1; ··· 428 393 if (fd < 0) 429 394 return -1; 430 395 396 + ensure_bfd_init(); 431 397 abfd = bfd_fdopenr(filename, /*target=*/NULL, fd); 432 398 if (!abfd) 433 399 return -1; ··· 457 421 asection *section; 458 422 bfd *abfd; 459 423 424 + ensure_bfd_init(); 460 425 abfd = bfd_openr(filename, NULL); 461 426 if (!abfd) 462 427 return -1; ··· 517 480 memset(tpath, 0, sizeof(tpath)); 518 481 perf_exe(tpath, sizeof(tpath)); 519 482 483 + ensure_bfd_init(); 520 484 bfdf = bfd_openr(tpath, NULL); 521 485 if (bfdf == NULL) 522 486 abort();
+10 -4
tools/perf/util/mutex.c
··· 17 17 18 18 #define CHECK_ERR(err) check_err(__func__, err) 19 19 20 - static void __mutex_init(struct mutex *mtx, bool pshared) 20 + static void __mutex_init(struct mutex *mtx, bool pshared, bool recursive) 21 21 { 22 22 pthread_mutexattr_t attr; 23 23 ··· 27 27 /* In normal builds enable error checking, such as recursive usage. */ 28 28 CHECK_ERR(pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK)); 29 29 #endif 30 + if (recursive) 31 + CHECK_ERR(pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE)); 30 32 if (pshared) 31 33 CHECK_ERR(pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED)); 32 - 33 34 CHECK_ERR(pthread_mutex_init(&mtx->lock, &attr)); 34 35 CHECK_ERR(pthread_mutexattr_destroy(&attr)); 35 36 } 36 37 37 38 void mutex_init(struct mutex *mtx) 38 39 { 39 - __mutex_init(mtx, /*pshared=*/false); 40 + __mutex_init(mtx, /*pshared=*/false, /*recursive=*/false); 40 41 } 41 42 42 43 void mutex_init_pshared(struct mutex *mtx) 43 44 { 44 - __mutex_init(mtx, /*pshared=*/true); 45 + __mutex_init(mtx, /*pshared=*/true, /*recursive=*/false); 46 + } 47 + 48 + void mutex_init_recursive(struct mutex *mtx) 49 + { 50 + __mutex_init(mtx, /*pshared=*/false, /*recursive=*/true); 45 51 } 46 52 47 53 void mutex_destroy(struct mutex *mtx)
+2
tools/perf/util/mutex.h
··· 104 104 * process-private attribute. 105 105 */ 106 106 void mutex_init_pshared(struct mutex *mtx); 107 + /* Initializes a mutex that may be recursively held on the same thread. */ 108 + void mutex_init_recursive(struct mutex *mtx); 107 109 void mutex_destroy(struct mutex *mtx); 108 110 109 111 void mutex_lock(struct mutex *mtx) EXCLUSIVE_LOCK_FUNCTION(*mtx);
+3
tools/testing/selftests/bpf/config
··· 50 50 CONFIG_IPV6_TUNNEL=y 51 51 CONFIG_KEYS=y 52 52 CONFIG_LIRC=y 53 + CONFIG_LIVEPATCH=y 53 54 CONFIG_LWTUNNEL=y 54 55 CONFIG_MODULE_SIG=y 55 56 CONFIG_MODULE_SRCVERSION_ALL=y ··· 112 111 CONFIG_NF_NAT=y 113 112 CONFIG_PACKET=y 114 113 CONFIG_RC_CORE=y 114 + CONFIG_SAMPLES=y 115 + CONFIG_SAMPLE_LIVEPATCH=m 115 116 CONFIG_SECURITY=y 116 117 CONFIG_SECURITYFS=y 117 118 CONFIG_SYN_COOKIES=y
+107
tools/testing/selftests/bpf/prog_tests/livepatch_trampoline.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */ 3 + 4 + #include <test_progs.h> 5 + #include "testing_helpers.h" 6 + #include "livepatch_trampoline.skel.h" 7 + 8 + static int load_livepatch(void) 9 + { 10 + char path[4096]; 11 + 12 + /* CI will set KBUILD_OUTPUT */ 13 + snprintf(path, sizeof(path), "%s/samples/livepatch/livepatch-sample.ko", 14 + getenv("KBUILD_OUTPUT") ? : "../../../.."); 15 + 16 + return load_module(path, env_verbosity > VERBOSE_NONE); 17 + } 18 + 19 + static void unload_livepatch(void) 20 + { 21 + /* Disable the livepatch before unloading the module */ 22 + system("echo 0 > /sys/kernel/livepatch/livepatch_sample/enabled"); 23 + 24 + unload_module("livepatch_sample", env_verbosity > VERBOSE_NONE); 25 + } 26 + 27 + static void read_proc_cmdline(void) 28 + { 29 + char buf[4096]; 30 + int fd, ret; 31 + 32 + fd = open("/proc/cmdline", O_RDONLY); 33 + if (!ASSERT_OK_FD(fd, "open /proc/cmdline")) 34 + return; 35 + 36 + ret = read(fd, buf, sizeof(buf)); 37 + if (!ASSERT_GT(ret, 0, "read /proc/cmdline")) 38 + goto out; 39 + 40 + ASSERT_OK(strncmp(buf, "this has been live patched", 26), "strncmp"); 41 + 42 + out: 43 + close(fd); 44 + } 45 + 46 + static void __test_livepatch_trampoline(bool fexit_first) 47 + { 48 + struct livepatch_trampoline *skel = NULL; 49 + int err; 50 + 51 + skel = livepatch_trampoline__open_and_load(); 52 + if (!ASSERT_OK_PTR(skel, "skel_open_and_load")) 53 + goto out; 54 + 55 + skel->bss->my_pid = getpid(); 56 + 57 + if (!fexit_first) { 58 + /* fentry program is loaded first by default */ 59 + err = livepatch_trampoline__attach(skel); 60 + if (!ASSERT_OK(err, "skel_attach")) 61 + goto out; 62 + } else { 63 + /* Manually load fexit program first. 
*/ 64 + skel->links.fexit_cmdline = bpf_program__attach(skel->progs.fexit_cmdline); 65 + if (!ASSERT_OK_PTR(skel->links.fexit_cmdline, "attach_fexit")) 66 + goto out; 67 + 68 + skel->links.fentry_cmdline = bpf_program__attach(skel->progs.fentry_cmdline); 69 + if (!ASSERT_OK_PTR(skel->links.fentry_cmdline, "attach_fentry")) 70 + goto out; 71 + } 72 + 73 + read_proc_cmdline(); 74 + 75 + ASSERT_EQ(skel->bss->fentry_hit, 1, "fentry_hit"); 76 + ASSERT_EQ(skel->bss->fexit_hit, 1, "fexit_hit"); 77 + out: 78 + livepatch_trampoline__destroy(skel); 79 + } 80 + 81 + void test_livepatch_trampoline(void) 82 + { 83 + int retry_cnt = 0; 84 + 85 + retry: 86 + if (load_livepatch()) { 87 + if (retry_cnt) { 88 + ASSERT_OK(1, "load_livepatch"); 89 + goto out; 90 + } 91 + /* 92 + * Something else (previous run of the same test?) loaded 93 + * the KLP module. Unload the KLP module and retry. 94 + */ 95 + unload_livepatch(); 96 + retry_cnt++; 97 + goto retry; 98 + } 99 + 100 + if (test__start_subtest("fentry_first")) 101 + __test_livepatch_trampoline(false); 102 + 103 + if (test__start_subtest("fexit_first")) 104 + __test_livepatch_trampoline(true); 105 + out: 106 + unload_livepatch(); 107 + }
+140
tools/testing/selftests/bpf/prog_tests/mptcp.c
··· 6 6 #include <netinet/in.h> 7 7 #include <test_progs.h> 8 8 #include <unistd.h> 9 + #include <errno.h> 9 10 #include "cgroup_helpers.h" 10 11 #include "network_helpers.h" 11 12 #include "mptcp_sock.skel.h" 12 13 #include "mptcpify.skel.h" 13 14 #include "mptcp_subflow.skel.h" 15 + #include "mptcp_sockmap.skel.h" 14 16 15 17 #define NS_TEST "mptcp_ns" 16 18 #define ADDR_1 "10.0.1.1" ··· 438 436 close(cgroup_fd); 439 437 } 440 438 439 + /* Test sockmap on MPTCP server handling non-mp-capable clients. */ 440 + static void test_sockmap_with_mptcp_fallback(struct mptcp_sockmap *skel) 441 + { 442 + int listen_fd = -1, client_fd1 = -1, client_fd2 = -1; 443 + int server_fd1 = -1, server_fd2 = -1, sent, recvd; 444 + char snd[9] = "123456789"; 445 + char rcv[10]; 446 + 447 + /* start server with MPTCP enabled */ 448 + listen_fd = start_mptcp_server(AF_INET, NULL, 0, 0); 449 + if (!ASSERT_OK_FD(listen_fd, "sockmap-fb:start_mptcp_server")) 450 + return; 451 + 452 + skel->bss->trace_port = ntohs(get_socket_local_port(listen_fd)); 453 + skel->bss->sk_index = 0; 454 + /* create client without MPTCP enabled */ 455 + client_fd1 = connect_to_fd_opts(listen_fd, NULL); 456 + if (!ASSERT_OK_FD(client_fd1, "sockmap-fb:connect_to_fd")) 457 + goto end; 458 + 459 + server_fd1 = accept(listen_fd, NULL, 0); 460 + skel->bss->sk_index = 1; 461 + client_fd2 = connect_to_fd_opts(listen_fd, NULL); 462 + if (!ASSERT_OK_FD(client_fd2, "sockmap-fb:connect_to_fd")) 463 + goto end; 464 + 465 + server_fd2 = accept(listen_fd, NULL, 0); 466 + /* test normal redirect behavior: data sent by client_fd1 can be 467 + * received by client_fd2 468 + */ 469 + skel->bss->redirect_idx = 1; 470 + sent = send(client_fd1, snd, sizeof(snd), 0); 471 + if (!ASSERT_EQ(sent, sizeof(snd), "sockmap-fb:send(client_fd1)")) 472 + goto end; 473 + 474 + /* try to recv more bytes to avoid truncation check */ 475 + recvd = recv(client_fd2, rcv, sizeof(rcv), 0); 476 + if (!ASSERT_EQ(recvd, sizeof(snd), 
"sockmap-fb:recv(client_fd2)")) 477 + goto end; 478 + 479 + end: 480 + if (client_fd1 >= 0) 481 + close(client_fd1); 482 + if (client_fd2 >= 0) 483 + close(client_fd2); 484 + if (server_fd1 >= 0) 485 + close(server_fd1); 486 + if (server_fd2 >= 0) 487 + close(server_fd2); 488 + close(listen_fd); 489 + } 490 + 491 + /* Test sockmap rejection of MPTCP sockets - both server and client sides. */ 492 + static void test_sockmap_reject_mptcp(struct mptcp_sockmap *skel) 493 + { 494 + int listen_fd = -1, server_fd = -1, client_fd1 = -1; 495 + int err, zero = 0; 496 + 497 + /* start server with MPTCP enabled */ 498 + listen_fd = start_mptcp_server(AF_INET, NULL, 0, 0); 499 + if (!ASSERT_OK_FD(listen_fd, "start_mptcp_server")) 500 + return; 501 + 502 + skel->bss->trace_port = ntohs(get_socket_local_port(listen_fd)); 503 + skel->bss->sk_index = 0; 504 + /* create client with MPTCP enabled */ 505 + client_fd1 = connect_to_fd(listen_fd, 0); 506 + if (!ASSERT_OK_FD(client_fd1, "connect_to_fd client_fd1")) 507 + goto end; 508 + 509 + /* bpf_sock_map_update() called from sockops should reject MPTCP sk */ 510 + if (!ASSERT_EQ(skel->bss->helper_ret, -EOPNOTSUPP, "should reject")) 511 + goto end; 512 + 513 + server_fd = accept(listen_fd, NULL, 0); 514 + err = bpf_map_update_elem(bpf_map__fd(skel->maps.sock_map), 515 + &zero, &server_fd, BPF_NOEXIST); 516 + if (!ASSERT_EQ(err, -EOPNOTSUPP, "server should be disallowed")) 517 + goto end; 518 + 519 + /* MPTCP client should also be disallowed */ 520 + err = bpf_map_update_elem(bpf_map__fd(skel->maps.sock_map), 521 + &zero, &client_fd1, BPF_NOEXIST); 522 + if (!ASSERT_EQ(err, -EOPNOTSUPP, "client should be disallowed")) 523 + goto end; 524 + end: 525 + if (client_fd1 >= 0) 526 + close(client_fd1); 527 + if (server_fd >= 0) 528 + close(server_fd); 529 + close(listen_fd); 530 + } 531 + 532 + static void test_mptcp_sockmap(void) 533 + { 534 + struct mptcp_sockmap *skel; 535 + struct netns_obj *netns; 536 + int cgroup_fd, err; 537 + 538 + 
cgroup_fd = test__join_cgroup("/mptcp_sockmap"); 539 + if (!ASSERT_OK_FD(cgroup_fd, "join_cgroup: mptcp_sockmap")) 540 + return; 541 + 542 + skel = mptcp_sockmap__open_and_load(); 543 + if (!ASSERT_OK_PTR(skel, "skel_open_load: mptcp_sockmap")) 544 + goto close_cgroup; 545 + 546 + skel->links.mptcp_sockmap_inject = 547 + bpf_program__attach_cgroup(skel->progs.mptcp_sockmap_inject, cgroup_fd); 548 + if (!ASSERT_OK_PTR(skel->links.mptcp_sockmap_inject, "attach sockmap")) 549 + goto skel_destroy; 550 + 551 + err = bpf_prog_attach(bpf_program__fd(skel->progs.mptcp_sockmap_redirect), 552 + bpf_map__fd(skel->maps.sock_map), 553 + BPF_SK_SKB_STREAM_VERDICT, 0); 554 + if (!ASSERT_OK(err, "bpf_prog_attach stream verdict")) 555 + goto skel_destroy; 556 + 557 + netns = netns_new(NS_TEST, true); 558 + if (!ASSERT_OK_PTR(netns, "netns_new: mptcp_sockmap")) 559 + goto skel_destroy; 560 + 561 + if (endpoint_init("subflow") < 0) 562 + goto close_netns; 563 + 564 + test_sockmap_with_mptcp_fallback(skel); 565 + test_sockmap_reject_mptcp(skel); 566 + 567 + close_netns: 568 + netns_free(netns); 569 + skel_destroy: 570 + mptcp_sockmap__destroy(skel); 571 + close_cgroup: 572 + close(cgroup_fd); 573 + } 574 + 441 575 void test_mptcp(void) 442 576 { 443 577 if (test__start_subtest("base")) ··· 582 444 test_mptcpify(); 583 445 if (test__start_subtest("subflow")) 584 446 test_subflow(); 447 + if (test__start_subtest("sockmap")) 448 + test_mptcp_sockmap(); 585 449 }
+150
tools/testing/selftests/bpf/prog_tests/stacktrace_ips.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <test_progs.h> 3 + #include "stacktrace_ips.skel.h" 4 + 5 + #ifdef __x86_64__ 6 + static int check_stacktrace_ips(int fd, __u32 key, int cnt, ...) 7 + { 8 + __u64 ips[PERF_MAX_STACK_DEPTH]; 9 + struct ksyms *ksyms = NULL; 10 + int i, err = 0; 11 + va_list args; 12 + 13 + /* sorted by addr */ 14 + ksyms = load_kallsyms_local(); 15 + if (!ASSERT_OK_PTR(ksyms, "load_kallsyms_local")) 16 + return -1; 17 + 18 + /* unlikely, but... */ 19 + if (!ASSERT_LT(cnt, PERF_MAX_STACK_DEPTH, "check_max")) 20 + return -1; 21 + 22 + err = bpf_map_lookup_elem(fd, &key, ips); 23 + if (err) 24 + goto out; 25 + 26 + /* 27 + * Compare all symbols provided via arguments with stacktrace ips, 28 + * and their related symbol addresses.t 29 + */ 30 + va_start(args, cnt); 31 + 32 + for (i = 0; i < cnt; i++) { 33 + unsigned long val; 34 + struct ksym *ksym; 35 + 36 + val = va_arg(args, unsigned long); 37 + ksym = ksym_search_local(ksyms, ips[i]); 38 + if (!ASSERT_OK_PTR(ksym, "ksym_search_local")) 39 + break; 40 + ASSERT_EQ(ksym->addr, val, "stack_cmp"); 41 + } 42 + 43 + va_end(args); 44 + 45 + out: 46 + free_kallsyms_local(ksyms); 47 + return err; 48 + } 49 + 50 + static void test_stacktrace_ips_kprobe_multi(bool retprobe) 51 + { 52 + LIBBPF_OPTS(bpf_kprobe_multi_opts, opts, 53 + .retprobe = retprobe 54 + ); 55 + LIBBPF_OPTS(bpf_test_run_opts, topts); 56 + struct stacktrace_ips *skel; 57 + 58 + skel = stacktrace_ips__open_and_load(); 59 + if (!ASSERT_OK_PTR(skel, "stacktrace_ips__open_and_load")) 60 + return; 61 + 62 + if (!skel->kconfig->CONFIG_UNWINDER_ORC) { 63 + test__skip(); 64 + goto cleanup; 65 + } 66 + 67 + skel->links.kprobe_multi_test = bpf_program__attach_kprobe_multi_opts( 68 + skel->progs.kprobe_multi_test, 69 + "bpf_testmod_stacktrace_test", &opts); 70 + if (!ASSERT_OK_PTR(skel->links.kprobe_multi_test, "bpf_program__attach_kprobe_multi_opts")) 71 + goto cleanup; 72 + 73 + trigger_module_test_read(1); 74 + 75 + 
load_kallsyms(); 76 + 77 + check_stacktrace_ips(bpf_map__fd(skel->maps.stackmap), skel->bss->stack_key, 4, 78 + ksym_get_addr("bpf_testmod_stacktrace_test_3"), 79 + ksym_get_addr("bpf_testmod_stacktrace_test_2"), 80 + ksym_get_addr("bpf_testmod_stacktrace_test_1"), 81 + ksym_get_addr("bpf_testmod_test_read")); 82 + 83 + cleanup: 84 + stacktrace_ips__destroy(skel); 85 + } 86 + 87 + static void test_stacktrace_ips_raw_tp(void) 88 + { 89 + __u32 info_len = sizeof(struct bpf_prog_info); 90 + LIBBPF_OPTS(bpf_test_run_opts, topts); 91 + struct bpf_prog_info info = {}; 92 + struct stacktrace_ips *skel; 93 + __u64 bpf_prog_ksym = 0; 94 + int err; 95 + 96 + skel = stacktrace_ips__open_and_load(); 97 + if (!ASSERT_OK_PTR(skel, "stacktrace_ips__open_and_load")) 98 + return; 99 + 100 + if (!skel->kconfig->CONFIG_UNWINDER_ORC) { 101 + test__skip(); 102 + goto cleanup; 103 + } 104 + 105 + skel->links.rawtp_test = bpf_program__attach_raw_tracepoint( 106 + skel->progs.rawtp_test, 107 + "bpf_testmod_test_read"); 108 + if (!ASSERT_OK_PTR(skel->links.rawtp_test, "bpf_program__attach_raw_tracepoint")) 109 + goto cleanup; 110 + 111 + /* get bpf program address */ 112 + info.jited_ksyms = ptr_to_u64(&bpf_prog_ksym); 113 + info.nr_jited_ksyms = 1; 114 + err = bpf_prog_get_info_by_fd(bpf_program__fd(skel->progs.rawtp_test), 115 + &info, &info_len); 116 + if (!ASSERT_OK(err, "bpf_prog_get_info_by_fd")) 117 + goto cleanup; 118 + 119 + trigger_module_test_read(1); 120 + 121 + load_kallsyms(); 122 + 123 + check_stacktrace_ips(bpf_map__fd(skel->maps.stackmap), skel->bss->stack_key, 2, 124 + bpf_prog_ksym, 125 + ksym_get_addr("bpf_trace_run2")); 126 + 127 + cleanup: 128 + stacktrace_ips__destroy(skel); 129 + } 130 + 131 + static void __test_stacktrace_ips(void) 132 + { 133 + if (test__start_subtest("kprobe_multi")) 134 + test_stacktrace_ips_kprobe_multi(false); 135 + if (test__start_subtest("kretprobe_multi")) 136 + test_stacktrace_ips_kprobe_multi(true); 137 + if 
(test__start_subtest("raw_tp")) 138 + test_stacktrace_ips_raw_tp(); 139 + } 140 + #else 141 + static void __test_stacktrace_ips(void) 142 + { 143 + test__skip(); 144 + } 145 + #endif 146 + 147 + void test_stacktrace_ips(void) 148 + { 149 + __test_stacktrace_ips(); 150 + }
+53
tools/testing/selftests/bpf/progs/iters_looping.c
··· 161 161 162 162 return 0; 163 163 } 164 + 165 + __used 166 + static void iterator_with_diff_stack_depth(int x) 167 + { 168 + struct bpf_iter_num iter; 169 + 170 + asm volatile ( 171 + "if r1 == 42 goto 0f;" 172 + "*(u64 *)(r10 - 128) = 0;" 173 + "0:" 174 + /* create iterator */ 175 + "r1 = %[iter];" 176 + "r2 = 0;" 177 + "r3 = 10;" 178 + "call %[bpf_iter_num_new];" 179 + "1:" 180 + /* consume next item */ 181 + "r1 = %[iter];" 182 + "call %[bpf_iter_num_next];" 183 + "if r0 == 0 goto 2f;" 184 + "goto 1b;" 185 + "2:" 186 + /* destroy iterator */ 187 + "r1 = %[iter];" 188 + "call %[bpf_iter_num_destroy];" 189 + : 190 + : __imm_ptr(iter), ITER_HELPERS 191 + : __clobber_common, "r6" 192 + ); 193 + } 194 + 195 + SEC("socket") 196 + __success 197 + __naked int widening_stack_size_bug(void *ctx) 198 + { 199 + /* 200 + * Depending on iterator_with_diff_stack_depth() parameter value, 201 + * subprogram stack depth is either 8 or 128 bytes. Arrange values so 202 + * that it is 128 on a first call and 8 on a second. This triggered a 203 + * bug in verifier's widen_imprecise_scalars() logic. 204 + */ 205 + asm volatile ( 206 + "r6 = 0;" 207 + "r1 = 0;" 208 + "1:" 209 + "call iterator_with_diff_stack_depth;" 210 + "r1 = 42;" 211 + "r6 += 1;" 212 + "if r6 < 2 goto 1b;" 213 + "r0 = 0;" 214 + "exit;" 215 + ::: __clobber_all); 216 + }
+30
tools/testing/selftests/bpf/progs/livepatch_trampoline.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */ 3 + 4 + #include <linux/bpf.h> 5 + #include <bpf/bpf_helpers.h> 6 + #include <bpf/bpf_tracing.h> 7 + 8 + int fentry_hit; 9 + int fexit_hit; 10 + int my_pid; 11 + 12 + SEC("fentry/cmdline_proc_show") 13 + int BPF_PROG(fentry_cmdline) 14 + { 15 + if (my_pid != (bpf_get_current_pid_tgid() >> 32)) 16 + return 0; 17 + 18 + fentry_hit = 1; 19 + return 0; 20 + } 21 + 22 + SEC("fexit/cmdline_proc_show") 23 + int BPF_PROG(fexit_cmdline) 24 + { 25 + if (my_pid != (bpf_get_current_pid_tgid() >> 32)) 26 + return 0; 27 + 28 + fexit_hit = 1; 29 + return 0; 30 + }
+43
tools/testing/selftests/bpf/progs/mptcp_sockmap.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #include "bpf_tracing_net.h" 4 + 5 + char _license[] SEC("license") = "GPL"; 6 + 7 + int sk_index; 8 + int redirect_idx; 9 + int trace_port; 10 + int helper_ret; 11 + struct { 12 + __uint(type, BPF_MAP_TYPE_SOCKMAP); 13 + __uint(key_size, sizeof(__u32)); 14 + __uint(value_size, sizeof(__u32)); 15 + __uint(max_entries, 100); 16 + } sock_map SEC(".maps"); 17 + 18 + SEC("sockops") 19 + int mptcp_sockmap_inject(struct bpf_sock_ops *skops) 20 + { 21 + struct bpf_sock *sk; 22 + 23 + /* only accept specified connection */ 24 + if (skops->local_port != trace_port || 25 + skops->op != BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) 26 + return 1; 27 + 28 + sk = skops->sk; 29 + if (!sk) 30 + return 1; 31 + 32 + /* update sk handler */ 33 + helper_ret = bpf_sock_map_update(skops, &sock_map, &sk_index, BPF_NOEXIST); 34 + 35 + return 1; 36 + } 37 + 38 + SEC("sk_skb/stream_verdict") 39 + int mptcp_sockmap_redirect(struct __sk_buff *skb) 40 + { 41 + /* redirect skb to the sk under sock_map[redirect_idx] */ 42 + return bpf_sk_redirect_map(skb, &sock_map, redirect_idx, 0); 43 + }
+49
tools/testing/selftests/bpf/progs/stacktrace_ips.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + // Copyright (c) 2018 Facebook 3 + 4 + #include <vmlinux.h> 5 + #include <bpf/bpf_helpers.h> 6 + #include <bpf/bpf_tracing.h> 7 + 8 + #ifndef PERF_MAX_STACK_DEPTH 9 + #define PERF_MAX_STACK_DEPTH 127 10 + #endif 11 + 12 + typedef __u64 stack_trace_t[PERF_MAX_STACK_DEPTH]; 13 + 14 + struct { 15 + __uint(type, BPF_MAP_TYPE_STACK_TRACE); 16 + __uint(max_entries, 16384); 17 + __type(key, __u32); 18 + __type(value, stack_trace_t); 19 + } stackmap SEC(".maps"); 20 + 21 + extern bool CONFIG_UNWINDER_ORC __kconfig __weak; 22 + 23 + /* 24 + * This function is here to have CONFIG_UNWINDER_ORC 25 + * used and added to object BTF. 26 + */ 27 + int unused(void) 28 + { 29 + return CONFIG_UNWINDER_ORC ? 0 : 1; 30 + } 31 + 32 + __u32 stack_key; 33 + 34 + SEC("kprobe.multi") 35 + int kprobe_multi_test(struct pt_regs *ctx) 36 + { 37 + stack_key = bpf_get_stackid(ctx, &stackmap, 0); 38 + return 0; 39 + } 40 + 41 + SEC("raw_tp/bpf_testmod_test_read") 42 + int rawtp_test(void *ctx) 43 + { 44 + /* Skip ebpf program entry in the stack. */ 45 + stack_key = bpf_get_stackid(ctx, &stackmap, 0); 46 + return 0; 47 + } 48 + 49 + char _license[] SEC("license") = "GPL";
+3 -3
tools/testing/selftests/bpf/progs/stream_fail.c
··· 10 10 __failure __msg("Possibly NULL pointer passed") 11 11 int stream_vprintk_null_arg(void *ctx) 12 12 { 13 - bpf_stream_vprintk(BPF_STDOUT, "", NULL, 0, NULL); 13 + bpf_stream_vprintk_impl(BPF_STDOUT, "", NULL, 0, NULL); 14 14 return 0; 15 15 } 16 16 ··· 18 18 __failure __msg("R3 type=scalar expected=") 19 19 int stream_vprintk_scalar_arg(void *ctx) 20 20 { 21 - bpf_stream_vprintk(BPF_STDOUT, "", (void *)46, 0, NULL); 21 + bpf_stream_vprintk_impl(BPF_STDOUT, "", (void *)46, 0, NULL); 22 22 return 0; 23 23 } 24 24 ··· 26 26 __failure __msg("arg#1 doesn't point to a const string") 27 27 int stream_vprintk_string_arg(void *ctx) 28 28 { 29 - bpf_stream_vprintk(BPF_STDOUT, ctx, NULL, 0, NULL); 29 + bpf_stream_vprintk_impl(BPF_STDOUT, ctx, NULL, 0, NULL); 30 30 return 0; 31 31 } 32 32
+3 -3
tools/testing/selftests/bpf/progs/task_work.c
··· 66 66 if (!work) 67 67 return 0; 68 68 69 - bpf_task_work_schedule_resume(task, &work->tw, &hmap, process_work, NULL); 69 + bpf_task_work_schedule_resume_impl(task, &work->tw, &hmap, process_work, NULL); 70 70 return 0; 71 71 } 72 72 ··· 80 80 work = bpf_map_lookup_elem(&arrmap, &key); 81 81 if (!work) 82 82 return 0; 83 - bpf_task_work_schedule_signal(task, &work->tw, &arrmap, process_work, NULL); 83 + bpf_task_work_schedule_signal_impl(task, &work->tw, &arrmap, process_work, NULL); 84 84 return 0; 85 85 } 86 86 ··· 102 102 work = bpf_map_lookup_elem(&lrumap, &key); 103 103 if (!work || work->data[0]) 104 104 return 0; 105 - bpf_task_work_schedule_resume(task, &work->tw, &lrumap, process_work, NULL); 105 + bpf_task_work_schedule_resume_impl(task, &work->tw, &lrumap, process_work, NULL); 106 106 return 0; 107 107 }
+4 -4
tools/testing/selftests/bpf/progs/task_work_fail.c
··· 53 53 work = bpf_map_lookup_elem(&arrmap, &key); 54 54 if (!work) 55 55 return 0; 56 - bpf_task_work_schedule_resume(task, &work->tw, &hmap, process_work, NULL); 56 + bpf_task_work_schedule_resume_impl(task, &work->tw, &hmap, process_work, NULL); 57 57 return 0; 58 58 } 59 59 ··· 65 65 struct bpf_task_work tw; 66 66 67 67 task = bpf_get_current_task_btf(); 68 - bpf_task_work_schedule_resume(task, &tw, &hmap, process_work, NULL); 68 + bpf_task_work_schedule_resume_impl(task, &tw, &hmap, process_work, NULL); 69 69 return 0; 70 70 } 71 71 ··· 76 76 struct task_struct *task; 77 77 78 78 task = bpf_get_current_task_btf(); 79 - bpf_task_work_schedule_resume(task, NULL, &hmap, process_work, NULL); 79 + bpf_task_work_schedule_resume_impl(task, NULL, &hmap, process_work, NULL); 80 80 return 0; 81 81 } 82 82 ··· 91 91 work = bpf_map_lookup_elem(&arrmap, &key); 92 92 if (!work) 93 93 return 0; 94 - bpf_task_work_schedule_resume(task, &work->tw, NULL, process_work, NULL); 94 + bpf_task_work_schedule_resume_impl(task, &work->tw, NULL, process_work, NULL); 95 95 return 0; 96 96 }
+2 -2
tools/testing/selftests/bpf/progs/task_work_stress.c
··· 51 51 if (!work) 52 52 return 0; 53 53 } 54 - err = bpf_task_work_schedule_signal(bpf_get_current_task_btf(), &work->tw, &hmap, 55 - process_work, NULL); 54 + err = bpf_task_work_schedule_signal_impl(bpf_get_current_task_btf(), &work->tw, &hmap, 55 + process_work, NULL); 56 56 if (err) 57 57 __sync_fetch_and_add(&schedule_error, 1); 58 58 else
+26
tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
··· 417 417 return a + (long)b + c + d + (long)e + f + g + h + i + j + k; 418 418 } 419 419 420 + noinline void bpf_testmod_stacktrace_test(void) 421 + { 422 + /* used for stacktrace test as attach function */ 423 + asm volatile (""); 424 + } 425 + 426 + noinline void bpf_testmod_stacktrace_test_3(void) 427 + { 428 + bpf_testmod_stacktrace_test(); 429 + asm volatile (""); 430 + } 431 + 432 + noinline void bpf_testmod_stacktrace_test_2(void) 433 + { 434 + bpf_testmod_stacktrace_test_3(); 435 + asm volatile (""); 436 + } 437 + 438 + noinline void bpf_testmod_stacktrace_test_1(void) 439 + { 440 + bpf_testmod_stacktrace_test_2(); 441 + asm volatile (""); 442 + } 443 + 420 444 int bpf_testmod_fentry_ok; 421 445 422 446 noinline ssize_t ··· 520 496 bpf_testmod_fentry_test11(16, (void *)17, 18, 19, (void *)20, 521 497 21, 22, 23, 24, 25, 26) != 231) 522 498 goto out; 499 + 500 + bpf_testmod_stacktrace_test_1(); 523 501 524 502 bpf_testmod_fentry_ok = 1; 525 503 out:
+1
tools/testing/selftests/drivers/net/Makefile
··· 18 18 netcons_fragmented_msg.sh \ 19 19 netcons_overflow.sh \ 20 20 netcons_sysdata.sh \ 21 + netcons_torture.sh \ 21 22 netpoll_basic.py \ 22 23 ping.py \ 23 24 psp.py \
+2
tools/testing/selftests/drivers/net/bonding/Makefile
··· 14 14 dev_addr_lists.sh \ 15 15 mode-1-recovery-updelay.sh \ 16 16 mode-2-recovery-updelay.sh \ 17 + netcons_over_bonding.sh \ 17 18 # end of TEST_PROGS 18 19 19 20 TEST_FILES := \ ··· 25 24 26 25 TEST_INCLUDES := \ 27 26 ../../../net/lib.sh \ 27 + ../lib/sh/lib_netcons.sh \ 28 28 ../../../net/forwarding/lib.sh \ 29 29 # end of TEST_INCLUDES 30 30
+4
tools/testing/selftests/drivers/net/bonding/config
··· 1 1 CONFIG_BONDING=y 2 2 CONFIG_BRIDGE=y 3 + CONFIG_CONFIGFS_FS=y 3 4 CONFIG_DUMMY=y 4 5 CONFIG_INET_ESP=y 5 6 CONFIG_INET_ESP_OFFLOAD=y ··· 10 9 CONFIG_NET_ACT_GACT=y 11 10 CONFIG_NET_CLS_FLOWER=y 12 11 CONFIG_NET_CLS_MATCHALL=m 12 + CONFIG_NETCONSOLE=m 13 + CONFIG_NETCONSOLE_DYNAMIC=y 14 + CONFIG_NETCONSOLE_EXTENDED_LOG=y 13 15 CONFIG_NETDEVSIM=m 14 16 CONFIG_NET_SCH_INGRESS=y 15 17 CONFIG_NLMON=y
+361
tools/testing/selftests/drivers/net/bonding/netcons_over_bonding.sh
··· 1 + #!/usr/bin/env bash 2 + # SPDX-License-Identifier: GPL-2.0 3 + # 4 + # This selftest exercises trying to have multiple netpoll users at the same 5 + # time. 6 + # 7 + # This selftest has multiple small tests inside, and the goal is to 8 + # get interfaces with bonding and netconsole in different orders in order 9 + # to catch any possible issue. 10 + # 11 + # The main test consists of four interfaces being created using netdevsim; two 12 + # of them are bonded to serve as the netconsole's transmit interface. The 13 + # remaining two interfaces are similarly bonded and assigned to a separate 14 + # network namespace, which acts as the receive interface, where socat monitors 15 + # for incoming messages. 16 + # 17 + # A netconsole message is then sent to ensure it is properly received across 18 + # this configuration. 19 + # 20 + # Later, run a few other tests to make sure that bonding and netconsole 21 + # cannot coexist. 22 + # 23 + # The test's objective is to exercise netpoll usage when managed simultaneously 24 + # by multiple subsystems (netconsole and bonding). 
25 + # 26 + # Author: Breno Leitao <leitao@debian.org> 27 + 28 + set -euo pipefail 29 + 30 + SCRIPTDIR=$(dirname "$(readlink -e "${BASH_SOURCE[0]}")") 31 + 32 + source "${SCRIPTDIR}"/../lib/sh/lib_netcons.sh 33 + 34 + modprobe netdevsim 2> /dev/null || true 35 + modprobe netconsole 2> /dev/null || true 36 + modprobe bonding 2> /dev/null || true 37 + modprobe veth 2> /dev/null || true 38 + 39 + # The content of kmsg will be saved to the following file 40 + OUTPUT_FILE="/tmp/${TARGET}" 41 + 42 + # Check for basic system dependency and exit if not found 43 + check_for_dependencies 44 + # Set current loglevel to KERN_INFO(6), and default to KERN_NOTICE(5) 45 + echo "6 5" > /proc/sys/kernel/printk 46 + # Remove the namespace, interfaces and netconsole target on exit 47 + trap cleanup_bond EXIT 48 + 49 + FORMAT="extended" 50 + IP_VERSION="ipv4" 51 + VETH0="veth"$(( RANDOM % 256)) 52 + VETH1="veth"$((256 + RANDOM % 256)) 53 + TXNS="" 54 + RXNS="" 55 + 56 + # Create "bond_tx_XX" and "bond_rx_XX" interfaces, and set DSTIF and SRCIF to 57 + # the bonding interfaces 58 + function setup_bonding_ifaces() { 59 + local RAND=$(( RANDOM % 100 )) 60 + BOND_TX_MAIN_IF="bond_tx_$RAND" 61 + BOND_RX_MAIN_IF="bond_rx_$RAND" 62 + 63 + # Setup TX 64 + if ! ip -n "${TXNS}" link add "${BOND_TX_MAIN_IF}" type bond mode balance-rr 65 + then 66 + echo "Failed to create bond TX interface. Is CONFIG_BONDING set?" >&2 67 + # only clean nsim ifaces and namespace. Nothing else has been 68 + # initialized 69 + cleanup_bond_nsim 70 + trap - EXIT 71 + exit "${ksft_skip}" 72 + fi 73 + 74 + # create_netdevsim() brought the interface up, but it needs to be down 75 + # before being enslaved. 
76 + ip -n "${TXNS}" \ 77 + link set "${BOND_TX1_SLAVE_IF}" down 78 + ip -n "${TXNS}" \ 79 + link set "${BOND_TX2_SLAVE_IF}" down 80 + ip -n "${TXNS}" \ 81 + link set "${BOND_TX1_SLAVE_IF}" master "${BOND_TX_MAIN_IF}" 82 + ip -n "${TXNS}" \ 83 + link set "${BOND_TX2_SLAVE_IF}" master "${BOND_TX_MAIN_IF}" 84 + ip -n "${TXNS}" \ 85 + link set "${BOND_TX_MAIN_IF}" up 86 + 87 + # Setup RX 88 + ip -n "${RXNS}" \ 89 + link add "${BOND_RX_MAIN_IF}" type bond mode balance-rr 90 + ip -n "${RXNS}" \ 91 + link set "${BOND_RX1_SLAVE_IF}" down 92 + ip -n "${RXNS}" \ 93 + link set "${BOND_RX2_SLAVE_IF}" down 94 + ip -n "${RXNS}" \ 95 + link set "${BOND_RX1_SLAVE_IF}" master "${BOND_RX_MAIN_IF}" 96 + ip -n "${RXNS}" \ 97 + link set "${BOND_RX2_SLAVE_IF}" master "${BOND_RX_MAIN_IF}" 98 + ip -n "${RXNS}" \ 99 + link set "${BOND_RX_MAIN_IF}" up 100 + 101 + export DSTIF="${BOND_RX_MAIN_IF}" 102 + export SRCIF="${BOND_TX_MAIN_IF}" 103 + } 104 + 105 + # Create 4 netdevsim interfaces. Two of them will be bound to TX bonding iface 106 + # and the other two will be bond to the RX interface (on the other namespace) 107 + function create_ifaces_bond() { 108 + BOND_TX1_SLAVE_IF=$(create_netdevsim "${NSIM_BOND_TX_1}" "${TXNS}") 109 + BOND_TX2_SLAVE_IF=$(create_netdevsim "${NSIM_BOND_TX_2}" "${TXNS}") 110 + BOND_RX1_SLAVE_IF=$(create_netdevsim "${NSIM_BOND_RX_1}" "${RXNS}") 111 + BOND_RX2_SLAVE_IF=$(create_netdevsim "${NSIM_BOND_RX_2}" "${RXNS}") 112 + } 113 + 114 + # netdevsim link BOND_TX to BOND_RX interfaces 115 + function link_ifaces_bond() { 116 + local BOND_TX1_SLAVE_IFIDX 117 + local BOND_TX2_SLAVE_IFIDX 118 + local BOND_RX1_SLAVE_IFIDX 119 + local BOND_RX2_SLAVE_IFIDX 120 + local TXNS_FD 121 + local RXNS_FD 122 + 123 + BOND_TX1_SLAVE_IFIDX=$(ip netns exec "${TXNS}" \ 124 + cat /sys/class/net/"$BOND_TX1_SLAVE_IF"/ifindex) 125 + BOND_TX2_SLAVE_IFIDX=$(ip netns exec "${TXNS}" \ 126 + cat /sys/class/net/"$BOND_TX2_SLAVE_IF"/ifindex) 127 + BOND_RX1_SLAVE_IFIDX=$(ip netns exec "${RXNS}" \ 
128 + cat /sys/class/net/"$BOND_RX1_SLAVE_IF"/ifindex) 129 + BOND_RX2_SLAVE_IFIDX=$(ip netns exec "${RXNS}" \ 130 + cat /sys/class/net/"$BOND_RX2_SLAVE_IF"/ifindex) 131 + 132 + exec {TXNS_FD}</var/run/netns/"${TXNS}" 133 + exec {RXNS_FD}</var/run/netns/"${RXNS}" 134 + 135 + # Linking TX ifaces to the RX ones (on the other namespace) 136 + echo "${TXNS_FD}:$BOND_TX1_SLAVE_IFIDX $RXNS_FD:$BOND_RX1_SLAVE_IFIDX" \ 137 + > "$NSIM_DEV_SYS_LINK" 138 + echo "${TXNS_FD}:$BOND_TX2_SLAVE_IFIDX $RXNS_FD:$BOND_RX2_SLAVE_IFIDX" \ 139 + > "$NSIM_DEV_SYS_LINK" 140 + 141 + exec {TXNS_FD}<&- 142 + exec {RXNS_FD}<&- 143 + } 144 + 145 + function create_all_ifaces() { 146 + # setup_ns function is coming from lib.sh 147 + setup_ns TXNS RXNS 148 + export NAMESPACE="${RXNS}" 149 + 150 + # Create two interfaces for RX and two for TX 151 + create_ifaces_bond 152 + # Link netlink ifaces 153 + link_ifaces_bond 154 + } 155 + 156 + # configure DSTIF and SRCIF IPs 157 + function configure_ifaces_ips() { 158 + local IP_VERSION=${1:-"ipv4"} 159 + select_ipv4_or_ipv6 "${IP_VERSION}" 160 + 161 + ip -n "${RXNS}" addr add "${DSTIP}"/24 dev "${DSTIF}" 162 + ip -n "${RXNS}" link set "${DSTIF}" up 163 + 164 + ip -n "${TXNS}" addr add "${SRCIP}"/24 dev "${SRCIF}" 165 + ip -n "${TXNS}" link set "${SRCIF}" up 166 + } 167 + 168 + function test_enable_netpoll_on_enslaved_iface() { 169 + echo 0 > "${NETCONS_PATH}"/enabled 170 + 171 + # At this stage, BOND_TX1_SLAVE_IF is enslaved to BOND_TX_MAIN_IF, and 172 + # linked to BOND_RX1_SLAVE_IF inside the namespace. 173 + echo "${BOND_TX1_SLAVE_IF}" > "${NETCONS_PATH}"/dev_name 174 + 175 + # This should fail with the following message in dmesg: 176 + # netpoll: netconsole: ethX is a slave device, aborting 177 + set +e 178 + enable_netcons_ns 2> /dev/null 179 + set -e 180 + 181 + if [[ $(cat "${NETCONS_PATH}"/enabled) -eq 1 ]] 182 + then 183 + echo "test failed: Bonding and netpoll cannot coexist." 
>&2 184 + exit "${ksft_fail}" 185 + fi 186 + } 187 + 188 + function test_delete_bond_and_reenable_target() { 189 + ip -n "${TXNS}" \ 190 + link delete "${BOND_TX_MAIN_IF}" type bond 191 + 192 + # BOND_TX1_SLAVE_IF is not attached to a bond interface anymore 193 + # netpoll can be plugged in there 194 + echo "${BOND_TX1_SLAVE_IF}" > "${NETCONS_PATH}"/dev_name 195 + 196 + # this should work, since the interface is not enslaved 197 + enable_netcons_ns 198 + 199 + if [[ $(cat "${NETCONS_PATH}"/enabled) -eq 0 ]] 200 + then 201 + echo "test failed: Unable to start netpoll on an unbond iface." >&2 202 + exit "${ksft_fail}" 203 + fi 204 + } 205 + 206 + # Send a netconsole message to the netconsole target 207 + function test_send_netcons_msg_through_bond_iface() { 208 + # Listen for netconsole port inside the namespace and 209 + # destination interface 210 + listen_port_and_save_to "${OUTPUT_FILE}" "${IP_VERSION}" & 211 + # Wait for socat to start and listen to the port. 212 + wait_for_port "${RXNS}" "${PORT}" "${IP_VERSION}" 213 + # Send the message 214 + echo "${MSG}: ${TARGET}" > /dev/kmsg 215 + # Wait until socat saves the file to disk 216 + busywait "${BUSYWAIT_TIMEOUT}" test -s "${OUTPUT_FILE}" 217 + # Make sure the message was received in the dst part 218 + # and exit 219 + validate_result "${OUTPUT_FILE}" "${FORMAT}" 220 + # kill socat in case it is still running 221 + pkill_socat 222 + } 223 + 224 + # BOND_TX1_SLAVE_IF has netconsole enabled on it, bind it to BOND_TX_MAIN_IF. 225 + # Given BOND_TX_MAIN_IF was deleted, recreate it first 226 + function test_enslave_netcons_enabled_iface { 227 + # netconsole got disabled while the interface was down 228 + if [[ $(cat "${NETCONS_PATH}"/enabled) -eq 0 ]] 229 + then 230 + echo "test failed: netconsole expected to be enabled against BOND_TX1_SLAVE_IF" >&2 231 + exit "${ksft_fail}" 232 + fi 233 + 234 + # recreate the bonding iface. 
It got deleted by the previous 235 + # test (test_delete_bond_and_reenable_target) 236 + ip -n "${TXNS}" \ 237 + link add "${BOND_TX_MAIN_IF}" type bond mode balance-rr 238 + 239 + # sub-interface needs to be down before attaching to bonding 240 + # This will also disable netconsole. 241 + ip -n "${TXNS}" \ 242 + link set "${BOND_TX1_SLAVE_IF}" down 243 + ip -n "${TXNS}" \ 244 + link set "${BOND_TX1_SLAVE_IF}" master "${BOND_TX_MAIN_IF}" 245 + ip -n "${TXNS}" \ 246 + link set "${BOND_TX_MAIN_IF}" up 247 + 248 + # netconsole got disabled while the interface was down 249 + if [[ $(cat "${NETCONS_PATH}"/enabled) -eq 1 ]] 250 + then 251 + echo "test failed: Device is part of a bond iface, cannot have netcons enabled" >&2 252 + exit "${ksft_fail}" 253 + fi 254 + } 255 + 256 + # Get netconsole enabled on a bonding interface and attach a second 257 + # sub-interface. 258 + function test_enslave_iface_to_bond { 259 + # BOND_TX_MAIN_IF has only BOND_TX1_SLAVE_IF right now 260 + echo "${BOND_TX_MAIN_IF}" > "${NETCONS_PATH}"/dev_name 261 + enable_netcons_ns 262 + 263 + # netcons is attached to bond0 and BOND_TX1_SLAVE_IF is 264 + # part of BOND_TX_MAIN_IF. Attach BOND_TX2_SLAVE_IF to BOND_TX_MAIN_IF. 265 + ip -n "${TXNS}" \ 266 + link set "${BOND_TX2_SLAVE_IF}" master "${BOND_TX_MAIN_IF}" 267 + if [[ $(cat "${NETCONS_PATH}"/enabled) -eq 0 ]] 268 + then 269 + echo "test failed: Netconsole should be enabled on bonding interface. Failed" >&2 270 + exit "${ksft_fail}" 271 + fi 272 + } 273 + 274 + function test_enslave_iff_disabled_netpoll_iface { 275 + local ret 276 + 277 + # Create two interfaces. veth interfaces are known to have 278 + # IFF_DISABLE_NETPOLL set 279 + if ! ip link add "${VETH0}" type veth peer name "${VETH1}" 280 + then 281 + echo "Failed to create veth TX interface. Is CONFIG_VETH set?" 
>&2 282 + exit "${ksft_skip}" 283 + fi 284 + set +e 285 + # This will print RTNETLINK answers: Device or resource busy 286 + ip link set "${VETH0}" master "${BOND_TX_MAIN_IF}" 2> /dev/null 287 + ret=$? 288 + set -e 289 + if [[ $ret -eq 0 ]] 290 + then 291 + echo "test failed: veth interface could not be enslaved" 292 + exit "${ksft_fail}" 293 + fi 294 + } 295 + 296 + # Given that netconsole picks the current net namespace, we need to enable it 297 + # from inside the TXNS namespace 298 + function enable_netcons_ns() { 299 + ip netns exec "${TXNS}" sh -c \ 300 + "mount -t configfs configfs /sys/kernel/config && echo 1 > $NETCONS_PATH/enabled" 301 + } 302 + 303 + #################### 304 + # Tests start here # 305 + #################### 306 + 307 + # Create regular interfaces using netdevsim and link them 308 + create_all_ifaces 309 + 310 + # Setup the bonding interfaces 311 + # BOND_RX_MAIN_IF has BOND_RX{1,2}_SLAVE_IF 312 + # BOND_TX_MAIN_IF has BOND_TX{1,2}_SLAVE_IF 313 + setup_bonding_ifaces 314 + 315 + # Configure the ips as BOND_RX1_SLAVE_IF and BOND_TX1_SLAVE_IF 316 + configure_ifaces_ips "${IP_VERSION}" 317 + 318 + _create_dynamic_target "${FORMAT}" "${NETCONS_PATH}" 319 + enable_netcons_ns 320 + set_user_data 321 + 322 + # Test #1 : Create an bonding interface and attach netpoll into 323 + # the bonding interface. Netconsole/netpoll should work on 324 + # the bonding interface. 325 + test_send_netcons_msg_through_bond_iface 326 + echo "test #1: netpoll on bonding interface worked. Test passed" >&2 327 + 328 + # Test #2: Attach netpoll to an enslaved interface 329 + # Try to attach netpoll to an enslaved sub-interface (while still being part of 330 + # a bonding interface), which shouldn't be allowed 331 + test_enable_netpoll_on_enslaved_iface 332 + echo "test #2: netpoll correctly rejected enslaved interface (expected behavior). Test passed." 
>&2 333 + 334 + # Test #3: Unplug the sub-interface from bond and enable netconsole 335 + # Detach the interface from a bonding interface and attach netpoll again 336 + test_delete_bond_and_reenable_target 337 + echo "test #3: Able to attach to an unbound interface. Test passed." >&2 338 + 339 + # Test #4: Enslave a sub-interface that had netconsole enabled 340 + # Try to enslave an interface that has netconsole/netpoll enabled. 341 + # Previous test has netconsole enabled in BOND_TX1_SLAVE_IF, try to enslave it 342 + test_enslave_netcons_enabled_iface 343 + echo "test #4: Enslaving an interface with netpoll attached. Test passed." >&2 344 + 345 + # Test #5: Enslave a sub-interface to a bonding interface 346 + # Enslave an interface to a bond interface that has netpoll attached 347 + # At this stage, BOND_TX_MAIN_IF is created and BOND_TX1_SLAVE_IF is part of 348 + # it. Netconsole is currently disabled 349 + test_enslave_iface_to_bond 350 + echo "test #5: Enslaving an interface to bond+netpoll. Test passed." >&2 351 + 352 + # Test #6: Enslave an IFF_DISABLE_NETPOLL sub-interface to a bonding interface 353 + # At this stage, BOND_TX_MAIN_IF has both sub-interfaces and netconsole is 354 + # enabled. This test will try to enslave a veth (IFF_DISABLE_NETPOLL) interface 355 + # and it should fail, with netpoll: veth0 doesn't support polling 356 + test_enslave_iff_disabled_netpoll_iface 357 + echo "test #6: Enslaving IFF_DISABLE_NETPOLL ifaces to bond iface is not supported. Test passed." >&2 358 + 359 + cleanup_bond 360 + trap - EXIT 361 + exit "${EXIT_STATUS}"
+63 -15
tools/testing/selftests/drivers/net/lib/sh/lib_netcons.sh
··· 11 11 LIBDIR=$(dirname "$(readlink -e "${BASH_SOURCE[0]}")") 12 12 13 13 SRCIF="" # to be populated later 14 + SRCIP="" # to be populated later 14 15 SRCIP4="192.0.2.1" 15 16 SRCIP6="fc00::1" 16 17 DSTIF="" # to be populated later 18 + DSTIP="" # to be populated later 17 19 DSTIP4="192.0.2.2" 18 20 DSTIP6="fc00::2" 19 21 ··· 30 28 # NAMESPACE will be populated by setup_ns with a random value 31 29 NAMESPACE="" 32 30 33 - # IDs for netdevsim 31 + # IDs for netdevsim. We either use NSIM_DEV_{1,2}_ID for standard test 32 + # or NSIM_BOND_{T,R}X_{1,2} for the bonding tests. Not both at the 33 + # same time. 34 34 NSIM_DEV_1_ID=$((256 + RANDOM % 256)) 35 35 NSIM_DEV_2_ID=$((512 + RANDOM % 256)) 36 + NSIM_BOND_TX_1=$((768 + RANDOM % 256)) 37 + NSIM_BOND_TX_2=$((1024 + RANDOM % 256)) 38 + NSIM_BOND_RX_1=$((1280 + RANDOM % 256)) 39 + NSIM_BOND_RX_2=$((1536 + RANDOM % 256)) 36 40 NSIM_DEV_SYS_NEW="/sys/bus/netdevsim/new_device" 41 + NSIM_DEV_SYS_LINK="/sys/bus/netdevsim/link_device" 37 42 38 43 # Used to create and delete namespaces 39 44 source "${LIBDIR}"/../../../../net/lib.sh 40 45 41 46 # Create netdevsim interfaces 42 47 create_ifaces() { 43 - 44 48 echo "$NSIM_DEV_2_ID" > "$NSIM_DEV_SYS_NEW" 45 49 echo "$NSIM_DEV_1_ID" > "$NSIM_DEV_SYS_NEW" 46 50 udevadm settle 2> /dev/null || true ··· 121 113 configure_ip 122 114 } 123 115 124 - function create_dynamic_target() { 125 - local FORMAT=${1:-"extended"} 116 + function _create_dynamic_target() { 117 + local FORMAT="${1:?FORMAT parameter required}" 118 + local NCPATH="${2:?NCPATH parameter required}" 126 119 127 120 DSTMAC=$(ip netns exec "${NAMESPACE}" \ 128 121 ip link show "${DSTIF}" | awk '/ether/ {print $2}') 129 122 130 123 # Create a dynamic target 131 - mkdir "${NETCONS_PATH}" 124 + mkdir "${NCPATH}" 132 125 133 - echo "${DSTIP}" > "${NETCONS_PATH}"/remote_ip 134 - echo "${SRCIP}" > "${NETCONS_PATH}"/local_ip 135 - echo "${DSTMAC}" > "${NETCONS_PATH}"/remote_mac 136 - echo "${SRCIF}" > 
"${NETCONS_PATH}"/dev_name 126 + echo "${DSTIP}" > "${NCPATH}"/remote_ip 127 + echo "${SRCIP}" > "${NCPATH}"/local_ip 128 + echo "${DSTMAC}" > "${NCPATH}"/remote_mac 129 + echo "${SRCIF}" > "${NCPATH}"/dev_name 137 130 138 131 if [ "${FORMAT}" == "basic" ] 139 132 then 140 133 # Basic target does not support release 141 - echo 0 > "${NETCONS_PATH}"/release 142 - echo 0 > "${NETCONS_PATH}"/extended 134 + echo 0 > "${NCPATH}"/release 135 + echo 0 > "${NCPATH}"/extended 143 136 elif [ "${FORMAT}" == "extended" ] 144 137 then 145 - echo 1 > "${NETCONS_PATH}"/extended 138 + echo 1 > "${NCPATH}"/extended 146 139 fi 140 + } 147 141 148 - echo 1 > "${NETCONS_PATH}"/enabled 142 + function create_dynamic_target() { 143 + local FORMAT=${1:-"extended"} 144 + local NCPATH=${2:-"$NETCONS_PATH"} 145 + _create_dynamic_target "${FORMAT}" "${NCPATH}" 146 + 147 + echo 1 > "${NCPATH}"/enabled 149 148 150 149 # This will make sure that the kernel was able to 151 150 # load the netconsole driver configuration. The console message ··· 200 185 echo "${DEFAULT_PRINTK_VALUES}" > /proc/sys/kernel/printk 201 186 } 202 187 203 - function cleanup() { 188 + function cleanup_netcons() { 204 189 # delete netconsole dynamic reconfiguration 205 - echo 0 > "${NETCONS_PATH}"/enabled 190 + # do not fail if the target is already disabled 191 + if [[ ! -d "${NETCONS_PATH}" ]] 192 + then 193 + # in some cases this is called before netcons path is created 194 + return 195 + fi 196 + if [[ $(cat "${NETCONS_PATH}"/enabled) != 0 ]] 197 + then 198 + echo 0 > "${NETCONS_PATH}"/enabled || true 199 + fi 206 200 # Remove all the keys that got created during the selftest 207 201 find "${NETCONS_PATH}/userdata/" -mindepth 1 -type d -delete 208 202 # Remove the configfs entry 209 203 rmdir "${NETCONS_PATH}" 204 + } 210 205 206 + function cleanup() { 207 + cleanup_netcons 211 208 do_cleanup 212 209 } 213 210 ··· 395 368 # otherwise the packet could be missed, and the test will fail. 
Happens 396 369 # more frequently on IPv6 397 370 sleep 1 371 + } 372 + 373 + # Clean up netdevsim ifaces created for bonding test 374 + function cleanup_bond_nsim() { 375 + ip -n "${TXNS}" \ 376 + link delete "${BOND_TX_MAIN_IF}" type bond || true 377 + ip -n "${RXNS}" \ 378 + link delete "${BOND_RX_MAIN_IF}" type bond || true 379 + 380 + cleanup_netdevsim "$NSIM_BOND_TX_1" 381 + cleanup_netdevsim "$NSIM_BOND_TX_2" 382 + cleanup_netdevsim "$NSIM_BOND_RX_1" 383 + cleanup_netdevsim "$NSIM_BOND_RX_2" 384 + } 385 + 386 + # cleanup tests that use bonding interfaces 387 + function cleanup_bond() { 388 + cleanup_netcons 389 + cleanup_bond_nsim 390 + cleanup_all_ns 391 + ip link delete "${VETH0}" || true 398 392 }
+130
tools/testing/selftests/drivers/net/netcons_torture.sh
··· 1 + #!/usr/bin/env bash 2 + # SPDX-License-Identifier: GPL-2.0 3 + 4 + # Repeatedly sends kernel messages, toggles netconsole targets on and off, 5 + # creates and deletes targets in parallel, and toggles the source interface to 6 + # simulate stress conditions. 7 + # 8 + # This test aims to verify the robustness of netconsole under dynamic 9 + # configurations and concurrent operations. 10 + # 11 + # The major goal is to run this test with LOCKDEP, Kmemleak and KASAN to make 12 + # sure no issues are reported. 13 + # 14 + # Author: Breno Leitao <leitao@debian.org> 15 + 16 + set -euo pipefail 17 + 18 + SCRIPTDIR=$(dirname "$(readlink -e "${BASH_SOURCE[0]}")") 19 + 20 + source "${SCRIPTDIR}"/lib/sh/lib_netcons.sh 21 + 22 + # Number of times the main loop runs 23 + ITERATIONS=${1:-150} 24 + 25 + # Only test extended format 26 + FORMAT="extended" 27 + # And ipv6 only 28 + IP_VERSION="ipv6" 29 + 30 + # Create, enable and delete some targets. 31 + create_and_delete_random_target() { 32 + COUNT=2 33 + RND_PREFIX=$(mktemp -u netcons_rnd_XXXX_) 34 + 35 + if [ -d "${NETCONS_CONFIGFS}/${RND_PREFIX}${COUNT}" ] || \ 36 + [ -d "${NETCONS_CONFIGFS}/${RND_PREFIX}0" ]; then 37 + echo "Function didn't finish yet, skipping it." 
>&2 38 + return 39 + fi 40 + 41 + # enable COUNT targets 42 + for i in $(seq ${COUNT}) 43 + do 44 + RND_TARGET="${RND_PREFIX}"${i} 45 + RND_TARGET_PATH="${NETCONS_CONFIGFS}"/"${RND_TARGET}" 46 + 47 + # Basic population so the target can come up 48 + _create_dynamic_target "${FORMAT}" "${RND_TARGET_PATH}" 49 + done 50 + 51 + echo "netconsole selftest: ${COUNT} additional targets were created" > /dev/kmsg 52 + # disable them all 53 + for i in $(seq ${COUNT}) 54 + do 55 + RND_TARGET="${RND_PREFIX}"${i} 56 + RND_TARGET_PATH="${NETCONS_CONFIGFS}"/"${RND_TARGET}" 57 + if [[ $(cat "${RND_TARGET_PATH}/enabled") -eq 1 ]] 58 + then 59 + echo 0 > "${RND_TARGET_PATH}"/enabled 60 + fi 61 + rmdir "${RND_TARGET_PATH}" 62 + done 63 + } 64 + 65 + # Disable and enable the target mid-air, while messages 66 + # are being transmitted. 67 + toggle_netcons_target() { 68 + for i in $(seq 2) 69 + do 70 + if [ ! -d "${NETCONS_PATH}" ] 71 + then 72 + break 73 + fi 74 + echo 0 > "${NETCONS_PATH}"/enabled 2> /dev/null || true 75 + # Try to enable a bit harder, given it might fail to enable 76 + # Write to `enabled` might fail depending on the lock, which is 77 + # highly contentious here 78 + for _ in $(seq 5) 79 + do 80 + echo 1 > "${NETCONS_PATH}"/enabled 2> /dev/null || true 81 + done 82 + done 83 + } 84 + 85 + toggle_iface(){ 86 + ip link set "${SRCIF}" down 87 + ip link set "${SRCIF}" up 88 + } 89 + 90 + # Start here 91 + 92 + modprobe netdevsim 2> /dev/null || true 93 + modprobe netconsole 2> /dev/null || true 94 + 95 + # Check for basic system dependency and exit if not found 96 + check_for_dependencies 97 + # Set current loglevel to KERN_INFO(6), and default to KERN_NOTICE(5) 98 + echo "6 5" > /proc/sys/kernel/printk 99 + # Remove the namespace, interfaces and netconsole target on exit 100 + trap cleanup EXIT 101 + # Create one namespace and two interfaces 102 + set_network "${IP_VERSION}" 103 + # Create a dynamic target for netconsole 104 + create_dynamic_target "${FORMAT}" 105 + 106 
+ for i in $(seq "$ITERATIONS") 107 + do 108 + for _ in $(seq 10) 109 + do 110 + echo "${MSG}: ${TARGET} ${i}" > /dev/kmsg 111 + done 112 + wait 113 + 114 + if (( i % 30 == 0 )); then 115 + toggle_netcons_target & 116 + fi 117 + 118 + if (( i % 50 == 0 )); then 119 + # create some targets, enable them, send msg and disable 120 + # all in a parallel thread 121 + create_and_delete_random_target & 122 + fi 123 + 124 + if (( i % 70 == 0 )); then 125 + toggle_iface & 126 + fi 127 + done 128 + wait 129 + 130 + exit "${EXIT_STATUS}"
+4
tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
··· 20 20 echo 0 > tracing_on 21 21 echo 0 > events/enable 22 22 23 + # Clear functions caused by page cache; run sample_events twice 24 + sample_events 25 + sample_events 26 + 23 27 echo "Get the most frequently calling function" 24 28 echo > trace 25 29 sample_events
+3
tools/testing/selftests/kvm/arm64/get-reg-list.c
··· 63 63 REG_FEAT(HDFGWTR2_EL2, ID_AA64MMFR0_EL1, FGT, FGT2), 64 64 REG_FEAT(ZCR_EL2, ID_AA64PFR0_EL1, SVE, IMP), 65 65 REG_FEAT(SCTLR2_EL1, ID_AA64MMFR3_EL1, SCTLRX, IMP), 66 + REG_FEAT(SCTLR2_EL2, ID_AA64MMFR3_EL1, SCTLRX, IMP), 66 67 REG_FEAT(VDISR_EL2, ID_AA64PFR0_EL1, RAS, IMP), 67 68 REG_FEAT(VSESR_EL2, ID_AA64PFR0_EL1, RAS, IMP), 68 69 REG_FEAT(VNCR_EL2, ID_AA64MMFR4_EL1, NV_frac, NV2_ONLY), 69 70 REG_FEAT(CNTHV_CTL_EL2, ID_AA64MMFR1_EL1, VH, IMP), 70 71 REG_FEAT(CNTHV_CVAL_EL2,ID_AA64MMFR1_EL1, VH, IMP), 72 + REG_FEAT(ZCR_EL2, ID_AA64PFR0_EL1, SVE, IMP), 71 73 }; 72 74 73 75 bool filter_reg(__u64 reg) ··· 720 718 SYS_REG(VMPIDR_EL2), 721 719 SYS_REG(SCTLR_EL2), 722 720 SYS_REG(ACTLR_EL2), 721 + SYS_REG(SCTLR2_EL2), 723 722 SYS_REG(HCR_EL2), 724 723 SYS_REG(MDCR_EL2), 725 724 SYS_REG(CPTR_EL2),
+8 -1
tools/testing/selftests/kvm/lib/arm64/gic_v3_its.c
··· 15 15 #include "gic_v3.h" 16 16 #include "processor.h" 17 17 18 + #define GITS_COLLECTION_TARGET_SHIFT 16 19 + 18 20 static u64 its_read_u64(unsigned long offset) 19 21 { 20 22 return readq_relaxed(GITS_BASE_GVA + offset); ··· 165 163 its_mask_encode(&cmd->raw_cmd[2], col, 15, 0); 166 164 } 167 165 166 + static u64 procnum_to_rdbase(u32 vcpu_id) 167 + { 168 + return vcpu_id << GITS_COLLECTION_TARGET_SHIFT; 169 + } 170 + 168 171 #define GITS_CMDQ_POLL_ITERATIONS 0 169 172 170 173 static void its_send_cmd(void *cmdq_base, struct its_cmd_block *cmd) ··· 224 217 225 218 its_encode_cmd(&cmd, GITS_CMD_MAPC); 226 219 its_encode_collection(&cmd, collection_id); 227 - its_encode_target(&cmd, vcpu_id); 220 + its_encode_target(&cmd, procnum_to_rdbase(vcpu_id)); 228 221 its_encode_valid(&cmd, valid); 229 222 230 223 its_send_cmd(cmdq_base, &cmd);
+2
tools/testing/selftests/net/forwarding/local_termination.sh
··· 176 176 local rcv_dmac=$(mac_get $rcv_if_name) 177 177 local should_receive 178 178 179 + setup_wait 180 + 179 181 tcpdump_start $rcv_if_name 180 182 181 183 mc_route_prepare $send_if_name
+13 -5
tools/testing/selftests/net/mptcp/mptcp_connect.c
··· 710 710 711 711 bw = do_rnd_write(peerfd, winfo->buf + winfo->off, winfo->len); 712 712 if (bw < 0) { 713 - if (cfg_rcv_trunc) 714 - return 0; 713 + /* expected reset, continue to read */ 714 + if (cfg_rcv_trunc && 715 + (errno == ECONNRESET || 716 + errno == EPIPE)) { 717 + fds.events &= ~POLLOUT; 718 + continue; 719 + } 720 + 715 721 perror("write"); 716 722 return 111; 717 723 } ··· 743 737 } 744 738 745 739 if (fds.revents & (POLLERR | POLLNVAL)) { 746 - if (cfg_rcv_trunc) 747 - return 0; 740 + if (cfg_rcv_trunc) { 741 + fds.events &= ~(POLLERR | POLLNVAL); 742 + continue; 743 + } 748 744 fprintf(stderr, "Unexpected revents: " 749 745 "POLLERR/POLLNVAL(%x)\n", fds.revents); 750 746 return 5; ··· 1441 1433 */ 1442 1434 if (cfg_truncate < 0) { 1443 1435 cfg_rcv_trunc = true; 1444 - signal(SIGPIPE, handle_signal); 1436 + signal(SIGPIPE, SIG_IGN); 1445 1437 } 1446 1438 break; 1447 1439 case 'j':
+1 -1
tools/testing/selftests/net/mptcp/mptcp_connect.sh
··· 492 492 "than expected (${expect_synrx})" 493 493 retc=1 494 494 fi 495 - if [ ${stat_ackrx_now_l} -lt ${expect_ackrx} ] && [ ${stat_ooo_now} -eq 0 ]; then 495 + if [ ${stat_ackrx_now_l} -lt ${expect_ackrx} ]; then 496 496 if [ ${stat_ooo_now} -eq 0 ]; then 497 497 mptcp_lib_pr_fail "lower MPC ACK rx (${stat_ackrx_now_l})" \ 498 498 "than expected (${expect_ackrx})"
+45 -45
tools/testing/selftests/net/mptcp/mptcp_join.sh
··· 2532 2532 if reset "remove single subflow"; then 2533 2533 pm_nl_set_limits $ns1 0 1 2534 2534 pm_nl_set_limits $ns2 0 1 2535 - pm_nl_add_endpoint $ns2 10.0.3.2 flags subflow 2535 + pm_nl_add_endpoint $ns2 10.0.3.2 flags subflow,backup 2536 2536 addr_nr_ns2=-1 speed=slow \ 2537 2537 run_tests $ns1 $ns2 10.0.1.1 2538 2538 chk_join_nr 1 1 1 ··· 2545 2545 if reset "remove multiple subflows"; then 2546 2546 pm_nl_set_limits $ns1 0 2 2547 2547 pm_nl_set_limits $ns2 0 2 2548 - pm_nl_add_endpoint $ns2 10.0.2.2 flags subflow 2549 - pm_nl_add_endpoint $ns2 10.0.3.2 flags subflow 2548 + pm_nl_add_endpoint $ns2 10.0.2.2 flags subflow,backup 2549 + pm_nl_add_endpoint $ns2 10.0.3.2 flags subflow,backup 2550 2550 addr_nr_ns2=-2 speed=slow \ 2551 2551 run_tests $ns1 $ns2 10.0.1.1 2552 2552 chk_join_nr 2 2 2 ··· 2557 2557 # single address, remove 2558 2558 if reset "remove single address"; then 2559 2559 pm_nl_set_limits $ns1 0 1 2560 - pm_nl_add_endpoint $ns1 10.0.2.1 flags signal 2560 + pm_nl_add_endpoint $ns1 10.0.2.1 flags signal,backup 2561 2561 pm_nl_set_limits $ns2 1 1 2562 2562 addr_nr_ns1=-1 speed=slow \ 2563 2563 run_tests $ns1 $ns2 10.0.1.1 ··· 2570 2570 # subflow and signal, remove 2571 2571 if reset "remove subflow and signal"; then 2572 2572 pm_nl_set_limits $ns1 0 2 2573 - pm_nl_add_endpoint $ns1 10.0.2.1 flags signal 2573 + pm_nl_add_endpoint $ns1 10.0.2.1 flags signal,backup 2574 2574 pm_nl_set_limits $ns2 1 2 2575 - pm_nl_add_endpoint $ns2 10.0.3.2 flags subflow 2575 + pm_nl_add_endpoint $ns2 10.0.3.2 flags subflow,backup 2576 2576 addr_nr_ns1=-1 addr_nr_ns2=-1 speed=slow \ 2577 2577 run_tests $ns1 $ns2 10.0.1.1 2578 2578 chk_join_nr 2 2 2 ··· 2584 2584 # subflows and signal, remove 2585 2585 if reset "remove subflows and signal"; then 2586 2586 pm_nl_set_limits $ns1 0 3 2587 - pm_nl_add_endpoint $ns1 10.0.2.1 flags signal 2587 + pm_nl_add_endpoint $ns1 10.0.2.1 flags signal,backup 2588 2588 pm_nl_set_limits $ns2 1 3 2589 - pm_nl_add_endpoint $ns2 10.0.3.2 
flags subflow 2590 - pm_nl_add_endpoint $ns2 10.0.4.2 flags subflow 2589 + pm_nl_add_endpoint $ns2 10.0.3.2 flags subflow,backup 2590 + pm_nl_add_endpoint $ns2 10.0.4.2 flags subflow,backup 2591 2591 addr_nr_ns1=-1 addr_nr_ns2=-2 speed=10 \ 2592 2592 run_tests $ns1 $ns2 10.0.1.1 2593 2593 chk_join_nr 3 3 3 ··· 2599 2599 # addresses remove 2600 2600 if reset "remove addresses"; then 2601 2601 pm_nl_set_limits $ns1 3 3 2602 - pm_nl_add_endpoint $ns1 10.0.2.1 flags signal id 250 2603 - pm_nl_add_endpoint $ns1 10.0.3.1 flags signal 2604 - pm_nl_add_endpoint $ns1 10.0.4.1 flags signal 2602 + pm_nl_add_endpoint $ns1 10.0.2.1 flags signal,backup id 250 2603 + pm_nl_add_endpoint $ns1 10.0.3.1 flags signal,backup 2604 + pm_nl_add_endpoint $ns1 10.0.4.1 flags signal,backup 2605 2605 pm_nl_set_limits $ns2 3 3 2606 2606 addr_nr_ns1=-3 speed=10 \ 2607 2607 run_tests $ns1 $ns2 10.0.1.1 ··· 2614 2614 # invalid addresses remove 2615 2615 if reset "remove invalid addresses"; then 2616 2616 pm_nl_set_limits $ns1 3 3 2617 - pm_nl_add_endpoint $ns1 10.0.12.1 flags signal 2617 + pm_nl_add_endpoint $ns1 10.0.12.1 flags signal,backup 2618 2618 # broadcast IP: no packet for this address will be received on ns1 2619 - pm_nl_add_endpoint $ns1 224.0.0.1 flags signal 2620 - pm_nl_add_endpoint $ns1 10.0.3.1 flags signal 2619 + pm_nl_add_endpoint $ns1 224.0.0.1 flags signal,backup 2620 + pm_nl_add_endpoint $ns1 10.0.3.1 flags signal,backup 2621 2621 pm_nl_set_limits $ns2 2 2 2622 2622 addr_nr_ns1=-3 speed=10 \ 2623 2623 run_tests $ns1 $ns2 10.0.1.1 ··· 2631 2631 # subflows and signal, flush 2632 2632 if reset "flush subflows and signal"; then 2633 2633 pm_nl_set_limits $ns1 0 3 2634 - pm_nl_add_endpoint $ns1 10.0.2.1 flags signal 2634 + pm_nl_add_endpoint $ns1 10.0.2.1 flags signal,backup 2635 2635 pm_nl_set_limits $ns2 1 3 2636 - pm_nl_add_endpoint $ns2 10.0.3.2 flags subflow 2637 - pm_nl_add_endpoint $ns2 10.0.4.2 flags subflow 2636 + pm_nl_add_endpoint $ns2 10.0.3.2 flags subflow,backup 2637 
+ pm_nl_add_endpoint $ns2 10.0.4.2 flags subflow,backup 2638 2638 addr_nr_ns1=-8 addr_nr_ns2=-8 speed=slow \ 2639 2639 run_tests $ns1 $ns2 10.0.1.1 2640 2640 chk_join_nr 3 3 3 ··· 2647 2647 if reset "flush subflows"; then 2648 2648 pm_nl_set_limits $ns1 3 3 2649 2649 pm_nl_set_limits $ns2 3 3 2650 - pm_nl_add_endpoint $ns2 10.0.2.2 flags subflow id 150 2651 - pm_nl_add_endpoint $ns2 10.0.3.2 flags subflow 2652 - pm_nl_add_endpoint $ns2 10.0.4.2 flags subflow 2650 + pm_nl_add_endpoint $ns2 10.0.2.2 flags subflow,backup id 150 2651 + pm_nl_add_endpoint $ns2 10.0.3.2 flags subflow,backup 2652 + pm_nl_add_endpoint $ns2 10.0.4.2 flags subflow,backup 2653 2653 addr_nr_ns1=-8 addr_nr_ns2=-8 speed=slow \ 2654 2654 run_tests $ns1 $ns2 10.0.1.1 2655 2655 chk_join_nr 3 3 3 ··· 2666 2666 # addresses flush 2667 2667 if reset "flush addresses"; then 2668 2668 pm_nl_set_limits $ns1 3 3 2669 - pm_nl_add_endpoint $ns1 10.0.2.1 flags signal id 250 2670 - pm_nl_add_endpoint $ns1 10.0.3.1 flags signal 2671 - pm_nl_add_endpoint $ns1 10.0.4.1 flags signal 2669 + pm_nl_add_endpoint $ns1 10.0.2.1 flags signal,backup id 250 2670 + pm_nl_add_endpoint $ns1 10.0.3.1 flags signal,backup 2671 + pm_nl_add_endpoint $ns1 10.0.4.1 flags signal,backup 2672 2672 pm_nl_set_limits $ns2 3 3 2673 2673 addr_nr_ns1=-8 addr_nr_ns2=-8 speed=slow \ 2674 2674 run_tests $ns1 $ns2 10.0.1.1 ··· 2681 2681 # invalid addresses flush 2682 2682 if reset "flush invalid addresses"; then 2683 2683 pm_nl_set_limits $ns1 3 3 2684 - pm_nl_add_endpoint $ns1 10.0.12.1 flags signal 2685 - pm_nl_add_endpoint $ns1 10.0.3.1 flags signal 2686 - pm_nl_add_endpoint $ns1 10.0.14.1 flags signal 2684 + pm_nl_add_endpoint $ns1 10.0.12.1 flags signal,backup 2685 + pm_nl_add_endpoint $ns1 10.0.3.1 flags signal,backup 2686 + pm_nl_add_endpoint $ns1 10.0.14.1 flags signal,backup 2687 2687 pm_nl_set_limits $ns2 3 3 2688 2688 addr_nr_ns1=-8 speed=slow \ 2689 2689 run_tests $ns1 $ns2 10.0.1.1 ··· 3806 3806 continue_if mptcp_lib_has_file 
'/proc/sys/net/mptcp/pm_type'; then 3807 3807 set_userspace_pm $ns1 3808 3808 pm_nl_set_limits $ns2 2 2 3809 - { speed=5 \ 3809 + { test_linkfail=128 speed=5 \ 3810 3810 run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null 3811 3811 local tests_pid=$! 3812 3812 wait_mpj $ns1 ··· 3831 3831 chk_mptcp_info subflows 0 subflows 0 3832 3832 chk_subflows_total 1 1 3833 3833 kill_events_pids 3834 - mptcp_lib_kill_wait $tests_pid 3834 + mptcp_lib_kill_group_wait $tests_pid 3835 3835 fi 3836 3836 3837 3837 # userspace pm create destroy subflow ··· 3839 3839 continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then 3840 3840 set_userspace_pm $ns2 3841 3841 pm_nl_set_limits $ns1 0 1 3842 - { speed=5 \ 3842 + { test_linkfail=128 speed=5 \ 3843 3843 run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null 3844 3844 local tests_pid=$! 3845 3845 wait_mpj $ns2 ··· 3859 3859 chk_mptcp_info subflows 0 subflows 0 3860 3860 chk_subflows_total 1 1 3861 3861 kill_events_pids 3862 - mptcp_lib_kill_wait $tests_pid 3862 + mptcp_lib_kill_group_wait $tests_pid 3863 3863 fi 3864 3864 3865 3865 # userspace pm create id 0 subflow ··· 3867 3867 continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then 3868 3868 set_userspace_pm $ns2 3869 3869 pm_nl_set_limits $ns1 0 1 3870 - { speed=5 \ 3870 + { test_linkfail=128 speed=5 \ 3871 3871 run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null 3872 3872 local tests_pid=$! 3873 3873 wait_mpj $ns2 ··· 3880 3880 chk_mptcp_info subflows 1 subflows 1 3881 3881 chk_subflows_total 2 2 3882 3882 kill_events_pids 3883 - mptcp_lib_kill_wait $tests_pid 3883 + mptcp_lib_kill_group_wait $tests_pid 3884 3884 fi 3885 3885 3886 3886 # userspace pm remove initial subflow ··· 3888 3888 continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then 3889 3889 set_userspace_pm $ns2 3890 3890 pm_nl_set_limits $ns1 0 1 3891 - { speed=5 \ 3891 + { test_linkfail=128 speed=5 \ 3892 3892 run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null 3893 3893 local tests_pid=$! 
3894 3894 wait_mpj $ns2 ··· 3904 3904 chk_mptcp_info subflows 1 subflows 1 3905 3905 chk_subflows_total 1 1 3906 3906 kill_events_pids 3907 - mptcp_lib_kill_wait $tests_pid 3907 + mptcp_lib_kill_group_wait $tests_pid 3908 3908 fi 3909 3909 3910 3910 # userspace pm send RM_ADDR for ID 0 ··· 3912 3912 continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then 3913 3913 set_userspace_pm $ns1 3914 3914 pm_nl_set_limits $ns2 1 1 3915 - { speed=5 \ 3915 + { test_linkfail=128 speed=5 \ 3916 3916 run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null 3917 3917 local tests_pid=$! 3918 3918 wait_mpj $ns1 ··· 3930 3930 chk_mptcp_info subflows 1 subflows 1 3931 3931 chk_subflows_total 1 1 3932 3932 kill_events_pids 3933 - mptcp_lib_kill_wait $tests_pid 3933 + mptcp_lib_kill_group_wait $tests_pid 3934 3934 fi 3935 3935 } 3936 3936 ··· 3943 3943 pm_nl_set_limits $ns1 2 2 3944 3944 pm_nl_set_limits $ns2 2 2 3945 3945 pm_nl_add_endpoint $ns1 10.0.2.1 flags signal 3946 - { speed=slow \ 3946 + { test_linkfail=128 speed=slow \ 3947 3947 run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null 3948 3948 local tests_pid=$! 3949 3949 ··· 3960 3960 pm_nl_add_endpoint $ns2 10.0.2.2 flags signal 3961 3961 pm_nl_check_endpoint "modif is allowed" \ 3962 3962 $ns2 10.0.2.2 id 1 flags signal 3963 - mptcp_lib_kill_wait $tests_pid 3963 + mptcp_lib_kill_group_wait $tests_pid 3964 3964 fi 3965 3965 3966 3966 if reset_with_tcp_filter "delete and re-add" ns2 10.0.3.2 REJECT OUTPUT && ··· 3970 3970 pm_nl_set_limits $ns2 0 3 3971 3971 pm_nl_add_endpoint $ns2 10.0.1.2 id 1 dev ns2eth1 flags subflow 3972 3972 pm_nl_add_endpoint $ns2 10.0.2.2 id 2 dev ns2eth2 flags subflow 3973 - { test_linkfail=4 speed=5 \ 3973 + { test_linkfail=128 speed=5 \ 3974 3974 run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null 3975 3975 local tests_pid=$! 
3976 3976 ··· 4015 4015 chk_mptcp_info subflows 3 subflows 3 4016 4016 done 4017 4017 4018 - mptcp_lib_kill_wait $tests_pid 4018 + mptcp_lib_kill_group_wait $tests_pid 4019 4019 4020 4020 kill_events_pids 4021 4021 chk_evt_nr ns1 MPTCP_LIB_EVENT_LISTENER_CREATED 1 ··· 4048 4048 # broadcast IP: no packet for this address will be received on ns1 4049 4049 pm_nl_add_endpoint $ns1 224.0.0.1 id 2 flags signal 4050 4050 pm_nl_add_endpoint $ns1 10.0.1.1 id 42 flags signal 4051 - { test_linkfail=4 speed=5 \ 4051 + { test_linkfail=128 speed=5 \ 4052 4052 run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null 4053 4053 local tests_pid=$! 4054 4054 ··· 4089 4089 wait_mpj $ns2 4090 4090 chk_subflow_nr "after re-re-add ID 0" 3 4091 4091 chk_mptcp_info subflows 3 subflows 3 4092 - mptcp_lib_kill_wait $tests_pid 4092 + mptcp_lib_kill_group_wait $tests_pid 4093 4093 4094 4094 kill_events_pids 4095 4095 chk_evt_nr ns1 MPTCP_LIB_EVENT_LISTENER_CREATED 1 ··· 4121 4121 # broadcast IP: no packet for this address will be received on ns1 4122 4122 pm_nl_add_endpoint $ns1 224.0.0.1 id 2 flags signal 4123 4123 pm_nl_add_endpoint $ns2 10.0.3.2 id 3 flags subflow 4124 - { test_linkfail=4 speed=20 \ 4124 + { test_linkfail=128 speed=20 \ 4125 4125 run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null 4126 4126 local tests_pid=$! 4127 4127 ··· 4137 4137 wait_mpj $ns2 4138 4138 pm_nl_add_endpoint $ns1 10.0.3.1 id 2 flags signal 4139 4139 wait_mpj $ns2 4140 - mptcp_lib_kill_wait $tests_pid 4140 + mptcp_lib_kill_group_wait $tests_pid 4141 4141 4142 4142 join_syn_tx=3 join_connect_err=1 \ 4143 4143 chk_join_nr 2 2 2
+21
tools/testing/selftests/net/mptcp/mptcp_lib.sh
··· 350 350 wait "${1}" 2>/dev/null 351 351 } 352 352 353 + # $1: PID 354 + mptcp_lib_pid_list_children() { 355 + local curr="${1}" 356 + # evoke 'ps' only once 357 + local pids="${2:-"$(ps o pid,ppid)"}" 358 + 359 + echo "${curr}" 360 + 361 + local pid 362 + for pid in $(echo "${pids}" | awk "\$2 == ${curr} { print \$1 }"); do 363 + mptcp_lib_pid_list_children "${pid}" "${pids}" 364 + done 365 + } 366 + 367 + # $1: PID 368 + mptcp_lib_kill_group_wait() { 369 + # Some users might not have procps-ng: cannot use "kill -- -PID" 370 + mptcp_lib_pid_list_children "${1}" | xargs -r kill &>/dev/null 371 + wait "${1}" 2>/dev/null 372 + } 373 + 353 374 # $1: IP address 354 375 mptcp_lib_is_v6() { 355 376 [ -z "${1##*:*}" ]
+44
tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json
··· 961 961 "teardown": [ 962 962 "$TC qdisc del dev $DUMMY root" 963 963 ] 964 + }, 965 + { 966 + "id": "4989", 967 + "name": "Try to add an fq child to an ingress qdisc", 968 + "category": [ 969 + "qdisc", 970 + "ingress" 971 + ], 972 + "plugins": { 973 + "requires": "nsPlugin" 974 + }, 975 + "setup": [ 976 + "$TC qdisc add dev $DUMMY handle ffff:0 ingress" 977 + ], 978 + "cmdUnderTest": "$TC qdisc add dev $DUMMY parent ffff:0 handle ffe0:0 fq", 979 + "expExitCode": "2", 980 + "verifyCmd": "$TC -j qdisc ls dev $DUMMY handle ffe0:", 981 + "matchJSON": [], 982 + "matchCount": "1", 983 + "teardown": [ 984 + "$TC qdisc del dev $DUMMY ingress" 985 + ] 986 + }, 987 + { 988 + "id": "c2b0", 989 + "name": "Try to add an fq child to a clsact qdisc", 990 + "category": [ 991 + "qdisc", 992 + "ingress" 993 + ], 994 + "plugins": { 995 + "requires": "nsPlugin" 996 + }, 997 + "setup": [ 998 + "$TC qdisc add dev $DUMMY handle ffff:0 clsact" 999 + ], 1000 + "cmdUnderTest": "$TC qdisc add dev $DUMMY parent ffff:0 handle ffe0:0 fq", 1001 + "expExitCode": "2", 1002 + "verifyCmd": "$TC -j qdisc ls dev $DUMMY handle ffe0:", 1003 + "matchJSON": [], 1004 + "matchCount": "1", 1005 + "teardown": [ 1006 + "$TC qdisc del dev $DUMMY clsact" 1007 + ] 964 1008 } 965 1009 ]
+1 -1
tools/testing/selftests/user_events/perf_test.c
··· 236 236 ASSERT_EQ(1 << reg.enable_bit, self->check); 237 237 238 238 /* Ensure write shows up at correct offset */ 239 - ASSERT_NE(-1, write(self->data_fd, &reg.write_index, 239 + ASSERT_NE(-1, write(self->data_fd, (void *)&reg.write_index, 240 240 sizeof(reg.write_index))); 241 241 val = (void *)(((char *)perf_page) + perf_page->data_offset); 242 242 ASSERT_EQ(PERF_RECORD_SAMPLE, *val);
+18 -1
tools/testing/selftests/vfio/lib/include/vfio_util.h
··· 4 4 5 5 #include <fcntl.h> 6 6 #include <string.h> 7 - #include <linux/vfio.h> 7 + 8 + #include <uapi/linux/types.h> 9 + #include <linux/iommufd.h> 8 10 #include <linux/list.h> 9 11 #include <linux/pci_regs.h> 12 + #include <linux/vfio.h> 10 13 11 14 #include "../../../kselftest.h" 12 15 ··· 188 185 struct vfio_pci_driver driver; 189 186 }; 190 187 188 + struct iova_allocator { 189 + struct iommu_iova_range *ranges; 190 + u32 nranges; 191 + u32 range_idx; 192 + u64 range_offset; 193 + }; 194 + 191 195 /* 192 196 * Return the BDF string of the device that the test should use. 193 197 * ··· 215 205 struct vfio_pci_device *vfio_pci_device_init(const char *bdf, const char *iommu_mode); 216 206 void vfio_pci_device_cleanup(struct vfio_pci_device *device); 217 207 void vfio_pci_device_reset(struct vfio_pci_device *device); 208 + 209 + struct iommu_iova_range *vfio_pci_iova_ranges(struct vfio_pci_device *device, 210 + u32 *nranges); 211 + 212 + struct iova_allocator *iova_allocator_init(struct vfio_pci_device *device); 213 + void iova_allocator_cleanup(struct iova_allocator *allocator); 214 + iova_t iova_allocator_alloc(struct iova_allocator *allocator, size_t size); 218 215 219 216 int __vfio_pci_dma_map(struct vfio_pci_device *device, 220 217 struct vfio_dma_region *region);
+245 -1
tools/testing/selftests/vfio/lib/vfio_pci_device.c
··· 12 12 #include <sys/mman.h> 13 13 14 14 #include <uapi/linux/types.h> 15 + #include <linux/iommufd.h> 15 16 #include <linux/limits.h> 16 17 #include <linux/mman.h> 18 + #include <linux/overflow.h> 17 19 #include <linux/types.h> 18 20 #include <linux/vfio.h> 19 - #include <linux/iommufd.h> 20 21 21 22 #include "../../../kselftest.h" 22 23 #include <vfio_util.h> ··· 29 28 int __ret = ioctl((_fd), (_op), (__arg)); \ 30 29 VFIO_ASSERT_EQ(__ret, 0, "ioctl(%s, %s, %s) returned %d\n", #_fd, #_op, #_arg, __ret); \ 31 30 } while (0) 31 + 32 + static struct vfio_info_cap_header *next_cap_hdr(void *buf, u32 bufsz, 33 + u32 *cap_offset) 34 + { 35 + struct vfio_info_cap_header *hdr; 36 + 37 + if (!*cap_offset) 38 + return NULL; 39 + 40 + VFIO_ASSERT_LT(*cap_offset, bufsz); 41 + VFIO_ASSERT_GE(bufsz - *cap_offset, sizeof(*hdr)); 42 + 43 + hdr = (struct vfio_info_cap_header *)((u8 *)buf + *cap_offset); 44 + *cap_offset = hdr->next; 45 + 46 + return hdr; 47 + } 48 + 49 + static struct vfio_info_cap_header *vfio_iommu_info_cap_hdr(struct vfio_iommu_type1_info *info, 50 + u16 cap_id) 51 + { 52 + struct vfio_info_cap_header *hdr; 53 + u32 cap_offset = info->cap_offset; 54 + u32 max_depth; 55 + u32 depth = 0; 56 + 57 + if (!(info->flags & VFIO_IOMMU_INFO_CAPS)) 58 + return NULL; 59 + 60 + if (cap_offset) 61 + VFIO_ASSERT_GE(cap_offset, sizeof(*info)); 62 + 63 + max_depth = (info->argsz - sizeof(*info)) / sizeof(*hdr); 64 + 65 + while ((hdr = next_cap_hdr(info, info->argsz, &cap_offset))) { 66 + depth++; 67 + VFIO_ASSERT_LE(depth, max_depth, "Capability chain contains a cycle\n"); 68 + 69 + if (hdr->id == cap_id) 70 + return hdr; 71 + } 72 + 73 + return NULL; 74 + } 75 + 76 + /* Return buffer including capability chain, if present. 
Free with free() */ 77 + static struct vfio_iommu_type1_info *vfio_iommu_get_info(struct vfio_pci_device *device) 78 + { 79 + struct vfio_iommu_type1_info *info; 80 + 81 + info = malloc(sizeof(*info)); 82 + VFIO_ASSERT_NOT_NULL(info); 83 + 84 + *info = (struct vfio_iommu_type1_info) { 85 + .argsz = sizeof(*info), 86 + }; 87 + 88 + ioctl_assert(device->container_fd, VFIO_IOMMU_GET_INFO, info); 89 + VFIO_ASSERT_GE(info->argsz, sizeof(*info)); 90 + 91 + info = realloc(info, info->argsz); 92 + VFIO_ASSERT_NOT_NULL(info); 93 + 94 + ioctl_assert(device->container_fd, VFIO_IOMMU_GET_INFO, info); 95 + VFIO_ASSERT_GE(info->argsz, sizeof(*info)); 96 + 97 + return info; 98 + } 99 + 100 + /* 101 + * Return iova ranges for the device's container. Normalize vfio_iommu_type1 to 102 + * report iommufd's iommu_iova_range. Free with free(). 103 + */ 104 + static struct iommu_iova_range *vfio_iommu_iova_ranges(struct vfio_pci_device *device, 105 + u32 *nranges) 106 + { 107 + struct vfio_iommu_type1_info_cap_iova_range *cap_range; 108 + struct vfio_iommu_type1_info *info; 109 + struct vfio_info_cap_header *hdr; 110 + struct iommu_iova_range *ranges = NULL; 111 + 112 + info = vfio_iommu_get_info(device); 113 + hdr = vfio_iommu_info_cap_hdr(info, VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE); 114 + VFIO_ASSERT_NOT_NULL(hdr); 115 + 116 + cap_range = container_of(hdr, struct vfio_iommu_type1_info_cap_iova_range, header); 117 + VFIO_ASSERT_GT(cap_range->nr_iovas, 0); 118 + 119 + ranges = calloc(cap_range->nr_iovas, sizeof(*ranges)); 120 + VFIO_ASSERT_NOT_NULL(ranges); 121 + 122 + for (u32 i = 0; i < cap_range->nr_iovas; i++) { 123 + ranges[i] = (struct iommu_iova_range){ 124 + .start = cap_range->iova_ranges[i].start, 125 + .last = cap_range->iova_ranges[i].end, 126 + }; 127 + } 128 + 129 + *nranges = cap_range->nr_iovas; 130 + 131 + free(info); 132 + return ranges; 133 + } 134 + 135 + /* Return iova ranges of the device's IOAS. 
Free with free() */ 136 + static struct iommu_iova_range *iommufd_iova_ranges(struct vfio_pci_device *device, 137 + u32 *nranges) 138 + { 139 + struct iommu_iova_range *ranges; 140 + int ret; 141 + 142 + struct iommu_ioas_iova_ranges query = { 143 + .size = sizeof(query), 144 + .ioas_id = device->ioas_id, 145 + }; 146 + 147 + ret = ioctl(device->iommufd, IOMMU_IOAS_IOVA_RANGES, &query); 148 + VFIO_ASSERT_EQ(ret, -1); 149 + VFIO_ASSERT_EQ(errno, EMSGSIZE); 150 + VFIO_ASSERT_GT(query.num_iovas, 0); 151 + 152 + ranges = calloc(query.num_iovas, sizeof(*ranges)); 153 + VFIO_ASSERT_NOT_NULL(ranges); 154 + 155 + query.allowed_iovas = (uintptr_t)ranges; 156 + 157 + ioctl_assert(device->iommufd, IOMMU_IOAS_IOVA_RANGES, &query); 158 + *nranges = query.num_iovas; 159 + 160 + return ranges; 161 + } 162 + 163 + static int iova_range_comp(const void *a, const void *b) 164 + { 165 + const struct iommu_iova_range *ra = a, *rb = b; 166 + 167 + if (ra->start < rb->start) 168 + return -1; 169 + 170 + if (ra->start > rb->start) 171 + return 1; 172 + 173 + return 0; 174 + } 175 + 176 + /* Return sorted IOVA ranges of the device. Free with free(). 
*/ 177 + struct iommu_iova_range *vfio_pci_iova_ranges(struct vfio_pci_device *device, 178 + u32 *nranges) 179 + { 180 + struct iommu_iova_range *ranges; 181 + 182 + if (device->iommufd) 183 + ranges = iommufd_iova_ranges(device, nranges); 184 + else 185 + ranges = vfio_iommu_iova_ranges(device, nranges); 186 + 187 + if (!ranges) 188 + return NULL; 189 + 190 + VFIO_ASSERT_GT(*nranges, 0); 191 + 192 + /* Sort and check that ranges are sane and non-overlapping */ 193 + qsort(ranges, *nranges, sizeof(*ranges), iova_range_comp); 194 + VFIO_ASSERT_LT(ranges[0].start, ranges[0].last); 195 + 196 + for (u32 i = 1; i < *nranges; i++) { 197 + VFIO_ASSERT_LT(ranges[i].start, ranges[i].last); 198 + VFIO_ASSERT_LT(ranges[i - 1].last, ranges[i].start); 199 + } 200 + 201 + return ranges; 202 + } 203 + 204 + struct iova_allocator *iova_allocator_init(struct vfio_pci_device *device) 205 + { 206 + struct iova_allocator *allocator; 207 + struct iommu_iova_range *ranges; 208 + u32 nranges; 209 + 210 + ranges = vfio_pci_iova_ranges(device, &nranges); 211 + VFIO_ASSERT_NOT_NULL(ranges); 212 + 213 + allocator = malloc(sizeof(*allocator)); 214 + VFIO_ASSERT_NOT_NULL(allocator); 215 + 216 + *allocator = (struct iova_allocator){ 217 + .ranges = ranges, 218 + .nranges = nranges, 219 + .range_idx = 0, 220 + .range_offset = 0, 221 + }; 222 + 223 + return allocator; 224 + } 225 + 226 + void iova_allocator_cleanup(struct iova_allocator *allocator) 227 + { 228 + free(allocator->ranges); 229 + free(allocator); 230 + } 231 + 232 + iova_t iova_allocator_alloc(struct iova_allocator *allocator, size_t size) 233 + { 234 + VFIO_ASSERT_GT(size, 0, "Invalid size arg, zero\n"); 235 + VFIO_ASSERT_EQ(size & (size - 1), 0, "Invalid size arg, non-power-of-2\n"); 236 + 237 + for (;;) { 238 + struct iommu_iova_range *range; 239 + iova_t iova, last; 240 + 241 + VFIO_ASSERT_LT(allocator->range_idx, allocator->nranges, 242 + "IOVA allocator out of space\n"); 243 + 244 + range = 
&allocator->ranges[allocator->range_idx]; 245 + iova = range->start + allocator->range_offset; 246 + 247 + /* Check for sufficient space at the current offset */ 248 + if (check_add_overflow(iova, size - 1, &last) || 249 + last > range->last) 250 + goto next_range; 251 + 252 + /* Align iova to size */ 253 + iova = last & ~(size - 1); 254 + 255 + /* Check for sufficient space at the aligned iova */ 256 + if (check_add_overflow(iova, size - 1, &last) || 257 + last > range->last) 258 + goto next_range; 259 + 260 + if (last == range->last) { 261 + allocator->range_idx++; 262 + allocator->range_offset = 0; 263 + } else { 264 + allocator->range_offset = last - range->start + 1; 265 + } 266 + 267 + return iova; 268 + 269 + next_range: 270 + allocator->range_idx++; 271 + allocator->range_offset = 0; 272 + } 273 + } 32 274 33 275 iova_t __to_iova(struct vfio_pci_device *device, void *vaddr) 34 276 {
+17 -3
tools/testing/selftests/vfio/vfio_dma_mapping_test.c
··· 3 3 #include <sys/mman.h> 4 4 #include <unistd.h> 5 5 6 + #include <uapi/linux/types.h> 7 + #include <linux/iommufd.h> 6 8 #include <linux/limits.h> 7 9 #include <linux/mman.h> 8 10 #include <linux/sizes.h> ··· 95 93 96 94 FIXTURE(vfio_dma_mapping_test) { 97 95 struct vfio_pci_device *device; 96 + struct iova_allocator *iova_allocator; 98 97 }; 99 98 100 99 FIXTURE_VARIANT(vfio_dma_mapping_test) { ··· 120 117 FIXTURE_SETUP(vfio_dma_mapping_test) 121 118 { 122 119 self->device = vfio_pci_device_init(device_bdf, variant->iommu_mode); 120 + self->iova_allocator = iova_allocator_init(self->device); 123 121 } 124 122 125 123 FIXTURE_TEARDOWN(vfio_dma_mapping_test) 126 124 { 125 + iova_allocator_cleanup(self->iova_allocator); 127 126 vfio_pci_device_cleanup(self->device); 128 127 } 129 128 ··· 147 142 else 148 143 ASSERT_NE(region.vaddr, MAP_FAILED); 149 144 150 - region.iova = (u64)region.vaddr; 145 + region.iova = iova_allocator_alloc(self->iova_allocator, size); 151 146 region.size = size; 152 147 153 148 vfio_pci_dma_map(self->device, &region); ··· 224 219 FIXTURE_SETUP(vfio_dma_map_limit_test) 225 220 { 226 221 struct vfio_dma_region *region = &self->region; 222 + struct iommu_iova_range *ranges; 227 223 u64 region_size = getpagesize(); 224 + iova_t last_iova; 225 + u32 nranges; 228 226 229 227 /* 230 228 * Over-allocate mmap by double the size to provide enough backing vaddr ··· 240 232 MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); 241 233 ASSERT_NE(region->vaddr, MAP_FAILED); 242 234 243 - /* One page prior to the end of address space */ 244 - region->iova = ~(iova_t)0 & ~(region_size - 1); 235 + ranges = vfio_pci_iova_ranges(self->device, &nranges); 236 + VFIO_ASSERT_NOT_NULL(ranges); 237 + last_iova = ranges[nranges - 1].last; 238 + free(ranges); 239 + 240 + /* One page prior to the last iova */ 241 + region->iova = last_iova & ~(region_size - 1); 245 242 region->size = region_size; 246 243 } 247 244 ··· 289 276 struct vfio_dma_region *region = &self->region; 290 277 
int rc; 291 278 279 + region->iova = ~(iova_t)0 & ~(region->size - 1); 292 280 region->size = self->mmap_size; 293 281 294 282 rc = __vfio_pci_dma_map(self->device, region);
+8 -4
tools/testing/selftests/vfio/vfio_pci_driver_test.c
··· 19 19 } while (0) 20 20 21 21 static void region_setup(struct vfio_pci_device *device, 22 + struct iova_allocator *iova_allocator, 22 23 struct vfio_dma_region *region, u64 size) 23 24 { 24 25 const int flags = MAP_SHARED | MAP_ANONYMOUS; ··· 30 29 VFIO_ASSERT_NE(vaddr, MAP_FAILED); 31 30 32 31 region->vaddr = vaddr; 33 - region->iova = (u64)vaddr; 32 + region->iova = iova_allocator_alloc(iova_allocator, size); 34 33 region->size = size; 35 34 36 35 vfio_pci_dma_map(device, region); ··· 45 44 46 45 FIXTURE(vfio_pci_driver_test) { 47 46 struct vfio_pci_device *device; 47 + struct iova_allocator *iova_allocator; 48 48 struct vfio_dma_region memcpy_region; 49 49 void *vaddr; 50 50 int msi_fd; ··· 74 72 struct vfio_pci_driver *driver; 75 73 76 74 self->device = vfio_pci_device_init(device_bdf, variant->iommu_mode); 75 + self->iova_allocator = iova_allocator_init(self->device); 77 76 78 77 driver = &self->device->driver; 79 78 80 - region_setup(self->device, &self->memcpy_region, SZ_1G); 81 - region_setup(self->device, &driver->region, SZ_2M); 79 + region_setup(self->device, self->iova_allocator, &self->memcpy_region, SZ_1G); 80 + region_setup(self->device, self->iova_allocator, &driver->region, SZ_2M); 82 81 83 82 /* Any IOVA that doesn't overlap memcpy_region and driver->region. */ 84 - self->unmapped_iova = 8UL * SZ_1G; 83 + self->unmapped_iova = iova_allocator_alloc(self->iova_allocator, SZ_1G); 85 84 86 85 vfio_pci_driver_init(self->device); 87 86 self->msi_fd = self->device->msi_eventfds[driver->msi]; ··· 111 108 region_teardown(self->device, &self->memcpy_region); 112 109 region_teardown(self->device, &driver->region); 113 110 111 + iova_allocator_cleanup(self->iova_allocator); 114 112 vfio_pci_device_cleanup(self->device); 115 113 } 116 114
+33 -14
virt/kvm/guest_memfd.c
··· 623 623 return r; 624 624 } 625 625 626 - void kvm_gmem_unbind(struct kvm_memory_slot *slot) 626 + static void __kvm_gmem_unbind(struct kvm_memory_slot *slot, struct kvm_gmem *gmem) 627 627 { 628 628 unsigned long start = slot->gmem.pgoff; 629 629 unsigned long end = start + slot->npages; 630 - struct kvm_gmem *gmem; 631 - struct file *file; 632 630 633 - /* 634 - * Nothing to do if the underlying file was already closed (or is being 635 - * closed right now), kvm_gmem_release() invalidates all bindings. 636 - */ 637 - file = kvm_gmem_get_file(slot); 638 - if (!file) 639 - return; 640 - 641 - gmem = file->private_data; 642 - 643 - filemap_invalidate_lock(file->f_mapping); 644 631 xa_store_range(&gmem->bindings, start, end - 1, NULL, GFP_KERNEL); 645 632 646 633 /* ··· 635 648 * cannot see this memslot. 636 649 */ 637 650 WRITE_ONCE(slot->gmem.file, NULL); 651 + } 652 + 653 + void kvm_gmem_unbind(struct kvm_memory_slot *slot) 654 + { 655 + struct file *file; 656 + 657 + /* 658 + * Nothing to do if the underlying file was _already_ closed, as 659 + * kvm_gmem_release() invalidates and nullifies all bindings. 660 + */ 661 + if (!slot->gmem.file) 662 + return; 663 + 664 + file = kvm_gmem_get_file(slot); 665 + 666 + /* 667 + * However, if the file is _being_ closed, then the bindings need to be 668 + * removed as kvm_gmem_release() might not run until after the memslot 669 + * is freed. Note, modifying the bindings is safe even though the file 670 + * is dying as kvm_gmem_release() nullifies slot->gmem.file under 671 + * slots_lock, and only puts its reference to KVM after destroying all 672 + * bindings. I.e. reaching this point means kvm_gmem_release() hasn't 673 + * yet destroyed the bindings or freed the gmem_file, and can't do so 674 + * until the caller drops slots_lock. 
675 + */ 676 + if (!file) { 677 + __kvm_gmem_unbind(slot, slot->gmem.file->private_data); 678 + return; 679 + } 680 + 681 + filemap_invalidate_lock(file->f_mapping); 682 + __kvm_gmem_unbind(slot, file->private_data); 638 683 filemap_invalidate_unlock(file->f_mapping); 639 684 640 685 fput(file);