Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR (net-6.12-rc8).

Conflicts:

tools/testing/selftests/net/.gitignore
252e01e68241 ("selftests: net: add netlink-dumps to .gitignore")
be43a6b23829 ("selftests: ncdevmem: Move ncdevmem under drivers/net/hw")
https://lore.kernel.org/all/20241113122359.1b95180a@canb.auug.org.au/

drivers/net/phy/phylink.c
671154f174e0 ("net: phylink: ensure PHY momentary link-fails are handled")
7530ea26c810 ("net: phylink: remove "using_mac_select_pcs"")

Adjacent changes:

drivers/net/ethernet/stmicro/stmmac/dwmac-intel-plat.c
5b366eae7193 ("stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines")
e96321fad3ad ("net: ethernet: Switch back to struct platform_driver::remove()")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+2596 -1276
+1
.mailmap
···
665  665   Thomas Graf <tgraf@suug.ch>
666  666   Thomas Körper <socketcan@esd.eu> <thomas.koerper@esd.eu>
667  667   Thomas Pedersen <twp@codeaurora.org>
     668 + Thorsten Blum <thorsten.blum@linux.dev> <thorsten.blum@toblux.com>
668  669   Tiezhu Yang <yangtiezhu@loongson.cn> <kernelpatch@126.com>
669  670   Tingwei Zhang <quic_tingwei@quicinc.com> <tingwei@codeaurora.org>
670  671   Tirupathi Reddy <quic_tirupath@quicinc.com> <tirupath@codeaurora.org>
+9
Documentation/admin-guide/cgroup-v2.rst
···
1599  1599   pglazyfreed (npn)
1600  1600     Amount of reclaimed lazyfree pages
1601  1601 
      1602 + swpin_zero
      1603 +   Number of pages swapped into memory and filled with zero, where I/O
      1604 +   was optimized out because the page content was detected to be zero
      1605 +   during swapout.
      1606 +
      1607 + swpout_zero
      1608 +   Number of zero-filled pages swapped out with I/O skipped due to the
      1609 +   content being detected as zero.
      1610 +
1602  1611   zswpin
1603  1612     Number of pages moved in to memory from zswap.
1604  1613 
+10 -1
Documentation/admin-guide/kernel-parameters.txt
···
6689  6689     0: no polling (default)
6690  6690 
6691  6691   thp_anon=  [KNL]
6692       -   Format: <size>,<size>[KMG]:<state>;<size>-<size>[KMG]:<state>
      6692 +   Format: <size>[KMG],<size>[KMG]:<state>;<size>[KMG]-<size>[KMG]:<state>
6693  6693     state is one of "always", "madvise", "never" or "inherit".
6694  6694     Control the default behavior of the system with respect
6695  6695     to anonymous transparent hugepages.
···
6727  6727 
6728  6728   torture.verbose_sleep_duration= [KNL]
6729  6729     Duration of each verbose-printk() sleep in jiffies.
      6730 +
      6731 + tpm.disable_pcr_integrity= [HW,TPM]
      6732 +   Do not protect PCR registers from unintended physical
      6733 +   access, or interposers in the bus by the means of
      6734 +   having an integrity protected session wrapped around
      6735 +   TPM2_PCR_Extend command. Consider this in a situation
      6736 +   where TPM is heavily utilized by IMA, thus protection
      6737 +   causing a major performance hit, and the space where
      6738 +   machines are deployed is by other means guarded.
6730  6739 
6731  6740   tpm_suspend_pcr=[HW,TPM]
6732  6741     Format: integer pcr id
+1 -1
Documentation/admin-guide/mm/transhuge.rst
···
303  303   kernel command line.
304  304 
305  305   Alternatively, each supported anonymous THP size can be controlled by
306      - passing ``thp_anon=<size>,<size>[KMG]:<state>;<size>-<size>[KMG]:<state>``,
     306 + passing ``thp_anon=<size>[KMG],<size>[KMG]:<state>;<size>[KMG]-<size>[KMG]:<state>``,
307  307   where ``<size>`` is the THP size (must be a power of 2 of PAGE_SIZE and
308  308   supported anonymous THP) and ``<state>`` is one of ``always``, ``madvise``,
309  309   ``never`` or ``inherit``.
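The corrected ``thp_anon=`` syntax in the hunk above takes a size, or an inclusive ``-`` size range, each with a ``[KMG]`` suffix, and entries separated by ``;``. As a sketch, a boot command line mixing the forms might look like this (the particular sizes and states are illustrative; valid sizes depend on PAGE_SIZE and the anonymous THP sizes the kernel supports):

```
thp_anon=16K-64K:always;128K,512K:inherit;256K:madvise;1M-2M:never
```

Comma-separated sizes share one state, and a range applies the state to every supported size inside it.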
+9
Documentation/networking/devmem.rst
···
225  225   Failure to do so will exhaust the limited dmabuf that is bound to the RX queue
226  226   and will lead to packet drops.
227  227 
     228 + The user must pass no more than 128 tokens, with no more than 1024 total frags
     229 + among the token->token_count across all the tokens. If the user provides more
     230 + than 1024 frags, the kernel will free up to 1024 frags and return early.
     231 +
     232 + The kernel returns the number of actual frags freed. The number of frags freed
     233 + can be less than the tokens provided by the user in case of:
     234 +
     235 + (a) an internal kernel leak bug.
     236 + (b) the user passed more than 1024 frags.
228  237 
229  238   Implementation & Caveats
230  239   ========================
+7 -7
Documentation/security/landlock.rst
···
 11   11 
 12   12   Landlock's goal is to create scoped access-control (i.e. sandboxing). To
 13   13   harden a whole system, this feature should be available to any process,
 14      - including unprivileged ones. Because such process may be compromised or
      14 + including unprivileged ones. Because such a process may be compromised or
 15   15   backdoored (i.e. untrusted), Landlock's features must be safe to use from the
 16   16   kernel and other processes point of view. Landlock's interface must therefore
 17   17   expose a minimal attack surface.
 18   18 
 19   19   Landlock is designed to be usable by unprivileged processes while following the
 20   20   system security policy enforced by other access control mechanisms (e.g. DAC,
 21      - LSM). Indeed, a Landlock rule shall not interfere with other access-controls
 22      - enforced on the system, only add more restrictions.
      21 + LSM). A Landlock rule shall not interfere with other access-controls enforced
      22 + on the system, only add more restrictions.
 23   23 
 24   24   Any user can enforce Landlock rulesets on their processes. They are merged and
 25      - evaluated according to the inherited ones in a way that ensures that only more
      25 + evaluated against inherited rulesets in a way that ensures that only more
 26   26   constraints can be added.
 27   27 
 28   28   User space documentation can be found here:
···
 43   43     only impact the processes requesting them.
 44   44   * Resources (e.g. file descriptors) directly obtained from the kernel by a
 45   45     sandboxed process shall retain their scoped accesses (at the time of resource
 46      -   acquisition) whatever process use them.
      46 +   acquisition) whatever process uses them.
 47   47     Cf. `File descriptor access rights`_.
 48   48 
 49   49   Design choices
···
 71   71   Taking the ``LANDLOCK_ACCESS_FS_TRUNCATE`` right as an example, it may be
 72   72   allowed to open a file for writing without being allowed to
 73   73   :manpage:`ftruncate` the resulting file descriptor if the related file
 74      - hierarchy doesn't grant such access right. The following sequences of
      74 + hierarchy doesn't grant that access right. The following sequences of
 75   75   operations have the same semantic and should then have the same result:
 76   76 
 77   77   * ``truncate(path);``
···
 81   81   attached to file descriptors are retained even if they are passed between
 82   82   processes (e.g. through a Unix domain socket). Such access rights will then be
 83   83   enforced even if the receiving process is not sandboxed by Landlock. Indeed,
 84      - this is required to keep a consistent access control over the whole system, and
      84 + this is required to keep access controls consistent over the whole system, and
 85   85   this avoids unattended bypasses through file descriptor passing (i.e. confused
 86   86   deputy attack).
 87   87 
+45 -45
Documentation/userspace-api/landlock.rst
···
  8    8   =====================================
  9    9 
 10   10   :Author: Mickaël Salaün
 11      - :Date: September 2024
      11 + :Date: October 2024
 12   12 
 13      - The goal of Landlock is to enable to restrict ambient rights (e.g. global
      13 + The goal of Landlock is to enable restriction of ambient rights (e.g. global
 14   14   filesystem or network access) for a set of processes. Because Landlock
 15      - is a stackable LSM, it makes possible to create safe security sandboxes as new
 16      - security layers in addition to the existing system-wide access-controls. This
 17      - kind of sandbox is expected to help mitigate the security impact of bugs or
      15 + is a stackable LSM, it makes it possible to create safe security sandboxes as
      16 + new security layers in addition to the existing system-wide access-controls.
      17 + This kind of sandbox is expected to help mitigate the security impact of bugs or
 18   18   unexpected/malicious behaviors in user space applications. Landlock empowers
 19   19   any process, including unprivileged ones, to securely restrict themselves.
 20   20 
···
 86   86       LANDLOCK_SCOPE_SIGNAL,
 87   87   };
 88   88 
 89      - Because we may not know on which kernel version an application will be
 90      - executed, it is safer to follow a best-effort security approach. Indeed, we
      89 + Because we may not know which kernel version an application will be executed
      90 + on, it is safer to follow a best-effort security approach. Indeed, we
 91   91   should try to protect users as much as possible whatever the kernel they are
 92   92   using.
 93   93 
···
129  129           LANDLOCK_SCOPE_SIGNAL);
130  130   }
131  131 
132      - This enables to create an inclusive ruleset that will contain our rules.
     132 + This enables the creation of an inclusive ruleset that will contain our rules.
133  133 
134  134   .. code-block:: c
135  135 
···
219  219   now restricted and this policy will be enforced on all its subsequently created
220  220   children as well. Once a thread is landlocked, there is no way to remove its
221  221   security policy; only adding more restrictions is allowed. These threads are
222      - now in a new Landlock domain, merge of their parent one (if any) with the new
223      - ruleset.
     222 + now in a new Landlock domain, which is a merger of their parent one (if any)
     223 + with the new ruleset.
224  224 
225  225   Full working code can be found in `samples/landlock/sandboxer.c`_.
226  226 
227  227   Good practices
228  228   --------------
229  229 
230      - It is recommended setting access rights to file hierarchy leaves as much as
     230 + It is recommended to set access rights to file hierarchy leaves as much as
231  231   possible. For instance, it is better to be able to have ``~/doc/`` as a
232  232   read-only hierarchy and ``~/tmp/`` as a read-write hierarchy, compared to
233  233   ``~/`` as a read-only hierarchy and ``~/tmp/`` as a read-write hierarchy.
234  234   Following this good practice leads to self-sufficient hierarchies that do not
235  235   depend on their location (i.e. parent directories). This is particularly
236  236   relevant when we want to allow linking or renaming. Indeed, having consistent
237      - access rights per directory enables to change the location of such directory
     237 + access rights per directory enables changing the location of such directories
238  238   without relying on the destination directory access rights (except those that
239  239   are required for this operation, see ``LANDLOCK_ACCESS_FS_REFER``
240  240   documentation).
241  241 
242  242   Having self-sufficient hierarchies also helps to tighten the required access
243  243   rights to the minimal set of data. This also helps avoid sinkhole directories,
244      - i.e.  directories where data can be linked to but not linked from. However,
     244 + i.e. directories where data can be linked to but not linked from. However,
245  245   this depends on data organization, which might not be controlled by developers.
246  246   In this case, granting read-write access to ``~/tmp/``, instead of write-only
247      - access, would potentially allow to move ``~/tmp/`` to a non-readable directory
     247 + access, would potentially allow moving ``~/tmp/`` to a non-readable directory
248  248   and still keep the ability to list the content of ``~/tmp/``.
249  249 
250  250   Layers of file path access rights
251  251   ---------------------------------
252  252 
253  253   Each time a thread enforces a ruleset on itself, it updates its Landlock domain
254      - with a new layer of policy. Indeed, this complementary policy is stacked with
255      - the potentially other rulesets already restricting this thread. A sandboxed
256      - thread can then safely add more constraints to itself with a new enforced
257      - ruleset.
     254 + with a new layer of policy. This complementary policy is stacked with any
     255 + other rulesets potentially already restricting this thread. A sandboxed thread
     256 + can then safely add more constraints to itself with a new enforced ruleset.
258  257 
259  258   One policy layer grants access to a file path if at least one of its rules
260  259   encountered on the path grants the access. A sandboxed thread can only access
···
264  265   Bind mounts and OverlayFS
265  266   -------------------------
266  267 
267      - Landlock enables to restrict access to file hierarchies, which means that these
     268 + Landlock enables restricting access to file hierarchies, which means that these
268  269   access rights can be propagated with bind mounts (cf.
269  270   Documentation/filesystems/sharedsubtree.rst) but not with
270  271   Documentation/filesystems/overlayfs.rst.
···
277  278   are the result of bind mounts or not.
278  279 
279  280   An OverlayFS mount point consists of upper and lower layers. These layers are
280      - combined in a merge directory, result of the mount point. This merge hierarchy
281      - may include files from the upper and lower layers, but modifications performed
282      - on the merge hierarchy only reflects on the upper layer. From a Landlock
283      - policy point of view, each OverlayFS layers and merge hierarchies are
284      - standalone and contains their own set of files and directories, which is
285      - different from bind mounts. A policy restricting an OverlayFS layer will not
286      - restrict the resulted merged hierarchy, and vice versa. Landlock users should
287      - then only think about file hierarchies they want to allow access to, regardless
288      - of the underlying filesystem.
     281 + combined in a merge directory, and that merged directory becomes available at
     282 + the mount point. This merge hierarchy may include files from the upper and
     283 + lower layers, but modifications performed on the merge hierarchy only reflect
     284 + on the upper layer. From a Landlock policy point of view, all OverlayFS layers
     285 + and merge hierarchies are standalone and each contains their own set of files
     286 + and directories, which is different from bind mounts. A policy restricting an
     287 + OverlayFS layer will not restrict the resulted merged hierarchy, and vice versa.
     288 + Landlock users should then only think about file hierarchies they want to allow
     289 + access to, regardless of the underlying filesystem.
289  290 
290  291   Inheritance
291  292   -----------
292  293 
293  294   Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
294      - restrictions from its parent. This is similar to the seccomp inheritance (cf.
     295 + restrictions from its parent. This is similar to seccomp inheritance (cf.
295  296   Documentation/userspace-api/seccomp_filter.rst) or any other LSM dealing with
296  297   task's :manpage:`credentials(7)`. For instance, one process's thread may apply
297  298   Landlock rules to itself, but they will not be automatically applied to other
···
310  311   A sandboxed process has less privileges than a non-sandboxed process and must
311  312   then be subject to additional restrictions when manipulating another process.
312  313   To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
313      - process, a sandboxed process should have a subset of the target process rules,
314      - which means the tracee must be in a sub-domain of the tracer.
     314 + process, a sandboxed process should have a superset of the target process's
     315 + access rights, which means the tracee must be in a sub-domain of the tracer.
315  316 
316  317   IPC scoping
317  318   -----------
···
321  322   for a set of actions by specifying it on a ruleset. For example, if a
322  323   sandboxed process should not be able to :manpage:`connect(2)` to a
323  324   non-sandboxed process through abstract :manpage:`unix(7)` sockets, we can
324      - specify such restriction with ``LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET``.
     325 + specify such a restriction with ``LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET``.
325  326   Moreover, if a sandboxed process should not be able to send a signal to a
326  327   non-sandboxed process, we can specify this restriction with
327  328   ``LANDLOCK_SCOPE_SIGNAL``.
···
393  394   Landlock is designed to be compatible with past and future versions of the
394  395   kernel. This is achieved thanks to the system call attributes and the
395  396   associated bitflags, particularly the ruleset's ``handled_access_fs``. Making
396      - handled access right explicit enables the kernel and user space to have a clear
     397 + handled access rights explicit enables the kernel and user space to have a clear
397  398   contract with each other. This is required to make sure sandboxing will not
398  399   get stricter with a system update, which could break applications.
399  400 
···
562  563   Starting with the Landlock ABI version 3, it is now possible to securely control
563  564   truncation thanks to the new ``LANDLOCK_ACCESS_FS_TRUNCATE`` access right.
564  565 
565      - Network support (ABI < 4)
566      - -------------------------
     566 + TCP bind and connect (ABI < 4)
     567 + ------------------------------
567  568 
568  569   Starting with the Landlock ABI version 4, it is now possible to restrict TCP
569  570   bind and connect actions to only a set of allowed ports thanks to the new
570  571   ``LANDLOCK_ACCESS_NET_BIND_TCP`` and ``LANDLOCK_ACCESS_NET_CONNECT_TCP``
571  572   access rights.
572  573 
573      - IOCTL (ABI < 5)
574      - ---------------
     574 + Device IOCTL (ABI < 5)
     575 + ----------------------
575  576 
576  577   IOCTL operations could not be denied before the fifth Landlock ABI, so
577  578   :manpage:`ioctl(2)` is always allowed when using a kernel that only supports an
578  579   earlier ABI.
579  580 
580  581   Starting with the Landlock ABI version 5, it is possible to restrict the use of
581      - :manpage:`ioctl(2)` using the new ``LANDLOCK_ACCESS_FS_IOCTL_DEV`` right.
     582 + :manpage:`ioctl(2)` on character and block devices using the new
     583 + ``LANDLOCK_ACCESS_FS_IOCTL_DEV`` right.
582  584 
583      - Abstract UNIX socket scoping (ABI < 6)
584      - --------------------------------------
     585 + Abstract UNIX socket (ABI < 6)
     586 + ------------------------------
585  587 
586  588   Starting with the Landlock ABI version 6, it is possible to restrict
587  589   connections to an abstract :manpage:`unix(7)` socket by setting
588  590   ``LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET`` to the ``scoped`` ruleset attribute.
589  591 
590      - Signal scoping (ABI < 6)
591      - ------------------------
     592 + Signal (ABI < 6)
     593 + ----------------
592  594 
593  595   Starting with the Landlock ABI version 6, it is possible to restrict
594  596   :manpage:`signal(7)` sending by setting ``LANDLOCK_SCOPE_SIGNAL`` to the
···
605  605 
606  606   Landlock was first introduced in Linux 5.13 but it must be configured at build
607  607   time with ``CONFIG_SECURITY_LANDLOCK=y``. Landlock must also be enabled at boot
608      - time as the other security modules. The list of security modules enabled by
     608 + time like other security modules. The list of security modules enabled by
609  609   default is set with ``CONFIG_LSM``. The kernel configuration should then
610      - contains ``CONFIG_LSM=landlock,[...]`` with ``[...]`` as the list of other
     610 + contain ``CONFIG_LSM=landlock,[...]`` with ``[...]`` as the list of other
611  611   potentially useful security modules for the running system (see the
612  612   ``CONFIG_LSM`` help).
613  613 
···
669  669   What about user space sandbox managers?
670  670   ---------------------------------------
671  671 
672      - Using user space process to enforce restrictions on kernel resources can lead
     672 + Using user space processes to enforce restrictions on kernel resources can lead
673  673   to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring of
674  674   the OS code and state
675  675   <https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).
+42 -2
MAINTAINERS
···
 1174  1174   F: drivers/hid/amd-sfh-hid/
 1175  1175 
 1176  1176   AMD SPI DRIVER
 1177       - M: Sanjay R Mehta <sanju.mehta@amd.com>
 1178       - S: Maintained
       1177 + M: Raju Rangoju <Raju.Rangoju@amd.com>
       1178 + L: linux-spi@vger.kernel.org
       1179 + S: Supported
 1179  1180   F: drivers/spi/spi-amd.c
 1180  1181 
 1181  1182   AMD XGBE DRIVER
···
19610 19609   F: Documentation/devicetree/bindings/i2c/renesas,iic-emev2.yaml
19611 19610   F: drivers/i2c/busses/i2c-emev2.c
19612 19611 
      19612 + RENESAS ETHERNET AVB DRIVER
      19613 + M: Paul Barker <paul.barker.ct@bp.renesas.com>
      19614 + M: Niklas Söderlund <niklas.soderlund@ragnatech.se>
      19615 + L: netdev@vger.kernel.org
      19616 + L: linux-renesas-soc@vger.kernel.org
      19617 + S: Supported
      19618 + F: Documentation/devicetree/bindings/net/renesas,etheravb.yaml
      19619 + F: drivers/net/ethernet/renesas/Kconfig
      19620 + F: drivers/net/ethernet/renesas/Makefile
      19621 + F: drivers/net/ethernet/renesas/ravb*
      19622 +
19613 19623   RENESAS ETHERNET SWITCH DRIVER
19614 19624   R: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
19615 19625   L: netdev@vger.kernel.org
···
19669 19657   F: Documentation/devicetree/bindings/i2c/renesas,rmobile-iic.yaml
19670 19658   F: drivers/i2c/busses/i2c-rcar.c
19671 19659   F: drivers/i2c/busses/i2c-sh_mobile.c
      19660 +
      19661 + RENESAS R-CAR SATA DRIVER
      19662 + M: Geert Uytterhoeven <geert+renesas@glider.be>
      19663 + L: linux-ide@vger.kernel.org
      19664 + L: linux-renesas-soc@vger.kernel.org
      19665 + S: Supported
      19666 + F: Documentation/devicetree/bindings/ata/renesas,rcar-sata.yaml
      19667 + F: drivers/ata/sata_rcar.c
19672 19668 
19673 19669   RENESAS R-CAR THERMAL DRIVERS
19674 19670   M: Niklas Söderlund <niklas.soderlund@ragnatech.se>
···
19752 19732   S: Supported
19753 19733   F: Documentation/devicetree/bindings/i2c/renesas,rzv2m.yaml
19754 19734   F: drivers/i2c/busses/i2c-rzv2m.c
      19735 +
      19736 + RENESAS SUPERH ETHERNET DRIVER
      19737 + M: Niklas Söderlund <niklas.soderlund@ragnatech.se>
      19738 + L: netdev@vger.kernel.org
      19739 + L: linux-renesas-soc@vger.kernel.org
      19740 + S: Supported
      19741 + F: Documentation/devicetree/bindings/net/renesas,ether.yaml
      19742 + F: drivers/net/ethernet/renesas/Kconfig
      19743 + F: drivers/net/ethernet/renesas/Makefile
      19744 + F: drivers/net/ethernet/renesas/sh_eth*
      19745 + F: include/linux/sh_eth.h
19755 19746 
19756 19747   RENESAS USB PHY DRIVER
19757 19748   M: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
···
21685 21654   S: Supported
21686 21655   W: https://github.com/thesofproject/linux/
21687 21656   F: sound/soc/sof/
      21657 +
      21658 + SOUND - GENERIC SOUND CARD (Simple-Audio-Card, Audio-Graph-Card)
      21659 + M: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      21660 + S: Supported
      21661 + L: linux-sound@vger.kernel.org
      21662 + F: sound/soc/generic/
      21663 + F: include/sound/simple_card*
      21664 + F: Documentation/devicetree/bindings/sound/simple-card.yaml
      21665 + F: Documentation/devicetree/bindings/sound/audio-graph*.yaml
21688 21666 
21689 21667   SOUNDWIRE SUBSYSTEM
21690 21668   M: Vinod Koul <vkoul@kernel.org>
+1 -1
Makefile
···
2  2   VERSION = 6
3  3   PATCHLEVEL = 12
4  4   SUBLEVEL = 0
5     - EXTRAVERSION = -rc6
   5 + EXTRAVERSION = -rc7
6  6   NAME = Baby Opossum Posse
7  7 
8  8   # *DOCUMENTATION*
+1
arch/arm64/Kconfig
···
2214  2214   	bool "ARM Scalable Matrix Extension support"
2215  2215   	default y
2216  2216   	depends on ARM64_SVE
      2217 + 	depends on BROKEN
2217  2218   	help
2218  2219   	  The Scalable Matrix Extension (SME) is an extension to the AArch64
2219  2220   	  execution state which utilises a substantial subset of the SVE
+7 -3
arch/arm64/include/asm/mman.h
···
  6    6 
  7    7   #ifndef BUILD_VDSO
  8    8   #include <linux/compiler.h>
       9 + #include <linux/fs.h>
      10 + #include <linux/shmem_fs.h>
  9   11   #include <linux/types.h>
 10   12 
 11   13   static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
···
 33   31   }
 34   32   #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
 35   33 
 36      - static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
      34 + static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
      35 + 						   unsigned long flags)
 37   36   {
 38   37   	/*
 39   38   	 * Only allow MTE on anonymous mappings as these are guaranteed to be
 40   39   	 * backed by tags-capable memory. The vm_flags may be overridden by a
 41   40   	 * filesystem supporting MTE (RAM-based).
 42   41   	 */
 43      - 	if (system_supports_mte() && (flags & MAP_ANONYMOUS))
      42 + 	if (system_supports_mte() &&
      43 + 	    ((flags & MAP_ANONYMOUS) || shmem_file(file)))
 44   44   		return VM_MTE_ALLOWED;
 45   45 
 46   46   	return 0;
 47   47   }
 48      - #define arch_calc_vm_flag_bits(flags) arch_calc_vm_flag_bits(flags)
      48 + #define arch_calc_vm_flag_bits(file, flags) arch_calc_vm_flag_bits(file, flags)
 49   49 
 50   50   static inline bool arch_validate_prot(unsigned long prot,
 51   51   				      unsigned long addr __always_unused)
-4
arch/arm64/include/asm/topology.h
···
26  26   #define arch_scale_freq_invariant topology_scale_freq_invariant
27  27   #define arch_scale_freq_ref topology_get_freq_ref
28  28 
29     - #ifdef CONFIG_ACPI_CPPC_LIB
30     - #define arch_init_invariance_cppc topology_init_cpu_capacity_cppc
31     - #endif
32     -
33  29   /* Replace task scheduler's default cpu-invariant accounting */
34  30   #define arch_scale_cpu_capacity topology_get_cpu_scale
35  31 
+1
arch/arm64/kernel/fpsimd.c
···
1367  1367   	} else {
1368  1368   		fpsimd_to_sve(current);
1369  1369   		current->thread.fp_type = FP_STATE_SVE;
      1370 + 		fpsimd_flush_task_state(current);
1370  1371   	}
1371  1372   }
1372  1373 
+3 -32
arch/arm64/kernel/smccc-call.S
···
  7    7 
  8    8   #include <asm/asm-offsets.h>
  9    9   #include <asm/assembler.h>
 10      - #include <asm/thread_info.h>
 11      -
 12      - /*
 13      -  * If we have SMCCC v1.3 and (as is likely) no SVE state in
 14      -  * the registers then set the SMCCC hint bit to say there's no
 15      -  * need to preserve it. Do this by directly adjusting the SMCCC
 16      -  * function value which is already stored in x0 ready to be called.
 17      -  */
 18      - SYM_FUNC_START(__arm_smccc_sve_check)
 19      -
 20      - 	ldr_l	x16, smccc_has_sve_hint
 21      - 	cbz	x16, 2f
 22      -
 23      - 	get_current_task x16
 24      - 	ldr	x16, [x16, #TSK_TI_FLAGS]
 25      - 	tbnz	x16, #TIF_FOREIGN_FPSTATE, 1f	// Any live FP state?
 26      - 	tbnz	x16, #TIF_SVE, 2f		// Does that state include SVE?
 27      -
 28      - 1:	orr	x0, x0, ARM_SMCCC_1_3_SVE_HINT
 29      -
 30      - 2:	ret
 31      - SYM_FUNC_END(__arm_smccc_sve_check)
 32      - EXPORT_SYMBOL(__arm_smccc_sve_check)
 33   10 
 34   11   .macro SMCCC instr
 35      - 	stp	x29, x30, [sp, #-16]!
 36      - 	mov	x29, sp
 37      - alternative_if ARM64_SVE
 38      - 	bl	__arm_smccc_sve_check
 39      - alternative_else_nop_endif
 40   12   	\instr	#0
 41      - 	ldr	x4, [sp, #16]
      13 + 	ldr	x4, [sp]
 42   14   	stp	x0, x1, [x4, #ARM_SMCCC_RES_X0_OFFS]
 43   15   	stp	x2, x3, [x4, #ARM_SMCCC_RES_X2_OFFS]
 44      - 	ldr	x4, [sp, #24]
      16 + 	ldr	x4, [sp, #8]
 45   17   	cbz	x4, 1f /* no quirk structure */
 46   18   	ldr	x9, [x4, #ARM_SMCCC_QUIRK_ID_OFFS]
 47   19   	cmp	x9, #ARM_SMCCC_QUIRK_QCOM_A6
 48   20   	b.ne	1f
 49   21   	str	x6, [x4, ARM_SMCCC_QUIRK_STATE_OFFS]
 50      - 1:	ldp	x29, x30, [sp], #16
 51      - 	ret
      22 + 1:	ret
 52   23   .endm
 53   24 
 54   25   /*
+11 -2
arch/loongarch/include/asm/kasan.h
···
25  25   /* 64-bit segment value. */
26  26   #define XKPRANGE_UC_SEG		(0x8000)
27  27   #define XKPRANGE_CC_SEG		(0x9000)
    28 + #define XKPRANGE_WC_SEG		(0xa000)
28  29   #define XKVRANGE_VC_SEG		(0xffff)
29  30 
30  31   /* Cached */
···
42  41   #define XKPRANGE_UC_SHADOW_SIZE		(XKPRANGE_UC_SIZE >> KASAN_SHADOW_SCALE_SHIFT)
43  42   #define XKPRANGE_UC_SHADOW_END		(XKPRANGE_UC_KASAN_OFFSET + XKPRANGE_UC_SHADOW_SIZE)
44  43 
    44 + /* WriteCombine */
    45 + #define XKPRANGE_WC_START		WRITECOMBINE_BASE
    46 + #define XKPRANGE_WC_SIZE		XRANGE_SIZE
    47 + #define XKPRANGE_WC_KASAN_OFFSET	XKPRANGE_UC_SHADOW_END
    48 + #define XKPRANGE_WC_SHADOW_SIZE		(XKPRANGE_WC_SIZE >> KASAN_SHADOW_SCALE_SHIFT)
    49 + #define XKPRANGE_WC_SHADOW_END		(XKPRANGE_WC_KASAN_OFFSET + XKPRANGE_WC_SHADOW_SIZE)
    50 +
45  51   /* VMALLOC (Cached or UnCached) */
46  52   #define XKVRANGE_VC_START		MODULES_VADDR
47  53   #define XKVRANGE_VC_SIZE		round_up(KFENCE_AREA_END - MODULES_VADDR + 1, PGDIR_SIZE)
48     - #define XKVRANGE_VC_KASAN_OFFSET	XKPRANGE_UC_SHADOW_END
    54 + #define XKVRANGE_VC_KASAN_OFFSET	XKPRANGE_WC_SHADOW_END
49  55   #define XKVRANGE_VC_SHADOW_SIZE		(XKVRANGE_VC_SIZE >> KASAN_SHADOW_SCALE_SHIFT)
50  56   #define XKVRANGE_VC_SHADOW_END		(XKVRANGE_VC_KASAN_OFFSET + XKVRANGE_VC_SHADOW_SIZE)
51  57 
52  58   /* KAsan shadow memory start right after vmalloc. */
53  59   #define KASAN_SHADOW_START		round_up(KFENCE_AREA_END, PGDIR_SIZE)
54  60   #define KASAN_SHADOW_SIZE		(XKVRANGE_VC_SHADOW_END - XKPRANGE_CC_KASAN_OFFSET)
55     - #define KASAN_SHADOW_END		round_up(KASAN_SHADOW_START + KASAN_SHADOW_SIZE, PGDIR_SIZE)
    61 + #define KASAN_SHADOW_END		(round_up(KASAN_SHADOW_START + KASAN_SHADOW_SIZE, PGDIR_SIZE) - 1)
56  62 
57  63   #define XKPRANGE_CC_SHADOW_OFFSET	(KASAN_SHADOW_START + XKPRANGE_CC_KASAN_OFFSET)
58  64   #define XKPRANGE_UC_SHADOW_OFFSET	(KASAN_SHADOW_START + XKPRANGE_UC_KASAN_OFFSET)
    65 + #define XKPRANGE_WC_SHADOW_OFFSET	(KASAN_SHADOW_START + XKPRANGE_WC_KASAN_OFFSET)
59  66   #define XKVRANGE_VC_SHADOW_OFFSET	(KASAN_SHADOW_START + XKVRANGE_VC_KASAN_OFFSET)
60  67 
61  68   extern bool kasan_early_stage;
+1 -4
arch/loongarch/include/asm/page.h
···
113  113   extern int __virt_addr_valid(volatile void *kaddr);
114  114   #define virt_addr_valid(kaddr)	__virt_addr_valid((volatile void *)(kaddr))
115  115 
116      - #define VM_DATA_DEFAULT_FLAGS \
117      - 	(VM_READ | VM_WRITE | \
118      - 	 ((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
119      - 	 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
     116 + #define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_TSK_EXEC
120  117 
121  118   #include <asm-generic/memory_model.h>
122  119   #include <asm-generic/getorder.h>
+53 -28
arch/loongarch/kernel/acpi.c
···
 58   58   	return ioremap_cache(phys, size);
 59   59   }
 60   60 
 61      - static int cpu_enumerated = 0;
 62      -
 63   61   #ifdef CONFIG_SMP
 64      - static int set_processor_mask(u32 id, u32 flags)
      62 + static int set_processor_mask(u32 id, u32 pass)
 65   63   {
 66      - 	int nr_cpus;
 67      - 	int cpu, cpuid = id;
      64 + 	int cpu = -1, cpuid = id;
 68   65 
 69      - 	if (!cpu_enumerated)
 70      - 		nr_cpus = NR_CPUS;
 71      - 	else
 72      - 		nr_cpus = nr_cpu_ids;
 73      -
 74      - 	if (num_processors >= nr_cpus) {
      66 + 	if (num_processors >= NR_CPUS) {
 75   67   		pr_warn(PREFIX "nr_cpus limit of %i reached."
 76      - 			" processor 0x%x ignored.\n", nr_cpus, cpuid);
      68 + 			" processor 0x%x ignored.\n", NR_CPUS, cpuid);
 77   69 
 78   70   		return -ENODEV;
 79   71 
 80   72   	}
      73 +
 81   74   	if (cpuid == loongson_sysconf.boot_cpu_id)
 82   75   		cpu = 0;
 83      - 	else
 84      - 		cpu = find_first_zero_bit(cpumask_bits(cpu_present_mask), NR_CPUS);
 85      -
 86      - 	if (!cpu_enumerated)
 87      - 		set_cpu_possible(cpu, true);
 88      -
 89      - 	if (flags & ACPI_MADT_ENABLED) {
      77 + 	switch (pass) {
      78 + 	case 1: /* Pass 1 handle enabled processors */
      79 + 		if (cpu < 0)
      80 + 			cpu = find_first_zero_bit(cpumask_bits(cpu_present_mask), NR_CPUS);
 90   81   		num_processors++;
 91   82   		set_cpu_present(cpu, true);
 92      - 		__cpu_number_map[cpuid] = cpu;
 93      - 		__cpu_logical_map[cpu] = cpuid;
 94      - 	} else
      83 + 		break;
      84 + 	case 2: /* Pass 2 handle disabled processors */
      85 + 		if (cpu < 0)
      86 + 			cpu = find_first_zero_bit(cpumask_bits(cpu_possible_mask), NR_CPUS);
 95   87   		disabled_cpus++;
      88 + 		break;
      89 + 	default:
      90 + 		return cpu;
      91 + 	}
      92 +
      93 + 	set_cpu_possible(cpu, true);
      94 + 	__cpu_number_map[cpuid] = cpu;
      95 + 	__cpu_logical_map[cpu] = cpuid;
 96   96 
 97   97   	return cpu;
 98   98   }
 99   99   #endif
100  100 
101  101   static int __init
102      - acpi_parse_processor(union acpi_subtable_headers *header, const unsigned long end)
     102 + acpi_parse_p1_processor(union acpi_subtable_headers *header, const unsigned long end)
103  103   {
104  104   	struct acpi_madt_core_pic *processor = NULL;
105  105 
···
110  110   	acpi_table_print_madt_entry(&header->common);
111  111   #ifdef CONFIG_SMP
112  112   	acpi_core_pic[processor->core_id] = *processor;
113      - 	set_processor_mask(processor->core_id, processor->flags);
     113 + 	if (processor->flags & ACPI_MADT_ENABLED)
     114 + 		set_processor_mask(processor->core_id, 1);
114  115   #endif
115  116 
116  117   	return 0;
117  118   }
118  119 
     120 + static int __init
     121 + acpi_parse_p2_processor(union acpi_subtable_headers *header, const unsigned long end)
     122 + {
     123 + 	struct acpi_madt_core_pic *processor = NULL;
     124 +
     125 + 	processor = (struct acpi_madt_core_pic *)header;
     126 + 	if (BAD_MADT_ENTRY(processor, end))
     127 + 		return -EINVAL;
     128 +
     129 + #ifdef CONFIG_SMP
     130 + 	if (!(processor->flags & ACPI_MADT_ENABLED))
     131 + 		set_processor_mask(processor->core_id, 2);
     132 + #endif
     133 +
     134 + 	return 0;
     135 + }
119  136   static int __init
120  137   acpi_parse_eio_master(union acpi_subtable_headers *header, const unsigned long end)
121  138   {
···
160  143   	}
161  144   #endif
162  145   	acpi_table_parse_madt(ACPI_MADT_TYPE_CORE_PIC,
163      - 			acpi_parse_processor, MAX_CORE_PIC);
     146 + 			acpi_parse_p1_processor, MAX_CORE_PIC);
     147 +
     148 + 	acpi_table_parse_madt(ACPI_MADT_TYPE_CORE_PIC,
     149 + 			acpi_parse_p2_processor, MAX_CORE_PIC);
164  150 
165  151   	acpi_table_parse_madt(ACPI_MADT_TYPE_EIO_PIC,
166  152   			acpi_parse_eio_master, MAX_IO_PICS);
167  153 
168      - 	cpu_enumerated = 1;
169  154   	loongson_sysconf.nr_cpus = num_processors;
170  155   }
···
329  310   	int nid;
330  311 
331  312   	nid = acpi_get_node(handle);
     313 +
     314 + 	if (nid != NUMA_NO_NODE)
     315 + 		nid = early_cpu_to_node(cpu);
     316 +
332  317   	if (nid != NUMA_NO_NODE) {
333  318   		set_cpuid_to_node(physid, nid);
334  319   		node_set(nid, numa_nodes_parsed);
···
347  324   {
348  325   	int cpu;
349  326 
350      - 	cpu = set_processor_mask(physid, ACPI_MADT_ENABLED);
351      - 	if (cpu < 0) {
     327 + 	cpu = cpu_number_map(physid);
     328 + 	if (cpu < 0 || cpu >= nr_cpu_ids) {
352  329   		pr_info(PREFIX "Unable to map lapic to logical cpu number\n");
353      - 		return cpu;
     330 + 		return -ERANGE;
354  331   	}
355  332 
     333 + 	num_processors++;
     334 + 	set_cpu_present(cpu, true);
356  335   	acpi_map_cpu2node(handle, cpu, physid);
357  336 
358  337   	*pcpu = cpu;
+15
arch/loongarch/kernel/paravirt.c
··· 51 51 } 52 52 53 53 #ifdef CONFIG_SMP 54 + static struct smp_ops native_ops; 55 + 54 56 static void pv_send_ipi_single(int cpu, unsigned int action) 55 57 { 56 58 int min, old; 57 59 irq_cpustat_t *info = &per_cpu(irq_stat, cpu); 60 + 61 + if (unlikely(action == ACTION_BOOT_CPU)) { 62 + native_ops.send_ipi_single(cpu, action); 63 + return; 64 + } 58 65 59 66 old = atomic_fetch_or(BIT(action), &info->message); 60 67 if (old) ··· 81 74 82 75 if (cpumask_empty(mask)) 83 76 return; 77 + 78 + if (unlikely(action == ACTION_BOOT_CPU)) { 79 + native_ops.send_ipi_mask(mask, action); 80 + return; 81 + } 84 82 85 83 action = BIT(action); 86 84 for_each_cpu(i, mask) { ··· 159 147 { 160 148 int r, swi; 161 149 150 + /* Init native ipi irq for ACTION_BOOT_CPU */ 151 + native_ops.init_ipi(); 162 152 swi = get_percpu_irq(INT_SWI0); 163 153 if (swi < 0) 164 154 panic("SWI0 IRQ mapping failed\n"); ··· 207 193 return 0; 208 194 209 195 #ifdef CONFIG_SMP 196 + native_ops = mp_ops; 210 197 mp_ops.init_ipi = pv_init_ipi; 211 198 mp_ops.send_ipi_single = pv_send_ipi_single; 212 199 mp_ops.send_ipi_mask = pv_send_ipi_mask;
+3 -2
arch/loongarch/kernel/smp.c
··· 302 302 __cpu_number_map[cpuid] = cpu; 303 303 __cpu_logical_map[cpu] = cpuid; 304 304 305 - early_numa_add_cpu(cpu, 0); 305 + early_numa_add_cpu(cpuid, 0); 306 306 set_cpuid_to_node(cpuid, 0); 307 307 } 308 308 ··· 331 331 int i = 0; 332 332 333 333 parse_acpi_topology(); 334 + cpu_data[0].global_id = cpu_logical_map(0); 334 335 335 336 for (i = 0; i < loongson_sysconf.nr_cpus; i++) { 336 337 set_cpu_present(i, true); 337 338 csr_mail_send(0, __cpu_logical_map[i], 0); 338 - cpu_data[i].global_id = __cpu_logical_map[i]; 339 339 } 340 340 341 341 per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE; ··· 380 380 cpu_logical_map(cpu) / loongson_sysconf.cores_per_package; 381 381 cpu_data[cpu].core = pptt_enabled ? cpu_data[cpu].core : 382 382 cpu_logical_map(cpu) % loongson_sysconf.cores_per_package; 383 + cpu_data[cpu].global_id = cpu_logical_map(cpu); 383 384 } 384 385 385 386 void loongson_smp_finish(void)
+41 -5
arch/loongarch/mm/kasan_init.c
··· 13 13 14 14 static pgd_t kasan_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE); 15 15 16 + #ifdef __PAGETABLE_P4D_FOLDED 17 + #define __pgd_none(early, pgd) (0) 18 + #else 19 + #define __pgd_none(early, pgd) (early ? (pgd_val(pgd) == 0) : \ 20 + (__pa(pgd_val(pgd)) == (unsigned long)__pa(kasan_early_shadow_p4d))) 21 + #endif 22 + 16 23 #ifdef __PAGETABLE_PUD_FOLDED 17 24 #define __p4d_none(early, p4d) (0) 18 25 #else ··· 62 55 case XKPRANGE_UC_SEG: 63 56 offset = XKPRANGE_UC_SHADOW_OFFSET; 64 57 break; 58 + case XKPRANGE_WC_SEG: 59 + offset = XKPRANGE_WC_SHADOW_OFFSET; 60 + break; 65 61 case XKVRANGE_VC_SEG: 66 62 offset = XKVRANGE_VC_SHADOW_OFFSET; 67 63 break; ··· 89 79 90 80 if (addr >= XKVRANGE_VC_SHADOW_OFFSET) 91 81 return (void *)(((addr - XKVRANGE_VC_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT) + XKVRANGE_VC_START); 82 + else if (addr >= XKPRANGE_WC_SHADOW_OFFSET) 83 + return (void *)(((addr - XKPRANGE_WC_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT) + XKPRANGE_WC_START); 92 84 else if (addr >= XKPRANGE_UC_SHADOW_OFFSET) 93 85 return (void *)(((addr - XKPRANGE_UC_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT) + XKPRANGE_UC_START); 94 86 else if (addr >= XKPRANGE_CC_SHADOW_OFFSET) ··· 154 142 return pud_offset(p4dp, addr); 155 143 } 156 144 145 + static p4d_t *__init kasan_p4d_offset(pgd_t *pgdp, unsigned long addr, int node, bool early) 146 + { 147 + if (__pgd_none(early, pgdp_get(pgdp))) { 148 + phys_addr_t p4d_phys = early ? 
149 + __pa_symbol(kasan_early_shadow_p4d) : kasan_alloc_zeroed_page(node); 150 + if (!early) 151 + memcpy(__va(p4d_phys), kasan_early_shadow_p4d, sizeof(kasan_early_shadow_p4d)); 152 + pgd_populate(&init_mm, pgdp, (p4d_t *)__va(p4d_phys)); 153 + } 154 + 155 + return p4d_offset(pgdp, addr); 156 + } 157 + 157 158 static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr, 158 159 unsigned long end, int node, bool early) 159 160 { ··· 203 178 do { 204 179 next = pud_addr_end(addr, end); 205 180 kasan_pmd_populate(pudp, addr, next, node, early); 206 - } while (pudp++, addr = next, addr != end); 181 + } while (pudp++, addr = next, addr != end && __pud_none(early, READ_ONCE(*pudp))); 207 182 } 208 183 209 184 static void __init kasan_p4d_populate(pgd_t *pgdp, unsigned long addr, 210 185 unsigned long end, int node, bool early) 211 186 { 212 187 unsigned long next; 213 - p4d_t *p4dp = p4d_offset(pgdp, addr); 188 + p4d_t *p4dp = kasan_p4d_offset(pgdp, addr, node, early); 214 189 215 190 do { 216 191 next = p4d_addr_end(addr, end); 217 192 kasan_pud_populate(p4dp, addr, next, node, early); 218 - } while (p4dp++, addr = next, addr != end); 193 + } while (p4dp++, addr = next, addr != end && __p4d_none(early, READ_ONCE(*p4dp))); 219 194 } 220 195 221 196 static void __init kasan_pgd_populate(unsigned long addr, unsigned long end, ··· 243 218 asmlinkage void __init kasan_early_init(void) 244 219 { 245 220 BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_START, PGDIR_SIZE)); 246 - BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_END, PGDIR_SIZE)); 221 + BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_END + 1, PGDIR_SIZE)); 247 222 } 248 223 249 224 static inline void kasan_set_pgd(pgd_t *pgdp, pgd_t pgdval) ··· 258 233 * swapper_pg_dir. pgd_clear() can't be used
259 234 * here because it's nop on 2,3-level pagetable setups 260 235 */ 261 - for (; start < end; start += PGDIR_SIZE) 236 + for (; start < end; start = pgd_addr_end(start, end)) 262 237 kasan_set_pgd((pgd_t *)pgd_offset_k(start), __pgd(0)); 263 238 } 264 239 ··· 266 241 { 267 242 u64 i; 268 243 phys_addr_t pa_start, pa_end; 244 + 245 + /* 246 + * If PGDIR_SIZE is too large for cpu_vabits, KASAN_SHADOW_END will 247 + * overflow UINTPTR_MAX and then looks like a user space address. 248 + * For example, PGDIR_SIZE of CONFIG_4KB_4LEVEL is 2^39, which is too 249 + * large for Loongson-2K series whose cpu_vabits = 39. 250 + */ 251 + if (KASAN_SHADOW_END < vm_map_base) { 252 + pr_warn("PGDIR_SIZE too large for cpu_vabits, KernelAddressSanitizer disabled.\n"); 253 + return; 254 + } 269 255 270 256 /* 271 257 * PGD was populated as invalid_pmd_table or invalid_pud_table
+3 -2
arch/parisc/include/asm/mman.h
··· 2 2 #ifndef __ASM_MMAN_H__ 3 3 #define __ASM_MMAN_H__ 4 4 5 + #include <linux/fs.h> 5 6 #include <uapi/asm/mman.h> 6 7 7 8 /* PARISC cannot allow mdwe as it needs writable stacks */ ··· 12 11 } 13 12 #define arch_memory_deny_write_exec_supported arch_memory_deny_write_exec_supported 14 13 15 - static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags) 14 + static inline unsigned long arch_calc_vm_flag_bits(struct file *file, unsigned long flags) 16 15 { 17 16 /* 18 17 * The stack on parisc grows upwards, so if userspace requests memory ··· 24 23 25 24 return 0; 26 25 } 27 - #define arch_calc_vm_flag_bits(flags) arch_calc_vm_flag_bits(flags) 26 + #define arch_calc_vm_flag_bits(file, flags) arch_calc_vm_flag_bits(file, flags) 28 27 29 28 #endif /* __ASM_MMAN_H__ */
+12
arch/powerpc/kvm/book3s_hv.c
··· 4898 4898 BOOK3S_INTERRUPT_EXTERNAL, 0); 4899 4899 else 4900 4900 lpcr |= LPCR_MER; 4901 + } else { 4902 + /* 4903 + * L1's copy of L2's LPCR (vcpu->arch.vcore->lpcr) can get its MER bit 4904 + * unexpectedly set - for e.g. during NMI handling when all register 4905 + * states are synchronized from L0 to L1. L1 needs to inform L0 about 4906 + * MER=1 only when there are pending external interrupts. 4907 + * In the above if check, MER bit is set if there are pending 4908 + * external interrupts. Hence, explicitly mask off MER bit 4909 + * here as otherwise it may generate spurious interrupts in L2 KVM 4910 + * causing an endless loop, which results in L2 guest getting hung. 4911 + */ 4912 + lpcr &= ~LPCR_MER; 4901 4913 } 4902 4914 } else if (vcpu->arch.pending_exceptions || 4903 4915 vcpu->arch.doorbell_request ||
-5
arch/x86/include/asm/topology.h
··· 305 305 extern void arch_scale_freq_tick(void); 306 306 #define arch_scale_freq_tick arch_scale_freq_tick 307 307 308 - #ifdef CONFIG_ACPI_CPPC_LIB 309 - void init_freq_invariance_cppc(void); 310 - #define arch_init_invariance_cppc init_freq_invariance_cppc 311 - #endif 312 - 313 308 #endif /* _ASM_X86_TOPOLOGY_H */
+6 -1
arch/x86/kernel/acpi/cppc.c
··· 110 110 111 111 static DEFINE_MUTEX(freq_invariance_lock); 112 112 113 - void init_freq_invariance_cppc(void) 113 + static inline void init_freq_invariance_cppc(void) 114 114 { 115 115 static bool init_done; 116 116 ··· 125 125 amd_set_max_freq_ratio(); 126 126 init_done = true; 127 127 mutex_unlock(&freq_invariance_lock); 128 + } 129 + 130 + void acpi_processor_init_invariance_cppc(void) 131 + { 132 + init_freq_invariance_cppc(); 128 133 } 129 134 130 135 /*
+18 -11
arch/x86/kvm/lapic.c
··· 2629 2629 { 2630 2630 struct kvm_lapic *apic = vcpu->arch.apic; 2631 2631 2632 - if (apic->apicv_active) { 2633 - /* irr_pending is always true when apicv is activated. */ 2634 - apic->irr_pending = true; 2632 + /* 2633 + * When APICv is enabled, KVM must always search the IRR for a pending 2634 + * IRQ, as other vCPUs and devices can set IRR bits even if the vCPU 2635 + * isn't running. If APICv is disabled, KVM _should_ search the IRR 2636 + * for a pending IRQ. But KVM currently doesn't ensure *all* hardware, 2637 + * e.g. CPUs and IOMMUs, has seen the change in state, i.e. searching 2638 + * the IRR at this time could race with IRQ delivery from hardware that 2639 + * still sees APICv as being enabled. 2640 + * 2641 + * FIXME: Ensure other vCPUs and devices observe the change in APICv 2642 + * state prior to updating KVM's metadata caches, so that KVM 2643 + * can safely search the IRR and set irr_pending accordingly. 2644 + */ 2645 + apic->irr_pending = true; 2646 + 2647 + if (apic->apicv_active) 2635 2648 apic->isr_count = 1; 2636 - } else { 2637 - /* 2638 - * Don't clear irr_pending, searching the IRR can race with 2639 - * updates from the CPU as APICv is still active from hardware's 2640 - * perspective. The flag will be cleared as appropriate when 2641 - * KVM injects the interrupt. 2642 - */ 2649 + else 2643 2650 apic->isr_count = count_vectors(apic->regs + APIC_ISR); 2644 - } 2651 + 2645 2652 apic->highest_isr_cache = -1; 2646 2653 } 2647 2654
+9 -6
arch/x86/kvm/svm/sev.c
··· 450 450 goto e_free; 451 451 452 452 /* This needs to happen after SEV/SNP firmware initialization. */ 453 - if (vm_type == KVM_X86_SNP_VM && snp_guest_req_init(kvm)) 454 - goto e_free; 453 + if (vm_type == KVM_X86_SNP_VM) { 454 + ret = snp_guest_req_init(kvm); 455 + if (ret) 456 + goto e_free; 457 + } 455 458 456 459 INIT_LIST_HEAD(&sev->regions_list); 457 460 INIT_LIST_HEAD(&sev->mirror_vms); ··· 2215 2212 if (sev->snp_context) 2216 2213 return -EINVAL; 2217 2214 2218 - sev->snp_context = snp_context_create(kvm, argp); 2219 - if (!sev->snp_context) 2220 - return -ENOTTY; 2221 - 2222 2215 if (params.flags) 2223 2216 return -EINVAL; 2224 2217 ··· 2228 2229 2229 2230 if (params.policy & SNP_POLICY_MASK_SINGLE_SOCKET) 2230 2231 return -EINVAL; 2232 + 2233 + sev->snp_context = snp_context_create(kvm, argp); 2234 + if (!sev->snp_context) 2235 + return -ENOTTY; 2231 2236 2232 2237 start.gctx_paddr = __psp_pa(sev->snp_context); 2233 2238 start.policy = params.policy;
+25 -5
arch/x86/kvm/vmx/nested.c
··· 1197 1197 kvm_hv_nested_transtion_tlb_flush(vcpu, enable_ept); 1198 1198 1199 1199 /* 1200 - * If vmcs12 doesn't use VPID, L1 expects linear and combined mappings 1201 - * for *all* contexts to be flushed on VM-Enter/VM-Exit, i.e. it's a 1202 - * full TLB flush from the guest's perspective. This is required even 1203 - * if VPID is disabled in the host as KVM may need to synchronize the 1204 - * MMU in response to the guest TLB flush. 1200 + * If VPID is disabled, then guest TLB accesses use VPID=0, i.e. the 1201 + * same VPID as the host, and so architecturally, linear and combined 1202 + * mappings for VPID=0 must be flushed at VM-Enter and VM-Exit. KVM 1203 + * emulates L2 sharing L1's VPID=0 by using vpid01 while running L2, 1204 + * and so KVM must also emulate TLB flush of VPID=0, i.e. vpid01. This 1205 + * is required if VPID is disabled in KVM, as a TLB flush (there are no 1206 + * VPIDs) still occurs from L1's perspective, and KVM may need to 1207 + * synchronize the MMU in response to the guest TLB flush. 1205 1208 * 1206 1209 * Note, using TLB_FLUSH_GUEST is correct even if nested EPT is in use. 1207 1210 * EPT is a special snowflake, as guest-physical mappings aren't ··· 2318 2315 2319 2316 vmcs_write64(VMCS_LINK_POINTER, INVALID_GPA); 2320 2317 2318 + /* 2319 + * If VPID is disabled, then guest TLB accesses use VPID=0, i.e. the 2320 + * same VPID as the host. Emulate this behavior by using vpid01 for L2 2321 + * if VPID is disabled in vmcs12. Note, if VPID is disabled, VM-Enter 2322 + * and VM-Exit are architecturally required to flush VPID=0, but *only* 2323 + * VPID=0. I.e. using vpid02 would be ok (so long as KVM emulates the 2324 + * required flushes), but doing so would cause KVM to over-flush. E.g. 2325 + * if L1 runs L2 X with VPID12=1, then runs L2 Y with VPID12 disabled, 2326 + * and then runs L2 X again, then KVM can and should retain TLB entries 2327 + * for VPID12=1. 
2328 + */ 2321 2329 if (enable_vpid) { 2322 2330 if (nested_cpu_has_vpid(vmcs12) && vmx->nested.vpid02) 2323 2331 vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->nested.vpid02); ··· 5964 5950 return nested_vmx_fail(vcpu, 5965 5951 VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID); 5966 5952 5953 + /* 5954 + * Always flush the effective vpid02, i.e. never flush the current VPID 5955 + * and never explicitly flush vpid01. INVVPID targets a VPID, not a 5956 + * VMCS, and so whether or not the current vmcs12 has VPID enabled is 5957 + * irrelevant (and there may not be a loaded vmcs12). 5958 + */ 5967 5959 vpid02 = nested_get_vpid02(vcpu); 5968 5960 switch (type) { 5969 5961 case VMX_VPID_EXTENT_INDIVIDUAL_ADDR:
+4 -2
arch/x86/kvm/vmx/vmx.c
··· 217 217 static unsigned int ple_window_max = KVM_VMX_DEFAULT_PLE_WINDOW_MAX; 218 218 module_param(ple_window_max, uint, 0444); 219 219 220 - /* Default is SYSTEM mode, 1 for host-guest mode */ 220 + /* Default is SYSTEM mode, 1 for host-guest mode (which is BROKEN) */ 221 221 int __read_mostly pt_mode = PT_MODE_SYSTEM; 222 + #ifdef CONFIG_BROKEN 222 223 module_param(pt_mode, int, S_IRUGO); 224 + #endif 223 225 224 226 struct x86_pmu_lbr __ro_after_init vmx_lbr_caps; 225 227 ··· 3218 3216 3219 3217 static inline int vmx_get_current_vpid(struct kvm_vcpu *vcpu) 3220 3218 { 3221 - if (is_guest_mode(vcpu)) 3219 + if (is_guest_mode(vcpu) && nested_cpu_has_vpid(get_vmcs12(vcpu))) 3222 3220 return nested_get_vpid02(vcpu); 3223 3221 return to_vmx(vcpu)->vpid; 3224 3222 }
-6
drivers/acpi/cppc_acpi.c
··· 671 671 * ) 672 672 */ 673 673 674 - #ifndef arch_init_invariance_cppc 675 - static inline void arch_init_invariance_cppc(void) { } 676 - #endif 677 - 678 674 /** 679 675 * acpi_cppc_processor_probe - Search for per CPU _CPC objects. 680 676 * @pr: Ptr to acpi_processor containing this CPU's logical ID. ··· 900 904 kobject_put(&cpc_ptr->kobj); 901 905 goto out_free; 902 906 } 903 - 904 - arch_init_invariance_cppc(); 905 907 906 908 kfree(output.pointer); 907 909 return 0;
+9
drivers/acpi/processor_driver.c
··· 237 237 .notifier_call = acpi_processor_notifier, 238 238 }; 239 239 240 + void __weak acpi_processor_init_invariance_cppc(void) 241 + { } 242 + 240 243 /* 241 244 * We keep the driver loaded even when ACPI is not running. 242 245 * This is needed for the powernow-k8 driver, that works even without ··· 273 270 NULL, acpi_soft_cpu_dead); 274 271 275 272 acpi_processor_throttling_init(); 273 + 274 + /* 275 + * Frequency invariance calculations on AMD platforms can't be run until 276 + * after acpi_cppc_processor_probe() has been called for all online CPUs 277 + */ 278 + acpi_processor_init_invariance_cppc(); 276 279 return 0; 277 280 err: 278 281 driver_unregister(&acpi_processor_driver);
+5 -1
drivers/base/arch_topology.c
··· 366 366 #ifdef CONFIG_ACPI_CPPC_LIB 367 367 #include <acpi/cppc_acpi.h> 368 368 369 - void topology_init_cpu_capacity_cppc(void) 369 + static inline void topology_init_cpu_capacity_cppc(void) 370 370 { 371 371 u64 capacity, capacity_scale = 0; 372 372 struct cppc_perf_caps perf_caps; ··· 416 416 417 417 exit: 418 418 free_raw_capacity(); 419 + } 420 + void acpi_processor_init_invariance_cppc(void) 421 + { 422 + topology_init_cpu_capacity_cppc(); 419 423 } 420 424 #endif 421 425
+2 -3
drivers/bluetooth/btintel.c
··· 3288 3288 case INTEL_TLV_TEST_EXCEPTION: 3289 3289 /* Generate devcoredump from exception */ 3290 3290 if (!hci_devcd_init(hdev, skb->len)) { 3291 - hci_devcd_append(hdev, skb); 3291 + hci_devcd_append(hdev, skb_clone(skb, GFP_ATOMIC)); 3292 3292 hci_devcd_complete(hdev); 3293 3293 } else { 3294 3294 bt_dev_err(hdev, "Failed to generate devcoredump"); 3295 - kfree_skb(skb); 3296 3295 } 3297 - return 0; 3296 + break; 3298 3297 default: 3299 3298 bt_dev_err(hdev, "Invalid exception type %02X", tlv->val[0]); 3300 3299 }
+20
drivers/char/tpm/tpm-buf.c
··· 147 147 EXPORT_SYMBOL_GPL(tpm_buf_append_u32); 148 148 149 149 /** 150 + * tpm_buf_append_handle() - Add a handle 151 + * @chip: &tpm_chip instance 152 + * @buf: &tpm_buf instance 153 + * @handle: a TPM object handle 154 + * 155 + * Add a handle to the buffer, and increase the count tracking the number of 156 + * handles in the command buffer. Works only for command buffers. 157 + */ 158 + void tpm_buf_append_handle(struct tpm_chip *chip, struct tpm_buf *buf, u32 handle) 159 + { 160 + if (buf->flags & TPM_BUF_TPM2B) { 161 + dev_err(&chip->dev, "Invalid buffer type (TPM2B)\n"); 162 + return; 163 + } 164 + 165 + tpm_buf_append_u32(buf, handle); 166 + buf->handles++; 167 + } 168 + 169 + /** 150 170 * tpm_buf_read() - Read from a TPM buffer 151 171 * @buf: &tpm_buf instance 152 172 * @offset: offset within the buffer
+22 -8
drivers/char/tpm/tpm2-cmd.c
··· 14 14 #include "tpm.h" 15 15 #include <crypto/hash_info.h> 16 16 17 + static bool disable_pcr_integrity; 18 + module_param(disable_pcr_integrity, bool, 0444); 19 + MODULE_PARM_DESC(disable_pcr_integrity, "Disable integrity protection of TPM2_PCR_Extend"); 20 + 17 21 static struct tpm2_hash tpm2_hash_map[] = { 18 22 {HASH_ALGO_SHA1, TPM_ALG_SHA1}, 19 23 {HASH_ALGO_SHA256, TPM_ALG_SHA256}, ··· 236 232 int rc; 237 233 int i; 238 234 239 - rc = tpm2_start_auth_session(chip); 240 - if (rc) 241 - return rc; 235 + if (!disable_pcr_integrity) { 236 + rc = tpm2_start_auth_session(chip); 237 + if (rc) 238 + return rc; 239 + } 242 240 243 241 rc = tpm_buf_init(&buf, TPM2_ST_SESSIONS, TPM2_CC_PCR_EXTEND); 244 242 if (rc) { 245 - tpm2_end_auth_session(chip); 243 + if (!disable_pcr_integrity) 244 + tpm2_end_auth_session(chip); 246 245 return rc; 247 246 } 248 247 249 - tpm_buf_append_name(chip, &buf, pcr_idx, NULL); 250 - tpm_buf_append_hmac_session(chip, &buf, 0, NULL, 0); 248 + if (!disable_pcr_integrity) { 249 + tpm_buf_append_name(chip, &buf, pcr_idx, NULL); 250 + tpm_buf_append_hmac_session(chip, &buf, 0, NULL, 0); 251 + } else { 252 + tpm_buf_append_handle(chip, &buf, pcr_idx); 253 + tpm_buf_append_auth(chip, &buf, 0, NULL, 0); 254 + } 251 255 252 256 tpm_buf_append_u32(&buf, chip->nr_allocated_banks); 253 257 ··· 265 253 chip->allocated_banks[i].digest_size); 266 254 } 267 255 268 - tpm_buf_fill_hmac_session(chip, &buf); 256 + if (!disable_pcr_integrity) 257 + tpm_buf_fill_hmac_session(chip, &buf); 269 258 rc = tpm_transmit_cmd(chip, &buf, 0, "attempting extend a PCR value"); 270 - rc = tpm_buf_check_hmac_response(chip, &buf, rc); 259 + if (!disable_pcr_integrity) 260 + rc = tpm_buf_check_hmac_response(chip, &buf, rc); 271 261 272 262 tpm_buf_destroy(&buf); 273 263
+33 -25
drivers/char/tpm/tpm2-sessions.c
··· 237 237 #endif 238 238 239 239 if (!tpm2_chip_auth(chip)) { 240 - tpm_buf_append_u32(buf, handle); 241 - /* count the number of handles in the upper bits of flags */ 242 - buf->handles++; 240 + tpm_buf_append_handle(chip, buf, handle); 243 241 return; 244 242 } 245 243 ··· 269 271 #endif 270 272 } 271 273 EXPORT_SYMBOL_GPL(tpm_buf_append_name); 274 + 275 + void tpm_buf_append_auth(struct tpm_chip *chip, struct tpm_buf *buf, 276 + u8 attributes, u8 *passphrase, int passphrase_len) 277 + { 278 + /* offset tells us where the sessions area begins */ 279 + int offset = buf->handles * 4 + TPM_HEADER_SIZE; 280 + u32 len = 9 + passphrase_len; 281 + 282 + if (tpm_buf_length(buf) != offset) { 283 + /* not the first session so update the existing length */ 284 + len += get_unaligned_be32(&buf->data[offset]); 285 + put_unaligned_be32(len, &buf->data[offset]); 286 + } else { 287 + tpm_buf_append_u32(buf, len); 288 + } 289 + /* auth handle */ 290 + tpm_buf_append_u32(buf, TPM2_RS_PW); 291 + /* nonce */ 292 + tpm_buf_append_u16(buf, 0); 293 + /* attributes */ 294 + tpm_buf_append_u8(buf, 0); 295 + /* passphrase */ 296 + tpm_buf_append_u16(buf, passphrase_len); 297 + tpm_buf_append(buf, passphrase, passphrase_len); 298 + } 272 299 273 300 /** 274 301 * tpm_buf_append_hmac_session() - Append a TPM session element ··· 332 309 #endif 333 310 334 311 if (!tpm2_chip_auth(chip)) { 335 - /* offset tells us where the sessions area begins */ 336 - int offset = buf->handles * 4 + TPM_HEADER_SIZE; 337 - u32 len = 9 + passphrase_len; 338 - 339 - if (tpm_buf_length(buf) != offset) { 340 - /* not the first session so update the existing length */ 341 - len += get_unaligned_be32(&buf->data[offset]); 342 - put_unaligned_be32(len, &buf->data[offset]); 343 - } else { 344 - tpm_buf_append_u32(buf, len); 345 - } 346 - /* auth handle */ 347 - tpm_buf_append_u32(buf, TPM2_RS_PW); 348 - /* nonce */ 349 - tpm_buf_append_u16(buf, 0); 350 - /* attributes */ 351 - tpm_buf_append_u8(buf, 0); 352 - /* passphrase */
353 - tpm_buf_append_u16(buf, passphrase_len); 354 - tpm_buf_append(buf, passphrase, passphrase_len); 312 + tpm_buf_append_auth(chip, buf, attributes, passphrase, 313 + passphrase_len); 355 314 return; 356 315 } 357 316 ··· 953 948 /* Deduce from the name change TPM interference: */ 954 949 dev_err(&chip->dev, "null key integrity check failed\n"); 955 950 tpm2_flush_context(chip, tmp_null_key); 956 - chip->flags |= TPM_CHIP_FLAG_DISABLE; 957 951 958 952 err: 959 - return rc ? -ENODEV : 0; 953 + if (rc) { 954 + chip->flags |= TPM_CHIP_FLAG_DISABLE; 955 + rc = -ENODEV; 956 + } 957 + return rc; 960 958 } 961 959 962 960 /**
+1 -1
drivers/clk/qcom/clk-alpha-pll.c
··· 40 40 41 41 #define PLL_USER_CTL(p) ((p)->offset + (p)->regs[PLL_OFF_USER_CTL]) 42 42 # define PLL_POST_DIV_SHIFT 8 43 - # define PLL_POST_DIV_MASK(p) GENMASK((p)->width - 1, 0) 43 + # define PLL_POST_DIV_MASK(p) GENMASK((p)->width ? (p)->width - 1 : 3, 0) 44 44 # define PLL_ALPHA_MSB BIT(15) 45 45 # define PLL_ALPHA_EN BIT(24) 46 46 # define PLL_ALPHA_MODE BIT(25)
+6 -6
drivers/clk/qcom/gcc-x1e80100.c
··· 3123 3123 3124 3124 static struct clk_branch gcc_pcie_3_pipediv2_clk = { 3125 3125 .halt_reg = 0x58060, 3126 - .halt_check = BRANCH_HALT_VOTED, 3126 + .halt_check = BRANCH_HALT_SKIP, 3127 3127 .clkr = { 3128 3128 .enable_reg = 0x52020, 3129 3129 .enable_mask = BIT(5), ··· 3248 3248 3249 3249 static struct clk_branch gcc_pcie_4_pipediv2_clk = { 3250 3250 .halt_reg = 0x6b054, 3251 - .halt_check = BRANCH_HALT_VOTED, 3251 + .halt_check = BRANCH_HALT_SKIP, 3252 3252 .clkr = { 3253 3253 .enable_reg = 0x52010, 3254 3254 .enable_mask = BIT(27), ··· 3373 3373 3374 3374 static struct clk_branch gcc_pcie_5_pipediv2_clk = { 3375 3375 .halt_reg = 0x2f054, 3376 - .halt_check = BRANCH_HALT_VOTED, 3376 + .halt_check = BRANCH_HALT_SKIP, 3377 3377 .clkr = { 3378 3378 .enable_reg = 0x52018, 3379 3379 .enable_mask = BIT(19), ··· 3511 3511 3512 3512 static struct clk_branch gcc_pcie_6a_pipediv2_clk = { 3513 3513 .halt_reg = 0x31060, 3514 - .halt_check = BRANCH_HALT_VOTED, 3514 + .halt_check = BRANCH_HALT_SKIP, 3515 3515 .clkr = { 3516 3516 .enable_reg = 0x52018, 3517 3517 .enable_mask = BIT(28), ··· 3649 3649 3650 3650 static struct clk_branch gcc_pcie_6b_pipediv2_clk = { 3651 3651 .halt_reg = 0x8d060, 3652 - .halt_check = BRANCH_HALT_VOTED, 3652 + .halt_check = BRANCH_HALT_SKIP, 3653 3653 .clkr = { 3654 3654 .enable_reg = 0x52010, 3655 3655 .enable_mask = BIT(28), ··· 6155 6155 .pd = { 6156 6156 .name = "gcc_usb3_mp_ss1_phy_gdsc", 6157 6157 }, 6158 - .pwrsts = PWRSTS_OFF_ON, 6158 + .pwrsts = PWRSTS_RET_ON, 6159 6159 .flags = POLL_CFG_GDSCR | RETAIN_FF_ENABLE, 6160 6160 }; 6161 6161
+2 -2
drivers/clk/qcom/videocc-sm8350.c
··· 452 452 .pd = { 453 453 .name = "mvs0_gdsc", 454 454 }, 455 - .flags = HW_CTRL | RETAIN_FF_ENABLE, 455 + .flags = HW_CTRL_TRIGGER | RETAIN_FF_ENABLE, 456 456 .pwrsts = PWRSTS_OFF_ON, 457 457 }; 458 458 ··· 461 461 .pd = { 462 462 .name = "mvs1_gdsc", 463 463 }, 464 - .flags = HW_CTRL | RETAIN_FF_ENABLE, 464 + .flags = HW_CTRL_TRIGGER | RETAIN_FF_ENABLE, 465 465 .pwrsts = PWRSTS_OFF_ON, 466 466 }; 467 467
+38 -21
drivers/cpufreq/intel_pstate.c
··· 1028 1028 } 1029 1029 } 1030 1030 1031 - static void __hybrid_init_cpu_capacity_scaling(void) 1031 + static void __hybrid_refresh_cpu_capacity_scaling(void) 1032 1032 { 1033 1033 hybrid_max_perf_cpu = NULL; 1034 1034 hybrid_update_cpu_capacity_scaling(); 1035 1035 } 1036 1036 1037 - static void hybrid_init_cpu_capacity_scaling(void) 1037 + static void hybrid_refresh_cpu_capacity_scaling(void) 1038 1038 { 1039 - bool disable_itmt = false; 1039 + guard(mutex)(&hybrid_capacity_lock); 1040 1040 1041 - mutex_lock(&hybrid_capacity_lock); 1041 + __hybrid_refresh_cpu_capacity_scaling(); 1042 + } 1042 1043 1044 + static void hybrid_init_cpu_capacity_scaling(bool refresh) 1045 + { 1043 1046 /* 1044 1047 * If hybrid_max_perf_cpu is set at this point, the hybrid CPU capacity 1045 1048 * scaling has been enabled already and the driver is just changing the 1046 1049 * operation mode. 1047 1050 */ 1048 - if (hybrid_max_perf_cpu) { 1049 - __hybrid_init_cpu_capacity_scaling(); 1050 - goto unlock; 1051 + if (refresh) { 1052 + hybrid_refresh_cpu_capacity_scaling(); 1053 + return; 1051 1054 } 1052 1055 1053 1056 /* ··· 1059 1056 * do not do that when SMT is in use. 1060 1057 */ 1061 1058 if (hwp_is_hybrid && !sched_smt_active() && arch_enable_hybrid_capacity_scale()) { 1062 - __hybrid_init_cpu_capacity_scaling(); 1063 - disable_itmt = true; 1064 - } 1065 - 1066 - unlock: 1067 - mutex_unlock(&hybrid_capacity_lock); 1068 - 1069 - /* 1070 - * Disabling ITMT causes sched domains to be rebuilt to disable asym 1071 - * packing and enable asym capacity. 1072 - */ 1073 - if (disable_itmt) 1059 + hybrid_refresh_cpu_capacity_scaling(); 1060 + /* 1061 + * Disabling ITMT causes sched domains to be rebuilt to disable asym 1062 + * packing and enable asym capacity. 
1063 + */ 1074 1064 sched_clear_itmt_support(); 1065 + } 1066 + } 1067 + 1068 + static bool hybrid_clear_max_perf_cpu(void) 1069 + { 1070 + bool ret; 1071 + 1072 + guard(mutex)(&hybrid_capacity_lock); 1073 + 1074 + ret = !!hybrid_max_perf_cpu; 1075 + hybrid_max_perf_cpu = NULL; 1076 + 1077 + return ret; 1075 1078 } 1076 1079 1077 1080 static void __intel_pstate_get_hwp_cap(struct cpudata *cpu) ··· 1401 1392 mutex_lock(&hybrid_capacity_lock); 1402 1393 1403 1394 if (hybrid_max_perf_cpu) 1404 - __hybrid_init_cpu_capacity_scaling(); 1395 + __hybrid_refresh_cpu_capacity_scaling(); 1405 1396 1406 1397 mutex_unlock(&hybrid_capacity_lock); 1407 1398 } ··· 2272 2263 } else { 2273 2264 cpu->pstate.scaling = perf_ctl_scaling; 2274 2265 } 2266 + /* 2267 + * If the CPU is going online for the first time and it was 2268 + * offline initially, asym capacity scaling needs to be updated. 2269 + */ 2270 + hybrid_update_capacity(cpu); 2275 2271 } else { 2276 2272 cpu->pstate.scaling = perf_ctl_scaling; 2277 2273 cpu->pstate.max_pstate = pstate_funcs.get_max(cpu->cpu); ··· 3366 3352 3367 3353 static int intel_pstate_register_driver(struct cpufreq_driver *driver) 3368 3354 { 3355 + bool refresh_cpu_cap_scaling; 3369 3356 int ret; 3370 3357 3371 3358 if (driver == &intel_pstate) ··· 3379 3364 3380 3365 arch_set_max_freq_ratio(global.turbo_disabled); 3381 3366 3367 + refresh_cpu_cap_scaling = hybrid_clear_max_perf_cpu(); 3368 + 3382 3369 intel_pstate_driver = driver; 3383 3370 ret = cpufreq_register_driver(intel_pstate_driver); 3384 3371 if (ret) { ··· 3390 3373 3391 3374 global.min_perf_pct = min_perf_pct_min(); 3392 3375 3393 - hybrid_init_cpu_capacity_scaling(); 3376 + hybrid_init_cpu_capacity_scaling(refresh_cpu_cap_scaling); 3394 3377 3395 3378 return 0; 3396 3379 }
-4
drivers/firmware/smccc/smccc.c
··· 16 16 static enum arm_smccc_conduit smccc_conduit = SMCCC_CONDUIT_NONE; 17 17 18 18 bool __ro_after_init smccc_trng_available = false; 19 - u64 __ro_after_init smccc_has_sve_hint = false; 20 19 s32 __ro_after_init smccc_soc_id_version = SMCCC_RET_NOT_SUPPORTED; 21 20 s32 __ro_after_init smccc_soc_id_revision = SMCCC_RET_NOT_SUPPORTED; 22 21 ··· 27 28 smccc_conduit = conduit; 28 29 29 30 smccc_trng_available = smccc_probe_trng(); 30 - if (IS_ENABLED(CONFIG_ARM64_SVE) && 31 - smccc_version >= ARM_SMCCC_VERSION_1_3) 32 - smccc_has_sve_hint = true; 33 31 34 32 if ((smccc_version >= ARM_SMCCC_VERSION_1_2) && 35 33 (smccc_conduit != SMCCC_CONDUIT_NONE)) {
+2 -2
drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
··· 172 172 &buffer); 173 173 obj = (union acpi_object *)buffer.pointer; 174 174 175 - /* Fail if calling the method fails and ATIF is supported */ 176 - if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) { 175 + /* Fail if calling the method fails */ 176 + if (ACPI_FAILURE(status)) { 177 177 DRM_DEBUG_DRIVER("failed to evaluate ATIF got %s\n", 178 178 acpi_format_exception(status)); 179 179 kfree(obj);
+5 -5
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
··· 402 402 int r; 403 403 uint32_t *data, x; 404 404 405 - if (size & 0x3 || *pos & 0x3) 405 + if (size > 4096 || size & 0x3 || *pos & 0x3) 406 406 return -EINVAL; 407 407 408 408 r = pm_runtime_get_sync(adev_to_drm(adev)->dev); ··· 1648 1648 1649 1649 for (i = 0; i < ARRAY_SIZE(debugfs_regs); i++) { 1650 1650 ent = debugfs_create_file(debugfs_regs_names[i], 1651 - S_IFREG | 0444, root, 1651 + S_IFREG | 0400, root, 1652 1652 adev, debugfs_regs[i]); 1653 1653 if (!i && !IS_ERR_OR_NULL(ent)) 1654 1654 i_size_write(ent->d_inode, adev->rmmio_size); ··· 2100 2100 amdgpu_securedisplay_debugfs_init(adev); 2101 2101 amdgpu_fw_attestation_debugfs_init(adev); 2102 2102 2103 - debugfs_create_file("amdgpu_evict_vram", 0444, root, adev, 2103 + debugfs_create_file("amdgpu_evict_vram", 0400, root, adev, 2104 2104 &amdgpu_evict_vram_fops); 2105 - debugfs_create_file("amdgpu_evict_gtt", 0444, root, adev, 2105 + debugfs_create_file("amdgpu_evict_gtt", 0400, root, adev, 2106 2106 &amdgpu_evict_gtt_fops); 2107 - debugfs_create_file("amdgpu_test_ib", 0444, root, adev, 2107 + debugfs_create_file("amdgpu_test_ib", 0400, root, adev, 2108 2108 &amdgpu_debugfs_test_ib_fops); 2109 2109 debugfs_create_file("amdgpu_vm_info", 0444, root, adev, 2110 2110 &amdgpu_debugfs_vm_info_fops);
+1 -1
drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c
··· 482 482 case AMDGPU_SPX_PARTITION_MODE: 483 483 return adev->gmc.num_mem_partitions == 1 && num_xcc > 0; 484 484 case AMDGPU_DPX_PARTITION_MODE: 485 - return adev->gmc.num_mem_partitions != 8 && (num_xcc % 4) == 0; 485 + return adev->gmc.num_mem_partitions <= 2 && (num_xcc % 4) == 0; 486 486 case AMDGPU_TPX_PARTITION_MODE: 487 487 return (adev->gmc.num_mem_partitions == 1 || 488 488 adev->gmc.num_mem_partitions == 3) &&
+15
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
··· 9429 9429 bool mode_set_reset_required = false; 9430 9430 u32 i; 9431 9431 struct dc_commit_streams_params params = {dc_state->streams, dc_state->stream_count}; 9432 + bool set_backlight_level = false; 9432 9433 9433 9434 /* Disable writeback */ 9434 9435 for_each_old_connector_in_state(state, connector, old_con_state, i) { ··· 9549 9548 acrtc->hw_mode = new_crtc_state->mode; 9550 9549 crtc->hwmode = new_crtc_state->mode; 9551 9550 mode_set_reset_required = true; 9551 + set_backlight_level = true; 9552 9552 } else if (modereset_required(new_crtc_state)) { 9553 9553 drm_dbg_atomic(dev, 9554 9554 "Atomic commit: RESET. crtc id %d:[%p]\n", ··· 9599 9597 dm_new_crtc_state->stream, acrtc); 9600 9598 else 9601 9599 acrtc->otg_inst = status->primary_otg_inst; 9600 + } 9601 + } 9602 + 9603 + /* During boot up and resume the DC layer will reset the panel brightness 9604 + * to fix a flicker issue. 9605 + * It will cause the dm->actual_brightness is not the current panel brightness 9606 + * level. (the dm->brightness is the correct panel level) 9607 + * So we set the backlight level with dm->brightness value after set mode 9608 + */ 9609 + if (set_backlight_level) { 9610 + for (i = 0; i < dm->num_of_edps; i++) { 9611 + if (dm->backlight_dev[i]) 9612 + amdgpu_dm_backlight_set_level(dm, i, dm->brightness[i]); 9602 9613 } 9603 9614 } 9604 9615 }
+3 -1
drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
··· 3127 3127 struct atom_data_revision revision; 3128 3128 3129 3129 // vram info moved to umc_info for DCN4x 3130 - if (info && DATA_TABLES(umc_info)) { 3130 + if (dcb->ctx->dce_version >= DCN_VERSION_4_01 && 3131 + dcb->ctx->dce_version < DCN_VERSION_MAX && 3132 + info && DATA_TABLES(umc_info)) { 3131 3133 header = GET_IMAGE(struct atom_common_table_header, 3132 3134 DATA_TABLES(umc_info)); 3133 3135
+35 -14
drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
··· 1259 1259 smu->watermarks_bitmap = 0; 1260 1260 smu->power_profile_mode = PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; 1261 1261 smu->default_power_profile_mode = PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; 1262 + smu->user_dpm_profile.user_workload_mask = 0; 1262 1263 1263 1264 atomic_set(&smu->smu_power.power_gate.vcn_gated, 1); 1264 1265 atomic_set(&smu->smu_power.power_gate.jpeg_gated, 1); 1265 1266 atomic_set(&smu->smu_power.power_gate.vpe_gated, 1); 1266 1267 atomic_set(&smu->smu_power.power_gate.umsch_mm_gated, 1); 1267 1268 1268 - smu->workload_prority[PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT] = 0; 1269 - smu->workload_prority[PP_SMC_POWER_PROFILE_FULLSCREEN3D] = 1; 1270 - smu->workload_prority[PP_SMC_POWER_PROFILE_POWERSAVING] = 2; 1271 - smu->workload_prority[PP_SMC_POWER_PROFILE_VIDEO] = 3; 1272 - smu->workload_prority[PP_SMC_POWER_PROFILE_VR] = 4; 1273 - smu->workload_prority[PP_SMC_POWER_PROFILE_COMPUTE] = 5; 1274 - smu->workload_prority[PP_SMC_POWER_PROFILE_CUSTOM] = 6; 1269 + smu->workload_priority[PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT] = 0; 1270 + smu->workload_priority[PP_SMC_POWER_PROFILE_FULLSCREEN3D] = 1; 1271 + smu->workload_priority[PP_SMC_POWER_PROFILE_POWERSAVING] = 2; 1272 + smu->workload_priority[PP_SMC_POWER_PROFILE_VIDEO] = 3; 1273 + smu->workload_priority[PP_SMC_POWER_PROFILE_VR] = 4; 1274 + smu->workload_priority[PP_SMC_POWER_PROFILE_COMPUTE] = 5; 1275 + smu->workload_priority[PP_SMC_POWER_PROFILE_CUSTOM] = 6; 1275 1276 1276 1277 if (smu->is_apu || 1277 - !smu_is_workload_profile_available(smu, PP_SMC_POWER_PROFILE_FULLSCREEN3D)) 1278 - smu->workload_mask = 1 << smu->workload_prority[PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT]; 1279 - else 1280 - smu->workload_mask = 1 << smu->workload_prority[PP_SMC_POWER_PROFILE_FULLSCREEN3D]; 1278 + !smu_is_workload_profile_available(smu, PP_SMC_POWER_PROFILE_FULLSCREEN3D)) { 1279 + smu->driver_workload_mask = 1280 + 1 << smu->workload_priority[PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT]; 1281 + } else { 1282 + smu->driver_workload_mask = 1283 + 1 << smu->workload_priority[PP_SMC_POWER_PROFILE_FULLSCREEN3D]; 1284 + smu->default_power_profile_mode = PP_SMC_POWER_PROFILE_FULLSCREEN3D; 1285 + } 1281 1286 1287 + smu->workload_mask = smu->driver_workload_mask | 1288 + smu->user_dpm_profile.user_workload_mask; 1282 1289 smu->workload_setting[0] = PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; 1283 1290 smu->workload_setting[1] = PP_SMC_POWER_PROFILE_FULLSCREEN3D; 1284 1291 smu->workload_setting[2] = PP_SMC_POWER_PROFILE_POWERSAVING; ··· 2355 2348 return -EINVAL; 2356 2349 2357 2350 if (!en) { 2358 - smu->workload_mask &= ~(1 << smu->workload_prority[type]); 2351 + smu->driver_workload_mask &= ~(1 << smu->workload_priority[type]); 2359 2352 index = fls(smu->workload_mask); 2360 2353 index = index > 0 && index <= WORKLOAD_POLICY_MAX ? index - 1 : 0; 2361 2354 workload[0] = smu->workload_setting[index]; 2362 2355 } else { 2363 2356 smu->driver_workload_mask |= (1 << smu->workload_priority[type]); 2364 2357 index = fls(smu->workload_mask); 2365 2358 index = index <= WORKLOAD_POLICY_MAX ? index - 1 : 0; 2366 2359 workload[0] = smu->workload_setting[index]; 2367 2360 } 2361 + 2362 + smu->workload_mask = smu->driver_workload_mask | 2363 + smu->user_dpm_profile.user_workload_mask; 2368 2364 2369 2365 if (smu_dpm_ctx->dpm_level != AMD_DPM_FORCED_LEVEL_MANUAL && 2370 2366 smu_dpm_ctx->dpm_level != AMD_DPM_FORCED_LEVEL_PERF_DETERMINISM) ··· 3059 3049 uint32_t param_size) 3060 3050 { 3061 3051 struct smu_context *smu = handle; 3052 + int ret; 3062 3053 3063 3054 if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled || 3064 3055 !smu->ppt_funcs->set_power_profile_mode) 3065 3056 return -EOPNOTSUPP; 3066 3057 3067 - return smu_bump_power_profile_mode(smu, param, param_size); 3058 + if (smu->user_dpm_profile.user_workload_mask & 3059 + (1 << smu->workload_priority[param[param_size]])) 3060 + return 0; 3061 + 3062 + smu->user_dpm_profile.user_workload_mask = 3063 + (1 << smu->workload_priority[param[param_size]]); 3064 + smu->workload_mask = smu->user_dpm_profile.user_workload_mask | 3065 + smu->driver_workload_mask; 3066 + ret = smu_bump_power_profile_mode(smu, param, param_size); 3067 + 3068 + return ret; 3068 3069 } 3069 3070 3070 3071 static int smu_get_fan_control_mode(void *handle, u32 *fan_mode)
+3 -1
drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
··· 240 240 /* user clock state information */ 241 241 uint32_t clk_mask[SMU_CLK_COUNT]; 242 242 uint32_t clk_dependency; 243 + uint32_t user_workload_mask; 243 244 }; 244 245 245 246 #define SMU_TABLE_INIT(tables, table_id, s, a, d) \ ··· 558 557 bool disable_uclk_switch; 559 558 560 559 uint32_t workload_mask; 561 - uint32_t workload_prority[WORKLOAD_POLICY_MAX]; 560 + uint32_t driver_workload_mask; 561 + uint32_t workload_priority[WORKLOAD_POLICY_MAX]; 562 562 uint32_t workload_setting[WORKLOAD_POLICY_MAX]; 563 563 uint32_t power_profile_mode; 564 564 uint32_t default_power_profile_mode;
+2 -3
drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
··· 1455 1455 return -EINVAL; 1456 1456 } 1457 1457 1458 - 1459 1458 if ((profile_mode == PP_SMC_POWER_PROFILE_CUSTOM) && 1460 1459 (smu->smc_fw_version >= 0x360d00)) { 1461 1460 if (size != 10) ··· 1522 1523 1523 1524 ret = smu_cmn_send_smc_msg_with_param(smu, 1524 1525 SMU_MSG_SetWorkloadMask, 1525 - 1 << workload_type, 1526 + smu->workload_mask, 1526 1527 NULL); 1527 1528 if (ret) { 1528 1529 dev_err(smu->adev->dev, "Fail to set workload type %d\n", workload_type); 1529 1530 return ret; 1530 1531 } 1531 1532 1532 - smu->power_profile_mode = profile_mode; 1533 + smu_cmn_assign_power_profile(smu); 1533 1534 1534 1535 return 0; 1535 1536 }
+4 -1
drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
··· 2081 2081 smu->power_profile_mode); 2082 2082 if (workload_type < 0) 2083 2083 return -EINVAL; 2084 + 2084 2085 ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask, 2085 - 1 << workload_type, NULL); 2086 + smu->workload_mask, NULL); 2086 2087 if (ret) 2087 2088 dev_err(smu->adev->dev, "[%s] Failed to set work load mask!", __func__); 2089 + else 2090 + smu_cmn_assign_power_profile(smu); 2088 2091 2089 2092 return ret; 2090 2093 }
+4 -1
drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
··· 1786 1786 smu->power_profile_mode); 1787 1787 if (workload_type < 0) 1788 1788 return -EINVAL; 1789 + 1789 1790 ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask, 1790 - 1 << workload_type, NULL); 1791 + smu->workload_mask, NULL); 1791 1792 if (ret) 1792 1793 dev_err(smu->adev->dev, "[%s] Failed to set work load mask!", __func__); 1794 + else 1795 + smu_cmn_assign_power_profile(smu); 1793 1796 1794 1797 return ret; 1795 1798 }
+2 -2
drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
··· 1079 1079 } 1080 1080 1081 1081 ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_ActiveProcessNotify, 1082 - 1 << workload_type, 1082 + smu->workload_mask, 1083 1083 NULL); 1084 1084 if (ret) { 1085 1085 dev_err_once(smu->adev->dev, "Fail to set workload type %d\n", ··· 1087 1087 return ret; 1088 1088 } 1089 1089 1090 - smu->power_profile_mode = profile_mode; 1090 + smu_cmn_assign_power_profile(smu); 1091 1091 1092 1092 return 0; 1093 1093 }
+2 -2
drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c
··· 890 890 } 891 891 892 892 ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_ActiveProcessNotify, 893 - 1 << workload_type, 893 + smu->workload_mask, 894 894 NULL); 895 895 if (ret) { 896 896 dev_err_once(smu->adev->dev, "Fail to set workload type %d\n", workload_type); 897 897 return ret; 898 898 } 899 899 900 - smu->power_profile_mode = profile_mode; 900 + smu_cmn_assign_power_profile(smu); 901 901 902 902 return 0; 903 903 }
+15 -5
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
··· 2485 2485 DpmActivityMonitorCoeffInt_t *activity_monitor = 2486 2486 &(activity_monitor_external.DpmActivityMonitorCoeffInt); 2487 2487 int workload_type, ret = 0; 2488 - u32 workload_mask, selected_workload_mask; 2488 + u32 workload_mask; 2489 2489 2490 2490 smu->power_profile_mode = input[size]; 2491 2491 ··· 2552 2552 if (workload_type < 0) 2553 2553 return -EINVAL; 2554 2554 2555 - selected_workload_mask = workload_mask = 1 << workload_type; 2555 + workload_mask = 1 << workload_type; 2556 2556 2557 2557 /* Add optimizations for SMU13.0.0/10. Reuse the power saving profile */ 2558 2558 if ((amdgpu_ip_version(smu->adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 0) && ··· 2567 2567 workload_mask |= 1 << workload_type; 2568 2568 } 2569 2569 2570 + smu->workload_mask |= workload_mask; 2570 2571 ret = smu_cmn_send_smc_msg_with_param(smu, 2571 2572 SMU_MSG_SetWorkloadMask, 2572 - workload_mask, 2573 + smu->workload_mask, 2573 2574 NULL); 2574 - if (!ret) 2575 - smu->workload_mask = selected_workload_mask; 2575 + if (!ret) { 2576 + smu_cmn_assign_power_profile(smu); 2577 + if (smu->power_profile_mode == PP_SMC_POWER_PROFILE_POWERSAVING) { 2578 + workload_type = smu_cmn_to_asic_specific_index(smu, 2579 + CMN2ASIC_MAPPING_WORKLOAD, 2580 + PP_SMC_POWER_PROFILE_FULLSCREEN3D); 2581 + smu->power_profile_mode = smu->workload_mask & (1 << workload_type) 2582 + ? PP_SMC_POWER_PROFILE_FULLSCREEN3D 2583 + : PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; 2584 + } 2585 + } 2576 2586 2577 2587 return ret; 2578 2588 }
+3 -2
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
··· 2499 2499 smu->power_profile_mode); 2500 2500 if (workload_type < 0) 2501 2501 return -EINVAL; 2502 + 2502 2503 ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask, 2503 - 1 << workload_type, NULL); 2504 + smu->workload_mask, NULL); 2504 2505 2505 2506 if (ret) 2506 2507 dev_err(smu->adev->dev, "[%s] Failed to set work load mask!", __func__); 2507 2508 else 2508 - smu->workload_mask = (1 << workload_type); 2509 + smu_cmn_assign_power_profile(smu); 2509 2510 2510 2511 return ret; 2511 2512 }
+5 -69
drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c
··· 367 367 return 0; 368 368 } 369 369 370 - #ifndef atom_smc_dpm_info_table_14_0_0 371 - struct atom_smc_dpm_info_table_14_0_0 { 372 - struct atom_common_table_header table_header; 373 - BoardTable_t BoardTable; 374 - }; 375 - #endif 376 - 377 - static int smu_v14_0_2_append_powerplay_table(struct smu_context *smu) 378 - { 379 - struct smu_table_context *table_context = &smu->smu_table; 380 - PPTable_t *smc_pptable = table_context->driver_pptable; 381 - struct atom_smc_dpm_info_table_14_0_0 *smc_dpm_table; 382 - BoardTable_t *BoardTable = &smc_pptable->BoardTable; 383 - int index, ret; 384 - 385 - index = get_index_into_master_table(atom_master_list_of_data_tables_v2_1, 386 - smc_dpm_info); 387 - 388 - ret = amdgpu_atombios_get_data_table(smu->adev, index, NULL, NULL, NULL, 389 - (uint8_t **)&smc_dpm_table); 390 - if (ret) 391 - return ret; 392 - 393 - memcpy(BoardTable, &smc_dpm_table->BoardTable, sizeof(BoardTable_t)); 394 - 395 - return 0; 396 - } 397 - 398 - #if 0 399 - static int smu_v14_0_2_get_pptable_from_pmfw(struct smu_context *smu, 400 - void **table, 401 - uint32_t *size) 402 - { 403 - struct smu_table_context *smu_table = &smu->smu_table; 404 - void *combo_pptable = smu_table->combo_pptable; 405 - int ret = 0; 406 - 407 - ret = smu_cmn_get_combo_pptable(smu); 408 - if (ret) 409 - return ret; 410 - 411 - *table = combo_pptable; 412 - *size = sizeof(struct smu_14_0_powerplay_table); 413 - 414 - return 0; 415 - } 416 - #endif 417 - 418 370 static int smu_v14_0_2_get_pptable_from_pmfw(struct smu_context *smu, 419 371 void **table, 420 372 uint32_t *size) ··· 388 436 static int smu_v14_0_2_setup_pptable(struct smu_context *smu) 389 437 { 390 438 struct smu_table_context *smu_table = &smu->smu_table; 391 - struct amdgpu_device *adev = smu->adev; 392 439 int ret = 0; 393 440 394 441 if (amdgpu_sriov_vf(smu->adev)) 395 442 return 0; 396 443 397 - if (!adev->scpm_enabled) 398 - ret = smu_v14_0_setup_pptable(smu); 399 - else 400 - ret = smu_v14_0_2_get_pptable_from_pmfw(smu, 444 + ret = smu_v14_0_2_get_pptable_from_pmfw(smu, 401 445 &smu_table->power_play_table, 402 446 &smu_table->power_play_table_size); 403 447 if (ret) ··· 402 454 ret = smu_v14_0_2_store_powerplay_table(smu); 403 455 if (ret) 404 456 return ret; 405 - 406 - /* 407 - * With SCPM enabled, the operation below will be handled 408 - * by PSP. Driver involvment is unnecessary and useless. 409 - */ 410 - if (!adev->scpm_enabled) { 411 - ret = smu_v14_0_2_append_powerplay_table(smu); 412 - if (ret) 413 - return ret; 414 - } 415 457 416 458 ret = smu_v14_0_2_check_powerplay_table(smu); 417 459 if (ret) ··· 1807 1869 if (workload_type < 0) 1808 1870 return -EINVAL; 1809 1871 1810 - ret = smu_cmn_send_smc_msg_with_param(smu, 1811 - SMU_MSG_SetWorkloadMask, 1812 - 1 << workload_type, 1813 - NULL); 1872 + ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask, 1873 + smu->workload_mask, NULL); 1874 + 1814 1875 if (!ret) 1815 1876 smu_cmn_assign_power_profile(smu); 1816 1877 1817 1878 return ret; 1818 1879 } ··· 2736 2799 .check_fw_status = smu_v14_0_check_fw_status, 2737 2800 .setup_pptable = smu_v14_0_2_setup_pptable, 2738 2801 .check_fw_version = smu_v14_0_check_fw_version, 2739 - .write_pptable = smu_cmn_write_pptable, 2740 2802 .set_driver_table_location = smu_v14_0_set_driver_table_location, 2741 2803 .system_features_control = smu_v14_0_system_features_control, 2742 2804 .set_allowed_mask = smu_v14_0_set_allowed_mask,
+8
drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
··· 1138 1138 return ret; 1139 1139 } 1140 1140 1141 + void smu_cmn_assign_power_profile(struct smu_context *smu) 1142 + { 1143 + uint32_t index; 1144 + index = fls(smu->workload_mask); 1145 + index = index > 0 && index <= WORKLOAD_POLICY_MAX ? index - 1 : 0; 1146 + smu->power_profile_mode = smu->workload_setting[index]; 1147 + } 1148 + 1141 1149 bool smu_cmn_is_audio_func_enabled(struct amdgpu_device *adev) 1142 1150 { 1143 1151 struct pci_dev *p = NULL;
+2
drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h
··· 130 130 int smu_cmn_set_mp1_state(struct smu_context *smu, 131 131 enum pp_mp1_state mp1_state); 132 132 133 + void smu_cmn_assign_power_profile(struct smu_context *smu); 134 + 133 135 /* 134 136 * Helper function to make sysfs_emit_at() happy. Align buf to 135 137 * the current page boundary and record the offset.
-1
drivers/gpu/drm/drm_panel_orientation_quirks.c
··· 403 403 }, { /* Lenovo Yoga Tab 3 X90F */ 404 404 .matches = { 405 405 DMI_MATCH(DMI_SYS_VENDOR, "Intel Corporation"), 406 - DMI_MATCH(DMI_PRODUCT_NAME, "CHERRYVIEW D1 PLATFORM"), 407 406 DMI_MATCH(DMI_PRODUCT_VERSION, "Blade3-10A-001"), 408 407 }, 409 408 .driver_data = (void *)&lcd1600x2560_rightside_up,
+33
drivers/gpu/drm/imagination/pvr_context.c
··· 17 17 18 18 #include <drm/drm_auth.h> 19 19 #include <drm/drm_managed.h> 20 + 21 + #include <linux/bug.h> 20 22 #include <linux/errno.h> 21 23 #include <linux/kernel.h> 24 + #include <linux/list.h> 22 25 #include <linux/sched.h> 23 26 #include <linux/slab.h> 27 + #include <linux/spinlock.h> 24 28 #include <linux/string.h> 25 29 #include <linux/types.h> 26 30 #include <linux/xarray.h> ··· 358 354 return err; 359 355 } 360 356 357 + spin_lock(&pvr_dev->ctx_list_lock); 358 + list_add_tail(&ctx->file_link, &pvr_file->contexts); 359 + spin_unlock(&pvr_dev->ctx_list_lock); 360 + 361 361 return 0; 362 362 363 363 err_destroy_fw_obj: ··· 387 379 struct pvr_context *ctx = 388 380 container_of(ref_count, struct pvr_context, ref_count); 389 381 struct pvr_device *pvr_dev = ctx->pvr_dev; 382 + 383 + WARN_ON(in_interrupt()); 384 + spin_lock(&pvr_dev->ctx_list_lock); 385 + list_del(&ctx->file_link); 386 + spin_unlock(&pvr_dev->ctx_list_lock); 390 387 391 388 xa_erase(&pvr_dev->ctx_ids, ctx->ctx_id); 392 389 pvr_context_destroy_queues(ctx); ··· 450 437 */ 451 438 void pvr_destroy_contexts_for_file(struct pvr_file *pvr_file) 452 439 { 440 + struct pvr_device *pvr_dev = pvr_file->pvr_dev; 453 441 struct pvr_context *ctx; 454 442 unsigned long handle; 455 443 456 444 xa_for_each(&pvr_file->ctx_handles, handle, ctx) 457 445 pvr_context_destroy(pvr_file, handle); 446 + 447 + spin_lock(&pvr_dev->ctx_list_lock); 448 + ctx = list_first_entry(&pvr_file->contexts, struct pvr_context, file_link); 449 + 450 + while (!list_entry_is_head(ctx, &pvr_file->contexts, file_link)) { 451 + list_del_init(&ctx->file_link); 452 + 453 + if (pvr_context_get_if_referenced(ctx)) { 454 + spin_unlock(&pvr_dev->ctx_list_lock); 455 + 456 + pvr_vm_unmap_all(ctx->vm_ctx); 457 + 458 + pvr_context_put(ctx); 459 + spin_lock(&pvr_dev->ctx_list_lock); 460 + } 461 + ctx = list_first_entry(&pvr_file->contexts, struct pvr_context, file_link); 462 + } 463 + spin_unlock(&pvr_dev->ctx_list_lock); 458 464 } 459 465 460 466 /** ··· 483 451 void pvr_context_device_init(struct pvr_device *pvr_dev) 484 452 { 485 453 xa_init_flags(&pvr_dev->ctx_ids, XA_FLAGS_ALLOC1); 454 + spin_lock_init(&pvr_dev->ctx_list_lock); 486 455 } 487 456 488 457 /**
+21
drivers/gpu/drm/imagination/pvr_context.h
··· 85 85 /** @compute: Transfer queue. */ 86 86 struct pvr_queue *transfer; 87 87 } queues; 88 + 89 + /** @file_link: pvr_file PVR context list link. */ 90 + struct list_head file_link; 88 91 }; 89 92 90 93 static __always_inline struct pvr_queue * ··· 124 121 kref_get(&ctx->ref_count); 125 122 126 123 return ctx; 124 + } 125 + 126 + /** 127 + * pvr_context_get_if_referenced() - Take an additional reference on a still 128 + * referenced context. 129 + * @ctx: Context pointer. 130 + * 131 + * Call pvr_context_put() to release. 132 + * 133 + * Returns: 134 + * * True on success, or 135 + * * false if no context pointer passed, or the context wasn't still 136 + * * referenced. 137 + */ 138 + static __always_inline bool 139 + pvr_context_get_if_referenced(struct pvr_context *ctx) 140 + { 141 + return ctx != NULL && kref_get_unless_zero(&ctx->ref_count) != 0; 127 142 } 128 143 129 144 /**
+10
drivers/gpu/drm/imagination/pvr_device.h
··· 23 23 #include <linux/kernel.h> 24 24 #include <linux/math.h> 25 25 #include <linux/mutex.h> 26 + #include <linux/spinlock_types.h> 26 27 #include <linux/timer.h> 27 28 #include <linux/types.h> 28 29 #include <linux/wait.h> ··· 294 293 295 294 /** @sched_wq: Workqueue for schedulers. */ 296 295 struct workqueue_struct *sched_wq; 296 + 297 + /** 298 + * @ctx_list_lock: Lock to be held when accessing the context list in 299 + * struct pvr_file. 300 + */ 301 + spinlock_t ctx_list_lock; 297 302 }; 298 303 299 304 /** ··· 351 344 * This array is used to allocate handles returned to userspace. 352 345 */ 353 346 struct xarray vm_ctx_handles; 347 + 348 + /** @contexts: PVR context list. */ 349 + struct list_head contexts; 354 350 }; 355 351 356 352 /**
+3
drivers/gpu/drm/imagination/pvr_drv.c
··· 28 28 #include <linux/export.h> 29 29 #include <linux/fs.h> 30 30 #include <linux/kernel.h> 31 + #include <linux/list.h> 31 32 #include <linux/mod_devicetable.h> 32 33 #include <linux/module.h> 33 34 #include <linux/moduleparam.h> ··· 1326 1325 * private data for convenient access. 1327 1326 */ 1328 1327 pvr_file->pvr_dev = pvr_dev; 1328 + 1329 + INIT_LIST_HEAD(&pvr_file->contexts); 1329 1330 1330 1331 xa_init_flags(&pvr_file->ctx_handles, XA_FLAGS_ALLOC1); 1331 1332 xa_init_flags(&pvr_file->free_list_handles, XA_FLAGS_ALLOC1);
+18 -4
drivers/gpu/drm/imagination/pvr_vm.c
··· 14 14 #include <drm/drm_gem.h> 15 15 #include <drm/drm_gpuvm.h> 16 16 17 + #include <linux/bug.h> 17 18 #include <linux/container_of.h> 18 19 #include <linux/err.h> 19 20 #include <linux/errno.h> ··· 598 597 } 599 598 600 599 /** 601 - * pvr_vm_context_release() - Teardown a VM context. 602 - * @ref_count: Pointer to reference counter of the VM context. 600 + * pvr_vm_unmap_all() - Unmap all mappings associated with a VM context. 601 + * @vm_ctx: Target VM context. 603 602 * 604 603 * This function ensures that no mappings are left dangling by unmapping them 605 604 * all in order of ascending device-virtual address. 605 + */ 606 + void 607 + pvr_vm_unmap_all(struct pvr_vm_context *vm_ctx) 608 + { 609 + WARN_ON(pvr_vm_unmap(vm_ctx, vm_ctx->gpuvm_mgr.mm_start, 610 + vm_ctx->gpuvm_mgr.mm_range)); 611 + } 612 + 613 + /** 614 + * pvr_vm_context_release() - Teardown a VM context. 615 + * @ref_count: Pointer to reference counter of the VM context. 616 + * 617 + * This function also ensures that no mappings are left dangling by calling 618 + * pvr_vm_unmap_all. 606 619 */ 607 620 static void 608 621 pvr_vm_context_release(struct kref *ref_count) ··· 627 612 if (vm_ctx->fw_mem_ctx_obj) 628 613 pvr_fw_object_destroy(vm_ctx->fw_mem_ctx_obj); 629 614 630 - WARN_ON(pvr_vm_unmap(vm_ctx, vm_ctx->gpuvm_mgr.mm_start, 631 - vm_ctx->gpuvm_mgr.mm_range)); 615 + pvr_vm_unmap_all(vm_ctx); 632 616 633 617 pvr_mmu_context_destroy(vm_ctx->mmu_ctx); 634 618 drm_gem_private_object_fini(&vm_ctx->dummy_gem);
+1
drivers/gpu/drm/imagination/pvr_vm.h
··· 39 39 struct pvr_gem_object *pvr_obj, u64 pvr_obj_offset, 40 40 u64 device_addr, u64 size); 41 41 int pvr_vm_unmap(struct pvr_vm_context *vm_ctx, u64 device_addr, u64 size); 42 + void pvr_vm_unmap_all(struct pvr_vm_context *vm_ctx); 42 43 43 44 dma_addr_t pvr_vm_get_page_table_root_addr(struct pvr_vm_context *vm_ctx); 44 45 struct dma_resv *pvr_vm_get_dma_resv(struct pvr_vm_context *vm_ctx);
+4
drivers/gpu/drm/panthor/panthor_device.c
··· 390 390 { 391 391 u64 offset = (u64)vma->vm_pgoff << PAGE_SHIFT; 392 392 393 + if ((vma->vm_flags & VM_SHARED) == 0) 394 + return -EINVAL; 395 + 393 396 switch (offset) { 394 397 case DRM_PANTHOR_USER_FLUSH_ID_MMIO_OFFSET: 395 398 if (vma->vm_end - vma->vm_start != PAGE_SIZE || 396 399 (vma->vm_flags & (VM_WRITE | VM_EXEC))) 397 400 return -EINVAL; 401 + vm_flags_clear(vma, VM_MAYWRITE); 398 402 399 403 break; 400 404
+2
drivers/gpu/drm/panthor/panthor_mmu.c
··· 1580 1580 { 1581 1581 struct panthor_vm *vm; 1582 1582 1583 + xa_lock(&pool->xa); 1583 1584 vm = panthor_vm_get(xa_load(&pool->xa, handle)); 1585 + xa_unlock(&pool->xa); 1584 1586 1585 1587 return vm; 1586 1588 }
+1 -1
drivers/gpu/drm/xe/regs/xe_gt_regs.h
··· 517 517 * [4-6] RSVD 518 518 * [7] Disabled 519 519 */ 520 - #define CCS_MODE XE_REG(0x14804) 520 + #define CCS_MODE XE_REG(0x14804, XE_REG_OPTION_MASKED) 521 521 #define CCS_MODE_CSLICE_0_3_MASK REG_GENMASK(11, 0) /* 3 bits per cslice */ 522 522 #define CCS_MODE_CSLICE_MASK 0x7 /* CCS0-3 + rsvd */ 523 523 #define CCS_MODE_CSLICE_WIDTH ilog2(CCS_MODE_CSLICE_MASK + 1)
-10
drivers/gpu/drm/xe/xe_device.c
··· 87 87 mutex_init(&xef->exec_queue.lock); 88 88 xa_init_flags(&xef->exec_queue.xa, XA_FLAGS_ALLOC1); 89 89 90 - spin_lock(&xe->clients.lock); 91 - xe->clients.count++; 92 - spin_unlock(&xe->clients.lock); 93 - 94 90 file->driver_priv = xef; 95 91 kref_init(&xef->refcount); 96 92 ··· 103 107 static void xe_file_destroy(struct kref *ref) 104 108 { 105 109 struct xe_file *xef = container_of(ref, struct xe_file, refcount); 106 - struct xe_device *xe = xef->xe; 107 110 108 111 xa_destroy(&xef->exec_queue.xa); 109 112 mutex_destroy(&xef->exec_queue.lock); 110 113 xa_destroy(&xef->vm.xa); 111 114 mutex_destroy(&xef->vm.lock); 112 - 113 - spin_lock(&xe->clients.lock); 114 - xe->clients.count--; 115 - spin_unlock(&xe->clients.lock); 116 115 117 116 xe_drm_client_put(xef->client); 118 117 kfree(xef->process_name); ··· 324 333 xe->info.force_execlist = xe_modparam.force_execlist; 325 334 326 335 spin_lock_init(&xe->irq.lock); 327 - spin_lock_init(&xe->clients.lock); 328 336 329 337 init_waitqueue_head(&xe->ufence_wq); 330 338
+14
drivers/gpu/drm/xe/xe_device.h
··· 178 178 struct xe_file *xe_file_get(struct xe_file *xef); 179 179 void xe_file_put(struct xe_file *xef); 180 180 181 + /* 182 + * Occasionally it is seen that the G2H worker starts running after a delay of more than 183 + * a second even after being queued and activated by the Linux workqueue subsystem. This 184 + * leads to G2H timeout error. The root cause of issue lies with scheduling latency of 185 + * Lunarlake Hybrid CPU. Issue disappears if we disable Lunarlake atom cores from BIOS 186 + * and this is beyond xe kmd. 187 + * 188 + * TODO: Drop this change once workqueue scheduling delay issue is fixed on LNL Hybrid CPU. 189 + */ 190 + #define LNL_FLUSH_WORKQUEUE(wq__) \ 191 + flush_workqueue(wq__) 192 + #define LNL_FLUSH_WORK(wrk__) \ 193 + flush_work(wrk__) 194 + 181 195 #endif
-9
drivers/gpu/drm/xe/xe_device_types.h
··· 353 353 struct workqueue_struct *wq; 354 354 } sriov; 355 355 356 - /** @clients: drm clients info */ 357 - struct { 358 - /** @clients.lock: Protects drm clients info */ 359 - spinlock_t lock; 360 - 361 - /** @clients.count: number of drm clients */ 362 - u64 count; 363 - } clients; 364 - 365 356 /** @usm: unified memory state */ 366 357 struct { 367 358 /** @usm.asid: convert a ASID to VM */
+9 -4
drivers/gpu/drm/xe/xe_exec.c
··· 132 132 if (XE_IOCTL_DBG(xe, !q)) 133 133 return -ENOENT; 134 134 135 - if (XE_IOCTL_DBG(xe, q->flags & EXEC_QUEUE_FLAG_VM)) 136 - return -EINVAL; 135 + if (XE_IOCTL_DBG(xe, q->flags & EXEC_QUEUE_FLAG_VM)) { 136 + err = -EINVAL; 137 + goto err_exec_queue; 138 + } 137 139 138 140 if (XE_IOCTL_DBG(xe, args->num_batch_buffer && 139 - q->width != args->num_batch_buffer)) 140 - return -EINVAL; 141 + q->width != args->num_batch_buffer)) { 142 + err = -EINVAL; 143 + goto err_exec_queue; 144 + } 141 145 142 146 if (XE_IOCTL_DBG(xe, q->ops->reset_status(q))) { 143 147 err = -ECANCELED; ··· 224 220 fence = xe_sync_in_fence_get(syncs, num_syncs, q, vm); 225 221 if (IS_ERR(fence)) { 226 222 err = PTR_ERR(fence); 223 + xe_vm_unlock(vm); 227 224 goto err_unlock_list; 228 225 } 229 226 for (i = 0; i < num_syncs; i++)
+6
drivers/gpu/drm/xe/xe_exec_queue.c
··· 260 260 { 261 261 int i; 262 262 263 + /* 264 + * Before releasing our ref to lrc and xef, accumulate our run ticks 265 + */ 266 + xe_exec_queue_update_run_ticks(q); 267 + 263 268 for (i = 0; i < q->width; ++i) 264 269 xe_lrc_put(q->lrc[i]); 270 + 265 271 __xe_exec_queue_free(q); 266 272 } 267 273
+11 -4
drivers/gpu/drm/xe/xe_gt_ccs_mode.c
··· 68 68 } 69 69 } 70 70 71 + /* 72 + * Mask bits need to be set for the register. Though only Xe2+ 73 + * platforms require setting of mask bits, it won't harm for older 74 + * platforms as these bits are unused there. 75 + */ 76 + mode |= CCS_MODE_CSLICE_0_3_MASK << 16; 71 77 xe_mmio_write32(gt, CCS_MODE, mode); 72 78 73 79 xe_gt_dbg(gt, "CCS_MODE=%x config:%08x, num_engines:%d, num_slices:%d\n", ··· 139 133 } 140 134 141 135 /* CCS mode can only be updated when there are no drm clients */ 142 - spin_lock(&xe->clients.lock); 143 - if (xe->clients.count) { 144 - spin_unlock(&xe->clients.lock); 136 + mutex_lock(&xe->drm.filelist_mutex); 137 + if (!list_empty(&xe->drm.filelist)) { 138 + mutex_unlock(&xe->drm.filelist_mutex); 139 + xe_gt_dbg(gt, "Rejecting compute mode change as there are active drm clients\n"); 145 140 return -EBUSY; 146 141 } 147 142 ··· 153 146 xe_gt_reset_async(gt); 154 147 } 155 148 156 - spin_unlock(&xe->clients.lock); 149 + mutex_unlock(&xe->drm.filelist_mutex); 157 150 158 151 return count; 159 152 }
+3 -1
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
··· 387 387 * the xe_ggtt_clear() called by below xe_ggtt_remove_node(). 388 388 */ 389 389 xe_ggtt_node_remove(node, false); 390 + } else { 391 + xe_ggtt_node_fini(node); 390 392 } 391 393 } 392 394 ··· 444 442 config->ggtt_region = node; 445 443 return 0; 446 444 err: 447 - xe_ggtt_node_fini(node); 445 + pf_release_ggtt(tile, node); 448 446 return err; 449 447 } 450 448
+2
drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
··· 72 72 struct xe_device *xe = gt_to_xe(gt); 73 73 struct xe_gt_tlb_invalidation_fence *fence, *next; 74 74 75 + LNL_FLUSH_WORK(&gt->uc.guc.ct.g2h_worker); 76 + 75 77 spin_lock_irq(&gt->tlb_invalidation.pending_lock); 76 78 list_for_each_entry_safe(fence, next, 77 79 &gt->tlb_invalidation.pending_fences, link) {
+1 -10
drivers/gpu/drm/xe/xe_guc_ct.c
··· 897 897 898 898 ret = wait_event_timeout(ct->g2h_fence_wq, g2h_fence.done, HZ); 899 899 900 - /* 901 - * Occasionally it is seen that the G2H worker starts running after a delay of more than 902 - * a second even after being queued and activated by the Linux workqueue subsystem. This 903 - * leads to G2H timeout error. The root cause of issue lies with scheduling latency of 904 - * Lunarlake Hybrid CPU. Issue dissappears if we disable Lunarlake atom cores from BIOS 905 - * and this is beyond xe kmd. 906 - * 907 - * TODO: Drop this change once workqueue scheduling delay issue is fixed on LNL Hybrid CPU. 908 - */ 909 900 if (!ret) { 910 - flush_work(&ct->g2h_worker); 901 + LNL_FLUSH_WORK(&ct->g2h_worker); 911 902 if (g2h_fence.done) { 912 903 xe_gt_warn(gt, "G2H fence %u, action %04x, done\n", 913 904 g2h_fence.seqno, action[0]);
-2
drivers/gpu/drm/xe/xe_guc_submit.c
··· 745 745 { 746 746 struct xe_sched_job *job = to_xe_sched_job(drm_job); 747 747 748 - xe_exec_queue_update_run_ticks(job->q); 749 - 750 748 trace_xe_sched_job_free(job); 751 749 xe_sched_job_put(job); 752 750 }
+7
drivers/gpu/drm/xe/xe_wait_user_fence.c
··· 155 155 } 156 156 157 157 if (!timeout) { 158 + LNL_FLUSH_WORKQUEUE(xe->ordered_wq); 159 + err = do_compare(addr, args->value, args->mask, 160 + args->op); 161 + if (err <= 0) { 162 + drm_dbg(&xe->drm, "LNL_FLUSH_WORKQUEUE resolved ufence timeout\n"); 163 + break; 164 + } 158 165 err = -ETIME; 159 166 break; 160 167 }
+4 -2
drivers/i2c/busses/i2c-designware-common.c
··· 524 524 void __i2c_dw_disable(struct dw_i2c_dev *dev) 525 525 { 526 526 struct i2c_timings *t = &dev->timings; 527 - unsigned int raw_intr_stats; 527 + unsigned int raw_intr_stats, ic_stats; 528 528 unsigned int enable; 529 529 int timeout = 100; 530 530 bool abort_needed; ··· 532 532 int ret; 533 533 534 534 regmap_read(dev->map, DW_IC_RAW_INTR_STAT, &raw_intr_stats); 535 + regmap_read(dev->map, DW_IC_STATUS, &ic_stats); 535 536 regmap_read(dev->map, DW_IC_ENABLE, &enable); 536 537 537 - abort_needed = raw_intr_stats & DW_IC_INTR_MST_ON_HOLD; 538 + abort_needed = (raw_intr_stats & DW_IC_INTR_MST_ON_HOLD) || 539 + (ic_stats & DW_IC_STATUS_MASTER_HOLD_TX_FIFO_EMPTY); 538 540 if (abort_needed) { 539 541 if (!(enable & DW_IC_ENABLE_ENABLE)) { 540 542 regmap_write(dev->map, DW_IC_ENABLE, DW_IC_ENABLE_ENABLE);
+1
drivers/i2c/busses/i2c-designware-core.h
··· 116 116 #define DW_IC_STATUS_RFNE BIT(3) 117 117 #define DW_IC_STATUS_MASTER_ACTIVITY BIT(5) 118 118 #define DW_IC_STATUS_SLAVE_ACTIVITY BIT(6) 119 + #define DW_IC_STATUS_MASTER_HOLD_TX_FIFO_EMPTY BIT(7) 119 120 120 121 #define DW_IC_SDA_HOLD_RX_SHIFT 16 121 122 #define DW_IC_SDA_HOLD_RX_MASK GENMASK(23, 16)
+2 -2
drivers/i2c/muxes/i2c-mux-mule.c
··· 66 66 priv = i2c_mux_priv(muxc); 67 67 68 68 priv->regmap = dev_get_regmap(mux_dev->parent, NULL); 69 - if (IS_ERR(priv->regmap)) 70 - return dev_err_probe(mux_dev, PTR_ERR(priv->regmap), 69 + if (!priv->regmap) 70 + return dev_err_probe(mux_dev, -ENODEV, 71 71 "No parent i2c register map\n"); 72 72 73 73 platform_set_drvdata(pdev, muxc);
+7
drivers/irqchip/irq-gic-v3.c
··· 524 524 } 525 525 526 526 gic_poke_irq(d, reg); 527 + 528 + /* 529 + * Force read-back to guarantee that the active state has taken 530 + * effect, and won't race with a guest-driven deactivation. 531 + */ 532 + if (reg == GICD_ISACTIVER) 533 + gic_peek_irq(d, reg); 527 534 return 0; 528 535 } 529 536
+8 -4
drivers/md/dm-bufio.c
··· 2471 2471 int r; 2472 2472 unsigned int num_locks; 2473 2473 struct dm_bufio_client *c; 2474 - char slab_name[27]; 2474 + char slab_name[64]; 2475 + static atomic_t seqno = ATOMIC_INIT(0); 2475 2476 2476 2477 if (!block_size || block_size & ((1 << SECTOR_SHIFT) - 1)) { 2477 2478 DMERR("%s: block size not specified or is not multiple of 512b", __func__); ··· 2523 2522 (block_size < PAGE_SIZE || !is_power_of_2(block_size))) { 2524 2523 unsigned int align = min(1U << __ffs(block_size), (unsigned int)PAGE_SIZE); 2525 2524 2526 - snprintf(slab_name, sizeof(slab_name), "dm_bufio_cache-%u", block_size); 2525 + snprintf(slab_name, sizeof(slab_name), "dm_bufio_cache-%u-%u", 2526 + block_size, atomic_inc_return(&seqno)); 2527 2527 c->slab_cache = kmem_cache_create(slab_name, block_size, align, 2528 2528 SLAB_RECLAIM_ACCOUNT, NULL); 2529 2529 if (!c->slab_cache) { ··· 2533 2531 } 2534 2532 } 2535 2533 if (aux_size) 2536 - snprintf(slab_name, sizeof(slab_name), "dm_bufio_buffer-%u", aux_size); 2534 + snprintf(slab_name, sizeof(slab_name), "dm_bufio_buffer-%u-%u", 2535 + aux_size, atomic_inc_return(&seqno)); 2537 2536 else 2538 - snprintf(slab_name, sizeof(slab_name), "dm_bufio_buffer"); 2537 + snprintf(slab_name, sizeof(slab_name), "dm_bufio_buffer-%u", 2538 + atomic_inc_return(&seqno)); 2539 2539 c->slab_buffer = kmem_cache_create(slab_name, sizeof(struct dm_buffer) + aux_size, 2540 2540 0, SLAB_RECLAIM_ACCOUNT, NULL); 2541 2541 if (!c->slab_buffer) {
+6 -19
drivers/md/dm-cache-background-tracker.c
··· 11 11 12 12 #define DM_MSG_PREFIX "dm-background-tracker" 13 13 14 - struct bt_work { 15 - struct list_head list; 16 - struct rb_node node; 17 - struct policy_work work; 18 - }; 19 - 20 14 struct background_tracker { 21 15 unsigned int max_work; 22 16 atomic_t pending_promotes; ··· 20 26 struct list_head issued; 21 27 struct list_head queued; 22 28 struct rb_root pending; 23 - 24 - struct kmem_cache *work_cache; 25 29 }; 30 + 31 + struct kmem_cache *btracker_work_cache = NULL; 26 32 27 33 struct background_tracker *btracker_create(unsigned int max_work) 28 34 { ··· 42 48 INIT_LIST_HEAD(&b->queued); 43 49 44 50 b->pending = RB_ROOT; 45 - b->work_cache = KMEM_CACHE(bt_work, 0); 46 - if (!b->work_cache) { 47 - DMERR("couldn't create mempool for background work items"); 48 - kfree(b); 49 - b = NULL; 50 - } 51 51 52 52 return b; 53 53 } ··· 54 66 BUG_ON(!list_empty(&b->issued)); 55 67 list_for_each_entry_safe (w, tmp, &b->queued, list) { 56 68 list_del(&w->list); 57 - kmem_cache_free(b->work_cache, w); 69 + kmem_cache_free(btracker_work_cache, w); 58 70 } 59 71 60 - kmem_cache_destroy(b->work_cache); 61 72 kfree(b); 62 73 } 63 74 EXPORT_SYMBOL_GPL(btracker_destroy); ··· 167 180 if (max_work_reached(b)) 168 181 return NULL; 169 182 170 - return kmem_cache_alloc(b->work_cache, GFP_NOWAIT); 183 + return kmem_cache_alloc(btracker_work_cache, GFP_NOWAIT); 171 184 } 172 185 173 186 int btracker_queue(struct background_tracker *b, ··· 190 203 * There was a race, we'll just ignore this second 191 204 * bit of work for the same oblock. 192 205 */ 193 - kmem_cache_free(b->work_cache, w); 206 + kmem_cache_free(btracker_work_cache, w); 194 207 return -EINVAL; 195 208 } 196 209 ··· 231 244 update_stats(b, &w->work, -1); 232 245 rb_erase(&w->node, &b->pending); 233 246 list_del(&w->list); 234 - kmem_cache_free(b->work_cache, w); 247 + kmem_cache_free(btracker_work_cache, w); 235 248 } 236 249 EXPORT_SYMBOL_GPL(btracker_complete); 237 250
+8
drivers/md/dm-cache-background-tracker.h
··· 26 26 * protected with a spinlock. 27 27 */ 28 28 29 + struct bt_work { 30 + struct list_head list; 31 + struct rb_node node; 32 + struct policy_work work; 33 + }; 34 + 35 + extern struct kmem_cache *btracker_work_cache; 36 + 29 37 struct background_work; 30 38 struct background_tracker; 31 39
+20 -5
drivers/md/dm-cache-target.c
··· 10 10 #include "dm-bio-record.h" 11 11 #include "dm-cache-metadata.h" 12 12 #include "dm-io-tracker.h" 13 + #include "dm-cache-background-tracker.h" 13 14 14 15 #include <linux/dm-io.h> 15 16 #include <linux/dm-kcopyd.h> ··· 2264 2263 2265 2264 /*----------------------------------------------------------------*/ 2266 2265 2267 - static struct kmem_cache *migration_cache; 2266 + static struct kmem_cache *migration_cache = NULL; 2268 2267 2269 2268 #define NOT_CORE_OPTION 1 2270 2269 ··· 3446 3445 int r; 3447 3446 3448 3447 migration_cache = KMEM_CACHE(dm_cache_migration, 0); 3449 - if (!migration_cache) 3450 - return -ENOMEM; 3448 + if (!migration_cache) { 3449 + r = -ENOMEM; 3450 + goto err; 3451 + } 3452 + 3453 + btracker_work_cache = kmem_cache_create("dm_cache_bt_work", 3454 + sizeof(struct bt_work), __alignof__(struct bt_work), 0, NULL); 3455 + if (!btracker_work_cache) { 3456 + r = -ENOMEM; 3457 + goto err; 3458 + } 3451 3459 3452 3460 r = dm_register_target(&cache_target); 3453 3461 if (r) { 3454 - kmem_cache_destroy(migration_cache); 3455 - return r; 3462 + goto err; 3456 3463 } 3457 3464 3458 3465 return 0; 3466 + 3467 + err: 3468 + kmem_cache_destroy(migration_cache); 3469 + kmem_cache_destroy(btracker_work_cache); 3470 + return r; 3459 3471 } 3460 3472 3461 3473 static void __exit dm_cache_exit(void) 3462 3474 { 3463 3475 dm_unregister_target(&cache_target); 3464 3476 kmem_cache_destroy(migration_cache); 3477 + kmem_cache_destroy(btracker_work_cache); 3465 3478 } 3466 3479 3467 3480 module_init(dm_cache_init);
+3 -3
drivers/media/cec/usb/extron-da-hd-4k-plus/extron-da-hd-4k-plus.c
··· 348 348 349 349 /* Return if not a CTA-861 extension block */ 350 350 if (size < 256 || edid[0] != 0x02 || edid[1] != 0x03) 351 - return -1; 351 + return -ENOENT; 352 352 353 353 /* search tag */ 354 354 d = edid[0x02] & 0x7f; 355 355 if (d <= 4) 356 - return -1; 356 + return -ENOENT; 357 357 358 358 i = 0x04; 359 359 end = 0x00 + d; ··· 371 371 return offset + i; 372 372 i += len + 1; 373 373 } while (i < end); 374 - return -1; 374 + return -ENOENT; 375 375 } 376 376 377 377 static void extron_edid_crc(u8 *edid)
+1 -1
drivers/media/cec/usb/pulse8/pulse8-cec.c
··· 685 685 err = pulse8_send_and_wait(pulse8, cmd, 1, cmd[0], 4); 686 686 if (err) 687 687 return err; 688 - date = (data[0] << 24) | (data[1] << 16) | (data[2] << 8) | data[3]; 688 + date = ((unsigned)data[0] << 24) | (data[1] << 16) | (data[2] << 8) | data[3]; 689 689 dev_info(pulse8->dev, "Firmware build date %ptT\n", &date); 690 690 691 691 dev_dbg(pulse8->dev, "Persistent config:\n");
+3
drivers/media/common/v4l2-tpg/v4l2-tpg-core.c
··· 1795 1795 unsigned p; 1796 1796 unsigned x; 1797 1797 1798 + if (WARN_ON_ONCE(!tpg->src_width || !tpg->scaled_width)) 1799 + return; 1800 + 1798 1801 switch (tpg->pattern) { 1799 1802 case TPG_PAT_GREEN: 1800 1803 contrast = TPG_COLOR_100_RED;
+15 -13
drivers/media/common/videobuf2/videobuf2-core.c
··· 1482 1482 } 1483 1483 vb->planes[plane].dbuf_mapped = 1; 1484 1484 } 1485 + } else { 1486 + for (plane = 0; plane < vb->num_planes; ++plane) 1487 + dma_buf_put(planes[plane].dbuf); 1488 + } 1485 1489 1486 - /* 1487 - * Now that everything is in order, copy relevant information 1488 - * provided by userspace. 1489 - */ 1490 - for (plane = 0; plane < vb->num_planes; ++plane) { 1491 - vb->planes[plane].bytesused = planes[plane].bytesused; 1492 - vb->planes[plane].length = planes[plane].length; 1493 - vb->planes[plane].m.fd = planes[plane].m.fd; 1494 - vb->planes[plane].data_offset = planes[plane].data_offset; 1495 - } 1490 + /* 1491 + * Now that everything is in order, copy relevant information 1492 + * provided by userspace. 1493 + */ 1494 + for (plane = 0; plane < vb->num_planes; ++plane) { 1495 + vb->planes[plane].bytesused = planes[plane].bytesused; 1496 + vb->planes[plane].length = planes[plane].length; 1497 + vb->planes[plane].m.fd = planes[plane].m.fd; 1498 + vb->planes[plane].data_offset = planes[plane].data_offset; 1499 + } 1496 1500 1501 + if (reacquired) { 1497 1502 /* 1498 1503 * Call driver-specific initialization on the newly acquired buffer, 1499 1504 * if provided. ··· 1508 1503 dprintk(q, 1, "buffer initialization failed\n"); 1509 1504 goto err_put_vb2_buf; 1510 1505 } 1511 - } else { 1512 - for (plane = 0; plane < vb->num_planes; ++plane) 1513 - dma_buf_put(planes[plane].dbuf); 1514 1506 } 1515 1507 1516 1508 ret = call_vb_qop(vb, buf_prepare, vb);
+2 -2
drivers/media/dvb-core/dvb_frontend.c
··· 443 443 444 444 default: 445 445 fepriv->auto_step++; 446 - fepriv->auto_sub_step = -1; /* it'll be incremented to 0 in a moment */ 447 - break; 446 + fepriv->auto_sub_step = 0; 447 + continue; 448 448 } 449 449 450 450 if (!ready) fepriv->auto_sub_step++;
+7 -1
drivers/media/dvb-core/dvb_vb2.c
··· 366 366 int dvb_vb2_expbuf(struct dvb_vb2_ctx *ctx, struct dmx_exportbuffer *exp) 367 367 { 368 368 struct vb2_queue *q = &ctx->vb_q; 369 + struct vb2_buffer *vb2 = vb2_get_buffer(q, exp->index); 369 370 int ret; 370 371 371 - ret = vb2_core_expbuf(&ctx->vb_q, &exp->fd, q->type, q->bufs[exp->index], 372 + if (!vb2) { 373 + dprintk(1, "[%s] invalid buffer index\n", ctx->name); 374 + return -EINVAL; 375 + } 376 + 377 + ret = vb2_core_expbuf(&ctx->vb_q, &exp->fd, q->type, vb2, 372 378 0, exp->flags); 373 379 if (ret) { 374 380 dprintk(1, "[%s] index=%d errno=%d\n", ctx->name,
+11 -5
drivers/media/dvb-core/dvbdev.c
··· 86 86 static int dvb_device_open(struct inode *inode, struct file *file) 87 87 { 88 88 struct dvb_device *dvbdev; 89 + unsigned int minor = iminor(inode); 90 + 91 + if (minor >= MAX_DVB_MINORS) 92 + return -ENODEV; 89 93 90 94 mutex_lock(&dvbdev_mutex); 91 95 down_read(&minor_rwsem); 92 - dvbdev = dvb_minors[iminor(inode)]; 96 + 97 + dvbdev = dvb_minors[minor]; 93 98 94 99 if (dvbdev && dvbdev->fops) { 95 100 int err = 0; ··· 530 525 for (minor = 0; minor < MAX_DVB_MINORS; minor++) 531 526 if (!dvb_minors[minor]) 532 527 break; 533 - if (minor == MAX_DVB_MINORS) { 528 + #else 529 + minor = nums2minor(adap->num, type, id); 530 + #endif 531 + if (minor >= MAX_DVB_MINORS) { 534 532 if (new_node) { 535 533 list_del(&new_node->list_head); 536 534 kfree(dvbdevfops); ··· 546 538 mutex_unlock(&dvbdev_register_lock); 547 539 return -EINVAL; 548 540 } 549 - #else 550 - minor = nums2minor(adap->num, type, id); 551 - #endif 541 + 552 542 dvbdev->minor = minor; 553 543 dvb_minors[minor] = dvb_device_get(dvbdev); 554 544 up_write(&minor_rwsem);
+6 -1
drivers/media/dvb-frontends/cx24116.c
··· 741 741 { 742 742 struct cx24116_state *state = fe->demodulator_priv; 743 743 u8 snr_reading; 744 + int ret; 744 745 static const u32 snr_tab[] = { /* 10 x Table (rounded up) */ 745 746 0x00000, 0x0199A, 0x03333, 0x04ccD, 0x06667, 746 747 0x08000, 0x0999A, 0x0b333, 0x0cccD, 0x0e667, ··· 750 749 751 750 dprintk("%s()\n", __func__); 752 751 753 - snr_reading = cx24116_readreg(state, CX24116_REG_QUALITY0); 752 + ret = cx24116_readreg(state, CX24116_REG_QUALITY0); 753 + if (ret < 0) 754 + return ret; 755 + 756 + snr_reading = ret; 754 757 755 758 if (snr_reading >= 0xa0 /* 100% */) 756 759 *snr = 0xffff;
+1 -1
drivers/media/dvb-frontends/stb0899_algo.c
··· 269 269 270 270 short int derot_freq = 0, last_derot_freq = 0, derot_limit, next_loop = 3; 271 271 int index = 0; 272 - u8 cfr[2]; 272 + u8 cfr[2] = {0}; 273 273 u8 reg; 274 274 275 275 internal->status = NOCARRIER;
+17 -9
drivers/media/i2c/adv7604.c
··· 2519 2519 const struct adv76xx_chip_info *info = state->info; 2520 2520 struct v4l2_dv_timings timings; 2521 2521 struct stdi_readback stdi; 2522 - u8 reg_io_0x02 = io_read(sd, 0x02); 2522 + int ret; 2523 + u8 reg_io_0x02; 2523 2524 u8 edid_enabled; 2524 2525 u8 cable_det; 2525 - 2526 2526 static const char * const csc_coeff_sel_rb[16] = { 2527 2527 "bypassed", "YPbPr601 -> RGB", "reserved", "YPbPr709 -> RGB", 2528 2528 "reserved", "RGB -> YPbPr601", "reserved", "RGB -> YPbPr709", ··· 2621 2621 v4l2_info(sd, "-----Color space-----\n"); 2622 2622 v4l2_info(sd, "RGB quantization range ctrl: %s\n", 2623 2623 rgb_quantization_range_txt[state->rgb_quantization_range]); 2624 - v4l2_info(sd, "Input color space: %s\n", 2625 - input_color_space_txt[reg_io_0x02 >> 4]); 2626 - v4l2_info(sd, "Output color space: %s %s, alt-gamma %s\n", 2627 - (reg_io_0x02 & 0x02) ? "RGB" : "YCbCr", 2628 - (((reg_io_0x02 >> 2) & 0x01) ^ (reg_io_0x02 & 0x01)) ? 2629 - "(16-235)" : "(0-255)", 2630 - (reg_io_0x02 & 0x08) ? "enabled" : "disabled"); 2624 + 2625 + ret = io_read(sd, 0x02); 2626 + if (ret < 0) { 2627 + v4l2_info(sd, "Can't read Input/Output color space\n"); 2628 + } else { 2629 + reg_io_0x02 = ret; 2630 + 2631 + v4l2_info(sd, "Input color space: %s\n", 2632 + input_color_space_txt[reg_io_0x02 >> 4]); 2633 + v4l2_info(sd, "Output color space: %s %s, alt-gamma %s\n", 2634 + (reg_io_0x02 & 0x02) ? "RGB" : "YCbCr", 2635 + (((reg_io_0x02 >> 2) & 0x01) ^ (reg_io_0x02 & 0x01)) ? 2636 + "(16-235)" : "(0-255)", 2637 + (reg_io_0x02 & 0x08) ? "enabled" : "disabled"); 2638 + } 2631 2639 v4l2_info(sd, "Color space conversion: %s\n", 2632 2640 csc_coeff_sel_rb[cp_read(sd, info->cp_csc) >> 4]); 2633 2641
+2 -2
drivers/media/i2c/ar0521.c
··· 255 255 continue; /* Minimum value */ 256 256 if (new_mult > 254) 257 257 break; /* Maximum, larger pre won't work either */ 258 - if (sensor->extclk_freq * (u64)new_mult < AR0521_PLL_MIN * 258 + if (sensor->extclk_freq * (u64)new_mult < (u64)AR0521_PLL_MIN * 259 259 new_pre) 260 260 continue; 261 - if (sensor->extclk_freq * (u64)new_mult > AR0521_PLL_MAX * 261 + if (sensor->extclk_freq * (u64)new_mult > (u64)AR0521_PLL_MAX * 262 262 new_pre) 263 263 break; /* Larger pre won't work either */ 264 264 new_pll = div64_round_up(sensor->extclk_freq * (u64)new_mult,
+2
drivers/media/pci/mgb4/mgb4_cmt.c
··· 227 227 u32 config; 228 228 size_t i; 229 229 230 + freq_range = array_index_nospec(freq_range, ARRAY_SIZE(cmt_vals_in)); 231 + 230 232 addr = cmt_addrs_in[vindev->config->id]; 231 233 reg_set = cmt_vals_in[freq_range]; 232 234
+11 -6
drivers/media/platform/samsung/s5p-jpeg/jpeg-core.c
··· 775 775 (unsigned long)vb2_plane_vaddr(&vb->vb2_buf, 0) + ctx->out_q.sos + 2; 776 776 jpeg_buffer.curr = 0; 777 777 778 - word = 0; 779 - 780 778 if (get_word_be(&jpeg_buffer, &word)) 781 779 return; 782 - jpeg_buffer.size = (long)word - 2; 780 + 781 + if (word < 2) 782 + jpeg_buffer.size = 0; 783 + else 784 + jpeg_buffer.size = (long)word - 2; 785 + 783 786 jpeg_buffer.data += 2; 784 787 jpeg_buffer.curr = 0; 785 788 ··· 1061 1058 if (byte == -1) 1062 1059 return -1; 1063 1060 *word = (unsigned int)byte | temp; 1061 + 1064 1062 return 0; 1065 1063 } 1066 1064 ··· 1149 1145 if (get_word_be(&jpeg_buffer, &word)) 1150 1146 break; 1151 1147 length = (long)word - 2; 1152 - if (!length) 1148 + if (length <= 0) 1153 1149 return false; 1154 1150 sof = jpeg_buffer.curr; /* after 0xffc0 */ 1155 1151 sof_len = length; ··· 1180 1176 if (get_word_be(&jpeg_buffer, &word)) 1181 1177 break; 1182 1178 length = (long)word - 2; 1183 - if (!length) 1179 + if (length <= 0) 1184 1180 return false; 1185 1181 if (n_dqt >= S5P_JPEG_MAX_MARKER) 1186 1182 return false; ··· 1193 1189 if (get_word_be(&jpeg_buffer, &word)) 1194 1190 break; 1195 1191 length = (long)word - 2; 1196 - if (!length) 1192 + if (length <= 0) 1197 1193 return false; 1198 1194 if (n_dht >= S5P_JPEG_MAX_MARKER) 1199 1195 return false; ··· 1218 1214 if (get_word_be(&jpeg_buffer, &word)) 1219 1215 break; 1220 1216 length = (long)word - 2; 1217 + /* No need to check underflows as skip() does it */ 1221 1218 skip(&jpeg_buffer, length); 1222 1219 break; 1223 1220 }
+1 -1
drivers/media/test-drivers/vivid/vivid-core.c
··· 910 910 * videobuf2-core.c to MAX_BUFFER_INDEX. 911 911 */ 912 912 if (buf_type == V4L2_BUF_TYPE_VIDEO_CAPTURE) 913 - q->max_num_buffers = 64; 913 + q->max_num_buffers = MAX_VID_CAP_BUFFERS; 914 914 if (buf_type == V4L2_BUF_TYPE_SDR_CAPTURE) 915 915 q->max_num_buffers = 1024; 916 916 if (buf_type == V4L2_BUF_TYPE_VBI_CAPTURE)
+3 -1
drivers/media/test-drivers/vivid/vivid-core.h
··· 26 26 #define MAX_INPUTS 16 27 27 /* The maximum number of outputs */ 28 28 #define MAX_OUTPUTS 16 29 + /* The maximum number of video capture buffers */ 30 + #define MAX_VID_CAP_BUFFERS 64 29 31 /* The maximum up or down scaling factor is 4 */ 30 32 #define MAX_ZOOM 4 31 33 /* The maximum image width/height are set to 4K DMT */ ··· 483 481 /* video capture */ 484 482 struct tpg_data tpg; 485 483 unsigned ms_vid_cap; 486 - bool must_blank[VIDEO_MAX_FRAME]; 484 + bool must_blank[MAX_VID_CAP_BUFFERS]; 487 485 488 486 const struct vivid_fmt *fmt_cap; 489 487 struct v4l2_fract timeperframe_vid_cap;
+1 -1
drivers/media/test-drivers/vivid/vivid-ctrls.c
··· 553 553 break; 554 554 case VIVID_CID_PERCENTAGE_FILL: 555 555 tpg_s_perc_fill(&dev->tpg, ctrl->val); 556 - for (i = 0; i < VIDEO_MAX_FRAME; i++) 556 + for (i = 0; i < MAX_VID_CAP_BUFFERS; i++) 557 557 dev->must_blank[i] = ctrl->val < 100; 558 558 break; 559 559 case VIVID_CID_INSERT_SAV:
+1 -1
drivers/media/test-drivers/vivid/vivid-vid-cap.c
··· 213 213 214 214 dev->vid_cap_seq_count = 0; 215 215 dprintk(dev, 1, "%s\n", __func__); 216 - for (i = 0; i < VIDEO_MAX_FRAME; i++) 216 + for (i = 0; i < MAX_VID_CAP_BUFFERS; i++) 217 217 dev->must_blank[i] = tpg_g_perc_fill(&dev->tpg) < 100; 218 218 if (dev->start_streaming_error) { 219 219 dev->start_streaming_error = false;
+11 -6
drivers/media/v4l2-core/v4l2-ctrls-api.c
··· 753 753 for (i = 0; i < master->ncontrols; i++) 754 754 cur_to_new(master->cluster[i]); 755 755 ret = call_op(master, g_volatile_ctrl); 756 - new_to_user(c, ctrl); 756 + if (!ret) 757 + ret = new_to_user(c, ctrl); 757 758 } else { 758 - cur_to_user(c, ctrl); 759 + ret = cur_to_user(c, ctrl); 759 760 } 760 761 v4l2_ctrl_unlock(master); 761 762 return ret; ··· 771 770 if (!ctrl || !ctrl->is_int) 772 771 return -EINVAL; 773 772 ret = get_ctrl(ctrl, &c); 774 - control->value = c.value; 773 + 774 + if (!ret) 775 + control->value = c.value; 776 + 775 777 return ret; 776 778 } 777 779 EXPORT_SYMBOL(v4l2_g_ctrl); ··· 815 811 int ret; 816 812 817 813 v4l2_ctrl_lock(ctrl); 818 - user_to_new(c, ctrl); 819 - ret = set_ctrl(fh, ctrl, 0); 814 + ret = user_to_new(c, ctrl); 820 815 if (!ret) 821 - cur_to_user(c, ctrl); 816 + ret = set_ctrl(fh, ctrl, 0); 817 + if (!ret) 818 + ret = cur_to_user(c, ctrl); 822 819 v4l2_ctrl_unlock(ctrl); 823 820 return ret; 824 821 }
+15 -1
drivers/net/bonding/bond_main.c
··· 1008 1008 1009 1009 if (bond->dev->flags & IFF_UP) 1010 1010 bond_hw_addr_flush(bond->dev, old_active->dev); 1011 + 1012 + bond_slave_ns_maddrs_add(bond, old_active); 1011 1013 } 1012 1014 1013 1015 if (new_active) { ··· 1026 1024 dev_mc_sync(new_active->dev, bond->dev); 1027 1025 netif_addr_unlock_bh(bond->dev); 1028 1026 } 1027 + 1028 + bond_slave_ns_maddrs_del(bond, new_active); 1029 1029 } 1030 1030 } 1031 1031 ··· 2354 2350 bond_compute_features(bond); 2355 2351 bond_set_carrier(bond); 2356 2352 2353 + /* Needs to be called before bond_select_active_slave(), which will 2354 + * remove the maddrs if the slave is selected as active slave. 2355 + */ 2356 + bond_slave_ns_maddrs_add(bond, new_slave); 2357 + 2357 2358 if (bond_uses_primary(bond)) { 2358 2359 block_netpoll_tx(); 2359 2360 bond_select_active_slave(bond); ··· 2367 2358 2368 2359 if (bond_mode_can_use_xmit_hash(bond)) 2369 2360 bond_update_slave_arr(bond, NULL); 2370 - 2371 2361 2372 2362 if (!slave_dev->netdev_ops->ndo_bpf || 2373 2363 !slave_dev->netdev_ops->ndo_xdp_xmit) { ··· 2564 2556 2565 2557 if (oldcurrent == slave) 2566 2558 bond_change_active_slave(bond, NULL); 2559 + 2560 + /* Must be called after bond_change_active_slave () as the slave 2561 + * might change from an active slave to a backup slave. Then it is 2562 + * necessary to clear the maddrs on the backup slave. 2563 + */ 2564 + bond_slave_ns_maddrs_del(bond, slave); 2567 2565 2568 2566 if (bond_is_lb(bond)) { 2569 2567 /* Must be called only after the slave has been
+81 -1
drivers/net/bonding/bond_options.c
··· 15 15 #include <linux/sched/signal.h> 16 16 17 17 #include <net/bonding.h> 18 + #include <net/ndisc.h> 18 19 19 20 static int bond_option_active_slave_set(struct bonding *bond, 20 21 const struct bond_opt_value *newval); ··· 1235 1234 } 1236 1235 1237 1236 #if IS_ENABLED(CONFIG_IPV6) 1237 + static bool slave_can_set_ns_maddr(const struct bonding *bond, struct slave *slave) 1238 + { 1239 + return BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP && 1240 + !bond_is_active_slave(slave) && 1241 + slave->dev->flags & IFF_MULTICAST; 1242 + } 1243 + 1244 + static void slave_set_ns_maddrs(struct bonding *bond, struct slave *slave, bool add) 1245 + { 1246 + struct in6_addr *targets = bond->params.ns_targets; 1247 + char slot_maddr[MAX_ADDR_LEN]; 1248 + int i; 1249 + 1250 + if (!slave_can_set_ns_maddr(bond, slave)) 1251 + return; 1252 + 1253 + for (i = 0; i < BOND_MAX_NS_TARGETS; i++) { 1254 + if (ipv6_addr_any(&targets[i])) 1255 + break; 1256 + 1257 + if (!ndisc_mc_map(&targets[i], slot_maddr, slave->dev, 0)) { 1258 + if (add) 1259 + dev_mc_add(slave->dev, slot_maddr); 1260 + else 1261 + dev_mc_del(slave->dev, slot_maddr); 1262 + } 1263 + } 1264 + } 1265 + 1266 + void bond_slave_ns_maddrs_add(struct bonding *bond, struct slave *slave) 1267 + { 1268 + if (!bond->params.arp_validate) 1269 + return; 1270 + slave_set_ns_maddrs(bond, slave, true); 1271 + } 1272 + 1273 + void bond_slave_ns_maddrs_del(struct bonding *bond, struct slave *slave) 1274 + { 1275 + if (!bond->params.arp_validate) 1276 + return; 1277 + slave_set_ns_maddrs(bond, slave, false); 1278 + } 1279 + 1280 + static void slave_set_ns_maddr(struct bonding *bond, struct slave *slave, 1281 + struct in6_addr *target, struct in6_addr *slot) 1282 + { 1283 + char target_maddr[MAX_ADDR_LEN], slot_maddr[MAX_ADDR_LEN]; 1284 + 1285 + if (!bond->params.arp_validate || !slave_can_set_ns_maddr(bond, slave)) 1286 + return; 1287 + 1288 + /* remove the previous maddr from slave */ 1289 + if (!ipv6_addr_any(slot) && 1290 + !ndisc_mc_map(slot, slot_maddr, slave->dev, 0)) 1291 + dev_mc_del(slave->dev, slot_maddr); 1292 + 1293 + /* add new maddr on slave if target is set */ 1294 + if (!ipv6_addr_any(target) && 1295 + !ndisc_mc_map(target, target_maddr, slave->dev, 0)) 1296 + dev_mc_add(slave->dev, target_maddr); 1297 + } 1298 + 1238 1299 static void _bond_options_ns_ip6_target_set(struct bonding *bond, int slot, 1239 1300 struct in6_addr *target, 1240 1301 unsigned long last_rx) ··· 1306 1243 struct slave *slave; 1307 1244 1308 1245 if (slot >= 0 && slot < BOND_MAX_NS_TARGETS) { 1309 - bond_for_each_slave(bond, slave, iter) 1246 + bond_for_each_slave(bond, slave, iter) { 1310 1247 slave->target_last_arp_rx[slot] = last_rx; 1248 + slave_set_ns_maddr(bond, slave, target, &targets[slot]); 1249 + } 1311 1250 targets[slot] = *target; 1312 1251 } 1313 1252 } ··· 1361 1296 { 1362 1297 return -EPERM; 1363 1298 } 1299 + 1300 + static void slave_set_ns_maddrs(struct bonding *bond, struct slave *slave, bool add) {} 1301 + 1302 + void bond_slave_ns_maddrs_add(struct bonding *bond, struct slave *slave) {} 1303 + 1304 + void bond_slave_ns_maddrs_del(struct bonding *bond, struct slave *slave) {} 1364 1305 #endif 1365 1306 1366 1307 static int bond_option_arp_validate_set(struct bonding *bond, 1367 1308 const struct bond_opt_value *newval) 1368 1309 { 1310 + bool changed = !!bond->params.arp_validate != !!newval->value; 1311 + struct list_head *iter; 1312 + struct slave *slave; 1313 + 1369 1314 netdev_dbg(bond->dev, "Setting arp_validate to %s (%llu)\n", 1370 1315 newval->string, newval->value); 1371 1316 bond->params.arp_validate = newval->value; 1317 + 1318 + if (changed) { 1319 + bond_for_each_slave(bond, slave, iter) 1320 + slave_set_ns_maddrs(bond, slave, !!bond->params.arp_validate); 1321 + } 1372 1322 1373 1323 return 0; 1374 1324 }
+1 -1
drivers/net/ethernet/intel/igb/igb_main.c
··· 907 907 int i, err = 0, vector = 0, free_vector = 0; 908 908 909 909 err = request_irq(adapter->msix_entries[vector].vector, 910 - igb_msix_other, IRQF_NO_THREAD, netdev->name, adapter); 910 + igb_msix_other, 0, netdev->name, adapter); 911 911 if (err) 912 912 goto err_out; 913 913
+1 -1
drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
··· 866 866 return 0; 867 867 868 868 err_rule: 869 - mlx5_tc_ct_entry_destroy_mod_hdr(ct_priv, zone_rule->attr, zone_rule->mh); 869 + mlx5_tc_ct_entry_destroy_mod_hdr(ct_priv, attr, zone_rule->mh); 870 870 mlx5_put_label_mapping(ct_priv, attr->ct_attr.ct_labels_id); 871 871 err_mod_hdr: 872 872 kfree(attr);
+4 -4
drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
··· 660 660 while (remaining > 0) { 661 661 skb_frag_t *frag = &record->frags[i]; 662 662 663 - get_page(skb_frag_page(frag)); 663 + page_ref_inc(skb_frag_page(frag)); 664 664 remaining -= skb_frag_size(frag); 665 665 info->frags[i++] = *frag; 666 666 } ··· 763 763 stats = sq->stats; 764 764 765 765 mlx5e_tx_dma_unmap(sq->pdev, dma); 766 - put_page(wi->resync_dump_frag_page); 766 + page_ref_dec(wi->resync_dump_frag_page); 767 767 stats->tls_dump_packets++; 768 768 stats->tls_dump_bytes += wi->num_bytes; 769 769 } ··· 816 816 817 817 err_out: 818 818 for (; i < info.nr_frags; i++) 819 - /* The put_page() here undoes the page ref obtained in tx_sync_info_get(). 819 + /* The page_ref_dec() here undoes the page ref obtained in tx_sync_info_get(). 820 820 * Page refs obtained for the DUMP WQEs above (by page_ref_add) will be 821 821 * released only upon their completions (or in mlx5e_free_txqsq_descs, 822 822 * if channel closes). 823 823 */ 824 - put_page(skb_frag_page(&info.frags[i])); 824 + page_ref_dec(skb_frag_page(&info.frags[i])); 825 825 826 826 return MLX5E_KTLS_SYNC_FAIL; 827 827 }
+2 -1
drivers/net/ethernet/mellanox/mlx5/core/en_main.c
··· 4295 4295 struct mlx5e_params *params = &priv->channels.params; 4296 4296 xdp_features_t val; 4297 4297 4298 - if (params->packet_merge.type != MLX5E_PACKET_MERGE_NONE) { 4298 + if (!netdev->netdev_ops->ndo_bpf || 4299 + params->packet_merge.type != MLX5E_PACKET_MERGE_NONE) { 4299 4300 xdp_clear_features_flag(netdev); 4300 4301 return; 4301 4302 }
+4
drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
··· 36 36 #include "en.h" 37 37 #include "en/port.h" 38 38 #include "eswitch.h" 39 + #include "lib/mlx5.h" 39 40 40 41 static int mlx5e_test_health_info(struct mlx5e_priv *priv) 41 42 { ··· 246 245 static int mlx5e_cond_loopback(struct mlx5e_priv *priv) 247 246 { 248 247 if (is_mdev_switchdev_mode(priv->mdev)) 248 + return -EOPNOTSUPP; 249 + 250 + if (mlx5_get_sd(priv->mdev)) 249 251 return -EOPNOTSUPP; 250 252 251 253 return 0;
+4 -1
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
··· 2544 2544 struct mlx5_eswitch_rep *rep, u8 rep_type) 2545 2545 { 2546 2546 if (atomic_cmpxchg(&rep->rep_data[rep_type].state, 2547 - REP_LOADED, REP_REGISTERED) == REP_LOADED) 2547 + REP_LOADED, REP_REGISTERED) == REP_LOADED) { 2548 + if (rep_type == REP_ETH) 2549 + __esw_offloads_unload_rep(esw, rep, REP_IB); 2548 2550 esw->offloads.rep_ops[rep_type]->unload(rep); 2551 + } 2549 2552 } 2550 2553 2551 2554 static void __unload_reps_all_vport(struct mlx5_eswitch *esw, u8 rep_type)
+14 -5
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
··· 2105 2105 fte_tmp = NULL; 2106 2106 goto out; 2107 2107 } 2108 - if (!fte_tmp->node.active) { 2109 - tree_put_node(&fte_tmp->node, false); 2110 - fte_tmp = NULL; 2111 - goto out; 2112 - } 2113 2108 2114 2109 nested_down_write_ref_node(&fte_tmp->node, FS_LOCK_CHILD); 2110 + 2111 + if (!fte_tmp->node.active) { 2112 + up_write_ref_node(&fte_tmp->node, false); 2113 + 2114 + if (take_write) 2115 + up_write_ref_node(&g->node, false); 2116 + else 2117 + up_read_ref_node(&g->node); 2118 + 2119 + tree_put_node(&fte_tmp->node, false); 2120 + 2121 + return NULL; 2122 + } 2123 + 2115 2124 out: 2116 2125 if (take_write) 2117 2126 up_write_ref_node(&g->node, false);
+27 -5
drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
··· 593 593 kvfree(pool); 594 594 } 595 595 596 - static int irq_pools_init(struct mlx5_core_dev *dev, int sf_vec, int pcif_vec) 596 + static int irq_pools_init(struct mlx5_core_dev *dev, int sf_vec, int pcif_vec, 597 + bool dynamic_vec) 597 598 { 598 599 struct mlx5_irq_table *table = dev->priv.irq_table; 600 + int sf_vec_available = sf_vec; 599 601 int num_sf_ctrl; 600 602 int err; 601 603 ··· 618 616 num_sf_ctrl = DIV_ROUND_UP(mlx5_sf_max_functions(dev), 619 617 MLX5_SFS_PER_CTRL_IRQ); 620 618 num_sf_ctrl = min_t(int, MLX5_IRQ_CTRL_SF_MAX, num_sf_ctrl); 619 + if (!dynamic_vec && (num_sf_ctrl + 1) > sf_vec_available) { 620 + mlx5_core_dbg(dev, 621 + "Not enough IRQs for SFs control and completion pool, required=%d avail=%d\n", 622 + num_sf_ctrl + 1, sf_vec_available); 623 + return 0; 624 + } 625 + 621 626 table->sf_ctrl_pool = irq_pool_alloc(dev, pcif_vec, num_sf_ctrl, 622 627 "mlx5_sf_ctrl", 623 628 MLX5_EQ_SHARE_IRQ_MIN_CTRL, ··· 633 624 err = PTR_ERR(table->sf_ctrl_pool); 634 625 goto err_pf; 635 626 } 636 - /* init sf_comp_pool */ 627 + sf_vec_available -= num_sf_ctrl; 628 + 629 + /* init sf_comp_pool, remaining vectors are for the SF completions */ 637 630 table->sf_comp_pool = irq_pool_alloc(dev, pcif_vec + num_sf_ctrl, 638 - sf_vec - num_sf_ctrl, "mlx5_sf_comp", 631 + sf_vec_available, "mlx5_sf_comp", 639 632 MLX5_EQ_SHARE_IRQ_MIN_COMP, 640 633 MLX5_EQ_SHARE_IRQ_MAX_COMP); 641 634 if (IS_ERR(table->sf_comp_pool)) { ··· 726 715 int mlx5_irq_table_create(struct mlx5_core_dev *dev) 727 716 { 728 717 int num_eqs = mlx5_max_eq_cap_get(dev); 718 + bool dynamic_vec; 729 719 int total_vec; 730 720 int pcif_vec; 731 721 int req_vec; ··· 736 724 if (mlx5_core_is_sf(dev)) 737 725 return 0; 738 726 727 + /* PCI PF vectors usage is limited by online cpus, device EQs and 728 + * PCI MSI-X capability. 729 + */ 739 730 pcif_vec = MLX5_CAP_GEN(dev, num_ports) * num_online_cpus() + 1; 740 731 pcif_vec = min_t(int, pcif_vec, num_eqs); 732 + pcif_vec = min_t(int, pcif_vec, pci_msix_vec_count(dev->pdev)); 741 733 742 734 total_vec = pcif_vec; 743 735 if (mlx5_sf_max_functions(dev)) 744 736 total_vec += MLX5_MAX_MSIX_PER_SF * mlx5_sf_max_functions(dev); 745 737 total_vec = min_t(int, total_vec, pci_msix_vec_count(dev->pdev)); 746 - pcif_vec = min_t(int, pcif_vec, pci_msix_vec_count(dev->pdev)); 747 738 748 739 req_vec = pci_msix_can_alloc_dyn(dev->pdev) ? 1 : total_vec; 749 740 n = pci_alloc_irq_vectors(dev->pdev, 1, req_vec, PCI_IRQ_MSIX); 750 741 if (n < 0) 751 742 return n; 752 743 753 - err = irq_pools_init(dev, total_vec - pcif_vec, pcif_vec); 744 + /* Further limit vectors of the pools based on platform for non dynamic case */ 745 + dynamic_vec = pci_msix_can_alloc_dyn(dev->pdev); 746 + if (!dynamic_vec) { 747 + pcif_vec = min_t(int, n, pcif_vec); 748 + total_vec = min_t(int, n, total_vec); 749 + } 750 + 751 + err = irq_pools_init(dev, total_vec - pcif_vec, pcif_vec, dynamic_vec); 754 752 if (err) 755 753 pci_free_irq_vectors(dev->pdev); 756 754
+17 -8
drivers/net/ethernet/stmicro/stmmac/dwmac-intel-plat.c
··· 108 108 if (IS_ERR(dwmac->tx_clk)) 109 109 return PTR_ERR(dwmac->tx_clk); 110 110 111 - clk_prepare_enable(dwmac->tx_clk); 111 + ret = clk_prepare_enable(dwmac->tx_clk); 112 + if (ret) { 113 + dev_err(&pdev->dev, 114 + "Failed to enable tx_clk\n"); 115 + return ret; 116 + } 112 117 113 118 /* Check and configure TX clock rate */ 114 119 rate = clk_get_rate(dwmac->tx_clk); ··· 124 119 if (ret) { 125 120 dev_err(&pdev->dev, 126 121 "Failed to set tx_clk\n"); 127 - return ret; 122 + goto err_tx_clk_disable; 128 123 } 129 124 } 130 125 } ··· 138 133 if (ret) { 139 134 dev_err(&pdev->dev, 140 135 "Failed to set clk_ptp_ref\n"); 141 - return ret; 136 + goto err_tx_clk_disable; 142 137 } 143 138 } 144 139 } ··· 154 149 } 155 150 156 151 ret = stmmac_dvr_probe(&pdev->dev, plat_dat, &stmmac_res); 157 - if (ret) { 158 - clk_disable_unprepare(dwmac->tx_clk); 159 - return ret; 160 - } 152 + if (ret) 153 + goto err_tx_clk_disable; 161 154 162 155 return 0; 156 + 157 + err_tx_clk_disable: 158 + if (dwmac->data->tx_clk_en) 159 + clk_disable_unprepare(dwmac->tx_clk); 160 + return ret; 163 161 } 164 162 165 163 static void intel_eth_plat_remove(struct platform_device *pdev) ··· 170 162 struct intel_dwmac *dwmac = get_stmmac_bsp_priv(&pdev->dev); 171 163 172 164 stmmac_pltfr_remove(pdev); 173 - clk_disable_unprepare(dwmac->tx_clk); 165 + if (dwmac->data->tx_clk_en) 166 + clk_disable_unprepare(dwmac->tx_clk); 174 167 } 175 168 176 169 static struct platform_driver intel_eth_plat_driver = {
+2 -2
drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c
··· 589 589 590 590 plat->mac_interface = priv_plat->phy_mode; 591 591 if (priv_plat->mac_wol) 592 - plat->flags |= STMMAC_FLAG_USE_PHY_WOL; 593 - else 594 592 plat->flags &= ~STMMAC_FLAG_USE_PHY_WOL; 593 + else 594 + plat->flags |= STMMAC_FLAG_USE_PHY_WOL; 595 595 plat->riwt_off = 1; 596 596 plat->maxmtu = ETH_DATA_LEN; 597 597 plat->host_dma_width = priv_plat->variant->dma_bit_mask;
+11 -2
drivers/net/ethernet/ti/icssg/icssg_prueth.c
··· 16 16 #include <linux/if_hsr.h> 17 17 #include <linux/if_vlan.h> 18 18 #include <linux/interrupt.h> 19 + #include <linux/io-64-nonatomic-hi-lo.h> 19 20 #include <linux/kernel.h> 20 21 #include <linux/mfd/syscon.h> 21 22 #include <linux/module.h> ··· 412 411 struct prueth_emac *emac = clockops_data; 413 412 u32 reduction_factor = 0, offset = 0; 414 413 struct timespec64 ts; 414 + u64 current_cycle; 415 + u64 start_offset; 415 416 u64 ns_period; 416 417 417 418 if (!on) ··· 452 449 writel(reduction_factor, emac->prueth->shram.va + 453 450 TIMESYNC_FW_WC_SYNCOUT_REDUCTION_FACTOR_OFFSET); 454 451 455 - writel(0, emac->prueth->shram.va + 456 - TIMESYNC_FW_WC_SYNCOUT_START_TIME_CYCLECOUNT_OFFSET); 452 + current_cycle = icssg_read_time(emac->prueth->shram.va + 453 + TIMESYNC_FW_WC_CYCLECOUNT_OFFSET); 454 + 455 + /* Rounding of current_cycle count to next second */ 456 + start_offset = roundup(current_cycle, MSEC_PER_SEC); 457 + 458 + hi_lo_writeq(start_offset, emac->prueth->shram.va + 459 + TIMESYNC_FW_WC_SYNCOUT_START_TIME_CYCLECOUNT_OFFSET); 457 460 458 461 return 0; 459 462 }
+12
drivers/net/ethernet/ti/icssg/icssg_prueth.h
··· 330 330 extern const struct ethtool_ops icssg_ethtool_ops; 331 331 extern const struct dev_pm_ops prueth_dev_pm_ops; 332 332 333 + static inline u64 icssg_read_time(const void __iomem *addr) 334 + { 335 + u32 low, high; 336 + 337 + do { 338 + high = readl(addr + 4); 339 + low = readl(addr); 340 + } while (high != readl(addr + 4)); 341 + 342 + return low + ((u64)high << 32); 343 + } 344 + 333 345 /* Classifier helpers */ 334 346 void icssg_class_set_mac_addr(struct regmap *miig_rt, int slice, u8 *mac); 335 347 void icssg_class_set_host_mac_addr(struct regmap *miig_rt, const u8 *mac);
+3 -1
drivers/net/ethernet/vertexcom/mse102x.c
··· 437 437 mse = &mses->mse102x; 438 438 439 439 while ((txb = skb_dequeue(&mse->txq))) { 440 + unsigned int len = max_t(unsigned int, txb->len, ETH_ZLEN); 441 + 440 442 mutex_lock(&mses->lock); 441 443 ret = mse102x_tx_pkt_spi(mse, txb, work_timeout); 442 444 mutex_unlock(&mses->lock); 443 445 if (ret) { 444 446 mse->ndev->stats.tx_dropped++; 445 447 } else { 446 - mse->ndev->stats.tx_bytes += txb->len; 448 + mse->ndev->stats.tx_bytes += len; 447 449 mse->ndev->stats.tx_packets++; 448 450 } 449 451
+8 -6
drivers/net/phy/phylink.c
··· 78 78 unsigned int pcs_neg_mode; 79 79 unsigned int pcs_state; 80 80 81 - bool mac_link_dropped; 81 + bool link_failed; 82 82 83 83 struct sfp_bus *sfp_bus; 84 84 bool sfp_may_have_phy; ··· 1458 1458 cur_link_state = pl->old_link_state; 1459 1459 1460 1460 if (pl->phylink_disable_state) { 1461 - pl->mac_link_dropped = false; 1461 + pl->link_failed = false; 1462 1462 link_state.link = false; 1463 - } else if (pl->mac_link_dropped) { 1463 + } else if (pl->link_failed) { 1464 1464 link_state.link = false; 1465 1465 retrigger = true; 1466 1466 } else if (pl->cur_link_an_mode == MLO_AN_FIXED) { ··· 1545 1545 phylink_link_up(pl, link_state); 1546 1546 } 1547 1547 if (!link_state.link && retrigger) { 1548 - pl->mac_link_dropped = false; 1548 + pl->link_failed = false; 1549 1549 queue_work(system_power_efficient_wq, &pl->resolve); 1550 1550 } 1551 1551 mutex_unlock(&pl->state_mutex); ··· 1801 1801 pl->phy_state.pause |= MLO_PAUSE_RX; 1802 1802 pl->phy_state.interface = phydev->interface; 1803 1803 pl->phy_state.link = up; 1804 + if (!up) 1805 + pl->link_failed = true; 1804 1806 mutex_unlock(&pl->state_mutex); 1805 1807 1806 1808 phylink_run_resolve(pl); ··· 2126 2124 static void phylink_link_changed(struct phylink *pl, bool up, const char *what) 2127 2125 { 2128 2126 if (!up) 2129 - pl->mac_link_dropped = true; 2127 + pl->link_failed = true; 2130 2128 phylink_run_resolve(pl); 2131 2129 phylink_dbg(pl, "%s link %s\n", what, up ? "up" : "down"); 2132 2130 } ··· 2781 2779 * link will cycle. 2782 2780 */ 2783 2781 if (manual_changed) { 2784 - pl->mac_link_dropped = true; 2782 + pl->link_failed = true; 2785 2783 phylink_run_resolve(pl); 2786 2784 } 2787 2785
+14 -7
drivers/nvme/host/core.c
··· 3795 3795 int srcu_idx; 3796 3796 3797 3797 srcu_idx = srcu_read_lock(&ctrl->srcu); 3798 - list_for_each_entry_rcu(ns, &ctrl->namespaces, list) { 3798 + list_for_each_entry_srcu(ns, &ctrl->namespaces, list, 3799 + srcu_read_lock_held(&ctrl->srcu)) { 3799 3800 if (ns->head->ns_id == nsid) { 3800 3801 if (!nvme_get_ns(ns)) 3801 3802 continue; ··· 4880 4879 int srcu_idx; 4881 4880 4882 4881 srcu_idx = srcu_read_lock(&ctrl->srcu); 4883 - list_for_each_entry_rcu(ns, &ctrl->namespaces, list) 4882 + list_for_each_entry_srcu(ns, &ctrl->namespaces, list, 4883 + srcu_read_lock_held(&ctrl->srcu)) 4884 4884 blk_mark_disk_dead(ns->disk); 4885 4885 srcu_read_unlock(&ctrl->srcu, srcu_idx); 4886 4886 } ··· 4893 4891 int srcu_idx; 4894 4892 4895 4893 srcu_idx = srcu_read_lock(&ctrl->srcu); 4896 - list_for_each_entry_rcu(ns, &ctrl->namespaces, list) 4894 + list_for_each_entry_srcu(ns, &ctrl->namespaces, list, 4895 + srcu_read_lock_held(&ctrl->srcu)) 4897 4896 blk_mq_unfreeze_queue(ns->queue); 4898 4897 srcu_read_unlock(&ctrl->srcu, srcu_idx); 4899 4898 clear_bit(NVME_CTRL_FROZEN, &ctrl->flags); ··· 4907 4904 int srcu_idx; 4908 4905 4909 4906 srcu_idx = srcu_read_lock(&ctrl->srcu); 4910 - list_for_each_entry_rcu(ns, &ctrl->namespaces, list) { 4907 + list_for_each_entry_srcu(ns, &ctrl->namespaces, list, 4908 + srcu_read_lock_held(&ctrl->srcu)) { 4911 4909 timeout = blk_mq_freeze_queue_wait_timeout(ns->queue, timeout); 4912 4910 if (timeout <= 0) 4913 4911 break; ··· 4924 4920 int srcu_idx; 4925 4921 4926 4922 srcu_idx = srcu_read_lock(&ctrl->srcu); 4927 - list_for_each_entry_rcu(ns, &ctrl->namespaces, list) 4923 + list_for_each_entry_srcu(ns, &ctrl->namespaces, list, 4924 + srcu_read_lock_held(&ctrl->srcu)) 4928 4925 blk_mq_freeze_queue_wait(ns->queue); 4929 4926 srcu_read_unlock(&ctrl->srcu, srcu_idx); 4930 4927 } ··· 4938 4933 4939 4934 set_bit(NVME_CTRL_FROZEN, &ctrl->flags); 4940 4935 srcu_idx = srcu_read_lock(&ctrl->srcu); 4941 - list_for_each_entry_rcu(ns, &ctrl->namespaces, list) 4936 + list_for_each_entry_srcu(ns, &ctrl->namespaces, list, 4937 + srcu_read_lock_held(&ctrl->srcu)) 4942 4938 blk_freeze_queue_start(ns->queue); 4943 4939 srcu_read_unlock(&ctrl->srcu, srcu_idx); 4944 4940 } ··· 4987 4981 int srcu_idx; 4988 4982 4989 4983 srcu_idx = srcu_read_lock(&ctrl->srcu); 4990 - list_for_each_entry_rcu(ns, &ctrl->namespaces, list) 4984 + list_for_each_entry_srcu(ns, &ctrl->namespaces, list, 4985 + srcu_read_lock_held(&ctrl->srcu)) 4991 4986 blk_sync_queue(ns->queue); 4992 4987 srcu_read_unlock(&ctrl->srcu, srcu_idx); 4993 4988 }

+2
drivers/regulator/rk808-regulator.c
··· 1379 1379 .n_linear_ranges = ARRAY_SIZE(rk817_buck1_voltage_ranges), 1380 1380 .vsel_reg = RK817_BUCK3_ON_VSEL_REG, 1381 1381 .vsel_mask = RK817_BUCK_VSEL_MASK, 1382 + .apply_reg = RK817_POWER_CONFIG, 1383 + .apply_bit = RK817_BUCK3_FB_RES_INTER, 1382 1384 .enable_reg = RK817_POWER_EN_REG(0), 1383 1385 .enable_mask = ENABLE_MASK(RK817_ID_DCDC3), 1384 1386 .enable_val = ENABLE_MASK(RK817_ID_DCDC3),
+1 -1
drivers/regulator/rtq2208-regulator.c
··· 568 568 struct regmap *regmap; 569 569 struct rtq2208_regulator_desc *rdesc[RTQ2208_LDO_MAX]; 570 570 struct regulator_dev *rdev; 571 - struct regulator_config cfg; 571 + struct regulator_config cfg = {}; 572 572 struct rtq2208_rdev_map *rdev_map; 573 573 int i, ret = 0, idx, n_regulator = 0; 574 574 unsigned int regulator_idx_table[RTQ2208_LDO_MAX],
+1 -2
drivers/scsi/sd_zbc.c
··· 188 188 bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT); 189 189 190 190 while (bufsize >= SECTOR_SIZE) { 191 - buf = __vmalloc(bufsize, 192 - GFP_KERNEL | __GFP_ZERO | __GFP_NORETRY); 191 + buf = kvzalloc(bufsize, GFP_KERNEL | __GFP_NORETRY); 193 192 if (buf) { 194 193 *buflen = bufsize; 195 194 return buf;
+3 -1
drivers/staging/media/av7110/av7110.h
··· 88 88 u32 ir_config; 89 89 }; 90 90 91 + #define MAX_CI_SLOTS 2 92 + 91 93 /* place to store all the necessary device information */ 92 94 struct av7110 { 93 95 /* devices */ ··· 165 163 166 164 /* CA */ 167 165 168 - struct ca_slot_info ci_slot[2]; 166 + struct ca_slot_info ci_slot[MAX_CI_SLOTS]; 169 167 170 168 enum av7110_video_mode vidmode; 171 169 struct dmxdev dmxdev;
+17 -8
drivers/staging/media/av7110/av7110_ca.c
··· 26 26 27 27 void CI_handle(struct av7110 *av7110, u8 *data, u16 len) 28 28 { 29 + unsigned slot_num; 30 + 29 31 dprintk(8, "av7110:%p\n", av7110); 30 32 31 33 if (len < 3) 32 34 return; 33 35 switch (data[0]) { 34 36 case CI_MSG_CI_INFO: 35 - if (data[2] != 1 && data[2] != 2) 37 + if (data[2] != 1 && data[2] != MAX_CI_SLOTS) 36 38 break; 39 + 40 + slot_num = array_index_nospec(data[2] - 1, MAX_CI_SLOTS); 41 + 37 42 switch (data[1]) { 38 43 case 0: 39 - av7110->ci_slot[data[2] - 1].flags = 0; 44 + av7110->ci_slot[slot_num].flags = 0; 40 45 break; 41 46 case 1: 42 - av7110->ci_slot[data[2] - 1].flags |= CA_CI_MODULE_PRESENT; 47 + av7110->ci_slot[slot_num].flags |= CA_CI_MODULE_PRESENT; 43 48 break; 44 49 case 2: 45 - av7110->ci_slot[data[2] - 1].flags |= CA_CI_MODULE_READY; 50 + av7110->ci_slot[slot_num].flags |= CA_CI_MODULE_READY; 46 51 break; 47 52 } 48 53 break; ··· 267 262 case CA_GET_SLOT_INFO: 268 263 { 269 264 struct ca_slot_info *info = (struct ca_slot_info *)parg; 265 + unsigned int slot_num; 270 266 271 267 if (info->num < 0 || info->num > 1) { 272 268 mutex_unlock(&av7110->ioctl_mutex); 273 269 return -EINVAL; 274 270 } 275 - av7110->ci_slot[info->num].num = info->num; 276 - av7110->ci_slot[info->num].type = FW_CI_LL_SUPPORT(av7110->arm_app) ? 277 - CA_CI_LINK : CA_CI; 278 - memcpy(info, &av7110->ci_slot[info->num], sizeof(struct ca_slot_info)); 271 + slot_num = array_index_nospec(info->num, MAX_CI_SLOTS); 272 + 273 + av7110->ci_slot[slot_num].num = info->num; 274 + av7110->ci_slot[slot_num].type = FW_CI_LL_SUPPORT(av7110->arm_app) ? 275 + CA_CI_LINK : CA_CI; 276 + memcpy(info, &av7110->ci_slot[slot_num], 277 + sizeof(struct ca_slot_info)); 279 278 break; 280 279 } 281 280
+2 -4
drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c
··· 593 593 { 594 594 struct vchiq_arm_state *platform_state; 595 595 596 - platform_state = kzalloc(sizeof(*platform_state), GFP_KERNEL); 596 + platform_state = devm_kzalloc(state->dev, sizeof(*platform_state), GFP_KERNEL); 597 597 if (!platform_state) 598 598 return -ENOMEM; 599 599 ··· 1731 1731 return -ENOENT; 1732 1732 } 1733 1733 1734 - mgmt = kzalloc(sizeof(*mgmt), GFP_KERNEL); 1734 + mgmt = devm_kzalloc(&pdev->dev, sizeof(*mgmt), GFP_KERNEL); 1735 1735 if (!mgmt) 1736 1736 return -ENOMEM; 1737 1737 ··· 1789 1789 1790 1790 arm_state = vchiq_platform_get_arm_state(&mgmt->state); 1791 1791 kthread_stop(arm_state->ka_thread); 1792 - 1793 - kfree(mgmt); 1794 1792 } 1795 1793 1796 1794 static struct platform_driver vchiq_driver = {
+7
drivers/thermal/qcom/lmh.c
··· 73 73 static int lmh_irq_map(struct irq_domain *d, unsigned int irq, irq_hw_number_t hw) 74 74 { 75 75 struct lmh_hw_data *lmh_data = d->host_data; 76 + static struct lock_class_key lmh_lock_key; 77 + static struct lock_class_key lmh_request_key; 76 78 79 + /* 80 + * This lock class tells lockdep that GPIO irqs are in a different 81 + * category than their parents, so it won't report false recursion. 82 + */ 83 + irq_set_lockdep_class(irq, &lmh_lock_key, &lmh_request_key); 77 84 irq_set_chip_and_handler(irq, &lmh_irq_chip, handle_simple_irq); 78 85 irq_set_chip_data(irq, lmh_data); 79 86
+10 -11
drivers/thermal/thermal_of.c
··· 99 99 struct device_node *trips; 100 100 int ret, count; 101 101 102 + *ntrips = 0; 103 + 102 104 trips = of_get_child_by_name(np, "trips"); 103 - if (!trips) { 104 - pr_err("Failed to find 'trips' node\n"); 105 - return ERR_PTR(-EINVAL); 106 - } 105 + if (!trips) 106 + return NULL; 107 107 108 108 count = of_get_child_count(trips); 109 - if (!count) { 110 - pr_err("No trip point defined\n"); 111 - ret = -EINVAL; 112 - goto out_of_node_put; 113 - } 109 + if (!count) 110 + return NULL; 114 111 115 112 tt = kzalloc(sizeof(*tt) * count, GFP_KERNEL); 116 113 if (!tt) { ··· 130 133 131 134 out_kfree: 132 135 kfree(tt); 133 - *ntrips = 0; 134 136 out_of_node_put: 135 137 of_node_put(trips); 136 138 ··· 397 401 398 402 trips = thermal_of_trips_init(np, &ntrips); 399 403 if (IS_ERR(trips)) { 400 - pr_err("Failed to find trip points for %pOFn id=%d\n", sensor, id); 404 + pr_err("Failed to parse trip points for %pOFn id=%d\n", sensor, id); 401 405 ret = PTR_ERR(trips); 402 406 goto out_of_node_put; 403 407 } 408 + 409 + if (!trips) 410 + pr_info("No trip points found for %pOFn id=%d\n", sensor, id); 404 411 405 412 ret = thermal_of_monitor_init(np, &delay, &pdelay); 406 413 if (ret) {
+2
drivers/thunderbolt/retimer.c
··· 532 532 } 533 533 534 534 ret = 0; 535 + if (!IS_ENABLED(CONFIG_USB4_DEBUGFS_MARGINING)) 536 + max = min(last_idx, max); 535 537 536 538 /* Add retimers if they do not exist already */ 537 539 for (i = 1; i <= max; i++) {
+1 -1
drivers/thunderbolt/usb4.c
··· 48 48 49 49 /* Delays in us used with usb4_port_wait_for_bit() */ 50 50 #define USB4_PORT_DELAY 50 51 - #define USB4_PORT_SB_DELAY 5000 51 + #define USB4_PORT_SB_DELAY 1000 52 52 53 53 static int usb4_native_switch_op(struct tb_switch *sw, u16 opcode, 54 54 u32 *metadata, u8 *status,
+8 -2
drivers/ufs/core/ufshcd.c
··· 8636 8636 ufshcd_init_clk_scaling_sysfs(hba); 8637 8637 } 8638 8638 8639 + /* 8640 + * The RTC update code accesses the hba->ufs_device_wlun->sdev_gendev 8641 + * pointer and hence must only be started after the WLUN pointer has 8642 + * been initialized by ufshcd_scsi_add_wlus(). 8643 + */ 8644 + schedule_delayed_work(&hba->ufs_rtc_update_work, 8645 + msecs_to_jiffies(UFS_RTC_UPDATE_INTERVAL_MS)); 8646 + 8639 8647 ufs_bsg_probe(hba); 8640 8648 scsi_scan_host(hba->host); 8641 8649 ··· 8803 8795 ufshcd_force_reset_auto_bkops(hba); 8804 8796 8805 8797 ufshcd_set_timestamp_attr(hba); 8806 - schedule_delayed_work(&hba->ufs_rtc_update_work, 8807 - msecs_to_jiffies(UFS_RTC_UPDATE_INTERVAL_MS)); 8808 8798 8809 8799 /* Gear up to HS gear if supported */ 8810 8800 if (hba->max_pwr_info.is_valid) {
+12 -13
drivers/usb/dwc3/core.c
··· 2342 2342 u32 reg; 2343 2343 int i; 2344 2344 2345 - dwc->susphy_state = (dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0)) & 2346 - DWC3_GUSB2PHYCFG_SUSPHY) || 2347 - (dwc3_readl(dwc->regs, DWC3_GUSB3PIPECTL(0)) & 2348 - DWC3_GUSB3PIPECTL_SUSPHY); 2345 + if (!pm_runtime_suspended(dwc->dev) && !PMSG_IS_AUTO(msg)) { 2346 + dwc->susphy_state = (dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0)) & 2347 + DWC3_GUSB2PHYCFG_SUSPHY) || 2348 + (dwc3_readl(dwc->regs, DWC3_GUSB3PIPECTL(0)) & 2349 + DWC3_GUSB3PIPECTL_SUSPHY); 2350 + /* 2351 + * TI AM62 platform requires SUSPHY to be 2352 + * enabled for system suspend to work. 2353 + */ 2354 + if (!dwc->susphy_state) 2355 + dwc3_enable_susphy(dwc, true); 2356 + } 2349 2357 2350 2358 switch (dwc->current_dr_role) { 2351 2359 case DWC3_GCTL_PRTCAP_DEVICE: ··· 2404 2396 default: 2405 2397 /* do nothing */ 2406 2398 break; 2407 - } 2408 - 2409 - if (!PMSG_IS_AUTO(msg)) { 2410 - /* 2411 - * TI AM62 platform requires SUSPHY to be 2412 - * enabled for system suspend to work. 2413 - */ 2414 - if (!dwc->susphy_state) 2415 - dwc3_enable_susphy(dwc, true); 2416 2399 } 2417 2400 2418 2401 return 0;
-2
drivers/usb/musb/sunxi.c
··· 293 293 if (test_bit(SUNXI_MUSB_FL_HAS_SRAM, &glue->flags)) 294 294 sunxi_sram_release(musb->controller->parent); 295 295 296 - devm_usb_put_phy(glue->dev, glue->xceiv); 297 - 298 296 return 0; 299 297 } 300 298
+4 -4
drivers/usb/serial/io_edgeport.c
··· 770 770 static void edge_bulk_out_cmd_callback(struct urb *urb) 771 771 { 772 772 struct edgeport_port *edge_port = urb->context; 773 + struct device *dev = &urb->dev->dev; 773 774 int status = urb->status; 774 775 775 776 atomic_dec(&CmdUrbs); 776 - dev_dbg(&urb->dev->dev, "%s - FREE URB %p (outstanding %d)\n", 777 - __func__, urb, atomic_read(&CmdUrbs)); 777 + dev_dbg(dev, "%s - FREE URB %p (outstanding %d)\n", __func__, urb, 778 + atomic_read(&CmdUrbs)); 778 779 779 780 780 781 /* clean up the transfer buffer */ ··· 785 784 usb_free_urb(urb); 786 785 787 786 if (status) { 788 - dev_dbg(&urb->dev->dev, 789 - "%s - nonzero write bulk status received: %d\n", 787 + dev_dbg(dev, "%s - nonzero write bulk status received: %d\n", 790 788 __func__, status); 791 789 return; 792 790 }
+6
drivers/usb/serial/option.c
··· 251 251 #define QUECTEL_VENDOR_ID 0x2c7c 252 252 /* These Quectel products use Quectel's vendor ID */ 253 253 #define QUECTEL_PRODUCT_EC21 0x0121 254 + #define QUECTEL_PRODUCT_RG650V 0x0122 254 255 #define QUECTEL_PRODUCT_EM061K_LTA 0x0123 255 256 #define QUECTEL_PRODUCT_EM061K_LMS 0x0124 256 257 #define QUECTEL_PRODUCT_EC25 0x0125 ··· 1274 1273 { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EG912Y, 0xff, 0, 0) }, 1275 1274 { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EG916Q, 0xff, 0x00, 0x00) }, 1276 1275 { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM500K, 0xff, 0x00, 0x00) }, 1276 + { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RG650V, 0xff, 0xff, 0x30) }, 1277 + { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RG650V, 0xff, 0, 0) }, 1277 1278 1278 1279 { USB_DEVICE(CMOTECH_VENDOR_ID, CMOTECH_PRODUCT_6001) }, 1279 1280 { USB_DEVICE(CMOTECH_VENDOR_ID, CMOTECH_PRODUCT_CMU_300) }, ··· 2323 2320 { USB_DEVICE_AND_INTERFACE_INFO(0x2cb7, 0x010b, 0xff, 0xff, 0x30) }, /* Fibocom FG150 Diag */ 2324 2321 { USB_DEVICE_AND_INTERFACE_INFO(0x2cb7, 0x010b, 0xff, 0, 0) }, /* Fibocom FG150 AT */ 2325 2322 { USB_DEVICE_INTERFACE_CLASS(0x2cb7, 0x0111, 0xff) }, /* Fibocom FM160 (MBIM mode) */ 2323 + { USB_DEVICE_AND_INTERFACE_INFO(0x2cb7, 0x0112, 0xff, 0xff, 0x30) }, /* Fibocom FG132 Diag */ 2324 + { USB_DEVICE_AND_INTERFACE_INFO(0x2cb7, 0x0112, 0xff, 0xff, 0x40) }, /* Fibocom FG132 AT */ 2325 + { USB_DEVICE_AND_INTERFACE_INFO(0x2cb7, 0x0112, 0xff, 0, 0) }, /* Fibocom FG132 NMEA */ 2326 2326 { USB_DEVICE_INTERFACE_CLASS(0x2cb7, 0x0115, 0xff), /* Fibocom FM135 (laptop MBIM) */ 2327 2327 .driver_info = RSVD(5) }, 2328 2328 { USB_DEVICE_INTERFACE_CLASS(0x2cb7, 0x01a0, 0xff) }, /* Fibocom NL668-AM/NL652-EU (laptop MBIM) */
+2
drivers/usb/serial/qcserial.c
··· 166 166 {DEVICE_SWI(0x1199, 0x9090)}, /* Sierra Wireless EM7565 QDL */ 167 167 {DEVICE_SWI(0x1199, 0x9091)}, /* Sierra Wireless EM7565 */ 168 168 {DEVICE_SWI(0x1199, 0x90d2)}, /* Sierra Wireless EM9191 QDL */ 169 + {DEVICE_SWI(0x1199, 0x90e4)}, /* Sierra Wireless EM86xx QDL*/ 170 + {DEVICE_SWI(0x1199, 0x90e5)}, /* Sierra Wireless EM86xx */ 169 171 {DEVICE_SWI(0x1199, 0xc080)}, /* Sierra Wireless EM7590 QDL */ 170 172 {DEVICE_SWI(0x1199, 0xc081)}, /* Sierra Wireless EM7590 */ 171 173 {DEVICE_SWI(0x413c, 0x81a2)}, /* Dell Wireless 5806 Gobi(TM) 4G LTE Mobile Broadband Card */
+4 -4
drivers/usb/typec/tcpm/qcom/qcom_pmic_typec_pdphy.c
··· 227 227 228 228 spin_lock_irqsave(&pmic_typec_pdphy->lock, flags); 229 229 230 + hdr_len = sizeof(msg->header); 231 + txbuf_len = pd_header_cnt_le(msg->header) * 4; 232 + txsize_len = hdr_len + txbuf_len - 1; 233 + 230 234 ret = regmap_read(pmic_typec_pdphy->regmap, 231 235 pmic_typec_pdphy->base + USB_PDPHY_RX_ACKNOWLEDGE_REG, 232 236 &val); ··· 247 243 ret = qcom_pmic_typec_pdphy_clear_tx_control_reg(pmic_typec_pdphy); 248 244 if (ret) 249 245 goto done; 250 - 251 - hdr_len = sizeof(msg->header); 252 - txbuf_len = pd_header_cnt_le(msg->header) * 4; 253 - txsize_len = hdr_len + txbuf_len - 1; 254 246 255 247 /* Write message header sizeof(u16) to USB_PDPHY_TX_BUFFER_HDR_REG */ 256 248 ret = regmap_bulk_write(pmic_typec_pdphy->regmap,
+2
drivers/usb/typec/ucsi/ucsi_ccg.c
··· 482 482 483 483 port = uc->orig; 484 484 new_cam = UCSI_SET_NEW_CAM_GET_AM(*cmd); 485 + if (new_cam >= ARRAY_SIZE(uc->updated)) 486 + return; 485 487 new_port = &uc->updated[new_cam]; 486 488 cam = new_port->linked_idx; 487 489 enter_new_mode = UCSI_SET_NEW_CAM_ENTER(*cmd);
+1 -1
drivers/vdpa/ifcvf/ifcvf_base.c
··· 108 108 u32 i; 109 109 110 110 ret = pci_read_config_byte(pdev, PCI_CAPABILITY_LIST, &pos); 111 - if (ret < 0) { 111 + if (ret) { 112 112 IFCVF_ERR(pdev, "Failed to read PCI capability list\n"); 113 113 return -EIO; 114 114 }
+5 -3
drivers/vdpa/mlx5/core/mr.c
··· 373 373 struct page *pg; 374 374 unsigned int nsg; 375 375 int sglen; 376 - u64 pa; 376 + u64 pa, offset; 377 377 u64 paend; 378 378 struct scatterlist *sg; 379 379 struct device *dma = mvdev->vdev.dma_dev; ··· 396 396 sg = mr->sg_head.sgl; 397 397 for (map = vhost_iotlb_itree_first(iotlb, mr->start, mr->end - 1); 398 398 map; map = vhost_iotlb_itree_next(map, mr->start, mr->end - 1)) { 399 - paend = map->addr + maplen(map, mr); 400 - for (pa = map->addr; pa < paend; pa += sglen) { 399 + offset = mr->start > map->start ? mr->start - map->start : 0; 400 + pa = map->addr + offset; 401 + paend = map->addr + offset + maplen(map, mr); 402 + for (; pa < paend; pa += sglen) { 401 403 pg = pfn_to_page(__phys_to_pfn(pa)); 402 404 if (!sg) { 403 405 mlx5_vdpa_warn(mvdev, "sg null. start 0x%llx, end 0x%llx\n",
+5 -16
drivers/vdpa/mlx5/net/mlx5_vnet.c
··· 3963 3963 mvdev->vdev.dma_dev = &mdev->pdev->dev; 3964 3964 err = mlx5_vdpa_alloc_resources(&ndev->mvdev); 3965 3965 if (err) 3966 - goto err_mpfs; 3966 + goto err_alloc; 3967 3967 3968 3968 err = mlx5_vdpa_init_mr_resources(mvdev); 3969 3969 if (err) 3970 - goto err_res; 3970 + goto err_alloc; 3971 3971 3972 3972 if (MLX5_CAP_GEN(mvdev->mdev, umem_uid_0)) { 3973 3973 err = mlx5_vdpa_create_dma_mr(mvdev); 3974 3974 if (err) 3975 - goto err_mr_res; 3975 + goto err_alloc; 3976 3976 } 3977 3977 3978 3978 err = alloc_fixed_resources(ndev); 3979 3979 if (err) 3980 - goto err_mr; 3980 + goto err_alloc; 3981 3981 3982 3982 ndev->cvq_ent.mvdev = mvdev; 3983 3983 INIT_WORK(&ndev->cvq_ent.work, mlx5_cvq_kick_handler); 3984 3984 mvdev->wq = create_singlethread_workqueue("mlx5_vdpa_wq"); 3985 3985 if (!mvdev->wq) { 3986 3986 err = -ENOMEM; 3987 - goto err_res2; 3987 + goto err_alloc; 3988 3988 } 3989 3989 3990 3990 mvdev->vdev.mdev = &mgtdev->mgtdev; ··· 4010 4010 _vdpa_unregister_device(&mvdev->vdev); 4011 4011 err_reg: 4012 4012 destroy_workqueue(mvdev->wq); 4013 - err_res2: 4014 - free_fixed_resources(ndev); 4015 - err_mr: 4016 - mlx5_vdpa_clean_mrs(mvdev); 4017 - err_mr_res: 4018 - mlx5_vdpa_destroy_mr_resources(mvdev); 4019 - err_res: 4020 - mlx5_vdpa_free_resources(&ndev->mvdev); 4021 - err_mpfs: 4022 - if (!is_zero_ether_addr(config->mac)) 4023 - mlx5_mpfs_del_mac(pfmdev, config->mac); 4024 4013 err_alloc: 4025 4014 put_device(&mvdev->vdev.dev); 4026 4015 return err;
+10 -4
drivers/vdpa/solidrun/snet_main.c
··· 555 555 556 556 static int psnet_open_pf_bar(struct pci_dev *pdev, struct psnet *psnet) 557 557 { 558 - char name[50]; 558 + char *name; 559 559 int ret, i, mask = 0; 560 560 /* We don't know which BAR will be used to communicate.. 561 561 * We will map every bar with len > 0. ··· 573 573 return -ENODEV; 574 574 } 575 575 576 - snprintf(name, sizeof(name), "psnet[%s]-bars", pci_name(pdev)); 576 + name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "psnet[%s]-bars", pci_name(pdev)); 577 + if (!name) 578 + return -ENOMEM; 579 + 577 580 ret = pcim_iomap_regions(pdev, mask, name); 578 581 if (ret) { 579 582 SNET_ERR(pdev, "Failed to request and map PCI BARs\n"); ··· 593 590 594 591 static int snet_open_vf_bar(struct pci_dev *pdev, struct snet *snet) 595 592 { 596 - char name[50]; 593 + char *name; 597 594 int ret; 598 595 599 - snprintf(name, sizeof(name), "snet[%s]-bar", pci_name(pdev)); 596 + name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "snet[%s]-bars", pci_name(pdev)); 597 + if (!name) 598 + return -ENOMEM; 599 + 600 600 /* Request and map BAR */ 601 601 ret = pcim_iomap_regions(pdev, BIT(snet->psnet->cfg.vf_bar), name); 602 602 if (ret) {
+7 -3
drivers/vdpa/virtio_pci/vp_vdpa.c
··· 612 612 goto mdev_err; 613 613 } 614 614 615 - mdev_id = kzalloc(sizeof(struct virtio_device_id), GFP_KERNEL); 615 + /* 616 + * id_table should be a null terminated array, so allocate one additional 617 + * entry here, see vdpa_mgmtdev_get_classes(). 618 + */ 619 + mdev_id = kcalloc(2, sizeof(struct virtio_device_id), GFP_KERNEL); 616 620 if (!mdev_id) { 617 621 err = -ENOMEM; 618 622 goto mdev_id_err; ··· 636 632 goto probe_err; 637 633 } 638 634 639 - mdev_id->device = mdev->id.device; 640 - mdev_id->vendor = mdev->id.vendor; 635 + mdev_id[0].device = mdev->id.device; 636 + mdev_id[0].vendor = mdev->id.vendor; 641 637 mgtdev->id_table = mdev_id; 642 638 mgtdev->max_supported_vqs = vp_modern_get_num_queues(mdev); 643 639 mgtdev->supported_features = vp_modern_get_features(mdev);
+18 -6
drivers/virtio/virtio_pci_common.c
··· 24 24 "Force legacy mode for transitional virtio 1 devices"); 25 25 #endif 26 26 27 + bool vp_is_avq(struct virtio_device *vdev, unsigned int index) 28 + { 29 + struct virtio_pci_device *vp_dev = to_vp_device(vdev); 30 + 31 + if (!virtio_has_feature(vdev, VIRTIO_F_ADMIN_VQ)) 32 + return false; 33 + 34 + return index == vp_dev->admin_vq.vq_index; 35 + } 36 + 27 37 /* wait for pending irq handlers */ 28 38 void vp_synchronize_vectors(struct virtio_device *vdev) 29 39 { ··· 244 234 return vq; 245 235 } 246 236 247 - static void vp_del_vq(struct virtqueue *vq) 237 + static void vp_del_vq(struct virtqueue *vq, struct virtio_pci_vq_info *info) 248 238 { 249 239 struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev); 250 - struct virtio_pci_vq_info *info = vp_dev->vqs[vq->index]; 251 240 unsigned long flags; 252 241 253 242 /* ··· 267 258 void vp_del_vqs(struct virtio_device *vdev) 268 259 { 269 260 struct virtio_pci_device *vp_dev = to_vp_device(vdev); 261 + struct virtio_pci_vq_info *info; 270 262 struct virtqueue *vq, *n; 271 263 int i; 272 264 273 265 list_for_each_entry_safe(vq, n, &vdev->vqs, list) { 274 - if (vp_dev->per_vq_vectors) { 275 - int v = vp_dev->vqs[vq->index]->msix_vector; 266 + info = vp_is_avq(vdev, vq->index) ? vp_dev->admin_vq.info : 267 + vp_dev->vqs[vq->index]; 276 268 269 + if (vp_dev->per_vq_vectors) { 270 + int v = info->msix_vector; 277 271 if (v != VIRTIO_MSI_NO_VECTOR && 278 272 !vp_is_slow_path_vector(v)) { 279 273 int irq = pci_irq_vector(vp_dev->pci_dev, v); ··· 285 273 free_irq(irq, vq); 286 274 } 287 275 } 288 - vp_del_vq(vq); 276 + vp_del_vq(vq, info); 289 277 } 290 278 vp_dev->per_vq_vectors = false; 291 279 ··· 366 354 vring_interrupt, 0, 367 355 vp_dev->msix_names[msix_vec], vq); 368 356 if (err) { 369 - vp_del_vq(vq); 357 + vp_del_vq(vq, *p_info); 370 358 return ERR_PTR(err); 371 359 } 372 360
+1
drivers/virtio/virtio_pci_common.h
··· 178 178 #define VIRTIO_ADMIN_CMD_BITMAP 0 179 179 #endif 180 180 181 + bool vp_is_avq(struct virtio_device *vdev, unsigned int index); 181 182 void vp_modern_avq_done(struct virtqueue *vq); 182 183 int vp_modern_admin_cmd_exec(struct virtio_device *vdev, 183 184 struct virtio_admin_cmd *cmd);
+1 -11
drivers/virtio/virtio_pci_modern.c
··· 43 43 return 0; 44 44 } 45 45 46 - static bool vp_is_avq(struct virtio_device *vdev, unsigned int index) 47 - { 48 - struct virtio_pci_device *vp_dev = to_vp_device(vdev); 49 - 50 - if (!virtio_has_feature(vdev, VIRTIO_F_ADMIN_VQ)) 51 - return false; 52 - 53 - return index == vp_dev->admin_vq.vq_index; 54 - } 55 - 56 46 void vp_modern_avq_done(struct virtqueue *vq) 57 47 { 58 48 struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev); ··· 235 245 if (!virtio_has_feature(vdev, VIRTIO_F_ADMIN_VQ)) 236 246 return; 237 247 238 - vq = vp_dev->vqs[vp_dev->admin_vq.vq_index]->vq; 248 + vq = vp_dev->admin_vq.info->vq; 239 249 if (!vq) 240 250 return; 241 251
+12 -5
fs/bcachefs/backpointers.c
··· 52 52 enum bch_validate_flags flags) 53 53 { 54 54 struct bkey_s_c_backpointer bp = bkey_s_c_to_backpointer(k); 55 + int ret = 0; 56 + 57 + bkey_fsck_err_on(bp.v->level > BTREE_MAX_DEPTH, 58 + c, backpointer_level_bad, 59 + "backpointer level bad: %u >= %u", 60 + bp.v->level, BTREE_MAX_DEPTH); 55 61 56 62 rcu_read_lock(); 57 63 struct bch_dev *ca = bch2_dev_rcu_noerror(c, bp.k->p.inode); ··· 70 64 struct bpos bucket = bp_pos_to_bucket(ca, bp.k->p); 71 65 struct bpos bp_pos = bucket_pos_to_bp_noerror(ca, bucket, bp.v->bucket_offset); 72 66 rcu_read_unlock(); 73 - int ret = 0; 74 67 75 68 bkey_fsck_err_on((bp.v->bucket_offset >> MAX_EXTENT_COMPRESS_RATIO_SHIFT) >= ca->mi.bucket_size || 76 69 !bpos_eq(bp.k->p, bp_pos), ··· 952 947 static int check_one_backpointer(struct btree_trans *trans, 953 948 struct bbpos start, 954 949 struct bbpos end, 955 - struct bkey_s_c_backpointer bp, 950 + struct bkey_s_c bp_k, 956 951 struct bkey_buf *last_flushed) 957 952 { 953 + if (bp_k.k->type != KEY_TYPE_backpointer) 954 + return 0; 955 + 956 + struct bkey_s_c_backpointer bp = bkey_s_c_to_backpointer(bp_k); 958 957 struct bch_fs *c = trans->c; 959 958 struct btree_iter iter; 960 959 struct bbpos pos = bp_to_bbpos(*bp.v); ··· 1013 1004 POS_MIN, BTREE_ITER_prefetch, k, 1014 1005 NULL, NULL, BCH_TRANS_COMMIT_no_enospc, ({ 1015 1006 progress_update_iter(trans, &progress, &iter, "backpointers_to_extents"); 1016 - check_one_backpointer(trans, start, end, 1017 - bkey_s_c_to_backpointer(k), 1018 - &last_flushed); 1007 + check_one_backpointer(trans, start, end, k, &last_flushed); 1019 1008 })); 1020 1009 1021 1010 bch2_bkey_buf_exit(&last_flushed, c);
+3 -4
fs/bcachefs/bkey.c
··· 643 643 enum bch_validate_flags flags, 644 644 struct printbuf *err) 645 645 { 646 - unsigned i, bits = KEY_PACKED_BITS_START; 646 + unsigned bits = KEY_PACKED_BITS_START; 647 647 648 648 if (f->nr_fields != BKEY_NR_FIELDS) { 649 649 prt_printf(err, "incorrect number of fields: got %u, should be %u", ··· 655 655 * Verify that the packed format can't represent fields larger than the 656 656 * unpacked format: 657 657 */ 658 - for (i = 0; i < f->nr_fields; i++) { 659 - if ((!c || c->sb.version_min >= bcachefs_metadata_version_snapshot) && 660 - bch2_bkey_format_field_overflows(f, i)) { 658 + for (unsigned i = 0; i < f->nr_fields; i++) { 659 + if (bch2_bkey_format_field_overflows(f, i)) { 661 660 unsigned unpacked_bits = bch2_bkey_format_current.bits_per_field[i]; 662 661 u64 unpacked_max = ~((~0ULL << 1) << (unpacked_bits - 1)); 663 662 unsigned packed_bits = min(64, f->bits_per_field[i]);
+66 -41
fs/bcachefs/btree_cache.c
··· 59 59 60 60 static void btree_node_to_freedlist(struct btree_cache *bc, struct btree *b) 61 61 { 62 + BUG_ON(!list_empty(&b->list)); 63 + 62 64 if (b->c.lock.readers) 63 - list_move(&b->list, &bc->freed_pcpu); 65 + list_add(&b->list, &bc->freed_pcpu); 64 66 else 65 - list_move(&b->list, &bc->freed_nonpcpu); 67 + list_add(&b->list, &bc->freed_nonpcpu); 66 68 } 67 69 68 - static void btree_node_data_free(struct bch_fs *c, struct btree *b) 70 + static void __bch2_btree_node_to_freelist(struct btree_cache *bc, struct btree *b) 71 + { 72 + BUG_ON(!list_empty(&b->list)); 73 + BUG_ON(!b->data); 74 + 75 + bc->nr_freeable++; 76 + list_add(&b->list, &bc->freeable); 77 + } 78 + 79 + void bch2_btree_node_to_freelist(struct bch_fs *c, struct btree *b) 69 80 { 70 81 struct btree_cache *bc = &c->btree_cache; 71 82 83 + mutex_lock(&bc->lock); 84 + __bch2_btree_node_to_freelist(bc, b); 85 + mutex_unlock(&bc->lock); 86 + 87 + six_unlock_write(&b->c.lock); 88 + six_unlock_intent(&b->c.lock); 89 + } 90 + 91 + static void __btree_node_data_free(struct btree_cache *bc, struct btree *b) 92 + { 93 + BUG_ON(!list_empty(&b->list)); 72 94 BUG_ON(btree_node_hashed(b)); 73 95 74 96 /* ··· 116 94 #endif 117 95 b->aux_data = NULL; 118 96 119 - bc->nr_freeable--; 120 - 121 97 btree_node_to_freedlist(bc, b); 98 + } 99 + 100 + static void btree_node_data_free(struct btree_cache *bc, struct btree *b) 101 + { 102 + BUG_ON(list_empty(&b->list)); 103 + list_del_init(&b->list); 104 + --bc->nr_freeable; 105 + __btree_node_data_free(bc, b); 122 106 } 123 107 124 108 static int bch2_btree_cache_cmp_fn(struct rhashtable_compare_arg *arg, ··· 202 174 203 175 bch2_btree_lock_init(&b->c, 0); 204 176 205 - bc->nr_freeable++; 206 - list_add(&b->list, &bc->freeable); 177 + __bch2_btree_node_to_freelist(bc, b); 207 178 return b; 208 179 } 209 - 210 - void bch2_btree_node_to_freelist(struct bch_fs *c, struct btree *b) 211 - { 212 - mutex_lock(&c->btree_cache.lock); 213 - list_move(&b->list, &c->btree_cache.freeable); 214 - mutex_unlock(&c->btree_cache.lock); 215 - 216 - six_unlock_write(&b->c.lock); 217 - six_unlock_intent(&b->c.lock); 218 - } 219 180 220 181 static inline bool __btree_node_pinned(struct btree_cache *bc, struct btree *b) ··· 253 236 254 237 /* Btree in memory cache - hash table */ 255 238 256 - void bch2_btree_node_hash_remove(struct btree_cache *bc, struct btree *b) 239 + void __bch2_btree_node_hash_remove(struct btree_cache *bc, struct btree *b) 257 240 { 258 241 lockdep_assert_held(&bc->lock); 259 - int ret = rhashtable_remove_fast(&bc->table, &b->hash, bch_btree_cache_params); 260 242 243 + int ret = rhashtable_remove_fast(&bc->table, &b->hash, bch_btree_cache_params); 261 244 BUG_ON(ret); 262 245 263 246 /* Cause future lookups for this node to fail: */ ··· 265 248 266 249 if (b->c.btree_id < BTREE_ID_NR) 267 250 --bc->nr_by_btree[b->c.btree_id]; 251 + --bc->live[btree_node_pinned(b)].nr; 252 + list_del_init(&b->list); 253 + } 268 254 269 - bc->live[btree_node_pinned(b)].nr--; 270 - bc->nr_freeable++; 271 - list_move(&b->list, &bc->freeable); 255 + void bch2_btree_node_hash_remove(struct btree_cache *bc, struct btree *b) 256 + { 257 + __bch2_btree_node_hash_remove(bc, b); 258 + __bch2_btree_node_to_freelist(bc, b); 272 259 } 273 260 274 261 int __bch2_btree_node_hash_insert(struct btree_cache *bc, struct btree *b) 275 262 { 263 + BUG_ON(!list_empty(&b->list)); 276 264 BUG_ON(b->hash_val); 277 - b->hash_val = btree_ptr_hash_val(&b->key); 278 265 266 + b->hash_val = btree_ptr_hash_val(&b->key); 279 267 int ret = rhashtable_lookup_insert_fast(&bc->table, &b->hash, 280 268 bch_btree_cache_params); 281 269 if (ret) ··· 292 270 bool p = __btree_node_pinned(bc, b); 293 271 mod_bit(BTREE_NODE_pinned, &b->flags, p); 294 272 295 - list_move_tail(&b->list, &bc->live[p].list); 273 + list_add_tail(&b->list, &bc->live[p].list); 296 274 bc->live[p].nr++; 297 275 298 - bc->nr_freeable--; 299 276 return 0; 300 277 } 301 278 ··· 505 485 goto out; 
506 486 507 487 if (!btree_node_reclaim(c, b, true)) { 508 - btree_node_data_free(c, b); 488 + btree_node_data_free(bc, b); 509 489 six_unlock_write(&b->c.lock); 510 490 six_unlock_intent(&b->c.lock); 511 491 freed++; ··· 521 501 bc->not_freed[BCH_BTREE_CACHE_NOT_FREED_access_bit]++; 522 502 --touched;; 523 503 } else if (!btree_node_reclaim(c, b, true)) { 524 - bch2_btree_node_hash_remove(bc, b); 504 + __bch2_btree_node_hash_remove(bc, b); 505 + __btree_node_data_free(bc, b); 525 506 526 507 freed++; 527 - btree_node_data_free(c, b); 528 508 bc->nr_freed++; 529 509 530 510 six_unlock_write(&b->c.lock); ··· 607 587 BUG_ON(btree_node_read_in_flight(b) || 608 588 btree_node_write_in_flight(b)); 609 589 610 - btree_node_data_free(c, b); 590 + btree_node_data_free(bc, b); 611 591 } 612 592 613 593 BUG_ON(!bch2_journal_error(&c->journal) && ··· 806 786 807 787 BUG_ON(!six_trylock_intent(&b->c.lock)); 808 788 BUG_ON(!six_trylock_write(&b->c.lock)); 809 - got_node: 810 789 790 + got_node: 811 791 /* 812 792 * btree_free() doesn't free memory; it sticks the node on the end of 813 793 * the list. 
Check if there's any freed nodes there: ··· 816 796 if (!btree_node_reclaim(c, b2, false)) { 817 797 swap(b->data, b2->data); 818 798 swap(b->aux_data, b2->aux_data); 799 + 800 + list_del_init(&b2->list); 801 + --bc->nr_freeable; 819 802 btree_node_to_freedlist(bc, b2); 803 + mutex_unlock(&bc->lock); 804 + 820 805 six_unlock_write(&b2->c.lock); 821 806 six_unlock_intent(&b2->c.lock); 822 807 goto got_mem; ··· 835 810 goto err; 836 811 } 837 812 838 - mutex_lock(&bc->lock); 839 - bc->nr_freeable++; 840 813 got_mem: 841 - mutex_unlock(&bc->lock); 842 - 814 + BUG_ON(!list_empty(&b->list)); 843 815 BUG_ON(btree_node_hashed(b)); 844 816 BUG_ON(btree_node_dirty(b)); 845 817 BUG_ON(btree_node_write_in_flight(b)); ··· 867 845 if (bc->alloc_lock == current) { 868 846 b2 = btree_node_cannibalize(c); 869 847 clear_btree_node_just_written(b2); 870 - bch2_btree_node_hash_remove(bc, b2); 848 + __bch2_btree_node_hash_remove(bc, b2); 871 849 872 850 if (b) { 873 851 swap(b->data, b2->data); ··· 877 855 six_unlock_intent(&b2->c.lock); 878 856 } else { 879 857 b = b2; 880 - list_del_init(&b->list); 881 858 } 882 859 860 + BUG_ON(!list_empty(&b->list)); 883 861 mutex_unlock(&bc->lock); 884 862 885 863 trace_and_count(c, btree_cache_cannibalize, trans); ··· 958 936 b->hash_val = 0; 959 937 960 938 mutex_lock(&bc->lock); 961 - list_add(&b->list, &bc->freeable); 939 + __bch2_btree_node_to_freelist(bc, b); 962 940 mutex_unlock(&bc->lock); 963 941 964 942 six_unlock_write(&b->c.lock); ··· 1334 1312 1335 1313 b = bch2_btree_node_fill(trans, path, k, btree_id, 1336 1314 level, SIX_LOCK_read, false); 1337 - if (!IS_ERR_OR_NULL(b)) 1315 + int ret = PTR_ERR_OR_ZERO(b); 1316 + if (ret) 1317 + return ret; 1318 + if (b) 1338 1319 six_unlock_read(&b->c.lock); 1339 - return bch2_trans_relock(trans) ?: PTR_ERR_OR_ZERO(b); 1320 + return 0; 1340 1321 } 1341 1322 1342 1323 void bch2_btree_node_evict(struct btree_trans *trans, const struct bkey_i *k) ··· 1378 1353 1379 1354 mutex_lock(&bc->lock); 1380 
1355 bch2_btree_node_hash_remove(bc, b); 1381 - btree_node_data_free(c, b); 1356 + btree_node_data_free(bc, b); 1382 1357 mutex_unlock(&bc->lock); 1383 1358 out: 1384 1359 six_unlock_write(&b->c.lock);
+2
fs/bcachefs/btree_cache.h
··· 14 14 15 15 void bch2_btree_node_to_freelist(struct bch_fs *, struct btree *); 16 16 17 + void __bch2_btree_node_hash_remove(struct btree_cache *, struct btree *); 17 18 void bch2_btree_node_hash_remove(struct btree_cache *, struct btree *); 19 + 18 20 int __bch2_btree_node_hash_insert(struct btree_cache *, struct btree *); 19 21 int bch2_btree_node_hash_insert(struct btree_cache *, struct btree *, 20 22 unsigned, enum btree_id);
+1 -1
fs/bcachefs/btree_gc.c
··· 182 182 bch2_btree_node_drop_keys_outside_node(b); 183 183 184 184 mutex_lock(&c->btree_cache.lock); 185 - bch2_btree_node_hash_remove(&c->btree_cache, b); 185 + __bch2_btree_node_hash_remove(&c->btree_cache, b); 186 186 187 187 bkey_copy(&b->key, &new->k_i); 188 188 ret = __bch2_btree_node_hash_insert(&c->btree_cache, b);
+1 -5
fs/bcachefs/btree_io.c
··· 733 733 c, ca, b, i, NULL, 734 734 bset_past_end_of_btree_node, 735 735 "bset past end of btree node (offset %u len %u but written %zu)", 736 - offset, sectors, ptr_written ?: btree_sectors(c))) { 736 + offset, sectors, ptr_written ?: btree_sectors(c))) 737 737 i->u64s = 0; 738 - ret = 0; 739 - goto out; 740 - } 741 738 742 739 btree_err_on(offset && !i->u64s, 743 740 -BCH_ERR_btree_node_read_err_fixable, ··· 826 829 BSET_BIG_ENDIAN(i), write, 827 830 &bn->format); 828 831 } 829 - out: 830 832 fsck_err: 831 833 printbuf_exit(&buf2); 832 834 printbuf_exit(&buf1);
+1 -1
fs/bcachefs/btree_node_scan.c
··· 186 186 .ptrs[0].type = 1 << BCH_EXTENT_ENTRY_ptr, 187 187 .ptrs[0].offset = offset, 188 188 .ptrs[0].dev = ca->dev_idx, 189 - .ptrs[0].gen = *bucket_gen(ca, sector_to_bucket(ca, offset)), 189 + .ptrs[0].gen = bucket_gen_get(ca, sector_to_bucket(ca, offset)), 190 190 }; 191 191 rcu_read_unlock(); 192 192
+20 -13
fs/bcachefs/btree_update_interior.c
··· 237 237 BUG_ON(b->will_make_reachable); 238 238 239 239 clear_btree_node_noevict(b); 240 - 241 - mutex_lock(&c->btree_cache.lock); 242 - list_move(&b->list, &c->btree_cache.freeable); 243 - mutex_unlock(&c->btree_cache.lock); 244 240 } 245 241 246 242 static void bch2_btree_node_free_inmem(struct btree_trans *trans, ··· 248 252 249 253 bch2_btree_node_lock_write_nofail(trans, path, &b->c); 250 254 255 + __btree_node_free(trans, b); 256 + 251 257 mutex_lock(&c->btree_cache.lock); 252 258 bch2_btree_node_hash_remove(&c->btree_cache, b); 253 259 mutex_unlock(&c->btree_cache.lock); 254 - 255 - __btree_node_free(trans, b); 256 260 257 261 six_unlock_write(&b->c.lock); 258 262 mark_btree_node_locked_noreset(path, level, BTREE_NODE_INTENT_LOCKED); ··· 285 289 clear_btree_node_need_write(b); 286 290 287 291 mutex_lock(&c->btree_cache.lock); 288 - bch2_btree_node_hash_remove(&c->btree_cache, b); 292 + __bch2_btree_node_hash_remove(&c->btree_cache, b); 289 293 mutex_unlock(&c->btree_cache.lock); 290 294 291 295 BUG_ON(p->nr >= ARRAY_SIZE(p->b)); ··· 517 521 btree_node_lock_nopath_nofail(trans, &b->c, SIX_LOCK_intent); 518 522 btree_node_lock_nopath_nofail(trans, &b->c, SIX_LOCK_write); 519 523 __btree_node_free(trans, b); 520 - six_unlock_write(&b->c.lock); 521 - six_unlock_intent(&b->c.lock); 524 + bch2_btree_node_to_freelist(c, b); 522 525 } 523 526 } 524 527 } ··· 1429 1434 } 1430 1435 } 1431 1436 1437 + static bool key_deleted_in_insert(struct keylist *insert_keys, struct bpos pos) 1438 + { 1439 + if (insert_keys) 1440 + for_each_keylist_key(insert_keys, k) 1441 + if (bkey_deleted(&k->k) && bpos_eq(k->k.p, pos)) 1442 + return true; 1443 + return false; 1444 + } 1445 + 1432 1446 /* 1433 1447 * Move keys from n1 (original replacement node, now lower node) to n2 (higher 1434 1448 * node) ··· 1445 1441 static void __btree_split_node(struct btree_update *as, 1446 1442 struct btree_trans *trans, 1447 1443 struct btree *b, 1448 - struct btree *n[2]) 1444 + struct btree *n[2], 1445 + struct keylist *insert_keys) 1449 1446 { 1450 1447 struct bkey_packed *k; 1451 1448 struct bpos n1_pos = POS_MIN; ··· 1481 1476 if (b->c.level && 1482 1477 u64s < n1_u64s && 1483 1478 u64s + k->u64s >= n1_u64s && 1484 - bch2_key_deleted_in_journal(trans, b->c.btree_id, b->c.level, uk.p)) 1479 + (bch2_key_deleted_in_journal(trans, b->c.btree_id, b->c.level, uk.p) || 1480 + key_deleted_in_insert(insert_keys, uk.p))) 1485 1481 n1_u64s += k->u64s; 1486 1482 1487 1483 i = u64s >= n1_u64s; ··· 1609 1603 n[0] = n1 = bch2_btree_node_alloc(as, trans, b->c.level); 1610 1604 n[1] = n2 = bch2_btree_node_alloc(as, trans, b->c.level); 1611 1605 1612 - __btree_split_node(as, trans, b, n); 1606 + __btree_split_node(as, trans, b, n, keys); 1613 1607 1614 1608 if (keys) { 1615 1609 btree_split_insert_keys(as, trans, path, n1, keys); ··· 2398 2392 if (new_hash) { 2399 2393 mutex_lock(&c->btree_cache.lock); 2400 2394 bch2_btree_node_hash_remove(&c->btree_cache, new_hash); 2401 - bch2_btree_node_hash_remove(&c->btree_cache, b); 2395 + 2396 + __bch2_btree_node_hash_remove(&c->btree_cache, b); 2402 2397 2403 2398 bkey_copy(&b->key, new_key); 2404 2399 ret = __bch2_btree_node_hash_insert(&c->btree_cache, b);
+27 -3
fs/bcachefs/btree_write_buffer.c
··· 277 277 bool accounting_replay_done = test_bit(BCH_FS_accounting_replay_done, &c->flags); 278 278 int ret = 0; 279 279 280 + ret = bch2_journal_error(&c->journal); 281 + if (ret) 282 + return ret; 283 + 280 284 bch2_trans_unlock(trans); 281 285 bch2_trans_begin(trans); 282 286 ··· 495 491 return ret; 496 492 } 497 493 498 - static int btree_write_buffer_flush_seq(struct btree_trans *trans, u64 seq) 494 + static int btree_write_buffer_flush_seq(struct btree_trans *trans, u64 seq, 495 + bool *did_work) 499 496 { 500 497 struct bch_fs *c = trans->c; 501 498 struct btree_write_buffer *wb = &c->btree_write_buffer; ··· 506 501 bch2_trans_unlock(trans); 507 502 508 503 fetch_from_journal_err = fetch_wb_keys_from_journal(c, seq); 504 + 505 + *did_work |= wb->inc.keys.nr || wb->flushing.keys.nr; 509 506 510 507 /* 511 508 * On memory allocation failure, bch2_btree_write_buffer_flush_locked() ··· 528 521 struct journal_entry_pin *_pin, u64 seq) 529 522 { 530 523 struct bch_fs *c = container_of(j, struct bch_fs, journal); 524 + bool did_work = false; 531 525 532 - return bch2_trans_run(c, btree_write_buffer_flush_seq(trans, seq)); 526 + return bch2_trans_run(c, btree_write_buffer_flush_seq(trans, seq, &did_work)); 533 527 } 534 528 535 529 int bch2_btree_write_buffer_flush_sync(struct btree_trans *trans) 536 530 { 537 531 struct bch_fs *c = trans->c; 532 + bool did_work = false; 538 533 539 534 trace_and_count(c, write_buffer_flush_sync, trans, _RET_IP_); 540 535 541 - return btree_write_buffer_flush_seq(trans, journal_cur_seq(&c->journal)); 536 + return btree_write_buffer_flush_seq(trans, journal_cur_seq(&c->journal), &did_work); 537 + } 538 + 539 + /* 540 + * The write buffer requires flushing when going RO: keys in the journal for the 541 + * write buffer don't have a journal pin yet 542 + */ 543 + bool bch2_btree_write_buffer_flush_going_ro(struct bch_fs *c) 544 + { 545 + if (bch2_journal_error(&c->journal)) 546 + return false; 547 + 548 + bool did_work = false; 549 + 
bch2_trans_run(c, btree_write_buffer_flush_seq(trans, 550 + journal_cur_seq(&c->journal), &did_work)); 551 + return did_work; 542 552 } 543 553 544 554 int bch2_btree_write_buffer_flush_nocheck_rw(struct btree_trans *trans)
+1
fs/bcachefs/btree_write_buffer.h
··· 21 21 22 22 struct btree_trans; 23 23 int bch2_btree_write_buffer_flush_sync(struct btree_trans *); 24 + bool bch2_btree_write_buffer_flush_going_ro(struct bch_fs *); 24 25 int bch2_btree_write_buffer_flush_nocheck_rw(struct btree_trans *); 25 26 int bch2_btree_write_buffer_tryflush(struct btree_trans *); 26 27
+11 -8
fs/bcachefs/buckets.h
··· 103 103 return gens->b + b; 104 104 } 105 105 106 - static inline u8 bucket_gen_get(struct bch_dev *ca, size_t b) 106 + static inline int bucket_gen_get_rcu(struct bch_dev *ca, size_t b) 107 + { 108 + u8 *gen = bucket_gen(ca, b); 109 + return gen ? *gen : -1; 110 + } 111 + 112 + static inline int bucket_gen_get(struct bch_dev *ca, size_t b) 107 113 { 108 114 rcu_read_lock(); 109 - u8 gen = *bucket_gen(ca, b); 115 + int ret = bucket_gen_get_rcu(ca, b); 110 116 rcu_read_unlock(); 111 - return gen; 117 + return ret; 112 118 } 113 119 114 120 static inline size_t PTR_BUCKET_NR(const struct bch_dev *ca, ··· 175 169 176 170 static inline int dev_ptr_stale_rcu(struct bch_dev *ca, const struct bch_extent_ptr *ptr) 177 171 { 178 - u8 *gen = bucket_gen(ca, PTR_BUCKET_NR(ca, ptr)); 179 - if (!gen) 180 - return -1; 181 - return gen_after(*gen, ptr->gen); 172 + int gen = bucket_gen_get_rcu(ca, PTR_BUCKET_NR(ca, ptr)); 173 + return gen < 0 ? gen : gen_after(gen, ptr->gen); 182 174 } 183 175 184 176 /** ··· 188 184 rcu_read_lock(); 189 185 int ret = dev_ptr_stale_rcu(ca, ptr); 190 186 rcu_read_unlock(); 191 - 192 187 return ret; 193 188 } 194 189
+1
fs/bcachefs/errcode.h
··· 84 84 x(ENOMEM, ENOMEM_dev_alloc) \ 85 85 x(ENOMEM, ENOMEM_disk_accounting) \ 86 86 x(ENOMEM, ENOMEM_stripe_head_alloc) \ 87 + x(ENOMEM, ENOMEM_journal_read_bucket) \ 87 88 x(ENOSPC, ENOSPC_disk_reservation) \ 88 89 x(ENOSPC, ENOSPC_bucket_alloc) \ 89 90 x(ENOSPC, ENOSPC_disk_label_add) \
+4 -1
fs/bcachefs/extents.c
··· 1364 1364 for (entry = ptrs.start; 1365 1365 entry < ptrs.end; 1366 1366 entry = extent_entry_next(entry)) { 1367 - switch (extent_entry_type(entry)) { 1367 + switch (__extent_entry_type(entry)) { 1368 1368 case BCH_EXTENT_ENTRY_ptr: 1369 1369 break; 1370 1370 case BCH_EXTENT_ENTRY_crc32: ··· 1384 1384 break; 1385 1385 case BCH_EXTENT_ENTRY_rebalance: 1386 1386 break; 1387 + default: 1388 + /* Bad entry type: will be caught by validate() */ 1389 + return; 1387 1390 } 1388 1391 } 1389 1392 }
+5 -5
fs/bcachefs/io_read.c
··· 262 262 bio_free_pages(&(*rbio)->bio); 263 263 kfree(*rbio); 264 264 *rbio = NULL; 265 - kfree(op); 265 + /* We may have added to the rhashtable and thus need rcu freeing: */ 266 + kfree_rcu(op, rcu); 266 267 bch2_write_ref_put(c, BCH_WRITE_REF_promote); 267 268 return ERR_PTR(ret); 268 269 } ··· 803 802 PTR_BUCKET_POS(ca, &ptr), 804 803 BTREE_ITER_cached); 805 804 806 - u8 *gen = bucket_gen(ca, iter.pos.offset); 807 - if (gen) { 808 - 805 + int gen = bucket_gen_get(ca, iter.pos.offset); 806 + if (gen >= 0) { 809 807 prt_printf(&buf, "Attempting to read from stale dirty pointer:\n"); 810 808 printbuf_indent_add(&buf, 2); 811 809 812 810 bch2_bkey_val_to_text(&buf, c, k); 813 811 prt_newline(&buf); 814 812 815 - prt_printf(&buf, "memory gen: %u", *gen); 813 + prt_printf(&buf, "memory gen: %u", gen); 816 814 817 815 ret = lockrestart_do(trans, bkey_err(k = bch2_btree_iter_peek_slot(&iter))); 818 816 if (!ret) {
+2 -5
fs/bcachefs/io_write.c
··· 1300 1300 bucket_to_u64(i->b), 1301 1301 BUCKET_NOCOW_LOCK_UPDATE); 1302 1302 1303 - rcu_read_lock(); 1304 - u8 *gen = bucket_gen(ca, i->b.offset); 1305 - stale = !gen ? -1 : gen_after(*gen, i->gen); 1306 - rcu_read_unlock(); 1307 - 1303 + int gen = bucket_gen_get(ca, i->b.offset); 1304 + stale = gen < 0 ? gen : gen_after(gen, i->gen); 1308 1305 if (unlikely(stale)) { 1309 1306 stale_at = i; 1310 1307 goto err_bucket_stale;
+5
fs/bcachefs/journal_io.c
··· 708 708 container_of(entry, struct jset_entry_dev_usage, entry); 709 709 unsigned i, nr_types = jset_entry_dev_usage_nr_types(u); 710 710 711 + if (vstruct_bytes(entry) < sizeof(*u)) 712 + return; 713 + 711 714 prt_printf(out, "dev=%u", le32_to_cpu(u->dev)); 712 715 713 716 printbuf_indent_add(out, 2); ··· 1015 1012 nr_bvecs = buf_pages(buf->data, sectors_read << 9); 1016 1013 1017 1014 bio = bio_kmalloc(nr_bvecs, GFP_KERNEL); 1015 + if (!bio) 1016 + return -BCH_ERR_ENOMEM_journal_read_bucket; 1018 1017 bio_init(bio, ca->disk_sb.bdev, bio->bi_inline_vecs, nr_bvecs, REQ_OP_READ); 1019 1018 1020 1019 bio->bi_iter.bi_sector = offset;
+2 -2
fs/bcachefs/opts.c
··· 226 226 #define OPT_UINT(_min, _max) .type = BCH_OPT_UINT, \ 227 227 .min = _min, .max = _max 228 228 #define OPT_STR(_choices) .type = BCH_OPT_STR, \ 229 - .min = 0, .max = ARRAY_SIZE(_choices), \ 229 + .min = 0, .max = ARRAY_SIZE(_choices) - 1, \ 230 230 .choices = _choices 231 231 #define OPT_STR_NOLIMIT(_choices) .type = BCH_OPT_STR, \ 232 232 .min = 0, .max = U64_MAX, \ ··· 428 428 prt_printf(out, "%lli", v); 429 429 break; 430 430 case BCH_OPT_STR: 431 - if (v < opt->min || v >= opt->max - 1) 431 + if (v < opt->min || v >= opt->max) 432 432 prt_printf(out, "(invalid option %lli)", v); 433 433 else if (flags & OPT_SHOW_FULL_LIST) 434 434 prt_string_option(out, opt->choices, v);
+7
fs/bcachefs/recovery.c
··· 862 862 if (ret) 863 863 goto err; 864 864 865 + /* 866 + * Normally set by the appropriate recovery pass: when cleared, this 867 + * indicates we're in early recovery and btree updates should be done by 868 + * being applied to the journal replay keys. _Must_ be cleared before 869 + * multithreaded use: 870 + */ 871 + set_bit(BCH_FS_may_go_rw, &c->flags); 865 872 clear_bit(BCH_FS_fsck_running, &c->flags); 866 873 867 874 /* in case we don't run journal replay, i.e. norecovery mode */
+12
fs/bcachefs/recovery_passes.c
··· 27 27 NULL 28 28 }; 29 29 30 + /* Fake recovery pass, so that scan_for_btree_nodes isn't 0: */ 31 + static int bch2_recovery_pass_empty(struct bch_fs *c) 32 + { 33 + return 0; 34 + } 35 + 30 36 static int bch2_set_may_go_rw(struct bch_fs *c) 31 37 { 32 38 struct journal_keys *keys = &c->journal_keys; ··· 226 220 int bch2_run_recovery_passes(struct bch_fs *c) 227 221 { 228 222 int ret = 0; 223 + 224 + /* 225 + * We can't allow set_may_go_rw to be excluded; that would cause us to 226 + * use the journal replay keys for updates where it's not expected. 227 + */ 228 + c->opts.recovery_passes_exclude &= ~BCH_RECOVERY_PASS_set_may_go_rw; 229 229 230 230 while (c->curr_recovery_pass < ARRAY_SIZE(recovery_pass_fns)) { 231 231 if (c->opts.recovery_pass_last &&
+1
fs/bcachefs/recovery_passes_types.h
··· 13 13 * must never change: 14 14 */ 15 15 #define BCH_RECOVERY_PASSES() \ 16 + x(recovery_pass_empty, 41, PASS_SILENT) \ 16 17 x(scan_for_btree_nodes, 37, 0) \ 17 18 x(check_topology, 4, 0) \ 18 19 x(accounting_read, 39, PASS_ALWAYS) \
+5 -1
fs/bcachefs/sb-errors_format.h
··· 136 136 x(bucket_gens_nonzero_for_invalid_buckets, 122, FSCK_AUTOFIX) \ 137 137 x(need_discard_freespace_key_to_invalid_dev_bucket, 123, 0) \ 138 138 x(need_discard_freespace_key_bad, 124, 0) \ 139 + x(discarding_bucket_not_in_need_discard_btree, 291, 0) \ 139 140 x(backpointer_bucket_offset_wrong, 125, 0) \ 141 + x(backpointer_level_bad, 294, 0) \ 140 142 x(backpointer_to_missing_device, 126, 0) \ 141 143 x(backpointer_to_missing_alloc, 127, 0) \ 142 144 x(backpointer_to_missing_ptr, 128, 0) \ ··· 179 177 x(ptr_stripe_redundant, 163, 0) \ 180 178 x(reservation_key_nr_replicas_invalid, 164, 0) \ 181 179 x(reflink_v_refcount_wrong, 165, 0) \ 180 + x(reflink_v_pos_bad, 292, 0) \ 182 181 x(reflink_p_to_missing_reflink_v, 166, 0) \ 182 + x(reflink_refcount_underflow, 293, 0) \ 183 183 x(stripe_pos_bad, 167, 0) \ 184 184 x(stripe_val_size_bad, 168, 0) \ 185 185 x(stripe_csum_granularity_bad, 290, 0) \ ··· 306 302 x(accounting_key_replicas_devs_unsorted, 280, FSCK_AUTOFIX) \ 307 303 x(accounting_key_version_0, 282, FSCK_AUTOFIX) \ 308 304 x(logged_op_but_clean, 283, FSCK_AUTOFIX) \ 309 - x(MAX, 291, 0) 305 + x(MAX, 295, 0) 310 306 311 307 enum bch_sb_error_id { 312 308 #define x(t, n, ...) BCH_FSCK_ERR_##t = n,
+2 -2
fs/bcachefs/sb-members.c
··· 163 163 return -BCH_ERR_invalid_sb_members; 164 164 } 165 165 166 - if (m.btree_bitmap_shift >= 64) { 166 + if (m.btree_bitmap_shift >= BCH_MI_BTREE_BITMAP_SHIFT_MAX) { 167 167 prt_printf(err, "device %u: invalid btree_bitmap_shift %u", i, m.btree_bitmap_shift); 168 168 return -BCH_ERR_invalid_sb_members; 169 169 } ··· 450 450 m->btree_bitmap_shift += resize; 451 451 } 452 452 453 - BUG_ON(m->btree_bitmap_shift > 57); 453 + BUG_ON(m->btree_bitmap_shift >= BCH_MI_BTREE_BITMAP_SHIFT_MAX); 454 454 BUG_ON(end > 64ULL << m->btree_bitmap_shift); 455 455 456 456 for (unsigned bit = start >> m->btree_bitmap_shift;
+6
fs/bcachefs/sb-members_format.h
··· 66 66 }; 67 67 68 68 /* 69 + * btree_allocated_bitmap can represent sector addresses of a u64: it itself has 70 + * 64 elements, so 64 - ilog2(64) 71 + */ 72 + #define BCH_MI_BTREE_BITMAP_SHIFT_MAX 58 73 + 74 + /* 69 75 * This limit comes from the bucket_gens array - it's a single allocation, and 70 76 * kernel allocation are limited to INT_MAX 71 77 */
+1
fs/bcachefs/super.c
··· 272 272 clean_passes++; 273 273 274 274 if (bch2_btree_interior_updates_flush(c) || 275 + bch2_btree_write_buffer_flush_going_ro(c) || 275 276 bch2_journal_flush_all_pins(&c->journal) || 276 277 bch2_btree_flush_all_writes(c) || 277 278 seq != atomic64_read(&c->journal.seq)) {
+5
fs/bcachefs/tests.c
··· 809 809 unsigned i; 810 810 u64 time; 811 811 812 + if (nr == 0 || nr_threads == 0) { 813 + pr_err("nr of iterations or threads is not allowed to be 0"); 814 + return -EINVAL; 815 + } 816 + 812 817 atomic_set(&j.ready, nr_threads); 813 818 init_waitqueue_head(&j.ready_wait); 814 819
+1 -1
fs/btrfs/delayed-ref.c
··· 649 649 &href->ref_add_list); 650 650 else if (ref->action == BTRFS_DROP_DELAYED_REF) { 651 651 ASSERT(!list_empty(&exist->add_list)); 652 - list_del(&exist->add_list); 652 + list_del_init(&exist->add_list); 653 653 } else { 654 654 ASSERT(0); 655 655 }
+1 -1
fs/btrfs/inode.c
··· 1618 1618 clear_bits |= EXTENT_CLEAR_DATA_RESV; 1619 1619 extent_clear_unlock_delalloc(inode, start, end, locked_folio, 1620 1620 &cached, clear_bits, page_ops); 1621 - btrfs_qgroup_free_data(inode, NULL, start, cur_alloc_size, NULL); 1621 + btrfs_qgroup_free_data(inode, NULL, start, end - start + 1, NULL); 1622 1622 } 1623 1623 return ret; 1624 1624 }
+5 -20
fs/btrfs/super.c
··· 1979 1979 * fsconfig(FSCONFIG_SET_FLAG, "ro"). This option is seen by the filesystem 1980 1980 * in fc->sb_flags. 1981 1981 * 1982 - * This disambiguation has rather positive consequences. Mounting a subvolume 1983 - * ro will not also turn the superblock ro. Only the mount for the subvolume 1984 - * will become ro. 1985 - * 1986 - * So, if the superblock creation request comes from the new mount API the 1987 - * caller must have explicitly done: 1988 - * 1989 - * fsconfig(FSCONFIG_SET_FLAG, "ro") 1990 - * fsmount/mount_setattr(MOUNT_ATTR_RDONLY) 1991 - * 1992 - * IOW, at some point the caller must have explicitly turned the whole 1993 - * superblock ro and we shouldn't just undo it like we did for the old mount 1994 - * API. In any case, it lets us avoid the hack in the new mount API. 1995 - * 1996 - * Consequently, the remounting hack must only be used for requests originating 1997 - * from the old mount API and should be marked for full deprecation so it can be 1998 - * turned off in a couple of years. 1999 - * 2000 - * The new mount API has no reason to support this hack. 1982 + * But, currently the util-linux mount command already utilizes the new mount 1983 + * API and is still setting fsconfig(FSCONFIG_SET_FLAG, "ro") no matter if it's 1984 + * btrfs or not, setting the whole super block RO. To make per-subvolume mounting 1985 + * work with different options work we need to keep backward compatibility. 2001 1986 */ 2002 1987 static struct vfsmount *btrfs_reconfigure_for_mount(struct fs_context *fc) 2003 1988 { ··· 2004 2019 if (IS_ERR(mnt)) 2005 2020 return mnt; 2006 2021 2007 - if (!fc->oldapi || !ro2rw) 2022 + if (!ro2rw) 2008 2023 return mnt; 2009 2024 2010 2025 /* We need to convert to rw, call reconfigure. */
+5 -8
fs/nfsd/vfs.c
··· 903 903 goto out; 904 904 } 905 905 906 - if (may_flags & NFSD_MAY_64BIT_COOKIE) 907 - file->f_mode |= FMODE_64BITHASH; 908 - else 909 - file->f_mode |= FMODE_32BITHASH; 910 - 911 906 *filp = file; 912 907 out: 913 908 return host_err; ··· 2169 2174 loff_t offset = *offsetp; 2170 2175 int may_flags = NFSD_MAY_READ; 2171 2176 2172 - if (fhp->fh_64bit_cookies) 2173 - may_flags |= NFSD_MAY_64BIT_COOKIE; 2174 - 2175 2177 err = nfsd_open(rqstp, fhp, S_IFDIR, may_flags, &file); 2176 2178 if (err) 2177 2179 goto out; 2180 + 2181 + if (fhp->fh_64bit_cookies) 2182 + file->f_mode |= FMODE_64BITHASH; 2183 + else 2184 + file->f_mode |= FMODE_32BITHASH; 2178 2185 2179 2186 offset = vfs_llseek(file, offset, SEEK_SET); 2180 2187 if (offset < 0) {
-2
fs/nilfs2/btnode.c
··· 68 68 goto failed; 69 69 } 70 70 memset(bh->b_data, 0, i_blocksize(inode)); 71 - bh->b_bdev = inode->i_sb->s_bdev; 72 71 bh->b_blocknr = blocknr; 73 72 set_buffer_mapped(bh); 74 73 set_buffer_uptodate(bh); ··· 132 133 goto found; 133 134 } 134 135 set_buffer_mapped(bh); 135 - bh->b_bdev = inode->i_sb->s_bdev; 136 136 bh->b_blocknr = pblocknr; /* set block address for read */ 137 137 bh->b_end_io = end_buffer_read_sync; 138 138 get_bh(bh);
+1 -3
fs/nilfs2/gcinode.c
··· 83 83 goto out; 84 84 } 85 85 86 - if (!buffer_mapped(bh)) { 87 - bh->b_bdev = inode->i_sb->s_bdev; 86 + if (!buffer_mapped(bh)) 88 87 set_buffer_mapped(bh); 89 - } 90 88 bh->b_blocknr = pbn; 91 89 bh->b_end_io = end_buffer_read_sync; 92 90 get_bh(bh);
-1
fs/nilfs2/mdt.c
··· 89 89 if (buffer_uptodate(bh)) 90 90 goto failed_bh; 91 91 92 - bh->b_bdev = sb->s_bdev; 93 92 err = nilfs_mdt_insert_new_block(inode, block, bh, init_block); 94 93 if (likely(!err)) { 95 94 get_bh(bh);
+1 -1
fs/nilfs2/page.c
··· 39 39 first_block = (unsigned long)index << (PAGE_SHIFT - blkbits); 40 40 bh = get_nth_bh(bh, block - first_block); 41 41 42 - touch_buffer(bh); 43 42 wait_on_buffer(bh); 44 43 return bh; 45 44 } ··· 63 64 folio_put(folio); 64 65 return NULL; 65 66 } 67 + bh->b_bdev = inode->i_sb->s_bdev; 66 68 return bh; 67 69 } 68 70
+9 -4
fs/ocfs2/super.c
··· 2319 2319 struct ocfs2_blockcheck_stats *stats) 2320 2320 { 2321 2321 int status = -EAGAIN; 2322 + u32 blksz_bits; 2322 2323 2323 2324 if (memcmp(di->i_signature, OCFS2_SUPER_BLOCK_SIGNATURE, 2324 2325 strlen(OCFS2_SUPER_BLOCK_SIGNATURE)) == 0) { ··· 2334 2333 goto out; 2335 2334 } 2336 2335 status = -EINVAL; 2337 - if ((1 << le32_to_cpu(di->id2.i_super.s_blocksize_bits)) != blksz) { 2336 + /* Acceptable block sizes are 512 bytes, 1K, 2K and 4K. */ 2337 + blksz_bits = le32_to_cpu(di->id2.i_super.s_blocksize_bits); 2338 + if (blksz_bits < 9 || blksz_bits > 12) { 2338 2339 mlog(ML_ERROR, "found superblock with incorrect block " 2339 - "size: found %u, should be %u\n", 2340 - 1 << le32_to_cpu(di->id2.i_super.s_blocksize_bits), 2341 - blksz); 2340 + "size bits: found %u, should be 9, 10, 11, or 12\n", 2341 + blksz_bits); 2342 + } else if ((1 << le32_to_cpu(blksz_bits)) != blksz) { 2343 + mlog(ML_ERROR, "found superblock with incorrect block " 2344 + "size: found %u, should be %u\n", 1 << blksz_bits, blksz); 2342 2345 } else if (le16_to_cpu(di->id2.i_super.s_major_rev_level) != 2343 2346 OCFS2_MAJOR_REV_LEVEL || 2344 2347 le16_to_cpu(di->id2.i_super.s_minor_rev_level) !=
+1 -2
fs/ocfs2/xattr.c
··· 2036 2036 rc = 0; 2037 2037 ocfs2_xa_cleanup_value_truncate(loc, "removing", 2038 2038 orig_clusters); 2039 - if (rc) 2040 - goto out; 2039 + goto out; 2041 2040 } 2042 2041 } 2043 2042
+5 -4
fs/proc/vmcore.c
··· 457 457 #endif 458 458 } 459 459 460 - static const struct vm_operations_struct vmcore_mmap_ops = { 461 - .fault = mmap_vmcore_fault, 462 - }; 463 - 464 460 /** 465 461 * vmcore_alloc_buf - allocate buffer in vmalloc memory 466 462 * @size: size of buffer ··· 484 488 * virtually contiguous user-space in ELF layout. 485 489 */ 486 490 #ifdef CONFIG_MMU 491 + 492 + static const struct vm_operations_struct vmcore_mmap_ops = { 493 + .fault = mmap_vmcore_fault, 494 + }; 495 + 487 496 /* 488 497 * remap_oldmem_pfn_checked - do remap_oldmem_pfn_range replacing all pages 489 498 * reported as not being ram with the zero page.
+11 -3
fs/smb/client/connect.c
··· 1037 1037 */ 1038 1038 } 1039 1039 1040 + put_net(cifs_net_ns(server)); 1040 1041 kfree(server->leaf_fullpath); 1041 1042 kfree(server); 1042 1043 ··· 1635 1634 1636 1635 /* srv_count can never go negative */ 1637 1636 WARN_ON(server->srv_count < 0); 1638 - 1639 - put_net(cifs_net_ns(server)); 1640 1637 1641 1638 list_del_init(&server->tcp_ses_list); 1642 1639 spin_unlock(&cifs_tcp_ses_lock); ··· 3069 3070 if (server->ssocket) { 3070 3071 socket = server->ssocket; 3071 3072 } else { 3072 - rc = __sock_create(cifs_net_ns(server), sfamily, SOCK_STREAM, 3073 + struct net *net = cifs_net_ns(server); 3074 + struct sock *sk; 3075 + 3076 + rc = __sock_create(net, sfamily, SOCK_STREAM, 3073 3077 IPPROTO_TCP, &server->ssocket, 1); 3074 3078 if (rc < 0) { 3075 3079 cifs_server_dbg(VFS, "Error %d creating socket\n", rc); 3076 3080 return rc; 3077 3081 } 3082 + 3083 + sk = server->ssocket->sk; 3084 + __netns_tracker_free(net, &sk->ns_tracker, false); 3085 + sk->sk_net_refcnt = 1; 3086 + get_net_track(net, &sk->ns_tracker, GFP_KERNEL); 3087 + sock_inuse_add(net, 1); 3078 3088 3079 3089 /* BB other socket options to set KEEPALIVE, NODELAY? */ 3080 3090 cifs_dbg(FYI, "Socket created\n");
+1
fs/smb/server/connection.c
··· 70 70 atomic_set(&conn->req_running, 0); 71 71 atomic_set(&conn->r_count, 0); 72 72 atomic_set(&conn->refcnt, 1); 73 + atomic_set(&conn->mux_smb_requests, 0); 73 74 conn->total_credits = 1; 74 75 conn->outstanding_credits = 0; 75 76
+1
fs/smb/server/connection.h
··· 107 107 __le16 signing_algorithm; 108 108 bool binding; 109 109 atomic_t refcnt; 110 + atomic_t mux_smb_requests; 110 111 }; 111 112 112 113 struct ksmbd_conn_ops {
+10 -5
fs/smb/server/mgmt/user_session.c
··· 90 90 91 91 int ksmbd_session_rpc_open(struct ksmbd_session *sess, char *rpc_name) 92 92 { 93 - struct ksmbd_session_rpc *entry; 93 + struct ksmbd_session_rpc *entry, *old; 94 94 struct ksmbd_rpc_command *resp; 95 95 int method; 96 96 ··· 106 106 entry->id = ksmbd_ipc_id_alloc(); 107 107 if (entry->id < 0) 108 108 goto free_entry; 109 - xa_store(&sess->rpc_handle_list, entry->id, entry, GFP_KERNEL); 109 + old = xa_store(&sess->rpc_handle_list, entry->id, entry, GFP_KERNEL); 110 + if (xa_is_err(old)) 111 + goto free_id; 110 112 111 113 resp = ksmbd_rpc_open(sess, entry->id); 112 114 if (!resp) 113 - goto free_id; 115 + goto erase_xa; 114 116 115 117 kvfree(resp); 116 118 return entry->id; 117 - free_id: 119 + erase_xa: 118 120 xa_erase(&sess->rpc_handle_list, entry->id); 121 + free_id: 119 122 ksmbd_rpc_id_free(entry->id); 120 123 free_entry: 121 124 kfree(entry); ··· 178 175 unsigned long id; 179 176 struct ksmbd_session *sess; 180 177 178 + down_write(&sessions_table_lock); 181 179 down_write(&conn->session_lock); 182 180 xa_for_each(&conn->sessions, id, sess) { 183 181 if (atomic_read(&sess->refcnt) == 0 && ··· 192 188 } 193 189 } 194 190 up_write(&conn->session_lock); 191 + up_write(&sessions_table_lock); 195 192 } 196 193 197 194 int ksmbd_session_register(struct ksmbd_conn *conn, ··· 234 229 } 235 230 } 236 231 } 237 - up_write(&sessions_table_lock); 238 232 239 233 down_write(&conn->session_lock); 240 234 xa_for_each(&conn->sessions, id, sess) { ··· 253 249 } 254 250 } 255 251 up_write(&conn->session_lock); 252 + up_write(&sessions_table_lock); 256 253 } 257 254 258 255 struct ksmbd_session *ksmbd_session_lookup(struct ksmbd_conn *conn,
+12 -8
fs/smb/server/server.c
··· 238 238 } while (is_chained == true); 239 239 240 240 send: 241 - if (work->sess) 242 - ksmbd_user_session_put(work->sess); 243 241 if (work->tcon) 244 242 ksmbd_tree_connect_put(work->tcon); 245 243 smb3_preauth_hash_rsp(work); 244 + if (work->sess) 245 + ksmbd_user_session_put(work->sess); 246 246 if (work->sess && work->sess->enc && work->encrypted && 247 247 conn->ops->encrypt_resp) { 248 248 rc = conn->ops->encrypt_resp(work); ··· 270 270 271 271 ksmbd_conn_try_dequeue_request(work); 272 272 ksmbd_free_work_struct(work); 273 + atomic_dec(&conn->mux_smb_requests); 273 274 /* 274 275 * Checking waitqueue to dropping pending requests on 275 276 * disconnection. waitqueue_active is safe because it ··· 292 291 struct ksmbd_work *work; 293 292 int err; 294 293 294 + err = ksmbd_init_smb_server(conn); 295 + if (err) 296 + return 0; 297 + 298 + if (atomic_inc_return(&conn->mux_smb_requests) >= conn->vals->max_credits) { 299 + atomic_dec_return(&conn->mux_smb_requests); 300 + return -ENOSPC; 301 + } 302 + 295 303 work = ksmbd_alloc_work_struct(); 296 304 if (!work) { 297 305 pr_err("allocation for work failed\n"); ··· 310 300 work->conn = conn; 311 301 work->request_buf = conn->request_buf; 312 302 conn->request_buf = NULL; 313 - 314 - err = ksmbd_init_smb_server(work); 315 - if (err) { 316 - ksmbd_free_work_struct(work); 317 - return 0; 318 - } 319 303 320 304 ksmbd_conn_enqueue_request(work); 321 305 atomic_inc(&conn->r_count);
+7 -3
fs/smb/server/smb_common.c
··· 388 388 .set_rsp_status = set_smb1_rsp_status, 389 389 }; 390 390 391 + static struct smb_version_values smb1_server_values = { 392 + .max_credits = SMB2_MAX_CREDITS, 393 + }; 394 + 391 395 static int smb1_negotiate(struct ksmbd_work *work) 392 396 { 393 397 return ksmbd_smb_negotiate_common(work, SMB_COM_NEGOTIATE); ··· 403 399 404 400 static int init_smb1_server(struct ksmbd_conn *conn) 405 401 { 402 + conn->vals = &smb1_server_values; 406 403 conn->ops = &smb1_server_ops; 407 404 conn->cmds = smb1_server_cmds; 408 405 conn->max_cmds = ARRAY_SIZE(smb1_server_cmds); 409 406 return 0; 410 407 } 411 408 412 - int ksmbd_init_smb_server(struct ksmbd_work *work) 409 + int ksmbd_init_smb_server(struct ksmbd_conn *conn) 413 410 { 414 - struct ksmbd_conn *conn = work->conn; 415 411 __le32 proto; 416 412 417 - proto = *(__le32 *)((struct smb_hdr *)work->request_buf)->Protocol; 413 + proto = *(__le32 *)((struct smb_hdr *)conn->request_buf)->Protocol; 418 414 if (conn->need_neg == false) { 419 415 if (proto == SMB1_PROTO_NUMBER) 420 416 return -EINVAL;
+1 -1
fs/smb/server/smb_common.h
··· 427 427 428 428 int ksmbd_lookup_dialect_by_id(__le16 *cli_dialects, __le16 dialects_count); 429 429 430 - int ksmbd_init_smb_server(struct ksmbd_work *work); 430 + int ksmbd_init_smb_server(struct ksmbd_conn *conn); 431 431 432 432 struct ksmbd_kstat; 433 433 int ksmbd_populate_dot_dotdot_entries(struct ksmbd_work *work,
+2
include/acpi/processor.h
··· 465 465 extern int acpi_processor_ffh_lpi_enter(struct acpi_lpi_state *lpi); 466 466 #endif 467 467 468 + void acpi_processor_init_invariance_cppc(void); 469 + 468 470 #endif
-4
include/linux/arch_topology.h
··· 11 11 void topology_normalize_cpu_scale(void); 12 12 int topology_update_cpu_topology(void); 13 13 14 - #ifdef CONFIG_ACPI_CPPC_LIB 15 - void topology_init_cpu_capacity_cppc(void); 16 - #endif 17 - 18 14 struct device_node; 19 15 bool topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu); 20 16
+3 -29
include/linux/arm-smccc.h
··· 315 315 316 316 void __init arm_smccc_version_init(u32 version, enum arm_smccc_conduit conduit); 317 317 318 - extern u64 smccc_has_sve_hint; 319 - 320 318 /** 321 319 * arm_smccc_get_soc_id_version() 322 320 * ··· 413 415 }; 414 416 415 417 /** 416 - * __arm_smccc_sve_check() - Set the SVE hint bit when doing SMC calls 417 - * 418 - * Sets the SMCCC hint bit to indicate if there is live state in the SVE 419 - * registers, this modifies x0 in place and should never be called from C 420 - * code. 421 - */ 422 - asmlinkage unsigned long __arm_smccc_sve_check(unsigned long x0); 423 - 424 - /** 425 418 * __arm_smccc_smc() - make SMC calls 426 419 * @a0-a7: arguments passed in registers 0 to 7 427 420 * @res: result values from registers 0 to 3 ··· 476 487 477 488 #define SMCCC_SMC_INST __SMC(0) 478 489 #define SMCCC_HVC_INST __HVC(0) 479 - 480 - #endif 481 - 482 - /* nVHE hypervisor doesn't have a current thread so needs separate checks */ 483 - #if defined(CONFIG_ARM64_SVE) && !defined(__KVM_NVHE_HYPERVISOR__) 484 - 485 - #define SMCCC_SVE_CHECK ALTERNATIVE("nop \n", "bl __arm_smccc_sve_check \n", \ 486 - ARM64_SVE) 487 - #define smccc_sve_clobbers "x16", "x30", "cc", 488 - 489 - #else 490 - 491 - #define SMCCC_SVE_CHECK 492 - #define smccc_sve_clobbers 493 490 494 491 #endif 495 492 ··· 549 574 register unsigned long r3 asm("r3"); \ 550 575 CONCATENATE(__declare_arg_, \ 551 576 COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__); \ 552 - asm volatile(SMCCC_SVE_CHECK \ 553 - inst "\n" : \ 577 + asm volatile(inst "\n" : \ 554 578 "=r" (r0), "=r" (r1), "=r" (r2), "=r" (r3) \ 555 579 : CONCATENATE(__constraint_read_, \ 556 580 COUNT_ARGS(__VA_ARGS__)) \ 557 - : smccc_sve_clobbers "memory"); \ 581 + : "memory"); \ 558 582 if (___res) \ 559 583 *___res = (typeof(*___res)){r0, r1, r2, r3}; \ 560 584 } while (0) ··· 602 628 asm ("" : \ 603 629 : CONCATENATE(__constraint_read_, \ 604 630 COUNT_ARGS(__VA_ARGS__)) \ 605 - : smccc_sve_clobbers "memory"); \ 631 + : "memory"); \ 606 632 if (___res) \ 607 633 ___res->a0 = SMCCC_RET_NOT_SUPPORTED; \ 608 634 } while (0)
+7 -5
include/linux/memcontrol.h
··· 1760 1760 1761 1761 struct mem_cgroup *mem_cgroup_from_slab_obj(void *p); 1762 1762 1763 - static inline void count_objcg_event(struct obj_cgroup *objcg, 1764 - enum vm_event_item idx) 1763 + static inline void count_objcg_events(struct obj_cgroup *objcg, 1764 + enum vm_event_item idx, 1765 + unsigned long count) 1765 1766 { 1766 1767 struct mem_cgroup *memcg; 1767 1768 ··· 1771 1770 1772 1771 rcu_read_lock(); 1773 1772 memcg = obj_cgroup_memcg(objcg); 1774 - count_memcg_events(memcg, idx, 1); 1773 + count_memcg_events(memcg, idx, count); 1775 1774 rcu_read_unlock(); 1776 1775 } 1777 1776 ··· 1826 1825 return NULL; 1827 1826 } 1828 1827 1829 - static inline void count_objcg_event(struct obj_cgroup *objcg, 1830 - enum vm_event_item idx) 1828 + static inline void count_objcg_events(struct obj_cgroup *objcg, 1829 + enum vm_event_item idx, 1830 + unsigned long count) 1831 1831 { 1832 1832 } 1833 1833
+22 -6
include/linux/mman.h
··· 2 2 #ifndef _LINUX_MMAN_H 3 3 #define _LINUX_MMAN_H 4 4 5 + #include <linux/fs.h> 5 6 #include <linux/mm.h> 6 7 #include <linux/percpu_counter.h> 7 8 ··· 95 94 #endif 96 95 97 96 #ifndef arch_calc_vm_flag_bits 98 - #define arch_calc_vm_flag_bits(flags) 0 97 + #define arch_calc_vm_flag_bits(file, flags) 0 99 98 #endif 100 99 101 100 #ifndef arch_validate_prot ··· 152 151 * Combine the mmap "flags" argument into "vm_flags" used internally. 153 152 */ 154 153 static inline unsigned long 155 - calc_vm_flag_bits(unsigned long flags) 154 + calc_vm_flag_bits(struct file *file, unsigned long flags) 156 155 { 157 156 return _calc_vm_trans(flags, MAP_GROWSDOWN, VM_GROWSDOWN ) | 158 157 _calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) | 159 158 _calc_vm_trans(flags, MAP_SYNC, VM_SYNC ) | 160 159 _calc_vm_trans(flags, MAP_STACK, VM_NOHUGEPAGE) | 161 - arch_calc_vm_flag_bits(flags); 160 + arch_calc_vm_flag_bits(file, flags); 162 161 } 163 162 164 163 unsigned long vm_commit_limit(void); ··· 189 188 * 190 189 * d) mmap(PROT_READ | PROT_EXEC) 191 190 * mmap(PROT_READ | PROT_EXEC | PROT_BTI) 191 + * 192 + * This is only applicable if the user has set the Memory-Deny-Write-Execute 193 + * (MDWE) protection mask for the current process. 194 + * 195 + * @old specifies the VMA flags the VMA originally possessed, and @new the ones 196 + * we propose to set. 197 + * 198 + * Return: false if proposed change is OK, true if not ok and should be denied. 192 199 */ 193 - static inline bool map_deny_write_exec(struct vm_area_struct *vma, unsigned long vm_flags) 200 + static inline bool map_deny_write_exec(unsigned long old, unsigned long new) 194 201 { 202 + /* If MDWE is disabled, we have nothing to deny. */ 195 203 if (!test_bit(MMF_HAS_MDWE, &current->mm->flags)) 196 204 return false; 197 205 198 - if ((vm_flags & VM_EXEC) && (vm_flags & VM_WRITE)) 206 + /* If the new VMA is not executable, we have nothing to deny. */ 207 + if (!(new & VM_EXEC)) 208 + return false; 209 + 210 + /* Under MDWE we do not accept newly writably executable VMAs... */ 211 + if (new & VM_WRITE) 199 212 return true; 200 213 201 - if (!(vma->vm_flags & VM_EXEC) && (vm_flags & VM_EXEC)) 214 + /* ...nor previously non-executable VMAs becoming executable. */ 215 + if (!(old & VM_EXEC)) 202 216 return true; 203 217 204 218 return false;
+1
include/linux/mmzone.h
··· 823 823 unsigned long watermark_boost; 824 824 825 825 unsigned long nr_reserved_highatomic; 826 + unsigned long nr_free_highatomic; 826 827 827 828 /* 828 829 * We don't know if the memory that we're going to allocate will be
+3 -1
include/linux/sockptr.h
··· 77 77 { 78 78 if (optlen < ksize) 79 79 return -EINVAL; 80 - return copy_from_sockptr(dst, optval, ksize); 80 + if (copy_from_sockptr(dst, optval, ksize)) 81 + return -EFAULT; 82 + return 0; 81 83 } 82 84 83 85 static inline int copy_struct_from_sockptr(void *dst, size_t ksize,
+3
include/linux/tpm.h
··· 421 421 u8 tpm_buf_read_u8(struct tpm_buf *buf, off_t *offset); 422 422 u16 tpm_buf_read_u16(struct tpm_buf *buf, off_t *offset); 423 423 u32 tpm_buf_read_u32(struct tpm_buf *buf, off_t *offset); 424 + void tpm_buf_append_handle(struct tpm_chip *chip, struct tpm_buf *buf, u32 handle); 424 425 425 426 /* 426 427 * Check if TPM device is in the firmware upgrade mode. ··· 506 505 void tpm_buf_append_hmac_session(struct tpm_chip *chip, struct tpm_buf *buf, 507 506 u8 attributes, u8 *passphrase, 508 507 int passphraselen); 508 + void tpm_buf_append_auth(struct tpm_chip *chip, struct tpm_buf *buf, 509 + u8 attributes, u8 *passphrase, int passphraselen); 509 510 static inline void tpm_buf_append_hmac_session_opt(struct tpm_chip *chip, 510 511 struct tpm_buf *buf, 511 512 u8 attributes,
+2 -1
include/linux/user_namespace.h
··· 141 141 142 142 long inc_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v); 143 143 bool dec_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v); 144 - long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type); 144 + long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type, 145 + bool override_rlimit); 145 146 void dec_rlimit_put_ucounts(struct ucounts *ucounts, enum rlimit_type type); 146 147 bool is_rlimit_overlimit(struct ucounts *ucounts, enum rlimit_type type, unsigned long max); 147 148
+2
include/linux/vm_event_item.h
··· 134 134 #ifdef CONFIG_SWAP 135 135 SWAP_RA, 136 136 SWAP_RA_HIT, 137 + SWPIN_ZERO, 138 + SWPOUT_ZERO, 137 139 #ifdef CONFIG_KSM 138 140 KSM_SWPIN_COPY, 139 141 #endif
+2
include/net/bond_options.h
··· 161 161 #if IS_ENABLED(CONFIG_IPV6) 162 162 void bond_option_ns_ip6_targets_clear(struct bonding *bond); 163 163 #endif 164 + void bond_slave_ns_maddrs_add(struct bonding *bond, struct slave *slave); 165 + void bond_slave_ns_maddrs_del(struct bonding *bond, struct slave *slave); 164 166 165 167 #endif /* _NET_BOND_OPTIONS_H */
+10 -2
include/net/tls.h
··· 390 390 391 391 static inline bool tls_sw_has_ctx_tx(const struct sock *sk) 392 392 { 393 - struct tls_context *ctx = tls_get_ctx(sk); 393 + struct tls_context *ctx; 394 394 395 + if (!sk_is_inet(sk) || !inet_test_bit(IS_ICSK, sk)) 396 + return false; 397 + 398 + ctx = tls_get_ctx(sk); 395 399 if (!ctx) 396 400 return false; 397 401 return !!tls_sw_ctx_tx(ctx); ··· 403 399 404 400 static inline bool tls_sw_has_ctx_rx(const struct sock *sk) 405 401 { 406 - struct tls_context *ctx = tls_get_ctx(sk); 402 + struct tls_context *ctx; 407 403 404 + if (!sk_is_inet(sk) || !inet_test_bit(IS_ICSK, sk)) 405 + return false; 406 + 407 + ctx = tls_get_ctx(sk); 408 408 if (!ctx) 409 409 return false; 410 410 return !!tls_sw_ctx_rx(ctx);
+8 -5
kernel/sched/core.c
··· 5920 5920 5921 5921 #ifdef CONFIG_SCHED_CLASS_EXT 5922 5922 /* 5923 - * SCX requires a balance() call before every pick_next_task() including 5924 - * when waking up from SCHED_IDLE. If @start_class is below SCX, start 5925 - * from SCX instead. 5923 + * SCX requires a balance() call before every pick_task() including when 5924 + * waking up from SCHED_IDLE. If @start_class is below SCX, start from 5925 + * SCX instead. Also, set a flag to detect missing balance() call. 5926 5926 */ 5927 - if (scx_enabled() && sched_class_above(&ext_sched_class, start_class)) 5928 - start_class = &ext_sched_class; 5927 + if (scx_enabled()) { 5928 + rq->scx.flags |= SCX_RQ_BAL_PENDING; 5929 + if (sched_class_above(&ext_sched_class, start_class)) 5930 + start_class = &ext_sched_class; 5931 + } 5929 5932 #endif 5930 5933 5931 5934 /*
+32 -14
kernel/sched/ext.c
··· 2634 2634 2635 2635 lockdep_assert_rq_held(rq); 2636 2636 rq->scx.flags |= SCX_RQ_IN_BALANCE; 2637 - rq->scx.flags &= ~SCX_RQ_BAL_KEEP; 2637 + rq->scx.flags &= ~(SCX_RQ_BAL_PENDING | SCX_RQ_BAL_KEEP); 2638 2638 2639 2639 if (static_branch_unlikely(&scx_ops_cpu_preempt) && 2640 2640 unlikely(rq->scx.cpu_released)) { ··· 2948 2948 { 2949 2949 struct task_struct *prev = rq->curr; 2950 2950 struct task_struct *p; 2951 + bool prev_on_scx = prev->sched_class == &ext_sched_class; 2952 + bool keep_prev = rq->scx.flags & SCX_RQ_BAL_KEEP; 2953 + bool kick_idle = false; 2951 2954 2952 2955 /* 2953 - * If balance_scx() is telling us to keep running @prev, replenish slice 2954 - * if necessary and keep running @prev. Otherwise, pop the first one 2955 - * from the local DSQ. 2956 - * 2957 2956 * WORKAROUND: 2958 2957 * 2959 2958 * %SCX_RQ_BAL_KEEP should be set iff $prev is on SCX as it must just ··· 2961 2962 * which then ends up calling pick_task_scx() without preceding 2962 2963 * balance_scx(). 2963 2964 * 2964 - * For now, ignore cases where $prev is not on SCX. This isn't great and 2965 - * can theoretically lead to stalls. However, for switch_all cases, this 2966 - * happens only while a BPF scheduler is being loaded or unloaded, and, 2967 - * for partial cases, fair will likely keep triggering this CPU. 2965 + * Keep running @prev if possible and avoid stalling from entering idle 2966 + * without balancing. 2968 2967 * 2969 - * Once fair is fixed, restore WARN_ON_ONCE(). 2968 + * Once fair is fixed, remove the workaround and trigger WARN_ON_ONCE() 2969 + * if pick_task_scx() is called without preceding balance_scx(). 2970 2970 */ 2971 - if ((rq->scx.flags & SCX_RQ_BAL_KEEP) && 2972 - prev->sched_class == &ext_sched_class) { 2971 + if (unlikely(rq->scx.flags & SCX_RQ_BAL_PENDING)) { 2972 + if (prev_on_scx) { 2973 + keep_prev = true; 2974 + } else { 2975 + keep_prev = false; 2976 + kick_idle = true; 2977 + } 2978 + } else if (unlikely(keep_prev && !prev_on_scx)) { 2979 + /* only allowed during transitions */ 2980 + WARN_ON_ONCE(scx_ops_enable_state() == SCX_OPS_ENABLED); 2981 + keep_prev = false; 2982 + } 2983 + 2984 + /* 2985 + * If balance_scx() is telling us to keep running @prev, replenish slice 2986 + * if necessary and keep running @prev. Otherwise, pop the first one 2987 + * from the local DSQ. 2988 + */ 2989 + if (keep_prev) { 2973 2990 p = prev; 2974 2991 if (!p->scx.slice) 2975 2992 p->scx.slice = SCX_SLICE_DFL; 2976 2993 } else { 2977 2994 p = first_local_task(rq); 2978 - if (!p) 2995 + if (!p) { 2996 + if (kick_idle) 2997 + scx_bpf_kick_cpu(cpu_of(rq), SCX_KICK_IDLE); 2979 2998 return NULL; 2999 + } 2980 3000 2981 3001 if (unlikely(!p->scx.slice)) { 2982 3002 if (!scx_rq_bypassing(rq) && !scx_warned_zero_slice) { ··· 4997 4979 4998 4980 if (!cpumask_equal(housekeeping_cpumask(HK_TYPE_DOMAIN), 4999 4981 cpu_possible_mask)) { 5000 - pr_err("sched_ext: Not compatible with \"isolcpus=\" domain isolation"); 4982 + pr_err("sched_ext: Not compatible with \"isolcpus=\" domain isolation\n"); 5001 4983 return -EINVAL; 5002 4984 }
+3 -2
kernel/sched/sched.h
··· 751 751 */ 752 752 SCX_RQ_ONLINE = 1 << 0, 753 753 SCX_RQ_CAN_STOP_TICK = 1 << 1, 754 - SCX_RQ_BAL_KEEP = 1 << 2, /* balance decided to keep current */ 755 - SCX_RQ_BYPASSING = 1 << 3, 754 + SCX_RQ_BAL_PENDING = 1 << 2, /* balance hasn't run yet */ 755 + SCX_RQ_BAL_KEEP = 1 << 3, /* balance decided to keep current */ 756 + SCX_RQ_BYPASSING = 1 << 4, 756 757 757 758 SCX_RQ_IN_WAKEUP = 1 << 16, 758 759 SCX_RQ_IN_BALANCE = 1 << 17,
+2 -1
kernel/signal.c
··· 419 419 */ 420 420 rcu_read_lock(); 421 421 ucounts = task_ucounts(t); 422 - sigpending = inc_rlimit_get_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING); 422 + sigpending = inc_rlimit_get_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING, 423 + override_rlimit); 423 424 rcu_read_unlock(); 424 425 if (!sigpending) 425 426 return NULL;
+5 -4
kernel/ucount.c
··· 307 307 do_dec_rlimit_put_ucounts(ucounts, NULL, type); 308 308 } 309 309 310 - long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type) 310 + long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type, 311 + bool override_rlimit) 311 312 { 312 313 /* Caller must hold a reference to ucounts */ 313 314 struct ucounts *iter; ··· 318 317 for (iter = ucounts; iter; iter = iter->ns->ucounts) { 319 318 long new = atomic_long_add_return(1, &iter->rlimit[type]); 320 319 if (new < 0 || new > max) 321 - goto unwind; 320 + goto dec_unwind; 322 321 if (iter == ucounts) 323 322 ret = new; 324 - max = get_userns_rlimit_max(iter->ns, type); 323 + if (!override_rlimit) 324 + max = get_userns_rlimit_max(iter->ns, type); 325 325 /* 326 326 * Grab an extra ucount reference for the caller when 327 327 * the rlimit count was previously 0. ··· 336 334 dec_unwind: 337 335 dec = atomic_long_sub_return(1, &iter->rlimit[type]); 338 336 WARN_ON_ONCE(dec < 0); 339 - unwind: 340 337 do_dec_rlimit_put_ucounts(ucounts, iter, type); 341 338 return 0; 342 339 }
+12 -6
lib/objpool.c
··· 74 74 * warm caches and TLB hits. in default vmalloc is used to 75 75 * reduce the pressure of kernel slab system. as we know, 76 76 * mimimal size of vmalloc is one page since vmalloc would 77 - * always align the requested size to page size 77 + * always align the requested size to page size. 78 + * but if vmalloc fails or it is not available (e.g. GFP_ATOMIC) 79 + * allocate percpu slot with kmalloc. 78 80 */ 79 - if ((pool->gfp & GFP_ATOMIC) == GFP_ATOMIC) 80 - slot = kmalloc_node(size, pool->gfp, cpu_to_node(i)); 81 - else 81 + slot = NULL; 82 + 83 + if ((pool->gfp & (GFP_ATOMIC | GFP_KERNEL)) != GFP_ATOMIC) 82 84 slot = __vmalloc_node(size, sizeof(void *), pool->gfp, 83 85 cpu_to_node(i), __builtin_return_address(0)); 84 - if (!slot) 85 - return -ENOMEM; 86 + 87 + if (!slot) { 88 + slot = kmalloc_node(size, pool->gfp, cpu_to_node(i)); 89 + if (!slot) 90 + return -ENOMEM; 91 + } 86 92 memset(slot, 0, size); 87 93 pool->cpu_slots[i] = slot; 88 94
+28 -14
mm/damon/core.c
··· 1412 1412 damon_for_each_scheme(s, c) { 1413 1413 struct damos_quota *quota = &s->quota; 1414 1414 1415 - if (c->passed_sample_intervals != s->next_apply_sis) 1415 + if (c->passed_sample_intervals < s->next_apply_sis) 1416 1416 continue; 1417 1417 1418 1418 if (!s->wmarks.activated) ··· 1456 1456 unsigned long score) 1457 1457 { 1458 1458 const unsigned long goal = 10000; 1459 - unsigned long score_goal_diff = max(goal, score) - min(goal, score); 1460 - unsigned long score_goal_diff_bp = score_goal_diff * 10000 / goal; 1461 - unsigned long compensation = last_input * score_goal_diff_bp / 10000; 1462 1459 /* Set minimum input as 10000 to avoid compensation be zero */ 1463 1460 const unsigned long min_input = 10000; 1461 + unsigned long score_goal_diff, compensation; 1462 + bool over_achieving = score > goal; 1464 1463 1465 - if (goal > score) 1464 + if (score == goal) 1465 + return last_input; 1466 + if (score >= goal * 2) 1467 + return min_input; 1468 + 1469 + if (over_achieving) 1470 + score_goal_diff = score - goal; 1471 + else 1472 + score_goal_diff = goal - score; 1473 + 1474 + if (last_input < ULONG_MAX / score_goal_diff) 1475 + compensation = last_input * score_goal_diff / goal; 1476 + else 1477 + compensation = last_input / goal * score_goal_diff; 1478 + 1479 + if (over_achieving) 1480 + return max(last_input - compensation, min_input); 1481 + if (last_input < ULONG_MAX - compensation) 1466 1482 return last_input + compensation; 1467 - if (last_input > compensation + min_input) 1468 - return last_input - compensation; 1469 - return min_input; 1483 + return ULONG_MAX; 1470 1484 } 1471 1485 1472 1486 #ifdef CONFIG_PSI ··· 1636 1622 bool has_schemes_to_apply = false; 1637 1623 1638 1624 damon_for_each_scheme(s, c) { 1639 - if (c->passed_sample_intervals != s->next_apply_sis) 1625 + if (c->passed_sample_intervals < s->next_apply_sis) 1640 1626 continue; 1641 1627 1642 1628 if (!s->wmarks.activated) ··· 1656 1642 } 1657 1643 1658 1644 damon_for_each_scheme(s, c) { 1659 - if (c->passed_sample_intervals != s->next_apply_sis) 1645 + if (c->passed_sample_intervals < s->next_apply_sis) 1660 1646 continue; 1661 - s->next_apply_sis += 1647 + s->next_apply_sis = c->passed_sample_intervals + 1662 1648 (s->apply_interval_us ? s->apply_interval_us : 1663 1649 c->attrs.aggr_interval) / sample_interval; 1664 1650 } ··· 2014 2000 if (ctx->ops.check_accesses) 2015 2001 max_nr_accesses = ctx->ops.check_accesses(ctx); 2016 2002 2017 - if (ctx->passed_sample_intervals == next_aggregation_sis) { 2003 + if (ctx->passed_sample_intervals >= next_aggregation_sis) { 2018 2004 kdamond_merge_regions(ctx, 2019 2005 max_nr_accesses / 10, 2020 2006 sz_limit); ··· 2032 2018 2033 2019 sample_interval = ctx->attrs.sample_interval ? 2034 2020 ctx->attrs.sample_interval : 1; 2035 - if (ctx->passed_sample_intervals == next_aggregation_sis) { 2021 + if (ctx->passed_sample_intervals >= next_aggregation_sis) { 2036 2022 ctx->next_aggregation_sis = next_aggregation_sis + 2037 2023 ctx->attrs.aggr_interval / sample_interval; 2038 2024 ··· 2042 2028 ctx->ops.reset_aggregated(ctx); 2043 2029 } 2044 2030 2045 - if (ctx->passed_sample_intervals == next_ops_update_sis) { 2031 + if (ctx->passed_sample_intervals >= next_ops_update_sis) { 2046 2032 ctx->next_ops_update_sis = next_ops_update_sis + 2047 2033 ctx->attrs.ops_update_interval / 2048 2034 sample_interval;
+1 -1
mm/filemap.c
··· 2625 2625 if (unlikely(!iov_iter_count(iter))) 2626 2626 return 0; 2627 2627 2628 - iov_iter_truncate(iter, inode->i_sb->s_maxbytes); 2628 + iov_iter_truncate(iter, inode->i_sb->s_maxbytes - iocb->ki_pos); 2629 2629 folio_batch_init(&fbatch); 2630 2630 2631 2631 do {
+77 -39
mm/gup.c
··· 2273 2273 #endif /* CONFIG_ELF_CORE */ 2274 2274 2275 2275 #ifdef CONFIG_MIGRATION 2276 + 2277 + /* 2278 + * An array of either pages or folios ("pofs"). Although it may seem tempting to 2279 + * avoid this complication, by simply interpreting a list of folios as a list of 2280 + * pages, that approach won't work in the longer term, because eventually the 2281 + * layouts of struct page and struct folio will become completely different. 2282 + * Furthermore, this pof approach avoids excessive page_folio() calls. 2283 + */ 2284 + struct pages_or_folios { 2285 + union { 2286 + struct page **pages; 2287 + struct folio **folios; 2288 + void **entries; 2289 + }; 2290 + bool has_folios; 2291 + long nr_entries; 2292 + }; 2293 + 2294 + static struct folio *pofs_get_folio(struct pages_or_folios *pofs, long i) 2295 + { 2296 + if (pofs->has_folios) 2297 + return pofs->folios[i]; 2298 + return page_folio(pofs->pages[i]); 2299 + } 2300 + 2301 + static void pofs_clear_entry(struct pages_or_folios *pofs, long i) 2302 + { 2303 + pofs->entries[i] = NULL; 2304 + } 2305 + 2306 + static void pofs_unpin(struct pages_or_folios *pofs) 2307 + { 2308 + if (pofs->has_folios) 2309 + unpin_folios(pofs->folios, pofs->nr_entries); 2310 + else 2311 + unpin_user_pages(pofs->pages, pofs->nr_entries); 2312 + } 2313 + 2276 2314 /* 2277 2315 * Returns the number of collected folios. Return value is always >= 0. 2278 2316 */ 2279 2317 static unsigned long collect_longterm_unpinnable_folios( 2280 - struct list_head *movable_folio_list, 2281 - unsigned long nr_folios, 2282 - struct folio **folios) 2318 + struct list_head *movable_folio_list, 2319 + struct pages_or_folios *pofs) 2283 2320 { 2284 2321 unsigned long i, collected = 0; 2285 2322 struct folio *prev_folio = NULL; 2286 2323 bool drain_allow = true; 2287 2324 2288 - for (i = 0; i < nr_folios; i++) { 2289 - struct folio *folio = folios[i]; 2325 + for (i = 0; i < pofs->nr_entries; i++) { 2326 + struct folio *folio = pofs_get_folio(pofs, i); 2290 2327 2291 2328 if (folio == prev_folio) 2292 2329 continue; ··· 2364 2327 * Returns -EAGAIN if all folios were successfully migrated or -errno for 2365 2328 * failure (or partial success). 2366 2329 */ 2367 - static int migrate_longterm_unpinnable_folios( 2368 - struct list_head *movable_folio_list, 2369 - unsigned long nr_folios, 2370 - struct folio **folios) 2330 + static int 2331 + migrate_longterm_unpinnable_folios(struct list_head *movable_folio_list, 2332 + struct pages_or_folios *pofs) 2371 2333 { 2372 2334 int ret; 2373 2335 unsigned long i; 2374 2336 2375 - for (i = 0; i < nr_folios; i++) { 2376 - struct folio *folio = folios[i]; 2337 + for (i = 0; i < pofs->nr_entries; i++) { 2338 + struct folio *folio = pofs_get_folio(pofs, i); 2377 2339 2378 2340 if (folio_is_device_coherent(folio)) { 2379 2341 /* ··· 2380 2344 * convert the pin on the source folio to a normal 2381 2345 * reference. 2382 2346 */ 2383 - folios[i] = NULL; 2347 + pofs_clear_entry(pofs, i); 2384 2348 folio_get(folio); 2385 2349 gup_put_folio(folio, 1, FOLL_PIN); 2386 2350 ··· 2399 2363 * calling folio_isolate_lru() which takes a reference so the 2400 2364 * folio won't be freed if it's migrating. 2401 2365 */ 2402 - unpin_folio(folios[i]); 2403 - folios[i] = NULL; 2366 + unpin_folio(folio); 2367 + pofs_clear_entry(pofs, i); 2404 2368 } 2405 2369 2406 2370 if (!list_empty(movable_folio_list)) { ··· 2423 2387 return -EAGAIN; 2424 2388 2425 2389 err: 2426 - unpin_folios(folios, nr_folios); 2390 + pofs_unpin(pofs); 2427 2391 putback_movable_pages(movable_folio_list); 2428 2392 2429 2393 return ret; 2394 + } 2395 + 2396 + static long 2397 + check_and_migrate_movable_pages_or_folios(struct pages_or_folios *pofs) 2398 + { 2399 + LIST_HEAD(movable_folio_list); 2400 + unsigned long collected; 2401 + 2402 + collected = collect_longterm_unpinnable_folios(&movable_folio_list, 2403 + pofs); 2404 + if (!collected) 2405 + return 0; 2406 + 2407 + return migrate_longterm_unpinnable_folios(&movable_folio_list, pofs); 2430 2408 } 2431 2409 2432 2410 /* ··· 2467 2417 static long check_and_migrate_movable_folios(unsigned long nr_folios, 2468 2418 struct folio **folios) 2469 2419 { 2470 - unsigned long collected; 2471 - LIST_HEAD(movable_folio_list); 2420 + struct pages_or_folios pofs = { 2421 + .folios = folios, 2422 + .has_folios = true, 2423 + .nr_entries = nr_folios, 2424 + }; 2472 2425 2473 - collected = collect_longterm_unpinnable_folios(&movable_folio_list, 2474 - nr_folios, folios); 2475 - if (!collected) 2476 - return 0; 2477 - 2478 - return migrate_longterm_unpinnable_folios(&movable_folio_list, 2479 - nr_folios, folios); 2426 + return check_and_migrate_movable_pages_or_folios(&pofs); 2480 2427 } 2481 2428 2482 2429 /* ··· 2483 2436 static long check_and_migrate_movable_pages(unsigned long nr_pages, 2484 2437 struct page **pages) 2485 2438 { 2486 - struct folio **folios; 2487 - long i, ret; 2439 + struct pages_or_folios pofs = { 2440 + .pages = pages, 2441 + .has_folios = false, 2442 + .nr_entries = nr_pages, 2443 + }; 2488 2444 2489 - folios = kmalloc_array(nr_pages, sizeof(*folios), GFP_KERNEL); 2490 - if (!folios) { 2491 - unpin_user_pages(pages, nr_pages); 2492 - return -ENOMEM; 2493 - } 2494 - 2495 - for (i = 0; i < nr_pages; i++) 2496 - folios[i] = page_folio(pages[i]); 2497 - 2498 - ret = check_and_migrate_movable_folios(nr_pages, folios); 2499 - 2500 - kfree(folios); 2501 - return ret; 2445 + return check_and_migrate_movable_pages_or_folios(&pofs); 2502 2446 } 2503 2447 #else 2504 2448 static long check_and_migrate_movable_pages(unsigned long nr_pages,
+46 -14
mm/huge_memory.c
··· 3588 3588 return split_huge_page_to_list_to_order(&folio->page, list, ret); 3589 3589 } 3590 3590 3591 - void __folio_undo_large_rmappable(struct folio *folio) 3591 + /* 3592 + * __folio_unqueue_deferred_split() is not to be called directly: 3593 + * the folio_unqueue_deferred_split() inline wrapper in mm/internal.h 3594 + * limits its calls to those folios which may have a _deferred_list for 3595 + * queueing THP splits, and that list is (racily observed to be) non-empty. 3596 + * 3597 + * It is unsafe to call folio_unqueue_deferred_split() until folio refcount is 3598 + * zero: because even when split_queue_lock is held, a non-empty _deferred_list 3599 + * might be in use on deferred_split_scan()'s unlocked on-stack list. 3600 + * 3601 + * If memory cgroups are enabled, split_queue_lock is in the mem_cgroup: it is 3602 + * therefore important to unqueue deferred split before changing folio memcg. 3603 + */ 3604 + bool __folio_unqueue_deferred_split(struct folio *folio) 3592 3605 { 3593 3606 struct deferred_split *ds_queue; 3594 3607 unsigned long flags; 3608 + bool unqueued = false; 3609 + 3610 + WARN_ON_ONCE(folio_ref_count(folio)); 3611 + WARN_ON_ONCE(!mem_cgroup_disabled() && !folio_memcg(folio)); 3595 3612 3596 3613 ds_queue = get_deferred_split_queue(folio); 3597 3614 spin_lock_irqsave(&ds_queue->split_queue_lock, flags); ··· 3620 3603 MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1); 3621 3604 } 3622 3605 list_del_init(&folio->_deferred_list); 3606 + unqueued = true; 3623 3607 } 3624 3608 spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags); 3609 + 3610 + return unqueued; /* useful for debug warnings */ 3625 3611 } 3626 3612 3627 3613 /* partially_mapped=false won't clear PG_partially_mapped folio flag */ ··· 3647 3627 return; 3648 3628 3649 3629 /* 3650 - * The try_to_unmap() in page reclaim path might reach here too, 3651 - * this may cause a race condition to corrupt deferred split queue. 3652 - * And, if page reclaim is already handling the same folio, it is 3653 - * unnecessary to handle it again in shrinker. 3654 - * 3655 - * Check the swapcache flag to determine if the folio is being 3656 - * handled by page reclaim since THP swap would add the folio into 3657 - * swap cache before calling try_to_unmap(). 3630 + * Exclude swapcache: originally to avoid a corrupt deferred split 3631 + * queue. Nowadays that is fully prevented by mem_cgroup_swapout(); 3632 + * but if page reclaim is already handling the same folio, it is 3633 + * unnecessary to handle it again in the shrinker, so excluding 3634 + * swapcache here may still be a useful optimization. 3658 3635 */ 3659 3636 if (folio_test_swapcache(folio)) 3660 3637 return; ··· 3735 3718 struct deferred_split *ds_queue = &pgdata->deferred_split_queue; 3736 3719 unsigned long flags; 3737 3720 LIST_HEAD(list); 3738 - struct folio *folio, *next; 3739 - int split = 0; 3721 + struct folio *folio, *next, *prev = NULL; 3722 + int split = 0, removed = 0; 3740 3723 3741 3724 #ifdef CONFIG_MEMCG 3742 3725 if (sc->memcg) ··· 3790 3773 * in the case it was underused, then consider it used and 3791 3774 * don't add it back to split_queue. 3792 3775 */ 3793 - if (!did_split && !folio_test_partially_mapped(folio)) { 3776 + if (did_split) { 3777 + ; /* folio already removed from list */ 3778 + } else if (!folio_test_partially_mapped(folio)) { 3794 3779 list_del_init(&folio->_deferred_list); 3795 - ds_queue->split_queue_len--; 3780 + removed++; 3781 + } else { 3782 + /* 3783 + * That unlocked list_del_init() above would be unsafe, 3784 + * unless its folio is separated from any earlier folios 3785 + * left on the list (which may be concurrently unqueued) 3786 + * by one safe folio with refcount still raised. 3787 + */ 3788 + swap(folio, prev); 3796 3789 } 3797 - folio_put(folio); 3790 + if (folio) 3791 + folio_put(folio); 3798 3792 } 3799 3793 3800 3794 spin_lock_irqsave(&ds_queue->split_queue_lock, flags); 3801 3795 list_splice_tail(&list, &ds_queue->split_queue); 3796 + ds_queue->split_queue_len -= removed; 3802 3797 spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags); 3798 + 3799 + if (prev) 3800 + folio_put(prev); 3803 3801 3804 3802 /* 3805 3803 * Stop shrinker if we didn't split any page, but the queue is empty.
+50 -5
mm/internal.h
··· 108 108 return (void *)(mapping & ~PAGE_MAPPING_FLAGS); 109 109 } 110 110 111 + /* 112 + * This is a file-backed mapping, and is about to be memory mapped - invoke its 113 + * mmap hook and safely handle error conditions. On error, VMA hooks will be 114 + * mutated. 115 + * 116 + * @file: File which backs the mapping. 117 + * @vma: VMA which we are mapping. 118 + * 119 + * Returns: 0 if success, error otherwise. 120 + */ 121 + static inline int mmap_file(struct file *file, struct vm_area_struct *vma) 122 + { 123 + int err = call_mmap(file, vma); 124 + 125 + if (likely(!err)) 126 + return 0; 127 + 128 + /* 129 + * OK, we tried to call the file hook for mmap(), but an error 130 + * arose. The mapping is in an inconsistent state and we most not invoke 131 + * any further hooks on it. 132 + */ 133 + vma->vm_ops = &vma_dummy_vm_ops; 134 + 135 + return err; 136 + } 137 + 138 + /* 139 + * If the VMA has a close hook then close it, and since closing it might leave 140 + * it in an inconsistent state which makes the use of any hooks suspect, clear 141 + * them down by installing dummy empty hooks. 142 + */ 143 + static inline void vma_close(struct vm_area_struct *vma) 144 + { 145 + if (vma->vm_ops && vma->vm_ops->close) { 146 + vma->vm_ops->close(vma); 147 + 148 + /* 149 + * The mapping is in an inconsistent state, and no further hooks 150 + * may be invoked upon it. 151 + */ 152 + vma->vm_ops = &vma_dummy_vm_ops; 153 + } 154 + } 155 + 111 156 #ifdef CONFIG_MMU 112 157 113 158 /* Flags for folio_pte_batch(). */ ··· 684 639 #endif 685 640 } 686 641 687 - void __folio_undo_large_rmappable(struct folio *folio); 688 - static inline void folio_undo_large_rmappable(struct folio *folio) 642 + bool __folio_unqueue_deferred_split(struct folio *folio); 643 + static inline bool folio_unqueue_deferred_split(struct folio *folio) 689 644 { 690 645 if (folio_order(folio) <= 1 || !folio_test_large_rmappable(folio)) 691 - return; 646 + return false; 692 647 693 648 /* 694 649 * At this point, there is no one trying to add the folio to ··· 696 651 * to check without acquiring the split_queue_lock. 697 652 */ 698 653 if (data_race(list_empty(&folio->_deferred_list))) 699 - return; 654 + return false; 700 655 701 - __folio_undo_large_rmappable(folio); 656 + return __folio_unqueue_deferred_split(folio); 702 657 } 703 658 704 659 static inline struct folio *page_rmappable_folio(struct page *page)
+25
mm/memcontrol-v1.c
··· 848 848 css_get(&to->css); 849 849 css_put(&from->css); 850 850 851 + /* Warning should never happen, so don't worry about refcount non-0 */ 852 + WARN_ON_ONCE(folio_unqueue_deferred_split(folio)); 851 853 folio->memcg_data = (unsigned long)to; 852 854 853 855 __folio_memcg_unlock(from); ··· 1219 1217 enum mc_target_type target_type; 1220 1218 union mc_target target; 1221 1219 struct folio *folio; 1220 + bool tried_split_before = false; 1222 1221 1222 + retry_pmd: 1223 1223 ptl = pmd_trans_huge_lock(pmd, vma); 1224 1224 if (ptl) { 1225 1225 if (mc.precharge < HPAGE_PMD_NR) { ··· 1231 1227 target_type = get_mctgt_type_thp(vma, addr, *pmd, &target); 1232 1228 if (target_type == MC_TARGET_PAGE) { 1233 1229 folio = target.folio; 1230 + /* 1231 + * Deferred split queue locking depends on memcg, 1232 + * and unqueue is unsafe unless folio refcount is 0: 1233 + * split or skip if on the queue? first try to split. 1234 + */ 1235 + if (!list_empty(&folio->_deferred_list)) { 1236 + spin_unlock(ptl); 1237 + if (!tried_split_before) 1238 + split_folio(folio); 1239 + folio_unlock(folio); 1240 + folio_put(folio); 1241 + if (tried_split_before) 1242 + return 0; 1243 + tried_split_before = true; 1244 + goto retry_pmd; 1245 + } 1246 + /* 1247 + * So long as that pmd lock is held, the folio cannot 1248 + * be racily added to the _deferred_list, because 1249 + * __folio_remove_rmap() will find !partially_mapped. 1250 + */ 1234 1251 if (folio_isolate_lru(folio)) { 1235 1252 if (!mem_cgroup_move_account(folio, true, 1236 1253 mc.from, mc.to)) {
+9 -4
mm/memcontrol.c
··· 431 431 PGDEACTIVATE, 432 432 PGLAZYFREE, 433 433 PGLAZYFREED, 434 + #ifdef CONFIG_SWAP 435 + SWPIN_ZERO, 436 + SWPOUT_ZERO, 437 + #endif 434 438 #ifdef CONFIG_ZSWAP 435 439 ZSWPIN, 436 440 ZSWPOUT, ··· 4633 4629 struct obj_cgroup *objcg; 4634 4630 4635 4631 VM_BUG_ON_FOLIO(folio_test_lru(folio), folio); 4636 - VM_BUG_ON_FOLIO(folio_order(folio) > 1 && 4637 - !folio_test_hugetlb(folio) && 4638 - !list_empty(&folio->_deferred_list) && 4639 - folio_test_partially_mapped(folio), folio); 4640 4632 4641 4633 /* 4642 4634 * Nobody should be changing or seriously looking at ··· 4679 4679 ug->nr_memory += nr_pages; 4680 4680 ug->pgpgout++; 4681 4681 4682 + WARN_ON_ONCE(folio_unqueue_deferred_split(folio)); 4682 4683 folio->memcg_data = 0; 4683 4684 } 4684 4685 ··· 4791 4790 4792 4791 /* Transfer the charge and the css ref */ 4793 4792 commit_charge(new, memcg); 4793 + 4794 + /* Warning should never happen, so don't worry about refcount non-0 */ 4795 + WARN_ON_ONCE(folio_unqueue_deferred_split(old)); 4794 4796 old->memcg_data = 0; 4795 4797 } 4796 4798 ··· 4980 4976 VM_BUG_ON_FOLIO(oldid, folio); 4981 4977 mod_memcg_state(swap_memcg, MEMCG_SWAP, nr_entries); 4982 4978 4979 + folio_unqueue_deferred_split(folio); 4983 4980 folio->memcg_data = 0; 4984 4981 4985 4982 if (!mem_cgroup_is_root(memcg))
+2 -2
mm/migrate.c
··· 490 490 folio_test_large_rmappable(folio)) { 491 491 if (!folio_ref_freeze(folio, expected_count)) 492 492 return -EAGAIN; 493 - folio_undo_large_rmappable(folio); 493 + folio_unqueue_deferred_split(folio); 494 494 folio_ref_unfreeze(folio, expected_count); 495 495 } 496 496 ··· 515 515 } 516 516 517 517 /* Take off deferred split queue while frozen and memcg set */ 518 - folio_undo_large_rmappable(folio); 518 + folio_unqueue_deferred_split(folio); 519 519 520 520 /* 521 521 * Now we know that no one else is looking at the folio:
+6 -3
mm/mlock.c
··· 725 725 } 726 726 727 727 for_each_vma(vmi, vma) { 728 + int error; 728 729 vm_flags_t newflags; 729 730 730 731 newflags = vma->vm_flags & ~VM_LOCKED_MASK; 731 732 newflags |= to_add; 732 733 733 - /* Ignore errors */ 734 - mlock_fixup(&vmi, vma, &prev, vma->vm_start, vma->vm_end, 735 - newflags); 734 + error = mlock_fixup(&vmi, vma, &prev, vma->vm_start, vma->vm_end, 735 + newflags); 736 + /* Ignore errors, but prev needs fixing up. */ 737 + if (error) 738 + prev = vma; 736 739 cond_resched(); 737 740 } 738 741 out:
+70 -60
mm/mmap.c
··· 344 344 * to. we assume access permissions have been handled by the open 345 345 * of the memory object, so we don't do any here. 346 346 */ 347 - vm_flags |= calc_vm_prot_bits(prot, pkey) | calc_vm_flag_bits(flags) | 347 + vm_flags |= calc_vm_prot_bits(prot, pkey) | calc_vm_flag_bits(file, flags) | 348 348 mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC; 349 349 350 350 /* Obtain the address to map to. we verify (or select) it and ensure ··· 1358 1358 return do_vmi_munmap(&vmi, mm, start, len, uf, false); 1359 1359 } 1360 1360 1361 - unsigned long mmap_region(struct file *file, unsigned long addr, 1361 + static unsigned long __mmap_region(struct file *file, unsigned long addr, 1362 1362 unsigned long len, vm_flags_t vm_flags, unsigned long pgoff, 1363 1363 struct list_head *uf) 1364 1364 { 1365 1365 struct mm_struct *mm = current->mm; 1366 1366 struct vm_area_struct *vma = NULL; 1367 1367 pgoff_t pglen = PHYS_PFN(len); 1368 - struct vm_area_struct *merge; 1369 1368 unsigned long charged = 0; 1370 1369 struct vma_munmap_struct vms; 1371 1370 struct ma_state mas_detach; 1372 1371 struct maple_tree mt_detach; 1373 1372 unsigned long end = addr + len; 1374 - bool writable_file_mapping = false; 1375 1373 int error; 1376 1374 VMA_ITERATOR(vmi, mm, addr); 1377 1375 VMG_STATE(vmg, mm, &vmi, addr, end, vm_flags, pgoff); ··· 1420 1422 /* 1421 1423 * clear PTEs while the vma is still in the tree so that rmap 1422 1424 * cannot race with the freeing later in the truncate scenario. 1423 - * This is also needed for call_mmap(), which is why vm_ops 1425 + * This is also needed for mmap_file(), which is why vm_ops 1424 1426 * close function is called. 
1425 1427 */ 1426 1428 vms_clean_up_area(&vms, &mas_detach); ··· 1443 1445 vm_flags_init(vma, vm_flags); 1444 1446 vma->vm_page_prot = vm_get_page_prot(vm_flags); 1445 1447 1448 + if (vma_iter_prealloc(&vmi, vma)) { 1449 + error = -ENOMEM; 1450 + goto free_vma; 1451 + } 1452 + 1446 1453 if (file) { 1447 1454 vma->vm_file = get_file(file); 1448 - error = call_mmap(file, vma); 1455 + error = mmap_file(file, vma); 1449 1456 if (error) 1450 - goto unmap_and_free_vma; 1457 + goto unmap_and_free_file_vma; 1451 1458 1452 - if (vma_is_shared_maywrite(vma)) { 1453 - error = mapping_map_writable(file->f_mapping); 1454 - if (error) 1455 - goto close_and_free_vma; 1456 - 1457 - writable_file_mapping = true; 1458 - } 1459 - 1459 + /* Drivers cannot alter the address of the VMA. */ 1460 + WARN_ON_ONCE(addr != vma->vm_start); 1460 1461 /* 1461 - * Expansion is handled above, merging is handled below. 1462 - * Drivers should not alter the address of the VMA. 1462 + * Drivers should not permit writability when previously it was 1463 + * disallowed. 1463 1464 */ 1464 - if (WARN_ON((addr != vma->vm_start))) { 1465 - error = -EINVAL; 1466 - goto close_and_free_vma; 1467 - } 1465 + VM_WARN_ON_ONCE(vm_flags != vma->vm_flags && 1466 + !(vm_flags & VM_MAYWRITE) && 1467 + (vma->vm_flags & VM_MAYWRITE)); 1468 1468 1469 1469 vma_iter_config(&vmi, addr, end); 1470 1470 /* 1471 - * If vm_flags changed after call_mmap(), we should try merge 1471 + * If vm_flags changed after mmap_file(), we should try merge 1472 1472 * vma again as we may succeed this time. 1473 1473 */ 1474 1474 if (unlikely(vm_flags != vma->vm_flags && vmg.prev)) { 1475 + struct vm_area_struct *merge; 1476 + 1475 1477 vmg.flags = vma->vm_flags; 1476 1478 /* If this fails, state is reset ready for a reattempt. */ 1477 1479 merge = vma_merge_new_range(&vmg); ··· 1489 1491 vma = merge; 1490 1492 /* Update vm_flags to pick up the change. 
*/ 1491 1493 vm_flags = vma->vm_flags; 1492 - goto unmap_writable; 1494 + goto file_expanded; 1493 1495 } 1494 1496 vma_iter_config(&vmi, addr, end); 1495 1497 } ··· 1498 1500 } else if (vm_flags & VM_SHARED) { 1499 1501 error = shmem_zero_setup(vma); 1500 1502 if (error) 1501 - goto free_vma; 1503 + goto free_iter_vma; 1502 1504 } else { 1503 1505 vma_set_anonymous(vma); 1504 1506 } 1505 1507 1506 - if (map_deny_write_exec(vma, vma->vm_flags)) { 1507 - error = -EACCES; 1508 - goto close_and_free_vma; 1509 - } 1510 - 1511 - /* Allow architectures to sanity-check the vm_flags */ 1512 - if (!arch_validate_flags(vma->vm_flags)) { 1513 - error = -EINVAL; 1514 - goto close_and_free_vma; 1515 - } 1516 - 1517 - if (vma_iter_prealloc(&vmi, vma)) { 1518 - error = -ENOMEM; 1519 - goto close_and_free_vma; 1520 - } 1508 + #ifdef CONFIG_SPARC64 1509 + /* TODO: Fix SPARC ADI! */ 1510 + WARN_ON_ONCE(!arch_validate_flags(vm_flags)); 1511 + #endif 1521 1512 1522 1513 /* Lock the VMA since it is modified after insertion into VMA tree */ 1523 1514 vma_start_write(vma); ··· 1520 1533 */ 1521 1534 khugepaged_enter_vma(vma, vma->vm_flags); 1522 1535 1523 - /* Once vma denies write, undo our temporary denial count */ 1524 - unmap_writable: 1525 - if (writable_file_mapping) 1526 - mapping_unmap_writable(file->f_mapping); 1536 + file_expanded: 1527 1537 file = vma->vm_file; 1528 1538 ksm_add_vma(vma); 1529 1539 expanded: ··· 1553 1569 1554 1570 vma_set_page_prot(vma); 1555 1571 1556 - validate_mm(mm); 1557 1572 return addr; 1558 1573 1559 - close_and_free_vma: 1560 - if (file && !vms.closed_vm_ops && vma->vm_ops && vma->vm_ops->close) 1561 - vma->vm_ops->close(vma); 1574 + unmap_and_free_file_vma: 1575 + fput(vma->vm_file); 1576 + vma->vm_file = NULL; 1562 1577 1563 - if (file || vma->vm_file) { 1564 - unmap_and_free_vma: 1565 - fput(vma->vm_file); 1566 - vma->vm_file = NULL; 1567 - 1568 - vma_iter_set(&vmi, vma->vm_end); 1569 - /* Undo any partial mapping done by a device driver. 
*/ 1570 - unmap_region(&vmi.mas, vma, vmg.prev, vmg.next); 1571 - } 1572 - if (writable_file_mapping) 1573 - mapping_unmap_writable(file->f_mapping); 1578 + vma_iter_set(&vmi, vma->vm_end); 1579 + /* Undo any partial mapping done by a device driver. */ 1580 + unmap_region(&vmi.mas, vma, vmg.prev, vmg.next); 1581 + free_iter_vma: 1582 + vma_iter_free(&vmi); 1574 1583 free_vma: 1575 1584 vm_area_free(vma); 1576 1585 unacct_error: ··· 1573 1596 abort_munmap: 1574 1597 vms_abort_munmap_vmas(&vms, &mas_detach); 1575 1598 gather_failed: 1576 - validate_mm(mm); 1577 1599 return error; 1600 + } 1601 + 1602 + unsigned long mmap_region(struct file *file, unsigned long addr, 1603 + unsigned long len, vm_flags_t vm_flags, unsigned long pgoff, 1604 + struct list_head *uf) 1605 + { 1606 + unsigned long ret; 1607 + bool writable_file_mapping = false; 1608 + 1609 + /* Check to see if MDWE is applicable. */ 1610 + if (map_deny_write_exec(vm_flags, vm_flags)) 1611 + return -EACCES; 1612 + 1613 + /* Allow architectures to sanity-check the vm_flags. */ 1614 + if (!arch_validate_flags(vm_flags)) 1615 + return -EINVAL; 1616 + 1617 + /* Map writable and ensure this isn't a sealed memfd. */ 1618 + if (file && is_shared_maywrite(vm_flags)) { 1619 + int error = mapping_map_writable(file->f_mapping); 1620 + 1621 + if (error) 1622 + return error; 1623 + writable_file_mapping = true; 1624 + } 1625 + 1626 + ret = __mmap_region(file, addr, len, vm_flags, pgoff, uf); 1627 + 1628 + /* Clear our write mapping regardless of error. 
*/ 1629 + if (writable_file_mapping) 1630 + mapping_unmap_writable(file->f_mapping); 1631 + 1632 + validate_mm(current->mm); 1633 + return ret; 1578 1634 } 1579 1635 1580 1636 static int __vm_munmap(unsigned long start, size_t len, bool unlock) ··· 1944 1934 do { 1945 1935 if (vma->vm_flags & VM_ACCOUNT) 1946 1936 nr_accounted += vma_pages(vma); 1947 - remove_vma(vma, /* unreachable = */ true, /* closed = */ false); 1937 + remove_vma(vma, /* unreachable = */ true); 1948 1938 count++; 1949 1939 cond_resched(); 1950 1940 vma = vma_next(&vmi);
+1 -1
mm/mprotect.c
··· 810 810 break; 811 811 } 812 812 813 - if (map_deny_write_exec(vma, newflags)) { 813 + if (map_deny_write_exec(vma->vm_flags, newflags)) { 814 814 error = -EACCES; 815 815 break; 816 816 }
+5 -6
mm/nommu.c
··· 573 573 VMA_ITERATOR(vmi, vma->vm_mm, vma->vm_start); 574 574 575 575 vma_iter_config(&vmi, vma->vm_start, vma->vm_end); 576 - if (vma_iter_prealloc(&vmi, vma)) { 576 + if (vma_iter_prealloc(&vmi, NULL)) { 577 577 pr_warn("Allocation of vma tree for process %d failed\n", 578 578 current->pid); 579 579 return -ENOMEM; ··· 589 589 */ 590 590 static void delete_vma(struct mm_struct *mm, struct vm_area_struct *vma) 591 591 { 592 - if (vma->vm_ops && vma->vm_ops->close) 593 - vma->vm_ops->close(vma); 592 + vma_close(vma); 594 593 if (vma->vm_file) 595 594 fput(vma->vm_file); 596 595 put_nommu_region(vma->vm_region); ··· 842 843 { 843 844 unsigned long vm_flags; 844 845 845 - vm_flags = calc_vm_prot_bits(prot, 0) | calc_vm_flag_bits(flags); 846 + vm_flags = calc_vm_prot_bits(prot, 0) | calc_vm_flag_bits(file, flags); 846 847 847 848 if (!file) { 848 849 /* ··· 884 885 { 885 886 int ret; 886 887 887 - ret = call_mmap(vma->vm_file, vma); 888 + ret = mmap_file(vma->vm_file, vma); 888 889 if (ret == 0) { 889 890 vma->vm_region->vm_top = vma->vm_region->vm_end; 890 891 return 0; ··· 917 918 * happy. 918 919 */ 919 920 if (capabilities & NOMMU_MAP_DIRECT) { 920 - ret = call_mmap(vma->vm_file, vma); 921 + ret = mmap_file(vma->vm_file, vma); 921 922 /* shouldn't return success if we're not sharing */ 922 923 if (WARN_ON_ONCE(!is_nommu_shared_mapping(vma->vm_flags))) 923 924 ret = -ENOSYS;
+24 -7
mm/page_alloc.c
··· 635 635 static inline void account_freepages(struct zone *zone, int nr_pages, 636 636 int migratetype) 637 637 { 638 + lockdep_assert_held(&zone->lock); 639 + 638 640 if (is_migrate_isolate(migratetype)) 639 641 return; 640 642 ··· 644 642 645 643 if (is_migrate_cma(migratetype)) 646 644 __mod_zone_page_state(zone, NR_FREE_CMA_PAGES, nr_pages); 645 + else if (is_migrate_highatomic(migratetype)) 646 + WRITE_ONCE(zone->nr_free_highatomic, 647 + zone->nr_free_highatomic + nr_pages); 647 648 } 648 649 649 650 /* Used for pages not on another list */ ··· 966 961 break; 967 962 case 2: 968 963 /* the second tail page: deferred_list overlaps ->mapping */ 969 - if (unlikely(!list_empty(&folio->_deferred_list) && 970 - folio_test_partially_mapped(folio))) { 971 - bad_page(page, "partially mapped folio on deferred list"); 964 + if (unlikely(!list_empty(&folio->_deferred_list))) { 965 + bad_page(page, "on deferred list"); 972 966 goto out; 973 967 } 974 968 break; ··· 1048 1044 bool skip_kasan_poison = should_skip_kasan_poison(page); 1049 1045 bool init = want_init_on_free(); 1050 1046 bool compound = PageCompound(page); 1047 + struct folio *folio = page_folio(page); 1051 1048 1052 1049 VM_BUG_ON_PAGE(PageTail(page), page); 1053 1050 ··· 1057 1052 1058 1053 if (memcg_kmem_online() && PageMemcgKmem(page)) 1059 1054 __memcg_kmem_uncharge_page(page, order); 1055 + 1056 + /* 1057 + * In rare cases, when truncation or holepunching raced with 1058 + * munlock after VM_LOCKED was cleared, Mlocked may still be 1059 + * found set here. This does not indicate a problem, unless 1060 + * "unevictable_pgs_cleared" appears worryingly large. 
1061 + */ 1062 + if (unlikely(folio_test_mlocked(folio))) { 1063 + long nr_pages = folio_nr_pages(folio); 1064 + 1065 + __folio_clear_mlocked(folio); 1066 + zone_stat_mod_folio(folio, NR_MLOCK, -nr_pages); 1067 + count_vm_events(UNEVICTABLE_PGCLEARED, nr_pages); 1068 + } 1060 1069 1061 1070 if (unlikely(PageHWPoison(page)) && !order) { 1062 1071 /* Do not let hwpoison pages hit pcplists/buddy */ ··· 2701 2682 unsigned long pfn = folio_pfn(folio); 2702 2683 unsigned int order = folio_order(folio); 2703 2684 2704 - folio_undo_large_rmappable(folio); 2705 2685 if (!free_pages_prepare(&folio->page, order)) 2706 2686 continue; 2707 2687 /* ··· 3099 3081 3100 3082 /* 3101 3083 * If the caller does not have rights to reserves below the min 3102 - * watermark then subtract the high-atomic reserves. This will 3103 - * over-estimate the size of the atomic reserve but it avoids a search. 3084 + * watermark then subtract the free pages reserved for highatomic. 3104 3085 */ 3105 3086 if (likely(!(alloc_flags & ALLOC_RESERVES))) 3106 - unusable_free += z->nr_reserved_highatomic; 3087 + unusable_free += READ_ONCE(z->nr_free_highatomic); 3107 3088 3108 3089 #ifdef CONFIG_CMA 3109 3090 /* If allocation can't use CMA areas don't use free CMA pages */
+16
mm/page_io.c
··· 204 204 205 205 static void swap_zeromap_folio_set(struct folio *folio) 206 206 { 207 + struct obj_cgroup *objcg = get_obj_cgroup_from_folio(folio); 207 208 struct swap_info_struct *sis = swp_swap_info(folio->swap); 209 + int nr_pages = folio_nr_pages(folio); 208 210 swp_entry_t entry; 209 211 unsigned int i; 210 212 211 213 for (i = 0; i < folio_nr_pages(folio); i++) { 212 214 entry = page_swap_entry(folio_page(folio, i)); 213 215 set_bit(swp_offset(entry), sis->zeromap); 216 + } 217 + 218 + count_vm_events(SWPOUT_ZERO, nr_pages); 219 + if (objcg) { 220 + count_objcg_events(objcg, SWPOUT_ZERO, nr_pages); 221 + obj_cgroup_put(objcg); 214 222 } 215 223 } 216 224 ··· 511 503 static bool swap_read_folio_zeromap(struct folio *folio) 512 504 { 513 505 int nr_pages = folio_nr_pages(folio); 506 + struct obj_cgroup *objcg; 514 507 bool is_zeromap; 515 508 516 509 /* ··· 525 516 526 517 if (!is_zeromap) 527 518 return false; 519 + 520 + objcg = get_obj_cgroup_from_folio(folio); 521 + count_vm_events(SWPIN_ZERO, nr_pages); 522 + if (objcg) { 523 + count_objcg_events(objcg, SWPIN_ZERO, nr_pages); 524 + obj_cgroup_put(objcg); 525 + } 528 526 529 527 folio_zero_range(folio, 0, folio_size(folio)); 530 528 folio_mark_uptodate(folio);
-3
mm/shmem.c
··· 2733 2733 if (ret) 2734 2734 return ret; 2735 2735 2736 - /* arm64 - allow memory tagging on RAM-based files */ 2737 - vm_flags_set(vma, VM_MTE_ALLOWED); 2738 - 2739 2736 file_accessed(file); 2740 2737 /* This is anonymous shared memory if it is unlinked at the time of mmap */ 2741 2738 if (inode->i_nlink)
+20 -11
mm/slab_common.c
··· 380 380 unsigned int usersize, 381 381 void (*ctor)(void *)) 382 382 { 383 + unsigned long mask = 0; 384 + unsigned int idx; 383 385 kmem_buckets *b; 384 - int idx; 386 + 387 + BUILD_BUG_ON(ARRAY_SIZE(kmalloc_caches[KMALLOC_NORMAL]) > BITS_PER_LONG); 385 388 386 389 /* 387 390 * When the separate buckets API is not built in, just return ··· 406 403 for (idx = 0; idx < ARRAY_SIZE(kmalloc_caches[KMALLOC_NORMAL]); idx++) { 407 404 char *short_size, *cache_name; 408 405 unsigned int cache_useroffset, cache_usersize; 409 - unsigned int size; 406 + unsigned int size, aligned_idx; 410 407 411 408 if (!kmalloc_caches[KMALLOC_NORMAL][idx]) 412 409 continue; ··· 419 416 if (WARN_ON(!short_size)) 420 417 goto fail; 421 418 422 - cache_name = kasprintf(GFP_KERNEL, "%s-%s", name, short_size + 1); 423 - if (WARN_ON(!cache_name)) 424 - goto fail; 425 - 426 419 if (useroffset >= size) { 427 420 cache_useroffset = 0; 428 421 cache_usersize = 0; ··· 426 427 cache_useroffset = useroffset; 427 428 cache_usersize = min(size - cache_useroffset, usersize); 428 429 } 429 - (*b)[idx] = kmem_cache_create_usercopy(cache_name, size, 430 + 431 + aligned_idx = __kmalloc_index(size, false); 432 + if (!(*b)[aligned_idx]) { 433 + cache_name = kasprintf(GFP_KERNEL, "%s-%s", name, short_size + 1); 434 + if (WARN_ON(!cache_name)) 435 + goto fail; 436 + (*b)[aligned_idx] = kmem_cache_create_usercopy(cache_name, size, 430 437 0, flags, cache_useroffset, 431 438 cache_usersize, ctor); 432 - kfree(cache_name); 433 - if (WARN_ON(!(*b)[idx])) 434 - goto fail; 439 + kfree(cache_name); 440 + if (WARN_ON(!(*b)[aligned_idx])) 441 + goto fail; 442 + set_bit(aligned_idx, &mask); 443 + } 444 + if (idx != aligned_idx) 445 + (*b)[idx] = (*b)[aligned_idx]; 435 446 } 436 447 437 448 return b; 438 449 439 450 fail: 440 - for (idx = 0; idx < ARRAY_SIZE(kmalloc_caches[KMALLOC_NORMAL]); idx++) 451 + for_each_set_bit(idx, &mask, ARRAY_SIZE(kmalloc_caches[KMALLOC_NORMAL])) 441 452 kmem_cache_destroy((*b)[idx]); 442 453 
kmem_cache_free(kmem_buckets_cache, b); 443 454
+2 -16
mm/swap.c
··· 78 78 lruvec_del_folio(*lruvecp, folio); 79 79 __folio_clear_lru_flags(folio); 80 80 } 81 - 82 - /* 83 - * In rare cases, when truncation or holepunching raced with 84 - * munlock after VM_LOCKED was cleared, Mlocked may still be 85 - * found set here. This does not indicate a problem, unless 86 - * "unevictable_pgs_cleared" appears worryingly large. 87 - */ 88 - if (unlikely(folio_test_mlocked(folio))) { 89 - long nr_pages = folio_nr_pages(folio); 90 - 91 - __folio_clear_mlocked(folio); 92 - zone_stat_mod_folio(folio, NR_MLOCK, -nr_pages); 93 - count_vm_events(UNEVICTABLE_PGCLEARED, nr_pages); 94 - } 95 81 } 96 82 97 83 /* ··· 107 121 } 108 122 109 123 page_cache_release(folio); 110 - folio_undo_large_rmappable(folio); 124 + folio_unqueue_deferred_split(folio); 111 125 mem_cgroup_uncharge(folio); 112 126 free_unref_page(&folio->page, folio_order(folio)); 113 127 } ··· 974 988 free_huge_folio(folio); 975 989 continue; 976 990 } 977 - folio_undo_large_rmappable(folio); 991 + folio_unqueue_deferred_split(folio); 978 992 __page_cache_release(folio, &lruvec, &flags); 979 993 980 994 if (j != i)
+1 -1
mm/swapfile.c
··· 929 929 si->highest_bit = 0; 930 930 del_from_avail_list(si); 931 931 932 - if (vm_swap_full()) 932 + if (si->cluster_info && vm_swap_full()) 933 933 schedule_work(&si->reclaim_work); 934 934 } 935 935 }
+5 -9
mm/vma.c
··· 323 323 /* 324 324 * Close a vm structure and free it. 325 325 */ 326 - void remove_vma(struct vm_area_struct *vma, bool unreachable, bool closed) 326 + void remove_vma(struct vm_area_struct *vma, bool unreachable) 327 327 { 328 328 might_sleep(); 329 - if (!closed && vma->vm_ops && vma->vm_ops->close) 330 - vma->vm_ops->close(vma); 329 + vma_close(vma); 331 330 if (vma->vm_file) 332 331 fput(vma->vm_file); 333 332 mpol_put(vma_policy(vma)); ··· 1114 1115 vms_clear_ptes(vms, mas_detach, true); 1115 1116 mas_set(mas_detach, 0); 1116 1117 mas_for_each(mas_detach, vma, ULONG_MAX) 1117 - if (vma->vm_ops && vma->vm_ops->close) 1118 - vma->vm_ops->close(vma); 1119 - vms->closed_vm_ops = true; 1118 + vma_close(vma); 1120 1119 } 1121 1120 1122 1121 /* ··· 1157 1160 /* Remove and clean up vmas */ 1158 1161 mas_set(mas_detach, 0); 1159 1162 mas_for_each(mas_detach, vma, ULONG_MAX) 1160 - remove_vma(vma, /* = */ false, vms->closed_vm_ops); 1163 + remove_vma(vma, /* unreachable = */ false); 1161 1164 1162 1165 vm_unacct_memory(vms->nr_accounted); 1163 1166 validate_mm(mm); ··· 1681 1684 return new_vma; 1682 1685 1683 1686 out_vma_link: 1684 - if (new_vma->vm_ops && new_vma->vm_ops->close) 1685 - new_vma->vm_ops->close(new_vma); 1687 + vma_close(new_vma); 1686 1688 1687 1689 if (new_vma->vm_file) 1688 1690 fput(new_vma->vm_file);
+2 -4
mm/vma.h
··· 42 42 int vma_count; /* Number of vmas that will be removed */ 43 43 bool unlock; /* Unlock after the munmap */ 44 44 bool clear_ptes; /* If there are outstanding PTE to be cleared */ 45 - bool closed_vm_ops; /* call_mmap() was encountered, so vmas may be closed */ 46 - /* 1 byte hole */ 45 + /* 2 byte hole */ 47 46 unsigned long nr_pages; /* Number of pages being removed */ 48 47 unsigned long locked_vm; /* Number of locked pages */ 49 48 unsigned long nr_accounted; /* Number of VM_ACCOUNT pages */ ··· 197 198 vms->unmap_start = FIRST_USER_ADDRESS; 198 199 vms->unmap_end = USER_PGTABLES_CEILING; 199 200 vms->clear_ptes = false; 200 - vms->closed_vm_ops = false; 201 201 } 202 202 #endif 203 203 ··· 267 269 unsigned long start, size_t len, struct list_head *uf, 268 270 bool unlock); 269 271 270 - void remove_vma(struct vm_area_struct *vma, bool unreachable, bool closed); 272 + void remove_vma(struct vm_area_struct *vma, bool unreachable); 271 273 272 274 void unmap_region(struct ma_state *mas, struct vm_area_struct *vma, 273 275 struct vm_area_struct *prev, struct vm_area_struct *next);
+2 -2
mm/vmscan.c
··· 1476 1476 */ 1477 1477 nr_reclaimed += nr_pages; 1478 1478 1479 - folio_undo_large_rmappable(folio); 1479 + folio_unqueue_deferred_split(folio); 1480 1480 if (folio_batch_add(&free_folios, folio) == 0) { 1481 1481 mem_cgroup_uncharge_folios(&free_folios); 1482 1482 try_to_unmap_flush(); ··· 1864 1864 if (unlikely(folio_put_testzero(folio))) { 1865 1865 __folio_clear_lru_flags(folio); 1866 1866 1867 - folio_undo_large_rmappable(folio); 1867 + folio_unqueue_deferred_split(folio); 1868 1868 if (folio_batch_add(&free_folios, folio) == 0) { 1869 1869 spin_unlock_irq(&lruvec->lru_lock); 1870 1870 mem_cgroup_uncharge_folios(&free_folios);
+2
mm/vmstat.c
··· 1415 1415 #ifdef CONFIG_SWAP 1416 1416 "swap_ra", 1417 1417 "swap_ra_hit", 1418 + "swpin_zero", 1419 + "swpout_zero", 1418 1420 #ifdef CONFIG_KSM 1419 1421 "ksm_swpin_copy", 1420 1422 #endif
+3 -3
mm/zswap.c
··· 1053 1053 1054 1054 count_vm_event(ZSWPWB); 1055 1055 if (entry->objcg) 1056 - count_objcg_event(entry->objcg, ZSWPWB); 1056 + count_objcg_events(entry->objcg, ZSWPWB, 1); 1057 1057 1058 1058 zswap_entry_free(entry); 1059 1059 ··· 1483 1483 1484 1484 if (objcg) { 1485 1485 obj_cgroup_charge_zswap(objcg, entry->length); 1486 - count_objcg_event(objcg, ZSWPOUT); 1486 + count_objcg_events(objcg, ZSWPOUT, 1); 1487 1487 } 1488 1488 1489 1489 /* ··· 1577 1577 1578 1578 count_vm_event(ZSWPIN); 1579 1579 if (entry->objcg) 1580 - count_objcg_event(entry->objcg, ZSWPIN); 1580 + count_objcg_events(entry->objcg, ZSWPIN, 1); 1581 1581 1582 1582 if (swapcache) { 1583 1583 zswap_entry_free(entry);
-2
net/bluetooth/hci_core.c
··· 3788 3788 3789 3789 hci_dev_lock(hdev); 3790 3790 conn = hci_conn_hash_lookup_handle(hdev, handle); 3791 - if (conn && hci_dev_test_flag(hdev, HCI_MGMT)) 3792 - mgmt_device_connected(hdev, conn, NULL, 0); 3793 3791 hci_dev_unlock(hdev); 3794 3792 3795 3793 if (conn) {
+1 -1
net/core/filter.c
··· 2232 2232 rcu_read_unlock(); 2233 2233 return ret; 2234 2234 } 2235 - rcu_read_unlock_bh(); 2235 + rcu_read_unlock(); 2236 2236 if (dst) 2237 2237 IP6_INC_STATS(net, ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES); 2238 2238 out_drop:
+24 -18
net/core/sock.c
··· 1047 1047 1048 1048 #ifdef CONFIG_PAGE_POOL 1049 1049 1050 - /* This is the number of tokens that the user can SO_DEVMEM_DONTNEED in 1051 - * 1 syscall. The limit exists to limit the amount of memory the kernel 1052 - * allocates to copy these tokens. 1050 + /* This is the number of tokens and frags that the user can SO_DEVMEM_DONTNEED 1051 + * in 1 syscall. The limit exists to limit the amount of memory the kernel 1052 + * allocates to copy these tokens, and to prevent looping over the frags for 1053 + * too long. 1053 1054 */ 1054 1055 #define MAX_DONTNEED_TOKENS 128 1056 + #define MAX_DONTNEED_FRAGS 1024 1055 1057 1056 1058 static noinline_for_stack int 1057 1059 sock_devmem_dontneed(struct sock *sk, sockptr_t optval, unsigned int optlen) 1058 1060 { 1059 1061 unsigned int num_tokens, i, j, k, netmem_num = 0; 1060 1062 struct dmabuf_token *tokens; 1063 + int ret = 0, num_frags = 0; 1061 1064 netmem_ref netmems[16]; 1062 - int ret = 0; 1063 1065 1064 1066 if (!sk_is_tcp(sk)) 1065 1067 return -EBADF; 1066 1068 1067 - if (optlen % sizeof(struct dmabuf_token) || 1069 + if (optlen % sizeof(*tokens) || 1068 1070 optlen > sizeof(*tokens) * MAX_DONTNEED_TOKENS) 1069 1071 return -EINVAL; 1070 1072 1071 - tokens = kvmalloc_array(optlen, sizeof(*tokens), GFP_KERNEL); 1073 + num_tokens = optlen / sizeof(*tokens); 1074 + tokens = kvmalloc_array(num_tokens, sizeof(*tokens), GFP_KERNEL); 1072 1075 if (!tokens) 1073 1076 return -ENOMEM; 1074 1077 1075 - num_tokens = optlen / sizeof(struct dmabuf_token); 1076 1078 if (copy_from_sockptr(tokens, optval, optlen)) { 1077 1079 kvfree(tokens); 1078 1080 return -EFAULT; ··· 1083 1081 xa_lock_bh(&sk->sk_user_frags); 1084 1082 for (i = 0; i < num_tokens; i++) { 1085 1083 for (j = 0; j < tokens[i].token_count; j++) { 1084 + if (++num_frags > MAX_DONTNEED_FRAGS) 1085 + goto frag_limit_reached; 1086 + 1086 1087 netmem_ref netmem = (__force netmem_ref)__xa_erase( 1087 1088 &sk->sk_user_frags, tokens[i].token_start + j); 1088 1089 1089 - 
if (netmem && 1090 - !WARN_ON_ONCE(!netmem_is_net_iov(netmem))) { 1091 - netmems[netmem_num++] = netmem; 1092 - if (netmem_num == ARRAY_SIZE(netmems)) { 1093 - xa_unlock_bh(&sk->sk_user_frags); 1094 - for (k = 0; k < netmem_num; k++) 1095 - WARN_ON_ONCE(!napi_pp_put_page(netmems[k])); 1096 - netmem_num = 0; 1097 - xa_lock_bh(&sk->sk_user_frags); 1098 - } 1099 - ret++; 1090 + if (!netmem || WARN_ON_ONCE(!netmem_is_net_iov(netmem))) 1091 + continue; 1092 + 1093 + netmems[netmem_num++] = netmem; 1094 + if (netmem_num == ARRAY_SIZE(netmems)) { 1095 + xa_unlock_bh(&sk->sk_user_frags); 1096 + for (k = 0; k < netmem_num; k++) 1097 + WARN_ON_ONCE(!napi_pp_put_page(netmems[k])); 1098 + netmem_num = 0; 1099 + xa_lock_bh(&sk->sk_user_frags); 1100 1100 } 1101 + ret++; 1101 1102 } 1102 1103 } 1103 1104 1105 + frag_limit_reached: 1104 1106 xa_unlock_bh(&sk->sk_user_frags); 1105 1107 for (k = 0; k < netmem_num; k++) 1106 1108 WARN_ON_ONCE(!napi_pp_put_page(netmems[k]));
+1 -1
net/dccp/ipv6.c
··· 618 618 by tcp. Feel free to propose better solution. 619 619 --ANK (980728) 620 620 */ 621 - if (np->rxopt.all) 621 + if (np->rxopt.all && sk->sk_state != DCCP_LISTEN) 622 622 opt_skb = skb_clone_and_charge_r(skb, sk); 623 623 624 624 if (sk->sk_state == DCCP_OPEN) { /* Fast path */
+2 -1
net/ipv4/ipmr_base.c
··· 310 310 if (filter->filter_set) 311 311 flags |= NLM_F_DUMP_FILTERED; 312 312 313 - list_for_each_entry_rcu(mfc, &mrt->mfc_cache_list, list) { 313 + list_for_each_entry_rcu(mfc, &mrt->mfc_cache_list, list, 314 + lockdep_rtnl_is_held()) { 314 315 if (e < s_e) 315 316 goto next_entry; 316 317 if (filter->dev &&
+1 -3
net/ipv6/tcp_ipv6.c
··· 1621 1621 by tcp. Feel free to propose better solution. 1622 1622 --ANK (980728) 1623 1623 */ 1624 - if (np->rxopt.all) 1624 + if (np->rxopt.all && sk->sk_state != TCP_LISTEN) 1625 1625 opt_skb = skb_clone_and_charge_r(skb, sk); 1626 1626 1627 1627 if (sk->sk_state == TCP_ESTABLISHED) { /* Fast path */ ··· 1659 1659 if (reason) 1660 1660 goto reset; 1661 1661 } 1662 - if (opt_skb) 1663 - __kfree_skb(opt_skb); 1664 1662 return 0; 1665 1663 } 1666 1664 } else
+2 -1
net/mptcp/pm_netlink.c
··· 524 524 { 525 525 struct mptcp_pm_addr_entry *entry; 526 526 527 - list_for_each_entry(entry, &pernet->local_addr_list, list) { 527 + list_for_each_entry_rcu(entry, &pernet->local_addr_list, list, 528 + lockdep_is_held(&pernet->lock)) { 528 529 if (mptcp_addresses_equal(&entry->addr, info, entry->addr.port)) 529 530 return entry; 530 531 }
+15
net/mptcp/pm_userspace.c
··· 308 308 309 309 lock_sock(sk); 310 310 311 + spin_lock_bh(&msk->pm.lock); 311 312 match = mptcp_userspace_pm_lookup_addr_by_id(msk, id_val); 312 313 if (!match) { 313 314 GENL_SET_ERR_MSG(info, "address with specified id not found"); 315 + spin_unlock_bh(&msk->pm.lock); 314 316 release_sock(sk); 315 317 goto out; 316 318 } 317 319 318 320 list_move(&match->list, &free_list); 321 + spin_unlock_bh(&msk->pm.lock); 319 322 320 323 mptcp_pm_remove_addrs(msk, &free_list); 321 324 ··· 563 560 struct nlattr *token = info->attrs[MPTCP_PM_ATTR_TOKEN]; 564 561 struct nlattr *attr = info->attrs[MPTCP_PM_ATTR_ADDR]; 565 562 struct net *net = sock_net(skb->sk); 563 + struct mptcp_pm_addr_entry *entry; 566 564 struct mptcp_sock *msk; 567 565 int ret = -EINVAL; 568 566 struct sock *sk; ··· 604 600 605 601 if (loc.flags & MPTCP_PM_ADDR_FLAG_BACKUP) 606 602 bkup = 1; 603 + 604 + spin_lock_bh(&msk->pm.lock); 605 + list_for_each_entry(entry, &msk->pm.userspace_pm_local_addr_list, list) { 606 + if (mptcp_addresses_equal(&entry->addr, &loc.addr, false)) { 607 + if (bkup) 608 + entry->flags |= MPTCP_PM_ADDR_FLAG_BACKUP; 609 + else 610 + entry->flags &= ~MPTCP_PM_ADDR_FLAG_BACKUP; 611 + } 612 + } 613 + spin_unlock_bh(&msk->pm.lock); 607 614 608 615 lock_sock(sk); 609 616 ret = mptcp_pm_nl_mp_prio_send_ack(msk, &loc.addr, &rem.addr, bkup);
+11 -5
net/mptcp/protocol.c
··· 2082 2082 slow = lock_sock_fast(ssk); 2083 2083 WRITE_ONCE(ssk->sk_rcvbuf, rcvbuf); 2084 2084 WRITE_ONCE(tcp_sk(ssk)->window_clamp, window_clamp); 2085 - tcp_cleanup_rbuf(ssk, 1); 2085 + if (tcp_can_send_ack(ssk)) 2086 + tcp_cleanup_rbuf(ssk, 1); 2086 2087 unlock_sock_fast(ssk, slow); 2087 2088 } 2088 2089 } ··· 2206 2205 cmsg_flags = MPTCP_CMSG_INQ; 2207 2206 2208 2207 while (copied < len) { 2209 - int bytes_read; 2208 + int err, bytes_read; 2210 2209 2211 2210 bytes_read = __mptcp_recvmsg_mskq(msk, msg, len - copied, flags, &tss, &cmsg_flags); 2212 2211 if (unlikely(bytes_read < 0)) { ··· 2268 2267 } 2269 2268 2270 2269 pr_debug("block timeout %ld\n", timeo); 2271 - sk_wait_data(sk, &timeo, NULL); 2270 + mptcp_rcv_space_adjust(msk, copied); 2271 + err = sk_wait_data(sk, &timeo, NULL); 2272 + if (err < 0) { 2273 + err = copied ? : err; 2274 + goto out_err; 2275 + } 2272 2276 } 2277 + 2278 + mptcp_rcv_space_adjust(msk, copied); 2273 2279 2274 2280 out_err: 2275 2281 if (cmsg_flags && copied >= 0) { ··· 2293 2285 pr_debug("msk=%p rx queue empty=%d:%d copied=%d\n", 2294 2286 msk, skb_queue_empty_lockless(&sk->sk_receive_queue), 2295 2287 skb_queue_empty(&msk->receive_queue), copied); 2296 - if (!(flags & MSG_PEEK)) 2297 - mptcp_rcv_space_adjust(msk, copied); 2298 2288 2299 2289 release_sock(sk); 2300 2290 return copied;
+8 -23
net/netlink/af_netlink.c
··· 393 393 394 394 static void netlink_sock_destruct(struct sock *sk) 395 395 { 396 - struct netlink_sock *nlk = nlk_sk(sk); 397 - 398 - if (nlk->cb_running) { 399 - if (nlk->cb.done) 400 - nlk->cb.done(&nlk->cb); 401 - module_put(nlk->cb.module); 402 - kfree_skb(nlk->cb.skb); 403 - } 404 - 405 396 skb_queue_purge(&sk->sk_receive_queue); 406 397 407 398 if (!sock_flag(sk, SOCK_DEAD)) { ··· 403 412 WARN_ON(atomic_read(&sk->sk_rmem_alloc)); 404 413 WARN_ON(refcount_read(&sk->sk_wmem_alloc)); 405 414 WARN_ON(nlk_sk(sk)->groups); 406 - } 407 - 408 - static void netlink_sock_destruct_work(struct work_struct *work) 409 - { 410 - struct netlink_sock *nlk = container_of(work, struct netlink_sock, 411 - work); 412 - 413 - sk_free(&nlk->sk); 414 415 } 415 416 416 417 /* This lock without WQ_FLAG_EXCLUSIVE is good on UP and it is _very_ bad on ··· 714 731 if (!refcount_dec_and_test(&sk->sk_refcnt)) 715 732 return; 716 733 717 - if (nlk->cb_running && nlk->cb.done) { 718 - INIT_WORK(&nlk->work, netlink_sock_destruct_work); 719 - schedule_work(&nlk->work); 720 - return; 721 - } 722 - 723 734 sk_free(sk); 724 735 } 725 736 ··· 763 786 }; 764 787 blocking_notifier_call_chain(&netlink_chain, 765 788 NETLINK_URELEASE, &n); 789 + } 790 + 791 + /* Terminate any outstanding dump */ 792 + if (nlk->cb_running) { 793 + if (nlk->cb.done) 794 + nlk->cb.done(&nlk->cb); 795 + module_put(nlk->cb.module); 796 + kfree_skb(nlk->cb.skb); 766 797 } 767 798 768 799 module_put(nlk->module);
-2
net/netlink/af_netlink.h
··· 4 4 5 5 #include <linux/rhashtable.h> 6 6 #include <linux/atomic.h> 7 - #include <linux/workqueue.h> 8 7 #include <net/sock.h> 9 8 10 9 /* flags */ ··· 49 50 50 51 struct rhash_head node; 51 52 struct rcu_head rcu; 52 - struct work_struct work; 53 53 }; 54 54 55 55 static inline struct netlink_sock *nlk_sk(struct sock *sk)
+14 -4
net/sched/cls_u32.c
··· 92 92 long knodes; 93 93 }; 94 94 95 + static u32 handle2id(u32 h) 96 + { 97 + return ((h & 0x80000000) ? ((h >> 20) & 0x7FF) : h); 98 + } 99 + 100 + static u32 id2handle(u32 id) 101 + { 102 + return (id | 0x800U) << 20; 103 + } 104 + 95 105 static inline unsigned int u32_hash_fold(__be32 key, 96 106 const struct tc_u32_sel *sel, 97 107 u8 fshift) ··· 320 310 int id = idr_alloc_cyclic(&tp_c->handle_idr, ptr, 1, 0x7FF, GFP_KERNEL); 321 311 if (id < 0) 322 312 return 0; 323 - return (id | 0x800U) << 20; 313 + return id2handle(id); 324 314 } 325 315 326 316 static struct hlist_head *tc_u_common_hash; ··· 370 360 return -ENOBUFS; 371 361 372 362 refcount_set(&root_ht->refcnt, 1); 373 - root_ht->handle = tp_c ? gen_new_htid(tp_c, root_ht) : 0x80000000; 363 + root_ht->handle = tp_c ? gen_new_htid(tp_c, root_ht) : id2handle(0); 374 364 root_ht->prio = tp->prio; 375 365 root_ht->is_root = true; 376 366 idr_init(&root_ht->handle_idr); ··· 622 612 if (phn == ht) { 623 613 u32_clear_hw_hnode(tp, ht, extack); 624 614 idr_destroy(&ht->handle_idr); 625 - idr_remove(&tp_c->handle_idr, ht->handle); 615 + idr_remove(&tp_c->handle_idr, handle2id(ht->handle)); 626 616 RCU_INIT_POINTER(*hn, ht->next); 627 617 kfree_rcu(ht, rcu); 628 618 return 0; ··· 999 989 1000 990 err = u32_replace_hw_hnode(tp, ht, userflags, extack); 1001 991 if (err) { 1002 - idr_remove(&tp_c->handle_idr, handle); 992 + idr_remove(&tp_c->handle_idr, handle2id(handle)); 1003 993 kfree(ht); 1004 994 return err; 1005 995 }
+13 -6
net/sctp/ipv6.c
··· 683 683 struct sock *sk = &sp->inet.sk; 684 684 struct net *net = sock_net(sk); 685 685 struct net_device *dev = NULL; 686 - int type; 686 + int type, res, bound_dev_if; 687 687 688 688 type = ipv6_addr_type(in6); 689 689 if (IPV6_ADDR_ANY == type) ··· 697 697 if (!(type & IPV6_ADDR_UNICAST)) 698 698 return 0; 699 699 700 - if (sk->sk_bound_dev_if) { 701 - dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if); 700 + rcu_read_lock(); 701 + bound_dev_if = READ_ONCE(sk->sk_bound_dev_if); 702 + if (bound_dev_if) { 703 + res = 0; 704 + dev = dev_get_by_index_rcu(net, bound_dev_if); 702 705 if (!dev) 703 - return 0; 706 + goto out; 704 707 } 705 708 706 - return ipv6_can_nonlocal_bind(net, &sp->inet) || 707 - ipv6_chk_addr(net, in6, dev, 0); 709 + res = ipv6_can_nonlocal_bind(net, &sp->inet) || 710 + ipv6_chk_addr(net, in6, dev, 0); 711 + 712 + out: 713 + rcu_read_unlock(); 714 + return res; 708 715 } 709 716 710 717 /* This function checks if the address is a valid address to be used for
+3
net/vmw_vsock/af_vsock.c
··· 836 836 { 837 837 struct vsock_sock *vsk = vsock_sk(sk); 838 838 839 + /* Flush MSG_ZEROCOPY leftovers. */ 840 + __skb_queue_purge(&sk->sk_error_queue); 841 + 839 842 vsock_deassign_transport(vsk); 840 843 841 844 /* When clearing these addresses, there's no need to set the family and
+10
net/vmw_vsock/virtio_transport_common.c
··· 400 400 if (virtio_transport_init_zcopy_skb(vsk, skb, 401 401 info->msg, 402 402 can_zcopy)) { 403 + kfree_skb(skb); 403 404 ret = -ENOMEM; 404 405 break; 405 406 } ··· 1110 1109 struct virtio_vsock_sock *vvs = vsk->trans; 1111 1110 1112 1111 kfree(vvs); 1112 + vsk->trans = NULL; 1113 1113 } 1114 1114 EXPORT_SYMBOL_GPL(virtio_transport_destruct); 1115 1115 ··· 1512 1510 if (sk_acceptq_is_full(sk)) { 1513 1511 virtio_transport_reset_no_sock(t, skb); 1514 1512 return -ENOMEM; 1513 + } 1514 + 1515 + /* __vsock_release() might have already flushed accept_queue. 1516 + * Subsequent enqueues would lead to a memory leak. 1517 + */ 1518 + if (sk->sk_shutdown == SHUTDOWN_MASK) { 1519 + virtio_transport_reset_no_sock(t, skb); 1520 + return -ESHUTDOWN; 1515 1521 } 1516 1522 1517 1523 child = vsock_create_connected(sk);
+69 -43
samples/landlock/sandboxer.c
··· 60 60 #define ENV_SCOPED_NAME "LL_SCOPED" 61 61 #define ENV_DELIMITER ":" 62 62 63 + static int str2num(const char *numstr, __u64 *num_dst) 64 + { 65 + char *endptr = NULL; 66 + int err = 0; 67 + __u64 num; 68 + 69 + errno = 0; 70 + num = strtoull(numstr, &endptr, 10); 71 + if (errno != 0) 72 + err = errno; 73 + /* Was the string empty, or not entirely parsed successfully? */ 74 + else if ((*numstr == '\0') || (*endptr != '\0')) 75 + err = EINVAL; 76 + else 77 + *num_dst = num; 78 + 79 + return err; 80 + } 81 + 63 82 static int parse_path(char *env_path, const char ***const path_list) 64 83 { 65 84 int i, num_paths = 0; ··· 179 160 char *env_port_name, *env_port_name_next, *strport; 180 161 struct landlock_net_port_attr net_port = { 181 162 .allowed_access = allowed_access, 182 - .port = 0, 183 163 }; 184 164 185 165 env_port_name = getenv(env_var); ··· 189 171 190 172 env_port_name_next = env_port_name; 191 173 while ((strport = strsep(&env_port_name_next, ENV_DELIMITER))) { 192 - net_port.port = atoi(strport); 174 + __u64 port; 175 + 176 + if (strcmp(strport, "") == 0) 177 + continue; 178 + 179 + if (str2num(strport, &port)) { 180 + fprintf(stderr, "Failed to parse port at \"%s\"\n", 181 + strport); 182 + goto out_free_name; 183 + } 184 + net_port.port = port; 193 185 if (landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT, 194 186 &net_port, 0)) { 195 187 fprintf(stderr, ··· 290 262 291 263 #define LANDLOCK_ABI_LAST 6 292 264 265 + #define XSTR(s) #s 266 + #define STR(s) XSTR(s) 267 + 268 + /* clang-format off */ 269 + 270 + static const char help[] = 271 + "usage: " ENV_FS_RO_NAME "=\"...\" " ENV_FS_RW_NAME "=\"...\" " 272 + "[other environment variables] %1$s <cmd> [args]...\n" 273 + "\n" 274 + "Execute the given command in a restricted environment.\n" 275 + "Multi-valued settings (lists of ports, paths, scopes) are colon-delimited.\n" 276 + "\n" 277 + "Mandatory settings:\n" 278 + "* " ENV_FS_RO_NAME ": paths allowed to be used in a read-only way\n" 279 
+ "* " ENV_FS_RW_NAME ": paths allowed to be used in a read-write way\n" 280 + "\n" 281 + "Optional settings (when not set, their associated access check " 282 + "is always allowed, which is different from an empty string which " 283 + "means an empty list):\n" 284 + "* " ENV_TCP_BIND_NAME ": ports allowed to bind (server)\n" 285 + "* " ENV_TCP_CONNECT_NAME ": ports allowed to connect (client)\n" 286 + "* " ENV_SCOPED_NAME ": actions denied on the outside of the landlock domain\n" 287 + " - \"a\" to restrict opening abstract unix sockets\n" 288 + " - \"s\" to restrict sending signals\n" 289 + "\n" 290 + "Example:\n" 291 + ENV_FS_RO_NAME "=\"${PATH}:/lib:/usr:/proc:/etc:/dev/urandom\" " 292 + ENV_FS_RW_NAME "=\"/dev/null:/dev/full:/dev/zero:/dev/pts:/tmp\" " 293 + ENV_TCP_BIND_NAME "=\"9418\" " 294 + ENV_TCP_CONNECT_NAME "=\"80:443\" " 295 + ENV_SCOPED_NAME "=\"a:s\" " 296 + "%1$s bash -i\n" 297 + "\n" 298 + "This sandboxer can use Landlock features up to ABI version " 299 + STR(LANDLOCK_ABI_LAST) ".\n"; 300 + 301 + /* clang-format on */ 302 + 293 303 int main(const int argc, char *const argv[], char *const *const envp) 294 304 { 295 305 const char *cmd_path; ··· 346 280 }; 347 281 348 282 if (argc < 2) { 349 - fprintf(stderr, 350 - "usage: %s=\"...\" %s=\"...\" %s=\"...\" %s=\"...\" %s=\"...\" %s " 351 - "<cmd> [args]...\n\n", 352 - ENV_FS_RO_NAME, ENV_FS_RW_NAME, ENV_TCP_BIND_NAME, 353 - ENV_TCP_CONNECT_NAME, ENV_SCOPED_NAME, argv[0]); 354 - fprintf(stderr, 355 - "Execute a command in a restricted environment.\n\n"); 356 - fprintf(stderr, 357 - "Environment variables containing paths and ports " 358 - "each separated by a colon:\n"); 359 - fprintf(stderr, 360 - "* %s: list of paths allowed to be used in a read-only way.\n", 361 - ENV_FS_RO_NAME); 362 - fprintf(stderr, 363 - "* %s: list of paths allowed to be used in a read-write way.\n\n", 364 - ENV_FS_RW_NAME); 365 - fprintf(stderr, 366 - "Environment variables containing ports are optional " 367 - "and could be 
skipped.\n"); 368 - fprintf(stderr, 369 - "* %s: list of ports allowed to bind (server).\n", 370 - ENV_TCP_BIND_NAME); 371 - fprintf(stderr, 372 - "* %s: list of ports allowed to connect (client).\n", 373 - ENV_TCP_CONNECT_NAME); 374 - fprintf(stderr, "* %s: list of scoped IPCs.\n", 375 - ENV_SCOPED_NAME); 376 - fprintf(stderr, 377 - "\nexample:\n" 378 - "%s=\"${PATH}:/lib:/usr:/proc:/etc:/dev/urandom\" " 379 - "%s=\"/dev/null:/dev/full:/dev/zero:/dev/pts:/tmp\" " 380 - "%s=\"9418\" " 381 - "%s=\"80:443\" " 382 - "%s=\"a:s\" " 383 - "%s bash -i\n\n", 384 - ENV_FS_RO_NAME, ENV_FS_RW_NAME, ENV_TCP_BIND_NAME, 385 - ENV_TCP_CONNECT_NAME, ENV_SCOPED_NAME, argv[0]); 386 - fprintf(stderr, 387 - "This sandboxer can use Landlock features " 388 - "up to ABI version %d.\n", 389 - LANDLOCK_ABI_LAST); 283 + fprintf(stderr, help, argv[0]); 390 284 return 1; 391 285 } 392 286
+1 -1
samples/pktgen/pktgen_sample01_simple.sh
··· 76 76 pg_set $DEV "udp_dst_max $UDP_DST_MAX" 77 77 fi 78 78 79 - [ ! -z "$UDP_CSUM" ] && pg_set $dev "flag UDPCSUM" 79 + [ ! -z "$UDP_CSUM" ] && pg_set $DEV "flag UDPCSUM" 80 80 81 81 # Setup random UDP port src range 82 82 pg_set $DEV "flag UDPSRC_RND"
+2 -1
security/integrity/evm/evm_main.c
··· 1084 1084 if (!S_ISREG(inode->i_mode) || !(mode & FMODE_WRITE)) 1085 1085 return; 1086 1086 1087 - if (iint && atomic_read(&inode->i_writecount) == 1) 1087 + if (iint && iint->flags & EVM_NEW_FILE && 1088 + atomic_read(&inode->i_writecount) == 1) 1088 1089 iint->flags &= ~EVM_NEW_FILE; 1089 1090 } 1090 1091
+10 -4
security/integrity/ima/ima_template_lib.c
··· 318 318 hash_algo_name[hash_algo]); 319 319 } 320 320 321 - if (digest) 321 + if (digest) { 322 322 memcpy(buffer + offset, digest, digestsize); 323 - else 323 + } else { 324 324 /* 325 325 * If digest is NULL, the event being recorded is a violation. 326 326 * Make room for the digest by increasing the offset by the 327 - * hash algorithm digest size. 327 + * hash algorithm digest size. If the hash algorithm is not 328 + * specified increase the offset by IMA_DIGEST_SIZE which 329 + * fits SHA1 or MD5 328 330 */ 329 - offset += hash_digest_size[hash_algo]; 331 + if (hash_algo < HASH_ALGO__LAST) 332 + offset += hash_digest_size[hash_algo]; 333 + else 334 + offset += IMA_DIGEST_SIZE; 335 + } 330 336 331 337 return ima_write_template_field_data(buffer, offset + digestsize, 332 338 fmt, field_data);
+4
security/integrity/integrity.h
··· 37 37 ); 38 38 u8 data[]; 39 39 } __packed; 40 + static_assert(offsetof(struct evm_ima_xattr_data, data) == sizeof(struct evm_ima_xattr_data_hdr), 41 + "struct member likely outside of __struct_group()"); 40 42 41 43 /* Only used in the EVM HMAC code. */ 42 44 struct evm_xattr { ··· 67 65 ); 68 66 u8 digest[]; 69 67 } __packed; 68 + static_assert(offsetof(struct ima_digest_data, digest) == sizeof(struct ima_digest_data_hdr), 69 + "struct member likely outside of __struct_group()"); 70 70 71 71 /* 72 72 * Instead of wrapping the ima_digest_data struct inside a local structure
+8 -23
security/landlock/fs.c
··· 389 389 } 390 390 391 391 static access_mask_t 392 - get_raw_handled_fs_accesses(const struct landlock_ruleset *const domain) 393 - { 394 - access_mask_t access_dom = 0; 395 - size_t layer_level; 396 - 397 - for (layer_level = 0; layer_level < domain->num_layers; layer_level++) 398 - access_dom |= 399 - landlock_get_raw_fs_access_mask(domain, layer_level); 400 - return access_dom; 401 - } 402 - 403 - static access_mask_t 404 392 get_handled_fs_accesses(const struct landlock_ruleset *const domain) 405 393 { 406 394 /* Handles all initially denied by default access rights. */ 407 - return get_raw_handled_fs_accesses(domain) | 395 + return landlock_union_access_masks(domain).fs | 408 396 LANDLOCK_ACCESS_FS_INITIALLY_DENIED; 409 397 } 410 398 411 - static const struct landlock_ruleset * 412 - get_fs_domain(const struct landlock_ruleset *const domain) 413 - { 414 - if (!domain || !get_raw_handled_fs_accesses(domain)) 415 - return NULL; 416 - 417 - return domain; 418 - } 399 + static const struct access_masks any_fs = { 400 + .fs = ~0, 401 + }; 419 402 420 403 static const struct landlock_ruleset *get_current_fs_domain(void) 421 404 { 422 - return get_fs_domain(landlock_get_current_domain()); 405 + return landlock_get_applicable_domain(landlock_get_current_domain(), 406 + any_fs); 423 407 } 424 408 425 409 /* ··· 1501 1517 access_mask_t open_access_request, full_access_request, allowed_access, 1502 1518 optional_access; 1503 1519 const struct landlock_ruleset *const dom = 1504 - get_fs_domain(landlock_cred(file->f_cred)->domain); 1520 + landlock_get_applicable_domain( 1521 + landlock_cred(file->f_cred)->domain, any_fs); 1505 1522 1506 1523 if (!dom) 1507 1524 return 0;
+6 -22
security/landlock/net.c
··· 39 39 return err; 40 40 } 41 41 42 - static access_mask_t 43 - get_raw_handled_net_accesses(const struct landlock_ruleset *const domain) 44 - { 45 - access_mask_t access_dom = 0; 46 - size_t layer_level; 47 - 48 - for (layer_level = 0; layer_level < domain->num_layers; layer_level++) 49 - access_dom |= landlock_get_net_access_mask(domain, layer_level); 50 - return access_dom; 51 - } 52 - 53 - static const struct landlock_ruleset *get_current_net_domain(void) 54 - { 55 - const struct landlock_ruleset *const dom = 56 - landlock_get_current_domain(); 57 - 58 - if (!dom || !get_raw_handled_net_accesses(dom)) 59 - return NULL; 60 - 61 - return dom; 62 - } 42 + static const struct access_masks any_net = { 43 + .net = ~0, 44 + }; 63 45 64 46 static int current_check_access_socket(struct socket *const sock, 65 47 struct sockaddr *const address, ··· 54 72 struct landlock_id id = { 55 73 .type = LANDLOCK_KEY_NET_PORT, 56 74 }; 57 - const struct landlock_ruleset *const dom = get_current_net_domain(); 75 + const struct landlock_ruleset *const dom = 76 + landlock_get_applicable_domain(landlock_get_current_domain(), 77 + any_net); 58 78 59 79 if (!dom) 60 80 return 0;
+66 -8
security/landlock/ruleset.h
··· 11 11 12 12 #include <linux/bitops.h> 13 13 #include <linux/build_bug.h> 14 + #include <linux/kernel.h> 14 15 #include <linux/mutex.h> 15 16 #include <linux/rbtree.h> 16 17 #include <linux/refcount.h> ··· 47 46 access_mask_t net : LANDLOCK_NUM_ACCESS_NET; 48 47 access_mask_t scope : LANDLOCK_NUM_SCOPE; 49 48 }; 49 + 50 + union access_masks_all { 51 + struct access_masks masks; 52 + u32 all; 53 + }; 54 + 55 + /* Makes sure all fields are covered. */ 56 + static_assert(sizeof(typeof_member(union access_masks_all, masks)) == 57 + sizeof(typeof_member(union access_masks_all, all))); 50 58 51 59 typedef u16 layer_mask_t; 52 60 /* Makes sure all layers can be checked. */ ··· 270 260 refcount_inc(&ruleset->usage); 271 261 } 272 262 263 + /** 264 + * landlock_union_access_masks - Return all access rights handled in the 265 + * domain 266 + * 267 + * @domain: Landlock ruleset (used as a domain) 268 + * 269 + * Returns: an access_masks result of the OR of all the domain's access masks. 270 + */ 271 + static inline struct access_masks 272 + landlock_union_access_masks(const struct landlock_ruleset *const domain) 273 + { 274 + union access_masks_all matches = {}; 275 + size_t layer_level; 276 + 277 + for (layer_level = 0; layer_level < domain->num_layers; layer_level++) { 278 + union access_masks_all layer = { 279 + .masks = domain->access_masks[layer_level], 280 + }; 281 + 282 + matches.all |= layer.all; 283 + } 284 + 285 + return matches.masks; 286 + } 287 + 288 + /** 289 + * landlock_get_applicable_domain - Return @domain if it applies to (handles) 290 + * at least one of the access rights specified 291 + * in @masks 292 + * 293 + * @domain: Landlock ruleset (used as a domain) 294 + * @masks: access masks 295 + * 296 + * Returns: @domain if any access rights specified in @masks is handled, or 297 + * NULL otherwise. 
298 + */ 299 + static inline const struct landlock_ruleset * 300 + landlock_get_applicable_domain(const struct landlock_ruleset *const domain, 301 + const struct access_masks masks) 302 + { 303 + const union access_masks_all masks_all = { 304 + .masks = masks, 305 + }; 306 + union access_masks_all merge = {}; 307 + 308 + if (!domain) 309 + return NULL; 310 + 311 + merge.masks = landlock_union_access_masks(domain); 312 + if (merge.all & masks_all.all) 313 + return domain; 314 + 315 + return NULL; 316 + } 317 + 273 318 static inline void 274 319 landlock_add_fs_access_mask(struct landlock_ruleset *const ruleset, 275 320 const access_mask_t fs_access_mask, ··· 361 296 } 362 297 363 298 static inline access_mask_t 364 - landlock_get_raw_fs_access_mask(const struct landlock_ruleset *const ruleset, 365 - const u16 layer_level) 366 - { 367 - return ruleset->access_masks[layer_level].fs; 368 - } 369 - 370 - static inline access_mask_t 371 299 landlock_get_fs_access_mask(const struct landlock_ruleset *const ruleset, 372 300 const u16 layer_level) 373 301 { 374 302 /* Handles all initially denied by default access rights. */ 375 - return landlock_get_raw_fs_access_mask(ruleset, layer_level) | 303 + return ruleset->access_masks[layer_level].fs | 376 304 LANDLOCK_ACCESS_FS_INITIALLY_DENIED; 377 305 } 378 306
+1 -1
security/landlock/syscalls.c
··· 329 329 return -ENOMSG; 330 330 331 331 /* Checks that allowed_access matches the @ruleset constraints. */ 332 - mask = landlock_get_raw_fs_access_mask(ruleset, 0); 332 + mask = ruleset->access_masks[0].fs; 333 333 if ((path_beneath_attr.allowed_access | mask) != mask) 334 334 return -EINVAL; 335 335
+15 -3
security/landlock/task.c
··· 204 204 return false; 205 205 } 206 206 207 + static const struct access_masks unix_scope = { 208 + .scope = LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET, 209 + }; 210 + 207 211 static int hook_unix_stream_connect(struct sock *const sock, 208 212 struct sock *const other, 209 213 struct sock *const newsk) 210 214 { 211 215 const struct landlock_ruleset *const dom = 212 - landlock_get_current_domain(); 216 + landlock_get_applicable_domain(landlock_get_current_domain(), 217 + unix_scope); 213 218 214 219 /* Quick return for non-landlocked tasks. */ 215 220 if (!dom) ··· 230 225 struct socket *const other) 231 226 { 232 227 const struct landlock_ruleset *const dom = 233 - landlock_get_current_domain(); 228 + landlock_get_applicable_domain(landlock_get_current_domain(), 229 + unix_scope); 234 230 235 231 if (!dom) 236 232 return 0; ··· 249 243 return 0; 250 244 } 251 245 246 + static const struct access_masks signal_scope = { 247 + .scope = LANDLOCK_SCOPE_SIGNAL, 248 + }; 249 + 252 250 static int hook_task_kill(struct task_struct *const p, 253 251 struct kernel_siginfo *const info, const int sig, 254 252 const struct cred *const cred) ··· 266 256 } else { 267 257 dom = landlock_get_current_domain(); 268 258 } 259 + dom = landlock_get_applicable_domain(dom, signal_scope); 269 260 270 261 /* Quick return for non-landlocked tasks. */ 271 262 if (!dom) ··· 290 279 291 280 /* Lock already held by send_sigio() and send_sigurg(). */ 292 281 lockdep_assert_held(&fown->lock); 293 - dom = landlock_file(fown->file)->fown_domain; 282 + dom = landlock_get_applicable_domain( 283 + landlock_file(fown->file)->fown_domain, signal_scope); 294 284 295 285 /* Quick return for unowned socket. */ 296 286 if (!dom)
+1 -1
sound/core/ump.c
··· 1233 1233 1234 1234 num = 0; 1235 1235 for (i = 0; i < SNDRV_UMP_MAX_GROUPS; i++) 1236 - if (group_maps & (1U << i)) 1236 + if ((group_maps & (1U << i)) && ump->groups[i].valid) 1237 1237 ump->legacy_mapping[num++] = i; 1238 1238 1239 1239 return num;
+1 -1
sound/firewire/tascam/amdtp-tascam.c
··· 238 238 err = amdtp_stream_init(s, unit, dir, flags, fmt, 239 239 process_ctx_payloads, sizeof(struct amdtp_tscm)); 240 240 if (err < 0) 241 - return 0; 241 + return err; 242 242 243 243 if (dir == AMDTP_OUT_STREAM) { 244 244 // Use fixed value for FDF field.
-2
sound/pci/hda/patch_conexant.c
··· 205 205 { 206 206 struct conexant_spec *spec = codec->spec; 207 207 208 - snd_hda_gen_shutup_speakers(codec); 209 - 210 208 /* Turn the problematic codec into D3 to avoid spurious noises 211 209 from the internal speaker during (and after) reboot */ 212 210 cx_auto_turn_eapd(codec, spec->num_eapds, spec->eapds, false);
+14
sound/soc/amd/yc/acp6x-mach.c
··· 231 231 .driver_data = &acp6x_card, 232 232 .matches = { 233 233 DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"), 234 + DMI_MATCH(DMI_PRODUCT_NAME, "21M4"), 235 + } 236 + }, 237 + { 238 + .driver_data = &acp6x_card, 239 + .matches = { 240 + DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"), 234 241 DMI_MATCH(DMI_PRODUCT_NAME, "21M5"), 235 242 } 236 243 }, ··· 400 393 .matches = { 401 394 DMI_MATCH(DMI_BOARD_VENDOR, "TIMI"), 402 395 DMI_MATCH(DMI_PRODUCT_NAME, "Redmi Book Pro 15 2022"), 396 + } 397 + }, 398 + { 399 + .driver_data = &acp6x_card, 400 + .matches = { 401 + DMI_MATCH(DMI_BOARD_VENDOR, "TIMI"), 402 + DMI_MATCH(DMI_PRODUCT_NAME, "Xiaomi Book Pro 14 2022"), 403 403 } 404 404 }, 405 405 {
+1
sound/soc/codecs/tas2781-fmwlib.c
··· 1992 1992 break; 1993 1993 case 0x202: 1994 1994 case 0x400: 1995 + case 0x401: 1995 1996 tas_priv->fw_parse_variable_header = 1996 1997 fw_parse_variable_header_git; 1997 1998 tas_priv->fw_parse_program_data =
+9 -1
sound/soc/sof/amd/acp.c
··· 342 342 { 343 343 struct snd_sof_dev *sdev = adata->dev; 344 344 unsigned int val; 345 + unsigned int acp_dma_ch_sts; 345 346 int ret = 0; 346 347 348 + switch (adata->pci_rev) { 349 + case ACP70_PCI_ID: 350 + acp_dma_ch_sts = ACP70_DMA_CH_STS; 351 + break; 352 + default: 353 + acp_dma_ch_sts = ACP_DMA_CH_STS; 354 + } 347 355 val = snd_sof_dsp_read(sdev, ACP_DSP_BAR, ACP_DMA_CNTL_0 + ch * sizeof(u32)); 348 356 if (val & ACP_DMA_CH_RUN) { 349 - ret = snd_sof_dsp_read_poll_timeout(sdev, ACP_DSP_BAR, ACP_DMA_CH_STS, val, !val, 357 + ret = snd_sof_dsp_read_poll_timeout(sdev, ACP_DSP_BAR, acp_dma_ch_sts, val, !val, 350 358 ACP_REG_POLL_INTERVAL, 351 359 ACP_DMA_COMPLETE_TIMEOUT_US); 352 360 if (ret < 0)
+1
sound/soc/sof/sof-client-probes-ipc4.c
··· 125 125 msg.primary |= SOF_IPC4_MSG_TARGET(SOF_IPC4_MODULE_MSG); 126 126 msg.extension = SOF_IPC4_MOD_EXT_DST_MOD_INSTANCE(INVALID_PIPELINE_ID); 127 127 msg.extension |= SOF_IPC4_MOD_EXT_CORE_ID(0); 128 + msg.extension |= SOF_IPC4_MOD_EXT_PARAM_SIZE(sizeof(cfg) / sizeof(uint32_t)); 128 129 129 130 msg.data_size = sizeof(cfg); 130 131 msg.data_ptr = &cfg;
+3 -3
sound/soc/stm/stm32_sai_sub.c
··· 317 317 int div; 318 318 319 319 div = DIV_ROUND_CLOSEST(input_rate, output_rate); 320 - if (div > SAI_XCR1_MCKDIV_MAX(version)) { 320 + if (div > SAI_XCR1_MCKDIV_MAX(version) || div <= 0) { 321 321 dev_err(&sai->pdev->dev, "Divider %d out of range\n", div); 322 322 return -EINVAL; 323 323 } ··· 378 378 int div; 379 379 380 380 div = stm32_sai_get_clk_div(sai, *prate, rate); 381 - if (div < 0) 382 - return div; 381 + if (div <= 0) 382 + return -EINVAL; 383 383 384 384 mclk->freq = *prate / div; 385 385
+1 -1
sound/soc/stm/stm32_spdifrx.c
··· 939 939 { 940 940 struct stm32_spdifrx_data *spdifrx = platform_get_drvdata(pdev); 941 941 942 - if (spdifrx->ctrl_chan) 942 + if (!IS_ERR(spdifrx->ctrl_chan)) 943 943 dma_release_channel(spdifrx->ctrl_chan); 944 944 945 945 if (spdifrx->dmab)
+1
sound/usb/mixer.c
··· 1205 1205 } 1206 1206 break; 1207 1207 case USB_ID(0x1bcf, 0x2283): /* NexiGo N930AF FHD Webcam */ 1208 + case USB_ID(0x03f0, 0x654a): /* HP 320 FHD Webcam */ 1208 1209 if (!strcmp(kctl->id.name, "Mic Capture Volume")) { 1209 1210 usb_audio_info(chip, 1210 1211 "set resolution quirk: cval->res = 16\n");
+2
sound/usb/quirks.c
··· 2114 2114 2115 2115 static const struct usb_audio_quirk_flags_table quirk_flags_table[] = { 2116 2116 /* Device matches */ 2117 + DEVICE_FLG(0x03f0, 0x654a, /* HP 320 FHD Webcam */ 2118 + QUIRK_FLAG_GET_SAMPLE_RATE), 2117 2119 DEVICE_FLG(0x041e, 0x3000, /* Creative SB Extigy */ 2118 2120 QUIRK_FLAG_IGNORE_CTL_ERROR), 2119 2121 DEVICE_FLG(0x041e, 0x4080, /* Creative Live Cam VF0610 */
+3 -1
tools/lib/thermal/Makefile
··· 121 121 122 122 clean: 123 123 $(call QUIET_CLEAN, libthermal) $(RM) $(LIBTHERMAL_A) \ 124 - *.o *~ *.a *.so *.so.$(VERSION) *.so.$(LIBTHERMAL_VERSION) .*.d .*.cmd LIBTHERMAL-CFLAGS $(LIBTHERMAL_PC) 124 + *.o *~ *.a *.so *.so.$(VERSION) *.so.$(LIBTHERMAL_VERSION) \ 125 + .*.d .*.cmd LIBTHERMAL-CFLAGS $(LIBTHERMAL_PC) \ 126 + $(srctree)/tools/$(THERMAL_UAPI) 125 127 126 128 $(LIBTHERMAL_PC): 127 129 $(QUIET_GEN)sed -e "s|@PREFIX@|$(prefix)|" \
+2
tools/lib/thermal/sampling.c
··· 16 16 struct thermal_handler_param *thp = arg; 17 17 struct thermal_handler *th = thp->th; 18 18 19 + arg = thp->arg; 20 + 19 21 genlmsg_parse(nlh, 0, attrs, THERMAL_GENL_ATTR_MAX, NULL); 20 22 21 23 switch (genlhdr->cmd) {
+1 -1
tools/sched_ext/scx_show_state.py
··· 35 35 print(f'switching_all : {read_int("scx_switching_all")}') 36 36 print(f'switched_all : {read_static_key("__scx_switched_all")}') 37 37 print(f'enable_state : {ops_state_str(enable_state)} ({enable_state})') 38 - print(f'bypass_depth : {read_atomic("scx_ops_bypass_depth")}') 38 + print(f'bypass_depth : {prog["scx_ops_bypass_depth"].value_()}') 39 39 print(f'nr_rejected : {read_atomic("scx_nr_rejected")}') 40 40 print(f'enable_seq : {read_atomic("scx_enable_seq")}')
+28 -4
tools/testing/selftests/bpf/progs/verifier_bits_iter.c
··· 57 57 __success __retval(0) 58 58 int null_pointer(void) 59 59 { 60 - int nr = 0; 60 + struct bpf_iter_bits iter; 61 + int err, nr = 0; 61 62 int *bit; 63 + 64 + err = bpf_iter_bits_new(&iter, NULL, 1); 65 + bpf_iter_bits_destroy(&iter); 66 + if (err != -EINVAL) 67 + return 1; 62 68 63 69 bpf_for_each(bits, bit, NULL, 1) 64 70 nr++; ··· 200 194 __success __retval(0) 201 195 int bad_words(void) 202 196 { 203 - void *bad_addr = (void *)(3UL << 30); 204 - int nr = 0; 197 + void *bad_addr = (void *)-4095; 198 + struct bpf_iter_bits iter; 199 + volatile int nr; 205 200 int *bit; 201 + int err; 206 202 203 + err = bpf_iter_bits_new(&iter, bad_addr, 1); 204 + bpf_iter_bits_destroy(&iter); 205 + if (err != -EFAULT) 206 + return 1; 207 + 208 + nr = 0; 207 209 bpf_for_each(bits, bit, bad_addr, 1) 208 210 nr++; 211 + if (nr != 0) 212 + return 2; 209 213 214 + err = bpf_iter_bits_new(&iter, bad_addr, 4); 215 + bpf_iter_bits_destroy(&iter); 216 + if (err != -EFAULT) 217 + return 3; 218 + 219 + nr = 0; 210 220 bpf_for_each(bits, bit, bad_addr, 4) 211 221 nr++; 222 + if (nr != 0) 223 + return 4; 212 224 213 - return nr; 225 + return 0; 214 226 }
+53 -1
tools/testing/selftests/drivers/net/bonding/bond_options.sh
··· 11 11 12 12 lib_dir=$(dirname "$0") 13 13 source ${lib_dir}/bond_topo_3d1c.sh 14 + c_maddr="33:33:00:00:00:10" 15 + g_maddr="33:33:00:00:02:54" 14 16 15 17 skip_prio() 16 18 { ··· 242 240 done 243 241 } 244 242 243 + # Testing correct multicast groups are added to slaves for ns targets 244 + arp_validate_mcast() 245 + { 246 + RET=0 247 + local arp_valid=$(cmd_jq "ip -n ${s_ns} -j -d link show bond0" ".[].linkinfo.info_data.arp_validate") 248 + local active_slave=$(cmd_jq "ip -n ${s_ns} -d -j link show bond0" ".[].linkinfo.info_data.active_slave") 249 + 250 + for i in $(seq 0 2); do 251 + maddr_list=$(ip -n ${s_ns} maddr show dev eth${i}) 252 + 253 + # arp_valid == 0 or active_slave should not join any maddrs 254 + if { [ "$arp_valid" == "null" ] || [ "eth${i}" == ${active_slave} ]; } && \ 255 + echo "$maddr_list" | grep -qE "${c_maddr}|${g_maddr}"; then 256 + RET=1 257 + check_err 1 "arp_valid $arp_valid active_slave $active_slave, eth$i has mcast group" 258 + # arp_valid != 0 and backup_slave should join both maddrs 259 + elif [ "$arp_valid" != "null" ] && [ "eth${i}" != ${active_slave} ] && \ 260 + ( ! echo "$maddr_list" | grep -q "${c_maddr}" || \ 261 + ! 
echo "$maddr_list" | grep -q "${m_maddr}"); then 262 + RET=1 263 + check_err 1 "arp_valid $arp_valid active_slave $active_slave, eth$i has mcast group" 264 + fi 265 + done 266 + 267 + # Do failover 268 + ip -n ${s_ns} link set ${active_slave} down 269 + # wait for active link change 270 + slowwait 2 active_slave_changed $active_slave 271 + active_slave=$(cmd_jq "ip -n ${s_ns} -d -j link show bond0" ".[].linkinfo.info_data.active_slave") 272 + 273 + for i in $(seq 0 2); do 274 + maddr_list=$(ip -n ${s_ns} maddr show dev eth${i}) 275 + 276 + # arp_valid == 0 or active_slave should not join any maddrs 277 + if { [ "$arp_valid" == "null" ] || [ "eth${i}" == ${active_slave} ]; } && \ 278 + echo "$maddr_list" | grep -qE "${c_maddr}|${g_maddr}"; then 279 + RET=1 280 + check_err 1 "arp_valid $arp_valid active_slave $active_slave, eth$i has mcast group" 281 + # arp_valid != 0 and backup_slave should join both maddrs 282 + elif [ "$arp_valid" != "null" ] && [ "eth${i}" != ${active_slave} ] && \ 283 + ( ! echo "$maddr_list" | grep -q "${c_maddr}" || \ 284 + ! echo "$maddr_list" | grep -q "${m_maddr}"); then 285 + RET=1 286 + check_err 1 "arp_valid $arp_valid active_slave $active_slave, eth$i has mcast group" 287 + fi 288 + done 289 + } 290 + 245 291 arp_validate_arp() 246 292 { 247 293 local mode=$1 ··· 311 261 fi 312 262 313 263 for val in $(seq 0 6); do 314 - arp_validate_test "mode $mode arp_interval 100 ns_ip6_target ${g_ip6} arp_validate $val" 264 + arp_validate_test "mode $mode arp_interval 100 ns_ip6_target ${g_ip6},${c_ip6} arp_validate $val" 315 265 log_test "arp_validate" "$mode ns_ip6_target arp_validate $val" 266 + arp_validate_mcast 267 + log_test "arp_validate" "join mcast group" 316 268 done 317 269 } 318 270
+6 -4
tools/testing/selftests/kvm/Makefile
··· 241 241 -Wno-gnu-variable-sized-type-not-at-end -MD -MP -DCONFIG_64BIT \ 242 242 -fno-builtin-memcmp -fno-builtin-memcpy \ 243 243 -fno-builtin-memset -fno-builtin-strnlen \ 244 - -fno-stack-protector -fno-PIE -I$(LINUX_TOOL_INCLUDE) \ 245 - -I$(LINUX_TOOL_ARCH_INCLUDE) -I$(LINUX_HDR_PATH) -Iinclude \ 246 - -I$(<D) -Iinclude/$(ARCH_DIR) -I ../rseq -I.. $(EXTRA_CFLAGS) \ 247 - $(KHDR_INCLUDES) 244 + -fno-stack-protector -fno-PIE -fno-strict-aliasing \ 245 + -I$(LINUX_TOOL_INCLUDE) -I$(LINUX_TOOL_ARCH_INCLUDE) \ 246 + -I$(LINUX_HDR_PATH) -Iinclude -I$(<D) -Iinclude/$(ARCH_DIR) \ 247 + -I ../rseq -I.. $(EXTRA_CFLAGS) $(KHDR_INCLUDES) 248 248 ifeq ($(ARCH),s390) 249 249 CFLAGS += -march=z10 250 250 endif 251 251 ifeq ($(ARCH),x86) 252 + ifeq ($(shell echo "void foo(void) { }" | $(CC) -march=x86-64-v2 -x c - -c -o /dev/null 2>/dev/null; echo "$$?"),0) 252 253 CFLAGS += -march=x86-64-v2 254 + endif 253 255 endif 254 256 ifeq ($(ARCH),arm64) 255 257 tools_dir := $(top_srcdir)/tools
+1 -1
tools/testing/selftests/kvm/guest_memfd_test.c
··· 134 134 size); 135 135 } 136 136 137 - for (flag = 0; flag; flag <<= 1) { 137 + for (flag = BIT(0); flag; flag <<= 1) { 138 138 fd = __vm_create_guest_memfd(vm, page_size, flag); 139 139 TEST_ASSERT(fd == -1 && errno == EINVAL, 140 140 "guest_memfd() with flag '0x%lx' should fail with EINVAL",
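The one-character guest_memfd fix above matters because a left-shift loop seeded with 0 fails its condition on the very first test, so the original loop body never ran and no invalid flag was ever exercised. A standalone sketch of the two seedings, assuming 64-bit shell arithmetic (`count_flags` is a name made up here):

```shell
# Count how many flag values a left-shift loop visits from a given seed.
count_flags() {
	local flag=$1 n=0
	while (( flag != 0 )); do
		n=$((n + 1))
		flag=$((flag << 1))   # shifting past bit 63 wraps to 0, ending the loop
	done
	echo "$n"
}

count_flags 0   # the buggy form: condition false immediately, zero iterations
count_flags 1   # BIT(0): every single-bit flag value is visited once (64 of them)
```

This is why the test silently passed before the fix: zero iterations means zero failed assertions.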
+1 -1
tools/testing/selftests/kvm/lib/x86_64/vmx.c
··· 200 200 if (vmx->eptp_gpa) { 201 201 uint64_t ept_paddr; 202 202 struct eptPageTablePointer eptp = { 203 - .memory_type = VMX_BASIC_MEM_TYPE_WB, 203 + .memory_type = X86_MEMTYPE_WB, 204 204 .page_walk_length = 3, /* + 1 */ 205 205 .ad_enabled = ept_vpid_cap_supported(VMX_EPT_VPID_CAP_AD_BITS), 206 206 .address = vmx->eptp_gpa >> PAGE_SHIFT_4K,
+1 -1
tools/testing/selftests/kvm/memslot_perf_test.c
··· 417 417 */ 418 418 static noinline void host_perform_sync(struct sync_area *sync) 419 419 { 420 - alarm(2); 420 + alarm(10); 421 421 422 422 atomic_store_explicit(&sync->sync_flag, true, memory_order_release); 423 423 while (atomic_load_explicit(&sync->sync_flag, memory_order_acquire))
+12
tools/testing/selftests/mm/hugetlb_dio.c
··· 94 94 int main(void) 95 95 { 96 96 size_t pagesize = 0; 97 + int fd; 97 98 98 99 ksft_print_header(); 100 + 101 + /* Open the file to DIO */ 102 + fd = open("/tmp", O_TMPFILE | O_RDWR | O_DIRECT, 0664); 103 + if (fd < 0) 104 + ksft_exit_skip("Unable to allocate file: %s\n", strerror(errno)); 105 + close(fd); 106 + 107 + /* Check if huge pages are free */ 108 + if (!get_free_hugepages()) 109 + ksft_exit_skip("No free hugepage, exiting\n"); 110 + 99 111 ksft_set_plan(4); 100 112 101 113 /* Get base page size */
+1
tools/testing/selftests/net/.gitignore
··· 19 19 log.txt 20 20 msg_oob 21 21 msg_zerocopy 22 + netlink-dumps 22 23 nettest 23 24 psock_fanout 24 25 psock_snd
+1
tools/testing/selftests/net/Makefile
··· 78 78 TEST_GEN_FILES += io_uring_zerocopy_tx 79 79 TEST_PROGS += io_uring_zerocopy_tx.sh 80 80 TEST_GEN_FILES += bind_bhash 81 + TEST_GEN_PROGS += netlink-dumps 81 82 TEST_GEN_PROGS += sk_bind_sendto_listen 82 83 TEST_GEN_PROGS += sk_connect_zero_addr 83 84 TEST_GEN_PROGS += sk_so_peek_off
+24
tools/testing/selftests/tc-testing/tc-tests/filters/u32.json
··· 329 329 "teardown": [ 330 330 "$TC qdisc del dev $DEV1 parent root drr" 331 331 ] 332 + }, 333 + { 334 + "id": "1234", 335 + "name": "Exercise IDR leaks by creating/deleting a filter many (2048) times", 336 + "category": [ 337 + "filter", 338 + "u32" 339 + ], 340 + "plugins": { 341 + "requires": "nsPlugin" 342 + }, 343 + "setup": [ 344 + "$TC qdisc add dev $DEV1 parent root handle 10: drr", 345 + "$TC filter add dev $DEV1 parent 10:0 protocol ip prio 2 u32 match ip src 0.0.0.2/32 action drop", 346 + "$TC filter add dev $DEV1 parent 10:0 protocol ip prio 3 u32 match ip src 0.0.0.3/32 action drop" 347 + ], 348 + "cmdUnderTest": "bash -c 'for i in {1..2048} ;do echo filter delete dev $DEV1 pref 3;echo filter add dev $DEV1 parent 10:0 protocol ip prio 3 u32 match ip src 0.0.0.3/32 action drop;done | $TC -b -'", 349 + "expExitCode": "0", 350 + "verifyCmd": "$TC filter show dev $DEV1", 351 + "matchPattern": "protocol ip pref 3 u32", 352 + "matchCount": "3", 353 + "teardown": [ 354 + "$TC qdisc del dev $DEV1 parent root drr" 355 + ] 332 356 } 333 357 ]
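The new u32 test case above drives 2048 delete/add cycles through a single `tc -b -` invocation rather than forking tc once per command, which is what makes exercising the IDR leak 2048 times practical. The generator half of that pipeline can be sketched on its own (`gen_batch` and the literal `DEV` device name are placeholders invented here):

```shell
# Emit paired delete/add filter commands, one per line, for tc batch mode.
gen_batch() {
	local i
	for i in $(seq 1 "$1"); do
		echo "filter delete dev DEV pref 3"
		echo "filter add dev DEV parent 10:0 protocol ip prio 3 u32 match ip src 0.0.0.3/32 action drop"
	done
}

gen_batch 2048 | wc -l   # 2048 cycles x 2 commands = 4096 lines
# In the test this stream is piped into a single: tc -b -
```

Batch mode reads one command per line from stdin, so the whole create/delete churn costs one process instead of 4096.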
+1 -1
tools/virtio/vringh_test.c
··· 519 519 errx(1, "virtqueue_add_sgs: %i", err); 520 520 __kmalloc_fake = NULL; 521 521 522 - /* Host retreives it. */ 522 + /* Host retrieves it. */ 523 523 vringh_iov_init(&riov, host_riov, ARRAY_SIZE(host_riov)); 524 524 vringh_iov_init(&wiov, host_wiov, ARRAY_SIZE(host_wiov)); 525 525