Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Move the bpf verifier trace check into the new switch statement in
HEAD.

Resolve the overlapping changes in hinic, where bug fixes overlap
the addition of VF support.

Signed-off-by: David S. Miller <davem@davemloft.net>

+3468 -1533
+14
Documentation/core-api/printk-formats.rst
··· 112 112 consideration the effect of compiler optimisations which may occur 113 113 when tail-calls are used and marked with the noreturn GCC attribute. 114 114 115 + Probed Pointers from BPF / tracing 116 + ---------------------------------- 117 + 118 + :: 119 + 120 + %pks kernel string 121 + %pus user string 122 + 123 + The ``k`` and ``u`` specifiers are used for printing previously probed memory 124 + from either kernel memory (k) or user memory (u). The subsequent ``s`` specifier 125 + results in printing a string. For direct use in regular vsnprintf() the (k) 126 + and (u) annotations are ignored; however, when used from BPF's bpf_trace_printk(), 127 + for example, the memory being pointed to is read without faulting. 128 + 115 129 Kernel Pointers 116 130 --------------- 117 131
+2
Documentation/virt/kvm/index.rst
··· 28 28 arm/index 29 29 30 30 devices/index 31 + 32 + running-nested-guests
+276
Documentation/virt/kvm/running-nested-guests.rst
··· 1 + ============================== 2 + Running nested guests with KVM 3 + ============================== 4 + 5 + Nested virtualization is the ability to run a guest inside another 6 + guest (the guest hypervisor can be KVM-based or a different one). The 7 + straightforward example is a KVM guest that in turn runs another KVM 8 + guest (the rest of this document is built on this example):: 9 + 10 + .----------------. .----------------. 11 + | | | | 12 + | L2 | | L2 | 13 + | (Nested Guest) | | (Nested Guest) | 14 + | | | | 15 + |----------------'--'----------------| 16 + | | 17 + | L1 (Guest Hypervisor) | 18 + | KVM (/dev/kvm) | 19 + | | 20 + .------------------------------------------------------. 21 + | L0 (Host Hypervisor) | 22 + | KVM (/dev/kvm) | 23 + |------------------------------------------------------| 24 + | Hardware (with virtualization extensions) | 25 + '------------------------------------------------------' 26 + 27 + Terminology: 28 + 29 + - L0 – level-0; the bare metal host, running KVM 30 + 31 + - L1 – level-1 guest; a VM running on L0; also called the "guest 32 + hypervisor", as it itself is capable of running KVM. 33 + 34 + - L2 – level-2 guest; a VM running on L1, this is the "nested guest" 35 + 36 + .. note:: The above diagram is modelled after the x86 architecture; 37 + s390x, ppc64 and other architectures are likely to have 38 + a different design for nesting. 39 + 40 + For example, s390x always has an LPAR (LogicalPARtition) 41 + hypervisor running on bare metal, adding another layer and 42 + resulting in at least four levels in a nested setup — L0 (bare 43 + metal, running the LPAR hypervisor), L1 (host hypervisor), L2 44 + (guest hypervisor), L3 (nested guest). 45 + 46 + This document will stick with the three-level terminology (L0, 47 + L1, and L2) for all architectures; and will largely focus on 48 + x86.
49 + 50 + 51 + Use Cases 52 + --------- 53 + 54 + There are several scenarios where nested KVM can be useful, to name a 55 + few: 56 + 57 + - As a developer, you want to test your software on different operating 58 + systems (OSes). Instead of renting multiple VMs from a Cloud 59 + Provider, using nested KVM lets you rent a large enough "guest 60 + hypervisor" (level-1 guest). This in turn allows you to create 61 + multiple nested guests (level-2 guests), running different OSes, on 62 + which you can develop and test your software. 63 + 64 + - Live migration of "guest hypervisors" and their nested guests, for 65 + load balancing, disaster recovery, etc. 66 + 67 + - VM image creation tools (e.g. ``virt-install``) often run 68 + their own VM, and users expect these to work inside a VM. 69 + 70 + - Some OSes use virtualization internally for security (e.g. to let 71 + applications run safely in isolation). 72 + 73 + 74 + Enabling "nested" (x86) 75 + ----------------------- 76 + 77 + From Linux kernel v4.19 onwards, the ``nested`` KVM parameter is enabled 78 + by default for Intel and AMD. (Though your Linux distribution might 79 + override this default.) 80 + 81 + If you are running a Linux kernel older than v4.19, to enable 82 + nesting, set the ``nested`` KVM module parameter to ``Y`` or ``1``. To 83 + persist this setting across reboots, you can add it in a config file, as 84 + shown below: 85 + 86 + 1. On the bare metal host (L0), list the kernel modules and ensure that 87 + the KVM modules are loaded:: 88 + 89 + $ lsmod | grep -i kvm 90 + kvm_intel 133627 0 91 + kvm 435079 1 kvm_intel 92 + 93 + 2. Show information for the ``kvm_intel`` module:: 94 + 95 + $ modinfo kvm_intel | grep -i nested 96 + parm: nested:bool 97 + 98 + 3.
For the nested KVM configuration to persist across reboots, place the 99 + below in ``/etc/modprobe.d/kvm_intel.conf`` (create the file if it 100 + doesn't exist):: 101 + 102 + $ cat /etc/modprobe.d/kvm_intel.conf 103 + options kvm-intel nested=y 104 + 105 + 4. Unload and re-load the KVM Intel module:: 106 + 107 + $ sudo rmmod kvm-intel 108 + $ sudo modprobe kvm-intel 109 + 110 + 5. Verify that the ``nested`` parameter for KVM is enabled:: 111 + 112 + $ cat /sys/module/kvm_intel/parameters/nested 113 + Y 114 + 115 + For AMD hosts, the process is the same as above, except that the module 116 + name is ``kvm-amd``. 117 + 118 + 119 + Additional nested-related kernel parameters (x86) 120 + ------------------------------------------------- 121 + 122 + If your hardware is sufficiently advanced (Intel Haswell processor or 123 + higher, which has newer hardware virt extensions), the following 124 + additional features will also be enabled by default: "Shadow VMCS 125 + (Virtual Machine Control Structure)" and APIC virtualization on your bare 126 + metal host (L0). Parameters for Intel hosts:: 127 + 128 + $ cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs 129 + Y 130 + 131 + $ cat /sys/module/kvm_intel/parameters/enable_apicv 132 + Y 133 + 134 + $ cat /sys/module/kvm_intel/parameters/ept 135 + Y 136 + 137 + .. note:: If you suspect your L2 (i.e. nested guest) is running slower, 138 + ensure the above are enabled (particularly 139 + ``enable_shadow_vmcs`` and ``ept``). 140 + 141 + 142 + Starting a nested guest (x86) 143 + ----------------------------- 144 + 145 + Once your bare metal host (L0) is configured for nesting, you should be 146 + able to start an L1 guest with:: 147 + 148 + $ qemu-kvm -cpu host [...] 149 + 150 + The above will pass through the host CPU's capabilities as-is to the 151 + guest; or, for better live migration compatibility, use a named CPU 152 + model supported by QEMU.
e.g.:: 153 + 154 + $ qemu-kvm -cpu Haswell-noTSX-IBRS,vmx=on 155 + 156 + then the guest hypervisor will be capable of running a 157 + nested guest with accelerated KVM. 158 + 159 + 160 + Enabling "nested" (s390x) 161 + ------------------------- 162 + 163 + 1. On the host hypervisor (L0), enable the ``nested`` parameter on 164 + s390x:: 165 + 166 + $ rmmod kvm 167 + $ modprobe kvm nested=1 168 + 169 + .. note:: On s390x, the kernel parameter ``hpage`` is mutually exclusive 170 + with the ``nested`` parameter — i.e. to be able to enable 171 + ``nested``, the ``hpage`` parameter *must* be disabled. 172 + 173 + 2. The guest hypervisor (L1) must be provided with the ``sie`` CPU 174 + feature — with QEMU, this can be done by using "host passthrough" 175 + (via the command-line ``-cpu host``). 176 + 177 + 3. Now the KVM module can be loaded in the L1 (guest hypervisor):: 178 + 179 + $ modprobe kvm 180 + 181 + 182 + Live migration with nested KVM 183 + ------------------------------ 184 + 185 + Migrating an L1 guest, with a *live* nested guest in it, to another 186 + bare metal host, works as of Linux kernel 5.3 and QEMU 4.2.0 for 187 + Intel x86 systems, and even on older versions for s390x. 188 + 189 + On AMD systems, once an L1 guest has started an L2 guest, the L1 guest 190 + should no longer be migrated or saved (refer to QEMU documentation on 191 + "savevm"/"loadvm") until the L2 guest shuts down. Attempting to migrate 192 + or save-and-load an L1 guest while an L2 guest is running will result in 193 + undefined behavior. You might see a ``kernel BUG!`` entry in ``dmesg``, a 194 + kernel 'oops', or an outright kernel panic. Such a migrated or loaded L1 195 + guest can no longer be considered stable or secure, and must be restarted. 196 + Migrating an L1 guest merely configured to support nesting, while not 197 + actually running L2 guests, is expected to function normally even on AMD 198 + systems.
199 + 200 + Migrating an L2 guest is always expected to succeed, so all the following 201 + scenarios should work even on AMD systems: 202 + 203 + - Migrating a nested guest (L2) to another L1 guest on the *same* bare 204 + metal host. 205 + 206 + - Migrating a nested guest (L2) to another L1 guest on a *different* 207 + bare metal host. 208 + 209 + - Migrating a nested guest (L2) to a bare metal host. 210 + 211 + Reporting bugs from nested setups 212 + ----------------------------------- 213 + 214 + Debugging "nested" problems can involve sifting through log files across 215 + L0, L1 and L2; this can result in tedious back-and-forth between the bug 216 + reporter and the bug fixer. 217 + 218 + - Mention that you are in a "nested" setup. If you are running any kind 219 + of "nesting" at all, say so. Unfortunately, this needs to be called 220 + out because when reporting bugs, people tend to forget to even 221 + *mention* that they're using nested virtualization. 222 + 223 + - Ensure you are actually running KVM on KVM. Sometimes people do not 224 + have KVM enabled for their guest hypervisor (L1), which results in 225 + them running with pure emulation (what QEMU calls "TCG") while they 226 + think they're running nested KVM, thus confusing "nested virt" 227 + (which could also mean QEMU on KVM) with "nested KVM" (KVM on KVM).
228 + 229 + Information to collect (generic) 230 + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 231 + 232 + The following is not an exhaustive list, but a very good starting point: 233 + 234 + - Kernel, libvirt, and QEMU version from L0 235 + 236 + - Kernel, libvirt, and QEMU version from L1 237 + 238 + - QEMU command-line of L1 -- when using libvirt, you'll find it here: 239 + ``/var/log/libvirt/qemu/instance.log`` 240 + 241 + - QEMU command-line of L2 -- as above, when using libvirt, get the 242 + complete libvirt-generated QEMU command-line 243 + 244 + - ``cat /proc/cpuinfo`` from L0 245 + 246 + - ``cat /proc/cpuinfo`` from L1 247 + 248 + - ``lscpu`` from L0 249 + 250 + - ``lscpu`` from L1 251 + 252 + - Full ``dmesg`` output from L0 253 + 254 + - Full ``dmesg`` output from L1 255 + 256 + x86-specific info to collect 257 + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 258 + 259 + Both commands below, ``x86info`` and ``dmidecode``, are 260 + available under those names on most Linux distributions: 261 + 262 + - Output of: ``x86info -a`` from L0 263 + 264 + - Output of: ``x86info -a`` from L1 265 + 266 + - Output of: ``dmidecode`` from L0 267 + 268 + - Output of: ``dmidecode`` from L1 269 + 270 + s390x-specific info to collect 271 + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 272 + 273 + Along with the generic details mentioned earlier, the following is 274 + also recommended: 275 + 276 + - ``/proc/sysinfo`` from L1; this will also include the info from L0
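The verification step of the x86 how-to above can be sketched as a small shell helper. ``check_nested`` and the mocked sysfs tree are illustrative names, not part of the patch; the sysfs root is parameterized only so the sketch can be exercised on a machine without a real KVM host.

```shell
#!/bin/sh
# check_nested MODULE [SYSROOT] - report whether the module's "nested"
# parameter is enabled; SYSROOT defaults to / and exists only so the
# sketch can be tested against a mocked tree.
check_nested() {
    mod="$1"
    root="${2:-/}"
    param="$root/sys/module/$mod/parameters/nested"
    if [ ! -f "$param" ]; then
        echo "module $mod not loaded (or has no nested parameter)"
        return 1
    fi
    case "$(cat "$param")" in
        Y|1) echo "nested enabled for $mod" ;;
        *)   echo "nested disabled for $mod"; return 1 ;;
    esac
}

# Demonstration against a mocked sysfs tree (kvm_intel on Intel hosts;
# the module is kvm_amd on AMD hosts, as the document notes).
mock=$(mktemp -d)
mkdir -p "$mock/sys/module/kvm_intel/parameters"
echo Y > "$mock/sys/module/kvm_intel/parameters/nested"
check_nested kvm_intel "$mock"    # prints: nested enabled for kvm_intel
rm -rf "$mock"
```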
+6 -13
MAINTAINERS
··· 3936 3936 CEPH COMMON CODE (LIBCEPH) 3937 3937 M: Ilya Dryomov <idryomov@gmail.com> 3938 3938 M: Jeff Layton <jlayton@kernel.org> 3939 - M: Sage Weil <sage@redhat.com> 3940 3939 L: ceph-devel@vger.kernel.org 3941 3940 S: Supported 3942 3941 W: http://ceph.com/ 3943 - T: git git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git 3944 3942 T: git git://github.com/ceph/ceph-client.git 3945 3943 F: include/linux/ceph/ 3946 3944 F: include/linux/crush/ ··· 3946 3948 3947 3949 CEPH DISTRIBUTED FILE SYSTEM CLIENT (CEPH) 3948 3950 M: Jeff Layton <jlayton@kernel.org> 3949 - M: Sage Weil <sage@redhat.com> 3950 3951 M: Ilya Dryomov <idryomov@gmail.com> 3951 3952 L: ceph-devel@vger.kernel.org 3952 3953 S: Supported 3953 3954 W: http://ceph.com/ 3954 - T: git git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git 3955 3955 T: git git://github.com/ceph/ceph-client.git 3956 3956 F: Documentation/filesystems/ceph.rst 3957 3957 F: fs/ceph/ ··· 7115 7119 7116 7120 GENERIC PHY FRAMEWORK 7117 7121 M: Kishon Vijay Abraham I <kishon@ti.com> 7122 + M: Vinod Koul <vkoul@kernel.org> 7118 7123 L: linux-kernel@vger.kernel.org 7119 7124 S: Supported 7120 - T: git git://git.kernel.org/pub/scm/linux/kernel/git/kishon/linux-phy.git 7125 + T: git git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy.git 7121 7126 F: Documentation/devicetree/bindings/phy/ 7122 7127 F: drivers/phy/ 7123 7128 F: include/linux/phy/ ··· 7742 7745 L: platform-driver-x86@vger.kernel.org 7743 7746 S: Orphan 7744 7747 F: drivers/platform/x86/tc1100-wmi.c 7745 - 7746 - HP100: Driver for HP 10/100 Mbit/s Voice Grade Network Adapter Series 7747 - M: Jaroslav Kysela <perex@perex.cz> 7748 - S: Obsolete 7749 - F: drivers/staging/hp/hp100.* 7750 7748 7751 7749 HPET: High Precision Event Timers driver 7752 7750 M: Clemens Ladisch <clemens@ladisch.de> ··· 11710 11718 11711 11719 NETWORKING DRIVERS 11712 11720 M: "David S. 
Miller" <davem@davemloft.net> 11721 + M: Jakub Kicinski <kuba@kernel.org> 11713 11722 L: netdev@vger.kernel.org 11714 - S: Odd Fixes 11723 + S: Maintained 11715 11724 W: http://www.linuxfoundation.org/en/Net 11716 11725 Q: http://patchwork.ozlabs.org/project/netdev/list/ 11717 11726 T: git git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git ··· 14095 14102 14096 14103 RADOS BLOCK DEVICE (RBD) 14097 14104 M: Ilya Dryomov <idryomov@gmail.com> 14098 - M: Sage Weil <sage@redhat.com> 14099 14105 R: Dongsheng Yang <dongsheng.yang@easystack.cn> 14100 14106 L: ceph-devel@vger.kernel.org 14101 14107 S: Supported 14102 14108 W: http://ceph.com/ 14103 - T: git git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git 14104 14109 T: git git://github.com/ceph/ceph-client.git 14105 14110 F: Documentation/ABI/testing/sysfs-bus-rbd 14106 14111 F: drivers/block/rbd.c ··· 14635 14644 14636 14645 S390 IUCV NETWORK LAYER 14637 14646 M: Julian Wiedmann <jwi@linux.ibm.com> 14647 + M: Karsten Graul <kgraul@linux.ibm.com> 14638 14648 M: Ursula Braun <ubraun@linux.ibm.com> 14639 14649 L: linux-s390@vger.kernel.org 14640 14650 S: Supported ··· 14646 14654 14647 14655 S390 NETWORK DRIVERS 14648 14656 M: Julian Wiedmann <jwi@linux.ibm.com> 14657 + M: Karsten Graul <kgraul@linux.ibm.com> 14649 14658 M: Ursula Braun <ubraun@linux.ibm.com> 14650 14659 L: linux-s390@vger.kernel.org 14651 14660 S: Supported
+12 -5
Makefile
··· 2 2 VERSION = 5 3 3 PATCHLEVEL = 7 4 4 SUBLEVEL = 0 5 - EXTRAVERSION = -rc4 5 + EXTRAVERSION = -rc5 6 6 NAME = Kleptomaniac Octopus 7 7 8 8 # *DOCUMENTATION* ··· 729 729 KBUILD_CFLAGS += -Os 730 730 endif 731 731 732 - ifdef CONFIG_CC_DISABLE_WARN_MAYBE_UNINITIALIZED 733 - KBUILD_CFLAGS += -Wno-maybe-uninitialized 734 - endif 735 - 736 732 # Tell gcc to never replace conditional load with a non-conditional one 737 733 KBUILD_CFLAGS += $(call cc-option,--param=allow-store-data-races=0) 738 734 KBUILD_CFLAGS += $(call cc-option,-fno-allow-store-data-races) ··· 876 880 877 881 # disable stringop warnings in gcc 8+ 878 882 KBUILD_CFLAGS += $(call cc-disable-warning, stringop-truncation) 883 + 884 + # We'll want to enable these eventually, but they're not going away for 5.7 at least 885 + KBUILD_CFLAGS += $(call cc-disable-warning, zero-length-bounds) 886 + KBUILD_CFLAGS += $(call cc-disable-warning, array-bounds) 887 + KBUILD_CFLAGS += $(call cc-disable-warning, stringop-overflow) 888 + 889 + # Another good warning that we'll want to enable eventually 890 + KBUILD_CFLAGS += $(call cc-disable-warning, restrict) 891 + 892 + # Enabled with W=2, disabled by default as noisy 893 + KBUILD_CFLAGS += $(call cc-disable-warning, maybe-uninitialized) 879 894 880 895 # disable invalid "can't wrap" optimizations for signed / pointers 881 896 KBUILD_CFLAGS += $(call cc-option,-fno-strict-overflow)
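The ``$(call cc-disable-warning, <w>)`` lines added above rely on a Kbuild macro that first probes whether the compiler recognizes ``-W<w>`` and only then emits the ``-Wno-<w>`` form. A rough shell sketch of that probe follows; the ``cc_disable_warning`` function and the stub compiler are illustrative, not Kbuild's actual implementation.

```shell
#!/bin/sh
# cc_disable_warning WARNING - emit -Wno-WARNING only if the compiler
# (overridable via $CC so the sketch runs without a real toolchain)
# accepts -WWARNING; unknown flags produce no output, mirroring how
# cc-disable-warning expands to nothing for unsupported warnings.
cc_disable_warning() {
    if ${CC:-cc} -Werror "-W$1" -S -x c /dev/null -o /dev/null 2>/dev/null; then
        echo "-Wno-$1"
    fi
}

# Demonstration with a stub "compiler" that only accepts -Wrestrict,
# so both outcomes of the probe are visible without gcc installed.
stub=$(mktemp)
cat > "$stub" <<'EOF'
#!/bin/sh
for a in "$@"; do [ "$a" = "-Wrestrict" ] && exit 0; done
exit 1
EOF
chmod +x "$stub"
CC=$stub cc_disable_warning restrict          # prints: -Wno-restrict
CC=$stub cc_disable_warning no-such-warning   # prints nothing
rm -f "$stub"
```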
+1
arch/arm/Kconfig
··· 12 12 select ARCH_HAS_KEEPINITRD 13 13 select ARCH_HAS_KCOV 14 14 select ARCH_HAS_MEMBARRIER_SYNC_CORE 15 + select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE 15 16 select ARCH_HAS_PTE_SPECIAL if ARM_LPAE 16 17 select ARCH_HAS_PHYS_TO_DMA 17 18 select ARCH_HAS_SETUP_DMA_OPS
+1
arch/arm/configs/keystone_defconfig
··· 147 147 CONFIG_SPI=y 148 148 CONFIG_SPI_DAVINCI=y 149 149 CONFIG_SPI_SPIDEV=y 150 + CONFIG_PTP_1588_CLOCK=y 150 151 CONFIG_PINCTRL_SINGLE=y 151 152 CONFIG_GPIOLIB=y 152 153 CONFIG_GPIO_SYSFS=y
+1
arch/arm/configs/omap2plus_defconfig
··· 274 274 CONFIG_HSI=m 275 275 CONFIG_OMAP_SSI=m 276 276 CONFIG_SSI_PROTOCOL=m 277 + CONFIG_PTP_1588_CLOCK=y 277 278 CONFIG_PINCTRL_SINGLE=y 278 279 CONFIG_DEBUG_GPIO=y 279 280 CONFIG_GPIO_SYSFS=y
+7 -2
arch/arm/include/asm/futex.h
··· 165 165 preempt_enable(); 166 166 #endif 167 167 168 - if (!ret) 169 - *oval = oldval; 168 + /* 169 + * Store unconditionally. If ret != 0 the extra store is the least 170 + * of the worries but GCC cannot figure out that __futex_atomic_op() 171 + * is either setting ret to -EFAULT or storing the old value in 172 + * oldval which results in an uninitialized warning at the call site. 173 + */ 174 + *oval = oldval; 170 175 171 176 return ret; 172 177 }
+1
arch/arm64/Kconfig
··· 20 20 select ARCH_HAS_KCOV 21 21 select ARCH_HAS_KEEPINITRD 22 22 select ARCH_HAS_MEMBARRIER_SYNC_CORE 23 + select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE 23 24 select ARCH_HAS_PTE_DEVMAP 24 25 select ARCH_HAS_PTE_SPECIAL 25 26 select ARCH_HAS_SETUP_DMA_OPS
+1
arch/arm64/kernel/machine_kexec.c
··· 177 177 * the offline CPUs. Therefore, we must use the __* variant here. 178 178 */ 179 179 __flush_icache_range((uintptr_t)reboot_code_buffer, 180 + (uintptr_t)reboot_code_buffer + 180 181 arm64_relocate_new_kernel_size); 181 182 182 183 /* Flush the kimage list and its buffers. */
+7
arch/arm64/kvm/guest.c
··· 200 200 } 201 201 202 202 memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id)); 203 + 204 + if (*vcpu_cpsr(vcpu) & PSR_MODE32_BIT) { 205 + int i; 206 + 207 + for (i = 0; i < 16; i++) 208 + *vcpu_reg32(vcpu, i) = (u32)*vcpu_reg32(vcpu, i); 209 + } 203 210 out: 204 211 return err; 205 212 }
+23
arch/arm64/kvm/hyp/entry.S
··· 18 18 19 19 #define CPU_GP_REG_OFFSET(x) (CPU_GP_REGS + x) 20 20 #define CPU_XREG_OFFSET(x) CPU_GP_REG_OFFSET(CPU_USER_PT_REGS + 8*x) 21 + #define CPU_SP_EL0_OFFSET (CPU_XREG_OFFSET(30) + 8) 21 22 22 23 .text 23 24 .pushsection .hyp.text, "ax" ··· 48 47 ldp x29, lr, [\ctxt, #CPU_XREG_OFFSET(29)] 49 48 .endm 50 49 50 + .macro save_sp_el0 ctxt, tmp 51 + mrs \tmp, sp_el0 52 + str \tmp, [\ctxt, #CPU_SP_EL0_OFFSET] 53 + .endm 54 + 55 + .macro restore_sp_el0 ctxt, tmp 56 + ldr \tmp, [\ctxt, #CPU_SP_EL0_OFFSET] 57 + msr sp_el0, \tmp 58 + .endm 59 + 51 60 /* 52 61 * u64 __guest_enter(struct kvm_vcpu *vcpu, 53 62 * struct kvm_cpu_context *host_ctxt); ··· 70 59 71 60 // Store the host regs 72 61 save_callee_saved_regs x1 62 + 63 + // Save the host's sp_el0 64 + save_sp_el0 x1, x2 73 65 74 66 // Now the host state is stored if we have a pending RAS SError it must 75 67 // affect the host. If any asynchronous exception is pending we defer ··· 96 82 // as it may cause Pointer Authentication key signing mismatch errors 97 83 // when this feature is enabled for kernel code. 98 84 ptrauth_switch_to_guest x29, x0, x1, x2 85 + 86 + // Restore the guest's sp_el0 87 + restore_sp_el0 x29, x0 99 88 100 89 // Restore guest regs x0-x17 101 90 ldp x0, x1, [x29, #CPU_XREG_OFFSET(0)] ··· 147 130 // Store the guest regs x18-x29, lr 148 131 save_callee_saved_regs x1 149 132 133 + // Store the guest's sp_el0 134 + save_sp_el0 x1, x2 135 + 150 136 get_host_ctxt x2, x3 151 137 152 138 // Macro ptrauth_switch_to_guest format: ··· 158 138 // as it may cause Pointer Authentication key signing mismatch errors 159 139 // when this feature is enabled for kernel code. 160 140 ptrauth_switch_to_host x1, x2, x3, x4, x5 141 + 142 + // Restore the host's sp_el0 143 + restore_sp_el0 x2, x3 161 144 162 145 // Now restore the host regs 163 146 restore_callee_saved_regs x2
-1
arch/arm64/kvm/hyp/hyp-entry.S
··· 198 198 .macro invalid_vector label, target = __hyp_panic 199 199 .align 2 200 200 SYM_CODE_START(\label) 201 - \label: 202 201 b \target 203 202 SYM_CODE_END(\label) 204 203 .endm
+3 -14
arch/arm64/kvm/hyp/sysreg-sr.c
··· 15 15 /* 16 16 * Non-VHE: Both host and guest must save everything. 17 17 * 18 - * VHE: Host and guest must save mdscr_el1 and sp_el0 (and the PC and pstate, 19 - * which are handled as part of the el2 return state) on every switch. 18 + * VHE: Host and guest must save mdscr_el1 and sp_el0 (and the PC and 19 + * pstate, which are handled as part of the el2 return state) on every 20 + * switch (sp_el0 is being dealt with in the assembly code). 20 21 * tpidr_el0 and tpidrro_el0 only need to be switched when going 21 22 * to host userspace or a different VCPU. EL1 registers only need to be 22 23 * switched when potentially going to run a different VCPU. The latter two ··· 27 26 static void __hyp_text __sysreg_save_common_state(struct kvm_cpu_context *ctxt) 28 27 { 29 28 ctxt->sys_regs[MDSCR_EL1] = read_sysreg(mdscr_el1); 30 - 31 - /* 32 - * The host arm64 Linux uses sp_el0 to point to 'current' and it must 33 - * therefore be saved/restored on every entry/exit to/from the guest. 34 - */ 35 - ctxt->gp_regs.regs.sp = read_sysreg(sp_el0); 36 29 } 37 30 38 31 static void __hyp_text __sysreg_save_user_state(struct kvm_cpu_context *ctxt) ··· 94 99 static void __hyp_text __sysreg_restore_common_state(struct kvm_cpu_context *ctxt) 95 100 { 96 101 write_sysreg(ctxt->sys_regs[MDSCR_EL1], mdscr_el1); 97 - 98 - /* 99 - * The host arm64 Linux uses sp_el0 to point to 'current' and it must 100 - * therefore be saved/restored on every entry/exit to/from the guest. 101 - */ 102 - write_sysreg(ctxt->gp_regs.regs.sp, sp_el0); 103 102 } 104 103 105 104 static void __hyp_text __sysreg_restore_user_state(struct kvm_cpu_context *ctxt)
+2
arch/arm64/mm/hugetlbpage.c
··· 230 230 ptep = (pte_t *)pudp; 231 231 } else if (sz == (CONT_PTE_SIZE)) { 232 232 pmdp = pmd_alloc(mm, pudp, addr); 233 + if (!pmdp) 234 + return NULL; 233 235 234 236 WARN_ON(addr & (sz - 1)); 235 237 /*
+1
arch/powerpc/kvm/powerpc.c
··· 521 521 case KVM_CAP_IOEVENTFD: 522 522 case KVM_CAP_DEVICE_CTRL: 523 523 case KVM_CAP_IMMEDIATE_EXIT: 524 + case KVM_CAP_SET_GUEST_DEBUG: 524 525 r = 1; 525 526 break; 526 527 case KVM_CAP_PPC_GUEST_DEBUG_SSTEP:
+2 -1
arch/riscv/Kconfig
··· 54 54 select GENERIC_ARCH_TOPOLOGY if SMP 55 55 select ARCH_HAS_PTE_SPECIAL 56 56 select ARCH_HAS_MMIOWB 57 - select ARCH_HAS_DEBUG_VIRTUAL 57 + select ARCH_HAS_DEBUG_VIRTUAL if MMU 58 58 select HAVE_EBPF_JIT if MMU 59 59 select EDAC_SUPPORT 60 60 select ARCH_HAS_GIGANTIC_PAGE ··· 136 136 def_bool y 137 137 138 138 config SYS_SUPPORTS_HUGETLBFS 139 + depends on MMU 139 140 def_bool y 140 141 141 142 config STACKTRACE_SUPPORT
+9 -8
arch/riscv/Kconfig.socs
··· 11 11 This enables support for SiFive SoC platform hardware. 12 12 13 13 config SOC_VIRT 14 - bool "QEMU Virt Machine" 15 - select POWER_RESET_SYSCON 16 - select POWER_RESET_SYSCON_POWEROFF 17 - select GOLDFISH 18 - select RTC_DRV_GOLDFISH 19 - select SIFIVE_PLIC 20 - help 21 - This enables support for QEMU Virt Machine. 14 + bool "QEMU Virt Machine" 15 + select POWER_RESET 16 + select POWER_RESET_SYSCON 17 + select POWER_RESET_SYSCON_POWEROFF 18 + select GOLDFISH 19 + select RTC_DRV_GOLDFISH if RTC_CLASS 20 + select SIFIVE_PLIC 21 + help 22 + This enables support for QEMU Virt Machine. 22 23 23 24 config SOC_KENDRYTE 24 25 bool "Kendryte K210 SoC"
-3
arch/riscv/include/asm/csr.h
··· 51 51 #define CAUSE_IRQ_FLAG (_AC(1, UL) << (__riscv_xlen - 1)) 52 52 53 53 /* Interrupt causes (minus the high bit) */ 54 - #define IRQ_U_SOFT 0 55 54 #define IRQ_S_SOFT 1 56 55 #define IRQ_M_SOFT 3 57 - #define IRQ_U_TIMER 4 58 56 #define IRQ_S_TIMER 5 59 57 #define IRQ_M_TIMER 7 60 - #define IRQ_U_EXT 8 61 58 #define IRQ_S_EXT 9 62 59 #define IRQ_M_EXT 11 63 60
+22
arch/riscv/include/asm/hwcap.h
··· 8 8 #ifndef _ASM_RISCV_HWCAP_H 9 9 #define _ASM_RISCV_HWCAP_H 10 10 11 + #include <linux/bits.h> 11 12 #include <uapi/asm/hwcap.h> 12 13 13 14 #ifndef __ASSEMBLY__ ··· 23 22 }; 24 23 25 24 extern unsigned long elf_hwcap; 25 + 26 + #define RISCV_ISA_EXT_a ('a' - 'a') 27 + #define RISCV_ISA_EXT_c ('c' - 'a') 28 + #define RISCV_ISA_EXT_d ('d' - 'a') 29 + #define RISCV_ISA_EXT_f ('f' - 'a') 30 + #define RISCV_ISA_EXT_h ('h' - 'a') 31 + #define RISCV_ISA_EXT_i ('i' - 'a') 32 + #define RISCV_ISA_EXT_m ('m' - 'a') 33 + #define RISCV_ISA_EXT_s ('s' - 'a') 34 + #define RISCV_ISA_EXT_u ('u' - 'a') 35 + 36 + #define RISCV_ISA_EXT_MAX 64 37 + 38 + unsigned long riscv_isa_extension_base(const unsigned long *isa_bitmap); 39 + 40 + #define riscv_isa_extension_mask(ext) BIT_MASK(RISCV_ISA_EXT_##ext) 41 + 42 + bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, int bit); 43 + #define riscv_isa_extension_available(isa_bitmap, ext) \ 44 + __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_##ext) 45 + 26 46 #endif 27 47 28 48 #endif /* _ASM_RISCV_HWCAP_H */
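The ``('x' - 'a')`` definitions added to ``hwcap.h`` encode each single-letter ISA extension as a bit position, and ``riscv_isa_extension_mask()`` turns that position into a single-bit mask via ``BIT_MASK``. The same arithmetic can be sketched in shell; the function names here are illustrative stand-ins, not kernel interfaces.

```shell
#!/bin/sh
# ord CHAR - numeric character code (the leading quote is the POSIX
# printf idiom for this)
ord() { printf '%d' "'$1"; }

# Mirrors RISCV_ISA_EXT_<x> = ('x' - 'a'): letter -> bit position
riscv_isa_ext_bit() { echo $(( $(ord "$1") - $(ord a) )); }

# Mirrors riscv_isa_extension_mask(ext) = BIT_MASK(RISCV_ISA_EXT_<ext>)
riscv_isa_ext_mask() { echo $(( 1 << $(riscv_isa_ext_bit "$1") )); }

riscv_isa_ext_bit a    # prints: 0
riscv_isa_ext_bit m    # prints: 12
riscv_isa_ext_mask c   # prints: 4
```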
+2
arch/riscv/include/asm/mmio.h
··· 16 16 17 17 #ifndef CONFIG_MMU 18 18 #define pgprot_noncached(x) (x) 19 + #define pgprot_writecombine(x) (x) 20 + #define pgprot_device(x) (x) 19 21 #endif /* CONFIG_MMU */ 20 22 21 23 /* Generic IO read/write. These perform native-endian accesses. */
+1
arch/riscv/include/asm/mmiowb.h
··· 9 9 */ 10 10 #define mmiowb() __asm__ __volatile__ ("fence o,w" : : : "memory"); 11 11 12 + #include <linux/smp.h> 12 13 #include <asm-generic/mmiowb.h> 13 14 14 15 #endif /* _ASM_RISCV_MMIOWB_H */
+2 -6
arch/riscv/include/asm/perf_event.h
··· 12 12 #include <linux/ptrace.h> 13 13 #include <linux/interrupt.h> 14 14 15 + #ifdef CONFIG_RISCV_BASE_PMU 15 16 #define RISCV_BASE_COUNTERS 2 16 17 17 18 /* 18 19 * The RISCV_MAX_COUNTERS parameter should be specified. 19 20 */ 20 21 21 - #ifdef CONFIG_RISCV_BASE_PMU 22 22 #define RISCV_MAX_COUNTERS 2 23 - #endif 24 - 25 - #ifndef RISCV_MAX_COUNTERS 26 - #error "Please provide a valid RISCV_MAX_COUNTERS for the PMU." 27 - #endif 28 23 29 24 /* 30 25 * These are the indexes of bits in counteren register *minus* 1, ··· 77 82 int irq; 78 83 }; 79 84 85 + #endif 80 86 #ifdef CONFIG_PERF_EVENTS 81 87 #define perf_arch_bpf_user_pt_regs(regs) (struct user_regs_struct *)regs 82 88 #endif
+3
arch/riscv/include/asm/pgtable.h
··· 470 470 471 471 #else /* CONFIG_MMU */ 472 472 473 + #define PAGE_SHARED __pgprot(0) 473 474 #define PAGE_KERNEL __pgprot(0) 474 475 #define swapper_pg_dir NULL 475 476 #define VMALLOC_START 0 476 477 477 478 #define TASK_SIZE 0xffffffffUL 479 + 480 + static inline void __kernel_map_pages(struct page *page, int numpages, int enable) {} 478 481 479 482 #endif /* !CONFIG_MMU */ 480 483
-8
arch/riscv/include/asm/set_memory.h
··· 22 22 static inline int set_memory_nx(unsigned long addr, int numpages) { return 0; } 23 23 #endif 24 24 25 - #ifdef CONFIG_STRICT_KERNEL_RWX 26 - void set_kernel_text_ro(void); 27 - void set_kernel_text_rw(void); 28 - #else 29 - static inline void set_kernel_text_ro(void) { } 30 - static inline void set_kernel_text_rw(void) { } 31 - #endif 32 - 33 25 int set_direct_map_invalid_noflush(struct page *page); 34 26 int set_direct_map_default_noflush(struct page *page); 35 27
+1 -1
arch/riscv/kernel/Makefile
··· 43 43 obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o 44 44 obj-$(CONFIG_DYNAMIC_FTRACE) += mcount-dyn.o 45 45 46 - obj-$(CONFIG_PERF_EVENTS) += perf_event.o 46 + obj-$(CONFIG_RISCV_BASE_PMU) += perf_event.o 47 47 obj-$(CONFIG_PERF_EVENTS) += perf_callchain.o 48 48 obj-$(CONFIG_HAVE_PERF_REGS) += perf_regs.o 49 49 obj-$(CONFIG_RISCV_SBI) += sbi.o
+2 -2
arch/riscv/kernel/cpu_ops.c
··· 15 15 16 16 const struct cpu_operations *cpu_ops[NR_CPUS] __ro_after_init; 17 17 18 - void *__cpu_up_stack_pointer[NR_CPUS]; 19 - void *__cpu_up_task_pointer[NR_CPUS]; 18 + void *__cpu_up_stack_pointer[NR_CPUS] __section(.data); 19 + void *__cpu_up_task_pointer[NR_CPUS] __section(.data); 20 20 21 21 extern const struct cpu_operations cpu_ops_sbi; 22 22 extern const struct cpu_operations cpu_ops_spinwait;
+80 -3
arch/riscv/kernel/cpufeature.c
··· 6 6 * Copyright (C) 2017 SiFive 7 7 */ 8 8 9 + #include <linux/bitmap.h> 9 10 #include <linux/of.h> 10 11 #include <asm/processor.h> 11 12 #include <asm/hwcap.h> ··· 14 13 #include <asm/switch_to.h> 15 14 16 15 unsigned long elf_hwcap __read_mostly; 16 + 17 + /* Host ISA bitmap */ 18 + static DECLARE_BITMAP(riscv_isa, RISCV_ISA_EXT_MAX) __read_mostly; 19 + 17 20 #ifdef CONFIG_FPU 18 21 bool has_fpu __read_mostly; 19 22 #endif 23 + 24 + /** 25 + * riscv_isa_extension_base() - Get base extension word 26 + * 27 + * @isa_bitmap: ISA bitmap to use 28 + * Return: base extension word as unsigned long value 29 + * 30 + * NOTE: If isa_bitmap is NULL then Host ISA bitmap will be used. 31 + */ 32 + unsigned long riscv_isa_extension_base(const unsigned long *isa_bitmap) 33 + { 34 + if (!isa_bitmap) 35 + return riscv_isa[0]; 36 + return isa_bitmap[0]; 37 + } 38 + EXPORT_SYMBOL_GPL(riscv_isa_extension_base); 39 + 40 + /** 41 + * __riscv_isa_extension_available() - Check whether given extension 42 + * is available or not 43 + * 44 + * @isa_bitmap: ISA bitmap to use 45 + * @bit: bit position of the desired extension 46 + * Return: true or false 47 + * 48 + * NOTE: If isa_bitmap is NULL then Host ISA bitmap will be used. 49 + */ 50 + bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, int bit) 51 + { 52 + const unsigned long *bmap = (isa_bitmap) ? isa_bitmap : riscv_isa; 53 + 54 + if (bit >= RISCV_ISA_EXT_MAX) 55 + return false; 56 + 57 + return test_bit(bit, bmap) ? 
true : false; 58 + } 59 + EXPORT_SYMBOL_GPL(__riscv_isa_extension_available); 20 60 21 61 void riscv_fill_hwcap(void) 22 62 { 23 63 struct device_node *node; 24 64 const char *isa; 25 - size_t i; 65 + char print_str[BITS_PER_LONG + 1]; 66 + size_t i, j, isa_len; 26 67 static unsigned long isa2hwcap[256] = {0}; 27 68 28 69 isa2hwcap['i'] = isa2hwcap['I'] = COMPAT_HWCAP_ISA_I; ··· 76 33 77 34 elf_hwcap = 0; 78 35 36 + bitmap_zero(riscv_isa, RISCV_ISA_EXT_MAX); 37 + 79 38 for_each_of_cpu_node(node) { 80 39 unsigned long this_hwcap = 0; 40 + unsigned long this_isa = 0; 81 41 82 42 if (riscv_of_processor_hartid(node) < 0) 83 43 continue; ··· 90 44 continue; 91 45 } 92 46 93 - for (i = 0; i < strlen(isa); ++i) 47 + i = 0; 48 + isa_len = strlen(isa); 49 + #if IS_ENABLED(CONFIG_32BIT) 50 + if (!strncmp(isa, "rv32", 4)) 51 + i += 4; 52 + #elif IS_ENABLED(CONFIG_64BIT) 53 + if (!strncmp(isa, "rv64", 4)) 54 + i += 4; 55 + #endif 56 + for (; i < isa_len; ++i) { 94 57 this_hwcap |= isa2hwcap[(unsigned char)(isa[i])]; 58 + /* 59 + * TODO: X, Y and Z extension parsing for Host ISA 60 + * bitmap will be added in-future. 61 + */ 62 + if ('a' <= isa[i] && isa[i] < 'x') 63 + this_isa |= (1UL << (isa[i] - 'a')); 64 + } 95 65 96 66 /* 97 67 * All "okay" hart should have same isa. 
Set HWCAP based on ··· 118 56 elf_hwcap &= this_hwcap; 119 57 else 120 58 elf_hwcap = this_hwcap; 59 + 60 + if (riscv_isa[0]) 61 + riscv_isa[0] &= this_isa; 62 + else 63 + riscv_isa[0] = this_isa; 121 64 } 122 65 123 66 /* We don't support systems with F but without D, so mask those out ··· 132 65 elf_hwcap &= ~COMPAT_HWCAP_ISA_F; 133 66 } 134 67 135 - pr_info("elf_hwcap is 0x%lx\n", elf_hwcap); 68 + memset(print_str, 0, sizeof(print_str)); 69 + for (i = 0, j = 0; i < BITS_PER_LONG; i++) 70 + if (riscv_isa[0] & BIT_MASK(i)) 71 + print_str[j++] = (char)('a' + i); 72 + pr_info("riscv: ISA extensions %s\n", print_str); 73 + 74 + memset(print_str, 0, sizeof(print_str)); 75 + for (i = 0, j = 0; i < BITS_PER_LONG; i++) 76 + if (elf_hwcap & BIT_MASK(i)) 77 + print_str[j++] = (char)('a' + i); 78 + pr_info("riscv: ELF capabilities %s\n", print_str); 136 79 137 80 #ifdef CONFIG_FPU 138 81 if (elf_hwcap & (COMPAT_HWCAP_ISA_F | COMPAT_HWCAP_ISA_D))
+4 -4
arch/riscv/kernel/perf_event.c
··· 147 147 return riscv_pmu->hw_events[config]; 148 148 } 149 149 150 - int riscv_map_cache_decode(u64 config, unsigned int *type, 150 + static int riscv_map_cache_decode(u64 config, unsigned int *type, 151 151 unsigned int *op, unsigned int *result) 152 152 { 153 153 return -ENOENT; ··· 342 342 343 343 static DEFINE_MUTEX(pmc_reserve_mutex); 344 344 345 - irqreturn_t riscv_base_pmu_handle_irq(int irq_num, void *dev) 345 + static irqreturn_t riscv_base_pmu_handle_irq(int irq_num, void *dev) 346 346 { 347 347 return IRQ_NONE; 348 348 } ··· 361 361 return err; 362 362 } 363 363 364 - void release_pmc_hardware(void) 364 + static void release_pmc_hardware(void) 365 365 { 366 366 mutex_lock(&pmc_reserve_mutex); 367 367 if (riscv_pmu->irq >= 0) ··· 464 464 { /* sentinel value */ } 465 465 }; 466 466 467 - int __init init_hw_perf_events(void) 467 + static int __init init_hw_perf_events(void) 468 468 { 469 469 struct device_node *node = of_find_node_by_type(NULL, "pmu"); 470 470 const struct of_device_id *of_id;
+2
arch/riscv/kernel/smp.c
··· 10 10 11 11 #include <linux/cpu.h> 12 12 #include <linux/interrupt.h> 13 + #include <linux/module.h> 13 14 #include <linux/profile.h> 14 15 #include <linux/smp.h> 15 16 #include <linux/sched.h> ··· 64 63 for_each_cpu(cpu, in) 65 64 cpumask_set_cpu(cpuid_to_hartid_map(cpu), out); 66 65 } 66 + EXPORT_SYMBOL_GPL(riscv_cpuid_to_hartid_mask); 67 67 68 68 bool arch_match_cpu_phys_id(int cpu, u64 phys_id) 69 69 {
+1 -1
arch/riscv/kernel/stacktrace.c
··· 65 65 66 66 #else /* !CONFIG_FRAME_POINTER */ 67 67 68 - static void notrace walk_stackframe(struct task_struct *task, 68 + void notrace walk_stackframe(struct task_struct *task, 69 69 struct pt_regs *regs, bool (*fn)(unsigned long, void *), void *arg) 70 70 { 71 71 unsigned long sp, pc;
+1 -1
arch/riscv/kernel/vdso/Makefile
··· 12 12 vdso-syms += flush_icache 13 13 14 14 # Files to link into the vdso 15 - obj-vdso = $(patsubst %, %.o, $(vdso-syms)) 15 + obj-vdso = $(patsubst %, %.o, $(vdso-syms)) note.o 16 16 17 17 # Build rules 18 18 targets := $(obj-vdso) vdso.so vdso.so.dbg vdso.lds vdso-dummy.o
+12
arch/riscv/kernel/vdso/note.S
··· 1 + /* SPDX-License-Identifier: GPL-2.0-or-later */ 2 + /* 3 + * This supplies .note.* sections to go into the PT_NOTE inside the vDSO text. 4 + * Here we can supply some information useful to userland. 5 + */ 6 + 7 + #include <linux/elfnote.h> 8 + #include <linux/version.h> 9 + 10 + ELFNOTE_START(Linux, 0, "a") 11 + .long LINUX_VERSION_CODE 12 + ELFNOTE_END
+2 -17
arch/riscv/mm/init.c
··· 150 150 memblock_reserve(vmlinux_start, vmlinux_end - vmlinux_start); 151 151 152 152 set_max_mapnr(PFN_DOWN(mem_size)); 153 - max_low_pfn = PFN_DOWN(memblock_end_of_DRAM()); 153 + max_pfn = PFN_DOWN(memblock_end_of_DRAM()); 154 + max_low_pfn = max_pfn; 154 155 155 156 #ifdef CONFIG_BLK_DEV_INITRD 156 157 setup_initrd(); ··· 502 501 #endif /* CONFIG_MMU */ 503 502 504 503 #ifdef CONFIG_STRICT_KERNEL_RWX 505 - void set_kernel_text_rw(void) 506 - { 507 - unsigned long text_start = (unsigned long)_text; 508 - unsigned long text_end = (unsigned long)_etext; 509 - 510 - set_memory_rw(text_start, (text_end - text_start) >> PAGE_SHIFT); 511 - } 512 - 513 - void set_kernel_text_ro(void) 514 - { 515 - unsigned long text_start = (unsigned long)_text; 516 - unsigned long text_end = (unsigned long)_etext; 517 - 518 - set_memory_ro(text_start, (text_end - text_start) >> PAGE_SHIFT); 519 - } 520 - 521 504 void mark_rodata_ro(void) 522 505 { 523 506 unsigned long text_start = (unsigned long)_text;
+1
arch/s390/kvm/kvm-s390.c
··· 545 545 case KVM_CAP_S390_AIS: 546 546 case KVM_CAP_S390_AIS_MIGRATION: 547 547 case KVM_CAP_S390_VCPU_RESETS: 548 + case KVM_CAP_SET_GUEST_DEBUG: 548 549 r = 1; 549 550 break; 550 551 case KVM_CAP_S390_HPAGE_1M:
+3 -1
arch/s390/kvm/priv.c
··· 626 626 * available for the guest are AQIC and TAPQ with the t bit set 627 627 * since we do not set IC.3 (FIII) we currently will only intercept 628 628 * the AQIC function code. 629 + * Note: running nested under z/VM can result in intercepts for other 630 + * function codes, e.g. PQAP(QCI). We do not support this and bail out. 629 631 */ 630 632 reg0 = vcpu->run->s.regs.gprs[0]; 631 633 fc = (reg0 >> 24) & 0xff; 632 - if (WARN_ON_ONCE(fc != 0x03)) 634 + if (fc != 0x03) 633 635 return -EOPNOTSUPP; 634 636 635 637 /* PQAP instruction is allowed for guest kernel only */
+1
arch/x86/Kconfig
··· 68 68 select ARCH_HAS_KCOV if X86_64 69 69 select ARCH_HAS_MEM_ENCRYPT 70 70 select ARCH_HAS_MEMBARRIER_SYNC_CORE 71 + select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE 71 72 select ARCH_HAS_PMEM_API if X86_64 72 73 select ARCH_HAS_PTE_DEVMAP if X86_64 73 74 select ARCH_HAS_PTE_SPECIAL
+21 -19
arch/x86/entry/calling.h
··· 98 98 #define SIZEOF_PTREGS 21*8 99 99 100 100 .macro PUSH_AND_CLEAR_REGS rdx=%rdx rax=%rax save_ret=0 101 - /* 102 - * Push registers and sanitize registers of values that a 103 - * speculation attack might otherwise want to exploit. The 104 - * lower registers are likely clobbered well before they 105 - * could be put to use in a speculative execution gadget. 106 - * Interleave XOR with PUSH for better uop scheduling: 107 - */ 108 101 .if \save_ret 109 102 pushq %rsi /* pt_regs->si */ 110 103 movq 8(%rsp), %rsi /* temporarily store the return address in %rsi */ ··· 107 114 pushq %rsi /* pt_regs->si */ 108 115 .endif 109 116 pushq \rdx /* pt_regs->dx */ 110 - xorl %edx, %edx /* nospec dx */ 111 117 pushq %rcx /* pt_regs->cx */ 112 - xorl %ecx, %ecx /* nospec cx */ 113 118 pushq \rax /* pt_regs->ax */ 114 119 pushq %r8 /* pt_regs->r8 */ 115 - xorl %r8d, %r8d /* nospec r8 */ 116 120 pushq %r9 /* pt_regs->r9 */ 117 - xorl %r9d, %r9d /* nospec r9 */ 118 121 pushq %r10 /* pt_regs->r10 */ 119 - xorl %r10d, %r10d /* nospec r10 */ 120 122 pushq %r11 /* pt_regs->r11 */ 121 - xorl %r11d, %r11d /* nospec r11*/ 122 123 pushq %rbx /* pt_regs->rbx */ 123 - xorl %ebx, %ebx /* nospec rbx*/ 124 124 pushq %rbp /* pt_regs->rbp */ 125 - xorl %ebp, %ebp /* nospec rbp*/ 126 125 pushq %r12 /* pt_regs->r12 */ 127 - xorl %r12d, %r12d /* nospec r12*/ 128 126 pushq %r13 /* pt_regs->r13 */ 129 - xorl %r13d, %r13d /* nospec r13*/ 130 127 pushq %r14 /* pt_regs->r14 */ 131 - xorl %r14d, %r14d /* nospec r14*/ 132 128 pushq %r15 /* pt_regs->r15 */ 133 - xorl %r15d, %r15d /* nospec r15*/ 134 129 UNWIND_HINT_REGS 130 + 135 131 .if \save_ret 136 132 pushq %rsi /* return address on top of stack */ 137 133 .endif 134 + 135 + /* 136 + * Sanitize registers of values that a speculation attack might 137 + * otherwise want to exploit. The lower registers are likely clobbered 138 + * well before they could be put to use in a speculative execution 139 + * gadget. 
140 + */ 141 + xorl %edx, %edx /* nospec dx */ 142 + xorl %ecx, %ecx /* nospec cx */ 143 + xorl %r8d, %r8d /* nospec r8 */ 144 + xorl %r9d, %r9d /* nospec r9 */ 145 + xorl %r10d, %r10d /* nospec r10 */ 146 + xorl %r11d, %r11d /* nospec r11 */ 147 + xorl %ebx, %ebx /* nospec rbx */ 148 + xorl %ebp, %ebp /* nospec rbp */ 149 + xorl %r12d, %r12d /* nospec r12 */ 150 + xorl %r13d, %r13d /* nospec r13 */ 151 + xorl %r14d, %r14d /* nospec r14 */ 152 + xorl %r15d, %r15d /* nospec r15 */ 153 + 138 154 .endm 139 155 140 156 .macro POP_REGS pop_rdi=1 skip_r11rcx=0
+7 -7
arch/x86/entry/entry_64.S
··· 249 249 */ 250 250 syscall_return_via_sysret: 251 251 /* rcx and r11 are already restored (see code above) */ 252 - UNWIND_HINT_EMPTY 253 252 POP_REGS pop_rdi=0 skip_r11rcx=1 254 253 255 254 /* ··· 257 258 */ 258 259 movq %rsp, %rdi 259 260 movq PER_CPU_VAR(cpu_tss_rw + TSS_sp0), %rsp 261 + UNWIND_HINT_EMPTY 260 262 261 263 pushq RSP-RDI(%rdi) /* RSP */ 262 264 pushq (%rdi) /* RDI */ ··· 279 279 * %rdi: prev task 280 280 * %rsi: next task 281 281 */ 282 - SYM_CODE_START(__switch_to_asm) 283 - UNWIND_HINT_FUNC 282 + SYM_FUNC_START(__switch_to_asm) 284 283 /* 285 284 * Save callee-saved registers 286 285 * This must match the order in inactive_task_frame ··· 320 321 popq %rbp 321 322 322 323 jmp __switch_to 323 - SYM_CODE_END(__switch_to_asm) 324 + SYM_FUNC_END(__switch_to_asm) 324 325 325 326 /* 326 327 * A newly forked process directly context switches into this address. ··· 511 512 * +----------------------------------------------------+ 512 513 */ 513 514 SYM_CODE_START(interrupt_entry) 514 - UNWIND_HINT_FUNC 515 + UNWIND_HINT_IRET_REGS offset=16 515 516 ASM_CLAC 516 517 cld 517 518 ··· 543 544 pushq 5*8(%rdi) /* regs->eflags */ 544 545 pushq 4*8(%rdi) /* regs->cs */ 545 546 pushq 3*8(%rdi) /* regs->ip */ 547 + UNWIND_HINT_IRET_REGS 546 548 pushq 2*8(%rdi) /* regs->orig_ax */ 547 549 pushq 8(%rdi) /* return address */ 548 - UNWIND_HINT_FUNC 549 550 550 551 movq (%rdi), %rdi 551 552 jmp 2f ··· 636 637 */ 637 638 movq %rsp, %rdi 638 639 movq PER_CPU_VAR(cpu_tss_rw + TSS_sp0), %rsp 640 + UNWIND_HINT_EMPTY 639 641 640 642 /* Copy the IRET frame to the trampoline stack. */ 641 643 pushq 6*8(%rdi) /* SS */ ··· 1739 1739 1740 1740 movq PER_CPU_VAR(cpu_current_top_of_stack), %rax 1741 1741 leaq -PTREGS_SIZE(%rax), %rsp 1742 - UNWIND_HINT_FUNC sp_offset=PTREGS_SIZE 1742 + UNWIND_HINT_REGS 1743 1743 1744 1744 call do_exit 1745 1745 SYM_CODE_END(rewind_stack_do_exit)
+9 -2
arch/x86/include/asm/ftrace.h
··· 56 56 57 57 #ifndef __ASSEMBLY__ 58 58 59 + #if defined(CONFIG_FUNCTION_TRACER) && defined(CONFIG_DYNAMIC_FTRACE) 60 + extern void set_ftrace_ops_ro(void); 61 + #else 62 + static inline void set_ftrace_ops_ro(void) { } 63 + #endif 64 + 59 65 #define ARCH_HAS_SYSCALL_MATCH_SYM_NAME 60 66 static inline bool arch_syscall_match_sym_name(const char *sym, const char *name) 61 67 { 62 68 /* 63 69 * Compare the symbol name with the system call name. Skip the 64 - * "__x64_sys", "__ia32_sys" or simple "sys" prefix. 70 + * "__x64_sys", "__ia32_sys", "__do_sys" or simple "sys" prefix. 65 71 */ 66 72 return !strcmp(sym + 3, name + 3) || 67 73 (!strncmp(sym, "__x64_", 6) && !strcmp(sym + 9, name + 3)) || 68 - (!strncmp(sym, "__ia32_", 7) && !strcmp(sym + 10, name + 3)); 74 + (!strncmp(sym, "__ia32_", 7) && !strcmp(sym + 10, name + 3)) || 75 + (!strncmp(sym, "__do_sys", 8) && !strcmp(sym + 8, name + 3)); 69 76 } 70 77 71 78 #ifndef COMPILE_OFFSETS
+2 -2
arch/x86/include/asm/kvm_host.h
··· 1663 1663 static inline bool kvm_irq_is_postable(struct kvm_lapic_irq *irq) 1664 1664 { 1665 1665 /* We can only post Fixed and LowPrio IRQs */ 1666 - return (irq->delivery_mode == dest_Fixed || 1667 - irq->delivery_mode == dest_LowestPrio); 1666 + return (irq->delivery_mode == APIC_DM_FIXED || 1667 + irq->delivery_mode == APIC_DM_LOWEST); 1668 1668 } 1669 1669 1670 1670 static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
+1 -1
arch/x86/include/asm/unwind.h
··· 19 19 #if defined(CONFIG_UNWINDER_ORC) 20 20 bool signal, full_regs; 21 21 unsigned long sp, bp, ip; 22 - struct pt_regs *regs; 22 + struct pt_regs *regs, *prev_regs; 23 23 #elif defined(CONFIG_UNWINDER_FRAME_POINTER) 24 24 bool got_irq; 25 25 unsigned long *bp, *orig_sp, ip;
+14 -13
arch/x86/kernel/apic/apic.c
··· 352 352 * According to Intel, MFENCE can do the serialization here. 353 353 */ 354 354 asm volatile("mfence" : : : "memory"); 355 - 356 - printk_once(KERN_DEBUG "TSC deadline timer enabled\n"); 357 355 return; 358 356 } 359 357 ··· 544 546 }; 545 547 static DEFINE_PER_CPU(struct clock_event_device, lapic_events); 546 548 547 - static u32 hsx_deadline_rev(void) 549 + static __init u32 hsx_deadline_rev(void) 548 550 { 549 551 switch (boot_cpu_data.x86_stepping) { 550 552 case 0x02: return 0x3a; /* EP */ ··· 554 556 return ~0U; 555 557 } 556 558 557 - static u32 bdx_deadline_rev(void) 559 + static __init u32 bdx_deadline_rev(void) 558 560 { 559 561 switch (boot_cpu_data.x86_stepping) { 560 562 case 0x02: return 0x00000011; ··· 566 568 return ~0U; 567 569 } 568 570 569 - static u32 skx_deadline_rev(void) 571 + static __init u32 skx_deadline_rev(void) 570 572 { 571 573 switch (boot_cpu_data.x86_stepping) { 572 574 case 0x03: return 0x01000136; ··· 579 581 return ~0U; 580 582 } 581 583 582 - static const struct x86_cpu_id deadline_match[] = { 584 + static const struct x86_cpu_id deadline_match[] __initconst = { 583 585 X86_MATCH_INTEL_FAM6_MODEL( HASWELL_X, &hsx_deadline_rev), 584 586 X86_MATCH_INTEL_FAM6_MODEL( BROADWELL_X, 0x0b000020), 585 587 X86_MATCH_INTEL_FAM6_MODEL( BROADWELL_D, &bdx_deadline_rev), ··· 601 603 {}, 602 604 }; 603 605 604 - static void apic_check_deadline_errata(void) 606 + static __init bool apic_validate_deadline_timer(void) 605 607 { 606 608 const struct x86_cpu_id *m; 607 609 u32 rev; 608 610 609 - if (!boot_cpu_has(X86_FEATURE_TSC_DEADLINE_TIMER) || 610 - boot_cpu_has(X86_FEATURE_HYPERVISOR)) 611 - return; 611 + if (!boot_cpu_has(X86_FEATURE_TSC_DEADLINE_TIMER)) 612 + return false; 613 + if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) 614 + return true; 612 615 613 616 m = x86_match_cpu(deadline_match); 614 617 if (!m) 615 - return; 618 + return true; 616 619 617 620 /* 618 621 * Function pointers will have the MSB set due to address layout, ··· 
625 626 rev = (u32)m->driver_data; 626 627 627 628 if (boot_cpu_data.microcode >= rev) 628 - return; 629 + return true; 629 630 630 631 setup_clear_cpu_cap(X86_FEATURE_TSC_DEADLINE_TIMER); 631 632 pr_err(FW_BUG "TSC_DEADLINE disabled due to Errata; " 632 633 "please update microcode to version: 0x%x (or later)\n", rev); 634 + return false; 633 635 } 634 636 635 637 /* ··· 2092 2092 { 2093 2093 unsigned int new_apicid; 2094 2094 2095 - apic_check_deadline_errata(); 2095 + if (apic_validate_deadline_timer()) 2096 + pr_debug("TSC deadline timer available\n"); 2096 2097 2097 2098 if (x2apic_mode) { 2098 2099 boot_cpu_physical_apicid = read_apic_id();
+2 -1
arch/x86/kernel/dumpstack_64.c
··· 183 183 */ 184 184 if (visit_mask) { 185 185 if (*visit_mask & (1UL << info->type)) { 186 - printk_deferred_once(KERN_WARNING "WARNING: stack recursion on stack type %d\n", info->type); 186 + if (task == current) 187 + printk_deferred_once(KERN_WARNING "WARNING: stack recursion on stack type %d\n", info->type); 187 188 goto unknown; 188 189 } 189 190 *visit_mask |= 1UL << info->type;
+28 -1
arch/x86/kernel/ftrace.c
··· 407 407 408 408 set_vm_flush_reset_perms(trampoline); 409 409 410 - set_memory_ro((unsigned long)trampoline, npages); 410 + if (likely(system_state != SYSTEM_BOOTING)) 411 + set_memory_ro((unsigned long)trampoline, npages); 411 412 set_memory_x((unsigned long)trampoline, npages); 412 413 return (unsigned long)trampoline; 413 414 fail: 414 415 tramp_free(trampoline); 415 416 return 0; 417 + } 418 + 419 + void set_ftrace_ops_ro(void) 420 + { 421 + struct ftrace_ops *ops; 422 + unsigned long start_offset; 423 + unsigned long end_offset; 424 + unsigned long npages; 425 + unsigned long size; 426 + 427 + do_for_each_ftrace_op(ops, ftrace_ops_list) { 428 + if (!(ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP)) 429 + continue; 430 + 431 + if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) { 432 + start_offset = (unsigned long)ftrace_regs_caller; 433 + end_offset = (unsigned long)ftrace_regs_caller_end; 434 + } else { 435 + start_offset = (unsigned long)ftrace_caller; 436 + end_offset = (unsigned long)ftrace_epilogue; 437 + } 438 + size = end_offset - start_offset; 439 + size = size + RET_SIZE + sizeof(void *); 440 + npages = DIV_ROUND_UP(size, PAGE_SIZE); 441 + set_memory_ro((unsigned long)ops->trampoline, npages); 442 + } while_for_each_ftrace_op(ops); 416 443 } 417 444 418 445 static unsigned long calc_trampoline_call_offset(bool save_regs)
+3
arch/x86/kernel/unwind_frame.c
··· 344 344 if (IS_ENABLED(CONFIG_X86_32)) 345 345 goto the_end; 346 346 347 + if (state->task != current) 348 + goto the_end; 349 + 347 350 if (state->regs) { 348 351 printk_deferred_once(KERN_WARNING 349 352 "WARNING: kernel stack regs at %p in %s:%d has bad 'bp' value %p\n",
+74 -39
arch/x86/kernel/unwind_orc.c
··· 8 8 #include <asm/orc_lookup.h> 9 9 10 10 #define orc_warn(fmt, ...) \ 11 - printk_deferred_once(KERN_WARNING pr_fmt("WARNING: " fmt), ##__VA_ARGS__) 11 + printk_deferred_once(KERN_WARNING "WARNING: " fmt, ##__VA_ARGS__) 12 + 13 + #define orc_warn_current(args...) \ 14 + ({ \ 15 + if (state->task == current) \ 16 + orc_warn(args); \ 17 + }) 12 18 13 19 extern int __start_orc_unwind_ip[]; 14 20 extern int __stop_orc_unwind_ip[]; 15 21 extern struct orc_entry __start_orc_unwind[]; 16 22 extern struct orc_entry __stop_orc_unwind[]; 17 23 18 - static DEFINE_MUTEX(sort_mutex); 19 - int *cur_orc_ip_table = __start_orc_unwind_ip; 20 - struct orc_entry *cur_orc_table = __start_orc_unwind; 21 - 22 - unsigned int lookup_num_blocks; 23 - bool orc_init; 24 + static bool orc_init __ro_after_init; 25 + static unsigned int lookup_num_blocks __ro_after_init; 24 26 25 27 static inline unsigned long orc_ip(const int *ip) 26 28 { ··· 144 142 { 145 143 static struct orc_entry *orc; 146 144 147 - if (!orc_init) 148 - return NULL; 149 - 150 145 if (ip == 0) 151 146 return &null_orc_entry; 152 147 ··· 187 188 } 188 189 189 190 #ifdef CONFIG_MODULES 191 + 192 + static DEFINE_MUTEX(sort_mutex); 193 + static int *cur_orc_ip_table = __start_orc_unwind_ip; 194 + static struct orc_entry *cur_orc_table = __start_orc_unwind; 190 195 191 196 static void orc_sort_swap(void *_a, void *_b, int size) 192 197 { ··· 384 381 return true; 385 382 } 386 383 384 + /* 385 + * If state->regs is non-NULL, and points to a full pt_regs, just get the reg 386 + * value from state->regs. 387 + * 388 + * Otherwise, if state->regs just points to IRET regs, and the previous frame 389 + * had full regs, it's safe to get the value from the previous regs. This can 390 + * happen when early/late IRQ entry code gets interrupted by an NMI. 
391 + */ 392 + static bool get_reg(struct unwind_state *state, unsigned int reg_off, 393 + unsigned long *val) 394 + { 395 + unsigned int reg = reg_off/8; 396 + 397 + if (!state->regs) 398 + return false; 399 + 400 + if (state->full_regs) { 401 + *val = ((unsigned long *)state->regs)[reg]; 402 + return true; 403 + } 404 + 405 + if (state->prev_regs) { 406 + *val = ((unsigned long *)state->prev_regs)[reg]; 407 + return true; 408 + } 409 + 410 + return false; 411 + } 412 + 387 413 bool unwind_next_frame(struct unwind_state *state) 388 414 { 389 - unsigned long ip_p, sp, orig_ip = state->ip, prev_sp = state->sp; 415 + unsigned long ip_p, sp, tmp, orig_ip = state->ip, prev_sp = state->sp; 390 416 enum stack_type prev_type = state->stack_info.type; 391 417 struct orc_entry *orc; 392 418 bool indirect = false; ··· 477 445 break; 478 446 479 447 case ORC_REG_R10: 480 - if (!state->regs || !state->full_regs) { 481 - orc_warn("missing regs for base reg R10 at ip %pB\n", 482 - (void *)state->ip); 448 + if (!get_reg(state, offsetof(struct pt_regs, r10), &sp)) { 449 + orc_warn_current("missing R10 value at %pB\n", 450 + (void *)state->ip); 483 451 goto err; 484 452 } 485 - sp = state->regs->r10; 486 453 break; 487 454 488 455 case ORC_REG_R13: 489 - if (!state->regs || !state->full_regs) { 490 - orc_warn("missing regs for base reg R13 at ip %pB\n", 491 - (void *)state->ip); 456 + if (!get_reg(state, offsetof(struct pt_regs, r13), &sp)) { 457 + orc_warn_current("missing R13 value at %pB\n", 458 + (void *)state->ip); 492 459 goto err; 493 460 } 494 - sp = state->regs->r13; 495 461 break; 496 462 497 463 case ORC_REG_DI: 498 - if (!state->regs || !state->full_regs) { 499 - orc_warn("missing regs for base reg DI at ip %pB\n", 500 - (void *)state->ip); 464 + if (!get_reg(state, offsetof(struct pt_regs, di), &sp)) { 465 + orc_warn_current("missing RDI value at %pB\n", 466 + (void *)state->ip); 501 467 goto err; 502 468 } 503 - sp = state->regs->di; 504 469 break; 505 470 506 471 
case ORC_REG_DX: 507 - if (!state->regs || !state->full_regs) { 508 - orc_warn("missing regs for base reg DX at ip %pB\n", 509 - (void *)state->ip); 472 + if (!get_reg(state, offsetof(struct pt_regs, dx), &sp)) { 473 + orc_warn_current("missing DX value at %pB\n", 474 + (void *)state->ip); 510 475 goto err; 511 476 } 512 - sp = state->regs->dx; 513 477 break; 514 478 515 479 default: 516 - orc_warn("unknown SP base reg %d for ip %pB\n", 480 + orc_warn("unknown SP base reg %d at %pB\n", 517 481 orc->sp_reg, (void *)state->ip); 518 482 goto err; 519 483 } ··· 532 504 533 505 state->sp = sp; 534 506 state->regs = NULL; 507 + state->prev_regs = NULL; 535 508 state->signal = false; 536 509 break; 537 510 538 511 case ORC_TYPE_REGS: 539 512 if (!deref_stack_regs(state, sp, &state->ip, &state->sp)) { 540 - orc_warn("can't dereference registers at %p for ip %pB\n", 541 - (void *)sp, (void *)orig_ip); 513 + orc_warn_current("can't access registers at %pB\n", 514 + (void *)orig_ip); 542 515 goto err; 543 516 } 544 517 545 518 state->regs = (struct pt_regs *)sp; 519 + state->prev_regs = NULL; 546 520 state->full_regs = true; 547 521 state->signal = true; 548 522 break; 549 523 550 524 case ORC_TYPE_REGS_IRET: 551 525 if (!deref_stack_iret_regs(state, sp, &state->ip, &state->sp)) { 552 - orc_warn("can't dereference iret registers at %p for ip %pB\n", 553 - (void *)sp, (void *)orig_ip); 526 + orc_warn_current("can't access iret registers at %pB\n", 527 + (void *)orig_ip); 554 528 goto err; 555 529 } 556 530 531 + if (state->full_regs) 532 + state->prev_regs = state->regs; 557 533 state->regs = (void *)sp - IRET_FRAME_OFFSET; 558 534 state->full_regs = false; 559 535 state->signal = true; 560 536 break; 561 537 562 538 default: 563 - orc_warn("unknown .orc_unwind entry type %d for ip %pB\n", 539 + orc_warn("unknown .orc_unwind entry type %d at %pB\n", 564 540 orc->type, (void *)orig_ip); 565 - break; 541 + goto err; 566 542 } 567 543 568 544 /* Find BP: */ 569 545 switch 
(orc->bp_reg) { 570 546 case ORC_REG_UNDEFINED: 571 - if (state->regs && state->full_regs) 572 - state->bp = state->regs->bp; 547 + if (get_reg(state, offsetof(struct pt_regs, bp), &tmp)) 548 + state->bp = tmp; 573 549 break; 574 550 575 551 case ORC_REG_PREV_SP: ··· 596 564 if (state->stack_info.type == prev_type && 597 565 on_stack(&state->stack_info, (void *)state->sp, sizeof(long)) && 598 566 state->sp <= prev_sp) { 599 - orc_warn("stack going in the wrong direction? ip=%pB\n", 600 - (void *)orig_ip); 567 + orc_warn_current("stack going in the wrong direction? at %pB\n", 568 + (void *)orig_ip); 601 569 goto err; 602 570 } 603 571 ··· 617 585 void __unwind_start(struct unwind_state *state, struct task_struct *task, 618 586 struct pt_regs *regs, unsigned long *first_frame) 619 587 { 588 + if (!orc_init) 589 + goto done; 590 + 620 591 memset(state, 0, sizeof(*state)); 621 592 state->task = task; 622 593 ··· 686 651 /* Otherwise, skip ahead to the user-specified starting frame: */ 687 652 while (!unwind_done(state) && 688 653 (!on_stack(&state->stack_info, first_frame, sizeof(long)) || 689 - state->sp <= (unsigned long)first_frame)) 654 + state->sp < (unsigned long)first_frame)) 690 655 unwind_next_frame(state); 691 656 692 657 return;
+5 -5
arch/x86/kvm/ioapic.c
··· 225 225 } 226 226 227 227 /* 228 - * AMD SVM AVIC accelerate EOI write and do not trap, 229 - * in-kernel IOAPIC will not be able to receive the EOI. 230 - * In this case, we do lazy update of the pending EOI when 231 - * trying to set IOAPIC irq. 228 + * AMD SVM AVIC accelerate EOI write iff the interrupt is edge 229 + * triggered, in which case the in-kernel IOAPIC will not be able 230 + * to receive the EOI. In this case, we do a lazy update of the 231 + * pending EOI when trying to set IOAPIC irq. 232 232 */ 233 - if (kvm_apicv_activated(ioapic->kvm)) 233 + if (edge && kvm_apicv_activated(ioapic->kvm)) 234 234 ioapic_lazy_update_eoi(ioapic, irq); 235 235 236 236 /*
+1 -1
arch/x86/kvm/svm/sev.c
··· 345 345 return NULL; 346 346 347 347 /* Pin the user virtual address. */ 348 - npinned = get_user_pages_fast(uaddr, npages, FOLL_WRITE, pages); 348 + npinned = get_user_pages_fast(uaddr, npages, write ? FOLL_WRITE : 0, pages); 349 349 if (npinned != npages) { 350 350 pr_err("SEV: Failure locking %lu pages.\n", npages); 351 351 goto err;
+2
arch/x86/kvm/svm/svm.c
··· 1752 1752 if (svm->vcpu.guest_debug & 1753 1753 (KVM_GUESTDBG_SINGLESTEP | KVM_GUESTDBG_USE_HW_BP)) { 1754 1754 kvm_run->exit_reason = KVM_EXIT_DEBUG; 1755 + kvm_run->debug.arch.dr6 = svm->vmcb->save.dr6; 1756 + kvm_run->debug.arch.dr7 = svm->vmcb->save.dr7; 1755 1757 kvm_run->debug.arch.pc = 1756 1758 svm->vmcb->save.cs.base + svm->vmcb->save.rip; 1757 1759 kvm_run->debug.arch.exception = DB_VECTOR;
+1 -1
arch/x86/kvm/vmx/nested.c
··· 5165 5165 */ 5166 5166 break; 5167 5167 default: 5168 - BUG_ON(1); 5168 + BUG(); 5169 5169 break; 5170 5170 } 5171 5171
+3
arch/x86/kvm/vmx/vmenter.S
··· 82 82 /* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */ 83 83 FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE 84 84 85 + /* Clear RFLAGS.CF and RFLAGS.ZF to preserve VM-Exit, i.e. !VM-Fail. */ 86 + or $1, %_ASM_AX 87 + 85 88 pop %_ASM_AX 86 89 .Lvmexit_skip_rsb: 87 90 #endif
+6 -15
arch/x86/kvm/x86.c
··· 926 926 __reserved_bits; \ 927 927 }) 928 928 929 - static u64 kvm_host_cr4_reserved_bits(struct cpuinfo_x86 *c) 930 - { 931 - u64 reserved_bits = __cr4_reserved_bits(cpu_has, c); 932 - 933 - if (kvm_cpu_cap_has(X86_FEATURE_LA57)) 934 - reserved_bits &= ~X86_CR4_LA57; 935 - 936 - if (kvm_cpu_cap_has(X86_FEATURE_UMIP)) 937 - reserved_bits &= ~X86_CR4_UMIP; 938 - 939 - return reserved_bits; 940 - } 941 - 942 929 static int kvm_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) 943 930 { 944 931 if (cr4 & cr4_reserved_bits) ··· 3372 3385 case KVM_CAP_GET_MSR_FEATURES: 3373 3386 case KVM_CAP_MSR_PLATFORM_INFO: 3374 3387 case KVM_CAP_EXCEPTION_PAYLOAD: 3388 + case KVM_CAP_SET_GUEST_DEBUG: 3375 3389 r = 1; 3376 3390 break; 3377 3391 case KVM_CAP_SYNC_REGS: ··· 9663 9675 if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES)) 9664 9676 supported_xss = 0; 9665 9677 9666 - cr4_reserved_bits = kvm_host_cr4_reserved_bits(&boot_cpu_data); 9678 + #define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f) 9679 + cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_); 9680 + #undef __kvm_cpu_cap_has 9667 9681 9668 9682 if (kvm_has_tsc_control) { 9669 9683 /* ··· 9697 9707 9698 9708 WARN_ON(!irqs_disabled()); 9699 9709 9700 - if (kvm_host_cr4_reserved_bits(c) != cr4_reserved_bits) 9710 + if (__cr4_reserved_bits(cpu_has, c) != 9711 + __cr4_reserved_bits(cpu_has, &boot_cpu_data)) 9701 9712 return -EIO; 9702 9713 9703 9714 return ops->check_processor_compatibility();
+3
arch/x86/mm/init_64.c
··· 54 54 #include <asm/init.h> 55 55 #include <asm/uv/uv.h> 56 56 #include <asm/setup.h> 57 + #include <asm/ftrace.h> 57 58 58 59 #include "mm_internal.h" 59 60 ··· 1291 1290 */ 1292 1291 all_end = roundup((unsigned long)_brk_end, PMD_SIZE); 1293 1292 set_memory_nx(text_end, (all_end - text_end) >> PAGE_SHIFT); 1293 + 1294 + set_ftrace_ops_ro(); 1294 1295 1295 1296 #ifdef CONFIG_CPA_DEBUG 1296 1297 printk(KERN_INFO "Testing CPA: undo %lx-%lx\n", start, end);
+8 -4
arch/x86/mm/pat/set_memory.c
··· 43 43 unsigned long pfn; 44 44 unsigned int flags; 45 45 unsigned int force_split : 1, 46 - force_static_prot : 1; 46 + force_static_prot : 1, 47 + force_flush_all : 1; 47 48 struct page **pages; 48 49 }; 49 50 ··· 356 355 return; 357 356 } 358 357 359 - if (cpa->numpages <= tlb_single_page_flush_ceiling) 360 - on_each_cpu(__cpa_flush_tlb, cpa, 1); 361 - else 358 + if (cpa->force_flush_all || cpa->numpages > tlb_single_page_flush_ceiling) 362 359 flush_tlb_all(); 360 + else 361 + on_each_cpu(__cpa_flush_tlb, cpa, 1); 363 362 364 363 if (!cache) 365 364 return; ··· 1599 1598 alias_cpa.flags &= ~(CPA_PAGES_ARRAY | CPA_ARRAY); 1600 1599 alias_cpa.curpage = 0; 1601 1600 1601 + cpa->force_flush_all = 1; 1602 + 1602 1603 ret = __change_page_attr_set_clr(&alias_cpa, 0); 1603 1604 if (ret) 1604 1605 return ret; ··· 1621 1618 alias_cpa.flags &= ~(CPA_PAGES_ARRAY | CPA_ARRAY); 1622 1619 alias_cpa.curpage = 0; 1623 1620 1621 + cpa->force_flush_all = 1; 1624 1622 /* 1625 1623 * The high mapping range is imprecise, so ignore the 1626 1624 * return value.
+4 -2
block/bfq-iosched.c
··· 123 123 #include <linux/ioprio.h> 124 124 #include <linux/sbitmap.h> 125 125 #include <linux/delay.h> 126 + #include <linux/backing-dev.h> 126 127 127 128 #include "blk.h" 128 129 #include "blk-mq.h" ··· 4977 4976 ioprio_class = IOPRIO_PRIO_CLASS(bic->ioprio); 4978 4977 switch (ioprio_class) { 4979 4978 default: 4980 - dev_err(bfqq->bfqd->queue->backing_dev_info->dev, 4981 - "bfq: bad prio class %d\n", ioprio_class); 4979 + pr_err("bdi %s: bfq: bad prio class %d\n", 4980 + bdi_dev_name(bfqq->bfqd->queue->backing_dev_info), 4981 + ioprio_class); 4982 4982 /* fall through */ 4983 4983 case IOPRIO_CLASS_NONE: 4984 4984 /*
+1 -1
block/blk-cgroup.c
··· 496 496 { 497 497 /* some drivers (floppy) instantiate a queue w/o disk registered */ 498 498 if (blkg->q->backing_dev_info->dev) 499 - return dev_name(blkg->q->backing_dev_info->dev); 499 + return bdi_dev_name(blkg->q->backing_dev_info); 500 500 return NULL; 501 501 } 502 502
+71 -46
block/blk-iocost.c
··· 466 466 */ 467 467 atomic64_t vtime; 468 468 atomic64_t done_vtime; 469 - atomic64_t abs_vdebt; 469 + u64 abs_vdebt; 470 470 u64 last_vtime; 471 471 472 472 /* ··· 1142 1142 struct iocg_wake_ctx ctx = { .iocg = iocg }; 1143 1143 u64 margin_ns = (u64)(ioc->period_us * 1144 1144 WAITQ_TIMER_MARGIN_PCT / 100) * NSEC_PER_USEC; 1145 - u64 abs_vdebt, vdebt, vshortage, expires, oexpires; 1145 + u64 vdebt, vshortage, expires, oexpires; 1146 1146 s64 vbudget; 1147 1147 u32 hw_inuse; 1148 1148 ··· 1152 1152 vbudget = now->vnow - atomic64_read(&iocg->vtime); 1153 1153 1154 1154 /* pay off debt */ 1155 - abs_vdebt = atomic64_read(&iocg->abs_vdebt); 1156 - vdebt = abs_cost_to_cost(abs_vdebt, hw_inuse); 1155 + vdebt = abs_cost_to_cost(iocg->abs_vdebt, hw_inuse); 1157 1156 if (vdebt && vbudget > 0) { 1158 1157 u64 delta = min_t(u64, vbudget, vdebt); 1159 1158 u64 abs_delta = min(cost_to_abs_cost(delta, hw_inuse), 1160 - abs_vdebt); 1159 + iocg->abs_vdebt); 1161 1160 1162 1161 atomic64_add(delta, &iocg->vtime); 1163 1162 atomic64_add(delta, &iocg->done_vtime); 1164 - atomic64_sub(abs_delta, &iocg->abs_vdebt); 1165 - if (WARN_ON_ONCE(atomic64_read(&iocg->abs_vdebt) < 0)) 1166 - atomic64_set(&iocg->abs_vdebt, 0); 1163 + iocg->abs_vdebt -= abs_delta; 1167 1164 } 1168 1165 1169 1166 /* ··· 1216 1219 u64 expires, oexpires; 1217 1220 u32 hw_inuse; 1218 1221 1222 + lockdep_assert_held(&iocg->waitq.lock); 1223 + 1219 1224 /* debt-adjust vtime */ 1220 1225 current_hweight(iocg, NULL, &hw_inuse); 1221 - vtime += abs_cost_to_cost(atomic64_read(&iocg->abs_vdebt), hw_inuse); 1226 + vtime += abs_cost_to_cost(iocg->abs_vdebt, hw_inuse); 1222 1227 1223 - /* clear or maintain depending on the overage */ 1224 - if (time_before_eq64(vtime, now->vnow)) { 1228 + /* 1229 + * Clear or maintain depending on the overage. Non-zero vdebt is what 1230 + * guarantees that @iocg is online and future iocg_kick_delay() will 1231 + * clear use_delay. Don't leave it on when there's no vdebt. 
1232 + */ 1233 + if (!iocg->abs_vdebt || time_before_eq64(vtime, now->vnow)) { 1225 1234 blkcg_clear_delay(blkg); 1226 1235 return false; 1227 1236 } ··· 1261 1258 { 1262 1259 struct ioc_gq *iocg = container_of(timer, struct ioc_gq, delay_timer); 1263 1260 struct ioc_now now; 1261 + unsigned long flags; 1264 1262 1263 + spin_lock_irqsave(&iocg->waitq.lock, flags); 1265 1264 ioc_now(iocg->ioc, &now); 1266 1265 iocg_kick_delay(iocg, &now, 0); 1266 + spin_unlock_irqrestore(&iocg->waitq.lock, flags); 1267 1267 1268 1268 return HRTIMER_NORESTART; 1269 1269 } ··· 1374 1368 * should have woken up in the last period and expire idle iocgs. 1375 1369 */ 1376 1370 list_for_each_entry_safe(iocg, tiocg, &ioc->active_iocgs, active_list) { 1377 - if (!waitqueue_active(&iocg->waitq) && 1378 - !atomic64_read(&iocg->abs_vdebt) && !iocg_is_idle(iocg)) 1371 + if (!waitqueue_active(&iocg->waitq) && !iocg->abs_vdebt && 1372 + !iocg_is_idle(iocg)) 1379 1373 continue; 1380 1374 1381 1375 spin_lock(&iocg->waitq.lock); 1382 1376 1383 - if (waitqueue_active(&iocg->waitq) || 1384 - atomic64_read(&iocg->abs_vdebt)) { 1377 + if (waitqueue_active(&iocg->waitq) || iocg->abs_vdebt) { 1385 1378 /* might be oversleeping vtime / hweight changes, kick */ 1386 1379 iocg_kick_waitq(iocg, &now); 1387 1380 iocg_kick_delay(iocg, &now, 0); ··· 1723 1718 * tests are racy but the races aren't systemic - we only miss once 1724 1719 * in a while which is fine. 1725 1720 */ 1726 - if (!waitqueue_active(&iocg->waitq) && 1727 - !atomic64_read(&iocg->abs_vdebt) && 1721 + if (!waitqueue_active(&iocg->waitq) && !iocg->abs_vdebt && 1728 1722 time_before_eq64(vtime + cost, now.vnow)) { 1729 1723 iocg_commit_bio(iocg, bio, cost); 1730 1724 return; 1731 1725 } 1732 1726 1733 1727 /* 1734 - * We're over budget. If @bio has to be issued regardless, 1735 - * remember the abs_cost instead of advancing vtime. 1736 - * iocg_kick_waitq() will pay off the debt before waking more IOs.
1728 + * We activated above but w/o any synchronization. Deactivation is 1729 + * synchronized with waitq.lock and we won't get deactivated as long 1730 + * as we're waiting or has debt, so we're good if we're activated 1731 + * here. In the unlikely case that we aren't, just issue the IO. 1732 + */ 1733 + spin_lock_irq(&iocg->waitq.lock); 1734 + 1735 + if (unlikely(list_empty(&iocg->active_list))) { 1736 + spin_unlock_irq(&iocg->waitq.lock); 1737 + iocg_commit_bio(iocg, bio, cost); 1738 + return; 1739 + } 1740 + 1741 + /* 1742 + * We're over budget. If @bio has to be issued regardless, remember 1743 + * the abs_cost instead of advancing vtime. iocg_kick_waitq() will pay 1744 + * off the debt before waking more IOs. 1745 + * 1737 1746 * This way, the debt is continuously paid off each period with the 1738 - * actual budget available to the cgroup. If we just wound vtime, 1739 - * we would incorrectly use the current hw_inuse for the entire 1740 - * amount which, for example, can lead to the cgroup staying 1741 - * blocked for a long time even with substantially raised hw_inuse. 1747 + * actual budget available to the cgroup. If we just wound vtime, we 1748 + * would incorrectly use the current hw_inuse for the entire amount 1749 + * which, for example, can lead to the cgroup staying blocked for a 1750 + * long time even with substantially raised hw_inuse. 1751 + * 1752 + * An iocg with vdebt should stay online so that the timer can keep 1753 + * deducting its vdebt and [de]activate use_delay mechanism 1754 + * accordingly. We don't want to race against the timer trying to 1755 + * clear them and leave @iocg inactive w/ dangling use_delay heavily 1756 + * penalizing the cgroup and its descendants. 
1742 1757 */ 1743 1758 if (bio_issue_as_root_blkg(bio) || fatal_signal_pending(current)) { 1744 - atomic64_add(abs_cost, &iocg->abs_vdebt); 1759 + iocg->abs_vdebt += abs_cost; 1745 1760 if (iocg_kick_delay(iocg, &now, cost)) 1746 1761 blkcg_schedule_throttle(rqos->q, 1747 1762 (bio->bi_opf & REQ_SWAP) == REQ_SWAP); 1763 + spin_unlock_irq(&iocg->waitq.lock); 1748 1764 return; 1749 1765 } 1750 1766 ··· 1782 1756 * All waiters are on iocg->waitq and the wait states are 1783 1757 * synchronized using waitq.lock. 1784 1758 */ 1785 - spin_lock_irq(&iocg->waitq.lock); 1786 - 1787 - /* 1788 - * We activated above but w/o any synchronization. Deactivation is 1789 - * synchronized with waitq.lock and we won't get deactivated as 1790 - * long as we're waiting, so we're good if we're activated here. 1791 - * In the unlikely case that we are deactivated, just issue the IO. 1792 - */ 1793 - if (unlikely(list_empty(&iocg->active_list))) { 1794 - spin_unlock_irq(&iocg->waitq.lock); 1795 - iocg_commit_bio(iocg, bio, cost); 1796 - return; 1797 - } 1798 - 1799 1759 init_waitqueue_func_entry(&wait.wait, iocg_wake_fn); 1800 1760 wait.wait.private = current; 1801 1761 wait.bio = bio; ··· 1813 1801 struct ioc_now now; 1814 1802 u32 hw_inuse; 1815 1803 u64 abs_cost, cost; 1804 + unsigned long flags; 1816 1805 1817 1806 /* bypass if disabled or for root cgroup */ 1818 1807 if (!ioc->enabled || !iocg->level) ··· 1833 1820 iocg->cursor = bio_end; 1834 1821 1835 1822 /* 1836 - * Charge if there's enough vtime budget and the existing request 1837 - * has cost assigned. Otherwise, account it as debt. See debt 1838 - * handling in ioc_rqos_throttle() for details. 1823 + * Charge if there's enough vtime budget and the existing request has 1824 + * cost assigned. 
1839 1825 */ 1840 1826 if (rq->bio && rq->bio->bi_iocost_cost && 1841 - time_before_eq64(atomic64_read(&iocg->vtime) + cost, now.vnow)) 1827 + time_before_eq64(atomic64_read(&iocg->vtime) + cost, now.vnow)) { 1842 1828 iocg_commit_bio(iocg, bio, cost); 1843 - else 1844 - atomic64_add(abs_cost, &iocg->abs_vdebt); 1829 + return; 1830 + } 1831 + 1832 + /* 1833 + * Otherwise, account it as debt if @iocg is online, which it should 1834 + * be for the vast majority of cases. See debt handling in 1835 + * ioc_rqos_throttle() for details. 1836 + */ 1837 + spin_lock_irqsave(&iocg->waitq.lock, flags); 1838 + if (likely(!list_empty(&iocg->active_list))) { 1839 + iocg->abs_vdebt += abs_cost; 1840 + iocg_kick_delay(iocg, &now, cost); 1841 + } else { 1842 + iocg_commit_bio(iocg, bio, cost); 1843 + } 1844 + spin_unlock_irqrestore(&iocg->waitq.lock, flags); 1845 1845 } 1846 1846 1847 1847 static void ioc_rqos_done_bio(struct rq_qos *rqos, struct bio *bio) ··· 2024 1998 iocg->ioc = ioc; 2025 1999 atomic64_set(&iocg->vtime, now.vnow); 2026 2000 atomic64_set(&iocg->done_vtime, now.vnow); 2027 - atomic64_set(&iocg->abs_vdebt, 0); 2028 2001 atomic64_set(&iocg->active_period, atomic64_read(&ioc->cur_period)); 2029 2002 INIT_LIST_HEAD(&iocg->active_list); 2030 2003 iocg->hweight_active = HWEIGHT_WHOLE;
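The block/blk-iocost hunks above convert `abs_vdebt` from an `atomic64_t` to a plain `u64` because every access now happens under `iocg->waitq.lock`. A minimal userspace sketch of the same pattern, with a pthread mutex standing in for the waitq spinlock (all names here are illustrative, not the kernel's):

```c
#include <pthread.h>
#include <stdint.h>

/* Illustrative stand-in for struct ioc_gq: the debt counter is a plain
 * integer and every reader/writer takes the same lock first. */
struct iocg_sketch {
	pthread_mutex_t lock;	/* plays the role of iocg->waitq.lock */
	uint64_t abs_vdebt;	/* no atomics needed under the lock */
};

/* Pay off up to @budget of the accrued debt. The clamp and the
 * subtraction form one critical section, so the counter can never
 * underflow - the old atomic64 version needed a WARN_ON plus a reset
 * to paper over exactly that race. */
static uint64_t pay_debt(struct iocg_sketch *iocg, uint64_t budget)
{
	uint64_t delta;

	pthread_mutex_lock(&iocg->lock);
	delta = budget < iocg->abs_vdebt ? budget : iocg->abs_vdebt;
	iocg->abs_vdebt -= delta;
	pthread_mutex_unlock(&iocg->lock);
	return delta;
}
```

The kernel patch applies the same idea with `spin_lock_irqsave()` since the timer, throttle, and merge paths can all race on the field.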
+3 -3
crypto/lrw.c
··· 287 287 crypto_free_skcipher(ctx->child); 288 288 } 289 289 290 - static void free(struct skcipher_instance *inst) 290 + static void free_inst(struct skcipher_instance *inst) 291 291 { 292 292 crypto_drop_skcipher(skcipher_instance_ctx(inst)); 293 293 kfree(inst); ··· 400 400 inst->alg.encrypt = encrypt; 401 401 inst->alg.decrypt = decrypt; 402 402 403 - inst->free = free; 403 + inst->free = free_inst; 404 404 405 405 err = skcipher_register_instance(tmpl, inst); 406 406 if (err) { 407 407 err_free_inst: 408 - free(inst); 408 + free_inst(inst); 409 409 } 410 410 return err; 411 411 }
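The `free()` → `free_inst()` rename here (and in xts below) avoids a file-scope function shadowing the C library's `free()`, which conflicts with the compiler's built-in declaration and draws warnings from newer gcc. A hypothetical userspace miniature of the renamed cleanup helper (none of these names are the real crypto API):

```c
#include <stdlib.h>

/* Hypothetical miniature of a template instance with one owned child. */
struct inst_sketch {
	int *child_ctx;
};

/* Named free_inst() rather than free(): a static function called
 * free() in this file would clash with the built-in declaration of
 * the C library free(), and later free() calls in the file would
 * bind to the local symbol instead of the allocator's. */
static void free_inst(struct inst_sketch *inst)
{
	free(inst->child_ctx);	/* still the stdlib free() */
	free(inst);
}

static struct inst_sketch *alloc_inst(int val)
{
	struct inst_sketch *inst = malloc(sizeof(*inst));

	inst->child_ctx = malloc(sizeof(int));
	*inst->child_ctx = val;
	return inst;
}
```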
+3 -3
crypto/xts.c
··· 322 322 crypto_free_cipher(ctx->tweak); 323 323 } 324 324 325 - static void free(struct skcipher_instance *inst) 325 + static void free_inst(struct skcipher_instance *inst) 326 326 { 327 327 crypto_drop_skcipher(skcipher_instance_ctx(inst)); 328 328 kfree(inst); ··· 434 434 inst->alg.encrypt = encrypt; 435 435 inst->alg.decrypt = decrypt; 436 436 437 - inst->free = free; 437 + inst->free = free_inst; 438 438 439 439 err = skcipher_register_instance(tmpl, inst); 440 440 if (err) { 441 441 err_free_inst: 442 - free(inst); 442 + free_inst(inst); 443 443 } 444 444 return err; 445 445 }
+16 -8
drivers/acpi/ec.c
··· 1994 1994 acpi_set_gpe_wake_mask(NULL, first_ec->gpe, action); 1995 1995 } 1996 1996 1997 - bool acpi_ec_other_gpes_active(void) 1998 - { 1999 - return acpi_any_gpe_status_set(first_ec ? first_ec->gpe : U32_MAX); 2000 - } 2001 - 2002 1997 bool acpi_ec_dispatch_gpe(void) 2003 1998 { 2004 1999 u32 ret; 2005 2000 2006 2001 if (!first_ec) 2002 + return acpi_any_gpe_status_set(U32_MAX); 2003 + 2004 + /* 2005 + * Report wakeup if the status bit is set for any enabled GPE other 2006 + * than the EC one. 2007 + */ 2008 + if (acpi_any_gpe_status_set(first_ec->gpe)) 2009 + return true; 2010 + 2011 + if (ec_no_wakeup) 2007 2012 return false; 2008 2013 2014 + /* 2015 + * Dispatch the EC GPE in-band, but do not report wakeup in any case 2016 + * to allow the caller to process events properly after that. 2017 + */ 2009 2018 ret = acpi_dispatch_gpe(NULL, first_ec->gpe); 2010 - if (ret == ACPI_INTERRUPT_HANDLED) { 2019 + if (ret == ACPI_INTERRUPT_HANDLED) 2011 2020 pm_pr_dbg("EC GPE dispatched\n"); 2012 - return true; 2013 - } 2021 + 2014 2022 return false; 2015 2023 } 2016 2024 #endif /* CONFIG_PM_SLEEP */
-1
drivers/acpi/internal.h
··· 202 202 203 203 #ifdef CONFIG_PM_SLEEP 204 204 void acpi_ec_flush_work(void); 205 - bool acpi_ec_other_gpes_active(void); 206 205 bool acpi_ec_dispatch_gpe(void); 207 206 #endif 208 207
+2 -12
drivers/acpi/sleep.c
··· 1013 1013 if (acpi_check_wakeup_handlers()) 1014 1014 return true; 1015 1015 1016 - /* 1017 - * If the status bit is set for any enabled GPE other than the 1018 - * EC one, the wakeup is regarded as a genuine one. 1019 - */ 1020 - if (acpi_ec_other_gpes_active()) 1016 + /* Check non-EC GPE wakeups and dispatch the EC GPE. */ 1017 + if (acpi_ec_dispatch_gpe()) 1021 1018 return true; 1022 - 1023 - /* 1024 - * If the EC GPE status bit has not been set, the wakeup is 1025 - * regarded as a spurious one. 1026 - */ 1027 - if (!acpi_ec_dispatch_gpe()) 1028 - return false; 1029 1019 1030 1020 /* 1031 1021 * Cancel the wakeup and process all pending events in case
+1
drivers/amba/bus.c
··· 645 645 dev->dev.release = amba_device_release; 646 646 dev->dev.bus = &amba_bustype; 647 647 dev->dev.dma_mask = &dev->dev.coherent_dma_mask; 648 + dev->dev.dma_parms = &dev->dma_parms; 648 649 dev->res.name = dev_name(&dev->dev); 649 650 } 650 651
+5 -3
drivers/base/component.c
··· 256 256 ret = master->ops->bind(master->dev); 257 257 if (ret < 0) { 258 258 devres_release_group(master->dev, NULL); 259 - dev_info(master->dev, "master bind failed: %d\n", ret); 259 + if (ret != -EPROBE_DEFER) 260 + dev_info(master->dev, "master bind failed: %d\n", ret); 260 261 return ret; 261 262 } 262 263 ··· 612 611 devres_release_group(component->dev, NULL); 613 612 devres_release_group(master->dev, NULL); 614 613 615 - dev_err(master->dev, "failed to bind %s (ops %ps): %d\n", 616 - dev_name(component->dev), component->ops, ret); 614 + if (ret != -EPROBE_DEFER) 615 + dev_err(master->dev, "failed to bind %s (ops %ps): %d\n", 616 + dev_name(component->dev), component->ops, ret); 617 617 } 618 618 619 619 return ret;
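The component.c change silences bind-failure logging when the result is `-EPROBE_DEFER`, since the framework will retry the bind later and the message would repeat on every attempt. A hypothetical sketch of the logging guard (the error constant and helper name are invented for illustration):

```c
#include <stdio.h>
#include <string.h>

/* -EPROBE_DEFER is -517 in the kernel; redefined here for the sketch. */
#define EPROBE_DEFER 517

static char last_log[64];

/* A bind failure is logged unless the callee merely deferred, in which
 * case the core retries later and the message would be noise. */
static int report_bind_result(int ret)
{
	last_log[0] = '\0';
	if (ret < 0 && ret != -EPROBE_DEFER)
		snprintf(last_log, sizeof(last_log),
			 "master bind failed: %d", ret);
	return ret;
}
```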
+6 -1
drivers/base/core.c
··· 2370 2370 return fw_devlink_flags; 2371 2371 } 2372 2372 2373 + static bool fw_devlink_is_permissive(void) 2374 + { 2375 + return fw_devlink_flags == DL_FLAG_SYNC_STATE_ONLY; 2376 + } 2377 + 2373 2378 /** 2374 2379 * device_add - add device to device hierarchy. 2375 2380 * @dev: device. ··· 2529 2524 if (fw_devlink_flags && is_fwnode_dev && 2530 2525 fwnode_has_op(dev->fwnode, add_links)) { 2531 2526 fw_ret = fwnode_call_int_op(dev->fwnode, add_links, dev); 2532 - if (fw_ret == -ENODEV) 2527 + if (fw_ret == -ENODEV && !fw_devlink_is_permissive()) 2533 2528 device_link_wait_for_mandatory_supplier(dev); 2534 2529 else if (fw_ret) 2535 2530 device_link_wait_for_optional_supplier(dev);
+8 -12
drivers/base/dd.c
··· 224 224 } 225 225 DEFINE_SHOW_ATTRIBUTE(deferred_devs); 226 226 227 - #ifdef CONFIG_MODULES 228 - /* 229 - * In the case of modules, set the default probe timeout to 230 - * 30 seconds to give userland some time to load needed modules 231 - */ 232 - int driver_deferred_probe_timeout = 30; 233 - #else 234 - /* In the case of !modules, no probe timeout needed */ 235 - int driver_deferred_probe_timeout = -1; 236 - #endif 227 + int driver_deferred_probe_timeout; 237 228 EXPORT_SYMBOL_GPL(driver_deferred_probe_timeout); 229 + static DECLARE_WAIT_QUEUE_HEAD(probe_timeout_waitqueue); 238 230 239 231 static int __init deferred_probe_timeout_setup(char *str) 240 232 { ··· 258 266 return -ENODEV; 259 267 } 260 268 261 - if (!driver_deferred_probe_timeout) { 262 - dev_WARN(dev, "deferred probe timeout, ignoring dependency"); 269 + if (!driver_deferred_probe_timeout && initcalls_done) { 270 + dev_warn(dev, "deferred probe timeout, ignoring dependency"); 263 271 return -ETIMEDOUT; 264 272 } 265 273 ··· 276 284 277 285 list_for_each_entry_safe(private, p, &deferred_probe_pending_list, deferred_probe) 278 286 dev_info(private->device, "deferred probe pending"); 287 + wake_up(&probe_timeout_waitqueue); 279 288 } 280 289 static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_work_func); 281 290 ··· 651 658 */ 652 659 void wait_for_device_probe(void) 653 660 { 661 + /* wait for probe timeout */ 662 + wait_event(probe_timeout_waitqueue, !driver_deferred_probe_timeout); 663 + 654 664 /* wait for the deferred probe workqueue to finish */ 655 665 flush_work(&deferred_probe_work); 656 666
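The dd.c hunks add `probe_timeout_waitqueue` so `wait_for_device_probe()` blocks until the deferred-probe timeout worker runs. The shape of that `wait_event()`/`wake_up()` pairing can be sketched in userspace with a condition variable standing in for the kernel waitqueue (all names illustrative):

```c
#include <pthread.h>

static pthread_mutex_t probe_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t probe_timeout_wq = PTHREAD_COND_INITIALIZER;
static int deferred_probe_timeout = 30;

/* Timer side: runs once when the deferral window expires. */
static void deferred_probe_timeout_expired(void)
{
	pthread_mutex_lock(&probe_lock);
	deferred_probe_timeout = 0;
	pthread_cond_broadcast(&probe_timeout_wq);	/* the wake_up() */
	pthread_mutex_unlock(&probe_lock);
}

/* Waiter side: the wait_event(wq, !driver_deferred_probe_timeout)
 * pattern - re-check the condition after every wakeup. */
static void wait_for_probe_timeout(void)
{
	pthread_mutex_lock(&probe_lock);
	while (deferred_probe_timeout)
		pthread_cond_wait(&probe_timeout_wq, &probe_lock);
	pthread_mutex_unlock(&probe_lock);
}
```

The predicate re-check loop matters in both worlds: condition variables and waitqueues can both wake spuriously.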
+2
drivers/base/platform.c
··· 380 380 */ 381 381 static void setup_pdev_dma_masks(struct platform_device *pdev) 382 382 { 383 + pdev->dev.dma_parms = &pdev->dma_parms; 384 + 383 385 if (!pdev->dev.coherent_dma_mask) 384 386 pdev->dev.coherent_dma_mask = DMA_BIT_MASK(32); 385 387 if (!pdev->dev.dma_mask) {
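Both the amba and platform hunks wire `dev.dma_parms` to storage embedded in the enclosing device structure, so segment-size configuration never finds a NULL pointer and no separate allocation is needed. A sketch of the idea with invented miniature structs (not the real kernel layouts):

```c
#include <stddef.h>

struct dma_parms_sketch {
	unsigned int max_segment_size;
};

struct dev_sketch {
	struct dma_parms_sketch *dma_parms;
};

struct pdev_sketch {
	struct dev_sketch dev;
	struct dma_parms_sketch dma_parms;	/* embedded storage */
};

/* The change in miniature: point dev->dma_parms at the embedded field
 * during setup, so it always exists for the device's lifetime. */
static void setup_pdev_dma_masks(struct pdev_sketch *pdev)
{
	pdev->dev.dma_parms = &pdev->dma_parms;
}

/* Mimics dma_set_max_seg_size(): fails if no parms storage exists. */
static int set_max_seg_size(struct dev_sketch *dev, unsigned int size)
{
	if (!dev->dma_parms)
		return -1;
	dev->dma_parms->max_segment_size = size;
	return 0;
}
```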
+3 -4
drivers/bus/mhi/core/init.c
··· 812 812 if (!mhi_cntrl) 813 813 return -EINVAL; 814 814 815 - if (!mhi_cntrl->runtime_get || !mhi_cntrl->runtime_put) 816 - return -EINVAL; 817 - 818 - if (!mhi_cntrl->status_cb || !mhi_cntrl->link_status) 815 + if (!mhi_cntrl->runtime_get || !mhi_cntrl->runtime_put || 816 + !mhi_cntrl->status_cb || !mhi_cntrl->read_reg || 817 + !mhi_cntrl->write_reg) 819 818 return -EINVAL; 820 819 821 820 ret = parse_config(mhi_cntrl, config);
-3
drivers/bus/mhi/core/internal.h
··· 11 11 12 12 extern struct bus_type mhi_bus_type; 13 13 14 - /* MHI MMIO register mapping */ 15 - #define PCI_INVALID_READ(val) (val == U32_MAX) 16 - 17 14 #define MHIREGLEN (0x0) 18 15 #define MHIREGLEN_MHIREGLEN_MASK (0xFFFFFFFF) 19 16 #define MHIREGLEN_MHIREGLEN_SHIFT (0)
+5 -13
drivers/bus/mhi/core/main.c
··· 18 18 int __must_check mhi_read_reg(struct mhi_controller *mhi_cntrl, 19 19 void __iomem *base, u32 offset, u32 *out) 20 20 { 21 - u32 tmp = readl(base + offset); 22 - 23 - /* If there is any unexpected value, query the link status */ 24 - if (PCI_INVALID_READ(tmp) && 25 - mhi_cntrl->link_status(mhi_cntrl)) 26 - return -EIO; 27 - 28 - *out = tmp; 29 - 30 - return 0; 21 + return mhi_cntrl->read_reg(mhi_cntrl, base + offset, out); 31 22 } 32 23 33 24 int __must_check mhi_read_reg_field(struct mhi_controller *mhi_cntrl, ··· 40 49 void mhi_write_reg(struct mhi_controller *mhi_cntrl, void __iomem *base, 41 50 u32 offset, u32 val) 42 51 { 43 - writel(val, base + offset); 52 + mhi_cntrl->write_reg(mhi_cntrl, base + offset, val); 44 53 } 45 54 46 55 void mhi_write_reg_field(struct mhi_controller *mhi_cntrl, void __iomem *base, ··· 285 294 !(mhi_chan->ee_mask & BIT(mhi_cntrl->ee))) 286 295 continue; 287 296 mhi_dev = mhi_alloc_device(mhi_cntrl); 288 - if (!mhi_dev) 297 + if (IS_ERR(mhi_dev)) 289 298 return; 290 299 291 300 mhi_dev->dev_type = MHI_DEVICE_XFER; ··· 327 336 328 337 /* Channel name is same for both UL and DL */ 329 338 mhi_dev->chan_name = mhi_chan->name; 330 - dev_set_name(&mhi_dev->dev, "%04x_%s", mhi_chan->chan, 339 + dev_set_name(&mhi_dev->dev, "%s_%s", 340 + dev_name(mhi_cntrl->cntrl_dev), 331 341 mhi_dev->chan_name); 332 342 333 343 /* Init wakeup source if available */
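The MHI hunks replace hard-coded `readl()`/`writel()` (plus the `U32_MAX` link sniffing) with `read_reg`/`write_reg` callbacks the controller driver must supply, so non-memory-mapped transports can plug in their own accessors. A hedged sketch of the indirection with invented types and signatures:

```c
#include <stdint.h>

struct mhi_ctrl_sketch {
	int (*read_reg)(struct mhi_ctrl_sketch *c, uint32_t *addr,
			uint32_t *out);
	void (*write_reg)(struct mhi_ctrl_sketch *c, uint32_t *addr,
			  uint32_t val);
};

/* A plain-memory backend; a PCI backend could instead return -EIO
 * when the link is down. */
static int mem_read_reg(struct mhi_ctrl_sketch *c, uint32_t *addr,
			uint32_t *out)
{
	(void)c;
	*out = *addr;
	return 0;
}

static void mem_write_reg(struct mhi_ctrl_sketch *c, uint32_t *addr,
			  uint32_t val)
{
	(void)c;
	*addr = val;
}

/* Core helpers just forward, like mhi_read_reg()/mhi_write_reg(). */
static int reg_read(struct mhi_ctrl_sketch *c, uint32_t *base,
		    unsigned int off, uint32_t *out)
{
	return c->read_reg(c, base + off, out);
}

static void reg_write(struct mhi_ctrl_sketch *c, uint32_t *base,
		      unsigned int off, uint32_t val)
{
	c->write_reg(c, base + off, val);
}
```

This is also why `mhi_register_controller()` above now rejects controllers that leave `read_reg` or `write_reg` unset.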
+5 -1
drivers/bus/mhi/core/pm.c
··· 902 902 MHI_PM_IN_ERROR_STATE(mhi_cntrl->pm_state), 903 903 msecs_to_jiffies(mhi_cntrl->timeout_ms)); 904 904 905 - return (MHI_IN_MISSION_MODE(mhi_cntrl->ee)) ? 0 : -EIO; 905 + ret = (MHI_IN_MISSION_MODE(mhi_cntrl->ee)) ? 0 : -ETIMEDOUT; 906 + if (ret) 907 + mhi_power_down(mhi_cntrl, false); 908 + 909 + return ret; 906 910 } 907 911 EXPORT_SYMBOL(mhi_sync_power_up); 908 912
+1 -1
drivers/firmware/efi/tpm.c
··· 16 16 int efi_tpm_final_log_size; 17 17 EXPORT_SYMBOL(efi_tpm_final_log_size); 18 18 19 - static int tpm2_calc_event_log_size(void *data, int count, void *size_info) 19 + static int __init tpm2_calc_event_log_size(void *data, int count, void *size_info) 20 20 { 21 21 struct tcg_pcr_event2_head *header; 22 22 int event_size, size = 0;
+1 -1
drivers/gpio/gpio-pca953x.c
··· 531 531 { 532 532 struct pca953x_chip *chip = gpiochip_get_data(gc); 533 533 534 - switch (config) { 534 + switch (pinconf_to_config_param(config)) { 535 535 case PIN_CONFIG_BIAS_PULL_UP: 536 536 case PIN_CONFIG_BIAS_PULL_DOWN: 537 537 return pca953x_gpio_set_pull_up_down(chip, offset, config);
+1
drivers/gpio/gpio-tegra.c
··· 368 368 struct tegra_gpio_info *tgi = bank->tgi; 369 369 unsigned int gpio = d->hwirq; 370 370 371 + tegra_gpio_irq_mask(d); 371 372 gpiochip_unlock_as_irq(&tgi->gc, gpio); 372 373 } 373 374
+29 -5
drivers/gpio/gpiolib.c
··· 1158 1158 struct gpioline_info *info) 1159 1159 { 1160 1160 struct gpio_chip *gc = desc->gdev->chip; 1161 + bool ok_for_pinctrl; 1161 1162 unsigned long flags; 1163 + 1164 + /* 1165 + * This function takes a mutex so we must check this before taking 1166 + * the spinlock. 1167 + * 1168 + * FIXME: find a non-racy way to retrieve this information. Maybe a 1169 + * lock common to both frameworks? 1170 + */ 1171 + ok_for_pinctrl = 1172 + pinctrl_gpio_can_use_line(gc->base + info->line_offset); 1162 1173 1163 1174 spin_lock_irqsave(&gpio_lock, flags); 1164 1175 ··· 1197 1186 test_bit(FLAG_USED_AS_IRQ, &desc->flags) || 1198 1187 test_bit(FLAG_EXPORT, &desc->flags) || 1199 1188 test_bit(FLAG_SYSFS, &desc->flags) || 1200 - !pinctrl_gpio_can_use_line(gc->base + info->line_offset)) 1189 + !ok_for_pinctrl) 1201 1190 info->flags |= GPIOLINE_FLAG_KERNEL; 1202 1191 if (test_bit(FLAG_IS_OUT, &desc->flags)) 1203 1192 info->flags |= GPIOLINE_FLAG_IS_OUT; ··· 1238 1227 void __user *ip = (void __user *)arg; 1239 1228 struct gpio_desc *desc; 1240 1229 __u32 offset; 1230 + int hwgpio; 1241 1231 1242 1232 /* We fail any subsequent ioctl():s when the chip is gone */ 1243 1233 if (!gc) ··· 1271 1259 if (IS_ERR(desc)) 1272 1260 return PTR_ERR(desc); 1273 1261 1262 + hwgpio = gpio_chip_hwgpio(desc); 1263 + 1264 + if (cmd == GPIO_GET_LINEINFO_WATCH_IOCTL && 1265 + test_bit(hwgpio, priv->watched_lines)) 1266 + return -EBUSY; 1267 + 1274 1268 gpio_desc_to_lineinfo(desc, &lineinfo); 1275 1269 1276 1270 if (copy_to_user(ip, &lineinfo, sizeof(lineinfo))) 1277 1271 return -EFAULT; 1278 1272 1279 1273 if (cmd == GPIO_GET_LINEINFO_WATCH_IOCTL) 1280 - set_bit(gpio_chip_hwgpio(desc), priv->watched_lines); 1274 + set_bit(hwgpio, priv->watched_lines); 1281 1275 1282 1276 return 0; 1283 1277 } else if (cmd == GPIO_GET_LINEHANDLE_IOCTL) { ··· 1298 1280 if (IS_ERR(desc)) 1299 1281 return PTR_ERR(desc); 1300 1282 1301 - clear_bit(gpio_chip_hwgpio(desc), priv->watched_lines); 1283 + hwgpio = 
gpio_chip_hwgpio(desc); 1284 + 1285 + if (!test_bit(hwgpio, priv->watched_lines)) 1286 + return -EBUSY; 1287 + 1288 + clear_bit(hwgpio, priv->watched_lines); 1302 1289 return 0; 1303 1290 } 1304 1291 return -EINVAL; ··· 5312 5289 gpiolib_initialized = true; 5313 5290 gpiochip_setup_devs(); 5314 5291 5315 - if (IS_ENABLED(CONFIG_OF_DYNAMIC)) 5316 - WARN_ON(of_reconfig_notifier_register(&gpio_of_notifier)); 5292 + #if IS_ENABLED(CONFIG_OF_DYNAMIC) && IS_ENABLED(CONFIG_OF_GPIO) 5293 + WARN_ON(of_reconfig_notifier_register(&gpio_of_notifier)); 5294 + #endif /* CONFIG_OF_DYNAMIC && CONFIG_OF_GPIO */ 5317 5295 5318 5296 return ret; 5319 5297 }
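The gpiolib ioctl hunks make the lineinfo watch bookkeeping strict: watching an already-watched line, or unwatching a never-watched one, now fails with `-EBUSY` instead of silently succeeding. The bookkeeping can be sketched with a plain bool array in place of the kernel bitmap (error value and names invented for illustration):

```c
#include <stdbool.h>

#define EBUSY_SKETCH	16
#define NUM_LINES	8

static bool watched_lines[NUM_LINES];

static int lineinfo_watch(unsigned int hwgpio)
{
	if (watched_lines[hwgpio])
		return -EBUSY_SKETCH;	/* duplicate watch */
	watched_lines[hwgpio] = true;
	return 0;
}

static int lineinfo_unwatch(unsigned int hwgpio)
{
	if (!watched_lines[hwgpio])
		return -EBUSY_SKETCH;	/* was never watched */
	watched_lines[hwgpio] = false;
	return 0;
}
```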
+1
drivers/gpu/drm/amd/amdgpu/amdgpu.h
··· 945 945 946 946 /* s3/s4 mask */ 947 947 bool in_suspend; 948 + bool in_hibernate; 948 949 949 950 /* record last mm index being written through WREG32*/ 950 951 unsigned long last_mm_index;
+3 -2
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
··· 1343 1343 } 1344 1344 1345 1345 /* Free the BO*/ 1346 - amdgpu_bo_unref(&mem->bo); 1346 + drm_gem_object_put_unlocked(&mem->bo->tbo.base); 1347 1347 mutex_destroy(&mem->lock); 1348 1348 kfree(mem); 1349 1349 ··· 1688 1688 | KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE 1689 1689 | KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE; 1690 1690 1691 - (*mem)->bo = amdgpu_bo_ref(bo); 1691 + drm_gem_object_get(&bo->tbo.base); 1692 + (*mem)->bo = bo; 1692 1693 (*mem)->va = va; 1693 1694 (*mem)->domain = (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) ? 1694 1695 AMDGPU_GEM_DOMAIN_VRAM : AMDGPU_GEM_DOMAIN_GTT;
+2 -5
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
··· 3372 3372 } 3373 3373 } 3374 3374 3375 - amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE); 3376 - amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE); 3377 - 3378 - amdgpu_amdkfd_suspend(adev, !fbcon); 3379 - 3380 3375 amdgpu_ras_suspend(adev); 3381 3376 3382 3377 r = amdgpu_device_ip_suspend_phase1(adev); 3378 + 3379 + amdgpu_amdkfd_suspend(adev, !fbcon); 3383 3380 3384 3381 /* evict vram memory */ 3385 3382 amdgpu_bo_evict_vram(adev);
+2
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
··· 1181 1181 struct amdgpu_device *adev = drm_dev->dev_private; 1182 1182 int r; 1183 1183 1184 + adev->in_hibernate = true; 1184 1185 r = amdgpu_device_suspend(drm_dev, true); 1186 + adev->in_hibernate = false; 1185 1187 if (r) 1186 1188 return r; 1187 1189 return amdgpu_asic_reset(adev);
+1 -2
drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
··· 133 133 u32 cpp; 134 134 u64 flags = AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED | 135 135 AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS | 136 - AMDGPU_GEM_CREATE_VRAM_CLEARED | 137 - AMDGPU_GEM_CREATE_CPU_GTT_USWC; 136 + AMDGPU_GEM_CREATE_VRAM_CLEARED; 138 137 139 138 info = drm_get_format_info(adev->ddev, mode_cmd); 140 139 cpp = info->cpp[0];
+16 -6
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
··· 4273 4273 /* === CGCG /CGLS for GFX 3D Only === */ 4274 4274 gfx_v10_0_update_3d_clock_gating(adev, enable); 4275 4275 /* === MGCG + MGLS === */ 4276 - /* gfx_v10_0_update_medium_grain_clock_gating(adev, enable); */ 4276 + gfx_v10_0_update_medium_grain_clock_gating(adev, enable); 4277 4277 } 4278 4278 4279 4279 if (adev->cg_flags & ··· 4353 4353 switch (adev->asic_type) { 4354 4354 case CHIP_NAVI10: 4355 4355 case CHIP_NAVI14: 4356 - if (!enable) { 4357 - amdgpu_gfx_off_ctrl(adev, false); 4358 - cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work); 4359 - } else 4360 - amdgpu_gfx_off_ctrl(adev, true); 4356 + amdgpu_gfx_off_ctrl(adev, enable); 4361 4357 break; 4362 4358 default: 4363 4359 break; ··· 4914 4918 ref, mask); 4915 4919 } 4916 4920 4921 + static void gfx_v10_0_ring_soft_recovery(struct amdgpu_ring *ring, 4922 + unsigned vmid) 4923 + { 4924 + struct amdgpu_device *adev = ring->adev; 4925 + uint32_t value = 0; 4926 + 4927 + value = REG_SET_FIELD(value, SQ_CMD, CMD, 0x03); 4928 + value = REG_SET_FIELD(value, SQ_CMD, MODE, 0x01); 4929 + value = REG_SET_FIELD(value, SQ_CMD, CHECK_VMID, 1); 4930 + value = REG_SET_FIELD(value, SQ_CMD, VM_ID, vmid); 4931 + WREG32_SOC15(GC, 0, mmSQ_CMD, value); 4932 + } 4933 + 4917 4934 static void 4918 4935 gfx_v10_0_set_gfx_eop_interrupt_state(struct amdgpu_device *adev, 4919 4936 uint32_t me, uint32_t pipe, ··· 5318 5309 .emit_wreg = gfx_v10_0_ring_emit_wreg, 5319 5310 .emit_reg_wait = gfx_v10_0_ring_emit_reg_wait, 5320 5311 .emit_reg_write_reg_wait = gfx_v10_0_ring_emit_reg_write_reg_wait, 5312 + .soft_recovery = gfx_v10_0_ring_soft_recovery, 5321 5313 }; 5322 5314 5323 5315 static const struct amdgpu_ring_funcs gfx_v10_0_ring_funcs_compute = {
+5 -9
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
··· 1236 1236 { 0x1002, 0x15dd, 0x1002, 0x15dd, 0xc8 }, 1237 1237 /* https://bugzilla.kernel.org/show_bug.cgi?id=207171 */ 1238 1238 { 0x1002, 0x15dd, 0x103c, 0x83e7, 0xd3 }, 1239 + /* GFXOFF is unstable on C6 parts with a VBIOS 113-RAVEN-114 */ 1240 + { 0x1002, 0x15dd, 0x1002, 0x15dd, 0xc6 }, 1239 1241 { 0, 0, 0, 0, 0 }, 1240 1242 }; 1241 1243 ··· 5027 5025 switch (adev->asic_type) { 5028 5026 case CHIP_RAVEN: 5029 5027 case CHIP_RENOIR: 5030 - if (!enable) { 5028 + if (!enable) 5031 5029 amdgpu_gfx_off_ctrl(adev, false); 5032 - cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work); 5033 - } 5030 + 5034 5031 if (adev->pg_flags & AMD_PG_SUPPORT_RLC_SMU_HS) { 5035 5032 gfx_v9_0_enable_sck_slow_down_on_power_up(adev, true); 5036 5033 gfx_v9_0_enable_sck_slow_down_on_power_down(adev, true); ··· 5053 5052 amdgpu_gfx_off_ctrl(adev, true); 5054 5053 break; 5055 5054 case CHIP_VEGA12: 5056 - if (!enable) { 5057 - amdgpu_gfx_off_ctrl(adev, false); 5058 - cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work); 5059 - } else { 5060 - amdgpu_gfx_off_ctrl(adev, true); 5061 - } 5055 + amdgpu_gfx_off_ctrl(adev, enable); 5062 5056 break; 5063 5057 default: 5064 5058 break;
+90 -90
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
··· 441 441 442 442 /** 443 443 * dm_crtc_high_irq() - Handles CRTC interrupt 444 - * @interrupt_params: ignored 444 + * @interrupt_params: used for determining the CRTC instance 445 445 * 446 446 * Handles the CRTC/VSYNC interrupt by notfying DRM's VBLANK 447 447 * event handler. ··· 455 455 unsigned long flags; 456 456 457 457 acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - IRQ_TYPE_VBLANK); 458 - 459 - if (acrtc) { 460 - acrtc_state = to_dm_crtc_state(acrtc->base.state); 461 - 462 - DRM_DEBUG_VBL("crtc:%d, vupdate-vrr:%d\n", 463 - acrtc->crtc_id, 464 - amdgpu_dm_vrr_active(acrtc_state)); 465 - 466 - /* Core vblank handling at start of front-porch is only possible 467 - * in non-vrr mode, as only there vblank timestamping will give 468 - * valid results while done in front-porch. Otherwise defer it 469 - * to dm_vupdate_high_irq after end of front-porch. 470 - */ 471 - if (!amdgpu_dm_vrr_active(acrtc_state)) 472 - drm_crtc_handle_vblank(&acrtc->base); 473 - 474 - /* Following stuff must happen at start of vblank, for crc 475 - * computation and below-the-range btr support in vrr mode. 
476 - */ 477 - amdgpu_dm_crtc_handle_crc_irq(&acrtc->base); 478 - 479 - if (acrtc_state->stream && adev->family >= AMDGPU_FAMILY_AI && 480 - acrtc_state->vrr_params.supported && 481 - acrtc_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE) { 482 - spin_lock_irqsave(&adev->ddev->event_lock, flags); 483 - mod_freesync_handle_v_update( 484 - adev->dm.freesync_module, 485 - acrtc_state->stream, 486 - &acrtc_state->vrr_params); 487 - 488 - dc_stream_adjust_vmin_vmax( 489 - adev->dm.dc, 490 - acrtc_state->stream, 491 - &acrtc_state->vrr_params.adjust); 492 - spin_unlock_irqrestore(&adev->ddev->event_lock, flags); 493 - } 494 - } 495 - } 496 - 497 - #if defined(CONFIG_DRM_AMD_DC_DCN) 498 - /** 499 - * dm_dcn_crtc_high_irq() - Handles VStartup interrupt for DCN generation ASICs 500 - * @interrupt params - interrupt parameters 501 - * 502 - * Notify DRM's vblank event handler at VSTARTUP 503 - * 504 - * Unlike DCE hardware, we trigger the handler at VSTARTUP. at which: 505 - * * We are close enough to VUPDATE - the point of no return for hw 506 - * * We are in the fixed portion of variable front porch when vrr is enabled 507 - * * We are before VUPDATE, where double-buffered vrr registers are swapped 508 - * 509 - * It is therefore the correct place to signal vblank, send user flip events, 510 - * and update VRR. 
511 - */ 512 - static void dm_dcn_crtc_high_irq(void *interrupt_params) 513 - { 514 - struct common_irq_params *irq_params = interrupt_params; 515 - struct amdgpu_device *adev = irq_params->adev; 516 - struct amdgpu_crtc *acrtc; 517 - struct dm_crtc_state *acrtc_state; 518 - unsigned long flags; 519 - 520 - acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - IRQ_TYPE_VBLANK); 521 - 522 458 if (!acrtc) 523 459 return; 524 460 ··· 464 528 amdgpu_dm_vrr_active(acrtc_state), 465 529 acrtc_state->active_planes); 466 530 531 + /** 532 + * Core vblank handling at start of front-porch is only possible 533 + * in non-vrr mode, as only there vblank timestamping will give 534 + * valid results while done in front-porch. Otherwise defer it 535 + * to dm_vupdate_high_irq after end of front-porch. 536 + */ 537 + if (!amdgpu_dm_vrr_active(acrtc_state)) 538 + drm_crtc_handle_vblank(&acrtc->base); 539 + 540 + /** 541 + * Following stuff must happen at start of vblank, for crc 542 + * computation and below-the-range btr support in vrr mode. 543 + */ 467 544 amdgpu_dm_crtc_handle_crc_irq(&acrtc->base); 468 - drm_crtc_handle_vblank(&acrtc->base); 545 + 546 + /* BTR updates need to happen before VUPDATE on Vega and above. 
*/ 547 + if (adev->family < AMDGPU_FAMILY_AI) 548 + return; 469 549 470 550 spin_lock_irqsave(&adev->ddev->event_lock, flags); 471 551 472 - if (acrtc_state->vrr_params.supported && 552 + if (acrtc_state->stream && acrtc_state->vrr_params.supported && 473 553 acrtc_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE) { 474 - mod_freesync_handle_v_update( 475 - adev->dm.freesync_module, 476 - acrtc_state->stream, 477 - &acrtc_state->vrr_params); 554 + mod_freesync_handle_v_update(adev->dm.freesync_module, 555 + acrtc_state->stream, 556 + &acrtc_state->vrr_params); 478 557 479 - dc_stream_adjust_vmin_vmax( 480 - adev->dm.dc, 481 - acrtc_state->stream, 482 - &acrtc_state->vrr_params.adjust); 558 + dc_stream_adjust_vmin_vmax(adev->dm.dc, acrtc_state->stream, 559 + &acrtc_state->vrr_params.adjust); 483 560 } 484 561 485 562 /* ··· 505 556 * avoid race conditions between flip programming and completion, 506 557 * which could cause too early flip completion events. 507 558 */ 508 - if (acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED && 559 + if (adev->family >= AMDGPU_FAMILY_RV && 560 + acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED && 509 561 acrtc_state->active_planes == 0) { 510 562 if (acrtc->event) { 511 563 drm_crtc_send_vblank_event(&acrtc->base, acrtc->event); ··· 518 568 519 569 spin_unlock_irqrestore(&adev->ddev->event_lock, flags); 520 570 } 521 - #endif 522 571 523 572 static int dm_set_clockgating_state(void *handle, 524 573 enum amd_clockgating_state state) ··· 1957 2008 dc_sink_retain(aconnector->dc_sink); 1958 2009 if (sink->dc_edid.length == 0) { 1959 2010 aconnector->edid = NULL; 1960 - drm_dp_cec_unset_edid(&aconnector->dm_dp_aux.aux); 2011 + if (aconnector->dc_link->aux_mode) { 2012 + drm_dp_cec_unset_edid( 2013 + &aconnector->dm_dp_aux.aux); 2014 + } 1961 2015 } else { 1962 2016 aconnector->edid = 1963 - (struct edid *) sink->dc_edid.raw_edid; 1964 - 2017 + (struct edid *)sink->dc_edid.raw_edid; 1965 2018 1966 2019 
drm_connector_update_edid_property(connector, 1967 - aconnector->edid); 1968 - drm_dp_cec_set_edid(&aconnector->dm_dp_aux.aux, 1969 - aconnector->edid); 2020 + aconnector->edid); 2021 + 2022 + if (aconnector->dc_link->aux_mode) 2023 + drm_dp_cec_set_edid(&aconnector->dm_dp_aux.aux, 2024 + aconnector->edid); 1970 2025 } 2026 + 1971 2027 amdgpu_dm_update_freesync_caps(connector, aconnector->edid); 1972 2028 update_connector_ext_caps(aconnector); 1973 2029 } else { ··· 2394 2440 c_irq_params->adev = adev; 2395 2441 c_irq_params->irq_src = int_params.irq_source; 2396 2442 2443 + amdgpu_dm_irq_register_interrupt( 2444 + adev, &int_params, dm_crtc_high_irq, c_irq_params); 2445 + } 2446 + 2447 + /* Use VUPDATE_NO_LOCK interrupt on DCN, which seems to correspond to 2448 + * the regular VUPDATE interrupt on DCE. We want DC_IRQ_SOURCE_VUPDATEx 2449 + * to trigger at end of each vblank, regardless of state of the lock, 2450 + * matching DCE behaviour. 2451 + */ 2452 + for (i = DCN_1_0__SRCID__OTG0_IHC_V_UPDATE_NO_LOCK_INTERRUPT; 2453 + i <= DCN_1_0__SRCID__OTG0_IHC_V_UPDATE_NO_LOCK_INTERRUPT + adev->mode_info.num_crtc - 1; 2454 + i++) { 2455 + r = amdgpu_irq_add_id(adev, SOC15_IH_CLIENTID_DCE, i, &adev->vupdate_irq); 2456 + 2457 + if (r) { 2458 + DRM_ERROR("Failed to add vupdate irq id!\n"); 2459 + return r; 2460 + } 2461 + 2462 + int_params.int_context = INTERRUPT_HIGH_IRQ_CONTEXT; 2463 + int_params.irq_source = 2464 + dc_interrupt_to_irq_source(dc, i, 0); 2465 + 2466 + c_irq_params = &adev->dm.vupdate_params[int_params.irq_source - DC_IRQ_SOURCE_VUPDATE1]; 2467 + 2468 + c_irq_params->adev = adev; 2469 + c_irq_params->irq_src = int_params.irq_source; 2470 + 2397 2471 amdgpu_dm_irq_register_interrupt(adev, &int_params, 2398 - dm_dcn_crtc_high_irq, c_irq_params); 2472 + dm_vupdate_high_irq, c_irq_params); 2399 2473 } 2400 2474 2401 2475 /* Use GRPH_PFLIP interrupt */ ··· 4429 4447 struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc); 4430 4448 struct amdgpu_device *adev = 
crtc->dev->dev_private; 4431 4449 int rc; 4432 - 4433 - /* Do not set vupdate for DCN hardware */ 4434 - if (adev->family > AMDGPU_FAMILY_AI) 4435 - return 0; 4436 4450 4437 4451 irq_source = IRQ_TYPE_VUPDATE + acrtc->otg_inst; 4438 4452 ··· 7855 7877 struct drm_crtc_state *old_crtc_state, *new_crtc_state; 7856 7878 struct dm_crtc_state *dm_new_crtc_state, *dm_old_crtc_state; 7857 7879 struct dm_plane_state *dm_new_plane_state, *dm_old_plane_state; 7880 + struct amdgpu_crtc *new_acrtc; 7858 7881 bool needs_reset; 7859 7882 int ret = 0; 7860 7883 ··· 7865 7886 dm_new_plane_state = to_dm_plane_state(new_plane_state); 7866 7887 dm_old_plane_state = to_dm_plane_state(old_plane_state); 7867 7888 7868 - /*TODO Implement atomic check for cursor plane */ 7869 - if (plane->type == DRM_PLANE_TYPE_CURSOR) 7889 + /*TODO Implement better atomic check for cursor plane */ 7890 + if (plane->type == DRM_PLANE_TYPE_CURSOR) { 7891 + if (!enable || !new_plane_crtc || 7892 + drm_atomic_plane_disabling(plane->state, new_plane_state)) 7893 + return 0; 7894 + 7895 + new_acrtc = to_amdgpu_crtc(new_plane_crtc); 7896 + 7897 + if ((new_plane_state->crtc_w > new_acrtc->max_cursor_width) || 7898 + (new_plane_state->crtc_h > new_acrtc->max_cursor_height)) { 7899 + DRM_DEBUG_ATOMIC("Bad cursor size %d x %d\n", 7900 + new_plane_state->crtc_w, new_plane_state->crtc_h); 7901 + return -EINVAL; 7902 + } 7903 + 7904 + if (new_plane_state->crtc_x <= -new_acrtc->max_cursor_width || 7905 + new_plane_state->crtc_y <= -new_acrtc->max_cursor_height) { 7906 + DRM_DEBUG_ATOMIC("Bad cursor position %d, %d\n", 7907 + new_plane_state->crtc_x, new_plane_state->crtc_y); 7908 + return -EINVAL; 7909 + } 7910 + 7870 7911 return 0; 7912 + } 7871 7913 7872 7914 needs_reset = should_reset_plane(state, plane, old_plane_state, 7873 7915 new_plane_state);
+5 -5
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c
··· 398 398 struct mod_hdcp_display *display = &hdcp_work[link_index].display;
399 399 struct mod_hdcp_link *link = &hdcp_work[link_index].link;
400 400
401 - memset(display, 0, sizeof(*display));
402 - memset(link, 0, sizeof(*link));
403 -
404 - display->index = aconnector->base.index;
405 -
406 401 if (config->dpms_off) {
407 402 hdcp_remove_display(hdcp_work, link_index, aconnector);
408 403 return;
409 404 }
405 +
406 + memset(display, 0, sizeof(*display));
407 + memset(link, 0, sizeof(*link));
408 +
409 + display->index = aconnector->base.index;
410 410 display->state = MOD_HDCP_DISPLAY_ACTIVE;
411 411
412 412 if (aconnector->dc_sink != NULL)
+2 -3
drivers/gpu/drm/amd/display/dc/core/dc.c
··· 834 834 static void wait_for_no_pipes_pending(struct dc *dc, struct dc_state *context)
835 835 {
836 836 int i;
837 - int count = 0;
838 - struct pipe_ctx *pipe;
839 837 PERF_TRACE();
840 838 for (i = 0; i < MAX_PIPES; i++) {
841 - pipe = &context->res_ctx.pipe_ctx[i];
839 + int count = 0;
840 + struct pipe_ctx *pipe = &context->res_ctx.pipe_ctx[i];
842 841
843 842 if (!pipe->plane_state)
844 843 continue;
+23 -8
drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
··· 3068 3068 return out;
3069 3069 }
3070 3070
3071 -
3072 - bool dcn20_validate_bandwidth(struct dc *dc, struct dc_state *context,
3073 - bool fast_validate)
3071 + /*
3072 + * This must be noinline to ensure anything that deals with FP registers
3073 + * is contained within this call; previously our compiling with hard-float
3074 + * would result in fp instructions being emitted outside of the boundaries
3075 + * of the DC_FP_START/END macros, which makes sense as the compiler has no
3076 + * idea about what is wrapped and what is not
3077 + *
3078 + * This is largely just a workaround to avoid breakage introduced with 5.6,
3079 + * ideally all fp-using code should be moved into its own file, only that
3080 + * should be compiled with hard-float, and all code exported from there
3081 + * should be strictly wrapped with DC_FP_START/END
3082 + */
3083 + static noinline bool dcn20_validate_bandwidth_fp(struct dc *dc,
3084 + struct dc_state *context, bool fast_validate)
3074 3085 {
3075 3086 bool voltage_supported = false;
3076 3087 bool full_pstate_supported = false;
3077 3088 bool dummy_pstate_supported = false;
3078 3089 double p_state_latency_us;
3079 3090
3080 - DC_FP_START();
3081 3091 p_state_latency_us = context->bw_ctx.dml.soc.dram_clock_change_latency_us;
3082 3092 context->bw_ctx.dml.soc.disable_dram_clock_change_vactive_support =
3083 3093 dc->debug.disable_dram_clock_change_vactive_support;
3084 3094
3085 3095 if (fast_validate) {
3086 - voltage_supported = dcn20_validate_bandwidth_internal(dc, context, true);
3087 -
3088 - DC_FP_END();
3089 - return voltage_supported;
3096 + return dcn20_validate_bandwidth_internal(dc, context, true);
3090 3097 }
3091 3098
3092 3099 // Best case, we support full UCLK switch latency
··· 3122 3115
3123 3116 restore_dml_state:
3124 3117 context->bw_ctx.dml.soc.dram_clock_change_latency_us = p_state_latency_us;
3118 + return voltage_supported;
3119 + }
3125 3120
3121 + bool dcn20_validate_bandwidth(struct dc *dc, struct dc_state *context,
3122 + bool fast_validate)
3123 + {
3124 + bool voltage_supported = false;
3125 + DC_FP_START();
3126 + voltage_supported = dcn20_validate_bandwidth_fp(dc, context, fast_validate);
3126 3127 DC_FP_END();
3127 3128 return voltage_supported;
3128 3129 }
+4 -4
drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c
··· 1200 1200 min_hratio_fact_l = 1.0;
1201 1201 min_hratio_fact_c = 1.0;
1202 1202
1203 - if (htaps_l <= 1)
1203 + if (hratio_l <= 1)
1204 1204 min_hratio_fact_l = 2.0;
1205 1205 else if (htaps_l <= 6) {
1206 1206 if ((hratio_l * 2.0) > 4.0)
··· 1216 1216
1217 1217 hscale_pixel_rate_l = min_hratio_fact_l * dppclk_freq_in_mhz;
1218 1218
1219 - if (htaps_c <= 1)
1219 + if (hratio_c <= 1)
1220 1220 min_hratio_fact_c = 2.0;
1221 1221 else if (htaps_c <= 6) {
1222 1222 if ((hratio_c * 2.0) > 4.0)
··· 1522 1522
1523 1523 disp_dlg_regs->refcyc_per_vm_group_vblank = get_refcyc_per_vm_group_vblank(mode_lib, e2e_pipe_param, num_pipes, pipe_idx) * refclk_freq_in_mhz;
1524 1524 disp_dlg_regs->refcyc_per_vm_group_flip = get_refcyc_per_vm_group_flip(mode_lib, e2e_pipe_param, num_pipes, pipe_idx) * refclk_freq_in_mhz;
1525 - disp_dlg_regs->refcyc_per_vm_req_vblank = get_refcyc_per_vm_req_vblank(mode_lib, e2e_pipe_param, num_pipes, pipe_idx) * refclk_freq_in_mhz;
1526 - disp_dlg_regs->refcyc_per_vm_req_flip = get_refcyc_per_vm_req_flip(mode_lib, e2e_pipe_param, num_pipes, pipe_idx) * refclk_freq_in_mhz;
1525 + disp_dlg_regs->refcyc_per_vm_req_vblank = get_refcyc_per_vm_req_vblank(mode_lib, e2e_pipe_param, num_pipes, pipe_idx) * refclk_freq_in_mhz * dml_pow(2, 10);
1526 + disp_dlg_regs->refcyc_per_vm_req_flip = get_refcyc_per_vm_req_flip(mode_lib, e2e_pipe_param, num_pipes, pipe_idx) * refclk_freq_in_mhz * dml_pow(2, 10);
1527 1527
1528 1528 // Clamp to max for now
1529 1529 if (disp_dlg_regs->refcyc_per_vm_group_vblank >= (unsigned int)dml_pow(2, 23))
+1 -1
drivers/gpu/drm/amd/display/dc/os_types.h
··· 108 108 #define ASSERT(expr) ASSERT_CRITICAL(expr)
109 109
110 110 #else
111 - #define ASSERT(expr) WARN_ON(!(expr))
111 + #define ASSERT(expr) WARN_ON_ONCE(!(expr))
112 112 #endif
113 113
114 114 #define BREAK_TO_DEBUGGER() ASSERT(0)
+3 -3
drivers/gpu/drm/amd/powerplay/amd_powerplay.c
··· 319 319 if (*level & profile_mode_mask) {
320 320 hwmgr->saved_dpm_level = hwmgr->dpm_level;
321 321 hwmgr->en_umd_pstate = true;
322 - amdgpu_device_ip_set_clockgating_state(hwmgr->adev,
323 - AMD_IP_BLOCK_TYPE_GFX,
324 - AMD_CG_STATE_UNGATE);
325 322 amdgpu_device_ip_set_powergating_state(hwmgr->adev,
326 323 AMD_IP_BLOCK_TYPE_GFX,
327 324 AMD_PG_STATE_UNGATE);
325 + amdgpu_device_ip_set_clockgating_state(hwmgr->adev,
326 + AMD_IP_BLOCK_TYPE_GFX,
327 + AMD_CG_STATE_UNGATE);
328 328 }
329 329 } else {
330 330 /* exit umd pstate, restore level, enable gfx cg*/
+4 -4
drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
··· 1476 1476 bool use_baco = !smu->is_apu &&
1477 1477 ((adev->in_gpu_reset &&
1478 1478 (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO)) ||
1479 - (adev->in_runpm && amdgpu_asic_supports_baco(adev)));
1479 + ((adev->in_runpm || adev->in_hibernate) && amdgpu_asic_supports_baco(adev)));
1480 1480
1481 1481 ret = smu_get_smc_version(smu, NULL, &smu_version);
1482 1482 if (ret) {
··· 1744 1744 if (*level & profile_mode_mask) {
1745 1745 smu_dpm_ctx->saved_dpm_level = smu_dpm_ctx->dpm_level;
1746 1746 smu_dpm_ctx->enable_umd_pstate = true;
1747 - amdgpu_device_ip_set_clockgating_state(smu->adev,
1748 - AMD_IP_BLOCK_TYPE_GFX,
1749 - AMD_CG_STATE_UNGATE);
1750 1747 amdgpu_device_ip_set_powergating_state(smu->adev,
1751 1748 AMD_IP_BLOCK_TYPE_GFX,
1752 1749 AMD_PG_STATE_UNGATE);
1750 + amdgpu_device_ip_set_clockgating_state(smu->adev,
1751 + AMD_IP_BLOCK_TYPE_GFX,
1752 + AMD_CG_STATE_UNGATE);
1753 1753 }
1754 1754 } else {
1755 1755 /* exit umd pstate, restore level, enable gfx cg*/
+7 -1
drivers/gpu/drm/drm_hdcp.c
··· 241 241
242 242 ret = request_firmware_direct(&fw, (const char *)fw_name,
243 243 drm_dev->dev);
244 - if (ret < 0)
244 + if (ret < 0) {
245 + *revoked_ksv_cnt = 0;
246 + *revoked_ksv_list = NULL;
247 + ret = 0;
245 248 goto exit;
249 + }
246 250
247 251 if (fw->size && fw->data)
248 252 ret = drm_hdcp_srm_update(fw->data, fw->size, revoked_ksv_list,
··· 291 287
292 288 ret = drm_hdcp_request_srm(drm_dev, &revoked_ksv_list,
293 289 &revoked_ksv_cnt);
290 + if (ret)
291 + return ret;
294 292
295 293 /* revoked_ksv_cnt will be zero when above function failed */
296 294 for (i = 0; i < revoked_ksv_cnt; i++)
+1 -2
drivers/gpu/drm/i915/display/intel_fbc.c
··· 485 485 if (!ret)
486 486 goto err_llb;
487 487 else if (ret > 1) {
488 - DRM_INFO("Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.\n");
489 -
488 + DRM_INFO_ONCE("Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.\n");
490 489 }
491 490
492 491 fbc->threshold = ret;
+1 -6
drivers/gpu/drm/i915/gem/i915_gem_domain.c
··· 368 368 struct drm_i915_private *i915 = to_i915(obj->base.dev);
369 369 struct i915_vma *vma;
370 370
371 - GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
372 371 if (!atomic_read(&obj->bind_count))
373 372 return;
374 373
··· 399 400 void
400 401 i915_gem_object_unpin_from_display_plane(struct i915_vma *vma)
401 402 {
402 - struct drm_i915_gem_object *obj = vma->obj;
403 -
404 - assert_object_held(obj);
405 -
406 403 /* Bump the LRU to try and avoid premature eviction whilst flipping */
407 - i915_gem_object_bump_inactive_ggtt(obj);
404 + i915_gem_object_bump_inactive_ggtt(vma->obj);
408 405
409 406 i915_vma_unpin(vma);
410 407 }
+7 -1
drivers/gpu/drm/i915/gt/intel_context_types.h
··· 69 69 #define CONTEXT_NOPREEMPT 7
70 70
71 71 u32 *lrc_reg_state;
72 - u64 lrc_desc;
72 + union {
73 + struct {
74 + u32 lrca;
75 + u32 ccid;
76 + };
77 + u64 desc;
78 + } lrc;
73 79 u32 tag; /* cookie passed to HW to track this context on submission */
74 80
75 81 /* Time on GPU as tracked by the hw. */
-9
drivers/gpu/drm/i915/gt/intel_engine.h
··· 333 333 return intel_engine_has_preemption(engine);
334 334 }
335 335
336 - static inline bool
337 - intel_engine_has_timeslices(const struct intel_engine_cs *engine)
338 - {
339 - if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
340 - return false;
341 -
342 - return intel_engine_has_semaphores(engine);
343 - }
344 -
345 336 #endif /* _INTEL_RINGBUFFER_H_ */
+6
drivers/gpu/drm/i915/gt/intel_engine_cs.c
··· 1295 1295
1296 1296 if (engine->id == RENDER_CLASS && IS_GEN_RANGE(dev_priv, 4, 7))
1297 1297 drm_printf(m, "\tCCID: 0x%08x\n", ENGINE_READ(engine, CCID));
1298 + if (HAS_EXECLISTS(dev_priv)) {
1299 + drm_printf(m, "\tEL_STAT_HI: 0x%08x\n",
1300 + ENGINE_READ(engine, RING_EXECLIST_STATUS_HI));
1301 + drm_printf(m, "\tEL_STAT_LO: 0x%08x\n",
1302 + ENGINE_READ(engine, RING_EXECLIST_STATUS_LO));
1303 + }
1298 1304 drm_printf(m, "\tRING_START: 0x%08x\n",
1299 1305 ENGINE_READ(engine, RING_START));
1300 1306 drm_printf(m, "\tRING_HEAD: 0x%08x\n",
+29 -6
drivers/gpu/drm/i915/gt/intel_engine_types.h
··· 157 157 struct i915_priolist default_priolist;
158 158
159 159 /**
160 + * @ccid: identifier for contexts submitted to this engine
161 + */
162 + u32 ccid;
163 +
164 + /**
165 + * @yield: CCID at the time of the last semaphore-wait interrupt.
166 + *
167 + * Instead of leaving a semaphore busy-spinning on an engine, we would
168 + * like to switch to another ready context, i.e. yielding the semaphore
169 + * timeslice.
170 + */
171 + u32 yield;
172 +
173 + /**
160 174 * @error_interrupt: CS Master EIR
161 175 *
162 176 * The CS generates an interrupt when it detects an error. We capture
··· 309 295 u32 context_size;
310 296 u32 mmio_base;
311 297
312 - unsigned int context_tag;
313 - #define NUM_CONTEXT_TAG roundup_pow_of_two(2 * EXECLIST_MAX_PORTS)
298 + unsigned long context_tag;
314 299
315 300 struct rb_node uabi_node;
316 301
··· 496 483 #define I915_ENGINE_SUPPORTS_STATS BIT(1)
497 484 #define I915_ENGINE_HAS_PREEMPTION BIT(2)
498 485 #define I915_ENGINE_HAS_SEMAPHORES BIT(3)
499 - #define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(4)
500 - #define I915_ENGINE_IS_VIRTUAL BIT(5)
501 - #define I915_ENGINE_HAS_RELATIVE_MMIO BIT(6)
502 - #define I915_ENGINE_REQUIRES_CMD_PARSER BIT(7)
486 + #define I915_ENGINE_HAS_TIMESLICES BIT(4)
487 + #define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(5)
488 + #define I915_ENGINE_IS_VIRTUAL BIT(6)
489 + #define I915_ENGINE_HAS_RELATIVE_MMIO BIT(7)
490 + #define I915_ENGINE_REQUIRES_CMD_PARSER BIT(8)
503 491 unsigned int flags;
504 492
505 493 /*
··· 596 582 intel_engine_has_semaphores(const struct intel_engine_cs *engine)
597 583 {
598 584 return engine->flags & I915_ENGINE_HAS_SEMAPHORES;
585 + }
586 +
587 + static inline bool
588 + intel_engine_has_timeslices(const struct intel_engine_cs *engine)
589 + {
590 + if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
591 + return false;
592 +
593 + return engine->flags & I915_ENGINE_HAS_TIMESLICES;
599 594 }
600 595
601 596 static inline bool
+13 -2
drivers/gpu/drm/i915/gt/intel_gt_irq.c
··· 39 39 }
40 40 }
41 41
42 + if (iir & GT_WAIT_SEMAPHORE_INTERRUPT) {
43 + WRITE_ONCE(engine->execlists.yield,
44 + ENGINE_READ_FW(engine, RING_EXECLIST_STATUS_HI));
45 + ENGINE_TRACE(engine, "semaphore yield: %08x\n",
46 + engine->execlists.yield);
47 + if (del_timer(&engine->execlists.timer))
48 + tasklet = true;
49 + }
50 +
42 51 if (iir & GT_CONTEXT_SWITCH_INTERRUPT)
43 52 tasklet = true;
44 53
··· 237 228 const u32 irqs =
238 229 GT_CS_MASTER_ERROR_INTERRUPT |
239 230 GT_RENDER_USER_INTERRUPT |
240 - GT_CONTEXT_SWITCH_INTERRUPT;
231 + GT_CONTEXT_SWITCH_INTERRUPT |
232 + GT_WAIT_SEMAPHORE_INTERRUPT;
241 233 struct intel_uncore *uncore = gt->uncore;
242 234 const u32 dmask = irqs << 16 | irqs;
243 235 const u32 smask = irqs << 16;
··· 376 366 const u32 irqs =
377 367 GT_CS_MASTER_ERROR_INTERRUPT |
378 368 GT_RENDER_USER_INTERRUPT |
379 - GT_CONTEXT_SWITCH_INTERRUPT;
369 + GT_CONTEXT_SWITCH_INTERRUPT |
370 + GT_WAIT_SEMAPHORE_INTERRUPT;
380 371 const u32 gt_interrupts[] = {
381 372 irqs << GEN8_RCS_IRQ_SHIFT | irqs << GEN8_BCS_IRQ_SHIFT,
382 373 irqs << GEN8_VCS0_IRQ_SHIFT | irqs << GEN8_VCS1_IRQ_SHIFT,
+81 -39
drivers/gpu/drm/i915/gt/intel_lrc.c
··· 456 456 * engine info, SW context ID and SW counter need to form a unique number
457 457 * (Context ID) per lrc.
458 458 */
459 - static u64
459 + static u32
460 460 lrc_descriptor(struct intel_context *ce, struct intel_engine_cs *engine)
461 461 {
462 - u64 desc;
462 + u32 desc;
463 463
464 464 desc = INTEL_LEGACY_32B_CONTEXT;
465 465 if (i915_vm_is_4lvl(ce->vm))
··· 470 470 if (IS_GEN(engine->i915, 8))
471 471 desc |= GEN8_CTX_L3LLC_COHERENT;
472 472
473 - desc |= i915_ggtt_offset(ce->state); /* bits 12-31 */
474 - /*
475 - * The following 32bits are copied into the OA reports (dword 2).
476 - * Consider updating oa_get_render_ctx_id in i915_perf.c when changing
477 - * anything below.
478 - */
479 - if (INTEL_GEN(engine->i915) >= 11) {
480 - desc |= (u64)engine->instance << GEN11_ENGINE_INSTANCE_SHIFT;
481 - /* bits 48-53 */
482 -
483 - desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
484 - /* bits 61-63 */
485 - }
486 -
487 - return desc;
473 + return i915_ggtt_offset(ce->state) | desc;
488 474 }
489 475
490 476 static inline unsigned int dword_in_page(void *addr)
··· 1178 1192 __execlists_update_reg_state(ce, engine, head);
1179 1193
1180 1194 /* We've switched away, so this should be a no-op, but intent matters */
1181 - ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
1195 + ce->lrc.desc |= CTX_DESC_FORCE_RESTORE;
1182 1196 }
1183 1197
1184 1198 static u32 intel_context_get_runtime(const struct intel_context *ce)
··· 1237 1251 if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
1238 1252 execlists_check_context(ce, engine);
1239 1253
1240 - ce->lrc_desc &= ~GENMASK_ULL(47, 37);
1241 1254 if (ce->tag) {
1242 1255 /* Use a fixed tag for OA and friends */
1243 - ce->lrc_desc |= (u64)ce->tag << 32;
1256 + GEM_BUG_ON(ce->tag <= BITS_PER_LONG);
1257 + ce->lrc.ccid = ce->tag;
1244 1258 } else {
1245 1259 /* We don't need a strict matching tag, just different values */
1246 - ce->lrc_desc |=
1247 - (u64)(++engine->context_tag % NUM_CONTEXT_TAG) <<
1248 - GEN11_SW_CTX_ID_SHIFT;
1249 - BUILD_BUG_ON(NUM_CONTEXT_TAG > GEN12_MAX_CONTEXT_HW_ID);
1260 + unsigned int tag = ffs(engine->context_tag);
1261 +
1262 + GEM_BUG_ON(tag == 0 || tag >= BITS_PER_LONG);
1263 + clear_bit(tag - 1, &engine->context_tag);
1264 + ce->lrc.ccid = tag << (GEN11_SW_CTX_ID_SHIFT - 32);
1265 +
1266 + BUILD_BUG_ON(BITS_PER_LONG > GEN12_MAX_CONTEXT_HW_ID);
1250 1267 }
1268 +
1269 + ce->lrc.ccid |= engine->execlists.ccid;
1251 1270
1252 1271 __intel_gt_pm_get(engine->gt);
1253 1272 execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
··· 1293 1302
1294 1303 static inline void
1295 1304 __execlists_schedule_out(struct i915_request *rq,
1296 - struct intel_engine_cs * const engine)
1305 + struct intel_engine_cs * const engine,
1306 + unsigned int ccid)
1297 1307 {
1298 1308 struct intel_context * const ce = rq->context;
1299 1309
··· 1311 1319 if (list_is_last_rcu(&rq->link, &ce->timeline->requests) &&
1312 1320 i915_request_completed(rq))
1313 1321 intel_engine_add_retire(engine, ce->timeline);
1322 +
1323 + ccid >>= GEN11_SW_CTX_ID_SHIFT - 32;
1324 + ccid &= GEN12_MAX_CONTEXT_HW_ID;
1325 + if (ccid < BITS_PER_LONG) {
1326 + GEM_BUG_ON(ccid == 0);
1327 + GEM_BUG_ON(test_bit(ccid - 1, &engine->context_tag));
1328 + set_bit(ccid - 1, &engine->context_tag);
1329 + }
1314 1330
1315 1331 intel_context_update_runtime(ce);
1316 1332 intel_engine_context_out(engine);
··· 1345 1345 {
1346 1346 struct intel_context * const ce = rq->context;
1347 1347 struct intel_engine_cs *cur, *old;
1348 + u32 ccid;
1348 1349
1349 1350 trace_i915_request_out(rq);
1350 1351
1352 + ccid = rq->context->lrc.ccid;
1351 1353 old = READ_ONCE(ce->inflight);
1352 1354 do
1353 1355 cur = ptr_unmask_bits(old, 2) ? ptr_dec(old) : NULL;
1354 1356 while (!try_cmpxchg(&ce->inflight, &old, cur));
1355 1357 if (!cur)
1356 - __execlists_schedule_out(rq, old);
1358 + __execlists_schedule_out(rq, old, ccid);
1357 1359
1358 1360 i915_request_put(rq);
1359 1361 }
··· 1363 1361 static u64 execlists_update_context(struct i915_request *rq)
1364 1362 {
1365 1363 struct intel_context *ce = rq->context;
1366 - u64 desc = ce->lrc_desc;
1364 + u64 desc = ce->lrc.desc;
1367 1365 u32 tail, prev;
1368 1366
1369 1367 /*
··· 1402 1400 */
1403 1401 wmb();
1404 1402
1405 - ce->lrc_desc &= ~CTX_DESC_FORCE_RESTORE;
1403 + ce->lrc.desc &= ~CTX_DESC_FORCE_RESTORE;
1406 1404 return desc;
1407 1405 }
1408 1406
··· 1721 1719 struct i915_request *w =
1722 1720 container_of(p->waiter, typeof(*w), sched);
1723 1721
1722 + if (p->flags & I915_DEPENDENCY_WEAK)
1723 + continue;
1724 +
1724 1725 /* Leave semaphores spinning on the other engines */
1725 1726 if (w->engine != rq->engine)
1726 1727 continue;
··· 1759 1754 }
1760 1755
1761 1756 static bool
1762 - need_timeslice(struct intel_engine_cs *engine, const struct i915_request *rq)
1757 + need_timeslice(const struct intel_engine_cs *engine,
1758 + const struct i915_request *rq)
1763 1759 {
1764 1760 int hint;
1765 1761
··· 1772 1766 hint = max(hint, rq_prio(list_next_entry(rq, sched.link)));
1773 1767
1774 1768 return hint >= effective_prio(rq);
1769 + }
1770 +
1771 + static bool
1772 + timeslice_yield(const struct intel_engine_execlists *el,
1773 + const struct i915_request *rq)
1774 + {
1775 + /*
1776 + * Once bitten, forever smitten!
1777 + *
1778 + * If the active context ever busy-waited on a semaphore,
1779 + * it will be treated as a hog until the end of its timeslice (i.e.
1780 + * until it is scheduled out and replaced by a new submission,
1781 + * possibly even its own lite-restore). The HW only sends an interrupt
1782 + * on the first miss, and we do know if that semaphore has been
1783 + * signaled, or even if it is now stuck on another semaphore. Play
1784 + * safe, yield if it might be stuck -- it will be given a fresh
1785 + * timeslice in the near future.
1786 + */
1787 + return rq->context->lrc.ccid == READ_ONCE(el->yield);
1788 + }
1789 +
1790 + static bool
1791 + timeslice_expired(const struct intel_engine_execlists *el,
1792 + const struct i915_request *rq)
1793 + {
1794 + return timer_expired(&el->timer) || timeslice_yield(el, rq);
1775 1795 }
1776 1796
1777 1797 static int
··· 1815 1783 return READ_ONCE(engine->props.timeslice_duration_ms);
1816 1784 }
1817 1785
1818 - static unsigned long
1819 - active_timeslice(const struct intel_engine_cs *engine)
1786 + static unsigned long active_timeslice(const struct intel_engine_cs *engine)
1820 1787 {
1821 1788 const struct intel_engine_execlists *execlists = &engine->execlists;
1822 1789 const struct i915_request *rq = *execlists->active;
··· 1977 1946
1978 1947 last = NULL;
1979 1948 } else if (need_timeslice(engine, last) &&
1980 - timer_expired(&engine->execlists.timer)) {
1949 + timeslice_expired(execlists, last)) {
1981 1950 ENGINE_TRACE(engine,
1982 - "expired last=%llx:%lld, prio=%d, hint=%d\n",
1951 + "expired last=%llx:%lld, prio=%d, hint=%d, yield?=%s\n",
1983 1952 last->fence.context,
1984 1953 last->fence.seqno,
1985 1954 last->sched.attr.priority,
1986 - execlists->queue_priority_hint);
1955 + execlists->queue_priority_hint,
1956 + yesno(timeslice_yield(execlists, last)));
1987 1957
1988 1958 ring_set_paused(engine, 1);
1989 1959 defer_active(engine);
··· 2245 2213 }
2246 2214 clear_ports(port + 1, last_port - port);
2247 2215
2216 + WRITE_ONCE(execlists->yield, -1);
2248 2217 execlists_submit_ports(engine);
2249 2218 set_preempt_timeout(engine, *active);
2250 2219 } else {
··· 3076 3043 if (IS_ERR(vaddr))
3077 3044 return PTR_ERR(vaddr);
3078 3045
3079 - ce->lrc_desc = lrc_descriptor(ce, engine) | CTX_DESC_FORCE_RESTORE;
3046 + ce->lrc.lrca = lrc_descriptor(ce, engine) | CTX_DESC_FORCE_RESTORE;
3080 3047 ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
3081 3048 __execlists_update_reg_state(ce, engine, ce->ring->tail);
3082 3049
··· 3105 3072 ce, ce->engine, ce->ring, true);
3106 3073 __execlists_update_reg_state(ce, ce->engine, ce->ring->tail);
3107 3074
3108 - ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
3075 + ce->lrc.desc |= CTX_DESC_FORCE_RESTORE;
3109 3076 }
3110 3077
3111 3078 static const struct intel_context_ops execlists_context_ops = {
··· 3574 3541
3575 3542 enable_error_interrupt(engine);
3576 3543
3577 - engine->context_tag = 0;
3544 + engine->context_tag = GENMASK(BITS_PER_LONG - 2, 0);
3578 3545 }
3579 3546
3580 3547 static bool unexpected_starting_state(struct intel_engine_cs *engine)
··· 3786 3753 head, ce->ring->tail);
3787 3754 __execlists_reset_reg_state(ce, engine);
3788 3755 __execlists_update_reg_state(ce, engine, head);
3789 - ce->lrc_desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
3756 + ce->lrc.desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
3790 3757
3791 3758 unwind:
3792 3759 /* Push back any incomplete requests for replay after the reset. */
··· 4402 4369 engine->flags |= I915_ENGINE_SUPPORTS_STATS;
4403 4370 if (!intel_vgpu_active(engine->i915)) {
4404 4371 engine->flags |= I915_ENGINE_HAS_SEMAPHORES;
4405 - if (HAS_LOGICAL_RING_PREEMPTION(engine->i915))
4372 + if (HAS_LOGICAL_RING_PREEMPTION(engine->i915)) {
4406 4373 engine->flags |= I915_ENGINE_HAS_PREEMPTION;
4374 + if (IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
4375 + engine->flags |= I915_ENGINE_HAS_TIMESLICES;
4376 + }
4407 4377 }
4408 4378
4409 4379 if (INTEL_GEN(engine->i915) >= 12)
··· 4485 4449 engine->irq_enable_mask = GT_RENDER_USER_INTERRUPT << shift;
4486 4450 engine->irq_keep_mask = GT_CONTEXT_SWITCH_INTERRUPT << shift;
4487 4451 engine->irq_keep_mask |= GT_CS_MASTER_ERROR_INTERRUPT << shift;
4452 + engine->irq_keep_mask |= GT_WAIT_SEMAPHORE_INTERRUPT << shift;
4488 4453 }
4489 4454
4490 4455 static void rcs_submission_override(struct intel_engine_cs *engine)
··· 4552 4515 execlists->csb_size = GEN8_CSB_ENTRIES;
4553 4516 else
4554 4517 execlists->csb_size = GEN11_CSB_ENTRIES;
4518 +
4519 + if (INTEL_GEN(engine->i915) >= 11) {
4520 + execlists->ccid |= engine->instance << (GEN11_ENGINE_INSTANCE_SHIFT - 32);
4521 + execlists->ccid |= engine->class << (GEN11_ENGINE_CLASS_SHIFT - 32);
4522 + }
4555 4523
4556 4524 reset_csb_pointers(engine);
4557 4525
+18 -16
drivers/gpu/drm/i915/gt/selftest_lrc.c
··· 929 929 goto err;
930 930 }
931 931
932 - cs = intel_ring_begin(rq, 10);
932 + cs = intel_ring_begin(rq, 14);
933 933 if (IS_ERR(cs)) {
934 934 err = PTR_ERR(cs);
935 935 goto err;
··· 941 941 *cs++ = MI_SEMAPHORE_WAIT |
942 942 MI_SEMAPHORE_GLOBAL_GTT |
943 943 MI_SEMAPHORE_POLL |
944 - MI_SEMAPHORE_SAD_NEQ_SDD;
945 - *cs++ = 0;
944 + MI_SEMAPHORE_SAD_GTE_SDD;
945 + *cs++ = idx;
946 946 *cs++ = offset;
947 947 *cs++ = 0;
948 948
··· 950 950 *cs++ = i915_mmio_reg_offset(RING_TIMESTAMP(rq->engine->mmio_base));
951 951 *cs++ = offset + idx * sizeof(u32);
952 952 *cs++ = 0;
953 +
954 + *cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
955 + *cs++ = offset;
956 + *cs++ = 0;
957 + *cs++ = idx + 1;
953 958
954 959 intel_ring_advance(rq, cs);
955 960
··· 989 984
990 985 for_each_engine(engine, gt, id) {
991 986 enum { A1, A2, B1 };
992 - enum { X = 1, Y, Z };
987 + enum { X = 1, Z, Y };
993 988 struct i915_request *rq[3] = {};
994 989 struct intel_context *ce;
995 990 unsigned long heartbeat;
··· 1022 1017 goto err;
1023 1018 }
1024 1019
1025 - rq[0] = create_rewinder(ce, NULL, slot, 1);
1020 + rq[0] = create_rewinder(ce, NULL, slot, X);
1026 1021 if (IS_ERR(rq[0])) {
1027 1022 intel_context_put(ce);
1028 1023 goto err;
1029 1024 }
1030 1025
1031 - rq[1] = create_rewinder(ce, NULL, slot, 2);
1026 + rq[1] = create_rewinder(ce, NULL, slot, Y);
1032 1027 intel_context_put(ce);
1033 1028 if (IS_ERR(rq[1]))
1034 1029 goto err;
··· 1046 1041 goto err;
1047 1042 }
1048 1043
1049 - rq[2] = create_rewinder(ce, rq[0], slot, 3);
1044 + rq[2] = create_rewinder(ce, rq[0], slot, Z);
1050 1045 intel_context_put(ce);
1051 1046 if (IS_ERR(rq[2]))
1052 1047 goto err;
··· 1060 1055 GEM_BUG_ON(!timer_pending(&engine->execlists.timer));
1061 1056
1062 1057 /* ELSP[] = { { A:rq1, A:rq2 }, { B:rq1 } } */
1063 - GEM_BUG_ON(!i915_request_is_active(rq[A1]));
1064 - GEM_BUG_ON(!i915_request_is_active(rq[A2]));
1065 - GEM_BUG_ON(!i915_request_is_active(rq[B1]));
1066 -
1067 - /* Wait for the timeslice to kick in */
1068 - del_timer(&engine->execlists.timer);
1069 - tasklet_hi_schedule(&engine->execlists.tasklet);
1070 - intel_engine_flush_submission(engine);
1071 -
1058 + if (i915_request_is_active(rq[A2])) { /* semaphore yielded! */
1059 + /* Wait for the timeslice to kick in */
1060 + del_timer(&engine->execlists.timer);
1061 + tasklet_hi_schedule(&engine->execlists.tasklet);
1062 + intel_engine_flush_submission(engine);
1063 + }
1072 1064 /* -> ELSP[] = { { A:rq1 }, { B:rq1 } } */
1073 1065 GEM_BUG_ON(!i915_request_is_active(rq[A1]));
1074 1066 GEM_BUG_ON(!i915_request_is_active(rq[B1]));
+1 -1
drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
··· 217 217 static void guc_add_request(struct intel_guc *guc, struct i915_request *rq)
218 218 {
219 219 struct intel_engine_cs *engine = rq->engine;
220 - u32 ctx_desc = lower_32_bits(rq->context->lrc_desc);
220 + u32 ctx_desc = rq->context->lrc.ccid;
221 221 u32 ring_tail = intel_ring_set_tail(rq->ring, rq->tail) / sizeof(u64);
222 222
223 223 guc_wq_item_append(guc, engine->guc_id, ctx_desc,
+44 -5
drivers/gpu/drm/i915/gvt/display.c
··· 208 208 SKL_FUSE_PG_DIST_STATUS(SKL_PG0) |
209 209 SKL_FUSE_PG_DIST_STATUS(SKL_PG1) |
210 210 SKL_FUSE_PG_DIST_STATUS(SKL_PG2);
211 - vgpu_vreg_t(vgpu, LCPLL1_CTL) |=
212 - LCPLL_PLL_ENABLE |
213 - LCPLL_PLL_LOCK;
214 - vgpu_vreg_t(vgpu, LCPLL2_CTL) |= LCPLL_PLL_ENABLE;
215 -
211 + /*
212 + * Only 1 PIPE enabled in current vGPU display and PIPE_A is
213 + * tied to TRANSCODER_A in HW, so it's safe to assume PIPE_A,
214 + * TRANSCODER_A can be enabled. PORT_x depends on the input of
215 + * setup_virtual_dp_monitor, we can bind DPLL0 to any PORT_x
216 + * so we fixed to DPLL0 here.
217 + * Setup DPLL0: DP link clk 1620 MHz, non SSC, DP Mode
218 + */
219 + vgpu_vreg_t(vgpu, DPLL_CTRL1) =
220 + DPLL_CTRL1_OVERRIDE(DPLL_ID_SKL_DPLL0);
221 + vgpu_vreg_t(vgpu, DPLL_CTRL1) |=
222 + DPLL_CTRL1_LINK_RATE(DPLL_CTRL1_LINK_RATE_1620, DPLL_ID_SKL_DPLL0);
223 + vgpu_vreg_t(vgpu, LCPLL1_CTL) =
224 + LCPLL_PLL_ENABLE | LCPLL_PLL_LOCK;
225 + vgpu_vreg_t(vgpu, DPLL_STATUS) = DPLL_LOCK(DPLL_ID_SKL_DPLL0);
226 + /*
227 + * Golden M/N are calculated based on:
228 + * 24 bpp, 4 lanes, 154000 pixel clk (from virtual EDID),
229 + * DP link clk 1620 MHz and non-constant_n.
230 + * TODO: calculate DP link symbol clk and stream clk m/n.
231 + */
232 + vgpu_vreg_t(vgpu, PIPE_DATA_M1(TRANSCODER_A)) = 63 << TU_SIZE_SHIFT;
233 + vgpu_vreg_t(vgpu, PIPE_DATA_M1(TRANSCODER_A)) |= 0x5b425e;
234 + vgpu_vreg_t(vgpu, PIPE_DATA_N1(TRANSCODER_A)) = 0x800000;
235 + vgpu_vreg_t(vgpu, PIPE_LINK_M1(TRANSCODER_A)) = 0x3cd6e;
236 + vgpu_vreg_t(vgpu, PIPE_LINK_N1(TRANSCODER_A)) = 0x80000;
216 237 }
217 238
218 239 if (intel_vgpu_has_monitor_on_port(vgpu, PORT_B)) {
240 + vgpu_vreg_t(vgpu, DPLL_CTRL2) &=
241 + ~DPLL_CTRL2_DDI_CLK_OFF(PORT_B);
242 + vgpu_vreg_t(vgpu, DPLL_CTRL2) |=
243 + DPLL_CTRL2_DDI_CLK_SEL(DPLL_ID_SKL_DPLL0, PORT_B);
244 + vgpu_vreg_t(vgpu, DPLL_CTRL2) |=
245 + DPLL_CTRL2_DDI_SEL_OVERRIDE(PORT_B);
219 246 vgpu_vreg_t(vgpu, SFUSE_STRAP) |= SFUSE_STRAP_DDIB_DETECTED;
220 247 vgpu_vreg_t(vgpu, TRANS_DDI_FUNC_CTL(TRANSCODER_A)) &=
221 248 ~(TRANS_DDI_BPC_MASK | TRANS_DDI_MODE_SELECT_MASK |
··· 263 236 }
264 237
265 238 if (intel_vgpu_has_monitor_on_port(vgpu, PORT_C)) {
239 + vgpu_vreg_t(vgpu, DPLL_CTRL2) &=
240 + ~DPLL_CTRL2_DDI_CLK_OFF(PORT_C);
241 + vgpu_vreg_t(vgpu, DPLL_CTRL2) |=
242 + DPLL_CTRL2_DDI_CLK_SEL(DPLL_ID_SKL_DPLL0, PORT_C);
243 + vgpu_vreg_t(vgpu, DPLL_CTRL2) |=
244 + DPLL_CTRL2_DDI_SEL_OVERRIDE(PORT_C);
266 245 vgpu_vreg_t(vgpu, SDEISR) |= SDE_PORTC_HOTPLUG_CPT;
267 246 vgpu_vreg_t(vgpu, TRANS_DDI_FUNC_CTL(TRANSCODER_A)) &=
268 247 ~(TRANS_DDI_BPC_MASK | TRANS_DDI_MODE_SELECT_MASK |
··· 289 256 }
290 257
291 258 if (intel_vgpu_has_monitor_on_port(vgpu, PORT_D)) {
259 + vgpu_vreg_t(vgpu, DPLL_CTRL2) &=
260 + ~DPLL_CTRL2_DDI_CLK_OFF(PORT_D);
261 + vgpu_vreg_t(vgpu, DPLL_CTRL2) |=
262 + DPLL_CTRL2_DDI_CLK_SEL(DPLL_ID_SKL_DPLL0, PORT_D);
263 + vgpu_vreg_t(vgpu, DPLL_CTRL2) |=
264 + DPLL_CTRL2_DDI_SEL_OVERRIDE(PORT_D);
292 265 vgpu_vreg_t(vgpu, SDEISR) |= SDE_PORTD_HOTPLUG_CPT;
293 266 vgpu_vreg_t(vgpu, TRANS_DDI_FUNC_CTL(TRANSCODER_A)) &=
294 267 ~(TRANS_DDI_BPC_MASK | TRANS_DDI_MODE_SELECT_MASK |
+7 -3
drivers/gpu/drm/i915/gvt/scheduler.c
··· 290 290 shadow_context_descriptor_update(struct intel_context *ce,
291 291 struct intel_vgpu_workload *workload)
292 292 {
293 - u64 desc = ce->lrc_desc;
293 + u64 desc = ce->lrc.desc;
294 294
295 295 /*
296 296 * Update bits 0-11 of the context descriptor which includes flags
··· 300 300 desc |= (u64)workload->ctx_desc.addressing_mode <<
301 301 GEN8_CTX_ADDRESSING_MODE_SHIFT;
302 302
303 - ce->lrc_desc = desc;
303 + ce->lrc.desc = desc;
304 304 }
305 305
306 306 static int copy_workload_to_ring_buffer(struct intel_vgpu_workload *workload)
··· 379 379 for (i = 0; i < GVT_RING_CTX_NR_PDPS; i++) {
380 380 struct i915_page_directory * const pd =
381 381 i915_pd_entry(ppgtt->pd, i);
382 -
382 + /* skip now as current i915 ppgtt alloc won't allocate
383 + top level pdp for non 4-level table, won't impact
384 + shadow ppgtt. */
385 + if (!pd)
386 + break;
383 387 px_dma(pd) = mm->ppgtt_mm.shadow_pdps[i];
384 388 }
385 389 }
+12 -14
drivers/gpu/drm/i915/i915_gem_evict.c
··· 128 128 active = NULL;
129 129 INIT_LIST_HEAD(&eviction_list);
130 130 list_for_each_entry_safe(vma, next, &vm->bound_list, vm_link) {
131 + if (vma == active) { /* now seen this vma twice */
132 + if (flags & PIN_NONBLOCK)
133 + break;
134 +
135 + active = ERR_PTR(-EAGAIN);
136 + }
137 +
131 138 /*
132 139 * We keep this list in a rough least-recently scanned order
133 140 * of active elements (inactive elements are cheap to reap).
··· 150 143 * To notice when we complete one full cycle, we record the
151 144 * first active element seen, before moving it to the tail.
152 145 */
153 - if (i915_vma_is_active(vma)) {
154 - if (vma == active) {
155 - if (flags & PIN_NONBLOCK)
156 - break;
146 + if (active != ERR_PTR(-EAGAIN) && i915_vma_is_active(vma)) {
147 + if (!active)
148 + active = vma;
157 149
158 - active = ERR_PTR(-EAGAIN);
159 - }
160 -
161 - if (active != ERR_PTR(-EAGAIN)) {
162 - if (!active)
163 - active = vma;
164 -
165 - list_move_tail(&vma->vm_link, &vm->bound_list);
166 - continue;
167 - }
150 + list_move_tail(&vma->vm_link, &vm->bound_list);
151 + continue;
168 152 }
169 153
170 154 if (mark_free(&scan, vma, flags, &eviction_list))
+7 -5
drivers/gpu/drm/i915/i915_gpu_error.c
··· 1207 1207 static void record_request(const struct i915_request *request, 1208 1208 struct i915_request_coredump *erq) 1209 1209 { 1210 - const struct i915_gem_context *ctx; 1211 - 1212 1210 erq->flags = request->fence.flags; 1213 1211 erq->context = request->fence.context; 1214 1212 erq->seqno = request->fence.seqno; ··· 1217 1219 1218 1220 erq->pid = 0; 1219 1221 rcu_read_lock(); 1220 - ctx = rcu_dereference(request->context->gem_context); 1221 - if (ctx) 1222 - erq->pid = pid_nr(ctx->pid); 1222 + if (!intel_context_is_closed(request->context)) { 1223 + const struct i915_gem_context *ctx; 1224 + 1225 + ctx = rcu_dereference(request->context->gem_context); 1226 + if (ctx) 1227 + erq->pid = pid_nr(ctx->pid); 1228 + } 1223 1229 rcu_read_unlock(); 1224 1230 } 1225 1231
+3 -13
drivers/gpu/drm/i915/i915_irq.c
··· 3361 3361 u32 de_pipe_masked = gen8_de_pipe_fault_mask(dev_priv) | 3362 3362 GEN8_PIPE_CDCLK_CRC_DONE; 3363 3363 u32 de_pipe_enables; 3364 - u32 de_port_masked = GEN8_AUX_CHANNEL_A; 3364 + u32 de_port_masked = gen8_de_port_aux_mask(dev_priv); 3365 3365 u32 de_port_enables; 3366 3366 u32 de_misc_masked = GEN8_DE_EDP_PSR; 3367 3367 enum pipe pipe; ··· 3369 3369 if (INTEL_GEN(dev_priv) <= 10) 3370 3370 de_misc_masked |= GEN8_DE_MISC_GSE; 3371 3371 3372 - if (INTEL_GEN(dev_priv) >= 9) { 3373 - de_port_masked |= GEN9_AUX_CHANNEL_B | GEN9_AUX_CHANNEL_C | 3374 - GEN9_AUX_CHANNEL_D; 3375 - if (IS_GEN9_LP(dev_priv)) 3376 - de_port_masked |= BXT_DE_PORT_GMBUS; 3377 - } 3378 - 3379 - if (INTEL_GEN(dev_priv) >= 11) 3380 - de_port_masked |= ICL_AUX_CHANNEL_E; 3381 - 3382 - if (IS_CNL_WITH_PORT_F(dev_priv) || INTEL_GEN(dev_priv) >= 11) 3383 - de_port_masked |= CNL_AUX_CHANNEL_F; 3372 + if (IS_GEN9_LP(dev_priv)) 3373 + de_port_masked |= BXT_DE_PORT_GMBUS; 3384 3374 3385 3375 de_pipe_enables = de_pipe_masked | GEN8_PIPE_VBLANK | 3386 3376 GEN8_PIPE_FIFO_UNDERRUN;
+2 -4
drivers/gpu/drm/i915/i915_perf.c
··· 1310 1310 * dropped by GuC. They won't be part of the context 1311 1311 * ID in the OA reports, so squash those lower bits. 1312 1312 */ 1313 - stream->specific_ctx_id = 1314 - lower_32_bits(ce->lrc_desc) >> 12; 1313 + stream->specific_ctx_id = ce->lrc.lrca >> 12; 1315 1314 1316 1315 /* 1317 1316 * GuC uses the top bit to signal proxy submission, so ··· 1327 1328 ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << (GEN11_SW_CTX_ID_SHIFT - 32); 1328 1329 /* 1329 1330 * Pick an unused context id 1330 - * 0 - (NUM_CONTEXT_TAG - 1) are used by other contexts 1331 + * 0 - BITS_PER_LONG are used by other contexts 1331 1332 * GEN12_MAX_CONTEXT_HW_ID (0x7ff) is used by idle context 1332 1333 */ 1333 1334 stream->specific_ctx_id = (GEN12_MAX_CONTEXT_HW_ID - 1) << (GEN11_SW_CTX_ID_SHIFT - 32); 1334 - BUILD_BUG_ON((GEN12_MAX_CONTEXT_HW_ID - 1) < NUM_CONTEXT_TAG); 1335 1335 break; 1336 1336 } 1337 1337
+1
drivers/gpu/drm/i915/i915_reg.h
··· 3094 3094 #define GT_BSD_CS_ERROR_INTERRUPT (1 << 15) 3095 3095 #define GT_BSD_USER_INTERRUPT (1 << 12) 3096 3096 #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1 (1 << 11) /* hsw+; rsvd on snb, ivb, vlv */ 3097 + #define GT_WAIT_SEMAPHORE_INTERRUPT REG_BIT(11) /* bdw+ */ 3097 3098 #define GT_CONTEXT_SWITCH_INTERRUPT (1 << 8) 3098 3099 #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT (1 << 5) /* !snb */ 3099 3100 #define GT_RENDER_PIPECTL_NOTIFY_INTERRUPT (1 << 4)
+9 -3
drivers/gpu/drm/i915/i915_request.c
··· 1017 1017 GEM_BUG_ON(to == from); 1018 1018 GEM_BUG_ON(to->timeline == from->timeline); 1019 1019 1020 - if (i915_request_completed(from)) 1020 + if (i915_request_completed(from)) { 1021 + i915_sw_fence_set_error_once(&to->submit, from->fence.error); 1021 1022 return 0; 1023 + } 1022 1024 1023 1025 if (to->engine->schedule) { 1024 - ret = i915_sched_node_add_dependency(&to->sched, &from->sched); 1026 + ret = i915_sched_node_add_dependency(&to->sched, 1027 + &from->sched, 1028 + I915_DEPENDENCY_EXTERNAL); 1025 1029 if (ret < 0) 1026 1030 return ret; 1027 1031 } ··· 1187 1183 1188 1184 /* Couple the dependency tree for PI on this exposed to->fence */ 1189 1185 if (to->engine->schedule) { 1190 - err = i915_sched_node_add_dependency(&to->sched, &from->sched); 1186 + err = i915_sched_node_add_dependency(&to->sched, 1187 + &from->sched, 1188 + I915_DEPENDENCY_WEAK); 1191 1189 if (err < 0) 1192 1190 return err; 1193 1191 }
+3 -3
drivers/gpu/drm/i915/i915_scheduler.c
··· 456 456 } 457 457 458 458 int i915_sched_node_add_dependency(struct i915_sched_node *node, 459 - struct i915_sched_node *signal) 459 + struct i915_sched_node *signal, 460 + unsigned long flags) 460 461 { 461 462 struct i915_dependency *dep; 462 463 ··· 466 465 return -ENOMEM; 467 466 468 467 if (!__i915_sched_node_add_dependency(node, signal, dep, 469 - I915_DEPENDENCY_EXTERNAL | 470 - I915_DEPENDENCY_ALLOC)) 468 + flags | I915_DEPENDENCY_ALLOC)) 471 469 i915_dependency_free(dep); 472 470 473 471 return 0;
+2 -1
drivers/gpu/drm/i915/i915_scheduler.h
··· 34 34 unsigned long flags); 35 35 36 36 int i915_sched_node_add_dependency(struct i915_sched_node *node, 37 - struct i915_sched_node *signal); 37 + struct i915_sched_node *signal, 38 + unsigned long flags); 38 39 39 40 void i915_sched_node_fini(struct i915_sched_node *node); 40 41
+1
drivers/gpu/drm/i915/i915_scheduler_types.h
··· 78 78 unsigned long flags; 79 79 #define I915_DEPENDENCY_ALLOC BIT(0) 80 80 #define I915_DEPENDENCY_EXTERNAL BIT(1) 81 + #define I915_DEPENDENCY_WEAK BIT(2) 81 82 }; 82 83 83 84 #endif /* _I915_SCHEDULER_TYPES_H_ */
+9 -16
drivers/gpu/drm/i915/i915_vma.c
··· 1228 1228 1229 1229 lockdep_assert_held(&vma->vm->mutex); 1230 1230 1231 - /* 1232 - * First wait upon any activity as retiring the request may 1233 - * have side-effects such as unpinning or even unbinding this vma. 1234 - * 1235 - * XXX Actually waiting under the vm->mutex is a hinderance and 1236 - * should be pipelined wherever possible. In cases where that is 1237 - * unavoidable, we should lift the wait to before the mutex. 1238 - */ 1239 - ret = i915_vma_sync(vma); 1240 - if (ret) 1241 - return ret; 1242 - 1243 1231 if (i915_vma_is_pinned(vma)) { 1244 1232 vma_print_allocator(vma, "is pinned"); 1245 1233 return -EAGAIN; ··· 1301 1313 if (!drm_mm_node_allocated(&vma->node)) 1302 1314 return 0; 1303 1315 1304 - if (i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) 1305 - /* XXX not always required: nop_clear_range */ 1306 - wakeref = intel_runtime_pm_get(&vm->i915->runtime_pm); 1307 - 1308 1316 /* Optimistic wait before taking the mutex */ 1309 1317 err = i915_vma_sync(vma); 1310 1318 if (err) 1311 1319 goto out_rpm; 1320 + 1321 + if (i915_vma_is_pinned(vma)) { 1322 + vma_print_allocator(vma, "is pinned"); 1323 + return -EAGAIN; 1324 + } 1325 + 1326 + if (i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) 1327 + /* XXX not always required: nop_clear_range */ 1328 + wakeref = intel_runtime_pm_get(&vm->i915->runtime_pm); 1312 1329 1313 1330 err = mutex_lock_interruptible(&vm->mutex); 1314 1331 if (err)
+1 -1
drivers/gpu/drm/i915/intel_pm.c
··· 4992 4992 * WaIncreaseLatencyIPCEnabled: kbl,cfl 4993 4993 * Display WA #1141: kbl,cfl 4994 4994 */ 4995 - if ((IS_KABYLAKE(dev_priv) || IS_COFFEELAKE(dev_priv)) || 4995 + if ((IS_KABYLAKE(dev_priv) || IS_COFFEELAKE(dev_priv)) && 4996 4996 dev_priv->ipc_enabled) 4997 4997 latency += 4; 4998 4998
+1 -1
drivers/gpu/drm/i915/selftests/i915_vma.c
··· 173 173 } 174 174 175 175 nc = 0; 176 - for_each_prime_number(num_ctx, 2 * NUM_CONTEXT_TAG) { 176 + for_each_prime_number(num_ctx, 2 * BITS_PER_LONG) { 177 177 for (; nc < num_ctx; nc++) { 178 178 ctx = mock_context(i915, "mock"); 179 179 if (!ctx)
+1
drivers/gpu/drm/ingenic/ingenic-drm.c
··· 843 843 { .compatible = "ingenic,jz4770-lcd", .data = &jz4770_soc_info }, 844 844 { /* sentinel */ }, 845 845 }; 846 + MODULE_DEVICE_TABLE(of, ingenic_drm_of_match); 846 847 847 848 static struct platform_driver ingenic_drm_driver = { 848 849 .driver = {
+1 -3
drivers/gpu/drm/meson/meson_drv.c
··· 412 412 if (priv->afbcd.ops) 413 413 priv->afbcd.ops->init(priv); 414 414 415 - drm_mode_config_helper_resume(priv->drm); 416 - 417 - return 0; 415 + return drm_mode_config_helper_resume(priv->drm); 418 416 } 419 417 420 418 static int compare_of(struct device *dev, void *data)
+1 -1
drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c
··· 717 717 struct drm_display_mode *mode = &encoder->crtc->state->adjusted_mode; 718 718 struct sun6i_dsi *dsi = encoder_to_sun6i_dsi(encoder); 719 719 struct mipi_dsi_device *device = dsi->device; 720 - union phy_configure_opts opts = { 0 }; 720 + union phy_configure_opts opts = { }; 721 721 struct phy_configure_opts_mipi_dphy *cfg = &opts.mipi_dphy; 722 722 u16 delay; 723 723 int err;
+2 -1
drivers/gpu/drm/tegra/drm.c
··· 1039 1039 1040 1040 static bool host1x_drm_wants_iommu(struct host1x_device *dev) 1041 1041 { 1042 + struct host1x *host1x = dev_get_drvdata(dev->dev.parent); 1042 1043 struct iommu_domain *domain; 1043 1044 1044 1045 /* ··· 1077 1076 * sufficient and whether or not the host1x is attached to an IOMMU 1078 1077 * doesn't matter. 1079 1078 */ 1080 - if (!domain && dma_get_mask(dev->dev.parent) <= DMA_BIT_MASK(32)) 1079 + if (!domain && host1x_get_dma_mask(host1x) <= DMA_BIT_MASK(32)) 1081 1080 return true; 1082 1081 1083 1082 return domain != NULL;
+1
drivers/gpu/drm/virtio/virtgpu_drv.h
··· 221 221 /* virtio_ioctl.c */ 222 222 #define DRM_VIRTIO_NUM_IOCTLS 10 223 223 extern struct drm_ioctl_desc virtio_gpu_ioctls[DRM_VIRTIO_NUM_IOCTLS]; 224 + void virtio_gpu_create_context(struct drm_device *dev, struct drm_file *file); 224 225 225 226 /* virtio_kms.c */ 226 227 int virtio_gpu_init(struct drm_device *dev);
+3
drivers/gpu/drm/virtio/virtgpu_gem.c
··· 39 39 int ret; 40 40 u32 handle; 41 41 42 + if (vgdev->has_virgl_3d) 43 + virtio_gpu_create_context(dev, file); 44 + 42 45 ret = virtio_gpu_object_create(vgdev, params, &obj, NULL); 43 46 if (ret < 0) 44 47 return ret;
+1 -2
drivers/gpu/drm/virtio/virtgpu_ioctl.c
··· 34 34 35 35 #include "virtgpu_drv.h" 36 36 37 - static void virtio_gpu_create_context(struct drm_device *dev, 38 - struct drm_file *file) 37 + void virtio_gpu_create_context(struct drm_device *dev, struct drm_file *file) 39 38 { 40 39 struct virtio_gpu_device *vgdev = dev->dev_private; 41 40 struct virtio_gpu_fpriv *vfpriv = file->driver_priv;
+55 -4
drivers/gpu/host1x/dev.c
··· 192 192 } 193 193 } 194 194 195 + static bool host1x_wants_iommu(struct host1x *host1x) 196 + { 197 + /* 198 + * If we support addressing a maximum of 32 bits of physical memory 199 + * and if the host1x firewall is enabled, there's no need to enable 200 + * IOMMU support. This can happen for example on Tegra20, Tegra30 201 + * and Tegra114. 202 + * 203 + * Tegra124 and later can address up to 34 bits of physical memory and 204 + * many platforms come equipped with more than 2 GiB of system memory, 205 + * which requires crossing the 4 GiB boundary. But there's a catch: on 206 + * SoCs before Tegra186 (i.e. Tegra124 and Tegra210), the host1x can 207 + * only address up to 32 bits of memory in GATHER opcodes, which means 208 + * that command buffers need to either be in the first 2 GiB of system 209 + * memory (which could quickly lead to memory exhaustion), or command 210 + * buffers need to be treated differently from other buffers (which is 211 + * not possible with the current ABI). 212 + * 213 + * A third option is to use the IOMMU in these cases to make sure all 214 + * buffers will be mapped into a 32-bit IOVA space that host1x can 215 + * address. This allows all of the system memory to be used and works 216 + * within the limitations of the host1x on these SoCs. 217 + * 218 + * In summary, default to enable IOMMU on Tegra124 and later. For any 219 + * of the earlier SoCs, only use the IOMMU for additional safety when 220 + * the host1x firewall is disabled. 221 + */ 222 + if (host1x->info->dma_mask <= DMA_BIT_MASK(32)) { 223 + if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) 224 + return false; 225 + } 226 + 227 + return true; 228 + } 229 + 195 230 static struct iommu_domain *host1x_iommu_attach(struct host1x *host) 196 231 { 197 232 struct iommu_domain *domain = iommu_get_domain_for_dev(host->dev); 198 233 int err; 199 234 200 235 /* 201 - * If the host1x firewall is enabled, there's no need to enable IOMMU 202 - * support. Similarly, if host1x is already attached to an IOMMU (via
203 - * the DMA API), don't try to attach again. 236 + * We may not always want to enable IOMMU support (for example if the 237 + * host1x firewall is already enabled and we don't support addressing 238 + * more than 32 bits of physical memory), so check for that first. 239 + * 240 + * Similarly, if host1x is already attached to an IOMMU (via the DMA 241 + * API), don't try to attach again. 204 242 */ 205 - if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) || domain) 243 + if (!host1x_wants_iommu(host) || domain) 206 244 return domain; 207 245 208 246 host->group = iommu_group_get(host->dev); ··· 539 501 bus_unregister(&host1x_bus_type); 540 502 } 541 503 module_exit(tegra_host1x_exit); 504 + 505 + /** 506 + * host1x_get_dma_mask() - query the supported DMA mask for host1x 507 + * @host1x: host1x instance 508 + * 509 + * Note that this returns the supported DMA mask for host1x, which can be 510 + * different from the applicable DMA mask under certain circumstances. 511 + */ 512 + u64 host1x_get_dma_mask(struct host1x *host1x) 513 + { 514 + return host1x->info->dma_mask; 515 + } 516 + EXPORT_SYMBOL(host1x_get_dma_mask); 542 517 543 518 MODULE_AUTHOR("Thierry Reding <thierry.reding@avionic-design.de>"); 544 519 MODULE_AUTHOR("Terje Bergstrom <tbergstrom@nvidia.com>");
+2 -2
drivers/hwmon/da9052-hwmon.c
··· 244 244 int channel = to_sensor_dev_attr(devattr)->index; 245 245 int ret; 246 246 247 - mutex_lock(&hwmon->hwmon_lock); 247 + mutex_lock(&hwmon->da9052->auxadc_lock); 248 248 ret = __da9052_read_tsi(dev, channel); 249 - mutex_unlock(&hwmon->hwmon_lock); 249 + mutex_unlock(&hwmon->da9052->auxadc_lock); 250 250 251 251 if (ret < 0) 252 252 return ret;
+1 -1
drivers/hwmon/drivetemp.c
··· 346 346 st->have_temp_highest = temp_is_valid(buf[SCT_STATUS_TEMP_HIGHEST]); 347 347 348 348 if (!have_sct_data_table) 349 - goto skip_sct; 349 + goto skip_sct_data; 350 350 351 351 /* Request and read temperature history table */ 352 352 memset(buf, '\0', sizeof(st->smartdata));
+11 -1
drivers/hwmon/nct7904.c
··· 41 41 #define FANCTL_MAX 4 /* Counted from 1 */ 42 42 #define TCPU_MAX 8 /* Counted from 1 */ 43 43 #define TEMP_MAX 4 /* Counted from 1 */ 44 + #define SMI_STS_MAX 10 /* Counted from 1 */ 44 45 45 46 #define VT_ADC_CTRL0_REG 0x20 /* Bank 0 */ 46 47 #define VT_ADC_CTRL1_REG 0x21 /* Bank 0 */ ··· 362 361 struct nct7904_data *data = dev_get_drvdata(dev); 363 362 int ret, temp; 364 363 unsigned int reg1, reg2, reg3; 364 + s8 temps; 365 365 366 366 switch (attr) { 367 367 case hwmon_temp_input: ··· 468 466 469 467 if (ret < 0) 470 468 return ret; 471 - *val = ret * 1000; 469 + temps = ret; 470 + *val = temps * 1000; 472 471 return 0; 473 472 } 474 473 ··· 1010 1007 if (ret < 0) 1011 1008 return ret; 1012 1009 data->fan_mode[i] = ret; 1010 + } 1011 + 1012 + /* Read all of SMI status register to clear alarms */ 1013 + for (i = 0; i < SMI_STS_MAX; i++) { 1014 + ret = nct7904_read_reg(data, BANK_0, SMI_STS1_REG + i); 1015 + if (ret < 0) 1016 + return ret; 1013 1017 } 1014 1018 1015 1019 hwmon_dev =
+5 -2
drivers/infiniband/core/cache.c
··· 1553 1553 if (err) 1554 1554 return err; 1555 1555 1556 - rdma_for_each_port (device, p) 1557 - ib_cache_update(device, p, true); 1556 + rdma_for_each_port (device, p) { 1557 + err = ib_cache_update(device, p, true); 1558 + if (err) 1559 + return err; 1560 + } 1558 1561 1559 1562 return 0; 1560 1563 }
+1 -2
drivers/infiniband/core/nldev.c
··· 1292 1292 has_cap_net_admin = netlink_capable(skb, CAP_NET_ADMIN); 1293 1293 1294 1294 ret = fill_func(msg, has_cap_net_admin, res, port); 1295 - 1296 - rdma_restrack_put(res); 1297 1295 if (ret) 1298 1296 goto err_free; 1299 1297 1298 + rdma_restrack_put(res); 1300 1299 nlmsg_end(msg, nlh); 1301 1300 ib_device_put(device); 1302 1301 return rdma_nl_unicast(sock_net(skb->sk), msg, NETLINK_CB(skb).portid);
+2 -1
drivers/infiniband/core/rdma_core.c
··· 459 459 struct ib_uobject *uobj; 460 460 struct file *filp; 461 461 462 - if (WARN_ON(fd_type->fops->release != &uverbs_uobject_fd_release)) 462 + if (WARN_ON(fd_type->fops->release != &uverbs_uobject_fd_release && 463 + fd_type->fops->release != &uverbs_async_event_release)) 463 464 return ERR_PTR(-EINVAL); 464 465 465 466 new_fd = get_unused_fd_flags(O_CLOEXEC);
+4
drivers/infiniband/core/uverbs.h
··· 219 219 void ib_uverbs_init_async_event_file(struct ib_uverbs_async_event_file *ev_file); 220 220 void ib_uverbs_free_event_queue(struct ib_uverbs_event_queue *event_queue); 221 221 void ib_uverbs_flow_resources_free(struct ib_uflow_resources *uflow_res); 222 + int uverbs_async_event_release(struct inode *inode, struct file *filp); 222 223 223 224 int ib_alloc_ucontext(struct uverbs_attr_bundle *attrs); 224 225 int ib_init_ucontext(struct uverbs_attr_bundle *attrs); ··· 228 227 struct ib_ucq_object *uobj); 229 228 void ib_uverbs_release_uevent(struct ib_uevent_object *uobj); 230 229 void ib_uverbs_release_file(struct kref *ref); 230 + void ib_uverbs_async_handler(struct ib_uverbs_async_event_file *async_file, 231 + __u64 element, __u64 event, 232 + struct list_head *obj_list, u32 *counter); 231 233 232 234 void ib_uverbs_comp_handler(struct ib_cq *cq, void *cq_context); 233 235 void ib_uverbs_cq_event_handler(struct ib_event *event, void *context_ptr);
+4 -8
drivers/infiniband/core/uverbs_main.c
··· 346 346 .owner = THIS_MODULE, 347 347 .read = ib_uverbs_async_event_read, 348 348 .poll = ib_uverbs_async_event_poll, 349 - .release = uverbs_uobject_fd_release, 349 + .release = uverbs_async_event_release, 350 350 .fasync = ib_uverbs_async_event_fasync, 351 351 .llseek = no_llseek, 352 352 }; ··· 386 386 kill_fasync(&ev_queue->async_queue, SIGIO, POLL_IN); 387 387 } 388 388 389 - static void 390 - ib_uverbs_async_handler(struct ib_uverbs_async_event_file *async_file, 391 - __u64 element, __u64 event, struct list_head *obj_list, 392 - u32 *counter) 389 + void ib_uverbs_async_handler(struct ib_uverbs_async_event_file *async_file, 390 + __u64 element, __u64 event, 391 + struct list_head *obj_list, u32 *counter) 393 392 { 394 393 struct ib_uverbs_event *entry; 395 394 unsigned long flags; ··· 1185 1186 * mmput). 1186 1187 */ 1187 1188 mutex_unlock(&uverbs_dev->lists_mutex); 1188 - 1189 - ib_uverbs_async_handler(READ_ONCE(file->async_file), 0, 1190 - IB_EVENT_DEVICE_FATAL, NULL, NULL); 1191 1189 1192 1190 uverbs_destroy_ufile_hw(file, RDMA_REMOVE_DRIVER_REMOVE); 1193 1191 kref_put(&file->ref, ib_uverbs_release_file);
+29 -1
drivers/infiniband/core/uverbs_std_types_async_fd.c
··· 26 26 container_of(uobj, struct ib_uverbs_async_event_file, uobj); 27 27 28 28 ib_unregister_event_handler(&event_file->event_handler); 29 - ib_uverbs_free_event_queue(&event_file->ev_queue); 29 + 30 + if (why == RDMA_REMOVE_DRIVER_REMOVE) 31 + ib_uverbs_async_handler(event_file, 0, IB_EVENT_DEVICE_FATAL, 32 + NULL, NULL); 30 33 return 0; 34 + } 35 + 36 + int uverbs_async_event_release(struct inode *inode, struct file *filp) 37 + { 38 + struct ib_uverbs_async_event_file *event_file; 39 + struct ib_uobject *uobj = filp->private_data; 40 + int ret; 41 + 42 + if (!uobj) 43 + return uverbs_uobject_fd_release(inode, filp); 44 + 45 + event_file = 46 + container_of(uobj, struct ib_uverbs_async_event_file, uobj); 47 + 48 + /* 49 + * The async event FD has to deliver IB_EVENT_DEVICE_FATAL even after 50 + * disassociation, so cleaning the event list must only happen after 51 + * release. The user knows it has reached the end of the event stream 52 + * when it sees IB_EVENT_DEVICE_FATAL. 53 + */ 54 + uverbs_uobject_get(uobj); 55 + ret = uverbs_uobject_fd_release(inode, filp); 56 + ib_uverbs_free_event_queue(&event_file->ev_queue); 57 + uverbs_uobject_put(uobj); 58 + return ret; 31 59 } 32 60 33 61 DECLARE_UVERBS_NAMED_METHOD(
+3 -4
drivers/infiniband/hw/cxgb4/cm.c
··· 2891 2891 srqidx = ABORT_RSS_SRQIDX_G( 2892 2892 be32_to_cpu(req->srqidx_status)); 2893 2893 if (srqidx) { 2894 - complete_cached_srq_buffers(ep, 2895 - req->srqidx_status); 2894 + complete_cached_srq_buffers(ep, srqidx); 2896 2895 } else { 2897 2896 /* Hold ep ref until finish_peer_abort() */ 2898 2897 c4iw_get_ep(&ep->com); ··· 3877 3878 return 0; 3878 3879 } 3879 3880 3880 - ep->srqe_idx = t4_tcb_get_field32(tcb, TCB_RQ_START_W, TCB_RQ_START_W, 3881 - TCB_RQ_START_S); 3881 + ep->srqe_idx = t4_tcb_get_field32(tcb, TCB_RQ_START_W, TCB_RQ_START_M, 3882 + TCB_RQ_START_S); 3882 3883 cleanup: 3883 3884 pr_debug("ep %p tid %u %016x\n", ep, ep->hwtid, ep->srqe_idx); 3884 3885
-4
drivers/infiniband/hw/hfi1/user_sdma.c
··· 589 589 590 590 set_comp_state(pq, cq, info.comp_idx, QUEUED, 0); 591 591 pq->state = SDMA_PKT_Q_ACTIVE; 592 - /* Send the first N packets in the request to buy us some time */ 593 - ret = user_sdma_send_pkts(req, pcount); 594 - if (unlikely(ret < 0 && ret != -EBUSY)) 595 - goto free_req; 596 592 597 593 /* 598 594 * This is a somewhat blocking send implementation.
-8
drivers/infiniband/hw/i40iw/i40iw_cm.c
··· 1987 1987 struct rtable *rt; 1988 1988 struct neighbour *neigh; 1989 1989 int rc = arpindex; 1990 - struct net_device *netdev = iwdev->netdev; 1991 1990 __be32 dst_ipaddr = htonl(dst_ip); 1992 1991 __be32 src_ipaddr = htonl(src_ip); 1993 1992 ··· 1995 1996 i40iw_pr_err("ip_route_output\n"); 1996 1997 return rc; 1997 1998 } 1998 - 1999 - if (netif_is_bond_slave(netdev)) 2000 - netdev = netdev_master_upper_dev_get(netdev); 2001 1999 2002 2000 neigh = dst_neigh_lookup(&rt->dst, &dst_ipaddr); 2003 2001 ··· 2061 2065 { 2062 2066 struct neighbour *neigh; 2063 2067 int rc = arpindex; 2064 - struct net_device *netdev = iwdev->netdev; 2065 2068 struct dst_entry *dst; 2066 2069 struct sockaddr_in6 dst_addr; 2067 2070 struct sockaddr_in6 src_addr; ··· 2080 2085 } 2081 2086 return rc; 2082 2087 } 2083 - 2084 - if (netif_is_bond_slave(netdev)) 2085 - netdev = netdev_master_upper_dev_get(netdev); 2086 2088 2087 2089 neigh = dst_neigh_lookup(dst, dst_addr.sin6_addr.in6_u.u6_addr32); 2088 2090
+1 -1
drivers/infiniband/hw/i40iw/i40iw_hw.c
··· 534 534 int arp_index; 535 535 536 536 arp_index = i40iw_arp_table(iwdev, ip_addr, ipv4, mac_addr, action); 537 - if (arp_index == -1) 537 + if (arp_index < 0) 538 538 return; 539 539 cqp_request = i40iw_get_cqp_request(&iwdev->cqp, false); 540 540 if (!cqp_request)
+11 -3
drivers/infiniband/hw/mlx4/qp.c
··· 2891 2891 int send_size; 2892 2892 int header_size; 2893 2893 int spc; 2894 + int err; 2894 2895 int i; 2895 2896 2896 2897 if (wr->wr.opcode != IB_WR_SEND) ··· 2926 2925 2927 2926 sqp->ud_header.lrh.virtual_lane = 0; 2928 2927 sqp->ud_header.bth.solicited_event = !!(wr->wr.send_flags & IB_SEND_SOLICITED); 2929 - ib_get_cached_pkey(ib_dev, sqp->qp.port, 0, &pkey); 2928 + err = ib_get_cached_pkey(ib_dev, sqp->qp.port, 0, &pkey); 2929 + if (err) 2930 + return err; 2930 2931 sqp->ud_header.bth.pkey = cpu_to_be16(pkey); 2931 2932 if (sqp->qp.mlx4_ib_qp_type == MLX4_IB_QPT_TUN_SMI_OWNER) 2932 2933 sqp->ud_header.bth.destination_qpn = cpu_to_be32(wr->remote_qpn); ··· 3215 3212 } 3216 3213 sqp->ud_header.bth.solicited_event = !!(wr->wr.send_flags & IB_SEND_SOLICITED); 3217 3214 if (!sqp->qp.ibqp.qp_num) 3218 - ib_get_cached_pkey(ib_dev, sqp->qp.port, sqp->pkey_index, &pkey); 3215 + err = ib_get_cached_pkey(ib_dev, sqp->qp.port, sqp->pkey_index, 3216 + &pkey); 3219 3217 else 3220 - ib_get_cached_pkey(ib_dev, sqp->qp.port, wr->pkey_index, &pkey); 3218 + err = ib_get_cached_pkey(ib_dev, sqp->qp.port, wr->pkey_index, 3219 + &pkey); 3220 + if (err) 3221 + return err; 3222 + 3221 3223 sqp->ud_header.bth.pkey = cpu_to_be16(pkey); 3222 3224 sqp->ud_header.bth.destination_qpn = cpu_to_be32(wr->remote_qpn); 3223 3225 sqp->ud_header.bth.psn = cpu_to_be32((sqp->send_psn++) & ((1 << 24) - 1));
+1 -1
drivers/infiniband/sw/rxe/rxe_mmap.c
··· 151 151 152 152 ip = kmalloc(sizeof(*ip), GFP_KERNEL); 153 153 if (!ip) 154 - return NULL; 154 + return ERR_PTR(-ENOMEM); 155 155 156 156 size = PAGE_ALIGN(size); 157 157
+7 -4
drivers/infiniband/sw/rxe/rxe_queue.c
··· 45 45 46 46 if (outbuf) { 47 47 ip = rxe_create_mmap_info(rxe, buf_size, udata, buf); 48 - if (!ip) 48 + if (IS_ERR(ip)) { 49 + err = PTR_ERR(ip); 49 50 goto err1; 51 + } 50 52 51 - err = copy_to_user(outbuf, &ip->info, sizeof(ip->info)); 52 - if (err) 53 + if (copy_to_user(outbuf, &ip->info, sizeof(ip->info))) { 54 + err = -EFAULT; 53 55 goto err2; 56 + } 54 57 55 58 spin_lock_bh(&rxe->pending_lock); 56 59 list_add(&ip->pending_mmaps, &rxe->pending_mmaps); ··· 67 64 err2: 68 65 kfree(ip); 69 66 err1: 70 - return -EINVAL; 67 + return err; 71 68 } 72 69 73 70 inline void rxe_queue_reset(struct rxe_queue *q)
+2 -2
drivers/interconnect/qcom/osm-l3.c
··· 78 78 [SLAVE_OSM_L3] = &sdm845_osm_l3, 79 79 }; 80 80 81 - const static struct qcom_icc_desc sdm845_icc_osm_l3 = { 81 + static const struct qcom_icc_desc sdm845_icc_osm_l3 = { 82 82 .nodes = sdm845_osm_l3_nodes, 83 83 .num_nodes = ARRAY_SIZE(sdm845_osm_l3_nodes), 84 84 }; ··· 91 91 [SLAVE_OSM_L3] = &sc7180_osm_l3, 92 92 }; 93 93 94 - const static struct qcom_icc_desc sc7180_icc_osm_l3 = { 94 + static const struct qcom_icc_desc sc7180_icc_osm_l3 = { 95 95 .nodes = sc7180_osm_l3_nodes, 96 96 .num_nodes = ARRAY_SIZE(sc7180_osm_l3_nodes), 97 97 };
+8 -8
drivers/interconnect/qcom/sdm845.c
··· 192 192 [SLAVE_ANOC_PCIE_A1NOC_SNOC] = &qns_pcie_a1noc_snoc, 193 193 }; 194 194 195 - const static struct qcom_icc_desc sdm845_aggre1_noc = { 195 + static const struct qcom_icc_desc sdm845_aggre1_noc = { 196 196 .nodes = aggre1_noc_nodes, 197 197 .num_nodes = ARRAY_SIZE(aggre1_noc_nodes), 198 198 .bcms = aggre1_noc_bcms, ··· 220 220 [SLAVE_SERVICE_A2NOC] = &srvc_aggre2_noc, 221 221 }; 222 222 223 - const static struct qcom_icc_desc sdm845_aggre2_noc = { 223 + static const struct qcom_icc_desc sdm845_aggre2_noc = { 224 224 .nodes = aggre2_noc_nodes, 225 225 .num_nodes = ARRAY_SIZE(aggre2_noc_nodes), 226 226 .bcms = aggre2_noc_bcms, ··· 281 281 [SLAVE_SERVICE_CNOC] = &srvc_cnoc, 282 282 }; 283 283 284 - const static struct qcom_icc_desc sdm845_config_noc = { 284 + static const struct qcom_icc_desc sdm845_config_noc = { 285 285 .nodes = config_noc_nodes, 286 286 .num_nodes = ARRAY_SIZE(config_noc_nodes), 287 287 .bcms = config_noc_bcms, ··· 297 297 [SLAVE_MEM_NOC_CFG] = &qhs_memnoc, 298 298 }; 299 299 300 - const static struct qcom_icc_desc sdm845_dc_noc = { 300 + static const struct qcom_icc_desc sdm845_dc_noc = { 301 301 .nodes = dc_noc_nodes, 302 302 .num_nodes = ARRAY_SIZE(dc_noc_nodes), 303 303 .bcms = dc_noc_bcms, ··· 315 315 [SLAVE_SERVICE_GNOC] = &srvc_gnoc, 316 316 }; 317 317 318 - const static struct qcom_icc_desc sdm845_gladiator_noc = { 318 + static const struct qcom_icc_desc sdm845_gladiator_noc = { 319 319 .nodes = gladiator_noc_nodes, 320 320 .num_nodes = ARRAY_SIZE(gladiator_noc_nodes), 321 321 .bcms = gladiator_noc_bcms, ··· 350 350 [SLAVE_EBI1] = &ebi, 351 351 }; 352 352 353 - const static struct qcom_icc_desc sdm845_mem_noc = { 353 + static const struct qcom_icc_desc sdm845_mem_noc = { 354 354 .nodes = mem_noc_nodes, 355 355 .num_nodes = ARRAY_SIZE(mem_noc_nodes), 356 356 .bcms = mem_noc_bcms, ··· 384 384 [SLAVE_CAMNOC_UNCOMP] = &qns_camnoc_uncomp, 385 385 }; 386 386 387 - const static struct qcom_icc_desc sdm845_mmss_noc = { 387 + static const struct qcom_icc_desc sdm845_mmss_noc = {
388 388 .nodes = mmss_noc_nodes, 389 389 .num_nodes = ARRAY_SIZE(mmss_noc_nodes), 390 390 .bcms = mmss_noc_bcms, ··· 430 430 [SLAVE_TCU] = &xs_sys_tcu_cfg, 431 431 }; 432 432 433 - const static struct qcom_icc_desc sdm845_system_noc = { 433 + static const struct qcom_icc_desc sdm845_system_noc = { 434 434 .nodes = system_noc_nodes, 435 435 .num_nodes = ARRAY_SIZE(system_noc_nodes), 436 436 .bcms = system_noc_bcms,
+154 -44
drivers/iommu/amd_iommu.c
··· 101 101 static void update_domain(struct protection_domain *domain); 102 102 static int protection_domain_init(struct protection_domain *domain); 103 103 static void detach_device(struct device *dev); 104 + static void update_and_flush_device_table(struct protection_domain *domain, 105 + struct domain_pgtable *pgtable); 104 106 105 107 /**************************************************************************** 106 108 * ··· 151 149 static struct protection_domain *to_pdomain(struct iommu_domain *dom) 152 150 { 153 151 return container_of(dom, struct protection_domain, domain); 152 + } 153 + 154 + static void amd_iommu_domain_get_pgtable(struct protection_domain *domain, 155 + struct domain_pgtable *pgtable) 156 + { 157 + u64 pt_root = atomic64_read(&domain->pt_root); 158 + 159 + pgtable->root = (u64 *)(pt_root & PAGE_MASK); 160 + pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */ 161 + } 162 + 163 + static u64 amd_iommu_domain_encode_pgtable(u64 *root, int mode) 164 + { 165 + u64 pt_root; 166 + 167 + /* lowest 3 bits encode pgtable mode */ 168 + pt_root = mode & 7; 169 + pt_root |= (u64)root; 170 + 171 + return pt_root; 154 172 } 156 174 static struct iommu_dev_data *alloc_dev_data(u16 devid) ··· 1419 1397 1420 1398 static void free_pagetable(struct protection_domain *domain) 1421 1399 { 1422 - unsigned long root = (unsigned long)domain->pt_root; 1400 + struct domain_pgtable pgtable; 1423 1401 struct page *freelist = NULL; 1402 + unsigned long root; 1424 1403 1425 - BUG_ON(domain->mode < PAGE_MODE_NONE || 1426 - domain->mode > PAGE_MODE_6_LEVEL); 1404 + amd_iommu_domain_get_pgtable(domain, &pgtable); 1405 + atomic64_set(&domain->pt_root, 0); 1427 1406 1428 - freelist = free_sub_pt(root, domain->mode, freelist); 1407 + BUG_ON(pgtable.mode < PAGE_MODE_NONE || 1408 + pgtable.mode > PAGE_MODE_6_LEVEL); 1409 + 1410 + root = (unsigned long)pgtable.root; 1411 + freelist = free_sub_pt(root, pgtable.mode, freelist); 1429 1412 free_page_list(freelist); 1431 1414 }
··· 1444 1417 unsigned long address, 1445 1418 gfp_t gfp) 1446 1419 { 1420 + struct domain_pgtable pgtable; 1447 1421 unsigned long flags; 1448 - bool ret = false; 1449 - u64 *pte; 1422 + bool ret = true; 1423 + u64 *pte, root; 1450 1424 1451 1425 spin_lock_irqsave(&domain->lock, flags); 1452 1426 1453 - if (address <= PM_LEVEL_SIZE(domain->mode) || 1454 - WARN_ON_ONCE(domain->mode == PAGE_MODE_6_LEVEL)) 1427 + amd_iommu_domain_get_pgtable(domain, &pgtable); 1428 + 1429 + if (address <= PM_LEVEL_SIZE(pgtable.mode)) 1430 + goto out; 1431 + 1432 + ret = false; 1433 + if (WARN_ON_ONCE(pgtable.mode == PAGE_MODE_6_LEVEL)) 1455 1434 goto out; 1456 1435 1457 1436 pte = (void *)get_zeroed_page(gfp); 1458 1437 if (!pte) 1459 1438 goto out; 1460 1439 1461 - *pte = PM_LEVEL_PDE(domain->mode, 1462 - iommu_virt_to_phys(domain->pt_root)); 1463 - domain->pt_root = pte; 1464 - domain->mode += 1; 1440 + *pte = PM_LEVEL_PDE(pgtable.mode, iommu_virt_to_phys(pgtable.root)); 1441 + 1442 + pgtable.root = pte; 1443 + pgtable.mode += 1; 1444 + update_and_flush_device_table(domain, &pgtable); 1445 + domain_flush_complete(domain); 1446 + 1447 + /* 1448 + * Device Table needs to be updated and flushed before the new root can 1449 + * be published. 
1450 + */ 1451 + root = amd_iommu_domain_encode_pgtable(pte, pgtable.mode); 1452 + atomic64_set(&domain->pt_root, root); 1465 1453 1466 1454 ret = true; 1467 1455 ··· 1493 1451 gfp_t gfp, 1494 1452 bool *updated) 1495 1453 { 1454 + struct domain_pgtable pgtable; 1496 1455 int level, end_lvl; 1497 1456 u64 *pte, *page; 1498 1457 1499 1458 BUG_ON(!is_power_of_2(page_size)); 1500 1459 1501 - while (address > PM_LEVEL_SIZE(domain->mode)) 1502 - *updated = increase_address_space(domain, address, gfp) || *updated; 1460 + amd_iommu_domain_get_pgtable(domain, &pgtable); 1503 1461 1504 - level = domain->mode - 1; 1505 - pte = &domain->pt_root[PM_LEVEL_INDEX(level, address)]; 1462 + while (address > PM_LEVEL_SIZE(pgtable.mode)) { 1463 + /* 1464 + * Return an error if there is no memory to update the 1465 + * page-table. 1466 + */ 1467 + if (!increase_address_space(domain, address, gfp)) 1468 + return NULL; 1469 + 1470 + /* Read new values to check if update was successful */ 1471 + amd_iommu_domain_get_pgtable(domain, &pgtable); 1472 + } 1473 + 1474 + 1475 + level = pgtable.mode - 1; 1476 + pte = &pgtable.root[PM_LEVEL_INDEX(level, address)]; 1506 1477 address = PAGE_SIZE_ALIGN(address, page_size); 1507 1478 end_lvl = PAGE_SIZE_LEVEL(page_size); 1508 1479 ··· 1591 1536 unsigned long address, 1592 1537 unsigned long *page_size) 1593 1538 { 1539 + struct domain_pgtable pgtable; 1594 1540 int level; 1595 1541 u64 *pte; 1596 1542 1597 1543 *page_size = 0; 1598 1544 1599 - if (address > PM_LEVEL_SIZE(domain->mode)) 1545 + amd_iommu_domain_get_pgtable(domain, &pgtable); 1546 + 1547 + if (address > PM_LEVEL_SIZE(pgtable.mode)) 1600 1548 return NULL; 1601 1549 1602 - level = domain->mode - 1; 1603 - pte = &domain->pt_root[PM_LEVEL_INDEX(level, address)]; 1550 + level = pgtable.mode - 1; 1551 + pte = &pgtable.root[PM_LEVEL_INDEX(level, address)]; 1604 1552 *page_size = PTE_LEVEL_PAGE_SIZE(level); 1605 1553 1606 1554 while (level > 0) { ··· 1718 1660 unsigned long flags; 1719 1661 
1720 1662 spin_lock_irqsave(&dom->lock, flags); 1721 - update_domain(dom); 1663 + /* 1664 + * Flush domain TLB(s) and wait for completion. Any Device-Table 1665 + * Updates and flushing already happened in 1666 + * increase_address_space(). 1667 + */ 1668 + domain_flush_tlb_pde(dom); 1669 + domain_flush_complete(dom); 1722 1670 spin_unlock_irqrestore(&dom->lock, flags); 1723 1671 } 1724 1672 ··· 1870 1806 static struct protection_domain *dma_ops_domain_alloc(void) 1871 1807 { 1872 1808 struct protection_domain *domain; 1809 + u64 *pt_root, root; 1873 1810 1874 1811 domain = kzalloc(sizeof(struct protection_domain), GFP_KERNEL); 1875 1812 if (!domain) ··· 1879 1814 if (protection_domain_init(domain)) 1880 1815 goto free_domain; 1881 1816 1882 - domain->mode = PAGE_MODE_3_LEVEL; 1883 - domain->pt_root = (void *)get_zeroed_page(GFP_KERNEL); 1884 - domain->flags = PD_DMA_OPS_MASK; 1885 - if (!domain->pt_root) 1817 + pt_root = (void *)get_zeroed_page(GFP_KERNEL); 1818 + if (!pt_root) 1886 1819 goto free_domain; 1820 + 1821 + root = amd_iommu_domain_encode_pgtable(pt_root, PAGE_MODE_3_LEVEL); 1822 + atomic64_set(&domain->pt_root, root); 1823 + domain->flags = PD_DMA_OPS_MASK; 1887 1824 1888 1825 if (iommu_get_dma_cookie(&domain->domain) == -ENOMEM) 1889 1826 goto free_domain; ··· 1908 1841 } 1909 1842 1910 1843 static void set_dte_entry(u16 devid, struct protection_domain *domain, 1844 + struct domain_pgtable *pgtable, 1911 1845 bool ats, bool ppr) 1912 1846 { 1913 1847 u64 pte_root = 0; 1914 1848 u64 flags = 0; 1915 1849 u32 old_domid; 1916 1850 1917 - if (domain->mode != PAGE_MODE_NONE) 1918 - pte_root = iommu_virt_to_phys(domain->pt_root); 1851 + if (pgtable->mode != PAGE_MODE_NONE) 1852 + pte_root = iommu_virt_to_phys(pgtable->root); 1919 1853 1920 - pte_root |= (domain->mode & DEV_ENTRY_MODE_MASK) 1854 + pte_root |= (pgtable->mode & DEV_ENTRY_MODE_MASK) 1921 1855 << DEV_ENTRY_MODE_SHIFT; 1922 1856 pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV; 
1923 1857 ··· 1991 1923 static void do_attach(struct iommu_dev_data *dev_data, 1992 1924 struct protection_domain *domain) 1993 1925 { 1926 + struct domain_pgtable pgtable; 1994 1927 struct amd_iommu *iommu; 1995 1928 bool ats; 1996 1929 ··· 2007 1938 domain->dev_cnt += 1; 2008 1939 2009 1940 /* Update device table */ 2010 - set_dte_entry(dev_data->devid, domain, ats, dev_data->iommu_v2); 1941 + amd_iommu_domain_get_pgtable(domain, &pgtable); 1942 + set_dte_entry(dev_data->devid, domain, &pgtable, 1943 + ats, dev_data->iommu_v2); 2011 1944 clone_aliases(dev_data->pdev); 2012 1945 2013 1946 device_flush_dte(dev_data); ··· 2320 2249 * 2321 2250 *****************************************************************************/ 2322 2251 2323 - static void update_device_table(struct protection_domain *domain) 2252 + static void update_device_table(struct protection_domain *domain, 2253 + struct domain_pgtable *pgtable) 2324 2254 { 2325 2255 struct iommu_dev_data *dev_data; 2326 2256 2327 2257 list_for_each_entry(dev_data, &domain->dev_list, list) { 2328 - set_dte_entry(dev_data->devid, domain, dev_data->ats.enabled, 2329 - dev_data->iommu_v2); 2258 + set_dte_entry(dev_data->devid, domain, pgtable, 2259 + dev_data->ats.enabled, dev_data->iommu_v2); 2330 2260 clone_aliases(dev_data->pdev); 2331 2261 } 2332 2262 } 2333 2263 2264 + static void update_and_flush_device_table(struct protection_domain *domain, 2265 + struct domain_pgtable *pgtable) 2266 + { 2267 + update_device_table(domain, pgtable); 2268 + domain_flush_devices(domain); 2269 + } 2270 + 2334 2271 static void update_domain(struct protection_domain *domain) 2335 2272 { 2336 - update_device_table(domain); 2273 + struct domain_pgtable pgtable; 2337 2274 2338 - domain_flush_devices(domain); 2275 + /* Update device table */ 2276 + amd_iommu_domain_get_pgtable(domain, &pgtable); 2277 + update_and_flush_device_table(domain, &pgtable); 2278 + 2279 + /* Flush domain TLB(s) and wait for completion */ 2339 2280 
domain_flush_tlb_pde(domain); 2281 + domain_flush_complete(domain); 2340 2282 } 2341 2283 2342 2284 int __init amd_iommu_init_api(void) ··· 2459 2375 static struct iommu_domain *amd_iommu_domain_alloc(unsigned type) 2460 2376 { 2461 2377 struct protection_domain *pdomain; 2378 + u64 *pt_root, root; 2462 2379 2463 2380 switch (type) { 2464 2381 case IOMMU_DOMAIN_UNMANAGED: ··· 2467 2382 if (!pdomain) 2468 2383 return NULL; 2469 2384 2470 - pdomain->mode = PAGE_MODE_3_LEVEL; 2471 - pdomain->pt_root = (void *)get_zeroed_page(GFP_KERNEL); 2472 - if (!pdomain->pt_root) { 2385 + pt_root = (void *)get_zeroed_page(GFP_KERNEL); 2386 + if (!pt_root) { 2473 2387 protection_domain_free(pdomain); 2474 2388 return NULL; 2475 2389 } 2390 + 2391 + root = amd_iommu_domain_encode_pgtable(pt_root, PAGE_MODE_3_LEVEL); 2392 + atomic64_set(&pdomain->pt_root, root); 2476 2393 2477 2394 pdomain->domain.geometry.aperture_start = 0; 2478 2395 pdomain->domain.geometry.aperture_end = ~0ULL; ··· 2493 2406 if (!pdomain) 2494 2407 return NULL; 2495 2408 2496 - pdomain->mode = PAGE_MODE_NONE; 2409 + atomic64_set(&pdomain->pt_root, PAGE_MODE_NONE); 2497 2410 break; 2498 2411 default: 2499 2412 return NULL; ··· 2505 2418 static void amd_iommu_domain_free(struct iommu_domain *dom) 2506 2419 { 2507 2420 struct protection_domain *domain; 2421 + struct domain_pgtable pgtable; 2508 2422 2509 2423 domain = to_pdomain(dom); 2510 2424 ··· 2523 2435 dma_ops_domain_free(domain); 2524 2436 break; 2525 2437 default: 2526 - if (domain->mode != PAGE_MODE_NONE) 2438 + amd_iommu_domain_get_pgtable(domain, &pgtable); 2439 + 2440 + if (pgtable.mode != PAGE_MODE_NONE) 2527 2441 free_pagetable(domain); 2528 2442 2529 2443 if (domain->flags & PD_IOMMUV2_MASK) ··· 2608 2518 gfp_t gfp) 2609 2519 { 2610 2520 struct protection_domain *domain = to_pdomain(dom); 2521 + struct domain_pgtable pgtable; 2611 2522 int prot = 0; 2612 2523 int ret; 2613 2524 2614 - if (domain->mode == PAGE_MODE_NONE) 2525 + 
amd_iommu_domain_get_pgtable(domain, &pgtable); 2526 + if (pgtable.mode == PAGE_MODE_NONE) 2615 2527 return -EINVAL; 2616 2528 2617 2529 if (iommu_prot & IOMMU_READ) ··· 2633 2541 struct iommu_iotlb_gather *gather) 2634 2542 { 2635 2543 struct protection_domain *domain = to_pdomain(dom); 2544 + struct domain_pgtable pgtable; 2636 2545 2637 - if (domain->mode == PAGE_MODE_NONE) 2546 + amd_iommu_domain_get_pgtable(domain, &pgtable); 2547 + if (pgtable.mode == PAGE_MODE_NONE) 2638 2548 return 0; 2639 2549 2640 2550 return iommu_unmap_page(domain, iova, page_size); ··· 2647 2553 { 2648 2554 struct protection_domain *domain = to_pdomain(dom); 2649 2555 unsigned long offset_mask, pte_pgsize; 2556 + struct domain_pgtable pgtable; 2650 2557 u64 *pte, __pte; 2651 2558 2652 - if (domain->mode == PAGE_MODE_NONE) 2559 + amd_iommu_domain_get_pgtable(domain, &pgtable); 2560 + if (pgtable.mode == PAGE_MODE_NONE) 2653 2561 return iova; 2654 2562 2655 2563 pte = fetch_pte(domain, iova, &pte_pgsize); ··· 2804 2708 void amd_iommu_domain_direct_map(struct iommu_domain *dom) 2805 2709 { 2806 2710 struct protection_domain *domain = to_pdomain(dom); 2711 + struct domain_pgtable pgtable; 2807 2712 unsigned long flags; 2713 + u64 pt_root; 2808 2714 2809 2715 spin_lock_irqsave(&domain->lock, flags); 2810 2716 2717 + /* First save pgtable configuration*/ 2718 + amd_iommu_domain_get_pgtable(domain, &pgtable); 2719 + 2811 2720 /* Update data structure */ 2812 - domain->mode = PAGE_MODE_NONE; 2721 + pt_root = amd_iommu_domain_encode_pgtable(NULL, PAGE_MODE_NONE); 2722 + atomic64_set(&domain->pt_root, pt_root); 2813 2723 2814 2724 /* Make changes visible to IOMMUs */ 2815 2725 update_domain(domain); 2726 + 2727 + /* Restore old pgtable in domain->ptroot to free page-table */ 2728 + pt_root = amd_iommu_domain_encode_pgtable(pgtable.root, pgtable.mode); 2729 + atomic64_set(&domain->pt_root, pt_root); 2816 2730 2817 2731 /* Page-table is not visible to IOMMU anymore, so free it */ 2818 2732 
free_pagetable(domain); ··· 3014 2908 static int __set_gcr3(struct protection_domain *domain, int pasid, 3015 2909 unsigned long cr3) 3016 2910 { 2911 + struct domain_pgtable pgtable; 3017 2912 u64 *pte; 3018 2913 3019 - if (domain->mode != PAGE_MODE_NONE) 2914 + amd_iommu_domain_get_pgtable(domain, &pgtable); 2915 + if (pgtable.mode != PAGE_MODE_NONE) 3020 2916 return -EINVAL; 3021 2917 3022 2918 pte = __get_gcr3_pte(domain->gcr3_tbl, domain->glx, pasid, true); ··· 3032 2924 3033 2925 static int __clear_gcr3(struct protection_domain *domain, int pasid) 3034 2926 { 2927 + struct domain_pgtable pgtable; 3035 2928 u64 *pte; 3036 2929 3037 - if (domain->mode != PAGE_MODE_NONE) 2930 + amd_iommu_domain_get_pgtable(domain, &pgtable); 2931 + if (pgtable.mode != PAGE_MODE_NONE) 3038 2932 return -EINVAL; 3039 2933 3040 2934 pte = __get_gcr3_pte(domain->gcr3_tbl, domain->glx, pasid, false);
+7 -2
drivers/iommu/amd_iommu_types.h
··· 468 468 iommu core code */ 469 469 spinlock_t lock; /* mostly used to lock the page table*/ 470 470 u16 id; /* the domain id written to the device table */ 471 - int mode; /* paging mode (0-6 levels) */ 472 - u64 *pt_root; /* page table root pointer */ 471 + atomic64_t pt_root; /* pgtable root and pgtable mode */ 473 472 int glx; /* Number of levels for GCR3 table */ 474 473 u64 *gcr3_tbl; /* Guest CR3 table */ 475 474 unsigned long flags; /* flags to find out type of domain */ 476 475 unsigned dev_cnt; /* devices assigned to this domain */ 477 476 unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */ 477 + }; 478 + 479 + /* For decoded pt_root */ 480 + struct domain_pgtable { 481 + int mode; 482 + u64 *root; 478 483 }; 479 484 480 485 /*
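The amd_iommu_types.h change above collapses the separate `mode` and `pt_root` fields into one `atomic64_t`, so that a reader can take a single atomic load and always see a consistent root/mode pair. A hedged userspace sketch of the packing this enables follows; the root pointer is page aligned, so its low bits are free to carry the paging mode. The 3-bit mask and helper names are illustrative assumptions, not the driver's `amd_iommu_domain_encode_pgtable()`/`amd_iommu_domain_get_pgtable()` implementations:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch only: pack a page-aligned root address and a small
 * paging mode (0-6) into one 64-bit word, as an atomic64_t pt_root field
 * suggests. The 7 (3-bit) mask is an assumption based on PAGE_MODE_* <= 6. */
static uint64_t encode_pgtable(uint64_t root, int mode)
{
	/* root is page aligned, so the low 12 bits are free for the mode */
	return root | (uint64_t)(mode & 7);
}

static void decode_pgtable(uint64_t val, uint64_t *root, int *mode)
{
	*mode = (int)(val & 7);
	*root = val & ~0xfffULL;
}
```

With this shape, every reader in the diff decodes the atomic word into a local `struct domain_pgtable` snapshot and works on that copy, so a concurrent `increase_address_space()` can never be observed with a new root but an old mode.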
+1 -1
drivers/iommu/virtio-iommu.c
··· 453 453 if (!region) 454 454 return -ENOMEM; 455 455 456 - list_add(&vdev->resv_regions, &region->list); 456 + list_add(&region->list, &vdev->resv_regions); 457 457 return 0; 458 458 } 459 459
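The virtio-iommu one-liner above is an argument-order fix: the kernel's `list_add(new, head)` inserts `new` right after `head`, so passing the list head first splices the head into the entry instead. A minimal userspace copy of the intrusive list, written only to make that order concrete (the real implementation lives in `include/linux/list.h`):

```c
#include <assert.h>
#include <stddef.h>

/* Minimal re-implementation of the kernel's intrusive doubly linked list,
 * purely to illustrate list_add()'s argument order. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h->prev = h;
}

/* Insert "new" immediately after "head". */
static void list_add(struct list_head *new, struct list_head *head)
{
	struct list_head *first = head->next;

	new->next = first;
	new->prev = head;
	first->prev = new;
	head->next = new;
}
```

Called the old, reversed way, every region would have been "added" to the per-region node while `vdev->resv_regions` stayed empty, which is exactly what the fix corrects.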
+8
drivers/misc/mei/hw-me.c
··· 1465 1465 MEI_CFG_DMA_128, 1466 1466 }; 1467 1467 1468 + /* LBG with quirk for SPS Firmware exclusion */ 1469 + static const struct mei_cfg mei_me_pch12_sps_cfg = { 1470 + MEI_CFG_PCH8_HFS, 1471 + MEI_CFG_FW_VER_SUPP, 1472 + MEI_CFG_FW_SPS, 1473 + }; 1474 + 1468 1475 /* Tiger Lake and newer devices */ 1469 1476 static const struct mei_cfg mei_me_pch15_cfg = { 1470 1477 MEI_CFG_PCH8_HFS, ··· 1494 1487 [MEI_ME_PCH8_CFG] = &mei_me_pch8_cfg, 1495 1488 [MEI_ME_PCH8_SPS_CFG] = &mei_me_pch8_sps_cfg, 1496 1489 [MEI_ME_PCH12_CFG] = &mei_me_pch12_cfg, 1490 + [MEI_ME_PCH12_SPS_CFG] = &mei_me_pch12_sps_cfg, 1497 1491 [MEI_ME_PCH15_CFG] = &mei_me_pch15_cfg, 1498 1492 }; 1499 1493
+4
drivers/misc/mei/hw-me.h
··· 80 80 * servers platforms with quirk for 81 81 * SPS firmware exclusion. 82 82 * @MEI_ME_PCH12_CFG: Platform Controller Hub Gen12 and newer 83 + * @MEI_ME_PCH12_SPS_CFG: Platform Controller Hub Gen12 and newer 84 + * servers platforms with quirk for 85 + * SPS firmware exclusion. 83 86 * @MEI_ME_PCH15_CFG: Platform Controller Hub Gen15 and newer 84 87 * @MEI_ME_NUM_CFG: Upper Sentinel. 85 88 */ ··· 96 93 MEI_ME_PCH8_CFG, 97 94 MEI_ME_PCH8_SPS_CFG, 98 95 MEI_ME_PCH12_CFG, 96 + MEI_ME_PCH12_SPS_CFG, 99 97 MEI_ME_PCH15_CFG, 100 98 MEI_ME_NUM_CFG, 101 99 };
+1 -1
drivers/misc/mei/pci-me.c
··· 70 70 {MEI_PCI_DEVICE(MEI_DEV_ID_SPT_2, MEI_ME_PCH8_CFG)}, 71 71 {MEI_PCI_DEVICE(MEI_DEV_ID_SPT_H, MEI_ME_PCH8_SPS_CFG)}, 72 72 {MEI_PCI_DEVICE(MEI_DEV_ID_SPT_H_2, MEI_ME_PCH8_SPS_CFG)}, 73 - {MEI_PCI_DEVICE(MEI_DEV_ID_LBG, MEI_ME_PCH12_CFG)}, 73 + {MEI_PCI_DEVICE(MEI_DEV_ID_LBG, MEI_ME_PCH12_SPS_CFG)}, 74 74 75 75 {MEI_PCI_DEVICE(MEI_DEV_ID_BXT_M, MEI_ME_PCH8_CFG)}, 76 76 {MEI_PCI_DEVICE(MEI_DEV_ID_APL_I, MEI_ME_PCH8_CFG)},
+2 -1
drivers/mmc/core/block.c
··· 1370 1370 struct mmc_request *mrq = &mqrq->brq.mrq; 1371 1371 struct request_queue *q = req->q; 1372 1372 struct mmc_host *host = mq->card->host; 1373 + enum mmc_issue_type issue_type = mmc_issue_type(mq, req); 1373 1374 unsigned long flags; 1374 1375 bool put_card; 1375 1376 int err; ··· 1400 1399 1401 1400 spin_lock_irqsave(&mq->lock, flags); 1402 1401 1403 - mq->in_flight[mmc_issue_type(mq, req)] -= 1; 1402 + mq->in_flight[issue_type] -= 1; 1404 1403 1405 1404 put_card = (mmc_tot_in_flight(mq) == 0); 1406 1405
+5 -11
drivers/mmc/core/queue.c
··· 107 107 case MMC_ISSUE_DCMD: 108 108 if (host->cqe_ops->cqe_timeout(host, mrq, &recovery_needed)) { 109 109 if (recovery_needed) 110 - __mmc_cqe_recovery_notifier(mq); 110 + mmc_cqe_recovery_notifier(mrq); 111 111 return BLK_EH_RESET_TIMER; 112 112 } 113 - /* No timeout (XXX: huh? comment doesn't make much sense) */ 114 - blk_mq_complete_request(req); 113 + /* The request has gone already */ 115 114 return BLK_EH_DONE; 116 115 default: 117 116 /* Timeout is handled by mmc core */ ··· 126 127 struct mmc_card *card = mq->card; 127 128 struct mmc_host *host = card->host; 128 129 unsigned long flags; 129 - int ret; 130 + bool ignore_tout; 130 131 131 132 spin_lock_irqsave(&mq->lock, flags); 132 - 133 - if (mq->recovery_needed || !mq->use_cqe || host->hsq_enabled) 134 - ret = BLK_EH_RESET_TIMER; 135 - else 136 - ret = mmc_cqe_timed_out(req); 137 - 133 + ignore_tout = mq->recovery_needed || !mq->use_cqe || host->hsq_enabled; 138 134 spin_unlock_irqrestore(&mq->lock, flags); 139 135 140 - return ret; 136 + return ignore_tout ? BLK_EH_RESET_TIMER : mmc_cqe_timed_out(req); 141 137 } 142 138 143 139 static void mmc_mq_recovery_handler(struct work_struct *work)
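The mmc/core/queue.c rework above narrows the lock scope: only the flag reads happen while `mq->lock` is held, and the heavier timeout handling runs after the lock is dropped. A hedged sketch of that shape, with a pthread mutex and plain booleans standing in for the spinlock and the mq/host state (return values mirror `BLK_EH_RESET_TIMER`/`BLK_EH_DONE`, but nothing here is the driver's code):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-ins for mq->lock and the queue/host flags read under it. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static bool recovery_needed;
static bool use_cqe = true;
static bool hsq_enabled;

enum { EH_RESET_TIMER = 1, EH_DONE = 2 };

static int cqe_timed_out(void)
{
	return EH_DONE;		/* pretend the request already completed */
}

static int mq_timed_out(void)
{
	bool ignore_tout;

	/* Only the decision inputs are sampled under the lock... */
	pthread_mutex_lock(&lock);
	ignore_tout = recovery_needed || !use_cqe || hsq_enabled;
	pthread_mutex_unlock(&lock);

	/* ...the potentially slow work runs after it is released. */
	return ignore_tout ? EH_RESET_TIMER : cqe_timed_out();
}
```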
+5 -1
drivers/mmc/host/alcor.c
··· 1104 1104 1105 1105 if (ret) { 1106 1106 dev_err(&pdev->dev, "Failed to get irq for data line\n"); 1107 - return ret; 1107 + goto free_host; 1108 1108 } 1109 1109 1110 1110 mutex_init(&host->cmd_mutex); ··· 1116 1116 dev_set_drvdata(&pdev->dev, host); 1117 1117 mmc_add_host(mmc); 1118 1118 return 0; 1119 + 1120 + free_host: 1121 + mmc_free_host(mmc); 1122 + return ret; 1119 1123 } 1120 1124 1121 1125 static int alcor_pci_sdmmc_drv_remove(struct platform_device *pdev)
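The alcor fix above completes the kernel's goto-unwind convention: once a resource is allocated, every later failure jumps to a label that releases it, rather than returning directly and leaking. A hedged sketch of the pattern, with `malloc()`/`free()` standing in for `mmc_alloc_host()`/`mmc_free_host()` and an illustrative failure flag in place of the real IRQ lookup:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Sketch of the probe error-unwind pattern; nothing here is driver code. */
static int probe(int fail_irq)
{
	void *host = malloc(64);	/* stands in for mmc_alloc_host() */
	int ret = 0;

	if (!host)
		return -ENOMEM;		/* nothing to unwind yet */

	if (fail_irq) {
		ret = -EINVAL;		/* record the error... */
		goto free_host;		/* ...and jump, don't return */
	}

	/* Success: the real driver keeps the host registered here. */
	free(host);
	return 0;

free_host:
	free(host);			/* stands in for mmc_free_host() */
	return ret;
}
```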
+6 -4
drivers/mmc/host/sdhci-acpi.c
··· 605 605 } 606 606 607 607 static const struct sdhci_acpi_slot sdhci_acpi_slot_amd_emmc = { 608 - .chip = &sdhci_acpi_chip_amd, 609 - .caps = MMC_CAP_8_BIT_DATA | MMC_CAP_NONREMOVABLE, 610 - .quirks = SDHCI_QUIRK_32BIT_DMA_ADDR | SDHCI_QUIRK_32BIT_DMA_SIZE | 611 - SDHCI_QUIRK_32BIT_ADMA_SIZE, 608 + .chip = &sdhci_acpi_chip_amd, 609 + .caps = MMC_CAP_8_BIT_DATA | MMC_CAP_NONREMOVABLE, 610 + .quirks = SDHCI_QUIRK_32BIT_DMA_ADDR | 611 + SDHCI_QUIRK_32BIT_DMA_SIZE | 612 + SDHCI_QUIRK_32BIT_ADMA_SIZE, 613 + .quirks2 = SDHCI_QUIRK2_BROKEN_64_BIT_DMA, 612 614 .probe_slot = sdhci_acpi_emmc_amd_probe_slot, 613 615 }; 614 616
+23
drivers/mmc/host/sdhci-pci-gli.c
··· 26 26 #define SDHCI_GLI_9750_DRIVING_2 GENMASK(27, 26) 27 27 #define GLI_9750_DRIVING_1_VALUE 0xFFF 28 28 #define GLI_9750_DRIVING_2_VALUE 0x3 29 + #define SDHCI_GLI_9750_SEL_1 BIT(29) 30 + #define SDHCI_GLI_9750_SEL_2 BIT(31) 31 + #define SDHCI_GLI_9750_ALL_RST (BIT(24)|BIT(25)|BIT(28)|BIT(30)) 29 32 30 33 #define SDHCI_GLI_9750_PLL 0x864 31 34 #define SDHCI_GLI_9750_PLL_TX2_INV BIT(23) ··· 125 122 GLI_9750_DRIVING_1_VALUE); 126 123 driving_value |= FIELD_PREP(SDHCI_GLI_9750_DRIVING_2, 127 124 GLI_9750_DRIVING_2_VALUE); 125 + driving_value &= ~(SDHCI_GLI_9750_SEL_1|SDHCI_GLI_9750_SEL_2|SDHCI_GLI_9750_ALL_RST); 126 + driving_value |= SDHCI_GLI_9750_SEL_2; 128 127 sdhci_writel(host, driving_value, SDHCI_GLI_9750_DRIVING); 129 128 130 129 sw_ctrl_value &= ~SDHCI_GLI_9750_SW_CTRL_4; ··· 339 334 return value; 340 335 } 341 336 337 + #ifdef CONFIG_PM_SLEEP 338 + static int sdhci_pci_gli_resume(struct sdhci_pci_chip *chip) 339 + { 340 + struct sdhci_pci_slot *slot = chip->slots[0]; 341 + 342 + pci_free_irq_vectors(slot->chip->pdev); 343 + gli_pcie_enable_msi(slot); 344 + 345 + return sdhci_pci_resume_host(chip); 346 + } 347 + #endif 348 + 342 349 static const struct sdhci_ops sdhci_gl9755_ops = { 343 350 .set_clock = sdhci_set_clock, 344 351 .enable_dma = sdhci_pci_enable_dma, ··· 365 348 .quirks2 = SDHCI_QUIRK2_BROKEN_DDR50, 366 349 .probe_slot = gli_probe_slot_gl9755, 367 350 .ops = &sdhci_gl9755_ops, 351 + #ifdef CONFIG_PM_SLEEP 352 + .resume = sdhci_pci_gli_resume, 353 + #endif 368 354 }; 369 355 370 356 static const struct sdhci_ops sdhci_gl9750_ops = { ··· 386 366 .quirks2 = SDHCI_QUIRK2_BROKEN_DDR50, 387 367 .probe_slot = gli_probe_slot_gl9750, 388 368 .ops = &sdhci_gl9750_ops, 369 + #ifdef CONFIG_PM_SLEEP 370 + .resume = sdhci_pci_gli_resume, 371 + #endif 389 372 };
+1 -1
drivers/most/core.c
··· 1483 1483 ida_destroy(&mdev_id); 1484 1484 } 1485 1485 1486 - module_init(most_init); 1486 + subsys_initcall(most_init); 1487 1487 module_exit(most_exit); 1488 1488 MODULE_LICENSE("GPL"); 1489 1489 MODULE_AUTHOR("Christian Gromm <christian.gromm@microchip.com>");
+4 -14
drivers/net/bareudp.c
··· 136 136 oiph = skb_network_header(skb); 137 137 skb_reset_network_header(skb); 138 138 139 - if (family == AF_INET) 139 + if (!IS_ENABLED(CONFIG_IPV6) || family == AF_INET) 140 140 err = IP_ECN_decapsulate(oiph, skb); 141 - #if IS_ENABLED(CONFIG_IPV6) 142 141 else 143 142 err = IP6_ECN_decapsulate(oiph, skb); 144 - #endif 145 143 146 144 if (unlikely(err)) { 147 145 if (log_ecn_error) { 148 - if (family == AF_INET) 146 + if (!IS_ENABLED(CONFIG_IPV6) || family == AF_INET) 149 147 net_info_ratelimited("non-ECT from %pI4 " 150 148 "with TOS=%#x\n", 151 149 &((struct iphdr *)oiph)->saddr, 152 150 ((struct iphdr *)oiph)->tos); 153 - #if IS_ENABLED(CONFIG_IPV6) 154 151 else 155 152 net_info_ratelimited("non-ECT from %pI6\n", 156 153 &((struct ipv6hdr *)oiph)->saddr); 157 - #endif 158 154 } 159 155 if (err > 1) { 160 156 ++bareudp->dev->stats.rx_frame_errors; ··· 346 350 return err; 347 351 } 348 352 349 - #if IS_ENABLED(CONFIG_IPV6) 350 353 static int bareudp6_xmit_skb(struct sk_buff *skb, struct net_device *dev, 351 354 struct bareudp_dev *bareudp, 352 355 const struct ip_tunnel_info *info) ··· 406 411 dst_release(dst); 407 412 return err; 408 413 } 409 - #endif 410 414 411 415 static netdev_tx_t bareudp_xmit(struct sk_buff *skb, struct net_device *dev) 412 416 { ··· 429 435 } 430 436 431 437 rcu_read_lock(); 432 - #if IS_ENABLED(CONFIG_IPV6) 433 - if (info->mode & IP_TUNNEL_INFO_IPV6) 438 + if (IS_ENABLED(CONFIG_IPV6) && info->mode & IP_TUNNEL_INFO_IPV6) 434 439 err = bareudp6_xmit_skb(skb, dev, bareudp, info); 435 440 else 436 - #endif 437 441 err = bareudp_xmit_skb(skb, dev, bareudp, info); 438 442 439 443 rcu_read_unlock(); ··· 459 467 460 468 use_cache = ip_tunnel_dst_cache_usable(skb, info); 461 469 462 - if (ip_tunnel_info_af(info) == AF_INET) { 470 + if (!IS_ENABLED(CONFIG_IPV6) || ip_tunnel_info_af(info) == AF_INET) { 463 471 struct rtable *rt; 464 472 __be32 saddr; 465 473 ··· 470 478 471 479 ip_rt_put(rt); 472 480 info->key.u.ipv4.src = saddr; 473 - #if 
IS_ENABLED(CONFIG_IPV6) 474 481 } else if (ip_tunnel_info_af(info) == AF_INET6) { 475 482 struct dst_entry *dst; 476 483 struct in6_addr saddr; ··· 483 492 484 493 dst_release(dst); 485 494 info->key.u.ipv6.src = saddr; 486 - #endif 487 495 } else { 488 496 return -EINVAL; 489 497 }
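The bareudp cleanup above trades `#if IS_ENABLED(CONFIG_IPV6)`/`#endif` blocks for `IS_ENABLED()` used as an ordinary condition: both branches stay visible to the compiler (so they keep compiling even when IPv6 is off) and the dead one is optimized away. A hedged sketch of the idiom; `CONFIG_IPV6`, the address-family constants, and the return values are all mocked here, and the real kernel macro in `kconfig.h` is more elaborate:

```c
#include <assert.h>

/* Mocked config machinery, for illustration only. */
#define CONFIG_IPV6	1
#define IS_ENABLED(opt)	(opt)

enum { AF_INET = 2, AF_INET6 = 10 };

/* With CONFIG_IPV6 disabled, the first condition is constant-true and the
 * compiler drops the IPv6 branch entirely - no preprocessor block needed. */
static int pick_path(int family)
{
	if (!IS_ENABLED(CONFIG_IPV6) || family == AF_INET)
		return 4;	/* IPv4 decapsulation path */
	return 6;		/* IPv6 decapsulation path */
}
```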
+1
drivers/net/dsa/dsa_loop.c
··· 360 360 } 361 361 module_exit(dsa_loop_exit); 362 362 363 + MODULE_SOFTDEP("pre: dsa_loop_bdinfo"); 363 364 MODULE_LICENSE("GPL"); 364 365 MODULE_AUTHOR("Florian Fainelli"); 365 366 MODULE_DESCRIPTION("DSA loopback driver");
+1
drivers/net/ethernet/broadcom/Kconfig
··· 69 69 select BCM7XXX_PHY 70 70 select MDIO_BCM_UNIMAC 71 71 select DIMLIB 72 + select BROADCOM_PHY if ARCH_BCM2835 72 73 help 73 74 This driver supports the built-in Ethernet MACs found in the 74 75 Broadcom BCM7xxx Set Top Box family chipset.
+2
drivers/net/ethernet/freescale/Kconfig
··· 77 77 depends on QUICC_ENGINE && PPC32 78 78 select FSL_PQ_MDIO 79 79 select PHYLIB 80 + select FIXED_PHY 80 81 ---help--- 81 82 This driver supports the Gigabit Ethernet mode of the QUICC Engine, 82 83 which is available on some Freescale SOCs. ··· 91 90 depends on HAS_DMA 92 91 select FSL_PQ_MDIO 93 92 select PHYLIB 93 + select FIXED_PHY 94 94 select CRC32 95 95 ---help--- 96 96 This driver supports the Gigabit TSEC on the MPC83xx, MPC85xx,
+1
drivers/net/ethernet/freescale/dpaa/Kconfig
··· 3 3 tristate "DPAA Ethernet" 4 4 depends on FSL_DPAA && FSL_FMAN 5 5 select PHYLIB 6 + select FIXED_PHY 6 7 select FSL_FMAN_MAC 7 8 ---help--- 8 9 Data Path Acceleration Architecture Ethernet driver,
+17 -12
drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
··· 86 86 for (i = 1; i < DPAA2_ETH_MAX_SG_ENTRIES; i++) { 87 87 addr = dpaa2_sg_get_addr(&sgt[i]); 88 88 sg_vaddr = dpaa2_iova_to_virt(priv->iommu_domain, addr); 89 - dma_unmap_page(dev, addr, DPAA2_ETH_RX_BUF_SIZE, 89 + dma_unmap_page(dev, addr, priv->rx_buf_size, 90 90 DMA_BIDIRECTIONAL); 91 91 92 92 free_pages((unsigned long)sg_vaddr, 0); ··· 144 144 /* Get the address and length from the S/G entry */ 145 145 sg_addr = dpaa2_sg_get_addr(sge); 146 146 sg_vaddr = dpaa2_iova_to_virt(priv->iommu_domain, sg_addr); 147 - dma_unmap_page(dev, sg_addr, DPAA2_ETH_RX_BUF_SIZE, 147 + dma_unmap_page(dev, sg_addr, priv->rx_buf_size, 148 148 DMA_BIDIRECTIONAL); 149 149 150 150 sg_length = dpaa2_sg_get_len(sge); ··· 185 185 (page_address(page) - page_address(head_page)); 186 186 187 187 skb_add_rx_frag(skb, i - 1, head_page, page_offset, 188 - sg_length, DPAA2_ETH_RX_BUF_SIZE); 188 + sg_length, priv->rx_buf_size); 189 189 } 190 190 191 191 if (dpaa2_sg_is_final(sge)) ··· 211 211 212 212 for (i = 0; i < count; i++) { 213 213 vaddr = dpaa2_iova_to_virt(priv->iommu_domain, buf_array[i]); 214 - dma_unmap_page(dev, buf_array[i], DPAA2_ETH_RX_BUF_SIZE, 214 + dma_unmap_page(dev, buf_array[i], priv->rx_buf_size, 215 215 DMA_BIDIRECTIONAL); 216 216 free_pages((unsigned long)vaddr, 0); 217 217 } ··· 367 367 break; 368 368 case XDP_REDIRECT: 369 369 dma_unmap_page(priv->net_dev->dev.parent, addr, 370 - DPAA2_ETH_RX_BUF_SIZE, DMA_BIDIRECTIONAL); 370 + priv->rx_buf_size, DMA_BIDIRECTIONAL); 371 371 ch->buf_count--; 372 372 373 373 /* Allow redirect use of full headroom */ ··· 410 410 trace_dpaa2_rx_fd(priv->net_dev, fd); 411 411 412 412 vaddr = dpaa2_iova_to_virt(priv->iommu_domain, addr); 413 - dma_sync_single_for_cpu(dev, addr, DPAA2_ETH_RX_BUF_SIZE, 413 + dma_sync_single_for_cpu(dev, addr, priv->rx_buf_size, 414 414 DMA_BIDIRECTIONAL); 415 415 416 416 fas = dpaa2_get_fas(vaddr, false); ··· 429 429 return; 430 430 } 431 431 432 - dma_unmap_page(dev, addr, DPAA2_ETH_RX_BUF_SIZE, 432 + 
dma_unmap_page(dev, addr, priv->rx_buf_size, 433 433 DMA_BIDIRECTIONAL); 434 434 skb = build_linear_skb(ch, fd, vaddr); 435 435 } else if (fd_format == dpaa2_fd_sg) { 436 436 WARN_ON(priv->xdp_prog); 437 437 438 - dma_unmap_page(dev, addr, DPAA2_ETH_RX_BUF_SIZE, 438 + dma_unmap_page(dev, addr, priv->rx_buf_size, 439 439 DMA_BIDIRECTIONAL); 440 440 skb = build_frag_skb(priv, ch, buf_data); 441 441 free_pages((unsigned long)vaddr, 0); ··· 1011 1011 if (!page) 1012 1012 goto err_alloc; 1013 1013 1014 - addr = dma_map_page(dev, page, 0, DPAA2_ETH_RX_BUF_SIZE, 1014 + addr = dma_map_page(dev, page, 0, priv->rx_buf_size, 1015 1015 DMA_BIDIRECTIONAL); 1016 1016 if (unlikely(dma_mapping_error(dev, addr))) 1017 1017 goto err_map; ··· 1021 1021 /* tracing point */ 1022 1022 trace_dpaa2_eth_buf_seed(priv->net_dev, 1023 1023 page, DPAA2_ETH_RX_BUF_RAW_SIZE, 1024 - addr, DPAA2_ETH_RX_BUF_SIZE, 1024 + addr, priv->rx_buf_size, 1025 1025 bpid); 1026 1026 } 1027 1027 ··· 1757 1757 int mfl, linear_mfl; 1758 1758 1759 1759 mfl = DPAA2_ETH_L2_MAX_FRM(mtu); 1760 - linear_mfl = DPAA2_ETH_RX_BUF_SIZE - DPAA2_ETH_RX_HWA_SIZE - 1760 + linear_mfl = priv->rx_buf_size - DPAA2_ETH_RX_HWA_SIZE - 1761 1761 dpaa2_eth_rx_head_room(priv) - XDP_PACKET_HEADROOM; 1762 1762 1763 1763 if (mfl > linear_mfl) { ··· 2492 2492 else 2493 2493 rx_buf_align = DPAA2_ETH_RX_BUF_ALIGN; 2494 2494 2495 + /* We need to ensure that the buffer size seen by WRIOP is a multiple 2496 + * of 64 or 256 bytes depending on the WRIOP version. 
2497 + */ 2498 + priv->rx_buf_size = ALIGN_DOWN(DPAA2_ETH_RX_BUF_SIZE, rx_buf_align); 2499 + 2495 2500 /* tx buffer */ 2496 2501 buf_layout.private_data_size = DPAA2_ETH_SWA_SIZE; 2497 2502 buf_layout.pass_timestamp = true; ··· 3182 3177 pools_params.num_dpbp = 1; 3183 3178 pools_params.pools[0].dpbp_id = priv->dpbp_dev->obj_desc.id; 3184 3179 pools_params.pools[0].backup_pool = 0; 3185 - pools_params.pools[0].buffer_size = DPAA2_ETH_RX_BUF_SIZE; 3180 + pools_params.pools[0].buffer_size = priv->rx_buf_size; 3186 3181 err = dpni_set_pools(priv->mc_io, 0, priv->mc_token, &pools_params); 3187 3182 if (err) { 3188 3183 dev_err(dev, "dpni_set_pools() failed\n");
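The dpaa2 change above computes `priv->rx_buf_size = ALIGN_DOWN(DPAA2_ETH_RX_BUF_SIZE, rx_buf_align)` so the buffer size programmed into WRIOP is a multiple of the 64- or 256-byte alignment it expects. A sketch of the rounding for the power-of-two case (the kernel's macro is defined differently in `include/linux/kernel.h` but has the same effect here):

```c
#include <assert.h>

/* Round x down to the nearest multiple of a; valid when a is a power of
 * two, which holds for the 64/256-byte WRIOP alignments in the diff. */
#define ALIGN_DOWN(x, a)	((x) & ~((unsigned long)(a) - 1))
```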
+1
drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
··· 393 393 u16 tx_data_offset; 394 394 395 395 struct fsl_mc_device *dpbp_dev; 396 + u16 rx_buf_size; 396 397 u16 bpid; 397 398 struct iommu_domain *iommu_domain; 398 399
+1 -1
drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c
··· 635 635 636 636 static int update_cls_rule(struct net_device *net_dev, 637 637 struct ethtool_rx_flow_spec *new_fs, 638 - int location) 638 + unsigned int location) 639 639 { 640 640 struct dpaa2_eth_priv *priv = netdev_priv(net_dev); 641 641 struct dpaa2_eth_cls_rule *rule;
+1 -1
drivers/net/ethernet/hisilicon/Kconfig
··· 64 64 the PHY 65 65 66 66 config HNS 67 - tristate "Hisilicon Network Subsystem Support (Framework)" 67 + tristate 68 68 ---help--- 69 69 This selects the framework support for Hisilicon Network Subsystem. It 70 70 is needed by any driver which provides HNS acceleration engine or make
+12 -4
drivers/net/ethernet/huawei/hinic/hinic_hw_mgmt.c
··· 45 45 46 46 #define MGMT_MSG_TIMEOUT 5000 47 47 48 + #define SET_FUNC_PORT_MGMT_TIMEOUT 25000 49 + 48 50 #define mgmt_to_pfhwdev(pf_mgmt) \ 49 51 container_of(pf_mgmt, struct hinic_pfhwdev, pf_to_mgmt) 50 52 ··· 240 238 u8 *buf_in, u16 in_size, 241 239 u8 *buf_out, u16 *out_size, 242 240 enum mgmt_direction_type direction, 243 - u16 resp_msg_id) 241 + u16 resp_msg_id, u32 timeout) 244 242 { 245 243 struct hinic_hwif *hwif = pf_to_mgmt->hwif; 246 244 struct pci_dev *pdev = hwif->pdev; 247 245 struct hinic_recv_msg *recv_msg; 248 246 struct completion *recv_done; 247 + unsigned long timeo; 249 248 u16 msg_id; 250 249 int err; 251 250 ··· 270 267 goto unlock_sync_msg; 271 268 } 272 269 273 - if (!wait_for_completion_timeout(recv_done, 274 - msecs_to_jiffies(MGMT_MSG_TIMEOUT))) { 270 + timeo = msecs_to_jiffies(timeout ? timeout : MGMT_MSG_TIMEOUT); 271 + 272 + if (!wait_for_completion_timeout(recv_done, timeo)) { 275 273 dev_err(&pdev->dev, "MGMT timeout, MSG id = %d\n", msg_id); 276 274 err = -ETIMEDOUT; 277 275 goto unlock_sync_msg; ··· 346 342 { 347 343 struct hinic_hwif *hwif = pf_to_mgmt->hwif; 348 344 struct pci_dev *pdev = hwif->pdev; 345 + u32 timeout = 0; 349 346 350 347 if (sync != HINIC_MGMT_MSG_SYNC) { 351 348 dev_err(&pdev->dev, "Invalid MGMT msg type\n"); ··· 358 353 return -EINVAL; 359 354 } 360 355 356 + if (cmd == HINIC_PORT_CMD_SET_FUNC_STATE) 357 + timeout = SET_FUNC_PORT_MGMT_TIMEOUT; 358 + 361 359 if (HINIC_IS_VF(hwif)) 362 360 return hinic_mbox_to_pf(pf_to_mgmt->hwdev, mod, cmd, buf_in, 363 361 in_size, buf_out, out_size, 0); 364 362 else 365 363 return msg_to_mgmt_sync(pf_to_mgmt, mod, cmd, buf_in, in_size, 366 364 buf_out, out_size, MGMT_DIRECT_SEND, 367 - MSG_NOT_RESP); 365 + MSG_NOT_RESP, timeout); 368 366 } 369 367 370 368 /**
+2 -14
drivers/net/ethernet/huawei/hinic/hinic_main.c
··· 488 488 { 489 489 struct hinic_dev *nic_dev = netdev_priv(netdev); 490 490 unsigned int flags; 491 - int err; 492 491 493 492 down(&nic_dev->mgmt_lock); 494 493 ··· 504 505 if (!HINIC_IS_VF(nic_dev->hwdev->hwif)) 505 506 hinic_notify_all_vfs_link_changed(nic_dev->hwdev, 0); 506 507 507 - err = hinic_port_set_func_state(nic_dev, HINIC_FUNC_PORT_DISABLE); 508 - if (err) { 509 - netif_err(nic_dev, drv, netdev, 510 - "Failed to set func port state\n"); 511 - nic_dev->flags |= (flags & HINIC_INTF_UP); 512 - return err; 513 - } 508 + hinic_port_set_state(nic_dev, HINIC_PORT_DISABLE); 514 509 515 - err = hinic_port_set_state(nic_dev, HINIC_PORT_DISABLE); 516 - if (err) { 517 - netif_err(nic_dev, drv, netdev, "Failed to set port state\n"); 518 - nic_dev->flags |= (flags & HINIC_INTF_UP); 519 - return err; 520 - } 510 + hinic_port_set_func_state(nic_dev, HINIC_FUNC_PORT_DISABLE); 521 511 522 512 if (nic_dev->flags & HINIC_RSS_ENABLE) { 523 513 hinic_rss_deinit(nic_dev);
+6 -2
drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c
··· 497 497 498 498 hw->irq_name = devm_kmalloc_array(&hw->pdev->dev, num_vec, NAME_SIZE, 499 499 GFP_KERNEL); 500 - if (!hw->irq_name) 500 + if (!hw->irq_name) { 501 + err = -ENOMEM; 501 502 goto err_free_netdev; 503 + } 502 504 503 505 hw->affinity_mask = devm_kcalloc(&hw->pdev->dev, num_vec, 504 506 sizeof(cpumask_var_t), GFP_KERNEL); 505 - if (!hw->affinity_mask) 507 + if (!hw->affinity_mask) { 508 + err = -ENOMEM; 506 509 goto err_free_netdev; 510 + } 507 511 508 512 err = pci_alloc_irq_vectors(hw->pdev, num_vec, num_vec, PCI_IRQ_MSIX); 509 513 if (err < 0) {
+4 -1
drivers/net/ethernet/microchip/encx24j600.c
··· 1062 1062 if (unlikely(ret)) { 1063 1063 netif_err(priv, probe, ndev, "Error %d initializing card encx24j600 card\n", 1064 1064 ret); 1065 - goto out_free; 1065 + goto out_stop; 1066 1066 } 1067 1067 1068 1068 eidled = encx24j600_read_reg(priv, EIDLED); ··· 1080 1080 1081 1081 out_unregister: 1082 1082 unregister_netdev(priv->ndev); 1083 + out_stop: 1084 + kthread_stop(priv->kworker_task); 1083 1085 out_free: 1084 1086 free_netdev(ndev); 1085 1087 ··· 1094 1092 struct encx24j600_priv *priv = dev_get_drvdata(&spi->dev); 1095 1093 1096 1094 unregister_netdev(priv->ndev); 1095 + kthread_stop(priv->kworker_task); 1097 1096 1098 1097 free_netdev(priv->ndev); 1099 1098
+3 -1
drivers/net/ethernet/netronome/nfp/abm/main.c
··· 333 333 goto err_free_alink; 334 334 335 335 alink->prio_map = kzalloc(abm->prio_map_len, GFP_KERNEL); 336 - if (!alink->prio_map) 336 + if (!alink->prio_map) { 337 + err = -ENOMEM; 337 338 goto err_free_alink; 339 + } 338 340 339 341 /* This is a multi-host app, make sure MAC/PHY is up, but don't 340 342 * make the MAC/PHY state follow the state of any of the ports.
+12 -7
drivers/net/ethernet/pensando/ionic/ionic_lif.c
··· 2175 2175 dev_info(ionic->dev, "FW Up: restarting LIFs\n"); 2176 2176 2177 2177 ionic_init_devinfo(ionic); 2178 + ionic_port_init(ionic); 2178 2179 err = ionic_qcqs_alloc(lif); 2179 2180 if (err) 2180 2181 goto err_out; ··· 2407 2406 if (is_zero_ether_addr(ctx.comp.lif_getattr.mac)) 2408 2407 return 0; 2409 2408 2410 - if (!ether_addr_equal(ctx.comp.lif_getattr.mac, netdev->dev_addr)) { 2409 + if (!is_zero_ether_addr(netdev->dev_addr)) { 2410 + /* If the netdev mac is non-zero and doesn't match the default 2411 + * device address, it was set by something earlier and we're 2412 + * likely here again after a fw-upgrade reset. We need to be 2413 + * sure the netdev mac is in our filter list. 2414 + */ 2415 + if (!ether_addr_equal(ctx.comp.lif_getattr.mac, 2416 + netdev->dev_addr)) 2417 + ionic_lif_addr(lif, netdev->dev_addr, true); 2418 + } else { 2419 + /* Update the netdev mac with the device's mac */ 2411 2420 memcpy(addr.sa_data, ctx.comp.lif_getattr.mac, netdev->addr_len); 2412 2421 addr.sa_family = AF_INET; 2413 2422 err = eth_prepare_mac_addr_change(netdev, &addr); ··· 2425 2414 netdev_warn(lif->netdev, "ignoring bad MAC addr from NIC %pM - err %d\n", 2426 2415 addr.sa_data, err); 2427 2416 return 0; 2428 - } 2429 - 2430 - if (!is_zero_ether_addr(netdev->dev_addr)) { 2431 - netdev_dbg(lif->netdev, "deleting station MAC addr %pM\n", 2432 - netdev->dev_addr); 2433 - ionic_lif_addr(lif, netdev->dev_addr, false); 2434 2417 } 2435 2418 2436 2419 eth_commit_mac_addr_change(netdev, &addr);
+9 -9
drivers/net/ethernet/pensando/ionic/ionic_main.c
··· 512 512 size_t sz; 513 513 int err; 514 514 515 - if (idev->port_info) 516 - return 0; 517 - 518 - idev->port_info_sz = ALIGN(sizeof(*idev->port_info), PAGE_SIZE); 519 - idev->port_info = dma_alloc_coherent(ionic->dev, idev->port_info_sz, 520 - &idev->port_info_pa, 521 - GFP_KERNEL); 522 515 if (!idev->port_info) { 523 - dev_err(ionic->dev, "Failed to allocate port info, aborting\n"); 524 - return -ENOMEM; 516 + idev->port_info_sz = ALIGN(sizeof(*idev->port_info), PAGE_SIZE); 517 + idev->port_info = dma_alloc_coherent(ionic->dev, 518 + idev->port_info_sz, 519 + &idev->port_info_pa, 520 + GFP_KERNEL); 521 + if (!idev->port_info) { 522 + dev_err(ionic->dev, "Failed to allocate port info\n"); 523 + return -ENOMEM; 524 + } 525 525 } 526 526 527 527 sz = min(sizeof(ident->port.config), sizeof(idev->dev_cmd_regs->data));
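The ionic_main.c rework above makes port-info setup idempotent: since the init path may now run again after a firmware-upgrade reset, the buffer is allocated only when it does not already exist, and repeated calls hand back the same memory. A hedged sketch of that allocate-once shape, with `calloc()` standing in for `dma_alloc_coherent()` and a hypothetical accessor name:

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for idev->port_info; calloc() replaces dma_alloc_coherent(). */
static void *port_info;

/* Allocate on first use only; later calls return the existing buffer,
 * so re-running init after a reset does not leak or re-map anything. */
static void *port_info_get(size_t sz)
{
	if (!port_info)
		port_info = calloc(1, sz);
	return port_info;
}
```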
+2
drivers/net/ethernet/realtek/r8169_main.c
··· 2048 2048 { 0x7cf, 0x348, RTL_GIGA_MAC_VER_07 }, 2049 2049 { 0x7cf, 0x248, RTL_GIGA_MAC_VER_07 }, 2050 2050 { 0x7cf, 0x340, RTL_GIGA_MAC_VER_13 }, 2051 + /* RTL8401, reportedly works if treated as RTL8101e */ 2052 + { 0x7cf, 0x240, RTL_GIGA_MAC_VER_13 }, 2051 2053 { 0x7cf, 0x343, RTL_GIGA_MAC_VER_10 }, 2052 2054 { 0x7cf, 0x342, RTL_GIGA_MAC_VER_16 }, 2053 2055 { 0x7c8, 0x348, RTL_GIGA_MAC_VER_09 },
+15 -2
drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c
··· 75 75 unsigned int value; 76 76 }; 77 77 78 + struct ethqos_emac_driver_data { 79 + const struct ethqos_emac_por *por; 80 + unsigned int num_por; 81 + }; 82 + 78 83 struct qcom_ethqos { 79 84 struct platform_device *pdev; 80 85 void __iomem *rgmii_base; ··· 174 169 { .offset = SDCC_HC_REG_DLL_CONFIG2, .value = 0x00200000 }, 175 170 { .offset = SDCC_USR_CTL, .value = 0x00010800 }, 176 171 { .offset = RGMII_IO_MACRO_CONFIG2, .value = 0x00002060 }, 172 + }; 173 + 174 + static const struct ethqos_emac_driver_data emac_v2_3_0_data = { 175 + .por = emac_v2_3_0_por, 176 + .num_por = ARRAY_SIZE(emac_v2_3_0_por), 177 177 }; 178 178 179 179 static int ethqos_dll_configure(struct qcom_ethqos *ethqos) ··· 452 442 struct device_node *np = pdev->dev.of_node; 453 443 struct plat_stmmacenet_data *plat_dat; 454 444 struct stmmac_resources stmmac_res; 445 + const struct ethqos_emac_driver_data *data; 455 446 struct qcom_ethqos *ethqos; 456 447 struct resource *res; 457 448 int ret; ··· 482 471 goto err_mem; 483 472 } 484 473 485 - ethqos->por = of_device_get_match_data(&pdev->dev); 474 + data = of_device_get_match_data(&pdev->dev); 475 + ethqos->por = data->por; 476 + ethqos->num_por = data->num_por; 486 477 487 478 ethqos->rgmii_clk = devm_clk_get(&pdev->dev, "rgmii"); 488 479 if (IS_ERR(ethqos->rgmii_clk)) { ··· 539 526 } 540 527 541 528 static const struct of_device_id qcom_ethqos_match[] = { 542 - { .compatible = "qcom,qcs404-ethqos", .data = &emac_v2_3_0_por}, 529 + { .compatible = "qcom,qcs404-ethqos", .data = &emac_v2_3_0_data}, 543 530 { } 544 531 }; 545 532 MODULE_DEVICE_TABLE(of, qcom_ethqos_match);
+6 -10
drivers/net/ethernet/ti/Kconfig
··· 49 49 config TI_CPSW 50 50 tristate "TI CPSW Switch Support" 51 51 depends on ARCH_DAVINCI || ARCH_OMAP2PLUS || COMPILE_TEST 52 + depends on TI_CPTS || !TI_CPTS 52 53 select TI_DAVINCI_MDIO 53 54 select MFD_SYSCON 54 55 select PAGE_POOL ··· 65 64 tristate "TI CPSW Switch Support with switchdev" 66 65 depends on ARCH_DAVINCI || ARCH_OMAP2PLUS || COMPILE_TEST 67 66 depends on NET_SWITCHDEV 67 + depends on TI_CPTS || !TI_CPTS 68 68 select PAGE_POOL 69 69 select TI_DAVINCI_MDIO 70 70 select MFD_SYSCON ··· 79 77 will be called cpsw_new. 80 78 81 79 config TI_CPTS 82 - bool "TI Common Platform Time Sync (CPTS) Support" 83 - depends on TI_CPSW || TI_KEYSTONE_NETCP || TI_CPSW_SWITCHDEV || COMPILE_TEST 80 + tristate "TI Common Platform Time Sync (CPTS) Support" 81 + depends on ARCH_OMAP2PLUS || ARCH_KEYSTONE || COMPILE_TEST 84 82 depends on COMMON_CLK 85 - depends on POSIX_TIMERS 83 + depends on PTP_1588_CLOCK 86 84 ---help--- 87 85 This driver supports the Common Platform Time Sync unit of 88 86 the CPSW Ethernet Switch and Keystone 2 1g/10g Switch Subsystem. 89 87 The unit can time stamp PTP UDP/IPv4 and Layer 2 packets, and the 90 88 driver offers a PTP Hardware Clock. 91 - 92 - config TI_CPTS_MOD 93 - tristate 94 - depends on TI_CPTS 95 - depends on PTP_1588_CLOCK 96 - default y if TI_CPSW=y || TI_KEYSTONE_NETCP=y || TI_CPSW_SWITCHDEV=y 97 - default m 98 89 99 90 config TI_K3_AM65_CPSW_NUSS 100 91 tristate "TI K3 AM654x/J721E CPSW Ethernet driver" ··· 132 137 select TI_DAVINCI_MDIO 133 138 depends on OF 134 139 depends on KEYSTONE_NAVIGATOR_DMA && KEYSTONE_NAVIGATOR_QMSS 140 + depends on TI_CPTS || !TI_CPTS 135 141 ---help--- 136 142 This driver supports TI's Keystone NETCP Core. 137 143
+1 -1
drivers/net/ethernet/ti/Makefile
··· 13 13 ti_davinci_emac-y := davinci_emac.o davinci_cpdma.o 14 14 obj-$(CONFIG_TI_DAVINCI_MDIO) += davinci_mdio.o 15 15 obj-$(CONFIG_TI_CPSW_PHY_SEL) += cpsw-phy-sel.o 16 - obj-$(CONFIG_TI_CPTS_MOD) += cpts.o 16 + obj-$(CONFIG_TI_CPTS) += cpts.o 17 17 obj-$(CONFIG_TI_CPSW) += ti_cpsw.o 18 18 ti_cpsw-y := cpsw.o davinci_cpdma.o cpsw_ale.o cpsw_priv.o cpsw_sl.o cpsw_ethtool.o 19 19 obj-$(CONFIG_TI_CPSW_SWITCHDEV) += ti_cpsw_new.o
+2 -1
drivers/net/hamradio/bpqether.c
··· 146 146 { 147 147 struct bpqdev *bpq; 148 148 149 - list_for_each_entry_rcu(bpq, &bpq_devices, bpq_list) { 149 + list_for_each_entry_rcu(bpq, &bpq_devices, bpq_list, 150 + lockdep_rtnl_is_held()) { 150 151 if (bpq->ethdev == dev) 151 152 return bpq->axdev; 152 153 }
+3 -2
drivers/net/ipa/gsi_trans.c
··· 399 399 /* assert(which < trans->tre_count); */ 400 400 401 401 /* Set the page information for the buffer. We also need to fill in 402 - * the DMA address for the buffer (something dma_map_sg() normally 403 - * does). 402 + * the DMA address and length for the buffer (something dma_map_sg() 403 + * normally does). 404 404 */ 405 405 sg = &trans->sgl[which]; 406 406 407 407 sg_set_buf(sg, buf, size); 408 408 sg_dma_address(sg) = addr; 409 + sg_dma_len(sg) = sg->length; 409 410 410 411 info = &trans->info[which]; 411 412 info->opcode = opcode;
+3 -11
drivers/net/ipa/ipa_cmd.c
··· 569 569 570 570 void ipa_cmd_tag_process_add(struct gsi_trans *trans) 571 571 { 572 - ipa_cmd_register_write_add(trans, 0, 0, 0, true); 573 - #if 1 574 - /* Reference these functions to avoid a compile error */ 575 - (void)ipa_cmd_ip_packet_init_add; 576 - (void)ipa_cmd_ip_tag_status_add; 577 - (void) ipa_cmd_transfer_add; 578 - #else 579 572 struct ipa *ipa = container_of(trans->gsi, struct ipa, gsi); 580 - struct gsi_endpoint *endpoint; 573 + struct ipa_endpoint *endpoint; 581 574 582 575 endpoint = ipa->name_map[IPA_ENDPOINT_AP_LAN_RX]; 576 + 577 + ipa_cmd_register_write_add(trans, 0, 0, 0, true); 583 578 ipa_cmd_ip_packet_init_add(trans, endpoint->endpoint_id); 584 - 585 579 ipa_cmd_ip_tag_status_add(trans, 0xcba987654321); 586 - 587 580 ipa_cmd_transfer_add(trans, 4); 588 - #endif 589 581 } 590 582 591 583 /* Returns the number of commands required for the tag process */
+1 -1
drivers/net/ipa/ipa_smp2p.c
··· 53 53 * @clock_on: Whether IPA clock is on 54 54 * @notified: Whether modem has been notified of clock state 55 55 * @disabled: Whether setup ready interrupt handling is disabled 56 - * @mutex mutex: Motex protecting ready interrupt/shutdown interlock 56 + * @mutex: Mutex protecting ready-interrupt/shutdown interlock 57 57 * @panic_notifier: Panic notifier structure 58 58 */ 59 59 struct ipa_smp2p {
+5 -3
drivers/net/phy/phy.c
··· 1240 1240 /* Restart autonegotiation so the new modes get sent to the 1241 1241 * link partner. 1242 1242 */ 1243 - ret = phy_restart_aneg(phydev); 1244 - if (ret < 0) 1245 - return ret; 1243 + if (phydev->autoneg == AUTONEG_ENABLE) { 1244 + ret = phy_restart_aneg(phydev); 1245 + if (ret < 0) 1246 + return ret; 1247 + } 1246 1248 } 1247 1249 1248 1250 return 0;
+3
drivers/net/ppp/pppoe.c
··· 490 490 if (!skb) 491 491 goto out; 492 492 493 + if (skb->pkt_type != PACKET_HOST) 494 + goto abort; 495 + 493 496 if (!pskb_may_pull(skb, sizeof(struct pppoe_hdr))) 494 497 goto abort; 495 498
+1 -1
drivers/net/usb/hso.c
··· 2659 2659 if (! 2660 2660 (serial->out_endp = 2661 2661 hso_get_ep(interface, USB_ENDPOINT_XFER_BULK, USB_DIR_OUT))) { 2662 - dev_err(&interface->dev, "Failed to find BULK IN ep\n"); 2662 + dev_err(&interface->dev, "Failed to find BULK OUT ep\n"); 2663 2663 goto exit2; 2664 2664 } 2665 2665
+4 -2
drivers/net/virtio_net.c
··· 1252 1252 break; 1253 1253 } while (rq->vq->num_free); 1254 1254 if (virtqueue_kick_prepare(rq->vq) && virtqueue_notify(rq->vq)) { 1255 - u64_stats_update_begin(&rq->stats.syncp); 1255 + unsigned long flags; 1256 + 1257 + flags = u64_stats_update_begin_irqsave(&rq->stats.syncp); 1256 1258 rq->stats.kicks++; 1257 - u64_stats_update_end(&rq->stats.syncp); 1259 + u64_stats_update_end_irqrestore(&rq->stats.syncp, flags); 1258 1260 } 1259 1261 1260 1262 return !oom;
+1 -1
drivers/nvme/host/core.c
··· 1110 1110 * Don't treat an error as fatal, as we potentially already 1111 1111 * have a NGUID or EUI-64. 1112 1112 */ 1113 - if (status > 0) 1113 + if (status > 0 && !(status & NVME_SC_DNR)) 1114 1114 status = 0; 1115 1115 goto free_data; 1116 1116 }
+5 -1
drivers/nvme/host/pci.c
··· 973 973 974 974 static inline void nvme_update_cq_head(struct nvme_queue *nvmeq) 975 975 { 976 - if (++nvmeq->cq_head == nvmeq->q_depth) { 976 + u16 tmp = nvmeq->cq_head + 1; 977 + 978 + if (tmp == nvmeq->q_depth) { 977 979 nvmeq->cq_head = 0; 978 980 nvmeq->cq_phase ^= 1; 981 + } else { 982 + nvmeq->cq_head = tmp; 979 983 } 980 984 } 981 985
+7
drivers/phy/qualcomm/phy-qcom-qusb2.c
··· 816 816 .compatible = "qcom,msm8998-qusb2-phy", 817 817 .data = &msm8998_phy_cfg, 818 818 }, { 819 + /* 820 + * Deprecated. Only here to support legacy device 821 + * trees that didn't include "qcom,qusb2-v2-phy" 822 + */ 823 + .compatible = "qcom,sdm845-qusb2-phy", 824 + .data = &qusb2_v2_phy_cfg, 825 + }, { 819 826 .compatible = "qcom,qusb2-v2-phy", 820 827 .data = &qusb2_v2_phy_cfg, 821 828 },
+21 -11
drivers/phy/qualcomm/phy-qcom-usb-hs-28nm.c
··· 160 160 ret = regulator_bulk_enable(VREG_NUM, priv->vregs); 161 161 if (ret) 162 162 return ret; 163 - ret = clk_bulk_prepare_enable(priv->num_clks, priv->clks); 164 - if (ret) 165 - goto err_disable_regulator; 163 + 166 164 qcom_snps_hsphy_disable_hv_interrupts(priv); 167 165 qcom_snps_hsphy_exit_retention(priv); 168 166 169 167 return 0; 170 - 171 - err_disable_regulator: 172 - regulator_bulk_disable(VREG_NUM, priv->vregs); 173 - 174 - return ret; 175 168 } 176 169 177 170 static int qcom_snps_hsphy_power_off(struct phy *phy) ··· 173 180 174 181 qcom_snps_hsphy_enter_retention(priv); 175 182 qcom_snps_hsphy_enable_hv_interrupts(priv); 176 - clk_bulk_disable_unprepare(priv->num_clks, priv->clks); 177 183 regulator_bulk_disable(VREG_NUM, priv->vregs); 178 184 179 185 return 0; ··· 258 266 struct hsphy_priv *priv = phy_get_drvdata(phy); 259 267 int ret; 260 268 261 - ret = qcom_snps_hsphy_reset(priv); 269 + ret = clk_bulk_prepare_enable(priv->num_clks, priv->clks); 262 270 if (ret) 263 271 return ret; 272 + 273 + ret = qcom_snps_hsphy_reset(priv); 274 + if (ret) 275 + goto disable_clocks; 264 276 265 277 qcom_snps_hsphy_init_sequence(priv); 266 278 267 279 ret = qcom_snps_hsphy_por_reset(priv); 268 280 if (ret) 269 - return ret; 281 + goto disable_clocks; 282 + 283 + return 0; 284 + 285 + disable_clocks: 286 + clk_bulk_disable_unprepare(priv->num_clks, priv->clks); 287 + return ret; 288 + } 289 + 290 + static int qcom_snps_hsphy_exit(struct phy *phy) 291 + { 292 + struct hsphy_priv *priv = phy_get_drvdata(phy); 293 + 294 + clk_bulk_disable_unprepare(priv->num_clks, priv->clks); 270 295 271 296 return 0; 272 297 } 273 298 274 299 static const struct phy_ops qcom_snps_hsphy_ops = { 275 300 .init = qcom_snps_hsphy_init, 301 + .exit = qcom_snps_hsphy_exit, 276 302 .power_on = qcom_snps_hsphy_power_on, 277 303 .power_off = qcom_snps_hsphy_power_off, 278 304 .set_mode = qcom_snps_hsphy_set_mode,
+11 -14
drivers/regulator/core.c
··· 5754 5754 5755 5755 static int __init regulator_init_complete(void) 5756 5756 { 5757 - int delay = driver_deferred_probe_timeout; 5758 - 5759 - if (delay < 0) 5760 - delay = 0; 5761 5757 /* 5762 5758 * Since DT doesn't provide an idiomatic mechanism for 5763 5759 * enabling full constraints and since it's much more natural ··· 5764 5768 has_full_constraints = true; 5765 5769 5766 5770 /* 5767 - * If driver_deferred_probe_timeout is set, we punt 5768 - * completion for that many seconds since systems like 5769 - * distros will load many drivers from userspace so consumers 5770 - * might not always be ready yet, this is particularly an 5771 - * issue with laptops where this might bounce the display off 5772 - * then on. Ideally we'd get a notification from userspace 5773 - * when this happens but we don't so just wait a bit and hope 5774 - * we waited long enough. It'd be better if we'd only do 5775 - * this on systems that need it. 5771 + * We punt completion for an arbitrary amount of time since 5772 + * systems like distros will load many drivers from userspace 5773 + * so consumers might not always be ready yet, this is 5774 + * particularly an issue with laptops where this might bounce 5775 + * the display off then on. Ideally we'd get a notification 5776 + * from userspace when this happens but we don't so just wait 5777 + * a bit and hope we waited long enough. It'd be better if 5778 + * we'd only do this on systems that need it, and a kernel 5779 + * command line option might be useful. 5776 5780 */ 5777 - schedule_delayed_work(&regulator_init_complete_work, delay * HZ); 5781 + schedule_delayed_work(&regulator_init_complete_work, 5782 + msecs_to_jiffies(30000)); 5778 5783 5779 5784 return 0; 5780 5785 }
+3 -1
drivers/s390/net/ism_drv.c
··· 521 521 522 522 ism->smcd = smcd_alloc_dev(&pdev->dev, dev_name(&pdev->dev), &ism_ops, 523 523 ISM_NR_DMBS); 524 - if (!ism->smcd) 524 + if (!ism->smcd) { 525 + ret = -ENOMEM; 525 526 goto err_resource; 527 + } 526 528 527 529 ism->smcd->priv = ism; 528 530 ret = ism_dev_init(ism);
+5
drivers/scsi/ibmvscsi/ibmvfc.c
··· 3640 3640 struct ibmvfc_host *vhost = tgt->vhost; 3641 3641 struct ibmvfc_event *evt; 3642 3642 3643 + if (!vhost->logged_in) { 3644 + ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_DEL_RPORT); 3645 + return; 3646 + } 3647 + 3643 3648 if (vhost->discovery_threads >= disc_threads) 3644 3649 return; 3645 3650
-4
drivers/scsi/ibmvscsi/ibmvscsi.c
··· 2320 2320 static int ibmvscsi_remove(struct vio_dev *vdev) 2321 2321 { 2322 2322 struct ibmvscsi_host_data *hostdata = dev_get_drvdata(&vdev->dev); 2323 - unsigned long flags; 2324 2323 2325 2324 srp_remove_host(hostdata->host); 2326 2325 scsi_remove_host(hostdata->host); 2327 2326 2328 2327 purge_requests(hostdata, DID_ERROR); 2329 - 2330 - spin_lock_irqsave(hostdata->host->host_lock, flags); 2331 2328 release_event_pool(&hostdata->pool, hostdata); 2332 - spin_unlock_irqrestore(hostdata->host->host_lock, flags); 2333 2329 2334 2330 ibmvscsi_release_crq_queue(&hostdata->queue, hostdata, 2335 2331 max_events);
+1 -1
drivers/scsi/qla2xxx/qla_attr.c
··· 3031 3031 test_bit(FCPORT_UPDATE_NEEDED, &vha->dpc_flags)) 3032 3032 msleep(1000); 3033 3033 3034 - qla_nvme_delete(vha); 3035 3034 3036 3035 qla24xx_disable_vp(vha); 3037 3036 qla2x00_wait_for_sess_deletion(vha); 3038 3037 3038 + qla_nvme_delete(vha); 3039 3039 vha->flags.delete_progress = 1; 3040 3040 3041 3041 qlt_remove_target(ha, vha);
+1 -1
drivers/scsi/qla2xxx/qla_mbx.c
··· 3153 3153 ql_dbg(ql_dbg_mbx + ql_dbg_verbose, vha, 0x108c, 3154 3154 "Entered %s.\n", __func__); 3155 3155 3156 - if (vha->flags.qpairs_available && sp->qpair) 3156 + if (sp->qpair) 3157 3157 req = sp->qpair->req; 3158 3158 else 3159 3159 return QLA_FUNCTION_FAILED;
+4
drivers/staging/gasket/gasket_core.c
··· 925 925 gasket_get_bar_index(gasket_dev, 926 926 (vma->vm_pgoff << PAGE_SHIFT) + 927 927 driver_desc->legacy_mmap_address_offset); 928 + 929 + if (bar_index < 0) 930 + return DO_MAP_REGION_INVALID; 931 + 928 932 phys_base = gasket_dev->bar_data[bar_index].phys_base + phys_offset; 929 933 while (mapped_bytes < map_length) { 930 934 /*
-1
drivers/staging/ks7010/TODO
··· 30 30 31 31 Please send any patches to: 32 32 Greg Kroah-Hartman <gregkh@linuxfoundation.org> 33 - Wolfram Sang <wsa@the-dreams.de> 34 33 Linux Driver Project Developer List <driverdev-devel@linuxdriverproject.org>
+3
drivers/thunderbolt/usb4.c
··· 182 182 return ret; 183 183 184 184 ret = tb_sw_read(sw, &val, TB_CFG_SWITCH, ROUTER_CS_26, 1); 185 + if (ret) 186 + return ret; 187 + 185 188 if (val & ROUTER_CS_26_ONS) 186 189 return -EOPNOTSUPP; 187 190
+1 -3
drivers/tty/serial/bcm63xx_uart.c
··· 843 843 if (IS_ERR(clk) && pdev->dev.of_node) 844 844 clk = of_clk_get(pdev->dev.of_node, 0); 845 845 846 - if (IS_ERR(clk)) { 847 - clk_put(clk); 846 + if (IS_ERR(clk)) 848 847 return -ENODEV; 849 - } 850 848 851 849 port->iotype = UPIO_MEM; 852 850 port->irq = res_irq->start;
+1
drivers/tty/serial/xilinx_uartps.c
··· 1459 1459 cdns_uart_uart_driver.nr = CDNS_UART_NR_PORTS; 1460 1460 #ifdef CONFIG_SERIAL_XILINX_PS_UART_CONSOLE 1461 1461 cdns_uart_uart_driver.cons = &cdns_uart_console; 1462 + cdns_uart_console.index = id; 1462 1463 #endif 1463 1464 1464 1465 rc = uart_register_driver(&cdns_uart_uart_driver);
+7 -2
drivers/tty/vt/vt.c
··· 365 365 return uniscr; 366 366 } 367 367 368 + static void vc_uniscr_free(struct uni_screen *uniscr) 369 + { 370 + vfree(uniscr); 371 + } 372 + 368 373 static void vc_uniscr_set(struct vc_data *vc, struct uni_screen *new_uniscr) 369 374 { 370 - vfree(vc->vc_uni_screen); 375 + vc_uniscr_free(vc->vc_uni_screen); 371 376 vc->vc_uni_screen = new_uniscr; 372 377 } 373 378 ··· 1235 1230 err = resize_screen(vc, new_cols, new_rows, user); 1236 1231 if (err) { 1237 1232 kfree(newscreen); 1238 - kfree(new_uniscr); 1233 + vc_uniscr_free(new_uniscr); 1239 1234 return err; 1240 1235 } 1241 1236
+1 -1
drivers/usb/chipidea/ci_hdrc_msm.c
··· 114 114 hw_write_id_reg(ci, HS_PHY_GENCONFIG_2, 115 115 HS_PHY_ULPI_TX_PKT_EN_CLR_FIX, 0); 116 116 117 - if (!IS_ERR(ci->platdata->vbus_extcon.edev)) { 117 + if (!IS_ERR(ci->platdata->vbus_extcon.edev) || ci->role_switch) { 118 118 hw_write_id_reg(ci, HS_PHY_GENCONFIG_2, 119 119 HS_PHY_SESS_VLD_CTRL_EN, 120 120 HS_PHY_SESS_VLD_CTRL_EN);
+2 -3
drivers/usb/core/devio.c
··· 217 217 { 218 218 struct usb_memory *usbm = NULL; 219 219 struct usb_dev_state *ps = file->private_data; 220 + struct usb_hcd *hcd = bus_to_hcd(ps->dev->bus); 220 221 size_t size = vma->vm_end - vma->vm_start; 221 222 void *mem; 222 223 unsigned long flags; ··· 251 250 usbm->vma_use_count = 1; 252 251 INIT_LIST_HEAD(&usbm->memlist); 253 252 254 - if (remap_pfn_range(vma, vma->vm_start, 255 - virt_to_phys(usbm->mem) >> PAGE_SHIFT, 256 - size, vma->vm_page_prot) < 0) { 253 + if (dma_mmap_coherent(hcd->self.sysdev, vma, mem, dma_handle, size)) { 257 254 dec_usb_memory_use_count(usbm, &usbm->vma_use_count); 258 255 return -EAGAIN; 259 256 }
+2 -2
drivers/usb/core/message.c
··· 1144 1144 1145 1145 if (usb_endpoint_out(epaddr)) { 1146 1146 ep = dev->ep_out[epnum]; 1147 - if (reset_hardware) 1147 + if (reset_hardware && epnum != 0) 1148 1148 dev->ep_out[epnum] = NULL; 1149 1149 } else { 1150 1150 ep = dev->ep_in[epnum]; 1151 - if (reset_hardware) 1151 + if (reset_hardware && epnum != 0) 1152 1152 dev->ep_in[epnum] = NULL; 1153 1153 } 1154 1154 if (ep) {
+2 -2
drivers/usb/serial/garmin_gps.c
··· 1138 1138 send it directly to the tty port */ 1139 1139 if (garmin_data_p->flags & FLAGS_QUEUING) { 1140 1140 pkt_add(garmin_data_p, data, data_length); 1141 - } else if (bulk_data || 1142 - getLayerId(data) == GARMIN_LAYERID_APPL) { 1141 + } else if (bulk_data || (data_length >= sizeof(u32) && 1142 + getLayerId(data) == GARMIN_LAYERID_APPL)) { 1143 1143 1144 1144 spin_lock_irqsave(&garmin_data_p->lock, flags); 1145 1145 garmin_data_p->flags |= APP_RESP_SEEN;
+1
drivers/usb/serial/qcserial.c
··· 173 173 {DEVICE_SWI(0x413c, 0x81b3)}, /* Dell Wireless 5809e Gobi(TM) 4G LTE Mobile Broadband Card (rev3) */ 174 174 {DEVICE_SWI(0x413c, 0x81b5)}, /* Dell Wireless 5811e QDL */ 175 175 {DEVICE_SWI(0x413c, 0x81b6)}, /* Dell Wireless 5811e QDL */ 176 + {DEVICE_SWI(0x413c, 0x81cc)}, /* Dell Wireless 5816e */ 176 177 {DEVICE_SWI(0x413c, 0x81cf)}, /* Dell Wireless 5819 */ 177 178 {DEVICE_SWI(0x413c, 0x81d0)}, /* Dell Wireless 5819 */ 178 179 {DEVICE_SWI(0x413c, 0x81d1)}, /* Dell Wireless 5818 */
+7
drivers/usb/storage/unusual_uas.h
··· 28 28 * and don't forget to CC: the USB development list <linux-usb@vger.kernel.org> 29 29 */ 30 30 31 + /* Reported-by: Julian Groß <julian.g@posteo.de> */ 32 + UNUSUAL_DEV(0x059f, 0x105f, 0x0000, 0x9999, 33 + "LaCie", 34 + "2Big Quadra USB3", 35 + USB_SC_DEVICE, USB_PR_DEVICE, NULL, 36 + US_FL_NO_REPORT_OPCODES), 37 + 31 38 /* 32 39 * Apricorn USB3 dongle sometimes returns "USBSUSBSUSBS" in response to SCSI 33 40 * commands in UAS mode. Observed with the 1.28 firmware; are there others?
+6 -2
drivers/usb/typec/mux/intel_pmc_mux.c
··· 157 157 req.mode_data |= (state->mode - TYPEC_STATE_MODAL) << 158 158 PMC_USB_ALTMODE_DP_MODE_SHIFT; 159 159 160 + if (data->status & DP_STATUS_HPD_STATE) 161 + req.mode_data |= PMC_USB_DP_HPD_LVL << 162 + PMC_USB_ALTMODE_DP_MODE_SHIFT; 163 + 160 164 return pmc_usb_command(port, (void *)&req, sizeof(req)); 161 165 } 162 166 ··· 302 298 struct typec_mux_desc mux_desc = { }; 303 299 int ret; 304 300 305 - ret = fwnode_property_read_u8(fwnode, "usb2-port", &port->usb2_port); 301 + ret = fwnode_property_read_u8(fwnode, "usb2-port-number", &port->usb2_port); 306 302 if (ret) 307 303 return ret; 308 304 309 - ret = fwnode_property_read_u8(fwnode, "usb3-port", &port->usb3_port); 305 + ret = fwnode_property_read_u8(fwnode, "usb3-port-number", &port->usb3_port); 310 306 if (ret) 311 307 return ret; 312 308
+2 -1
fs/ceph/caps.c
··· 2749 2749 2750 2750 ret = try_get_cap_refs(inode, need, want, 0, flags, got); 2751 2751 /* three special error codes */ 2752 - if (ret == -EAGAIN || ret == -EFBIG || ret == -EAGAIN) 2752 + if (ret == -EAGAIN || ret == -EFBIG || ret == -ESTALE) 2753 2753 ret = 0; 2754 2754 return ret; 2755 2755 } ··· 3746 3746 WARN_ON(1); 3747 3747 tsession = NULL; 3748 3748 target = -1; 3749 + mutex_lock(&session->s_mutex); 3749 3750 } 3750 3751 goto retry; 3751 3752
+1 -1
fs/ceph/debugfs.c
··· 271 271 &congestion_kb_fops); 272 272 273 273 snprintf(name, sizeof(name), "../../bdi/%s", 274 - dev_name(fsc->sb->s_bdi->dev)); 274 + bdi_dev_name(fsc->sb->s_bdi)); 275 275 fsc->debugfs_bdi = 276 276 debugfs_create_symlink("bdi", 277 277 fsc->client->debugfs_dir,
+3 -5
fs/ceph/mds_client.c
··· 3251 3251 void *end = p + msg->front.iov_len; 3252 3252 struct ceph_mds_session_head *h; 3253 3253 u32 op; 3254 - u64 seq; 3255 - unsigned long features = 0; 3254 + u64 seq, features = 0; 3256 3255 int wake = 0; 3257 3256 bool blacklisted = false; 3258 3257 ··· 3270 3271 goto bad; 3271 3272 /* version >= 3, feature bits */ 3272 3273 ceph_decode_32_safe(&p, end, len, bad); 3273 - ceph_decode_need(&p, end, len, bad); 3274 - memcpy(&features, p, min_t(size_t, len, sizeof(features))); 3275 - p += len; 3274 + ceph_decode_64_safe(&p, end, features, bad); 3275 + p += len - sizeof(features); 3276 3276 } 3277 3277 3278 3278 mutex_lock(&mdsc->mutex);
+2 -2
fs/ceph/quota.c
··· 159 159 } 160 160 161 161 if (IS_ERR(in)) { 162 - pr_warn("Can't lookup inode %llx (err: %ld)\n", 163 - realm->ino, PTR_ERR(in)); 162 + dout("Can't lookup inode %llx (err: %ld)\n", 163 + realm->ino, PTR_ERR(in)); 164 164 qri->timeout = jiffies + msecs_to_jiffies(60 * 1000); /* XXX */ 165 165 } else { 166 166 qri->timeout = 0;
+1
fs/configfs/dir.c
··· 1519 1519 spin_lock(&configfs_dirent_lock); 1520 1520 configfs_detach_rollback(dentry); 1521 1521 spin_unlock(&configfs_dirent_lock); 1522 + config_item_put(parent_item); 1522 1523 return -EINTR; 1523 1524 } 1524 1525 frag->frag_dead = true;
+8
fs/coredump.c
··· 788 788 if (displaced) 789 789 put_files_struct(displaced); 790 790 if (!dump_interrupted()) { 791 + /* 792 + * umh disabled with CONFIG_STATIC_USERMODEHELPER_PATH="" would 793 + * have this set to NULL. 794 + */ 795 + if (!cprm.file) { 796 + pr_info("Core dump to |%s disabled\n", cn.corename); 797 + goto close_fail; 798 + } 791 799 file_start_write(cprm.file); 792 800 core_dumped = binfmt->core_dump(&cprm); 793 801 file_end_write(cprm.file);
+66 -53
fs/eventpoll.c
··· 1171 1171 { 1172 1172 struct eventpoll *ep = epi->ep; 1173 1173 1174 + /* Fast preliminary check */ 1175 + if (epi->next != EP_UNACTIVE_PTR) 1176 + return false; 1177 + 1174 1178 /* Check that the same epi has not been just chained from another CPU */ 1175 1179 if (cmpxchg(&epi->next, EP_UNACTIVE_PTR, NULL) != EP_UNACTIVE_PTR) 1176 1180 return false; ··· 1241 1237 * chained in ep->ovflist and requeued later on. 1242 1238 */ 1243 1239 if (READ_ONCE(ep->ovflist) != EP_UNACTIVE_PTR) { 1244 - if (epi->next == EP_UNACTIVE_PTR && 1245 - chain_epi_lockless(epi)) 1240 + if (chain_epi_lockless(epi)) 1246 1241 ep_pm_stay_awake_rcu(epi); 1247 - goto out_unlock; 1248 - } 1249 - 1250 - /* If this file is already in the ready list we exit soon */ 1251 - if (!ep_is_linked(epi) && 1252 - list_add_tail_lockless(&epi->rdllink, &ep->rdllist)) { 1253 - ep_pm_stay_awake_rcu(epi); 1242 + } else if (!ep_is_linked(epi)) { 1243 + /* In the usual case, add event to ready list. */ 1244 + if (list_add_tail_lockless(&epi->rdllink, &ep->rdllist)) 1245 + ep_pm_stay_awake_rcu(epi); 1254 1246 } 1255 1247 1256 1248 /* ··· 1822 1822 { 1823 1823 int res = 0, eavail, timed_out = 0; 1824 1824 u64 slack = 0; 1825 - bool waiter = false; 1826 1825 wait_queue_entry_t wait; 1827 1826 ktime_t expires, *to = NULL; 1828 1827 ··· 1866 1867 */ 1867 1868 ep_reset_busy_poll_napi_id(ep); 1868 1869 1869 - /* 1870 - * We don't have any available event to return to the caller. We need 1871 - * to sleep here, and we will be woken by ep_poll_callback() when events 1872 - * become available. 1873 - */ 1874 - if (!waiter) { 1875 - waiter = true; 1876 - init_waitqueue_entry(&wait, current); 1870 + do { 1871 + /* 1872 + * Internally init_wait() uses autoremove_wake_function(), 1873 + * thus wait entry is removed from the wait queue on each 1874 + * wakeup. Why it is important? In case of several waiters 1875 + * each new wakeup will hit the next waiter, giving it the 1876 + * chance to harvest new event. Otherwise wakeup can be 1877 + * lost. This is also good performance-wise, because on 1878 + * normal wakeup path no need to call __remove_wait_queue() 1879 + * explicitly, thus ep->lock is not taken, which halts the 1880 + * event delivery. 1881 + */ 1882 + init_wait(&wait); 1877 1883 1878 1884 write_lock_irq(&ep->lock); 1879 - __add_wait_queue_exclusive(&ep->wq, &wait); 1885 + /* 1886 + * Barrierless variant, waitqueue_active() is called under 1887 + * the same lock on wakeup ep_poll_callback() side, so it 1888 + * is safe to avoid an explicit barrier. 1889 + */ 1890 + __set_current_state(TASK_INTERRUPTIBLE); 1891 + 1892 + /* 1893 + * Do the final check under the lock. ep_scan_ready_list() 1894 + * plays with two lists (->rdllist and ->ovflist) and there 1895 + * is always a race when both lists are empty for short 1896 + * period of time although events are pending, so lock is 1897 + * important. 1898 + */ 1899 + eavail = ep_events_available(ep); 1900 + if (!eavail) { 1901 + if (signal_pending(current)) 1902 + res = -EINTR; 1903 + else 1904 + __add_wait_queue_exclusive(&ep->wq, &wait); 1905 + } 1906 + write_unlock_irq(&ep->lock); 1907 + 1908 + if (eavail || res) 1909 + break; 1910 + 1911 + if (!schedule_hrtimeout_range(to, slack, HRTIMER_MODE_ABS)) { 1912 + timed_out = 1; 1913 + break; 1914 + } 1915 + 1916 + /* We were woken up, thus go and try to harvest some events */ 1917 + eavail = 1; 1918 + 1919 + } while (0); 1920 + 1921 + __set_current_state(TASK_RUNNING); 1922 + 1923 + if (!list_empty_careful(&wait.entry)) { 1924 + write_lock_irq(&ep->lock); 1925 + __remove_wait_queue(&ep->wq, &wait); 1880 1926 write_unlock_irq(&ep->lock); 1881 1927 } 1882 1928 1883 - for (;;) { 1884 - /* 1885 - * We don't want to sleep if the ep_poll_callback() sends us 1886 - * a wakeup in between. That's why we set the task state 1887 - * to TASK_INTERRUPTIBLE before doing the checks.
1888 - */ 1889 - set_current_state(TASK_INTERRUPTIBLE); 1929 + send_events: 1930 + if (fatal_signal_pending(current)) { 1890 1931 /* 1891 1932 * Always short-circuit for fatal signals to allow 1892 1933 * threads to make a timely exit without the chance of 1893 1934 * finding more events available and fetching 1894 1935 * repeatedly. 1895 1936 */ 1896 - if (fatal_signal_pending(current)) { 1897 - res = -EINTR; 1898 - break; 1899 - } 1900 - 1901 - eavail = ep_events_available(ep); 1902 - if (eavail) 1903 - break; 1904 - if (signal_pending(current)) { 1905 - res = -EINTR; 1906 - break; 1907 - } 1908 - 1909 - if (!schedule_hrtimeout_range(to, slack, HRTIMER_MODE_ABS)) { 1910 - timed_out = 1; 1911 - break; 1912 - } 1937 + res = -EINTR; 1913 1938 } 1914 - 1915 - __set_current_state(TASK_RUNNING); 1916 - 1917 - send_events: 1918 1939 /* 1919 1940 * Try to transfer events to user space. In case we get 0 events and 1920 1941 * there's still timeout left over, we go trying again in search of ··· 1943 1924 if (!res && eavail && 1944 1925 !(res = ep_send_events(ep, events, maxevents)) && !timed_out) 1945 1926 goto fetch_events; 1946 - 1947 - if (waiter) { 1948 - write_lock_irq(&ep->lock); 1949 - __remove_wait_queue(&ep->wq, &wait); 1950 - write_unlock_irq(&ep->lock); 1951 - } 1952 1927 1953 1928 return res; 1954 1929 }
+9 -7
fs/gfs2/bmap.c
··· 528 528 529 529 /* Advance in metadata tree. */ 530 530 (mp->mp_list[hgt])++; 531 - if (mp->mp_list[hgt] >= sdp->sd_inptrs) { 532 - if (!hgt) 531 + if (hgt) { 532 + if (mp->mp_list[hgt] >= sdp->sd_inptrs) 533 + goto lower_metapath; 534 + } else { 535 + if (mp->mp_list[hgt] >= sdp->sd_diptrs) 533 536 break; 534 - goto lower_metapath; 535 537 } 536 538 537 539 fill_up_metapath: ··· 878 876 ret = -ENOENT; 879 877 goto unlock; 880 878 } else { 881 - /* report a hole */ 882 879 iomap->offset = pos; 883 880 iomap->length = length; 884 - goto do_alloc; 881 + goto hole_found; 885 882 } 886 883 } 887 884 iomap->length = size; ··· 934 933 return ret; 935 934 936 935 do_alloc: 937 - iomap->addr = IOMAP_NULL_ADDR; 938 - iomap->type = IOMAP_HOLE; 939 936 if (flags & IOMAP_REPORT) { 940 937 if (pos >= size) 941 938 ret = -ENOENT; ··· 955 956 if (pos < size && height == ip->i_height) 956 957 ret = gfs2_hole_size(inode, lblock, len, mp, iomap); 957 958 } 959 + hole_found: 960 + iomap->addr = IOMAP_NULL_ADDR; 961 + iomap->type = IOMAP_HOLE; 958 962 goto out; 959 963 } 960 964
+2 -4
fs/gfs2/glock.c
··· 613 613 fs_err(sdp, "Error %d syncing glock \n", ret); 614 614 gfs2_dump_glock(NULL, gl, true); 615 615 } 616 - return; 616 + goto skip_inval; 617 617 } 618 618 } 619 619 if (test_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags)) { ··· 633 633 clear_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags); 634 634 } 635 635 636 + skip_inval: 636 637 gfs2_glock_hold(gl); 637 638 /* 638 639 * Check for an error encountered since we called go_sync and go_inval. ··· 722 721 if (find_first_holder(gl)) 723 722 goto out_unlock; 724 723 if (nonblock) 725 - goto out_sched; 726 - smp_mb(); 727 - if (atomic_read(&gl->gl_revokes) != 0) 728 724 goto out_sched; 729 725 set_bit(GLF_DEMOTE_IN_PROGRESS, &gl->gl_flags); 730 726 GLOCK_BUG_ON(gl, gl->gl_demote_state == LM_ST_EXCLUSIVE);
+4 -3
fs/gfs2/inode.c
··· 622 622 error = finish_no_open(file, NULL); 623 623 } 624 624 gfs2_glock_dq_uninit(ghs); 625 - return error; 625 + goto fail; 626 626 } else if (error != -ENOENT) { 627 627 goto fail_gunlock; 628 628 } ··· 764 764 error = finish_open(file, dentry, gfs2_open_common); 765 765 } 766 766 gfs2_glock_dq_uninit(ghs); 767 + gfs2_qa_put(ip); 767 768 gfs2_glock_dq_uninit(ghs + 1); 768 769 clear_bit(GLF_INODE_CREATING, &io_gl->gl_flags); 769 770 gfs2_glock_put(io_gl); 771 + gfs2_qa_put(dip); 770 772 return error; 771 773 772 774 fail_gunlock3: ··· 778 776 clear_bit(GLF_INODE_CREATING, &io_gl->gl_flags); 779 777 gfs2_glock_put(io_gl); 780 778 fail_free_inode: 781 - gfs2_qa_put(ip); 782 779 if (ip->i_gl) { 783 780 glock_clear_object(ip->i_gl, ip); 784 781 gfs2_glock_put(ip->i_gl); ··· 1006 1005 out_child: 1007 1006 gfs2_glock_dq(ghs); 1008 1007 out_parent: 1009 - gfs2_qa_put(ip); 1008 + gfs2_qa_put(dip); 1010 1009 gfs2_holder_uninit(ghs); 1011 1010 gfs2_holder_uninit(ghs + 1); 1012 1011 return error;
+8 -3
fs/gfs2/log.c
··· 669 669 struct buffer_head *bh = bd->bd_bh; 670 670 struct gfs2_glock *gl = bd->bd_gl; 671 671 672 + sdp->sd_log_num_revoke++; 673 + if (atomic_inc_return(&gl->gl_revokes) == 1) 674 + gfs2_glock_hold(gl); 672 675 bh->b_private = NULL; 673 676 bd->bd_blkno = bh->b_blocknr; 674 677 gfs2_remove_from_ail(bd); /* drops ref on bh */ 675 678 bd->bd_bh = NULL; 676 - sdp->sd_log_num_revoke++; 677 - if (atomic_inc_return(&gl->gl_revokes) == 1) 678 - gfs2_glock_hold(gl); 679 679 set_bit(GLF_LFLUSH, &gl->gl_flags); 680 680 list_add(&bd->bd_list, &sdp->sd_log_revokes); 681 681 } ··· 1131 1131 1132 1132 while (!kthread_should_stop()) { 1133 1133 1134 + if (gfs2_withdrawn(sdp)) { 1135 + msleep_interruptible(HZ); 1136 + continue; 1137 + } 1134 1138 /* Check for errors writing to the journal */ 1135 1139 if (sdp->sd_log_error) { 1136 1140 gfs2_lm(sdp, ··· 1143 1139 "prevent further damage.\n", 1144 1140 sdp->sd_fsname, sdp->sd_log_error); 1145 1141 gfs2_withdraw(sdp); 1142 + continue; 1146 1143 } 1147 1144 1148 1145 did_flush = false;
+12 -7
fs/gfs2/lops.c
··· 263 263 struct super_block *sb = sdp->sd_vfs; 264 264 struct bio *bio = bio_alloc(GFP_NOIO, BIO_MAX_PAGES); 265 265 266 - bio->bi_iter.bi_sector = blkno << (sb->s_blocksize_bits - 9); 266 + bio->bi_iter.bi_sector = blkno << sdp->sd_fsb2bb_shift; 267 267 bio_set_dev(bio, sb->s_bdev); 268 268 bio->bi_end_io = end_io; 269 269 bio->bi_private = sdp; ··· 509 509 unsigned int bsize = sdp->sd_sb.sb_bsize, off; 510 510 unsigned int bsize_shift = sdp->sd_sb.sb_bsize_shift; 511 511 unsigned int shift = PAGE_SHIFT - bsize_shift; 512 - unsigned int readahead_blocks = BIO_MAX_PAGES << shift; 512 + unsigned int max_bio_size = 2 * 1024 * 1024; 513 513 struct gfs2_journal_extent *je; 514 514 int sz, ret = 0; 515 515 struct bio *bio = NULL; ··· 537 537 off = 0; 538 538 } 539 539 540 - if (!bio || (bio_chained && !off)) { 540 + if (!bio || (bio_chained && !off) || 541 + bio->bi_iter.bi_size >= max_bio_size) { 541 542 /* start new bio */ 542 543 } else { 543 - sz = bio_add_page(bio, page, bsize, off); 544 - if (sz == bsize) 545 - goto block_added; 544 + sector_t sector = dblock << sdp->sd_fsb2bb_shift; 545 + 546 + if (bio_end_sector(bio) == sector) { 547 + sz = bio_add_page(bio, page, bsize, off); 548 + if (sz == bsize) 549 + goto block_added; 550 + } 546 551 if (off) { 547 552 unsigned int blocks = 548 553 (PAGE_SIZE - off) >> bsize_shift; ··· 573 568 off += bsize; 574 569 if (off == PAGE_SIZE) 575 570 page = NULL; 576 - if (blocks_submitted < blocks_read + readahead_blocks) { 571 + if (blocks_submitted < 2 * max_bio_size >> bsize_shift) { 577 572 /* Keep at least one bio in flight */ 578 573 continue; 579 574 }
+1 -1
fs/gfs2/meta_io.c
··· 252 252 int num = 0; 253 253 254 254 if (unlikely(gfs2_withdrawn(sdp)) && 255 - (!sdp->sd_jdesc || (blkno != sdp->sd_jdesc->jd_no_addr))) { 255 + (!sdp->sd_jdesc || gl != sdp->sd_jinode_gl)) { 256 256 *bhp = NULL; 257 257 return -EIO; 258 258 }
+5 -8
fs/gfs2/quota.c
··· 1051 1051 u32 x; 1052 1052 int error = 0; 1053 1053 1054 - if (capable(CAP_SYS_RESOURCE) || 1055 - sdp->sd_args.ar_quota != GFS2_QUOTA_ON) 1054 + if (sdp->sd_args.ar_quota != GFS2_QUOTA_ON) 1056 1055 return 0; 1057 1056 1058 1057 error = gfs2_quota_hold(ip, uid, gid); ··· 1124 1125 int found; 1125 1126 1126 1127 if (!test_and_clear_bit(GIF_QD_LOCKED, &ip->i_flags)) 1127 - goto out; 1128 + return; 1128 1129 1129 1130 for (x = 0; x < ip->i_qadata->qa_qd_num; x++) { 1130 1131 struct gfs2_quota_data *qd; ··· 1161 1162 qd_unlock(qda[x]); 1162 1163 } 1163 1164 1164 - out: 1165 1165 gfs2_quota_unhold(ip); 1166 1166 } 1167 1167 ··· 1207 1209 ap->allowed = UINT_MAX; /* Assume we are permitted a whole lot */ 1208 1210 if (!test_bit(GIF_QD_LOCKED, &ip->i_flags)) 1209 1211 return 0; 1210 - 1211 - if (sdp->sd_args.ar_quota != GFS2_QUOTA_ON) 1212 - return 0; 1213 1212 1214 1213 for (x = 0; x < ip->i_qadata->qa_qd_num; x++) { 1215 1214 qd = ip->i_qadata->qa_qd[x]; ··· 1265 1270 if (ip->i_diskflags & GFS2_DIF_SYSTEM) 1266 1271 return; 1267 1272 1268 - BUG_ON(ip->i_qadata->qa_ref <= 0); 1273 + if (gfs2_assert_withdraw(sdp, ip->i_qadata && 1274 + ip->i_qadata->qa_ref > 0)) 1275 + return; 1269 1276 for (x = 0; x < ip->i_qadata->qa_qd_num; x++) { 1270 1277 qd = ip->i_qadata->qa_qd[x]; 1271 1278
+2 -1
fs/gfs2/quota.h
··· 44 44 int ret; 45 45 46 46 ap->allowed = UINT_MAX; /* Assume we are permitted a whole lot */ 47 - if (sdp->sd_args.ar_quota == GFS2_QUOTA_OFF) 47 + if (capable(CAP_SYS_RESOURCE) || 48 + sdp->sd_args.ar_quota == GFS2_QUOTA_OFF) 48 49 return 0; 49 50 ret = gfs2_quota_lock(ip, NO_UID_QUOTA_CHANGE, NO_GID_QUOTA_CHANGE); 50 51 if (ret)
-1
fs/gfs2/super.c
··· 1404 1404 if (ip->i_qadata) 1405 1405 gfs2_assert_warn(sdp, ip->i_qadata->qa_ref == 0); 1406 1406 gfs2_rs_delete(ip, NULL); 1407 - gfs2_qa_put(ip); 1408 1407 gfs2_ordered_del_inode(ip); 1409 1408 clear_inode(inode); 1410 1409 gfs2_dir_hash_inval(ip);
+6 -4
fs/gfs2/util.c
··· 119 119 if (!sb_rdonly(sdp->sd_vfs)) 120 120 ret = gfs2_make_fs_ro(sdp); 121 121 122 + if (sdp->sd_lockstruct.ls_ops->lm_lock == NULL) { /* lock_nolock */ 123 + if (!ret) 124 + ret = -EIO; 125 + clear_bit(SDF_WITHDRAW_RECOVERY, &sdp->sd_flags); 126 + goto skip_recovery; 127 + } 122 128 /* 123 129 * Drop the glock for our journal so another node can recover it. 124 130 */ ··· 165 159 wait_on_bit(&gl->gl_flags, GLF_FREEING, TASK_UNINTERRUPTIBLE); 166 160 } 167 161 168 - if (sdp->sd_lockstruct.ls_ops->lm_lock == NULL) { /* lock_nolock */ 169 - clear_bit(SDF_WITHDRAW_RECOVERY, &sdp->sd_flags); 170 - goto skip_recovery; 171 - } 172 162 /* 173 163 * Dequeue the "live" glock, but keep a reference so it's never freed. 174 164 */
+22 -43
fs/io_uring.c
··· 680 680 unsigned needs_mm : 1; 681 681 /* needs req->file assigned */ 682 682 unsigned needs_file : 1; 683 - /* needs req->file assigned IFF fd is >= 0 */ 684 - unsigned fd_non_neg : 1; 685 683 /* hash wq insertion if file is a regular file */ 686 684 unsigned hash_reg_file : 1; 687 685 /* unbound wq insertion if file is a non-regular file */ ··· 782 784 .needs_file = 1, 783 785 }, 784 786 [IORING_OP_OPENAT] = { 785 - .needs_file = 1, 786 - .fd_non_neg = 1, 787 787 .file_table = 1, 788 788 .needs_fs = 1, 789 789 }, ··· 795 799 }, 796 800 [IORING_OP_STATX] = { 797 801 .needs_mm = 1, 798 - .needs_file = 1, 799 - .fd_non_neg = 1, 800 802 .needs_fs = 1, 801 803 .file_table = 1, 802 804 }, ··· 831 837 .buffer_select = 1, 832 838 }, 833 839 [IORING_OP_OPENAT2] = { 834 - .needs_file = 1, 835 - .fd_non_neg = 1, 836 840 .file_table = 1, 837 841 .needs_fs = 1, 838 842 }, ··· 5360 5368 io_steal_work(req, workptr); 5361 5369 } 5362 5370 5363 - static int io_req_needs_file(struct io_kiocb *req, int fd) 5364 - { 5365 - if (!io_op_defs[req->opcode].needs_file) 5366 - return 0; 5367 - if ((fd == -1 || fd == AT_FDCWD) && io_op_defs[req->opcode].fd_non_neg) 5368 - return 0; 5369 - return 1; 5370 - } 5371 - 5372 5371 static inline struct file *io_file_from_index(struct io_ring_ctx *ctx, 5373 5372 int index) 5374 5373 { ··· 5397 5414 } 5398 5415 5399 5416 static int io_req_set_file(struct io_submit_state *state, struct io_kiocb *req, 5400 - int fd, unsigned int flags) 5417 + int fd) 5401 5418 { 5402 5419 bool fixed; 5403 5420 5404 - if (!io_req_needs_file(req, fd)) 5405 - return 0; 5406 - 5407 - fixed = (flags & IOSQE_FIXED_FILE); 5421 + fixed = (req->flags & REQ_F_FIXED_FILE) != 0; 5408 5422 if (unlikely(!fixed && req->needs_fixed_file)) 5409 5423 return -EBADF; 5410 5424 ··· 5778 5798 struct io_submit_state *state, bool async) 5779 5799 { 5780 5800 unsigned int sqe_flags; 5781 - int id, fd; 5801 + int id; 5782 5802 5783 5803 /* 5784 5804 * All io need record the previous 
position, if LINK vs DARIN, ··· 5830 5850 IOSQE_ASYNC | IOSQE_FIXED_FILE | 5831 5851 IOSQE_BUFFER_SELECT | IOSQE_IO_LINK); 5832 5852 5833 - fd = READ_ONCE(sqe->fd); 5834 - return io_req_set_file(state, req, fd, sqe_flags); 5853 + if (!io_op_defs[req->opcode].needs_file) 5854 + return 0; 5855 + 5856 + return io_req_set_file(state, req, READ_ONCE(sqe->fd)); 5835 5857 } 5836 5858 5837 5859 static int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr, ··· 7342 7360 static void io_uring_cancel_files(struct io_ring_ctx *ctx, 7343 7361 struct files_struct *files) 7344 7362 { 7345 - struct io_kiocb *req; 7346 - DEFINE_WAIT(wait); 7347 - 7348 7363 while (!list_empty_careful(&ctx->inflight_list)) { 7349 - struct io_kiocb *cancel_req = NULL; 7364 + struct io_kiocb *cancel_req = NULL, *req; 7365 + DEFINE_WAIT(wait); 7350 7366 7351 7367 spin_lock_irq(&ctx->inflight_lock); 7352 7368 list_for_each_entry(req, &ctx->inflight_list, inflight_entry) { ··· 7384 7404 */ 7385 7405 if (refcount_sub_and_test(2, &cancel_req->refs)) { 7386 7406 io_put_req(cancel_req); 7407 + finish_wait(&ctx->inflight_wait, &wait); 7387 7408 continue; 7388 7409 } 7389 7410 } ··· 7392 7411 io_wq_cancel_work(ctx->io_wq, &cancel_req->work); 7393 7412 io_put_req(cancel_req); 7394 7413 schedule(); 7414 + finish_wait(&ctx->inflight_wait, &wait); 7395 7415 } 7396 - finish_wait(&ctx->inflight_wait, &wait); 7397 7416 } 7398 7417 7399 7418 static int io_uring_flush(struct file *file, void *data) ··· 7742 7761 return ret; 7743 7762 } 7744 7763 7745 - static int io_uring_create(unsigned entries, struct io_uring_params *p) 7764 + static int io_uring_create(unsigned entries, struct io_uring_params *p, 7765 + struct io_uring_params __user *params) 7746 7766 { 7747 7767 struct user_struct *user = NULL; 7748 7768 struct io_ring_ctx *ctx; ··· 7835 7853 p->cq_off.overflow = offsetof(struct io_rings, cq_overflow); 7836 7854 p->cq_off.cqes = offsetof(struct io_rings, cqes); 7837 7855 7856 + p->features = 
IORING_FEAT_SINGLE_MMAP | IORING_FEAT_NODROP | 7857 + IORING_FEAT_SUBMIT_STABLE | IORING_FEAT_RW_CUR_POS | 7858 + IORING_FEAT_CUR_PERSONALITY | IORING_FEAT_FAST_POLL; 7859 + 7860 + if (copy_to_user(params, p, sizeof(*p))) { 7861 + ret = -EFAULT; 7862 + goto err; 7863 + } 7838 7864 /* 7839 7865 * Install ring fd as the very last thing, so we don't risk someone 7840 7866 * having closed it before we finish setup ··· 7851 7861 if (ret < 0) 7852 7862 goto err; 7853 7863 7854 - p->features = IORING_FEAT_SINGLE_MMAP | IORING_FEAT_NODROP | 7855 - IORING_FEAT_SUBMIT_STABLE | IORING_FEAT_RW_CUR_POS | 7856 - IORING_FEAT_CUR_PERSONALITY | IORING_FEAT_FAST_POLL; 7857 7864 trace_io_uring_create(ret, ctx, p->sq_entries, p->cq_entries, p->flags); 7858 7865 return ret; 7859 7866 err: ··· 7866 7879 static long io_uring_setup(u32 entries, struct io_uring_params __user *params) 7867 7880 { 7868 7881 struct io_uring_params p; 7869 - long ret; 7870 7882 int i; 7871 7883 7872 7884 if (copy_from_user(&p, params, sizeof(p))) ··· 7880 7894 IORING_SETUP_CLAMP | IORING_SETUP_ATTACH_WQ)) 7881 7895 return -EINVAL; 7882 7896 7883 - ret = io_uring_create(entries, &p); 7884 - if (ret < 0) 7885 - return ret; 7886 - 7887 - if (copy_to_user(params, &p, sizeof(p))) 7888 - return -EFAULT; 7889 - 7890 - return ret; 7897 + return io_uring_create(entries, &p, params); 7891 7898 } 7892 7899 7893 7900 SYSCALL_DEFINE2(io_uring_setup, u32, entries,
+18 -27
fs/splice.c
··· 1118 1118 loff_t offset; 1119 1119 long ret; 1120 1120 1121 + if (unlikely(!(in->f_mode & FMODE_READ) || 1122 + !(out->f_mode & FMODE_WRITE))) 1123 + return -EBADF; 1124 + 1121 1125 ipipe = get_pipe_info(in); 1122 1126 opipe = get_pipe_info(out); 1123 1127 1124 1128 if (ipipe && opipe) { 1125 1129 if (off_in || off_out) 1126 1130 return -ESPIPE; 1127 - 1128 - if (!(in->f_mode & FMODE_READ)) 1129 - return -EBADF; 1130 - 1131 - if (!(out->f_mode & FMODE_WRITE)) 1132 - return -EBADF; 1133 1131 1134 1132 /* Splicing to self would be fun, but... */ 1135 1133 if (ipipe == opipe) ··· 1150 1152 } else { 1151 1153 offset = out->f_pos; 1152 1154 } 1153 - 1154 - if (unlikely(!(out->f_mode & FMODE_WRITE))) 1155 - return -EBADF; 1156 1155 1157 1156 if (unlikely(out->f_flags & O_APPEND)) 1158 1157 return -EINVAL; ··· 1435 1440 error = -EBADF; 1436 1441 in = fdget(fd_in); 1437 1442 if (in.file) { 1438 - if (in.file->f_mode & FMODE_READ) { 1439 - out = fdget(fd_out); 1440 - if (out.file) { 1441 - if (out.file->f_mode & FMODE_WRITE) 1442 - error = do_splice(in.file, off_in, 1443 - out.file, off_out, 1444 - len, flags); 1445 - fdput(out); 1446 - } 1443 + out = fdget(fd_out); 1444 + if (out.file) { 1445 + error = do_splice(in.file, off_in, out.file, off_out, 1446 + len, flags); 1447 + fdput(out); 1447 1448 } 1448 1449 fdput(in); 1449 1450 } ··· 1761 1770 struct pipe_inode_info *opipe = get_pipe_info(out); 1762 1771 int ret = -EINVAL; 1763 1772 1773 + if (unlikely(!(in->f_mode & FMODE_READ) || 1774 + !(out->f_mode & FMODE_WRITE))) 1775 + return -EBADF; 1776 + 1764 1777 /* 1765 1778 * Duplicate the contents of ipipe to opipe without actually 1766 1779 * copying the data. 
··· 1790 1795 1791 1796 SYSCALL_DEFINE4(tee, int, fdin, int, fdout, size_t, len, unsigned int, flags) 1792 1797 { 1793 - struct fd in; 1798 + struct fd in, out; 1794 1799 int error; 1795 1800 1796 1801 if (unlikely(flags & ~SPLICE_F_ALL)) ··· 1802 1807 error = -EBADF; 1803 1808 in = fdget(fdin); 1804 1809 if (in.file) { 1805 - if (in.file->f_mode & FMODE_READ) { 1806 - struct fd out = fdget(fdout); 1807 - if (out.file) { 1808 - if (out.file->f_mode & FMODE_WRITE) 1809 - error = do_tee(in.file, out.file, 1810 - len, flags); 1811 - fdput(out); 1812 - } 1810 + out = fdget(fdout); 1811 + if (out.file) { 1812 + error = do_tee(in.file, out.file, len, flags); 1813 + fdput(out); 1813 1814 } 1814 1815 fdput(in); 1815 1816 }
+1 -1
fs/vboxsf/super.c
··· 164 164 goto fail_free; 165 165 } 166 166 167 - err = super_setup_bdi_name(sb, "vboxsf-%s.%d", fc->source, sbi->bdi_id); 167 + err = super_setup_bdi_name(sb, "vboxsf-%d", sbi->bdi_id); 168 168 if (err) 169 169 goto fail_free; 170 170
+1 -1
include/drm/drm_modes.h
··· 48 48 * @MODE_HSYNC: hsync out of range 49 49 * @MODE_VSYNC: vsync out of range 50 50 * @MODE_H_ILLEGAL: mode has illegal horizontal timings 51 - * @MODE_V_ILLEGAL: mode has illegal horizontal timings 51 + * @MODE_V_ILLEGAL: mode has illegal vertical timings 52 52 * @MODE_BAD_WIDTH: requires an unsupported linepitch 53 53 * @MODE_NOMODE: no mode with a matching name 54 54 * @MODE_NO_INTERLACE: interlaced mode not supported
+1
include/linux/amba/bus.h
··· 65 65 struct device dev; 66 66 struct resource res; 67 67 struct clk *pclk; 68 + struct device_dma_parameters dma_parms; 68 69 unsigned int periphid; 69 70 unsigned int cid; 70 71 struct amba_cs_uci_id uci;
+1
include/linux/backing-dev-defs.h
··· 219 219 wait_queue_head_t wb_waitq; 220 220 221 221 struct device *dev; 222 + char dev_name[64]; 222 223 struct device *owner; 223 224 224 225 struct timer_list laptop_mode_wb_timer;
+1 -8
include/linux/backing-dev.h
··· 505 505 (1 << WB_async_congested)); 506 506 } 507 507 508 - extern const char *bdi_unknown_name; 509 - 510 - static inline const char *bdi_dev_name(struct backing_dev_info *bdi) 511 - { 512 - if (!bdi || !bdi->dev) 513 - return bdi_unknown_name; 514 - return dev_name(bdi->dev); 515 - } 508 + const char *bdi_dev_name(struct backing_dev_info *bdi); 516 509 517 510 #endif /* _LINUX_BACKING_DEV_H */
+23
include/linux/ftrace.h
··· 210 210 #endif 211 211 }; 212 212 213 + extern struct ftrace_ops __rcu *ftrace_ops_list; 214 + extern struct ftrace_ops ftrace_list_end; 215 + 216 + /* 217 + * Traverse the ftrace_global_list, invoking all entries. The reason that we 218 + * can use rcu_dereference_raw_check() is that elements removed from this list 219 + * are simply leaked, so there is no need to interact with a grace-period 220 + * mechanism. The rcu_dereference_raw_check() calls are needed to handle 221 + * concurrent insertions into the ftrace_global_list. 222 + * 223 + * Silly Alpha and silly pointer-speculation compiler optimizations! 224 + */ 225 + #define do_for_each_ftrace_op(op, list) \ 226 + op = rcu_dereference_raw_check(list); \ 227 + do 228 + 229 + /* 230 + * Optimized for just a single item in the list (as that is the normal case). 231 + */ 232 + #define while_for_each_ftrace_op(op) \ 233 + while (likely(op = rcu_dereference_raw_check((op)->next)) && \ 234 + unlikely((op) != &ftrace_list_end)) 235 + 213 236 /* 214 237 * Type of the current tracing. 215 238 */
+3
include/linux/host1x.h
··· 17 17 HOST1X_CLASS_GR3D = 0x60, 18 18 }; 19 19 20 + struct host1x; 20 21 struct host1x_client; 21 22 struct iommu_group; 23 + 24 + u64 host1x_get_dma_mask(struct host1x *host1x); 22 25 23 26 /** 24 27 * struct host1x_client_ops - host1x client operations
+2 -2
include/linux/lsm_hook_defs.h
··· 55 55 LSM_HOOK(void, LSM_RET_VOID, bprm_committed_creds, struct linux_binprm *bprm) 56 56 LSM_HOOK(int, 0, fs_context_dup, struct fs_context *fc, 57 57 struct fs_context *src_sc) 58 - LSM_HOOK(int, 0, fs_context_parse_param, struct fs_context *fc, 58 + LSM_HOOK(int, -ENOPARAM, fs_context_parse_param, struct fs_context *fc, 59 59 struct fs_parameter *param) 60 60 LSM_HOOK(int, 0, sb_alloc_security, struct super_block *sb) 61 61 LSM_HOOK(void, LSM_RET_VOID, sb_free_security, struct super_block *sb) ··· 243 243 char **value) 244 244 LSM_HOOK(int, -EINVAL, setprocattr, const char *name, void *value, size_t size) 245 245 LSM_HOOK(int, 0, ismaclabel, const char *name) 246 - LSM_HOOK(int, 0, secid_to_secctx, u32 secid, char **secdata, 246 + LSM_HOOK(int, -EOPNOTSUPP, secid_to_secctx, u32 secid, char **secdata, 247 247 u32 *seclen) 248 248 LSM_HOOK(int, 0, secctx_to_secid, const char *secdata, u32 seclen, u32 *secid) 249 249 LSM_HOOK(void, LSM_RET_VOID, release_secctx, char *secdata, u32 seclen)
+2
include/linux/memcontrol.h
··· 783 783 atomic_long_inc(&memcg->memory_events[event]); 784 784 cgroup_file_notify(&memcg->events_file); 785 785 786 + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) 787 + break; 786 788 if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS) 787 789 break; 788 790 } while ((memcg = parent_mem_cgroup(memcg)) &&
+10 -6
include/linux/mhi.h
··· 53 53 * @MHI_CHAIN: Linked transfer 54 54 */ 55 55 enum mhi_flags { 56 - MHI_EOB, 57 - MHI_EOT, 58 - MHI_CHAIN, 56 + MHI_EOB = BIT(0), 57 + MHI_EOT = BIT(1), 58 + MHI_CHAIN = BIT(2), 59 59 }; 60 60 61 61 /** ··· 335 335 * @syserr_worker: System error worker 336 336 * @state_event: State change event 337 337 * @status_cb: CB function to notify power states of the device (required) 338 - * @link_status: CB function to query link status of the device (required) 339 338 * @wake_get: CB function to assert device wake (optional) 340 339 * @wake_put: CB function to de-assert device wake (optional) 341 340 * @wake_toggle: CB function to assert and de-assert device wake (optional) 342 341 * @runtime_get: CB function to controller runtime resume (required) 343 - * @runtimet_put: CB function to decrement pm usage (required) 342 + * @runtime_put: CB function to decrement pm usage (required) 344 343 * @map_single: CB function to create TRE buffer 345 344 * @unmap_single: CB function to destroy TRE buffer 345 + * @read_reg: Read a MHI register via the physical link (required) 346 + * @write_reg: Write a MHI register via the physical link (required) 346 347 * @buffer_len: Bounce buffer length 347 348 * @bounce_buf: Use of bounce buffer 348 349 * @fbc_download: MHI host needs to do complete image transfer (optional) ··· 418 417 419 418 void (*status_cb)(struct mhi_controller *mhi_cntrl, 420 419 enum mhi_callback cb); 421 - int (*link_status)(struct mhi_controller *mhi_cntrl); 422 420 void (*wake_get)(struct mhi_controller *mhi_cntrl, bool override); 423 421 void (*wake_put)(struct mhi_controller *mhi_cntrl, bool override); 424 422 void (*wake_toggle)(struct mhi_controller *mhi_cntrl); ··· 427 427 struct mhi_buf_info *buf); 428 428 void (*unmap_single)(struct mhi_controller *mhi_cntrl, 429 429 struct mhi_buf_info *buf); 430 + int (*read_reg)(struct mhi_controller *mhi_cntrl, void __iomem *addr, 431 + u32 *out); 432 + void (*write_reg)(struct mhi_controller *mhi_cntrl, void 
__iomem *addr, 433 + u32 val); 430 434 431 435 size_t buffer_len; 432 436 bool bounce_buf;
+1
include/linux/platform_device.h
··· 25 25 bool id_auto; 26 26 struct device dev; 27 27 u64 platform_dma_mask; 28 + struct device_dma_parameters dma_parms; 28 29 u32 num_resources; 29 30 struct resource *resource; 30 31
+4 -4
include/linux/ptp_clock_kernel.h
··· 108 108 * parameter func: the desired function to use. 109 109 * parameter chan: the function channel index to use. 110 110 * 111 - * @do_work: Request driver to perform auxiliary (periodic) operations 112 - * Driver should return delay of the next auxiliary work scheduling 113 - * time (>=0) or negative value in case further scheduling 114 - * is not required. 111 + * @do_aux_work: Request driver to perform auxiliary (periodic) operations 112 + * Driver should return delay of the next auxiliary work 113 + * scheduling time (>=0) or negative value in case further 114 + * scheduling is not required. 115 115 * 116 116 * Drivers should embed their ptp_clock_info within a private 117 117 * structure, obtaining a reference to it using container_of().
+1
include/linux/skmsg.h
··· 187 187 dst->sg.data[which] = src->sg.data[which]; 188 188 dst->sg.data[which].length = size; 189 189 dst->sg.size += size; 190 + src->sg.size -= size; 190 191 src->sg.data[which].length -= size; 191 192 src->sg.data[which].offset += size; 192 193 }
+3
include/linux/sunrpc/gss_api.h
··· 21 21 struct gss_ctx { 22 22 struct gss_api_mech *mech_type; 23 23 void *internal_ctx_id; 24 + unsigned int slack, align; 24 25 }; 25 26 26 27 #define GSS_C_NO_BUFFER ((struct xdr_netobj) 0) ··· 67 66 u32 gss_unwrap( 68 67 struct gss_ctx *ctx_id, 69 68 int offset, 69 + int len, 70 70 struct xdr_buf *inbuf); 71 71 u32 gss_delete_sec_context( 72 72 struct gss_ctx **ctx_id); ··· 128 126 u32 (*gss_unwrap)( 129 127 struct gss_ctx *ctx_id, 130 128 int offset, 129 + int len, 131 130 struct xdr_buf *buf); 132 131 void (*gss_delete_sec_context)( 133 132 void *internal_ctx_id);
+3 -3
include/linux/sunrpc/gss_krb5.h
··· 83 83 u32 (*encrypt_v2) (struct krb5_ctx *kctx, u32 offset, 84 84 struct xdr_buf *buf, 85 85 struct page **pages); /* v2 encryption function */ 86 - u32 (*decrypt_v2) (struct krb5_ctx *kctx, u32 offset, 86 + u32 (*decrypt_v2) (struct krb5_ctx *kctx, u32 offset, u32 len, 87 87 struct xdr_buf *buf, u32 *headskip, 88 88 u32 *tailskip); /* v2 decryption function */ 89 89 }; ··· 255 255 struct xdr_buf *outbuf, struct page **pages); 256 256 257 257 u32 258 - gss_unwrap_kerberos(struct gss_ctx *ctx_id, int offset, 258 + gss_unwrap_kerberos(struct gss_ctx *ctx_id, int offset, int len, 259 259 struct xdr_buf *buf); 260 260 261 261 ··· 312 312 struct page **pages); 313 313 314 314 u32 315 - gss_krb5_aes_decrypt(struct krb5_ctx *kctx, u32 offset, 315 + gss_krb5_aes_decrypt(struct krb5_ctx *kctx, u32 offset, u32 len, 316 316 struct xdr_buf *buf, u32 *plainoffset, 317 317 u32 *plainlen); 318 318
+1
include/linux/sunrpc/xdr.h
··· 184 184 extern void xdr_shift_buf(struct xdr_buf *, size_t); 185 185 extern void xdr_buf_from_iov(struct kvec *, struct xdr_buf *); 186 186 extern int xdr_buf_subsegment(struct xdr_buf *, struct xdr_buf *, unsigned int, unsigned int); 187 + extern void xdr_buf_trim(struct xdr_buf *, unsigned int); 187 188 extern int read_bytes_from_xdr_buf(struct xdr_buf *, unsigned int, void *, unsigned int); 188 189 extern int write_bytes_to_xdr_buf(struct xdr_buf *, unsigned int, void *, unsigned int); 189 190
+1 -1
include/net/netfilter/nf_conntrack.h
··· 87 87 struct hlist_node nat_bysource; 88 88 #endif 89 89 /* all members below initialized via memset */ 90 - u8 __nfct_init_offset[0]; 90 + struct { } __nfct_init_offset; 91 91 92 92 /* If we were expected by an expectation, this will be it */ 93 93 struct nf_conn *master;
+1
include/net/netfilter/nf_flow_table.h
··· 127 127 NF_FLOW_HW_DYING, 128 128 NF_FLOW_HW_DEAD, 129 129 NF_FLOW_HW_REFRESH, 130 + NF_FLOW_HW_PENDING, 130 131 }; 131 132 132 133 enum flow_offload_type {
+13 -1
include/net/tcp.h
··· 1372 1372 rx_opt->num_sacks = 0; 1373 1373 } 1374 1374 1375 - u32 tcp_default_init_rwnd(u32 mss); 1376 1375 void tcp_cwnd_restart(struct sock *sk, s32 delta); 1377 1376 1378 1377 static inline void tcp_slow_start_after_idle_check(struct sock *sk) ··· 1414 1415 static inline int tcp_full_space(const struct sock *sk) 1415 1416 { 1416 1417 return tcp_win_from_space(sk, READ_ONCE(sk->sk_rcvbuf)); 1418 + } 1419 + 1420 + /* We provision sk_rcvbuf around 200% of sk_rcvlowat. 1421 + * If 87.5 % (7/8) of the space has been consumed, we want to override 1422 + * SO_RCVLOWAT constraint, since we are receiving skbs with too small 1423 + * len/truesize ratio. 1424 + */ 1425 + static inline bool tcp_rmem_pressure(const struct sock *sk) 1426 + { 1427 + int rcvbuf = READ_ONCE(sk->sk_rcvbuf); 1428 + int threshold = rcvbuf - (rcvbuf >> 3); 1429 + 1430 + return atomic_read(&sk->sk_rmem_alloc) > threshold; 1417 1431 } 1418 1432 1419 1433 extern void tcp_openreq_init_rwin(struct request_sock *req,
-2
include/net/udp_tunnel.h
··· 143 143 __be16 df, __be16 src_port, __be16 dst_port, 144 144 bool xnet, bool nocheck); 145 145 146 - #if IS_ENABLED(CONFIG_IPV6) 147 146 int udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sock *sk, 148 147 struct sk_buff *skb, 149 148 struct net_device *dev, struct in6_addr *saddr, 150 149 struct in6_addr *daddr, 151 150 __u8 prio, __u8 ttl, __be32 label, 152 151 __be16 src_port, __be16 dst_port, bool nocheck); 153 - #endif 154 152 155 153 void udp_tunnel_sock_release(struct socket *sock); 156 154
+1
include/sound/rawmidi.h
··· 61 61 size_t avail_min; /* min avail for wakeup */ 62 62 size_t avail; /* max used buffer for wakeup */ 63 63 size_t xruns; /* over/underruns counter */ 64 + int buffer_ref; /* buffer reference count */ 64 65 /* misc */ 65 66 spinlock_t lock; 66 67 wait_queue_head_t sleep;
+1 -1
include/trace/events/gpu_mem.h
··· 24 24 * 25 25 * @pid: Put 0 for global total, while positive pid for process total. 26 26 * 27 - * @size: Virtual size of the allocation in bytes. 27 + * @size: Size of the allocation in bytes. 28 28 * 29 29 */ 30 30 TRACE_EVENT(gpu_mem_total,
+4 -4
include/trace/events/wbt.h
··· 33 33 ), 34 34 35 35 TP_fast_assign( 36 - strlcpy(__entry->name, dev_name(bdi->dev), 36 + strlcpy(__entry->name, bdi_dev_name(bdi), 37 37 ARRAY_SIZE(__entry->name)); 38 38 __entry->rmean = stat[0].mean; 39 39 __entry->rmin = stat[0].min; ··· 68 68 ), 69 69 70 70 TP_fast_assign( 71 - strlcpy(__entry->name, dev_name(bdi->dev), 71 + strlcpy(__entry->name, bdi_dev_name(bdi), 72 72 ARRAY_SIZE(__entry->name)); 73 73 __entry->lat = div_u64(lat, 1000); 74 74 ), ··· 105 105 ), 106 106 107 107 TP_fast_assign( 108 - strlcpy(__entry->name, dev_name(bdi->dev), 108 + strlcpy(__entry->name, bdi_dev_name(bdi), 109 109 ARRAY_SIZE(__entry->name)); 110 110 __entry->msg = msg; 111 111 __entry->step = step; ··· 141 141 ), 142 142 143 143 TP_fast_assign( 144 - strlcpy(__entry->name, dev_name(bdi->dev), 144 + strlcpy(__entry->name, bdi_dev_name(bdi), 145 145 ARRAY_SIZE(__entry->name)); 146 146 __entry->status = status; 147 147 __entry->step = step;
+3 -18
init/Kconfig
··· 39 39 config CC_HAS_ASM_INLINE 40 40 def_bool $(success,echo 'void foo(void) { asm inline (""); }' | $(CC) -x c - -c -o /dev/null) 41 41 42 - config CC_HAS_WARN_MAYBE_UNINITIALIZED 43 - def_bool $(cc-option,-Wmaybe-uninitialized) 44 - help 45 - GCC >= 4.7 supports this option. 46 - 47 - config CC_DISABLE_WARN_MAYBE_UNINITIALIZED 48 - bool 49 - depends on CC_HAS_WARN_MAYBE_UNINITIALIZED 50 - default CC_IS_GCC && GCC_VERSION < 40900 # unreliable for GCC < 4.9 51 - help 52 - GCC's -Wmaybe-uninitialized is not reliable by definition. 53 - Lots of false positive warnings are produced in some cases. 54 - 55 - If this option is enabled, -Wno-maybe-uninitialzed is passed 56 - to the compiler to suppress maybe-uninitialized warnings. 57 - 58 42 config CONSTRUCTORS 59 43 bool 60 44 depends on !UML ··· 1241 1257 config CC_OPTIMIZE_FOR_PERFORMANCE_O3 1242 1258 bool "Optimize more for performance (-O3)" 1243 1259 depends on ARC 1244 - imply CC_DISABLE_WARN_MAYBE_UNINITIALIZED # avoid false positives 1245 1260 help 1246 1261 Choosing this option will pass "-O3" to your compiler to optimize 1247 1262 the kernel yet more for performance. 1248 1263 1249 1264 config CC_OPTIMIZE_FOR_SIZE 1250 1265 bool "Optimize for size (-Os)" 1251 - imply CC_DISABLE_WARN_MAYBE_UNINITIALIZED # avoid false positives 1252 1266 help 1253 1267 Choosing this option will pass "-Os" to your compiler resulting 1254 1268 in a smaller kernel. ··· 2260 2278 functions to call on what tags. 2261 2279 2262 2280 source "kernel/Kconfig.locks" 2281 + 2282 + config ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE 2283 + bool 2263 2284 2264 2285 config ARCH_HAS_SYNC_CORE_BEFORE_USERMODE 2265 2286 bool
+1 -1
init/initramfs.c
··· 542 542 } 543 543 544 544 #ifdef CONFIG_KEXEC_CORE 545 - static bool kexec_free_initrd(void) 545 + static bool __init kexec_free_initrd(void) 546 546 { 547 547 unsigned long crashk_start = (unsigned long)__va(crashk_res.start); 548 548 unsigned long crashk_end = (unsigned long)__va(crashk_res.end);
+55 -18
init/main.c
··· 257 257 258 258 early_param("loglevel", loglevel); 259 259 260 + #ifdef CONFIG_BLK_DEV_INITRD 261 + static void * __init get_boot_config_from_initrd(u32 *_size, u32 *_csum) 262 + { 263 + u32 size, csum; 264 + char *data; 265 + u32 *hdr; 266 + 267 + if (!initrd_end) 268 + return NULL; 269 + 270 + data = (char *)initrd_end - BOOTCONFIG_MAGIC_LEN; 271 + if (memcmp(data, BOOTCONFIG_MAGIC, BOOTCONFIG_MAGIC_LEN)) 272 + return NULL; 273 + 274 + hdr = (u32 *)(data - 8); 275 + size = hdr[0]; 276 + csum = hdr[1]; 277 + 278 + data = ((void *)hdr) - size; 279 + if ((unsigned long)data < initrd_start) { 280 + pr_err("bootconfig size %d is greater than initrd size %ld\n", 281 + size, initrd_end - initrd_start); 282 + return NULL; 283 + } 284 + 285 + /* Remove bootconfig from initramfs/initrd */ 286 + initrd_end = (unsigned long)data; 287 + if (_size) 288 + *_size = size; 289 + if (_csum) 290 + *_csum = csum; 291 + 292 + return data; 293 + } 294 + #else 295 + static void * __init get_boot_config_from_initrd(u32 *_size, u32 *_csum) 296 + { 297 + return NULL; 298 + } 299 + #endif 300 + 260 301 #ifdef CONFIG_BOOT_CONFIG 261 302 262 303 char xbc_namebuf[XBC_KEYLEN_MAX] __initdata; ··· 398 357 int pos; 399 358 u32 size, csum; 400 359 char *data, *copy; 401 - u32 *hdr; 402 360 int ret; 361 + 362 + /* Cut out the bootconfig data even if we have no bootconfig option */ 363 + data = get_boot_config_from_initrd(&size, &csum); 403 364 404 365 strlcpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE); 405 366 parse_args("bootconfig", tmp_cmdline, NULL, 0, 0, 0, NULL, ··· 410 367 if (!bootconfig_found) 411 368 return; 412 369 413 - if (!initrd_end) 414 - goto not_found; 415 - 416 - data = (char *)initrd_end - BOOTCONFIG_MAGIC_LEN; 417 - if (memcmp(data, BOOTCONFIG_MAGIC, BOOTCONFIG_MAGIC_LEN)) 418 - goto not_found; 419 - 420 - hdr = (u32 *)(data - 8); 421 - size = hdr[0]; 422 - csum = hdr[1]; 370 + if (!data) { 371 + pr_err("'bootconfig' found on command line, but no bootconfig 
found\n"); 372 + return; 373 + } 423 374 424 375 if (size >= XBC_DATA_MAX) { 425 376 pr_err("bootconfig size %d greater than max size %d\n", 426 377 size, XBC_DATA_MAX); 427 378 return; 428 379 } 429 - 430 - data = ((void *)hdr) - size; 431 - if ((unsigned long)data < initrd_start) 432 - goto not_found; 433 380 434 381 if (boot_config_checksum((unsigned char *)data, size) != csum) { 435 382 pr_err("bootconfig checksum failed\n"); ··· 450 417 extra_init_args = xbc_make_cmdline("init"); 451 418 } 452 419 return; 453 - not_found: 454 - pr_err("'bootconfig' found on command line, but no bootconfig found\n"); 455 420 } 421 + 456 422 #else 457 - #define setup_boot_config(cmdline) do { } while (0) 423 + 424 + static void __init setup_boot_config(const char *cmdline) 425 + { 426 + /* Remove bootconfig data from initrd */ 427 + get_boot_config_from_initrd(NULL, NULL); 428 + } 458 429 459 430 static int __init warn_bootconfig(char *str) 460 431 {
+26 -8
ipc/mqueue.c
··· 142 142 143 143 struct sigevent notify; 144 144 struct pid *notify_owner; 145 + u32 notify_self_exec_id; 145 146 struct user_namespace *notify_user_ns; 146 147 struct user_struct *user; /* user who created, for accounting */ 147 148 struct sock *notify_sock; ··· 774 773 * synchronously. */ 775 774 if (info->notify_owner && 776 775 info->attr.mq_curmsgs == 1) { 777 - struct kernel_siginfo sig_i; 778 776 switch (info->notify.sigev_notify) { 779 777 case SIGEV_NONE: 780 778 break; 781 - case SIGEV_SIGNAL: 782 - /* sends signal */ 779 + case SIGEV_SIGNAL: { 780 + struct kernel_siginfo sig_i; 781 + struct task_struct *task; 782 + 783 + /* do_mq_notify() accepts sigev_signo == 0, why?? */ 784 + if (!info->notify.sigev_signo) 785 + break; 783 786 784 787 clear_siginfo(&sig_i); 785 788 sig_i.si_signo = info->notify.sigev_signo; 786 789 sig_i.si_errno = 0; 787 790 sig_i.si_code = SI_MESGQ; 788 791 sig_i.si_value = info->notify.sigev_value; 789 - /* map current pid/uid into info->owner's namespaces */ 790 792 rcu_read_lock(); 793 + /* map current pid/uid into info->owner's namespaces */ 791 794 sig_i.si_pid = task_tgid_nr_ns(current, 792 795 ns_of_pid(info->notify_owner)); 793 - sig_i.si_uid = from_kuid_munged(info->notify_user_ns, current_uid()); 796 + sig_i.si_uid = from_kuid_munged(info->notify_user_ns, 797 + current_uid()); 798 + /* 799 + * We can't use kill_pid_info(), this signal should 800 + * bypass check_kill_permission(). It is from kernel 801 + * but si_fromuser() can't know this. 802 + * We do check the self_exec_id, to avoid sending 803 + * signals to programs that don't expect them. 
804 + */ 805 + task = pid_task(info->notify_owner, PIDTYPE_TGID); 806 + if (task && task->self_exec_id == 807 + info->notify_self_exec_id) { 808 + do_send_sig_info(info->notify.sigev_signo, 809 + &sig_i, task, PIDTYPE_TGID); 810 + } 794 811 rcu_read_unlock(); 795 - 796 - kill_pid_info(info->notify.sigev_signo, 797 - &sig_i, info->notify_owner); 798 812 break; 813 + } 799 814 case SIGEV_THREAD: 800 815 set_cookie(info->notify_cookie, NOTIFY_WOKENUP); 801 816 netlink_sendskb(info->notify_sock, info->notify_cookie); ··· 1400 1383 info->notify.sigev_signo = notification->sigev_signo; 1401 1384 info->notify.sigev_value = notification->sigev_value; 1402 1385 info->notify.sigev_notify = SIGEV_SIGNAL; 1386 + info->notify_self_exec_id = current->self_exec_id; 1403 1387 break; 1404 1388 } 1405 1389
+6 -6
ipc/util.c
··· 764 764 total++; 765 765 } 766 766 767 - *new_pos = pos + 1; 767 + ipc = NULL; 768 768 if (total >= ids->in_use) 769 - return NULL; 769 + goto out; 770 770 771 771 for (; pos < ipc_mni; pos++) { 772 772 ipc = idr_find(&ids->ipcs_idr, pos); 773 773 if (ipc != NULL) { 774 774 rcu_read_lock(); 775 775 ipc_lock_object(ipc); 776 - return ipc; 776 + break; 777 777 } 778 778 } 779 - 780 - /* Out of range - return NULL to terminate iteration */ 781 - return NULL; 779 + out: 780 + *new_pos = pos + 1; 781 + return ipc; 782 782 } 783 783 784 784 static void *sysvipc_proc_next(struct seq_file *s, void *it, loff_t *pos)
+6 -1
kernel/bpf/arraymap.c
··· 486 486 if (!(map->map_flags & BPF_F_MMAPABLE)) 487 487 return -EINVAL; 488 488 489 - return remap_vmalloc_range(vma, array_map_vmalloc_addr(array), pgoff); 489 + if (vma->vm_pgoff * PAGE_SIZE + (vma->vm_end - vma->vm_start) > 490 + PAGE_ALIGN((u64)array->map.max_entries * array->elem_size)) 491 + return -EINVAL; 492 + 493 + return remap_vmalloc_range(vma, array_map_vmalloc_addr(array), 494 + vma->vm_pgoff + pgoff); 490 495 } 491 496 492 497 const struct bpf_map_ops array_map_ops = {
+3 -1
kernel/bpf/syscall.c
··· 1489 1489 if (err) 1490 1490 goto free_value; 1491 1491 1492 - if (copy_to_user(uvalue, value, value_size) != 0) 1492 + if (copy_to_user(uvalue, value, value_size) != 0) { 1493 + err = -EFAULT; 1493 1494 goto free_value; 1495 + } 1494 1496 1495 1497 err = 0; 1496 1498
+18 -2
kernel/bpf/verifier.c
··· 4389 4389 4390 4390 if (ret_type != RET_INTEGER || 4391 4391 (func_id != BPF_FUNC_get_stack && 4392 - func_id != BPF_FUNC_probe_read_str)) 4392 + func_id != BPF_FUNC_probe_read_str && 4393 + func_id != BPF_FUNC_probe_read_kernel_str && 4394 + func_id != BPF_FUNC_probe_read_user_str)) 4393 4395 return; 4394 4396 4395 4397 ret_reg->smax_value = meta->msize_max_value; ··· 7115 7113 range = tnum_const(0); 7116 7114 break; 7117 7115 case BPF_PROG_TYPE_TRACING: 7118 - if (env->prog->expected_attach_type != BPF_TRACE_ITER) 7116 + switch (env->prog->expected_attach_type) { 7117 + case BPF_TRACE_FENTRY: 7118 + case BPF_TRACE_FEXIT: 7119 + range = tnum_const(0); 7120 + break; 7121 + case BPF_TRACE_ITER: 7122 + case BPF_TRACE_RAW_TP: 7123 + case BPF_MODIFY_RETURN: 7119 7124 return 0; 7125 + default: 7126 + return -ENOTSUPP; 7127 + } 7120 7128 break; 7129 + case BPF_PROG_TYPE_EXT: 7130 + /* freplace program can return anything as its return value 7131 + * depends on the to-be-replaced kernel func or bpf program. 7132 + */ 7121 7133 default: 7122 7134 return 0; 7123 7135 }
+7 -6
kernel/fork.c
··· 2486 2486 int __user *child_tidptr) 2487 2487 { 2488 2488 struct kernel_clone_args args = { 2489 - .flags = (clone_flags & ~CSIGNAL), 2489 + .flags = (lower_32_bits(clone_flags) & ~CSIGNAL), 2490 2490 .pidfd = parent_tidptr, 2491 2491 .child_tid = child_tidptr, 2492 2492 .parent_tid = parent_tidptr, 2493 - .exit_signal = (clone_flags & CSIGNAL), 2493 + .exit_signal = (lower_32_bits(clone_flags) & CSIGNAL), 2494 2494 .stack = stack_start, 2495 2495 .stack_size = stack_size, 2496 2496 }; ··· 2508 2508 pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags) 2509 2509 { 2510 2510 struct kernel_clone_args args = { 2511 - .flags = ((flags | CLONE_VM | CLONE_UNTRACED) & ~CSIGNAL), 2512 - .exit_signal = (flags & CSIGNAL), 2511 + .flags = ((lower_32_bits(flags) | CLONE_VM | 2512 + CLONE_UNTRACED) & ~CSIGNAL), 2513 + .exit_signal = (lower_32_bits(flags) & CSIGNAL), 2513 2514 .stack = (unsigned long)fn, 2514 2515 .stack_size = (unsigned long)arg, 2515 2516 }; ··· 2571 2570 #endif 2572 2571 { 2573 2572 struct kernel_clone_args args = { 2574 - .flags = (clone_flags & ~CSIGNAL), 2573 + .flags = (lower_32_bits(clone_flags) & ~CSIGNAL), 2575 2574 .pidfd = parent_tidptr, 2576 2575 .child_tid = child_tidptr, 2577 2576 .parent_tid = parent_tidptr, 2578 - .exit_signal = (clone_flags & CSIGNAL), 2577 + .exit_signal = (lower_32_bits(clone_flags) & CSIGNAL), 2579 2578 .stack = newsp, 2580 2579 .tls = tls, 2581 2580 };
+2 -2
kernel/kcov.c
··· 740 740 * kcov_remote_handle() with KCOV_SUBSYSTEM_COMMON as the subsystem id and an 741 741 * arbitrary 4-byte non-zero number as the instance id). This common handle 742 742 * then gets saved into the task_struct of the process that issued the 743 - * KCOV_REMOTE_ENABLE ioctl. When this proccess issues system calls that spawn 744 - * kernel threads, the common handle must be retrived via kcov_common_handle() 743 + * KCOV_REMOTE_ENABLE ioctl. When this process issues system calls that spawn 744 + * kernel threads, the common handle must be retrieved via kcov_common_handle() 745 745 * and passed to the spawned threads via custom annotations. Those kernel 746 746 * threads must in turn be annotated with kcov_remote_start(common_handle) and 747 747 * kcov_remote_stop(). All of the threads that are spawned by the same process
-1
kernel/trace/Kconfig
··· 466 466 config PROFILE_ALL_BRANCHES 467 467 bool "Profile all if conditionals" if !FORTIFY_SOURCE 468 468 select TRACE_BRANCH_PROFILING 469 - imply CC_DISABLE_WARN_MAYBE_UNINITIALIZED # avoid false positives 470 469 help 471 470 This tracer profiles all branch conditions. Every if () 472 471 taken in the kernel is recorded whether it hit or miss.
+66 -34
kernel/trace/bpf_trace.c
··· 326 326 327 327 /* 328 328 * Only limited trace_printk() conversion specifiers allowed: 329 - * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %s 329 + * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pks %pus %s 330 330 */ 331 331 BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1, 332 332 u64, arg2, u64, arg3) 333 333 { 334 + int i, mod[3] = {}, fmt_cnt = 0; 335 + char buf[64], fmt_ptype; 336 + void *unsafe_ptr = NULL; 334 337 bool str_seen = false; 335 - int mod[3] = {}; 336 - int fmt_cnt = 0; 337 - u64 unsafe_addr; 338 - char buf[64]; 339 - int i; 340 338 341 339 /* 342 340 * bpf_check()->check_func_arg()->check_stack_boundary() ··· 360 362 if (fmt[i] == 'l') { 361 363 mod[fmt_cnt]++; 362 364 i++; 363 - } else if (fmt[i] == 'p' || fmt[i] == 's') { 365 + } else if (fmt[i] == 'p') { 364 366 mod[fmt_cnt]++; 367 + if ((fmt[i + 1] == 'k' || 368 + fmt[i + 1] == 'u') && 369 + fmt[i + 2] == 's') { 370 + fmt_ptype = fmt[i + 1]; 371 + i += 2; 372 + goto fmt_str; 373 + } 374 + 365 375 /* disallow any further format extensions */ 366 376 if (fmt[i + 1] != 0 && 367 377 !isspace(fmt[i + 1]) && 368 378 !ispunct(fmt[i + 1])) 369 379 return -EINVAL; 370 - fmt_cnt++; 371 - if (fmt[i] == 's') { 372 - if (str_seen) 373 - /* allow only one '%s' per fmt string */ 374 - return -EINVAL; 375 - str_seen = true; 376 380 377 - switch (fmt_cnt) { 378 - case 1: 379 - unsafe_addr = arg1; 380 - arg1 = (long) buf; 381 - break; 382 - case 2: 383 - unsafe_addr = arg2; 384 - arg2 = (long) buf; 385 - break; 386 - case 3: 387 - unsafe_addr = arg3; 388 - arg3 = (long) buf; 389 - break; 390 - } 391 - buf[0] = 0; 392 - strncpy_from_unsafe(buf, 393 - (void *) (long) unsafe_addr, 394 - sizeof(buf)); 381 + goto fmt_next; 382 + } else if (fmt[i] == 's') { 383 + mod[fmt_cnt]++; 384 + fmt_ptype = fmt[i]; 385 + fmt_str: 386 + if (str_seen) 387 + /* allow only one '%s' per fmt string */ 388 + return -EINVAL; 389 + str_seen = true; 390 + 391 + if (fmt[i + 1] != 0 && 392 + 
!isspace(fmt[i + 1]) && 393 + !ispunct(fmt[i + 1])) 394 + return -EINVAL; 395 + 396 + switch (fmt_cnt) { 397 + case 0: 398 + unsafe_ptr = (void *)(long)arg1; 399 + arg1 = (long)buf; 400 + break; 401 + case 1: 402 + unsafe_ptr = (void *)(long)arg2; 403 + arg2 = (long)buf; 404 + break; 405 + case 2: 406 + unsafe_ptr = (void *)(long)arg3; 407 + arg3 = (long)buf; 408 + break; 395 409 } 396 - continue; 410 + 411 + buf[0] = 0; 412 + switch (fmt_ptype) { 413 + case 's': 414 + #ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE 415 + strncpy_from_unsafe(buf, unsafe_ptr, 416 + sizeof(buf)); 417 + break; 418 + #endif 419 + case 'k': 420 + strncpy_from_unsafe_strict(buf, unsafe_ptr, 421 + sizeof(buf)); 422 + break; 423 + case 'u': 424 + strncpy_from_unsafe_user(buf, 425 + (__force void __user *)unsafe_ptr, 426 + sizeof(buf)); 427 + break; 428 + } 429 + goto fmt_next; 397 430 } 398 431 399 432 if (fmt[i] == 'l') { ··· 435 406 if (fmt[i] != 'i' && fmt[i] != 'd' && 436 407 fmt[i] != 'u' && fmt[i] != 'x') 437 408 return -EINVAL; 409 + fmt_next: 438 410 fmt_cnt++; 439 411 } 440 412 ··· 1066 1036 return &bpf_probe_read_user_proto; 1067 1037 case BPF_FUNC_probe_read_kernel: 1068 1038 return &bpf_probe_read_kernel_proto; 1069 - case BPF_FUNC_probe_read: 1070 - return &bpf_probe_read_compat_proto; 1071 1039 case BPF_FUNC_probe_read_user_str: 1072 1040 return &bpf_probe_read_user_str_proto; 1073 1041 case BPF_FUNC_probe_read_kernel_str: 1074 1042 return &bpf_probe_read_kernel_str_proto; 1043 + #ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE 1044 + case BPF_FUNC_probe_read: 1045 + return &bpf_probe_read_compat_proto; 1075 1046 case BPF_FUNC_probe_read_str: 1076 1047 return &bpf_probe_read_compat_str_proto; 1048 + #endif 1077 1049 #ifdef CONFIG_CGROUPS 1078 1050 case BPF_FUNC_get_current_cgroup_id: 1079 1051 return &bpf_get_current_cgroup_id_proto;
-22
kernel/trace/ftrace_internal.h
··· 4 4 5 5 #ifdef CONFIG_FUNCTION_TRACER 6 6 7 - /* 8 - * Traverse the ftrace_global_list, invoking all entries. The reason that we 9 - * can use rcu_dereference_raw_check() is that elements removed from this list 10 - * are simply leaked, so there is no need to interact with a grace-period 11 - * mechanism. The rcu_dereference_raw_check() calls are needed to handle 12 - * concurrent insertions into the ftrace_global_list. 13 - * 14 - * Silly Alpha and silly pointer-speculation compiler optimizations! 15 - */ 16 - #define do_for_each_ftrace_op(op, list) \ 17 - op = rcu_dereference_raw_check(list); \ 18 - do 19 - 20 - /* 21 - * Optimized for just a single item in the list (as that is the normal case). 22 - */ 23 - #define while_for_each_ftrace_op(op) \ 24 - while (likely(op = rcu_dereference_raw_check((op)->next)) && \ 25 - unlikely((op) != &ftrace_list_end)) 26 - 27 - extern struct ftrace_ops __rcu *ftrace_ops_list; 28 - extern struct ftrace_ops ftrace_list_end; 29 7 extern struct mutex ftrace_lock; 30 8 extern struct ftrace_ops global_ops; 31 9
+32 -6
kernel/trace/preemptirq_delay_test.c
··· 16 16 #include <linux/printk.h> 17 17 #include <linux/string.h> 18 18 #include <linux/sysfs.h> 19 + #include <linux/completion.h> 19 20 20 21 static ulong delay = 100; 21 22 static char test_mode[12] = "irq"; ··· 28 27 MODULE_PARM_DESC(delay, "Period in microseconds (100 us default)"); 29 28 MODULE_PARM_DESC(test_mode, "Mode of the test such as preempt, irq, or alternate (default irq)"); 30 29 MODULE_PARM_DESC(burst_size, "The size of a burst (default 1)"); 30 + 31 + static struct completion done; 31 32 32 33 #define MIN(x, y) ((x) < (y) ? (x) : (y)) 33 34 ··· 116 113 117 114 for (i = 0; i < s; i++) 118 115 (testfuncs[i])(i); 116 + 117 + complete(&done); 118 + 119 + set_current_state(TASK_INTERRUPTIBLE); 120 + while (!kthread_should_stop()) { 121 + schedule(); 122 + set_current_state(TASK_INTERRUPTIBLE); 123 + } 124 + 125 + __set_current_state(TASK_RUNNING); 126 + 119 127 return 0; 120 128 } 121 129 122 - static struct task_struct *preemptirq_start_test(void) 130 + static int preemptirq_run_test(void) 123 131 { 132 + struct task_struct *task; 124 133 char task_name[50]; 125 134 135 + init_completion(&done); 136 + 126 137 snprintf(task_name, sizeof(task_name), "%s_test", test_mode); 127 - return kthread_run(preemptirq_delay_run, NULL, task_name); 138 + task = kthread_run(preemptirq_delay_run, NULL, task_name); 139 + if (IS_ERR(task)) 140 + return PTR_ERR(task); 141 + if (task) { 142 + wait_for_completion(&done); 143 + kthread_stop(task); 144 + } 145 + return 0; 128 146 } 129 147 130 148 131 149 static ssize_t trigger_store(struct kobject *kobj, struct kobj_attribute *attr, 132 150 const char *buf, size_t count) 133 151 { 134 - preemptirq_start_test(); 152 + ssize_t ret; 153 + 154 + ret = preemptirq_run_test(); 155 + if (ret) 156 + return ret; 135 157 return count; 136 158 } 137 159 ··· 176 148 177 149 static int __init preemptirq_delay_init(void) 178 150 { 179 - struct task_struct *test_task; 180 151 int retval; 181 152 182 - test_task = preemptirq_start_test(); 
); 183 - retval = PTR_ERR_OR_ZERO(test_task); 153 + retval = preemptirq_run_test(); 184 154 if (retval != 0) 185 155 return retval; 186 156
+13 -21
kernel/trace/ring_buffer.c
··· 193 193 case RINGBUF_TYPE_DATA: 194 194 return rb_event_data_length(event); 195 195 default: 196 - BUG(); 196 + WARN_ON_ONCE(1); 197 197 } 198 198 /* not hit */ 199 199 return 0; ··· 249 249 { 250 250 if (extended_time(event)) 251 251 event = skip_time_extend(event); 252 - BUG_ON(event->type_len > RINGBUF_TYPE_DATA_TYPE_LEN_MAX); 252 + WARN_ON_ONCE(event->type_len > RINGBUF_TYPE_DATA_TYPE_LEN_MAX); 253 253 /* If length is in len field, then array[0] has the data */ 254 254 if (event->type_len) 255 255 return (void *)&event->array[0]; ··· 3727 3727 return; 3728 3728 3729 3729 default: 3730 - BUG(); 3730 + RB_WARN_ON(cpu_buffer, 1); 3731 3731 } 3732 3732 return; 3733 3733 } ··· 3757 3757 return; 3758 3758 3759 3759 default: 3760 - BUG(); 3760 + RB_WARN_ON(iter->cpu_buffer, 1); 3761 3761 } 3762 3762 return; 3763 3763 } ··· 4020 4020 return event; 4021 4021 4022 4022 default: 4023 - BUG(); 4023 + RB_WARN_ON(cpu_buffer, 1); 4024 4024 } 4025 4025 4026 4026 return NULL; ··· 4034 4034 struct ring_buffer_per_cpu *cpu_buffer; 4035 4035 struct ring_buffer_event *event; 4036 4036 int nr_loops = 0; 4037 - bool failed = false; 4038 4037 4039 4038 if (ts) 4040 4039 *ts = 0; ··· 4055 4056 return NULL; 4056 4057 4057 4058 /* 4058 - * We repeat when a time extend is encountered or we hit 4059 - * the end of the page. Since the time extend is always attached 4060 - * to a data event, we should never loop more than three times. 4061 - * Once for going to next page, once on time extend, and 4062 - * finally once to get the event. 4063 - * We should never hit the following condition more than thrice, 4064 - * unless the buffer is very small, and there's a writer 4065 - * that is causing the reader to fail getting an event. 4059 + * As the writer can mess with what the iterator is trying 4060 + * to read, just give up if we fail to get an event after 4061 + * three tries. The iterator is not as reliable when reading 4062 + * the ring buffer with an active write as the consumer is. 
4063 + * Do not warn if the three failures is reached. 4066 4064 */ 4067 - if (++nr_loops > 3) { 4068 - RB_WARN_ON(cpu_buffer, !failed); 4065 + if (++nr_loops > 3) 4069 4066 return NULL; 4070 - } 4071 4067 4072 4068 if (rb_per_cpu_empty(cpu_buffer)) 4073 4069 return NULL; ··· 4073 4079 } 4074 4080 4075 4081 event = rb_iter_head_event(iter); 4076 - if (!event) { 4077 - failed = true; 4082 + if (!event) 4078 4083 goto again; 4079 - } 4080 4084 4081 4085 switch (event->type_len) { 4082 4086 case RINGBUF_TYPE_PADDING: ··· 4109 4117 return event; 4110 4118 4111 4119 default: 4112 - BUG(); 4120 + RB_WARN_ON(cpu_buffer, 1); 4113 4121 } 4114 4122 4115 4123 return NULL;
+15 -1
kernel/trace/trace.c
··· 947 947 EXPORT_SYMBOL_GPL(__trace_bputs); 948 948 949 949 #ifdef CONFIG_TRACER_SNAPSHOT 950 - void tracing_snapshot_instance_cond(struct trace_array *tr, void *cond_data) 950 + static void tracing_snapshot_instance_cond(struct trace_array *tr, 951 + void *cond_data) 951 952 { 952 953 struct tracer *tracer = tr->current_trace; 953 954 unsigned long flags; ··· 8526 8525 */ 8527 8526 allocate_snapshot = false; 8528 8527 #endif 8528 + 8529 + /* 8530 + * Because of some magic with the way alloc_percpu() works on 8531 + * x86_64, we need to synchronize the pgd of all the tables, 8532 + * otherwise the trace events that happen in x86_64 page fault 8533 + * handlers can't cope with accessing the chance that a 8534 + * alloc_percpu()'d memory might be touched in the page fault trace 8535 + * event. Oh, and we need to audit all other alloc_percpu() and vmalloc() 8536 + * calls in tracing, because something might get triggered within a 8537 + * page fault trace event! 8538 + */ 8539 + vmalloc_sync_mappings(); 8540 + 8529 8541 return 0; 8530 8542 } 8531 8543
+10 -14
kernel/trace/trace_boot.c
··· 95 95 struct xbc_node *anode; 96 96 char buf[MAX_BUF_LEN]; 97 97 const char *val; 98 - int ret; 99 - 100 - kprobe_event_cmd_init(&cmd, buf, MAX_BUF_LEN); 101 - 102 - ret = kprobe_event_gen_cmd_start(&cmd, event, NULL); 103 - if (ret) 104 - return ret; 98 + int ret = 0; 105 99 106 100 xbc_node_for_each_array_value(node, "probes", anode, val) { 107 - ret = kprobe_event_add_field(&cmd, val); 108 - if (ret) 109 - return ret; 110 - } 101 + kprobe_event_cmd_init(&cmd, buf, MAX_BUF_LEN); 111 102 112 - ret = kprobe_event_gen_cmd_end(&cmd); 113 - if (ret) 114 - pr_err("Failed to add probe: %s\n", buf); 103 + ret = kprobe_event_gen_cmd_start(&cmd, event, val); 104 + if (ret) 105 + break; 106 + 107 + ret = kprobe_event_gen_cmd_end(&cmd); 108 + if (ret) 109 + pr_err("Failed to add probe: %s\n", buf); 110 + } 115 111 116 112 return ret; 117 113 }
+7 -1
kernel/trace/trace_kprobe.c
··· 453 453 454 454 static bool within_notrace_func(struct trace_kprobe *tk) 455 455 { 456 - unsigned long addr = addr = trace_kprobe_address(tk); 456 + unsigned long addr = trace_kprobe_address(tk); 457 457 char symname[KSYM_NAME_LEN], *p; 458 458 459 459 if (!__within_notrace_func(addr)) ··· 940 940 * complete command or only the first part of it; in the latter case, 941 941 * kprobe_event_add_fields() can be used to add more fields following this. 942 942 * 943 + * Unlikely the synth_event_gen_cmd_start(), @loc must be specified. This 944 + * returns -EINVAL if @loc == NULL. 945 + * 943 946 * Return: 0 if successful, error otherwise. 944 947 */ 945 948 int __kprobe_event_gen_cmd_start(struct dynevent_cmd *cmd, bool kretprobe, ··· 954 951 int ret; 955 952 956 953 if (cmd->type != DYNEVENT_TYPE_KPROBE) 954 + return -EINVAL; 955 + 956 + if (!loc) 957 957 return -EINVAL; 958 958 959 959 if (kretprobe)
+11
kernel/umh.c
··· 475 475 { 476 476 struct umh_info *umh_info = info->data; 477 477 478 + /* cleanup if umh_pipe_setup() was successful but exec failed */ 479 + if (info->pid && info->retval) { 480 + fput(umh_info->pipe_to_umh); 481 + fput(umh_info->pipe_from_umh); 482 + } 483 + 478 484 argv_free(info->argv); 479 485 umh_info->pid = info->pid; 480 486 } ··· 550 544 * Runs a user-space application. The application is started 551 545 * asynchronously if wait is not set, and runs as a child of system workqueues. 552 546 * (ie. it runs with full root capabilities and optimized affinity). 547 + * 548 + * Note: successful return value does not guarantee the helper was called at 549 + * all. You can't rely on sub_info->{init,cleanup} being called even for 550 + * UMH_WAIT_* wait modes as STATIC_USERMODEHELPER_PATH="" turns all helpers 551 + * into a successful no-op. 553 552 */ 554 553 int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait) 555 554 {
+7 -10
lib/Kconfig.ubsan
··· 60 60 Enabling this option will get kernel image size increased 61 61 significantly. 62 62 63 - config UBSAN_NO_ALIGNMENT 64 - bool "Disable checking of pointers alignment" 65 - default y if HAVE_EFFICIENT_UNALIGNED_ACCESS 66 - help 67 - This option disables the check of unaligned memory accesses. 68 - This option should be used when building allmodconfig. 69 - Disabling this option on architectures that support unaligned 70 - accesses may produce a lot of false positives. 71 - 72 63 config UBSAN_ALIGNMENT 73 - def_bool !UBSAN_NO_ALIGNMENT 64 + bool "Enable checks for pointers alignment" 65 + default !HAVE_EFFICIENT_UNALIGNED_ACCESS 66 + depends on !X86 || !COMPILE_TEST 67 + help 68 + This option enables the check of unaligned memory accesses. 69 + Enabling this option on architectures that support unaligned 70 + accesses may produce a lot of false positives. 74 71 75 72 config TEST_UBSAN 76 73 tristate "Module for testing for undefined behavior detection"
+12
lib/vsprintf.c
··· 2168 2168 * f full name 2169 2169 * P node name, including a possible unit address 2170 2170 * - 'x' For printing the address. Equivalent to "%lx". 2171 + * - '[ku]s' For a BPF/tracing related format specifier, e.g. used out of 2172 + * bpf_trace_printk() where [ku] prefix specifies either kernel (k) 2173 + * or user (u) memory to probe, and: 2174 + * s a string, equivalent to "%s" on direct vsnprintf() use 2171 2175 * 2172 2176 * ** When making changes please also update: 2173 2177 * Documentation/core-api/printk-formats.rst ··· 2255 2251 if (!IS_ERR(ptr)) 2256 2252 break; 2257 2253 return err_ptr(buf, end, ptr, spec); 2254 + case 'u': 2255 + case 'k': 2256 + switch (fmt[1]) { 2257 + case 's': 2258 + return string(buf, end, ptr, spec); 2259 + default: 2260 + return error_string(buf, end, "(einval)", spec); 2261 + } 2258 2262 } 2259 2263 2260 2264 /* default is to _not_ leak addresses, hash before printing */
+11 -2
mm/backing-dev.c
··· 21 21 EXPORT_SYMBOL_GPL(noop_backing_dev_info); 22 22 23 23 static struct class *bdi_class; 24 - const char *bdi_unknown_name = "(unknown)"; 24 + static const char *bdi_unknown_name = "(unknown)"; 25 25 26 26 /* 27 27 * bdi_lock protects bdi_tree and updates to bdi_list. bdi_list has RCU ··· 938 938 if (bdi->dev) /* The driver needs to use separate queues per device */ 939 939 return 0; 940 940 941 - dev = device_create_vargs(bdi_class, NULL, MKDEV(0, 0), bdi, fmt, args); 941 + vsnprintf(bdi->dev_name, sizeof(bdi->dev_name), fmt, args); 942 + dev = device_create(bdi_class, NULL, MKDEV(0, 0), bdi, bdi->dev_name); 942 943 if (IS_ERR(dev)) 943 944 return PTR_ERR(dev); 944 945 ··· 1043 1042 kref_put(&bdi->refcnt, release_bdi); 1044 1043 } 1045 1044 EXPORT_SYMBOL(bdi_put); 1045 + 1046 + const char *bdi_dev_name(struct backing_dev_info *bdi) 1047 + { 1048 + if (!bdi || !bdi->dev) 1049 + return bdi_unknown_name; 1050 + return bdi->dev_name; 1051 + } 1052 + EXPORT_SYMBOL_GPL(bdi_dev_name); 1046 1053 1047 1054 static wait_queue_head_t congestion_wqh[2] = { 1048 1055 __WAIT_QUEUE_HEAD_INITIALIZER(congestion_wqh[0]),
+7 -5
mm/gup.c
··· 1218 1218 if (!vma_permits_fault(vma, fault_flags)) 1219 1219 return -EFAULT; 1220 1220 1221 + if ((fault_flags & FAULT_FLAG_KILLABLE) && 1222 + fatal_signal_pending(current)) 1223 + return -EINTR; 1224 + 1221 1225 ret = handle_mm_fault(vma, address, fault_flags); 1222 1226 major |= ret & VM_FAULT_MAJOR; 1223 1227 if (ret & VM_FAULT_ERROR) { ··· 1234 1230 1235 1231 if (ret & VM_FAULT_RETRY) { 1236 1232 down_read(&mm->mmap_sem); 1237 - if (!(fault_flags & FAULT_FLAG_TRIED)) { 1238 - *unlocked = true; 1239 - fault_flags |= FAULT_FLAG_TRIED; 1240 - goto retry; 1241 - } 1233 + *unlocked = true; 1234 + fault_flags |= FAULT_FLAG_TRIED; 1235 + goto retry; 1242 1236 } 1243 1237 1244 1238 if (tsk) {
+10 -5
mm/kasan/Makefile
··· 1 1 # SPDX-License-Identifier: GPL-2.0 2 2 KASAN_SANITIZE := n 3 - UBSAN_SANITIZE_common.o := n 4 - UBSAN_SANITIZE_generic.o := n 5 - UBSAN_SANITIZE_generic_report.o := n 6 - UBSAN_SANITIZE_tags.o := n 3 + UBSAN_SANITIZE := n 7 4 KCOV_INSTRUMENT := n 8 5 6 + # Disable ftrace to avoid recursion. 9 7 CFLAGS_REMOVE_common.o = $(CC_FLAGS_FTRACE) 10 8 CFLAGS_REMOVE_generic.o = $(CC_FLAGS_FTRACE) 11 9 CFLAGS_REMOVE_generic_report.o = $(CC_FLAGS_FTRACE) 10 + CFLAGS_REMOVE_init.o = $(CC_FLAGS_FTRACE) 11 + CFLAGS_REMOVE_quarantine.o = $(CC_FLAGS_FTRACE) 12 + CFLAGS_REMOVE_report.o = $(CC_FLAGS_FTRACE) 12 13 CFLAGS_REMOVE_tags.o = $(CC_FLAGS_FTRACE) 14 + CFLAGS_REMOVE_tags_report.o = $(CC_FLAGS_FTRACE) 13 15 14 16 # Function splitter causes unnecessary splits in __asan_load1/__asan_store1 15 17 # see: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63533 16 - 17 18 CFLAGS_common.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) 18 19 CFLAGS_generic.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) 19 20 CFLAGS_generic_report.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) 21 + CFLAGS_init.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) 22 + CFLAGS_quarantine.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) 23 + CFLAGS_report.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) 20 24 CFLAGS_tags.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) 25 + CFLAGS_tags_report.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) 21 26 22 27 obj-$(CONFIG_KASAN) := common.o init.o report.o 23 28 obj-$(CONFIG_KASAN_GENERIC) += generic.o generic_report.o quarantine.o
+32 -2
mm/kasan/kasan.h
··· 212 212 asmlinkage void kasan_unpoison_task_stack_below(const void *watermark); 213 213 void __asan_register_globals(struct kasan_global *globals, size_t size); 214 214 void __asan_unregister_globals(struct kasan_global *globals, size_t size); 215 - void __asan_loadN(unsigned long addr, size_t size); 216 - void __asan_storeN(unsigned long addr, size_t size); 217 215 void __asan_handle_no_return(void); 218 216 void __asan_alloca_poison(unsigned long addr, size_t size); 219 217 void __asan_allocas_unpoison(const void *stack_top, const void *stack_bottom); ··· 226 228 void __asan_store8(unsigned long addr); 227 229 void __asan_load16(unsigned long addr); 228 230 void __asan_store16(unsigned long addr); 231 + void __asan_loadN(unsigned long addr, size_t size); 232 + void __asan_storeN(unsigned long addr, size_t size); 229 233 230 234 void __asan_load1_noabort(unsigned long addr); 231 235 void __asan_store1_noabort(unsigned long addr); ··· 239 239 void __asan_store8_noabort(unsigned long addr); 240 240 void __asan_load16_noabort(unsigned long addr); 241 241 void __asan_store16_noabort(unsigned long addr); 242 + void __asan_loadN_noabort(unsigned long addr, size_t size); 243 + void __asan_storeN_noabort(unsigned long addr, size_t size); 244 + 245 + void __asan_report_load1_noabort(unsigned long addr); 246 + void __asan_report_store1_noabort(unsigned long addr); 247 + void __asan_report_load2_noabort(unsigned long addr); 248 + void __asan_report_store2_noabort(unsigned long addr); 249 + void __asan_report_load4_noabort(unsigned long addr); 250 + void __asan_report_store4_noabort(unsigned long addr); 251 + void __asan_report_load8_noabort(unsigned long addr); 252 + void __asan_report_store8_noabort(unsigned long addr); 253 + void __asan_report_load16_noabort(unsigned long addr); 254 + void __asan_report_store16_noabort(unsigned long addr); 255 + void __asan_report_load_n_noabort(unsigned long addr, size_t size); 256 + void __asan_report_store_n_noabort(unsigned long 
addr, size_t size); 242 257 243 258 void __asan_set_shadow_00(const void *addr, size_t size); 244 259 void __asan_set_shadow_f1(const void *addr, size_t size); ··· 261 246 void __asan_set_shadow_f3(const void *addr, size_t size); 262 247 void __asan_set_shadow_f5(const void *addr, size_t size); 263 248 void __asan_set_shadow_f8(const void *addr, size_t size); 249 + 250 + void __hwasan_load1_noabort(unsigned long addr); 251 + void __hwasan_store1_noabort(unsigned long addr); 252 + void __hwasan_load2_noabort(unsigned long addr); 253 + void __hwasan_store2_noabort(unsigned long addr); 254 + void __hwasan_load4_noabort(unsigned long addr); 255 + void __hwasan_store4_noabort(unsigned long addr); 256 + void __hwasan_load8_noabort(unsigned long addr); 257 + void __hwasan_store8_noabort(unsigned long addr); 258 + void __hwasan_load16_noabort(unsigned long addr); 259 + void __hwasan_store16_noabort(unsigned long addr); 260 + void __hwasan_loadN_noabort(unsigned long addr, size_t size); 261 + void __hwasan_storeN_noabort(unsigned long addr, size_t size); 262 + 263 + void __hwasan_tag_memory(unsigned long addr, u8 tag, unsigned long size); 264 264 265 265 #endif
+9 -6
mm/memcontrol.c
··· 4990 4990 unsigned int size; 4991 4991 int node; 4992 4992 int __maybe_unused i; 4993 + long error = -ENOMEM; 4993 4994 4994 4995 size = sizeof(struct mem_cgroup); 4995 4996 size += nr_node_ids * sizeof(struct mem_cgroup_per_node *); 4996 4997 4997 4998 memcg = kzalloc(size, GFP_KERNEL); 4998 4999 if (!memcg) 4999 - return NULL; 5000 + return ERR_PTR(error); 5000 5001 5001 5002 memcg->id.id = idr_alloc(&mem_cgroup_idr, NULL, 5002 5003 1, MEM_CGROUP_ID_MAX, 5003 5004 GFP_KERNEL); 5004 - if (memcg->id.id < 0) 5005 + if (memcg->id.id < 0) { 5006 + error = memcg->id.id; 5005 5007 goto fail; 5008 + } 5006 5009 5007 5010 memcg->vmstats_local = alloc_percpu(struct memcg_vmstats_percpu); 5008 5011 if (!memcg->vmstats_local) ··· 5049 5046 fail: 5050 5047 mem_cgroup_id_remove(memcg); 5051 5048 __mem_cgroup_free(memcg); 5052 - return NULL; 5049 + return ERR_PTR(error); 5053 5050 } 5054 5051 5055 5052 static struct cgroup_subsys_state * __ref ··· 5060 5057 long error = -ENOMEM; 5061 5058 5062 5059 memcg = mem_cgroup_alloc(); 5063 - if (!memcg) 5064 - return ERR_PTR(error); 5060 + if (IS_ERR(memcg)) 5061 + return ERR_CAST(memcg); 5065 5062 5066 5063 WRITE_ONCE(memcg->high, PAGE_COUNTER_MAX); 5067 5064 memcg->soft_limit = PAGE_COUNTER_MAX; ··· 5111 5108 fail: 5112 5109 mem_cgroup_id_remove(memcg); 5113 5110 mem_cgroup_free(memcg); 5114 - return ERR_PTR(-ENOMEM); 5111 + return ERR_PTR(error); 5115 5112 } 5116 5113 5117 5114 static int mem_cgroup_css_online(struct cgroup_subsys_state *css)
+1 -1
mm/mremap.c
··· 794 794 if (locked && new_len > old_len) 795 795 mm_populate(new_addr + old_len, new_len - old_len); 796 796 userfaultfd_unmap_complete(mm, &uf_unmap_early); 797 - mremap_userfaultfd_complete(&uf, addr, new_addr, old_len); 797 + mremap_userfaultfd_complete(&uf, addr, ret, old_len); 798 798 userfaultfd_unmap_complete(mm, &uf_unmap); 799 799 return ret; 800 800 }
+9
mm/page_alloc.c
··· 1607 1607 if (!__pageblock_pfn_to_page(block_start_pfn, 1608 1608 block_end_pfn, zone)) 1609 1609 return; 1610 + cond_resched(); 1610 1611 } 1611 1612 1612 1613 /* We confirm that there is no hole */ ··· 2400 2399 unsigned long max_boost; 2401 2400 2402 2401 if (!watermark_boost_factor) 2402 + return; 2403 + /* 2404 + * Don't bother in zones that are unlikely to produce results. 2405 + * On small machines, including kdump capture kernels running 2406 + * in a small area, boosting the watermark can cause an out of 2407 + * memory situation immediately. 2408 + */ 2409 + if ((pageblock_nr_pages * 4) > zone_managed_pages(zone)) 2403 2410 return; 2404 2411 2405 2412 max_boost = mult_frac(zone->_watermark[WMARK_HIGH],
+10 -4
mm/percpu.c
··· 80 80 #include <linux/workqueue.h> 81 81 #include <linux/kmemleak.h> 82 82 #include <linux/sched.h> 83 + #include <linux/sched/mm.h> 83 84 84 85 #include <asm/cacheflush.h> 85 86 #include <asm/sections.h> ··· 1558 1557 static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, 1559 1558 gfp_t gfp) 1560 1559 { 1561 - /* whitelisted flags that can be passed to the backing allocators */ 1562 - gfp_t pcpu_gfp = gfp & (GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN); 1563 - bool is_atomic = (gfp & GFP_KERNEL) != GFP_KERNEL; 1564 - bool do_warn = !(gfp & __GFP_NOWARN); 1560 + gfp_t pcpu_gfp; 1561 + bool is_atomic; 1562 + bool do_warn; 1565 1563 static int warn_limit = 10; 1566 1564 struct pcpu_chunk *chunk, *next; 1567 1565 const char *err; ··· 1568 1568 unsigned long flags; 1569 1569 void __percpu *ptr; 1570 1570 size_t bits, bit_align; 1571 + 1572 + gfp = current_gfp_context(gfp); 1573 + /* whitelisted flags that can be passed to the backing allocators */ 1574 + pcpu_gfp = gfp & (GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN); 1575 + is_atomic = (gfp & GFP_KERNEL) != GFP_KERNEL; 1576 + do_warn = !(gfp & __GFP_NOWARN); 1571 1577 1572 1578 /* 1573 1579 * There is now a minimum allocation size of PCPU_MIN_ALLOC_SIZE,
+30 -15
mm/slub.c
··· 551 551 metadata_access_disable(); 552 552 } 553 553 554 + /* 555 + * See comment in calculate_sizes(). 556 + */ 557 + static inline bool freeptr_outside_object(struct kmem_cache *s) 558 + { 559 + return s->offset >= s->inuse; 560 + } 561 + 562 + /* 563 + * Return offset of the end of info block which is inuse + free pointer if 564 + * not overlapping with object. 565 + */ 566 + static inline unsigned int get_info_end(struct kmem_cache *s) 567 + { 568 + if (freeptr_outside_object(s)) 569 + return s->inuse + sizeof(void *); 570 + else 571 + return s->inuse; 572 + } 573 + 554 574 static struct track *get_track(struct kmem_cache *s, void *object, 555 575 enum track_item alloc) 556 576 { 557 577 struct track *p; 558 578 559 - if (s->offset) 560 - p = object + s->offset + sizeof(void *); 561 - else 562 - p = object + s->inuse; 579 + p = object + get_info_end(s); 563 580 564 581 return p + alloc; 565 582 } ··· 703 686 print_section(KERN_ERR, "Redzone ", p + s->object_size, 704 687 s->inuse - s->object_size); 705 688 706 - if (s->offset) 707 - off = s->offset + sizeof(void *); 708 - else 709 - off = s->inuse; 689 + off = get_info_end(s); 710 690 711 691 if (s->flags & SLAB_STORE_USER) 712 692 off += 2 * sizeof(struct track); ··· 796 782 * object address 797 783 * Bytes of the object to be managed. 798 784 * If the freepointer may overlay the object then the free 799 - * pointer is the first word of the object. 785 + * pointer is at the middle of the object. 800 786 * 801 787 * Poisoning uses 0x6b (POISON_FREE) and the last byte is 802 788 * 0xa5 (POISON_END) ··· 830 816 831 817 static int check_pad_bytes(struct kmem_cache *s, struct page *page, u8 *p) 832 818 { 833 - unsigned long off = s->inuse; /* The end of info */ 834 - 835 - if (s->offset) 836 - /* Freepointer is placed after the object. */ 837 - off += sizeof(void *); 819 + unsigned long off = get_info_end(s); /* The end of info */ 838 820 839 821 if (s->flags & SLAB_STORE_USER) 840 822 /* We also have user information there */ ··· 917 907 check_pad_bytes(s, page, p); 918 908 } 919 909 920 - if (!s->offset && val == SLUB_RED_ACTIVE) 910 + if (!freeptr_outside_object(s) && val == SLUB_RED_ACTIVE) 921 911 /* 922 912 * Object and freepointer overlap. Cannot check 923 913 * freepointer while object is allocated. ··· 3597 3587 * 3598 3588 * This is the case if we do RCU, have a constructor or 3599 3589 * destructor or are poisoning the objects. 3590 + * 3591 + * The assumption that s->offset >= s->inuse means free 3592 + * pointer is outside of the object is used in the 3593 + * freeptr_outside_object() function. If that is no 3594 + * longer true, the function needs to be modified. 3600 3595 */ 3601 3596 s->offset = size; 3602 3597 size += sizeof(void *);
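The two helpers this patch factors out can be exercised in isolation. A hypothetical miniature (toy struct, not the kernel's `struct kmem_cache`) showing both placements of the free pointer:

```c
#include <assert.h>
#include <stddef.h>

struct toy_kmem_cache {
	unsigned int offset;	/* free pointer offset */
	unsigned int inuse;	/* end of object data */
};

/* Same test as the patch: the free pointer sits outside the object
 * when calculate_sizes() placed it at or past inuse. */
static int toy_freeptr_outside_object(const struct toy_kmem_cache *s)
{
	return s->offset >= s->inuse;
}

/* End of the info block: inuse plus the free pointer, if and only if
 * the free pointer does not overlap the object. */
static unsigned int toy_get_info_end(const struct toy_kmem_cache *s)
{
	if (toy_freeptr_outside_object(s))
		return s->inuse + sizeof(void *);
	else
		return s->inuse;
}
```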
-1
mm/vmscan.c
··· 1625 1625 * @dst: The temp list to put pages on to. 1626 1626 * @nr_scanned: The number of pages that were scanned. 1627 1627 * @sc: The scan_control struct for this reclaim session 1628 - * @mode: One of the LRU isolation modes 1629 1628 * @lru: LRU list id for isolating 1630 1629 * 1631 1630 * returns how many pages were moved onto *@dst.
+3 -1
net/core/dev.c
··· 9006 9006 netdev_dbg(upper, "Disabling feature %pNF on lower dev %s.\n", 9007 9007 &feature, lower->name); 9008 9008 lower->wanted_features &= ~feature; 9009 - netdev_update_features(lower); 9009 + __netdev_update_features(lower); 9010 9010 9011 9011 if (unlikely(lower->features & feature)) 9012 9012 netdev_WARN(upper, "failed to disable %pNF on %s!\n", 9013 9013 &feature, lower->name); 9014 + else 9015 + netdev_features_change(lower); 9014 9016 } 9015 9017 } 9016 9018 }
+1 -1
net/core/filter.c
··· 2579 2579 } 2580 2580 pop = 0; 2581 2581 } else if (pop >= sge->length - a) { 2582 - sge->length = a; 2583 2582 pop -= (sge->length - a); 2583 + sge->length = a; 2584 2584 } 2585 2585 } 2586 2586
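The two swapped statements fix a use-after-update: the old order overwrote `sge->length` before it was read, so `pop` never shrank. A reduced sketch of both orderings — toy type, not the kernel's scatterlist element:

```c
#include <assert.h>

struct toy_sge {
	int length;
};

/* old order: sge->length is clobbered before the subtraction reads it,
 * so (sge->length - a) is always zero and pop never decreases */
static int toy_pop_buggy(struct toy_sge *sge, int a, int pop)
{
	sge->length = a;
	pop -= (sge->length - a);
	return pop;
}

/* patched order: consume (length - a) from pop first, then truncate */
static int toy_pop_fixed(struct toy_sge *sge, int a, int pop)
{
	pop -= (sge->length - a);
	sge->length = a;
	return pop;
}
```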
+2
net/core/netprio_cgroup.c
··· 236 236 struct task_struct *p; 237 237 struct cgroup_subsys_state *css; 238 238 239 + cgroup_sk_alloc_disable(); 240 + 239 241 cgroup_taskset_for_each(p, css, tset) { 240 242 void *v = (void *)(unsigned long)css->id; 241 243
+4 -2
net/ipv4/cipso_ipv4.c
··· 1258 1258 return ret_val; 1259 1259 } 1260 1260 1261 - secattr->flags |= NETLBL_SECATTR_MLS_CAT; 1261 + if (secattr->attr.mls.cat) 1262 + secattr->flags |= NETLBL_SECATTR_MLS_CAT; 1262 1263 } 1263 1264 1264 1265 return 0; ··· 1440 1439 return ret_val; 1441 1440 } 1442 1441 1443 - secattr->flags |= NETLBL_SECATTR_MLS_CAT; 1442 + if (secattr->attr.mls.cat) 1443 + secattr->flags |= NETLBL_SECATTR_MLS_CAT; 1444 1444 } 1445 1445 1446 1446 return 0;
+4 -2
net/ipv4/ipmr.c
··· 109 109 static void ipmr_expire_process(struct timer_list *t); 110 110 111 111 #ifdef CONFIG_IP_MROUTE_MULTIPLE_TABLES 112 - #define ipmr_for_each_table(mrt, net) \ 113 - list_for_each_entry_rcu(mrt, &net->ipv4.mr_tables, list) 112 + #define ipmr_for_each_table(mrt, net) \ 113 + list_for_each_entry_rcu(mrt, &net->ipv4.mr_tables, list, \ 114 + lockdep_rtnl_is_held() || \ 115 + list_empty(&net->ipv4.mr_tables)) 114 116 115 117 static struct mr_table *ipmr_mr_table_iter(struct net *net, 116 118 struct mr_table *mrt)
+1 -1
net/ipv4/route.c
··· 915 915 /* Check for load limit; set rate_last to the latest sent 916 916 * redirect. 917 917 */ 918 - if (peer->rate_tokens == 0 || 918 + if (peer->n_redirects == 0 || 919 919 time_after(jiffies, 920 920 (peer->rate_last + 921 921 (ip_rt_redirect_load << peer->n_redirects)))) {
+19 -8
net/ipv4/tcp.c
··· 476 476 static inline bool tcp_stream_is_readable(const struct tcp_sock *tp, 477 477 int target, struct sock *sk) 478 478 { 479 - return (READ_ONCE(tp->rcv_nxt) - READ_ONCE(tp->copied_seq) >= target) || 480 - (sk->sk_prot->stream_memory_read ? 481 - sk->sk_prot->stream_memory_read(sk) : false); 479 + int avail = READ_ONCE(tp->rcv_nxt) - READ_ONCE(tp->copied_seq); 480 + 481 + if (avail > 0) { 482 + if (avail >= target) 483 + return true; 484 + if (tcp_rmem_pressure(sk)) 485 + return true; 486 + } 487 + if (sk->sk_prot->stream_memory_read) 488 + return sk->sk_prot->stream_memory_read(sk); 489 + return false; 482 490 } 483 491 484 492 /* ··· 1764 1756 1765 1757 down_read(&current->mm->mmap_sem); 1766 1758 1767 - ret = -EINVAL; 1768 1759 vma = find_vma(current->mm, address); 1769 - if (!vma || vma->vm_start > address || vma->vm_ops != &tcp_vm_ops) 1770 - goto out; 1760 + if (!vma || vma->vm_start > address || vma->vm_ops != &tcp_vm_ops) { 1761 + up_read(&current->mm->mmap_sem); 1762 + return -EINVAL; 1763 + } 1771 1764 zc->length = min_t(unsigned long, zc->length, vma->vm_end - address); 1772 1765 1773 1766 tp = tcp_sk(sk); ··· 2163 2154 tp->urg_data = 0; 2164 2155 tcp_fast_path_check(sk); 2165 2156 } 2166 - if (used + offset < skb->len) 2167 - continue; 2168 2157 2169 2158 if (TCP_SKB_CB(skb)->has_rxtstamp) { 2170 2159 tcp_update_recv_tstamps(skb, &tss); 2171 2160 cmsg_flags |= 2; 2172 2161 } 2162 + 2163 + if (used + offset < skb->len) 2164 + continue; 2165 + 2173 2166 if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN) 2174 2167 goto found_fin_ok; 2175 2168 if (!(flags & MSG_PEEK))
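The reworked tcp_stream_is_readable() adds a memory-pressure escape hatch: a receiver below its `SO_RCVLOWAT` target still wakes up when the receive buffer is under pressure. A simplified model of the new logic — toy socket, with a flag standing in for `tcp_rmem_pressure()`:

```c
#include <assert.h>
#include <stdbool.h>

struct toy_tcp_sock {
	unsigned int rcv_nxt;	 /* next sequence expected */
	unsigned int copied_seq; /* last byte handed to userspace */
	bool rmem_pressure;	 /* stand-in for tcp_rmem_pressure(sk) */
};

static bool toy_stream_is_readable(const struct toy_tcp_sock *tp, int target)
{
	int avail = tp->rcv_nxt - tp->copied_seq;

	if (avail > 0) {
		if (avail >= target)
			return true;
		/* below target, but the rcvbuf is under pressure: report
		 * readable anyway so userspace can drain the queue */
		if (tp->rmem_pressure)
			return true;
	}
	return false;
}
```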
+6 -4
net/ipv4/tcp_bpf.c
··· 125 125 126 126 if (!ret) { 127 127 msg->sg.start = i; 128 - msg->sg.size -= apply_bytes; 129 128 sk_psock_queue_msg(psock, tmp); 130 129 sk_psock_data_ready(sk, psock); 131 130 } else { ··· 261 262 struct sk_psock *psock; 262 263 int copied, ret; 263 264 265 + if (unlikely(flags & MSG_ERRQUEUE)) 266 + return inet_recv_error(sk, msg, len, addr_len); 267 + 264 268 psock = sk_psock_get(sk); 265 269 if (unlikely(!psock)) 266 270 return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); 267 - if (unlikely(flags & MSG_ERRQUEUE)) 268 - return inet_recv_error(sk, msg, len, addr_len); 269 271 if (!skb_queue_empty(&sk->sk_receive_queue) && 270 - sk_psock_queue_empty(psock)) 272 + sk_psock_queue_empty(psock)) { 273 + sk_psock_put(sk, psock); 271 274 return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); 275 + } 272 276 lock_sock(sk); 273 277 msg_bytes_ready: 274 278 copied = __tcp_bpf_recvmsg(sk, psock, msg, len, flags);
+2 -1
net/ipv4/tcp_input.c
··· 4769 4769 const struct tcp_sock *tp = tcp_sk(sk); 4770 4770 int avail = tp->rcv_nxt - tp->copied_seq; 4771 4771 4772 - if (avail < sk->sk_rcvlowat && !sock_flag(sk, SOCK_DONE)) 4772 + if (avail < sk->sk_rcvlowat && !tcp_rmem_pressure(sk) && 4773 + !sock_flag(sk, SOCK_DONE)) 4773 4774 return; 4774 4775 4775 4776 sk->sk_data_ready(sk);
+2 -1
net/ipv6/calipso.c
··· 1047 1047 goto getattr_return; 1048 1048 } 1049 1049 1050 - secattr->flags |= NETLBL_SECATTR_MLS_CAT; 1050 + if (secattr->attr.mls.cat) 1051 + secattr->flags |= NETLBL_SECATTR_MLS_CAT; 1051 1052 } 1052 1053 1053 1054 secattr->type = NETLBL_NLTYPE_CALIPSO;
+4 -2
net/ipv6/route.c
··· 2722 2722 const struct in6_addr *daddr, *saddr; 2723 2723 struct rt6_info *rt6 = (struct rt6_info *)dst; 2724 2724 2725 - if (dst_metric_locked(dst, RTAX_MTU)) 2726 - return; 2725 + /* Note: do *NOT* check dst_metric_locked(dst, RTAX_MTU) 2726 + * IPv6 pmtu discovery isn't optional, so 'mtu lock' cannot disable it. 2727 + * [see also comment in rt6_mtu_change_route()] 2728 + */ 2727 2729 2728 2730 if (iph) { 2729 2731 daddr = &iph->daddr;
+2
net/mptcp/protocol.c
··· 1629 1629 1630 1630 ret = mptcp_pm_allow_new_subflow(msk); 1631 1631 if (ret) { 1632 + subflow->map_seq = msk->ack_seq; 1633 + 1632 1634 /* active connections are already on conn_list */ 1633 1635 spin_lock_bh(&msk->join_list_lock); 1634 1636 if (!WARN_ON_ONCE(!list_empty(&subflow->node)))
+10
net/mptcp/subflow.c
··· 1036 1036 if (err) 1037 1037 return err; 1038 1038 1039 + /* the newly created socket really belongs to the owning MPTCP master 1040 + * socket, even if for additional subflows the allocation is performed 1041 + * by a kernel workqueue. Adjust inode references, so that the 1042 + * procfs/diag interaces really show this one belonging to the correct 1043 + * user. 1044 + */ 1045 + SOCK_INODE(sf)->i_ino = SOCK_INODE(sk->sk_socket)->i_ino; 1046 + SOCK_INODE(sf)->i_uid = SOCK_INODE(sk->sk_socket)->i_uid; 1047 + SOCK_INODE(sf)->i_gid = SOCK_INODE(sk->sk_socket)->i_gid; 1048 + 1039 1049 subflow = mptcp_subflow_ctx(sf->sk); 1040 1050 pr_debug("subflow=%p", subflow); 1041 1051
+14 -3
net/netfilter/nf_conntrack_core.c
··· 1519 1519 ct->status = 0; 1520 1520 ct->timeout = 0; 1521 1521 write_pnet(&ct->ct_net, net); 1522 - memset(&ct->__nfct_init_offset[0], 0, 1522 + memset(&ct->__nfct_init_offset, 0, 1523 1523 offsetof(struct nf_conn, proto) - 1524 - offsetof(struct nf_conn, __nfct_init_offset[0])); 1524 + offsetof(struct nf_conn, __nfct_init_offset)); 1525 1525 1526 1526 nf_ct_zone_add(ct, zone); 1527 1527 ··· 2139 2139 nf_conntrack_lock(lockp); 2140 2140 if (*bucket < nf_conntrack_htable_size) { 2141 2141 hlist_nulls_for_each_entry(h, n, &nf_conntrack_hash[*bucket], hnnode) { 2142 - if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL) 2142 + if (NF_CT_DIRECTION(h) != IP_CT_DIR_REPLY) 2143 2143 continue; 2144 + /* All nf_conn objects are added to hash table twice, one 2145 + * for original direction tuple, once for the reply tuple. 2146 + * 2147 + * Exception: In the IPS_NAT_CLASH case, only the reply 2148 + * tuple is added (the original tuple already existed for 2149 + * a different object). 2150 + * 2151 + * We only need to call the iterator once for each 2152 + * conntrack, so we just use the 'reply' direction 2153 + * tuple while iterating. 2154 + */ 2144 2155 ct = nf_ct_tuplehash_to_ctrack(h); 2145 2156 if (iter(ct, data)) 2146 2157 goto found;
+5 -3
net/netfilter/nf_flow_table_core.c
··· 284 284 285 285 if (nf_flow_has_expired(flow)) 286 286 flow_offload_fixup_ct(flow->ct); 287 - else if (test_bit(NF_FLOW_TEARDOWN, &flow->flags)) 287 + else 288 288 flow_offload_fixup_ct_timeout(flow->ct); 289 289 290 290 flow_offload_free(flow); ··· 361 361 { 362 362 struct nf_flowtable *flow_table = data; 363 363 364 - if (nf_flow_has_expired(flow) || nf_ct_is_dying(flow->ct) || 365 - test_bit(NF_FLOW_TEARDOWN, &flow->flags)) { 364 + if (nf_flow_has_expired(flow) || nf_ct_is_dying(flow->ct)) 365 + set_bit(NF_FLOW_TEARDOWN, &flow->flags); 366 + 367 + if (test_bit(NF_FLOW_TEARDOWN, &flow->flags)) { 366 368 if (test_bit(NF_FLOW_HW, &flow->flags)) { 367 369 if (!test_bit(NF_FLOW_HW_DYING, &flow->flags)) 368 370 nf_flow_offload_del(flow_table, flow);
+9 -3
net/netfilter/nf_flow_table_offload.c
··· 820 820 WARN_ON_ONCE(1); 821 821 } 822 822 823 + clear_bit(NF_FLOW_HW_PENDING, &offload->flow->flags); 823 824 kfree(offload); 824 825 } 825 826 ··· 835 834 { 836 835 struct flow_offload_work *offload; 837 836 838 - offload = kmalloc(sizeof(struct flow_offload_work), GFP_ATOMIC); 839 - if (!offload) 837 + if (test_and_set_bit(NF_FLOW_HW_PENDING, &flow->flags)) 840 838 return NULL; 839 + 840 + offload = kmalloc(sizeof(struct flow_offload_work), GFP_ATOMIC); 841 + if (!offload) { 842 + clear_bit(NF_FLOW_HW_PENDING, &flow->flags); 843 + return NULL; 844 + } 841 845 842 846 offload->cmd = cmd; 843 847 offload->flow = flow; ··· 1065 1059 int nf_flow_table_offload_init(void) 1066 1060 { 1067 1061 nf_flow_offload_wq = alloc_workqueue("nf_flow_table_offload", 1068 - WQ_UNBOUND | WQ_MEM_RECLAIM, 0); 1062 + WQ_UNBOUND, 0); 1069 1063 if (!nf_flow_offload_wq) 1070 1064 return -ENOMEM; 1071 1065
+11
net/netfilter/nft_set_rbtree.c
··· 79 79 parent = rcu_dereference_raw(parent->rb_left); 80 80 continue; 81 81 } 82 + 83 + if (nft_set_elem_expired(&rbe->ext)) 84 + return false; 85 + 82 86 if (nft_rbtree_interval_end(rbe)) { 83 87 if (nft_set_is_anonymous(set)) 84 88 return false; ··· 98 94 99 95 if (set->flags & NFT_SET_INTERVAL && interval != NULL && 100 96 nft_set_elem_active(&interval->ext, genmask) && 97 + !nft_set_elem_expired(&interval->ext) && 101 98 nft_rbtree_interval_start(interval)) { 102 99 *ext = &interval->ext; 103 100 return true; ··· 159 154 continue; 160 155 } 161 156 157 + if (nft_set_elem_expired(&rbe->ext)) 158 + return false; 159 + 162 160 if (!nft_set_ext_exists(&rbe->ext, NFT_SET_EXT_FLAGS) || 163 161 (*nft_set_ext_flags(&rbe->ext) & NFT_SET_ELEM_INTERVAL_END) == 164 162 (flags & NFT_SET_ELEM_INTERVAL_END)) { ··· 178 170 179 171 if (set->flags & NFT_SET_INTERVAL && interval != NULL && 180 172 nft_set_elem_active(&interval->ext, genmask) && 173 + !nft_set_elem_expired(&interval->ext) && 181 174 ((!nft_rbtree_interval_end(interval) && 182 175 !(flags & NFT_SET_ELEM_INTERVAL_END)) || 183 176 (nft_rbtree_interval_end(interval) && ··· 426 417 rbe = rb_entry(node, struct nft_rbtree_elem, node); 427 418 428 419 if (iter->count < iter->skip) 420 + goto cont; 421 + if (nft_set_elem_expired(&rbe->ext)) 429 422 goto cont; 430 423 if (!nft_set_elem_active(&rbe->ext, iter->genmask)) 431 424 goto cont;
+6
net/netlabel/netlabel_kapi.c
··· 734 734 if ((off & (BITS_PER_LONG - 1)) != 0) 735 735 return -EINVAL; 736 736 737 + /* a null catmap is equivalent to an empty one */ 738 + if (!catmap) { 739 + *offset = (u32)-1; 740 + return 0; 741 + } 742 + 737 743 if (off < catmap->startbit) { 738 744 off = catmap->startbit; 739 745 *offset = off;
+5 -7
net/sunrpc/auth_gss/auth_gss.c
··· 2032 2032 struct xdr_buf *rcv_buf = &rqstp->rq_rcv_buf; 2033 2033 struct kvec *head = rqstp->rq_rcv_buf.head; 2034 2034 struct rpc_auth *auth = cred->cr_auth; 2035 - unsigned int savedlen = rcv_buf->len; 2036 2035 u32 offset, opaque_len, maj_stat; 2037 2036 __be32 *p; 2038 2037 ··· 2042 2043 offset = (u8 *)(p) - (u8 *)head->iov_base; 2043 2044 if (offset + opaque_len > rcv_buf->len) 2044 2045 goto unwrap_failed; 2045 - rcv_buf->len = offset + opaque_len; 2046 2046 2047 - maj_stat = gss_unwrap(ctx->gc_gss_ctx, offset, rcv_buf); 2047 + maj_stat = gss_unwrap(ctx->gc_gss_ctx, offset, 2048 + offset + opaque_len, rcv_buf); 2048 2049 if (maj_stat == GSS_S_CONTEXT_EXPIRED) 2049 2050 clear_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags); 2050 2051 if (maj_stat != GSS_S_COMPLETE) ··· 2058 2059 */ 2059 2060 xdr_init_decode(xdr, rcv_buf, p, rqstp); 2060 2061 2061 - auth->au_rslack = auth->au_verfsize + 2 + 2062 - XDR_QUADLEN(savedlen - rcv_buf->len); 2063 - auth->au_ralign = auth->au_verfsize + 2 + 2064 - XDR_QUADLEN(savedlen - rcv_buf->len); 2062 + auth->au_rslack = auth->au_verfsize + 2 + ctx->gc_gss_ctx->slack; 2063 + auth->au_ralign = auth->au_verfsize + 2 + ctx->gc_gss_ctx->align; 2064 + 2065 2065 return 0; 2066 2066 unwrap_failed: 2067 2067 trace_rpcgss_unwrap_failed(task);
+4 -4
net/sunrpc/auth_gss/gss_krb5_crypto.c
··· 851 851 } 852 852 853 853 u32 854 - gss_krb5_aes_decrypt(struct krb5_ctx *kctx, u32 offset, struct xdr_buf *buf, 855 - u32 *headskip, u32 *tailskip) 854 + gss_krb5_aes_decrypt(struct krb5_ctx *kctx, u32 offset, u32 len, 855 + struct xdr_buf *buf, u32 *headskip, u32 *tailskip) 856 856 { 857 857 struct xdr_buf subbuf; 858 858 u32 ret = 0; ··· 881 881 882 882 /* create a segment skipping the header and leaving out the checksum */ 883 883 xdr_buf_subsegment(buf, &subbuf, offset + GSS_KRB5_TOK_HDR_LEN, 884 - (buf->len - offset - GSS_KRB5_TOK_HDR_LEN - 884 + (len - offset - GSS_KRB5_TOK_HDR_LEN - 885 885 kctx->gk5e->cksumlength)); 886 886 887 887 nblocks = (subbuf.len + blocksize - 1) / blocksize; ··· 926 926 goto out_err; 927 927 928 928 /* Get the packet's hmac value */ 929 - ret = read_bytes_from_xdr_buf(buf, buf->len - kctx->gk5e->cksumlength, 929 + ret = read_bytes_from_xdr_buf(buf, len - kctx->gk5e->cksumlength, 930 930 pkt_hmac, kctx->gk5e->cksumlength); 931 931 if (ret) 932 932 goto out_err;
+29 -15
net/sunrpc/auth_gss/gss_krb5_wrap.c
··· 261 261 } 262 262 263 263 static u32 264 - gss_unwrap_kerberos_v1(struct krb5_ctx *kctx, int offset, struct xdr_buf *buf) 264 + gss_unwrap_kerberos_v1(struct krb5_ctx *kctx, int offset, int len, 265 + struct xdr_buf *buf, unsigned int *slack, 266 + unsigned int *align) 265 267 { 266 268 int signalg; 267 269 int sealalg; ··· 281 279 u32 conflen = kctx->gk5e->conflen; 282 280 int crypt_offset; 283 281 u8 *cksumkey; 282 + unsigned int saved_len = buf->len; 284 283 285 284 dprintk("RPC: gss_unwrap_kerberos\n"); 286 285 287 286 ptr = (u8 *)buf->head[0].iov_base + offset; 288 287 if (g_verify_token_header(&kctx->mech_used, &bodysize, &ptr, 289 - buf->len - offset)) 288 + len - offset)) 290 289 return GSS_S_DEFECTIVE_TOKEN; 291 290 292 291 if ((ptr[0] != ((KG_TOK_WRAP_MSG >> 8) & 0xff)) || ··· 327 324 (!kctx->initiate && direction != 0)) 328 325 return GSS_S_BAD_SIG; 329 326 327 + buf->len = len; 330 328 if (kctx->enctype == ENCTYPE_ARCFOUR_HMAC) { 331 329 struct crypto_sync_skcipher *cipher; 332 330 int err; ··· 380 376 data_len = (buf->head[0].iov_base + buf->head[0].iov_len) - data_start; 381 377 memmove(orig_start, data_start, data_len); 382 378 buf->head[0].iov_len -= (data_start - orig_start); 383 - buf->len -= (data_start - orig_start); 379 + buf->len = len - (data_start - orig_start); 384 380 385 381 if (gss_krb5_remove_padding(buf, blocksize)) 386 382 return GSS_S_DEFECTIVE_TOKEN; 387 383 384 + /* slack must include room for krb5 padding */ 385 + *slack = XDR_QUADLEN(saved_len - buf->len); 386 + /* The GSS blob always precedes the RPC message payload */ 387 + *align = *slack; 388 388 return GSS_S_COMPLETE; 389 389 } 390 390 ··· 494 486 } 495 487 496 488 static u32 497 - gss_unwrap_kerberos_v2(struct krb5_ctx *kctx, int offset, struct xdr_buf *buf) 489 + gss_unwrap_kerberos_v2(struct krb5_ctx *kctx, int offset, int len, 490 + struct xdr_buf *buf, unsigned int *slack, 491 + unsigned int *align) 498 492 { 499 493 time64_t now; 500 494 u8 *ptr; ··· 542 532 if (rrc != 0) 543 533 rotate_left(offset + 16, buf, rrc); 544 534 545 - err = (*kctx->gk5e->decrypt_v2)(kctx, offset, buf, 535 + err = (*kctx->gk5e->decrypt_v2)(kctx, offset, len, buf, 546 536 &headskip, &tailskip); 547 537 if (err) 548 538 return GSS_S_FAILURE; ··· 552 542 * it against the original 553 543 */ 554 544 err = read_bytes_from_xdr_buf(buf, 555 - buf->len - GSS_KRB5_TOK_HDR_LEN - tailskip, 545 + len - GSS_KRB5_TOK_HDR_LEN - tailskip, 556 546 decrypted_hdr, GSS_KRB5_TOK_HDR_LEN); 557 547 if (err) { 558 548 dprintk("%s: error %u getting decrypted_hdr\n", __func__, err); ··· 578 568 * Note that buf->head[0].iov_len may indicate the available 579 569 * head buffer space rather than that actually occupied. 580 570 */ 581 - movelen = min_t(unsigned int, buf->head[0].iov_len, buf->len); 571 + movelen = min_t(unsigned int, buf->head[0].iov_len, len); 582 572 movelen -= offset + GSS_KRB5_TOK_HDR_LEN + headskip; 583 - if (offset + GSS_KRB5_TOK_HDR_LEN + headskip + movelen > 584 - buf->head[0].iov_len) 585 - return GSS_S_FAILURE; 573 + BUG_ON(offset + GSS_KRB5_TOK_HDR_LEN + headskip + movelen > 574 + buf->head[0].iov_len); 586 575 memmove(ptr, ptr + GSS_KRB5_TOK_HDR_LEN + headskip, movelen); 587 576 buf->head[0].iov_len -= GSS_KRB5_TOK_HDR_LEN + headskip; 588 - buf->len -= GSS_KRB5_TOK_HDR_LEN + headskip; 577 + buf->len = len - GSS_KRB5_TOK_HDR_LEN + headskip; 589 578 590 579 /* Trim off the trailing "extra count" and checksum blob */ 591 - buf->len -= ec + GSS_KRB5_TOK_HDR_LEN + tailskip; 580 + xdr_buf_trim(buf, ec + GSS_KRB5_TOK_HDR_LEN + tailskip); 592 581 582 + *align = XDR_QUADLEN(GSS_KRB5_TOK_HDR_LEN + headskip); 583 + *slack = *align + XDR_QUADLEN(ec + GSS_KRB5_TOK_HDR_LEN + tailskip); 593 584 return GSS_S_COMPLETE; 594 585 } 595 586 ··· 614 603 } 615 604 616 605 u32 617 - gss_unwrap_kerberos(struct gss_ctx *gctx, int offset, struct xdr_buf *buf) 606 + gss_unwrap_kerberos(struct gss_ctx *gctx, int offset, 607 + int len, struct xdr_buf *buf) 618 608 { 619 609 struct krb5_ctx *kctx = gctx->internal_ctx_id; 620 610 ··· 625 613 case ENCTYPE_DES_CBC_RAW: 626 614 case ENCTYPE_DES3_CBC_RAW: 627 615 case ENCTYPE_ARCFOUR_HMAC: 628 - return gss_unwrap_kerberos_v1(kctx, offset, buf); 616 + return gss_unwrap_kerberos_v1(kctx, offset, len, buf, 617 + &gctx->slack, &gctx->align); 629 618 case ENCTYPE_AES128_CTS_HMAC_SHA1_96: 630 619 case ENCTYPE_AES256_CTS_HMAC_SHA1_96: 631 - return gss_unwrap_kerberos_v2(kctx, offset, buf); 620 + return gss_unwrap_kerberos_v2(kctx, offset, len, buf, 621 + &gctx->slack, &gctx->align); 632 622 } 633 623 }
+2 -1
net/sunrpc/auth_gss/gss_mech_switch.c
··· 411 411 u32 412 412 gss_unwrap(struct gss_ctx *ctx_id, 413 413 int offset, 414 + int len, 414 415 struct xdr_buf *buf) 415 416 { 416 417 return ctx_id->mech_type->gm_ops 417 - ->gss_unwrap(ctx_id, offset, buf); 418 + ->gss_unwrap(ctx_id, offset, len, buf); 418 419 } 419 420 420 421
+3 -7
net/sunrpc/auth_gss/svcauth_gss.c
··· 906 906 if (svc_getnl(&buf->head[0]) != seq) 907 907 goto out; 908 908 /* trim off the mic and padding at the end before returning */ 909 - buf->len -= 4 + round_up_to_quad(mic.len); 909 + xdr_buf_trim(buf, round_up_to_quad(mic.len) + 4); 910 910 stat = 0; 911 911 out: 912 912 kfree(mic.data); ··· 934 934 unwrap_priv_data(struct svc_rqst *rqstp, struct xdr_buf *buf, u32 seq, struct gss_ctx *ctx) 935 935 { 936 936 u32 priv_len, maj_stat; 937 - int pad, saved_len, remaining_len, offset; 937 + int pad, remaining_len, offset; 938 938 939 939 clear_bit(RQ_SPLICE_OK, &rqstp->rq_flags); 940 940 ··· 954 954 buf->len -= pad; 955 955 fix_priv_head(buf, pad); 956 956 957 - /* Maybe it would be better to give gss_unwrap a length parameter: */ 958 - saved_len = buf->len; 959 - buf->len = priv_len; 960 - maj_stat = gss_unwrap(ctx, 0, buf); 957 + maj_stat = gss_unwrap(ctx, 0, priv_len, buf); 961 958 pad = priv_len - buf->len; 962 - buf->len = saved_len; 963 959 buf->len -= pad; 964 960 /* The upper layers assume the buffer is aligned on 4-byte boundaries. 965 961 * In the krb5p case, at least, the data ends up offset, so we need to
+41
net/sunrpc/xdr.c
··· 1150 1150 } 1151 1151 EXPORT_SYMBOL_GPL(xdr_buf_subsegment); 1152 1152 1153 + /** 1154 + * xdr_buf_trim - lop at most "len" bytes off the end of "buf" 1155 + * @buf: buf to be trimmed 1156 + * @len: number of bytes to reduce "buf" by 1157 + * 1158 + * Trim an xdr_buf by the given number of bytes by fixing up the lengths. Note 1159 + * that it's possible that we'll trim less than that amount if the xdr_buf is 1160 + * too small, or if (for instance) it's all in the head and the parser has 1161 + * already read too far into it. 1162 + */ 1163 + void xdr_buf_trim(struct xdr_buf *buf, unsigned int len) 1164 + { 1165 + size_t cur; 1166 + unsigned int trim = len; 1167 + 1168 + if (buf->tail[0].iov_len) { 1169 + cur = min_t(size_t, buf->tail[0].iov_len, trim); 1170 + buf->tail[0].iov_len -= cur; 1171 + trim -= cur; 1172 + if (!trim) 1173 + goto fix_len; 1174 + } 1175 + 1176 + if (buf->page_len) { 1177 + cur = min_t(unsigned int, buf->page_len, trim); 1178 + buf->page_len -= cur; 1179 + trim -= cur; 1180 + if (!trim) 1181 + goto fix_len; 1182 + } 1183 + 1184 + if (buf->head[0].iov_len) { 1185 + cur = min_t(size_t, buf->head[0].iov_len, trim); 1186 + buf->head[0].iov_len -= cur; 1187 + trim -= cur; 1188 + } 1189 + fix_len: 1190 + buf->len -= (len - trim); 1191 + } 1192 + EXPORT_SYMBOL_GPL(xdr_buf_trim); 1193 + 1153 1194 static void __read_bytes_from_xdr_buf(struct xdr_buf *subbuf, void *obj, unsigned int len) 1154 1195 { 1155 1196 unsigned int this_len;
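The new xdr_buf_trim() trims from the tail backwards and never reduces `buf->len` by more than it actually removed. A self-contained sketch of that order over a simplified buffer (three lengths only, no iovecs or pages):

```c
#include <assert.h>
#include <stddef.h>

struct toy_xdr_buf {
	size_t head_len;
	size_t page_len;
	size_t tail_len;
	size_t len;	/* total length */
};

/* Same trimming order as xdr_buf_trim(): tail, then pages, then head,
 * finally reducing ->len only by the amount actually trimmed. */
static void toy_xdr_buf_trim(struct toy_xdr_buf *buf, size_t len)
{
	size_t trim = len, cur;

	cur = buf->tail_len < trim ? buf->tail_len : trim;
	buf->tail_len -= cur;
	trim -= cur;

	cur = buf->page_len < trim ? buf->page_len : trim;
	buf->page_len -= cur;
	trim -= cur;

	cur = buf->head_len < trim ? buf->head_len : trim;
	buf->head_len -= cur;
	trim -= cur;

	buf->len -= (len - trim);
}
```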
+31 -11
net/tipc/socket.c
··· 1739 1739 return 0; 1740 1740 } 1741 1741 1742 - static void tipc_sk_send_ack(struct tipc_sock *tsk) 1742 + static struct sk_buff *tipc_sk_build_ack(struct tipc_sock *tsk) 1743 1743 { 1744 1744 struct sock *sk = &tsk->sk; 1745 - struct net *net = sock_net(sk); 1746 1745 struct sk_buff *skb = NULL; 1747 1746 struct tipc_msg *msg; 1748 1747 u32 peer_port = tsk_peer_port(tsk); 1749 1748 u32 dnode = tsk_peer_node(tsk); 1750 1749 1751 1750 if (!tipc_sk_connected(sk)) 1752 - return; 1751 + return NULL; 1753 1752 skb = tipc_msg_create(CONN_MANAGER, CONN_ACK, INT_H_SIZE, 0, 1754 1753 dnode, tsk_own_node(tsk), peer_port, 1755 1754 tsk->portid, TIPC_OK); 1756 1755 if (!skb) 1757 - return; 1756 + return NULL; 1758 1757 msg = buf_msg(skb); 1759 1758 msg_set_conn_ack(msg, tsk->rcv_unacked); 1760 1759 tsk->rcv_unacked = 0; ··· 1763 1764 tsk->rcv_win = tsk_adv_blocks(tsk->sk.sk_rcvbuf); 1764 1765 msg_set_adv_win(msg, tsk->rcv_win); 1765 1766 } 1766 - tipc_node_xmit_skb(net, skb, dnode, msg_link_selector(msg)); 1767 + return skb; 1768 + } 1769 + 1770 + static void tipc_sk_send_ack(struct tipc_sock *tsk) 1771 + { 1772 + struct sk_buff *skb; 1773 + 1774 + skb = tipc_sk_build_ack(tsk); 1775 + if (!skb) 1776 + return; 1777 + 1778 + tipc_node_xmit_skb(sock_net(&tsk->sk), skb, tsk_peer_node(tsk), 1779 + msg_link_selector(buf_msg(skb))); 1767 1780 } 1768 1781 1769 1782 static int tipc_wait_for_rcvmsg(struct socket *sock, long *timeop) ··· 1949 1938 bool peek = flags & MSG_PEEK; 1950 1939 int offset, required, copy, copied = 0; 1951 1940 int hlen, dlen, err, rc; 1952 - bool ack = false; 1953 1941 long timeout; 1954 1942 1955 1943 /* Catch invalid receive attempts */ ··· 1993 1983 1994 1984 /* Copy data if msg ok, otherwise return error/partial data */ 1995 1985 if (likely(!err)) { 1996 - ack = msg_ack_required(hdr); 1997 1986 offset = skb_cb->bytes_read; 1998 1987 copy = min_t(int, dlen - offset, buflen - copied); 1999 1988 rc = skb_copy_datagram_msg(skb, hlen + offset, m, copy); ··· 2020 2011 2021 2012 /* Send connection flow control advertisement when applicable */ 2022 2013 tsk->rcv_unacked += tsk_inc(tsk, hlen + dlen); 2023 - if (ack || tsk->rcv_unacked >= tsk->rcv_win / TIPC_ACK_RATE) 2014 + if (tsk->rcv_unacked >= tsk->rcv_win / TIPC_ACK_RATE) 2024 2015 tipc_sk_send_ack(tsk); 2025 2016 2026 2017 /* Exit if all requested data or FIN/error received */ ··· 2114 2105 * tipc_sk_filter_connect - check incoming message for a connection-based socket 2115 2106 * @tsk: TIPC socket 2116 2107 * @skb: pointer to message buffer. 2108 + * @xmitq: for Nagle ACK if any 2117 2109 * Returns true if message should be added to receive queue, false otherwise 2118 2110 */ 2119 - static bool tipc_sk_filter_connect(struct tipc_sock *tsk, struct sk_buff *skb) 2111 + static bool tipc_sk_filter_connect(struct tipc_sock *tsk, struct sk_buff *skb, 2112 + struct sk_buff_head *xmitq) 2120 2113 { 2121 2114 struct sock *sk = &tsk->sk; 2122 2115 struct net *net = sock_net(sk); ··· 2182 2171 if (!skb_queue_empty(&sk->sk_write_queue)) 2183 2172 tipc_sk_push_backlog(tsk); 2184 2173 /* Accept only connection-based messages sent by peer */ 2185 - if (likely(con_msg && !err && pport == oport && pnode == onode)) 2174 + if (likely(con_msg && !err && pport == oport && 2175 + pnode == onode)) { 2176 + if (msg_ack_required(hdr)) { 2177 + struct sk_buff *skb; 2178 + 2179 + skb = tipc_sk_build_ack(tsk); 2180 + if (skb) 2181 + __skb_queue_tail(xmitq, skb); 2182 + } 2186 2183 return true; 2184 + } 2187 2185 if (!tsk_peer_msg(tsk, hdr)) 2188 2186 return false; 2189 2187 if (!err) ··· 2287 2267 while ((skb = __skb_dequeue(&inputq))) { 2288 2268 hdr = buf_msg(skb); 2289 2269 limit = rcvbuf_limit(sk, skb); 2290 - if ((sk_conn && !tipc_sk_filter_connect(tsk, skb)) || 2270 + if ((sk_conn && !tipc_sk_filter_connect(tsk, skb, xmitq)) || 2291 2271 (!sk_conn && msg_connected(hdr)) || 2292 2272 (!grp && msg_in_group(hdr))) 2293 2273 err = TIPC_ERR_NO_PORT;
+10
net/tipc/subscr.h
··· 96 96 (swap_ ? swab32(val__) : val__); \ 97 97 }) 98 98 99 + /* tipc_sub_write - write val_ to field_ of struct sub_ in user endian format 100 + */ 101 + #define tipc_sub_write(sub_, field_, val_) \ 102 + ({ \ 103 + struct tipc_subscr *sub__ = sub_; \ 104 + u32 val__ = val_; \ 105 + int swap_ = !((sub__)->filter & TIPC_FILTER_MASK); \ 106 + (sub__)->field_ = swap_ ? swab32(val__) : val__; \ 107 + }) 108 + 99 109 /* tipc_evt_write - write val_ to field_ of struct evt_ in user endian format 100 110 */ 101 111 #define tipc_evt_write(evt_, field_, val_) \
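The new tipc_sub_write() macro is the store-side twin of tipc_sub_read(): subscription fields live in the subscriber's endianness, so writes must apply the same conditional swap as reads. A minimal model in plain C — local `toy_swab32` helper, and a boolean standing in for the `TIPC_FILTER_MASK` test:

```c
#include <assert.h>
#include <stdint.h>

static uint32_t toy_swab32(uint32_t x)
{
	return ((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) << 8) |
	       ((x & 0x00ff0000u) >> 8)  | ((x & 0xff000000u) >> 24);
}

struct toy_sub {
	uint32_t filter;
	int swap;	/* stand-in for !(filter & TIPC_FILTER_MASK) */
};

static uint32_t toy_sub_read(const struct toy_sub *s)
{
	return s->swap ? toy_swab32(s->filter) : s->filter;
}

/* the fix: writes go through the same swap, so a later read round-trips */
static void toy_sub_write(struct toy_sub *s, uint32_t val)
{
	s->filter = s->swap ? toy_swab32(val) : val;
}
```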
+8 -5
net/tipc/topsrv.c
··· 237 237 if (!s || !memcmp(s, &sub->evt.s, sizeof(*s))) { 238 238 tipc_sub_unsubscribe(sub); 239 239 atomic_dec(&tn->subscription_count); 240 - } else if (s) { 241 - break; 240 + if (s) 241 + break; 242 242 } 243 243 } 244 244 spin_unlock_bh(&con->sub_lock); ··· 362 362 { 363 363 struct tipc_net *tn = tipc_net(srv->net); 364 364 struct tipc_subscription *sub; 365 + u32 s_filter = tipc_sub_read(s, filter); 365 366 366 - if (tipc_sub_read(s, filter) & TIPC_SUB_CANCEL) { 367 - s->filter &= __constant_ntohl(~TIPC_SUB_CANCEL); 367 + if (s_filter & TIPC_SUB_CANCEL) { 368 + tipc_sub_write(s, filter, s_filter & ~TIPC_SUB_CANCEL); 368 369 tipc_conn_delete_sub(con, s); 369 370 return 0; 370 371 } ··· 401 400 return -EWOULDBLOCK; 402 401 if (ret == sizeof(s)) { 403 402 read_lock_bh(&sk->sk_callback_lock); 404 - ret = tipc_conn_rcv_sub(srv, con, &s); 403 + /* RACE: the connection can be closed in the meantime */ 404 + if (likely(connected(con))) 405 + ret = tipc_conn_rcv_sub(srv, con, &s); 405 406 read_unlock_bh(&sk->sk_callback_lock); 406 407 if (!ret) 407 408 return 0;
-2
samples/bpf/lwt_len_hist_user.c
··· 15 15 #define MAX_INDEX 64 16 16 #define MAX_STARS 38 17 17 18 - char bpf_log_buf[BPF_LOG_BUF_SIZE]; 19 - 20 18 static void stars(char *str, long val, long max, int width) 21 19 { 22 20 int i;
+1 -1
samples/trace_events/trace-events-sample.h
··· 416 416 * Note, TRACE_EVENT() itself is simply defined as: 417 417 * 418 418 * #define TRACE_EVENT(name, proto, args, tstruct, assign, printk) \ 419 - * DEFINE_EVENT_CLASS(name, proto, args, tstruct, assign, printk); \ 419 + * DECLARE_EVENT_CLASS(name, proto, args, tstruct, assign, printk); \ 420 420 * DEFINE_EVENT(name, name, proto, args) 421 421 * 422 422 * The DEFINE_EVENT() also can be declared with conditions and reg functions:
+1 -1
scripts/decodecode
··· 126 126 faultline=`cat $T.dis | head -1 | cut -d":" -f2-` 127 127 faultline=`echo "$faultline" | sed -e 's/\[/\\\[/g; s/\]/\\\]/g'` 128 128 129 - cat $T.oo | sed -e "${faultlinenum}s/^\(.*:\)\(.*\)/\1\*\2\t\t<-- trapping instruction/" 129 + cat $T.oo | sed -e "${faultlinenum}s/^\([^:]*:\)\(.*\)/\1\*\2\t\t<-- trapping instruction/" 130 130 echo 131 131 cat $T.aa 132 132 cleanup
+2 -2
scripts/gdb/linux/rbtree.py
··· 12 12 13 13 def rb_first(root): 14 14 if root.type == rb_root_type.get_type(): 15 - node = node.address.cast(rb_root_type.get_type().pointer()) 15 + node = root.address.cast(rb_root_type.get_type().pointer()) 16 16 elif root.type != rb_root_type.get_type().pointer(): 17 17 raise gdb.GdbError("Must be struct rb_root not {}".format(root.type)) 18 18 ··· 28 28 29 29 def rb_last(root): 30 30 if root.type == rb_root_type.get_type(): 31 - node = node.address.cast(rb_root_type.get_type().pointer()) 31 + node = root.address.cast(rb_root_type.get_type().pointer()) 32 32 elif root.type != rb_root_type.get_type().pointer(): 33 33 raise gdb.GdbError("Must be struct rb_root not {}".format(root.type)) 34 34
+27 -4
sound/core/rawmidi.c
··· 120 120 runtime->event(runtime->substream);
121 121 }
122 122
123 + /* buffer refcount management: call with runtime->lock held */
124 + static inline void snd_rawmidi_buffer_ref(struct snd_rawmidi_runtime *runtime)
125 + {
126 + runtime->buffer_ref++;
127 + }
128 +
129 + static inline void snd_rawmidi_buffer_unref(struct snd_rawmidi_runtime *runtime)
130 + {
131 + runtime->buffer_ref--;
132 + }
133 +
123 134 static int snd_rawmidi_runtime_create(struct snd_rawmidi_substream *substream)
124 135 {
125 136 struct snd_rawmidi_runtime *runtime;
··· 680 669 if (!newbuf)
681 670 return -ENOMEM;
682 671 spin_lock_irq(&runtime->lock);
672 + if (runtime->buffer_ref) {
673 + spin_unlock_irq(&runtime->lock);
674 + kvfree(newbuf);
675 + return -EBUSY;
676 + }
683 677 oldbuf = runtime->buffer;
684 678 runtime->buffer = newbuf;
685 679 runtime->buffer_size = params->buffer_size;
··· 1035 1019 long result = 0, count1;
1036 1020 struct snd_rawmidi_runtime *runtime = substream->runtime;
1037 1021 unsigned long appl_ptr;
1022 + int err = 0;
1038 1023
1039 1024 spin_lock_irqsave(&runtime->lock, flags);
1025 + snd_rawmidi_buffer_ref(runtime);
1040 1026 while (count > 0 && runtime->avail) {
1041 1027 count1 = runtime->buffer_size - runtime->appl_ptr;
1042 1028 if (count1 > count)
··· 1057 1039 if (userbuf) {
1058 1040 spin_unlock_irqrestore(&runtime->lock, flags);
1059 1041 if (copy_to_user(userbuf + result,
1060 - runtime->buffer + appl_ptr, count1)) {
1061 - return result > 0 ? result : -EFAULT;
1062 - }
1042 + runtime->buffer + appl_ptr, count1))
1043 + err = -EFAULT;
1063 1044 spin_lock_irqsave(&runtime->lock, flags);
1045 + if (err)
1046 + goto out;
1064 1047 }
1065 1048 result += count1;
1066 1049 count -= count1;
1067 1050 }
1051 + out:
1052 + snd_rawmidi_buffer_unref(runtime);
1068 1053 spin_unlock_irqrestore(&runtime->lock, flags);
1069 - return result;
1054 + return result > 0 ? result : err;
1070 1055 }
1071 1056
1072 1057 long snd_rawmidi_kernel_read(struct snd_rawmidi_substream *substream,
··· 1363 1342 return -EAGAIN;
1364 1343 }
1365 1344 }
1345 + snd_rawmidi_buffer_ref(runtime);
1366 1346 while (count > 0 && runtime->avail > 0) {
1367 1347 count1 = runtime->buffer_size - runtime->appl_ptr;
1368 1348 if (count1 > count)
··· 1395 1373 }
1396 1374 __end:
1397 1375 count1 = runtime->avail < runtime->buffer_size;
1376 + snd_rawmidi_buffer_unref(runtime);
1398 1377 spin_unlock_irqrestore(&runtime->lock, flags);
1399 1378 if (count1)
1400 1379 snd_rawmidi_output_trigger(substream, 1);
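The rawmidi changes above add a buffer reference count so that a resize cannot free the buffer while a reader has temporarily dropped the lock around copy_to_user(). A toy model of just the gating logic (class and method names are invented for illustration, not the kernel API):

```python
class Runtime:
    def __init__(self, size: int = 64):
        self.buffer = bytearray(size)
        self.buffer_ref = 0  # readers currently pinning the buffer

    def resize(self, new_size: int) -> None:
        # Mirrors the new -EBUSY check: while any reader holds a
        # reference, swapping (and freeing) the buffer would let the
        # reader touch freed memory, so refuse the resize instead.
        if self.buffer_ref:
            raise BlockingIOError("buffer in use")  # kernel: -EBUSY
        self.buffer = bytearray(new_size)
```

The reader increments the count under the lock, drops the lock for the copy, and decrements when done; the resize path only proceeds when the count is zero.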
+1 -2
sound/firewire/amdtp-stream-trace.h
··· 66 66 __entry->irq, 67 67 __entry->index, 68 68 __print_array(__get_dynamic_array(cip_header), 69 - __get_dynamic_array_len(cip_header), 70 - sizeof(u8))) 69 + __get_dynamic_array_len(cip_header), 1)) 71 70 ); 72 71 73 72 #endif
+71 -3
sound/pci/hda/patch_realtek.c
··· 5856 5856 }
5857 5857 }
5858 5858
5859 + static void alc225_fixup_s3_pop_noise(struct hda_codec *codec,
5860 + const struct hda_fixup *fix, int action)
5861 + {
5862 + if (action != HDA_FIXUP_ACT_PRE_PROBE)
5863 + return;
5864 +
5865 + codec->power_save_node = 1;
5866 + }
5867 +
5859 5868 /* Forcibly assign NID 0x03 to HP/LO while NID 0x02 to SPK for EQ */
5860 5869 static void alc274_fixup_bind_dacs(struct hda_codec *codec,
5861 5870 const struct hda_fixup *fix, int action)
··· 5969 5960 ALC269_FIXUP_HP_LINE1_MIC1_LED,
5970 5961 ALC269_FIXUP_INV_DMIC,
5971 5962 ALC269_FIXUP_LENOVO_DOCK,
5963 + ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST,
5972 5964 ALC269_FIXUP_NO_SHUTUP,
5973 5965 ALC286_FIXUP_SONY_MIC_NO_PRESENCE,
5974 5966 ALC269_FIXUP_PINCFG_NO_HP_TO_LINEOUT,
··· 6055 6045 ALC233_FIXUP_ACER_HEADSET_MIC,
6056 6046 ALC294_FIXUP_LENOVO_MIC_LOCATION,
6057 6047 ALC225_FIXUP_DELL_WYSE_MIC_NO_PRESENCE,
6048 + ALC225_FIXUP_S3_POP_NOISE,
6058 6049 ALC700_FIXUP_INTEL_REFERENCE,
6059 6050 ALC274_FIXUP_DELL_BIND_DACS,
6060 6051 ALC274_FIXUP_DELL_AIO_LINEOUT_VERB,
··· 6091 6080 ALC294_FIXUP_ASUS_DUAL_SPK,
6092 6081 ALC285_FIXUP_THINKPAD_HEADSET_JACK,
6093 6082 ALC294_FIXUP_ASUS_HPE,
6083 + ALC294_FIXUP_ASUS_COEF_1B,
6094 6084 ALC285_FIXUP_HP_GPIO_LED,
6095 6085 ALC285_FIXUP_HP_MUTE_LED,
6096 6086 ALC236_FIXUP_HP_MUTE_LED,
6087 + ALC298_FIXUP_SAMSUNG_HEADPHONE_VERY_QUIET,
6088 + ALC295_FIXUP_ASUS_MIC_NO_PRESENCE,
6097 6089 };
6098 6090
6099 6091 static const struct hda_fixup alc269_fixups[] = {
··· 6293 6279 },
6294 6280 .chained = true,
6295 6281 .chain_id = ALC269_FIXUP_PINCFG_NO_HP_TO_LINEOUT
6282 + },
6283 + [ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST] = {
6284 + .type = HDA_FIXUP_FUNC,
6285 + .v.func = alc269_fixup_limit_int_mic_boost,
6286 + .chained = true,
6287 + .chain_id = ALC269_FIXUP_LENOVO_DOCK,
6296 6288 },
6297 6289 [ALC269_FIXUP_PINCFG_NO_HP_TO_LINEOUT] = {
6298 6290 .type = HDA_FIXUP_FUNC,
··· 6952 6932 { }
6953 6933 },
6954 6934 .chained = true,
6935 + .chain_id = ALC225_FIXUP_S3_POP_NOISE
6936 + },
6937 + [ALC225_FIXUP_S3_POP_NOISE] = {
6938 + .type = HDA_FIXUP_FUNC,
6939 + .v.func = alc225_fixup_s3_pop_noise,
6940 + .chained = true,
6955 6941 .chain_id = ALC269_FIXUP_HEADSET_MODE_NO_HP_MIC
6956 6942 },
6957 6943 [ALC700_FIXUP_INTEL_REFERENCE] = {
··· 7230 7204 .chained = true,
7231 7205 .chain_id = ALC294_FIXUP_ASUS_HEADSET_MIC
7232 7206 },
7207 + [ALC294_FIXUP_ASUS_COEF_1B] = {
7208 + .type = HDA_FIXUP_VERBS,
7209 + .v.verbs = (const struct hda_verb[]) {
7210 + /* Set bit 10 to correct noisy output after reboot from
7211 + * Windows 10 (due to pop noise reduction?)
7212 + */
7213 + { 0x20, AC_VERB_SET_COEF_INDEX, 0x1b },
7214 + { 0x20, AC_VERB_SET_PROC_COEF, 0x4e4b },
7215 + { }
7216 + },
7217 + },
7233 7218 [ALC285_FIXUP_HP_GPIO_LED] = {
7234 7219 .type = HDA_FIXUP_FUNC,
7235 7220 .v.func = alc285_fixup_hp_gpio_led,
··· 7252 7215 [ALC236_FIXUP_HP_MUTE_LED] = {
7253 7216 .type = HDA_FIXUP_FUNC,
7254 7217 .v.func = alc236_fixup_hp_mute_led,
7218 + },
7219 + [ALC298_FIXUP_SAMSUNG_HEADPHONE_VERY_QUIET] = {
7220 + .type = HDA_FIXUP_VERBS,
7221 + .v.verbs = (const struct hda_verb[]) {
7222 + { 0x1a, AC_VERB_SET_PIN_WIDGET_CONTROL, 0xc5 },
7223 + { }
7224 + },
7225 + },
7226 + [ALC295_FIXUP_ASUS_MIC_NO_PRESENCE] = {
7227 + .type = HDA_FIXUP_PINS,
7228 + .v.pins = (const struct hda_pintbl[]) {
7229 + { 0x19, 0x01a1913c }, /* use as headset mic, without its own jack detect */
7230 + { }
7231 + },
7232 + .chained = true,
7233 + .chain_id = ALC269_FIXUP_HEADSET_MODE
7255 7234 },
7256 7235 };
7257 7236
··· 7436 7383 SND_PCI_QUIRK(0x1043, 0x18b1, "Asus MJ401TA", ALC256_FIXUP_ASUS_HEADSET_MIC),
7437 7384 SND_PCI_QUIRK(0x1043, 0x18f1, "Asus FX505DT", ALC256_FIXUP_ASUS_HEADSET_MIC),
7438 7385 SND_PCI_QUIRK(0x1043, 0x19ce, "ASUS B9450FA", ALC294_FIXUP_ASUS_HPE),
7386 + SND_PCI_QUIRK(0x1043, 0x19e1, "ASUS UX581LV", ALC295_FIXUP_ASUS_MIC_NO_PRESENCE),
7439 7387 SND_PCI_QUIRK(0x1043, 0x1a13, "Asus G73Jw", ALC269_FIXUP_ASUS_G73JW),
7440 7388 SND_PCI_QUIRK(0x1043, 0x1a30, "ASUS X705UD", ALC256_FIXUP_ASUS_MIC),
7389 + SND_PCI_QUIRK(0x1043, 0x1b11, "ASUS UX431DA", ALC294_FIXUP_ASUS_COEF_1B),
7441 7390 SND_PCI_QUIRK(0x1043, 0x1b13, "Asus U41SV", ALC269_FIXUP_INV_DMIC),
7442 7391 SND_PCI_QUIRK(0x1043, 0x1bbd, "ASUS Z550MA", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE),
7443 7392 SND_PCI_QUIRK(0x1043, 0x1c23, "Asus X55U", ALC269_FIXUP_LIMIT_INT_MIC_BOOST),
··· 7465 7410 SND_PCI_QUIRK(0x10ec, 0x10f2, "Intel Reference board", ALC700_FIXUP_INTEL_REFERENCE),
7466 7411 SND_PCI_QUIRK(0x10f7, 0x8338, "Panasonic CF-SZ6", ALC269_FIXUP_HEADSET_MODE),
7467 7412 SND_PCI_QUIRK(0x144d, 0xc109, "Samsung Ativ book 9 (NP900X3G)", ALC269_FIXUP_INV_DMIC),
7413 + SND_PCI_QUIRK(0x144d, 0xc169, "Samsung Notebook 9 Pen (NP930SBE-K01US)", ALC298_FIXUP_SAMSUNG_HEADPHONE_VERY_QUIET),
7414 + SND_PCI_QUIRK(0x144d, 0xc176, "Samsung Notebook 9 Pro (NP930MBE-K04US)", ALC298_FIXUP_SAMSUNG_HEADPHONE_VERY_QUIET),
7468 7415 SND_PCI_QUIRK(0x144d, 0xc740, "Samsung Ativ book 8 (NP870Z5G)", ALC269_FIXUP_ATIV_BOOK_8),
7469 7416 SND_PCI_QUIRK(0x1458, 0xfa53, "Gigabyte BXBT-2807", ALC283_FIXUP_HEADSET_MIC),
7470 7417 SND_PCI_QUIRK(0x1462, 0xb120, "MSI Cubi MS-B120", ALC283_FIXUP_HEADSET_MIC),
··· 7483 7426 SND_PCI_QUIRK(0x17aa, 0x21b8, "Thinkpad Edge 14", ALC269_FIXUP_SKU_IGNORE),
7484 7427 SND_PCI_QUIRK(0x17aa, 0x21ca, "Thinkpad L412", ALC269_FIXUP_SKU_IGNORE),
7485 7428 SND_PCI_QUIRK(0x17aa, 0x21e9, "Thinkpad Edge 15", ALC269_FIXUP_SKU_IGNORE),
7486 - SND_PCI_QUIRK(0x17aa, 0x21f6, "Thinkpad T530", ALC269_FIXUP_LENOVO_DOCK),
7429 + SND_PCI_QUIRK(0x17aa, 0x21f6, "Thinkpad T530", ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST),
7487 7430 SND_PCI_QUIRK(0x17aa, 0x21fa, "Thinkpad X230", ALC269_FIXUP_LENOVO_DOCK),
7488 7431 SND_PCI_QUIRK(0x17aa, 0x21f3, "Thinkpad T430", ALC269_FIXUP_LENOVO_DOCK),
7489 7432 SND_PCI_QUIRK(0x17aa, 0x21fb, "Thinkpad T430s", ALC269_FIXUP_LENOVO_DOCK),
··· 7622 7565 {.id = ALC269_FIXUP_HEADSET_MODE, .name = "headset-mode"},
7623 7566 {.id = ALC269_FIXUP_HEADSET_MODE_NO_HP_MIC, .name = "headset-mode-no-hp-mic"},
7624 7567 {.id = ALC269_FIXUP_LENOVO_DOCK, .name = "lenovo-dock"},
7568 + {.id = ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST, .name = "lenovo-dock-limit-boost"},
7625 7569 {.id = ALC269_FIXUP_HP_GPIO_LED, .name = "hp-gpio-led"},
7626 7570 {.id = ALC269_FIXUP_HP_DOCK_GPIO_MIC1_LED, .name = "hp-dock-gpio-mic1-led"},
7627 7571 {.id = ALC269_FIXUP_DELL1_MIC_NO_PRESENCE, .name = "dell-headset-multi"},
··· 8051 7993 {0x12, 0x90a60130},
8052 7994 {0x17, 0x90170110},
8053 7995 {0x21, 0x03211020}),
7996 + SND_HDA_PIN_QUIRK(0x10ec0295, 0x1043, "ASUS", ALC295_FIXUP_ASUS_MIC_NO_PRESENCE,
7997 + {0x12, 0x90a60120},
7998 + {0x17, 0x90170110},
7999 + {0x21, 0x04211030}),
8000 + SND_HDA_PIN_QUIRK(0x10ec0295, 0x1043, "ASUS", ALC295_FIXUP_ASUS_MIC_NO_PRESENCE,
8001 + {0x12, 0x90a60130},
8002 + {0x17, 0x90170110},
8003 + {0x21, 0x03211020}),
8004 + SND_HDA_PIN_QUIRK(0x10ec0295, 0x1043, "ASUS", ALC295_FIXUP_ASUS_MIC_NO_PRESENCE,
8005 + {0x12, 0x90a60130},
8006 + {0x17, 0x90170110},
8007 + {0x21, 0x03211020}),
8054 8008 SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE,
8055 8009 {0x14, 0x90170110},
8056 8010 {0x21, 0x04211020}),
··· 8279 8209 spec->gen.mixer_nid = 0;
8280 8210 break;
8281 8211 case 0x10ec0225:
8282 - codec->power_save_node = 1;
8283 - /* fall through */
8284 8212 case 0x10ec0295:
8285 8213 case 0x10ec0299:
8286 8214 spec->codec_variant = ALC269_TYPE_ALC225;
+5
sound/usb/mixer_maps.c
··· 549 549 .map = trx40_mobo_map, 550 550 .connector_map = trx40_mobo_connector_map, 551 551 }, 552 + { /* Asrock TRX40 Creator */ 553 + .id = USB_ID(0x26ce, 0x0a01), 554 + .map = trx40_mobo_map, 555 + .connector_map = trx40_mobo_connector_map, 556 + }, 552 557 { 0 } /* terminator */ 553 558 }; 554 559
+1
sound/usb/quirks-table.h
··· 3563 3563 ALC1220_VB_DESKTOP(0x0414, 0xa002), /* Gigabyte TRX40 Aorus Pro WiFi */ 3564 3564 ALC1220_VB_DESKTOP(0x0db0, 0x0d64), /* MSI TRX40 Creator */ 3565 3565 ALC1220_VB_DESKTOP(0x0db0, 0x543d), /* MSI TRX40 */ 3566 + ALC1220_VB_DESKTOP(0x26ce, 0x0a01), /* Asrock TRX40 Creator */ 3566 3567 #undef ALC1220_VB_DESKTOP 3567 3568 3568 3569 #undef USB_DEVICE_VENDOR_SPEC
+5 -4
sound/usb/quirks.c
··· 1636 1636 && (requesttype & USB_TYPE_MASK) == USB_TYPE_CLASS) 1637 1637 msleep(20); 1638 1638 1639 - /* Zoom R16/24, Logitech H650e, Jabra 550a needs a tiny delay here, 1640 - * otherwise requests like get/set frequency return as failed despite 1641 - * actually succeeding. 1639 + /* Zoom R16/24, Logitech H650e, Jabra 550a, Kingston HyperX needs a tiny 1640 + * delay here, otherwise requests like get/set frequency return as 1641 + * failed despite actually succeeding. 1642 1642 */ 1643 1643 if ((chip->usb_id == USB_ID(0x1686, 0x00dd) || 1644 1644 chip->usb_id == USB_ID(0x046d, 0x0a46) || 1645 - chip->usb_id == USB_ID(0x0b0e, 0x0349)) && 1645 + chip->usb_id == USB_ID(0x0b0e, 0x0349) || 1646 + chip->usb_id == USB_ID(0x0951, 0x16ad)) && 1646 1647 (requesttype & USB_TYPE_MASK) == USB_TYPE_CLASS) 1647 1648 usleep_range(1000, 2000); 1648 1649 }
+7 -3
tools/bootconfig/main.c
··· 314 314 ret = delete_xbc(path); 315 315 if (ret < 0) { 316 316 pr_err("Failed to delete previous boot config: %d\n", ret); 317 + free(data); 317 318 return ret; 318 319 } 319 320 ··· 322 321 fd = open(path, O_RDWR | O_APPEND); 323 322 if (fd < 0) { 324 323 pr_err("Failed to open %s: %d\n", path, fd); 324 + free(data); 325 325 return fd; 326 326 } 327 327 /* TODO: Ensure the @path is initramfs/initrd image */ 328 328 ret = write(fd, data, size + 8); 329 329 if (ret < 0) { 330 330 pr_err("Failed to apply a boot config: %d\n", ret); 331 - return ret; 331 + goto out; 332 332 } 333 333 /* Write a magic word of the bootconfig */ 334 334 ret = write(fd, BOOTCONFIG_MAGIC, BOOTCONFIG_MAGIC_LEN); 335 335 if (ret < 0) { 336 336 pr_err("Failed to apply a boot config magic: %d\n", ret); 337 - return ret; 337 + goto out; 338 338 } 339 + ret = 0; 340 + out: 339 341 close(fd); 340 342 free(data); 341 343 342 - return 0; 344 + return ret; 343 345 } 344 346 345 347 int usage(void)
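The bootconfig fix above routes every failure path through a single `out:` label so the fd and the malloc'd buffer are always released, and a short write is reported instead of silently returning success. In a language with try/finally the same shape looks like this (the magic string here is an abbreviated stand-in, not the real BOOTCONFIG_MAGIC):

```python
import os

MAGIC = b"#MAGIC#"  # stand-in for BOOTCONFIG_MAGIC

def apply_config(path: str, data: bytes) -> int:
    fd = os.open(path, os.O_RDWR | os.O_APPEND)
    try:
        # Each write's error path now reaches the cleanup code (the
        # 'out:' label in the C version) instead of leaking the fd.
        if os.write(fd, data) != len(data):
            return -1
        if os.write(fd, MAGIC) != len(MAGIC):
            return -1
        return 0
    finally:
        os.close(fd)  # runs on every return path
```

The C version achieves the same with `goto out;` plus an explicit `ret = 0;` before the label, so partial writes no longer fall through to `return 0`.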
+6 -1
tools/cgroup/iocost_monitor.py
··· 159 159 else: 160 160 self.inflight_pct = 0 161 161 162 - self.debt_ms = iocg.abs_vdebt.counter.value_() / VTIME_PER_USEC / 1000 162 + # vdebt used to be an atomic64_t and is now u64, support both 163 + try: 164 + self.debt_ms = iocg.abs_vdebt.counter.value_() / VTIME_PER_USEC / 1000 165 + except: 166 + self.debt_ms = iocg.abs_vdebt.value_() / VTIME_PER_USEC / 1000 167 + 163 168 self.use_delay = blkg.use_delay.counter.value_() 164 169 self.delay_ms = blkg.delay_nsec.counter.value_() / 1_000_000 165 170
+2 -2
tools/lib/bpf/bpf_tracing.h
··· 148 148 #define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[4]) 149 149 #define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[5]) 150 150 #define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[6]) 151 - #define PT_REGS_RET_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), grps[14]) 151 + #define PT_REGS_RET_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[14]) 152 152 #define PT_REGS_FP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[11]) 153 153 #define PT_REGS_RC_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[2]) 154 154 #define PT_REGS_SP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[15]) 155 - #define PT_REGS_IP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), pdw.addr) 155 + #define PT_REGS_IP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), psw.addr) 156 156 157 157 #elif defined(bpf_target_arm) 158 158
+14 -3
tools/objtool/check.c
··· 72 72 return find_insn(file, func->cfunc->sec, func->cfunc->offset); 73 73 } 74 74 75 + static struct instruction *prev_insn_same_sym(struct objtool_file *file, 76 + struct instruction *insn) 77 + { 78 + struct instruction *prev = list_prev_entry(insn, list); 79 + 80 + if (&prev->list != &file->insn_list && prev->func == insn->func) 81 + return prev; 82 + 83 + return NULL; 84 + } 85 + 75 86 #define func_for_each_insn(file, func, insn) \ 76 87 for (insn = find_insn(file, func->sec, func->offset); \ 77 88 insn; \ ··· 1061 1050 * it. 1062 1051 */ 1063 1052 for (; 1064 - &insn->list != &file->insn_list && insn->func && insn->func->pfunc == func; 1065 - insn = insn->first_jump_src ?: list_prev_entry(insn, list)) { 1053 + insn && insn->func && insn->func->pfunc == func; 1054 + insn = insn->first_jump_src ?: prev_insn_same_sym(file, insn)) { 1066 1055 1067 1056 if (insn != orig_insn && insn->type == INSN_JUMP_DYNAMIC) 1068 1057 break; ··· 1460 1449 struct cfi_reg *cfa = &state->cfa; 1461 1450 struct stack_op *op = &insn->stack_op; 1462 1451 1463 - if (cfa->base != CFI_SP) 1452 + if (cfa->base != CFI_SP && cfa->base != CFI_SP_INDIRECT) 1464 1453 return 0; 1465 1454 1466 1455 /* push */
+4 -3
tools/objtool/elf.h
··· 87 87 #define OFFSET_STRIDE (1UL << OFFSET_STRIDE_BITS) 88 88 #define OFFSET_STRIDE_MASK (~(OFFSET_STRIDE - 1)) 89 89 90 - #define for_offset_range(_offset, _start, _end) \ 91 - for (_offset = ((_start) & OFFSET_STRIDE_MASK); \ 92 - _offset <= ((_end) & OFFSET_STRIDE_MASK); \ 90 + #define for_offset_range(_offset, _start, _end) \ 91 + for (_offset = ((_start) & OFFSET_STRIDE_MASK); \ 92 + _offset >= ((_start) & OFFSET_STRIDE_MASK) && \ 93 + _offset <= ((_end) & OFFSET_STRIDE_MASK); \ 93 94 _offset += OFFSET_STRIDE) 94 95 95 96 static inline u32 sec_offset_hash(struct section *sec, unsigned long offset)
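The `for_offset_range()` change adds a lower-bound check so the loop terminates when `_offset += OFFSET_STRIDE` wraps the unsigned offset past zero near the top of the address space. Simulating a 32-bit unsigned offset makes the guard's effect visible (the stride width below is chosen arbitrarily for illustration):

```python
STRIDE = 1 << 4          # illustrative stride, not the real OFFSET_STRIDE
MASK32 = (1 << 32) - 1   # emulate a 32-bit unsigned offset
BUCKET = ~(STRIDE - 1) & MASK32

def offset_range(start: int, end: int):
    # Mirrors the fixed macro: stop when the offset either passes the
    # end bucket *or* wraps below the start bucket.
    off = start & BUCKET
    while (start & BUCKET) <= off <= (end & BUCKET):
        yield off
        off = (off + STRIDE) & MASK32  # may wrap around to 0
```

Without the lower-bound half of the condition, a range ending in the last stride bucket wraps to 0, which still satisfies the upper bound, and the loop never terminates.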
+8
tools/testing/selftests/bpf/prog_tests/mmap.c
··· 217 217 218 218 munmap(tmp2, 4 * page_size); 219 219 220 + /* map all 4 pages, but with pg_off=1 page, should fail */ 221 + tmp1 = mmap(NULL, 4 * page_size, PROT_READ, MAP_SHARED | MAP_FIXED, 222 + data_map_fd, page_size /* initial page shift */); 223 + if (CHECK(tmp1 != MAP_FAILED, "adv_mmap7", "unexpected success")) { 224 + munmap(tmp1, 4 * page_size); 225 + goto cleanup; 226 + } 227 + 220 228 tmp1 = mmap(NULL, map_sz, PROT_READ, MAP_SHARED, data_map_fd, 0); 221 229 if (CHECK(tmp1 == MAP_FAILED, "last_mmap", "failed %d\n", errno)) 222 230 goto cleanup;
+2 -2
tools/testing/selftests/bpf/progs/test_overhead.c
··· 30 30 SEC("fentry/__set_task_comm") 31 31 int BPF_PROG(prog4, struct task_struct *tsk, const char *buf, bool exec) 32 32 { 33 - return !tsk; 33 + return 0; 34 34 } 35 35 36 36 SEC("fexit/__set_task_comm") 37 37 int BPF_PROG(prog5, struct task_struct *tsk, const char *buf, bool exec) 38 38 { 39 - return !tsk; 39 + return 0; 40 40 } 41 41 42 42 SEC("fmod_ret/__set_task_comm")
+1
tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c
··· 351 351 } 352 352 353 353 printf("Expected error checking passed\n"); 354 + ret = 0; 354 355 out: 355 356 if (dmabuf_fd >= 0) 356 357 close(dmabuf_fd);
+146
tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c
··· 3 3 #define _GNU_SOURCE
4 4 #include <poll.h>
5 5 #include <unistd.h>
6 + #include <assert.h>
6 7 #include <signal.h>
7 8 #include <pthread.h>
8 9 #include <sys/epoll.h>
··· 3135 3134 }
3136 3135 close(ctx.efd[0]);
3137 3136 close(ctx.sfd[0]);
3137 + }
3138 +
3139 + enum {
3140 + EPOLL60_EVENTS_NR = 10,
3141 + };
3142 +
3143 + struct epoll60_ctx {
3144 + volatile int stopped;
3145 + int ready;
3146 + int waiters;
3147 + int epfd;
3148 + int evfd[EPOLL60_EVENTS_NR];
3149 + };
3150 +
3151 + static void *epoll60_wait_thread(void *ctx_)
3152 + {
3153 + struct epoll60_ctx *ctx = ctx_;
3154 + struct epoll_event e;
3155 + sigset_t sigmask;
3156 + uint64_t v;
3157 + int ret;
3158 +
3159 + /* Block SIGUSR1 */
3160 + sigemptyset(&sigmask);
3161 + sigaddset(&sigmask, SIGUSR1);
3162 + sigprocmask(SIG_SETMASK, &sigmask, NULL);
3163 +
3164 + /* Prepare empty mask for epoll_pwait() */
3165 + sigemptyset(&sigmask);
3166 +
3167 + while (!ctx->stopped) {
3168 + /* Mark we are ready */
3169 + __atomic_fetch_add(&ctx->ready, 1, __ATOMIC_ACQUIRE);
3170 +
3171 + /* Start when all are ready */
3172 + while (__atomic_load_n(&ctx->ready, __ATOMIC_ACQUIRE) &&
3173 + !ctx->stopped);
3174 +
3175 + /* Account this waiter */
3176 + __atomic_fetch_add(&ctx->waiters, 1, __ATOMIC_ACQUIRE);
3177 +
3178 + ret = epoll_pwait(ctx->epfd, &e, 1, 2000, &sigmask);
3179 + if (ret != 1) {
3180 + /* We expect only signal delivery on stop */
3181 + assert(ret < 0 && errno == EINTR && "Lost wakeup!\n");
3182 + assert(ctx->stopped);
3183 + break;
3184 + }
3185 +
3186 + ret = read(e.data.fd, &v, sizeof(v));
3187 + /* Since we are on ET mode, thus each thread gets its own fd. */
3188 + assert(ret == sizeof(v));
3189 +
3190 + __atomic_fetch_sub(&ctx->waiters, 1, __ATOMIC_RELEASE);
3191 + }
3192 +
3193 + return NULL;
3194 + }
3195 +
3196 + static inline unsigned long long msecs(void)
3197 + {
3198 + struct timespec ts;
3199 + unsigned long long msecs;
3200 +
3201 + clock_gettime(CLOCK_REALTIME, &ts);
3202 + msecs = ts.tv_sec * 1000ull;
3203 + msecs += ts.tv_nsec / 1000000ull;
3204 +
3205 + return msecs;
3206 + }
3207 +
3208 + static inline int count_waiters(struct epoll60_ctx *ctx)
3209 + {
3210 + return __atomic_load_n(&ctx->waiters, __ATOMIC_ACQUIRE);
3211 + }
3212 +
3213 + TEST(epoll60)
3214 + {
3215 + struct epoll60_ctx ctx = { 0 };
3216 + pthread_t waiters[ARRAY_SIZE(ctx.evfd)];
3217 + struct epoll_event e;
3218 + int i, n, ret;
3219 +
3220 + signal(SIGUSR1, signal_handler);
3221 +
3222 + ctx.epfd = epoll_create1(0);
3223 + ASSERT_GE(ctx.epfd, 0);
3224 +
3225 + /* Create event fds */
3226 + for (i = 0; i < ARRAY_SIZE(ctx.evfd); i++) {
3227 + ctx.evfd[i] = eventfd(0, EFD_NONBLOCK);
3228 + ASSERT_GE(ctx.evfd[i], 0);
3229 +
3230 + e.events = EPOLLIN | EPOLLET;
3231 + e.data.fd = ctx.evfd[i];
3232 + ASSERT_EQ(epoll_ctl(ctx.epfd, EPOLL_CTL_ADD, ctx.evfd[i], &e), 0);
3233 + }
3234 +
3235 + /* Create waiter threads */
3236 + for (i = 0; i < ARRAY_SIZE(waiters); i++)
3237 + ASSERT_EQ(pthread_create(&waiters[i], NULL,
3238 + epoll60_wait_thread, &ctx), 0);
3239 +
3240 + for (i = 0; i < 300; i++) {
3241 + uint64_t v = 1, ms;
3242 +
3243 + /* Wait for all to be ready */
3244 + while (__atomic_load_n(&ctx.ready, __ATOMIC_ACQUIRE) !=
3245 + ARRAY_SIZE(ctx.evfd))
3246 + ;
3247 +
3248 + /* Steady, go */
3249 + __atomic_fetch_sub(&ctx.ready, ARRAY_SIZE(ctx.evfd),
3250 + __ATOMIC_ACQUIRE);
3251 +
3252 + /* Wait all have gone to kernel */
3253 + while (count_waiters(&ctx) != ARRAY_SIZE(ctx.evfd))
3254 + ;
3255 +
3256 + /* 1ms should be enough to schedule away */
3257 + usleep(1000);
3258 +
3259 + /* Quickly signal all handles at once */
3260 + for (n = 0; n < ARRAY_SIZE(ctx.evfd); n++) {
3261 + ret = write(ctx.evfd[n], &v, sizeof(v));
3262 + ASSERT_EQ(ret, sizeof(v));
3263 + }
3264 +
3265 + /* Busy loop for 1s and wait for all waiters to wake up */
3266 + ms = msecs();
3267 + while (count_waiters(&ctx) && msecs() < ms + 1000)
3268 + ;
3269 +
3270 + ASSERT_EQ(count_waiters(&ctx), 0);
3271 + }
3272 + ctx.stopped = 1;
3273 + /* Stop waiters */
3274 + for (i = 0; i < ARRAY_SIZE(waiters); i++)
3275 + ret = pthread_kill(waiters[i], SIGUSR1);
3276 + for (i = 0; i < ARRAY_SIZE(waiters); i++)
3277 + pthread_join(waiters[i], NULL);
3278 +
3279 + for (i = 0; i < ARRAY_SIZE(waiters); i++)
3280 + close(ctx.evfd[i]);
3281 + close(ctx.epfd);
3138 3282 }
3139 3283
3140 3284 TEST_HARNESS_MAIN
+30 -2
tools/testing/selftests/ftrace/ftracetest
··· 17 17 echo " -vv Alias of -v -v (Show all results in stdout)" 18 18 echo " -vvv Alias of -v -v -v (Show all commands immediately)" 19 19 echo " --fail-unsupported Treat UNSUPPORTED as a failure" 20 + echo " --fail-unresolved Treat UNRESOLVED as a failure" 20 21 echo " -d|--debug Debug mode (trace all shell commands)" 21 22 echo " -l|--logdir <dir> Save logs on the <dir>" 22 23 echo " If <dir> is -, all logs output in console only" ··· 30 29 # kselftest skip code is 4 31 30 err_skip=4 32 31 32 + # cgroup RT scheduling prevents chrt commands from succeeding, which 33 + # induces failures in test wakeup tests. Disable for the duration of 34 + # the tests. 35 + 36 + readonly sched_rt_runtime=/proc/sys/kernel/sched_rt_runtime_us 37 + 38 + sched_rt_runtime_orig=$(cat $sched_rt_runtime) 39 + 40 + setup() { 41 + echo -1 > $sched_rt_runtime 42 + } 43 + 44 + cleanup() { 45 + echo $sched_rt_runtime_orig > $sched_rt_runtime 46 + } 47 + 33 48 errexit() { # message 34 49 echo "Error: $1" 1>&2 50 + cleanup 35 51 exit $err_ret 36 52 } 37 53 ··· 56 38 if [ `id -u` -ne 0 ]; then 57 39 errexit "this must be run by root user" 58 40 fi 41 + 42 + setup 59 43 60 44 # Utilities 61 45 absdir() { # file_path ··· 111 91 ;; 112 92 --fail-unsupported) 113 93 UNSUPPORTED_RESULT=1 94 + shift 1 95 + ;; 96 + --fail-unresolved) 97 + UNRESOLVED_RESULT=1 114 98 shift 1 115 99 ;; 116 100 --logdir|-l) ··· 181 157 DEBUG=0 182 158 VERBOSE=0 183 159 UNSUPPORTED_RESULT=0 160 + UNRESOLVED_RESULT=0 184 161 STOP_FAILURE=0 185 162 # Parse command-line options 186 163 parse_opts $* ··· 260 235 261 236 INSTANCE= 262 237 CASENO=0 238 + 263 239 testcase() { # testfile 264 240 CASENO=$((CASENO+1)) 265 241 desc=`grep "^#[ \t]*description:" $1 | cut -f2 -d:` ··· 286 260 $UNRESOLVED) 287 261 prlog " [${color_blue}UNRESOLVED${color_reset}]" 288 262 UNRESOLVED_CASES="$UNRESOLVED_CASES $CASENO" 289 - return 1 # this is a kind of bug.. something happened. 
263 + return $UNRESOLVED_RESULT # depends on use case 290 264 ;; 291 265 $UNTESTED) 292 266 prlog " [${color_blue}UNTESTED${color_reset}]" ··· 299 273 return $UNSUPPORTED_RESULT # depends on use case 300 274 ;; 301 275 $XFAIL) 302 - prlog " [${color_red}XFAIL${color_reset}]" 276 + prlog " [${color_green}XFAIL${color_reset}]" 303 277 XFAILED_CASES="$XFAILED_CASES $CASENO" 304 278 return 0 305 279 ;; ··· 431 405 prlog "# of unsupported: " `echo $UNSUPPORTED_CASES | wc -w` 432 406 prlog "# of xfailed: " `echo $XFAILED_CASES | wc -w` 433 407 prlog "# of undefined(test bug): " `echo $UNDEFINED_CASES | wc -w` 408 + 409 + cleanup 434 410 435 411 # if no error, return 0 436 412 exit $TOTAL_RESULT
+8 -1
tools/testing/selftests/ftrace/test.d/preemptirq/irqsoff_tracer.tc
··· 17 17 exit_unsupported 18 18 } 19 19 20 - modprobe $MOD || unsup "$MOD module not available" 20 + unres() { #msg 21 + reset_tracer 22 + rmmod $MOD || true 23 + echo $1 24 + exit_unresolved 25 + } 26 + 27 + modprobe $MOD || unres "$MOD module not available" 21 28 rmmod $MOD 22 29 23 30 grep -q "preemptoff" available_tracers || unsup "preemptoff tracer not enabled"
+28 -1
tools/testing/selftests/kvm/Makefile
··· 5 5 6 6 top_srcdir = ../../../.. 7 7 KSFT_KHDR_INSTALL := 1 8 + 9 + # For cross-builds to work, UNAME_M has to map to ARCH and arch specific 10 + # directories and targets in this Makefile. "uname -m" doesn't map to 11 + # arch specific sub-directory names. 12 + # 13 + # UNAME_M variable to used to run the compiles pointing to the right arch 14 + # directories and build the right targets for these supported architectures. 15 + # 16 + # TEST_GEN_PROGS and LIBKVM are set using UNAME_M variable. 17 + # LINUX_TOOL_ARCH_INCLUDE is set using ARCH variable. 18 + # 19 + # x86_64 targets are named to include x86_64 as a suffix and directories 20 + # for includes are in x86_64 sub-directory. s390x and aarch64 follow the 21 + # same convention. "uname -m" doesn't result in the correct mapping for 22 + # s390x and aarch64. 23 + # 24 + # No change necessary for x86_64 8 25 UNAME_M := $(shell uname -m) 26 + 27 + # Set UNAME_M for arm64 compile/install to work 28 + ifeq ($(ARCH),arm64) 29 + UNAME_M := aarch64 30 + endif 31 + # Set UNAME_M s390x compile/install to work 32 + ifeq ($(ARCH),s390) 33 + UNAME_M := s390x 34 + endif 9 35 10 36 LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/sparsebit.c lib/test_util.c 11 37 LIBKVM_x86_64 = lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c ··· 79 53 INSTALL_HDR_PATH = $(top_srcdir)/usr 80 54 LINUX_HDR_PATH = $(INSTALL_HDR_PATH)/include/ 81 55 LINUX_TOOL_INCLUDE = $(top_srcdir)/tools/include 82 - LINUX_TOOL_ARCH_INCLUDE = $(top_srcdir)/tools/arch/x86/include 56 + LINUX_TOOL_ARCH_INCLUDE = $(top_srcdir)/tools/arch/$(ARCH)/include 83 57 CFLAGS += -Wall -Wstrict-prototypes -Wuninitialized -O2 -g -std=gnu99 \ 84 58 -fno-stack-protector -fno-PIE -I$(LINUX_TOOL_INCLUDE) \ 85 59 -I$(LINUX_TOOL_ARCH_INCLUDE) -I$(LINUX_HDR_PATH) -Iinclude \ ··· 110 84 $(OUTPUT)/libkvm.a: $(LIBKVM_OBJ) 111 85 $(AR) crs $@ $^ 112 86 87 + x := $(shell mkdir -p $(sort $(dir $(TEST_GEN_PROGS)))) 113 88 all: $(STATIC_LIBS) 114 89 
$(TEST_GEN_PROGS): $(STATIC_LIBS) 115 90
+2 -2
tools/testing/selftests/kvm/include/evmcs.h
··· 219 219 #define HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_MASK \ 220 220 (~((1ull << HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT) - 1)) 221 221 222 - struct hv_enlightened_vmcs *current_evmcs; 223 - struct hv_vp_assist_page *current_vp_assist; 222 + extern struct hv_enlightened_vmcs *current_evmcs; 223 + extern struct hv_vp_assist_page *current_vp_assist; 224 224 225 225 int vcpu_enable_evmcs(struct kvm_vm *vm, int vcpu_id); 226 226
+3
tools/testing/selftests/kvm/lib/x86_64/vmx.c
··· 17 17 18 18 bool enable_evmcs; 19 19 20 + struct hv_enlightened_vmcs *current_evmcs; 21 + struct hv_vp_assist_page *current_vp_assist; 22 + 20 23 struct eptPageTableEntry { 21 24 uint64_t readable:1; 22 25 uint64_t writable:1;
+12 -10
tools/testing/selftests/lkdtm/run.sh
··· 25 25 # Figure out which test to run from our script name. 26 26 test=$(basename $0 .sh) 27 27 # Look up details about the test from master list of LKDTM tests. 28 - line=$(egrep '^#?'"$test"'\b' tests.txt) 28 + line=$(grep -E '^#?'"$test"'\b' tests.txt) 29 29 if [ -z "$line" ]; then 30 30 echo "Skipped: missing test '$test' in tests.txt" 31 31 exit $KSELFTEST_SKIP_TEST 32 32 fi 33 33 # Check that the test is known to LKDTM. 34 - if ! egrep -q '^'"$test"'$' "$TRIGGER" ; then 34 + if ! grep -E -q '^'"$test"'$' "$TRIGGER" ; then 35 35 echo "Skipped: test '$test' missing in $TRIGGER!" 36 36 exit $KSELFTEST_SKIP_TEST 37 37 fi ··· 59 59 expect="call trace:" 60 60 fi 61 61 62 - # Clear out dmesg for output reporting 63 - dmesg -c >/dev/null 64 - 65 62 # Prepare log for report checking 66 - LOG=$(mktemp --tmpdir -t lkdtm-XXXXXX) 63 + LOG=$(mktemp --tmpdir -t lkdtm-log-XXXXXX) 64 + DMESG=$(mktemp --tmpdir -t lkdtm-dmesg-XXXXXX) 67 65 cleanup() { 68 - rm -f "$LOG" 66 + rm -f "$LOG" "$DMESG" 69 67 } 70 68 trap cleanup EXIT 69 + 70 + # Save existing dmesg so we can detect new content below 71 + dmesg > "$DMESG" 71 72 72 73 # Most shells yell about signals and we're expecting the "cat" process 73 74 # to usually be killed by the kernel. So we have to run it in a sub-shell ··· 76 75 ($SHELL -c 'cat <(echo '"$test"') >'"$TRIGGER" 2>/dev/null) || true 77 76 78 77 # Record and dump the results 79 - dmesg -c >"$LOG" 78 + dmesg | diff --changed-group-format='%>' --unchanged-group-format='' "$DMESG" - > "$LOG" || true 79 + 80 80 cat "$LOG" 81 81 # Check for expected output 82 - if egrep -qi "$expect" "$LOG" ; then 82 + if grep -E -qi "$expect" "$LOG" ; then 83 83 echo "$test: saw '$expect': ok" 84 84 exit 0 85 85 else 86 - if egrep -qi XFAIL: "$LOG" ; then 86 + if grep -E -qi XFAIL: "$LOG" ; then 87 87 echo "$test: saw 'XFAIL': [SKIP]" 88 88 exit $KSELFTEST_SKIP_TEST 89 89 else
+1 -1
tools/testing/selftests/net/mptcp/pm_netlink.sh
··· 30 30 31 31 cleanup() 32 32 { 33 - rm -f $out 33 + rm -f $err 34 34 ip netns del $ns1 35 35 } 36 36
+1 -1
tools/testing/selftests/nsfs/pidns.c
··· 27 27 #define __stack_aligned__ __attribute__((aligned(16))) 28 28 struct cr_clone_arg { 29 29 char stack[128] __stack_aligned__; 30 - char stack_ptr[0]; 30 + char stack_ptr[]; 31 31 }; 32 32 33 33 static int child(void *args)
-1
tools/testing/selftests/wireguard/qemu/debug.config
··· 25 25 CONFIG_KASAN_INLINE=y 26 26 CONFIG_UBSAN=y 27 27 CONFIG_UBSAN_SANITIZE_ALL=y 28 - CONFIG_UBSAN_NO_ALIGNMENT=y 29 28 CONFIG_UBSAN_NULL=y 30 29 CONFIG_DEBUG_KMEMLEAK=y 31 30 CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=8192
+6 -2
virt/kvm/arm/hyp/aarch32.c
··· 125 125 */ 126 126 void __hyp_text kvm_skip_instr32(struct kvm_vcpu *vcpu, bool is_wide_instr) 127 127 { 128 + u32 pc = *vcpu_pc(vcpu); 128 129 bool is_thumb; 129 130 130 131 is_thumb = !!(*vcpu_cpsr(vcpu) & PSR_AA32_T_BIT); 131 132 if (is_thumb && !is_wide_instr) 132 - *vcpu_pc(vcpu) += 2; 133 + pc += 2; 133 134 else 134 - *vcpu_pc(vcpu) += 4; 135 + pc += 4; 136 + 137 + *vcpu_pc(vcpu) = pc; 138 + 135 139 kvm_adjust_itstate(vcpu); 136 140 }
+40
virt/kvm/arm/psci.c
··· 186 186 kvm_prepare_system_event(vcpu, KVM_SYSTEM_EVENT_RESET);
187 187 }
188 188
189 + static void kvm_psci_narrow_to_32bit(struct kvm_vcpu *vcpu)
190 + {
191 + int i;
192 +
193 + /*
194 + * Zero the input registers' upper 32 bits. They will be fully
195 + * zeroed on exit, so we're fine changing them in place.
196 + */
197 + for (i = 1; i < 4; i++)
198 + vcpu_set_reg(vcpu, i, lower_32_bits(vcpu_get_reg(vcpu, i)));
199 + }
200 +
201 + static unsigned long kvm_psci_check_allowed_function(struct kvm_vcpu *vcpu, u32 fn)
202 + {
203 + switch(fn) {
204 + case PSCI_0_2_FN64_CPU_SUSPEND:
205 + case PSCI_0_2_FN64_CPU_ON:
206 + case PSCI_0_2_FN64_AFFINITY_INFO:
207 + /* Disallow these functions for 32bit guests */
208 + if (vcpu_mode_is_32bit(vcpu))
209 + return PSCI_RET_NOT_SUPPORTED;
210 + break;
211 + }
212 +
213 + return 0;
214 + }
215 +
189 216 static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
190 217 {
191 218 struct kvm *kvm = vcpu->kvm;
192 219 u32 psci_fn = smccc_get_function(vcpu);
193 220 unsigned long val;
194 221 int ret = 1;
222 +
223 + val = kvm_psci_check_allowed_function(vcpu, psci_fn);
224 + if (val)
225 + goto out;
195 226
196 227 switch (psci_fn) {
197 228 case PSCI_0_2_FN_PSCI_VERSION:
··· 241 210 val = PSCI_RET_SUCCESS;
242 211 break;
243 212 case PSCI_0_2_FN_CPU_ON:
213 + kvm_psci_narrow_to_32bit(vcpu);
214 + fallthrough;
244 215 case PSCI_0_2_FN64_CPU_ON:
245 216 mutex_lock(&kvm->lock);
246 217 val = kvm_psci_vcpu_on(vcpu);
247 218 mutex_unlock(&kvm->lock);
248 219 break;
249 220 case PSCI_0_2_FN_AFFINITY_INFO:
221 + kvm_psci_narrow_to_32bit(vcpu);
222 + fallthrough;
250 223 case PSCI_0_2_FN64_AFFINITY_INFO:
251 224 val = kvm_psci_vcpu_affinity_info(vcpu);
252 225 break;
··· 291 256 break;
292 257 }
293 258
259 + out:
294 260 smccc_set_retval(vcpu, val, 0, 0, 0);
295 261 return ret;
296 262 }
··· 309 273 break;
310 274 case PSCI_1_0_FN_PSCI_FEATURES:
311 275 feature = smccc_get_arg1(vcpu);
276 + val = kvm_psci_check_allowed_function(vcpu, feature);
277 + if (val)
278 + break;
279 +
312 280 switch(feature) {
313 281 case PSCI_0_2_FN_PSCI_VERSION:
314 282 case PSCI_0_2_FN_CPU_SUSPEND:
+16 -3
virt/kvm/arm/vgic/vgic-init.c
··· 294 294
 		}
 	}
 
-	if (vgic_has_its(kvm)) {
+	if (vgic_has_its(kvm))
 		vgic_lpi_translation_cache_init(kvm);
+
+	/*
+	 * If we have GICv4.1 enabled, unconditionally request to enable the
+	 * v4 support so that we get HW-accelerated vSGIs. Otherwise, only
+	 * enable it if we present a virtual ITS to the guest.
+	 */
+	if (vgic_supports_direct_msis(kvm)) {
 		ret = vgic_v4_init(kvm);
 		if (ret)
 			goto out;
··· 355 348
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 
+	/*
+	 * Retire all pending LPIs on this vcpu anyway as we're
+	 * going to destroy it.
+	 */
+	vgic_flush_pending_lpis(vcpu);
+
 	INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
 }
··· 372 359
 
 	vgic_debug_destroy(kvm);
 
-	kvm_vgic_dist_destroy(kvm);
-
 	kvm_for_each_vcpu(i, vcpu, kvm)
 		kvm_vgic_vcpu_destroy(vcpu);
+
+	kvm_vgic_dist_destroy(kvm);
 }
 
 void kvm_vgic_destroy(struct kvm *kvm)
+9 -2
virt/kvm/arm/vgic/vgic-its.c
··· 96 96
 	 * We "cache" the configuration table entries in our struct vgic_irq's.
 	 * However we only have those structs for mapped IRQs, so we read in
 	 * the respective config data from memory here upon mapping the LPI.
+	 *
+	 * Should any of these fail, behave as if we couldn't create the LPI
+	 * by dropping the refcount and returning the error.
 	 */
 	ret = update_lpi_config(kvm, irq, NULL, false);
-	if (ret)
+	if (ret) {
+		vgic_put_irq(kvm, irq);
 		return ERR_PTR(ret);
+	}
 
 	ret = vgic_v3_lpi_sync_pending_status(kvm, irq);
-	if (ret)
+	if (ret) {
+		vgic_put_irq(kvm, irq);
 		return ERR_PTR(ret);
+	}
 
 	return irq;
 }
+10 -6
virt/kvm/arm/vgic/vgic-mmio-v2.c
··· 409 409
 		NULL, vgic_mmio_uaccess_write_v2_group, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_ENABLE_SET,
-		vgic_mmio_read_enable, vgic_mmio_write_senable, NULL, NULL, 1,
+		vgic_mmio_read_enable, vgic_mmio_write_senable,
+		NULL, vgic_uaccess_write_senable, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_ENABLE_CLEAR,
-		vgic_mmio_read_enable, vgic_mmio_write_cenable, NULL, NULL, 1,
+		vgic_mmio_read_enable, vgic_mmio_write_cenable,
+		NULL, vgic_uaccess_write_cenable, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_PENDING_SET,
-		vgic_mmio_read_pending, vgic_mmio_write_spending, NULL, NULL, 1,
+		vgic_mmio_read_pending, vgic_mmio_write_spending,
+		NULL, vgic_uaccess_write_spending, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_PENDING_CLEAR,
-		vgic_mmio_read_pending, vgic_mmio_write_cpending, NULL, NULL, 1,
+		vgic_mmio_read_pending, vgic_mmio_write_cpending,
+		NULL, vgic_uaccess_write_cpending, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_ACTIVE_SET,
 		vgic_mmio_read_active, vgic_mmio_write_sactive,
-		NULL, vgic_mmio_uaccess_write_sactive, 1,
+		vgic_uaccess_read_active, vgic_mmio_uaccess_write_sactive, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_ACTIVE_CLEAR,
 		vgic_mmio_read_active, vgic_mmio_write_cactive,
-		NULL, vgic_mmio_uaccess_write_cactive, 1,
+		vgic_uaccess_read_active, vgic_mmio_uaccess_write_cactive, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_PRI,
 		vgic_mmio_read_priority, vgic_mmio_write_priority, NULL, NULL,
+18 -13
virt/kvm/arm/vgic/vgic-mmio-v3.c
··· 50 50
 
 bool vgic_supports_direct_msis(struct kvm *kvm)
 {
-	return kvm_vgic_global_state.has_gicv4 && vgic_has_its(kvm);
+	return (kvm_vgic_global_state.has_gicv4_1 ||
+		(kvm_vgic_global_state.has_gicv4 && vgic_has_its(kvm)));
 }
 
 /*
··· 539 538
 		vgic_mmio_read_group, vgic_mmio_write_group, NULL, NULL, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_ISENABLER,
-		vgic_mmio_read_enable, vgic_mmio_write_senable, NULL, NULL, 1,
+		vgic_mmio_read_enable, vgic_mmio_write_senable,
+		NULL, vgic_uaccess_write_senable, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_ICENABLER,
-		vgic_mmio_read_enable, vgic_mmio_write_cenable, NULL, NULL, 1,
+		vgic_mmio_read_enable, vgic_mmio_write_cenable,
+		NULL, vgic_uaccess_write_cenable, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_ISPENDR,
 		vgic_mmio_read_pending, vgic_mmio_write_spending,
··· 556 553
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_ISACTIVER,
 		vgic_mmio_read_active, vgic_mmio_write_sactive,
-		NULL, vgic_mmio_uaccess_write_sactive, 1,
+		vgic_uaccess_read_active, vgic_mmio_uaccess_write_sactive, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_ICACTIVER,
 		vgic_mmio_read_active, vgic_mmio_write_cactive,
-		NULL, vgic_mmio_uaccess_write_cactive,
+		vgic_uaccess_read_active, vgic_mmio_uaccess_write_cactive,
 		1, VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_IPRIORITYR,
 		vgic_mmio_read_priority, vgic_mmio_write_priority, NULL, NULL,
··· 612 609
 	REGISTER_DESC_WITH_LENGTH(SZ_64K + GICR_IGROUPR0,
 		vgic_mmio_read_group, vgic_mmio_write_group, 4,
 		VGIC_ACCESS_32bit),
-	REGISTER_DESC_WITH_LENGTH(SZ_64K + GICR_ISENABLER0,
-		vgic_mmio_read_enable, vgic_mmio_write_senable, 4,
+	REGISTER_DESC_WITH_LENGTH_UACCESS(SZ_64K + GICR_ISENABLER0,
+		vgic_mmio_read_enable, vgic_mmio_write_senable,
+		NULL, vgic_uaccess_write_senable, 4,
 		VGIC_ACCESS_32bit),
-	REGISTER_DESC_WITH_LENGTH(SZ_64K + GICR_ICENABLER0,
-		vgic_mmio_read_enable, vgic_mmio_write_cenable, 4,
+	REGISTER_DESC_WITH_LENGTH_UACCESS(SZ_64K + GICR_ICENABLER0,
+		vgic_mmio_read_enable, vgic_mmio_write_cenable,
+		NULL, vgic_uaccess_write_cenable, 4,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_LENGTH_UACCESS(SZ_64K + GICR_ISPENDR0,
 		vgic_mmio_read_pending, vgic_mmio_write_spending,
··· 630 625
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_LENGTH_UACCESS(SZ_64K + GICR_ISACTIVER0,
 		vgic_mmio_read_active, vgic_mmio_write_sactive,
-		NULL, vgic_mmio_uaccess_write_sactive,
-		4, VGIC_ACCESS_32bit),
+		vgic_uaccess_read_active, vgic_mmio_uaccess_write_sactive, 4,
+		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_LENGTH_UACCESS(SZ_64K + GICR_ICACTIVER0,
 		vgic_mmio_read_active, vgic_mmio_write_cactive,
-		NULL, vgic_mmio_uaccess_write_cactive,
-		4, VGIC_ACCESS_32bit),
+		vgic_uaccess_read_active, vgic_mmio_uaccess_write_cactive, 4,
+		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_LENGTH(SZ_64K + GICR_IPRIORITYR0,
 		vgic_mmio_read_priority, vgic_mmio_write_priority, 32,
 		VGIC_ACCESS_32bit | VGIC_ACCESS_8bit),
+170 -58
virt/kvm/arm/vgic/vgic-mmio.c
··· 184 184
 	}
 }
 
+int vgic_uaccess_write_senable(struct kvm_vcpu *vcpu,
+			       gpa_t addr, unsigned int len,
+			       unsigned long val)
+{
+	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
+	int i;
+	unsigned long flags;
+
+	for_each_set_bit(i, &val, len * 8) {
+		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
+
+		raw_spin_lock_irqsave(&irq->irq_lock, flags);
+		irq->enabled = true;
+		vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
+
+		vgic_put_irq(vcpu->kvm, irq);
+	}
+
+	return 0;
+}
+
+int vgic_uaccess_write_cenable(struct kvm_vcpu *vcpu,
+			       gpa_t addr, unsigned int len,
+			       unsigned long val)
+{
+	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
+	int i;
+	unsigned long flags;
+
+	for_each_set_bit(i, &val, len * 8) {
+		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
+
+		raw_spin_lock_irqsave(&irq->irq_lock, flags);
+		irq->enabled = false;
+		raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
+
+		vgic_put_irq(vcpu->kvm, irq);
+	}
+
+	return 0;
+}
+
 unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
 				     gpa_t addr, unsigned int len)
 {
··· 261 219
 	return value;
 }
 
-/* Must be called with irq->irq_lock held */
-static void vgic_hw_irq_spending(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
-				 bool is_uaccess)
-{
-	if (is_uaccess)
-		return;
-
-	irq->pending_latch = true;
-	vgic_irq_set_phys_active(irq, true);
-}
-
 static bool is_vgic_v2_sgi(struct kvm_vcpu *vcpu, struct vgic_irq *irq)
 {
 	return (vgic_irq_is_sgi(irq->intid) &&
··· 271 240
 			      gpa_t addr, unsigned int len,
 			      unsigned long val)
 {
-	bool is_uaccess = !kvm_get_running_vcpu();
 	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
 	int i;
 	unsigned long flags;
··· 300 270
 			continue;
 		}
 
+		irq->pending_latch = true;
 		if (irq->hw)
-			vgic_hw_irq_spending(vcpu, irq, is_uaccess);
-		else
-			irq->pending_latch = true;
+			vgic_irq_set_phys_active(irq, true);
+
 		vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
 		vgic_put_irq(vcpu->kvm, irq);
 	}
 }
 
-/* Must be called with irq->irq_lock held */
-static void vgic_hw_irq_cpending(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
-				 bool is_uaccess)
+int vgic_uaccess_write_spending(struct kvm_vcpu *vcpu,
+				gpa_t addr, unsigned int len,
+				unsigned long val)
 {
-	if (is_uaccess)
-		return;
+	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
+	int i;
+	unsigned long flags;
 
+	for_each_set_bit(i, &val, len * 8) {
+		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
+
+		raw_spin_lock_irqsave(&irq->irq_lock, flags);
+		irq->pending_latch = true;
+
+		/*
+		 * GICv2 SGIs are terribly broken. We can't restore
+		 * the source of the interrupt, so just pick the vcpu
+		 * itself as the source...
+		 */
+		if (is_vgic_v2_sgi(vcpu, irq))
+			irq->source |= BIT(vcpu->vcpu_id);
+
+		vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
+
+		vgic_put_irq(vcpu->kvm, irq);
+	}
+
+	return 0;
+}
+
+/* Must be called with irq->irq_lock held */
+static void vgic_hw_irq_cpending(struct kvm_vcpu *vcpu, struct vgic_irq *irq)
+{
 	irq->pending_latch = false;
 
 	/*
··· 364 308
 			      gpa_t addr, unsigned int len,
 			      unsigned long val)
 {
-	bool is_uaccess = !kvm_get_running_vcpu();
 	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
 	int i;
 	unsigned long flags;
··· 394 339
 		}
 
 		if (irq->hw)
-			vgic_hw_irq_cpending(vcpu, irq, is_uaccess);
+			vgic_hw_irq_cpending(vcpu, irq);
 		else
 			irq->pending_latch = false;
 
··· 403 348
 		vgic_put_irq(vcpu->kvm, irq);
 	}
 }
 
-unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu,
-				    gpa_t addr, unsigned int len)
+int vgic_uaccess_write_cpending(struct kvm_vcpu *vcpu,
+				gpa_t addr, unsigned int len,
+				unsigned long val)
+{
+	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
+	int i;
+	unsigned long flags;
+
+	for_each_set_bit(i, &val, len * 8) {
+		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
+
+		raw_spin_lock_irqsave(&irq->irq_lock, flags);
+		/*
+		 * More fun with GICv2 SGIs! If we're clearing one of them
+		 * from userspace, which source vcpu to clear? Let's not
+		 * even think of it, and blow the whole set.
+		 */
+		if (is_vgic_v2_sgi(vcpu, irq))
+			irq->source = 0;
+
+		irq->pending_latch = false;
+
+		raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
+
+		vgic_put_irq(vcpu->kvm, irq);
+	}
+
+	return 0;
+}
+
+/*
+ * If we are fiddling with an IRQ's active state, we have to make sure the IRQ
+ * is not queued on some running VCPU's LRs, because then the change to the
+ * active state can be overwritten when the VCPU's state is synced coming back
+ * from the guest.
+ *
+ * For shared interrupts as well as GICv3 private interrupts, we have to
+ * stop all the VCPUs because interrupts can be migrated while we don't hold
+ * the IRQ locks and we don't want to be chasing moving targets.
+ *
+ * For GICv2 private interrupts we don't have to do anything because
+ * userspace accesses to the VGIC state already require all VCPUs to be
+ * stopped, and only the VCPU itself can modify its private interrupts
+ * active state, which guarantees that the VCPU is not running.
+ */
+static void vgic_access_active_prepare(struct kvm_vcpu *vcpu, u32 intid)
+{
+	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3 ||
+	    intid >= VGIC_NR_PRIVATE_IRQS)
+		kvm_arm_halt_guest(vcpu->kvm);
+}
+
+/* See vgic_access_active_prepare */
+static void vgic_access_active_finish(struct kvm_vcpu *vcpu, u32 intid)
+{
+	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3 ||
+	    intid >= VGIC_NR_PRIVATE_IRQS)
+		kvm_arm_resume_guest(vcpu->kvm);
+}
+
+static unsigned long __vgic_mmio_read_active(struct kvm_vcpu *vcpu,
+					     gpa_t addr, unsigned int len)
 {
 	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
 	u32 value = 0;
··· 474 359
 	for (i = 0; i < len * 8; i++) {
 		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
 
+		/*
+		 * Even for HW interrupts, don't evaluate the HW state as
+		 * all the guest is interested in is the virtual state.
+		 */
 		if (irq->active)
 			value |= (1U << i);
 
··· 485 366
 	}
 
 	return value;
+}
+
+unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu,
+				    gpa_t addr, unsigned int len)
+{
+	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
+	u32 val;
+
+	mutex_lock(&vcpu->kvm->lock);
+	vgic_access_active_prepare(vcpu, intid);
+
+	val = __vgic_mmio_read_active(vcpu, addr, len);
+
+	vgic_access_active_finish(vcpu, intid);
+	mutex_unlock(&vcpu->kvm->lock);
+
+	return val;
+}
+
+unsigned long vgic_uaccess_read_active(struct kvm_vcpu *vcpu,
+				       gpa_t addr, unsigned int len)
+{
+	return __vgic_mmio_read_active(vcpu, addr, len);
 }
 
 /* Must be called with irq->irq_lock held */
··· 568 426
 	raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
 }
 
-/*
- * If we are fiddling with an IRQ's active state, we have to make sure the IRQ
- * is not queued on some running VCPU's LRs, because then the change to the
- * active state can be overwritten when the VCPU's state is synced coming back
- * from the guest.
- *
- * For shared interrupts, we have to stop all the VCPUs because interrupts can
- * be migrated while we don't hold the IRQ locks and we don't want to be
- * chasing moving targets.
- *
- * For private interrupts we don't have to do anything because userspace
- * accesses to the VGIC state already require all VCPUs to be stopped, and
- * only the VCPU itself can modify its private interrupts active state, which
- * guarantees that the VCPU is not running.
- */
-static void vgic_change_active_prepare(struct kvm_vcpu *vcpu, u32 intid)
-{
-	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3 ||
-	    intid > VGIC_NR_PRIVATE_IRQS)
-		kvm_arm_halt_guest(vcpu->kvm);
-}
-
-/* See vgic_change_active_prepare */
-static void vgic_change_active_finish(struct kvm_vcpu *vcpu, u32 intid)
-{
-	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3 ||
-	    intid > VGIC_NR_PRIVATE_IRQS)
-		kvm_arm_resume_guest(vcpu->kvm);
-}
-
 static void __vgic_mmio_write_cactive(struct kvm_vcpu *vcpu,
 				      gpa_t addr, unsigned int len,
 				      unsigned long val)
··· 589 477
 	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
 
 	mutex_lock(&vcpu->kvm->lock);
-	vgic_change_active_prepare(vcpu, intid);
+	vgic_access_active_prepare(vcpu, intid);
 
 	__vgic_mmio_write_cactive(vcpu, addr, len, val);
 
-	vgic_change_active_finish(vcpu, intid);
+	vgic_access_active_finish(vcpu, intid);
 	mutex_unlock(&vcpu->kvm->lock);
 }
··· 626 514
 	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
 
 	mutex_lock(&vcpu->kvm->lock);
-	vgic_change_active_prepare(vcpu, intid);
+	vgic_access_active_prepare(vcpu, intid);
 
 	__vgic_mmio_write_sactive(vcpu, addr, len, val);
 
-	vgic_change_active_finish(vcpu, intid);
+	vgic_access_active_finish(vcpu, intid);
 	mutex_unlock(&vcpu->kvm->lock);
 }
+19
virt/kvm/arm/vgic/vgic-mmio.h
··· 138 138
 			   gpa_t addr, unsigned int len,
 			   unsigned long val);
 
+int vgic_uaccess_write_senable(struct kvm_vcpu *vcpu,
+			       gpa_t addr, unsigned int len,
+			       unsigned long val);
+
+int vgic_uaccess_write_cenable(struct kvm_vcpu *vcpu,
+			       gpa_t addr, unsigned int len,
+			       unsigned long val);
+
 unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
 				     gpa_t addr, unsigned int len);
··· 157 149
 				gpa_t addr, unsigned int len,
 				unsigned long val);
 
+int vgic_uaccess_write_spending(struct kvm_vcpu *vcpu,
+				gpa_t addr, unsigned int len,
+				unsigned long val);
+
+int vgic_uaccess_write_cpending(struct kvm_vcpu *vcpu,
+				gpa_t addr, unsigned int len,
+				unsigned long val);
+
 unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu,
+				    gpa_t addr, unsigned int len);
+
+unsigned long vgic_uaccess_read_active(struct kvm_vcpu *vcpu,
 				    gpa_t addr, unsigned int len);
 
 void vgic_mmio_write_cactive(struct kvm_vcpu *vcpu,