···
 Boris Brezillon <bbrezillon@kernel.org> <boris.brezillon@free-electrons.com>
 Brian Avery <b.avery@hp.com>
 Brian King <brking@us.ibm.com>
+Brian Silverman <bsilver16384@gmail.com> <brian.silverman@bluerivertech.com>
 Changbin Du <changbin.du@intel.com> <changbin.du@gmail.com>
 Changbin Du <changbin.du@intel.com> <changbin.du@intel.com>
 Chao Yu <chao@kernel.org> <chao2.yu@samsung.com>
+2-1
Documentation/accounting/psi.rst
···
 for the same psi metric can be specified. However for each trigger a separate
 file descriptor is required to be able to poll it separately from others,
 therefore for each trigger a separate open() syscall should be made even
-when opening the same psi interface file.
+when opening the same psi interface file. Write operations to a file descriptor
+with an already existing psi trigger will fail with EBUSY.
 
 Monitors activate only when system enters stall state for the monitored
 psi metric and deactivates upon exit from the stall state. While system is
+1
Documentation/admin-guide/gpio/index.rst
···
    gpio-aggregator
    sysfs
    gpio-mockup
+   gpio-sim
 
 .. only:: subproject and html
 
···
 
 - If you are in a process context (any syscall) and want to lock other
   process out, use a mutex. You can take a mutex and sleep
-  (``copy_from_user*(`` or ``kmalloc(x,GFP_KERNEL)``).
+  (``copy_from_user()`` or ``kmalloc(x,GFP_KERNEL)``).
 
 - Otherwise (== data can be touched in an interrupt), use
   spin_lock_irqsave() and
+20
Documentation/tools/index.rst
···
+.. SPDX-License-Identifier: GPL-2.0
+
+============
+Kernel tools
+============
+
+This book covers user-space tools that are shipped with the kernel source;
+more additions are needed here:
+
+.. toctree::
+   :maxdepth: 1
+
+   rtla/index
+
+.. only:: subproject and html
+
+   Indices
+   =======
+
+   * :ref:`genindex`
+26
Documentation/tools/rtla/index.rst
···
+.. SPDX-License-Identifier: GPL-2.0
+
+================================
+The realtime Linux analysis tool
+================================
+
+RTLA provides a set of tools for the analysis of the kernel's realtime
+behavior on specific hardware.
+
+.. toctree::
+   :maxdepth: 1
+
+   rtla
+   rtla-osnoise
+   rtla-osnoise-hist
+   rtla-osnoise-top
+   rtla-timerlat
+   rtla-timerlat-hist
+   rtla-timerlat-top
+
+.. only:: subproject and html
+
+   Indices
+   =======
+
+   * :ref:`genindex`
+3-1
Documentation/virt/kvm/api.rst
···
 
 :Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
              KVM_CAP_VCPU_ATTRIBUTES for vcpu device
+             KVM_CAP_SYS_ATTRIBUTES for system (/dev/kvm) device (no set)
 :Type: device ioctl, vm ioctl, vcpu ioctl
 :Parameters: struct kvm_device_attr
 :Returns: 0 on success, -1 on error
···
 ------------------------
 
 :Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
-             KVM_CAP_VCPU_ATTRIBUTES for vcpu device
+             KVM_CAP_VCPU_ATTRIBUTES for vcpu device
+             KVM_CAP_SYS_ATTRIBUTES for system (/dev/kvm) device
 :Type: device ioctl, vm ioctl, vcpu ioctl
 :Parameters: struct kvm_device_attr
 :Returns: 0 on success, -1 on error
+1-1
Documentation/vm/page_table_check.rst
···
 Introduction
 ============
 
-Page table check allows to hardern the kernel by ensuring that some types of
+Page table check allows to harden the kernel by ensuring that some types of
 the memory corruptions are prevented.
 
 Page table check performs extra verifications at the time when new pages become
···
 config ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
 	bool
 
+config ARM64_ERRATUM_2051678
+	bool "Cortex-A510: 2051678: disable Hardware Update of the page table dirty bit"
+	help
+	  This option adds the workaround for ARM Cortex-A510 erratum 2051678.
+	  Affected Cortex-A510 might not respect the ordering rules for
+	  hardware update of the page table's dirty bit. The workaround
+	  is to not enable the feature on affected CPUs.
+
+	  If unsure, say Y.
+
 config ARM64_ERRATUM_2119858
-	bool "Cortex-A710: 2119858: workaround TRBE overwriting trace data in FILL mode"
+	bool "Cortex-A710/X2: 2119858: workaround TRBE overwriting trace data in FILL mode"
 	default y
 	depends on CORESIGHT_TRBE
 	select ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
 	help
-	  This option adds the workaround for ARM Cortex-A710 erratum 2119858.
+	  This option adds the workaround for ARM Cortex-A710/X2 erratum 2119858.
 
-	  Affected Cortex-A710 cores could overwrite up to 3 cache lines of trace
+	  Affected Cortex-A710/X2 cores could overwrite up to 3 cache lines of trace
 	  data at the base of the buffer (pointed to by TRBASER_EL1) in FILL mode in
 	  the event of a WRAP event.
···
 	  If unsure, say Y.
 
 config ARM64_ERRATUM_2224489
-	bool "Cortex-A710: 2224489: workaround TRBE writing to address out-of-range"
+	bool "Cortex-A710/X2: 2224489: workaround TRBE writing to address out-of-range"
 	depends on CORESIGHT_TRBE
 	default y
 	select ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
 	help
-	  This option adds the workaround for ARM Cortex-A710 erratum 2224489.
+	  This option adds the workaround for ARM Cortex-A710/X2 erratum 2224489.
 
-	  Affected Cortex-A710 cores might write to an out-of-range address, not reserved
+	  Affected Cortex-A710/X2 cores might write to an out-of-range address, not reserved
 	  for TRBE. Under some conditions, the TRBE might generate a write to the next
 	  virtually addressed page following the last page of the TRBE address space
 	  (i.e., the TRBLIMITR_EL1.LIMIT), instead of wrapping around to the base.
 
 	  Work around this in the driver by always making sure that there is a
 	  page beyond the TRBLIMITR_EL1.LIMIT, within the space allowed for the TRBE.
+
+	  If unsure, say Y.
+
+config ARM64_ERRATUM_2064142
+	bool "Cortex-A510: 2064142: workaround TRBE register writes while disabled"
+	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
+	default y
+	help
+	  This option adds the workaround for ARM Cortex-A510 erratum 2064142.
+
+	  Affected Cortex-A510 core might fail to write into system registers after the
+	  TRBE has been disabled. Under some conditions after the TRBE has been disabled
+	  writes into TRBE registers TRBLIMITR_EL1, TRBPTR_EL1, TRBBASER_EL1, TRBSR_EL1,
+	  and TRBTRG_EL1 will be ignored and will not take effect.
+
+	  Work around this in the driver by executing TSB CSYNC and DSB after collection
+	  is stopped and before performing a system register write to one of the affected
+	  registers.
+
+	  If unsure, say Y.
+
+config ARM64_ERRATUM_2038923
+	bool "Cortex-A510: 2038923: workaround TRBE corruption with enable"
+	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
+	default y
+	help
+	  This option adds the workaround for ARM Cortex-A510 erratum 2038923.
+
+	  Affected Cortex-A510 core might cause an inconsistent view on whether trace is
+	  prohibited within the CPU. As a result, the trace buffer or trace buffer state
+	  might be corrupted. This happens after TRBE buffer has been enabled by setting
+	  TRBLIMITR_EL1.E, followed by just a single context synchronization event before
+	  execution changes from a context, in which trace is prohibited to one where it
+	  isn't, or vice versa. In these mentioned conditions, the view of whether trace
+	  is prohibited is inconsistent between parts of the CPU, and the trace buffer or
+	  the trace buffer state might be corrupted.
+
+	  Work around this in the driver by preventing an inconsistent view of whether the
+	  trace is prohibited or not based on TRBLIMITR_EL1.E by immediately following a
+	  change to TRBLIMITR_EL1.E with at least one ISB instruction before an ERET, or
+	  two ISB instructions if no ERET is to take place.
+
+	  If unsure, say Y.
+
+config ARM64_ERRATUM_1902691
+	bool "Cortex-A510: 1902691: workaround TRBE trace corruption"
+	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
+	default y
+	help
+	  This option adds the workaround for ARM Cortex-A510 erratum 1902691.
+
+	  Affected Cortex-A510 core might cause trace data corruption, when being written
+	  into the memory. Effectively TRBE is broken and hence cannot be used to capture
+	  trace data.
+
+	  Work around this problem in the driver by just preventing TRBE initialization on
+	  affected cpus. The firmware must have disabled the access to TRBE for the kernel
+	  on such implementations. This will cover the kernel for any firmware that doesn't
+	  do this already.
 
 	  If unsure, say Y.
···
  */
 
-static void start_backtrace(struct stackframe *frame, unsigned long fp,
-			    unsigned long pc)
+static notrace void start_backtrace(struct stackframe *frame, unsigned long fp,
+				    unsigned long pc)
 {
 	frame->fp = fp;
 	frame->pc = pc;
···
 	frame->prev_fp = 0;
 	frame->prev_type = STACK_TYPE_UNKNOWN;
 }
+NOKPROBE_SYMBOL(start_backtrace);
 
 /*
  * Unwind from one frame record (A) to the next frame record (B).
+4-1
arch/arm64/kernel/vdso/Makefile
···
 ccflags-y := -fno-common -fno-builtin -fno-stack-protector -ffixed-x18
 ccflags-y += -DDISABLE_BRANCH_PROFILING -DBUILD_VDSO
 
+# -Wmissing-prototypes and -Wmissing-declarations are removed from
+# the CFLAGS of vgettimeofday.c to make it possible to build the
+# kernel with CONFIG_WERROR enabled.
 CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS) $(GCC_PLUGINS_CFLAGS) \
-				$(CC_FLAGS_LTO)
+				$(CC_FLAGS_LTO) -Wmissing-prototypes -Wmissing-declarations
 KASAN_SANITIZE := n
 KCSAN_SANITIZE := n
 UBSAN_SANITIZE := n
···
 	depends on PROC_KCORE
 
 config IA64_MCA_RECOVERY
-	tristate "MCA recovery from errors other than TLB."
+	bool "MCA recovery from errors other than TLB."
 
 config IA64_PALINFO
 	tristate "/proc/pal support"
···
 	update_user_segment(15, val);
 }
 
+int __init find_free_bat(void);
+unsigned int bat_block_size(unsigned long base, unsigned long top);
 #endif /* !__ASSEMBLY__ */
 
 /* We happily ignore the smaller BATs on 601, we don't actually use
+1
arch/powerpc/include/asm/book3s/32/pgtable.h
···
 #ifndef __ASSEMBLY__
 
 int map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot);
+void unmap_kernel_page(unsigned long va);
 
 #endif /* !__ASSEMBLY__ */
 
+2
arch/powerpc/include/asm/book3s/64/pgtable.h
···
 	return hash__map_kernel_page(ea, pa, prot);
 }
 
+void unmap_kernel_page(unsigned long va);
+
 static inline int __meminit vmemmap_create_mapping(unsigned long start,
 						   unsigned long page_size,
 						   unsigned long phys)
···
 	pgd_t *shadow_pgtable;	/* our page table for this guest */
 	u64 l1_gr_to_hr;	/* L1's addr of part'n-scoped table */
 	u64 process_table;	/* process table entry for this guest */
-	u64 hfscr;		/* HFSCR that the L1 requested for this nested guest */
 	long refcnt;		/* number of pointers to this struct */
 	struct mutex tlb_lock;	/* serialize page faults and tlbies */
 	struct kvm_nested_guest *next;
+1
arch/powerpc/include/asm/kvm_host.h
···
 
 	/* For support of nested guests */
 	struct kvm_nested_guest *nested;
+	u64 nested_hfscr;	/* HFSCR that the L1 requested for the nested guest */
 	u32 nested_vcpu_id;
 	gpa_t nested_io_gpr;
 #endif
+1
arch/powerpc/include/asm/nohash/32/pgtable.h
···
 #ifndef __ASSEMBLY__
 
 int map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot);
+void unmap_kernel_page(unsigned long va);
 
 #endif /* !__ASSEMBLY__ */
 
+1
arch/powerpc/include/asm/nohash/64/pgtable.h
···
 #define __swp_entry_to_pte(x)		__pte((x).val)
 
 int map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot);
+void unmap_kernel_page(unsigned long va);
 extern int __meminit vmemmap_create_mapping(unsigned long start,
 					    unsigned long page_size,
 					    unsigned long phys);
···
 	unsigned long val, mask = -1UL;
 	unsigned int n = 6;
 
-	if (is_32bit_task())
+	if (is_tsk_32bit_task(task))
 		mask = 0xffffffff;
 
 	while (n--) {
···
 
 static inline int syscall_get_arch(struct task_struct *task)
 {
-	if (is_32bit_task())
+	if (is_tsk_32bit_task(task))
 		return AUDIT_ARCH_PPC;
 	else if (IS_ENABLED(CONFIG_CPU_LITTLE_ENDIAN))
 		return AUDIT_ARCH_PPC64LE;
···
 		__this_cpu_inc(irq_stat.timer_irqs_event);
 	} else {
 		now = *next_tb - now;
-		if (now <= decrementer_max)
-			set_dec_or_work(now);
+		if (now > decrementer_max)
+			now = decrementer_max;
+		set_dec_or_work(now);
 		__this_cpu_inc(irq_stat.timer_irqs_others);
 	}
 
+1-2
arch/powerpc/kvm/book3s_hv.c
···
 
 static int kvmppc_handle_nested_exit(struct kvm_vcpu *vcpu)
 {
-	struct kvm_nested_guest *nested = vcpu->arch.nested;
 	int r;
 	int srcu_idx;
···
 	 * it into a HEAI.
 	 */
 	if (!(vcpu->arch.hfscr_permitted & (1UL << cause)) ||
-			(nested->hfscr & (1UL << cause))) {
+			(vcpu->arch.nested_hfscr & (1UL << cause))) {
 		vcpu->arch.trap = BOOK3S_INTERRUPT_H_EMUL_ASSIST;
 
 		/*
+1-1
arch/powerpc/kvm/book3s_hv_nested.c
···
 	/* set L1 state to L2 state */
 	vcpu->arch.nested = l2;
 	vcpu->arch.nested_vcpu_id = l2_hv.vcpu_token;
-	l2->hfscr = l2_hv.hfscr;
+	vcpu->arch.nested_hfscr = l2_hv.hfscr;
 	vcpu->arch.regs = l2_regs;
 
 	/* Guest must always run with ME enabled, HV disabled. */
+5-5
arch/powerpc/mm/book3s32/mmu.c
···
 	return 0;
 }
 
-static int __init find_free_bat(void)
+int __init find_free_bat(void)
 {
 	int b;
 	int n = mmu_has_feature(MMU_FTR_USE_HIGH_BATS) ? 8 : 4;
···
  * - block size has to be a power of two. This is calculated by finding the
  *   highest bit set to 1.
  */
-static unsigned int block_size(unsigned long base, unsigned long top)
+unsigned int bat_block_size(unsigned long base, unsigned long top)
 {
 	unsigned int max_size = SZ_256M;
 	unsigned int base_shift = (ffs(base) - 1) & 31;
···
 	int idx;
 
 	while ((idx = find_free_bat()) != -1 && base != top) {
-		unsigned int size = block_size(base, top);
+		unsigned int size = bat_block_size(base, top);
 
 		if (size < 128 << 10)
 			break;
···
 	unsigned long size;
 
 	for (i = 0; i < nb - 1 && base < top;) {
-		size = block_size(base, top);
+		size = bat_block_size(base, top);
 		setibat(i++, PAGE_OFFSET + base, base, size, PAGE_KERNEL_TEXT);
 		base += size;
 	}
 	if (base < top) {
-		size = block_size(base, top);
+		size = bat_block_size(base, top);
 		if ((top - base) > size) {
 			size <<= 1;
 			if (strict_kernel_rwx_enabled() && base + size > border)
+29-26
arch/powerpc/mm/kasan/book3s_32.c
···
 {
 	unsigned long k_start = (unsigned long)kasan_mem_to_shadow(start);
 	unsigned long k_end = (unsigned long)kasan_mem_to_shadow(start + size);
-	unsigned long k_cur = k_start;
-	int k_size = k_end - k_start;
-	int k_size_base = 1 << (ffs(k_size) - 1);
+	unsigned long k_nobat = k_start;
+	unsigned long k_cur;
+	phys_addr_t phys;
 	int ret;
-	void *block;
 
-	block = memblock_alloc(k_size, k_size_base);
+	while (k_nobat < k_end) {
+		unsigned int k_size = bat_block_size(k_nobat, k_end);
+		int idx = find_free_bat();
 
-	if (block && k_size_base >= SZ_128K && k_start == ALIGN(k_start, k_size_base)) {
-		int shift = ffs(k_size - k_size_base);
-		int k_size_more = shift ? 1 << (shift - 1) : 0;
+		if (idx == -1)
+			break;
+		if (k_size < SZ_128K)
+			break;
+		phys = memblock_phys_alloc_range(k_size, k_size, 0,
+						 MEMBLOCK_ALLOC_ANYWHERE);
+		if (!phys)
+			break;
 
-		setbat(-1, k_start, __pa(block), k_size_base, PAGE_KERNEL);
-		if (k_size_more >= SZ_128K)
-			setbat(-1, k_start + k_size_base, __pa(block) + k_size_base,
-			       k_size_more, PAGE_KERNEL);
-		if (v_block_mapped(k_start))
-			k_cur = k_start + k_size_base;
-		if (v_block_mapped(k_start + k_size_base))
-			k_cur = k_start + k_size_base + k_size_more;
-
-		update_bats();
+		setbat(idx, k_nobat, phys, k_size, PAGE_KERNEL);
+		k_nobat += k_size;
 	}
+	if (k_nobat != k_start)
+		update_bats();
 
-	if (!block)
-		block = memblock_alloc(k_size, PAGE_SIZE);
-	if (!block)
-		return -ENOMEM;
+	if (k_nobat < k_end) {
+		phys = memblock_phys_alloc_range(k_end - k_nobat, PAGE_SIZE, 0,
+						 MEMBLOCK_ALLOC_ANYWHERE);
+		if (!phys)
+			return -ENOMEM;
+	}
 
 	ret = kasan_init_shadow_page_tables(k_start, k_end);
 	if (ret)
 		return ret;
 
-	kasan_update_early_region(k_start, k_cur, __pte(0));
+	kasan_update_early_region(k_start, k_nobat, __pte(0));
 
-	for (; k_cur < k_end; k_cur += PAGE_SIZE) {
+	for (k_cur = k_nobat; k_cur < k_end; k_cur += PAGE_SIZE) {
 		pmd_t *pmd = pmd_off_k(k_cur);
-		void *va = block + k_cur - k_start;
-		pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);
+		pte_t pte = pfn_pte(PHYS_PFN(phys + k_cur - k_nobat), PAGE_KERNEL);
 
 		__set_pte_at(&init_mm, k_cur, pte_offset_kernel(pmd, k_cur), pte, 0);
 	}
 	flush_tlb_kernel_range(k_start, k_end);
+	memset(kasan_mem_to_shadow(start), 0, k_end - k_start);
+
 	return 0;
 }
+9
arch/powerpc/mm/pgtable.c
···
 	__set_pte_at(mm, addr, ptep, pte, 0);
 }
 
+void unmap_kernel_page(unsigned long va)
+{
+	pmd_t *pmdp = pmd_off_k(va);
+	pte_t *ptep = pte_offset_kernel(pmdp, va);
+
+	pte_clear(&init_mm, va, ptep);
+	flush_tlb_kernel_range(va, va + PAGE_SIZE);
+}
+
 /*
  * This is called when relaxing access to a PTE. It's also called in the page
  * fault path when we don't hit any of the major fault cases, ie, a minor
+23-6
arch/powerpc/net/bpf_jit_comp.c
···
 	memset32(area, BREAKPOINT_INSTRUCTION, size / 4);
 }
 
-/* Fix the branch target addresses for subprog calls */
-static int bpf_jit_fixup_subprog_calls(struct bpf_prog *fp, u32 *image,
-				       struct codegen_context *ctx, u32 *addrs)
+/* Fix updated addresses (for subprog calls, ldimm64, et al) during extra pass */
+static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image,
+				   struct codegen_context *ctx, u32 *addrs)
 {
 	const struct bpf_insn *insn = fp->insnsi;
 	bool func_addr_fixed;
 	u64 func_addr;
 	u32 tmp_idx;
-	int i, ret;
+	int i, j, ret;
 
 	for (i = 0; i < fp->len; i++) {
 		/*
···
 			 * of the JITed sequence remains unchanged.
 			 */
 			ctx->idx = tmp_idx;
+		} else if (insn[i].code == (BPF_LD | BPF_IMM | BPF_DW)) {
+			tmp_idx = ctx->idx;
+			ctx->idx = addrs[i] / 4;
+#ifdef CONFIG_PPC32
+			PPC_LI32(ctx->b2p[insn[i].dst_reg] - 1, (u32)insn[i + 1].imm);
+			PPC_LI32(ctx->b2p[insn[i].dst_reg], (u32)insn[i].imm);
+			for (j = ctx->idx - addrs[i] / 4; j < 4; j++)
+				EMIT(PPC_RAW_NOP());
+#else
+			func_addr = ((u64)(u32)insn[i].imm) | (((u64)(u32)insn[i + 1].imm) << 32);
+			PPC_LI64(b2p[insn[i].dst_reg], func_addr);
+			/* overwrite rest with nops */
+			for (j = ctx->idx - addrs[i] / 4; j < 5; j++)
+				EMIT(PPC_RAW_NOP());
+#endif
+			ctx->idx = tmp_idx;
+			i++;
 		}
 	}
···
 	/*
 	 * Do not touch the prologue and epilogue as they will remain
 	 * unchanged. Only fix the branch target address for subprog
-	 * calls in the body.
+	 * calls in the body, and ldimm64 instructions.
 	 *
 	 * This does not change the offsets and lengths of the subprog
 	 * call instruction sequences and hence, the size of the JITed
 	 * image as well.
 	 */
-	bpf_jit_fixup_subprog_calls(fp, code_base, &cgctx, addrs);
+	bpf_jit_fixup_addresses(fp, code_base, &cgctx, addrs);
 
 	/* There is no need to perform the usual passes. */
 	goto skip_codegen_passes;
+9
arch/powerpc/net/bpf_jit_comp32.c
···
 
 	if (image && rel < 0x2000000 && rel >= -0x2000000) {
 		PPC_BL_ABS(func);
+		EMIT(PPC_RAW_NOP());
+		EMIT(PPC_RAW_NOP());
+		EMIT(PPC_RAW_NOP());
 	} else {
 		/* Load function address into r0 */
 		EMIT(PPC_RAW_LIS(_R0, IMM_H(func)));
···
 	bool func_addr_fixed;
 	u64 func_addr;
 	u32 true_cond;
+	u32 tmp_idx;
+	int j;
 
 	/*
 	 * addrs[] maps a BPF bytecode address into a real offset from
···
 		 * 16 byte instruction that uses two 'struct bpf_insn'
 		 */
 		case BPF_LD | BPF_IMM | BPF_DW: /* dst = (u64) imm */
+			tmp_idx = ctx->idx;
 			PPC_LI32(dst_reg_h, (u32)insn[i + 1].imm);
 			PPC_LI32(dst_reg, (u32)insn[i].imm);
+			/* padding to allow full 4 instructions for later patching */
+			for (j = ctx->idx - tmp_idx; j < 4; j++)
+				EMIT(PPC_RAW_NOP());
 			/* Adjust for two bpf instructions */
 			addrs[++i] = ctx->idx * 4;
 			break;
+19-10
arch/powerpc/net/bpf_jit_comp64.c
···
 	u64 imm64;
 	u32 true_cond;
 	u32 tmp_idx;
+	int j;
 
 	/*
 	 * addrs[] maps a BPF bytecode address into a real offset from
···
 			EMIT(PPC_RAW_MR(dst_reg, b2p[TMP_REG_1]));
 			break;
 		case 64:
-			/*
-			 * Way easier and faster(?) to store the value
-			 * into stack and then use ldbrx
-			 *
-			 * ctx->seen will be reliable in pass2, but
-			 * the instructions generated will remain the
-			 * same across all passes
-			 */
+			/* Store the value to stack and then use byte-reverse loads */
 			PPC_BPF_STL(dst_reg, 1, bpf_jit_stack_local(ctx));
 			EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], 1, bpf_jit_stack_local(ctx)));
-			EMIT(PPC_RAW_LDBRX(dst_reg, 0, b2p[TMP_REG_1]));
+			if (cpu_has_feature(CPU_FTR_ARCH_206)) {
+				EMIT(PPC_RAW_LDBRX(dst_reg, 0, b2p[TMP_REG_1]));
+			} else {
+				EMIT(PPC_RAW_LWBRX(dst_reg, 0, b2p[TMP_REG_1]));
+				if (IS_ENABLED(CONFIG_CPU_LITTLE_ENDIAN))
+					EMIT(PPC_RAW_SLDI(dst_reg, dst_reg, 32));
+				EMIT(PPC_RAW_LI(b2p[TMP_REG_2], 4));
+				EMIT(PPC_RAW_LWBRX(b2p[TMP_REG_2], b2p[TMP_REG_2], b2p[TMP_REG_1]));
+				if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
+					EMIT(PPC_RAW_SLDI(b2p[TMP_REG_2], b2p[TMP_REG_2], 32));
+				EMIT(PPC_RAW_OR(dst_reg, dst_reg, b2p[TMP_REG_2]));
+			}
 			break;
 		}
 		break;
···
 		case BPF_LD | BPF_IMM | BPF_DW: /* dst = (u64) imm */
 			imm64 = ((u64)(u32) insn[i].imm) |
 				(((u64)(u32) insn[i+1].imm) << 32);
+			tmp_idx = ctx->idx;
+			PPC_LI64(dst_reg, imm64);
+			/* padding to allow full 5 instructions for later patching */
+			for (j = ctx->idx - tmp_idx; j < 5; j++)
+				EMIT(PPC_RAW_NOP());
 			/* Adjust for two bpf instructions */
 			addrs[++i] = ctx->idx * 4;
 			break;
 
 		/*
+42-33
arch/powerpc/perf/core-book3s.c
···
 	mtspr(SPRN_PMC6, pmcs[5]);
 }
 
+/*
+ * If the perf subsystem wants performance monitor interrupts as soon as
+ * possible (e.g., to sample the instruction address and stack chain),
+ * this should return true. The IRQ masking code can then enable MSR[EE]
+ * in some places (e.g., interrupt handlers) that allows PMI interrupts
+ * through to improve accuracy of profiles, at the cost of some performance.
+ *
+ * The PMU counters can be enabled by other means (e.g., sysfs raw SPR
+ * access), but in that case there is no need for prompt PMI handling.
+ *
+ * This currently returns true if any perf counter is being used. It
+ * could possibly return false if only events are being counted rather than
+ * samples being taken, but for now this is good enough.
+ */
+bool power_pmu_wants_prompt_pmi(void)
+{
+	struct cpu_hw_events *cpuhw;
+
+	/*
+	 * This could simply test local_paca->pmcregs_in_use if that were not
+	 * under ifdef KVM.
+	 */
+	if (!ppmu)
+		return false;
+
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
+	return cpuhw->n_events;
+}
 #endif /* CONFIG_PPC64 */
 
 static void perf_event_interrupt(struct pt_regs *regs);
···
 	 * Otherwise provide a warning if there is PMI pending, but
 	 * no counter is found overflown.
 	 */
-	if (any_pmc_overflown(cpuhw))
-		clear_pmi_irq_pending();
-	else
+	if (any_pmc_overflown(cpuhw)) {
+		/*
+		 * Since power_pmu_disable runs under local_irq_save, it
+		 * could happen that code hits a PMC overflow without PMI
+		 * pending in paca. Hence only clear PMI pending if it was
+		 * set.
+		 *
+		 * If a PMI is pending, then MSR[EE] must be disabled (because
+		 * the masked PMI handler disabling EE). So it is safe to
+		 * call clear_pmi_irq_pending().
+		 */
+		if (pmi_irq_pending())
+			clear_pmi_irq_pending();
+	} else
 		WARN_ON(pmi_irq_pending());
 
 	val = mmcra = cpuhw->mmcr.mmcra;
···
 
 	__perf_event_interrupt(regs);
 	perf_sample_event_took(sched_clock() - start_clock);
-}
-
-/*
- * If the perf subsystem wants performance monitor interrupts as soon as
- * possible (e.g., to sample the instruction address and stack chain),
- * this should return true. The IRQ masking code can then enable MSR[EE]
- * in some places (e.g., interrupt handlers) that allows PMI interrupts
- * though to improve accuracy of profiles, at the cost of some performance.
- *
- * The PMU counters can be enabled by other means (e.g., sysfs raw SPR
- * access), but in that case there is no need for prompt PMI handling.
- *
- * This currently returns true if any perf counter is being used. It
- * could possibly return false if only events are being counted rather than
- * samples being taken, but for now this is good enough.
- */
-bool power_pmu_wants_prompt_pmi(void)
-{
-	struct cpu_hw_events *cpuhw;
-
-	/*
-	 * This could simply test local_paca->pmcregs_in_use if that were not
-	 * under ifdef KVM.
-	 */
-
-	if (!ppmu)
-		return false;
-
-	cpuhw = this_cpu_ptr(&cpu_hw_events);
-	return cpuhw->n_events;
 }
 
 static int power_pmu_prepare_cpu(unsigned int cpu)
+15
arch/s390/Kconfig
···
 
 endmenu
 
+config S390_MODULES_SANITY_TEST_HELPERS
+	def_bool n
+
 menu "Selftests"
 
 config S390_UNWIND_SELFTEST
···
 
 	  Say N if you are unsure.
 
+config S390_MODULES_SANITY_TEST
+	def_tristate n
+	depends on KUNIT
+	default KUNIT_ALL_TESTS
+	prompt "Enable s390 specific modules tests"
+	select S390_MODULES_SANITY_TEST_HELPERS
+	help
+	  This option enables an s390 specific modules test. This option is
+	  not useful for distributions or general kernels, but only for
+	  kernel developers working on architecture code.
+
+	  Say N if you are unsure.
 endmenu
+10-10
arch/s390/configs/debug_defconfig
···
 CONFIG_KVM=m
 CONFIG_S390_UNWIND_SELFTEST=m
 CONFIG_S390_KPROBES_SANITY_TEST=m
+CONFIG_S390_MODULES_SANITY_TEST=m
 CONFIG_KPROBES=y
 CONFIG_JUMP_LABEL=y
 CONFIG_STATIC_KEYS_SELFTEST=y
···
 CONFIG_MEMORY_HOTREMOVE=y
 CONFIG_KSM=y
 CONFIG_TRANSPARENT_HUGEPAGE=y
-CONFIG_FRONTSWAP=y
 CONFIG_CMA_DEBUG=y
 CONFIG_CMA_DEBUGFS=y
 CONFIG_CMA_SYSFS=y
···
 CONFIG_IDLE_PAGE_TRACKING=y
 CONFIG_PERCPU_STATS=y
 CONFIG_GUP_TEST=y
+CONFIG_ANON_VMA_NAME=y
 CONFIG_NET=y
 CONFIG_PACKET=y
 CONFIG_PACKET_DIAG=m
···
 CONFIG_UNIX_DIAG=m
 CONFIG_XFRM_USER=m
 CONFIG_NET_KEY=m
-CONFIG_NET_SWITCHDEV=y
 CONFIG_SMC=m
 CONFIG_SMC_DIAG=m
 CONFIG_INET=y
···
 CONFIG_NF_TABLES=m
 CONFIG_NF_TABLES_INET=y
 CONFIG_NFT_CT=m
-CONFIG_NFT_COUNTER=m
 CONFIG_NFT_LOG=m
 CONFIG_NFT_LIMIT=m
 CONFIG_NFT_NAT=m
···
 CONFIG_VSOCKETS=m
 CONFIG_VIRTIO_VSOCKETS=m
 CONFIG_NETLINK_DIAG=m
+CONFIG_NET_SWITCHDEV=y
 CONFIG_CGROUP_NET_PRIO=y
 CONFIG_NET_PKTGEN=m
 CONFIG_PCI=y
···
 CONFIG_HOTPLUG_PCI=y
 CONFIG_HOTPLUG_PCI_S390=y
 CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_SAFE=y
 CONFIG_CONNECTOR=y
 CONFIG_ZRAM=y
 CONFIG_BLK_DEV_LOOP=m
···
 # CONFIG_NET_VENDOR_DEC is not set
 # CONFIG_NET_VENDOR_DLINK is not set
 # CONFIG_NET_VENDOR_EMULEX is not set
+# CONFIG_NET_VENDOR_ENGLEDER is not set
 # CONFIG_NET_VENDOR_EZCHIP is not set
 # CONFIG_NET_VENDOR_GOOGLE is not set
 # CONFIG_NET_VENDOR_HUAWEI is not set
···
 CONFIG_MLX4_EN=m
 CONFIG_MLX5_CORE=m
 CONFIG_MLX5_CORE_EN=y
-CONFIG_MLX5_ESWITCH=y
 # CONFIG_NET_VENDOR_MICREL is not set
 # CONFIG_NET_VENDOR_MICROCHIP is not set
 # CONFIG_NET_VENDOR_MICROSEMI is not set
···
 # CONFIG_NET_VENDOR_SYNOPSYS is not set
 # CONFIG_NET_VENDOR_TEHUTI is not set
 # CONFIG_NET_VENDOR_TI is not set
+# CONFIG_NET_VENDOR_VERTEXCOM is not set
 # CONFIG_NET_VENDOR_VIA is not set
 # CONFIG_NET_VENDOR_WIZNET is not set
 # CONFIG_NET_VENDOR_XILINX is not set
···
 CONFIG_VIRTIO_INPUT=y
 CONFIG_VHOST_NET=m
 CONFIG_VHOST_VSOCK=m
+# CONFIG_SURFACE_PLATFORMS is not set
 CONFIG_S390_CCW_IOMMU=y
 CONFIG_S390_AP_IOMMU=y
 CONFIG_EXT4_FS=y
···
 CONFIG_CRYPTO_USER_API_RNG=m
 CONFIG_CRYPTO_USER_API_AEAD=m
 CONFIG_CRYPTO_STATS=y
-CONFIG_CRYPTO_LIB_BLAKE2S=m
-CONFIG_CRYPTO_LIB_CURVE25519=m
-CONFIG_CRYPTO_LIB_CHACHA20POLY1305=m
 CONFIG_ZCRYPT=m
 CONFIG_PKEY=m
 CONFIG_CRYPTO_PAES_S390=m
···
 CONFIG_CRYPTO_CRC32_S390=y
 CONFIG_CRYPTO_DEV_VIRTIO=m
 CONFIG_CORDIC=m
+CONFIG_CRYPTO_LIB_CURVE25519=m
+CONFIG_CRYPTO_LIB_CHACHA20POLY1305=m
 CONFIG_CRC32_SELFTEST=y
 CONFIG_CRC4=m
 CONFIG_CRC7=m
···
 CONFIG_SLUB_STATS=y
 CONFIG_DEBUG_STACK_USAGE=y
 CONFIG_DEBUG_VM=y
-CONFIG_DEBUG_VM_VMACACHE=y
 CONFIG_DEBUG_VM_PGFLAGS=y
 CONFIG_DEBUG_MEMORY_INIT=y
 CONFIG_MEMORY_NOTIFIER_ERROR_INJECT=m
···
 CONFIG_DETECT_HUNG_TASK=y
 CONFIG_WQ_WATCHDOG=y
 CONFIG_TEST_LOCKUP=m
-CONFIG_DEBUG_TIMEKEEPING=y
 CONFIG_PROVE_LOCKING=y
 CONFIG_LOCK_STAT=y
-CONFIG_DEBUG_LOCKDEP=y
 CONFIG_DEBUG_ATOMIC_SLEEP=y
 CONFIG_DEBUG_LOCKING_API_SELFTESTS=y
+CONFIG_DEBUG_IRQFLAGS=y
 CONFIG_DEBUG_SG=y
 CONFIG_DEBUG_NOTIFIERS=y
 CONFIG_BUG_ON_DATA_CORRUPTION=y
+9-7
arch/s390/configs/defconfig
···
 CONFIG_KVM=m
 CONFIG_S390_UNWIND_SELFTEST=m
 CONFIG_S390_KPROBES_SANITY_TEST=m
+CONFIG_S390_MODULES_SANITY_TEST=m
 CONFIG_KPROBES=y
 CONFIG_JUMP_LABEL=y
 # CONFIG_GCC_PLUGINS is not set
···
 CONFIG_MEMORY_HOTREMOVE=y
 CONFIG_KSM=y
 CONFIG_TRANSPARENT_HUGEPAGE=y
-CONFIG_FRONTSWAP=y
 CONFIG_CMA_SYSFS=y
 CONFIG_CMA_AREAS=7
 CONFIG_MEM_SOFT_DIRTY=y
···
 CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
 CONFIG_IDLE_PAGE_TRACKING=y
 CONFIG_PERCPU_STATS=y
+CONFIG_ANON_VMA_NAME=y
 CONFIG_NET=y
 CONFIG_PACKET=y
 CONFIG_PACKET_DIAG=m
···
 CONFIG_UNIX_DIAG=m
 CONFIG_XFRM_USER=m
 CONFIG_NET_KEY=m
-CONFIG_NET_SWITCHDEV=y
 CONFIG_SMC=m
 CONFIG_SMC_DIAG=m
 CONFIG_INET=y
···
 CONFIG_NF_TABLES=m
 CONFIG_NF_TABLES_INET=y
 CONFIG_NFT_CT=m
-CONFIG_NFT_COUNTER=m
 CONFIG_NFT_LOG=m
 CONFIG_NFT_LIMIT=m
 CONFIG_NFT_NAT=m
···
 CONFIG_VSOCKETS=m
 CONFIG_VIRTIO_VSOCKETS=m
 CONFIG_NETLINK_DIAG=m
+CONFIG_NET_SWITCHDEV=y
 CONFIG_CGROUP_NET_PRIO=y
 CONFIG_NET_PKTGEN=m
 CONFIG_PCI=y
···
 CONFIG_HOTPLUG_PCI_S390=y
 CONFIG_UEVENT_HELPER=y
 CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_SAFE=y
 CONFIG_CONNECTOR=y
 CONFIG_ZRAM=y
 CONFIG_BLK_DEV_LOOP=m
···
 # CONFIG_NET_VENDOR_DEC is not set
 # CONFIG_NET_VENDOR_DLINK is not set
 # CONFIG_NET_VENDOR_EMULEX is not set
+# CONFIG_NET_VENDOR_ENGLEDER is not set
 # CONFIG_NET_VENDOR_EZCHIP is not set
 # CONFIG_NET_VENDOR_GOOGLE is not set
 # CONFIG_NET_VENDOR_HUAWEI is not set
···
 CONFIG_MLX4_EN=m
 CONFIG_MLX5_CORE=m
 CONFIG_MLX5_CORE_EN=y
-CONFIG_MLX5_ESWITCH=y
 # CONFIG_NET_VENDOR_MICREL is not set
 # CONFIG_NET_VENDOR_MICROCHIP is not set
 # CONFIG_NET_VENDOR_MICROSEMI is not set
···
 # CONFIG_NET_VENDOR_SYNOPSYS is not set
 # CONFIG_NET_VENDOR_TEHUTI is not set
 # CONFIG_NET_VENDOR_TI is not set
+# CONFIG_NET_VENDOR_VERTEXCOM is not set
 # CONFIG_NET_VENDOR_VIA is not set
 # CONFIG_NET_VENDOR_WIZNET is not set
 # CONFIG_NET_VENDOR_XILINX is not set
···
 CONFIG_VIRTIO_INPUT=y
 CONFIG_VHOST_NET=m
 CONFIG_VHOST_VSOCK=m
+# CONFIG_SURFACE_PLATFORMS is not set
 CONFIG_S390_CCW_IOMMU=y
 CONFIG_S390_AP_IOMMU=y
 CONFIG_EXT4_FS=y
···
 CONFIG_CRYPTO_USER_API_RNG=m
 CONFIG_CRYPTO_USER_API_AEAD=m
 CONFIG_CRYPTO_STATS=y
-CONFIG_CRYPTO_LIB_BLAKE2S=m
-CONFIG_CRYPTO_LIB_CURVE25519=m
-CONFIG_CRYPTO_LIB_CHACHA20POLY1305=m
 CONFIG_ZCRYPT=m
 CONFIG_PKEY=m
 CONFIG_CRYPTO_PAES_S390=m
···
 CONFIG_CRYPTO_DEV_VIRTIO=m
 CONFIG_CORDIC=m
 CONFIG_PRIME_NUMBERS=m
+CONFIG_CRYPTO_LIB_CURVE25519=m
+CONFIG_CRYPTO_LIB_CHACHA20POLY1305=m
 CONFIG_CRC4=m
 CONFIG_CRC7=m
 CONFIG_CRC8=m
+3
arch/s390/configs/zfcpdump_defconfig
···11# CONFIG_SWAP is not set22CONFIG_NO_HZ_IDLE=y33CONFIG_HIGH_RES_TIMERS=y44+CONFIG_BPF_SYSCALL=y45# CONFIG_CPU_ISOLATION is not set56# CONFIG_UTS_NS is not set67# CONFIG_TIME_NS is not set···3534# CONFIG_PCPU_DEV_REFCNT is not set3635# CONFIG_ETHTOOL_NETLINK is not set3736CONFIG_DEVTMPFS=y3737+CONFIG_DEVTMPFS_SAFE=y3838CONFIG_BLK_DEV_RAM=y3939# CONFIG_DCSSBLK is not set4040# CONFIG_DASD is not set···6058# CONFIG_HID is not set6159# CONFIG_VIRTIO_MENU is not set6260# CONFIG_VHOST_MENU is not set6161+# CONFIG_SURFACE_PLATFORMS is not set6362# CONFIG_IOMMU_SUPPORT is not set6463# CONFIG_DNOTIFY is not set6564# CONFIG_INOTIFY_USER is not set
···264264	/* Validate vector registers */265265	union ctlreg0 cr0;266266267267-	if (!mci.vr) {267267+	/*268268+	 * The vector validity must only be checked if not running a269269+	 * KVM guest. For KVM guests the machine check is forwarded by270270+	 * KVM and it is the responsibility of the guest to take271271+	 * appropriate actions. The host vector or FPU values have been272272+	 * saved by KVM and will be restored by KVM.273273+	 */274274+	if (!mci.vr && !test_cpu_flag(CIF_MCCK_GUEST)) {268275		/*269276		 * Vector registers can't be restored. If the kernel270277		 * currently uses vector registers the system is···314307	if (cr2.gse) {315308		if (!mci.gs) {316309			/*317317-			 * Guarded storage register can't be restored and318318-			 * the current processes uses guarded storage.319319-			 * It has to be terminated.310310+			 * 2 cases:311311+			 * - machine check in kernel or userspace312312+			 * - machine check while running SIE (KVM guest)313313+			 * For kernel or userspace the userspace values of314314+			 * guarded storage control can not be recreated, the315315+			 * process must be terminated.316316+			 * For SIE the guest values of guarded storage can not317317+			 * be recreated. This is either due to a bug or due to318318+			 * GS being disabled in the guest. The guest will be319319+			 * notified by KVM code and the guest's machine check320320+			 * handling must take care of this. The host values321321+			 * are saved by KVM and are not affected.320322			 */321321-			kill_task = 1;323323+			if (!test_cpu_flag(CIF_MCCK_GUEST))324324+				kill_task = 1;322325		} else {323326			load_gs_cb((struct gs_cb *)mcesa->guarded_storage_save_area);324327		}
···186186 select HAVE_CONTEXT_TRACKING_OFFSTACK if HAVE_CONTEXT_TRACKING187187 select HAVE_C_RECORDMCOUNT188188 select HAVE_OBJTOOL_MCOUNT if STACK_VALIDATION189189+ select HAVE_BUILDTIME_MCOUNT_SORT189190 select HAVE_DEBUG_KMEMLEAK190191 select HAVE_DMA_CONTIGUOUS191192 select HAVE_DYNAMIC_FTRACE
+15
arch/x86/events/intel/core.c
···62366236		pmu->num_counters = x86_pmu.num_counters;62376237		pmu->num_counters_fixed = x86_pmu.num_counters_fixed;62386238	}62396239+62406240+	/*62416241+	 * Quirk: For some Alder Lake machines, when all E-cores are disabled in62426242+	 * a BIOS, the leaf 0xA will enumerate all counters of P-cores. However,62436243+	 * the X86_FEATURE_HYBRID_CPU is still set. The above code will62446244+	 * mistakenly add extra counters for P-cores. Correct the number of62456245+	 * counters here.62466246+	 */62476247+	if ((pmu->num_counters > 8) || (pmu->num_counters_fixed > 4)) {62486248+		pmu->num_counters = x86_pmu.num_counters;62496249+		pmu->num_counters_fixed = x86_pmu.num_counters_fixed;62506250+	}62516251+62396252	pmu->max_pebs_events = min_t(unsigned, MAX_PEBS_EVENTS, pmu->num_counters);62406253	pmu->unconstrained = (struct event_constraint)62416254					__EVENT_CONSTRAINT(0, (1ULL << pmu->num_counters) - 1,···63536340	}6354634163556342	if (x86_pmu.lbr_nr) {63436343+		intel_pmu_lbr_init();63446344+63566345		pr_cont("%d-deep LBR, ", x86_pmu.lbr_nr);6357634663586347		/* only support branch_stack snapshot for perfmon >= v2 */
+99-65
arch/x86/events/intel/lbr.c
···8899#include "../perf_event.h"10101111-static const enum {1212- LBR_EIP_FLAGS = 1,1313- LBR_TSX = 2,1414-} lbr_desc[LBR_FORMAT_MAX_KNOWN + 1] = {1515- [LBR_FORMAT_EIP_FLAGS] = LBR_EIP_FLAGS,1616- [LBR_FORMAT_EIP_FLAGS2] = LBR_EIP_FLAGS | LBR_TSX,1717-};1818-1911/*2012 * Intel LBR_SELECT bits2113 * Intel Vol3a, April 2011, Section 16.7 Table 16-10···235243 for (i = 0; i < x86_pmu.lbr_nr; i++) {236244 wrmsrl(x86_pmu.lbr_from + i, 0);237245 wrmsrl(x86_pmu.lbr_to + i, 0);238238- if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_INFO)246246+ if (x86_pmu.lbr_has_info)239247 wrmsrl(x86_pmu.lbr_info + i, 0);240248 }241249}···297305 */298306static inline bool lbr_from_signext_quirk_needed(void)299307{300300- int lbr_format = x86_pmu.intel_cap.lbr_format;301308 bool tsx_support = boot_cpu_has(X86_FEATURE_HLE) ||302309 boot_cpu_has(X86_FEATURE_RTM);303310304304- return !tsx_support && (lbr_desc[lbr_format] & LBR_TSX);311311+ return !tsx_support && x86_pmu.lbr_has_tsx;305312}306313307314static DEFINE_STATIC_KEY_FALSE(lbr_from_quirk_key);···418427419428void intel_pmu_lbr_restore(void *ctx)420429{421421- bool need_info = x86_pmu.intel_cap.lbr_format == LBR_FORMAT_INFO;422430 struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);423431 struct x86_perf_task_context *task_ctx = ctx;424424- int i;425425- unsigned lbr_idx, mask;432432+ bool need_info = x86_pmu.lbr_has_info;426433 u64 tos = task_ctx->tos;434434+ unsigned lbr_idx, mask;435435+ int i;427436428437 mask = x86_pmu.lbr_nr - 1;429438 for (i = 0; i < task_ctx->valid_lbrs; i++) {···435444 lbr_idx = (tos - i) & mask;436445 wrlbr_from(lbr_idx, 0);437446 wrlbr_to(lbr_idx, 0);438438- if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_INFO)447447+ if (need_info)439448 wrlbr_info(lbr_idx, 0);440449 }441450···510519511520void intel_pmu_lbr_save(void *ctx)512521{513513- bool need_info = x86_pmu.intel_cap.lbr_format == LBR_FORMAT_INFO;514522 struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);515523 struct x86_perf_task_context 
*task_ctx = ctx;524524+ bool need_info = x86_pmu.lbr_has_info;516525 unsigned lbr_idx, mask;517526 u64 tos;518527 int i;···807816{808817 bool need_info = false, call_stack = false;809818 unsigned long mask = x86_pmu.lbr_nr - 1;810810- int lbr_format = x86_pmu.intel_cap.lbr_format;811819 u64 tos = intel_pmu_lbr_tos();812820 int i;813821 int out = 0;···821831 for (i = 0; i < num; i++) {822832 unsigned long lbr_idx = (tos - i) & mask;823833 u64 from, to, mis = 0, pred = 0, in_tx = 0, abort = 0;824824- int skip = 0;825834 u16 cycles = 0;826826- int lbr_flags = lbr_desc[lbr_format];827835828836 from = rdlbr_from(lbr_idx, NULL);829837 to = rdlbr_to(lbr_idx, NULL);···833845 if (call_stack && !from)834846 break;835847836836- if (lbr_format == LBR_FORMAT_INFO && need_info) {837837- u64 info;848848+ if (x86_pmu.lbr_has_info) {849849+ if (need_info) {850850+ u64 info;838851839839- info = rdlbr_info(lbr_idx, NULL);840840- mis = !!(info & LBR_INFO_MISPRED);841841- pred = !mis;842842- in_tx = !!(info & LBR_INFO_IN_TX);843843- abort = !!(info & LBR_INFO_ABORT);844844- cycles = (info & LBR_INFO_CYCLES);845845- }852852+ info = rdlbr_info(lbr_idx, NULL);853853+ mis = !!(info & LBR_INFO_MISPRED);854854+ pred = !mis;855855+ cycles = (info & LBR_INFO_CYCLES);856856+ if (x86_pmu.lbr_has_tsx) {857857+ in_tx = !!(info & LBR_INFO_IN_TX);858858+ abort = !!(info & LBR_INFO_ABORT);859859+ }860860+ }861861+ } else {862862+ int skip = 0;846863847847- if (lbr_format == LBR_FORMAT_TIME) {848848- mis = !!(from & LBR_FROM_FLAG_MISPRED);849849- pred = !mis;850850- skip = 1;851851- cycles = ((to >> 48) & LBR_INFO_CYCLES);864864+ if (x86_pmu.lbr_from_flags) {865865+ mis = !!(from & LBR_FROM_FLAG_MISPRED);866866+ pred = !mis;867867+ skip = 1;868868+ }869869+ if (x86_pmu.lbr_has_tsx) {870870+ in_tx = !!(from & LBR_FROM_FLAG_IN_TX);871871+ abort = !!(from & LBR_FROM_FLAG_ABORT);872872+ skip = 3;873873+ }874874+ from = (u64)((((s64)from) << skip) >> skip);852875853853- to = (u64)((((s64)to) << 16) >> 
16);876876+ if (x86_pmu.lbr_to_cycles) {877877+ cycles = ((to >> 48) & LBR_INFO_CYCLES);878878+ to = (u64)((((s64)to) << 16) >> 16);879879+ }854880 }855855-856856- if (lbr_flags & LBR_EIP_FLAGS) {857857- mis = !!(from & LBR_FROM_FLAG_MISPRED);858858- pred = !mis;859859- skip = 1;860860- }861861- if (lbr_flags & LBR_TSX) {862862- in_tx = !!(from & LBR_FROM_FLAG_IN_TX);863863- abort = !!(from & LBR_FROM_FLAG_ABORT);864864- skip = 3;865865- }866866- from = (u64)((((s64)from) << skip) >> skip);867881868882 /*869883 * Some CPUs report duplicated abort records,···893903 cpuc->lbr_stack.hw_idx = tos;894904}895905906906+static DEFINE_STATIC_KEY_FALSE(x86_lbr_mispred);907907+static DEFINE_STATIC_KEY_FALSE(x86_lbr_cycles);908908+static DEFINE_STATIC_KEY_FALSE(x86_lbr_type);909909+896910static __always_inline int get_lbr_br_type(u64 info)897911{898898- if (!static_cpu_has(X86_FEATURE_ARCH_LBR) || !x86_pmu.lbr_br_type)899899- return 0;912912+ int type = 0;900913901901- return (info & LBR_INFO_BR_TYPE) >> LBR_INFO_BR_TYPE_OFFSET;914914+ if (static_branch_likely(&x86_lbr_type))915915+ type = (info & LBR_INFO_BR_TYPE) >> LBR_INFO_BR_TYPE_OFFSET;916916+917917+ return type;902918}903919904920static __always_inline bool get_lbr_mispred(u64 info)905921{906906- if (static_cpu_has(X86_FEATURE_ARCH_LBR) && !x86_pmu.lbr_mispred)907907- return 0;922922+ bool mispred = 0;908923909909- return !!(info & LBR_INFO_MISPRED);910910-}924924+ if (static_branch_likely(&x86_lbr_mispred))925925+ mispred = !!(info & LBR_INFO_MISPRED);911926912912-static __always_inline bool get_lbr_predicted(u64 info)913913-{914914- if (static_cpu_has(X86_FEATURE_ARCH_LBR) && !x86_pmu.lbr_mispred)915915- return 0;916916-917917- return !(info & LBR_INFO_MISPRED);927927+ return mispred;918928}919929920930static __always_inline u16 get_lbr_cycles(u64 info)921931{922922- if (static_cpu_has(X86_FEATURE_ARCH_LBR) &&923923- !(x86_pmu.lbr_timed_lbr && info & LBR_INFO_CYC_CNT_VALID))924924- return 0;932932+ u16 cycles = info & 
LBR_INFO_CYCLES;925933926926- return info & LBR_INFO_CYCLES;934934+ if (static_cpu_has(X86_FEATURE_ARCH_LBR) &&935935+ (!static_branch_likely(&x86_lbr_cycles) ||936936+ !(info & LBR_INFO_CYC_CNT_VALID)))937937+ cycles = 0;938938+939939+ return cycles;927940}928941929942static void intel_pmu_store_lbr(struct cpu_hw_events *cpuc,···954961 e->from = from;955962 e->to = to;956963 e->mispred = get_lbr_mispred(info);957957- e->predicted = get_lbr_predicted(info);964964+ e->predicted = !e->mispred;958965 e->in_tx = !!(info & LBR_INFO_IN_TX);959966 e->abort = !!(info & LBR_INFO_ABORT);960967 e->cycles = get_lbr_cycles(info);···1113112011141121 if ((br_type & PERF_SAMPLE_BRANCH_NO_CYCLES) &&11151122 (br_type & PERF_SAMPLE_BRANCH_NO_FLAGS) &&11161116- (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_INFO))11231123+ x86_pmu.lbr_has_info)11171124 reg->config |= LBR_NO_INFO;1118112511191126 return 0;···16991706 x86_pmu.intel_cap.lbr_format = LBR_FORMAT_EIP_FLAGS;17001707}1701170817091709+void intel_pmu_lbr_init(void)17101710+{17111711+ switch (x86_pmu.intel_cap.lbr_format) {17121712+ case LBR_FORMAT_EIP_FLAGS2:17131713+ x86_pmu.lbr_has_tsx = 1;17141714+ fallthrough;17151715+ case LBR_FORMAT_EIP_FLAGS:17161716+ x86_pmu.lbr_from_flags = 1;17171717+ break;17181718+17191719+ case LBR_FORMAT_INFO:17201720+ x86_pmu.lbr_has_tsx = 1;17211721+ fallthrough;17221722+ case LBR_FORMAT_INFO2:17231723+ x86_pmu.lbr_has_info = 1;17241724+ break;17251725+17261726+ case LBR_FORMAT_TIME:17271727+ x86_pmu.lbr_from_flags = 1;17281728+ x86_pmu.lbr_to_cycles = 1;17291729+ break;17301730+ }17311731+17321732+ if (x86_pmu.lbr_has_info) {17331733+ /*17341734+ * Only used in combination with baseline pebs.17351735+ */17361736+ static_branch_enable(&x86_lbr_mispred);17371737+ static_branch_enable(&x86_lbr_cycles);17381738+ }17391739+}17401740+17021741/*17031742 * LBR state size is variable based on the max number of registers.17041743 * This calculates the expected state size, which should match···17511726 * 
Check the LBR state with the corresponding software structure.17521727 * Disable LBR XSAVES support if the size doesn't match.17531728 */17291729+ if (xfeature_size(XFEATURE_LBR) == 0)17301730+ return false;17311731+17541732 if (WARN_ON(xfeature_size(XFEATURE_LBR) != get_lbr_state_size()))17551733 return false;17561734···17931765 x86_pmu.lbr_br_type = ecx.split.lbr_br_type;17941766 x86_pmu.lbr_nr = lbr_nr;1795176717681768+ if (x86_pmu.lbr_mispred)17691769+ static_branch_enable(&x86_lbr_mispred);17701770+ if (x86_pmu.lbr_timed_lbr)17711771+ static_branch_enable(&x86_lbr_cycles);17721772+ if (x86_pmu.lbr_br_type)17731773+ static_branch_enable(&x86_lbr_type);1796177417971775 arch_lbr_xsave = is_arch_lbr_xsave_available();17981776 if (arch_lbr_xsave) {
···1483148314841484 int (*get_msr_feature)(struct kvm_msr_entry *entry);1485148514861486- bool (*can_emulate_instruction)(struct kvm_vcpu *vcpu, void *insn, int insn_len);14861486+ bool (*can_emulate_instruction)(struct kvm_vcpu *vcpu, int emul_type,14871487+ void *insn, int insn_len);1487148814881489 bool (*apic_init_signal_blocked)(struct kvm_vcpu *vcpu);14891490 int (*enable_direct_tlbflush)(struct kvm_vcpu *vcpu);···14971496};1498149714991498struct kvm_x86_nested_ops {14991499+ void (*leave_nested)(struct kvm_vcpu *vcpu);15001500 int (*check_events)(struct kvm_vcpu *vcpu);15011501 bool (*hv_timer_pending)(struct kvm_vcpu *vcpu);15021502 void (*triple_fault)(struct kvm_vcpu *vcpu);···18631861int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu);18641862int kvm_cpu_get_interrupt(struct kvm_vcpu *v);18651863void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);18661866-void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu);1867186418681865int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,18691866 unsigned long ipi_bitmap_high, u32 min,
+3
arch/x86/include/uapi/asm/kvm.h
···452452453453#define KVM_STATE_VMX_PREEMPTION_TIMER_DEADLINE 0x00000001454454455455+/* attributes for system fd (group 0) */456456+#define KVM_X86_XCOMP_GUEST_SUPP 0457457+455458struct kvm_vmx_nested_state_data {456459 __u8 vmcs12[KVM_STATE_NESTED_VMX_VMCS_SIZE];457460 __u8 shadow_vmcs12[KVM_STATE_NESTED_VMX_VMCS_SIZE];
+1-1
arch/x86/kernel/cpu/mce/amd.c
···423423 u32 hi, lo;424424425425 /* sysfs write might race against an offline operation */426426- if (this_cpu_read(threshold_banks))426426+ if (!this_cpu_read(threshold_banks) && !tr->set_lvt_off)427427 return;428428429429 rdmsr(tr->b->address, lo, hi);
+1
arch/x86/kernel/cpu/mce/intel.c
···486486 case INTEL_FAM6_BROADWELL_X:487487 case INTEL_FAM6_SKYLAKE_X:488488 case INTEL_FAM6_ICELAKE_X:489489+ case INTEL_FAM6_ICELAKE_D:489490 case INTEL_FAM6_SAPPHIRERAPIDS_X:490491 case INTEL_FAM6_XEON_PHI_KNL:491492 case INTEL_FAM6_XEON_PHI_KNM:
+55-35
arch/x86/kvm/cpuid.c
···133133		orig = &vcpu->arch.cpuid_entries[i];134134		if (e2[i].function != orig->function ||135135		    e2[i].index != orig->index ||136136+		    e2[i].flags != orig->flags ||136137		    e2[i].eax != orig->eax || e2[i].ebx != orig->ebx ||137138		    e2[i].ecx != orig->ecx || e2[i].edx != orig->edx)138139			return -EINVAL;···197196	vcpu->arch.pv_cpuid.features = best->eax;198197}199198199199+/*200200+ * Calculate guest's supported XCR0 taking into account guest CPUID data and201201+ * supported_xcr0 (comprised of host configuration and KVM_SUPPORTED_XCR0).202202+ */203203+static u64 cpuid_get_supported_xcr0(struct kvm_cpuid_entry2 *entries, int nent)204204+{205205+	struct kvm_cpuid_entry2 *best;206206+207207+	best = cpuid_entry2_find(entries, nent, 0xd, 0);208208+	if (!best)209209+		return 0;210210+211211+	return (best->eax | ((u64)best->edx << 32)) & supported_xcr0;212212+}213213+200214static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,201215				       int nent)202216{203217	struct kvm_cpuid_entry2 *best;218218+	u64 guest_supported_xcr0 = cpuid_get_supported_xcr0(entries, nent);204219205220	best = cpuid_entry2_find(entries, nent, 1, 0);206221	if (best) {···255238			vcpu->arch.ia32_misc_enable_msr &256239			MSR_IA32_MISC_ENABLE_MWAIT);257240	}241241+242242+	/*243243+	 * Bits 127:0 of the allowed SECS.ATTRIBUTES (CPUID.0x12.0x1) enumerate244244+	 * the supported XSAVE Feature Request Mask (XFRM), i.e. the enclave's245245+	 * requested XCR0 value. The enclave's XFRM must be a subset of XCR0246246+	 * at the time of EENTER, thus adjust the allowed XFRM by the guest's247247+	 * supported XCR0. 
Similar to XCR0 handling, FP and SSE are forced to248248+ * '1' even on CPUs that don't support XSAVE.249249+ */250250+ best = cpuid_entry2_find(entries, nent, 0x12, 0x1);251251+ if (best) {252252+ best->ecx &= guest_supported_xcr0 & 0xffffffff;253253+ best->edx &= guest_supported_xcr0 >> 32;254254+ best->ecx |= XFEATURE_MASK_FPSSE;255255+ }258256}259257260258void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)···293261 kvm_apic_set_version(vcpu);294262 }295263296296- best = kvm_find_cpuid_entry(vcpu, 0xD, 0);297297- if (!best)298298- vcpu->arch.guest_supported_xcr0 = 0;299299- else300300- vcpu->arch.guest_supported_xcr0 =301301- (best->eax | ((u64)best->edx << 32)) & supported_xcr0;302302-303303- /*304304- * Bits 127:0 of the allowed SECS.ATTRIBUTES (CPUID.0x12.0x1) enumerate305305- * the supported XSAVE Feature Request Mask (XFRM), i.e. the enclave's306306- * requested XCR0 value. The enclave's XFRM must be a subset of XCRO307307- * at the time of EENTER, thus adjust the allowed XFRM by the guest's308308- * supported XCR0. Similar to XCR0 handling, FP and SSE are forced to309309- * '1' even on CPUs that don't support XSAVE.310310- */311311- best = kvm_find_cpuid_entry(vcpu, 0x12, 0x1);312312- if (best) {313313- best->ecx &= vcpu->arch.guest_supported_xcr0 & 0xffffffff;314314- best->edx &= vcpu->arch.guest_supported_xcr0 >> 32;315315- best->ecx |= XFEATURE_MASK_FPSSE;316316- }264264+ vcpu->arch.guest_supported_xcr0 =265265+ cpuid_get_supported_xcr0(vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent);317266318267 kvm_update_pv_runtime(vcpu);319268···359346 * KVM_SET_CPUID{,2} again. 
To support this legacy behavior, check360347 * whether the supplied CPUID data is equal to what's already set.361348 */362362- if (vcpu->arch.last_vmentry_cpu != -1)363363- return kvm_cpuid_check_equal(vcpu, e2, nent);349349+ if (vcpu->arch.last_vmentry_cpu != -1) {350350+ r = kvm_cpuid_check_equal(vcpu, e2, nent);351351+ if (r)352352+ return r;353353+354354+ kvfree(e2);355355+ return 0;356356+ }364357365358 r = kvm_check_cpuid(vcpu, e2, nent);366359 if (r)···906887 }907888 break;908889 case 0xd: {909909- u64 guest_perm = xstate_get_guest_group_perm();890890+ u64 permitted_xcr0 = supported_xcr0 & xstate_get_guest_group_perm();891891+ u64 permitted_xss = supported_xss;910892911911- entry->eax &= supported_xcr0 & guest_perm;912912- entry->ebx = xstate_required_size(supported_xcr0, false);893893+ entry->eax &= permitted_xcr0;894894+ entry->ebx = xstate_required_size(permitted_xcr0, false);913895 entry->ecx = entry->ebx;914914- entry->edx &= (supported_xcr0 & guest_perm) >> 32;915915- if (!supported_xcr0)896896+ entry->edx &= permitted_xcr0 >> 32;897897+ if (!permitted_xcr0)916898 break;917899918900 entry = do_host_cpuid(array, function, 1);···922902923903 cpuid_entry_override(entry, CPUID_D_1_EAX);924904 if (entry->eax & (F(XSAVES)|F(XSAVEC)))925925- entry->ebx = xstate_required_size(supported_xcr0 | supported_xss,905905+ entry->ebx = xstate_required_size(permitted_xcr0 | permitted_xss,926906 true);927907 else {928928- WARN_ON_ONCE(supported_xss != 0);908908+ WARN_ON_ONCE(permitted_xss != 0);929909 entry->ebx = 0;930910 }931931- entry->ecx &= supported_xss;932932- entry->edx &= supported_xss >> 32;911911+ entry->ecx &= permitted_xss;912912+ entry->edx &= permitted_xss >> 32;933913934914 for (i = 2; i < 64; ++i) {935915 bool s_state;936936- if (supported_xcr0 & BIT_ULL(i))916916+ if (permitted_xcr0 & BIT_ULL(i))937917 s_state = false;938938- else if (supported_xss & BIT_ULL(i))918918+ else if (permitted_xss & BIT_ULL(i))939919 s_state = true;940920 else941921 
continue;···949929 * invalid sub-leafs. Only valid sub-leafs should950930 * reach this point, and they should have a non-zero951931 * save state size. Furthermore, check whether the952952- * processor agrees with supported_xcr0/supported_xss932932+ * processor agrees with permitted_xcr0/permitted_xss953933 * on whether this is an XCR0- or IA32_XSS-managed area.954934 */955935 if (WARN_ON_ONCE(!entry->eax || (entry->ecx & 0x1) != s_state)) {
···983983/*984984 * Forcibly leave nested mode in order to be able to reset the VCPU later on.985985 */986986-void svm_leave_nested(struct vcpu_svm *svm)986986+void svm_leave_nested(struct kvm_vcpu *vcpu)987987{988988- struct kvm_vcpu *vcpu = &svm->vcpu;988988+ struct vcpu_svm *svm = to_svm(vcpu);989989990990 if (is_guest_mode(vcpu)) {991991 svm->nested.nested_run_pending = 0;···14111411 return -EINVAL;1412141214131413 if (!(kvm_state->flags & KVM_STATE_NESTED_GUEST_MODE)) {14141414- svm_leave_nested(svm);14141414+ svm_leave_nested(vcpu);14151415 svm_set_gif(svm, !!(kvm_state->flags & KVM_STATE_NESTED_GIF_SET));14161416 return 0;14171417 }···14781478 */1479147914801480 if (is_guest_mode(vcpu))14811481- svm_leave_nested(svm);14811481+ svm_leave_nested(vcpu);14821482 else14831483 svm->nested.vmcb02.ptr->save = svm->vmcb01.ptr->save;14841484···15321532}1533153315341534struct kvm_x86_nested_ops svm_nested_ops = {15351535+ .leave_nested = svm_leave_nested,15351536 .check_events = svm_check_nested_events,15361537 .triple_fault = nested_svm_triple_fault,15371538 .get_nested_state_pages = svm_get_nested_state_pages,
+7-2
arch/x86/kvm/svm/sev.c
···21002100 if (!sev_enabled || !npt_enabled)21012101 goto out;2102210221032103- /* Does the CPU support SEV? */21042104- if (!boot_cpu_has(X86_FEATURE_SEV))21032103+ /*21042104+ * SEV must obviously be supported in hardware. Sanity check that the21052105+ * CPU supports decode assists, which is mandatory for SEV guests to21062106+ * support instruction emulation.21072107+ */21082108+ if (!boot_cpu_has(X86_FEATURE_SEV) ||21092109+ WARN_ON_ONCE(!boot_cpu_has(X86_FEATURE_DECODEASSISTS)))21052110 goto out;2106211121072112 /* Retrieve SEV CPUID information */
+124-57
arch/x86/kvm/svm/svm.c
···290290291291 if ((old_efer & EFER_SVME) != (efer & EFER_SVME)) {292292 if (!(efer & EFER_SVME)) {293293- svm_leave_nested(svm);293293+ svm_leave_nested(vcpu);294294 svm_set_gif(svm, true);295295 /* #GP intercept is still needed for vmware backdoor */296296 if (!enable_vmware_backdoor)···312312 return ret;313313 }314314315315- if (svm_gp_erratum_intercept)315315+ /*316316+ * Never intercept #GP for SEV guests, KVM can't317317+ * decrypt guest memory to workaround the erratum.318318+ */319319+ if (svm_gp_erratum_intercept && !sev_guest(vcpu->kvm))316320 set_exception_intercept(svm, GP_VECTOR);317321 }318322 }···10141010 * Guest access to VMware backdoor ports could legitimately10151011 * trigger #GP because of TSS I/O permission bitmap.10161012 * We intercept those #GP and allow access to them anyway10171017- * as VMware does.10131013+ * as VMware does. Don't intercept #GP for SEV guests as KVM can't10141014+ * decrypt guest memory to decode the faulting instruction.10181015 */10191019- if (enable_vmware_backdoor)10161016+ if (enable_vmware_backdoor && !sev_guest(vcpu->kvm))10201017 set_exception_intercept(svm, GP_VECTOR);1021101810221019 svm_set_intercept(svm, INTERCEPT_INTR);···20962091 if (error_code)20972092 goto reinject;2098209320992099- /* All SVM instructions expect page aligned RAX */21002100- if (svm->vmcb->save.rax & ~PAGE_MASK)21012101- goto reinject;21022102-21032094 /* Decode the instruction for usage later */21042095 if (x86_decode_emulated_instruction(vcpu, 0, NULL, 0) != EMULATION_OK)21052096 goto reinject;···21132112 if (!is_guest_mode(vcpu))21142113 return kvm_emulate_instruction(vcpu,21152114 EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);21162116- } else21152115+ } else {21162116+ /* All SVM instructions expect page aligned RAX */21172117+ if (svm->vmcb->save.rax & ~PAGE_MASK)21182118+ goto reinject;21192119+21172120 return emulate_svm_instr(vcpu, opcode);21212121+ }2118212221192123reinject:21202124 kvm_queue_exception_e(vcpu, GP_VECTOR, 
error_code);···42584252 }42594253}4260425442614261-static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn, int insn_len)42554255+static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,42564256+ void *insn, int insn_len)42624257{42634258 bool smep, smap, is_user;42644259 unsigned long cr4;42604260+ u64 error_code;42614261+42624262+ /* Emulation is always possible when KVM has access to all guest state. */42634263+ if (!sev_guest(vcpu->kvm))42644264+ return true;42654265+42664266+ /* #UD and #GP should never be intercepted for SEV guests. */42674267+ WARN_ON_ONCE(emul_type & (EMULTYPE_TRAP_UD |42684268+ EMULTYPE_TRAP_UD_FORCED |42694269+ EMULTYPE_VMWARE_GP));4265427042664271 /*42674267- * When the guest is an SEV-ES guest, emulation is not possible.42724272+ * Emulation is impossible for SEV-ES guests as KVM doesn't have access42734273+ * to guest register state.42684274 */42694275 if (sev_es_guest(vcpu->kvm))42704276 return false;4271427742724278 /*42734273- * Detect and workaround Errata 1096 Fam_17h_00_0Fh.42744274- *42754275- * Errata:42764276- * When CPU raise #NPF on guest data access and vCPU CR4.SMAP=1, it is42774277- * possible that CPU microcode implementing DecodeAssist will fail42784278- * to read bytes of instruction which caused #NPF. In this case,42794279- * GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly42804280- * return 0 instead of the correct guest instruction bytes.42814281- *42824282- * This happens because CPU microcode reading instruction bytes42834283- * uses a special opcode which attempts to read data using CPL=042844284- * privileges. 
The microcode reads CS:RIP and if it hits a SMAP42854285- * fault, it gives up and returns no instruction bytes.42864286- *42874287- * Detection:42884288- * We reach here in case CPU supports DecodeAssist, raised #NPF and42894289- * returned 0 in GuestIntrBytes field of the VMCB.42904290- * First, errata can only be triggered in case vCPU CR4.SMAP=1.42914291- * Second, if vCPU CR4.SMEP=1, errata could only be triggered42924292- * in case vCPU CPL==3 (Because otherwise guest would have triggered42934293- * a SMEP fault instead of #NPF).42944294- * Otherwise, vCPU CR4.SMEP=0, errata could be triggered by any vCPU CPL.42954295- * As most guests enable SMAP if they have also enabled SMEP, use above42964296- * logic in order to attempt minimize false-positive of detecting errata42974297- * while still preserving all cases semantic correctness.42984298- *42994299- * Workaround:43004300- * To determine what instruction the guest was executing, the hypervisor43014301- * will have to decode the instruction at the instruction pointer.43024302- *43034303- * In non SEV guest, hypervisor will be able to read the guest43044304- * memory to decode the instruction pointer when insn_len is zero43054305- * so we return true to indicate that decoding is possible.43064306- *43074307- * But in the SEV guest, the guest memory is encrypted with the43084308- * guest specific key and hypervisor will not be able to decode the43094309- * instruction pointer so we will not able to workaround it. 
Lets43104310-	 * print the error and request to kill the guest.42794279+	 * Emulation is possible if the instruction is already decoded, e.g.42804280+	 * when completing I/O after returning from userspace.43114281	 */43124312-	if (likely(!insn || insn_len))42824282+	if (emul_type & EMULTYPE_NO_DECODE)43134283		return true;4314428443154285	/*43164316-	 * If RIP is invalid, go ahead with emulation which will cause an43174317-	 * internal error exit.42864286+	 * Emulation is possible for SEV guests if and only if a prefilled42874287+	 * buffer containing the bytes of the intercepted instruction is42884288+	 * available. SEV guest memory is encrypted with a guest specific key42894289+	 * and cannot be decrypted by KVM, i.e. KVM would read cyphertext and42904290+	 * decode garbage.42914291+	 *42924292+	 * Inject #UD if KVM reached this point without an instruction buffer.42934293+	 * In practice, this path should never be hit by a well-behaved guest,42944294+	 * e.g. KVM doesn't intercept #UD or #GP for SEV guests, but this path42954295+	 * is still theoretically reachable, e.g. via unaccelerated fault-like42964296+	 * AVIC access, and needs to be handled by KVM to avoid putting the42974297+	 * guest into an infinite loop. Injecting #UD is somewhat arbitrary,42984298+	 * but it's the least awful option given lack of insight into the guest.43184299+	 */43194319-	if (!kvm_vcpu_gfn_to_memslot(vcpu, kvm_rip_read(vcpu) >> PAGE_SHIFT))43004300+	if (unlikely(!insn)) {43014301+		kvm_queue_exception(vcpu, UD_VECTOR);43024302+		return false;43034303+	}43044304+43054305+	/*43064306+	 * Emulate for SEV guests if the insn buffer is not empty. The buffer43074307+	 * will be empty if the DecodeAssist microcode cannot fetch bytes for43084308+	 * the faulting instruction because the code fetch itself faulted, e.g.43094309+	 * the guest attempted to fetch from emulated MMIO or a guest page43104310+	 * table used to translate CS:RIP resides in emulated MMIO.43114311+	 */43124312+	if (likely(insn_len))43204313+		return true;43144314+43154315+	/*43164316+	 * Detect and workaround Errata 1096 Fam_17h_00_0Fh.43174317+	 *43184318+	 * Errata:43194319+	 * When CPU raises #NPF on guest data access and vCPU CR4.SMAP=1, it is43204320+	 * possible that CPU microcode implementing DecodeAssist will fail to43214321+	 * read guest memory at CS:RIP and vmcb.GuestIntrBytes will incorrectly43224322+	 * be '0'. This happens because microcode reads CS:RIP using a _data_43234323+	 * load uop with CPL=0 privileges. If the load hits a SMAP #PF, ucode43244324+	 * gives up and does not fill the instruction bytes buffer.43254325+	 *43264326+	 * As above, KVM reaches this point iff the VM is an SEV guest, the CPU43274327+	 * supports DecodeAssist, a #NPF was raised, KVM's page fault handler43284328+	 * triggered emulation (e.g. for MMIO), and the CPU returned 0 in the43294329+	 * GuestIntrBytes field of the VMCB.43304330+	 *43314331+	 * This does _not_ mean that the erratum has been encountered, as the43324332+	 * DecodeAssist will also fail if the load for CS:RIP hits a legitimate43334333+	 * #PF, e.g. if the guest attempted to execute from emulated MMIO and43344334+	 * encountered a reserved/not-present #PF.43354335+	 *43364336+	 * To hit the erratum, the following conditions must be true:43374337+	 * 1. CR4.SMAP=1 (obviously).43384338+	 * 2. CR4.SMEP=0 || CPL=3. If SMEP=1 and CPL<3, the erratum cannot43394339+	 *    have been hit as the guest would have encountered a SMEP43404340+	 *    violation #PF, not a #NPF.43414341+	 * 3. 
The #NPF is not due to a code fetch, in which case failure to43424342+ * retrieve the instruction bytes is legitimate (see abvoe).43434343+ *43444344+ * In addition, don't apply the erratum workaround if the #NPF occurred43454345+ * while translating guest page tables (see below).43464346+ */43474347+ error_code = to_svm(vcpu)->vmcb->control.exit_info_1;43484348+ if (error_code & (PFERR_GUEST_PAGE_MASK | PFERR_FETCH_MASK))43494349+ goto resume_guest;4321435043224351 cr4 = kvm_read_cr4(vcpu);43234352 smep = cr4 & X86_CR4_SMEP;43244353 smap = cr4 & X86_CR4_SMAP;43254354 is_user = svm_get_cpl(vcpu) == 3;43264355 if (smap && (!smep || is_user)) {43274327- if (!sev_guest(vcpu->kvm))43284328- return true;43294329-43304356 pr_err_ratelimited("KVM: SEV Guest triggered AMD Erratum 1096\n");43314331- kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);43574357+43584358+ /*43594359+ * If the fault occurred in userspace, arbitrarily inject #GP43604360+ * to avoid killing the guest and to hopefully avoid confusing43614361+ * the guest kernel too much, e.g. injecting #PF would not be43624362+ * coherent with respect to the guest's page tables. Request43634363+ * triple fault if the fault occurred in the kernel as there's43644364+ * no fault that KVM can inject without confusing the guest.43654365+ * In practice, the triple fault is moot as no sane SEV kernel43664366+ * will execute from user memory while also running with SMAP=1.43674367+ */43684368+ if (is_user)43694369+ kvm_inject_gp(vcpu, 0);43704370+ else43714371+ kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);43324372 }4333437343744374+resume_guest:43754375+ /*43764376+ * If the erratum was not hit, simply resume the guest and let it fault43774377+ * again. While awful, e.g. the vCPU may get stuck in an infinite loop43784378+ * if the fault is at CPL=0, it's the lesser of all evils. 
Exiting to43794379+ * userspace will kill the guest, and letting the emulator read garbage43804380+ * will yield random behavior and potentially corrupt the guest.43814381+ *43824382+ * Simply resuming the guest is technically not a violation of the SEV43834383+ * architecture. AMD's APM states that all code fetches and page table43844384+ * accesses for SEV guest are encrypted, regardless of the C-Bit. The43854385+ * APM also states that encrypted accesses to MMIO are "ignored", but43864386+ * doesn't explicitly define "ignored", i.e. doing nothing and letting43874387+ * the guest spin is technically "ignoring" the access.43884388+ */43344389 return false;43354390}43364391
···4646 if (npt_enabled &&4747 ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB)4848 hve->hv_enlightenments_control.enlightened_npt_tlb = 1;4949+5050+ if (ms_hyperv.nested_features & HV_X64_NESTED_MSR_BITMAP)5151+ hve->hv_enlightenments_control.msr_bitmap = 1;4952}50535154static inline void svm_hv_hardware_setup(void)···8683 struct hv_enlightenments *hve =8784 (struct hv_enlightenments *)vmcb->control.reserved_sw;88858989- /*9090- * vmcb can be NULL if called during early vcpu init.9191- * And its okay not to mark vmcb dirty during vcpu init9292- * as we mark it dirty unconditionally towards end of vcpu9393- * init phase.9494- */9595- if (vmcb_is_clean(vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS) &&9696- hve->hv_enlightenments_control.msr_bitmap)8686+ if (hve->hv_enlightenments_control.msr_bitmap)9787 vmcb_mark_dirty(vmcb, VMCB_HV_NESTED_ENLIGHTENMENTS);9888}9989
-1
arch/x86/kvm/vmx/capabilities.h
···54545555struct vmcs_config {5656 int size;5757- int order;5857 u32 basic_cap;5958 u32 revision_id;6059 u32 pin_based_exec_ctrl;
···5959 SECONDARY_EXEC_SHADOW_VMCS | \6060 SECONDARY_EXEC_TSC_SCALING | \6161 SECONDARY_EXEC_PAUSE_LOOP_EXITING)6262-#define EVMCS1_UNSUPPORTED_VMEXIT_CTRL (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL)6262+#define EVMCS1_UNSUPPORTED_VMEXIT_CTRL \6363+ (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | \6464+ VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)6365#define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL)6466#define EVMCS1_UNSUPPORTED_VMFUNC (VMX_VMFUNC_EPTP_SWITCHING)6565-6666-#if IS_ENABLED(CONFIG_HYPERV)67676868struct evmcs_field {6969 u16 offset;···7373extern const struct evmcs_field vmcs_field_to_evmcs_1[];7474extern const unsigned int nr_evmcs_1_fields;75757676-static __always_inline int get_evmcs_offset(unsigned long field,7777- u16 *clean_field)7676+static __always_inline int evmcs_field_offset(unsigned long field,7777+ u16 *clean_field)7878{7979 unsigned int index = ROL16(field, 6);8080 const struct evmcs_field *evmcs_field;81818282- if (unlikely(index >= nr_evmcs_1_fields)) {8383- WARN_ONCE(1, "KVM: accessing unsupported EVMCS field %lx\n",8484- field);8282+ if (unlikely(index >= nr_evmcs_1_fields))8583 return -ENOENT;8686- }87848885 evmcs_field = &vmcs_field_to_evmcs_1[index];8686+8787+ /*8888+ * Use offset=0 to detect holes in eVMCS. 
This offset belongs to8989+ * 'revision_id' but this field has no encoding and is supposed to9090+ * be accessed directly.9191+ */9292+ if (unlikely(!evmcs_field->offset))9393+ return -ENOENT;89949095 if (clean_field)9196 *clean_field = evmcs_field->clean_field;92979398 return evmcs_field->offset;9999+}100100+101101+static inline u64 evmcs_read_any(struct hv_enlightened_vmcs *evmcs,102102+ unsigned long field, u16 offset)103103+{104104+ /*105105+ * vmcs12_read_any() doesn't care whether the supplied structure106106+ * is 'struct vmcs12' or 'struct hv_enlightened_vmcs' as it takes107107+ * the exact offset of the required field, use it for convenience108108+ * here.109109+ */110110+ return vmcs12_read_any((void *)evmcs, field, offset);111111+}112112+113113+#if IS_ENABLED(CONFIG_HYPERV)114114+115115+static __always_inline int get_evmcs_offset(unsigned long field,116116+ u16 *clean_field)117117+{118118+ int offset = evmcs_field_offset(field, clean_field);119119+120120+ WARN_ONCE(offset < 0, "KVM: accessing unsupported EVMCS field %lx\n",121121+ field);122122+123123+ return offset;94124}9512596126static __always_inline void evmcs_write64(unsigned long field, u64 value)
+54-28
arch/x86/kvm/vmx/nested.c
···77#include <asm/mmu_context.h>8899#include "cpuid.h"1010+#include "evmcs.h"1011#include "hyperv.h"1112#include "mmu.h"1213#include "nested.h"···48524851 struct loaded_vmcs *loaded_vmcs = vmx->loaded_vmcs;4853485248544853 /*48554855- * We should allocate a shadow vmcs for vmcs01 only when L148564856- * executes VMXON and free it when L1 executes VMXOFF.48574857- * As it is invalid to execute VMXON twice, we shouldn't reach48584858- * here when vmcs01 already have an allocated shadow vmcs.48544854+ * KVM allocates a shadow VMCS only when L1 executes VMXON and frees it48554855+ * when L1 executes VMXOFF or the vCPU is forced out of nested48564856+ * operation. VMXON faults if the CPU is already post-VMXON, so it48574857+ * should be impossible to already have an allocated shadow VMCS. KVM48584858+ * doesn't support virtualization of VMCS shadowing, so vmcs01 should48594859+ * always be the loaded VMCS.48594860 */48604860- WARN_ON(loaded_vmcs == &vmx->vmcs01 && loaded_vmcs->shadow_vmcs);48614861+ if (WARN_ON(loaded_vmcs != &vmx->vmcs01 || loaded_vmcs->shadow_vmcs))48624862+ return loaded_vmcs->shadow_vmcs;4861486348624862- if (!loaded_vmcs->shadow_vmcs) {48634863- loaded_vmcs->shadow_vmcs = alloc_vmcs(true);48644864- if (loaded_vmcs->shadow_vmcs)48654865- vmcs_clear(loaded_vmcs->shadow_vmcs);48664866- }48644864+ loaded_vmcs->shadow_vmcs = alloc_vmcs(true);48654865+ if (loaded_vmcs->shadow_vmcs)48664866+ vmcs_clear(loaded_vmcs->shadow_vmcs);48674867+48674868 return loaded_vmcs->shadow_vmcs;48684869}48694870···51025099 if (!nested_vmx_check_permission(vcpu))51035100 return 1;5104510151055105- /*51065106- * In VMX non-root operation, when the VMCS-link pointer is INVALID_GPA,51075107- * any VMREAD sets the ALU flags for VMfailInvalid.51085108- */51095109- if (vmx->nested.current_vmptr == INVALID_GPA ||51105110- (is_guest_mode(vcpu) &&51115111- get_vmcs12(vcpu)->vmcs_link_pointer == INVALID_GPA))51125112- return nested_vmx_failInvalid(vcpu);51135113-51145102 /* Decode 
instruction info and find the field to read */51155103 field = kvm_register_read(vcpu, (((instr_info) >> 28) & 0xf));5116510451175117- offset = vmcs_field_to_offset(field);51185118- if (offset < 0)51195119- return nested_vmx_fail(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT);51055105+ if (!evmptr_is_valid(vmx->nested.hv_evmcs_vmptr)) {51065106+ /*51075107+ * In VMX non-root operation, when the VMCS-link pointer is INVALID_GPA,51085108+ * any VMREAD sets the ALU flags for VMfailInvalid.51095109+ */51105110+ if (vmx->nested.current_vmptr == INVALID_GPA ||51115111+ (is_guest_mode(vcpu) &&51125112+ get_vmcs12(vcpu)->vmcs_link_pointer == INVALID_GPA))51135113+ return nested_vmx_failInvalid(vcpu);5120511451215121- if (!is_guest_mode(vcpu) && is_vmcs12_ext_field(field))51225122- copy_vmcs02_to_vmcs12_rare(vcpu, vmcs12);51155115+ offset = get_vmcs12_field_offset(field);51165116+ if (offset < 0)51175117+ return nested_vmx_fail(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT);5123511851245124- /* Read the field, zero-extended to a u64 value */51255125- value = vmcs12_read_any(vmcs12, field, offset);51195119+ if (!is_guest_mode(vcpu) && is_vmcs12_ext_field(field))51205120+ copy_vmcs02_to_vmcs12_rare(vcpu, vmcs12);51215121+51225122+ /* Read the field, zero-extended to a u64 value */51235123+ value = vmcs12_read_any(vmcs12, field, offset);51245124+ } else {51255125+ /*51265126+ * Hyper-V TLFS (as of 6.0b) explicitly states, that while an51275127+ * enlightened VMCS is active VMREAD/VMWRITE instructions are51285128+ * unsupported. Unfortunately, certain versions of Windows 1151295129+ * don't comply with this requirement which is not enforced in51305130+ * genuine Hyper-V. 
Allow VMREAD from an enlightened VMCS as a51315131+ * workaround, as misbehaving guests will panic on VM-Fail.51325132+ * Note, enlightened VMCS is incompatible with shadow VMCS so51335133+ * all VMREADs from L2 should go to L1.51345134+ */51355135+ if (WARN_ON_ONCE(is_guest_mode(vcpu)))51365136+ return nested_vmx_failInvalid(vcpu);51375137+51385138+ offset = evmcs_field_offset(field, NULL);51395139+ if (offset < 0)51405140+ return nested_vmx_fail(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT);51415141+51425142+ /* Read the field, zero-extended to a u64 value */51435143+ value = evmcs_read_any(vmx->nested.hv_evmcs, field, offset);51445144+ }5126514551275146 /*51285147 * Now copy part of this value to register or memory, as requested.···5239521452405215 field = kvm_register_read(vcpu, (((instr_info) >> 28) & 0xf));5241521652425242- offset = vmcs_field_to_offset(field);52175217+ offset = get_vmcs12_field_offset(field);52435218 if (offset < 0)52445219 return nested_vmx_fail(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT);52455220···64876462 max_idx = 0;64886463 for (i = 0; i < nr_vmcs12_fields; i++) {64896464 /* The vmcs12 table is very, very sparsely populated. */64906490- if (!vmcs_field_to_offset_table[i])64656465+ if (!vmcs12_field_offsets[i])64916466 continue;6492646764936468 idx = vmcs_field_index(VMCS12_IDX_TO_ENC(i));···67966771}6797677267986773struct kvm_x86_nested_ops vmx_nested_ops = {67746774+ .leave_nested = vmx_leave_nested,67996775 .check_events = vmx_check_nested_events,68006776 .hv_timer_pending = nested_vmx_preemption_timer_pending,68016777 .triple_fault = nested_vmx_triple_fault,
+2-2
arch/x86/kvm/vmx/vmcs12.c
···88 FIELD(number, name), \99 [ROL16(number##_HIGH, 6)] = VMCS12_OFFSET(name) + sizeof(u32)10101111-const unsigned short vmcs_field_to_offset_table[] = {1111+const unsigned short vmcs12_field_offsets[] = {1212 FIELD(VIRTUAL_PROCESSOR_ID, virtual_processor_id),1313 FIELD(POSTED_INTR_NV, posted_intr_nv),1414 FIELD(GUEST_ES_SELECTOR, guest_es_selector),···151151 FIELD(HOST_RSP, host_rsp),152152 FIELD(HOST_RIP, host_rip),153153};154154-const unsigned int nr_vmcs12_fields = ARRAY_SIZE(vmcs_field_to_offset_table);154154+const unsigned int nr_vmcs12_fields = ARRAY_SIZE(vmcs12_field_offsets);
+3-3
arch/x86/kvm/vmx/vmcs12.h
···361361 CHECK_OFFSET(guest_pml_index, 996);362362}363363364364-extern const unsigned short vmcs_field_to_offset_table[];364364+extern const unsigned short vmcs12_field_offsets[];365365extern const unsigned int nr_vmcs12_fields;366366367367-static inline short vmcs_field_to_offset(unsigned long field)367367+static inline short get_vmcs12_field_offset(unsigned long field)368368{369369 unsigned short offset;370370 unsigned int index;···377377 return -ENOENT;378378379379 index = array_index_nospec(index, nr_vmcs12_fields);380380- offset = vmcs_field_to_offset_table[index];380380+ offset = vmcs12_field_offsets[index];381381 if (offset == 0)382382 return -ENOENT;383383 return offset;
+38-9
arch/x86/kvm/vmx/vmx.c
···14871487 return 0;14881488}1489148914901490-static bool vmx_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn, int insn_len)14901490+static bool vmx_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,14911491+ void *insn, int insn_len)14911492{14921493 /*14931494 * Emulation of instructions in SGX enclaves is impossible as RIP does14941494- * not point tthe failing instruction, and even if it did, the code14951495+ * not point at the failing instruction, and even if it did, the code14951496 * stream is inaccessible. Inject #UD instead of exiting to userspace14961497 * so that guest userspace can't DoS the guest simply by triggering14971498 * emulation (enclaves are CPL3 only).···26042603 return -EIO;2605260426062605 vmcs_conf->size = vmx_msr_high & 0x1fff;26072607- vmcs_conf->order = get_order(vmcs_conf->size);26082606 vmcs_conf->basic_cap = vmx_msr_high & ~0x1fff;2609260726102608 vmcs_conf->revision_id = vmx_msr_low;···26282628 struct page *pages;26292629 struct vmcs *vmcs;2630263026312631- pages = __alloc_pages_node(node, flags, vmcs_config.order);26312631+ pages = __alloc_pages_node(node, flags, 0);26322632 if (!pages)26332633 return NULL;26342634 vmcs = page_address(pages);···2647264726482648void free_vmcs(struct vmcs *vmcs)26492649{26502650- free_pages((unsigned long)vmcs, vmcs_config.order);26502650+ free_page((unsigned long)vmcs);26512651}2652265226532653/*···40944094 vmcs_write32(HOST_IA32_SYSENTER_CS, low32);4095409540964096 /*40974097- * If 32-bit syscall is enabled, vmx_vcpu_load_vcms rewrites40984098- * HOST_IA32_SYSENTER_ESP.40974097+ * SYSENTER is used for 32-bit system calls on either 32-bit or40984098+ * 64-bit kernels. 
It is always zero if neither is allowed, otherwise40994099+	 * vmx_vcpu_load_vmcs loads it with the per-CPU entry stack (and may41004100+	 * have already done so!).40994101	 */41004100-	vmcs_writel(HOST_IA32_SYSENTER_ESP, 0);41024102+	if (!IS_ENABLED(CONFIG_IA32_EMULATION) && !IS_ENABLED(CONFIG_X86_32))41034103+		vmcs_writel(HOST_IA32_SYSENTER_ESP, 0);41044104+41014105	rdmsrl(MSR_IA32_SYSENTER_EIP, tmpl);41024106	vmcs_writel(HOST_IA32_SYSENTER_EIP, tmpl); /* 22.2.3 */41034107···49054901		dr6 = vmx_get_exit_qual(vcpu);49064902		if (!(vcpu->guest_debug &49074903		      (KVM_GUESTDBG_SINGLESTEP | KVM_GUESTDBG_USE_HW_BP))) {49044904+			/*49054905+			 * If the #DB was due to ICEBP, a.k.a. INT1, skip the49064906+			 * instruction. ICEBP generates a trap-like #DB, but49074907+			 * despite its interception control being tied to #DB,49084908+			 * is an instruction intercept, i.e. the VM-Exit occurs49094909+			 * on the ICEBP itself. Note, skipping ICEBP also49104910+			 * clears STI and MOVSS blocking.49114911+			 *49124912+			 * For all other #DBs, set vmcs.PENDING_DBG_EXCEPTIONS.BS49134913+			 * if single-step is enabled in RFLAGS and STI or MOVSS49144914+			 * blocking is active, as the CPU doesn't set the bit49154915+			 * on VM-Exit due to #DB interception. VM-Entry has a49164916+			 * consistency check that a single-step #DB is pending49174917+			 * in this scenario as the previous instruction cannot49184918+			 * have toggled RFLAGS.TF 0=>1 (because STI and POP/MOV49194919+			 * don't modify RFLAGS), therefore the one instruction49204920+			 * delay when activating single-step breakpoints must49214921+			 * have already expired. 
Note, the CPU sets/clears BS49224922+ * as appropriate for all other VM-Exits types.49234923+ */49084924 if (is_icebp(intr_info))49094925 WARN_ON(!skip_emulated_instruction(vcpu));49264926+ else if ((vmx_get_rflags(vcpu) & X86_EFLAGS_TF) &&49274927+ (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &49284928+ (GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS)))49294929+ vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS,49304930+ vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS) | DR6_BS);4910493149114932 kvm_queue_exception_p(vcpu, DB_VECTOR, dr6);49124933 return 1;···54265397{54275398 gpa_t gpa;5428539954295429- if (!vmx_can_emulate_instruction(vcpu, NULL, 0))54005400+ if (!vmx_can_emulate_instruction(vcpu, EMULTYPE_PF, NULL, 0))54305401 return 1;5431540254325403 /*
+81-13
arch/x86/kvm/x86.c
···35353535 if (data & ~supported_xss)35363536 return 1;35373537 vcpu->arch.ia32_xss = data;35383538+ kvm_update_cpuid_runtime(vcpu);35383539 break;35393540 case MSR_SMI_COUNT:35403541 if (!msr_info->host_initiated)···42304229 case KVM_CAP_SREGS2:42314230 case KVM_CAP_EXIT_ON_EMULATION_FAILURE:42324231 case KVM_CAP_VCPU_ATTRIBUTES:42324232+ case KVM_CAP_SYS_ATTRIBUTES:42334233 r = 1;42344234 break;42354235 case KVM_CAP_EXIT_HYPERCALL:···43334331 break;43344332 }43354333 return r;43344334+}4336433543364336+static inline void __user *kvm_get_attr_addr(struct kvm_device_attr *attr)43374337+{43384338+ void __user *uaddr = (void __user*)(unsigned long)attr->addr;43394339+43404340+ if ((u64)(unsigned long)uaddr != attr->addr)43414341+ return ERR_PTR(-EFAULT);43424342+ return uaddr;43434343+}43444344+43454345+static int kvm_x86_dev_get_attr(struct kvm_device_attr *attr)43464346+{43474347+ u64 __user *uaddr = kvm_get_attr_addr(attr);43484348+43494349+ if (attr->group)43504350+ return -ENXIO;43514351+43524352+ if (IS_ERR(uaddr))43534353+ return PTR_ERR(uaddr);43544354+43554355+ switch (attr->attr) {43564356+ case KVM_X86_XCOMP_GUEST_SUPP:43574357+ if (put_user(supported_xcr0, uaddr))43584358+ return -EFAULT;43594359+ return 0;43604360+ default:43614361+ return -ENXIO;43624362+ break;43634363+ }43644364+}43654365+43664366+static int kvm_x86_dev_has_attr(struct kvm_device_attr *attr)43674367+{43684368+ if (attr->group)43694369+ return -ENXIO;43704370+43714371+ switch (attr->attr) {43724372+ case KVM_X86_XCOMP_GUEST_SUPP:43734373+ return 0;43744374+ default:43754375+ return -ENXIO;43764376+ }43374377}4338437843394379long kvm_arch_dev_ioctl(struct file *filp,···44664422 case KVM_GET_SUPPORTED_HV_CPUID:44674423 r = kvm_ioctl_get_supported_hv_cpuid(NULL, argp);44684424 break;44254425+ case KVM_GET_DEVICE_ATTR: {44264426+ struct kvm_device_attr attr;44274427+ r = -EFAULT;44284428+ if (copy_from_user(&attr, (void __user *)arg, sizeof(attr)))44294429+ break;44304430+ r = 
kvm_x86_dev_get_attr(&attr);44314431+ break;44324432+ }44334433+ case KVM_HAS_DEVICE_ATTR: {44344434+ struct kvm_device_attr attr;44354435+ r = -EFAULT;44364436+ if (copy_from_user(&attr, (void __user *)arg, sizeof(attr)))44374437+ break;44384438+ r = kvm_x86_dev_has_attr(&attr);44394439+ break;44404440+ }44694441 default:44704442 r = -EINVAL;44714443 break;···49204860 vcpu->arch.apic->sipi_vector = events->sipi_vector;4921486149224862 if (events->flags & KVM_VCPUEVENT_VALID_SMM) {49234923- if (!!(vcpu->arch.hflags & HF_SMM_MASK) != events->smi.smm)48634863+ if (!!(vcpu->arch.hflags & HF_SMM_MASK) != events->smi.smm) {48644864+ kvm_x86_ops.nested_ops->leave_nested(vcpu);49244865 kvm_smm_changed(vcpu, events->smi.smm);48664866+ }4925486749264868 vcpu->arch.smi_pending = events->smi.pending;49274869···50845022static int kvm_arch_tsc_get_attr(struct kvm_vcpu *vcpu,50855023 struct kvm_device_attr *attr)50865024{50875087- u64 __user *uaddr = (u64 __user *)(unsigned long)attr->addr;50255025+ u64 __user *uaddr = kvm_get_attr_addr(attr);50885026 int r;5089502750905090- if ((u64)(unsigned long)uaddr != attr->addr)50915091- return -EFAULT;50285028+ if (IS_ERR(uaddr))50295029+ return PTR_ERR(uaddr);5092503050935031 switch (attr->attr) {50945032 case KVM_VCPU_TSC_OFFSET:···51075045static int kvm_arch_tsc_set_attr(struct kvm_vcpu *vcpu,51085046 struct kvm_device_attr *attr)51095047{51105110- u64 __user *uaddr = (u64 __user *)(unsigned long)attr->addr;50485048+ u64 __user *uaddr = kvm_get_attr_addr(attr);51115049 struct kvm *kvm = vcpu->kvm;51125050 int r;5113505151145114- if ((u64)(unsigned long)uaddr != attr->addr)51155115- return -EFAULT;50525052+ if (IS_ERR(uaddr))50535053+ return PTR_ERR(uaddr);5116505451175055 switch (attr->attr) {51185056 case KVM_VCPU_TSC_OFFSET: {···68726810}68736811EXPORT_SYMBOL_GPL(kvm_write_guest_virt_system);6874681268136813+static int kvm_can_emulate_insn(struct kvm_vcpu *vcpu, int emul_type,68146814+ void *insn, int insn_len)68156815+{68166816+ 
return static_call(kvm_x86_can_emulate_instruction)(vcpu, emul_type,68176817+ insn, insn_len);68186818+}68196819+68756820int handle_ud(struct kvm_vcpu *vcpu)68766821{68776822 static const char kvm_emulate_prefix[] = { __KVM_EMULATE_PREFIX };···68866817 char sig[5]; /* ud2; .ascii "kvm" */68876818 struct x86_exception e;6888681968896889- if (unlikely(!static_call(kvm_x86_can_emulate_instruction)(vcpu, NULL, 0)))68206820+ if (unlikely(!kvm_can_emulate_insn(vcpu, emul_type, NULL, 0)))68906821 return 1;6891682268926823 if (force_emulation_prefix &&···82628193 bool writeback = true;82638194 bool write_fault_to_spt;8264819582658265- if (unlikely(!static_call(kvm_x86_can_emulate_instruction)(vcpu, insn, insn_len)))81968196+ if (unlikely(!kvm_can_emulate_insn(vcpu, emulation_type, insn, insn_len)))82668197 return 1;8267819882688199 vcpu->arch.l1tf_flush_l1d = true;···97759706 kvm_make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD);97769707}9777970897789778-void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)97099709+static void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)97799710{97809711 if (!lapic_in_kernel(vcpu))97819712 return;···11278112091127911210 vcpu->arch.msr_misc_features_enables = 0;11280112111128111281- vcpu->arch.xcr0 = XFEATURE_MASK_FP;1121211212+ __kvm_set_xcr(vcpu, 0, XFEATURE_MASK_FP);1121311213+ __kvm_set_msr(vcpu, MSR_IA32_XSS, 0, true);1128211214 }11283112151128411216 /* All GPRs except RDX (handled below) are zeroed on RESET/INIT. */···1129511225 */1129611226 cpuid_0x1 = kvm_find_cpuid_entry(vcpu, 1, 0);1129711227 kvm_rdx_write(vcpu, cpuid_0x1 ? cpuid_0x1->eax : 0x600);1129811298-1129911299- vcpu->arch.ia32_xss = 0;11300112281130111229 static_call(kvm_x86_vcpu_reset)(vcpu, init_event);1130211230
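The `kvm_get_attr_addr()` helper factored out above guards against userspace passing a 64-bit address that does not fit the kernel's native pointer width. A userspace model of that round-trip check, simulating a 32-bit kernel by making the "pointer" type 32 bits wide (the names and the `EFAULT_SENTINEL` stand-in for `ERR_PTR(-EFAULT)` are invented for this sketch):

```c
#include <stdint.h>

/* Model of a 32-bit kernel: "pointers" are 32 bits wide. */
typedef uint32_t ptr32;

/* Simplified stand-in for ERR_PTR(-EFAULT) in this sketch. */
#define EFAULT_SENTINEL UINT32_MAX

static ptr32 get_attr_addr32(uint64_t addr)
{
	ptr32 uaddr = (ptr32)addr;

	/*
	 * If the u64 supplied by userspace does not survive the round
	 * trip through the native pointer width, it cannot be a valid
	 * address for this kernel: fail instead of silently truncating.
	 */
	if ((uint64_t)uaddr != addr)
		return EFAULT_SENTINEL;
	return uaddr;
}
```

On a 64-bit kernel the check never fires, which is why factoring it into one helper (used by both the TSC attrs and the new `KVM_GET_DEVICE_ATTR` path) is purely a deduplication there.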
···10611061}1062106210631063static unsigned long __part_start_io_acct(struct block_device *part,10641064- unsigned int sectors, unsigned int op)10641064+ unsigned int sectors, unsigned int op,10651065+ unsigned long start_time)10651066{10661067 const int sgrp = op_stat_group(op);10671067- unsigned long now = READ_ONCE(jiffies);1068106810691069 part_stat_lock();10701070- update_io_ticks(part, now, false);10701070+ update_io_ticks(part, start_time, false);10711071 part_stat_inc(part, ios[sgrp]);10721072 part_stat_add(part, sectors[sgrp], sectors);10731073 part_stat_local_inc(part, in_flight[op_is_write(op)]);10741074 part_stat_unlock();1075107510761076- return now;10761076+ return start_time;10771077}10781078+10791079+/**10801080+ * bio_start_io_acct_time - start I/O accounting for bio based drivers10811081+ * @bio: bio to start account for10821082+ * @start_time: start time that should be passed back to bio_end_io_acct().10831083+ */10841084+void bio_start_io_acct_time(struct bio *bio, unsigned long start_time)10851085+{10861086+ __part_start_io_acct(bio->bi_bdev, bio_sectors(bio),10871087+ bio_op(bio), start_time);10881088+}10891089+EXPORT_SYMBOL_GPL(bio_start_io_acct_time);1078109010791091/**10801092 * bio_start_io_acct - start I/O accounting for bio based drivers···10961084 */10971085unsigned long bio_start_io_acct(struct bio *bio)10981086{10991099- return __part_start_io_acct(bio->bi_bdev, bio_sectors(bio), bio_op(bio));10871087+ return __part_start_io_acct(bio->bi_bdev, bio_sectors(bio),10881088+ bio_op(bio), jiffies);11001089}11011090EXPORT_SYMBOL_GPL(bio_start_io_acct);1102109111031092unsigned long disk_start_io_acct(struct gendisk *disk, unsigned int sectors,11041093 unsigned int op)11051094{11061106- return __part_start_io_acct(disk->part0, sectors, op);10951095+ return __part_start_io_acct(disk->part0, sectors, op, jiffies);11071096}11081097EXPORT_SYMBOL(disk_start_io_acct);11091098
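The block-layer change above threads an explicit `start_time` through `__part_start_io_acct()` so a stacking driver can begin accounting from the moment a bio arrived rather than from whenever accounting happens to start. A toy userspace model of that shape (struct and function names are invented; `jiffies` is just a fake counter here):

```c
/* Fake monotonic counter standing in for the kernel's jiffies. */
static unsigned long jiffies;

struct part_stats {
	unsigned long ios;
	unsigned long sectors;
	unsigned long io_ticks;
};

/* The caller now supplies the start timestamp... */
static unsigned long part_start_io_acct(struct part_stats *st,
					unsigned int sectors,
					unsigned long start_time)
{
	st->io_ticks = start_time;	/* update_io_ticks(part, start_time, ...) */
	st->ios++;
	st->sectors += sectors;
	return start_time;		/* handed back for end-side latency math */
}

/* ...while the classic entry point keeps its old behavior by passing
 * the current time, matching bio_start_io_acct() in the patch. */
static unsigned long bio_start_io_acct(struct part_stats *st, unsigned int s)
{
	return part_start_io_acct(st, s, jiffies);
}
```

The new `bio_start_io_acct_time()` export is the variant that lets a driver such as dm record the arrival time before cloning or splitting the bio.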
···358358 * other namespaces.359359 */360360 if ((current_user_ns() != &init_user_ns) ||361361- (task_active_pid_ns(current) != &init_pid_ns))361361+ !task_is_in_init_pid_ns(current))362362 return;363363364364 /* Can only change if privileged. */
+6-9
drivers/counter/counter-core.c
···9090 int err;91919292 ch = kzalloc(sizeof(*ch) + sizeof_priv, GFP_KERNEL);9393- if (!ch) {9494- err = -ENOMEM;9595- goto err_alloc_ch;9696- }9393+ if (!ch)9494+ return NULL;97959896 counter = &ch->counter;9997 dev = &counter->dev;···121123err_ida_alloc:122124123125 kfree(ch);124124-err_alloc_ch:125126126126- return ERR_PTR(err);127127+ return NULL;127128}128129EXPORT_SYMBOL_GPL(counter_alloc);129130···205208 int err;206209207210 counter = counter_alloc(sizeof_priv);208208- if (IS_ERR(counter))209209- return counter;211211+ if (!counter)212212+ return NULL;210213211214 err = devm_add_action_or_reset(dev, devm_counter_put, counter);212215 if (err < 0)213213- return ERR_PTR(err);216216+ return NULL;214217215218 return counter;216219}
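The counter-core change settles `counter_alloc()` on a single failure convention (return `NULL`) instead of mixing `NULL` and `ERR_PTR()`-encoded errnos. A minimal userspace model of the kernel's `ERR_PTR` helpers shows why mixing them is error-prone: `IS_ERR()` does not catch `NULL`, so callers of a mixed-convention function would need two different checks:

```c
#include <stddef.h>

/*
 * Userspace model of the kernel's ERR_PTR convention: errno values are
 * encoded in the topmost page of the address space, so any pointer in
 * that range is "an error". NULL is deliberately NOT an ERR_PTR.
 */
static inline void *ERR_PTR(long err)
{
	return (void *)err;
}

static inline long PTR_ERR(const void *p)
{
	return (long)p;
}

static inline int IS_ERR(const void *p)
{
	return (unsigned long)p >= (unsigned long)-4095;
}
```

Because `IS_ERR(NULL)` is false, a caller that only checked `IS_ERR()` would dereference a `NULL` allocation failure; returning `NULL` uniformly (as the patch does) leaves callers with one simple check.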
···32173217}32183218#endif3219321932203220+void reset_syncd_pipes_from_disabled_pipes(struct dc *dc,32213221+ struct dc_state *context)32223222+{32233223+ int i, j;32243224+ struct pipe_ctx *pipe_ctx_old, *pipe_ctx, *pipe_ctx_syncd;32253225+32263226+ /* If pipe backend is reset, need to reset pipe syncd status */32273227+ for (i = 0; i < dc->res_pool->pipe_count; i++) {32283228+ pipe_ctx_old = &dc->current_state->res_ctx.pipe_ctx[i];32293229+ pipe_ctx = &context->res_ctx.pipe_ctx[i];32303230+32313231+ if (!pipe_ctx_old->stream)32323232+ continue;32333233+32343234+ if (pipe_ctx_old->top_pipe || pipe_ctx_old->prev_odm_pipe)32353235+ continue;32363236+32373237+ if (!pipe_ctx->stream ||32383238+ pipe_need_reprogram(pipe_ctx_old, pipe_ctx)) {32393239+32403240+ /* Reset all the syncd pipes from the disabled pipe */32413241+ for (j = 0; j < dc->res_pool->pipe_count; j++) {32423242+ pipe_ctx_syncd = &context->res_ctx.pipe_ctx[j];32433243+ if ((GET_PIPE_SYNCD_FROM_PIPE(pipe_ctx_syncd) == pipe_ctx_old->pipe_idx) ||32443244+ !IS_PIPE_SYNCD_VALID(pipe_ctx_syncd))32453245+ SET_PIPE_SYNCD_TO_PIPE(pipe_ctx_syncd, j);32463246+ }32473247+ }32483248+ }32493249+}32503250+32513251+void check_syncd_pipes_for_disabled_master_pipe(struct dc *dc,32523252+ struct dc_state *context,32533253+ uint8_t disabled_master_pipe_idx)32543254+{32553255+ int i;32563256+ struct pipe_ctx *pipe_ctx, *pipe_ctx_check;32573257+32583258+ pipe_ctx = &context->res_ctx.pipe_ctx[disabled_master_pipe_idx];32593259+ if ((GET_PIPE_SYNCD_FROM_PIPE(pipe_ctx) != disabled_master_pipe_idx) ||32603260+ !IS_PIPE_SYNCD_VALID(pipe_ctx))32613261+ SET_PIPE_SYNCD_TO_PIPE(pipe_ctx, disabled_master_pipe_idx);32623262+32633263+ /* for the pipe disabled, check if any slave pipe exists and assert */32643264+ for (i = 0; i < dc->res_pool->pipe_count; i++) {32653265+ pipe_ctx_check = &context->res_ctx.pipe_ctx[i];32663266+32673267+ if ((GET_PIPE_SYNCD_FROM_PIPE(pipe_ctx_check) == disabled_master_pipe_idx) &&32683268+ 
IS_PIPE_SYNCD_VALID(pipe_ctx_check) && (i != disabled_master_pipe_idx))32693269+ DC_ERR("DC: Failure: pipe_idx[%d] syncd with disabled master pipe_idx[%d]\n",32703270+ i, disabled_master_pipe_idx);32713271+ }32723272+}32733273+32203274uint8_t resource_transmitter_to_phy_idx(const struct dc *dc, enum transmitter transmitter)32213275{32223276 /* TODO - get transmitter to phy idx mapping from DMUB */
···15661566 &pipe_ctx->stream->audio_info);15671567 }1568156815691569+ /* make sure no pipes syncd to the pipe being enabled */15701570+ if (!pipe_ctx->stream->apply_seamless_boot_optimization && dc->config.use_pipe_ctx_sync_logic)15711571+ check_syncd_pipes_for_disabled_master_pipe(dc, context, pipe_ctx->pipe_idx);15721572+15691573#if defined(CONFIG_DRM_AMD_DC_DCN)15701574 /* DCN3.1 FPGA Workaround15711575 * Need to enable HPO DP Stream Encoder before setting OTG master enable.···16081604 pipe_ctx->stream_res.stream_enc,16091605 pipe_ctx->stream_res.tg->inst);1610160616111611- if (dc_is_dp_signal(pipe_ctx->stream->signal) &&16071607+ if (dc_is_embedded_signal(pipe_ctx->stream->signal) &&16121608 pipe_ctx->stream_res.stream_enc->funcs->reset_fifo)16131609 pipe_ctx->stream_res.stream_enc->funcs->reset_fifo(16141610 pipe_ctx->stream_res.stream_enc);···23002296 struct dc_bios *dcb = dc->ctx->dc_bios;23012297 enum dc_status status;23022298 int i;22992299+23002300+ /* reset syncd pipes from disabled pipes */23012301+ if (dc->config.use_pipe_ctx_sync_logic)23022302+ reset_syncd_pipes_from_disabled_pipes(dc, context);2303230323042304 /* Reset old context */23052305 /* look up the targets that have been removed since last commit */
···3333 unsigned long long output;3434 acpi_status status;35353636+ if (acpi_disabled)3737+ return false;3838+3639 /* Get embedded-controller handle */3740 status = acpi_get_devices("PNP0C09", acpi_set_handle, NULL, &ec_handle);3841 if (ACPI_FAILURE(status) || !ec_handle)
···608608 return gpu->funcs->pm_resume(gpu);609609}610610611611+static int active_submits(struct msm_gpu *gpu)612612+{613613+ int active_submits;614614+ mutex_lock(&gpu->active_lock);615615+ active_submits = gpu->active_submits;616616+ mutex_unlock(&gpu->active_lock);617617+ return active_submits;618618+}619619+611620static int adreno_suspend(struct device *dev)612621{613622 struct msm_gpu *gpu = dev_to_gpu(dev);623623+ int remaining;624624+625625+ remaining = wait_event_timeout(gpu->retire_event,626626+ active_submits(gpu) == 0,627627+ msecs_to_jiffies(1000));628628+ if (remaining == 0) {629629+ dev_err(dev, "Timeout waiting for GPU to suspend\n");630630+ return -EBUSY;631631+ }614632615633 return gpu->funcs->pm_suspend(gpu);616634}
+9-2
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dspp.c
···2626 struct dpu_hw_pcc_cfg *cfg)2727{28282929- u32 base = ctx->cap->sblk->pcc.base;2929+ u32 base;30303131- if (!ctx || !base) {3131+ if (!ctx) {3232+ DRM_ERROR("invalid ctx %pK\n", ctx);3333+ return;3434+ }3535+3636+ base = ctx->cap->sblk->pcc.base;3737+3838+ if (!base) {3239 DRM_ERROR("invalid ctx %pK pcc base 0x%x\n", ctx, base);3340 return;3441 }
+6-1
drivers/gpu/drm/msm/dsi/dsi.c
···40404141 of_node_put(phy_node);42424343- if (!phy_pdev || !msm_dsi->phy) {4343+ if (!phy_pdev) {4444+ DRM_DEV_ERROR(&pdev->dev, "%s: phy driver is not ready\n", __func__);4545+ return -EPROBE_DEFER;4646+ }4747+ if (!msm_dsi->phy) {4848+ put_device(&phy_pdev->dev);4449 DRM_DEV_ERROR(&pdev->dev, "%s: phy driver is not ready\n", __func__);4550 return -EPROBE_DEFER;4651 }
+3-1
drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
···808808 struct msm_dsi_phy_clk_request *clk_req,809809 struct msm_dsi_phy_shared_timings *shared_timings)810810{811811- struct device *dev = &phy->pdev->dev;811811+ struct device *dev;812812 int ret;813813814814 if (!phy || !phy->cfg->ops.enable)815815 return -EINVAL;816816+817817+ dev = &phy->pdev->dev;816818817819 ret = dsi_phy_enable_resource(phy);818820 if (ret) {
+6-1
drivers/gpu/drm/msm/hdmi/hdmi.c
···97979898 of_node_put(phy_node);9999100100- if (!phy_pdev || !hdmi->phy) {100100+ if (!phy_pdev) {101101 DRM_DEV_ERROR(&pdev->dev, "phy driver is not ready\n");102102+ return -EPROBE_DEFER;103103+ }104104+ if (!hdmi->phy) {105105+ DRM_DEV_ERROR(&pdev->dev, "phy driver is not ready\n");106106+ put_device(&phy_pdev->dev);102107 return -EPROBE_DEFER;103108 }104109
+1-4
drivers/gpu/drm/msm/msm_drv.c
···461461 of_node_put(node);462462 if (ret)463463 return ret;464464- size = r.end - r.start;464464+ size = r.end - r.start + 1;465465 DRM_INFO("using VRAM carveout: %lx@%pa\n", size, &r.start);466466467467 /* if we have no IOMMU, then we need to use carveout allocator.···510510 struct msm_drm_private *priv = dev_get_drvdata(dev);511511 struct drm_device *ddev;512512 struct msm_kms *kms;513513- struct msm_mdss *mdss;514513 int ret, i;515514516515 ddev = drm_dev_alloc(drv, dev);···519520 }520521 ddev->dev_private = priv;521522 priv->dev = ddev;522522-523523- mdss = priv->mdss;524523525524 priv->wq = alloc_ordered_workqueue("msm", 0);526525 priv->hangcheck_period = DRM_MSM_HANGCHECK_DEFAULT_PERIOD;
···25012501 if (file_priv)25022502 vmw_execbuf_copy_fence_user(dev_priv, vmw_fpriv(file_priv),25032503 ret, user_fence_rep, fence,25042504- handle, -1, NULL);25042504+ handle, -1);25052505 if (out_fence)25062506 *out_fence = fence;25072507 else
+7
drivers/hv/hv_balloon.c
···16601660 unsigned long t;16611661 int ret;1662166216631663+ /*16641664+ * max_pkt_size should be large enough for one vmbus packet header plus16651665+ * our receive buffer size. Hyper-V sends messages up to16661666+ * HV_HYP_PAGE_SIZE bytes long on balloon channel.16671667+ */16681668+ dev->channel->max_pkt_size = HV_HYP_PAGE_SIZE * 2;16691669+16631670 ret = vmbus_open(dev->channel, dm_ring_size, dm_ring_size, NULL, 0,16641671 balloon_onchannelcallback, dev);16651672 if (ret)
+3
drivers/hwmon/adt7470.c
···662662 struct adt7470_data *data = dev_get_drvdata(dev);663663 int err;664664665665+ if (val <= 0)666666+ return -EINVAL;667667+665668 val = FAN_RPM_TO_PERIOD(val);666669 val = clamp_val(val, 1, 65534);667670
···178178 struct irq_domain *hw_domain;179179 struct irq_domain *ipi_domain;180180 int nr_hw;181181- int ipi_hwirq;182181};183182184183static DEFINE_PER_CPU(uint32_t, aic_fiq_unmasked);
+101-22
drivers/irqchip/irq-gic-v3-its.c
···
 	.resume = its_restore_enable,
 };

+static void __init __iomem *its_map_one(struct resource *res, int *err)
+{
+	void __iomem *its_base;
+	u32 val;
+
+	its_base = ioremap(res->start, SZ_64K);
+	if (!its_base) {
+		pr_warn("ITS@%pa: Unable to map ITS registers\n", &res->start);
+		*err = -ENOMEM;
+		return NULL;
+	}
+
+	val = readl_relaxed(its_base + GITS_PIDR2) & GIC_PIDR2_ARCH_MASK;
+	if (val != 0x30 && val != 0x40) {
+		pr_warn("ITS@%pa: No ITS detected, giving up\n", &res->start);
+		*err = -ENODEV;
+		goto out_unmap;
+	}
+
+	*err = its_force_quiescent(its_base);
+	if (*err) {
+		pr_warn("ITS@%pa: Failed to quiesce, giving up\n", &res->start);
+		goto out_unmap;
+	}
+
+	return its_base;
+
+out_unmap:
+	iounmap(its_base);
+	return NULL;
+}
+
 static int its_init_domain(struct fwnode_handle *handle, struct its_node *its)
 {
 	struct irq_domain *inner_domain;
···
 {
 	struct its_node *its;
 	void __iomem *its_base;
-	u32 val, ctlr;
 	u64 baser, tmp, typer;
 	struct page *page;
+	u32 ctlr;
 	int err;

-	its_base = ioremap(res->start, SZ_64K);
-	if (!its_base) {
-		pr_warn("ITS@%pa: Unable to map ITS registers\n", &res->start);
-		return -ENOMEM;
-	}
-
-	val = readl_relaxed(its_base + GITS_PIDR2) & GIC_PIDR2_ARCH_MASK;
-	if (val != 0x30 && val != 0x40) {
-		pr_warn("ITS@%pa: No ITS detected, giving up\n", &res->start);
-		err = -ENODEV;
-		goto out_unmap;
-	}
-
-	err = its_force_quiescent(its_base);
-	if (err) {
-		pr_warn("ITS@%pa: Failed to quiesce, giving up\n", &res->start);
-		goto out_unmap;
-	}
+	its_base = its_map_one(res, &err);
+	if (!its_base)
+		return err;

 	pr_info("ITS %pR\n", res);
···
 out:
 	/* Last CPU being brought up gets to issue the cleanup */
-	if (cpumask_equal(&cpus_booted_once_mask, cpu_possible_mask))
+	if (!IS_ENABLED(CONFIG_SMP) ||
+	    cpumask_equal(&cpus_booted_once_mask, cpu_possible_mask))
 		schedule_work(&rdist_memreserve_cpuhp_cleanup_work);

 	gic_data_rdist()->flags |= RD_LOCAL_MEMRESERVE_DONE;
 	return ret;
+}
+
+/* Mark all the BASER registers as invalid before they get reprogrammed */
+static int __init its_reset_one(struct resource *res)
+{
+	void __iomem *its_base;
+	int err, i;
+
+	its_base = its_map_one(res, &err);
+	if (!its_base)
+		return err;
+
+	for (i = 0; i < GITS_BASER_NR_REGS; i++)
+		gits_write_baser(0, its_base + GITS_BASER + (i << 3));
+
+	iounmap(its_base);
+	return 0;
 }

 static const struct of_device_id its_device_id[] = {
···
 {
 	struct device_node *np;
 	struct resource res;
+
+	/*
+	 * Make sure *all* the ITS are reset before we probe any, as
+	 * they may be sharing memory. If any of the ITS fails to
+	 * reset, don't even try to go any further, as this could
+	 * result in something even worse.
+	 */
+	for (np = of_find_matching_node(node, its_device_id); np;
+	     np = of_find_matching_node(np, its_device_id)) {
+		int err;
+
+		if (!of_device_is_available(np) ||
+		    !of_property_read_bool(np, "msi-controller") ||
+		    of_address_to_resource(np, 0, &res))
+			continue;
+
+		err = its_reset_one(&res);
+		if (err)
+			return err;
+	}

 	for (np = of_find_matching_node(node, its_device_id); np;
 	     np = of_find_matching_node(np, its_device_id)) {
···
 	return err;
 }

+static int __init its_acpi_reset(union acpi_subtable_headers *header,
+				 const unsigned long end)
+{
+	struct acpi_madt_generic_translator *its_entry;
+	struct resource res;
+
+	its_entry = (struct acpi_madt_generic_translator *)header;
+	res = (struct resource) {
+		.start	= its_entry->base_address,
+		.end	= its_entry->base_address + ACPI_GICV3_ITS_MEM_SIZE - 1,
+		.flags	= IORESOURCE_MEM,
+	};
+
+	return its_reset_one(&res);
+}
+
 static void __init its_acpi_probe(void)
 {
 	acpi_table_parse_srat_its();
-	acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
-			      gic_acpi_parse_madt_its, 0);
+	/*
+	 * Make sure *all* the ITS are reset before we probe any, as
+	 * they may be sharing memory. If any of the ITS fails to
+	 * reset, don't even try to go any further, as this could
+	 * result in something even worse.
+	 */
+	if (acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
+				  its_acpi_reset, 0) > 0)
+		acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
+				      gic_acpi_parse_madt_its, 0);
 	acpi_its_srat_maps_free();
 }
 #else
···
 static int intc_map(struct irq_domain *d, unsigned int irq, irq_hw_number_t hw)
 {
-	irq_set_chip_and_handler(hw, &realtek_ictl_irq, handle_level_irq);
+	irq_set_chip_and_handler(irq, &realtek_ictl_irq, handle_level_irq);

 	return 0;
 }
···
 {
 	struct irq_chip *chip = irq_desc_get_chip(desc);
 	struct irq_domain *domain;
-	unsigned int pending;
+	unsigned long pending;
+	unsigned int soc_int;

 	chained_irq_enter(chip, desc);
 	pending = readl(REG(RTL_ICTL_GIMR)) & readl(REG(RTL_ICTL_GISR));
+
 	if (unlikely(!pending)) {
 		spurious_interrupt();
 		goto out;
 	}
+
 	domain = irq_desc_get_handler_data(desc);
-	generic_handle_domain_irq(domain, __ffs(pending));
+	for_each_set_bit(soc_int, &pending, 32)
+		generic_handle_domain_irq(domain, soc_int);

 out:
 	chained_irq_exit(chip, desc);
···
  * SoC interrupts are cascaded to MIPS CPU interrupts according to the
  * interrupt-map in the device tree. Each SoC interrupt gets 4 bits for
  * the CPU interrupt in an Interrupt Routing Register. Max 32 SoC interrupts
- * thus go into 4 IRRs.
+ * thus go into 4 IRRs. A routing value of '0' means the interrupt is left
+ * disconnected. Routing values {1..15} connect to output lines {0..14}.
  */
 static int __init map_interrupts(struct device_node *node, struct irq_domain *domain)
 {
···
 	of_node_put(cpu_ictl);

 	cpu_int = be32_to_cpup(imap + 2);
-	if (cpu_int > 7)
+	if (cpu_int > 7 || cpu_int < 2)
 		return -EINVAL;

 	if (!(mips_irqs_set & BIT(cpu_int))) {
···
 		mips_irqs_set |= BIT(cpu_int);
 	}

-	regs[(soc_int * 4) / 32] |= cpu_int << (soc_int * 4) % 32;
+	/* Use routing values (1..6) for CPU interrupts (2..7) */
+	regs[(soc_int * 4) / 32] |= (cpu_int - 1) << (soc_int * 4) % 32;
 	imap += 3;
 }
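The routing-register change above packs a 4-bit value per SoC interrupt, so 32 interrupts fill four 32-bit IRRs, and CPU interrupt N (2..7) is stored as routing value N-1. A small sketch of that packing arithmetic (names ours, not the driver's):

```c
#include <stdint.h>

/* Pack the routing value for one SoC interrupt into the right 4-bit
 * field of the right IRR: regs[(soc_int*4)/32], bits (soc_int*4)%32.
 * CPU interrupt cpu_int (2..7) becomes routing value cpu_int - 1;
 * a field left at 0 means "disconnected". */
static void route_soc_int(uint32_t regs[4], unsigned int soc_int,
			  unsigned int cpu_int)
{
	regs[(soc_int * 4) / 32] |=
		(uint32_t)(cpu_int - 1) << ((soc_int * 4) % 32);
}
```

For example, SoC interrupt 9 routed to CPU interrupt 7 sets value 6 in bits 7:4 of the second register.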
+3-17
drivers/md/dm.c
···
 	struct mapped_device *md = io->md;
 	struct bio *bio = io->orig_bio;

-	io->start_time = bio_start_io_acct(bio);
+	bio_start_io_acct_time(bio, io->start_time);
 	if (unlikely(dm_stats_used(&md->stats)))
 		dm_stats_account_io(&md->stats, bio_data_dir(bio),
 				    bio->bi_iter.bi_sector, bio_sectors(bio),
···
 	io->md = md;
 	spin_lock_init(&io->endio_lock);

-	start_io_acct(io);
+	io->start_time = jiffies;

 	return io;
 }
···
 	ci->sector = bio->bi_iter.bi_sector;
 }

-#define __dm_part_stat_sub(part, field, subnd) \
-	(part_stat_get(part, field) -= (subnd))
-
 /*
  * Entry point to split a bio into clones and submit them to the targets.
  */
···
 					  GFP_NOIO, &md->queue->bio_split);
 			ci.io->orig_bio = b;

-			/*
-			 * Adjust IO stats for each split, otherwise upon queue
-			 * reentry there will be redundant IO accounting.
-			 * NOTE: this is a stop-gap fix, a proper fix involves
-			 * significant refactoring of DM core's bio splitting
-			 * (by eliminating DM's splitting and just using bio_split)
-			 */
-			part_stat_lock();
-			__dm_part_stat_sub(dm_disk(md)->part0,
-					   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
-			part_stat_unlock();
-
 			bio_chain(b, bio);
 			trace_block_split(b, bio->bi_iter.bi_sector);
 			submit_bio_noacct(bio);
 		}
 	}
+	start_io_acct(ci.io);

 	/* drop the extra reference count */
 	dm_io_dec_pending(ci.io, errno_to_blk_status(error));
···
  * Below is some version info we got:
  *    SOC   Version   IP-Version  Glitch- [TR]WRN_INT IRQ Err Memory err RTR rece-   FD Mode     MB
  *                                Filter? connected?  Passive detection  ption in MB Supported?
- * MCF5441X FlexCAN2  ?            no       yes        no       no       yes           no     16
+ * MCF5441X FlexCAN2  ?            no       yes        no       no        no           no     16
  *    MX25  FlexCAN2  03.00.00.00  no        no        no       no        no           no     64
  *    MX28  FlexCAN2  03.00.04.00 yes       yes        no       no        no           no     64
  *    MX35  FlexCAN2  03.00.00.00  no        no        no       no        no           no     64
···
 ether1_probe(struct expansion_card *ec, const struct ecard_id *id)
 {
 	struct net_device *dev;
+	u8 addr[ETH_ALEN];
 	int i, ret = 0;

 	ether1_banner();
···
 	}

 	for (i = 0; i < 6; i++)
-		dev->dev_addr[i] = readb(IDPROM_ADDRESS + (i << 2));
+		addr[i] = readb(IDPROM_ADDRESS + (i << 2));
+	eth_hw_addr_set(dev, addr);

 	if (ether1_init_2(dev)) {
 		ret = -ENODEV;
+100-67
drivers/net/ethernet/ibm/ibmvnic.c
···
 	struct ibmvnic_rwi *rwi;
 	unsigned long flags;
 	u32 reset_state;
+	int num_fails = 0;
 	int rc = 0;

 	adapter = container_of(work, struct ibmvnic_adapter, ibmvnic_reset);
···
 			rc = do_hard_reset(adapter, rwi, reset_state);
 			rtnl_unlock();
 		}
-		if (rc) {
-			/* give backing device time to settle down */
+		if (rc)
+			num_fails++;
+		else
+			num_fails = 0;
+
+		/* If auto-priority-failover is enabled we can get
+		 * back to back failovers during resets, resulting
+		 * in at least two failed resets (from high-priority
+		 * backing device to low-priority one and then back)
+		 * If resets continue to fail beyond that, give the
+		 * adapter some time to settle down before retrying.
+		 */
+		if (num_fails >= 3) {
 			netdev_dbg(adapter->netdev,
-				   "[S:%s] Hard reset failed, waiting 60 secs\n",
-				   adapter_state_to_string(adapter->state));
+				   "[S:%s] Hard reset failed %d times, waiting 60 secs\n",
+				   adapter_state_to_string(adapter->state),
+				   num_fails);
 			set_current_state(TASK_UNINTERRUPTIBLE);
 			schedule_timeout(60 * HZ);
 		}
···
 	struct device *dev = &adapter->vdev->dev;
 	union ibmvnic_crq crq;
 	int max_entries;
+	int cap_reqs;
+
+	/* We send out 6 or 7 REQUEST_CAPABILITY CRQs below (depending on
+	 * the PROMISC flag). Initialize this count upfront. When the tasklet
+	 * receives a response to all of these, it will send the next protocol
+	 * message (QUERY_IP_OFFLOAD).
+	 */
+	if (!(adapter->netdev->flags & IFF_PROMISC) ||
+	    adapter->promisc_supported)
+		cap_reqs = 7;
+	else
+		cap_reqs = 6;

 	if (!retry) {
 		/* Sub-CRQ entries are 32 byte long */
 		int entries_page = 4 * PAGE_SIZE / (sizeof(u64) * 4);
+
+		atomic_set(&adapter->running_cap_crqs, cap_reqs);

 		if (adapter->min_tx_entries_per_subcrq > entries_page ||
 		    adapter->min_rx_add_entries_per_subcrq > entries_page) {
···
 			adapter->opt_rx_comp_queues;

 		adapter->req_rx_add_queues = adapter->max_rx_add_queues;
+	} else {
+		atomic_add(cap_reqs, &adapter->running_cap_crqs);
 	}
-
 	memset(&crq, 0, sizeof(crq));
 	crq.request_capability.first = IBMVNIC_CRQ_CMD;
 	crq.request_capability.cmd = REQUEST_CAPABILITY;

 	crq.request_capability.capability = cpu_to_be16(REQ_TX_QUEUES);
 	crq.request_capability.number = cpu_to_be64(adapter->req_tx_queues);
-	atomic_inc(&adapter->running_cap_crqs);
+	cap_reqs--;
 	ibmvnic_send_crq(adapter, &crq);

 	crq.request_capability.capability = cpu_to_be16(REQ_RX_QUEUES);
 	crq.request_capability.number = cpu_to_be64(adapter->req_rx_queues);
-	atomic_inc(&adapter->running_cap_crqs);
+	cap_reqs--;
 	ibmvnic_send_crq(adapter, &crq);

 	crq.request_capability.capability = cpu_to_be16(REQ_RX_ADD_QUEUES);
 	crq.request_capability.number = cpu_to_be64(adapter->req_rx_add_queues);
-	atomic_inc(&adapter->running_cap_crqs);
+	cap_reqs--;
 	ibmvnic_send_crq(adapter, &crq);

 	crq.request_capability.capability =
 	    cpu_to_be16(REQ_TX_ENTRIES_PER_SUBCRQ);
 	crq.request_capability.number =
 	    cpu_to_be64(adapter->req_tx_entries_per_subcrq);
-	atomic_inc(&adapter->running_cap_crqs);
+	cap_reqs--;
 	ibmvnic_send_crq(adapter, &crq);

 	crq.request_capability.capability =
 	    cpu_to_be16(REQ_RX_ADD_ENTRIES_PER_SUBCRQ);
 	crq.request_capability.number =
 	    cpu_to_be64(adapter->req_rx_add_entries_per_subcrq);
-	atomic_inc(&adapter->running_cap_crqs);
+	cap_reqs--;
 	ibmvnic_send_crq(adapter, &crq);

 	crq.request_capability.capability = cpu_to_be16(REQ_MTU);
 	crq.request_capability.number = cpu_to_be64(adapter->req_mtu);
-	atomic_inc(&adapter->running_cap_crqs);
+	cap_reqs--;
 	ibmvnic_send_crq(adapter, &crq);

 	if (adapter->netdev->flags & IFF_PROMISC) {
···
 			crq.request_capability.capability =
 			    cpu_to_be16(PROMISC_REQUESTED);
 			crq.request_capability.number = cpu_to_be64(1);
-			atomic_inc(&adapter->running_cap_crqs);
+			cap_reqs--;
 			ibmvnic_send_crq(adapter, &crq);
 		}
 	} else {
 		crq.request_capability.capability =
 		    cpu_to_be16(PROMISC_REQUESTED);
 		crq.request_capability.number = cpu_to_be64(0);
-		atomic_inc(&adapter->running_cap_crqs);
+		cap_reqs--;
 		ibmvnic_send_crq(adapter, &crq);
 	}
+
+	/* Keep at end to catch any discrepancy between expected and actual
+	 * CRQs sent.
+	 */
+	WARN_ON(cap_reqs != 0);
 }

 static int pending_scrq(struct ibmvnic_adapter *adapter,
···
 static void send_query_cap(struct ibmvnic_adapter *adapter)
 {
 	union ibmvnic_crq crq;
+	int cap_reqs;

-	atomic_set(&adapter->running_cap_crqs, 0);
+	/* We send out 25 QUERY_CAPABILITY CRQs below. Initialize this count
+	 * upfront. When the tasklet receives a response to all of these, it
+	 * can send out the next protocol message (REQUEST_CAPABILITY).
+	 */
+	cap_reqs = 25;
+
+	atomic_set(&adapter->running_cap_crqs, cap_reqs);
+
 	memset(&crq, 0, sizeof(crq));
 	crq.query_capability.first = IBMVNIC_CRQ_CMD;
 	crq.query_capability.cmd = QUERY_CAPABILITY;

 	crq.query_capability.capability = cpu_to_be16(MIN_TX_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(MIN_RX_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(MIN_RX_ADD_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(MAX_TX_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(MAX_RX_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(MAX_RX_ADD_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability =
 	    cpu_to_be16(MIN_TX_ENTRIES_PER_SUBCRQ);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability =
 	    cpu_to_be16(MIN_RX_ADD_ENTRIES_PER_SUBCRQ);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability =
 	    cpu_to_be16(MAX_TX_ENTRIES_PER_SUBCRQ);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability =
 	    cpu_to_be16(MAX_RX_ADD_ENTRIES_PER_SUBCRQ);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(TCP_IP_OFFLOAD);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(PROMISC_SUPPORTED);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(MIN_MTU);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(MAX_MTU);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(MAX_MULTICAST_FILTERS);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(VLAN_HEADER_INSERTION);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(RX_VLAN_HEADER_INSERTION);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(MAX_TX_SG_ENTRIES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(RX_SG_SUPPORTED);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(OPT_TX_COMP_SUB_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(OPT_RX_COMP_QUEUES);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability =
 	    cpu_to_be16(OPT_RX_BUFADD_Q_PER_RX_COMP_Q);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability =
 	    cpu_to_be16(OPT_TX_ENTRIES_PER_SUBCRQ);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability =
 	    cpu_to_be16(OPT_RXBA_ENTRIES_PER_SUBCRQ);
-	atomic_inc(&adapter->running_cap_crqs);
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;

 	crq.query_capability.capability = cpu_to_be16(TX_RX_DESC_REQ);
-	atomic_inc(&adapter->running_cap_crqs);
+
 	ibmvnic_send_crq(adapter, &crq);
+	cap_reqs--;
+
+	/* Keep at end to catch any discrepancy between expected and actual
+	 * CRQs sent.
+	 */
+	WARN_ON(cap_reqs != 0);
 }

 static void send_query_ip_offload(struct ibmvnic_adapter *adapter)
···
 	char *name;

 	atomic_dec(&adapter->running_cap_crqs);
+	netdev_dbg(adapter->netdev, "Outstanding request-caps: %d\n",
+		   atomic_read(&adapter->running_cap_crqs));
 	switch (be16_to_cpu(crq->request_capability_rsp.capability)) {
 	case REQ_TX_QUEUES:
 		req_value = &adapter->req_tx_queues;
···
 	}

 	/* Done receiving requested capabilities, query IP offload support */
-	if (atomic_read(&adapter->running_cap_crqs) == 0) {
-		adapter->wait_capability = false;
+	if (atomic_read(&adapter->running_cap_crqs) == 0)
 		send_query_ip_offload(adapter);
-	}
 }

 static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq,
···
 	}

 out:
-	if (atomic_read(&adapter->running_cap_crqs) == 0) {
-		adapter->wait_capability = false;
+	if (atomic_read(&adapter->running_cap_crqs) == 0)
 		send_request_cap(adapter, 0);
-	}
 }

 static int send_query_phys_parms(struct ibmvnic_adapter *adapter)
···
 	struct ibmvnic_crq_queue *queue = &adapter->crq;
 	union ibmvnic_crq *crq;
 	unsigned long flags;
-	bool done = false;

 	spin_lock_irqsave(&queue->lock, flags);
-	while (!done) {
-		/* Pull all the valid messages off the CRQ */
-		while ((crq = ibmvnic_next_crq(adapter)) != NULL) {
-			/* This barrier makes sure ibmvnic_next_crq()'s
-			 * crq->generic.first & IBMVNIC_CRQ_CMD_RSP is loaded
-			 * before ibmvnic_handle_crq()'s
-			 * switch(gen_crq->first) and switch(gen_crq->cmd).
-			 */
-			dma_rmb();
-			ibmvnic_handle_crq(crq, adapter);
-			crq->generic.first = 0;
-		}

-		/* remain in tasklet until all
-		 * capabilities responses are received
+	/* Pull all the valid messages off the CRQ */
+	while ((crq = ibmvnic_next_crq(adapter)) != NULL) {
+		/* This barrier makes sure ibmvnic_next_crq()'s
+		 * crq->generic.first & IBMVNIC_CRQ_CMD_RSP is loaded
+		 * before ibmvnic_handle_crq()'s
+		 * switch(gen_crq->first) and switch(gen_crq->cmd).
 		 */
-		if (!adapter->wait_capability)
-			done = true;
+		dma_rmb();
+		ibmvnic_handle_crq(crq, adapter);
+		crq->generic.first = 0;
 	}
-	/* if capabilities CRQ's were sent in this tasklet, the following
-	 * tasklet must wait until all responses are received
-	 */
-	if (atomic_read(&adapter->running_cap_crqs) != 0)
-		adapter->wait_capability = true;
+
 	spin_unlock_irqrestore(&queue->lock, flags);
 }
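The ibmvnic change above replaces per-send `atomic_inc()` calls with a count that is computed and published *before* the first CRQ goes out, then decremented locally per send and checked against zero at the end (`WARN_ON(cap_reqs != 0)`), so a fast responder can never observe a partially-built count. A user-space sketch of that discipline, with plain ints standing in for the kernel atomics and all names ours:

```c
/* Stand-ins for adapter state; plain ints instead of atomic_t. */
static int running_cap_crqs;	/* what the response handler decrements */
static int crqs_sent;		/* how many sends actually happened */

static void send_crq(void)
{
	crqs_sent++;
}

/* Returns the leftover count; non-zero is the discrepancy that the
 * kernel's WARN_ON(cap_reqs != 0) would flag. */
static int send_request_caps(int promisc)
{
	int cap_reqs = promisc ? 7 : 6;	/* expected number of sends */

	running_cap_crqs = cap_reqs;	/* published before the first send */

	send_crq(); cap_reqs--;		/* REQ_TX_QUEUES */
	send_crq(); cap_reqs--;		/* REQ_RX_QUEUES */
	send_crq(); cap_reqs--;		/* REQ_RX_ADD_QUEUES */
	send_crq(); cap_reqs--;		/* REQ_TX_ENTRIES_PER_SUBCRQ */
	send_crq(); cap_reqs--;		/* REQ_RX_ADD_ENTRIES_PER_SUBCRQ */
	send_crq(); cap_reqs--;		/* REQ_MTU */
	if (promisc) {
		send_crq(); cap_reqs--;	/* PROMISC_REQUESTED */
	}
	return cap_reqs;
}
```

The same publish-then-send ordering is why the tasklet no longer needs the `wait_capability` flag: the counter is already complete when the first response arrives.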
···
  * @id: an owner id to stick on the items assigned
  *
  * Returns the base item index of the lump, or negative for error
- *
- * The search_hint trick and lack of advanced fit-finding only work
- * because we're highly likely to have all the same size lump requests.
- * Linear search time and any fragmentation should be minimal.
  **/
 static int i40e_get_lump(struct i40e_pf *pf, struct i40e_lump_tracking *pile,
 			 u16 needed, u16 id)
···
 		return -EINVAL;
 	}

-	/* start the linear search with an imperfect hint */
-	i = pile->search_hint;
+	/* Allocate last queue in the pile for FDIR VSI queue
+	 * so it doesn't fragment the qp_pile
+	 */
+	if (pile == pf->qp_pile && pf->vsi[id]->type == I40E_VSI_FDIR) {
+		if (pile->list[pile->num_entries - 1] & I40E_PILE_VALID_BIT) {
+			dev_err(&pf->pdev->dev,
+				"Cannot allocate queue %d for I40E_VSI_FDIR\n",
+				pile->num_entries - 1);
+			return -ENOMEM;
+		}
+		pile->list[pile->num_entries - 1] = id | I40E_PILE_VALID_BIT;
+		return pile->num_entries - 1;
+	}
+
+	i = 0;
 	while (i < pile->num_entries) {
 		/* skip already allocated entries */
 		if (pile->list[i] & I40E_PILE_VALID_BIT) {
···
 			for (j = 0; j < needed; j++)
 				pile->list[i+j] = id | I40E_PILE_VALID_BIT;
 			ret = i;
-			pile->search_hint = i + j;
 			break;
 		}
···
 {
 	int valid_id = (id | I40E_PILE_VALID_BIT);
 	int count = 0;
-	int i;
+	u16 i;

 	if (!pile || index >= pile->num_entries)
 		return -EINVAL;
···
 		count++;
 	}

-	if (count && index < pile->search_hint)
-		pile->search_hint = index;

 	return count;
 }
···
 	struct rtnl_link_stats64 *ns;   /* netdev stats */
 	struct i40e_eth_stats *oes;
 	struct i40e_eth_stats *es;     /* device's eth stats */
-	u32 tx_restart, tx_busy;
+	u64 tx_restart, tx_busy;
 	struct i40e_ring *p;
-	u32 rx_page, rx_buf;
+	u64 rx_page, rx_buf;
 	u64 bytes, packets;
 	unsigned int start;
 	u64 tx_linearize;
···
 	}
 	i40e_get_oem_version(&pf->hw);

-	if (test_bit(__I40E_EMP_RESET_INTR_RECEIVED, pf->state) &&
-	    ((hw->aq.fw_maj_ver == 4 && hw->aq.fw_min_ver <= 33) ||
-	     hw->aq.fw_maj_ver < 4) && hw->mac.type == I40E_MAC_XL710) {
-		/* The following delay is necessary for 4.33 firmware and older
-		 * to recover after EMP reset. 200 ms should suffice but we
-		 * put here 300 ms to be sure that FW is ready to operate
-		 * after reset.
-		 */
-		mdelay(300);
+	if (test_and_clear_bit(__I40E_EMP_RESET_INTR_RECEIVED, pf->state)) {
+		/* The following delay is necessary for firmware update. */
+		mdelay(1000);
 	}

 	/* re-verify the eeprom if we just had an EMP reset */
···
 		return -ENOMEM;

 	pf->irq_pile->num_entries = vectors;
-	pf->irq_pile->search_hint = 0;

 	/* track first vector for misc interrupts, ignore return */
 	(void)i40e_get_lump(pf, pf->irq_pile, 1, I40E_PILE_VALID_BIT - 1);
···
 		goto sw_init_done;
 	}
 	pf->qp_pile->num_entries = pf->hw.func_caps.num_tx_qp;
-	pf->qp_pile->search_hint = 0;

 	pf->tx_timeout_recovery_level = 1;
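The i40e hunks above drop the `search_hint` and make lump allocation a plain first-fit scan, while pinning the single-queue FDIR user to the last pile entry so it never fragments the middle of the pile. A simplified user-space sketch of that allocator (an array plus a valid bit stand in for `i40e_lump_tracking`; names and the valid-bit value are illustrative):

```c
#include <stddef.h>

#define PILE_VALID_BIT 0x8000u	/* mirrors I40E_PILE_VALID_BIT's role */

/* First-fit allocation of 'needed' contiguous free entries, tagging each
 * with 'id'. When want_last is set (the FDIR-style caller), take only the
 * final entry. Returns the base index, or -1 on failure. */
static int get_lump(unsigned short *list, unsigned short num_entries,
		    unsigned short needed, unsigned short id, int want_last)
{
	unsigned short i, j;

	if (want_last) {
		if (list[num_entries - 1] & PILE_VALID_BIT)
			return -1;	/* last entry already taken */
		list[num_entries - 1] = id | PILE_VALID_BIT;
		return num_entries - 1;
	}

	for (i = 0; i < num_entries;) {
		unsigned short count = 0;

		if (list[i] & PILE_VALID_BIT) {	/* skip allocated entries */
			i++;
			continue;
		}
		/* measure the free run starting at i */
		for (j = i; j < num_entries && !(list[j] & PILE_VALID_BIT); j++) {
			if (++count == needed)
				break;
		}
		if (count == needed) {		/* big enough: claim it */
			for (j = 0; j < needed; j++)
				list[i + j] = id | PILE_VALID_BIT;
			return i;
		}
		i = j + 1;			/* jump past the short run */
	}
	return -1;
}
```

Reserving the tail slot for the one-off allocation keeps every multi-entry request packed from the front, which is the fragmentation the commit comment is worried about.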
···
 }

 /**
+ * i40e_sync_vfr_reset
+ * @hw: pointer to hw struct
+ * @vf_id: VF identifier
+ *
+ * Before trigger hardware reset, we need to know if no other process has
+ * reserved the hardware for any reset operations. This check is done by
+ * examining the status of the RSTAT1 register used to signal the reset.
+ **/
+static int i40e_sync_vfr_reset(struct i40e_hw *hw, int vf_id)
+{
+	u32 reg;
+	int i;
+
+	for (i = 0; i < I40E_VFR_WAIT_COUNT; i++) {
+		reg = rd32(hw, I40E_VFINT_ICR0_ENA(vf_id)) &
+		      I40E_VFINT_ICR0_ADMINQ_MASK;
+		if (reg)
+			return 0;
+
+		usleep_range(100, 200);
+	}
+
+	return -EAGAIN;
+}
+
+/**
  * i40e_trigger_vf_reset
  * @vf: pointer to the VF structure
  * @flr: VFLR was issued or not
···
 	struct i40e_pf *pf = vf->pf;
 	struct i40e_hw *hw = &pf->hw;
 	u32 reg, reg_idx, bit_idx;
+	bool vf_active;
+	u32 radq;

 	/* warn the VF */
-	clear_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);
+	vf_active = test_and_clear_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states);

 	/* Disable VF's configuration API during reset. The flag is re-enabled
 	 * in i40e_alloc_vf_res(), when it's safe again to access VF's VSI.
···
 	 * just need to clean up, so don't hit the VFRTRIG register.
 	 */
 	if (!flr) {
-		/* reset VF using VPGEN_VFRTRIG reg */
+		/* Sync VFR reset before trigger next one */
+		radq = rd32(hw, I40E_VFINT_ICR0_ENA(vf->vf_id)) &
+		       I40E_VFINT_ICR0_ADMINQ_MASK;
+		if (vf_active && !radq)
+			/* waiting for finish reset by virtual driver */
+			if (i40e_sync_vfr_reset(hw, vf->vf_id))
+				dev_info(&pf->pdev->dev,
+					 "Reset VF %d never finished\n",
+					 vf->vf_id);
+
+		/* Reset VF using VPGEN_VFRTRIG reg. It is also setting
+		 * in progress state in rstat1 register.
+		 */
 		reg = rd32(hw, I40E_VPGEN_VFRTRIG(vf->vf_id));
 		reg |= I40E_VPGEN_VFRTRIG_VFSWR_MASK;
 		wr32(hw, I40E_VPGEN_VFRTRIG(vf->vf_id), reg);
···
 }

 /**
+ * i40e_check_enough_queue - find big enough queue number
+ * @vf: pointer to the VF info
+ * @needed: the number of items needed
+ *
+ * Returns the base item index of the queue, or negative for error
+ **/
+static int i40e_check_enough_queue(struct i40e_vf *vf, u16 needed)
+{
+	unsigned int i, cur_queues, more, pool_size;
+	struct i40e_lump_tracking *pile;
+	struct i40e_pf *pf = vf->pf;
+	struct i40e_vsi *vsi;
+
+	vsi = pf->vsi[vf->lan_vsi_idx];
+	cur_queues = vsi->alloc_queue_pairs;
+
+	/* if current allocated queues are enough for need */
+	if (cur_queues >= needed)
+		return vsi->base_queue;
+
+	pile = pf->qp_pile;
+	if (cur_queues > 0) {
+		/* if the allocated queues are not zero
+		 * just check if there are enough queues for more
+		 * behind the allocated queues.
+		 */
+		more = needed - cur_queues;
+		for (i = vsi->base_queue + cur_queues;
+		     i < pile->num_entries; i++) {
+			if (pile->list[i] & I40E_PILE_VALID_BIT)
+				break;
+
+			if (more-- == 1)
+				/* there is enough */
+				return vsi->base_queue;
+		}
+	}
+
+	pool_size = 0;
+	for (i = 0; i < pile->num_entries; i++) {
+		if (pile->list[i] & I40E_PILE_VALID_BIT) {
+			pool_size = 0;
+			continue;
+		}
+		if (needed <= ++pool_size)
+			/* there is enough */
+			return i;
+	}
+
+	return -ENOMEM;
+}
+
+/**
  * i40e_vc_request_queues_msg
  * @vf: pointer to the VF info
  * @msg: pointer to the msg buffer
···
 			 req_pairs - cur_pairs,
 			 pf->queues_left);
 		vfres->num_queue_pairs = pf->queues_left + cur_pairs;
+	} else if (i40e_check_enough_queue(vf, req_pairs) < 0) {
+		dev_warn(&pf->pdev->dev,
+			 "VF %d requested %d more queues, but there is not enough for it.\n",
+			 vf->vf_id,
+			 req_pairs - cur_pairs);
+		vfres->num_queue_pairs = cur_pairs;
 	} else {
 		/* successful request */
 		vf->num_req_queues = req_pairs;
···
 	void	(*mac_enadis_ptp_config)(void *cgxd,
 					 int lmac_id,
 					 bool enable);
+
+	int	(*mac_rx_tx_enable)(void *cgxd, int lmac_id, bool enable);
+	int	(*mac_tx_enable)(void *cgxd, int lmac_id, bool enable);
 };

 struct cgx {
···
 	dst_mdev->msg_size = mbox_hdr->msg_size;
 	dst_mdev->num_msgs = num_msgs;
 	err = otx2_sync_mbox_msg(dst_mbox);
-	if (err) {
+	/* Error code -EIO indicate there is a communication failure
+	 * to the AF. Rest of the error codes indicate that AF processed
+	 * VF messages and set the error codes in response messages
+	 * (if any) so simply forward responses to VF.
+	 */
+	if (err == -EIO) {
 		dev_warn(pf->dev,
 			 "AF not responding to VF%d messages\n", vf);
 		/* restore PF mbase and exit */
···
 	u32 tx_coal_timer[MTL_MAX_TX_QUEUES];
 	u32 rx_coal_frames[MTL_MAX_TX_QUEUES];

-	int tx_coalesce;
 	int hwts_tx_en;
 	bool tx_path_in_lpi_mode;
 	bool tso;
···
 	unsigned int flow_ctrl;
 	unsigned int pause;
 	struct mii_bus *mii;
-	int mii_irq[PHY_MAX_ADDR];

 	struct phylink_config phylink_config;
 	struct phylink *phylink;
+19-17
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
···
  * Description: this function is to verify and enter in LPI mode in case of
  * EEE.
  */
-static void stmmac_enable_eee_mode(struct stmmac_priv *priv)
+static int stmmac_enable_eee_mode(struct stmmac_priv *priv)
 {
 	u32 tx_cnt = priv->plat->tx_queues_to_use;
 	u32 queue;
···
 		struct stmmac_tx_queue *tx_q = &priv->tx_queue[queue];

 		if (tx_q->dirty_tx != tx_q->cur_tx)
-			return; /* still unfinished work */
+			return -EBUSY; /* still unfinished work */
 	}

 	/* Check and enter in LPI mode */
 	if (!priv->tx_path_in_lpi_mode)
 		stmmac_set_eee_mode(priv, priv->hw,
 				priv->plat->en_tx_lpi_clockgating);
+	return 0;
 }

 /**
···
 {
 	struct stmmac_priv *priv = from_timer(priv, t, eee_ctrl_timer);

-	stmmac_enable_eee_mode(priv);
-	mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_T(priv->tx_lpi_timer));
+	if (stmmac_enable_eee_mode(priv))
+		mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_T(priv->tx_lpi_timer));
 }

 /**
···
 	bool xmac = priv->plat->has_gmac4 || priv->plat->has_xgmac;
 	int ret;

+	if (priv->plat->ptp_clk_freq_config)
+		priv->plat->ptp_clk_freq_config(priv);
+
 	ret = stmmac_init_tstamp_counter(priv, STMMAC_HWTS_ACTIVE);
 	if (ret)
 		return ret;
···
 	priv->hwts_tx_en = 0;
 	priv->hwts_rx_en = 0;
-
-	stmmac_ptp_register(priv);

 	return 0;
 }
···
 	if (priv->eee_enabled && !priv->tx_path_in_lpi_mode &&
 	    priv->eee_sw_timer_en) {
-		stmmac_enable_eee_mode(priv);
-		mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_T(priv->tx_lpi_timer));
+		if (stmmac_enable_eee_mode(priv))
+			mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_T(priv->tx_lpi_timer));
 	}

 	/* We still have pending packets, let's call for a new scheduling */
···
 /**
  * stmmac_hw_setup - setup mac in a usable state.
  * @dev : pointer to the device structure.
- * @init_ptp: initialize PTP if set
+ * @ptp_register: register PTP if set
  * Description:
  * this is the main function to setup the HW in a usable state because the
  * dma engine is reset, the core registers are configured (e.g. AXI,
···
  * 0 on success and an appropriate (-)ve integer as defined in errno.h
  * file on failure.
  */
-static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
+static int stmmac_hw_setup(struct net_device *dev, bool ptp_register)
 {
 	struct stmmac_priv *priv = netdev_priv(dev);
 	u32 rx_cnt = priv->plat->rx_queues_to_use;
···
 	stmmac_mmc_setup(priv);

-	if (init_ptp) {
-		ret = stmmac_init_ptp(priv);
-		if (ret == -EOPNOTSUPP)
-			netdev_warn(priv->dev, "PTP not supported by HW\n");
-		else if (ret)
-			netdev_warn(priv->dev, "PTP init failed\n");
-	}
+	ret = stmmac_init_ptp(priv);
+	if (ret == -EOPNOTSUPP)
+		netdev_warn(priv->dev, "PTP not supported by HW\n");
+	else if (ret)
+		netdev_warn(priv->dev, "PTP init failed\n");
+	else if (ptp_register)
+		stmmac_ptp_register(priv);

 	priv->eee_tw_timer = STMMAC_DEFAULT_TWT_LS;
-3
drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c
···
 {
 	int i;
 
-	if (priv->plat->ptp_clk_freq_config)
-		priv->plat->ptp_clk_freq_config(priv);
-
 	for (i = 0; i < priv->dma_cap.pps_out_num; i++) {
 		if (i >= STMMAC_PPS_MAX)
 			break;
drivers/net/phy/phy_device.c
···
 	    phy_driver_is_genphy_10g(phydev))
 		device_release_driver(&phydev->mdio.dev);
 
+	/* Assert the reset signal */
+	phy_device_reset(phydev, 1);
+
 	/*
 	 * The phydev might go away on the put_device() below, so avoid
 	 * a use-after-free bug by reading the underlying bus first.
···
 	ndev_owner = dev->dev.parent->driver->owner;
 	if (ndev_owner != bus->owner)
 		module_put(bus->owner);
-
-	/* Assert the reset signal */
-	phy_device_reset(phydev, 1);
 }
 EXPORT_SYMBOL(phy_detach);
+5
drivers/net/phy/sfp-bus.c
···
 	else if (ret < 0)
 		return ERR_PTR(ret);
 
+	if (!fwnode_device_is_available(ref.fwnode)) {
+		fwnode_handle_put(ref.fwnode);
+		return NULL;
+	}
+
 	bus = sfp_bus_get(ref.fwnode);
 	fwnode_handle_put(ref.fwnode);
 	if (!bus)
drivers/rpmsg/rpmsg_char.c
···
 	/* wake up any blocked readers */
 	wake_up_interruptible(&eptdev->readq);
 
-	device_del(&eptdev->dev);
+	cdev_device_del(&eptdev->cdev, &eptdev->dev);
 	put_device(&eptdev->dev);
 
 	return 0;
···
 
 	ida_simple_remove(&rpmsg_ept_ida, dev->id);
 	ida_simple_remove(&rpmsg_minor_ida, MINOR(eptdev->dev.devt));
-	cdev_del(&eptdev->cdev);
 	kfree(eptdev);
 }
···
 	dev->id = ret;
 	dev_set_name(dev, "rpmsg%d", ret);
 
-	ret = cdev_add(&eptdev->cdev, dev->devt, 1);
+	ret = cdev_device_add(&eptdev->cdev, &eptdev->dev);
 	if (ret)
 		goto free_ept_ida;
 
 	/* We can now rely on the release function for cleanup */
 	dev->release = rpmsg_eptdev_release_device;
-
-	ret = device_add(dev);
-	if (ret) {
-		dev_err(dev, "device_add failed: %d\n", ret);
-		put_device(dev);
-	}
 
 	return ret;
···
 
 	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
 	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
-	cdev_del(&ctrldev->cdev);
 	kfree(ctrldev);
 }
···
 	dev->id = ret;
 	dev_set_name(&ctrldev->dev, "rpmsg_ctrl%d", ret);
 
-	ret = cdev_add(&ctrldev->cdev, dev->devt, 1);
+	ret = cdev_device_add(&ctrldev->cdev, &ctrldev->dev);
 	if (ret)
 		goto free_ctrl_ida;
 
 	/* We can now rely on the release function for cleanup */
 	dev->release = rpmsg_ctrldev_release_device;
-
-	ret = device_add(dev);
-	if (ret) {
-		dev_err(&rpdev->dev, "device_add failed: %d\n", ret);
-		put_device(dev);
-	}
 
 	dev_set_drvdata(&rpdev->dev, ctrldev);
···
 	if (ret)
 		dev_warn(&rpdev->dev, "failed to nuke endpoints: %d\n", ret);
 
-	device_del(&ctrldev->dev);
+	cdev_device_del(&ctrldev->cdev, &ctrldev->dev);
 	put_device(&ctrldev->dev);
 }
+12-1
drivers/s390/scsi/zfcp_fc.c
···
 		goto out;
 	}
 
+	/* re-init to undo drop from zfcp_fc_adisc() */
+	port->d_id = ntoh24(adisc_resp->adisc_port_id);
 	/* port is good, unblock rport without going through erp */
 	zfcp_scsi_schedule_rport_register(port);
 out:
···
 	struct zfcp_fc_req *fc_req;
 	struct zfcp_adapter *adapter = port->adapter;
 	struct Scsi_Host *shost = adapter->scsi_host;
+	u32 d_id;
 	int ret;
 
 	fc_req = kmem_cache_zalloc(zfcp_fc_req_cache, GFP_ATOMIC);
···
 	fc_req->u.adisc.req.adisc_cmd = ELS_ADISC;
 	hton24(fc_req->u.adisc.req.adisc_port_id, fc_host_port_id(shost));
 
-	ret = zfcp_fsf_send_els(adapter, port->d_id, &fc_req->ct_els,
+	d_id = port->d_id; /* remember as destination for send els below */
+	/*
+	 * Force fresh GID_PN lookup on next port recovery.
+	 * Must happen after request setup and before sending request,
+	 * to prevent race with port->d_id re-init in zfcp_fc_adisc_handler().
+	 */
+	port->d_id = 0;
+
+	ret = zfcp_fsf_send_els(adapter, d_id, &fc_req->ct_els,
 				ZFCP_FC_CTELS_TMO);
 	if (ret)
 		kmem_cache_free(zfcp_fc_req_cache, fc_req);
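The zfcp change above relies on a snapshot-then-invalidate ordering: the ADISC send uses a local copy of the destination ID, `port->d_id` is zeroed to force a fresh GID_PN lookup on the next recovery, and the response handler re-initializes the field on success. A minimal user-space sketch of that pattern (the struct and function names here are illustrative, not the zfcp API):

```c
#include <assert.h>

/* Illustrative stand-in for the zfcp port object. */
struct fc_port {
	unsigned int d_id;	/* cached destination ID, 0 = needs fresh lookup */
};

/*
 * Mirror the zfcp_fc_adisc() ordering: remember the destination first,
 * then zero the cached field so a later recovery performs a fresh
 * lookup; the send itself must use the remembered value.
 */
unsigned int adisc_prepare(struct fc_port *port)
{
	unsigned int d_id = port->d_id;	/* remember as destination */

	port->d_id = 0;			/* force fresh lookup later */
	return d_id;
}

/* Handler side: re-init from the ADISC response on success. */
void adisc_handler_ok(struct fc_port *port, unsigned int resp_d_id)
{
	port->d_id = resp_d_id;
}
```

The point of doing the invalidation after request setup but before the send is that the handler's re-init can only race with the send, never be silently overwritten by it.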
-4
drivers/scsi/3w-sas.c
···
 	pci_try_set_mwi(pdev);
 
 	retval = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
-	if (retval)
-		retval = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
 	if (retval) {
 		TW_PRINTK(host, TW_DRIVER, 0x18, "Failed to set dma mask");
 		retval = -ENODEV;
···
 	pci_try_set_mwi(pdev);
 
 	retval = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
-	if (retval)
-		retval = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
 	if (retval) {
 		TW_PRINTK(host, TW_DRIVER, 0x25, "Failed to set dma mask during resume");
 		retval = -ENODEV;
drivers/scsi/ufs/ufshcd-pltfrm.c
···
 		clki->min_freq = clkfreq[i];
 		clki->max_freq = clkfreq[i+1];
 		clki->name = devm_kstrdup(dev, name, GFP_KERNEL);
+		if (!clki->name) {
+			ret = -ENOMEM;
+			goto out;
+		}
+
 		if (!strcmp(name, "ref_clk"))
 			clki->keep_link_active = true;
 		dev_dbg(dev, "%s: min %u max %u name %s\n", "freq-table-hz",
···
 		return -ENOMEM;
 
 	vreg->name = devm_kstrdup(dev, name, GFP_KERNEL);
+	if (!vreg->name)
+		return -ENOMEM;
 
 	snprintf(prop_name, MAX_PROP_SIZE, "%s-max-microamp", name);
 	if (of_property_read_u32(np, prop_name, &vreg->max_uA)) {
+6-3
drivers/scsi/ufs/ufshcd.c
···
  * @pwr_mode: device power mode to set
  *
  * Returns 0 if requested power mode is set successfully
- * Returns non-zero if failed to set the requested power mode
+ * Returns < 0 if failed to set the requested power mode
  */
 static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
 				   enum ufs_dev_pwr_mode pwr_mode)
···
 		sdev_printk(KERN_WARNING, sdp,
 			    "START_STOP failed for power mode: %d, result %x\n",
 			    pwr_mode, ret);
-		if (ret > 0 && scsi_sense_valid(&sshdr))
-			scsi_print_sense_hdr(sdp, NULL, &sshdr);
+		if (ret > 0) {
+			if (scsi_sense_valid(&sshdr))
+				scsi_print_sense_hdr(sdp, NULL, &sshdr);
+			ret = -EIO;
+		}
 	}
 
 	if (!ret)
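The ufshcd fix makes the helper honor its documented contract: the SCSI submission path can return a positive result code, which callers would otherwise mistake for a negative errno. A small sketch of the normalization (the helper name here is hypothetical, not a kernel API):

```c
#include <assert.h>
#include <errno.h>

/*
 * Collapse a positive SCSI result code to -EIO so the caller only ever
 * sees 0 on success or a negative errno on failure, matching the
 * corrected ufshcd_set_dev_pwr_mode() contract.  Any sense data is
 * assumed to have been logged before this point.
 */
int normalize_scsi_result(int ret)
{
	if (ret > 0)
		ret = -EIO;	/* positive = command failed at the SCSI level */
	return ret;
}
```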
drivers/tty/serial/8250/8250_port.c
···
 	serial8250_rpm_put(up);
 }
 
-static void wait_for_lsr(struct uart_8250_port *up, int bits)
+/*
+ * Wait for transmitter & holding register to empty
+ */
+static void wait_for_xmitr(struct uart_8250_port *up, int bits)
 {
 	unsigned int status, tmout = 10000;
···
 		udelay(1);
 		touch_nmi_watchdog();
 	}
-}
-
-/*
- * Wait for transmitter & holding register to empty
- */
-static void wait_for_xmitr(struct uart_8250_port *up, int bits)
-{
-	unsigned int tmout;
-
-	wait_for_lsr(up, bits);
 
 	/* Wait up to 1s for flow control if necessary */
 	if (up->port.flags & UPF_CONS_FLOW) {
···
 }
 
 /*
- * Print a string to the serial port using the device FIFO
- *
- * It sends fifosize bytes and then waits for the fifo
- * to get empty.
- */
-static void serial8250_console_fifo_write(struct uart_8250_port *up,
-					  const char *s, unsigned int count)
-{
-	int i;
-	const char *end = s + count;
-	unsigned int fifosize = up->port.fifosize;
-	bool cr_sent = false;
-
-	while (s != end) {
-		wait_for_lsr(up, UART_LSR_THRE);
-
-		for (i = 0; i < fifosize && s != end; ++i) {
-			if (*s == '\n' && !cr_sent) {
-				serial_out(up, UART_TX, '\r');
-				cr_sent = true;
-			} else {
-				serial_out(up, UART_TX, *s++);
-				cr_sent = false;
-			}
-		}
-	}
-}
-
-/*
  * Print a string to the serial port trying not to disturb
  * any possible real use of the port...
  *
···
 	struct uart_8250_em485 *em485 = up->em485;
 	struct uart_port *port = &up->port;
 	unsigned long flags;
-	unsigned int ier, use_fifo;
+	unsigned int ier;
 	int locked = 1;
 
 	touch_nmi_watchdog();
···
 		mdelay(port->rs485.delay_rts_before_send);
 	}
 
-	use_fifo = (up->capabilities & UART_CAP_FIFO) &&
-		port->fifosize > 1 &&
-		(serial_port_in(port, UART_FCR) & UART_FCR_ENABLE_FIFO) &&
-		/*
-		 * After we put a data in the fifo, the controller will send
-		 * it regardless of the CTS state. Therefore, only use fifo
-		 * if we don't use control flow.
-		 */
-		!(up->port.flags & UPF_CONS_FLOW);
-
-	if (likely(use_fifo))
-		serial8250_console_fifo_write(up, s, count);
-	else
-		uart_console_write(port, s, count, serial8250_console_putchar);
+	uart_console_write(port, s, count, serial8250_console_putchar);
 
 	/*
 	 * Finally, wait for transmitter to become empty
drivers/tty/serial/serial_core.c
···
 	unsigned long flags;
 	unsigned int old;
 
+	if (port->rs485.flags & SER_RS485_ENABLED) {
+		set &= ~TIOCM_RTS;
+		clear &= ~TIOCM_RTS;
+	}
+
 	spin_lock_irqsave(&port->lock, flags);
 	old = port->mctrl;
 	port->mctrl = (old & ~clear) | set;
···
 
 static void uart_port_dtr_rts(struct uart_port *uport, int raise)
 {
-	int rs485_on = uport->rs485_config &&
-		(uport->rs485.flags & SER_RS485_ENABLED);
-	int RTS_after_send = !!(uport->rs485.flags & SER_RS485_RTS_AFTER_SEND);
-
-	if (raise) {
-		if (rs485_on && RTS_after_send) {
-			uart_set_mctrl(uport, TIOCM_DTR);
-			uart_clear_mctrl(uport, TIOCM_RTS);
-		} else {
-			uart_set_mctrl(uport, TIOCM_DTR | TIOCM_RTS);
-		}
-	} else {
-		unsigned int clear = TIOCM_DTR;
-
-		clear |= (!rs485_on || RTS_after_send) ? TIOCM_RTS : 0;
-		uart_clear_mctrl(uport, clear);
-	}
+	if (raise)
+		uart_set_mctrl(uport, TIOCM_DTR | TIOCM_RTS);
+	else
+		uart_clear_mctrl(uport, TIOCM_DTR | TIOCM_RTS);
 }
 
 /*
···
 		goto out;
 
 	if (!tty_io_error(tty)) {
-		if (uport->rs485.flags & SER_RS485_ENABLED) {
-			set &= ~TIOCM_RTS;
-			clear &= ~TIOCM_RTS;
-		}
-
 		uart_update_mctrl(uport, set, clear);
 		ret = 0;
 	}
···
 	 */
 	spin_lock_irqsave(&port->lock, flags);
 	port->mctrl &= TIOCM_DTR;
+	if (port->rs485.flags & SER_RS485_ENABLED &&
+	    !(port->rs485.flags & SER_RS485_RTS_AFTER_SEND))
+		port->mctrl |= TIOCM_RTS;
 	port->ops->set_mctrl(port, port->mctrl);
 	spin_unlock_irqrestore(&port->lock, flags);
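The serial_core change moves the RS485 special-casing into uart_update_mctrl() itself: whenever RS485 is enabled, RTS is owned by the driver for transmit-direction control, so it is masked out of both the set and clear requests regardless of who asks. A user-space sketch of that masking (TIOCM/SER_RS485 constants copied from the Linux UAPI headers, helper name hypothetical):

```c
#include <assert.h>

#define TIOCM_DTR		0x002
#define TIOCM_RTS		0x004
#define SER_RS485_ENABLED	(1 << 0)

/*
 * Mirror the patched uart_update_mctrl(): with RS485 enabled the RTS
 * bit may be neither set nor cleared by a modem-control request, since
 * the driver drives it for transmit direction.
 */
unsigned int update_mctrl(unsigned int mctrl, unsigned long rs485_flags,
			  unsigned int set, unsigned int clear)
{
	if (rs485_flags & SER_RS485_ENABLED) {
		set &= ~TIOCM_RTS;
		clear &= ~TIOCM_RTS;
	}
	return (mctrl & ~clear) | set;
}
```

Centralizing the mask here is what lets uart_port_dtr_rts() shrink back to the plain set/clear pair in the hunk above.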
+13-1
drivers/tty/serial/stm32-usart.c
···
 	struct stm32_port *stm32_port = to_stm32_port(port);
 	const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
 	struct circ_buf *xmit = &port->state->xmit;
+	u32 isr;
+	int ret;
 
 	if (port->x_char) {
 		if (stm32_usart_tx_dma_started(stm32_port) &&
 		    stm32_usart_tx_dma_enabled(stm32_port))
 			stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_DMAT);
+
+		/* Check that TDR is empty before filling FIFO */
+		ret =
+		readl_relaxed_poll_timeout_atomic(port->membase + ofs->isr,
+						  isr,
+						  (isr & USART_SR_TXE),
+						  10, 1000);
+		if (ret)
+			dev_warn(port->dev, "1 character may be erased\n");
+
 		writel_relaxed(port->x_char, port->membase + ofs->tdr);
 		port->x_char = 0;
 		port->icount.tx++;
···
 	struct serial_rs485 *rs485conf = &port->rs485;
 	struct circ_buf *xmit = &port->state->xmit;
 
-	if (uart_circ_empty(xmit))
+	if (uart_circ_empty(xmit) && !port->x_char)
 		return;
 
 	if (rs485conf->flags & SER_RS485_ENABLED) {
+3-3
drivers/usb/cdns3/drd.c
···
 /* Indicate the cdns3 core was power lost before */
 bool cdns_power_is_lost(struct cdns *cdns)
 {
-	if (cdns->version == CDNS3_CONTROLLER_V1) {
-		if (!(readl(&cdns->otg_v1_regs->simulate) & BIT(0)))
+	if (cdns->version == CDNS3_CONTROLLER_V0) {
+		if (!(readl(&cdns->otg_v0_regs->simulate) & BIT(0)))
 			return true;
 	} else {
-		if (!(readl(&cdns->otg_v0_regs->simulate) & BIT(0)))
+		if (!(readl(&cdns->otg_v1_regs->simulate) & BIT(0)))
 			return true;
 	}
 	return false;
+5-2
drivers/usb/common/ulpi.c
···
 	struct ulpi *ulpi = to_ulpi_dev(dev);
 	const struct ulpi_device_id *id;
 
-	/* Some ULPI devices don't have a vendor id so rely on OF match */
-	if (ulpi->id.vendor == 0)
+	/*
+	 * Some ULPI devices don't have a vendor id
+	 * or provide an id_table so rely on OF match.
+	 */
+	if (ulpi->id.vendor == 0 || !drv->id_table)
 		return of_driver_match_device(dev, driver);
 
 	for (id = drv->id_table; id->vendor; id++)
+14
drivers/usb/core/hcd.c
···
 	urb->hcpriv = NULL;
 	INIT_LIST_HEAD(&urb->urb_list);
 	atomic_dec(&urb->use_count);
+	/*
+	 * Order the write of urb->use_count above before the read
+	 * of urb->reject below.  Pairs with the memory barriers in
+	 * usb_kill_urb() and usb_poison_urb().
+	 */
+	smp_mb__after_atomic();
+
 	atomic_dec(&urb->dev->urbnum);
 	if (atomic_read(&urb->reject))
 		wake_up(&usb_kill_urb_queue);
···
 
 	usb_anchor_resume_wakeups(anchor);
 	atomic_dec(&urb->use_count);
+	/*
+	 * Order the write of urb->use_count above before the read
+	 * of urb->reject below.  Pairs with the memory barriers in
+	 * usb_kill_urb() and usb_poison_urb().
+	 */
+	smp_mb__after_atomic();
+
 	if (unlikely(atomic_read(&urb->reject)))
 		wake_up(&usb_kill_urb_queue);
 	usb_put_urb(urb);
+12
drivers/usb/core/urb.c
···
 	if (!(urb && urb->dev && urb->ep))
 		return;
 	atomic_inc(&urb->reject);
+	/*
+	 * Order the write of urb->reject above before the read
+	 * of urb->use_count below.  Pairs with the barriers in
+	 * __usb_hcd_giveback_urb() and usb_hcd_submit_urb().
+	 */
+	smp_mb__after_atomic();
 
 	usb_hcd_unlink_urb(urb, -ENOENT);
 	wait_event(usb_kill_urb_queue, atomic_read(&urb->use_count) == 0);
···
 	if (!urb)
 		return;
 	atomic_inc(&urb->reject);
+	/*
+	 * Order the write of urb->reject above before the read
+	 * of urb->use_count below.  Pairs with the barriers in
+	 * __usb_hcd_giveback_urb() and usb_hcd_submit_urb().
+	 */
+	smp_mb__after_atomic();
 
 	if (!urb->dev || !urb->ep)
 		return;
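The hcd.c and urb.c hunks form one classic store/fence/load pairing: the killer writes `reject`, fences, then reads `use_count`; the giveback path writes `use_count`, fences, then reads `reject`. With both fences in place at least one side must observe the other's update, so usb_kill_urb() cannot sleep forever on a wakeup that was never sent. A user-space model of the pattern using C11 atomics (names are illustrative, not the USB API; seq_cst fences stand in for the kernel's smp_mb__after_atomic()):

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static atomic_int use_count;
static atomic_int reject;
static atomic_bool woken;

/* Giveback side: drop use_count, fence, then check for a waiting killer. */
static void *giveback(void *arg)
{
	(void)arg;
	atomic_fetch_sub(&use_count, 1);		/* completion drops use_count */
	atomic_thread_fence(memory_order_seq_cst);	/* pairs with the killer's fence */
	if (atomic_load(&reject))
		atomic_store(&woken, true);		/* stand-in for wake_up() */
	return NULL;
}

/* Killer side: raise reject, fence, then wait for use_count to drain. */
int kill_urb_demo(void)
{
	pthread_t t;

	atomic_store(&use_count, 1);			/* URB in flight */
	atomic_store(&reject, 0);
	atomic_store(&woken, false);

	atomic_fetch_add(&reject, 1);
	atomic_thread_fence(memory_order_seq_cst);

	pthread_create(&t, NULL, giveback, NULL);
	pthread_join(&t, NULL);

	/* The wait condition is satisfied and the wakeup was issued. */
	return atomic_load(&use_count) == 0 && atomic_load(&woken);
}
```

Without the fences, each side could read the other's flag before its own write became visible, which is exactly the lost-wakeup the kernel patch closes.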
+1-1
drivers/usb/dwc2/gadget.c
···
 	hsotg->gadget.speed = USB_SPEED_UNKNOWN;
 	spin_unlock_irqrestore(&hsotg->lock, flags);
 
-	for (ep = 0; ep < hsotg->num_of_eps; ep++) {
+	for (ep = 1; ep < hsotg->num_of_eps; ep++) {
 		if (hsotg->eps_in[ep])
 			dwc2_hsotg_ep_disable_lock(&hsotg->eps_in[ep]->ep);
 		if (hsotg->eps_out[ep])
+18-5
drivers/usb/dwc3/dwc3-xilinx.c
···
 	int ret;
 	u32 reg;
 
-	usb3_phy = devm_phy_get(dev, "usb3-phy");
-	if (PTR_ERR(usb3_phy) == -EPROBE_DEFER) {
-		ret = -EPROBE_DEFER;
+	usb3_phy = devm_phy_optional_get(dev, "usb3-phy");
+	if (IS_ERR(usb3_phy)) {
+		ret = PTR_ERR(usb3_phy);
+		dev_err_probe(dev, ret,
+			      "failed to get USB3 PHY\n");
 		goto err;
-	} else if (IS_ERR(usb3_phy)) {
-		usb3_phy = NULL;
 	}
+
+	/*
+	 * The following core resets are not required unless a USB3 PHY
+	 * is used, and the subsequent register settings are not required
+	 * unless a core reset is performed (they should be set properly
+	 * by the first-stage boot loader, but may be reverted by a core
+	 * reset).  They may also break the configuration if USB3 is actually
+	 * in use but the usb3-phy entry is missing from the device tree.
+	 * Therefore, skip these operations in this case.
+	 */
+	if (!usb3_phy)
+		goto skip_usb3_phy;
 
 	crst = devm_reset_control_get_exclusive(dev, "usb_crst");
 	if (IS_ERR(crst)) {
···
 		goto err;
 	}
 
+skip_usb3_phy:
 	/*
 	 * This routes the USB DMA traffic to go through FPD path instead
 	 * of reaching DDR directly. This traffic routing is needed to
+1
drivers/usb/gadget/function/f_sourcesink.c
···
 
 	if (is_iso) {
 		switch (speed) {
+		case USB_SPEED_SUPER_PLUS:
 		case USB_SPEED_SUPER:
 			size = ss->isoc_maxpacket *
 				(ss->isoc_mult + 1) *
drivers/usb/host/xhci-plat.c
···
 	struct xhci_hcd *xhci = hcd_to_xhci(hcd);
 	int ret;
 
+	if (pm_runtime_suspended(dev))
+		pm_runtime_resume(dev);
+
 	ret = xhci_priv_suspend_quirk(hcd);
 	if (ret)
 		return ret;
+10
drivers/usb/storage/unusual_devs.h
···
 		USB_SC_DEVICE, USB_PR_DEVICE, usb_stor_euscsi_init,
 		US_FL_SCM_MULT_TARG ),
 
+/*
+ * Reported by DocMAX <mail@vacharakis.de>
+ * and Thomas Weißschuh <linux@weissschuh.net>
+ */
+UNUSUAL_DEV( 0x2109, 0x0715, 0x9999, 0x9999,
+		"VIA Labs, Inc.",
+		"VL817 SATA Bridge",
+		USB_SC_DEVICE, USB_PR_DEVICE, NULL,
+		US_FL_IGNORE_UAS),
+
 UNUSUAL_DEV( 0x2116, 0x0320, 0x0001, 0x0001,
 		"ST",
 		"2A",
+7-1
drivers/usb/typec/port-mapper.c
···
 {
 	struct each_port_arg arg = { .port = con, .match = NULL };
 
+	if (!has_acpi_companion(&con->dev))
+		return 0;
+
 	bus_for_each_dev(&acpi_bus_type, NULL, &arg, typec_port_match);
+	if (!arg.match)
+		return 0;
 
 	/*
 	 * REVISIT: Now each connector can have only a single component master.
···
 
 void typec_unlink_ports(struct typec_port *con)
 {
-	component_master_del(&con->dev, &typec_aggregate_ops);
+	if (has_acpi_companion(&con->dev))
+		component_master_del(&con->dev, &typec_aggregate_ops);
 }
+26
drivers/usb/typec/tcpm/tcpci.c
···
 static int tcpci_set_cc(struct tcpc_dev *tcpc, enum typec_cc_status cc)
 {
 	struct tcpci *tcpci = tcpc_to_tcpci(tcpc);
+	bool vconn_pres;
+	enum typec_cc_polarity polarity = TYPEC_POLARITY_CC1;
 	unsigned int reg;
 	int ret;
+
+	ret = regmap_read(tcpci->regmap, TCPC_POWER_STATUS, &reg);
+	if (ret < 0)
+		return ret;
+
+	vconn_pres = !!(reg & TCPC_POWER_STATUS_VCONN_PRES);
+	if (vconn_pres) {
+		ret = regmap_read(tcpci->regmap, TCPC_TCPC_CTRL, &reg);
+		if (ret < 0)
+			return ret;
+
+		if (reg & TCPC_TCPC_CTRL_ORIENTATION)
+			polarity = TYPEC_POLARITY_CC2;
+	}
 
 	switch (cc) {
 	case TYPEC_CC_RA:
···
 		reg = (TCPC_ROLE_CTRL_CC_OPEN << TCPC_ROLE_CTRL_CC1_SHIFT) |
 			(TCPC_ROLE_CTRL_CC_OPEN << TCPC_ROLE_CTRL_CC2_SHIFT);
 		break;
 	}
+
+	if (vconn_pres) {
+		if (polarity == TYPEC_POLARITY_CC2) {
+			reg &= ~(TCPC_ROLE_CTRL_CC1_MASK << TCPC_ROLE_CTRL_CC1_SHIFT);
+			reg |= (TCPC_ROLE_CTRL_CC_OPEN << TCPC_ROLE_CTRL_CC1_SHIFT);
+		} else {
+			reg &= ~(TCPC_ROLE_CTRL_CC2_MASK << TCPC_ROLE_CTRL_CC2_SHIFT);
+			reg |= (TCPC_ROLE_CTRL_CC_OPEN << TCPC_ROLE_CTRL_CC2_SHIFT);
+		}
+	}
 
 	ret = regmap_write(tcpci->regmap, TCPC_ROLE_CTRL, reg);
drivers/usb/typec/tcpm/tcpm.c
···
 	case SNK_TRYWAIT_DEBOUNCE:
 		break;
 	case SNK_ATTACH_WAIT:
-		tcpm_set_state(port, SNK_UNATTACHED, 0);
+	case SNK_DEBOUNCED:
+		/* Do nothing, as TCPM is still waiting for vbus to reaach VSAFE5V to connect */
 		break;
 
 	case SNK_NEGOTIATE_CAPABILITIES:
···
 	case PR_SWAP_SNK_SRC_SINK_OFF:
 	case PR_SWAP_SNK_SRC_SOURCE_ON:
 		/* Do nothing, vsafe0v is expected during transition */
+		break;
+	case SNK_ATTACH_WAIT:
+	case SNK_DEBOUNCED:
+		/*Do nothing, still waiting for VSAFE5V for connect */
 		break;
 	default:
 		if (port->pwr_role == TYPEC_SINK && port->auto_vbus_discharge_enabled)
+1-1
drivers/usb/typec/ucsi/ucsi_ccg.c
···
 	if (status < 0)
 		return status;
 
-	if (!data)
+	if (!(data & DEV_INT))
 		return 0;
 
 	status = ccg_write(uc, CCGX_RAB_INTR_REG, &data, sizeof(data));
+3-13
drivers/video/fbdev/hyperv_fb.c
···
 
 static uint screen_width = HVFB_WIDTH;
 static uint screen_height = HVFB_HEIGHT;
-static uint screen_width_max = HVFB_WIDTH;
-static uint screen_height_max = HVFB_HEIGHT;
 static uint screen_depth;
 static uint screen_fb_size;
 static uint dio_fb_size; /* FB size for deferred IO */
···
 	int ret = 0;
 	unsigned long t;
 	u8 index;
-	int i;
 
 	memset(msg, 0, sizeof(struct synthvid_msg));
 	msg->vid_hdr.type = SYNTHVID_RESOLUTION_REQUEST;
···
 		pr_err("Invalid resolution index: %d\n", index);
 		ret = -ENODEV;
 		goto out;
-	}
-
-	for (i = 0; i < msg->resolution_resp.resolution_count; i++) {
-		screen_width_max = max_t(unsigned int, screen_width_max,
-			msg->resolution_resp.supported_resolution[i].width);
-		screen_height_max = max_t(unsigned int, screen_height_max,
-			msg->resolution_resp.supported_resolution[i].height);
 	}
 
 	screen_width =
···
 
 	if (x < HVFB_WIDTH_MIN || y < HVFB_HEIGHT_MIN ||
 	    (synthvid_ver_ge(par->synthvid_version, SYNTHVID_VERSION_WIN10) &&
-	     (x > screen_width_max || y > screen_height_max)) ||
+	     (x * y * screen_depth / 8 > screen_fb_size)) ||
 	    (par->synthvid_version == SYNTHVID_VERSION_WIN8 &&
 	     x * y * screen_depth / 8 > SYNTHVID_FB_SIZE_WIN8) ||
 	    (par->synthvid_version == SYNTHVID_VERSION_WIN7 &&
···
 	}
 
 	hvfb_get_option(info);
-	pr_info("Screen resolution: %dx%d, Color depth: %d\n",
-		screen_width, screen_height, screen_depth);
+	pr_info("Screen resolution: %dx%d, Color depth: %d, Frame buffer size: %d\n",
+		screen_width, screen_height, screen_depth, screen_fb_size);
 
 	ret = hvfb_getmem(hdev, info);
 	if (ret) {
+4-4
fs/binfmt_misc.c
···
 };
 MODULE_ALIAS_FS("binfmt_misc");
 
+static struct ctl_table_header *binfmt_misc_header;
+
 static int __init init_misc_binfmt(void)
 {
 	int err = register_filesystem(&bm_fs_type);
 	if (!err)
 		insert_binfmt(&misc_format);
-	if (!register_sysctl_mount_point("fs/binfmt_misc")) {
-		pr_warn("Failed to create fs/binfmt_misc sysctl mount point");
-		return -ENOMEM;
-	}
+	binfmt_misc_header = register_sysctl_mount_point("fs/binfmt_misc");
 	return 0;
 }
 
 static void __exit exit_misc_binfmt(void)
 {
+	unregister_sysctl_table(binfmt_misc_header);
 	unregister_binfmt(&misc_format);
 	unregister_filesystem(&bm_fs_type);
 }
+77-13
fs/btrfs/ioctl.c
···
 		goto next;
 
 	/*
+	 * Our start offset might be in the middle of an existing extent
+	 * map, so take that into account.
+	 */
+	range_len = em->len - (cur - em->start);
+	/*
+	 * If this range of the extent map is already flagged for delalloc,
+	 * skip it, because:
+	 *
+	 * 1) We could deadlock later, when trying to reserve space for
+	 *    delalloc, because in case we can't immediately reserve space
+	 *    the flusher can start delalloc and wait for the respective
+	 *    ordered extents to complete. The deadlock would happen
+	 *    because we do the space reservation while holding the range
+	 *    locked, and starting writeback, or finishing an ordered
+	 *    extent, requires locking the range;
+	 *
+	 * 2) If there's delalloc there, it means there's dirty pages for
+	 *    which writeback has not started yet (we clean the delalloc
+	 *    flag when starting writeback and after creating an ordered
+	 *    extent). If we mark pages in an adjacent range for defrag,
+	 *    then we will have a larger contiguous range for delalloc,
+	 *    very likely resulting in a larger extent after writeback is
+	 *    triggered (except in a case of free space fragmentation).
+	 */
+	if (test_range_bit(&inode->io_tree, cur, cur + range_len - 1,
+			   EXTENT_DELALLOC, 0, NULL))
+		goto next;
+
+	/*
 	 * For do_compress case, we want to compress all valid file
 	 * extents, thus no @extent_thresh or mergeable check.
 	 */
···
 		goto add;
 
 	/* Skip too large extent */
-	if (em->len >= extent_thresh)
+	if (range_len >= extent_thresh)
 		goto next;
 
 	next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em,
···
 	list_for_each_entry(entry, &target_list, list) {
 		u32 range_len = entry->len;
 
-		/* Reached the limit */
-		if (max_sectors && max_sectors == *sectors_defragged)
+		/* Reached or beyond the limit */
+		if (max_sectors && *sectors_defragged >= max_sectors) {
+			ret = 1;
 			break;
+		}
 
 		if (max_sectors)
 			range_len = min_t(u32, range_len,
···
 					extent_thresh, newer_than, do_compress);
 		if (ret < 0)
 			break;
-		*sectors_defragged += range_len;
+		*sectors_defragged += range_len >>
+				      inode->root->fs_info->sectorsize_bits;
 	}
 out:
 	list_for_each_entry_safe(entry, tmp, &target_list, list) {
···
  * @newer_than:	   minimum transid to defrag
  * @max_to_defrag: max number of sectors to be defragged, if 0, the whole inode
  *		   will be defragged.
+ *
+ * Return <0 for error.
+ * Return >=0 for the number of sectors defragged, and range->start will be updated
+ * to indicate the file offset where next defrag should be started at.
+ * (Mostly for autodefrag, which sets @max_to_defrag thus we may exit early without
+ * defragging all the range).
  */
 int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
 		      struct btrfs_ioctl_defrag_range_args *range,
···
 	int compress_type = BTRFS_COMPRESS_ZLIB;
 	int ret = 0;
 	u32 extent_thresh = range->extent_thresh;
+	pgoff_t start_index;
 
 	if (isize == 0)
 		return 0;
···
 
 	if (range->start + range->len > range->start) {
 		/* Got a specific range */
-		last_byte = min(isize, range->start + range->len) - 1;
+		last_byte = min(isize, range->start + range->len);
 	} else {
 		/* Defrag until file end */
-		last_byte = isize - 1;
+		last_byte = isize;
 	}
+
+	/* Align the range */
+	cur = round_down(range->start, fs_info->sectorsize);
+	last_byte = round_up(last_byte, fs_info->sectorsize) - 1;
 
 	/*
 	 * If we were not given a ra, allocate a readahead context. As
···
 		file_ra_state_init(ra, inode->i_mapping);
 	}
 
-	/* Align the range */
-	cur = round_down(range->start, fs_info->sectorsize);
-	last_byte = round_up(last_byte, fs_info->sectorsize) - 1;
+	/*
+	 * Make writeback start from the beginning of the range, so that the
+	 * defrag range can be written sequentially.
+	 */
+	start_index = cur >> PAGE_SHIFT;
+	if (start_index < inode->i_mapping->writeback_index)
+		inode->i_mapping->writeback_index = start_index;
 
 	while (cur < last_byte) {
+		const unsigned long prev_sectors_defragged = sectors_defragged;
 		u64 cluster_end;
 
 		/* The cluster size 256K should always be page aligned */
 		BUILD_BUG_ON(!IS_ALIGNED(CLUSTER_SIZE, PAGE_SIZE));
+
+		if (btrfs_defrag_cancelled(fs_info)) {
+			ret = -EAGAIN;
+			break;
+		}
 
 		/* We want the cluster end at page boundary when possible */
 		cluster_end = (((cur >> PAGE_SHIFT) +
···
 					      cluster_end + 1 - cur, extent_thresh,
 					      newer_than, do_compress,
 					      &sectors_defragged, max_to_defrag);
+
+		if (sectors_defragged > prev_sectors_defragged)
+			balance_dirty_pages_ratelimited(inode->i_mapping);
+
 		btrfs_inode_unlock(inode, 0);
 		if (ret < 0)
 			break;
 		cur = cluster_end + 1;
+		if (ret > 0) {
+			ret = 0;
+			break;
+		}
 	}
 
 	if (ra_allocated)
 		kfree(ra);
+	/*
+	 * Update range.start for autodefrag, this will indicate where to start
+	 * in next run.
+	 */
+	range->start = cur;
 	if (sectors_defragged) {
 		/*
 		 * We have defragged some sectors, for compression case they
···
 	btrfs_inode_lock(inode, 0);
 	err = btrfs_delete_subvolume(dir, dentry);
 	btrfs_inode_unlock(inode, 0);
-	if (!err) {
-		fsnotify_rmdir(dir, dentry);
-		d_delete(dentry);
-	}
+	if (!err)
+		d_delete_notify(dir, dentry);
 
 out_dput:
 	dput(dentry);
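The defrag accounting fix above has two parts: a per-range length measured in bytes must be converted to sectors (`range_len >> sectorsize_bits`) before being added to the quota, and the quota check must fire on "reached or beyond", returning 1 so the caller can stop early. A stand-alone sketch of that bookkeeping (illustrative helper, not the btrfs code itself):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Accumulate defragged ranges against a sector quota the way the
 * patched code does: range lengths are in bytes, the quota is in
 * sectors, so each range contributes range_len >> sectorsize_bits
 * sectors.  Returns 1 once the quota is reached or exceeded, else 0.
 */
int account_defrag(const uint32_t *range_lens, int nr,
		   unsigned int sectorsize_bits, unsigned long max_sectors,
		   unsigned long *sectors_defragged)
{
	for (int i = 0; i < nr; i++) {
		/* Reached or beyond the limit */
		if (max_sectors && *sectors_defragged >= max_sectors)
			return 1;
		*sectors_defragged += range_lens[i] >> sectorsize_bits;
	}
	return 0;
}
```

With a 4 KiB sector size (sectorsize_bits = 12), three 256 KiB ranges against a 100-sector quota stop before the third range with 128 sectors accounted; the old byte-based accounting would have blown past the quota by three orders of magnitude.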
+37-18
fs/ceph/caps.c
···22182218 struct ceph_mds_client *mdsc = ceph_sb_to_client(inode->i_sb)->mdsc;22192219 struct ceph_inode_info *ci = ceph_inode(inode);22202220 struct ceph_mds_request *req1 = NULL, *req2 = NULL;22212221+ unsigned int max_sessions;22212222 int ret, err = 0;2222222322232224 spin_lock(&ci->i_unsafe_lock);···22372236 spin_unlock(&ci->i_unsafe_lock);2238223722392238 /*22392239+ * The mdsc->max_sessions is unlikely to be changed22402240+ * mostly, here we will retry it by reallocating the22412241+ * sessions array memory to get rid of the mdsc->mutex22422242+ * lock.22432243+ */22442244+retry:22452245+ max_sessions = mdsc->max_sessions;22462246+22472247+ /*22402248 * Trigger to flush the journal logs in all the relevant MDSes22412249 * manually, or in the worst case we must wait at most 5 seconds22422250 * to wait the journal logs to be flushed by the MDSes periodically.22432251 */22442244- if (req1 || req2) {22522252+ if ((req1 || req2) && likely(max_sessions)) {22452253 struct ceph_mds_session **sessions = NULL;22462254 struct ceph_mds_session *s;22472255 struct ceph_mds_request *req;22482248- unsigned int max;22492256 int i;2250225722512251- /*22522252- * The mdsc->max_sessions is unlikely to be changed22532253- * mostly, here we will retry it by reallocating the22542254- * sessions arrary memory to get rid of the mdsc->mutex22552255- * lock.22562256- */22572257-retry:22582258- max = mdsc->max_sessions;22592259- sessions = krealloc(sessions, max * sizeof(s), __GFP_ZERO);22602260- if (!sessions)22612261- return -ENOMEM;22582258+ sessions = kzalloc(max_sessions * sizeof(s), GFP_KERNEL);22592259+ if (!sessions) {22602260+ err = -ENOMEM;22612261+ goto out;22622262+ }2262226322632264 spin_lock(&ci->i_unsafe_lock);22642265 if (req1) {22652266 list_for_each_entry(req, &ci->i_unsafe_dirops,22662267 r_unsafe_dir_item) {22672268 s = req->r_session;22682268- if (unlikely(s->s_mds >= max)) {22692269+ if (unlikely(s->s_mds >= max_sessions)) {22692270 
spin_unlock(&ci->i_unsafe_lock);22712271+ for (i = 0; i < max_sessions; i++) {22722272+ s = sessions[i];22732273+ if (s)22742274+ ceph_put_mds_session(s);22752275+ }22762276+ kfree(sessions);22702277 goto retry;22712278 }22722279 if (!sessions[s->s_mds]) {···22872278 list_for_each_entry(req, &ci->i_unsafe_iops,22882279 r_unsafe_target_item) {22892280 s = req->r_session;22902290- if (unlikely(s->s_mds >= max)) {22812281+ if (unlikely(s->s_mds >= max_sessions)) {22912282 spin_unlock(&ci->i_unsafe_lock);22832283+ for (i = 0; i < max_sessions; i++) {22842284+ s = sessions[i];22852285+ if (s)22862286+ ceph_put_mds_session(s);22872287+ }22882288+ kfree(sessions);22922289 goto retry;22932290 }22942291 if (!sessions[s->s_mds]) {···23152300 spin_unlock(&ci->i_ceph_lock);2316230123172302 /* send flush mdlog request to MDSes */23182318- for (i = 0; i < max; i++) {23032303+ for (i = 0; i < max_sessions; i++) {23192304 s = sessions[i];23202305 if (s) {23212306 send_flush_mdlog(s);···23322317 ceph_timeout_jiffies(req1->r_timeout));23332318 if (ret)23342319 err = -EIO;23352335- ceph_mdsc_put_request(req1);23362320 }23372321 if (req2) {23382322 ret = !wait_for_completion_timeout(&req2->r_safe_completion,23392323 ceph_timeout_jiffies(req2->r_timeout));23402324 if (ret)23412325 err = -EIO;23422342- ceph_mdsc_put_request(req2);23432326 }23272327+23282328+out:23292329+ if (req1)23302330+ ceph_mdsc_put_request(req1);23312331+ if (req2)23322332+ ceph_mdsc_put_request(req2);23442333 return err;23452334}23462335
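The ceph change above drops the in-loop krealloc() in favor of snapshotting max_sessions once, allocating with kzalloc(), and freeing everything before retrying when an MDS index exceeds the snapshot. A minimal user-space sketch of that snapshot-allocate-retry pattern (current_max() and alloc_for_index() are hypothetical stand-ins, plain calloc() instead of the kernel allocators):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for mdsc->max_sessions, which may grow between the
 * snapshot and the indexing pass (here it grows from 4 to 8 on the retry). */
static size_t current_max(void)
{
    static size_t calls;
    return calls++ ? 8 : 4;
}

/* Snapshot the size, allocate a zeroed array, and retry from scratch if an
 * index larger than the snapshot shows up -- the same shape as the
 * "s->s_mds >= max_sessions" retry in the hunk above. */
static void **alloc_for_index(size_t need_idx, size_t *out_len)
{
    for (;;) {
        size_t max = current_max();        /* like max_sessions = mdsc->max_sessions */
        void **arr = calloc(max, sizeof(*arr));
        if (!arr)
            return NULL;
        if (need_idx < max) {              /* fits: keep this allocation */
            *out_len = max;
            return arr;
        }
        free(arr);                         /* too small: drop it all and retry */
    }
}
```

The first snapshot (4) is too small for index 6, so the loop frees the array and succeeds on the second pass with 8 slots.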
···12511251{12521252 struct ocfs2_group_desc *bg = (struct ocfs2_group_desc *) bg_bh->b_data;12531253 struct journal_head *jh;12541254- int ret = 1;12541254+ int ret;1255125512561256 if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap))12571257 return 0;1258125812591259- if (!buffer_jbd(bg_bh))12591259+ jh = jbd2_journal_grab_journal_head(bg_bh);12601260+ if (!jh)12601261 return 1;1261126212621262- jbd_lock_bh_journal_head(bg_bh);12631263- if (buffer_jbd(bg_bh)) {12641264- jh = bh2jh(bg_bh);12651265- spin_lock(&jh->b_state_lock);12661266- bg = (struct ocfs2_group_desc *) jh->b_committed_data;12671267- if (bg)12681268- ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap);12691269- else12701270- ret = 1;12711271- spin_unlock(&jh->b_state_lock);12721272- }12731273- jbd_unlock_bh_journal_head(bg_bh);12631263+ spin_lock(&jh->b_state_lock);12641264+ bg = (struct ocfs2_group_desc *) jh->b_committed_data;12651265+ if (bg)12661266+ ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap);12671267+ else12681268+ ret = 1;12691269+ spin_unlock(&jh->b_state_lock);12701270+ jbd2_journal_put_journal_head(jh);1274127112751272 return ret;12761273}
+4-5
fs/udf/inode.c
···258258 char *kaddr;259259 struct udf_inode_info *iinfo = UDF_I(inode);260260 int err;261261- struct writeback_control udf_wbc = {262262- .sync_mode = WB_SYNC_NONE,263263- .nr_to_write = 1,264264- };265261266262 WARN_ON_ONCE(!inode_is_locked(inode));267263 if (!iinfo->i_lenAlloc) {···301305 iinfo->i_alloc_type = ICBTAG_FLAG_AD_LONG;302306 /* from now on we have normal address_space methods */303307 inode->i_data.a_ops = &udf_aops;308308+ set_page_dirty(page);309309+ unlock_page(page);304310 up_write(&iinfo->i_data_sem);305305- err = inode->i_data.a_ops->writepage(page, &udf_wbc);311311+ err = filemap_fdatawrite(inode->i_mapping);306312 if (err) {307313 /* Restore everything back so that we don't lose data... */308314 lock_page(page);···315317 unlock_page(page);316318 iinfo->i_alloc_type = ICBTAG_FLAG_AD_IN_ICB;317319 inode->i_data.a_ops = &udf_adinicb_aops;320320+ iinfo->i_lenAlloc = inode->i_size;318321 up_write(&iinfo->i_data_sem);319322 }320323 put_page(page);
+1
include/linux/blkdev.h
···12581258void disk_end_io_acct(struct gendisk *disk, unsigned int op,12591259 unsigned long start_time);1260126012611261+void bio_start_io_acct_time(struct bio *bio, unsigned long start_time);12611262unsigned long bio_start_io_acct(struct bio *bio);12621263void bio_end_io_acct_remapped(struct bio *bio, unsigned long start_time,12631264 struct block_device *orig_bdev);
···225225}226226227227/*228228+ * fsnotify_delete - @dentry was unlinked and unhashed229229+ *230230+ * Caller must make sure that dentry->d_name is stable.231231+ *232232+ * Note: unlike fsnotify_unlink(), we have to pass also the unlinked inode233233+ * as this may be called after d_delete() and old_dentry may be negative.234234+ */235235+static inline void fsnotify_delete(struct inode *dir, struct inode *inode,236236+ struct dentry *dentry)237237+{238238+ __u32 mask = FS_DELETE;239239+240240+ if (S_ISDIR(inode->i_mode))241241+ mask |= FS_ISDIR;242242+243243+ fsnotify_name(mask, inode, FSNOTIFY_EVENT_INODE, dir, &dentry->d_name,244244+ 0);245245+}246246+247247+/**248248+ * d_delete_notify - delete a dentry and call fsnotify_delete()249249+ * @dentry: The dentry to delete250250+ *251251+ * This helper is used to guaranty that the unlinked inode cannot be found252252+ * by lookup of this name after fsnotify_delete() event has been delivered.253253+ */254254+static inline void d_delete_notify(struct inode *dir, struct dentry *dentry)255255+{256256+ struct inode *inode = d_inode(dentry);257257+258258+ ihold(inode);259259+ d_delete(dentry);260260+ fsnotify_delete(dir, inode, dentry);261261+ iput(inode);262262+}263263+264264+/*228265 * fsnotify_unlink - 'name' was unlinked229266 *230267 * Caller must make sure that dentry->d_name is stable.231268 */232269static inline void fsnotify_unlink(struct inode *dir, struct dentry *dentry)233270{234234- /* Expected to be called before d_delete() */235235- WARN_ON_ONCE(d_is_negative(dentry));271271+ if (WARN_ON_ONCE(d_is_negative(dentry)))272272+ return;236273237237- fsnotify_dirent(dir, dentry, FS_DELETE);274274+ fsnotify_delete(dir, d_inode(dentry), dentry);238275}239276240277/*···295258 */296259static inline void fsnotify_rmdir(struct inode *dir, struct dentry *dentry)297260{298298- /* Expected to be called before d_delete() */299299- WARN_ON_ONCE(d_is_negative(dentry));261261+ if (WARN_ON_ONCE(d_is_negative(dentry)))262262+ 
return;300263301301- fsnotify_dirent(dir, dentry, FS_DELETE | FS_ISDIR);264264+ fsnotify_delete(dir, d_inode(dentry), dentry);302265}303266304267/*
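The new d_delete_notify() helper pins the inode with ihold() so the FS_DELETE event can still reference it after d_delete() has dropped the dentry's hold. A toy user-space model of that ordering (toy_inode and these helpers are illustrations only, not the kernel API):

```c
#include <assert.h>

/* Toy inode with a reference count; "alive" stands in for the inode still
 * being reachable by the event handler. */
struct toy_inode { int refcount; int alive; };

static void ihold(struct toy_inode *i) { i->refcount++; }
static void iput(struct toy_inode *i) { if (--i->refcount == 0) i->alive = 0; }

static int event_saw_live_inode;

static void fsnotify_delete_event(struct toy_inode *inode)
{
    /* The event handler must still be able to look at the inode. */
    event_saw_live_inode = inode->alive;
}

/* Mirrors the shape of d_delete_notify(): pin the inode, drop the dentry's
 * reference (simulated by iput), deliver the event, then release the pin. */
static void toy_d_delete_notify(struct toy_inode *inode)
{
    ihold(inode);
    iput(inode);                   /* d_delete() may drop the last dentry ref */
    fsnotify_delete_event(inode);  /* inode is still pinned here */
    iput(inode);
}
```

Without the ihold()/iput() pair, the event could run against an already-freed inode when the dentry held the last reference.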
···693693 u64 total_time_running;694694 u64 tstamp;695695696696- /*697697- * timestamp shadows the actual context timing but it can698698- * be safely used in NMI interrupt context. It reflects the699699- * context time as it was when the event was last scheduled in,700700- * or when ctx_sched_in failed to schedule the event because we701701- * run out of PMC.702702- *703703- * ctx_time already accounts for ctx->timestamp. Therefore to704704- * compute ctx_time for a sample, simply add perf_clock().705705- */706706- u64 shadow_ctx_time;707707-708696 struct perf_event_attr attr;709697 u16 header_size;710698 u16 id_header_size;···840852 */841853 u64 time;842854 u64 timestamp;855855+ u64 timeoffset;843856844857 /*845858 * These fields let us detect when two contexts have both···923934struct perf_cgroup_info {924935 u64 time;925936 u64 timestamp;937937+ u64 timeoffset;938938+ int active;926939};927940928941struct perf_cgroup {
···141141 * events to one per window142142 */143143 u64 last_event_time;144144-145145- /* Refcounting to prevent premature destruction */146146- struct kref refcount;147144};148145149146struct psi_group {
+1-1
include/linux/quota.h
···9191 *9292 * When there is no mapping defined for the user-namespace, type,9393 * qid tuple an invalid kqid is returned. Callers are expected to9494- * test for and handle handle invalid kqids being returned.9494+ * test for and handle invalid kqids being returned.9595 * Invalid kqids may be tested for using qid_valid().9696 */9797static inline struct kqid make_kqid(struct user_namespace *from,
-4
include/linux/sched.h
···619619 * task has to wait for a replenishment to be performed at the620620 * next firing of dl_timer.621621 *622622- * @dl_boosted tells if we are boosted due to DI. If so we are623623- * outside bandwidth enforcement mechanism (but only until we624624- * exit the critical section);625625- *626622 * @dl_yielded tells if task gave up the CPU before consuming627623 * all its available runtime during the last job.628624 *
···525525{526526 struct iphdr *iph = ip_hdr(skb);527527528528+ /* We had many attacks based on IPID, use the private529529+ * generator as much as we can.530530+ */531531+ if (sk && inet_sk(sk)->inet_daddr) {532532+ iph->id = htons(inet_sk(sk)->inet_id);533533+ inet_sk(sk)->inet_id += segs;534534+ return;535535+ }528536 if ((iph->frag_off & htons(IP_DF)) && !skb->ignore_df) {529529- /* This is only to work around buggy Windows95/2000530530- * VJ compression implementations. If the ID field531531- * does not change, they drop every other packet in532532- * a TCP stream using header compression.533533- */534534- if (sk && inet_sk(sk)->inet_daddr) {535535- iph->id = htons(inet_sk(sk)->inet_id);536536- inet_sk(sk)->inet_id += segs;537537- } else {538538- iph->id = 0;539539- }537537+ iph->id = 0;540538 } else {539539+ /* Unfortunately we need the big hammer to get a suitable IPID */541540 __ip_select_ident(net, iph, segs);542541 }543542}
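The reordered ip_select_ident_segs() above now prefers the per-socket inet_id generator whenever a destination is set, advancing the 16-bit id by the segment count so each segment gets a distinct value. The arithmetic in isolation (next_ipid() is a hypothetical helper; host byte order, no htons()):

```c
#include <assert.h>
#include <stdint.h>

/* Per-socket IPID generation as in the hunk above: hand out the current id
 * and advance by the number of segments, wrapping naturally at 16 bits
 * (inet_id is a u16 field in the kernel; only the arithmetic is modeled). */
static uint16_t next_ipid(uint16_t *inet_id, unsigned int segs)
{
    uint16_t id = *inet_id;                    /* value placed in the header */
    *inet_id = (uint16_t)(*inet_id + segs);    /* reserve segs ids */
    return id;
}
```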
+1-1
include/net/ip6_fib.h
···282282 fn = rcu_dereference(f6i->fib6_node);283283284284 if (fn) {285285- *cookie = fn->fn_sernum;285285+ *cookie = READ_ONCE(fn->fn_sernum);286286 /* pairs with smp_wmb() in __fib6_update_sernum_upto_root() */287287 smp_rmb();288288 status = true;
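The fix above wraps the fn_sernum read in READ_ONCE() to pair with the writer's WRITE_ONCE(). As a sketch, the macros can be modeled in user space with a volatile cast (GCC/Clang __typeof__ assumed; the kernel's real definitions in rwonce.h cover more cases):

```c
#include <assert.h>

/* Minimal user-space model of the kernel's READ_ONCE()/WRITE_ONCE(): the
 * volatile cast forces the compiler to emit exactly one plain load or store,
 * which is what the fn_sernum annotation above relies on. */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile const __typeof__(x) *)&(x))
```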
···472472 u32, size, u64, flags)473473{474474 struct pt_regs *regs;475475- long res;475475+ long res = -EINVAL;476476477477 if (!try_get_task_stack(task))478478 return -EFAULT;479479480480 regs = task_pt_regs(task);481481- res = __bpf_get_stack(regs, task, NULL, buf, size, flags);481481+ if (regs)482482+ res = __bpf_get_stack(regs, task, NULL, buf, size, flags);482483 put_task_stack(task);483484484485 return res;
+8-3
kernel/cgroup/cgroup.c
···36433643 cgroup_get(cgrp);36443644 cgroup_kn_unlock(of->kn);3645364536463646+ /* Allow only one trigger per file descriptor */36473647+ if (ctx->psi.trigger) {36483648+ cgroup_put(cgrp);36493649+ return -EBUSY;36503650+ }36513651+36463652 psi = cgroup_ino(cgrp) == 1 ? &psi_system : &cgrp->psi;36473653 new = psi_trigger_create(psi, buf, nbytes, res);36483654 if (IS_ERR(new)) {···36563650 return PTR_ERR(new);36573651 }3658365236593659- psi_trigger_replace(&ctx->psi.trigger, new);36603660-36533653+ smp_store_release(&ctx->psi.trigger, new);36613654 cgroup_put(cgrp);3662365536633656 return nbytes;···36953690{36963691 struct cgroup_file_ctx *ctx = of->priv;3697369236983698- psi_trigger_replace(&ctx->psi.trigger, NULL);36933693+ psi_trigger_destroy(ctx->psi.trigger);36993694}3700369537013696bool cgroup_psi_enabled(void)
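The cgroup change above rejects a second psi trigger write on the same file descriptor with EBUSY, matching the psi.rst update: one trigger per open(). The policy in isolation (file_ctx and write_trigger() are sketches of cgroup_file_ctx and the write path, not the kernel code):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Per-open-file context, mirroring ctx->psi.trigger in the hunk above. */
struct file_ctx { void *trigger; };

/* A write that would install a psi trigger fails with EBUSY if this file
 * descriptor already carries one -- a second trigger needs a second open().
 * (psi_trigger_create() and the locking are not modeled.) */
static int write_trigger(struct file_ctx *ctx, void *new_trigger)
{
    if (ctx->trigger)
        return -EBUSY;
    ctx->trigger = new_trigger;
    return 0;
}
```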
+167-106
kernel/events/core.c
···674674 WRITE_ONCE(event->state, state);675675}676676677677+/*678678+ * UP store-release, load-acquire679679+ */680680+681681+#define __store_release(ptr, val) \682682+do { \683683+ barrier(); \684684+ WRITE_ONCE(*(ptr), (val)); \685685+} while (0)686686+687687+#define __load_acquire(ptr) \688688+({ \689689+ __unqual_scalar_typeof(*(ptr)) ___p = READ_ONCE(*(ptr)); \690690+ barrier(); \691691+ ___p; \692692+})693693+677694#ifdef CONFIG_CGROUP_PERF678695679696static inline bool···736719 return t->time;737720}738721739739-static inline void __update_cgrp_time(struct perf_cgroup *cgrp)722722+static inline u64 perf_cgroup_event_time_now(struct perf_event *event, u64 now)740723{741741- struct perf_cgroup_info *info;742742- u64 now;724724+ struct perf_cgroup_info *t;743725744744- now = perf_clock();745745-746746- info = this_cpu_ptr(cgrp->info);747747-748748- info->time += now - info->timestamp;749749- info->timestamp = now;726726+ t = per_cpu_ptr(event->cgrp->info, event->cpu);727727+ if (!__load_acquire(&t->active))728728+ return t->time;729729+ now += READ_ONCE(t->timeoffset);730730+ return now;750731}751732752752-static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx)733733+static inline void __update_cgrp_time(struct perf_cgroup_info *info, u64 now, bool adv)734734+{735735+ if (adv)736736+ info->time += now - info->timestamp;737737+ info->timestamp = now;738738+ /*739739+ * see update_context_time()740740+ */741741+ WRITE_ONCE(info->timeoffset, info->time - info->timestamp);742742+}743743+744744+static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx, bool final)753745{754746 struct perf_cgroup *cgrp = cpuctx->cgrp;755747 struct cgroup_subsys_state *css;748748+ struct perf_cgroup_info *info;756749757750 if (cgrp) {751751+ u64 now = perf_clock();752752+758753 for (css = &cgrp->css; css; css = css->parent) {759754 cgrp = container_of(css, struct perf_cgroup, css);760760- __update_cgrp_time(cgrp);755755+ info = 
this_cpu_ptr(cgrp->info);756756+757757+ __update_cgrp_time(info, now, true);758758+ if (final)759759+ __store_release(&info->active, 0);761760 }762761 }763762}764763765764static inline void update_cgrp_time_from_event(struct perf_event *event)766765{766766+ struct perf_cgroup_info *info;767767 struct perf_cgroup *cgrp;768768769769 /*···794760 /*795761 * Do not update time when cgroup is not active796762 */797797- if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup))798798- __update_cgrp_time(event->cgrp);763763+ if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup)) {764764+ info = this_cpu_ptr(event->cgrp->info);765765+ __update_cgrp_time(info, perf_clock(), true);766766+ }799767}800768801769static inline void···821785 for (css = &cgrp->css; css; css = css->parent) {822786 cgrp = container_of(css, struct perf_cgroup, css);823787 info = this_cpu_ptr(cgrp->info);824824- info->timestamp = ctx->timestamp;788788+ __update_cgrp_time(info, ctx->timestamp, false);789789+ __store_release(&info->active, 1);825790 }826791}827792···1019982}10209831021984static inline void10221022-perf_cgroup_set_shadow_time(struct perf_event *event, u64 now)10231023-{10241024- struct perf_cgroup_info *t;10251025- t = per_cpu_ptr(event->cgrp->info, event->cpu);10261026- event->shadow_ctx_time = now - t->timestamp;10271027-}10281028-10291029-static inline void1030985perf_cgroup_event_enable(struct perf_event *event, struct perf_event_context *ctx)1031986{1032987 struct perf_cpu_context *cpuctx;···10951066{10961067}1097106810981098-static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx)10691069+static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx,10701070+ bool final)10991071{11001072}11011073···11281098{11291099}1130110011311131-static inline void11321132-perf_cgroup_set_shadow_time(struct perf_event *event, u64 now)11011101+static inline u64 perf_cgroup_event_time(struct perf_event *event)11331102{11031103+ return 
0;11341104}1135110511361136-static inline u64 perf_cgroup_event_time(struct perf_event *event)11061106+static inline u64 perf_cgroup_event_time_now(struct perf_event *event, u64 now)11371107{11381108 return 0;11391109}···15551525/*15561526 * Update the record of the current time in a context.15571527 */15581558-static void update_context_time(struct perf_event_context *ctx)15281528+static void __update_context_time(struct perf_event_context *ctx, bool adv)15591529{15601530 u64 now = perf_clock();1561153115621562- ctx->time += now - ctx->timestamp;15321532+ if (adv)15331533+ ctx->time += now - ctx->timestamp;15631534 ctx->timestamp = now;15351535+15361536+ /*15371537+ * The above: time' = time + (now - timestamp), can be re-arranged15381538+ * into: time` = now + (time - timestamp), which gives a single value15391539+ * offset to compute future time without locks on.15401540+ *15411541+ * See perf_event_time_now(), which can be used from NMI context where15421542+ * it's (obviously) not possible to acquire ctx->lock in order to read15431543+ * both the above values in a consistent manner.15441544+ */15451545+ WRITE_ONCE(ctx->timeoffset, ctx->time - ctx->timestamp);15461546+}15471547+15481548+static void update_context_time(struct perf_event_context *ctx)15491549+{15501550+ __update_context_time(ctx, true);15641551}1565155215661553static u64 perf_event_time(struct perf_event *event)15671554{15681555 struct perf_event_context *ctx = event->ctx;1569155615571557+ if (unlikely(!ctx))15581558+ return 0;15591559+15701560 if (is_cgroup_event(event))15711561 return perf_cgroup_event_time(event);1572156215731573- return ctx ? 
ctx->time : 0;15631563+ return ctx->time;15641564+}15651565+15661566+static u64 perf_event_time_now(struct perf_event *event, u64 now)15671567+{15681568+ struct perf_event_context *ctx = event->ctx;15691569+15701570+ if (unlikely(!ctx))15711571+ return 0;15721572+15731573+ if (is_cgroup_event(event))15741574+ return perf_cgroup_event_time_now(event, now);15751575+15761576+ if (!(__load_acquire(&ctx->is_active) & EVENT_TIME))15771577+ return ctx->time;15781578+15791579+ now += READ_ONCE(ctx->timeoffset);15801580+ return now;15741581}1575158215761583static enum event_type_t get_event_type(struct perf_event *event)···2417235024182351 if (ctx->is_active & EVENT_TIME) {24192352 update_context_time(ctx);24202420- update_cgrp_time_from_cpuctx(cpuctx);23532353+ update_cgrp_time_from_cpuctx(cpuctx, false);24212354 }2422235524232356 event_sched_out(event, cpuctx, ctx);···24282361 list_del_event(event, ctx);2429236224302363 if (!ctx->nr_events && ctx->is_active) {23642364+ if (ctx == &cpuctx->ctx)23652365+ update_cgrp_time_from_cpuctx(cpuctx, true);23662366+24312367 ctx->is_active = 0;24322368 ctx->rotate_necessary = 0;24332369 if (ctx->task) {···24622392 * event_function_call() user.24632393 */24642394 raw_spin_lock_irq(&ctx->lock);24652465- if (!ctx->is_active) {23952395+ /*23962396+ * Cgroup events are per-cpu events, and must IPI because of23972397+ * cgrp_cpuctx_list.23982398+ */23992399+ if (!ctx->is_active && !is_cgroup_event(event)) {24662400 __perf_remove_from_context(event, __get_cpu_context(ctx),24672401 ctx, (void *)flags);24682402 raw_spin_unlock_irq(&ctx->lock);···25562482 irq_work_queue(&event->pending);25572483}2558248425592559-static void perf_set_shadow_time(struct perf_event *event,25602560- struct perf_event_context *ctx)25612561-{25622562- /*25632563- * use the correct time source for the time snapshot25642564- *25652565- * We could get by without this by leveraging the25662566- * fact that to get to this function, the caller25672567- * has most likely 
already called update_context_time()25682568- * and update_cgrp_time_xx() and thus both timestamp25692569- * are identical (or very close). Given that tstamp is,25702570- * already adjusted for cgroup, we could say that:25712571- * tstamp - ctx->timestamp25722572- * is equivalent to25732573- * tstamp - cgrp->timestamp.25742574- *25752575- * Then, in perf_output_read(), the calculation would25762576- * work with no changes because:25772577- * - event is guaranteed scheduled in25782578- * - no scheduled out in between25792579- * - thus the timestamp would be the same25802580- *25812581- * But this is a bit hairy.25822582- *25832583- * So instead, we have an explicit cgroup call to remain25842584- * within the time source all along. We believe it25852585- * is cleaner and simpler to understand.25862586- */25872587- if (is_cgroup_event(event))25882588- perf_cgroup_set_shadow_time(event, event->tstamp);25892589- else25902590- event->shadow_ctx_time = event->tstamp - ctx->timestamp;25912591-}25922592-25932485#define MAX_INTERRUPTS (~0ULL)2594248625952487static void perf_log_throttle(struct perf_event *event, int enable);···25952555 }2596255625972557 perf_pmu_disable(event->pmu);25982598-25992599- perf_set_shadow_time(event, ctx);2600255826012559 perf_log_itrace_start(event);26022560···28992861 * perf_event_attr::disabled events will not run and can be initialized29002862 * without IPI. 
Except when this is the first event for the context, in29012863 * that case we need the magic of the IPI to set ctx->is_active.28642864+ * Similarly, cgroup events for the context also needs the IPI to28652865+ * manipulate the cgrp_cpuctx_list.29022866 *29032867 * The IOC_ENABLE that is sure to follow the creation of a disabled29042868 * event will issue the IPI and reprogram the hardware.29052869 */29062906- if (__perf_effective_state(event) == PERF_EVENT_STATE_OFF && ctx->nr_events) {28702870+ if (__perf_effective_state(event) == PERF_EVENT_STATE_OFF &&28712871+ ctx->nr_events && !is_cgroup_event(event)) {29072872 raw_spin_lock_irq(&ctx->lock);29082873 if (ctx->task == TASK_TOMBSTONE) {29092874 raw_spin_unlock_irq(&ctx->lock);···32923251 return;32933252 }3294325332953295- ctx->is_active &= ~event_type;32963296- if (!(ctx->is_active & EVENT_ALL))32973297- ctx->is_active = 0;32983298-32993299- if (ctx->task) {33003300- WARN_ON_ONCE(cpuctx->task_ctx != ctx);33013301- if (!ctx->is_active)33023302- cpuctx->task_ctx = NULL;33033303- }33043304-33053254 /*33063255 * Always update time if it was set; not only when it changes.33073256 * Otherwise we can 'forget' to update time for any but the last···33053274 if (is_active & EVENT_TIME) {33063275 /* update (and stop) ctx time */33073276 update_context_time(ctx);33083308- update_cgrp_time_from_cpuctx(cpuctx);32773277+ update_cgrp_time_from_cpuctx(cpuctx, ctx == &cpuctx->ctx);32783278+ /*32793279+ * CPU-release for the below ->is_active store,32803280+ * see __load_acquire() in perf_event_time_now()32813281+ */32823282+ barrier();32833283+ }32843284+32853285+ ctx->is_active &= ~event_type;32863286+ if (!(ctx->is_active & EVENT_ALL))32873287+ ctx->is_active = 0;32883288+32893289+ if (ctx->task) {32903290+ WARN_ON_ONCE(cpuctx->task_ctx != ctx);32913291+ if (!ctx->is_active)32923292+ cpuctx->task_ctx = NULL;33093293 }3310329433113295 is_active ^= ctx->is_active; /* changed bits */···37573711 return 
0;37583712}3759371337143714+/*37153715+ * Because the userpage is strictly per-event (there is no concept of context,37163716+ * so there cannot be a context indirection), every userpage must be updated37173717+ * when context time starts :-(37183718+ *37193719+ * IOW, we must not miss EVENT_TIME edges.37203720+ */37603721static inline bool event_update_userpage(struct perf_event *event)37613722{37623723 if (likely(!atomic_read(&event->mmap_count)))37633724 return false;3764372537653726 perf_event_update_time(event);37663766- perf_set_shadow_time(event, event->ctx);37673727 perf_event_update_userpage(event);3768372837693729 return true;···38533801 struct task_struct *task)38543802{38553803 int is_active = ctx->is_active;38563856- u64 now;3857380438583805 lockdep_assert_held(&ctx->lock);3859380638603807 if (likely(!ctx->nr_events))38613808 return;38093809+38103810+ if (is_active ^ EVENT_TIME) {38113811+ /* start ctx time */38123812+ __update_context_time(ctx, false);38133813+ perf_cgroup_set_timestamp(task, ctx);38143814+ /*38153815+ * CPU-release for the below ->is_active store,38163816+ * see __load_acquire() in perf_event_time_now()38173817+ */38183818+ barrier();38193819+ }3862382038633821 ctx->is_active |= (event_type | EVENT_TIME);38643822 if (ctx->task) {···38793817 }3880381838813819 is_active ^= ctx->is_active; /* changed bits */38823882-38833883- if (is_active & EVENT_TIME) {38843884- /* start ctx time */38853885- now = perf_clock();38863886- ctx->timestamp = now;38873887- perf_cgroup_set_timestamp(task, ctx);38883888- }3889382038903821 /*38913822 * First go through the list and put on any pinned groups···44734418 return local64_read(&event->count) + atomic64_read(&event->child_count);44744419}4475442044214421+static void calc_timer_values(struct perf_event *event,44224422+ u64 *now,44234423+ u64 *enabled,44244424+ u64 *running)44254425+{44264426+ u64 ctx_time;44274427+44284428+ *now = perf_clock();44294429+ ctx_time = perf_event_time_now(event, 
*now);44304430+ __perf_update_times(event, ctx_time, enabled, running);44314431+}44324432+44764433/*44774434 * NMI-safe method to read a local event, that is an event that44784435 * is:···4544447745454478 *value = local64_read(&event->count);45464479 if (enabled || running) {45474547- u64 now = event->shadow_ctx_time + perf_clock();45484548- u64 __enabled, __running;44804480+ u64 __enabled, __running, __now;;4549448145504550- __perf_update_times(event, now, &__enabled, &__running);44824482+ calc_timer_values(event, &__now, &__enabled, &__running);45514483 if (enabled)45524484 *enabled = __enabled;45534485 if (running)···58685802 return event->pmu->event_idx(event);58695803}5870580458715871-static void calc_timer_values(struct perf_event *event,58725872- u64 *now,58735873- u64 *enabled,58745874- u64 *running)58755875-{58765876- u64 ctx_time;58775877-58785878- *now = perf_clock();58795879- ctx_time = event->shadow_ctx_time + *now;58805880- __perf_update_times(event, ctx_time, enabled, running);58815881-}58825882-58835805static void perf_event_init_userpage(struct perf_event *event)58845806{58855807 struct perf_event_mmap_page *userpg;···59925938 struct perf_buffer *old_rb = NULL;59935939 unsigned long flags;5994594059415941+ WARN_ON_ONCE(event->parent);59425942+59955943 if (event->rb) {59965944 /*59975945 * Should be impossible, we set this when removing···60515995{60525996 struct perf_buffer *rb;6053599759985998+ if (event->parent)59995999+ event = event->parent;60006000+60546001 rcu_read_lock();60556002 rb = rcu_dereference(event->rb);60566003 if (rb) {···60666007struct perf_buffer *ring_buffer_get(struct perf_event *event)60676008{60686009 struct perf_buffer *rb;60106010+60116011+ if (event->parent)60126012+ event = event->parent;6069601360706014 rcu_read_lock();60716015 rb = rcu_dereference(event->rb);···64156353 ring_buffer_attach(event, rb);6416635464176355 perf_event_update_time(event);64186418- perf_set_shadow_time(event, event->ctx);64196356 
perf_event_init_userpage(event);64206357 perf_event_update_userpage(event);64216358 } else {···67786717 if (WARN_ON_ONCE(READ_ONCE(sampler->oncpu) != smp_processor_id()))67796718 goto out;6780671967816781- rb = ring_buffer_get(sampler->parent ? sampler->parent : sampler);67206720+ rb = ring_buffer_get(sampler);67826721 if (!rb)67836722 goto out;67846723···68446783 if (WARN_ON_ONCE(!sampler || !data->aux_size))68456784 return;6846678568476847- rb = ring_buffer_get(sampler->parent ? sampler->parent : sampler);67866786+ rb = ring_buffer_get(sampler);68486787 if (!rb)68496788 return;68506789
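The perf rework above caches timeoffset = time - timestamp so that NMI context can derive an up-to-date context time from a single READ_ONCE() value, using the identity time + (now - timestamp) == now + (time - timestamp). A small check of that rearrangement, including unsigned wraparound (function names here are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdint.h>

/* What __update_context_time() computes under ctx->lock. */
static uint64_t time_locked(uint64_t time, uint64_t timestamp, uint64_t now)
{
    return time + (now - timestamp);
}

/* What perf_event_time_now() computes locklessly from the cached offset;
 * timeoffset may "wrap" as an unsigned value, the sum still comes out right
 * modulo 2^64. */
static uint64_t time_lockless(uint64_t timeoffset, uint64_t now)
{
    return now + timeoffset;
}
```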
+7-14
kernel/power/snapshot.c
···978978 * Register a range of page frames the contents of which should not be saved979979 * during hibernation (to be used in the early initialization code).980980 */981981-void __init __register_nosave_region(unsigned long start_pfn,982982- unsigned long end_pfn, int use_kmalloc)981981+void __init register_nosave_region(unsigned long start_pfn, unsigned long end_pfn)983982{984983 struct nosave_region *region;985984···994995 goto Report;995996 }996997 }997997- if (use_kmalloc) {998998- /* During init, this shouldn't fail */999999- region = kmalloc(sizeof(struct nosave_region), GFP_KERNEL);10001000- BUG_ON(!region);10011001- } else {10021002- /* This allocation cannot fail */10031003- region = memblock_alloc(sizeof(struct nosave_region),10041004- SMP_CACHE_BYTES);10051005- if (!region)10061006- panic("%s: Failed to allocate %zu bytes\n", __func__,10071007- sizeof(struct nosave_region));10081008- }998998+ /* This allocation cannot fail */999999+ region = memblock_alloc(sizeof(struct nosave_region),10001000+ SMP_CACHE_BYTES);10011001+ if (!region)10021002+ panic("%s: Failed to allocate %zu bytes\n", __func__,10031003+ sizeof(struct nosave_region));10091004 region->start_pfn = start_pfn;10101005 region->end_pfn = end_pfn;10111006 list_add_tail(&region->list, &nosave_regions);
+4-7
kernel/power/wakelock.c
···3939{4040 struct rb_node *node;4141 struct wakelock *wl;4242- char *str = buf;4343- char *end = buf + PAGE_SIZE;4242+ int len = 0;44434544 mutex_lock(&wakelocks_lock);46454746 for (node = rb_first(&wakelocks_tree); node; node = rb_next(node)) {4847 wl = rb_entry(node, struct wakelock, node);4948 if (wl->ws->active == show_active)5050- str += scnprintf(str, end - str, "%s ", wl->name);4949+ len += sysfs_emit_at(buf, len, "%s ", wl->name);5150 }5252- if (str > buf)5353- str--;54515555- str += scnprintf(str, end - str, "\n");5252+ len += sysfs_emit_at(buf, len, "\n");56535754 mutex_unlock(&wakelocks_lock);5858- return (str - buf);5555+ return len;5956}60576158#if CONFIG_PM_WAKELOCKS_LIMIT > 0
···58225822 }5823582358245824 if (schedstat_enabled() && rq->core->core_forceidle_count) {58255825- if (cookie)58265826- rq->core->core_forceidle_start = rq_clock(rq->core);58255825+ rq->core->core_forceidle_start = rq_clock(rq->core);58275826 rq->core->core_forceidle_occupation = occ;58285827 }58295828···8218821982198220 if (spin_needbreak(lock) || resched) {82208221 spin_unlock(lock);82218221- if (resched)82228222- preempt_schedule_common();82238223- else82228222+ if (!_cond_resched())82248223 cpu_relax();82258224 ret = 1;82268225 spin_lock(lock);···8236823982378240 if (rwlock_needbreak(lock) || resched) {82388241 read_unlock(lock);82398239- if (resched)82408240- preempt_schedule_common();82418241- else82428242+ if (!_cond_resched())82428243 cpu_relax();82438244 ret = 1;82448245 read_lock(lock);···8254825982558260 if (rwlock_needbreak(lock) || resched) {82568261 write_unlock(lock);82578257- if (resched)82588258- preempt_schedule_common();82598259- else82628262+ if (!_cond_resched())82608263 cpu_relax();82618264 ret = 1;82628265 write_lock(lock);
+1-1
kernel/sched/core_sched.c
···277277 rq_i = cpu_rq(i);278278 p = rq_i->core_pick ?: rq_i->curr;279279280280- if (!p->core_cookie)280280+ if (p == rq_i->idle)281281 continue;282282283283 __schedstat_add(p->stats.core_forceidle_sum, delta);
+77-41
kernel/sched/fair.c
···
 static inline void
 dequeue_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-	u32 divider = get_pelt_divider(&se->avg);
 	sub_positive(&cfs_rq->avg.load_avg, se->avg.load_avg);
-	cfs_rq->avg.load_sum = cfs_rq->avg.load_avg * divider;
+	sub_positive(&cfs_rq->avg.load_sum, se_weight(se) * se->avg.load_sum);
+	/* See update_cfs_rq_load_avg() */
+	cfs_rq->avg.load_sum = max_t(u32, cfs_rq->avg.load_sum,
+				     cfs_rq->avg.load_avg * PELT_MIN_DIVIDER);
 }
 #else
 static inline void
···
 	se->avg.last_update_time = n_last_update_time;
 }
 
-
 /*
  * When on migration a sched_entity joins/leaves the PELT hierarchy, we need to
  * propagate its contribution. The key to this propagation is the invariant
···
  * XXX: only do this for the part of runnable > running ?
  *
  */
-
 static inline void
 update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
 {
-	long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
-	u32 divider;
+	long delta_sum, delta_avg = gcfs_rq->avg.util_avg - se->avg.util_avg;
+	u32 new_sum, divider;
 
 	/* Nothing to update */
-	if (!delta)
+	if (!delta_avg)
 		return;
 
 	/*
···
 	 */
 	divider = get_pelt_divider(&cfs_rq->avg);
 
+
 	/* Set new sched_entity's utilization */
 	se->avg.util_avg = gcfs_rq->avg.util_avg;
-	se->avg.util_sum = se->avg.util_avg * divider;
+	new_sum = se->avg.util_avg * divider;
+	delta_sum = (long)new_sum - (long)se->avg.util_sum;
+	se->avg.util_sum = new_sum;
 
 	/* Update parent cfs_rq utilization */
-	add_positive(&cfs_rq->avg.util_avg, delta);
-	cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * divider;
+	add_positive(&cfs_rq->avg.util_avg, delta_avg);
+	add_positive(&cfs_rq->avg.util_sum, delta_sum);
+
+	/* See update_cfs_rq_load_avg() */
+	cfs_rq->avg.util_sum = max_t(u32, cfs_rq->avg.util_sum,
+				     cfs_rq->avg.util_avg * PELT_MIN_DIVIDER);
 }
 
 static inline void
 update_tg_cfs_runnable(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
 {
-	long delta = gcfs_rq->avg.runnable_avg - se->avg.runnable_avg;
-	u32 divider;
+	long delta_sum, delta_avg = gcfs_rq->avg.runnable_avg - se->avg.runnable_avg;
+	u32 new_sum, divider;
 
 	/* Nothing to update */
-	if (!delta)
+	if (!delta_avg)
 		return;
 
 	/*
···
 	/* Set new sched_entity's runnable */
 	se->avg.runnable_avg = gcfs_rq->avg.runnable_avg;
-	se->avg.runnable_sum = se->avg.runnable_avg * divider;
+	new_sum = se->avg.runnable_avg * divider;
+	delta_sum = (long)new_sum - (long)se->avg.runnable_sum;
+	se->avg.runnable_sum = new_sum;
 
 	/* Update parent cfs_rq runnable */
-	add_positive(&cfs_rq->avg.runnable_avg, delta);
-	cfs_rq->avg.runnable_sum = cfs_rq->avg.runnable_avg * divider;
+	add_positive(&cfs_rq->avg.runnable_avg, delta_avg);
+	add_positive(&cfs_rq->avg.runnable_sum, delta_sum);
+	/* See update_cfs_rq_load_avg() */
+	cfs_rq->avg.runnable_sum = max_t(u32, cfs_rq->avg.runnable_sum,
+					 cfs_rq->avg.runnable_avg * PELT_MIN_DIVIDER);
 }
 
 static inline void
 update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
 {
-	long delta, running_sum, runnable_sum = gcfs_rq->prop_runnable_sum;
+	long delta_avg, running_sum, runnable_sum = gcfs_rq->prop_runnable_sum;
 	unsigned long load_avg;
 	u64 load_sum = 0;
+	s64 delta_sum;
 	u32 divider;
 
 	if (!runnable_sum)
···
 	 * assuming all tasks are equally runnable.
 	 */
 	if (scale_load_down(gcfs_rq->load.weight)) {
-		load_sum = div_s64(gcfs_rq->avg.load_sum,
+		load_sum = div_u64(gcfs_rq->avg.load_sum,
 			scale_load_down(gcfs_rq->load.weight));
 	}
 
···
 	running_sum = se->avg.util_sum >> SCHED_CAPACITY_SHIFT;
 	runnable_sum = max(runnable_sum, running_sum);
 
-	load_sum = (s64)se_weight(se) * runnable_sum;
-	load_avg = div_s64(load_sum, divider);
+	load_sum = se_weight(se) * runnable_sum;
+	load_avg = div_u64(load_sum, divider);
 
-	se->avg.load_sum = runnable_sum;
-
-	delta = load_avg - se->avg.load_avg;
-	if (!delta)
+	delta_avg = load_avg - se->avg.load_avg;
+	if (!delta_avg)
 		return;
 
-	se->avg.load_avg = load_avg;
+	delta_sum = load_sum - (s64)se_weight(se) * se->avg.load_sum;
 
-	add_positive(&cfs_rq->avg.load_avg, delta);
-	cfs_rq->avg.load_sum = cfs_rq->avg.load_avg * divider;
+	se->avg.load_sum = runnable_sum;
+	se->avg.load_avg = load_avg;
+	add_positive(&cfs_rq->avg.load_avg, delta_avg);
+	add_positive(&cfs_rq->avg.load_sum, delta_sum);
+	/* See update_cfs_rq_load_avg() */
+	cfs_rq->avg.load_sum = max_t(u32, cfs_rq->avg.load_sum,
+				     cfs_rq->avg.load_avg * PELT_MIN_DIVIDER);
 }
 
 static inline void add_tg_cfs_propagate(struct cfs_rq *cfs_rq, long runnable_sum)
···
  *
  * cfs_rq->avg is used for task_h_load() and update_cfs_share() for example.
  *
- * Returns true if the load decayed or we removed load.
+ * Return: true if the load decayed or we removed load.
  *
  * Since both these conditions indicate a changed cfs_rq->avg.load we should
  * call update_tg_load_avg() when this function returns true.
···
 		r = removed_load;
 		sub_positive(&sa->load_avg, r);
-		sa->load_sum = sa->load_avg * divider;
+		sub_positive(&sa->load_sum, r * divider);
+		/* See sa->util_sum below */
+		sa->load_sum = max_t(u32, sa->load_sum, sa->load_avg * PELT_MIN_DIVIDER);
 
 		r = removed_util;
 		sub_positive(&sa->util_avg, r);
-		sa->util_sum = sa->util_avg * divider;
+		sub_positive(&sa->util_sum, r * divider);
+		/*
+		 * Because of rounding, se->util_sum might ends up being +1 more than
+		 * cfs->util_sum. Although this is not a problem by itself, detaching
+		 * a lot of tasks with the rounding problem between 2 updates of
+		 * util_avg (~1ms) can make cfs->util_sum becoming null whereas
+		 * cfs_util_avg is not.
+		 * Check that util_sum is still above its lower bound for the new
+		 * util_avg. Given that period_contrib might have moved since the last
+		 * sync, we are only sure that util_sum must be above or equal to
+		 * util_avg * minimum possible divider
+		 */
+		sa->util_sum = max_t(u32, sa->util_sum, sa->util_avg * PELT_MIN_DIVIDER);
 
 		r = removed_runnable;
 		sub_positive(&sa->runnable_avg, r);
-		sa->runnable_sum = sa->runnable_avg * divider;
+		sub_positive(&sa->runnable_sum, r * divider);
+		/* See sa->util_sum above */
+		sa->runnable_sum = max_t(u32, sa->runnable_sum,
+					 sa->runnable_avg * PELT_MIN_DIVIDER);
 
 		/*
 		 * removed_runnable is the unweighted version of removed_load so we
···
  */
 static void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-	/*
-	 * cfs_rq->avg.period_contrib can be used for both cfs_rq and se.
-	 * See ___update_load_avg() for details.
-	 */
-	u32 divider = get_pelt_divider(&cfs_rq->avg);
-
 	dequeue_load_avg(cfs_rq, se);
 	sub_positive(&cfs_rq->avg.util_avg, se->avg.util_avg);
-	cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * divider;
+	sub_positive(&cfs_rq->avg.util_sum, se->avg.util_sum);
+	/* See update_cfs_rq_load_avg() */
+	cfs_rq->avg.util_sum = max_t(u32, cfs_rq->avg.util_sum,
+				     cfs_rq->avg.util_avg * PELT_MIN_DIVIDER);
+
 	sub_positive(&cfs_rq->avg.runnable_avg, se->avg.runnable_avg);
-	cfs_rq->avg.runnable_sum = cfs_rq->avg.runnable_avg * divider;
+	sub_positive(&cfs_rq->avg.runnable_sum, se->avg.runnable_sum);
+	/* See update_cfs_rq_load_avg() */
+	cfs_rq->avg.runnable_sum = max_t(u32, cfs_rq->avg.runnable_sum,
+					 cfs_rq->avg.runnable_avg * PELT_MIN_DIVIDER);
 
 	add_tg_cfs_propagate(cfs_rq, -se->avg.load_sum);
···
  *
  * If @sg does not have SMT siblings, only pull tasks if all of the SMT siblings
  * of @dst_cpu are idle and @sg has lower priority.
+ *
+ * Return: true if @dst_cpu can pull tasks, false otherwise.
  */
 static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
 				    struct sg_lb_stats *sgs,
···
 /**
  * update_sg_lb_stats - Update sched_group's statistics for load balancing.
  * @env: The load balancing environment.
+ * @sds: Load-balancing data with statistics of the local group.
  * @group: sched_group whose statistics are to be updated.
  * @sgs: variable to hold the statistics for this group.
  * @sg_status: Holds flag indicating the status of the sched_group
···
 /**
  * find_busiest_group - Returns the busiest group within the sched_domain
  * if there is an imbalance.
+ * @env: The load balancing environment.
  *
  * Also calculates the amount of runnable load which should be moved
  * to restore balance.
- *
- * @env: The load balancing environment.
  *
  * Return: - The busiest group if imbalance exists.
  */
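The recurring pattern in the fair.c hunks above — subtract the detached entity's contribution from both `_avg` and `_sum`, then clamp `_sum` to a lower bound of `_avg * PELT_MIN_DIVIDER` — can be sketched in plain user-space C. This is an illustrative model only: `sub_positive()` mirrors the kernel helper, 46718 stands in for the kernel's `PELT_MIN_DIVIDER` (`LOAD_AVG_MAX` 47742 minus 1024), and all the surrounding PELT decay machinery is omitted.

```c
#include <assert.h>
#include <stdint.h>

#define PELT_MIN_DIVIDER 46718	/* assumption: LOAD_AVG_MAX (47742) - 1024 */

/* Mirrors the kernel's sub_positive(): subtract, clamping underflow to 0. */
static void sub_positive(unsigned long *ptr, unsigned long val)
{
	*ptr = (*ptr > val) ? *ptr - val : 0;
}

/*
 * Remove an entity's contribution from an (avg, sum) pair, then enforce
 * the invariant these patches introduce: sum never falls below
 * avg * PELT_MIN_DIVIDER, so rounding can no longer drive sum to zero
 * while avg stays non-zero.
 */
static void remove_contrib(unsigned long *avg, uint32_t *sum,
			   unsigned long davg, uint32_t dsum)
{
	unsigned long s = *sum;

	sub_positive(avg, davg);
	sub_positive(&s, dsum);
	*sum = (uint32_t)s;
	if (*sum < *avg * PELT_MIN_DIVIDER)	/* max_t(u32, ...) in the patch */
		*sum = (uint32_t)(*avg * PELT_MIN_DIVIDER);
}
```

Even when rounding makes the subtracted sum slightly too large, the clamp restores the minimum consistent with the remaining average.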
kernel/sched/psi.c
···
 	return 0;
 }
 
-static int psi_io_show(struct seq_file *m, void *v)
-{
-	return psi_show(m, &psi_system, PSI_IO);
-}
-
-static int psi_memory_show(struct seq_file *m, void *v)
-{
-	return psi_show(m, &psi_system, PSI_MEM);
-}
-
-static int psi_cpu_show(struct seq_file *m, void *v)
-{
-	return psi_show(m, &psi_system, PSI_CPU);
-}
-
-static int psi_open(struct file *file, int (*psi_show)(struct seq_file *, void *))
-{
-	if (file->f_mode & FMODE_WRITE && !capable(CAP_SYS_RESOURCE))
-		return -EPERM;
-
-	return single_open(file, psi_show, NULL);
-}
-
-static int psi_io_open(struct inode *inode, struct file *file)
-{
-	return psi_open(file, psi_io_show);
-}
-
-static int psi_memory_open(struct inode *inode, struct file *file)
-{
-	return psi_open(file, psi_memory_show);
-}
-
-static int psi_cpu_open(struct inode *inode, struct file *file)
-{
-	return psi_open(file, psi_cpu_show);
-}
-
 struct psi_trigger *psi_trigger_create(struct psi_group *group,
 			char *buf, size_t nbytes, enum psi_res res)
 {
···
 	t->event = 0;
 	t->last_event_time = 0;
 	init_waitqueue_head(&t->event_wait);
-	kref_init(&t->refcount);
 
 	mutex_lock(&group->trigger_lock);
 
···
 	return t;
 }
 
-static void psi_trigger_destroy(struct kref *ref)
+void psi_trigger_destroy(struct psi_trigger *t)
 {
-	struct psi_trigger *t = container_of(ref, struct psi_trigger, refcount);
-	struct psi_group *group = t->group;
+	struct psi_group *group;
 	struct task_struct *task_to_destroy = NULL;
 
-	if (static_branch_likely(&psi_disabled))
+	/*
+	 * We do not check psi_disabled since it might have been disabled after
+	 * the trigger got created.
+	 */
+	if (!t)
 		return;
 
+	group = t->group;
 	/*
 	 * Wakeup waiters to stop polling. Can happen if cgroup is deleted
 	 * from under a polling process.
···
 	mutex_unlock(&group->trigger_lock);
 
 	/*
-	 * Wait for both *trigger_ptr from psi_trigger_replace and
-	 * poll_task RCUs to complete their read-side critical sections
-	 * before destroying the trigger and optionally the poll_task
+	 * Wait for psi_schedule_poll_work RCU to complete its read-side
+	 * critical section before destroying the trigger and optionally the
+	 * poll_task.
 	 */
 	synchronize_rcu();
 	/*
···
 	kfree(t);
 }
 
-void psi_trigger_replace(void **trigger_ptr, struct psi_trigger *new)
-{
-	struct psi_trigger *old = *trigger_ptr;
-
-	if (static_branch_likely(&psi_disabled))
-		return;
-
-	rcu_assign_pointer(*trigger_ptr, new);
-	if (old)
-		kref_put(&old->refcount, psi_trigger_destroy);
-}
-
 __poll_t psi_trigger_poll(void **trigger_ptr,
 				struct file *file, poll_table *wait)
 {
···
 	if (static_branch_likely(&psi_disabled))
 		return DEFAULT_POLLMASK | EPOLLERR | EPOLLPRI;
 
-	rcu_read_lock();
-
-	t = rcu_dereference(*(void __rcu __force **)trigger_ptr);
-	if (!t) {
-		rcu_read_unlock();
+	t = smp_load_acquire(trigger_ptr);
+	if (!t)
 		return DEFAULT_POLLMASK | EPOLLERR | EPOLLPRI;
-	}
-	kref_get(&t->refcount);
-
-	rcu_read_unlock();
 
 	poll_wait(file, &t->event_wait, wait);
 
 	if (cmpxchg(&t->event, 1, 0) == 1)
 		ret |= EPOLLPRI;
 
-	kref_put(&t->refcount, psi_trigger_destroy);
-
 	return ret;
+}
+
+#ifdef CONFIG_PROC_FS
+static int psi_io_show(struct seq_file *m, void *v)
+{
+	return psi_show(m, &psi_system, PSI_IO);
+}
+
+static int psi_memory_show(struct seq_file *m, void *v)
+{
+	return psi_show(m, &psi_system, PSI_MEM);
+}
+
+static int psi_cpu_show(struct seq_file *m, void *v)
+{
+	return psi_show(m, &psi_system, PSI_CPU);
+}
+
+static int psi_open(struct file *file, int (*psi_show)(struct seq_file *, void *))
+{
+	if (file->f_mode & FMODE_WRITE && !capable(CAP_SYS_RESOURCE))
+		return -EPERM;
+
+	return single_open(file, psi_show, NULL);
+}
+
+static int psi_io_open(struct inode *inode, struct file *file)
+{
+	return psi_open(file, psi_io_show);
+}
+
+static int psi_memory_open(struct inode *inode, struct file *file)
+{
+	return psi_open(file, psi_memory_show);
+}
+
+static int psi_cpu_open(struct inode *inode, struct file *file)
+{
+	return psi_open(file, psi_cpu_show);
 }
 
 static ssize_t psi_write(struct file *file, const char __user *user_buf,
···
 	buf[buf_size - 1] = '\0';
 
-	new = psi_trigger_create(&psi_system, buf, nbytes, res);
-	if (IS_ERR(new))
-		return PTR_ERR(new);
-
 	seq = file->private_data;
+
 	/* Take seq->lock to protect seq->private from concurrent writes */
 	mutex_lock(&seq->lock);
-	psi_trigger_replace(&seq->private, new);
+
+	/* Allow only one trigger per file descriptor */
+	if (seq->private) {
+		mutex_unlock(&seq->lock);
+		return -EBUSY;
+	}
+
+	new = psi_trigger_create(&psi_system, buf, nbytes, res);
+	if (IS_ERR(new)) {
+		mutex_unlock(&seq->lock);
+		return PTR_ERR(new);
+	}
+
+	smp_store_release(&seq->private, new);
 	mutex_unlock(&seq->lock);
 
 	return nbytes;
···
 {
 	struct seq_file *seq = file->private_data;
 
-	psi_trigger_replace(&seq->private, NULL);
+	psi_trigger_destroy(seq->private);
 	return single_release(inode, file);
 }
···
 	return 0;
 }
 module_init(psi_proc_init);
+
+#endif /* CONFIG_PROC_FS */
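The psi rewrite above drops the RCU-plus-kref lifetime scheme for a simpler rule: `seq->private` is written at most once under `seq->lock` (a second write fails with `-EBUSY`) and is published with `smp_store_release()`, so the poll path only needs an acquire load. A user-space model of that publication protocol, using C11 atomics in place of the kernel barriers (the names and fields here are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Illustrative stand-in for a psi trigger (fields arbitrary). */
struct trigger {
	int threshold_us;
	int window_us;
};

static _Atomic(struct trigger *) trigger_slot;	/* models seq->private */

/*
 * Writer side, as in psi_write(): only one trigger may ever be attached
 * to the slot; the pointer is published with release semantics so any
 * reader that observes it also observes the initialised fields.
 */
static int publish_trigger(struct trigger *t)
{
	if (atomic_load_explicit(&trigger_slot, memory_order_acquire))
		return -16;	/* -EBUSY on Linux */
	atomic_store_explicit(&trigger_slot, t, memory_order_release);
	return 0;
}

/* Reader side, as in psi_trigger_poll(): a plain acquire load suffices. */
static struct trigger *lookup_trigger(void)
{
	return atomic_load_explicit(&trigger_slot, memory_order_acquire);
}
```

Because the pointer transitions only once (NULL to a fully built trigger) during the file's lifetime, no reference counting is needed on the read side; teardown happens after the last user, in the release handler.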
+7-1
kernel/trace/Kconfig
···
 	help
 	  C version of recordmcount available?
 
+config HAVE_BUILDTIME_MCOUNT_SORT
+	bool
+	help
+	  An architecture selects this if it sorts the mcount_loc section
+	  at build time.
+
 config BUILDTIME_MCOUNT_SORT
 	bool
 	default y
-	depends on BUILDTIME_TABLE_SORT && !S390
+	depends on HAVE_BUILDTIME_MCOUNT_SORT && DYNAMIC_FTRACE
 	help
 	  Sort the mcount_loc section at build time.
fs/dax.c
···
 	}
 
 	/*
+	 * Pages instantiated by device-dax (not filesystem-dax)
+	 * may be compound pages.
+	 */
+	page = compound_head(page);
+
+	/*
 	 * Prevent the inode from being freed while we are interrogating
 	 * the address_space, typically this would be handled by
 	 * lock_page(), but dax pages do not use the page lock. This
+5-4
net/bridge/br_vlan.c
···
 		    !br_opt_get(br, BROPT_VLAN_STATS_ENABLED)) {
 			if (*state == BR_STATE_FORWARDING) {
 				*state = br_vlan_get_pvid_state(vg);
-				return br_vlan_state_allowed(*state, true);
-			} else {
-				return true;
+				if (!br_vlan_state_allowed(*state, true))
+					goto drop;
 			}
+			return true;
 		}
 	}
 	v = br_vlan_find(vg, *vid);
···
 			goto out_err;
 		}
 		err = br_vlan_dump_dev(dev, skb, cb, dump_flags);
-		if (err && err != -EMSGSIZE)
+		/* if the dump completed without an error we return 0 here */
+		if (err != -EMSGSIZE)
 			goto out_err;
 	} else {
 		for_each_netdev_rcu(net, dev) {
+34-4
net/core/net-procfs.c
···
 	.show  = softnet_seq_show,
 };
 
-static void *ptype_get_idx(loff_t pos)
+static void *ptype_get_idx(struct seq_file *seq, loff_t pos)
 {
+	struct list_head *ptype_list = NULL;
 	struct packet_type *pt = NULL;
+	struct net_device *dev;
 	loff_t i = 0;
 	int t;
+
+	for_each_netdev_rcu(seq_file_net(seq), dev) {
+		ptype_list = &dev->ptype_all;
+		list_for_each_entry_rcu(pt, ptype_list, list) {
+			if (i == pos)
+				return pt;
+			++i;
+		}
+	}
 
 	list_for_each_entry_rcu(pt, &ptype_all, list) {
 		if (i == pos)
···
 	__acquires(RCU)
 {
 	rcu_read_lock();
-	return *pos ? ptype_get_idx(*pos - 1) : SEQ_START_TOKEN;
+	return *pos ? ptype_get_idx(seq, *pos - 1) : SEQ_START_TOKEN;
 }
 
 static void *ptype_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 {
+	struct net_device *dev;
 	struct packet_type *pt;
 	struct list_head *nxt;
 	int hash;
 
 	++*pos;
 	if (v == SEQ_START_TOKEN)
-		return ptype_get_idx(0);
+		return ptype_get_idx(seq, 0);
 
 	pt = v;
 	nxt = pt->list.next;
+	if (pt->dev) {
+		if (nxt != &pt->dev->ptype_all)
+			goto found;
+
+		dev = pt->dev;
+		for_each_netdev_continue_rcu(seq_file_net(seq), dev) {
+			if (!list_empty(&dev->ptype_all)) {
+				nxt = dev->ptype_all.next;
+				goto found;
+			}
+		}
+
+		nxt = ptype_all.next;
+		goto ptype_all;
+	}
+
 	if (pt->type == htons(ETH_P_ALL)) {
+ptype_all:
 		if (nxt != &ptype_all)
 			goto found;
 		hash = 0;
···
 
 	if (v == SEQ_START_TOKEN)
 		seq_puts(seq, "Type Device      Function\n");
-	else if (pt->dev == NULL || dev_net(pt->dev) == seq_file_net(seq)) {
+	else if ((!pt->af_packet_net || net_eq(pt->af_packet_net, seq_file_net(seq))) &&
+		 (!pt->dev || net_eq(dev_net(pt->dev), seq_file_net(seq)))) {
 		if (pt->type == htons(ETH_P_ALL))
 			seq_puts(seq, "ALL ");
 		else
+21-5
net/ipv4/ip_output.c
···
 	iph->daddr    = (opt && opt->opt.srr ? opt->opt.faddr : daddr);
 	iph->saddr    = saddr;
 	iph->protocol = sk->sk_protocol;
-	if (ip_dont_fragment(sk, &rt->dst)) {
+	/* Do not bother generating IPID for small packets (eg SYNACK) */
+	if (skb->len <= IPV4_MIN_MTU || ip_dont_fragment(sk, &rt->dst)) {
 		iph->frag_off = htons(IP_DF);
 		iph->id = 0;
 	} else {
 		iph->frag_off = 0;
-		__ip_select_ident(net, iph, 1);
+		/* TCP packets here are SYNACK with fat IPv4/TCP options.
+		 * Avoid using the hashed IP ident generator.
+		 */
+		if (sk->sk_protocol == IPPROTO_TCP)
+			iph->id = (__force __be16)prandom_u32();
+		else
+			__ip_select_ident(net, iph, 1);
 	}
 
 	if (opt && opt->opt.optlen) {
···
 		/* Everything is OK. Generate! */
 		ip_fraglist_init(skb, iph, hlen, &iter);
 
-		if (iter.frag)
-			ip_options_fragment(iter.frag);
-
 		for (;;) {
 			/* Prepare header of the next frame,
 			 * before previous one went down. */
 			if (iter.frag) {
+				bool first_frag = (iter.offset == 0);
+
 				IPCB(iter.frag)->flags = IPCB(skb)->flags;
 				ip_fraglist_prepare(skb, &iter);
+				if (first_frag && IPCB(skb)->opt.optlen) {
+					/* ipcb->opt is not populated for frags
+					 * coming from __ip_make_skb(),
+					 * ip_options_fragment() needs optlen
+					 */
+					IPCB(iter.frag)->opt.optlen =
+						IPCB(skb)->opt.optlen;
+					ip_options_fragment(iter.frag);
+					ip_send_check(iter.iph);
+				}
 			}
 
 			skb->tstamp = tstamp;
net/ipv6/addrconf.c
···
 			 __u32 valid_lft, u32 prefered_lft)
 {
 	struct inet6_ifaddr *ifp = ipv6_get_ifaddr(net, addr, dev, 1);
-	int create = 0;
+	int create = 0, update_lft = 0;
 
 	if (!ifp && valid_lft) {
 		int max_addresses = in6_dev->cnf.max_addresses;
···
 		unsigned long now;
 		u32 stored_lft;
 
-		/* Update lifetime (RFC4862 5.5.3 e)
-		 * We deviate from RFC4862 by honoring all Valid Lifetimes to
-		 * improve the reaction of SLAAC to renumbering events
-		 * (draft-gont-6man-slaac-renum-06, Section 4.2)
-		 */
+		/* update lifetime (RFC2462 5.5.3 e) */
 		spin_lock_bh(&ifp->lock);
 		now = jiffies;
 		if (ifp->valid_lft > (now - ifp->tstamp) / HZ)
 			stored_lft = ifp->valid_lft - (now - ifp->tstamp) / HZ;
 		else
 			stored_lft = 0;
 		if (!create && stored_lft) {
+			const u32 minimum_lft = min_t(u32,
+				stored_lft, MIN_VALID_LIFETIME);
+			valid_lft = max(valid_lft, minimum_lft);
+
+			/* RFC4862 Section 5.5.3e:
+			 * "Note that the preferred lifetime of the
+			 *  corresponding address is always reset to
+			 *  the Preferred Lifetime in the received
+			 *  Prefix Information option, regardless of
+			 *  whether the valid lifetime is also reset or
+			 *  ignored."
+			 *
+			 * So we should always update prefered_lft here.
+			 */
+			update_lft = 1;
+		}
+
+		if (update_lft) {
 			ifp->valid_lft = valid_lft;
 			ifp->prefered_lft = prefered_lft;
 			ifp->tstamp = now;
+13-10
net/ipv6/ip6_fib.c
···
 	fn = rcu_dereference_protected(f6i->fib6_node,
 			lockdep_is_held(&f6i->fib6_table->tb6_lock));
 	if (fn)
-		fn->fn_sernum = fib6_new_sernum(net);
+		WRITE_ONCE(fn->fn_sernum, fib6_new_sernum(net));
 }
 
 /*
···
 		spin_unlock_bh(&table->tb6_lock);
 		if (res > 0) {
 			cb->args[4] = 1;
-			cb->args[5] = w->root->fn_sernum;
+			cb->args[5] = READ_ONCE(w->root->fn_sernum);
 		}
 	} else {
-		if (cb->args[5] != w->root->fn_sernum) {
+		int sernum = READ_ONCE(w->root->fn_sernum);
+		if (cb->args[5] != sernum) {
 			/* Begin at the root if the tree changed */
-			cb->args[5] = w->root->fn_sernum;
+			cb->args[5] = sernum;
 			w->state = FWS_INIT;
 			w->node = w->root;
 			w->skip = w->count;
···
 	/* paired with smp_rmb() in fib6_get_cookie_safe() */
 	smp_wmb();
 	while (fn) {
-		fn->fn_sernum = sernum;
+		WRITE_ONCE(fn->fn_sernum, sernum);
 		fn = rcu_dereference_protected(fn->parent,
 				lockdep_is_held(&rt->fib6_table->tb6_lock));
 	}
···
 	};
 
 	if (c->sernum != FIB6_NO_SERNUM_CHANGE &&
-	    w->node->fn_sernum != c->sernum)
-		w->node->fn_sernum = c->sernum;
+	    READ_ONCE(w->node->fn_sernum) != c->sernum)
+		WRITE_ONCE(w->node->fn_sernum, c->sernum);
 
 	if (!c->func) {
 		WARN_ON_ONCE(c->sernum == FIB6_NO_SERNUM_CHANGE);
···
 	iter->w.state = FWS_INIT;
 	iter->w.node = iter->w.root;
 	iter->w.args = iter;
-	iter->sernum = iter->w.root->fn_sernum;
+	iter->sernum = READ_ONCE(iter->w.root->fn_sernum);
 	INIT_LIST_HEAD(&iter->w.lh);
 	fib6_walker_link(net, &iter->w);
 }
···
 
 static void ipv6_route_check_sernum(struct ipv6_route_iter *iter)
 {
-	if (iter->sernum != iter->w.root->fn_sernum) {
-		iter->sernum = iter->w.root->fn_sernum;
+	int sernum = READ_ONCE(iter->w.root->fn_sernum);
+
+	if (iter->sernum != sernum) {
+		iter->sernum = sernum;
 		iter->w.state = FWS_INIT;
 		iter->w.node = iter->w.root;
 		WARN_ON(iter->w.skip);
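The ip6_fib.c hunks above convert the lockless `fn_sernum` accesses to `READ_ONCE()`/`WRITE_ONCE()` and snapshot the value once per decision, so the "did the tree change?" test and the cache update cannot observe two different values. A minimal user-space model (the macros below are simplified volatile-based stand-ins for the kernel's, and the struct is an illustrative reduction):

```c
#include <assert.h>

/*
 * Simplified stand-ins for the kernel's READ_ONCE()/WRITE_ONCE():
 * a volatile access forces exactly one load or store and prevents the
 * compiler from tearing, fusing, or re-reading the location.
 */
#define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)		(*(volatile __typeof__(x) *)&(x))

struct fib6_node_model {
	int fn_sernum;	/* bumped whenever the tree changes */
};

/*
 * Mirrors the shape of ipv6_route_check_sernum() above: snapshot the
 * root's serial number once, then both compare against and store the
 * same snapshot.
 */
static int walk_needs_restart(struct fib6_node_model *root, int *cached)
{
	int sernum = READ_ONCE(root->fn_sernum);

	if (*cached != sernum) {
		*cached = sernum;	/* tree changed: restart from root */
		return 1;
	}
	return 0;
}
```

Reading `fn_sernum` twice, as the old code did, could race with a concurrent writer and leave the cached value out of sync with the restart decision.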
+4-4
net/ipv6/ip6_tunnel.c
···
 
 	if (unlikely(!ipv6_chk_addr_and_flags(net, laddr, ldev, false,
 					      0, IFA_F_TENTATIVE)))
-		pr_warn("%s xmit: Local address not yet configured!\n",
-			p->name);
+		pr_warn_ratelimited("%s xmit: Local address not yet configured!\n",
+				    p->name);
 	else if (!(p->flags & IP6_TNL_F_ALLOW_LOCAL_REMOTE) &&
 		 !ipv6_addr_is_multicast(raddr) &&
 		 unlikely(ipv6_chk_addr_and_flags(net, raddr, ldev,
 						  true, 0, IFA_F_TENTATIVE)))
-		pr_warn("%s xmit: Routing loop! Remote address found on this node!\n",
-			p->name);
+		pr_warn_ratelimited("%s xmit: Routing loop! Remote address found on this node!\n",
+				    p->name);
 	else
 		ret = 1;
 	rcu_read_unlock();
net/mptcp/protocol.h
···
 struct mptcp_subflow_context {
 	struct	list_head node;/* conn_list of subflows */
 
-	char	reset_start[0];
+	struct_group(reset,
 
 	unsigned long avg_pacing_rate; /* protected by msk socket lock */
 	u64	local_key;
···
 
 	long	delegated_status;
 
-	char	reset_end[0];
+	);
 
 	struct	list_head delegated_node;   /* link into delegated_action, protected by local BH */
 
···
 static inline void
 mptcp_subflow_ctx_reset(struct mptcp_subflow_context *subflow)
 {
-	memset(subflow->reset_start, 0, subflow->reset_end - subflow->reset_start);
+	memset(&subflow->reset, 0, sizeof(subflow->reset));
 	subflow->request_mptcp = 1;
 }
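The mptcp hunk above swaps the old `reset_start[0]`/`reset_end[0]` marker pair for `struct_group()`, which lets the region be cleared as a single object with no pointer arithmetic. A user-space sketch of the idea (the macro below is a reduced version of the kernel's helper from `include/linux/stddef.h`, without the attribute plumbing; the struct fields are stand-ins):

```c
#include <assert.h>
#include <string.h>

/*
 * Reduced model of the kernel's struct_group(): an anonymous union of a
 * transparent struct (members stay addressable by name) and a named
 * copy of the same members (the whole group addressable as one object
 * for memset()/memcpy(), with a size the compiler can check).
 */
#define struct_group(NAME, ...)			\
	union {					\
		struct { __VA_ARGS__ };		\
		struct { __VA_ARGS__ } NAME;	\
	}

struct subflow_model {
	int node;		/* before the group: must survive a reset */
	struct_group(reset,
		unsigned long avg_pacing_rate;
		unsigned long long local_key;
		int request_mptcp;
	);
	int delegated_node;	/* after the group: must also survive */
};

/* Mirrors mptcp_subflow_ctx_reset(): clear the group, re-arm one flag. */
static void subflow_reset(struct subflow_model *sf)
{
	memset(&sf->reset, 0, sizeof(sf->reset));
	sf->request_mptcp = 1;
}
```

Compared with subtracting two zero-length marker addresses, `sizeof(sf->reset)` cannot drift out of sync with the members and keeps fortified `memset()` bounds checking happy.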
+5-3
net/netfilter/nf_conntrack_core.c
···
 		pr_debug("nf_conntrack_in: Can't track with proto module\n");
 		nf_ct_put(ct);
 		skb->_nfct = 0;
-		NF_CT_STAT_INC_ATOMIC(state->net, invalid);
-		if (ret == -NF_DROP)
-			NF_CT_STAT_INC_ATOMIC(state->net, drop);
 		/* Special case: TCP tracker reports an attempt to reopen a
 		 * closed/aborted connection. We have to go back and create a
 		 * fresh conntrack.
 		 */
 		if (ret == -NF_REPEAT)
 			goto repeat;
+
+		NF_CT_STAT_INC_ATOMIC(state->net, invalid);
+		if (ret == -NF_DROP)
+			NF_CT_STAT_INC_ATOMIC(state->net, drop);
+
 		ret = -ret;
 		goto out;
 	}
net/sched/sch_api.c
···
 
 	err = -ENOENT;
 	if (!ops) {
-		NL_SET_ERR_MSG(extack, "Specified qdisc not found");
+		NL_SET_ERR_MSG(extack, "Specified qdisc kind is unknown");
 		goto err_out;
 	}
 
+20
net/sched/sch_htb.c
···
 	if (!hopt->rate.rate || !hopt->ceil.rate)
 		goto failure;
 
+	if (q->offload) {
+		/* Options not supported by the offload. */
+		if (hopt->rate.overhead || hopt->ceil.overhead) {
+			NL_SET_ERR_MSG(extack, "HTB offload doesn't support the overhead parameter");
+			goto failure;
+		}
+		if (hopt->rate.mpu || hopt->ceil.mpu) {
+			NL_SET_ERR_MSG(extack, "HTB offload doesn't support the mpu parameter");
+			goto failure;
+		}
+		if (hopt->quantum) {
+			NL_SET_ERR_MSG(extack, "HTB offload doesn't support the quantum parameter");
+			goto failure;
+		}
+		if (hopt->prio) {
+			NL_SET_ERR_MSG(extack, "HTB offload doesn't support the prio parameter");
+			goto failure;
+		}
+	}
+
 	/* Keeping backward compatible with rate_table based iproute2 tc */
 	if (hopt->rate.linklayer == TC_LINKLAYER_UNAWARE)
 		qdisc_put_rtab(qdisc_get_rtab(&hopt->rate, tb[TCA_HTB_RTAB],
+51-12
net/smc/af_smc.c
···
 	mutex_unlock(&net->smc.mutex_fback_rsn);
 }
 
-static void smc_switch_to_fallback(struct smc_sock *smc, int reason_code)
+static int smc_switch_to_fallback(struct smc_sock *smc, int reason_code)
 {
 	wait_queue_head_t *smc_wait = sk_sleep(&smc->sk);
-	wait_queue_head_t *clc_wait = sk_sleep(smc->clcsock->sk);
+	wait_queue_head_t *clc_wait;
 	unsigned long flags;
 
+	mutex_lock(&smc->clcsock_release_lock);
+	if (!smc->clcsock) {
+		mutex_unlock(&smc->clcsock_release_lock);
+		return -EBADF;
+	}
 	smc->use_fallback = true;
 	smc->fallback_rsn = reason_code;
 	smc_stat_fallback(smc);
···
 		 * smc socket->wq, which should be removed
 		 * to clcsocket->wq during the fallback.
 		 */
+		clc_wait = sk_sleep(smc->clcsock->sk);
 		spin_lock_irqsave(&smc_wait->lock, flags);
 		spin_lock_nested(&clc_wait->lock, SINGLE_DEPTH_NESTING);
 		list_splice_init(&smc_wait->head, &clc_wait->head);
 		spin_unlock(&clc_wait->lock);
 		spin_unlock_irqrestore(&smc_wait->lock, flags);
 	}
+	mutex_unlock(&smc->clcsock_release_lock);
+	return 0;
 }
 
 /* fall back during connect */
 static int smc_connect_fallback(struct smc_sock *smc, int reason_code)
 {
-	smc_switch_to_fallback(smc, reason_code);
+	struct net *net = sock_net(&smc->sk);
+	int rc = 0;
+
+	rc = smc_switch_to_fallback(smc, reason_code);
+	if (rc) { /* fallback fails */
+		this_cpu_inc(net->smc.smc_stats->clnt_hshake_err_cnt);
+		if (smc->sk.sk_state == SMC_INIT)
+			sock_put(&smc->sk); /* passive closing */
+		return rc;
+	}
 	smc_copy_sock_settings_to_clc(smc);
 	smc->connect_nonblock = 0;
 	if (smc->sk.sk_state == SMC_INIT)
···
 {
 	/* RDMA setup failed, switch back to TCP */
 	smc_conn_abort(new_smc, local_first);
-	if (reason_code < 0) { /* error, no fallback possible */
+	if (reason_code < 0 ||
+	    smc_switch_to_fallback(new_smc, reason_code)) {
+		/* error, no fallback possible */
 		smc_listen_out_err(new_smc);
 		return;
 	}
-	smc_switch_to_fallback(new_smc, reason_code);
 	if (reason_code && reason_code != SMC_CLC_DECL_PEERDECL) {
 		if (smc_clc_send_decline(new_smc, reason_code, version) < 0) {
 			smc_listen_out_err(new_smc);
···
 
 	/* check if peer is smc capable */
 	if (!tcp_sk(newclcsock->sk)->syn_smc) {
-		smc_switch_to_fallback(new_smc, SMC_CLC_DECL_PEERNOSMC);
-		smc_listen_out_connected(new_smc);
+		rc = smc_switch_to_fallback(new_smc, SMC_CLC_DECL_PEERNOSMC);
+		if (rc)
+			smc_listen_out_err(new_smc);
+		else
+			smc_listen_out_connected(new_smc);
 		return;
 	}
 
···
 
 	if (msg->msg_flags & MSG_FASTOPEN) {
 		if (sk->sk_state == SMC_INIT && !smc->connect_nonblock) {
-			smc_switch_to_fallback(smc, SMC_CLC_DECL_OPTUNSUPP);
+			rc = smc_switch_to_fallback(smc, SMC_CLC_DECL_OPTUNSUPP);
+			if (rc)
+				goto out;
 		} else {
 			rc = -EINVAL;
 			goto out;
···
 	/* generic setsockopts reaching us here always apply to the
 	 * CLC socket
 	 */
+	mutex_lock(&smc->clcsock_release_lock);
+	if (!smc->clcsock) {
+		mutex_unlock(&smc->clcsock_release_lock);
+		return -EBADF;
+	}
 	if (unlikely(!smc->clcsock->ops->setsockopt))
 		rc = -EOPNOTSUPP;
 	else
···
 		sk->sk_err = smc->clcsock->sk->sk_err;
 		sk_error_report(sk);
 	}
+	mutex_unlock(&smc->clcsock_release_lock);
 
 	if (optlen < sizeof(int))
 		return -EINVAL;
···
 	case TCP_FASTOPEN_NO_COOKIE:
 		/* option not supported by SMC */
 		if (sk->sk_state == SMC_INIT && !smc->connect_nonblock) {
-			smc_switch_to_fallback(smc, SMC_CLC_DECL_OPTUNSUPP);
+			rc = smc_switch_to_fallback(smc, SMC_CLC_DECL_OPTUNSUPP);
 		} else {
 			rc = -EINVAL;
 		}
···
 			  char __user *optval, int __user *optlen)
 {
 	struct smc_sock *smc;
+	int rc;
 
 	smc = smc_sk(sock->sk);
+	mutex_lock(&smc->clcsock_release_lock);
+	if (!smc->clcsock) {
+		mutex_unlock(&smc->clcsock_release_lock);
+		return -EBADF;
+	}
 	/* socket options apply to the CLC socket */
-	if (unlikely(!smc->clcsock->ops->getsockopt))
+	if (unlikely(!smc->clcsock->ops->getsockopt)) {
+		mutex_unlock(&smc->clcsock_release_lock);
 		return -EOPNOTSUPP;
-	return smc->clcsock->ops->getsockopt(smc->clcsock, level, optname,
-					     optval, optlen);
+	}
+	rc = smc->clcsock->ops->getsockopt(smc->clcsock, level, optname,
+					   optval, optlen);
+	mutex_unlock(&smc->clcsock_release_lock);
+	return rc;
 }
 
 static int smc_ioctl(struct socket *sock, unsigned int cmd,
net/sunrpc/xprtrdma/rpc_rdma.c
···
 #include "xprt_rdma.h"
 #include <trace/events/rpcrdma.h>
 
-#if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
-# define RPCDBG_FACILITY	RPCDBG_TRANS
-#endif
-
 /* Returns size of largest RPC-over-RDMA header in a Call message
  *
  * The largest Call header contains a full-size Read list and a
net/sunrpc/xprtsock.c
···
 	struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
 	int ret;
 
-	 if (RPC_IS_ASYNC(task)) {
+	if (RPC_IS_ASYNC(task)) {
 		/*
 		 * We want the AF_LOCAL connect to be resolved in the
 		 * filesystem namespace of the process making the rpc
tools/testing/selftests/kvm/lib/kvm_util.c
···
 	struct kvm_vm *vm;
 	int i;
 
-#ifdef __x86_64__
-	/*
-	 * Permission needs to be requested before KVM_SET_CPUID2.
-	 */
-	vm_xsave_req_perm();
-#endif
-
 	/* Force slot0 memory size not small than DEFAULT_GUEST_PHY_PAGES */
 	if (slot0_mem_pages < DEFAULT_GUEST_PHY_PAGES)
 		slot0_mem_pages = DEFAULT_GUEST_PHY_PAGES;