commits
The Code of Conflict is not achieving its implicit goal of fostering
civility and the spirit of 'be excellent to each other'. Explicit
guidelines have demonstrated success in other projects and other areas
of the kernel.
Here is a Code of Conduct statement for the wider kernel. It is based
on the Contributor Covenant as described at www.contributor-covenant.org
From this point forward, we should abide by these rules in order to help
make the kernel community a welcoming environment to participate in.
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Olof Johansson <olof@lxom.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull x86 fixes from Ingol Molnar:
"Misc fixes:
- EFI crash fix
- Xen PV fixes
- do not allow PTI on 2-level 32-bit kernels for now
- documentation fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/APM: Fix build warning when PROC_FS is not enabled
Revert "x86/mm/legacy: Populate the user page-table with user pgd's"
x86/efi: Load fixmap GDT in efi_call_phys_epilog() before setting %cr3
x86/xen: Disable CPU0 hotplug for Xen PV
x86/EISA: Don't probe EISA bus for Xen PV guests
x86/doc: Fix Documentation/x86/earlyprintk.txt
Pull scheduler fixes from Ingo Molnar:
"Misc fixes: various scheduler metrics corner case fixes, a
sched_features deadlock fix, and a topology fix for certain NUMA
systems"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix kernel-doc notation warning
sched/fair: Fix load_balance redo for !imbalance
sched/fair: Fix scale_rt_capacity() for SMT
sched/fair: Fix vruntime_normalized() for remote non-migration wakeup
sched/pelt: Fix update_blocked_averages() for RT and DL classes
sched/topology: Set correct NUMA topology type
sched/debug: Fix potential deadlock when writing to sched_features
Fix build warning in apm_32.c when CONFIG_PROC_FS is not enabled:
../arch/x86/kernel/apm_32.c:1643:12: warning: 'proc_apm_show' defined but not used [-Wunused-function]
static int proc_apm_show(struct seq_file *m, void *v)
Fixes: 3f3942aca6da ("proc: introduce proc_create_single{,_data}")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Jiri Kosina <jikos@kernel.org>
Link: https://lkml.kernel.org/r/be39ac12-44c2-4715-247f-4dcc3c525b8b@infradead.org
Pull perf fixes from Ingo Molnar:
"Mostly tooling fixes, but also breakpoint and x86 PMU driver fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
perf tools: Fix maps__find_symbol_by_name()
tools headers uapi: Update tools's copy of linux/if_link.h
tools headers uapi: Update tools's copy of linux/vhost.h
tools headers uapi: Update tools's copies of kvm headers
tools headers uapi: Update tools's copy of drm/drm.h
tools headers uapi: Update tools's copy of asm-generic/unistd.h
tools headers uapi: Update tools's copy of linux/perf_event.h
perf/core: Force USER_DS when recording user stack data
perf/UAPI: Clearly mark __PERF_SAMPLE_CALLCHAIN_EARLY as internal use
perf/x86/intel: Add support/quirk for the MISPREDICT bit on Knights Landing CPUs
perf annotate: Fix parsing aarch64 branch instructions after objdump update
perf probe powerpc: Ignore SyS symbols irrespective of endianness
perf event-parse: Use fixed size string for comms
perf util: Fix bad memory access in trace info.
perf tools: Streamline bpf examples and headers installation
perf evsel: Fix potential null pointer dereference in perf_evsel__new_idx()
perf arm64: Fix include path for asm-generic/unistd.h
perf/hw_breakpoint: Simplify breakpoint enable in perf_event_modify_breakpoint
perf/hw_breakpoint: Enable breakpoint in modify_user_hw_breakpoint
perf/hw_breakpoint: Remove superfluous bp->attr.disabled = 0
...
Fix kernel-doc warning for missing 'flags' parameter description:
../kernel/sched/fair.c:3371: warning: Function parameter or member 'flags' not described in 'attach_entity_load_avg'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: ea14b57e8a18 ("sched/cpufreq: Provide migration hint")
Link: http://lkml.kernel.org/r/cdda0d42-880d-4229-a9f7-5899c977a063@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This reverts commit 1f40a46cf47c12d93a5ad9dccd82bd36ff8f956a.
It turned out that this patch is not sufficient to enable PTI on 32 bit
systems with legacy 2-level page-tables. In this paging mode the huge-page
PTEs are in the top-level page-table directory, where also the mirroring to
the user-space page-table happens. So every huge PTE exits twice, in the
kernel and in the user page-table.
That means that accessed/dirty bits need to be fetched from two PTEs in
this mode to be safe, but this is not trivial to implement because it needs
changes to generic code just for the sake of enabling PTI with 32-bit
legacy paging. As all systems that need PTI should support PAE anyway,
remove support for PTI when 32-bit legacy paging is used.
Fixes: 7757d607c6b3 ('x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32')
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: hpa@zytor.com
Cc: linux-mm@kvack.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Link: https://lkml.kernel.org/r/1536922754-31379-1-git-send-email-joro@8bytes.org
Pull locking fixes from Ingo Molnar:
"Misc fixes: liblockdep fixes and ww_mutex fixes"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/ww_mutex: Fix spelling mistake "cylic" -> "cyclic"
locking/lockdep: Delete unnecessary #include
tools/lib/lockdep: Add dummy task_struct state member
tools/lib/lockdep: Add empty nmi.h
tools/lib/lockdep: Update Sasha Levin email to MSFT
jump_label: Fix typo in warning message
locking/mutex: Fix mutex debug call and ww_mutex documentation
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix finding a symbol by name when multiple maps use the same backing DSO,
so we must first see if that symbol name is in the DSO, then see if it is
inside the range of addresses for that specific map (Adrian Hunter)
- Update the tools copies of UAPI headers, which silences the warnings
emitted when building the tools and in some cases, like for the new
KVM ioctls, results in 'perf trace' being able to translate that
ioctl number to a string (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It can happen that load_balance() finds a busiest group and then a
busiest rq but the calculated imbalance is in fact 0.
In such situation, detach_tasks() returns immediately and lets the
flag LBF_ALL_PINNED set. The busiest CPU is then wrongly assumed to
have pinned tasks and removed from the load balance mask. then, we
redo a load balance without the busiest CPU. This creates wrong load
balance situation and generates wrong task migration.
If the calculated imbalance is 0, it's useless to try to find a
busiest rq as no task will be migrated and we can return immediately.
This situation can happen with heterogeneous system or smp system when
RT tasks are decreasing the capacity of some CPUs.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: jhugo@codeaurora.org
Link: http://lkml.kernel.org/r/1536306664-29827-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit eeb89e2bb1ac ("x86/efi: Load fixmap GDT in efi_call_phys_epilog()")
moved loading the fixmap in efi_call_phys_epilog() after load_cr3() since
it was assumed to be more logical.
Turns out this is incorrect: In efi_call_phys_prolog(), the gdt with its
physical address is loaded first, and when the %cr3 is reloaded in _epilog
from initial_page_table to swapper_pg_dir again the gdt is no longer
mapped. This results in a triple fault if an interrupt occurs after
load_cr3() and before load_fixmap_gdt(0). Calling load_fixmap_gdt(0) first
restores the execution order prior to commit eeb89e2bb1ac and fixes the
problem.
Fixes: eeb89e2bb1ac ("x86/efi: Load fixmap GDT in efi_call_phys_epilog()")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: linux-efi@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/1536689892-21538-1-git-send-email-linux@roeck-us.net
Pull cifs fixes from Steve French:
"Fixes for four CIFS/SMB3 potential pointer overflow issues, one minor
build fix, and a build warning cleanup"
* tag '4.19-rc3-smb3-cifs' of git://git.samba.org/sfrench/cifs-2.6:
cifs: read overflow in is_valid_oplock_break()
cifs: integer overflow in in SMB2_ioctl()
CIFS: fix wrapping bugs in num_entries()
cifs: prevent integer overflow in nxt_dir_entry()
fs/cifs: require sha512
fs/cifs: suppress a string overflow warning
Trivial fix to spelling mistake in pr_err() error message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20180824112235.8842-1-colin.king@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Perf can record user stack data in response to a synchronous request, such
as a tracepoint firing. If this happens under set_fs(KERNEL_DS), then we
end up reading user stack data using __copy_from_user_inatomic() under
set_fs(KERNEL_DS). I think this conflicts with the intention of using
set_fs(KERNEL_DS). And it is explicitly forbidden by hardware on ARM64
when both CONFIG_ARM64_UAO and CONFIG_ARM64_PAN are used.
So fix this by forcing USER_DS when recording user stack data.
Signed-off-by: Yabin Cui <yabinc@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 88b0193d9418 ("perf/callchain: Force USER_DS when invoking perf_callchain_user()")
Link: http://lkml.kernel.org/r/20180823225935.27035-1-yabinc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 1c5aae7710bb ("perf machine: Create maps for x86 PTI entry
trampolines") revealed a problem with maps__find_symbol_by_name() that
resulted in probes not being found e.g.
$ sudo perf probe xsk_mmap
xsk_mmap is out of .text, skip it.
Probe point 'xsk_mmap' not found.
Error: Failed to add events.
maps__find_symbol_by_name() can optionally return the map of the found
symbol. It can get the map wrong because, in fact, the symbol is found
on the map's dso, not allowing for the possibility that the dso has more
than one map. Fix by always checking the map contains the symbol.
Reported-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 1c5aae7710bb ("perf machine: Create maps for x86 PTI entry trampolines")
Link: http://lkml.kernel.org/r/20180907085116.25782-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since commit:
523e979d3164 ("sched/core: Use PELT for scale_rt_capacity()")
scale_rt_capacity() returns the remaining capacity and not a scale factor
to apply on cpu_capacity_orig. arch_scale_cpu() is directly called by
scale_rt_capacity() so we must take the sched_domain argument.
Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 523e979d3164 ("sched/core: Use PELT for scale_rt_capacity()")
Link: http://lkml.kernel.org/r/20180904093626.GA23936@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Xen PV guests don't allow CPU0 hotplug, so disable it.
Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boris.ostrovsky@oracle.com
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20180912174122.24282-1-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull NFS client bugfixes from Anna Schumaker:
"These are a handful of fixes for problems that Trond found. Patch #1
and #3 have the same name, a second issue was found after applying the
first patch.
Stable bugfixes:
- v4.17+: Fix tracepoint Oops in initiate_file_draining()
- v4.11+: Fix an infinite loop on I/O
Other fixes:
- Return errors if a waiting layoutget is killed
- Don't open code clearing of delegation state"
* tag 'nfs-for-4.19-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFS: Don't open code clearing of delegation state
NFSv4.1 fix infinite loop on I/O.
NFSv4: Fix a tracepoint Oops in initiate_file_draining()
pNFS: Ensure we return the error if someone kills a waiting layoutget
NFSv4: Fix a tracepoint Oops in initiate_file_draining()
We need to verify that the "data_offset" is within bounds.
Reported-by: Dr Silvio Cesare of InfoSect <silvio.cesare@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Commit:
c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage")
added the inclusion of <trace/events/preemptirq.h>.
liblockdep doesn't have a stub version of that header so now fails to build.
However, commit:
bff1b208a5d1 ("tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and unify their usage"")
removed the use of functions declared in that header. So delete the #include.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <alexander.levin@verizon.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: bff1b208a5d1 ("tracing: Partial revert of "tracing: Centralize ...")
Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints ...")
Link: http://lkml.kernel.org/r/20180828203315.GD18030@decadent.org.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Vince noted that commit:
6cbc304f2f36 ("perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)")
'leaked' __PERF_SAMPLE_CALLCHAIN_EARLY into the UAPI namespace. And
while sys_perf_event_open() will error out if you try to use it, it is
exposed.
Clearly mark it for internal use only to avoid any confusion.
Requested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
To get the changes in:
3e7a50ceb11e ("net: report min and max mtu network device settings")
2756f68c3149 ("net: bridge: add support for backup port")
a25717d2b604 ("xdp: support simultaneous driver and hw XDP attachment")
4f91da26c811 ("xdp: add per mode attributes for attached programs")
f203b76d7809 ("xfrm: Add virtual xfrm interfaces")
Silencing this libbpf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/if_link.h' differs from latest version at 'include/uapi/linux/if_link.h'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-xd9ztioa894zemv8ag8kg64u@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a task which previously ran on a given CPU is remotely queued to
wake up on that same CPU, there is a period where the task's state is
TASK_WAKING and its vruntime is not normalized. This is not accounted
for in vruntime_normalized() which will cause an error in the task's
vruntime if it is switched from the fair class during this time.
For example if it is boosted to RT priority via rt_mutex_setprio(),
rq->min_vruntime will not be subtracted from the task's vruntime but
it will be added again when the task returns to the fair class. The
task's vruntime will have been erroneously doubled and the effective
priority of the task will be reduced.
Note this will also lead to inflation of all vruntimes since the doubled
vruntime value will become the rq's min_vruntime when other tasks leave
the rq. This leads to repeated doubling of the vruntime and priority
penalty.
Fix this by recognizing a WAKING task's vruntime as normalized only if
sched_remote_wakeup is true. This indicates a migration, in which case
the vruntime would have been normalized in migrate_task_rq_fair().
Based on a similar patch from John Dias <joaodias@google.com>.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Steve Muckle <smuckle@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Redpath <Chris.Redpath@arm.com>
Cc: John Dias <joaodias@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miguel de Dios <migueldedios@google.com>
Cc: Morten Rasmussen <Morten.Rasmussen@arm.com>
Cc: Patrick Bellasi <Patrick.Bellasi@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@google.com>
Cc: kernel-team@android.com
Fixes: b5179ac70de8 ("sched/fair: Prepare to fix fairness problems on migration")
Link: http://lkml.kernel.org/r/20180831224217.169476-1-smuckle@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For unprivileged Xen PV guests this is normal memory and ioremap will
not be able to properly map it.
While at it, since ioremap may return NULL, add a test for pointer's
validity.
Reported-by: Andy Smith <andy@strugglers.net>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: hpa@zytor.com
Cc: xen-devel@lists.xenproject.org
Cc: jgross@suse.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180911195538.23289-1-boris.ostrovsky@oracle.com
Pull tracing fix from Steven Rostedt:
"This fixes an issue with the build system caused by a change that
modifies CC_FLAGS_FTRACE. The issue is that it breaks the dependencies
and causes "make targz-pkg" to rebuild the entire world"
* tag 'trace-v4.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/Makefile: Fix handling redefinition of CC_FLAGS_FTRACE
Add a helper for the case when the nfs4 open state has been set to use
a delegation stateid, and we want to revert to using the open stateid.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
The "le32_to_cpu(rsp->OutputOffset) + *plen" addition can overflow and
wrap around to a smaller value which looks like it would lead to an
information leak.
Fixes: 4a72dafa19ba ("SMB2 FSCTL and IOCTL worker function")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
CC: Stable <stable@vger.kernel.org>
Commit:
8cc05c71ba5f ("locking/lockdep: Move sanity check to inside lockdep_print_held_locks()")
added accesses to the task_struct's state member. Add dummy userspace declaration.
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <Alexander.Levin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180813190527.16853-4-alexander.levin@microsoft.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Problem: perf did not show branch predicted/mispredicted bit in brstack.
Output of perf -F brstack for profile collected
Before:
0x4fdbcd/0x4fdc03/-/-/-/0
0x45f4c1/0x4fdba0/-/-/-/0
0x45f544/0x45f4bb/-/-/-/0
0x45f555/0x45f53c/-/-/-/0
0x7f66901cc24b/0x45f555/-/-/-/0
0x7f66901cc22e/0x7f66901cc23d/-/-/-/0
0x7f66901cc1ff/0x7f66901cc20f/-/-/-/0
0x7f66901cc1e8/0x7f66901cc1fc/-/-/-/0
After:
0x4fdbcd/0x4fdc03/P/-/-/0
0x45f4c1/0x4fdba0/P/-/-/0
0x45f544/0x45f4bb/P/-/-/0
0x45f555/0x45f53c/P/-/-/0
0x7f66901cc24b/0x45f555/P/-/-/0
0x7f66901cc22e/0x7f66901cc23d/P/-/-/0
0x7f66901cc1ff/0x7f66901cc20f/P/-/-/0
0x7f66901cc1e8/0x7f66901cc1fc/P/-/-/0
Cause:
As mentioned in Software Development Manual vol 3, 17.4.8.1,
IA32_PERF_CAPABILITIES[5:0] indicates the format of the address that is
stored in the LBR stack. Knights Landing reports 1 (LBR_FORMAT_LIP) as
its format. Despite that, registers containing FROM address of the branch,
do have MISPREDICT bit but because of the format indicated in
IA32_PERF_CAPABILITIES[5:0], LBR did not read MISPREDICT bit.
Solution:
Teach LBR about above Knights Landing quirk and make it read MISPREDICT bit.
Signed-off-by: Jacek Tomaka <jacek.tomaka@poczta.fm>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180802013830.10600-1-jacekt@dugeo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
To get the changes in:
c48300c92ad9 ("vhost: fix VHOST_GET_BACKEND_FEATURES ioctl request definition")
This makes 'perf trace' and other tools in the future using its
beautifiers in a libbeauty.so library be able to translate these new
ioctl to strings:
$ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > /tmp/after
$ diff -u /tmp/before /tmp/after
--- /tmp/before 2018-09-11 13:10:57.923038244 -0300
+++ /tmp/after 2018-09-11 13:11:20.329012685 -0300
@@ -15,6 +15,7 @@
[0x22] = "SET_VRING_ERR",
[0x23] = "SET_VRING_BUSYLOOP_TIMEOUT",
[0x24] = "GET_VRING_BUSYLOOP_TIMEOUT",
+ [0x25] = "SET_BACKEND_FEATURES",
[0x30] = "NET_SET_BACKEND",
[0x40] = "SCSI_SET_ENDPOINT",
[0x41] = "SCSI_CLEAR_ENDPOINT",
@@ -27,4 +28,5 @@
static const char *vhost_virtio_ioctl_read_cmds[] = {
[0x00] = "GET_FEATURES",
[0x12] = "GET_VRING_BASE",
+ [0x26] = "GET_BACKEND_FEATURES",
};
$
We'll also use this to be able to express syscall filters using symbolic
these symbolic names, something like:
# perf trace --all-cpus -e ioctl(cmd=*GET_FEATURES)
This silences the following warning during perf's build:
Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from latest version at 'include/uapi/linux/vhost.h'
diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-35x71oei2hdui9u0tarpimbq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
update_blocked_averages() is called to periodiccally decay the stalled load
of idle CPUs and to sync all loads before running load balance.
When cfs rq is idle, it trigs a load balance during pick_next_task_fair()
in order to potentially pull tasks and to use this newly idle CPU. This
load balance happens whereas prev task from another class has not been put
and its utilization updated yet. This may lead to wrongly account running
time as idle time for RT or DL classes.
Test that no RT or DL task is running when updating their utilization in
update_blocked_averages().
We still update RT and DL utilization instead of simply skipping them to
make sure that all metrics are synced when used during load balance.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 371bf4273269 ("sched/rt: Add rt_rq utilization tracking")
Fixes: 3727e0e16340 ("sched/dl: Add dl_rq utilization tracking")
Link: http://lkml.kernel.org/r/1535728975-22799-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fix a few issues in Documentation/x86/earlyprintk.txt:
- correct typos, punctuation, missing word, wrong word
- change product name from Netchip to NetChip
- expand where to add "earlyprintk=dbg"
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Link: http://lkml.kernel.org/r/d0c40ac3-7659-6374-dbda-23d3d2577f30@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull DeviceTree fix from Rob Herring:
"One regression for a 20 year old PowerMac:
- Fix a regression on systems having a DT without any phandles which
happens on a PowerMac G3"
* tag 'devicetree-fixes-for-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of: fix phandle cache creation for DTs with no phandles
As a Kernel developer, I make heavy use of "make targz-pkg" in order
to locally compile and remotely install my development Kernels. The
nice feature I rely on is that after a normal "make", "make targz-pkg"
only generates the tarball without having to recompile everything.
That was true until commit f28bc3c32c05 ("tracing: Handle
CC_FLAGS_FTRACE more accurately"). After it, running "make targz-pkg"
after "make" will recompile the whole Kernel tree, making my
development workflow much slower.
The Kernel is choosing to recompile everything because it claims the
command line has changed. A diff of the .cmd files show a repeated
-mfentry in one of the files. That is because "make targz-pkg" calls
"make modules_install" and the environment is already populated with
the exported variables, CC_FLAGS_FTRACE being one of them. Then,
-mfentry gets duplicated because it is not protected behind an ifndef
block, like -pg.
To complicate the problem a little bit more, architectures can define
their own version CC_FLAGS_FTRACE, so our code not only has to
consider recursive Makefiles, but also architecture overrides.
So in this patch we move CC_FLAGS_FTRACE up and unconditionally
define it to -pg. Then we let the architecture Makefiles possibly
override it, and finally append the extra options later. This ensures
the variable is always fully redefined at each invocation so recursive
Makefiles don't keep appending, and hopefully it maintains the
intended behavior on how architectures can override the defaults..
Thanks Steven Rostedt and Vasily Gorbik for the help on this
regression.
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: linux-kbuild@vger.kernel.org
Fixes: commit f28bc3c32c05 ("tracing: Handle CC_FLAGS_FTRACE more accurately")
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The previous fix broke recovery of delegated stateids because it assumes
that if we did not mark the delegation as suspect, then the delegation has
effectively been revoked, and so it removes that delegation irrespectively
of whether or not it is valid and still in use. While this is "mostly
harmless" for ordinary I/O, we've seen pNFS fail with LAYOUTGET spinning
in an infinite loop while complaining that we're using an invalid stateid
(in this case the all-zero stateid).
What we rather want to do here is ensure that the delegation is always
correctly marked as needing testing when that is the case. So we want
to close the loophole offered by nfs4_schedule_stateid_recovery(),
which marks the state as needing to be reclaimed, but not the
delegation that may be backing it.
Fixes: 0e3d3e5df07dc ("NFSv4.1 fix infinite loop on IO BAD_STATEID error")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.11+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
The problem is that "entryptr + next_offset" and "entryptr + len + size"
can wrap. I ended up changing the type of "entryptr" because it makes
the math easier when we don't have to do so much casting.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Required since:
88f1c87de11a8 ("locking/lockdep: Avoid triggering hardlockup from debug_show_all_locks()")
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <Alexander.Levin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180813190527.16853-3-alexander.levin@microsoft.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
Kernel:
- Modify breakpoint fixes (Jiri Olsa)
perf annotate:
- Fix parsing aarch64 branch instructions after objdump update (Kim Phillips)
- Fix parsing indirect calls in 'perf annotate' (Martin Liška)
perf probe:
- Ignore SyS symbols irrespective of endianness on PowerPC (Sandipan Das)
perf trace:
- Fix include path for asm-generic/unistd.h on arm64 (Kim Phillips)
Core libraries:
- Fix potential null pointer dereference in perf_evsel__new_idx() (Hisao Tanabe)
- Use fixed size string for comms instead of scanf("%m"), that is
not present in the bionic libc and leads to a crash (Chris Phlipot)
- Fix bad memory access in trace info on 32-bit systems, we were reading
8 bytes from a 4-byte long variable when saving the command line in the
perf.data file. (Chris Phlipot)
Build system:
- Streamline bpf examples and headers installation, clarifying
some install messages. (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
To get the changes in:
a449938297e5 ("KVM: s390: Add huge page enablement control")
8fcc4b5923af ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE")
be26b3a73413 ("arm64: KVM: export the capability to set guest SError syndrome")
b7b27facc7b5 ("arm/arm64: KVM: Add KVM_GET/SET_VCPU_EVENTS")
b0960b9569db ("KVM: arm: Add 32bit get/set events support")
a3da7b4a3be5 ("KVM: s390: add etoken support for guests")
This makes 'perf trace' automagically get aware of these new ioctls:
$ cp include/uapi/linux/kvm.h tools/include/uapi/linux/kvm.h
$ tools/perf/trace/beauty/kvm_ioctl.sh > /tmp/after
$ diff -u /tmp/before /tmp/after
--- /tmp/before 2018-09-11 11:18:29.173207586 -0300
+++ /tmp/after 2018-09-11 11:18:38.488200446 -0300
@@ -84,6 +84,8 @@
[0xbb] = "MEMORY_ENCRYPT_REG_REGION",
[0xbc] = "MEMORY_ENCRYPT_UNREG_REGION",
[0xbd] = "HYPERV_EVENTFD",
+ [0xbe] = "GET_NESTED_STATE",
+ [0xbf] = "SET_NESTED_STATE",
[0xe0] = "CREATE_DEVICE",
[0xe1] = "SET_DEVICE_ATTR",
[0xe2] = "G
And cures the following warning during perf's build:
Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dongjiu Geng <gengdongjiu@huawei.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-2vvwh2o19orn56di0ksrtgzr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the following commit:
051f3ca02e46 ("sched/topology: Introduce NUMA identity node sched domain")
the scheduler introduced a new NUMA level. However this leads to the NUMA topology
on 2 node systems to not be marked as NUMA_DIRECT anymore.
After this commit, it gets reported as NUMA_BACKPLANE, because
sched_domains_numa_level is now 2 on 2 node systems.
Fix this by allowing setting systems that have up to 2 NUMA levels as
NUMA_DIRECT.
While here remove code that assumes that level can be 0.
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andre Wild <wild@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Fixes: 051f3ca02e46 "Introduce NUMA identity node sched domain"
Link: http://lkml.kernel.org/r/1533920419-17410-1-git-send-email-srikar@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull xen fixes from Juergen Gross:
"This contains some minor cleanups and fixes:
- a new knob for controlling scrubbing of pages returned by the Xen
balloon driver to the Xen hypervisor to address a boot performance
issue seen in large guests booted pre-ballooned
- a fix of a regression in the gntdev driver which made it impossible
to use fully virtualized guests (HVM guests) with a 4.19 based dom0
- a fix in Xen cpu hotplug functionality which could be triggered by
wrong admin commands (setting number of active vcpus to 0)
One further note: the patches have all been under test for several
days in another branch. This branch has been rebased in order to avoid
merge conflicts"
* tag 'for-linus-4.19c-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/gntdev: fix up blockable calls to mn_invl_range_start
xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usage
xen: avoid crash in disable_hotplug_cpu
xen/balloon: add runtime control for scrubbing ballooned out pages
xen/manage: don't complain about an empty value in control/sysrq node
With commit 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of
of_find_node_by_phandle()"), a G3 PowerMac fails to boot. The root cause
is the DT for this system has no phandle properties when booted with
BootX. of_populate_phandle_cache() does not handle the case of no
phandles correctly. The problem is roundup_pow_of_two() for 0 is
undefined. The implementation subtracts 1 underflowing and then things
are in the weeds.
Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
Cc: stable@vger.kernel.org # 4.17+
Reported-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Stan Johnson <userm57@yahoo.com>
Reviewed-by: Frank Rowand <frank.rowand@sony.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Now that the value of 'ino' can be NULL or an ERR_PTR(), we need to
change the test in the tracepoint.
Fixes: ce5624f7e6675 ("NFSv4: Return NFS4ERR_DELAY when a layout fails...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
The "old_entry + le32_to_cpu(pDirInfo->NextEntryOffset)" can wrap
around so I have added a check for integer overflow.
Reported-by: Dr Silvio Cesare of InfoSect <silvio.cesare@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <Alexander.Levin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180813190527.16853-2-alexander.levin@microsoft.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull x86 fixes from Thomas Gleixner:
"A set of fixes for x86:
- Prevent multiplication result truncation on 32bit. Introduced with
the early timestamp reworrk.
- Ensure microcode revision storage to be consistent under all
circumstances
- Prevent write tearing of PTEs
- Prevent confusion of user and kernel reegisters when dumping fatal
signals verbosely
- Make an error return value in a failure path of the vector
allocation negative. Returning EINVAL might the caller assume
success and causes further wreckage.
- A trivial kernel doc warning fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Use WRITE_ONCE() when setting PTEs
x86/apic/vector: Make error return value negative
x86/process: Don't mix user/kernel regs in 64bit __show_regs()
x86/tsc: Prevent result truncation on 32bit
x86: Fix kernel-doc atomic.h warnings
x86/microcode: Update the new microcode revision unconditionally
x86/microcode: Make sure boot_cpu_data.microcode is up-to-date
Starting with binutils 2.28, aarch64 objdump adds comments to the
disassembly output to show the alternative names of a condition code
[1].
It is assumed that commas in objdump comments could occur in other
arches now or in the future, so this fix is arch-independent.
The fix could have been done with arm64 specific jump__parse and
jump__scnprintf functions, but the jump__scnprintf instruction would
have to have its comment character be a literal, since the scnprintf
functions cannot receive a struct arch easily.
This inconvenience also applies to the generic jump__scnprintf, which is
why we add a raw_comment pointer to struct ins_operands, so the __parse
function assigns it to be re-used by its corresponding __scnprintf
function.
Example differences in 'perf annotate --stdio2' output on an aarch64
perf.data file:
BEFORE: → b.cs ffff200008133d1c <unwind_frame+0x18c> // b.hs, dffff7ecc47b
AFTER : ↓ b.cs 18c
BEFORE: → b.cc ffff200008d8d9cc <get_alloc_profile+0x31c> // b.lo, b.ul, dffff727295b
AFTER : ↓ b.cc 31c
The branch target labels 18c and 31c also now appear in the output:
BEFORE: add x26, x29, #0x80
AFTER : 18c: add x26, x29, #0x80
BEFORE: add x21, x21, #0x8
AFTER : 31c: add x21, x21, #0x8
The Fixes: tag below is added so stable branches will get the update; it
doesn't necessarily mean that commit was broken at the time, rather it
didn't withstand the aarch64 objdump update.
Tested no difference in output for sample x86_64, power arch perf.data files.
[1] https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=bb7eff5206e4795ac79c177a80fe9f4630aaf730
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Fixes: b13bbeee5ee6 ("perf annotate: Fix branch instruction with multiple operands")
Link: http://lkml.kernel.org/r/20180827125340.a2f7e291901d17cea05daba4@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To get the changes in:
d67b6a206507 ("drm: writeback: Add client capability for exposing writeback connectors")
This is for an argument to a DRM ioctl, which is not being prettyfied in
the 'perf trace' DRM ioctl beautifier, but will now that syscalls are
starting to have pointer arguments augmented via BPF.
This time around this just cures the following warning during perf's
build:
Warning: Kernel ABI header at 'tools/include/uapi/drm/drm.h' differs from latest version at 'include/uapi/drm/drm.h'
diff -u tools/include/uapi/drm/drm.h include/uapi/drm/drm.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brian Starkey <brian.starkey@arm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-n7qib1bac6mc6w9oke7r4qdc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The following lockdep report can be triggered by writing to /sys/kernel/debug/sched_features:
======================================================
WARNING: possible circular locking dependency detected
4.18.0-rc6-00152-gcd3f77d74ac3-dirty #18 Not tainted
------------------------------------------------------
sh/3358 is trying to acquire lock:
000000004ad3989d (cpu_hotplug_lock.rw_sem){++++}, at: static_key_enable+0x14/0x30
but task is already holding lock:
00000000c1b31a88 (&sb->s_type->i_mutex_key#3){+.+.}, at: sched_feat_write+0x160/0x428
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (&sb->s_type->i_mutex_key#3){+.+.}:
lock_acquire+0xb8/0x148
down_write+0xac/0x140
start_creating+0x5c/0x168
debugfs_create_dir+0x18/0x220
opp_debug_register+0x8c/0x120
_add_opp_dev+0x104/0x1f8
dev_pm_opp_get_opp_table+0x174/0x340
_of_add_opp_table_v2+0x110/0x760
dev_pm_opp_of_add_table+0x5c/0x240
dev_pm_opp_of_cpumask_add_table+0x5c/0x100
cpufreq_init+0x160/0x430
cpufreq_online+0x1cc/0xe30
cpufreq_add_dev+0x78/0x198
subsys_interface_register+0x168/0x270
cpufreq_register_driver+0x1c8/0x278
dt_cpufreq_probe+0xdc/0x1b8
platform_drv_probe+0xb4/0x168
driver_probe_device+0x318/0x4b0
__device_attach_driver+0xfc/0x1f0
bus_for_each_drv+0xf8/0x180
__device_attach+0x164/0x200
device_initial_probe+0x10/0x18
bus_probe_device+0x110/0x178
device_add+0x6d8/0x908
platform_device_add+0x138/0x3d8
platform_device_register_full+0x1cc/0x1f8
cpufreq_dt_platdev_init+0x174/0x1bc
do_one_initcall+0xb8/0x310
kernel_init_freeable+0x4b8/0x56c
kernel_init+0x10/0x138
ret_from_fork+0x10/0x18
-> #2 (opp_table_lock){+.+.}:
lock_acquire+0xb8/0x148
__mutex_lock+0x104/0xf50
mutex_lock_nested+0x1c/0x28
_of_add_opp_table_v2+0xb4/0x760
dev_pm_opp_of_add_table+0x5c/0x240
dev_pm_opp_of_cpumask_add_table+0x5c/0x100
cpufreq_init+0x160/0x430
cpufreq_online+0x1cc/0xe30
cpufreq_add_dev+0x78/0x198
subsys_interface_register+0x168/0x270
cpufreq_register_driver+0x1c8/0x278
dt_cpufreq_probe+0xdc/0x1b8
platform_drv_probe+0xb4/0x168
driver_probe_device+0x318/0x4b0
__device_attach_driver+0xfc/0x1f0
bus_for_each_drv+0xf8/0x180
__device_attach+0x164/0x200
device_initial_probe+0x10/0x18
bus_probe_device+0x110/0x178
device_add+0x6d8/0x908
platform_device_add+0x138/0x3d8
platform_device_register_full+0x1cc/0x1f8
cpufreq_dt_platdev_init+0x174/0x1bc
do_one_initcall+0xb8/0x310
kernel_init_freeable+0x4b8/0x56c
kernel_init+0x10/0x138
ret_from_fork+0x10/0x18
-> #1 (subsys mutex#6){+.+.}:
lock_acquire+0xb8/0x148
__mutex_lock+0x104/0xf50
mutex_lock_nested+0x1c/0x28
subsys_interface_register+0xd8/0x270
cpufreq_register_driver+0x1c8/0x278
dt_cpufreq_probe+0xdc/0x1b8
platform_drv_probe+0xb4/0x168
driver_probe_device+0x318/0x4b0
__device_attach_driver+0xfc/0x1f0
bus_for_each_drv+0xf8/0x180
__device_attach+0x164/0x200
device_initial_probe+0x10/0x18
bus_probe_device+0x110/0x178
device_add+0x6d8/0x908
platform_device_add+0x138/0x3d8
platform_device_register_full+0x1cc/0x1f8
cpufreq_dt_platdev_init+0x174/0x1bc
do_one_initcall+0xb8/0x310
kernel_init_freeable+0x4b8/0x56c
kernel_init+0x10/0x138
ret_from_fork+0x10/0x18
-> #0 (cpu_hotplug_lock.rw_sem){++++}:
__lock_acquire+0x203c/0x21d0
lock_acquire+0xb8/0x148
cpus_read_lock+0x58/0x1c8
static_key_enable+0x14/0x30
sched_feat_write+0x314/0x428
full_proxy_write+0xa0/0x138
__vfs_write+0xd8/0x388
vfs_write+0xdc/0x318
ksys_write+0xb4/0x138
sys_write+0xc/0x18
__sys_trace_return+0x0/0x4
other info that might help us debug this:
Chain exists of:
cpu_hotplug_lock.rw_sem --> opp_table_lock --> &sb->s_type->i_mutex_key#3
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&sb->s_type->i_mutex_key#3);
lock(opp_table_lock);
lock(&sb->s_type->i_mutex_key#3);
lock(cpu_hotplug_lock.rw_sem);
*** DEADLOCK ***
2 locks held by sh/3358:
#0: 00000000a8c4b363 (sb_writers#10){.+.+}, at: vfs_write+0x238/0x318
#1: 00000000c1b31a88 (&sb->s_type->i_mutex_key#3){+.+.}, at: sched_feat_write+0x160/0x428
stack backtrace:
CPU: 5 PID: 3358 Comm: sh Not tainted 4.18.0-rc6-00152-gcd3f77d74ac3-dirty #18
Hardware name: Renesas H3ULCB Kingfisher board based on r8a7795 ES2.0+ (DT)
Call trace:
dump_backtrace+0x0/0x288
show_stack+0x14/0x20
dump_stack+0x13c/0x1ac
print_circular_bug.isra.10+0x270/0x438
check_prev_add.constprop.16+0x4dc/0xb98
__lock_acquire+0x203c/0x21d0
lock_acquire+0xb8/0x148
cpus_read_lock+0x58/0x1c8
static_key_enable+0x14/0x30
sched_feat_write+0x314/0x428
full_proxy_write+0xa0/0x138
__vfs_write+0xd8/0x388
vfs_write+0xdc/0x318
ksys_write+0xb4/0x138
sys_write+0xc/0x18
__sys_trace_return+0x0/0x4
This is because when loading the cpufreq_dt module we first acquire
cpu_hotplug_lock.rw_sem lock, then in cpufreq_init(), we are taking
the &sb->s_type->i_mutex_key lock.
But when writing to /sys/kernel/debug/sched_features, the
cpu_hotplug_lock.rw_sem lock depends on the &sb->s_type->i_mutex_key lock.
To fix this bug, reverse the lock acquisition order when writing to
sched_features, this way cpu_hotplug_lock.rw_sem no longer depends on
&sb->s_type->i_mutex_key.
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Jiada Wang <jiada_wang@mentor.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Eugeniu Rosca <erosca@de.adit-jv.com>
Cc: George G. Davis <george_davis@mentor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180731121222.26195-1-jiada_wang@mentor.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull Xtensa fixes and cleanups from Max Filippov:
- don't allocate memory in platform_setup as the memory allocator is
not initialized at that point yet;
- remove unnecessary ifeq KBUILD_SRC from arch/xtensa/Makefile;
- enable SG chaining in arch/xtensa/Kconfig.
* tag 'xtensa-20180914' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: enable SG chaining in Kconfig
xtensa: remove unnecessary KBUILD_SRC ifeq conditional
xtensa: ISS: don't allocate memory in platform_setup
Patch series "mmu_notifiers follow ups".
Tetsuo has noticed some fallouts from 93065ac753e4 ("mm, oom: distinguish
blockable mode for mmu notifiers"). One of them has been fixed and picked
up by AMD/DRM maintainer [1]. XEN issue is fixed by patch 1. I have also
clarified expectations about blockable semantic of invalidate_range_end.
Finally the last patch removes MMU_INVALIDATE_DOES_NOT_BLOCK which is no
longer used nor needed.
[1] http://lkml.kernel.org/r/20180824135257.GU29735@dhcp22.suse.cz
This patch (of 3):
93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers") has
introduced blockable parameter to all mmu_notifiers and the notifier has
to back off when called in !blockable case and it could block down the
road.
The above commit implemented that for mn_invl_range_start but both
in_range checks are done unconditionally regardless of the blockable mode
and as such they would fail all the time for regular calls. Fix this by
checking blockable parameter as well.
Once we are there we can remove the stale TODO. The lock has to be
sleepable because we wait for completion down in gnttab_unmap_refs_sync.
Link: http://lkml.kernel.org/r/20180827112623.8992-2-mhocko@kernel.org
Fixes: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
In preparation to remove direct access to device_node.type, add
of_node_is_type() and of_node_get_device_type() helpers to check and
retrieve the device type.
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
If someone interrupts a wait on one or more outstanding layoutgets in
pnfs_update_layout() then return the ERESTARTSYS/EINTR error.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
This got lost in commit 0fdfef9aa7ee68ddd508aef7c98630cfc054f8d6,
which removed CONFIG_CIFS_SMB311.
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Fixes: 0fdfef9aa7ee68ddd ("smb3: simplify code by removing CONFIG_CIFS_SMB311")
CC: Stable <stable@vger.kernel.org>
CC: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
There's no 'allocatote' - use the next best thing: 'allocate' :-)
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180907103521.31344-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull timekeeping fixes from Thomas Gleixner:
"Two fixes for timekeeping:
- Revert to the previous kthread based update, which is unfortunately
required due to lock ordering issues. The removal caused boot
failures on old Core2 machines. Add a proper comment why the thread
needs to stay to prevent accidental removal in the future.
- Fix a silly typo in a function declaration"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: Revert "Remove kthread"
timekeeping: Fix declaration of read_persistent_wall_and_boot_offset()
When page-table entries are set, the compiler might optimize their
assignment by using multiple instructions to set the PTE. This might
turn into a security hazard if the user somehow manages to use the
interim PTE. L1TF does not make our lives easier, making even an interim
non-present PTE a security hazard.
Using WRITE_ONCE() to set PTEs and friends should prevent this potential
security hazard.
I skimmed the differences in the binary with and without this patch. The
differences are (obviously) greater when CONFIG_PARAVIRT=n as more
code optimizations are possible. For better and worse, the impact on the
binary with this patch is pretty small. Skimming the code did not cause
anything to jump out as a security hazard, but it seems that at least
move_soft_dirty_pte() caused set_pte_at() to use multiple writes.
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180902181451.80520-1-namit@vmware.com
This makes sure that the SyS symbols are ignored for any powerpc system,
not just the big endian ones.
Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Fixes: fb6d59423115 ("perf probe ppc: Use the right prefix when ignoring SyS symbols on ppc")
Link: http://lkml.kernel.org/r/20180828090848.1914-1-sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The Code of Conflict is not achieving its implicit goal of fostering
civility and the spirit of 'be excellent to each other'. Explicit
guidelines have demonstrated success in other projects and other areas
of the kernel.
Here is a Code of Conduct statement for the wider kernel. It is based
on the Contributor Covenant as described at www.contributor-covenant.org
From this point forward, we should abide by these rules in order to help
make the kernel community a welcoming environment to participate in.
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Olof Johansson <olof@lxom.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull x86 fixes from Ingol Molnar:
"Misc fixes:
- EFI crash fix
- Xen PV fixes
- do not allow PTI on 2-level 32-bit kernels for now
- documentation fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/APM: Fix build warning when PROC_FS is not enabled
Revert "x86/mm/legacy: Populate the user page-table with user pgd's"
x86/efi: Load fixmap GDT in efi_call_phys_epilog() before setting %cr3
x86/xen: Disable CPU0 hotplug for Xen PV
x86/EISA: Don't probe EISA bus for Xen PV guests
x86/doc: Fix Documentation/x86/earlyprintk.txt
Pull scheduler fixes from Ingo Molnar:
"Misc fixes: various scheduler metrics corner case fixes, a
sched_features deadlock fix, and a topology fix for certain NUMA
systems"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix kernel-doc notation warning
sched/fair: Fix load_balance redo for !imbalance
sched/fair: Fix scale_rt_capacity() for SMT
sched/fair: Fix vruntime_normalized() for remote non-migration wakeup
sched/pelt: Fix update_blocked_averages() for RT and DL classes
sched/topology: Set correct NUMA topology type
sched/debug: Fix potential deadlock when writing to sched_features
Fix build warning in apm_32.c when CONFIG_PROC_FS is not enabled:
../arch/x86/kernel/apm_32.c:1643:12: warning: 'proc_apm_show' defined but not used [-Wunused-function]
static int proc_apm_show(struct seq_file *m, void *v)
Fixes: 3f3942aca6da ("proc: introduce proc_create_single{,_data}")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Jiri Kosina <jikos@kernel.org>
Link: https://lkml.kernel.org/r/be39ac12-44c2-4715-247f-4dcc3c525b8b@infradead.org
Pull perf fixes from Ingo Molnar:
"Mostly tooling fixes, but also breakpoint and x86 PMU driver fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
perf tools: Fix maps__find_symbol_by_name()
tools headers uapi: Update tools's copy of linux/if_link.h
tools headers uapi: Update tools's copy of linux/vhost.h
tools headers uapi: Update tools's copies of kvm headers
tools headers uapi: Update tools's copy of drm/drm.h
tools headers uapi: Update tools's copy of asm-generic/unistd.h
tools headers uapi: Update tools's copy of linux/perf_event.h
perf/core: Force USER_DS when recording user stack data
perf/UAPI: Clearly mark __PERF_SAMPLE_CALLCHAIN_EARLY as internal use
perf/x86/intel: Add support/quirk for the MISPREDICT bit on Knights Landing CPUs
perf annotate: Fix parsing aarch64 branch instructions after objdump update
perf probe powerpc: Ignore SyS symbols irrespective of endianness
perf event-parse: Use fixed size string for comms
perf util: Fix bad memory access in trace info.
perf tools: Streamline bpf examples and headers installation
perf evsel: Fix potential null pointer dereference in perf_evsel__new_idx()
perf arm64: Fix include path for asm-generic/unistd.h
perf/hw_breakpoint: Simplify breakpoint enable in perf_event_modify_breakpoint
perf/hw_breakpoint: Enable breakpoint in modify_user_hw_breakpoint
perf/hw_breakpoint: Remove superfluous bp->attr.disabled = 0
...
Fix kernel-doc warning for missing 'flags' parameter description:
../kernel/sched/fair.c:3371: warning: Function parameter or member 'flags' not described in 'attach_entity_load_avg'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: ea14b57e8a18 ("sched/cpufreq: Provide migration hint")
Link: http://lkml.kernel.org/r/cdda0d42-880d-4229-a9f7-5899c977a063@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This reverts commit 1f40a46cf47c12d93a5ad9dccd82bd36ff8f956a.
It turned out that this patch is not sufficient to enable PTI on 32 bit
systems with legacy 2-level page-tables. In this paging mode the huge-page
PTEs are in the top-level page-table directory, where also the mirroring to
the user-space page-table happens. So every huge PTE exits twice, in the
kernel and in the user page-table.
That means that accessed/dirty bits need to be fetched from two PTEs in
this mode to be safe, but this is not trivial to implement because it needs
changes to generic code just for the sake of enabling PTI with 32-bit
legacy paging. As all systems that need PTI should support PAE anyway,
remove support for PTI when 32-bit legacy paging is used.
Fixes: 7757d607c6b3 ('x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32')
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: hpa@zytor.com
Cc: linux-mm@kvack.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Link: https://lkml.kernel.org/r/1536922754-31379-1-git-send-email-joro@8bytes.org
Pull locking fixes from Ingo Molnar:
"Misc fixes: liblockdep fixes and ww_mutex fixes"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/ww_mutex: Fix spelling mistake "cylic" -> "cyclic"
locking/lockdep: Delete unnecessary #include
tools/lib/lockdep: Add dummy task_struct state member
tools/lib/lockdep: Add empty nmi.h
tools/lib/lockdep: Update Sasha Levin email to MSFT
jump_label: Fix typo in warning message
locking/mutex: Fix mutex debug call and ww_mutex documentation
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix finding a symbol by name when multiple maps use the same backing DSO,
so we must first see if that symbol name is in the DSO, then see if it is
inside the range of addresses for that specific map (Adrian Hunter)
- Update the tools copies of UAPI headers, which silences the warnings
emitted when building the tools and in some cases, like for the new
KVM ioctls, results in 'perf trace' being able to translate that
ioctl number to a string (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It can happen that load_balance() finds a busiest group and then a
busiest rq but the calculated imbalance is in fact 0.
In such situation, detach_tasks() returns immediately and lets the
flag LBF_ALL_PINNED set. The busiest CPU is then wrongly assumed to
have pinned tasks and removed from the load balance mask. then, we
redo a load balance without the busiest CPU. This creates wrong load
balance situation and generates wrong task migration.
If the calculated imbalance is 0, it's useless to try to find a
busiest rq as no task will be migrated and we can return immediately.
This situation can happen with heterogeneous system or smp system when
RT tasks are decreasing the capacity of some CPUs.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: jhugo@codeaurora.org
Link: http://lkml.kernel.org/r/1536306664-29827-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit eeb89e2bb1ac ("x86/efi: Load fixmap GDT in efi_call_phys_epilog()")
moved loading the fixmap in efi_call_phys_epilog() after load_cr3() since
it was assumed to be more logical.
Turns out this is incorrect: In efi_call_phys_prolog(), the gdt with its
physical address is loaded first, and when the %cr3 is reloaded in _epilog
from initial_page_table to swapper_pg_dir again the gdt is no longer
mapped. This results in a triple fault if an interrupt occurs after
load_cr3() and before load_fixmap_gdt(0). Calling load_fixmap_gdt(0) first
restores the execution order prior to commit eeb89e2bb1ac and fixes the
problem.
Fixes: eeb89e2bb1ac ("x86/efi: Load fixmap GDT in efi_call_phys_epilog()")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: linux-efi@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/1536689892-21538-1-git-send-email-linux@roeck-us.net
Pull cifs fixes from Steve French:
"Fixes for four CIFS/SMB3 potential pointer overflow issues, one minor
build fix, and a build warning cleanup"
* tag '4.19-rc3-smb3-cifs' of git://git.samba.org/sfrench/cifs-2.6:
cifs: read overflow in is_valid_oplock_break()
cifs: integer overflow in in SMB2_ioctl()
CIFS: fix wrapping bugs in num_entries()
cifs: prevent integer overflow in nxt_dir_entry()
fs/cifs: require sha512
fs/cifs: suppress a string overflow warning
Trivial fix to spelling mistake in pr_err() error message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20180824112235.8842-1-colin.king@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Perf can record user stack data in response to a synchronous request, such
as a tracepoint firing. If this happens under set_fs(KERNEL_DS), then we
end up reading user stack data using __copy_from_user_inatomic() under
set_fs(KERNEL_DS). I think this conflicts with the intention of using
set_fs(KERNEL_DS). And it is explicitly forbidden by hardware on ARM64
when both CONFIG_ARM64_UAO and CONFIG_ARM64_PAN are used.
So fix this by forcing USER_DS when recording user stack data.
Signed-off-by: Yabin Cui <yabinc@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 88b0193d9418 ("perf/callchain: Force USER_DS when invoking perf_callchain_user()")
Link: http://lkml.kernel.org/r/20180823225935.27035-1-yabinc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 1c5aae7710bb ("perf machine: Create maps for x86 PTI entry
trampolines") revealed a problem with maps__find_symbol_by_name() that
resulted in probes not being found e.g.
$ sudo perf probe xsk_mmap
xsk_mmap is out of .text, skip it.
Probe point 'xsk_mmap' not found.
Error: Failed to add events.
maps__find_symbol_by_name() can optionally return the map of the found
symbol. It can get the map wrong because, in fact, the symbol is found
on the map's dso, not allowing for the possibility that the dso has more
than one map. Fix by always checking the map contains the symbol.
Reported-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 1c5aae7710bb ("perf machine: Create maps for x86 PTI entry trampolines")
Link: http://lkml.kernel.org/r/20180907085116.25782-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since commit:
523e979d3164 ("sched/core: Use PELT for scale_rt_capacity()")
scale_rt_capacity() returns the remaining capacity and not a scale factor
to apply on cpu_capacity_orig. arch_scale_cpu() is directly called by
scale_rt_capacity() so we must take the sched_domain argument.
Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 523e979d3164 ("sched/core: Use PELT for scale_rt_capacity()")
Link: http://lkml.kernel.org/r/20180904093626.GA23936@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Xen PV guests don't allow CPU0 hotplug, so disable it.
Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boris.ostrovsky@oracle.com
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20180912174122.24282-1-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull NFS client bugfixes from Anna Schumaker:
"These are a handful of fixes for problems that Trond found. Patch #1
and #3 have the same name, a second issue was found after applying the
first patch.
Stable bugfixes:
- v4.17+: Fix tracepoint Oops in initiate_file_draining()
- v4.11+: Fix an infinite loop on I/O
Other fixes:
- Return errors if a waiting layoutget is killed
- Don't open code clearing of delegation state"
* tag 'nfs-for-4.19-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFS: Don't open code clearing of delegation state
NFSv4.1 fix infinite loop on I/O.
NFSv4: Fix a tracepoint Oops in initiate_file_draining()
pNFS: Ensure we return the error if someone kills a waiting layoutget
NFSv4: Fix a tracepoint Oops in initiate_file_draining()
Commit:
c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage")
added the inclusion of <trace/events/preemptirq.h>.
liblockdep doesn't have a stub version of that header so now fails to build.
However, commit:
bff1b208a5d1 ("tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and unify their usage"")
removed the use of functions declared in that header. So delete the #include.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <alexander.levin@verizon.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: bff1b208a5d1 ("tracing: Partial revert of "tracing: Centralize ...")
Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints ...")
Link: http://lkml.kernel.org/r/20180828203315.GD18030@decadent.org.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Vince noted that commit:
6cbc304f2f36 ("perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)")
'leaked' __PERF_SAMPLE_CALLCHAIN_EARLY into the UAPI namespace. And
while sys_perf_event_open() will error out if you try to use it, it is
exposed.
Clearly mark it for internal use only to avoid any confusion.
Requested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
To get the changes in:
3e7a50ceb11e ("net: report min and max mtu network device settings")
2756f68c3149 ("net: bridge: add support for backup port")
a25717d2b604 ("xdp: support simultaneous driver and hw XDP attachment")
4f91da26c811 ("xdp: add per mode attributes for attached programs")
f203b76d7809 ("xfrm: Add virtual xfrm interfaces")
Silencing this libbpf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/if_link.h' differs from latest version at 'include/uapi/linux/if_link.h'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-xd9ztioa894zemv8ag8kg64u@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a task which previously ran on a given CPU is remotely queued to
wake up on that same CPU, there is a period where the task's state is
TASK_WAKING and its vruntime is not normalized. This is not accounted
for in vruntime_normalized() which will cause an error in the task's
vruntime if it is switched from the fair class during this time.
For example if it is boosted to RT priority via rt_mutex_setprio(),
rq->min_vruntime will not be subtracted from the task's vruntime but
it will be added again when the task returns to the fair class. The
task's vruntime will have been erroneously doubled and the effective
priority of the task will be reduced.
Note this will also lead to inflation of all vruntimes since the doubled
vruntime value will become the rq's min_vruntime when other tasks leave
the rq. This leads to repeated doubling of the vruntime and priority
penalty.
Fix this by recognizing a WAKING task's vruntime as normalized only if
sched_remote_wakeup is true. This indicates a migration, in which case
the vruntime would have been normalized in migrate_task_rq_fair().
Based on a similar patch from John Dias <joaodias@google.com>.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Steve Muckle <smuckle@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Redpath <Chris.Redpath@arm.com>
Cc: John Dias <joaodias@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miguel de Dios <migueldedios@google.com>
Cc: Morten Rasmussen <Morten.Rasmussen@arm.com>
Cc: Patrick Bellasi <Patrick.Bellasi@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@google.com>
Cc: kernel-team@android.com
Fixes: b5179ac70de8 ("sched/fair: Prepare to fix fairness problems on migration")
Link: http://lkml.kernel.org/r/20180831224217.169476-1-smuckle@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For unprivileged Xen PV guests this is normal memory and ioremap will
not be able to properly map it.
While at it, since ioremap may return NULL, add a test for pointer's
validity.
Reported-by: Andy Smith <andy@strugglers.net>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: hpa@zytor.com
Cc: xen-devel@lists.xenproject.org
Cc: jgross@suse.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180911195538.23289-1-boris.ostrovsky@oracle.com
Pull tracing fix from Steven Rostedt:
"This fixes an issue with the build system caused by a change that
modifies CC_FLAGS_FTRACE. The issue is that it breaks the dependencies
and causes "make targz-pkg" to rebuild the entire world"
* tag 'trace-v4.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/Makefile: Fix handling redefinition of CC_FLAGS_FTRACE
The "le32_to_cpu(rsp->OutputOffset) + *plen" addition can overflow and
wrap around to a smaller value which looks like it would lead to an
information leak.
Fixes: 4a72dafa19ba ("SMB2 FSCTL and IOCTL worker function")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
CC: Stable <stable@vger.kernel.org>
Commit:
8cc05c71ba5f ("locking/lockdep: Move sanity check to inside lockdep_print_held_locks()")
added accesses to the task_struct's state member. Add dummy userspace declaration.
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <Alexander.Levin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180813190527.16853-4-alexander.levin@microsoft.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Problem: perf did not show branch predicted/mispredicted bit in brstack.
Output of perf -F brstack for profile collected
Before:
0x4fdbcd/0x4fdc03/-/-/-/0
0x45f4c1/0x4fdba0/-/-/-/0
0x45f544/0x45f4bb/-/-/-/0
0x45f555/0x45f53c/-/-/-/0
0x7f66901cc24b/0x45f555/-/-/-/0
0x7f66901cc22e/0x7f66901cc23d/-/-/-/0
0x7f66901cc1ff/0x7f66901cc20f/-/-/-/0
0x7f66901cc1e8/0x7f66901cc1fc/-/-/-/0
After:
0x4fdbcd/0x4fdc03/P/-/-/0
0x45f4c1/0x4fdba0/P/-/-/0
0x45f544/0x45f4bb/P/-/-/0
0x45f555/0x45f53c/P/-/-/0
0x7f66901cc24b/0x45f555/P/-/-/0
0x7f66901cc22e/0x7f66901cc23d/P/-/-/0
0x7f66901cc1ff/0x7f66901cc20f/P/-/-/0
0x7f66901cc1e8/0x7f66901cc1fc/P/-/-/0
Cause:
As mentioned in Software Development Manual vol 3, 17.4.8.1,
IA32_PERF_CAPABILITIES[5:0] indicates the format of the address that is
stored in the LBR stack. Knights Landing reports 1 (LBR_FORMAT_LIP) as
its format. Despite that, registers containing FROM address of the branch,
do have MISPREDICT bit but because of the format indicated in
IA32_PERF_CAPABILITIES[5:0], LBR did not read MISPREDICT bit.
Solution:
Teach LBR about above Knights Landing quirk and make it read MISPREDICT bit.
Signed-off-by: Jacek Tomaka <jacek.tomaka@poczta.fm>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180802013830.10600-1-jacekt@dugeo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
To get the changes in:
c48300c92ad9 ("vhost: fix VHOST_GET_BACKEND_FEATURES ioctl request definition")
This makes 'perf trace' and other tools in the future using its
beautifiers in a libbeauty.so library be able to translate these new
ioctl to strings:
$ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > /tmp/after
$ diff -u /tmp/before /tmp/after
--- /tmp/before 2018-09-11 13:10:57.923038244 -0300
+++ /tmp/after 2018-09-11 13:11:20.329012685 -0300
@@ -15,6 +15,7 @@
[0x22] = "SET_VRING_ERR",
[0x23] = "SET_VRING_BUSYLOOP_TIMEOUT",
[0x24] = "GET_VRING_BUSYLOOP_TIMEOUT",
+ [0x25] = "SET_BACKEND_FEATURES",
[0x30] = "NET_SET_BACKEND",
[0x40] = "SCSI_SET_ENDPOINT",
[0x41] = "SCSI_CLEAR_ENDPOINT",
@@ -27,4 +28,5 @@
static const char *vhost_virtio_ioctl_read_cmds[] = {
[0x00] = "GET_FEATURES",
[0x12] = "GET_VRING_BASE",
+ [0x26] = "GET_BACKEND_FEATURES",
};
$
We'll also use this to be able to express syscall filters using symbolic
these symbolic names, something like:
# perf trace --all-cpus -e ioctl(cmd=*GET_FEATURES)
This silences the following warning during perf's build:
Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from latest version at 'include/uapi/linux/vhost.h'
diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-35x71oei2hdui9u0tarpimbq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
update_blocked_averages() is called to periodiccally decay the stalled load
of idle CPUs and to sync all loads before running load balance.
When cfs rq is idle, it trigs a load balance during pick_next_task_fair()
in order to potentially pull tasks and to use this newly idle CPU. This
load balance happens whereas prev task from another class has not been put
and its utilization updated yet. This may lead to wrongly account running
time as idle time for RT or DL classes.
Test that no RT or DL task is running when updating their utilization in
update_blocked_averages().
We still update RT and DL utilization instead of simply skipping them to
make sure that all metrics are synced when used during load balance.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 371bf4273269 ("sched/rt: Add rt_rq utilization tracking")
Fixes: 3727e0e16340 ("sched/dl: Add dl_rq utilization tracking")
Link: http://lkml.kernel.org/r/1535728975-22799-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fix a few issues in Documentation/x86/earlyprintk.txt:
- correct typos, punctuation, missing word, wrong word
- change product name from Netchip to NetChip
- expand where to add "earlyprintk=dbg"
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Link: http://lkml.kernel.org/r/d0c40ac3-7659-6374-dbda-23d3d2577f30@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull DeviceTree fix from Rob Herring:
"One regression for a 20 year old PowerMac:
- Fix a regression on systems having a DT without any phandles which
happens on a PowerMac G3"
* tag 'devicetree-fixes-for-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of: fix phandle cache creation for DTs with no phandles
As a Kernel developer, I make heavy use of "make targz-pkg" in order
to locally compile and remotely install my development Kernels. The
nice feature I rely on is that after a normal "make", "make targz-pkg"
only generates the tarball without having to recompile everything.
That was true until commit f28bc3c32c05 ("tracing: Handle
CC_FLAGS_FTRACE more accurately"). After it, running "make targz-pkg"
after "make" will recompile the whole Kernel tree, making my
development workflow much slower.
The Kernel is choosing to recompile everything because it claims the
command line has changed. A diff of the .cmd files show a repeated
-mfentry in one of the files. That is because "make targz-pkg" calls
"make modules_install" and the environment is already populated with
the exported variables, CC_FLAGS_FTRACE being one of them. Then,
-mfentry gets duplicated because it is not protected behind an ifndef
block, like -pg.
To complicate the problem a little bit more, architectures can define
their own version CC_FLAGS_FTRACE, so our code not only has to
consider recursive Makefiles, but also architecture overrides.
So in this patch we move CC_FLAGS_FTRACE up and unconditionally
define it to -pg. Then we let the architecture Makefiles possibly
override it, and finally append the extra options later. This ensures
the variable is always fully redefined at each invocation so recursive
Makefiles don't keep appending, and hopefully it maintains the
intended behavior on how architectures can override the defaults..
Thanks Steven Rostedt and Vasily Gorbik for the help on this
regression.
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: linux-kbuild@vger.kernel.org
Fixes: commit f28bc3c32c05 ("tracing: Handle CC_FLAGS_FTRACE more accurately")
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The previous fix broke recovery of delegated stateids because it assumes
that if we did not mark the delegation as suspect, then the delegation has
effectively been revoked, and so it removes that delegation irrespectively
of whether or not it is valid and still in use. While this is "mostly
harmless" for ordinary I/O, we've seen pNFS fail with LAYOUTGET spinning
in an infinite loop while complaining that we're using an invalid stateid
(in this case the all-zero stateid).
What we rather want to do here is ensure that the delegation is always
correctly marked as needing testing when that is the case. So we want
to close the loophole offered by nfs4_schedule_stateid_recovery(),
which marks the state as needing to be reclaimed, but not the
delegation that may be backing it.
Fixes: 0e3d3e5df07dc ("NFSv4.1 fix infinite loop on IO BAD_STATEID error")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.11+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
The problem is that "entryptr + next_offset" and "entryptr + len + size"
can wrap. I ended up changing the type of "entryptr" because it makes
the math easier when we don't have to do so much casting.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Required since:
88f1c87de11a8 ("locking/lockdep: Avoid triggering hardlockup from debug_show_all_locks()")
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <Alexander.Levin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180813190527.16853-3-alexander.levin@microsoft.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
Kernel:
- Modify breakpoint fixes (Jiri Olsa)
perf annotate:
- Fix parsing aarch64 branch instructions after objdump update (Kim Phillips)
- Fix parsing indirect calls in 'perf annotate' (Martin Liška)
perf probe:
- Ignore SyS symbols irrespective of endianness on PowerPC (Sandipan Das)
perf trace:
- Fix include path for asm-generic/unistd.h on arm64 (Kim Phillips)
Core libraries:
- Fix potential null pointer dereference in perf_evsel__new_idx() (Hisao Tanabe)
- Use fixed size string for comms instead of scanf("%m"), that is
not present in the bionic libc and leads to a crash (Chris Phlipot)
- Fix bad memory access in trace info on 32-bit systems, we were reading
8 bytes from a 4-byte long variable when saving the command line in the
perf.data file. (Chris Phlipot)
Build system:
- Streamline bpf examples and headers installation, clarifying
some install messages. (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
To get the changes in:
a449938297e5 ("KVM: s390: Add huge page enablement control")
8fcc4b5923af ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE")
be26b3a73413 ("arm64: KVM: export the capability to set guest SError syndrome")
b7b27facc7b5 ("arm/arm64: KVM: Add KVM_GET/SET_VCPU_EVENTS")
b0960b9569db ("KVM: arm: Add 32bit get/set events support")
a3da7b4a3be5 ("KVM: s390: add etoken support for guests")
This makes 'perf trace' automagically get aware of these new ioctls:
$ cp include/uapi/linux/kvm.h tools/include/uapi/linux/kvm.h
$ tools/perf/trace/beauty/kvm_ioctl.sh > /tmp/after
$ diff -u /tmp/before /tmp/after
--- /tmp/before 2018-09-11 11:18:29.173207586 -0300
+++ /tmp/after 2018-09-11 11:18:38.488200446 -0300
@@ -84,6 +84,8 @@
[0xbb] = "MEMORY_ENCRYPT_REG_REGION",
[0xbc] = "MEMORY_ENCRYPT_UNREG_REGION",
[0xbd] = "HYPERV_EVENTFD",
+ [0xbe] = "GET_NESTED_STATE",
+ [0xbf] = "SET_NESTED_STATE",
[0xe0] = "CREATE_DEVICE",
[0xe1] = "SET_DEVICE_ATTR",
[0xe2] = "G
And cures the following warning during perf's build:
Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dongjiu Geng <gengdongjiu@huawei.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-2vvwh2o19orn56di0ksrtgzr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the following commit:
051f3ca02e46 ("sched/topology: Introduce NUMA identity node sched domain")
the scheduler introduced a new NUMA level. However this leads to the NUMA topology
on 2 node systems to not be marked as NUMA_DIRECT anymore.
After this commit, it gets reported as NUMA_BACKPLANE, because
sched_domains_numa_level is now 2 on 2 node systems.
Fix this by allowing setting systems that have up to 2 NUMA levels as
NUMA_DIRECT.
While here remove code that assumes that level can be 0.
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andre Wild <wild@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Fixes: 051f3ca02e46 "Introduce NUMA identity node sched domain"
Link: http://lkml.kernel.org/r/1533920419-17410-1-git-send-email-srikar@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull xen fixes from Juergen Gross:
"This contains some minor cleanups and fixes:
- a new knob for controlling scrubbing of pages returned by the Xen
balloon driver to the Xen hypervisor to address a boot performance
issue seen in large guests booted pre-ballooned
- a fix of a regression in the gntdev driver which made it impossible
to use fully virtualized guests (HVM guests) with a 4.19 based dom0
- a fix in Xen cpu hotplug functionality which could be triggered by
wrong admin commands (setting number of active vcpus to 0)
One further note: the patches have all been under test for several
days in another branch. This branch has been rebased in order to avoid
merge conflicts"
* tag 'for-linus-4.19c-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/gntdev: fix up blockable calls to mn_invl_range_start
xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usage
xen: avoid crash in disable_hotplug_cpu
xen/balloon: add runtime control for scrubbing ballooned out pages
xen/manage: don't complain about an empty value in control/sysrq node
With commit 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of
of_find_node_by_phandle()"), a G3 PowerMac fails to boot. The root cause
is the DT for this system has no phandle properties when booted with
BootX. of_populate_phandle_cache() does not handle the case of no
phandles correctly. The problem is roundup_pow_of_two() for 0 is
undefined. The implementation subtracts 1 underflowing and then things
are in the weeds.
Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
Cc: stable@vger.kernel.org # 4.17+
Reported-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Stan Johnson <userm57@yahoo.com>
Reviewed-by: Frank Rowand <frank.rowand@sony.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Now that the value of 'ino' can be NULL or an ERR_PTR(), we need to
change the test in the tracepoint.
Fixes: ce5624f7e6675 ("NFSv4: Return NFS4ERR_DELAY when a layout fails...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
The "old_entry + le32_to_cpu(pDirInfo->NextEntryOffset)" can wrap
around so I have added a check for integer overflow.
Reported-by: Dr Silvio Cesare of InfoSect <silvio.cesare@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <Alexander.Levin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180813190527.16853-2-alexander.levin@microsoft.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull x86 fixes from Thomas Gleixner:
"A set of fixes for x86:
- Prevent multiplication result truncation on 32bit. Introduced with
the early timestamp reworrk.
- Ensure microcode revision storage to be consistent under all
circumstances
- Prevent write tearing of PTEs
- Prevent confusion of user and kernel reegisters when dumping fatal
signals verbosely
- Make an error return value in a failure path of the vector
allocation negative. Returning EINVAL might the caller assume
success and causes further wreckage.
- A trivial kernel doc warning fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Use WRITE_ONCE() when setting PTEs
x86/apic/vector: Make error return value negative
x86/process: Don't mix user/kernel regs in 64bit __show_regs()
x86/tsc: Prevent result truncation on 32bit
x86: Fix kernel-doc atomic.h warnings
x86/microcode: Update the new microcode revision unconditionally
x86/microcode: Make sure boot_cpu_data.microcode is up-to-date
Starting with binutils 2.28, aarch64 objdump adds comments to the
disassembly output to show the alternative names of a condition code
[1].
It is assumed that commas in objdump comments could occur in other
arches now or in the future, so this fix is arch-independent.
The fix could have been done with arm64 specific jump__parse and
jump__scnprintf functions, but the jump__scnprintf instruction would
have to have its comment character be a literal, since the scnprintf
functions cannot receive a struct arch easily.
This inconvenience also applies to the generic jump__scnprintf, which is
why we add a raw_comment pointer to struct ins_operands, so the __parse
function assigns it to be re-used by its corresponding __scnprintf
function.
Example differences in 'perf annotate --stdio2' output on an aarch64
perf.data file:
BEFORE: → b.cs ffff200008133d1c <unwind_frame+0x18c> // b.hs, dffff7ecc47b
AFTER : ↓ b.cs 18c
BEFORE: → b.cc ffff200008d8d9cc <get_alloc_profile+0x31c> // b.lo, b.ul, dffff727295b
AFTER : ↓ b.cc 31c
The branch target labels 18c and 31c also now appear in the output:
BEFORE: add x26, x29, #0x80
AFTER : 18c: add x26, x29, #0x80
BEFORE: add x21, x21, #0x8
AFTER : 31c: add x21, x21, #0x8
The Fixes: tag below is added so stable branches will get the update; it
doesn't necessarily mean that commit was broken at the time, rather it
didn't withstand the aarch64 objdump update.
Tested no difference in output for sample x86_64, power arch perf.data files.
[1] https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=bb7eff5206e4795ac79c177a80fe9f4630aaf730
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Fixes: b13bbeee5ee6 ("perf annotate: Fix branch instruction with multiple operands")
Link: http://lkml.kernel.org/r/20180827125340.a2f7e291901d17cea05daba4@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To get the changes in:
d67b6a206507 ("drm: writeback: Add client capability for exposing writeback connectors")
This is for an argument to a DRM ioctl, which is not being prettyfied in
the 'perf trace' DRM ioctl beautifier, but will now that syscalls are
starting to have pointer arguments augmented via BPF.
This time around this just cures the following warning during perf's
build:
Warning: Kernel ABI header at 'tools/include/uapi/drm/drm.h' differs from latest version at 'include/uapi/drm/drm.h'
diff -u tools/include/uapi/drm/drm.h include/uapi/drm/drm.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brian Starkey <brian.starkey@arm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-n7qib1bac6mc6w9oke7r4qdc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The following lockdep report can be triggered by writing to /sys/kernel/debug/sched_features:
======================================================
WARNING: possible circular locking dependency detected
4.18.0-rc6-00152-gcd3f77d74ac3-dirty #18 Not tainted
------------------------------------------------------
sh/3358 is trying to acquire lock:
000000004ad3989d (cpu_hotplug_lock.rw_sem){++++}, at: static_key_enable+0x14/0x30
but task is already holding lock:
00000000c1b31a88 (&sb->s_type->i_mutex_key#3){+.+.}, at: sched_feat_write+0x160/0x428
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (&sb->s_type->i_mutex_key#3){+.+.}:
lock_acquire+0xb8/0x148
down_write+0xac/0x140
start_creating+0x5c/0x168
debugfs_create_dir+0x18/0x220
opp_debug_register+0x8c/0x120
_add_opp_dev+0x104/0x1f8
dev_pm_opp_get_opp_table+0x174/0x340
_of_add_opp_table_v2+0x110/0x760
dev_pm_opp_of_add_table+0x5c/0x240
dev_pm_opp_of_cpumask_add_table+0x5c/0x100
cpufreq_init+0x160/0x430
cpufreq_online+0x1cc/0xe30
cpufreq_add_dev+0x78/0x198
subsys_interface_register+0x168/0x270
cpufreq_register_driver+0x1c8/0x278
dt_cpufreq_probe+0xdc/0x1b8
platform_drv_probe+0xb4/0x168
driver_probe_device+0x318/0x4b0
__device_attach_driver+0xfc/0x1f0
bus_for_each_drv+0xf8/0x180
__device_attach+0x164/0x200
device_initial_probe+0x10/0x18
bus_probe_device+0x110/0x178
device_add+0x6d8/0x908
platform_device_add+0x138/0x3d8
platform_device_register_full+0x1cc/0x1f8
cpufreq_dt_platdev_init+0x174/0x1bc
do_one_initcall+0xb8/0x310
kernel_init_freeable+0x4b8/0x56c
kernel_init+0x10/0x138
ret_from_fork+0x10/0x18
-> #2 (opp_table_lock){+.+.}:
lock_acquire+0xb8/0x148
__mutex_lock+0x104/0xf50
mutex_lock_nested+0x1c/0x28
_of_add_opp_table_v2+0xb4/0x760
dev_pm_opp_of_add_table+0x5c/0x240
dev_pm_opp_of_cpumask_add_table+0x5c/0x100
cpufreq_init+0x160/0x430
cpufreq_online+0x1cc/0xe30
cpufreq_add_dev+0x78/0x198
subsys_interface_register+0x168/0x270
cpufreq_register_driver+0x1c8/0x278
dt_cpufreq_probe+0xdc/0x1b8
platform_drv_probe+0xb4/0x168
driver_probe_device+0x318/0x4b0
__device_attach_driver+0xfc/0x1f0
bus_for_each_drv+0xf8/0x180
__device_attach+0x164/0x200
device_initial_probe+0x10/0x18
bus_probe_device+0x110/0x178
device_add+0x6d8/0x908
platform_device_add+0x138/0x3d8
platform_device_register_full+0x1cc/0x1f8
cpufreq_dt_platdev_init+0x174/0x1bc
do_one_initcall+0xb8/0x310
kernel_init_freeable+0x4b8/0x56c
kernel_init+0x10/0x138
ret_from_fork+0x10/0x18
-> #1 (subsys mutex#6){+.+.}:
lock_acquire+0xb8/0x148
__mutex_lock+0x104/0xf50
mutex_lock_nested+0x1c/0x28
subsys_interface_register+0xd8/0x270
cpufreq_register_driver+0x1c8/0x278
dt_cpufreq_probe+0xdc/0x1b8
platform_drv_probe+0xb4/0x168
driver_probe_device+0x318/0x4b0
__device_attach_driver+0xfc/0x1f0
bus_for_each_drv+0xf8/0x180
__device_attach+0x164/0x200
device_initial_probe+0x10/0x18
bus_probe_device+0x110/0x178
device_add+0x6d8/0x908
platform_device_add+0x138/0x3d8
platform_device_register_full+0x1cc/0x1f8
cpufreq_dt_platdev_init+0x174/0x1bc
do_one_initcall+0xb8/0x310
kernel_init_freeable+0x4b8/0x56c
kernel_init+0x10/0x138
ret_from_fork+0x10/0x18
-> #0 (cpu_hotplug_lock.rw_sem){++++}:
__lock_acquire+0x203c/0x21d0
lock_acquire+0xb8/0x148
cpus_read_lock+0x58/0x1c8
static_key_enable+0x14/0x30
sched_feat_write+0x314/0x428
full_proxy_write+0xa0/0x138
__vfs_write+0xd8/0x388
vfs_write+0xdc/0x318
ksys_write+0xb4/0x138
sys_write+0xc/0x18
__sys_trace_return+0x0/0x4
other info that might help us debug this:
Chain exists of:
cpu_hotplug_lock.rw_sem --> opp_table_lock --> &sb->s_type->i_mutex_key#3
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&sb->s_type->i_mutex_key#3);
lock(opp_table_lock);
lock(&sb->s_type->i_mutex_key#3);
lock(cpu_hotplug_lock.rw_sem);
*** DEADLOCK ***
2 locks held by sh/3358:
#0: 00000000a8c4b363 (sb_writers#10){.+.+}, at: vfs_write+0x238/0x318
#1: 00000000c1b31a88 (&sb->s_type->i_mutex_key#3){+.+.}, at: sched_feat_write+0x160/0x428
stack backtrace:
CPU: 5 PID: 3358 Comm: sh Not tainted 4.18.0-rc6-00152-gcd3f77d74ac3-dirty #18
Hardware name: Renesas H3ULCB Kingfisher board based on r8a7795 ES2.0+ (DT)
Call trace:
dump_backtrace+0x0/0x288
show_stack+0x14/0x20
dump_stack+0x13c/0x1ac
print_circular_bug.isra.10+0x270/0x438
check_prev_add.constprop.16+0x4dc/0xb98
__lock_acquire+0x203c/0x21d0
lock_acquire+0xb8/0x148
cpus_read_lock+0x58/0x1c8
static_key_enable+0x14/0x30
sched_feat_write+0x314/0x428
full_proxy_write+0xa0/0x138
__vfs_write+0xd8/0x388
vfs_write+0xdc/0x318
ksys_write+0xb4/0x138
sys_write+0xc/0x18
__sys_trace_return+0x0/0x4
This is because when loading the cpufreq_dt module we first acquire
cpu_hotplug_lock.rw_sem lock, then in cpufreq_init(), we are taking
the &sb->s_type->i_mutex_key lock.
But when writing to /sys/kernel/debug/sched_features, the
cpu_hotplug_lock.rw_sem lock depends on the &sb->s_type->i_mutex_key lock.
To fix this bug, reverse the lock acquisition order when writing to
sched_features, this way cpu_hotplug_lock.rw_sem no longer depends on
&sb->s_type->i_mutex_key.
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Jiada Wang <jiada_wang@mentor.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Eugeniu Rosca <erosca@de.adit-jv.com>
Cc: George G. Davis <george_davis@mentor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180731121222.26195-1-jiada_wang@mentor.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull Xtensa fixes and cleanups from Max Filippov:
- don't allocate memory in platform_setup as the memory allocator is
not initialized at that point yet;
- remove unnecessary ifeq KBUILD_SRC from arch/xtensa/Makefile;
- enable SG chaining in arch/xtensa/Kconfig.
* tag 'xtensa-20180914' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: enable SG chaining in Kconfig
xtensa: remove unnecessary KBUILD_SRC ifeq conditional
xtensa: ISS: don't allocate memory in platform_setup
Patch series "mmu_notifiers follow ups".
Tetsuo has noticed some fallouts from 93065ac753e4 ("mm, oom: distinguish
blockable mode for mmu notifiers"). One of them has been fixed and picked
up by AMD/DRM maintainer [1]. XEN issue is fixed by patch 1. I have also
clarified expectations about blockable semantic of invalidate_range_end.
Finally the last patch removes MMU_INVALIDATE_DOES_NOT_BLOCK which is no
longer used nor needed.
[1] http://lkml.kernel.org/r/20180824135257.GU29735@dhcp22.suse.cz
This patch (of 3):
93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers") has
introduced blockable parameter to all mmu_notifiers and the notifier has
to back off when called in !blockable case and it could block down the
road.
The above commit implemented that for mn_invl_range_start but both
in_range checks are done unconditionally regardless of the blockable mode
and as such they would fail all the time for regular calls. Fix this by
checking blockable parameter as well.
Once we are there we can remove the stale TODO. The lock has to be
sleepable because we wait for completion down in gnttab_unmap_refs_sync.
Link: http://lkml.kernel.org/r/20180827112623.8992-2-mhocko@kernel.org
Fixes: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
This got lost in commit 0fdfef9aa7ee68ddd508aef7c98630cfc054f8d6,
which removed CONFIG_CIFS_SMB311.
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Fixes: 0fdfef9aa7ee68ddd ("smb3: simplify code by removing CONFIG_CIFS_SMB311")
CC: Stable <stable@vger.kernel.org>
CC: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
There's no 'allocatote' - use the next best thing: 'allocate' :-)
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180907103521.31344-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull timekeeping fixes from Thomas Gleixner:
"Two fixes for timekeeping:
- Revert to the previous kthread based update, which is unfortunately
required due to lock ordering issues. The removal caused boot
failures on old Core2 machines. Add a proper comment why the thread
needs to stay to prevent accidental removal in the future.
- Fix a silly typo in a function declaration"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: Revert "Remove kthread"
timekeeping: Fix declaration of read_persistent_wall_and_boot_offset()
When page-table entries are set, the compiler might optimize their
assignment by using multiple instructions to set the PTE. This might
turn into a security hazard if the user somehow manages to use the
interim PTE. L1TF does not make our lives easier, making even an interim
non-present PTE a security hazard.
Using WRITE_ONCE() to set PTEs and friends should prevent this potential
security hazard.
I skimmed the differences in the binary with and without this patch. The
differences are (obviously) greater when CONFIG_PARAVIRT=n as more
code optimizations are possible. For better and worse, the impact on the
binary with this patch is pretty small. Skimming the code did not cause
anything to jump out as a security hazard, but it seems that at least
move_soft_dirty_pte() caused set_pte_at() to use multiple writes.
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180902181451.80520-1-namit@vmware.com
This makes sure that the SyS symbols are ignored for any powerpc system,
not just the big endian ones.
Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Fixes: fb6d59423115 ("perf probe ppc: Use the right prefix when ignoring SyS symbols on ppc")
Link: http://lkml.kernel.org/r/20180828090848.1914-1-sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>