commits
Pull x86 fixes from Thomas Gleixner:
"This contains the following fixes and improvements:
- Avoid dereferencing an unprotected VMA pointer in the fault signal
generation code
- Fix inline asm call constraints for GCC 4.4
- Use existing register variable to retrieve the stack pointer
instead of forcing the compiler to create another indirect access
which results in excessive extra 'mov %rsp, %<dst>' instructions
- Disable branch profiling for the memory encryption code to prevent
an early boot crash
- Fix a sparse warning caused by casting the __user annotation in
__get_user_asm_u64() away
- Fix an off by one error in the loop termination of the error patch
in the x86 sysfs init code
- Add missing CPU IDs to various Intel specific drivers to enable the
functionality on recent hardware
- More (init) constification in the numachip code"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm: Use register variable to get stack pointer value
x86/mm: Disable branch profiling in mem_encrypt.c
x86/asm: Fix inline asm call constraints for GCC 4.4
perf/x86/intel/uncore: Correct num_boxes for IIO and IRP
perf/x86/intel/rapl: Add missing CPU IDs
perf/x86/msr: Add missing CPU IDs
perf/x86/intel/cstate: Add missing CPU IDs
x86: Don't cast away the __user in __get_user_asm_u64()
x86/sysfs: Fix off-by-one error in loop termination
x86/mm: Fix fault error path using unsafe vma pointer
x86/numachip: Add const and __initconst to numachip2_clockevent
Pull timer fixes from Thomas Gleixner:
"This adds a new timer wheel function which is required for the
conversion of the timer callback function from the 'unsigned long
data' argument to 'struct timer_list *timer'. This conversion has two
benefits:
1) It makes struct timer_list smaller
2) Many callers hand in a pointer to the timer or to the structure
containing the timer, which happens via type casting both at setup
and in the callback. This change gets rid of the typecasts.
Once the conversion is complete, which is planned for 4.15, the old
setup function and the intermediate typecast in the new setup function
go away along with the data field in struct timer_list.
Merging this now into mainline allows a smooth queueing of the actual
conversion in the affected maintainer trees without creating
dependencies"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
um/time: Fixup namespace collision
timer: Prepare to change timer callback argument type
Currently we use current_stack_pointer() function to get the value
of the stack pointer register. Since commit:
f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
... we have a stack register variable declared. It can be used instead of
current_stack_pointer() function which allows to optimize away some
excessive "mov %rsp, %<dst>" instructions:
-mov %rsp,%rdx
-sub %rdx,%rax
-cmp $0x3fff,%rax
-ja ffffffff810722fd <ist_begin_non_atomic+0x2d>
+sub %rsp,%rax
+cmp $0x3fff,%rax
+ja ffffffff810722fa <ist_begin_non_atomic+0x2a>
Remove current_stack_pointer(), rename __asm_call_sp to current_stack_pointer
and use it instead of the removed function.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170929141537.29167-1-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull smp/hotplug fixes from Thomas Gleixner:
"This addresses the fallout of the new lockdep mechanism which covers
completions in the CPU hotplug code.
The lockdep splats are false positives, but there is no way to
annotate that reliably. The solution is to split the completions for
CPU up and down, which requires some reshuffling of the failure
rollback handling as well"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
smp/hotplug: Hotplug state fail injection
smp/hotplug: Differentiate the AP completion between up and down
smp/hotplug: Differentiate the AP-work lockdep class between up and down
smp/hotplug: Callback vs state-machine consistency
smp/hotplug: Rewrite AP state machine core
smp/hotplug: Allow external multi-instance rollback
smp/hotplug: Add state diagram
The new timer_setup() function for struct timer_list collides with a
private um function. Rename it.
Fixes: 686fef928bba ("timer: Prepare to change timer callback argument type")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: Kees Cook <keescook@chromium.org>
Some routines in mem_encrypt.c are called very early in the boot process,
e.g. sme_encrypt_kernel(). When CONFIG_TRACE_BRANCH_PROFILING=y is defined
the resulting branch profiling associated with the check to see if SME is
active results in a kernel crash. Disable branch profiling for
mem_encrypt.c by defining DISABLE_BRANCH_PROFILING before including any
header files.
Reported-by: kernel test robot <lkp@01.org>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170929162419.6016.53390.stgit@tlendack-t1.amdoffice.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull scheduler fixes from Thomas Gleixner:
"The scheduler pull request comes with the following updates:
- Prevent a divide by zero issue by validating the input value of
sysctl_sched_time_avg
- Make task state printing consistent all over the place and have
explicit state characters for IDLE and PARKED so they wont be
displayed as 'D' state which confuses tools"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/sysctl: Check user input value of sysctl_sched_time_avg
sched/debug: Add explicit TASK_PARKED printing
sched/debug: Ignore TASK_IDLE for SysRq-W
sched/debug: Add explicit TASK_IDLE printing
sched/tracing: Use common task-state helpers
sched/tracing: Fix trace_sched_switch task-state printing
sched/debug: Remove unused variable
sched/debug: Convert TASK_state to hex
sched/debug: Implement consistent task-state printing
Add a sysfs file to one-time fail a specific state. This can be used
to test the state rollback code paths.
Something like this (hotplug-up.sh):
#!/bin/bash
echo 0 > /debug/sched_debug
echo 1 > /debug/tracing/events/cpuhp/enable
ALL_STATES=`cat /sys/devices/system/cpu/hotplug/states | cut -d':' -f1`
STATES=${1:-$ALL_STATES}
for state in $STATES
do
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 0 > /debug/tracing/trace
echo Fail state: $state
echo $state > /sys/devices/system/cpu/cpu1/hotplug/fail
cat /sys/devices/system/cpu/cpu1/hotplug/fail
echo 1 > /sys/devices/system/cpu/cpu1/online
cat /debug/tracing/trace > hotfail-${state}.trace
sleep 1
done
Can be used to test for all possible rollback (barring multi-instance)
scenarios on CPU-up, CPU-down is a trivial modification of the above.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.972581715@infradead.org
Modern kernel callback systems pass the structure associated with a
given callback to the callback function. The timer callback remains one
of the legacy cases where an arbitrary unsigned long argument continues
to be passed as the callback argument. This has several problems:
- This bloats the timer_list structure with a normally redundant
.data field.
- No type checking is being performed, forcing callbacks to do
explicit type casts of the unsigned long argument into the object
that was passed, rather than using container_of(), as done in most
of the other callback infrastructure.
- Neighboring buffer overflows can overwrite both the .function and
the .data field, providing attackers with a way to elevate from a buffer
overflow into a simplistic ROP-like mechanism that allows calling
arbitrary functions with a controlled first argument.
- For future Control Flow Integrity work, this creates a unique function
prototype for timer callbacks, instead of allowing them to continue to
be clustered with other void functions that take a single unsigned long
argument.
This adds a new timer initialization API, which will ultimately replace
the existing setup_timer(), setup_{deferrable,pinned,etc}_timer() family,
named timer_setup() (to mirror hrtimer_setup(), making instances of its
use much easier to grep for).
In order to support the migration of existing timers into the new
callback arguments, timer_setup() casts its arguments to the existing
legacy types, and explicitly passes the timer pointer as the legacy
data argument. Once all setup_*timer() callers have been replaced with
timer_setup(), the casts can be removed, and the data argument can be
dropped with the timer expiration code changed to just pass the timer
to the callback directly.
Since the regular pattern of using container_of() during local variable
declaration repeats the need for the variable type declaration
to be included, this adds a helper modeled after other from_*()
helpers that wrap container_of(), named from_timer(). This helper uses
typeof(*variable), removing the type redundancy and minimizing the need
for line wraps in forthcoming conversions from "unsigned data long" to
"struct timer_list *" in the timer callbacks:
-void callback(unsigned long data)
+void callback(struct timer_list *t)
{
- struct some_data_structure *local = (struct some_data_structure *)data;
+ struct some_data_structure *local = from_timer(local, t, timer);
Finally, in order to support the handful of timer users that perform
open-coded assignments of the .function (and .data) fields, provide
cast macros (TIMER_FUNC_TYPE and TIMER_DATA_TYPE) that can be used
temporarily. Once conversion has been completed, these can be globally
trivially removed.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20170928133817.GA113410@beast
The kernel test bot (run by Xiaolong Ye) reported that the following commit:
f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
is causing double faults in a kernel compiled with GCC 4.4.
Linus subsequently diagnosed the crash pattern and the buggy commit and found that
the issue is with this code:
register unsigned int __asm_call_sp asm("esp");
#define ASM_CALL_CONSTRAINT "+r" (__asm_call_sp)
Even on a 64-bit kernel, it's using ESP instead of RSP. That causes GCC
to produce the following bogus code:
ffffffff8147461d: 89 e0 mov %esp,%eax
ffffffff8147461f: 4c 89 f7 mov %r14,%rdi
ffffffff81474622: 4c 89 fe mov %r15,%rsi
ffffffff81474625: ba 20 00 00 00 mov $0x20,%edx
ffffffff8147462a: 89 c4 mov %eax,%esp
ffffffff8147462c: e8 bf 52 05 00 callq ffffffff814c98f0 <copy_user_generic_unrolled>
Despite the absurdity of it backing up and restoring the stack pointer
for no reason, the bug is actually the fact that it's only backing up
and restoring the lower 32 bits of the stack pointer. The upper 32 bits
are getting cleared out, corrupting the stack pointer.
So change the '__asm_call_sp' register variable to be associated with
the actual full-size stack pointer.
This also requires changing the __ASM_SEL() macro to be based on the
actual compiled arch size, rather than the CONFIG value, because
CONFIG_X86_64 compiles some files with '-m32' (e.g., realmode and vdso).
Otherwise Clang fails to build the kernel because it complains about the
use of a 64-bit register (RSP) in a 32-bit file.
Reported-and-Bisected-and-Tested-by: kernel test robot <xiaolong.ye@intel.com>
Diagnosed-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: LKP <lkp@01.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
Link: http://lkml.kernel.org/r/20170928215826.6sdpmwtkiydiytim@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull perf fixes from Thomas Gleixner:
- Prevent a division by zero in the perf aux buffer handling
- Sync kernel headers with perf tool headers
- Fix a build failure in the syscalltbl code
- Make the debug messages of perf report --call-graph work correctly
- Make sure that all required perf files are in the MANIFEST for
container builds
- Fix the atrr.exclude kernel handling so it respects the
perf_event_paranoid and the user permissions
- Make perf test on s390x work correctly
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/aux: Only update ->aux_wakeup in non-overwrite mode
perf test: Fix vmlinux failure on s390x part 2
perf test: Fix vmlinux failure on s390x
perf tools: Fix syscalltbl build failure
perf report: Fix debug messages with --call-graph option
perf evsel: Fix attr.exclude_kernel setting for default cycles:p
tools include: Sync kernel ABI headers with tooling headers
perf tools: Get all of tools/{arch,include}/ in the MANIFEST
System will hang if user set sysctl_sched_time_avg to 0:
[root@XXX ~]# sysctl kernel.sched_time_avg_ms=0
Stack traceback for pid 0
0xffff883f6406c600 0 0 1 3 R 0xffff883f6406cf50 *swapper/3
ffff883f7ccc3ae8 0000000000000018 ffffffff810c4dd0 0000000000000000
0000000000017800 ffff883f7ccc3d78 0000000000000003 ffff883f7ccc3bf8
ffffffff810c4fc9 ffff883f7ccc3c08 00000000810c5043 ffff883f7ccc3c08
Call Trace:
<IRQ> [<ffffffff810c4dd0>] ? update_group_capacity+0x110/0x200
[<ffffffff810c4fc9>] ? update_sd_lb_stats+0x109/0x600
[<ffffffff810c5507>] ? find_busiest_group+0x47/0x530
[<ffffffff810c5b84>] ? load_balance+0x194/0x900
[<ffffffff810ad5ca>] ? update_rq_clock.part.83+0x1a/0xe0
[<ffffffff810c6d42>] ? rebalance_domains+0x152/0x290
[<ffffffff810c6f5c>] ? run_rebalance_domains+0xdc/0x1d0
[<ffffffff8108a75b>] ? __do_softirq+0xfb/0x320
[<ffffffff8108ac85>] ? irq_exit+0x125/0x130
[<ffffffff810b3a17>] ? scheduler_ipi+0x97/0x160
[<ffffffff81052709>] ? smp_reschedule_interrupt+0x29/0x30
[<ffffffff8173a1be>] ? reschedule_interrupt+0x6e/0x80
<EOI> [<ffffffff815bc83c>] ? cpuidle_enter_state+0xcc/0x230
[<ffffffff815bc80c>] ? cpuidle_enter_state+0x9c/0x230
[<ffffffff815bc9d7>] ? cpuidle_enter+0x17/0x20
[<ffffffff810cd6dc>] ? cpu_startup_entry+0x38c/0x420
[<ffffffff81053373>] ? start_secondary+0x173/0x1e0
Because divide-by-zero error happens in function:
update_group_capacity()
update_cpu_capacity()
scale_rt_capacity()
{
...
total = sched_avg_period() + delta;
used = div_u64(avg, total);
...
}
To fix this issue, check user input value of sysctl_sched_time_avg, keep
it unchanged when hitting invalid input, and set the minimum limit of
sysctl_sched_time_avg to 1 ms.
Reported-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: efault@gmx.de
Cc: ethan.kernel@gmail.com
Cc: keescook@chromium.org
Cc: mcgrof@kernel.org
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1504504774-18253-1-git-send-email-ethan.zhao@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With lockdep-crossrelease we get deadlock reports that span cpu-up and
cpu-down chains. Such deadlocks cannot possibly happen because cpu-up
and cpu-down are globally serialized.
takedown_cpu()
irq_lock_sparse()
wait_for_completion(&st->done)
cpuhp_thread_fun
cpuhp_up_callback
cpuhp_invoke_callback
irq_affinity_online_cpu
irq_local_spare()
irq_unlock_sparse()
complete(&st->done)
Now that we have consistent AP state, we can trivially separate the
AP completion between up and down using st->bringup.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: max.byungchul.park@gmail.com
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Link: https://lkml.kernel.org/r/20170920170546.872472799@infradead.org
There are 6 IIO/IRP boxes for CBDMA, PCIe0-2, MCP 0 and MCP 1
separately. Correct the num_boxes.
Signed-off-by: Kan Liang <Kan.liang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: acme@kernel.org
Link: http://lkml.kernel.org/r/1505149816-12580-1-git-send-email-kan.liang@intel.com
Pull locking fixes from Thomas Gleixner:
"Two fixes for locking:
- Plug a hole the pi_stat->owner serialization which was changed
recently and failed to fixup two usage sites.
- Prevent reordering of the rwsem_has_spinner() check vs the
decrement of rwsem count in up_write() which causes a missed
wakeup"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rwsem-xadd: Fix missed wakeup due to reordering of load
futex: Fix pi_state->owner serialization
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix syscalltbl build failure (Akemi Yagi)
- Fix attr.exclude_kernel setting for default cycles:p, this time for
!root with kernel.perf_event_paranoid = -1 (Arnaldo Carvalho de Melo)
- Sync kernel ABI headers with tooling headers (Ingo Molnar)
- Remove misleading debug messages with --call-graph option (Mengting Zhang)
- Revert vmlinux symbol resolution patches for s390x (Thomas Richter)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently TASK_PARKED is masqueraded as TASK_INTERRUPTIBLE, give it
its own print state because it will not in fact get woken by regular
wakeups and is a long-term state.
This requires moving TASK_PARKED into the TASK_REPORT mask, and since
that latter needs to be a contiguous bitmask, we need to shuffle the
bits around a bit.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With lockdep-crossrelease we get deadlock reports that span cpu-up and
cpu-down chains. Such deadlocks cannot possibly happen because cpu-up
and cpu-down are globally serialized.
CPU0 CPU1 CPU2
cpuhp_up_callbacks: takedown_cpu: cpuhp_thread_fun:
cpuhp_state
irq_lock_sparse()
irq_lock_sparse()
wait_for_completion()
cpuhp_state
complete()
Now that we have consistent AP state, we can trivially separate the
AP-work class between up and down using st->bringup.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: max.byungchul.park@gmail.com
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Link: https://lkml.kernel.org/r/20170920170546.922524234@infradead.org
Pull DeviceTree fixes from Rob Herring:
- fix build for !OF providing empty of_find_device_by_node
- fix Abracon vendor prefix
- sync dtx_diff include paths (again)
- a stm32h7 clock binding doc fix
* tag 'devicetree-fixes-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: clk: stm32h7: fix clock-cell size
scripts/dtc: dtx_diff - 2nd update of include dts paths to match build
dt-bindings: fix vendor prefix for Abracon
of: provide inline helper for of_find_device_by_node
DENVERTON and GEMINI_LAKE support same RAPL counters as Apollo Lake.
Signed-off-by: Kan Liang <Kan.liang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: piotr.luc@intel.com
Cc: harry.pan@intel.com
Cc: srinivas.pandruvada@linux.intel.com
Link: http://lkml.kernel.org/r/20170908213449.6224-3-kan.liang@intel.com
Pull irq fixes from Thomas Gleixner:
- Add a missing NULL pointer check in free_irq()
- Fix a memory leak/memory corruption in the generic irq chip
- Add missing rcu annotations for radix tree access
- Use ffs instead of fls when extracting data from a chip register in
the MIPS GIC irq driver
- Fix the unmasking of IPI interrupts in the MIPS GIC driver so they
end up at the target CPU and not at CPU0
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irq/generic-chip: Don't replace domain's name
irqdomain: Add __rcu annotations to radix tree accessors
irqchip/mips-gic: Use effective affinity to unmask
irqchip/mips-gic: Fix shifts to extract register fields
genirq: Check __free_irq() return value for NULL
If a spinner is present, there is a chance that the load of
rwsem_has_spinner() in rwsem_wake() can be reordered with
respect to decrement of rwsem count in __up_write() leading
to wakeup being missed:
spinning writer up_write caller
--------------- -----------------------
[S] osq_unlock() [L] osq
spin_lock(wait_lock)
sem->count=0xFFFFFFFF00000001
+0xFFFFFFFF00000000
count=sem->count
MB
sem->count=0xFFFFFFFE00000001
-0xFFFFFFFF00000001
spin_trylock(wait_lock)
return
rwsem_try_write_lock(count)
spin_unlock(wait_lock)
schedule()
Reordering of atomic_long_sub_return_release() in __up_write()
and rwsem_has_spinner() in rwsem_wake() can cause missing of
wakeup in up_write() context. In spinning writer, sem->count
and local variable count is 0XFFFFFFFE00000001. It would result
in rwsem_try_write_lock() failing to acquire rwsem and spinning
writer going to sleep in rwsem_down_write_failed().
The smp_rmb() will make sure that the spinner state is
consulted after sem->count is updated in up_write context.
Signed-off-by: Prateek Sood <prsood@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Cc: longman@redhat.com
Cc: parri.andrea@gmail.com
Cc: sramana@codeaurora.org
Link: http://lkml.kernel.org/r/1504794658-15397-1-git-send-email-prsood@codeaurora.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The following commit:
d9a50b0256 ("perf/aux: Ensure aux_wakeup represents most recent wakeup index")
changed the AUX wakeup position calculation to rounddown(), which causes
a division-by-zero in AUX overwrite mode (aka "snapshot mode").
The zero denominator results from the fact that perf record doesn't set
aux_watermark to anything, in which case the kernel will set it to half
the AUX buffer size, but only for non-overwrite mode. In the overwrite
mode aux_watermark stays zero.
The good news is that, AUX overwrite mode, wakeups don't happen and
related bookkeeping is not relevant, so we can simply forego the whole
wakeup updates.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/20170906160811.16510-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
On s390x perf test 1 failed. It turned out that commit cf6383f73cf2
("perf report: Fix kernel symbol adjustment for s390x") was incorrect.
The previous implementation in dso__load_sym() is also suitable for
s390x.
Therefore this patch undoes commit cf6383f73cf2
Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Fixes: cf6383f73cf2 ("perf report: Fix kernel symbol adjustment for s390x")
LPU-Reference: 20170915071404.58398-2-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-v101o8k25vuja2ogosgf15yy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Markus reported that tasks in TASK_IDLE state are reported by SysRq-W,
which results in undesirable clutter.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
While the generic callback functions have an 'int' return and thus
appear to be allowed to return error, this is not true for all states.
Specifically, what used to be STARTING/DYING are ran with IRQs
disabled from critical parts of CPU bringup/teardown and are not
allowed to fail. Add WARNs to enforce this rule.
But since some callbacks are indeed allowed to fail, we have the
situation where a state-machine rollback encounters a failure, in this
case we're stuck, we can't go forward and we can't go back. Also add a
WARN for that case.
AFAICT this is a fundamental 'problem' with no real obvious solution.
We want the 'prepare' callbacks to allow failure on either up or down.
Typically on prepare-up this would be things like -ENOMEM from
resource allocations, and the typical usage in prepare-down would be
something like -EBUSY to avoid CPUs being taken away.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.819539119@infradead.org
Pull x86 fixes from Ingo Molnar:
"Another round of CR3/PCID related fixes (I think this addresses all
but one of the known problems with PCID support), an objtool fix plus
a Clang fix that (finally) solves all Clang quirks to build a bootable
x86 kernel as-is"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm: Fix inline asm call constraints for Clang
objtool: Handle another GCC stack pointer adjustment bug
x86/mm/32: Load a sane CR3 before cpu_init() on secondary CPUs
x86/mm/32: Move setup_clear_cpu_cap(X86_FEATURE_PCID) earlier
x86/mm/64: Stop using CR3.PCID == 0 in ASID-aware code
x86/mm: Factor out CR3-building code
The clock-cell size is 1 on stm32h7 plaform.
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@st.com>
Fixes: 3e4d618b0722 ("clk: stm32h7: Add stm32h743 clock driver")
Signed-off-by: Rob Herring <robh@kernel.org>
Goldmont, Glodmont plus and Xeon Phi have MSR_SMI_COUNT as well.
Signed-off-by: Kan Liang <Kan.liang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: piotr.luc@intel.com
Cc: harry.pan@intel.com
Cc: srinivas.pandruvada@linux.intel.com
Link: http://lkml.kernel.org/r/20170908213449.6224-2-kan.liang@intel.com
Pull objtool fixes from Thomas Gleixner:
"Two small fixes for objtool:
- Support frame pointer setup via 'lea (%rsp), %rbp' which was not
yet supported and caused build warnings
- Disable unreacahble warnings for GCC4.4 and older to avoid false
positives caused by the compiler itself"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Support unoptimized frame pointer setup
objtool: Skip unreachable warnings for GCC 4.4 and older
When generic irq chips are allocated for an irq domain the domain name is
set to the irq chip name. That was done to have named domains before the
recent changes which enforce domain naming were done.
Since then the overwrite causes a memory leak when the domain name is
dynamically allocated and even worse it would cause the domain free code to
free the wrong name pointer, which might point to a constant.
Remove the name assignment to prevent this.
Fixes: d59f6617eef0 ("genirq: Allow fwnode to carry name information only")
Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20170928043731.4764-1-jeffy.chen@rock-chips.com
There was a reported suspicion about a race between exit_pi_state_list()
and put_pi_state(). The same report mentioned the comment with
put_pi_state() said it should be called with hb->lock held, and it no
longer is in all places.
As it turns out, the pi_state->owner serialization is indeed broken. As per
the new rules:
734009e96d19 ("futex: Change locking rules")
pi_state->owner should be serialized by pi_state->pi_mutex.wait_lock.
For the sites setting pi_state->owner we already hold wait_lock (where
required) but exit_pi_state_list() and put_pi_state() were not and
raced on clearing it.
Fixes: 734009e96d19 ("futex: Change locking rules")
Reported-by: Gratian Crisan <gratian.crisan@ni.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: dvhart@infradead.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20170922154806.jd3ffltfk24m4o4y@hirez.programming.kicks-ass.net
Pull ACPI fix from Rafael Wysocki:
"This fixes an APEI problem that may cause a reported error to be
missed due to a race condition"
* tag 'acpi-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / APEI: clear error status before acknowledging the error
On s390x perf test 1 failed. It turned out that commit 4a084ecfc821
("perf report: Fix module symbol adjustment for s390x") was incorrect.
The previous implementation in dso__load_sym() is also suitable for
s390x.
Therefore this patch undoes commit 4a084ecfc821.
Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com>
Fixes: 4a084ecfc821 ("perf report: Fix module symbol adjustment for s390x")
LPU-Reference: 20170915071404.58398-1-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-5ani7ly57zji7s0hmzkx416l@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Markus reported that kthreads that idle using TASK_IDLE instead of
TASK_INTERRUPTIBLE are reported in as TASK_UNINTERRUPTIBLE and things
like htop mark those red.
This is undesirable, so add an explicit state for TASK_IDLE.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There is currently no explicit state change on rollback. That is,
st->bringup, st->rollback and st->target are not consistent when doing
the rollback.
Rework the AP state handling to be more coherent. This does mean we
have to do a second AP kick-and-wait for rollback, but since rollback
is the slow path of a slowpath, this really should not matter.
Take this opportunity to simplify the AP thread function to only run a
single callback per invocation. This unifies the three single/up/down
modes is supports. The looping it used to do for up/down are achieved
by retaining should_run and relying on the main smpboot_thread_fn()
loop.
(I have most of a patch that does the same for the BP state handling,
but that's not critical and gets a little complicated because
CPUHP_BRINGUP_CPU does the AP handoff from a callback, which gets
recursive @st usage, I still have de-fugly that.)
[ tglx: Move cpuhp_down_callbacks() et al. into the HOTPLUG_CPU section to
avoid gcc complaining about unused functions. Make the HOTPLUG_CPU
one piece instead of having two consecutive ifdef sections of the
same type. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.769658088@infradead.org
Pull timer fix from Ingo Molnar:
"A clocksource driver section mismatch fix"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/integrator: Fix section mismatch warning
For inline asm statements which have a CALL instruction, we list the
stack pointer as a constraint to convince GCC to ensure the frame
pointer is set up first:
static inline void foo()
{
register void *__sp asm(_ASM_SP);
asm("call bar" : "+r" (__sp))
}
Unfortunately, that pattern causes Clang to corrupt the stack pointer.
The fix is easy: convert the stack pointer register variable to a global
variable.
It should be noted that the end result is different based on the GCC
version. With GCC 6.4, this patch has exactly the same result as
before:
defconfig defconfig-nofp distro distro-nofp
before 9820389 9491555 8816046 8516940
after 9820389 9491555 8816046 8516940
With GCC 7.2, however, GCC's behavior has changed. It now changes its
behavior based on the conversion of the register variable to a global.
That somehow convinces it to *always* set up the frame pointer before
inserting *any* inline asm. (Therefore, listing the variable as an
output constraint is a no-op and is no longer necessary.) It's a bit
overkill, but the performance impact should be negligible. And in fact,
there's a nice improvement with frame pointers disabled:
defconfig defconfig-nofp distro distro-nofp
before 9796316 9468236 9076191 8790305
after 9796957 9464267 9076381 8785949
So in summary, while listing the stack pointer as an output constraint
is no longer necessary for newer versions of GCC, it's still needed for
older versions.
Suggested-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Update dtx_diff include paths in the same manner as:
commit b12869a8d519 ("of: remove drivers/of/testcase-data from
include search path for CPP"), commit 5ffa2aed389c ("of: remove
arch/$(SRCARCH)/boot/dts from include search path for CPP"), and
commit 50f9ddaf64e1 ("of: search scripts/dtc/include-prefixes path
for both CPP and DTC").
Remove proposed include path kernel/dts/, which was never implemented
for the dtb build.
For the diff case, each source file is compiled separately. For
each of those compiles, provide the location of the source file
as an include path, not the location of both source files.
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Skylake server uses the same C-state residency events as Sandy Bridge.
Denverton and Gemini lake use the same C-state residency events as
Apollo Lake.
Signed-off-by: Kan Liang <Kan.liang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: piotr.luc@intel.com
Cc: harry.pan@intel.com
Cc: srinivas.pandruvada@linux.intel.com
Link: http://lkml.kernel.org/r/20170908213449.6224-1-kan.liang@intel.com
Pull mtd fixes from Boris Brezillon:
- Fix partition alignment check in mtdcore.c
- Fix a buffer overflow in the Atmel NAND driver
* tag 'mtd/fixes-for-4.14-rc3' of git://git.infradead.org/linux-mtd:
mtd: nand: atmel: fix buffer overflow in atmel_pmecc_user
mtd: Fix partition alignment check on multi-erasesize devices
Arnd Bergmann reported a bunch of warnings like:
crypto/jitterentropy.o: warning: objtool: jent_fold_time()+0x3b: call without frame pointer save/setup
crypto/jitterentropy.o: warning: objtool: jent_stuck()+0x1d: call without frame pointer save/setup
crypto/jitterentropy.o: warning: objtool: jent_unbiased_bit()+0x15: call without frame pointer save/setup
crypto/jitterentropy.o: warning: objtool: jent_read_entropy()+0x32: call without frame pointer save/setup
crypto/jitterentropy.o: warning: objtool: jent_entropy_collector_free()+0x19: call without frame pointer save/setup
and
arch/x86/events/core.o: warning: objtool: collect_events uses BP as a scratch register
arch/x86/events/core.o: warning: objtool: events_ht_sysfs_show()+0x22: call without frame pointer save/setup
With certain rare configurations, GCC sometimes sets up the frame
pointer with:
lea (%rsp),%rbp
instead of:
mov %rsp,%rbp
The instructions are equivalent, so treat the former like the latter.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a468af8b28a69b83fffc6d7668be9b6fcc873699.1506526584.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fix various address spaces warning of sparse.
kernel/irq/irqdomain.c:1463:14: warning: incorrect type in assignment (different address spaces)
kernel/irq/irqdomain.c:1463:14: expected void **slot
kernel/irq/irqdomain.c:1463:14: got void [noderef] <asn:4>**
kernel/irq/irqdomain.c:1465:66: warning: incorrect type in argument 2 (different address spaces)
kernel/irq/irqdomain.c:1465:66: expected void [noderef] <asn:4>**slot
kernel/irq/irqdomain.c:1465:66: got void **slot
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: https://lkml.kernel.org/r/1506082841-11530-1-git-send-email-yamada.masahiro@socionext.com
Pull power management fixes from Rafael Wysocki:
"These fix a deadlock in the operating performance points (OPP)
framework introduced during the 4.11 cycle, more issues with duplicate
device objects for cpufreq-dt and cpufreq documentation.
Specifics:
- Fix a deadlock in the operating performance points (OPP) framework
caused by a notifier callback taking a lock that's already held by
its caller (Viresh Kumar).
- Prevent the ti-cpufreq and cpufreq-dt-platdev drivers from
attempting to register conflicting device objects which triggers a
warning from sysfs (Suniel Mahesh).
- Drop a stale reference to a piece of intel_pstate documentation
that's not in the tree any more (Rafael Wysocki)"
* tag 'pm-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: docs: Drop intel-pstate.txt from index.txt
cpufreq: dt: Fix sysfs duplicate filename creation for platform-device
PM / OPP: Call notifier without holding opp_table->lock
* acpi-apei:
ACPI / APEI: clear error status before acknowledging the error
The build of kernel v4.14-rc1 for i686 fails on RHEL 6 with the error
in tools/perf:
util/syscalltbl.c:157: error: expected ';', ',' or ')' before '__maybe_unused'
mv: cannot stat `util/.syscalltbl.o.tmp': No such file or directory
Fix it by placing/moving:
#include <linux/compiler.h>
outside of #ifdef HAVE_SYSCALL_TABLE block.
Signed-off-by: Akemi Yagi <toracat@elrepo.org>
Cc: Alan Bartlett <ajb@elrepo.org>
Link: http://lkml.kernel.org/r/oq41r8$1v9$1@blaine.gmane.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove yet another task-state char instance.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently the rollback of multi-instance states is handled inside
cpuhp_invoke_callback(). The problem is that when we want to allow an
explicit state change for rollback, we need to return from the
function without doing the rollback.
Change cpuhp_invoke_callback() to optionally return the multi-instance
state, such that rollback can be done from a subsequent call.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.720361181@infradead.org
Pull irq fixes from Ingo Molnar:
"Three irqchip driver fixes, and an affinity mask helper function bug
fix affecting x86"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "genirq: Restrict effective affinity to interrupts actually using it"
irqchip.mips-gic: Fix shared interrupt mask writes
irqchip/gic-v4: Fix building with ancient gcc
irqchip/gic-v3: Iterate over possible CPUs by for_each_possible_cpu()
gcc-4.6 and older fail to inline integrator_clocksource_init, so they
end up showing a harmless warning:
WARNING: vmlinux.o(.text+0x4aa94c): Section mismatch in reference from the function integrator_clocksource_init() to the function .init.text:clocksource_mmio_init()
The function integrator_clocksource_init() references
the function __init clocksource_mmio_init().
This is often because integrator_clocksource_init lacks a __init
annotation or the annotation of clocksource_mmio_init is wrong.
Add the missing __init annotation that makes it build cleanly with all
compilers.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: http://lkml.kernel.org/r/20170915194310.1170514-1-arnd@arndb.de
The kbuild bot reported the following warning with GCC 4.4 and a
randconfig:
net/socket.o: warning: objtool: compat_sock_ioctl()+0x1083: stack state mismatch: cfa1=7+160 cfa2=-1+0
This is caused by another GCC non-optimization, where it backs up and
restores the stack pointer for no apparent reason:
2f91: 48 89 e0 mov %rsp,%rax
2f94: 4c 89 e7 mov %r12,%rdi
2f97: 4c 89 f6 mov %r14,%rsi
2f9a: ba 20 00 00 00 mov $0x20,%edx
2f9f: 48 89 c4 mov %rax,%rsp
This issue would have been happily ignored before the following commit:
dd88a0a0c861 ("objtool: Handle GCC stack pointer adjustment bug")
But now that objtool is paying attention to such stack pointer writes
to/from a register, it needs to understand them properly. In this case
that means recognizing that the "mov %rsp, %rax" instruction is
potentially a backup of the stack pointer.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: dd88a0a0c861 ("objtool: Handle GCC stack pointer adjustment bug")
Link: http://lkml.kernel.org/r/8c7aa8e9a36fbbb6655d9d8e7cea58958c912da8.1505942196.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 446810f2dd41 ("of: add vendor prefix for Abracon Corporation")
claimed that "abcn" was used as the vendor prefix while in fact "abracon"
was used in the subsequent commits. It is also the only prefix used in the
tree.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
[robh: fix alphabetical order]
Signed-off-by: Rob Herring <robh@kernel.org>
Don't cast away the __user in __get_user_asm_u64() on x86-32.
Prevents sparse getting upset.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20170912164000.13745-1-ville.syrjala@linux.intel.com
Pull SCSI fixes from James Bottomley:
"Eight mostly minor fixes for recently discovered issues in drivers"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ILLEGAL REQUEST + ASC==27 => target failure
scsi: aacraid: Add a small delay after IOP reset
scsi: scsi_transport_fc: Also check for NOTPRESENT in fc_remote_port_add()
scsi: scsi_transport_fc: set scsi_target_id upon rescan
scsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly
scsi: aacraid: error: testing array offset 'bus' after use
scsi: lpfc: Don't return internal MBXERR_ERROR code from probe function
scsi: aacraid: Fix 2T+ drives on SmartIOC-2000
When calculating the size needed by struct atmel_pmecc_user *user,
the dmu and delta buffer sizes were forgotten.
This lead to a memory corruption (especially with a large ecc_strength).
Link: http://lkml.kernel.org/r/1506503157.3016.5.camel@gmail.com
Fixes: f88fc122cc34 ("mtd: nand: Cleanup/rework the atmel_nand driver")
Cc: stable@vger.kernel.org
Reported-by: Richard Genoud <richard.genoud@gmail.com>
Pointed-at-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Reviewed-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
The kbuild bot occasionally reports warnings like:
drivers/scsi/pcmcia/aha152x_core.o: warning: objtool: seldo_run()+0x130: unreachable instruction
These warnings are always with GCC 4.4. That version of GCC sometimes
places unreachable instructions after calls to noreturn functions.
The unreachable warnings aren't very important anyway. Just ignore them
for old versions of GCC.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/bc89b807d965b98ec18a0bb94f96a594bd58f2f2.1506551639.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 7778c4b27cbe ("irqchip: mips-gic: Use pcpu_masks to avoid reading
GIC_SH_MASK*") adjusted the way we handle masking interrupts to set &
clear the interrupt's bit in each pcpu_mask. This allows us to avoid
needing to read the GIC mask registers and perform a bitwise and of
their values with the pending & pcpu_masks.
Unfortunately this didn't quite work for IPIs, which were mapped to a
particular CPU/VP during initialisation but never set the affinity or
effective_affinity fields of their struct irq_desc. This led to them
losing their affinity when gic_unmask_irq() was called for them, and
they'd all become affine to cpu0.
Fix this by:
1) Setting the effective affinity of interrupts in
gic_shared_irq_domain_map(), which is where we actually map an
interrupt to a CPU/VP. This ensures that the effective affinity mask
is always valid, not just after explicitly setting affinity.
2) Using an interrupt's effective affinity when unmasking it, which
prevents gic_unmask_irq() from unintentionally changing which
pcpu_mask includes an interrupt.
Fixes: 7778c4b27cbe ("irqchip: mips-gic: Use pcpu_masks to avoid reading GIC_SH_MASK*")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: https://lkml.kernel.org/r/20170922062440.23701-3-paul.burton@imgtec.com
Pull xfs fixes from Darrick Wong:
- fix various problems with the copy-on-write extent maps getting freed
at the wrong time
- fix printk format specifier problems
- report zeroing operation outcomes instead of dropping them on the
floor
- fix some crashes when dio operations partially fail
- fix a race condition between unwritten extent conversion & dio read
- fix some incorrect tests in the inode log item processing
- correct the delayed allocation space reservations on rmap filesystems
- fix some problems checking for dax support
* tag 'xfs-4.14-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: revert "xfs: factor rmap btree size into the indlen calculations"
xfs: Capture state of the right inode in xfs_iflush_done
xfs: perag initialization should only touch m_ag_max_usable for AG 0
xfs: update i_size after unwritten conversion in dio completion
iomap_dio_rw: Allocate AIO completion queue before submitting dio
xfs: validate bdev support for DAX inode flag
xfs: remove redundant re-initialization of total_nr_pages
xfs: Output warning message when discard option was enabled even though the device does not support discard
xfs: report zeroed or not correctly in xfs_zero_range()
xfs: kill meaningless variable 'zero'
fs/xfs: Use %pS printk format for direct addresses
xfs: evict CoW fork extents when performing finsert/fcollapse
xfs: don't unconditionally clear the reflink flag on zero-block files
Pull x86 fixes from Thomas Gleixner:
"This contains the following fixes and improvements:
- Avoid dereferencing an unprotected VMA pointer in the fault signal
generation code
- Fix inline asm call constraints for GCC 4.4
- Use existing register variable to retrieve the stack pointer
instead of forcing the compiler to create another indirect access
which results in excessive extra 'mov %rsp, %<dst>' instructions
- Disable branch profiling for the memory encryption code to prevent
an early boot crash
- Fix a sparse warning caused by casting the __user annotation in
__get_user_asm_u64() away
- Fix an off by one error in the loop termination of the error patch
in the x86 sysfs init code
- Add missing CPU IDs to various Intel specific drivers to enable the
functionality on recent hardware
- More (init) constification in the numachip code"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm: Use register variable to get stack pointer value
x86/mm: Disable branch profiling in mem_encrypt.c
x86/asm: Fix inline asm call constraints for GCC 4.4
perf/x86/intel/uncore: Correct num_boxes for IIO and IRP
perf/x86/intel/rapl: Add missing CPU IDs
perf/x86/msr: Add missing CPU IDs
perf/x86/intel/cstate: Add missing CPU IDs
x86: Don't cast away the __user in __get_user_asm_u64()
x86/sysfs: Fix off-by-one error in loop termination
x86/mm: Fix fault error path using unsafe vma pointer
x86/numachip: Add const and __initconst to numachip2_clockevent
Pull timer fixes from Thomas Gleixner:
"This adds a new timer wheel function which is required for the
conversion of the timer callback function from the 'unsigned long
data' argument to 'struct timer_list *timer'. This conversion has two
benefits:
1) It makes struct timer_list smaller
2) Many callers hand in a pointer to the timer or to the structure
containing the timer, which happens via type casting both at setup
and in the callback. This change gets rid of the typecasts.
Once the conversion is complete, which is planned for 4.15, the old
setup function and the intermediate typecast in the new setup function
go away along with the data field in struct timer_list.
Merging this now into mainline allows a smooth queueing of the actual
conversion in the affected maintainer trees without creating
dependencies"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
um/time: Fixup namespace collision
timer: Prepare to change timer callback argument type
Currently we use current_stack_pointer() function to get the value
of the stack pointer register. Since commit:
f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
... we have a stack register variable declared. It can be used instead of
current_stack_pointer() function which allows to optimize away some
excessive "mov %rsp, %<dst>" instructions:
-mov %rsp,%rdx
-sub %rdx,%rax
-cmp $0x3fff,%rax
-ja ffffffff810722fd <ist_begin_non_atomic+0x2d>
+sub %rsp,%rax
+cmp $0x3fff,%rax
+ja ffffffff810722fa <ist_begin_non_atomic+0x2a>
Remove current_stack_pointer(), rename __asm_call_sp to current_stack_pointer
and use it instead of the removed function.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170929141537.29167-1-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull smp/hotplug fixes from Thomas Gleixner:
"This addresses the fallout of the new lockdep mechanism which covers
completions in the CPU hotplug code.
The lockdep splats are false positives, but there is no way to
annotate that reliably. The solution is to split the completions for
CPU up and down, which requires some reshuffling of the failure
rollback handling as well"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
smp/hotplug: Hotplug state fail injection
smp/hotplug: Differentiate the AP completion between up and down
smp/hotplug: Differentiate the AP-work lockdep class between up and down
smp/hotplug: Callback vs state-machine consistency
smp/hotplug: Rewrite AP state machine core
smp/hotplug: Allow external multi-instance rollback
smp/hotplug: Add state diagram
The new timer_setup() function for struct timer_list collides with a
private um function. Rename it.
Fixes: 686fef928bba ("timer: Prepare to change timer callback argument type")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: Kees Cook <keescook@chromium.org>
Some routines in mem_encrypt.c are called very early in the boot process,
e.g. sme_encrypt_kernel(). When CONFIG_TRACE_BRANCH_PROFILING=y is defined
the resulting branch profiling associated with the check to see if SME is
active results in a kernel crash. Disable branch profiling for
mem_encrypt.c by defining DISABLE_BRANCH_PROFILING before including any
header files.
Reported-by: kernel test robot <lkp@01.org>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170929162419.6016.53390.stgit@tlendack-t1.amdoffice.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull scheduler fixes from Thomas Gleixner:
"The scheduler pull request comes with the following updates:
- Prevent a divide by zero issue by validating the input value of
sysctl_sched_time_avg
- Make task state printing consistent all over the place and have
explicit state characters for IDLE and PARKED so they wont be
displayed as 'D' state which confuses tools"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/sysctl: Check user input value of sysctl_sched_time_avg
sched/debug: Add explicit TASK_PARKED printing
sched/debug: Ignore TASK_IDLE for SysRq-W
sched/debug: Add explicit TASK_IDLE printing
sched/tracing: Use common task-state helpers
sched/tracing: Fix trace_sched_switch task-state printing
sched/debug: Remove unused variable
sched/debug: Convert TASK_state to hex
sched/debug: Implement consistent task-state printing
Add a sysfs file to one-time fail a specific state. This can be used
to test the state rollback code paths.
Something like this (hotplug-up.sh):
#!/bin/bash
echo 0 > /debug/sched_debug
echo 1 > /debug/tracing/events/cpuhp/enable
ALL_STATES=`cat /sys/devices/system/cpu/hotplug/states | cut -d':' -f1`
STATES=${1:-$ALL_STATES}
for state in $STATES
do
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 0 > /debug/tracing/trace
echo Fail state: $state
echo $state > /sys/devices/system/cpu/cpu1/hotplug/fail
cat /sys/devices/system/cpu/cpu1/hotplug/fail
echo 1 > /sys/devices/system/cpu/cpu1/online
cat /debug/tracing/trace > hotfail-${state}.trace
sleep 1
done
Can be used to test for all possible rollback (barring multi-instance)
scenarios on CPU-up, CPU-down is a trivial modification of the above.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.972581715@infradead.org
Modern kernel callback systems pass the structure associated with a
given callback to the callback function. The timer callback remains one
of the legacy cases where an arbitrary unsigned long argument continues
to be passed as the callback argument. This has several problems:
- This bloats the timer_list structure with a normally redundant
.data field.
- No type checking is being performed, forcing callbacks to do
explicit type casts of the unsigned long argument into the object
that was passed, rather than using container_of(), as done in most
of the other callback infrastructure.
- Neighboring buffer overflows can overwrite both the .function and
the .data field, providing attackers with a way to elevate from a buffer
overflow into a simplistic ROP-like mechanism that allows calling
arbitrary functions with a controlled first argument.
- For future Control Flow Integrity work, this creates a unique function
prototype for timer callbacks, instead of allowing them to continue to
be clustered with other void functions that take a single unsigned long
argument.
This adds a new timer initialization API, which will ultimately replace
the existing setup_timer(), setup_{deferrable,pinned,etc}_timer() family,
named timer_setup() (to mirror hrtimer_setup(), making instances of its
use much easier to grep for).
In order to support the migration of existing timers into the new
callback arguments, timer_setup() casts its arguments to the existing
legacy types, and explicitly passes the timer pointer as the legacy
data argument. Once all setup_*timer() callers have been replaced with
timer_setup(), the casts can be removed, and the data argument can be
dropped with the timer expiration code changed to just pass the timer
to the callback directly.
Since the regular pattern of using container_of() during local variable
declaration repeats the need for the variable type declaration
to be included, this adds a helper modeled after other from_*()
helpers that wrap container_of(), named from_timer(). This helper uses
typeof(*variable), removing the type redundancy and minimizing the need
for line wraps in forthcoming conversions from "unsigned data long" to
"struct timer_list *" in the timer callbacks:
-void callback(unsigned long data)
+void callback(struct timer_list *t)
{
- struct some_data_structure *local = (struct some_data_structure *)data;
+ struct some_data_structure *local = from_timer(local, t, timer);
Finally, in order to support the handful of timer users that perform
open-coded assignments of the .function (and .data) fields, provide
cast macros (TIMER_FUNC_TYPE and TIMER_DATA_TYPE) that can be used
temporarily. Once conversion has been completed, these can be globally
trivially removed.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20170928133817.GA113410@beast
The kernel test bot (run by Xiaolong Ye) reported that the following commit:
f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
is causing double faults in a kernel compiled with GCC 4.4.
Linus subsequently diagnosed the crash pattern and the buggy commit and found that
the issue is with this code:
register unsigned int __asm_call_sp asm("esp");
#define ASM_CALL_CONSTRAINT "+r" (__asm_call_sp)
Even on a 64-bit kernel, it's using ESP instead of RSP. That causes GCC
to produce the following bogus code:
ffffffff8147461d: 89 e0 mov %esp,%eax
ffffffff8147461f: 4c 89 f7 mov %r14,%rdi
ffffffff81474622: 4c 89 fe mov %r15,%rsi
ffffffff81474625: ba 20 00 00 00 mov $0x20,%edx
ffffffff8147462a: 89 c4 mov %eax,%esp
ffffffff8147462c: e8 bf 52 05 00 callq ffffffff814c98f0 <copy_user_generic_unrolled>
Despite the absurdity of it backing up and restoring the stack pointer
for no reason, the bug is actually the fact that it's only backing up
and restoring the lower 32 bits of the stack pointer. The upper 32 bits
are getting cleared out, corrupting the stack pointer.
So change the '__asm_call_sp' register variable to be associated with
the actual full-size stack pointer.
This also requires changing the __ASM_SEL() macro to be based on the
actual compiled arch size, rather than the CONFIG value, because
CONFIG_X86_64 compiles some files with '-m32' (e.g., realmode and vdso).
Otherwise Clang fails to build the kernel because it complains about the
use of a 64-bit register (RSP) in a 32-bit file.
Reported-and-Bisected-and-Tested-by: kernel test robot <xiaolong.ye@intel.com>
Diagnosed-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: LKP <lkp@01.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
Link: http://lkml.kernel.org/r/20170928215826.6sdpmwtkiydiytim@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull perf fixes from Thomas Gleixner:
- Prevent a division by zero in the perf aux buffer handling
- Sync kernel headers with perf tool headers
- Fix a build failure in the syscalltbl code
- Make the debug messages of perf report --call-graph work correctly
- Make sure that all required perf files are in the MANIFEST for
container builds
- Fix the atrr.exclude kernel handling so it respects the
perf_event_paranoid and the user permissions
- Make perf test on s390x work correctly
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/aux: Only update ->aux_wakeup in non-overwrite mode
perf test: Fix vmlinux failure on s390x part 2
perf test: Fix vmlinux failure on s390x
perf tools: Fix syscalltbl build failure
perf report: Fix debug messages with --call-graph option
perf evsel: Fix attr.exclude_kernel setting for default cycles:p
tools include: Sync kernel ABI headers with tooling headers
perf tools: Get all of tools/{arch,include}/ in the MANIFEST
System will hang if user set sysctl_sched_time_avg to 0:
[root@XXX ~]# sysctl kernel.sched_time_avg_ms=0
Stack traceback for pid 0
0xffff883f6406c600 0 0 1 3 R 0xffff883f6406cf50 *swapper/3
ffff883f7ccc3ae8 0000000000000018 ffffffff810c4dd0 0000000000000000
0000000000017800 ffff883f7ccc3d78 0000000000000003 ffff883f7ccc3bf8
ffffffff810c4fc9 ffff883f7ccc3c08 00000000810c5043 ffff883f7ccc3c08
Call Trace:
<IRQ> [<ffffffff810c4dd0>] ? update_group_capacity+0x110/0x200
[<ffffffff810c4fc9>] ? update_sd_lb_stats+0x109/0x600
[<ffffffff810c5507>] ? find_busiest_group+0x47/0x530
[<ffffffff810c5b84>] ? load_balance+0x194/0x900
[<ffffffff810ad5ca>] ? update_rq_clock.part.83+0x1a/0xe0
[<ffffffff810c6d42>] ? rebalance_domains+0x152/0x290
[<ffffffff810c6f5c>] ? run_rebalance_domains+0xdc/0x1d0
[<ffffffff8108a75b>] ? __do_softirq+0xfb/0x320
[<ffffffff8108ac85>] ? irq_exit+0x125/0x130
[<ffffffff810b3a17>] ? scheduler_ipi+0x97/0x160
[<ffffffff81052709>] ? smp_reschedule_interrupt+0x29/0x30
[<ffffffff8173a1be>] ? reschedule_interrupt+0x6e/0x80
<EOI> [<ffffffff815bc83c>] ? cpuidle_enter_state+0xcc/0x230
[<ffffffff815bc80c>] ? cpuidle_enter_state+0x9c/0x230
[<ffffffff815bc9d7>] ? cpuidle_enter+0x17/0x20
[<ffffffff810cd6dc>] ? cpu_startup_entry+0x38c/0x420
[<ffffffff81053373>] ? start_secondary+0x173/0x1e0
Because divide-by-zero error happens in function:
update_group_capacity()
update_cpu_capacity()
scale_rt_capacity()
{
...
total = sched_avg_period() + delta;
used = div_u64(avg, total);
...
}
To fix this issue, check user input value of sysctl_sched_time_avg, keep
it unchanged when hitting invalid input, and set the minimum limit of
sysctl_sched_time_avg to 1 ms.
Reported-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: efault@gmx.de
Cc: ethan.kernel@gmail.com
Cc: keescook@chromium.org
Cc: mcgrof@kernel.org
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1504504774-18253-1-git-send-email-ethan.zhao@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With lockdep-crossrelease we get deadlock reports that span cpu-up and
cpu-down chains. Such deadlocks cannot possibly happen because cpu-up
and cpu-down are globally serialized.
takedown_cpu()
irq_lock_sparse()
wait_for_completion(&st->done)
cpuhp_thread_fun
cpuhp_up_callback
cpuhp_invoke_callback
irq_affinity_online_cpu
irq_local_spare()
irq_unlock_sparse()
complete(&st->done)
Now that we have consistent AP state, we can trivially separate the
AP completion between up and down using st->bringup.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: max.byungchul.park@gmail.com
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Link: https://lkml.kernel.org/r/20170920170546.872472799@infradead.org
There are 6 IIO/IRP boxes for CBDMA, PCIe0-2, MCP 0 and MCP 1
separately. Correct the num_boxes.
Signed-off-by: Kan Liang <Kan.liang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: acme@kernel.org
Link: http://lkml.kernel.org/r/1505149816-12580-1-git-send-email-kan.liang@intel.com
Pull locking fixes from Thomas Gleixner:
"Two fixes for locking:
- Plug a hole the pi_stat->owner serialization which was changed
recently and failed to fixup two usage sites.
- Prevent reordering of the rwsem_has_spinner() check vs the
decrement of rwsem count in up_write() which causes a missed
wakeup"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rwsem-xadd: Fix missed wakeup due to reordering of load
futex: Fix pi_state->owner serialization
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix syscalltbl build failure (Akemi Yagi)
- Fix attr.exclude_kernel setting for default cycles:p, this time for
!root with kernel.perf_event_paranoid = -1 (Arnaldo Carvalho de Melo)
- Sync kernel ABI headers with tooling headers (Ingo Molnar)
- Remove misleading debug messages with --call-graph option (Mengting Zhang)
- Revert vmlinux symbol resolution patches for s390x (Thomas Richter)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently TASK_PARKED is masqueraded as TASK_INTERRUPTIBLE, give it
its own print state because it will not in fact get woken by regular
wakeups and is a long-term state.
This requires moving TASK_PARKED into the TASK_REPORT mask, and since
that latter needs to be a contiguous bitmask, we need to shuffle the
bits around a bit.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With lockdep-crossrelease we get deadlock reports that span cpu-up and
cpu-down chains. Such deadlocks cannot possibly happen because cpu-up
and cpu-down are globally serialized.
CPU0 CPU1 CPU2
cpuhp_up_callbacks: takedown_cpu: cpuhp_thread_fun:
cpuhp_state
irq_lock_sparse()
irq_lock_sparse()
wait_for_completion()
cpuhp_state
complete()
Now that we have consistent AP state, we can trivially separate the
AP-work class between up and down using st->bringup.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: max.byungchul.park@gmail.com
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Link: https://lkml.kernel.org/r/20170920170546.922524234@infradead.org
Pull DeviceTree fixes from Rob Herring:
- fix build for !OF providing empty of_find_device_by_node
- fix Abracon vendor prefix
- sync dtx_diff include paths (again)
- a stm32h7 clock binding doc fix
* tag 'devicetree-fixes-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: clk: stm32h7: fix clock-cell size
scripts/dtc: dtx_diff - 2nd update of include dts paths to match build
dt-bindings: fix vendor prefix for Abracon
of: provide inline helper for of_find_device_by_node
DENVERTON and GEMINI_LAKE support same RAPL counters as Apollo Lake.
Signed-off-by: Kan Liang <Kan.liang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: piotr.luc@intel.com
Cc: harry.pan@intel.com
Cc: srinivas.pandruvada@linux.intel.com
Link: http://lkml.kernel.org/r/20170908213449.6224-3-kan.liang@intel.com
Pull irq fixes from Thomas Gleixner:
- Add a missing NULL pointer check in free_irq()
- Fix a memory leak/memory corruption in the generic irq chip
- Add missing rcu annotations for radix tree access
- Use ffs instead of fls when extracting data from a chip register in
the MIPS GIC irq driver
- Fix the unmasking of IPI interrupts in the MIPS GIC driver so they
end up at the target CPU and not at CPU0
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irq/generic-chip: Don't replace domain's name
irqdomain: Add __rcu annotations to radix tree accessors
irqchip/mips-gic: Use effective affinity to unmask
irqchip/mips-gic: Fix shifts to extract register fields
genirq: Check __free_irq() return value for NULL
If a spinner is present, there is a chance that the load of
rwsem_has_spinner() in rwsem_wake() can be reordered with
respect to decrement of rwsem count in __up_write() leading
to wakeup being missed:
spinning writer up_write caller
--------------- -----------------------
[S] osq_unlock() [L] osq
spin_lock(wait_lock)
sem->count=0xFFFFFFFF00000001
+0xFFFFFFFF00000000
count=sem->count
MB
sem->count=0xFFFFFFFE00000001
-0xFFFFFFFF00000001
spin_trylock(wait_lock)
return
rwsem_try_write_lock(count)
spin_unlock(wait_lock)
schedule()
Reordering of atomic_long_sub_return_release() in __up_write()
and rwsem_has_spinner() in rwsem_wake() can cause missing of
wakeup in up_write() context. In spinning writer, sem->count
and local variable count is 0XFFFFFFFE00000001. It would result
in rwsem_try_write_lock() failing to acquire rwsem and spinning
writer going to sleep in rwsem_down_write_failed().
The smp_rmb() will make sure that the spinner state is
consulted after sem->count is updated in up_write context.
Signed-off-by: Prateek Sood <prsood@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Cc: longman@redhat.com
Cc: parri.andrea@gmail.com
Cc: sramana@codeaurora.org
Link: http://lkml.kernel.org/r/1504794658-15397-1-git-send-email-prsood@codeaurora.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The following commit:
d9a50b0256 ("perf/aux: Ensure aux_wakeup represents most recent wakeup index")
changed the AUX wakeup position calculation to rounddown(), which causes
a division-by-zero in AUX overwrite mode (aka "snapshot mode").
The zero denominator results from the fact that perf record doesn't set
aux_watermark to anything, in which case the kernel will set it to half
the AUX buffer size, but only for non-overwrite mode. In the overwrite
mode aux_watermark stays zero.
The good news is that, AUX overwrite mode, wakeups don't happen and
related bookkeeping is not relevant, so we can simply forego the whole
wakeup updates.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/20170906160811.16510-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
On s390x perf test 1 failed. It turned out that commit cf6383f73cf2
("perf report: Fix kernel symbol adjustment for s390x") was incorrect.
The previous implementation in dso__load_sym() is also suitable for
s390x.
Therefore this patch undoes commit cf6383f73cf2
Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Fixes: cf6383f73cf2 ("perf report: Fix kernel symbol adjustment for s390x")
LPU-Reference: 20170915071404.58398-2-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-v101o8k25vuja2ogosgf15yy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Markus reported that tasks in TASK_IDLE state are reported by SysRq-W,
which results in undesirable clutter.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
While the generic callback functions have an 'int' return and thus
appear to be allowed to return error, this is not true for all states.
Specifically, what used to be STARTING/DYING are ran with IRQs
disabled from critical parts of CPU bringup/teardown and are not
allowed to fail. Add WARNs to enforce this rule.
But since some callbacks are indeed allowed to fail, we have the
situation where a state-machine rollback encounters a failure, in this
case we're stuck, we can't go forward and we can't go back. Also add a
WARN for that case.
AFAICT this is a fundamental 'problem' with no real obvious solution.
We want the 'prepare' callbacks to allow failure on either up or down.
Typically on prepare-up this would be things like -ENOMEM from
resource allocations, and the typical usage in prepare-down would be
something like -EBUSY to avoid CPUs being taken away.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.819539119@infradead.org
Pull x86 fixes from Ingo Molnar:
"Another round of CR3/PCID related fixes (I think this addresses all
but one of the known problems with PCID support), an objtool fix plus
a Clang fix that (finally) solves all Clang quirks to build a bootable
x86 kernel as-is"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm: Fix inline asm call constraints for Clang
objtool: Handle another GCC stack pointer adjustment bug
x86/mm/32: Load a sane CR3 before cpu_init() on secondary CPUs
x86/mm/32: Move setup_clear_cpu_cap(X86_FEATURE_PCID) earlier
x86/mm/64: Stop using CR3.PCID == 0 in ASID-aware code
x86/mm: Factor out CR3-building code
Goldmont, Glodmont plus and Xeon Phi have MSR_SMI_COUNT as well.
Signed-off-by: Kan Liang <Kan.liang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: piotr.luc@intel.com
Cc: harry.pan@intel.com
Cc: srinivas.pandruvada@linux.intel.com
Link: http://lkml.kernel.org/r/20170908213449.6224-2-kan.liang@intel.com
Pull objtool fixes from Thomas Gleixner:
"Two small fixes for objtool:
- Support frame pointer setup via 'lea (%rsp), %rbp' which was not
yet supported and caused build warnings
- Disable unreacahble warnings for GCC4.4 and older to avoid false
positives caused by the compiler itself"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Support unoptimized frame pointer setup
objtool: Skip unreachable warnings for GCC 4.4 and older
When generic irq chips are allocated for an irq domain the domain name is
set to the irq chip name. That was done to have named domains before the
recent changes which enforce domain naming were done.
Since then the overwrite causes a memory leak when the domain name is
dynamically allocated and even worse it would cause the domain free code to
free the wrong name pointer, which might point to a constant.
Remove the name assignment to prevent this.
Fixes: d59f6617eef0 ("genirq: Allow fwnode to carry name information only")
Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20170928043731.4764-1-jeffy.chen@rock-chips.com
There was a reported suspicion about a race between exit_pi_state_list()
and put_pi_state(). The same report mentioned the comment with
put_pi_state() said it should be called with hb->lock held, and it no
longer is in all places.
As it turns out, the pi_state->owner serialization is indeed broken. As per
the new rules:
734009e96d19 ("futex: Change locking rules")
pi_state->owner should be serialized by pi_state->pi_mutex.wait_lock.
For the sites setting pi_state->owner we already hold wait_lock (where
required) but exit_pi_state_list() and put_pi_state() were not and
raced on clearing it.
Fixes: 734009e96d19 ("futex: Change locking rules")
Reported-by: Gratian Crisan <gratian.crisan@ni.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: dvhart@infradead.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20170922154806.jd3ffltfk24m4o4y@hirez.programming.kicks-ass.net
On s390x perf test 1 failed. It turned out that commit 4a084ecfc821
("perf report: Fix module symbol adjustment for s390x") was incorrect.
The previous implementation in dso__load_sym() is also suitable for
s390x.
Therefore this patch undoes commit 4a084ecfc821.
Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com>
Fixes: 4a084ecfc821 ("perf report: Fix module symbol adjustment for s390x")
LPU-Reference: 20170915071404.58398-1-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-5ani7ly57zji7s0hmzkx416l@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Markus reported that kthreads that idle using TASK_IDLE instead of
TASK_INTERRUPTIBLE are reported in as TASK_UNINTERRUPTIBLE and things
like htop mark those red.
This is undesirable, so add an explicit state for TASK_IDLE.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There is currently no explicit state change on rollback. That is,
st->bringup, st->rollback and st->target are not consistent when doing
the rollback.
Rework the AP state handling to be more coherent. This does mean we
have to do a second AP kick-and-wait for rollback, but since rollback
is the slow path of a slowpath, this really should not matter.
Take this opportunity to simplify the AP thread function to only run a
single callback per invocation. This unifies the three single/up/down
modes is supports. The looping it used to do for up/down are achieved
by retaining should_run and relying on the main smpboot_thread_fn()
loop.
(I have most of a patch that does the same for the BP state handling,
but that's not critical and gets a little complicated because
CPUHP_BRINGUP_CPU does the AP handoff from a callback, which gets
recursive @st usage, I still have de-fugly that.)
[ tglx: Move cpuhp_down_callbacks() et al. into the HOTPLUG_CPU section to
avoid gcc complaining about unused functions. Make the HOTPLUG_CPU
one piece instead of having two consecutive ifdef sections of the
same type. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.769658088@infradead.org
For inline asm statements which have a CALL instruction, we list the
stack pointer as a constraint to convince GCC to ensure the frame
pointer is set up first:
static inline void foo()
{
register void *__sp asm(_ASM_SP);
asm("call bar" : "+r" (__sp))
}
Unfortunately, that pattern causes Clang to corrupt the stack pointer.
The fix is easy: convert the stack pointer register variable to a global
variable.
It should be noted that the end result is different based on the GCC
version. With GCC 6.4, this patch has exactly the same result as
before:
defconfig defconfig-nofp distro distro-nofp
before 9820389 9491555 8816046 8516940
after 9820389 9491555 8816046 8516940
With GCC 7.2, however, GCC's behavior has changed. It now changes its
behavior based on the conversion of the register variable to a global.
That somehow convinces it to *always* set up the frame pointer before
inserting *any* inline asm. (Therefore, listing the variable as an
output constraint is a no-op and is no longer necessary.) It's a bit
overkill, but the performance impact should be negligible. And in fact,
there's a nice improvement with frame pointers disabled:
defconfig defconfig-nofp distro distro-nofp
before 9796316 9468236 9076191 8790305
after 9796957 9464267 9076381 8785949
So in summary, while listing the stack pointer as an output constraint
is no longer necessary for newer versions of GCC, it's still needed for
older versions.
Suggested-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Update dtx_diff include paths in the same manner as:
commit b12869a8d519 ("of: remove drivers/of/testcase-data from
include search path for CPP"), commit 5ffa2aed389c ("of: remove
arch/$(SRCARCH)/boot/dts from include search path for CPP"), and
commit 50f9ddaf64e1 ("of: search scripts/dtc/include-prefixes path
for both CPP and DTC").
Remove proposed include path kernel/dts/, which was never implemented
for the dtb build.
For the diff case, each source file is compiled separately. For
each of those compiles, provide the location of the source file
as an include path, not the location of both source files.
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Skylake server uses the same C-state residency events as Sandy Bridge.
Denverton and Gemini lake use the same C-state residency events as
Apollo Lake.
Signed-off-by: Kan Liang <Kan.liang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: piotr.luc@intel.com
Cc: harry.pan@intel.com
Cc: srinivas.pandruvada@linux.intel.com
Link: http://lkml.kernel.org/r/20170908213449.6224-1-kan.liang@intel.com
Pull mtd fixes from Boris Brezillon:
- Fix partition alignment check in mtdcore.c
- Fix a buffer overflow in the Atmel NAND driver
* tag 'mtd/fixes-for-4.14-rc3' of git://git.infradead.org/linux-mtd:
mtd: nand: atmel: fix buffer overflow in atmel_pmecc_user
mtd: Fix partition alignment check on multi-erasesize devices
Arnd Bergmann reported a bunch of warnings like:
crypto/jitterentropy.o: warning: objtool: jent_fold_time()+0x3b: call without frame pointer save/setup
crypto/jitterentropy.o: warning: objtool: jent_stuck()+0x1d: call without frame pointer save/setup
crypto/jitterentropy.o: warning: objtool: jent_unbiased_bit()+0x15: call without frame pointer save/setup
crypto/jitterentropy.o: warning: objtool: jent_read_entropy()+0x32: call without frame pointer save/setup
crypto/jitterentropy.o: warning: objtool: jent_entropy_collector_free()+0x19: call without frame pointer save/setup
and
arch/x86/events/core.o: warning: objtool: collect_events uses BP as a scratch register
arch/x86/events/core.o: warning: objtool: events_ht_sysfs_show()+0x22: call without frame pointer save/setup
With certain rare configurations, GCC sometimes sets up the frame
pointer with:
lea (%rsp),%rbp
instead of:
mov %rsp,%rbp
The instructions are equivalent, so treat the former like the latter.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a468af8b28a69b83fffc6d7668be9b6fcc873699.1506526584.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fix various address spaces warning of sparse.
kernel/irq/irqdomain.c:1463:14: warning: incorrect type in assignment (different address spaces)
kernel/irq/irqdomain.c:1463:14: expected void **slot
kernel/irq/irqdomain.c:1463:14: got void [noderef] <asn:4>**
kernel/irq/irqdomain.c:1465:66: warning: incorrect type in argument 2 (different address spaces)
kernel/irq/irqdomain.c:1465:66: expected void [noderef] <asn:4>**slot
kernel/irq/irqdomain.c:1465:66: got void **slot
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: https://lkml.kernel.org/r/1506082841-11530-1-git-send-email-yamada.masahiro@socionext.com
Pull power management fixes from Rafael Wysocki:
"These fix a deadlock in the operating performance points (OPP)
framework introduced during the 4.11 cycle, more issues with duplicate
device objects for cpufreq-dt and cpufreq documentation.
Specifics:
- Fix a deadlock in the operating performance points (OPP) framework
caused by a notifier callback taking a lock that's already held by
its caller (Viresh Kumar).
- Prevent the ti-cpufreq and cpufreq-dt-platdev drivers from
attempting to register conflicting device objects which triggers a
warning from sysfs (Suniel Mahesh).
- Drop a stale reference to a piece of intel_pstate documentation
that's not in the tree any more (Rafael Wysocki)"
* tag 'pm-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: docs: Drop intel-pstate.txt from index.txt
cpufreq: dt: Fix sysfs duplicate filename creation for platform-device
PM / OPP: Call notifier without holding opp_table->lock
The build of kernel v4.14-rc1 for i686 fails on RHEL 6 with the error
in tools/perf:
util/syscalltbl.c:157: error: expected ';', ',' or ')' before '__maybe_unused'
mv: cannot stat `util/.syscalltbl.o.tmp': No such file or directory
Fix it by placing/moving:
#include <linux/compiler.h>
outside of #ifdef HAVE_SYSCALL_TABLE block.
Signed-off-by: Akemi Yagi <toracat@elrepo.org>
Cc: Alan Bartlett <ajb@elrepo.org>
Link: http://lkml.kernel.org/r/oq41r8$1v9$1@blaine.gmane.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove yet another task-state char instance.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently the rollback of multi-instance states is handled inside
cpuhp_invoke_callback(). The problem is that when we want to allow an
explicit state change for rollback, we need to return from the
function without doing the rollback.
Change cpuhp_invoke_callback() to optionally return the multi-instance
state, such that rollback can be done from a subsequent call.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.720361181@infradead.org
Pull irq fixes from Ingo Molnar:
"Three irqchip driver fixes, and an affinity mask helper function bug
fix affecting x86"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "genirq: Restrict effective affinity to interrupts actually using it"
irqchip.mips-gic: Fix shared interrupt mask writes
irqchip/gic-v4: Fix building with ancient gcc
irqchip/gic-v3: Iterate over possible CPUs by for_each_possible_cpu()
gcc-4.6 and older fail to inline integrator_clocksource_init, so they
end up showing a harmless warning:
WARNING: vmlinux.o(.text+0x4aa94c): Section mismatch in reference from the function integrator_clocksource_init() to the function .init.text:clocksource_mmio_init()
The function integrator_clocksource_init() references
the function __init clocksource_mmio_init().
This is often because integrator_clocksource_init lacks a __init
annotation or the annotation of clocksource_mmio_init is wrong.
Add the missing __init annotation that makes it build cleanly with all
compilers.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: http://lkml.kernel.org/r/20170915194310.1170514-1-arnd@arndb.de
The kbuild bot reported the following warning with GCC 4.4 and a
randconfig:
net/socket.o: warning: objtool: compat_sock_ioctl()+0x1083: stack state mismatch: cfa1=7+160 cfa2=-1+0
This is caused by another GCC non-optimization, where it backs up and
restores the stack pointer for no apparent reason:
2f91: 48 89 e0 mov %rsp,%rax
2f94: 4c 89 e7 mov %r12,%rdi
2f97: 4c 89 f6 mov %r14,%rsi
2f9a: ba 20 00 00 00 mov $0x20,%edx
2f9f: 48 89 c4 mov %rax,%rsp
This issue would have been happily ignored before the following commit:
dd88a0a0c861 ("objtool: Handle GCC stack pointer adjustment bug")
But now that objtool is paying attention to such stack pointer writes
to/from a register, it needs to understand them properly. In this case
that means recognizing that the "mov %rsp, %rax" instruction is
potentially a backup of the stack pointer.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: dd88a0a0c861 ("objtool: Handle GCC stack pointer adjustment bug")
Link: http://lkml.kernel.org/r/8c7aa8e9a36fbbb6655d9d8e7cea58958c912da8.1505942196.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 446810f2dd41 ("of: add vendor prefix for Abracon Corporation")
claimed that "abcn" was used as the vendor prefix while in fact "abracon"
was used in the subsequent commits. It is also the only prefix used in the
tree.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
[robh: fix alphabetical order]
Signed-off-by: Rob Herring <robh@kernel.org>
Don't cast away the __user in __get_user_asm_u64() on x86-32.
Prevents sparse getting upset.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20170912164000.13745-1-ville.syrjala@linux.intel.com
Pull SCSI fixes from James Bottomley:
"Eight mostly minor fixes for recently discovered issues in drivers"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ILLEGAL REQUEST + ASC==27 => target failure
scsi: aacraid: Add a small delay after IOP reset
scsi: scsi_transport_fc: Also check for NOTPRESENT in fc_remote_port_add()
scsi: scsi_transport_fc: set scsi_target_id upon rescan
scsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly
scsi: aacraid: error: testing array offset 'bus' after use
scsi: lpfc: Don't return internal MBXERR_ERROR code from probe function
scsi: aacraid: Fix 2T+ drives on SmartIOC-2000
When calculating the size needed by struct atmel_pmecc_user *user,
the dmu and delta buffer sizes were forgotten.
This lead to a memory corruption (especially with a large ecc_strength).
Link: http://lkml.kernel.org/r/1506503157.3016.5.camel@gmail.com
Fixes: f88fc122cc34 ("mtd: nand: Cleanup/rework the atmel_nand driver")
Cc: stable@vger.kernel.org
Reported-by: Richard Genoud <richard.genoud@gmail.com>
Pointed-at-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Reviewed-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
The kbuild bot occasionally reports warnings like:
drivers/scsi/pcmcia/aha152x_core.o: warning: objtool: seldo_run()+0x130: unreachable instruction
These warnings are always with GCC 4.4. That version of GCC sometimes
places unreachable instructions after calls to noreturn functions.
The unreachable warnings aren't very important anyway. Just ignore them
for old versions of GCC.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/bc89b807d965b98ec18a0bb94f96a594bd58f2f2.1506551639.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 7778c4b27cbe ("irqchip: mips-gic: Use pcpu_masks to avoid reading
GIC_SH_MASK*") adjusted the way we handle masking interrupts to set &
clear the interrupt's bit in each pcpu_mask. This allows us to avoid
needing to read the GIC mask registers and perform a bitwise and of
their values with the pending & pcpu_masks.
Unfortunately this didn't quite work for IPIs, which were mapped to a
particular CPU/VP during initialisation but never set the affinity or
effective_affinity fields of their struct irq_desc. This led to them
losing their affinity when gic_unmask_irq() was called for them, and
they'd all become affine to cpu0.
Fix this by:
1) Setting the effective affinity of interrupts in
gic_shared_irq_domain_map(), which is where we actually map an
interrupt to a CPU/VP. This ensures that the effective affinity mask
is always valid, not just after explicitly setting affinity.
2) Using an interrupt's effective affinity when unmasking it, which
prevents gic_unmask_irq() from unintentionally changing which
pcpu_mask includes an interrupt.
Fixes: 7778c4b27cbe ("irqchip: mips-gic: Use pcpu_masks to avoid reading GIC_SH_MASK*")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: https://lkml.kernel.org/r/20170922062440.23701-3-paul.burton@imgtec.com
Pull xfs fixes from Darrick Wong:
- fix various problems with the copy-on-write extent maps getting freed
at the wrong time
- fix printk format specifier problems
- report zeroing operation outcomes instead of dropping them on the
floor
- fix some crashes when dio operations partially fail
- fix a race condition between unwritten extent conversion & dio read
- fix some incorrect tests in the inode log item processing
- correct the delayed allocation space reservations on rmap filesystems
- fix some problems checking for dax support
* tag 'xfs-4.14-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: revert "xfs: factor rmap btree size into the indlen calculations"
xfs: Capture state of the right inode in xfs_iflush_done
xfs: perag initialization should only touch m_ag_max_usable for AG 0
xfs: update i_size after unwritten conversion in dio completion
iomap_dio_rw: Allocate AIO completion queue before submitting dio
xfs: validate bdev support for DAX inode flag
xfs: remove redundant re-initialization of total_nr_pages
xfs: Output warning message when discard option was enabled even though the device does not support discard
xfs: report zeroed or not correctly in xfs_zero_range()
xfs: kill meaningless variable 'zero'
fs/xfs: Use %pS printk format for direct addresses
xfs: evict CoW fork extents when performing finsert/fcollapse
xfs: don't unconditionally clear the reflink flag on zero-block files