commits
What happens is that a write to /dev/sg is given a request with non-zero
->iovec_count combined with zero ->dxfer_len. Or with ->dxferp pointing
to an array full of empty iovecs.
Having write permission to /dev/sg shouldn't be equivalent to the
ability to trigger BUG_ON() while holding spinlocks...
Found by Dmitry Vyukov and syzkaller.
[ The BUG_ON() got changed to a WARN_ON_ONCE(), but this fixes the
underlying issue. - Linus ]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Don't crash the machine just because of an empty transfer. Use WARN_ON()
combined with returning an error.
Found by Dmitry Vyukov and syzkaller.
[ Changed to "WARN_ON_ONCE()". Al has a patch that should fix the root
cause, but a BUG_ON() is not acceptable in any case, and a WARN_ON()
might still be a cause of excessive log spamming.
NOTE! If this warning ever triggers, we may end up leaking resources,
since this doesn't bother to try to clean the command up. So this
WARN_ON_ONCE() triggering does imply real problems. But BUG_ON() is
much worse.
People really need to stop using BUG_ON() for "this shouldn't ever
happen". It makes pretty much any bug worse. - Linus ]
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: James Bottomley <jejb@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If ip6_dst_lookup_tail has acquired a dst and fails the IPv4-mapped
check, release the dst before returning an error.
Fixes: ec5e3b0a1d41 ("ipv6: Inhibit IPv4-mapped src address on the wire.")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull ARM SoC fixes from Arnd Bergmann:
"Two more bugfixes that came in during this week:
- a defconfig change to enable a vital driver used on some Qualcomm
based phones. This was already queued for 4.11, but the maintainer
asked to have it in 4.10 after all.
- a regression fix for the reset controller framework, this got
broken by a typo in the 4.10 merge window"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: multi_v7_defconfig: enable Qualcomm RPMCC
reset: fix shared reset triggered_count decrement on error
Pull ARM fixes from Russell King:
"A couple of fixes from Kees concerning problems he spotted with our
user access support"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8658/1: uaccess: fix zeroing of 64-bit get_user()
ARM: 8657/1: uaccess: consistently check object sizes
This patch enables the Qualcomm RPM based Clock Controller present on
A-family boards.
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull x86 fix from Thomas Gleixner:
"Make the build clean by working around yet another GCC stupidity"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vm86: Fix unused variable warning if THP is disabled
The 64-bit get_user() wasn't clearing the high word due to a typo in the
error handler. The exception handler entry was already correct, though.
Noticed during recent usercopy test additions in lib/test_user_copy.c.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Pull "Reset controller fixes for v4.10" from Philipp Zabel:
- Remove erroneous negation of the error check of the reset function
to decrement trigger_count in the error case, not on success. This
fixes shared resets to actually only trigger once, as intended.
* tag 'reset-for-4.10-fixes' of https://git.pengutronix.de/git/pza/linux:
reset: fix shared reset triggered_count decrement on error
Pull locking fix from Thomas Gleixner:
"Move the futex init function to core initcall so user mode helper does
not run into an uninitialized futex syscall"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Move futex_init() to core_initcall
GCC complains about unused variable 'vma' in mark_screen_rdonly() if THP is
disabled:
arch/x86/kernel/vm86_32.c: In function ‘mark_screen_rdonly’:
arch/x86/kernel/vm86_32.c:180:26: warning: unused variable ‘vma’
[-Wunused-variable]
struct vm_area_struct *vma = find_vma(mm, 0xA0000);
That's silly. pmd_trans_huge() resolves to 0 when THP is disabled, so the
whole block should be eliminated.
Moving the variable declaration outside the if() block shuts GCC up.
Reported-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Cc: Carlos O'Donell <carlos@redhat.com>
Link: http://lkml.kernel.org/r/20170213125228.63645-1-kirill.shutemov@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In commit 76624175dcae ("arm64: uaccess: consistently check object sizes"),
the object size checks are moved outside the access_ok() so that bad
destinations are detected before hitting the "memset(dest, 0, size)" in the
copy_from_user() failure path.
This makes the same change for arm, with attention given to possibly
extracting the uaccess routines into a common header file for all
architectures in the future.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
For a shared reset, when the reset is successful, the triggered_count is
incremented when trying to call the reset callback, so that another device
sharing the same reset line won't trigger it again. If the reset has not
been triggered successfully, the trigger_count should be decremented.
The code does the opposite, and decrements the trigger_count on success.
As a consequence, another device sharing the reset will be able to trigger
it again.
Fixed be removing negation in from of the error code of the reset function.
Fixes: 7da33a37b48f ("reset: allow using reset_control_reset with shared reset")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Pull timer fixes from Thomas Gleixner:
"Two small fixes::
- Prevent deadlock on the tick broadcast lock. Found and fixed by
Mike.
- Stop using printk() in the timekeeping debug code to prevent a
deadlock against the scheduler"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping: Use deferred printk() in debug code
tick/broadcast: Prevent deadlock on tick_broadcast_lock
The UEVENT user mode helper is enabled before the initcalls are executed
and is available when the root filesystem has been mounted.
The user mode helper is triggered by device init calls and the executable
might use the futex syscall.
futex_init() is marked __initcall which maps to device_initcall, but there
is no guarantee that futex_init() is invoked _before_ the first device init
call which triggers the UEVENT user mode helper.
If the user mode helper uses the futex syscall before futex_init() then the
syscall crashes with a NULL pointer dereference because the futex subsystem
has not been initialized yet.
Move futex_init() to core_initcall so futexes are initialized before the
root filesystem is mounted and the usermode helper becomes available.
[ tglx: Rewrote changelog ]
Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Cc: jiang.biao2@zte.com.cn
Cc: jiang.zhengxiong@zte.com.cn
Cc: zhong.weidong@zte.com.cn
Cc: deng.huali@zte.com.cn
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1483085875-6130-1-git-send-email-yang.yang29@zte.com.cn
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Ensure that if userspace supplies insufficient data to
PTRACE_SETREGSET to fill all the registers, the thread's old
registers are preserved.
Cc: <stable@vger.kernel.org> # 3.0.x-
Fixes: 5be6f62b0059 ("ARM: 6883/1: ptrace: Migrate to regsets framework")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull x86 fixes from Ingo Molnar:
"Last minute x86 fixes:
- Fix a softlockup detector warning and long delays if using ptdump
with KASAN enabled.
- Two more TSC-adjust fixes for interesting firmware interactions.
- Two commits to fix an AMD CPU topology enumeration bug that caused
a measurable gaming performance regression"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/ptdump: Fix soft lockup in page table walker
x86/tsc: Make the TSC ADJUST sanitizing work for tsc_reliable
x86/tsc: Avoid the large time jump when sanitizing TSC ADJUST
x86/CPU/AMD: Fix Zen SMT topology
x86/CPU/AMD: Bring back Compute Unit ID
Pull networking fixes from David Miller:
1) Fix leak in dpaa_eth error paths, from Dan Carpenter.
2) Use after free when using IPV6_RECVPKTINFO, from Andrey Konovalov.
3) fanout_release() cannot be invoked from atomic contexts, from Anoob
Soman.
4) Fix bogus attempt at lockdep annotation in IRDA.
5) dev_fill_metadata_dst() can OOP on a NULL dst cache pointer, from
Paolo Abeni.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
irda: Fix lockdep annotations in hashbin_delete().
vxlan: fix oops in dev_fill_metadata_dst
dccp: fix freeing skb too early for IPV6_RECVPKTINFO
dpaa_eth: small leak on error
packet: Do not call fanout_release from atomic contexts
We cannot do printk() from tk_debug_account_sleep_time(), because
tk_debug_account_sleep_time() is called under tk_core seq lock.
The reason why printk() is unsafe there is that console_sem may
invoke scheduler (up()->wake_up_process()->activate_task()), which,
in turn, can return back to timekeeping code, for instance, via
get_time()->ktime_get(), deadlocking the system on tk_core seq lock.
[ 48.950592] ======================================================
[ 48.950622] [ INFO: possible circular locking dependency detected ]
[ 48.950622] 4.10.0-rc7-next-20170213+ #101 Not tainted
[ 48.950622] -------------------------------------------------------
[ 48.950622] kworker/0:0/3 is trying to acquire lock:
[ 48.950653] (tk_core){----..}, at: [<c01cc624>] retrigger_next_event+0x4c/0x90
[ 48.950683]
but task is already holding lock:
[ 48.950683] (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90
[ 48.950714]
which lock already depends on the new lock.
[ 48.950714]
the existing dependency chain (in reverse order) is:
[ 48.950714]
-> #5 (hrtimer_bases.lock){-.-...}:
[ 48.950744] _raw_spin_lock_irqsave+0x50/0x64
[ 48.950775] lock_hrtimer_base+0x28/0x58
[ 48.950775] hrtimer_start_range_ns+0x20/0x5c8
[ 48.950775] __enqueue_rt_entity+0x320/0x360
[ 48.950805] enqueue_rt_entity+0x2c/0x44
[ 48.950805] enqueue_task_rt+0x24/0x94
[ 48.950836] ttwu_do_activate+0x54/0xc0
[ 48.950836] try_to_wake_up+0x248/0x5c8
[ 48.950836] __setup_irq+0x420/0x5f0
[ 48.950836] request_threaded_irq+0xdc/0x184
[ 48.950866] devm_request_threaded_irq+0x58/0xa4
[ 48.950866] omap_i2c_probe+0x530/0x6a0
[ 48.950897] platform_drv_probe+0x50/0xb0
[ 48.950897] driver_probe_device+0x1f8/0x2cc
[ 48.950897] __driver_attach+0xc0/0xc4
[ 48.950927] bus_for_each_dev+0x6c/0xa0
[ 48.950927] bus_add_driver+0x100/0x210
[ 48.950927] driver_register+0x78/0xf4
[ 48.950958] do_one_initcall+0x3c/0x16c
[ 48.950958] kernel_init_freeable+0x20c/0x2d8
[ 48.950958] kernel_init+0x8/0x110
[ 48.950988] ret_from_fork+0x14/0x24
[ 48.950988]
-> #4 (&rt_b->rt_runtime_lock){-.-...}:
[ 48.951019] _raw_spin_lock+0x40/0x50
[ 48.951019] rq_offline_rt+0x9c/0x2bc
[ 48.951019] set_rq_offline.part.2+0x2c/0x58
[ 48.951049] rq_attach_root+0x134/0x144
[ 48.951049] cpu_attach_domain+0x18c/0x6f4
[ 48.951049] build_sched_domains+0xba4/0xd80
[ 48.951080] sched_init_smp+0x68/0x10c
[ 48.951080] kernel_init_freeable+0x160/0x2d8
[ 48.951080] kernel_init+0x8/0x110
[ 48.951080] ret_from_fork+0x14/0x24
[ 48.951110]
-> #3 (&rq->lock){-.-.-.}:
[ 48.951110] _raw_spin_lock+0x40/0x50
[ 48.951141] task_fork_fair+0x30/0x124
[ 48.951141] sched_fork+0x194/0x2e0
[ 48.951141] copy_process.part.5+0x448/0x1a20
[ 48.951171] _do_fork+0x98/0x7e8
[ 48.951171] kernel_thread+0x2c/0x34
[ 48.951171] rest_init+0x1c/0x18c
[ 48.951202] start_kernel+0x35c/0x3d4
[ 48.951202] 0x8000807c
[ 48.951202]
-> #2 (&p->pi_lock){-.-.-.}:
[ 48.951232] _raw_spin_lock_irqsave+0x50/0x64
[ 48.951232] try_to_wake_up+0x30/0x5c8
[ 48.951232] up+0x4c/0x60
[ 48.951263] __up_console_sem+0x2c/0x58
[ 48.951263] console_unlock+0x3b4/0x650
[ 48.951263] vprintk_emit+0x270/0x474
[ 48.951293] vprintk_default+0x20/0x28
[ 48.951293] printk+0x20/0x30
[ 48.951324] kauditd_hold_skb+0x94/0xb8
[ 48.951324] kauditd_thread+0x1a4/0x56c
[ 48.951324] kthread+0x104/0x148
[ 48.951354] ret_from_fork+0x14/0x24
[ 48.951354]
-> #1 ((console_sem).lock){-.....}:
[ 48.951385] _raw_spin_lock_irqsave+0x50/0x64
[ 48.951385] down_trylock+0xc/0x2c
[ 48.951385] __down_trylock_console_sem+0x24/0x80
[ 48.951385] console_trylock+0x10/0x8c
[ 48.951416] vprintk_emit+0x264/0x474
[ 48.951416] vprintk_default+0x20/0x28
[ 48.951416] printk+0x20/0x30
[ 48.951446] tk_debug_account_sleep_time+0x5c/0x70
[ 48.951446] __timekeeping_inject_sleeptime.constprop.3+0x170/0x1a0
[ 48.951446] timekeeping_resume+0x218/0x23c
[ 48.951477] syscore_resume+0x94/0x42c
[ 48.951477] suspend_enter+0x554/0x9b4
[ 48.951477] suspend_devices_and_enter+0xd8/0x4b4
[ 48.951507] enter_state+0x934/0xbd4
[ 48.951507] pm_suspend+0x14/0x70
[ 48.951507] state_store+0x68/0xc8
[ 48.951538] kernfs_fop_write+0xf4/0x1f8
[ 48.951538] __vfs_write+0x1c/0x114
[ 48.951538] vfs_write+0xa0/0x168
[ 48.951568] SyS_write+0x3c/0x90
[ 48.951568] __sys_trace_return+0x0/0x10
[ 48.951568]
-> #0 (tk_core){----..}:
[ 48.951599] lock_acquire+0xe0/0x294
[ 48.951599] ktime_get_update_offsets_now+0x5c/0x1d4
[ 48.951629] retrigger_next_event+0x4c/0x90
[ 48.951629] on_each_cpu+0x40/0x7c
[ 48.951629] clock_was_set_work+0x14/0x20
[ 48.951660] process_one_work+0x2b4/0x808
[ 48.951660] worker_thread+0x3c/0x550
[ 48.951660] kthread+0x104/0x148
[ 48.951690] ret_from_fork+0x14/0x24
[ 48.951690]
other info that might help us debug this:
[ 48.951690] Chain exists of:
tk_core --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock
[ 48.951721] Possible unsafe locking scenario:
[ 48.951721] CPU0 CPU1
[ 48.951721] ---- ----
[ 48.951721] lock(hrtimer_bases.lock);
[ 48.951751] lock(&rt_b->rt_runtime_lock);
[ 48.951751] lock(hrtimer_bases.lock);
[ 48.951751] lock(tk_core);
[ 48.951782]
*** DEADLOCK ***
[ 48.951782] 3 locks held by kworker/0:0/3:
[ 48.951782] #0: ("events"){.+.+.+}, at: [<c0156590>] process_one_work+0x1f8/0x808
[ 48.951812] #1: (hrtimer_work){+.+...}, at: [<c0156590>] process_one_work+0x1f8/0x808
[ 48.951843] #2: (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90
[ 48.951843] stack backtrace:
[ 48.951873] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.10.0-rc7-next-20170213+
[ 48.951904] Workqueue: events clock_was_set_work
[ 48.951904] [<c0110208>] (unwind_backtrace) from [<c010c224>] (show_stack+0x10/0x14)
[ 48.951934] [<c010c224>] (show_stack) from [<c04ca6c0>] (dump_stack+0xac/0xe0)
[ 48.951934] [<c04ca6c0>] (dump_stack) from [<c019b5cc>] (print_circular_bug+0x1d0/0x308)
[ 48.951965] [<c019b5cc>] (print_circular_bug) from [<c019d2a8>] (validate_chain+0xf50/0x1324)
[ 48.951965] [<c019d2a8>] (validate_chain) from [<c019ec18>] (__lock_acquire+0x468/0x7e8)
[ 48.951995] [<c019ec18>] (__lock_acquire) from [<c019f634>] (lock_acquire+0xe0/0x294)
[ 48.951995] [<c019f634>] (lock_acquire) from [<c01d0ea0>] (ktime_get_update_offsets_now+0x5c/0x1d4)
[ 48.952026] [<c01d0ea0>] (ktime_get_update_offsets_now) from [<c01cc624>] (retrigger_next_event+0x4c/0x90)
[ 48.952026] [<c01cc624>] (retrigger_next_event) from [<c01e4e24>] (on_each_cpu+0x40/0x7c)
[ 48.952056] [<c01e4e24>] (on_each_cpu) from [<c01cafc4>] (clock_was_set_work+0x14/0x20)
[ 48.952056] [<c01cafc4>] (clock_was_set_work) from [<c015664c>] (process_one_work+0x2b4/0x808)
[ 48.952087] [<c015664c>] (process_one_work) from [<c0157774>] (worker_thread+0x3c/0x550)
[ 48.952087] [<c0157774>] (worker_thread) from [<c015d644>] (kthread+0x104/0x148)
[ 48.952087] [<c015d644>] (kthread) from [<c0107830>] (ret_from_fork+0x14/0x24)
Replace printk() with printk_deferred(), which does not call into
the scheduler.
Fixes: 0bf43f15db85 ("timekeeping: Prints the amounts of time spent during suspend")
Reported-and-tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: "[4.9+]" <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Since KERN_CONT became meaningful again, lockdep stack traces have had
annoying extra newlines, like this:
[ 5.561122] -> #1 (B){+.+...}:
[ 5.561528]
[ 5.561532] [<ffffffff810d8873>] lock_acquire+0xc3/0x210
[ 5.562178]
[ 5.562181] [<ffffffff816f6414>] mutex_lock_nested+0x74/0x6d0
[ 5.562861]
[ 5.562880] [<ffffffffa01aa3c3>] init_btrfs_fs+0x21/0x196 [btrfs]
[ 5.563717]
[ 5.563721] [<ffffffff81000472>] do_one_initcall+0x52/0x1b0
[ 5.564554]
[ 5.564559] [<ffffffff811a3af6>] do_init_module+0x5f/0x209
[ 5.565357]
[ 5.565361] [<ffffffff81122f4d>] load_module+0x218d/0x2b80
[ 5.566020]
[ 5.566021] [<ffffffff81123beb>] SyS_finit_module+0xeb/0x120
[ 5.566694]
[ 5.566696] [<ffffffff816fd241>] entry_SYSCALL_64_fastpath+0x1f/0xc2
That's happening because each printk() call now gets printed on its own
line, and we do a separate call to print the spaces before the symbol.
Fix it by doing the printk() directly instead of using the
print_ip_sym() helper.
Additionally, the symbol address isn't very helpful, so let's get rid of
that, too. The final result looks like this:
[ 5.194518] -> #1 (B){+.+...}:
[ 5.195002] lock_acquire+0xc3/0x210
[ 5.195439] mutex_lock_nested+0x74/0x6d0
[ 5.196491] do_one_initcall+0x52/0x1b0
[ 5.196939] do_init_module+0x5f/0x209
[ 5.197355] load_module+0x218d/0x2b80
[ 5.197792] SyS_finit_module+0xeb/0x120
[ 5.198251] entry_SYSCALL_64_fastpath+0x1f/0xc2
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-team@fb.com
Fixes: 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines")
Link: http://lkml.kernel.org/r/43b4e114724b2bdb0308fa86cb33aa07d3d67fad.1486510315.git.osandov@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Asynchronous external abort is coded differently in DFSR with LPAE enabled.
Fixes: 9254970c "ARM: 8447/1: catch pending imprecise abort on unmask".
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull timer fix from Ingo Molnar:
"Fix a sporadic missed timer hw reprogramming bug that can result in
random delays"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/nohz: Fix possible missing clock reprog after tick soft restart
CONFIG_KASAN=y needs a lot of virtual memory mapped for its shadow.
In that case ptdump_walk_pgd_level_core() takes a lot of time to
walk across all page tables and doing this without
a rescheduling causes soft lockups:
NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [swapper/0:1]
...
Call Trace:
ptdump_walk_pgd_level_core+0x40c/0x550
ptdump_walk_pgd_level_checkwx+0x17/0x20
mark_rodata_ro+0x13b/0x150
kernel_init+0x2f/0x120
ret_from_fork+0x2c/0x40
I guess that this issue might arise even without KASAN on huge machines
with several terabytes of RAM.
Stick cond_resched() in pgd loop to fix this.
Reported-by: Tobias Regnery <tobias.regnery@gmail.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: kasan-dev@googlegroups.com
Cc: Alexander Potapenko <glider@google.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170210095405.31802-1-aryabinin@virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
I am getting the following warning when I build kernel 4.9-git on my
PowerBook G4 with a 32-bit PPC processor:
AS arch/powerpc/kernel/misc_32.o
arch/powerpc/kernel/misc_32.S:299:7: warning: "CONFIG_FSL_BOOKE" is not defined [-Wundef]
This problem is evident after commit 989cea5c14be ("kbuild: prevent
lib-ksyms.o rebuilds"); however, this change in kbuild only exposes an
error that has been in the code since 2005 when this source file was
created. That was with commit 9994a33865f4 ("powerpc: Introduce
entry_{32,64}.S, misc_{32,64}.S, systbl.S").
The offending line does not make a lot of sense. This error does not
seem to cause any errors in the executable, thus I am not recommending
that it be applied to any stable versions.
Thanks to Nicholas Piggin for suggesting this solution.
Fixes: 9994a33865f4 ("powerpc: Introduce entry_{32,64}.S, misc_{32,64}.S, systbl.S")
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use rcuidle console tracepoint because, apparently, it may be issued
from an idle CPU:
hw-breakpoint: Failed to enable monitor mode on CPU 0.
hw-breakpoint: CPU 0 failed to disable vector catch
===============================
[ ERR: suspicious RCU usage. ]
4.10.0-rc8-next-20170215+ #119 Not tainted
-------------------------------
./include/trace/events/printk.h:32 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
RCU used illegally from idle CPU!
rcu_scheduler_active = 2, debug_locks = 0
RCU used illegally from extended quiescent state!
2 locks held by swapper/0/0:
#0: (cpu_pm_notifier_lock){......}, at: [<c0237e2c>] cpu_pm_exit+0x10/0x54
#1: (console_lock){+.+.+.}, at: [<c01ab350>] vprintk_emit+0x264/0x474
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc8-next-20170215+ #119
Hardware name: Generic OMAP4 (Flattened Device Tree)
console_unlock
vprintk_emit
vprintk_default
printk
reset_ctrl_regs
dbg_cpu_pm_notify
notifier_call_chain
cpu_pm_exit
omap_enter_idle_coupled
cpuidle_enter_state
cpuidle_enter_state_coupled
do_idle
cpu_startup_entry
start_kernel
This RCU warning, however, is suppressed by lockdep_off() in printk().
lockdep_off() increments the ->lockdep_recursion counter and thus
disables RCU_LOCKDEP_WARN() and debug_lockdep_rcu_enabled(), which want
lockdep to be enabled "current->lockdep_recursion == 0".
Link: http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Tony Lindgren <tony@atomide.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Russell King <rmk@armlinux.org.uk>
Cc: <stable@vger.kernel.org> [3.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A nested lock depth was added to the hasbin_delete() code but it
doesn't actually work some well and results in tons of lockdep splats.
Fix the code instead to properly drop the lock around the operation
and just keep peeking the head of the hashbin queue.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tick_broadcast_lock is taken from interrupt context, but the following call
chain takes the lock without disabling interrupts:
[ 12.703736] _raw_spin_lock+0x3b/0x50
[ 12.703738] tick_broadcast_control+0x5a/0x1a0
[ 12.703742] intel_idle_cpu_online+0x22/0x100
[ 12.703744] cpuhp_invoke_callback+0x245/0x9d0
[ 12.703752] cpuhp_thread_fun+0x52/0x110
[ 12.703754] smpboot_thread_fn+0x276/0x320
So the following deadlock can happen:
lock(tick_broadcast_lock);
<Interrupt>
lock(tick_broadcast_lock);
intel_idle_cpu_online() is the only place which violates the calling
convention of tick_broadcast_control(). This was caused by the removal of
the smp function call in course of the cpu hotplug rework.
Instead of slapping local_irq_disable/enable() at the call site, we can
relax the calling convention and handle it in the core code, which makes
the whole machinery more robust.
Fixes: 29d7bbada98e ("intel_idle: Remove superfluous SMP fuction call")
Reported-by: Gabriel C <nix.or.die@gmail.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Ruslan Ruslichenko <rruslich@cisco.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: lwn@lwn.net
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: stable <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1486953115.5912.4.camel@gmx.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull networking fixes from David Miller:
1) Load correct firmware in rtl8192ce wireless driver, from Jurij
Smakov.
2) Fix leak of tx_ring and tx_cq due to overwriting in mlx4 driver,
from Martin KaFai Lau.
3) Need to reference count PHY driver module when it is attached, from
Mao Wenan.
4) Don't do zero length vzalloc() in ethtool register dump, from
Stanislaw Gruszka.
5) Defer net_disable_timestamp() to a workqueue to get out of locking
issues, from Eric Dumazet.
6) We cannot drop the SKB dst when IP options refer to them, fix also
from Eric Dumazet.
7) Incorrect packet header offset calculations in ip6_gre, again from
Eric Dumazet.
8) Missing tcp_v6_restore_cb() causes use-after-free, from Eric too.
9) tcp_splice_read() can get into an infinite loop with URG, and hey
it's from Eric once more.
10) vnet_hdr_sz can change asynchronously, so read it once during
decision making in macvtap and tun, from Willem de Bruijn.
11) Can't use kernel stack for DMA transfers in USB networking drivers,
from Ben Hutchings.
12) Handle csum errors properly in UDP by calling the proper destructor,
from Eric Dumazet.
13) For non-deterministic softirq run when scheduling NAPI from a
workqueue in mlx4, from Benjamin Poirier.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (28 commits)
sctp: check af before verify address in sctp_addr_id2transport
sctp: avoid BUG_ON on sctp_wait_for_sndbuf
mlx4: Invoke softirqs after napi_reschedule
udp: properly cope with csum errors
catc: Use heap buffer for memory size test
catc: Combine failure cleanup code in catc_probe()
rtl8150: Use heap buffers for all register access
pegasus: Use heap buffers for all register access
macvtap: read vnet_hdr_size once
tun: read vnet_hdr_sz once
tcp: avoid infinite loop in tcp_splice_read()
hns: avoid stack overflow with CONFIG_KASAN
ipv6: Fix IPv6 packet loss in scenarios involving roaming + snooping switches
ipv6: tcp: add a missing tcp_v6_restore_cb()
nl80211: Fix mesh HT operation check
mac80211: Fix adding of mesh vendor IEs
mac80211: Allocate a sync skcipher explicitly for FILS AEAD
mac80211: Fix FILS AEAD protection in Association Request frame
ip6_gre: fix ip6gre_err() invalid reads
netlabel: out of bound access in cipso_v4_validate()
...
The following patch was sketched by Russell in response to my
crashes on the PB11MPCore after the patch for software-based
priviledged no access support for ARMv8.1. See this thread:
http://marc.info/?l=linux-arm-kernel&m=144051749807214&w=2
I am unsure what is going on, I suspect everyone involved in
the discussion is. I just want to repost this to get the
discussion restarted, as I still have to apply this patch
with every kernel iteration to get my PB11MPCore Realview
running.
Testing by Neil Armstrong on the Oxnas NAS has revealed that
this bug exist also on that widely deployed hardware, so
we are probably currently regressing all ARM11MPCore systems.
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: a5e090acbf54 ("ARM: software-based priviledged-no-access support")
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull perf fixes from Ingo Molnar:
"A kernel crash fix plus three tooling fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix crash in perf_event_read()
perf callchain: Reference count maps
perf diff: Fix -o/--order option behavior (again)
perf diff: Fix segfault on 'perf diff -o N' option
ts->next_tick keeps track of the next tick deadline in order to optimize
clock programmation on irq exit and avoid redundant clock device writes.
Now if ts->next_tick missed an update, we may spuriously miss a clock
reprog later as the nohz code is fooled by an obsolete next_tick value.
This is what happens here on a specific path: when we observe an
expired timer from the nohz update code on irq exit, we perform a soft
tick restart which simply fires the closest possible tick without
actually exiting the nohz mode and restoring a periodic state. But we
forget to update ts->next_tick accordingly.
As a result, after the next tick resulting from such soft tick restart,
the nohz code sees a stale value on ts->next_tick which doesn't match
the clock deadline that just expired. If that obsolete ts->next_tick
value happens to collide with the actual next tick deadline to be
scheduled, we may spuriously bypass the clock reprogramming. In the
worst case, the tick may never fire again.
Fix this with a ts->next_tick reset on soft tick restart.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1486485894-29173-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When the TSC is marked reliable then the synchronization check is skipped,
but that also skips the TSC ADJUST sanitizing code. So on a machine with a
wreckaged BIOS the TSC deviation between CPUs might go unnoticed.
Let the TSC adjust sanitizing code run unconditionally and just skip the
expensive synchronization checks when TSC is marked reliable.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Olof Johansson <olof@lixom.net>
Link: http://lkml.kernel.org/r/20170209151231.491189912@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The timer type simplifications caused a new gcc warning:
drivers/base/power/domain.c: In function ‘genpd_runtime_suspend’:
drivers/base/power/domain.c:562:14: warning: ‘time_start’ may be used uninitialized in this function [-Wmaybe-uninitialized]
elapsed_ns = ktime_to_ns(ktime_sub(ktime_get(), time_start));
despite the actual use of "time_start" not having changed in any way.
It appears that simply changing the type of ktime_t from a union to a
plain scalar type made gcc check the use.
The variable wasn't actually used uninitialized, but gcc apparently
failed to notice that the conditional around the use was exactly the
same as the conditional around the initialization of that variable.
Add an unnecessary initialization just to shut up the compiler.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull block layer fix from Jens Axboe:
"A single fix for a lockdep splat reported by Thomas and Gabriel"
* 'for-linus' of git://git.kernel.dk/linux-block:
cfq-iosched: don't call wbt_disable_default() with IRQs disabled
Since the commit 0c1d70af924b ("net: use dst_cache for vxlan device")
vxlan_fill_metadata_dst() calls vxlan_get_route() passing a NULL
dst_cache pointer, so the latter should explicitly check for
valid dst_cache ptr. Unfortunately the commit d71785ffc7e7 ("net: add
dst_cache to ovs vxlan lwtunnel") removed said check.
As a result is possible to trigger a null pointer access calling
vxlan_fill_metadata_dst(), e.g. with:
ovs-vsctl add-br ovs-br0
ovs-vsctl add-port ovs-br0 vxlan0 -- set interface vxlan0 \
type=vxlan options:remote_ip=192.168.1.1 \
options:key=1234 options:dst_port=4789 ofport_request=10
ip address add dev ovs-br0 172.16.1.2/24
ovs-vsctl set Bridge ovs-br0 ipfix=@i -- --id=@i create IPFIX \
targets=\"172.16.1.1:1234\" sampling=1
iperf -c 172.16.1.1 -u -l 1000 -b 10M -t 1 -p 1234
This commit addresses the issue passing to vxlan_get_route() the
dst_cache already available into the lwt info processed by
vxlan_fill_metadata_dst().
Fixes: d71785ffc7e7 ("net: add dst_cache to ovs vxlan lwtunnel")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 6326fec1122c ("mm: Use owner_priv bit for PageSwapCache, valid
when PageSwapBacked") aliased PG_swapcache to PG_owner_priv_1 (and
depending on PageSwapBacked being true).
As a result, the KPF_SWAPCACHE bit in '/proc/kpageflags' should now be
synthesized, instead of being shown on unrelated pages which just happen
to have PG_owner_priv_1 set.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 6f29a1306131 ("sctp: sctp_addr_id2transport should verify the
addr before looking up assoc") invoked sctp_verify_addr to verify the
addr.
But it didn't check af variable beforehand, once users pass an address
with family = 0 through sockopt, sctp_get_af_specific will return NULL
and NULL pointer dereference will be caused by af->sockaddr_len.
This patch is to fix it by returning NULL if af variable is NULL.
Fixes: 6f29a1306131 ("sctp: sctp_addr_id2transport should verify the addr before looking up assoc")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update my entries in the MAINTAINERS file with the same email address
for kernel work, and, now that the git tree is hosted on more suitable
hardware, add git tree references where appropriate.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Pull lockdep fix from Ingo Molnar:
"This fixes an ugly lockdep stack trace output regression. (But also
affects other stacktrace users such as kmemleak, KASAN, etc)"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
stacktrace, lockdep: Fix address, newline ugliness
Alexei had his box explode because doing read() on a package
(rapl/uncore) event that isn't currently scheduled in ends up doing an
out-of-bounds load.
Rework the code to more explicitly deal with event->oncpu being -1.
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Fixes: d6a2f9035bfc ("perf/core: Introduce PMU_EV_CAP_READ_ACTIVE_PKG")
Link: http://lkml.kernel.org/r/20170131102710.GL6515@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull drm fixes from Dave Airlie:
"This should be the final set of drm fixes for 4.10: one vmwgfx boot
fix, one vc4 fix, and a few i915 fixes:
* tag 'drm-fixes-for-v4.10-rc8' of git://people.freedesktop.org/~airlied/linux:
drm: vc4: adapt to new behaviour of drm_crtc.c
drm/i915: Always convert incoming exec offsets to non-canonical
drm/i915: Remove overzealous fence warn on runtime suspend
drm/i915/bxt: Add MST support when do DPLL calculation
drm/i915: don't warn about Skylake CPU - KabyPoint PCH combo
drm/i915: fix i915 running as dom0 under Xen
drm/i915: Flush untouched framebuffers before display on !llc
drm/i915: fix use-after-free in page_flip_completed()
drm/vmwgfx: Fix depth input into drm_mode_legacy_fb_format
Olof reported that on a machine which has a BIOS wreckaged TSC the
timestamps in dmesg are making a large jump because the TSC value is
jumping forward after resetting the TSC ADJUST register to a sane value.
This can be avoided by calling the TSC ADJUST saniziting function before
initializing the per cpu sched clock machinery. That takes the offset into
account and avoid the time jump.
What cannot be avoided is that the 'Firmware Bug' warnings on the secondary
CPUs are printed with the large time offsets because it would be too much
effort and ugly hackery to print those warnings into a buffer and emit them
after the adjustemt on the starting CPUs. It's a firmware bug and should be
fixed in firmware. The weird timestamps are collateral damage and just
illustrate the sillyness of the BIOS folks:
[ 0.397445] smp: Bringing up secondary CPUs ...
[ 0.402100] x86: Booting SMP configuration:
[ 0.406343] .... node #0, CPUs: #1
[1265776479.930667] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU1: -2978888639183101
[1265776479.944664] TSC ADJUST synchronize: Reference CPU0: 0 CPU1: -2978888639183101
[ 0.508119] #2
[1265776480.032346] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU2: -2978888639183677
[1265776480.044192] TSC ADJUST synchronize: Reference CPU0: 0 CPU2: -2978888639183677
[ 0.607643] #3
[1265776480.131874] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU3: -2978888639184530
[1265776480.143720] TSC ADJUST synchronize: Reference CPU0: 0 CPU3: -2978888639184530
[ 0.707108] smp: Brought up 1 node, 4 CPUs
[ 0.711271] smpboot: Total of 4 processors activated (21698.88 BogoMIPS)
Reported-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170209151231.411460506@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull timer type cleanups from Thomas Gleixner:
"This series does a tree wide cleanup of types related to
timers/timekeeping.
- Get rid of cycles_t and use a plain u64. The type is not really
helpful and caused more confusion than clarity
- Get rid of the ktime union. The union has become useless as we use
the scalar nanoseconds storage unconditionally now. The 32bit
timespec alike storage got removed due to the Y2038 limitations
some time ago.
That leaves the odd union access around for no reason. Clean it up.
Both changes have been done with coccinelle and a small amount of
manual mopping up"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ktime: Get rid of ktime_equal()
ktime: Cleanup ktime_set() usage
ktime: Get rid of the union
clocksource: Use a plain u64 instead of cycle_t
Pull powerpc fix from Michael Ellerman:
"One fix from Paul: we can not use the radix MMU under a hypervisor for
now.
Although the code checked if the processor supports radix, that is not
sufficient"
* tag 'powerpc-4.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64: Disable use of radix under a hypervisor
wbt_disable_default() calls del_timer_sync() to wait for the wbt
timer to finish before disabling throttling. We can't do this with
IRQs disable. This fixes a lockdep splat on boot, if non-root
cgroups are used.
Reported-by: Gabriel C <nix.or.die@gmail.com>
Fixes: 87760e5eef35 ("block: hook up writeback throttling")
Signed-off-by: Jens Axboe <axboe@fb.com>
In the current DCCP implementation an skb for a DCCP_PKT_REQUEST packet
is forcibly freed via __kfree_skb in dccp_rcv_state_process if
dccp_v6_conn_request successfully returns.
However, if IPV6_RECVPKTINFO is set on a socket, the address of the skb
is saved to ireq->pktopts and the ref count for skb is incremented in
dccp_v6_conn_request, so skb is still in use. Nevertheless, it gets freed
in dccp_rcv_state_process.
Fix by calling consume_skb instead of doing goto discard and therefore
calling __kfree_skb.
Similar fixes for TCP:
fb7e2399ec17f1004c0e0ccfd17439f8759ede01 [TCP]: skb is unexpectedly freed.
0aea76d35c9651d55bbaf746e7914e5f9ae5a25d tcp: SYN packets are now
simply consumed
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Jo-Philipp Wich <jo@mein.io>
Fixes: 9aed02feae57bf7 ("ARC: [arcompact] handle unaligned access delay slot")
Cc: linux-kernel@vger.kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Popov reported that an application may trigger a BUG_ON in
sctp_wait_for_sndbuf if the socket tx buffer is full, a thread is
waiting on it to queue more data and meanwhile another thread peels off
the association being used by the first thread.
This patch replaces the BUG_ON call with a proper error handling. It
will return -EPIPE to the original sendmsg call, similarly to what would
have been done if the association wasn't found in the first place.
Acked-by: Alexander Popov <alex.popov@linux.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to the way kbuild works, this header was unintentionally exported
back in 2013 when it was created, despite it not being in a uapi/
directory. This is very non-intuitive behaviour by Kbuild.
However, we've had this include exported to userland for almost four
years, and searching google for "ARM types.h __UINTPTR_TYPE__" gives
no hint that anyone has complained about it. So, let's make it
officially exported in this state.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Pull irq fixes from Ingo Molnar:
"Two last minute ARM irqchip driver fixes"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/mxs: Enable SKIP_SET_WAKE and MASK_ON_SUSPEND
irqchip/keystone: Fix "scheduling while atomic" on rt
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Reference count maps in callchains, fixing a SEGFAULT when referencing a
map after it is freed (Krister Johansen)
- Fix segfault on 'perf diff -o N' option (Namhyung Kim)
- Fix 'perf diff -o/--order' option behavior (Namhyung Kim)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull SCSI target fixes from Nicholas Bellinger:
"This target series for v4.10 contains fixes which address a few
long-standing bugs that DATERA's QA + automation teams have uncovered
while putting v4.1.y target code into production usage.
We've been running the top three in our nightly automated regression
runs for the last two months, and the COMPARE_AND_WRITE fix Mr. Gary
Guo has been manually verifying against a four node ESX cluster this
past week.
Note all of them have CC' stable tags.
Summary:
- Fix a bug with ESX EXTENDED_COPY + SAM_STAT_RESERVATION_CONFLICT
status, where target_core_xcopy.c logic was incorrectly returning
SAM_STAT_CHECK_CONDITION for all non SAM_STAT_GOOD cases (Nixon
Vincent)
- Fix a TMR LUN_RESET hung task bug while other in-flight TMRs are
being aborted, before the new one had been dispatched into tmr_wq
(Rob Millner)
- Fix a long standing double free OOPs, where a dynamically generated
'demo-mode' NodeACL has multiple sessions associated with it, and
the /sys/kernel/config/target/$FABRIC/$WWN/ subsequently disables
demo-mode, but never converts the dynamic ACL into a explicit ACL
(Rob Millner)
- Fix a long standing reference leak with ESX VAAI COMPARE_AND_WRITE
when the second phase WRITE COMMIT command fails, resulting in
CHECK_CONDITION response never being sent and se_cmd->cmd_kref
never reaching zero (Gary Guo)
Beyond these items on v4.1.y we've reproduced, fixed, and run through
our regression test suite using iscsi-target exports, there are two
additional outstanding list items:
- Remove a >= v4.2 RCU conversion BUG_ON that would trigger when
dynamic node NodeACLs where being converted to explicit NodeACLs.
The patch drops the BUG_ON to follow how pre RCU conversion worked
for this special case (Benjamin Estrabaud)
- Add ibmvscsis target_core_fabric_ops->max_data_sg_nent assignment
to match what IBM's Virtual SCSI hypervisor is already enforcing at
transport layer. (Bryant Ly + Steven Royer)"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
ibmvscsis: Add SGL limit
target: Fix COMPARE_AND_WRITE ref leak for non GOOD status
target: Fix multi-session dynamic se_node_acl double free OOPs
target: Fix early transport_generic_handle_tmr abort scenario
target: Use correct SCSI status during EXTENDED_COPY exception
target: Don't BUG_ON during NodeACL dynamic -> explicit conversion
Hopefully final fixes for v4.10, about half of them stable material.
* tag 'drm-intel-fixes-2017-02-09' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: Always convert incoming exec offsets to non-canonical
drm/i915: Remove overzealous fence warn on runtime suspend
drm/i915/bxt: Add MST support when do DPLL calculation
drm/i915: don't warn about Skylake CPU - KabyPoint PCH combo
drm/i915: fix i915 running as dom0 under Xen
drm/i915: Flush untouched framebuffers before display on !llc
drm/i915: fix use-after-free in page_flip_completed()
After:
a33d331761bc ("x86/CPU/AMD: Fix Bulldozer topology")
our SMT scheduling topology for Fam17h systems is broken, because
the ThreadId is included in the ApicId when SMT is enabled.
So, without further decoding cpu_core_id is unique for each thread
rather than the same for threads on the same core. This didn't affect
systems with SMT disabled. Make cpu_core_id be what it is defined to be.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.9
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170205105022.8705-2-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull SMP hotplug notifier removal from Thomas Gleixner:
"This is the final cleanup of the hotplug notifier infrastructure. The
series has been reintgrated in the last two days because there came a
new driver using the old infrastructure via the SCSI tree.
Summary:
- convert the last leftover drivers utilizing notifiers
- fixup for a completely broken hotplug user
- prevent setup of already used states
- removal of the notifiers
- treewide cleanup of hotplug state names
- consolidation of state space
There is a sphinx based documentation pending, but that needs review
from the documentation folks"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/armada-xp: Consolidate hotplug state space
irqchip/gic: Consolidate hotplug state space
coresight/etm3/4x: Consolidate hotplug state space
cpu/hotplug: Cleanup state names
cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
staging/lustre/libcfs: Convert to hotplug state machine
scsi/bnx2i: Convert to hotplug state machine
scsi/bnx2fc: Convert to hotplug state machine
cpu/hotplug: Prevent overwriting of callbacks
x86/msr: Remove bogus cleanup from the error path
bus: arm-ccn: Prevent hotplug callback leak
perf/x86/intel/cstate: Prevent hotplug callback leak
ARM/imx/mmcd: Fix broken cpu hotplug handling
scsi: qedi: Convert to hotplug state machine
No point in going through loops and hoops instead of just comparing the
values.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Pull input fix from Dmitry Torokhov:
"Just a single change to Elan touchpad driver to recognize a new ACPI
ID"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elan_i2c - add ELAN0605 to the ACPI table
What happens is that a write to /dev/sg is given a request with non-zero
->iovec_count combined with zero ->dxfer_len. Or with ->dxferp pointing
to an array full of empty iovecs.
Having write permission to /dev/sg shouldn't be equivalent to the
ability to trigger BUG_ON() while holding spinlocks...
Found by Dmitry Vyukov and syzkaller.
[ The BUG_ON() got changed to a WARN_ON_ONCE(), but this fixes the
underlying issue. - Linus ]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Don't crash the machine just because of an empty transfer. Use WARN_ON()
combined with returning an error.
Found by Dmitry Vyukov and syzkaller.
[ Changed to "WARN_ON_ONCE()". Al has a patch that should fix the root
cause, but a BUG_ON() is not acceptable in any case, and a WARN_ON()
might still be a cause of excessive log spamming.
NOTE! If this warning ever triggers, we may end up leaking resources,
since this doesn't bother to try to clean the command up. So this
WARN_ON_ONCE() triggering does imply real problems. But BUG_ON() is
much worse.
People really need to stop using BUG_ON() for "this shouldn't ever
happen". It makes pretty much any bug worse. - Linus ]
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: James Bottomley <jejb@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If ip6_dst_lookup_tail has acquired a dst and fails the IPv4-mapped
check, release the dst before returning an error.
Fixes: ec5e3b0a1d41 ("ipv6: Inhibit IPv4-mapped src address on the wire.")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull ARM SoC fixes from Arnd Bergmann:
"Two more bugfixes that came in during this week:
- a defconfig change to enable a vital driver used on some Qualcomm
based phones. This was already queued for 4.11, but the maintainer
asked to have it in 4.10 after all.
- a regression fix for the reset controller framework, this got
broken by a typo in the 4.10 merge window"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: multi_v7_defconfig: enable Qualcomm RPMCC
reset: fix shared reset triggered_count decrement on error
The 64-bit get_user() wasn't clearing the high word due to a typo in the
error handler. The exception handler entry was already correct, though.
Noticed during recent usercopy test additions in lib/test_user_copy.c.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Pull "Reset controller fixes for v4.10" from Philipp Zabel:
- Remove erroneous negation of the error check of the reset function
to decrement trigger_count in the error case, not on success. This
fixes shared resets to actually only trigger once, as intended.
* tag 'reset-for-4.10-fixes' of https://git.pengutronix.de/git/pza/linux:
reset: fix shared reset triggered_count decrement on error
GCC complains about unused variable 'vma' in mark_screen_rdonly() if THP is
disabled:
arch/x86/kernel/vm86_32.c: In function ‘mark_screen_rdonly’:
arch/x86/kernel/vm86_32.c:180:26: warning: unused variable ‘vma’
[-Wunused-variable]
struct vm_area_struct *vma = find_vma(mm, 0xA0000);
That's silly. pmd_trans_huge() resolves to 0 when THP is disabled, so the
whole block should be eliminated.
Moving the variable declaration outside the if() block shuts GCC up.
Reported-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Cc: Carlos O'Donell <carlos@redhat.com>
Link: http://lkml.kernel.org/r/20170213125228.63645-1-kirill.shutemov@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In commit 76624175dcae ("arm64: uaccess: consistently check object sizes"),
the object size checks are moved outside the access_ok() so that bad
destinations are detected before hitting the "memset(dest, 0, size)" in the
copy_from_user() failure path.
This makes the same change for arm, with attention given to possibly
extracting the uaccess routines into a common header file for all
architectures in the future.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
For a shared reset, when the reset is successful, the triggered_count is
incremented when trying to call the reset callback, so that another device
sharing the same reset line won't trigger it again. If the reset has not
been triggered successfully, the trigger_count should be decremented.
The code does the opposite, and decrements the trigger_count on success.
As a consequence, another device sharing the reset will be able to trigger
it again.
Fixed be removing negation in from of the error code of the reset function.
Fixes: 7da33a37b48f ("reset: allow using reset_control_reset with shared reset")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Pull timer fixes from Thomas Gleixner:
"Two small fixes::
- Prevent deadlock on the tick broadcast lock. Found and fixed by
Mike.
- Stop using printk() in the timekeeping debug code to prevent a
deadlock against the scheduler"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping: Use deferred printk() in debug code
tick/broadcast: Prevent deadlock on tick_broadcast_lock
The UEVENT user mode helper is enabled before the initcalls are executed
and is available when the root filesystem has been mounted.
The user mode helper is triggered by device init calls and the executable
might use the futex syscall.
futex_init() is marked __initcall which maps to device_initcall, but there
is no guarantee that futex_init() is invoked _before_ the first device init
call which triggers the UEVENT user mode helper.
If the user mode helper uses the futex syscall before futex_init() then the
syscall crashes with a NULL pointer dereference because the futex subsystem
has not been initialized yet.
Move futex_init() to core_initcall so futexes are initialized before the
root filesystem is mounted and the usermode helper becomes available.
[ tglx: Rewrote changelog ]
Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Cc: jiang.biao2@zte.com.cn
Cc: jiang.zhengxiong@zte.com.cn
Cc: zhong.weidong@zte.com.cn
Cc: deng.huali@zte.com.cn
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1483085875-6130-1-git-send-email-yang.yang29@zte.com.cn
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Ensure that if userspace supplies insufficient data to
PTRACE_SETREGSET to fill all the registers, the thread's old
registers are preserved.
Cc: <stable@vger.kernel.org> # 3.0.x-
Fixes: 5be6f62b0059 ("ARM: 6883/1: ptrace: Migrate to regsets framework")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull x86 fixes from Ingo Molnar:
"Last minute x86 fixes:
- Fix a softlockup detector warning and long delays if using ptdump
with KASAN enabled.
- Two more TSC-adjust fixes for interesting firmware interactions.
- Two commits to fix an AMD CPU topology enumeration bug that caused
a measurable gaming performance regression"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/ptdump: Fix soft lockup in page table walker
x86/tsc: Make the TSC ADJUST sanitizing work for tsc_reliable
x86/tsc: Avoid the large time jump when sanitizing TSC ADJUST
x86/CPU/AMD: Fix Zen SMT topology
x86/CPU/AMD: Bring back Compute Unit ID
Pull networking fixes from David Miller:
1) Fix leak in dpaa_eth error paths, from Dan Carpenter.
2) Use after free when using IPV6_RECVPKTINFO, from Andrey Konovalov.
3) fanout_release() cannot be invoked from atomic contexts, from Anoob
Soman.
4) Fix bogus attempt at lockdep annotation in IRDA.
5) dev_fill_metadata_dst() can OOP on a NULL dst cache pointer, from
Paolo Abeni.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
irda: Fix lockdep annotations in hashbin_delete().
vxlan: fix oops in dev_fill_metadata_dst
dccp: fix freeing skb too early for IPV6_RECVPKTINFO
dpaa_eth: small leak on error
packet: Do not call fanout_release from atomic contexts
We cannot do printk() from tk_debug_account_sleep_time(), because
tk_debug_account_sleep_time() is called under tk_core seq lock.
The reason why printk() is unsafe there is that console_sem may
invoke scheduler (up()->wake_up_process()->activate_task()), which,
in turn, can return back to timekeeping code, for instance, via
get_time()->ktime_get(), deadlocking the system on tk_core seq lock.
[ 48.950592] ======================================================
[ 48.950622] [ INFO: possible circular locking dependency detected ]
[ 48.950622] 4.10.0-rc7-next-20170213+ #101 Not tainted
[ 48.950622] -------------------------------------------------------
[ 48.950622] kworker/0:0/3 is trying to acquire lock:
[ 48.950653] (tk_core){----..}, at: [<c01cc624>] retrigger_next_event+0x4c/0x90
[ 48.950683]
but task is already holding lock:
[ 48.950683] (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90
[ 48.950714]
which lock already depends on the new lock.
[ 48.950714]
the existing dependency chain (in reverse order) is:
[ 48.950714]
-> #5 (hrtimer_bases.lock){-.-...}:
[ 48.950744] _raw_spin_lock_irqsave+0x50/0x64
[ 48.950775] lock_hrtimer_base+0x28/0x58
[ 48.950775] hrtimer_start_range_ns+0x20/0x5c8
[ 48.950775] __enqueue_rt_entity+0x320/0x360
[ 48.950805] enqueue_rt_entity+0x2c/0x44
[ 48.950805] enqueue_task_rt+0x24/0x94
[ 48.950836] ttwu_do_activate+0x54/0xc0
[ 48.950836] try_to_wake_up+0x248/0x5c8
[ 48.950836] __setup_irq+0x420/0x5f0
[ 48.950836] request_threaded_irq+0xdc/0x184
[ 48.950866] devm_request_threaded_irq+0x58/0xa4
[ 48.950866] omap_i2c_probe+0x530/0x6a0
[ 48.950897] platform_drv_probe+0x50/0xb0
[ 48.950897] driver_probe_device+0x1f8/0x2cc
[ 48.950897] __driver_attach+0xc0/0xc4
[ 48.950927] bus_for_each_dev+0x6c/0xa0
[ 48.950927] bus_add_driver+0x100/0x210
[ 48.950927] driver_register+0x78/0xf4
[ 48.950958] do_one_initcall+0x3c/0x16c
[ 48.950958] kernel_init_freeable+0x20c/0x2d8
[ 48.950958] kernel_init+0x8/0x110
[ 48.950988] ret_from_fork+0x14/0x24
[ 48.950988]
-> #4 (&rt_b->rt_runtime_lock){-.-...}:
[ 48.951019] _raw_spin_lock+0x40/0x50
[ 48.951019] rq_offline_rt+0x9c/0x2bc
[ 48.951019] set_rq_offline.part.2+0x2c/0x58
[ 48.951049] rq_attach_root+0x134/0x144
[ 48.951049] cpu_attach_domain+0x18c/0x6f4
[ 48.951049] build_sched_domains+0xba4/0xd80
[ 48.951080] sched_init_smp+0x68/0x10c
[ 48.951080] kernel_init_freeable+0x160/0x2d8
[ 48.951080] kernel_init+0x8/0x110
[ 48.951080] ret_from_fork+0x14/0x24
[ 48.951110]
-> #3 (&rq->lock){-.-.-.}:
[ 48.951110] _raw_spin_lock+0x40/0x50
[ 48.951141] task_fork_fair+0x30/0x124
[ 48.951141] sched_fork+0x194/0x2e0
[ 48.951141] copy_process.part.5+0x448/0x1a20
[ 48.951171] _do_fork+0x98/0x7e8
[ 48.951171] kernel_thread+0x2c/0x34
[ 48.951171] rest_init+0x1c/0x18c
[ 48.951202] start_kernel+0x35c/0x3d4
[ 48.951202] 0x8000807c
[ 48.951202]
-> #2 (&p->pi_lock){-.-.-.}:
[ 48.951232] _raw_spin_lock_irqsave+0x50/0x64
[ 48.951232] try_to_wake_up+0x30/0x5c8
[ 48.951232] up+0x4c/0x60
[ 48.951263] __up_console_sem+0x2c/0x58
[ 48.951263] console_unlock+0x3b4/0x650
[ 48.951263] vprintk_emit+0x270/0x474
[ 48.951293] vprintk_default+0x20/0x28
[ 48.951293] printk+0x20/0x30
[ 48.951324] kauditd_hold_skb+0x94/0xb8
[ 48.951324] kauditd_thread+0x1a4/0x56c
[ 48.951324] kthread+0x104/0x148
[ 48.951354] ret_from_fork+0x14/0x24
[ 48.951354]
-> #1 ((console_sem).lock){-.....}:
[ 48.951385] _raw_spin_lock_irqsave+0x50/0x64
[ 48.951385] down_trylock+0xc/0x2c
[ 48.951385] __down_trylock_console_sem+0x24/0x80
[ 48.951385] console_trylock+0x10/0x8c
[ 48.951416] vprintk_emit+0x264/0x474
[ 48.951416] vprintk_default+0x20/0x28
[ 48.951416] printk+0x20/0x30
[ 48.951446] tk_debug_account_sleep_time+0x5c/0x70
[ 48.951446] __timekeeping_inject_sleeptime.constprop.3+0x170/0x1a0
[ 48.951446] timekeeping_resume+0x218/0x23c
[ 48.951477] syscore_resume+0x94/0x42c
[ 48.951477] suspend_enter+0x554/0x9b4
[ 48.951477] suspend_devices_and_enter+0xd8/0x4b4
[ 48.951507] enter_state+0x934/0xbd4
[ 48.951507] pm_suspend+0x14/0x70
[ 48.951507] state_store+0x68/0xc8
[ 48.951538] kernfs_fop_write+0xf4/0x1f8
[ 48.951538] __vfs_write+0x1c/0x114
[ 48.951538] vfs_write+0xa0/0x168
[ 48.951568] SyS_write+0x3c/0x90
[ 48.951568] __sys_trace_return+0x0/0x10
[ 48.951568]
-> #0 (tk_core){----..}:
[ 48.951599] lock_acquire+0xe0/0x294
[ 48.951599] ktime_get_update_offsets_now+0x5c/0x1d4
[ 48.951629] retrigger_next_event+0x4c/0x90
[ 48.951629] on_each_cpu+0x40/0x7c
[ 48.951629] clock_was_set_work+0x14/0x20
[ 48.951660] process_one_work+0x2b4/0x808
[ 48.951660] worker_thread+0x3c/0x550
[ 48.951660] kthread+0x104/0x148
[ 48.951690] ret_from_fork+0x14/0x24
[ 48.951690]
other info that might help us debug this:
[ 48.951690] Chain exists of:
tk_core --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock
[ 48.951721] Possible unsafe locking scenario:
[ 48.951721] CPU0 CPU1
[ 48.951721] ---- ----
[ 48.951721] lock(hrtimer_bases.lock);
[ 48.951751] lock(&rt_b->rt_runtime_lock);
[ 48.951751] lock(hrtimer_bases.lock);
[ 48.951751] lock(tk_core);
[ 48.951782]
*** DEADLOCK ***
[ 48.951782] 3 locks held by kworker/0:0/3:
[ 48.951782] #0: ("events"){.+.+.+}, at: [<c0156590>] process_one_work+0x1f8/0x808
[ 48.951812] #1: (hrtimer_work){+.+...}, at: [<c0156590>] process_one_work+0x1f8/0x808
[ 48.951843] #2: (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90
[ 48.951843] stack backtrace:
[ 48.951873] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.10.0-rc7-next-20170213+
[ 48.951904] Workqueue: events clock_was_set_work
[ 48.951904] [<c0110208>] (unwind_backtrace) from [<c010c224>] (show_stack+0x10/0x14)
[ 48.951934] [<c010c224>] (show_stack) from [<c04ca6c0>] (dump_stack+0xac/0xe0)
[ 48.951934] [<c04ca6c0>] (dump_stack) from [<c019b5cc>] (print_circular_bug+0x1d0/0x308)
[ 48.951965] [<c019b5cc>] (print_circular_bug) from [<c019d2a8>] (validate_chain+0xf50/0x1324)
[ 48.951965] [<c019d2a8>] (validate_chain) from [<c019ec18>] (__lock_acquire+0x468/0x7e8)
[ 48.951995] [<c019ec18>] (__lock_acquire) from [<c019f634>] (lock_acquire+0xe0/0x294)
[ 48.951995] [<c019f634>] (lock_acquire) from [<c01d0ea0>] (ktime_get_update_offsets_now+0x5c/0x1d4)
[ 48.952026] [<c01d0ea0>] (ktime_get_update_offsets_now) from [<c01cc624>] (retrigger_next_event+0x4c/0x90)
[ 48.952026] [<c01cc624>] (retrigger_next_event) from [<c01e4e24>] (on_each_cpu+0x40/0x7c)
[ 48.952056] [<c01e4e24>] (on_each_cpu) from [<c01cafc4>] (clock_was_set_work+0x14/0x20)
[ 48.952056] [<c01cafc4>] (clock_was_set_work) from [<c015664c>] (process_one_work+0x2b4/0x808)
[ 48.952087] [<c015664c>] (process_one_work) from [<c0157774>] (worker_thread+0x3c/0x550)
[ 48.952087] [<c0157774>] (worker_thread) from [<c015d644>] (kthread+0x104/0x148)
[ 48.952087] [<c015d644>] (kthread) from [<c0107830>] (ret_from_fork+0x14/0x24)
Replace printk() with printk_deferred(), which does not call into
the scheduler.
Fixes: 0bf43f15db85 ("timekeeping: Prints the amounts of time spent during suspend")
Reported-and-tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: "[4.9+]" <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Since KERN_CONT became meaningful again, lockdep stack traces have had
annoying extra newlines, like this:
[ 5.561122] -> #1 (B){+.+...}:
[ 5.561528]
[ 5.561532] [<ffffffff810d8873>] lock_acquire+0xc3/0x210
[ 5.562178]
[ 5.562181] [<ffffffff816f6414>] mutex_lock_nested+0x74/0x6d0
[ 5.562861]
[ 5.562880] [<ffffffffa01aa3c3>] init_btrfs_fs+0x21/0x196 [btrfs]
[ 5.563717]
[ 5.563721] [<ffffffff81000472>] do_one_initcall+0x52/0x1b0
[ 5.564554]
[ 5.564559] [<ffffffff811a3af6>] do_init_module+0x5f/0x209
[ 5.565357]
[ 5.565361] [<ffffffff81122f4d>] load_module+0x218d/0x2b80
[ 5.566020]
[ 5.566021] [<ffffffff81123beb>] SyS_finit_module+0xeb/0x120
[ 5.566694]
[ 5.566696] [<ffffffff816fd241>] entry_SYSCALL_64_fastpath+0x1f/0xc2
That's happening because each printk() call now gets printed on its own
line, and we do a separate call to print the spaces before the symbol.
Fix it by doing the printk() directly instead of using the
print_ip_sym() helper.
Additionally, the symbol address isn't very helpful, so let's get rid of
that, too. The final result looks like this:
[ 5.194518] -> #1 (B){+.+...}:
[ 5.195002] lock_acquire+0xc3/0x210
[ 5.195439] mutex_lock_nested+0x74/0x6d0
[ 5.196491] do_one_initcall+0x52/0x1b0
[ 5.196939] do_init_module+0x5f/0x209
[ 5.197355] load_module+0x218d/0x2b80
[ 5.197792] SyS_finit_module+0xeb/0x120
[ 5.198251] entry_SYSCALL_64_fastpath+0x1f/0xc2
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-team@fb.com
Fixes: 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines")
Link: http://lkml.kernel.org/r/43b4e114724b2bdb0308fa86cb33aa07d3d67fad.1486510315.git.osandov@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Asynchronous external abort is coded differently in DFSR with LPAE enabled.
Fixes: 9254970c "ARM: 8447/1: catch pending imprecise abort on unmask".
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
CONFIG_KASAN=y needs a lot of virtual memory mapped for its shadow.
In that case ptdump_walk_pgd_level_core() takes a lot of time to
walk across all page tables and doing this without
a rescheduling causes soft lockups:
NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [swapper/0:1]
...
Call Trace:
ptdump_walk_pgd_level_core+0x40c/0x550
ptdump_walk_pgd_level_checkwx+0x17/0x20
mark_rodata_ro+0x13b/0x150
kernel_init+0x2f/0x120
ret_from_fork+0x2c/0x40
I guess that this issue might arise even without KASAN on huge machines
with several terabytes of RAM.
Stick cond_resched() in pgd loop to fix this.
Reported-by: Tobias Regnery <tobias.regnery@gmail.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: kasan-dev@googlegroups.com
Cc: Alexander Potapenko <glider@google.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170210095405.31802-1-aryabinin@virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
I am getting the following warning when I build kernel 4.9-git on my
PowerBook G4 with a 32-bit PPC processor:
AS arch/powerpc/kernel/misc_32.o
arch/powerpc/kernel/misc_32.S:299:7: warning: "CONFIG_FSL_BOOKE" is not defined [-Wundef]
This problem is evident after commit 989cea5c14be ("kbuild: prevent
lib-ksyms.o rebuilds"); however, this change in kbuild only exposes an
error that has been in the code since 2005 when this source file was
created. That was with commit 9994a33865f4 ("powerpc: Introduce
entry_{32,64}.S, misc_{32,64}.S, systbl.S").
The offending line does not make a lot of sense. This error does not
seem to cause any errors in the executable, thus I am not recommending
that it be applied to any stable versions.
Thanks to Nicholas Piggin for suggesting this solution.
Fixes: 9994a33865f4 ("powerpc: Introduce entry_{32,64}.S, misc_{32,64}.S, systbl.S")
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use rcuidle console tracepoint because, apparently, it may be issued
from an idle CPU:
hw-breakpoint: Failed to enable monitor mode on CPU 0.
hw-breakpoint: CPU 0 failed to disable vector catch
===============================
[ ERR: suspicious RCU usage. ]
4.10.0-rc8-next-20170215+ #119 Not tainted
-------------------------------
./include/trace/events/printk.h:32 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
RCU used illegally from idle CPU!
rcu_scheduler_active = 2, debug_locks = 0
RCU used illegally from extended quiescent state!
2 locks held by swapper/0/0:
#0: (cpu_pm_notifier_lock){......}, at: [<c0237e2c>] cpu_pm_exit+0x10/0x54
#1: (console_lock){+.+.+.}, at: [<c01ab350>] vprintk_emit+0x264/0x474
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc8-next-20170215+ #119
Hardware name: Generic OMAP4 (Flattened Device Tree)
console_unlock
vprintk_emit
vprintk_default
printk
reset_ctrl_regs
dbg_cpu_pm_notify
notifier_call_chain
cpu_pm_exit
omap_enter_idle_coupled
cpuidle_enter_state
cpuidle_enter_state_coupled
do_idle
cpu_startup_entry
start_kernel
This RCU warning, however, is suppressed by lockdep_off() in printk().
lockdep_off() increments the ->lockdep_recursion counter and thus
disables RCU_LOCKDEP_WARN() and debug_lockdep_rcu_enabled(), which want
lockdep to be enabled "current->lockdep_recursion == 0".
Link: http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Tony Lindgren <tony@atomide.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Russell King <rmk@armlinux.org.uk>
Cc: <stable@vger.kernel.org> [3.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A nested lock depth was added to the hasbin_delete() code but it
doesn't actually work some well and results in tons of lockdep splats.
Fix the code instead to properly drop the lock around the operation
and just keep peeking the head of the hashbin queue.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tick_broadcast_lock is taken from interrupt context, but the following call
chain takes the lock without disabling interrupts:
[ 12.703736] _raw_spin_lock+0x3b/0x50
[ 12.703738] tick_broadcast_control+0x5a/0x1a0
[ 12.703742] intel_idle_cpu_online+0x22/0x100
[ 12.703744] cpuhp_invoke_callback+0x245/0x9d0
[ 12.703752] cpuhp_thread_fun+0x52/0x110
[ 12.703754] smpboot_thread_fn+0x276/0x320
So the following deadlock can happen:
lock(tick_broadcast_lock);
<Interrupt>
lock(tick_broadcast_lock);
intel_idle_cpu_online() is the only place which violates the calling
convention of tick_broadcast_control(). This was caused by the removal of
the smp function call in course of the cpu hotplug rework.
Instead of slapping local_irq_disable/enable() at the call site, we can
relax the calling convention and handle it in the core code, which makes
the whole machinery more robust.
Fixes: 29d7bbada98e ("intel_idle: Remove superfluous SMP fuction call")
Reported-by: Gabriel C <nix.or.die@gmail.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Ruslan Ruslichenko <rruslich@cisco.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: lwn@lwn.net
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: stable <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1486953115.5912.4.camel@gmx.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull networking fixes from David Miller:
1) Load correct firmware in rtl8192ce wireless driver, from Jurij
Smakov.
2) Fix leak of tx_ring and tx_cq due to overwriting in mlx4 driver,
from Martin KaFai Lau.
3) Need to reference count PHY driver module when it is attached, from
Mao Wenan.
4) Don't do zero length vzalloc() in ethtool register dump, from
Stanislaw Gruszka.
5) Defer net_disable_timestamp() to a workqueue to get out of locking
issues, from Eric Dumazet.
6) We cannot drop the SKB dst when IP options refer to them, fix also
from Eric Dumazet.
7) Incorrect packet header offset calculations in ip6_gre, again from
Eric Dumazet.
8) Missing tcp_v6_restore_cb() causes use-after-free, from Eric too.
9) tcp_splice_read() can get into an infinite loop with URG, and hey
it's from Eric once more.
10) vnet_hdr_sz can change asynchronously, so read it once during
decision making in macvtap and tun, from Willem de Bruijn.
11) Can't use kernel stack for DMA transfers in USB networking drivers,
from Ben Hutchings.
12) Handle csum errors properly in UDP by calling the proper destructor,
from Eric Dumazet.
13) For non-deterministic softirq run when scheduling NAPI from a
workqueue in mlx4, from Benjamin Poirier.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (28 commits)
sctp: check af before verify address in sctp_addr_id2transport
sctp: avoid BUG_ON on sctp_wait_for_sndbuf
mlx4: Invoke softirqs after napi_reschedule
udp: properly cope with csum errors
catc: Use heap buffer for memory size test
catc: Combine failure cleanup code in catc_probe()
rtl8150: Use heap buffers for all register access
pegasus: Use heap buffers for all register access
macvtap: read vnet_hdr_size once
tun: read vnet_hdr_sz once
tcp: avoid infinite loop in tcp_splice_read()
hns: avoid stack overflow with CONFIG_KASAN
ipv6: Fix IPv6 packet loss in scenarios involving roaming + snooping switches
ipv6: tcp: add a missing tcp_v6_restore_cb()
nl80211: Fix mesh HT operation check
mac80211: Fix adding of mesh vendor IEs
mac80211: Allocate a sync skcipher explicitly for FILS AEAD
mac80211: Fix FILS AEAD protection in Association Request frame
ip6_gre: fix ip6gre_err() invalid reads
netlabel: out of bound access in cipso_v4_validate()
...
The following patch was sketched by Russell in response to my
crashes on the PB11MPCore after the patch for software-based
priviledged no access support for ARMv8.1. See this thread:
http://marc.info/?l=linux-arm-kernel&m=144051749807214&w=2
I am unsure what is going on, I suspect everyone involved in
the discussion is. I just want to repost this to get the
discussion restarted, as I still have to apply this patch
with every kernel iteration to get my PB11MPCore Realview
running.
Testing by Neil Armstrong on the Oxnas NAS has revealed that
this bug exist also on that widely deployed hardware, so
we are probably currently regressing all ARM11MPCore systems.
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: a5e090acbf54 ("ARM: software-based priviledged-no-access support")
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull perf fixes from Ingo Molnar:
"A kernel crash fix plus three tooling fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix crash in perf_event_read()
perf callchain: Reference count maps
perf diff: Fix -o/--order option behavior (again)
perf diff: Fix segfault on 'perf diff -o N' option
ts->next_tick keeps track of the next tick deadline in order to optimize
clock programmation on irq exit and avoid redundant clock device writes.
Now if ts->next_tick missed an update, we may spuriously miss a clock
reprog later as the nohz code is fooled by an obsolete next_tick value.
This is what happens here on a specific path: when we observe an
expired timer from the nohz update code on irq exit, we perform a soft
tick restart which simply fires the closest possible tick without
actually exiting the nohz mode and restoring a periodic state. But we
forget to update ts->next_tick accordingly.
As a result, after the next tick resulting from such soft tick restart,
the nohz code sees a stale value on ts->next_tick which doesn't match
the clock deadline that just expired. If that obsolete ts->next_tick
value happens to collide with the actual next tick deadline to be
scheduled, we may spuriously bypass the clock reprogramming. In the
worst case, the tick may never fire again.
Fix this with a ts->next_tick reset on soft tick restart.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1486485894-29173-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When the TSC is marked reliable then the synchronization check is skipped,
but that also skips the TSC ADJUST sanitizing code. So on a machine with a
wreckaged BIOS the TSC deviation between CPUs might go unnoticed.
Let the TSC adjust sanitizing code run unconditionally and just skip the
expensive synchronization checks when TSC is marked reliable.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Olof Johansson <olof@lixom.net>
Link: http://lkml.kernel.org/r/20170209151231.491189912@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The timer type simplifications caused a new gcc warning:
drivers/base/power/domain.c: In function ‘genpd_runtime_suspend’:
drivers/base/power/domain.c:562:14: warning: ‘time_start’ may be used uninitialized in this function [-Wmaybe-uninitialized]
elapsed_ns = ktime_to_ns(ktime_sub(ktime_get(), time_start));
despite the actual use of "time_start" not having changed in any way.
It appears that simply changing the type of ktime_t from a union to a
plain scalar type made gcc check the use.
The variable wasn't actually used uninitialized, but gcc apparently
failed to notice that the conditional around the use was exactly the
same as the conditional around the initialization of that variable.
Add an unnecessary initialization just to shut up the compiler.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since the commit 0c1d70af924b ("net: use dst_cache for vxlan device")
vxlan_fill_metadata_dst() calls vxlan_get_route() passing a NULL
dst_cache pointer, so the latter should explicitly check for
valid dst_cache ptr. Unfortunately the commit d71785ffc7e7 ("net: add
dst_cache to ovs vxlan lwtunnel") removed said check.
As a result is possible to trigger a null pointer access calling
vxlan_fill_metadata_dst(), e.g. with:
ovs-vsctl add-br ovs-br0
ovs-vsctl add-port ovs-br0 vxlan0 -- set interface vxlan0 \
type=vxlan options:remote_ip=192.168.1.1 \
options:key=1234 options:dst_port=4789 ofport_request=10
ip address add dev ovs-br0 172.16.1.2/24
ovs-vsctl set Bridge ovs-br0 ipfix=@i -- --id=@i create IPFIX \
targets=\"172.16.1.1:1234\" sampling=1
iperf -c 172.16.1.1 -u -l 1000 -b 10M -t 1 -p 1234
This commit addresses the issue passing to vxlan_get_route() the
dst_cache already available into the lwt info processed by
vxlan_fill_metadata_dst().
Fixes: d71785ffc7e7 ("net: add dst_cache to ovs vxlan lwtunnel")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 6326fec1122c ("mm: Use owner_priv bit for PageSwapCache, valid
when PageSwapBacked") aliased PG_swapcache to PG_owner_priv_1 (and
depending on PageSwapBacked being true).
As a result, the KPF_SWAPCACHE bit in '/proc/kpageflags' should now be
synthesized, instead of being shown on unrelated pages which just happen
to have PG_owner_priv_1 set.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 6f29a1306131 ("sctp: sctp_addr_id2transport should verify the
addr before looking up assoc") invoked sctp_verify_addr to verify the
addr.
But it didn't check af variable beforehand, once users pass an address
with family = 0 through sockopt, sctp_get_af_specific will return NULL
and NULL pointer dereference will be caused by af->sockaddr_len.
This patch is to fix it by returning NULL if af variable is NULL.
Fixes: 6f29a1306131 ("sctp: sctp_addr_id2transport should verify the addr before looking up assoc")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull lockdep fix from Ingo Molnar:
"This fixes an ugly lockdep stack trace output regression. (But also
affects other stacktrace users such as kmemleak, KASAN, etc)"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
stacktrace, lockdep: Fix address, newline ugliness
Alexei had his box explode because doing read() on a package
(rapl/uncore) event that isn't currently scheduled in ends up doing an
out-of-bounds load.
Rework the code to more explicitly deal with event->oncpu being -1.
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Fixes: d6a2f9035bfc ("perf/core: Introduce PMU_EV_CAP_READ_ACTIVE_PKG")
Link: http://lkml.kernel.org/r/20170131102710.GL6515@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull drm fixes from Dave Airlie:
"This should be the final set of drm fixes for 4.10: one vmwgfx boot
fix, one vc4 fix, and a few i915 fixes:
* tag 'drm-fixes-for-v4.10-rc8' of git://people.freedesktop.org/~airlied/linux:
drm: vc4: adapt to new behaviour of drm_crtc.c
drm/i915: Always convert incoming exec offsets to non-canonical
drm/i915: Remove overzealous fence warn on runtime suspend
drm/i915/bxt: Add MST support when do DPLL calculation
drm/i915: don't warn about Skylake CPU - KabyPoint PCH combo
drm/i915: fix i915 running as dom0 under Xen
drm/i915: Flush untouched framebuffers before display on !llc
drm/i915: fix use-after-free in page_flip_completed()
drm/vmwgfx: Fix depth input into drm_mode_legacy_fb_format
Olof reported that on a machine which has a BIOS wreckaged TSC the
timestamps in dmesg are making a large jump because the TSC value is
jumping forward after resetting the TSC ADJUST register to a sane value.
This can be avoided by calling the TSC ADJUST saniziting function before
initializing the per cpu sched clock machinery. That takes the offset into
account and avoid the time jump.
What cannot be avoided is that the 'Firmware Bug' warnings on the secondary
CPUs are printed with the large time offsets because it would be too much
effort and ugly hackery to print those warnings into a buffer and emit them
after the adjustemt on the starting CPUs. It's a firmware bug and should be
fixed in firmware. The weird timestamps are collateral damage and just
illustrate the sillyness of the BIOS folks:
[ 0.397445] smp: Bringing up secondary CPUs ...
[ 0.402100] x86: Booting SMP configuration:
[ 0.406343] .... node #0, CPUs: #1
[1265776479.930667] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU1: -2978888639183101
[1265776479.944664] TSC ADJUST synchronize: Reference CPU0: 0 CPU1: -2978888639183101
[ 0.508119] #2
[1265776480.032346] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU2: -2978888639183677
[1265776480.044192] TSC ADJUST synchronize: Reference CPU0: 0 CPU2: -2978888639183677
[ 0.607643] #3
[1265776480.131874] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU3: -2978888639184530
[1265776480.143720] TSC ADJUST synchronize: Reference CPU0: 0 CPU3: -2978888639184530
[ 0.707108] smp: Brought up 1 node, 4 CPUs
[ 0.711271] smpboot: Total of 4 processors activated (21698.88 BogoMIPS)
Reported-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170209151231.411460506@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull timer type cleanups from Thomas Gleixner:
"This series does a tree wide cleanup of types related to
timers/timekeeping.
- Get rid of cycles_t and use a plain u64. The type is not really
helpful and caused more confusion than clarity
- Get rid of the ktime union. The union has become useless as we use
the scalar nanoseconds storage unconditionally now. The 32bit
timespec alike storage got removed due to the Y2038 limitations
some time ago.
That leaves the odd union access around for no reason. Clean it up.
Both changes have been done with coccinelle and a small amount of
manual mopping up"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ktime: Get rid of ktime_equal()
ktime: Cleanup ktime_set() usage
ktime: Get rid of the union
clocksource: Use a plain u64 instead of cycle_t
Pull powerpc fix from Michael Ellerman:
"One fix from Paul: we can not use the radix MMU under a hypervisor for
now.
Although the code checked if the processor supports radix, that is not
sufficient"
* tag 'powerpc-4.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64: Disable use of radix under a hypervisor
wbt_disable_default() calls del_timer_sync() to wait for the wbt
timer to finish before disabling throttling. We can't do this with
IRQs disable. This fixes a lockdep splat on boot, if non-root
cgroups are used.
Reported-by: Gabriel C <nix.or.die@gmail.com>
Fixes: 87760e5eef35 ("block: hook up writeback throttling")
Signed-off-by: Jens Axboe <axboe@fb.com>
In the current DCCP implementation an skb for a DCCP_PKT_REQUEST packet
is forcibly freed via __kfree_skb in dccp_rcv_state_process if
dccp_v6_conn_request successfully returns.
However, if IPV6_RECVPKTINFO is set on a socket, the address of the skb
is saved to ireq->pktopts and the ref count for skb is incremented in
dccp_v6_conn_request, so skb is still in use. Nevertheless, it gets freed
in dccp_rcv_state_process.
Fix by calling consume_skb instead of doing goto discard and therefore
calling __kfree_skb.
Similar fixes for TCP:
fb7e2399ec17f1004c0e0ccfd17439f8759ede01 [TCP]: skb is unexpectedly freed.
0aea76d35c9651d55bbaf746e7914e5f9ae5a25d tcp: SYN packets are now
simply consumed
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Jo-Philipp Wich <jo@mein.io>
Fixes: 9aed02feae57bf7 ("ARC: [arcompact] handle unaligned access delay slot")
Cc: linux-kernel@vger.kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Popov reported that an application may trigger a BUG_ON in
sctp_wait_for_sndbuf if the socket tx buffer is full, a thread is
waiting on it to queue more data and meanwhile another thread peels off
the association being used by the first thread.
This patch replaces the BUG_ON call with a proper error handling. It
will return -EPIPE to the original sendmsg call, similarly to what would
have been done if the association wasn't found in the first place.
Acked-by: Alexander Popov <alex.popov@linux.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to the way kbuild works, this header was unintentionally exported
back in 2013 when it was created, despite it not being in a uapi/
directory. This is very non-intuitive behaviour by Kbuild.
However, we've had this include exported to userland for almost four
years, and searching google for "ARM types.h __UINTPTR_TYPE__" gives
no hint that anyone has complained about it. So, let's make it
officially exported in this state.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Reference count maps in callchains, fixing a SEGFAULT when referencing a
map after it is freed (Krister Johansen)
- Fix segfault on 'perf diff -o N' option (Namhyung Kim)
- Fix 'perf diff -o/--order' option behavior (Namhyung Kim)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull SCSI target fixes from Nicholas Bellinger:
"This target series for v4.10 contains fixes which address a few
long-standing bugs that DATERA's QA + automation teams have uncovered
while putting v4.1.y target code into production usage.
We've been running the top three in our nightly automated regression
runs for the last two months, and the COMPARE_AND_WRITE fix Mr. Gary
Guo has been manually verifying against a four node ESX cluster this
past week.
Note all of them have CC' stable tags.
Summary:
- Fix a bug with ESX EXTENDED_COPY + SAM_STAT_RESERVATION_CONFLICT
status, where target_core_xcopy.c logic was incorrectly returning
SAM_STAT_CHECK_CONDITION for all non SAM_STAT_GOOD cases (Nixon
Vincent)
- Fix a TMR LUN_RESET hung task bug while other in-flight TMRs are
being aborted, before the new one had been dispatched into tmr_wq
(Rob Millner)
- Fix a long standing double free OOPs, where a dynamically generated
'demo-mode' NodeACL has multiple sessions associated with it, and
the /sys/kernel/config/target/$FABRIC/$WWN/ subsequently disables
demo-mode, but never converts the dynamic ACL into a explicit ACL
(Rob Millner)
- Fix a long standing reference leak with ESX VAAI COMPARE_AND_WRITE
when the second phase WRITE COMMIT command fails, resulting in
CHECK_CONDITION response never being sent and se_cmd->cmd_kref
never reaching zero (Gary Guo)
Beyond these items on v4.1.y we've reproduced, fixed, and run through
our regression test suite using iscsi-target exports, there are two
additional outstanding list items:
- Remove a >= v4.2 RCU conversion BUG_ON that would trigger when
dynamic node NodeACLs where being converted to explicit NodeACLs.
The patch drops the BUG_ON to follow how pre RCU conversion worked
for this special case (Benjamin Estrabaud)
- Add ibmvscsis target_core_fabric_ops->max_data_sg_nent assignment
to match what IBM's Virtual SCSI hypervisor is already enforcing at
transport layer. (Bryant Ly + Steven Royer)"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
ibmvscsis: Add SGL limit
target: Fix COMPARE_AND_WRITE ref leak for non GOOD status
target: Fix multi-session dynamic se_node_acl double free OOPs
target: Fix early transport_generic_handle_tmr abort scenario
target: Use correct SCSI status during EXTENDED_COPY exception
target: Don't BUG_ON during NodeACL dynamic -> explicit conversion
Hopefully final fixes for v4.10, about half of them stable material.
* tag 'drm-intel-fixes-2017-02-09' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: Always convert incoming exec offsets to non-canonical
drm/i915: Remove overzealous fence warn on runtime suspend
drm/i915/bxt: Add MST support when do DPLL calculation
drm/i915: don't warn about Skylake CPU - KabyPoint PCH combo
drm/i915: fix i915 running as dom0 under Xen
drm/i915: Flush untouched framebuffers before display on !llc
drm/i915: fix use-after-free in page_flip_completed()
After:
a33d331761bc ("x86/CPU/AMD: Fix Bulldozer topology")
our SMT scheduling topology for Fam17h systems is broken, because
the ThreadId is included in the ApicId when SMT is enabled.
So, without further decoding cpu_core_id is unique for each thread
rather than the same for threads on the same core. This didn't affect
systems with SMT disabled. Make cpu_core_id be what it is defined to be.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.9
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170205105022.8705-2-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull SMP hotplug notifier removal from Thomas Gleixner:
"This is the final cleanup of the hotplug notifier infrastructure. The
series has been reintgrated in the last two days because there came a
new driver using the old infrastructure via the SCSI tree.
Summary:
- convert the last leftover drivers utilizing notifiers
- fixup for a completely broken hotplug user
- prevent setup of already used states
- removal of the notifiers
- treewide cleanup of hotplug state names
- consolidation of state space
There is a sphinx based documentation pending, but that needs review
from the documentation folks"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/armada-xp: Consolidate hotplug state space
irqchip/gic: Consolidate hotplug state space
coresight/etm3/4x: Consolidate hotplug state space
cpu/hotplug: Cleanup state names
cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
staging/lustre/libcfs: Convert to hotplug state machine
scsi/bnx2i: Convert to hotplug state machine
scsi/bnx2fc: Convert to hotplug state machine
cpu/hotplug: Prevent overwriting of callbacks
x86/msr: Remove bogus cleanup from the error path
bus: arm-ccn: Prevent hotplug callback leak
perf/x86/intel/cstate: Prevent hotplug callback leak
ARM/imx/mmcd: Fix broken cpu hotplug handling
scsi: qedi: Convert to hotplug state machine