commits

Shortly before 3.16-rc1, Dave Jones reported:

WARNING: CPU: 3 PID: 19721 at fs/xfs/xfs_aops.c:971
xfs_vm_writepage+0x5ce/0x630 [xfs]()
CPU: 3 PID: 19721 Comm: trinity-c61 Not tainted 3.15.0+ #3
Call Trace:
xfs_vm_writepage+0x5ce/0x630 [xfs]
shrink_page_list+0x8f9/0xb90
shrink_inactive_list+0x253/0x510
shrink_lruvec+0x563/0x6c0
shrink_zone+0x3b/0x100
shrink_zones+0x1f1/0x3c0
try_to_free_pages+0x164/0x380
__alloc_pages_nodemask+0x822/0xc90
alloc_pages_vma+0xaf/0x1c0
handle_mm_fault+0xa31/0xc50
etc.

970 if (WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD)) ==
971 PF_MEMALLOC))

I did not respond at the time, because a glance at the PageDirty block
in shrink_page_list() quickly shows that this is impossible: we don't do
writeback on file pages (other than tmpfs) from direct reclaim nowadays.
Dave was hallucinating, but it would have been disrespectful to say so.

However, my own /var/log/messages now shows similar complaints

WARNING: CPU: 1 PID: 28814 at fs/ext4/inode.c:1881 ext4_writepage+0xa7/0x38b()
WARNING: CPU: 0 PID: 27347 at fs/ext4/inode.c:1764 ext4_writepage+0xa7/0x38b()

from stressing some mmotm trees during July.

Could a dirty xfs or ext4 file page somehow get marked PageSwapBacked,
so fail shrink_page_list()'s page_is_file_cache() test, and so proceed
to mapping->a_ops->writepage()?

Yes, 3.16-rc1's commit 68711a746345 ("mm, migration: add destination
page freeing callback") has provided such a way to compaction: if
migrating a SwapBacked page fails, its newpage may be put back on the
list for later use with PageSwapBacked still set, and nothing will clear
it.

Whether that can do anything worse than issue WARN_ON_ONCEs, and get
some statistics wrong, is unclear: easier to fix than to think through
the consequences.

Fixing it here, before the put_new_page(), addresses the bug directly,
but is probably the worst place to fix it. Page migration is doing too
many parts of the job on too many levels: fixing it in
move_to_new_page() to complement its SetPageSwapBacked would be
preferable, except why is it (and newpage->mapping and newpage->index)
done there, rather than down in migrate_page_move_mapping(), once we are
sure of success? Not a cleanup to get into right now, especially not
with memcg cleanups coming in 3.17.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11y ago

Geert Uytterhoeven

655fc39b

firewire: IEEE 1394 (FireWire) support should depend on HAS_DMA

Commit b3d681a4fc108f9653bbb44e4f4e72db2b8a5734 ("firewire: Use
COMPILE_TEST for build testing") added COMPILE_TEST as an alternative
dependency for the purpose of build testing the firewire core.
However, this bypasses all other implicit dependencies assumed by PCI,
like HAS_DMA.

If NO_DMA=y:

drivers/built-in.o: In function `fw_iso_buffer_destroy':
(.text+0x36a096): undefined reference to `dma_unmap_page'
drivers/built-in.o: In function `fw_iso_buffer_map_dma':
(.text+0x36a164): undefined reference to `dma_map_page'
drivers/built-in.o: In function `fw_iso_buffer_map_dma':
(.text+0x36a172): undefined reference to `dma_mapping_error'
drivers/built-in.o: In function `sbp2_send_management_orb':
sbp2.c:(.text+0x36c6b4): undefined reference to `dma_map_single'
sbp2.c:(.text+0x36c6c8): undefined reference to `dma_mapping_error'
sbp2.c:(.text+0x36c772): undefined reference to `dma_map_single'
sbp2.c:(.text+0x36c786): undefined reference to `dma_mapping_error'
sbp2.c:(.text+0x36c854): undefined reference to `dma_unmap_single'
sbp2.c:(.text+0x36c872): undefined reference to `dma_unmap_single'
drivers/built-in.o: In function `sbp2_map_scatterlist':
sbp2.c:(.text+0x36ccbc): undefined reference to `scsi_dma_map'
sbp2.c:(.text+0x36cd36): undefined reference to `dma_map_single'
sbp2.c:(.text+0x36cd4e): undefined reference to `dma_mapping_error'
sbp2.c:(.text+0x36cd84): undefined reference to `scsi_dma_unmap'
drivers/built-in.o: In function `sbp2_unmap_scatterlist':
sbp2.c:(.text+0x36cda6): undefined reference to `scsi_dma_unmap'
sbp2.c:(.text+0x36cdc6): undefined reference to `dma_unmap_single'
drivers/built-in.o: In function `complete_command_orb':
sbp2.c:(.text+0x36d6ac): undefined reference to `dma_unmap_single'
drivers/built-in.o: In function `sbp2_scsi_queuecommand':
sbp2.c:(.text+0x36d8e0): undefined reference to `dma_map_single'
sbp2.c:(.text+0x36d8f6): undefined reference to `dma_mapping_error'

Add an explicit dependency on HAS_DMA to fix this.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>

11y ago

Linus Torvalds

82e13c71

Merge branch 'for-3.16' of git://linux-nfs.org/~bfields/linux

11y ago

Linus Torvalds

b7a68369

Merge tag 'staging-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

11y ago

Peter Zijlstra

4a1c0f26

perf: Fix lockdep warning on process exit

Sasha Levin reported:

> While fuzzing with trinity inside a KVM tools guest running the latest -next
> kernel I've stumbled on the following spew:
>
> ======================================================
> [ INFO: possible circular locking dependency detected ]
> 3.15.0-next-20140613-sasha-00026-g6dd125d-dirty #654 Not tainted
> -------------------------------------------------------
> trinity-c578/9725 is trying to acquire lock:
> (&(&pool->lock)->rlock){-.-...}, at: __queue_work (kernel/workqueue.c:1346)
>
> but task is already holding lock:
> (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533)
>
> which lock already depends on the new lock.

> 1 lock held by trinity-c578/9725:
> #0: (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533)
>
> Call Trace:
> dump_stack (lib/dump_stack.c:52)
> print_circular_bug (kernel/locking/lockdep.c:1216)
> __lock_acquire (kernel/locking/lockdep.c:1840 kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 kernel/locking/lockdep.c:3182)
> lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602)
> _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
> __queue_work (kernel/workqueue.c:1346)
> queue_work_on (kernel/workqueue.c:1424)
> free_object (lib/debugobjects.c:209)
> __debug_check_no_obj_freed (lib/debugobjects.c:715)
> debug_check_no_obj_freed (lib/debugobjects.c:727)
> kmem_cache_free (mm/slub.c:2683 mm/slub.c:2711)
> free_task (kernel/fork.c:221)
> __put_task_struct (kernel/fork.c:250)
> put_ctx (include/linux/sched.h:1855 kernel/events/core.c:898)
> perf_event_exit_task (kernel/events/core.c:907 kernel/events/core.c:7478 kernel/events/core.c:7533)
> do_exit (kernel/exit.c:766)
> do_group_exit (kernel/exit.c:884)
> get_signal_to_deliver (kernel/signal.c:2347)
> do_signal (arch/x86/kernel/signal.c:698)
> do_notify_resume (arch/x86/kernel/signal.c:751)
> int_signal (arch/x86/kernel/entry_64.S:600)

Urgh.. so the only way I can make that happen is through:

perf_event_exit_task_context()
raw_spin_lock(&child_ctx->lock);
unclone_ctx(child_ctx)
put_ctx(ctx->parent_ctx);
raw_spin_unlock_irqrestore(&child_ctx->lock);

And we can avoid this by doing the change below.

I can't immediately see how this changed recently, but given that you
say it's easy to reproduce, lets fix this.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140623141242.GB19860@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>

11y ago

Linus Torvalds

b401796c

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

11y ago

Stefan Richter

d151f985

firewire: ohci: enable MSI for VIA VT6315 rev 1, drop cycle timer quirk

11y ago

Linus Torvalds

98de5ab7

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

11y ago

Kinglong Mee

f98bac5a

NFSD: Fix crash encoding lock reply on 32-bit

11y ago

Linus Torvalds

caa7c4e1

Merge tag 'usb-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

11y ago

Greg Kroah-Hartman

93590033

Merge tag 'iio-fixes-for-3.16d' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus

11y ago

Stephane Eranian

7711fe4f

perf/x86/intel/uncore: Fix SNB-EP/IVT Cbox filter mappings

11y ago

Linus Torvalds

9c550218

Merge tag 'sound-3.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

11y ago

Jerome Glisse

1b2c4869

drm/radeon: fix cut and paste issue for hawaii.

11y ago

Jean Delvare

b3d681a4

firewire: Use COMPILE_TEST for build testing

11y ago

Linus Torvalds

29ae8a6a

Merge tag 'xtensa-next-20140721' of git://github.com/czankel/xtensa-linux

11y ago

Catalin Marinas

d50314a6

arm64: Create non-empty ZONE_DMA when DRAM starts above 4GB

11y ago

Kinglong Mee

c3a45617

nfsd: Fix bad reserving space for encoding rdattr_error

11y ago

Linus Torvalds

f47d5bb0

Merge tag 'driver-core-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

11y ago

Gavin Guo

bb86cf56

usb: Check if port status is equal to RxDetect

11y ago

Linus Torvalds

1795cd9b

Linux 3.16-rc5 v3.16-rc5

11y ago

Martin Fuzzey

71702e6e

iio: mma8452: Use correct acceleration units.

11y ago

Vince Weaver

1996388e

perf/x86/intel: Use proper dTLB-load-misses event on IvyBridge

11y ago

Linus Torvalds

051c2a9f

Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

11y ago

Takashi Sakamoto

eb12f72e

ALSA: bebob: Correction for return value of special_clk_ctl_put() in error

11y ago

Dave Airlie

97cefc3e

Merge branch 'drm-fixes-3.16' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

11y ago

Daeseok Youn

1118f8d0

firewire: net: fix NULL derefencing in fwnet_probe()

11y ago

Linus Torvalds

02ec4747

Merge tag 'pinctrl-v3.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

11y ago

Chris Zankel

4ed2ad38

Merge tag 'xtensa-for-next-20140715' of git://github.com/jcmvbkbc/linux-xtensa into for_next

11y ago

Colin Cross

fa2ec3ea

arm64: implement TASK_SIZE_OF

11y ago

Avi Kivity

69bbd9c7

nfs: fix nfs4d readlink truncated packet

11y ago

Linus Torvalds

fa24615f

Merge tag 'char-misc-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

11y ago

Guenter Roeck

aff008ad

platform_get_irq: Revert to platform_get_resource if of_irq_get fails

11y ago

Abbas Raza

953c6646

usb: chipidea: udc: Disable auto ZLP generation on ep0

There are 2 methods for ZLP (zero-length packet) generation:
1) In software
2) Automatic generation by device controller

1) is implemented in UDC driver and it attaches ZLP to IN packet if
descriptor->size < wLength
2) can be enabled/disabled by setting ZLT bit in the QH

When gadget ffs is connected to ubuntu host, the host sends
get descriptor request and wLength in setup packet is 255 while the
size of descriptor which will be sent by gadget in IN packet is
64 byte. So the composite driver sets req->zero = 1.
In UDC driver following code will be executed then

if (hwreq->req.zero && hwreq->req.length
&& (hwreq->req.length % hwep->ep.maxpacket == 0))
add_td_to_list(hwep, hwreq, 0);

Case-A:
So in case of ubuntu host, UDC driver will attach a ZLP to the IN packet.
ubuntu host will request 255 byte in IN request, gadget will send 64 byte
with ZLP and host will come to know that there is no more data.
But hold on, by default ZLT=0 for endpoint 0 so hardware also tries to
automatically generate the ZLP which blocks enumeration for ~6 seconds due
to endpoint 0 STALL, NAKs are sent to host for any requests (OUT/PING)

Case-B:
In case when gadget ffs is connected to Apple device, Apple device sends
setup packet with wLength=64. So descriptor->size = 64 and wLength=64
therefore req->zero = 0 and UDC driver will not attach any ZLP to the
IN packet. Apple device requests 64 bytes, gets 64 bytes and doesn't
further request for IN data. But ZLT=0 by default for endpoint 0 so
hardware tries to automatically generate the ZLP which blocks enumeration
for ~6 seconds due to endpoint 0 STALL, NAKs are sent to host for any
requests (OUT/PING)

According to USB2.0 specs:

8.5.3.2 Variable-length Data Stage
A control pipe may have a variable-length data phase in which the
host requests more data than is contained in the specified data
structure. When all of the data structure is returned to the host,
the function should indicate that the Data stage is ended by
returning a packet that is shorter than the MaxPacketSize for the
pipe. If the data structure is an exact multiple of wMaxPacketSize
for the pipe, the function will return a zero-length packet to indicate
the end of the Data stage.

In Case-A mentioned above:
If we disable software ZLP generation & ZLT=0 for endpoint 0 OR if software
ZLP generation is not disabled but we set ZLT=1 for endpoint 0 then
enumeration doesn't block for 6 seconds.

In Case-B mentioned above:
If we disable software ZLP generation & ZLT=0 for endpoint then enumeration
still blocks due to ZLP automatically generated by hardware and host not needing
it. But if we keep software ZLP generation enabled but we set ZLT=1 for
endpoint 0 then enumeration doesn't block for 6 seconds.

So the proper solution for this issue seems to disable automatic ZLP generation
by hardware (i.e by setting ZLT=1 for endpoint 0) and let software (UDC driver)
handle the ZLP generation based on req->zero field.

Cc: stable@vger.kernel.org
Signed-off-by: Abbas Raza <Abbas_Raza@mentor.com>
Signed-off-by: Peter Chen <peter.chen@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>