commits

0day robot reported a 9.2% regression for will-it-scale mmap1 test
case[1], caused by commit 57efa1fe5957 ("mm/gup: prevent gup_fast from
racing with COW during fork").

Further debug shows the regression is due to that commit changes the
offset of hot fields 'mmap_lock' inside structure 'mm_struct', thus some
cache alignment changes.

From the perf data, the contention for 'mmap_lock' is very severe and
takes around 95% cpu cycles, and it is a rw_semaphore

struct rw_semaphore {
atomic_long_t count; /* 8 bytes */
atomic_long_t owner; /* 8 bytes */
struct optimistic_spin_queue osq; /* spinner MCS lock */
...

Before commit 57efa1fe5957 adds the 'write_protect_seq', it happens to
have a very optimal cache alignment layout, as Linus explained:

"and before the addition of the 'write_protect_seq' field, the
mmap_sem was at offset 120 in 'struct mm_struct'.

Which meant that count and owner were in two different cachelines,
and then when you have contention and spend time in
rwsem_down_write_slowpath(), this is probably *exactly* the kind
of layout you want.

Because first the rwsem_write_trylock() will do a cmpxchg on the
first cacheline (for the optimistic fast-path), and then in the
case of contention, rwsem_down_write_slowpath() will just access
the second cacheline.

Which is probably just optimal for a load that spends a lot of
time contended - new waiters touch that first cacheline, and then
they queue themselves up on the second cacheline."

After the commit, the rw_semaphore is at offset 128, which means the
'count' and 'owner' fields are now in the same cacheline, and causes
more cache bouncing.

Currently there are 3 "#ifdef CONFIG_XXX" before 'mmap_lock' which will
affect its offset:

CONFIG_MMU
CONFIG_MEMBARRIER
CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES

The layout above is on 64 bits system with 0day's default kernel config
(similar to RHEL-8.3's config), in which all these 3 options are 'y'.
And the layout can vary with different kernel configs.

Relayouting a structure is usually a double-edged sword, as sometimes it
can helps one case, but hurt other cases. For this case, one solution
is, as the newly added 'write_protect_seq' is a 4 bytes long seqcount_t
(when CONFIG_DEBUG_LOCK_ALLOC=n), placing it into an existing 4 bytes
hole in 'mm_struct' will not change other fields' alignment, while
restoring the regression.

Link: https://lore.kernel.org/lkml/20210525031636.GB7744@xsang-OptiPlex-9020/ [1]
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

4y ago

Alexandre Ghiti

0ddd7eaf

riscv: Fix BUILTIN_DTB for sifive and microchip soc

4y ago

Ming Lei

11714026

scsi: core: Put .shost_dev in failure path if host state changes to RUNNING

4y ago

Chuck Lever

d1b5c230

NFS: FMODE_READ and friends are C macros, not enum types

4y ago

Linus Torvalds

f09eacca

Merge branch 'for-5.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

4y ago

Linus Torvalds

43cb5d49

Merge tag 'usb-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are a number of tiny USB fixes for 5.13-rc6.

There are more than I would normally like, but there's been a bunch of
people banging on the gadget and dwc3 and typec code recently for I
think an Android release, which has resulted in a number of small
fixes. It's nice to see companies send fixes upstream for this type of
work, a notable change from years ago.

Anyway, fixes in here are:

- usb-serial device id updates

- usb-serial cp210x driver fixes for broken firmware versions

- typec fixes for crazy charging devices and other reported problems

- dwc3 fixes for reported problems found

- gadget fixes for reported problems

- tiny xhci fixes

- other small fixes for reported issues.

- revert of a problem fix found by linux-next testing

All of these have passed 0-day and linux-next testing with no reported
problems (the revert for the found linux-next build problem included)"

* tag 'usb-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (44 commits)
Revert "usb: gadget: fsl: Re-enable driver for ARM SoCs"
usb: typec: mux: Fix copy-paste mistake in typec_mux_match
usb: typec: ucsi: Clear PPM capability data in ucsi_init() error path
usb: gadget: fsl: Re-enable driver for ARM SoCs
usb: typec: wcove: Use LE to CPU conversion when accessing msg->header
USB: serial: cp210x: fix CP2102N-A01 modem control
USB: serial: cp210x: fix alternate function for CP2102N QFN20
usb: misc: brcmstb-usb-pinmap: check return value after calling platform_get_resource()
usb: dwc3: ep0: fix NULL pointer exception
usb: gadget: eem: fix wrong eem header operation
usb: typec: intel_pmc_mux: Put ACPI device using acpi_dev_put()
usb: typec: intel_pmc_mux: Add missed error check for devm_ioremap_resource()
usb: typec: intel_pmc_mux: Put fwnode in error case during ->probe()
usb: typec: tcpm: Do not finish VDM AMS for retrying Responses
usb: fix various gadget panics on 10gbps cabling
usb: fix various gadgets null ptr deref on 10gbps cabling.
usb: pci-quirks: disable D3cold on xhci suspend for s2idle on AMD Renoir
usb: f_ncm: only first packet of aggregate needs to start timer
USB: f_ncm: ncm_bitrate (speed) is unsigned
MAINTAINERS: usb: add entry for isp1760
...

4y ago

Vitaly Wool

858cf860

riscv: alternative: fix typo in macro name

4y ago

Ming Lei

3719f4ff

scsi: core: Fix failure handling of scsi_add_host_with_dma()

4y ago

Dan Carpenter

09226e83

NFS: Fix a potential NULL dereference in nfs_get_client()

4y ago

Linus Torvalds

29a877d5

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

4y ago

Alexander Kuznetsov

b7e24eb1

cgroup1: don't allow '\n' in renaming

4y ago

Linus Torvalds

c46fe4aa

Merge tag 'tty-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

4y ago

Greg Kroah-Hartman

7c4363d3

Merge tag 'usb-serial-5.13-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus

4y ago

Jisheng Zhang

42e0e0b4

riscv: code patching only works on !XIP_KERNEL

4y ago

Ming Lei

66a834d0

scsi: core: Fix error handling of scsi_host_alloc()

4y ago

Anna Schumaker

476bdb04

NFS: Fix use-after-free in nfs4_init_client()

4y ago

Linus Torvalds

cd1245d7

Merge tag 'platform-drivers-x86-v5.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

4y ago

Alaa Hleihel

2ba0aa2f

IB/mlx5: Fix initializing CQ fragments buffer

4y ago

Zhen Lei

08b2b6fd

cgroup: fix spelling mistakes

4y ago

Linus Torvalds

0d506588

Merge tag 'staging-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

4y ago

Andy Shevchenko

7c3e8d9d

serial: 8250_exar: Avoid NULL pointer dereference at ->exit()

4y ago

Greg Kroah-Hartman

abd06288

Revert "usb: gadget: fsl: Re-enable driver for ARM SoCs"

4y ago

Johan Hovold

63a8eef7

USB: serial: cp210x: fix CP2102N-A01 modem control

4y ago

Vitaly Wool

5e63215c

riscv: xip: support runtime trap patching

4y ago

Ewan D. Milne

e57f5cd9

scsi: scsi_devinfo: Add blacklist entry for HPE OPEN-V

4y ago

Scott Mayhew

0b4f132b

NFS: Ensure the NFS_CAP_SECURITY_LABEL capability is set when appropriate

4y ago

Linus Torvalds

a4c30b86

Merge tag 'compiler-attributes-for-linus-v5.13-rc6' of git://github.com/ojeda/linux

4y ago

Mykola Kostenok

701b54bc

platform/mellanox: mlxreg-hotplug: Revert "move to use request_irq by IRQF_NO_AUTOEN flag"

4y ago

Aharon Landau

6466f03f

RDMA/mlx5: Delete right entry from MR signature database

The value mr->sig is stored in the entry upon mr allocation, however, ibmr
is wrongly entered here as "old", therefore, xa_cmpxchg() does not replace
the entry with NULL, which leads to the following trace:

WARNING: CPU: 28 PID: 2078 at drivers/infiniband/hw/mlx5/main.c:3643 mlx5_ib_stage_init_cleanup+0x4d/0x60 [mlx5_ib]
Modules linked in: nvme_rdma nvme_fabrics nvme_core 8021q garp mrp bonding bridge stp llc rfkill rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_tad
CPU: 28 PID: 2078 Comm: reboot Tainted: G X --------- --- 5.13.0-0.rc2.19.el9.x86_64 #1
Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 2.9.1 12/07/2018
RIP: 0010:mlx5_ib_stage_init_cleanup+0x4d/0x60 [mlx5_ib]
Code: 8d bb 70 1f 00 00 be 00 01 00 00 e8 9d 94 ce da 48 3d 00 01 00 00 75 02 5b c3 0f 0b 5b c3 0f 0b 48 83 bb b0 20 00 00 00 74 d5 <0f> 0b eb d1 4
RSP: 0018:ffffa8db06d33c90 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff97f890a44000 RCX: ffff97f900ec0160
RDX: 0000000000000000 RSI: 0000000080080001 RDI: ffff97f890a44000
RBP: ffffffffc0c189b8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000300 R12: ffff97f890a44000
R13: ffffffffc0c36030 R14: 00000000fee1dead R15: 0000000000000000
FS: 00007f0d5a8a3b40(0000) GS:ffff98077fb80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555acbf4f450 CR3: 00000002a6f56002 CR4: 00000000001706e0
Call Trace:
mlx5r_remove+0x39/0x60 [mlx5_ib]
auxiliary_bus_remove+0x1b/0x30
__device_release_driver+0x17a/0x230
device_release_driver+0x24/0x30
bus_remove_device+0xdb/0x140
device_del+0x18b/0x3e0
mlx5_detach_device+0x59/0x90 [mlx5_core]
mlx5_unload_one+0x22/0x60 [mlx5_core]
shutdown+0x31/0x3a [mlx5_core]
pci_device_shutdown+0x34/0x60
device_shutdown+0x15b/0x1c0
__do_sys_reboot.cold+0x2f/0x5b
? vfs_writev+0xc7/0x140
? handle_mm_fault+0xc5/0x290
? do_writev+0x6b/0x110
do_syscall_64+0x40/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: e6fb246ccafb ("RDMA/mlx5: Consolidate MR destruction to mlx5_ib_dereg_mr()")
Link: https://lore.kernel.org/r/f3f585ea0db59c2a78f94f65eedeafc5a2374993.1623309971.git.leonro@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

4y ago

Shakeel Butt

45e1ba40

cgroup: disable controllers at parse time

4y ago

Linus Torvalds

87a7f736

Merge tag 'driver-core-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

4y ago

Lars-Peter Clausen

e9de1eca

staging: ralink-gdma: Remove incorrect author information

4y ago

Linus Torvalds

8124c8a6

Linux 5.13-rc4 v5.13-rc4

4y ago

Bjorn Andersson

142d0b24

usb: typec: mux: Fix copy-paste mistake in typec_mux_match

4y ago

Stefan Agner

6f7ec77c

USB: serial: cp210x: fix alternate function for CP2102N QFN20

4y ago

Palmer Dabbelt

160ce364

Merge remote-tracking branch 'riscv/riscv-wx-mappings' into fixes

4y ago

Stanley Chu

2c89e413

scsi: ufs: ufs-mediatek: Fix HCI version in some platforms

4y ago

Dai Ngo

f8849e20

NFSv4: nfs4_proc_set_acl needs to restore NFS_CAP_UIDGID_NOMAP on error.

4y ago

Linus Torvalds

a25b088c

Merge tag 'clang-format-for-linus-v5.13-rc6' of git://github.com/ojeda/linux

4y ago

Wei Ming Chen

ca0760e7

Compiler Attributes: Add continue in comment

4y ago

Maximilian Luz

6325ce15

platform/surface: dtx: Add missing mutex_destroy() call in failure path

4y ago

Maor Gottlieb

2adcb4c5

RDMA: Verify port when creating flow rule

4y ago

Linus Torvalds

c3d0e3fd

Merge tag 'fs.idmapped.mount_setattr.v5.13-rc3' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux

4y ago

Linus Torvalds

1dfa2e77

Merge tag 'char-misc-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

4y ago

Dietmar Eggemann

f501b6a2

debugfs: Fix debugfs_read_file_str()

4y ago

Wenli Looi

43c85d77

staging: rtl8723bs: Fix uninitialized variables

The sinfo.pertid and sinfo.generation variables are not initialized and
it causes a crash when we use this as a wireless access point.

[ 456.873025] ------------[ cut here ]------------
[ 456.878198] kernel BUG at mm/slub.c:3968!
[ 456.882680] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM

[ snip ]

[ 457.271004] Backtrace:
[ 457.273733] [<c02b7ee4>] (kfree) from [<c0e2a470>] (nl80211_send_station+0x954/0xfc4)
[ 457.282481] r9:eccca0c0 r8:e8edfec0 r7:00000000 r6:00000011 r5:e80a9480 r4:e8edfe00
[ 457.291132] [<c0e29b1c>] (nl80211_send_station) from [<c0e2b18c>] (cfg80211_new_sta+0x90/0x1cc)
[ 457.300850] r10:e80a9480 r9:e8edfe00 r8:ea678cca r7:00000a20 r6:00000000 r5:ec46d000
[ 457.309586] r4:ec46d9e0
[ 457.312433] [<c0e2b0fc>] (cfg80211_new_sta) from [<bf086684>] (rtw_cfg80211_indicate_sta_assoc+0x80/0x9c [r8723bs])
[ 457.324095] r10:00009930 r9:e85b9d80 r8:bf091050 r7:00000000 r6:00000000 r5:0000001c
[ 457.332831] r4:c1606788
[ 457.335692] [<bf086604>] (rtw_cfg80211_indicate_sta_assoc [r8723bs]) from [<bf03df38>] (rtw_stassoc_event_callback+0x1c8/0x1d4 [r8723bs])
[ 457.349489] r7:ea678cc0 r6:000000a1 r5:f1225f84 r4:f086b000
[ 457.355845] [<bf03dd70>] (rtw_stassoc_event_callback [r8723bs]) from [<bf048e4c>] (mlme_evt_hdl+0x8c/0xb4 [r8723bs])
[ 457.367601] r7:c1604900 r6:f086c4b8 r5:00000000 r4:f086c000
[ 457.373959] [<bf048dc0>] (mlme_evt_hdl [r8723bs]) from [<bf03693c>] (rtw_cmd_thread+0x198/0x3d8 [r8723bs])
[ 457.384744] r5:f086e000 r4:f086c000
[ 457.388754] [<bf0367a4>] (rtw_cmd_thread [r8723bs]) from [<c014a214>] (kthread+0x170/0x174)
[ 457.398083] r10:ed7a57e8 r9:bf0367a4 r8:f086b000 r7:e8ede000 r6:00000000 r5:e9975200
[ 457.406828] r4:e8369900
[ 457.409653] [<c014a0a4>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[ 457.417718] Exception stack(0xe8edffb0 to 0xe8edfff8)
[ 457.423356] ffa0: 00000000 00000000 00000000 00000000
[ 457.432492] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 457.441618] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 457.449006] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c014a0a4
[ 457.457750] r4:e9975200
[ 457.460574] Code: 1a000003 e5953004 e3130001 1a000000 (e7f001f2)
[ 457.467381] ---[ end trace 4acbc8c15e9e6aa7 ]---

Link: https://forum.armbian.com/topic/14727-wifi-ap-kernel-bug-in-kernel-5444/
Fixes: 8689c051a201 ("cfg80211: dynamically allocate per-tid stats for station info")
Fixes: f5ea9120be2e ("nl80211: add generation number to all dumps")
Signed-off-by: Wenli Looi <wlooi@ucalgary.ca>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210608064620.74059-1-wlooi@ucalgary.ca
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>