tjh.dev/kernel at 7907a021e4bbfa29cccacd2ba2dade894d9a7d4c

Commit f5ce815f34bc ("scsi: target: tcmu: Support DATA_BLOCK_SIZE = N *
PAGE_SIZE") introduced xas_next() calls to iterate xarray elements. These
calls triggered the WARNING "suspicious RCU usage" at tcmu device set up
[1]. In the call stack of xas_next(), xas_load() was called. According to
its comment, this function requires "the xa_lock or the RCU lock".

To avoid the warning:

- Guard the small loop calling xas_next() in tcmu_get_empty_block with RCU
lock.

- In the large loop in tcmu_copy_data using RCU lock would possibly
disable preemtion for a long time (copy multi MBs). Therefore replace
XA_STATE, xas_set and xas_next with a single xa_load.

[1]

[ 1899.867091] =============================
[ 1899.871199] WARNING: suspicious RCU usage
[ 1899.875310] 5.13.0-rc1+ #41 Not tainted
[ 1899.879222] -----------------------------
[ 1899.883299] include/linux/xarray.h:1182 suspicious rcu_dereference_check() usage!
[ 1899.890940] other info that might help us debug this:
[ 1899.899082] rcu_scheduler_active = 2, debug_locks = 1
[ 1899.905719] 3 locks held by kworker/0:1/1368:
[ 1899.910161] #0: ffffa1f8c8b98738 ((wq_completion)target_submission){+.+.}-{0:0}, at: process_one_work+0x1ee/0x580
[ 1899.920732] #1: ffffbd7040cd7e78 ((work_completion)(&q->sq.work)){+.+.}-{0:0}, at: process_one_work+0x1ee/0x580
[ 1899.931146] #2: ffffa1f8d1c99768 (&udev->cmdr_lock){+.+.}-{3:3}, at: tcmu_queue_cmd+0xea/0x160 [target_core_user]
[ 1899.941678] stack backtrace:
[ 1899.946093] CPU: 0 PID: 1368 Comm: kworker/0:1 Not tainted 5.13.0-rc1+ #41
[ 1899.953070] Hardware name: System manufacturer System Product Name/PRIME Z270-A, BIOS 1302 03/15/2018
[ 1899.962459] Workqueue: target_submission target_queued_submit_work [target_core_mod]
[ 1899.970337] Call Trace:
[ 1899.972839] dump_stack+0x6d/0x89
[ 1899.976222] xas_descend+0x10e/0x120
[ 1899.979875] xas_load+0x39/0x50
[ 1899.983077] tcmu_get_empty_blocks+0x115/0x1c0 [target_core_user]
[ 1899.989318] queue_cmd_ring+0x1da/0x630 [target_core_user]
[ 1899.994897] ? rcu_read_lock_sched_held+0x3f/0x70
[ 1899.999695] ? trace_kmalloc+0xa6/0xd0
[ 1900.003501] ? __kmalloc+0x205/0x380
[ 1900.007167] tcmu_queue_cmd+0x12f/0x160 [target_core_user]
[ 1900.012746] __target_execute_cmd+0x23/0xa0 [target_core_mod]
[ 1900.018589] transport_generic_new_cmd+0x1f3/0x370 [target_core_mod]
[ 1900.025046] transport_handle_cdb_direct+0x34/0x50 [target_core_mod]
[ 1900.031517] target_queued_submit_work+0x43/0xe0 [target_core_mod]
[ 1900.037837] process_one_work+0x268/0x580
[ 1900.041952] ? process_one_work+0x580/0x580
[ 1900.046195] worker_thread+0x55/0x3b0
[ 1900.049921] ? process_one_work+0x580/0x580
[ 1900.054192] kthread+0x143/0x160
[ 1900.057499] ? kthread_create_worker_on_cpu+0x40/0x40
[ 1900.062661] ret_from_fork+0x1f/0x30

Link: https://lore.kernel.org/r/20210519135440.26773-1-bostroesser@gmail.com
Fixes: f5ce815f34bc ("scsi: target: tcmu: Support DATA_BLOCK_SIZE = N * PAGE_SIZE")
Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

b4150b68

Bodo Stroesser

4 years ago

scsi: target: core: Avoid smp_processor_id() in preemptible code

The BUG message "BUG: using smp_processor_id() in preemptible [00000000]
code" was observed for TCMU devices with kernel config DEBUG_PREEMPT.

The message was observed when blktests block/005 was run on TCMU devices
with fileio backend or user:zbc backend [1]. The commit 1130b499b4a7
("scsi: target: tcm_loop: Use LIO wq cmd submission helper") triggered the
symptom. The commit modified work queue to handle commands and changed
'current->nr_cpu_allowed' at smp_processor_id() call.

The message was also observed at system shutdown when TCMU devices were not
cleaned up [2]. The function smp_processor_id() was called in SCSI host
work queue for abort handling, and triggered the BUG message. This symptom
was observed regardless of the commit 1130b499b4a7 ("scsi: target:
tcm_loop: Use LIO wq cmd submission helper").

To avoid the preemptible code check at smp_processor_id(), get CPU ID with
raw_smp_processor_id() instead. The CPU ID is used for performance
improvement then thread move to other CPU will not affect the code.

[1]

[ 56.468103] run blktests block/005 at 2021-05-12 14:16:38
[ 57.369473] check_preemption_disabled: 85 callbacks suppressed
[ 57.369480] BUG: using smp_processor_id() in preemptible [00000000] code: fio/1511
[ 57.369506] BUG: using smp_processor_id() in preemptible [00000000] code: fio/1510
[ 57.369512] BUG: using smp_processor_id() in preemptible [00000000] code: fio/1506
[ 57.369552] caller is __target_init_cmd+0x157/0x170 [target_core_mod]
[ 57.369606] CPU: 4 PID: 1506 Comm: fio Not tainted 5.13.0-rc1+ #34
[ 57.369613] Hardware name: System manufacturer System Product Name/PRIME Z270-A, BIOS 1302 03/15/2018
[ 57.369617] Call Trace:
[ 57.369621] BUG: using smp_processor_id() in preemptible [00000000] code: fio/1507
[ 57.369628] dump_stack+0x6d/0x89
[ 57.369642] check_preemption_disabled+0xc8/0xd0
[ 57.369628] caller is __target_init_cmd+0x157/0x170 [target_core_mod]
[ 57.369655] __target_init_cmd+0x157/0x170 [target_core_mod]
[ 57.369695] target_init_cmd+0x76/0x90 [target_core_mod]
[ 57.369732] tcm_loop_queuecommand+0x109/0x210 [tcm_loop]
[ 57.369744] scsi_queue_rq+0x38e/0xc40
[ 57.369761] __blk_mq_try_issue_directly+0x109/0x1c0
[ 57.369779] blk_mq_try_issue_directly+0x43/0x90
[ 57.369790] blk_mq_submit_bio+0x4e5/0x5d0
[ 57.369812] submit_bio_noacct+0x46e/0x4e0
[ 57.369830] __blkdev_direct_IO_simple+0x1a3/0x2d0
[ 57.369859] ? set_init_blocksize.isra.0+0x60/0x60
[ 57.369880] generic_file_read_iter+0x89/0x160
[ 57.369898] blkdev_read_iter+0x44/0x60
[ 57.369906] new_sync_read+0x102/0x170
[ 57.369929] vfs_read+0xd4/0x160
[ 57.369941] __x64_sys_pread64+0x6e/0xa0
[ 57.369946] ? lockdep_hardirqs_on+0x79/0x100
[ 57.369958] do_syscall_64+0x3a/0x70
[ 57.369965] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 57.369973] RIP: 0033:0x7f7ed4c1399f
[ 57.369979] Code: 08 89 3c 24 48 89 4c 24 18 e8 7d f3 ff ff 4c 8b 54 24 18 48 8b 54 24 10 41 89 c0 48 8b 74 24 08 8b 3c 24 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 04 24 e8 cd f3 ff ff 48 8b
[ 57.369983] RSP: 002b:00007ffd7918c580 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
[ 57.369990] RAX: ffffffffffffffda RBX: 00000000015b4540 RCX: 00007f7ed4c1399f
[ 57.369993] RDX: 0000000000001000 RSI: 00000000015de000 RDI: 0000000000000009
[ 57.369996] RBP: 00000000015b4540 R08: 0000000000000000 R09: 0000000000000001
[ 57.369999] R10: 0000000000e5c000 R11: 0000000000000293 R12: 00007f7eb5269a70
[ 57.370002] R13: 0000000000000000 R14: 0000000000001000 R15: 00000000015b4568
[ 57.370031] CPU: 7 PID: 1507 Comm: fio Not tainted 5.13.0-rc1+ #34
[ 57.370036] Hardware name: System manufacturer System Product Name/PRIME Z270-A, BIOS 1302 03/15/2018
[ 57.370039] Call Trace:
[ 57.370045] dump_stack+0x6d/0x89
[ 57.370056] check_preemption_disabled+0xc8/0xd0
[ 57.370068] __target_init_cmd+0x157/0x170 [target_core_mod]
[ 57.370121] target_init_cmd+0x76/0x90 [target_core_mod]
[ 57.370178] tcm_loop_queuecommand+0x109/0x210 [tcm_loop]
[ 57.370197] scsi_queue_rq+0x38e/0xc40
[ 57.370224] __blk_mq_try_issue_directly+0x109/0x1c0
...

[2]

[ 117.458597] BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u16:8
[ 117.467279] caller is __target_init_cmd+0x157/0x170 [target_core_mod]
[ 117.473893] CPU: 1 PID: 418 Comm: kworker/u16:6 Not tainted 5.13.0-rc1+ #34
[ 117.481150] Hardware name: System manufacturer System Product Name/PRIME Z270-A, BIOS 8
[ 117.481153] Workqueue: scsi_tmf_7 scmd_eh_abort_handler
[ 117.481156] Call Trace:
[ 117.481158] dump_stack+0x6d/0x89
[ 117.481162] check_preemption_disabled+0xc8/0xd0
[ 117.512575] target_submit_tmr+0x41/0x150 [target_core_mod]
[ 117.519705] tcm_loop_issue_tmr+0xa7/0x100 [tcm_loop]
[ 117.524913] tcm_loop_abort_task+0x43/0x60 [tcm_loop]
[ 117.530137] scmd_eh_abort_handler+0x7b/0x230
[ 117.534681] process_one_work+0x268/0x580
[ 117.538862] worker_thread+0x55/0x3b0
[ 117.542652] ? process_one_work+0x580/0x580
[ 117.548351] kthread+0x143/0x160
[ 117.551675] ? kthread_create_worker_on_cpu+0x40/0x40
[ 117.556873] ret_from_fork+0x1f/0x30

Link: https://lore.kernel.org/r/20210515070315.215801-1-shinichiro.kawasaki@wdc.com
Fixes: 1526d9f10c61 ("scsi: target: Make state_list per CPU")
Cc: stable@vger.kernel.org # v5.11+
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

70ca3c57

Shin'ichiro Kawasaki

4 years ago

scsi: pm80xx: Fix drives missing during rmmod/insmod loop

d1acd81b

Ajish Koshy

5 years ago

scsi: qla2xxx: Fix error return code in qla82xx_write_flash_dword()

5cb289bf

Zhen Lei

5 years ago

scsi: qedf: Add pointer checks in qedf_update_link_speed()

73578af9

Javed Hasan

5 years ago

scsi: ufs: core: Increase the usable queue depth

d0b2b70e

Bart Van Assche

5 years ago

branches 3

master 18 hours ago default

compare

nocache-cleanup 3 weeks ago

compare

for-next 1 year ago

compare

tags 927

v7.0

1 week ago latest

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.

Configure Feed

Configure Feed

Clone this repository