tjh.dev/kernel at f95077acac6d3235735a41cc5f25a024777399dd

While the ATA specification states that a device should return command
aborted for all commands queued after the device has entered the error
state, ATA only keeps the sense data for the latest command (in the
non-NCQ case), so we really don't want to send block layer commands to
the device after it has entered the error state. (Only ATA EH commands
should be sent, to read the sense data etc.)

Currently, scsi_queue_rq() will check if scsi_host_in_recovery()
(state is SHOST_RECOVERY), and if so, it will _not_ issue a command via:
scsi_dispatch_cmd() -> host->hostt->queuecommand() (ata_scsi_queuecmd())
-> __ata_scsi_queuecmd() -> ata_scsi_translate() -> ata_qc_issue()

Before commit e494f6a72839 ("[SCSI] improved eh timeout handler"),
when receiving a TFES error IRQ, the call chain looked like this:
ahci_error_intr() -> ata_port_abort() -> ata_do_link_abort() ->
ata_qc_complete() -> ata_qc_schedule_eh() -> blk_abort_request() ->
blk_rq_timed_out() -> q->rq_timed_out_fn() (scsi_times_out()) ->
scsi_eh_scmd_add() -> scsi_host_set_state(shost, SHOST_RECOVERY)

This meant that as soon as an error IRQ was serviced, SHOST_RECOVERY
would be set.

However, after commit e494f6a72839 ("[SCSI] improved eh timeout handler"),
scsi_times_out() will instead call scsi_abort_command() which will queue
delayed work, and the worker function scmd_eh_abort_handler() will call
scsi_eh_scmd_add(), which calls scsi_host_set_state(shost, SHOST_RECOVERY).

So now, after the TFES error IRQ has been serviced, we need to wait for
the SCSI workqueue to run its work before SHOST_RECOVERY gets set.
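The delay described above can be sketched as a small userspace model. This is only an illustration of the ordering, not kernel code: every `model_*` name is hypothetical, and the worker is simplified to always move the host into recovery, as the description above puts it.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the SCSI host states named above. */
enum model_host_state { MODEL_SHOST_RUNNING, MODEL_SHOST_RECOVERY };

struct model_host {
	enum model_host_state state;
	/* Models the delayed work queued by scsi_abort_command(). */
	bool abort_work_queued;
};

/* Models servicing the error IRQ after e494f6a72839: only queue work. */
static void model_error_irq(struct model_host *h)
{
	h->abort_work_queued = true;
}

/* Models the worker running later and setting the recovery state. */
static void model_run_abort_work(struct model_host *h)
{
	if (h->abort_work_queued) {
		h->abort_work_queued = false;
		h->state = MODEL_SHOST_RECOVERY;
	}
}

/* Models the scsi_host_in_recovery() check done by scsi_queue_rq(). */
static bool model_host_in_recovery(const struct model_host *h)
{
	return h->state == MODEL_SHOST_RECOVERY;
}
```

In this model, a caller that checks model_host_in_recovery() between model_error_irq() and model_run_abort_work() still sees a running host, which is the window in which a block layer command could be issued.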

It is worth noting that, even before commit e494f6a72839 ("[SCSI] improved
eh timeout handler"), we could receive an error IRQ between the time when
scsi_queue_rq() checks scsi_host_in_recovery() and the time when
ata_scsi_queuecmd() is actually called.

In order to handle both the delayed setting of SHOST_RECOVERY and the
window where we can receive an error IRQ, add a check against
ATA_PFLAG_EH_PENDING (which gets set when servicing the error IRQ),
inside ata_scsi_queuecmd() itself, while holding the ap->lock
(the ap->lock is also held while servicing IRQs).
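The idea of checking the flag under the same lock that the IRQ path takes can be sketched in userspace as follows. This is a minimal model under stated assumptions, not the patch itself: the `model_*` names, the pthread mutex standing in for ap->lock, and the MODEL_DEFERRED return value are all hypothetical. How the real code reports the deferral to the SCSI layer is not stated in this message.

```c
#include <pthread.h>

/* Stands in for ATA_PFLAG_EH_PENDING; the bit value is arbitrary. */
#define MODEL_PFLAG_EH_PENDING (1u << 0)

enum model_queue_result { MODEL_QUEUED, MODEL_DEFERRED };

struct model_port {
	pthread_mutex_t lock;	/* stands in for ap->lock */
	unsigned int pflags;	/* port flags, protected by lock */
	unsigned int issued;	/* commands handed to the device */
};

static void model_port_init(struct model_port *ap)
{
	pthread_mutex_init(&ap->lock, NULL);
	ap->pflags = 0;
	ap->issued = 0;
}

/* Models servicing the error IRQ: the flag is set under the port lock. */
static void model_port_error_irq(struct model_port *ap)
{
	pthread_mutex_lock(&ap->lock);
	ap->pflags |= MODEL_PFLAG_EH_PENDING;
	pthread_mutex_unlock(&ap->lock);
}

/* Models the queuecommand path: check the flag before issuing anything. */
static enum model_queue_result model_queuecmd(struct model_port *ap)
{
	enum model_queue_result ret = MODEL_QUEUED;

	pthread_mutex_lock(&ap->lock);
	if (ap->pflags & MODEL_PFLAG_EH_PENDING)
		ret = MODEL_DEFERRED;	/* EH pending: do not touch the device */
	else
		ap->issued++;		/* no error seen yet: issue the command */
	pthread_mutex_unlock(&ap->lock);

	return ret;
}
```

Because the check and the issue happen under the lock that the IRQ path also takes, an error IRQ is either fully serviced before the check or arrives only after the command has been issued; the check can never observe a half-updated state.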

Fixes: e494f6a72839 ("[SCSI] improved eh timeout handler")
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Tested-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>

e20e81a2

Niklas Cassel

3 years ago

ALSA: hda: fix potential memleak in 'add_widget_node'

9a5523f7

Ye Bin

3 years ago

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

5ad6e7ba

Linus Torvalds

3 years ago

docs: kmsan: fix formatting of "Example report"

436fa4a6

Alexander Potapenko

3 years ago

ata: libata-transport: fix error handling in ata_tdev_add()

1ff36351

Yang Yingliang

3 years ago

ALSA: memalloc: Don't fall back for SG-buffer with IOMMU

9736a325

Takashi Iwai

3 years ago

Merge tag 'block-6.1-2022-11-11' of git://git.kernel.dk/linux

b0b6e2c9

Linus Torvalds

3 years ago

arm64/syscall: Include asm/ptrace.h in syscall_wrapper header.

acfc35cf

Kuniyuki Iwashima

3 years ago

mm/damon/dbgfs: check if rm_contexts input is for a real context

1de09a72

SeongJae Park

3 years ago

ata: libata-transport: fix error handling in ata_tlink_add()

cf0816f6

Yang Yingliang

3 years ago

ALSA: usb-audio: add quirk to fix Hamedal C20 disconnect issue

bf990c10

Ai Chao

3 years ago

branches 3

master 1 day ago default

nocache-cleanup 3 weeks ago

for-next 1 year ago

tags 927

v7.0

2 weeks ago latest

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.
