Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

[SCSI] scsi_remove_target: fix softlockup regression on hot remove

John reports:
BUG: soft lockup - CPU#2 stuck for 23s! [kworker/u:8:2202]
[..]
Call Trace:
[<ffffffff8141782a>] scsi_remove_target+0xda/0x1f0
[<ffffffff81421de5>] sas_rphy_remove+0x55/0x60
[<ffffffff81421e01>] sas_rphy_delete+0x11/0x20
[<ffffffff81421e35>] sas_port_delete+0x25/0x160
[<ffffffff814549a3>] mptsas_del_end_device+0x183/0x270

...introduced by commit 3b661a9 "[SCSI] fix hot unplug vs async scan race".

Don't restart lookup of more stargets in the multi-target case, just
arrange to traverse the list once, on the assumption that new targets
are always added at the end. There is no guarantee that the target will
change state in scsi_target_reap() so we can end up spinning if we
restart.

Cc: <stable@vger.kernel.org>
Acked-by: Jack Wang <jack_wang@usish.com>
LKML-Reference: <CAEhu1-6wq1YsNiscGMwP4ud0Q+MrViRzv=kcWCQSBNc8c68N5Q@mail.gmail.com>
Reported-by: John Drescher <drescherjm@gmail.com>
Tested-by: John Drescher <drescherjm@gmail.com>
Signed-off-by: Dan Williams <djbw@fb.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

authored by

Dan Williams and committed by
James Bottomley
bc3f02a7 225c5696

+14 -16
+14 -16
drivers/scsi/scsi_sysfs.c
··· 1031 1031 void scsi_remove_target(struct device *dev) 1032 1032 { 1033 1033 struct Scsi_Host *shost = dev_to_shost(dev->parent); 1034 - struct scsi_target *starget, *found; 1034 + struct scsi_target *starget, *last = NULL; 1035 1035 unsigned long flags; 1036 1036 1037 - restart: 1038 - found = NULL; 1037 + /* remove targets being careful to lookup next entry before 1038 + * deleting the last 1039 + */ 1039 1040 spin_lock_irqsave(shost->host_lock, flags); 1040 1041 list_for_each_entry(starget, &shost->__targets, siblings) { 1041 1042 if (starget->state == STARGET_DEL) 1042 1043 continue; 1043 1044 if (starget->dev.parent == dev || &starget->dev == dev) { 1044 - found = starget; 1045 - found->reap_ref++; 1046 - break; 1045 + /* assuming new targets arrive at the end */ 1046 + starget->reap_ref++; 1047 + spin_unlock_irqrestore(shost->host_lock, flags); 1048 + if (last) 1049 + scsi_target_reap(last); 1050 + last = starget; 1051 + __scsi_remove_target(starget); 1052 + spin_lock_irqsave(shost->host_lock, flags); 1047 1053 } 1048 1054 } 1049 1055 spin_unlock_irqrestore(shost->host_lock, flags); 1050 1056 1051 - if (found) { 1052 - __scsi_remove_target(found); 1053 - scsi_target_reap(found); 1054 - /* in the case where @dev has multiple starget children, 1055 - * continue removing. 1056 - * 1057 - * FIXME: does such a case exist? 1058 - */ 1059 - goto restart; 1060 - } 1057 + if (last) 1058 + scsi_target_reap(last); 1061 1059 } 1062 1060 EXPORT_SYMBOL(scsi_remove_target); 1063 1061