[PATCH] sbp2: fix spinlock recursion

sbp2util_mark_command_completed takes a lock which was already taken by
sbp2scsi_complete_all_commands. This is a regression in Linux 2.6.15.

Reported by Kristian Harms at
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=187394

[ More complete commentary, as response to questions by Andrew: ]

> This changes the call environment for all implementations of
> ->Current_done(). Are they all safe to call under this lock?

Short answer: Yes, trust me. ;-) Long answer:

The done() callbacks are passed on to sbp2 from the SCSI stack along
with each SCSI command via the queuecommand hook. The done() callback
is safe to call in atomic context. So does
Documentation/scsi/scsi_mid_low_api.txt say, and many if not all SCSI
low-level handlers rely on this fact. So whatever this callback does,
it is "self-contained" and it won't conflict with sbp2's internal ORB
list handling. In particular, it won't race with the
sbp2_command_orb_lock.

Moreover, sbp2 already calls the done() handler with
sbp2_command_orb_lock taken in sbp2scsi_complete_all_commands(). I
admit this is ultimately no proof of correctness, especially since this
portion of code introduced the spinlock recursion in the first place and
we didn't realize it since this code's submission before 2.6.15 until
now. (I have learned a lesson from this.)

I stress-tested my patch on x86 uniprocessor with a preemptible SMP
kernel (alas I have no SMP machine yet) and made sure that all code
paths which involve the sbp2_command_orb_lock were gone through multiple
times.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

authored by Stefan Richter and committed by Linus Torvalds 24c7cd06 b043b673

+15 -17
+15 -17
drivers/ieee1394/sbp2.c
··· 496 /* 497 * This function finds the sbp2_command for a given outstanding SCpnt. 498 * Only looks at the inuse list. 499 */ 500 - static struct sbp2_command_info *sbp2util_find_command_for_SCpnt(struct scsi_id_instance_data *scsi_id, void *SCpnt) 501 { 502 struct sbp2_command_info *command; 503 - unsigned long flags; 504 505 - spin_lock_irqsave(&scsi_id->sbp2_command_orb_lock, flags); 506 - if (!list_empty(&scsi_id->sbp2_command_orb_inuse)) { 507 - list_for_each_entry(command, &scsi_id->sbp2_command_orb_inuse, list) { 508 - if (command->Current_SCpnt == SCpnt) { 509 - spin_unlock_irqrestore(&scsi_id->sbp2_command_orb_lock, flags); 510 return command; 511 - } 512 - } 513 - } 514 - spin_unlock_irqrestore(&scsi_id->sbp2_command_orb_lock, flags); 515 return NULL; 516 } 517 ··· 575 576 /* 577 * This function moves a command to the completed orb list. 578 */ 579 - static void sbp2util_mark_command_completed(struct scsi_id_instance_data *scsi_id, 580 - struct sbp2_command_info *command) 581 { 582 - unsigned long flags; 583 - 584 - spin_lock_irqsave(&scsi_id->sbp2_command_orb_lock, flags); 585 list_del(&command->list); 586 sbp2util_free_command_dma(command); 587 list_add_tail(&command->list, &scsi_id->sbp2_command_orb_completed); 588 - spin_unlock_irqrestore(&scsi_id->sbp2_command_orb_lock, flags); 589 } 590 591 /* ··· 2141 * Matched status with command, now grab scsi command pointers and check status 2142 */ 2143 SCpnt = command->Current_SCpnt; 2144 sbp2util_mark_command_completed(scsi_id, command); 2145 2146 if (SCpnt) { 2147 ··· 2479 (struct scsi_id_instance_data *)SCpnt->device->host->hostdata[0]; 2480 struct sbp2scsi_host_info *hi = scsi_id->hi; 2481 struct sbp2_command_info *command; 2482 2483 SBP2_ERR("aborting sbp2 command"); 2484 scsi_print_command(SCpnt); ··· 2490 * Right now, just return any matching command structures 2491 * to the free pool. 2492 */ 2493 command = sbp2util_find_command_for_SCpnt(scsi_id, SCpnt); 2494 if (command) { 2495 SBP2_DEBUG("Found command to abort"); ··· 2508 command->Current_done(command->Current_SCpnt); 2509 } 2510 } 2511 2512 /* 2513 * Initiate a fetch agent reset.
··· 496 /* 497 * This function finds the sbp2_command for a given outstanding SCpnt. 498 * Only looks at the inuse list. 499 + * Must be called with scsi_id->sbp2_command_orb_lock held. 500 */ 501 + static struct sbp2_command_info *sbp2util_find_command_for_SCpnt( 502 + struct scsi_id_instance_data *scsi_id, void *SCpnt) 503 { 504 struct sbp2_command_info *command; 505 506 + if (!list_empty(&scsi_id->sbp2_command_orb_inuse)) 507 + list_for_each_entry(command, &scsi_id->sbp2_command_orb_inuse, list) 508 + if (command->Current_SCpnt == SCpnt) 509 return command; 510 return NULL; 511 } 512 ··· 580 581 /* 582 * This function moves a command to the completed orb list. 583 + * Must be called with scsi_id->sbp2_command_orb_lock held. 584 */ 585 + static void sbp2util_mark_command_completed( 586 + struct scsi_id_instance_data *scsi_id, 587 + struct sbp2_command_info *command) 588 { 589 list_del(&command->list); 590 sbp2util_free_command_dma(command); 591 list_add_tail(&command->list, &scsi_id->sbp2_command_orb_completed); 592 } 593 594 /* ··· 2148 * Matched status with command, now grab scsi command pointers and check status 2149 */ 2150 SCpnt = command->Current_SCpnt; 2151 + spin_lock_irqsave(&scsi_id->sbp2_command_orb_lock, flags); 2152 sbp2util_mark_command_completed(scsi_id, command); 2153 + spin_unlock_irqrestore(&scsi_id->sbp2_command_orb_lock, flags); 2154 2155 if (SCpnt) { 2156 ··· 2484 (struct scsi_id_instance_data *)SCpnt->device->host->hostdata[0]; 2485 struct sbp2scsi_host_info *hi = scsi_id->hi; 2486 struct sbp2_command_info *command; 2487 + unsigned long flags; 2488 2489 SBP2_ERR("aborting sbp2 command"); 2490 scsi_print_command(SCpnt); ··· 2494 * Right now, just return any matching command structures 2495 * to the free pool. 2496 */ 2497 + spin_lock_irqsave(&scsi_id->sbp2_command_orb_lock, flags); 2498 command = sbp2util_find_command_for_SCpnt(scsi_id, SCpnt); 2499 if (command) { 2500 SBP2_DEBUG("Found command to abort"); ··· 2511 command->Current_done(command->Current_SCpnt); 2512 } 2513 } 2514 + spin_unlock_irqrestore(&scsi_id->sbp2_command_orb_lock, flags); 2515 2516 /* 2517 * Initiate a fetch agent reset.