[PATCH] sbp2: fix spinlock recursion

sbp2util_mark_command_completed takes a lock which was already taken by
sbp2scsi_complete_all_commands. This is a regression in Linux 2.6.15.

Reported by Kristian Harms at
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=187394

[ More complete commentary, as response to questions by Andrew: ]

> This changes the call environment for all implementations of
> ->Current_done(). Are they all safe to call under this lock?

Short answer: Yes, trust me. ;-) Long answer:

The done() callbacks are passed on to sbp2 from the SCSI stack along
with each SCSI command via the queuecommand hook. The done() callback
is safe to call in atomic context. So does
Documentation/scsi/scsi_mid_low_api.txt say, and many if not all SCSI
low-level handlers rely on this fact. So whatever this callback does,
it is "self-contained" and it won't conflict with sbp2's internal ORB
list handling. In particular, it won't race with the
sbp2_command_orb_lock.

Moreover, sbp2 already calls the done() handler with
sbp2_command_orb_lock taken in sbp2scsi_complete_all_commands(). I
admit this is ultimately no proof of correctness, especially since this
portion of code introduced the spinlock recursion in the first place and
we didn't realize it since this code's submission before 2.6.15 until
now. (I have learned a lesson from this.)

I stress-tested my patch on x86 uniprocessor with a preemptible SMP
kernel (alas I have no SMP machine yet) and made sure that all code
paths which involve the sbp2_command_orb_lock were gone through multiple
times.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

authored by Stefan Richter and committed by Linus Torvalds 24c7cd06 b043b673

+15 -17
+15 -17
drivers/ieee1394/sbp2.c
··· 496 496 /* 497 497 * This function finds the sbp2_command for a given outstanding SCpnt. 498 498 * Only looks at the inuse list. 499 + * Must be called with scsi_id->sbp2_command_orb_lock held. 499 500 */ 500 - static struct sbp2_command_info *sbp2util_find_command_for_SCpnt(struct scsi_id_instance_data *scsi_id, void *SCpnt) 501 + static struct sbp2_command_info *sbp2util_find_command_for_SCpnt( 502 + struct scsi_id_instance_data *scsi_id, void *SCpnt) 501 503 { 502 504 struct sbp2_command_info *command; 503 - unsigned long flags; 504 505 505 - spin_lock_irqsave(&scsi_id->sbp2_command_orb_lock, flags); 506 - if (!list_empty(&scsi_id->sbp2_command_orb_inuse)) { 507 - list_for_each_entry(command, &scsi_id->sbp2_command_orb_inuse, list) { 508 - if (command->Current_SCpnt == SCpnt) { 509 - spin_unlock_irqrestore(&scsi_id->sbp2_command_orb_lock, flags); 506 + if (!list_empty(&scsi_id->sbp2_command_orb_inuse)) 507 + list_for_each_entry(command, &scsi_id->sbp2_command_orb_inuse, list) 508 + if (command->Current_SCpnt == SCpnt) 510 509 return command; 511 - } 512 - } 513 - } 514 - spin_unlock_irqrestore(&scsi_id->sbp2_command_orb_lock, flags); 515 510 return NULL; 516 511 } 517 512 ··· 575 580 576 581 /* 577 582 * This function moves a command to the completed orb list. 583 + * Must be called with scsi_id->sbp2_command_orb_lock held. 578 584 */ 579 - static void sbp2util_mark_command_completed(struct scsi_id_instance_data *scsi_id, 580 - struct sbp2_command_info *command) 585 + static void sbp2util_mark_command_completed( 586 + struct scsi_id_instance_data *scsi_id, 587 + struct sbp2_command_info *command) 581 588 { 582 - unsigned long flags; 583 - 584 - spin_lock_irqsave(&scsi_id->sbp2_command_orb_lock, flags); 585 589 list_del(&command->list); 586 590 sbp2util_free_command_dma(command); 587 591 list_add_tail(&command->list, &scsi_id->sbp2_command_orb_completed); 588 - spin_unlock_irqrestore(&scsi_id->sbp2_command_orb_lock, flags); 589 592 } 590 593 591 594 /* ··· 2141 2148 * Matched status with command, now grab scsi command pointers and check status 2142 2149 */ 2143 2150 SCpnt = command->Current_SCpnt; 2151 + spin_lock_irqsave(&scsi_id->sbp2_command_orb_lock, flags); 2144 2152 sbp2util_mark_command_completed(scsi_id, command); 2153 + spin_unlock_irqrestore(&scsi_id->sbp2_command_orb_lock, flags); 2145 2154 2146 2155 if (SCpnt) { 2147 2156 ··· 2479 2484 (struct scsi_id_instance_data *)SCpnt->device->host->hostdata[0]; 2480 2485 struct sbp2scsi_host_info *hi = scsi_id->hi; 2481 2486 struct sbp2_command_info *command; 2487 + unsigned long flags; 2482 2488 2483 2489 SBP2_ERR("aborting sbp2 command"); 2484 2490 scsi_print_command(SCpnt); ··· 2490 2494 * Right now, just return any matching command structures 2491 2495 * to the free pool. 2492 2496 */ 2497 + spin_lock_irqsave(&scsi_id->sbp2_command_orb_lock, flags); 2493 2498 command = sbp2util_find_command_for_SCpnt(scsi_id, SCpnt); 2494 2499 if (command) { 2495 2500 SBP2_DEBUG("Found command to abort"); ··· 2508 2511 command->Current_done(command->Current_SCpnt); 2509 2512 } 2510 2513 } 2514 + spin_unlock_irqrestore(&scsi_id->sbp2_command_orb_lock, flags); 2511 2515 2512 2516 /* 2513 2517 * Initiate a fetch agent reset.