Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

firewire: fw-core: react on bus resets while the config ROM is being fetched

read_rom() obtained a fresh new fw_device.generation for each read
transaction. Hence it was able to continue reading in the middle of the
ROM even if a bus reset happened. However the device may have modified
the ROM during the reset. We would end up with a corrupt fetched ROM
image then.

Although all of this is quite unlikely, it is not impossible.
Therefore we now restart reading the ROM if the bus generation changed.

Note, the memory barrier in read_rom() is still necessary according to
tests by Jarod Wilson, despite of the ->generation access being moved up
in the call chain.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>

This is essentially what I've been beating on locally, and I've yet to hit
another config rom read failure with it.

Signed-off-by: Jarod Wilson <jwilson@redhat.com>

+17 -8
+17 -8
drivers/firewire/fw-device.c
··· 390 390 complete(&callback_data->done); 391 391 } 392 392 393 - static int read_rom(struct fw_device *device, int index, u32 * data) 393 + static int 394 + read_rom(struct fw_device *device, int generation, int index, u32 *data) 394 395 { 395 396 struct read_quadlet_callback_data callback_data; 396 397 struct fw_transaction t; 397 398 u64 offset; 398 - int generation = device->generation; 399 399 400 400 /* device->node_id, accessed below, must not be older than generation */ 401 401 smp_rmb(); ··· 414 414 return callback_data.rcode; 415 415 } 416 416 417 - static int read_bus_info_block(struct fw_device *device) 417 + /* 418 + * Read the bus info block, perform a speed probe, and read all of the rest of 419 + * the config ROM. We do all this with a cached bus generation. If the bus 420 + * generation changes under us, read_bus_info_block will fail and get retried. 421 + * It's better to start all over in this case because the node from which we 422 + * are reading the ROM may have changed the ROM during the reset. 423 + */ 424 + static int read_bus_info_block(struct fw_device *device, int generation) 418 425 { 419 426 static u32 rom[256]; 420 427 u32 stack[16], sp, key; ··· 431 424 432 425 /* First read the bus info block. */ 433 426 for (i = 0; i < 5; i++) { 434 - if (read_rom(device, i, &rom[i]) != RCODE_COMPLETE) 427 + if (read_rom(device, generation, i, &rom[i]) != RCODE_COMPLETE) 435 428 return -1; 436 429 /* 437 430 * As per IEEE1212 7.2, during power-up, devices can ··· 466 459 device->max_speed = device->card->link_speed; 467 460 468 461 while (device->max_speed > SCODE_100) { 469 - if (read_rom(device, 0, &dummy) == RCODE_COMPLETE) 462 + if (read_rom(device, generation, 0, &dummy) == 463 + RCODE_COMPLETE) 470 464 break; 471 465 device->max_speed--; 472 466 } ··· 500 492 return -1; 501 493 502 494 /* Read header quadlet for the block to get the length. */ 503 - if (read_rom(device, i, &rom[i]) != RCODE_COMPLETE) 495 + if (read_rom(device, generation, i, &rom[i]) != RCODE_COMPLETE) 504 496 return -1; 505 497 end = i + (rom[i] >> 16) + 1; 506 498 i++; ··· 519 511 * it references another block, and push it in that case. 520 512 */ 521 513 while (i < end) { 522 - if (read_rom(device, i, &rom[i]) != RCODE_COMPLETE) 514 + if (read_rom(device, generation, i, &rom[i]) != 515 + RCODE_COMPLETE) 523 516 return -1; 524 517 if ((key >> 30) == 3 && (rom[i] >> 30) > 1 && 525 518 sp < ARRAY_SIZE(stack)) ··· 667 658 * device. 668 659 */ 669 660 670 - if (read_bus_info_block(device) < 0) { 661 + if (read_bus_info_block(device, device->generation) < 0) { 671 662 if (device->config_rom_retries < MAX_RETRIES) { 672 663 device->config_rom_retries++; 673 664 schedule_delayed_work(&device->work, RETRY_DELAY);