Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

qed: Don't collect too many protection override GRC elements

In the protection override dump path, the firmware can return far too
many GRC elements, resulting in attempting to write past the end of the
previously-kmalloc'ed dump buffer.

This will result in a kernel panic with reason:

BUG: unable to handle kernel paging request at ADDRESS

where "ADDRESS" is just past the end of the protection override dump
buffer. The start address of the buffer is:
p_hwfn->cdev->dbg_features[DBG_FEATURE_PROTECTION_OVERRIDE].dump_buf
and the size of the buffer is buf_size in the same data structure.

The panic can be arrived at from either the qede Ethernet driver path:

[exception RIP: qed_grc_dump_addr_range+0x108]
qed_protection_override_dump at ffffffffc02662ed [qed]
qed_dbg_protection_override_dump at ffffffffc0267792 [qed]
qed_dbg_feature at ffffffffc026aa8f [qed]
qed_dbg_all_data at ffffffffc026b211 [qed]
qed_fw_fatal_reporter_dump at ffffffffc027298a [qed]
devlink_health_do_dump at ffffffff82497f61
devlink_health_report at ffffffff8249cf29
qed_report_fatal_error at ffffffffc0272baf [qed]
qede_sp_task at ffffffffc045ed32 [qede]
process_one_work at ffffffff81d19783

or the qedf storage driver path:

[exception RIP: qed_grc_dump_addr_range+0x108]
qed_protection_override_dump at ffffffffc068b2ed [qed]
qed_dbg_protection_override_dump at ffffffffc068c792 [qed]
qed_dbg_feature at ffffffffc068fa8f [qed]
qed_dbg_all_data at ffffffffc0690211 [qed]
qed_fw_fatal_reporter_dump at ffffffffc069798a [qed]
devlink_health_do_dump at ffffffff8aa95e51
devlink_health_report at ffffffff8aa9ae19
qed_report_fatal_error at ffffffffc0697baf [qed]
qed_hw_err_notify at ffffffffc06d32d7 [qed]
qed_spq_post at ffffffffc06b1011 [qed]
qed_fcoe_destroy_conn at ffffffffc06b2e91 [qed]
qedf_cleanup_fcport at ffffffffc05e7597 [qedf]
qedf_rport_event_handler at ffffffffc05e7bf7 [qedf]
fc_rport_work at ffffffffc02da715 [libfc]
process_one_work at ffffffff8a319663

Resolve this by clamping the firmware's return value to the maximum
number of legal elements the firmware should return.

Fixes: d52c89f120de8 ("qed*: Utilize FW 8.37.2.0")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Link: https://patch.msgid.link/f8e1182934aa274c18d0682a12dbaf347595469c.1757485536.git.jamie.bainbridge@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

authored by

Jamie Bainbridge and committed by
Jakub Kicinski
56c0a2a9 af82e857

+4 -3
+4 -3
drivers/net/ethernet/qlogic/qed/qed_debug.c
··· 4462 4462 goto out; 4463 4463 } 4464 4464 4465 - /* Add override window info to buffer */ 4465 + /* Add override window info to buffer, preventing buffer overflow */ 4466 4466 override_window_dwords = 4467 - qed_rd(p_hwfn, p_ptt, GRC_REG_NUMBER_VALID_OVERRIDE_WINDOW) * 4468 - PROTECTION_OVERRIDE_ELEMENT_DWORDS; 4467 + min(qed_rd(p_hwfn, p_ptt, GRC_REG_NUMBER_VALID_OVERRIDE_WINDOW) * 4468 + PROTECTION_OVERRIDE_ELEMENT_DWORDS, 4469 + PROTECTION_OVERRIDE_DEPTH_DWORDS); 4469 4470 if (override_window_dwords) { 4470 4471 addr = BYTES_TO_DWORDS(GRC_REG_PROTECTION_OVERRIDE_WINDOW); 4471 4472 offset += qed_grc_dump_addr_range(p_hwfn,