Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

net: phy: aquantia: fix -ETIMEDOUT PHY probe failure when firmware not present

The author of the blamed commit apparently did not notice something
about aqr_wait_reset_complete(): it polls the exact same register -
MDIO_MMD_VEND1:VEND1_GLOBAL_FW_ID - as aqr_firmware_load().

Thus, the entire logic after the introduction of aqr_wait_reset_complete() is
now completely side-stepped, because if aqr_wait_reset_complete()
succeeds, MDIO_MMD_VEND1:VEND1_GLOBAL_FW_ID could have only been a
non-zero value. The handling of the case where the register reads as 0
is dead code, due to the previous -ETIMEDOUT having stopped execution
and returning a fatal error to the caller. We never attempt to load
new firmware if no firmware is present.
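
The dead-code argument above can be reproduced with a small userspace model. This is only a sketch under stated assumptions: mock_fw_id, mock_wait_reset_complete(), mock_firmware_load() and old_probe_flow() are hypothetical stand-ins invented for illustration, not the driver's functions, and the real timed polling is collapsed into a single check.

```c
#include <errno.h>

/* Hypothetical stand-in for the polled register: 0 means no firmware ID. */
static int mock_fw_id;
/* Counts how often the (mock) firmware loader was reached. */
static int load_attempts;

/* Mock of the wait helper: succeeds only when the ID is non-zero. */
static int mock_wait_reset_complete(void)
{
	return mock_fw_id != 0 ? 0 : -ETIMEDOUT;
}

/* Mock loader that only records that it was called. */
static int mock_firmware_load(void)
{
	load_attempts++;
	return 0;
}

/* Control flow before the fix, reduced to its essentials. */
static int old_probe_flow(void)
{
	int ret;

	ret = mock_wait_reset_complete();
	if (ret)
		return ret;	/* ID read as 0: fatal, execution stops here */

	if (mock_fw_id > 0)
		return 0;	/* always taken once the wait has succeeded */

	return mock_firmware_load();	/* never reached in this model */
}
```

In this model, calling old_probe_flow() with mock_fw_id set to 0 returns -ETIMEDOUT and leaves load_attempts at 0, which is the probe failure described above.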

Based on static code analysis, I guess we should simply introduce a
switch/case statement based on the return code from aqr_wait_reset_complete(),
to determine whether to load firmware or not. I am not intending to
change the procedure through which the driver determines whether to load
firmware or not, as I am unaware of alternative possibilities.
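
As a rough illustration of the proposed dispatch, here is a userspace sketch. The names mock_wait_result, mock_wait_reset_complete(), mock_firmware_load() and new_probe_flow() are hypothetical and exist only for this example, and the two real loaders (nvmem, then filesystem) are collapsed into one mock.

```c
#include <errno.h>

/* Hypothetical outcome of the wait helper, set by the caller of this sketch. */
static int mock_wait_result;
/* Counts how often the (mock) firmware loader was reached. */
static int load_attempts;

static int mock_wait_reset_complete(void)
{
	return mock_wait_result;
}

static int mock_firmware_load(void)
{
	load_attempts++;
	return 0;
}

/* Decide what to do based on the return code of the wait helper. */
static int new_probe_flow(void)
{
	int ret = mock_wait_reset_complete();

	switch (ret) {
	case 0:
		/* A non-zero firmware ID was seen: nothing to load. */
		return 0;
	case -ETIMEDOUT:
		/* ID still reads 0: assume no firmware and load one. */
		return mock_firmware_load();
	default:
		/* Any other negative value is a read error: propagate it. */
		return ret;
	}
}
```

The point of the three-way split is that -ETIMEDOUT stops being fatal while genuine bus errors such as -EIO still reach the caller unchanged.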

At the same time, Russell King suggests that if aqr_wait_reset_complete()
is expected to return -ETIMEDOUT as part of normal operation and not
just catastrophic failure, the use of phy_read_mmd_poll_timeout() is
improper, since that has an embedded print inside. Just open-code a
call to read_poll_timeout() to avoid printing -ETIMEDOUT, but continue
printing actual read errors from the MDIO bus.
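
The intended behaviour (silent on timeout, loud on read errors) can be modelled in userspace roughly as follows. This is a sketch, not the kernel helper: mock_read_fw_id(), wait_fw_id() and the mock_* variables are hypothetical, and the timed loop is replaced by a fixed number of polls (MAX_POLLS, assumed here to be the 2000000 us timeout divided by the 20000 us sleep).

```c
#include <errno.h>
#include <stdio.h>

/* Assumption: 2000000 us timeout / 20000 us sleep = 100 polls. */
#define MAX_POLLS 100

/* Hypothetical knobs controlling the mock register read. */
static int mock_reads_until_ready;	/* negative: ID never becomes non-zero */
static int mock_read_error;		/* non-zero: every read fails with this */
static int error_prints;		/* how many error messages were printed */

/* Mock register read: returns 0, a positive ID, or a negative error. */
static int mock_read_fw_id(void)
{
	if (mock_read_error)
		return mock_read_error;
	if (mock_reads_until_ready < 0)
		return 0;
	if (mock_reads_until_ready > 0) {
		mock_reads_until_ready--;
		return 0;
	}
	return 0x1234;
}

/* Poll until the value is non-zero; print only on real read errors. */
static int wait_fw_id(void)
{
	int val = 0;
	int i;

	for (i = 0; i < MAX_POLLS; i++) {
		val = mock_read_fw_id();
		if (val != 0)
			break;	/* either a valid ID or a read error */
	}

	if (val < 0) {
		fprintf(stderr, "Failed to read firmware ID: %d\n", val);
		error_prints++;
		return val;
	}

	/* No message here: a timeout is an expected outcome. */
	return val != 0 ? 0 : -ETIMEDOUT;
}
```

Because a negative read result also satisfies the "non-zero" exit condition, the loop ends early on bus errors and the separate val < 0 check is what tells the two cases apart.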

Fixes: ad649a1fac37 ("net: phy: aquantia: wait for FW reset before checking the vendor ID")
Reported-by: Clark Wang <xiaoning.wang@nxp.com>
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Closes: https://lore.kernel.org/netdev/8ac00a45-ac61-41b4-9f74-d18157b8b6bf@nvidia.com/
Reported-by: Hans-Frieder Vogt <hfdevel@gmx.net>
Closes: https://lore.kernel.org/netdev/c7c1a3ae-be97-4929-8d89-04c8aa870209@gmx.net/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Hans-Frieder Vogt <hfdevel@gmx.net>
Link: https://patch.msgid.link/20240913121230.2620122-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Authored by Vladimir Oltean and committed by Paolo Abeni
194ef9d0 94106455

+38 -21
+23 -17
drivers/net/phy/aquantia/aquantia_firmware.c
···
 {
 	int ret;
 
-	ret = aqr_wait_reset_complete(phydev);
-	if (ret)
-		return ret;
-
-	/* Check if the firmware is not already loaded by pooling
-	 * the current version returned by the PHY. If 0 is returned,
-	 * no firmware is loaded.
+	/* Check if the firmware is not already loaded by polling
+	 * the current version returned by the PHY.
 	 */
-	ret = phy_read_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_FW_ID);
-	if (ret > 0)
-		goto exit;
+	ret = aqr_wait_reset_complete(phydev);
+	switch (ret) {
+	case 0:
+		/* Some firmware is loaded => do nothing */
+		return 0;
+	case -ETIMEDOUT:
+		/* VEND1_GLOBAL_FW_ID still reads 0 after 2 seconds of polling.
+		 * We don't have full confidence that no firmware is loaded (in
+		 * theory it might just not have loaded yet), but we will
+		 * assume that, and load a new image.
+		 */
+		ret = aqr_firmware_load_nvmem(phydev);
+		if (!ret)
+			return ret;
 
-	ret = aqr_firmware_load_nvmem(phydev);
-	if (!ret)
-		goto exit;
-
-	ret = aqr_firmware_load_fs(phydev);
-	if (ret)
+		ret = aqr_firmware_load_fs(phydev);
+		if (ret)
+			return ret;
+		break;
+	default:
+		/* PHY read error, propagate it to the caller */
 		return ret;
+	}
 
-exit:
 	return 0;
 }
+15 -4
drivers/net/phy/aquantia/aquantia_main.c
···
 	}
 }
 
+#define AQR_FW_WAIT_SLEEP_US	20000
+#define AQR_FW_WAIT_TIMEOUT_US	2000000
+
 /* If we configure settings whilst firmware is still initializing the chip,
  * then these settings may be overwritten. Therefore make sure chip
  * initialization has completed. Use presence of the firmware ID as
···
  */
 int aqr_wait_reset_complete(struct phy_device *phydev)
 {
-	int val;
+	int ret, val;
 
-	return phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
-					 VEND1_GLOBAL_FW_ID, val, val != 0,
-					 20000, 2000000, false);
+	ret = read_poll_timeout(phy_read_mmd, val, val != 0,
+				AQR_FW_WAIT_SLEEP_US, AQR_FW_WAIT_TIMEOUT_US,
+				false, phydev, MDIO_MMD_VEND1,
+				VEND1_GLOBAL_FW_ID);
+	if (val < 0) {
+		phydev_err(phydev, "Failed to read VEND1_GLOBAL_FW_ID: %pe\n",
+			   ERR_PTR(val));
+		return val;
+	}
+
+	return ret;
 }
 
 static void aqr107_chip_info(struct phy_device *phydev)