Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

net: phy: introduce internal API for PHY MSE diagnostics

Add the base infrastructure for Mean Square Error (MSE) diagnostics,
as proposed by the OPEN Alliance "Advanced diagnostic features for
100BASE-T1 automotive Ethernet PHYs" [1] specification.

The OPEN Alliance spec defines only average MSE and average peak MSE
over a fixed number of symbols. However, other PHYs, such as the
KSZ9131, additionally expose a worst-peak MSE value latched since the
last channel capture. This API accounts for such vendor extensions by
adding a distinct capability bit and snapshot field.

Channel-to-pair mapping is normally straightforward, but in some cases
(e.g. 100BASE-TX with MDI-X resolution unknown) the mapping is ambiguous.
If hardware does not expose MDI-X status, the exact pair cannot be
determined. To avoid returning misleading per-channel data in this case,
a LINK selector is defined for aggregate MSE measurements.

All investigated devices differ in MSE capabilities, such
as sample rate, number of analyzed symbols, and scaling factors.
For example, the KSZ9131 uses different scaling for MSE and pMSE.
To make this visible to callers, scale limits and timing information
are returned via get_mse_capability().
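
As a rough illustration of how a caller might use these scale limits, the
sketch below expresses a raw reading as permille of the reported full scale.
The struct only mirrors the two scale fields of struct phy_mse_capability so
the fragment builds in userspace; the helper name mse_to_permille() and the
clamping policy are assumptions, not part of this patch.

```c
#include <stdint.h>

/* Userspace stand-in for the two scale fields of struct phy_mse_capability. */
struct mse_scale {
	uint64_t max_average_mse;
	uint64_t max_peak_mse;
};

/*
 * Hypothetical helper: express a raw reading as permille (0..1000) of the
 * full scale reported by the driver. Returns -1 if no scale was reported
 * (max == 0), which the API only allows when the metric is unsupported.
 */
static int mse_to_permille(uint64_t raw, uint64_t max)
{
	if (max == 0)
		return -1;
	if (raw > max)
		raw = max;	/* assumption: clamp out-of-range readings */

	/* Avoid overflowing raw * 1000 for very large scales. */
	if (max > UINT64_MAX / 1000)
		return (int)(raw / (max / 1000));

	return (int)(raw * 1000 / max);
}
```

Because average and peak values may use different scales (as on the KSZ9131),
a caller would pass max_average_mse for average readings and max_peak_mse for
peak and worst-peak readings.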

Some PHYs sample very few symbols at high frequency (e.g. 2 us update
rate). To cover such cases and allow for future high-speed PHYs with
even shorter intervals, the refresh rate is reported as u64 in
picoseconds.
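
A small sketch of what the picosecond unit means for callers: picosecond
resolution keeps sub-nanosecond intervals representable, while 64 bits keep
millisecond-scale intervals in range (a 32-bit picosecond value would top out
at roughly 4.29 ms). The helper names below are hypothetical and only
illustrate the unit conversions.

```c
#include <stdint.h>

#define PS_PER_NS	1000ULL
#define PS_PER_SEC	1000000000000ULL

/* Convert a refresh interval in picoseconds to whole nanoseconds. */
static uint64_t mse_refresh_ns(uint64_t refresh_rate_ps)
{
	return refresh_rate_ps / PS_PER_NS;
}

/* Updates per second implied by the interval; 0 if the interval is unknown. */
static uint64_t mse_updates_per_sec(uint64_t refresh_rate_ps)
{
	return refresh_rate_ps ? PS_PER_SEC / refresh_rate_ps : 0;
}

/* Nonzero if the interval would not fit into a 32-bit picosecond field. */
static int mse_refresh_needs_64bit(uint64_t refresh_rate_ps)
{
	return refresh_rate_ps > UINT32_MAX;
}
```

For the 2 us example above, the interval is 2,000,000 ps, which corresponds to
2000 ns or 500,000 updates per second.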

Introduce the internal PHY API for Mean Square Error diagnostics by
defining new kernel-side data types and driver hooks:

- struct phy_mse_capability: describes supported metrics, scale
limits, update interval, and sampling length.
- struct phy_mse_snapshot: holds one correlated measurement set.
- New phy_driver ops: `get_mse_capability()` and `get_mse_snapshot()`.

These definitions form the core kernel API. No user-visible interfaces
are added in this commit.
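
To make the shape of the two hooks concrete, here is a minimal userspace
sketch of how a driver might implement them for a purely hypothetical PHY
that reports only a link-wide average MSE. The struct, enum and flag
definitions are copied from this patch with userspace integer types; struct
phy_device is replaced by a toy stand-in, and the register layout, the 1 ms
refresh value and the -EAGAIN-on-link-down policy are assumptions made for
illustration.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace copies of definitions added by this patch (subset). */
#define BIT(n)			(1U << (n))
#define PHY_MSE_CAP_LINK	BIT(5)
#define PHY_MSE_CAP_AVG		BIT(6)

enum phy_mse_channel {
	PHY_MSE_CHANNEL_A,
	PHY_MSE_CHANNEL_B,
	PHY_MSE_CHANNEL_C,
	PHY_MSE_CHANNEL_D,
	PHY_MSE_CHANNEL_WORST,
	PHY_MSE_CHANNEL_LINK,
};

struct phy_mse_capability {
	uint64_t max_average_mse;
	uint64_t max_peak_mse;
	uint64_t refresh_rate_ps;
	uint64_t num_symbols;
	uint32_t supported_caps;
};

struct phy_mse_snapshot {
	uint64_t average_mse;
	uint64_t peak_mse;
	uint64_t worst_peak_mse;
};

/* Hypothetical stand-in for struct phy_device with one fake MSE register. */
struct phy_device {
	int link_up;
	uint16_t fake_mse_reg;
};

/* Hypothetical PHY: link-wide average only, 0..511 over 2^16 symbols. */
static int demo_get_mse_capability(struct phy_device *dev,
				   struct phy_mse_capability *cap)
{
	if (!dev->link_up)
		return -EAGAIN;	/* assumption: nothing measured yet */

	cap->max_average_mse = 511;
	cap->max_peak_mse = 0;			/* no peak metrics */
	cap->refresh_rate_ps = 1000000000ULL;	/* 1 ms, illustrative */
	cap->num_symbols = 1ULL << 16;
	cap->supported_caps = PHY_MSE_CAP_LINK | PHY_MSE_CAP_AVG;
	return 0;
}

static int demo_get_mse_snapshot(struct phy_device *dev,
				 enum phy_mse_channel channel,
				 struct phy_mse_snapshot *snapshot)
{
	if (channel != PHY_MSE_CHANNEL_LINK)
		return -EOPNOTSUPP;	/* fail instead of coercing the selector */
	if (!dev->link_up)
		return -EAGAIN;

	snapshot->average_mse = dev->fake_mse_reg & 0x1ff;	/* 9-bit value */
	snapshot->peak_mse = 0;
	snapshot->worst_peak_mse = 0;
	return 0;
}
```

A real driver would read the value over MDIO and assign both functions to the
corresponding struct phy_driver members; the sketch only shows the contract
between capability reporting and snapshot retrieval.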

Standardization notes:
OPEN Alliance defines presence and interpretation of some metrics but does
not fix numeric scales or sampling internals:

- SQI (3-bit, 0..7) is mandatory; correlation to SNR/BER is informative
(OA 100BASE-T1 TC1 v1.0 6.1.2; OA 1000BASE-T1 TC12 v2.2 6.1.2).
- MSE is optional; OA recommends 2^16 symbols and scaling to 0..511,
with a worst-case latch since last read (OA 100BASE-T1 TC1 v1.0 6.1.1; OA
1000BASE-T1 TC12 v2.2 6.1.1). The recommended refresh interval is
~0.8-2.0 ms for 100BASE-T1 and ~80-200 us for 1000BASE-T1. Exact
scaling/time windows are vendor-specific.
- Peak MSE (pMSE) is defined only for 100BASE-T1 as optional, e.g.
128-symbol sliding window with 8-bit range and worst-case latch (OA
100BASE-T1 TC1 v1.0 6.1.3).

Therefore this API exposes which measures and selectors a PHY supports,
and documents where behavior is standard-referenced vs vendor-specific.
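
As a worked check of the refresh figures quoted above, the sketch below
computes how long a 2^16-symbol averaging window lasts at a given symbol
rate. The symbol rates used in the note (about 66.67 MBd for 100BASE-T1 and
750 MBd for 1000BASE-T1) come from the respective IEEE 802.3 PHY definitions
rather than from this patch, and the helper name is hypothetical.

```c
#include <stdint.h>

/*
 * Duration of an averaging window in picoseconds, or 0 if the symbol rate
 * is unknown. The multiplication overflows for symbol counts above roughly
 * 1.8e7, which is far beyond the 2^16 symbols discussed here.
 */
static uint64_t mse_window_ps(uint64_t num_symbols, uint64_t symbols_per_sec)
{
	if (symbols_per_sec == 0)
		return 0;

	return num_symbols * 1000000000000ULL / symbols_per_sec;
}
```

With 2^16 symbols this gives about 0.98 ms at 66,666,666 symbols/s and about
87 us at 750,000,000 symbols/s, both inside the recommended ranges. A driver
could use a calculation like this to fill refresh_rate_ps when the hardware
documents only the symbol count.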

[1] <https://opensig.org/wp-content/uploads/2024/01/Advanced_PHY_features_for_automotive_Ethernet_V1.0.pdf>

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20251027122801.982364-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

authored by Oleksij Rempel and committed by Jakub Kicinski
abcf6eef ff371a7e

+206
include/linux/phy.h
···

 #define to_phy_led(d) container_of(d, struct phy_led, led_cdev)

+/*
+ * PHY_MSE_CAP_* - Bitmask flags for Mean Square Error (MSE) capabilities
+ *
+ * These flags describe which MSE metrics and selectors are implemented
+ * by the PHY for the current link mode. They are used in
+ * struct phy_mse_capability.supported_caps.
+ *
+ * Standardization:
+ * The OPEN Alliance (OA) defines the presence of MSE/SQI/pMSE but not their
+ * numeric scaling, update intervals, or aggregation windows. See:
+ *   OA 100BASE-T1 TC1 v1.0, sections 6.1.1-6.1.3
+ *   OA 1000BASE-T1 TC12 v2.2, sections 6.1.1-6.1.2
+ *
+ * Description of flags:
+ *
+ *   PHY_MSE_CAP_CHANNEL_A
+ *     Per-pair diagnostics for Channel A are supported. Mapping to the
+ *     physical wire pair may depend on MDI/MDI-X polarity.
+ *
+ *   PHY_MSE_CAP_CHANNEL_B, _C, _D
+ *     Same as above for channels B-D.
+ *
+ *   PHY_MSE_CAP_WORST_CHANNEL
+ *     The PHY or driver can identify and report the single worst-performing
+ *     channel without querying each one individually.
+ *
+ *   PHY_MSE_CAP_LINK
+ *     The PHY provides only a link-wide aggregate measurement or cannot map
+ *     results to a specific pair (for example 100BASE-TX with unknown
+ *     MDI/MDI-X).
+ *
+ *   PHY_MSE_CAP_AVG
+ *     Average MSE (mean DCQ metric) is supported. For 100/1000BASE-T1 the OA
+ *     recommends 2^16 symbols, scaled 0..511, but the exact scaling is
+ *     vendor-specific.
+ *
+ *   PHY_MSE_CAP_PEAK
+ *     Peak MSE (current peak within the measurement window) is supported.
+ *     Defined as pMSE for 100BASE-T1; vendor-specific for others.
+ *
+ *   PHY_MSE_CAP_WORST_PEAK
+ *     Latched worst-case peak MSE since the last read (read-to-clear if
+ *     implemented). Optional in OA 100BASE-T1 TC1 6.1.3.
+ */
+#define PHY_MSE_CAP_CHANNEL_A	BIT(0)
+#define PHY_MSE_CAP_CHANNEL_B	BIT(1)
+#define PHY_MSE_CAP_CHANNEL_C	BIT(2)
+#define PHY_MSE_CAP_CHANNEL_D	BIT(3)
+#define PHY_MSE_CAP_WORST_CHANNEL	BIT(4)
+#define PHY_MSE_CAP_LINK	BIT(5)
+#define PHY_MSE_CAP_AVG	BIT(6)
+#define PHY_MSE_CAP_PEAK	BIT(7)
+#define PHY_MSE_CAP_WORST_PEAK	BIT(8)
+
+/*
+ * enum phy_mse_channel - Identifiers for selecting MSE measurement channels
+ *
+ * PHY_MSE_CHANNEL_A - PHY_MSE_CHANNEL_D
+ *   Select per-pair measurement for the corresponding channel.
+ *
+ * PHY_MSE_CHANNEL_WORST
+ *   Select the single worst-performing channel reported by hardware.
+ *
+ * PHY_MSE_CHANNEL_LINK
+ *   Select link-wide aggregate data (used when per-pair results are
+ *   unavailable).
+ */
+enum phy_mse_channel {
+	PHY_MSE_CHANNEL_A,
+	PHY_MSE_CHANNEL_B,
+	PHY_MSE_CHANNEL_C,
+	PHY_MSE_CHANNEL_D,
+	PHY_MSE_CHANNEL_WORST,
+	PHY_MSE_CHANNEL_LINK,
+};
+
+/**
+ * struct phy_mse_capability - Capabilities of Mean Square Error (MSE)
+ *                             measurement interface
+ *
+ * Standardization notes:
+ *
+ * - Presence of MSE/SQI/pMSE is defined by OPEN Alliance specs, but numeric
+ *   scaling, refresh/update rate and aggregation windows are not fixed and
+ *   are vendor-/product-specific. (OA 100BASE-T1 TC1 v1.0 6.1.*;
+ *   OA 1000BASE-T1 TC12 v2.2 6.1.*)
+ *
+ * - Typical recommendations: 2^16 symbols and 0..511 scaling for MSE; pMSE only
+ *   defined for 100BASE-T1 (sliding window example), others are vendor
+ *   extensions. Drivers must report actual scale/limits here.
+ *
+ * Describes the MSE measurement capabilities for the current link mode. These
+ * properties are dynamic and may change when link settings are modified.
+ * Callers should re-query this capability after any link state change to
+ * ensure they have the most up-to-date information.
+ *
+ * Callers should only request measurements for channels and types that are
+ * indicated as supported by the @supported_caps bitmask. If @supported_caps
+ * is 0, the device provides no MSE diagnostics, and driver operations should
+ * typically return -EOPNOTSUPP.
+ *
+ * Snapshot values for average and peak MSE can be normalized to a 0..1 ratio
+ * by dividing the raw snapshot by the corresponding @max_average_mse or
+ * @max_peak_mse value.
+ *
+ * @max_average_mse: The maximum value for an average MSE snapshot. This
+ *   defines the scale for the measurement. If the PHY_MSE_CAP_AVG capability is
+ *   supported, this value MUST be greater than 0. (vendor-specific units).
+ * @max_peak_mse: The maximum value for a peak MSE snapshot. If either
+ *   PHY_MSE_CAP_PEAK or PHY_MSE_CAP_WORST_PEAK is supported, this value MUST
+ *   be greater than 0. (vendor-specific units).
+ * @refresh_rate_ps: The typical interval, in picoseconds, between hardware
+ *   updates of the MSE values. This is an estimate, and callers should not
+ *   assume synchronous sampling. (vendor-specific units).
+ * @num_symbols: The number of symbols aggregated per hardware sample to
+ *   calculate the MSE. (vendor-specific units).
+ * @supported_caps: A bitmask of PHY_MSE_CAP_* values indicating which
+ *   measurement types (e.g., average, peak) and channels
+ *   (e.g., per-pair or link-wide) are supported.
+ */
+struct phy_mse_capability {
+	u64 max_average_mse;
+	u64 max_peak_mse;
+	u64 refresh_rate_ps;
+	u64 num_symbols;
+	u32 supported_caps;
+};
+
+/**
+ * struct phy_mse_snapshot - A snapshot of Mean Square Error (MSE) diagnostics
+ *
+ * Holds a set of MSE diagnostic values that were all captured from a single
+ * measurement window.
+ *
+ * Values are raw, device-scaled and not normalized. Use struct
+ * phy_mse_capability to interpret the scale and sampling window.
+ *
+ * @average_mse: The average MSE value over the measurement window.
+ *   OPEN Alliance references MSE as a DCQ metric; recommends 2^16 symbols and
+ *   0..511 scaling. Exact scale and refresh are vendor-specific.
+ *   (100BASE-T1 TC1 v1.0 6.1.1; 1000BASE-T1 TC12 v2.2 6.1.1).
+ *
+ * @peak_mse: The peak MSE value observed within the measurement window.
+ *   For 100BASE-T1, "pMSE" is optional and may be implemented via a sliding
+ *   128-symbol window with periodic capture; not standardized for 1000BASE-T1.
+ *   (100BASE-T1 TC1 v1.0 6.1.3, Table "DCQ.peakMSE").
+ *
+ * @worst_peak_mse: A latched high-water mark of the peak MSE since last read
+ *   (read-to-clear if implemented). OPEN Alliance shows a latched "worst case
+ *   peak MSE" for 100BASE-T1 pMSE; availability/semantics outside that are
+ *   vendor-specific. (100BASE-T1 TC1 v1.0 6.1.3, DCQ.peakMSE high byte;
+ *   1000BASE-T1 TC12 v2.2 treats DCQ details as vendor-specific.)
+ */
+struct phy_mse_snapshot {
+	u64 average_mse;
+	u64 peak_mse;
+	u64 worst_peak_mse;
+};
+
 /**
  * struct phy_driver - Driver structure for a particular PHY type
  *
···
 	int (*get_sqi)(struct phy_device *dev);
 	/** @get_sqi_max: Get the maximum signal quality indication */
 	int (*get_sqi_max)(struct phy_device *dev);
+
+	/**
+	 * @get_mse_capability: Get capabilities and scale of MSE measurement
+	 * @dev:    PHY device
+	 * @cap:    Output (filled on success)
+	 *
+	 * Fill @cap with the PHY's MSE capability for the current
+	 * link mode: scale limits (max_average_mse, max_peak_mse), update
+	 * interval (refresh_rate_ps), sample length (num_symbols) and the
+	 * capability bitmask (supported_caps).
+	 *
+	 * Implementations may defer capability report until hardware has
+	 * converged; in that case they should return -EAGAIN and allow the
+	 * caller to retry later.
+	 *
+	 * Return: 0 on success. On failure, returns a negative errno code, such
+	 * as -EOPNOTSUPP if MSE measurement is not supported by the PHY or in
+	 * the current link mode, or -EAGAIN if the capability information is
+	 * not yet available.
+	 */
+	int (*get_mse_capability)(struct phy_device *dev,
+				  struct phy_mse_capability *cap);
+
+	/**
+	 * @get_mse_snapshot: Retrieve a snapshot of MSE diagnostic values
+	 * @dev:      PHY device
+	 * @channel:  Channel identifier (PHY_MSE_CHANNEL_*)
+	 * @snapshot: Output (filled on success)
+	 *
+	 * Fill @snapshot with a correlated set of MSE values from the most
+	 * recent measurement window.
+	 *
+	 * Callers must validate @channel against supported_caps returned by
+	 * get_mse_capability(). Drivers must not coerce @channel; if the
+	 * requested selector is not implemented by the device or current link
+	 * mode, the operation must fail.
+	 *
+	 * worst_peak_mse is latched and must be treated as read-to-clear.
+	 *
+	 * Return: 0 on success. On failure, returns a negative errno code, such
+	 * as -EOPNOTSUPP if MSE measurement is not supported by the PHY or in
+	 * the current link mode, or -EAGAIN if measurements are not yet
+	 * available.
+	 */
+	int (*get_mse_snapshot)(struct phy_device *dev,
+				enum phy_mse_channel channel,
+				struct phy_mse_snapshot *snapshot);

 	/* PLCA RS interface */
 	/** @get_plca_cfg: Return the current PLCA configuration */
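
The rule that callers must validate @channel against supported_caps can be
sketched as a mapping from channel selector to capability bit, followed by a
mask check. The flag and enum definitions below are copied from the diff so
the fragment builds in userspace; the helper names mse_channel_to_cap() and
mse_request_supported() are hypothetical and not part of this commit.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)				(1U << (n))
#define PHY_MSE_CAP_CHANNEL_A		BIT(0)
#define PHY_MSE_CAP_CHANNEL_B		BIT(1)
#define PHY_MSE_CAP_CHANNEL_C		BIT(2)
#define PHY_MSE_CAP_CHANNEL_D		BIT(3)
#define PHY_MSE_CAP_WORST_CHANNEL	BIT(4)
#define PHY_MSE_CAP_LINK		BIT(5)
#define PHY_MSE_CAP_AVG			BIT(6)
#define PHY_MSE_CAP_PEAK		BIT(7)
#define PHY_MSE_CAP_WORST_PEAK		BIT(8)

enum phy_mse_channel {
	PHY_MSE_CHANNEL_A,
	PHY_MSE_CHANNEL_B,
	PHY_MSE_CHANNEL_C,
	PHY_MSE_CHANNEL_D,
	PHY_MSE_CHANNEL_WORST,
	PHY_MSE_CHANNEL_LINK,
};

/* Map a channel selector to its capability bit; 0 for unknown selectors. */
static uint32_t mse_channel_to_cap(enum phy_mse_channel channel)
{
	switch (channel) {
	case PHY_MSE_CHANNEL_A:
		return PHY_MSE_CAP_CHANNEL_A;
	case PHY_MSE_CHANNEL_B:
		return PHY_MSE_CAP_CHANNEL_B;
	case PHY_MSE_CHANNEL_C:
		return PHY_MSE_CAP_CHANNEL_C;
	case PHY_MSE_CHANNEL_D:
		return PHY_MSE_CAP_CHANNEL_D;
	case PHY_MSE_CHANNEL_WORST:
		return PHY_MSE_CAP_WORST_CHANNEL;
	case PHY_MSE_CHANNEL_LINK:
		return PHY_MSE_CAP_LINK;
	}
	return 0;
}

/*
 * True only if both the selector and the metric (PHY_MSE_CAP_AVG, _PEAK or
 * _WORST_PEAK) are advertised in supported_caps.
 */
static bool mse_request_supported(uint32_t supported_caps,
				  enum phy_mse_channel channel,
				  uint32_t metric_cap)
{
	uint32_t need = mse_channel_to_cap(channel);

	return need && metric_cap &&
	       (supported_caps & need) == need &&
	       (supported_caps & metric_cap) == metric_cap;
}
```

A caller would run this check against the capability returned by
get_mse_capability() before invoking get_mse_snapshot(), and repeat the query
after a link change because the capability is defined per link mode.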