.. SPDX-License-Identifier: GPL-2.0

Diagnostic Concept for Investigating Twisted Pair Ethernet Variants at OSI Layer 1
==================================================================================

Introduction
------------

This documentation is designed for two primary audiences:

1. **Users and System Administrators**: For those dealing with real-world
   Ethernet issues, this guide provides a practical, step-by-step
   troubleshooting flow to help identify and resolve common problems in Twisted
   Pair Ethernet at OSI Layer 1. If you're facing unstable links, speed drops,
   or mysterious network issues, jump right into the step-by-step guide and
   follow it through to find your solution.

2. **Kernel Developers**: For developers working with network drivers and PHY
   support, this documentation outlines the diagnostic process and highlights
   areas where the Linux kernel’s diagnostic interfaces could be extended or
   improved. By understanding the diagnostic flow, developers can better
   prioritize future enhancements.

Step-by-Step Diagnostic Guide from Linux (General Ethernet)
-----------------------------------------------------------

This diagnostic guide covers common Ethernet troubleshooting scenarios,
focusing on **link stability and detection** across different Ethernet
environments, including **Single-Pair Ethernet (SPE)** and **Multi-Pair
Ethernet (MPE)**, as well as power delivery technologies like **PoDL** (Power
over Data Line) and **PoE** (Clause 33 PSE).

The guide is designed to help users diagnose physical layer (Layer 1) issues on
systems running **Linux kernel version 6.11 or newer**, utilizing **ethtool
version 6.10 or later** and **iproute2 version 6.4.0 or later**.

In this guide, we assume that users may have **limited or no access to the link
partner** and will focus on diagnosing issues locally.

Diagnostic Scenarios
~~~~~~~~~~~~~~~~~~~~

- **Link is up and stable, but no data transfer**: If the link is stable but
  there are issues with data transmission, refer to the **OSI Layer 2
  Troubleshooting Guide**.

- **Link is unstable**: Link resets, speed drops, or other fluctuations
  indicate potential issues at the hardware or physical layer.

- **No link detected**: The interface is up, but no link is established.

Verify Interface Status
~~~~~~~~~~~~~~~~~~~~~~~

Begin by verifying the status of the Ethernet interface to check if it is
administratively up. `ethtool` provides information on the link and PHY
status, but it does not show the **administrative state** of the interface.
To check this, use the `ip` command, which describes the interface state
within the angle brackets `"<>"` in its output.

For example, in the output `<NO-CARRIER,BROADCAST,MULTICAST,UP>`, the important
keywords are:

- **UP**: The interface is in the administrative "UP" state.
- **NO-CARRIER**: The interface is administratively up, but no physical link is
  detected.

If the output shows `<BROADCAST,MULTICAST>`, this indicates the interface is in
the administrative "DOWN" state.

- **Command:** `ip link show dev <interface>`

- **Expected Output:**

  .. code-block:: bash

     4: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 ...
        link/ether 88:14:2b:00:96:f2 brd ff:ff:ff:ff:ff:ff

- **Interpreting the Output:**

  - **Administrative UP State**:

    - If the output contains **"UP"**, the interface is administratively up,
      and the system is trying to establish a physical link.

    - If you also see **"NO-CARRIER"**, it means the physical link has not been
      detected, indicating potential Layer 1 issues like a cable fault,
      misconfiguration, or no connection at the link partner. In this case,
      proceed to the **Inspect Link Status and PHY Configuration** section.

  - **Administrative DOWN State**:

    - If the output lacks **"UP"** and shows only states like
      **"<BROADCAST,MULTICAST>"**, it means the interface is administratively
      down. In this case, bring the interface up using the following command:

      .. code-block:: bash

         ip link set dev <interface> up

- **Next Steps**:

  - If the interface is **administratively up** but shows **NO-CARRIER**,
    proceed to the **Inspect Link Status and PHY Configuration** section to
    troubleshoot potential physical layer issues.

  - If the interface was **administratively down** and you have brought it up,
    **repeat this verification step** to confirm the new state of the
    interface before proceeding.

  - **If the interface is up and the link is detected**:

    - If the output shows **"UP"** and there is **no "NO-CARRIER"**, the
      interface is administratively up, and the physical link has been
      successfully established. If everything is working as expected, the Layer
      1 diagnostics are complete, and no further action is needed.

    - If the interface is up and the link is detected but **no data is being
      transferred**, the issue is likely beyond Layer 1, and you should proceed
      with diagnosing the higher layers of the OSI model. This may involve
      checking Layer 2 configurations (such as VLANs or MAC address issues),
      Layer 3 settings (like IP addresses, routing, or ARP), or Layer 4 and
      above (firewalls, services, etc.).

    - If the **link is unstable** or **frequently resetting or dropping**, this
      may indicate a physical layer issue such as a faulty cable, interference,
      or power delivery problems. In this case, proceed with the next step in
      this guide.

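As a rough illustration of the flag interpretation described in this step, the
following Python sketch classifies a single `ip link show` header line. The
function name `classify_ip_link` and the returned labels are invented for this
example and are not part of any tool.

.. code-block:: python

   import re

   def classify_ip_link(line):
       """Classify an `ip link show` header line by its <...> flag list."""
       match = re.search(r"<([^>]*)>", line)
       if match is None:
           raise ValueError("no <...> flag list found")
       # Compare whole flags, so that LOWER_UP is not mistaken for UP.
       flags = set(match.group(1).split(","))
       if "UP" not in flags:
           return "admin-down"            # bring the interface up first
       if "NO-CARRIER" in flags:
           return "admin-up-no-carrier"   # continue with Layer 1 checks
       return "admin-up-carrier"          # link detected

   sample = "4: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 ..."
   print(classify_ip_link(sample))
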
Inspect Link Status and PHY Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use `ethtool -I` to check the link status, PHY configuration, supported link
modes, and additional statistics such as the **Link Down Events** counter. This
step is essential for diagnosing Layer 1 problems such as speed mismatches,
duplex issues, and link instability.

For both **Single-Pair Ethernet (SPE)** and **Multi-Pair Ethernet (MPE)**
devices, you will use this step to gather key details about the link. **SPE**
links generally support a single speed and mode without autonegotiation (with
the exception of **10BaseT1L**), while **MPE** devices typically support
multiple link modes and autonegotiation.

- **Command:** `ethtool -I <interface>`

- **Example Output for SPE Interface (Non-autonegotiation)**:

  .. code-block:: bash

     Settings for spe4:
             Supported ports: [ TP ]
             Supported link modes:   100baseT1/Full
             Supported pause frame use: No
             Supports auto-negotiation: No
             Supported FEC modes: Not reported
             Advertised link modes:  Not applicable
             Advertised pause frame use: No
             Advertised auto-negotiation: No
             Advertised FEC modes: Not reported
             Speed: 100Mb/s
             Duplex: Full
             Auto-negotiation: off
             master-slave cfg: forced slave
             master-slave status: slave
             Port: Twisted Pair
             PHYAD: 6
             Transceiver: external
             MDI-X: Unknown
             Supports Wake-on: d
             Wake-on: d
             Link detected: yes
             SQI: 7/7
             Link Down Events: 2

- **Example Output for MPE Interface (Autonegotiation)**:

  .. code-block:: bash

     Settings for eth1:
             Supported ports: [ TP MII ]
             Supported link modes:   10baseT/Half 10baseT/Full
                                     100baseT/Half 100baseT/Full
             Supported pause frame use: Symmetric Receive-only
             Supports auto-negotiation: Yes
             Supported FEC modes: Not reported
             Advertised link modes:  10baseT/Half 10baseT/Full
                                     100baseT/Half 100baseT/Full
             Advertised pause frame use: Symmetric Receive-only
             Advertised auto-negotiation: Yes
             Advertised FEC modes: Not reported
             Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                                  100baseT/Half 100baseT/Full
             Link partner advertised pause frame use: Symmetric Receive-only
             Link partner advertised auto-negotiation: Yes
             Link partner advertised FEC modes: Not reported
             Speed: 100Mb/s
             Duplex: Full
             Auto-negotiation: on
             Port: Twisted Pair
             PHYAD: 10
             Transceiver: internal
             MDI-X: Unknown
             Supports Wake-on: pg
             Wake-on: p
             Link detected: yes
             Link Down Events: 1

- **Next Steps**:

  - Record the output provided by `ethtool`, particularly noting the
    **master-slave status**, **speed**, **duplex**, and other relevant fields.
    This information will be useful for further analysis or troubleshooting.
    Once the **ethtool** output has been collected and stored, move on to the
    next diagnostic step.

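When the output is recorded for later comparison, it can help to reduce it to
key/value pairs. The sketch below is one possible way to do that in Python; it
assumes the "Key: value" layout shown in the examples above, with wrapped
values continued on lines that contain no colon. The helper name
`parse_ethtool_settings` is hypothetical.

.. code-block:: python

   def parse_ethtool_settings(text):
       """Return the 'Key: value' fields of `ethtool` output as a dict."""
       fields = {}
       last_key = None
       for raw in text.splitlines():
           line = raw.strip()
           if not line or line.startswith("Settings for"):
               continue
           key, sep, value = line.partition(":")
           if sep:
               last_key = key.strip()
               fields[last_key] = value.strip()
           elif last_key is not None:
               # A line without a colon continues the previous value,
               # e.g. a wrapped list of link modes.
               fields[last_key] = (fields[last_key] + " " + line).strip()
       return fields

   sample = """\
   Settings for spe4:
           Speed: 100Mb/s
           Duplex: Full
           master-slave status: slave
           Link detected: yes
           Link Down Events: 2
   """
   fields = parse_ethtool_settings(sample)
   for name in ("Speed", "Duplex", "master-slave status", "Link Down Events"):
       print(name, "=", fields.get(name))
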
Check Power Delivery (PoDL or PoE)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If it is known that **PoDL** or **PoE** is **not implemented** on the system,
or the **PSE** (Power Sourcing Equipment) is managed by proprietary user-space
software or external tools, you can skip this step. In such cases, verify power
delivery through alternative methods, such as checking hardware indicators
(LEDs), using multimeters, or consulting vendor-specific software for
monitoring power status.

If **PoDL** or **PoE** is implemented and managed directly by Linux, follow
these steps to ensure power is being delivered correctly:

- **Command:** `ethtool --show-pse <interface>`

- **Expected Output Examples**:

  1. **PSE Not Supported**:

     If no PSE is attached or the interface does not support PSE, the following
     output is expected:

     .. code-block:: bash

        netlink error: No PSE is attached
        netlink error: Operation not supported

  2. **PoDL (Single-Pair Ethernet)**:

     When PoDL is implemented, you might see the following attributes:

     .. code-block:: bash

        PSE attributes for eth1:
        PoDL PSE Admin State: enabled
        PoDL PSE Power Detection Status: delivering power

  3. **PoE (Clause 33 PSE)**:

     For standard PoE, the output may look like this:

     .. code-block:: bash

        PSE attributes for eth1:
        Clause 33 PSE Admin State: enabled
        Clause 33 PSE Power Detection Status: delivering power
        Clause 33 PSE Available Power Limit: 18000

- **Adjust Power Limit (if needed)**:

  - Sometimes, the available power limit may not be sufficient for the link
    partner. You can increase the power limit as needed.

  - **Command:** `ethtool --set-pse <interface> c33-pse-avail-pw-limit <limit>`

    Example:

    .. code-block:: bash

       ethtool --set-pse eth1 c33-pse-avail-pw-limit 18000
       ethtool --show-pse eth1

    **Expected Output** after adjusting the power limit:

    .. code-block:: bash

       Clause 33 PSE Available Power Limit: 18000

- **Next Steps**:

  - **PoE or PoDL Not Used**: If **PoE** or **PoDL** is not implemented or used
    on the system, proceed to the next diagnostic step, as power delivery is
    not relevant for this setup.

  - **PoE or PoDL Controlled Externally**: If **PoE** or **PoDL** is used but
    is not managed by the Linux kernel's **PSE-PD** framework (i.e., it is
    controlled by proprietary user-space software or external tools), this part
    is out of scope for this documentation. Please consult vendor-specific
    documentation or external tools for monitoring and managing power delivery.

  - **PSE Admin State Disabled**:

    - If the `PSE Admin State:` is **disabled**, enable it by running one of
      the following commands:

      .. code-block:: bash

         # For PoDL:
         ethtool --set-pse <devname> podl-pse-admin-control enable

         # or, for Clause 33 PSE (PoE):
         ethtool --set-pse <devname> c33-pse-admin-control enable

    - After enabling the PSE Admin State, return to the start of the **Check
      Power Delivery (PoDL or PoE)** step to recheck the power delivery status.

  - **Power Not Delivered**: If the `Power Detection Status` shows something
    other than "delivering power" (e.g., `over current`), troubleshoot the
    **PSE**. Check for potential issues such as a short circuit in the cable,
    insufficient power delivery, or a fault in the PSE itself.

  - **Power Delivered but No Link**: If power is being delivered but no link is
    established, proceed with further diagnostics by performing **Cable
    Diagnostics** or reviewing the **Inspect Link Status and PHY
    Configuration** steps to identify any underlying issues with the physical
    link or settings.

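The decision flow of this step can be summarized in a few lines of Python. This
is only a sketch: it assumes the attribute names and values shown in the
example outputs above, and the function name `pse_next_step` and its return
strings are made up for illustration.

.. code-block:: python

   def pse_next_step(text):
       """Map `ethtool --show-pse` output to the next diagnostic step."""
       if "netlink error" in text:
           return "no PSE managed by Linux: skip power checks"
       state = status = None
       for line in text.splitlines():
           key, sep, value = line.strip().partition(":")
           if not sep:
               continue
           # Matches both the "PoDL PSE ..." and "Clause 33 PSE ..." variants.
           if key.endswith("PSE Admin State"):
               state = value.strip()
           elif key.endswith("PSE Power Detection Status"):
               status = value.strip()
       if state is None and status is None:
           return "unrecognized output: check manually"
       if state == "disabled":
           return "enable the PSE admin state, then recheck"
       if status != "delivering power":
           return "troubleshoot the PSE"
       return "power delivered: continue with link and cable checks"

   sample = """\
   PSE attributes for eth1:
   Clause 33 PSE Admin State: enabled
   Clause 33 PSE Power Detection Status: delivering power
   Clause 33 PSE Available Power Limit: 18000
   """
   print(pse_next_step(sample))
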
Cable Diagnostics
~~~~~~~~~~~~~~~~~

Use `ethtool` to test for physical layer issues such as cable faults. The test
results can vary depending on the cable's condition, the technology in use, and
the state of the link partner. The results from the cable test will help in
diagnosing issues like open circuits, shorts, impedance mismatches, and
noise-related problems.

- **Command:** `ethtool --cable-test <interface>`

The following are the typical outputs for **Single-Pair Ethernet (SPE)** and
**Multi-Pair Ethernet (MPE)**:

- **For Single-Pair Ethernet (SPE)**:

  - **Expected Output (SPE)**:

    .. code-block:: bash

       Cable test completed for device eth1.
       Pair A, fault length: 25.00m
       Pair A code Open Circuit

  This indicates an open circuit or cable fault at the reported distance, but
  results can be influenced by the link partner's state. Refer to the
  **"Troubleshooting Based on Cable Test Results"** section for further
  interpretation of these results.

- **For Multi-Pair Ethernet (MPE)**:

  - **Expected Output (MPE)**:

    .. code-block:: bash

       Cable test completed for device eth0.
       Pair A code OK
       Pair B code OK
       Pair C code Open Circuit

  Here, Pair C is reported as having an open circuit, while Pairs A and B are
  functioning correctly. However, if autonegotiation is in use on Pairs A and
  B, the cable test may be disrupted. Refer to the **"Troubleshooting Based on
  Cable Test Results"** section for a detailed explanation of these issues and
  how to resolve them.

For detailed descriptions of the different possible cable test results, please
refer to the **"Troubleshooting Based on Cable Test Results"** section.

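To compare results across several runs, the per-pair lines can be collected
into a small structure. The following Python sketch assumes the two line
formats shown in the examples above ("Pair X code ..." and "Pair X, fault
length: ...m"); other output variants would need additional patterns. The name
`parse_cable_test` is hypothetical.

.. code-block:: python

   import re

   PAIR_CODE = re.compile(r"^Pair (\w) code (.+)$")
   PAIR_LENGTH = re.compile(r"^Pair (\w), fault length: ([\d.]+)m$")

   def parse_cable_test(text):
       """Collect per-pair result codes and fault lengths (in meters)."""
       pairs = {}
       for raw in text.splitlines():
           line = raw.strip()
           code = PAIR_CODE.match(line)
           if code:
               pairs.setdefault(code.group(1), {})["code"] = code.group(2)
               continue
           length = PAIR_LENGTH.match(line)
           if length:
               pair = pairs.setdefault(length.group(1), {})
               pair["fault_length_m"] = float(length.group(2))
       return pairs

   sample = """\
   Cable test completed for device eth1.
   Pair A, fault length: 25.00m
   Pair A code Open Circuit
   """
   print(parse_cable_test(sample))
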
Troubleshooting Based on Cable Test Results
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

After running the cable test, the results can help identify specific issues in
the physical connection. However, it is important to note that **cable testing
results heavily depend on the capabilities and characteristics of both the
local hardware and the link partner**. The accuracy and reliability of the
results can vary significantly between different hardware implementations.

In some cases, this can introduce **blind spots** in the current cable testing
implementation, where certain results may not accurately reflect the actual
physical state of the cable. For example:

- An **Open Circuit** result might not only indicate a damaged or disconnected
  cable but also occur if the cable is properly attached to a powered-down link
  partner.

- Some PHYs may report a **Short within Pair** if the link partner is in
  **forced slave mode**, even though there is no actual short in the cable.

To help users interpret the results more effectively, it could be beneficial to
extend the **kernel UAPI** (User API) to provide additional context or
**possible variants** of issues based on the hardware’s characteristics. Since
these quirks are often hardware-specific, the **kernel driver** would be an
ideal source of such information. By providing flags or hints related to
potential false positives for each test result, users would have a better
understanding of what to verify and where to investigate further.

Until such improvements are made, users should be aware of these limitations
and manually verify cable issues as needed. Physical inspections may help
resolve uncertainties related to false positive results.

The results can be one of the following:

- **OK**:

  - The cable is functioning correctly, and no issues were detected.

  - **Next Steps**: If you are still experiencing issues, it might be related
    to higher-layer problems, such as duplex mismatches or speed negotiation,
    which are not physical-layer issues.

  - **Special Case for `BaseT1` (1000/100/10BaseT1)**: In `BaseT1` systems, an
    "OK" result typically also means that the link is up and likely in **slave
    mode**, since cable tests usually only pass in this mode. For some
    **10BaseT1L** PHYs, an "OK" result may occur even if the cable is too long
    for the PHY's configured range (for example, when the range is configured
    for short-distance mode).

- **Open Circuit**:

  - An **Open Circuit** result typically indicates that the cable is damaged or
    disconnected at the reported fault length. Consider these possibilities:

    - If the link partner is in **admin down** state or powered off, you might
      still get an "Open Circuit" result even if the cable is functional.

  - **Next Steps**: Inspect the cable at the fault length for visible damage
    or loose connections. Verify the link partner is powered on and in the
    correct mode.

- **Short within Pair**:

  - A **Short within Pair** indicates an unintended connection within the same
    pair of wires, typically caused by physical damage to the cable.

  - **Next Steps**: Replace or repair the cable and check for any physical
    damage or improperly crimped connectors.

- **Short to Another Pair**:

  - A **Short to Another Pair** means the wires from different pairs are
    shorted, which could occur due to physical damage or incorrect wiring.

  - **Next Steps**: Replace or repair the damaged cable. Inspect the cable for
    incorrect terminations or pinched wiring.

- **Impedance Mismatch**:

  - **Impedance Mismatch** indicates a reflection caused by an impedance
    discontinuity in the cable. This can happen when a part of the cable has
    abnormal impedance (e.g., when different cable types are spliced together
    or when there is a defect in the cable).

  - **Next Steps**: Check the cable quality and ensure consistent impedance
    throughout its length. Replace any sections of the cable that do not meet
    specifications.

- **Noise**:

  - **Noise** means that the Time Domain Reflectometry (TDR) test could not
    complete due to excessive noise on the cable, which can be caused by
    interference from electromagnetic sources.

  - **Next Steps**: Identify and eliminate sources of electromagnetic
    interference (EMI) near the cable. Consider using shielded cables or
    rerouting the cable away from noise sources.

- **Resolution Not Possible**:

  - **Resolution Not Possible** means that the TDR test could not detect the
    issue due to the resolution limitations of the test or because the fault is
    beyond the distance that the test can measure.

  - **Next Steps**: Inspect the cable manually if possible, or use alternative
    diagnostic tools that can handle greater distances or higher resolution.

- **Unknown**:

  - An **Unknown** result may occur when the test cannot classify the fault or
    when a specific issue is outside the scope of the tool's detection
    capabilities.

  - **Next Steps**: Re-run the test, verify the link partner's state, and inspect
    the cable manually if necessary.

Verify Link Partner PHY Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If the cable test passes but the link is still not functioning correctly, it’s
essential to verify the configuration of the link partner’s PHY. Mismatches in
speed, duplex settings, or master-slave roles can cause connection issues.

Autonegotiation Mismatch
^^^^^^^^^^^^^^^^^^^^^^^^

- If both link partners support autonegotiation, ensure that autonegotiation is
  enabled on both sides and that all supported link modes are advertised. A
  mismatch can lead to connectivity problems or suboptimal performance.

- **Quick Fix:** Reset autonegotiation to the default settings, which will
  advertise all default link modes:

  .. code-block:: bash

     ethtool -s <interface> autoneg on

- **Command to check configuration:** `ethtool <interface>`

- **Expected Output:** Ensure that both sides advertise compatible link modes.
  If autonegotiation is off, verify that both link partners are configured for
  the same speed and duplex.

  The following example shows a case where the local PHY advertises fewer link
  modes than it supports. This will reduce the number of overlapping link modes
  with the link partner. In the worst case, there will be no common link modes,
  and the link will not be created:

  .. code-block:: bash

     Settings for eth0:
        Supported link modes:  1000baseT/Full, 100baseT/Full
        Advertised link modes: 1000baseT/Full
        Speed: 1000Mb/s
        Duplex: Full
        Auto-negotiation: on

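The overlap check described above amounts to a set intersection of the two
advertised mode lists. A minimal Python sketch, assuming the mode names are
given as whitespace- or comma-separated strings as in the `ethtool` examples
(the function name `common_link_modes` is invented here):

.. code-block:: python

   def common_link_modes(local_advertised, partner_advertised):
       """Return the link modes advertised by both sides, sorted by name."""
       local = set(local_advertised.replace(",", " ").split())
       partner = set(partner_advertised.replace(",", " ").split())
       return sorted(local & partner)

   local = "1000baseT/Full"
   partner = "10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full"
   modes = common_link_modes(local, partner)
   if not modes:
       print("no common link mode: the link cannot be established")

An empty result corresponds to the worst case mentioned above, where the local
PHY advertises only modes that the link partner does not offer.
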
Combined Mode Mismatch (Autonegotiation on One Side, Forced on the Other)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- One possible issue occurs when one side is using **autonegotiation** (as in
  most modern systems), and the other side is set to a **forced link mode**
  (e.g., older hardware with single-speed hubs). In such cases, modern PHYs
  will attempt to detect the forced mode on the other side. If the link is
  established, you may notice:

  - **No or empty "Link partner advertised link modes"**.

  - **"Link partner advertised auto-negotiation:"** will be **"no"** or not
    present.

- This type of detection does not always work reliably:

  - Typically, the modern PHY will default to **Half Duplex**, even if the link
    partner is actually configured for **Full Duplex**.

  - Some PHYs may not work reliably if the link partner switches from one
    forced mode to another. In this case, only a down/up cycle may help.

- **Next Steps**: Set both sides to the same fixed speed and duplex mode to
  avoid potential detection issues.

  .. code-block:: bash

     ethtool -s <interface> speed 1000 duplex full autoneg off

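The two indicators listed above can be checked mechanically once the `ethtool`
output has been reduced to a dictionary of fields. The sketch below is a
heuristic only; it assumes the field names from the earlier example output and
treats a missing, empty, or "Not reported" partner mode list as empty. The
function name `suspect_forced_partner` is hypothetical.

.. code-block:: python

   def suspect_forced_partner(fields):
       """Heuristic: local autoneg is on, but the partner advertises nothing."""
       local_autoneg = fields.get("Auto-negotiation", "").lower() == "on"
       partner_autoneg = fields.get(
           "Link partner advertised auto-negotiation", "no").lower()
       partner_modes = fields.get("Link partner advertised link modes", "")
       no_partner_modes = partner_modes in ("", "Not reported")
       return local_autoneg and partner_autoneg == "no" and no_partner_modes

   fields = {
       "Auto-negotiation": "on",
       "Duplex": "Half",
       "Link partner advertised auto-negotiation": "No",
   }
   if suspect_forced_partner(fields):
       print("link partner may be in a forced mode; verify its duplex setting")
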
Master/Slave Role Mismatch (BaseT1 and 1000BaseT PHYs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- In **BaseT1** systems (e.g., 1000BaseT1, 100BaseT1), link establishment
  requires that one device is configured as **master** and the other as
  **slave**. A mismatch in this master-slave configuration can prevent the link
  from being established. However, **1000BaseT** also supports configurable
  master/slave roles and can face similar issues.

- **Role Preference in 1000BaseT**: The **1000BaseT** specification allows link
  partners to negotiate master-slave roles or role preferences during
  autonegotiation. Some PHYs have hardware limitations or bugs that prevent
  them from functioning properly in certain roles. In such cases, drivers may
  force these PHYs into a specific role (e.g., **forced master** or **forced
  slave**) or try a weaker option by setting preferences. If both link partners
  have the same issue and are forced into the same mode (e.g., both forced into
  master mode), they will not be able to establish a link.

- **Next Steps**: Ensure that one side is configured as **master** and the
  other as **slave** to avoid this issue, particularly when hardware
  limitations are involved, or try the weaker **preferred** option instead of
  **forced**. Check for any driver-related restrictions or forced modes.

- **Command to force master/slave mode**:

  .. code-block:: bash

     ethtool -s <interface> master-slave forced-master

  or:

  .. code-block:: bash

     ethtool -s <interface> master-slave forced-master speed 1000 duplex full autoneg off

- **Check the current master/slave status**:

  .. code-block:: bash

     ethtool <interface>

  Example Output:

  .. code-block:: bash

     master-slave cfg: forced master
     master-slave status: master

- **Hardware Bugs and Driver Forcing**: If a known hardware issue forces the
  PHY into a specific mode, it’s essential to check the driver source code or
  hardware documentation for details. Ensure that the roles are compatible
  across both link partners, and if both PHYs are forced into the same mode,
  adjust one side accordingly to resolve the mismatch.

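The conflict case described above (both sides forced into the same role) can be
expressed as a simple check. This Python sketch uses the `ethtool -s`
`master-slave` argument spellings as configuration values; the function name
`roles_conflict` is made up, and the check deliberately ignores PHY-specific
limitations.

.. code-block:: python

   FORCED_ROLES = {"forced-master", "forced-slave"}

   def roles_conflict(local_cfg, partner_cfg):
       """True if both link partners are forced into the same role."""
       return local_cfg in FORCED_ROLES and local_cfg == partner_cfg

   print(roles_conflict("forced-master", "forced-master"))
   print(roles_conflict("forced-master", "forced-slave"))
   # Identical preferences are not treated as a conflict here, because a
   # preference is the weaker option that still leaves room for negotiation.
   print(roles_conflict("preferred-master", "preferred-master"))
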
Monitor Link Resets and Speed Drops
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If the link is unstable, showing frequent resets or speed drops, this may
indicate issues with the cable, PHY configuration, or environmental factors.
While there is still no completely unified way in Linux to directly monitor
downshift events or link speed changes via user space tools, both the Linux
kernel logs and `ethtool` can provide valuable insights, especially if the
driver supports reporting such events.

- **Monitor Kernel Logs for Link Resets and Speed Drops**:

  - The Linux kernel will print link status changes, including downshift
    events, in the system logs. These messages typically include speed changes,
    duplex mode, and downshifted link speed (if the driver supports it).

  - **Command to monitor kernel logs in real-time:**

    .. code-block:: bash

       dmesg -w | grep "Link is Up\|Link is Down"

  - Example Output (if a downshift occurs):

    .. code-block:: bash

       eth0: Link is Up - 100Mbps/Full (downshifted) - flow control rx/tx
       eth0: Link is Down

    This indicates that the link has been established but has downshifted from
    a higher speed.

  - **Note**: Not all drivers or PHYs support downshift reporting, so you may
    not see this information for all devices.

- **Monitor Link Down Events Using `ethtool`**:

  - With the kernel and `ethtool` versions assumed by this guide, you can track
    **Link Down Events** using the `ethtool -I` command. This will provide
    counters for link drops, helping to diagnose link instability issues if
    supported by the driver.

  - **Command to monitor link down events:**

    .. code-block:: bash

       ethtool -I <interface>

  - Example Output (if supported):

    .. code-block:: bash

       Settings for eth1:
               Link Down Events: 5

    This indicates that the link has dropped 5 times. Frequent link down events
    may indicate cable or environmental issues that require further
    investigation.

- **Check Link Status and Speed**:

  - Even though downshift counts or events are not easily tracked, you can
    still use `ethtool` to manually check the current link speed and status.

  - **Command:** `ethtool <interface>`

  - **Expected Output:**

    .. code-block:: bash

       Speed: 1000Mb/s
       Duplex: Full
       Auto-negotiation: on
       Link detected: yes

    Any inconsistencies in the expected speed or duplex setting could indicate
    an issue.

- **Disable Energy-Efficient Ethernet (EEE) for Diagnostics**:

  - **EEE** (Energy-Efficient Ethernet) can be a source of link instability due
    to transitions in and out of low-power states. For diagnostic purposes, it
    may be useful to **temporarily** disable EEE to determine if it is
    contributing to link instability. This is **not a generic recommendation**
    for disabling power management.

  - **Next Steps**: Disable EEE and monitor if the link becomes stable. If
    disabling EEE resolves the issue, report the bug so that the driver can be
    fixed.

  - **Command:**

    .. code-block:: bash

       ethtool --set-eee <interface> eee off

  - **Important**: If disabling EEE resolves the instability, the issue should
    be reported to the maintainers as a bug, and the driver should be corrected
    to handle EEE properly without causing instability. Disabling EEE
    permanently should not be seen as a solution.

- **Monitor Error Counters**:

  - Use `ethtool -S <interface> --all-groups` to retrieve standardized interface
    statistics if the driver supports the unified interface:

    - **Command:** `ethtool -S <interface> --all-groups`

    - **Example Output (if supported)**:

      .. code-block:: bash

         phydev-RxFrames: 100391
         phydev-RxErrors: 0
         phydev-TxFrames: 9
         phydev-TxErrors: 0

  - If the unified interface is not supported, use `ethtool -S <interface>` to
    retrieve MAC and PHY counters. Note that non-standardized PHY counter names
    vary by driver and must be interpreted accordingly:

    - **Command:** `ethtool -S <interface>`

    - **Example Output (if supported)**:

      .. code-block:: bash

         rx_crc_errors: 123
         tx_errors: 45
         rx_frame_errors: 78

  - **Note**: If no meaningful error counters are available or if counters are
    not supported, you may need to rely on physical inspections (e.g., cable
    condition) or kernel log messages (e.g., link up/down events) to further
    diagnose the issue.

  - **Compare Counters**:

    - Compare the egress and ingress frame counts reported by the PHY and MAC.

    - A small difference may occur due to sampling rate differences between the
      MAC and PHY drivers, or if the PHY and MAC are not always fully
      synchronized in their UP or DOWN states.

    - Significant discrepancies indicate potential issues in the data path
      between the MAC and PHY.

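For longer observation periods, the kernel log lines can be summarized per
interface instead of being read by eye. The following Python sketch counts
"Link is Up", "Link is Down", and downshifted events; it assumes the message
format shown in the example output above, which may differ between drivers.
The function name `summarize_link_log` is hypothetical.

.. code-block:: python

   import re

   LINK_EVENT = re.compile(r"(\S+): Link is (Up|Down)")

   def summarize_link_log(lines):
       """Count link up/down/downshift events per interface name."""
       summary = {}
       for line in lines:
           event = LINK_EVENT.search(line)
           if event is None:
               continue
           entry = summary.setdefault(
               event.group(1), {"up": 0, "down": 0, "downshifted": 0})
           if event.group(2) == "Up":
               entry["up"] += 1
               if "(downshifted)" in line:
                   entry["downshifted"] += 1
           else:
               entry["down"] += 1
       return summary

   log = [
       "eth0: Link is Up - 100Mbps/Full (downshifted) - flow control rx/tx",
       "eth0: Link is Down",
   ]
   print(summarize_link_log(log))
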
When All Else Fails...
~~~~~~~~~~~~~~~~~~~~~~

So you've checked the cables, monitored the logs, disabled EEE, and still...
nothing? Don’t worry, you’re not alone. Sometimes, Ethernet gremlins just don’t
want to cooperate.

But before you throw in the towel (or the Ethernet cable), take a deep breath.
It’s always possible that:

1. Your PHY has a unique, undocumented personality.

2. The problem is lying dormant, waiting for just the right moment to magically
   resolve itself (hey, it happens!).

3. Or, it could be that the ultimate solution simply hasn’t been invented yet.

If none of the above bring you comfort, there’s one final step: contribute! If
you've uncovered new or unusual issues, or have creative diagnostic methods,
feel free to share your findings and extend this documentation. Together, we
can hunt down every elusive network issue - one twisted pair at a time.

Remember: sometimes the solution is just a reboot away, but if not, it’s time to
dig deeper - or report that bug!
