Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

docs: net: dsa: sja1105: Add info about the Time-Aware Scheduler

While not an exhaustive usage tutorial, this describes the details
needed to build more complex scenarios.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Authored by Vladimir Oltean and committed by David S. Miller
7c95afa4 317ab5b8

+90
Documentation/networking/dsa/sja1105.rst
@@ -146,6 +146,96 @@
 this mode, the switch ports beneath br0 are not capable of regular traffic, and
 are only used as a conduit for switchdev operations.
 
+Offloads
+========
+
+Time-aware scheduling
+---------------------
+
+The switch supports a variation of the enhancements for scheduled traffic
+specified in IEEE 802.1Q-2018 (formerly 802.1Qbv). This means it can be used to
+ensure deterministic latency for priority traffic that is sent in-band with its
+gate-open event in the network schedule.
+
+This capability can be managed through the tc-taprio offload ('flags 2'). The
+difference compared to the software implementation of taprio is that the latter
+would only be able to shape traffic originating from the CPU, but not
+autonomously forwarded flows.
+
+The device has 8 traffic classes, and maps incoming frames to one of them based
+on the VLAN PCP bits (if no VLAN is present, the port-based default is used).
+As described in the previous sections, depending on the value of
+``vlan_filtering``, the EtherType recognized by the switch as being VLAN can
+either be the typical 0x8100 or a custom value used internally by the driver
+for tagging. Therefore, the switch ignores the VLAN PCP if used in standalone
+or bridge mode with ``vlan_filtering=0``, as it will not recognize the 0x8100
+EtherType. In these modes, injecting into a particular TX queue can only be
+done by the DSA net devices, which populate the PCP field of the tagging header
+on egress. Using ``vlan_filtering=1``, the behavior is the other way around:
+offloaded flows can be steered to TX queues based on the VLAN PCP, but the DSA
+net devices are no longer able to do that. To inject frames into a hardware TX
+queue with VLAN awareness active, it is necessary to create a VLAN
+sub-interface on the DSA master port, and send normal (0x8100) VLAN-tagged
+frames towards the switch, with the VLAN PCP bits set appropriately.
+
+Management traffic (having DMAC 01-80-C2-xx-xx-xx or 01-19-1B-xx-xx-xx) is the
+notable exception: the switch always treats it with a fixed priority and
+disregards any VLAN PCP bits even if present. The traffic class for management
+traffic has a value of 7 (highest priority) at the moment, which is not
+configurable in the driver.
+
+Below is an example of configuring a 500 us cyclic schedule on egress port
+``swp5``. The traffic class gate for management traffic (7) is open for 100 us,
+and the gates for all other traffic classes are open for 400 us::
+
+        #!/bin/bash
+
+        set -e -u -o pipefail
+
+        NSEC_PER_SEC="1000000000"
+
+        gatemask() {
+                local tc_list="$1"
+                local mask=0
+
+                for tc in ${tc_list}; do
+                        mask=$((${mask} | (1 << ${tc})))
+                done
+
+                printf "%02x" ${mask}
+        }
+
+        if ! systemctl is-active --quiet ptp4l; then
+                echo "Please start the ptp4l service"
+                exit 1
+        fi
+
+        now=$(phc_ctl /dev/ptp1 get | gawk '/clock time is/ { print $5; }')
+        # Phase-align the base time to the start of the next second.
+        sec=$(echo "${now}" | gawk -F. '{ print $1; }')
+        base_time="$(((${sec} + 1) * ${NSEC_PER_SEC}))"
+
+        tc qdisc add dev swp5 parent root handle 100 taprio \
+                num_tc 8 \
+                map 0 1 2 3 4 5 6 7 \
+                queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
+                base-time ${base_time} \
+                sched-entry S $(gatemask 7) 100000 \
+                sched-entry S $(gatemask "0 1 2 3 4 5 6") 400000 \
+                flags 2
+
+It is possible to apply the tc-taprio offload on multiple egress ports. There
+are hardware restrictions related to the fact that no gate event may trigger
+simultaneously on two ports. The driver checks the consistency of the schedules
+against this restriction and errors out when appropriate. Schedule analysis is
+needed to avoid this, which is outside the scope of this document.
+
+At the moment, the time-aware scheduler can only be triggered based on a
+standalone clock and not based on PTP time. This means the base-time argument
+from tc-taprio is ignored and the schedule starts right away. It also means it
+is more difficult to phase-align the scheduler with the other devices in the
+network.
+
 Device Tree bindings and board design
 =====================================
 
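
As a sanity check on the example schedule in the patch above, the following sketch recomputes the two gate masks and the cycle length without touching any network device. It reuses the ``gatemask`` helper from the documented script. The variable names ``mgmt_mask``, ``other_mask`` and ``cycle_ns`` are introduced here for illustration only and are not part of the patch. It assumes nothing beyond bash itself.

```shell
#!/bin/bash

set -e -u -o pipefail

# Same helper as in the documented script: set one bit per traffic class
# and print the resulting gate mask as two hex digits.
gatemask() {
        local tc_list="$1"
        local mask=0
        local tc

        for tc in ${tc_list}; do
                mask=$((mask | (1 << tc)))
        done

        printf "%02x" "${mask}"
}

# Hypothetical variable names, used only in this sketch.
mgmt_mask=$(gatemask 7)                 # gate for traffic class 7 only
other_mask=$(gatemask "0 1 2 3 4 5 6")  # gates for traffic classes 0 to 6

# The two sched-entry intervals (in nanoseconds) make up one cycle.
cycle_ns=$((100000 + 400000))

echo "management gate mask:    0x${mgmt_mask}"
echo "other classes gate mask: 0x${other_mask}"
echo "cycle time:              $((cycle_ns / 1000)) us"
```

The two masks are complementary (0x80 and 0x7f), so every traffic class has its gate open in exactly one of the two schedule entries, and the intervals add up to the 500 us cycle described in the text.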