···

- Linux Ethernet Bonding Driver HOWTO

 Initial release : Thomas Davis <tadavis at lbl.gov>
 Corrections, HA extensions : 2000/10/03-15 :
···

 Reorganized and updated Feb 2005 by Jay Vosburgh

-Note :
-------
-
-The bonding driver originally came from Donald Becker's beowulf patches for
-kernel 2.0. It has changed quite a bit since, and the original tools from
-extreme-linux and beowulf sites will not work with this version of the driver.

-For new versions of the driver, patches for older kernels and the updated
-userspace tools, please follow the links at the end of this file.

 Table of Contents
 =================
···

 3. Configuring Bonding Devices
 3.1 Configuration with sysconfig support
 3.2 Configuration with initscripts support
 3.3 Configuring Bonding Manually
-3.4 Configuring Multiple Bonds

 5. Querying Bonding Configuration
 5.1 Bonding Configuration
···

 11. Promiscuous mode

-12. High Availability Information
 12.1 High Availability in a Single Switch Topology
-12.1.1 Bonding Mode Selection for Single Switch Topology
-12.1.2 Link Monitoring for Single Switch Topology
 12.2 High Availability in a Multiple Switch Topology
-12.2.1 Bonding Mode Selection for Multiple Switch Topology
-12.2.2 Link Monitoring for Multiple Switch Topology
-12.3 Switch Behavior Issues for High Availability

-13. Hardware Specific Considerations
-13.1 IBM BladeCenter

-14. Frequently Asked Questions

-15. Resources and Links


 1. 
Bonding Driver Installation
···
 1.1 Configure and build the kernel with bonding
 -----------------------------------------------

- The latest version of the bonding driver is available in the
 drivers/net/bonding subdirectory of the most recent kernel source
-(which is available on http://kernel.org).
-
- Prior to the 2.4.11 kernel, the bonding driver was maintained
-largely outside the kernel tree; patches for some earlier kernels are
-available on the bonding sourceforge site, although those patches are
-still several years out of date. Most users will want to use either
-the most recent kernel from kernel.org or whatever kernel came with
-their distro.

  Configure kernel with "make menuconfig" (or "make xconfig" or
 "make config"), then select "Bonding driver support" in the "Network
···
 driver as module since it is currently the only way to pass parameters
 to the driver or configure more than one bonding device.

- Build and install the new kernel and modules, then proceed to
-step 2.

 1.2 Install ifenslave Control Utility
 -------------------------------------
···
  Options for the bonding driver are supplied as parameters to
 the bonding module at load time. They may be given as command line
 arguments to the insmod or modprobe command, but are usually specified
-in either the /etc/modprobe.conf configuration file, or in a
-distro-specific configuration file (some of which are detailed in the
-next section).

  The available bonding driver parameters are listed below. If a
 parameter is not specified the default value is used. When initially
···
 support at least miimon, so there is really no reason not to use it.

  Options with textual values will accept either the text name
- or, for backwards compatibility, the option value. 
E.g.,
- "mode=802.3ad" and "mode=4" set the same mode.

  The parameters are as follows:

 arp_interval

- Specifies the ARP monitoring frequency in milli-seconds. If
- ARP monitoring is used in a load-balancing mode (mode 0 or 2),
- the switch should be configured in a mode that evenly
- distributes packets across all links - such as round-robin. If
- the switch is configured to distribute the packets in an XOR
  fashion, all replies from the ARP targets will be received on
  the same link which could cause the other team members to
- fail. ARP monitoring should not be used in conjunction with
- miimon. A value of 0 disables ARP monitoring. The default
  value is 0.

 arp_ip_target

- Specifies the ip addresses to use when arp_interval is > 0.
- These are the targets of the ARP request sent to determine the
- health of the link to the targets. Specify these values in
- ddd.ddd.ddd.ddd format. Multiple ip adresses must be
- seperated by a comma. At least one IP address must be given
- for ARP monitoring to function. The maximum number of targets
- that can be specified is 16. The default value is no IP
- addresses.

 downdelay

···
  are:

  slow or 0
- Request partner to transmit LACPDUs every 30 seconds (default)

  fast or 1
  Request partner to transmit LACPDUs every 1 second

 max_bonds

···

 miimon

- Specifies the frequency in milli-seconds that MII link
- monitoring will occur. A value of zero disables MII link
- monitoring. A value of 100 is a good starting point. The
- use_carrier option, below, affects how the link state is
  determined. See the High Availability section for additional
  information. The default value is 0.

···
  active. A different slave becomes active if, and only
  if, the active slave fails. The bond's MAC address is
  externally visible on only one port (network adapter)
- to avoid confusing the switch. This mode provides
- fault tolerance. 
The primary option affects the
- behavior of this mode.

  balance-xor or 2

- XOR policy: Transmit based on [(source MAC address
- XOR'd with destination MAC address) modulo slave
- count]. This selects the same slave for each
- destination MAC address. This mode provides load
- balancing and fault tolerance.

  broadcast or 3

···
  duplex settings. Utilizes all slaves in the active
  aggregator according to the 802.3ad specification.

- Pre-requisites:

  1. Ethtool support in the base drivers for retrieving
  the speed and duplex of each slave.
···

  When a link is reconnected or a new slave joins the
  bond the receive traffic is redistributed among all
- active slaves in the bond by intiating ARP Replies
  with the selected mac address to each of the
  clients. The updelay parameter (detailed below) must
  be set to a value equal or greater than the switch's
···
  0 will use the deprecated MII / ETHTOOL ioctls. The default
  value is 1.



 3. Configuring Bonding Devices
···
 slave devices. On SLES 9, this is most easily done by running the
 yast2 sysconfig configuration utility. The goal is for to create an
 ifcfg-id file for each slave device. The simplest way to accomplish
-this is to configure the devices for DHCP. 
The name of the
-configuration file for each device will be of the form:

 ifcfg-id-xx:xx:xx:xx:xx:xx

···
  Once the set of ifcfg-id-xx:xx:xx:xx:xx:xx files has been
 created, it is necessary to edit the configuration files for the slave
 devices (the MAC addresses correspond to those of the slave devices).
-Before editing, the file will contain muliple lines, and will look
 something like this:

 BOOTPROTO='dhcp'
···
 BONDING_MASTER="yes"
 BONDING_MODULE_OPTS="mode=active-backup miimon=100"
 BONDING_SLAVE0="eth0"
-BONDING_SLAVE1="eth1"

  Replace the sample BROADCAST, IPADDR, NETMASK and NETWORK
 values with the appropriate values for your network.
-
- Note that configuring the bonding device with BOOTPROTO='dhcp'
-does not work; the scripts attempt to obtain the device address from
-DHCP prior to adding any of the slave devices. Without active slaves,
-the DHCP requests are not sent to the network.

  The STARTMODE specifies when the device is brought online.
 The possible values are:
···
 the max_bonds bonding parameter; this will confuse the configuration
 system if you have multiple bonding devices.

- Finally, supply one BONDING_SLAVEn="ethX" for each slave,
-where "n" is an increasing value, one for each slave, and "ethX" is
-the name of the slave device (eth0, eth1, etc).

  When all configuration files have been modified or created,
 networking must be restarted for the configuration changes to take
···
  Note that the network control script (/sbin/ifdown) will
 remove the bonding module as part of the network shutdown processing,
 so it is not necessary to remove the module by hand if, e.g., the
-module paramters have changed.

  Also, at this writing, YaST/YaST2 will not manage bonding
 devices (they do not show bonding interfaces on its list of network
···
  Note that the template does not document the various BONDING_
 settings described above, but does describe many of the other 
options.

 3.2 Configuration with initscripts support
 ------------------------------------------

  This section applies to distros using a version of initscripts
 with bonding support, for example, Red Hat Linux 9 or Red Hat
-Enterprise Linux version 3. On these systems, the network
 initialization scripts have some knowledge of bonding, and can be
 configured to control bonding devices.

···
  Be sure to change the networking specific lines (IPADDR,
 NETMASK, NETWORK and BROADCAST) to match your network configuration.

- Finally, it is necessary to edit /etc/modules.conf to load the
-bonding module when the bond0 interface is brought up. The following
-sample lines in /etc/modules.conf will load the bonding module, and
-select its options:

 alias bond0 bonding
 options bond0 mode=balance-alb miimon=100
···
 will restart the networking subsystem and your bond link should be now
 up and running.


 3.3 Configuring Bonding Manually
 --------------------------------
···
 knowledge of bonding. One such distro is SuSE Linux Enterprise Server
 version 8.

- The general methodology for these systems is to place the
-bonding module parameters into /etc/modprobe.conf, then add modprobe
-and/or ifenslave commands to the system's global init script. 
The
-name of the global init script differs; for sysconfig, it is
 /etc/init.d/boot.local and for initscripts it is /etc/rc.d/rc.local.

  For example, if you wanted to make a simple bond of two e100
···
 reboots, edit the appropriate file (/etc/init.d/boot.local or
 /etc/rc.d/rc.local), and add the following:

-modprobe bonding -obond0 mode=balance-alb miimon=100
 modprobe e100
 ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up
 ifenslave bond0 eth0
···

  Replace the example bonding module parameters and bond0
 network configuration (IP address, netmask, etc) with the appropriate
-values for your configuration. The above example loads the bonding
-module with the name "bond0," this simplifies the naming if multiple
-bonding modules are loaded (each successive instance of the module is
-given a different name, and the module instance names match the
-bonding interface names).

  Unfortunately, this method will not provide support for the
 ifup and ifdown scripts on the bond devices. To reload the bonding
···
 the following:

 # ifconfig bond0 down
-# rmmod bond0
 # rmmod e100

  Again, for convenience, it may be desirable to create a script
 with these commands.


-3.4 Configuring Multiple Bonds
-------------------------------

  This section contains information on configuring multiple
-bonding devices with differing options. If you require multiple
-bonding devices, but all with the same options, see the "max_bonds"
-module paramter, documented above.

  To create multiple bonding devices with differing options, it
 is necessary to load the bonding driver multiple times. Note that
···
 miimon of 100. 
The second instance is named "bond1" and creates the
 bond1 device in balance-alb mode with an miimon of 50.

- This may be repeated any number of times, specifying a new and
-unique name in place of bond0 or bond1 for each instance.

- When the appropriate module paramters are in place, then
-configure bonding according to the instructions for your distro.

 5. Querying Bonding Configuration
 =================================
···
 self generated packets.

  For reasons of simplicity, and to support the use of adapters
-that can do VLAN hardware acceleration offloding, the bonding
-interface declares itself as fully hardware offloaing capable, it gets
 the add_vid/kill_vid notifications to gather the necessary
 information, and it propagates those actions to the slaves. In case
 of mixed adapter types, hardware accelerated tagged packets that
···
 matches the hardware address of the VLAN interfaces.

  Note that changing a VLAN interface's HW address would set the
-underlying device -- i.e. the bonding interface -- to promiscouos
 mode, which might not be what you want.


···
 an additional target (or several) increases the reliability of the ARP
 monitoring.

- Multiple ARP targets must be seperated by commas as follows:

 # example options for ARP monitoring with three targets
 alias bond0 bonding
···
  This will, when loading the bonding module, rather than
 performing the normal action, instead execute the provided command.
 This command loads the device drivers in the order needed, then calls
-modprobe with --ingore-install to cause the normal action to then take
 place. 
Full documentation on this can be found in the modprobe.conf
 and modprobe manual pages.

···
 common to enable promiscuous mode on the device, so that all traffic
 is seen (instead of seeing only traffic destined for the local host).
 The bonding driver handles promiscuous mode changes to the bonding
-master device (e.g., bond0), and propogates the setting to the slave
 devices.

  For the balance-rr, balance-xor, broadcast, and 802.3ad modes,
-the promiscuous mode setting is propogated to all slaves.

  For the active-backup, balance-tlb and balance-alb modes, the
-promiscuous mode setting is propogated only to the active slave.

  For balance-tlb mode, the active slave is the slave currently
 receiving inbound traffic.
···

  For the active-backup, balance-tlb and balance-alb modes, when
 the active slave changes (e.g., due to a link failure), the
-promiscuous setting will be propogated to the new active slave.

-12. High Availability Information
-=================================

  High Availability refers to configurations that provide
 maximum network availability by having redundant or backup devices,
-links and switches between the host and the rest of the world.
-
- There are currently two basic methods for configuring to
-maximize availability. 
They are dependent on the network topology and
-the primary goal of the configuration, but in general, a configuration
-can be optimized for maximum available bandwidth, or for maximum
-network availability.

 12.1 High Availability in a Single Switch Topology
 --------------------------------------------------

- If two hosts (or a host and a switch) are directly connected
-via multiple physical links, then there is no network availability
-penalty for optimizing for maximum bandwidth: there is only one switch
-(or peer), so if it fails, you have no alternative access to fail over
-to.

-Example 1 : host to switch (or other host)
-
- +----------+                          +----------+
- |          |eth0                  eth0|  switch  |
- |  Host A  +--------------------------+    or    |
- |          +--------------------------+   other  |
- |          |eth1                  eth1|   host   |
- +----------+                          +----------+
-
-
-12.1.1 Bonding Mode Selection for single switch topology
---------------------------------------------------------
-
- This configuration is the easiest to set up and to understand,
-although you will have to decide which bonding mode best suits your
-needs. The tradeoffs for each mode are detailed below:
-
-balance-rr: This mode is the only mode that will permit a single
- TCP/IP connection to stripe traffic across multiple
- interfaces. It is therefore the only mode that will allow a
- single TCP/IP stream to utilize more than one interface's
- worth of throughput. This comes at a cost, however: the
- striping often results in peer systems receiving packets out
- of order, causing TCP/IP's congestion control system to kick
- in, often by retransmitting segments.
-
- It is possible to adjust TCP/IP's congestion limits by
- altering the net.ipv4.tcp_reordering sysctl parameter. 
The
- usual default value is 3, and the maximum useful value is 127.
- For a four interface balance-rr bond, expect that a single
- TCP/IP stream will utilize no more than approximately 2.3
- interface's worth of throughput, even after adjusting
- tcp_reordering.
-
- If you are utilizing protocols other than TCP/IP, UDP for
- example, and your application can tolerate out of order
- delivery, then this mode can allow for single stream datagram
- performance that scales near linearly as interfaces are added
- to the bond.
-
- This mode requires the switch to have the appropriate ports
- configured for "etherchannel" or "trunking."
-
-active-backup: There is not much advantage in this network topology to
- the active-backup mode, as the inactive backup devices are all
- connected to the same peer as the primary. In this case, a
- load balancing mode (with link monitoring) will provide the
- same level of network availability, but with increased
- available bandwidth. On the plus side, it does not require
- any configuration of the switch.
-
-balance-xor: This mode will limit traffic such that packets destined
- for specific peers will always be sent over the same
- interface. Since the destination is determined by the MAC
- addresses involved, this may be desirable if you have a large
- network with many hosts. It is likely to be suboptimal if all
- your traffic is passed through a single router, however. As
- with balance-rr, the switch ports need to be configured for
- "etherchannel" or "trunking."
-
-broadcast: Like active-backup, there is not much advantage to this
- mode in this type of network topology.
-
-802.3ad: This mode can be a good choice for this type of network
- topology. The 802.3ad mode is an IEEE standard, so all peers
- that implement 802.3ad should interoperate well. 
The 802.3ad
- protocol includes automatic configuration of the aggregates,
- so minimal manual configuration of the switch is needed
- (typically only to designate that some set of devices is
- usable for 802.3ad). The 802.3ad standard also mandates that
- frames be delivered in order (within certain limits), so in
- general single connections will not see misordering of
- packets. The 802.3ad mode does have some drawbacks: the
- standard mandates that all devices in the aggregate operate at
- the same speed and duplex. Also, as with all bonding load
- balance modes other than balance-rr, no single connection will
- be able to utilize more than a single interface's worth of
- bandwidth. Additionally, the linux bonding 802.3ad
- implementation distributes traffic by peer (using an XOR of
- MAC addresses), so in general all traffic to a particular
- destination will use the same interface. Finally, the 802.3ad
- mode mandates the use of the MII monitor, therefore, the ARP
- monitor is not available in this mode.
-
-balance-tlb: This mode is also a good choice for this type of
- topology. It has no special switch configuration
- requirements, and balances outgoing traffic by peer, in a
- vaguely intelligent manner (not a simple XOR as in balance-xor
- or 802.3ad mode), so that unlucky MAC addresses will not all
- "bunch up" on a single interface. Interfaces may be of
- differing speeds. On the down side, in this mode all incoming
- traffic arrives over a single interface, this mode requires
- certain ethtool support in the network device driver of the
- slave interfaces, and the ARP monitor is not available.
-
-balance-alb: This mode is everything that balance-tlb is, and more. It
- has all of the features (and restrictions) of balance-tlb, and
- will also balance incoming traffic from peers (as described in
- the Bonding Module Options section, above). 
The only extra
- down side to this mode is that the network device driver must
- support changing the hardware address while the device is
- open.
-
-12.1.2 Link Monitoring for Single Switch Topology
--------------------------------------------------
-
- The choice of link monitoring may largely depend upon which
-mode you choose to use. The more advanced load balancing modes do not
-support the use of the ARP monitor, and are thus restricted to using
-the MII monitor (which does not provide as high a level of assurance
-as the ARP monitor).
-

 12.2 High Availability in a Multiple Switch Topology
 ----------------------------------------------------

  With multiple switches, the configuration of bonding and the
 network changes dramatically. In multiple switch topologies, there is
-a tradeoff between network availability and usable bandwidth.

  Below is a sample network, configured to maximize the
 availability of the network:
···
 the outside world ("port3" on each switch). There is no technical
 reason that this could not be extended to a third switch.

-12.2.1 Bonding Mode Selection for Multiple Switch Topology
-----------------------------------------------------------

- In a topology such as this, the active-backup and broadcast
-modes are the only useful bonding modes; the other modes require all
-links to terminate on the same peer for them to behave rationally.

 active-backup: This is generally the preferred mode, particularly if
  the switches have an ISL and play together well. If the
···
 broadcast: This mode is really a special purpose mode, and is suitable
  only for very specific needs. For example, if the two
  switches are not connected (no ISL), and the networks beyond
- them are totally independant. 
In this case, if it is
  necessary for some specific one-way traffic to reach both
  independent networks, then the broadcast mode may be suitable.

-12.2.2 Link Monitoring Selection for Multiple Switch Topology
--------------------------------------------------------------

  The choice of link monitoring ultimately depends upon your
 switch. If the switch can reliably fail ports in response to other
···
 thus detecting that failure without switch support.

  In general, however, in a multiple switch topology, the ARP
-monitor can provide a higher level of reliability in detecting link
-failures. Additionally, it should be configured with multiple targets
-(at least one for each switch in the network). This will insure that,
 regardless of which switch is active, the ARP monitor has a suitable
 target to query.


-12.3 Switch Behavior Issues for High Availability
--------------------------------------------------

- You may encounter issues with the timing of link up and down
-reporting by the switch.

  First, when a link comes up, some switches may indicate that
 the link is up (carrier available), but not pass traffic over the
···
  Second, some switches may "bounce" the link state one or more
 times while a link is changing state. This occurs most commonly while
 the switch is initializing. 
Again, an appropriate updelay value may
-help, but note that if all links are down, then updelay is ignored
-when any link becomes active (the slave closest to completing its
-updelay is chosen).

  Note that when a bonding interface has no active links, the
-driver will immediately reuse the first link that goes up, even if
-updelay parameter was specified. If there are slave interfaces
-waiting for the updelay timeout to expire, the interface that first
-went into that state will be immediately reused. This reduces down
-time of the network if the value of updelay has been overestimated.

  In addition to the concerns about switch timings, if your
 switches take a long time to go into backup mode, it may be desirable
 to not activate a backup interface immediately after a link goes down.
 Failover may be delayed via the downdelay bonding module option.

-13. Hardware Specific Considerations
 ====================================

  This section contains additional information for configuring
 bonding on specific hardware platforms, or for interfacing bonding
 with particular switches or other devices.

-13.1 IBM BladeCenter
 --------------------

  This applies to the JS20 and similar systems.
···
 --------------------------------

  All JS20s come with two Broadcom Gigabit Ethernet ports
-integrated on the planar. In the BladeCenter chassis, the eth0 port
-of all JS20 blades is hard wired to I/O Module #1; similarly, all eth1
-ports are wired to I/O Module #2. 
An add-on Broadcom daughter card
-can be installed on a JS20 to provide two more Gigabit Ethernet ports.
-These ports, eth2 and eth3, are wired to I/O Modules 3 and 4,
-respectively.

  Each I/O Module may contain either a switch or a passthrough
 module (which allows ports to be directly connected to an external
···
 of ways, this discussion will be confined to describing basic
 configurations.

- Normally, Ethernet Switch Modules (ESM) are used in I/O
 modules 1 and 2. In this configuration, the eth0 and eth1 ports of a
 JS20 will be connected to different internal switches (in the
 respective I/O modules).

- An optical passthru module (OPM) connects the I/O module
-directly to an external switch. By using OPMs in I/O module #1 and
-#2, the eth0 and eth1 interfaces of a JS20 can be redirected to the
-outside world and connected to a common external switch.

- Depending upon the mix of ESM and OPM modules, the network
-will appear to bonding as either a single switch topology (all OPM
-modules) or as a multiple switch topology (one or more ESM modules,
-zero or more OPM modules). It is also possible to connect ESM modules
-together, resulting in a configuration much like the example in "High
-Availability in a multiple switch topology."

-Requirements for specifc modes
-------------------------------

- The balance-rr mode requires the use of OPM modules for
-devices in the bond, all connected to an common external switch. That
-switch must be configured for "etherchannel" or "trunking" on the
 appropriate ports, as is usual for balance-rr.

  The balance-alb and balance-tlb modes will function with
···
 Other concerns
 --------------

- The Serial Over LAN link is established over the primary
 ethernet (eth0) only, therefore, any loss of link to eth0 will result
 in losing your SoL connection. 
It will not fail over with other
-network traffic.

  It may be desirable to disable spanning tree on the switch
 (either the internal Ethernet Switch Module, or an external switch) to
-avoid fail-over delays issues when using bonding.


-14. Frequently Asked Questions
 ==============================

 1. Is it SMP safe?
···
 2. What type of cards will work with it?

  Any Ethernet type cards (you can even mix cards - a Intel
-EtherExpress PRO/100 and a 3com 3c905b, for example). They need not
-be of the same speed.

 3. How many bonding devices can I have?

···
 disabled. The active-backup mode will fail over to a backup link, and
 other modes will ignore the failed link. The link will continue to be
 monitored, and should it recover, it will rejoin the bond (in whatever
-manner is appropriate for the mode). See the section on High
-Availability for additional information.

  Link monitoring can be enabled via either the miimon or
-arp_interval paramters (described in the module paramters section,
 above). In general, miimon monitors the carrier state as sensed by
 the underlying network device, and the arp monitor (arp_interval)
 monitors connectivity to another host on the local network.
···
  If no link monitoring is configured, the bonding driver will
 be unable to detect link failures, and will assume that all links are
 always available. This will likely result in lost packets, and a
-resulting degredation of performance. The precise performance loss
 depends upon the bonding mode and network configuration.

 6. Can bonding be used for High Availability?
···
  In the basic balance modes (balance-rr and balance-xor), it
 works with any system that supports etherchannel (also called
 trunking). 
Most managed switches currently available have such
-support, and many unmananged switches as well.

  The advanced balance modes (balance-tlb and balance-alb) do
 not have special switch requirements, but do need device drivers that
 support specific features (described in the appropriate section under
-module paramters, above).

  In 802.3ad mode, it works with with systems that support IEEE
 802.3ad Dynamic Link Aggregation. Most managed and many unmanaged
···

 8. Where does a bonding device get its MAC address from?

- If not explicitly configured with ifconfig, the MAC address of
-the bonding device is taken from its first slave device. This MAC
-address is then passed to all following slaves and remains persistent
-(even if the the first slave is removed) until the bonding device is
-brought down or reconfigured.

  If you wish to change the MAC address, you can set it with
-ifconfig:

 # ifconfig bond0 hw ether 00:11:22:33:44:55

  The MAC address can be also changed by bringing down/up the
 device and then changing its slaves (or their order):
···
 then restore the MAC addresses that the slaves had before they were
 enslaved.

-15. Resources and Links
 =======================

 The latest version of the bonding driver can be found in the latest
 version of the linux kernel, found on http://kernel.org

 Discussions regarding the bonding driver take place primarily on the
 bonding-devel mailing list, hosted at sourceforge.net. If you have
-questions or problems, post them to the list.

 bonding-devel@lists.sourceforge.net

 https://lists.sourceforge.net/lists/listinfo/bonding-devel
-
-There is also a project site on sourceforge.
-
-http://www.sourceforge.net/projects/bonding

 Donald Becker's Ethernet Drivers and diag programs may be found at :
  - http://www.scyld.com/network/
···

+ Linux Ethernet Bonding Driver HOWTO
+
+ Latest update: 21 June 2005

 Initial release : Thomas Davis <tadavis at lbl.gov>
 Corrections, HA extensions : 2000/10/03-15 :
···

 Reorganized and updated Feb 2005 by Jay Vosburgh

+Introduction
+============

+ The Linux bonding driver provides a method for aggregating
+multiple network interfaces into a single logical "bonded" interface.
+The behavior of the bonded interfaces depends upon the mode; generally
+speaking, modes provide either hot standby or load balancing services.
+Additionally, link integrity monitoring may be performed.
+
+ The bonding driver originally came from Donald Becker's
+beowulf patches for kernel 2.0. It has changed quite a bit since, and
+the original tools from extreme-linux and beowulf sites will not work
+with this version of the driver.
+
+ For new versions of the driver, updated userspace tools, and
+who to ask for help, please follow the links at the end of this file.

 Table of Contents
 =================
···

 3. Configuring Bonding Devices
 3.1 Configuration with sysconfig support
+3.1.1 Using DHCP with sysconfig
+3.1.2 Configuring Multiple Bonds with sysconfig
 3.2 Configuration with initscripts support
+3.2.1 Using DHCP with initscripts
+3.2.2 Configuring Multiple Bonds with initscripts
 3.3 Configuring Bonding Manually
+3.3.1 Configuring Multiple Bonds Manually

 5. Querying Bonding Configuration
 5.1 Bonding Configuration
···

 11. Promiscuous mode

+12. Configuring Bonding for High Availability
 12.1 High Availability in a Single Switch Topology
 12.2 High Availability in a Multiple Switch Topology
+12.2.1 HA Bonding Mode Selection for Multiple Switch Topology
+12.2.2 HA Link Monitoring for Multiple Switch Topology

+13. 
Configuring Bonding for Maximum Throughput66+13.1 Maximum Throughput in a Single Switch Topology67+13.1.1 MT Bonding Mode Selection for Single Switch Topology68+13.1.2 MT Link Monitoring for Single Switch Topology69+13.2 Maximum Throughput in a Multiple Switch Topology70+13.2.1 MT Bonding Mode Selection for Multiple Switch Topology71+13.2.2 MT Link Monitoring for Multiple Switch Topology7273+14. Switch Behavior Issues74+14.1 Link Establishment and Failover Delays75+14.2 Duplicated Incoming Packets7677+15. Hardware Specific Considerations78+15.1 IBM BladeCenter79+80+16. Frequently Asked Questions81+82+17. Resources and Links8384851. Bonding Driver Installation···861.1 Configure and build the kernel with bonding87-----------------------------------------------8889+ The current version of the bonding driver is available in the90drivers/net/bonding subdirectory of the most recent kernel source91+(which is available on http://kernel.org). Most users "rolling their92+own" will want to use the most recent kernel from kernel.org.0000009394 Configure kernel with "make menuconfig" (or "make xconfig" or95"make config"), then select "Bonding driver support" in the "Network···103driver as module since it is currently the only way to pass parameters104to the driver or configure more than one bonding device.105106+ Build and install the new kernel and modules, then continue107+below to install ifenslave.1081091.2 Install ifenslave Control Utility110-------------------------------------···147 Options for the bonding driver are supplied as parameters to148the bonding module at load time. They may be given as command line149arguments to the insmod or modprobe command, but are usually specified150+in either the /etc/modules.conf or /etc/modprobe.conf configuration151+file, or in a distro-specific configuration file (some of which are152+detailed in the next section).153154 The available bonding driver parameters are listed below. 
If a
 parameter is not specified the default value is used. When initially
···
 support at least miimon, so there is really no reason not to use it.

  Options with textual values will accept either the text name
+or, for backwards compatibility, the option value. E.g.,
+"mode=802.3ad" and "mode=4" set the same mode.

  The parameters are as follows:

 arp_interval

+ Specifies the ARP link monitoring frequency in milliseconds.
+ If ARP monitoring is used in an etherchannel compatible mode
+ (modes 0 and 2), the switch should be configured in a mode
+ that evenly distributes packets across all links. If the
+ switch is configured to distribute the packets in an XOR
  fashion, all replies from the ARP targets will be received on
  the same link which could cause the other team members to
+ fail. ARP monitoring should not be used in conjunction with
+ miimon. A value of 0 disables ARP monitoring. The default
  value is 0.

 arp_ip_target

+ Specifies the IP addresses to use as ARP monitoring peers when
+ arp_interval is > 0. These are the targets of the ARP request
+ sent to determine the health of the link to the targets.
+ Specify these values in ddd.ddd.ddd.ddd format. Multiple IP
+ addresses must be separated by a comma. At least one IP
+ address must be given for ARP monitoring to function. The
+ maximum number of targets that can be specified is 16. The
+ default value is no IP addresses.

 downdelay

···
  are:

  slow or 0
+ Request partner to transmit LACPDUs every 30 seconds

  fast or 1
  Request partner to transmit LACPDUs every 1 second
+
+ The default is slow.

 max_bonds

···

 miimon

+ Specifies the MII link monitoring frequency in milliseconds.
+ This determines how often the link state of each slave is
+ inspected for link failures. A value of zero disables MII
+ link monitoring. A value of 100 is a good starting point.
+ The use_carrier option, below, affects how the link state is
  determined. See the High Availability section for additional
  information. The default value is 0.

···
  active. A different slave becomes active if, and only
  if, the active slave fails. The bond's MAC address is
  externally visible on only one port (network adapter)
+ to avoid confusing the switch.
+
+ In bonding version 2.6.2 or later, when a failover
+ occurs in active-backup mode, bonding will issue one
+ or more gratuitous ARPs on the newly active slave.
+ One gratutious ARP is issued for the bonding master
+ interface and each VLAN interfaces configured above
+ it, provided that the interface has at least one IP
+ address configured. Gratuitous ARPs issued for VLAN
+ interfaces are tagged with the appropriate VLAN id.
+
+ This mode provides fault tolerance. The primary
+ option, documented below, affects the behavior of this
+ mode.

  balance-xor or 2

+ XOR policy: Transmit based on the selected transmit
+ hash policy. The default policy is a simple [(source
+ MAC address XOR'd with destination MAC address) modulo
+ slave count]. Alternate transmit policies may be
+ selected via the xmit_hash_policy option, described
+ below.
+
+ This mode provides load balancing and fault tolerance.

  broadcast or 3

···
  duplex settings. Utilizes all slaves in the active
  aggregator according to the 802.3ad specification.

+ Slave selection for outgoing traffic is done according
+ to the transmit hash policy, which may be changed from
+ the default simple XOR policy via the xmit_hash_policy
+ option, documented below. Note that not all transmit
+ policies may be 802.3ad compliant, particularly in
+ regards to the packet mis-ordering requirements of
+ section 43.2.4 of the 802.3ad standard. Differing
+ peer implementations will have varying tolerances for
+ noncompliance.
+
+ Prerequisites:

  1. Ethtool support in the base drivers for retrieving
  the speed and duplex of each slave.
···

  When a link is reconnected or a new slave joins the
  bond the receive traffic is redistributed among all
+ active slaves in the bond by initiating ARP Replies
  with the selected mac address to each of the
  clients. The updelay parameter (detailed below) must
  be set to a value equal or greater than the switch's
···
  0 will use the deprecated MII / ETHTOOL ioctls. The default
  value is 1.

+xmit_hash_policy
+
+ Selects the transmit hash policy to use for slave selection in
+ balance-xor and 802.3ad modes. Possible values are:
+
+ layer2
+
+ Uses XOR of hardware MAC addresses to generate the
+ hash. The formula is
+
+ (source MAC XOR destination MAC) modulo slave count
+
+ This algorithm will place all traffic to a particular
+ network peer on the same slave.
+
+ This algorithm is 802.3ad compliant.
+
+ layer3+4
+
+ This policy uses upper layer protocol information,
+ when available, to generate the hash. This allows for
+ traffic to a particular network peer to span multiple
+ slaves, although a single connection will not span
+ multiple slaves.
+
+ The formula for unfragmented TCP and UDP packets is
+
+ ((source port XOR dest port) XOR
+ ((source IP XOR dest IP) AND 0xffff)
+ modulo slave count
+
+ For fragmented TCP or UDP packets and all other IP
+ protocol traffic, the source and destination port
+ information is omitted. For non-IP traffic, the
+ formula is the same as for the layer2 transmit hash
+ policy.
+
+ This policy is intended to mimic the behavior of
+ certain switches, notably Cisco switches with PFC2 as
+ well as some Foundry and IBM products.
+
+ This algorithm is not fully 802.3ad compliant. A
+ single TCP or UDP conversation containing both
+ fragmented and unfragmented packets will see packets
+ striped across two interfaces. This may result in out
+ of order delivery. Most traffic types will not meet
+ this criteria, as TCP rarely fragments traffic, and
+ most UDP traffic is not involved in extended
+ conversations. Other implementations of 802.3ad may
+ or may not tolerate this noncompliance.
+
+ The default value is layer2. This option was added in bonding
+version 2.6.3. In earlier versions of bonding, this parameter does
+not exist, and the layer2 policy is the only policy.


 3. Configuring Bonding Devices
···
 slave devices. On SLES 9, this is most easily done by running the
 yast2 sysconfig configuration utility. The goal is for to create an
 ifcfg-id file for each slave device. The simplest way to accomplish
+this is to configure the devices for DHCP (this is only to get the
+file ifcfg-id file created; see below for some issues with DHCP). The
+name of the configuration file for each device will be of the form:

 ifcfg-id-xx:xx:xx:xx:xx:xx

···
  Once the set of ifcfg-id-xx:xx:xx:xx:xx:xx files has been
 created, it is necessary to edit the configuration files for the slave
 devices (the MAC addresses correspond to those of the slave devices).
+Before editing, the file will contain multiple lines, and will look
 something like this:

 BOOTPROTO='dhcp'
···
 BONDING_MASTER="yes"
 BONDING_MODULE_OPTS="mode=active-backup miimon=100"
 BONDING_SLAVE0="eth0"
+BONDING_SLAVE1="bus-pci-0000:06:08.1"

  Replace the sample BROADCAST, IPADDR, NETMASK and NETWORK
 values with the appropriate values for your network.

  The STARTMODE specifies when the device is brought online.
 The possible values are:
···
 the max_bonds bonding parameter; this will confuse the configuration
 system if you have multiple bonding devices.

+ Finally, supply one BONDING_SLAVEn="slave device" for each
+slave. where "n" is an increasing value, one for each slave. The
+"slave device" is either an interface name, e.g., "eth0", or a device
+specifier for the network device. The interface name is easier to
+find, but the ethN names are subject to change at boot time if, e.g.,
+a device early in the sequence has failed. The device specifiers
+(bus-pci-0000:06:08.1 in the example above) specify the physical
+network device, and will not change unless the device's bus location
+changes (for example, it is moved from one PCI slot to another). The
+example above uses one of each type for demonstration purposes; most
+configurations will choose one or the other for all slave devices.

  When all configuration files have been modified or created,
 networking must be restarted for the configuration changes to take
···
  Note that the network control script (/sbin/ifdown) will
 remove the bonding module as part of the network shutdown processing,
 so it is not necessary to remove the module by hand if, e.g., the
+module parameters have changed.

  Also, at this writing, YaST/YaST2 will not manage bonding
 devices (they do not show bonding interfaces on its list of network
···
  Note that the template does not document the various BONDING_
 settings described above, but does describe many of the other options.

+3.1.1 Using DHCP with sysconfig
+-------------------------------
+
+ Under sysconfig, configuring a device with BOOTPROTO='dhcp'
+will cause it to query DHCP for its IP address information. At this
+writing, this does not function for bonding devices; the scripts
+attempt to obtain the device address from DHCP prior to adding any of
+the slave devices. Without active slaves, the DHCP requests are not
+sent to the network.
+
+3.1.2 Configuring Multiple Bonds with sysconfig
+-----------------------------------------------
+
+ The sysconfig network initialization system is capable of
+handling multiple bonding devices. All that is necessary is for each
+bonding instance to have an appropriately configured ifcfg-bondX file
+(as described above). Do not specify the "max_bonds" parameter to any
+instance of bonding, as this will confuse sysconfig. If you require
+multiple bonding devices with identical parameters, create multiple
+ifcfg-bondX files.
+
+ Because the sysconfig scripts supply the bonding module
+options in the ifcfg-bondX file, it is not necessary to add them to
+the system /etc/modules.conf or /etc/modprobe.conf configuration file.
+
 3.2 Configuration with initscripts support
 ------------------------------------------

  This section applies to distros using a version of initscripts
 with bonding support, for example, Red Hat Linux 9 or Red Hat
+Enterprise Linux version 3 or 4. On these systems, the network
 initialization scripts have some knowledge of bonding, and can be
 configured to control bonding devices.

···
  Be sure to change the networking specific lines (IPADDR,
 NETMASK, NETWORK and BROADCAST) to match your network configuration.

+ Finally, it is necessary to edit /etc/modules.conf (or
+/etc/modprobe.conf, depending upon your distro) to load the bonding
+module with your desired options when the bond0 interface is brought
+up. The following lines in /etc/modules.conf (or modprobe.conf) will
+load the bonding module, and select its options:

 alias bond0 bonding
 options bond0 mode=balance-alb miimon=100
···
 will restart the networking subsystem and your bond link should be now
 up and running.

+3.2.1 Using DHCP with initscripts
+---------------------------------
+
+ Recent versions of initscripts (the version supplied with
+Fedora Core 3 and Red Hat Enterprise Linux 4 is reported to work) do
+have support for assigning IP information to bonding devices via DHCP.
+
+ To configure bonding for DHCP, configure it as described
+above, except replace the line "BOOTPROTO=none" with "BOOTPROTO=dhcp"
+and add a line consisting of "TYPE=Bonding". Note that the TYPE value
+is case sensitive.
+
+3.2.2 Configuring Multiple Bonds with initscripts
+-------------------------------------------------
+
+ At this writing, the initscripts package does not directly
+support loading the bonding driver multiple times, so the process for
+doing so is the same as described in the "Configuring Multiple Bonds
+Manually" section, below.
+
+ NOTE: It has been observed that some Red Hat supplied kernels
+are apparently unable to rename modules at load time (the "-obonding1"
+part). Attempts to pass that option to modprobe will produce an
+"Operation not permitted" error. This has been reported on some
+Fedora Core kernels, and has been seen on RHEL 4 as well. On kernels
+exhibiting this problem, it will be impossible to configure multiple
+bonds with differing parameters.

 3.3 Configuring Bonding Manually
 --------------------------------
···
 knowledge of bonding. One such distro is SuSE Linux Enterprise Server
 version 8.

+ The general method for these systems is to place the bonding
+module parameters into /etc/modules.conf or /etc/modprobe.conf (as
+appropriate for the installed distro), then add modprobe and/or
+ifenslave commands to the system's global init script. The name of
+the global init script differs; for sysconfig, it is
 /etc/init.d/boot.local and for initscripts it is /etc/rc.d/rc.local.

  For example, if you wanted to make a simple bond of two e100
···
 reboots, edit the appropriate file (/etc/init.d/boot.local or
 /etc/rc.d/rc.local), and add the following:

+modprobe bonding mode=balance-alb miimon=100
 modprobe e100
 ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up
 ifenslave bond0 eth0
···

  Replace the example bonding module parameters and bond0
 network configuration (IP address, netmask, etc) with the appropriate
+values for your configuration.

  Unfortunately, this method will not provide support for the
 ifup and ifdown scripts on the bond devices. To reload the bonding
···
 the following:

 # ifconfig bond0 down
+# rmmod bonding
 # rmmod e100

  Again, for convenience, it may be desirable to create a script
 with these commands.


+3.3.1 Configuring Multiple Bonds Manually
+-----------------------------------------

  This section contains information on configuring multiple
+bonding devices with differing options for those systems whose network
+initialization scripts lack support for configuring multiple bonds.
+
+ If you require multiple bonding devices, but all with the same
+options, you may wish to use the "max_bonds" module parameter,
+documented above.

  To create multiple bonding devices with differing options, it
 is necessary to load the bonding driver multiple times. Note that
···
 miimon of 100. The second instance is named "bond1" and creates the
 bond1 device in balance-alb mode with an miimon of 50.

+ In some circumstances (typically with older distributions),
+the above does not work, and the second bonding instance never sees
+its options. In that case, the second options line can be substituted
+as follows:

+install bonding1 /sbin/modprobe bonding -obond1 mode=balance-alb miimon=50
+
+ This may be repeated any number of times, specifying a new and
+unique name in place of bond1 for each subsequent instance.
+

 5. Querying Bonding Configuration
 =================================
···
 self generated packets.

  For reasons of simplicity, and to support the use of adapters
+that can do VLAN hardware acceleration offloading, the bonding
+interface declares itself as fully hardware offloading capable, it gets
 the add_vid/kill_vid notifications to gather the necessary
 information, and it propagates those actions to the slaves. In case
 of mixed adapter types, hardware accelerated tagged packets that
···
 matches the hardware address of the VLAN interfaces.

  Note that changing a VLAN interface's HW address would set the
+underlying device -- i.e. the bonding interface -- to promiscuous
 mode, which might not be what you want.


···
 an additional target (or several) increases the reliability of the ARP
 monitoring.

+ Multiple ARP targets must be separated by commas as follows:

 # example options for ARP monitoring with three targets
 alias bond0 bonding
···
  This will, when loading the bonding module, rather than
 performing the normal action, instead execute the provided command.
 This command loads the device drivers in the order needed, then calls
+modprobe with --ignore-install to cause the normal action to then take
 place.
Full documentation on this can be found in the modprobe.conf1050and modprobe manual pages.1051···1130common to enable promiscuous mode on the device, so that all traffic1131is seen (instead of seeing only traffic destined for the local host).1132The bonding driver handles promiscuous mode changes to the bonding1133+master device (e.g., bond0), and propagates the setting to the slave1134devices.11351136 For the balance-rr, balance-xor, broadcast, and 802.3ad modes,1137+the promiscuous mode setting is propagated to all slaves.11381139 For the active-backup, balance-tlb and balance-alb modes, the1140+promiscuous mode setting is propagated only to the active slave.11411142 For balance-tlb mode, the active slave is the slave currently1143receiving inbound traffic.···11481149 For the active-backup, balance-tlb and balance-alb modes, when1150the active slave changes (e.g., due to a link failure), the1151+promiscuous setting will be propagated to the new active slave.11521153+12. Configuring Bonding for High Availability1154+=============================================11551156 High Availability refers to configurations that provide1157maximum network availability by having redundant or backup devices,1158+links or switches between the host and the rest of the world. The1159+goal is to provide the maximum availability of network connectivity1160+(i.e., the network always works), even though other configurations1161+could provide higher throughput.0001162116312.1 High Availability in a Single Switch Topology1164--------------------------------------------------11651166+ If two hosts (or a host and a single switch) are directly1167+connected via multiple physical links, then there is no availability1168+penalty to optimizing for maximum bandwidth. In this case, there is1169+only one switch (or peer), so if it fails, there is no alternative1170+access to fail over to. 
Additionally, the bonding load balance modes1171+support link monitoring of their members, so if individual links fail,1172+the load will be rebalanced across the remaining devices.11731174+ See Section 13, "Configuring Bonding for Maximum Throughput"1175+for information on configuring bonding with one peer device.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001176117712.2 High Availability in a Multiple Switch Topology1178----------------------------------------------------11791180 With multiple switches, the configuration of bonding and the1181network changes dramatically. In multiple switch topologies, there is1182+a trade off between network availability and usable bandwidth.11831184 Below is a sample network, configured to maximize the1185availability of the network:···1312the outside world ("port3" on each switch). There is no technical1313reason that this could not be extended to a third switch.13141315+12.2.1 HA Bonding Mode Selection for Multiple Switch Topology1316+-------------------------------------------------------------13171318+ In a topology such as the example above, the active-backup and1319+broadcast modes are the only useful bonding modes when optimizing for1320+availability; the other modes require all links to terminate on the1321+same peer for them to behave rationally.13221323active-backup: This is generally the preferred mode, particularly if1324 the switches have an ISL and play together well. If the···1329broadcast: This mode is really a special purpose mode, and is suitable1330 only for very specific needs. For example, if the two1331 switches are not connected (no ISL), and the networks beyond1332+ them are totally independent. 
In this case, if it is1333 necessary for some specific one-way traffic to reach both1334 independent networks, then the broadcast mode may be suitable.13351336+12.2.2 HA Link Monitoring Selection for Multiple Switch Topology1337+----------------------------------------------------------------13381339 The choice of link monitoring ultimately depends upon your1340switch. If the switch can reliably fail ports in response to other···1345thus detecting that failure without switch support.13461347 In general, however, in a multiple switch topology, the ARP1348+monitor can provide a higher level of reliability in detecting end to1349+end connectivity failures (which may be caused by the failure of any1350+individual component to pass traffic for any reason). Additionally,1351+the ARP monitor should be configured with multiple targets (at least1352+one for each switch in the network). This will insure that,1353regardless of which switch is active, the ARP monitor has a suitable1354target to query.135513561357+13. Configuring Bonding for Maximum Throughput1358+==============================================13591360+13.1 Maximizing Throughput in a Single Switch Topology1361+------------------------------------------------------1362+1363+ In a single switch configuration, the best method to maximize1364+throughput depends upon the application and network environment. The1365+various load balancing modes each have strengths and weaknesses in1366+different environments, as detailed below.1367+1368+ For this discussion, we will break down the topologies into1369+two categories. Depending upon the destination of most traffic, we1370+categorize them into either "gatewayed" or "local" configurations.1371+1372+ In a gatewayed configuration, the "switch" is acting primarily1373+as a router, and the majority of traffic passes through this router to1374+other networks. 
An example would be the following:1375+1376+1377+ +----------+ +----------+1378+ | |eth0 port1| | to other networks1379+ | Host A +---------------------+ router +------------------->1380+ | +---------------------+ | Hosts B and C are out1381+ | |eth1 port2| | here somewhere1382+ +----------+ +----------+1383+1384+ The router may be a dedicated router device, or another host1385+acting as a gateway. For our discussion, the important point is that1386+the majority of traffic from Host A will pass through the router to1387+some other network before reaching its final destination.1388+1389+ In a gatewayed network configuration, although Host A may1390+communicate with many other systems, all of its traffic will be sent1391+and received via one other peer on the local network, the router.1392+1393+ Note that the case of two systems connected directly via1394+multiple physical links is, for purposes of configuring bonding, the1395+same as a gatewayed configuration. In that case, it happens that all1396+traffic is destined for the "gateway" itself, not some other network1397+beyond the gateway.1398+1399+ In a local configuration, the "switch" is acting primarily as1400+a switch, and the majority of traffic passes through this switch to1401+reach other stations on the same network. An example would be the1402+following:1403+1404+ +----------+ +----------+ +--------+1405+ | |eth0 port1| +-------+ Host B |1406+ | Host A +------------+ switch |port3 +--------+1407+ | +------------+ | +--------+1408+ | |eth1 port2| +------------------+ Host C |1409+ +----------+ +----------+port4 +--------+1410+1411+1412+ Again, the switch may be a dedicated switch device, or another1413+host acting as a gateway. 
For our discussion, the important point is1414+that the majority of traffic from Host A is destined for other hosts1415+on the same local network (Hosts B and C in the above example).1416+1417+ In summary, in a gatewayed configuration, traffic to and from1418+the bonded device will be to the same MAC level peer on the network1419+(the gateway itself, i.e., the router), regardless of its final1420+destination. In a local configuration, traffic flows directly to and1421+from the final destinations, thus, each destination (Host B, Host C)1422+will be addressed directly by their individual MAC addresses.1423+1424+ This distinction between a gatewayed and a local network1425+configuration is important because many of the load balancing modes1426+available use the MAC addresses of the local network source and1427+destination to make load balancing decisions. The behavior of each1428+mode is described below.1429+1430+1431+13.1.1 MT Bonding Mode Selection for Single Switch Topology1432+-----------------------------------------------------------1433+1434+ This configuration is the easiest to set up and to understand,1435+although you will have to decide which bonding mode best suits your1436+needs. The trade offs for each mode are detailed below:1437+1438+balance-rr: This mode is the only mode that will permit a single1439+ TCP/IP connection to stripe traffic across multiple1440+ interfaces. It is therefore the only mode that will allow a1441+ single TCP/IP stream to utilize more than one interface's1442+ worth of throughput. This comes at a cost, however: the1443+ striping often results in peer systems receiving packets out1444+ of order, causing TCP/IP's congestion control system to kick1445+ in, often by retransmitting segments.1446+1447+ It is possible to adjust TCP/IP's congestion limits by1448+ altering the net.ipv4.tcp_reordering sysctl parameter. 
The1449+ usual default value is 3, and the maximum useful value is 127.1450+ For a four interface balance-rr bond, expect that a single1451+ TCP/IP stream will utilize no more than approximately 2.31452+ interface's worth of throughput, even after adjusting1453+ tcp_reordering.1454+1455+ Note that this out of order delivery occurs when both the1456+ sending and receiving systems are utilizing a multiple1457+ interface bond. Consider a configuration in which a1458+ balance-rr bond feeds into a single higher capacity network1459+ channel (e.g., multiple 100Mb/sec ethernets feeding a single1460+ gigabit ethernet via an etherchannel capable switch). In this1461+ configuration, traffic sent from the multiple 100Mb devices to1462+ a destination connected to the gigabit device will not see1463+ packets out of order. However, traffic sent from the gigabit1464+ device to the multiple 100Mb devices may or may not see1465+ traffic out of order, depending upon the balance policy of the1466+ switch. Many switches do not support any modes that stripe1467+ traffic (instead choosing a port based upon IP or MAC level1468+ addresses); for those devices, traffic flowing from the1469+ gigabit device to the many 100Mb devices will only utilize one1470+ interface.1471+1472+ If you are utilizing protocols other than TCP/IP, UDP for1473+ example, and your application can tolerate out of order1474+ delivery, then this mode can allow for single stream datagram1475+ performance that scales near linearly as interfaces are added1476+ to the bond.1477+1478+ This mode requires the switch to have the appropriate ports1479+ configured for "etherchannel" or "trunking."1480+1481+active-backup: There is not much advantage in this network topology to1482+ the active-backup mode, as the inactive backup devices are all1483+ connected to the same peer as the primary. 
In this case, a
+	load balancing mode (with link monitoring) will provide the
+	same level of network availability, but with increased
+	available bandwidth. On the plus side, active-backup mode
+	does not require any configuration of the switch, so it may
+	have value if the hardware available does not support any of
+	the load balance modes.
+
+balance-xor: This mode will limit traffic such that packets destined
+	for specific peers will always be sent over the same
+	interface. Since the destination is determined by the MAC
+	addresses involved, this mode works best in a "local" network
+	configuration (as described above), with destinations all on
+	the same local network. This mode is likely to be suboptimal
+	if all your traffic is passed through a single router (i.e., a
+	"gatewayed" network configuration, as described above).
+
+	As with balance-rr, the switch ports need to be configured for
+	"etherchannel" or "trunking."
+
+broadcast: Like active-backup, there is not much advantage to this
+	mode in this type of network topology.
+
+802.3ad: This mode can be a good choice for this type of network
+	topology. The 802.3ad mode is an IEEE standard, so all peers
+	that implement 802.3ad should interoperate well. The 802.3ad
+	protocol includes automatic configuration of the aggregates,
+	so minimal manual configuration of the switch is needed
+	(typically only to designate that some set of devices is
+	available for 802.3ad). The 802.3ad standard also mandates
+	that frames be delivered in order (within certain limits), so
+	in general single connections will not see misordering of
+	packets. The 802.3ad mode does have some drawbacks: the
+	standard mandates that all devices in the aggregate operate at
+	the same speed and duplex. Also, as with all bonding load
+	balance modes other than balance-rr, no single connection will
+	be able to utilize more than a single interface's worth of
+	bandwidth.
+
+	Additionally, the Linux bonding 802.3ad implementation
+	distributes traffic by peer (using an XOR of MAC addresses),
+	so in a "gatewayed" configuration, all outgoing traffic will
+	generally use the same device. Incoming traffic may also end
+	up on a single device, but that is dependent upon the
+	balancing policy of the peer's 802.3ad implementation. In a
+	"local" configuration, traffic will be distributed across the
+	devices in the bond.
+
+	Finally, the 802.3ad mode mandates the use of the MII monitor;
+	therefore, the ARP monitor is not available in this mode.
+
+balance-tlb: The balance-tlb mode balances outgoing traffic by peer.
+	Since the balancing is done according to MAC address, in a
+	"gatewayed" configuration (as described above), this mode will
+	send all traffic across a single device. However, in a
+	"local" network configuration, this mode balances multiple
+	local network peers across devices in a vaguely intelligent
+	manner (not a simple XOR as in balance-xor or 802.3ad mode),
+	so that mathematically unlucky MAC addresses (i.e., ones that
+	XOR to the same value) will not all "bunch up" on a single
+	interface.
+
+	Unlike 802.3ad, interfaces may be of differing speeds, and no
+	special switch configuration is required. On the down side,
+	in this mode all incoming traffic arrives over a single
+	interface, this mode requires certain ethtool support in the
+	network device driver of the slave interfaces, and the ARP
+	monitor is not available.
+
+balance-alb: This mode is everything that balance-tlb is, and more.
+	It has all of the features (and restrictions) of balance-tlb,
+	and will also balance incoming traffic from local network
+	peers (as described in the Bonding Module Options section,
+	above).
+
+	The only additional down side to this mode is that the network
+	device driver must support changing the hardware address while
+	the device is open.
+
+13.1.2 MT Link Monitoring for Single Switch Topology
+----------------------------------------------------
+
+	The choice of link monitoring may largely depend upon which
+mode you choose to use. The more advanced load balancing modes do not
+support the use of the ARP monitor, and are thus restricted to using
+the MII monitor (which does not provide as high a level of end to end
+assurance as the ARP monitor).
+
+13.2 Maximum Throughput in a Multiple Switch Topology
+-----------------------------------------------------
+
+	Multiple switches may be utilized to optimize for throughput
+when they are configured in parallel as part of an isolated network
+between two or more systems, for example:
+
+                 +-----------+
+                 |  Host A   |
+                 +-+---+---+-+
+                   |   |   |
+          +--------+   |   +---------+
+          |            |             |
+   +------+---+  +-----+----+  +-----+----+
+   | Switch A |  | Switch B |  | Switch C |
+   +------+---+  +-----+----+  +-----+----+
+          |            |             |
+          +--------+   |   +---------+
+                   |   |   |
+                 +-+---+---+-+
+                 |  Host B   |
+                 +-----------+
+
+	In this configuration, the switches are isolated from one
+another.
+One reason to employ a topology such as this is that, for an
+isolated network with many hosts (a cluster configured for high
+performance, for example), using multiple smaller switches can be more
+cost effective than a single larger switch. For example, on a network
+with 24 hosts, three 24-port switches can be significantly less
+expensive than a single 72-port switch.
+
+	If access beyond the network is required, an individual host
+can be equipped with an additional network device connected to an
+external network; this host then additionally acts as a gateway.
+
+13.2.1 MT Bonding Mode Selection for Multiple Switch Topology
+-------------------------------------------------------------
+
+	In actual practice, the bonding mode typically employed in
+configurations of this type is balance-rr. Historically, in this
+network configuration, the usual caveats about out of order packet
+delivery are mitigated by the use of network adapters that do not do
+any kind of packet coalescing (via the use of NAPI, or because the
+device itself does not generate interrupts until some number of
+packets has arrived). When employed in this fashion, the balance-rr
+mode allows individual connections between two hosts to effectively
+utilize greater than one interface's bandwidth.
+
+13.2.2 MT Link Monitoring for Multiple Switch Topology
+------------------------------------------------------
+
+	Again, in actual practice, the MII monitor is most often used
+in this configuration, as performance is given preference over
+availability. The ARP monitor will function in this topology, but its
+advantages over the MII monitor are mitigated by the volume of probes
+needed as the number of systems involved grows (remember that each
+host in the network is configured with bonding).
+
+14. Switch Behavior Issues
+==========================
+
+14.1 Link Establishment and Failover Delays
+-------------------------------------------
+
+	Some switches exhibit undesirable behavior with regard to the
+timing of link up and down reporting by the switch.

 	First, when a link comes up, some switches may indicate that
 the link is up (carrier available), but not pass traffic over the
···
 	Second, some switches may "bounce" the link state one or more
 times while a link is changing state. This occurs most commonly while
 the switch is initializing. Again, an appropriate updelay value may
+help.

 	Note that when a bonding interface has no active links, the
+driver will immediately reuse the first link that goes up, even if the
+updelay parameter has been specified (the updelay is ignored in this
+case). If there are slave interfaces waiting for the updelay timeout
+to expire, the interface that first went into that state will be
+immediately reused. This reduces down time of the network if the
+value of updelay has been overestimated, and since this occurs only in
+cases with no connectivity, there is no additional penalty for
+ignoring the updelay.

 	In addition to the concerns about switch timings, if your
 switches take a long time to go into backup mode, it may be desirable
 to not activate a backup interface immediately after a link goes down.
 Failover may be delayed via the downdelay bonding module option.

+14.2 Duplicated Incoming Packets
+--------------------------------
+
+	It is not uncommon to observe a short burst of duplicated
+traffic when the bonding device is first used, or after it has been
+idle for some period of time.
+This is most easily observed by issuing
+a "ping" to some other host on the network, and noticing that the
+output from ping flags duplicates (typically one per slave).
+
+	For example, on a bond in active-backup mode with five slaves
+all connected to one switch, the output may appear as follows:
+
+# ping -n 10.0.4.2
+PING 10.0.4.2 (10.0.4.2) from 10.0.3.10 : 56(84) bytes of data.
+64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.7 ms
+64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
+64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
+64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
+64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
+64 bytes from 10.0.4.2: icmp_seq=2 ttl=64 time=0.216 ms
+64 bytes from 10.0.4.2: icmp_seq=3 ttl=64 time=0.267 ms
+64 bytes from 10.0.4.2: icmp_seq=4 ttl=64 time=0.222 ms
+
+	This is not due to an error in the bonding driver; rather, it
+is a side effect of how many switches update their MAC forwarding
+tables. Initially, the switch does not associate the MAC address in
+the packet with a particular switch port, and so it may send the
+traffic to all ports until its MAC forwarding table is updated. Since
+the interfaces attached to the bond may occupy multiple ports on a
+single switch, when the switch (temporarily) floods the traffic to all
+ports, the bond device receives multiple copies of the same packet
+(one per slave device).
+
+	The duplicated packet behavior is switch dependent; some
+switches exhibit this, and some do not. On switches that display this
+behavior, it can be induced by clearing the MAC forwarding table (on
+most Cisco switches, the privileged command "clear mac address-table
+dynamic" will accomplish this).
+
+15. Hardware Specific Considerations
 ====================================

 	This section contains additional information for configuring
 bonding on specific hardware platforms, or for interfacing bonding
 with particular switches or other devices.

+15.1 IBM BladeCenter
 --------------------

 	This applies to the JS20 and similar systems.
···
 --------------------------------

 	All JS20s come with two Broadcom Gigabit Ethernet ports
+integrated on the planar (that's "motherboard" in IBM-speak). In the
+BladeCenter chassis, the eth0 port of all JS20 blades is hard wired to
+I/O Module #1; similarly, all eth1 ports are wired to I/O Module #2.
+An add-on Broadcom daughter card can be installed on a JS20 to provide
+two more Gigabit Ethernet ports. These ports, eth2 and eth3, are
+wired to I/O Modules 3 and 4, respectively.

 	Each I/O Module may contain either a switch or a passthrough
 module (which allows ports to be directly connected to an external
···
 of ways, this discussion will be confined to describing basic
 configurations.

+	Normally, Ethernet Switch Modules (ESMs) are used in I/O
 modules 1 and 2. In this configuration, the eth0 and eth1 ports of a
 JS20 will be connected to different internal switches (in the
 respective I/O modules).

+	A passthrough module (OPM or CPM, optical or copper,
+passthrough module) connects the I/O module directly to an external
+switch. By using PMs in I/O module #1 and #2, the eth0 and eth1
+interfaces of a JS20 can be redirected to the outside world and
+connected to a common external switch.

+	Depending upon the mix of ESMs and PMs, the network will
+appear to bonding as either a single switch topology (all PMs) or as a
+multiple switch topology (one or more ESMs, zero or more PMs).
+It is also possible to connect ESMs together, resulting in a
+configuration much like the example in "High Availability in a
+Multiple Switch Topology," above.

+Requirements for specific modes
+-------------------------------

+	The balance-rr mode requires the use of passthrough modules
+for devices in the bond, all connected to a common external switch.
+That switch must be configured for "etherchannel" or "trunking" on the
 appropriate ports, as is usual for balance-rr.

 	The balance-alb and balance-tlb modes will function with
···
 Other concerns
 --------------

+	The Serial Over LAN (SoL) link is established over the primary
 ethernet (eth0) only, therefore, any loss of link to eth0 will result
 in losing your SoL connection. It will not fail over with other
+network traffic, as the SoL system is beyond the control of the
+bonding driver.

 	It may be desirable to disable spanning tree on the switch
 (either the internal Ethernet Switch Module, or an external switch) to
+avoid fail-over delay issues when using bonding.


+16. Frequently Asked Questions
 ==============================

 1. Is it SMP safe?
···
 2. What type of cards will work with it?

 	Any Ethernet type cards (you can even mix cards - an Intel
+EtherExpress PRO/100 and a 3com 3c905b, for example). For most modes,
+devices need not be of the same speed.

 3. How many bonding devices can I have?

···
 disabled. The active-backup mode will fail over to a backup link, and
 other modes will ignore the failed link. The link will continue to be
 monitored, and should it recover, it will rejoin the bond (in whatever
+manner is appropriate for the mode). See the sections on High
+Availability and the documentation for each mode for additional
+information.

 	Link monitoring can be enabled via either the miimon or
+arp_interval parameters (described in the module parameters section,
 above). In general, miimon monitors the carrier state as sensed by
 the underlying network device, and the arp monitor (arp_interval)
 monitors connectivity to another host on the local network.
···
 	If no link monitoring is configured, the bonding driver will
 be unable to detect link failures, and will assume that all links are
 always available. This will likely result in lost packets, and a
+resulting degradation of performance. The precise performance loss
 depends upon the bonding mode and network configuration.

 6. Can bonding be used for High Availability?
···
 	In the basic balance modes (balance-rr and balance-xor), it
 works with any system that supports etherchannel (also called
 trunking). Most managed switches currently available have such
+support, and many unmanaged switches as well.

 	The advanced balance modes (balance-tlb and balance-alb) do
 not have special switch requirements, but do need device drivers that
 support specific features (described in the appropriate section under
+module parameters, above).

 	In 802.3ad mode, it works with systems that support IEEE
 802.3ad Dynamic Link Aggregation. Most managed and many unmanaged
···

 8. Where does a bonding device get its MAC address from?

+	If not explicitly configured (with ifconfig or ip link), the
+MAC address of the bonding device is taken from its first slave
+device. This MAC address is then passed to all following slaves and
+remains persistent (even if the first slave is removed) until the
+bonding device is brought down or reconfigured.

 	If you wish to change the MAC address, you can set it with
+ifconfig or ip link:

 # ifconfig bond0 hw ether 00:11:22:33:44:55
+
+# ip link set bond0 address 66:77:88:99:aa:bb

 	The MAC address can also be changed by bringing down/up the
 device and then changing its slaves (or their order):
···
 then restore the MAC addresses that the slaves had before they were
 enslaved.

+17. Resources and Links
 =======================

 The latest version of the bonding driver can be found in the latest
 version of the Linux kernel, found on http://kernel.org

+The latest version of this document can be found in either the latest
+kernel source (named Documentation/networking/bonding.txt), or on the
+bonding sourceforge site:
+
+http://www.sourceforge.net/projects/bonding
+
 Discussions regarding the bonding driver take place primarily on the
 bonding-devel mailing list, hosted at sourceforge.net. If you have
+questions or problems, post them to the list. The list address is:

 bonding-devel@lists.sourceforge.net

+	The administrative interface (to subscribe or unsubscribe) can
+be found at:
+
 https://lists.sourceforge.net/lists/listinfo/bonding-devel

 Donald Becker's Ethernet Drivers and diag programs may be found at :
  - http://www.scyld.com/network/
+1-1
drivers/net/hamradio/Kconfig
···

 config 6PACK
 	tristate "Serial port 6PACK driver"
-	depends on AX25 && BROKEN_ON_SMP
+	depends on AX25
 	---help---
 	  6pack is a transmission protocol for the data exchange between your
 	  PC and your TNC (the Terminal Node Controller acts as a kind of
+1-1
drivers/net/sk98lin/skgeinit.c
···
 	 * we set the PHY to coma mode and switch to D3 power state.
 	 */
 	if (pAC->GIni.GIYukonLite &&
-		pAC->GIni.GIChipRev == CHIP_REV_YU_LITE_A3) {
+		pAC->GIni.GIChipRev >= CHIP_REV_YU_LITE_A3) {

 		/* for all ports switch PHY to coma mode */
 		for (i = 0; i < pAC->GIni.GIMacsFound; i++) {
+4-4
drivers/net/sk98lin/skxmac2.c
···

 	/* WA code for COMA mode */
 	if (pAC->GIni.GIYukonLite &&
-		pAC->GIni.GIChipRev == CHIP_REV_YU_LITE_A3) {
+		pAC->GIni.GIChipRev >= CHIP_REV_YU_LITE_A3) {

 		SK_IN32(IoC, B2_GP_IO, &DWord);

···

 	/* WA code for COMA mode */
 	if (pAC->GIni.GIYukonLite &&
-		pAC->GIni.GIChipRev == CHIP_REV_YU_LITE_A3) {
+		pAC->GIni.GIChipRev >= CHIP_REV_YU_LITE_A3) {

 		SK_IN32(IoC, B2_GP_IO, &DWord);

···
 	int	Ret = 0;

 	if (pAC->GIni.GIYukonLite &&
-		pAC->GIni.GIChipRev == CHIP_REV_YU_LITE_A3) {
+		pAC->GIni.GIChipRev >= CHIP_REV_YU_LITE_A3) {

 		/* save current power mode */
 		LastMode = pAC->GIni.GP[Port].PPhyPowerState;
···
 	int	Ret = 0;

 	if (pAC->GIni.GIYukonLite &&
-		pAC->GIni.GIChipRev == CHIP_REV_YU_LITE_A3) {
+		pAC->GIni.GIChipRev >= CHIP_REV_YU_LITE_A3) {

 		/* save current power mode */
 		LastMode = pAC->GIni.GP[Port].PPhyPowerState;