Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

docs: networking: convert bonding.txt to ReST

- add SPDX header;
- adjust titles and chapters, adding proper markups;
- comment out text-only TOC from html/pdf output;
- mark code blocks and literals as such;
- mark tables as such;
- add notes markups;
- adjust indentation, whitespace and blank lines;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Authored by Mauro Carvalho Chehab and committed by David S. Miller
a362032e b5fcf32d

+664 -610
+660 -607
Documentation/networking/bonding.txt → Documentation/networking/bonding.rst
···
+ .. SPDX-License-Identifier: GPL-2.0

- Linux Ethernet Bonding Driver HOWTO
+ ===================================
+ Linux Ethernet Bonding Driver HOWTO
+ ===================================

- Latest update: 27 April 2011
+ Latest update: 27 April 2011

- Initial release : Thomas Davis <tadavis at lbl.gov>
- Corrections, HA extensions : 2000/10/03-15 :
+ Initial release: Thomas Davis <tadavis at lbl.gov>
+
+ Corrections, HA extensions: 2000/10/03-15:
+
  - Willy Tarreau <willy at meta-x.org>
  - Constantine Gavrilov <const-g at xpert.com>
  - Chad N. Tindel <ctindel at ieee dot org>
···
  Reorganized and updated Feb 2005 by Jay Vosburgh
  Added Sysfs information: 2006/04/24
+
  - Mitch Williams <mitch.a.williams at intel.com>

  Introduction
  ============

- The Linux bonding driver provides a method for aggregating
+ The Linux bonding driver provides a method for aggregating
  multiple network interfaces into a single logical "bonded" interface.
  The behavior of the bonded interfaces depends upon the mode; generally
  speaking, modes provide either hot standby or load balancing services.
  Additionally, link integrity monitoring may be performed.
-
- The bonding driver originally came from Donald Becker's
+
+ The bonding driver originally came from Donald Becker's
  beowulf patches for kernel 2.0. It has changed quite a bit since, and
  the original tools from extreme-linux and beowulf sites will not work
  with this version of the driver.

- For new versions of the driver, updated userspace tools, and
+ For new versions of the driver, updated userspace tools, and
  who to ask for help, please follow the links at the end of this file.

- Table of Contents
- =================
+ .. Table of Contents

- 1. Bonding Driver Installation
+ 1. Bonding Driver Installation

- 2. Bonding Driver Options
+ 2. Bonding Driver Options

- 3. Configuring Bonding Devices
- 3.1 Configuration with Sysconfig Support
- 3.1.1 Using DHCP with Sysconfig
- 3.1.2 Configuring Multiple Bonds with Sysconfig
- 3.2 Configuration with Initscripts Support
- 3.2.1 Using DHCP with Initscripts
- 3.2.2 Configuring Multiple Bonds with Initscripts
- 3.3 Configuring Bonding Manually with Ifenslave
- 3.3.1 Configuring Multiple Bonds Manually
- 3.4 Configuring Bonding Manually via Sysfs
- 3.5 Configuration with Interfaces Support
- 3.6 Overriding Configuration for Special Cases
- 3.7 Configuring LACP for 802.3ad mode in a more secure way
+ 3. Configuring Bonding Devices
+ 3.1 Configuration with Sysconfig Support
+ 3.1.1 Using DHCP with Sysconfig
+ 3.1.2 Configuring Multiple Bonds with Sysconfig
+ 3.2 Configuration with Initscripts Support
+ 3.2.1 Using DHCP with Initscripts
+ 3.2.2 Configuring Multiple Bonds with Initscripts
+ 3.3 Configuring Bonding Manually with Ifenslave
+ 3.3.1 Configuring Multiple Bonds Manually
+ 3.4 Configuring Bonding Manually via Sysfs
+ 3.5 Configuration with Interfaces Support
+ 3.6 Overriding Configuration for Special Cases
+ 3.7 Configuring LACP for 802.3ad mode in a more secure way

- 4. Querying Bonding Configuration
- 4.1 Bonding Configuration
- 4.2 Network Configuration
+ 4. Querying Bonding Configuration
+ 4.1 Bonding Configuration
+ 4.2 Network Configuration

- 5. Switch Configuration
+ 5. Switch Configuration

- 6. 802.1q VLAN Support
+ 6. 802.1q VLAN Support

- 7. Link Monitoring
- 7.1 ARP Monitor Operation
- 7.2 Configuring Multiple ARP Targets
- 7.3 MII Monitor Operation
+ 7. Link Monitoring
+ 7.1 ARP Monitor Operation
+ 7.2 Configuring Multiple ARP Targets
+ 7.3 MII Monitor Operation

- 8. Potential Trouble Sources
- 8.1 Adventures in Routing
- 8.2 Ethernet Device Renaming
- 8.3 Painfully Slow Or No Failed Link Detection By Miimon
+ 8. Potential Trouble Sources
+ 8.1 Adventures in Routing
+ 8.2 Ethernet Device Renaming
+ 8.3 Painfully Slow Or No Failed Link Detection By Miimon

- 9. SNMP agents
+ 9. SNMP agents

- 10. Promiscuous mode
+ 10. Promiscuous mode

- 11. Configuring Bonding for High Availability
- 11.1 High Availability in a Single Switch Topology
- 11.2 High Availability in a Multiple Switch Topology
- 11.2.1 HA Bonding Mode Selection for Multiple Switch Topology
- 11.2.2 HA Link Monitoring for Multiple Switch Topology
+ 11. Configuring Bonding for High Availability
+ 11.1 High Availability in a Single Switch Topology
+ 11.2 High Availability in a Multiple Switch Topology
+ 11.2.1 HA Bonding Mode Selection for Multiple Switch Topology
+ 11.2.2 HA Link Monitoring for Multiple Switch Topology

- 12. Configuring Bonding for Maximum Throughput
- 12.1 Maximum Throughput in a Single Switch Topology
- 12.1.1 MT Bonding Mode Selection for Single Switch Topology
- 12.1.2 MT Link Monitoring for Single Switch Topology
- 12.2 Maximum Throughput in a Multiple Switch Topology
- 12.2.1 MT Bonding Mode Selection for Multiple Switch Topology
- 12.2.2 MT Link Monitoring for Multiple Switch Topology
+ 12. Configuring Bonding for Maximum Throughput
+ 12.1 Maximum Throughput in a Single Switch Topology
+ 12.1.1 MT Bonding Mode Selection for Single Switch Topology
+ 12.1.2 MT Link Monitoring for Single Switch Topology
+ 12.2 Maximum Throughput in a Multiple Switch Topology
+ 12.2.1 MT Bonding Mode Selection for Multiple Switch Topology
+ 12.2.2 MT Link Monitoring for Multiple Switch Topology

- 13. Switch Behavior Issues
- 13.1 Link Establishment and Failover Delays
- 13.2 Duplicated Incoming Packets
+ 13. Switch Behavior Issues
+ 13.1 Link Establishment and Failover Delays
+ 13.2 Duplicated Incoming Packets

- 14. Hardware Specific Considerations
- 14.1 IBM BladeCenter
+ 14. Hardware Specific Considerations
+ 14.1 IBM BladeCenter

- 15. Frequently Asked Questions
+ 15. Frequently Asked Questions

- 16. Resources and Links
+ 16. Resources and Links


  1. Bonding Driver Installation
  ==============================

- Most popular distro kernels ship with the bonding driver
+ Most popular distro kernels ship with the bonding driver
  already available as a module. If your distro does not, or you
  have need to compile bonding from source (e.g., configuring and
  installing a mainline kernel from kernel.org), you'll need to perform
···
  1.1 Configure and build the kernel with bonding
  -----------------------------------------------

- The current version of the bonding driver is available in the
+ The current version of the bonding driver is available in the
  drivers/net/bonding subdirectory of the most recent kernel source
  (which is available on http://kernel.org). Most users "rolling their
  own" will want to use the most recent kernel from kernel.org.

- Configure kernel with "make menuconfig" (or "make xconfig" or
+ Configure kernel with "make menuconfig" (or "make xconfig" or
  "make config"), then select "Bonding driver support" in the "Network
  device support" section. It is recommended that you configure the
  driver as module since it is currently the only way to pass parameters
  to the driver or configure more than one bonding device.

- Build and install the new kernel and modules.
+ Build and install the new kernel and modules.
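The "Bonding driver support" menuconfig selection walked through above corresponds to a kernel configuration fragment along these lines (a sketch using the mainline Kconfig symbol names; the driver is built as a module, as the HOWTO recommends):

```
# .config fragment (sketch): bonding built as a module
CONFIG_NETDEVICES=y
CONFIG_BONDING=m
```

With CONFIG_BONDING=m, `make modules_install` places bonding.ko where modprobe can find it, which is what allows module parameters to be passed at load time as described in section 2.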

  1.2 Bonding Control Utility
- -------------------------------------
+ ---------------------------

- It is recommended to configure bonding via iproute2 (netlink)
+ It is recommended to configure bonding via iproute2 (netlink)
  or sysfs, the old ifenslave control utility is obsolete.

  2. Bonding Driver Options
  =========================

- Options for the bonding driver are supplied as parameters to the
+ Options for the bonding driver are supplied as parameters to the
  bonding module at load time, or are specified via sysfs.

- Module options may be given as command line arguments to the
+ Module options may be given as command line arguments to the
  insmod or modprobe command, but are usually specified in either the
- /etc/modprobe.d/*.conf configuration files, or in a distro-specific
+ ``/etc/modprobe.d/*.conf`` configuration files, or in a distro-specific
  configuration file (some of which are detailed in the next section).

- Details on bonding support for sysfs is provided in the
+ Details on bonding support for sysfs is provided in the
  "Configuring Bonding Manually via Sysfs" section, below.

- The available bonding driver parameters are listed below. If a
+ The available bonding driver parameters are listed below. If a
  parameter is not specified the default value is used. When initially
  configuring a bond, it is recommended "tail -f /var/log/messages" be
  run in a separate window to watch for bonding driver error messages.

- It is critical that either the miimon or arp_interval and
+ It is critical that either the miimon or arp_interval and
  arp_ip_target parameters be specified, otherwise serious network
  degradation will occur during link failures. Very few devices do not
  support at least miimon, so there is really no reason not to use it.

- Options with textual values will accept either the text name
+ Options with textual values will accept either the text name
  or, for backwards compatibility, the option value. E.g.,
  "mode=802.3ad" and "mode=4" set the same mode.

- The parameters are as follows:
+ The parameters are as follows:

  active_slave

···

  In an AD system, the port-key has three parts as shown below -

+ ===== ============
  Bits  Use
+ ===== ============
  00    Duplex
  01-05 Speed
  06-15 User-defined
+ ===== ============

  This defines the upper 10 bits of the port key. The values can be
  from 0 - 1023. If not given, the system defaults to 0.
···
  swapped with the new curr_active_slave that was
  chosen.

- num_grat_arp
+ num_grat_arp,
  num_unsol_na

  Specify the number of peer notifications (gratuitous ARPs and
···

  peer_notif_delay

- Specify the delay, in milliseconds, between each peer
- notification (gratuitous ARP and unsolicited IPv6 Neighbor
- Advertisement) when they are issued after a failover event.
- This delay should be a multiple of the link monitor interval
- (arp_interval or miimon, whichever is active). The default
- value is 0 which means to match the value of the link monitor
- interval.
+ Specify the delay, in milliseconds, between each peer
+ notification (gratuitous ARP and unsolicited IPv6 Neighbor
+ Advertisement) when they are issued after a failover event.
+ This delay should be a multiple of the link monitor interval
+ (arp_interval or miimon, whichever is active). The default
+ value is 0 which means to match the value of the link monitor
+ interval.

  primary

···
  3. Configuring Bonding Devices
  ==============================

- You can configure bonding using either your distro's network
+ You can configure bonding using either your distro's network
  initialization scripts, or manually using either iproute2 or the
  sysfs interface. Distros generally use one of three packages for the
  network initialization scripts: initscripts, sysconfig or interfaces.
  Recent versions of these packages have support for bonding, while older
  versions do not.

- We will first describe the options for configuring bonding for
+ We will first describe the options for configuring bonding for
  distros using versions of initscripts, sysconfig and interfaces with full
  or partial support for bonding, then provide information on enabling
  bonding without support from the network initialization scripts (i.e.,
  older versions of initscripts or sysconfig).

- If you're unsure whether your distro uses sysconfig,
+ If you're unsure whether your distro uses sysconfig,
  initscripts or interfaces, or don't know if it's new enough, have no fear.
  Determining this is fairly straightforward.

- First, look for a file called interfaces in /etc/network directory.
+ First, look for a file called interfaces in /etc/network directory.
  If this file is present in your system, then your system use interfaces. See
  Configuration with Interfaces Support.

- Else, issue the command:
+ Else, issue the command::

- $ rpm -qf /sbin/ifup
+ $ rpm -qf /sbin/ifup

- It will respond with a line of text starting with either
+ It will respond with a line of text starting with either
  "initscripts" or "sysconfig," followed by some numbers. This is the
  package that provides your network initialization scripts.

- Next, to determine if your installation supports bonding,
- issue the command:
+ Next, to determine if your installation supports bonding,
+ issue the command::

- $ grep ifenslave /sbin/ifup
+ $ grep ifenslave /sbin/ifup

- If this returns any matches, then your initscripts or
+ If this returns any matches, then your initscripts or
  sysconfig has support for bonding.

  3.1 Configuration with Sysconfig Support
  ----------------------------------------

- This section applies to distros using a version of sysconfig
+ This section applies to distros using a version of sysconfig
  with bonding support, for example, SuSE Linux Enterprise Server 9.

- SuSE SLES 9's networking configuration system does support
+ SuSE SLES 9's networking configuration system does support
  bonding, however, at this writing, the YaST system configuration
  front end does not provide any means to work with bonding devices.
  Bonding devices can be managed by hand, however, as follows.

- First, if they have not already been configured, configure the
+ First, if they have not already been configured, configure the
  slave devices. On SLES 9, this is most easily done by running the
  yast2 sysconfig configuration utility. The goal is for to create an
  ifcfg-id file for each slave device. The simplest way to accomplish
  this is to configure the devices for DHCP (this is only to get the
  file ifcfg-id file created; see below for some issues with DHCP). The
- name of the configuration file for each device will be of the form:
+ name of the configuration file for each device will be of the form::

- ifcfg-id-xx:xx:xx:xx:xx:xx
+ ifcfg-id-xx:xx:xx:xx:xx:xx

- Where the "xx" portion will be replaced with the digits from
+ Where the "xx" portion will be replaced with the digits from
  the device's permanent MAC address.

- Once the set of ifcfg-id-xx:xx:xx:xx:xx:xx files has been
+ Once the set of ifcfg-id-xx:xx:xx:xx:xx:xx files has been
  created, it is necessary to edit the configuration files for the slave
  devices (the MAC addresses correspond to those of the slave devices).
  Before editing, the file will contain multiple lines, and will look
- something like this:
+ something like this::

- BOOTPROTO='dhcp'
- STARTMODE='on'
- USERCTL='no'
- UNIQUE='XNzu.WeZGOGF+4wE'
- _nm_name='bus-pci-0001:61:01.0'
+ BOOTPROTO='dhcp'
+ STARTMODE='on'
+ USERCTL='no'
+ UNIQUE='XNzu.WeZGOGF+4wE'
+ _nm_name='bus-pci-0001:61:01.0'

- Change the BOOTPROTO and STARTMODE lines to the following:
+ Change the BOOTPROTO and STARTMODE lines to the following::

- BOOTPROTO='none'
- STARTMODE='off'
+ BOOTPROTO='none'
+ STARTMODE='off'

- Do not alter the UNIQUE or _nm_name lines. Remove any other
+ Do not alter the UNIQUE or _nm_name lines. Remove any other
  lines (USERCTL, etc).

- Once the ifcfg-id-xx:xx:xx:xx:xx:xx files have been modified,
+ Once the ifcfg-id-xx:xx:xx:xx:xx:xx files have been modified,
  it's time to create the configuration file for the bonding device
  itself. This file is named ifcfg-bondX, where X is the number of the
  bonding device to create, starting at 0. The first such file is
···
  network configuration system will correctly start multiple instances
  of bonding.

- The contents of the ifcfg-bondX file is as follows:
+ The contents of the ifcfg-bondX file is as follows::

- BOOTPROTO="static"
- BROADCAST="10.0.2.255"
- IPADDR="10.0.2.10"
- NETMASK="255.255.0.0"
- NETWORK="10.0.2.0"
- REMOTE_IPADDR=""
- STARTMODE="onboot"
- BONDING_MASTER="yes"
- BONDING_MODULE_OPTS="mode=active-backup miimon=100"
- BONDING_SLAVE0="eth0"
- BONDING_SLAVE1="bus-pci-0000:06:08.1"
+ BOOTPROTO="static"
+ BROADCAST="10.0.2.255"
+ IPADDR="10.0.2.10"
+ NETMASK="255.255.0.0"
+ NETWORK="10.0.2.0"
+ REMOTE_IPADDR=""
+ STARTMODE="onboot"
+ BONDING_MASTER="yes"
+ BONDING_MODULE_OPTS="mode=active-backup miimon=100"
+ BONDING_SLAVE0="eth0"
+ BONDING_SLAVE1="bus-pci-0000:06:08.1"

- Replace the sample BROADCAST, IPADDR, NETMASK and NETWORK
+ Replace the sample BROADCAST, IPADDR, NETMASK and NETWORK
  values with the appropriate values for your network.

- The STARTMODE specifies when the device is brought online.
+ The STARTMODE specifies when the device is brought online.
  The possible values are:

- onboot: The device is started at boot time. If you're not
+ ======== ======================================================
+ onboot   The device is started at boot time. If you're not
  sure, this is probably what you want.

- manual: The device is started only when ifup is called
+ manual   The device is started only when ifup is called
  manually. Bonding devices may be configured this
  way if you do not wish them to start automatically
  at boot for some reason.

- hotplug: The device is started by a hotplug event. This is not
+ hotplug  The device is started by a hotplug event. This is not
  a valid choice for a bonding device.

- off or ignore: The device configuration is ignored.
+ off or   The device configuration is ignored.
+ ignore
+ ======== ======================================================

- The line BONDING_MASTER='yes' indicates that the device is a
+ The line BONDING_MASTER='yes' indicates that the device is a
  bonding master device. The only useful value is "yes."

- The contents of BONDING_MODULE_OPTS are supplied to the
+ The contents of BONDING_MODULE_OPTS are supplied to the
  instance of the bonding module for this device. Specify the options
  for the bonding mode, link monitoring, and so on here. Do not include
  the max_bonds bonding parameter; this will confuse the configuration
  system if you have multiple bonding devices.

- Finally, supply one BONDING_SLAVEn="slave device" for each
+ Finally, supply one BONDING_SLAVEn="slave device" for each
  slave. where "n" is an increasing value, one for each slave. The
  "slave device" is either an interface name, e.g., "eth0", or a device
  specifier for the network device. The interface name is easier to
···
  example above uses one of each type for demonstration purposes; most
  configurations will choose one or the other for all slave devices.

- When all configuration files have been modified or created,
+ When all configuration files have been modified or created,
  networking must be restarted for the configuration changes to take
- effect. This can be accomplished via the following:
+ effect. This can be accomplished via the following::

- # /etc/init.d/network restart
+ # /etc/init.d/network restart

- Note that the network control script (/sbin/ifdown) will
+ Note that the network control script (/sbin/ifdown) will
  remove the bonding module as part of the network shutdown processing,
  so it is not necessary to remove the module by hand if, e.g., the
  module parameters have changed.

- Also, at this writing, YaST/YaST2 will not manage bonding
+ Also, at this writing, YaST/YaST2 will not manage bonding
  devices (they do not show bonding interfaces on its list of network
  devices). It is necessary to edit the configuration file by hand to
  change the bonding configuration.

- Additional general options and details of the ifcfg file
- format can be found in an example ifcfg template file:
+ Additional general options and details of the ifcfg file
+ format can be found in an example ifcfg template file::

- /etc/sysconfig/network/ifcfg.template
+ /etc/sysconfig/network/ifcfg.template

- Note that the template does not document the various BONDING_
+ Note that the template does not document the various ``BONDING_*``
  settings described above, but does describe many of the other options.

  3.1.1 Using DHCP with Sysconfig
  -------------------------------

- Under sysconfig, configuring a device with BOOTPROTO='dhcp'
+ Under sysconfig, configuring a device with BOOTPROTO='dhcp'
  will cause it to query DHCP for its IP address information. At this
  writing, this does not function for bonding devices; the scripts
  attempt to obtain the device address from DHCP prior to adding any of
···
  3.1.2 Configuring Multiple Bonds with Sysconfig
  -----------------------------------------------

- The sysconfig network initialization system is capable of
+ The sysconfig network initialization system is capable of
  handling multiple bonding devices. All that is necessary is for each
  bonding instance to have an appropriately configured ifcfg-bondX file
  (as described above). Do not specify the "max_bonds" parameter to any
···
  multiple bonding devices with identical parameters, create multiple
  ifcfg-bondX files.

- Because the sysconfig scripts supply the bonding module
+ Because the sysconfig scripts supply the bonding module
  options in the ifcfg-bondX file, it is not necessary to add them to
- the system /etc/modules.d/*.conf configuration files.
+ the system ``/etc/modules.d/*.conf`` configuration files.

  3.2 Configuration with Initscripts Support
  ------------------------------------------

- This section applies to distros using a recent version of
+ This section applies to distros using a recent version of
  initscripts with bonding support, for example, Red Hat Enterprise Linux
  version 3 or later, Fedora, etc. On these systems, the network
  initialization scripts have knowledge of bonding, and can be configured to
···
  package have lower levels of support for bonding; this will be noted where
  applicable.

- These distros will not automatically load the network adapter
+ These distros will not automatically load the network adapter
  driver unless the ethX device is configured with an IP address.
  Because of this constraint, users must manually configure a
  network-script file for all physical adapters that will be members of
···

  /etc/sysconfig/network-scripts

- The file name must be prefixed with "ifcfg-eth" and suffixed
+ The file name must be prefixed with "ifcfg-eth" and suffixed
  with the adapter's physical adapter number. For example, the script
  for eth0 would be named /etc/sysconfig/network-scripts/ifcfg-eth0.
- Place the following text in the file:
+ Place the following text in the file::

- DEVICE=eth0
- USERCTL=no
- ONBOOT=yes
- MASTER=bond0
- SLAVE=yes
- BOOTPROTO=none
+ DEVICE=eth0
+ USERCTL=no
+ ONBOOT=yes
+ MASTER=bond0
+ SLAVE=yes
+ BOOTPROTO=none

- The DEVICE= line will be different for every ethX device and
+ The DEVICE= line will be different for every ethX device and
  must correspond with the name of the file, i.e., ifcfg-eth1 must have
  a device line of DEVICE=eth1. The setting of the MASTER= line will
  also depend on the final bonding interface name chosen for your bond.
···
  one for each device, i.e., the first bonding instance is bond0, the
  second is bond1, and so on.

- Next, create a bond network script. The file name for this
+ Next, create a bond network script. The file name for this
  script will be /etc/sysconfig/network-scripts/ifcfg-bondX where X is
  the number of the bond. For bond0 the file is named "ifcfg-bond0",
  for bond1 it is named "ifcfg-bond1", and so on. Within that file,
- place the following text:
+ place the following text::

- DEVICE=bond0
- IPADDR=192.168.1.1
- NETMASK=255.255.255.0
- NETWORK=192.168.1.0
- BROADCAST=192.168.1.255
- ONBOOT=yes
- BOOTPROTO=none
- USERCTL=no
+ DEVICE=bond0
+ IPADDR=192.168.1.1
+ NETMASK=255.255.255.0
+ NETWORK=192.168.1.0
+ BROADCAST=192.168.1.255
+ ONBOOT=yes
+ BOOTPROTO=none
+ USERCTL=no

- Be sure to change the networking specific lines (IPADDR,
+ Be sure to change the networking specific lines (IPADDR,
  NETMASK, NETWORK and BROADCAST) to match your network configuration.

- For later versions of initscripts, such as that found with Fedora
+ For later versions of initscripts, such as that found with Fedora
  7 (or later) and Red Hat Enterprise Linux version 5 (or later), it is possible,
  and, indeed, preferable, to specify the bonding options in the ifcfg-bond0
- file, e.g. a line of the format:
+ file, e.g. a line of the format::

- BONDING_OPTS="mode=active-backup arp_interval=60 arp_ip_target=192.168.1.254"
+ BONDING_OPTS="mode=active-backup arp_interval=60 arp_ip_target=192.168.1.254"

- will configure the bond with the specified options. The options
+ will configure the bond with the specified options. The options
  specified in BONDING_OPTS are identical to the bonding module parameters
  except for the arp_ip_target field when using versions of initscripts older
  than and 8.57 (Fedora 8) and 8.45.19 (Red Hat Enterprise Linux 5.2). When
  using older versions each target should be included as a separate option and
  should be preceded by a '+' to indicate it should be added to the list of
- queried targets, e.g.,
+ queried targets, e.g.,::

- arp_ip_target=+192.168.1.1 arp_ip_target=+192.168.1.2
+ arp_ip_target=+192.168.1.1 arp_ip_target=+192.168.1.2

- is the proper syntax to specify multiple targets. When specifying
- options via BONDING_OPTS, it is not necessary to edit /etc/modprobe.d/*.conf.
+ is the proper syntax to specify multiple targets. When specifying
+ options via BONDING_OPTS, it is not necessary to edit
+ ``/etc/modprobe.d/*.conf``.

- For even older versions of initscripts that do not support
+ For even older versions of initscripts that do not support
  BONDING_OPTS, it is necessary to edit /etc/modprobe.d/*.conf, depending upon
  your distro) to load the bonding module with your desired options when the
  bond0 interface is brought up. The following lines in /etc/modprobe.d/*.conf
  will load the bonding module, and select its options:

- alias bond0 bonding
- options bond0 mode=balance-alb miimon=100
+ alias bond0 bonding
+ options bond0 mode=balance-alb miimon=100

- Replace the sample parameters with the appropriate set of
+ Replace the sample parameters with the appropriate set of
  options for your configuration.

- Finally run "/etc/rc.d/init.d/network restart" as root. This
+ Finally run "/etc/rc.d/init.d/network restart" as root. This
  will restart the networking subsystem and your bond link should be now
  up and running.

  3.2.1 Using DHCP with Initscripts
  ---------------------------------

- Recent versions of initscripts (the versions supplied with Fedora
+ Recent versions of initscripts (the versions supplied with Fedora
  Core 3 and Red Hat Enterprise Linux 4, or later versions, are reported to
  work) have support for assigning IP information to bonding devices via
  DHCP.

- To configure bonding for DHCP, configure it as described
+ To configure bonding for DHCP, configure it as described
  above, except replace the line "BOOTPROTO=none" with "BOOTPROTO=dhcp"
  and add a line consisting of "TYPE=Bonding". Note that the TYPE value
  is case sensitive.
···
  3.2.2 Configuring Multiple Bonds with Initscripts
  -------------------------------------------------

- Initscripts packages that are included with Fedora 7 and Red Hat
+ Initscripts packages that are included with Fedora 7 and Red Hat
  Enterprise Linux 5 support multiple bonding interfaces by simply
  specifying the appropriate BONDING_OPTS= in ifcfg-bondX where X is the
  number of the bond. This support requires sysfs support in the kernel,
···
  3.3 Configuring Bonding Manually with iproute2
  -----------------------------------------------

- This section applies to distros whose network initialization
+ This section applies to distros whose network initialization
  scripts (the sysconfig or initscripts package) do not have specific
  knowledge of bonding. One such distro is SuSE Linux Enterprise Server
  version 8.
The general method for these systems is to place the bonding
module parameters into a config file in /etc/modprobe.d/ (as
appropriate for the installed distro), then add modprobe and/or
`ip link` commands to the system's global init script.  The name of
the global init script differs; for sysconfig, it is
/etc/init.d/boot.local and for initscripts it is /etc/rc.d/rc.local.

For example, if you wanted to make a simple bond of two e100
devices (presumed to be eth0 and eth1), and have it persist across
reboots, edit the appropriate file (/etc/init.d/boot.local or
/etc/rc.d/rc.local), and add the following::

    modprobe bonding mode=balance-alb miimon=100
    modprobe e100
    ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up
    ip link set eth0 master bond0
    ip link set eth1 master bond0

Replace the example bonding module parameters and bond0
network configuration (IP address, netmask, etc.) with the appropriate
values for your configuration.

Unfortunately, this method will not provide support for the
ifup and ifdown scripts on the bond devices.
To reload the bonding
configuration, it is necessary to run the initialization script, e.g.,::

    # /etc/init.d/boot.local

or::

    # /etc/rc.d/rc.local

It may be desirable in such a case to create a separate script
which only initializes the bonding configuration, then call that
separate script from within boot.local.  This allows for bonding to be
enabled without re-running the entire global init script.

To shut down the bonding devices, it is necessary to first
mark the bonding device itself as being down, then remove the
appropriate device driver modules.  For our example above, you can do
the following::

    # ifconfig bond0 down
    # rmmod bonding
    # rmmod e100

Again, for convenience, it may be desirable to create a script
with these commands.


3.3.1 Configuring Multiple Bonds Manually
-----------------------------------------

This section contains information on configuring multiple
bonding devices with differing options for those systems whose network
initialization scripts lack support for configuring multiple bonds.
If you require multiple bonding devices, but all with the same
options, you may wish to use the "max_bonds" module parameter,
documented above.

To create multiple bonding devices with differing options, it is
preferable to use bonding parameters exported by sysfs, documented in the
section below.

For versions of bonding without sysfs support, the only means to
provide multiple instances of bonding with differing options is to load
the bonding driver multiple times.  Note that current versions of the
sysconfig network initialization scripts handle this automatically; if

···

section Configuring Bonding Devices, above, if you're not sure about your
network initialization scripts.

To load multiple instances of the module, it is necessary to
specify a different name for each instance (the module loading system
requires that every loaded module, even multiple instances of the same
module, have a unique name).  This is accomplished by supplying multiple
sets of bonding options in ``/etc/modprobe.d/*.conf``, for example::

    alias bond0 bonding
    options bond0 -o bond0 mode=balance-rr miimon=100

    alias bond1 bonding
    options bond1 -o bond1 mode=balance-alb miimon=50
will load the bonding module two times.  The first instance is
named "bond0" and creates the bond0 device in balance-rr mode with an
miimon of 100.  The second instance is named "bond1" and creates the
bond1 device in balance-alb mode with an miimon of 50.

In some circumstances (typically with older distributions),
the above does not work, and the second bonding instance never sees
its options.  In that case, the second options line can be substituted
as follows::

    install bond1 /sbin/modprobe --ignore-install bonding -o bond1 \
        mode=balance-alb miimon=50

This may be repeated any number of times, specifying a new and
unique name in place of bond1 for each subsequent instance.

It has been observed that some Red Hat supplied kernels are unable
to rename modules at load time (the "-o bond1" part).  Attempts to pass
that option to modprobe will produce an "Operation not permitted" error.
This has been reported on some Fedora Core kernels, and has been seen on

···

3.4 Configuring Bonding Manually via Sysfs
------------------------------------------

Starting with version 3.0.0, Channel Bonding may be configured
via the sysfs interface.  This interface allows dynamic configuration
of all bonds in the system without unloading the module.  It also
allows for adding and removing bonds at runtime.
Ifenslave is no
longer required, though it is still supported.

Use of the sysfs interface allows you to use multiple bonds
with different configurations without having to reload the module.
It also allows you to use multiple, differently configured bonds when
bonding is compiled into the kernel.

You must have the sysfs filesystem mounted to configure
bonding this way.  The examples in this document assume that you
are using the standard mount point for sysfs, e.g. /sys.  If your
sysfs filesystem is mounted elsewhere, you will need to adjust the

···

Creating and Destroying Bonds
-----------------------------
To add a new bond foo::

    # echo +foo > /sys/class/net/bonding_masters

To remove an existing bond bar::

    # echo -bar > /sys/class/net/bonding_masters

To show all existing bonds::

    # cat /sys/class/net/bonding_masters

.. note::

   Due to the 4K size limitation of sysfs files, this list may be
   truncated if you have more than a few hundred bonds.  This is unlikely
   to occur under normal operating conditions.
Adding and Removing Slaves
--------------------------
Interfaces may be enslaved to a bond using the file
/sys/class/net/<bond>/bonding/slaves.  The semantics for this file
are the same as for the bonding_masters file.

To enslave interface eth0 to bond bond0::

    # ifconfig bond0 up
    # echo +eth0 > /sys/class/net/bond0/bonding/slaves

To free slave eth0 from bond bond0::

    # echo -eth0 > /sys/class/net/bond0/bonding/slaves

When an interface is enslaved to a bond, symlinks between the
two are created in the sysfs filesystem.  In this case, you would get
/sys/class/net/bond0/slave_eth0 pointing to /sys/class/net/eth0, and
/sys/class/net/eth0/master pointing to /sys/class/net/bond0.

This means that you can tell quickly whether or not an
interface is enslaved by looking for the master symlink.
Thus::

    # echo -eth0 > /sys/class/net/eth0/master/bonding/slaves

will free eth0 from whatever bond it is enslaved to, regardless of

···

Changing a Bond's Configuration
-------------------------------

Each bond may be configured individually by manipulating the
files located in /sys/class/net/<bond name>/bonding.

The names of these files correspond directly with the
command-line parameters described elsewhere in this file, and, with the
exception of arp_ip_target, they accept the same values.  To see the
current setting, simply cat the appropriate file.

A few examples will be given here; for specific usage
guidelines for each parameter, see the appropriate section in this
document.

To configure bond0 for balance-alb mode::

    # ifconfig bond0 down
    # echo 6 > /sys/class/net/bond0/bonding/mode
    - or -
    # echo balance-alb > /sys/class/net/bond0/bonding/mode

.. note::

   The bond interface must be down before the mode can be changed.

To enable MII monitoring on bond0 with a 1 second interval::

    # echo 1000 > /sys/class/net/bond0/bonding/miimon

.. note::

   If ARP monitoring is enabled, it will be disabled when MII
   monitoring is enabled, and vice-versa.

To add ARP targets::

    # echo +192.168.0.100 > /sys/class/net/bond0/bonding/arp_ip_target
    # echo +192.168.0.101 > /sys/class/net/bond0/bonding/arp_ip_target

.. note::

   Up to 16 target addresses may be specified.

To remove an ARP target::

    # echo -192.168.0.100 > /sys/class/net/bond0/bonding/arp_ip_target

To configure the interval between learning packet transmits::

    # echo 12 > /sys/class/net/bond0/bonding/lp_interval

.. note::

   The lp_interval is the number of seconds between instances where
   the bonding driver sends learning packets to each slave's peer switch.
   The default interval is 1 second.
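Each of the /sys/class/net/<bond>/bonding option files can also be read
back to verify a setting; for example, for a bond in balance-alb mode the
mode file reports both the text name and the numeric mode value::

    # cat /sys/class/net/bond0/bonding/mode
    balance-alb 6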
Example Configuration
---------------------
We begin with the same example that is shown in section 3.3,
executed with sysfs, and without using ifenslave.

To make a simple bond of two e100 devices (presumed to be eth0
and eth1), and have it persist across reboots, edit the appropriate
file (/etc/init.d/boot.local or /etc/rc.d/rc.local), and add the
following::

    modprobe bonding
    modprobe e100
    echo balance-alb > /sys/class/net/bond0/bonding/mode
    ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up
    echo 100 > /sys/class/net/bond0/bonding/miimon
    echo +eth0 > /sys/class/net/bond0/bonding/slaves
    echo +eth1 > /sys/class/net/bond0/bonding/slaves

To add a second bond, with two e1000 interfaces in
active-backup mode, using ARP monitoring, add the following lines to
your init script::

    modprobe e1000
    echo +bond1 > /sys/class/net/bonding_masters
    echo active-backup > /sys/class/net/bond1/bonding/mode
    ifconfig bond1 192.168.2.1 netmask 255.255.255.0 up
    echo +192.168.2.100 > /sys/class/net/bond1/bonding/arp_ip_target
    echo 2000 > /sys/class/net/bond1/bonding/arp_interval
    echo +eth2 > /sys/class/net/bond1/bonding/slaves
    echo +eth3 > /sys/class/net/bond1/bonding/slaves

3.5 Configuration with Interfaces Support
-----------------------------------------

This section applies to distros which use the /etc/network/interfaces
file to describe network interface configuration, most notably Debian
and its derivatives.

The ifup and ifdown commands on Debian don't support bonding out of
the box.  The ifenslave-2.6 package should be installed to provide bonding
support.  Once installed, this package will provide ``bond-*`` options
to be used in /etc/network/interfaces.

Note that the ifenslave-2.6 package will load the bonding module and use
the ifenslave command when appropriate.

Example Configurations
----------------------

In /etc/network/interfaces, the following stanza will configure bond0, in
active-backup mode, with eth0 and eth1 as slaves::

    auto bond0
    iface bond0 inet dhcp
        bond-slaves eth0 eth1
        bond-mode active-backup
        bond-miimon 100
        bond-primary eth0 eth1

If the above configuration doesn't work, you might have a system using
upstart for system startup.  This is most notably true for recent
Ubuntu versions.  The following stanza in /etc/network/interfaces will
produce the same result on those systems::

    auto bond0
    iface bond0 inet dhcp
        bond-slaves none
        bond-mode active-backup
        bond-miimon 100

    auto eth0
    iface eth0 inet manual
        bond-master bond0
        bond-primary eth0 eth1

    auto eth1
    iface eth1 inet manual
        bond-master bond0
        bond-primary eth0 eth1

For a full list of ``bond-*`` supported options in /etc/network/interfaces
and some more advanced examples tailored to your particular distro, see the
files in /usr/share/doc/ifenslave-2.6.

3.6 Overriding Configuration for Special Cases
----------------------------------------------

···

available as the allocation is done at module init time.
The output of the file /proc/net/bonding/bondX has changed so the output Queue
ID is now printed for each slave::

    Bonding Mode: fault-tolerance (active-backup)
    Primary Slave: None
    Currently Active Slave: eth0
    MII Status: up
    MII Polling Interval (ms): 0
    Up Delay (ms): 0
    Down Delay (ms): 0

    Slave Interface: eth0
    MII Status: up
    Link Failure Count: 0
    Permanent HW addr: 00:1a:a0:12:8f:cb
    Slave queue ID: 0

    Slave Interface: eth1
    MII Status: up
    Link Failure Count: 0
    Permanent HW addr: 00:1a:a0:12:8f:cc
    Slave queue ID: 2

The queue_id for a slave can be set using the command::

    # echo "eth1:2" > /sys/class/net/bond0/bonding/queue_id

Any interface that needs a queue_id set should set it with multiple calls
like the one above until proper priorities are set for all interfaces.  On

···

a multiqueue qdisc and filters to bias certain traffic to transmit on certain
slave devices.  For instance, say we wanted, in the above configuration to
force all traffic bound to 192.168.1.100 to use eth1 in the bond as its output
device.
The following commands would accomplish this::

    # tc qdisc add dev bond0 handle 1 root multiq

    # tc filter add dev bond0 protocol ip parent 1: prio 1 u32 match ip \
        dst 192.168.1.100 action skbedit queue_mapping 2

These commands tell the kernel to attach a multiqueue queue discipline to the
bond0 interface and filter traffic enqueued to it, such that packets with a dst

···

leaving the qid for a slave to 0 is the multiqueue awareness in the bonding
driver that is now present.  This awareness allows tc filters to be placed on
slave devices as well as bond devices and the bonding driver will simply act as
a pass-through for selecting output queues on the slave device rather than
output port selection.

This feature first appeared in bonding driver version 3.7.0 and support for

···

(a) ad_actor_system : You can set a random mac-address that can be used for
    these LACPDU exchanges.  The value cannot be NULL or a multicast address.
    It is also preferable to set the local-admin bit.  The following shell
    code generates a random mac-address as described above::
	# sys_mac_addr=$(printf '%02x:%02x:%02x:%02x:%02x:%02x' \
			$(( (RANDOM & 0xFE) | 0x02 )) \
			$(( RANDOM & 0xFF )) \
			$(( RANDOM & 0xFF )) \
			$(( RANDOM & 0xFF )) \
			$(( RANDOM & 0xFF )) \
			$(( RANDOM & 0xFF )))
	# echo $sys_mac_addr > /sys/class/net/bond0/bonding/ad_actor_system

(b) ad_actor_sys_prio : Randomize the system priority.  The default value
    is 65535, but it can take any value from 1 to 65535.  The following
    shell code generates a random priority and sets it::

	# sys_prio=$(( 1 + RANDOM + RANDOM ))
	# echo $sys_prio > /sys/class/net/bond0/bonding/ad_actor_sys_prio

(c) ad_user_port_key : Use the user portion of the port-key.  The default
    keeps this empty.  These are the upper 10 bits of the port-key and the
    value ranges from 0 - 1023.  The following shell code generates these
    10 bits and sets it::
	# usr_port_key=$(( RANDOM & 0x3FF ))
	# echo $usr_port_key > /sys/class/net/bond0/bonding/ad_user_port_key


4. Querying Bonding Configuration
=================================

4.1 Bonding Configuration
-------------------------

Each bonding device has a read-only file residing in the
/proc/net/bonding directory.  The file contents include information
about the bonding configuration, options and state of each slave.

For example, the contents of /proc/net/bonding/bond0 after the
driver is loaded with parameters of mode=0 and miimon=1000 is
generally as follows::

    Ethernet Channel Bonding Driver: 2.6.1 (October 29, 2004)
    Bonding Mode: load balancing (round-robin)
    Currently Active Slave: eth0
    MII Status: up
    MII Polling Interval (ms): 1000
    Up Delay (ms): 0
    Down Delay (ms): 0

    Slave Interface: eth1
    MII Status: up
    Link Failure Count: 1

    Slave Interface: eth0
    MII Status: up
    Link Failure Count: 1

The precise format and contents will change depending upon the
bonding configuration, state, and version of the
bonding driver.

4.2 Network configuration
-------------------------

The network configuration can be inspected using the ifconfig
command.  Bonding devices will have the MASTER flag set; bonding slave
devices will have the SLAVE flag set.  The ifconfig output does not
contain information on which slaves are associated with which masters.

In the example below, the bond0 interface is the master
(MASTER) while eth0 and eth1 are slaves (SLAVE).  Notice all slaves of
bond0 have the same MAC address (HWaddr) as bond0 for all modes except
TLB and ALB, which require a unique MAC address for each slave::

    # /sbin/ifconfig
    bond0     Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
              inet addr:XXX.XXX.XXX.YYY  Bcast:XXX.XXX.XXX.255  Mask:255.255.252.0
              UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
              RX packets:7224794 errors:0 dropped:0 overruns:0 frame:0
              TX packets:3286647 errors:1 dropped:0 overruns:1 carrier:0
              collisions:0 txqueuelen:0

    eth0      Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
              UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
              RX packets:3573025 errors:0 dropped:0 overruns:0 frame:0
              TX packets:1643167 errors:1 dropped:0 overruns:1 carrier:0
              collisions:0 txqueuelen:100
              Interrupt:10 Base address:0x1080

    eth1      Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
              UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
              RX packets:3651769 errors:0 dropped:0 overruns:0 frame:0
              TX packets:1643480 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:100
              Interrupt:9 Base address:0x1400

5. Switch Configuration
=======================

For this section, "switch" refers to whatever system the
bonded devices are directly connected to (i.e., where the other end of
the cable plugs into).  This may be an actual dedicated switch device,
or it may be another regular system (e.g., another computer running
Linux).

The active-backup, balance-tlb and balance-alb modes do not
require any specific configuration of the switch.

The 802.3ad mode requires that the switch have the appropriate
ports configured as an 802.3ad aggregation.
The precise method used
to configure this varies from switch to switch, but, for example, a
Cisco 3550 series switch requires that the appropriate ports first be

···

etherchannel is set to mode "lacp" to enable 802.3ad (instead of
standard EtherChannel).

The balance-rr, balance-xor and broadcast modes generally
require that the switch have the appropriate ports grouped together.
The nomenclature for such a group differs between switches; it may be
called an "etherchannel" (as in the Cisco example, above), a "trunk

···

6. 802.1q VLAN Support
======================

It is possible to configure VLAN devices over a bond interface
using the 8021q driver.  However, only packets coming from the 8021q
driver and passing through bonding will be tagged by default.  Self-generated
packets, for example, bonding's learning packets or ARP

···

"learn" the VLAN IDs configured above it, and use those IDs to tag
self-generated packets.

For reasons of simplicity, and to support the use of adapters
that can do VLAN hardware acceleration offloading, the bonding
interface declares itself as fully hardware offloading capable; it gets
the add_vid/kill_vid notifications to gather the necessary

···

"un-accelerated" by the bonding driver so the VLAN tag sits in the
regular location.

VLAN interfaces *must* be added on top of a bonding interface
The bonding interface has a 1879 1844 hardware address of 00:00:00:00:00:00 until the first slave is added. 1880 1845 If the VLAN interface is created prior to the first enslavement, it ··· 1882 1847 is attached to the bond, the bond device itself will pick up the 1883 1848 slave's hardware address, which is then available for the VLAN device. 1884 1849 1885 - Also, be aware that a similar problem can occur if all slaves 1850 + Also, be aware that a similar problem can occur if all slaves 1886 1851 are released from a bond that still has one or more VLAN interfaces on 1887 1852 top of it. When a new slave is added, the bonding interface will 1888 1853 obtain its hardware address from the first slave, which might not 1889 1854 match the hardware address of the VLAN interfaces (which was 1890 1855 ultimately copied from an earlier slave). 1891 1856 1892 - There are two methods to insure that the VLAN device operates 1857 + There are two methods to insure that the VLAN device operates 1893 1858 with the correct hardware address if all slaves are removed from a 1894 1859 bond interface: 1895 1860 1896 - 1. Remove all VLAN interfaces then recreate them 1861 + 1. Remove all VLAN interfaces then recreate them 1897 1862 1898 - 2. Set the bonding interface's hardware address so that it 1863 + 2. Set the bonding interface's hardware address so that it 1899 1864 matches the hardware address of the VLAN interfaces. 1900 1865 1901 - Note that changing a VLAN interface's HW address would set the 1866 + Note that changing a VLAN interface's HW address would set the 1902 1867 underlying device -- i.e. the bonding interface -- to promiscuous 1903 1868 mode, which might not be what you want. 1904 1869 ··· 1906 1871 7. 
Link Monitoring 1907 1872 ================== 1908 1873 1909 - The bonding driver at present supports two schemes for 1874 + The bonding driver at present supports two schemes for 1910 1875 monitoring a slave device's link state: the ARP monitor and the MII 1911 1876 monitor. 1912 1877 1913 - At the present time, due to implementation restrictions in the 1878 + At the present time, due to implementation restrictions in the 1914 1879 bonding driver itself, it is not possible to enable both ARP and MII 1915 1880 monitoring simultaneously. 1916 1881 1917 1882 7.1 ARP Monitor Operation 1918 1883 ------------------------- 1919 1884 1920 - The ARP monitor operates as its name suggests: it sends ARP 1885 + The ARP monitor operates as its name suggests: it sends ARP 1921 1886 queries to one or more designated peer systems on the network, and 1922 1887 uses the response as an indication that the link is operating. This 1923 1888 gives some assurance that traffic is actually flowing to and from one 1924 1889 or more peers on the local network. 1925 1890 1926 - The ARP monitor relies on the device driver itself to verify 1891 + The ARP monitor relies on the device driver itself to verify 1927 1892 that traffic is flowing. In particular, the driver must keep up to 1928 1893 date the last receive time, dev->last_rx. Drivers that use NETIF_F_LLTX 1929 1894 flag must also update netdev_queue->trans_start. If they do not, then the ··· 1935 1900 7.2 Configuring Multiple ARP Targets 1936 1901 ------------------------------------ 1937 1902 1938 - While ARP monitoring can be done with just one target, it can 1903 + While ARP monitoring can be done with just one target, it can 1939 1904 be useful in a High Availability setup to have several targets to 1940 1905 monitor. In the case of just one target, the target itself may go 1941 1906 down or have a problem making it unresponsive to ARP requests. 
Having 1942 1907 an additional target (or several) increases the reliability of the ARP 1943 1908 monitoring. 1944 1909 1945 - Multiple ARP targets must be separated by commas as follows: 1910 + Multiple ARP targets must be separated by commas as follows:: 1946 1911 1947 - # example options for ARP monitoring with three targets 1948 - alias bond0 bonding 1949 - options bond0 arp_interval=60 arp_ip_target=192.168.0.1,192.168.0.3,192.168.0.9 1912 + # example options for ARP monitoring with three targets 1913 + alias bond0 bonding 1914 + options bond0 arp_interval=60 arp_ip_target=192.168.0.1,192.168.0.3,192.168.0.9 1950 1915 1951 - For just a single target the options would resemble: 1916 + For just a single target the options would resemble:: 1952 1917 1953 - # example options for ARP monitoring with one target 1954 - alias bond0 bonding 1955 - options bond0 arp_interval=60 arp_ip_target=192.168.0.100 1918 + # example options for ARP monitoring with one target 1919 + alias bond0 bonding 1920 + options bond0 arp_interval=60 arp_ip_target=192.168.0.100 1956 1921 1957 1922 1958 1923 7.3 MII Monitor Operation 1959 1924 ------------------------- 1960 1925 1961 - The MII monitor monitors only the carrier state of the local 1926 + The MII monitor monitors only the carrier state of the local 1962 1927 network interface. It accomplishes this in one of three ways: by 1963 1928 depending upon the device driver to maintain its carrier state, by 1964 1929 querying the device's MII registers, or by making an ethtool query to 1965 1930 the device. 1966 1931 1967 - If the use_carrier module parameter is 1 (the default value), 1932 + If the use_carrier module parameter is 1 (the default value), 1968 1933 then the MII monitor will rely on the driver for carrier state 1969 1934 information (via the netif_carrier subsystem). 
As explained in the
1970 1935 use_carrier parameter information, above, if the MII monitor fails to
··· 1972 1937 disconnected), it may be that the driver does not support
1973 1938 netif_carrier.
1974 1939
1975 - If use_carrier is 0, then the MII monitor will first query the
1940 + If use_carrier is 0, then the MII monitor will first query the
1976 1941 device's (via ioctl) MII registers and check the link state. If that
1977 1942 request fails (not just that it returns carrier down), then the MII
1978 1943 monitor will make an ethtool ETHTOOL_GLINK request to attempt to obtain
··· 1987 1952 8.1 Adventures in Routing
1988 1953 -------------------------
1989 1954
1990 - When bonding is configured, it is important that the slave
1955 + When bonding is configured, it is important that the slave
1991 1956 devices not have routes that supersede routes of the master (or,
1992 1957 generally, not have routes at all). For example, suppose the bonding
1993 1958 device bond0 has two slaves, eth0 and eth1, and the routing table is
1994 - as follows:
1959 + as follows::
1995 1960
1996 - Kernel IP routing table
1997 - Destination Gateway Genmask Flags MSS Window irtt Iface
1998 - 10.0.0.0 0.0.0.0 255.255.0.0 U 40 0 0 eth0
1999 - 10.0.0.0 0.0.0.0 255.255.0.0 U 40 0 0 eth1
2000 - 10.0.0.0 0.0.0.0 255.255.0.0 U 40 0 0 bond0
2001 - 127.0.0.0 0.0.0.0 255.0.0.0 U 40 0 0 lo
1961 + Kernel IP routing table
1962 + Destination Gateway Genmask Flags MSS Window irtt Iface
1963 + 10.0.0.0 0.0.0.0 255.255.0.0 U 40 0 0 eth0
1964 + 10.0.0.0 0.0.0.0 255.255.0.0 U 40 0 0 eth1
1965 + 10.0.0.0 0.0.0.0 255.255.0.0 U 40 0 0 bond0
1966 + 127.0.0.0 0.0.0.0 255.0.0.0 U 40 0 0 lo
2002 1967
2003 - This routing configuration will likely still update the
1968 + This routing configuration will likely still update the
2004 1969 receive/transmit times in the driver (needed by the ARP monitor), but
2005 1970 may bypass the bonding driver (because outgoing traffic to, in this
2006 1971 case, another host on
network 10 would use eth0 or eth1 before bond0). 2007 1972 2008 - The ARP monitor (and ARP itself) may become confused by this 1973 + The ARP monitor (and ARP itself) may become confused by this 2009 1974 configuration, because ARP requests (generated by the ARP monitor) 2010 1975 will be sent on one interface (bond0), but the corresponding reply 2011 1976 will arrive on a different interface (eth0). This reply looks to ARP ··· 2013 1978 interface basis), and is discarded. The MII monitor is not affected 2014 1979 by the state of the routing table. 2015 1980 2016 - The solution here is simply to insure that slaves do not have 1981 + The solution here is simply to insure that slaves do not have 2017 1982 routes of their own, and if for some reason they must, those routes do 2018 1983 not supersede routes of their master. This should generally be the 2019 1984 case, but unusual configurations or errant manual or automatic static ··· 2022 1987 8.2 Ethernet Device Renaming 2023 1988 ---------------------------- 2024 1989 2025 - On systems with network configuration scripts that do not 1990 + On systems with network configuration scripts that do not 2026 1991 associate physical devices directly with network interface names (so 2027 1992 that the same physical device always has the same "ethX" name), it may 2028 1993 be necessary to add some special logic to config files in 2029 1994 /etc/modprobe.d/. 
2030 1995 2031 - For example, given a modules.conf containing the following: 1996 + For example, given a modules.conf containing the following:: 2032 1997 2033 - alias bond0 bonding 2034 - options bond0 mode=some-mode miimon=50 2035 - alias eth0 tg3 2036 - alias eth1 tg3 2037 - alias eth2 e1000 2038 - alias eth3 e1000 1998 + alias bond0 bonding 1999 + options bond0 mode=some-mode miimon=50 2000 + alias eth0 tg3 2001 + alias eth1 tg3 2002 + alias eth2 e1000 2003 + alias eth3 e1000 2039 2004 2040 - If neither eth0 and eth1 are slaves to bond0, then when the 2005 + If neither eth0 and eth1 are slaves to bond0, then when the 2041 2006 bond0 interface comes up, the devices may end up reordered. This 2042 2007 happens because bonding is loaded first, then its slave device's 2043 2008 drivers are loaded next. Since no other drivers have been loaded, ··· 2045 2010 devices, but the bonding configuration tries to enslave eth2 and eth3 2046 2011 (which may later be assigned to the tg3 devices). 2047 2012 2048 - Adding the following: 2013 + Adding the following:: 2049 2014 2050 - add above bonding e1000 tg3 2015 + add above bonding e1000 tg3 2051 2016 2052 - causes modprobe to load e1000 then tg3, in that order, when 2017 + causes modprobe to load e1000 then tg3, in that order, when 2053 2018 bonding is loaded. This command is fully documented in the 2054 2019 modules.conf manual page. 2055 2020 2056 - On systems utilizing modprobe an equivalent problem can occur. 2021 + On systems utilizing modprobe an equivalent problem can occur. 2057 2022 In this case, the following can be added to config files in 2058 - /etc/modprobe.d/ as: 2023 + /etc/modprobe.d/ as:: 2059 2024 2060 - softdep bonding pre: tg3 e1000 2025 + softdep bonding pre: tg3 e1000 2061 2026 2062 - This will load tg3 and e1000 modules before loading the bonding one. 2027 + This will load tg3 and e1000 modules before loading the bonding one. 
2063 2028 Full documentation on this can be found in the modprobe.d and modprobe 2064 2029 manual pages. 2065 2030 2066 2031 8.3. Painfully Slow Or No Failed Link Detection By Miimon 2067 2032 --------------------------------------------------------- 2068 2033 2069 - By default, bonding enables the use_carrier option, which 2034 + By default, bonding enables the use_carrier option, which 2070 2035 instructs bonding to trust the driver to maintain carrier state. 2071 2036 2072 - As discussed in the options section, above, some drivers do 2037 + As discussed in the options section, above, some drivers do 2073 2038 not support the netif_carrier_on/_off link state tracking system. 2074 2039 With use_carrier enabled, bonding will always see these links as up, 2075 2040 regardless of their actual state. 2076 2041 2077 - Additionally, other drivers do support netif_carrier, but do 2042 + Additionally, other drivers do support netif_carrier, but do 2078 2043 not maintain it in real time, e.g., only polling the link state at 2079 2044 some fixed interval. In this case, miimon will detect failures, but 2080 2045 only after some long period of time has expired. If it appears that ··· 2086 2051 use_carrier=0 does not improve the failover, then the driver may cache 2087 2052 the registers, or the problem may be elsewhere. 2088 2053 2089 - Also, remember that miimon only checks for the device's 2054 + Also, remember that miimon only checks for the device's 2090 2055 carrier state. It has no way to determine the state of devices on or 2091 2056 beyond other ports of a switch, or if a switch is refusing to pass 2092 2057 traffic while still maintaining carrier on. ··· 2094 2059 9. SNMP agents 2095 2060 =============== 2096 2061 2097 - If running SNMP agents, the bonding driver should be loaded 2062 + If running SNMP agents, the bonding driver should be loaded 2098 2063 before any network drivers participating in a bond. 
This requirement 2099 2064 is due to the interface index (ipAdEntIfIndex) being associated to 2100 2065 the first interface found with a given IP address. That is, there is ··· 2104 2069 with the eth0 interface. This configuration is shown below, the IP 2105 2070 address 192.168.1.1 has an interface index of 2 which indexes to eth0 2106 2071 in the ifDescr table (ifDescr.2). 2072 + 2073 + :: 2107 2074 2108 2075 interfaces.ifTable.ifEntry.ifDescr.1 = lo 2109 2076 interfaces.ifTable.ifEntry.ifDescr.2 = eth0 ··· 2118 2081 ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 4 2119 2082 ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1 2120 2083 2121 - This problem is avoided by loading the bonding driver before 2084 + This problem is avoided by loading the bonding driver before 2122 2085 any network drivers participating in a bond. Below is an example of 2123 2086 loading the bonding driver first, the IP address 192.168.1.1 is 2124 2087 correctly associated with ifDescr.2. ··· 2134 2097 ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 5 2135 2098 ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1 2136 2099 2137 - While some distributions may not report the interface name in 2100 + While some distributions may not report the interface name in 2138 2101 ifDescr, the association between the IP address and IfIndex remains 2139 2102 and SNMP functions such as Interface_Scan_Next will report that 2140 2103 association. ··· 2142 2105 10. Promiscuous mode 2143 2106 ==================== 2144 2107 2145 - When running network monitoring tools, e.g., tcpdump, it is 2108 + When running network monitoring tools, e.g., tcpdump, it is 2146 2109 common to enable promiscuous mode on the device, so that all traffic 2147 2110 is seen (instead of seeing only traffic destined for the local host). 2148 2111 The bonding driver handles promiscuous mode changes to the bonding 2149 2112 master device (e.g., bond0), and propagates the setting to the slave 2150 2113 devices. 
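The per-mode propagation rules spelled out in the following paragraphs can be condensed into a small sketch (a hypothetical helper for illustration, not driver code):

```shell
# Sketch of the propagation rule: which slaves receive a promiscuous-mode
# change for a given bonding mode (per the descriptions in this section).
propagation() {
    case "$1" in
        balance-rr|balance-xor|broadcast|802.3ad) echo "all slaves" ;;
        active-backup|balance-tlb|balance-alb)    echo "active slave only" ;;
        *)                                        echo "unknown mode" ;;
    esac
}

propagation 802.3ad          # -> all slaves
propagation active-backup    # -> active slave only
```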
2151 2114 2152 - For the balance-rr, balance-xor, broadcast, and 802.3ad modes, 2115 + For the balance-rr, balance-xor, broadcast, and 802.3ad modes, 2153 2116 the promiscuous mode setting is propagated to all slaves. 2154 2117 2155 - For the active-backup, balance-tlb and balance-alb modes, the 2118 + For the active-backup, balance-tlb and balance-alb modes, the 2156 2119 promiscuous mode setting is propagated only to the active slave. 2157 2120 2158 - For balance-tlb mode, the active slave is the slave currently 2121 + For balance-tlb mode, the active slave is the slave currently 2159 2122 receiving inbound traffic. 2160 2123 2161 - For balance-alb mode, the active slave is the slave used as a 2124 + For balance-alb mode, the active slave is the slave used as a 2162 2125 "primary." This slave is used for mode-specific control traffic, for 2163 2126 sending to peers that are unassigned or if the load is unbalanced. 2164 2127 2165 - For the active-backup, balance-tlb and balance-alb modes, when 2128 + For the active-backup, balance-tlb and balance-alb modes, when 2166 2129 the active slave changes (e.g., due to a link failure), the 2167 2130 promiscuous setting will be propagated to the new active slave. 2168 2131 2169 2132 11. Configuring Bonding for High Availability 2170 2133 ============================================= 2171 2134 2172 - High Availability refers to configurations that provide 2135 + High Availability refers to configurations that provide 2173 2136 maximum network availability by having redundant or backup devices, 2174 2137 links or switches between the host and the rest of the world. 
The 2175 2138 goal is to provide the maximum availability of network connectivity ··· 2179 2142 11.1 High Availability in a Single Switch Topology 2180 2143 -------------------------------------------------- 2181 2144 2182 - If two hosts (or a host and a single switch) are directly 2145 + If two hosts (or a host and a single switch) are directly 2183 2146 connected via multiple physical links, then there is no availability 2184 2147 penalty to optimizing for maximum bandwidth. In this case, there is 2185 2148 only one switch (or peer), so if it fails, there is no alternative ··· 2187 2150 support link monitoring of their members, so if individual links fail, 2188 2151 the load will be rebalanced across the remaining devices. 2189 2152 2190 - See Section 12, "Configuring Bonding for Maximum Throughput" 2153 + See Section 12, "Configuring Bonding for Maximum Throughput" 2191 2154 for information on configuring bonding with one peer device. 2192 2155 2193 2156 11.2 High Availability in a Multiple Switch Topology 2194 2157 ---------------------------------------------------- 2195 2158 2196 - With multiple switches, the configuration of bonding and the 2159 + With multiple switches, the configuration of bonding and the 2197 2160 network changes dramatically. In multiple switch topologies, there is 2198 2161 a trade off between network availability and usable bandwidth. 
2199 2162 2200 - Below is a sample network, configured to maximize the 2201 - availability of the network: 2163 + Below is a sample network, configured to maximize the 2164 + availability of the network:: 2202 2165 2203 - | | 2204 - |port3 port3| 2205 - +-----+----+ +-----+----+ 2206 - | |port2 ISL port2| | 2207 - | switch A +--------------------------+ switch B | 2208 - | | | | 2209 - +-----+----+ +-----++---+ 2210 - |port1 port1| 2211 - | +-------+ | 2212 - +-------------+ host1 +---------------+ 2213 - eth0 +-------+ eth1 2166 + | | 2167 + |port3 port3| 2168 + +-----+----+ +-----+----+ 2169 + | |port2 ISL port2| | 2170 + | switch A +--------------------------+ switch B | 2171 + | | | | 2172 + +-----+----+ +-----++---+ 2173 + |port1 port1| 2174 + | +-------+ | 2175 + +-------------+ host1 +---------------+ 2176 + eth0 +-------+ eth1 2214 2177 2215 - In this configuration, there is a link between the two 2178 + In this configuration, there is a link between the two 2216 2179 switches (ISL, or inter switch link), and multiple ports connecting to 2217 2180 the outside world ("port3" on each switch). There is no technical 2218 2181 reason that this could not be extended to a third switch. ··· 2220 2183 11.2.1 HA Bonding Mode Selection for Multiple Switch Topology 2221 2184 ------------------------------------------------------------- 2222 2185 2223 - In a topology such as the example above, the active-backup and 2186 + In a topology such as the example above, the active-backup and 2224 2187 broadcast modes are the only useful bonding modes when optimizing for 2225 2188 availability; the other modes require all links to terminate on the 2226 2189 same peer for them to behave rationally. 2227 2190 2228 - active-backup: This is generally the preferred mode, particularly if 2191 + active-backup: 2192 + This is generally the preferred mode, particularly if 2229 2193 the switches have an ISL and play together well. 
If the 2230 2194 network configuration is such that one switch is specifically 2231 2195 a backup switch (e.g., has lower capacity, higher cost, etc), 2232 2196 then the primary option can be used to insure that the 2233 2197 preferred link is always used when it is available. 2234 2198 2235 - broadcast: This mode is really a special purpose mode, and is suitable 2199 + broadcast: 2200 + This mode is really a special purpose mode, and is suitable 2236 2201 only for very specific needs. For example, if the two 2237 2202 switches are not connected (no ISL), and the networks beyond 2238 2203 them are totally independent. In this case, if it is ··· 2244 2205 11.2.2 HA Link Monitoring Selection for Multiple Switch Topology 2245 2206 ---------------------------------------------------------------- 2246 2207 2247 - The choice of link monitoring ultimately depends upon your 2208 + The choice of link monitoring ultimately depends upon your 2248 2209 switch. If the switch can reliably fail ports in response to other 2249 2210 failures, then either the MII or ARP monitors should work. For 2250 2211 example, in the above example, if the "port3" link fails at the remote ··· 2252 2213 monitor could be configured with a target at the remote end of port3, 2253 2214 thus detecting that failure without switch support. 2254 2215 2255 - In general, however, in a multiple switch topology, the ARP 2216 + In general, however, in a multiple switch topology, the ARP 2256 2217 monitor can provide a higher level of reliability in detecting end to 2257 2218 end connectivity failures (which may be caused by the failure of any 2258 2219 individual component to pass traffic for any reason). Additionally, ··· 2261 2222 regardless of which switch is active, the ARP monitor has a suitable 2262 2223 target to query. 
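As a concrete sketch of this setup, module options along the following lines would give the ARP monitor one target reachable through each switch (the addresses, interval, and file placement are illustrative assumptions, not recommendations):

```shell
# Hypothetical /etc/modprobe.d/ fragment: active-backup with the ARP
# monitor probing one peer behind switch A and one behind switch B.
alias bond0 bonding
options bond0 mode=active-backup arp_interval=100 arp_ip_target=10.0.0.1,10.0.0.2
```

With a target behind each switch, the monitor retains a reachable probe destination no matter which slave (and hence which switch) is currently active.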
2263 2224 2264 - Note, also, that of late many switches now support a functionality 2225 + Note, also, that of late many switches now support a functionality 2265 2226 generally referred to as "trunk failover." This is a feature of the 2266 2227 switch that causes the link state of a particular switch port to be set 2267 2228 down (or up) when the state of another switch port goes down (or up). ··· 2277 2238 12.1 Maximizing Throughput in a Single Switch Topology 2278 2239 ------------------------------------------------------ 2279 2240 2280 - In a single switch configuration, the best method to maximize 2241 + In a single switch configuration, the best method to maximize 2281 2242 throughput depends upon the application and network environment. The 2282 2243 various load balancing modes each have strengths and weaknesses in 2283 2244 different environments, as detailed below. 2284 2245 2285 - For this discussion, we will break down the topologies into 2246 + For this discussion, we will break down the topologies into 2286 2247 two categories. Depending upon the destination of most traffic, we 2287 2248 categorize them into either "gatewayed" or "local" configurations. 2288 2249 2289 - In a gatewayed configuration, the "switch" is acting primarily 2250 + In a gatewayed configuration, the "switch" is acting primarily 2290 2251 as a router, and the majority of traffic passes through this router to 2291 - other networks. An example would be the following: 2252 + other networks. An example would be the following:: 2292 2253 2293 2254 2294 2255 +----------+ +----------+ ··· 2298 2259 | |eth1 port2| | here somewhere 2299 2260 +----------+ +----------+ 2300 2261 2301 - The router may be a dedicated router device, or another host 2262 + The router may be a dedicated router device, or another host 2302 2263 acting as a gateway. 
For our discussion, the important point is that 2303 2264 the majority of traffic from Host A will pass through the router to 2304 2265 some other network before reaching its final destination. 2305 2266 2306 - In a gatewayed network configuration, although Host A may 2267 + In a gatewayed network configuration, although Host A may 2307 2268 communicate with many other systems, all of its traffic will be sent 2308 2269 and received via one other peer on the local network, the router. 2309 2270 2310 - Note that the case of two systems connected directly via 2271 + Note that the case of two systems connected directly via 2311 2272 multiple physical links is, for purposes of configuring bonding, the 2312 2273 same as a gatewayed configuration. In that case, it happens that all 2313 2274 traffic is destined for the "gateway" itself, not some other network 2314 2275 beyond the gateway. 2315 2276 2316 - In a local configuration, the "switch" is acting primarily as 2277 + In a local configuration, the "switch" is acting primarily as 2317 2278 a switch, and the majority of traffic passes through this switch to 2318 2279 reach other stations on the same network. An example would be the 2319 - following: 2280 + following:: 2320 2281 2321 2282 +----------+ +----------+ +--------+ 2322 2283 | |eth0 port1| +-------+ Host B | ··· 2326 2287 +----------+ +----------+port4 +--------+ 2327 2288 2328 2289 2329 - Again, the switch may be a dedicated switch device, or another 2290 + Again, the switch may be a dedicated switch device, or another 2330 2291 host acting as a gateway. For our discussion, the important point is 2331 2292 that the majority of traffic from Host A is destined for other hosts 2332 2293 on the same local network (Hosts B and C in the above example). 
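Each destination being addressed by its own MAC is what the MAC-based balancing modes exploit. A simplified sketch of a layer2-style transmit hash (XOR of the last source and destination MAC octets, modulo the slave count) shows why the two topologies behave so differently; the in-kernel hash has additional inputs, and all octet values here are examples:

```shell
# Simplified layer2-style transmit hash sketch (not the kernel's exact
# formula): XOR the last octets of source and destination MAC addresses,
# then take the result modulo the number of slaves.
src=0xb4        # last octet of the bond's MAC address (example)
n_slaves=2

for dst in 0x10 0x11; do    # last octets of two local peers' MACs
    echo "peer $dst -> slave $(( (src ^ dst) % n_slaves ))"
done
```

In a gatewayed configuration the destination octet is always the router's and never varies, so every flow maps to the same slave; with many local peers, the XOR spreads destinations across the slaves.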
2333 2294 2334 - In summary, in a gatewayed configuration, traffic to and from 2295 + In summary, in a gatewayed configuration, traffic to and from 2335 2296 the bonded device will be to the same MAC level peer on the network 2336 2297 (the gateway itself, i.e., the router), regardless of its final 2337 2298 destination. In a local configuration, traffic flows directly to and 2338 2299 from the final destinations, thus, each destination (Host B, Host C) 2339 2300 will be addressed directly by their individual MAC addresses. 2340 2301 2341 - This distinction between a gatewayed and a local network 2302 + This distinction between a gatewayed and a local network 2342 2303 configuration is important because many of the load balancing modes 2343 2304 available use the MAC addresses of the local network source and 2344 2305 destination to make load balancing decisions. The behavior of each ··· 2348 2309 12.1.1 MT Bonding Mode Selection for Single Switch Topology 2349 2310 ----------------------------------------------------------- 2350 2311 2351 - This configuration is the easiest to set up and to understand, 2312 + This configuration is the easiest to set up and to understand, 2352 2313 although you will have to decide which bonding mode best suits your 2353 2314 needs. The trade offs for each mode are detailed below: 2354 2315 2355 - balance-rr: This mode is the only mode that will permit a single 2316 + balance-rr: 2317 + This mode is the only mode that will permit a single 2356 2318 TCP/IP connection to stripe traffic across multiple 2357 2319 interfaces. It is therefore the only mode that will allow a 2358 2320 single TCP/IP stream to utilize more than one interface's ··· 2391 2351 This mode requires the switch to have the appropriate ports 2392 2352 configured for "etherchannel" or "trunking." 
2393 2353 2394 - active-backup: There is not much advantage in this network topology to 2354 + active-backup: 2355 + There is not much advantage in this network topology to 2395 2356 the active-backup mode, as the inactive backup devices are all 2396 2357 connected to the same peer as the primary. In this case, a 2397 2358 load balancing mode (with link monitoring) will provide the ··· 2402 2361 have value if the hardware available does not support any of 2403 2362 the load balance modes. 2404 2363 2405 - balance-xor: This mode will limit traffic such that packets destined 2364 + balance-xor: 2365 + This mode will limit traffic such that packets destined 2406 2366 for specific peers will always be sent over the same 2407 2367 interface. Since the destination is determined by the MAC 2408 2368 addresses involved, this mode works best in a "local" network ··· 2415 2373 As with balance-rr, the switch ports need to be configured for 2416 2374 "etherchannel" or "trunking." 2417 2375 2418 - broadcast: Like active-backup, there is not much advantage to this 2376 + broadcast: 2377 + Like active-backup, there is not much advantage to this 2419 2378 mode in this type of network topology. 2420 2379 2421 - 802.3ad: This mode can be a good choice for this type of network 2380 + 802.3ad: 2381 + This mode can be a good choice for this type of network 2422 2382 topology. The 802.3ad mode is an IEEE standard, so all peers 2423 2383 that implement 802.3ad should interoperate well. The 802.3ad 2424 2384 protocol includes automatic configuration of the aggregates, ··· 2434 2390 the same speed and duplex. Also, as with all bonding load 2435 2391 balance modes other than balance-rr, no single connection will 2436 2392 be able to utilize more than a single interface's worth of 2437 - bandwidth. 2393 + bandwidth. 
2438 2394 2439 2395 Additionally, the linux bonding 802.3ad implementation 2440 2396 distributes traffic by peer (using an XOR of MAC addresses ··· 2448 2404 Finally, the 802.3ad mode mandates the use of the MII monitor, 2449 2405 therefore, the ARP monitor is not available in this mode. 2450 2406 2451 - balance-tlb: The balance-tlb mode balances outgoing traffic by peer. 2407 + balance-tlb: 2408 + The balance-tlb mode balances outgoing traffic by peer. 2452 2409 Since the balancing is done according to MAC address, in a 2453 2410 "gatewayed" configuration (as described above), this mode will 2454 2411 send all traffic across a single device. However, in a ··· 2467 2422 network device driver of the slave interfaces, and the ARP 2468 2423 monitor is not available. 2469 2424 2470 - balance-alb: This mode is everything that balance-tlb is, and more. 2425 + balance-alb: 2426 + This mode is everything that balance-tlb is, and more. 2471 2427 It has all of the features (and restrictions) of balance-tlb, 2472 2428 and will also balance incoming traffic from local network 2473 2429 peers (as described in the Bonding Module Options section, ··· 2481 2435 12.1.2 MT Link Monitoring for Single Switch Topology 2482 2436 ---------------------------------------------------- 2483 2437 2484 - The choice of link monitoring may largely depend upon which 2438 + The choice of link monitoring may largely depend upon which 2485 2439 mode you choose to use. 
The more advanced load balancing modes do not 2486 2440 support the use of the ARP monitor, and are thus restricted to using 2487 2441 the MII monitor (which does not provide as high a level of end to end ··· 2490 2444 12.2 Maximum Throughput in a Multiple Switch Topology 2491 2445 ----------------------------------------------------- 2492 2446 2493 - Multiple switches may be utilized to optimize for throughput 2447 + Multiple switches may be utilized to optimize for throughput 2494 2448 when they are configured in parallel as part of an isolated network 2495 - between two or more systems, for example: 2449 + between two or more systems, for example:: 2496 2450 2497 - +-----------+ 2498 - | Host A | 2499 - +-+---+---+-+ 2500 - | | | 2501 - +--------+ | +---------+ 2502 - | | | 2503 - +------+---+ +-----+----+ +-----+----+ 2504 - | Switch A | | Switch B | | Switch C | 2505 - +------+---+ +-----+----+ +-----+----+ 2506 - | | | 2507 - +--------+ | +---------+ 2508 - | | | 2509 - +-+---+---+-+ 2510 - | Host B | 2511 - +-----------+ 2451 + +-----------+ 2452 + | Host A | 2453 + +-+---+---+-+ 2454 + | | | 2455 + +--------+ | +---------+ 2456 + | | | 2457 + +------+---+ +-----+----+ +-----+----+ 2458 + | Switch A | | Switch B | | Switch C | 2459 + +------+---+ +-----+----+ +-----+----+ 2460 + | | | 2461 + +--------+ | +---------+ 2462 + | | | 2463 + +-+---+---+-+ 2464 + | Host B | 2465 + +-----------+ 2512 2466 2513 - In this configuration, the switches are isolated from one 2467 + In this configuration, the switches are isolated from one 2514 2468 another. One reason to employ a topology such as this is for an 2515 2469 isolated network with many hosts (a cluster configured for high 2516 2470 performance, for example), using multiple smaller switches can be more ··· 2518 2472 hosts, three 24 port switches can be significantly less expensive than 2519 2473 a single 72 port switch. 
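A minimal module-options sketch for an isolated, parallel-switch network of this kind might look as follows (the mode and monitoring interval are illustrative; the subsections below discuss why balance-rr and the MII monitor are the usual choices here):

```shell
# Hypothetical /etc/modprobe.d/ fragment for the isolated topology above:
# round-robin striping across the parallel links, with MII monitoring.
alias bond0 bonding
options bond0 mode=balance-rr miimon=100
```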
2520 2474 2521 - If access beyond the network is required, an individual host 2475 + If access beyond the network is required, an individual host 2522 2476 can be equipped with an additional network device connected to an 2523 2477 external network; this host then additionally acts as a gateway. 2524 2478 2525 2479 12.2.1 MT Bonding Mode Selection for Multiple Switch Topology 2526 2480 ------------------------------------------------------------- 2527 2481 2528 - In actual practice, the bonding mode typically employed in 2482 + In actual practice, the bonding mode typically employed in 2529 2483 configurations of this type is balance-rr. Historically, in this 2530 2484 network configuration, the usual caveats about out of order packet 2531 2485 delivery are mitigated by the use of network adapters that do not do ··· 2538 2492 12.2.2 MT Link Monitoring for Multiple Switch Topology 2539 2493 ------------------------------------------------------ 2540 2494 2541 - Again, in actual practice, the MII monitor is most often used 2495 + Again, in actual practice, the MII monitor is most often used 2542 2496 in this configuration, as performance is given preference over 2543 2497 availability. The ARP monitor will function in this topology, but its 2544 2498 advantages over the MII monitor are mitigated by the volume of probes ··· 2551 2505 13.1 Link Establishment and Failover Delays 2552 2506 ------------------------------------------- 2553 2507 2554 - Some switches exhibit undesirable behavior with regard to the 2508 + Some switches exhibit undesirable behavior with regard to the 2555 2509 timing of link up and down reporting by the switch. 2556 2510 2557 - First, when a link comes up, some switches may indicate that 2511 + First, when a link comes up, some switches may indicate that 2558 2512 the link is up (carrier available), but not pass traffic over the 2559 2513 interface for some period of time. 
This delay is typically due to 2560 2514 some type of autonegotiation or routing protocol, but may also occur ··· 2563 2517 value to the updelay bonding module option to delay the use of the 2564 2518 relevant interface(s). 2565 2519 2566 - Second, some switches may "bounce" the link state one or more 2520 + Second, some switches may "bounce" the link state one or more 2567 2521 times while a link is changing state. This occurs most commonly while 2568 2522 the switch is initializing. Again, an appropriate updelay value may 2569 2523 help. 2570 2524 2571 - Note that when a bonding interface has no active links, the 2525 + Note that when a bonding interface has no active links, the 2572 2526 driver will immediately reuse the first link that goes up, even if the 2573 2527 updelay parameter has been specified (the updelay is ignored in this 2574 2528 case). If there are slave interfaces waiting for the updelay timeout ··· 2578 2532 cases with no connectivity, there is no additional penalty for 2579 2533 ignoring the updelay. 2580 2534 2581 - In addition to the concerns about switch timings, if your 2535 + In addition to the concerns about switch timings, if your 2582 2536 switches take a long time to go into backup mode, it may be desirable 2583 2537 to not activate a backup interface immediately after a link goes down. 2584 2538 Failover may be delayed via the downdelay bonding module option. ··· 2586 2540 13.2 Duplicated Incoming Packets 2587 2541 -------------------------------- 2588 2542 2589 - NOTE: Starting with version 3.0.2, the bonding driver has logic to 2543 + NOTE: Starting with version 3.0.2, the bonding driver has logic to 2590 2544 suppress duplicate packets, which should largely eliminate this problem. 2591 2545 The following description is kept for reference. 
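The delay settings discussed in section 13.1 are ordinary module parameters, given in milliseconds and rounded down to a multiple of miimon. A hypothetical modprobe configuration fragment (the specific values are illustrative only, not recommendations):

```sh
# Hypothetical /etc/modprobe.d/bonding.conf fragment:
# poll carrier every 100 ms, distrust a freshly raised link for 5 s,
# and wait 200 ms before failing away from a link that went down.
options bonding mode=active-backup miimon=100 updelay=5000 downdelay=200
```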
2592 2546 2593 - It is not uncommon to observe a short burst of duplicated 2547 + It is not uncommon to observe a short burst of duplicated 2594 2548 traffic when the bonding device is first used, or after it has been 2595 2549 idle for some period of time. This is most easily observed by issuing 2596 2550 a "ping" to some other host on the network, and noticing that the 2597 2551 output from ping flags duplicates (typically one per slave). 2598 2552 2599 - For example, on a bond in active-backup mode with five slaves 2600 - all connected to one switch, the output may appear as follows: 2553 + For example, on a bond in active-backup mode with five slaves 2554 + all connected to one switch, the output may appear as follows:: 2601 2555 2602 - # ping -n 10.0.4.2 2603 - PING 10.0.4.2 (10.0.4.2) from 10.0.3.10 : 56(84) bytes of data. 2604 - 64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.7 ms 2605 - 64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) 2606 - 64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) 2607 - 64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) 2608 - 64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) 2609 - 64 bytes from 10.0.4.2: icmp_seq=2 ttl=64 time=0.216 ms 2610 - 64 bytes from 10.0.4.2: icmp_seq=3 ttl=64 time=0.267 ms 2611 - 64 bytes from 10.0.4.2: icmp_seq=4 ttl=64 time=0.222 ms 2556 + # ping -n 10.0.4.2 2557 + PING 10.0.4.2 (10.0.4.2) from 10.0.3.10 : 56(84) bytes of data. 2558 + 64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.7 ms 2559 + 64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) 2560 + 64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) 2561 + 64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) 2562 + 64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) 
2563 + 64 bytes from 10.0.4.2: icmp_seq=2 ttl=64 time=0.216 ms 2564 + 64 bytes from 10.0.4.2: icmp_seq=3 ttl=64 time=0.267 ms 2565 + 64 bytes from 10.0.4.2: icmp_seq=4 ttl=64 time=0.222 ms 2612 2566 2613 - This is not due to an error in the bonding driver, rather, it 2567 + This is not due to an error in the bonding driver, rather, it 2614 2568 is a side effect of how many switches update their MAC forwarding 2615 2569 tables. Initially, the switch does not associate the MAC address in 2616 2570 the packet with a particular switch port, and so it may send the ··· 2620 2574 ports, the bond device receives multiple copies of the same packet 2621 2575 (one per slave device). 2622 2576 2623 - The duplicated packet behavior is switch dependent, some 2577 + The duplicated packet behavior is switch dependent, some 2624 2578 switches exhibit this, and some do not. On switches that display this 2625 2579 behavior, it can be induced by clearing the MAC forwarding table (on 2626 2580 most Cisco switches, the privileged command "clear mac address-table ··· 2629 2583 14. Hardware Specific Considerations 2630 2584 ==================================== 2631 2585 2632 - This section contains additional information for configuring 2586 + This section contains additional information for configuring 2633 2587 bonding on specific hardware platforms, or for interfacing bonding 2634 2588 with particular switches or other devices. 2635 2589 2636 2590 14.1 IBM BladeCenter 2637 2591 -------------------- 2638 2592 2639 - This applies to the JS20 and similar systems. 2593 + This applies to the JS20 and similar systems. 2640 2594 2641 - On the JS20 blades, the bonding driver supports only 2595 + On the JS20 blades, the bonding driver supports only 2642 2596 balance-rr, active-backup, balance-tlb and balance-alb modes. This is 2643 2597 largely due to the network topology inside the BladeCenter, detailed 2644 2598 below. 
··· 2646 2600 JS20 network adapter information 2647 2601 -------------------------------- 2648 2602 2649 - All JS20s come with two Broadcom Gigabit Ethernet ports 2603 + All JS20s come with two Broadcom Gigabit Ethernet ports 2650 2604 integrated on the planar (that's "motherboard" in IBM-speak). In the 2651 2605 BladeCenter chassis, the eth0 port of all JS20 blades is hard wired to 2652 2606 I/O Module #1; similarly, all eth1 ports are wired to I/O Module #2. ··· 2654 2608 two more Gigabit Ethernet ports. These ports, eth2 and eth3, are 2655 2609 wired to I/O Modules 3 and 4, respectively. 2656 2610 2657 - Each I/O Module may contain either a switch or a passthrough 2611 + Each I/O Module may contain either a switch or a passthrough 2658 2612 module (which allows ports to be directly connected to an external 2659 2613 switch). Some bonding modes require a specific BladeCenter internal 2660 2614 network topology in order to function; these are detailed below. 2661 2615 2662 - Additional BladeCenter-specific networking information can be 2616 + Additional BladeCenter-specific networking information can be 2663 2617 found in two IBM Redbooks (www.ibm.com/redbooks): 2664 2618 2665 - "IBM eServer BladeCenter Networking Options" 2666 - "IBM eServer BladeCenter Layer 2-7 Network Switching" 2619 + - "IBM eServer BladeCenter Networking Options" 2620 + - "IBM eServer BladeCenter Layer 2-7 Network Switching" 2667 2621 2668 2622 BladeCenter networking configuration 2669 2623 ------------------------------------ 2670 2624 2671 - Because a BladeCenter can be configured in a very large number 2625 + Because a BladeCenter can be configured in a very large number 2672 2626 of ways, this discussion will be confined to describing basic 2673 2627 configurations. 2674 2628 2675 - Normally, Ethernet Switch Modules (ESMs) are used in I/O 2629 + Normally, Ethernet Switch Modules (ESMs) are used in I/O 2676 2630 modules 1 and 2. 
In this configuration, the eth0 and eth1 ports of a 2677 2631 JS20 will be connected to different internal switches (in the 2678 2632 respective I/O modules). 2679 2633 2680 - A passthrough module (OPM or CPM, optical or copper, 2634 + A passthrough module (OPM or CPM, optical or copper, 2681 2635 passthrough module) connects the I/O module directly to an external 2682 2636 switch. By using PMs in I/O module #1 and #2, the eth0 and eth1 2683 2637 interfaces of a JS20 can be redirected to the outside world and 2684 2638 connected to a common external switch. 2685 2639 2686 - Depending upon the mix of ESMs and PMs, the network will 2640 + Depending upon the mix of ESMs and PMs, the network will 2687 2641 appear to bonding as either a single switch topology (all PMs) or as a 2688 2642 multiple switch topology (one or more ESMs, zero or more PMs). It is 2689 2643 also possible to connect ESMs together, resulting in a configuration ··· 2693 2647 Requirements for specific modes 2694 2648 ------------------------------- 2695 2649 2696 - The balance-rr mode requires the use of passthrough modules 2650 + The balance-rr mode requires the use of passthrough modules 2697 2651 for devices in the bond, all connected to an common external switch. 2698 2652 That switch must be configured for "etherchannel" or "trunking" on the 2699 2653 appropriate ports, as is usual for balance-rr. 2700 2654 2701 - The balance-alb and balance-tlb modes will function with 2655 + The balance-alb and balance-tlb modes will function with 2702 2656 either switch modules or passthrough modules (or a mix). The only 2703 2657 specific requirement for these modes is that all network interfaces 2704 2658 must be able to reach all destinations for traffic sent over the 2705 2659 bonding device (i.e., the network must converge at some point outside 2706 2660 the BladeCenter). 2707 2661 2708 - The active-backup mode has no additional requirements. 2662 + The active-backup mode has no additional requirements. 
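Since active-backup imposes no requirements on the switch modules, it is the simplest of these modes to bring up on a BladeCenter. A minimal sketch, assuming eth0 and eth1 are the JS20 planar ports (the address and miimon value are likewise assumptions):

```sh
# Hypothetical active-backup bond over the two planar ports.
modprobe bonding mode=active-backup miimon=100
ifconfig bond0 10.0.1.10 netmask 255.255.255.0 up
ifenslave bond0 eth0 eth1
```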
2709 2663 2710 2664 Link monitoring issues 2711 2665 ---------------------- 2712 2666 2713 - When an Ethernet Switch Module is in place, only the ARP 2667 + When an Ethernet Switch Module is in place, only the ARP 2714 2668 monitor will reliably detect link loss to an external switch. This is 2715 2669 nothing unusual, but examination of the BladeCenter cabinet would 2716 2670 suggest that the "external" network ports are the ethernet ports for ··· 2718 2672 ports and the devices on the JS20 system itself. The MII monitor is 2719 2673 only able to detect link failures between the ESM and the JS20 system. 2720 2674 2721 - When a passthrough module is in place, the MII monitor does 2675 + When a passthrough module is in place, the MII monitor does 2722 2676 detect failures to the "external" port, which is then directly 2723 2677 connected to the JS20 system. 2724 2678 2725 2679 Other concerns 2726 2680 -------------- 2727 2681 2728 - The Serial Over LAN (SoL) link is established over the primary 2682 + The Serial Over LAN (SoL) link is established over the primary 2729 2683 ethernet (eth0) only, therefore, any loss of link to eth0 will result 2730 2684 in losing your SoL connection. It will not fail over with other 2731 2685 network traffic, as the SoL system is beyond the control of the 2732 2686 bonding driver. 2733 2687 2734 - It may be desirable to disable spanning tree on the switch 2688 + It may be desirable to disable spanning tree on the switch 2735 2689 (either the internal Ethernet Switch Module, or an external switch) to 2736 2690 avoid fail-over delay issues when using bonding. 2737 2691 2738 - 2692 + 2739 2693 15. Frequently Asked Questions 2740 2694 ============================== 2741 2695 2742 2696 1. Is it SMP safe? 2697 + ------------------- 2743 2698 2744 - Yes. The old 2.0.xx channel bonding patch was not SMP safe. 2699 + Yes. The old 2.0.xx channel bonding patch was not SMP safe. 2745 2700 The new driver was designed to be SMP safe from the start. 
2746 2701 2747 2702 2. What type of cards will work with it? 2703 + ----------------------------------------- 2748 2704 2749 - Any Ethernet type cards (you can even mix cards - a Intel 2705 + Any Ethernet type cards (you can even mix cards - a Intel 2750 2706 EtherExpress PRO/100 and a 3com 3c905b, for example). For most modes, 2751 2707 devices need not be of the same speed. 2752 2708 2753 - Starting with version 3.2.1, bonding also supports Infiniband 2709 + Starting with version 3.2.1, bonding also supports Infiniband 2754 2710 slaves in active-backup mode. 2755 2711 2756 2712 3. How many bonding devices can I have? 2713 + ---------------------------------------- 2757 2714 2758 - There is no limit. 2715 + There is no limit. 2759 2716 2760 2717 4. How many slaves can a bonding device have? 2718 + ---------------------------------------------- 2761 2719 2762 - This is limited only by the number of network interfaces Linux 2720 + This is limited only by the number of network interfaces Linux 2763 2721 supports and/or the number of network cards you can place in your 2764 2722 system. 2765 2723 2766 2724 5. What happens when a slave link dies? 2725 + ---------------------------------------- 2767 2726 2768 - If link monitoring is enabled, then the failing device will be 2727 + If link monitoring is enabled, then the failing device will be 2769 2728 disabled. The active-backup mode will fail over to a backup link, and 2770 2729 other modes will ignore the failed link. The link will continue to be 2771 2730 monitored, and should it recover, it will rejoin the bond (in whatever 2772 2731 manner is appropriate for the mode). See the sections on High 2773 2732 Availability and the documentation for each mode for additional 2774 2733 information. 
2775 - 2776 - Link monitoring can be enabled via either the miimon or 2734 + 2735 + Link monitoring can be enabled via either the miimon or 2777 2736 arp_interval parameters (described in the module parameters section, 2778 2737 above). In general, miimon monitors the carrier state as sensed by 2779 2738 the underlying network device, and the arp monitor (arp_interval) 2780 2739 monitors connectivity to another host on the local network. 2781 2740 2782 - If no link monitoring is configured, the bonding driver will 2741 + If no link monitoring is configured, the bonding driver will 2783 2742 be unable to detect link failures, and will assume that all links are 2784 2743 always available. This will likely result in lost packets, and a 2785 2744 resulting degradation of performance. The precise performance loss 2786 2745 depends upon the bonding mode and network configuration. 2787 2746 2788 2747 6. Can bonding be used for High Availability? 2748 + ---------------------------------------------- 2789 2749 2790 - Yes. See the section on High Availability for details. 2750 + Yes. See the section on High Availability for details. 2791 2751 2792 2752 7. Which switches/systems does it work with? 2753 + --------------------------------------------- 2793 2754 2794 - The full answer to this depends upon the desired mode. 2755 + The full answer to this depends upon the desired mode. 2795 2756 2796 - In the basic balance modes (balance-rr and balance-xor), it 2757 + In the basic balance modes (balance-rr and balance-xor), it 2797 2758 works with any system that supports etherchannel (also called 2798 2759 trunking). Most managed switches currently available have such 2799 2760 support, and many unmanaged switches as well. 
2800 2761 2801 - The advanced balance modes (balance-tlb and balance-alb) do 2762 + The advanced balance modes (balance-tlb and balance-alb) do 2802 2763 not have special switch requirements, but do need device drivers that 2803 2764 support specific features (described in the appropriate section under 2804 2765 module parameters, above). 2805 2766 2806 - In 802.3ad mode, it works with systems that support IEEE 2767 + In 802.3ad mode, it works with systems that support IEEE 2807 2768 802.3ad Dynamic Link Aggregation. Most managed and many unmanaged 2808 2769 switches currently available support 802.3ad. 2809 2770 2810 - The active-backup mode should work with any Layer-II switch. 2771 + The active-backup mode should work with any Layer-II switch. 2811 2772 2812 2773 8. Where does a bonding device get its MAC address from? 2774 + --------------------------------------------------------- 2813 2775 2814 - When using slave devices that have fixed MAC addresses, or when 2776 + When using slave devices that have fixed MAC addresses, or when 2815 2777 the fail_over_mac option is enabled, the bonding device's MAC address is 2816 2778 the MAC address of the active slave. 2817 2779 2818 - For other configurations, if not explicitly configured (with 2780 + For other configurations, if not explicitly configured (with 2819 2781 ifconfig or ip link), the MAC address of the bonding device is taken from 2820 2782 its first slave device. This MAC address is then passed to all following 2821 2783 slaves and remains persistent (even if the first slave is removed) until 2822 2784 the bonding device is brought down or reconfigured. 
2823 2785 2824 - If you wish to change the MAC address, you can set it with 2825 - ifconfig or ip link: 2786 + If you wish to change the MAC address, you can set it with 2787 + ifconfig or ip link:: 2826 2788 2827 - # ifconfig bond0 hw ether 00:11:22:33:44:55 2789 + # ifconfig bond0 hw ether 00:11:22:33:44:55 2828 2790 2829 - # ip link set bond0 address 66:77:88:99:aa:bb 2791 + # ip link set bond0 address 66:77:88:99:aa:bb 2830 2792 2831 - The MAC address can be also changed by bringing down/up the 2832 - device and then changing its slaves (or their order): 2793 + The MAC address can be also changed by bringing down/up the 2794 + device and then changing its slaves (or their order):: 2833 2795 2834 - # ifconfig bond0 down ; modprobe -r bonding 2835 - # ifconfig bond0 .... up 2836 - # ifenslave bond0 eth... 2796 + # ifconfig bond0 down ; modprobe -r bonding 2797 + # ifconfig bond0 .... up 2798 + # ifenslave bond0 eth... 2837 2799 2838 - This method will automatically take the address from the next 2800 + This method will automatically take the address from the next 2839 2801 slave that is added. 2840 2802 2841 - To restore your slaves' MAC addresses, you need to detach them 2842 - from the bond (`ifenslave -d bond0 eth0'). The bonding driver will 2803 + To restore your slaves' MAC addresses, you need to detach them 2804 + from the bond (``ifenslave -d bond0 eth0``). The bonding driver will 2843 2805 then restore the MAC addresses that the slaves had before they were 2844 2806 enslaved. 2845 2807 2846 2808 16. Resources and Links 2847 2809 ======================= 2848 2810 2849 - The latest version of the bonding driver can be found in the latest 2811 + The latest version of the bonding driver can be found in the latest 2850 2812 version of the linux kernel, found on http://kernel.org 2851 2813 2852 - The latest version of this document can be found in the latest kernel 2853 - source (named Documentation/networking/bonding.txt). 
2814 + The latest version of this document can be found in the latest kernel 2815 + source (named Documentation/networking/bonding.rst). 2854 2816 2855 - Discussions regarding the usage of the bonding driver take place on the 2817 + Discussions regarding the usage of the bonding driver take place on the 2856 2818 bonding-devel mailing list, hosted at sourceforge.net. If you have questions or 2857 2819 problems, post them to the list. The list address is: 2858 2820 2859 2821 bonding-devel@lists.sourceforge.net 2860 2822 2861 - The administrative interface (to subscribe or unsubscribe) can 2823 + The administrative interface (to subscribe or unsubscribe) can 2862 2824 be found at: 2863 2825 2864 2826 https://lists.sourceforge.net/lists/listinfo/bonding-devel 2865 2827 2866 - Discussions regarding the development of the bonding driver take place 2828 + Discussions regarding the development of the bonding driver take place 2867 2829 on the main Linux network mailing list, hosted at vger.kernel.org. The list 2868 2830 address is: 2869 2831 2870 2832 netdev@vger.kernel.org 2871 2833 2872 - The administrative interface (to subscribe or unsubscribe) can 2834 + The administrative interface (to subscribe or unsubscribe) can 2873 2835 be found at: 2874 2836 2875 2837 http://vger.kernel.org/vger-lists.html#netdev 2876 2838 2877 2839 Donald Becker's Ethernet Drivers and diag programs may be found at : 2878 - - http://web.archive.org/web/*/http://www.scyld.com/network/ 2840 + 2841 + - http://web.archive.org/web/%2E/http://www.scyld.com/network/ 2879 2842 2880 2843 You will also find a lot of information regarding Ethernet, NWay, MII, 2881 2844 etc. at www.scyld.com. 2882 - 2883 - -- END --
+1 -1
Documentation/networking/device_drivers/intel/e100.rst
··· 33 33 - SNMP 34 34 35 35 Channel Bonding documentation can be found in the Linux kernel source: 36 - /Documentation/networking/bonding.txt 36 + /Documentation/networking/bonding.rst 37 37 38 38 39 39 Identifying Your Adapter
+1 -1
Documentation/networking/device_drivers/intel/ixgb.rst
··· 37 37 - SNMP 38 38 39 39 Channel Bonding documentation can be found in the Linux kernel source: 40 - /Documentation/networking/bonding.txt 40 + /Documentation/networking/bonding.rst 41 41 42 42 The driver information previously displayed in the /proc filesystem is not 43 43 supported in this release. Alternatively, you can use ethtool (version 1.6
+1
Documentation/networking/index.rst
··· 44 44 atm 45 45 ax25 46 46 baycom 47 + bonding 47 48 48 49 .. only:: subproject and html 49 50
+1 -1
drivers/net/Kconfig
··· 50 50 The driver supports multiple bonding modes to allow for both high 51 51 performance and high availability operation. 52 52 53 - Refer to <file:Documentation/networking/bonding.txt> for more 53 + Refer to <file:Documentation/networking/bonding.rst> for more 54 54 information. 55 55 56 56 To compile this driver as a module, choose M here: the module