Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
at v2.6.13-rc6, 5016 lines, 144 kB
/*
 * originally based on the dummy device.
 *
 * Copyright 1999, Thomas Davis, tadavis@lbl.gov.
 * Licensed under the GPL. Based on dummy.c, and eql.c devices.
 *
 * bonding.c: an Ethernet Bonding driver
 *
 * This is useful to talk to a Cisco EtherChannel compatible equipment:
 *     Cisco 5500
 *     Sun Trunking (Solaris)
 *     Alteon AceDirector Trunks
 *     Linux Bonding
 *     and probably many L2 switches ...
 *
 * How it works:
 *    ifconfig bond0 ipaddress netmask up
 *      will setup a network device, with an ip address. No mac address
 *      will be assigned at this time. The hw mac address will come from
 *      the first slave bonded to the channel. All slaves will then use
 *      this hw mac address.
 *
 *    ifconfig bond0 down
 *      will release all slaves, marking them as down.
 *
 *    ifenslave bond0 eth0
 *      will attach eth0 to bond0 as a slave. eth0 hw mac address will either
 *      a: be used as initial mac address
 *      b: if a hw mac address already is there, eth0's hw mac address
 *         will then be set from bond0.
 *
 * v0.1 - first working version.
 * v0.2 - changed stats to be calculated by summing slaves stats.
 *
 * Changes:
 * Arnaldo Carvalho de Melo <acme@conectiva.com.br>
 * - fix leaks on failure at bond_init
 *
 * 2000/09/30 - Willy Tarreau <willy at meta-x.org>
 *     - added trivial code to release a slave device.
 *     - fixed security bug (CAP_NET_ADMIN not checked)
 *     - implemented MII link monitoring to disable dead links :
 *       All MII capable slaves are checked every <miimon> milliseconds
 *       (100 ms seems good). This value can be changed by passing it to
 *       insmod. A value of zero disables the monitoring (default).
 *     - fixed an infinite loop in bond_xmit_roundrobin() when there's no
 *       good slave.
 *     - made the code hopefully SMP safe
 *
 * 2000/10/03 - Willy Tarreau <willy at meta-x.org>
 *     - optimized slave lists based on relevant suggestions from Thomas Davis
 *     - implemented active-backup method to obtain HA with two switches:
 *       stay as long as possible on the same active interface, while we
 *       also monitor the backup one (MII link status) because we want to know
 *       if we are able to switch at any time. ( pass "mode=1" to insmod )
 *     - lots of stress testing because we need it to be more robust than the
 *       wires ! :->
 *
 * 2000/10/09 - Willy Tarreau <willy at meta-x.org>
 *     - added up and down delays after link state change.
 *     - optimized the slaves chaining so that when we run forward, we never
 *       repass through the bond itself, but we can find it by searching
 *       backwards. Renders the deletion more difficult, but accelerates the
 *       scan.
 *     - smarter enslaving and releasing.
 *     - finer and more robust SMP locking
 *
 * 2000/10/17 - Willy Tarreau <willy at meta-x.org>
 *     - fixed two potential SMP race conditions
 *
 * 2000/10/18 - Willy Tarreau <willy at meta-x.org>
 *     - small fixes to the monitoring FSM in case of zero delays
 * 2000/11/01 - Willy Tarreau <willy at meta-x.org>
 *     - fixed first slave not automatically used in trunk mode.
 * 2000/11/10 : spelling of "EtherChannel" corrected.
 * 2000/11/13 : fixed a race condition in case of concurrent accesses to ioctl().
 * 2000/12/16 : fixed improper usage of rtnl_exlock_nowait().
 *
 * 2001/1/3 - Chad N. Tindel <ctindel at ieee dot org>
 *     - The bonding driver now simulates MII status monitoring, just like
 *       a normal network device. It will show that the link is down iff
 *       every slave in the bond shows that their links are down. If at least
 *       one slave is up, the bond's MII status will appear as up.
 *
 * 2001/2/7 - Chad N. Tindel <ctindel at ieee dot org>
 *     - Applications can now query the bond from user space to get
 *       information which may be useful. They do this by calling
 *       the BOND_INFO_QUERY ioctl. Once the app knows how many slaves
 *       are in the bond, it can call the BOND_SLAVE_INFO_QUERY ioctl to
 *       get slave specific information (# link failures, etc). See
 *       <linux/if_bonding.h> for more details. The structs of interest
 *       are ifbond and ifslave.
 *
 * 2001/4/5 - Chad N. Tindel <ctindel at ieee dot org>
 *     - Ported to 2.4 Kernel
 *
 * 2001/5/2 - Jeffrey E. Mast <jeff at mastfamily dot com>
 *     - When a device is detached from a bond, the slave device is no longer
 *       left thinking that it has a master.
 *
 * 2001/5/16 - Jeffrey E. Mast <jeff at mastfamily dot com>
 *     - memset did not appropriately initialize the bond rw_locks. Used
 *       rwlock_init to initialize to unlocked state to prevent deadlock when
 *       first attempting a lock
 *     - Called SET_MODULE_OWNER for bond device
 *
 * 2001/5/17 - Tim Anderson <tsa at mvista.com>
 *     - 2 paths for releasing for slave release; 1 through ioctl
 *       and 2) through close. Both paths need to release the same way.
 *     - the free slave in bond release is changing slave status before
 *       the free. The netdev_set_master() is intended to change slave state
 *       so it should not be done as part of the release process.
 *     - Simple rule for slave state at release: only the active in A/B and
 *       only one in the trunked case.
 *
 * 2001/6/01 - Tim Anderson <tsa at mvista.com>
 *     - Now call dev_close when releasing a slave so it doesn't screw up
 *       our routing table.
 *
 * 2001/6/01 - Chad N. Tindel <ctindel at ieee dot org>
 *     - Added /proc support for getting bond and slave information.
 *       Information is in /proc/net/<bond device>/info.
 *     - Changed the locking when calling bond_close to prevent deadlock.
 *
 * 2001/8/05 - Janice Girouard <girouard at us.ibm.com>
 *     - correct problem where refcnt of slave is not incremented in bond_ioctl
 *       so the system hangs when halting.
 *     - correct locking problem when unable to malloc in bond_enslave.
 *     - adding bond_xmit_xor logic.
 *     - adding multiple bond device support.
 *
 * 2001/8/13 - Erik Habbinga <erik_habbinga at hp dot com>
 *     - correct locking problem with rtnl_exlock_nowait
 *
 * 2001/8/23 - Janice Girouard <girouard at us.ibm.com>
 *     - bzero initial dev_bonds, to correct oops
 *     - convert SIOCDEVPRIVATE to new MII ioctl calls
 *
 * 2001/9/13 - Takao Indoh <indou dot takao at jp dot fujitsu dot com>
 *     - Add the BOND_CHANGE_ACTIVE ioctl implementation
 *
 * 2001/9/14 - Mark Huth <mhuth at mvista dot com>
 *     - Change MII_LINK_READY to not check for end of auto-negotiation,
 *       but only for an up link.
 *
 * 2001/9/20 - Chad N. Tindel <ctindel at ieee dot org>
 *     - Add the device field to bonding_t. Previously the net_device
 *       corresponding to a bond wasn't available from the bonding_t
 *       structure.
 *
 * 2001/9/25 - Janice Girouard <girouard at us.ibm.com>
 *     - add arp_monitor for active backup mode
 *
 * 2001/10/23 - Takao Indoh <indou dot takao at jp dot fujitsu dot com>
 *     - Various memory leak fixes
 *
 * 2001/11/5 - Mark Huth <mark dot huth at mvista dot com>
 *     - Don't take rtnl lock in bond_mii_monitor as it deadlocks under
 *       certain hotswap conditions.
 *       Note: this same change may be required in bond_arp_monitor ???
 *     - Remove possibility of calling bond_sethwaddr with NULL slave_dev ptr
 *     - Handle hot swap ethernet interface deregistration events to remove
 *       kernel oops following hot swap of enslaved interface
 *
 * 2002/1/2 - Chad N. Tindel <ctindel at ieee dot org>
 *     - Restore original slave flags at release time.
 *
 * 2002/02/18 - Erik Habbinga <erik_habbinga at hp dot com>
 *     - bond_release(): calling kfree on our_slave after call to
 *       bond_restore_slave_flags, not before
 *     - bond_enslave(): saving slave flags into original_flags before
 *       call to netdev_set_master, so the IFF_SLAVE flag doesn't end
 *       up in original_flags
 *
 * 2002/04/05 - Mark Smith <mark.smith at comdev dot cc> and
 *              Steve Mead <steve.mead at comdev dot cc>
 *     - Port Gleb Natapov's multicast support patches from 2.4.12
 *       to 2.4.18 adding support for multicast.
 *
 * 2002/06/10 - Tony Cureington <tony.cureington * hp_com>
 *     - corrected uninitialized pointer (ifr.ifr_data) in bond_check_dev_link;
 *       actually changed function to use MIIPHY, then MIIREG, and finally
 *       ETHTOOL to determine the link status
 *     - fixed bad ifr_data pointer assignments in bond_ioctl
 *     - corrected mode 1 being reported as active-backup in bond_get_info;
 *       also added text to distinguish type of load balancing (rr or xor)
 *     - change arp_ip_target module param from "1-12s" (array of 12 ptrs)
 *       to "s" (a single ptr)
 *
 * 2002/08/30 - Jay Vosburgh <fubar at us dot ibm dot com>
 *     - Removed acquisition of xmit_lock in set_multicast_list; caused
 *       deadlock on SMP (lock is held by caller).
 *     - Revamped SIOCGMIIPHY, SIOCGMIIREG portion of bond_check_dev_link().
 *
 * 2002/09/18 - Jay Vosburgh <fubar at us dot ibm dot com>
 *     - Fixed up bond_check_dev_link() (and callers): removed some magic
 *       numbers, banished local MII_ defines, wrapped ioctl calls to
 *       prevent EFAULT errors
 *
 * 2002/9/30 - Jay Vosburgh <fubar at us dot ibm dot com>
 *     - make sure the ip target matches the arp_target before saving the
 *       hw address.
 *
 * 2002/9/30 - Dan Eisner <eisner at 2robots dot com>
 *     - make sure my_ip is set before taking down the link, since
 *       not all switches respond if the source ip is not set.
 *
 * 2002/10/8 - Janice Girouard <girouard at us dot ibm dot com>
 *     - read in the local ip address when enslaving a device
 *     - add primary support
 *     - make sure 2*arp_interval has passed when a new device
 *       is brought on-line before taking it down.
 *
 * 2002/09/11 - Philippe De Muyter <phdm at macqel dot be>
 *     - Added bond_xmit_broadcast logic.
 *     - Added bond_mode() support function.
 *
 * 2002/10/26 - Laurent Deniel <laurent.deniel at free.fr>
 *     - allow to register multicast addresses only on active slave
 *       (useful in active-backup mode)
 *     - add multicast module parameter
 *     - fix deletion of multicast groups after unloading module
 *
 * 2002/11/06 - Kameshwara Rayaprolu <kameshwara.rao * wipro_com>
 *     - Changes to prevent panic from closing the device twice; if we close
 *       the device in bond_release, we must set the original_flags to down
 *       so it won't be closed again by the network layer.
 *
 * 2002/11/07 - Tony Cureington <tony.cureington * hp_com>
 *     - Fix arp_target_hw_addr memory leak
 *     - Created activebackup_arp_monitor function to handle arp monitoring
 *       in active backup mode - the bond_arp_monitor had several problems...
 *       such as allowing slaves to tx arps sequentially without any delay
 *       for a response
 *     - Renamed bond_arp_monitor to loadbalance_arp_monitor and re-wrote
 *       this function to just handle arp monitoring in load-balancing mode;
 *       it is a lot more compact now
 *     - Changes to ensure one and only one slave transmits in active-backup
 *       mode
 *     - Robustesize parameters; warn users about bad combinations of
 *       parameters; also if miimon is specified and a network driver does
 *       not support MII or ETHTOOL, inform the user of this
 *     - Changes to support link_failure_count when in arp monitoring mode
 *     - Fix up/down delay reported in /proc
 *     - Added version; log version; make version available from "modinfo -d"
 *     - Fixed problem in bond_check_dev_link - if the first IOCTL (SIOCGMIIPH)
 *       failed, the ETHTOOL ioctl never got a chance
 *
 * 2002/11/16 - Laurent Deniel <laurent.deniel at free.fr>
 *     - fix multicast handling in activebackup_arp_monitor
 *     - remove one unnecessary and confusing curr_active_slave == slave test
 *       in activebackup_arp_monitor
 *
 * 2002/11/17 - Laurent Deniel <laurent.deniel at free.fr>
 *     - fix bond_slave_info_query when slave_id = num_slaves
 *
 * 2002/11/19 - Janice Girouard <girouard at us dot ibm dot com>
 *     - correct ifr_data reference. Update ifr_data reference
 *       to mii_ioctl_data struct values to avoid confusion.
 *
 * 2002/11/22 - Bert Barbe <bert.barbe at oracle dot com>
 *     - Add support for multiple arp_ip_target
 *
 * 2002/12/13 - Jay Vosburgh <fubar at us dot ibm dot com>
 *     - Changed to allow text strings for mode and multicast, e.g.,
 *       insmod bonding mode=active-backup. The numbers still work.
 *       One change: an invalid choice will cause module load failure,
 *       rather than the previous behavior of just picking one.
 *     - Minor cleanups; got rid of dup ctype stuff, atoi function
 *
 * 2003/02/07 - Jay Vosburgh <fubar at us dot ibm dot com>
 *     - Added use_carrier module parameter that causes miimon to
 *       use netif_carrier_ok() test instead of MII/ETHTOOL ioctls.
 *     - Minor cleanups; consolidated ioctl calls to one function.
 *
 * 2003/02/07 - Tony Cureington <tony.cureington * hp_com>
 *     - Fix bond_mii_monitor() logic error that could result in
 *       bonding round-robin mode ignoring links after failover/recovery
 *
 * 2003/03/17 - Jay Vosburgh <fubar at us dot ibm dot com>
 *     - kmalloc fix (GFP_KERNEL to GFP_ATOMIC) reported by
 *       Shmulik dot Hen at intel.com.
 *     - Based on discussion on mailing list, changed use of
 *       update_slave_cnt(), created wrapper functions for adding/removing
 *       slaves, changed bond_xmit_xor() to check slave_cnt instead of
 *       checking slave and slave->dev (which only worked by accident).
 *     - Misc code cleanup: get arp_send() prototype from header file,
 *       add max_bonds to bonding.txt.
 *
 * 2003/03/18 - Tsippy Mendelson <tsippy.mendelson at intel dot com> and
 *              Shmulik Hen <shmulik.hen at intel dot com>
 *     - Make sure only bond_attach_slave() and bond_detach_slave() can
 *       manipulate the slave list, including slave_cnt, even when in
 *       bond_release_all().
 *     - Fixed hang in bond_release() with traffic running:
 *       netdev_set_master() must not be called from within the bond lock.
 *
 * 2003/03/18 - Tsippy Mendelson <tsippy.mendelson at intel dot com> and
 *              Shmulik Hen <shmulik.hen at intel dot com>
 *     - Fixed hang in bond_enslave() with traffic running:
 *       netdev_set_master() must not be called from within the bond lock.
 *
 * 2003/03/18 - Amir Noam <amir.noam at intel dot com>
 *     - Added support for getting slave's speed and duplex via ethtool.
 *       Needed for 802.3ad and other future modes.
 *
 * 2003/03/18 - Tsippy Mendelson <tsippy.mendelson at intel dot com> and
 *              Shmulik Hen <shmulik.hen at intel dot com>
 *     - Enable support of modes that need to use the unique mac address of
 *       each slave.
 *       * bond_enslave(): Moved setting the slave's mac address, and
 *         opening it, from the application to the driver. This breaks
 *         backward compatibility with old versions of ifenslave that open
 *         the slave before enslaving it !!!.
 *       * bond_release(): The driver also takes care of closing the slave
 *         and restoring its original mac address.
 *     - Removed the code that restores all base driver's flags.
 *       Flags are automatically restored once all undo stages are done
 *       properly.
 *     - Block possibility of enslaving before the master is up. This
 *       prevents putting the system in an unstable state.
 *
 * 2003/03/18 - Amir Noam <amir.noam at intel dot com>,
 *              Tsippy Mendelson <tsippy.mendelson at intel dot com> and
 *              Shmulik Hen <shmulik.hen at intel dot com>
 *     - Added support for IEEE 802.3ad Dynamic link aggregation mode.
 *
 * 2003/05/01 - Amir Noam <amir.noam at intel dot com>
 *     - Added ABI version control to restore compatibility between
 *       new/old ifenslave and new/old bonding.
 *
 * 2003/05/01 - Shmulik Hen <shmulik.hen at intel dot com>
 *     - Fixed bug in bond_release_all(): save old value of curr_active_slave
 *       before setting it to NULL.
 *     - Changed driver versioning scheme to include version number instead
 *       of release date (that is already in another field). There are 3
 *       fields X.Y.Z where:
 *           X - Major version - big behavior changes
 *           Y - Minor version - addition of features
 *           Z - Extra version - minor changes and bug fixes
 *       The current version is 1.0.0 as a base line.
 *
 * 2003/05/01 - Tsippy Mendelson <tsippy.mendelson at intel dot com> and
 *              Amir Noam <amir.noam at intel dot com>
 *     - Added support for lacp_rate module param.
 *     - Code beautification and style changes (mainly in comments).
 *       new version - 1.0.1
 *
 * 2003/05/01 - Shmulik Hen <shmulik.hen at intel dot com>
 *     - Based on discussion on mailing list, changed locking scheme
 *       to use lock/unlock or lock_bh/unlock_bh appropriately instead
 *       of lock_irqsave/unlock_irqrestore. The new scheme helps expose
 *       hidden bugs and solves system hangs that occurred due to the fact
 *       that holding lock_irqsave doesn't prevent softirqs from running.
 *       This also increases total throughput since interrupts are not
 *       blocked on each transmitted packet or monitor timeout.
 *       new version - 2.0.0
 *
 * 2003/05/01 - Shmulik Hen <shmulik.hen at intel dot com>
 *     - Added support for Transmit load balancing mode.
 *     - Concentrate all assignments of curr_active_slave to a single point
 *       so specific modes can take actions when the primary adapter is
 *       changed.
 *     - Take the updelay parameter into consideration during bond_enslave
 *       since some adapters lose their link during setting the device.
 *     - Renamed bond_3ad_link_status_changed() to
 *       bond_3ad_handle_link_change() for compatibility with TLB.
 *       new version - 2.1.0
 *
 * 2003/05/01 - Tsippy Mendelson <tsippy.mendelson at intel dot com>
 *     - Added support for Adaptive load balancing mode which is
 *       equivalent to Transmit load balancing + Receive load balancing.
 *       new version - 2.2.0
 *
 * 2003/05/15 - Jay Vosburgh <fubar at us dot ibm dot com>
 *     - Applied fix to activebackup_arp_monitor posted to bonding-devel
 *       by Tony Cureington <tony.cureington * hp_com>. Fixes ARP
 *       monitor endless failover bug. Version to 2.2.10
 *
 * 2003/05/20 - Amir Noam <amir.noam at intel dot com>
 *     - Fixed bug in ABI version control - Don't commit to a specific
 *       ABI version if receiving unsupported ioctl commands.
 *
 * 2003/05/22 - Jay Vosburgh <fubar at us dot ibm dot com>
 *     - Fix ifenslave -c causing bond to lose existing routes;
 *       added bond_set_mac_address() that doesn't require the
 *       bond to be down.
 *     - In conjunction with fix for ifenslave -c, in
 *       bond_change_active(), changing to the already active slave
 *       is no longer an error (it successfully does nothing).
 *
 * 2003/06/30 - Amir Noam <amir.noam at intel dot com>
 *     - Fixed bond_change_active() for ALB/TLB modes.
 *       Version to 2.2.14.
 *
 * 2003/07/29 - Amir Noam <amir.noam at intel dot com>
 *     - Fixed ARP monitoring bug.
 *       Version to 2.2.15.
 *
 * 2003/07/31 - Willy Tarreau <willy at ods dot org>
 *     - Fixed kernel panic when using ARP monitoring without
 *       setting bond's IP address.
 *       Version to 2.2.16.
 *
 * 2003/08/06 - Amir Noam <amir.noam at intel dot com>
 *     - Back port from 2.6: use alloc_netdev(); fix /proc handling;
 *       made stats a part of bond struct so no need to allocate
 *       and free it separately; use standard list operations instead
 *       of pre-allocated array of bonds.
 *       Version to 2.3.0.
 *
 * 2003/08/07 - Jay Vosburgh <fubar at us dot ibm dot com>,
 *              Amir Noam <amir.noam at intel dot com> and
 *              Shmulik Hen <shmulik.hen at intel dot com>
 *     - Propagating master's settings: Distinguish between modes that
 *       use a primary slave from those that don't, and propagate settings
 *       accordingly; Consolidate change_active operations and add
 *       reselect_active and find_best operations; Decouple promiscuous
 *       handling from the multicast mode setting; Add support for changing
 *       HW address and MTU with proper unwind; Consolidate procfs code,
 *       add CHANGENAME handler; Enhance netdev notification handling.
 *       Version to 2.4.0.
 *
 * 2003/09/15 - Stephen Hemminger <shemminger at osdl dot org>,
 *              Amir Noam <amir.noam at intel dot com>
 *     - Convert /proc to seq_file interface.
 *       Change /proc/net/bondX/info to /proc/net/bonding/bondX.
 *       Set version to 2.4.1.
 *
 * 2003/11/20 - Amir Noam <amir.noam at intel dot com>
 *     - Fix /proc creation/destruction.
 *
 * 2003/12/01 - Shmulik Hen <shmulik.hen at intel dot com>
 *     - Massive cleanup - Set version to 2.5.0
 *       Code changes:
 *       o Consolidate format of prints and debug prints.
 *       o Remove bonding_t/slave_t typedefs and consolidate all casts.
 *       o Remove dead code and unnecessary checks.
 *       o Consolidate starting/stopping timers.
 *       o Consolidate handling of primary module param throughout the code.
 *       o Removed multicast module param support - all settings are done
 *         according to mode.
 *       o Slave list iteration - bond is no longer part of the list,
 *         added cyclic list iteration macros.
 *       o Consolidate error handling in all xmit functions.
 *       Style changes:
 *       o Consolidate function naming and declarations.
 *       o Consolidate function params and local variables names.
 *       o Consolidate return values.
 *       o Consolidate curly braces.
 *       o Consolidate conditionals format.
 *       o Change struct member names and types.
 *       o Chomp trailing spaces, remove empty lines, fix indentations.
 *       o Re-organize code according to context.
 *
 * 2003/12/30 - Amir Noam <amir.noam at intel dot com>
 *     - Fixed: Cannot remove and re-enslave the original active slave.
 *     - Fixed: Releasing the original active slave causes mac address
 *       duplication.
 *     - Add support for slaves that use ethtool_ops.
 *       Set version to 2.5.3.
 *
 * 2004/01/05 - Amir Noam <amir.noam at intel dot com>
 *     - Save bonding parameters per bond instead of using the global values.
 *       Set version to 2.5.4.
 *
 * 2004/01/14 - Shmulik Hen <shmulik.hen at intel dot com>
 *     - Enhance VLAN support:
 *       * Add support for VLAN hardware acceleration capable slaves.
 *       * Add capability to tag self generated packets in ALB/TLB modes.
 *       Set version to 2.6.0.
 * 2004/10/29 - Mitch Williams <mitch.a.williams at intel dot com>
 *     - Fixed bug when unloading module while using 802.3ad. If
 *       spinlock debugging is turned on, this causes a stack dump.
 *       Solution is to move call to dev_remove_pack outside of the
 *       spinlock.
 *       Set version to 2.6.1.
 * 2005/06/05 - Jay Vosburgh <fubar@us.ibm.com>
 *     - Support for generating gratuitous ARPs in active-backup mode.
 *       Includes support for VLAN tagging all bonding-generated ARPs
 *       as needed. Set version to 2.6.2.
 * 2005/06/08 - Jason Gabler <jygabler at lbl dot gov>
 *     - alternate hashing policy support for mode 2
 *       * Added kernel parameter "xmit_hash_policy" to allow the selection
 *         of different hashing policies for mode 2. The original mode 2
 *         policy is the default, now found in xmit_hash_policy_layer2().
 *       * Added xmit_hash_policy_layer34()
 *     - Modified by Jay Vosburgh <fubar@us.ibm.com> to also support mode 4.
 *       Set version to 2.6.3.
 */

//#define BONDING_DEBUG 1

#include <linux/config.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/sched.h>
#include <linux/types.h>
#include <linux/fcntl.h>
#include <linux/interrupt.h>
#include <linux/ptrace.h>
#include <linux/ioport.h>
#include <linux/in.h>
#include <net/ip.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <linux/udp.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/init.h>
#include <linux/timer.h>
#include <linux/socket.h>
#include <linux/ctype.h>
#include <linux/inet.h>
#include <linux/bitops.h>
#include <asm/system.h>
#include <asm/io.h>
#include <asm/dma.h>
#include <asm/uaccess.h>
#include <linux/errno.h>
#include <linux/netdevice.h>
#include <linux/inetdevice.h>
#include <linux/etherdevice.h>
#include <linux/skbuff.h>
#include <net/sock.h>
#include <linux/rtnetlink.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/smp.h>
#include <linux/if_ether.h>
#include <net/arp.h>
#include <linux/mii.h>
#include <linux/ethtool.h>
#include <linux/if_vlan.h>
#include <linux/if_bonding.h>
#include <net/route.h>
#include "bonding.h"
#include "bond_3ad.h"
#include "bond_alb.h"

/*---------------------------- Module parameters ----------------------------*/

/* monitor all links that often (in milliseconds). <=0 disables monitoring */
#define BOND_LINK_MON_INTERV	0
#define BOND_LINK_ARP_INTERV	0

static int max_bonds = BOND_DEFAULT_MAX_BONDS;
static int miimon = BOND_LINK_MON_INTERV;
static int updelay = 0;
static int downdelay = 0;
static int use_carrier = 1;
static char *mode = NULL;
static char *primary = NULL;
static char *lacp_rate = NULL;
static char *xmit_hash_policy = NULL;
static int arp_interval = BOND_LINK_ARP_INTERV;
static char *arp_ip_target[BOND_MAX_ARP_TARGETS] = { NULL, };

module_param(max_bonds, int, 0);
MODULE_PARM_DESC(max_bonds, "Max number of bonded devices");
module_param(miimon, int, 0);
MODULE_PARM_DESC(miimon, "Link check interval in milliseconds");
module_param(updelay, int, 0);
MODULE_PARM_DESC(updelay, "Delay before considering link up, in milliseconds");
module_param(downdelay, int, 0);
MODULE_PARM_DESC(downdelay, "Delay before considering link down, in milliseconds");
module_param(use_carrier, int, 0);
MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; 0 for off, 1 for on (default)");
module_param(mode, charp, 0);
MODULE_PARM_DESC(mode, "Mode of operation : 0 for round robin, 1 for active-backup, 2 for xor");
module_param(primary, charp, 0);
MODULE_PARM_DESC(primary, "Primary network device to use");
module_param(lacp_rate, charp, 0);
MODULE_PARM_DESC(lacp_rate, "LACPDU tx rate to request from 802.3ad partner (slow/fast)");
module_param(xmit_hash_policy, charp, 0);
MODULE_PARM_DESC(xmit_hash_policy, "XOR hashing method : 0 for layer 2 (default), 1 for layer 3+4");
module_param(arp_interval, int, 0);
MODULE_PARM_DESC(arp_interval, "arp interval in milliseconds");
module_param_array(arp_ip_target, charp, NULL, 0);
MODULE_PARM_DESC(arp_ip_target, "arp targets in n.n.n.n form");

/*----------------------------- Global variables ----------------------------*/

static const char *version =
	DRV_DESCRIPTION ": v" DRV_VERSION " (" DRV_RELDATE ")\n";

static LIST_HEAD(bond_dev_list);

#ifdef CONFIG_PROC_FS
static struct proc_dir_entry *bond_proc_dir = NULL;
#endif

static u32 arp_target[BOND_MAX_ARP_TARGETS] = { 0, } ;
static int arp_ip_count = 0;
static int bond_mode = BOND_MODE_ROUNDROBIN;
static int xmit_hashtype= BOND_XMIT_POLICY_LAYER2;
static int lacp_fast = 0;
static int app_abi_ver = 0;
static int orig_app_abi_ver = -1; /* This is used to save the first ABI version
				   * we receive from the application. Once set,
				   * it won't be changed, and the module will
				   * refuse to enslave/release interfaces if the
				   * command comes from an application using
				   * another ABI version.
				   */
struct bond_parm_tbl {
	char *modename;
	int mode;
};

static struct bond_parm_tbl bond_lacp_tbl[] = {
{ "slow", AD_LACP_SLOW},
{ "fast", AD_LACP_FAST},
{ NULL, -1},
};

static struct bond_parm_tbl bond_mode_tbl[] = {
{ "balance-rr", BOND_MODE_ROUNDROBIN},
{ "active-backup", BOND_MODE_ACTIVEBACKUP},
{ "balance-xor", BOND_MODE_XOR},
{ "broadcast", BOND_MODE_BROADCAST},
{ "802.3ad", BOND_MODE_8023AD},
{ "balance-tlb", BOND_MODE_TLB},
{ "balance-alb", BOND_MODE_ALB},
{ NULL, -1},
};

static struct bond_parm_tbl xmit_hashtype_tbl[] = {
{ "layer2", BOND_XMIT_POLICY_LAYER2},
{ "layer3+4", BOND_XMIT_POLICY_LAYER34},
{ NULL, -1},
};

/*-------------------------- Forward declarations ---------------------------*/

static inline void bond_set_mode_ops(struct bonding *bond, int mode);
static void bond_send_gratuitous_arp(struct bonding *bond);

/*---------------------------- General routines -----------------------------*/

static const char *bond_mode_name(int mode)
{
	switch (mode) {
	case BOND_MODE_ROUNDROBIN :
		return "load balancing (round-robin)";
	case BOND_MODE_ACTIVEBACKUP :
		return "fault-tolerance (active-backup)";
	case BOND_MODE_XOR :
		return "load balancing (xor)";
	case BOND_MODE_BROADCAST :
		return "fault-tolerance (broadcast)";
	case BOND_MODE_8023AD:
		return "IEEE 802.3ad Dynamic link aggregation";
	case BOND_MODE_TLB:
		return "transmit load balancing";
	case BOND_MODE_ALB:
		return "adaptive load balancing";
	default:
		return "unknown";
	}
}

/*---------------------------------- VLAN -----------------------------------*/

/**
 * bond_add_vlan - add a new vlan id on bond
 * @bond: bond that got the notification
 * @vlan_id: the vlan id to add
 *
 * Returns -ENOMEM if allocation failed.
 */
static int bond_add_vlan(struct bonding *bond, unsigned short vlan_id)
{
	struct vlan_entry *vlan;

	dprintk("bond: %s, vlan id %d\n",
		(bond ? bond->dev->name: "None"), vlan_id);

	vlan = kmalloc(sizeof(struct vlan_entry), GFP_KERNEL);
	if (!vlan) {
		return -ENOMEM;
	}

	INIT_LIST_HEAD(&vlan->vlan_list);
	vlan->vlan_id = vlan_id;
	vlan->vlan_ip = 0;

	write_lock_bh(&bond->lock);

	list_add_tail(&vlan->vlan_list, &bond->vlan_list);

	write_unlock_bh(&bond->lock);

	dprintk("added VLAN ID %d on bond %s\n", vlan_id, bond->dev->name);

	return 0;
}

/**
 * bond_del_vlan - delete a vlan id from bond
 * @bond: bond that got the notification
 * @vlan_id: the vlan id to delete
 *
 * returns -ENODEV if @vlan_id was not found in @bond.
 */
static int bond_del_vlan(struct bonding *bond, unsigned short vlan_id)
{
	struct vlan_entry *vlan, *next;
	int res = -ENODEV;

	dprintk("bond: %s, vlan id %d\n", bond->dev->name, vlan_id);

	write_lock_bh(&bond->lock);

	list_for_each_entry_safe(vlan, next, &bond->vlan_list, vlan_list) {
		if (vlan->vlan_id == vlan_id) {
			list_del(&vlan->vlan_list);

			if ((bond->params.mode == BOND_MODE_TLB) ||
			    (bond->params.mode == BOND_MODE_ALB)) {
				bond_alb_clear_vlan(bond, vlan_id);
			}

			dprintk("removed VLAN ID %d from bond %s\n", vlan_id,
				bond->dev->name);

			kfree(vlan);

			if (list_empty(&bond->vlan_list) &&
			    (bond->slave_cnt == 0)) {
				/* Last VLAN removed and no slaves, so
				 * restore block on adding VLANs. This will
				 * be removed once new slaves that are not
				 * VLAN challenged will be added.
				 */
				bond->dev->features |= NETIF_F_VLAN_CHALLENGED;
			}

			res = 0;
			goto out;
		}
	}

	dprintk("couldn't find VLAN ID %d in bond %s\n", vlan_id,
		bond->dev->name);

out:
	write_unlock_bh(&bond->lock);
	return res;
}

/**
 * bond_has_challenged_slaves
 * @bond: the bond we're working on
 *
 * Searches the slave list. Returns 1 if a vlan challenged slave
 * was found, 0 otherwise.
 *
 * Assumes bond->lock is held.
 */
static int bond_has_challenged_slaves(struct bonding *bond)
{
	struct slave *slave;
	int i;

	bond_for_each_slave(bond, slave, i) {
		if (slave->dev->features & NETIF_F_VLAN_CHALLENGED) {
			dprintk("found VLAN challenged slave - %s\n",
				slave->dev->name);
			return 1;
		}
	}

	dprintk("no VLAN challenged slaves found\n");
	return 0;
}

/**
 * bond_next_vlan - safely skip to the next item in the vlans list.
 * @bond: the bond we're working on
 * @curr: item we're advancing from
 *
 * Returns %NULL if list is empty, bond->next_vlan if @curr is %NULL,
 * or @curr->next otherwise (even if it is @curr itself again).
 *
 * Caller must hold bond->lock
 */
struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry *curr)
{
	struct vlan_entry *next, *last;

	if (list_empty(&bond->vlan_list)) {
		return NULL;
	}

	if (!curr) {
		next = list_entry(bond->vlan_list.next,
				  struct vlan_entry, vlan_list);
	} else {
		last = list_entry(bond->vlan_list.prev,
				  struct vlan_entry, vlan_list);
		if (last == curr) {
			next = list_entry(bond->vlan_list.next,
					  struct vlan_entry, vlan_list);
		} else {
			next = list_entry(curr->vlan_list.next,
					  struct vlan_entry, vlan_list);
		}
	}

	return next;
}

/**
 * bond_dev_queue_xmit - Prepare skb for xmit.
 *
 * @bond: bond device that got this skb for tx.
 * @skb: hw accel VLAN tagged skb to transmit
 * @slave_dev: slave that is supposed to xmit this skbuff
 *
 * When the bond gets an skb to transmit that is
 * already hardware accelerated VLAN tagged, and it
 * needs to relay this skb to a slave that is not
 * hw accel capable, the skb needs to be "unaccelerated",
 * i.e. strip the hwaccel tag and re-insert it as part
 * of the payload.
 */
int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb, struct net_device *slave_dev)
{
	unsigned short vlan_id;

	if (!list_empty(&bond->vlan_list) &&
	    !(slave_dev->features & NETIF_F_HW_VLAN_TX) &&
	    vlan_get_tag(skb, &vlan_id) == 0) {
		skb->dev = slave_dev;
		skb = vlan_put_tag(skb, vlan_id);
		if (!skb) {
			/* vlan_put_tag() frees the skb in case of error,
			 * so return success here so the calling functions
			 * won't attempt to free it again.
			 */
			return 0;
		}
	} else {
		skb->dev = slave_dev;
	}

	skb->priority = 1;
	dev_queue_xmit(skb);

	return 0;
}

/*
 * In the following 3 functions, bond_vlan_rx_register(), bond_vlan_rx_add_vid
 * and bond_vlan_rx_kill_vid, we don't protect the slave list iteration with a
 * lock because:
 * a. This operation is performed in IOCTL context,
 * b. The operation is protected by the RTNL semaphore in the 8021q code,
 * c. Holding a lock with BH disabled while directly calling a base driver
 *    entry point is generally a BAD idea.
 *
 * The design of synchronization/protection for this operation in the 8021q
 * module is good for one or more VLAN devices over a single physical device
 * and cannot be extended for a teaming solution like bonding, so there is a
 * potential race condition here where a net device from the vlan group might
 * be referenced (either by a base driver or the 8021q code) while it is being
 * removed from the system. However, it turns out we're not making matters
 * worse, and if it works for regular VLAN usage it will work here too.
*/

/**
 * bond_vlan_rx_register - Propagates registration to slaves
 * @bond_dev: bonding net device that got called
 * @grp: vlan group being registered
 */
static void bond_vlan_rx_register(struct net_device *bond_dev, struct vlan_group *grp)
{
	struct bonding *bond = bond_dev->priv;
	struct slave *slave;
	int i;

	bond->vlgrp = grp;

	bond_for_each_slave(bond, slave, i) {
		struct net_device *slave_dev = slave->dev;

		if ((slave_dev->features & NETIF_F_HW_VLAN_RX) &&
		    slave_dev->vlan_rx_register) {
			slave_dev->vlan_rx_register(slave_dev, grp);
		}
	}
}

/**
 * bond_vlan_rx_add_vid - Propagates adding an id to slaves
 * @bond_dev: bonding net device that got called
 * @vid: vlan id being added
 */
static void bond_vlan_rx_add_vid(struct net_device *bond_dev, uint16_t vid)
{
	struct bonding *bond = bond_dev->priv;
	struct slave *slave;
	int i, res;

	bond_for_each_slave(bond, slave, i) {
		struct net_device *slave_dev = slave->dev;

		if ((slave_dev->features & NETIF_F_HW_VLAN_FILTER) &&
		    slave_dev->vlan_rx_add_vid) {
			slave_dev->vlan_rx_add_vid(slave_dev, vid);
		}
	}

	res = bond_add_vlan(bond, vid);
	if (res) {
		printk(KERN_ERR DRV_NAME
		       ": %s: Failed to add vlan id %d\n",
		       bond_dev->name, vid);
	}
}

/**
 * bond_vlan_rx_kill_vid - Propagates deleting an id to slaves
 * @bond_dev: bonding net device that got called
 * @vid: vlan id being removed
 */
static void bond_vlan_rx_kill_vid(struct net_device *bond_dev, uint16_t vid)
{
	struct bonding *bond = bond_dev->priv;
	struct slave *slave;
	struct net_device *vlan_dev;
	int i, res;

	bond_for_each_slave(bond, slave, i) {
		struct net_device *slave_dev = slave->dev;

		if ((slave_dev->features & NETIF_F_HW_VLAN_FILTER) &&
		    slave_dev->vlan_rx_kill_vid) {
			/* Save and then restore vlan_dev in the grp array,
			 * since the slave's driver might clear it.
			 */
			vlan_dev = bond->vlgrp->vlan_devices[vid];
			slave_dev->vlan_rx_kill_vid(slave_dev, vid);
			bond->vlgrp->vlan_devices[vid] = vlan_dev;
		}
	}

	res = bond_del_vlan(bond, vid);
	if (res) {
		printk(KERN_ERR DRV_NAME
		       ": %s: Failed to remove vlan id %d\n",
		       bond_dev->name, vid);
	}
}

static void bond_add_vlans_on_slave(struct bonding *bond, struct net_device *slave_dev)
{
	struct vlan_entry *vlan;

	write_lock_bh(&bond->lock);

	if (list_empty(&bond->vlan_list)) {
		goto out;
	}

	if ((slave_dev->features & NETIF_F_HW_VLAN_RX) &&
	    slave_dev->vlan_rx_register) {
		slave_dev->vlan_rx_register(slave_dev, bond->vlgrp);
	}

	if (!(slave_dev->features & NETIF_F_HW_VLAN_FILTER) ||
	    !(slave_dev->vlan_rx_add_vid)) {
		goto out;
	}

	list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
		slave_dev->vlan_rx_add_vid(slave_dev, vlan->vlan_id);
	}

out:
	write_unlock_bh(&bond->lock);
}

static void bond_del_vlans_from_slave(struct bonding *bond, struct net_device *slave_dev)
{
	struct vlan_entry *vlan;
	struct net_device *vlan_dev;

	write_lock_bh(&bond->lock);

	if (list_empty(&bond->vlan_list)) {
		goto out;
	}

	if (!(slave_dev->features & NETIF_F_HW_VLAN_FILTER) ||
	    !(slave_dev->vlan_rx_kill_vid)) {
		goto unreg;
	}

	list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
		/* Save and then restore vlan_dev in the grp array,
		 * since the slave's driver might clear it.
		 */
		vlan_dev = bond->vlgrp->vlan_devices[vlan->vlan_id];
		slave_dev->vlan_rx_kill_vid(slave_dev, vlan->vlan_id);
		bond->vlgrp->vlan_devices[vlan->vlan_id] = vlan_dev;
	}

unreg:
	if ((slave_dev->features & NETIF_F_HW_VLAN_RX) &&
	    slave_dev->vlan_rx_register) {
		slave_dev->vlan_rx_register(slave_dev, NULL);
	}

out:
	write_unlock_bh(&bond->lock);
}

/*------------------------------- Link status -------------------------------*/

/*
 * Get link speed and duplex from the slave's base driver
 * using ethtool. If for some reason the call fails or the
 * values are invalid, fake speed and duplex to 100/Full
 * and return error.
 */
static int bond_update_speed_duplex(struct slave *slave)
{
	struct net_device *slave_dev = slave->dev;
	static int (* ioctl)(struct net_device *, struct ifreq *, int);
	struct ifreq ifr;
	struct ethtool_cmd etool;

	/* Fake speed and duplex */
	slave->speed = SPEED_100;
	slave->duplex = DUPLEX_FULL;

	if (slave_dev->ethtool_ops) {
		/* signed, so a negative error from get_settings() is caught */
		int res;

		if (!slave_dev->ethtool_ops->get_settings) {
			return -1;
		}

		res = slave_dev->ethtool_ops->get_settings(slave_dev, &etool);
		if (res < 0) {
			return -1;
		}

		goto verify;
	}

	ioctl = slave_dev->do_ioctl;
	strncpy(ifr.ifr_name, slave_dev->name, IFNAMSIZ);
	etool.cmd = ETHTOOL_GSET;
	ifr.ifr_data = (char*)&etool;
	if (!ioctl || (IOCTL(slave_dev, &ifr, SIOCETHTOOL) < 0)) {
		return -1;
	}

verify:
	switch (etool.speed) {
	case SPEED_10:
	case SPEED_100:
	case SPEED_1000:
		break;
	default:
		return -1;
	}

	switch (etool.duplex) {
	case DUPLEX_FULL:
	case DUPLEX_HALF:
		break;
	default:
		return -1;
	}

	slave->speed = etool.speed;
	slave->duplex = etool.duplex;

	return 0;
}

/*
 * if <dev> supports MII link status reporting, check its link status.
 *
 * We either do MII/ETHTOOL ioctls, or check netif_carrier_ok(),
 * depending upon the setting of the use_carrier parameter.
 *
 * Return either BMSR_LSTATUS, meaning that the link is up (or we
 * can't tell and just pretend it is), or 0, meaning that the link is
 * down.
 *
 * If reporting is non-zero, instead of faking link up, return -1 if
 * both ETHTOOL and MII ioctls fail (meaning the device does not
 * support them). If use_carrier is set, return whatever it says.
 * It'd be nice if there was a good way to tell if a driver supports
 * netif_carrier, but there really isn't.
 */
static int bond_check_dev_link(struct bonding *bond, struct net_device *slave_dev, int reporting)
{
	static int (* ioctl)(struct net_device *, struct ifreq *, int);
	struct ifreq ifr;
	struct mii_ioctl_data *mii;
	struct ethtool_value etool;

	if (bond->params.use_carrier) {
		return netif_carrier_ok(slave_dev) ? BMSR_LSTATUS : 0;
	}

	ioctl = slave_dev->do_ioctl;
	if (ioctl) {
		/* TODO: set pointer to correct ioctl on a per team member */
		/*       basis to make this more efficient. that is, once  */
		/*       we determine the correct ioctl, we will always    */
		/*       call it and not the others for that team          */
		/*       member.                                           */

		/*
		 * We cannot assume that SIOCGMIIPHY will also read a
		 * register; not all network drivers (e.g., e100)
		 * support that.
		 */

		/* Yes, the mii is overlaid on the ifreq.ifr_ifru */
		strncpy(ifr.ifr_name, slave_dev->name, IFNAMSIZ);
		mii = if_mii(&ifr);
		if (IOCTL(slave_dev, &ifr, SIOCGMIIPHY) == 0) {
			mii->reg_num = MII_BMSR;
			if (IOCTL(slave_dev, &ifr, SIOCGMIIREG) == 0) {
				return (mii->val_out & BMSR_LSTATUS);
			}
		}
	}

	/* try SIOCETHTOOL ioctl, some drivers cache ETHTOOL_GLINK */
	/* for a period of time so we attempt to get link status   */
	/* from it last if the above MII ioctls fail...            */
	if (slave_dev->ethtool_ops) {
		if (slave_dev->ethtool_ops->get_link) {
			u32 link;

			link = slave_dev->ethtool_ops->get_link(slave_dev);

			return link ? BMSR_LSTATUS : 0;
		}
	}

	if (ioctl) {
		strncpy(ifr.ifr_name, slave_dev->name, IFNAMSIZ);
		etool.cmd = ETHTOOL_GLINK;
		ifr.ifr_data = (char*)&etool;
		if (IOCTL(slave_dev, &ifr, SIOCETHTOOL) == 0) {
			if (etool.data == 1) {
				return BMSR_LSTATUS;
			} else {
				dprintk("SIOCETHTOOL shows link down\n");
				return 0;
			}
		}
	}

	/*
	 * If reporting, report that either there's no dev->do_ioctl,
	 * or both SIOCGMIIREG and SIOCETHTOOL failed (meaning that we
	 * cannot report link status). If not reporting, pretend
	 * we're ok.
	 */
	return (reporting ?
		-1 : BMSR_LSTATUS);
}

/*----------------------------- Multicast list ------------------------------*/

/*
 * Returns non-zero if dmi1 and dmi2 are the same, 0 otherwise
 */
static inline int bond_is_dmi_same(struct dev_mc_list *dmi1, struct dev_mc_list *dmi2)
{
	return memcmp(dmi1->dmi_addr, dmi2->dmi_addr, dmi1->dmi_addrlen) == 0 &&
			dmi1->dmi_addrlen == dmi2->dmi_addrlen;
}

/*
 * returns dmi entry if found, NULL otherwise
 */
static struct dev_mc_list *bond_mc_list_find_dmi(struct dev_mc_list *dmi, struct dev_mc_list *mc_list)
{
	struct dev_mc_list *idmi;

	for (idmi = mc_list; idmi; idmi = idmi->next) {
		if (bond_is_dmi_same(dmi, idmi)) {
			return idmi;
		}
	}

	return NULL;
}

/*
 * Push the promiscuity flag down to appropriate slaves
 */
static void bond_set_promiscuity(struct bonding *bond, int inc)
{
	if (USES_PRIMARY(bond->params.mode)) {
		/* write lock already acquired */
		if (bond->curr_active_slave) {
			dev_set_promiscuity(bond->curr_active_slave->dev, inc);
		}
	} else {
		struct slave *slave;
		int i;
		bond_for_each_slave(bond, slave, i) {
			dev_set_promiscuity(slave->dev, inc);
		}
	}
}

/*
 * Push the allmulti flag down to appropriate slaves
 */
static void bond_set_allmulti(struct bonding *bond, int inc)
{
	if (USES_PRIMARY(bond->params.mode)) {
		/* write lock already acquired */
		if (bond->curr_active_slave) {
			dev_set_allmulti(bond->curr_active_slave->dev, inc);
		}
	} else {
		struct slave *slave;
		int i;
		bond_for_each_slave(bond, slave, i) {
			dev_set_allmulti(slave->dev, inc);
		}
	}
}

/*
 * Add a Multicast address to slaves
 * according to mode
 */
static void bond_mc_add(struct bonding *bond, void *addr, int alen)
{
	if (USES_PRIMARY(bond->params.mode)) {
		/* write
		 * lock already acquired */
		if (bond->curr_active_slave) {
			dev_mc_add(bond->curr_active_slave->dev, addr, alen, 0);
		}
	} else {
		struct slave *slave;
		int i;
		bond_for_each_slave(bond, slave, i) {
			dev_mc_add(slave->dev, addr, alen, 0);
		}
	}
}

/*
 * Remove a multicast address from slave
 * according to mode
 */
static void bond_mc_delete(struct bonding *bond, void *addr, int alen)
{
	if (USES_PRIMARY(bond->params.mode)) {
		/* write lock already acquired */
		if (bond->curr_active_slave) {
			dev_mc_delete(bond->curr_active_slave->dev, addr, alen, 0);
		}
	} else {
		struct slave *slave;
		int i;
		bond_for_each_slave(bond, slave, i) {
			dev_mc_delete(slave->dev, addr, alen, 0);
		}
	}
}

/*
 * Totally destroys the mc_list in bond
 */
static void bond_mc_list_destroy(struct bonding *bond)
{
	struct dev_mc_list *dmi;

	dmi = bond->mc_list;
	while (dmi) {
		bond->mc_list = dmi->next;
		kfree(dmi);
		dmi = bond->mc_list;
	}
}

/*
 * Copy all the Multicast addresses from src to the bonding device dst
 */
static int bond_mc_list_copy(struct dev_mc_list *mc_list, struct bonding *bond, int gpf_flag)
{
	struct dev_mc_list *dmi, *new_dmi;

	for (dmi = mc_list; dmi; dmi = dmi->next) {
		new_dmi = kmalloc(sizeof(struct dev_mc_list), gpf_flag);

		if (!new_dmi) {
			/* FIXME: Potential memory leak !!!
			 */
			return -ENOMEM;
		}

		new_dmi->next = bond->mc_list;
		bond->mc_list = new_dmi;
		new_dmi->dmi_addrlen = dmi->dmi_addrlen;
		memcpy(new_dmi->dmi_addr, dmi->dmi_addr, dmi->dmi_addrlen);
		new_dmi->dmi_users = dmi->dmi_users;
		new_dmi->dmi_gusers = dmi->dmi_gusers;
	}

	return 0;
}

/*
 * flush all members of bond_dev->mc_list from slave_dev->mc_list
 */
static void bond_mc_list_flush(struct net_device *bond_dev, struct net_device *slave_dev)
{
	struct bonding *bond = bond_dev->priv;
	struct dev_mc_list *dmi;

	for (dmi = bond_dev->mc_list; dmi; dmi = dmi->next) {
		dev_mc_delete(slave_dev, dmi->dmi_addr, dmi->dmi_addrlen, 0);
	}

	if (bond->params.mode == BOND_MODE_8023AD) {
		/* del lacpdu mc addr from mc list */
		u8 lacpdu_multicast[ETH_ALEN] = MULTICAST_LACPDU_ADDR;

		dev_mc_delete(slave_dev, lacpdu_multicast, ETH_ALEN, 0);
	}
}

/*--------------------------- Active slave change ---------------------------*/

/*
 * Update the mc list and multicast-related flags for the new and
 * old active slaves (if any) according to the multicast mode, and
 * promiscuous flags unconditionally.
 */
static void bond_mc_swap(struct bonding *bond, struct slave *new_active, struct slave *old_active)
{
	struct dev_mc_list *dmi;

	if (!USES_PRIMARY(bond->params.mode)) {
		/* nothing to do - mc list is already up-to-date on
		 * all slaves
		 */
		return;
	}

	if (old_active) {
		if (bond->dev->flags & IFF_PROMISC) {
			dev_set_promiscuity(old_active->dev, -1);
		}

		if (bond->dev->flags & IFF_ALLMULTI) {
			dev_set_allmulti(old_active->dev, -1);
		}

		for (dmi = bond->dev->mc_list; dmi; dmi = dmi->next) {
			dev_mc_delete(old_active->dev, dmi->dmi_addr, dmi->dmi_addrlen, 0);
		}
	}

	if (new_active) {
		if (bond->dev->flags & IFF_PROMISC) {
			dev_set_promiscuity(new_active->dev, 1);
		}

		if (bond->dev->flags & IFF_ALLMULTI) {
			dev_set_allmulti(new_active->dev, 1);
		}

		for (dmi = bond->dev->mc_list; dmi; dmi = dmi->next) {
			dev_mc_add(new_active->dev, dmi->dmi_addr, dmi->dmi_addrlen, 0);
		}
	}
}

/**
 * bond_find_best_slave - select the best available slave to be the active one
 * @bond: our bonding struct
 *
 * Warning: Caller must hold curr_slave_lock for writing.
 */
static struct slave *bond_find_best_slave(struct bonding *bond)
{
	struct slave *new_active, *old_active;
	struct slave *bestslave = NULL;
	int mintime = bond->params.updelay;
	int i;

	new_active = old_active = bond->curr_active_slave;

	if (!new_active) { /* there were no active slaves left */
		if (bond->slave_cnt > 0) {  /* found one slave */
			new_active = bond->first_slave;
		} else {
			return NULL; /* still no slave, return NULL */
		}
	}

	/* first try the primary link; if arping, a link must tx/rx traffic
	 * before it can be considered the curr_active_slave - also, we would skip
	 * slaves between the curr_active_slave and primary_slave that may be up
	 * and able to arp
	 */
	if ((bond->primary_slave) &&
	    (!bond->params.arp_interval) &&
	    (IS_UP(bond->primary_slave->dev))) {
		new_active = bond->primary_slave;
	}

	/* remember where to stop iterating over the slaves */
	old_active = new_active;

	bond_for_each_slave_from(bond, new_active, i, old_active) {
		if (IS_UP(new_active->dev)) {
			if (new_active->link == BOND_LINK_UP) {
				return new_active;
			} else if (new_active->link == BOND_LINK_BACK) {
				/* link up, but waiting for stabilization */
				if (new_active->delay < mintime) {
					mintime = new_active->delay;
					bestslave = new_active;
				}
			}
		}
	}

	return bestslave;
}

/**
 * bond_change_active_slave - change the active slave into the specified one
 * @bond: our bonding struct
 * @new: the new slave to make the active one
 *
 * Set the new slave to the bond's settings and unset them on the old
 * curr_active_slave.
 * Settings include flags, mc-list, promiscuity, allmulti, etc.
 *
 * If @new's link state is %BOND_LINK_BACK we'll set it to %BOND_LINK_UP,
 * because it is apparently the best available slave we have, even though its
 * updelay hasn't timed out yet.
 *
 * Warning: Caller must hold curr_slave_lock for writing.
 */
static void bond_change_active_slave(struct bonding *bond, struct slave *new_active)
{
	struct slave *old_active = bond->curr_active_slave;

	if (old_active == new_active) {
		return;
	}

	if (new_active) {
		if (new_active->link == BOND_LINK_BACK) {
			if (USES_PRIMARY(bond->params.mode)) {
				printk(KERN_INFO DRV_NAME
				       ": %s: making interface %s the new "
				       "active one %d ms earlier.\n",
				       bond->dev->name, new_active->dev->name,
				       (bond->params.updelay - new_active->delay) * bond->params.miimon);
			}

			new_active->delay = 0;
			new_active->link = BOND_LINK_UP;
			new_active->jiffies = jiffies;

			if (bond->params.mode == BOND_MODE_8023AD) {
				bond_3ad_handle_link_change(new_active, BOND_LINK_UP);
			}

			if ((bond->params.mode == BOND_MODE_TLB) ||
			    (bond->params.mode == BOND_MODE_ALB)) {
				bond_alb_handle_link_change(bond, new_active, BOND_LINK_UP);
			}
		} else {
			if (USES_PRIMARY(bond->params.mode)) {
				printk(KERN_INFO DRV_NAME
				       ": %s: making interface %s the new "
				       "active one.\n",
				       bond->dev->name, new_active->dev->name);
			}
		}
	}

	if (USES_PRIMARY(bond->params.mode)) {
		bond_mc_swap(bond, new_active, old_active);
	}

	if ((bond->params.mode == BOND_MODE_TLB) ||
	    (bond->params.mode == BOND_MODE_ALB)) {
		bond_alb_handle_active_change(bond, new_active);
	} else {
		bond->curr_active_slave = new_active;
	}

	if (bond->params.mode == BOND_MODE_ACTIVEBACKUP) {
		if (old_active) {
			bond_set_slave_inactive_flags(old_active);
		}

		if (new_active) {
			bond_set_slave_active_flags(new_active);
		}
		bond_send_gratuitous_arp(bond);
	}
}

/**
 * bond_select_active_slave - select a new active slave, if needed
 * @bond: our bonding struct
 *
 * This function should be called when one of the following occurs:
 * - The old curr_active_slave has been released or lost its link.
 * - The primary_slave has got its link back.
 * - A slave has got its link back and there's no old curr_active_slave.
 *
 * Warning: Caller must hold curr_slave_lock for writing.
 */
static void bond_select_active_slave(struct bonding *bond)
{
	struct slave *best_slave;

	best_slave = bond_find_best_slave(bond);
	if (best_slave != bond->curr_active_slave) {
		bond_change_active_slave(bond, best_slave);
	}
}

/*--------------------------- slave list handling ---------------------------*/

/*
 * This function attaches the slave to the end of list.
 *
 * bond->lock held for writing by caller.
 */
static void bond_attach_slave(struct bonding *bond, struct slave *new_slave)
{
	if (bond->first_slave == NULL) { /* attaching the first slave */
		new_slave->next = new_slave;
		new_slave->prev = new_slave;
		bond->first_slave = new_slave;
	} else {
		new_slave->next = bond->first_slave;
		new_slave->prev = bond->first_slave->prev;
		new_slave->next->prev = new_slave;
		new_slave->prev->next = new_slave;
	}

	bond->slave_cnt++;
}

/*
 * This function detaches the slave from the list.
 * WARNING: no check is made to verify if the slave effectively
 * belongs to <bond>.
 * Nothing is freed on return, structures are just unchained.
 * If any slave pointer in bond was pointing to <slave>,
 * it should be changed by the calling function.
 *
 * bond->lock held for writing by caller.
 */
static void bond_detach_slave(struct bonding *bond, struct slave *slave)
{
	if (slave->next) {
		slave->next->prev = slave->prev;
	}

	if (slave->prev) {
		slave->prev->next = slave->next;
	}

	if (bond->first_slave == slave) { /* slave is the first slave */
		if (bond->slave_cnt > 1) { /* there are more slaves */
			bond->first_slave = slave->next;
		} else {
			bond->first_slave = NULL; /* slave was the last one */
		}
	}

	slave->next = NULL;
	slave->prev = NULL;
	bond->slave_cnt--;
}

/*---------------------------------- IOCTL ----------------------------------*/

static int bond_sethwaddr(struct net_device *bond_dev, struct net_device *slave_dev)
{
	dprintk("bond_dev=%p\n", bond_dev);
	dprintk("slave_dev=%p\n", slave_dev);
	dprintk("slave_dev->addr_len=%d\n", slave_dev->addr_len);
	memcpy(bond_dev->dev_addr, slave_dev->dev_addr, slave_dev->addr_len);
	return 0;
}

/* enslave device <slave> to bond device <master> */
static int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
{
	struct bonding *bond = bond_dev->priv;
	struct slave *new_slave = NULL;
	struct dev_mc_list *dmi;
	struct sockaddr addr;
	int link_reporting;
	int old_features = bond_dev->features;
	int res = 0;

	if (slave_dev->do_ioctl == NULL) {
		printk(KERN_WARNING DRV_NAME
		       ": Warning : no link monitoring support for %s\n",
		       slave_dev->name);
	}

	/* bond must be initialized by bond_open() before enslaving */
	if (!(bond_dev->flags & IFF_UP)) {
		dprintk("Error, master_dev is not up\n");
		return -EPERM;
	}

	/* already enslaved */
	if (slave_dev->flags & IFF_SLAVE) {
		dprintk("Error, Device was already enslaved\n");
		return -EBUSY;
	}

	/* vlan challenged mutual exclusion */
	/* no need to lock since we're protected by
	 * rtnl_lock */
	if (slave_dev->features & NETIF_F_VLAN_CHALLENGED) {
		dprintk("%s: NETIF_F_VLAN_CHALLENGED\n", slave_dev->name);
		if (!list_empty(&bond->vlan_list)) {
			printk(KERN_ERR DRV_NAME
			       ": Error: cannot enslave VLAN "
			       "challenged slave %s on VLAN enabled "
			       "bond %s\n", slave_dev->name,
			       bond_dev->name);
			return -EPERM;
		} else {
			printk(KERN_WARNING DRV_NAME
			       ": Warning: enslaved VLAN challenged "
			       "slave %s. Adding VLANs will be blocked as "
			       "long as %s is part of bond %s\n",
			       slave_dev->name, slave_dev->name,
			       bond_dev->name);
			bond_dev->features |= NETIF_F_VLAN_CHALLENGED;
		}
	} else {
		dprintk("%s: ! NETIF_F_VLAN_CHALLENGED\n", slave_dev->name);
		if (bond->slave_cnt == 0) {
			/* First slave, and it is not VLAN challenged,
			 * so remove the block of adding VLANs over the bond.
			 */
			bond_dev->features &= ~NETIF_F_VLAN_CHALLENGED;
		}
	}

	if (app_abi_ver >= 1) {
		/* The application is using an ABI, which requires the
		 * slave interface to be closed.
		 */
		if ((slave_dev->flags & IFF_UP)) {
			printk(KERN_ERR DRV_NAME
			       ": Error: %s is up\n",
			       slave_dev->name);
			res = -EPERM;
			goto err_undo_flags;
		}

		if (slave_dev->set_mac_address == NULL) {
			printk(KERN_ERR DRV_NAME
			       ": Error: The slave device you specified does "
			       "not support setting the MAC address.\n");
			printk(KERN_ERR
			       "Your kernel likely does not support slave "
			       "devices.\n");

			res = -EOPNOTSUPP;
			goto err_undo_flags;
		}
	} else {
		/* The application is not using an ABI, which requires the
		 * slave interface to be open.
		 */
		if (!(slave_dev->flags & IFF_UP)) {
			printk(KERN_ERR DRV_NAME
			       ": Error: %s is not running\n",
			       slave_dev->name);
			res = -EINVAL;
			goto err_undo_flags;
		}

		if ((bond->params.mode == BOND_MODE_8023AD) ||
		    (bond->params.mode == BOND_MODE_TLB) ||
		    (bond->params.mode == BOND_MODE_ALB)) {
			printk(KERN_ERR DRV_NAME
			       ": Error: to use %s mode, you must upgrade "
			       "ifenslave.\n",
			       bond_mode_name(bond->params.mode));
			res = -EOPNOTSUPP;
			goto err_undo_flags;
		}
	}

	new_slave = kmalloc(sizeof(struct slave), GFP_KERNEL);
	if (!new_slave) {
		res = -ENOMEM;
		goto err_undo_flags;
	}

	memset(new_slave, 0, sizeof(struct slave));

	/* save slave's original flags before calling
	 * netdev_set_master and dev_open
	 */
	new_slave->original_flags = slave_dev->flags;

	if (app_abi_ver >= 1) {
		/* save slave's original ("permanent") mac address for
		 * modes that need it, and for restoring it upon release,
		 * and then set it to the master's address
		 */
		memcpy(new_slave->perm_hwaddr, slave_dev->dev_addr, ETH_ALEN);

		/* set slave to master's mac address
		 * The application already set the master's
		 * mac address to that of the first slave
		 */
		memcpy(addr.sa_data, bond_dev->dev_addr, bond_dev->addr_len);
		addr.sa_family = slave_dev->type;
		res = dev_set_mac_address(slave_dev, &addr);
		if (res) {
			dprintk("Error %d calling set_mac_address\n", res);
			goto err_free;
		}

		/* open the slave since the application closed it */
		res = dev_open(slave_dev);
		if (res) {
			dprintk("Opening slave %s failed\n", slave_dev->name);
			goto err_restore_mac;
		}
	}

	res = netdev_set_master(slave_dev, bond_dev);
	if (res) {
		dprintk("Error %d calling netdev_set_master\n", res);
		if (app_abi_ver < 1) {
			goto err_free;
		} else {
			goto
				err_close;
		}
	}

	new_slave->dev = slave_dev;

	if ((bond->params.mode == BOND_MODE_TLB) ||
	    (bond->params.mode == BOND_MODE_ALB)) {
		/* bond_alb_init_slave() must be called before all other stages since
		 * it might fail and we do not want to have to undo everything
		 */
		res = bond_alb_init_slave(bond, new_slave);
		if (res) {
			goto err_unset_master;
		}
	}

	/* If the mode USES_PRIMARY, then the new slave gets the
	 * master's promisc (and mc) settings only if it becomes the
	 * curr_active_slave, and that is taken care of later when calling
	 * bond_change_active_slave()
	 */
	if (!USES_PRIMARY(bond->params.mode)) {
		/* set promiscuity level to new slave */
		if (bond_dev->flags & IFF_PROMISC) {
			dev_set_promiscuity(slave_dev, 1);
		}

		/* set allmulti level to new slave */
		if (bond_dev->flags & IFF_ALLMULTI) {
			dev_set_allmulti(slave_dev, 1);
		}

		/* upload master's mc_list to new slave */
		for (dmi = bond_dev->mc_list; dmi; dmi = dmi->next) {
			dev_mc_add(slave_dev, dmi->dmi_addr, dmi->dmi_addrlen, 0);
		}
	}

	if (bond->params.mode == BOND_MODE_8023AD) {
		/* add lacpdu mc addr to mc list */
		u8 lacpdu_multicast[ETH_ALEN] = MULTICAST_LACPDU_ADDR;

		dev_mc_add(slave_dev, lacpdu_multicast, ETH_ALEN, 0);
	}

	bond_add_vlans_on_slave(bond, slave_dev);

	write_lock_bh(&bond->lock);

	bond_attach_slave(bond, new_slave);

	new_slave->delay = 0;
	new_slave->link_failure_count = 0;

	if (bond->params.miimon && !bond->params.use_carrier) {
		link_reporting = bond_check_dev_link(bond, slave_dev, 1);

		if ((link_reporting == -1) && !bond->params.arp_interval) {
			/*
			 * miimon is set but a bonded network driver
			 * does not support ETHTOOL/MII and
			 * arp_interval is not set.
			 * Note: if
			 * use_carrier is enabled, we will never go
			 * here (because netif_carrier is always
			 * supported); thus, we don't need to change
			 * the messages for netif_carrier.
			 */
			printk(KERN_WARNING DRV_NAME
			       ": Warning: MII and ETHTOOL support not "
			       "available for interface %s, and "
			       "arp_interval/arp_ip_target module parameters "
			       "not specified, thus bonding will not detect "
			       "link failures! see bonding.txt for details.\n",
			       slave_dev->name);
		} else if (link_reporting == -1) {
			/* unable to get link status using mii/ethtool */
			printk(KERN_WARNING DRV_NAME
			       ": Warning: can't get link status from "
			       "interface %s; the network driver associated "
			       "with this interface does not support MII or "
			       "ETHTOOL link status reporting, thus miimon "
			       "has no effect on this interface.\n",
			       slave_dev->name);
		}
	}

	/* check for initial state */
	if (!bond->params.miimon ||
	    (bond_check_dev_link(bond, slave_dev, 0) == BMSR_LSTATUS)) {
		if (bond->params.updelay) {
			dprintk("Initial state of slave_dev is "
				"BOND_LINK_BACK\n");
			new_slave->link = BOND_LINK_BACK;
			new_slave->delay = bond->params.updelay;
		} else {
			dprintk("Initial state of slave_dev is "
				"BOND_LINK_UP\n");
			new_slave->link = BOND_LINK_UP;
		}
		new_slave->jiffies = jiffies;
	} else {
		dprintk("Initial state of slave_dev is "
			"BOND_LINK_DOWN\n");
		new_slave->link = BOND_LINK_DOWN;
	}

	if (bond_update_speed_duplex(new_slave) &&
	    (new_slave->link != BOND_LINK_DOWN)) {
		printk(KERN_WARNING DRV_NAME
		       ": Warning: failed to get speed and duplex from %s, "
		       "assumed to be 100Mb/sec and Full.\n",
		       new_slave->dev->name);

		if (bond->params.mode == BOND_MODE_8023AD) {
			printk(KERN_WARNING
			       "Operation of 802.3ad mode requires ETHTOOL "
			       "support in base driver for proper aggregator "
			       "selection.\n");
		}
	}

	if (USES_PRIMARY(bond->params.mode) && bond->params.primary[0]) {
		/* if there is a primary slave, remember it */
		if (strcmp(bond->params.primary, new_slave->dev->name) == 0) {
			bond->primary_slave = new_slave;
		}
	}

	switch (bond->params.mode) {
	case BOND_MODE_ACTIVEBACKUP:
		/* if we're in active-backup mode, we need one and only one active
		 * interface. The backup interfaces will have their NOARP flag set
		 * because we need them to be completely deaf and not to respond to
		 * any ARP request on the network to avoid fooling a switch. Thus,
		 * since we guarantee that curr_active_slave always points to the last
		 * usable interface, we just have to verify this interface's flag.
		 */
		if (((!bond->curr_active_slave) ||
		     (bond->curr_active_slave->dev->flags & IFF_NOARP)) &&
		    (new_slave->link != BOND_LINK_DOWN)) {
			dprintk("This is the first active slave\n");
			/* first slave or no active slave yet, and this link
			   is OK, so make this interface the active one */
			bond_change_active_slave(bond, new_slave);
		} else {
			dprintk("This is just a backup slave\n");
			bond_set_slave_inactive_flags(new_slave);
		}
		break;
	case BOND_MODE_8023AD:
		/* in 802.3ad mode, the internal mechanism
		 * will activate the slaves in the selected
		 * aggregator
		 */
		bond_set_slave_inactive_flags(new_slave);
		/* if this is the first slave */
		if (bond->slave_cnt == 1) {
			SLAVE_AD_INFO(new_slave).id = 1;
			/* Initialize AD with the number of times that the AD timer is called in 1 second
			 * can be called only after the mac address of the bond is set
			 */
			bond_3ad_initialize(bond, 1000/AD_TIMER_INTERVAL,
					    bond->params.lacp_fast);
		} else {
			SLAVE_AD_INFO(new_slave).id =
				SLAVE_AD_INFO(new_slave->prev).id + 1;
		}

		bond_3ad_bind_slave(new_slave);
		break;
	case BOND_MODE_TLB:
	case BOND_MODE_ALB:
		new_slave->state = BOND_STATE_ACTIVE;
		if ((!bond->curr_active_slave) &&
		    (new_slave->link != BOND_LINK_DOWN)) {
			/* first slave or no active slave yet, and this link
			 * is OK, so make this interface the active one
			 */
			bond_change_active_slave(bond, new_slave);
		}
		break;
	default:
		dprintk("This slave is always active in trunk mode\n");

		/* always active in trunk mode */
		new_slave->state = BOND_STATE_ACTIVE;

		/* In trunking mode there is little meaning to curr_active_slave
		 * anyway (it holds no special properties of the bond device),
		 * so we can change it without calling change_active_interface()
		 */
		if (!bond->curr_active_slave) {
			bond->curr_active_slave = new_slave;
		}
		break;
	} /* switch(bond_mode) */

	write_unlock_bh(&bond->lock);

	if (app_abi_ver < 1) {
		/*
		 * !!! This is to support old versions of ifenslave.
		 * We can remove this in 2.5 because our ifenslave takes
		 * care of this for us.
		 * We check to see if the master has a mac address yet.
		 * If not, we'll give it the mac address of our slave device.
		 */
		int ndx = 0;

		for (ndx = 0; ndx < bond_dev->addr_len; ndx++) {
			dprintk("Checking ndx=%d of bond_dev->dev_addr\n",
				ndx);
			if (bond_dev->dev_addr[ndx] != 0) {
				dprintk("Found non-zero byte at ndx=%d\n",
					ndx);
				break;
			}
		}

		if (ndx == bond_dev->addr_len) {
			/*
			 * We got all the way through the address and it was
			 * all 0's.
			 */
			dprintk("%s doesn't have a MAC address yet. \n",
				bond_dev->name);
			dprintk("Going to assign it from %s.\n",
				slave_dev->name);
			bond_sethwaddr(bond_dev, slave_dev);
		}
	}

	printk(KERN_INFO DRV_NAME
	       ": %s: enslaving %s as a%s interface with a%s link.\n",
	       bond_dev->name, slave_dev->name,
	       new_slave->state == BOND_STATE_ACTIVE ? "n active" : " backup",
	       new_slave->link != BOND_LINK_DOWN ? "n up" : " down");

	/* enslave is successful */
	return 0;

/* Undo stages on error */
err_unset_master:
	netdev_set_master(slave_dev, NULL);

err_close:
	dev_close(slave_dev);

err_restore_mac:
	memcpy(addr.sa_data, new_slave->perm_hwaddr, ETH_ALEN);
	addr.sa_family = slave_dev->type;
	dev_set_mac_address(slave_dev, &addr);

err_free:
	kfree(new_slave);

err_undo_flags:
	bond_dev->features = old_features;

	return res;
}

/*
 * Try to release the slave device <slave> from the bond device <master>
 * It is legal to access curr_active_slave without a lock because the whole
 * function is write-locked.
 *
 * The rules for slave state should be:
 *   for Active/Backup:
 *     Active stays on all backups go down
 *   for Bonded connections:
 *     The first up interface should be left on and all others downed.
 */
static int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
{
	struct bonding *bond = bond_dev->priv;
	struct slave *slave, *oldcurrent;
	struct sockaddr addr;
	int mac_addr_differ;

	/* slave is not a slave or master is not master of this slave */
	if (!(slave_dev->flags & IFF_SLAVE) ||
	    (slave_dev->master != bond_dev)) {
		printk(KERN_ERR DRV_NAME
		       ": Error: %s: cannot release %s.\n",
		       bond_dev->name, slave_dev->name);
		return -EINVAL;
	}

	write_lock_bh(&bond->lock);

	slave = bond_get_slave_by_dev(bond, slave_dev);
	if (!slave) {
		/* not a slave of this bond */
		printk(KERN_INFO DRV_NAME
		       ": %s: %s not enslaved\n",
		       bond_dev->name, slave_dev->name);
		/* drop the lock taken above before bailing out */
		write_unlock_bh(&bond->lock);
		return -EINVAL;
	}

	mac_addr_differ = memcmp(bond_dev->dev_addr,
				 slave->perm_hwaddr,
				 ETH_ALEN);
	if (!mac_addr_differ && (bond->slave_cnt > 1)) {
		printk(KERN_WARNING DRV_NAME
		       ": Warning: the permanent HWaddr of %s "
		       "- %02X:%02X:%02X:%02X:%02X:%02X - is "
		       "still in use by %s. Set the HWaddr of "
		       "%s to a different address to avoid "
		       "conflicts.\n",
		       slave_dev->name,
		       slave->perm_hwaddr[0],
		       slave->perm_hwaddr[1],
		       slave->perm_hwaddr[2],
		       slave->perm_hwaddr[3],
		       slave->perm_hwaddr[4],
		       slave->perm_hwaddr[5],
		       bond_dev->name,
		       slave_dev->name);
	}

	/* Inform AD package of unbinding of slave. */
	if (bond->params.mode == BOND_MODE_8023AD) {
		/* must be called before the slave is
		 * detached from the list
		 */
		bond_3ad_unbind_slave(slave);
	}

	printk(KERN_INFO DRV_NAME
	       ": %s: releasing %s interface %s\n",
	       bond_dev->name,
	       (slave->state == BOND_STATE_ACTIVE)
	       ? "active" : "backup",
	       slave_dev->name);

	oldcurrent = bond->curr_active_slave;

	bond->current_arp_slave = NULL;

	/* release the slave from its bond */
	bond_detach_slave(bond, slave);

	if (bond->primary_slave == slave) {
		bond->primary_slave = NULL;
	}

	if (oldcurrent == slave) {
		bond_change_active_slave(bond, NULL);
	}

	if ((bond->params.mode == BOND_MODE_TLB) ||
	    (bond->params.mode == BOND_MODE_ALB)) {
		/* Must be called only after the slave has been
		 * detached from the list and the curr_active_slave
		 * has been cleared (if slave == oldcurrent),
		 * but before a new active slave is selected.
		 */
		bond_alb_deinit_slave(bond, slave);
	}

	if (oldcurrent == slave) {
		bond_select_active_slave(bond);

		if (!bond->curr_active_slave) {
			printk(KERN_INFO DRV_NAME
			       ": %s: now running without any active "
			       "interface !\n",
			       bond_dev->name);
		}
	}

	if (bond->slave_cnt == 0) {
		/* if the last slave was removed, zero the mac address
		 * of the master so it will be set by the application
		 * to the mac address of the first slave
		 */
		memset(bond_dev->dev_addr, 0, bond_dev->addr_len);

		if (list_empty(&bond->vlan_list)) {
			bond_dev->features |= NETIF_F_VLAN_CHALLENGED;
		} else {
			printk(KERN_WARNING DRV_NAME
			       ": Warning: clearing HW address of %s while it "
			       "still has VLANs.\n",
			       bond_dev->name);
			printk(KERN_WARNING DRV_NAME
			       ": When re-adding slaves, make sure the bond's "
			       "HW address matches its VLANs'.\n");
		}
	} else if ((bond_dev->features & NETIF_F_VLAN_CHALLENGED) &&
		   !bond_has_challenged_slaves(bond)) {
		printk(KERN_INFO DRV_NAME
		       ": last VLAN challenged slave %s "
		       "left bond %s. VLAN blocking is removed\n",
		       slave_dev->name, bond_dev->name);
		bond_dev->features &= ~NETIF_F_VLAN_CHALLENGED;
	}

	write_unlock_bh(&bond->lock);

	bond_del_vlans_from_slave(bond, slave_dev);

	/* If the mode USES_PRIMARY, then we should only remove its
	 * promisc and mc settings if it was the curr_active_slave, but that was
	 * already taken care of above when we detached the slave
	 */
	if (!USES_PRIMARY(bond->params.mode)) {
		/* unset promiscuity level from slave */
		if (bond_dev->flags & IFF_PROMISC) {
			dev_set_promiscuity(slave_dev, -1);
		}

		/* unset allmulti level from slave */
		if (bond_dev->flags & IFF_ALLMULTI) {
			dev_set_allmulti(slave_dev, -1);
		}

		/* flush master's mc_list from slave */
		bond_mc_list_flush(bond_dev, slave_dev);
	}

	netdev_set_master(slave_dev, NULL);

	/* close slave before restoring its mac address */
	dev_close(slave_dev);

	if (app_abi_ver >= 1) {
		/* restore original ("permanent") mac address */
		memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
		addr.sa_family = slave_dev->type;
		dev_set_mac_address(slave_dev, &addr);
	}

	/* restore the original state of the
	 * IFF_NOARP flag that might have been
	 * set by bond_set_slave_inactive_flags()
	 */
	if ((slave->original_flags & IFF_NOARP) == 0) {
		slave_dev->flags &= ~IFF_NOARP;
	}

	kfree(slave);

	return 0;  /* deletion OK */
}

/*
 * This function releases all slaves.
 */
static int bond_release_all(struct net_device *bond_dev)
{
	struct bonding *bond = bond_dev->priv;
	struct slave *slave;
	struct net_device *slave_dev;
	struct sockaddr addr;

	write_lock_bh(&bond->lock);

	if (bond->slave_cnt == 0) {
		goto out;
	}

	bond->current_arp_slave = NULL;
	bond->primary_slave = NULL;
	bond_change_active_slave(bond, NULL);

	while ((slave = bond->first_slave) != NULL) {
		/* Inform AD package of unbinding of slave
		 * before slave is detached from the list.
		 */
		if (bond->params.mode == BOND_MODE_8023AD) {
			bond_3ad_unbind_slave(slave);
		}

		slave_dev = slave->dev;
		bond_detach_slave(bond, slave);

		if ((bond->params.mode == BOND_MODE_TLB) ||
		    (bond->params.mode == BOND_MODE_ALB)) {
			/* must be called only after the slave
			 * has been detached from the list
			 */
			bond_alb_deinit_slave(bond, slave);
		}

		/* now that the slave is detached, unlock and perform
		 * all the undo steps that should not be called from
		 * within a lock.
		 */
		write_unlock_bh(&bond->lock);

		bond_del_vlans_from_slave(bond, slave_dev);

		/* If the mode USES_PRIMARY, then we should only remove its
		 * promisc and mc settings if it was the curr_active_slave, but that was
		 * already taken care of above when we detached the slave
		 */
		if (!USES_PRIMARY(bond->params.mode)) {
			/* unset promiscuity level from slave */
			if (bond_dev->flags & IFF_PROMISC) {
				dev_set_promiscuity(slave_dev, -1);
			}

			/* unset allmulti level from slave */
			if (bond_dev->flags & IFF_ALLMULTI) {
				dev_set_allmulti(slave_dev, -1);
			}

			/* flush master's mc_list from slave */
			bond_mc_list_flush(bond_dev, slave_dev);
		}

		netdev_set_master(slave_dev, NULL);

		/* close slave before restoring its mac address */
		dev_close(slave_dev);

		if (app_abi_ver >= 1) {
			/* restore original ("permanent") mac address */
			memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
			addr.sa_family = slave_dev->type;
			dev_set_mac_address(slave_dev, &addr);
		}

		/* restore the original state of the IFF_NOARP flag that might have
		 * been set by bond_set_slave_inactive_flags()
		 */
		if ((slave->original_flags & IFF_NOARP) == 0) {
			slave_dev->flags &= ~IFF_NOARP;
		}

		kfree(slave);

		/* re-acquire the lock before getting the next slave */
		write_lock_bh(&bond->lock);
	}

	/* zero the mac address of the master so it will be
	 * set by the application to the mac address of the
	 * first slave
	 */
	memset(bond_dev->dev_addr, 0, bond_dev->addr_len);

	if (list_empty(&bond->vlan_list)) {
		bond_dev->features |= NETIF_F_VLAN_CHALLENGED;
	} else {
		printk(KERN_WARNING DRV_NAME
		       ": Warning: clearing HW address of %s while it "
		       "still has VLANs.\n",
		       bond_dev->name);
		printk(KERN_WARNING DRV_NAME
		       ": When re-adding slaves, make sure the bond's "
		       "HW address matches its VLANs'.\n");
	}

	printk(KERN_INFO DRV_NAME
	       ": %s: released all slaves\n",
	       bond_dev->name);

out:
	write_unlock_bh(&bond->lock);

	return 0;
}

/*
 * This function changes the active slave to slave <slave_dev>.
 * It returns -EINVAL in the following cases.
 *  - <slave_dev> is not found in the list.
 *  - There is no active slave now.
 *  - The link state of <slave_dev> is not BOND_LINK_UP.
 *  - <slave_dev> is not running.
 * In these cases, this function does nothing.
 * If <slave_dev> is already the active slave, nothing is changed and 0 is
 * returned.
 * In the other cases, the curr_active_slave pointer is changed and 0 is
 * returned.
 */
static int bond_ioctl_change_active(struct net_device *bond_dev, struct net_device *slave_dev)
{
	struct bonding *bond = bond_dev->priv;
	struct slave *old_active = NULL;
	struct slave *new_active = NULL;
	int res = 0;

	if (!USES_PRIMARY(bond->params.mode)) {
		return -EINVAL;
	}

	/* Verify that bond_dev is indeed the master of slave_dev */
	if (!(slave_dev->flags & IFF_SLAVE) ||
	    (slave_dev->master != bond_dev)) {
		return -EINVAL;
	}

	write_lock_bh(&bond->lock);

	old_active = bond->curr_active_slave;
	new_active = bond_get_slave_by_dev(bond, slave_dev);

	/*
	 * Changing to the current active: do nothing; return success.
	 */
	if (new_active && (new_active == old_active)) {
		write_unlock_bh(&bond->lock);
		return 0;
	}

	if ((new_active) &&
	    (old_active) &&
	    (new_active->link == BOND_LINK_UP) &&
	    IS_UP(new_active->dev)) {
		bond_change_active_slave(bond, new_active);
	} else {
		res = -EINVAL;
	}

	write_unlock_bh(&bond->lock);

	return res;
}

static int bond_ethtool_ioctl(struct net_device *bond_dev, struct ifreq *ifr)
{
	struct ethtool_drvinfo info;
	void __user *addr = ifr->ifr_data;
	uint32_t cmd;

	if (get_user(cmd, (uint32_t __user *)addr)) {
		return -EFAULT;
	}

	switch (cmd) {
	case ETHTOOL_GDRVINFO:
		if (copy_from_user(&info, addr, sizeof(info))) {
			return -EFAULT;
		}

		if (strcmp(info.driver, "ifenslave") == 0) {
			int new_abi_ver;
			char *endptr;

			new_abi_ver = simple_strtoul(info.fw_version,
						     &endptr, 0);
			if (*endptr) {
				printk(KERN_ERR DRV_NAME
				       ": Error: got invalid ABI "
				       "version from application\n");

				return -EINVAL;
			}

			if (orig_app_abi_ver == -1) {
				orig_app_abi_ver = new_abi_ver;
			}

			app_abi_ver = new_abi_ver;
		}

		strncpy(info.driver, DRV_NAME, 32);
		strncpy(info.version, DRV_VERSION, 32);
		snprintf(info.fw_version, 32, "%d", BOND_ABI_VERSION);

		if (copy_to_user(addr, &info, sizeof(info))) {
			return -EFAULT;
		}

		return 0;
	default:
		return -EOPNOTSUPP;
	}
}

static int bond_info_query(struct net_device *bond_dev, struct ifbond *info)
{
	struct bonding *bond = bond_dev->priv;

	info->bond_mode = bond->params.mode;
	info->miimon = bond->params.miimon;

	read_lock_bh(&bond->lock);
	info->num_slaves = bond->slave_cnt;
	read_unlock_bh(&bond->lock);

	return 0;
}

static int bond_slave_info_query(struct net_device *bond_dev,
				 struct ifslave *info)
{
	struct bonding *bond = bond_dev->priv;
	struct slave *slave;
	int i, found = 0;

	if (info->slave_id < 0) {
		return -ENODEV;
	}

	read_lock_bh(&bond->lock);

	bond_for_each_slave(bond, slave, i) {
		if (i == (int)info->slave_id) {
			found = 1;
			break;
		}
	}

	read_unlock_bh(&bond->lock);

	if (found) {
		strcpy(info->slave_name, slave->dev->name);
		info->link = slave->link;
		info->state = slave->state;
		info->link_failure_count = slave->link_failure_count;
	} else {
		return -ENODEV;
	}

	return 0;
}

/*-------------------------------- Monitoring -------------------------------*/

/* this function is called regularly to monitor each slave's link. */
static void bond_mii_monitor(struct net_device *bond_dev)
{
	struct bonding *bond = bond_dev->priv;
	struct slave *slave, *oldcurrent;
	int do_failover = 0;
	int delta_in_ticks;
	int i;

	read_lock(&bond->lock);

	delta_in_ticks = (bond->params.miimon * HZ) / 1000;

	if (bond->kill_timers) {
		goto out;
	}

	if (bond->slave_cnt == 0) {
		goto re_arm;
	}

	/* we will try to read the link status of each of our slaves, and
	 * set their IFF_RUNNING flag appropriately. For each slave not
	 * supporting MII status, we won't do anything so that a user-space
	 * program could monitor the link itself if needed.
	 */

	read_lock(&bond->curr_slave_lock);
	oldcurrent = bond->curr_active_slave;
	read_unlock(&bond->curr_slave_lock);

	bond_for_each_slave(bond, slave, i) {
		struct net_device *slave_dev = slave->dev;
		int link_state;
		u16 old_speed = slave->speed;
		u8 old_duplex = slave->duplex;

		link_state = bond_check_dev_link(bond, slave_dev, 0);

		switch (slave->link) {
		case BOND_LINK_UP:	/* the link was up */
			if (link_state == BMSR_LSTATUS) {
				/* link stays up, nothing more to do */
				break;
			} else { /* link going down */
				slave->link = BOND_LINK_FAIL;
				slave->delay = bond->params.downdelay;

				if (slave->link_failure_count < UINT_MAX) {
					slave->link_failure_count++;
				}

				if (bond->params.downdelay) {
					printk(KERN_INFO DRV_NAME
					       ": %s: link status down for %s "
					       "interface %s, disabling it in "
					       "%d ms.\n",
					       bond_dev->name,
					       IS_UP(slave_dev)
					       ? ((bond->params.mode == BOND_MODE_ACTIVEBACKUP)
						  ? ((slave == oldcurrent)
						     ? "active " : "backup ")
						  : "")
					       : "idle ",
					       slave_dev->name,
					       bond->params.downdelay * bond->params.miimon);
				}
			}
			/* no break !
			   fall through the BOND_LINK_FAIL test to
			   ensure proper action to be taken
			*/
		case BOND_LINK_FAIL:	/* the link has just gone down */
			if (link_state != BMSR_LSTATUS) {
				/* link stays down */
				if (slave->delay <= 0) {
					/* link has been down for too long */
					slave->link = BOND_LINK_DOWN;

					/* in active-backup and 802.3ad modes,
					 * we must completely disable this
					 * interface
					 */
					if ((bond->params.mode == BOND_MODE_ACTIVEBACKUP) ||
					    (bond->params.mode == BOND_MODE_8023AD)) {
						bond_set_slave_inactive_flags(slave);
					}

					printk(KERN_INFO DRV_NAME
					       ": %s: link status definitely "
					       "down for interface %s, "
					       "disabling it\n",
					       bond_dev->name,
					       slave_dev->name);

					/* notify ad that the link status has changed */
					if (bond->params.mode == BOND_MODE_8023AD) {
						bond_3ad_handle_link_change(slave, BOND_LINK_DOWN);
					}

					if ((bond->params.mode == BOND_MODE_TLB) ||
					    (bond->params.mode == BOND_MODE_ALB)) {
						bond_alb_handle_link_change(bond, slave, BOND_LINK_DOWN);
					}

					if (slave == oldcurrent) {
						do_failover = 1;
					}
				} else {
					slave->delay--;
				}
			} else {
				/* link up again */
				slave->link = BOND_LINK_UP;
				slave->jiffies = jiffies;
				printk(KERN_INFO DRV_NAME
				       ": %s: link status up again after %d "
				       "ms for interface %s.\n",
				       bond_dev->name,
				       (bond->params.downdelay - slave->delay) * bond->params.miimon,
				       slave_dev->name);
			}
			break;
		case BOND_LINK_DOWN:	/* the link was down */
			if (link_state != BMSR_LSTATUS) {
				/* the link stays down, nothing more to do */
				break;
			} else {	/* link going up */
				slave->link = BOND_LINK_BACK;
				slave->delay = bond->params.updelay;

				if (bond->params.updelay) {
					/* if updelay == 0, no need to
					   advertise about a 0 ms delay */
					printk(KERN_INFO DRV_NAME
					       ": %s: link status up for "
					       "interface %s, enabling it "
					       "in %d ms.\n",
					       bond_dev->name,
					       slave_dev->name,
					       bond->params.updelay * bond->params.miimon);
				}
			}
			/* no break ! fall through the BOND_LINK_BACK state in
			   case there's something to do.
			*/
		case BOND_LINK_BACK:	/* the link has just come back */
			if (link_state != BMSR_LSTATUS) {
				/* link down again */
				slave->link = BOND_LINK_DOWN;

				printk(KERN_INFO DRV_NAME
				       ": %s: link status down again after %d "
				       "ms for interface %s.\n",
				       bond_dev->name,
				       (bond->params.updelay - slave->delay) * bond->params.miimon,
				       slave_dev->name);
			} else {
				/* link stays up */
				if (slave->delay == 0) {
					/* now the link has been up for long enough */
					slave->link = BOND_LINK_UP;
					slave->jiffies = jiffies;

					if (bond->params.mode == BOND_MODE_8023AD) {
						/* prevent it from being the active one */
						slave->state = BOND_STATE_BACKUP;
					} else if (bond->params.mode != BOND_MODE_ACTIVEBACKUP) {
						/* make it immediately active */
						slave->state = BOND_STATE_ACTIVE;
					} else if (slave != bond->primary_slave) {
						/* prevent it from being the active one */
						slave->state = BOND_STATE_BACKUP;
					}

					printk(KERN_INFO DRV_NAME
					       ": %s: link status definitely "
					       "up for interface %s.\n",
					       bond_dev->name,
					       slave_dev->name);

					/* notify ad that the link status has changed */
					if (bond->params.mode == BOND_MODE_8023AD) {
						bond_3ad_handle_link_change(slave, BOND_LINK_UP);
					}

					if ((bond->params.mode == BOND_MODE_TLB) ||
					    (bond->params.mode == BOND_MODE_ALB)) {
						bond_alb_handle_link_change(bond, slave, BOND_LINK_UP);
					}

					if ((!oldcurrent) ||
					    (slave == bond->primary_slave)) {
						do_failover = 1;
					}
				} else {
					slave->delay--;
				}
			}
			break;
		default:
			/* Should not happen */
			printk(KERN_ERR "bonding: Error: %s Illegal value (link=%d)\n",
			       slave->dev->name, slave->link);
			goto out;
		} /* end of switch (slave->link) */

		bond_update_speed_duplex(slave);

		if (bond->params.mode == BOND_MODE_8023AD) {
			if (old_speed != slave->speed) {
				bond_3ad_adapter_speed_changed(slave);
			}

			if (old_duplex != slave->duplex) {
				bond_3ad_adapter_duplex_changed(slave);
			}
		}

	} /* end of for */

	if (do_failover) {
		write_lock(&bond->curr_slave_lock);

		bond_select_active_slave(bond);

		if (oldcurrent && !bond->curr_active_slave) {
			printk(KERN_INFO DRV_NAME
			       ": %s: now running without any active "
			       "interface !\n",
			       bond_dev->name);
		}

		write_unlock(&bond->curr_slave_lock);
	}

re_arm:
	if (bond->params.miimon) {
		mod_timer(&bond->mii_timer, jiffies + delta_in_ticks);
	}
out:
	read_unlock(&bond->lock);
}


static u32 bond_glean_dev_ip(struct net_device *dev)
{
	struct in_device *idev;
	struct in_ifaddr *ifa;
	u32 addr = 0;

	if (!dev)
		return 0;

	rcu_read_lock();
	idev = __in_dev_get(dev);
	if (!idev)
		goto out;

	ifa = idev->ifa_list;
	if (!ifa)
		goto out;

	addr = ifa->ifa_local;
out:
	rcu_read_unlock();
	return addr;
}

static int bond_has_ip(struct bonding *bond)
{
	struct vlan_entry *vlan, *vlan_next;

	if (bond->master_ip)
		return 1;

	if (list_empty(&bond->vlan_list))
		return 0;

	list_for_each_entry_safe(vlan, vlan_next, &bond->vlan_list,
				 vlan_list) {
		if (vlan->vlan_ip)
			return 1;
	}

	return 0;
}

/*
 * We go to the (large) trouble of VLAN tagging ARP frames because
 * switches in VLAN mode (especially if ports are configured as
 * "native" to a VLAN) might not pass non-tagged frames.
 */
static void bond_arp_send(struct net_device *slave_dev, int arp_op, u32 dest_ip, u32 src_ip, unsigned short vlan_id)
{
	struct sk_buff *skb;

	dprintk("arp %d on slave %s: dst %x src %x vid %d\n", arp_op,
		slave_dev->name, dest_ip, src_ip, vlan_id);

	skb = arp_create(arp_op, ETH_P_ARP, dest_ip, slave_dev, src_ip,
			 NULL, slave_dev->dev_addr, NULL);

	if (!skb) {
		printk(KERN_ERR DRV_NAME ": ARP packet allocation failed\n");
		return;
	}
	if (vlan_id) {
		skb = vlan_put_tag(skb, vlan_id);
		if (!skb) {
			printk(KERN_ERR DRV_NAME ": failed to insert VLAN tag\n");
			return;
		}
	}
	arp_xmit(skb);
}


static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
{
	int i, vlan_id, rv;
	u32 *targets = bond->params.arp_targets;
	struct vlan_entry *vlan, *vlan_next;
	struct net_device *vlan_dev;
	struct flowi fl;
	struct rtable *rt;

	for (i = 0; (i < BOND_MAX_ARP_TARGETS) && targets[i]; i++) {
		dprintk("basa: target %x\n", targets[i]);
		if (list_empty(&bond->vlan_list)) {
			dprintk("basa: empty vlan: arp_send\n");
			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
				      bond->master_ip, 0);
			continue;
		}

		/*
		 * If VLANs are configured, we do a route lookup to
		 * determine which VLAN interface would be used, so we
		 * can tag the ARP with the proper VLAN tag.
		 */
		memset(&fl, 0, sizeof(fl));
		fl.fl4_dst = targets[i];
		fl.fl4_tos = RTO_ONLINK;

		rv = ip_route_output_key(&rt, &fl);
		if (rv) {
			if (net_ratelimit()) {
				printk(KERN_WARNING DRV_NAME
				       ": %s: no route to arp_ip_target %u.%u.%u.%u\n",
				       bond->dev->name, NIPQUAD(fl.fl4_dst));
			}
			continue;
		}

		/*
		 * This target is not on a VLAN
		 */
		if (rt->u.dst.dev == bond->dev) {
			dprintk("basa: rtdev == bond->dev: arp_send\n");
			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
				      bond->master_ip, 0);
			continue;
		}

		vlan_id = 0;
		list_for_each_entry_safe(vlan, vlan_next, &bond->vlan_list,
					 vlan_list) {
			vlan_dev = bond->vlgrp->vlan_devices[vlan->vlan_id];
			if (vlan_dev == rt->u.dst.dev) {
				vlan_id = vlan->vlan_id;
				dprintk("basa: vlan match on %s %d\n",
					vlan_dev->name, vlan_id);
				break;
			}
		}

		if (vlan_id) {
			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
				      vlan->vlan_ip, vlan_id);
			continue;
		}

		if (net_ratelimit()) {
			printk(KERN_WARNING DRV_NAME
			       ": %s: no path to arp_ip_target %u.%u.%u.%u via rt.dev %s\n",
			       bond->dev->name, NIPQUAD(fl.fl4_dst),
			       rt->u.dst.dev ? rt->u.dst.dev->name : "NULL");
		}
	}
}

/*
 * Kick out a gratuitous ARP for an IP on the bonding master plus one
 * for each VLAN above us.
 */
static void bond_send_gratuitous_arp(struct bonding *bond)
{
	struct slave *slave = bond->curr_active_slave;
	struct vlan_entry *vlan;
	struct net_device *vlan_dev;

	dprintk("bond_send_grat_arp: bond %s slave %s\n", bond->dev->name,
		slave ? slave->dev->name : "NULL");
	if (!slave)
		return;

	if (bond->master_ip) {
		bond_arp_send(slave->dev, ARPOP_REPLY, bond->master_ip,
			      bond->master_ip, 0);
	}

	list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
		vlan_dev = bond->vlgrp->vlan_devices[vlan->vlan_id];
		if (vlan->vlan_ip) {
			bond_arp_send(slave->dev, ARPOP_REPLY, vlan->vlan_ip,
				      vlan->vlan_ip, vlan->vlan_id);
		}
	}
}

/*
 * this function is called regularly to monitor each slave's link
 * ensuring that traffic is being sent and received when arp monitoring
 * is used in load-balancing mode. if the adapter has been dormant, then an
 * arp is transmitted to generate traffic. see bond_activebackup_arp_mon for
 * arp monitoring in active backup mode.
 */
static void bond_loadbalance_arp_mon(struct net_device *bond_dev)
{
	struct bonding *bond = bond_dev->priv;
	struct slave *slave, *oldcurrent;
	int do_failover = 0;
	int delta_in_ticks;
	int i;

	read_lock(&bond->lock);

	delta_in_ticks = (bond->params.arp_interval * HZ) / 1000;

	if (bond->kill_timers) {
		goto out;
	}

	if (bond->slave_cnt == 0) {
		goto re_arm;
	}

	read_lock(&bond->curr_slave_lock);
	oldcurrent = bond->curr_active_slave;
	read_unlock(&bond->curr_slave_lock);

	/* see if any of the previous devices are up now (i.e. they have
	 * xmt and rcv traffic). the curr_active_slave does not come into
	 * the picture unless it is null. also, slave->jiffies is not needed
	 * here because we send an arp on each slave and give a slave as
	 * long as it needs to get the tx/rx within the delta.
	 * TODO: what about up/down delay in arp mode?
it wasn't here before 2937 * so it can wait 2938 */ 2939 bond_for_each_slave(bond, slave, i) { 2940 if (slave->link != BOND_LINK_UP) { 2941 if (((jiffies - slave->dev->trans_start) <= delta_in_ticks) && 2942 ((jiffies - slave->dev->last_rx) <= delta_in_ticks)) { 2943 2944 slave->link = BOND_LINK_UP; 2945 slave->state = BOND_STATE_ACTIVE; 2946 2947 /* primary_slave has no meaning in round-robin 2948 * mode. the window of a slave being up and 2949 * curr_active_slave being null after enslaving 2950 * is closed. 2951 */ 2952 if (!oldcurrent) { 2953 printk(KERN_INFO DRV_NAME 2954 ": %s: link status definitely " 2955 "up for interface %s, ", 2956 bond_dev->name, 2957 slave->dev->name); 2958 do_failover = 1; 2959 } else { 2960 printk(KERN_INFO DRV_NAME 2961 ": %s: interface %s is now up\n", 2962 bond_dev->name, 2963 slave->dev->name); 2964 } 2965 } 2966 } else { 2967 /* slave->link == BOND_LINK_UP */ 2968 2969 /* not all switches will respond to an arp request 2970 * when the source ip is 0, so don't take the link down 2971 * if we don't know our ip yet 2972 */ 2973 if (((jiffies - slave->dev->trans_start) >= (2*delta_in_ticks)) || 2974 (((jiffies - slave->dev->last_rx) >= (2*delta_in_ticks)) && 2975 bond_has_ip(bond))) { 2976 2977 slave->link = BOND_LINK_DOWN; 2978 slave->state = BOND_STATE_BACKUP; 2979 2980 if (slave->link_failure_count < UINT_MAX) { 2981 slave->link_failure_count++; 2982 } 2983 2984 printk(KERN_INFO DRV_NAME 2985 ": %s: interface %s is now down.\n", 2986 bond_dev->name, 2987 slave->dev->name); 2988 2989 if (slave == oldcurrent) { 2990 do_failover = 1; 2991 } 2992 } 2993 } 2994 2995 /* note: if switch is in round-robin mode, all links 2996 * must tx arp to ensure all links rx an arp - otherwise 2997 * links may oscillate or not come up at all; if switch is 2998 * in something like xor mode, there is nothing we can 2999 * do - all replies will be rx'ed on same link causing slaves 3000 * to be unstable during low/no traffic periods 3001 */ 3002 if 
(IS_UP(slave->dev)) {
			bond_arp_send_all(bond, slave);
		}
	}

	if (do_failover) {
		write_lock(&bond->curr_slave_lock);

		bond_select_active_slave(bond);

		if (oldcurrent && !bond->curr_active_slave) {
			printk(KERN_INFO DRV_NAME
			       ": %s: now running without any active "
			       "interface !\n",
			       bond_dev->name);
		}

		write_unlock(&bond->curr_slave_lock);
	}

re_arm:
	if (bond->params.arp_interval) {
		mod_timer(&bond->arp_timer, jiffies + delta_in_ticks);
	}
out:
	read_unlock(&bond->lock);
}

/*
 * When using arp monitoring in active-backup mode, this function is
 * called to determine if any backup slaves have gone down or a new
 * current slave needs to be found.
 * The backup slaves never generate traffic; they are considered up by merely
 * receiving traffic. If the current slave goes down, each backup slave will
 * be given the opportunity to tx/rx an arp before being taken down - this
 * prevents all slaves from being taken down due to the current slave not
 * sending any traffic for the backups to receive. The arps are not strictly
 * necessary; any tx and rx traffic will keep the current slave up. While any
 * rx traffic will keep the backup slaves up, the current slave is responsible
 * for generating traffic to keep them up regardless of any other traffic they
 * may have received.
3043 * see loadbalance_arp_monitor for arp monitoring in load balancing mode 3044 */ 3045static void bond_activebackup_arp_mon(struct net_device *bond_dev) 3046{ 3047 struct bonding *bond = bond_dev->priv; 3048 struct slave *slave; 3049 int delta_in_ticks; 3050 int i; 3051 3052 read_lock(&bond->lock); 3053 3054 delta_in_ticks = (bond->params.arp_interval * HZ) / 1000; 3055 3056 if (bond->kill_timers) { 3057 goto out; 3058 } 3059 3060 if (bond->slave_cnt == 0) { 3061 goto re_arm; 3062 } 3063 3064 /* determine if any slave has come up or any backup slave has 3065 * gone down 3066 * TODO: what about up/down delay in arp mode? it wasn't here before 3067 * so it can wait 3068 */ 3069 bond_for_each_slave(bond, slave, i) { 3070 if (slave->link != BOND_LINK_UP) { 3071 if ((jiffies - slave->dev->last_rx) <= delta_in_ticks) { 3072 3073 slave->link = BOND_LINK_UP; 3074 3075 write_lock(&bond->curr_slave_lock); 3076 3077 if ((!bond->curr_active_slave) && 3078 ((jiffies - slave->dev->trans_start) <= delta_in_ticks)) { 3079 bond_change_active_slave(bond, slave); 3080 bond->current_arp_slave = NULL; 3081 } else if (bond->curr_active_slave != slave) { 3082 /* this slave has just come up but we 3083 * already have a current slave; this 3084 * can also happen if bond_enslave adds 3085 * a new slave that is up while we are 3086 * searching for a new slave 3087 */ 3088 bond_set_slave_inactive_flags(slave); 3089 bond->current_arp_slave = NULL; 3090 } 3091 3092 if (slave == bond->curr_active_slave) { 3093 printk(KERN_INFO DRV_NAME 3094 ": %s: %s is up and now the " 3095 "active interface\n", 3096 bond_dev->name, 3097 slave->dev->name); 3098 } else { 3099 printk(KERN_INFO DRV_NAME 3100 ": %s: backup interface %s is " 3101 "now up\n", 3102 bond_dev->name, 3103 slave->dev->name); 3104 } 3105 3106 write_unlock(&bond->curr_slave_lock); 3107 } 3108 } else { 3109 read_lock(&bond->curr_slave_lock); 3110 3111 if ((slave != bond->curr_active_slave) && 3112 (!bond->current_arp_slave) && 3113 
(((jiffies - slave->dev->last_rx) >= 3*delta_in_ticks) && 3114 bond_has_ip(bond))) { 3115 /* a backup slave has gone down; three times 3116 * the delta allows the current slave to be 3117 * taken out before the backup slave. 3118 * note: a non-null current_arp_slave indicates 3119 * the curr_active_slave went down and we are 3120 * searching for a new one; under this 3121 * condition we only take the curr_active_slave 3122 * down - this gives each slave a chance to 3123 * tx/rx traffic before being taken out 3124 */ 3125 3126 read_unlock(&bond->curr_slave_lock); 3127 3128 slave->link = BOND_LINK_DOWN; 3129 3130 if (slave->link_failure_count < UINT_MAX) { 3131 slave->link_failure_count++; 3132 } 3133 3134 bond_set_slave_inactive_flags(slave); 3135 3136 printk(KERN_INFO DRV_NAME 3137 ": %s: backup interface %s is now down\n", 3138 bond_dev->name, 3139 slave->dev->name); 3140 } else { 3141 read_unlock(&bond->curr_slave_lock); 3142 } 3143 } 3144 } 3145 3146 read_lock(&bond->curr_slave_lock); 3147 slave = bond->curr_active_slave; 3148 read_unlock(&bond->curr_slave_lock); 3149 3150 if (slave) { 3151 /* if we have sent traffic in the past 2*arp_intervals but 3152 * haven't xmit and rx traffic in that time interval, select 3153 * a different slave. slave->jiffies is only updated when 3154 * a slave first becomes the curr_active_slave - not necessarily 3155 * after every arp; this ensures the slave has a full 2*delta 3156 * before being taken out. 
if a primary is being used, check 3157 * if it is up and needs to take over as the curr_active_slave 3158 */ 3159 if ((((jiffies - slave->dev->trans_start) >= (2*delta_in_ticks)) || 3160 (((jiffies - slave->dev->last_rx) >= (2*delta_in_ticks)) && 3161 bond_has_ip(bond))) && 3162 ((jiffies - slave->jiffies) >= 2*delta_in_ticks)) { 3163 3164 slave->link = BOND_LINK_DOWN; 3165 3166 if (slave->link_failure_count < UINT_MAX) { 3167 slave->link_failure_count++; 3168 } 3169 3170 printk(KERN_INFO DRV_NAME 3171 ": %s: link status down for active interface " 3172 "%s, disabling it\n", 3173 bond_dev->name, 3174 slave->dev->name); 3175 3176 write_lock(&bond->curr_slave_lock); 3177 3178 bond_select_active_slave(bond); 3179 slave = bond->curr_active_slave; 3180 3181 write_unlock(&bond->curr_slave_lock); 3182 3183 bond->current_arp_slave = slave; 3184 3185 if (slave) { 3186 slave->jiffies = jiffies; 3187 } 3188 } else if ((bond->primary_slave) && 3189 (bond->primary_slave != slave) && 3190 (bond->primary_slave->link == BOND_LINK_UP)) { 3191 /* at this point, slave is the curr_active_slave */ 3192 printk(KERN_INFO DRV_NAME 3193 ": %s: changing from interface %s to primary " 3194 "interface %s\n", 3195 bond_dev->name, 3196 slave->dev->name, 3197 bond->primary_slave->dev->name); 3198 3199 /* primary is up so switch to it */ 3200 write_lock(&bond->curr_slave_lock); 3201 bond_change_active_slave(bond, bond->primary_slave); 3202 write_unlock(&bond->curr_slave_lock); 3203 3204 slave = bond->primary_slave; 3205 slave->jiffies = jiffies; 3206 } else { 3207 bond->current_arp_slave = NULL; 3208 } 3209 3210 /* the current slave must tx an arp to ensure backup slaves 3211 * rx traffic 3212 */ 3213 if (slave && bond_has_ip(bond)) { 3214 bond_arp_send_all(bond, slave); 3215 } 3216 } 3217 3218 /* if we don't have a curr_active_slave, search for the next available 3219 * backup slave from the current_arp_slave and make it the candidate 3220 * for becoming the curr_active_slave 3221 */ 3222 if 
(!slave) { 3223 if (!bond->current_arp_slave) { 3224 bond->current_arp_slave = bond->first_slave; 3225 } 3226 3227 if (bond->current_arp_slave) { 3228 bond_set_slave_inactive_flags(bond->current_arp_slave); 3229 3230 /* search for next candidate */ 3231 bond_for_each_slave_from(bond, slave, i, bond->current_arp_slave->next) { 3232 if (IS_UP(slave->dev)) { 3233 slave->link = BOND_LINK_BACK; 3234 bond_set_slave_active_flags(slave); 3235 bond_arp_send_all(bond, slave); 3236 slave->jiffies = jiffies; 3237 bond->current_arp_slave = slave; 3238 break; 3239 } 3240 3241 /* if the link state is up at this point, we 3242 * mark it down - this can happen if we have 3243 * simultaneous link failures and 3244 * reselect_active_interface doesn't make this 3245 * one the current slave so it is still marked 3246 * up when it is actually down 3247 */ 3248 if (slave->link == BOND_LINK_UP) { 3249 slave->link = BOND_LINK_DOWN; 3250 if (slave->link_failure_count < UINT_MAX) { 3251 slave->link_failure_count++; 3252 } 3253 3254 bond_set_slave_inactive_flags(slave); 3255 3256 printk(KERN_INFO DRV_NAME 3257 ": %s: backup interface %s is " 3258 "now down.\n", 3259 bond_dev->name, 3260 slave->dev->name); 3261 } 3262 } 3263 } 3264 } 3265 3266re_arm: 3267 if (bond->params.arp_interval) { 3268 mod_timer(&bond->arp_timer, jiffies + delta_in_ticks); 3269 } 3270out: 3271 read_unlock(&bond->lock); 3272} 3273 3274/*------------------------------ proc/seq_file-------------------------------*/ 3275 3276#ifdef CONFIG_PROC_FS 3277 3278#define SEQ_START_TOKEN ((void *)1) 3279 3280static void *bond_info_seq_start(struct seq_file *seq, loff_t *pos) 3281{ 3282 struct bonding *bond = seq->private; 3283 loff_t off = 0; 3284 struct slave *slave; 3285 int i; 3286 3287 /* make sure the bond won't be taken away */ 3288 read_lock(&dev_base_lock); 3289 read_lock_bh(&bond->lock); 3290 3291 if (*pos == 0) { 3292 return SEQ_START_TOKEN; 3293 } 3294 3295 bond_for_each_slave(bond, slave, i) { 3296 if (++off == *pos) { 
3297 return slave; 3298 } 3299 } 3300 3301 return NULL; 3302} 3303 3304static void *bond_info_seq_next(struct seq_file *seq, void *v, loff_t *pos) 3305{ 3306 struct bonding *bond = seq->private; 3307 struct slave *slave = v; 3308 3309 ++*pos; 3310 if (v == SEQ_START_TOKEN) { 3311 return bond->first_slave; 3312 } 3313 3314 slave = slave->next; 3315 3316 return (slave == bond->first_slave) ? NULL : slave; 3317} 3318 3319static void bond_info_seq_stop(struct seq_file *seq, void *v) 3320{ 3321 struct bonding *bond = seq->private; 3322 3323 read_unlock_bh(&bond->lock); 3324 read_unlock(&dev_base_lock); 3325} 3326 3327static void bond_info_show_master(struct seq_file *seq) 3328{ 3329 struct bonding *bond = seq->private; 3330 struct slave *curr; 3331 3332 read_lock(&bond->curr_slave_lock); 3333 curr = bond->curr_active_slave; 3334 read_unlock(&bond->curr_slave_lock); 3335 3336 seq_printf(seq, "Bonding Mode: %s\n", 3337 bond_mode_name(bond->params.mode)); 3338 3339 if (USES_PRIMARY(bond->params.mode)) { 3340 seq_printf(seq, "Primary Slave: %s\n", 3341 (bond->params.primary[0]) ? 3342 bond->params.primary : "None"); 3343 3344 seq_printf(seq, "Currently Active Slave: %s\n", 3345 (curr) ? curr->dev->name : "None"); 3346 } 3347 3348 seq_printf(seq, "MII Status: %s\n", (curr) ? "up" : "down"); 3349 seq_printf(seq, "MII Polling Interval (ms): %d\n", bond->params.miimon); 3350 seq_printf(seq, "Up Delay (ms): %d\n", 3351 bond->params.updelay * bond->params.miimon); 3352 seq_printf(seq, "Down Delay (ms): %d\n", 3353 bond->params.downdelay * bond->params.miimon); 3354 3355 if (bond->params.mode == BOND_MODE_8023AD) { 3356 struct ad_info ad_info; 3357 3358 seq_puts(seq, "\n802.3ad info\n"); 3359 seq_printf(seq, "LACP rate: %s\n", 3360 (bond->params.lacp_fast) ? 
"fast" : "slow"); 3361 3362 if (bond_3ad_get_active_agg_info(bond, &ad_info)) { 3363 seq_printf(seq, "bond %s has no active aggregator\n", 3364 bond->dev->name); 3365 } else { 3366 seq_printf(seq, "Active Aggregator Info:\n"); 3367 3368 seq_printf(seq, "\tAggregator ID: %d\n", 3369 ad_info.aggregator_id); 3370 seq_printf(seq, "\tNumber of ports: %d\n", 3371 ad_info.ports); 3372 seq_printf(seq, "\tActor Key: %d\n", 3373 ad_info.actor_key); 3374 seq_printf(seq, "\tPartner Key: %d\n", 3375 ad_info.partner_key); 3376 seq_printf(seq, "\tPartner Mac Address: %02x:%02x:%02x:%02x:%02x:%02x\n", 3377 ad_info.partner_system[0], 3378 ad_info.partner_system[1], 3379 ad_info.partner_system[2], 3380 ad_info.partner_system[3], 3381 ad_info.partner_system[4], 3382 ad_info.partner_system[5]); 3383 } 3384 } 3385} 3386 3387static void bond_info_show_slave(struct seq_file *seq, const struct slave *slave) 3388{ 3389 struct bonding *bond = seq->private; 3390 3391 seq_printf(seq, "\nSlave Interface: %s\n", slave->dev->name); 3392 seq_printf(seq, "MII Status: %s\n", 3393 (slave->link == BOND_LINK_UP) ? 
"up" : "down"); 3394 seq_printf(seq, "Link Failure Count: %d\n", 3395 slave->link_failure_count); 3396 3397 if (app_abi_ver >= 1) { 3398 seq_printf(seq, 3399 "Permanent HW addr: %02x:%02x:%02x:%02x:%02x:%02x\n", 3400 slave->perm_hwaddr[0], 3401 slave->perm_hwaddr[1], 3402 slave->perm_hwaddr[2], 3403 slave->perm_hwaddr[3], 3404 slave->perm_hwaddr[4], 3405 slave->perm_hwaddr[5]); 3406 } 3407 3408 if (bond->params.mode == BOND_MODE_8023AD) { 3409 const struct aggregator *agg 3410 = SLAVE_AD_INFO(slave).port.aggregator; 3411 3412 if (agg) { 3413 seq_printf(seq, "Aggregator ID: %d\n", 3414 agg->aggregator_identifier); 3415 } else { 3416 seq_puts(seq, "Aggregator ID: N/A\n"); 3417 } 3418 } 3419} 3420 3421static int bond_info_seq_show(struct seq_file *seq, void *v) 3422{ 3423 if (v == SEQ_START_TOKEN) { 3424 seq_printf(seq, "%s\n", version); 3425 bond_info_show_master(seq); 3426 } else { 3427 bond_info_show_slave(seq, v); 3428 } 3429 3430 return 0; 3431} 3432 3433static struct seq_operations bond_info_seq_ops = { 3434 .start = bond_info_seq_start, 3435 .next = bond_info_seq_next, 3436 .stop = bond_info_seq_stop, 3437 .show = bond_info_seq_show, 3438}; 3439 3440static int bond_info_open(struct inode *inode, struct file *file) 3441{ 3442 struct seq_file *seq; 3443 struct proc_dir_entry *proc; 3444 int res; 3445 3446 res = seq_open(file, &bond_info_seq_ops); 3447 if (!res) { 3448 /* recover the pointer buried in proc_dir_entry data */ 3449 seq = file->private_data; 3450 proc = PDE(inode); 3451 seq->private = proc->data; 3452 } 3453 3454 return res; 3455} 3456 3457static struct file_operations bond_info_fops = { 3458 .owner = THIS_MODULE, 3459 .open = bond_info_open, 3460 .read = seq_read, 3461 .llseek = seq_lseek, 3462 .release = seq_release, 3463}; 3464 3465static int bond_create_proc_entry(struct bonding *bond) 3466{ 3467 struct net_device *bond_dev = bond->dev; 3468 3469 if (bond_proc_dir) { 3470 bond->proc_entry = create_proc_entry(bond_dev->name, 3471 S_IRUGO, 3472 
						     bond_proc_dir);
		if (bond->proc_entry == NULL) {
			printk(KERN_WARNING DRV_NAME
			       ": Warning: Cannot create /proc/net/%s/%s\n",
			       DRV_NAME, bond_dev->name);
		} else {
			bond->proc_entry->data = bond;
			bond->proc_entry->proc_fops = &bond_info_fops;
			bond->proc_entry->owner = THIS_MODULE;
			memcpy(bond->proc_file_name, bond_dev->name, IFNAMSIZ);
		}
	}

	return 0;
}

static void bond_remove_proc_entry(struct bonding *bond)
{
	if (bond_proc_dir && bond->proc_entry) {
		remove_proc_entry(bond->proc_file_name, bond_proc_dir);
		memset(bond->proc_file_name, 0, IFNAMSIZ);
		bond->proc_entry = NULL;
	}
}

/* Create the bonding directory under /proc/net, if it doesn't exist yet.
 * Caller must hold rtnl_lock.
 */
static void bond_create_proc_dir(void)
{
	int len = strlen(DRV_NAME);

	for (bond_proc_dir = proc_net->subdir; bond_proc_dir;
	     bond_proc_dir = bond_proc_dir->next) {
		if ((bond_proc_dir->namelen == len) &&
		    !memcmp(bond_proc_dir->name, DRV_NAME, len)) {
			break;
		}
	}

	if (!bond_proc_dir) {
		bond_proc_dir = proc_mkdir(DRV_NAME, proc_net);
		if (bond_proc_dir) {
			bond_proc_dir->owner = THIS_MODULE;
		} else {
			printk(KERN_WARNING DRV_NAME
				": Warning: cannot create /proc/net/%s\n",
				DRV_NAME);
		}
	}
}

/* Destroy the bonding directory under /proc/net, if empty.
 * Caller must hold rtnl_lock.
 */
static void bond_destroy_proc_dir(void)
{
	struct proc_dir_entry *de;

	if (!bond_proc_dir) {
		return;
	}

	/* verify that the /proc dir is empty */
	for (de = bond_proc_dir->subdir; de; de = de->next) {
		/* ignore . and ..
*/ 3538 if (*(de->name) != '.') { 3539 break; 3540 } 3541 } 3542 3543 if (de) { 3544 if (bond_proc_dir->owner == THIS_MODULE) { 3545 bond_proc_dir->owner = NULL; 3546 } 3547 } else { 3548 remove_proc_entry(DRV_NAME, proc_net); 3549 bond_proc_dir = NULL; 3550 } 3551} 3552#endif /* CONFIG_PROC_FS */ 3553 3554/*-------------------------- netdev event handling --------------------------*/ 3555 3556/* 3557 * Change device name 3558 */ 3559static int bond_event_changename(struct bonding *bond) 3560{ 3561#ifdef CONFIG_PROC_FS 3562 bond_remove_proc_entry(bond); 3563 bond_create_proc_entry(bond); 3564#endif 3565 3566 return NOTIFY_DONE; 3567} 3568 3569static int bond_master_netdev_event(unsigned long event, struct net_device *bond_dev) 3570{ 3571 struct bonding *event_bond = bond_dev->priv; 3572 3573 switch (event) { 3574 case NETDEV_CHANGENAME: 3575 return bond_event_changename(event_bond); 3576 case NETDEV_UNREGISTER: 3577 /* 3578 * TODO: remove a bond from the list? 3579 */ 3580 break; 3581 default: 3582 break; 3583 } 3584 3585 return NOTIFY_DONE; 3586} 3587 3588static int bond_slave_netdev_event(unsigned long event, struct net_device *slave_dev) 3589{ 3590 struct net_device *bond_dev = slave_dev->master; 3591 3592 switch (event) { 3593 case NETDEV_UNREGISTER: 3594 if (bond_dev) { 3595 bond_release(bond_dev, slave_dev); 3596 } 3597 break; 3598 case NETDEV_CHANGE: 3599 /* 3600 * TODO: is this what we get if somebody 3601 * sets up a hierarchical bond, then rmmod's 3602 * one of the slave bonding devices? 3603 */ 3604 break; 3605 case NETDEV_DOWN: 3606 /* 3607 * ... Or is it this? 3608 */ 3609 break; 3610 case NETDEV_CHANGEMTU: 3611 /* 3612 * TODO: Should slaves be allowed to 3613 * independently alter their MTU? For 3614 * an active-backup bond, slaves need 3615 * not be the same type of device, so 3616 * MTUs may vary. For other modes, 3617 * slaves arguably should have the 3618 * same MTUs. 
To do this, we'd need to 3619 * take over the slave's change_mtu 3620 * function for the duration of their 3621 * servitude. 3622 */ 3623 break; 3624 case NETDEV_CHANGENAME: 3625 /* 3626 * TODO: handle changing the primary's name 3627 */ 3628 break; 3629 default: 3630 break; 3631 } 3632 3633 return NOTIFY_DONE; 3634} 3635 3636/* 3637 * bond_netdev_event: handle netdev notifier chain events. 3638 * 3639 * This function receives events for the netdev chain. The caller (an 3640 * ioctl handler calling notifier_call_chain) holds the necessary 3641 * locks for us to safely manipulate the slave devices (RTNL lock, 3642 * dev_probe_lock). 3643 */ 3644static int bond_netdev_event(struct notifier_block *this, unsigned long event, void *ptr) 3645{ 3646 struct net_device *event_dev = (struct net_device *)ptr; 3647 3648 dprintk("event_dev: %s, event: %lx\n", 3649 (event_dev ? event_dev->name : "None"), 3650 event); 3651 3652 if (event_dev->flags & IFF_MASTER) { 3653 dprintk("IFF_MASTER\n"); 3654 return bond_master_netdev_event(event, event_dev); 3655 } 3656 3657 if (event_dev->flags & IFF_SLAVE) { 3658 dprintk("IFF_SLAVE\n"); 3659 return bond_slave_netdev_event(event, event_dev); 3660 } 3661 3662 return NOTIFY_DONE; 3663} 3664 3665/* 3666 * bond_inetaddr_event: handle inetaddr notifier chain events. 3667 * 3668 * We keep track of device IPs primarily to use as source addresses in 3669 * ARP monitor probes (rather than spewing out broadcasts all the time). 3670 * 3671 * We track one IP for the main device (if it has one), plus one per VLAN. 
3672 */ 3673static int bond_inetaddr_event(struct notifier_block *this, unsigned long event, void *ptr) 3674{ 3675 struct in_ifaddr *ifa = ptr; 3676 struct net_device *vlan_dev, *event_dev = ifa->ifa_dev->dev; 3677 struct bonding *bond, *bond_next; 3678 struct vlan_entry *vlan, *vlan_next; 3679 3680 list_for_each_entry_safe(bond, bond_next, &bond_dev_list, bond_list) { 3681 if (bond->dev == event_dev) { 3682 switch (event) { 3683 case NETDEV_UP: 3684 bond->master_ip = ifa->ifa_local; 3685 return NOTIFY_OK; 3686 case NETDEV_DOWN: 3687 bond->master_ip = bond_glean_dev_ip(bond->dev); 3688 return NOTIFY_OK; 3689 default: 3690 return NOTIFY_DONE; 3691 } 3692 } 3693 3694 if (list_empty(&bond->vlan_list)) 3695 continue; 3696 3697 list_for_each_entry_safe(vlan, vlan_next, &bond->vlan_list, 3698 vlan_list) { 3699 vlan_dev = bond->vlgrp->vlan_devices[vlan->vlan_id]; 3700 if (vlan_dev == event_dev) { 3701 switch (event) { 3702 case NETDEV_UP: 3703 vlan->vlan_ip = ifa->ifa_local; 3704 return NOTIFY_OK; 3705 case NETDEV_DOWN: 3706 vlan->vlan_ip = 3707 bond_glean_dev_ip(vlan_dev); 3708 return NOTIFY_OK; 3709 default: 3710 return NOTIFY_DONE; 3711 } 3712 } 3713 } 3714 } 3715 return NOTIFY_DONE; 3716} 3717 3718static struct notifier_block bond_netdev_notifier = { 3719 .notifier_call = bond_netdev_event, 3720}; 3721 3722static struct notifier_block bond_inetaddr_notifier = { 3723 .notifier_call = bond_inetaddr_event, 3724}; 3725 3726/*-------------------------- Packet type handling ---------------------------*/ 3727 3728/* register to receive lacpdus on a bond */ 3729static void bond_register_lacpdu(struct bonding *bond) 3730{ 3731 struct packet_type *pk_type = &(BOND_AD_INFO(bond).ad_pkt_type); 3732 3733 /* initialize packet type */ 3734 pk_type->type = PKT_TYPE_LACPDU; 3735 pk_type->dev = bond->dev; 3736 pk_type->func = bond_3ad_lacpdu_recv; 3737 3738 dev_add_pack(pk_type); 3739} 3740 3741/* unregister to receive lacpdus on a bond */ 3742static void bond_unregister_lacpdu(struct 
bonding *bond)
{
	dev_remove_pack(&(BOND_AD_INFO(bond).ad_pkt_type));
}

/*---------------------------- Hashing Policies -----------------------------*/

/*
 * Hash for the output device based upon layer 3 and layer 4 data. If
 * the packet is a frag or not TCP or UDP, just use layer 3 data. If it is
 * altogether not IP, mimic bond_xmit_hash_policy_l2()
 */
static int bond_xmit_hash_policy_l34(struct sk_buff *skb,
				    struct net_device *bond_dev, int count)
{
	struct ethhdr *data = (struct ethhdr *)skb->data;
	struct iphdr *iph = skb->nh.iph;
	u16 *layer4hdr = (u16 *)((u32 *)iph + iph->ihl);
	int layer4_xor = 0;

	if (skb->protocol == __constant_htons(ETH_P_IP)) {
		if (!(iph->frag_off & __constant_htons(IP_MF|IP_OFFSET)) &&
		    (iph->protocol == IPPROTO_TCP ||
		     iph->protocol == IPPROTO_UDP)) {
			layer4_xor = htons((*layer4hdr ^ *(layer4hdr + 1)));
		}
		return (layer4_xor ^
			((ntohl(iph->saddr ^ iph->daddr)) & 0xffff)) % count;

	}

	return (data->h_dest[5] ^ bond_dev->dev_addr[5]) % count;
}

/*
 * Hash for the output device based upon layer 2 data
 */
static int bond_xmit_hash_policy_l2(struct sk_buff *skb,
				   struct net_device *bond_dev, int count)
{
	struct ethhdr *data = (struct ethhdr *)skb->data;

	return (data->h_dest[5] ^ bond_dev->dev_addr[5]) % count;
}

/*-------------------------- Device entry points ----------------------------*/

static int bond_open(struct net_device *bond_dev)
{
	struct bonding *bond = bond_dev->priv;
	struct timer_list *mii_timer = &bond->mii_timer;
	struct timer_list *arp_timer = &bond->arp_timer;

	bond->kill_timers = 0;

	if ((bond->params.mode == BOND_MODE_TLB) ||
	    (bond->params.mode == BOND_MODE_ALB)) {
		struct timer_list *alb_timer = &(BOND_ALB_INFO(bond).alb_timer);

		/* bond_alb_initialize must be
called before the timer 3802 * is started. 3803 */ 3804 if (bond_alb_initialize(bond, (bond->params.mode == BOND_MODE_ALB))) { 3805 /* something went wrong - fail the open operation */ 3806 return -1; 3807 } 3808 3809 init_timer(alb_timer); 3810 alb_timer->expires = jiffies + 1; 3811 alb_timer->data = (unsigned long)bond; 3812 alb_timer->function = (void *)&bond_alb_monitor; 3813 add_timer(alb_timer); 3814 } 3815 3816 if (bond->params.miimon) { /* link check interval, in milliseconds. */ 3817 init_timer(mii_timer); 3818 mii_timer->expires = jiffies + 1; 3819 mii_timer->data = (unsigned long)bond_dev; 3820 mii_timer->function = (void *)&bond_mii_monitor; 3821 add_timer(mii_timer); 3822 } 3823 3824 if (bond->params.arp_interval) { /* arp interval, in milliseconds. */ 3825 init_timer(arp_timer); 3826 arp_timer->expires = jiffies + 1; 3827 arp_timer->data = (unsigned long)bond_dev; 3828 if (bond->params.mode == BOND_MODE_ACTIVEBACKUP) { 3829 arp_timer->function = (void *)&bond_activebackup_arp_mon; 3830 } else { 3831 arp_timer->function = (void *)&bond_loadbalance_arp_mon; 3832 } 3833 add_timer(arp_timer); 3834 } 3835 3836 if (bond->params.mode == BOND_MODE_8023AD) { 3837 struct timer_list *ad_timer = &(BOND_AD_INFO(bond).ad_timer); 3838 init_timer(ad_timer); 3839 ad_timer->expires = jiffies + 1; 3840 ad_timer->data = (unsigned long)bond; 3841 ad_timer->function = (void *)&bond_3ad_state_machine_handler; 3842 add_timer(ad_timer); 3843 3844 /* register to receive LACPDUs */ 3845 bond_register_lacpdu(bond); 3846 } 3847 3848 return 0; 3849} 3850 3851static int bond_close(struct net_device *bond_dev) 3852{ 3853 struct bonding *bond = bond_dev->priv; 3854 3855 if (bond->params.mode == BOND_MODE_8023AD) { 3856 /* Unregister the receive of LACPDUs */ 3857 bond_unregister_lacpdu(bond); 3858 } 3859 3860 write_lock_bh(&bond->lock); 3861 3862 bond_mc_list_destroy(bond); 3863 3864 /* signal timers not to re-arm */ 3865 bond->kill_timers = 1; 3866 3867 write_unlock_bh(&bond->lock); 
3868 3869 /* del_timer_sync must run without holding the bond->lock 3870 * because a running timer might be trying to hold it too 3871 */ 3872 3873 if (bond->params.miimon) { /* link check interval, in milliseconds. */ 3874 del_timer_sync(&bond->mii_timer); 3875 } 3876 3877 if (bond->params.arp_interval) { /* arp interval, in milliseconds. */ 3878 del_timer_sync(&bond->arp_timer); 3879 } 3880 3881 switch (bond->params.mode) { 3882 case BOND_MODE_8023AD: 3883 del_timer_sync(&(BOND_AD_INFO(bond).ad_timer)); 3884 break; 3885 case BOND_MODE_TLB: 3886 case BOND_MODE_ALB: 3887 del_timer_sync(&(BOND_ALB_INFO(bond).alb_timer)); 3888 break; 3889 default: 3890 break; 3891 } 3892 3893 /* Release the bonded slaves */ 3894 bond_release_all(bond_dev); 3895 3896 if ((bond->params.mode == BOND_MODE_TLB) || 3897 (bond->params.mode == BOND_MODE_ALB)) { 3898 /* Must be called only after all 3899 * slaves have been released 3900 */ 3901 bond_alb_deinitialize(bond); 3902 } 3903 3904 return 0; 3905} 3906 3907static struct net_device_stats *bond_get_stats(struct net_device *bond_dev) 3908{ 3909 struct bonding *bond = bond_dev->priv; 3910 struct net_device_stats *stats = &(bond->stats), *sstats; 3911 struct slave *slave; 3912 int i; 3913 3914 memset(stats, 0, sizeof(struct net_device_stats)); 3915 3916 read_lock_bh(&bond->lock); 3917 3918 bond_for_each_slave(bond, slave, i) { 3919 sstats = slave->dev->get_stats(slave->dev); 3920 3921 stats->rx_packets += sstats->rx_packets; 3922 stats->rx_bytes += sstats->rx_bytes; 3923 stats->rx_errors += sstats->rx_errors; 3924 stats->rx_dropped += sstats->rx_dropped; 3925 3926 stats->tx_packets += sstats->tx_packets; 3927 stats->tx_bytes += sstats->tx_bytes; 3928 stats->tx_errors += sstats->tx_errors; 3929 stats->tx_dropped += sstats->tx_dropped; 3930 3931 stats->multicast += sstats->multicast; 3932 stats->collisions += sstats->collisions; 3933 3934 stats->rx_length_errors += sstats->rx_length_errors; 3935 stats->rx_over_errors += 
sstats->rx_over_errors; 3936 stats->rx_crc_errors += sstats->rx_crc_errors; 3937 stats->rx_frame_errors += sstats->rx_frame_errors; 3938 stats->rx_fifo_errors += sstats->rx_fifo_errors; 3939 stats->rx_missed_errors += sstats->rx_missed_errors; 3940 3941 stats->tx_aborted_errors += sstats->tx_aborted_errors; 3942 stats->tx_carrier_errors += sstats->tx_carrier_errors; 3943 stats->tx_fifo_errors += sstats->tx_fifo_errors; 3944 stats->tx_heartbeat_errors += sstats->tx_heartbeat_errors; 3945 stats->tx_window_errors += sstats->tx_window_errors; 3946 } 3947 3948 read_unlock_bh(&bond->lock); 3949 3950 return stats; 3951} 3952 3953static int bond_do_ioctl(struct net_device *bond_dev, struct ifreq *ifr, int cmd) 3954{ 3955 struct net_device *slave_dev = NULL; 3956 struct ifbond k_binfo; 3957 struct ifbond __user *u_binfo = NULL; 3958 struct ifslave k_sinfo; 3959 struct ifslave __user *u_sinfo = NULL; 3960 struct mii_ioctl_data *mii = NULL; 3961 int prev_abi_ver = orig_app_abi_ver; 3962 int res = 0; 3963 3964 dprintk("bond_ioctl: master=%s, cmd=%d\n", 3965 bond_dev->name, cmd); 3966 3967 switch (cmd) { 3968 case SIOCETHTOOL: 3969 return bond_ethtool_ioctl(bond_dev, ifr); 3970 case SIOCGMIIPHY: 3971 mii = if_mii(ifr); 3972 if (!mii) { 3973 return -EINVAL; 3974 } 3975 mii->phy_id = 0; 3976 /* Fall Through */ 3977 case SIOCGMIIREG: 3978 /* 3979 * We do this again just in case we were called by SIOCGMIIREG 3980 * instead of SIOCGMIIPHY. 
3981 */ 3982 mii = if_mii(ifr); 3983 if (!mii) { 3984 return -EINVAL; 3985 } 3986 3987 if (mii->reg_num == 1) { 3988 struct bonding *bond = bond_dev->priv; 3989 mii->val_out = 0; 3990 read_lock_bh(&bond->lock); 3991 read_lock(&bond->curr_slave_lock); 3992 if (bond->curr_active_slave) { 3993 mii->val_out = BMSR_LSTATUS; 3994 } 3995 read_unlock(&bond->curr_slave_lock); 3996 read_unlock_bh(&bond->lock); 3997 } 3998 3999 return 0; 4000 case BOND_INFO_QUERY_OLD: 4001 case SIOCBONDINFOQUERY: 4002 u_binfo = (struct ifbond __user *)ifr->ifr_data; 4003 4004 if (copy_from_user(&k_binfo, u_binfo, sizeof(ifbond))) { 4005 return -EFAULT; 4006 } 4007 4008 res = bond_info_query(bond_dev, &k_binfo); 4009 if (res == 0) { 4010 if (copy_to_user(u_binfo, &k_binfo, sizeof(ifbond))) { 4011 return -EFAULT; 4012 } 4013 } 4014 4015 return res; 4016 case BOND_SLAVE_INFO_QUERY_OLD: 4017 case SIOCBONDSLAVEINFOQUERY: 4018 u_sinfo = (struct ifslave __user *)ifr->ifr_data; 4019 4020 if (copy_from_user(&k_sinfo, u_sinfo, sizeof(ifslave))) { 4021 return -EFAULT; 4022 } 4023 4024 res = bond_slave_info_query(bond_dev, &k_sinfo); 4025 if (res == 0) { 4026 if (copy_to_user(u_sinfo, &k_sinfo, sizeof(ifslave))) { 4027 return -EFAULT; 4028 } 4029 } 4030 4031 return res; 4032 default: 4033 /* Go on */ 4034 break; 4035 } 4036 4037 if (!capable(CAP_NET_ADMIN)) { 4038 return -EPERM; 4039 } 4040 4041 if (orig_app_abi_ver == -1) { 4042 /* no orig_app_abi_ver was provided yet, so we'll use the 4043 * current one from now on, even if it's 0 4044 */ 4045 orig_app_abi_ver = app_abi_ver; 4046 4047 } else if (orig_app_abi_ver != app_abi_ver) { 4048 printk(KERN_ERR DRV_NAME 4049 ": Error: already using ifenslave ABI version %d; to " 4050 "upgrade ifenslave to version %d, you must first " 4051 "reload bonding.\n", 4052 orig_app_abi_ver, app_abi_ver); 4053 return -EINVAL; 4054 } 4055 4056 slave_dev = dev_get_by_name(ifr->ifr_slave); 4057 4058 dprintk("slave_dev=%p: \n", slave_dev); 4059 4060 if (!slave_dev) { 4061 res 
= -ENODEV;
	} else {
		dprintk("slave_dev->name=%s: \n", slave_dev->name);
		switch (cmd) {
		case BOND_ENSLAVE_OLD:
		case SIOCBONDENSLAVE:
			res = bond_enslave(bond_dev, slave_dev);
			break;
		case BOND_RELEASE_OLD:
		case SIOCBONDRELEASE:
			res = bond_release(bond_dev, slave_dev);
			break;
		case BOND_SETHWADDR_OLD:
		case SIOCBONDSETHWADDR:
			res = bond_sethwaddr(bond_dev, slave_dev);
			break;
		case BOND_CHANGE_ACTIVE_OLD:
		case SIOCBONDCHANGEACTIVE:
			res = bond_ioctl_change_active(bond_dev, slave_dev);
			break;
		default:
			res = -EOPNOTSUPP;
		}

		dev_put(slave_dev);
	}

	if (res < 0) {
		/* The ioctl failed, so there's no point in changing the
		 * orig_app_abi_ver. We'll restore its value just in case
		 * we've changed it earlier in this function.
		 */
		orig_app_abi_ver = prev_abi_ver;
	}

	return res;
}

static void bond_set_multicast_list(struct net_device *bond_dev)
{
	struct bonding *bond = bond_dev->priv;
	struct dev_mc_list *dmi;

	write_lock_bh(&bond->lock);

	/*
	 * Do promisc before checking multicast_mode
	 */
	if ((bond_dev->flags & IFF_PROMISC) && !(bond->flags & IFF_PROMISC)) {
		bond_set_promiscuity(bond, 1);
	}

	if (!(bond_dev->flags & IFF_PROMISC) && (bond->flags & IFF_PROMISC)) {
		bond_set_promiscuity(bond, -1);
	}

	/* set allmulti flag to slaves */
	if ((bond_dev->flags & IFF_ALLMULTI) && !(bond->flags & IFF_ALLMULTI)) {
		bond_set_allmulti(bond, 1);
	}

	if (!(bond_dev->flags & IFF_ALLMULTI) && (bond->flags & IFF_ALLMULTI)) {
		bond_set_allmulti(bond, -1);
	}

	bond->flags = bond_dev->flags;

	/* looking for addresses to add to slaves' mc list */
	for (dmi = bond_dev->mc_list; dmi; dmi = dmi->next) {
		if (!bond_mc_list_find_dmi(dmi, bond->mc_list)) {
			bond_mc_add(bond, dmi->dmi_addr,
				    dmi->dmi_addrlen);
		}
	}

	/* looking for addresses to delete from slaves' list */
	for (dmi = bond->mc_list; dmi; dmi = dmi->next) {
		if (!bond_mc_list_find_dmi(dmi, bond_dev->mc_list)) {
			bond_mc_delete(bond, dmi->dmi_addr, dmi->dmi_addrlen);
		}
	}

	/* save master's multicast list */
	bond_mc_list_destroy(bond);
	bond_mc_list_copy(bond_dev->mc_list, bond, GFP_ATOMIC);

	write_unlock_bh(&bond->lock);
}

/*
 * Change the MTU of all of a master's slaves to match the master
 */
static int bond_change_mtu(struct net_device *bond_dev, int new_mtu)
{
	struct bonding *bond = bond_dev->priv;
	struct slave *slave, *stop_at;
	int res = 0;
	int i;

	dprintk("bond=%p, name=%s, new_mtu=%d\n", bond,
		(bond_dev ? bond_dev->name : "None"), new_mtu);

	/* Can't hold bond->lock with bh disabled here since
	 * some base drivers panic. On the other hand we can't
	 * hold bond->lock without bh disabled because we'll
	 * deadlock. The only solution is to rely on the fact
	 * that we're under rtnl_lock here, and the slaves
	 * list won't change. This doesn't solve the problem
	 * of setting the slave's MTU while it is
	 * transmitting, but the assumption is that the base
	 * driver can handle that.
	 *
	 * TODO: figure out a way to safely iterate the slaves
	 * list, but without holding a lock around the actual
	 * call to the base driver.
	 */

	bond_for_each_slave(bond, slave, i) {
		dprintk("s %p s->p %p c_m %p\n", slave,
			slave->prev, slave->dev->change_mtu);
		res = dev_set_mtu(slave->dev, new_mtu);

		if (res) {
			/* If we failed to set the slave's mtu to the new value
			 * we must abort the operation even in ACTIVE_BACKUP
			 * mode, because if we allow the backup slaves to have
			 * different mtu values than the active slave we'll
			 * need to change their mtu when doing a failover.
			 * That means changing their mtu from timer context, which
			 * is probably not a good idea.
			 */
			dprintk("err %d %s\n", res, slave->dev->name);
			goto unwind;
		}
	}

	bond_dev->mtu = new_mtu;

	return 0;

unwind:
	/* unwind from head to the slave that failed */
	stop_at = slave;
	bond_for_each_slave_from_to(bond, slave, i, bond->first_slave, stop_at) {
		int tmp_res;

		tmp_res = dev_set_mtu(slave->dev, bond_dev->mtu);
		if (tmp_res) {
			dprintk("unwind err %d dev %s\n", tmp_res,
				slave->dev->name);
		}
	}

	return res;
}

/*
 * Change HW address
 *
 * Note that many devices must be down to change the HW address, and
 * downing the master releases all slaves. We can make bonds full of
 * bonding devices to test this, however.
 */
static int bond_set_mac_address(struct net_device *bond_dev, void *addr)
{
	struct bonding *bond = bond_dev->priv;
	struct sockaddr *sa = addr, tmp_sa;
	struct slave *slave, *stop_at;
	int res = 0;
	int i;

	dprintk("bond=%p, name=%s\n", bond, (bond_dev ? bond_dev->name : "None"));

	if (!is_valid_ether_addr(sa->sa_data)) {
		return -EADDRNOTAVAIL;
	}

	/* Can't hold bond->lock with bh disabled here since
	 * some base drivers panic. On the other hand we can't
	 * hold bond->lock without bh disabled because we'll
	 * deadlock. The only solution is to rely on the fact
	 * that we're under rtnl_lock here, and the slaves
	 * list won't change. This doesn't solve the problem
	 * of setting the slave's hw address while it is
	 * transmitting, but the assumption is that the base
	 * driver can handle that.
	 *
	 * TODO: figure out a way to safely iterate the slaves
	 * list, but without holding a lock around the actual
	 * call to the base driver.
	 */

	bond_for_each_slave(bond, slave, i) {
		dprintk("slave %p %s\n", slave, slave->dev->name);

		if (slave->dev->set_mac_address == NULL) {
			res = -EOPNOTSUPP;
			dprintk("EOPNOTSUPP %s\n", slave->dev->name);
			goto unwind;
		}

		res = dev_set_mac_address(slave->dev, addr);
		if (res) {
			/* TODO: consider downing the slave
			 * and retrying?
			 * User should expect communications
			 * breakage anyway until ARP finishes
			 * updating, so...
			 */
			dprintk("err %d %s\n", res, slave->dev->name);
			goto unwind;
		}
	}

	/* success */
	memcpy(bond_dev->dev_addr, sa->sa_data, bond_dev->addr_len);
	return 0;

unwind:
	memcpy(tmp_sa.sa_data, bond_dev->dev_addr, bond_dev->addr_len);
	tmp_sa.sa_family = bond_dev->type;

	/* unwind from head to the slave that failed */
	stop_at = slave;
	bond_for_each_slave_from_to(bond, slave, i, bond->first_slave, stop_at) {
		int tmp_res;

		tmp_res = dev_set_mac_address(slave->dev, &tmp_sa);
		if (tmp_res) {
			dprintk("unwind err %d dev %s\n", tmp_res,
				slave->dev->name);
		}
	}

	return res;
}

static int bond_xmit_roundrobin(struct sk_buff *skb, struct net_device *bond_dev)
{
	struct bonding *bond = bond_dev->priv;
	struct slave *slave, *start_at;
	int i;
	int res = 1;

	read_lock(&bond->lock);

	if (!BOND_IS_OK(bond)) {
		goto out;
	}

	read_lock(&bond->curr_slave_lock);
	slave = start_at = bond->curr_active_slave;
	read_unlock(&bond->curr_slave_lock);

	if (!slave) {
		goto out;
	}

	bond_for_each_slave_from(bond, slave, i, start_at) {
		if (IS_UP(slave->dev) &&
		    (slave->link == BOND_LINK_UP) &&
		    (slave->state == BOND_STATE_ACTIVE)) {
			res = bond_dev_queue_xmit(bond, skb, slave->dev);

			write_lock(&bond->curr_slave_lock);
			bond->curr_active_slave = slave->next;
			write_unlock(&bond->curr_slave_lock);

			break;
		}
	}


out:
	if (res) {
		/* no suitable interface, frame not sent */
		dev_kfree_skb(skb);
	}
	read_unlock(&bond->lock);
	return 0;
}

/*
 * in active-backup mode, we know that bond->curr_active_slave is always valid if
 * the bond has a usable interface.
 */
static int bond_xmit_activebackup(struct sk_buff *skb, struct net_device *bond_dev)
{
	struct bonding *bond = bond_dev->priv;
	int res = 1;

	read_lock(&bond->lock);
	read_lock(&bond->curr_slave_lock);

	if (!BOND_IS_OK(bond)) {
		goto out;
	}

	if (bond->curr_active_slave) { /* one usable interface */
		res = bond_dev_queue_xmit(bond, skb, bond->curr_active_slave->dev);
	}

out:
	if (res) {
		/* no suitable interface, frame not sent */
		dev_kfree_skb(skb);
	}
	read_unlock(&bond->curr_slave_lock);
	read_unlock(&bond->lock);
	return 0;
}

/*
 * In bond_xmit_xor(), we determine the output device by using a
 * predetermined xmit_hash_policy(). If the selected device is not enabled,
 * find the next active slave.
 */
static int bond_xmit_xor(struct sk_buff *skb, struct net_device *bond_dev)
{
	struct bonding *bond = bond_dev->priv;
	struct slave *slave, *start_at;
	int slave_no;
	int i;
	int res = 1;

	read_lock(&bond->lock);

	if (!BOND_IS_OK(bond)) {
		goto out;
	}

	slave_no = bond->xmit_hash_policy(skb, bond_dev, bond->slave_cnt);

	bond_for_each_slave(bond, slave, i) {
		slave_no--;
		if (slave_no < 0) {
			break;
		}
	}

	start_at = slave;

	bond_for_each_slave_from(bond, slave, i, start_at) {
		if (IS_UP(slave->dev) &&
		    (slave->link == BOND_LINK_UP) &&
		    (slave->state == BOND_STATE_ACTIVE)) {
			res = bond_dev_queue_xmit(bond, skb, slave->dev);
			break;
		}
	}

out:
	if (res) {
		/* no suitable interface, frame not sent */
		dev_kfree_skb(skb);
	}
	read_unlock(&bond->lock);
	return 0;
}

/*
 * in broadcast mode, we send everything to all usable interfaces.
 */
static int bond_xmit_broadcast(struct sk_buff *skb, struct net_device *bond_dev)
{
	struct bonding *bond = bond_dev->priv;
	struct slave *slave, *start_at;
	struct net_device *tx_dev = NULL;
	int i;
	int res = 1;

	read_lock(&bond->lock);

	if (!BOND_IS_OK(bond)) {
		goto out;
	}

	read_lock(&bond->curr_slave_lock);
	start_at = bond->curr_active_slave;
	read_unlock(&bond->curr_slave_lock);

	if (!start_at) {
		goto out;
	}

	bond_for_each_slave_from(bond, slave, i, start_at) {
		if (IS_UP(slave->dev) &&
		    (slave->link == BOND_LINK_UP) &&
		    (slave->state == BOND_STATE_ACTIVE)) {
			if (tx_dev) {
				struct sk_buff *skb2 = skb_clone(skb, GFP_ATOMIC);
				if (!skb2) {
					printk(KERN_ERR DRV_NAME
					       ": Error: bond_xmit_broadcast(): "
					       "skb_clone() failed\n");
					continue;
				}

				res = bond_dev_queue_xmit(bond, skb2, tx_dev);
				if (res) {
					dev_kfree_skb(skb2);
					continue;
				}
			}
			tx_dev = slave->dev;
		}
	}

	if (tx_dev) {
		res = bond_dev_queue_xmit(bond, skb, tx_dev);
	}

out:
	if (res) {
		/* no suitable interface, frame not sent */
		dev_kfree_skb(skb);
	}
	/* frame sent to all suitable interfaces */
	read_unlock(&bond->lock);
	return 0;
}

/*------------------------- Device initialization ---------------------------*/

/*
 * set bond mode specific net device operations
 */
static inline void bond_set_mode_ops(struct bonding *bond, int mode)
{
	struct net_device *bond_dev = bond->dev;

	switch (mode) {
	case BOND_MODE_ROUNDROBIN:
		bond_dev->hard_start_xmit = bond_xmit_roundrobin;
		break;
	case BOND_MODE_ACTIVEBACKUP:
		bond_dev->hard_start_xmit = bond_xmit_activebackup;
		break;
	case BOND_MODE_XOR:
		bond_dev->hard_start_xmit = bond_xmit_xor;
		if (bond->params.xmit_policy ==
		    BOND_XMIT_POLICY_LAYER34)
			bond->xmit_hash_policy = bond_xmit_hash_policy_l34;
		else
			bond->xmit_hash_policy = bond_xmit_hash_policy_l2;
		break;
	case BOND_MODE_BROADCAST:
		bond_dev->hard_start_xmit = bond_xmit_broadcast;
		break;
	case BOND_MODE_8023AD:
		bond_dev->hard_start_xmit = bond_3ad_xmit_xor;
		if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER34)
			bond->xmit_hash_policy = bond_xmit_hash_policy_l34;
		else
			bond->xmit_hash_policy = bond_xmit_hash_policy_l2;
		break;
	case BOND_MODE_TLB:
	case BOND_MODE_ALB:
		bond_dev->hard_start_xmit = bond_alb_xmit;
		bond_dev->set_mac_address = bond_alb_set_mac_address;
		break;
	default:
		/* Should never happen, mode already checked */
		printk(KERN_ERR DRV_NAME
		       ": Error: Unknown bonding mode %d\n",
		       mode);
		break;
	}
}

/*
 * Does not allocate but creates a /proc entry.
 * Allowed to fail.
 */
static int __init bond_init(struct net_device *bond_dev, struct bond_params *params)
{
	struct bonding *bond = bond_dev->priv;

	dprintk("Begin bond_init for %s\n", bond_dev->name);

	/* initialize rwlocks */
	rwlock_init(&bond->lock);
	rwlock_init(&bond->curr_slave_lock);

	bond->params = *params; /* copy params struct */

	/* Initialize pointers */
	bond->first_slave = NULL;
	bond->curr_active_slave = NULL;
	bond->current_arp_slave = NULL;
	bond->primary_slave = NULL;
	bond->dev = bond_dev;
	INIT_LIST_HEAD(&bond->vlan_list);

	/* Initialize the device entry points */
	bond_dev->open = bond_open;
	bond_dev->stop = bond_close;
	bond_dev->get_stats = bond_get_stats;
	bond_dev->do_ioctl = bond_do_ioctl;
	bond_dev->set_multicast_list = bond_set_multicast_list;
	bond_dev->change_mtu = bond_change_mtu;
	bond_dev->set_mac_address = bond_set_mac_address;

	bond_set_mode_ops(bond,
			  bond->params.mode);

	bond_dev->destructor = free_netdev;

	/* Initialize the device options */
	bond_dev->tx_queue_len = 0;
	bond_dev->flags |= IFF_MASTER|IFF_MULTICAST;

	/* At first, we block adding VLANs. That's the only way to
	 * prevent problems that occur when adding VLANs over an
	 * empty bond. The block will be removed once non-challenged
	 * slaves are enslaved.
	 */
	bond_dev->features |= NETIF_F_VLAN_CHALLENGED;

	/* don't acquire bond device's xmit_lock when
	 * transmitting */
	bond_dev->features |= NETIF_F_LLTX;

	/* By default, we declare the bond to be fully
	 * VLAN hardware accelerated capable. Special
	 * care is taken in the various xmit functions
	 * when there are slaves that are not hw accel
	 * capable
	 */
	bond_dev->vlan_rx_register = bond_vlan_rx_register;
	bond_dev->vlan_rx_add_vid = bond_vlan_rx_add_vid;
	bond_dev->vlan_rx_kill_vid = bond_vlan_rx_kill_vid;
	bond_dev->features |= (NETIF_F_HW_VLAN_TX |
			       NETIF_F_HW_VLAN_RX |
			       NETIF_F_HW_VLAN_FILTER);

#ifdef CONFIG_PROC_FS
	bond_create_proc_entry(bond);
#endif

	list_add_tail(&bond->bond_list, &bond_dev_list);

	return 0;
}

/* De-initialize device specific data.
 * Caller must hold rtnl_lock.
 */
static inline void bond_deinit(struct net_device *bond_dev)
{
	struct bonding *bond = bond_dev->priv;

	list_del(&bond->bond_list);

#ifdef CONFIG_PROC_FS
	bond_remove_proc_entry(bond);
#endif
}

/* Unregister and free all bond devices.
 * Caller must hold rtnl_lock.
 */
static void bond_free_all(void)
{
	struct bonding *bond, *nxt;

	list_for_each_entry_safe(bond, nxt, &bond_dev_list, bond_list) {
		struct net_device *bond_dev = bond->dev;

		unregister_netdevice(bond_dev);
		bond_deinit(bond_dev);
	}

#ifdef CONFIG_PROC_FS
	bond_destroy_proc_dir();
#endif
}

/*------------------------- Module initialization ---------------------------*/

/*
 * Convert string input module parms. Accept either the
 * number of the mode or its string name.
 */
static inline int bond_parse_parm(char *mode_arg, struct bond_parm_tbl *tbl)
{
	int i;

	for (i = 0; tbl[i].modename; i++) {
		if ((isdigit(*mode_arg) &&
		     tbl[i].mode == simple_strtol(mode_arg, NULL, 0)) ||
		    (strncmp(mode_arg, tbl[i].modename,
			     strlen(tbl[i].modename)) == 0)) {
			return tbl[i].mode;
		}
	}

	return -1;
}

static int bond_check_params(struct bond_params *params)
{
	/*
	 * Convert string parameters.
	 */
	if (mode) {
		bond_mode = bond_parse_parm(mode, bond_mode_tbl);
		if (bond_mode == -1) {
			printk(KERN_ERR DRV_NAME
			       ": Error: Invalid bonding mode \"%s\"\n",
			       mode == NULL ? "NULL" : mode);
			return -EINVAL;
		}
	}

	if (xmit_hash_policy) {
		if ((bond_mode != BOND_MODE_XOR) &&
		    (bond_mode != BOND_MODE_8023AD)) {
			printk(KERN_INFO DRV_NAME
			       ": xmit_hash_policy param is irrelevant in mode %s\n",
			       bond_mode_name(bond_mode));
		} else {
			xmit_hashtype = bond_parse_parm(xmit_hash_policy,
							xmit_hashtype_tbl);
			if (xmit_hashtype == -1) {
				printk(KERN_ERR DRV_NAME
				       ": Error: Invalid xmit_hash_policy \"%s\"\n",
				       xmit_hash_policy == NULL ? "NULL" :
				       xmit_hash_policy);
				return -EINVAL;
			}
		}
	}

	if (lacp_rate) {
		if (bond_mode != BOND_MODE_8023AD) {
			printk(KERN_INFO DRV_NAME
			       ": lacp_rate param is irrelevant in mode %s\n",
			       bond_mode_name(bond_mode));
		} else {
			lacp_fast = bond_parse_parm(lacp_rate, bond_lacp_tbl);
			if (lacp_fast == -1) {
				printk(KERN_ERR DRV_NAME
				       ": Error: Invalid lacp rate \"%s\"\n",
				       lacp_rate == NULL ? "NULL" : lacp_rate);
				return -EINVAL;
			}
		}
	}

	if (max_bonds < 1 || max_bonds > INT_MAX) {
		printk(KERN_WARNING DRV_NAME
		       ": Warning: max_bonds (%d) not in range %d-%d, so it "
		       "was reset to BOND_DEFAULT_MAX_BONDS (%d)\n",
		       max_bonds, 1, INT_MAX, BOND_DEFAULT_MAX_BONDS);
		max_bonds = BOND_DEFAULT_MAX_BONDS;
	}

	if (miimon < 0) {
		printk(KERN_WARNING DRV_NAME
		       ": Warning: miimon module parameter (%d), "
		       "not in range 0-%d, so it was reset to %d\n",
		       miimon, INT_MAX, BOND_LINK_MON_INTERV);
		miimon = BOND_LINK_MON_INTERV;
	}

	if (updelay < 0) {
		printk(KERN_WARNING DRV_NAME
		       ": Warning: updelay module parameter (%d), "
		       "not in range 0-%d, so it was reset to 0\n",
		       updelay, INT_MAX);
		updelay = 0;
	}

	if (downdelay < 0) {
		printk(KERN_WARNING DRV_NAME
		       ": Warning: downdelay module parameter (%d), "
		       "not in range 0-%d, so it was reset to 0\n",
		       downdelay, INT_MAX);
		downdelay = 0;
	}

	if ((use_carrier != 0) && (use_carrier != 1)) {
		printk(KERN_WARNING DRV_NAME
		       ": Warning: use_carrier module parameter (%d), "
		       "not of valid value (0/1), so it was set to 1\n",
		       use_carrier);
		use_carrier = 1;
	}

	/* reset values for 802.3ad */
	if (bond_mode == BOND_MODE_8023AD) {
		if (!miimon) {
			printk(KERN_WARNING DRV_NAME
			       ": Warning: miimon must be specified, "
			       "otherwise bonding will not detect link "
			       "failure, speed and duplex which are "
			       "essential for 802.3ad operation\n");
			printk(KERN_WARNING "Forcing miimon to 100msec\n");
			miimon = 100;
		}
	}

	/* reset values for TLB/ALB */
	if ((bond_mode == BOND_MODE_TLB) ||
	    (bond_mode == BOND_MODE_ALB)) {
		if (!miimon) {
			printk(KERN_WARNING DRV_NAME
			       ": Warning: miimon must be specified, "
			       "otherwise bonding will not detect link "
			       "failure and link speed which are essential "
			       "for TLB/ALB load balancing\n");
			printk(KERN_WARNING "Forcing miimon to 100msec\n");
			miimon = 100;
		}
	}

	if (bond_mode == BOND_MODE_ALB) {
		printk(KERN_NOTICE DRV_NAME
		       ": In ALB mode you might experience client "
		       "disconnections upon reconnection of a link if the "
		       "bonding module updelay parameter (%d msec) is "
		       "incompatible with the forwarding delay time of the "
		       "switch\n",
		       updelay);
	}

	if (!miimon) {
		if (updelay || downdelay) {
			/* just warn the user the up/down delay will have
			 * no effect since miimon is zero...
			 */
			printk(KERN_WARNING DRV_NAME
			       ": Warning: miimon module parameter not set "
			       "and updelay (%d) or downdelay (%d) module "
			       "parameter is set; updelay and downdelay have "
			       "no effect unless miimon is set\n",
			       updelay, downdelay);
		}
	} else {
		/* don't allow arp monitoring */
		if (arp_interval) {
			printk(KERN_WARNING DRV_NAME
			       ": Warning: miimon (%d) and arp_interval (%d) "
			       "can't be used simultaneously, disabling ARP "
			       "monitoring\n",
			       miimon, arp_interval);
			arp_interval = 0;
		}

		if ((updelay % miimon) != 0) {
			printk(KERN_WARNING DRV_NAME
			       ": Warning: updelay (%d) is not a multiple "
			       "of miimon (%d), updelay rounded to %d ms\n",
			       updelay, miimon, (updelay / miimon) * miimon);
		}

		updelay /= miimon;

		if ((downdelay % miimon) != 0) {
			printk(KERN_WARNING DRV_NAME
			       ": Warning: downdelay (%d) is not a multiple "
			       "of miimon (%d), downdelay rounded to %d ms\n",
			       downdelay, miimon,
			       (downdelay / miimon) * miimon);
		}

		downdelay /= miimon;
	}

	if (arp_interval < 0) {
		printk(KERN_WARNING DRV_NAME
		       ": Warning: arp_interval module parameter (%d), "
		       "not in range 0-%d, so it was reset to %d\n",
		       arp_interval, INT_MAX, BOND_LINK_ARP_INTERV);
		arp_interval = BOND_LINK_ARP_INTERV;
	}

	for (arp_ip_count = 0;
	     (arp_ip_count < BOND_MAX_ARP_TARGETS) && arp_ip_target[arp_ip_count];
	     arp_ip_count++) {
		/* not complete check, but should be good enough to
		   catch mistakes */
		if (!isdigit(arp_ip_target[arp_ip_count][0])) {
			printk(KERN_WARNING DRV_NAME
			       ": Warning: bad arp_ip_target module parameter "
			       "(%s), ARP monitoring will not be performed\n",
			       arp_ip_target[arp_ip_count]);
			arp_interval = 0;
		} else {
			u32 ip = in_aton(arp_ip_target[arp_ip_count]);
			arp_target[arp_ip_count] = ip;
		}
	}

	if (arp_interval && !arp_ip_count) {
		/* don't allow arping if no arp_ip_target given... */
		printk(KERN_WARNING DRV_NAME
		       ": Warning: arp_interval module parameter (%d) "
		       "specified without providing an arp_ip_target "
		       "parameter, arp_interval was reset to 0\n",
		       arp_interval);
		arp_interval = 0;
	}

	if (miimon) {
		printk(KERN_INFO DRV_NAME
		       ": MII link monitoring set to %d ms\n",
		       miimon);
	} else if (arp_interval) {
		int i;

		printk(KERN_INFO DRV_NAME
		       ": ARP monitoring set to %d ms with %d target(s):",
		       arp_interval, arp_ip_count);

		for (i = 0; i < arp_ip_count; i++)
			printk (" %s", arp_ip_target[i]);

		printk("\n");

	} else {
		/* miimon and arp_interval not set, we need one so things
		 * work as expected, see bonding.txt for details
		 */
		printk(KERN_WARNING DRV_NAME
		       ": Warning: either miimon or arp_interval and "
		       "arp_ip_target module parameters must be specified, "
		       "otherwise bonding will not detect link failures! see "
		       "bonding.txt for details.\n");
	}

	if (primary && !USES_PRIMARY(bond_mode)) {
		/* currently, using a primary only makes sense
		 * in active backup, TLB or ALB modes
		 */
		printk(KERN_WARNING DRV_NAME
		       ": Warning: %s primary device specified but has no "
		       "effect in %s mode\n",
		       primary, bond_mode_name(bond_mode));
		primary = NULL;
	}

	/* fill params struct with the proper values */
	params->mode = bond_mode;
	params->xmit_policy = xmit_hashtype;
	params->miimon = miimon;
	params->arp_interval = arp_interval;
	params->updelay = updelay;
	params->downdelay = downdelay;
	params->use_carrier = use_carrier;
	params->lacp_fast = lacp_fast;
	params->primary[0] = 0;

	if (primary) {
		strncpy(params->primary, primary, IFNAMSIZ);
		params->primary[IFNAMSIZ - 1] = 0;
	}

	memcpy(params->arp_targets, arp_target, sizeof(arp_target));

	return 0;
}

static int __init bonding_init(void)
{
	struct bond_params params;
	int i;
	int res;

	printk(KERN_INFO "%s", version);

	res = bond_check_params(&params);
	if (res) {
		return res;
	}

	rtnl_lock();

#ifdef CONFIG_PROC_FS
	bond_create_proc_dir();
#endif

	for (i = 0; i < max_bonds; i++) {
		struct net_device *bond_dev;

		bond_dev = alloc_netdev(sizeof(struct bonding), "", ether_setup);
		if (!bond_dev) {
			res = -ENOMEM;
			goto out_err;
		}

		res = dev_alloc_name(bond_dev, "bond%d");
		if (res < 0) {
			free_netdev(bond_dev);
			goto out_err;
		}

		/* bond_init() must be called after dev_alloc_name() (for the
		 * /proc files), but before register_netdevice(), because we
		 * need to set function pointers.
		 */
		res = bond_init(bond_dev, &params);
		if (res < 0) {
			free_netdev(bond_dev);
			goto out_err;
		}

		SET_MODULE_OWNER(bond_dev);

		res = register_netdevice(bond_dev);
		if (res < 0) {
			bond_deinit(bond_dev);
			free_netdev(bond_dev);
			goto out_err;
		}
	}

	rtnl_unlock();
	register_netdevice_notifier(&bond_netdev_notifier);
	register_inetaddr_notifier(&bond_inetaddr_notifier);

	return 0;

out_err:
	/* free and unregister all bonds that were successfully added */
	bond_free_all();

	rtnl_unlock();

	return res;
}

static void __exit bonding_exit(void)
{
	unregister_netdevice_notifier(&bond_netdev_notifier);
	unregister_inetaddr_notifier(&bond_inetaddr_notifier);

	rtnl_lock();
	bond_free_all();
	rtnl_unlock();
}

module_init(bonding_init);
module_exit(bonding_exit);
MODULE_LICENSE("GPL");
MODULE_VERSION(DRV_VERSION);
MODULE_DESCRIPTION(DRV_DESCRIPTION ", v" DRV_VERSION);
MODULE_AUTHOR("Thomas Davis, tadavis@lbl.gov and many others");
MODULE_SUPPORTED_DEVICE("most ethernet devices");

/*
 * Local variables:
 *  c-indent-level: 8
 *  c-basic-offset: 8
 *  tab-width: 8
 * End:
 */