
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) mlx4 doesn't check fully for supported valid RSS hash function, fix
from Amir Vadai

2) Off by one in ibmveth_change_mtu(), from David Gibson

3) Prevent altera chip from reporting false error interrupts in some
circumstances, from Chee Nouk Phoon

4) Get rid of that stupid endless loop trying to allocate a FIN packet
in TCP, and in the process kill deadlocks. From Eric Dumazet

5) Fix get_rps_cpus() crash due to wrong invalid-cpu value, also from
Eric Dumazet

6) Fix two bugs in async rhashtable resizing, from Thomas Graf

7) Fix topology server listener socket namespace bug in TIPC, from Ying
Xue

8) Add some missing HAS_DMA kconfig dependencies, from Geert
Uytterhoeven

9) bgmac driver intends to force re-polling but does so by returning
the wrong value from its ->poll() handler. Fix from Rafał Miłecki

10) When the creator of an rhashtable configures a max size for it,
don't bark in the logs and drop insertions when that is exceeded.
Fix from Johannes Berg

11) Recover from out of order packets in ppp mppe properly, from Sylvain
Rochet

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
bnx2x: really disable TPA if 'disable_tpa' option is set
net:treewide: Fix typo in drivers/net
net/mlx4_en: Prevent setting invalid RSS hash function
mdio-mux-gpio: use new gpiod_get_array and gpiod_put_array functions
netfilter; Add some missing default cases to switch statements in nft_reject.
ppp: mppe: discard late packet in stateless mode
ppp: mppe: sanity error path rework
net/bonding: Make DRV macros private
net: rfs: fix crash in get_rps_cpus()
altera tse: add support for fixed-links.
pxa168: fix double deallocation of managed resources
net: fix crash in build_skb()
net: eth: altera: Resolve false errors from MSGDMA to TSE
ehea: Fix memory hook reference counting crashes
net/tg3: Release IRQs on permanent error
net: mdio-gpio: support access that may sleep
inet: fix possible panic in reqsk_queue_unlink()
rhashtable: don't attempt to grow when at max_size
bgmac: fix requests for extra polling calls from NAPI
tcp: avoid looping in tcp_send_fin()
...

+528 -318
+9
Documentation/networking/mpls-sysctl.txt
···
 
 Possible values: 0 - 1048575
 Default: 0
+
+conf/<interface>/input - BOOL
+	Control whether packets can be input on this interface.
+
+	If disabled, packets will be discarded without further
+	processing.
+
+	0 - disabled (default)
+	not 0 - enabled
+1 -1
Documentation/networking/scaling.txt
···
 
 - The current CPU's queue head counter >= the recorded tail counter
   value in rps_dev_flow[i]
-- The current CPU is unset (equal to RPS_NO_CPU)
+- The current CPU is unset (>= nr_cpu_ids)
 - The current CPU is offline
 
 After this check, the packet is sent to the (possibly updated) current
+2
drivers/net/bonding/bond_main.c
···
 #include <net/bond_3ad.h>
 #include <net/bond_alb.h>
 
+#include "bonding_priv.h"
+
 /*---------------------------- Module parameters ----------------------------*/
 
 /* monitor all links that often (in milliseconds). <=0 disables monitoring */
+1
drivers/net/bonding/bond_procfs.c
···
 #include <net/netns/generic.h>
 #include <net/bonding.h>
 
+#include "bonding_priv.h"
 
 static void *bond_info_seq_start(struct seq_file *seq, loff_t *pos)
 	__acquires(RCU)
+25
drivers/net/bonding/bonding_priv.h
···
+/*
+ * Bond several ethernet interfaces into a Cisco, running 'Etherchannel'.
+ *
+ * Portions are (c) Copyright 1995 Simon "Guru Aleph-Null" Janes
+ * NCM: Network and Communications Management, Inc.
+ *
+ * BUT, I'm the one who modified it for ethernet, so:
+ * (c) Copyright 1999, Thomas Davis, tadavis@lbl.gov
+ *
+ * This software may be used and distributed according to the terms
+ * of the GNU Public License, incorporated herein by reference.
+ *
+ */
+
+#ifndef _BONDING_PRIV_H
+#define _BONDING_PRIV_H
+
+#define DRV_VERSION	"3.7.1"
+#define DRV_RELDATE	"April 27, 2011"
+#define DRV_NAME	"bonding"
+#define DRV_DESCRIPTION	"Ethernet Channel Bonding Driver"
+
+#define bond_version DRV_DESCRIPTION ": v" DRV_VERSION " (" DRV_RELDATE ")\n"
+
+#endif
+1 -1
drivers/net/can/Kconfig
···
 
 config CAN_GRCAN
 	tristate "Aeroflex Gaisler GRCAN and GRHCAN CAN devices"
-	depends on OF
+	depends on OF && HAS_DMA
 	---help---
 	  Say Y here if you want to use Aeroflex Gaisler GRCAN or GRHCAN.
 	  Note that the driver supports little endian, even though little
+1 -1
drivers/net/can/usb/kvaser_usb.c
···
 
 	if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
 					 MSG_FLAG_NERR)) {
-		netdev_err(priv->netdev, "Unknow error (flags: 0x%02x)\n",
+		netdev_err(priv->netdev, "Unknown error (flags: 0x%02x)\n",
 			   msg->u.rx_can_header.flag);
 
 		stats->rx_errors++;
+1 -1
drivers/net/ethernet/8390/etherh.c
···
 	char *s;
 
 	if (!ecard_readchunk(&cd, ec, 0xf5, 0)) {
-		printk(KERN_ERR "%s: unable to read podule description string\n",
+		printk(KERN_ERR "%s: unable to read module description string\n",
 		       dev_name(&ec->dev));
 		goto no_addr;
 	}
+1 -4
drivers/net/ethernet/altera/altera_msgdmahw.h
···
 /* Tx buffer control flags
  */
 #define MSGDMA_DESC_CTL_TX_FIRST	(MSGDMA_DESC_CTL_GEN_SOP |	\
-					 MSGDMA_DESC_CTL_TR_ERR_IRQ |	\
 					 MSGDMA_DESC_CTL_GO)
 
-#define MSGDMA_DESC_CTL_TX_MIDDLE	(MSGDMA_DESC_CTL_TR_ERR_IRQ |	\
-					 MSGDMA_DESC_CTL_GO)
+#define MSGDMA_DESC_CTL_TX_MIDDLE	(MSGDMA_DESC_CTL_GO)
 
 #define MSGDMA_DESC_CTL_TX_LAST		(MSGDMA_DESC_CTL_GEN_EOP |	\
 					 MSGDMA_DESC_CTL_TR_COMP_IRQ |	\
-					 MSGDMA_DESC_CTL_TR_ERR_IRQ |	\
 					 MSGDMA_DESC_CTL_GO)
 
 #define MSGDMA_DESC_CTL_TX_SINGLE	(MSGDMA_DESC_CTL_GEN_SOP |	\
+29 -8
drivers/net/ethernet/altera/altera_tse_main.c
···
 	struct altera_tse_private *priv = netdev_priv(dev);
 	struct phy_device *phydev;
 	struct device_node *phynode;
+	bool fixed_link = false;
+	int rc = 0;
 
 	/* Avoid init phy in case of no phy present */
 	if (!priv->phy_iface)
···
 	phynode = of_parse_phandle(priv->device->of_node, "phy-handle", 0);
 
 	if (!phynode) {
-		netdev_dbg(dev, "no phy-handle found\n");
-		if (!priv->mdio) {
-			netdev_err(dev,
-				   "No phy-handle nor local mdio specified\n");
-			return -ENODEV;
+		/* check if a fixed-link is defined in device-tree */
+		if (of_phy_is_fixed_link(priv->device->of_node)) {
+			rc = of_phy_register_fixed_link(priv->device->of_node);
+			if (rc < 0) {
+				netdev_err(dev, "cannot register fixed PHY\n");
+				return rc;
+			}
+
+			/* In the case of a fixed PHY, the DT node associated
+			 * to the PHY is the Ethernet MAC DT node.
+			 */
+			phynode = of_node_get(priv->device->of_node);
+			fixed_link = true;
+
+			netdev_dbg(dev, "fixed-link detected\n");
+			phydev = of_phy_connect(dev, phynode,
+						&altera_tse_adjust_link,
+						0, priv->phy_iface);
+		} else {
+			netdev_dbg(dev, "no phy-handle found\n");
+			if (!priv->mdio) {
+				netdev_err(dev, "No phy-handle nor local mdio specified\n");
+				return -ENODEV;
+			}
+			phydev = connect_local_phy(dev);
 		}
-		phydev = connect_local_phy(dev);
 	} else {
 		netdev_dbg(dev, "phy-handle found\n");
 		phydev = of_phy_connect(dev, phynode,
···
 	/* Broken HW is sometimes missing the pull-up resistor on the
 	 * MDIO line, which results in reads to non-existent devices returning
 	 * 0 rather than 0xffff. Catch this here and treat 0 as a non-existent
-	 * device as well.
+	 * device as well. If a fixed-link is used the phy_id is always 0.
 	 * Note: phydev->phy_id is the result of reading the UID PHY registers.
 	 */
-	if (phydev->phy_id == 0) {
+	if ((phydev->phy_id == 0) && !fixed_link) {
 		netdev_err(dev, "Bad PHY UID 0x%08x\n", phydev->phy_id);
 		phy_disconnect(phydev);
 		return -ENODEV;
+1 -1
drivers/net/ethernet/amd/Kconfig
···
 
 config AMD_XGBE
 	tristate "AMD 10GbE Ethernet driver"
-	depends on (OF_NET || ACPI) && HAS_IOMEM
+	depends on (OF_NET || ACPI) && HAS_IOMEM && HAS_DMA
 	select PHYLIB
 	select AMD_XGBE_PHY
 	select BITREVERSE
+2 -3
drivers/net/ethernet/arc/Kconfig
···
 config ARC_EMAC
 	tristate "ARC EMAC support"
 	select ARC_EMAC_CORE
-	depends on OF_IRQ
-	depends on OF_NET
+	depends on OF_IRQ && OF_NET && HAS_DMA
 	---help---
 	  On some legacy ARC (Synopsys) FPGA boards such as ARCAngel4/ML50x
 	  non-standard on-chip ethernet device ARC EMAC 10/100 is used.
···
 config EMAC_ROCKCHIP
 	tristate "Rockchip EMAC support"
 	select ARC_EMAC_CORE
-	depends on OF_IRQ && OF_NET && REGULATOR
+	depends on OF_IRQ && OF_NET && REGULATOR && HAS_DMA
 	---help---
 	  Support for Rockchip RK3066/RK3188 EMAC ethernet controllers.
 	  This selects Rockchip SoC glue layer support for the
+1 -1
drivers/net/ethernet/broadcom/bgmac.c
···
 
 	/* Poll again if more events arrived in the meantime */
 	if (bgmac_read(bgmac, BGMAC_INT_STATUS) & (BGMAC_IS_TX0 | BGMAC_IS_RX))
-		return handled;
+		return weight;
 
 	if (handled < weight) {
 		napi_complete(napi);
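The bgmac fix above hinges on the NAPI contract: a ->poll() handler that wants to be called again must consume its whole budget. A standalone sketch of that convention follows; `example_poll` and its values are illustrative stand-ins, not the real driver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative sketch of the NAPI ->poll() return convention the bgmac
 * fix relies on. Returning the full budget ("weight") keeps the device
 * on the poll list; returning less than the budget signals that polling
 * may stop. example_poll() is a made-up stand-in for the handler. */
static int example_poll(int handled, int weight, bool more_events_pending)
{
	/* More events arrived while polling: request another poll round by
	 * consuming the whole budget, not just the packets we handled. */
	if (more_events_pending)
		return weight;

	/* Under budget: the real handler would napi_complete() here and
	 * re-enable interrupts before returning. */
	return handled;
}
```

Returning `handled` in the "more events pending" branch, as the old code did, could be below the budget and so told NAPI to stop polling, the opposite of the intent.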
+38 -11
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
···
 	else if (bp->flags & GRO_ENABLE_FLAG)
 		fp->mode = TPA_MODE_GRO;
 
-	/* We don't want TPA on an FCoE L2 ring */
-	if (IS_FCOE_FP(fp))
+	/* We don't want TPA if it's disabled in bp
+	 * or if this is an FCoE L2 ring.
+	 */
+	if (bp->disable_tpa || IS_FCOE_FP(fp))
 		fp->disable_tpa = 1;
 }
···
 {
 	struct bnx2x *bp = netdev_priv(dev);
 
+	if (pci_num_vf(bp->pdev)) {
+		netdev_features_t changed = dev->features ^ features;
+
+		/* Revert the requested changes in features if they
+		 * would require internal reload of PF in bnx2x_set_features().
+		 */
+		if (!(features & NETIF_F_RXCSUM) && !bp->disable_tpa) {
+			features &= ~NETIF_F_RXCSUM;
+			features |= dev->features & NETIF_F_RXCSUM;
+		}
+
+		if (changed & NETIF_F_LOOPBACK) {
+			features &= ~NETIF_F_LOOPBACK;
+			features |= dev->features & NETIF_F_LOOPBACK;
+		}
+	}
+
 	/* TPA requires Rx CSUM offloading */
 	if (!(features & NETIF_F_RXCSUM)) {
 		features &= ~NETIF_F_LRO;
···
 	else
 		flags &= ~GRO_ENABLE_FLAG;
 
-	if (features & NETIF_F_LOOPBACK) {
-		if (bp->link_params.loopback_mode != LOOPBACK_BMAC) {
-			bp->link_params.loopback_mode = LOOPBACK_BMAC;
-			bnx2x_reload = true;
-		}
-	} else {
-		if (bp->link_params.loopback_mode != LOOPBACK_NONE) {
-			bp->link_params.loopback_mode = LOOPBACK_NONE;
-			bnx2x_reload = true;
+	/* VFs or non SRIOV PFs should be able to change loopback feature */
+	if (!pci_num_vf(bp->pdev)) {
+		if (features & NETIF_F_LOOPBACK) {
+			if (bp->link_params.loopback_mode != LOOPBACK_BMAC) {
+				bp->link_params.loopback_mode = LOOPBACK_BMAC;
+				bnx2x_reload = true;
+			}
+		} else {
+			if (bp->link_params.loopback_mode != LOOPBACK_NONE) {
+				bp->link_params.loopback_mode = LOOPBACK_NONE;
+				bnx2x_reload = true;
+			}
 		}
 	}
···
 		return -ENODEV;
 	}
 	bp = netdev_priv(dev);
+
+	if (pci_num_vf(bp->pdev)) {
+		DP(BNX2X_MSG_IOV, "VFs are enabled, can not change MTU\n");
+		return -EPERM;
+	}
 
 	if (bp->recovery_state != BNX2X_RECOVERY_DONE) {
 		BNX2X_ERR("Handling parity error recovery. Try again later\n");
+17
drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
···
 		   "set ring params command parameters: rx_pending = %d, tx_pending = %d\n",
 		   ering->rx_pending, ering->tx_pending);
 
+	if (pci_num_vf(bp->pdev)) {
+		DP(BNX2X_MSG_IOV,
+		   "VFs are enabled, can not change ring parameters\n");
+		return -EPERM;
+	}
+
 	if (bp->recovery_state != BNX2X_RECOVERY_DONE) {
 		DP(BNX2X_MSG_ETHTOOL,
 		   "Handling parity error recovery. Try again later\n");
···
 	u8 is_serdes, link_up;
 	int rc, cnt = 0;
 
+	if (pci_num_vf(bp->pdev)) {
+		DP(BNX2X_MSG_IOV,
+		   "VFs are enabled, can not perform self test\n");
+		return;
+	}
+
 	if (bp->recovery_state != BNX2X_RECOVERY_DONE) {
 		netdev_err(bp->dev,
 			   "Handling parity error recovery. Try again later\n");
···
 		   "set-channels command parameters: rx = %d, tx = %d, other = %d, combined = %d\n",
 		   channels->rx_count, channels->tx_count, channels->other_count,
 		   channels->combined_count);
+
+	if (pci_num_vf(bp->pdev)) {
+		DP(BNX2X_MSG_IOV, "VFs are enabled, can not set channels\n");
+		return -EPERM;
+	}
 
 	/* We don't support separate rx / tx channels.
 	 * We don't allow setting 'other' channels.
+3 -1
drivers/net/ethernet/broadcom/tg3.c
···
 
 	rtnl_lock();
 
-	tp->pcierr_recovery = true;
+	/* We needn't recover from permanent error */
+	if (state == pci_channel_io_frozen)
+		tp->pcierr_recovery = true;
 
 	/* We probably don't have netdev yet */
 	if (!netdev || !netif_running(netdev))
+2 -2
drivers/net/ethernet/cadence/macb.c
···
 	for (i = 0; i < TX_RING_SIZE; i++) {
 		bp->queues[0].tx_ring[i].addr = 0;
 		bp->queues[0].tx_ring[i].ctrl = MACB_BIT(TX_USED);
-		bp->queues[0].tx_head = 0;
-		bp->queues[0].tx_tail = 0;
 	}
+	bp->queues[0].tx_head = 0;
+	bp->queues[0].tx_tail = 0;
 	bp->queues[0].tx_ring[TX_RING_SIZE - 1].ctrl |= MACB_BIT(TX_WRAP);
 
 	bp->rx_tail = 0;
+4 -2
drivers/net/ethernet/ibm/ehea/ehea_main.c
···
 {
 	int ret = 0;
 
-	if (atomic_inc_and_test(&ehea_memory_hooks_registered))
+	if (atomic_inc_return(&ehea_memory_hooks_registered) > 1)
 		return 0;
 
 	ret = ehea_create_busmap();
···
 out2:
 	unregister_reboot_notifier(&ehea_reboot_nb);
 out:
+	atomic_dec(&ehea_memory_hooks_registered);
 	return ret;
 }
 
 static void ehea_unregister_memory_hooks(void)
 {
-	if (atomic_read(&ehea_memory_hooks_registered))
+	/* Only remove the hooks if we've registered them */
+	if (atomic_read(&ehea_memory_hooks_registered) == 0)
 		return;
 
 	unregister_reboot_notifier(&ehea_reboot_nb);
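The ehea refcount bug above is subtle: atomic_inc_and_test() fires only when the counter reaches zero, which a 0-initialized registration count never does on its first increment. A userspace model of the corrected logic, using C11 atomics in place of the kernel's atomic_t (the names here are ours, not ehea's):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Model of the ehea fix: decide "am I the first registrant?" from the
 * post-increment value, as atomic_inc_return() does. With the old
 * atomic_inc_and_test() the test was true only at zero, so the first
 * caller (0 -> 1) was never detected and registration was skipped. */
static atomic_int hooks_registered = 0;

/* Returns true if this caller is first and must perform registration. */
static bool register_hooks_needed(void)
{
	/* atomic_fetch_add returns the previous value; +1 yields the
	 * post-increment value that the kernel's atomic_inc_return()
	 * reports directly. */
	return atomic_fetch_add(&hooks_registered, 1) + 1 == 1;
}
```

Only the first caller sees a post-increment value of 1; every later caller sees 2 or more and returns early, matching `> 1` in the fixed kernel code.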
+2 -2
drivers/net/ethernet/ibm/ibmveth.c
···
 		return -EINVAL;
 
 	for (i = 0; i < IBMVETH_NUM_BUFF_POOLS; i++)
-		if (new_mtu_oh < adapter->rx_buff_pool[i].buff_size)
+		if (new_mtu_oh <= adapter->rx_buff_pool[i].buff_size)
 			break;
 
 	if (i == IBMVETH_NUM_BUFF_POOLS)
···
 	for (i = 0; i < IBMVETH_NUM_BUFF_POOLS; i++) {
 		adapter->rx_buff_pool[i].active = 1;
 
-		if (new_mtu_oh < adapter->rx_buff_pool[i].buff_size) {
+		if (new_mtu_oh <= adapter->rx_buff_pool[i].buff_size) {
 			dev->mtu = new_mtu;
 			vio_cmo_set_dev_desired(viodev,
 						ibmveth_get_desired_dma
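The ibmveth change is a classic inclusive-bound fix: an MTU that exactly matches a pool's buffer size fits that pool, so `<` wrongly skipped the exact-fit case. A toy version of the comparison (sizes below are made up for illustration):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy version of the ibmveth pool-selection test. A frame of exactly
 * buff_size bytes fits the pool, so the bound must be inclusive; the
 * old '<' rejected the exact fit (the off-by-one in the pull text). */
static bool mtu_fits_pool(unsigned int new_mtu_oh, unsigned int buff_size)
{
	return new_mtu_oh <= buff_size;	/* was: new_mtu_oh < buff_size */
}
```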
+5 -11
drivers/net/ethernet/marvell/pxa168_eth.c
···
 	np = of_parse_phandle(pdev->dev.of_node, "phy-handle", 0);
 	if (!np) {
 		dev_err(&pdev->dev, "missing phy-handle\n");
-		return -EINVAL;
+		err = -EINVAL;
+		goto err_netdev;
 	}
 	of_property_read_u32(np, "reg", &pep->phy_addr);
 	pep->phy_intf = of_get_phy_mode(pdev->dev.of_node);
···
 	pep->smi_bus = mdiobus_alloc();
 	if (pep->smi_bus == NULL) {
 		err = -ENOMEM;
-		goto err_base;
+		goto err_netdev;
 	}
 	pep->smi_bus->priv = pep;
 	pep->smi_bus->name = "pxa168_eth smi";
···
 	mdiobus_unregister(pep->smi_bus);
 err_free_mdio:
 	mdiobus_free(pep->smi_bus);
-err_base:
-	iounmap(pep->base);
 err_netdev:
 	free_netdev(dev);
 err_clk:
-	clk_disable(clk);
-	clk_put(clk);
+	clk_disable_unprepare(clk);
 	return err;
 }
···
 	if (pep->phy)
 		phy_disconnect(pep->phy);
 	if (pep->clk) {
-		clk_disable(pep->clk);
-		clk_put(pep->clk);
-		pep->clk = NULL;
+		clk_disable_unprepare(pep->clk);
 	}
 
-	iounmap(pep->base);
-	pep->base = NULL;
 	mdiobus_unregister(pep->smi_bus);
 	mdiobus_free(pep->smi_bus);
 	unregister_netdev(dev);
+16 -13
drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
···
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 
 	/* check if requested function is supported by the device */
-	if ((hfunc == ETH_RSS_HASH_TOP &&
-	     !(priv->mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_RSS_TOP)) ||
-	    (hfunc == ETH_RSS_HASH_XOR &&
-	     !(priv->mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_RSS_XOR)))
-		return -EINVAL;
+	if (hfunc == ETH_RSS_HASH_TOP) {
+		if (!(priv->mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_RSS_TOP))
+			return -EINVAL;
+		if (!(dev->features & NETIF_F_RXHASH))
+			en_warn(priv, "Toeplitz hash function should be used in conjunction with RX hashing for optimal performance\n");
+		return 0;
+	} else if (hfunc == ETH_RSS_HASH_XOR) {
+		if (!(priv->mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_RSS_XOR))
+			return -EINVAL;
+		if (dev->features & NETIF_F_RXHASH)
+			en_warn(priv, "Enabling both XOR Hash function and RX Hashing can limit RPS functionality\n");
+		return 0;
+	}
 
-	priv->rss_hash_fn = hfunc;
-	if (hfunc == ETH_RSS_HASH_TOP && !(dev->features & NETIF_F_RXHASH))
-		en_warn(priv,
-			"Toeplitz hash function should be used in conjunction with RX hashing for optimal performance\n");
-	if (hfunc == ETH_RSS_HASH_XOR && (dev->features & NETIF_F_RXHASH))
-		en_warn(priv,
-			"Enabling both XOR Hash function and RX Hashing can limit RPS functionality\n");
-	return 0;
+	return -EINVAL;
 }
···
 	priv->prof->rss_rings = rss_rings;
 	if (key)
 		memcpy(priv->rss_key, key, MLX4_EN_RSS_KEY_SIZE);
+	if (hfunc != ETH_RSS_HASH_NO_CHANGE)
+		priv->rss_hash_fn = hfunc;
 
 	if (port_up) {
 		err = mlx4_en_start_port(dev);
+9 -29
drivers/net/ethernet/myricom/myri10ge/myri10ge.c
···
 #include <net/ip.h>
 #include <net/tcp.h>
 #include <asm/byteorder.h>
-#include <asm/io.h>
 #include <asm/processor.h>
-#ifdef CONFIG_MTRR
-#include <asm/mtrr.h>
-#endif
 #include <net/busy_poll.h>
 
 #include "myri10ge_mcp.h"
···
 	unsigned int rdma_tags_available;
 	int intr_coal_delay;
 	__be32 __iomem *intr_coal_delay_ptr;
-	int mtrr;
-	int wc_enabled;
+	int wc_cookie;
 	int down_cnt;
 	wait_queue_head_t down_wq;
 	struct work_struct watchdog_work;
···
 	"tx_aborted_errors", "tx_carrier_errors", "tx_fifo_errors",
 	"tx_heartbeat_errors", "tx_window_errors",
 	/* device-specific stats */
-	"tx_boundary", "WC", "irq", "MSI", "MSIX",
+	"tx_boundary", "irq", "MSI", "MSIX",
 	"read_dma_bw_MBs", "write_dma_bw_MBs", "read_write_dma_bw_MBs",
 	"serial_number", "watchdog_resets",
 #ifdef CONFIG_MYRI10GE_DCA
···
 		data[i] = ((u64 *)&link_stats)[i];
 
 	data[i++] = (unsigned int)mgp->tx_boundary;
-	data[i++] = (unsigned int)mgp->wc_enabled;
 	data[i++] = (unsigned int)mgp->pdev->irq;
 	data[i++] = (unsigned int)mgp->msi_enabled;
 	data[i++] = (unsigned int)mgp->msix_enabled;
···
 
 	mgp->board_span = pci_resource_len(pdev, 0);
 	mgp->iomem_base = pci_resource_start(pdev, 0);
-	mgp->mtrr = -1;
-	mgp->wc_enabled = 0;
-#ifdef CONFIG_MTRR
-	mgp->mtrr = mtrr_add(mgp->iomem_base, mgp->board_span,
-			     MTRR_TYPE_WRCOMB, 1);
-	if (mgp->mtrr >= 0)
-		mgp->wc_enabled = 1;
-#endif
+	mgp->wc_cookie = arch_phys_wc_add(mgp->iomem_base, mgp->board_span);
 	mgp->sram = ioremap_wc(mgp->iomem_base, mgp->board_span);
 	if (mgp->sram == NULL) {
 		dev_err(&pdev->dev, "ioremap failed for %ld bytes at 0x%lx\n",
···
 		goto abort_with_state;
 	}
 	if (mgp->msix_enabled)
-		dev_info(dev, "%d MSI-X IRQs, tx bndry %d, fw %s, WC %s\n",
+		dev_info(dev, "%d MSI-X IRQs, tx bndry %d, fw %s, MTRR %s, WC Enabled\n",
 			 mgp->num_slices, mgp->tx_boundary, mgp->fw_name,
-			 (mgp->wc_enabled ? "Enabled" : "Disabled"));
+			 (mgp->wc_cookie > 0 ? "Enabled" : "Disabled"));
 	else
-		dev_info(dev, "%s IRQ %d, tx bndry %d, fw %s, WC %s\n",
+		dev_info(dev, "%s IRQ %d, tx bndry %d, fw %s, MTRR %s, WC Enabled\n",
 			 mgp->msi_enabled ? "MSI" : "xPIC",
 			 pdev->irq, mgp->tx_boundary, mgp->fw_name,
-			 (mgp->wc_enabled ? "Enabled" : "Disabled"));
+			 (mgp->wc_cookie > 0 ? "Enabled" : "Disabled"));
 
 	board_number++;
 	return 0;
···
 	iounmap(mgp->sram);
 
 abort_with_mtrr:
-#ifdef CONFIG_MTRR
-	if (mgp->mtrr >= 0)
-		mtrr_del(mgp->mtrr, mgp->iomem_base, mgp->board_span);
-#endif
+	arch_phys_wc_del(mgp->wc_cookie);
 	dma_free_coherent(&pdev->dev, sizeof(*mgp->cmd),
 			  mgp->cmd, mgp->cmd_bus);
···
 	pci_restore_state(pdev);
 
 	iounmap(mgp->sram);
-
-#ifdef CONFIG_MTRR
-	if (mgp->mtrr >= 0)
-		mtrr_del(mgp->mtrr, mgp->iomem_base, mgp->board_span);
-#endif
+	arch_phys_wc_del(mgp->wc_cookie);
 	myri10ge_free_slices(mgp);
 	kfree(mgp->msix_vectors);
 	dma_free_coherent(&pdev->dev, sizeof(*mgp->cmd),
+9 -5
drivers/net/phy/mdio-gpio.c
···
 	 * assume the pin serves as pull-up. If direction is
 	 * output, the default value is high.
 	 */
-	gpio_set_value(bitbang->mdo, 1 ^ bitbang->mdo_active_low);
+	gpio_set_value_cansleep(bitbang->mdo,
+				1 ^ bitbang->mdo_active_low);
 	return;
 }
···
 	struct mdio_gpio_info *bitbang =
 		container_of(ctrl, struct mdio_gpio_info, ctrl);
 
-	return gpio_get_value(bitbang->mdio) ^ bitbang->mdio_active_low;
+	return gpio_get_value_cansleep(bitbang->mdio) ^
+		bitbang->mdio_active_low;
 }
 
 static void mdio_set(struct mdiobb_ctrl *ctrl, int what)
···
 		container_of(ctrl, struct mdio_gpio_info, ctrl);
 
 	if (bitbang->mdo)
-		gpio_set_value(bitbang->mdo, what ^ bitbang->mdo_active_low);
+		gpio_set_value_cansleep(bitbang->mdo,
+					what ^ bitbang->mdo_active_low);
 	else
-		gpio_set_value(bitbang->mdio, what ^ bitbang->mdio_active_low);
+		gpio_set_value_cansleep(bitbang->mdio,
+					what ^ bitbang->mdio_active_low);
 }
 
 static void mdc_set(struct mdiobb_ctrl *ctrl, int what)
···
 	struct mdio_gpio_info *bitbang =
 		container_of(ctrl, struct mdio_gpio_info, ctrl);
 
-	gpio_set_value(bitbang->mdc, what ^ bitbang->mdc_active_low);
+	gpio_set_value_cansleep(bitbang->mdc, what ^ bitbang->mdc_active_low);
 }
 
 static struct mdiobb_ops mdio_gpio_ops = {
+17 -43
drivers/net/phy/mdio-mux-gpio.c
···
 #include <linux/module.h>
 #include <linux/phy.h>
 #include <linux/mdio-mux.h>
-#include <linux/of_gpio.h>
+#include <linux/gpio/consumer.h>
 
 #define DRV_VERSION "1.1"
 #define DRV_DESCRIPTION "GPIO controlled MDIO bus multiplexer driver"
 
-#define MDIO_MUX_GPIO_MAX_BITS 8
-
 struct mdio_mux_gpio_state {
-	struct gpio_desc *gpio[MDIO_MUX_GPIO_MAX_BITS];
-	unsigned int num_gpios;
+	struct gpio_descs *gpios;
 	void *mux_handle;
 };
 
 static int mdio_mux_gpio_switch_fn(int current_child, int desired_child,
 				   void *data)
 {
-	int values[MDIO_MUX_GPIO_MAX_BITS];
-	unsigned int n;
 	struct mdio_mux_gpio_state *s = data;
+	int values[s->gpios->ndescs];
+	unsigned int n;
 
 	if (current_child == desired_child)
 		return 0;
 
-	for (n = 0; n < s->num_gpios; n++) {
+	for (n = 0; n < s->gpios->ndescs; n++)
 		values[n] = (desired_child >> n) & 1;
-	}
-	gpiod_set_array_cansleep(s->num_gpios, s->gpio, values);
+
+	gpiod_set_array_cansleep(s->gpios->ndescs, s->gpios->desc, values);
 
 	return 0;
 }
···
 static int mdio_mux_gpio_probe(struct platform_device *pdev)
 {
 	struct mdio_mux_gpio_state *s;
-	int num_gpios;
-	unsigned int n;
 	int r;
-
-	if (!pdev->dev.of_node)
-		return -ENODEV;
-
-	num_gpios = of_gpio_count(pdev->dev.of_node);
-	if (num_gpios <= 0 || num_gpios > MDIO_MUX_GPIO_MAX_BITS)
-		return -ENODEV;
 
 	s = devm_kzalloc(&pdev->dev, sizeof(*s), GFP_KERNEL);
 	if (!s)
 		return -ENOMEM;
 
-	s->num_gpios = num_gpios;
-
-	for (n = 0; n < num_gpios; ) {
-		struct gpio_desc *gpio = gpiod_get_index(&pdev->dev, NULL, n,
-							 GPIOD_OUT_LOW);
-		if (IS_ERR(gpio)) {
-			r = PTR_ERR(gpio);
-			goto err;
-		}
-		s->gpio[n] = gpio;
-		n++;
-	}
+	s->gpios = gpiod_get_array(&pdev->dev, NULL, GPIOD_OUT_LOW);
+	if (IS_ERR(s->gpios))
+		return PTR_ERR(s->gpios);
 
 	r = mdio_mux_init(&pdev->dev,
 			  mdio_mux_gpio_switch_fn, &s->mux_handle, s);
 
-	if (r == 0) {
-		pdev->dev.platform_data = s;
-		return 0;
+	if (r != 0) {
+		gpiod_put_array(s->gpios);
+		return r;
 	}
-err:
-	while (n) {
-		n--;
-		gpiod_put(s->gpio[n]);
-	}
-	return r;
+
+	pdev->dev.platform_data = s;
+	return 0;
 }
 
 static int mdio_mux_gpio_remove(struct platform_device *pdev)
 {
-	unsigned int n;
 	struct mdio_mux_gpio_state *s = dev_get_platdata(&pdev->dev);
 	mdio_mux_uninit(s->mux_handle);
-	for (n = 0; n < s->num_gpios; n++)
-		gpiod_put(s->gpio[n]);
+	gpiod_put_array(s->gpios);
 	return 0;
 }
+20 -16
drivers/net/ppp/ppp_mppe.c
···
 	struct blkcipher_desc desc = { .tfm = state->arc4 };
 	unsigned ccount;
 	int flushed = MPPE_BITS(ibuf) & MPPE_BIT_FLUSHED;
-	int sanity = 0;
 	struct scatterlist sg_in[1], sg_out[1];
 
 	if (isize <= PPP_HDRLEN + MPPE_OVHD) {
···
 		       "mppe_decompress[%d]: ENCRYPTED bit not set!\n",
 		       state->unit);
 		state->sanity_errors += 100;
-		sanity = 1;
+		goto sanity_error;
 	}
 	if (!state->stateful && !flushed) {
 		printk(KERN_DEBUG "mppe_decompress[%d]: FLUSHED bit not set in "
 		       "stateless mode!\n", state->unit);
 		state->sanity_errors += 100;
-		sanity = 1;
+		goto sanity_error;
 	}
 	if (state->stateful && ((ccount & 0xff) == 0xff) && !flushed) {
 		printk(KERN_DEBUG "mppe_decompress[%d]: FLUSHED bit not set on "
 		       "flag packet!\n", state->unit);
 		state->sanity_errors += 100;
-		sanity = 1;
-	}
-
-	if (sanity) {
-		if (state->sanity_errors < SANITY_MAX)
-			return DECOMP_ERROR;
-		else
-			/*
-			 * Take LCP down if the peer is sending too many bogons.
-			 * We don't want to do this for a single or just a few
-			 * instances since it could just be due to packet corruption.
-			 */
-			return DECOMP_FATALERROR;
+		goto sanity_error;
 	}
 
 	/*
···
 	 */
 
 	if (!state->stateful) {
+		/* Discard late packet */
+		if ((ccount - state->ccount) % MPPE_CCOUNT_SPACE
+						> MPPE_CCOUNT_SPACE / 2) {
+			state->sanity_errors++;
+			goto sanity_error;
+		}
+
 		/* RFC 3078, sec 8.1. Rekey for every packet. */
 		while (state->ccount != ccount) {
 			mppe_rekey(state, 0);
···
 	state->sanity_errors >>= 1;
 
 	return osize;
+
+sanity_error:
+	if (state->sanity_errors < SANITY_MAX)
+		return DECOMP_ERROR;
+	else
+		/* Take LCP down if the peer is sending too many bogons.
+		 * We don't want to do this for a single or just a few
+		 * instances since it could just be due to packet corruption.
+		 */
+		return DECOMP_FATALERROR;
 }
 
 /*
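The late-packet test added above works on a wrapping 12-bit coherency count, so "late" means the received count lies in the half-space behind the expected one, modulo MPPE_CCOUNT_SPACE (0x1000 in the driver). A self-contained sketch of that arithmetic; the helper name is ours:

```c
#include <assert.h>
#include <stdbool.h>

#define MPPE_CCOUNT_SPACE 0x1000	/* 12-bit coherency count, as in ppp_mppe */

/* Sketch of the stateless-mode late-packet check: unsigned subtraction
 * absorbs the wraparound, and a modular distance of more than half the
 * space means the packet is really behind us (late) and must be
 * discarded rather than used to drive rekeying forward. */
static bool ccount_is_late(unsigned int expected, unsigned int received)
{
	return ((received - expected) % MPPE_CCOUNT_SPACE) >
	       MPPE_CCOUNT_SPACE / 2;
}
```

Without this check, a single out-of-order packet forced the rekey loop to walk almost the entire 4096-count space, which is the recovery problem item 11 describes.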
+1 -5
drivers/net/vxlan.c
···
 		/* Only change unicasts */
 		if (!(is_multicast_ether_addr(f->eth_addr) ||
 		      is_zero_ether_addr(f->eth_addr))) {
-			int rc = vxlan_fdb_replace(f, ip, port, vni,
+			notify |= vxlan_fdb_replace(f, ip, port, vni,
 						   ifindex);
-
-			if (rc < 0)
-				return rc;
-			notify |= rc;
 		} else
 			return -EOPNOTSUPP;
 	}
+7 -3
include/linux/netdevice.h
···
 struct wireless_dev;
 /* 802.15.4 specific */
 struct wpan_dev;
+struct mpls_dev;
 
 void netdev_set_default_ethtool_ops(struct net_device *dev,
 				    const struct ethtool_ops *ops);
···
 	void			*ax25_ptr;
 	struct wireless_dev	*ieee80211_ptr;
 	struct wpan_dev		*ieee802154_ptr;
+#if IS_ENABLED(CONFIG_MPLS_ROUTING)
+	struct mpls_dev __rcu	*mpls_ptr;
+#endif
 
 /*
  * Cache lines mostly used on receive path (including eth_type_trans())
···
 ({									\
 	typeof(type) __percpu *pcpu_stats = alloc_percpu(type);		\
 	if (pcpu_stats)	{						\
-		int i;							\
-		for_each_possible_cpu(i) {				\
+		int __cpu;						\
+		for_each_possible_cpu(__cpu) {				\
 			typeof(type) *stat;				\
-			stat = per_cpu_ptr(pcpu_stats, i);		\
+			stat = per_cpu_ptr(pcpu_stats, __cpu);		\
 			u64_stats_init(&stat->syncp);			\
 		}							\
 	}								\
+2 -1
include/linux/rhashtable.h
···
 static inline bool rht_grow_above_100(const struct rhashtable *ht,
 				      const struct bucket_table *tbl)
 {
-	return atomic_read(&ht->nelems) > tbl->size;
+	return atomic_read(&ht->nelems) > tbl->size &&
+		(!ht->p.max_size || tbl->size < ht->p.max_size);
 }
 
 /* The bucket lock is selected based on the hash and protects mutations
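The inline above encodes item 10 from the summary: grow when load passes 100%, but only while the table is still below any configured max_size, with 0 meaning "no limit". A plain-int model of the predicate (kernel atomics dropped for clarity):

```c
#include <assert.h>
#include <stdbool.h>

/* Plain-int model of the fixed rht_grow_above_100(): growth is desired
 * past 100% load unless a max_size is configured and the table already
 * reached it. max_size == 0 means "no limit", mirroring ht->p.max_size. */
static bool grow_above_100(int nelems, int size, int max_size)
{
	return nelems > size && (!max_size || size < max_size);
}
```

The effect of the fix is the last clause: once `size` hits `max_size`, the predicate goes false and insertions proceed into the existing buckets instead of triggering a doomed (and previously noisy) resize.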
+1
include/linux/skbuff.h
···
 
 struct sk_buff *__alloc_skb(unsigned int size, gfp_t priority, int flags,
 			    int node);
+struct sk_buff *__build_skb(void *data, unsigned int frag_size);
 struct sk_buff *build_skb(void *data, unsigned int frag_size);
 static inline struct sk_buff *alloc_skb(unsigned int size,
 					gfp_t priority)
-7
include/net/bonding.h
···
 #include <net/bond_alb.h>
 #include <net/bond_options.h>
 
-#define DRV_VERSION	"3.7.1"
-#define DRV_RELDATE	"April 27, 2011"
-#define DRV_NAME	"bonding"
-#define DRV_DESCRIPTION	"Ethernet Channel Bonding Driver"
-
-#define bond_version DRV_DESCRIPTION ": v" DRV_VERSION " (" DRV_RELDATE ")\n"
-
 #define BOND_MAX_ARP_TARGETS	16
 
 #define BOND_DEFAULT_MIIMON	100
+1 -19
include/net/inet_connection_sock.h
···
 void inet_csk_reqsk_queue_hash_add(struct sock *sk, struct request_sock *req,
 				   unsigned long timeout);
 
-static inline void inet_csk_reqsk_queue_removed(struct sock *sk,
-						struct request_sock *req)
-{
-	reqsk_queue_removed(&inet_csk(sk)->icsk_accept_queue, req);
-}
-
 static inline void inet_csk_reqsk_queue_added(struct sock *sk,
 					      const unsigned long timeout)
 {
···
 	return reqsk_queue_is_full(&inet_csk(sk)->icsk_accept_queue);
 }
 
-static inline void inet_csk_reqsk_queue_unlink(struct sock *sk,
-					       struct request_sock *req)
-{
-	reqsk_queue_unlink(&inet_csk(sk)->icsk_accept_queue, req);
-}
-
-static inline void inet_csk_reqsk_queue_drop(struct sock *sk,
-					     struct request_sock *req)
-{
-	inet_csk_reqsk_queue_unlink(sk, req);
-	inet_csk_reqsk_queue_removed(sk, req);
-	reqsk_put(req);
-}
+void inet_csk_reqsk_queue_drop(struct sock *sk, struct request_sock *req);
 
 void inet_csk_destroy_sock(struct sock *sk);
 void inet_csk_prepare_forced_close(struct sock *sk);
-18
include/net/request_sock.h
@@ -212 +212 @@
 	return queue->rskq_accept_head == NULL;
 }
 
-static inline void reqsk_queue_unlink(struct request_sock_queue *queue,
-				      struct request_sock *req)
-{
-	struct listen_sock *lopt = queue->listen_opt;
-	struct request_sock **prev;
-
-	spin_lock(&queue->syn_wait_lock);
-
-	prev = &lopt->syn_table[req->rsk_hash];
-	while (*prev != req)
-		prev = &(*prev)->dl_next;
-	*prev = req->dl_next;
-
-	spin_unlock(&queue->syn_wait_lock);
-	if (del_timer(&req->rsk_timer))
-		reqsk_put(req);
-}
-
 static inline void reqsk_queue_add(struct request_sock_queue *queue,
 				   struct request_sock *req,
 				   struct sock *parent,
+8 -3
lib/rhashtable.c
@@ -405 +405 @@
 
 	if (rht_grow_above_75(ht, tbl))
 		size *= 2;
-	/* More than two rehashes (not resizes) detected. */
-	else if (WARN_ON(old_tbl != tbl && old_tbl->size == size))
+	/* Do not schedule more than one rehash */
+	else if (old_tbl != tbl)
 		return -EBUSY;
 
 	new_tbl = bucket_table_alloc(ht, size, GFP_ATOMIC);
-	if (new_tbl == NULL)
+	if (new_tbl == NULL) {
+		/* Schedule async resize/rehash to try allocation
+		 * non-atomic context.
+		 */
+		schedule_work(&ht->run_work);
 		return -ENOMEM;
+	}
 
 	err = rhashtable_rehash_attach(ht, tbl, new_tbl);
 	if (err) {
+6 -6
net/core/dev.c
@@ -3079 +3079 @@
 set_rps_cpu(struct net_device *dev, struct sk_buff *skb,
 	    struct rps_dev_flow *rflow, u16 next_cpu)
 {
-	if (next_cpu != RPS_NO_CPU) {
+	if (next_cpu < nr_cpu_ids) {
 #ifdef CONFIG_RFS_ACCEL
 		struct netdev_rx_queue *rxqueue;
 		struct rps_dev_flow_table *flow_table;
@@ -3184 +3184 @@
 		 * If the desired CPU (where last recvmsg was done) is
 		 * different from current CPU (one in the rx-queue flow
 		 * table entry), switch if one of the following holds:
-		 *   - Current CPU is unset (equal to RPS_NO_CPU).
+		 *   - Current CPU is unset (>= nr_cpu_ids).
 		 *   - Current CPU is offline.
 		 *   - The current CPU's queue tail has advanced beyond the
 		 *     last packet that was enqueued using this table entry.
@@ -3192 +3192 @@
 		 *     have been dequeued, thus preserving in order delivery.
 		 */
 		if (unlikely(tcpu != next_cpu) &&
-		    (tcpu == RPS_NO_CPU || !cpu_online(tcpu) ||
+		    (tcpu >= nr_cpu_ids || !cpu_online(tcpu) ||
 		     ((int)(per_cpu(softnet_data, tcpu).input_queue_head -
 		      rflow->last_qtail)) >= 0)) {
 			tcpu = next_cpu;
 			rflow = set_rps_cpu(dev, skb, rflow, next_cpu);
 		}
 
-		if (tcpu != RPS_NO_CPU && cpu_online(tcpu)) {
+		if (tcpu < nr_cpu_ids && cpu_online(tcpu)) {
 			*rflowp = rflow;
 			cpu = tcpu;
 			goto done;
@@ -3240 +3240 @@
 	struct rps_dev_flow_table *flow_table;
 	struct rps_dev_flow *rflow;
 	bool expire = true;
-	int cpu;
+	unsigned int cpu;
 
 	rcu_read_lock();
 	flow_table = rcu_dereference(rxqueue->rps_flow_table);
 	if (flow_table && flow_id <= flow_table->mask) {
 		rflow = &flow_table->flows[flow_id];
 		cpu = ACCESS_ONCE(rflow->cpu);
-		if (rflow->filter == filter_id && cpu != RPS_NO_CPU &&
+		if (rflow->filter == filter_id && cpu < nr_cpu_ids &&
 		    ((int)(per_cpu(softnet_data, cpu).input_queue_head -
 			   rflow->last_qtail) <
 		     (int)(10 * flow_table->mask)))
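The crux of the RPS fix above is a change of validity test: `rflow->cpu` could hold a stale value other than the `RPS_NO_CPU` sentinel, and the old `!= RPS_NO_CPU` comparison would let such a value through to index per-CPU data. A small userspace sketch of the two checks (the `RPS_NO_CPU` value and `nr_cpu_ids = 8` here are stand-ins, not taken from a live kernel):

```c
#include <assert.h>

/* Stand-ins for the kernel values: RPS_NO_CPU is the u16 "unset"
 * sentinel; nr_cpu_ids is the number of possible CPU ids on an
 * imaginary 8-CPU box. */
#define RPS_NO_CPU 0xffff
static const unsigned int nr_cpu_ids = 8;

/* Old check: only the exact sentinel counted as "unset"; any other
 * out-of-range value slipped through and was later used to index
 * per-CPU data. */
static int cpu_valid_old(unsigned int cpu)
{
	return cpu != RPS_NO_CPU;
}

/* New check, as in the fix: anything outside [0, nr_cpu_ids) is
 * treated as unset, which rejects the sentinel and stale garbage
 * alike. */
static int cpu_valid_new(unsigned int cpu)
{
	return cpu < nr_cpu_ids;
}
```

The same `cpu < nr_cpu_ids` form is used at every site in the diff, so one invariant covers both "never set" and "corrupted/stale" states.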
+24 -6
net/core/skbuff.c
@@ -280 +280 @@
 EXPORT_SYMBOL(__alloc_skb);
 
 /**
- * build_skb - build a network buffer
+ * __build_skb - build a network buffer
  * @data: data buffer provided by caller
- * @frag_size: size of fragment, or 0 if head was kmalloced
+ * @frag_size: size of data, or 0 if head was kmalloced
  *
  * Allocate a new &sk_buff. Caller provides space holding head and
  * skb_shared_info. @data must have been allocated by kmalloc() only if
- * @frag_size is 0, otherwise data should come from the page allocator.
+ * @frag_size is 0, otherwise data should come from the page allocator
+ * or vmalloc()
  * The return is the new skb buffer.
  * On a failure the return is %NULL, and @data is not freed.
  * Notes :
@@ -298 +297 @@
  * before giving packet to stack.
  * RX rings only contains data buffers, not full skbs.
  */
-struct sk_buff *build_skb(void *data, unsigned int frag_size)
+struct sk_buff *__build_skb(void *data, unsigned int frag_size)
 {
 	struct skb_shared_info *shinfo;
 	struct sk_buff *skb;
@@ -312 +311 @@
 
 	memset(skb, 0, offsetof(struct sk_buff, tail));
 	skb->truesize = SKB_TRUESIZE(size);
-	skb->head_frag = frag_size != 0;
 	atomic_set(&skb->users, 1);
 	skb->head = data;
 	skb->data = data;
@@ -326 +326 @@
 	atomic_set(&shinfo->dataref, 1);
 	kmemcheck_annotate_variable(shinfo->destructor_arg);
 
+	return skb;
+}
+
+/* build_skb() is wrapper over __build_skb(), that specifically
+ * takes care of skb->head and skb->pfmemalloc
+ * This means that if @frag_size is not zero, then @data must be backed
+ * by a page fragment, not kmalloc() or vmalloc()
+ */
+struct sk_buff *build_skb(void *data, unsigned int frag_size)
+{
+	struct sk_buff *skb = __build_skb(data, frag_size);
+
+	if (skb && frag_size) {
+		skb->head_frag = 1;
+		if (virt_to_head_page(data)->pfmemalloc)
+			skb->pfmemalloc = 1;
+	}
 	return skb;
 }
 EXPORT_SYMBOL(build_skb);
@@ -365 +348 @@
 	gfp_t gfp = gfp_mask;
 
 	if (order) {
-		gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY;
+		gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY |
+			    __GFP_NOMEMALLOC;
 		page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, order);
 		nc->frag.size = PAGE_SIZE << (page ? order : 0);
 	}
+2 -1
net/dccp/ipv4.c
@@ -453 +453 @@
 					       iph->saddr, iph->daddr);
 	if (req) {
 		nsk = dccp_check_req(sk, skb, req);
-		reqsk_put(req);
+		if (!nsk)
+			reqsk_put(req);
 		return nsk;
 	}
 	nsk = inet_lookup_established(sock_net(sk), &dccp_hashinfo,
+2 -1
net/dccp/ipv6.c
@@ -301 +301 @@
 				   &iph->daddr, inet6_iif(skb));
 	if (req) {
 		nsk = dccp_check_req(sk, skb, req);
-		reqsk_put(req);
+		if (!nsk)
+			reqsk_put(req);
 		return nsk;
 	}
 	nsk = __inet6_lookup_established(sock_net(sk), &dccp_hashinfo,
+1 -2
net/dccp/minisocks.c
@@ -186 +186 @@
 	if (child == NULL)
 		goto listen_overflow;
 
-	inet_csk_reqsk_queue_unlink(sk, req);
-	inet_csk_reqsk_queue_removed(sk, req);
+	inet_csk_reqsk_queue_drop(sk, req);
 	inet_csk_reqsk_queue_add(sk, req, child);
 out:
 	return child;
+34
net/ipv4/inet_connection_sock.c
@@ -564 +564 @@
 }
 EXPORT_SYMBOL(inet_rtx_syn_ack);
 
+/* return true if req was found in the syn_table[] */
+static bool reqsk_queue_unlink(struct request_sock_queue *queue,
+			       struct request_sock *req)
+{
+	struct listen_sock *lopt = queue->listen_opt;
+	struct request_sock **prev;
+	bool found = false;
+
+	spin_lock(&queue->syn_wait_lock);
+
+	for (prev = &lopt->syn_table[req->rsk_hash]; *prev != NULL;
+	     prev = &(*prev)->dl_next) {
+		if (*prev == req) {
+			*prev = req->dl_next;
+			found = true;
+			break;
+		}
+	}
+
+	spin_unlock(&queue->syn_wait_lock);
+	if (del_timer(&req->rsk_timer))
+		reqsk_put(req);
+	return found;
+}
+
+void inet_csk_reqsk_queue_drop(struct sock *sk, struct request_sock *req)
+{
+	if (reqsk_queue_unlink(&inet_csk(sk)->icsk_accept_queue, req)) {
+		reqsk_queue_removed(&inet_csk(sk)->icsk_accept_queue, req);
+		reqsk_put(req);
+	}
+}
+EXPORT_SYMBOL(inet_csk_reqsk_queue_drop);
+
 static void reqsk_timer_handler(unsigned long data)
 {
 	struct request_sock *req = (struct request_sock *)data;
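The new `reqsk_queue_unlink()` above walks the hash chain with a pointer-to-pointer and reports whether the request was actually present, so the caller only does the removed/put accounting once even if a timer raced and unlinked it first. A self-contained sketch of just the unlink idiom (the `struct req` here is a stand-in for the kernel's request sock, not the real type):

```c
#include <assert.h>
#include <stddef.h>

/* Minimal model of one syn_table[] bucket: a singly linked chain
 * threaded through dl_next. */
struct req {
	struct req *dl_next;
};

/* Walk via a pointer-to-pointer so removing the head and removing an
 * interior node are the same operation; return whether req was found,
 * mirroring the bool result of the new reqsk_queue_unlink(). */
static int chain_unlink(struct req **head, struct req *req)
{
	struct req **prev;

	for (prev = head; *prev != NULL; prev = &(*prev)->dl_next) {
		if (*prev == req) {
			*prev = req->dl_next;
			return 1;	/* found and unlinked */
		}
	}
	return 0;			/* someone else got here first */
}
```

Contrast this with the old inline `reqsk_queue_unlink()` in request_sock.h, whose `while (*prev != req)` loop assumed the entry was always on the chain and would dereference past the end when it was not.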
+2 -1
net/ipv4/tcp_ipv4.c
@@ -1348 +1348 @@
 	req = inet_csk_search_req(sk, th->source, iph->saddr, iph->daddr);
 	if (req) {
 		nsk = tcp_check_req(sk, skb, req, false);
-		reqsk_put(req);
+		if (!nsk)
+			reqsk_put(req);
 		return nsk;
 	}
 
+4 -3
net/ipv4/tcp_minisocks.c
@@ -755 +755 @@
 	if (!child)
 		goto listen_overflow;
 
-	inet_csk_reqsk_queue_unlink(sk, req);
-	inet_csk_reqsk_queue_removed(sk, req);
-
+	inet_csk_reqsk_queue_drop(sk, req);
 	inet_csk_reqsk_queue_add(sk, req, child);
+	/* Warning: caller must not call reqsk_put(req);
+	 * child stole last reference on it.
+	 */
 	return child;
 
 listen_overflow:
+46 -20
net/ipv4/tcp_output.c
@@ -2812 +2812 @@
 	}
 }
 
-/* Send a fin.  The caller locks the socket for us.  This cannot be
- * allowed to fail queueing a FIN frame under any circumstances.
+/* We allow to exceed memory limits for FIN packets to expedite
+ * connection tear down and (memory) recovery.
+ * Otherwise tcp_send_fin() could be tempted to either delay FIN
+ * or even be forced to close flow without any FIN.
+ */
+static void sk_forced_wmem_schedule(struct sock *sk, int size)
+{
+	int amt, status;
+
+	if (size <= sk->sk_forward_alloc)
+		return;
+	amt = sk_mem_pages(size);
+	sk->sk_forward_alloc += amt * SK_MEM_QUANTUM;
+	sk_memory_allocated_add(sk, amt, &status);
+}
+
+/* Send a FIN. The caller locks the socket for us.
+ * We should try to send a FIN packet really hard, but eventually give up.
 */
 void tcp_send_fin(struct sock *sk)
 {
+	struct sk_buff *skb, *tskb = tcp_write_queue_tail(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
-	struct sk_buff *skb = tcp_write_queue_tail(sk);
-	int mss_now;
 
-	/* Optimization, tack on the FIN if we have a queue of
-	 * unsent frames.  But be careful about outgoing SACKS
-	 * and IP options.
+	/* Optimization, tack on the FIN if we have one skb in write queue and
+	 * this skb was not yet sent, or we are under memory pressure.
+	 * Note: in the latter case, FIN packet will be sent after a timeout,
+	 * as TCP stack thinks it has already been transmitted.
 	 */
-	mss_now = tcp_current_mss(sk);
-
-	if (tcp_send_head(sk)) {
-		TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_FIN;
-		TCP_SKB_CB(skb)->end_seq++;
+	if (tskb && (tcp_send_head(sk) || sk_under_memory_pressure(sk))) {
+coalesce:
+		TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;
+		TCP_SKB_CB(tskb)->end_seq++;
 		tp->write_seq++;
-	} else {
-		/* Socket is locked, keep trying until memory is available. */
-		for (;;) {
-			skb = sk_stream_alloc_skb(sk, 0, sk->sk_allocation);
-			if (skb)
-				break;
-			yield();
+		if (!tcp_send_head(sk)) {
+			/* This means tskb was already sent.
+			 * Pretend we included the FIN on previous transmit.
+			 * We need to set tp->snd_nxt to the value it would have
+			 * if FIN had been sent. This is because retransmit path
+			 * does not change tp->snd_nxt.
+			 */
+			tp->snd_nxt++;
+			return;
 		}
+	} else {
+		skb = alloc_skb_fclone(MAX_TCP_HEADER, sk->sk_allocation);
+		if (unlikely(!skb)) {
+			if (tskb)
+				goto coalesce;
+			return;
+		}
+		skb_reserve(skb, MAX_TCP_HEADER);
+		sk_forced_wmem_schedule(sk, skb->truesize);
 		/* FIN eats a sequence byte, write_seq advanced by tcp_queue_skb(). */
 		tcp_init_nondata_skb(skb, tp->write_seq,
 				     TCPHDR_ACK | TCPHDR_FIN);
 		tcp_queue_skb(sk, skb);
 	}
-	__tcp_push_pending_frames(sk, mss_now, TCP_NAGLE_OFF);
+	__tcp_push_pending_frames(sk, tcp_current_mss(sk), TCP_NAGLE_OFF);
 }
 
 /* We get here when a process closes a file descriptor (either due to
+1 -8
net/ipv6/ip6_gre.c
@@ -1246 +1246 @@
 static int ip6gre_tunnel_init(struct net_device *dev)
 {
 	struct ip6_tnl *tunnel;
-	int i;
 
 	tunnel = netdev_priv(dev);
 
@@ -1259 +1260 @@
 	if (ipv6_addr_any(&tunnel->parms.raddr))
 		dev->header_ops = &ip6gre_header_ops;
 
-	dev->tstats = alloc_percpu(struct pcpu_sw_netstats);
+	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
 	if (!dev->tstats)
 		return -ENOMEM;
-
-	for_each_possible_cpu(i) {
-		struct pcpu_sw_netstats *ip6gre_tunnel_stats;
-		ip6gre_tunnel_stats = per_cpu_ptr(dev->tstats, i);
-		u64_stats_init(&ip6gre_tunnel_stats->syncp);
-	}
 
 	return 0;
 }
+2 -1
net/ipv6/tcp_ipv6.c
@@ -946 +946 @@
 				   &ipv6_hdr(skb)->daddr, tcp_v6_iif(skb));
 	if (req) {
 		nsk = tcp_check_req(sk, skb, req, false);
-		reqsk_put(req);
+		if (!nsk)
+			reqsk_put(req);
 		return nsk;
 	}
 	nsk = __inet6_lookup_established(sock_net(sk), &tcp_hashinfo,
+122 -3
net/mpls/af_mpls.c
@@ -53 +53 @@
 	return rt;
 }
 
+static inline struct mpls_dev *mpls_dev_get(const struct net_device *dev)
+{
+	return rcu_dereference_rtnl(dev->mpls_ptr);
+}
+
 static bool mpls_output_possible(const struct net_device *dev)
 {
 	return dev && (dev->flags & IFF_UP) && netif_carrier_ok(dev);
@@ -141 +136 @@
 	struct mpls_route *rt;
 	struct mpls_entry_decoded dec;
 	struct net_device *out_dev;
+	struct mpls_dev *mdev;
 	unsigned int hh_len;
 	unsigned int new_header_size;
 	unsigned int mtu;
 	int err;
 
 	/* Careful this entire function runs inside of an rcu critical section */
+
+	mdev = mpls_dev_get(dev);
+	if (!mdev || !mdev->input_enabled)
+		goto drop;
 
 	if (skb->pkt_type != PACKET_HOST)
 		goto drop;
@@ -362 +352 @@
 	if (!dev)
 		goto errout;
 
-	/* For now just support ethernet devices */
+	/* Ensure this is a supported device */
 	err = -EINVAL;
-	if ((dev->type != ARPHRD_ETHER) && (dev->type != ARPHRD_LOOPBACK))
+	if (!mpls_dev_get(dev))
 		goto errout;
 
 	err = -EINVAL;
@@ -438 +428 @@
 	return err;
 }
 
+#define MPLS_PERDEV_SYSCTL_OFFSET(field)	\
+	(&((struct mpls_dev *)0)->field)
+
+static const struct ctl_table mpls_dev_table[] = {
+	{
+		.procname	= "input",
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+		.data		= MPLS_PERDEV_SYSCTL_OFFSET(input_enabled),
+	},
+	{ }
+};
+
+static int mpls_dev_sysctl_register(struct net_device *dev,
+				    struct mpls_dev *mdev)
+{
+	char path[sizeof("net/mpls/conf/") + IFNAMSIZ];
+	struct ctl_table *table;
+	int i;
+
+	table = kmemdup(&mpls_dev_table, sizeof(mpls_dev_table), GFP_KERNEL);
+	if (!table)
+		goto out;
+
+	/* Table data contains only offsets relative to the base of
+	 * the mdev at this point, so make them absolute.
+	 */
+	for (i = 0; i < ARRAY_SIZE(mpls_dev_table); i++)
+		table[i].data = (char *)mdev + (uintptr_t)table[i].data;
+
+	snprintf(path, sizeof(path), "net/mpls/conf/%s", dev->name);
+
+	mdev->sysctl = register_net_sysctl(dev_net(dev), path, table);
+	if (!mdev->sysctl)
+		goto free;
+
+	return 0;
+
+free:
+	kfree(table);
+out:
+	return -ENOBUFS;
+}
+
+static void mpls_dev_sysctl_unregister(struct mpls_dev *mdev)
+{
+	struct ctl_table *table;
+
+	table = mdev->sysctl->ctl_table_arg;
+	unregister_net_sysctl_table(mdev->sysctl);
+	kfree(table);
+}
+
+static struct mpls_dev *mpls_add_dev(struct net_device *dev)
+{
+	struct mpls_dev *mdev;
+	int err = -ENOMEM;
+
+	ASSERT_RTNL();
+
+	mdev = kzalloc(sizeof(*mdev), GFP_KERNEL);
+	if (!mdev)
+		return ERR_PTR(err);
+
+	err = mpls_dev_sysctl_register(dev, mdev);
+	if (err)
+		goto free;
+
+	rcu_assign_pointer(dev->mpls_ptr, mdev);
+
+	return mdev;
+
+free:
+	kfree(mdev);
+	return ERR_PTR(err);
+}
+
 static void mpls_ifdown(struct net_device *dev)
 {
 	struct mpls_route __rcu **platform_label;
 	struct net *net = dev_net(dev);
+	struct mpls_dev *mdev;
 	unsigned index;
 
 	platform_label = rtnl_dereference(net->mpls.platform_label);
@@ -532 +443 @@
 			continue;
 		rt->rt_dev = NULL;
 	}
+
+	mdev = mpls_dev_get(dev);
+	if (!mdev)
+		return;
+
+	mpls_dev_sysctl_unregister(mdev);
+
+	RCU_INIT_POINTER(dev->mpls_ptr, NULL);
+
+	kfree(mdev);
 }
 
 static int mpls_dev_notify(struct notifier_block *this, unsigned long event,
 			   void *ptr)
 {
 	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+	struct mpls_dev *mdev;
 
 	switch(event) {
+	case NETDEV_REGISTER:
+		/* For now just support ethernet devices */
+		if ((dev->type == ARPHRD_ETHER) ||
+		    (dev->type == ARPHRD_LOOPBACK)) {
+			mdev = mpls_add_dev(dev);
+			if (IS_ERR(mdev))
+				return notifier_from_errno(PTR_ERR(mdev));
+		}
+		break;
+
 	case NETDEV_UNREGISTER:
 		mpls_ifdown(dev);
 		break;
@@ -645 +535 @@
 		 */
 		if ((dec.bos != bos) || dec.ttl || dec.tc)
 			return -EINVAL;
+
+		switch (dec.label) {
+		case LABEL_IMPLICIT_NULL:
+			/* RFC3032: This is a label that an LSR may
+			 * assign and distribute, but which never
+			 * actually appears in the encapsulation.
+			 */
+			return -EINVAL;
+		}
 
 		label[i] = dec.label;
 	}
@@ -1031 +912 @@
 	return ret;
 }
 
-static struct ctl_table mpls_table[] = {
+static const struct ctl_table mpls_table[] = {
 	{
 		.procname	= "platform_labels",
 		.data		= NULL,
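The per-device sysctl registration above uses a relocation trick: the static `mpls_dev_table` template stores member *offsets* in each `.data` field (via `MPLS_PERDEV_SYSCTL_OFFSET`), and `mpls_dev_sysctl_register()` turns them into absolute pointers into that device's `mpls_dev` after `kmemdup()`. A userspace sketch of the same idiom, using `offsetof()` rather than the null-pointer cast (the `ctl_entry` type here is a stand-in, not the kernel's `struct ctl_table`):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the kernel types. */
struct mpls_dev {
	int input_enabled;
};

struct ctl_entry {
	const char *procname;
	void *data;	/* normally points at the controlled variable */
};

/* Encode the member offset as a fake pointer in the shared template,
 * the same idea as MPLS_PERDEV_SYSCTL_OFFSET(). */
#define PERDEV_OFFSET(field) \
	((void *)(uintptr_t)offsetof(struct mpls_dev, field))

static const struct ctl_entry mpls_dev_template[] = {
	{ "input", PERDEV_OFFSET(input_enabled) },
};

/* Per device: after copying the template, rewrite each stored offset
 * into an absolute pointer into this device's private struct. */
static void relocate(struct ctl_entry *table, size_t n,
		     struct mpls_dev *mdev)
{
	size_t i;

	for (i = 0; i < n; i++)
		table[i].data = (char *)mdev + (uintptr_t)table[i].data;
}
```

One read-only template thus serves any number of devices; only the small copied table is per-device state.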
+6
net/mpls/internal.h
@@ -22 +22 @@
 	u8 bos;
 };
 
+struct mpls_dev {
+	int			input_enabled;
+
+	struct ctl_table_header *sysctl;
+};
+
 struct sk_buff;
 
 static inline struct mpls_shim_hdr *mpls_hdr(const struct sk_buff *skb)
+2
net/netfilter/nft_reject.c
@@ -63 +63 @@
 		if (nla_put_u8(skb, NFTA_REJECT_ICMP_CODE, priv->icmp_code))
 			goto nla_put_failure;
 		break;
+	default:
+		break;
 	}
 
 	return 0;
+2
net/netfilter/nft_reject_inet.c
@@ -108 +108 @@
 		if (nla_put_u8(skb, NFTA_REJECT_ICMP_CODE, priv->icmp_code))
 			goto nla_put_failure;
 		break;
+	default:
+		break;
 	}
 
 	return 0;
+2 -4
net/netlink/af_netlink.c
@@ -1629 +1629 @@
 	if (data == NULL)
 		return NULL;
 
-	skb = build_skb(data, size);
+	skb = __build_skb(data, size);
 	if (skb == NULL)
 		vfree(data);
-	else {
-		skb->head_frag = 0;
+	else
 		skb->destructor = netlink_skb_destructor;
-	}
 
 	return skb;
 }
-1
net/tipc/link.c
@@ -2143 +2143 @@
 			err = __tipc_nl_add_node_links(net, &msg, node,
 						       &prev_link);
 			tipc_node_unlock(node);
-			tipc_node_put(node);
 			if (err)
 				goto out;
 
+3 -6
net/tipc/server.c
@@ -102 +102 @@
 	}
 	saddr->scope = -TIPC_NODE_SCOPE;
 	kernel_bind(sock, (struct sockaddr *)saddr, sizeof(*saddr));
-	sk_release_kernel(sk);
+	sock_release(sock);
 	con->sock = NULL;
 }
 
@@ -321 +321 @@
 	struct socket *sock = NULL;
 	int ret;
 
-	ret = sock_create_kern(AF_TIPC, SOCK_SEQPACKET, 0, &sock);
+	ret = __sock_create(s->net, AF_TIPC, SOCK_SEQPACKET, 0, &sock, 1);
 	if (ret < 0)
 		return NULL;
-
-	sk_change_net(sock->sk, s->net);
-
 	ret = kernel_setsockopt(sock, SOL_TIPC, TIPC_IMPORTANCE,
 				(char *)&s->imp, sizeof(s->imp));
 	if (ret < 0)
@@ -373 +376 @@
 
 create_err:
 	kernel_sock_shutdown(sock, SHUT_RDWR);
-	sk_release_kernel(sock->sk);
+	sock_release(sock);
 	return NULL;
 }
 
+2 -1
net/tipc/socket.c
@@ -1764 +1764 @@
 int tipc_sk_rcv(struct net *net, struct sk_buff_head *inputq)
 {
 	u32 dnode, dport = 0;
-	int err = -TIPC_ERR_NO_PORT;
+	int err;
 	struct sk_buff *skb;
 	struct tipc_sock *tsk;
 	struct tipc_net *tn;
 	struct sock *sk;
 
 	while (skb_queue_len(inputq)) {
+		err = -TIPC_ERR_NO_PORT;
 		skb = NULL;
 		dport = tipc_skb_peek_port(inputq, dport);
 		tsk = tipc_sk_lookup(net, dport);
+28 -42
net/unix/garbage.c
@@ -95 +95 @@
 
 unsigned int unix_tot_inflight;
 
-
 struct sock *unix_get_socket(struct file *filp)
 {
 	struct sock *u_sock = NULL;
 	struct inode *inode = file_inode(filp);
 
-	/*
-	 *	Socket ?
-	 */
+	/* Socket ? */
 	if (S_ISSOCK(inode->i_mode) && !(filp->f_mode & FMODE_PATH)) {
 		struct socket *sock = SOCKET_I(inode);
 		struct sock *s = sock->sk;
 
-		/*
-		 *	PF_UNIX ?
-		 */
+		/* PF_UNIX ? */
 		if (s && sock->ops && sock->ops->family == PF_UNIX)
 			u_sock = s;
 	}
 	return u_sock;
 }
 
-/*
- *	Keep the number of times in flight count for the file
- *	descriptor if it is for an AF_UNIX socket.
+/* Keep the number of times in flight count for the file
+ * descriptor if it is for an AF_UNIX socket.
 */
 
 void unix_inflight(struct file *fp)
 {
 	struct sock *s = unix_get_socket(fp);
+
 	if (s) {
 		struct unix_sock *u = unix_sk(s);
+
 		spin_lock(&unix_gc_lock);
+
 		if (atomic_long_inc_return(&u->inflight) == 1) {
 			BUG_ON(!list_empty(&u->link));
 			list_add_tail(&u->link, &gc_inflight_list);
@@ -139 +142 @@
 void unix_notinflight(struct file *fp)
 {
 	struct sock *s = unix_get_socket(fp);
+
 	if (s) {
 		struct unix_sock *u = unix_sk(s);
+
 		spin_lock(&unix_gc_lock);
 		BUG_ON(list_empty(&u->link));
+
 		if (atomic_long_dec_and_test(&u->inflight))
 			list_del_init(&u->link);
 		unix_tot_inflight--;
@@ -161 +161 @@
 
 	spin_lock(&x->sk_receive_queue.lock);
 	skb_queue_walk_safe(&x->sk_receive_queue, skb, next) {
-		/*
-		 *	Do we have file descriptors ?
-		 */
+		/* Do we have file descriptors ? */
 		if (UNIXCB(skb).fp) {
 			bool hit = false;
-			/*
-			 *	Process the descriptors of this socket
-			 */
+			/* Process the descriptors of this socket */
 			int nfd = UNIXCB(skb).fp->count;
 			struct file **fp = UNIXCB(skb).fp->fp;
+
 			while (nfd--) {
-				/*
-				 *	Get the socket the fd matches
-				 *	if it indeed does so
-				 */
+				/* Get the socket the fd matches if it indeed does so */
 				struct sock *sk = unix_get_socket(*fp++);
+
 				if (sk) {
 					struct unix_sock *u = unix_sk(sk);
 
-					/*
-					 * Ignore non-candidates, they could
+					/* Ignore non-candidates, they could
 					 * have been added to the queues after
 					 * starting the garbage collection
 					 */
 					if (test_bit(UNIX_GC_CANDIDATE, &u->gc_flags)) {
 						hit = true;
+
 						func(u);
 					}
 				}
@@ -198 +203 @@
 static void scan_children(struct sock *x, void (*func)(struct unix_sock *),
 			  struct sk_buff_head *hitlist)
 {
-	if (x->sk_state != TCP_LISTEN)
+	if (x->sk_state != TCP_LISTEN) {
 		scan_inflight(x, func, hitlist);
-	else {
+	} else {
 		struct sk_buff *skb;
 		struct sk_buff *next;
 		struct unix_sock *u;
 		LIST_HEAD(embryos);
 
-		/*
-		 * For a listening socket collect the queued embryos
+		/* For a listening socket collect the queued embryos
 		 * and perform a scan on them as well.
 		 */
 		spin_lock(&x->sk_receive_queue.lock);
 		skb_queue_walk_safe(&x->sk_receive_queue, skb, next) {
 			u = unix_sk(skb->sk);
 
-			/*
-			 * An embryo cannot be in-flight, so it's safe
+			/* An embryo cannot be in-flight, so it's safe
 			 * to use the list link.
 			 */
 			BUG_ON(!list_empty(&u->link));
@@ -242 +249 @@
 static void inc_inflight_move_tail(struct unix_sock *u)
 {
 	atomic_long_inc(&u->inflight);
-	/*
-	 * If this still might be part of a cycle, move it to the end
+	/* If this still might be part of a cycle, move it to the end
 	 * of the list, so that it's checked even if it was already
 	 * passed over
 	 */
@@ -255 +263 @@
 
 void wait_for_unix_gc(void)
 {
-	/*
-	 * If number of inflight sockets is insane,
+	/* If number of inflight sockets is insane,
 	 * force a garbage collect right now.
 	 */
 	if (unix_tot_inflight > UNIX_INFLIGHT_TRIGGER_GC && !gc_in_progress)
@@ -279 +288 @@
 		goto out;
 
 	gc_in_progress = true;
-	/*
-	 * First, select candidates for garbage collection.  Only
+	/* First, select candidates for garbage collection. Only
 	 * in-flight sockets are considered, and from those only ones
 	 * which don't have any external reference.
 	 *
@@ -310 +320 @@
 		}
 	}
 
-	/*
-	 * Now remove all internal in-flight reference to children of
+	/* Now remove all internal in-flight reference to children of
 	 * the candidates.
 	 */
 	list_for_each_entry(u, &gc_candidates, link)
 		scan_children(&u->sk, dec_inflight, NULL);
 
-	/*
-	 * Restore the references for children of all candidates,
+	/* Restore the references for children of all candidates,
 	 * which have remaining references. Do this recursively, so
 	 * only those remain, which form cyclic references.
 	 *
@@ -338 +350 @@
 	}
 	list_del(&cursor);
 
-	/*
-	 * not_cycle_list contains those sockets which do not make up a
+	/* not_cycle_list contains those sockets which do not make up a
 	 * cycle. Restore these to the inflight list.
 	 */
 	while (!list_empty(&not_cycle_list)) {
@@ -347 +360 @@
 		list_move_tail(&u->link, &gc_inflight_list);
 	}
 
-	/*
-	 * Now gc_candidates contains only garbage.  Restore original
+	/* Now gc_candidates contains only garbage. Restore original
 	 * inflight counters for these as well, and remove the skbuffs
 	 * which are creating the cycle(s).
 	 */