Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+1041 -536
+6
Documentation/devicetree/bindings/net/ethernet-controller.yaml
···
   phy-mode:
     $ref: "#/properties/phy-connection-type"
 
+  pcs-handle:
+    $ref: /schemas/types.yaml#/definitions/phandle
+    description:
+      Specifies a reference to a node representing a PCS PHY device on a MDIO
+      bus to link with an external PHY (phy-handle) if exists.
+
   phy-handle:
     $ref: /schemas/types.yaml#/definitions/phandle
     description:
-17
Documentation/devicetree/bindings/net/micrel.txt
···
 
 In fiber mode, auto-negotiation is disabled and the PHY can only work in
 100base-fx (full and half duplex) modes.
-
-- lan8814,ignore-ts: If present the PHY will not support timestamping.
-
-  This option acts as check whether Timestamping is supported by
-  hardware or not. LAN8814 phy support hardware tmestamping.
-
-- lan8814,latency_rx_10: Configures Latency value of phy in ingress at 10 Mbps.
-
-- lan8814,latency_tx_10: Configures Latency value of phy in egress at 10 Mbps.
-
-- lan8814,latency_rx_100: Configures Latency value of phy in ingress at 100 Mbps.
-
-- lan8814,latency_tx_100: Configures Latency value of phy in egress at 100 Mbps.
-
-- lan8814,latency_rx_1000: Configures Latency value of phy in ingress at 1000 Mbps.
-
-- lan8814,latency_tx_1000: Configures Latency value of phy in egress at 1000 Mbps.
+7 -1
Documentation/devicetree/bindings/net/xilinx_axienet.txt
···
         specified, the TX/RX DMA interrupts should be on that node
         instead, and only the Ethernet core interrupt is optionally
         specified here.
-- phy-handle : Should point to the external phy device.
+- phy-handle : Should point to the external phy device if exists. Pointing
+  this to the PCS/PMA PHY is deprecated and should be avoided.
   See ethernet.txt file in the same directory.
 - xlnx,rxmem : Set to allocated memory buffer for Rx/Tx in the hardware
···
 - mdio : Child node for MDIO bus. Must be defined if PHY access is
   required through the core's MDIO interface (i.e. always,
   unless the PHY is accessed through a different bus).
+
+- pcs-handle: Phandle to the internal PCS/PMA PHY in SGMII or 1000Base-X
+  modes, where "pcs-handle" should be used to point
+  to the PCS/PMA PHY, and "phy-handle" should point to an
+  external PHY if exists.
 
 Example:
 	axi_ethernet_eth: ethernet@40c00000 {
+32 -32
Documentation/networking/dsa/dsa.rst
···
 Design principles
 =================
 
-The Distributed Switch Architecture is a subsystem which was primarily designed
-to support Marvell Ethernet switches (MV88E6xxx, a.k.a Linkstreet product line)
-using Linux, but has since evolved to support other vendors as well.
+The Distributed Switch Architecture subsystem was primarily designed to
+support Marvell Ethernet switches (MV88E6xxx, a.k.a. Link Street product
+line) using Linux, but has since evolved to support other vendors as well.
 
 The original philosophy behind this design was to be able to use unmodified
 Linux tools such as bridge, iproute2, ifconfig to work transparently whether
 they configured/queried a switch port network device or a regular network
 device.
 
-An Ethernet switch is typically comprised of multiple front-panel ports, and one
-or more CPU or management port. The DSA subsystem currently relies on the
+An Ethernet switch typically comprises multiple front-panel ports and one
+or more CPU or management ports. The DSA subsystem currently relies on the
 presence of a management port connected to an Ethernet controller capable of
 receiving Ethernet frames from the switch. This is a very common setup for all
 kinds of Ethernet switches found in Small Home and Office products: routers,
-gateways, or even top-of-the rack switches. This host Ethernet controller will
+gateways, or even top-of-rack switches. This host Ethernet controller will
 be later referred to as "master" and "cpu" in DSA terminology and code.
 
 The D in DSA stands for Distributed, because the subsystem has been designed
···
 ports are referred to as "dsa" ports in DSA terminology and code. A collection
 of multiple switches connected to each other is called a "switch tree".
 
-For each front-panel port, DSA will create specialized network devices which are
+For each front-panel port, DSA creates specialized network devices which are
 used as controlling and data-flowing endpoints for use by the Linux networking
 stack. These specialized network interfaces are referred to as "slave" network
 interfaces in DSA terminology and code.
 
 The ideal case for using DSA is when an Ethernet switch supports a "switch tag"
 which is a hardware feature making the switch insert a specific tag for each
-Ethernet frames it received to/from specific ports to help the management
+Ethernet frame it receives to/from specific ports to help the management
 interface figure out:
 
 - what port is this frame coming from
···
 ports must decapsulate the packet.
 
 Note that in certain cases, it might be the case that the tagging format used
-by a leaf switch (not connected directly to the CPU) to not be the same as what
+by a leaf switch (not connected directly to the CPU) is not the same as what
 the network stack sees. This can be seen with Marvell switch trees, where the
 CPU port can be configured to use either the DSA or the Ethertype DSA (EDSA)
 format, but the DSA links are configured to use the shorter (without Ethertype)
···
   to/from specific switch ports
 - query the switch for ethtool operations: statistics, link state,
   Wake-on-LAN, register dumps...
-- external/internal PHY management: link, auto-negotiation etc.
+- manage external/internal PHY: link, auto-negotiation, etc.
 
 These slave network devices have custom net_device_ops and ethtool_ops function
 pointers which allow DSA to introduce a level of layering between the networking
-stack/ethtool, and the switch driver implementation.
+stack/ethtool and the switch driver implementation.
 
 Upon frame transmission from these slave network devices, DSA will look up which
-switch tagging protocol is currently registered with these network devices, and
+switch tagging protocol is currently registered with these network devices and
 invoke a specific transmit routine which takes care of adding the relevant
 switch tag in the Ethernet frames.
 
 These frames are then queued for transmission using the master network device
-``ndo_start_xmit()`` function, since they contain the appropriate switch tag, the
+``ndo_start_xmit()`` function. Since they contain the appropriate switch tag, the
 Ethernet switch will be able to process these incoming frames from the
-management interface and delivers these frames to the physical switch port.
+management interface and deliver them to the physical switch port.
 
 Graphical representation
 ------------------------
···
 switches, these functions would utilize direct or indirect PHY addressing mode
 to return standard MII registers from the switch builtin PHYs, allowing the PHY
 library and/or to return link status, link partner pages, auto-negotiation
-results etc..
+results, etc.
 
-For Ethernet switches which have both external and internal MDIO busses, the
+For Ethernet switches which have both external and internal MDIO buses, the
 slave MII bus can be utilized to mux/demux MDIO reads and writes towards either
 internal or external MDIO devices this switch might be connected to: internal
 PHYs, external PHYs, or even external switches.
···
   table indication (when cascading switches)
 
 - ``dsa_platform_data``: platform device configuration data which can reference
-  a collection of dsa_chip_data structure if multiples switches are cascaded,
+  a collection of dsa_chip_data structures if multiple switches are cascaded,
   the master network device this switch tree is attached to needs to be
   referenced
···
   "phy-handle" property, if found, this PHY device is created and registered
   using ``of_phy_connect()``
 
-- if Device Tree is used, and the PHY device is "fixed", that is, conforms to
+- if Device Tree is used and the PHY device is "fixed", that is, conforms to
   the definition of a non-MDIO managed PHY as defined in
   ``Documentation/devicetree/bindings/net/fixed-link.txt``, the PHY is registered
   and connected transparently using the special fixed MDIO bus driver
···
 DSA features a standardized binding which is documented in
 ``Documentation/devicetree/bindings/net/dsa/dsa.txt``. PHY/MDIO library helper
 functions such as ``of_get_phy_mode()``, ``of_phy_connect()`` are also used to query
-per-port PHY specific details: interface connection, MDIO bus location etc..
+per-port PHY specific details: interface connection, MDIO bus location, etc.
 
 Driver development
 ==================
···
 
 - ``setup``: setup function for the switch, this function is responsible for setting
   up the ``dsa_switch_ops`` private structure with all it needs: register maps,
-  interrupts, mutexes, locks etc.. This function is also expected to properly
+  interrupts, mutexes, locks, etc. This function is also expected to properly
   configure the switch to separate all network interfaces from each other, that
   is, they should be isolated by the switch hardware itself, typically by creating
   a Port-based VLAN ID for each port and allowing only the CPU port and the
···
 - ``get_phy_flags``: Some switches are interfaced to various kinds of Ethernet PHYs,
   if the PHY library PHY driver needs to know about information it cannot obtain
   on its own (e.g.: coming from switch memory mapped registers), this function
-  should return a 32-bits bitmask of "flags", that is private between the switch
+  should return a 32-bit bitmask of "flags" that is private between the switch
   driver and the Ethernet PHY driver in ``drivers/net/phy/\*``.
 
 - ``phy_read``: Function invoked by the DSA slave MDIO bus when attempting to read
   the switch port MDIO registers. If unavailable, return 0xffff for each read.
   For builtin switch Ethernet PHYs, this function should allow reading the link
-  status, auto-negotiation results, link partner pages etc..
+  status, auto-negotiation results, link partner pages, etc.
 
 - ``phy_write``: Function invoked by the DSA slave MDIO bus when attempting to write
   to the switch port MDIO registers. If unavailable return a negative error
···
 ------------------
 
 - ``get_strings``: ethtool function used to query the driver's strings, will
-  typically return statistics strings, private flags strings etc.
+  typically return statistics strings, private flags strings, etc.
 
 - ``get_ethtool_stats``: ethtool function used to query per-port statistics and
   return their values. DSA overlays slave network devices general statistics:
···
 - ``get_sset_count``: ethtool function used to query the number of statistics items
 
 - ``get_wol``: ethtool function used to obtain Wake-on-LAN settings per-port, this
-  function may, for certain implementations also query the master network device
+  function may for certain implementations also query the master network device
   Wake-on-LAN settings if this interface needs to participate in Wake-on-LAN
 
 - ``set_wol``: ethtool function used to configure Wake-on-LAN settings per-port,
···
   in a fully active state
 
 - ``port_enable``: function invoked by the DSA slave network device ndo_open
-  function when a port is administratively brought up, this function should be
-  fully enabling a given switch port. DSA takes care of marking the port with
+  function when a port is administratively brought up, this function should
+  fully enable a given switch port. DSA takes care of marking the port with
   ``BR_STATE_BLOCKING`` if the port is a bridge member, or ``BR_STATE_FORWARDING`` if it
   was not, and propagating these changes down to the hardware
 
 - ``port_disable``: function invoked by the DSA slave network device ndo_close
-  function when a port is administratively brought down, this function should be
-  fully disabling a given switch port. DSA takes care of marking the port with
+  function when a port is administratively brought down, this function should
+  fully disable a given switch port. DSA takes care of marking the port with
   ``BR_STATE_DISABLED`` and propagating changes to the hardware if this port is
   disabled while being a bridge member
 
···
 ------------
 
 - ``port_bridge_join``: bridge layer function invoked when a given switch port is
-  added to a bridge, this function should be doing the necessary at the switch
-  level to permit the joining port from being added to the relevant logical
+  added to a bridge, this function should do what's necessary at the switch
+  level to permit the joining port to be added to the relevant logical
   domain for it to ingress/egress traffic with other members of the bridge.
 
 - ``port_bridge_leave``: bridge layer function invoked when a given switch port is
-  removed from a bridge, this function should be doing the necessary at the
+  removed from a bridge, this function should do what's necessary at the
   switch level to deny the leaving port from ingress/egress traffic from the
   remaining bridge members. When the port leaves the bridge, it should be aged
   out at the switch hardware for the switch to (re) learn MAC addresses behind
···
   point for drivers that need to configure the hardware for enabling this
   feature.
 
-- ``port_bridge_tx_fwd_unoffload``: bridge layer function invoken when a driver
+- ``port_bridge_tx_fwd_unoffload``: bridge layer function invoked when a driver
   leaves a bridge port which had the TX forwarding offload feature enabled.
 
 Bridge VLAN filtering
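The tag-insertion flow the DSA document describes (slave device transmit, tag added, master device transmit, and the reverse on receive) can be sketched in plain C. The 4-byte tag layout and its placement after the MAC addresses are illustrative assumptions for the sketch, not any specific vendor's tagging format:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-byte switch tag inserted after the MAC addresses,
 * carrying a port number (illustrative only, not a real vendor format). */
#define TAG_LEN 4
#define MAC_HDR 12  /* destination MAC (6) + source MAC (6) */

/* Transmit side: open a gap after the MAC header and write the tag.
 * buf must have room for len + TAG_LEN bytes; returns the new length. */
static size_t tag_xmit(uint8_t *buf, size_t len, uint8_t port)
{
    memmove(buf + MAC_HDR + TAG_LEN, buf + MAC_HDR, len - MAC_HDR);
    buf[MAC_HDR + 0] = 0xDA;  /* made-up tag marker byte */
    buf[MAC_HDR + 1] = port;  /* egress port number */
    buf[MAC_HDR + 2] = 0;
    buf[MAC_HDR + 3] = 0;
    return len + TAG_LEN;
}

/* Receive side: recover the source port from the tag, then strip it so
 * the stack sees an ordinary Ethernet frame; returns the new length. */
static size_t tag_rcv(uint8_t *buf, size_t len, uint8_t *port)
{
    *port = buf[MAC_HDR + 1];
    memmove(buf + MAC_HDR, buf + MAC_HDR + TAG_LEN,
            len - MAC_HDR - TAG_LEN);
    return len - TAG_LEN;
}
```

The round trip (insert on xmit, parse and strip on receive) mirrors how a DSA tagger pairs its transmit and receive hooks.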
+19 -2
arch/x86/power/cpu.c
···
 	struct saved_msr *end = msr + ctxt->saved_msrs.num;
 
 	while (msr < end) {
-		msr->valid = !rdmsrl_safe(msr->info.msr_no, &msr->info.reg.q);
+		if (msr->valid)
+			rdmsrl(msr->info.msr_no, msr->info.reg.q);
 		msr++;
 	}
 }
···
 	}
 
 	for (i = saved_msrs->num, j = 0; i < total_num; i++, j++) {
+		u64 dummy;
+
 		msr_array[i].info.msr_no	= msr_id[j];
-		msr_array[i].valid		= false;
+		msr_array[i].valid		= !rdmsrl_safe(msr_id[j], &dummy);
 		msr_array[i].info.reg.q		= 0;
 	}
 	saved_msrs->num = total_num;
···
 		return ret;
 }
 
+static void pm_save_spec_msr(void)
+{
+	u32 spec_msr_id[] = {
+		MSR_IA32_SPEC_CTRL,
+		MSR_IA32_TSX_CTRL,
+		MSR_TSX_FORCE_ABORT,
+		MSR_IA32_MCU_OPT_CTRL,
+		MSR_AMD64_LS_CFG,
+	};
+
+	msr_build_context(spec_msr_id, ARRAY_SIZE(spec_msr_id));
+}
+
 static int pm_check_save_msr(void)
 {
 	dmi_check_system(msr_save_dmi_table);
 	pm_cpu_check(msr_save_cpu_table);
+	pm_save_spec_msr();
 
 	return 0;
 }
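The pattern in this hunk is "probe once at list-build time, then read only the entries that probed valid at save time", so the suspend path never faults on a nonexistent MSR. A minimal userspace sketch of that pattern, with a fake register table standing in for rdmsrl_safe()/rdmsrl():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch only: a register entry with a validity flag, as in saved_msr. */
struct saved_reg {
    uint32_t no;
    uint64_t val;
    bool valid;
};

/* Stand-in for rdmsrl_safe(): pretend only registers 0x48 and 0x122
 * exist; reading anything else "faults" and returns an error. */
static int read_reg_safe(uint32_t no, uint64_t *out)
{
    if (no != 0x48 && no != 0x122)
        return -1;
    *out = 0x1000 + no;  /* arbitrary value for the sketch */
    return 0;
}

/* Build phase: probe each id once and record whether it exists. */
static void build_context(struct saved_reg *arr, const uint32_t *ids, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t dummy;
        arr[i].no = ids[i];
        arr[i].valid = read_reg_safe(ids[i], &dummy) == 0;
        arr[i].val = 0;
    }
}

/* Save phase: only read entries that probed valid, skipping the rest. */
static void save_context(struct saved_reg *arr, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (arr[i].valid)
            read_reg_safe(arr[i].no, &arr[i].val);
}
```

Moving the probe into the build phase is what lets pm_save_spec_msr() register speculation-control MSRs unconditionally: entries for MSRs the CPU lacks simply stay invalid.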
+4 -2
drivers/ata/Kconfig
···
 
 	  If unsure, say N.
 
-config SATA_LPM_POLICY
+config SATA_MOBILE_LPM_POLICY
 	int "Default SATA Link Power Management policy for low power chipsets"
 	range 0 4
 	default 0
 	depends on SATA_AHCI
 	help
 	  Select the Default SATA Link Power Management (LPM) policy to use
-	  for chipsets / "South Bridges" designated as supporting low power.
+	  for chipsets / "South Bridges" supporting low-power modes. Such
+	  chipsets are typically found on most laptops but desktops and
+	  servers now also widely use chipsets supporting low power modes.
 
 	  The value set has the following meanings:
 	  0 => Keep firmware settings
+1 -1
drivers/ata/ahci.c
···
 static void ahci_update_initial_lpm_policy(struct ata_port *ap,
 					   struct ahci_host_priv *hpriv)
 {
-	int policy = CONFIG_SATA_LPM_POLICY;
+	int policy = CONFIG_SATA_MOBILE_LPM_POLICY;
 
 
 	/* Ignore processing for chipsets that don't use policy */
+1 -1
drivers/ata/ahci.h
···
 	AHCI_HFLAG_NO_WRITE_TO_RO	= (1 << 24), /* don't write to read
 							only registers */
 	AHCI_HFLAG_USE_LPM_POLICY	= (1 << 25), /* chipset that should use
-							SATA_LPM_POLICY
+							SATA_MOBILE_LPM_POLICY
 							as default lpm_policy */
 	AHCI_HFLAG_SUSPEND_PHYS		= (1 << 26), /* handle PHYs during
 							suspend/resume */
+3
drivers/ata/libata-core.c
···
 						ATA_HORKAGE_ZERO_AFTER_TRIM, },
 	{ "Crucial_CT*MX100*",		"MU01",	ATA_HORKAGE_NO_NCQ_TRIM |
 						ATA_HORKAGE_ZERO_AFTER_TRIM, },
+	{ "Samsung SSD 840 EVO*",	NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
+						ATA_HORKAGE_NO_DMA_LOG |
+						ATA_HORKAGE_ZERO_AFTER_TRIM, },
 	{ "Samsung SSD 840*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
 						ATA_HORKAGE_ZERO_AFTER_TRIM, },
 	{ "Samsung SSD 850*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
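This quirk table is scanned top to bottom and the first matching glob wins, which is why the new, more specific "Samsung SSD 840 EVO*" entry must sit above the broader "Samsung SSD 840*" one. A sketch of that first-match lookup, using POSIX fnmatch() in place of the kernel's glob matcher and made-up flag values instead of the real ATA_HORKAGE_* bits:

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Sketch of a first-match quirk table; flag values are illustrative. */
struct quirk {
    const char *pattern;
    unsigned int flags;
};

static const struct quirk quirks[] = {
    { "Samsung SSD 840 EVO*", 0x7 },  /* specific entry must come first */
    { "Samsung SSD 840*",     0x3 },
    { "Samsung SSD 850*",     0x3 },
};

/* Return the flags of the first pattern that matches the model string;
 * entries after a match are never consulted, so ordering is semantic. */
static unsigned int lookup_quirks(const char *model)
{
    for (size_t i = 0; i < sizeof(quirks) / sizeof(quirks[0]); i++)
        if (fnmatch(quirks[i].pattern, model, 0) == 0)
            return quirks[i].flags;
    return 0;
}
```

If the EVO entry were listed after "Samsung SSD 840*", the broad pattern would shadow it and the extra NO_DMA_LOG-style flag would never apply.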
+1 -1
drivers/ata/libata-sff.c
···
 
 void ata_sff_lost_interrupt(struct ata_port *ap)
 {
-	u8 status;
+	u8 status = 0;
 	struct ata_queued_cmd *qc;
 
 	/* Only one outstanding command per SFF channel */
+5 -1
drivers/ata/sata_dwc_460ex.c
···
 #endif
 };
 
-#define SATA_DWC_QCMD_MAX	32
+/*
+ * Allow one extra special slot for commands and DMA management
+ * to account for libata internal commands.
+ */
+#define SATA_DWC_QCMD_MAX	(ATA_MAX_QUEUE + 1)
 
 struct sata_dwc_device_port {
 	struct sata_dwc_device *hsdev;
+39 -35
drivers/char/random.c
···
  * This shouldn't be set by functions like add_device_randomness(),
  * where we can't trust the buffer passed to it is guaranteed to be
  * unpredictable (so it might not have any entropy at all).
- *
- * Returns the number of bytes processed from input, which is bounded
- * by CRNG_INIT_CNT_THRESH if account is true.
  */
-static size_t crng_pre_init_inject(const void *input, size_t len, bool account)
+static void crng_pre_init_inject(const void *input, size_t len, bool account)
 {
 	static int crng_init_cnt = 0;
 	struct blake2s_state hash;
···
 	spin_lock_irqsave(&base_crng.lock, flags);
 	if (crng_init != 0) {
 		spin_unlock_irqrestore(&base_crng.lock, flags);
-		return 0;
+		return;
 	}
-
-	if (account)
-		len = min_t(size_t, len, CRNG_INIT_CNT_THRESH - crng_init_cnt);
 
 	blake2s_update(&hash, base_crng.key, sizeof(base_crng.key));
 	blake2s_update(&hash, input, len);
 	blake2s_final(&hash, base_crng.key);
 
 	if (account) {
-		crng_init_cnt += len;
+		crng_init_cnt += min_t(size_t, len, CRNG_INIT_CNT_THRESH - crng_init_cnt);
 		if (crng_init_cnt >= CRNG_INIT_CNT_THRESH) {
 			++base_crng.generation;
 			crng_init = 1;
···
 
 	if (crng_init == 1)
 		pr_notice("fast init done\n");
-
-	return len;
 }
 
 static void _get_random_bytes(void *buf, size_t nbytes)
···
 
 static ssize_t get_random_bytes_user(void __user *buf, size_t nbytes)
 {
-	bool large_request = nbytes > 256;
 	ssize_t ret = 0;
 	size_t len;
 	u32 chacha_state[CHACHA_STATE_WORDS];
···
 	if (!nbytes)
 		return 0;
 
-	len = min_t(size_t, 32, nbytes);
-	crng_make_state(chacha_state, output, len);
+	/*
+	 * Immediately overwrite the ChaCha key at index 4 with random
+	 * bytes, in case userspace causes copy_to_user() below to sleep
+	 * forever, so that we still retain forward secrecy in that case.
+	 */
+	crng_make_state(chacha_state, (u8 *)&chacha_state[4], CHACHA_KEY_SIZE);
+	/*
+	 * However, if we're doing a read of len <= 32, we don't need to
+	 * use chacha_state after, so we can simply return those bytes to
+	 * the user directly.
+	 */
+	if (nbytes <= CHACHA_KEY_SIZE) {
+		ret = copy_to_user(buf, &chacha_state[4], nbytes) ? -EFAULT : nbytes;
+		goto out_zero_chacha;
+	}
 
-	if (copy_to_user(buf, output, len))
-		return -EFAULT;
-	nbytes -= len;
-	buf += len;
-	ret += len;
-
-	while (nbytes) {
-		if (large_request && need_resched()) {
-			if (signal_pending(current))
-				break;
-			schedule();
-		}
-
+	do {
 		chacha20_block(chacha_state, output);
 		if (unlikely(chacha_state[12] == 0))
 			++chacha_state[13];
···
 		nbytes -= len;
 		buf += len;
 		ret += len;
-	}
 
-	memzero_explicit(chacha_state, sizeof(chacha_state));
+		BUILD_BUG_ON(PAGE_SIZE % CHACHA_BLOCK_SIZE != 0);
+		if (!(ret % PAGE_SIZE) && nbytes) {
+			if (signal_pending(current))
+				break;
+			cond_resched();
+		}
+	} while (nbytes);
+
 	memzero_explicit(output, sizeof(output));
+out_zero_chacha:
+	memzero_explicit(chacha_state, sizeof(chacha_state));
 	return ret;
 }
···
 				  size_t entropy)
 {
 	if (unlikely(crng_init == 0 && entropy < POOL_MIN_BITS)) {
-		size_t ret = crng_pre_init_inject(buffer, count, true);
-		mix_pool_bytes(buffer, ret);
-		count -= ret;
-		buffer += ret;
-		if (!count || crng_init == 0)
-			return;
+		crng_pre_init_inject(buffer, count, true);
+		mix_pool_bytes(buffer, count);
+		return;
 	}
 
 	/*
···
 			     loff_t *ppos)
 {
 	static int maxwarn = 10;
+
+	/*
+	 * Opportunistically attempt to initialize the RNG on platforms that
+	 * have fast cycle counters, but don't (for now) require it to succeed.
+	 */
+	if (!crng_ready())
+		try_to_generate_entropy();
 
 	if (!crng_ready() && maxwarn > 0) {
 		maxwarn--;
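The comment added in get_random_bytes_user() describes a forward-secrecy ratchet: the generator derives its next key *before* any output can reach (or block on) the caller, so a stalled copy_to_user() never leaves the old key recoverable. A deliberately toy sketch of that ordering, with a trivial 64-bit mixer standing in for ChaCha (everything here is an assumption for illustration, not the real primitive):

```c
#include <assert.h>
#include <stdint.h>

/* Toy generator: one 64-bit key, ratcheted on every output request. */
struct toy_rng {
    uint64_t key;
};

/* Stand-in "block function" (a simple integer mixer, NOT ChaCha). */
static uint64_t toy_block(uint64_t key, uint64_t counter)
{
    uint64_t x = key ^ (counter * 0x9E3779B97F4A7C15ULL);
    x ^= x >> 31;
    x *= 0xBF58476D1CE4E5B9ULL;
    x ^= x >> 29;
    return x;
}

/* Produce one output word. The key is replaced *before* the output is
 * returned, mirroring how get_random_bytes_user() overwrites the ChaCha
 * key at chacha_state[4] before copy_to_user() can sleep. */
static uint64_t toy_generate(struct toy_rng *rng)
{
    uint64_t old_key = rng->key;
    rng->key = toy_block(old_key, 0);  /* ratchet first: old key is gone */
    return toy_block(old_key, 1);      /* then derive the caller's output */
}
```

The point of the ordering is that by the time control could leave the generator, no state sufficient to reproduce earlier outputs remains.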
+3 -3
drivers/hv/channel_mgmt.c
···
 	 * execute:
 	 *
 	 * (a) In the "normal (i.e., not resuming from hibernation)" path,
-	 *     the full barrier in smp_store_mb() guarantees that the store
+	 *     the full barrier in virt_store_mb() guarantees that the store
 	 *     is propagated to all CPUs before the add_channel_work work
 	 *     is queued. In turn, add_channel_work is queued before the
 	 *     channel's ring buffer is allocated/initialized and the
···
 	 *     recv_int_page before retrieving the channel pointer from the
 	 *     array of channels.
 	 *
-	 * (b) In the "resuming from hibernation" path, the smp_store_mb()
+	 * (b) In the "resuming from hibernation" path, the virt_store_mb()
 	 *     guarantees that the store is propagated to all CPUs before
 	 *     the VMBus connection is marked as ready for the resume event
 	 *     (cf. check_ready_for_resume_event()). The interrupt handler
 	 *     of the VMBus driver and vmbus_chan_sched() can not run before
 	 *     vmbus_bus_resume() has completed execution (cf. resume_noirq).
 	 */
-	smp_store_mb(
+	virt_store_mb(
 		vmbus_connection.channels[channel->offermsg.child_relid],
 		channel);
 }
+44 -5
drivers/hv/hv_balloon.c
···
 #include <linux/slab.h>
 #include <linux/kthread.h>
 #include <linux/completion.h>
+#include <linux/count_zeros.h>
 #include <linux/memory_hotplug.h>
 #include <linux/memory.h>
 #include <linux/notifier.h>
···
 	struct dm_status status;
 	unsigned long now = jiffies;
 	unsigned long last_post = last_post_time;
+	unsigned long num_pages_avail, num_pages_committed;
 
 	if (pressure_report_delay > 0) {
 		--pressure_report_delay;
···
 	 * num_pages_onlined) as committed to the host, otherwise it can try
 	 * asking us to balloon them out.
 	 */
-	status.num_avail = si_mem_available();
-	status.num_committed = vm_memory_committed() +
+	num_pages_avail = si_mem_available();
+	num_pages_committed = vm_memory_committed() +
 		dm->num_pages_ballooned +
 		(dm->num_pages_added > dm->num_pages_onlined ?
 		 dm->num_pages_added - dm->num_pages_onlined : 0) +
 		compute_balloon_floor();
 
-	trace_balloon_status(status.num_avail, status.num_committed,
+	trace_balloon_status(num_pages_avail, num_pages_committed,
 			     vm_memory_committed(), dm->num_pages_ballooned,
 			     dm->num_pages_added, dm->num_pages_onlined);
+
+	/* Convert numbers of pages into numbers of HV_HYP_PAGEs. */
+	status.num_avail = num_pages_avail * NR_HV_HYP_PAGES_IN_PAGE;
+	status.num_committed = num_pages_committed * NR_HV_HYP_PAGES_IN_PAGE;
+
 	/*
 	 * If our transaction ID is no longer current, just don't
 	 * send the status. This can happen if we were interrupted
···
 	}
 }
 
+static int ballooning_enabled(void)
+{
+	/*
+	 * Disable ballooning if the page size is not 4k (HV_HYP_PAGE_SIZE),
+	 * since currently it's unclear to us whether an unballoon request can
+	 * make sure all page ranges are guest page size aligned.
+	 */
+	if (PAGE_SIZE != HV_HYP_PAGE_SIZE) {
+		pr_info("Ballooning disabled because page size is not 4096 bytes\n");
+		return 0;
+	}
+
+	return 1;
+}
+
+static int hot_add_enabled(void)
+{
+	/*
+	 * Disable hot add on ARM64, because we currently rely on
+	 * memory_add_physaddr_to_nid() to get a node id of a hot add range,
+	 * however ARM64's memory_add_physaddr_to_nid() always return 0 and
+	 * DM_MEM_HOT_ADD_REQUEST doesn't have the NUMA node information for
+	 * add_memory().
+	 */
+	if (IS_ENABLED(CONFIG_ARM64)) {
+		pr_info("Memory hot add disabled on ARM64\n");
+		return 0;
+	}
+
+	return 1;
+}
+
 static int balloon_connect_vsp(struct hv_device *dev)
 {
 	struct dm_version_request version_req;
···
 	 * currently still requires the bits to be set, so we have to add code
 	 * to fail the host's hot-add and balloon up/down requests, if any.
 	 */
-	cap_msg.caps.cap_bits.balloon = 1;
-	cap_msg.caps.cap_bits.hot_add = 1;
+	cap_msg.caps.cap_bits.balloon = ballooning_enabled();
+	cap_msg.caps.cap_bits.hot_add = hot_add_enabled();
 
 	/*
 	 * Specify our alignment requirements as it relates
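The new status code converts guest page counts into 4 KiB Hyper-V page units, since the host protocol counts HV_HYP_PAGE_SIZE pages while the guest may run with larger pages (e.g. 16K/64K on ARM64). A standalone sketch of that scaling (the helper name and its page_size parameter are illustrative; the driver uses the NR_HV_HYP_PAGES_IN_PAGE constant instead):

```c
#include <assert.h>
#include <stdint.h>

/* Hyper-V's fixed page unit, independent of the guest's PAGE_SIZE. */
#define HV_HYP_PAGE_SIZE 4096UL

/* Scale a count of guest pages into a count of 4 KiB Hyper-V pages.
 * Guest page sizes are always power-of-two multiples of 4 KiB, so the
 * division is exact and the multiplication cannot lose precision. */
static uint64_t pages_to_hv_pages(uint64_t num_pages, unsigned long page_size)
{
    unsigned long nr_hv_pages_in_page = page_size / HV_HYP_PAGE_SIZE;
    return num_pages * nr_hv_pages_in_page;
}
```

With 4 KiB guest pages the factor is 1 and the conversion is a no-op, which is why the bug only showed up on non-4K configurations.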
+11
drivers/hv/hv_common.c
···
 #include <linux/panic_notifier.h>
 #include <linux/ptrace.h>
 #include <linux/slab.h>
+#include <linux/dma-map-ops.h>
 #include <asm/hyperv-tlfs.h>
 #include <asm/mshyperv.h>
···
 	return hv_extended_cap & cap_query;
 }
 EXPORT_SYMBOL_GPL(hv_query_ext_cap);
+
+void hv_setup_dma_ops(struct device *dev, bool coherent)
+{
+	/*
+	 * Hyper-V does not offer a vIOMMU in the guest
+	 * VM, so pass 0/NULL for the IOMMU settings
+	 */
+	arch_setup_dma_ops(dev, 0, 0, NULL, coherent);
+}
+EXPORT_SYMBOL_GPL(hv_setup_dma_ops);
 
 bool hv_is_hibernation_supported(void)
 {
+10 -1
drivers/hv/ring_buffer.c
···
 static u32 hv_pkt_iter_avail(const struct hv_ring_buffer_info *rbi)
 {
 	u32 priv_read_loc = rbi->priv_read_index;
-	u32 write_loc = READ_ONCE(rbi->ring_buffer->write_index);
+	u32 write_loc;
+
+	/*
+	 * The Hyper-V host writes the packet data, then uses
+	 * store_release() to update the write_index. Use load_acquire()
+	 * here to prevent loads of the packet data from being re-ordered
+	 * before the read of the write_index and potentially getting
+	 * stale data.
+	 */
+	write_loc = virt_load_acquire(&rbi->ring_buffer->write_index);
 
 	if (write_loc >= priv_read_loc)
 		return write_loc - priv_read_loc;
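The release/acquire pairing this hunk relies on (host release-stores write_index after the data, guest acquire-loads it before reading the data) can be illustrated with C11 atomics in place of the kernel's virt_store_mb()/virt_load_acquire(). The ring layout below is a simplified stand-in, not the real Hyper-V ring buffer structure:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Simplified ring: payload slots plus a published write index. */
struct ring {
    uint32_t data[16];
    _Atomic uint32_t write_index;
};

/* Producer (the host's role): write the payload first, then publish the
 * index with a release store so the data is visible to any consumer that
 * subsequently acquire-loads the index. */
static void ring_publish(struct ring *r, uint32_t slot, uint32_t value)
{
    r->data[slot] = value;
    atomic_store_explicit(&r->write_index, slot + 1, memory_order_release);
}

/* Consumer (the guest's role): acquire-load the index before touching
 * the payload, so the payload reads cannot be reordered ahead of the
 * index read and observe stale data. Returns 1 if a value was read. */
static int ring_poll(struct ring *r, uint32_t read_index, uint32_t *value)
{
    uint32_t w = atomic_load_explicit(&r->write_index, memory_order_acquire);
    if (w <= read_index)
        return 0;  /* nothing published beyond our read position yet */
    *value = r->data[read_index];
    return 1;
}
```

A plain READ_ONCE()-style relaxed load on the consumer side would permit exactly the reordering the new comment warns about on weakly ordered architectures such as ARM64.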
+54 -11
drivers/hv/vmbus_drv.c
···
 
 	/*
 	 * Hyper-V should be notified only once about a panic. If we will be
-	 * doing hyperv_report_panic_msg() later with kmsg data, don't do
-	 * the notification here.
+	 * doing hv_kmsg_dump() with kmsg data later, don't do the notification
+	 * here.
 	 */
 	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE
 	    && hyperv_report_reg()) {
···
 
 	/*
 	 * Hyper-V should be notified only once about a panic. If we will be
-	 * doing hyperv_report_panic_msg() later with kmsg data, don't do
-	 * the notification here.
+	 * doing hv_kmsg_dump() with kmsg data later, don't do the notification
+	 * here.
 	 */
 	if (hyperv_report_reg())
 		hyperv_report_panic(regs, val, true);
···
 }
 
 /*
+ * vmbus_dma_configure -- Configure DMA coherence for VMbus device
+ */
+static int vmbus_dma_configure(struct device *child_device)
+{
+	/*
+	 * On ARM64, propagate the DMA coherence setting from the top level
+	 * VMbus ACPI device to the child VMbus device being added here.
+	 * On x86/x64 coherence is assumed and these calls have no effect.
+	 */
+	hv_setup_dma_ops(child_device,
+		device_get_dma_attr(&hv_acpi_dev->dev) == DEV_DMA_COHERENT);
+	return 0;
+}
+
+/*
  * vmbus_remove - Remove a vmbus device
  */
 static void vmbus_remove(struct device *child_device)
···
 	.remove =		vmbus_remove,
 	.probe =		vmbus_probe,
 	.uevent =		vmbus_uevent,
+	.dma_configure =	vmbus_dma_configure,
 	.dev_groups =		vmbus_dev_groups,
 	.drv_groups =		vmbus_drv_groups,
 	.bus_groups =		vmbus_bus_groups,
···
 	if (ret)
 		goto err_connect;
 
+	if (hv_is_isolation_supported())
+		sysctl_record_panic_msg = 0;
+
 	/*
 	 * Only register if the crash MSRs are available
 	 */
 	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
 		u64 hyperv_crash_ctl;
 		/*
-		 * Sysctl registration is not fatal, since by default
-		 * reporting is enabled.
+		 * Panic message recording (sysctl_record_panic_msg)
+		 * is enabled by default in non-isolated guests and
+		 * disabled by default in isolated guests; the panic
+		 * message recording won't be available in isolated
+		 * guests should the following registration fail.
 		 */
 		hv_ctl_table_hdr = register_sysctl_table(hv_root_table);
 		if (!hv_ctl_table_hdr)
···
 	child_device_obj->device.parent = &hv_acpi_dev->dev;
 	child_device_obj->device.release = vmbus_device_release;
 
+	child_device_obj->device.dma_parms = &child_device_obj->dma_parms;
+	child_device_obj->device.dma_mask = &child_device_obj->dma_mask;
+	dma_set_mask(&child_device_obj->device, DMA_BIT_MASK(64));
+
 	/*
 	 * Register with the LDM. This will kick off the driver/device
 	 * binding...which will eventually call vmbus_match() and vmbus_probe()
···
 	}
 	hv_debug_add_dev_dir(child_device_obj);
 
-	child_device_obj->device.dma_parms = &child_device_obj->dma_parms;
-	child_device_obj->device.dma_mask = &child_device_obj->dma_mask;
-	dma_set_mask(&child_device_obj->device, DMA_BIT_MASK(64));
 	return 0;
 
 err_kset_unregister:
···
 	struct acpi_device *ancestor;
 
 	hv_acpi_dev = device;
+
+	/*
+	 * Older versions of Hyper-V for ARM64 fail to include the _CCA
+	 * method on the top level VMbus device in the DSDT. But devices
+	 * are hardware coherent in all current Hyper-V use cases, so fix
+	 * up the ACPI device to behave as if _CCA is present and indicates
+	 * hardware coherence.
+	 */
+	ACPI_COMPANION_SET(&device->dev, device);
+	if (IS_ENABLED(CONFIG_ACPI_CCA_REQUIRED) &&
+	    device_get_dma_attr(&device->dev) == DEV_DMA_NOT_SUPPORTED) {
+		pr_info("No ACPI _CCA found; assuming coherent device I/O\n");
+		device->flags.cca_seen = true;
+		device->flags.coherent_dma = true;
+	}
 
 	result = acpi_walk_resources(device->handle, METHOD_NAME__CRS,
 				     vmbus_walk_resources, NULL);
···
 	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
 		kmsg_dump_unregister(&hv_kmsg_dumper);
 		unregister_die_notifier(&hyperv_die_block);
-		atomic_notifier_chain_unregister(&panic_notifier_list,
-						 &hyperv_panic_block);
 	}
+
+	/*
+	 * The panic notifier is always registered, hence we should
+	 * also unconditionally unregister it here as well.
+	 */
+	atomic_notifier_chain_unregister(&panic_notifier_list,
+					 &hyperv_panic_block);
 
 	free_page((unsigned long)hv_panic_page);
 	unregister_sysctl_table(hv_ctl_table_hdr);
+7
drivers/net/ethernet/broadcom/bnxt/bnxt.c
··· 3253 3253 } 3254 3254 qidx = bp->tc_to_qidx[j]; 3255 3255 ring->queue_id = bp->q_info[qidx].queue_id; 3256 + spin_lock_init(&txr->xdp_tx_lock); 3256 3257 if (i < bp->tx_nr_rings_xdp) 3257 3258 continue; 3258 3259 if (i % bp->tx_nr_rings_per_tc == (bp->tx_nr_rings_per_tc - 1)) ··· 10339 10338 if (irq_re_init) 10340 10339 udp_tunnel_nic_reset_ntf(bp->dev); 10341 10340 10341 + if (bp->tx_nr_rings_xdp < num_possible_cpus()) { 10342 + if (!static_key_enabled(&bnxt_xdp_locking_key)) 10343 + static_branch_enable(&bnxt_xdp_locking_key); 10344 + } else if (static_key_enabled(&bnxt_xdp_locking_key)) { 10345 + static_branch_disable(&bnxt_xdp_locking_key); 10346 + } 10342 10347 set_bit(BNXT_STATE_OPEN, &bp->state); 10343 10348 bnxt_enable_int(bp); 10344 10349 /* Enable TX queues */
+4 -1
drivers/net/ethernet/broadcom/bnxt/bnxt.h
··· 593 593 #define BNXT_MAX_MTU 9500 594 594 #define BNXT_MAX_PAGE_MODE_MTU \ 595 595 ((unsigned int)PAGE_SIZE - VLAN_ETH_HLEN - NET_IP_ALIGN - \ 596 - XDP_PACKET_HEADROOM) 596 + XDP_PACKET_HEADROOM - \ 597 + SKB_DATA_ALIGN((unsigned int)sizeof(struct skb_shared_info))) 597 598 598 599 #define BNXT_MIN_PKT_SIZE 52 599 600 ··· 801 800 u32 dev_state; 802 801 803 802 struct bnxt_ring_struct tx_ring_struct; 803 + /* Synchronize simultaneous xdp_xmit on same ring */ 804 + spinlock_t xdp_tx_lock; 804 805 }; 805 806 806 807 #define BNXT_LEGACY_COAL_CMPL_PARAMS \
+12 -2
drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
··· 20 20 #include "bnxt.h" 21 21 #include "bnxt_xdp.h" 22 22 23 + DEFINE_STATIC_KEY_FALSE(bnxt_xdp_locking_key); 24 + 23 25 struct bnxt_sw_tx_bd *bnxt_xmit_bd(struct bnxt *bp, 24 26 struct bnxt_tx_ring_info *txr, 25 27 dma_addr_t mapping, u32 len) ··· 229 227 ring = smp_processor_id() % bp->tx_nr_rings_xdp; 230 228 txr = &bp->tx_ring[ring]; 231 229 230 + if (READ_ONCE(txr->dev_state) == BNXT_DEV_STATE_CLOSING) 231 + return -EINVAL; 232 + 233 + if (static_branch_unlikely(&bnxt_xdp_locking_key)) 234 + spin_lock(&txr->xdp_tx_lock); 235 + 232 236 for (i = 0; i < num_frames; i++) { 233 237 struct xdp_frame *xdp = frames[i]; 234 238 235 - if (!txr || !bnxt_tx_avail(bp, txr) || 236 - !(bp->bnapi[ring]->flags & BNXT_NAPI_FLAG_XDP)) 239 + if (!bnxt_tx_avail(bp, txr)) 237 240 break; 238 241 239 242 mapping = dma_map_single(&pdev->dev, xdp->data, xdp->len, ··· 256 249 wmb(); 257 250 bnxt_db_write(bp, &txr->tx_db, txr->tx_prod); 258 251 } 252 + 253 + if (static_branch_unlikely(&bnxt_xdp_locking_key)) 254 + spin_unlock(&txr->xdp_tx_lock); 259 255 260 256 return nxmit; 261 257 }
+2
drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
··· 10 10 #ifndef BNXT_XDP_H 11 11 #define BNXT_XDP_H 12 12 13 + DECLARE_STATIC_KEY_FALSE(bnxt_xdp_locking_key); 14 + 13 15 struct bnxt_sw_tx_bd *bnxt_xmit_bd(struct bnxt *bp, 14 16 struct bnxt_tx_ring_info *txr, 15 17 dma_addr_t mapping, u32 len);
+3 -1
drivers/net/ethernet/freescale/dpaa2/dpaa2-ptp.c
··· 167 167 base = of_iomap(node, 0); 168 168 if (!base) { 169 169 err = -ENOMEM; 170 - goto err_close; 170 + goto err_put; 171 171 } 172 172 173 173 err = fsl_mc_allocate_irqs(mc_dev); ··· 210 210 fsl_mc_free_irqs(mc_dev); 211 211 err_unmap: 212 212 iounmap(base); 213 + err_put: 214 + of_node_put(node); 213 215 err_close: 214 216 dprtc_close(mc_dev->mc_io, 0, mc_dev->mc_handle); 215 217 err_free_mcp:
+2 -2
drivers/net/ethernet/fungible/funcore/fun_dev.c
··· 586 586 /* Calculate the max QID based on SQ/CQ/doorbell counts. 587 587 * SQ/CQ doorbells alternate. 588 588 */ 589 - num_dbs = (pci_resource_len(pdev, 0) - NVME_REG_DBS) / 590 - (fdev->db_stride * 4); 589 + num_dbs = (pci_resource_len(pdev, 0) - NVME_REG_DBS) >> 590 + (2 + NVME_CAP_STRIDE(fdev->cap_reg)); 591 591 fdev->max_qid = min3(cq_count, sq_count, num_dbs / 2) - 1; 592 592 fdev->kern_end_qid = fdev->max_qid + 1; 593 593 return 0;
+1 -2
drivers/net/ethernet/intel/ice/ice.h
··· 301 301 ICE_VSI_NETDEV_REGISTERED, 302 302 ICE_VSI_UMAC_FLTR_CHANGED, 303 303 ICE_VSI_MMAC_FLTR_CHANGED, 304 - ICE_VSI_VLAN_FLTR_CHANGED, 305 304 ICE_VSI_PROMISC_CHANGED, 306 305 ICE_VSI_STATE_NBITS /* must be last */ 307 306 }; ··· 671 672 672 673 static inline bool ice_is_xdp_ena_vsi(struct ice_vsi *vsi) 673 674 { 674 - return !!vsi->xdp_prog; 675 + return !!READ_ONCE(vsi->xdp_prog); 675 676 } 676 677 677 678 static inline void ice_set_ring_xdp(struct ice_tx_ring *ring)
+40 -4
drivers/net/ethernet/intel/ice/ice_fltr.c
··· 58 58 ice_fltr_set_vlan_vsi_promisc(struct ice_hw *hw, struct ice_vsi *vsi, 59 59 u8 promisc_mask) 60 60 { 61 - return ice_set_vlan_vsi_promisc(hw, vsi->idx, promisc_mask, false); 61 + struct ice_pf *pf = hw->back; 62 + int result; 63 + 64 + result = ice_set_vlan_vsi_promisc(hw, vsi->idx, promisc_mask, false); 65 + if (result) 66 + dev_err(ice_pf_to_dev(pf), 67 + "Error setting promisc mode on VSI %i (rc=%d)\n", 68 + vsi->vsi_num, result); 69 + 70 + return result; 62 71 } 63 72 64 73 /** ··· 82 73 ice_fltr_clear_vlan_vsi_promisc(struct ice_hw *hw, struct ice_vsi *vsi, 83 74 u8 promisc_mask) 84 75 { 85 - return ice_set_vlan_vsi_promisc(hw, vsi->idx, promisc_mask, true); 76 + struct ice_pf *pf = hw->back; 77 + int result; 78 + 79 + result = ice_set_vlan_vsi_promisc(hw, vsi->idx, promisc_mask, true); 80 + if (result) 81 + dev_err(ice_pf_to_dev(pf), 82 + "Error clearing promisc mode on VSI %i (rc=%d)\n", 83 + vsi->vsi_num, result); 84 + 85 + return result; 86 86 } 87 87 88 88 /** ··· 105 87 ice_fltr_clear_vsi_promisc(struct ice_hw *hw, u16 vsi_handle, u8 promisc_mask, 106 88 u16 vid) 107 89 { 108 - return ice_clear_vsi_promisc(hw, vsi_handle, promisc_mask, vid); 90 + struct ice_pf *pf = hw->back; 91 + int result; 92 + 93 + result = ice_clear_vsi_promisc(hw, vsi_handle, promisc_mask, vid); 94 + if (result) 95 + dev_err(ice_pf_to_dev(pf), 96 + "Error clearing promisc mode on VSI %i for VID %u (rc=%d)\n", 97 + ice_get_hw_vsi_num(hw, vsi_handle), vid, result); 98 + 99 + return result; 109 100 } 110 101 111 102 /** ··· 128 101 ice_fltr_set_vsi_promisc(struct ice_hw *hw, u16 vsi_handle, u8 promisc_mask, 129 102 u16 vid) 130 103 { 131 - return ice_set_vsi_promisc(hw, vsi_handle, promisc_mask, vid); 104 + struct ice_pf *pf = hw->back; 105 + int result; 106 + 107 + result = ice_set_vsi_promisc(hw, vsi_handle, promisc_mask, vid); 108 + if (result) 109 + dev_err(ice_pf_to_dev(pf), 110 + "Error setting promisc mode on VSI %i for VID %u (rc=%d)\n", 111 + ice_get_hw_vsi_num(hw, 
vsi_handle), vid, result); 112 + 113 + return result; 132 114 } 133 115 134 116 /**
+3
drivers/net/ethernet/intel/ice/ice_lib.c
··· 1480 1480 ring->tx_tstamps = &pf->ptp.port.tx; 1481 1481 ring->dev = dev; 1482 1482 ring->count = vsi->num_tx_desc; 1483 + ring->txq_teid = ICE_INVAL_TEID; 1483 1484 if (dvm_ena) 1484 1485 ring->flags |= ICE_TX_FLAGS_RING_VLAN_L2TAG2; 1485 1486 else ··· 2984 2983 } 2985 2984 } 2986 2985 2986 + if (ice_is_vsi_dflt_vsi(pf->first_sw, vsi)) 2987 + ice_clear_dflt_vsi(pf->first_sw); 2987 2988 ice_fltr_remove_all(vsi); 2988 2989 ice_rm_vsi_lan_cfg(vsi->port_info, vsi->idx); 2989 2990 err = ice_rm_vsi_rdma_cfg(vsi->port_info, vsi->idx);
+90 -37
drivers/net/ethernet/intel/ice/ice_main.c
··· 243 243 static bool ice_vsi_fltr_changed(struct ice_vsi *vsi) 244 244 { 245 245 return test_bit(ICE_VSI_UMAC_FLTR_CHANGED, vsi->state) || 246 - test_bit(ICE_VSI_MMAC_FLTR_CHANGED, vsi->state) || 247 - test_bit(ICE_VSI_VLAN_FLTR_CHANGED, vsi->state); 246 + test_bit(ICE_VSI_MMAC_FLTR_CHANGED, vsi->state); 248 247 } 249 248 250 249 /** ··· 259 260 if (vsi->type != ICE_VSI_PF) 260 261 return 0; 261 262 262 - if (ice_vsi_has_non_zero_vlans(vsi)) 263 - status = ice_fltr_set_vlan_vsi_promisc(&vsi->back->hw, vsi, promisc_m); 264 - else 265 - status = ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx, promisc_m, 0); 263 + if (ice_vsi_has_non_zero_vlans(vsi)) { 264 + promisc_m |= (ICE_PROMISC_VLAN_RX | ICE_PROMISC_VLAN_TX); 265 + status = ice_fltr_set_vlan_vsi_promisc(&vsi->back->hw, vsi, 266 + promisc_m); 267 + } else { 268 + status = ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx, 269 + promisc_m, 0); 270 + } 271 + 266 272 return status; 267 273 } 268 274 ··· 284 280 if (vsi->type != ICE_VSI_PF) 285 281 return 0; 286 282 287 - if (ice_vsi_has_non_zero_vlans(vsi)) 288 - status = ice_fltr_clear_vlan_vsi_promisc(&vsi->back->hw, vsi, promisc_m); 289 - else 290 - status = ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx, promisc_m, 0); 283 + if (ice_vsi_has_non_zero_vlans(vsi)) { 284 + promisc_m |= (ICE_PROMISC_VLAN_RX | ICE_PROMISC_VLAN_TX); 285 + status = ice_fltr_clear_vlan_vsi_promisc(&vsi->back->hw, vsi, 286 + promisc_m); 287 + } else { 288 + status = ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx, 289 + promisc_m, 0); 290 + } 291 + 291 292 return status; 292 293 } 293 294 ··· 311 302 struct ice_pf *pf = vsi->back; 312 303 struct ice_hw *hw = &pf->hw; 313 304 u32 changed_flags = 0; 314 - u8 promisc_m; 315 305 int err; 316 306 317 307 if (!vsi->netdev) ··· 328 320 if (ice_vsi_fltr_changed(vsi)) { 329 321 clear_bit(ICE_VSI_UMAC_FLTR_CHANGED, vsi->state); 330 322 clear_bit(ICE_VSI_MMAC_FLTR_CHANGED, vsi->state); 331 - clear_bit(ICE_VSI_VLAN_FLTR_CHANGED, 
vsi->state); 332 323 333 324 /* grab the netdev's addr_list_lock */ 334 325 netif_addr_lock_bh(netdev); ··· 376 369 /* check for changes in promiscuous modes */ 377 370 if (changed_flags & IFF_ALLMULTI) { 378 371 if (vsi->current_netdev_flags & IFF_ALLMULTI) { 379 - if (ice_vsi_has_non_zero_vlans(vsi)) 380 - promisc_m = ICE_MCAST_VLAN_PROMISC_BITS; 381 - else 382 - promisc_m = ICE_MCAST_PROMISC_BITS; 383 - 384 - err = ice_set_promisc(vsi, promisc_m); 372 + err = ice_set_promisc(vsi, ICE_MCAST_PROMISC_BITS); 385 373 if (err) { 386 - netdev_err(netdev, "Error setting Multicast promiscuous mode on VSI %i\n", 387 - vsi->vsi_num); 388 374 vsi->current_netdev_flags &= ~IFF_ALLMULTI; 389 375 goto out_promisc; 390 376 } 391 377 } else { 392 378 /* !(vsi->current_netdev_flags & IFF_ALLMULTI) */ 393 - if (ice_vsi_has_non_zero_vlans(vsi)) 394 - promisc_m = ICE_MCAST_VLAN_PROMISC_BITS; 395 - else 396 - promisc_m = ICE_MCAST_PROMISC_BITS; 397 - 398 - err = ice_clear_promisc(vsi, promisc_m); 379 + err = ice_clear_promisc(vsi, ICE_MCAST_PROMISC_BITS); 399 380 if (err) { 400 - netdev_err(netdev, "Error clearing Multicast promiscuous mode on VSI %i\n", 401 - vsi->vsi_num); 402 381 vsi->current_netdev_flags |= IFF_ALLMULTI; 403 382 goto out_promisc; 404 383 } ··· 2562 2569 spin_lock_init(&xdp_ring->tx_lock); 2563 2570 for (j = 0; j < xdp_ring->count; j++) { 2564 2571 tx_desc = ICE_TX_DESC(xdp_ring, j); 2565 - tx_desc->cmd_type_offset_bsz = cpu_to_le64(ICE_TX_DESC_DTYPE_DESC_DONE); 2572 + tx_desc->cmd_type_offset_bsz = 0; 2566 2573 } 2567 2574 } 2568 2575 ··· 2758 2765 2759 2766 ice_for_each_xdp_txq(vsi, i) 2760 2767 if (vsi->xdp_rings[i]) { 2761 - if (vsi->xdp_rings[i]->desc) 2768 + if (vsi->xdp_rings[i]->desc) { 2769 + synchronize_rcu(); 2762 2770 ice_free_tx_ring(vsi->xdp_rings[i]); 2771 + } 2763 2772 kfree_rcu(vsi->xdp_rings[i], rcu); 2764 2773 vsi->xdp_rings[i] = NULL; 2765 2774 } ··· 3483 3488 if (!vid) 3484 3489 return 0; 3485 3490 3491 + while (test_and_set_bit(ICE_CFG_BUSY, 
vsi->state)) 3492 + usleep_range(1000, 2000); 3493 + 3494 + /* Add multicast promisc rule for the VLAN ID to be added if 3495 + * all-multicast is currently enabled. 3496 + */ 3497 + if (vsi->current_netdev_flags & IFF_ALLMULTI) { 3498 + ret = ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx, 3499 + ICE_MCAST_VLAN_PROMISC_BITS, 3500 + vid); 3501 + if (ret) 3502 + goto finish; 3503 + } 3504 + 3486 3505 vlan_ops = ice_get_compat_vsi_vlan_ops(vsi); 3487 3506 3488 3507 /* Add a switch rule for this VLAN ID so its corresponding VLAN tagged ··· 3504 3495 */ 3505 3496 vlan = ICE_VLAN(be16_to_cpu(proto), vid, 0); 3506 3497 ret = vlan_ops->add_vlan(vsi, &vlan); 3507 - if (!ret) 3508 - set_bit(ICE_VSI_VLAN_FLTR_CHANGED, vsi->state); 3498 + if (ret) 3499 + goto finish; 3500 + 3501 + /* If all-multicast is currently enabled and this VLAN ID is only one 3502 + * besides VLAN-0 we have to update look-up type of multicast promisc 3503 + * rule for VLAN-0 from ICE_SW_LKUP_PROMISC to ICE_SW_LKUP_PROMISC_VLAN. 
3504 + */ 3505 + if ((vsi->current_netdev_flags & IFF_ALLMULTI) && 3506 + ice_vsi_num_non_zero_vlans(vsi) == 1) { 3507 + ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx, 3508 + ICE_MCAST_PROMISC_BITS, 0); 3509 + ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx, 3510 + ICE_MCAST_VLAN_PROMISC_BITS, 0); 3511 + } 3512 + 3513 + finish: 3514 + clear_bit(ICE_CFG_BUSY, vsi->state); 3509 3515 3510 3516 return ret; 3511 3517 } ··· 3546 3522 if (!vid) 3547 3523 return 0; 3548 3524 3525 + while (test_and_set_bit(ICE_CFG_BUSY, vsi->state)) 3526 + usleep_range(1000, 2000); 3527 + 3549 3528 vlan_ops = ice_get_compat_vsi_vlan_ops(vsi); 3550 3529 3551 3530 /* Make sure VLAN delete is successful before updating VLAN ··· 3557 3530 vlan = ICE_VLAN(be16_to_cpu(proto), vid, 0); 3558 3531 ret = vlan_ops->del_vlan(vsi, &vlan); 3559 3532 if (ret) 3560 - return ret; 3533 + goto finish; 3561 3534 3562 - set_bit(ICE_VSI_VLAN_FLTR_CHANGED, vsi->state); 3563 - return 0; 3535 + /* Remove multicast promisc rule for the removed VLAN ID if 3536 + * all-multicast is enabled. 3537 + */ 3538 + if (vsi->current_netdev_flags & IFF_ALLMULTI) 3539 + ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx, 3540 + ICE_MCAST_VLAN_PROMISC_BITS, vid); 3541 + 3542 + if (!ice_vsi_has_non_zero_vlans(vsi)) { 3543 + /* Update look-up type of multicast promisc rule for VLAN 0 3544 + * from ICE_SW_LKUP_PROMISC_VLAN to ICE_SW_LKUP_PROMISC when 3545 + * all-multicast is enabled and VLAN 0 is the only VLAN rule. 3546 + */ 3547 + if (vsi->current_netdev_flags & IFF_ALLMULTI) { 3548 + ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx, 3549 + ICE_MCAST_VLAN_PROMISC_BITS, 3550 + 0); 3551 + ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx, 3552 + ICE_MCAST_PROMISC_BITS, 0); 3553 + } 3554 + } 3555 + 3556 + finish: 3557 + clear_bit(ICE_CFG_BUSY, vsi->state); 3558 + 3559 + return ret; 3564 3560 } 3565 3561 3566 3562 /** ··· 5525 5475 5526 5476 /* Add filter for new MAC. 
If filter exists, return success */ 5527 5477 err = ice_fltr_add_mac(vsi, mac, ICE_FWD_TO_VSI); 5528 - if (err == -EEXIST) 5478 + if (err == -EEXIST) { 5529 5479 /* Although this MAC filter is already present in hardware it's 5530 5480 * possible in some cases (e.g. bonding) that dev_addr was 5531 5481 * modified outside of the driver and needs to be restored back 5532 5482 * to this value. 5533 5483 */ 5534 5484 netdev_dbg(netdev, "filter for MAC %pM already exists\n", mac); 5535 - else if (err) 5485 + 5486 + return 0; 5487 + } else if (err) { 5536 5488 /* error if the new filter addition failed */ 5537 5489 err = -EADDRNOTAVAIL; 5490 + } 5538 5491 5539 5492 err_update_filters: 5540 5493 if (err) {
+2 -2
drivers/net/ethernet/intel/ice/ice_virtchnl.c
··· 1358 1358 goto error_param; 1359 1359 } 1360 1360 1361 - /* Skip queue if not enabled */ 1362 1361 if (!test_bit(vf_q_id, vf->txq_ena)) 1363 - continue; 1362 + dev_dbg(ice_pf_to_dev(vsi->back), "Queue %u on VSI %u is not enabled, but stopping it anyway\n", 1363 + vf_q_id, vsi->vsi_num); 1364 1364 1365 1365 ice_fill_txq_meta(vsi, ring, &txq_meta); 1366 1366
+4 -2
drivers/net/ethernet/intel/ice/ice_xsk.c
··· 41 41 static void ice_qp_clean_rings(struct ice_vsi *vsi, u16 q_idx) 42 42 { 43 43 ice_clean_tx_ring(vsi->tx_rings[q_idx]); 44 - if (ice_is_xdp_ena_vsi(vsi)) 44 + if (ice_is_xdp_ena_vsi(vsi)) { 45 + synchronize_rcu(); 45 46 ice_clean_tx_ring(vsi->xdp_rings[q_idx]); 47 + } 46 48 ice_clean_rx_ring(vsi->rx_rings[q_idx]); 47 49 } 48 50 ··· 920 918 struct ice_vsi *vsi = np->vsi; 921 919 struct ice_tx_ring *ring; 922 920 923 - if (test_bit(ICE_DOWN, vsi->state)) 921 + if (test_bit(ICE_VSI_DOWN, vsi->state)) 924 922 return -ENETDOWN; 925 923 926 924 if (!ice_is_xdp_ena_vsi(vsi))
+1 -1
drivers/net/ethernet/marvell/mv643xx_eth.c
··· 2751 2751 } 2752 2752 2753 2753 ret = of_get_mac_address(pnp, ppd.mac_addr); 2754 - if (ret) 2754 + if (ret == -EPROBE_DEFER) 2755 2755 return ret; 2756 2756 2757 2757 mv643xx_eth_property(pnp, "tx-queue-size", ppd.tx_queue_size);
+2
drivers/net/ethernet/micrel/Kconfig
··· 28 28 config KS8851 29 29 tristate "Micrel KS8851 SPI" 30 30 depends on SPI 31 + depends on PTP_1588_CLOCK_OPTIONAL 31 32 select MII 32 33 select CRC32 33 34 select EEPROM_93CX6 ··· 40 39 config KS8851_MLL 41 40 tristate "Micrel KS8851 MLL" 42 41 depends on HAS_IOMEM 42 + depends on PTP_1588_CLOCK_OPTIONAL 43 43 select MII 44 44 select CRC32 45 45 select EEPROM_93CX6
+2 -4
drivers/net/ethernet/myricom/myri10ge/myri10ge.c
··· 2903 2903 status = myri10ge_xmit(curr, dev); 2904 2904 if (status != 0) { 2905 2905 dev_kfree_skb_any(curr); 2906 - if (segs != NULL) { 2907 - curr = segs; 2908 - segs = next; 2906 + skb_list_walk_safe(next, curr, next) { 2909 2907 curr->next = NULL; 2910 - dev_kfree_skb_any(segs); 2908 + dev_kfree_skb_any(curr); 2911 2909 } 2912 2910 goto drop; 2913 2911 }
+1 -1
drivers/net/ethernet/qlogic/qed/qed_debug.c
··· 489 489 490 490 #define STATIC_DEBUG_LINE_DWORDS 9 491 491 492 - #define NUM_COMMON_GLOBAL_PARAMS 11 492 + #define NUM_COMMON_GLOBAL_PARAMS 10 493 493 494 494 #define MAX_RECURSION_DEPTH 10 495 495
+3
drivers/net/ethernet/qlogic/qede/qede_fp.c
··· 748 748 buf = page_address(bd->data) + bd->page_offset; 749 749 skb = build_skb(buf, rxq->rx_buf_seg_size); 750 750 751 + if (unlikely(!skb)) 752 + return NULL; 753 + 751 754 skb_reserve(skb, pad); 752 755 skb_put(skb, len); 753 756
+82 -66
drivers/net/ethernet/sfc/efx_channels.c
··· 786 786 kfree(efx->xdp_tx_queues); 787 787 } 788 788 789 + static int efx_set_xdp_tx_queue(struct efx_nic *efx, int xdp_queue_number, 790 + struct efx_tx_queue *tx_queue) 791 + { 792 + if (xdp_queue_number >= efx->xdp_tx_queue_count) 793 + return -EINVAL; 794 + 795 + netif_dbg(efx, drv, efx->net_dev, 796 + "Channel %u TXQ %u is XDP %u, HW %u\n", 797 + tx_queue->channel->channel, tx_queue->label, 798 + xdp_queue_number, tx_queue->queue); 799 + efx->xdp_tx_queues[xdp_queue_number] = tx_queue; 800 + return 0; 801 + } 802 + 803 + static void efx_set_xdp_channels(struct efx_nic *efx) 804 + { 805 + struct efx_tx_queue *tx_queue; 806 + struct efx_channel *channel; 807 + unsigned int next_queue = 0; 808 + int xdp_queue_number = 0; 809 + int rc; 810 + 811 + /* We need to mark which channels really have RX and TX 812 + * queues, and adjust the TX queue numbers if we have separate 813 + * RX-only and TX-only channels. 814 + */ 815 + efx_for_each_channel(channel, efx) { 816 + if (channel->channel < efx->tx_channel_offset) 817 + continue; 818 + 819 + if (efx_channel_is_xdp_tx(channel)) { 820 + efx_for_each_channel_tx_queue(tx_queue, channel) { 821 + tx_queue->queue = next_queue++; 822 + rc = efx_set_xdp_tx_queue(efx, xdp_queue_number, 823 + tx_queue); 824 + if (rc == 0) 825 + xdp_queue_number++; 826 + } 827 + } else { 828 + efx_for_each_channel_tx_queue(tx_queue, channel) { 829 + tx_queue->queue = next_queue++; 830 + netif_dbg(efx, drv, efx->net_dev, 831 + "Channel %u TXQ %u is HW %u\n", 832 + channel->channel, tx_queue->label, 833 + tx_queue->queue); 834 + } 835 + 836 + /* If XDP is borrowing queues from net stack, it must 837 + * use the queue with no csum offload, which is the 838 + * first one of the channel 839 + * (note: tx_queue_by_type is not initialized yet) 840 + */ 841 + if (efx->xdp_txq_queues_mode == 842 + EFX_XDP_TX_QUEUES_BORROWED) { 843 + tx_queue = &channel->tx_queue[0]; 844 + rc = efx_set_xdp_tx_queue(efx, xdp_queue_number, 845 + tx_queue); 846 + if (rc == 
0) 847 + xdp_queue_number++; 848 + } 849 + } 850 + } 851 + WARN_ON(efx->xdp_txq_queues_mode == EFX_XDP_TX_QUEUES_DEDICATED && 852 + xdp_queue_number != efx->xdp_tx_queue_count); 853 + WARN_ON(efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DEDICATED && 854 + xdp_queue_number > efx->xdp_tx_queue_count); 855 + 856 + /* If we have more CPUs than assigned XDP TX queues, assign the already 857 + * existing queues to the exceeding CPUs 858 + */ 859 + next_queue = 0; 860 + while (xdp_queue_number < efx->xdp_tx_queue_count) { 861 + tx_queue = efx->xdp_tx_queues[next_queue++]; 862 + rc = efx_set_xdp_tx_queue(efx, xdp_queue_number, tx_queue); 863 + if (rc == 0) 864 + xdp_queue_number++; 865 + } 866 + } 867 + 789 868 int efx_realloc_channels(struct efx_nic *efx, u32 rxq_entries, u32 txq_entries) 790 869 { 791 870 struct efx_channel *other_channel[EFX_MAX_CHANNELS], *channel; ··· 936 857 efx_init_napi_channel(efx->channel[i]); 937 858 } 938 859 860 + efx_set_xdp_channels(efx); 939 861 out: 940 862 /* Destroy unused channel structures */ 941 863 for (i = 0; i < efx->n_channels; i++) { ··· 969 889 goto out; 970 890 } 971 891 972 - static inline int 973 - efx_set_xdp_tx_queue(struct efx_nic *efx, int xdp_queue_number, 974 - struct efx_tx_queue *tx_queue) 975 - { 976 - if (xdp_queue_number >= efx->xdp_tx_queue_count) 977 - return -EINVAL; 978 - 979 - netif_dbg(efx, drv, efx->net_dev, "Channel %u TXQ %u is XDP %u, HW %u\n", 980 - tx_queue->channel->channel, tx_queue->label, 981 - xdp_queue_number, tx_queue->queue); 982 - efx->xdp_tx_queues[xdp_queue_number] = tx_queue; 983 - return 0; 984 - } 985 - 986 892 int efx_set_channels(struct efx_nic *efx) 987 893 { 988 - struct efx_tx_queue *tx_queue; 989 894 struct efx_channel *channel; 990 - unsigned int next_queue = 0; 991 - int xdp_queue_number; 992 895 int rc; 993 896 994 897 efx->tx_channel_offset = ··· 989 926 return -ENOMEM; 990 927 } 991 928 992 - /* We need to mark which channels really have RX and TX 993 - * queues, and adjust the 
TX queue numbers if we have separate 994 - * RX-only and TX-only channels. 995 - */ 996 - xdp_queue_number = 0; 997 929 efx_for_each_channel(channel, efx) { 998 930 if (channel->channel < efx->n_rx_channels) 999 931 channel->rx_queue.core_index = channel->channel; 1000 932 else 1001 933 channel->rx_queue.core_index = -1; 1002 - 1003 - if (channel->channel >= efx->tx_channel_offset) { 1004 - if (efx_channel_is_xdp_tx(channel)) { 1005 - efx_for_each_channel_tx_queue(tx_queue, channel) { 1006 - tx_queue->queue = next_queue++; 1007 - rc = efx_set_xdp_tx_queue(efx, xdp_queue_number, tx_queue); 1008 - if (rc == 0) 1009 - xdp_queue_number++; 1010 - } 1011 - } else { 1012 - efx_for_each_channel_tx_queue(tx_queue, channel) { 1013 - tx_queue->queue = next_queue++; 1014 - netif_dbg(efx, drv, efx->net_dev, "Channel %u TXQ %u is HW %u\n", 1015 - channel->channel, tx_queue->label, 1016 - tx_queue->queue); 1017 - } 1018 - 1019 - /* If XDP is borrowing queues from net stack, it must use the queue 1020 - * with no csum offload, which is the first one of the channel 1021 - * (note: channel->tx_queue_by_type is not initialized yet) 1022 - */ 1023 - if (efx->xdp_txq_queues_mode == EFX_XDP_TX_QUEUES_BORROWED) { 1024 - tx_queue = &channel->tx_queue[0]; 1025 - rc = efx_set_xdp_tx_queue(efx, xdp_queue_number, tx_queue); 1026 - if (rc == 0) 1027 - xdp_queue_number++; 1028 - } 1029 - } 1030 - } 1031 934 } 1032 - WARN_ON(efx->xdp_txq_queues_mode == EFX_XDP_TX_QUEUES_DEDICATED && 1033 - xdp_queue_number != efx->xdp_tx_queue_count); 1034 - WARN_ON(efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DEDICATED && 1035 - xdp_queue_number > efx->xdp_tx_queue_count); 1036 935 1037 - /* If we have more CPUs than assigned XDP TX queues, assign the already 1038 - * existing queues to the exceeding CPUs 1039 - */ 1040 - next_queue = 0; 1041 - while (xdp_queue_number < efx->xdp_tx_queue_count) { 1042 - tx_queue = efx->xdp_tx_queues[next_queue++]; 1043 - rc = efx_set_xdp_tx_queue(efx, xdp_queue_number, 
tx_queue); 1044 - if (rc == 0) 1045 - xdp_queue_number++; 1046 - } 936 + efx_set_xdp_channels(efx); 1047 937 1048 938 rc = netif_set_real_num_tx_queues(efx->net_dev, efx->n_tx_channels); 1049 939 if (rc) ··· 1140 1124 struct efx_rx_queue *rx_queue; 1141 1125 struct efx_channel *channel; 1142 1126 1143 - efx_for_each_channel(channel, efx) { 1127 + efx_for_each_channel_rev(channel, efx) { 1144 1128 efx_for_each_channel_tx_queue(tx_queue, channel) { 1145 1129 efx_init_tx_queue(tx_queue); 1146 1130 atomic_inc(&efx->active_queues);
+3
drivers/net/ethernet/sfc/rx_common.c
··· 150 150 struct efx_nic *efx = rx_queue->efx; 151 151 int i; 152 152 153 + if (unlikely(!rx_queue->page_ring)) 154 + return; 155 + 153 156 /* Unmap and release the pages in the recycle ring. Remove the ring. */ 154 157 for (i = 0; i <= rx_queue->page_ptr_mask; i++) { 155 158 struct page *page = rx_queue->page_ring[i];
+3
drivers/net/ethernet/sfc/tx.c
··· 443 443 if (unlikely(!tx_queue)) 444 444 return -EINVAL; 445 445 446 + if (!tx_queue->initialised) 447 + return -EINVAL; 448 + 446 449 if (efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DEDICATED) 447 450 HARD_TX_LOCK(efx->net_dev, tx_queue->core_txq, cpu); 448 451
+2
drivers/net/ethernet/sfc/tx_common.c
··· 101 101 netif_dbg(tx_queue->efx, drv, tx_queue->efx->net_dev, 102 102 "shutting down TX queue %d\n", tx_queue->queue); 103 103 104 + tx_queue->initialised = false; 105 + 104 106 if (!tx_queue->buffer) 105 107 return; 106 108
+1 -1
drivers/net/ethernet/stmicro/stmmac/dwmac-loongson.c
··· 205 205 }; 206 206 MODULE_DEVICE_TABLE(pci, loongson_dwmac_id_table); 207 207 208 - struct pci_driver loongson_dwmac_driver = { 208 + static struct pci_driver loongson_dwmac_driver = { 209 209 .name = "dwmac-loongson-pci", 210 210 .id_table = loongson_dwmac_id_table, 211 211 .probe = loongson_dwmac_probe,
+1 -2
drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
··· 431 431 plat->phylink_node = np; 432 432 433 433 /* Get max speed of operation from device tree */ 434 - if (of_property_read_u32(np, "max-speed", &plat->max_speed)) 435 - plat->max_speed = -1; 434 + of_property_read_u32(np, "max-speed", &plat->max_speed); 436 435 437 436 plat->bus_id = of_alias_get_id(np, "ethernet"); 438 437 if (plat->bus_id < 0)
-2
drivers/net/ethernet/xilinx/xilinx_axienet.h
··· 433 433 struct net_device *ndev; 434 434 struct device *dev; 435 435 436 - struct device_node *phy_node; 437 - 438 436 struct phylink *phylink; 439 437 struct phylink_config phylink_config; 440 438
+18 -15
drivers/net/ethernet/xilinx/xilinx_axienet_main.c
··· 2064 2064 if (ret) 2065 2065 goto cleanup_clk; 2066 2066 2067 - lp->phy_node = of_parse_phandle(pdev->dev.of_node, "phy-handle", 0); 2068 - if (lp->phy_node) { 2069 - ret = axienet_mdio_setup(lp); 2070 - if (ret) 2071 - dev_warn(&pdev->dev, 2072 - "error registering MDIO bus: %d\n", ret); 2073 - } 2067 + ret = axienet_mdio_setup(lp); 2068 + if (ret) 2069 + dev_warn(&pdev->dev, 2070 + "error registering MDIO bus: %d\n", ret); 2071 + 2074 2072 if (lp->phy_mode == PHY_INTERFACE_MODE_SGMII || 2075 2073 lp->phy_mode == PHY_INTERFACE_MODE_1000BASEX) { 2076 - if (!lp->phy_node) { 2077 - dev_err(&pdev->dev, "phy-handle required for 1000BaseX/SGMII\n"); 2074 + np = of_parse_phandle(pdev->dev.of_node, "pcs-handle", 0); 2075 + if (!np) { 2076 + /* Deprecated: Always use "pcs-handle" for pcs_phy. 2077 + * Falling back to "phy-handle" here is only for 2078 + * backward compatibility with old device trees. 2079 + */ 2080 + np = of_parse_phandle(pdev->dev.of_node, "phy-handle", 0); 2081 + } 2082 + if (!np) { 2083 + dev_err(&pdev->dev, "pcs-handle (preferred) or phy-handle required for 1000BaseX/SGMII\n"); 2078 2084 ret = -EINVAL; 2079 2085 goto cleanup_mdio; 2080 2086 } 2081 - lp->pcs_phy = of_mdio_find_device(lp->phy_node); 2087 + lp->pcs_phy = of_mdio_find_device(np); 2082 2088 if (!lp->pcs_phy) { 2083 2089 ret = -EPROBE_DEFER; 2090 + of_node_put(np); 2084 2091 goto cleanup_mdio; 2085 2092 } 2093 + of_node_put(np); 2086 2094 lp->pcs.ops = &axienet_pcs_ops; 2087 2095 lp->pcs.poll = true; 2088 2096 } ··· 2133 2125 put_device(&lp->pcs_phy->dev); 2134 2126 if (lp->mii_bus) 2135 2127 axienet_mdio_teardown(lp); 2136 - of_node_put(lp->phy_node); 2137 - 2138 2128 cleanup_clk: 2139 2129 clk_bulk_disable_unprepare(XAE_NUM_MISC_CLOCKS, lp->misc_clks); 2140 2130 clk_disable_unprepare(lp->axi_clk); ··· 2160 2154 2161 2155 clk_bulk_disable_unprepare(XAE_NUM_MISC_CLOCKS, lp->misc_clks); 2162 2156 clk_disable_unprepare(lp->axi_clk); 2163 - 2164 - of_node_put(lp->phy_node); 2165 - 
lp->phy_node = NULL; 2166 2157 2167 2158 free_netdev(ndev); 2168 2159
+1 -1
drivers/net/mctp/mctp-i2c.c
··· 553 553 hdr->source_slave = ((llsrc << 1) & 0xff) | 0x01; 554 554 mhdr->ver = 0x01; 555 555 556 - return 0; 556 + return sizeof(struct mctp_i2c_hdr); 557 557 } 558 558 559 559 static int mctp_i2c_tx_thread(void *data)
+6
drivers/net/mdio/mdio-mscc-miim.c
··· 107 107 u32 val; 108 108 int ret; 109 109 110 + if (regnum & MII_ADDR_C45) 111 + return -EOPNOTSUPP; 112 + 110 113 ret = mscc_miim_wait_pending(bus); 111 114 if (ret) 112 115 goto out; ··· 152 149 { 153 150 struct mscc_miim_dev *miim = bus->priv; 154 151 int ret; 152 + 153 + if (regnum & MII_ADDR_C45) 154 + return -EOPNOTSUPP; 155 155 156 156 ret = mscc_miim_wait_pending(bus); 157 157 if (ret < 0)
+2 -104
drivers/net/phy/micrel.c
··· 99 99 #define PTP_TIMESTAMP_EN_PDREQ_ BIT(2) 100 100 #define PTP_TIMESTAMP_EN_PDRES_ BIT(3) 101 101 102 - #define PTP_RX_LATENCY_1000 0x0224 103 - #define PTP_TX_LATENCY_1000 0x0225 104 - 105 - #define PTP_RX_LATENCY_100 0x0222 106 - #define PTP_TX_LATENCY_100 0x0223 107 - 108 - #define PTP_RX_LATENCY_10 0x0220 109 - #define PTP_TX_LATENCY_10 0x0221 110 - 111 102 #define PTP_TX_PARSE_L2_ADDR_EN 0x0284 112 103 #define PTP_RX_PARSE_L2_ADDR_EN 0x0244 113 104 ··· 259 268 u16 seq_id; 260 269 }; 261 270 262 - struct kszphy_latencies { 263 - u16 rx_10; 264 - u16 tx_10; 265 - u16 rx_100; 266 - u16 tx_100; 267 - u16 rx_1000; 268 - u16 tx_1000; 269 - }; 270 - 271 271 struct kszphy_ptp_priv { 272 272 struct mii_timestamper mii_ts; 273 273 struct phy_device *phydev; ··· 278 296 279 297 struct kszphy_priv { 280 298 struct kszphy_ptp_priv ptp_priv; 281 - struct kszphy_latencies latencies; 282 299 const struct kszphy_type *type; 283 300 int led_mode; 284 301 bool rmii_ref_clk_sel; ··· 285 304 u64 stats[ARRAY_SIZE(kszphy_hw_stats)]; 286 305 }; 287 306 288 - static struct kszphy_latencies lan8814_latencies = { 289 - .rx_10 = 0x22AA, 290 - .tx_10 = 0x2E4A, 291 - .rx_100 = 0x092A, 292 - .tx_100 = 0x02C1, 293 - .rx_1000 = 0x01AD, 294 - .tx_1000 = 0x00C9, 295 - }; 296 307 static const struct kszphy_type ksz8021_type = { 297 308 .led_mode_reg = MII_KSZPHY_CTRL_2, 298 309 .has_broadcast_disable = true, ··· 2591 2618 return 0; 2592 2619 } 2593 2620 2594 - static int lan8814_read_status(struct phy_device *phydev) 2595 - { 2596 - struct kszphy_priv *priv = phydev->priv; 2597 - struct kszphy_latencies *latencies = &priv->latencies; 2598 - int err; 2599 - int regval; 2600 - 2601 - err = genphy_read_status(phydev); 2602 - if (err) 2603 - return err; 2604 - 2605 - switch (phydev->speed) { 2606 - case SPEED_1000: 2607 - lanphy_write_page_reg(phydev, 5, PTP_RX_LATENCY_1000, 2608 - latencies->rx_1000); 2609 - lanphy_write_page_reg(phydev, 5, PTP_TX_LATENCY_1000, 2610 - latencies->tx_1000); 
2611 - break; 2612 - case SPEED_100: 2613 - lanphy_write_page_reg(phydev, 5, PTP_RX_LATENCY_100, 2614 - latencies->rx_100); 2615 - lanphy_write_page_reg(phydev, 5, PTP_TX_LATENCY_100, 2616 - latencies->tx_100); 2617 - break; 2618 - case SPEED_10: 2619 - lanphy_write_page_reg(phydev, 5, PTP_RX_LATENCY_10, 2620 - latencies->rx_10); 2621 - lanphy_write_page_reg(phydev, 5, PTP_TX_LATENCY_10, 2622 - latencies->tx_10); 2623 - break; 2624 - default: 2625 - break; 2626 - } 2627 - 2628 - /* Make sure the PHY is not broken. Read idle error count, 2629 - * and reset the PHY if it is maxed out. 2630 - */ 2631 - regval = phy_read(phydev, MII_STAT1000); 2632 - if ((regval & 0xFF) == 0xFF) { 2633 - phy_init_hw(phydev); 2634 - phydev->link = 0; 2635 - if (phydev->drv->config_intr && phy_interrupt_is_valid(phydev)) 2636 - phydev->drv->config_intr(phydev); 2637 - return genphy_config_aneg(phydev); 2638 - } 2639 - 2640 - return 0; 2641 - } 2642 - 2643 2621 static int lan8814_config_init(struct phy_device *phydev) 2644 2622 { 2645 2623 int val; ··· 2614 2690 return 0; 2615 2691 } 2616 2692 2617 - static void lan8814_parse_latency(struct phy_device *phydev) 2618 - { 2619 - const struct device_node *np = phydev->mdio.dev.of_node; 2620 - struct kszphy_priv *priv = phydev->priv; 2621 - struct kszphy_latencies *latency = &priv->latencies; 2622 - u32 val; 2623 - 2624 - if (!of_property_read_u32(np, "lan8814,latency_rx_10", &val)) 2625 - latency->rx_10 = val; 2626 - if (!of_property_read_u32(np, "lan8814,latency_tx_10", &val)) 2627 - latency->tx_10 = val; 2628 - if (!of_property_read_u32(np, "lan8814,latency_rx_100", &val)) 2629 - latency->rx_100 = val; 2630 - if (!of_property_read_u32(np, "lan8814,latency_tx_100", &val)) 2631 - latency->tx_100 = val; 2632 - if (!of_property_read_u32(np, "lan8814,latency_rx_1000", &val)) 2633 - latency->rx_1000 = val; 2634 - if (!of_property_read_u32(np, "lan8814,latency_tx_1000", &val)) 2635 - latency->tx_1000 = val; 2636 - } 2637 - 2638 2693 static int 
lan8814_probe(struct phy_device *phydev) 2639 2694 { 2640 - const struct device_node *np = phydev->mdio.dev.of_node; 2641 2695 struct kszphy_priv *priv; 2642 2696 u16 addr; 2643 2697 int err; ··· 2626 2724 2627 2725 priv->led_mode = -1; 2628 2726 2629 - priv->latencies = lan8814_latencies; 2630 - 2631 2727 phydev->priv = priv; 2632 2728 2633 2729 if (!IS_ENABLED(CONFIG_PTP_1588_CLOCK) || 2634 - !IS_ENABLED(CONFIG_NETWORK_PHY_TIMESTAMPING) || 2635 - of_property_read_bool(np, "lan8814,ignore-ts")) 2730 + !IS_ENABLED(CONFIG_NETWORK_PHY_TIMESTAMPING)) 2636 2731 return 0; 2637 2732 2638 2733 /* Strap-in value for PHY address, below register read gives starting ··· 2645 2746 return err; 2646 2747 } 2647 2748 2648 - lan8814_parse_latency(phydev); 2649 2749 lan8814_ptp_init(phydev); 2650 2750 2651 2751 return 0; ··· 2826 2928 .config_init = lan8814_config_init, 2827 2929 .probe = lan8814_probe, 2828 2930 .soft_reset = genphy_soft_reset, 2829 - .read_status = lan8814_read_status, 2931 + .read_status = ksz9031_read_status, 2830 2932 .get_sset_count = kszphy_get_sset_count, 2831 2933 .get_strings = kszphy_get_strings, 2832 2934 .get_stats = kszphy_get_stats,
+1 -1
drivers/net/slip/slip.c
··· 469 469 spin_lock(&sl->lock); 470 470 471 471 if (netif_queue_stopped(dev)) { 472 - if (!netif_running(dev)) 472 + if (!netif_running(dev) || !sl->tty) 473 473 goto out; 474 474 475 475 /* May be we must check transmitter timeout here ?
+7 -2
drivers/net/usb/aqc111.c
··· 1102 1102 if (start_of_descs != desc_offset) 1103 1103 goto err; 1104 1104 1105 - /* self check desc_offset from header*/ 1106 - if (desc_offset >= skb_len) 1105 + /* self check desc_offset from header and make sure that the 1106 + * bounds of the metadata array are inside the SKB 1107 + */ 1108 + if (pkt_count * 2 + desc_offset >= skb_len) 1107 1109 goto err; 1110 + 1111 + /* Packets must not overlap the metadata array */ 1112 + skb_trim(skb, desc_offset); 1108 1113 1109 1114 if (pkt_count == 0) 1110 1115 goto err;
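The aqc111 change above tightens receive-path validation: the descriptor offset alone is no longer enough, the whole packet-metadata array (two bytes per packet) must fit inside the received buffer, and the payload is trimmed so packets cannot overlap it. A standalone restatement of that bounds check (helper name is ours, not the driver's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical restatement of the aqc111 check above: each received
 * packet has a 2-byte descriptor, and the descriptor array starting at
 * desc_offset must lie entirely within the buffer of length skb_len. */
static bool rx_metadata_in_bounds(size_t pkt_count, size_t desc_offset,
                                  size_t skb_len)
{
	return pkt_count * 2 + desc_offset < skb_len;
}
```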
+11 -4
drivers/net/vrf.c
··· 1265 1265 eth = (struct ethhdr *)skb->data; 1266 1266 1267 1267 skb_reset_mac_header(skb); 1268 + skb_reset_mac_len(skb); 1268 1269 1269 1270 /* we set the ethernet destination and the source addresses to the 1270 1271 * address of the VRF device. ··· 1295 1294 */ 1296 1295 static int vrf_add_mac_header_if_unset(struct sk_buff *skb, 1297 1296 struct net_device *vrf_dev, 1298 - u16 proto) 1297 + u16 proto, struct net_device *orig_dev) 1299 1298 { 1300 - if (skb_mac_header_was_set(skb)) 1299 + if (skb_mac_header_was_set(skb) && dev_has_header(orig_dev)) 1301 1300 return 0; 1302 1301 1303 1302 return vrf_prepare_mac_header(skb, vrf_dev, proto); ··· 1403 1402 1404 1403 /* if packet is NDISC then keep the ingress interface */ 1405 1404 if (!is_ndisc) { 1405 + struct net_device *orig_dev = skb->dev; 1406 + 1406 1407 vrf_rx_stats(vrf_dev, skb->len); 1407 1408 skb->dev = vrf_dev; 1408 1409 skb->skb_iif = vrf_dev->ifindex; ··· 1413 1410 int err; 1414 1411 1415 1412 err = vrf_add_mac_header_if_unset(skb, vrf_dev, 1416 - ETH_P_IPV6); 1413 + ETH_P_IPV6, 1414 + orig_dev); 1417 1415 if (likely(!err)) { 1418 1416 skb_push(skb, skb->mac_len); 1419 1417 dev_queue_xmit_nit(skb, vrf_dev); ··· 1444 1440 static struct sk_buff *vrf_ip_rcv(struct net_device *vrf_dev, 1445 1441 struct sk_buff *skb) 1446 1442 { 1443 + struct net_device *orig_dev = skb->dev; 1444 + 1447 1445 skb->dev = vrf_dev; 1448 1446 skb->skb_iif = vrf_dev->ifindex; 1449 1447 IPCB(skb)->flags |= IPSKB_L3SLAVE; ··· 1466 1460 if (!list_empty(&vrf_dev->ptype_all)) { 1467 1461 int err; 1468 1462 1469 - err = vrf_add_mac_header_if_unset(skb, vrf_dev, ETH_P_IP); 1463 + err = vrf_add_mac_header_if_unset(skb, vrf_dev, ETH_P_IP, 1464 + orig_dev); 1470 1465 if (likely(!err)) { 1471 1466 skb_push(skb, skb->mac_len); 1472 1467 dev_queue_xmit_nit(skb, vrf_dev);
+9
drivers/pci/controller/pci-hyperv.c
··· 3407 3407 hbus->bridge->domain_nr = dom; 3408 3408 #ifdef CONFIG_X86 3409 3409 hbus->sysdata.domain = dom; 3410 + #elif defined(CONFIG_ARM64) 3411 + /* 3412 + * Set the PCI bus parent to be the corresponding VMbus 3413 + * device. Then the VMbus device will be assigned as the 3414 + * ACPI companion in pcibios_root_bridge_prepare() and 3415 + * pci_dma_configure() will propagate device coherence 3416 + * information to devices created on the bus. 3417 + */ 3418 + hbus->sysdata.parent = hdev->device.parent; 3410 3419 #endif 3411 3420 3412 3421 hbus->hdev = hdev;
+41 -21
drivers/vdpa/mlx5/net/mlx5_vnet.c
··· 163 163 u32 cur_num_vqs; 164 164 struct notifier_block nb; 165 165 struct vdpa_callback config_cb; 166 + struct mlx5_vdpa_wq_ent cvq_ent; 166 167 }; 167 168 168 169 static void free_resources(struct mlx5_vdpa_net *ndev); ··· 1659 1658 mvdev = wqent->mvdev; 1660 1659 ndev = to_mlx5_vdpa_ndev(mvdev); 1661 1660 cvq = &mvdev->cvq; 1661 + 1662 + mutex_lock(&ndev->reslock); 1663 + 1664 + if (!(mvdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) 1665 + goto out; 1666 + 1662 1667 if (!(ndev->mvdev.actual_features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))) 1663 1668 goto out; 1664 1669 ··· 1703 1696 1704 1697 if (vringh_need_notify_iotlb(&cvq->vring)) 1705 1698 vringh_notify(&cvq->vring); 1699 + 1700 + queue_work(mvdev->wq, &wqent->work); 1701 + break; 1706 1702 } 1703 + 1707 1704 out: 1708 - kfree(wqent); 1705 + mutex_unlock(&ndev->reslock); 1709 1706 } 1710 1707 1711 1708 static void mlx5_vdpa_kick_vq(struct vdpa_device *vdev, u16 idx) ··· 1717 1706 struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev); 1718 1707 struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev); 1719 1708 struct mlx5_vdpa_virtqueue *mvq; 1720 - struct mlx5_vdpa_wq_ent *wqent; 1721 1709 1722 1710 if (!is_index_valid(mvdev, idx)) 1723 1711 return; ··· 1725 1715 if (!mvdev->wq || !mvdev->cvq.ready) 1726 1716 return; 1727 1717 1728 - wqent = kzalloc(sizeof(*wqent), GFP_ATOMIC); 1729 - if (!wqent) 1730 - return; 1731 - 1732 - wqent->mvdev = mvdev; 1733 - INIT_WORK(&wqent->work, mlx5_cvq_kick_handler); 1734 - queue_work(mvdev->wq, &wqent->work); 1718 + queue_work(mvdev->wq, &ndev->cvq_ent.work); 1735 1719 return; 1736 1720 } 1737 1721 ··· 2184 2180 goto err_mr; 2185 2181 2186 2182 if (!(mvdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) 2187 - return 0; 2183 + goto err_mr; 2188 2184 2189 2185 restore_channels_info(ndev); 2190 2186 err = setup_driver(mvdev); ··· 2199 2195 return err; 2200 2196 } 2201 2197 2198 + /* reslock must be held for this function */ 2202 2199 static int setup_driver(struct mlx5_vdpa_dev *mvdev) 2203 2200 { 2204 
2201 struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev); 2205 2202 int err; 2206 2203 2207 - mutex_lock(&ndev->reslock); 2204 + WARN_ON(!mutex_is_locked(&ndev->reslock)); 2205 + 2208 2206 if (ndev->setup) { 2209 2207 mlx5_vdpa_warn(mvdev, "setup driver called for already setup driver\n"); 2210 2208 err = 0; ··· 2236 2230 goto err_fwd; 2237 2231 } 2238 2232 ndev->setup = true; 2239 - mutex_unlock(&ndev->reslock); 2240 2233 2241 2234 return 0; 2242 2235 ··· 2246 2241 err_rqt: 2247 2242 teardown_virtqueues(ndev); 2248 2243 out: 2249 - mutex_unlock(&ndev->reslock); 2250 2244 return err; 2251 2245 } 2252 2246 2247 + /* reslock must be held for this function */ 2253 2248 static void teardown_driver(struct mlx5_vdpa_net *ndev) 2254 2249 { 2255 - mutex_lock(&ndev->reslock); 2250 + 2251 + WARN_ON(!mutex_is_locked(&ndev->reslock)); 2252 + 2256 2253 if (!ndev->setup) 2257 - goto out; 2254 + return; 2258 2255 2259 2256 remove_fwd_to_tir(ndev); 2260 2257 destroy_tir(ndev); 2261 2258 destroy_rqt(ndev); 2262 2259 teardown_virtqueues(ndev); 2263 2260 ndev->setup = false; 2264 - out: 2265 - mutex_unlock(&ndev->reslock); 2266 2261 } 2267 2262 2268 2263 static void clear_vqs_ready(struct mlx5_vdpa_net *ndev) ··· 2283 2278 2284 2279 print_status(mvdev, status, true); 2285 2280 2281 + mutex_lock(&ndev->reslock); 2282 + 2286 2283 if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) { 2287 2284 if (status & VIRTIO_CONFIG_S_DRIVER_OK) { 2288 2285 err = setup_driver(mvdev); ··· 2294 2287 } 2295 2288 } else { 2296 2289 mlx5_vdpa_warn(mvdev, "did not expect DRIVER_OK to be cleared\n"); 2297 - return; 2290 + goto err_clear; 2298 2291 } 2299 2292 } 2300 2293 2301 2294 ndev->mvdev.status = status; 2295 + mutex_unlock(&ndev->reslock); 2302 2296 return; 2303 2297 2304 2298 err_setup: 2305 2299 mlx5_vdpa_destroy_mr(&ndev->mvdev); 2306 2300 ndev->mvdev.status |= VIRTIO_CONFIG_S_FAILED; 2301 + err_clear: 2302 + mutex_unlock(&ndev->reslock); 2307 2303 } 2308 2304 2309 2305 static int 
mlx5_vdpa_reset(struct vdpa_device *vdev) ··· 2316 2306 2317 2307 print_status(mvdev, 0, true); 2318 2308 mlx5_vdpa_info(mvdev, "performing device reset\n"); 2309 + 2310 + mutex_lock(&ndev->reslock); 2319 2311 teardown_driver(ndev); 2320 2312 clear_vqs_ready(ndev); 2321 2313 mlx5_vdpa_destroy_mr(&ndev->mvdev); ··· 2330 2318 if (mlx5_vdpa_create_mr(mvdev, NULL)) 2331 2319 mlx5_vdpa_warn(mvdev, "create MR failed\n"); 2332 2320 } 2321 + mutex_unlock(&ndev->reslock); 2333 2322 2334 2323 return 0; 2335 2324 } ··· 2366 2353 static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb *iotlb) 2367 2354 { 2368 2355 struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev); 2356 + struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev); 2369 2357 bool change_map; 2370 2358 int err; 2359 + 2360 + mutex_lock(&ndev->reslock); 2371 2361 2372 2362 err = mlx5_vdpa_handle_set_map(mvdev, iotlb, &change_map); 2373 2363 if (err) { 2374 2364 mlx5_vdpa_warn(mvdev, "set map failed(%d)\n", err); 2375 - return err; 2365 + goto err; 2376 2366 } 2377 2367 2378 2368 if (change_map) 2379 - return mlx5_vdpa_change_map(mvdev, iotlb); 2369 + err = mlx5_vdpa_change_map(mvdev, iotlb); 2380 2370 2381 - return 0; 2371 + err: 2372 + mutex_unlock(&ndev->reslock); 2373 + return err; 2382 2374 } 2383 2375 2384 2376 static void mlx5_vdpa_free(struct vdpa_device *vdev) ··· 2758 2740 if (err) 2759 2741 goto err_mr; 2760 2742 2743 + ndev->cvq_ent.mvdev = mvdev; 2744 + INIT_WORK(&ndev->cvq_ent.work, mlx5_cvq_kick_handler); 2761 2745 mvdev->wq = create_singlethread_workqueue("mlx5_vdpa_wq"); 2762 2746 if (!mvdev->wq) { 2763 2747 err = -ENOMEM;
+2 -3
drivers/virtio/virtio.c
··· 526 526 goto err; 527 527 } 528 528 529 - /* If restore didn't do it, mark device DRIVER_OK ourselves. */ 530 - if (!(dev->config->get_status(dev) & VIRTIO_CONFIG_S_DRIVER_OK)) 531 - virtio_device_ready(dev); 529 + /* Finally, tell the device we're all set */ 530 + virtio_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK); 532 531 533 532 virtio_config_enable(dev); 534 533
+1 -1
fs/btrfs/extent_io.h
··· 118 118 */ 119 119 struct extent_changeset { 120 120 /* How many bytes are set/cleared in this operation */ 121 - unsigned int bytes_changed; 121 + u64 bytes_changed; 122 122 123 123 /* Changed ranges */ 124 124 struct ulist range_changed;
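Widening `bytes_changed` from `unsigned int` to `u64` matters because qgroup operations can touch more than 4 GiB in one changeset, at which point a 32-bit accumulator silently wraps. A minimal sketch of the truncation being avoided (illustrative helpers, not btrfs code):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: accumulating a byte count in a 32-bit counter vs
 * a 64-bit one, as in the extent_changeset::bytes_changed widening. */
static uint32_t sum_u32(uint64_t a, uint64_t b)
{
	return (uint32_t)(a + b);	/* wraps above 4 GiB */
}

static uint64_t sum_u64(uint64_t a, uint64_t b)
{
	return a + b;
}
```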
+11 -2
fs/btrfs/file.c
··· 2957 2957 return ret; 2958 2958 } 2959 2959 2960 - static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len) 2960 + static int btrfs_punch_hole(struct file *file, loff_t offset, loff_t len) 2961 2961 { 2962 + struct inode *inode = file_inode(file); 2962 2963 struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); 2963 2964 struct btrfs_root *root = BTRFS_I(inode)->root; 2964 2965 struct extent_state *cached_state = NULL; ··· 2990 2989 ret = 0; 2991 2990 goto out_only_mutex; 2992 2991 } 2992 + 2993 + ret = file_modified(file); 2994 + if (ret) 2995 + goto out_only_mutex; 2993 2996 2994 2997 lockstart = round_up(offset, btrfs_inode_sectorsize(BTRFS_I(inode))); 2995 2998 lockend = round_down(offset + len, ··· 3435 3430 return -EOPNOTSUPP; 3436 3431 3437 3432 if (mode & FALLOC_FL_PUNCH_HOLE) 3438 - return btrfs_punch_hole(inode, offset, len); 3433 + return btrfs_punch_hole(file, offset, len); 3439 3434 3440 3435 /* 3441 3436 * Only trigger disk allocation, don't trigger qgroup reserve ··· 3456 3451 if (ret) 3457 3452 goto out; 3458 3453 } 3454 + 3455 + ret = file_modified(file); 3456 + if (ret) 3457 + goto out; 3459 3458 3460 3459 /* 3461 3460 * TODO: Move these two operations after we have checked
+22 -1
fs/btrfs/inode.c
··· 1128 1128 int ret = 0; 1129 1129 1130 1130 if (btrfs_is_free_space_inode(inode)) { 1131 - WARN_ON_ONCE(1); 1132 1131 ret = -EINVAL; 1133 1132 goto out_unlock; 1134 1133 } ··· 4485 4486 btrfs_warn(fs_info, 4486 4487 "attempt to delete subvolume %llu during send", 4487 4488 dest->root_key.objectid); 4489 + return -EPERM; 4490 + } 4491 + if (atomic_read(&dest->nr_swapfiles)) { 4492 + spin_unlock(&dest->root_item_lock); 4493 + btrfs_warn(fs_info, 4494 + "attempt to delete subvolume %llu with active swapfile", 4495 + root->root_key.objectid); 4488 4496 return -EPERM; 4489 4497 } 4490 4498 root_flags = btrfs_root_flags(&dest->root_item); ··· 11113 11107 * set. We use this counter to prevent snapshots. We must increment it 11114 11108 * before walking the extents because we don't want a concurrent 11115 11109 * snapshot to run after we've already checked the extents. 11110 + * 11111 + * It is possible that subvolume is marked for deletion but still not 11112 + * removed yet. To prevent this race, we check the root status before 11113 + * activating the swapfile. 11116 11114 */ 11115 + spin_lock(&root->root_item_lock); 11116 + if (btrfs_root_dead(root)) { 11117 + spin_unlock(&root->root_item_lock); 11118 + 11119 + btrfs_exclop_finish(fs_info); 11120 + btrfs_warn(fs_info, 11121 + "cannot activate swapfile because subvolume %llu is being deleted", 11122 + root->root_key.objectid); 11123 + return -EPERM; 11124 + } 11117 11125 atomic_inc(&root->nr_swapfiles); 11126 + spin_unlock(&root->root_item_lock); 11118 11127 11119 11128 isize = ALIGN_DOWN(inode->i_size, fs_info->sectorsize); 11120 11129
+14 -6
fs/btrfs/ioctl.c
··· 1239 1239 } 1240 1240 1241 1241 static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em, 1242 - bool locked) 1242 + u32 extent_thresh, u64 newer_than, bool locked) 1243 1243 { 1244 1244 struct extent_map *next; 1245 1245 bool ret = false; ··· 1249 1249 return false; 1250 1250 1251 1251 /* 1252 - * We want to check if the next extent can be merged with the current 1253 - * one, which can be an extent created in a past generation, so we pass 1254 - * a minimum generation of 0 to defrag_lookup_extent(). 1252 + * Here we need to pass @newer_than when checking the next extent, or 1253 + * we will hit a case where we mark the current extent for defrag, but the next 1254 + * one will not be a target. 1255 + * This will just cause extra IO without really reducing the fragments. 1255 1256 */ 1256 - next = defrag_lookup_extent(inode, em->start + em->len, 0, locked); 1257 + next = defrag_lookup_extent(inode, em->start + em->len, newer_than, locked); 1257 1258 /* No more em or hole */ 1258 1259 if (!next || next->block_start >= EXTENT_MAP_LAST_BYTE) 1259 1260 goto out; ··· 1266 1265 */ 1267 1266 if (next->len >= get_extent_max_capacity(em)) 1268 1267 goto out; 1268 + /* Skip older extent */ 1269 + if (next->generation < newer_than) 1270 + goto out; 1271 + /* Also check extent size */ 1272 + if (next->len >= extent_thresh) 1273 + goto out; 1274 + 1269 1275 ret = true; 1270 1276 out: 1271 1277 free_extent_map(next); ··· 1478 1470 goto next; 1479 1471 1480 1472 next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em, 1481 - locked); 1473 + extent_thresh, newer_than, locked); 1482 1474 if (!next_mergeable) { 1483 1475 struct defrag_target_range *last; 1484 1476
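The defrag change above makes the next-extent check symmetric with the current-extent check: a neighbour is only a merge candidate if it is itself new enough and still below the size threshold. Condensed into a predicate (field names simplified, not btrfs types):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Condensed form of the extra defrag_check_next_extent() conditions
 * added above; parameters are simplified stand-ins for extent fields. */
static bool next_extent_mergeable(uint64_t generation, uint64_t len,
                                  uint64_t newer_than, uint32_t extent_thresh)
{
	if (generation < newer_than)
		return false;	/* skip older extent */
	if (len >= extent_thresh)
		return false;	/* already large enough, defrag gains nothing */
	return true;
}
```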
+28 -37
fs/btrfs/volumes.c
··· 1896 1896 path_put(&path); 1897 1897 } 1898 1898 1899 - static int btrfs_rm_dev_item(struct btrfs_device *device) 1899 + static int btrfs_rm_dev_item(struct btrfs_trans_handle *trans, 1900 + struct btrfs_device *device) 1900 1901 { 1901 1902 struct btrfs_root *root = device->fs_info->chunk_root; 1902 1903 int ret; 1903 1904 struct btrfs_path *path; 1904 1905 struct btrfs_key key; 1905 - struct btrfs_trans_handle *trans; 1906 1906 1907 1907 path = btrfs_alloc_path(); 1908 1908 if (!path) 1909 1909 return -ENOMEM; 1910 1910 1911 - trans = btrfs_start_transaction(root, 0); 1912 - if (IS_ERR(trans)) { 1913 - btrfs_free_path(path); 1914 - return PTR_ERR(trans); 1915 - } 1916 1911 key.objectid = BTRFS_DEV_ITEMS_OBJECTID; 1917 1912 key.type = BTRFS_DEV_ITEM_KEY; 1918 1913 key.offset = device->devid; ··· 1918 1923 if (ret) { 1919 1924 if (ret > 0) 1920 1925 ret = -ENOENT; 1921 - btrfs_abort_transaction(trans, ret); 1922 - btrfs_end_transaction(trans); 1923 1926 goto out; 1924 1927 } 1925 1928 1926 1929 ret = btrfs_del_item(trans, root, path); 1927 - if (ret) { 1928 - btrfs_abort_transaction(trans, ret); 1929 - btrfs_end_transaction(trans); 1930 - } 1931 - 1932 1930 out: 1933 1931 btrfs_free_path(path); 1934 - if (!ret) 1935 - ret = btrfs_commit_transaction(trans); 1936 1932 return ret; 1937 1933 } 1938 1934 ··· 2064 2078 struct btrfs_dev_lookup_args *args, 2065 2079 struct block_device **bdev, fmode_t *mode) 2066 2080 { 2081 + struct btrfs_trans_handle *trans; 2067 2082 struct btrfs_device *device; 2068 2083 struct btrfs_fs_devices *cur_devices; 2069 2084 struct btrfs_fs_devices *fs_devices = fs_info->fs_devices; ··· 2085 2098 2086 2099 ret = btrfs_check_raid_min_devices(fs_info, num_devices - 1); 2087 2100 if (ret) 2088 - goto out; 2101 + return ret; 2089 2102 2090 2103 device = btrfs_find_device(fs_info->fs_devices, args); 2091 2104 if (!device) { ··· 2093 2106 ret = BTRFS_ERROR_DEV_MISSING_NOT_FOUND; 2094 2107 else 2095 2108 ret = -ENOENT; 2096 - goto out; 2109 + 
return ret; 2097 2110 } 2098 2111 2099 2112 if (btrfs_pinned_by_swapfile(fs_info, device)) { 2100 2113 btrfs_warn_in_rcu(fs_info, 2101 2114 "cannot remove device %s (devid %llu) due to active swapfile", 2102 2115 rcu_str_deref(device->name), device->devid); 2103 - ret = -ETXTBSY; 2104 - goto out; 2116 + return -ETXTBSY; 2105 2117 } 2106 2118 2107 - if (test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) { 2108 - ret = BTRFS_ERROR_DEV_TGT_REPLACE; 2109 - goto out; 2110 - } 2119 + if (test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) 2120 + return BTRFS_ERROR_DEV_TGT_REPLACE; 2111 2121 2112 2122 if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) && 2113 - fs_info->fs_devices->rw_devices == 1) { 2114 - ret = BTRFS_ERROR_DEV_ONLY_WRITABLE; 2115 - goto out; 2116 - } 2123 + fs_info->fs_devices->rw_devices == 1) 2124 + return BTRFS_ERROR_DEV_ONLY_WRITABLE; 2117 2125 2118 2126 if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) { 2119 2127 mutex_lock(&fs_info->chunk_mutex); ··· 2121 2139 if (ret) 2122 2140 goto error_undo; 2123 2141 2124 - /* 2125 - * TODO: the superblock still includes this device in its num_devices 2126 - * counter although write_all_supers() is not locked out. This 2127 - * could give a filesystem state which requires a degraded mount. 
2128 - */ 2129 - ret = btrfs_rm_dev_item(device); 2130 - if (ret) 2142 + trans = btrfs_start_transaction(fs_info->chunk_root, 0); 2143 + if (IS_ERR(trans)) { 2144 + ret = PTR_ERR(trans); 2131 2145 goto error_undo; 2146 + } 2147 + 2148 + ret = btrfs_rm_dev_item(trans, device); 2149 + if (ret) { 2150 + /* Any error in dev item removal is critical */ 2151 + btrfs_crit(fs_info, 2152 + "failed to remove device item for devid %llu: %d", 2153 + device->devid, ret); 2154 + btrfs_abort_transaction(trans, ret); 2155 + btrfs_end_transaction(trans); 2156 + return ret; 2157 + } 2132 2158 2133 2159 clear_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state); 2134 2160 btrfs_scrub_cancel_dev(device); ··· 2219 2229 free_fs_devices(cur_devices); 2220 2230 } 2221 2231 2222 - out: 2232 + ret = btrfs_commit_transaction(trans); 2233 + 2223 2234 return ret; 2224 2235 2225 2236 error_undo: ··· 2231 2240 device->fs_devices->rw_devices++; 2232 2241 mutex_unlock(&fs_info->chunk_mutex); 2233 2242 } 2234 - goto out; 2243 + return ret; 2235 2244 } 2236 2245 2237 2246 void btrfs_rm_dev_replace_remove_srcdev(struct btrfs_device *srcdev)
+5 -8
fs/btrfs/zoned.c
··· 1801 1801 1802 1802 map = em->map_lookup; 1803 1803 /* We only support single profile for now */ 1804 - ASSERT(map->num_stripes == 1); 1805 1804 device = map->stripes[0].dev; 1806 1805 1807 1806 free_extent_map(em); ··· 1975 1976 1976 1977 bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags) 1977 1978 { 1979 + struct btrfs_fs_info *fs_info = fs_devices->fs_info; 1978 1980 struct btrfs_device *device; 1979 1981 bool ret = false; 1980 1982 1981 - if (!btrfs_is_zoned(fs_devices->fs_info)) 1983 + if (!btrfs_is_zoned(fs_info)) 1982 1984 return true; 1983 1985 1984 - /* Non-single profiles are not supported yet */ 1985 - ASSERT((flags & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0); 1986 - 1987 1986 /* Check if there is a device with active zones left */ 1988 - mutex_lock(&fs_devices->device_list_mutex); 1989 - list_for_each_entry(device, &fs_devices->devices, dev_list) { 1987 + mutex_lock(&fs_info->chunk_mutex); 1988 + list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) { 1990 1989 struct btrfs_zoned_device_info *zinfo = device->zone_info; 1991 1990 1992 1991 if (!device->bdev) ··· 1996 1999 break; 1997 2000 } 1998 2001 } 1999 - mutex_unlock(&fs_devices->device_list_mutex); 2002 + mutex_unlock(&fs_info->chunk_mutex); 2000 2003 2001 2004 return ret; 2002 2005 }
+1
include/asm-generic/mshyperv.h
··· 269 269 u64 hv_ghcb_hypercall(u64 control, void *input, void *output, u32 input_size); 270 270 void hyperv_cleanup(void); 271 271 bool hv_query_ext_cap(u64 cap_query); 272 + void hv_setup_dma_ops(struct device *dev, bool coherent); 272 273 void *hv_map_memory(void *addr, unsigned long size); 273 274 void hv_unmap_memory(void *addr); 274 275 #else /* CONFIG_HYPERV */
+3 -1
include/linux/bpf_verifier.h
··· 570 570 return type & ~BPF_BASE_TYPE_MASK; 571 571 } 572 572 573 + /* only use after check_attach_btf_id() */ 573 574 static inline enum bpf_prog_type resolve_prog_type(struct bpf_prog *prog) 574 575 { 575 - return prog->aux->dst_prog ? prog->aux->dst_prog->type : prog->type; 576 + return prog->type == BPF_PROG_TYPE_EXT ? 577 + prog->aux->dst_prog->type : prog->type; 576 578 } 577 579 578 580 #endif /* _LINUX_BPF_VERIFIER_H */
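The `resolve_prog_type()` fix above stops keying on whether a destination program exists and instead checks the program type: only `BPF_PROG_TYPE_EXT` programs inherit the type of the program they extend. A minimal model of that rule (enum values are illustrative, not the kernel's):

```c
#include <assert.h>

/* Illustrative model of the resolve_prog_type() change: EXT programs
 * resolve to the type of the program they extend, everything else
 * keeps its own type even when a dst_prog happens to be attached. */
enum prog_type { PROG_TYPE_TRACING = 1, PROG_TYPE_EXT = 2, PROG_TYPE_XDP = 3 };

static enum prog_type resolve(enum prog_type own, enum prog_type dst)
{
	return own == PROG_TYPE_EXT ? dst : own;
}
```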
-6
include/linux/virtio_config.h
··· 23 23 * any of @get/@set, @get_status/@set_status, or @get_features/ 24 24 * @finalize_features are NOT safe to be called from an atomic 25 25 * context. 26 - * @enable_cbs: enable the callbacks 27 - * vdev: the virtio_device 28 26 * @get: read the value of a configuration field 29 27 * vdev: the virtio_device 30 28 * offset: the offset of the configuration field ··· 76 78 */ 77 79 typedef void vq_callback_t(struct virtqueue *); 78 80 struct virtio_config_ops { 79 - void (*enable_cbs)(struct virtio_device *vdev); 80 81 void (*get)(struct virtio_device *vdev, unsigned offset, 81 82 void *buf, unsigned len); 82 83 void (*set)(struct virtio_device *vdev, unsigned offset, ··· 229 232 void virtio_device_ready(struct virtio_device *dev) 230 233 { 231 234 unsigned status = dev->config->get_status(dev); 232 - 233 - if (dev->config->enable_cbs) 234 - dev->config->enable_cbs(dev); 235 235 236 236 BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK); 237 237 dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
-2
include/net/mctp.h
··· 36 36 #define MCTP_HDR_TAG_SHIFT 0 37 37 #define MCTP_HDR_TAG_MASK GENMASK(2, 0) 38 38 39 - #define MCTP_HEADER_MAXLEN 4 40 - 41 39 #define MCTP_INITIAL_DEFAULT_NET 1 42 40 43 41 static inline bool mctp_address_unicast(mctp_eid_t eid)
+2 -2
kernel/trace/bpf_trace.c
··· 2349 2349 } 2350 2350 2351 2351 static int 2352 - kprobe_multi_resolve_syms(const void *usyms, u32 cnt, 2352 + kprobe_multi_resolve_syms(const void __user *usyms, u32 cnt, 2353 2353 unsigned long *addrs) 2354 2354 { 2355 2355 unsigned long addr, size; 2356 - const char **syms; 2356 + const char __user **syms; 2357 2357 int err = -ENOMEM; 2358 2358 unsigned int i; 2359 2359 char *func;
+1 -1
kernel/trace/rethook.c
··· 65 65 */ 66 66 void rethook_free(struct rethook *rh) 67 67 { 68 - rcu_assign_pointer(rh->handler, NULL); 68 + WRITE_ONCE(rh->handler, NULL); 69 69 70 70 call_rcu(&rh->rcu, rethook_free_rcu); 71 71 }
+13 -4
net/core/filter.c
··· 7016 7016 if (!th->ack || th->rst || th->syn) 7017 7017 return -ENOENT; 7018 7018 7019 + if (unlikely(iph_len < sizeof(struct iphdr))) 7020 + return -EINVAL; 7021 + 7019 7022 if (tcp_synq_no_recent_overflow(sk)) 7020 7023 return -ENOENT; 7021 7024 7022 7025 cookie = ntohl(th->ack_seq) - 1; 7023 7026 7024 - switch (sk->sk_family) { 7025 - case AF_INET: 7026 - if (unlikely(iph_len < sizeof(struct iphdr))) 7027 + /* Both struct iphdr and struct ipv6hdr have the version field at the 7028 + * same offset so we can cast to the shorter header (struct iphdr). 7029 + */ 7030 + switch (((struct iphdr *)iph)->version) { 7031 + case 4: 7032 + if (sk->sk_family == AF_INET6 && ipv6_only_sock(sk)) 7027 7033 return -EINVAL; 7028 7034 7029 7035 ret = __cookie_v4_check((struct iphdr *)iph, th, cookie); 7030 7036 break; 7031 7037 7032 7038 #if IS_BUILTIN(CONFIG_IPV6) 7033 - case AF_INET6: 7039 + case 6: 7034 7040 if (unlikely(iph_len < sizeof(struct ipv6hdr))) 7041 + return -EINVAL; 7042 + 7043 + if (sk->sk_family != AF_INET6) 7035 7044 return -EINVAL; 7036 7045 7037 7046 ret = __cookie_v6_check((struct ipv6hdr *)iph, th, cookie);
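The syncookie fix above dispatches on the IP version field instead of `sk->sk_family`, which is what makes dual-stack (v4-mapped) sockets work: the version is the high nibble of the first header byte in both IPv4 and IPv6, so the code can cast to the shorter `struct iphdr` before deciding. Illustrative helper:

```c
#include <assert.h>
#include <stdint.h>

/* The version field occupies the high nibble of byte 0 in both IPv4
 * (0x45 for a 20-byte header) and IPv6 (0x60) headers, so one read
 * suffices before choosing which full header struct to use. */
static int ip_version(const uint8_t *hdr)
{
	return hdr[0] >> 4;
}
```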
+11 -4
net/core/skbuff.c
··· 5276 5276 if (skb_cloned(to)) 5277 5277 return false; 5278 5278 5279 - /* The page pool signature of struct page will eventually figure out 5280 - * which pages can be recycled or not but for now let's prohibit slab 5281 - * allocated and page_pool allocated SKBs from being coalesced. 5279 + /* In general, avoid mixing slab allocated and page_pool allocated 5280 + * pages within the same SKB. However when @to is not pp_recycle and 5281 + * @from is cloned, we can transition frag pages from page_pool to 5282 + * reference counted. 5283 + * 5284 + * On the other hand, don't allow coalescing two pp_recycle SKBs if 5285 + * @from is cloned, in case the SKB is using page_pool fragment 5286 + * references (PP_FLAG_PAGE_FRAG). Since we only take full page 5287 + * references for cloned SKBs at the moment that would result in 5288 + * inconsistent reference counts. 5282 5289 */ 5283 - if (to->pp_recycle != from->pp_recycle) 5290 + if (to->pp_recycle != (from->pp_recycle && !skb_cloned(from))) 5284 5291 return false; 5285 5292 5286 5293 if (len <= skb_tailroom(to)) {
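The coalescing rule above can be read as a single boolean condition: a cloned page_pool skb counts as non-recyclable on the `@from` side, so it may merge into a plain `@to` (its frags fall back to normal page references) but never into a `pp_recycle` one. Restated as a predicate (parameter names are ours):

```c
#include <assert.h>
#include <stdbool.h>

/* Boolean form of the skb_try_coalesce() page_pool check above:
 * coalescing is allowed only when both sides end up with the same
 * recycling semantics after accounting for cloned @from skbs. */
static bool pp_coalesce_ok(bool to_pp, bool from_pp, bool from_cloned)
{
	return to_pp == (from_pp && !from_cloned);
}
```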
+24 -1
net/dsa/master.c
··· 335 335 .attrs = dsa_slave_attrs, 336 336 }; 337 337 338 + static void dsa_master_reset_mtu(struct net_device *dev) 339 + { 340 + int err; 341 + 342 + err = dev_set_mtu(dev, ETH_DATA_LEN); 343 + if (err) 344 + netdev_dbg(dev, 345 + "Unable to reset MTU to exclude DSA overheads\n"); 346 + } 347 + 338 348 int dsa_master_setup(struct net_device *dev, struct dsa_port *cpu_dp) 339 349 { 350 + const struct dsa_device_ops *tag_ops = cpu_dp->tag_ops; 340 351 struct dsa_switch *ds = cpu_dp->ds; 341 352 struct device_link *consumer_link; 342 - int ret; 353 + int mtu, ret; 354 + 355 + mtu = ETH_DATA_LEN + dsa_tag_protocol_overhead(tag_ops); 343 356 344 357 /* The DSA master must use SET_NETDEV_DEV for this to work. */ 345 358 consumer_link = device_link_add(ds->dev, dev->dev.parent, ··· 361 348 netdev_err(dev, 362 349 "Failed to create a device link to DSA switch %s\n", 363 350 dev_name(ds->dev)); 351 + 352 + /* The switch driver may not implement ->port_change_mtu(), case in 353 + * which dsa_slave_change_mtu() will not update the master MTU either, 354 + * so we need to do that here. 355 + */ 356 + ret = dev_set_mtu(dev, mtu); 357 + if (ret) 358 + netdev_warn(dev, "error %d setting MTU to %d to include DSA overhead\n", 359 + ret, mtu); 364 360 365 361 /* If we use a tagging format that doesn't have an ethertype 366 362 * field, make sure that all packets from this point on get ··· 406 384 sysfs_remove_group(&dev->dev.kobj, &dsa_group); 407 385 dsa_netdev_ops_set(dev, NULL); 408 386 dsa_master_ethtool_teardown(dev); 387 + dsa_master_reset_mtu(dev); 409 388 dsa_master_set_promiscuity(dev, -1); 410 389 411 390 dev->dsa_ptr = NULL;
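The DSA master setup above now raises the master's MTU by the tagging protocol's per-frame overhead, so that a full 1500-byte slave frame plus tag still fits, and resets it on teardown. The arithmetic is trivial but worth stating (constants here are illustrative, the overhead depends on the tagger):

```c
#include <assert.h>

/* Sketch of the master-MTU computation added above: slave frames of
 * ETH_DATA_LEN bytes gain a tagger-specific header/trailer that the
 * master link must also carry. */
#define ETH_DATA_LEN 1500

static int dsa_master_mtu(int tag_overhead)
{
	return ETH_DATA_LEN + tag_overhead;
}
```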
+6 -1
net/ipv4/fib_semantics.c
··· 889 889 } 890 890 891 891 if (cfg->fc_oif || cfg->fc_gw_family) { 892 - struct fib_nh *nh = fib_info_nh(fi, 0); 892 + struct fib_nh *nh; 893 893 894 + /* cannot match on nexthop object attributes */ 895 + if (fi->nh) 896 + return 1; 897 + 898 + nh = fib_info_nh(fi, 0); 894 899 if (cfg->fc_encap) { 895 900 if (fib_encap_match(net, cfg->fc_encap_type, 896 901 cfg->fc_encap, nh, cfg, extack))
+1 -1
net/ipv6/ip6mr.c
··· 1653 1653 mifi_t mifi; 1654 1654 struct net *net = sock_net(sk); 1655 1655 struct mr_table *mrt; 1656 - bool do_wrmifwhole; 1657 1656 1658 1657 if (sk->sk_type != SOCK_RAW || 1659 1658 inet_sk(sk)->inet_num != IPPROTO_ICMPV6) ··· 1760 1761 #ifdef CONFIG_IPV6_PIMSM_V2 1761 1762 case MRT6_PIM: 1762 1763 { 1764 + bool do_wrmifwhole; 1763 1765 int v; 1764 1766 1765 1767 if (optlen != sizeof(v))
+1 -1
net/ipv6/route.c
··· 4484 4484 struct inet6_dev *idev; 4485 4485 int type; 4486 4486 4487 - if (netif_is_l3_master(skb->dev) && 4487 + if (netif_is_l3_master(skb->dev) || 4488 4488 dst->dev == net->loopback_dev) 4489 4489 idev = __in6_dev_get_safely(dev_get_by_index_rcu(net, IP6CB(skb)->iif)); 4490 4490 else
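The one-character `&&` to `||` fix above changes when the ingress interface is recovered from `IP6CB(skb)->iif`: either condition alone (L3 master device, or loopback dst) is sufficient, whereas the old code required both at once. As a predicate (parameters are illustrative):

```c
#include <assert.h>
#include <stdbool.h>

/* Restatement of the fixed icmp6_dev() condition: fall back to the
 * stashed ingress ifindex if the skb came in via an L3 master device
 * OR the dst device is loopback. */
static bool use_cb_iif(bool l3_master, bool dst_is_loopback)
{
	return l3_master || dst_is_loopback;
}
```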
+33 -13
net/mctp/af_mctp.c
··· 93 93 static int mctp_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
94 94 {
95 95     DECLARE_SOCKADDR(struct sockaddr_mctp *, addr, msg->msg_name);
96 -     const int hlen = MCTP_HEADER_MAXLEN + sizeof(struct mctp_hdr);
97 96     int rc, addrlen = msg->msg_namelen;
98 97     struct sock *sk = sock->sk;
99 98     struct mctp_sock *msk = container_of(sk, struct mctp_sock, sk);
100 99     struct mctp_skb_cb *cb;
101 100     struct mctp_route *rt;
102 -     struct sk_buff *skb;
101 +     struct sk_buff *skb = NULL;
102 +     int hlen;
103 103 
104 104     if (addr) {
105 105         const u8 tagbits = MCTP_TAG_MASK | MCTP_TAG_OWNER |
··· 129 129         if (addr->smctp_network == MCTP_NET_ANY)
130 130             addr->smctp_network = mctp_default_net(sock_net(sk));
131 131 
132 +     /* direct addressing */
133 +     if (msk->addr_ext && addrlen >= sizeof(struct sockaddr_mctp_ext)) {
134 +         DECLARE_SOCKADDR(struct sockaddr_mctp_ext *,
135 +                  extaddr, msg->msg_name);
136 +         struct net_device *dev;
137 + 
138 +         rc = -EINVAL;
139 +         rcu_read_lock();
140 +         dev = dev_get_by_index_rcu(sock_net(sk), extaddr->smctp_ifindex);
141 +         /* check for correct halen */
142 +         if (dev && extaddr->smctp_halen == dev->addr_len) {
143 +             hlen = LL_RESERVED_SPACE(dev) + sizeof(struct mctp_hdr);
144 +             rc = 0;
145 +         }
146 +         rcu_read_unlock();
147 +         if (rc)
148 +             goto err_free;
149 +         rt = NULL;
150 +     } else {
151 +         rt = mctp_route_lookup(sock_net(sk), addr->smctp_network,
152 +                        addr->smctp_addr.s_addr);
153 +         if (!rt) {
154 +             rc = -EHOSTUNREACH;
155 +             goto err_free;
156 +         }
157 +         hlen = LL_RESERVED_SPACE(rt->dev->dev) + sizeof(struct mctp_hdr);
158 +     }
159 + 
132 160     skb = sock_alloc_send_skb(sk, hlen + 1 + len,
133 161                   msg->msg_flags & MSG_DONTWAIT, &rc);
134 162     if (!skb)
··· 175 147     cb = __mctp_cb(skb);
176 148     cb->net = addr->smctp_network;
177 149 
178 -     /* direct addressing */
179 -     if (msk->addr_ext && addrlen >= sizeof(struct sockaddr_mctp_ext)) {
150 +     if (!rt) {
151 +         /* fill extended address in cb */
180 152         DECLARE_SOCKADDR(struct sockaddr_mctp_ext *,
181 153                  extaddr, msg->msg_name);
··· 187 159     }
188 160 
189 161         cb->ifindex = extaddr->smctp_ifindex;
162 +         /* smctp_halen is checked above */
190 163         cb->halen = extaddr->smctp_halen;
191 164         memcpy(cb->haddr, extaddr->smctp_haddr, cb->halen);
192 - 
193 -         rt = NULL;
194 -     } else {
195 -         rt = mctp_route_lookup(sock_net(sk), addr->smctp_network,
196 -                        addr->smctp_addr.s_addr);
197 -         if (!rt) {
198 -             rc = -EHOSTUNREACH;
199 -             goto err_free;
200 -         }
201 165     }
202 166 
203 167     rc = mctp_local_output(sk, rt, skb, addr->smctp_addr.s_addr,
+12 -4
net/mctp/route.c
··· 503 503 
504 504     if (cb->ifindex) {
505 505         /* direct route; use the hwaddr we stashed in sendmsg */
506 +         if (cb->halen != skb->dev->addr_len) {
507 +             /* sanity check, sendmsg should have already caught this */
508 +             kfree_skb(skb);
509 +             return -EMSGSIZE;
510 +         }
506 511         daddr = cb->haddr;
507 512     } else {
508 513         /* If lookup fails let the device handle daddr==NULL */
··· 517 512 
518 513     rc = dev_hard_header(skb, skb->dev, ntohs(skb->protocol),
519 514                  daddr, skb->dev->dev_addr, skb->len);
520 -     if (rc) {
515 +     if (rc < 0) {
521 516         kfree_skb(skb);
522 517         return -EHOSTUNREACH;
523 518     }
··· 761 756 {
762 757     const unsigned int hlen = sizeof(struct mctp_hdr);
763 758     struct mctp_hdr *hdr, *hdr2;
764 -     unsigned int pos, size;
759 +     unsigned int pos, size, headroom;
765 760     struct sk_buff *skb2;
766 761     int rc;
767 762     u8 seq;
··· 775 770         return -EMSGSIZE;
776 771     }
777 772 
773 +     /* keep same headroom as the original skb */
774 +     headroom = skb_headroom(skb);
775 + 
778 776     /* we've got the header */
779 777     skb_pull(skb, hlen);
780 778 
··· 785 777         /* size of message payload */
786 778         size = min(mtu - hlen, skb->len - pos);
787 779 
788 -         skb2 = alloc_skb(MCTP_HEADER_MAXLEN + hlen + size, GFP_KERNEL);
780 +         skb2 = alloc_skb(headroom + hlen + size, GFP_KERNEL);
789 781         if (!skb2) {
790 782             rc = -ENOMEM;
791 783             break;
··· 801 793             skb_set_owner_w(skb2, skb->sk);
802 794 
803 795         /* establish packet */
804 -         skb_reserve(skb2, MCTP_HEADER_MAXLEN);
796 +         skb_reserve(skb2, headroom);
805 797         skb_reset_network_header(skb2);
806 798         skb_put(skb2, hlen + size);
807 799         skb2->transport_header = skb2->network_header + hlen;
+1 -1
net/netfilter/nf_tables_api.c
··· 5526 5526     int err, i, k;
5527 5527 
5528 5528     for (i = 0; i < set->num_exprs; i++) {
5529 -         expr = kzalloc(set->exprs[i]->ops->size, GFP_KERNEL);
5529 +         expr = kzalloc(set->exprs[i]->ops->size, GFP_KERNEL_ACCOUNT);
5530 5530         if (!expr)
5531 5531             goto err_expr;
5532 5532 
+2 -2
net/netfilter/nft_bitwise.c
··· 290 290     if (!track->regs[priv->sreg].selector)
291 291         return false;
292 292 
293 -     bitwise = nft_expr_priv(expr);
293 +     bitwise = nft_expr_priv(track->regs[priv->dreg].selector);
294 294     if (track->regs[priv->sreg].selector == track->regs[priv->dreg].selector &&
295 295         track->regs[priv->sreg].num_reg == 0 &&
296 296         track->regs[priv->dreg].bitwise &&
··· 442 442     if (!track->regs[priv->sreg].selector)
443 443         return false;
444 444 
445 -     bitwise = nft_expr_priv(expr);
445 +     bitwise = nft_expr_priv(track->regs[priv->dreg].selector);
446 446     if (track->regs[priv->sreg].selector == track->regs[priv->dreg].selector &&
447 447         track->regs[priv->dreg].bitwise &&
448 448         track->regs[priv->dreg].bitwise->ops == expr->ops &&
+1 -1
net/netfilter/nft_connlimit.c
··· 77 77         invert = true;
78 78     }
79 79 
80 -     priv->list = kmalloc(sizeof(*priv->list), GFP_KERNEL);
80 +     priv->list = kmalloc(sizeof(*priv->list), GFP_KERNEL_ACCOUNT);
81 81     if (!priv->list)
82 82         return -ENOMEM;
83 83 
+1 -1
net/netfilter/nft_counter.c
··· 62 62     struct nft_counter __percpu *cpu_stats;
63 63     struct nft_counter *this_cpu;
64 64 
65 -     cpu_stats = alloc_percpu(struct nft_counter);
65 +     cpu_stats = alloc_percpu_gfp(struct nft_counter, GFP_KERNEL_ACCOUNT);
66 66     if (cpu_stats == NULL)
67 67         return -ENOMEM;
68 68 
+1 -1
net/netfilter/nft_last.c
··· 30 30     u64 last_jiffies;
31 31     int err;
32 32 
33 -     last = kzalloc(sizeof(*last), GFP_KERNEL);
33 +     last = kzalloc(sizeof(*last), GFP_KERNEL_ACCOUNT);
34 34     if (!last)
35 35         return -ENOMEM;
36 36 
+1 -1
net/netfilter/nft_limit.c
··· 90 90              priv->rate);
91 91     }
92 92 
93 -     priv->limit = kmalloc(sizeof(*priv->limit), GFP_KERNEL);
93 +     priv->limit = kmalloc(sizeof(*priv->limit), GFP_KERNEL_ACCOUNT);
94 94     if (!priv->limit)
95 95         return -ENOMEM;
96 96 
+1 -1
net/netfilter/nft_quota.c
··· 90 90         return -EOPNOTSUPP;
91 91     }
92 92 
93 -     priv->consumed = kmalloc(sizeof(*priv->consumed), GFP_KERNEL);
93 +     priv->consumed = kmalloc(sizeof(*priv->consumed), GFP_KERNEL_ACCOUNT);
94 94     if (!priv->consumed)
95 95         return -ENOMEM;
96 96 
+1 -1
net/openvswitch/actions.c
··· 1051 1051     int rem = nla_len(attr);
1052 1052     bool dont_clone_flow_key;
1053 1053 
1054 -     /* The first action is always 'OVS_CLONE_ATTR_ARG'. */
1054 +     /* The first action is always 'OVS_CLONE_ATTR_EXEC'. */
1055 1055     clone_arg = nla_data(attr);
1056 1056     dont_clone_flow_key = nla_get_u32(clone_arg);
1057 1057     actions = nla_next(clone_arg, &rem);
+93 -6
net/openvswitch/flow_netlink.c
··· 2317 2317     return sfa;
2318 2318 }
2319 2319 
2320 + static void ovs_nla_free_nested_actions(const struct nlattr *actions, int len);
2321 + 
2322 + static void ovs_nla_free_check_pkt_len_action(const struct nlattr *action)
2323 + {
2324 +     const struct nlattr *a;
2325 +     int rem;
2326 + 
2327 +     nla_for_each_nested(a, action, rem) {
2328 +         switch (nla_type(a)) {
2329 +         case OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_LESS_EQUAL:
2330 +         case OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_GREATER:
2331 +             ovs_nla_free_nested_actions(nla_data(a), nla_len(a));
2332 +             break;
2333 +         }
2334 +     }
2335 + }
2336 + 
2337 + static void ovs_nla_free_clone_action(const struct nlattr *action)
2338 + {
2339 +     const struct nlattr *a = nla_data(action);
2340 +     int rem = nla_len(action);
2341 + 
2342 +     switch (nla_type(a)) {
2343 +     case OVS_CLONE_ATTR_EXEC:
2344 +         /* The real list of actions follows this attribute. */
2345 +         a = nla_next(a, &rem);
2346 +         ovs_nla_free_nested_actions(a, rem);
2347 +         break;
2348 +     }
2349 + }
2350 + 
2351 + static void ovs_nla_free_dec_ttl_action(const struct nlattr *action)
2352 + {
2353 +     const struct nlattr *a = nla_data(action);
2354 + 
2355 +     switch (nla_type(a)) {
2356 +     case OVS_DEC_TTL_ATTR_ACTION:
2357 +         ovs_nla_free_nested_actions(nla_data(a), nla_len(a));
2358 +         break;
2359 +     }
2360 + }
2361 + 
2362 + static void ovs_nla_free_sample_action(const struct nlattr *action)
2363 + {
2364 +     const struct nlattr *a = nla_data(action);
2365 +     int rem = nla_len(action);
2366 + 
2367 +     switch (nla_type(a)) {
2368 +     case OVS_SAMPLE_ATTR_ARG:
2369 +         /* The real list of actions follows this attribute. */
2370 +         a = nla_next(a, &rem);
2371 +         ovs_nla_free_nested_actions(a, rem);
2372 +         break;
2373 +     }
2374 + }
2375 + 
2320 2376 static void ovs_nla_free_set_action(const struct nlattr *a)
2321 2377 {
2322 2378     const struct nlattr *ovs_key = nla_data(a);
··· 2386 2330     }
2387 2331 }
2388 2332 
2389 - void ovs_nla_free_flow_actions(struct sw_flow_actions *sf_acts)
2333 + static void ovs_nla_free_nested_actions(const struct nlattr *actions, int len)
2390 2334 {
2391 2335     const struct nlattr *a;
2392 2336     int rem;
2393 2337 
2394 -     if (!sf_acts)
2338 +     /* Whenever new actions are added, the need to update this
2339 +      * function should be considered.
2340 +      */
2341 +     BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 23);
2342 + 
2343 +     if (!actions)
2395 2344         return;
2396 2345 
2397 -     nla_for_each_attr(a, sf_acts->actions, sf_acts->actions_len, rem) {
2346 +     nla_for_each_attr(a, actions, len, rem) {
2398 2347         switch (nla_type(a)) {
2399 -         case OVS_ACTION_ATTR_SET:
2400 -             ovs_nla_free_set_action(a);
2348 +         case OVS_ACTION_ATTR_CHECK_PKT_LEN:
2349 +             ovs_nla_free_check_pkt_len_action(a);
2401 2350             break;
2351 + 
2352 +         case OVS_ACTION_ATTR_CLONE:
2353 +             ovs_nla_free_clone_action(a);
2354 +             break;
2355 + 
2402 2356         case OVS_ACTION_ATTR_CT:
2403 2357             ovs_ct_free_action(a);
2404 2358             break;
2359 + 
2360 +         case OVS_ACTION_ATTR_DEC_TTL:
2361 +             ovs_nla_free_dec_ttl_action(a);
2362 +             break;
2363 + 
2364 +         case OVS_ACTION_ATTR_SAMPLE:
2365 +             ovs_nla_free_sample_action(a);
2366 +             break;
2367 + 
2368 +         case OVS_ACTION_ATTR_SET:
2369 +             ovs_nla_free_set_action(a);
2370 +             break;
2405 2371         }
2406 2372     }
2373 + }
2407 2374 
2375 + void ovs_nla_free_flow_actions(struct sw_flow_actions *sf_acts)
2376 + {
2377 +     if (!sf_acts)
2378 +         return;
2379 + 
2380 +     ovs_nla_free_nested_actions(sf_acts->actions, sf_acts->actions_len);
2408 2381     kfree(sf_acts);
2409 2382 }
2410 2383 
··· 3543 3458     if (!start)
3544 3459         return -EMSGSIZE;
3545 3460 
3546 -     err = ovs_nla_put_actions(nla_data(attr), rem, skb);
3461 +     /* Skipping the OVS_CLONE_ATTR_EXEC that is always the first attribute. */
3462 +     attr = nla_next(nla_data(attr), &rem);
3463 +     err = ovs_nla_put_actions(attr, rem, skb);
3547 3464 
3548 3465     if (err)
3549 3466         nla_nest_cancel(skb, start);
+1 -1
net/rxrpc/net_ns.c
··· 113 113     struct rxrpc_net *rxnet = rxrpc_net(net);
114 114 
115 115     rxnet->live = false;
116 -     del_timer_sync(&rxnet->peer_keepalive_timer);
117 116     cancel_work_sync(&rxnet->peer_keepalive_work);
117 +     del_timer_sync(&rxnet->peer_keepalive_timer);
118 118     rxrpc_destroy_all_calls(rxnet);
119 119     rxrpc_destroy_all_connections(rxnet);
120 120     rxrpc_destroy_all_peers(rxnet);
+5 -1
net/sctp/outqueue.c
··· 914 914             ctx->asoc->base.sk->sk_err = -error;
915 915             return;
916 916         }
917 +         ctx->asoc->stats.octrlchunks++;
917 918         break;
918 919 
919 920     case SCTP_CID_ABORT:
··· 939 938 
940 939     case SCTP_CID_HEARTBEAT:
941 940         if (chunk->pmtu_probe) {
942 -             sctp_packet_singleton(ctx->transport, chunk, ctx->gfp);
941 +             error = sctp_packet_singleton(ctx->transport,
942 +                               chunk, ctx->gfp);
943 +             if (!error)
944 +                 ctx->asoc->stats.octrlchunks++;
943 945             break;
944 946         }
945 947         fallthrough;
+1 -1
net/tls/tls_sw.c
··· 1496 1496     if (prot->version == TLS_1_3_VERSION ||
1497 1497         prot->cipher_type == TLS_CIPHER_CHACHA20_POLY1305)
1498 1498         memcpy(iv + iv_offset, tls_ctx->rx.iv,
1499 -                crypto_aead_ivsize(ctx->aead_recv));
1499 +                prot->iv_size + prot->salt_size);
1500 1500     else
1501 1501         memcpy(iv + iv_offset, tls_ctx->rx.iv, prot->salt_size);
1502 1502 
+15 -7
tools/bpf/bpftool/gen.c
··· 828 828         s->map_cnt = %zu;                \n\
829 829         s->map_skel_sz = sizeof(*s->maps);        \n\
830 830         s->maps = (struct bpf_map_skeleton *)calloc(s->map_cnt, s->map_skel_sz);\n\
831 -         if (!s->maps)                    \n\
831 +         if (!s->maps) {                    \n\
832 +             err = -ENOMEM;                \n\
832 833             goto err;                \n\
834 +         }                        \n\
833 835     ",
834 836         map_cnt
835 837     );
··· 872 870         s->prog_cnt = %zu;                \n\
873 871         s->prog_skel_sz = sizeof(*s->progs);        \n\
874 872         s->progs = (struct bpf_prog_skeleton *)calloc(s->prog_cnt, s->prog_skel_sz);\n\
875 -         if (!s->progs)                    \n\
873 +         if (!s->progs) {                \n\
874 +             err = -ENOMEM;                \n\
876 875             goto err;                \n\
876 +         }                        \n\
877 877     ",
878 878         prog_cnt
879 879     );
··· 1186 1182     %1$s__create_skeleton(struct %1$s *obj)        \n\
1187 1183     {                            \n\
1188 1184         struct bpf_object_skeleton *s;            \n\
1185 +         int err;                    \n\
1189 1186                                 \n\
1190 1187         s = (struct bpf_object_skeleton *)calloc(1, sizeof(*s));\n\
1191 -         if (!s)                        \n\
1188 +         if (!s) {                    \n\
1189 +             err = -ENOMEM;                \n\
1192 1190             goto err;                \n\
1191 +         }                        \n\
1193 1192                                 \n\
1194 1193         s->sz = sizeof(*s);                \n\
1195 1194         s->name = \"%1$s\";                \n\
··· 1213 1206         return 0;                    \n\
1214 1207     err:                            \n\
1215 1208         bpf_object__destroy_skeleton(s);        \n\
1216 -         return -ENOMEM;                    \n\
1209 +         return err;                    \n\
1217 1210     }                            \n\
1218 1211                                 \n\
1219 1212     static inline const void *%2$s__elf_bytes(size_t *sz) \n\
··· 1473 1466                                 \n\
1474 1467         obj = (struct %1$s *)calloc(1, sizeof(*obj));    \n\
1475 1468         if (!obj) {                    \n\
1476 -             errno = ENOMEM;                \n\
1469 +             err = -ENOMEM;                \n\
1477 1470             goto err;                \n\
1478 1471         }                        \n\
1479 1472         s = (struct bpf_object_subskeleton *)calloc(1, sizeof(*s));\n\
1480 1473         if (!s) {                    \n\
1481 -             errno = ENOMEM;                \n\
1474 +             err = -ENOMEM;                \n\
1482 1475             goto err;                \n\
1483 1476         }                        \n\
1484 1477         s->sz = sizeof(*s);                \n\
··· 1490 1483         s->var_cnt = %2$d;                \n\
1491 1484         s->vars = (struct bpf_var_skeleton *)calloc(%2$d, sizeof(*s->vars));\n\
1492 1485         if (!s->vars) {                    \n\
1493 -             errno = ENOMEM;                \n\
1486 +             err = -ENOMEM;                \n\
1494 1487             goto err;                \n\
1495 1488         }                        \n\
1496 1489     ",
··· 1545 1538         return obj;                    \n\
1546 1539     err:                            \n\
1547 1540         %1$s__destroy(obj);                \n\
1541 +         errno = -err;                    \n\
1548 1542         return NULL;                    \n\
1549 1543     }                            \n\
1550 1544                                 \n\
+23
tools/testing/selftests/bpf/prog_tests/dummy_st_ops.c
··· 2 2 /* Copyright (C) 2021. Huawei Technologies Co., Ltd */
3 3 #include <test_progs.h>
4 4 #include "dummy_st_ops.skel.h"
5 + #include "trace_dummy_st_ops.skel.h"
5 6 
6 7 /* Need to keep consistent with definition in include/linux/bpf.h */
7 8 struct bpf_dummy_ops_state {
··· 57 56         .ctx_in = args,
58 57         .ctx_size_in = sizeof(args),
59 58     );
59 +     struct trace_dummy_st_ops *trace_skel;
60 60     struct dummy_st_ops *skel;
61 61     int fd, err;
62 62 
··· 66 64         return;
67 65 
68 66     fd = bpf_program__fd(skel->progs.test_1);
67 + 
68 +     trace_skel = trace_dummy_st_ops__open();
69 +     if (!ASSERT_OK_PTR(trace_skel, "trace_dummy_st_ops__open"))
70 +         goto done;
71 + 
72 +     err = bpf_program__set_attach_target(trace_skel->progs.fentry_test_1,
73 +                          fd, "test_1");
74 +     if (!ASSERT_OK(err, "set_attach_target(fentry_test_1)"))
75 +         goto done;
76 + 
77 +     err = trace_dummy_st_ops__load(trace_skel);
78 +     if (!ASSERT_OK(err, "load(trace_skel)"))
79 +         goto done;
80 + 
81 +     err = trace_dummy_st_ops__attach(trace_skel);
82 +     if (!ASSERT_OK(err, "attach(trace_skel)"))
83 +         goto done;
84 + 
69 85     err = bpf_prog_test_run_opts(fd, &attr);
70 86     ASSERT_OK(err, "test_run");
71 87     ASSERT_EQ(in_state.val, 0x5a, "test_ptr_ret");
72 88     ASSERT_EQ(attr.retval, exp_retval, "test_ret");
89 +     ASSERT_EQ(trace_skel->bss->val, exp_retval, "fentry_val");
73 90 
91 + done:
74 92     dummy_st_ops__destroy(skel);
93 +     trace_dummy_st_ops__destroy(trace_skel);
75 94 }
76 95 
77 96 static void test_dummy_multiple_args(void)
+2 -2
tools/testing/selftests/bpf/progs/map_ptr_kern.c
··· 367 367 
368 368     VERIFY(check_default(&array_of_maps->map, map));
369 369     inner_map = bpf_map_lookup_elem(array_of_maps, &key);
370 -     VERIFY(inner_map != 0);
370 +     VERIFY(inner_map != NULL);
371 371     VERIFY(inner_map->map.max_entries == INNER_MAX_ENTRIES);
372 372 
373 373     return 1;
··· 394 394 
395 395     VERIFY(check_default(&hash_of_maps->map, map));
396 396     inner_map = bpf_map_lookup_elem(hash_of_maps, &key);
397 -     VERIFY(inner_map != 0);
397 +     VERIFY(inner_map != NULL);
398 398     VERIFY(inner_map->map.max_entries == INNER_MAX_ENTRIES);
399 399 
400 400     return 1;
+21
tools/testing/selftests/bpf/progs/trace_dummy_st_ops.c
··· 1 + // SPDX-License-Identifier: GPL-2.0
2 + #include <linux/bpf.h>
3 + #include <bpf/bpf_helpers.h>
4 + #include <bpf/bpf_tracing.h>
5 + 
6 + int val = 0;
7 + 
8 + SEC("fentry/test_1")
9 + int BPF_PROG(fentry_test_1, __u64 *st_ops_ctx)
10 + {
11 +     __u64 state;
12 + 
13 +     /* Read the traced st_ops arg1 which is a pointer */
14 +     bpf_probe_read_kernel(&state, sizeof(__u64), (void *)st_ops_ctx);
15 +     /* Read state->val */
16 +     bpf_probe_read_kernel(&val, sizeof(__u32), (void *)state);
17 + 
18 +     return 0;
19 + }
20 + 
21 + char _license[] SEC("license") = "GPL";
+59 -19
tools/testing/selftests/bpf/test_tcp_check_syncookie_user.c
··· 18 18 #include "bpf_rlimit.h"
19 19 #include "cgroup_helpers.h"
20 20 
21 - static int start_server(const struct sockaddr *addr, socklen_t len)
21 + static int start_server(const struct sockaddr *addr, socklen_t len, bool dual)
22 22 {
23 +     int mode = !dual;
23 24     int fd;
24 25 
25 26     fd = socket(addr->sa_family, SOCK_STREAM, 0);
26 27     if (fd == -1) {
27 28         log_err("Failed to create server socket");
28 29         goto out;
30 +     }
31 + 
32 +     if (addr->sa_family == AF_INET6) {
33 +         if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, (char *)&mode,
34 +                    sizeof(mode)) == -1) {
35 +             log_err("Failed to set the dual-stack mode");
36 +             goto close_out;
37 +         }
29 38     }
30 39 
31 40     if (bind(fd, addr, len) == -1) {
··· 56 47     return fd;
57 48 }
58 49 
59 - static int connect_to_server(int server_fd)
50 + static int connect_to_server(const struct sockaddr *addr, socklen_t len)
60 51 {
61 -     struct sockaddr_storage addr;
62 -     socklen_t len = sizeof(addr);
63 52     int fd = -1;
64 53 
65 -     if (getsockname(server_fd, (struct sockaddr *)&addr, &len)) {
66 -         log_err("Failed to get server addr");
67 -         goto out;
68 -     }
69 - 
70 -     fd = socket(addr.ss_family, SOCK_STREAM, 0);
54 +     fd = socket(addr->sa_family, SOCK_STREAM, 0);
71 55     if (fd == -1) {
72 56         log_err("Failed to create client socket");
73 57         goto out;
74 58     }
75 59 
76 -     if (connect(fd, (const struct sockaddr *)&addr, len) == -1) {
60 +     if (connect(fd, (const struct sockaddr *)addr, len) == -1) {
77 61         log_err("Fail to connect to server");
78 62         goto close_out;
79 63     }
··· 118 116     return map_fd;
119 117 }
120 118 
121 - static int run_test(int server_fd, int results_fd, bool xdp)
119 + static int run_test(int server_fd, int results_fd, bool xdp,
120 +             const struct sockaddr *addr, socklen_t len)
122 121 {
123 122     int client = -1, srv_client = -1;
124 123     int ret = 0;
··· 145 142         goto err;
146 143     }
147 144 
148 -     client = connect_to_server(server_fd);
145 +     client = connect_to_server(addr, len);
149 146     if (client == -1)
150 147         goto err;
151 148 
··· 202 199     return ret;
203 200 }
204 201 
202 + static bool get_port(int server_fd, in_port_t *port)
203 + {
204 +     struct sockaddr_in addr;
205 +     socklen_t len = sizeof(addr);
206 + 
207 +     if (getsockname(server_fd, (struct sockaddr *)&addr, &len)) {
208 +         log_err("Failed to get server addr");
209 +         return false;
210 +     }
211 + 
212 +     /* sin_port and sin6_port are located at the same offset. */
213 +     *port = addr.sin_port;
214 +     return true;
215 + }
216 + 
205 217 int main(int argc, char **argv)
206 218 {
207 219     struct sockaddr_in addr4;
208 220     struct sockaddr_in6 addr6;
221 +     struct sockaddr_in addr4dual;
222 +     struct sockaddr_in6 addr6dual;
209 223     int server = -1;
210 224     int server_v6 = -1;
225 +     int server_dual = -1;
211 226     int results = -1;
212 227     int err = 0;
213 228     bool xdp;
··· 245 224     addr4.sin_family = AF_INET;
246 225     addr4.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
247 226     addr4.sin_port = 0;
227 +     memcpy(&addr4dual, &addr4, sizeof(addr4dual));
248 228 
249 229     memset(&addr6, 0, sizeof(addr6));
250 230     addr6.sin6_family = AF_INET6;
251 231     addr6.sin6_addr = in6addr_loopback;
252 232     addr6.sin6_port = 0;
253 233 
254 -     server = start_server((const struct sockaddr *)&addr4, sizeof(addr4));
255 -     if (server == -1)
234 +     memset(&addr6dual, 0, sizeof(addr6dual));
235 +     addr6dual.sin6_family = AF_INET6;
236 +     addr6dual.sin6_addr = in6addr_any;
237 +     addr6dual.sin6_port = 0;
238 + 
239 +     server = start_server((const struct sockaddr *)&addr4, sizeof(addr4),
240 +                   false);
241 +     if (server == -1 || !get_port(server, &addr4.sin_port))
256 242         goto err;
257 243 
258 244     server_v6 = start_server((const struct sockaddr *)&addr6,
259 -                  sizeof(addr6));
260 -     if (server_v6 == -1)
245 +                  sizeof(addr6), false);
246 +     if (server_v6 == -1 || !get_port(server_v6, &addr6.sin6_port))
261 247         goto err;
262 248 
263 -     if (run_test(server, results, xdp))
249 +     server_dual = start_server((const struct sockaddr *)&addr6dual,
250 +                    sizeof(addr6dual), true);
251 +     if (server_dual == -1 || !get_port(server_dual, &addr4dual.sin_port))
264 252         goto err;
265 253 
266 -     if (run_test(server_v6, results, xdp))
254 +     if (run_test(server, results, xdp,
255 +              (const struct sockaddr *)&addr4, sizeof(addr4)))
256 +         goto err;
257 + 
258 +     if (run_test(server_v6, results, xdp,
259 +              (const struct sockaddr *)&addr6, sizeof(addr6)))
260 +         goto err;
261 + 
262 +     if (run_test(server_dual, results, xdp,
263 +              (const struct sockaddr *)&addr4dual, sizeof(addr4dual)))
267 264         goto err;
268 265 
269 266     printf("ok\n");
··· 291 252 out:
292 253     close(server);
293 254     close(server_v6);
255 +     close(server_dual);
294 256     close(results);
295 257     return err;
296 258 }
+14
tools/testing/selftests/net/fib_nexthops.sh
··· 1208 1208     set +e
1209 1209     check_nexthop "dev veth1" ""
1210 1210     log_test $? 0 "Nexthops removed on admin down"
1211 + 
1212 +     # nexthop route delete warning: route add with nhid and delete
1213 +     # using device
1214 +     run_cmd "$IP li set dev veth1 up"
1215 +     run_cmd "$IP nexthop add id 12 via 172.16.1.3 dev veth1"
1216 +     out1=`dmesg | grep "WARNING:.*fib_nh_match.*" | wc -l`
1217 +     run_cmd "$IP route add 172.16.101.1/32 nhid 12"
1218 +     run_cmd "$IP route delete 172.16.101.1/32 dev veth1"
1219 +     out2=`dmesg | grep "WARNING:.*fib_nh_match.*" | wc -l`
1220 +     [ $out1 -eq $out2 ]
1221 +     rc=$?
1222 +     log_test $rc 0 "Delete nexthop route warning"
1223 +     run_cmd "$IP route delete 172.16.101.1/32 nhid 12"
1224 +     run_cmd "$IP nexthop del id 12"
1211 1225 }
1212 1226 
1213 1227 ipv4_grp_fcnal()