···
 Wolfram Sang <wsa@kernel.org> <wsa@the-dreams.de>
 Yakir Yang <kuankuan.y@gmail.com> <ykk@rock-chips.com>
 Yanteng Si <si.yanteng@linux.dev> <siyanteng@loongson.cn>
+Ying Huang <huang.ying.caritas@gmail.com> <ying.huang@intel.com>
 Yusuke Goda <goda.yusuke@renesas.com>
 Zack Rusin <zack.rusin@broadcom.com> <zackr@vmware.com>
 Zhu Yanjun <zyjzyj2000@gmail.com> <yanjunz@nvidia.com>
+1-3
Documentation/admin-guide/pm/amd-pstate.rst
···
 In some ASICs, the highest CPPC performance is not the one in the ``_CPC``
 table, so we need to expose it to sysfs. If boost is not active, but
 still supported, this maximum frequency will be larger than the one in
-``cpuinfo``. On systems that support preferred core, the driver will have
-different values for some cores than others and this will reflect the values
-advertised by the platform at bootup.
+``cpuinfo``.
 This attribute is read-only.

 ``amd_pstate_lowest_nonlinear_freq``
···
         table that specifies the PPID to LIODN mapping. Needed if the PAMU is
         used. Value is a 12 bit value where value is a LIODN ID for this JR.
         This property is normally set by boot firmware.
-        $ref: /schemas/types.yaml#/definitions/uint32
-        maximum: 0xfff
+        $ref: /schemas/types.yaml#/definitions/uint32-array
+        items:
+          - maximum: 0xfff

   '^rtic@[0-9a-f]+$':
     type: object
···
         Needed if the PAMU is used. Value is a 12 bit value where value
         is a LIODN ID for this JR. This property is normally set by boot
         firmware.
-        $ref: /schemas/types.yaml#/definitions/uint32
-        maximum: 0xfff
+        $ref: /schemas/types.yaml#/definitions/uint32-array
+        items:
+          - maximum: 0xfff

     fsl,rtic-region:
       description:
···

   fsl,liodn:
     $ref: /schemas/types.yaml#/definitions/uint32-array
+    maxItems: 2
     description: See pamu.txt. Two LIODN(s). DQRR LIODN (DLIODN) and Frame LIODN
       (FLIODN)

···
     type: object
     properties:
       fsl,liodn:
+        $ref: /schemas/types.yaml#/definitions/uint32-array
         description: See pamu.txt, PAMU property used for static LIODN assignment

       fsl,iommu-parent:
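For illustration, a portal node matching the adjusted two-cell schema might look like the following DTS fragment. The node name, unit address and LIODN values here are made up; only the two-cell shape of ``fsl,liodn`` (DQRR LIODN followed by Frame LIODN) comes from the binding text above:

```dts
qman-portal@0 {
	compatible = "fsl,qman-portal";
	/* DQRR LIODN (DLIODN) then Frame LIODN (FLIODN);
	 * values are illustrative only. */
	fsl,liodn = <0x1 0x2>;
};
```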
+850
Documentation/mm/process_addrs.rst
···33=================44Process Addresses55=================66+77+.. toctree::88+ :maxdepth: 399+1010+1111+Userland memory ranges are tracked by the kernel via Virtual Memory Areas or1212+'VMA's of type :c:struct:`!struct vm_area_struct`.1313+1414+Each VMA describes a virtually contiguous memory range with identical1515+attributes, each described by a :c:struct:`!struct vm_area_struct`1616+object. Userland access outside of VMAs is invalid except in the case where an1717+adjacent stack VMA could be extended to contain the accessed address.1818+1919+All VMAs are contained within one and only one virtual address space, described2020+by a :c:struct:`!struct mm_struct` object which is referenced by all tasks (that is,2121+threads) which share the virtual address space. We refer to this as the2222+:c:struct:`!mm`.2323+2424+Each mm object contains a maple tree data structure which describes all VMAs2525+within the virtual address space.2626+2727+.. note:: An exception to this is the 'gate' VMA which is provided by2828+ architectures which use :c:struct:`!vsyscall` and is a global static2929+ object which does not belong to any specific mm.3030+3131+-------3232+Locking3333+-------3434+3535+The kernel is designed to be highly scalable against concurrent read operations3636+on VMA **metadata** so a complicated set of locks are required to ensure memory3737+corruption does not occur.3838+3939+.. note:: Locking VMAs for their metadata does not have any impact on the memory4040+ they describe nor the page tables that map them.4141+4242+Terminology4343+-----------4444+4545+* **mmap locks** - Each MM has a read/write semaphore :c:member:`!mmap_lock`4646+ which locks at a process address space granularity which can be acquired via4747+ :c:func:`!mmap_read_lock`, :c:func:`!mmap_write_lock` and variants.4848+* **VMA locks** - The VMA lock is at VMA granularity (of course) which behaves4949+ as a read/write semaphore in practice. 
A VMA read lock is obtained via5050+ :c:func:`!lock_vma_under_rcu` (and unlocked via :c:func:`!vma_end_read`) and a5151+ write lock via :c:func:`!vma_start_write` (all VMA write locks are unlocked5252+ automatically when the mmap write lock is released). To take a VMA write lock5353+ you **must** have already acquired an :c:func:`!mmap_write_lock`.5454+* **rmap locks** - When trying to access VMAs through the reverse mapping via a5555+ :c:struct:`!struct address_space` or :c:struct:`!struct anon_vma` object5656+ (reachable from a folio via :c:member:`!folio->mapping`). VMAs must be stabilised via5757+ :c:func:`!anon_vma_[try]lock_read` or :c:func:`!anon_vma_[try]lock_write` for5858+ anonymous memory and :c:func:`!i_mmap_[try]lock_read` or5959+ :c:func:`!i_mmap_[try]lock_write` for file-backed memory. We refer to these6060+ locks as the reverse mapping locks, or 'rmap locks' for brevity.6161+6262+We discuss page table locks separately in the dedicated section below.6363+6464+The first thing **any** of these locks achieve is to **stabilise** the VMA6565+within the MM tree. That is, guaranteeing that the VMA object will not be6666+deleted from under you nor modified (except for some specific fields6767+described below).6868+6969+Stabilising a VMA also keeps the address space described by it around.7070+7171+Lock usage7272+----------7373+7474+If you want to **read** VMA metadata fields or just keep the VMA stable, you7575+must do one of the following:7676+7777+* Obtain an mmap read lock at the MM granularity via :c:func:`!mmap_read_lock` (or a7878+ suitable variant), unlocking it with a matching :c:func:`!mmap_read_unlock` when7979+ you're done with the VMA, *or*8080+* Try to obtain a VMA read lock via :c:func:`!lock_vma_under_rcu`. 
This tries to8181+ acquire the lock atomically so might fail, in which case fall-back logic is8282+ required to instead obtain an mmap read lock if this returns :c:macro:`!NULL`,8383+ *or*8484+* Acquire an rmap lock before traversing the locked interval tree (whether8585+ anonymous or file-backed) to obtain the required VMA.8686+8787+If you want to **write** VMA metadata fields, then things vary depending on the8888+field (we explore each VMA field in detail below). For the majority you must:8989+9090+* Obtain an mmap write lock at the MM granularity via :c:func:`!mmap_write_lock` (or a9191+ suitable variant), unlocking it with a matching :c:func:`!mmap_write_unlock` when9292+ you're done with the VMA, *and*9393+* Obtain a VMA write lock via :c:func:`!vma_start_write` for each VMA you wish to9494+ modify, which will be released automatically when :c:func:`!mmap_write_unlock` is9595+ called.9696+* If you want to be able to write to **any** field, you must also hide the VMA9797+ from the reverse mapping by obtaining an **rmap write lock**.9898+9999+VMA locks are special in that you must obtain an mmap **write** lock **first**100100+in order to obtain a VMA **write** lock. A VMA **read** lock however can be101101+obtained without any other lock (:c:func:`!lock_vma_under_rcu` will acquire then102102+release an RCU lock to lookup the VMA for you).103103+104104+This constrains the impact of writers on readers, as a writer can interact with105105+one VMA while a reader interacts with another simultaneously.106106+107107+.. note:: The primary users of VMA read locks are page fault handlers, which108108+ means that without a VMA write lock, page faults will run concurrent with109109+ whatever you are doing.110110+111111+Examining all valid lock states:112112+113113+.. table::114114+115115+ ========= ======== ========= ======= ===== =========== ==========116116+ mmap lock VMA lock rmap lock Stable? Read? Write most? 
Write all?117117+ ========= ======== ========= ======= ===== =========== ==========118118+ \- \- \- N N N N119119+ \- R \- Y Y N N120120+ \- \- R/W Y Y N N121121+ R/W \-/R \-/R/W Y Y N N122122+ W W \-/R Y Y Y N123123+ W W W Y Y Y Y124124+ ========= ======== ========= ======= ===== =========== ==========125125+126126+.. warning:: While it's possible to obtain a VMA lock while holding an mmap read lock,127127+ attempting to do the reverse is invalid as it can result in deadlock - if128128+ another task already holds an mmap write lock and attempts to acquire a VMA129129+ write lock that will deadlock on the VMA read lock.130130+131131+All of these locks behave as read/write semaphores in practice, so you can132132+obtain either a read or a write lock for each of these.133133+134134+.. note:: Generally speaking, a read/write semaphore is a class of lock which135135+ permits concurrent readers. However a write lock can only be obtained136136+ once all readers have left the critical region (and pending readers137137+ made to wait).138138+139139+ This renders read locks on a read/write semaphore concurrent with other140140+ readers and write locks exclusive against all others holding the semaphore.141141+142142+VMA fields143143+^^^^^^^^^^144144+145145+We can subdivide :c:struct:`!struct vm_area_struct` fields by their purpose, which makes it146146+easier to explore their locking characteristics:147147+148148+.. note:: We exclude VMA lock-specific fields here to avoid confusion, as these149149+ are in effect an internal implementation detail.150150+151151+.. table:: Virtual layout fields152152+153153+ ===================== ======================================== ===========154154+ Field Description Write lock155155+ ===================== ======================================== ===========156156+ :c:member:`!vm_start` Inclusive start virtual address of range mmap write,157157+ VMA describes. 
VMA write,158158+ rmap write.159159+ :c:member:`!vm_end` Exclusive end virtual address of range mmap write,160160+ VMA describes. VMA write,161161+ rmap write.162162+ :c:member:`!vm_pgoff` Describes the page offset into the file, mmap write,163163+ the original page offset within the VMA write,164164+ virtual address space (prior to any rmap write.165165+ :c:func:`!mremap`), or PFN if a PFN map166166+ and the architecture does not support167167+ :c:macro:`!CONFIG_ARCH_HAS_PTE_SPECIAL`.168168+ ===================== ======================================== ===========169169+170170+These fields describes the size, start and end of the VMA, and as such cannot be171171+modified without first being hidden from the reverse mapping since these fields172172+are used to locate VMAs within the reverse mapping interval trees.173173+174174+.. table:: Core fields175175+176176+ ============================ ======================================== =========================177177+ Field Description Write lock178178+ ============================ ======================================== =========================179179+ :c:member:`!vm_mm` Containing mm_struct. 
None - written once on180180+ initial map.181181+ :c:member:`!vm_page_prot` Architecture-specific page table mmap write, VMA write.182182+ protection bits determined from VMA183183+ flags.184184+ :c:member:`!vm_flags` Read-only access to VMA flags describing N/A185185+ attributes of the VMA, in union with186186+ private writable187187+ :c:member:`!__vm_flags`.188188+ :c:member:`!__vm_flags` Private, writable access to VMA flags mmap write, VMA write.189189+ field, updated by190190+ :c:func:`!vm_flags_*` functions.191191+ :c:member:`!vm_file` If the VMA is file-backed, points to a None - written once on192192+ struct file object describing the initial map.193193+ underlying file, if anonymous then194194+ :c:macro:`!NULL`.195195+ :c:member:`!vm_ops` If the VMA is file-backed, then either None - Written once on196196+ the driver or file-system provides a initial map by197197+ :c:struct:`!struct vm_operations_struct` :c:func:`!f_ops->mmap()`.198198+ object describing callbacks to be199199+ invoked on VMA lifetime events.200200+ :c:member:`!vm_private_data` A :c:member:`!void *` field for Handled by driver.201201+ driver-specific metadata.202202+ ============================ ======================================== =========================203203+204204+These are the core fields which describe the MM the VMA belongs to and its attributes.205205+206206+.. table:: Config-specific fields207207+208208+ ================================= ===================== ======================================== ===============209209+ Field Configuration option Description Write lock210210+ ================================= ===================== ======================================== ===============211211+ :c:member:`!anon_name` CONFIG_ANON_VMA_NAME A field for storing a mmap write,212212+ :c:struct:`!struct anon_vma_name` VMA write.213213+ object providing a name for anonymous214214+ mappings, or :c:macro:`!NULL` if none215215+ is set or the VMA is file-backed. 
The216216+ underlying object is reference counted217217+ and can be shared across multiple VMAs218218+ for scalability.219219+ :c:member:`!swap_readahead_info` CONFIG_SWAP Metadata used by the swap mechanism mmap read,220220+ to perform readahead. This field is swap-specific221221+ accessed atomically. lock.222222+ :c:member:`!vm_policy` CONFIG_NUMA :c:type:`!mempolicy` object which mmap write,223223+ describes the NUMA behaviour of the VMA write.224224+ VMA. The underlying object is reference225225+ counted.226226+ :c:member:`!numab_state` CONFIG_NUMA_BALANCING :c:type:`!vma_numab_state` object which mmap read,227227+ describes the current state of numab-specific228228+ NUMA balancing in relation to this VMA. lock.229229+ Updated under mmap read lock by230230+ :c:func:`!task_numa_work`.231231+ :c:member:`!vm_userfaultfd_ctx` CONFIG_USERFAULTFD Userfaultfd context wrapper object of mmap write,232232+ type :c:type:`!vm_userfaultfd_ctx`, VMA write.233233+ either of zero size if userfaultfd is234234+ disabled, or containing a pointer235235+ to an underlying236236+ :c:type:`!userfaultfd_ctx` object which237237+ describes userfaultfd metadata.238238+ ================================= ===================== ======================================== ===============239239+240240+These fields are present or not depending on whether the relevant kernel241241+configuration option is set.242242+243243+.. 
table:: Reverse mapping fields244244+245245+ =================================== ========================================= ============================246246+ Field Description Write lock247247+ =================================== ========================================= ============================248248+ :c:member:`!shared.rb` A red/black tree node used, if the mmap write, VMA write,249249+ mapping is file-backed, to place the VMA i_mmap write.250250+ in the251251+ :c:member:`!struct address_space->i_mmap`252252+ red/black interval tree.253253+ :c:member:`!shared.rb_subtree_last` Metadata used for management of the mmap write, VMA write,254254+ interval tree if the VMA is file-backed. i_mmap write.255255+ :c:member:`!anon_vma_chain` List of pointers to both forked/CoW’d mmap read, anon_vma write.256256+ :c:type:`!anon_vma` objects and257257+ :c:member:`!vma->anon_vma` if it is258258+ non-:c:macro:`!NULL`.259259+ :c:member:`!anon_vma` :c:type:`!anon_vma` object used by When :c:macro:`NULL` and260260+ anonymous folios mapped exclusively to setting non-:c:macro:`NULL`:261261+ this VMA. Initially set by mmap read, page_table_lock.262262+ :c:func:`!anon_vma_prepare` serialised263263+ by the :c:macro:`!page_table_lock`. This When non-:c:macro:`NULL` and264264+ is set as soon as any page is faulted in. setting :c:macro:`NULL`:265265+ mmap write, VMA write,266266+ anon_vma write.267267+ =================================== ========================================= ============================268268+269269+These fields are used to both place the VMA within the reverse mapping, and for270270+anonymous mappings, to be able to access both related :c:struct:`!struct anon_vma` objects271271+and the :c:struct:`!struct anon_vma` in which folios mapped exclusively to this VMA should272272+reside.273273+274274+.. 
note:: If a file-backed mapping is mapped with :c:macro:`!MAP_PRIVATE` set275275+ then it can be in both the :c:type:`!anon_vma` and :c:type:`!i_mmap`276276+ trees at the same time, so all of these fields might be utilised at277277+ once.278278+279279+Page tables280280+-----------281281+282282+We won't speak exhaustively on the subject but broadly speaking, page tables map283283+virtual addresses to physical ones through a series of page tables, each of284284+which contain entries with physical addresses for the next page table level285285+(along with flags), and at the leaf level the physical addresses of the286286+underlying physical data pages or a special entry such as a swap entry,287287+migration entry or other special marker. Offsets into these pages are provided288288+by the virtual address itself.289289+290290+In Linux these are divided into five levels - PGD, P4D, PUD, PMD and PTE. Huge291291+pages might eliminate one or two of these levels, but when this is the case we292292+typically refer to the leaf level as the PTE level regardless.293293+294294+.. note:: In instances where the architecture supports fewer page tables than295295+ five the kernel cleverly 'folds' page table levels, that is stubbing296296+ out functions related to the skipped levels. This allows us to297297+ conceptually act as if there were always five levels, even if the298298+ compiler might, in practice, eliminate any code relating to missing299299+ ones.300300+301301+There are four key operations typically performed on page tables:302302+303303+1. **Traversing** page tables - Simply reading page tables in order to traverse304304+ them. This only requires that the VMA is kept stable, so a lock which305305+ establishes this suffices for traversal (there are also lockless variants306306+ which eliminate even this requirement, such as :c:func:`!gup_fast`).307307+2. 
**Installing** page table mappings - Whether creating a new mapping or308308+ modifying an existing one in such a way as to change its identity. This309309+ requires that the VMA is kept stable via an mmap or VMA lock (explicitly not310310+ rmap locks).311311+3. **Zapping/unmapping** page table entries - This is what the kernel calls312312+ clearing page table mappings at the leaf level only, whilst leaving all page313313+ tables in place. This is a very common operation in the kernel performed on314314+ file truncation, the :c:macro:`!MADV_DONTNEED` operation via315315+ :c:func:`!madvise`, and others. This is performed by a number of functions316316+ including :c:func:`!unmap_mapping_range` and :c:func:`!unmap_mapping_pages`.317317+ The VMA need only be kept stable for this operation.318318+4. **Freeing** page tables - When finally the kernel removes page tables from a319319+ userland process (typically via :c:func:`!free_pgtables`) extreme care must320320+ be taken to ensure this is done safely, as this logic finally frees all page321321+ tables in the specified range, ignoring existing leaf entries (it assumes the322322+ caller has both zapped the range and prevented any further faults or323323+ modifications within it).324324+325325+.. 
note:: Modifying mappings for reclaim or migration is performed under rmap326326+ lock as it, like zapping, does not fundamentally modify the identity327327+ of what is being mapped.328328+329329+**Traversing** and **zapping** ranges can be performed holding any one of the330330+locks described in the terminology section above - that is the mmap lock, the331331+VMA lock or either of the reverse mapping locks.332332+333333+That is - as long as you keep the relevant VMA **stable** - you are good to go334334+ahead and perform these operations on page tables (though internally, kernel335335+operations that perform writes also acquire internal page table locks to336336+serialise - see the page table implementation detail section for more details).337337+338338+When **installing** page table entries, the mmap or VMA lock must be held to339339+keep the VMA stable. We explore why this is in the page table locking details340340+section below.341341+342342+.. warning:: Page tables are normally only traversed in regions covered by VMAs.343343+ If you want to traverse page tables in areas that might not be344344+ covered by VMAs, heavier locking is required.345345+ See :c:func:`!walk_page_range_novma` for details.346346+347347+**Freeing** page tables is an entirely internal memory management operation and348348+has special requirements (see the page freeing section below for more details).349349+350350+.. 
warning:: When **freeing** page tables, it must not be possible for VMAs351351+ containing the ranges those page tables map to be accessible via352352+ the reverse mapping.353353+354354+ The :c:func:`!free_pgtables` function removes the relevant VMAs355355+ from the reverse mappings, but no other VMAs can be permitted to be356356+ accessible and span the specified range.357357+358358+Lock ordering359359+-------------360360+361361+As we have multiple locks across the kernel which may or may not be taken at the362362+same time as explicit mm or VMA locks, we have to be wary of lock inversion, and363363+the **order** in which locks are acquired and released becomes very important.364364+365365+.. note:: Lock inversion occurs when two threads need to acquire multiple locks,366366+ but in doing so inadvertently cause a mutual deadlock.367367+368368+ For example, consider thread 1 which holds lock A and tries to acquire lock B,369369+ while thread 2 holds lock B and tries to acquire lock A.370370+371371+ Both threads are now deadlocked on each other. However, had they attempted to372372+ acquire locks in the same order, one would have waited for the other to373373+ complete its work and no deadlock would have occurred.374374+375375+The opening comment in :c:macro:`!mm/rmap.c` describes in detail the required376376+ordering of locks within memory management code:377377+378378+.. 
code-block::379379+380380+ inode->i_rwsem (while writing or truncating, not reading or faulting)381381+ mm->mmap_lock382382+ mapping->invalidate_lock (in filemap_fault)383383+ folio_lock384384+ hugetlbfs_i_mmap_rwsem_key (in huge_pmd_share, see hugetlbfs below)385385+ vma_start_write386386+ mapping->i_mmap_rwsem387387+ anon_vma->rwsem388388+ mm->page_table_lock or pte_lock389389+ swap_lock (in swap_duplicate, swap_info_get)390390+ mmlist_lock (in mmput, drain_mmlist and others)391391+ mapping->private_lock (in block_dirty_folio)392392+ i_pages lock (widely used)393393+ lruvec->lru_lock (in folio_lruvec_lock_irq)394394+ inode->i_lock (in set_page_dirty's __mark_inode_dirty)395395+ bdi.wb->list_lock (in set_page_dirty's __mark_inode_dirty)396396+ sb_lock (within inode_lock in fs/fs-writeback.c)397397+ i_pages lock (widely used, in set_page_dirty,398398+ in arch-dependent flush_dcache_mmap_lock,399399+ within bdi.wb->list_lock in __sync_single_inode)400400+401401+There is also a file-system specific lock ordering comment located at the top of402402+:c:macro:`!mm/filemap.c`:403403+404404+.. 
code-block::405405+406406+ ->i_mmap_rwsem (truncate_pagecache)407407+ ->private_lock (__free_pte->block_dirty_folio)408408+ ->swap_lock (exclusive_swap_page, others)409409+ ->i_pages lock410410+411411+ ->i_rwsem412412+ ->invalidate_lock (acquired by fs in truncate path)413413+ ->i_mmap_rwsem (truncate->unmap_mapping_range)414414+415415+ ->mmap_lock416416+ ->i_mmap_rwsem417417+ ->page_table_lock or pte_lock (various, mainly in memory.c)418418+ ->i_pages lock (arch-dependent flush_dcache_mmap_lock)419419+420420+ ->mmap_lock421421+ ->invalidate_lock (filemap_fault)422422+ ->lock_page (filemap_fault, access_process_vm)423423+424424+ ->i_rwsem (generic_perform_write)425425+ ->mmap_lock (fault_in_readable->do_page_fault)426426+427427+ bdi->wb.list_lock428428+ sb_lock (fs/fs-writeback.c)429429+ ->i_pages lock (__sync_single_inode)430430+431431+ ->i_mmap_rwsem432432+ ->anon_vma.lock (vma_merge)433433+434434+ ->anon_vma.lock435435+ ->page_table_lock or pte_lock (anon_vma_prepare and various)436436+437437+ ->page_table_lock or pte_lock438438+ ->swap_lock (try_to_unmap_one)439439+ ->private_lock (try_to_unmap_one)440440+ ->i_pages lock (try_to_unmap_one)441441+ ->lruvec->lru_lock (follow_page_mask->mark_page_accessed)442442+ ->lruvec->lru_lock (check_pte_range->folio_isolate_lru)443443+ ->private_lock (folio_remove_rmap_pte->set_page_dirty)444444+ ->i_pages lock (folio_remove_rmap_pte->set_page_dirty)445445+ bdi.wb->list_lock (folio_remove_rmap_pte->set_page_dirty)446446+ ->inode->i_lock (folio_remove_rmap_pte->set_page_dirty)447447+ bdi.wb->list_lock (zap_pte_range->set_page_dirty)448448+ ->inode->i_lock (zap_pte_range->set_page_dirty)449449+ ->private_lock (zap_pte_range->block_dirty_folio)450450+451451+Please check the current state of these comments which may have changed since452452+the time of writing of this document.453453+454454+------------------------------455455+Locking Implementation Details456456+------------------------------457457+458458+.. 
warning:: Locking rules for PTE-level page tables are very different from459459+ locking rules for page tables at other levels.460460+461461+Page table locking details462462+--------------------------463463+464464+In addition to the locks described in the terminology section above, we have465465+additional locks dedicated to page tables:466466+467467+* **Higher level page table locks** - Higher level page tables, that is PGD, P4D468468+ and PUD each make use of the process address space granularity469469+ :c:member:`!mm->page_table_lock` lock when modified.470470+471471+* **Fine-grained page table locks** - PMDs and PTEs each have fine-grained locks472472+ either kept within the folios describing the page tables or allocated473473+ separated and pointed at by the folios if :c:macro:`!ALLOC_SPLIT_PTLOCKS` is474474+ set. The PMD spin lock is obtained via :c:func:`!pmd_lock`, however PTEs are475475+ mapped into higher memory (if a 32-bit system) and carefully locked via476476+ :c:func:`!pte_offset_map_lock`.477477+478478+These locks represent the minimum required to interact with each page table479479+level, but there are further requirements.480480+481481+Importantly, note that on a **traversal** of page tables, sometimes no such482482+locks are taken. 
However, at the PTE level, at least concurrent page table483483+deletion must be prevented (using RCU) and the page table must be mapped into484484+high memory, see below.485485+486486+Whether care is taken on reading the page table entries depends on the487487+architecture, see the section on atomicity below.488488+489489+Locking rules490490+^^^^^^^^^^^^^491491+492492+We establish basic locking rules when interacting with page tables:493493+494494+* When changing a page table entry the page table lock for that page table495495+ **must** be held, except if you can safely assume nobody can access the page496496+ tables concurrently (such as on invocation of :c:func:`!free_pgtables`).497497+* Reads from and writes to page table entries must be *appropriately*498498+ atomic. See the section on atomicity below for details.499499+* Populating previously empty entries requires that the mmap or VMA locks are500500+ held (read or write), doing so with only rmap locks would be dangerous (see501501+ the warning below).502502+* As mentioned previously, zapping can be performed while simply keeping the VMA503503+ stable, that is holding any one of the mmap, VMA or rmap locks.504504+505505+.. warning:: Populating previously empty entries is dangerous as, when unmapping506506+ VMAs, :c:func:`!vms_clear_ptes` has a window of time between507507+ zapping (via :c:func:`!unmap_vmas`) and freeing page tables (via508508+ :c:func:`!free_pgtables`), where the VMA is still visible in the509509+ rmap tree. 
:c:func:`!free_pgtables` assumes that the zap has510510+ already been performed and removes PTEs unconditionally (along with511511+ all other page tables in the freed range), so installing new PTE512512+ entries could leak memory and also cause other unexpected and513513+ dangerous behaviour.514514+515515+There are additional rules applicable when moving page tables, which we discuss516516+in the section on this topic below.517517+518518+PTE-level page tables are different from page tables at other levels, and there519519+are extra requirements for accessing them:520520+521521+* On 32-bit architectures, they may be in high memory (meaning they need to be522522+ mapped into kernel memory to be accessible).523523+* When empty, they can be unlinked and RCU-freed while holding an mmap lock or524524+ rmap lock for reading in combination with the PTE and PMD page table locks.525525+ In particular, this happens in :c:func:`!retract_page_tables` when handling526526+ :c:macro:`!MADV_COLLAPSE`.527527+ So accessing PTE-level page tables requires at least holding an RCU read lock;528528+ but that only suffices for readers that can tolerate racing with concurrent529529+ page table updates such that an empty PTE is observed (in a page table that530530+ has actually already been detached and marked for RCU freeing) while another531531+ new page table has been installed in the same location and filled with532532+ entries. 
Writers normally need to take the PTE lock and revalidate that the533533+ PMD entry still refers to the same PTE-level page table.534534+535535+To access PTE-level page tables, a helper like :c:func:`!pte_offset_map_lock` or536536+:c:func:`!pte_offset_map` can be used depending on stability requirements.537537+These map the page table into kernel memory if required, take the RCU lock, and538538+depending on variant, may also look up or acquire the PTE lock.539539+See the comment on :c:func:`!__pte_offset_map_lock`.540540+541541+Atomicity542542+^^^^^^^^^543543+544544+Regardless of page table locks, the MMU hardware concurrently updates accessed545545+and dirty bits (perhaps more, depending on architecture). Additionally, page546546+table traversal operations in parallel (though holding the VMA stable) and547547+functionality like GUP-fast locklessly traverses (that is reads) page tables,548548+without even keeping the VMA stable at all.549549+550550+When performing a page table traversal and keeping the VMA stable, whether a551551+read must be performed once and only once or not depends on the architecture552552+(for instance x86-64 does not require any special precautions).553553+554554+If a write is being performed, or if a read informs whether a write takes place555555+(on an installation of a page table entry say, for instance in556556+:c:func:`!__pud_install`), special care must always be taken. In these cases we557557+can never assume that page table locks give us entirely exclusive access, and558558+must retrieve page table entries once and only once.559559+560560+If we are reading page table entries, then we need only ensure that the compiler561561+does not rearrange our loads. 
This is achieved via :c:func:`!pXXp_get`562562+functions - :c:func:`!pgdp_get`, :c:func:`!p4dp_get`, :c:func:`!pudp_get`,563563+:c:func:`!pmdp_get`, and :c:func:`!ptep_get`.564564+565565+Each of these uses :c:func:`!READ_ONCE` to guarantee that the compiler reads566566+the page table entry only once.567567+568568+However, if we wish to manipulate an existing page table entry and care about569569+the previously stored data, we must go further and use an hardware atomic570570+operation as, for example, in :c:func:`!ptep_get_and_clear`.571571+572572+Equally, operations that do not rely on the VMA being held stable, such as573573+GUP-fast (see :c:func:`!gup_fast` and its various page table level handlers like574574+:c:func:`!gup_fast_pte_range`), must very carefully interact with page table575575+entries, using functions such as :c:func:`!ptep_get_lockless` and equivalent for576576+higher level page table levels.577577+578578+Writes to page table entries must also be appropriately atomic, as established579579+by :c:func:`!set_pXX` functions - :c:func:`!set_pgd`, :c:func:`!set_p4d`,580580+:c:func:`!set_pud`, :c:func:`!set_pmd`, and :c:func:`!set_pte`.581581+582582+Equally functions which clear page table entries must be appropriately atomic,583583+as in :c:func:`!pXX_clear` functions - :c:func:`!pgd_clear`,584584+:c:func:`!p4d_clear`, :c:func:`!pud_clear`, :c:func:`!pmd_clear`, and585585+:c:func:`!pte_clear`.586586+587587+Page table installation588588+^^^^^^^^^^^^^^^^^^^^^^^589589+590590+Page table installation is performed with the VMA held stable explicitly by an591591+mmap or VMA lock in read or write mode (see the warning in the locking rules592592+section for details as to why).593593+594594+When allocating a P4D, PUD or PMD and setting the relevant entry in the above595595+PGD, P4D or PUD, the :c:member:`!mm->page_table_lock` must be held. 
This is596596+acquired in :c:func:`!__p4d_alloc`, :c:func:`!__pud_alloc` and597597+:c:func:`!__pmd_alloc` respectively.598598+599599+.. note:: :c:func:`!__pmd_alloc` actually invokes :c:func:`!pud_lock` and600600+ :c:func:`!pud_lockptr` in turn, however at the time of writing it ultimately601601+ references the :c:member:`!mm->page_table_lock`.602602+603603+Allocating a PTE will either use the :c:member:`!mm->page_table_lock` or, if604604+:c:macro:`!USE_SPLIT_PMD_PTLOCKS` is defined, a lock embedded in the PMD605605+physical page metadata in the form of a :c:struct:`!struct ptdesc`, acquired by606606+:c:func:`!pmd_ptdesc` called from :c:func:`!pmd_lock` and ultimately607607+:c:func:`!__pte_alloc`.608608+609609+Finally, modifying the contents of the PTE requires special treatment, as the610610+PTE page table lock must be acquired whenever we want stable and exclusive611611+access to entries contained within a PTE, especially when we wish to modify612612+them.613613+614614+This is performed via :c:func:`!pte_offset_map_lock` which carefully checks to615615+ensure that the PTE hasn't changed from under us, ultimately invoking616616+:c:func:`!pte_lockptr` to obtain a spin lock at PTE granularity contained within617617+the :c:struct:`!struct ptdesc` associated with the physical PTE page. The lock618618+must be released via :c:func:`!pte_unmap_unlock`.619619+620620+.. note:: There are some variants on this, such as621621+ :c:func:`!pte_offset_map_rw_nolock` when we know we hold the PTE stable but622622+ for brevity we do not explore this. 
See the comment for623623+ :c:func:`!__pte_offset_map_lock` for more details.624624+625625+When modifying data in ranges, we typically only wish to allocate higher page626626+tables as necessary, using these locks to avoid races or overwriting anything,627627+and set/clear data at the PTE level as required (for instance when page faulting628628+or zapping).629629+630630+A typical pattern taken when traversing page table entries to install a new631631+mapping is to optimistically determine whether the page table entry in the table632632+above is empty and, only if so, to acquire the page table lock, checking633633+again to see whether it was allocated underneath us.634634+635635+This allows for a traversal with page table locks only being taken when636636+required. An example of this is :c:func:`!__pud_alloc`.637637+638638+At the leaf page table, that is the PTE, we can't entirely rely on this pattern639639+as we have separate PMD and PTE locks, and a THP collapse, for instance, might have640640+eliminated the PMD entry as well as the PTE from under us.641641+642642+This is why :c:func:`!__pte_offset_map_lock` locklessly retrieves the PMD entry643643+for the PTE, carefully checking it is as expected, before acquiring the644644+PTE-specific lock, and then *again* checking that the PMD entry is as expected.645645+646646+If a THP collapse (or similar) were to occur, then the lock on both pages would647647+be acquired, so we can ensure this is prevented while the PTE lock is held.648648+649649+Installing entries this way ensures mutual exclusion on write.650650+651651+Page table freeing652652+^^^^^^^^^^^^^^^^^^653653+654654+Tearing down page tables themselves is something that requires significant655655+care. 
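The optimistic "check, then lock, then re-check" allocation pattern described above can be modelled in a few lines. A single-threaded sketch with hypothetical names (the real code takes :c:member:`!mm->page_table_lock` where indicated):

```c
#include <stdlib.h>

/* Single-threaded model of the optimistic allocation pattern used by
 * __pud_alloc() and friends. The lock itself is elided; lock_taken just
 * counts how often the slow path (which would hold mm->page_table_lock)
 * actually runs. */
static void *entry;	/* stands in for a higher-level table entry */
static int lock_taken;

static void *alloc_if_empty(void)
{
	if (entry)		/* optimistic check, no lock taken */
		return entry;

	lock_taken++;		/* ... take page_table_lock here ... */
	if (!entry)		/* re-check: did someone beat us to it? */
		entry = malloc(64);
	/* ... release page_table_lock here ... */
	return entry;
}
```

Note how a second caller finding the entry populated never reaches the locked slow path at all.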
There must be no way that page tables designated for removal can be656656+traversed or referenced by concurrent tasks.657657+658658+It is insufficient to simply hold an mmap write lock and VMA lock (which will659659+prevent racing faults and rmap operations), as a file-backed mapping can be660660+truncated under the :c:struct:`!struct address_space->i_mmap_rwsem` alone.661661+662662+As a result, no VMA which can be accessed via the reverse mapping (either663663+through the :c:struct:`!struct anon_vma->rb_root` or the :c:member:`!struct664664+address_space->i_mmap` interval trees) can have its page tables torn down.665665+666666+The operation is typically performed via :c:func:`!free_pgtables`, which assumes667667+either the mmap write lock has been taken (as specified by its668668+:c:member:`!mm_wr_locked` parameter), or that the VMA is already unreachable.669669+670670+It carefully removes the VMA from all reverse mappings; however, it is important671671+that no new mappings overlap these and that no route remains permitting access to672672+addresses within the range whose page tables are being torn down.673673+674674+Additionally, it assumes that a zap has already been performed and steps have675675+been taken to ensure that no further page table entries can be installed between676676+the zap and the invocation of :c:func:`!free_pgtables`.677677+678678+Since it is assumed that all such steps have been taken, page table entries are679679+cleared without page table locks (in the :c:func:`!pgd_clear`, :c:func:`!p4d_clear`,680680+:c:func:`!pud_clear`, and :c:func:`!pmd_clear` functions).681681+682682+.. 
note:: It is possible for leaf page tables to be torn down independent of683683+ the page tables above it as is done by684684+ :c:func:`!retract_page_tables`, which is performed under the i_mmap685685+ read lock, PMD, and PTE page table locks, without this level of care.686686+687687+Page table moving688688+^^^^^^^^^^^^^^^^^689689+690690+Some functions manipulate page table levels above PMD (that is PUD, P4D and PGD691691+page tables). Most notable of these is :c:func:`!mremap`, which is capable of692692+moving higher level page tables.693693+694694+In these instances, it is required that **all** locks are taken, that is695695+the mmap lock, the VMA lock and the relevant rmap locks.696696+697697+You can observe this in the :c:func:`!mremap` implementation in the functions698698+:c:func:`!take_rmap_locks` and :c:func:`!drop_rmap_locks` which perform the rmap699699+side of lock acquisition, invoked ultimately by :c:func:`!move_page_tables`.700700+701701+VMA lock internals702702+------------------703703+704704+Overview705705+^^^^^^^^706706+707707+VMA read locking is entirely optimistic - if the lock is contended or a competing708708+write has started, then we do not obtain a read lock.709709+710710+A VMA **read** lock is obtained by :c:func:`!lock_vma_under_rcu`, which first711711+calls :c:func:`!rcu_read_lock` to ensure that the VMA is looked up in an RCU712712+critical section, then attempts to VMA lock it via :c:func:`!vma_start_read`,713713+before releasing the RCU lock via :c:func:`!rcu_read_unlock`.714714+715715+VMA read locks hold the read lock on the :c:member:`!vma->vm_lock` semaphore for716716+their duration and the caller of :c:func:`!lock_vma_under_rcu` must release it717717+via :c:func:`!vma_end_read`.718718+719719+VMA **write** locks are acquired via :c:func:`!vma_start_write` in instances where a720720+VMA is about to be modified, unlike :c:func:`!vma_start_read` the lock is always721721+acquired. 
An mmap write lock **must** be held for the duration of the VMA write722722+lock, releasing or downgrading the mmap write lock also releases the VMA write723723+lock so there is no :c:func:`!vma_end_write` function.724724+725725+Note that a semaphore write lock is not held across a VMA lock. Rather, a726726+sequence number is used for serialisation, and the write semaphore is only727727+acquired at the point of write lock to update this.728728+729729+This ensures the semantics we require - VMA write locks provide exclusive write730730+access to the VMA.731731+732732+Implementation details733733+^^^^^^^^^^^^^^^^^^^^^^734734+735735+The VMA lock mechanism is designed to be a lightweight means of avoiding the use736736+of the heavily contended mmap lock. It is implemented using a combination of a737737+read/write semaphore and sequence numbers belonging to the containing738738+:c:struct:`!struct mm_struct` and the VMA.739739+740740+Read locks are acquired via :c:func:`!vma_start_read`, which is an optimistic741741+operation, i.e. it tries to acquire a read lock but returns false if it is742742+unable to do so. At the end of the read operation, :c:func:`!vma_end_read` is743743+called to release the VMA read lock.744744+745745+Invoking :c:func:`!vma_start_read` requires that :c:func:`!rcu_read_lock` has746746+been called first, establishing that we are in an RCU critical section upon VMA747747+read lock acquisition. Once acquired, the RCU lock can be released as it is only748748+required for lookup. 
This is abstracted by :c:func:`!lock_vma_under_rcu` which749749+is the interface a user should use.750750+751751+Writing requires the mmap to be write-locked and the VMA lock to be acquired via752752+:c:func:`!vma_start_write`, however the write lock is released by the termination or753753+downgrade of the mmap write lock so no :c:func:`!vma_end_write` is required.754754+755755+All this is achieved by the use of per-mm and per-VMA sequence counts, which are756756+used in order to reduce complexity, especially for operations which write-lock757757+multiple VMAs at once.758758+759759+If the mm sequence count, :c:member:`!mm->mm_lock_seq` is equal to the VMA760760+sequence count :c:member:`!vma->vm_lock_seq` then the VMA is write-locked. If761761+they differ, then it is not.762762+763763+Each time the mmap write lock is released in :c:func:`!mmap_write_unlock` or764764+:c:func:`!mmap_write_downgrade`, :c:func:`!vma_end_write_all` is invoked which765765+also increments :c:member:`!mm->mm_lock_seq` via766766+:c:func:`!mm_lock_seqcount_end`.767767+768768+This way, we ensure that, regardless of the VMA's sequence number, a write lock769769+is never incorrectly indicated and that when we release an mmap write lock we770770+efficiently release **all** VMA write locks contained within the mmap at the771771+same time.772772+773773+Since the mmap write lock is exclusive against others who hold it, the automatic774774+release of any VMA locks on its release makes sense, as you would never want to775775+keep VMAs locked across entirely separate write operations. It also maintains776776+correct lock ordering.777777+778778+Each time a VMA read lock is acquired, we acquire a read lock on the779779+:c:member:`!vma->vm_lock` read/write semaphore and hold it, while checking that780780+the sequence count of the VMA does not match that of the mm.781781+782782+If it does, the read lock fails. 
If it does not, we hold the lock, excluding783783+writers, but permitting other readers, who will also obtain this lock under RCU.784784+785785+Importantly, maple tree operations performed in :c:func:`!lock_vma_under_rcu`786786+are also RCU safe, so the whole read lock operation is guaranteed to function787787+correctly.788788+789789+On the write side, we acquire a write lock on the :c:member:`!vma->vm_lock`790790+read/write semaphore, before setting the VMA's sequence number under this lock,791791+also simultaneously holding the mmap write lock.792792+793793+This way, if any read locks are in effect, :c:func:`!vma_start_write` will sleep794794+until these are finished and mutual exclusion is achieved.795795+796796+After setting the VMA's sequence number, the lock is released, avoiding797797+complexity with a long-term held write lock.798798+799799+This clever combination of a read/write semaphore and sequence count allows for800800+fast RCU-based per-VMA lock acquisition (especially on page fault, though801801+utilised elsewhere) with minimal complexity around lock ordering.802802+803803+mmap write lock downgrading804804+---------------------------805805+806806+When an mmap write lock is held one has exclusive access to resources within the807807+mmap (with the usual caveats about requiring VMA write locks to avoid races with808808+tasks holding VMA read locks).809809+810810+It is then possible to **downgrade** from a write lock to a read lock via811811+:c:func:`!mmap_write_downgrade` which, similar to :c:func:`!mmap_write_unlock`,812812+implicitly terminates all VMA write locks via :c:func:`!vma_end_write_all`, but813813+importantly does not relinquish the mmap lock while downgrading, therefore814814+keeping the locked virtual address space stable.815815+816816+An interesting consequence of this is that downgraded locks are exclusive817817+against any other task possessing a downgraded lock (since a racing task would818818+have to acquire a write lock first to 
downgrade it, and the downgraded lock819819+prevents a new write lock from being obtained until the original lock is820820+released).821821+822822+For clarity, we map read (R)/downgraded write (D)/write (W) locks against one823823+another, showing which locks exclude the others:824824+825825+.. list-table:: Lock exclusivity826826+ :widths: 5 5 5 5827827+ :header-rows: 1828828+ :stub-columns: 1829829+830830+ * -831831+ - R832832+ - D833833+ - W834834+ * - R835835+ - N836836+ - N837837+ - Y838838+ * - D839839+ - N840840+ - Y841841+ - Y842842+ * - W843843+ - Y844844+ - Y845845+ - Y846846+847847+Here a Y indicates the locks in the matching row/column are mutually exclusive,848848+and N indicates that they are not.849849+850850+Stack expansion851851+---------------852852+853853+Stack expansion throws up additional complexities in that we cannot permit there854854+to be racing page faults; as a result, we invoke :c:func:`!vma_start_write` to855855+prevent this in :c:func:`!expand_downwards` or :c:func:`!expand_upwards`.
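The sequence-count half of the per-VMA lock scheme described in the VMA lock internals section can be modelled as follows. This is a hedged miniature (hypothetical types; the real implementation also takes the :c:member:`!vma->vm_lock` semaphore and uses proper atomics and memory barriers):

```c
#include <stdbool.h>

/* Minimal single-threaded model of the per-VMA lock sequence counts. */
struct mm_model  { unsigned long mm_lock_seq; };
struct vma_model { unsigned long vm_lock_seq; };

/* vma_start_write(): with the mmap write lock held, copying the mm
 * sequence count into the VMA marks it write-locked. */
static void model_vma_start_write(struct mm_model *mm, struct vma_model *vma)
{
	vma->vm_lock_seq = mm->mm_lock_seq;
}

/* vma_start_read(): optimistic - refuse the read lock if the sequence
 * counts match, i.e. the VMA is currently write-locked. */
static bool model_vma_start_read(struct mm_model *mm, struct vma_model *vma)
{
	return vma->vm_lock_seq != mm->mm_lock_seq;
}

/* vma_end_write_all(): releasing or downgrading the mmap write lock
 * bumps the mm count, releasing every VMA write lock in one step. */
static void model_vma_end_write_all(struct mm_model *mm)
{
	mm->mm_lock_seq++;
}
```

A single increment of :c:member:`!mm->mm_lock_seq` is what lets one unlock release all VMA write locks at once.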
+3-3
MAINTAINERS
···73477347DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS73487348M: Karol Herbst <kherbst@redhat.com>73497349M: Lyude Paul <lyude@redhat.com>73507350-M: Danilo Krummrich <dakr@redhat.com>73507350+M: Danilo Krummrich <dakr@kernel.org>73517351L: dri-devel@lists.freedesktop.org73527352L: nouveau@lists.freedesktop.org73537353S: Supported···84538453EROFS FILE SYSTEM84548454M: Gao Xiang <xiang@kernel.org>84558455M: Chao Yu <chao@kernel.org>84568456-R: Yue Hu <huyue2@coolpad.com>84568456+R: Yue Hu <zbestahu@gmail.com>84578457R: Jeffle Xu <jefflexu@linux.alibaba.com>84588458R: Sandeep Dhavale <dhavale@google.com>84598459L: linux-erofs@lists.ozlabs.org···89248924FIRMWARE LOADER (request_firmware)89258925M: Luis Chamberlain <mcgrof@kernel.org>89268926M: Russ Weight <russ.weight@linux.dev>89278927-M: Danilo Krummrich <dakr@redhat.com>89278927+M: Danilo Krummrich <dakr@kernel.org>89288928L: linux-kernel@vger.kernel.org89298929S: Maintained89308930F: Documentation/firmware_class/
···3636#include <asm/traps.h>3737#include <asm/vdso.h>38383939-#ifdef CONFIG_ARM64_GCS4039#define GCS_SIGNAL_CAP(addr) (((unsigned long)addr) & GCS_CAP_ADDR_MASK)4141-4242-static bool gcs_signal_cap_valid(u64 addr, u64 val)4343-{4444- return val == GCS_SIGNAL_CAP(addr);4545-}4646-#endif47404841/*4942 * Do a signal return; undo the signal stack. These are aligned to 128-bit.···10551062#ifdef CONFIG_ARM64_GCS10561063static int gcs_restore_signal(void)10571064{10581058- unsigned long __user *gcspr_el0;10591059- u64 cap;10651065+ u64 gcspr_el0, cap;10601066 int ret;1061106710621068 if (!system_supports_gcs())···10641072 if (!(current->thread.gcs_el0_mode & PR_SHADOW_STACK_ENABLE))10651073 return 0;1066107410671067- gcspr_el0 = (unsigned long __user *)read_sysreg_s(SYS_GCSPR_EL0);10751075+ gcspr_el0 = read_sysreg_s(SYS_GCSPR_EL0);1068107610691077 /*10701078 * Ensure that any changes to the GCS done via GCS operations···10791087 * then faults will be generated on GCS operations - the main10801088 * concern is to protect GCS pages.10811089 */10821082- ret = copy_from_user(&cap, gcspr_el0, sizeof(cap));10901090+ ret = copy_from_user(&cap, (unsigned long __user *)gcspr_el0,10911091+ sizeof(cap));10831092 if (ret)10841093 return -EFAULT;1085109410861095 /*10871096 * Check that the cap is the actual GCS before replacing it.10881097 */10891089- if (!gcs_signal_cap_valid((u64)gcspr_el0, cap))10981098+ if (cap != GCS_SIGNAL_CAP(gcspr_el0))10901099 return -EINVAL;1091110010921101 /* Invalidate the token to prevent reuse */10931093- put_user_gcs(0, (__user void*)gcspr_el0, &ret);11021102+ put_user_gcs(0, (unsigned long __user *)gcspr_el0, &ret);10941103 if (ret != 0)10951104 return -EFAULT;1096110510971097- write_sysreg_s(gcspr_el0 + 1, SYS_GCSPR_EL0);11061106+ write_sysreg_s(gcspr_el0 + 8, SYS_GCSPR_EL0);1098110710991108 return 0;11001109}···1414142114151422static int gcs_signal_entry(__sigrestore_t sigtramp, struct ksignal *ksig)14161423{14171417- unsigned long __user 
*gcspr_el0;14241424+ u64 gcspr_el0;14181425 int ret = 0;1419142614201427 if (!system_supports_gcs())···14271434 * We are entering a signal handler, current register state is14281435 * active.14291436 */14301430- gcspr_el0 = (unsigned long __user *)read_sysreg_s(SYS_GCSPR_EL0);14371437+ gcspr_el0 = read_sysreg_s(SYS_GCSPR_EL0);1431143814321439 /*14331440 * Push a cap and the GCS entry for the trampoline onto the GCS.14341441 */14351435- put_user_gcs((unsigned long)sigtramp, gcspr_el0 - 2, &ret);14361436- put_user_gcs(GCS_SIGNAL_CAP(gcspr_el0 - 1), gcspr_el0 - 1, &ret);14421442+ put_user_gcs((unsigned long)sigtramp,14431443+ (unsigned long __user *)(gcspr_el0 - 16), &ret);14441444+ put_user_gcs(GCS_SIGNAL_CAP(gcspr_el0 - 8),14451445+ (unsigned long __user *)(gcspr_el0 - 8), &ret);14371446 if (ret != 0)14381447 return ret;1439144814401440- gcspr_el0 -= 2;14411441- write_sysreg_s((unsigned long)gcspr_el0, SYS_GCSPR_EL0);14491449+ gcspr_el0 -= 16;14501450+ write_sysreg_s(gcspr_el0, SYS_GCSPR_EL0);1442145114431452 return 0;14441453}
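The hunk above switches gcspr_el0 from `unsigned long __user *` pointer arithmetic (+1, -2) to plain u64 byte arithmetic (+8, -16), since each guarded-control-stack entry is 8 bytes. A sketch of the resulting invariants (the low-bit mask is illustrative only, not the architectural GCS_CAP_ADDR_MASK):

```c
#include <stdint.h>

typedef uint64_t u64;

/* Illustrative cap mask: the token stores the slot address with its
 * low bits cleared (assumed 12 bits here). */
#define CAP_ADDR_MASK	(~(u64)0xfff)
#define SIGNAL_CAP(addr)	(((u64)(addr)) & CAP_ADDR_MASK)

/* Consume one 8-byte cap entry, as gcs_restore_signal() now does
 * with "gcspr_el0 + 8" instead of pointer "+ 1". */
static u64 gcs_pop_cap(u64 gcspr_el0)
{
	return gcspr_el0 + 8;
}

/* The now open-coded validity test: the token must equal the masked
 * address of its own slot. */
static int gcs_cap_valid(u64 addr, u64 cap)
{
	return cap == SIGNAL_CAP(addr);
}
```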
···208208CONFIG_FB_ATY_CT=y209209CONFIG_FB_ATY_GX=y210210CONFIG_FB_3DFX=y211211+CONFIG_BACKLIGHT_CLASS_DEVICE=y211212# CONFIG_VGA_CONSOLE is not set212213CONFIG_FRAMEBUFFER_CONSOLE=y213214CONFIG_LOGO=y
···88#include <asm/special_insns.h>991010#ifdef CONFIG_X86_321111-static inline void iret_to_self(void)1111+static __always_inline void iret_to_self(void)1212{1313 asm volatile (1414 "pushfl\n\t"···1919 : ASM_CALL_CONSTRAINT : : "memory");2020}2121#else2222-static inline void iret_to_self(void)2222+static __always_inline void iret_to_self(void)2323{2424 unsigned int tmp;2525···5555 * Like all of Linux's memory ordering operations, this is a5656 * compiler barrier as well.5757 */5858-static inline void sync_core(void)5858+static __always_inline void sync_core(void)5959{6060 /*6161 * The SERIALIZE instruction is the most straightforward way to
···143143 dest < (void*)relocate_kernel + KEXEC_CONTROL_CODE_MAX_SIZE)144144 return true;145145#endif146146-#ifdef CONFIG_XEN147147- if (dest >= (void *)hypercall_page &&148148- dest < (void*)hypercall_page + PAGE_SIZE)149149- return true;150150-#endif151146 return false;152147}153148
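The surviving test in the hunk above is a half-open interval check: a relocation destination overlaps a protected region iff it lies in [base, base + size). As a standalone sketch (hypothetical helper name):

```c
/* Half-open range test, as used for the relocate_kernel control-code
 * region above: true iff dest is within [base, base + size). */
static int dest_in_range(const char *dest, const char *base,
			 unsigned long size)
{
	return dest >= base && dest < base + size;
}
```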
+22-16
arch/x86/kernel/cpu/common.c
···867867 tlb_lld_4m[ENTRIES], tlb_lld_1g[ENTRIES]);868868}869869870870-static void get_cpu_vendor(struct cpuinfo_x86 *c)870870+void get_cpu_vendor(struct cpuinfo_x86 *c)871871{872872 char *v = c->x86_vendor_id;873873 int i;···16491649 detect_nopl();16501650}1651165116521652-void __init early_cpu_init(void)16521652+void __init init_cpu_devs(void)16531653{16541654 const struct cpu_dev *const *cdev;16551655 int count = 0;16561656-16571657-#ifdef CONFIG_PROCESSOR_SELECT16581658- pr_info("KERNEL supported cpus:\n");16591659-#endif1660165616611657 for (cdev = __x86_cpu_dev_start; cdev < __x86_cpu_dev_end; cdev++) {16621658 const struct cpu_dev *cpudev = *cdev;···16611665 break;16621666 cpu_devs[count] = cpudev;16631667 count++;16681668+ }16691669+}16701670+16711671+void __init early_cpu_init(void)16721672+{16731673+#ifdef CONFIG_PROCESSOR_SELECT16741674+ unsigned int i, j;16751675+16761676+ pr_info("KERNEL supported cpus:\n");16771677+#endif16781678+16791679+ init_cpu_devs();1664168016651681#ifdef CONFIG_PROCESSOR_SELECT16661666- {16671667- unsigned int j;16681668-16691669- for (j = 0; j < 2; j++) {16701670- if (!cpudev->c_ident[j])16711671- continue;16721672- pr_info(" %s %s\n", cpudev->c_vendor,16731673- cpudev->c_ident[j]);16741674- }16821682+ for (i = 0; i < X86_VENDOR_NUM && cpu_devs[i]; i++) {16831683+ for (j = 0; j < 2; j++) {16841684+ if (!cpu_devs[i]->c_ident[j])16851685+ continue;16861686+ pr_info(" %s %s\n", cpu_devs[i]->c_vendor,16871687+ cpu_devs[i]->c_ident[j]);16751688 }16761676-#endif16771689 }16901690+#endif16911691+16781692 early_identify_cpu(&boot_cpu_data);16791693}16801694
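The refactor above factors the sentinel-terminated table fill out of early_cpu_init() into init_cpu_devs() so other early-boot code (the Xen path later in this series) can reuse it, while the CONFIG_PROCESSOR_SELECT report now walks the filled table. A miniature hypothetical model of that split (not the real cpu_devs[]):

```c
#define VENDOR_NUM 4

/* NULL-terminated source table, standing in for __x86_cpu_dev_start. */
static const char *available[] = { "VendorA", "VendorB", 0 };
static const char *cpu_devs[VENDOR_NUM];

/* Analogue of init_cpu_devs(): fill the fixed-size table. */
static void init_cpu_devs_sketch(void)
{
	int count = 0;

	for (int i = 0; available[i]; i++) {
		if (count >= VENDOR_NUM)
			break;
		cpu_devs[count++] = available[i];
	}
}

/* The report loop stops at the first empty slot, mirroring the new
 * "i < X86_VENDOR_NUM && cpu_devs[i]" condition. */
static int count_cpu_devs(void)
{
	int i;

	for (i = 0; i < VENDOR_NUM && cpu_devs[i]; i++)
		;
	return i;
}
```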
+58
arch/x86/kernel/cpu/mshyperv.c
···223223 hyperv_cleanup();224224}225225#endif /* CONFIG_CRASH_DUMP */226226+227227+static u64 hv_ref_counter_at_suspend;228228+static void (*old_save_sched_clock_state)(void);229229+static void (*old_restore_sched_clock_state)(void);230230+231231+/*232232+ * Hyper-V clock counter resets during hibernation. Save and restore clock233233+ * offset during suspend/resume, while also accounting for the time that234234+ * passed before suspend. This makes sure that sched_clock, using the hv tsc235235+ * page based clocksource, proceeds from where it left off during suspend and236236+ * that kernel message timestamps show the correct time after resume.237237+ */238238+static void save_hv_clock_tsc_state(void)239239+{240240+ hv_ref_counter_at_suspend = hv_read_reference_counter();241241+}242242+243243+static void restore_hv_clock_tsc_state(void)244244+{245245+ /*246246+ * Adjust the offsets used by hv tsc clocksource to247247+ * account for the time spent before hibernation.248248+ * adjusted value = reference counter (time) at suspend249249+ * - reference counter (time) now.250250+ */251251+ hv_adj_sched_clock_offset(hv_ref_counter_at_suspend - hv_read_reference_counter());252252+}253253+254254+/*255255+ * Functions to override save_sched_clock_state and restore_sched_clock_state256256+ * functions of x86_platform. 
The Hyper-V clock counter is reset during257257+ * suspend-resume and the offset used to measure time needs to be258258+ * corrected, post resume.259259+ */260260+static void hv_save_sched_clock_state(void)261261+{262262+ old_save_sched_clock_state();263263+ save_hv_clock_tsc_state();264264+}265265+266266+static void hv_restore_sched_clock_state(void)267267+{268268+ restore_hv_clock_tsc_state();269269+ old_restore_sched_clock_state();270270+}271271+272272+static void __init x86_setup_ops_for_tsc_pg_clock(void)273273+{274274+ if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))275275+ return;276276+277277+ old_save_sched_clock_state = x86_platform.save_sched_clock_state;278278+ x86_platform.save_sched_clock_state = hv_save_sched_clock_state;279279+280280+ old_restore_sched_clock_state = x86_platform.restore_sched_clock_state;281281+ x86_platform.restore_sched_clock_state = hv_restore_sched_clock_state;282282+}226283#endif /* CONFIG_HYPERV */227284228285static uint32_t __init ms_hyperv_platform(void)···636579637580 /* Register Hyper-V specific clocksource */638581 hv_init_clocksource();582582+ x86_setup_ops_for_tsc_pg_clock();639583 hv_vtl_init_platform();640584#endif641585 /*
+9
arch/x86/kernel/static_call.c
···172172}173173EXPORT_SYMBOL_GPL(arch_static_call_transform);174174175175+noinstr void __static_call_update_early(void *tramp, void *func)176176+{177177+ BUG_ON(system_state != SYSTEM_BOOTING);178178+ BUG_ON(!early_boot_irqs_disabled);179179+ BUG_ON(static_call_initialized);180180+ __text_gen_insn(tramp, JMP32_INSN_OPCODE, tramp, func, JMP32_INSN_SIZE);181181+ sync_core();182182+}183183+175184#ifdef CONFIG_MITIGATION_RETHUNK176185/*177186 * This is called by apply_returns() to fix up static call trampolines,
-4
arch/x86/kernel/vmlinux.lds.S
···519519 * linker will never mark as relocatable. (Using just ABSOLUTE() is not520520 * sufficient for that).521521 */522522-#ifdef CONFIG_XEN523522#ifdef CONFIG_XEN_PV524523xen_elfnote_entry_value =525524 ABSOLUTE(xen_elfnote_entry) + ABSOLUTE(startup_xen);526526-#endif527527-xen_elfnote_hypercall_page_value =528528- ABSOLUTE(xen_elfnote_hypercall_page) + ABSOLUTE(hypercall_page);529525#endif530526#ifdef CONFIG_PVH531527xen_elfnote_phys32_entry_value =
-12
arch/x86/kvm/mmu/mmu.c
···33643364 return true;33653365}3366336633673367-static bool is_access_allowed(struct kvm_page_fault *fault, u64 spte)33683368-{33693369- if (fault->exec)33703370- return is_executable_pte(spte);33713371-33723372- if (fault->write)33733373- return is_writable_pte(spte);33743374-33753375- /* Fault was on Read access */33763376- return spte & PT_PRESENT_MASK;33773377-}33783378-33793367/*33803368 * Returns the last level spte pointer of the shadow page walk for the given33813369 * gpa, and sets *spte to the spte value. This spte may be non-preset. If no
+17
arch/x86/kvm/mmu/spte.h
···462462}463463464464/*465465+ * Returns true if the access indicated by @fault is allowed by the existing466466+ * SPTE protections. Note, the caller is responsible for checking that the467467+ * SPTE is a shadow-present, leaf SPTE (either before or after).468468+ */469469+static inline bool is_access_allowed(struct kvm_page_fault *fault, u64 spte)470470+{471471+ if (fault->exec)472472+ return is_executable_pte(spte);473473+474474+ if (fault->write)475475+ return is_writable_pte(spte);476476+477477+ /* Fault was on Read access */478478+ return spte & PT_PRESENT_MASK;479479+}480480+481481+/*465482 * If the MMU-writable flag is cleared, i.e. the SPTE is write-protected for466483 * write-tracking, remote TLBs must be flushed, even if the SPTE was read-only,467484 * as KVM allows stale Writable TLB entries to exist. When dirty logging, KVM
+5
arch/x86/kvm/mmu/tdp_mmu.c
···985985 if (fault->prefetch && is_shadow_present_pte(iter->old_spte))986986 return RET_PF_SPURIOUS;987987988988+ if (is_shadow_present_pte(iter->old_spte) &&989989+ is_access_allowed(fault, iter->old_spte) &&990990+ is_last_spte(iter->old_spte, iter->level))991991+ return RET_PF_SPURIOUS;992992+988993 if (unlikely(!fault->slot))989994 new_spte = make_mmio_spte(vcpu, iter->gfn, ACC_ALL);990995 else
+6
arch/x86/kvm/svm/avic.c
···11991199 return false;12001200 }1201120112021202+ if (cc_platform_has(CC_ATTR_HOST_SEV_SNP) &&12031203+ !boot_cpu_has(X86_FEATURE_HV_INUSE_WR_ALLOWED)) {12041204+ pr_warn("AVIC disabled: missing HvInUseWrAllowed on SNP-enabled system\n");12051205+ return false;12061206+ }12071207+12021208 if (boot_cpu_has(X86_FEATURE_AVIC)) {12031209 pr_info("AVIC enabled\n");12041210 } else if (force_avic) {
-9
arch/x86/kvm/svm/svm.c
···32013201 if (data & ~supported_de_cfg)32023202 return 1;3203320332043204- /*32053205- * Don't let the guest change the host-programmed value. The32063206- * MSR is very model specific, i.e. contains multiple bits that32073207- * are completely unknown to KVM, and the one bit known to KVM32083208- * is simply a reflection of hardware capabilities.32093209- */32103210- if (!msr->host_initiated && data != svm->msr_decfg)32113211- return 1;32123212-32133204 svm->msr_decfg = data;32143205 break;32153206 }
···99769976{99779977 u64 ret = vcpu->run->hypercall.ret;9978997899799979- if (!is_64_bit_mode(vcpu))99799979+ if (!is_64_bit_hypercall(vcpu))99809980 ret = (u32)ret;99819981 kvm_rax_write(vcpu, ret);99829982 ++vcpu->stat.hypercalls;···1272312723 kvm_apicv_init(kvm);1272412724 kvm_hv_init_vm(kvm);1272512725 kvm_xen_init_vm(kvm);1272612726+1272712727+ if (ignore_msrs && !report_ignored_msrs) {1272812728+ pr_warn_once("Running KVM with ignore_msrs=1 and report_ignored_msrs=0 is not a\n"1272912729+ "supported configuration. Lying to the guest about the existence of MSRs\n"1273012730+ "may cause the guest operating system to hang or produce errors. If a guest\n"1273112731+ "does not run without ignore_msrs=1, please report it to kvm@vger.kernel.org.\n");1273212732+ }12726127331272712734 return 0;1272812735
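The one-line predicate change above matters because a guest not running a 64-bit hypercall must see a 32-bit result: the value is truncated before being written back to RAX. A hypothetical standalone rendering of that fixup (not the KVM function itself):

```c
#include <stdint.h>

/* Truncate a hypercall return value for a non-64-bit caller before it
 * is written back to the guest's RAX. */
static uint64_t hypercall_ret_fixup(uint64_t ret, int is_64_bit_hypercall)
{
	if (!is_64_bit_hypercall)
		ret = (uint32_t)ret;
	return ret;
}
```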
+64-1
arch/x86/xen/enlighten.c
···2233#include <linux/console.h>44#include <linux/cpu.h>55+#include <linux/instrumentation.h>56#include <linux/kexec.h>67#include <linux/memblock.h>78#include <linux/slab.h>···22212322#include "xen-ops.h"24232525-EXPORT_SYMBOL_GPL(hypercall_page);2424+DEFINE_STATIC_CALL(xen_hypercall, xen_hypercall_hvm);2525+EXPORT_STATIC_CALL_TRAMP(xen_hypercall);26262727/*2828 * Pointer to the xen_vcpu_info structure or···6967 * page as soon as fixmap is up and running.7068 */7169struct shared_info *HYPERVISOR_shared_info = &xen_dummy_shared_info;7070+7171+static __ref void xen_get_vendor(void)7272+{7373+ init_cpu_devs();7474+ cpu_detect(&boot_cpu_data);7575+ get_cpu_vendor(&boot_cpu_data);7676+}7777+7878+void xen_hypercall_setfunc(void)7979+{8080+ if (static_call_query(xen_hypercall) != xen_hypercall_hvm)8181+ return;8282+8383+ if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||8484+ boot_cpu_data.x86_vendor == X86_VENDOR_HYGON))8585+ static_call_update(xen_hypercall, xen_hypercall_amd);8686+ else8787+ static_call_update(xen_hypercall, xen_hypercall_intel);8888+}8989+9090+/*9191+ * Evaluate processor vendor in order to select the correct hypercall9292+ * function for HVM/PVH guests.9393+ * Might be called very early in boot before vendor has been set by9494+ * early_cpu_init().9595+ */9696+noinstr void *__xen_hypercall_setfunc(void)9797+{9898+ void (*func)(void);9999+100100+ /*101101+ * Xen is supported only on CPUs with CPUID, so testing for102102+ * X86_FEATURE_CPUID is a test for early_cpu_init() having been103103+ * run.104104+ *105105+ * Note that __xen_hypercall_setfunc() is noinstr only due to a nasty106106+ * dependency chain: it is being called via the xen_hypercall static107107+ * call when running as a PVH or HVM guest. Hypercalls need to be108108+ * noinstr due to PV guests using hypercalls in noinstr code. 
So we109109+ * can safely tag the function body as "instrumentation ok", since110110+ * the PV guest requirement is not of interest here (xen_get_vendor()111111+ * calls noinstr functions, and static_call_update_early() might do112112+ * so, too).113113+ */114114+ instrumentation_begin();115115+116116+ if (!boot_cpu_has(X86_FEATURE_CPUID))117117+ xen_get_vendor();118118+119119+ if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||120120+ boot_cpu_data.x86_vendor == X86_VENDOR_HYGON))121121+ func = xen_hypercall_amd;122122+ else123123+ func = xen_hypercall_intel;124124+125125+ static_call_update_early(xen_hypercall, func);126126+127127+ instrumentation_end();128128+129129+ return func;130130+}7213173132static int xen_cpu_up_online(unsigned int cpu)74133{
+5-8
arch/x86/xen/enlighten_hvm.c
···106106 /* PVH set up hypercall page in xen_prepare_pvh(). */107107 if (xen_pvh_domain())108108 pv_info.name = "Xen PVH";109109- else {110110- u64 pfn;111111- uint32_t msr;112112-109109+ else113110 pv_info.name = "Xen HVM";114114- msr = cpuid_ebx(base + 2);115115- pfn = __pa(hypercall_page);116116- wrmsr_safe(msr, (u32)pfn, (u32)(pfn >> 32));117117- }118111119112 xen_setup_features();120113···292299293300 if (xen_pv_domain())294301 return 0;302302+303303+ /* Set correct hypercall function. */304304+ if (xen_domain)305305+ xen_hypercall_setfunc();295306296307 if (xen_pvh_domain() && nopv) {297308 /* Guest booting via the Xen-PVH boot entry goes here */
+3-1
arch/x86/xen/enlighten_pv.c
···1341134113421342 xen_domain_type = XEN_PV_DOMAIN;13431343 xen_start_flags = xen_start_info->flags;13441344+ /* Interrupts are guaranteed to be off initially. */13451345+ early_boot_irqs_disabled = true;13461346+ static_call_update_early(xen_hypercall, xen_hypercall_pv);1344134713451348 xen_setup_features();13461349···14341431 WARN_ON(xen_cpuhp_setup(xen_cpu_up_prepare_pv, xen_cpu_dead_pv));1435143214361433 local_irq_disable();14371437- early_boot_irqs_disabled = true;1438143414391435 xen_raw_console_write("mapping kernel into physical memory\n");14401436 xen_setup_kernel_pagetable((pgd_t *)xen_start_info->pt_base,
···155155 struct inode *inode = file->f_mapping->host;156156 struct block_device *bdev = I_BDEV(inode);157157158158- /* Size must be a power of two, and between 512 and PAGE_SIZE */159159- if (size > PAGE_SIZE || size < 512 || !is_power_of_2(size))158158+ if (blk_validate_block_size(size))160159 return -EINVAL;161160162161 /* Size cannot be smaller than the size supported by the device */
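The hunk above replaces the open-coded size test with blk_validate_block_size(). What that check enforces, per the removed comment, is a power of two between 512 and the page size; a sketch assuming a 4096-byte page (hypothetical helper name):

```c
/* Validate a logical block size: must be a power of two in
 * [512, PAGE_SIZE]; PAGE_SIZE is assumed to be 4096 here. */
static int validate_block_size_sketch(unsigned long size)
{
	if (size < 512 || size > 4096)
		return -1;
	if (size & (size - 1))	/* not a power of two */
		return -1;
	return 0;
}
```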
+10-6
block/blk-mq-sysfs.c
···275275 struct blk_mq_hw_ctx *hctx;276276 unsigned long i;277277278278- lockdep_assert_held(&q->sysfs_dir_lock);279279-278278+ mutex_lock(&q->sysfs_dir_lock);280279 if (!q->mq_sysfs_init_done)281281- return;280280+ goto unlock;282281283282 queue_for_each_hw_ctx(q, hctx, i)284283 blk_mq_unregister_hctx(hctx);284284+285285+unlock:286286+ mutex_unlock(&q->sysfs_dir_lock);285287}286288287289int blk_mq_sysfs_register_hctxs(struct request_queue *q)···292290 unsigned long i;293291 int ret = 0;294292295295- lockdep_assert_held(&q->sysfs_dir_lock);296296-293293+ mutex_lock(&q->sysfs_dir_lock);297294 if (!q->mq_sysfs_init_done)298298- return ret;295295+ goto unlock;299296300297 queue_for_each_hw_ctx(q, hctx, i) {301298 ret = blk_mq_register_hctx(hctx);302299 if (ret)303300 break;304301 }302302+303303+unlock:304304+ mutex_unlock(&q->sysfs_dir_lock);305305306306 return ret;307307}
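The conversion above changes the function from asserting that the caller holds sysfs_dir_lock to taking the mutex itself, with every path, including the early "not initialised" exit, leaving through a single unlock label. A toy model of that control flow (flag instead of a real mutex; names hypothetical):

```c
/* Model of the lock-then-goto-unlock pattern adopted above. */
static int dir_lock;	/* 1 while "held" */
static int init_done;
static int registered;

static int register_hctxs_sketch(void)
{
	int ret = 0;

	dir_lock = 1;		/* mutex_lock(&q->sysfs_dir_lock) */
	if (!init_done)
		goto unlock;	/* early exit still unlocks */

	registered = 1;		/* stands in for the per-hctx loop */
unlock:
	dir_lock = 0;		/* mutex_unlock(&q->sysfs_dir_lock) */
	return ret;
}
```

The single exit label guarantees the lock is dropped on both the early-return and success paths.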
+21-19
block/blk-mq.c
···44124412}44134413EXPORT_SYMBOL(blk_mq_alloc_disk_for_queue);4414441444154415+/*44164416+ * Only hctx removed from cpuhp list can be reused44174417+ */44184418+static bool blk_mq_hctx_is_reusable(struct blk_mq_hw_ctx *hctx)44194419+{44204420+ return hlist_unhashed(&hctx->cpuhp_online) &&44214421+ hlist_unhashed(&hctx->cpuhp_dead);44224422+}44234423+44154424static struct blk_mq_hw_ctx *blk_mq_alloc_and_init_hctx(44164425 struct blk_mq_tag_set *set, struct request_queue *q,44174426 int hctx_idx, int node)···44304421 /* reuse dead hctx first */44314422 spin_lock(&q->unused_hctx_lock);44324423 list_for_each_entry(tmp, &q->unused_hctx_list, hctx_list) {44334433- if (tmp->numa_node == node) {44244424+ if (tmp->numa_node == node && blk_mq_hctx_is_reusable(tmp)) {44344425 hctx = tmp;44354426 break;44364427 }···44624453 unsigned long i, j;4463445444644455 /* protect against switching io scheduler */44654465- lockdep_assert_held(&q->sysfs_lock);44664466-44564456+ mutex_lock(&q->sysfs_lock);44674457 for (i = 0; i < set->nr_hw_queues; i++) {44684458 int old_node;44694459 int node = blk_mq_get_hctx_node(set, i);···4495448744964488 xa_for_each_start(&q->hctx_table, j, hctx, j)44974489 blk_mq_exit_hctx(q, set, hctx, j);44904490+ mutex_unlock(&q->sysfs_lock);4498449144994492 /* unregister cpuhp callbacks for exited hctxs */45004493 blk_mq_remove_hw_queues_cpuhp(q);···4527451845284519 xa_init(&q->hctx_table);4529452045304530- mutex_lock(&q->sysfs_lock);45314531-45324521 blk_mq_realloc_hw_ctxs(set, q);45334522 if (!q->nr_hw_queues)45344523 goto err_hctxs;45354535-45364536- mutex_unlock(&q->sysfs_lock);4537452445384525 INIT_WORK(&q->timeout_work, blk_mq_timeout_work);45394526 blk_queue_rq_timeout(q, set->timeout ? set->timeout : 30 * HZ);···45494544 return 0;4550454545514546err_hctxs:45524552- mutex_unlock(&q->sysfs_lock);45534547 blk_mq_release(q);45544548err_exit:45554549 q->mq_ops = NULL;···49294925 return false;4930492649314927 /* q->elevator needs protection from ->sysfs_lock */49324932- lockdep_assert_held(&q->sysfs_lock);49284928+ mutex_lock(&q->sysfs_lock);4933492949344930 /* the check has to be done with holding sysfs_lock */49354931 if (!q->elevator) {49364932 kfree(qe);49374937- goto out;49334933+ goto unlock;49384934 }4939493549404936 INIT_LIST_HEAD(&qe->node);···49444940 __elevator_get(qe->type);49454941 list_add(&qe->node, head);49464942 elevator_disable(q);49474947-out:49434943+unlock:49444944+ mutex_unlock(&q->sysfs_lock);49454945+49484946 return true;49494947}49504948···49754969 list_del(&qe->node);49764970 kfree(qe);4977497149724972+ mutex_lock(&q->sysfs_lock);49784973 elevator_switch(q, t);49794974 /* drop the reference acquired in blk_mq_elv_switch_none */49804975 elevator_put(t);49764976+ mutex_unlock(&q->sysfs_lock);49814977}4982497849834979static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,···49994991 if (set->nr_maps == 1 && nr_hw_queues == set->nr_hw_queues)50004992 return;5001499350025002- list_for_each_entry(q, &set->tag_list, tag_set_list) {50035003- mutex_lock(&q->sysfs_dir_lock);50045004- mutex_lock(&q->sysfs_lock);49944994+ list_for_each_entry(q, &set->tag_list, tag_set_list)50054995 blk_mq_freeze_queue(q);50065006- }50074996 /*50084997 * Switch IO scheduler to 'none', cleaning up the data associated50094998 * with the previous scheduler. We will switch back once we are done···50565051 list_for_each_entry(q, &set->tag_list, tag_set_list)50575052 blk_mq_elv_switch_back(&head, q);5058505350595059- list_for_each_entry(q, &set->tag_list, tag_set_list) {50545054+ list_for_each_entry(q, &set->tag_list, tag_set_list)50605055 blk_mq_unfreeze_queue(q);50615061- mutex_unlock(&q->sysfs_lock);50625062- mutex_unlock(&q->sysfs_dir_lock);50635063- }5064505650655057 /* Free the excess tags when nr_hw_queues shrink. */50665058 for (i = set->nr_hw_queues; i < prev_nr_hw_queues; i++)
+2-2
block/blk-sysfs.c
···706706 if (entry->load_module)707707 entry->load_module(disk, page, length);708708709709- mutex_lock(&q->sysfs_lock);710709 blk_mq_freeze_queue(q);710710+ mutex_lock(&q->sysfs_lock);711711 res = entry->store(disk, page, length);712712- blk_mq_unfreeze_queue(q);713712 mutex_unlock(&q->sysfs_lock);713713+ blk_mq_unfreeze_queue(q);714714 return res;715715}716716
···135135config ACPI_EC136136 bool "Embedded Controller"137137 depends on HAS_IOPORT138138- default X86138138+ default X86 || LOONGARCH139139 help140140 This driver handles communication with the microcontroller141141- on many x86 laptops and other machines.141141+ on many x86/LoongArch laptops and other machines.142142143143config ACPI_EC_DEBUGFS144144 tristate "EC read/write access through /sys/kernel/debug/ec"
+1-1
drivers/auxdisplay/Kconfig
···489489490490config HT16K33491491 tristate "Holtek Ht16K33 LED controller with keyscan"492492- depends on FB && I2C && INPUT492492+ depends on FB && I2C && INPUT && BACKLIGHT_CLASS_DEVICE493493 select FB_SYSMEM_HELPERS494494 select INPUT_MATRIXKMAP495495 select FB_BACKLIGHT
+10-5
drivers/block/zram/zram_drv.c
···614614 }615615616616 nr_pages = i_size_read(inode) >> PAGE_SHIFT;617617+ /* Refuse to use zero sized device (also prevents self reference) */618618+ if (!nr_pages) {619619+ err = -EINVAL;620620+ goto out;621621+ }622622+617623 bitmap_sz = BITS_TO_LONGS(nr_pages) * sizeof(long);618624 bitmap = kvzalloc(bitmap_sz, GFP_KERNEL);619625 if (!bitmap) {···14441438 size_t num_pages = disksize >> PAGE_SHIFT;14451439 size_t index;1446144014411441+ if (!zram->table)14421442+ return;14431443+14471444 /* Free all pages that are still in this zram device */14481445 for (index = 0; index < num_pages; index++)14491446 zram_free_page(zram, index);1450144714511448 zs_destroy_pool(zram->mem_pool);14521449 vfree(zram->table);14501450+ zram->table = NULL;14531451}1454145214551453static bool zram_meta_alloc(struct zram *zram, u64 disksize)···23292319 down_write(&zram->init_lock);2330232023312321 zram->limit_pages = 0;23322332-23332333- if (!init_done(zram)) {23342334- up_write(&zram->init_lock);23352335- return;23362336- }2337232223382323 set_capacity_and_notify(zram->disk, 0);23392324 part_stat_set_all(zram->disk->part0, 0);
+13-1
drivers/clocksource/hyperv_timer.c
···2727#include <asm/mshyperv.h>28282929static struct clock_event_device __percpu *hv_clock_event;3030-static u64 hv_sched_clock_offset __ro_after_init;3030+/* Note: offset can hold negative values after hibernation. */3131+static u64 hv_sched_clock_offset __read_mostly;31323233/*3334 * If false, we're using the old mechanism for stimer0 interrupts···469468 tsc_msr.enable = 1;470469 tsc_msr.pfn = tsc_pfn;471470 hv_set_msr(HV_MSR_REFERENCE_TSC, tsc_msr.as_uint64);471471+}472472+473473+/*474474+ * Called during resume from hibernation, from overridden475475+ * x86_platform.restore_sched_clock_state routine. This is to adjust offsets476476+ * used to calculate time for hv tsc page based sched_clock, to account for477477+ * time spent before hibernation.478478+ */479479+void hv_adj_sched_clock_offset(u64 offset)480480+{481481+ hv_sched_clock_offset -= offset;472482}473483474484#ifdef HAVE_VDSO_CLOCKMODE_HVCLOCK
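The new hv_adj_sched_clock_offset() boils down to modular arithmetic on the u64 offset: the scheduler clock reads as (now - offset), so subtracting the hibernation time from the offset makes the clock jump forward by that amount on resume. A user-space sketch of that arithmetic (function names here are illustrative, not the driver's):

```c
#include <stdint.h>

/*
 * Model of the offset arithmetic: sched_clock reads as (now - offset).
 * Subtracting the time spent in hibernation from the offset makes the
 * clock account for that time after resume.
 */
static uint64_t sched_clock_read(uint64_t now, uint64_t offset)
{
	return now - offset;
}

/* Mirror of the adjustment done on resume from hibernation */
static uint64_t adjusted_offset(uint64_t offset, uint64_t hibernation_time)
{
	return offset - hibernation_time;
}
```

Because the subtraction is done modulo 2^64, an offset that wraps "negative" (as the new comment in the diff notes) still yields the correct reading.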
+26-24
drivers/cpufreq/amd-pstate.c
···374374375375static int msr_init_perf(struct amd_cpudata *cpudata)376376{377377- u64 cap1;377377+ u64 cap1, numerator;378378379379 int ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,380380 &cap1);381381 if (ret)382382 return ret;383383384384- WRITE_ONCE(cpudata->highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));385385- WRITE_ONCE(cpudata->max_limit_perf, AMD_CPPC_HIGHEST_PERF(cap1));384384+ ret = amd_get_boost_ratio_numerator(cpudata->cpu, &numerator);385385+ if (ret)386386+ return ret;387387+388388+ WRITE_ONCE(cpudata->highest_perf, numerator);389389+ WRITE_ONCE(cpudata->max_limit_perf, numerator);386390 WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));387391 WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));388392 WRITE_ONCE(cpudata->lowest_perf, AMD_CPPC_LOWEST_PERF(cap1));···398394static int shmem_init_perf(struct amd_cpudata *cpudata)399395{400396 struct cppc_perf_caps cppc_perf;397397+ u64 numerator;401398402399 int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);403400 if (ret)404401 return ret;405402406406- WRITE_ONCE(cpudata->highest_perf, cppc_perf.highest_perf);407407- WRITE_ONCE(cpudata->max_limit_perf, cppc_perf.highest_perf);403403+ ret = amd_get_boost_ratio_numerator(cpudata->cpu, &numerator);404404+ if (ret)405405+ return ret;406406+407407+ WRITE_ONCE(cpudata->highest_perf, numerator);408408+ WRITE_ONCE(cpudata->max_limit_perf, numerator);408409 WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf);409410 WRITE_ONCE(cpudata->lowest_nonlinear_perf,410411 cppc_perf.lowest_nonlinear_perf);···570561571562static int amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)572563{573573- u32 max_limit_perf, min_limit_perf, lowest_perf, max_perf;564564+ u32 max_limit_perf, min_limit_perf, lowest_perf, max_perf, max_freq;574565 struct amd_cpudata *cpudata = policy->driver_data;575566576576- if (cpudata->boost_supported && !policy->boost_enabled)577577- max_perf = READ_ONCE(cpudata->nominal_perf);578578- else579579- max_perf = READ_ONCE(cpudata->highest_perf);580580-581581- max_limit_perf = div_u64(policy->max * max_perf, policy->cpuinfo.max_freq);582582- min_limit_perf = div_u64(policy->min * max_perf, policy->cpuinfo.max_freq);567567+ max_perf = READ_ONCE(cpudata->highest_perf);568568+ max_freq = READ_ONCE(cpudata->max_freq);569569+ max_limit_perf = div_u64(policy->max * max_perf, max_freq);570570+ min_limit_perf = div_u64(policy->min * max_perf, max_freq);583571584572 lowest_perf = READ_ONCE(cpudata->lowest_perf);585573 if (min_limit_perf < lowest_perf)···895889{896890 int ret;897891 u32 min_freq, max_freq;898898- u64 numerator;899892 u32 nominal_perf, nominal_freq;900893 u32 lowest_nonlinear_perf, lowest_nonlinear_freq;901894 u32 boost_ratio, lowest_nonlinear_ratio;···916911917912 nominal_perf = READ_ONCE(cpudata->nominal_perf);918913919919- ret = amd_get_boost_ratio_numerator(cpudata->cpu, &numerator);920920- if (ret)921921- return ret;922922- boost_ratio = div_u64(numerator << SCHED_CAPACITY_SHIFT, nominal_perf);914914+ boost_ratio = div_u64(cpudata->highest_perf << SCHED_CAPACITY_SHIFT, nominal_perf);923915 max_freq = (nominal_freq * boost_ratio >> SCHED_CAPACITY_SHIFT) * 1000;924916925917 lowest_nonlinear_perf = READ_ONCE(cpudata->lowest_nonlinear_perf);···18711869 static_call_update(amd_pstate_update_perf, shmem_update_perf);18721870 }1873187118741874- ret = amd_pstate_register_driver(cppc_state);18751875- if (ret) {18761876- pr_err("failed to register with return %d\n", ret);18771877- return ret;18781878- }18791879-18801872 if (amd_pstate_prefcore) {18811873 ret = amd_detect_prefcore(&amd_pstate_prefcore);18821874 if (ret)18831875 return ret;18761876+ }18771877+18781878+ ret = amd_pstate_register_driver(cppc_state);18791879+ if (ret) {18801880+ pr_err("failed to register with return %d\n", ret);18811881+ return ret;18841882 }1885188318861884 dev_root = bus_get_dev_root(&cpu_subsys);
+18-7
drivers/cxl/core/region.c
···12951295 struct cxl_region_params *p = &cxlr->params;12961296 struct cxl_decoder *cxld = cxl_rr->decoder;12971297 struct cxl_switch_decoder *cxlsd;12981298+ struct cxl_port *iter = port;12981299 u16 eig, peig;12991300 u8 eiw, peiw;13001301···1312131113131312 cxlsd = to_cxl_switch_decoder(&cxld->dev);13141313 if (cxl_rr->nr_targets_set) {13151315- int i, distance;13141314+ int i, distance = 1;13151315+ struct cxl_region_ref *cxl_rr_iter;1316131613171317 /*13181318- * Passthrough decoders impose no distance requirements between13191319- * peers13181318+ * The "distance" between peer downstream ports represents which13191319+ * endpoint positions in the region interleave a given port can13201320+ * host.13211321+ *13221322+ * For example, at the root of a hierarchy the distance is13231323+ * always 1 as every index targets a different host-bridge. At13241324+ * each subsequent switch level those ports map every Nth region13251325+ * position where N is the width of the switch == distance.13201326 */13211321- if (cxl_rr->nr_targets == 1)13221322- distance = 0;13231323- else13241324- distance = p->nr_targets / cxl_rr->nr_targets;13271327+ do {13281328+ cxl_rr_iter = cxl_rr_load(iter, cxlr);13291329+ distance *= cxl_rr_iter->nr_targets;13301330+ iter = to_cxl_port(iter->dev.parent);13311331+ } while (!is_cxl_root(iter));13321332+ distance *= cxlrd->cxlsd.cxld.interleave_ways;13331333+13251334 for (i = 0; i < cxl_rr->nr_targets_set; i++)13261335 if (ep->dport == cxlsd->target[i]) {13271336 rc = check_last_peer(cxled, ep, cxl_rr,
+4-2
drivers/cxl/pci.c
···836836 if (!root_dev)837837 return -ENXIO;838838839839+ if (!dport->regs.rcd_pcie_cap)840840+ return -ENXIO;841841+839842 guard(device)(root_dev);840843 if (!root_dev->driver)841844 return -ENXIO;···10351032 if (rc)10361033 return rc;1037103410381038- rc = cxl_pci_ras_unmask(pdev);10391039- if (rc)10351035+ if (cxl_pci_ras_unmask(pdev))10401036 dev_dbg(&pdev->dev, "No RAS reporting unmasked\n");1041103710421038 pci_save_state(pdev);
···297297};298298299299#define SEALS_WANTED (F_SEAL_SHRINK)300300-#define SEALS_DENIED (F_SEAL_WRITE)300300+#define SEALS_DENIED (F_SEAL_WRITE|F_SEAL_FUTURE_WRITE)301301302302static int check_memfd_seals(struct file *memfd)303303{···317317 return 0;318318}319319320320-static int export_udmabuf(struct udmabuf *ubuf,321321- struct miscdevice *device,322322- u32 flags)320320+static struct dma_buf *export_udmabuf(struct udmabuf *ubuf,321321+ struct miscdevice *device)323322{324323 DEFINE_DMA_BUF_EXPORT_INFO(exp_info);325325- struct dma_buf *buf;326324327325 ubuf->device = device;328326 exp_info.ops = &udmabuf_ops;···328330 exp_info.priv = ubuf;329331 exp_info.flags = O_RDWR;330332331331- buf = dma_buf_export(&exp_info);332332- if (IS_ERR(buf))333333- return PTR_ERR(buf);334334-335335- return dma_buf_fd(buf, flags);333333+ return dma_buf_export(&exp_info);336334}337335338336static long udmabuf_pin_folios(struct udmabuf *ubuf, struct file *memfd,···385391 struct folio **folios = NULL;386392 pgoff_t pgcnt = 0, pglimit;387393 struct udmabuf *ubuf;394394+ struct dma_buf *dmabuf;388395 long ret = -EINVAL;389396 u32 i, flags;390397···431436 goto err;432437 }433438439439+ /*440440+ * Take the inode lock to protect against concurrent441441+ * memfd_add_seals(), which takes this lock in write mode.442442+ */443443+ inode_lock_shared(file_inode(memfd));434444 ret = check_memfd_seals(memfd);435435- if (ret < 0) {436436- fput(memfd);437437- goto err;438438- }445445+ if (ret)446446+ goto out_unlock;439447440448 ret = udmabuf_pin_folios(ubuf, memfd, list[i].offset,441449 list[i].size, folios);450450+out_unlock:451451+ inode_unlock_shared(file_inode(memfd));442452 fput(memfd);443453 if (ret)444454 goto err;445455 }446456447457 flags = head->flags & UDMABUF_FLAGS_CLOEXEC ? O_CLOEXEC : 0;448448- ret = export_udmabuf(ubuf, device, flags);449449- if (ret < 0)458458+ dmabuf = export_udmabuf(ubuf, device);459459+ if (IS_ERR(dmabuf)) {460460+ ret = PTR_ERR(dmabuf);450461 goto err;462462+ }463463+ /*464464+ * Ownership of ubuf is held by the dmabuf from here.465465+ * If the following dma_buf_fd() fails, dma_buf_put() cleans up both the466466+ * dmabuf and the ubuf (through udmabuf_ops.release).467467+ */468468+469469+ ret = dma_buf_fd(dmabuf, flags);470470+ if (ret < 0)471471+ dma_buf_put(dmabuf);451472452473 kvfree(folios);453474 return ret;
+11-4
drivers/firmware/arm_ffa/bus.c
···187187 return valid;188188}189189190190-struct ffa_device *ffa_device_register(const uuid_t *uuid, int vm_id,191191- const struct ffa_ops *ops)190190+struct ffa_device *191191+ffa_device_register(const struct ffa_partition_info *part_info,192192+ const struct ffa_ops *ops)192193{193194 int id, ret;195195+ uuid_t uuid;194196 struct device *dev;195197 struct ffa_device *ffa_dev;198198+199199+ if (!part_info)200200+ return NULL;196201197202 id = ida_alloc_min(&ffa_bus_id, 1, GFP_KERNEL);198203 if (id < 0)···215210 dev_set_name(&ffa_dev->dev, "arm-ffa-%d", id);216211217212 ffa_dev->id = id;218218- ffa_dev->vm_id = vm_id;213213+ ffa_dev->vm_id = part_info->id;214214+ ffa_dev->properties = part_info->properties;219215 ffa_dev->ops = ops;220220- uuid_copy(&ffa_dev->uuid, uuid);216216+ import_uuid(&uuid, (u8 *)part_info->uuid);217217+ uuid_copy(&ffa_dev->uuid, &uuid);221218222219 ret = device_register(&ffa_dev->dev);223220 if (ret) {
+1-6
drivers/firmware/arm_ffa/driver.c
···13871387static int ffa_setup_partitions(void)13881388{13891389 int count, idx, ret;13901390- uuid_t uuid;13911390 struct ffa_device *ffa_dev;13921391 struct ffa_dev_part_info *info;13931392 struct ffa_partition_info *pbuf, *tpbuf;···1405140614061407 xa_init(&drv_info->partition_info);14071408 for (idx = 0, tpbuf = pbuf; idx < count; idx++, tpbuf++) {14081408- import_uuid(&uuid, (u8 *)tpbuf->uuid);14091409-14101409 /* Note that if the UUID will be uuid_null, that will require14111410 * ffa_bus_notifier() to find the UUID of this partition id14121411 * with help of ffa_device_match_uuid(). FF-A v1.1 and above14131412 * provides UUID here for each partition as part of the14141413 * discovery API and the same is passed.14151414 */14161416- ffa_dev = ffa_device_register(&uuid, tpbuf->id, &ffa_drv_ops);14151415+ ffa_dev = ffa_device_register(tpbuf, &ffa_drv_ops);14171416 if (!ffa_dev) {14181417 pr_err("%s: failed to register partition ID 0x%x\n",14191418 __func__, tpbuf->id);14201419 continue;14211420 }14221422-14231423- ffa_dev->properties = tpbuf->properties;1424142114251422 if (drv_info->version > FFA_VERSION_1_0 &&14261423 !(tpbuf->properties & FFA_PARTITION_AARCH64_EXEC))
+1
drivers/firmware/arm_scmi/vendors/imx/Kconfig
···1515config IMX_SCMI_MISC_EXT1616 tristate "i.MX SCMI MISC EXTENSION"1717 depends on ARM_SCMI_PROTOCOL || (COMPILE_TEST && OF)1818+ depends on IMX_SCMI_MISC_DRV1819 default y if ARCH_MXC1920 help2021 This enables i.MX System MISC control logic such as gpio expander
-1
drivers/firmware/imx/Kconfig
···25252626config IMX_SCMI_MISC_DRV2727 tristate "IMX SCMI MISC Protocol driver"2828- depends on IMX_SCMI_MISC_EXT || COMPILE_TEST2928 default y if ARCH_MXC3029 help3130 The System Controller Management Interface firmware (SCMI FW) is
+2-2
drivers/firmware/microchip/mpfs-auto-update.c
···402402 return -EIO;403403404404 /*405405- * Bit 5 of byte 1 is "UL_Auto Update" & if it is set, Auto Update is405405+ * Bit 5 of byte 1 is "UL_IAP" & if it is set, Auto Update is406406 * not possible.407407 */408408- if (response_msg[1] & AUTO_UPDATE_FEATURE_ENABLED)408408+ if ((((u8 *)response_msg)[1] & AUTO_UPDATE_FEATURE_ENABLED))409409 return -EPERM;410410411411 return 0;
+4
drivers/gpu/drm/Kconfig
···9999config DRM_KMS_HELPER100100 tristate101101 depends on DRM102102+ select FB_CORE if DRM_FBDEV_EMULATION102103 help103104 CRTC helpers for KMS drivers.104105···359358 tristate360359 depends on DRM361360 select DRM_TTM361361+ select FB_CORE if DRM_FBDEV_EMULATION362362 select FB_SYSMEM_HELPERS_DEFERRED if DRM_FBDEV_EMULATION363363 help364364 Helpers for ttm-based gem objects···367365config DRM_GEM_DMA_HELPER368366 tristate369367 depends on DRM368368+ select FB_CORE if DRM_FBDEV_EMULATION370369 select FB_DMAMEM_HELPERS_DEFERRED if DRM_FBDEV_EMULATION371370 help372371 Choose this if you need the GEM DMA helper functions···375372config DRM_GEM_SHMEM_HELPER376373 tristate377374 depends on DRM && MMU375375+ select FB_CORE if DRM_FBDEV_EMULATION378376 select FB_SYSMEM_HELPERS_DEFERRED if DRM_FBDEV_EMULATION379377 help380378 Choose this if you need the GEM shmem helper functions
+2-3
drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
···343343 coredump->skip_vram_check = skip_vram_check;344344 coredump->reset_vram_lost = vram_lost;345345346346- if (job && job->vm) {347347- struct amdgpu_vm *vm = job->vm;346346+ if (job && job->pasid) {348347 struct amdgpu_task_info *ti;349348350350- ti = amdgpu_vm_get_task_info_vm(vm);349349+ ti = amdgpu_vm_get_task_info_pasid(adev, job->pasid);351350 if (ti) {352351 coredump->reset_task_info = *ti;353352 amdgpu_vm_put_task_info(ti);
+3
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
···417417{418418 struct amdgpu_device *adev = drm_to_adev(dev);419419420420+ if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))421421+ return false;422422+420423 if (adev->has_pr3 ||421424 ((adev->flags & AMD_IS_PX) && amdgpu_is_atpx_hybrid()))422425 return true;
+1-2
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
···255255256256void amdgpu_job_free_resources(struct amdgpu_job *job)257257{258258- struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched);259258 struct dma_fence *f;260259 unsigned i;261260···267268 f = NULL;268269269270 for (i = 0; i < job->num_ibs; ++i)270270- amdgpu_ib_free(ring->adev, &job->ibs[i], f);271271+ amdgpu_ib_free(NULL, &job->ibs[i], f);271272}272273273274static void amdgpu_job_free_cb(struct drm_sched_job *s_job)
+3-4
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
···12661266 * next command submission.12671267 */12681268 if (amdgpu_vm_is_bo_always_valid(vm, bo)) {12691269- uint32_t mem_type = bo->tbo.resource->mem_type;12701270-12711271- if (!(bo->preferred_domains &12721272- amdgpu_mem_type_to_domain(mem_type)))12691269+ if (bo->tbo.resource &&12701270+ !(bo->preferred_domains &12711271+ amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type)))12731272 amdgpu_vm_bo_evicted(&bo_va->base);12741273 else12751274 amdgpu_vm_bo_idle(&bo_va->base);
+1-1
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c
···41234123 if (amdgpu_sriov_vf(adev))41244124 return 0;4125412541264126- switch (adev->ip_versions[GC_HWIP][0]) {41264126+ switch (amdgpu_ip_version(adev, GC_HWIP, 0)) {41274127 case IP_VERSION(12, 0, 0):41284128 case IP_VERSION(12, 0, 1):41294129 gfx_v12_0_update_gfx_clock_gating(adev,
···18961896 *18971897 * Creates a DP tunnel manager for @dev.18981898 *18991899- * Returns a pointer to the tunnel manager if created successfully or NULL in19001900- * case of an error.18991899+ * Returns a pointer to the tunnel manager if created successfully or error19001900+ * pointer in case of failure.19011901 */19021902struct drm_dp_tunnel_mgr *19031903drm_dp_tunnel_mgr_create(struct drm_device *dev, int max_group_count)···1907190719081908 mgr = kzalloc(sizeof(*mgr), GFP_KERNEL);19091909 if (!mgr)19101910- return NULL;19101910+ return ERR_PTR(-ENOMEM);1911191119121912 mgr->dev = dev;19131913 init_waitqueue_head(&mgr->bw_req_queue);···19161916 if (!mgr->groups) {19171917 kfree(mgr);1918191819191919- return NULL;19191919+ return ERR_PTR(-ENOMEM);19201920 }1921192119221922#ifdef CONFIG_DRM_DISPLAY_DP_TUNNEL_STATE_DEBUG···19271927 if (!init_group(mgr, &mgr->groups[i])) {19281928 destroy_mgr(mgr);1929192919301930- return NULL;19301930+ return ERR_PTR(-ENOMEM);19311931 }1932193219331933 mgr->group_count++;
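The drm_dp_tunnel_mgr_create() fix above switches the failure returns from NULL to ERR_PTR(-ENOMEM), matching the updated kerneldoc. The kernel's ERR_PTR()/IS_ERR()/PTR_ERR() convention packs small negative errno values into the top of the pointer range; a minimal user-space model of that encoding:

```c
/*
 * Model of the kernel's err.h helpers: errno values in [-4095, -1]
 * are encoded as pointers at the very top of the address space,
 * which no valid allocation can occupy.
 */
#define MAX_ERRNO 4095

static void *err_ptr(long err)
{
	return (void *)err;
}

static long ptr_err(const void *ptr)
{
	return (long)ptr;
}

static int is_err(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}
```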
+7-4
drivers/gpu/drm/drm_modes.c
···12871287 */12881288int drm_mode_vrefresh(const struct drm_display_mode *mode)12891289{12901290- unsigned int num, den;12901290+ unsigned int num = 1, den = 1;1291129112921292 if (mode->htotal == 0 || mode->vtotal == 0)12931293 return 0;12941294-12951295- num = mode->clock;12961296- den = mode->htotal * mode->vtotal;1297129412981295 if (mode->flags & DRM_MODE_FLAG_INTERLACE)12991296 num *= 2;···12981301 den *= 2;12991302 if (mode->vscan > 1)13001303 den *= mode->vscan;13041304+13051305+ if (check_mul_overflow(mode->clock, num, &num))13061306+ return 0;13071307+13081308+ if (check_mul_overflow(mode->htotal * mode->vtotal, den, &den))13091309+ return 0;1301131013021311 return DIV_ROUND_CLOSEST_ULL(mul_u32_u32(num, 1000), den);13031312}
+5
drivers/gpu/drm/i915/gt/intel_engine_types.h
···343343 * @start_gt_clk: GT clock time of last idle to active transition.344344 */345345 u64 start_gt_clk;346346+347347+ /**348348+ * @total: The last value of total returned349349+ */350350+ u64 total;346351};347352348353union intel_engine_tlb_inv_reg {
···481481 return dev_err_probe(dev, -EPROBE_DEFER, "Cannot get secondary DSI host\n");482482483483 nt->dsi[1] = mipi_dsi_device_register_full(dsi_r_host, info);484484- if (!nt->dsi[1]) {484484+ if (IS_ERR(nt->dsi[1])) {485485 dev_err(dev, "Cannot get secondary DSI node\n");486486- return -ENODEV;486486+ return PTR_ERR(nt->dsi[1]);487487 }488488 num_dsis++;489489 }
+1
drivers/gpu/drm/panel/panel-sitronix-st7701.c
···11771177 return dev_err_probe(dev, ret, "Failed to get orientation\n");1178117811791179 drm_panel_init(&st7701->panel, dev, &st7701_funcs, connector_type);11801180+ st7701->panel.prepare_prev_first = true;1180118111811182 /**11821183 * Once sleep out has been issued, ST7701 IC required to wait 120ms
···13551355 * drm_sched_backend_ops.run_job(). Consequently, drm_sched_backend_ops.free_job()13561356 * will not be called for all jobs still in drm_gpu_scheduler.pending_list.13571357 * There is no solution for this currently. Thus, it is up to the driver to make13581358- * sure that13581358+ * sure that:13591359+ *13591360 * a) drm_sched_fini() is only called after for all submitted jobs13601361 * drm_sched_backend_ops.free_job() has been called or that13611362 * b) the jobs for which drm_sched_backend_ops.free_job() has not been called
+5-4
drivers/hv/hv_balloon.c
···756756 * adding succeeded, it is ok to proceed even if the memory was757757 * not onlined in time.758758 */759759- wait_for_completion_timeout(&dm_device.ol_waitevent, 5 * HZ);759759+ wait_for_completion_timeout(&dm_device.ol_waitevent, secs_to_jiffies(5));760760 post_status(&dm_device);761761 }762762}···13731373 struct hv_dynmem_device *dm = dm_dev;1374137413751375 while (!kthread_should_stop()) {13761376- wait_for_completion_interruptible_timeout(&dm_device.config_event, 1 * HZ);13761376+ wait_for_completion_interruptible_timeout(&dm_device.config_event,13771377+ secs_to_jiffies(1));13771378 /*13781379 * The host expects us to post information on the memory13791380 * pressure every second.···17491748 if (ret)17501749 goto out;1751175017521752- t = wait_for_completion_timeout(&dm_device.host_event, 5 * HZ);17511751+ t = wait_for_completion_timeout(&dm_device.host_event, secs_to_jiffies(5));17531752 if (t == 0) {17541753 ret = -ETIMEDOUT;17551754 goto out;···18071806 if (ret)18081807 goto out;1809180818101810- t = wait_for_completion_timeout(&dm_device.host_event, 5 * HZ);18091809+ t = wait_for_completion_timeout(&dm_device.host_event, secs_to_jiffies(5));18111810 if (t == 0) {18121811 ret = -ETIMEDOUT;18131812 goto out;
···25072507 vmbus_request_offers();2508250825092509 if (wait_for_completion_timeout(25102510- &vmbus_connection.ready_for_resume_event, 10 * HZ) == 0)25102510+ &vmbus_connection.ready_for_resume_event, secs_to_jiffies(10)) == 0)25112511 pr_err("Some vmbus device is missing after suspending?\n");2512251225132513 /* Reset the event for the next suspend. */
+6-4
drivers/hwmon/tmp513.c
···182182 struct regmap *regmap;183183};184184185185-// Set the shift based on the gain 8=4, 4=3, 2=2, 1=1185185+// Set the shift based on the gain: 8 -> 1, 4 -> 2, 2 -> 3, 1 -> 4186186static inline u8 tmp51x_get_pga_shift(struct tmp51x_data *data)187187{188188 return 5 - ffs(data->pga_gain);···204204 * 2's complement number shifted by one to four depending205205 * on the pga gain setting. 1lsb = 10uV206206 */207207- *val = sign_extend32(regval, 17 - tmp51x_get_pga_shift(data));207207+ *val = sign_extend32(regval,208208+ reg == TMP51X_SHUNT_CURRENT_RESULT ?209209+ 16 - tmp51x_get_pga_shift(data) : 15);208210 *val = DIV_ROUND_CLOSEST(*val * 10 * MILLI, data->shunt_uohms);209211 break;210212 case TMP51X_BUS_VOLTAGE_RESULT:···222220 break;223221 case TMP51X_BUS_CURRENT_RESULT:224222 // Current = (ShuntVoltage * CalibrationRegister) / 4096225225- *val = sign_extend32(regval, 16) * data->curr_lsb_ua;223223+ *val = sign_extend32(regval, 15) * (long)data->curr_lsb_ua;226224 *val = DIV_ROUND_CLOSEST(*val, MILLI);227225 break;228226 case TMP51X_LOCAL_TEMP_RESULT:···234232 case TMP51X_REMOTE_TEMP_LIMIT_2:235233 case TMP513_REMOTE_TEMP_LIMIT_3:236234 // 1lsb = 0.0625 degrees centigrade237237- *val = sign_extend32(regval, 16) >> TMP51X_TEMP_SHIFT;235235+ *val = sign_extend32(regval, 15) >> TMP51X_TEMP_SHIFT;238236 *val = DIV_ROUND_CLOSEST(*val * 625, 10);239237 break;240238 case TMP51X_N_FACTOR_AND_HYST_1:
+1
drivers/macintosh/Kconfig
···120120config PMAC_BACKLIGHT121121 bool "Backlight control for LCD screens"122122 depends on PPC_PMAC && ADB_PMU && FB = y && (BROKEN || !PPC64)123123+ depends on BACKLIGHT_CLASS_DEVICE=y123124 select FB_BACKLIGHT124125 help125126 Say Y here to enable Macintosh specific extensions of the generic
···11881188 return ret;11891189}1190119011911191-static11911191+/* clang stack usage explodes if this is inlined */11921192+static noinline_for_stack11921193void vdec_vp9_slice_map_counts_eob_coef(unsigned int i, unsigned int j, unsigned int k,11931194 struct vdec_vp9_slice_frame_counts *counts,11941195 struct v4l2_vp9_frame_symbol_counts *counts_helper)
···12201220static int m_can_interrupt_handler(struct m_can_classdev *cdev)12211221{12221222 struct net_device *dev = cdev->net;12231223- u32 ir;12231223+ u32 ir = 0, ir_read;12241224 int ret;1225122512261226 if (pm_runtime_suspended(cdev->dev))12271227 return IRQ_NONE;1228122812291229- ir = m_can_read(cdev, M_CAN_IR);12291229+ /* The m_can controller signals its interrupt status as a level, but12301230+ * depending in the integration the CPU may interpret the signal as12311231+ * edge-triggered (for example with m_can_pci). For these12321232+ * edge-triggered integrations, we must observe that IR is 0 at least12331233+ * once to be sure that the next interrupt will generate an edge.12341234+ */12351235+ while ((ir_read = m_can_read(cdev, M_CAN_IR)) != 0) {12361236+ ir |= ir_read;12371237+12381238+ /* ACK all irqs */12391239+ m_can_write(cdev, M_CAN_IR, ir);12401240+12411241+ if (!cdev->irq_edge_triggered)12421242+ break;12431243+ }12441244+12301245 m_can_coalescing_update(cdev, ir);12311246 if (!ir)12321247 return IRQ_NONE;12331233-12341234- /* ACK all irqs */12351235- m_can_write(cdev, M_CAN_IR, ir);1236124812371249 if (cdev->ops->clear_interrupts)12381250 cdev->ops->clear_interrupts(cdev);···17071695 return -EINVAL;17081696 }1709169716981698+ /* Write the INIT bit, in case no hardware reset has happened before16991699+ * the probe (for example, it was observed that the Intel Elkhart Lake17001700+ * SoCs do not properly reset the CAN controllers on reboot)17011701+ */17021702+ err = m_can_cccr_update_bits(cdev, CCCR_INIT, CCCR_INIT);17031703+ if (err)17041704+ return err;17051705+17101706 if (!cdev->is_peripheral)17111707 netif_napi_add(dev, &cdev->napi, m_can_poll);17121708···17661746 return -EINVAL;17671747 }1768174817691769- /* Forcing standby mode should be redundant, as the chip should be in17701770- * standby after a reset. Write the INIT bit anyways, should the chip17711771- * be configured by previous stage.17721772- */17731773- return m_can_cccr_update_bits(cdev, CCCR_INIT, CCCR_INIT);17491749+ return 0;17741750}1775175117761752static void m_can_stop(struct net_device *dev)
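The new read-until-zero loop in m_can_interrupt_handler() can be modelled in isolation: for an edge-triggered integration the handler must keep reading and acking until the status register reads 0, so the next event produces a fresh 0-to-1 edge. A sketch with a toy level-signalled status register (all names here are illustrative):

```c
#include <stdint.h>

/* Toy interrupt-status register: reading returns the pending bits,
 * writing acknowledges (clears) the bits written. */
struct fake_dev {
	uint32_t pending;
};

static uint32_t read_ir(struct fake_dev *d)
{
	return d->pending;
}

static void ack_ir(struct fake_dev *d, uint32_t bits)
{
	d->pending &= ~bits;
}

/*
 * Model of the drain loop: accumulate and ack status bits until the
 * register reads 0 (edge-triggered case), or just once for a
 * level-triggered integration. Returns the serviced bits.
 */
static uint32_t drain_ir(struct fake_dev *d, int edge_triggered)
{
	uint32_t ir = 0, ir_read;

	while ((ir_read = read_ir(d)) != 0) {
		ir |= ir_read;
		ack_ir(d, ir);
		if (!edge_triggered)
			break;
	}
	return ir;
}
```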
+1
drivers/net/can/m_can/m_can.h
···9999 int pm_clock_support;100100 int pm_wake_source;101101 int is_peripheral;102102+ bool irq_edge_triggered;102103103104 // Cached M_CAN_IE register content104105 u32 active_interrupts;
···867867static int xennet_close(struct net_device *dev)868868{869869 struct netfront_info *np = netdev_priv(dev);870870- unsigned int num_queues = dev->real_num_tx_queues;870870+ unsigned int num_queues = np->queues ? dev->real_num_tx_queues : 0;871871 unsigned int i;872872 struct netfront_queue *queue;873873 netif_tx_stop_all_queues(np->netdev);···881881static void xennet_destroy_queues(struct netfront_info *info)882882{883883 unsigned int i;884884+885885+ if (!info->queues)886886+ return;884887885888 for (i = 0; i < info->netdev->real_num_tx_queues; i++) {886889 struct netfront_queue *queue = &info->queues[i];
+1-1
drivers/nvme/host/core.c
···20342034 * or smaller than a sector size yet, so catch this early and don't20352035 * allow block I/O.20362036 */20372037- if (head->lba_shift > PAGE_SHIFT || head->lba_shift < SECTOR_SHIFT) {20372037+ if (blk_validate_block_size(bs)) {20382038 bs = (1 << 9);20392039 valid = false;20402040 }
···8888}89899090#define EXCLUDED_DEFAULT_CELLS_PLATFORMS ( \9191- IS_ENABLED(CONFIG_SPARC) \9191+ IS_ENABLED(CONFIG_SPARC) || \9292+ of_find_compatible_node(NULL, NULL, "coreboot") \9293)93949495int of_bus_n_addr_cells(struct device_node *np)···15081507 map_len--;1509150815101509 /* Check if not found */15111511- if (!new)15101510+ if (!new) {15111511+ ret = -EINVAL;15121512 goto put;15131513+ }1513151415141515 if (!of_device_is_available(new))15151516 match = 0;···15211518 goto put;1522151915231520 /* Check for malformed properties */15241524- if (WARN_ON(new_size > MAX_PHANDLE_ARGS))15211521+ if (WARN_ON(new_size > MAX_PHANDLE_ARGS) ||15221522+ map_len < new_size) {15231523+ ret = -EINVAL;15251524 goto put;15261526- if (map_len < new_size)15271527- goto put;15251525+ }1528152615291527 /* Move forward by new node's #<list>-cells amount */15301528 map += new_size;15311529 map_len -= new_size;15321530 }15331533- if (!match)15311531+ if (!match) {15321532+ ret = -ENOENT;15341533 goto put;15341534+ }1535153515361536 /* Get the <list>-map-pass-thru property (optional) */15371537 pass = of_get_property(cur, pass_name, NULL);
+8-1
drivers/of/empty_root.dts
···22/dts-v1/;3344/ {55-55+ /*66+ * #address-cells/#size-cells are required properties at root node.77+ * Use 2 cells for both address cells and size cells in order to fully88+ * support 64-bit addresses and sizes on systems using this empty root99+ * node.1010+ */1111+ #address-cells = <0x02>;1212+ #size-cells = <0x02>;613};
+2
drivers/of/irq.c
···111111 else112112 np = of_find_node_by_phandle(be32_to_cpup(imap));113113 imap++;114114+ len--;114115115116 /* Check if not found */116117 if (!np) {···355354 return of_irq_parse_oldworld(device, index, out_irq);356355357356 /* Get the reg property (if any) */357357+ addr_len = 0;358358 addr = of_get_property(device, "reg", &addr_len);359359360360 /* Prevent out-of-bounds read in case of longer interrupt parent address size */
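The `len--` added in the of/irq.c hunk keeps the remaining-length counter in step with the `imap++` cursor so later bounds checks do not operate on a stale length. The general pattern, with a hypothetical helper (not a kernel API):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Consume `count` cells from a property, advancing the cursor and the
 * remaining-length counter together. Returns -1 instead of reading past
 * the end, which is the failure mode the of/irq.c fix guards against.
 */
static int consume_cells(const uint32_t **cur, int *remaining, int count,
			 uint32_t *out)
{
	if (*remaining < count)
		return -1;
	for (int i = 0; i < count; i++)
		out[i] = (*cur)[i];
	*cur += count;		/* every cursor advance... */
	*remaining -= count;	/* ...pairs with a length decrement */
	return 0;
}
```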
···12131213 of_node_put(np);12141214}1215121512161216+static void __init of_unittest_pci_empty_dma_ranges(void)12171217+{12181218+ struct device_node *np;12191219+ struct of_pci_range range;12201220+ struct of_pci_range_parser parser;12211221+12221222+ if (!IS_ENABLED(CONFIG_PCI))12231223+ return;12241224+12251225+ np = of_find_node_by_path("/testcase-data/address-tests2/pcie@d1070000/pci@0,0/dev@0,0/local-bus@0");12261226+ if (!np) {12271227+ pr_err("missing testcase data\n");12281228+ return;12291229+ }12301230+12311231+ if (of_pci_dma_range_parser_init(&parser, np)) {12321232+ pr_err("missing dma-ranges property\n");12331233+ return;12341234+ }12351235+12361236+ /*12371237+ * Get the dma-ranges from the device tree12381238+ */12391239+ for_each_of_pci_range(&parser, &range) {12401240+ unittest(range.size == 0x10000000,12411241+ "for_each_of_pci_range wrong size on node %pOF size=%llx\n",12421242+ np, range.size);12431243+ unittest(range.cpu_addr == 0x00000000,12441244+ "for_each_of_pci_range wrong CPU addr (%llx) on node %pOF",12451245+ range.cpu_addr, np);12461246+ unittest(range.pci_addr == 0xc0000000,12471247+ "for_each_of_pci_range wrong DMA addr (%llx) on node %pOF",12481248+ range.pci_addr, np);12491249+ }12501250+12511251+ of_node_put(np);12521252+}12531253+12161254static void __init of_unittest_bus_ranges(void)12171255{12181256 struct device_node *np;···43104272 of_unittest_dma_get_max_cpu_address();43114273 of_unittest_parse_dma_ranges();43124274 of_unittest_pci_dma_ranges();42754275+ of_unittest_pci_empty_dma_ranges();43134276 of_unittest_bus_ranges();43144277 of_unittest_bus_3cell_ranges();43154278 of_unittest_reg();
+4-2
drivers/pci/pci.c
···62326232 pcie_capability_read_dword(dev, PCI_EXP_LNKCAP2, &lnkcap2);62336233 speeds = lnkcap2 & PCI_EXP_LNKCAP2_SLS;6234623462356235+ /* Ignore speeds higher than Max Link Speed */62366236+ pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);62376237+ speeds &= GENMASK(lnkcap & PCI_EXP_LNKCAP_SLS, 0);62386238+62356239 /* PCIe r3.0-compliant */62366240 if (speeds)62376241 return speeds;62386238-62396239- pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);6240624262416243 /* Synthesize from the Max Link Speed field */62426244 if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_5_0GB)
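The new `speeds &= GENMASK(lnkcap & PCI_EXP_LNKCAP_SLS, 0)` line discards any LNKCAP2 speed bits above the Max Link Speed field. A 32-bit sketch of that masking (`GENMASK32` mirrors the kernel's `GENMASK()`; the field values below are illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit GENMASK(h, l) as in include/linux/bits.h. */
#define GENMASK32(h, l)	((~0u >> (31 - (h))) & (~0u << (l)))

/*
 * Keep only supported-speed bits at or below the max-link-speed index,
 * so a device advertising speeds its link cannot reach is ignored.
 */
static uint32_t clamp_speeds(uint32_t speeds, uint32_t max_link_speed)
{
	return speeds & GENMASK32(max_link_speed, 0);
}
```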
+3-1
drivers/pci/pcie/portdrv.c
···265265 (pcie_ports_dpc_native || (services & PCIE_PORT_SERVICE_AER)))266266 services |= PCIE_PORT_SERVICE_DPC;267267268268+ /* Enable bandwidth control if more than one speed is supported. */268269 if (pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM ||269270 pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) {270271 u32 linkcap;271272272273 pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &linkcap);273273- if (linkcap & PCI_EXP_LNKCAP_LBNC)274274+ if (linkcap & PCI_EXP_LNKCAP_LBNC &&275275+ hweight8(dev->supported_speeds) > 1)274276 services |= PCIE_PORT_SERVICE_BWCTRL;275277 }276278
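`hweight8()` in the portdrv hunk is a population count: bandwidth control is only useful when the supported-speeds vector has more than one bit set, i.e. the link can actually change speed. A minimal equivalent of the check:

```c
#include <assert.h>
#include <stdint.h>

/* Portable popcount standing in for the kernel's hweight8(). */
static unsigned int hweight8_sketch(uint8_t v)
{
	unsigned int w = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		w++;
	}
	return w;
}

/* Mirrors the new check: enable BWCTRL only for multi-speed links. */
static int wants_bwctrl(uint8_t supported_speeds)
{
	return hweight8_sketch(supported_speeds) > 1;
}
```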
+1-1
drivers/platform/loongarch/Kconfig
···18181919config LOONGSON_LAPTOP2020 tristate "Generic Loongson-3 Laptop Driver"2121- depends on ACPI2121+ depends on ACPI_EC2222 depends on BACKLIGHT_CLASS_DEVICE2323 depends on INPUT2424 depends on MACH_LOONGSON64
···4343};44444545static struct p2sb_res_cache p2sb_resources[NR_P2SB_RES_CACHE];4646+static bool p2sb_hidden_by_bios;46474748static void p2sb_get_devfn(unsigned int *devfn)4849{···98979998static int p2sb_scan_and_cache(struct pci_bus *bus, unsigned int devfn)10099{100100+ /*101101+ * The BIOS prevents the P2SB device from being enumerated by the PCI102102+ * subsystem, so we need to unhide and hide it back to lookup the BAR.103103+ */104104+ pci_bus_write_config_dword(bus, devfn, P2SBC, 0);105105+101106 /* Scan the P2SB device and cache its BAR0 */102107 p2sb_scan_and_cache_devfn(bus, devfn);103108104109 /* On Goldmont p2sb_bar() also gets called for the SPI controller */105110 if (devfn == P2SB_DEVFN_GOLDMONT)106111 p2sb_scan_and_cache_devfn(bus, SPI_DEVFN_GOLDMONT);112112+113113+ pci_bus_write_config_dword(bus, devfn, P2SBC, P2SBC_HIDE);107114108115 if (!p2sb_valid_resource(&p2sb_resources[PCI_FUNC(devfn)].res))109116 return -ENOENT;···138129 u32 value = P2SBC_HIDE;139130 struct pci_bus *bus;140131 u16 class;141141- int ret;132132+ int ret = 0;142133143134 /* Get devfn for P2SB device itself */144135 p2sb_get_devfn(&devfn_p2sb);···161152 */162153 pci_lock_rescan_remove();163154164164- /*165165- * The BIOS prevents the P2SB device from being enumerated by the PCI166166- * subsystem, so we need to unhide and hide it back to lookup the BAR.167167- * Unhide the P2SB device here, if needed.168168- */169155 pci_bus_read_config_dword(bus, devfn_p2sb, P2SBC, &value);170170- if (value & P2SBC_HIDE)171171- pci_bus_write_config_dword(bus, devfn_p2sb, P2SBC, 0);156156+ p2sb_hidden_by_bios = value & P2SBC_HIDE;172157173173- ret = p2sb_scan_and_cache(bus, devfn_p2sb);174174-175175- /* Hide the P2SB device, if it was hidden */176176- if (value & P2SBC_HIDE)177177- pci_bus_write_config_dword(bus, devfn_p2sb, P2SBC, P2SBC_HIDE);158158+ /*159159+ * If the BIOS does not hide the P2SB device then its resources160160+ * are accessible. 
Cache them only if the P2SB device is hidden.161161+ */162162+ if (p2sb_hidden_by_bios)163163+ ret = p2sb_scan_and_cache(bus, devfn_p2sb);178164179165 pci_unlock_rescan_remove();166166+167167+ return ret;168168+}169169+170170+static int p2sb_read_from_cache(struct pci_bus *bus, unsigned int devfn,171171+ struct resource *mem)172172+{173173+ struct p2sb_res_cache *cache = &p2sb_resources[PCI_FUNC(devfn)];174174+175175+ if (cache->bus_dev_id != bus->dev.id)176176+ return -ENODEV;177177+178178+ if (!p2sb_valid_resource(&cache->res))179179+ return -ENOENT;180180+181181+ memcpy(mem, &cache->res, sizeof(*mem));182182+183183+ return 0;184184+}185185+186186+static int p2sb_read_from_dev(struct pci_bus *bus, unsigned int devfn,187187+ struct resource *mem)188188+{189189+ struct pci_dev *pdev;190190+ int ret = 0;191191+192192+ pdev = pci_get_slot(bus, devfn);193193+ if (!pdev)194194+ return -ENODEV;195195+196196+ if (p2sb_valid_resource(pci_resource_n(pdev, 0)))197197+ p2sb_read_bar0(pdev, mem);198198+ else199199+ ret = -ENOENT;200200+201201+ pci_dev_put(pdev);180202181203 return ret;182204}···228188 */229189int p2sb_bar(struct pci_bus *bus, unsigned int devfn, struct resource *mem)230190{231231- struct p2sb_res_cache *cache;232232-233191 bus = p2sb_get_bus(bus);234192 if (!bus)235193 return -ENODEV;···235197 if (!devfn)236198 p2sb_get_devfn(&devfn);237199238238- cache = &p2sb_resources[PCI_FUNC(devfn)];239239- if (cache->bus_dev_id != bus->dev.id)240240- return -ENODEV;200200+ if (p2sb_hidden_by_bios)201201+ return p2sb_read_from_cache(bus, devfn, mem);241202242242- if (!p2sb_valid_resource(&cache->res))243243- return -ENOENT;244244-245245- memcpy(mem, &cache->res, sizeof(*mem));246246- return 0;203203+ return p2sb_read_from_dev(bus, devfn, mem);247204}248205EXPORT_SYMBOL_GPL(p2sb_bar);249206
···33 tristate "Support for small TFT LCD display modules"44 depends on FB && SPI55 depends on FB_DEVICE66+ depends on BACKLIGHT_CLASS_DEVICE67 depends on GPIOLIB || COMPILE_TEST78 select FB_BACKLIGHT89 select FB_SYSMEM_HELPERS_DEFERRED
···103103104104err_nvm:105105 dev_dbg(&rt->dev, "NVM upgrade disabled\n");106106+ rt->no_nvm_upgrade = true;106107 if (!IS_ERR(nvm))107108 tb_nvm_free(nvm);108109···183182184183 if (!rt->nvm)185184 ret = -EAGAIN;186186- else if (rt->no_nvm_upgrade)187187- ret = -EOPNOTSUPP;188185 else189186 ret = sysfs_emit(buf, "%#x\n", rt->auth_status);190187···322323323324 if (!rt->nvm)324325 ret = -EAGAIN;325325- else if (rt->no_nvm_upgrade)326326- ret = -EOPNOTSUPP;327326 else328327 ret = sysfs_emit(buf, "%x.%x\n", rt->nvm->major, rt->nvm->minor);329328···339342}340343static DEVICE_ATTR_RO(vendor);341344345345+static umode_t retimer_is_visible(struct kobject *kobj, struct attribute *attr,346346+ int n)347347+{348348+ struct device *dev = kobj_to_dev(kobj);349349+ struct tb_retimer *rt = tb_to_retimer(dev);350350+351351+ if (attr == &dev_attr_nvm_authenticate.attr ||352352+ attr == &dev_attr_nvm_version.attr)353353+ return rt->no_nvm_upgrade ? 0 : attr->mode;354354+355355+ return attr->mode;356356+}357357+342358static struct attribute *retimer_attrs[] = {343359 &dev_attr_device.attr,344360 &dev_attr_nvm_authenticate.attr,···361351};362352363353static const struct attribute_group retimer_group = {354354+ .is_visible = retimer_is_visible,364355 .attrs = retimer_attrs,365356};366357
+41
drivers/thunderbolt/tb.c
···20592059 }20602060}2061206120622062+static void tb_switch_enter_redrive(struct tb_switch *sw)20632063+{20642064+ struct tb_port *port;20652065+20662066+ tb_switch_for_each_port(sw, port)20672067+ tb_enter_redrive(port);20682068+}20692069+20702070+/*20712071+ * Called during system and runtime suspend to forcefully exit redrive20722072+ * mode without querying whether the resource is available.20732073+ */20742074+static void tb_switch_exit_redrive(struct tb_switch *sw)20752075+{20762076+ struct tb_port *port;20772077+20782078+ if (!(sw->quirks & QUIRK_KEEP_POWER_IN_DP_REDRIVE))20792079+ return;20802080+20812081+ tb_switch_for_each_port(sw, port) {20822082+ if (!tb_port_is_dpin(port))20832083+ continue;20842084+20852085+ if (port->redrive) {20862086+ port->redrive = false;20872087+ pm_runtime_put(&sw->dev);20882088+ tb_port_dbg(port, "exit redrive mode\n");20892089+ }20902090+ }20912091+}20922092+20622093static void tb_dp_resource_unavailable(struct tb *tb, struct tb_port *port)20632094{20642095 struct tb_port *in, *out;···29402909 tb_create_usb3_tunnels(tb->root_switch);29412910 /* Add DP IN resources for the root switch */29422911 tb_add_dp_resources(tb->root_switch);29122912+ tb_switch_enter_redrive(tb->root_switch);29432913 /* Make the discovered switches available to the userspace */29442914 device_for_each_child(&tb->root_switch->dev, NULL,29452915 tb_scan_finalize_switch);···2956292429572925 tb_dbg(tb, "suspending...\n");29582926 tb_disconnect_and_release_dp(tb);29272927+ tb_switch_exit_redrive(tb->root_switch);29592928 tb_switch_suspend(tb->root_switch, false);29602929 tcm->hotplug_active = false; /* signal tb_handle_hotplug to quit */29612930 tb_dbg(tb, "suspend finished\n");···30493016 tb_dbg(tb, "tunnels restarted, sleeping for 100ms\n");30503017 msleep(100);30513018 }30193019+ tb_switch_enter_redrive(tb->root_switch);30523020 /* Allow tb_handle_hotplug to progress events */30533021 tcm->hotplug_active = true;30543022 tb_dbg(tb, "resume 
finished\n");···31133079 struct tb_cm *tcm = tb_priv(tb);3114308031153081 mutex_lock(&tb->lock);30823082+ /*30833083+ * The below call only releases DP resources to allow exiting and30843084+ * re-entering redrive mode.30853085+ */30863086+ tb_disconnect_and_release_dp(tb);30873087+ tb_switch_exit_redrive(tb->root_switch);31163088 tb_switch_suspend(tb->root_switch, true);31173089 tcm->hotplug_active = false;31183090 mutex_unlock(&tb->lock);···31503110 tb_restore_children(tb->root_switch);31513111 list_for_each_entry_safe(tunnel, n, &tcm->tunnel_list, list)31523112 tb_tunnel_restart(tunnel);31133113+ tb_switch_enter_redrive(tb->root_switch);31533114 tcm->hotplug_active = true;31543115 mutex_unlock(&tb->lock);31553116
+1-1
drivers/usb/host/xhci-mem.c
···436436 goto free_segments;437437 }438438439439- xhci_link_rings(xhci, ring, &new_ring);439439+ xhci_link_rings(xhci, &new_ring, ring);440440 trace_xhci_ring_expansion(ring);441441 xhci_dbg_trace(xhci, trace_xhci_dbg_ring_expansion,442442 "ring expansion succeed, now has %d segments",
-2
drivers/usb/host/xhci-ring.c
···11991199 * Keep retrying until the EP starts and stops again, on12001200 * chips where this is known to help. Wait for 100ms.12011201 */12021202- if (!(xhci->quirks & XHCI_NEC_HOST))12031203- break;12041202 if (time_is_before_jiffies(ep->stop_time + msecs_to_jiffies(100)))12051203 break;12061204 fallthrough;
···649649config FB_ATMEL650650 tristate "AT91 LCD Controller support"651651 depends on FB && OF && HAVE_CLK && HAS_IOMEM652652+ depends on BACKLIGHT_CLASS_DEVICE652653 depends on HAVE_FB_ATMEL || COMPILE_TEST653654 select FB_BACKLIGHT654655 select FB_IOMEM_HELPERS···661660config FB_NVIDIA662661 tristate "nVidia Framebuffer Support"663662 depends on FB && PCI664664- select FB_BACKLIGHT if FB_NVIDIA_BACKLIGHT665663 select FB_CFB_FILLRECT666664 select FB_CFB_COPYAREA667665 select FB_CFB_IMAGEBLIT···700700config FB_NVIDIA_BACKLIGHT701701 bool "Support for backlight control"702702 depends on FB_NVIDIA703703+ depends on BACKLIGHT_CLASS_DEVICE=y || BACKLIGHT_CLASS_DEVICE=FB_NVIDIA704704+ select FB_BACKLIGHT703705 default y704706 help705707 Say Y here if you want to control the backlight of your display.···709707config FB_RIVA710708 tristate "nVidia Riva support"711709 depends on FB && PCI712712- select FB_BACKLIGHT if FB_RIVA_BACKLIGHT713710 select FB_CFB_FILLRECT714711 select FB_CFB_COPYAREA715712 select FB_CFB_IMAGEBLIT···748747config FB_RIVA_BACKLIGHT749748 bool "Support for backlight control"750749 depends on FB_RIVA750750+ depends on BACKLIGHT_CLASS_DEVICE=y || BACKLIGHT_CLASS_DEVICE=FB_RIVA751751+ select FB_BACKLIGHT751752 default y752753 help753754 Say Y here if you want to control the backlight of your display.···937934config FB_RADEON938935 tristate "ATI Radeon display support"939936 depends on FB && PCI940940- select FB_BACKLIGHT if FB_RADEON_BACKLIGHT941937 select FB_CFB_FILLRECT942938 select FB_CFB_COPYAREA943939 select FB_CFB_IMAGEBLIT···962960config FB_RADEON_BACKLIGHT963961 bool "Support for backlight control"964962 depends on FB_RADEON963963+ depends on BACKLIGHT_CLASS_DEVICE=y || BACKLIGHT_CLASS_DEVICE=FB_RADEON964964+ select FB_BACKLIGHT965965 default y966966 help967967 Say Y here if you want to control the backlight of your display.···979975config FB_ATY128980976 tristate "ATI Rage128 display support"981977 depends on FB && PCI982982- select 
FB_BACKLIGHT if FB_ATY128_BACKLIGHT983978 select FB_IOMEM_HELPERS984979 select FB_MACMODES if PPC_PMAC985980 help···992989config FB_ATY128_BACKLIGHT993990 bool "Support for backlight control"994991 depends on FB_ATY128992992+ depends on BACKLIGHT_CLASS_DEVICE=y || BACKLIGHT_CLASS_DEVICE=FB_ATY128993993+ select FB_BACKLIGHT995994 default y996995 help997996 Say Y here if you want to control the backlight of your display.···1004999 select FB_CFB_FILLRECT10051000 select FB_CFB_COPYAREA10061001 select FB_CFB_IMAGEBLIT10071007- select FB_BACKLIGHT if FB_ATY_BACKLIGHT10081002 select FB_IOMEM_FOPS10091003 select FB_MACMODES if PPC10101004 select FB_ATY_CT if SPARC64 && PCI···10441040config FB_ATY_BACKLIGHT10451041 bool "Support for backlight control"10461042 depends on FB_ATY10431043+ depends on BACKLIGHT_CLASS_DEVICE=y || BACKLIGHT_CLASS_DEVICE=FB_ATY10441044+ select FB_BACKLIGHT10471045 default y10481046 help10491047 Say Y here if you want to control the backlight of your display.···15341528 depends on FB && HAVE_CLK && HAS_IOMEM15351529 depends on SUPERH || COMPILE_TEST15361530 depends on FB_DEVICE15311531+ depends on BACKLIGHT_CLASS_DEVICE15371532 select FB_BACKLIGHT15381533 select FB_DEFERRED_IO15391534 select FB_DMAMEM_HELPERS···18001793 tristate "Solomon SSD1307 framebuffer support"18011794 depends on FB && I2C18021795 depends on GPIOLIB || COMPILE_TEST17961796+ depends on BACKLIGHT_CLASS_DEVICE18031797 select FB_BACKLIGHT18041798 select FB_SYSMEM_HELPERS_DEFERRED18051799 help
+1-2
drivers/video/fbdev/core/Kconfig
···183183 select FB_SYSMEM_HELPERS184184185185config FB_BACKLIGHT186186- tristate186186+ bool187187 depends on FB188188- select BACKLIGHT_CLASS_DEVICE189188190189config FB_MODE_HELPERS191190 bool "Enable Video Mode Handling Helpers"
+11-5
fs/btrfs/bio.c
···358358 INIT_WORK(&bbio->end_io_work, btrfs_end_bio_work);359359 queue_work(btrfs_end_io_wq(fs_info, bio), &bbio->end_io_work);360360 } else {361361- if (bio_op(bio) == REQ_OP_ZONE_APPEND && !bio->bi_status)361361+ if (bio_is_zone_append(bio) && !bio->bi_status)362362 btrfs_record_physical_zoned(bbio);363363 btrfs_bio_end_io(bbio, bbio->bio.bi_status);364364 }···401401 else402402 bio->bi_status = BLK_STS_OK;403403404404- if (bio_op(bio) == REQ_OP_ZONE_APPEND && !bio->bi_status)404404+ if (bio_is_zone_append(bio) && !bio->bi_status)405405 stripe->physical = bio->bi_iter.bi_sector << SECTOR_SHIFT;406406407407 btrfs_bio_end_io(bbio, bbio->bio.bi_status);···415415 if (bio->bi_status) {416416 atomic_inc(&stripe->bioc->error);417417 btrfs_log_dev_io_error(bio, stripe->dev);418418- } else if (bio_op(bio) == REQ_OP_ZONE_APPEND) {418418+ } else if (bio_is_zone_append(bio)) {419419 stripe->physical = bio->bi_iter.bi_sector << SECTOR_SHIFT;420420 }421421···652652 map_length = min(map_length, bbio->fs_info->max_zone_append_size);653653 sector_offset = bio_split_rw_at(&bbio->bio, &bbio->fs_info->limits,654654 &nr_segs, map_length);655655- if (sector_offset)656656- return sector_offset << SECTOR_SHIFT;655655+ if (sector_offset) {656656+ /*657657+ * bio_split_rw_at() could split at a size smaller than our658658+ * sectorsize and thus cause unaligned I/Os. Fix that by659659+ * always rounding down to the nearest boundary.660660+ */661661+ return ALIGN_DOWN(sector_offset << SECTOR_SHIFT, bbio->fs_info->sectorsize);662662+ }657663 return map_length;658664}659665
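The `ALIGN_DOWN()` added in the btrfs hunk prevents a split length that is not a multiple of the filesystem sector size. The arithmetic, with `ALIGN_DOWN_POW2` standing in for the kernel macro (valid for power-of-two alignments only):

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT		9
/* Power-of-two round-down, as include/linux/align.h's ALIGN_DOWN(). */
#define ALIGN_DOWN_POW2(x, a)	((x) & ~((uint64_t)(a) - 1))

/*
 * bio_split_rw_at() reports the split point in 512-byte sectors, which
 * may not land on a filesystem sector boundary; round the byte length
 * down to the nearest sectorsize multiple as the fix does.
 */
static uint64_t split_len(uint64_t sector_offset, uint32_t sectorsize)
{
	return ALIGN_DOWN_POW2(sector_offset << SECTOR_SHIFT, sectorsize);
}
```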
+19
fs/btrfs/ctree.h
···371371}372372373373/*374374+ * Return the generation this root started with.375375+ *376376+ * Every normal root is created with root->root_key.offset set to its377377+ * originating generation. If it is a snapshot it is the generation when the378378+ * snapshot was created.379379+ *380380+ * However for TREE_RELOC roots root_key.offset is the objectid of the owning381381+ * tree root. Thankfully we copy the root item of the owning tree root, which382382+ * has its last_snapshot set to what we would have root_key.offset set to, so383383+ * return that if this is a TREE_RELOC root.384384+ */385385+static inline u64 btrfs_root_origin_generation(const struct btrfs_root *root)386386+{387387+ if (btrfs_root_id(root) == BTRFS_TREE_RELOC_OBJECTID)388388+ return btrfs_root_last_snapshot(&root->root_item);389389+ return root->root_key.offset;390390+}391391+392392+/*374393 * Structure that conveys information about an extent that is going to replace375394 * all the extents in a file range.376395 */
+3-3
fs/btrfs/extent-tree.c
···52855285 * reference to it.52865286 */52875287 generation = btrfs_node_ptr_generation(eb, slot);52885288- if (!wc->update_ref || generation <= root->root_key.offset)52885288+ if (!wc->update_ref || generation <= btrfs_root_origin_generation(root))52895289 return false;5290529052915291 /*···53405340 goto reada;5341534153425342 if (wc->stage == UPDATE_BACKREF &&53435343- generation <= root->root_key.offset)53435343+ generation <= btrfs_root_origin_generation(root))53445344 continue;5345534553465346 /* We don't lock the tree block, it's OK to be racy here */···56835683 * for the subtree56845684 */56855685 if (wc->stage == UPDATE_BACKREF &&56865686- generation <= root->root_key.offset) {56865686+ generation <= btrfs_root_origin_generation(root)) {56875687 wc->lookup_info = 1;56885688 return 1;56895689 }
+26-1
fs/btrfs/tree-checker.c
···15271527 dref_offset, fs_info->sectorsize);15281528 return -EUCLEAN;15291529 }15301530+ if (unlikely(btrfs_extent_data_ref_count(leaf, dref) == 0)) {15311531+ extent_err(leaf, slot,15321532+ "invalid data ref count, should have non-zero value");15331533+ return -EUCLEAN;15341534+ }15301535 inline_refs += btrfs_extent_data_ref_count(leaf, dref);15311536 break;15321537 /* Contains parent bytenr and ref count */···15421537 extent_err(leaf, slot,15431538 "invalid data parent bytenr, have %llu expect aligned to %u",15441539 inline_offset, fs_info->sectorsize);15401540+ return -EUCLEAN;15411541+ }15421542+ if (unlikely(btrfs_shared_data_ref_count(leaf, sref) == 0)) {15431543+ extent_err(leaf, slot,15441544+ "invalid shared data ref count, should have non-zero value");15451545 return -EUCLEAN;15461546 }15471547 inline_refs += btrfs_shared_data_ref_count(leaf, sref);···16211611{16221612 u32 expect_item_size = 0;1623161316241624- if (key->type == BTRFS_SHARED_DATA_REF_KEY)16141614+ if (key->type == BTRFS_SHARED_DATA_REF_KEY) {16151615+ struct btrfs_shared_data_ref *sref;16161616+16171617+ sref = btrfs_item_ptr(leaf, slot, struct btrfs_shared_data_ref);16181618+ if (unlikely(btrfs_shared_data_ref_count(leaf, sref) == 0)) {16191619+ extent_err(leaf, slot,16201620+ "invalid shared data backref count, should have non-zero value");16211621+ return -EUCLEAN;16221622+ }16231623+16251624 expect_item_size = sizeof(struct btrfs_shared_data_ref);16251625+ }1626162616271627 if (unlikely(btrfs_item_size(leaf, slot) != expect_item_size)) {16281628 generic_err(leaf, slot,···17071687 extent_err(leaf, slot,17081688 "invalid extent data backref offset, have %llu expect aligned to %u",17091689 offset, leaf->fs_info->sectorsize);16901690+ return -EUCLEAN;16911691+ }16921692+ if (unlikely(btrfs_extent_data_ref_count(leaf, dref) == 0)) {16931693+ extent_err(leaf, slot,16941694+ "invalid extent data backref count, should have non-zero value");17101695 return -EUCLEAN;17111696 }17121697 }
+38-39
fs/ceph/file.c
···10661066 if (ceph_inode_is_shutdown(inode))10671067 return -EIO;1068106810691069- if (!len)10691069+ if (!len || !i_size)10701070 return 0;10711071 /*10721072 * flush any page cache pages in this range. this···10861086 int num_pages;10871087 size_t page_off;10881088 bool more;10891089- int idx;10891089+ int idx = 0;10901090 size_t left;10911091 struct ceph_osd_req_op *op;10921092 u64 read_off = off;···11161116 len = read_off + read_len - off;11171117 more = len < iov_iter_count(to);1118111811191119+ op = &req->r_ops[0];11201120+ if (sparse) {11211121+ extent_cnt = __ceph_sparse_read_ext_count(inode, read_len);11221122+ ret = ceph_alloc_sparse_ext_map(op, extent_cnt);11231123+ if (ret) {11241124+ ceph_osdc_put_request(req);11251125+ break;11261126+ }11271127+ }11281128+11191129 num_pages = calc_pages_for(read_off, read_len);11201130 page_off = offset_in_page(off);11211131 pages = ceph_alloc_page_vector(num_pages, GFP_KERNEL);···1137112711381128 osd_req_op_extent_osd_data_pages(req, 0, pages, read_len,11391129 offset_in_page(read_off),11401140- false, false);11411141-11421142- op = &req->r_ops[0];11431143- if (sparse) {11441144- extent_cnt = __ceph_sparse_read_ext_count(inode, read_len);11451145- ret = ceph_alloc_sparse_ext_map(op, extent_cnt);11461146- if (ret) {11471147- ceph_osdc_put_request(req);11481148- break;11491149- }11501150- }11301130+ false, true);1151113111521132 ceph_osdc_start_request(osdc, req);11531133 ret = ceph_osdc_wait_request(osdc, req);···11601160 else if (ret == -ENOENT)11611161 ret = 0;1162116211631163- if (ret > 0 && IS_ENCRYPTED(inode)) {11631163+ if (ret < 0) {11641164+ ceph_osdc_put_request(req);11651165+ if (ret == -EBLOCKLISTED)11661166+ fsc->blocklisted = true;11671167+ break;11681168+ }11691169+11701170+ if (IS_ENCRYPTED(inode)) {11641171 int fret;1165117211661173 fret = ceph_fscrypt_decrypt_extents(inode, pages,···11931186 ret = min_t(ssize_t, fret, len);11941187 }1195118811961196- ceph_osdc_put_request(req);11971197-11981189 /* 
Short read but not EOF? Zero out the remainder. */11991199- if (ret >= 0 && ret < len && (off + ret < i_size)) {11901190+ if (ret < len && (off + ret < i_size)) {12001191 int zlen = min(len - ret, i_size - off - ret);12011192 int zoff = page_off + ret;12021193···12041199 ret += zlen;12051200 }1206120112071207- idx = 0;12081208- if (ret <= 0)12091209- left = 0;12101210- else if (off + ret > i_size)12111211- left = i_size - off;12021202+ if (off + ret > i_size)12031203+ left = (i_size > off) ? i_size - off : 0;12121204 else12131205 left = ret;12061206+12141207 while (left > 0) {12151208 size_t plen, copied;12161209···12241221 break;12251222 }12261223 }12271227- ceph_release_page_vector(pages, num_pages);1228122412291229- if (ret < 0) {12301230- if (ret == -EBLOCKLISTED)12311231- fsc->blocklisted = true;12321232- break;12331233- }12251225+ ceph_osdc_put_request(req);1234122612351227 if (off >= i_size || !more)12361228 break;···15511553 break;15521554 }1553155515561556+ op = &req->r_ops[0];15571557+ if (!write && sparse) {15581558+ extent_cnt = __ceph_sparse_read_ext_count(inode, size);15591559+ ret = ceph_alloc_sparse_ext_map(op, extent_cnt);15601560+ if (ret) {15611561+ ceph_osdc_put_request(req);15621562+ break;15631563+ }15641564+ }15651565+15541566 len = iter_get_bvecs_alloc(iter, size, &bvecs, &num_pages);15551567 if (len < 0) {15561568 ceph_osdc_put_request(req);···15691561 }15701562 if (len != size)15711563 osd_req_op_extent_update(req, 0, len);15641564+15651565+ osd_req_op_extent_osd_data_bvecs(req, 0, bvecs, num_pages, len);1572156615731567 /*15741568 * To simplify error handling, allow AIO when IO within i_size···16011591 PAGE_ALIGN(pos + len) - 1);1602159216031593 req->r_mtime = mtime;16041604- }16051605-16061606- osd_req_op_extent_osd_data_bvecs(req, 0, bvecs, num_pages, len);16071607- op = &req->r_ops[0];16081608- if (sparse) {16091609- extent_cnt = __ceph_sparse_read_ext_count(inode, size);16101610- ret = ceph_alloc_sparse_ext_map(op, 
extent_cnt);16111611- if (ret) {16121612- ceph_osdc_put_request(req);16131613- break;16141614- }16151594 }1616159516171596 if (aio_req) {
+4-5
fs/ceph/mds_client.c
···2800280028012801 if (pos < 0) {28022802 /*28032803- * A rename didn't occur, but somehow we didn't end up where28042804- * we thought we would. Throw a warning and try again.28032803+ * The path is longer than PATH_MAX and this function28042804+ * cannot ever succeed. Creating paths that long is28052805+ * possible with Ceph, but Linux cannot use them.28052806 */28062806- pr_warn_client(cl, "did not end path lookup where expected (pos = %d)\n",28072807- pos);28082808- goto retry;28072807+ return ERR_PTR(-ENAMETOOLONG);28092808 }2810280928112810 *pbase = base;
+2
fs/ceph/super.c
···431431432432 switch (token) {433433 case Opt_snapdirname:434434+ if (strlen(param->string) > NAME_MAX)435435+ return invalfc(fc, "snapdirname too long");434436 kfree(fsopt->snapdir_name);435437 fsopt->snapdir_name = param->string;436438 param->string = NULL;
···203203 struct erofs_device_info *dif;204204 int id, err = 0;205205206206- sbi->total_blocks = sbi->primarydevice_blocks;206206+ sbi->total_blocks = sbi->dif0.blocks;207207 if (!erofs_sb_has_device_table(sbi))208208 ondisk_extradevs = 0;209209 else···307307 sbi->sb_size);308308 goto out;309309 }310310- sbi->primarydevice_blocks = le32_to_cpu(dsb->blocks);310310+ sbi->dif0.blocks = le32_to_cpu(dsb->blocks);311311 sbi->meta_blkaddr = le32_to_cpu(dsb->meta_blkaddr);312312#ifdef CONFIG_EROFS_FS_XATTR313313 sbi->xattr_blkaddr = le32_to_cpu(dsb->xattr_blkaddr);···364364}365365366366enum {367367- Opt_user_xattr,368368- Opt_acl,369369- Opt_cache_strategy,370370- Opt_dax,371371- Opt_dax_enum,372372- Opt_device,373373- Opt_fsid,374374- Opt_domain_id,367367+ Opt_user_xattr, Opt_acl, Opt_cache_strategy, Opt_dax, Opt_dax_enum,368368+ Opt_device, Opt_fsid, Opt_domain_id, Opt_directio,375369 Opt_err376370};377371···392398 fsparam_string("device", Opt_device),393399 fsparam_string("fsid", Opt_fsid),394400 fsparam_string("domain_id", Opt_domain_id),401401+ fsparam_flag_no("directio", Opt_directio),395402 {}396403};397404···506511 errorfc(fc, "%s option not supported", erofs_fs_parameters[opt].name);507512 break;508513#endif514514+ case Opt_directio:515515+#ifdef CONFIG_EROFS_FS_BACKED_BY_FILE516516+ if (result.boolean)517517+ set_opt(&sbi->opt, DIRECT_IO);518518+ else519519+ clear_opt(&sbi->opt, DIRECT_IO);520520+#else521521+ errorfc(fc, "%s option not supported", erofs_fs_parameters[opt].name);522522+#endif523523+ break;509524 default:510525 return -ENOPARAM;511526 }···607602 return -EINVAL;608603 }609604610610- sbi->dax_dev = fs_dax_get_by_bdev(sb->s_bdev,611611- &sbi->dax_part_off,612612- NULL, NULL);605605+ sbi->dif0.dax_dev = fs_dax_get_by_bdev(sb->s_bdev,606606+ &sbi->dif0.dax_part_off, NULL, NULL);613607 }614608615609 err = erofs_read_superblock(sb);···631627 }632628633629 if (test_opt(&sbi->opt, DAX_ALWAYS)) {634634- if (!sbi->dax_dev) {630630+ if (!sbi->dif0.dax_dev) 
{635631 errorfc(fc, "DAX unsupported by block device. Turning off DAX.");636632 clear_opt(&sbi->opt, DAX_ALWAYS);637633 } else if (sbi->blkszbits != PAGE_SHIFT) {···707703 GET_TREE_BDEV_QUIET_LOOKUP : 0);708704#ifdef CONFIG_EROFS_FS_BACKED_BY_FILE709705 if (ret == -ENOTBLK) {706706+ struct file *file;707707+710708 if (!fc->source)711709 return invalf(fc, "No source specified");712712- sbi->fdev = filp_open(fc->source, O_RDONLY | O_LARGEFILE, 0);713713- if (IS_ERR(sbi->fdev))714714- return PTR_ERR(sbi->fdev);710710+ file = filp_open(fc->source, O_RDONLY | O_LARGEFILE, 0);711711+ if (IS_ERR(file))712712+ return PTR_ERR(file);713713+ sbi->dif0.file = file;715714716716- if (S_ISREG(file_inode(sbi->fdev)->i_mode) &&717717- sbi->fdev->f_mapping->a_ops->read_folio)715715+ if (S_ISREG(file_inode(sbi->dif0.file)->i_mode) &&716716+ sbi->dif0.file->f_mapping->a_ops->read_folio)718717 return get_tree_nodev(fc, erofs_fc_fill_super);719719- fput(sbi->fdev);720718 }721719#endif722720 return ret;···769763 kfree(devs);770764}771765766766+static void erofs_sb_free(struct erofs_sb_info *sbi)767767+{768768+ erofs_free_dev_context(sbi->devs);769769+ kfree(sbi->fsid);770770+ kfree(sbi->domain_id);771771+ if (sbi->dif0.file)772772+ fput(sbi->dif0.file);773773+ kfree(sbi);774774+}775775+772776static void erofs_fc_free(struct fs_context *fc)773777{774778 struct erofs_sb_info *sbi = fc->s_fs_info;775779776776- if (!sbi)777777- return;778778-779779- erofs_free_dev_context(sbi->devs);780780- kfree(sbi->fsid);781781- kfree(sbi->domain_id);782782- kfree(sbi);780780+ if (sbi) /* free here if an error occurs before transferring to sb */781781+ erofs_sb_free(sbi);783782}784783785784static const struct fs_context_operations erofs_context_ops = {···820809{821810 struct erofs_sb_info *sbi = EROFS_SB(sb);822811823823- if ((IS_ENABLED(CONFIG_EROFS_FS_ONDEMAND) && sbi->fsid) || sbi->fdev)812812+ if ((IS_ENABLED(CONFIG_EROFS_FS_ONDEMAND) && sbi->fsid) ||813813+ sbi->dif0.file)824814 
kill_anon_super(sb);825815 else826816 kill_block_super(sb);827827-828828- erofs_free_dev_context(sbi->devs);829829- fs_put_dax(sbi->dax_dev, NULL);817817+ fs_put_dax(sbi->dif0.dax_dev, NULL);830818 erofs_fscache_unregister_fs(sb);831831- kfree(sbi->fsid);832832- kfree(sbi->domain_id);833833- if (sbi->fdev)834834- fput(sbi->fdev);835835- kfree(sbi);819819+ erofs_sb_free(sbi);836820 sb->s_fs_info = NULL;837821}838822···953947 seq_puts(seq, ",dax=always");954948 if (test_opt(opt, DAX_NEVER))955949 seq_puts(seq, ",dax=never");950950+ if (erofs_is_fileio_mode(sbi) && test_opt(opt, DIRECT_IO))951951+ seq_puts(seq, ",directio");956952#ifdef CONFIG_EROFS_FS_ONDEMAND957953 if (sbi->fsid)958954 seq_printf(seq, ",fsid=%s", sbi->fsid);
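The new erofs `directio` option above is registered with `fsparam_flag_no()`, which matches both `directio` and `nodirectio`. A stand-alone sketch of that flag/no-flag matching (simplified model, not the kernel's fs_context parser):

```c
#include <string.h>

/* Returns 1 for "directio" (set_opt), 0 for "nodirectio" (clear_opt),
 * -1 if the option is not this parameter at all. */
int parse_directio(const char *opt)
{
	const char *name = "directio";

	if (strcmp(opt, name) == 0)
		return 1;		/* set_opt(&sbi->opt, DIRECT_IO)   */
	if (strncmp(opt, "no", 2) == 0 && strcmp(opt + 2, name) == 0)
		return 0;		/* clear_opt(&sbi->opt, DIRECT_IO) */
	return -1;			/* fall through to other parameters */
}
```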
+2-2
fs/erofs/zdata.c
···17921792 erofs_fscache_submit_bio(bio);17931793 else17941794 submit_bio(bio);17951795- if (memstall)17961796- psi_memstall_leave(&pflags);17971795 }17961796+ if (memstall)17971797+ psi_memstall_leave(&pflags);1798179817991799 /*18001800 * although background is preferred, no one is pending for submission.
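The zdata.c hunk above moves `psi_memstall_leave()` out of the bio-submission branch so it always mirrors the earlier `psi_memstall_enter()`. A minimal user-space model of that pairing (counter-based, not the kernel PSI API):

```c
int memstall_depth;		/* models the per-task memstall state */

void memstall_enter(void) { memstall_depth++; }
void memstall_leave(void) { memstall_depth--; }

/* Returns the depth after submission; correct pairing always
 * restores the depth seen on entry, whether or not a bio existed. */
int submit_chain(int memstall, int have_bio)
{
	if (memstall)
		memstall_enter();	/* psi_memstall_enter(&pflags) */
	(void)have_bio;			/* submit_bio()/fscache submit here */
	if (memstall)			/* leave is outside the bio branch */
		memstall_leave();	/* psi_memstall_leave(&pflags) */
	return memstall_depth;
}
```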
+4-3
fs/erofs/zutil.c
···230230 struct erofs_sb_info *const sbi = EROFS_SB(sb);231231232232 mutex_lock(&sbi->umount_mutex);233233- /* clean up all remaining pclusters in memory */234234- z_erofs_shrink_scan(sbi, ~0UL);235235-233233+ while (!xa_empty(&sbi->managed_pslots)) {234234+ z_erofs_shrink_scan(sbi, ~0UL);235235+ cond_resched();236236+ }236237 spin_lock(&erofs_sb_list_lock);237238 list_del(&sbi->list);238239 spin_unlock(&erofs_sb_list_lock);
···971971 start = count = 0;972972 left = le32_to_cpu(alloc->id1.bitmap1.i_total);973973974974- while ((bit_off = ocfs2_find_next_zero_bit(bitmap, left, start)) <975975- left) {976976- if (bit_off == start) {974974+ while (1) {975975+ bit_off = ocfs2_find_next_zero_bit(bitmap, left, start);976976+ if ((bit_off < left) && (bit_off == start)) {977977 count++;978978 start++;979979 continue;···998998 }999999 }1000100010011001+ if (bit_off >= left)10021002+ break;10011003 count = 1;10021004 start = bit_off + 1;10031003- }10041004-10051005- /* clear the contiguous bits until the end boundary */10061006- if (count) {10071007- blkno = la_start_blk +10081008- ocfs2_clusters_to_blocks(osb->sb,10091009- start - count);10101010-10111011- trace_ocfs2_sync_local_to_main_free(10121012- count, start - count,10131013- (unsigned long long)la_start_blk,10141014- (unsigned long long)blkno);10151015-10161016- status = ocfs2_release_clusters(handle,10171017- main_bm_inode,10181018- main_bm_bh, blkno,10191019- count);10201020- if (status < 0)10211021- mlog_errno(status);10221005 }1023100610241007bail:
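The ocfs2 fix above folds the trailing-run release into the loop: the run counter is flushed before the `bit_off >= left` break, so no separate post-loop cleanup is needed. The same loop shape over a small 32-bit bitmap (helper names here are illustrative, not the kernel bitops):

```c
/* Return the first clear bit at or after `start`, or `size`. */
unsigned int find_next_zero_bit32(unsigned int map, unsigned int size,
				  unsigned int start)
{
	for (unsigned int i = start; i < size; i++)
		if (!((map >> i) & 1))
			return i;
	return size;
}

/* Count contiguous zero runs, releasing the final run inside the
 * loop exactly as the restructured ocfs2 loop does. */
int count_zero_runs(unsigned int map, unsigned int size)
{
	unsigned int start = 0, count = 0, bit_off;
	int runs = 0;

	while (1) {
		bit_off = find_next_zero_bit32(map, size, start);
		if (bit_off < size && bit_off == start) {
			count++;		/* extend the current run */
			start++;
			continue;
		}
		if (count)
			runs++;			/* "release" the finished run */
		if (bit_off >= size)
			break;			/* trailing run already handled */
		count = 1;			/* start a new run */
		start = bit_off + 1;
	}
	return runs;
}
```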
-1
fs/smb/client/Kconfig
···22config CIFS33 tristate "SMB3 and CIFS support (advanced network filesystem)"44 depends on INET55- select NETFS_SUPPORT65 select NLS76 select NLS_UCS2_UTILS87 select CRYPTO
+1-1
fs/smb/client/cifsfs.c
···398398 cifs_inode = alloc_inode_sb(sb, cifs_inode_cachep, GFP_KERNEL);399399 if (!cifs_inode)400400 return NULL;401401- cifs_inode->cifsAttrs = 0x20; /* default */401401+ cifs_inode->cifsAttrs = ATTR_ARCHIVE; /* default */402402 cifs_inode->time = 0;403403 /*404404 * Until the file is open and we have gotten oplock info back from the
+26-10
fs/smb/client/connect.c
···987987 msleep(125);988988 if (cifs_rdma_enabled(server))989989 smbd_destroy(server);990990+990991 if (server->ssocket) {991992 sock_release(server->ssocket);992993 server->ssocket = NULL;994994+995995+ /* Release netns reference for the socket. */996996+ put_net(cifs_net_ns(server));993997 }994998995999 if (!list_empty(&server->pending_mid_q)) {···10411037 */10421038 }1043103910401040+ /* Release netns reference for this server. */10441041 put_net(cifs_net_ns(server));10451042 kfree(server->leaf_fullpath);10461043 kfree(server);···1718171317191714 tcp_ses->ops = ctx->ops;17201715 tcp_ses->vals = ctx->vals;17161716+17171717+ /* Grab netns reference for this server. */17211718 cifs_set_net_ns(tcp_ses, get_net(current->nsproxy->net_ns));1722171917231720 tcp_ses->conn_id = atomic_inc_return(&tcpSesNextId);···18511844out_err_crypto_release:18521845 cifs_crypto_secmech_release(tcp_ses);1853184618471847+ /* Release netns reference for this server. */18541848 put_net(cifs_net_ns(tcp_ses));1855184918561850out_err:···18601852 cifs_put_tcp_session(tcp_ses->primary_server, false);18611853 kfree(tcp_ses->hostname);18621854 kfree(tcp_ses->leaf_fullpath);18631863- if (tcp_ses->ssocket)18551855+ if (tcp_ses->ssocket) {18641856 sock_release(tcp_ses->ssocket);18571857+ put_net(cifs_net_ns(tcp_ses));18581858+ }18651859 kfree(tcp_ses);18661860 }18671861 return ERR_PTR(rc);···31413131 socket = server->ssocket;31423132 } else {31433133 struct net *net = cifs_net_ns(server);31443144- struct sock *sk;3145313431463146- rc = __sock_create(net, sfamily, SOCK_STREAM,31473147- IPPROTO_TCP, &server->ssocket, 1);31353135+ rc = sock_create_kern(net, sfamily, SOCK_STREAM, IPPROTO_TCP, &server->ssocket);31483136 if (rc < 0) {31493137 cifs_server_dbg(VFS, "Error %d creating socket\n", rc);31503138 return rc;31513139 }3152314031533153- sk = server->ssocket->sk;31543154- __netns_tracker_free(net, &sk->ns_tracker, false);31553155- sk->sk_net_refcnt = 1;31563156- get_net_track(net, &sk->ns_tracker, 
GFP_KERNEL);31573157- sock_inuse_add(net, 1);31413141+ /*31423142+ * Grab netns reference for the socket.31433143+ *31443144+ * It'll be released here, on error, or in clean_demultiplex_info() upon server31453145+ * teardown.31463146+ */31473147+ get_net(net);3158314831593149 /* BB other socket options to set KEEPALIVE, NODELAY? */31603150 cifs_dbg(FYI, "Socket created\n");···31683158 }3169315931703160 rc = bind_socket(server);31713171- if (rc < 0)31613161+ if (rc < 0) {31623162+ put_net(cifs_net_ns(server));31723163 return rc;31643164+ }3173316531743166 /*31753167 * Eventually check for other socket options to change from···32083196 if (rc < 0) {32093197 cifs_dbg(FYI, "Error %d connecting to server\n", rc);32103198 trace_smb3_connect_err(server->hostname, server->conn_id, &server->dstaddr, rc);31993199+ put_net(cifs_net_ns(server));32113200 sock_release(socket);32123201 server->ssocket = NULL;32133202 return rc;···32163203 trace_smb3_connect_done(server->hostname, server->conn_id, &server->dstaddr);32173204 if (sport == htons(RFC1001_PORT))32183205 rc = ip_rfc1001_connect(server);32063206+32073207+ if (rc < 0)32083208+ put_net(cifs_net_ns(server));3219320932203210 return rc;32213211}
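The connect.c changes above pair one `get_net()` per created socket with a `put_net()` on every error path and on teardown. A toy refcount model of that discipline (not the kernel netns API):

```c
struct ns { int refs; };

struct ns netns = { 1 };	/* one reference already held by the server */

void ns_get(struct ns *n) { n->refs++; }
void ns_put(struct ns *n) { n->refs--; }

/* Returns 0 on success (reference held until teardown), -1 on a
 * failed bind, in which case the reference is dropped immediately. */
int socket_up(struct ns *n, int bind_fails)
{
	ns_get(n);		/* get_net() right after sock_create_kern() */
	if (bind_fails) {
		ns_put(n);	/* every error path must drop it */
		return -1;
	}
	return 0;
}

/* Teardown; returns the remaining count for inspection. */
int socket_down(struct ns *n)
{
	ns_put(n);		/* put_net() in clean_demultiplex_info() */
	return n->refs;
}
```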
···319319 init_smb2_max_write_size(req->smb2_max_write);320320 if (req->smb2_max_trans)321321 init_smb2_max_trans_size(req->smb2_max_trans);322322- if (req->smb2_max_credits)322322+ if (req->smb2_max_credits) {323323 init_smb2_max_credits(req->smb2_max_credits);324324+ server_conf.max_inflight_req =325325+ req->smb2_max_credits;326326+ }324327 if (req->smbd_max_io_size)325328 init_smbd_max_io_size(req->smbd_max_io_size);326329
···216216217217#endif /* __KERNEL__ */218218219219-/*220220- * Force the compiler to emit 'sym' as a symbol, so that we can reference221221- * it from inline assembler. Necessary in case 'sym' could be inlined222222- * otherwise, or eliminated entirely due to lack of references that are223223- * visible to the compiler.224224- */225225-#define ___ADDRESSABLE(sym, __attrs) \226226- static void * __used __attrs \227227- __UNIQUE_ID(__PASTE(__addressable_,sym)) = (void *)(uintptr_t)&sym;228228-#define __ADDRESSABLE(sym) \229229- ___ADDRESSABLE(sym, __section(".discard.addressable"))230230-231219/**232220 * offset_to_ptr - convert a relative memory offset to an absolute pointer233221 * @off: the address of the 32-bit offset value···226238}227239228240#endif /* __ASSEMBLY__ */241241+242242+#ifdef CONFIG_64BIT243243+#define ARCH_SEL(a,b) a244244+#else245245+#define ARCH_SEL(a,b) b246246+#endif247247+248248+/*249249+ * Force the compiler to emit 'sym' as a symbol, so that we can reference250250+ * it from inline assembler. Necessary in case 'sym' could be inlined251251+ * otherwise, or eliminated entirely due to lack of references that are252252+ * visible to the compiler.253253+ */254254+#define ___ADDRESSABLE(sym, __attrs) \255255+ static void * __used __attrs \256256+ __UNIQUE_ID(__PASTE(__addressable_,sym)) = (void *)(uintptr_t)&sym;257257+258258+#define __ADDRESSABLE(sym) \259259+ ___ADDRESSABLE(sym, __section(".discard.addressable"))260260+261261+#define __ADDRESSABLE_ASM(sym) \262262+ .pushsection .discard.addressable,"aw"; \263263+ .align ARCH_SEL(8,4); \264264+ ARCH_SEL(.quad, .long) __stringify(sym); \265265+ .popsection;266266+267267+#define __ADDRESSABLE_ASM_STR(sym) __stringify(__ADDRESSABLE_ASM(sym))229268230269#ifdef __CHECKER__231270#define __BUILD_BUG_ON_ZERO_MSG(e, msg) (0)
+13-1
include/linux/fortify-string.h
···616616 return false;617617}618618619619+/*620620+ * To work around what seems to be an optimizer bug, the macro arguments621621+ * need to have const copies or the values end up changed by the time they622622+ * reach fortify_warn_once(). See commit 6f7630b1b5bc ("fortify: Capture623623+ * __bos() results in const temp vars") for more details.624624+ */619625#define __fortify_memcpy_chk(p, q, size, p_size, q_size, \620626 p_size_field, q_size_field, op) ({ \621627 const size_t __fortify_size = (size_t)(size); \···629623 const size_t __q_size = (q_size); \630624 const size_t __p_size_field = (p_size_field); \631625 const size_t __q_size_field = (q_size_field); \626626+ /* Keep a mutable version of the size for the final copy. */ \627627+ size_t __copy_size = __fortify_size; \632628 fortify_warn_once(fortify_memcpy_chk(__fortify_size, __p_size, \633629 __q_size, __p_size_field, \634630 __q_size_field, FORTIFY_FUNC_ ##op), \···638630 __fortify_size, \639631 "field \"" #p "\" at " FILE_LINE, \640632 __p_size_field); \641641- __underlying_##op(p, q, __fortify_size); \633633+ /* Hide only the run-time size from value range tracking to */ \634634+ /* silence compile-time false positive bounds warnings. */ \635635+ if (!__builtin_constant_p(__copy_size)) \636636+ OPTIMIZER_HIDE_VAR(__copy_size); \637637+ __underlying_##op(p, q, __copy_size); \642638})643639644640/*
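The fortify-string.h hunk hides only run-time sizes from value-range tracking via `OPTIMIZER_HIDE_VAR()`, keeping full compile-time bounds checks for constants. A GCC/Clang user-space model of that macro (an empty asm with a `"+r"` constraint, so the compiler must assume the variable may change):

```c
#include <stddef.h>

/* User-space stand-in for the kernel's OPTIMIZER_HIDE_VAR(). */
#define HIDE_VAR(var) __asm__ volatile("" : "+r"(var))

size_t checked_copy_size(size_t size)
{
	size_t copy_size = size;

	/* Mirror __fortify_memcpy_chk(): constants keep full bounds
	 * checking; only run-time values are hidden. The value itself
	 * is never changed. */
	if (!__builtin_constant_p(copy_size))
		HIDE_VAR(copy_size);
	return copy_size;
}
```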
···15591559 void *channel;15601560 void (*util_cb)(void *);15611561 int (*util_init)(struct hv_util_service *);15621562+ int (*util_init_transport)(void);15621563 void (*util_deinit)(void);15631564 int (*util_pre_suspend)(void);15641565 int (*util_pre_resume)(void);
+1-3
include/linux/io_uring.h
···15151616static inline void io_uring_files_cancel(void)1717{1818- if (current->io_uring) {1919- io_uring_unreg_ringfd();1818+ if (current->io_uring)2019 __io_uring_cancel(false);2121- }2220}2321static inline void io_uring_task_cancel(void)2422{
···3131#include <linux/kasan.h>3232#include <linux/memremap.h>3333#include <linux/slab.h>3434+#include <linux/cacheinfo.h>34353536struct mempolicy;3637struct anon_vma;···30113010 lruvec_stat_sub_folio(folio, NR_PAGETABLE);30123011}3013301230143014-pte_t *__pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp);30133013+pte_t *___pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp);30143014+static inline pte_t *__pte_offset_map(pmd_t *pmd, unsigned long addr,30153015+ pmd_t *pmdvalp)30163016+{30173017+ pte_t *pte;30183018+30193019+ __cond_lock(RCU, pte = ___pte_offset_map(pmd, addr, pmdvalp));30203020+ return pte;30213021+}30153022static inline pte_t *pte_offset_map(pmd_t *pmd, unsigned long addr)30163023{30173024 return __pte_offset_map(pmd, addr, NULL);···30323023{30333024 pte_t *pte;3034302530353035- __cond_lock(*ptlp, pte = __pte_offset_map_lock(mm, pmd, addr, ptlp));30263026+ __cond_lock(RCU, __cond_lock(*ptlp,30273027+ pte = __pte_offset_map_lock(mm, pmd, addr, ptlp)));30363028 return pte;30373029}30383030···41844174 return 0;41854175}41864176#endif41774177+41784178+/*41794179+ * user_alloc_needs_zeroing checks if a user folio from page allocator needs to41804180+ * be zeroed or not.41814181+ */41824182+static inline bool user_alloc_needs_zeroing(void)41834183+{41844184+ /*41854185+ * for user folios, arch with cache aliasing requires cache flush and41864186+ * arc changes folio->flags to make icache coherent with dcache, so41874187+ * always return false to make caller use41884188+ * clear_user_page()/clear_user_highpage().41894189+ */41904190+ return cpu_dcache_is_aliasing() || cpu_icache_is_aliasing() ||41914191+ !static_branch_maybe(CONFIG_INIT_ON_ALLOC_DEFAULT_ON,41924192+ &init_on_alloc);41934193+}4187419441884195int arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *status);41894196int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status);
+2-10
include/linux/page-flags.h
···862862 ClearPageHead(page);863863}864864FOLIO_FLAG(large_rmappable, FOLIO_SECOND_PAGE)865865-FOLIO_TEST_FLAG(partially_mapped, FOLIO_SECOND_PAGE)866866-/*867867- * PG_partially_mapped is protected by deferred_split split_queue_lock,868868- * so its safe to use non-atomic set/clear.869869- */870870-__FOLIO_SET_FLAG(partially_mapped, FOLIO_SECOND_PAGE)871871-__FOLIO_CLEAR_FLAG(partially_mapped, FOLIO_SECOND_PAGE)865865+FOLIO_FLAG(partially_mapped, FOLIO_SECOND_PAGE)872866#else873867FOLIO_FLAG_FALSE(large_rmappable)874874-FOLIO_TEST_FLAG_FALSE(partially_mapped)875875-__FOLIO_SET_FLAG_NOOP(partially_mapped)876876-__FOLIO_CLEAR_FLAG_NOOP(partially_mapped)868868+FOLIO_FLAG_FALSE(partially_mapped)877869#endif878870879871#define PG_head_mask ((1UL << PG_head))
···160160161161#ifdef CONFIG_HAVE_STATIC_CALL_INLINE162162163163+extern int static_call_initialized;164164+163165extern int __init static_call_init(void);164166165167extern void static_call_force_reinit(void);···227225228226#elif defined(CONFIG_HAVE_STATIC_CALL)229227228228+#define static_call_initialized 0229229+230230static inline int static_call_init(void) { return 0; }231231232232#define DEFINE_STATIC_CALL(name, _func) \···284280 EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))285281286282#else /* Generic implementation */283283+284284+#define static_call_initialized 0287285288286static inline int static_call_init(void) { return 0; }289287
+5-1
include/linux/trace_events.h
···273273 const char *name;274274 const int size;275275 const int align;276276- const int is_signed;276276+ const unsigned int is_signed:1;277277+ unsigned int needs_test:1;277278 const int filter_type;278279 const int len;279280 };···325324 TRACE_EVENT_FL_EPROBE_BIT,326325 TRACE_EVENT_FL_FPROBE_BIT,327326 TRACE_EVENT_FL_CUSTOM_BIT,327327+ TRACE_EVENT_FL_TEST_STR_BIT,328328};329329330330/*···342340 * CUSTOM - Event is a custom event (to be attached to an exsiting tracepoint)343341 * This is set when the custom event has not been attached344342 * to a tracepoint yet, then it is cleared when it is.343343+ * TEST_STR - The event has a "%s" that points to a string outside the event345344 */346345enum {347346 TRACE_EVENT_FL_CAP_ANY = (1 << TRACE_EVENT_FL_CAP_ANY_BIT),···355352 TRACE_EVENT_FL_EPROBE = (1 << TRACE_EVENT_FL_EPROBE_BIT),356353 TRACE_EVENT_FL_FPROBE = (1 << TRACE_EVENT_FL_FPROBE_BIT),357354 TRACE_EVENT_FL_CUSTOM = (1 << TRACE_EVENT_FL_CUSTOM_BIT),355355+ TRACE_EVENT_FL_TEST_STR = (1 << TRACE_EVENT_FL_TEST_STR_BIT),358356};359357360358#define TRACE_EVENT_FL_UKPROBE (TRACE_EVENT_FL_KPROBE | TRACE_EVENT_FL_UPROBE)
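The trace_events.h change above packs `is_signed` and the new `needs_test` into one-bit fields. A stand-alone model of that layout, normalizing any nonzero input to 1 on store (names mirror the kernel struct but this is illustrative):

```c
struct event_field {
	const char *name;
	unsigned int is_signed:1;
	unsigned int needs_test:1;
};

struct event_field make_field(const char *name, int is_signed,
			      int needs_test)
{
	struct event_field f = {
		.name = name,
		.is_signed = is_signed ? 1 : 0,
		.needs_test = needs_test ? 1 : 0,
	};
	return f;
}
```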
···414414 if (ctx->flags & IORING_SETUP_SINGLE_ISSUER &&415415 current != ctx->submitter_task)416416 return -EEXIST;417417+ /* limited to DEFER_TASKRUN for now */418418+ if (!(ctx->flags & IORING_SETUP_DEFER_TASKRUN))419419+ return -EINVAL;417420 if (copy_from_user(&p, arg, sizeof(p)))418421 return -EFAULT;419422 if (p.flags & ~RESIZE_FLAGS)
+20-20
io_uring/timeout.c
···7474 if (!io_timeout_finish(timeout, data)) {7575 if (io_req_post_cqe(req, -ETIME, IORING_CQE_F_MORE)) {7676 /* re-arm timer */7777- spin_lock_irq(&ctx->timeout_lock);7777+ raw_spin_lock_irq(&ctx->timeout_lock);7878 list_add(&timeout->list, ctx->timeout_list.prev);7979 hrtimer_start(&data->timer, timespec64_to_ktime(data->ts), data->mode);8080- spin_unlock_irq(&ctx->timeout_lock);8080+ raw_spin_unlock_irq(&ctx->timeout_lock);8181 return;8282 }8383 }···109109 u32 seq;110110 struct io_timeout *timeout, *tmp;111111112112- spin_lock_irq(&ctx->timeout_lock);112112+ raw_spin_lock_irq(&ctx->timeout_lock);113113 seq = ctx->cached_cq_tail - atomic_read(&ctx->cq_timeouts);114114115115 list_for_each_entry_safe(timeout, tmp, &ctx->timeout_list, list) {···134134 io_kill_timeout(req, 0);135135 }136136 ctx->cq_last_tm_flush = seq;137137- spin_unlock_irq(&ctx->timeout_lock);137137+ raw_spin_unlock_irq(&ctx->timeout_lock);138138}139139140140static void io_req_tw_fail_links(struct io_kiocb *link, struct io_tw_state *ts)···200200 } else if (req->flags & REQ_F_LINK_TIMEOUT) {201201 struct io_ring_ctx *ctx = req->ctx;202202203203- spin_lock_irq(&ctx->timeout_lock);203203+ raw_spin_lock_irq(&ctx->timeout_lock);204204 link = io_disarm_linked_timeout(req);205205- spin_unlock_irq(&ctx->timeout_lock);205205+ raw_spin_unlock_irq(&ctx->timeout_lock);206206 if (link)207207 io_req_queue_tw_complete(link, -ECANCELED);208208 }···238238 struct io_ring_ctx *ctx = req->ctx;239239 unsigned long flags;240240241241- spin_lock_irqsave(&ctx->timeout_lock, flags);241241+ raw_spin_lock_irqsave(&ctx->timeout_lock, flags);242242 list_del_init(&timeout->list);243243 atomic_set(&req->ctx->cq_timeouts,244244 atomic_read(&req->ctx->cq_timeouts) + 1);245245- spin_unlock_irqrestore(&ctx->timeout_lock, flags);245245+ raw_spin_unlock_irqrestore(&ctx->timeout_lock, flags);246246247247 if (!(data->flags & IORING_TIMEOUT_ETIME_SUCCESS))248248 req_set_fail(req);···285285{286286 struct io_kiocb *req;287287288288- 
spin_lock_irq(&ctx->timeout_lock);288288+ raw_spin_lock_irq(&ctx->timeout_lock);289289 req = io_timeout_extract(ctx, cd);290290- spin_unlock_irq(&ctx->timeout_lock);290290+ raw_spin_unlock_irq(&ctx->timeout_lock);291291292292 if (IS_ERR(req))293293 return PTR_ERR(req);···330330 struct io_ring_ctx *ctx = req->ctx;331331 unsigned long flags;332332333333- spin_lock_irqsave(&ctx->timeout_lock, flags);333333+ raw_spin_lock_irqsave(&ctx->timeout_lock, flags);334334 prev = timeout->head;335335 timeout->head = NULL;336336···345345 }346346 list_del(&timeout->list);347347 timeout->prev = prev;348348- spin_unlock_irqrestore(&ctx->timeout_lock, flags);348348+ raw_spin_unlock_irqrestore(&ctx->timeout_lock, flags);349349350350 req->io_task_work.func = io_req_task_link_timeout;351351 io_req_task_work_add(req);···472472 } else {473473 enum hrtimer_mode mode = io_translate_timeout_mode(tr->flags);474474475475- spin_lock_irq(&ctx->timeout_lock);475475+ raw_spin_lock_irq(&ctx->timeout_lock);476476 if (tr->ltimeout)477477 ret = io_linked_timeout_update(ctx, tr->addr, &tr->ts, mode);478478 else479479 ret = io_timeout_update(ctx, tr->addr, &tr->ts, mode);480480- spin_unlock_irq(&ctx->timeout_lock);480480+ raw_spin_unlock_irq(&ctx->timeout_lock);481481 }482482483483 if (ret < 0)···572572 struct list_head *entry;573573 u32 tail, off = timeout->off;574574575575- spin_lock_irq(&ctx->timeout_lock);575575+ raw_spin_lock_irq(&ctx->timeout_lock);576576577577 /*578578 * sqe->off holds how many events that need to occur for this···611611 list_add(&timeout->list, entry);612612 data->timer.function = io_timeout_fn;613613 hrtimer_start(&data->timer, timespec64_to_ktime(data->ts), data->mode);614614- spin_unlock_irq(&ctx->timeout_lock);614614+ raw_spin_unlock_irq(&ctx->timeout_lock);615615 return IOU_ISSUE_SKIP_COMPLETE;616616}617617···620620 struct io_timeout *timeout = io_kiocb_to_cmd(req, struct io_timeout);621621 struct io_ring_ctx *ctx = req->ctx;622622623623- 
spin_lock_irq(&ctx->timeout_lock);623623+ raw_spin_lock_irq(&ctx->timeout_lock);624624 /*625625 * If the back reference is NULL, then our linked request finished626626 * before we got a chance to setup the timer···633633 data->mode);634634 list_add_tail(&timeout->list, &ctx->ltimeout_list);635635 }636636- spin_unlock_irq(&ctx->timeout_lock);636636+ raw_spin_unlock_irq(&ctx->timeout_lock);637637 /* drop submission reference */638638 io_put_req(req);639639}···668668 * timeout_lockfirst to keep locking ordering.669669 */670670 spin_lock(&ctx->completion_lock);671671- spin_lock_irq(&ctx->timeout_lock);671671+ raw_spin_lock_irq(&ctx->timeout_lock);672672 list_for_each_entry_safe(timeout, tmp, &ctx->timeout_list, list) {673673 struct io_kiocb *req = cmd_to_io_kiocb(timeout);674674···676676 io_kill_timeout(req, -ECANCELED))677677 canceled++;678678 }679679- spin_unlock_irq(&ctx->timeout_lock);679679+ raw_spin_unlock_irq(&ctx->timeout_lock);680680 spin_unlock(&ctx->completion_lock);681681 return canceled != 0;682682}
+5-1
kernel/bpf/verifier.c
···2128121281 * changed in some incompatible and hard to support2128221282 * way, it's fine to back out this inlining logic2128321283 */2128421284+#ifdef CONFIG_SMP2128421285 insn_buf[0] = BPF_MOV32_IMM(BPF_REG_0, (u32)(unsigned long)&pcpu_hot.cpu_number);2128521286 insn_buf[1] = BPF_MOV64_PERCPU_REG(BPF_REG_0, BPF_REG_0);2128621287 insn_buf[2] = BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_0, 0);2128721288 cnt = 3;2128821288-2128921289+#else2129021290+ insn_buf[0] = BPF_ALU32_REG(BPF_XOR, BPF_REG_0, BPF_REG_0);2129121291+ cnt = 1;2129221292+#endif2128921293 new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);2129021294 if (!new_prog)2129121295 return -ENOMEM;
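On `!CONFIG_SMP` the verifier patch above inlines `bpf_get_smp_processor_id()` as a single `BPF_ALU32_REG(BPF_XOR, R0, R0)`, i.e. xor-self, which always yields CPU 0. In plain C terms:

```c
/* Model of the one-instruction !SMP inline: r0 ^= r0. */
unsigned int bpf_cpu_id_up(unsigned int r0)
{
	r0 ^= r0;
	return r0;	/* always 0: a UP kernel has only CPU 0 */
}
```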
···1515extern struct static_call_tramp_key __start_static_call_tramp_key[],1616 __stop_static_call_tramp_key[];17171818-static int static_call_initialized;1818+int static_call_initialized;19192020/*2121 * Must be called before early_initcall() to be effective.
+7-1
kernel/trace/fgraph.c
···12151215static int start_graph_tracing(void)12161216{12171217 unsigned long **ret_stack_list;12181218- int ret;12181218+ int ret, cpu;1219121912201220 ret_stack_list = kcalloc(FTRACE_RETSTACK_ALLOC_SIZE,12211221 sizeof(*ret_stack_list), GFP_KERNEL);1222122212231223 if (!ret_stack_list)12241224 return -ENOMEM;12251225+12261226+ /* The cpu_boot init_task->ret_stack will never be freed */12271227+ for_each_online_cpu(cpu) {12281228+ if (!idle_task(cpu)->ret_stack)12291229+ ftrace_graph_init_idle_task(idle_task(cpu), cpu);12301230+ }1225123112261232 do {12271233 ret = alloc_retstack_tasklist(ret_stack_list);
···36113611}3612361236133613/* Returns true if the string is safe to dereference from an event */36143614-static bool trace_safe_str(struct trace_iterator *iter, const char *str,36153615- bool star, int len)36143614+static bool trace_safe_str(struct trace_iterator *iter, const char *str)36163615{36173616 unsigned long addr = (unsigned long)str;36183617 struct trace_event *trace_event;36193618 struct trace_event_call *event;36203620-36213621- /* Ignore strings with no length */36223622- if (star && !len)36233623- return true;3624361936253620 /* OK if part of the event data */36263621 if ((addr >= (unsigned long)iter->ent) &&···36563661 return false;36573662}3658366336593659-static DEFINE_STATIC_KEY_FALSE(trace_no_verify);36603660-36613661-static int test_can_verify_check(const char *fmt, ...)36623662-{36633663- char buf[16];36643664- va_list ap;36653665- int ret;36663666-36673667- /*36683668- * The verifier is dependent on vsnprintf() modifies the va_list36693669- * passed to it, where it is sent as a reference. Some architectures36703670- * (like x86_32) passes it by value, which means that vsnprintf()36713671- * does not modify the va_list passed to it, and the verifier36723672- * would then need to be able to understand all the values that36733673- * vsnprintf can use. 
If it is passed by value, then the verifier36743674- * is disabled.36753675- */36763676- va_start(ap, fmt);36773677- vsnprintf(buf, 16, "%d", ap);36783678- ret = va_arg(ap, int);36793679- va_end(ap);36803680-36813681- return ret;36823682-}36833683-36843684-static void test_can_verify(void)36853685-{36863686- if (!test_can_verify_check("%d %d", 0, 1)) {36873687- pr_info("trace event string verifier disabled\n");36883688- static_branch_inc(&trace_no_verify);36893689- }36903690-}36913691-36923664/**36933693- * trace_check_vprintf - Check dereferenced strings while writing to the seq buffer36653665+ * ignore_event - Check dereferenced fields while writing to the seq buffer36943666 * @iter: The iterator that holds the seq buffer and the event being printed36953695- * @fmt: The format used to print the event36963696- * @ap: The va_list holding the data to print from @fmt.36973667 *36983698- * This writes the data into the @iter->seq buffer using the data from36993699- * @fmt and @ap. If the format has a %s, then the source of the string37003700- * is examined to make sure it is safe to print, otherwise it will37013701- * warn and print "[UNSAFE MEMORY]" in place of the dereferenced string37023702- * pointer.36683668+ * At boot up, test_event_printk() will flag any event that dereferences36693669+ * a string with "%s" that does exist in the ring buffer. It may still36703670+ * be valid, as the string may point to a static string in the kernel36713671+ * rodata that never gets freed. But if the string pointer is pointing36723672+ * to something that was allocated, there's a chance that it can be freed36733673+ * by the time the user reads the trace. 
This would cause a bad memory36743674+ * access by the kernel and possibly crash the system.36753675+ *36763676+ * This function will check if the event has any fields flagged as needing36773677+ * to be checked at runtime and perform those checks.36783678+ *36793679+ * If it is found that a field is unsafe, it will write into the @iter->seq36803680+ * a message stating what was found to be unsafe.36813681+ *36823682+ * @return: true if the event is unsafe and should be ignored,36833683+ * false otherwise.37033684 */37043704-void trace_check_vprintf(struct trace_iterator *iter, const char *fmt,37053705- va_list ap)36853685+bool ignore_event(struct trace_iterator *iter)37063686{37073707- long text_delta = 0;37083708- long data_delta = 0;37093709- const char *p = fmt;37103710- const char *str;37113711- bool good;37123712- int i, j;36873687+ struct ftrace_event_field *field;36883688+ struct trace_event *trace_event;36893689+ struct trace_event_call *event;36903690+ struct list_head *head;36913691+ struct trace_seq *seq;36923692+ const void *ptr;3713369337143714- if (WARN_ON_ONCE(!fmt))37153715- return;36943694+ trace_event = ftrace_find_event(iter->ent->type);3716369537173717- if (static_branch_unlikely(&trace_no_verify))37183718- goto print;36963696+ seq = &iter->seq;3719369737203720- /*37213721- * When the kernel is booted with the tp_printk command line37223722- * parameter, trace events go directly through to printk().37233723- * It also is checked by this function, but it does not37243724- * have an associated trace_array (tr) for it.37253725- */37263726- if (iter->tr) {37273727- text_delta = iter->tr->text_delta;37283728- data_delta = iter->tr->data_delta;36983698+ if (!trace_event) {36993699+ trace_seq_printf(seq, "EVENT ID %d NOT FOUND?\n", iter->ent->type);37003700+ return true;37293701 }3730370237313731- /* Don't bother checking when doing a ftrace_dump() */37323732- if (iter->fmt == static_fmt_buf)37333733- goto print;37033703+ event = 
container_of(trace_event, struct trace_event_call, event);37043704+ if (!(event->flags & TRACE_EVENT_FL_TEST_STR))37053705+ return false;3734370637353735- while (*p) {37363736- bool star = false;37373737- int len = 0;37073707+ head = trace_get_fields(event);37083708+ if (!head) {37093709+ trace_seq_printf(seq, "FIELDS FOR EVENT '%s' NOT FOUND?\n",37103710+ trace_event_name(event));37113711+ return true;37123712+ }3738371337393739- j = 0;37143714+ /* Offsets are from the iter->ent that points to the raw event */37153715+ ptr = iter->ent;3740371637413741- /*37423742- * We only care about %s and variants37433743- * as well as %p[sS] if delta is non-zero37443744- */37453745- for (i = 0; p[i]; i++) {37463746- if (i + 1 >= iter->fmt_size) {37473747- /*37483748- * If we can't expand the copy buffer,37493749- * just print it.37503750- */37513751- if (!trace_iter_expand_format(iter))37523752- goto print;37533753- }37173717+ list_for_each_entry(field, head, link) {37183718+ const char *str;37193719+ bool good;3754372037553755- if (p[i] == '\\' && p[i+1]) {37563756- i++;37573757- continue;37583758- }37593759- if (p[i] == '%') {37603760- /* Need to test cases like %08.*s */37613761- for (j = 1; p[i+j]; j++) {37623762- if (isdigit(p[i+j]) ||37633763- p[i+j] == '.')37643764- continue;37653765- if (p[i+j] == '*') {37663766- star = true;37673767- continue;37683768- }37693769- break;37703770- }37713771- if (p[i+j] == 's')37723772- break;37733773-37743774- if (text_delta && p[i+1] == 'p' &&37753775- ((p[i+2] == 's' || p[i+2] == 'S')))37763776- break;37773777-37783778- star = false;37793779- }37803780- j = 0;37813781- }37823782- /* If no %s found then just print normally */37833783- if (!p[i])37843784- break;37853785-37863786- /* Copy up to the %s, and print that */37873787- strncpy(iter->fmt, p, i);37883788- iter->fmt[i] = '\0';37893789- trace_seq_vprintf(&iter->seq, iter->fmt, ap);37903790-37913791- /* Add delta to %pS pointers */37923792- if (p[i+1] == 'p') {37933793- unsigned 
-		long addr;
-		char fmt[4];
-
-		fmt[0] = '%';
-		fmt[1] = 'p';
-		fmt[2] = p[i+2]; /* Either %ps or %pS */
-		fmt[3] = '\0';
-
-		addr = va_arg(ap, unsigned long);
-		addr += text_delta;
-		trace_seq_printf(&iter->seq, fmt, (void *)addr);
-
-		p += i + 3;
+		if (!field->needs_test)
 			continue;
-		}
-
-		/*
-		 * If iter->seq is full, the above call no longer guarantees
-		 * that ap is in sync with fmt processing, and further calls
-		 * to va_arg() can return wrong positional arguments.
-		 *
-		 * Ensure that ap is no longer used in this case.
-		 */
-		if (iter->seq.full) {
-			p = "";
-			break;
-		}
+		str = *(const char **)(ptr + field->offset);
 
-		if (star)
-			len = va_arg(ap, int);
-
-		/* The ap now points to the string data of the %s */
-		str = va_arg(ap, const char *);
-
-		good = trace_safe_str(iter, str, star, len);
-
-		/* Could be from the last boot */
-		if (data_delta && !good) {
-			str += data_delta;
-			good = trace_safe_str(iter, str, star, len);
-		}
+		good = trace_safe_str(iter, str);
 
 		/*
 		 * If you hit this warning, it is likely that the
···
 		 * instead. See samples/trace_events/trace-events-sample.h
 		 * for reference.
 		 */
-		if (WARN_ONCE(!good, "fmt: '%s' current_buffer: '%s'",
-			      fmt, seq_buf_str(&iter->seq.seq))) {
-			int ret;
-
-			/* Try to safely read the string */
-			if (star) {
-				if (len + 1 > iter->fmt_size)
-					len = iter->fmt_size - 1;
-				if (len < 0)
-					len = 0;
-				ret = copy_from_kernel_nofault(iter->fmt, str, len);
-				iter->fmt[len] = 0;
-				star = false;
-			} else {
-				ret = strncpy_from_kernel_nofault(iter->fmt, str,
-								  iter->fmt_size);
-			}
-			if (ret < 0)
-				trace_seq_printf(&iter->seq, "(0x%px)", str);
-			else
-				trace_seq_printf(&iter->seq, "(0x%px:%s)",
-						 str, iter->fmt);
-			str = "[UNSAFE-MEMORY]";
-			strcpy(iter->fmt, "%s");
-		} else {
-			strncpy(iter->fmt, p + i, j + 1);
-			iter->fmt[j+1] = '\0';
+		if (WARN_ONCE(!good, "event '%s' has unsafe pointer field '%s'",
+			      trace_event_name(event), field->name)) {
+			trace_seq_printf(seq, "EVENT %s: HAS UNSAFE POINTER FIELD '%s'\n",
+					 trace_event_name(event), field->name);
+			return true;
 		}
-		if (star)
-			trace_seq_printf(&iter->seq, iter->fmt, len, str);
-		else
-			trace_seq_printf(&iter->seq, iter->fmt, str);
-
-		p += i + j + 1;
 	}
- print:
-	if (*p)
-		trace_seq_vprintf(&iter->seq, p, ap);
+	return false;
 }
 
 const char *trace_event_format(struct trace_iterator *iter, const char *fmt)
···
 	if (event) {
 		if (tr->trace_flags & TRACE_ITER_FIELDS)
 			return print_event_fields(iter, event);
+		/*
+		 * For TRACE_EVENT() events, the print_fmt is not
+		 * safe to use if the array has delta offsets
+		 * Force printing via the fields.
+		 */
+		if ((tr->text_delta || tr->data_delta) &&
+		    event->type > __TRACE_LAST_TYPE)
+			return print_event_fields(iter, event);
+
 		return event->funcs->trace(iter, sym_flags, event);
 	}
···
 	apply_trace_boot_options();
 
 	register_snapshot_cmd();
-
-	test_can_verify();
 
 	return 0;
+3-3
kernel/trace/trace.h
···
 
 bool trace_is_tracepoint_string(const char *str);
 const char *trace_event_format(struct trace_iterator *iter, const char *fmt);
-void trace_check_vprintf(struct trace_iterator *iter, const char *fmt,
-			 va_list ap) __printf(2, 0);
 char *trace_iter_expand_format(struct trace_iterator *iter);
+bool ignore_event(struct trace_iterator *iter);
 
 int trace_empty(struct trace_iterator *iter);
···
 	int			filter_type;
 	int			offset;
 	int			size;
-	int			is_signed;
+	unsigned int		is_signed:1;
+	unsigned int		needs_test:1;
 	int			len;
 };
+177-50
kernel/trace/trace_events.c
···
 	}
 
 static struct ftrace_event_field *
-__find_event_field(struct list_head *head, char *name)
+__find_event_field(struct list_head *head, const char *name)
 {
 	struct ftrace_event_field *field;
 
···
 
 static int __trace_define_field(struct list_head *head, const char *type,
 				const char *name, int offset, int size,
-				int is_signed, int filter_type, int len)
+				int is_signed, int filter_type, int len,
+				int need_test)
 {
 	struct ftrace_event_field *field;
 
···
 	field->offset = offset;
 	field->size = size;
 	field->is_signed = is_signed;
+	field->needs_test = need_test;
 	field->len = len;
 
 	list_add(&field->link, head);
···
 
 	head = trace_get_fields(call);
 	return __trace_define_field(head, type, name, offset, size,
-				    is_signed, filter_type, 0);
+				    is_signed, filter_type, 0, 0);
 }
 EXPORT_SYMBOL_GPL(trace_define_field);
 
 static int trace_define_field_ext(struct trace_event_call *call, const char *type,
 				  const char *name, int offset, int size, int is_signed,
-				  int filter_type, int len)
+				  int filter_type, int len, int need_test)
 {
 	struct list_head *head;
 
···
 
 	head = trace_get_fields(call);
 	return __trace_define_field(head, type, name, offset, size,
-				    is_signed, filter_type, len);
+				    is_signed, filter_type, len, need_test);
 }
 
 #define __generic_field(type, item, filter_type)			\
 	ret = __trace_define_field(&ftrace_generic_fields, #type,	\
 				   #item, 0, 0, is_signed_type(type),	\
-				   filter_type, 0);			\
+				   filter_type, 0, 0);			\
 	if (ret)							\
 		return ret;
 
···
 			     "common_" #item,				\
 			     offsetof(typeof(ent), item),		\
 			     sizeof(ent.item),				\
-			     is_signed_type(type), FILTER_OTHER, 0);	\
+			     is_signed_type(type), FILTER_OTHER,	\
+			     0, 0);					\
 	if (ret)							\
 		return ret;
 
···
 	return tail->offset + tail->size;
 }
 
-/*
- * Check if the referenced field is an array and return true,
- * as arrays are OK to dereference.
- */
-static bool test_field(const char *fmt, struct trace_event_call *call)
+
+static struct trace_event_fields *find_event_field(const char *fmt,
+						   struct trace_event_call *call)
 {
 	struct trace_event_fields *field = call->class->fields_array;
-	const char *array_descriptor;
 	const char *p = fmt;
 	int len;
 
 	if (!(len = str_has_prefix(fmt, "REC->")))
-		return false;
+		return NULL;
 	fmt += len;
 	for (p = fmt; *p; p++) {
 		if (!isalnum(*p) && *p != '_')
···
 	len = p - fmt;
 
 	for (; field->type; field++) {
-		if (strncmp(field->name, fmt, len) ||
-		    field->name[len])
+		if (strncmp(field->name, fmt, len) || field->name[len])
 			continue;
-		array_descriptor = strchr(field->type, '[');
-		/* This is an array and is OK to dereference. */
-		return array_descriptor != NULL;
+
+		return field;
+	}
+	return NULL;
+}
+
+/*
+ * Check if the referenced field is an array and return true,
+ * as arrays are OK to dereference.
+ */
+static bool test_field(const char *fmt, struct trace_event_call *call)
+{
+	struct trace_event_fields *field;
+
+	field = find_event_field(fmt, call);
+	if (!field)
+		return false;
+
+	/* This is an array and is OK to dereference. */
+	return strchr(field->type, '[') != NULL;
+}
+
+/* Look for a string within an argument */
+static bool find_print_string(const char *arg, const char *str, const char *end)
+{
+	const char *r;
+
+	r = strstr(arg, str);
+	return r && r < end;
+}
+
+/* Return true if the argument pointer is safe */
+static bool process_pointer(const char *fmt, int len, struct trace_event_call *call)
+{
+	const char *r, *e, *a;
+
+	e = fmt + len;
+
+	/* Find the REC-> in the argument */
+	r = strstr(fmt, "REC->");
+	if (r && r < e) {
+		/*
+		 * Addresses of events on the buffer, or an array on the buffer is
+		 * OK to dereference. There's ways to fool this, but
+		 * this is to catch common mistakes, not malicious code.
+		 */
+		a = strchr(fmt, '&');
+		if ((a && (a < r)) || test_field(r, call))
+			return true;
+	} else if (find_print_string(fmt, "__get_dynamic_array(", e)) {
+		return true;
+	} else if (find_print_string(fmt, "__get_rel_dynamic_array(", e)) {
+		return true;
+	} else if (find_print_string(fmt, "__get_dynamic_array_len(", e)) {
+		return true;
+	} else if (find_print_string(fmt, "__get_rel_dynamic_array_len(", e)) {
+		return true;
+	} else if (find_print_string(fmt, "__get_sockaddr(", e)) {
+		return true;
+	} else if (find_print_string(fmt, "__get_rel_sockaddr(", e)) {
+		return true;
 	}
 	return false;
+}
+
+/* Return true if the string is safe */
+static bool process_string(const char *fmt, int len, struct trace_event_call *call)
+{
+	struct trace_event_fields *field;
+	const char *r, *e, *s;
+
+	e = fmt + len;
+
+	/*
+	 * There are several helper functions that return strings.
+	 * If the argument contains a function, then assume its field is valid.
+	 * It is considered that the argument has a function if it has:
+	 * alphanumeric or '_' before a parenthesis.
+	 */
+	s = fmt;
+	do {
+		r = strstr(s, "(");
+		if (!r || r >= e)
+			break;
+		for (int i = 1; r - i >= s; i++) {
+			char ch = *(r - i);
+			if (isspace(ch))
+				continue;
+			if (isalnum(ch) || ch == '_')
+				return true;
+			/* Anything else, this isn't a function */
+			break;
+		}
+		/* A function could be wrapped in parethesis, try the next one */
+		s = r + 1;
+	} while (s < e);
+
+	/*
+	 * If there's any strings in the argument consider this arg OK as it
+	 * could be: REC->field ? "foo" : "bar" and we don't want to get into
+	 * verifying that logic here.
+	 */
+	if (find_print_string(fmt, "\"", e))
+		return true;
+
+	/* Dereferenced strings are also valid like any other pointer */
+	if (process_pointer(fmt, len, call))
+		return true;
+
+	/* Make sure the field is found */
+	field = find_event_field(fmt, call);
+	if (!field)
+		return false;
+
+	/* Test this field's string before printing the event */
+	call->flags |= TRACE_EVENT_FL_TEST_STR;
+	field->needs_test = 1;
+
+	return true;
 }
 
 /*
···
 static void test_event_printk(struct trace_event_call *call)
 {
 	u64 dereference_flags = 0;
+	u64 string_flags = 0;
 	bool first = true;
-	const char *fmt, *c, *r, *a;
+	const char *fmt;
 	int parens = 0;
 	char in_quote = 0;
 	int start_arg = 0;
 	int arg = 0;
-	int i;
+	int i, e;
 
 	fmt = call->print_fmt;
 
···
 					star = true;
 					continue;
 				}
-				if ((fmt[i + j] == 's') && star)
-					arg++;
+				if ((fmt[i + j] == 's')) {
+					if (star)
+						arg++;
+					if (WARN_ONCE(arg == 63,
+						      "Too many args for event: %s",
+						      trace_event_name(call)))
+						return;
+					dereference_flags |= 1ULL << arg;
+					string_flags |= 1ULL << arg;
+				}
 				break;
 			}
 			break;
···
 		case ',':
 			if (in_quote || parens)
 				continue;
+			e = i;
 			i++;
 			while (isspace(fmt[i]))
 				i++;
-			start_arg = i;
-			if (!(dereference_flags & (1ULL << arg)))
-				goto next_arg;
 
-			/* Find the REC-> in the argument */
-			c = strchr(fmt + i, ',');
-			r = strstr(fmt + i, "REC->");
-			if (r && (!c || r < c)) {
-				/*
-				 * Addresses of events on the buffer,
-				 * or an array on the buffer is
-				 * OK to dereference.
-				 * There's ways to fool this, but
-				 * this is to catch common mistakes,
-				 * not malicious code.
-				 */
-				a = strchr(fmt + i, '&');
-				if ((a && (a < r)) || test_field(r, call))
-					dereference_flags &= ~(1ULL << arg);
-			} else if ((r = strstr(fmt + i, "__get_dynamic_array(")) &&
-				   (!c || r < c)) {
-				dereference_flags &= ~(1ULL << arg);
-			} else if ((r = strstr(fmt + i, "__get_sockaddr(")) &&
-				   (!c || r < c)) {
-				dereference_flags &= ~(1ULL << arg);
+			/*
+			 * If start_arg is zero, then this is the start of the
+			 * first argument. The processing of the argument happens
+			 * when the end of the argument is found, as it needs to
+			 * handle paranthesis and such.
+			 */
+			if (!start_arg) {
+				start_arg = i;
+				/* Balance out the i++ in the for loop */
+				i--;
+				continue;
 			}
 
- next_arg:
-			i--;
+			if (dereference_flags & (1ULL << arg)) {
+				if (string_flags & (1ULL << arg)) {
+					if (process_string(fmt + start_arg, e - start_arg, call))
+						dereference_flags &= ~(1ULL << arg);
+				} else if (process_pointer(fmt + start_arg, e - start_arg, call))
+					dereference_flags &= ~(1ULL << arg);
+			}
+
+			start_arg = i;
 			arg++;
+			/* Balance out the i++ in the for loop */
+			i--;
 		}
+	}
+
+	if (dereference_flags & (1ULL << arg)) {
+		if (string_flags & (1ULL << arg)) {
+			if (process_string(fmt + start_arg, i - start_arg, call))
+				dereference_flags &= ~(1ULL << arg);
+		} else if (process_pointer(fmt + start_arg, i - start_arg, call))
+			dereference_flags &= ~(1ULL << arg);
 	}
 
 	/*
···
 		ret = trace_define_field_ext(call, field->type, field->name,
 					     offset, field->size,
 					     field->is_signed, field->filter_type,
-					     field->len);
+					     field->len, field->needs_test);
 		if (WARN_ON_ONCE(ret)) {
 			pr_err("error code is %d\n", ret);
 			break;
+2-1
kernel/trace/trace_functions.c
···
 	tracing_reset_online_cpus(&tr->array_buffer);
 }
 
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+/* fregs are guaranteed not to be NULL if HAVE_DYNAMIC_FTRACE_WITH_ARGS is set */
+#if defined(CONFIG_FUNCTION_GRAPH_TRACER) && defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS)
 static __always_inline unsigned long
 function_get_true_parent_ip(unsigned long parent_ip, struct ftrace_regs *fregs)
 {
···
 		return;
 	}
 
+	/*
+	 * Clear tag references to avoid debug warning when using
+	 * __alloc_tag_ref_set() with non-empty reference.
+	 */
+	set_codetag_empty(&ref_old);
+	set_codetag_empty(&ref_new);
+
 	/* swap tags */
 	__alloc_tag_ref_set(&ref_old, tag_new);
 	update_page_tag_ref(handle_old, &ref_old);
···
 
 static int vm_module_tags_populate(void)
 {
-	unsigned long phys_size = vm_module_tags->nr_pages << PAGE_SHIFT;
+	unsigned long phys_end = ALIGN_DOWN(module_tags.start_addr, PAGE_SIZE) +
+				 (vm_module_tags->nr_pages << PAGE_SHIFT);
+	unsigned long new_end = module_tags.start_addr + module_tags.size;
 
-	if (phys_size < module_tags.size) {
+	if (phys_end < new_end) {
 		struct page **next_page = vm_module_tags->pages + vm_module_tags->nr_pages;
-		unsigned long addr = module_tags.start_addr + phys_size;
+		unsigned long old_shadow_end = ALIGN(phys_end, MODULE_ALIGN);
+		unsigned long new_shadow_end = ALIGN(new_end, MODULE_ALIGN);
 		unsigned long more_pages;
 		unsigned long nr;
 
-		more_pages = ALIGN(module_tags.size - phys_size, PAGE_SIZE) >> PAGE_SHIFT;
+		more_pages = ALIGN(new_end - phys_end, PAGE_SIZE) >> PAGE_SHIFT;
 		nr = alloc_pages_bulk_array_node(GFP_KERNEL | __GFP_NOWARN,
 						 NUMA_NO_NODE, more_pages, next_page);
 		if (nr < more_pages ||
-		    vmap_pages_range(addr, addr + (nr << PAGE_SHIFT), PAGE_KERNEL,
+		    vmap_pages_range(phys_end, phys_end + (nr << PAGE_SHIFT), PAGE_KERNEL,
 				     next_page, PAGE_SHIFT) < 0) {
 			/* Clean up and error out */
 			for (int i = 0; i < nr; i++)
 				__free_page(next_page[i]);
 			return -ENOMEM;
 		}
+
 		vm_module_tags->nr_pages += nr;
+
+		/*
+		 * Kasan allocates 1 byte of shadow for every 8 bytes of data.
+		 * When kasan_alloc_module_shadow allocates shadow memory,
+		 * its unit of allocation is a page.
+		 * Therefore, here we need to align to MODULE_ALIGN.
+		 */
+		if (old_shadow_end < new_shadow_end)
+			kasan_alloc_module_shadow((void *)old_shadow_end,
+						  new_shadow_end - old_shadow_end,
+						  GFP_KERNEL);
 	}
+
+	/*
+	 * Mark the pages as accessible, now that they are mapped.
+	 * With hardware tag-based KASAN, marking is skipped for
+	 * non-VM_ALLOC mappings, see __kasan_unpoison_vmalloc().
+	 */
+	kasan_unpoison_vmalloc((void *)module_tags.start_addr,
+			       new_end - module_tags.start_addr,
+			       KASAN_VMALLOC_PROT_NORMAL);
 
 	return 0;
 }
+10-9
mm/huge_memory.c
···
 	folio_throttle_swaprate(folio, gfp);
 
 	/*
-	 * When a folio is not zeroed during allocation (__GFP_ZERO not used),
-	 * folio_zero_user() is used to make sure that the page corresponding
-	 * to the faulting address will be hot in the cache after zeroing.
+	 * When a folio is not zeroed during allocation (__GFP_ZERO not used)
+	 * or user folios require special handling, folio_zero_user() is used to
+	 * make sure that the page corresponding to the faulting address will be
+	 * hot in the cache after zeroing.
 	 */
-	if (!alloc_zeroed())
+	if (user_alloc_needs_zeroing())
 		folio_zero_user(folio, addr);
 	/*
 	 * The memory barrier inside __folio_mark_uptodate makes sure that
···
 	    !list_empty(&folio->_deferred_list)) {
 		ds_queue->split_queue_len--;
 		if (folio_test_partially_mapped(folio)) {
-			__folio_clear_partially_mapped(folio);
+			folio_clear_partially_mapped(folio);
 			mod_mthp_stat(folio_order(folio),
 				      MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
 		}
···
 	if (!list_empty(&folio->_deferred_list)) {
 		ds_queue->split_queue_len--;
 		if (folio_test_partially_mapped(folio)) {
-			__folio_clear_partially_mapped(folio);
+			folio_clear_partially_mapped(folio);
 			mod_mthp_stat(folio_order(folio),
 				      MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
 		}
···
 	spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
 	if (partially_mapped) {
 		if (!folio_test_partially_mapped(folio)) {
-			__folio_set_partially_mapped(folio);
+			folio_set_partially_mapped(folio);
 			if (folio_test_pmd_mappable(folio))
 				count_vm_event(THP_DEFERRED_SPLIT_PAGE);
 			count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
···
 	} else {
 		/* We lost race with folio_put() */
 		if (folio_test_partially_mapped(folio)) {
-			__folio_clear_partially_mapped(folio);
+			folio_clear_partially_mapped(folio);
 			mod_mthp_stat(folio_order(folio),
 				      MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
 		}
···
 	size_t input_len = strlen(input_buf);
 
 	tok = strsep(&buf, ",");
-	if (tok) {
+	if (tok && buf) {
 		strscpy(file_path, tok);
 	} else {
 		ret = -EINVAL;
+2-3
mm/hugetlb.c
···
 			break;
 		}
 		ret = copy_user_large_folio(new_folio, pte_folio,
-					    ALIGN_DOWN(addr, sz), dst_vma);
+					    addr, dst_vma);
 		folio_put(pte_folio);
 		if (ret) {
 			folio_put(new_folio);
···
 			*foliop = NULL;
 			goto out;
 		}
-		ret = copy_user_large_folio(folio, *foliop,
-					    ALIGN_DOWN(dst_addr, size), dst_vma);
+		ret = copy_user_large_folio(folio, *foliop, dst_addr, dst_vma);
 		folio_put(*foliop);
 		*foliop = NULL;
 		if (ret) {
-6
mm/internal.h
···
 void touch_pmd(struct vm_area_struct *vma, unsigned long addr,
 	       pmd_t *pmd, bool write);
 
-static inline bool alloc_zeroed(void)
-{
-	return static_branch_maybe(CONFIG_INIT_ON_ALLOC_DEFAULT_ON,
-				   &init_on_alloc);
-}
-
 /*
  * Parses a string with mem suffixes into its order. Useful to parse kernel
  * parameters.
+10-8
mm/memory.c
···
 		folio_throttle_swaprate(folio, gfp);
 		/*
 		 * When a folio is not zeroed during allocation
-		 * (__GFP_ZERO not used), folio_zero_user() is used
-		 * to make sure that the page corresponding to the
-		 * faulting address will be hot in the cache after
-		 * zeroing.
+		 * (__GFP_ZERO not used) or user folios require special
+		 * handling, folio_zero_user() is used to make sure
+		 * that the page corresponding to the faulting address
+		 * will be hot in the cache after zeroing.
 		 */
-		if (!alloc_zeroed())
+		if (user_alloc_needs_zeroing())
 			folio_zero_user(folio, vmf->address);
 		return folio;
 	}
···
 	return 0;
 }
 
-static void clear_gigantic_page(struct folio *folio, unsigned long addr,
+static void clear_gigantic_page(struct folio *folio, unsigned long addr_hint,
 				unsigned int nr_pages)
 {
+	unsigned long addr = ALIGN_DOWN(addr_hint, folio_size(folio));
 	int i;
 
 	might_sleep();
···
 }
 
 static int copy_user_gigantic_page(struct folio *dst, struct folio *src,
-				   unsigned long addr,
+				   unsigned long addr_hint,
 				   struct vm_area_struct *vma,
 				   unsigned int nr_pages)
 {
-	int i;
+	unsigned long addr = ALIGN_DOWN(addr_hint, folio_size(dst));
 	struct page *dst_page;
 	struct page *src_page;
+	int i;
 
 	for (i = 0; i < nr_pages; i++) {
 		dst_page = folio_page(dst, i);
+4-2
mm/page_alloc.c
···
 	if (order > pageblock_order)
 		order = pageblock_order;
 
-	while (pfn != end) {
+	do {
 		int mt = get_pfnblock_migratetype(page, pfn);
 
 		__free_one_page(page, pfn, zone, order, mt, fpi);
 		pfn += 1 << order;
+		if (pfn == end)
+			break;
 		page = pfn_to_page(pfn);
-	}
+	} while (1);
 }
 
 static void free_one_page(struct zone *zone, struct page *page,
+1-1
mm/pgtable-generic.c
···
 static void pmdp_get_lockless_end(unsigned long irqflags) { }
 #endif
 
-pte_t *__pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp)
+pte_t *___pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp)
 {
 	unsigned long irqflags;
 	pmd_t pmdval;
···
 
 	/* If flags changed, we might be able to merge, so try again. */
 	if (map.retry_merge) {
+		struct vm_area_struct *merged;
 		VMG_MMAP_STATE(vmg, &map, vma);
 
 		vma_iter_config(map.vmi, map.addr, map.end);
-		vma_merge_existing_range(&vmg);
+		merged = vma_merge_existing_range(&vmg);
+		if (merged)
+			vma = merged;
 	}
 
 	__mmap_complete(&map, vma);
+4-2
mm/vmalloc.c
···
 		struct page *page = vm->pages[i];
 
 		BUG_ON(!page);
-		mod_memcg_page_state(page, MEMCG_VMALLOC, -1);
+		if (!(vm->flags & VM_MAP_PUT_PAGES))
+			mod_memcg_page_state(page, MEMCG_VMALLOC, -1);
 		/*
 		 * High-order allocs for huge vmallocs are split, so
 		 * can be freed as an array of order-0 allocations
···
 		__free_page(page);
 		cond_resched();
 	}
-	atomic_long_sub(vm->nr_pages, &nr_vmalloc_pages);
+	if (!(vm->flags & VM_MAP_PUT_PAGES))
+		atomic_long_sub(vm->nr_pages, &nr_vmalloc_pages);
 	kvfree(vm->pages);
 	kfree(vm);
 }
···
 
 static u32 __bpf_skb_min_len(const struct sk_buff *skb)
 {
-	u32 min_len = skb_network_offset(skb);
+	int offset = skb_network_offset(skb);
+	u32 min_len = 0;
 
-	if (skb_transport_header_was_set(skb))
-		min_len = skb_transport_offset(skb);
-	if (skb->ip_summed == CHECKSUM_PARTIAL)
-		min_len = skb_checksum_start_offset(skb) +
-			  skb->csum_offset + sizeof(__sum16);
+	if (offset > 0)
+		min_len = offset;
+	if (skb_transport_header_was_set(skb)) {
+		offset = skb_transport_offset(skb);
+		if (offset > 0)
+			min_len = offset;
+	}
+	if (skb->ip_summed == CHECKSUM_PARTIAL) {
+		offset = skb_checksum_start_offset(skb) +
+			 skb->csum_offset + sizeof(__sum16);
+		if (offset > 0)
+			min_len = offset;
+	}
 	return min_len;
 }
 
+8-11
net/core/netdev-genl.c
···
 netdev_nl_queue_fill(struct sk_buff *rsp, struct net_device *netdev, u32 q_idx,
 		     u32 q_type, const struct genl_info *info)
 {
-	int err = 0;
+	int err;
 
 	if (!(netdev->flags & IFF_UP))
-		return err;
+		return -ENOENT;
 
 	err = netdev_nl_queue_validate(netdev, q_idx, q_type);
 	if (err)
···
 			 struct netdev_nl_dump_ctx *ctx)
 {
 	int err = 0;
-	int i;
 
 	if (!(netdev->flags & IFF_UP))
 		return err;
 
-	for (i = ctx->rxq_idx; i < netdev->real_num_rx_queues;) {
-		err = netdev_nl_queue_fill_one(rsp, netdev, i,
+	for (; ctx->rxq_idx < netdev->real_num_rx_queues; ctx->rxq_idx++) {
+		err = netdev_nl_queue_fill_one(rsp, netdev, ctx->rxq_idx,
 					       NETDEV_QUEUE_TYPE_RX, info);
 		if (err)
 			return err;
-		ctx->rxq_idx = i++;
 	}
-	for (i = ctx->txq_idx; i < netdev->real_num_tx_queues;) {
-		err = netdev_nl_queue_fill_one(rsp, netdev, i,
+	for (; ctx->txq_idx < netdev->real_num_tx_queues; ctx->txq_idx++) {
+		err = netdev_nl_queue_fill_one(rsp, netdev, ctx->txq_idx,
 					       NETDEV_QUEUE_TYPE_TX, info);
 		if (err)
 			return err;
-		ctx->txq_idx = i++;
 	}
 
 	return err;
···
 						     i, info);
 		if (err)
 			return err;
-		ctx->rxq_idx = i++;
+		ctx->rxq_idx = ++i;
 	}
 	i = ctx->txq_idx;
 	while (ops->get_queue_stats_tx && i < netdev->real_num_tx_queues) {
···
 						     i, info);
 		if (err)
 			return err;
-		ctx->txq_idx = i++;
+		ctx->txq_idx = ++i;
 	}
 
 	ctx->rxq_idx = 0;
+3-2
net/core/rtnetlink.c
···
 }
 
 static struct net *rtnl_get_peer_net(const struct rtnl_link_ops *ops,
+				     struct nlattr *tbp[],
 				     struct nlattr *data[],
 				     struct netlink_ext_ack *extack)
 {
···
 	int err;
 
 	if (!data || !data[ops->peer_type])
-		return NULL;
+		return rtnl_link_get_net_ifla(tbp);
 
 	err = rtnl_nla_parse_ifinfomsg(tb, data[ops->peer_type], extack);
 	if (err < 0)
···
 	}
 
 	if (ops->peer_type) {
-		peer_net = rtnl_get_peer_net(ops, data, extack);
+		peer_net = rtnl_get_peer_net(ops, tb, data, extack);
 		if (IS_ERR(peer_net)) {
 			ret = PTR_ERR(peer_net);
 			goto put_ops;
+8-3
net/core/skmsg.c
···
 			    struct sk_msg *msg, u32 bytes)
 {
 	int ret = -ENOSPC, i = msg->sg.curr;
+	u32 copy, buf_size, copied = 0;
 	struct scatterlist *sge;
-	u32 copy, buf_size;
 	void *to;
 
 	do {
···
 			goto out;
 		}
 		bytes -= copy;
+		copied += copy;
 		if (!bytes)
 			break;
 		msg->sg.copybreak = 0;
···
 	} while (i != msg->sg.end);
 out:
 	msg->sg.curr = i;
-	return ret;
+	return (ret < 0) ? ret : copied;
 }
 EXPORT_SYMBOL_GPL(sk_msg_memcopy_from_iter);
 
···
 		if (likely(!peek)) {
 			sge->offset += copy;
 			sge->length -= copy;
-			if (!msg_rx->skb)
+			if (!msg_rx->skb) {
 				sk_mem_uncharge(sk, copy);
+				atomic_sub(copy, &sk->sk_rmem_alloc);
+			}
 			msg_rx->sg.size -= copy;
 
 			if (!sge->length) {
···
 
 	list_for_each_entry_safe(msg, tmp, &psock->ingress_msg, list) {
 		list_del(&msg->list);
+		if (!msg->skb)
+			atomic_sub(msg->sg.size, &psock->sk->sk_rmem_alloc);
 		sk_msg_free(psock->sk, msg);
 		kfree(msg);
 	}
+11-5
net/dsa/tag.h
···
  * dsa_software_vlan_untag: Software VLAN untagging in DSA receive path
  * @skb: Pointer to socket buffer (packet)
  *
- * Receive path method for switches which cannot avoid tagging all packets
- * towards the CPU port. Called when ds->untag_bridge_pvid (legacy) or
- * ds->untag_vlan_aware_bridge_pvid is set to true.
+ * Receive path method for switches which send some packets as VLAN-tagged
+ * towards the CPU port (generally from VLAN-aware bridge ports) even when the
+ * packet was not tagged on the wire. Called when ds->untag_bridge_pvid
+ * (legacy) or ds->untag_vlan_aware_bridge_pvid is set to true.
  *
  * As a side effect of this method, any VLAN tag from the skb head is moved
  * to hwaccel.
···
 {
 	struct dsa_port *dp = dsa_user_to_port(skb->dev);
 	struct net_device *br = dsa_port_bridge_dev_get(dp);
-	u16 vid;
+	u16 vid, proto;
+	int err;
 
 	/* software untagging for standalone ports not yet necessary */
 	if (!br)
 		return skb;
 
+	err = br_vlan_get_proto(br, &proto);
+	if (err)
+		return skb;
+
 	/* Move VLAN tag from data to hwaccel */
-	if (!skb_vlan_tag_present(skb)) {
+	if (!skb_vlan_tag_present(skb) && skb->protocol == htons(proto)) {
 		skb = skb_vlan_untag(skb);
 		if (!skb)
 			return NULL;
+8-6
net/ipv4/tcp_bpf.c
···
 		sge = sk_msg_elem(msg, i);
 		size = (apply && apply_bytes < sge->length) ?
 			apply_bytes : sge->length;
-		if (!sk_wmem_schedule(sk, size)) {
+		if (!__sk_rmem_schedule(sk, size, false)) {
 			if (!copied)
 				ret = -ENOMEM;
 			break;
 		}
 
 		sk_mem_charge(sk, size);
+		atomic_add(size, &sk->sk_rmem_alloc);
 		sk_msg_xfer(tmp, msg, i, size);
 		copied += size;
 		if (sge->length)
···
 	if (!ret) {
 		msg->sg.start = i;
-		sk_psock_queue_msg(psock, tmp);
+		if (!sk_psock_queue_msg(psock, tmp))
+			atomic_sub(copied, &sk->sk_rmem_alloc);
 		sk_psock_data_ready(sk, psock);
 	} else {
 		sk_msg_free(sk, tmp);
···
 static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 {
 	struct sk_msg tmp, *msg_tx = NULL;
-	int copied = 0, err = 0;
+	int copied = 0, err = 0, ret = 0;
 	struct sk_psock *psock;
 	long timeo;
 	int flags;
···
 			copy = msg_tx->sg.size - osize;
 		}
 
-		err = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, msg_tx,
+		ret = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, msg_tx,
 					       copy);
-		if (err < 0) {
+		if (ret < 0) {
 			sk_msg_trim(sk, msg_tx, osize);
 			goto out_err;
 		}
 
-		copied += copy;
+		copied += ret;
 		if (psock->cork_bytes) {
 			if (size > psock->cork_bytes)
 				psock->cork_bytes = 0;
+26-10
net/mctp/route.c
···
 	msk = NULL;
 	rc = -EINVAL;
 
-	/* we may be receiving a locally-routed packet; drop source sk
-	 * accounting
+	/* We may be receiving a locally-routed packet; drop source sk
+	 * accounting.
+	 *
+	 * From here, we will either queue the skb - either to a frag_queue, or
+	 * to a receiving socket. When that succeeds, we clear the skb pointer;
+	 * a non-NULL skb on exit will be otherwise unowned, and hence
+	 * kfree_skb()-ed.
 	 */
 	skb_orphan(skb);
 
···
 	 * pending key.
 	 */
 	if (flags & MCTP_HDR_FLAG_EOM) {
-		sock_queue_rcv_skb(&msk->sk, skb);
+		rc = sock_queue_rcv_skb(&msk->sk, skb);
+		if (!rc)
+			skb = NULL;
 		if (key) {
 			/* we've hit a pending reassembly; not much we
 			 * can do but drop it
···
 					   MCTP_TRACE_KEY_REPLIED);
 			key = NULL;
 		}
-		rc = 0;
 		goto out_unlock;
 	}
 
···
 		 * this function.
 		 */
 		rc = mctp_key_add(key, msk);
-		if (!rc)
+		if (!rc) {
 			trace_mctp_key_acquire(key);
+			skb = NULL;
+		}
 
 		/* we don't need to release key->lock on exit, so
 		 * clean up here and suppress the unlock via
···
 			key = NULL;
 		} else {
 			rc = mctp_frag_queue(key, skb);
+			if (!rc)
+				skb = NULL;
 		}
 	}
 
···
 		else
 			rc = mctp_frag_queue(key, skb);
 
+		if (rc)
+			goto out_unlock;
+
+		/* we've queued; the queue owns the skb now */
+		skb = NULL;
+
 		/* end of message? deliver to socket, and we're done with
 		 * the reassembly/response key
 		 */
-		if (!rc && flags & MCTP_HDR_FLAG_EOM) {
-			sock_queue_rcv_skb(key->sk, key->reasm_head);
-			key->reasm_head = NULL;
+		if (flags & MCTP_HDR_FLAG_EOM) {
+			rc = sock_queue_rcv_skb(key->sk, key->reasm_head);
+			if (!rc)
+				key->reasm_head = NULL;
 			__mctp_key_done_in(key, net, f, MCTP_TRACE_KEY_REPLIED);
 			key = NULL;
 		}
···
 	if (any_key)
 		mctp_key_unref(any_key);
 out:
-	if (rc)
-		kfree_skb(skb);
+	kfree_skb(skb);
 	return rc;
 }
+86
net/mctp/test/route-test.c
···837837 mctp_test_route_input_multiple_nets_key_fini(test, &t2);838838}839839840840+/* Input route to socket, using a single-packet message, where sock delivery841841+ * fails. Ensure we're handling the failure appropriately.842842+ */843843+static void mctp_test_route_input_sk_fail_single(struct kunit *test)844844+{845845+ const struct mctp_hdr hdr = RX_HDR(1, 10, 8, FL_S | FL_E | FL_TO);846846+ struct mctp_test_route *rt;847847+ struct mctp_test_dev *dev;848848+ struct socket *sock;849849+ struct sk_buff *skb;850850+ int rc;851851+852852+ __mctp_route_test_init(test, &dev, &rt, &sock, MCTP_NET_ANY);853853+854854+ /* No rcvbuf space, so delivery should fail. __sock_set_rcvbuf will855855+ * clamp the minimum to SOCK_MIN_RCVBUF, so we open-code this.856856+ */857857+ lock_sock(sock->sk);858858+ WRITE_ONCE(sock->sk->sk_rcvbuf, 0);859859+ release_sock(sock->sk);860860+861861+ skb = mctp_test_create_skb(&hdr, 10);862862+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, skb);863863+ skb_get(skb);864864+865865+ mctp_test_skb_set_dev(skb, dev);866866+867867+ /* do route input, which should fail */868868+ rc = mctp_route_input(&rt->rt, skb);869869+ KUNIT_EXPECT_NE(test, rc, 0);870870+871871+ /* we should hold the only reference to skb */872872+ KUNIT_EXPECT_EQ(test, refcount_read(&skb->users), 1);873873+ kfree_skb(skb);874874+875875+ __mctp_route_test_fini(test, dev, rt, sock);876876+}877877+878878+/* Input route to socket, using a fragmented message, where sock delivery fails.879879+ */880880+static void mctp_test_route_input_sk_fail_frag(struct kunit *test)881881+{882882+ const struct mctp_hdr hdrs[2] = { RX_FRAG(FL_S, 0), RX_FRAG(FL_E, 1) };883883+ struct mctp_test_route *rt;884884+ struct mctp_test_dev *dev;885885+ struct sk_buff *skbs[2];886886+ struct socket *sock;887887+ unsigned int i;888888+ int rc;889889+890890+ __mctp_route_test_init(test, &dev, &rt, &sock, MCTP_NET_ANY);891891+892892+ lock_sock(sock->sk);893893+ WRITE_ONCE(sock->sk->sk_rcvbuf, 0);894894+ 
release_sock(sock->sk);895895+896896+ for (i = 0; i < ARRAY_SIZE(skbs); i++) {897897+ skbs[i] = mctp_test_create_skb(&hdrs[i], 10);898898+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, skbs[i]);899899+ skb_get(skbs[i]);900900+901901+ mctp_test_skb_set_dev(skbs[i], dev);902902+ }903903+904904+ /* first route input should succeed, we're only queueing to the905905+ * frag list906906+ */907907+ rc = mctp_route_input(&rt->rt, skbs[0]);908908+ KUNIT_EXPECT_EQ(test, rc, 0);909909+910910+ /* final route input should fail to deliver to the socket */911911+ rc = mctp_route_input(&rt->rt, skbs[1]);912912+ KUNIT_EXPECT_NE(test, rc, 0);913913+914914+ /* we should hold the only reference to both skbs */915915+ KUNIT_EXPECT_EQ(test, refcount_read(&skbs[0]->users), 1);916916+ kfree_skb(skbs[0]);917917+918918+ KUNIT_EXPECT_EQ(test, refcount_read(&skbs[1]->users), 1);919919+ kfree_skb(skbs[1]);920920+921921+ __mctp_route_test_fini(test, dev, rt, sock);922922+}923923+840924#if IS_ENABLED(CONFIG_MCTP_FLOWS)841925842926static void mctp_test_flow_init(struct kunit *test,···11371053 mctp_route_input_sk_reasm_gen_params),11381054 KUNIT_CASE_PARAM(mctp_test_route_input_sk_keys,11391055 mctp_route_input_sk_keys_gen_params),10561056+ KUNIT_CASE(mctp_test_route_input_sk_fail_single),10571057+ KUNIT_CASE(mctp_test_route_input_sk_fail_frag),11401058 KUNIT_CASE(mctp_test_route_input_multiple_nets_bind),11411059 KUNIT_CASE(mctp_test_route_input_multiple_nets_key),11421060 KUNIT_CASE(mctp_test_packet_flow),
scripts/mod/modpost.c
···155155/* A list of all modules we processed */156156LIST_HEAD(modules);157157158158-static struct module *find_module(const char *modname)158158+static struct module *find_module(const char *filename, const char *modname)159159{160160 struct module *mod;161161162162 list_for_each_entry(mod, &modules, list) {163163- if (strcmp(mod->name, modname) == 0)163163+ if (!strcmp(mod->dump_file, filename) &&164164+ !strcmp(mod->name, modname))164165 return mod;165166 }166167 return NULL;···20312030 continue;20322031 }2033203220342034- mod = find_module(modname);20332033+ mod = find_module(fname, modname);20352034 if (!mod) {20362035 mod = new_module(modname, strlen(modname));20372037- mod->from_dump = true;20362036+ mod->dump_file = fname;20382037 }20392038 s = sym_add_exported(symname, mod, gpl_only, namespace);20402039 sym_set_crc(s, crc);···20532052 struct symbol *sym;2054205320552054 list_for_each_entry(mod, &modules, list) {20562056- if (mod->from_dump)20552055+ if (mod->dump_file)20572056 continue;20582057 list_for_each_entry(sym, &mod->exported_symbols, list) {20592058 if (trim_unused_exports && !sym->used)···2077207620782077 list_for_each_entry(mod, &modules, list) {2079207820802080- if (mod->from_dump || list_empty(&mod->missing_namespaces))20792079+ if (mod->dump_file || list_empty(&mod->missing_namespaces))20812080 continue;2082208120832082 buf_printf(&ns_deps_buf, "%s.ko:", mod->name);···21952194 read_symbols_from_files(files_source);2196219521972196 list_for_each_entry(mod, &modules, list) {21982198- if (mod->from_dump || mod->is_vmlinux)21972197+ if (mod->dump_file || mod->is_vmlinux)21992198 continue;2200219922012200 check_modname_len(mod);···22062205 handle_white_list_exports(unused_exports_white_list);2207220622082207 list_for_each_entry(mod, &modules, list) {22092209- if (mod->from_dump)22082208+ if (mod->dump_file)22102209 continue;2211221022122211 if (mod->is_vmlinux)
+2-1
scripts/mod/modpost.h
···9595/**9696 * struct module - represent a module (vmlinux or *.ko)9797 *9898+ * @dump_file: path to the .symvers file if loaded from a file9899 * @aliases: list head for module_aliases99100 */100101struct module {101102 struct list_head list;102103 struct list_head exported_symbols;103104 struct list_head unresolved_symbols;105105+ const char *dump_file;104106 bool is_gpl_compatible;105105- bool from_dump; /* true if module was loaded from *.symvers */106107 bool is_vmlinux;107108 bool seen;108109 bool has_init;
+6
scripts/package/builddeb
···6363 esac6464 cp "$(${MAKE} -s -f ${srctree}/Makefile image_name)" "${pdir}/${installed_image_path}"65656666+ if [ "${ARCH}" != um ]; then6767+ install_maint_scripts "${pdir}"6868+ fi6969+}7070+7171+install_maint_scripts () {6672 # Install the maintainer scripts6773 # Note: hook scripts under /etc/kernel are also executed by official Debian6874 # kernel packages, as well as kernel packages built using make-kpkg.
sound/soc/fsl/Kconfig
···2929config SND_SOC_FSL_MQS3030 tristate "Medium Quality Sound (MQS) module support"3131 depends on SND_SOC_FSL_SAI3232+ depends on IMX_SCMI_MISC_DRV || !IMX_SCMI_MISC_DRV3233 select REGMAP_MMIO3333- select IMX_SCMI_MISC_DRV if IMX_SCMI_MISC_EXT !=n3434 help3535 Say Y if you want to add Medium Quality Sound (MQS)3636 support for the Freescale CPUs.
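The `depends on IMX_SCMI_MISC_DRV || !IMX_SCMI_MISC_DRV` line looks like a tautology, but in Kconfig's tristate logic it is the standard idiom for an optional dependency: when the dependency is `m`, the expression evaluates to `m` and caps this symbol at `m` as well, so a built-in MQS driver can never reference a modular SCMI misc driver. A sketch of the idiom with hypothetical symbol names:

```kconfig
config HYPOTHETICAL_USER
	tristate "Driver that can optionally use OPTIONAL_HELPER"
	# OPTIONAL_HELPER=y or =n: the expression is y, no restriction.
	# OPTIONAL_HELPER=m: the expression is m, so HYPOTHETICAL_USER
	# is limited to m or n and cannot be built-in.
	depends on OPTIONAL_HELPER || !OPTIONAL_HELPER
```

This replaces the removed `select`, which would have forced the helper on without honouring its own dependencies.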
···11-#!/bin/bash11+#!/bin/sh2233# This example script parses /etc/resolv.conf to retrive DNS information.44# In the interest of keeping the KVP daemon code free of distro specific···1010# this script can be based on the Network Manager APIs for retrieving DNS1111# entries.12121313-cat /etc/resolv.conf 2>/dev/null | awk '/^nameserver/ { print $2 }'1313+exec awk '/^nameserver/ { print $2 }' /etc/resolv.conf 2>/dev/null
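The rewrite drops the superfluous `cat |` pipeline and `exec`s `awk` directly on the file, so no wrapper shell lingers while `awk` runs. A minimal sketch of the same extraction against a throwaway resolv.conf-style file (path and contents are assumptions for illustration):

```shell
# Build a hypothetical resolv.conf for the demonstration.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
# Generated by NetworkManager
search example.com
nameserver 10.0.0.1
nameserver 10.0.0.2
EOF

# Same extraction the script performs: print field 2 of each
# "nameserver" line, skipping comments and search domains.
servers=$(awk '/^nameserver/ { print $2 }' "$tmp")
printf '%s\n' "$servers"
rm -f "$tmp"
```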
+5-4
tools/hv/hv_kvp_daemon.c
···725725 * .726726 */727727728728- sprintf(cmd, KVP_SCRIPTS_PATH "%s", "hv_get_dns_info");728728+ sprintf(cmd, "exec %s %s", KVP_SCRIPTS_PATH "hv_get_dns_info", if_name);729729730730 /*731731 * Execute the command to gather DNS info.···742742 * Enabled: DHCP enabled.743743 */744744745745- sprintf(cmd, KVP_SCRIPTS_PATH "%s %s", "hv_get_dhcp_info", if_name);745745+ sprintf(cmd, "exec %s %s", KVP_SCRIPTS_PATH "hv_get_dhcp_info", if_name);746746747747 file = popen(cmd, "r");748748 if (file == NULL)···16061606 * invoke the external script to do its magic.16071607 */1608160816091609- str_len = snprintf(cmd, sizeof(cmd), KVP_SCRIPTS_PATH "%s %s %s",16101610- "hv_set_ifconfig", if_filename, nm_filename);16091609+ str_len = snprintf(cmd, sizeof(cmd), "exec %s %s %s",16101610+ KVP_SCRIPTS_PATH "hv_set_ifconfig",16111611+ if_filename, nm_filename);16111612 /*16121613 * This is a little overcautious, but it's necessary to suppress some16131614 * false warnings from gcc 8.0.1.
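The hunks above prepend `exec` to every command handed to `popen()`. `popen(cmd)` runs `/bin/sh -c "cmd"`; without `exec`, that shell forks the command and sits between the daemon and the script until `pclose()`, whereas with `exec` the shell replaces itself with the command. A small sketch in pure shell, standing in for the `sh -c` that `popen()` spawns:

```shell
# Outer shell prints its own PID, then execs an inner shell that
# prints its PID. Because exec replaces the process rather than
# forking, both lines show the same PID.
out=$(sh -c "echo \$\$; exec sh -c 'echo \$\$'")
outer=$(printf '%s\n' "$out" | sed -n 1p)
inner=$(printf '%s\n' "$out" | sed -n 2p)
printf 'outer=%s inner=%s\n' "$outer" "$inner"
```

One fewer intermediate process means `pclose()` reaps the script itself, not a shell wrapping it.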
···33#44# headers_check.pl execute a number of trivial consistency checks55#66-# Usage: headers_check.pl dir arch [files...]66+# Usage: headers_check.pl dir [files...]77# dir: dir to look for included files88-# arch: architecture98# files: list of files to check109#1110# The script reads the supplied files line by line and:···2223use strict;2324use File::Basename;24252525-my ($dir, $arch, @files) = @ARGV;2626+my ($dir, @files) = @ARGV;26272728my $ret = 0;2829my $line;···5354 my $inc = $1;5455 my $found;5556 $found = stat($dir . "/" . $inc);5656- if (!$found) {5757- $inc =~ s#asm/#asm-$arch/#;5858- $found = stat($dir . "/" . $inc);5959- }6057 if (!$found) {6158 printf STDERR "$filename:$lineno: included file '$inc' is not exported\n";6259 $ret = 1;