Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Andrii Nakryiko says:

====================
bpf-next 2022-11-11

We've added 49 non-merge commits during the last 9 day(s) which contain
a total of 68 files changed, 3592 insertions(+), 1371 deletions(-).

The main changes are:

1) Veristat tool improvements to support custom filtering, sorting, and replay
of results, from Andrii Nakryiko.

2) BPF verifier precision tracking fixes and improvements,
from Andrii Nakryiko.

3) Lots of new BPF documentation for various BPF maps, from Dave Tucker,
Donald Hunter, Maryam Tahhan, Bagas Sanjaya.

4) BTF dedup improvements and libbpf's hashmap interface clean ups, from
Eduard Zingerman.

5) Fix veth driver panic if XDP program is attached before veth_open, from
John Fastabend.

6) BPF verifier clean ups and fixes in preparation for follow up features,
from Kumar Kartikeya Dwivedi.

7) Add access to hwtstamp field from BPF sockops programs,
from Martin KaFai Lau.

8) Various fixes for BPF selftests and samples, from Artem Savkov,
Domenico Cerasuolo, Kang Minchul, Rong Tao, Yang Jihong.

9) Fix redirection to tunneling device logic, preventing skb->len == 0, from
Stanislav Fomichev.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (49 commits)
selftests/bpf: fix veristat's singular file-or-prog filter
selftests/bpf: Test skops->skb_hwtstamp
selftests/bpf: Fix incorrect ASSERT in the tcp_hdr_options test
bpf: Add hwtstamp field for the sockops prog
selftests/bpf: Fix xdp_synproxy compilation failure in 32-bit arch
bpf, docs: Document BPF_MAP_TYPE_ARRAY
docs/bpf: Document BPF map types QUEUE and STACK
docs/bpf: Document BPF ARRAY_OF_MAPS and HASH_OF_MAPS
docs/bpf: Document BPF_MAP_TYPE_CPUMAP map
docs/bpf: Document BPF_MAP_TYPE_LPM_TRIE map
libbpf: Hashmap.h update to fix build issues using LLVM14
bpf: veth driver panics when xdp prog attached before veth_open
selftests: Fix test group SKIPPED result
selftests/bpf: Tests for btf_dedup_resolve_fwds
libbpf: Resolve unambigous forward declarations
libbpf: Hashmap interface update to allow both long and void* keys/values
samples/bpf: Fix sockex3 error: Missing BPF prog type
selftests/bpf: Fix u32 variable compared with less than zero
Documentation: bpf: Escape underscore in BPF type name prefix
selftests/bpf: Use consistent build-id type for liburandom_read.so
...
====================

Link: https://lore.kernel.org/r/20221111233733.1088228-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+3626 -1405
+44
Documentation/bpf/bpf_design_QA.rst
···
The BTF_ID macro does not cause a function to become part of the ABI
any more than does the EXPORT_SYMBOL_GPL macro.

Q: What is the compatibility story for special BPF types in map values?
-----------------------------------------------------------------------
Q: Users are allowed to embed bpf_spin_lock, bpf_timer fields in their BPF map
values (when using BTF support for BPF maps). This allows the use of helpers
for such objects on these fields inside map values. Users are also allowed to
embed pointers to some kernel types (with __kptr and __kptr_ref BTF tags).
Will the kernel preserve backwards compatibility for these features?

A: It depends. For bpf_spin_lock, bpf_timer: YES, for kptr and everything else:
NO, but see below.

For struct types that have been added already, like bpf_spin_lock and bpf_timer,
the kernel will preserve backwards compatibility, as they are part of UAPI.

For kptrs, they are also part of UAPI, but only with respect to the kptr
mechanism. The types that you can use with a __kptr and __kptr_ref tagged
pointer in your struct are NOT part of the UAPI contract. The supported types
can and will change across kernel releases. However, operations like accessing
kptr fields and the bpf_kptr_xchg() helper will continue to be supported across
kernel releases for the supported types.

For any other supported struct type, unless explicitly stated in this document
and added to the bpf.h UAPI header, such types can and will arbitrarily change
their size, type, and alignment, or any other user visible API or ABI detail
across kernel releases. Users must adapt their BPF programs to such changes to
make sure they continue to work correctly.

NOTE: the BPF subsystem specifically reserves the 'bpf\_' prefix for type
names, in order to introduce more special fields in the future. Hence, user
programs must avoid defining types with the 'bpf\_' prefix so they are not
broken by future releases. In other words, no backwards compatibility is
guaranteed if one uses a type in BTF with the 'bpf\_' prefix.

Q: What is the compatibility story for special BPF types in local kptrs?
------------------------------------------------------------------------
Q: Same as above, but for local kptrs (i.e. pointers to objects allocated using
bpf_obj_new for user defined structures). Will the kernel preserve backwards
compatibility for these features?

A: NO.

Unlike map value types, there are no stability guarantees for this case. The
whole local kptr API itself is unstable (since it is exposed through kfuncs).
+250
Documentation/bpf/map_array.rst
···
.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

================================================
BPF_MAP_TYPE_ARRAY and BPF_MAP_TYPE_PERCPU_ARRAY
================================================

.. note::
   - ``BPF_MAP_TYPE_ARRAY`` was introduced in kernel version 3.19
   - ``BPF_MAP_TYPE_PERCPU_ARRAY`` was introduced in version 4.6

``BPF_MAP_TYPE_ARRAY`` and ``BPF_MAP_TYPE_PERCPU_ARRAY`` provide generic array
storage. The key type is an unsigned 32-bit integer (4 bytes) and the map is
of constant size. The size of the array is defined in ``max_entries`` at
creation time. All array elements are pre-allocated and zero initialized when
created. ``BPF_MAP_TYPE_PERCPU_ARRAY`` uses a different memory region for each
CPU whereas ``BPF_MAP_TYPE_ARRAY`` uses the same memory region. The value
stored can be of any size, however, all array elements are aligned to 8
bytes.

Since kernel 5.5, memory mapping may be enabled for ``BPF_MAP_TYPE_ARRAY`` by
setting the flag ``BPF_F_MMAPABLE``. The map definition is page-aligned and
starts on the first page. Sufficient page-sized and page-aligned blocks of
memory are allocated to store all array values, starting on the second page,
which in some cases will result in over-allocation of memory. The benefit of
using this is increased performance and ease of use since userspace programs
are not required to use helper functions to access and mutate data.

Usage
=====

Kernel BPF
----------

.. c:function::
   void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

Array elements can be retrieved using the ``bpf_map_lookup_elem()`` helper.
This helper returns a pointer to the array element, so to avoid data races
with userspace reading the value, the user must use primitives like
``__sync_fetch_and_add()`` when updating the value in-place.

.. c:function::
   long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags)

Array elements can be updated using the ``bpf_map_update_elem()`` helper.

``bpf_map_update_elem()`` returns 0 on success, or a negative error in case of
failure.

Since the array is of constant size, ``bpf_map_delete_elem()`` is not supported.
To clear an array element, you may use ``bpf_map_update_elem()`` to insert a
zero value at that index.

Per CPU Array
~~~~~~~~~~~~~

Values stored in ``BPF_MAP_TYPE_ARRAY`` can be accessed by multiple programs
across different CPUs. To restrict storage to a single CPU, you may use a
``BPF_MAP_TYPE_PERCPU_ARRAY``.

When using a ``BPF_MAP_TYPE_PERCPU_ARRAY`` the ``bpf_map_update_elem()`` and
``bpf_map_lookup_elem()`` helpers automatically access the slot for the current
CPU.

.. c:function::
   void *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void *key, u32 cpu)

The ``bpf_map_lookup_percpu_elem()`` helper can be used to look up the array
value for a specific CPU. Returns the value on success, or ``NULL`` if no
entry was found or ``cpu`` is invalid.

Concurrency
-----------

Since kernel version 5.1, the BPF infrastructure provides ``struct bpf_spin_lock``
to synchronize access.

Userspace
---------

Access from userspace uses libbpf APIs with the same names as above, with
the map identified by its ``fd``.

Examples
========

Please see the ``tools/testing/selftests/bpf`` directory for functional
examples. The code samples below demonstrate API usage.

Kernel BPF
----------

This snippet shows how to declare an array in a BPF program.

.. code-block:: c

    struct {
            __uint(type, BPF_MAP_TYPE_ARRAY);
            __type(key, u32);
            __type(value, long);
            __uint(max_entries, 256);
    } my_map SEC(".maps");

This example BPF program shows how to access an array element.

.. code-block:: c

    int bpf_prog(struct __sk_buff *skb)
    {
            struct iphdr ip;
            int index;
            long *value;

            if (bpf_skb_load_bytes(skb, ETH_HLEN, &ip, sizeof(ip)) < 0)
                    return 0;

            index = ip.protocol;
            value = bpf_map_lookup_elem(&my_map, &index);
            if (value)
                    __sync_fetch_and_add(value, skb->len);

            return 0;
    }

Userspace
---------

BPF_MAP_TYPE_ARRAY
~~~~~~~~~~~~~~~~~~

This snippet shows how to create an array, using ``bpf_map_create_opts`` to
set flags.

.. code-block:: c

    #include <bpf/libbpf.h>
    #include <bpf/bpf.h>

    int create_array(void)
    {
            int fd;
            LIBBPF_OPTS(bpf_map_create_opts, opts, .map_flags = BPF_F_MMAPABLE);

            fd = bpf_map_create(BPF_MAP_TYPE_ARRAY,
                                "example_array",       /* name */
                                sizeof(__u32),         /* key size */
                                sizeof(long),          /* value size */
                                256,                   /* max entries */
                                &opts);                /* create opts */
            return fd;
    }

This snippet shows how to initialize the elements of an array.

.. code-block:: c

    int initialize_array(int fd)
    {
            __u32 i;
            long value;
            int ret;

            for (i = 0; i < 256; i++) {
                    value = i;
                    ret = bpf_map_update_elem(fd, &i, &value, BPF_ANY);
                    if (ret < 0)
                            return ret;
            }

            return ret;
    }

This snippet shows how to retrieve an element value from an array.

.. code-block:: c

    int lookup(int fd)
    {
            __u32 index = 42;
            long value;
            int ret;

            ret = bpf_map_lookup_elem(fd, &index, &value);
            if (ret < 0)
                    return ret;

            /* use value here */
            assert(value == 42);

            return ret;
    }

BPF_MAP_TYPE_PERCPU_ARRAY
~~~~~~~~~~~~~~~~~~~~~~~~~

This snippet shows how to initialize the elements of a per CPU array.

.. code-block:: c

    int initialize_array(int fd)
    {
            int ncpus = libbpf_num_possible_cpus();
            long values[ncpus];
            __u32 i, j;
            int ret;

            for (i = 0; i < 256; i++) {
                    for (j = 0; j < ncpus; j++)
                            values[j] = i;
                    ret = bpf_map_update_elem(fd, &i, &values, BPF_ANY);
                    if (ret < 0)
                            return ret;
            }

            return ret;
    }

This snippet shows how to access the per CPU elements of an array value.

.. code-block:: c

    int lookup(int fd)
    {
            int ncpus = libbpf_num_possible_cpus();
            __u32 index = 42, j;
            long values[ncpus];
            int ret;

            ret = bpf_map_lookup_elem(fd, &index, &values);
            if (ret < 0)
                    return ret;

            for (j = 0; j < ncpus; j++) {
                    /* Use per CPU value here */
                    assert(values[j] == 42);
            }

            return ret;
    }

Semantics
=========

As shown in the example above, when accessing a ``BPF_MAP_TYPE_PERCPU_ARRAY``
in userspace, each value is an array with ``ncpus`` elements.

When calling ``bpf_map_update_elem()`` the flag ``BPF_NOEXIST`` can not be used
for these maps.
+166
Documentation/bpf/map_cpumap.rst
···
.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

===================
BPF_MAP_TYPE_CPUMAP
===================

.. note::
   - ``BPF_MAP_TYPE_CPUMAP`` was introduced in kernel version 4.15

.. kernel-doc:: kernel/bpf/cpumap.c
   :doc: cpu map

An example use-case for this map type is software based Receive Side Scaling (RSS).

The CPUMAP represents the CPUs in the system indexed as the map-key, and the
map-value is the config setting (per CPUMAP entry). Each CPUMAP entry has a
dedicated kernel thread bound to the given CPU to represent the remote CPU
execution unit.

Starting from Linux kernel version 5.9 the CPUMAP can run a second XDP program
on the remote CPU. This allows an XDP program to split its processing across
multiple CPUs. For example, a scenario where the initial CPU (that sees/receives
the packets) needs to do minimal packet processing and the remote CPU (to which
the packet is directed) can afford to spend more cycles processing the frame. The
initial CPU is where the XDP redirect program is executed. The remote CPU
receives raw ``xdp_frame`` objects.

Usage
=====

Kernel BPF
----------

.. c:function::
   long bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)

Redirect the packet to the endpoint referenced by ``map`` at index ``key``.
For ``BPF_MAP_TYPE_CPUMAP`` this map contains references to CPUs.

The lower two bits of ``flags`` are used as the return code if the map lookup
fails. This is so that the return value can be one of the XDP program return
codes up to ``XDP_TX``, as chosen by the caller.

Userspace
---------

.. note::
   CPUMAP entries can only be updated/looked up/deleted from user space and not
   from an eBPF program. Trying to call these functions from a kernel eBPF
   program will result in the program failing to load and a verifier warning.

.. c:function::
   int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags);

CPU entries can be added or updated using the ``bpf_map_update_elem()``
helper. This helper replaces existing elements atomically. The ``value``
parameter can be ``struct bpf_cpumap_val``.

.. code-block:: c

    struct bpf_cpumap_val {
            __u32 qsize;  /* queue size to remote target CPU */
            union {
                    int   fd; /* prog fd on map write */
                    __u32 id; /* prog id on map read */
            } bpf_prog;
    };

The flags argument can be one of the following:

- BPF_ANY: Create a new element or update an existing element.
- BPF_NOEXIST: Create a new element only if it did not exist.
- BPF_EXIST: Update an existing element.

.. c:function::
   int bpf_map_lookup_elem(int fd, const void *key, void *value);

CPU entries can be retrieved using the ``bpf_map_lookup_elem()`` helper.

.. c:function::
   int bpf_map_delete_elem(int fd, const void *key);

CPU entries can be deleted using the ``bpf_map_delete_elem()`` helper. This
helper will return 0 on success, or a negative error in case of failure.

Examples
========

Kernel
------

The following code snippet shows how to declare a ``BPF_MAP_TYPE_CPUMAP`` called
``cpu_map`` and how to redirect packets to a remote CPU using a round robin scheme.

.. code-block:: c

    struct {
            __uint(type, BPF_MAP_TYPE_CPUMAP);
            __type(key, __u32);
            __type(value, struct bpf_cpumap_val);
            __uint(max_entries, 12);
    } cpu_map SEC(".maps");

    struct {
            __uint(type, BPF_MAP_TYPE_ARRAY);
            __type(key, __u32);
            __type(value, __u32);
            __uint(max_entries, 12);
    } cpus_available SEC(".maps");

    struct {
            __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
            __type(key, __u32);
            __type(value, __u32);
            __uint(max_entries, 1);
    } cpus_iterator SEC(".maps");

    SEC("xdp")
    int xdp_redir_cpu_round_robin(struct xdp_md *ctx)
    {
            __u32 key = 0;
            __u32 cpu_dest = 0;
            __u32 *cpu_selected, *cpu_iterator;
            __u32 cpu_idx;

            cpu_iterator = bpf_map_lookup_elem(&cpus_iterator, &key);
            if (!cpu_iterator)
                    return XDP_ABORTED;
            cpu_idx = *cpu_iterator;

            *cpu_iterator += 1;
            if (*cpu_iterator == bpf_num_possible_cpus())
                    *cpu_iterator = 0;

            cpu_selected = bpf_map_lookup_elem(&cpus_available, &cpu_idx);
            if (!cpu_selected)
                    return XDP_ABORTED;
            cpu_dest = *cpu_selected;

            if (cpu_dest >= bpf_num_possible_cpus())
                    return XDP_ABORTED;

            return bpf_redirect_map(&cpu_map, cpu_dest, 0);
    }

Userspace
---------

The following code snippet shows how to dynamically set the max_entries for a
CPUMAP to the max number of cpus available on the system.

.. code-block:: c

    int set_max_cpu_entries(struct bpf_map *cpu_map)
    {
            if (bpf_map__set_max_entries(cpu_map, libbpf_num_possible_cpus()) < 0) {
                    fprintf(stderr, "Failed to set max entries for cpu_map map: %s",
                            strerror(errno));
                    return -1;
            }
            return 0;
    }

References
==========

- https://developers.redhat.com/blog/2021/05/13/receive-side-scaling-rss-with-ebpf-and-cpumap#redirecting_into_a_cpumap
+181
Documentation/bpf/map_lpm_trie.rst
···
.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

=====================
BPF_MAP_TYPE_LPM_TRIE
=====================

.. note::
   - ``BPF_MAP_TYPE_LPM_TRIE`` was introduced in kernel version 4.11

``BPF_MAP_TYPE_LPM_TRIE`` provides a longest prefix match algorithm that
can be used to match IP addresses to a stored set of prefixes.
Internally, data is stored in an unbalanced trie of nodes that uses
``prefixlen,data`` pairs as its keys. The ``data`` is interpreted in
network byte order, i.e. big endian, so ``data[0]`` stores the most
significant byte.

LPM tries may be created with a maximum prefix length that is a multiple
of 8, in the range from 8 to 2048. The key used for lookup and update
operations is a ``struct bpf_lpm_trie_key``, extended by
``max_prefixlen/8`` bytes.

- For IPv4 addresses the data length is 4 bytes
- For IPv6 addresses the data length is 16 bytes

The value type stored in the LPM trie can be any user defined type.

.. note::
   When creating a map of type ``BPF_MAP_TYPE_LPM_TRIE`` you must set the
   ``BPF_F_NO_PREALLOC`` flag.

Usage
=====

Kernel BPF
----------

.. c:function::
   void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

The longest prefix entry for a given data value can be found using the
``bpf_map_lookup_elem()`` helper. This helper returns a pointer to the
value associated with the longest matching ``key``, or ``NULL`` if no
entry was found.

The ``key`` should have ``prefixlen`` set to ``max_prefixlen`` when
performing longest prefix lookups. For example, when searching for the
longest prefix match for an IPv4 address, ``prefixlen`` should be set to
``32``.

.. c:function::
   long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags)

Prefix entries can be added or updated using the ``bpf_map_update_elem()``
helper. This helper replaces existing elements atomically.

``bpf_map_update_elem()`` returns ``0`` on success, or a negative error in
case of failure.

.. note::
   The flags parameter must be one of BPF_ANY, BPF_NOEXIST or BPF_EXIST,
   but the value is ignored, giving BPF_ANY semantics.

.. c:function::
   long bpf_map_delete_elem(struct bpf_map *map, const void *key)

Prefix entries can be deleted using the ``bpf_map_delete_elem()``
helper. This helper will return 0 on success, or a negative error in case
of failure.

Userspace
---------

Access from userspace uses libbpf APIs with the same names as above, with
the map identified by ``fd``.

.. c:function::
   int bpf_map_get_next_key(int fd, const void *cur_key, void *next_key)

A userspace program can iterate through the entries in an LPM trie using
libbpf's ``bpf_map_get_next_key()`` function. The first key can be
fetched by calling ``bpf_map_get_next_key()`` with ``cur_key`` set to
``NULL``. Subsequent calls will fetch the next key that follows the
current key. ``bpf_map_get_next_key()`` returns ``0`` on success,
``-ENOENT`` if ``cur_key`` is the last key in the trie, or a negative
error in case of failure.

``bpf_map_get_next_key()`` will iterate through the LPM trie elements
from leftmost leaf first. This means that iteration will return more
specific keys before less specific ones.

Examples
========

Please see ``tools/testing/selftests/bpf/test_lpm_map.c`` for examples
of LPM trie usage from userspace. The code snippets below demonstrate
API usage.

Kernel BPF
----------

The following BPF code snippet shows how to declare a new LPM trie for IPv4
address prefixes:

.. code-block:: c

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    struct ipv4_lpm_key {
            __u32 prefixlen;
            __u32 data;
    };

    struct {
            __uint(type, BPF_MAP_TYPE_LPM_TRIE);
            __type(key, struct ipv4_lpm_key);
            __type(value, __u32);
            __uint(map_flags, BPF_F_NO_PREALLOC);
            __uint(max_entries, 255);
    } ipv4_lpm_map SEC(".maps");

The following BPF code snippet shows how to look up by IPv4 address:

.. code-block:: c

    void *lookup(__u32 ipaddr)
    {
            struct ipv4_lpm_key key = {
                    .prefixlen = 32,
                    .data = ipaddr
            };

            return bpf_map_lookup_elem(&ipv4_lpm_map, &key);
    }

Userspace
---------

The following snippet shows how to insert an IPv4 prefix entry into an
LPM trie:

.. code-block:: c

    int add_prefix_entry(int lpm_fd, __u32 addr, __u32 prefixlen, struct value *value)
    {
            struct ipv4_lpm_key ipv4_key = {
                    .prefixlen = prefixlen,
                    .data = addr
            };
            return bpf_map_update_elem(lpm_fd, &ipv4_key, value, BPF_ANY);
    }

The following snippet shows a userspace program walking through the entries
of an LPM trie:

.. code-block:: c

    #include <bpf/libbpf.h>
    #include <bpf/bpf.h>

    void iterate_lpm_trie(int map_fd)
    {
            struct ipv4_lpm_key *cur_key = NULL;
            struct ipv4_lpm_key next_key;
            struct value value;
            int err;

            for (;;) {
                    err = bpf_map_get_next_key(map_fd, cur_key, &next_key);
                    if (err)
                            break;

                    bpf_map_lookup_elem(map_fd, &next_key, &value);

                    /* Use key and value here */

                    cur_key = &next_key;
            }
    }
+126
Documentation/bpf/map_of_maps.rst
···
.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

========================================================
BPF_MAP_TYPE_ARRAY_OF_MAPS and BPF_MAP_TYPE_HASH_OF_MAPS
========================================================

.. note::
   - ``BPF_MAP_TYPE_ARRAY_OF_MAPS`` and ``BPF_MAP_TYPE_HASH_OF_MAPS`` were
     introduced in kernel version 4.12

``BPF_MAP_TYPE_ARRAY_OF_MAPS`` and ``BPF_MAP_TYPE_HASH_OF_MAPS`` provide general
purpose support for map in map storage. One level of nesting is supported, where
an outer map contains instances of a single type of inner map, for example
``array_of_maps->sock_map``.

When creating an outer map, an inner map instance is used to initialize the
metadata that the outer map holds about its inner maps. This inner map has a
separate lifetime from the outer map and can be deleted after the outer map has
been created.

The outer map supports element lookup, update and delete from user space using
the syscall API. A BPF program is only allowed to do element lookup in the outer
map.

.. note::
   - Multi-level nesting is not supported.
   - Any BPF map type can be used as an inner map, except for
     ``BPF_MAP_TYPE_PROG_ARRAY``.
   - A BPF program cannot update or delete outer map entries.

For ``BPF_MAP_TYPE_ARRAY_OF_MAPS`` the key is an unsigned 32-bit integer index
into the array. The array is a fixed size with ``max_entries`` elements that are
zero initialized when created.

For ``BPF_MAP_TYPE_HASH_OF_MAPS`` the key type can be chosen when defining the
map. The kernel is responsible for allocating and freeing key/value pairs, up to
the max_entries limit that you specify. Hash maps use pre-allocation of hash
table elements by default. The ``BPF_F_NO_PREALLOC`` flag can be used to disable
pre-allocation when it is too memory expensive.

Usage
=====

Kernel BPF Helper
-----------------

.. c:function::
   void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

Inner maps can be retrieved using the ``bpf_map_lookup_elem()`` helper. This
helper returns a pointer to the inner map, or ``NULL`` if no entry was found.

Examples
========

Kernel BPF Example
------------------

This snippet shows how to create and initialise an array of devmaps in a BPF
program. Note that the outer array can only be modified from user space using
the syscall API.

.. code-block:: c

    struct inner_map {
            __uint(type, BPF_MAP_TYPE_DEVMAP);
            __uint(max_entries, 10);
            __type(key, __u32);
            __type(value, __u32);
    } inner_map1 SEC(".maps"), inner_map2 SEC(".maps");

    struct {
            __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
            __uint(max_entries, 2);
            __type(key, __u32);
            __array(values, struct inner_map);
    } outer_map SEC(".maps") = {
            .values = { &inner_map1,
                        &inner_map2 }
    };

See ``progs/test_btf_map_in_map.c`` in ``tools/testing/selftests/bpf`` for more
examples of declarative initialisation of outer maps.

User Space
----------

This snippet shows how to create an array based outer map:

.. code-block:: c

    int create_outer_array(int inner_fd)
    {
            LIBBPF_OPTS(bpf_map_create_opts, opts, .inner_map_fd = inner_fd);
            int fd;

            fd = bpf_map_create(BPF_MAP_TYPE_ARRAY_OF_MAPS,
                                "example_array",       /* name */
                                sizeof(__u32),         /* key size */
                                sizeof(__u32),         /* value size */
                                256,                   /* max entries */
                                &opts);                /* create opts */
            return fd;
    }

This snippet shows how to add an inner map to an outer map:

.. code-block:: c

    int add_devmap(int outer_fd, int index, const char *name)
    {
            int fd;

            fd = bpf_map_create(BPF_MAP_TYPE_DEVMAP, name,
                                sizeof(__u32), sizeof(__u32), 256, NULL);
            if (fd < 0)
                    return fd;

            return bpf_map_update_elem(outer_fd, &index, &fd, BPF_ANY);
    }

References
==========

- https://lore.kernel.org/netdev/20170322170035.923581-3-kafai@fb.com/
- https://lore.kernel.org/netdev/20170322170035.923581-4-kafai@fb.com/
+122
Documentation/bpf/map_queue_stack.rst
···
.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

=========================================
BPF_MAP_TYPE_QUEUE and BPF_MAP_TYPE_STACK
=========================================

.. note::
   - ``BPF_MAP_TYPE_QUEUE`` and ``BPF_MAP_TYPE_STACK`` were introduced
     in kernel version 4.20

``BPF_MAP_TYPE_QUEUE`` provides FIFO storage and ``BPF_MAP_TYPE_STACK``
provides LIFO storage for BPF programs. These maps support peek, pop and
push operations that are exposed to BPF programs through the respective
helpers. These operations are exposed to userspace applications using
the existing ``bpf`` syscall in the following way:

- ``BPF_MAP_LOOKUP_ELEM`` -> peek
- ``BPF_MAP_LOOKUP_AND_DELETE_ELEM`` -> pop
- ``BPF_MAP_UPDATE_ELEM`` -> push

``BPF_MAP_TYPE_QUEUE`` and ``BPF_MAP_TYPE_STACK`` do not support
``BPF_F_NO_PREALLOC``.

Usage
=====

Kernel BPF
----------

.. c:function::
   long bpf_map_push_elem(struct bpf_map *map, const void *value, u64 flags)

An element ``value`` can be added to a queue or stack using the
``bpf_map_push_elem`` helper. The ``flags`` parameter must be set to
``BPF_ANY`` or ``BPF_EXIST``. If ``flags`` is set to ``BPF_EXIST`` then,
when the queue or stack is full, the oldest element will be removed to
make room for ``value`` to be added. Returns ``0`` on success, or a
negative error in case of failure.

.. c:function::
   long bpf_map_peek_elem(struct bpf_map *map, void *value)

This helper fetches an element ``value`` from a queue or stack without
removing it. Returns ``0`` on success, or a negative error in case of
failure.

.. c:function::
   long bpf_map_pop_elem(struct bpf_map *map, void *value)

This helper removes an element from a queue or stack, storing it in
``value``. Returns ``0`` on success, or a negative error in case of
failure.

Userspace
---------

.. c:function::
   int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags)

A userspace program can push ``value`` onto a queue or stack using libbpf's
``bpf_map_update_elem`` function. The ``key`` parameter must be set to
``NULL`` and ``flags`` must be set to ``BPF_ANY`` or ``BPF_EXIST``, with the
same semantics as the ``bpf_map_push_elem`` kernel helper. Returns ``0`` on
success, or a negative error in case of failure.

.. c:function::
   int bpf_map_lookup_elem(int fd, const void *key, void *value)

A userspace program can peek at the ``value`` at the head of a queue or stack
using the libbpf ``bpf_map_lookup_elem`` function. The ``key`` parameter must be
set to ``NULL``. Returns ``0`` on success, or a negative error in case of
failure.

.. c:function::
   int bpf_map_lookup_and_delete_elem(int fd, const void *key, void *value)

A userspace program can pop a ``value`` from the head of a queue or stack using
the libbpf ``bpf_map_lookup_and_delete_elem`` function. The ``key`` parameter
must be set to ``NULL``. Returns ``0`` on success, or a negative error in case
of failure.

Examples
========

Kernel BPF
----------

This snippet shows how to declare a queue in a BPF program:

.. code-block:: c

    struct {
            __uint(type, BPF_MAP_TYPE_QUEUE);
            __type(value, __u32);
            __uint(max_entries, 10);
    } queue SEC(".maps");

Userspace
---------

This snippet shows how to use libbpf's low-level API to create a queue from
userspace:

.. code-block:: c

    int create_queue(void)
    {
            return bpf_map_create(BPF_MAP_TYPE_QUEUE,
                                  "sample_queue",      /* name */
                                  0,                   /* key size, must be zero */
                                  sizeof(__u32),       /* value size */
                                  10,                  /* max entries */
                                  NULL);               /* create options */
    }

References
==========

https://lwn.net/ml/netdev/153986858555.9127.14517764371945179514.stgit@kernel/
+1 -1
drivers/net/veth.c
··· 1125 1125 int err, i; 1126 1126 1127 1127 rq = &priv->rq[0]; 1128 - napi_already_on = (dev->flags & IFF_UP) && rcu_access_pointer(rq->napi); 1128 + napi_already_on = rcu_access_pointer(rq->napi); 1129 1129 1130 1130 if (!xdp_rxq_info_is_reg(&priv->rq[0].xdp_rxq)) { 1131 1131 err = veth_enable_xdp_range(dev, 0, dev->real_num_rx_queues, napi_already_on);
+117 -62
include/linux/bpf.h
··· 165 165 }; 166 166 167 167 enum { 168 - /* Support at most 8 pointers in a BPF map value */ 169 - BPF_MAP_VALUE_OFF_MAX = 8, 170 - BPF_MAP_OFF_ARR_MAX = BPF_MAP_VALUE_OFF_MAX + 171 - 1 + /* for bpf_spin_lock */ 172 - 1, /* for bpf_timer */ 168 + /* Support at most 8 pointers in a BTF type */ 169 + BTF_FIELDS_MAX = 10, 170 + BPF_MAP_OFF_ARR_MAX = BTF_FIELDS_MAX, 173 171 }; 174 172 175 - enum bpf_kptr_type { 176 - BPF_KPTR_UNREF, 177 - BPF_KPTR_REF, 173 + enum btf_field_type { 174 + BPF_SPIN_LOCK = (1 << 0), 175 + BPF_TIMER = (1 << 1), 176 + BPF_KPTR_UNREF = (1 << 2), 177 + BPF_KPTR_REF = (1 << 3), 178 + BPF_KPTR = BPF_KPTR_UNREF | BPF_KPTR_REF, 178 179 }; 179 180 180 - struct bpf_map_value_off_desc { 181 + struct btf_field_kptr { 182 + struct btf *btf; 183 + struct module *module; 184 + btf_dtor_kfunc_t dtor; 185 + u32 btf_id; 186 + }; 187 + 188 + struct btf_field { 181 189 u32 offset; 182 - enum bpf_kptr_type type; 183 - struct { 184 - struct btf *btf; 185 - struct module *module; 186 - btf_dtor_kfunc_t dtor; 187 - u32 btf_id; 188 - } kptr; 190 + enum btf_field_type type; 191 + union { 192 + struct btf_field_kptr kptr; 193 + }; 189 194 }; 190 195 191 - struct bpf_map_value_off { 192 - u32 nr_off; 193 - struct bpf_map_value_off_desc off[]; 196 + struct btf_record { 197 + u32 cnt; 198 + u32 field_mask; 199 + int spin_lock_off; 200 + int timer_off; 201 + struct btf_field fields[]; 194 202 }; 195 203 196 - struct bpf_map_off_arr { 204 + struct btf_field_offs { 197 205 u32 cnt; 198 206 u32 field_off[BPF_MAP_OFF_ARR_MAX]; 199 207 u8 field_sz[BPF_MAP_OFF_ARR_MAX]; ··· 222 214 u32 max_entries; 223 215 u64 map_extra; /* any per-map-type extra fields */ 224 216 u32 map_flags; 225 - int spin_lock_off; /* >=0 valid offset, <0 error */ 226 - struct bpf_map_value_off *kptr_off_tab; 227 - int timer_off; /* >=0 valid offset, <0 error */ 228 217 u32 id; 218 + struct btf_record *record; 229 219 int numa_node; 230 220 u32 btf_key_type_id; 231 221 u32 btf_value_type_id; ··· 233 
227 struct obj_cgroup *objcg; 234 228 #endif 235 229 char name[BPF_OBJ_NAME_LEN]; 236 - struct bpf_map_off_arr *off_arr; 230 + struct btf_field_offs *field_offs; 237 231 /* The 3rd and 4th cacheline with misc members to avoid false sharing 238 232 * particularly with refcounting. 239 233 */ ··· 257 251 bool frozen; /* write-once; write-protected by freeze_mutex */ 258 252 }; 259 253 260 - static inline bool map_value_has_spin_lock(const struct bpf_map *map) 254 + static inline const char *btf_field_type_name(enum btf_field_type type) 261 255 { 262 - return map->spin_lock_off >= 0; 256 + switch (type) { 257 + case BPF_SPIN_LOCK: 258 + return "bpf_spin_lock"; 259 + case BPF_TIMER: 260 + return "bpf_timer"; 261 + case BPF_KPTR_UNREF: 262 + case BPF_KPTR_REF: 263 + return "kptr"; 264 + default: 265 + WARN_ON_ONCE(1); 266 + return "unknown"; 267 + } 263 268 } 264 269 265 - static inline bool map_value_has_timer(const struct bpf_map *map) 270 + static inline u32 btf_field_type_size(enum btf_field_type type) 266 271 { 267 - return map->timer_off >= 0; 272 + switch (type) { 273 + case BPF_SPIN_LOCK: 274 + return sizeof(struct bpf_spin_lock); 275 + case BPF_TIMER: 276 + return sizeof(struct bpf_timer); 277 + case BPF_KPTR_UNREF: 278 + case BPF_KPTR_REF: 279 + return sizeof(u64); 280 + default: 281 + WARN_ON_ONCE(1); 282 + return 0; 283 + } 268 284 } 269 285 270 - static inline bool map_value_has_kptrs(const struct bpf_map *map) 286 + static inline u32 btf_field_type_align(enum btf_field_type type) 271 287 { 272 - return !IS_ERR_OR_NULL(map->kptr_off_tab); 288 + switch (type) { 289 + case BPF_SPIN_LOCK: 290 + return __alignof__(struct bpf_spin_lock); 291 + case BPF_TIMER: 292 + return __alignof__(struct bpf_timer); 293 + case BPF_KPTR_UNREF: 294 + case BPF_KPTR_REF: 295 + return __alignof__(u64); 296 + default: 297 + WARN_ON_ONCE(1); 298 + return 0; 299 + } 300 + } 301 + 302 + static inline bool btf_record_has_field(const struct btf_record *rec, enum btf_field_type type) 303 
+ { 304 + if (IS_ERR_OR_NULL(rec)) 305 + return false; 306 + return rec->field_mask & type; 273 307 } 274 308 275 309 static inline void check_and_init_map_value(struct bpf_map *map, void *dst) 276 310 { 277 - if (unlikely(map_value_has_spin_lock(map))) 278 - memset(dst + map->spin_lock_off, 0, sizeof(struct bpf_spin_lock)); 279 - if (unlikely(map_value_has_timer(map))) 280 - memset(dst + map->timer_off, 0, sizeof(struct bpf_timer)); 281 - if (unlikely(map_value_has_kptrs(map))) { 282 - struct bpf_map_value_off *tab = map->kptr_off_tab; 311 + if (!IS_ERR_OR_NULL(map->record)) { 312 + struct btf_field *fields = map->record->fields; 313 + u32 cnt = map->record->cnt; 283 314 int i; 284 315 285 - for (i = 0; i < tab->nr_off; i++) 286 - *(u64 *)(dst + tab->off[i].offset) = 0; 316 + for (i = 0; i < cnt; i++) 317 + memset(dst + fields[i].offset, 0, btf_field_type_size(fields[i].type)); 287 318 } 288 319 } 289 320 ··· 341 298 } 342 299 343 300 /* copy everything but bpf_spin_lock, bpf_timer, and kptrs. There could be one of each. 
*/ 344 - static inline void __copy_map_value(struct bpf_map *map, void *dst, void *src, bool long_memcpy)
301 + static inline void bpf_obj_memcpy(struct btf_field_offs *foffs,
302 + void *dst, void *src, u32 size,
303 + bool long_memcpy)
345 304 {
346 305 u32 curr_off = 0;
347 306 int i;
348 307
349 - if (likely(!map->off_arr)) {
308 + if (likely(!foffs)) {
350 309 if (long_memcpy)
351 - bpf_long_memcpy(dst, src, round_up(map->value_size, 8));
310 + bpf_long_memcpy(dst, src, round_up(size, 8));
352 311 else
353 - memcpy(dst, src, map->value_size);
312 + memcpy(dst, src, size);
354 313 return;
355 314 }
356 315
357 - for (i = 0; i < map->off_arr->cnt; i++) {
358 - u32 next_off = map->off_arr->field_off[i];
316 + for (i = 0; i < foffs->cnt; i++) {
317 + u32 next_off = foffs->field_off[i];
318 + u32 sz = next_off - curr_off;
359 319
360 - memcpy(dst + curr_off, src + curr_off, next_off - curr_off);
361 - curr_off = next_off + map->off_arr->field_sz[i];
320 + memcpy(dst + curr_off, src + curr_off, sz);
321 + curr_off += foffs->field_sz[i] + sz;
362 322 }
363 - memcpy(dst + curr_off, src + curr_off, map->value_size - curr_off);
323 + memcpy(dst + curr_off, src + curr_off, size - curr_off);
364 324 }
365 325
366 326 static inline void copy_map_value(struct bpf_map *map, void *dst, void *src)
367 327 {
368 - __copy_map_value(map, dst, src, false);
328 + bpf_obj_memcpy(map->field_offs, dst, src, map->value_size, false);
369 329 }
370 330
371 331 static inline void copy_map_value_long(struct bpf_map *map, void *dst, void *src)
372 332 {
373 - __copy_map_value(map, dst, src, true);
333 + bpf_obj_memcpy(map->field_offs, dst, src, map->value_size, true);
374 334 }
375 335
376 - static inline void zero_map_value(struct bpf_map *map, void *dst)
336 + static inline void bpf_obj_memzero(struct btf_field_offs *foffs, void *dst, u32 size)
377 337 {
378 338 u32 curr_off = 0;
379 339 int i;
380 340
381 - if (likely(!map->off_arr)) {
382 - memset(dst, 0, map->value_size);
341 + if (likely(!foffs)) {
342 + memset(dst, 0, size);
383 343 return;
384 344 }
385 345
386 - for (i = 0; i < map->off_arr->cnt; i++) {
387 - u32 next_off = map->off_arr->field_off[i];
346 + for (i = 0; i < foffs->cnt; i++) {
347 + u32 next_off = foffs->field_off[i];
348 + u32 sz = next_off - curr_off;
388 349
389 - memset(dst + curr_off, 0, next_off - curr_off);
390 - curr_off = next_off + map->off_arr->field_sz[i];
350 + memset(dst + curr_off, 0, sz);
351 + curr_off += foffs->field_sz[i] + sz;
391 352 }
392 - memset(dst + curr_off, 0, map->value_size - curr_off);
353 + memset(dst + curr_off, 0, size - curr_off);
354 + }
355 +
356 + static inline void zero_map_value(struct bpf_map *map, void *dst)
357 + {
358 + bpf_obj_memzero(map->field_offs, dst, map->value_size);
393 359 }
394 360
395 361 void copy_map_value_locked(struct bpf_map *map, void *dst, void *src,
··· 1751 1699 void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock);
1752 1700 void bpf_map_free_id(struct bpf_map *map, bool do_idr_lock);
1753 1701
1754 - struct bpf_map_value_off_desc *bpf_map_kptr_off_contains(struct bpf_map *map, u32 offset);
1755 - void bpf_map_free_kptr_off_tab(struct bpf_map *map);
1756 - struct bpf_map_value_off *bpf_map_copy_kptr_off_tab(const struct bpf_map *map);
1757 - bool bpf_map_equal_kptr_off_tab(const struct bpf_map *map_a, const struct bpf_map *map_b);
1758 - void bpf_map_free_kptrs(struct bpf_map *map, void *map_value);
1702 + struct btf_field *btf_record_find(const struct btf_record *rec,
1703 + u32 offset, enum btf_field_type type);
1704 + void btf_record_free(struct btf_record *rec);
1705 + void bpf_map_free_record(struct bpf_map *map);
1706 + struct btf_record *btf_record_dup(const struct btf_record *rec);
1707 + bool btf_record_equal(const struct btf_record *rec_a, const struct btf_record *rec_b);
1708 + void bpf_obj_free_timer(const struct btf_record *rec, void *obj);
1709 + void bpf_obj_free_fields(const struct btf_record *rec, void *obj);
1759 1710
1760 1711 struct bpf_map *bpf_map_get(u32 ufd);
1761 1712 struct bpf_map *bpf_map_get_with_uref(u32 ufd);
+8 -2
include/linux/btf.h
··· 163 163 u32 expected_offset, u32 expected_size); 164 164 int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t); 165 165 int btf_find_timer(const struct btf *btf, const struct btf_type *t); 166 - struct bpf_map_value_off *btf_parse_kptrs(const struct btf *btf, 167 - const struct btf_type *t); 166 + struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type *t, 167 + u32 field_mask, u32 value_size); 168 + struct btf_field_offs *btf_parse_field_offs(struct btf_record *rec); 168 169 bool btf_type_is_void(const struct btf_type *t); 169 170 s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind); 170 171 const struct btf_type *btf_type_skip_modifiers(const struct btf *btf, ··· 287 286 static inline bool btf_type_is_typedef(const struct btf_type *t) 288 287 { 289 288 return BTF_INFO_KIND(t->info) == BTF_KIND_TYPEDEF; 289 + } 290 + 291 + static inline bool btf_type_is_volatile(const struct btf_type *t) 292 + { 293 + return BTF_INFO_KIND(t->info) == BTF_KIND_VOLATILE; 290 294 } 291 295 292 296 static inline bool btf_type_is_func(const struct btf_type *t)
+1
include/uapi/linux/bpf.h
··· 6445 6445 * the outgoing header has not 6446 6446 * been written yet. 6447 6447 */ 6448 + __u64 skb_hwtstamp; 6448 6449 }; 6449 6450 6450 6451 /* Definitions for bpf_sock_ops_cb_flags */
+11 -19
kernel/bpf/arraymap.c
··· 306 306 return 0; 307 307 } 308 308 309 - static void check_and_free_fields(struct bpf_array *arr, void *val) 310 - { 311 - if (map_value_has_timer(&arr->map)) 312 - bpf_timer_cancel_and_free(val + arr->map.timer_off); 313 - if (map_value_has_kptrs(&arr->map)) 314 - bpf_map_free_kptrs(&arr->map, val); 315 - } 316 - 317 309 /* Called from syscall or from eBPF program */ 318 310 static int array_map_update_elem(struct bpf_map *map, void *key, void *value, 319 311 u64 map_flags) ··· 327 335 return -EEXIST; 328 336 329 337 if (unlikely((map_flags & BPF_F_LOCK) && 330 - !map_value_has_spin_lock(map))) 338 + !btf_record_has_field(map->record, BPF_SPIN_LOCK))) 331 339 return -EINVAL; 332 340 333 341 if (array->map.map_type == BPF_MAP_TYPE_PERCPU_ARRAY) { 334 342 val = this_cpu_ptr(array->pptrs[index & array->index_mask]); 335 343 copy_map_value(map, val, value); 336 - check_and_free_fields(array, val); 344 + bpf_obj_free_fields(array->map.record, val); 337 345 } else { 338 346 val = array->value + 339 347 (u64)array->elem_size * (index & array->index_mask); ··· 341 349 copy_map_value_locked(map, val, value, false); 342 350 else 343 351 copy_map_value(map, val, value); 344 - check_and_free_fields(array, val); 352 + bpf_obj_free_fields(array->map.record, val); 345 353 } 346 354 return 0; 347 355 } ··· 378 386 pptr = array->pptrs[index & array->index_mask]; 379 387 for_each_possible_cpu(cpu) { 380 388 copy_map_value_long(map, per_cpu_ptr(pptr, cpu), value + off); 381 - check_and_free_fields(array, per_cpu_ptr(pptr, cpu)); 389 + bpf_obj_free_fields(array->map.record, per_cpu_ptr(pptr, cpu)); 382 390 off += size; 383 391 } 384 392 rcu_read_unlock(); ··· 401 409 struct bpf_array *array = container_of(map, struct bpf_array, map); 402 410 int i; 403 411 404 - /* We don't reset or free kptr on uref dropping to zero. */ 405 - if (!map_value_has_timer(map)) 412 + /* We don't reset or free fields other than timer on uref dropping to zero. 
*/ 413 + if (!btf_record_has_field(map->record, BPF_TIMER)) 406 414 return; 407 415 408 416 for (i = 0; i < array->map.max_entries; i++) 409 - bpf_timer_cancel_and_free(array_map_elem_ptr(array, i) + map->timer_off); 417 + bpf_obj_free_timer(map->record, array_map_elem_ptr(array, i)); 410 418 } 411 419 412 420 /* Called when map->refcnt goes to zero, either from workqueue or from syscall */ ··· 415 423 struct bpf_array *array = container_of(map, struct bpf_array, map); 416 424 int i; 417 425 418 - if (map_value_has_kptrs(map)) { 426 + if (!IS_ERR_OR_NULL(map->record)) { 419 427 if (array->map.map_type == BPF_MAP_TYPE_PERCPU_ARRAY) { 420 428 for (i = 0; i < array->map.max_entries; i++) { 421 429 void __percpu *pptr = array->pptrs[i & array->index_mask]; 422 430 int cpu; 423 431 424 432 for_each_possible_cpu(cpu) { 425 - bpf_map_free_kptrs(map, per_cpu_ptr(pptr, cpu)); 433 + bpf_obj_free_fields(map->record, per_cpu_ptr(pptr, cpu)); 426 434 cond_resched(); 427 435 } 428 436 } 429 437 } else { 430 438 for (i = 0; i < array->map.max_entries; i++) 431 - bpf_map_free_kptrs(map, array_map_elem_ptr(array, i)); 439 + bpf_obj_free_fields(map->record, array_map_elem_ptr(array, i)); 432 440 } 433 - bpf_map_free_kptr_off_tab(map); 441 + bpf_map_free_record(map); 434 442 } 435 443 436 444 if (array->map.map_type == BPF_MAP_TYPE_PERCPU_ARRAY)
+1 -1
kernel/bpf/bpf_local_storage.c
··· 382 382 if (unlikely((map_flags & ~BPF_F_LOCK) > BPF_EXIST) || 383 383 /* BPF_F_LOCK can only be used in a value with spin_lock */ 384 384 unlikely((map_flags & BPF_F_LOCK) && 385 - !map_value_has_spin_lock(&smap->map))) 385 + !btf_record_has_field(smap->map.record, BPF_SPIN_LOCK))) 386 386 return ERR_PTR(-EINVAL); 387 387 388 388 if (gfp_flags == GFP_KERNEL && (map_flags & ~BPF_F_LOCK) != BPF_NOEXIST)
+259 -175
kernel/bpf/btf.c
··· 3191 3191 btf_verifier_log(env, "size=%u vlen=%u", t->size, btf_type_vlen(t)); 3192 3192 } 3193 3193 3194 - enum btf_field_type { 3194 + enum btf_field_info_type { 3195 3195 BTF_FIELD_SPIN_LOCK, 3196 3196 BTF_FIELD_TIMER, 3197 3197 BTF_FIELD_KPTR, ··· 3203 3203 }; 3204 3204 3205 3205 struct btf_field_info { 3206 - u32 type_id; 3206 + enum btf_field_type type; 3207 3207 u32 off; 3208 - enum bpf_kptr_type type; 3208 + struct { 3209 + u32 type_id; 3210 + } kptr; 3209 3211 }; 3210 3212 3211 3213 static int btf_find_struct(const struct btf *btf, const struct btf_type *t, 3212 - u32 off, int sz, struct btf_field_info *info) 3214 + u32 off, int sz, enum btf_field_type field_type, 3215 + struct btf_field_info *info) 3213 3216 { 3214 3217 if (!__btf_type_is_struct(t)) 3215 3218 return BTF_FIELD_IGNORE; 3216 3219 if (t->size != sz) 3217 3220 return BTF_FIELD_IGNORE; 3221 + info->type = field_type; 3218 3222 info->off = off; 3219 3223 return BTF_FIELD_FOUND; 3220 3224 } ··· 3226 3222 static int btf_find_kptr(const struct btf *btf, const struct btf_type *t, 3227 3223 u32 off, int sz, struct btf_field_info *info) 3228 3224 { 3229 - enum bpf_kptr_type type; 3225 + enum btf_field_type type; 3230 3226 u32 res_id; 3231 3227 3228 + /* Permit modifiers on the pointer itself */ 3229 + if (btf_type_is_volatile(t)) 3230 + t = btf_type_by_id(btf, t->type); 3232 3231 /* For PTR, sz is always == 8 */ 3233 3232 if (!btf_type_is_ptr(t)) 3234 3233 return BTF_FIELD_IGNORE; ··· 3255 3248 if (!__btf_type_is_struct(t)) 3256 3249 return -EINVAL; 3257 3250 3258 - info->type_id = res_id; 3259 - info->off = off; 3260 3251 info->type = type; 3252 + info->off = off; 3253 + info->kptr.type_id = res_id; 3261 3254 return BTF_FIELD_FOUND; 3262 3255 } 3263 3256 3264 - static int btf_find_struct_field(const struct btf *btf, const struct btf_type *t, 3265 - const char *name, int sz, int align, 3266 - enum btf_field_type field_type, 3257 + static int btf_get_field_type(const char *name, u32 field_mask, u32 
*seen_mask, 3258 + int *align, int *sz) 3259 + { 3260 + int type = 0; 3261 + 3262 + if (field_mask & BPF_SPIN_LOCK) { 3263 + if (!strcmp(name, "bpf_spin_lock")) { 3264 + if (*seen_mask & BPF_SPIN_LOCK) 3265 + return -E2BIG; 3266 + *seen_mask |= BPF_SPIN_LOCK; 3267 + type = BPF_SPIN_LOCK; 3268 + goto end; 3269 + } 3270 + } 3271 + if (field_mask & BPF_TIMER) { 3272 + if (!strcmp(name, "bpf_timer")) { 3273 + if (*seen_mask & BPF_TIMER) 3274 + return -E2BIG; 3275 + *seen_mask |= BPF_TIMER; 3276 + type = BPF_TIMER; 3277 + goto end; 3278 + } 3279 + } 3280 + /* Only return BPF_KPTR when all other types with matchable names fail */ 3281 + if (field_mask & BPF_KPTR) { 3282 + type = BPF_KPTR_REF; 3283 + goto end; 3284 + } 3285 + return 0; 3286 + end: 3287 + *sz = btf_field_type_size(type); 3288 + *align = btf_field_type_align(type); 3289 + return type; 3290 + } 3291 + 3292 + static int btf_find_struct_field(const struct btf *btf, 3293 + const struct btf_type *t, u32 field_mask, 3267 3294 struct btf_field_info *info, int info_cnt) 3268 3295 { 3296 + int ret, idx = 0, align, sz, field_type; 3269 3297 const struct btf_member *member; 3270 3298 struct btf_field_info tmp; 3271 - int ret, idx = 0; 3272 - u32 i, off; 3299 + u32 i, off, seen_mask = 0; 3273 3300 3274 3301 for_each_member(i, t, member) { 3275 3302 const struct btf_type *member_type = btf_type_by_id(btf, 3276 3303 member->type); 3277 3304 3278 - if (name && strcmp(__btf_name_by_offset(btf, member_type->name_off), name)) 3305 + field_type = btf_get_field_type(__btf_name_by_offset(btf, member_type->name_off), 3306 + field_mask, &seen_mask, &align, &sz); 3307 + if (field_type == 0) 3279 3308 continue; 3309 + if (field_type < 0) 3310 + return field_type; 3280 3311 3281 3312 off = __btf_member_bit_offset(t, member); 3282 3313 if (off % 8) ··· 3322 3277 return -EINVAL; 3323 3278 off /= 8; 3324 3279 if (off % align) 3325 - return -EINVAL; 3280 + continue; 3326 3281 3327 3282 switch (field_type) { 3328 - case 
BTF_FIELD_SPIN_LOCK: 3329 - case BTF_FIELD_TIMER: 3330 - ret = btf_find_struct(btf, member_type, off, sz, 3283 + case BPF_SPIN_LOCK: 3284 + case BPF_TIMER: 3285 + ret = btf_find_struct(btf, member_type, off, sz, field_type, 3331 3286 idx < info_cnt ? &info[idx] : &tmp); 3332 3287 if (ret < 0) 3333 3288 return ret; 3334 3289 break; 3335 - case BTF_FIELD_KPTR: 3290 + case BPF_KPTR_UNREF: 3291 + case BPF_KPTR_REF: 3336 3292 ret = btf_find_kptr(btf, member_type, off, sz, 3337 3293 idx < info_cnt ? &info[idx] : &tmp); 3338 3294 if (ret < 0) ··· 3353 3307 } 3354 3308 3355 3309 static int btf_find_datasec_var(const struct btf *btf, const struct btf_type *t, 3356 - const char *name, int sz, int align, 3357 - enum btf_field_type field_type, 3358 - struct btf_field_info *info, int info_cnt) 3310 + u32 field_mask, struct btf_field_info *info, 3311 + int info_cnt) 3359 3312 { 3313 + int ret, idx = 0, align, sz, field_type; 3360 3314 const struct btf_var_secinfo *vsi; 3361 3315 struct btf_field_info tmp; 3362 - int ret, idx = 0; 3363 - u32 i, off; 3316 + u32 i, off, seen_mask = 0; 3364 3317 3365 3318 for_each_vsi(i, t, vsi) { 3366 3319 const struct btf_type *var = btf_type_by_id(btf, vsi->type); 3367 3320 const struct btf_type *var_type = btf_type_by_id(btf, var->type); 3368 3321 3369 - off = vsi->offset; 3370 - 3371 - if (name && strcmp(__btf_name_by_offset(btf, var_type->name_off), name)) 3322 + field_type = btf_get_field_type(__btf_name_by_offset(btf, var_type->name_off), 3323 + field_mask, &seen_mask, &align, &sz); 3324 + if (field_type == 0) 3372 3325 continue; 3326 + if (field_type < 0) 3327 + return field_type; 3328 + 3329 + off = vsi->offset; 3373 3330 if (vsi->size != sz) 3374 3331 continue; 3375 3332 if (off % align) 3376 - return -EINVAL; 3333 + continue; 3377 3334 3378 3335 switch (field_type) { 3379 - case BTF_FIELD_SPIN_LOCK: 3380 - case BTF_FIELD_TIMER: 3381 - ret = btf_find_struct(btf, var_type, off, sz, 3336 + case BPF_SPIN_LOCK: 3337 + case BPF_TIMER: 3338 + 
ret = btf_find_struct(btf, var_type, off, sz, field_type, 3382 3339 idx < info_cnt ? &info[idx] : &tmp); 3383 3340 if (ret < 0) 3384 3341 return ret; 3385 3342 break; 3386 - case BTF_FIELD_KPTR: 3343 + case BPF_KPTR_UNREF: 3344 + case BPF_KPTR_REF: 3387 3345 ret = btf_find_kptr(btf, var_type, off, sz, 3388 3346 idx < info_cnt ? &info[idx] : &tmp); 3389 3347 if (ret < 0) ··· 3407 3357 } 3408 3358 3409 3359 static int btf_find_field(const struct btf *btf, const struct btf_type *t, 3410 - enum btf_field_type field_type, 3411 - struct btf_field_info *info, int info_cnt) 3360 + u32 field_mask, struct btf_field_info *info, 3361 + int info_cnt) 3412 3362 { 3413 - const char *name; 3414 - int sz, align; 3415 - 3416 - switch (field_type) { 3417 - case BTF_FIELD_SPIN_LOCK: 3418 - name = "bpf_spin_lock"; 3419 - sz = sizeof(struct bpf_spin_lock); 3420 - align = __alignof__(struct bpf_spin_lock); 3421 - break; 3422 - case BTF_FIELD_TIMER: 3423 - name = "bpf_timer"; 3424 - sz = sizeof(struct bpf_timer); 3425 - align = __alignof__(struct bpf_timer); 3426 - break; 3427 - case BTF_FIELD_KPTR: 3428 - name = NULL; 3429 - sz = sizeof(u64); 3430 - align = 8; 3431 - break; 3432 - default: 3433 - return -EFAULT; 3434 - } 3435 - 3436 3363 if (__btf_type_is_struct(t)) 3437 - return btf_find_struct_field(btf, t, name, sz, align, field_type, info, info_cnt); 3364 + return btf_find_struct_field(btf, t, field_mask, info, info_cnt); 3438 3365 else if (btf_type_is_datasec(t)) 3439 - return btf_find_datasec_var(btf, t, name, sz, align, field_type, info, info_cnt); 3366 + return btf_find_datasec_var(btf, t, field_mask, info, info_cnt); 3440 3367 return -EINVAL; 3441 3368 } 3442 3369 3443 - /* find 'struct bpf_spin_lock' in map value. 
3444 - * return >= 0 offset if found 3445 - * and < 0 in case of error 3446 - */ 3447 - int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t) 3370 + static int btf_parse_kptr(const struct btf *btf, struct btf_field *field, 3371 + struct btf_field_info *info) 3448 3372 { 3449 - struct btf_field_info info; 3450 - int ret; 3451 - 3452 - ret = btf_find_field(btf, t, BTF_FIELD_SPIN_LOCK, &info, 1); 3453 - if (ret < 0) 3454 - return ret; 3455 - if (!ret) 3456 - return -ENOENT; 3457 - return info.off; 3458 - } 3459 - 3460 - int btf_find_timer(const struct btf *btf, const struct btf_type *t) 3461 - { 3462 - struct btf_field_info info; 3463 - int ret; 3464 - 3465 - ret = btf_find_field(btf, t, BTF_FIELD_TIMER, &info, 1); 3466 - if (ret < 0) 3467 - return ret; 3468 - if (!ret) 3469 - return -ENOENT; 3470 - return info.off; 3471 - } 3472 - 3473 - struct bpf_map_value_off *btf_parse_kptrs(const struct btf *btf, 3474 - const struct btf_type *t) 3475 - { 3476 - struct btf_field_info info_arr[BPF_MAP_VALUE_OFF_MAX]; 3477 - struct bpf_map_value_off *tab; 3478 - struct btf *kernel_btf = NULL; 3479 3373 struct module *mod = NULL; 3480 - int ret, i, nr_off; 3374 + const struct btf_type *t; 3375 + struct btf *kernel_btf; 3376 + int ret; 3377 + s32 id; 3481 3378 3482 - ret = btf_find_field(btf, t, BTF_FIELD_KPTR, info_arr, ARRAY_SIZE(info_arr)); 3379 + /* Find type in map BTF, and use it to look up the matching type 3380 + * in vmlinux or module BTFs, by name and kind. 3381 + */ 3382 + t = btf_type_by_id(btf, info->kptr.type_id); 3383 + id = bpf_find_btf_id(__btf_name_by_offset(btf, t->name_off), BTF_INFO_KIND(t->info), 3384 + &kernel_btf); 3385 + if (id < 0) 3386 + return id; 3387 + 3388 + /* Find and stash the function pointer for the destruction function that 3389 + * needs to be eventually invoked from the map free path. 
3390 + */ 3391 + if (info->type == BPF_KPTR_REF) { 3392 + const struct btf_type *dtor_func; 3393 + const char *dtor_func_name; 3394 + unsigned long addr; 3395 + s32 dtor_btf_id; 3396 + 3397 + /* This call also serves as a whitelist of allowed objects that 3398 + * can be used as a referenced pointer and be stored in a map at 3399 + * the same time. 3400 + */ 3401 + dtor_btf_id = btf_find_dtor_kfunc(kernel_btf, id); 3402 + if (dtor_btf_id < 0) { 3403 + ret = dtor_btf_id; 3404 + goto end_btf; 3405 + } 3406 + 3407 + dtor_func = btf_type_by_id(kernel_btf, dtor_btf_id); 3408 + if (!dtor_func) { 3409 + ret = -ENOENT; 3410 + goto end_btf; 3411 + } 3412 + 3413 + if (btf_is_module(kernel_btf)) { 3414 + mod = btf_try_get_module(kernel_btf); 3415 + if (!mod) { 3416 + ret = -ENXIO; 3417 + goto end_btf; 3418 + } 3419 + } 3420 + 3421 + /* We already verified dtor_func to be btf_type_is_func 3422 + * in register_btf_id_dtor_kfuncs. 3423 + */ 3424 + dtor_func_name = __btf_name_by_offset(kernel_btf, dtor_func->name_off); 3425 + addr = kallsyms_lookup_name(dtor_func_name); 3426 + if (!addr) { 3427 + ret = -EINVAL; 3428 + goto end_mod; 3429 + } 3430 + field->kptr.dtor = (void *)addr; 3431 + } 3432 + 3433 + field->kptr.btf_id = id; 3434 + field->kptr.btf = kernel_btf; 3435 + field->kptr.module = mod; 3436 + return 0; 3437 + end_mod: 3438 + module_put(mod); 3439 + end_btf: 3440 + btf_put(kernel_btf); 3441 + return ret; 3442 + } 3443 + 3444 + struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type *t, 3445 + u32 field_mask, u32 value_size) 3446 + { 3447 + struct btf_field_info info_arr[BTF_FIELDS_MAX]; 3448 + struct btf_record *rec; 3449 + int ret, i, cnt; 3450 + 3451 + ret = btf_find_field(btf, t, field_mask, info_arr, ARRAY_SIZE(info_arr)); 3483 3452 if (ret < 0) 3484 3453 return ERR_PTR(ret); 3485 3454 if (!ret) 3486 3455 return NULL; 3487 3456 3488 - nr_off = ret; 3489 - tab = kzalloc(offsetof(struct bpf_map_value_off, off[nr_off]), GFP_KERNEL | 
__GFP_NOWARN); 3490 - if (!tab) 3457 + cnt = ret; 3458 + rec = kzalloc(offsetof(struct btf_record, fields[cnt]), GFP_KERNEL | __GFP_NOWARN); 3459 + if (!rec) 3491 3460 return ERR_PTR(-ENOMEM); 3492 3461 3493 - for (i = 0; i < nr_off; i++) { 3494 - const struct btf_type *t; 3495 - s32 id; 3496 - 3497 - /* Find type in map BTF, and use it to look up the matching type 3498 - * in vmlinux or module BTFs, by name and kind. 3499 - */ 3500 - t = btf_type_by_id(btf, info_arr[i].type_id); 3501 - id = bpf_find_btf_id(__btf_name_by_offset(btf, t->name_off), BTF_INFO_KIND(t->info), 3502 - &kernel_btf); 3503 - if (id < 0) { 3504 - ret = id; 3462 + rec->spin_lock_off = -EINVAL; 3463 + rec->timer_off = -EINVAL; 3464 + for (i = 0; i < cnt; i++) { 3465 + if (info_arr[i].off + btf_field_type_size(info_arr[i].type) > value_size) { 3466 + WARN_ONCE(1, "verifier bug off %d size %d", info_arr[i].off, value_size); 3467 + ret = -EFAULT; 3505 3468 goto end; 3506 3469 } 3507 3470 3508 - /* Find and stash the function pointer for the destruction function that 3509 - * needs to be eventually invoked from the map free path. 3510 - */ 3511 - if (info_arr[i].type == BPF_KPTR_REF) { 3512 - const struct btf_type *dtor_func; 3513 - const char *dtor_func_name; 3514 - unsigned long addr; 3515 - s32 dtor_btf_id; 3471 + rec->field_mask |= info_arr[i].type; 3472 + rec->fields[i].offset = info_arr[i].off; 3473 + rec->fields[i].type = info_arr[i].type; 3516 3474 3517 - /* This call also serves as a whitelist of allowed objects that 3518 - * can be used as a referenced pointer and be stored in a map at 3519 - * the same time. 
3520 -          */
3521 -         dtor_btf_id = btf_find_dtor_kfunc(kernel_btf, id);
3522 -         if (dtor_btf_id < 0) {
3523 -                 ret = dtor_btf_id;
3524 -                 goto end_btf;
3525 -         }
3526 -
3527 -         dtor_func = btf_type_by_id(kernel_btf, dtor_btf_id);
3528 -         if (!dtor_func) {
3529 -                 ret = -ENOENT;
3530 -                 goto end_btf;
3531 -         }
3532 -
3533 -         if (btf_is_module(kernel_btf)) {
3534 -                 mod = btf_try_get_module(kernel_btf);
3535 -                 if (!mod) {
3536 -                         ret = -ENXIO;
3537 -                         goto end_btf;
3538 -                 }
3539 -         }
3540 -
3541 -         /* We already verified dtor_func to be btf_type_is_func
3542 -          * in register_btf_id_dtor_kfuncs.
3543 -          */
3544 -         dtor_func_name = __btf_name_by_offset(kernel_btf, dtor_func->name_off);
3545 -         addr = kallsyms_lookup_name(dtor_func_name);
3546 -         if (!addr) {
3547 -                 ret = -EINVAL;
3548 -                 goto end_mod;
3549 -         }
3550 -         tab->off[i].kptr.dtor = (void *)addr;
3475 +         switch (info_arr[i].type) {
3476 +         case BPF_SPIN_LOCK:
3477 +                 WARN_ON_ONCE(rec->spin_lock_off >= 0);
3478 +                 /* Cache offset for faster lookup at runtime */
3479 +                 rec->spin_lock_off = rec->fields[i].offset;
3480 +                 break;
3481 +         case BPF_TIMER:
3482 +                 WARN_ON_ONCE(rec->timer_off >= 0);
3483 +                 /* Cache offset for faster lookup at runtime */
3484 +                 rec->timer_off = rec->fields[i].offset;
3485 +                 break;
3486 +         case BPF_KPTR_UNREF:
3487 +         case BPF_KPTR_REF:
3488 +                 ret = btf_parse_kptr(btf, &rec->fields[i], &info_arr[i]);
3489 +                 if (ret < 0)
3490 +                         goto end;
3491 +                 break;
3492 +         default:
3493 +                 ret = -EFAULT;
3494 +                 goto end;
3551 3495    }
3552 -
3553 -         tab->off[i].offset = info_arr[i].off;
3554 -         tab->off[i].type = info_arr[i].type;
3555 -         tab->off[i].kptr.btf_id = id;
3556 -         tab->off[i].kptr.btf = kernel_btf;
3557 -         tab->off[i].kptr.module = mod;
3496 +         rec->cnt++;
3558 3497  }
3559 -         tab->nr_off = nr_off;
3560 -         return tab;
3561 - end_mod:
3562 -         module_put(mod);
3563 - end_btf:
3564 -         btf_put(kernel_btf);
3498 + return rec;
3565 3499 end:
3566 -         while (i--) {
3567 -                 btf_put(tab->off[i].kptr.btf);
3568 -                 if (tab->off[i].kptr.module)
3569 -                         module_put(tab->off[i].kptr.module);
3570 -         }
3571 -         kfree(tab);
3500 + btf_record_free(rec);
3572 3501  return ERR_PTR(ret);
3502 + }
3503 +
3504 + static int btf_field_offs_cmp(const void *_a, const void *_b, const void *priv)
3505 + {
3506 +         const u32 a = *(const u32 *)_a;
3507 +         const u32 b = *(const u32 *)_b;
3508 +
3509 +         if (a < b)
3510 +                 return -1;
3511 +         else if (a > b)
3512 +                 return 1;
3513 +         return 0;
3514 + }
3515 +
3516 + static void btf_field_offs_swap(void *_a, void *_b, int size, const void *priv)
3517 + {
3518 +         struct btf_field_offs *foffs = (void *)priv;
3519 +         u32 *off_base = foffs->field_off;
3520 +         u32 *a = _a, *b = _b;
3521 +         u8 *sz_a, *sz_b;
3522 +
3523 +         sz_a = foffs->field_sz + (a - off_base);
3524 +         sz_b = foffs->field_sz + (b - off_base);
3525 +
3526 +         swap(*a, *b);
3527 +         swap(*sz_a, *sz_b);
3528 + }
3529 +
3530 + struct btf_field_offs *btf_parse_field_offs(struct btf_record *rec)
3531 + {
3532 +         struct btf_field_offs *foffs;
3533 +         u32 i, *off;
3534 +         u8 *sz;
3535 +
3536 +         BUILD_BUG_ON(ARRAY_SIZE(foffs->field_off) != ARRAY_SIZE(foffs->field_sz));
3537 +         if (IS_ERR_OR_NULL(rec) || WARN_ON_ONCE(rec->cnt > sizeof(foffs->field_off)))
3538 +                 return NULL;
3539 +
3540 +         foffs = kzalloc(sizeof(*foffs), GFP_KERNEL | __GFP_NOWARN);
3541 +         if (!foffs)
3542 +                 return ERR_PTR(-ENOMEM);
3543 +
3544 +         off = foffs->field_off;
3545 +         sz = foffs->field_sz;
3546 +         for (i = 0; i < rec->cnt; i++) {
3547 +                 off[i] = rec->fields[i].offset;
3548 +                 sz[i] = btf_field_type_size(rec->fields[i].type);
3549 +         }
3550 +         foffs->cnt = rec->cnt;
3551 +
3552 +         if (foffs->cnt == 1)
3553 +                 return foffs;
3554 +         sort_r(foffs->field_off, foffs->cnt, sizeof(foffs->field_off[0]),
3555 +                btf_field_offs_cmp, btf_field_offs_swap, foffs);
3556 +         return foffs;
3573 3557 }
3574 3558
3575 3559 static void __btf_struct_show(const struct btf *btf, const struct btf_type *t,
···
6451 6367
6452 6368  /* kptr_get is only true for kfunc */
6453 6369  if (i == 0 && kptr_get) {
6454 -         struct bpf_map_value_off_desc *off_desc;
6370 +         struct btf_field *kptr_field;
6455 6371
6456 6372          if (reg->type != PTR_TO_MAP_VALUE) {
6457 6373                  bpf_log(log, "arg#0 expected pointer to map value\n");
···
6467 6383                  return -EINVAL;
6468 6384          }
6469 6385
6470 -         off_desc = bpf_map_kptr_off_contains(reg->map_ptr, reg->off + reg->var_off.value);
6471 -         if (!off_desc || off_desc->type != BPF_KPTR_REF) {
6386 +         kptr_field = btf_record_find(reg->map_ptr->record, reg->off + reg->var_off.value, BPF_KPTR);
6387 +         if (!kptr_field || kptr_field->type != BPF_KPTR_REF) {
6472 6388                  bpf_log(log, "arg#0 no referenced kptr at map value offset=%llu\n",
6473 6389                          reg->off + reg->var_off.value);
6474 6390                  return -EINVAL;
···
6487 6403                          func_name, i, btf_type_str(ref_t), ref_tname);
6488 6404                  return -EINVAL;
6489 6405          }
6490 -         if (!btf_struct_ids_match(log, btf, ref_id, 0, off_desc->kptr.btf,
6491 -                                   off_desc->kptr.btf_id, true)) {
6406 +         if (!btf_struct_ids_match(log, btf, ref_id, 0, kptr_field->kptr.btf,
6407 +                                   kptr_field->kptr.btf_id, true)) {
6492 6408                  bpf_log(log, "kernel function %s args#%d expected pointer to %s %s\n",
6493 6409                          func_name, i, btf_type_str(ref_t), ref_tname);
6494 6410                  return -EINVAL;
+6 -3
kernel/bpf/cpumap.c
···
4 4  * Copyright (c) 2017 Jesper Dangaard Brouer, Red Hat Inc.
5 5  */
6 6
7 -  /* The 'cpumap' is primarily used as a backend map for XDP BPF helper
7 +  /**
8 +   * DOC: cpu map
9 +   * The 'cpumap' is primarily used as a backend map for XDP BPF helper
8 10   * call bpf_redirect_map() and XDP_REDIRECT action, like 'devmap'.
9 11   *
10 -  * Unlike devmap which redirects XDP frames out another NIC device,
12 +  * Unlike devmap which redirects XDP frames out to another NIC device,
11 13   * this map type redirects raw XDP frames to another CPU. The remote
12 14   * CPU will do SKB-allocation and call the normal network stack.
13 -  *
15 +  */
16 + /*
14 17   * This is a scalability and isolation mechanism, that allow
15 18   * separating the early driver network XDP layer, from the rest of the
16 19   * netstack, and assigning dedicated CPUs for this stage. This
+14 -24
kernel/bpf/hashtab.c
···
222 222  u32 num_entries = htab->map.max_entries;
223 223  int i;
224 224
225 -    if (!map_value_has_timer(&htab->map))
225 +    if (!btf_record_has_field(htab->map.record, BPF_TIMER))
226 226          return;
227 227  if (htab_has_extra_elems(htab))
228 228          num_entries += num_possible_cpus();
···
231 231          struct htab_elem *elem;
232 232
233 233          elem = get_htab_elem(htab, i);
234 -            bpf_timer_cancel_and_free(elem->key +
235 -                                      round_up(htab->map.key_size, 8) +
236 -                                      htab->map.timer_off);
234 +            bpf_obj_free_timer(htab->map.record, elem->key + round_up(htab->map.key_size, 8));
237 235          cond_resched();
238 236  }
239 237 }
240 238
241 -    static void htab_free_prealloced_kptrs(struct bpf_htab *htab)
239 +    static void htab_free_prealloced_fields(struct bpf_htab *htab)
242 240 {
243 241  u32 num_entries = htab->map.max_entries;
244 242  int i;
245 243
246 -    if (!map_value_has_kptrs(&htab->map))
244 +    if (IS_ERR_OR_NULL(htab->map.record))
247 245          return;
248 246  if (htab_has_extra_elems(htab))
249 247          num_entries += num_possible_cpus();
250 -
251 248  for (i = 0; i < num_entries; i++) {
252 249          struct htab_elem *elem;
253 250
254 251          elem = get_htab_elem(htab, i);
255 -            bpf_map_free_kptrs(&htab->map, elem->key + round_up(htab->map.key_size, 8));
252 +            bpf_obj_free_fields(htab->map.record, elem->key + round_up(htab->map.key_size, 8));
256 253          cond_resched();
257 254  }
258 255 }
···
761 764 {
762 765  void *map_value = elem->key + round_up(htab->map.key_size, 8);
763 766
764 -    if (map_value_has_timer(&htab->map))
765 -            bpf_timer_cancel_and_free(map_value + htab->map.timer_off);
766 -    if (map_value_has_kptrs(&htab->map))
767 -            bpf_map_free_kptrs(&htab->map, map_value);
767 +    bpf_obj_free_fields(htab->map.record, map_value);
768 768 }
769 769
770 770 /* It is called from the bpf_lru_list when the LRU needs to delete
···
1085 1091  head = &b->head;
1086 1092
1087 1093  if (unlikely(map_flags & BPF_F_LOCK)) {
1088 -            if (unlikely(!map_value_has_spin_lock(map)))
1094 +            if (unlikely(!btf_record_has_field(map->record, BPF_SPIN_LOCK)))
1089 1095                  return -EINVAL;
1090 1096          /* find an element without taking the bucket lock */
1091 1097          l_old = lookup_nulls_elem_raw(head, hash, key, key_size,
···
1468 1474          struct htab_elem *l;
1469 1475
1470 1476          hlist_nulls_for_each_entry(l, n, head, hash_node) {
1471 -                    /* We don't reset or free kptr on uref dropping to zero,
1472 -                     * hence just free timer.
1473 -                     */
1474 -                    bpf_timer_cancel_and_free(l->key +
1475 -                                              round_up(htab->map.key_size, 8) +
1476 -                                              htab->map.timer_off);
1477 +                    /* We only free timer on uref dropping to zero */
1478 +                    bpf_obj_free_timer(htab->map.record, l->key + round_up(htab->map.key_size, 8));
1477 1479          }
1478 1480          cond_resched_rcu();
1479 1481  }
···
1480 1490 {
1481 1491  struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
1482 1492
1483 -    /* We don't reset or free kptr on uref dropping to zero. */
1484 -    if (!map_value_has_timer(&htab->map))
1493 +    /* We only free timer on uref dropping to zero */
1494 +    if (!btf_record_has_field(htab->map.record, BPF_TIMER))
1485 1495          return;
1486 1496  if (!htab_is_prealloc(htab))
1487 1497          htab_free_malloced_timers(htab);
···
1507 1517  if (!htab_is_prealloc(htab)) {
1508 1518          delete_all_elements(htab);
1509 1519  } else {
1510 -            htab_free_prealloced_kptrs(htab);
1520 +            htab_free_prealloced_fields(htab);
1511 1521          prealloc_destroy(htab);
1512 1522  }
1513 1523
1514 -    bpf_map_free_kptr_off_tab(map);
1524 +    bpf_map_free_record(map);
1515 1525  free_percpu(htab->extra_elems);
1516 1526  bpf_map_area_free(htab->buckets);
1517 1527  bpf_mem_alloc_destroy(&htab->pcpu_ma);
···
1665 1675
1666 1676  elem_map_flags = attr->batch.elem_flags;
1667 1677  if ((elem_map_flags & ~BPF_F_LOCK) ||
1668 -        ((elem_map_flags & BPF_F_LOCK) && !map_value_has_spin_lock(map)))
1678 +        ((elem_map_flags & BPF_F_LOCK) && !btf_record_has_field(map->record, BPF_SPIN_LOCK)))
1669 1679          return -EINVAL;
1670 1680
1671 1681  map_flags = attr->batch.flags;
+3 -3
kernel/bpf/helpers.c
···
366 366  struct bpf_spin_lock *lock;
367 367
368 368  if (lock_src)
369 -            lock = src + map->spin_lock_off;
369 +            lock = src + map->record->spin_lock_off;
370 370  else
371 -            lock = dst + map->spin_lock_off;
371 +            lock = dst + map->record->spin_lock_off;
372 372  preempt_disable();
373 373  __bpf_spin_lock_irqsave(lock);
374 374  copy_map_value(map, dst, src);
···
1169 1169          ret = -ENOMEM;
1170 1170          goto out;
1171 1171  }
1172 -    t->value = (void *)timer - map->timer_off;
1172 +    t->value = (void *)timer - map->record->timer_off;
1173 1173  t->map = map;
1174 1174  t->prog = NULL;
1175 1175  rcu_assign_pointer(t->callback_fn, NULL);
+1 -1
kernel/bpf/local_storage.c
···
151 151          return -EINVAL;
152 152
153 153  if (unlikely((flags & BPF_F_LOCK) &&
154 -                 !map_value_has_spin_lock(map)))
154 +                 !btf_record_has_field(map->record, BPF_SPIN_LOCK)))
155 155          return -EINVAL;
156 156
157 157  storage = cgroup_storage_lookup((struct bpf_cgroup_storage_map *)map,
+12 -7
kernel/bpf/map_in_map.c
···
29 29          return ERR_PTR(-ENOTSUPP);
30 30  }
31 31
32 -    if (map_value_has_spin_lock(inner_map)) {
32 +    if (btf_record_has_field(inner_map->record, BPF_SPIN_LOCK)) {
33 33          fdput(f);
34 34          return ERR_PTR(-ENOTSUPP);
35 35  }
···
50 50  inner_map_meta->value_size = inner_map->value_size;
51 51  inner_map_meta->map_flags = inner_map->map_flags;
52 52  inner_map_meta->max_entries = inner_map->max_entries;
53 -    inner_map_meta->spin_lock_off = inner_map->spin_lock_off;
54 -    inner_map_meta->timer_off = inner_map->timer_off;
55 -    inner_map_meta->kptr_off_tab = bpf_map_copy_kptr_off_tab(inner_map);
53 +    inner_map_meta->record = btf_record_dup(inner_map->record);
54 +    if (IS_ERR(inner_map_meta->record)) {
55 +            /* btf_record_dup returns NULL or valid pointer in case of
56 +             * invalid/empty/valid, but ERR_PTR in case of errors. During
57 +             * equality NULL or IS_ERR is equivalent.
58 +             */
59 +            fdput(f);
60 +            return ERR_CAST(inner_map_meta->record);
61 +    }
56 62  if (inner_map->btf) {
57 63          btf_get(inner_map->btf);
58 64          inner_map_meta->btf = inner_map->btf;
···
78 72
79 73 void bpf_map_meta_free(struct bpf_map *map_meta)
80 74 {
81 -    bpf_map_free_kptr_off_tab(map_meta);
75 +    bpf_map_free_record(map_meta);
82 76  btf_put(map_meta->btf);
83 77  kfree(map_meta);
84 78 }
···
90 84  return meta0->map_type == meta1->map_type &&
91 85         meta0->key_size == meta1->key_size &&
92 86         meta0->value_size == meta1->value_size &&
93 -           meta0->timer_off == meta1->timer_off &&
94 87         meta0->map_flags == meta1->map_flags &&
95 -           bpf_map_equal_kptr_off_tab(meta0, meta1);
88 +           btf_record_equal(meta0->record, meta1->record);
96 89 }
97 90
98 91 void *bpf_map_fd_get_ptr(struct bpf_map *map,
+192 -227
kernel/bpf/syscall.c
···
495 495 }
496 496 #endif
497 497
498 -  static int bpf_map_kptr_off_cmp(const void *a, const void *b)
498 +  static int btf_field_cmp(const void *a, const void *b)
499 499 {
500 -    const struct bpf_map_value_off_desc *off_desc1 = a, *off_desc2 = b;
500 +    const struct btf_field *f1 = a, *f2 = b;
501 501
502 -    if (off_desc1->offset < off_desc2->offset)
502 +    if (f1->offset < f2->offset)
503 503          return -1;
504 -    else if (off_desc1->offset > off_desc2->offset)
504 +    else if (f1->offset > f2->offset)
505 505          return 1;
506 506  return 0;
507 507 }
508 508
509 -  struct bpf_map_value_off_desc *bpf_map_kptr_off_contains(struct bpf_map *map, u32 offset)
509 +  struct btf_field *btf_record_find(const struct btf_record *rec, u32 offset,
510 +                                    enum btf_field_type type)
510 511 {
511 -    /* Since members are iterated in btf_find_field in increasing order,
512 -     * offsets appended to kptr_off_tab are in increasing order, so we can
513 -     * do bsearch to find exact match.
514 -     */
515 -    struct bpf_map_value_off *tab;
512 +    struct btf_field *field;
516 513
517 -    if (!map_value_has_kptrs(map))
514 +    if (IS_ERR_OR_NULL(rec) || !(rec->field_mask & type))
518 515          return NULL;
519 -    tab = map->kptr_off_tab;
520 -    return bsearch(&offset, tab->off, tab->nr_off, sizeof(tab->off[0]), bpf_map_kptr_off_cmp);
516 +    field = bsearch(&offset, rec->fields, rec->cnt, sizeof(rec->fields[0]), btf_field_cmp);
517 +    if (!field || !(field->type & type))
518 +            return NULL;
519 +    return field;
521 520 }
522 521
523 -  void bpf_map_free_kptr_off_tab(struct bpf_map *map)
522 +  void btf_record_free(struct btf_record *rec)
524 523 {
525 -    struct bpf_map_value_off *tab = map->kptr_off_tab;
526 524  int i;
527 525
528 -    if (!map_value_has_kptrs(map))
526 +    if (IS_ERR_OR_NULL(rec))
529 527          return;
530 -    for (i = 0; i < tab->nr_off; i++) {
531 -            if (tab->off[i].kptr.module)
532 -                    module_put(tab->off[i].kptr.module);
533 -            btf_put(tab->off[i].kptr.btf);
534 -    }
535 -    kfree(tab);
536 -    map->kptr_off_tab = NULL;
537 - }
538 -
539 - struct bpf_map_value_off *bpf_map_copy_kptr_off_tab(const struct bpf_map *map)
540 - {
541 -    struct bpf_map_value_off *tab = map->kptr_off_tab, *new_tab;
542 -    int size, i;
543 -
544 -    if (!map_value_has_kptrs(map))
545 -            return ERR_PTR(-ENOENT);
546 -    size = offsetof(struct bpf_map_value_off, off[tab->nr_off]);
547 -    new_tab = kmemdup(tab, size, GFP_KERNEL | __GFP_NOWARN);
548 -    if (!new_tab)
549 -            return ERR_PTR(-ENOMEM);
550 -    /* Do a deep copy of the kptr_off_tab */
551 -    for (i = 0; i < tab->nr_off; i++) {
552 -            btf_get(tab->off[i].kptr.btf);
553 -            if (tab->off[i].kptr.module && !try_module_get(tab->off[i].kptr.module)) {
554 -                    while (i--) {
555 -                            if (tab->off[i].kptr.module)
556 -                                    module_put(tab->off[i].kptr.module);
557 -                            btf_put(tab->off[i].kptr.btf);
558 -                    }
559 -                    kfree(new_tab);
560 -                    return ERR_PTR(-ENXIO);
561 -            }
562 -    }
563 -    return new_tab;
564 - }
565 -
566 - bool bpf_map_equal_kptr_off_tab(const struct bpf_map *map_a, const struct bpf_map *map_b)
567 - {
568 -    struct bpf_map_value_off *tab_a = map_a->kptr_off_tab, *tab_b = map_b->kptr_off_tab;
569 -    bool a_has_kptr = map_value_has_kptrs(map_a), b_has_kptr = map_value_has_kptrs(map_b);
570 -    int size;
571 -
572 -    if (!a_has_kptr && !b_has_kptr)
573 -            return true;
574 -    if (a_has_kptr != b_has_kptr)
575 -            return false;
576 -    if (tab_a->nr_off != tab_b->nr_off)
577 -            return false;
578 -    size = offsetof(struct bpf_map_value_off, off[tab_a->nr_off]);
579 -    return !memcmp(tab_a, tab_b, size);
580 - }
581 -
582 - /* Caller must ensure map_value_has_kptrs is true. Note that this function can
583 -  * be called on a map value while the map_value is visible to BPF programs, as
584 -  * it ensures the correct synchronization, and we already enforce the same using
585 -  * the bpf_kptr_xchg helper on the BPF program side for referenced kptrs.
586 -  */
587 - void bpf_map_free_kptrs(struct bpf_map *map, void *map_value)
588 - {
589 -    struct bpf_map_value_off *tab = map->kptr_off_tab;
590 -    unsigned long *btf_id_ptr;
591 -    int i;
592 -
593 -    for (i = 0; i < tab->nr_off; i++) {
594 -            struct bpf_map_value_off_desc *off_desc = &tab->off[i];
595 -            unsigned long old_ptr;
596 -
597 -            btf_id_ptr = map_value + off_desc->offset;
598 -            if (off_desc->type == BPF_KPTR_UNREF) {
599 -                    u64 *p = (u64 *)btf_id_ptr;
600 -
601 -                    WRITE_ONCE(*p, 0);
528 +    for (i = 0; i < rec->cnt; i++) {
529 +            switch (rec->fields[i].type) {
530 +            case BPF_SPIN_LOCK:
531 +            case BPF_TIMER:
532 +                    break;
533 +            case BPF_KPTR_UNREF:
534 +            case BPF_KPTR_REF:
535 +                    if (rec->fields[i].kptr.module)
536 +                            module_put(rec->fields[i].kptr.module);
537 +                    btf_put(rec->fields[i].kptr.btf);
538 +                    break;
539 +            default:
540 +                    WARN_ON_ONCE(1);
602 542                  continue;
603 542          }
604 -            old_ptr = xchg(btf_id_ptr, 0);
605 -            off_desc->kptr.dtor((void *)old_ptr);
543 +    }
544 +    kfree(rec);
545 + }
546 +
547 + void bpf_map_free_record(struct bpf_map *map)
548 + {
549 +    btf_record_free(map->record);
550 +    map->record = NULL;
551 + }
552 +
553 + struct btf_record *btf_record_dup(const struct btf_record *rec)
554 + {
555 +    const struct btf_field *fields;
556 +    struct btf_record *new_rec;
557 +    int ret, size, i;
558 +
559 +    if (IS_ERR_OR_NULL(rec))
560 +            return NULL;
561 +    size = offsetof(struct btf_record, fields[rec->cnt]);
562 +    new_rec = kmemdup(rec, size, GFP_KERNEL | __GFP_NOWARN);
563 +    if (!new_rec)
564 +            return ERR_PTR(-ENOMEM);
565 +    /* Do a deep copy of the btf_record */
566 +    fields = rec->fields;
567 +    new_rec->cnt = 0;
568 +    for (i = 0; i < rec->cnt; i++) {
569 +            switch (fields[i].type) {
570 +            case BPF_SPIN_LOCK:
571 +            case BPF_TIMER:
572 +                    break;
573 +            case BPF_KPTR_UNREF:
574 +            case BPF_KPTR_REF:
575 +                    btf_get(fields[i].kptr.btf);
576 +                    if (fields[i].kptr.module && !try_module_get(fields[i].kptr.module)) {
577 +                            ret = -ENXIO;
578 +                            goto free;
579 +                    }
580 +                    break;
581 +            default:
582 +                    ret = -EFAULT;
583 +                    WARN_ON_ONCE(1);
584 +                    goto free;
585 +            }
586 +            new_rec->cnt++;
587 +    }
588 +    return new_rec;
589 + free:
590 +    btf_record_free(new_rec);
591 +    return ERR_PTR(ret);
592 + }
593 +
594 + bool btf_record_equal(const struct btf_record *rec_a, const struct btf_record *rec_b)
595 + {
596 +    bool a_has_fields = !IS_ERR_OR_NULL(rec_a), b_has_fields = !IS_ERR_OR_NULL(rec_b);
597 +    int size;
598 +
599 +    if (!a_has_fields && !b_has_fields)
600 +            return true;
601 +    if (a_has_fields != b_has_fields)
602 +            return false;
603 +    if (rec_a->cnt != rec_b->cnt)
604 +            return false;
605 +    size = offsetof(struct btf_record, fields[rec_a->cnt]);
606 +    return !memcmp(rec_a, rec_b, size);
607 + }
608 +
609 + void bpf_obj_free_timer(const struct btf_record *rec, void *obj)
610 + {
611 +    if (WARN_ON_ONCE(!btf_record_has_field(rec, BPF_TIMER)))
612 +            return;
613 +    bpf_timer_cancel_and_free(obj + rec->timer_off);
614 + }
615 +
616 + void bpf_obj_free_fields(const struct btf_record *rec, void *obj)
617 + {
618 +    const struct btf_field *fields;
619 +    int i;
620 +
621 +    if (IS_ERR_OR_NULL(rec))
622 +            return;
623 +    fields = rec->fields;
624 +    for (i = 0; i < rec->cnt; i++) {
625 +            const struct btf_field *field = &fields[i];
626 +            void *field_ptr = obj + field->offset;
627 +
628 +            switch (fields[i].type) {
629 +            case BPF_SPIN_LOCK:
630 +                    break;
631 +            case BPF_TIMER:
632 +                    bpf_timer_cancel_and_free(field_ptr);
633 +                    break;
634 +            case BPF_KPTR_UNREF:
635 +                    WRITE_ONCE(*(u64 *)field_ptr, 0);
636 +                    break;
637 +            case BPF_KPTR_REF:
638 +                    field->kptr.dtor((void *)xchg((unsigned long *)field_ptr, 0));
639 +                    break;
640 +            default:
641 +                    WARN_ON_ONCE(1);
642 +                    continue;
643 +            }
606 644  }
607 645 }
608 646
···
650 612  struct bpf_map *map = container_of(work, struct bpf_map, work);
651 613
652 614  security_bpf_map_free(map);
653 -    kfree(map->off_arr);
615 +    kfree(map->field_offs);
654 616  bpf_map_release_memcg(map);
655 617  /* implementation dependent freeing, map_free callback also does
656 -     * bpf_map_free_kptr_off_tab, if needed.
618 +     * bpf_map_free_record, if needed.
657 619   */
658 620  map->ops->map_free(map);
659 621 }
···
816 778  struct bpf_map *map = filp->private_data;
817 779  int err;
818 780
819 -    if (!map->ops->map_mmap || map_value_has_spin_lock(map) ||
820 -        map_value_has_timer(map) || map_value_has_kptrs(map))
781 +    if (!map->ops->map_mmap || !IS_ERR_OR_NULL(map->record))
821 782          return -ENOTSUPP;
822 783
823 784  if (!(vma->vm_flags & VM_SHARED))
···
943 906  return -ENOTSUPP;
944 907 }
945 908
946 -  static int map_off_arr_cmp(const void *_a, const void *_b, const void *priv)
947 - {
948 -    const u32 a = *(const u32 *)_a;
949 -    const u32 b = *(const u32 *)_b;
950 -
951 -    if (a < b)
952 -            return -1;
953 -    else if (a > b)
954 -            return 1;
955 -    return 0;
956 - }
957 -
958 - static void map_off_arr_swap(void *_a, void *_b, int size, const void *priv)
959 - {
960 -    struct bpf_map *map = (struct bpf_map *)priv;
961 -    u32 *off_base = map->off_arr->field_off;
962 -    u32 *a = _a, *b = _b;
963 -    u8 *sz_a, *sz_b;
964 -
965 -    sz_a = map->off_arr->field_sz + (a - off_base);
966 -    sz_b = map->off_arr->field_sz + (b - off_base);
967 -
968 -    swap(*a, *b);
969 -    swap(*sz_a, *sz_b);
970 - }
971 -
972 - static int bpf_map_alloc_off_arr(struct bpf_map *map)
973 - {
974 -    bool has_spin_lock = map_value_has_spin_lock(map);
975 -    bool has_timer = map_value_has_timer(map);
976 -    bool has_kptrs = map_value_has_kptrs(map);
977 -    struct bpf_map_off_arr *off_arr;
978 -    u32 i;
979 -
980 -    if (!has_spin_lock && !has_timer && !has_kptrs) {
981 -            map->off_arr = NULL;
982 -            return 0;
983 -    }
984 -
985 -    off_arr = kmalloc(sizeof(*map->off_arr), GFP_KERNEL | __GFP_NOWARN);
986 -    if (!off_arr)
987 -            return -ENOMEM;
988 -    map->off_arr = off_arr;
989 -
990 -    off_arr->cnt = 0;
991 -    if (has_spin_lock) {
992 -            i = off_arr->cnt;
993 -
994 -            off_arr->field_off[i] = map->spin_lock_off;
995 -            off_arr->field_sz[i] = sizeof(struct bpf_spin_lock);
996 -            off_arr->cnt++;
997 -    }
998 -    if (has_timer) {
999 -            i = off_arr->cnt;
1000 -
1001 -            off_arr->field_off[i] = map->timer_off;
1002 -            off_arr->field_sz[i] = sizeof(struct bpf_timer);
1003 -            off_arr->cnt++;
1004 -    }
1005 -    if (has_kptrs) {
1006 -            struct bpf_map_value_off *tab = map->kptr_off_tab;
1007 -            u32 *off = &off_arr->field_off[off_arr->cnt];
1008 -            u8 *sz = &off_arr->field_sz[off_arr->cnt];
1009 -
1010 -            for (i = 0; i < tab->nr_off; i++) {
1011 -                    *off++ = tab->off[i].offset;
1012 -                    *sz++ = sizeof(u64);
1013 -            }
1014 -            off_arr->cnt += tab->nr_off;
1015 -    }
1016 -
1017 -    if (off_arr->cnt == 1)
1018 -            return 0;
1019 -    sort_r(off_arr->field_off, off_arr->cnt, sizeof(off_arr->field_off[0]),
1020 -           map_off_arr_cmp, map_off_arr_swap, map);
1021 -    return 0;
1022 - }
1023 -
1024 909 static int map_check_btf(struct bpf_map *map, const struct btf *btf,
1025 910                           u32 btf_key_id, u32 btf_value_id)
1026 911 {
···
965 1006  if (!value_type || value_size != map->value_size)
966 1007          return -EINVAL;
967 1008
968 -    map->spin_lock_off = btf_find_spin_lock(btf, value_type);
1009 +    map->record = btf_parse_fields(btf, value_type, BPF_SPIN_LOCK | BPF_TIMER | BPF_KPTR,
1010 +                                   map->value_size);
1011 +    if (!IS_ERR_OR_NULL(map->record)) {
1012 +            int i;
969 1013
970 -    if (map_value_has_spin_lock(map)) {
971 -            if (map->map_flags & BPF_F_RDONLY_PROG)
972 -                    return -EACCES;
973 -            if (map->map_type != BPF_MAP_TYPE_HASH &&
974 -                map->map_type != BPF_MAP_TYPE_ARRAY &&
975 -                map->map_type != BPF_MAP_TYPE_CGROUP_STORAGE &&
976 -                map->map_type != BPF_MAP_TYPE_SK_STORAGE &&
977 -                map->map_type != BPF_MAP_TYPE_INODE_STORAGE &&
978 -                map->map_type != BPF_MAP_TYPE_TASK_STORAGE &&
979 -                map->map_type != BPF_MAP_TYPE_CGRP_STORAGE)
980 -                    return -ENOTSUPP;
981 -            if (map->spin_lock_off + sizeof(struct bpf_spin_lock) >
982 -                map->value_size) {
983 -                    WARN_ONCE(1,
984 -                              "verifier bug spin_lock_off %d value_size %d\n",
985 -                              map->spin_lock_off, map->value_size);
986 -                    return -EFAULT;
987 -            }
988 -    }
989 -
990 -    map->timer_off = btf_find_timer(btf, value_type);
991 -    if (map_value_has_timer(map)) {
992 -            if (map->map_flags & BPF_F_RDONLY_PROG)
993 -                    return -EACCES;
994 -            if (map->map_type != BPF_MAP_TYPE_HASH &&
995 -                map->map_type != BPF_MAP_TYPE_LRU_HASH &&
996 -                map->map_type != BPF_MAP_TYPE_ARRAY)
997 -                    return -EOPNOTSUPP;
998 -    }
999 -
1000 -    map->kptr_off_tab = btf_parse_kptrs(btf, value_type);
1001 -    if (map_value_has_kptrs(map)) {
1002 1014          if (!bpf_capable()) {
1003 1015                  ret = -EPERM;
1004 1016                  goto free_map_tab;
···
978 1048                  ret = -EACCES;
979 1049                  goto free_map_tab;
980 1050          }
981 -            if (map->map_type != BPF_MAP_TYPE_HASH &&
982 -                map->map_type != BPF_MAP_TYPE_LRU_HASH &&
983 -                map->map_type != BPF_MAP_TYPE_ARRAY &&
984 -                map->map_type != BPF_MAP_TYPE_PERCPU_ARRAY) {
985 -                    ret = -EOPNOTSUPP;
986 -                    goto free_map_tab;
1051 +            for (i = 0; i < sizeof(map->record->field_mask) * 8; i++) {
1052 +                    switch (map->record->field_mask & (1 << i)) {
1053 +                    case 0:
1054 +                            continue;
1055 +                    case BPF_SPIN_LOCK:
1056 +                            if (map->map_type != BPF_MAP_TYPE_HASH &&
1057 +                                map->map_type != BPF_MAP_TYPE_ARRAY &&
1058 +                                map->map_type != BPF_MAP_TYPE_CGROUP_STORAGE &&
1059 +                                map->map_type != BPF_MAP_TYPE_SK_STORAGE &&
1060 +                                map->map_type != BPF_MAP_TYPE_INODE_STORAGE &&
1061 +                                map->map_type != BPF_MAP_TYPE_TASK_STORAGE &&
1062 +                                map->map_type != BPF_MAP_TYPE_CGRP_STORAGE) {
1063 +                                    ret = -EOPNOTSUPP;
1064 +                                    goto free_map_tab;
1065 +                            }
1066 +                            break;
1067 +                    case BPF_TIMER:
1068 +                            if (map->map_type != BPF_MAP_TYPE_HASH &&
1069 +                                map->map_type != BPF_MAP_TYPE_LRU_HASH &&
1070 +                                map->map_type != BPF_MAP_TYPE_ARRAY) {
1071 +                                    return -EOPNOTSUPP;
1072 +                                    goto free_map_tab;
1073 +                            }
1074 +                            break;
1075 +                    case BPF_KPTR_UNREF:
1076 +                    case BPF_KPTR_REF:
1077 +                            if (map->map_type != BPF_MAP_TYPE_HASH &&
1078 +                                map->map_type != BPF_MAP_TYPE_LRU_HASH &&
1079 +                                map->map_type != BPF_MAP_TYPE_ARRAY &&
1080 +                                map->map_type != BPF_MAP_TYPE_PERCPU_ARRAY) {
1081 +                                    ret = -EOPNOTSUPP;
1082 +                                    goto free_map_tab;
1083 +                            }
1084 +                            break;
1085 +                    default:
1086 +                            /* Fail if map_type checks are missing for a field type */
1087 +                            ret = -EOPNOTSUPP;
1088 +                            goto free_map_tab;
1089 +                    }
987 1090          }
988 1091  }
···
1028 1065
1029 1066  return ret;
1030 1067 free_map_tab:
1031 -    bpf_map_free_kptr_off_tab(map);
1068 +    bpf_map_free_record(map);
1032 1069  return ret;
1033 1070 }
···
1037 1074 static int map_create(union bpf_attr *attr)
1038 1075 {
1039 1076  int numa_node = bpf_map_attr_numa_node(attr);
1077 +    struct btf_field_offs *foffs;
1040 1078  struct bpf_map *map;
1041 1079  int f_flags;
1042 1080  int err;
···
1082 1118  mutex_init(&map->freeze_mutex);
1083 1119  spin_lock_init(&map->owner.lock);
1084 1120
1085 -    map->spin_lock_off = -EINVAL;
1086 -    map->timer_off = -EINVAL;
1087 1121  if (attr->btf_key_type_id || attr->btf_value_type_id ||
1088 1122      /* Even the map's value is a kernel's struct,
1089 1123       * the bpf_prog.o must have BTF to begin with
···
1117 1155                  attr->btf_vmlinux_value_type_id;
1118 1156  }
1119 1157
1120 -    err = bpf_map_alloc_off_arr(map);
1121 -    if (err)
1158 +
1159 +    foffs = btf_parse_field_offs(map->record);
1160 +    if (IS_ERR(foffs)) {
1161 +            err = PTR_ERR(foffs);
1122 1162          goto free_map;
1163 +    }
1164 +    map->field_offs = foffs;
1123 1165
1124 1166  err = security_bpf_map_alloc(map);
1125 1167  if (err)
1126 -            goto free_map_off_arr;
1168 +            goto free_map_field_offs;
1127 1169
1128 1170  err = bpf_map_alloc_id(map);
1129 1171  if (err)
···
1151 1185
1152 1186 free_map_sec:
1153 1187  security_bpf_map_free(map);
1154 -  free_map_off_arr:
1155 -    kfree(map->off_arr);
1188 + free_map_field_offs:
1189 +    kfree(map->field_offs);
1156 1190 free_map:
1157 1191  btf_put(map->btf);
1158 1192  map->ops->map_free(map);
···
1299 1333  }
1300 1334
1301 1335  if ((attr->flags & BPF_F_LOCK) &&
1302 -        !map_value_has_spin_lock(map)) {
1336 +        !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
1303 1337          err = -EINVAL;
1304 1338          goto err_put;
1305 1339  }
···
1372 1406  }
1373 1407
1374 1408  if ((attr->flags & BPF_F_LOCK) &&
1375 -        !map_value_has_spin_lock(map)) {
1409 +        !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
1376 1410          err = -EINVAL;
1377 1411          goto err_put;
1378 1412  }
···
1535 1569          return -EINVAL;
1536 1570
1537 1571  if ((attr->batch.elem_flags & BPF_F_LOCK) &&
1538 -        !map_value_has_spin_lock(map)) {
1572 +        !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
1539 1573          return -EINVAL;
1540 1574  }
···
1592 1626          return -EINVAL;
1593 1627
1594 1628  if ((attr->batch.elem_flags & BPF_F_LOCK) &&
1595 -        !map_value_has_spin_lock(map)) {
1629 +        !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
1596 1630          return -EINVAL;
1597 1631  }
···
1655 1689          return -EINVAL;
1656 1690
1657 1691  if ((attr->batch.elem_flags & BPF_F_LOCK) &&
1658 -        !map_value_has_spin_lock(map))
1692 +        !btf_record_has_field(map->record, BPF_SPIN_LOCK))
1659 1693          return -EINVAL;
1660 1694
1661 1695  value_size = bpf_map_value_size(map);
···
1777 1811  }
1778 1812
1779 1813  if ((attr->flags & BPF_F_LOCK) &&
1780 -        !map_value_has_spin_lock(map)) {
1814 +        !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
1781 1815          err = -EINVAL;
1782 1816          goto err_put;
1783 1817  }
···
1848 1882  if (IS_ERR(map))
1849 1883          return PTR_ERR(map);
1850 1884
1851 -    if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS ||
1852 -        map_value_has_timer(map) || map_value_has_kptrs(map)) {
1885 +    if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS || !IS_ERR_OR_NULL(map->record)) {
1853 1886          fdput(f);
1854 1887          return -ENOTSUPP;
1855 1888  }
+314 -171
kernel/bpf/verifier.c
··· 262 262 struct btf *ret_btf; 263 263 u32 ret_btf_id; 264 264 u32 subprogno; 265 - struct bpf_map_value_off_desc *kptr_off_desc; 265 + struct btf_field *kptr_field; 266 266 u8 uninit_dynptr_regno; 267 267 }; 268 268 ··· 454 454 static bool reg_may_point_to_spin_lock(const struct bpf_reg_state *reg) 455 455 { 456 456 return reg->type == PTR_TO_MAP_VALUE && 457 - map_value_has_spin_lock(reg->map_ptr); 458 - } 459 - 460 - static bool reg_type_may_be_refcounted_or_null(enum bpf_reg_type type) 461 - { 462 - type = base_type(type); 463 - return type == PTR_TO_SOCKET || type == PTR_TO_TCP_SOCK || 464 - type == PTR_TO_MEM || type == PTR_TO_BTF_ID; 457 + btf_record_has_field(reg->map_ptr->record, BPF_SPIN_LOCK); 465 458 } 466 459 467 460 static bool type_is_rdonly_mem(u32 type) ··· 502 509 static bool is_dynptr_ref_function(enum bpf_func_id func_id) 503 510 { 504 511 return func_id == BPF_FUNC_dynptr_data; 512 + } 513 + 514 + static bool is_callback_calling_function(enum bpf_func_id func_id) 515 + { 516 + return func_id == BPF_FUNC_for_each_map_elem || 517 + func_id == BPF_FUNC_timer_set_callback || 518 + func_id == BPF_FUNC_find_vma || 519 + func_id == BPF_FUNC_loop || 520 + func_id == BPF_FUNC_user_ringbuf_drain; 505 521 } 506 522 507 523 static bool helper_multiple_ref_obj_use(enum bpf_func_id func_id, ··· 877 875 878 876 if (reg->id) 879 877 verbose_a("id=%d", reg->id); 880 - if (reg_type_may_be_refcounted_or_null(t) && reg->ref_obj_id) 878 + if (reg->ref_obj_id) 881 879 verbose_a("ref_obj_id=%d", reg->ref_obj_id); 882 880 if (t != SCALAR_VALUE) 883 881 verbose_a("off=%d", reg->off); ··· 1402 1400 /* transfer reg's id which is unique for every map_lookup_elem 1403 1401 * as UID of the inner map. 
1404 1402 */ 1405 - if (map_value_has_timer(map->inner_map_meta)) 1403 + if (btf_record_has_field(map->inner_map_meta->record, BPF_TIMER)) 1406 1404 reg->map_uid = reg->id; 1407 1405 } else if (map->map_type == BPF_MAP_TYPE_XSKMAP) { 1408 1406 reg->type = PTR_TO_XDP_SOCK; ··· 1691 1689 reg->type = SCALAR_VALUE; 1692 1690 reg->var_off = tnum_unknown; 1693 1691 reg->frameno = 0; 1694 - reg->precise = env->subprog_cnt > 1 || !env->bpf_capable; 1692 + reg->precise = !env->bpf_capable; 1695 1693 __mark_reg_unbounded(reg); 1696 1694 } 1697 1695 ··· 2660 2658 if (opcode == BPF_CALL) { 2661 2659 if (insn->src_reg == BPF_PSEUDO_CALL) 2662 2660 return -ENOTSUPP; 2661 + /* BPF helpers that invoke callback subprogs are 2662 + * equivalent to BPF_PSEUDO_CALL above 2663 + */ 2664 + if (insn->src_reg == 0 && is_callback_calling_function(insn->imm)) 2665 + return -ENOTSUPP; 2663 2666 /* regular helper call sets R0 */ 2664 2667 *reg_mask &= ~1; 2665 2668 if (*reg_mask & 0x3f) { ··· 2754 2747 2755 2748 /* big hammer: mark all scalars precise in this path. 2756 2749 * pop_stack may still get !precise scalars. 2750 + * We also skip current state and go straight to first parent state, 2751 + * because precision markings in current non-checkpointed state are 2752 + * not needed. See why in the comment in __mark_chain_precision below. 
2757 2753 */ 2758 - for (; st; st = st->parent) 2754 + for (st = st->parent; st; st = st->parent) { 2759 2755 for (i = 0; i <= st->curframe; i++) { 2760 2756 func = st->frame[i]; 2761 2757 for (j = 0; j < BPF_REG_FP; j++) { ··· 2776 2766 reg->precise = true; 2777 2767 } 2778 2768 } 2769 + } 2779 2770 } 2780 2771 2781 - static int __mark_chain_precision(struct bpf_verifier_env *env, int regno, 2772 + static void mark_all_scalars_imprecise(struct bpf_verifier_env *env, struct bpf_verifier_state *st) 2773 + { 2774 + struct bpf_func_state *func; 2775 + struct bpf_reg_state *reg; 2776 + int i, j; 2777 + 2778 + for (i = 0; i <= st->curframe; i++) { 2779 + func = st->frame[i]; 2780 + for (j = 0; j < BPF_REG_FP; j++) { 2781 + reg = &func->regs[j]; 2782 + if (reg->type != SCALAR_VALUE) 2783 + continue; 2784 + reg->precise = false; 2785 + } 2786 + for (j = 0; j < func->allocated_stack / BPF_REG_SIZE; j++) { 2787 + if (!is_spilled_reg(&func->stack[j])) 2788 + continue; 2789 + reg = &func->stack[j].spilled_ptr; 2790 + if (reg->type != SCALAR_VALUE) 2791 + continue; 2792 + reg->precise = false; 2793 + } 2794 + } 2795 + } 2796 + 2797 + /* 2798 + * __mark_chain_precision() backtracks BPF program instruction sequence and 2799 + * chain of verifier states making sure that register *regno* (if regno >= 0) 2800 + * and/or stack slot *spi* (if spi >= 0) are marked as precisely tracked 2801 + * SCALARS, as well as any other registers and slots that contribute to 2802 + * a tracked state of given registers/stack slots, depending on specific BPF 2803 + * assembly instructions (see backtrack_insns() for exact instruction handling 2804 + * logic). This backtracking relies on recorded jmp_history and is able to 2805 + * traverse entire chain of parent states. This process ends only when all the 2806 + * necessary registers/slots and their transitive dependencies are marked as 2807 + * precise. 
2808 + * 2809 + * One important and subtle aspect is that precise marks *do not matter* in 2810 + * the currently verified state (current state). It is important to understand 2811 + * why this is the case. 2812 + * 2813 + * First, note that current state is the state that is not yet "checkpointed", 2814 + * i.e., it is not yet put into env->explored_states, and it has no children 2815 + * states as well. It's ephemeral, and can end up either a) being discarded if 2816 + * compatible explored state is found at some point or BPF_EXIT instruction is 2817 + * reached or b) checkpointed and put into env->explored_states, branching out 2818 + * into one or more children states. 2819 + * 2820 + * In the former case, precise markings in current state are completely 2821 + * ignored by state comparison code (see regsafe() for details). Only 2822 + * checkpointed ("old") state precise markings are important, and if old 2823 + * state's register/slot is precise, regsafe() assumes current state's 2824 + * register/slot as precise and checks value ranges exactly and precisely. If 2825 + * states turn out to be compatible, current state's necessary precise 2826 + * markings and any required parent states' precise markings are enforced 2827 + * after the fact with propagate_precision() logic, after the fact. But it's 2828 + * important to realize that in this case, even after marking current state 2829 + * registers/slots as precise, we immediately discard current state. So what 2830 + * actually matters is any of the precise markings propagated into current 2831 + * state's parent states, which are always checkpointed (due to b) case above). 2832 + * As such, for scenario a) it doesn't matter if current state has precise 2833 + * markings set or not. 2834 + * 2835 + * Now, for the scenario b), checkpointing and forking into child(ren) 2836 + * state(s). 
Note that before current state gets to checkpointing step, any 2837 + * processed instruction always assumes precise SCALAR register/slot 2838 + * knowledge: if precise value or range is useful to prune jump branch, BPF 2839 + * verifier takes this opportunity enthusiastically. Similarly, when 2840 + * register's value is used to calculate offset or memory address, exact 2841 + * knowledge of SCALAR range is assumed, checked, and enforced. So, similar to 2842 + * how state comparison ignores precise markings, 2843 + * BPF verifier ignores and also assumes precise 2844 + * markings *at will* during instruction verification process. But as verifier 2845 + * assumes precision, it also propagates any precision dependencies across 2846 + * parent states, which are not yet finalized, so can be further restricted 2847 + * based on new knowledge gained from restrictions enforced by their children 2848 + * states. This is so that once those parent states are finalized, i.e., when 2849 + * they have no more active children state, state comparison logic in 2850 + * is_state_visited() would enforce strict and precise SCALAR ranges, if 2851 + * required for correctness. 2852 + * 2853 + * To build a bit more intuition, note also that once a state is checkpointed, 2854 + * the path we took to get to that state is not important. This is a crucial 2855 + * property for state pruning. When state is checkpointed and finalized at 2856 + * some instruction index, it can be correctly and safely used to "short 2857 + * circuit" any *compatible* state that reaches exactly the same instruction 2858 + * index. I.e., if we jumped to that instruction from a completely different 2859 + * code path than original finalized state was derived from, it doesn't 2860 + * matter, current state can be discarded because from that instruction 2861 + * forward having a compatible state will ensure we will safely reach the 2862 + * exit.
States describe preconditions for further exploration, but completely 2863 + * forget the history of how we got here. 2864 + * 2865 + * This also means that even if we needed precise SCALAR range to get to 2866 + * finalized state, but from that point forward *that same* SCALAR register is 2867 + * never used in a precise context (i.e., its precise value is not needed for 2868 + * correctness), it's correct and safe to mark such register as "imprecise" 2869 + * (i.e., precise marking set to false). This is what we rely on when we do 2870 + * not set precise marking in current state. If no child state requires 2871 + * precision for any given SCALAR register, it's safe to dictate that it can 2872 + * be imprecise. If any child state does require this register to be precise, 2873 + * we'll mark it precise retroactively during precise markings 2874 + * propagation from child state to parent states. 2875 + * 2876 + * Skipping precise marking in current state is a mild version of 2877 + * relying on the above observation. But we can utilize this property even 2878 + * more aggressively by proactively forgetting any precise marking in the 2879 + * current state (which we inherited from the parent state), right before we 2880 + * checkpoint it and branch off into new child state. This is done by 2881 + * mark_all_scalars_imprecise() to hopefully get more permissive and generic 2882 + * finalized states which help in short circuiting more future states. 2883 + */ 2884 + static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int regno, 2782 2885 int spi) 2783 2886 { 2784 2887 struct bpf_verifier_state *st = env->cur_state; ··· 2908 2785 if (!env->bpf_capable) 2909 2786 return 0; 2910 2787 2911 - func = st->frame[st->curframe]; 2788 + /* Do sanity checks against current state of register and/or stack 2789 + * slot, but don't set precise flag in current state, as precision 2790 + * tracking in the current state is unnecessary. 
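The idea behind the comment above can be sketched in a few lines of plain userspace C. This is a toy model, not kernel code: `toy_state`, `mark_parents_precise()` and `forget_precision()` are illustrative stand-ins for the verifier's state chain, the `for (st = st->parent; st; st = st->parent)` walk, and `mark_all_scalars_imprecise()` respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a verifier state: a parent pointer plus a per-register
 * "precise" flag for R0..R10. */
struct toy_state {
	struct toy_state *parent;
	bool precise[11];
};

/* Mark a register precise in every *parent* (checkpointed) state,
 * skipping the current, not-yet-checkpointed state. */
static void mark_parents_precise(struct toy_state *cur, int regno)
{
	struct toy_state *st;

	for (st = cur->parent; st; st = st->parent)
		st->precise[regno] = true;
}

/* Forget inherited precise markings right before checkpointing:
 * the toy analogue of mark_all_scalars_imprecise(). */
static void forget_precision(struct toy_state *st)
{
	for (int i = 0; i < 11; i++)
		st->precise[i] = false;
}
```

The key property mirrored here: precision only ever needs to be enforced in parent (checkpointed) states, so the current state can safely drop any markings it inherited.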
2791 + */ 2792 + func = st->frame[frame]; 2912 2793 if (regno >= 0) { 2913 2794 reg = &func->regs[regno]; 2914 2795 if (reg->type != SCALAR_VALUE) { 2915 2796 WARN_ONCE(1, "backtracing misuse"); 2916 2797 return -EFAULT; 2917 2798 } 2918 - if (!reg->precise) 2919 - new_marks = true; 2920 - else 2921 - reg_mask = 0; 2922 - reg->precise = true; 2799 + new_marks = true; 2923 2800 } 2924 2801 2925 2802 while (spi >= 0) { ··· 2932 2809 stack_mask = 0; 2933 2810 break; 2934 2811 } 2935 - if (!reg->precise) 2936 - new_marks = true; 2937 - else 2938 - stack_mask = 0; 2939 - reg->precise = true; 2812 + new_marks = true; 2940 2813 break; 2941 2814 } 2942 2815 ··· 2940 2821 return 0; 2941 2822 if (!reg_mask && !stack_mask) 2942 2823 return 0; 2824 + 2943 2825 for (;;) { 2944 2826 DECLARE_BITMAP(mask, 64); 2945 2827 u32 history = st->jmp_history_cnt; 2946 2828 2947 2829 if (env->log.level & BPF_LOG_LEVEL2) 2948 2830 verbose(env, "last_idx %d first_idx %d\n", last_idx, first_idx); 2831 + 2832 + if (last_idx < 0) { 2833 + /* we are at the entry into subprog, which 2834 + * is expected for global funcs, but only if 2835 + * requested precise registers are R1-R5 2836 + * (which are global func's input arguments) 2837 + */ 2838 + if (st->curframe == 0 && 2839 + st->frame[0]->subprogno > 0 && 2840 + st->frame[0]->callsite == BPF_MAIN_FUNC && 2841 + stack_mask == 0 && (reg_mask & ~0x3e) == 0) { 2842 + bitmap_from_u64(mask, reg_mask); 2843 + for_each_set_bit(i, mask, 32) { 2844 + reg = &st->frame[0]->regs[i]; 2845 + if (reg->type != SCALAR_VALUE) { 2846 + reg_mask &= ~(1u << i); 2847 + continue; 2848 + } 2849 + reg->precise = true; 2850 + } 2851 + return 0; 2852 + } 2853 + 2854 + verbose(env, "BUG backtracing func entry subprog %d reg_mask %x stack_mask %llx\n", 2855 + st->frame[0]->subprogno, reg_mask, stack_mask); 2856 + WARN_ONCE(1, "verifier backtracking bug"); 2857 + return -EFAULT; 2858 + } 2859 + 2949 2860 for (i = last_idx;;) { 2950 2861 if (skip_first) { 2951 2862 err = 0; 
··· 3015 2866 break; 3016 2867 3017 2868 new_marks = false; 3018 - func = st->frame[st->curframe]; 2869 + func = st->frame[frame]; 3019 2870 bitmap_from_u64(mask, reg_mask); 3020 2871 for_each_set_bit(i, mask, 32) { 3021 2872 reg = &func->regs[i]; ··· 3081 2932 3082 2933 int mark_chain_precision(struct bpf_verifier_env *env, int regno) 3083 2934 { 3084 - return __mark_chain_precision(env, regno, -1); 2935 + return __mark_chain_precision(env, env->cur_state->curframe, regno, -1); 3085 2936 } 3086 2937 3087 - static int mark_chain_precision_stack(struct bpf_verifier_env *env, int spi) 2938 + static int mark_chain_precision_frame(struct bpf_verifier_env *env, int frame, int regno) 3088 2939 { 3089 - return __mark_chain_precision(env, -1, spi); 2940 + return __mark_chain_precision(env, frame, regno, -1); 2941 + } 2942 + 2943 + static int mark_chain_precision_stack_frame(struct bpf_verifier_env *env, int frame, int spi) 2944 + { 2945 + return __mark_chain_precision(env, frame, -1, spi); 3090 2946 } 3091 2947 3092 2948 static bool is_spillable_regtype(enum bpf_reg_type type) ··· 3340 3186 stype = &state->stack[spi].slot_type[slot % BPF_REG_SIZE]; 3341 3187 mark_stack_slot_scratched(env, spi); 3342 3188 3343 - if (!env->allow_ptr_leaks 3344 - && *stype != NOT_INIT 3345 - && *stype != SCALAR_VALUE) { 3346 - /* Reject the write if there's are spilled pointers in 3347 - * range. If we didn't reject here, the ptr status 3348 - * would be erased below (even though not all slots are 3349 - * actually overwritten), possibly opening the door to 3350 - * leaks. 3189 + if (!env->allow_ptr_leaks && *stype != STACK_MISC && *stype != STACK_ZERO) { 3190 + /* Reject the write if range we may write to has not 3191 + * been initialized beforehand. If we didn't reject 3192 + * here, the ptr status would be erased below (even 3193 + * though not all slots are actually overwritten), 3194 + * possibly opening the door to leaks. 
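The `(reg_mask & ~0x3e) == 0` test above encodes "every requested register is one of R1-R5", since bits 1 through 5 of the register bitmask correspond to a global function's input arguments. A small standalone C illustration (not kernel code, function name is made up):

```c
#include <stdbool.h>
#include <stdint.h>

/* 0x3e == 0b0011'1110: bits 1..5 set, i.e. registers R1-R5.
 * The check passes only when no bit outside R1-R5 is requested. */
static bool only_input_args_requested(uint32_t reg_mask)
{
	return (reg_mask & ~0x3eu) == 0;
}
```

Requesting precision for R0 or any callee-saved register (R6+) at the entry of a global subprog is a verifier bug, which is why the code path falls through to `WARN_ONCE()` in that case.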
3195 + * 3196 + * We do however catch STACK_INVALID case below, and 3197 + * only allow reading possibly uninitialized memory 3198 + * later for CAP_PERFMON, as the write may not happen to 3199 + * that slot. 3351 3200 */ 3352 3201 verbose(env, "spilled ptr in range of var-offset stack write; insn %d, ptr off: %d", 3353 3202 insn_idx, i); ··· 3840 3683 } 3841 3684 3842 3685 static int map_kptr_match_type(struct bpf_verifier_env *env, 3843 - struct bpf_map_value_off_desc *off_desc, 3686 + struct btf_field *kptr_field, 3844 3687 struct bpf_reg_state *reg, u32 regno) 3845 3688 { 3846 - const char *targ_name = kernel_type_name(off_desc->kptr.btf, off_desc->kptr.btf_id); 3689 + const char *targ_name = kernel_type_name(kptr_field->kptr.btf, kptr_field->kptr.btf_id); 3847 3690 int perm_flags = PTR_MAYBE_NULL; 3848 3691 const char *reg_name = ""; 3849 3692 3850 3693 /* Only unreferenced case accepts untrusted pointers */ 3851 - if (off_desc->type == BPF_KPTR_UNREF) 3694 + if (kptr_field->type == BPF_KPTR_UNREF) 3852 3695 perm_flags |= PTR_UNTRUSTED; 3853 3696 3854 3697 if (base_type(reg->type) != PTR_TO_BTF_ID || (type_flag(reg->type) & ~perm_flags)) ··· 3895 3738 * strict mode to true for type match. 
3896 3739 */ 3897 3740 if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, reg->off, 3898 - off_desc->kptr.btf, off_desc->kptr.btf_id, 3899 - off_desc->type == BPF_KPTR_REF)) 3741 + kptr_field->kptr.btf, kptr_field->kptr.btf_id, 3742 + kptr_field->type == BPF_KPTR_REF)) 3900 3743 goto bad_type; 3901 3744 return 0; 3902 3745 bad_type: 3903 3746 verbose(env, "invalid kptr access, R%d type=%s%s ", regno, 3904 3747 reg_type_str(env, reg->type), reg_name); 3905 3748 verbose(env, "expected=%s%s", reg_type_str(env, PTR_TO_BTF_ID), targ_name); 3906 - if (off_desc->type == BPF_KPTR_UNREF) 3749 + if (kptr_field->type == BPF_KPTR_UNREF) 3907 3750 verbose(env, " or %s%s\n", reg_type_str(env, PTR_TO_BTF_ID | PTR_UNTRUSTED), 3908 3751 targ_name); 3909 3752 else ··· 3913 3756 3914 3757 static int check_map_kptr_access(struct bpf_verifier_env *env, u32 regno, 3915 3758 int value_regno, int insn_idx, 3916 - struct bpf_map_value_off_desc *off_desc) 3759 + struct btf_field *kptr_field) 3917 3760 { 3918 3761 struct bpf_insn *insn = &env->prog->insnsi[insn_idx]; 3919 3762 int class = BPF_CLASS(insn->code); ··· 3923 3766 * - Reject cases where variable offset may touch kptr 3924 3767 * - size of access (must be BPF_DW) 3925 3768 * - tnum_is_const(reg->var_off) 3926 - * - off_desc->offset == off + reg->var_off.value 3769 + * - kptr_field->offset == off + reg->var_off.value 3927 3770 */ 3928 3771 /* Only BPF_[LDX,STX,ST] | BPF_MEM | BPF_DW is supported */ 3929 3772 if (BPF_MODE(insn->code) != BPF_MEM) { ··· 3934 3777 /* We only allow loading referenced kptr, since it will be marked as 3935 3778 * untrusted, similar to unreferenced kptr. 
3936 3779 */ 3937 - if (class != BPF_LDX && off_desc->type == BPF_KPTR_REF) { 3780 + if (class != BPF_LDX && kptr_field->type == BPF_KPTR_REF) { 3938 3781 verbose(env, "store to referenced kptr disallowed\n"); 3939 3782 return -EACCES; 3940 3783 } ··· 3944 3787 /* We can simply mark the value_regno receiving the pointer 3945 3788 * value from map as PTR_TO_BTF_ID, with the correct type. 3946 3789 */ 3947 - mark_btf_ld_reg(env, cur_regs(env), value_regno, PTR_TO_BTF_ID, off_desc->kptr.btf, 3948 - off_desc->kptr.btf_id, PTR_MAYBE_NULL | PTR_UNTRUSTED); 3790 + mark_btf_ld_reg(env, cur_regs(env), value_regno, PTR_TO_BTF_ID, kptr_field->kptr.btf, 3791 + kptr_field->kptr.btf_id, PTR_MAYBE_NULL | PTR_UNTRUSTED); 3949 3792 /* For mark_ptr_or_null_reg */ 3950 3793 val_reg->id = ++env->id_gen; 3951 3794 } else if (class == BPF_STX) { 3952 3795 val_reg = reg_state(env, value_regno); 3953 3796 if (!register_is_null(val_reg) && 3954 - map_kptr_match_type(env, off_desc, val_reg, value_regno)) 3797 + map_kptr_match_type(env, kptr_field, val_reg, value_regno)) 3955 3798 return -EACCES; 3956 3799 } else if (class == BPF_ST) { 3957 3800 if (insn->imm) { 3958 3801 verbose(env, "BPF_ST imm must be 0 when storing to kptr at off=%u\n", 3959 - off_desc->offset); 3802 + kptr_field->offset); 3960 3803 return -EACCES; 3961 3804 } 3962 3805 } else { ··· 3975 3818 struct bpf_func_state *state = vstate->frame[vstate->curframe]; 3976 3819 struct bpf_reg_state *reg = &state->regs[regno]; 3977 3820 struct bpf_map *map = reg->map_ptr; 3978 - int err; 3821 + struct btf_record *rec; 3822 + int err, i; 3979 3823 3980 3824 err = check_mem_region_access(env, regno, off, size, map->value_size, 3981 3825 zero_size_allowed); 3982 3826 if (err) 3983 3827 return err; 3984 3828 3985 - if (map_value_has_spin_lock(map)) { 3986 - u32 lock = map->spin_lock_off; 3829 + if (IS_ERR_OR_NULL(map->record)) 3830 + return 0; 3831 + rec = map->record; 3832 + for (i = 0; i < rec->cnt; i++) { 3833 + struct btf_field *field 
= &rec->fields[i]; 3834 + u32 p = field->offset; 3987 3835 3988 - /* if any part of struct bpf_spin_lock can be touched by 3989 - * load/store reject this program. 3990 - * To check that [x1, x2) overlaps with [y1, y2) 3836 + /* If any part of a field can be touched by load/store, reject 3837 + * this program. To check that [x1, x2) overlaps with [y1, y2), 3991 3838 * it is sufficient to check x1 < y2 && y1 < x2. 3992 3839 */ 3993 - if (reg->smin_value + off < lock + sizeof(struct bpf_spin_lock) && 3994 - lock < reg->umax_value + off + size) { 3995 - verbose(env, "bpf_spin_lock cannot be accessed directly by load/store\n"); 3996 - return -EACCES; 3997 - } 3998 - } 3999 - if (map_value_has_timer(map)) { 4000 - u32 t = map->timer_off; 4001 - 4002 - if (reg->smin_value + off < t + sizeof(struct bpf_timer) && 4003 - t < reg->umax_value + off + size) { 4004 - verbose(env, "bpf_timer cannot be accessed directly by load/store\n"); 4005 - return -EACCES; 4006 - } 4007 - } 4008 - if (map_value_has_kptrs(map)) { 4009 - struct bpf_map_value_off *tab = map->kptr_off_tab; 4010 - int i; 4011 - 4012 - for (i = 0; i < tab->nr_off; i++) { 4013 - u32 p = tab->off[i].offset; 4014 - 4015 - if (reg->smin_value + off < p + sizeof(u64) && 4016 - p < reg->umax_value + off + size) { 3840 + if (reg->smin_value + off < p + btf_field_type_size(field->type) && 3841 + p < reg->umax_value + off + size) { 3842 + switch (field->type) { 3843 + case BPF_KPTR_UNREF: 3844 + case BPF_KPTR_REF: 4017 3845 if (src != ACCESS_DIRECT) { 4018 3846 verbose(env, "kptr cannot be accessed indirectly by helper\n"); 4019 3847 return -EACCES; ··· 4017 3875 return -EACCES; 4018 3876 } 4019 3877 break; 3878 + default: 3879 + verbose(env, "%s cannot be accessed directly by load/store\n", 3880 + btf_field_type_name(field->type)); 3881 + return -EACCES; 4020 3882 } 4021 3883 } 4022 3884 } 4023 - return err; 3885 + return 0; 4024 3886 } 4025 3887 4026 3888 #define MAX_PACKET_OFF 0xffff ··· 4897 4751 if (value_regno >= 0) 
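The overlap check quoted in the hunk above ("[x1, x2) overlaps with [y1, y2) iff x1 < y2 && y1 < x2") is the standard half-open interval test, now applied uniformly to every `btf_field` in the record. A minimal sketch in plain C (toy names, not the kernel's):

```c
#include <stdbool.h>
#include <stdint.h>

/* An access [off, off + size) touches a special field [p, p + fsz)
 * exactly when the two half-open intervals overlap. */
static bool access_touches_field(uint64_t off, uint64_t size,
				 uint64_t p, uint64_t fsz)
{
	return off < p + fsz && p < off + size;
}
```

Note the half-open form means an access ending exactly at the field's start offset, or starting exactly at its end, is allowed.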
4898 4752 mark_reg_unknown(env, regs, value_regno); 4899 4753 } else if (reg->type == PTR_TO_MAP_VALUE) { 4900 - struct bpf_map_value_off_desc *kptr_off_desc = NULL; 4754 + struct btf_field *kptr_field = NULL; 4901 4755 4902 4756 if (t == BPF_WRITE && value_regno >= 0 && 4903 4757 is_pointer_value(env, value_regno)) { ··· 4911 4765 if (err) 4912 4766 return err; 4913 4767 if (tnum_is_const(reg->var_off)) 4914 - kptr_off_desc = bpf_map_kptr_off_contains(reg->map_ptr, 4915 - off + reg->var_off.value); 4916 - if (kptr_off_desc) { 4917 - err = check_map_kptr_access(env, regno, value_regno, insn_idx, 4918 - kptr_off_desc); 4768 + kptr_field = btf_record_find(reg->map_ptr->record, 4769 + off + reg->var_off.value, BPF_KPTR); 4770 + if (kptr_field) { 4771 + err = check_map_kptr_access(env, regno, value_regno, insn_idx, kptr_field); 4919 4772 } else if (t == BPF_READ && value_regno >= 0) { 4920 4773 struct bpf_map *map = reg->map_ptr; 4921 4774 ··· 5305 5160 } 5306 5161 5307 5162 if (is_spilled_reg(&state->stack[spi]) && 5308 - base_type(state->stack[spi].spilled_ptr.type) == PTR_TO_BTF_ID) 5309 - goto mark; 5310 - 5311 - if (is_spilled_reg(&state->stack[spi]) && 5312 5163 (state->stack[spi].spilled_ptr.type == SCALAR_VALUE || 5313 5164 env->allow_ptr_leaks)) { 5314 5165 if (clobber) { ··· 5334 5193 mark_reg_read(env, &state->stack[spi].spilled_ptr, 5335 5194 state->stack[spi].spilled_ptr.parent, 5336 5195 REG_LIVE_READ64); 5196 + /* We do not set REG_LIVE_WRITTEN for stack slot, as we can not 5197 + * be sure that whether stack slot is written to or not. Hence, 5198 + * we must still conservatively propagate reads upwards even if 5199 + * helper may write to the entire memory range. 
5200 + */ 5337 5201 } 5338 5202 return update_stack_depth(env, state, min_off); 5339 5203 } ··· 5588 5442 map->name); 5589 5443 return -EINVAL; 5590 5444 } 5591 - if (!map_value_has_spin_lock(map)) { 5592 - if (map->spin_lock_off == -E2BIG) 5593 - verbose(env, 5594 - "map '%s' has more than one 'struct bpf_spin_lock'\n", 5595 - map->name); 5596 - else if (map->spin_lock_off == -ENOENT) 5597 - verbose(env, 5598 - "map '%s' doesn't have 'struct bpf_spin_lock'\n", 5599 - map->name); 5600 - else 5601 - verbose(env, 5602 - "map '%s' is not a struct type or bpf_spin_lock is mangled\n", 5603 - map->name); 5445 + if (!btf_record_has_field(map->record, BPF_SPIN_LOCK)) { 5446 + verbose(env, "map '%s' has no valid bpf_spin_lock\n", map->name); 5604 5447 return -EINVAL; 5605 5448 } 5606 - if (map->spin_lock_off != val + reg->off) { 5607 - verbose(env, "off %lld doesn't point to 'struct bpf_spin_lock'\n", 5608 - val + reg->off); 5449 + if (map->record->spin_lock_off != val + reg->off) { 5450 + verbose(env, "off %lld doesn't point to 'struct bpf_spin_lock' that is at %d\n", 5451 + val + reg->off, map->record->spin_lock_off); 5609 5452 return -EINVAL; 5610 5453 } 5611 5454 if (is_lock) { ··· 5637 5502 map->name); 5638 5503 return -EINVAL; 5639 5504 } 5640 - if (!map_value_has_timer(map)) { 5641 - if (map->timer_off == -E2BIG) 5642 - verbose(env, 5643 - "map '%s' has more than one 'struct bpf_timer'\n", 5644 - map->name); 5645 - else if (map->timer_off == -ENOENT) 5646 - verbose(env, 5647 - "map '%s' doesn't have 'struct bpf_timer'\n", 5648 - map->name); 5649 - else 5650 - verbose(env, 5651 - "map '%s' is not a struct type or bpf_timer is mangled\n", 5652 - map->name); 5505 + if (!btf_record_has_field(map->record, BPF_TIMER)) { 5506 + verbose(env, "map '%s' has no valid bpf_timer\n", map->name); 5653 5507 return -EINVAL; 5654 5508 } 5655 - if (map->timer_off != val + reg->off) { 5509 + if (map->record->timer_off != val + reg->off) { 5656 5510 verbose(env, "off %lld doesn't point 
to 'struct bpf_timer' that is at %d\n", 5657 - val + reg->off, map->timer_off); 5511 + val + reg->off, map->record->timer_off); 5658 5512 return -EINVAL; 5659 5513 } 5660 5514 if (meta->map_ptr) { ··· 5659 5535 struct bpf_call_arg_meta *meta) 5660 5536 { 5661 5537 struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno]; 5662 - struct bpf_map_value_off_desc *off_desc; 5663 5538 struct bpf_map *map_ptr = reg->map_ptr; 5539 + struct btf_field *kptr_field; 5664 5540 u32 kptr_off; 5665 - int ret; 5666 5541 5667 5542 if (!tnum_is_const(reg->var_off)) { 5668 5543 verbose(env, ··· 5674 5551 map_ptr->name); 5675 5552 return -EINVAL; 5676 5553 } 5677 - if (!map_value_has_kptrs(map_ptr)) { 5678 - ret = PTR_ERR_OR_ZERO(map_ptr->kptr_off_tab); 5679 - if (ret == -E2BIG) 5680 - verbose(env, "map '%s' has more than %d kptr\n", map_ptr->name, 5681 - BPF_MAP_VALUE_OFF_MAX); 5682 - else if (ret == -EEXIST) 5683 - verbose(env, "map '%s' has repeating kptr BTF tags\n", map_ptr->name); 5684 - else 5685 - verbose(env, "map '%s' has no valid kptr\n", map_ptr->name); 5554 + if (!btf_record_has_field(map_ptr->record, BPF_KPTR)) { 5555 + verbose(env, "map '%s' has no valid kptr\n", map_ptr->name); 5686 5556 return -EINVAL; 5687 5557 } 5688 5558 5689 5559 meta->map_ptr = map_ptr; 5690 5560 kptr_off = reg->off + reg->var_off.value; 5691 - off_desc = bpf_map_kptr_off_contains(map_ptr, kptr_off); 5692 - if (!off_desc) { 5561 + kptr_field = btf_record_find(map_ptr->record, kptr_off, BPF_KPTR); 5562 + if (!kptr_field) { 5693 5563 verbose(env, "off=%d doesn't point to kptr\n", kptr_off); 5694 5564 return -EACCES; 5695 5565 } 5696 - if (off_desc->type != BPF_KPTR_REF) { 5566 + if (kptr_field->type != BPF_KPTR_REF) { 5697 5567 verbose(env, "off=%d kptr isn't referenced kptr\n", kptr_off); 5698 5568 return -EACCES; 5699 5569 } 5700 - meta->kptr_off_desc = off_desc; 5570 + meta->kptr_field = kptr_field; 5701 5571 return 0; 5702 5572 } 5703 5573 ··· 5912 5796 } 5913 5797 5914 5798 if 
(meta->func_id == BPF_FUNC_kptr_xchg) { 5915 - if (map_kptr_match_type(env, meta->kptr_off_desc, reg, regno)) 5799 + if (map_kptr_match_type(env, meta->kptr_field, reg, regno)) 5916 5800 return -EACCES; 5917 5801 } else { 5918 5802 if (arg_btf_id == BPF_PTR_POISON) { ··· 6767 6651 struct bpf_func_state *callee, 6768 6652 int insn_idx); 6769 6653 6654 + static int set_callee_state(struct bpf_verifier_env *env, 6655 + struct bpf_func_state *caller, 6656 + struct bpf_func_state *callee, int insn_idx); 6657 + 6770 6658 static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, 6771 6659 int *insn_idx, int subprog, 6772 6660 set_callee_state_fn set_callee_state_cb) ··· 6819 6699 /* continue with next insn after call */ 6820 6700 return 0; 6821 6701 } 6702 + } 6703 + 6704 + /* set_callee_state is used for direct subprog calls, but we are 6705 + * interested in validating only BPF helpers that can call subprogs as 6706 + * callbacks 6707 + */ 6708 + if (set_callee_state_cb != set_callee_state && !is_callback_calling_function(insn->imm)) { 6709 + verbose(env, "verifier bug: helper %s#%d is not marked as callback-calling\n", 6710 + func_id_name(insn->imm), insn->imm); 6711 + return -EFAULT; 6822 6712 } 6823 6713 6824 6714 if (insn->code == (BPF_JMP | BPF_CALL) && ··· 7614 7484 regs[BPF_REG_0].map_uid = meta.map_uid; 7615 7485 regs[BPF_REG_0].type = PTR_TO_MAP_VALUE | ret_flag; 7616 7486 if (!type_may_be_null(ret_type) && 7617 - map_value_has_spin_lock(meta.map_ptr)) { 7487 + btf_record_has_field(meta.map_ptr->record, BPF_SPIN_LOCK)) { 7618 7488 regs[BPF_REG_0].id = ++env->id_gen; 7619 7489 } 7620 7490 break; ··· 7678 7548 mark_reg_known_zero(env, regs, BPF_REG_0); 7679 7549 regs[BPF_REG_0].type = PTR_TO_BTF_ID | ret_flag; 7680 7550 if (func_id == BPF_FUNC_kptr_xchg) { 7681 - ret_btf = meta.kptr_off_desc->kptr.btf; 7682 - ret_btf_id = meta.kptr_off_desc->kptr.btf_id; 7551 + ret_btf = meta.kptr_field->kptr.btf; 7552 + ret_btf_id = 
meta.kptr_field->kptr.btf_id; 7683 7553 } else { 7684 7554 if (fn->ret_btf_id == BPF_PTR_POISON) { 7685 7555 verbose(env, "verifier internal error:"); ··· 9337 9207 return err; 9338 9208 return adjust_ptr_min_max_vals(env, insn, 9339 9209 dst_reg, src_reg); 9210 + } else if (dst_reg->precise) { 9211 + /* if dst_reg is precise, src_reg should be precise as well */ 9212 + err = mark_chain_precision(env, insn->src_reg); 9213 + if (err) 9214 + return err; 9340 9215 } 9341 9216 } else { 9342 9217 /* Pretend the src is a reg with a known value, since we only ··· 10530 10395 insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE) { 10531 10396 dst_reg->type = PTR_TO_MAP_VALUE; 10532 10397 dst_reg->off = aux->map_off; 10533 - if (map_value_has_spin_lock(map)) 10398 + if (btf_record_has_field(map->record, BPF_SPIN_LOCK)) 10534 10399 dst_reg->id = ++env->id_gen; 10535 10400 } else if (insn->src_reg == BPF_PSEUDO_MAP_FD || 10536 10401 insn->src_reg == BPF_PSEUDO_MAP_IDX) { ··· 11655 11520 if (env->explore_alu_limits) 11656 11521 return false; 11657 11522 if (rcur->type == SCALAR_VALUE) { 11658 - if (!rold->precise && !rcur->precise) 11523 + if (!rold->precise) 11659 11524 return true; 11660 11525 /* new val must satisfy old val knowledge */ 11661 11526 return range_within(rold, rcur) && ··· 11978 11843 { 11979 11844 struct bpf_reg_state *state_reg; 11980 11845 struct bpf_func_state *state; 11981 - int i, err = 0; 11846 + int i, err = 0, fr; 11982 11847 11983 - state = old->frame[old->curframe]; 11984 - state_reg = state->regs; 11985 - for (i = 0; i < BPF_REG_FP; i++, state_reg++) { 11986 - if (state_reg->type != SCALAR_VALUE || 11987 - !state_reg->precise) 11988 - continue; 11989 - if (env->log.level & BPF_LOG_LEVEL2) 11990 - verbose(env, "propagating r%d\n", i); 11991 - err = mark_chain_precision(env, i); 11992 - if (err < 0) 11993 - return err; 11994 - } 11848 + for (fr = old->curframe; fr >= 0; fr--) { 11849 + state = old->frame[fr]; 11850 + state_reg = state->regs; 11851 + for (i = 0; 
i < BPF_REG_FP; i++, state_reg++) { 11852 + if (state_reg->type != SCALAR_VALUE || 11853 + !state_reg->precise) 11854 + continue; 11855 + if (env->log.level & BPF_LOG_LEVEL2) 11856 + verbose(env, "frame %d: propagating r%d\n", fr, i); 11857 + err = mark_chain_precision_frame(env, fr, i); 11858 + if (err < 0) 11859 + return err; 11860 + } 11995 - for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) { 11996 - if (!is_spilled_reg(&state->stack[i])) 11997 - continue; 11998 - state_reg = &state->stack[i].spilled_ptr; 11999 - if (state_reg->type != SCALAR_VALUE || 12000 - !state_reg->precise) 12001 - continue; 12002 - if (env->log.level & BPF_LOG_LEVEL2) 12003 - verbose(env, "propagating fp%d\n", 12004 - (-i - 1) * BPF_REG_SIZE); 12005 - err = mark_chain_precision_stack(env, i); 12006 - if (err < 0) 12007 - return err; 11862 + for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) { 11863 + if (!is_spilled_reg(&state->stack[i])) 11864 + continue; 11865 + state_reg = &state->stack[i].spilled_ptr; 11866 + if (state_reg->type != SCALAR_VALUE || 11867 + !state_reg->precise) 11868 + continue; 11869 + if (env->log.level & BPF_LOG_LEVEL2) 11870 + verbose(env, "frame %d: propagating fp%d\n", 11871 + fr, (-i - 1) * BPF_REG_SIZE); 11872 + err = mark_chain_precision_stack_frame(env, fr, i); 11873 + if (err < 0) 11874 + return err; 11875 + } 12009 11876 } 12010 11877 return 0; 12011 11878 } ··· 12201 12064 env->peak_states++; 12202 12065 env->prev_jmps_processed = env->jmps_processed; 12203 12066 env->prev_insn_processed = env->insn_processed; 12067 + 12068 + /* forget precise markings we inherited, see __mark_chain_precision */ 12069 + if (env->bpf_capable) 12070 + mark_all_scalars_imprecise(env, cur); 12204 12071 12205 12072 /* add new state to the head of linked list */ 12206 12073 new = &new_sl->state; ··· 12814 12673 { 12815 12674 enum bpf_prog_type prog_type = resolve_prog_type(prog); 12816 12675 12817 - if (map_value_has_spin_lock(map)) { 12676 + if 
(btf_record_has_field(map->record, BPF_SPIN_LOCK)) { 12818 12677 if (prog_type == BPF_PROG_TYPE_SOCKET_FILTER) { 12819 12678 verbose(env, "socket filter progs cannot use bpf_spin_lock yet\n"); 12820 12679 return -EINVAL; ··· 12831 12690 } 12832 12691 } 12833 12692 12834 - if (map_value_has_timer(map)) { 12693 + if (btf_record_has_field(map->record, BPF_TIMER)) { 12835 12694 if (is_tracing_prog_type(prog_type)) { 12836 12695 verbose(env, "tracing progs cannot use bpf_timer yet\n"); 12837 12696 return -EINVAL; ··· 14754 14613 BPF_MAIN_FUNC /* callsite */, 14755 14614 0 /* frameno */, 14756 14615 subprog); 14616 + state->first_insn_idx = env->subprog_info[subprog].start; 14617 + state->last_insn_idx = -1; 14757 14618 14758 14619 regs = state->frame[state->curframe]->regs; 14759 14620 if (subprog || env->prog->type == BPF_PROG_TYPE_EXT) {
+2 -2
net/core/bpf_sk_storage.c
··· 147 147 if (!copy_selem) 148 148 return NULL; 149 149 150 - if (map_value_has_spin_lock(&smap->map)) 150 + if (btf_record_has_field(smap->map.record, BPF_SPIN_LOCK)) 151 151 copy_map_value_locked(&smap->map, SDATA(copy_selem)->data, 152 152 SDATA(selem)->data, true); 153 153 else ··· 566 566 if (!nla_value) 567 567 goto errout; 568 568 569 - if (map_value_has_spin_lock(&smap->map)) 569 + if (btf_record_has_field(smap->map.record, BPF_SPIN_LOCK)) 570 570 copy_map_value_locked(&smap->map, nla_data(nla_value), 571 571 sdata->data, true); 572 572 else
+35 -8
net/core/filter.c
··· 2126 2126 2127 2127 if (mlen) { 2128 2128 __skb_pull(skb, mlen); 2129 + if (unlikely(!skb->len)) { 2130 + kfree_skb(skb); 2131 + return -ERANGE; 2132 + } 2129 2133 2130 2134 /* At ingress, the mac header has already been pulled once. 2131 2135 * At egress, skb_pospull_rcsum has to be done in case that ··· 8925 8921 bpf_ctx_record_field_size(info, size_default); 8926 8922 return bpf_ctx_narrow_access_ok(off, size, 8927 8923 size_default); 8924 + case offsetof(struct bpf_sock_ops, skb_hwtstamp): 8925 + if (size != sizeof(__u64)) 8926 + return false; 8927 + break; 8928 8928 default: 8929 8929 if (size != size_default) 8930 8930 return false; ··· 9112 9104 return insn; 9113 9105 } 9114 9106 9115 - static struct bpf_insn *bpf_convert_shinfo_access(const struct bpf_insn *si, 9107 + static struct bpf_insn *bpf_convert_shinfo_access(__u8 dst_reg, __u8 skb_reg, 9116 9108 struct bpf_insn *insn) 9117 9109 { 9118 9110 /* si->dst_reg = skb_shinfo(SKB); */ 9119 9111 #ifdef NET_SKBUFF_DATA_USES_OFFSET 9120 9112 *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, end), 9121 - BPF_REG_AX, si->src_reg, 9113 + BPF_REG_AX, skb_reg, 9122 9114 offsetof(struct sk_buff, end)); 9123 9115 *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, head), 9124 - si->dst_reg, si->src_reg, 9116 + dst_reg, skb_reg, 9125 9117 offsetof(struct sk_buff, head)); 9126 - *insn++ = BPF_ALU64_REG(BPF_ADD, si->dst_reg, BPF_REG_AX); 9118 + *insn++ = BPF_ALU64_REG(BPF_ADD, dst_reg, BPF_REG_AX); 9127 9119 #else 9128 9120 *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, end), 9129 - si->dst_reg, si->src_reg, 9121 + dst_reg, skb_reg, 9130 9122 offsetof(struct sk_buff, end)); 9131 9123 #endif 9132 9124 ··· 9517 9509 break; 9518 9510 9519 9511 case offsetof(struct __sk_buff, gso_segs): 9520 - insn = bpf_convert_shinfo_access(si, insn); 9512 + insn = bpf_convert_shinfo_access(si->dst_reg, si->src_reg, insn); 9521 9513 *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct skb_shared_info, gso_segs), 9522 9514 
si->dst_reg, si->dst_reg, 9523 9515 bpf_target_off(struct skb_shared_info, ··· 9525 9517 target_size)); 9526 9518 break; 9527 9519 case offsetof(struct __sk_buff, gso_size): 9528 - insn = bpf_convert_shinfo_access(si, insn); 9520 + insn = bpf_convert_shinfo_access(si->dst_reg, si->src_reg, insn); 9529 9521 *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct skb_shared_info, gso_size), 9530 9522 si->dst_reg, si->dst_reg, 9531 9523 bpf_target_off(struct skb_shared_info, ··· 9552 9544 BUILD_BUG_ON(sizeof_field(struct skb_shared_hwtstamps, hwtstamp) != 8); 9553 9545 BUILD_BUG_ON(offsetof(struct skb_shared_hwtstamps, hwtstamp) != 0); 9554 9546 9555 - insn = bpf_convert_shinfo_access(si, insn); 9547 + insn = bpf_convert_shinfo_access(si->dst_reg, si->src_reg, insn); 9556 9548 *insn++ = BPF_LDX_MEM(BPF_DW, 9557 9549 si->dst_reg, si->dst_reg, 9558 9550 bpf_target_off(struct skb_shared_info, ··· 10402 10394 tcp_flags), 10403 10395 si->dst_reg, si->dst_reg, off); 10404 10396 break; 10397 + case offsetof(struct bpf_sock_ops, skb_hwtstamp): { 10398 + struct bpf_insn *jmp_on_null_skb; 10399 + 10400 + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_sock_ops_kern, 10401 + skb), 10402 + si->dst_reg, si->src_reg, 10403 + offsetof(struct bpf_sock_ops_kern, 10404 + skb)); 10405 + /* Reserve one insn to test skb == NULL */ 10406 + jmp_on_null_skb = insn++; 10407 + insn = bpf_convert_shinfo_access(si->dst_reg, si->dst_reg, insn); 10408 + *insn++ = BPF_LDX_MEM(BPF_DW, si->dst_reg, si->dst_reg, 10409 + bpf_target_off(struct skb_shared_info, 10410 + hwtstamps, 8, 10411 + target_size)); 10412 + *jmp_on_null_skb = BPF_JMP_IMM(BPF_JEQ, si->dst_reg, 0, 10413 + insn - jmp_on_null_skb - 1); 10414 + break; 10415 + } 10405 10416 } 10406 10417 return insn - insn_buf; 10407 10418 }
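The `skb->len == 0` fix at the top of the filter.c hunk can be modelled in a few lines of userspace C. This is a toy sketch with illustrative names (`toy_pkt`, `pull_mac_header`), not the kernel's skb API: after pulling the mac-header length off the front, a packet that has become empty must be dropped instead of redirected onward.

```c
#include <stdbool.h>

struct toy_pkt {
	unsigned int len;	/* stand-in for skb->len */
};

/* Returns true if the packet survives the pull; false means the
 * caller must free it (the kfree_skb() + -ERANGE path above). */
static bool pull_mac_header(struct toy_pkt *pkt, unsigned int mlen)
{
	if (mlen > pkt->len)
		return false;		/* malformed: nothing to pull */
	pkt->len -= mlen;		/* __skb_pull() analogue */
	return pkt->len != 0;		/* the new skb->len == 0 check */
}
```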
+53 -42
samples/bpf/sockex3_kern.c
···
 #define IP_MF 0x2000
 #define IP_OFFSET 0x1FFF
 
-#define PROG(F) SEC("socket/"__stringify(F)) int bpf_func_##F
-
-struct {
-	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
-	__uint(key_size, sizeof(u32));
-	__uint(value_size, sizeof(u32));
-	__uint(max_entries, 8);
-} jmp_table SEC(".maps");
-
 #define PARSE_VLAN 1
 #define PARSE_MPLS 2
 #define PARSE_IP 3
 #define PARSE_IPV6 4
-
-/* Protocol dispatch routine. It tail-calls next BPF program depending
- * on eth proto. Note, we could have used ...
- *
- * bpf_tail_call(skb, &jmp_table, proto);
- *
- * ... but it would need large prog_array and cannot be optimised given
- * the map key is not static.
- */
-static inline void parse_eth_proto(struct __sk_buff *skb, u32 proto)
-{
-	switch (proto) {
-	case ETH_P_8021Q:
-	case ETH_P_8021AD:
-		bpf_tail_call(skb, &jmp_table, PARSE_VLAN);
-		break;
-	case ETH_P_MPLS_UC:
-	case ETH_P_MPLS_MC:
-		bpf_tail_call(skb, &jmp_table, PARSE_MPLS);
-		break;
-	case ETH_P_IP:
-		bpf_tail_call(skb, &jmp_table, PARSE_IP);
-		break;
-	case ETH_P_IPV6:
-		bpf_tail_call(skb, &jmp_table, PARSE_IPV6);
-		break;
-	}
-}
 
 struct vlan_hdr {
 	__be16 h_vlan_TCI;
···
 	};
 	__u32 ip_proto;
 };
+
+static inline void parse_eth_proto(struct __sk_buff *skb, u32 proto);
 
 static inline int ip_is_fragment(struct __sk_buff *ctx, __u64 nhoff)
 {
···
 	}
 }
 
-PROG(PARSE_IP)(struct __sk_buff *skb)
+SEC("socket")
+int bpf_func_ip(struct __sk_buff *skb)
 {
 	struct globals *g = this_cpu_globals();
 	__u32 nhoff, verlen, ip_proto;
···
 	return 0;
 }
 
-PROG(PARSE_IPV6)(struct __sk_buff *skb)
+SEC("socket")
+int bpf_func_ipv6(struct __sk_buff *skb)
 {
 	struct globals *g = this_cpu_globals();
 	__u32 nhoff, ip_proto;
···
 	return 0;
 }
 
-PROG(PARSE_VLAN)(struct __sk_buff *skb)
+SEC("socket")
+int bpf_func_vlan(struct __sk_buff *skb)
 {
 	__u32 nhoff, proto;
 
···
 	return 0;
 }
 
-PROG(PARSE_MPLS)(struct __sk_buff *skb)
+SEC("socket")
+int bpf_func_mpls(struct __sk_buff *skb)
 {
 	__u32 nhoff, label;
 
···
 	return 0;
 }
 
-SEC("socket/0")
+struct {
+	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+	__uint(key_size, sizeof(u32));
+	__uint(max_entries, 8);
+	__array(values, u32 (void *));
+} prog_array_init SEC(".maps") = {
+	.values = {
+		[PARSE_VLAN] = (void *)&bpf_func_vlan,
+		[PARSE_IP] = (void *)&bpf_func_ip,
+		[PARSE_IPV6] = (void *)&bpf_func_ipv6,
+		[PARSE_MPLS] = (void *)&bpf_func_mpls,
+	},
+};
+
+/* Protocol dispatch routine. It tail-calls next BPF program depending
+ * on eth proto. Note, we could have used ...
+ *
+ * bpf_tail_call(skb, &prog_array_init, proto);
+ *
+ * ... but it would need large prog_array and cannot be optimised given
+ * the map key is not static.
+ */
+static inline void parse_eth_proto(struct __sk_buff *skb, u32 proto)
+{
+	switch (proto) {
+	case ETH_P_8021Q:
+	case ETH_P_8021AD:
+		bpf_tail_call(skb, &prog_array_init, PARSE_VLAN);
+		break;
+	case ETH_P_MPLS_UC:
+	case ETH_P_MPLS_MC:
+		bpf_tail_call(skb, &prog_array_init, PARSE_MPLS);
+		break;
+	case ETH_P_IP:
+		bpf_tail_call(skb, &prog_array_init, PARSE_IP);
+		break;
+	case ETH_P_IPV6:
+		bpf_tail_call(skb, &prog_array_init, PARSE_IPV6);
+		break;
+	}
+}
+
+SEC("socket")
 int main_prog(struct __sk_buff *skb)
 {
 	__u32 nhoff = ETH_HLEN;
+10 -13
samples/bpf/sockex3_user.c
···
 
 int main(int argc, char **argv)
 {
-	int i, sock, key, fd, main_prog_fd, jmp_table_fd, hash_map_fd;
+	int i, sock, fd, main_prog_fd, hash_map_fd;
 	struct bpf_program *prog;
 	struct bpf_object *obj;
-	const char *section;
 	char filename[256];
 	FILE *f;
···
 		goto cleanup;
 	}
 
-	jmp_table_fd = bpf_object__find_map_fd_by_name(obj, "jmp_table");
 	hash_map_fd = bpf_object__find_map_fd_by_name(obj, "hash_map");
-	if (jmp_table_fd < 0 || hash_map_fd < 0) {
+	if (hash_map_fd < 0) {
 		fprintf(stderr, "ERROR: finding a map in obj file failed\n");
 		goto cleanup;
 	}
 
+	/* find BPF main program */
+	main_prog_fd = 0;
 	bpf_object__for_each_program(prog, obj) {
 		fd = bpf_program__fd(prog);
 
-		section = bpf_program__section_name(prog);
-		if (sscanf(section, "socket/%d", &key) != 1) {
-			fprintf(stderr, "ERROR: finding prog failed\n");
-			goto cleanup;
-		}
-
-		if (key == 0)
+		if (!strcmp(bpf_program__name(prog), "main_prog"))
 			main_prog_fd = fd;
-		else
-			bpf_map_update_elem(jmp_table_fd, &key, &fd, BPF_ANY);
+	}
+
+	if (main_prog_fd == 0) {
+		fprintf(stderr, "ERROR: can't find main_prog\n");
+		goto cleanup;
 	}
 
 	sock = open_raw_sock("lo");
+2 -2
samples/bpf/tracex2_kern.c
···
 /* kprobe is NOT a stable ABI. If kernel internals change this bpf+kprobe
  * example will no longer be meaningful
  */
-SEC("kprobe/kfree_skb")
+SEC("kprobe/kfree_skb_reason")
 int bpf_prog2(struct pt_regs *ctx)
 {
 	long loc = 0;
 	long init_val = 1;
 	long *value;
 
-	/* read ip of kfree_skb caller.
+	/* read ip of kfree_skb_reason caller.
 	 * non-portable version of __builtin_return_address(0)
 	 */
 	BPF_KPROBE_READ_RET_IP(loc, ctx);
+2 -1
samples/bpf/tracex2_user.c
···
 	signal(SIGINT, int_exit);
 	signal(SIGTERM, int_exit);
 
-	/* start 'ping' in the background to have some kfree_skb events */
+	/* start 'ping' in the background to have some kfree_skb_reason
+	 * events */
 	f = popen("ping -4 -c5 localhost", "r");
 	(void) f;
+9 -16
tools/bpf/bpftool/btf.c
···
 		if (!btf_id)
 			continue;
 
-		err = hashmap__append(tab, u32_as_hash_field(btf_id),
-				      u32_as_hash_field(id));
+		err = hashmap__append(tab, btf_id, id);
 		if (err) {
 			p_err("failed to append entry to hashmap for BTF ID %u, object ID %u: %s",
 			      btf_id, id, strerror(-err));
···
 		printf("size %uB", info->btf_size);
 
 	n = 0;
-	hashmap__for_each_key_entry(btf_prog_table, entry,
-				    u32_as_hash_field(info->id)) {
-		printf("%s%u", n++ == 0 ? " prog_ids " : ",",
-		       hash_field_as_u32(entry->value));
+	hashmap__for_each_key_entry(btf_prog_table, entry, info->id) {
+		printf("%s%lu", n++ == 0 ? " prog_ids " : ",", entry->value);
 	}
 
 	n = 0;
-	hashmap__for_each_key_entry(btf_map_table, entry,
-				    u32_as_hash_field(info->id)) {
-		printf("%s%u", n++ == 0 ? " map_ids " : ",",
-		       hash_field_as_u32(entry->value));
+	hashmap__for_each_key_entry(btf_map_table, entry, info->id) {
+		printf("%s%lu", n++ == 0 ? " map_ids " : ",", entry->value);
 	}
 
 	emit_obj_refs_plain(refs_table, info->id, "\n\tpids ");
···
 
 	jsonw_name(json_wtr, "prog_ids");
 	jsonw_start_array(json_wtr);	/* prog_ids */
-	hashmap__for_each_key_entry(btf_prog_table, entry,
-				    u32_as_hash_field(info->id)) {
-		jsonw_uint(json_wtr, hash_field_as_u32(entry->value));
+	hashmap__for_each_key_entry(btf_prog_table, entry, info->id) {
+		jsonw_uint(json_wtr, entry->value);
 	}
 	jsonw_end_array(json_wtr);	/* prog_ids */
 
 	jsonw_name(json_wtr, "map_ids");
 	jsonw_start_array(json_wtr);	/* map_ids */
-	hashmap__for_each_key_entry(btf_map_table, entry,
-				    u32_as_hash_field(info->id)) {
-		jsonw_uint(json_wtr, hash_field_as_u32(entry->value));
+	hashmap__for_each_key_entry(btf_map_table, entry, info->id) {
+		jsonw_uint(json_wtr, entry->value);
 	}
 	jsonw_end_array(json_wtr);	/* map_ids */
+5 -5
tools/bpf/bpftool/common.c
···
 		goto out_close;
 	}
 
-	err = hashmap__append(build_fn_table, u32_as_hash_field(pinned_info.id), path);
+	err = hashmap__append(build_fn_table, pinned_info.id, path);
 	if (err) {
 		p_err("failed to append entry to hashmap for ID %u, path '%s': %s",
 		      pinned_info.id, path, strerror(errno));
···
 		return;
 
 	hashmap__for_each_entry(map, entry, bkt)
-		free(entry->value);
+		free(entry->pvalue);
 
 	hashmap__free(map);
 }
···
 	return fd;
 }
 
-size_t hash_fn_for_key_as_id(const void *key, void *ctx)
+size_t hash_fn_for_key_as_id(long key, void *ctx)
 {
-	return (size_t)key;
+	return key;
 }
 
-bool equal_fn_for_key_as_id(const void *k1, const void *k2, void *ctx)
+bool equal_fn_for_key_as_id(long k1, long k2, void *ctx)
 {
 	return k1 == k2;
 }
+7 -12
tools/bpf/bpftool/gen.c
···
 	struct btf *marked_btf; /* btf structure used to mark used types */
 };
 
-static size_t btfgen_hash_fn(const void *key, void *ctx)
+static size_t btfgen_hash_fn(long key, void *ctx)
 {
-	return (size_t)key;
+	return key;
 }
 
-static bool btfgen_equal_fn(const void *k1, const void *k2, void *ctx)
+static bool btfgen_equal_fn(long k1, long k2, void *ctx)
 {
 	return k1 == k2;
-}
-
-static void *u32_as_hash_key(__u32 x)
-{
-	return (void *)(uintptr_t)x;
 }
 
 static void btfgen_free_info(struct btfgen_info *info)
···
 	struct bpf_core_spec specs_scratch[3] = {};
 	struct bpf_core_relo_res targ_res = {};
 	struct bpf_core_cand_list *cands = NULL;
-	const void *type_key = u32_as_hash_key(relo->type_id);
 	const char *sec_name = btf__name_by_offset(btf, sec->sec_name_off);
 
 	if (relo->kind != BPF_CORE_TYPE_ID_LOCAL &&
-	    !hashmap__find(cand_cache, type_key, (void **)&cands)) {
+	    !hashmap__find(cand_cache, relo->type_id, &cands)) {
 		cands = btfgen_find_cands(btf, info->src_btf, relo->type_id);
 		if (!cands) {
 			err = -errno;
 			goto out;
 		}
 
-		err = hashmap__set(cand_cache, type_key, cands, NULL, NULL);
+		err = hashmap__set(cand_cache, relo->type_id, cands,
+				   NULL, NULL);
 		if (err)
 			goto out;
 	}
···
 
 	if (!IS_ERR_OR_NULL(cand_cache)) {
 		hashmap__for_each_entry(cand_cache, entry, i) {
-			bpf_core_free_cands(entry->value);
+			bpf_core_free_cands(entry->pvalue);
 		}
 		hashmap__free(cand_cache);
 	}
+4 -6
tools/bpf/bpftool/link.c
···
 
 		jsonw_name(json_wtr, "pinned");
 		jsonw_start_array(json_wtr);
-		hashmap__for_each_key_entry(link_table, entry,
-					    u32_as_hash_field(info->id))
-			jsonw_string(json_wtr, entry->value);
+		hashmap__for_each_key_entry(link_table, entry, info->id)
+			jsonw_string(json_wtr, entry->pvalue);
 		jsonw_end_array(json_wtr);
 	}
···
 	if (!hashmap__empty(link_table)) {
 		struct hashmap_entry *entry;
 
-		hashmap__for_each_key_entry(link_table, entry,
-					    u32_as_hash_field(info->id))
-			printf("\n\tpinned %s", (char *)entry->value);
+		hashmap__for_each_key_entry(link_table, entry, info->id)
+			printf("\n\tpinned %s", (char *)entry->pvalue);
 	}
 	emit_obj_refs_plain(refs_table, info->id, "\n\tpids ");
+2 -12
tools/bpf/bpftool/main.h
···
 int print_all_levels(__maybe_unused enum libbpf_print_level level,
 		     const char *format, va_list args);
 
-size_t hash_fn_for_key_as_id(const void *key, void *ctx);
-bool equal_fn_for_key_as_id(const void *k1, const void *k2, void *ctx);
+size_t hash_fn_for_key_as_id(long key, void *ctx);
+bool equal_fn_for_key_as_id(long k1, long k2, void *ctx);
 
 /* bpf_attach_type_input_str - convert the provided attach type value into a
  * textual representation that we accept for input purposes.
···
  * returned for unknown bpf_attach_type values.
  */
 const char *bpf_attach_type_input_str(enum bpf_attach_type t);
-
-static inline void *u32_as_hash_field(__u32 x)
-{
-	return (void *)(uintptr_t)x;
-}
-
-static inline __u32 hash_field_as_u32(const void *x)
-{
-	return (__u32)(uintptr_t)x;
-}
 
 static inline bool hashmap__empty(struct hashmap *map)
 {
+4 -6
tools/bpf/bpftool/map.c
···
 
 		jsonw_name(json_wtr, "pinned");
 		jsonw_start_array(json_wtr);
-		hashmap__for_each_key_entry(map_table, entry,
-					    u32_as_hash_field(info->id))
-			jsonw_string(json_wtr, entry->value);
+		hashmap__for_each_key_entry(map_table, entry, info->id)
+			jsonw_string(json_wtr, entry->pvalue);
 		jsonw_end_array(json_wtr);
 	}
···
 	if (!hashmap__empty(map_table)) {
 		struct hashmap_entry *entry;
 
-		hashmap__for_each_key_entry(map_table, entry,
-					    u32_as_hash_field(info->id))
-			printf("\n\tpinned %s", (char *)entry->value);
+		hashmap__for_each_key_entry(map_table, entry, info->id)
+			printf("\n\tpinned %s", (char *)entry->pvalue);
 	}
 
 	if (frozen_str) {
+8 -8
tools/bpf/bpftool/pids.c
···
 	int err, i;
 	void *tmp;
 
-	hashmap__for_each_key_entry(map, entry, u32_as_hash_field(e->id)) {
-		refs = entry->value;
+	hashmap__for_each_key_entry(map, entry, e->id) {
+		refs = entry->pvalue;
 
 		for (i = 0; i < refs->ref_cnt; i++) {
 			if (refs->refs[i].pid == e->pid)
···
 	refs->has_bpf_cookie = e->has_bpf_cookie;
 	refs->bpf_cookie = e->bpf_cookie;
 
-	err = hashmap__append(map, u32_as_hash_field(e->id), refs);
+	err = hashmap__append(map, e->id, refs);
 	if (err)
 		p_err("failed to append entry to hashmap for ID %u: %s",
 		      e->id, strerror(errno));
···
 		return;
 
 	hashmap__for_each_entry(map, entry, bkt) {
-		struct obj_refs *refs = entry->value;
+		struct obj_refs *refs = entry->pvalue;
 
 		free(refs->refs);
 		free(refs);
···
 	if (hashmap__empty(map))
 		return;
 
-	hashmap__for_each_key_entry(map, entry, u32_as_hash_field(id)) {
-		struct obj_refs *refs = entry->value;
+	hashmap__for_each_key_entry(map, entry, id) {
+		struct obj_refs *refs = entry->pvalue;
 		int i;
 
 		if (refs->ref_cnt == 0)
···
 	if (hashmap__empty(map))
 		return;
 
-	hashmap__for_each_key_entry(map, entry, u32_as_hash_field(id)) {
-		struct obj_refs *refs = entry->value;
+	hashmap__for_each_key_entry(map, entry, id) {
+		struct obj_refs *refs = entry->pvalue;
 		int i;
 
 		if (refs->ref_cnt == 0)
+4 -6
tools/bpf/bpftool/prog.c
···
 
 		jsonw_name(json_wtr, "pinned");
 		jsonw_start_array(json_wtr);
-		hashmap__for_each_key_entry(prog_table, entry,
-					    u32_as_hash_field(info->id))
-			jsonw_string(json_wtr, entry->value);
+		hashmap__for_each_key_entry(prog_table, entry, info->id)
+			jsonw_string(json_wtr, entry->pvalue);
 		jsonw_end_array(json_wtr);
 	}
···
 	if (!hashmap__empty(prog_table)) {
 		struct hashmap_entry *entry;
 
-		hashmap__for_each_key_entry(prog_table, entry,
-					    u32_as_hash_field(info->id))
-			printf("\n\tpinned %s", (char *)entry->value);
+		hashmap__for_each_key_entry(prog_table, entry, info->id)
+			printf("\n\tpinned %s", (char *)entry->pvalue);
 	}
 
 	if (info->btf_id)
+1
tools/include/uapi/linux/bpf.h
···
 					 * the outgoing header has not
 					 * been written yet.
 					 */
+	__u64 skb_hwtstamp;
 };
 
 /* Definitions for bpf_sock_ops_cb_flags */
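For context, `skb_hwtstamp` exposes the skb's hardware timestamp (in nanoseconds, 0 when the NIC took none) to sockops programs. A minimal consumer could look like the sketch below; the program name, section placement, and logging are illustrative only (not part of this patch) and assume a kernel with this field plus libbpf's vmlinux.h:

/* Illustrative sockops sketch (not from this series): log the hardware
 * RX timestamp when the NIC provided one; skb_hwtstamp reads as 0 otherwise. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

SEC("sockops")
int log_hwtstamp(struct bpf_sock_ops *skops)
{
	if (skops->skb_hwtstamp)
		bpf_printk("hw rx tstamp: %llu ns", skops->skb_hwtstamp);
	return 1;
}

char _license[] SEC("license") = "GPL";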
+184 -75
tools/lib/bpf/btf.c
···
 static int btf_rewrite_str(__u32 *str_off, void *ctx)
 {
 	struct btf_pipe *p = ctx;
-	void *mapped_off;
+	long mapped_off;
 	int off, err;
 
 	if (!*str_off) /* nothing to do for empty strings */
 		return 0;
 
 	if (p->str_off_map &&
-	    hashmap__find(p->str_off_map, (void *)(long)*str_off, &mapped_off)) {
-		*str_off = (__u32)(long)mapped_off;
+	    hashmap__find(p->str_off_map, *str_off, &mapped_off)) {
+		*str_off = mapped_off;
 		return 0;
 	}
···
 	 * performing expensive string comparisons.
 	 */
 	if (p->str_off_map) {
-		err = hashmap__append(p->str_off_map, (void *)(long)*str_off, (void *)(long)off);
+		err = hashmap__append(p->str_off_map, *str_off, off);
 		if (err)
 			return err;
 	}
···
 	return 0;
 }
 
-static size_t btf_dedup_identity_hash_fn(const void *key, void *ctx);
-static bool btf_dedup_equal_fn(const void *k1, const void *k2, void *ctx);
+static size_t btf_dedup_identity_hash_fn(long key, void *ctx);
+static bool btf_dedup_equal_fn(long k1, long k2, void *ctx);
 
 int btf__add_btf(struct btf *btf, const struct btf *src_btf)
 {
···
 static int btf_dedup_prim_types(struct btf_dedup *d);
 static int btf_dedup_struct_types(struct btf_dedup *d);
 static int btf_dedup_ref_types(struct btf_dedup *d);
+static int btf_dedup_resolve_fwds(struct btf_dedup *d);
 static int btf_dedup_compact_types(struct btf_dedup *d);
 static int btf_dedup_remap_types(struct btf_dedup *d);
 
···
  * Algorithm summary
  * =================
  *
- * Algorithm completes its work in 6 separate passes:
+ * Algorithm completes its work in 7 separate passes:
  *
  * 1. Strings deduplication.
  * 2. Primitive types deduplication (int, enum, fwd).
  * 3. Struct/union types deduplication.
- * 4. Reference types deduplication (pointers, typedefs, arrays, funcs, func
+ * 4. Resolve unambiguous forward declarations.
+ * 5. Reference types deduplication (pointers, typedefs, arrays, funcs, func
  *    protos, and const/volatile/restrict modifiers).
- * 5. Types compaction.
- * 6. Types remapping.
+ * 6. Types compaction.
+ * 7. Types remapping.
  *
  * Algorithm determines canonical type descriptor, which is a single
  * representative type for each truly unique type. This canonical type is the
···
 	err = btf_dedup_struct_types(d);
 	if (err < 0) {
 		pr_debug("btf_dedup_struct_types failed:%d\n", err);
+		goto done;
+	}
+	err = btf_dedup_resolve_fwds(d);
+	if (err < 0) {
+		pr_debug("btf_dedup_resolve_fwds failed:%d\n", err);
 		goto done;
 	}
 	err = btf_dedup_ref_types(d);
···
 }
 
 #define for_each_dedup_cand(d, node, hash) \
-	hashmap__for_each_key_entry(d->dedup_table, node, (void *)hash)
+	hashmap__for_each_key_entry(d->dedup_table, node, hash)
 
 static int btf_dedup_table_add(struct btf_dedup *d, long hash, __u32 type_id)
 {
-	return hashmap__append(d->dedup_table,
-			       (void *)hash, (void *)(long)type_id);
+	return hashmap__append(d->dedup_table, hash, type_id);
 }
 
 static int btf_dedup_hypot_map_add(struct btf_dedup *d,
···
 	free(d);
 }
 
-static size_t btf_dedup_identity_hash_fn(const void *key, void *ctx)
+static size_t btf_dedup_identity_hash_fn(long key, void *ctx)
 {
-	return (size_t)key;
+	return key;
 }
 
-static size_t btf_dedup_collision_hash_fn(const void *key, void *ctx)
+static size_t btf_dedup_collision_hash_fn(long key, void *ctx)
 {
 	return 0;
 }
 
-static bool btf_dedup_equal_fn(const void *k1, const void *k2, void *ctx)
+static bool btf_dedup_equal_fn(long k1, long k2, void *ctx)
 {
 	return k1 == k2;
 }
···
 {
 	long h;
 
-	/* don't hash vlen and enum members to support enum fwd resolving */
+	/* don't hash vlen, enum members and size to support enum fwd resolving */
 	h = hash_combine(0, t->name_off);
-	h = hash_combine(h, t->info & ~0xffff);
-	h = hash_combine(h, t->size);
 	return h;
 }
 
-/* Check structural equality of two ENUMs. */
-static bool btf_equal_enum(struct btf_type *t1, struct btf_type *t2)
+static bool btf_equal_enum_members(struct btf_type *t1, struct btf_type *t2)
 {
 	const struct btf_enum *m1, *m2;
 	__u16 vlen;
 	int i;
-
-	if (!btf_equal_common(t1, t2))
-		return false;
 
 	vlen = btf_vlen(t1);
 	m1 = btf_enum(t1);
···
 	return true;
 }
 
-static bool btf_equal_enum64(struct btf_type *t1, struct btf_type *t2)
+static bool btf_equal_enum64_members(struct btf_type *t1, struct btf_type *t2)
 {
 	const struct btf_enum64 *m1, *m2;
 	__u16 vlen;
 	int i;
-
-	if (!btf_equal_common(t1, t2))
-		return false;
 
 	vlen = btf_vlen(t1);
 	m1 = btf_enum64(t1);
···
 	return true;
 }
 
+/* Check structural equality of two ENUMs or ENUM64s. */
+static bool btf_equal_enum(struct btf_type *t1, struct btf_type *t2)
+{
+	if (!btf_equal_common(t1, t2))
+		return false;
+
+	/* t1 & t2 kinds are identical because of btf_equal_common */
+	if (btf_kind(t1) == BTF_KIND_ENUM)
+		return btf_equal_enum_members(t1, t2);
+	else
+		return btf_equal_enum64_members(t1, t2);
+}
+
 static inline bool btf_is_enum_fwd(struct btf_type *t)
 {
 	return btf_is_any_enum(t) && btf_vlen(t) == 0;
···
 {
 	if (!btf_is_enum_fwd(t1) && !btf_is_enum_fwd(t2))
 		return btf_equal_enum(t1, t2);
-	/* ignore vlen when comparing */
+	/* At this point either t1 or t2 or both are forward declarations, thus:
+	 * - skip comparing vlen because it is zero for forward declarations;
+	 * - skip comparing size to allow enum forward declarations
+	 *   to be compatible with enum64 full declarations;
+	 * - skip comparing kind for the same reason.
+	 */
 	return t1->name_off == t2->name_off &&
-	       (t1->info & ~0xffff) == (t2->info & ~0xffff) &&
-	       t1->size == t2->size;
-}
-
-static bool btf_compat_enum64(struct btf_type *t1, struct btf_type *t2)
-{
-	if (!btf_is_enum_fwd(t1) && !btf_is_enum_fwd(t2))
-		return btf_equal_enum64(t1, t2);
-
-	/* ignore vlen when comparing */
-	return t1->name_off == t2->name_off &&
-	       (t1->info & ~0xffff) == (t2->info & ~0xffff) &&
-	       t1->size == t2->size;
+	       btf_is_any_enum(t1) && btf_is_any_enum(t2);
 }
 
 /*
···
 	case BTF_KIND_INT:
 		h = btf_hash_int_decl_tag(t);
 		for_each_dedup_cand(d, hash_entry, h) {
-			cand_id = (__u32)(long)hash_entry->value;
+			cand_id = hash_entry->value;
 			cand = btf_type_by_id(d->btf, cand_id);
 			if (btf_equal_int_tag(t, cand)) {
 				new_id = cand_id;
···
 		break;
 
 	case BTF_KIND_ENUM:
+	case BTF_KIND_ENUM64:
 		h = btf_hash_enum(t);
 		for_each_dedup_cand(d, hash_entry, h) {
-			cand_id = (__u32)(long)hash_entry->value;
+			cand_id = hash_entry->value;
 			cand = btf_type_by_id(d->btf, cand_id);
 			if (btf_equal_enum(t, cand)) {
 				new_id = cand_id;
···
 		}
 		break;
 
-	case BTF_KIND_ENUM64:
-		h = btf_hash_enum(t);
-		for_each_dedup_cand(d, hash_entry, h) {
-			cand_id = (__u32)(long)hash_entry->value;
-			cand = btf_type_by_id(d->btf, cand_id);
-			if (btf_equal_enum64(t, cand)) {
-				new_id = cand_id;
-				break;
-			}
-			if (btf_compat_enum64(t, cand)) {
-				if (btf_is_enum_fwd(t)) {
-					/* resolve fwd to full enum */
-					new_id = cand_id;
-					break;
-				}
-				/* resolve canonical enum fwd to full enum */
-				d->map[cand_id] = type_id;
-			}
-		}
-		break;
-
 	case BTF_KIND_FWD:
 	case BTF_KIND_FLOAT:
 		h = btf_hash_common(t);
 		for_each_dedup_cand(d, hash_entry, h) {
-			cand_id = (__u32)(long)hash_entry->value;
+			cand_id = hash_entry->value;
 			cand = btf_type_by_id(d->btf, cand_id);
 			if (btf_equal_common(t, cand)) {
 				new_id = cand_id;
···
 		return btf_equal_int_tag(cand_type, canon_type);
 
 	case BTF_KIND_ENUM:
-		return btf_compat_enum(cand_type, canon_type);
-
 	case BTF_KIND_ENUM64:
-		return btf_compat_enum64(cand_type, canon_type);
+		return btf_compat_enum(cand_type, canon_type);
 
 	case BTF_KIND_FWD:
 	case BTF_KIND_FLOAT:
···
 
 	h = btf_hash_struct(t);
 	for_each_dedup_cand(d, hash_entry, h) {
-		__u32 cand_id = (__u32)(long)hash_entry->value;
+		__u32 cand_id = hash_entry->value;
 		int eq;
 
 		/*
···
 
 	h = btf_hash_common(t);
 	for_each_dedup_cand(d, hash_entry, h) {
-		cand_id = (__u32)(long)hash_entry->value;
+		cand_id = hash_entry->value;
 		cand = btf_type_by_id(d->btf, cand_id);
 		if (btf_equal_common(t, cand)) {
 			new_id = cand_id;
···
 
 	h = btf_hash_int_decl_tag(t);
 	for_each_dedup_cand(d, hash_entry, h) {
-		cand_id = (__u32)(long)hash_entry->value;
+		cand_id = hash_entry->value;
 		cand = btf_type_by_id(d->btf, cand_id);
 		if (btf_equal_int_tag(t, cand)) {
 			new_id = cand_id;
···
 
 	h = btf_hash_array(t);
 	for_each_dedup_cand(d, hash_entry, h) {
-		cand_id = (__u32)(long)hash_entry->value;
+		cand_id = hash_entry->value;
 		cand = btf_type_by_id(d->btf, cand_id);
 		if (btf_equal_array(t, cand)) {
 			new_id = cand_id;
···
 
 	h = btf_hash_fnproto(t);
 	for_each_dedup_cand(d, hash_entry, h) {
-		cand_id = (__u32)(long)hash_entry->value;
+		cand_id = hash_entry->value;
 		cand = btf_type_by_id(d->btf, cand_id);
 		if (btf_equal_fnproto(t, cand)) {
 			new_id = cand_id;
···
 	hashmap__free(d->dedup_table);
 	d->dedup_table = NULL;
 	return 0;
+}
+
+/*
+ * Collect a map from type names to type ids for all canonical structs
+ * and unions. If the same name is shared by several canonical types
+ * use a special value 0 to indicate this fact.
+ */
+static int btf_dedup_fill_unique_names_map(struct btf_dedup *d, struct hashmap *names_map)
+{
+	__u32 nr_types = btf__type_cnt(d->btf);
+	struct btf_type *t;
+	__u32 type_id;
+	__u16 kind;
+	int err;
+
+	/*
+	 * Iterate over base and split module ids in order to get all
+	 * available structs in the map.
+	 */
+	for (type_id = 1; type_id < nr_types; ++type_id) {
+		t = btf_type_by_id(d->btf, type_id);
+		kind = btf_kind(t);
+
+		if (kind != BTF_KIND_STRUCT && kind != BTF_KIND_UNION)
+			continue;
+
+		/* Skip non-canonical types */
+		if (type_id != d->map[type_id])
+			continue;
+
+		err = hashmap__add(names_map, t->name_off, type_id);
+		if (err == -EEXIST)
+			err = hashmap__set(names_map, t->name_off, 0, NULL, NULL);
+
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int btf_dedup_resolve_fwd(struct btf_dedup *d, struct hashmap *names_map, __u32 type_id)
+{
+	struct btf_type *t = btf_type_by_id(d->btf, type_id);
+	enum btf_fwd_kind fwd_kind = btf_kflag(t);
+	__u16 cand_kind, kind = btf_kind(t);
+	struct btf_type *cand_t;
+	uintptr_t cand_id;
+
+	if (kind != BTF_KIND_FWD)
+		return 0;
+
+	/* Skip if this FWD already has a mapping */
+	if (type_id != d->map[type_id])
+		return 0;
+
+	if (!hashmap__find(names_map, t->name_off, &cand_id))
+		return 0;
+
+	/* Zero is a special value indicating that name is not unique */
+	if (!cand_id)
+		return 0;
+
+	cand_t = btf_type_by_id(d->btf, cand_id);
+	cand_kind = btf_kind(cand_t);
+	if ((cand_kind == BTF_KIND_STRUCT && fwd_kind != BTF_FWD_STRUCT) ||
+	    (cand_kind == BTF_KIND_UNION && fwd_kind != BTF_FWD_UNION))
+		return 0;
+
+	d->map[type_id] = cand_id;
+
+	return 0;
+}
+
+/*
+ * Resolve unambiguous forward declarations.
+ *
+ * The lion's share of all FWD declarations is resolved during
+ * `btf_dedup_struct_types` phase when different type graphs are
+ * compared against each other. However, if in some compilation unit a
+ * FWD declaration is not a part of a type graph compared against
+ * another type graph that declaration's canonical type would not be
+ * changed. Example:
+ *
+ * CU #1:
+ *
+ * struct foo;
+ * struct foo *some_global;
+ *
+ * CU #2:
+ *
+ * struct foo { int u; };
+ * struct foo *another_global;
+ *
+ * After `btf_dedup_struct_types` the BTF looks as follows:
+ *
+ * [1] STRUCT 'foo' size=4 vlen=1 ...
+ * [2] INT 'int' size=4 ...
+ * [3] PTR '(anon)' type_id=1
+ * [4] FWD 'foo' fwd_kind=struct
+ * [5] PTR '(anon)' type_id=4
+ *
+ * This pass assumes that such FWD declarations should be mapped to
+ * structs or unions with identical name in case if the name is not
+ * ambiguous.
+ */
+static int btf_dedup_resolve_fwds(struct btf_dedup *d)
+{
+	int i, err;
+	struct hashmap *names_map;
+
+	names_map = hashmap__new(btf_dedup_identity_hash_fn, btf_dedup_equal_fn, NULL);
+	if (IS_ERR(names_map))
+		return PTR_ERR(names_map);
+
+	err = btf_dedup_fill_unique_names_map(d, names_map);
+	if (err < 0)
+		goto exit;
+
+	for (i = 0; i < d->btf->nr_types; i++) {
+		err = btf_dedup_resolve_fwd(d, names_map, d->btf->start_id + i);
+		if (err < 0)
+			break;
+	}
+
+exit:
+	hashmap__free(names_map);
+	return err;
 }
 
 /*
+7 -8
tools/lib/bpf/btf_dump.c
···
 	struct btf_dump_data *typed_dump;
 };
 
-static size_t str_hash_fn(const void *key, void *ctx)
+static size_t str_hash_fn(long key, void *ctx)
 {
-	return str_hash(key);
+	return str_hash((void *)key);
 }
 
-static bool str_equal_fn(const void *a, const void *b, void *ctx)
+static bool str_equal_fn(long a, long b, void *ctx)
 {
-	return strcmp(a, b) == 0;
+	return strcmp((void *)a, (void *)b) == 0;
 }
 
 static const char *btf_name_of(const struct btf_dump *d, __u32 name_off)
···
 	struct hashmap_entry *cur;
 
 	hashmap__for_each_entry(map, cur, bkt)
-		free((void *)cur->key);
+		free((void *)cur->pkey);
 
 	hashmap__free(map);
 }
···
 	if (!new_name)
 		return 1;
 
-	hashmap__find(name_map, orig_name, (void **)&dup_cnt);
+	hashmap__find(name_map, orig_name, &dup_cnt);
 	dup_cnt++;
 
-	err = hashmap__set(name_map, new_name, (void *)dup_cnt,
-			   (const void **)&old_name, NULL);
+	err = hashmap__set(name_map, new_name, dup_cnt, &old_name, NULL);
 	if (err)
 		free(new_name);
+9 -9
tools/lib/bpf/hashmap.c
···
 }
 
 static bool hashmap_find_entry(const struct hashmap *map,
-			       const void *key, size_t hash,
+			       const long key, size_t hash,
 			       struct hashmap_entry ***pprev,
 			       struct hashmap_entry **entry)
 {
···
 	return false;
 }
 
-int hashmap__insert(struct hashmap *map, const void *key, void *value,
-		    enum hashmap_insert_strategy strategy,
-		    const void **old_key, void **old_value)
+int hashmap_insert(struct hashmap *map, long key, long value,
+		   enum hashmap_insert_strategy strategy,
+		   long *old_key, long *old_value)
 {
 	struct hashmap_entry *entry;
 	size_t h;
 	int err;
 
 	if (old_key)
-		*old_key = NULL;
+		*old_key = 0;
 	if (old_value)
-		*old_value = NULL;
+		*old_value = 0;
 
 	h = hash_bits(map->hash_fn(key, map->ctx), map->cap_bits);
 	if (strategy != HASHMAP_APPEND &&
···
 	return 0;
 }
 
-bool hashmap__find(const struct hashmap *map, const void *key, void **value)
+bool hashmap_find(const struct hashmap *map, long key, long *value)
 {
 	struct hashmap_entry *entry;
 	size_t h;
···
 	return true;
 }
 
-bool hashmap__delete(struct hashmap *map, const void *key,
-		     const void **old_key, void **old_value)
+bool hashmap_delete(struct hashmap *map, long key,
+		    long *old_key, long *old_value)
 {
 	struct hashmap_entry **pprev, *entry;
 	size_t h;
+57 -34
tools/lib/bpf/hashmap.h
···
 	return h;
 }
 
-typedef size_t (*hashmap_hash_fn)(const void *key, void *ctx);
-typedef bool (*hashmap_equal_fn)(const void *key1, const void *key2, void *ctx);
+typedef size_t (*hashmap_hash_fn)(long key, void *ctx);
+typedef bool (*hashmap_equal_fn)(long key1, long key2, void *ctx);
 
+/*
+ * Hashmap interface is polymorphic, keys and values could be either
+ * long-sized integers or pointers, this is achieved as follows:
+ * - interface functions that operate on keys and values are hidden
+ *   behind auxiliary macros, e.g. hashmap_insert <-> hashmap__insert;
+ * - these auxiliary macros cast the key and value parameters as
+ *   long or long *, so the user does not have to specify the casts explicitly;
+ * - for pointer parameters (e.g. old_key) the size of the pointed
+ *   type is verified by hashmap_cast_ptr using _Static_assert;
+ * - when iterating using hashmap__for_each_* forms
+ *   hashmap_entry->key should be used for integer keys and
+ *   hashmap_entry->pkey should be used for pointer keys,
+ *   same goes for values.
+ */
 struct hashmap_entry {
-	const void *key;
-	void *value;
+	union {
+		long key;
+		const void *pkey;
+	};
+	union {
+		long value;
+		void *pvalue;
+	};
 	struct hashmap_entry *next;
 };
 
···
 	HASHMAP_APPEND,
 };
 
+#define hashmap_cast_ptr(p) ({								\
+	_Static_assert((__builtin_constant_p((p)) ? (p) == NULL : 0) ||			\
+		       sizeof(*(p)) == sizeof(long),					\
+		       #p " pointee should be a long-sized integer or a pointer");	\
+	(long *)(p);									\
+})
+
 /*
  * hashmap__insert() adds key/value entry w/ various semantics, depending on
  * provided strategy value. If a given key/value pair replaced already
···
  * through old_key and old_value to allow calling code do proper memory
  * management.
  */
-int hashmap__insert(struct hashmap *map, const void *key, void *value,
-		    enum hashmap_insert_strategy strategy,
-		    const void **old_key, void **old_value);
+int hashmap_insert(struct hashmap *map, long key, long value,
+		   enum hashmap_insert_strategy strategy,
+		   long *old_key, long *old_value);
 
-static inline int hashmap__add(struct hashmap *map,
-			       const void *key, void *value)
-{
-	return hashmap__insert(map, key, value, HASHMAP_ADD, NULL, NULL);
-}
+#define hashmap__insert(map, key, value, strategy, old_key, old_value)	\
+	hashmap_insert((map), (long)(key), (long)(value), (strategy),	\
+		       hashmap_cast_ptr(old_key),			\
+		       hashmap_cast_ptr(old_value))
 
-static inline int hashmap__set(struct hashmap *map,
-			       const void *key, void *value,
-			       const void **old_key, void **old_value)
-{
-	return hashmap__insert(map, key, value, HASHMAP_SET,
-			       old_key, old_value);
-}
+#define hashmap__add(map, key, value) \
+	hashmap__insert((map), (key), (value), HASHMAP_ADD, NULL, NULL)
 
-static inline int hashmap__update(struct hashmap *map,
-				  const void *key, void *value,
-				  const void **old_key, void **old_value)
-{
-	return hashmap__insert(map, key, value, HASHMAP_UPDATE,
-			       old_key, old_value);
-}
+#define hashmap__set(map, key, value, old_key, old_value) \
+	hashmap__insert((map), (key), (value), HASHMAP_SET, (old_key), (old_value))
 
-static inline int hashmap__append(struct hashmap *map,
-				  const void *key, void *value)
-{
-	return hashmap__insert(map, key, value, HASHMAP_APPEND, NULL, NULL);
-}
+#define hashmap__update(map, key, value, old_key, old_value) \
+	hashmap__insert((map), (key), (value), HASHMAP_UPDATE, (old_key), (old_value))
 
-bool hashmap__delete(struct hashmap *map, const void *key,
-		     const void **old_key, void **old_value);
+#define hashmap__append(map, key, value) \
+	hashmap__insert((map), (key), (value), HASHMAP_APPEND, NULL, NULL)
 
-bool hashmap__find(const struct hashmap *map, const void *key, void **value);
+bool hashmap_delete(struct hashmap *map, long key, long *old_key, long *old_value);
+
+#define hashmap__delete(map, key, old_key, old_value)	\
+	hashmap_delete((map), (long)(key),		\
+		       hashmap_cast_ptr(old_key),	\
+		       hashmap_cast_ptr(old_value))
+
+bool hashmap_find(const struct hashmap *map, long key, long *value);
+
+#define hashmap__find(map, key, value) \
+	hashmap_find((map), (long)(key), hashmap_cast_ptr(value))
 
 /*
  * hashmap__for_each_entry - iterate over all entries in hashmap
+6 -12
tools/lib/bpf/libbpf.c
··· 5601 5601 return __bpf_core_types_match(local_btf, local_id, targ_btf, targ_id, false, 32); 5602 5602 } 5603 5603 5604 - static size_t bpf_core_hash_fn(const void *key, void *ctx) 5604 + static size_t bpf_core_hash_fn(const long key, void *ctx) 5605 5605 { 5606 - return (size_t)key; 5606 + return key; 5607 5607 } 5608 5608 5609 - static bool bpf_core_equal_fn(const void *k1, const void *k2, void *ctx) 5609 + static bool bpf_core_equal_fn(const long k1, const long k2, void *ctx) 5610 5610 { 5611 5611 return k1 == k2; 5612 - } 5613 - 5614 - static void *u32_as_hash_key(__u32 x) 5615 - { 5616 - return (void *)(uintptr_t)x; 5617 5612 } 5618 5613 5619 5614 static int record_relo_core(struct bpf_program *prog, ··· 5653 5658 struct bpf_core_relo_res *targ_res) 5654 5659 { 5655 5660 struct bpf_core_spec specs_scratch[3] = {}; 5656 - const void *type_key = u32_as_hash_key(relo->type_id); 5657 5661 struct bpf_core_cand_list *cands = NULL; 5658 5662 const char *prog_name = prog->name; 5659 5663 const struct btf_type *local_type; ··· 5669 5675 return -EINVAL; 5670 5676 5671 5677 if (relo->kind != BPF_CORE_TYPE_ID_LOCAL && 5672 - !hashmap__find(cand_cache, type_key, (void **)&cands)) { 5678 + !hashmap__find(cand_cache, local_id, &cands)) { 5673 5679 cands = bpf_core_find_cands(prog->obj, local_btf, local_id); 5674 5680 if (IS_ERR(cands)) { 5675 5681 pr_warn("prog '%s': relo #%d: target candidate search failed for [%d] %s %s: %ld\n", ··· 5677 5683 local_name, PTR_ERR(cands)); 5678 5684 return PTR_ERR(cands); 5679 5685 } 5680 - err = hashmap__set(cand_cache, type_key, cands, NULL, NULL); 5686 + err = hashmap__set(cand_cache, local_id, cands, NULL, NULL); 5681 5687 if (err) { 5682 5688 bpf_core_free_cands(cands); 5683 5689 return err; ··· 5800 5806 5801 5807 if (!IS_ERR_OR_NULL(cand_cache)) { 5802 5808 hashmap__for_each_entry(cand_cache, entry, i) { 5803 - bpf_core_free_cands(entry->value); 5809 + bpf_core_free_cands(entry->pvalue); 5804 5810 } 5805 5811 
hashmap__free(cand_cache); 5806 5812 }
+9 -9
tools/lib/bpf/strset.c
··· 19 19 struct hashmap *strs_hash; 20 20 }; 21 21 22 - static size_t strset_hash_fn(const void *key, void *ctx) 22 + static size_t strset_hash_fn(long key, void *ctx) 23 23 { 24 24 const struct strset *s = ctx; 25 - const char *str = s->strs_data + (long)key; 25 + const char *str = s->strs_data + key; 26 26 27 27 return str_hash(str); 28 28 } 29 29 30 - static bool strset_equal_fn(const void *key1, const void *key2, void *ctx) 30 + static bool strset_equal_fn(long key1, long key2, void *ctx) 31 31 { 32 32 const struct strset *s = ctx; 33 - const char *str1 = s->strs_data + (long)key1; 34 - const char *str2 = s->strs_data + (long)key2; 33 + const char *str1 = s->strs_data + key1; 34 + const char *str2 = s->strs_data + key2; 35 35 36 36 return strcmp(str1, str2) == 0; 37 37 } ··· 67 67 /* hashmap__add() returns EEXIST if string with the same 68 68 * content already is in the hash map 69 69 */ 70 - err = hashmap__add(hash, (void *)off, (void *)off); 70 + err = hashmap__add(hash, off, off); 71 71 if (err == -EEXIST) 72 72 continue; /* duplicate */ 73 73 if (err) ··· 127 127 new_off = set->strs_data_len; 128 128 memcpy(p, s, len); 129 129 130 - if (hashmap__find(set->strs_hash, (void *)new_off, (void **)&old_off)) 130 + if (hashmap__find(set->strs_hash, new_off, &old_off)) 131 131 return old_off; 132 132 133 133 return -ENOENT; ··· 165 165 * contents doesn't exist already (HASHMAP_ADD strategy). If such 166 166 * string exists, we'll get its offset in old_off (that's old_key). 167 167 */ 168 - err = hashmap__insert(set->strs_hash, (void *)new_off, (void *)new_off, 169 - HASHMAP_ADD, (const void **)&old_off, NULL); 168 + err = hashmap__insert(set->strs_hash, new_off, new_off, 169 + HASHMAP_ADD, &old_off, NULL); 170 170 if (err == -EEXIST) 171 171 return old_off; /* duplicated string, return existing offset */ 172 172 if (err)
+12 -16
tools/lib/bpf/usdt.c
··· 873 873 free(usdt_link); 874 874 } 875 875 876 - static size_t specs_hash_fn(const void *key, void *ctx) 876 + static size_t specs_hash_fn(long key, void *ctx) 877 877 { 878 - const char *s = key; 879 - 880 - return str_hash(s); 878 + return str_hash((char *)key); 881 879 } 882 880 883 - static bool specs_equal_fn(const void *key1, const void *key2, void *ctx) 881 + static bool specs_equal_fn(long key1, long key2, void *ctx) 884 882 { 885 - const char *s1 = key1; 886 - const char *s2 = key2; 887 - 888 - return strcmp(s1, s2) == 0; 883 + return strcmp((char *)key1, (char *)key2) == 0; 889 884 } 890 885 891 886 static int allocate_spec_id(struct usdt_manager *man, struct hashmap *specs_hash, 892 887 struct bpf_link_usdt *link, struct usdt_target *target, 893 888 int *spec_id, bool *is_new) 894 889 { 895 - void *tmp; 890 + long tmp; 891 + void *new_ids; 896 892 int err; 897 893 898 894 /* check if we already allocated spec ID for this spec string */ 899 895 if (hashmap__find(specs_hash, target->spec_str, &tmp)) { 900 - *spec_id = (long)tmp; 896 + *spec_id = tmp; 901 897 *is_new = false; 902 898 return 0; 903 899 } ··· 901 905 /* otherwise it's a new ID that needs to be set up in specs map and 902 906 * returned back to usdt_manager when USDT link is detached 903 907 */ 904 - tmp = libbpf_reallocarray(link->spec_ids, link->spec_cnt + 1, sizeof(*link->spec_ids)); 905 - if (!tmp) 908 + new_ids = libbpf_reallocarray(link->spec_ids, link->spec_cnt + 1, sizeof(*link->spec_ids)); 909 + if (!new_ids) 906 910 return -ENOMEM; 907 - link->spec_ids = tmp; 911 + link->spec_ids = new_ids; 908 912 909 913 /* get next free spec ID, giving preference to free list, if not empty */ 910 914 if (man->free_spec_cnt) { 911 915 *spec_id = man->free_spec_ids[man->free_spec_cnt - 1]; 912 916 913 917 /* cache spec ID for current spec string for future lookups */ 914 - err = hashmap__add(specs_hash, target->spec_str, (void *)(long)*spec_id); 918 + err = hashmap__add(specs_hash, 
target->spec_str, *spec_id); 915 919 if (err) 916 920 return err; 917 921 ··· 924 928 *spec_id = man->next_free_spec_id; 925 929 926 930 /* cache spec ID for current spec string for future lookups */ 927 - err = hashmap__add(specs_hash, target->spec_str, (void *)(long)*spec_id); 931 + err = hashmap__add(specs_hash, target->spec_str, *spec_id); 928 932 if (err) 929 933 return err; 930 934
+10 -18
tools/perf/tests/expr.c
··· 130 130 expr__find_ids("FOO + BAR + BAZ + BOZO", "FOO", 131 131 ctx) == 0); 132 132 TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 3); 133 - TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "BAR", 134 - (void **)&val_ptr)); 135 - TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "BAZ", 136 - (void **)&val_ptr)); 137 - TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "BOZO", 138 - (void **)&val_ptr)); 133 + TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "BAR", &val_ptr)); 134 + TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "BAZ", &val_ptr)); 135 + TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "BOZO", &val_ptr)); 139 136 140 137 expr__ctx_clear(ctx); 141 138 ctx->sctx.runtime = 3; ··· 140 143 expr__find_ids("EVENT1\\,param\\=?@ + EVENT2\\,param\\=?@", 141 144 NULL, ctx) == 0); 142 145 TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 2); 143 - TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "EVENT1,param=3@", 144 - (void **)&val_ptr)); 145 - TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "EVENT2,param=3@", 146 - (void **)&val_ptr)); 146 + TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "EVENT1,param=3@", &val_ptr)); 147 + TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "EVENT2,param=3@", &val_ptr)); 147 148 148 149 expr__ctx_clear(ctx); 149 150 TEST_ASSERT_VAL("find ids", 150 151 expr__find_ids("dash\\-event1 - dash\\-event2", 151 152 NULL, ctx) == 0); 152 153 TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 2); 153 - TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "dash-event1", 154 - (void **)&val_ptr)); 155 - TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "dash-event2", 156 - (void **)&val_ptr)); 154 + TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "dash-event1", &val_ptr)); 155 + TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "dash-event2", &val_ptr)); 157 156 158 157 /* Only EVENT1 or EVENT2 need be measured depending on the value of smt_on. 
*/ 159 158 { ··· 167 174 TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 1); 168 175 TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, 169 176 smton ? "EVENT1" : "EVENT2", 170 - (void **)&val_ptr)); 177 + &val_ptr)); 171 178 172 179 expr__ctx_clear(ctx); 173 180 TEST_ASSERT_VAL("find ids", ··· 176 183 TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 1); 177 184 TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, 178 185 corewide ? "EVENT1" : "EVENT2", 179 - (void **)&val_ptr)); 186 + &val_ptr)); 180 187 181 188 } 182 189 /* The expression is a constant 1.0 without needing to evaluate EVENT1. */ ··· 213 220 expr__find_ids("source_count(EVENT1)", 214 221 NULL, ctx) == 0); 215 222 TEST_ASSERT_VAL("source count", hashmap__size(ctx->ids) == 1); 216 - TEST_ASSERT_VAL("source count", hashmap__find(ctx->ids, "EVENT1", 217 - (void **)&val_ptr)); 223 + TEST_ASSERT_VAL("source count", hashmap__find(ctx->ids, "EVENT1", &val_ptr)); 218 224 219 225 expr__ctx_free(ctx); 220 226
+3 -3
tools/perf/tests/pmu-events.c
··· 986 986 */ 987 987 i = 1; 988 988 hashmap__for_each_entry(ctx->ids, cur, bkt) 989 - expr__add_id_val(ctx, strdup(cur->key), i++); 989 + expr__add_id_val(ctx, strdup(cur->pkey), i++); 990 990 991 991 hashmap__for_each_entry(ctx->ids, cur, bkt) { 992 - if (check_parse_fake(cur->key)) { 992 + if (check_parse_fake(cur->pkey)) { 993 993 pr_err("check_parse_fake failed\n"); 994 994 goto out; 995 995 } ··· 1003 1003 */ 1004 1004 i = 1024; 1005 1005 hashmap__for_each_entry(ctx->ids, cur, bkt) 1006 - expr__add_id_val(ctx, strdup(cur->key), i--); 1006 + expr__add_id_val(ctx, strdup(cur->pkey), i--); 1007 1007 if (expr__parse(&result, ctx, str)) { 1008 1008 pr_err("expr__parse failed\n"); 1009 1009 ret = -1;
+5 -6
tools/perf/util/bpf-loader.c
··· 318 318 return; 319 319 320 320 hashmap__for_each_entry(bpf_program_hash, cur, bkt) 321 - clear_prog_priv(cur->key, cur->value); 321 + clear_prog_priv(cur->pkey, cur->pvalue); 322 322 323 323 hashmap__free(bpf_program_hash); 324 324 bpf_program_hash = NULL; ··· 339 339 bpf_map_hash_free(); 340 340 } 341 341 342 - static size_t ptr_hash(const void *__key, void *ctx __maybe_unused) 342 + static size_t ptr_hash(const long __key, void *ctx __maybe_unused) 343 343 { 344 - return (size_t) __key; 344 + return __key; 345 345 } 346 346 347 - static bool ptr_equal(const void *key1, const void *key2, 348 - void *ctx __maybe_unused) 347 + static bool ptr_equal(long key1, long key2, void *ctx __maybe_unused) 349 348 { 350 349 return key1 == key2; 351 350 } ··· 1184 1185 return; 1185 1186 1186 1187 hashmap__for_each_entry(bpf_map_hash, cur, bkt) 1187 - bpf_map_priv__clear(cur->key, cur->value); 1188 + bpf_map_priv__clear(cur->pkey, cur->pvalue); 1188 1189 1189 1190 hashmap__free(bpf_map_hash); 1190 1191 bpf_map_hash = NULL;
+1 -1
tools/perf/util/evsel.c
··· 3123 3123 3124 3124 if (evsel->per_pkg_mask) { 3125 3125 hashmap__for_each_entry(evsel->per_pkg_mask, cur, bkt) 3126 - free((char *)cur->key); 3126 + free((void *)cur->pkey); 3127 3127 3128 3128 hashmap__clear(evsel->per_pkg_mask); 3129 3129 }
+15 -21
tools/perf/util/expr.c
··· 46 46 } kind; 47 47 }; 48 48 49 - static size_t key_hash(const void *key, void *ctx __maybe_unused) 49 + static size_t key_hash(long key, void *ctx __maybe_unused) 50 50 { 51 51 const char *str = (const char *)key; 52 52 size_t hash = 0; ··· 59 59 return hash; 60 60 } 61 61 62 - static bool key_equal(const void *key1, const void *key2, 63 - void *ctx __maybe_unused) 62 + static bool key_equal(long key1, long key2, void *ctx __maybe_unused) 64 63 { 65 64 return !strcmp((const char *)key1, (const char *)key2); 66 65 } ··· 83 84 return; 84 85 85 86 hashmap__for_each_entry(ids, cur, bkt) { 86 - free((char *)cur->key); 87 - free(cur->value); 87 + free((void *)cur->pkey); 88 + free((void *)cur->pvalue); 88 89 } 89 90 90 91 hashmap__free(ids); ··· 96 97 char *old_key = NULL; 97 98 int ret; 98 99 99 - ret = hashmap__set(ids, id, data_ptr, 100 - (const void **)&old_key, (void **)&old_data); 100 + ret = hashmap__set(ids, id, data_ptr, &old_key, &old_data); 101 101 if (ret) 102 102 free(data_ptr); 103 103 free(old_key); ··· 125 127 ids2 = tmp; 126 128 } 127 129 hashmap__for_each_entry(ids2, cur, bkt) { 128 - ret = hashmap__set(ids1, cur->key, cur->value, 129 - (const void **)&old_key, (void **)&old_data); 130 + ret = hashmap__set(ids1, cur->key, cur->value, &old_key, &old_data); 130 131 free(old_key); 131 132 free(old_data); 132 133 ··· 166 169 data_ptr->val.source_count = source_count; 167 170 data_ptr->kind = EXPR_ID_DATA__VALUE; 168 171 169 - ret = hashmap__set(ctx->ids, id, data_ptr, 170 - (const void **)&old_key, (void **)&old_data); 172 + ret = hashmap__set(ctx->ids, id, data_ptr, &old_key, &old_data); 171 173 if (ret) 172 174 free(data_ptr); 173 175 free(old_key); ··· 201 205 data_ptr->ref.metric_expr = ref->metric_expr; 202 206 data_ptr->kind = EXPR_ID_DATA__REF; 203 207 204 - ret = hashmap__set(ctx->ids, name, data_ptr, 205 - (const void **)&old_key, (void **)&old_data); 208 + ret = hashmap__set(ctx->ids, name, data_ptr, &old_key, &old_data); 206 209 if (ret) 207 
210 free(data_ptr); 208 211 ··· 216 221 int expr__get_id(struct expr_parse_ctx *ctx, const char *id, 217 222 struct expr_id_data **data) 218 223 { 219 - return hashmap__find(ctx->ids, id, (void **)data) ? 0 : -1; 224 + return hashmap__find(ctx->ids, id, data) ? 0 : -1; 220 225 } 221 226 222 227 bool expr__subset_of_ids(struct expr_parse_ctx *haystack, ··· 227 232 struct expr_id_data *data; 228 233 229 234 hashmap__for_each_entry(needles->ids, cur, bkt) { 230 - if (expr__get_id(haystack, cur->key, &data)) 235 + if (expr__get_id(haystack, cur->pkey, &data)) 231 236 return false; 232 237 } 233 238 return true; ··· 277 282 struct expr_id_data *old_val = NULL; 278 283 char *old_key = NULL; 279 284 280 - hashmap__delete(ctx->ids, id, 281 - (const void **)&old_key, (void **)&old_val); 285 + hashmap__delete(ctx->ids, id, &old_key, &old_val); 282 286 free(old_key); 283 287 free(old_val); 284 288 } ··· 308 314 size_t bkt; 309 315 310 316 hashmap__for_each_entry(ctx->ids, cur, bkt) { 311 - free((char *)cur->key); 312 - free(cur->value); 317 + free((void *)cur->pkey); 318 + free(cur->pvalue); 313 319 } 314 320 hashmap__clear(ctx->ids); 315 321 } ··· 324 330 325 331 free(ctx->sctx.user_requested_cpu_list); 326 332 hashmap__for_each_entry(ctx->ids, cur, bkt) { 327 - free((char *)cur->key); 328 - free(cur->value); 333 + free((void *)cur->pkey); 334 + free(cur->pvalue); 329 335 } 330 336 hashmap__free(ctx->ids); 331 337 free(ctx);
+9 -9
tools/perf/util/hashmap.c
··· 128 128 } 129 129 130 130 static bool hashmap_find_entry(const struct hashmap *map, 131 - const void *key, size_t hash, 131 + const long key, size_t hash, 132 132 struct hashmap_entry ***pprev, 133 133 struct hashmap_entry **entry) 134 134 { ··· 151 151 return false; 152 152 } 153 153 154 - int hashmap__insert(struct hashmap *map, const void *key, void *value, 155 - enum hashmap_insert_strategy strategy, 156 - const void **old_key, void **old_value) 154 + int hashmap_insert(struct hashmap *map, long key, long value, 155 + enum hashmap_insert_strategy strategy, 156 + long *old_key, long *old_value) 157 157 { 158 158 struct hashmap_entry *entry; 159 159 size_t h; 160 160 int err; 161 161 162 162 if (old_key) 163 - *old_key = NULL; 163 + *old_key = 0; 164 164 if (old_value) 165 - *old_value = NULL; 165 + *old_value = 0; 166 166 167 167 h = hash_bits(map->hash_fn(key, map->ctx), map->cap_bits); 168 168 if (strategy != HASHMAP_APPEND && ··· 203 203 return 0; 204 204 } 205 205 206 - bool hashmap__find(const struct hashmap *map, const void *key, void **value) 206 + bool hashmap_find(const struct hashmap *map, long key, long *value) 207 207 { 208 208 struct hashmap_entry *entry; 209 209 size_t h; ··· 217 217 return true; 218 218 } 219 219 220 - bool hashmap__delete(struct hashmap *map, const void *key, 221 - const void **old_key, void **old_value) 220 + bool hashmap_delete(struct hashmap *map, long key, 221 + long *old_key, long *old_value) 222 222 { 223 223 struct hashmap_entry **pprev, *entry; 224 224 size_t h;
+57 -34
tools/perf/util/hashmap.h
··· 40 40 return h; 41 41 } 42 42 43 - typedef size_t (*hashmap_hash_fn)(const void *key, void *ctx); 44 - typedef bool (*hashmap_equal_fn)(const void *key1, const void *key2, void *ctx); 43 + typedef size_t (*hashmap_hash_fn)(long key, void *ctx); 44 + typedef bool (*hashmap_equal_fn)(long key1, long key2, void *ctx); 45 45 46 + /* 47 + * Hashmap interface is polymorphic, keys and values could be either 48 + * long-sized integers or pointers, this is achieved as follows: 49 + * - interface functions that operate on keys and values are hidden 50 + * behind auxiliary macros, e.g. hashmap_insert <-> hashmap__insert; 51 + * - these auxiliary macros cast the key and value parameters as 52 + * long or long *, so the user does not have to specify the casts explicitly; 53 + * - for pointer parameters (e.g. old_key) the size of the pointed 54 + * type is verified by hashmap_cast_ptr using _Static_assert; 55 + * - when iterating using hashmap__for_each_* forms 56 + * hashmap_entry->key should be used for integer keys and 57 + * hashmap_entry->pkey should be used for pointer keys, 58 + * same goes for values. 59 + */ 46 60 struct hashmap_entry { 47 - const void *key; 48 - void *value; 61 + union { 62 + long key; 63 + const void *pkey; 64 + }; 65 + union { 66 + long value; 67 + void *pvalue; 68 + }; 49 69 struct hashmap_entry *next; 50 70 }; 51 71 ··· 122 102 HASHMAP_APPEND, 123 103 }; 124 104 105 + #define hashmap_cast_ptr(p) ({ \ 106 + _Static_assert((__builtin_constant_p((p)) ? (p) == NULL : 0) || \ 107 + sizeof(*(p)) == sizeof(long), \ 108 + #p " pointee should be a long-sized integer or a pointer"); \ 109 + (long *)(p); \ 110 + }) 111 + 125 112 /* 126 113 * hashmap__insert() adds key/value entry w/ various semantics, depending on 127 114 * provided strategy value. If a given key/value pair replaced already ··· 136 109 * through old_key and old_value to allow calling code do proper memory 137 110 * management. 
138 111 */ 139 - int hashmap__insert(struct hashmap *map, const void *key, void *value, 140 - enum hashmap_insert_strategy strategy, 141 - const void **old_key, void **old_value); 112 + int hashmap_insert(struct hashmap *map, long key, long value, 113 + enum hashmap_insert_strategy strategy, 114 + long *old_key, long *old_value); 142 115 143 - static inline int hashmap__add(struct hashmap *map, 144 - const void *key, void *value) 145 - { 146 - return hashmap__insert(map, key, value, HASHMAP_ADD, NULL, NULL); 147 - } 116 + #define hashmap__insert(map, key, value, strategy, old_key, old_value) \ 117 + hashmap_insert((map), (long)(key), (long)(value), (strategy), \ 118 + hashmap_cast_ptr(old_key), \ 119 + hashmap_cast_ptr(old_value)) 148 120 149 - static inline int hashmap__set(struct hashmap *map, 150 - const void *key, void *value, 151 - const void **old_key, void **old_value) 152 - { 153 - return hashmap__insert(map, key, value, HASHMAP_SET, 154 - old_key, old_value); 155 - } 121 + #define hashmap__add(map, key, value) \ 122 + hashmap__insert((map), (key), (value), HASHMAP_ADD, NULL, NULL) 156 123 157 - static inline int hashmap__update(struct hashmap *map, 158 - const void *key, void *value, 159 - const void **old_key, void **old_value) 160 - { 161 - return hashmap__insert(map, key, value, HASHMAP_UPDATE, 162 - old_key, old_value); 163 - } 124 + #define hashmap__set(map, key, value, old_key, old_value) \ 125 + hashmap__insert((map), (key), (value), HASHMAP_SET, (old_key), (old_value)) 164 126 165 - static inline int hashmap__append(struct hashmap *map, 166 - const void *key, void *value) 167 - { 168 - return hashmap__insert(map, key, value, HASHMAP_APPEND, NULL, NULL); 169 - } 127 + #define hashmap__update(map, key, value, old_key, old_value) \ 128 + hashmap__insert((map), (key), (value), HASHMAP_UPDATE, (old_key), (old_value)) 170 129 171 - bool hashmap__delete(struct hashmap *map, const void *key, 172 - const void **old_key, void **old_value); 130 + #define 
hashmap__append(map, key, value) \ 131 + hashmap__insert((map), (key), (value), HASHMAP_APPEND, NULL, NULL) 173 132 174 - bool hashmap__find(const struct hashmap *map, const void *key, void **value); 133 + bool hashmap_delete(struct hashmap *map, long key, long *old_key, long *old_value); 134 + 135 + #define hashmap__delete(map, key, old_key, old_value) \ 136 + hashmap_delete((map), (long)(key), \ 137 + hashmap_cast_ptr(old_key), \ 138 + hashmap_cast_ptr(old_value)) 139 + 140 + bool hashmap_find(const struct hashmap *map, long key, long *value); 141 + 142 + #define hashmap__find(map, key, value) \ 143 + hashmap_find((map), (long)(key), hashmap_cast_ptr(value)) 175 144 176 145 /* 177 146 * hashmap__for_each_entry - iterate over all entries in hashmap
+5 -5
tools/perf/util/metricgroup.c
··· 288 288 * combined or shared groups, this metric may not care 289 289 * about this event. 290 290 */ 291 - if (hashmap__find(ids, metric_id, (void **)&val_ptr)) { 291 + if (hashmap__find(ids, metric_id, &val_ptr)) { 292 292 metric_events[matched_events++] = ev; 293 293 294 294 if (matched_events >= ids_size) ··· 764 764 #define RETURN_IF_NON_ZERO(x) do { if (x) return x; } while (0) 765 765 766 766 hashmap__for_each_entry(ctx->ids, cur, bkt) { 767 - const char *sep, *rsep, *id = cur->key; 767 + const char *sep, *rsep, *id = cur->pkey; 768 768 enum perf_tool_event ev; 769 769 770 770 pr_debug("found event %s\n", id); ··· 945 945 hashmap__for_each_entry(root_metric->pctx->ids, cur, bkt) { 946 946 struct pmu_event pe; 947 947 948 - if (metricgroup__find_metric(cur->key, table, &pe)) { 948 + if (metricgroup__find_metric(cur->pkey, table, &pe)) { 949 949 pending = realloc(pending, 950 950 (pending_cnt + 1) * sizeof(struct to_resolve)); 951 951 if (!pending) 952 952 return -ENOMEM; 953 953 954 954 memcpy(&pending[pending_cnt].pe, &pe, sizeof(pe)); 955 - pending[pending_cnt].key = cur->key; 955 + pending[pending_cnt].key = cur->pkey; 956 956 pending_cnt++; 957 957 } 958 958 } ··· 1433 1433 list_for_each_entry(m, metric_list, nd) { 1434 1434 if (m->has_constraint && !m->modifier) { 1435 1435 hashmap__for_each_entry(m->pctx->ids, cur, bkt) { 1436 - dup = strdup(cur->key); 1436 + dup = strdup(cur->pkey); 1437 1437 if (!dup) { 1438 1438 ret = -ENOMEM; 1439 1439 goto err_out;
+1 -1
tools/perf/util/stat-shadow.c
··· 398 398 399 399 i = 0; 400 400 hashmap__for_each_entry(ctx->ids, cur, bkt) { 401 - const char *metric_name = (const char *)cur->key; 401 + const char *metric_name = cur->pkey; 402 402 403 403 found = false; 404 404 if (leader) {
+4 -5
tools/perf/util/stat.c
··· 278 278 } 279 279 } 280 280 281 - static size_t pkg_id_hash(const void *__key, void *ctx __maybe_unused) 281 + static size_t pkg_id_hash(long __key, void *ctx __maybe_unused) 282 282 { 283 283 uint64_t *key = (uint64_t *) __key; 284 284 285 285 return *key & 0xffffffff; 286 286 } 287 287 288 - static bool pkg_id_equal(const void *__key1, const void *__key2, 289 - void *ctx __maybe_unused) 288 + static bool pkg_id_equal(long __key1, long __key2, void *ctx __maybe_unused) 290 289 { 291 290 uint64_t *key1 = (uint64_t *) __key1; 292 291 uint64_t *key2 = (uint64_t *) __key2; ··· 346 347 return -ENOMEM; 347 348 348 349 *key = (uint64_t)d << 32 | s; 349 - if (hashmap__find(mask, (void *)key, NULL)) { 350 + if (hashmap__find(mask, key, NULL)) { 350 351 *skip = true; 351 352 free(key); 352 353 } else 353 - ret = hashmap__add(mask, (void *)key, (void *)1); 354 + ret = hashmap__add(mask, key, 1); 354 355 355 356 return ret; 356 357 }
+4 -3
tools/testing/selftests/bpf/Makefile
··· 182 182 $(OUTPUT)/liburandom_read.so: urandom_read_lib1.c urandom_read_lib2.c 183 183 $(call msg,LIB,,$@) 184 184 $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS) \ 185 - -fuse-ld=$(LLD) -Wl,-znoseparate-code -fPIC -shared -o $@ 185 + -fuse-ld=$(LLD) -Wl,-znoseparate-code -Wl,--build-id=sha1 \ 186 + -fPIC -shared -o $@ 186 187 187 188 $(OUTPUT)/urandom_read: urandom_read.c urandom_read_aux.c $(OUTPUT)/liburandom_read.so 188 189 $(call msg,BINARY,,$@) 189 190 $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^) \ 190 191 liburandom_read.so $(LDLIBS) \ 191 - -fuse-ld=$(LLD) -Wl,-znoseparate-code \ 192 - -Wl,-rpath=. -Wl,--build-id=sha1 -o $@ 192 + -fuse-ld=$(LLD) -Wl,-znoseparate-code -Wl,--build-id=sha1 \ 193 + -Wl,-rpath=. -o $@ 193 194 194 195 $(OUTPUT)/sign-file: ../../../../scripts/sign-file.c 195 196 $(call msg,SIGN-FILE,,$@)
+19
tools/testing/selftests/bpf/bpf_util.h
··· 20 20 return possible_cpus; 21 21 } 22 22 23 + /* Copy up to sz - 1 bytes from zero-terminated src string and ensure that dst 24 + * is a zero-terminated string no matter what (unless sz == 0, in which case 25 + * it's a no-op). It's conceptually close to FreeBSD's strlcpy(), but differs 26 + * in what is returned. Given this is an internal helper, it's trivial to extend 27 + * this, when necessary. Use this instead of strncpy inside libbpf source code. 28 + */ 29 + static inline void bpf_strlcpy(char *dst, const char *src, size_t sz) 30 + { 31 + size_t i; 32 + 33 + if (sz == 0) 34 + return; 35 + 36 + sz--; 37 + for (i = 0; i < sz && src[i]; i++) 38 + dst[i] = src[i]; 39 + dst[i] = '\0'; 40 + } 41 + 23 42 #define __bpf_percpu_val_align __attribute__((__aligned__(8))) 24 43 25 44 #define BPF_DECLARE_PERCPU(type, name) \
+2 -1
tools/testing/selftests/bpf/cgroup_helpers.c
··· 13 13 #include <ftw.h> 14 14 15 15 #include "cgroup_helpers.h" 16 + #include "bpf_util.h" 16 17 17 18 /* 18 19 * To avoid relying on the system setup, when setup_cgroup_env is called ··· 78 77 enable[len] = 0; 79 78 close(fd); 80 79 } else { 81 - strncpy(enable, controllers, sizeof(enable)); 80 + bpf_strlcpy(enable, controllers, sizeof(enable)); 82 81 } 83 82 84 83 snprintf(path, sizeof(path), "%s/cgroup.subtree_control", cgroup_path);
+24 -14
tools/testing/selftests/bpf/prog_tests/align.c
··· 2 2 #include <test_progs.h> 3 3 4 4 #define MAX_INSNS 512 5 - #define MAX_MATCHES 16 5 + #define MAX_MATCHES 24 6 6 7 7 struct bpf_reg_match { 8 8 unsigned int line; ··· 267 267 */ 268 268 BPF_MOV64_REG(BPF_REG_5, BPF_REG_2), 269 269 BPF_ALU64_REG(BPF_ADD, BPF_REG_5, BPF_REG_6), 270 + BPF_MOV64_REG(BPF_REG_4, BPF_REG_5), 270 271 BPF_ALU64_IMM(BPF_ADD, BPF_REG_5, 14), 271 272 BPF_MOV64_REG(BPF_REG_4, BPF_REG_5), 272 273 BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, 4), ··· 281 280 BPF_MOV64_REG(BPF_REG_5, BPF_REG_2), 282 281 BPF_ALU64_IMM(BPF_ADD, BPF_REG_5, 14), 283 282 BPF_ALU64_REG(BPF_ADD, BPF_REG_5, BPF_REG_6), 283 + BPF_MOV64_REG(BPF_REG_4, BPF_REG_5), 284 284 BPF_ALU64_IMM(BPF_ADD, BPF_REG_5, 4), 285 285 BPF_ALU64_REG(BPF_ADD, BPF_REG_5, BPF_REG_6), 286 286 BPF_MOV64_REG(BPF_REG_4, BPF_REG_5), ··· 313 311 {15, "R4=pkt(id=1,off=18,r=18,umax=1020,var_off=(0x0; 0x3fc))"}, 314 312 {15, "R5=pkt(id=1,off=14,r=18,umax=1020,var_off=(0x0; 0x3fc))"}, 315 313 /* Variable offset is added to R5 packet pointer, 316 - * resulting in auxiliary alignment of 4. 314 + * resulting in auxiliary alignment of 4. To avoid BPF 315 + * verifier's precision backtracking logging 316 + * interfering we also have a no-op R4 = R5 317 + * instruction to validate R5 state. We also check 318 + * that R4 is what it should be in such case. 317 319 */ 318 - {17, "R5_w=pkt(id=2,off=0,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 320 + {18, "R4_w=pkt(id=2,off=0,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 321 + {18, "R5_w=pkt(id=2,off=0,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 319 322 /* Constant offset is added to R5, resulting in 320 323 * reg->off of 14. 321 324 */ 322 - {18, "R5_w=pkt(id=2,off=14,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 325 + {19, "R5_w=pkt(id=2,off=14,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 323 326 /* At the time the word size load is performed from R5, 324 327 * its total fixed offset is NET_IP_ALIGN + reg->off 325 328 * (14) which is 16. 
Then the variable offset is 4-byte 326 329 * aligned, so the total offset is 4-byte aligned and 327 330 * meets the load's requirements. 328 331 */ 329 - {23, "R4=pkt(id=2,off=18,r=18,umax=1020,var_off=(0x0; 0x3fc))"}, 330 - {23, "R5=pkt(id=2,off=14,r=18,umax=1020,var_off=(0x0; 0x3fc))"}, 332 + {24, "R4=pkt(id=2,off=18,r=18,umax=1020,var_off=(0x0; 0x3fc))"}, 333 + {24, "R5=pkt(id=2,off=14,r=18,umax=1020,var_off=(0x0; 0x3fc))"}, 331 334 /* Constant offset is added to R5 packet pointer, 332 335 * resulting in reg->off value of 14. 333 336 */ 334 - {25, "R5_w=pkt(off=14,r=8"}, 337 + {26, "R5_w=pkt(off=14,r=8"}, 335 338 /* Variable offset is added to R5, resulting in a 336 - * variable offset of (4n). 339 + * variable offset of (4n). See comment for insn #18 340 + * for R4 = R5 trick. 337 341 */ 338 - {26, "R5_w=pkt(id=3,off=14,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 342 + {28, "R4_w=pkt(id=3,off=14,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 343 + {28, "R5_w=pkt(id=3,off=14,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 339 344 /* Constant is added to R5 again, setting reg->off to 18. */ 340 - {27, "R5_w=pkt(id=3,off=18,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 345 + {29, "R5_w=pkt(id=3,off=18,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 341 346 /* And once more we add a variable; resulting var_off 342 347 * is still (4n), fixed offset is not changed. 343 348 * Also, we create a new reg->id. 344 349 */ 345 - {28, "R5_w=pkt(id=4,off=18,r=0,umax=2040,var_off=(0x0; 0x7fc)"}, 350 + {31, "R4_w=pkt(id=4,off=18,r=0,umax=2040,var_off=(0x0; 0x7fc)"}, 351 + {31, "R5_w=pkt(id=4,off=18,r=0,umax=2040,var_off=(0x0; 0x7fc)"}, 346 352 /* At the time the word size load is performed from R5, 347 353 * its total fixed offset is NET_IP_ALIGN + reg->off (18) 348 354 * which is 20. Then the variable offset is (4n), so 349 355 * the total offset is 4-byte aligned and meets the 350 356 * load's requirements. 
351 357 */ 352 - {33, "R4=pkt(id=4,off=22,r=22,umax=2040,var_off=(0x0; 0x7fc)"}, 353 - {33, "R5=pkt(id=4,off=18,r=22,umax=2040,var_off=(0x0; 0x7fc)"}, 358 + {35, "R4=pkt(id=4,off=22,r=22,umax=2040,var_off=(0x0; 0x7fc)"}, 359 + {35, "R5=pkt(id=4,off=18,r=22,umax=2040,var_off=(0x0; 0x7fc)"}, 354 360 }, 355 361 }, 356 362 { ··· 691 681 if (!test__start_subtest(test->descr)) 692 682 continue; 693 683 694 - CHECK_FAIL(do_test_single(test)); 684 + ASSERT_OK(do_test_single(test), test->descr); 695 685 } 696 686 }
+259 -5
tools/testing/selftests/bpf/prog_tests/btf.c
··· 7133 7133 BTF_ENUM_ENC(NAME_NTH(4), 456), 7134 7134 /* [4] fwd enum 'e2' after full enum */ 7135 7135 BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 0), 4), 7136 - /* [5] incompatible fwd enum with different size */ 7136 + /* [5] fwd enum with different size, size does not matter for fwd */ 7137 7137 BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 0), 1), 7138 7138 /* [6] incompatible full enum with different value */ 7139 7139 BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4), ··· 7150 7150 /* [2] full enum 'e2' */ 7151 7151 BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4), 7152 7152 BTF_ENUM_ENC(NAME_NTH(4), 456), 7153 - /* [3] incompatible fwd enum with different size */ 7154 - BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 0), 1), 7155 - /* [4] incompatible full enum with different value */ 7153 + /* [3] incompatible full enum with different value */ 7156 7154 BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4), 7157 7155 BTF_ENUM_ENC(NAME_NTH(2), 321), 7158 7156 BTF_END_RAW, ··· 7609 7611 BTF_STR_SEC("\0e1\0e1_val"), 7610 7612 }, 7611 7613 }, 7612 - 7614 + { 7615 + .descr = "dedup: enum of different size: no dedup", 7616 + .input = { 7617 + .raw_types = { 7618 + /* [1] enum 'e1' */ 7619 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4), 7620 + BTF_ENUM_ENC(NAME_NTH(2), 1), 7621 + /* [2] enum 'e1' */ 7622 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 2), 7623 + BTF_ENUM_ENC(NAME_NTH(2), 1), 7624 + BTF_END_RAW, 7625 + }, 7626 + BTF_STR_SEC("\0e1\0e1_val"), 7627 + }, 7628 + .expect = { 7629 + .raw_types = { 7630 + /* [1] enum 'e1' */ 7631 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4), 7632 + BTF_ENUM_ENC(NAME_NTH(2), 1), 7633 + /* [2] enum 'e1' */ 7634 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 2), 7635 + BTF_ENUM_ENC(NAME_NTH(2), 1), 7636 + BTF_END_RAW, 7637 + }, 7638 + BTF_STR_SEC("\0e1\0e1_val"), 7639 + }, 7640 + }, 
7641 + { 7642 + .descr = "dedup: enum fwd to enum64", 7643 + .input = { 7644 + .raw_types = { 7645 + /* [1] enum64 'e1' */ 7646 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 1), 8), 7647 + BTF_ENUM64_ENC(NAME_NTH(2), 1, 0), 7648 + /* [2] enum 'e1' fwd */ 7649 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 0), 4), 7650 + /* [3] typedef enum 'e1' td */ 7651 + BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_TYPEDEF, 0, 0), 2), 7652 + BTF_END_RAW, 7653 + }, 7654 + BTF_STR_SEC("\0e1\0e1_val\0td"), 7655 + }, 7656 + .expect = { 7657 + .raw_types = { 7658 + /* [1] enum64 'e1' */ 7659 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 1), 8), 7660 + BTF_ENUM64_ENC(NAME_NTH(2), 1, 0), 7661 + /* [2] typedef enum 'e1' td */ 7662 + BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_TYPEDEF, 0, 0), 1), 7663 + BTF_END_RAW, 7664 + }, 7665 + BTF_STR_SEC("\0e1\0e1_val\0td"), 7666 + }, 7667 + }, 7668 + { 7669 + .descr = "dedup: enum64 fwd to enum", 7670 + .input = { 7671 + .raw_types = { 7672 + /* [1] enum 'e1' */ 7673 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4), 7674 + BTF_ENUM_ENC(NAME_NTH(2), 1), 7675 + /* [2] enum64 'e1' fwd */ 7676 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 0), 8), 7677 + /* [3] typedef enum 'e1' td */ 7678 + BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_TYPEDEF, 0, 0), 2), 7679 + BTF_END_RAW, 7680 + }, 7681 + BTF_STR_SEC("\0e1\0e1_val\0td"), 7682 + }, 7683 + .expect = { 7684 + .raw_types = { 7685 + /* [1] enum 'e1' */ 7686 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4), 7687 + BTF_ENUM_ENC(NAME_NTH(2), 1), 7688 + /* [2] typedef enum 'e1' td */ 7689 + BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_TYPEDEF, 0, 0), 1), 7690 + BTF_END_RAW, 7691 + }, 7692 + BTF_STR_SEC("\0e1\0e1_val\0td"), 7693 + }, 7694 + }, 7695 + { 7696 + .descr = "dedup: standalone fwd declaration struct", 7697 + /* 7698 + * Verify that CU1:foo and CU2:foo would be unified and that 7699 + * 
typedef/ptr would be updated to point to CU1:foo. 7700 + * 7701 + * // CU 1: 7702 + * struct foo { int x; }; 7703 + * 7704 + * // CU 2: 7705 + * struct foo; 7706 + * typedef struct foo *foo_ptr; 7707 + */ 7708 + .input = { 7709 + .raw_types = { 7710 + /* CU 1 */ 7711 + BTF_STRUCT_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7712 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7713 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7714 + /* CU 2 */ 7715 + BTF_FWD_ENC(NAME_NTH(1), 0), /* [3] */ 7716 + BTF_PTR_ENC(3), /* [4] */ 7717 + BTF_TYPEDEF_ENC(NAME_NTH(3), 4), /* [5] */ 7718 + BTF_END_RAW, 7719 + }, 7720 + BTF_STR_SEC("\0foo\0x\0foo_ptr"), 7721 + }, 7722 + .expect = { 7723 + .raw_types = { 7724 + BTF_STRUCT_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7725 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7726 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7727 + BTF_PTR_ENC(1), /* [3] */ 7728 + BTF_TYPEDEF_ENC(NAME_NTH(3), 3), /* [4] */ 7729 + BTF_END_RAW, 7730 + }, 7731 + BTF_STR_SEC("\0foo\0x\0foo_ptr"), 7732 + }, 7733 + }, 7734 + { 7735 + .descr = "dedup: standalone fwd declaration union", 7736 + /* 7737 + * Verify that CU1:foo and CU2:foo would be unified and that 7738 + * typedef/ptr would be updated to point to CU1:foo. 7739 + * Same as "dedup: standalone fwd declaration struct" but for unions. 
7740 + * 7741 + * // CU 1: 7742 + * union foo { int x; }; 7743 + * 7744 + * // CU 2: 7745 + * union foo; 7746 + * typedef union foo *foo_ptr; 7747 + */ 7748 + .input = { 7749 + .raw_types = { 7750 + /* CU 1 */ 7751 + BTF_UNION_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7752 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7753 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7754 + /* CU 2 */ 7755 + BTF_FWD_ENC(NAME_TBD, 1), /* [3] */ 7756 + BTF_PTR_ENC(3), /* [4] */ 7757 + BTF_TYPEDEF_ENC(NAME_NTH(3), 4), /* [5] */ 7758 + BTF_END_RAW, 7759 + }, 7760 + BTF_STR_SEC("\0foo\0x\0foo_ptr"), 7761 + }, 7762 + .expect = { 7763 + .raw_types = { 7764 + BTF_UNION_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7765 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7766 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7767 + BTF_PTR_ENC(1), /* [3] */ 7768 + BTF_TYPEDEF_ENC(NAME_NTH(3), 3), /* [4] */ 7769 + BTF_END_RAW, 7770 + }, 7771 + BTF_STR_SEC("\0foo\0x\0foo_ptr"), 7772 + }, 7773 + }, 7774 + { 7775 + .descr = "dedup: standalone fwd declaration wrong kind", 7776 + /* 7777 + * Negative test for btf_dedup_resolve_fwds: 7778 + * - CU1:foo is a struct, C2:foo is a union, thus CU2:foo is not deduped; 7779 + * - typedef/ptr should remain unchanged as well. 
7780 + * 7781 + * // CU 1: 7782 + * struct foo { int x; }; 7783 + * 7784 + * // CU 2: 7785 + * union foo; 7786 + * typedef union foo *foo_ptr; 7787 + */ 7788 + .input = { 7789 + .raw_types = { 7790 + /* CU 1 */ 7791 + BTF_STRUCT_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7792 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7793 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7794 + /* CU 2 */ 7795 + BTF_FWD_ENC(NAME_NTH(3), 1), /* [3] */ 7796 + BTF_PTR_ENC(3), /* [4] */ 7797 + BTF_TYPEDEF_ENC(NAME_NTH(3), 4), /* [5] */ 7798 + BTF_END_RAW, 7799 + }, 7800 + BTF_STR_SEC("\0foo\0x\0foo_ptr"), 7801 + }, 7802 + .expect = { 7803 + .raw_types = { 7804 + /* CU 1 */ 7805 + BTF_STRUCT_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7806 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7807 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7808 + /* CU 2 */ 7809 + BTF_FWD_ENC(NAME_NTH(3), 1), /* [3] */ 7810 + BTF_PTR_ENC(3), /* [4] */ 7811 + BTF_TYPEDEF_ENC(NAME_NTH(3), 4), /* [5] */ 7812 + BTF_END_RAW, 7813 + }, 7814 + BTF_STR_SEC("\0foo\0x\0foo_ptr"), 7815 + }, 7816 + }, 7817 + { 7818 + .descr = "dedup: standalone fwd declaration name conflict", 7819 + /* 7820 + * Negative test for btf_dedup_resolve_fwds: 7821 + * - two candidates for CU2:foo dedup, thus it is unchanged; 7822 + * - typedef/ptr should remain unchanged as well. 
7823 + * 7824 + * // CU 1: 7825 + * struct foo { int x; }; 7826 + * 7827 + * // CU 2: 7828 + * struct foo; 7829 + * typedef struct foo *foo_ptr; 7830 + * 7831 + * // CU 3: 7832 + * struct foo { int x; int y; }; 7833 + */ 7834 + .input = { 7835 + .raw_types = { 7836 + /* CU 1 */ 7837 + BTF_STRUCT_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7838 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7839 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7840 + /* CU 2 */ 7841 + BTF_FWD_ENC(NAME_NTH(1), 0), /* [3] */ 7842 + BTF_PTR_ENC(3), /* [4] */ 7843 + BTF_TYPEDEF_ENC(NAME_NTH(4), 4), /* [5] */ 7844 + /* CU 3 */ 7845 + BTF_STRUCT_ENC(NAME_NTH(1), 2, 8), /* [6] */ 7846 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7847 + BTF_MEMBER_ENC(NAME_NTH(3), 2, 0), 7848 + BTF_END_RAW, 7849 + }, 7850 + BTF_STR_SEC("\0foo\0x\0y\0foo_ptr"), 7851 + }, 7852 + .expect = { 7853 + .raw_types = { 7854 + /* CU 1 */ 7855 + BTF_STRUCT_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7856 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7857 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7858 + /* CU 2 */ 7859 + BTF_FWD_ENC(NAME_NTH(1), 0), /* [3] */ 7860 + BTF_PTR_ENC(3), /* [4] */ 7861 + BTF_TYPEDEF_ENC(NAME_NTH(4), 4), /* [5] */ 7862 + /* CU 3 */ 7863 + BTF_STRUCT_ENC(NAME_NTH(1), 2, 8), /* [6] */ 7864 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7865 + BTF_MEMBER_ENC(NAME_NTH(3), 2, 0), 7866 + BTF_END_RAW, 7867 + }, 7868 + BTF_STR_SEC("\0foo\0x\0y\0foo_ptr"), 7869 + }, 7870 + }, 7613 7871 }; 7614 7872 7615 7873 static int btf_type_size(const struct btf_type *t)
+30 -15
tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
··· 143 143 btf__add_struct(btf1, "s2", 4); /* [5] struct s2 { */ 144 144 btf__add_field(btf1, "f1", 1, 0, 0); /* int f1; */ 145 145 /* } */ 146 + /* keep this not a part of type the graph to test btf_dedup_resolve_fwds */ 147 + btf__add_struct(btf1, "s3", 4); /* [6] struct s3 { */ 148 + btf__add_field(btf1, "f1", 1, 0, 0); /* int f1; */ 149 + /* } */ 146 150 147 151 VALIDATE_RAW_BTF( 148 152 btf1, ··· 157 153 "\t'f1' type_id=2 bits_offset=0\n" 158 154 "\t'f2' type_id=3 bits_offset=64", 159 155 "[5] STRUCT 's2' size=4 vlen=1\n" 156 + "\t'f1' type_id=1 bits_offset=0", 157 + "[6] STRUCT 's3' size=4 vlen=1\n" 160 158 "\t'f1' type_id=1 bits_offset=0"); 161 159 162 160 btf2 = btf__new_empty_split(btf1); 163 161 if (!ASSERT_OK_PTR(btf2, "empty_split_btf")) 164 162 goto cleanup; 165 163 166 - btf__add_int(btf2, "int", 4, BTF_INT_SIGNED); /* [6] int */ 167 - btf__add_ptr(btf2, 10); /* [7] ptr to struct s1 */ 168 - btf__add_fwd(btf2, "s2", BTF_FWD_STRUCT); /* [8] fwd for struct s2 */ 169 - btf__add_ptr(btf2, 8); /* [9] ptr to fwd struct s2 */ 170 - btf__add_struct(btf2, "s1", 16); /* [10] struct s1 { */ 171 - btf__add_field(btf2, "f1", 7, 0, 0); /* struct s1 *f1; */ 172 - btf__add_field(btf2, "f2", 9, 64, 0); /* struct s2 *f2; */ 164 + btf__add_int(btf2, "int", 4, BTF_INT_SIGNED); /* [7] int */ 165 + btf__add_ptr(btf2, 11); /* [8] ptr to struct s1 */ 166 + btf__add_fwd(btf2, "s2", BTF_FWD_STRUCT); /* [9] fwd for struct s2 */ 167 + btf__add_ptr(btf2, 9); /* [10] ptr to fwd struct s2 */ 168 + btf__add_struct(btf2, "s1", 16); /* [11] struct s1 { */ 169 + btf__add_field(btf2, "f1", 8, 0, 0); /* struct s1 *f1; */ 170 + btf__add_field(btf2, "f2", 10, 64, 0); /* struct s2 *f2; */ 173 171 /* } */ 172 + btf__add_fwd(btf2, "s3", BTF_FWD_STRUCT); /* [12] fwd for struct s3 */ 173 + btf__add_ptr(btf2, 12); /* [13] ptr to struct s1 */ 174 174 175 175 VALIDATE_RAW_BTF( 176 176 btf2, ··· 186 178 "\t'f2' type_id=3 bits_offset=64", 187 179 "[5] STRUCT 's2' size=4 vlen=1\n" 188 180 "\t'f1' 
type_id=1 bits_offset=0", 189 - "[6] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", 190 - "[7] PTR '(anon)' type_id=10", 191 - "[8] FWD 's2' fwd_kind=struct", 192 - "[9] PTR '(anon)' type_id=8", 193 - "[10] STRUCT 's1' size=16 vlen=2\n" 194 - "\t'f1' type_id=7 bits_offset=0\n" 195 - "\t'f2' type_id=9 bits_offset=64"); 181 + "[6] STRUCT 's3' size=4 vlen=1\n" 182 + "\t'f1' type_id=1 bits_offset=0", 183 + "[7] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", 184 + "[8] PTR '(anon)' type_id=11", 185 + "[9] FWD 's2' fwd_kind=struct", 186 + "[10] PTR '(anon)' type_id=9", 187 + "[11] STRUCT 's1' size=16 vlen=2\n" 188 + "\t'f1' type_id=8 bits_offset=0\n" 189 + "\t'f2' type_id=10 bits_offset=64", 190 + "[12] FWD 's3' fwd_kind=struct", 191 + "[13] PTR '(anon)' type_id=12"); 196 192 197 193 err = btf__dedup(btf2, NULL); 198 194 if (!ASSERT_OK(err, "btf_dedup")) ··· 211 199 "\t'f1' type_id=2 bits_offset=0\n" 212 200 "\t'f2' type_id=3 bits_offset=64", 213 201 "[5] STRUCT 's2' size=4 vlen=1\n" 214 - "\t'f1' type_id=1 bits_offset=0"); 202 + "\t'f1' type_id=1 bits_offset=0", 203 + "[6] STRUCT 's3' size=4 vlen=1\n" 204 + "\t'f1' type_id=1 bits_offset=0", 205 + "[7] PTR '(anon)' type_id=6"); 215 206 216 207 cleanup: 217 208 btf__free(btf2);
+2 -2
tools/testing/selftests/bpf/prog_tests/btf_dump.c
··· 791 791 TEST_BTF_DUMP_DATA_OVER(btf, d, "struct", str, struct bpf_sock_ops, 792 792 sizeof(struct bpf_sock_ops) - 1, 793 793 "(struct bpf_sock_ops){\n\t.op = (__u32)1,\n", 794 - { .op = 1, .skb_tcp_flags = 2}); 794 + { .op = 1, .skb_hwtstamp = 2}); 795 795 TEST_BTF_DUMP_DATA_OVER(btf, d, "struct", str, struct bpf_sock_ops, 796 796 sizeof(struct bpf_sock_ops) - 1, 797 797 "(struct bpf_sock_ops){\n\t.op = (__u32)1,\n", 798 - { .op = 1, .skb_tcp_flags = 0}); 798 + { .op = 1, .skb_hwtstamp = 0}); 799 799 } 800 800 801 801 static void test_btf_dump_var_data(struct btf *btf, struct btf_dump *d,
+136 -54
tools/testing/selftests/bpf/prog_tests/hashmap.c
··· 7 7 */ 8 8 #include "test_progs.h" 9 9 #include "bpf/hashmap.h" 10 + #include <stddef.h> 10 11 11 12 static int duration = 0; 12 13 13 - static size_t hash_fn(const void *k, void *ctx) 14 + static size_t hash_fn(long k, void *ctx) 14 15 { 15 - return (long)k; 16 + return k; 16 17 } 17 18 18 - static bool equal_fn(const void *a, const void *b, void *ctx) 19 + static bool equal_fn(long a, long b, void *ctx) 19 20 { 20 - return (long)a == (long)b; 21 + return a == b; 21 22 } 22 23 23 24 static inline size_t next_pow_2(size_t n) ··· 53 52 return; 54 53 55 54 for (i = 0; i < ELEM_CNT; i++) { 56 - const void *oldk, *k = (const void *)(long)i; 57 - void *oldv, *v = (void *)(long)(1024 + i); 55 + long oldk, k = i; 56 + long oldv, v = 1024 + i; 58 57 59 58 err = hashmap__update(map, k, v, &oldk, &oldv); 60 59 if (CHECK(err != -ENOENT, "hashmap__update", ··· 65 64 err = hashmap__add(map, k, v); 66 65 } else { 67 66 err = hashmap__set(map, k, v, &oldk, &oldv); 68 - if (CHECK(oldk != NULL || oldv != NULL, "check_kv", 69 - "unexpected k/v: %p=%p\n", oldk, oldv)) 67 + if (CHECK(oldk != 0 || oldv != 0, "check_kv", 68 + "unexpected k/v: %ld=%ld\n", oldk, oldv)) 70 69 goto cleanup; 71 70 } 72 71 73 - if (CHECK(err, "elem_add", "failed to add k/v %ld = %ld: %d\n", 74 - (long)k, (long)v, err)) 72 + if (CHECK(err, "elem_add", "failed to add k/v %ld = %ld: %d\n", k, v, err)) 75 73 goto cleanup; 76 74 77 75 if (CHECK(!hashmap__find(map, k, &oldv), "elem_find", 78 - "failed to find key %ld\n", (long)k)) 76 + "failed to find key %ld\n", k)) 79 77 goto cleanup; 80 - if (CHECK(oldv != v, "elem_val", 81 - "found value is wrong: %ld\n", (long)oldv)) 78 + if (CHECK(oldv != v, "elem_val", "found value is wrong: %ld\n", oldv)) 82 79 goto cleanup; 83 80 } 84 81 ··· 90 91 91 92 found_msk = 0; 92 93 hashmap__for_each_entry(map, entry, bkt) { 93 - long k = (long)entry->key; 94 - long v = (long)entry->value; 94 + long k = entry->key; 95 + long v = entry->value; 95 96 96 97 found_msk |= 1ULL << k; 
97 98 if (CHECK(v - k != 1024, "check_kv", ··· 103 104 goto cleanup; 104 105 105 106 for (i = 0; i < ELEM_CNT; i++) { 106 - const void *oldk, *k = (const void *)(long)i; 107 - void *oldv, *v = (void *)(long)(256 + i); 107 + long oldk, k = i; 108 + long oldv, v = 256 + i; 108 109 109 110 err = hashmap__add(map, k, v); 110 111 if (CHECK(err != -EEXIST, "hashmap__add", ··· 118 119 119 120 if (CHECK(err, "elem_upd", 120 121 "failed to update k/v %ld = %ld: %d\n", 121 - (long)k, (long)v, err)) 122 + k, v, err)) 122 123 goto cleanup; 123 124 if (CHECK(!hashmap__find(map, k, &oldv), "elem_find", 124 - "failed to find key %ld\n", (long)k)) 125 + "failed to find key %ld\n", k)) 125 126 goto cleanup; 126 127 if (CHECK(oldv != v, "elem_val", 127 - "found value is wrong: %ld\n", (long)oldv)) 128 + "found value is wrong: %ld\n", oldv)) 128 129 goto cleanup; 129 130 } 130 131 ··· 138 139 139 140 found_msk = 0; 140 141 hashmap__for_each_entry_safe(map, entry, tmp, bkt) { 141 - long k = (long)entry->key; 142 - long v = (long)entry->value; 142 + long k = entry->key; 143 + long v = entry->value; 143 144 144 145 found_msk |= 1ULL << k; 145 146 if (CHECK(v - k != 256, "elem_check", ··· 151 152 goto cleanup; 152 153 153 154 found_cnt = 0; 154 - hashmap__for_each_key_entry(map, entry, (void *)0) { 155 + hashmap__for_each_key_entry(map, entry, 0) { 155 156 found_cnt++; 156 157 } 157 158 if (CHECK(!found_cnt, "found_cnt", ··· 160 161 161 162 found_msk = 0; 162 163 found_cnt = 0; 163 - hashmap__for_each_key_entry_safe(map, entry, tmp, (void *)0) { 164 - const void *oldk, *k; 165 - void *oldv, *v; 164 + hashmap__for_each_key_entry_safe(map, entry, tmp, 0) { 165 + long oldk, k; 166 + long oldv, v; 166 167 167 168 k = entry->key; 168 169 v = entry->value; 169 170 170 171 found_cnt++; 171 - found_msk |= 1ULL << (long)k; 172 + found_msk |= 1ULL << k; 172 173 173 174 if (CHECK(!hashmap__delete(map, k, &oldk, &oldv), "elem_del", 174 - "failed to delete k/v %ld = %ld\n", 175 - (long)k, (long)v)) 
175 + "failed to delete k/v %ld = %ld\n", k, v)) 176 176 goto cleanup; 177 177 if (CHECK(oldk != k || oldv != v, "check_old", 178 178 "invalid deleted k/v: expected %ld = %ld, got %ld = %ld\n", 179 - (long)k, (long)v, (long)oldk, (long)oldv)) 179 + k, v, oldk, oldv)) 180 180 goto cleanup; 181 181 if (CHECK(hashmap__delete(map, k, &oldk, &oldv), "elem_del", 182 - "unexpectedly deleted k/v %ld = %ld\n", 183 - (long)oldk, (long)oldv)) 182 + "unexpectedly deleted k/v %ld = %ld\n", oldk, oldv)) 184 183 goto cleanup; 185 184 } 186 185 ··· 195 198 goto cleanup; 196 199 197 200 hashmap__for_each_entry_safe(map, entry, tmp, bkt) { 198 - const void *oldk, *k; 199 - void *oldv, *v; 201 + long oldk, k; 202 + long oldv, v; 200 203 201 204 k = entry->key; 202 205 v = entry->value; 203 206 204 207 found_cnt++; 205 - found_msk |= 1ULL << (long)k; 208 + found_msk |= 1ULL << k; 206 209 207 210 if (CHECK(!hashmap__delete(map, k, &oldk, &oldv), "elem_del", 208 - "failed to delete k/v %ld = %ld\n", 209 - (long)k, (long)v)) 211 + "failed to delete k/v %ld = %ld\n", k, v)) 210 212 goto cleanup; 211 213 if (CHECK(oldk != k || oldv != v, "elem_check", 212 214 "invalid old k/v: expect %ld = %ld, got %ld = %ld\n", 213 - (long)k, (long)v, (long)oldk, (long)oldv)) 215 + k, v, oldk, oldv)) 214 216 goto cleanup; 215 217 if (CHECK(hashmap__delete(map, k, &oldk, &oldv), "elem_del", 216 - "unexpectedly deleted k/v %ld = %ld\n", 217 - (long)k, (long)v)) 218 + "unexpectedly deleted k/v %ld = %ld\n", k, v)) 218 219 goto cleanup; 219 220 } 220 221 ··· 230 235 hashmap__for_each_entry(map, entry, bkt) { 231 236 CHECK(false, "elem_exists", 232 237 "unexpected map entries left: %ld = %ld\n", 233 - (long)entry->key, (long)entry->value); 238 + entry->key, entry->value); 234 239 goto cleanup; 235 240 } 236 241 ··· 238 243 hashmap__for_each_entry(map, entry, bkt) { 239 244 CHECK(false, "elem_exists", 240 245 "unexpected map entries left: %ld = %ld\n", 241 - (long)entry->key, (long)entry->value); 246 + 
entry->key, entry->value); 242 247 goto cleanup; 243 248 } 244 249 ··· 246 251 hashmap__free(map); 247 252 } 248 253 249 - static size_t collision_hash_fn(const void *k, void *ctx) 254 + static size_t str_hash_fn(long a, void *ctx) 255 + { 256 + return str_hash((char *)a); 257 + } 258 + 259 + static bool str_equal_fn(long a, long b, void *ctx) 260 + { 261 + return strcmp((char *)a, (char *)b) == 0; 262 + } 263 + 264 + /* Verify that hashmap interface works with pointer keys and values */ 265 + static void test_hashmap_ptr_iface(void) 266 + { 267 + const char *key, *value, *old_key, *old_value; 268 + struct hashmap_entry *cur; 269 + struct hashmap *map; 270 + int err, i, bkt; 271 + 272 + map = hashmap__new(str_hash_fn, str_equal_fn, NULL); 273 + if (CHECK(!map, "hashmap__new", "can't allocate hashmap\n")) 274 + goto cleanup; 275 + 276 + #define CHECK_STR(fn, var, expected) \ 277 + CHECK(strcmp(var, (expected)), (fn), \ 278 + "wrong value of " #var ": '%s' instead of '%s'\n", var, (expected)) 279 + 280 + err = hashmap__insert(map, "a", "apricot", HASHMAP_ADD, NULL, NULL); 281 + if (CHECK(err, "hashmap__insert", "unexpected error: %d\n", err)) 282 + goto cleanup; 283 + 284 + err = hashmap__insert(map, "a", "apple", HASHMAP_SET, &old_key, &old_value); 285 + if (CHECK(err, "hashmap__insert", "unexpected error: %d\n", err)) 286 + goto cleanup; 287 + CHECK_STR("hashmap__update", old_key, "a"); 288 + CHECK_STR("hashmap__update", old_value, "apricot"); 289 + 290 + err = hashmap__add(map, "b", "banana"); 291 + if (CHECK(err, "hashmap__add", "unexpected error: %d\n", err)) 292 + goto cleanup; 293 + 294 + err = hashmap__set(map, "b", "breadfruit", &old_key, &old_value); 295 + if (CHECK(err, "hashmap__set", "unexpected error: %d\n", err)) 296 + goto cleanup; 297 + CHECK_STR("hashmap__set", old_key, "b"); 298 + CHECK_STR("hashmap__set", old_value, "banana"); 299 + 300 + err = hashmap__update(map, "b", "blueberry", &old_key, &old_value); 301 + if (CHECK(err, "hashmap__update", 
"unexpected error: %d\n", err)) 302 + goto cleanup; 303 + CHECK_STR("hashmap__update", old_key, "b"); 304 + CHECK_STR("hashmap__update", old_value, "breadfruit"); 305 + 306 + err = hashmap__append(map, "c", "cherry"); 307 + if (CHECK(err, "hashmap__append", "unexpected error: %d\n", err)) 308 + goto cleanup; 309 + 310 + if (CHECK(!hashmap__delete(map, "c", &old_key, &old_value), 311 + "hashmap__delete", "expected to have entry for 'c'\n")) 312 + goto cleanup; 313 + CHECK_STR("hashmap__delete", old_key, "c"); 314 + CHECK_STR("hashmap__delete", old_value, "cherry"); 315 + 316 + CHECK(!hashmap__find(map, "b", &value), "hashmap__find", "can't find value for 'b'\n"); 317 + CHECK_STR("hashmap__find", value, "blueberry"); 318 + 319 + if (CHECK(!hashmap__delete(map, "b", NULL, NULL), 320 + "hashmap__delete", "expected to have entry for 'b'\n")) 321 + goto cleanup; 322 + 323 + i = 0; 324 + hashmap__for_each_entry(map, cur, bkt) { 325 + if (CHECK(i != 0, "hashmap__for_each_entry", "too many entries")) 326 + goto cleanup; 327 + key = cur->pkey; 328 + value = cur->pvalue; 329 + CHECK_STR("entry", key, "a"); 330 + CHECK_STR("entry", value, "apple"); 331 + i++; 332 + } 333 + #undef CHECK_STR 334 + 335 + cleanup: 336 + hashmap__free(map); 337 + } 338 + 339 + static size_t collision_hash_fn(long k, void *ctx) 250 340 { 251 341 return 0; 252 342 } 253 343 254 344 static void test_hashmap_multimap(void) 255 345 { 256 - void *k1 = (void *)0, *k2 = (void *)1; 346 + long k1 = 0, k2 = 1; 257 347 struct hashmap_entry *entry; 258 348 struct hashmap *map; 259 349 long found_msk; ··· 353 273 * [0] -> 1, 2, 4; 354 274 * [1] -> 8, 16, 32; 355 275 */ 356 - err = hashmap__append(map, k1, (void *)1); 276 + err = hashmap__append(map, k1, 1); 357 277 if (CHECK(err, "elem_add", "failed to add k/v: %d\n", err)) 358 278 goto cleanup; 359 - err = hashmap__append(map, k1, (void *)2); 279 + err = hashmap__append(map, k1, 2); 360 280 if (CHECK(err, "elem_add", "failed to add k/v: %d\n", err)) 361 281 
goto cleanup; 362 - err = hashmap__append(map, k1, (void *)4); 282 + err = hashmap__append(map, k1, 4); 363 283 if (CHECK(err, "elem_add", "failed to add k/v: %d\n", err)) 364 284 goto cleanup; 365 285 366 - err = hashmap__append(map, k2, (void *)8); 286 + err = hashmap__append(map, k2, 8); 367 287 if (CHECK(err, "elem_add", "failed to add k/v: %d\n", err)) 368 288 goto cleanup; 369 - err = hashmap__append(map, k2, (void *)16); 289 + err = hashmap__append(map, k2, 16); 370 290 if (CHECK(err, "elem_add", "failed to add k/v: %d\n", err)) 371 291 goto cleanup; 372 - err = hashmap__append(map, k2, (void *)32); 292 + err = hashmap__append(map, k2, 32); 373 293 if (CHECK(err, "elem_add", "failed to add k/v: %d\n", err)) 374 294 goto cleanup; 375 295 ··· 380 300 /* verify global iteration still works and sees all values */ 381 301 found_msk = 0; 382 302 hashmap__for_each_entry(map, entry, bkt) { 383 - found_msk |= (long)entry->value; 303 + found_msk |= entry->value; 384 304 } 385 305 if (CHECK(found_msk != (1 << 6) - 1, "found_msk", 386 306 "not all keys iterated: %lx\n", found_msk)) ··· 389 309 /* iterate values for key 1 */ 390 310 found_msk = 0; 391 311 hashmap__for_each_key_entry(map, entry, k1) { 392 - found_msk |= (long)entry->value; 312 + found_msk |= entry->value; 393 313 } 394 314 if (CHECK(found_msk != (1 | 2 | 4), "found_msk", 395 315 "invalid k1 values: %lx\n", found_msk)) ··· 398 318 /* iterate values for key 2 */ 399 319 found_msk = 0; 400 320 hashmap__for_each_key_entry(map, entry, k2) { 401 - found_msk |= (long)entry->value; 321 + found_msk |= entry->value; 402 322 } 403 323 if (CHECK(found_msk != (8 | 16 | 32), "found_msk", 404 324 "invalid k2 values: %lx\n", found_msk)) ··· 413 333 struct hashmap_entry *entry; 414 334 int bkt; 415 335 struct hashmap *map; 416 - void *k = (void *)0; 336 + long k = 0; 417 337 418 338 /* force collisions */ 419 339 map = hashmap__new(hash_fn, equal_fn, NULL); ··· 454 374 test_hashmap_multimap(); 455 375 if 
(test__start_subtest("empty")) 456 376 test_hashmap_empty(); 377 + if (test__start_subtest("ptr_iface")) 378 + test_hashmap_ptr_iface(); 457 379 }
+3 -3
tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
··· 312 312 return (__u64) t.tv_sec * 1000000000 + t.tv_nsec; 313 313 } 314 314 315 - static size_t symbol_hash(const void *key, void *ctx __maybe_unused) 315 + static size_t symbol_hash(long key, void *ctx __maybe_unused) 316 316 { 317 317 return str_hash((const char *) key); 318 318 } 319 319 320 - static bool symbol_equal(const void *key1, const void *key2, void *ctx __maybe_unused) 320 + static bool symbol_equal(long key1, long key2, void *ctx __maybe_unused) 321 321 { 322 322 return strcmp((const char *) key1, (const char *) key2) == 0; 323 323 } ··· 372 372 sizeof("__ftrace_invalid_address__") - 1)) 373 373 continue; 374 374 375 - err = hashmap__add(map, name, NULL); 375 + err = hashmap__add(map, name, 0); 376 376 if (err == -EEXIST) 377 377 continue; 378 378 if (err)
+4 -2
tools/testing/selftests/bpf/prog_tests/tcp_hdr_options.c
··· 485 485 goto check_linum; 486 486 487 487 ret = read(sk_fds.passive_fd, recv_msg, sizeof(recv_msg)); 488 - if (ASSERT_EQ(ret, sizeof(send_msg), "read(msg)")) 488 + if (!ASSERT_EQ(ret, sizeof(send_msg), "read(msg)")) 489 489 goto check_linum; 490 490 } 491 491 ··· 504 504 misc_skel->bss->nr_pure_ack); 505 505 506 506 ASSERT_EQ(misc_skel->bss->nr_fin, 1, "unexpected nr_fin"); 507 + 508 + ASSERT_EQ(misc_skel->bss->nr_hwtstamp, 0, "nr_hwtstamp"); 507 509 508 510 check_linum: 509 511 ASSERT_FALSE(check_error_linum(&sk_fds), "check_error_linum"); ··· 541 539 goto skel_destroy; 542 540 543 541 cg_fd = test__join_cgroup(CG_NAME); 544 - if (ASSERT_GE(cg_fd, 0, "join_cgroup")) 542 + if (!ASSERT_GE(cg_fd, 0, "join_cgroup")) 545 543 goto skel_destroy; 546 544 547 545 for (i = 0; i < ARRAY_SIZE(tests); i++) {
+4
tools/testing/selftests/bpf/progs/test_misc_tcp_hdr_options.c
··· 27 27 unsigned int nr_data = 0; 28 28 unsigned int nr_syn = 0; 29 29 unsigned int nr_fin = 0; 30 + unsigned int nr_hwtstamp = 0; 30 31 31 32 /* Check the header received from the active side */ 32 33 static int __check_active_hdr_in(struct bpf_sock_ops *skops, bool check_syn) ··· 146 145 147 146 if (th->ack && !th->fin && tcp_hdrlen(th) == skops->skb_len) 148 147 nr_pure_ack++; 148 + 149 + if (skops->skb_hwtstamp) 150 + nr_hwtstamp++; 149 151 150 152 return CG_OK; 151 153 }
+22 -16
tools/testing/selftests/bpf/test_progs.c
··· 222 222 return failed ? "FAIL" : (skipped ? "SKIP" : "OK"); 223 223 } 224 224 225 + #define TEST_NUM_WIDTH 7 226 + 227 + static void print_test_result(const struct prog_test_def *test, const struct test_state *test_state) 228 + { 229 + int skipped_cnt = test_state->skip_cnt; 230 + int subtests_cnt = test_state->subtest_num; 231 + 232 + fprintf(env.stdout, "#%-*d %s:", TEST_NUM_WIDTH, test->test_num, test->test_name); 233 + if (test_state->error_cnt) 234 + fprintf(env.stdout, "FAIL"); 235 + else if (!skipped_cnt) 236 + fprintf(env.stdout, "OK"); 237 + else if (skipped_cnt == subtests_cnt || !subtests_cnt) 238 + fprintf(env.stdout, "SKIP"); 239 + else 240 + fprintf(env.stdout, "OK (SKIP: %d/%d)", skipped_cnt, subtests_cnt); 241 + 242 + fprintf(env.stdout, "\n"); 243 + } 244 + 225 245 static void print_test_log(char *log_buf, size_t log_cnt) 226 246 { 227 247 log_buf[log_cnt] = '\0'; 228 248 fprintf(env.stdout, "%s", log_buf); 229 249 if (log_buf[log_cnt - 1] != '\n') 230 250 fprintf(env.stdout, "\n"); 231 - } 232 - 233 - #define TEST_NUM_WIDTH 7 234 - 235 - static void print_test_name(int test_num, const char *test_name, char *result) 236 - { 237 - fprintf(env.stdout, "#%-*d %s", TEST_NUM_WIDTH, test_num, test_name); 238 - 239 - if (result) 240 - fprintf(env.stdout, ":%s", result); 241 - 242 - fprintf(env.stdout, "\n"); 243 251 } 244 252 245 253 static void print_subtest_name(int test_num, int subtest_num, ··· 315 307 subtest_state->skipped)); 316 308 } 317 309 318 - print_test_name(test->test_num, test->test_name, 319 - test_result(test_failed, test_state->skip_cnt)); 310 + print_test_result(test, test_state); 320 311 } 321 312 322 313 static void stdio_restore(void); ··· 1077 1070 state->tested = true; 1078 1071 1079 1072 if (verbose() && env.worker_id == -1) 1080 - print_test_name(test_num + 1, test->test_name, 1081 - test_result(state->error_cnt, state->skip_cnt)); 1073 + print_test_result(test, state); 1082 1074 1083 1075 reset_affinity(); 1084 1076 
restore_netns();
+737 -164
tools/testing/selftests/bpf/veristat.c
··· 17 17 #include <bpf/libbpf.h> 18 18 #include <libelf.h> 19 19 #include <gelf.h> 20 + #include <float.h> 20 21 21 22 enum stat_id { 22 23 VERDICT, ··· 35 34 NUM_STATS_CNT = FILE_NAME - VERDICT, 36 35 }; 37 36 37 + /* In comparison mode each stat can specify up to four different values: 38 + * - A side value; 39 + * - B side value; 40 + * - absolute diff value; 41 + * - relative (percentage) diff value. 42 + * 43 + * When specifying stat specs in comparison mode, user can use one of the 44 + * following variant suffixes to specify which exact variant should be used for 45 + * ordering or filtering: 46 + * - `_a` for A side value; 47 + * - `_b` for B side value; 48 + * - `_diff` for absolute diff value; 49 + * - `_pct` for relative (percentage) diff value. 50 + * 51 + * If no variant suffix is provided, then `_b` (control data) is assumed. 52 + * 53 + * As an example, let's say instructions stat has the following output: 54 + * 55 + * Insns (A) Insns (B) Insns (DIFF) 56 + * --------- --------- -------------- 57 + * 21547 20920 -627 (-2.91%) 58 + * 59 + * Then: 60 + * - 21547 is A side value (insns_a); 61 + * - 20920 is B side value (insns_b); 62 + * - -627 is absolute diff value (insns_diff); 63 + * - -2.91% is relative diff value (insns_pct). 64 + * 65 + * For verdict there is no verdict_pct variant. 66 + * For file and program name, _a and _b variants are equivalent and there are 67 + * no _diff or _pct variants. 
68 + */ 69 + enum stat_variant { 70 + VARIANT_A, 71 + VARIANT_B, 72 + VARIANT_DIFF, 73 + VARIANT_PCT, 74 + }; 75 + 38 76 struct verif_stats { 39 77 char *file_name; 40 78 char *prog_name; ··· 81 41 long stats[NUM_STATS_CNT]; 82 42 }; 83 43 44 + /* joined comparison mode stats */ 45 + struct verif_stats_join { 46 + char *file_name; 47 + char *prog_name; 48 + 49 + const struct verif_stats *stats_a; 50 + const struct verif_stats *stats_b; 51 + }; 52 + 84 53 struct stat_specs { 85 54 int spec_cnt; 86 55 enum stat_id ids[ALL_STATS_CNT]; 56 + enum stat_variant variants[ALL_STATS_CNT]; 87 57 bool asc[ALL_STATS_CNT]; 88 58 int lens[ALL_STATS_CNT * 3]; /* 3x for comparison mode */ 89 59 }; ··· 104 54 RESFMT_CSV, 105 55 }; 106 56 57 + enum filter_kind { 58 + FILTER_NAME, 59 + FILTER_STAT, 60 + }; 61 + 62 + enum operator_kind { 63 + OP_EQ, /* == or = */ 64 + OP_NEQ, /* != or <> */ 65 + OP_LT, /* < */ 66 + OP_LE, /* <= */ 67 + OP_GT, /* > */ 68 + OP_GE, /* >= */ 69 + }; 70 + 107 71 struct filter { 72 + enum filter_kind kind; 73 + /* FILTER_NAME */ 74 + char *any_glob; 108 75 char *file_glob; 109 76 char *prog_glob; 77 + /* FILTER_STAT */ 78 + enum operator_kind op; 79 + int stat_id; 80 + enum stat_variant stat_var; 81 + long value; 110 82 }; 111 83 112 84 static struct env { ··· 139 67 int log_level; 140 68 enum resfmt out_fmt; 141 69 bool comparison_mode; 70 + bool replay_mode; 142 71 143 72 struct verif_stats *prog_stats; 144 73 int prog_stat_cnt; ··· 147 74 /* baseline_stats is allocated and used only in comparsion mode */ 148 75 struct verif_stats *baseline_stats; 149 76 int baseline_stat_cnt; 77 + 78 + struct verif_stats_join *join_stats; 79 + int join_stat_cnt; 150 80 151 81 struct stat_specs output_spec; 152 82 struct stat_specs sort_spec; ··· 191 115 { "sort", 's', "SPEC", 0, "Specify sort order" }, 192 116 { "output-format", 'o', "FMT", 0, "Result output format (table, csv), default is table." 
}, 193 117 { "compare", 'C', NULL, 0, "Comparison mode" }, 118 + { "replay", 'R', NULL, 0, "Replay mode" }, 194 119 { "filter", 'f', "FILTER", 0, "Filter expressions (or @filename for file with expressions)." }, 195 120 {}, 196 121 }; ··· 245 168 break; 246 169 case 'C': 247 170 env.comparison_mode = true; 171 + break; 172 + case 'R': 173 + env.replay_mode = true; 248 174 break; 249 175 case 'f': 250 176 if (arg[0] == '@') ··· 306 226 return !*str && !*pat; 307 227 } 308 228 309 - static bool should_process_file(const char *filename) 310 - { 311 - int i; 312 - 313 - if (env.deny_filter_cnt > 0) { 314 - for (i = 0; i < env.deny_filter_cnt; i++) { 315 - if (glob_matches(filename, env.deny_filters[i].file_glob)) 316 - return false; 317 - } 318 - } 319 - 320 - if (env.allow_filter_cnt == 0) 321 - return true; 322 - 323 - for (i = 0; i < env.allow_filter_cnt; i++) { 324 - if (glob_matches(filename, env.allow_filters[i].file_glob)) 325 - return true; 326 - } 327 - 328 - return false; 329 - } 330 - 331 229 static bool is_bpf_obj_file(const char *path) { 332 230 Elf64_Ehdr *ehdr; 333 231 int fd, err = -EINVAL; ··· 338 280 return err == 0; 339 281 } 340 282 341 - static bool should_process_prog(const char *path, const char *prog_name) 283 + static bool should_process_file_prog(const char *filename, const char *prog_name) 342 284 { 343 - const char *filename = basename(path); 344 - int i; 285 + struct filter *f; 286 + int i, allow_cnt = 0; 345 287 346 - if (env.deny_filter_cnt > 0) { 347 - for (i = 0; i < env.deny_filter_cnt; i++) { 348 - if (glob_matches(filename, env.deny_filters[i].file_glob)) 349 - return false; 350 - if (!env.deny_filters[i].prog_glob) 288 + for (i = 0; i < env.deny_filter_cnt; i++) { 289 + f = &env.deny_filters[i]; 290 + if (f->kind != FILTER_NAME) 291 + continue; 292 + 293 + if (f->any_glob && glob_matches(filename, f->any_glob)) 294 + return false; 295 + if (f->any_glob && prog_name && glob_matches(prog_name, f->any_glob)) 296 + return false; 297 + 
if (f->file_glob && glob_matches(filename, f->file_glob)) 298 + return false; 299 + if (f->prog_glob && prog_name && glob_matches(prog_name, f->prog_glob)) 300 + return false; 301 + } 302 + 303 + for (i = 0; i < env.allow_filter_cnt; i++) { 304 + f = &env.allow_filters[i]; 305 + if (f->kind != FILTER_NAME) 306 + continue; 307 + 308 + allow_cnt++; 309 + if (f->any_glob) { 310 + if (glob_matches(filename, f->any_glob)) 311 + return true; 312 + /* If we don't know program name yet, any_glob filter 313 + * has to assume that current BPF object file might be 314 + * relevant; we'll check again later on after opening 315 + * BPF object file, at which point program name will 316 + * be known finally. 317 + */ 318 + if (!prog_name || glob_matches(prog_name, f->any_glob)) 319 + return true; 320 + } else { 321 + if (f->file_glob && !glob_matches(filename, f->file_glob)) 351 322 continue; 352 - if (glob_matches(prog_name, env.deny_filters[i].prog_glob)) 353 - return false; 323 + if (f->prog_glob && prog_name && !glob_matches(prog_name, f->prog_glob)) 324 + continue; 325 + return true; 354 326 } 355 327 } 356 328 357 - if (env.allow_filter_cnt == 0) 358 - return true; 359 - 360 - for (i = 0; i < env.allow_filter_cnt; i++) { 361 - if (!glob_matches(filename, env.allow_filters[i].file_glob)) 362 - continue; 363 - /* if filter specifies only filename glob part, it implicitly 364 - * allows all progs within that file 365 - */ 366 - if (!env.allow_filters[i].prog_glob) 367 - return true; 368 - if (glob_matches(prog_name, env.allow_filters[i].prog_glob)) 369 - return true; 370 - } 371 - 372 - return false; 329 + /* if there are no file/prog name allow filters, allow all progs, 330 + * unless they are denied earlier explicitly 331 + */ 332 + return allow_cnt == 0; 373 333 } 334 + 335 + static struct { 336 + enum operator_kind op_kind; 337 + const char *op_str; 338 + } operators[] = { 339 + /* Order of these definitions matter to avoid situations like '<' 340 + * matching part of what 
is actually a '<>' operator. That is,
+	 * substrings should go last.
+	 */
+	{ OP_EQ, "==" },
+	{ OP_NEQ, "!=" },
+	{ OP_NEQ, "<>" },
+	{ OP_LE, "<=" },
+	{ OP_LT, "<" },
+	{ OP_GE, ">=" },
+	{ OP_GT, ">" },
+	{ OP_EQ, "=" },
+};
+
+static bool parse_stat_id_var(const char *name, size_t len, int *id, enum stat_variant *var);
 
 static int append_filter(struct filter **filters, int *cnt, const char *str)
 {
 	struct filter *f;
 	void *tmp;
 	const char *p;
+	int i;
 
 	tmp = realloc(*filters, (*cnt + 1) * sizeof(**filters));
 	if (!tmp)
···
 	*filters = tmp;
 
 	f = &(*filters)[*cnt];
-	f->file_glob = f->prog_glob = NULL;
+	memset(f, 0, sizeof(*f));
 
-	/* filter can be specified either as "<obj-glob>" or "<obj-glob>/<prog-glob>" */
+	/* First, let's check if it's a stats filter of the following form:
+	 * <stat><op><value>, where:
+	 * - <stat> is one of the supported numerical stats (verdict is also
+	 *   considered numerical, failure == 0, success == 1);
+	 * - <op> is a comparison operator (see `operators` definitions);
+	 * - <value> is an integer (or failure/success, or false/true as
+	 *   special aliases for 0 and 1, respectively).
+	 * If the form doesn't match what the user provided, we assume a
+	 * file/prog glob filter.
340 + */ 341 + for (i = 0; i < ARRAY_SIZE(operators); i++) { 342 + enum stat_variant var; 343 + int id; 344 + long val; 345 + const char *end = str; 346 + const char *op_str; 347 + 348 + op_str = operators[i].op_str; 349 + p = strstr(str, op_str); 350 + if (!p) 351 + continue; 352 + 353 + if (!parse_stat_id_var(str, p - str, &id, &var)) { 354 + fprintf(stderr, "Unrecognized stat name in '%s'!\n", str); 355 + return -EINVAL; 356 + } 357 + if (id >= FILE_NAME) { 358 + fprintf(stderr, "Non-integer stat is specified in '%s'!\n", str); 359 + return -EINVAL; 360 + } 361 + 362 + p += strlen(op_str); 363 + 364 + if (strcasecmp(p, "true") == 0 || 365 + strcasecmp(p, "t") == 0 || 366 + strcasecmp(p, "success") == 0 || 367 + strcasecmp(p, "succ") == 0 || 368 + strcasecmp(p, "s") == 0 || 369 + strcasecmp(p, "match") == 0 || 370 + strcasecmp(p, "m") == 0) { 371 + val = 1; 372 + } else if (strcasecmp(p, "false") == 0 || 373 + strcasecmp(p, "f") == 0 || 374 + strcasecmp(p, "failure") == 0 || 375 + strcasecmp(p, "fail") == 0 || 376 + strcasecmp(p, "mismatch") == 0 || 377 + strcasecmp(p, "mis") == 0) { 378 + val = 0; 379 + } else { 380 + errno = 0; 381 + val = strtol(p, (char **)&end, 10); 382 + if (errno || end == p || *end != '\0' ) { 383 + fprintf(stderr, "Invalid integer value in '%s'!\n", str); 384 + return -EINVAL; 385 + } 386 + } 387 + 388 + f->kind = FILTER_STAT; 389 + f->stat_id = id; 390 + f->stat_var = var; 391 + f->op = operators[i].op_kind; 392 + f->value = val; 393 + 394 + *cnt += 1; 395 + return 0; 396 + } 397 + 398 + /* File/prog filter can be specified either as '<glob>' or 399 + * '<file-glob>/<prog-glob>'. In the former case <glob> is applied to 400 + * both file and program names. This seems to be way more useful in 401 + * practice. If user needs full control, they can use '/<prog-glob>' 402 + * form to glob just program name, or '<file-glob>/' to glob only file 403 + * name. But usually common <glob> seems to be the most useful and 404 + * ergonomic way. 
405 + */ 406 + f->kind = FILTER_NAME; 429 407 p = strchr(str, '/'); 430 408 if (!p) { 431 - f->file_glob = strdup(str); 432 - if (!f->file_glob) 409 + f->any_glob = strdup(str); 410 + if (!f->any_glob) 433 411 return -ENOMEM; 434 412 } else { 435 - f->file_glob = strndup(str, p - str); 436 - f->prog_glob = strdup(p + 1); 437 - if (!f->file_glob || !f->prog_glob) { 438 - free(f->file_glob); 439 - free(f->prog_glob); 440 - f->file_glob = f->prog_glob = NULL; 441 - return -ENOMEM; 413 + if (str != p) { 414 + /* non-empty file glob */ 415 + f->file_glob = strndup(str, p - str); 416 + if (!f->file_glob) 417 + return -ENOMEM; 418 + } 419 + if (strlen(p + 1) > 0) { 420 + /* non-empty prog glob */ 421 + f->prog_glob = strdup(p + 1); 422 + if (!f->prog_glob) { 423 + free(f->file_glob); 424 + f->file_glob = NULL; 425 + return -ENOMEM; 426 + } 442 427 } 443 428 } 444 429 445 - *cnt = *cnt + 1; 430 + *cnt += 1; 446 431 return 0; 447 432 } 448 433 ··· 567 388 }, 568 389 }; 569 390 391 + static const struct stat_specs default_csv_output_spec = { 392 + .spec_cnt = 9, 393 + .ids = { 394 + FILE_NAME, PROG_NAME, VERDICT, DURATION, 395 + TOTAL_INSNS, TOTAL_STATES, PEAK_STATES, 396 + MAX_STATES_PER_INSN, MARK_READ_MAX_LEN, 397 + }, 398 + }; 399 + 570 400 static const struct stat_specs default_sort_spec = { 401 + .spec_cnt = 2, 402 + .ids = { 403 + FILE_NAME, PROG_NAME, 404 + }, 405 + .asc = { true, true, }, 406 + }; 407 + 408 + /* sorting for comparison mode to join two data sets */ 409 + static const struct stat_specs join_sort_spec = { 571 410 .spec_cnt = 2, 572 411 .ids = { 573 412 FILE_NAME, PROG_NAME, ··· 597 400 const char *header; 598 401 const char *names[4]; 599 402 bool asc_by_default; 403 + bool left_aligned; 600 404 } stat_defs[] = { 601 - [FILE_NAME] = { "File", {"file_name", "filename", "file"}, true /* asc */ }, 602 - [PROG_NAME] = { "Program", {"prog_name", "progname", "prog"}, true /* asc */ }, 603 - [VERDICT] = { "Verdict", {"verdict"}, true /* asc: failure, success 
*/ }, 405 + [FILE_NAME] = { "File", {"file_name", "filename", "file"}, true /* asc */, true /* left */ }, 406 + [PROG_NAME] = { "Program", {"prog_name", "progname", "prog"}, true /* asc */, true /* left */ }, 407 + [VERDICT] = { "Verdict", {"verdict"}, true /* asc: failure, success */, true /* left */ }, 604 408 [DURATION] = { "Duration (us)", {"duration", "dur"}, }, 605 - [TOTAL_INSNS] = { "Total insns", {"total_insns", "insns"}, }, 606 - [TOTAL_STATES] = { "Total states", {"total_states", "states"}, }, 409 + [TOTAL_INSNS] = { "Insns", {"total_insns", "insns"}, }, 410 + [TOTAL_STATES] = { "States", {"total_states", "states"}, }, 607 411 [PEAK_STATES] = { "Peak states", {"peak_states"}, }, 608 412 [MAX_STATES_PER_INSN] = { "Max states per insn", {"max_states_per_insn"}, }, 609 413 [MARK_READ_MAX_LEN] = { "Max mark read length", {"max_mark_read_len", "mark_read"}, }, 610 414 }; 611 415 416 + static bool parse_stat_id_var(const char *name, size_t len, int *id, enum stat_variant *var) 417 + { 418 + static const char *var_sfxs[] = { 419 + [VARIANT_A] = "_a", 420 + [VARIANT_B] = "_b", 421 + [VARIANT_DIFF] = "_diff", 422 + [VARIANT_PCT] = "_pct", 423 + }; 424 + int i, j, k; 425 + 426 + for (i = 0; i < ARRAY_SIZE(stat_defs); i++) { 427 + struct stat_def *def = &stat_defs[i]; 428 + size_t alias_len, sfx_len; 429 + const char *alias; 430 + 431 + for (j = 0; j < ARRAY_SIZE(stat_defs[i].names); j++) { 432 + alias = def->names[j]; 433 + if (!alias) 434 + continue; 435 + 436 + alias_len = strlen(alias); 437 + if (strncmp(name, alias, alias_len) != 0) 438 + continue; 439 + 440 + if (alias_len == len) { 441 + /* If no variant suffix is specified, we 442 + * assume control group (just in case we are 443 + * in comparison mode. Variant is ignored in 444 + * non-comparison mode. 
445 + */ 446 + *var = VARIANT_B; 447 + *id = i; 448 + return true; 449 + } 450 + 451 + for (k = 0; k < ARRAY_SIZE(var_sfxs); k++) { 452 + sfx_len = strlen(var_sfxs[k]); 453 + if (alias_len + sfx_len != len) 454 + continue; 455 + 456 + if (strncmp(name + alias_len, var_sfxs[k], sfx_len) == 0) { 457 + *var = (enum stat_variant)k; 458 + *id = i; 459 + return true; 460 + } 461 + } 462 + } 463 + } 464 + 465 + return false; 466 + } 467 + 468 + static bool is_asc_sym(char c) 469 + { 470 + return c == '^'; 471 + } 472 + 473 + static bool is_desc_sym(char c) 474 + { 475 + return c == 'v' || c == 'V' || c == '.' || c == '!' || c == '_'; 476 + } 477 + 612 478 static int parse_stat(const char *stat_name, struct stat_specs *specs) 613 479 { 614 - int id, i; 480 + int id; 481 + bool has_order = false, is_asc = false; 482 + size_t len = strlen(stat_name); 483 + enum stat_variant var; 615 484 616 485 if (specs->spec_cnt >= ARRAY_SIZE(specs->ids)) { 617 486 fprintf(stderr, "Can't specify more than %zd stats\n", ARRAY_SIZE(specs->ids)); 618 487 return -E2BIG; 619 488 } 620 489 621 - for (id = 0; id < ARRAY_SIZE(stat_defs); id++) { 622 - struct stat_def *def = &stat_defs[id]; 623 - 624 - for (i = 0; i < ARRAY_SIZE(stat_defs[id].names); i++) { 625 - if (!def->names[i] || strcmp(def->names[i], stat_name) != 0) 626 - continue; 627 - 628 - specs->ids[specs->spec_cnt] = id; 629 - specs->asc[specs->spec_cnt] = def->asc_by_default; 630 - specs->spec_cnt++; 631 - 632 - return 0; 633 - } 490 + if (len > 1 && (is_asc_sym(stat_name[len - 1]) || is_desc_sym(stat_name[len - 1]))) { 491 + has_order = true; 492 + is_asc = is_asc_sym(stat_name[len - 1]); 493 + len -= 1; 634 494 } 635 495 636 - fprintf(stderr, "Unrecognized stat name '%s'\n", stat_name); 637 - return -ESRCH; 496 + if (!parse_stat_id_var(stat_name, len, &id, &var)) { 497 + fprintf(stderr, "Unrecognized stat name '%s'\n", stat_name); 498 + return -ESRCH; 499 + } 500 + 501 + specs->ids[specs->spec_cnt] = id; 502 + 
specs->variants[specs->spec_cnt] = var; 503 + specs->asc[specs->spec_cnt] = has_order ? is_asc : stat_defs[id].asc_by_default; 504 + specs->spec_cnt++; 505 + 506 + return 0; 638 507 } 639 508 640 509 static int parse_stats(const char *stats_str, struct stat_specs *specs) ··· 803 540 int err = 0; 804 541 void *tmp; 805 542 806 - if (!should_process_prog(filename, bpf_program__name(prog))) { 543 + if (!should_process_file_prog(basename(filename), bpf_program__name(prog))) { 807 544 env.progs_skipped++; 808 545 return 0; 809 546 } ··· 859 596 LIBBPF_OPTS(bpf_object_open_opts, opts); 860 597 int err = 0, prog_cnt = 0; 861 598 862 - if (!should_process_file(basename(filename))) { 599 + if (!should_process_file_prog(basename(filename), NULL)) { 863 600 if (env.verbose) 864 601 printf("Skipping '%s' due to filters...\n", filename); 865 602 env.files_skipped++; ··· 979 716 return cmp; 980 717 } 981 718 982 - return 0; 719 + /* always disambiguate with file+prog, which are unique */ 720 + cmp = strcmp(s1->file_name, s2->file_name); 721 + if (cmp != 0) 722 + return cmp; 723 + return strcmp(s1->prog_name, s2->prog_name); 724 + } 725 + 726 + static void fetch_join_stat_value(const struct verif_stats_join *s, 727 + enum stat_id id, enum stat_variant var, 728 + const char **str_val, 729 + double *num_val) 730 + { 731 + long v1, v2; 732 + 733 + if (id == FILE_NAME) { 734 + *str_val = s->file_name; 735 + return; 736 + } 737 + if (id == PROG_NAME) { 738 + *str_val = s->prog_name; 739 + return; 740 + } 741 + 742 + v1 = s->stats_a ? s->stats_a->stats[id] : 0; 743 + v2 = s->stats_b ? 
s->stats_b->stats[id] : 0; 744 + 745 + switch (var) { 746 + case VARIANT_A: 747 + if (!s->stats_a) 748 + *num_val = -DBL_MAX; 749 + else 750 + *num_val = s->stats_a->stats[id]; 751 + return; 752 + case VARIANT_B: 753 + if (!s->stats_b) 754 + *num_val = -DBL_MAX; 755 + else 756 + *num_val = s->stats_b->stats[id]; 757 + return; 758 + case VARIANT_DIFF: 759 + if (!s->stats_a || !s->stats_b) 760 + *num_val = -DBL_MAX; 761 + else if (id == VERDICT) 762 + *num_val = v1 == v2 ? 1.0 /* MATCH */ : 0.0 /* MISMATCH */; 763 + else 764 + *num_val = (double)(v2 - v1); 765 + return; 766 + case VARIANT_PCT: 767 + if (!s->stats_a || !s->stats_b) { 768 + *num_val = -DBL_MAX; 769 + } else if (v1 == 0) { 770 + if (v1 == v2) 771 + *num_val = 0.0; 772 + else 773 + *num_val = v2 < v1 ? -100.0 : 100.0; 774 + } else { 775 + *num_val = (v2 - v1) * 100.0 / v1; 776 + } 777 + return; 778 + } 779 + } 780 + 781 + static int cmp_join_stat(const struct verif_stats_join *s1, 782 + const struct verif_stats_join *s2, 783 + enum stat_id id, enum stat_variant var, bool asc) 784 + { 785 + const char *str1 = NULL, *str2 = NULL; 786 + double v1, v2; 787 + int cmp = 0; 788 + 789 + fetch_join_stat_value(s1, id, var, &str1, &v1); 790 + fetch_join_stat_value(s2, id, var, &str2, &v2); 791 + 792 + if (str1) 793 + cmp = strcmp(str1, str2); 794 + else if (v1 != v2) 795 + cmp = v1 < v2 ? -1 : 1; 796 + 797 + return asc ? 
cmp : -cmp; 798 + } 799 + 800 + static int cmp_join_stats(const void *v1, const void *v2) 801 + { 802 + const struct verif_stats_join *s1 = v1, *s2 = v2; 803 + int i, cmp; 804 + 805 + for (i = 0; i < env.sort_spec.spec_cnt; i++) { 806 + cmp = cmp_join_stat(s1, s2, 807 + env.sort_spec.ids[i], 808 + env.sort_spec.variants[i], 809 + env.sort_spec.asc[i]); 810 + if (cmp != 0) 811 + return cmp; 812 + } 813 + 814 + /* always disambiguate with file+prog, which are unique */ 815 + cmp = strcmp(s1->file_name, s2->file_name); 816 + if (cmp != 0) 817 + return cmp; 818 + return strcmp(s1->prog_name, s2->prog_name); 983 819 } 984 820 985 821 #define HEADER_CHAR '-' ··· 1100 738 1101 739 static void output_headers(enum resfmt fmt) 1102 740 { 741 + const char *fmt_str; 1103 742 int i, len; 1104 743 1105 744 for (i = 0; i < env.output_spec.spec_cnt; i++) { ··· 1114 751 *max_len = len; 1115 752 break; 1116 753 case RESFMT_TABLE: 1117 - printf("%s%-*s", i == 0 ? "" : COLUMN_SEP, *max_len, stat_defs[id].header); 754 + fmt_str = stat_defs[id].left_aligned ? "%s%-*s" : "%s%*s"; 755 + printf(fmt_str, i == 0 ? "" : COLUMN_SEP, *max_len, stat_defs[id].header); 1118 756 if (i == env.output_spec.spec_cnt - 1) 1119 757 printf("\n"); 1120 758 break; ··· 1136 772 { 1137 773 switch (id) { 1138 774 case FILE_NAME: 1139 - *str = s->file_name; 775 + *str = s ? s->file_name : "N/A"; 1140 776 break; 1141 777 case PROG_NAME: 1142 - *str = s->prog_name; 778 + *str = s ? s->prog_name : "N/A"; 1143 779 break; 1144 780 case VERDICT: 1145 - *str = s->stats[VERDICT] ? "success" : "failure"; 781 + if (!s) 782 + *str = "N/A"; 783 + else 784 + *str = s->stats[VERDICT] ? "success" : "failure"; 1146 785 break; 1147 786 case DURATION: 1148 787 case TOTAL_INSNS: ··· 1153 786 case PEAK_STATES: 1154 787 case MAX_STATES_PER_INSN: 1155 788 case MARK_READ_MAX_LEN: 1156 - *val = s->stats[id]; 789 + *val = s ? 
s->stats[id] : 0; 1157 790 break; 1158 791 default: 1159 792 fprintf(stderr, "Unrecognized stat #%d\n", id); ··· 1206 839 printf("Done. Processed %d files, %d programs. Skipped %d files, %d programs.\n", 1207 840 env.files_processed, env.files_skipped, env.progs_processed, env.progs_skipped); 1208 841 } 1209 - } 1210 - 1211 - static int handle_verif_mode(void) 1212 - { 1213 - int i, err; 1214 - 1215 - if (env.filename_cnt == 0) { 1216 - fprintf(stderr, "Please provide path to BPF object file!\n"); 1217 - argp_help(&argp, stderr, ARGP_HELP_USAGE, "veristat"); 1218 - return -EINVAL; 1219 - } 1220 - 1221 - for (i = 0; i < env.filename_cnt; i++) { 1222 - err = process_obj(env.filenames[i]); 1223 - if (err) { 1224 - fprintf(stderr, "Failed to process '%s': %d\n", env.filenames[i], err); 1225 - return err; 1226 - } 1227 - } 1228 - 1229 - qsort(env.prog_stats, env.prog_stat_cnt, sizeof(*env.prog_stats), cmp_prog_stats); 1230 - 1231 - if (env.out_fmt == RESFMT_TABLE) { 1232 - /* calculate column widths */ 1233 - output_headers(RESFMT_TABLE_CALCLEN); 1234 - for (i = 0; i < env.prog_stat_cnt; i++) 1235 - output_stats(&env.prog_stats[i], RESFMT_TABLE_CALCLEN, false); 1236 - } 1237 - 1238 - /* actually output the table */ 1239 - output_headers(env.out_fmt); 1240 - for (i = 0; i < env.prog_stat_cnt; i++) { 1241 - output_stats(&env.prog_stats[i], env.out_fmt, i == env.prog_stat_cnt - 1); 1242 - } 1243 - 1244 - return 0; 1245 842 } 1246 843 1247 844 static int parse_stat_value(const char *str, enum stat_id id, struct verif_stats *st) ··· 1339 1008 * parsed entire line; if row should be ignored we pretend we 1340 1009 * never parsed it 1341 1010 */ 1342 - if (!should_process_prog(st->file_name, st->prog_name)) { 1011 + if (!should_process_file_prog(st->file_name, st->prog_name)) { 1343 1012 free(st->file_name); 1344 1013 free(st->prog_name); 1345 1014 *stat_cntp -= 1; ··· 1428 1097 output_comp_header_underlines(); 1429 1098 } 1430 1099 1431 - static void output_comp_stats(const 
struct verif_stats *base, const struct verif_stats *comp, 1100 + static void output_comp_stats(const struct verif_stats_join *join_stats, 1432 1101 enum resfmt fmt, bool last) 1433 1102 { 1103 + const struct verif_stats *base = join_stats->stats_a; 1104 + const struct verif_stats *comp = join_stats->stats_b; 1434 1105 char base_buf[1024] = {}, comp_buf[1024] = {}, diff_buf[1024] = {}; 1435 1106 int i; 1436 1107 ··· 1450 1117 /* normalize all the outputs to be in string buffers for simplicity */ 1451 1118 if (is_key_stat(id)) { 1452 1119 /* key stats (file and program name) are always strings */ 1453 - if (base != &fallback_stats) 1120 + if (base) 1454 1121 snprintf(base_buf, sizeof(base_buf), "%s", base_str); 1455 1122 else 1456 1123 snprintf(base_buf, sizeof(base_buf), "%s", comp_str); 1457 1124 } else if (base_str) { 1458 1125 snprintf(base_buf, sizeof(base_buf), "%s", base_str); 1459 1126 snprintf(comp_buf, sizeof(comp_buf), "%s", comp_str); 1460 - if (strcmp(base_str, comp_str) == 0) 1127 + if (!base || !comp) 1128 + snprintf(diff_buf, sizeof(diff_buf), "%s", "N/A"); 1129 + else if (strcmp(base_str, comp_str) == 0) 1461 1130 snprintf(diff_buf, sizeof(diff_buf), "%s", "MATCH"); 1462 1131 else 1463 1132 snprintf(diff_buf, sizeof(diff_buf), "%s", "MISMATCH"); 1464 1133 } else { 1465 1134 double p = 0.0; 1466 1135 1467 - snprintf(base_buf, sizeof(base_buf), "%ld", base_val); 1468 - snprintf(comp_buf, sizeof(comp_buf), "%ld", comp_val); 1136 + if (base) 1137 + snprintf(base_buf, sizeof(base_buf), "%ld", base_val); 1138 + else 1139 + snprintf(base_buf, sizeof(base_buf), "%s", "N/A"); 1140 + if (comp) 1141 + snprintf(comp_buf, sizeof(comp_buf), "%ld", comp_val); 1142 + else 1143 + snprintf(comp_buf, sizeof(comp_buf), "%s", "N/A"); 1469 1144 1470 1145 diff_val = comp_val - base_val; 1471 - if (base == &fallback_stats || comp == &fallback_stats || base_val == 0) { 1472 - if (comp_val == base_val) 1473 - p = 0.0; /* avoid +0 (+100%) case */ 1474 - else 1475 - p = 
comp_val < base_val ? -100.0 : 100.0; 1146 + if (!base || !comp) { 1147 + snprintf(diff_buf, sizeof(diff_buf), "%s", "N/A"); 1476 1148 } else { 1477 - p = diff_val * 100.0 / base_val; 1149 + if (base_val == 0) { 1150 + if (comp_val == base_val) 1151 + p = 0.0; /* avoid +0 (+100%) case */ 1152 + else 1153 + p = comp_val < base_val ? -100.0 : 100.0; 1154 + } else { 1155 + p = diff_val * 100.0 / base_val; 1156 + } 1157 + snprintf(diff_buf, sizeof(diff_buf), "%+ld (%+.2lf%%)", diff_val, p); 1478 1158 } 1479 - snprintf(diff_buf, sizeof(diff_buf), "%+ld (%+.2lf%%)", diff_val, p); 1480 1159 } 1481 1160 1482 1161 switch (fmt) { ··· 1544 1199 return strcmp(base->prog_name, comp->prog_name); 1545 1200 } 1546 1201 1202 + static bool is_join_stat_filter_matched(struct filter *f, const struct verif_stats_join *stats) 1203 + { 1204 + static const double eps = 1e-9; 1205 + const char *str = NULL; 1206 + double value = 0.0; 1207 + 1208 + fetch_join_stat_value(stats, f->stat_id, f->stat_var, &str, &value); 1209 + 1210 + switch (f->op) { 1211 + case OP_EQ: return value > f->value - eps && value < f->value + eps; 1212 + case OP_NEQ: return value < f->value - eps || value > f->value + eps; 1213 + case OP_LT: return value < f->value - eps; 1214 + case OP_LE: return value <= f->value + eps; 1215 + case OP_GT: return value > f->value + eps; 1216 + case OP_GE: return value >= f->value - eps; 1217 + } 1218 + 1219 + fprintf(stderr, "BUG: unknown filter op %d!\n", f->op); 1220 + return false; 1221 + } 1222 + 1223 + static bool should_output_join_stats(const struct verif_stats_join *stats) 1224 + { 1225 + struct filter *f; 1226 + int i, allow_cnt = 0; 1227 + 1228 + for (i = 0; i < env.deny_filter_cnt; i++) { 1229 + f = &env.deny_filters[i]; 1230 + if (f->kind != FILTER_STAT) 1231 + continue; 1232 + 1233 + if (is_join_stat_filter_matched(f, stats)) 1234 + return false; 1235 + } 1236 + 1237 + for (i = 0; i < env.allow_filter_cnt; i++) { 1238 + f = &env.allow_filters[i]; 1239 + if (f->kind != 
FILTER_STAT) 1240 + continue; 1241 + allow_cnt++; 1242 + 1243 + if (is_join_stat_filter_matched(f, stats)) 1244 + return true; 1245 + } 1246 + 1247 + /* if there are no stat allowed filters, pass everything through */ 1248 + return allow_cnt == 0; 1249 + } 1250 + 1547 1251 static int handle_comparison_mode(void) 1548 1252 { 1549 1253 struct stat_specs base_specs = {}, comp_specs = {}; 1254 + struct stat_specs tmp_sort_spec; 1550 1255 enum resfmt cur_fmt; 1551 - int err, i, j; 1256 + int err, i, j, last_idx; 1552 1257 1553 1258 if (env.filename_cnt != 2) { 1554 - fprintf(stderr, "Comparison mode expects exactly two input CSV files!\n"); 1259 + fprintf(stderr, "Comparison mode expects exactly two input CSV files!\n\n"); 1555 1260 argp_help(&argp, stderr, ARGP_HELP_USAGE, "veristat"); 1556 1261 return -EINVAL; 1557 1262 } ··· 1639 1244 } 1640 1245 } 1641 1246 1247 + /* Replace user-specified sorting spec with file+prog sorting rule to 1248 + * be able to join two datasets correctly. Once we are done, we will 1249 + * restore the original sort spec. 1250 + */ 1251 + tmp_sort_spec = env.sort_spec; 1252 + env.sort_spec = join_sort_spec; 1642 1253 qsort(env.prog_stats, env.prog_stat_cnt, sizeof(*env.prog_stats), cmp_prog_stats); 1643 1254 qsort(env.baseline_stats, env.baseline_stat_cnt, sizeof(*env.baseline_stats), cmp_prog_stats); 1255 + env.sort_spec = tmp_sort_spec; 1644 1256 1645 - /* for human-readable table output we need to do extra pass to 1646 - * calculate column widths, so we substitute current output format 1647 - * with RESFMT_TABLE_CALCLEN and later revert it back to RESFMT_TABLE 1648 - * and do everything again. 
1649 - */ 1650 - if (env.out_fmt == RESFMT_TABLE) 1651 - cur_fmt = RESFMT_TABLE_CALCLEN; 1652 - else 1653 - cur_fmt = env.out_fmt; 1654 - 1655 - one_more_time: 1656 - output_comp_headers(cur_fmt); 1657 - 1658 - /* If baseline and comparison datasets have different subset of rows 1659 - * (we match by 'object + prog' as a unique key) then assume 1660 - * empty/missing/zero value for rows that are missing in the opposite 1661 - * data set 1257 + /* Join two datasets together. If baseline and comparison datasets 1258 + * have different subset of rows (we match by 'object + prog' as 1259 + * a unique key) then assume empty/missing/zero value for rows that 1260 + * are missing in the opposite data set. 1662 1261 */ 1663 1262 i = j = 0; 1664 1263 while (i < env.baseline_stat_cnt || j < env.prog_stat_cnt) { 1665 - bool last = (i == env.baseline_stat_cnt - 1) || (j == env.prog_stat_cnt - 1); 1666 1264 const struct verif_stats *base, *comp; 1265 + struct verif_stats_join *join; 1266 + void *tmp; 1667 1267 int r; 1668 1268 1669 1269 base = i < env.baseline_stat_cnt ? 
&env.baseline_stats[i] : &fallback_stats;
···
 			return -EINVAL;
 		}
 
+		tmp = realloc(env.join_stats, (env.join_stat_cnt + 1) * sizeof(*env.join_stats));
+		if (!tmp)
+			return -ENOMEM;
+		env.join_stats = tmp;
+
+		join = &env.join_stats[env.join_stat_cnt];
+		memset(join, 0, sizeof(*join));
+
 		r = cmp_stats_key(base, comp);
 		if (r == 0) {
-			output_comp_stats(base, comp, cur_fmt, last);
+			join->file_name = base->file_name;
+			join->prog_name = base->prog_name;
+			join->stats_a = base;
+			join->stats_b = comp;
 			i++;
 			j++;
 		} else if (comp == &fallback_stats || r < 0) {
-			output_comp_stats(base, &fallback_stats, cur_fmt, last);
+			join->file_name = base->file_name;
+			join->prog_name = base->prog_name;
+			join->stats_a = base;
+			join->stats_b = NULL;
 			i++;
 		} else {
-			output_comp_stats(&fallback_stats, comp, cur_fmt, last);
+			join->file_name = comp->file_name;
+			join->prog_name = comp->prog_name;
+			join->stats_a = NULL;
+			join->stats_b = comp;
 			j++;
 		}
+		env.join_stat_cnt += 1;
+	}
+
+	/* now sort joined results according to sort spec */
+	qsort(env.join_stats, env.join_stat_cnt, sizeof(*env.join_stats), cmp_join_stats);
+
+	/* for human-readable table output we need to do an extra pass to
+	 * calculate column widths, so we substitute current output format
+	 * with RESFMT_TABLE_CALCLEN and later revert it back to RESFMT_TABLE
+	 * and do everything again.
+ */
+	if (env.out_fmt == RESFMT_TABLE)
+		cur_fmt = RESFMT_TABLE_CALCLEN;
+	else
+		cur_fmt = env.out_fmt;
+
+one_more_time:
+	output_comp_headers(cur_fmt);
+
+	for (i = 0; i < env.join_stat_cnt; i++) {
+		const struct verif_stats_join *join = &env.join_stats[i];
+
+		if (!should_output_join_stats(join))
+			continue;
+
+		if (cur_fmt == RESFMT_TABLE_CALCLEN)
+			last_idx = i;
+
+		output_comp_stats(join, cur_fmt, i == last_idx);
 	}
 
 	if (cur_fmt == RESFMT_TABLE_CALCLEN) {
 		cur_fmt = RESFMT_TABLE;
 		goto one_more_time; /* ... this time with feeling */
 	}
+
+	return 0;
+}
+
+static bool is_stat_filter_matched(struct filter *f, const struct verif_stats *stats)
+{
+	long value = stats->stats[f->stat_id];
+
+	switch (f->op) {
+	case OP_EQ: return value == f->value;
+	case OP_NEQ: return value != f->value;
+	case OP_LT: return value < f->value;
+	case OP_LE: return value <= f->value;
+	case OP_GT: return value > f->value;
+	case OP_GE: return value >= f->value;
+	}
+
+	fprintf(stderr, "BUG: unknown filter op %d!\n", f->op);
+	return false;
+}
+
+static bool should_output_stats(const struct verif_stats *stats)
+{
+	struct filter *f;
+	int i, allow_cnt = 0;
+
+	for (i = 0; i < env.deny_filter_cnt; i++) {
+		f = &env.deny_filters[i];
+		if (f->kind != FILTER_STAT)
+			continue;
+
+		if (is_stat_filter_matched(f, stats))
+			return false;
+	}
+
+	for (i = 0; i < env.allow_filter_cnt; i++) {
+		f = &env.allow_filters[i];
+		if (f->kind != FILTER_STAT)
+			continue;
+		allow_cnt++;
+
+		if (is_stat_filter_matched(f, stats))
+			return true;
+	}
+
+	/* if there are no stat allowed filters, pass everything through */
+	return allow_cnt == 0;
+}
+
+static void output_prog_stats(void)
+{
+	const struct verif_stats *stats;
+	int i, last_stat_idx = 0;
+
+	if (env.out_fmt == RESFMT_TABLE) {
+		/* calculate column widths */
+		output_headers(RESFMT_TABLE_CALCLEN);
+		for (i = 0; i < env.prog_stat_cnt; i++) {
+			stats = &env.prog_stats[i];
+			if (!should_output_stats(stats))
+				continue;
+			output_stats(stats, RESFMT_TABLE_CALCLEN, false);
+			last_stat_idx = i;
+		}
+	}
+
+	/* actually output the table */
+	output_headers(env.out_fmt);
+	for (i = 0; i < env.prog_stat_cnt; i++) {
+		stats = &env.prog_stats[i];
+		if (!should_output_stats(stats))
+			continue;
+		output_stats(stats, env.out_fmt, i == last_stat_idx);
+	}
+}
+
+static int handle_verif_mode(void)
+{
+	int i, err;
+
+	if (env.filename_cnt == 0) {
+		fprintf(stderr, "Please provide path to BPF object file!\n\n");
+		argp_help(&argp, stderr, ARGP_HELP_USAGE, "veristat");
+		return -EINVAL;
+	}
+
+	for (i = 0; i < env.filename_cnt; i++) {
+		err = process_obj(env.filenames[i]);
+		if (err) {
+			fprintf(stderr, "Failed to process '%s': %d\n", env.filenames[i], err);
+			return err;
+		}
+	}
+
+	qsort(env.prog_stats, env.prog_stat_cnt, sizeof(*env.prog_stats), cmp_prog_stats);
+
+	output_prog_stats();
+
+	return 0;
+}
+
+static int handle_replay_mode(void)
+{
+	struct stat_specs specs = {};
+	int err;
+
+	if (env.filename_cnt != 1) {
+		fprintf(stderr, "Replay mode expects exactly one input CSV file!\n\n");
+		argp_help(&argp, stderr, ARGP_HELP_USAGE, "veristat");
+		return -EINVAL;
+	}
+
+	err = parse_stats_csv(env.filenames[0], &specs,
+			      &env.prog_stats, &env.prog_stat_cnt);
+	if (err) {
+		fprintf(stderr, "Failed to parse stats from '%s': %d\n", env.filenames[0], err);
+		return err;
+	}
+
+	qsort(env.prog_stats, env.prog_stat_cnt, sizeof(*env.prog_stats), cmp_prog_stats);
+
+	output_prog_stats();
 
 	return 0;
 }
···
 		return 1;
 
 	if (env.verbose && env.quiet) {
-		fprintf(stderr, "Verbose and quiet modes are incompatible, please specify just one or neither!\n");
+		fprintf(stderr, "Verbose and quiet modes are incompatible, please specify just one or neither!\n\n");
 		argp_help(&argp, stderr, ARGP_HELP_USAGE, "veristat");
 		return 1;
 	}
 	if (env.verbose && env.log_level == 0)
 		env.log_level = 1;
 
-	if (env.output_spec.spec_cnt == 0)
-		env.output_spec = default_output_spec;
+	if (env.output_spec.spec_cnt == 0) {
+		if (env.out_fmt == RESFMT_CSV)
+			env.output_spec = default_csv_output_spec;
+		else
+			env.output_spec = default_output_spec;
+	}
 	if (env.sort_spec.spec_cnt == 0)
 		env.sort_spec = default_sort_spec;
 
+	if (env.comparison_mode && env.replay_mode) {
+		fprintf(stderr, "Can't specify replay and comparison mode at the same time!\n\n");
+		argp_help(&argp, stderr, ARGP_HELP_USAGE, "veristat");
+		return 1;
+	}
+
 	if (env.comparison_mode)
 		err = handle_comparison_mode();
+	else if (env.replay_mode)
+		err = handle_replay_mode();
 	else
 		err = handle_verif_mode();
 
 	free_verif_stats(env.prog_stats, env.prog_stat_cnt);
 	free_verif_stats(env.baseline_stats, env.baseline_stat_cnt);
+	free(env.join_stats);
 	for (i = 0; i < env.filename_cnt; i++)
 		free(env.filenames[i]);
 	free(env.filenames);
 	for (i = 0; i < env.allow_filter_cnt; i++) {
+		free(env.allow_filters[i].any_glob);
 		free(env.allow_filters[i].file_glob);
 		free(env.allow_filters[i].prog_glob);
 	}
 	free(env.allow_filters);
 	for (i = 0; i < env.deny_filter_cnt; i++) {
+		free(env.deny_filters[i].any_glob);
 		free(env.deny_filters[i].file_glob);
 		free(env.deny_filters[i].prog_glob);
 	}
+3 -2
tools/testing/selftests/bpf/xdp_synproxy.c
···
 		{ "tc", no_argument, NULL, 'c' },
 		{ NULL, 0, NULL, 0 },
 	};
-	unsigned long mss4, mss6, wscale, ttl;
+	unsigned long mss4, wscale, ttl;
+	unsigned long long mss6;
 	unsigned int tcpipopts_mask = 0;
 
 	if (argc < 2)
···
 
 	prog_info = (struct bpf_prog_info) {
 		.nr_map_ids = 8,
-		.map_ids = (__u64)map_ids,
+		.map_ids = (__u64)(unsigned long)map_ids,
 	};
 	info_len = sizeof(prog_info);
 
+4 -22
tools/testing/selftests/bpf/xsk.c
···
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
 #include "xsk.h"
+#include "bpf_util.h"
 
 #ifndef SOL_XDP
 #define SOL_XDP 283
···
 	return 0;
 }
 
-/* Copy up to sz - 1 bytes from zero-terminated src string and ensure that dst
- * is zero-terminated string no matter what (unless sz == 0, in which case
- * it's a no-op). It's conceptually close to FreeBSD's strlcpy(), but differs
- * in what is returned. Given this is internal helper, it's trivial to extend
- * this, when necessary. Use this instead of strncpy inside libbpf source code.
- */
-static inline void libbpf_strlcpy(char *dst, const char *src, size_t sz)
-{
-	size_t i;
-
-	if (sz == 0)
-		return;
-
-	sz--;
-	for (i = 0; i < sz && src[i]; i++)
-		dst[i] = src[i];
-	dst[i] = '\0';
-}
-
 static int xsk_get_max_queues(struct xsk_socket *xsk)
 {
 	struct ethtool_channels channels = { .cmd = ETHTOOL_GCHANNELS };
···
 		return -errno;
 
 	ifr.ifr_data = (void *)&channels;
-	libbpf_strlcpy(ifr.ifr_name, ctx->ifname, IFNAMSIZ);
+	bpf_strlcpy(ifr.ifr_name, ctx->ifname, IFNAMSIZ);
 	err = ioctl(fd, SIOCETHTOOL, &ifr);
 	if (err && errno != EOPNOTSUPP) {
 		ret = -errno;
···
 	}
 
 	ctx->ifindex = ifindex;
-	libbpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ);
+	bpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ);
 
 	xsk->ctx = ctx;
 	xsk->ctx->has_bpf_link = xsk_probe_bpf_link();
···
 	ctx->refcount = 1;
 	ctx->umem = umem;
 	ctx->queue_id = queue_id;
-	libbpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ);
+	bpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ);
 
 	ctx->fill = fill;
 	ctx->comp = comp;
+2 -1
tools/testing/selftests/bpf/xskxceiver.c
··· 1006 1006 { 1007 1007 struct xsk_socket_info *xsk = ifobject->xsk; 1008 1008 bool use_poll = ifobject->use_poll; 1009 - u32 i, idx = 0, ret, valid_pkts = 0; 1009 + u32 i, idx = 0, valid_pkts = 0; 1010 + int ret; 1010 1011 1011 1012 while (xsk_ring_prod__reserve(&xsk->tx, BATCH_SIZE, &idx) < BATCH_SIZE) { 1012 1013 if (use_poll) {