Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Andrii Nakryiko says:

====================
bpf-next 2022-11-11

We've added 49 non-merge commits during the last 9 day(s) which contain
a total of 68 files changed, 3592 insertions(+), 1371 deletions(-).

The main changes are:

1) Veristat tool improvements to support custom filtering, sorting, and replay
of results, from Andrii Nakryiko.

2) BPF verifier precision tracking fixes and improvements,
from Andrii Nakryiko.

3) Lots of new BPF documentation for various BPF maps, from Dave Tucker,
Donald Hunter, Maryam Tahhan, Bagas Sanjaya.

4) BTF dedup improvements and libbpf's hashmap interface clean ups, from
Eduard Zingerman.

5) Fix veth driver panic if XDP program is attached before veth_open, from
John Fastabend.

6) BPF verifier clean ups and fixes in preparation for follow up features,
from Kumar Kartikeya Dwivedi.

7) Add access to hwtstamp field from BPF sockops programs,
from Martin KaFai Lau.

8) Various fixes for BPF selftests and samples, from Artem Savkov,
Domenico Cerasuolo, Kang Minchul, Rong Tao, Yang Jihong.

9) Fix redirection to tunneling device logic, preventing skb->len == 0, from
Stanislav Fomichev.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (49 commits)
selftests/bpf: fix veristat's singular file-or-prog filter
selftests/bpf: Test skops->skb_hwtstamp
selftests/bpf: Fix incorrect ASSERT in the tcp_hdr_options test
bpf: Add hwtstamp field for the sockops prog
selftests/bpf: Fix xdp_synproxy compilation failure in 32-bit arch
bpf, docs: Document BPF_MAP_TYPE_ARRAY
docs/bpf: Document BPF map types QUEUE and STACK
docs/bpf: Document BPF ARRAY_OF_MAPS and HASH_OF_MAPS
docs/bpf: Document BPF_MAP_TYPE_CPUMAP map
docs/bpf: Document BPF_MAP_TYPE_LPM_TRIE map
libbpf: Hashmap.h update to fix build issues using LLVM14
bpf: veth driver panics when xdp prog attached before veth_open
selftests: Fix test group SKIPPED result
selftests/bpf: Tests for btf_dedup_resolve_fwds
libbpf: Resolve unambigous forward declarations
libbpf: Hashmap interface update to allow both long and void* keys/values
samples/bpf: Fix sockex3 error: Missing BPF prog type
selftests/bpf: Fix u32 variable compared with less than zero
Documentation: bpf: Escape underscore in BPF type name prefix
selftests/bpf: Use consistent build-id type for liburandom_read.so
...
====================

Link: https://lore.kernel.org/r/20221111233733.1088228-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+3626 -1405
+44
Documentation/bpf/bpf_design_QA.rst
···
The BTF_ID macro does not cause a function to become part of the ABI
any more than does the EXPORT_SYMBOL_GPL macro.

Q: What is the compatibility story for special BPF types in map values?
-----------------------------------------------------------------------
Q: Users are allowed to embed bpf_spin_lock, bpf_timer fields in their BPF map
values (when using BTF support for BPF maps). This allows the use of helpers
for such objects on these fields inside map values. Users are also allowed to
embed pointers to some kernel types (with __kptr and __kptr_ref BTF tags).
Will the kernel preserve backwards compatibility for these features?

A: It depends. For bpf_spin_lock, bpf_timer: YES, for kptr and everything else:
NO, but see below.

For struct types that have been added already, like bpf_spin_lock and bpf_timer,
the kernel will preserve backwards compatibility, as they are part of UAPI.

For kptrs, they are also part of UAPI, but only with respect to the kptr
mechanism. The types that you can use with a __kptr and __kptr_ref tagged
pointer in your struct are NOT part of the UAPI contract. The supported types
can and will change across kernel releases. However, operations like accessing
kptr fields and the bpf_kptr_xchg() helper will continue to be supported across
kernel releases for the supported types.

For any other supported struct type, unless explicitly stated in this document
and added to the bpf.h UAPI header, such types can and will arbitrarily change
their size, type, and alignment, or any other user visible API or ABI detail
across kernel releases. Users must adapt their BPF programs to such changes to
make sure they continue to work correctly.

NOTE: the BPF subsystem specifically reserves the 'bpf\_' prefix for type
names, in order to introduce more special fields in the future. Hence, user
programs must avoid defining types with the 'bpf\_' prefix so they are not
broken by future releases. In other words, no backwards compatibility is
guaranteed if one uses a type in BTF with the 'bpf\_' prefix.

Q: What is the compatibility story for special BPF types in local kptrs?
------------------------------------------------------------------------
Q: Same as above, but for local kptrs (i.e. pointers to objects allocated using
bpf_obj_new for user defined structures). Will the kernel preserve backwards
compatibility for these features?

A: NO.

Unlike map value types, there are no stability guarantees for this case. The
whole local kptr API itself is unstable (since it is exposed through kfuncs).
+250
Documentation/bpf/map_array.rst
···
.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

================================================
BPF_MAP_TYPE_ARRAY and BPF_MAP_TYPE_PERCPU_ARRAY
================================================

.. note::
   - ``BPF_MAP_TYPE_ARRAY`` was introduced in kernel version 3.19
   - ``BPF_MAP_TYPE_PERCPU_ARRAY`` was introduced in version 4.6

``BPF_MAP_TYPE_ARRAY`` and ``BPF_MAP_TYPE_PERCPU_ARRAY`` provide generic array
storage. The key type is an unsigned 32-bit integer (4 bytes) and the map is
of constant size. The size of the array is defined in ``max_entries`` at
creation time. All array elements are pre-allocated and zero initialized when
created. ``BPF_MAP_TYPE_PERCPU_ARRAY`` uses a different memory region for each
CPU whereas ``BPF_MAP_TYPE_ARRAY`` uses the same memory region. The value
stored can be of any size, however, all array elements are aligned to 8
bytes.

Since kernel 5.5, memory mapping may be enabled for ``BPF_MAP_TYPE_ARRAY`` by
setting the flag ``BPF_F_MMAPABLE``. The map definition is page-aligned and
starts on the first page. Sufficient page-sized and page-aligned blocks of
memory are allocated to store all array values, starting on the second page,
which in some cases will result in over-allocation of memory. The benefit of
using this is increased performance and ease of use since userspace programs
are not required to use helper functions to access and mutate data.

Usage
=====

Kernel BPF
----------

.. c:function::
   void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

Array elements can be retrieved using the ``bpf_map_lookup_elem()`` helper.
This helper returns a pointer to the array element, so to avoid data races
with userspace reading the value, the user must use primitives like
``__sync_fetch_and_add()`` when updating the value in-place.

.. c:function::
   long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags)

Array elements can be updated using the ``bpf_map_update_elem()`` helper.

``bpf_map_update_elem()`` returns 0 on success, or a negative error in case of
failure.

Since the array is of constant size, ``bpf_map_delete_elem()`` is not supported.
To clear an array element, you may use ``bpf_map_update_elem()`` to insert a
zero value at that index.

Per CPU Array
~~~~~~~~~~~~~

Values stored in ``BPF_MAP_TYPE_ARRAY`` can be accessed by multiple programs
across different CPUs. To restrict storage to a single CPU, you may use a
``BPF_MAP_TYPE_PERCPU_ARRAY``.

When using a ``BPF_MAP_TYPE_PERCPU_ARRAY`` the ``bpf_map_update_elem()`` and
``bpf_map_lookup_elem()`` helpers automatically access the slot for the current
CPU.

.. c:function::
   void *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void *key, u32 cpu)

The ``bpf_map_lookup_percpu_elem()`` helper can be used to look up the array
value for a specific CPU. Returns the value on success, or ``NULL`` if no
entry was found or ``cpu`` is invalid.

Concurrency
-----------

Since kernel version 5.1, the BPF infrastructure provides ``struct bpf_spin_lock``
to synchronize access.

Userspace
---------

Access from userspace uses libbpf APIs with the same names as above, with
the map identified by its ``fd``.

Examples
========

Please see the ``tools/testing/selftests/bpf`` directory for functional
examples. The code samples below demonstrate API usage.

Kernel BPF
----------

This snippet shows how to declare an array in a BPF program.

.. code-block:: c

    struct {
            __uint(type, BPF_MAP_TYPE_ARRAY);
            __type(key, u32);
            __type(value, long);
            __uint(max_entries, 256);
    } my_map SEC(".maps");

This example BPF program shows how to access an array element.

.. code-block:: c

    int bpf_prog(struct __sk_buff *skb)
    {
            struct iphdr ip;
            int index;
            long *value;

            if (bpf_skb_load_bytes(skb, ETH_HLEN, &ip, sizeof(ip)) < 0)
                    return 0;

            index = ip.protocol;
            value = bpf_map_lookup_elem(&my_map, &index);
            if (value)
                    __sync_fetch_and_add(value, skb->len);

            return 0;
    }

Userspace
---------

BPF_MAP_TYPE_ARRAY
~~~~~~~~~~~~~~~~~~

This snippet shows how to create an array, using ``bpf_map_create_opts`` to
set flags.

.. code-block:: c

    #include <bpf/libbpf.h>
    #include <bpf/bpf.h>

    int create_array(void)
    {
            int fd;
            LIBBPF_OPTS(bpf_map_create_opts, opts, .map_flags = BPF_F_MMAPABLE);

            fd = bpf_map_create(BPF_MAP_TYPE_ARRAY,
                                "example_array",       /* name */
                                sizeof(__u32),         /* key size */
                                sizeof(long),          /* value size */
                                256,                   /* max entries */
                                &opts);                /* create opts */
            return fd;
    }

This snippet shows how to initialize the elements of an array.

.. code-block:: c

    int initialize_array(int fd)
    {
            __u32 i;
            long value;
            int ret;

            for (i = 0; i < 256; i++) {
                    value = i;
                    ret = bpf_map_update_elem(fd, &i, &value, BPF_ANY);
                    if (ret < 0)
                            return ret;
            }

            return ret;
    }

This snippet shows how to retrieve an element value from an array.

.. code-block:: c

    int lookup(int fd)
    {
            __u32 index = 42;
            long value;
            int ret;

            ret = bpf_map_lookup_elem(fd, &index, &value);
            if (ret < 0)
                    return ret;

            /* use value here */
            assert(value == 42);

            return ret;
    }

BPF_MAP_TYPE_PERCPU_ARRAY
~~~~~~~~~~~~~~~~~~~~~~~~~

This snippet shows how to initialize the elements of a per CPU array.

.. code-block:: c

    int initialize_array(int fd)
    {
            int ncpus = libbpf_num_possible_cpus();
            long values[ncpus];
            __u32 i, j;
            int ret;

            for (i = 0; i < 256; i++) {
                    for (j = 0; j < ncpus; j++)
                            values[j] = i;
                    ret = bpf_map_update_elem(fd, &i, &values, BPF_ANY);
                    if (ret < 0)
                            return ret;
            }

            return ret;
    }

This snippet shows how to access the per CPU elements of an array value.

.. code-block:: c

    int lookup(int fd)
    {
            int ncpus = libbpf_num_possible_cpus();
            __u32 index = 42, j;
            long values[ncpus];
            int ret;

            ret = bpf_map_lookup_elem(fd, &index, &values);
            if (ret < 0)
                    return ret;

            for (j = 0; j < ncpus; j++) {
                    /* Use per CPU value here */
                    assert(values[j] == 42);
            }

            return ret;
    }

Semantics
=========

As shown in the example above, when accessing a ``BPF_MAP_TYPE_PERCPU_ARRAY``
in userspace, each value is an array with ``ncpus`` elements.

When calling ``bpf_map_update_elem()`` the flag ``BPF_NOEXIST`` can not be used
for these maps.
+166
Documentation/bpf/map_cpumap.rst
···
.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

===================
BPF_MAP_TYPE_CPUMAP
===================

.. note::
   - ``BPF_MAP_TYPE_CPUMAP`` was introduced in kernel version 4.15

.. kernel-doc:: kernel/bpf/cpumap.c
   :doc: cpu map

An example use-case for this map type is software based Receive Side Scaling (RSS).

The CPUMAP represents the CPUs in the system indexed as the map-key, and the
map-value is the config setting (per CPUMAP entry). Each CPUMAP entry has a
dedicated kernel thread bound to the given CPU to represent the remote CPU
execution unit.

Starting from Linux kernel version 5.9 the CPUMAP can run a second XDP program
on the remote CPU. This allows an XDP program to split its processing across
multiple CPUs. For example, a scenario where the initial CPU (that sees/receives
the packets) needs to do minimal packet processing and the remote CPU (to which
the packet is directed) can afford to spend more cycles processing the frame. The
initial CPU is where the XDP redirect program is executed. The remote CPU
receives raw ``xdp_frame`` objects.

Usage
=====

Kernel BPF
----------

.. c:function::
   long bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)

Redirect the packet to the endpoint referenced by ``map`` at index ``key``.
For ``BPF_MAP_TYPE_CPUMAP`` this map contains references to CPUs.

The lower two bits of ``flags`` are used as the return code if the map lookup
fails. This is so that the return value can be one of the XDP program return
codes up to ``XDP_TX``, as chosen by the caller.

Userspace
---------

.. note::
   CPUMAP entries can only be updated/looked up/deleted from user space and not
   from an eBPF program. Trying to call these functions from a kernel eBPF
   program will result in the program failing to load and a verifier warning.

.. c:function::
   int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags);

CPU entries can be added or updated using the ``bpf_map_update_elem()``
helper. This helper replaces existing elements atomically. The ``value``
parameter can be ``struct bpf_cpumap_val``.

.. code-block:: c

    struct bpf_cpumap_val {
            __u32 qsize;  /* queue size to remote target CPU */
            union {
                    int   fd; /* prog fd on map write */
                    __u32 id; /* prog id on map read */
            } bpf_prog;
    };

The flags argument can be one of the following:

- BPF_ANY: Create a new element or update an existing element.
- BPF_NOEXIST: Create a new element only if it did not exist.
- BPF_EXIST: Update an existing element.

.. c:function::
   int bpf_map_lookup_elem(int fd, const void *key, void *value);

CPU entries can be retrieved using the ``bpf_map_lookup_elem()`` helper.

.. c:function::
   int bpf_map_delete_elem(int fd, const void *key);

CPU entries can be deleted using the ``bpf_map_delete_elem()`` helper. This
helper will return 0 on success, or a negative error in case of failure.

Examples
========

Kernel
------

The following code snippet shows how to declare a ``BPF_MAP_TYPE_CPUMAP`` called
``cpu_map`` and how to redirect packets to a remote CPU using a round robin scheme.

.. code-block:: c

    struct {
            __uint(type, BPF_MAP_TYPE_CPUMAP);
            __type(key, __u32);
            __type(value, struct bpf_cpumap_val);
            __uint(max_entries, 12);
    } cpu_map SEC(".maps");

    struct {
            __uint(type, BPF_MAP_TYPE_ARRAY);
            __type(key, __u32);
            __type(value, __u32);
            __uint(max_entries, 12);
    } cpus_available SEC(".maps");

    struct {
            __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
            __type(key, __u32);
            __type(value, __u32);
            __uint(max_entries, 1);
    } cpus_iterator SEC(".maps");

    SEC("xdp")
    int xdp_redir_cpu_round_robin(struct xdp_md *ctx)
    {
            __u32 key = 0;
            __u32 cpu_dest = 0;
            __u32 *cpu_selected, *cpu_iterator;
            __u32 cpu_idx;

            cpu_iterator = bpf_map_lookup_elem(&cpus_iterator, &key);
            if (!cpu_iterator)
                    return XDP_ABORTED;
            cpu_idx = *cpu_iterator;

            *cpu_iterator += 1;
            if (*cpu_iterator == bpf_num_possible_cpus())
                    *cpu_iterator = 0;

            cpu_selected = bpf_map_lookup_elem(&cpus_available, &cpu_idx);
            if (!cpu_selected)
                    return XDP_ABORTED;
            cpu_dest = *cpu_selected;

            if (cpu_dest >= bpf_num_possible_cpus())
                    return XDP_ABORTED;

            return bpf_redirect_map(&cpu_map, cpu_dest, 0);
    }

Userspace
---------

The following code snippet shows how to dynamically set the max_entries for a
CPUMAP to the max number of cpus available on the system.

.. code-block:: c

    int set_max_cpu_entries(struct bpf_map *cpu_map)
    {
            if (bpf_map__set_max_entries(cpu_map, libbpf_num_possible_cpus()) < 0) {
                    fprintf(stderr, "Failed to set max entries for cpu_map map: %s",
                            strerror(errno));
                    return -1;
            }
            return 0;
    }

References
==========

- https://developers.redhat.com/blog/2021/05/13/receive-side-scaling-rss-with-ebpf-and-cpumap#redirecting_into_a_cpumap
+181
Documentation/bpf/map_lpm_trie.rst
···
.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

=====================
BPF_MAP_TYPE_LPM_TRIE
=====================

.. note::
   - ``BPF_MAP_TYPE_LPM_TRIE`` was introduced in kernel version 4.11

``BPF_MAP_TYPE_LPM_TRIE`` provides a longest prefix match algorithm that
can be used to match IP addresses to a stored set of prefixes.
Internally, data is stored in an unbalanced trie of nodes that uses
``prefixlen,data`` pairs as its keys. The ``data`` is interpreted in
network byte order, i.e. big endian, so ``data[0]`` stores the most
significant byte.

LPM tries may be created with a maximum prefix length that is a multiple
of 8, in the range from 8 to 2048. The key used for lookup and update
operations is a ``struct bpf_lpm_trie_key``, extended by
``max_prefixlen/8`` bytes.

- For IPv4 addresses the data length is 4 bytes
- For IPv6 addresses the data length is 16 bytes

The value type stored in the LPM trie can be any user defined type.

.. note::
   When creating a map of type ``BPF_MAP_TYPE_LPM_TRIE`` you must set the
   ``BPF_F_NO_PREALLOC`` flag.

Usage
=====

Kernel BPF
----------

.. c:function::
   void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

The longest prefix entry for a given data value can be found using the
``bpf_map_lookup_elem()`` helper. This helper returns a pointer to the
value associated with the longest matching ``key``, or ``NULL`` if no
entry was found.

The ``key`` should have ``prefixlen`` set to ``max_prefixlen`` when
performing longest prefix lookups. For example, when searching for the
longest prefix match for an IPv4 address, ``prefixlen`` should be set to
``32``.

.. c:function::
   long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags)

Prefix entries can be added or updated using the ``bpf_map_update_elem()``
helper. This helper replaces existing elements atomically.

``bpf_map_update_elem()`` returns ``0`` on success, or a negative error in
case of failure.

.. note::
   The flags parameter must be one of BPF_ANY, BPF_NOEXIST or BPF_EXIST,
   but the value is ignored, giving BPF_ANY semantics.

.. c:function::
   long bpf_map_delete_elem(struct bpf_map *map, const void *key)

Prefix entries can be deleted using the ``bpf_map_delete_elem()``
helper. This helper will return 0 on success, or a negative error in case
of failure.

Userspace
---------

Access from userspace uses libbpf APIs with the same names as above, with
the map identified by ``fd``.

.. c:function::
   int bpf_map_get_next_key(int fd, const void *cur_key, void *next_key)

A userspace program can iterate through the entries in an LPM trie using
libbpf's ``bpf_map_get_next_key()`` function. The first key can be
fetched by calling ``bpf_map_get_next_key()`` with ``cur_key`` set to
``NULL``. Subsequent calls will fetch the next key that follows the
current key. ``bpf_map_get_next_key()`` returns ``0`` on success,
``-ENOENT`` if ``cur_key`` is the last key in the trie, or a negative
error in case of failure.

``bpf_map_get_next_key()`` will iterate through the LPM trie elements
from leftmost leaf first. This means that iteration will return more
specific keys before less specific ones.

Examples
========

Please see ``tools/testing/selftests/bpf/test_lpm_map.c`` for examples
of LPM trie usage from userspace. The code snippets below demonstrate
API usage.

Kernel BPF
----------

The following BPF code snippet shows how to declare a new LPM trie for IPv4
address prefixes:

.. code-block:: c

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    struct ipv4_lpm_key {
            __u32 prefixlen;
            __u32 data;
    };

    struct {
            __uint(type, BPF_MAP_TYPE_LPM_TRIE);
            __type(key, struct ipv4_lpm_key);
            __type(value, __u32);
            __uint(map_flags, BPF_F_NO_PREALLOC);
            __uint(max_entries, 255);
    } ipv4_lpm_map SEC(".maps");

The following BPF code snippet shows how to look up by IPv4 address:

.. code-block:: c

    void *lookup(__u32 ipaddr)
    {
            struct ipv4_lpm_key key = {
                    .prefixlen = 32,
                    .data = ipaddr
            };

            return bpf_map_lookup_elem(&ipv4_lpm_map, &key);
    }

Userspace
---------

The following snippet shows how to insert an IPv4 prefix entry into an
LPM trie:

.. code-block:: c

    int add_prefix_entry(int lpm_fd, __u32 addr, __u32 prefixlen, struct value *value)
    {
            struct ipv4_lpm_key ipv4_key = {
                    .prefixlen = prefixlen,
                    .data = addr
            };
            return bpf_map_update_elem(lpm_fd, &ipv4_key, value, BPF_ANY);
    }

The following snippet shows a userspace program walking through the entries
of an LPM trie:

.. code-block:: c

    #include <bpf/libbpf.h>
    #include <bpf/bpf.h>

    void iterate_lpm_trie(int map_fd)
    {
            struct ipv4_lpm_key *cur_key = NULL;
            struct ipv4_lpm_key next_key;
            struct value value;
            int err;

            for (;;) {
                    err = bpf_map_get_next_key(map_fd, cur_key, &next_key);
                    if (err)
                            break;

                    bpf_map_lookup_elem(map_fd, &next_key, &value);

                    /* Use key and value here */

                    cur_key = &next_key;
            }
    }
+126
Documentation/bpf/map_of_maps.rst
···
.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

========================================================
BPF_MAP_TYPE_ARRAY_OF_MAPS and BPF_MAP_TYPE_HASH_OF_MAPS
========================================================

.. note::
   - ``BPF_MAP_TYPE_ARRAY_OF_MAPS`` and ``BPF_MAP_TYPE_HASH_OF_MAPS`` were
     introduced in kernel version 4.12

``BPF_MAP_TYPE_ARRAY_OF_MAPS`` and ``BPF_MAP_TYPE_HASH_OF_MAPS`` provide general
purpose support for map in map storage. One level of nesting is supported, where
an outer map contains instances of a single type of inner map, for example
``array_of_maps->sock_map``.

When creating an outer map, an inner map instance is used to initialize the
metadata that the outer map holds about its inner maps. This inner map has a
separate lifetime from the outer map and can be deleted after the outer map has
been created.

The outer map supports element lookup, update and delete from user space using
the syscall API. A BPF program is only allowed to do element lookup in the outer
map.

.. note::
   - Multi-level nesting is not supported.
   - Any BPF map type can be used as an inner map, except for
     ``BPF_MAP_TYPE_PROG_ARRAY``.
   - A BPF program cannot update or delete outer map entries.

For ``BPF_MAP_TYPE_ARRAY_OF_MAPS`` the key is an unsigned 32-bit integer index
into the array. The array is a fixed size with ``max_entries`` elements that are
zero initialized when created.

For ``BPF_MAP_TYPE_HASH_OF_MAPS`` the key type can be chosen when defining the
map. The kernel is responsible for allocating and freeing key/value pairs, up to
the max_entries limit that you specify. Hash maps use pre-allocation of hash
table elements by default. The ``BPF_F_NO_PREALLOC`` flag can be used to disable
pre-allocation when it is too memory expensive.

Usage
=====

Kernel BPF Helper
-----------------

.. c:function::
   void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

Inner maps can be retrieved using the ``bpf_map_lookup_elem()`` helper. This
helper returns a pointer to the inner map, or ``NULL`` if no entry was found.

Examples
========

Kernel BPF Example
------------------

This snippet shows how to create and initialise an array of devmaps in a BPF
program. Note that the outer array can only be modified from user space using
the syscall API.

.. code-block:: c

    struct inner_map {
            __uint(type, BPF_MAP_TYPE_DEVMAP);
            __uint(max_entries, 10);
            __type(key, __u32);
            __type(value, __u32);
    } inner_map1 SEC(".maps"), inner_map2 SEC(".maps");

    struct {
            __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
            __uint(max_entries, 2);
            __type(key, __u32);
            __array(values, struct inner_map);
    } outer_map SEC(".maps") = {
            .values = { &inner_map1,
                        &inner_map2 }
    };

See ``progs/test_btf_map_in_map.c`` in ``tools/testing/selftests/bpf`` for more
examples of declarative initialisation of outer maps.

User Space
----------

This snippet shows how to create an array based outer map:

.. code-block:: c

    int create_outer_array(int inner_fd)
    {
            LIBBPF_OPTS(bpf_map_create_opts, opts, .inner_map_fd = inner_fd);
            int fd;

            fd = bpf_map_create(BPF_MAP_TYPE_ARRAY_OF_MAPS,
                                "example_array",       /* name */
                                sizeof(__u32),         /* key size */
                                sizeof(__u32),         /* value size */
                                256,                   /* max entries */
                                &opts);                /* create opts */
            return fd;
    }

This snippet shows how to add an inner map to an outer map:

.. code-block:: c

    int add_devmap(int outer_fd, int index, const char *name)
    {
            int fd;

            fd = bpf_map_create(BPF_MAP_TYPE_DEVMAP, name,
                                sizeof(__u32), sizeof(__u32), 256, NULL);
            if (fd < 0)
                    return fd;

            return bpf_map_update_elem(outer_fd, &index, &fd, BPF_ANY);
    }

References
==========

- https://lore.kernel.org/netdev/20170322170035.923581-3-kafai@fb.com/
- https://lore.kernel.org/netdev/20170322170035.923581-4-kafai@fb.com/
+122
Documentation/bpf/map_queue_stack.rst
···
.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

=========================================
BPF_MAP_TYPE_QUEUE and BPF_MAP_TYPE_STACK
=========================================

.. note::
   - ``BPF_MAP_TYPE_QUEUE`` and ``BPF_MAP_TYPE_STACK`` were introduced
     in kernel version 4.20

``BPF_MAP_TYPE_QUEUE`` provides FIFO storage and ``BPF_MAP_TYPE_STACK``
provides LIFO storage for BPF programs. These maps support peek, pop and
push operations that are exposed to BPF programs through the respective
helpers. These operations are exposed to userspace applications using
the existing ``bpf`` syscall in the following way:

- ``BPF_MAP_LOOKUP_ELEM`` -> peek
- ``BPF_MAP_LOOKUP_AND_DELETE_ELEM`` -> pop
- ``BPF_MAP_UPDATE_ELEM`` -> push

``BPF_MAP_TYPE_QUEUE`` and ``BPF_MAP_TYPE_STACK`` do not support
``BPF_F_NO_PREALLOC``.

Usage
=====

Kernel BPF
----------

.. c:function::
   long bpf_map_push_elem(struct bpf_map *map, const void *value, u64 flags)

An element ``value`` can be added to a queue or stack using the
``bpf_map_push_elem`` helper. The ``flags`` parameter must be set to
``BPF_ANY`` or ``BPF_EXIST``. If ``flags`` is set to ``BPF_EXIST`` then,
when the queue or stack is full, the oldest element will be removed to
make room for ``value`` to be added. Returns ``0`` on success, or a
negative error in case of failure.

.. c:function::
   long bpf_map_peek_elem(struct bpf_map *map, void *value)

This helper fetches an element ``value`` from a queue or stack without
removing it. Returns ``0`` on success, or a negative error in case of
failure.

.. c:function::
   long bpf_map_pop_elem(struct bpf_map *map, void *value)

This helper removes an element from a queue or stack, storing it in
``value``. Returns ``0`` on success, or a negative error in case of
failure.

Userspace
---------

.. c:function::
   int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags)

A userspace program can push ``value`` onto a queue or stack using libbpf's
``bpf_map_update_elem`` function. The ``key`` parameter must be set to
``NULL`` and ``flags`` must be set to ``BPF_ANY`` or ``BPF_EXIST``, with the
same semantics as the ``bpf_map_push_elem`` kernel helper. Returns ``0`` on
success, or a negative error in case of failure.

.. c:function::
   int bpf_map_lookup_elem(int fd, const void *key, void *value)

A userspace program can peek at the ``value`` at the head of a queue or stack
using the libbpf ``bpf_map_lookup_elem`` function. The ``key`` parameter must be
set to ``NULL``. Returns ``0`` on success, or a negative error in case of
failure.

.. c:function::
   int bpf_map_lookup_and_delete_elem(int fd, const void *key, void *value)

A userspace program can pop a ``value`` from the head of a queue or stack using
the libbpf ``bpf_map_lookup_and_delete_elem`` function. The ``key`` parameter
must be set to ``NULL``. Returns ``0`` on success, or a negative error in case
of failure.

Examples
========

Kernel BPF
----------

This snippet shows how to declare a queue in a BPF program:

.. code-block:: c

    struct {
            __uint(type, BPF_MAP_TYPE_QUEUE);
            __type(value, __u32);
            __uint(max_entries, 10);
    } queue SEC(".maps");

Userspace
---------

This snippet shows how to use libbpf's low-level API to create a queue from
userspace:

.. code-block:: c

    int create_queue(void)
    {
            return bpf_map_create(BPF_MAP_TYPE_QUEUE,
                                  "sample_queue",      /* name */
                                  0,                   /* key size, must be zero */
                                  sizeof(__u32),       /* value size */
                                  10,                  /* max entries */
                                  NULL);               /* create options */
    }

References
==========

https://lwn.net/ml/netdev/153986858555.9127.14517764371945179514.stgit@kernel/
+1 -1
drivers/net/veth.c
··· 1125 1125 int err, i; 1126 1126 1127 1127 rq = &priv->rq[0]; 1128 - napi_already_on = (dev->flags & IFF_UP) && rcu_access_pointer(rq->napi); 1128 + napi_already_on = rcu_access_pointer(rq->napi); 1129 1129 1130 1130 if (!xdp_rxq_info_is_reg(&priv->rq[0].xdp_rxq)) { 1131 1131 err = veth_enable_xdp_range(dev, 0, dev->real_num_rx_queues, napi_already_on);
+117 -62
include/linux/bpf.h
··· 165 165 }; 166 166 167 167 enum { 168 - /* Support at most 8 pointers in a BPF map value */ 169 - BPF_MAP_VALUE_OFF_MAX = 8, 170 - BPF_MAP_OFF_ARR_MAX = BPF_MAP_VALUE_OFF_MAX + 171 - 1 + /* for bpf_spin_lock */ 172 - 1, /* for bpf_timer */ 168 + /* Support at most 8 pointers in a BTF type */ 169 + BTF_FIELDS_MAX = 10, 170 + BPF_MAP_OFF_ARR_MAX = BTF_FIELDS_MAX, 173 171 }; 174 172 175 - enum bpf_kptr_type { 176 - BPF_KPTR_UNREF, 177 - BPF_KPTR_REF, 173 + enum btf_field_type { 174 + BPF_SPIN_LOCK = (1 << 0), 175 + BPF_TIMER = (1 << 1), 176 + BPF_KPTR_UNREF = (1 << 2), 177 + BPF_KPTR_REF = (1 << 3), 178 + BPF_KPTR = BPF_KPTR_UNREF | BPF_KPTR_REF, 178 179 }; 179 180 180 - struct bpf_map_value_off_desc { 181 + struct btf_field_kptr { 182 + struct btf *btf; 183 + struct module *module; 184 + btf_dtor_kfunc_t dtor; 185 + u32 btf_id; 186 + }; 187 + 188 + struct btf_field { 181 189 u32 offset; 182 - enum bpf_kptr_type type; 183 - struct { 184 - struct btf *btf; 185 - struct module *module; 186 - btf_dtor_kfunc_t dtor; 187 - u32 btf_id; 188 - } kptr; 190 + enum btf_field_type type; 191 + union { 192 + struct btf_field_kptr kptr; 193 + }; 189 194 }; 190 195 191 - struct bpf_map_value_off { 192 - u32 nr_off; 193 - struct bpf_map_value_off_desc off[]; 196 + struct btf_record { 197 + u32 cnt; 198 + u32 field_mask; 199 + int spin_lock_off; 200 + int timer_off; 201 + struct btf_field fields[]; 194 202 }; 195 203 196 - struct bpf_map_off_arr { 204 + struct btf_field_offs { 197 205 u32 cnt; 198 206 u32 field_off[BPF_MAP_OFF_ARR_MAX]; 199 207 u8 field_sz[BPF_MAP_OFF_ARR_MAX]; ··· 222 214 u32 max_entries; 223 215 u64 map_extra; /* any per-map-type extra fields */ 224 216 u32 map_flags; 225 - int spin_lock_off; /* >=0 valid offset, <0 error */ 226 - struct bpf_map_value_off *kptr_off_tab; 227 - int timer_off; /* >=0 valid offset, <0 error */ 228 217 u32 id; 218 + struct btf_record *record; 229 219 int numa_node; 230 220 u32 btf_key_type_id; 231 221 u32 btf_value_type_id; ··· 233 
227 struct obj_cgroup *objcg; 234 228 #endif 235 229 char name[BPF_OBJ_NAME_LEN]; 236 - struct bpf_map_off_arr *off_arr; 230 + struct btf_field_offs *field_offs; 237 231 /* The 3rd and 4th cacheline with misc members to avoid false sharing 238 232 * particularly with refcounting. 239 233 */ ··· 257 251 bool frozen; /* write-once; write-protected by freeze_mutex */ 258 252 }; 259 253 260 - static inline bool map_value_has_spin_lock(const struct bpf_map *map) 254 + static inline const char *btf_field_type_name(enum btf_field_type type) 261 255 { 262 - return map->spin_lock_off >= 0; 256 + switch (type) { 257 + case BPF_SPIN_LOCK: 258 + return "bpf_spin_lock"; 259 + case BPF_TIMER: 260 + return "bpf_timer"; 261 + case BPF_KPTR_UNREF: 262 + case BPF_KPTR_REF: 263 + return "kptr"; 264 + default: 265 + WARN_ON_ONCE(1); 266 + return "unknown"; 267 + } 263 268 } 264 269 265 - static inline bool map_value_has_timer(const struct bpf_map *map) 270 + static inline u32 btf_field_type_size(enum btf_field_type type) 266 271 { 267 - return map->timer_off >= 0; 272 + switch (type) { 273 + case BPF_SPIN_LOCK: 274 + return sizeof(struct bpf_spin_lock); 275 + case BPF_TIMER: 276 + return sizeof(struct bpf_timer); 277 + case BPF_KPTR_UNREF: 278 + case BPF_KPTR_REF: 279 + return sizeof(u64); 280 + default: 281 + WARN_ON_ONCE(1); 282 + return 0; 283 + } 268 284 } 269 285 270 - static inline bool map_value_has_kptrs(const struct bpf_map *map) 286 + static inline u32 btf_field_type_align(enum btf_field_type type) 271 287 { 272 - return !IS_ERR_OR_NULL(map->kptr_off_tab); 288 + switch (type) { 289 + case BPF_SPIN_LOCK: 290 + return __alignof__(struct bpf_spin_lock); 291 + case BPF_TIMER: 292 + return __alignof__(struct bpf_timer); 293 + case BPF_KPTR_UNREF: 294 + case BPF_KPTR_REF: 295 + return __alignof__(u64); 296 + default: 297 + WARN_ON_ONCE(1); 298 + return 0; 299 + } 300 + } 301 + 302 + static inline bool btf_record_has_field(const struct btf_record *rec, enum btf_field_type type) 303 
+ { 304 + if (IS_ERR_OR_NULL(rec)) 305 + return false; 306 + return rec->field_mask & type; 273 307 } 274 308 275 309 static inline void check_and_init_map_value(struct bpf_map *map, void *dst) 276 310 { 277 - if (unlikely(map_value_has_spin_lock(map))) 278 - memset(dst + map->spin_lock_off, 0, sizeof(struct bpf_spin_lock)); 279 - if (unlikely(map_value_has_timer(map))) 280 - memset(dst + map->timer_off, 0, sizeof(struct bpf_timer)); 281 - if (unlikely(map_value_has_kptrs(map))) { 282 - struct bpf_map_value_off *tab = map->kptr_off_tab; 311 + if (!IS_ERR_OR_NULL(map->record)) { 312 + struct btf_field *fields = map->record->fields; 313 + u32 cnt = map->record->cnt; 283 314 int i; 284 315 285 - for (i = 0; i < tab->nr_off; i++) 286 - *(u64 *)(dst + tab->off[i].offset) = 0; 316 + for (i = 0; i < cnt; i++) 317 + memset(dst + fields[i].offset, 0, btf_field_type_size(fields[i].type)); 287 318 } 288 319 } 289 320 ··· 341 298 } 342 299 343 300 /* copy everything but bpf_spin_lock, bpf_timer, and kptrs. There could be one of each. 
*/ 344 - static inline void __copy_map_value(struct bpf_map *map, void *dst, void *src, bool long_memcpy)
301 + static inline void bpf_obj_memcpy(struct btf_field_offs *foffs,
302 + void *dst, void *src, u32 size,
303 + bool long_memcpy)
345 304 {
346 305 u32 curr_off = 0;
347 306 int i;
348 307
349 - if (likely(!map->off_arr)) {
308 + if (likely(!foffs)) {
350 309 if (long_memcpy)
351 - bpf_long_memcpy(dst, src, round_up(map->value_size, 8));
310 + bpf_long_memcpy(dst, src, round_up(size, 8));
352 311 else
353 - memcpy(dst, src, map->value_size);
312 + memcpy(dst, src, size);
354 313 return;
355 314 }
356 315
357 - for (i = 0; i < map->off_arr->cnt; i++) {
358 - u32 next_off = map->off_arr->field_off[i];
316 + for (i = 0; i < foffs->cnt; i++) {
317 + u32 next_off = foffs->field_off[i];
318 + u32 sz = next_off - curr_off;
359 319
360 - memcpy(dst + curr_off, src + curr_off, next_off - curr_off);
361 - curr_off = next_off + map->off_arr->field_sz[i];
320 + memcpy(dst + curr_off, src + curr_off, sz);
321 + curr_off += foffs->field_sz[i] + sz;
362 322 }
363 - memcpy(dst + curr_off, src + curr_off, map->value_size - curr_off);
323 + memcpy(dst + curr_off, src + curr_off, size - curr_off);
364 324 }
365 325
366 326 static inline void copy_map_value(struct bpf_map *map, void *dst, void *src)
367 327 {
368 - __copy_map_value(map, dst, src, false);
328 + bpf_obj_memcpy(map->field_offs, dst, src, map->value_size, false);
369 329 }
370 330
371 331 static inline void copy_map_value_long(struct bpf_map *map, void *dst, void *src)
372 332 {
373 - __copy_map_value(map, dst, src, true);
333 + bpf_obj_memcpy(map->field_offs, dst, src, map->value_size, true);
374 334 }
375 335
376 - static inline void zero_map_value(struct bpf_map *map, void *dst)
336 + static inline void bpf_obj_memzero(struct btf_field_offs *foffs, void *dst, u32 size)
377 337 {
378 338 u32 curr_off = 0;
379 339 int i;
380 340
381 - if (likely(!map->off_arr)) {
382 - memset(dst, 0, map->value_size);
341 + if (likely(!foffs)) {
342 + memset(dst, 0, size);
383 343 return;
384 344 }
385 345
386 - for (i = 0; i < map->off_arr->cnt; i++) {
387 - u32 next_off = map->off_arr->field_off[i];
346 + for (i = 0; i < foffs->cnt; i++) {
347 + u32 next_off = foffs->field_off[i];
348 + u32 sz = next_off - curr_off;
388 349
389 - memset(dst + curr_off, 0, next_off - curr_off);
390 - curr_off = next_off + map->off_arr->field_sz[i];
350 + memset(dst + curr_off, 0, sz);
351 + curr_off += foffs->field_sz[i] + sz;
391 352 }
392 - memset(dst + curr_off, 0, map->value_size - curr_off);
353 + memset(dst + curr_off, 0, size - curr_off);
354 + }
355 +
356 + static inline void zero_map_value(struct bpf_map *map, void *dst)
357 + {
358 + bpf_obj_memzero(map->field_offs, dst, map->value_size);
393 359 }
394 360
395 361 void copy_map_value_locked(struct bpf_map *map, void *dst, void *src,
··· 1751 1699 void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock);
1752 1700 void bpf_map_free_id(struct bpf_map *map, bool do_idr_lock);
1753 1701
1754 - struct bpf_map_value_off_desc *bpf_map_kptr_off_contains(struct bpf_map *map, u32 offset);
1755 - void bpf_map_free_kptr_off_tab(struct bpf_map *map);
1756 - struct bpf_map_value_off *bpf_map_copy_kptr_off_tab(const struct bpf_map *map);
1757 - bool bpf_map_equal_kptr_off_tab(const struct bpf_map *map_a, const struct bpf_map *map_b);
1758 - void bpf_map_free_kptrs(struct bpf_map *map, void *map_value);
1702 + struct btf_field *btf_record_find(const struct btf_record *rec,
1703 + u32 offset, enum btf_field_type type);
1704 + void btf_record_free(struct btf_record *rec);
1705 + void bpf_map_free_record(struct bpf_map *map);
1706 + struct btf_record *btf_record_dup(const struct btf_record *rec);
1707 + bool btf_record_equal(const struct btf_record *rec_a, const struct btf_record *rec_b);
1708 + void bpf_obj_free_timer(const struct btf_record *rec, void *obj);
1709 + void bpf_obj_free_fields(const struct btf_record *rec, void *obj);
1759 1710
1760 1711 struct bpf_map *bpf_map_get(u32 ufd);
1761 1712 struct bpf_map *bpf_map_get_with_uref(u32 ufd);
+8 -2
include/linux/btf.h
··· 163 163 u32 expected_offset, u32 expected_size); 164 164 int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t); 165 165 int btf_find_timer(const struct btf *btf, const struct btf_type *t); 166 - struct bpf_map_value_off *btf_parse_kptrs(const struct btf *btf, 167 - const struct btf_type *t); 166 + struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type *t, 167 + u32 field_mask, u32 value_size); 168 + struct btf_field_offs *btf_parse_field_offs(struct btf_record *rec); 168 169 bool btf_type_is_void(const struct btf_type *t); 169 170 s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind); 170 171 const struct btf_type *btf_type_skip_modifiers(const struct btf *btf, ··· 287 286 static inline bool btf_type_is_typedef(const struct btf_type *t) 288 287 { 289 288 return BTF_INFO_KIND(t->info) == BTF_KIND_TYPEDEF; 289 + } 290 + 291 + static inline bool btf_type_is_volatile(const struct btf_type *t) 292 + { 293 + return BTF_INFO_KIND(t->info) == BTF_KIND_VOLATILE; 290 294 } 291 295 292 296 static inline bool btf_type_is_func(const struct btf_type *t)
+1
include/uapi/linux/bpf.h
··· 6445 6445 * the outgoing header has not 6446 6446 * been written yet. 6447 6447 */ 6448 + __u64 skb_hwtstamp; 6448 6449 }; 6449 6450 6450 6451 /* Definitions for bpf_sock_ops_cb_flags */
+11 -19
kernel/bpf/arraymap.c
··· 306 306 return 0; 307 307 } 308 308 309 - static void check_and_free_fields(struct bpf_array *arr, void *val) 310 - { 311 - if (map_value_has_timer(&arr->map)) 312 - bpf_timer_cancel_and_free(val + arr->map.timer_off); 313 - if (map_value_has_kptrs(&arr->map)) 314 - bpf_map_free_kptrs(&arr->map, val); 315 - } 316 - 317 309 /* Called from syscall or from eBPF program */ 318 310 static int array_map_update_elem(struct bpf_map *map, void *key, void *value, 319 311 u64 map_flags) ··· 327 335 return -EEXIST; 328 336 329 337 if (unlikely((map_flags & BPF_F_LOCK) && 330 - !map_value_has_spin_lock(map))) 338 + !btf_record_has_field(map->record, BPF_SPIN_LOCK))) 331 339 return -EINVAL; 332 340 333 341 if (array->map.map_type == BPF_MAP_TYPE_PERCPU_ARRAY) { 334 342 val = this_cpu_ptr(array->pptrs[index & array->index_mask]); 335 343 copy_map_value(map, val, value); 336 - check_and_free_fields(array, val); 344 + bpf_obj_free_fields(array->map.record, val); 337 345 } else { 338 346 val = array->value + 339 347 (u64)array->elem_size * (index & array->index_mask); ··· 341 349 copy_map_value_locked(map, val, value, false); 342 350 else 343 351 copy_map_value(map, val, value); 344 - check_and_free_fields(array, val); 352 + bpf_obj_free_fields(array->map.record, val); 345 353 } 346 354 return 0; 347 355 } ··· 378 386 pptr = array->pptrs[index & array->index_mask]; 379 387 for_each_possible_cpu(cpu) { 380 388 copy_map_value_long(map, per_cpu_ptr(pptr, cpu), value + off); 381 - check_and_free_fields(array, per_cpu_ptr(pptr, cpu)); 389 + bpf_obj_free_fields(array->map.record, per_cpu_ptr(pptr, cpu)); 382 390 off += size; 383 391 } 384 392 rcu_read_unlock(); ··· 401 409 struct bpf_array *array = container_of(map, struct bpf_array, map); 402 410 int i; 403 411 404 - /* We don't reset or free kptr on uref dropping to zero. */ 405 - if (!map_value_has_timer(map)) 412 + /* We don't reset or free fields other than timer on uref dropping to zero. 
*/ 413 + if (!btf_record_has_field(map->record, BPF_TIMER)) 406 414 return; 407 415 408 416 for (i = 0; i < array->map.max_entries; i++) 409 - bpf_timer_cancel_and_free(array_map_elem_ptr(array, i) + map->timer_off); 417 + bpf_obj_free_timer(map->record, array_map_elem_ptr(array, i)); 410 418 } 411 419 412 420 /* Called when map->refcnt goes to zero, either from workqueue or from syscall */ ··· 415 423 struct bpf_array *array = container_of(map, struct bpf_array, map); 416 424 int i; 417 425 418 - if (map_value_has_kptrs(map)) { 426 + if (!IS_ERR_OR_NULL(map->record)) { 419 427 if (array->map.map_type == BPF_MAP_TYPE_PERCPU_ARRAY) { 420 428 for (i = 0; i < array->map.max_entries; i++) { 421 429 void __percpu *pptr = array->pptrs[i & array->index_mask]; 422 430 int cpu; 423 431 424 432 for_each_possible_cpu(cpu) { 425 - bpf_map_free_kptrs(map, per_cpu_ptr(pptr, cpu)); 433 + bpf_obj_free_fields(map->record, per_cpu_ptr(pptr, cpu)); 426 434 cond_resched(); 427 435 } 428 436 } 429 437 } else { 430 438 for (i = 0; i < array->map.max_entries; i++) 431 - bpf_map_free_kptrs(map, array_map_elem_ptr(array, i)); 439 + bpf_obj_free_fields(map->record, array_map_elem_ptr(array, i)); 432 440 } 433 - bpf_map_free_kptr_off_tab(map); 441 + bpf_map_free_record(map); 434 442 } 435 443 436 444 if (array->map.map_type == BPF_MAP_TYPE_PERCPU_ARRAY)
+1 -1
kernel/bpf/bpf_local_storage.c
··· 382 382 if (unlikely((map_flags & ~BPF_F_LOCK) > BPF_EXIST) || 383 383 /* BPF_F_LOCK can only be used in a value with spin_lock */ 384 384 unlikely((map_flags & BPF_F_LOCK) && 385 - !map_value_has_spin_lock(&smap->map))) 385 + !btf_record_has_field(smap->map.record, BPF_SPIN_LOCK))) 386 386 return ERR_PTR(-EINVAL); 387 387 388 388 if (gfp_flags == GFP_KERNEL && (map_flags & ~BPF_F_LOCK) != BPF_NOEXIST)
+259 -175
kernel/bpf/btf.c
··· 3191 3191 btf_verifier_log(env, "size=%u vlen=%u", t->size, btf_type_vlen(t)); 3192 3192 } 3193 3193 3194 - enum btf_field_type { 3194 + enum btf_field_info_type { 3195 3195 BTF_FIELD_SPIN_LOCK, 3196 3196 BTF_FIELD_TIMER, 3197 3197 BTF_FIELD_KPTR, ··· 3203 3203 }; 3204 3204 3205 3205 struct btf_field_info { 3206 - u32 type_id; 3206 + enum btf_field_type type; 3207 3207 u32 off; 3208 - enum bpf_kptr_type type; 3208 + struct { 3209 + u32 type_id; 3210 + } kptr; 3209 3211 }; 3210 3212 3211 3213 static int btf_find_struct(const struct btf *btf, const struct btf_type *t, 3212 - u32 off, int sz, struct btf_field_info *info) 3214 + u32 off, int sz, enum btf_field_type field_type, 3215 + struct btf_field_info *info) 3213 3216 { 3214 3217 if (!__btf_type_is_struct(t)) 3215 3218 return BTF_FIELD_IGNORE; 3216 3219 if (t->size != sz) 3217 3220 return BTF_FIELD_IGNORE; 3221 + info->type = field_type; 3218 3222 info->off = off; 3219 3223 return BTF_FIELD_FOUND; 3220 3224 } ··· 3226 3222 static int btf_find_kptr(const struct btf *btf, const struct btf_type *t, 3227 3223 u32 off, int sz, struct btf_field_info *info) 3228 3224 { 3229 - enum bpf_kptr_type type; 3225 + enum btf_field_type type; 3230 3226 u32 res_id; 3231 3227 3228 + /* Permit modifiers on the pointer itself */ 3229 + if (btf_type_is_volatile(t)) 3230 + t = btf_type_by_id(btf, t->type); 3232 3231 /* For PTR, sz is always == 8 */ 3233 3232 if (!btf_type_is_ptr(t)) 3234 3233 return BTF_FIELD_IGNORE; ··· 3255 3248 if (!__btf_type_is_struct(t)) 3256 3249 return -EINVAL; 3257 3250 3258 - info->type_id = res_id; 3259 - info->off = off; 3260 3251 info->type = type; 3252 + info->off = off; 3253 + info->kptr.type_id = res_id; 3261 3254 return BTF_FIELD_FOUND; 3262 3255 } 3263 3256 3264 - static int btf_find_struct_field(const struct btf *btf, const struct btf_type *t, 3265 - const char *name, int sz, int align, 3266 - enum btf_field_type field_type, 3257 + static int btf_get_field_type(const char *name, u32 field_mask, u32 
*seen_mask, 3258 + int *align, int *sz) 3259 + { 3260 + int type = 0; 3261 + 3262 + if (field_mask & BPF_SPIN_LOCK) { 3263 + if (!strcmp(name, "bpf_spin_lock")) { 3264 + if (*seen_mask & BPF_SPIN_LOCK) 3265 + return -E2BIG; 3266 + *seen_mask |= BPF_SPIN_LOCK; 3267 + type = BPF_SPIN_LOCK; 3268 + goto end; 3269 + } 3270 + } 3271 + if (field_mask & BPF_TIMER) { 3272 + if (!strcmp(name, "bpf_timer")) { 3273 + if (*seen_mask & BPF_TIMER) 3274 + return -E2BIG; 3275 + *seen_mask |= BPF_TIMER; 3276 + type = BPF_TIMER; 3277 + goto end; 3278 + } 3279 + } 3280 + /* Only return BPF_KPTR when all other types with matchable names fail */ 3281 + if (field_mask & BPF_KPTR) { 3282 + type = BPF_KPTR_REF; 3283 + goto end; 3284 + } 3285 + return 0; 3286 + end: 3287 + *sz = btf_field_type_size(type); 3288 + *align = btf_field_type_align(type); 3289 + return type; 3290 + } 3291 + 3292 + static int btf_find_struct_field(const struct btf *btf, 3293 + const struct btf_type *t, u32 field_mask, 3267 3294 struct btf_field_info *info, int info_cnt) 3268 3295 { 3296 + int ret, idx = 0, align, sz, field_type; 3269 3297 const struct btf_member *member; 3270 3298 struct btf_field_info tmp; 3271 - int ret, idx = 0; 3272 - u32 i, off; 3299 + u32 i, off, seen_mask = 0; 3273 3300 3274 3301 for_each_member(i, t, member) { 3275 3302 const struct btf_type *member_type = btf_type_by_id(btf, 3276 3303 member->type); 3277 3304 3278 - if (name && strcmp(__btf_name_by_offset(btf, member_type->name_off), name)) 3305 + field_type = btf_get_field_type(__btf_name_by_offset(btf, member_type->name_off), 3306 + field_mask, &seen_mask, &align, &sz); 3307 + if (field_type == 0) 3279 3308 continue; 3309 + if (field_type < 0) 3310 + return field_type; 3280 3311 3281 3312 off = __btf_member_bit_offset(t, member); 3282 3313 if (off % 8) ··· 3322 3277 return -EINVAL; 3323 3278 off /= 8; 3324 3279 if (off % align) 3325 - return -EINVAL; 3280 + continue; 3326 3281 3327 3282 switch (field_type) { 3328 - case 
BTF_FIELD_SPIN_LOCK: 3329 - case BTF_FIELD_TIMER: 3330 - ret = btf_find_struct(btf, member_type, off, sz, 3283 + case BPF_SPIN_LOCK: 3284 + case BPF_TIMER: 3285 + ret = btf_find_struct(btf, member_type, off, sz, field_type, 3331 3286 idx < info_cnt ? &info[idx] : &tmp); 3332 3287 if (ret < 0) 3333 3288 return ret; 3334 3289 break; 3335 - case BTF_FIELD_KPTR: 3290 + case BPF_KPTR_UNREF: 3291 + case BPF_KPTR_REF: 3336 3292 ret = btf_find_kptr(btf, member_type, off, sz, 3337 3293 idx < info_cnt ? &info[idx] : &tmp); 3338 3294 if (ret < 0) ··· 3353 3307 } 3354 3308 3355 3309 static int btf_find_datasec_var(const struct btf *btf, const struct btf_type *t, 3356 - const char *name, int sz, int align, 3357 - enum btf_field_type field_type, 3358 - struct btf_field_info *info, int info_cnt) 3310 + u32 field_mask, struct btf_field_info *info, 3311 + int info_cnt) 3359 3312 { 3313 + int ret, idx = 0, align, sz, field_type; 3360 3314 const struct btf_var_secinfo *vsi; 3361 3315 struct btf_field_info tmp; 3362 - int ret, idx = 0; 3363 - u32 i, off; 3316 + u32 i, off, seen_mask = 0; 3364 3317 3365 3318 for_each_vsi(i, t, vsi) { 3366 3319 const struct btf_type *var = btf_type_by_id(btf, vsi->type); 3367 3320 const struct btf_type *var_type = btf_type_by_id(btf, var->type); 3368 3321 3369 - off = vsi->offset; 3370 - 3371 - if (name && strcmp(__btf_name_by_offset(btf, var_type->name_off), name)) 3322 + field_type = btf_get_field_type(__btf_name_by_offset(btf, var_type->name_off), 3323 + field_mask, &seen_mask, &align, &sz); 3324 + if (field_type == 0) 3372 3325 continue; 3326 + if (field_type < 0) 3327 + return field_type; 3328 + 3329 + off = vsi->offset; 3373 3330 if (vsi->size != sz) 3374 3331 continue; 3375 3332 if (off % align) 3376 - return -EINVAL; 3333 + continue; 3377 3334 3378 3335 switch (field_type) { 3379 - case BTF_FIELD_SPIN_LOCK: 3380 - case BTF_FIELD_TIMER: 3381 - ret = btf_find_struct(btf, var_type, off, sz, 3336 + case BPF_SPIN_LOCK: 3337 + case BPF_TIMER: 3338 + 
ret = btf_find_struct(btf, var_type, off, sz, field_type, 3382 3339 idx < info_cnt ? &info[idx] : &tmp); 3383 3340 if (ret < 0) 3384 3341 return ret; 3385 3342 break; 3386 - case BTF_FIELD_KPTR: 3343 + case BPF_KPTR_UNREF: 3344 + case BPF_KPTR_REF: 3387 3345 ret = btf_find_kptr(btf, var_type, off, sz, 3388 3346 idx < info_cnt ? &info[idx] : &tmp); 3389 3347 if (ret < 0) ··· 3407 3357 } 3408 3358 3409 3359 static int btf_find_field(const struct btf *btf, const struct btf_type *t, 3410 - enum btf_field_type field_type, 3411 - struct btf_field_info *info, int info_cnt) 3360 + u32 field_mask, struct btf_field_info *info, 3361 + int info_cnt) 3412 3362 { 3413 - const char *name; 3414 - int sz, align; 3415 - 3416 - switch (field_type) { 3417 - case BTF_FIELD_SPIN_LOCK: 3418 - name = "bpf_spin_lock"; 3419 - sz = sizeof(struct bpf_spin_lock); 3420 - align = __alignof__(struct bpf_spin_lock); 3421 - break; 3422 - case BTF_FIELD_TIMER: 3423 - name = "bpf_timer"; 3424 - sz = sizeof(struct bpf_timer); 3425 - align = __alignof__(struct bpf_timer); 3426 - break; 3427 - case BTF_FIELD_KPTR: 3428 - name = NULL; 3429 - sz = sizeof(u64); 3430 - align = 8; 3431 - break; 3432 - default: 3433 - return -EFAULT; 3434 - } 3435 - 3436 3363 if (__btf_type_is_struct(t)) 3437 - return btf_find_struct_field(btf, t, name, sz, align, field_type, info, info_cnt); 3364 + return btf_find_struct_field(btf, t, field_mask, info, info_cnt); 3438 3365 else if (btf_type_is_datasec(t)) 3439 - return btf_find_datasec_var(btf, t, name, sz, align, field_type, info, info_cnt); 3366 + return btf_find_datasec_var(btf, t, field_mask, info, info_cnt); 3440 3367 return -EINVAL; 3441 3368 } 3442 3369 3443 - /* find 'struct bpf_spin_lock' in map value. 
3444 - * return >= 0 offset if found 3445 - * and < 0 in case of error 3446 - */ 3447 - int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t) 3370 + static int btf_parse_kptr(const struct btf *btf, struct btf_field *field, 3371 + struct btf_field_info *info) 3448 3372 { 3449 - struct btf_field_info info; 3450 - int ret; 3451 - 3452 - ret = btf_find_field(btf, t, BTF_FIELD_SPIN_LOCK, &info, 1); 3453 - if (ret < 0) 3454 - return ret; 3455 - if (!ret) 3456 - return -ENOENT; 3457 - return info.off; 3458 - } 3459 - 3460 - int btf_find_timer(const struct btf *btf, const struct btf_type *t) 3461 - { 3462 - struct btf_field_info info; 3463 - int ret; 3464 - 3465 - ret = btf_find_field(btf, t, BTF_FIELD_TIMER, &info, 1); 3466 - if (ret < 0) 3467 - return ret; 3468 - if (!ret) 3469 - return -ENOENT; 3470 - return info.off; 3471 - } 3472 - 3473 - struct bpf_map_value_off *btf_parse_kptrs(const struct btf *btf, 3474 - const struct btf_type *t) 3475 - { 3476 - struct btf_field_info info_arr[BPF_MAP_VALUE_OFF_MAX]; 3477 - struct bpf_map_value_off *tab; 3478 - struct btf *kernel_btf = NULL; 3479 3373 struct module *mod = NULL; 3480 - int ret, i, nr_off; 3374 + const struct btf_type *t; 3375 + struct btf *kernel_btf; 3376 + int ret; 3377 + s32 id; 3481 3378 3482 - ret = btf_find_field(btf, t, BTF_FIELD_KPTR, info_arr, ARRAY_SIZE(info_arr)); 3379 + /* Find type in map BTF, and use it to look up the matching type 3380 + * in vmlinux or module BTFs, by name and kind. 3381 + */ 3382 + t = btf_type_by_id(btf, info->kptr.type_id); 3383 + id = bpf_find_btf_id(__btf_name_by_offset(btf, t->name_off), BTF_INFO_KIND(t->info), 3384 + &kernel_btf); 3385 + if (id < 0) 3386 + return id; 3387 + 3388 + /* Find and stash the function pointer for the destruction function that 3389 + * needs to be eventually invoked from the map free path. 
3390 + */ 3391 + if (info->type == BPF_KPTR_REF) { 3392 + const struct btf_type *dtor_func; 3393 + const char *dtor_func_name; 3394 + unsigned long addr; 3395 + s32 dtor_btf_id; 3396 + 3397 + /* This call also serves as a whitelist of allowed objects that 3398 + * can be used as a referenced pointer and be stored in a map at 3399 + * the same time. 3400 + */ 3401 + dtor_btf_id = btf_find_dtor_kfunc(kernel_btf, id); 3402 + if (dtor_btf_id < 0) { 3403 + ret = dtor_btf_id; 3404 + goto end_btf; 3405 + } 3406 + 3407 + dtor_func = btf_type_by_id(kernel_btf, dtor_btf_id); 3408 + if (!dtor_func) { 3409 + ret = -ENOENT; 3410 + goto end_btf; 3411 + } 3412 + 3413 + if (btf_is_module(kernel_btf)) { 3414 + mod = btf_try_get_module(kernel_btf); 3415 + if (!mod) { 3416 + ret = -ENXIO; 3417 + goto end_btf; 3418 + } 3419 + } 3420 + 3421 + /* We already verified dtor_func to be btf_type_is_func 3422 + * in register_btf_id_dtor_kfuncs. 3423 + */ 3424 + dtor_func_name = __btf_name_by_offset(kernel_btf, dtor_func->name_off); 3425 + addr = kallsyms_lookup_name(dtor_func_name); 3426 + if (!addr) { 3427 + ret = -EINVAL; 3428 + goto end_mod; 3429 + } 3430 + field->kptr.dtor = (void *)addr; 3431 + } 3432 + 3433 + field->kptr.btf_id = id; 3434 + field->kptr.btf = kernel_btf; 3435 + field->kptr.module = mod; 3436 + return 0; 3437 + end_mod: 3438 + module_put(mod); 3439 + end_btf: 3440 + btf_put(kernel_btf); 3441 + return ret; 3442 + } 3443 + 3444 + struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type *t, 3445 + u32 field_mask, u32 value_size) 3446 + { 3447 + struct btf_field_info info_arr[BTF_FIELDS_MAX]; 3448 + struct btf_record *rec; 3449 + int ret, i, cnt; 3450 + 3451 + ret = btf_find_field(btf, t, field_mask, info_arr, ARRAY_SIZE(info_arr)); 3483 3452 if (ret < 0) 3484 3453 return ERR_PTR(ret); 3485 3454 if (!ret) 3486 3455 return NULL; 3487 3456 3488 - nr_off = ret; 3489 - tab = kzalloc(offsetof(struct bpf_map_value_off, off[nr_off]), GFP_KERNEL | 
__GFP_NOWARN); 3490 - if (!tab) 3457 + cnt = ret; 3458 + rec = kzalloc(offsetof(struct btf_record, fields[cnt]), GFP_KERNEL | __GFP_NOWARN); 3459 + if (!rec) 3491 3460 return ERR_PTR(-ENOMEM); 3492 3461 3493 - for (i = 0; i < nr_off; i++) { 3494 - const struct btf_type *t; 3495 - s32 id; 3496 - 3497 - /* Find type in map BTF, and use it to look up the matching type 3498 - * in vmlinux or module BTFs, by name and kind. 3499 - */ 3500 - t = btf_type_by_id(btf, info_arr[i].type_id); 3501 - id = bpf_find_btf_id(__btf_name_by_offset(btf, t->name_off), BTF_INFO_KIND(t->info), 3502 - &kernel_btf); 3503 - if (id < 0) { 3504 - ret = id; 3462 + rec->spin_lock_off = -EINVAL; 3463 + rec->timer_off = -EINVAL; 3464 + for (i = 0; i < cnt; i++) { 3465 + if (info_arr[i].off + btf_field_type_size(info_arr[i].type) > value_size) { 3466 + WARN_ONCE(1, "verifier bug off %d size %d", info_arr[i].off, value_size); 3467 + ret = -EFAULT; 3505 3468 goto end; 3506 3469 } 3507 3470 3508 - /* Find and stash the function pointer for the destruction function that 3509 - * needs to be eventually invoked from the map free path. 3510 - */ 3511 - if (info_arr[i].type == BPF_KPTR_REF) { 3512 - const struct btf_type *dtor_func; 3513 - const char *dtor_func_name; 3514 - unsigned long addr; 3515 - s32 dtor_btf_id; 3471 + rec->field_mask |= info_arr[i].type; 3472 + rec->fields[i].offset = info_arr[i].off; 3473 + rec->fields[i].type = info_arr[i].type; 3516 3474 3517 - /* This call also serves as a whitelist of allowed objects that 3518 - * can be used as a referenced pointer and be stored in a map at 3519 - * the same time. 
3520 -          */
3521 -         dtor_btf_id = btf_find_dtor_kfunc(kernel_btf, id);
3522 -         if (dtor_btf_id < 0) {
3523 -                 ret = dtor_btf_id;
3524 -                 goto end_btf;
3525 -         }
3526 -
3527 -         dtor_func = btf_type_by_id(kernel_btf, dtor_btf_id);
3528 -         if (!dtor_func) {
3529 -                 ret = -ENOENT;
3530 -                 goto end_btf;
3531 -         }
3532 -
3533 -         if (btf_is_module(kernel_btf)) {
3534 -                 mod = btf_try_get_module(kernel_btf);
3535 -                 if (!mod) {
3536 -                         ret = -ENXIO;
3537 -                         goto end_btf;
3538 -                 }
3539 -         }
3540 -
3541 -         /* We already verified dtor_func to be btf_type_is_func
3542 -          * in register_btf_id_dtor_kfuncs.
3543 -          */
3544 -         dtor_func_name = __btf_name_by_offset(kernel_btf, dtor_func->name_off);
3545 -         addr = kallsyms_lookup_name(dtor_func_name);
3546 -         if (!addr) {
3547 -                 ret = -EINVAL;
3548 -                 goto end_mod;
3549 -         }
3550 -         tab->off[i].kptr.dtor = (void *)addr;
3475 +         switch (info_arr[i].type) {
3476 +         case BPF_SPIN_LOCK:
3477 +                 WARN_ON_ONCE(rec->spin_lock_off >= 0);
3478 +                 /* Cache offset for faster lookup at runtime */
3479 +                 rec->spin_lock_off = rec->fields[i].offset;
3480 +                 break;
3481 +         case BPF_TIMER:
3482 +                 WARN_ON_ONCE(rec->timer_off >= 0);
3483 +                 /* Cache offset for faster lookup at runtime */
3484 +                 rec->timer_off = rec->fields[i].offset;
3485 +                 break;
3486 +         case BPF_KPTR_UNREF:
3487 +         case BPF_KPTR_REF:
3488 +                 ret = btf_parse_kptr(btf, &rec->fields[i], &info_arr[i]);
3489 +                 if (ret < 0)
3490 +                         goto end;
3491 +                 break;
3492 +         default:
3493 +                 ret = -EFAULT;
3494 +                 goto end;
3551 3495    }
3552 -
3553 -         tab->off[i].offset = info_arr[i].off;
3554 -         tab->off[i].type = info_arr[i].type;
3555 -         tab->off[i].kptr.btf_id = id;
3556 -         tab->off[i].kptr.btf = kernel_btf;
3557 -         tab->off[i].kptr.module = mod;
3496 +         rec->cnt++;
3558 3497  }
3559 -         tab->nr_off = nr_off;
3560 -         return tab;
3561 - end_mod:
3562 -         module_put(mod);
3563 - end_btf:
3564 -         btf_put(kernel_btf);
3498 + return rec;
3565 3499 end:
3566 -         while (i--) {
3567 -                 btf_put(tab->off[i].kptr.btf);
3568 -                 if (tab->off[i].kptr.module)
3569 -                         module_put(tab->off[i].kptr.module);
3570 -         }
3571 -         kfree(tab);
3500 + btf_record_free(rec);
3572 3501  return ERR_PTR(ret);
3502 + }
3503 +
3504 + static int btf_field_offs_cmp(const void *_a, const void *_b, const void *priv)
3505 + {
3506 +         const u32 a = *(const u32 *)_a;
3507 +         const u32 b = *(const u32 *)_b;
3508 +
3509 +         if (a < b)
3510 +                 return -1;
3511 +         else if (a > b)
3512 +                 return 1;
3513 +         return 0;
3514 + }
3515 +
3516 + static void btf_field_offs_swap(void *_a, void *_b, int size, const void *priv)
3517 + {
3518 +         struct btf_field_offs *foffs = (void *)priv;
3519 +         u32 *off_base = foffs->field_off;
3520 +         u32 *a = _a, *b = _b;
3521 +         u8 *sz_a, *sz_b;
3522 +
3523 +         sz_a = foffs->field_sz + (a - off_base);
3524 +         sz_b = foffs->field_sz + (b - off_base);
3525 +
3526 +         swap(*a, *b);
3527 +         swap(*sz_a, *sz_b);
3528 + }
3529 +
3530 + struct btf_field_offs *btf_parse_field_offs(struct btf_record *rec)
3531 + {
3532 +         struct btf_field_offs *foffs;
3533 +         u32 i, *off;
3534 +         u8 *sz;
3535 +
3536 +         BUILD_BUG_ON(ARRAY_SIZE(foffs->field_off) != ARRAY_SIZE(foffs->field_sz));
3537 +         if (IS_ERR_OR_NULL(rec) || WARN_ON_ONCE(rec->cnt > sizeof(foffs->field_off)))
3538 +                 return NULL;
3539 +
3540 +         foffs = kzalloc(sizeof(*foffs), GFP_KERNEL | __GFP_NOWARN);
3541 +         if (!foffs)
3542 +                 return ERR_PTR(-ENOMEM);
3543 +
3544 +         off = foffs->field_off;
3545 +         sz = foffs->field_sz;
3546 +         for (i = 0; i < rec->cnt; i++) {
3547 +                 off[i] = rec->fields[i].offset;
3548 +                 sz[i] = btf_field_type_size(rec->fields[i].type);
3549 +         }
3550 +         foffs->cnt = rec->cnt;
3551 +
3552 +         if (foffs->cnt == 1)
3553 +                 return foffs;
3554 +         sort_r(foffs->field_off, foffs->cnt, sizeof(foffs->field_off[0]),
3555 +                btf_field_offs_cmp, btf_field_offs_swap, foffs);
3556 +         return foffs;
3573 3557 }
3574 3558
3575 3559 static void __btf_struct_show(const struct btf *btf, const struct btf_type *t,
···
6451 6367
6452 6368  /* kptr_get is only true for kfunc */
6453 6369  if (i == 0 && kptr_get) {
6454 -         struct bpf_map_value_off_desc *off_desc;
6370 +         struct btf_field *kptr_field;
6455 6371
6456 6372          if (reg->type != PTR_TO_MAP_VALUE) {
6457 6373                  bpf_log(log, "arg#0 expected pointer to map value\n");
···
6467 6383                  return -EINVAL;
6468 6384          }
6469 6385
6470 -         off_desc = bpf_map_kptr_off_contains(reg->map_ptr, reg->off + reg->var_off.value);
6471 -         if (!off_desc || off_desc->type != BPF_KPTR_REF) {
6386 +         kptr_field = btf_record_find(reg->map_ptr->record, reg->off + reg->var_off.value, BPF_KPTR);
6387 +         if (!kptr_field || kptr_field->type != BPF_KPTR_REF) {
6472 6388                  bpf_log(log, "arg#0 no referenced kptr at map value offset=%llu\n",
6473 6389                          reg->off + reg->var_off.value);
6474 6390                  return -EINVAL;
···
6487 6403                          func_name, i, btf_type_str(ref_t), ref_tname);
6488 6404                  return -EINVAL;
6489 6405          }
6490 -         if (!btf_struct_ids_match(log, btf, ref_id, 0, off_desc->kptr.btf,
6491 -                                   off_desc->kptr.btf_id, true)) {
6406 +         if (!btf_struct_ids_match(log, btf, ref_id, 0, kptr_field->kptr.btf,
6407 +                                   kptr_field->kptr.btf_id, true)) {
6492 6408                  bpf_log(log, "kernel function %s args#%d expected pointer to %s %s\n",
6493 6409                          func_name, i, btf_type_str(ref_t), ref_tname);
6494 6410                  return -EINVAL;
+6 -3
kernel/bpf/cpumap.c
···
4 4  * Copyright (c) 2017 Jesper Dangaard Brouer, Red Hat Inc.
5 5  */
6 6
7 -  /* The 'cpumap' is primarily used as a backend map for XDP BPF helper
7 +  /**
8 +   * DOC: cpu map
9 +   * The 'cpumap' is primarily used as a backend map for XDP BPF helper
8 10   * call bpf_redirect_map() and XDP_REDIRECT action, like 'devmap'.
9 11   *
10 -  * Unlike devmap which redirects XDP frames out another NIC device,
12 +  * Unlike devmap which redirects XDP frames out to another NIC device,
11 13   * this map type redirects raw XDP frames to another CPU. The remote
12 14   * CPU will do SKB-allocation and call the normal network stack.
13 -  *
15 +  */
16 + /*
14 17   * This is a scalability and isolation mechanism, that allow
15 18   * separating the early driver network XDP layer, from the rest of the
16 19   * netstack, and assigning dedicated CPUs for this stage. This
+14 -24
kernel/bpf/hashtab.c
···
222 222  u32 num_entries = htab->map.max_entries;
223 223  int i;
224 224
225 -    if (!map_value_has_timer(&htab->map))
225 +    if (!btf_record_has_field(htab->map.record, BPF_TIMER))
226 226          return;
227 227  if (htab_has_extra_elems(htab))
228 228          num_entries += num_possible_cpus();
···
231 231          struct htab_elem *elem;
232 232
233 233          elem = get_htab_elem(htab, i);
234 -            bpf_timer_cancel_and_free(elem->key +
235 -                                      round_up(htab->map.key_size, 8) +
236 -                                      htab->map.timer_off);
234 +            bpf_obj_free_timer(htab->map.record, elem->key + round_up(htab->map.key_size, 8));
237 235          cond_resched();
238 236  }
239 237 }
240 238
241 -    static void htab_free_prealloced_kptrs(struct bpf_htab *htab)
239 +    static void htab_free_prealloced_fields(struct bpf_htab *htab)
242 240 {
243 241  u32 num_entries = htab->map.max_entries;
244 242  int i;
245 243
246 -    if (!map_value_has_kptrs(&htab->map))
244 +    if (IS_ERR_OR_NULL(htab->map.record))
247 245          return;
248 246  if (htab_has_extra_elems(htab))
249 247          num_entries += num_possible_cpus();
250 -
251 248  for (i = 0; i < num_entries; i++) {
252 249          struct htab_elem *elem;
253 250
254 251          elem = get_htab_elem(htab, i);
255 -            bpf_map_free_kptrs(&htab->map, elem->key + round_up(htab->map.key_size, 8));
252 +            bpf_obj_free_fields(htab->map.record, elem->key + round_up(htab->map.key_size, 8));
256 253          cond_resched();
257 254  }
258 255 }
···
761 764 {
762 765  void *map_value = elem->key + round_up(htab->map.key_size, 8);
763 766
764 -    if (map_value_has_timer(&htab->map))
765 -            bpf_timer_cancel_and_free(map_value + htab->map.timer_off);
766 -    if (map_value_has_kptrs(&htab->map))
767 -            bpf_map_free_kptrs(&htab->map, map_value);
767 +    bpf_obj_free_fields(htab->map.record, map_value);
768 768 }
769 769
770 770 /* It is called from the bpf_lru_list when the LRU needs to delete
···
1085 1091  head = &b->head;
1086 1092
1087 1093  if (unlikely(map_flags & BPF_F_LOCK)) {
1088 -            if (unlikely(!map_value_has_spin_lock(map)))
1094 +            if (unlikely(!btf_record_has_field(map->record, BPF_SPIN_LOCK)))
1089 1095                  return -EINVAL;
1090 1096          /* find an element without taking the bucket lock */
1091 1097          l_old = lookup_nulls_elem_raw(head, hash, key, key_size,
···
1468 1474          struct htab_elem *l;
1469 1475
1470 1476          hlist_nulls_for_each_entry(l, n, head, hash_node) {
1471 -                    /* We don't reset or free kptr on uref dropping to zero,
1472 -                     * hence just free timer.
1473 -                     */
1474 -                    bpf_timer_cancel_and_free(l->key +
1475 -                                              round_up(htab->map.key_size, 8) +
1476 -                                              htab->map.timer_off);
1477 +                    /* We only free timer on uref dropping to zero */
1478 +                    bpf_obj_free_timer(htab->map.record, l->key + round_up(htab->map.key_size, 8));
1477 1479          }
1478 1480          cond_resched_rcu();
1479 1481  }
···
1480 1490 {
1481 1491  struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
1482 1492
1483 -    /* We don't reset or free kptr on uref dropping to zero. */
1484 -    if (!map_value_has_timer(&htab->map))
1493 +    /* We only free timer on uref dropping to zero */
1494 +    if (!btf_record_has_field(htab->map.record, BPF_TIMER))
1485 1495          return;
1486 1496  if (!htab_is_prealloc(htab))
1487 1497          htab_free_malloced_timers(htab);
···
1507 1517  if (!htab_is_prealloc(htab)) {
1508 1518          delete_all_elements(htab);
1509 1519  } else {
1510 -            htab_free_prealloced_kptrs(htab);
1520 +            htab_free_prealloced_fields(htab);
1511 1521          prealloc_destroy(htab);
1512 1522  }
1513 1523
1514 -    bpf_map_free_kptr_off_tab(map);
1524 +    bpf_map_free_record(map);
1515 1525  free_percpu(htab->extra_elems);
1516 1526  bpf_map_area_free(htab->buckets);
1517 1527  bpf_mem_alloc_destroy(&htab->pcpu_ma);
···
1665 1675
1666 1676  elem_map_flags = attr->batch.elem_flags;
1667 1677  if ((elem_map_flags & ~BPF_F_LOCK) ||
1668 -        ((elem_map_flags & BPF_F_LOCK) && !map_value_has_spin_lock(map)))
1678 +        ((elem_map_flags & BPF_F_LOCK) && !btf_record_has_field(map->record, BPF_SPIN_LOCK)))
1669 1679          return -EINVAL;
1670 1680
1671 1681  map_flags = attr->batch.flags;
+3 -3
kernel/bpf/helpers.c
···
366 366  struct bpf_spin_lock *lock;
367 367
368 368  if (lock_src)
369 -            lock = src + map->spin_lock_off;
369 +            lock = src + map->record->spin_lock_off;
370 370  else
371 -            lock = dst + map->spin_lock_off;
371 +            lock = dst + map->record->spin_lock_off;
372 372  preempt_disable();
373 373  __bpf_spin_lock_irqsave(lock);
374 374  copy_map_value(map, dst, src);
···
1169 1169          ret = -ENOMEM;
1170 1170          goto out;
1171 1171  }
1172 -    t->value = (void *)timer - map->timer_off;
1172 +    t->value = (void *)timer - map->record->timer_off;
1173 1173  t->map = map;
1174 1174  t->prog = NULL;
1175 1175  rcu_assign_pointer(t->callback_fn, NULL);
+1 -1
kernel/bpf/local_storage.c
···
151 151          return -EINVAL;
152 152
153 153  if (unlikely((flags & BPF_F_LOCK) &&
154 -                 !map_value_has_spin_lock(map)))
154 +                 !btf_record_has_field(map->record, BPF_SPIN_LOCK)))
155 155          return -EINVAL;
156 156
157 157  storage = cgroup_storage_lookup((struct bpf_cgroup_storage_map *)map,
+12 -7
kernel/bpf/map_in_map.c
···
29 29          return ERR_PTR(-ENOTSUPP);
30 30  }
31 31
32 -    if (map_value_has_spin_lock(inner_map)) {
32 +    if (btf_record_has_field(inner_map->record, BPF_SPIN_LOCK)) {
33 33          fdput(f);
34 34          return ERR_PTR(-ENOTSUPP);
35 35  }
···
50 50  inner_map_meta->value_size = inner_map->value_size;
51 51  inner_map_meta->map_flags = inner_map->map_flags;
52 52  inner_map_meta->max_entries = inner_map->max_entries;
53 -    inner_map_meta->spin_lock_off = inner_map->spin_lock_off;
54 -    inner_map_meta->timer_off = inner_map->timer_off;
55 -    inner_map_meta->kptr_off_tab = bpf_map_copy_kptr_off_tab(inner_map);
53 +    inner_map_meta->record = btf_record_dup(inner_map->record);
54 +    if (IS_ERR(inner_map_meta->record)) {
55 +            /* btf_record_dup returns NULL or valid pointer in case of
56 +             * invalid/empty/valid, but ERR_PTR in case of errors. During
57 +             * equality NULL or IS_ERR is equivalent.
58 +             */
59 +            fdput(f);
60 +            return ERR_CAST(inner_map_meta->record);
61 +    }
56 62  if (inner_map->btf) {
57 63          btf_get(inner_map->btf);
58 64          inner_map_meta->btf = inner_map->btf;
···
78 72
79 73 void bpf_map_meta_free(struct bpf_map *map_meta)
80 74 {
81 -    bpf_map_free_kptr_off_tab(map_meta);
75 +    bpf_map_free_record(map_meta);
82 76  btf_put(map_meta->btf);
83 77  kfree(map_meta);
84 78 }
···
90 84  return meta0->map_type == meta1->map_type &&
91 85         meta0->key_size == meta1->key_size &&
92 86         meta0->value_size == meta1->value_size &&
93 -           meta0->timer_off == meta1->timer_off &&
94 87         meta0->map_flags == meta1->map_flags &&
95 -           bpf_map_equal_kptr_off_tab(meta0, meta1);
88 +           btf_record_equal(meta0->record, meta1->record);
96 89 }
97 90
98 91 void *bpf_map_fd_get_ptr(struct bpf_map *map,
+192 -227
kernel/bpf/syscall.c
···
495 495 }
496 496 #endif
497 497
498 -  static int bpf_map_kptr_off_cmp(const void *a, const void *b)
498 +  static int btf_field_cmp(const void *a, const void *b)
499 499 {
500 -    const struct bpf_map_value_off_desc *off_desc1 = a, *off_desc2 = b;
500 +    const struct btf_field *f1 = a, *f2 = b;
501 501
502 -    if (off_desc1->offset < off_desc2->offset)
502 +    if (f1->offset < f2->offset)
503 503          return -1;
504 -    else if (off_desc1->offset > off_desc2->offset)
504 +    else if (f1->offset > f2->offset)
505 505          return 1;
506 506  return 0;
507 507 }
508 508
509 -  struct bpf_map_value_off_desc *bpf_map_kptr_off_contains(struct bpf_map *map, u32 offset)
509 +  struct btf_field *btf_record_find(const struct btf_record *rec, u32 offset,
510 +                                    enum btf_field_type type)
510 511 {
511 -    /* Since members are iterated in btf_find_field in increasing order,
512 -     * offsets appended to kptr_off_tab are in increasing order, so we can
513 -     * do bsearch to find exact match.
514 -     */
515 -    struct bpf_map_value_off *tab;
512 +    struct btf_field *field;
516 513
517 -    if (!map_value_has_kptrs(map))
514 +    if (IS_ERR_OR_NULL(rec) || !(rec->field_mask & type))
518 515          return NULL;
519 -    tab = map->kptr_off_tab;
520 -    return bsearch(&offset, tab->off, tab->nr_off, sizeof(tab->off[0]), bpf_map_kptr_off_cmp);
516 +    field = bsearch(&offset, rec->fields, rec->cnt, sizeof(rec->fields[0]), btf_field_cmp);
517 +    if (!field || !(field->type & type))
518 +            return NULL;
519 +    return field;
521 520 }
522 521
523 -  void bpf_map_free_kptr_off_tab(struct bpf_map *map)
522 +  void btf_record_free(struct btf_record *rec)
524 523 {
525 -    struct bpf_map_value_off *tab = map->kptr_off_tab;
526 524  int i;
527 525
528 -    if (!map_value_has_kptrs(map))
526 +    if (IS_ERR_OR_NULL(rec))
529 527          return;
530 -    for (i = 0; i < tab->nr_off; i++) {
531 -            if (tab->off[i].kptr.module)
532 -                    module_put(tab->off[i].kptr.module);
533 -            btf_put(tab->off[i].kptr.btf);
534 -    }
535 -    kfree(tab);
536 -    map->kptr_off_tab = NULL;
537 - }
538 -
539 - struct bpf_map_value_off *bpf_map_copy_kptr_off_tab(const struct bpf_map *map)
540 - {
541 -    struct bpf_map_value_off *tab = map->kptr_off_tab, *new_tab;
542 -    int size, i;
543 -
544 -    if (!map_value_has_kptrs(map))
545 -            return ERR_PTR(-ENOENT);
546 -    size = offsetof(struct bpf_map_value_off, off[tab->nr_off]);
547 -    new_tab = kmemdup(tab, size, GFP_KERNEL | __GFP_NOWARN);
548 -    if (!new_tab)
549 -            return ERR_PTR(-ENOMEM);
550 -    /* Do a deep copy of the kptr_off_tab */
551 -    for (i = 0; i < tab->nr_off; i++) {
552 -            btf_get(tab->off[i].kptr.btf);
553 -            if (tab->off[i].kptr.module && !try_module_get(tab->off[i].kptr.module)) {
554 -                    while (i--) {
555 -                            if (tab->off[i].kptr.module)
556 -                                    module_put(tab->off[i].kptr.module);
557 -                            btf_put(tab->off[i].kptr.btf);
558 -                    }
559 -                    kfree(new_tab);
560 -                    return ERR_PTR(-ENXIO);
561 -            }
562 -    }
563 -    return new_tab;
564 - }
565 -
566 - bool bpf_map_equal_kptr_off_tab(const struct bpf_map *map_a, const struct bpf_map *map_b)
567 - {
568 -    struct bpf_map_value_off *tab_a = map_a->kptr_off_tab, *tab_b = map_b->kptr_off_tab;
569 -    bool a_has_kptr = map_value_has_kptrs(map_a), b_has_kptr = map_value_has_kptrs(map_b);
570 -    int size;
571 -
572 -    if (!a_has_kptr && !b_has_kptr)
573 -            return true;
574 -    if (a_has_kptr != b_has_kptr)
575 -            return false;
576 -    if (tab_a->nr_off != tab_b->nr_off)
577 -            return false;
578 -    size = offsetof(struct bpf_map_value_off, off[tab_a->nr_off]);
579 -    return !memcmp(tab_a, tab_b, size);
580 - }
581 -
582 - /* Caller must ensure map_value_has_kptrs is true. Note that this function can
583 -  * be called on a map value while the map_value is visible to BPF programs, as
584 -  * it ensures the correct synchronization, and we already enforce the same using
585 -  * the bpf_kptr_xchg helper on the BPF program side for referenced kptrs.
586 -  */
587 - void bpf_map_free_kptrs(struct bpf_map *map, void *map_value)
588 - {
589 -    struct bpf_map_value_off *tab = map->kptr_off_tab;
590 -    unsigned long *btf_id_ptr;
591 -    int i;
592 -
593 -    for (i = 0; i < tab->nr_off; i++) {
594 -            struct bpf_map_value_off_desc *off_desc = &tab->off[i];
595 -            unsigned long old_ptr;
596 -
597 -            btf_id_ptr = map_value + off_desc->offset;
598 -            if (off_desc->type == BPF_KPTR_UNREF) {
599 -                    u64 *p = (u64 *)btf_id_ptr;
600 -
601 -                    WRITE_ONCE(*p, 0);
528 +    for (i = 0; i < rec->cnt; i++) {
529 +            switch (rec->fields[i].type) {
530 +            case BPF_SPIN_LOCK:
531 +            case BPF_TIMER:
532 +                    break;
533 +            case BPF_KPTR_UNREF:
534 +            case BPF_KPTR_REF:
535 +                    if (rec->fields[i].kptr.module)
536 +                            module_put(rec->fields[i].kptr.module);
537 +                    btf_put(rec->fields[i].kptr.btf);
538 +                    break;
539 +            default:
540 +                    WARN_ON_ONCE(1);
602 542                  continue;
603 542          }
604 -            old_ptr = xchg(btf_id_ptr, 0);
605 -            off_desc->kptr.dtor((void *)old_ptr);
543 +    }
544 +    kfree(rec);
545 + }
546 +
547 + void bpf_map_free_record(struct bpf_map *map)
548 + {
549 +    btf_record_free(map->record);
550 +    map->record = NULL;
551 + }
552 +
553 + struct btf_record *btf_record_dup(const struct btf_record *rec)
554 + {
555 +    const struct btf_field *fields;
556 +    struct btf_record *new_rec;
557 +    int ret, size, i;
558 +
559 +    if (IS_ERR_OR_NULL(rec))
560 +            return NULL;
561 +    size = offsetof(struct btf_record, fields[rec->cnt]);
562 +    new_rec = kmemdup(rec, size, GFP_KERNEL | __GFP_NOWARN);
563 +    if (!new_rec)
564 +            return ERR_PTR(-ENOMEM);
565 +    /* Do a deep copy of the btf_record */
566 +    fields = rec->fields;
567 +    new_rec->cnt = 0;
568 +    for (i = 0; i < rec->cnt; i++) {
569 +            switch (fields[i].type) {
570 +            case BPF_SPIN_LOCK:
571 +            case BPF_TIMER:
572 +                    break;
573 +            case BPF_KPTR_UNREF:
574 +            case BPF_KPTR_REF:
575 +                    btf_get(fields[i].kptr.btf);
576 +                    if (fields[i].kptr.module && !try_module_get(fields[i].kptr.module)) {
577 +                            ret = -ENXIO;
578 +                            goto free;
579 +                    }
580 +                    break;
581 +            default:
582 +                    ret = -EFAULT;
583 +                    WARN_ON_ONCE(1);
584 +                    goto free;
585 +            }
586 +            new_rec->cnt++;
587 +    }
588 +    return new_rec;
589 + free:
590 +    btf_record_free(new_rec);
591 +    return ERR_PTR(ret);
592 + }
593 +
594 + bool btf_record_equal(const struct btf_record *rec_a, const struct btf_record *rec_b)
595 + {
596 +    bool a_has_fields = !IS_ERR_OR_NULL(rec_a), b_has_fields = !IS_ERR_OR_NULL(rec_b);
597 +    int size;
598 +
599 +    if (!a_has_fields && !b_has_fields)
600 +            return true;
601 +    if (a_has_fields != b_has_fields)
602 +            return false;
603 +    if (rec_a->cnt != rec_b->cnt)
604 +            return false;
605 +    size = offsetof(struct btf_record, fields[rec_a->cnt]);
606 +    return !memcmp(rec_a, rec_b, size);
607 + }
608 +
609 + void bpf_obj_free_timer(const struct btf_record *rec, void *obj)
610 + {
611 +    if (WARN_ON_ONCE(!btf_record_has_field(rec, BPF_TIMER)))
612 +            return;
613 +    bpf_timer_cancel_and_free(obj + rec->timer_off);
614 + }
615 +
616 + void bpf_obj_free_fields(const struct btf_record *rec, void *obj)
617 + {
618 +    const struct btf_field *fields;
619 +    int i;
620 +
621 +    if (IS_ERR_OR_NULL(rec))
622 +            return;
623 +    fields = rec->fields;
624 +    for (i = 0; i < rec->cnt; i++) {
625 +            const struct btf_field *field = &fields[i];
626 +            void *field_ptr = obj + field->offset;
627 +
628 +            switch (fields[i].type) {
629 +            case BPF_SPIN_LOCK:
630 +                    break;
631 +            case BPF_TIMER:
632 +                    bpf_timer_cancel_and_free(field_ptr);
633 +                    break;
634 +            case BPF_KPTR_UNREF:
635 +                    WRITE_ONCE(*(u64 *)field_ptr, 0);
636 +                    break;
637 +            case BPF_KPTR_REF:
638 +                    field->kptr.dtor((void *)xchg((unsigned long *)field_ptr, 0));
639 +                    break;
640 +            default:
641 +                    WARN_ON_ONCE(1);
642 +                    continue;
643 +            }
606 644  }
607 645 }
608 646
···
650 612  struct bpf_map *map = container_of(work, struct bpf_map, work);
651 613
652 614  security_bpf_map_free(map);
653 -    kfree(map->off_arr);
615 +    kfree(map->field_offs);
654 616  bpf_map_release_memcg(map);
655 617  /* implementation dependent freeing, map_free callback also does
656 -     * bpf_map_free_kptr_off_tab, if needed.
618 +     * bpf_map_free_record, if needed.
657 619   */
658 620  map->ops->map_free(map);
659 621 }
···
816 778  struct bpf_map *map = filp->private_data;
817 779  int err;
818 780
819 -    if (!map->ops->map_mmap || map_value_has_spin_lock(map) ||
820 -        map_value_has_timer(map) || map_value_has_kptrs(map))
781 +    if (!map->ops->map_mmap || !IS_ERR_OR_NULL(map->record))
821 782          return -ENOTSUPP;
822 783
823 784  if (!(vma->vm_flags & VM_SHARED))
···
943 906  return -ENOTSUPP;
944 907 }
945 908
946 -  static int map_off_arr_cmp(const void *_a, const void *_b, const void *priv)
947 - {
948 -    const u32 a = *(const u32 *)_a;
949 -    const u32 b = *(const u32 *)_b;
950 -
951 -    if (a < b)
952 -            return -1;
953 -    else if (a > b)
954 -            return 1;
955 -    return 0;
956 - }
957 -
958 - static void map_off_arr_swap(void *_a, void *_b, int size, const void *priv)
959 - {
960 -    struct bpf_map *map = (struct bpf_map *)priv;
961 -    u32 *off_base = map->off_arr->field_off;
962 -    u32 *a = _a, *b = _b;
963 -    u8 *sz_a, *sz_b;
964 -
965 -    sz_a = map->off_arr->field_sz + (a - off_base);
966 -    sz_b = map->off_arr->field_sz + (b - off_base);
967 -
968 -    swap(*a, *b);
969 -    swap(*sz_a, *sz_b);
970 - }
971 -
972 - static int bpf_map_alloc_off_arr(struct bpf_map *map)
973 - {
974 -    bool has_spin_lock = map_value_has_spin_lock(map);
975 -    bool has_timer = map_value_has_timer(map);
976 -    bool has_kptrs = map_value_has_kptrs(map);
977 -    struct bpf_map_off_arr *off_arr;
978 -    u32 i;
979 -
980 -    if (!has_spin_lock && !has_timer && !has_kptrs) {
981 -            map->off_arr = NULL;
982 -            return 0;
983 -    }
984 -
985 -    off_arr = kmalloc(sizeof(*map->off_arr), GFP_KERNEL | __GFP_NOWARN);
986 -    if (!off_arr)
987 -            return -ENOMEM;
988 -    map->off_arr = off_arr;
989 -
990 -    off_arr->cnt = 0;
991 -    if (has_spin_lock) {
992 -            i = off_arr->cnt;
993 -
994 -            off_arr->field_off[i] = map->spin_lock_off;
995 -            off_arr->field_sz[i] = sizeof(struct bpf_spin_lock);
996 -            off_arr->cnt++;
997 -    }
998 -    if (has_timer) {
999 -            i = off_arr->cnt;
1000 -
1001 -            off_arr->field_off[i] = map->timer_off;
1002 -            off_arr->field_sz[i] = sizeof(struct bpf_timer);
1003 -            off_arr->cnt++;
1004 -    }
1005 -    if (has_kptrs) {
1006 -            struct bpf_map_value_off *tab = map->kptr_off_tab;
1007 -            u32 *off = &off_arr->field_off[off_arr->cnt];
1008 -            u8 *sz = &off_arr->field_sz[off_arr->cnt];
1009 -
1010 -            for (i = 0; i < tab->nr_off; i++) {
1011 -                    *off++ = tab->off[i].offset;
1012 -                    *sz++ = sizeof(u64);
1013 -            }
1014 -            off_arr->cnt += tab->nr_off;
1015 -    }
1016 -
1017 -    if (off_arr->cnt == 1)
1018 -            return 0;
1019 -    sort_r(off_arr->field_off, off_arr->cnt, sizeof(off_arr->field_off[0]),
1020 -           map_off_arr_cmp, map_off_arr_swap, map);
1021 -    return 0;
1022 - }
1023 -
1024 909 static int map_check_btf(struct bpf_map *map, const struct btf *btf,
1025 910                           u32 btf_key_id, u32 btf_value_id)
1026 911 {
···
965 1006  if (!value_type || value_size != map->value_size)
966 1007          return -EINVAL;
967 1008
968 -    map->spin_lock_off = btf_find_spin_lock(btf, value_type);
1009 +    map->record = btf_parse_fields(btf, value_type, BPF_SPIN_LOCK | BPF_TIMER | BPF_KPTR,
1010 +                                   map->value_size);
1011 +    if (!IS_ERR_OR_NULL(map->record)) {
1012 +            int i;
969 1013
970 -    if (map_value_has_spin_lock(map)) {
971 -            if (map->map_flags & BPF_F_RDONLY_PROG)
972 -                    return -EACCES;
973 -            if (map->map_type != BPF_MAP_TYPE_HASH &&
974 -                map->map_type != BPF_MAP_TYPE_ARRAY &&
975 -                map->map_type != BPF_MAP_TYPE_CGROUP_STORAGE &&
976 -                map->map_type != BPF_MAP_TYPE_SK_STORAGE &&
977 -                map->map_type != BPF_MAP_TYPE_INODE_STORAGE &&
978 -                map->map_type != BPF_MAP_TYPE_TASK_STORAGE &&
979 -                map->map_type != BPF_MAP_TYPE_CGRP_STORAGE)
980 -                    return -ENOTSUPP;
981 -            if (map->spin_lock_off + sizeof(struct bpf_spin_lock) >
982 -                map->value_size) {
983 -                    WARN_ONCE(1,
984 -                              "verifier bug spin_lock_off %d value_size %d\n",
985 -                              map->spin_lock_off, map->value_size);
986 -                    return -EFAULT;
987 -            }
988 -    }
989 -
990 -    map->timer_off = btf_find_timer(btf, value_type);
991 -    if (map_value_has_timer(map)) {
992 -            if (map->map_flags & BPF_F_RDONLY_PROG)
993 -                    return -EACCES;
994 -            if (map->map_type != BPF_MAP_TYPE_HASH &&
995 -                map->map_type != BPF_MAP_TYPE_LRU_HASH &&
996 -                map->map_type != BPF_MAP_TYPE_ARRAY)
997 -                    return -EOPNOTSUPP;
998 -    }
999 -
1000 -    map->kptr_off_tab = btf_parse_kptrs(btf, value_type);
1001 -    if (map_value_has_kptrs(map)) {
1002 1014          if (!bpf_capable()) {
1003 1015                  ret = -EPERM;
1004 1016                  goto free_map_tab;
···
978 1048                  ret = -EACCES;
979 1049                  goto free_map_tab;
980 1050          }
981 -            if (map->map_type != BPF_MAP_TYPE_HASH &&
982 -                map->map_type != BPF_MAP_TYPE_LRU_HASH &&
983 -                map->map_type != BPF_MAP_TYPE_ARRAY &&
984 -                map->map_type != BPF_MAP_TYPE_PERCPU_ARRAY) {
985 -                    ret = -EOPNOTSUPP;
986 -                    goto free_map_tab;
1051 +            for (i = 0; i < sizeof(map->record->field_mask) * 8; i++) {
1052 +                    switch (map->record->field_mask & (1 << i)) {
1053 +                    case 0:
1054 +                            continue;
1055 +                    case BPF_SPIN_LOCK:
1056 +                            if (map->map_type != BPF_MAP_TYPE_HASH &&
1057 +                                map->map_type != BPF_MAP_TYPE_ARRAY &&
1058 +                                map->map_type != BPF_MAP_TYPE_CGROUP_STORAGE &&
1059 +                                map->map_type != BPF_MAP_TYPE_SK_STORAGE &&
1060 +                                map->map_type != BPF_MAP_TYPE_INODE_STORAGE &&
1061 +                                map->map_type != BPF_MAP_TYPE_TASK_STORAGE &&
1062 +                                map->map_type != BPF_MAP_TYPE_CGRP_STORAGE) {
1063 +                                    ret = -EOPNOTSUPP;
1064 +                                    goto free_map_tab;
1065 +                            }
1066 +                            break;
1067 +                    case BPF_TIMER:
1068 +                            if (map->map_type != BPF_MAP_TYPE_HASH &&
1069 +                                map->map_type != BPF_MAP_TYPE_LRU_HASH &&
1070 +                                map->map_type != BPF_MAP_TYPE_ARRAY) {
1071 +                                    return -EOPNOTSUPP;
1072 +                                    goto free_map_tab;
1073 +                            }
1074 +                            break;
1075 +                    case BPF_KPTR_UNREF:
1076 +                    case BPF_KPTR_REF:
1077 +                            if (map->map_type != BPF_MAP_TYPE_HASH &&
1078 +                                map->map_type != BPF_MAP_TYPE_LRU_HASH &&
1079 +                                map->map_type != BPF_MAP_TYPE_ARRAY &&
1080 +                                map->map_type != BPF_MAP_TYPE_PERCPU_ARRAY) {
1081 +                                    ret = -EOPNOTSUPP;
1082 +                                    goto free_map_tab;
1083 +                            }
1084 +                            break;
1085 +                    default:
1086 +                            /* Fail if map_type checks are missing for a field type */
1087 +                            ret = -EOPNOTSUPP;
1088 +                            goto free_map_tab;
1089 +                    }
987 1090          }
988 1091  }
···
1028 1065
1029 1066  return ret;
1030 1067 free_map_tab:
1031 -    bpf_map_free_kptr_off_tab(map);
1068 +    bpf_map_free_record(map);
1032 1069  return ret;
1033 1070 }
···
1037 1074 static int map_create(union bpf_attr *attr)
1038 1075 {
1039 1076  int numa_node = bpf_map_attr_numa_node(attr);
1077 +    struct btf_field_offs *foffs;
1040 1078  struct bpf_map *map;
1041 1079  int f_flags;
1042 1080  int err;
···
1082 1118  mutex_init(&map->freeze_mutex);
1083 1119  spin_lock_init(&map->owner.lock);
1084 1120
1085 -    map->spin_lock_off = -EINVAL;
1086 -    map->timer_off = -EINVAL;
1087 1121  if (attr->btf_key_type_id || attr->btf_value_type_id ||
1088 1122      /* Even the map's value is a kernel's struct,
1089 1123       * the bpf_prog.o must have BTF to begin with
···
1117 1155                  attr->btf_vmlinux_value_type_id;
1118 1156  }
1119 1157
1120 -    err = bpf_map_alloc_off_arr(map);
1121 -    if (err)
1158 +
1159 +    foffs = btf_parse_field_offs(map->record);
1160 +    if (IS_ERR(foffs)) {
1161 +            err = PTR_ERR(foffs);
1122 1162          goto free_map;
1163 +    }
1164 +    map->field_offs = foffs;
1123 1165
1124 1166  err = security_bpf_map_alloc(map);
1125 1167  if (err)
1126 -            goto free_map_off_arr;
1168 +            goto free_map_field_offs;
1127 1169
1128 1170  err = bpf_map_alloc_id(map);
1129 1171  if (err)
···
1151 1185
1152 1186 free_map_sec:
1153 1187  security_bpf_map_free(map);
1154 -  free_map_off_arr:
1155 -    kfree(map->off_arr);
1188 + free_map_field_offs:
1189 +    kfree(map->field_offs);
1156 1190 free_map:
1157 1191  btf_put(map->btf);
1158 1192  map->ops->map_free(map);
···
1299 1333  }
1300 1334
1301 1335  if ((attr->flags & BPF_F_LOCK) &&
1302 -        !map_value_has_spin_lock(map)) {
1336 +        !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
1303 1337          err = -EINVAL;
1304 1338          goto err_put;
1305 1339  }
···
1372 1406  }
1373 1407
1374 1408  if ((attr->flags & BPF_F_LOCK) &&
1375 -        !map_value_has_spin_lock(map)) {
1409 +        !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
1376 1410          err = -EINVAL;
1377 1411          goto err_put;
1378 1412  }
···
1535 1569          return -EINVAL;
1536 1570
1537 1571  if ((attr->batch.elem_flags & BPF_F_LOCK) &&
1538 -        !map_value_has_spin_lock(map)) {
1572 +        !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
1539 1573          return -EINVAL;
1540 1574  }
···
1592 1626          return -EINVAL;
1593 1627
1594 1628  if ((attr->batch.elem_flags & BPF_F_LOCK) &&
1595 -        !map_value_has_spin_lock(map)) {
1629 +        !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
1596 1630          return -EINVAL;
1597 1631  }
···
1655 1689          return -EINVAL;
1656 1690
1657 1691  if ((attr->batch.elem_flags & BPF_F_LOCK) &&
1658 -        !map_value_has_spin_lock(map))
1692 +        !btf_record_has_field(map->record, BPF_SPIN_LOCK))
1659 1693          return -EINVAL;
1660 1694
1661 1695  value_size = bpf_map_value_size(map);
···
1777 1811  }
1778 1812
1779 1813  if ((attr->flags & BPF_F_LOCK) &&
1780 -        !map_value_has_spin_lock(map)) {
1814 +        !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
1781 1815          err = -EINVAL;
1782 1816          goto err_put;
1783 1817  }
···
1848 1882  if (IS_ERR(map))
1849 1883          return PTR_ERR(map);
1850 1884
1851 -    if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS ||
1852 -        map_value_has_timer(map) || map_value_has_kptrs(map)) {
1885 +    if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS || !IS_ERR_OR_NULL(map->record)) {
1853 1886          fdput(f);
1854 1887          return -ENOTSUPP;
1855 1888  }
+314 -171
kernel/bpf/verifier.c
··· 262 262 struct btf *ret_btf; 263 263 u32 ret_btf_id; 264 264 u32 subprogno; 265 - struct bpf_map_value_off_desc *kptr_off_desc; 265 + struct btf_field *kptr_field; 266 266 u8 uninit_dynptr_regno; 267 267 }; 268 268 ··· 454 454 static bool reg_may_point_to_spin_lock(const struct bpf_reg_state *reg) 455 455 { 456 456 return reg->type == PTR_TO_MAP_VALUE && 457 - map_value_has_spin_lock(reg->map_ptr); 458 - } 459 - 460 - static bool reg_type_may_be_refcounted_or_null(enum bpf_reg_type type) 461 - { 462 - type = base_type(type); 463 - return type == PTR_TO_SOCKET || type == PTR_TO_TCP_SOCK || 464 - type == PTR_TO_MEM || type == PTR_TO_BTF_ID; 457 + btf_record_has_field(reg->map_ptr->record, BPF_SPIN_LOCK); 465 458 } 466 459 467 460 static bool type_is_rdonly_mem(u32 type) ··· 502 509 static bool is_dynptr_ref_function(enum bpf_func_id func_id) 503 510 { 504 511 return func_id == BPF_FUNC_dynptr_data; 512 + } 513 + 514 + static bool is_callback_calling_function(enum bpf_func_id func_id) 515 + { 516 + return func_id == BPF_FUNC_for_each_map_elem || 517 + func_id == BPF_FUNC_timer_set_callback || 518 + func_id == BPF_FUNC_find_vma || 519 + func_id == BPF_FUNC_loop || 520 + func_id == BPF_FUNC_user_ringbuf_drain; 505 521 } 506 522 507 523 static bool helper_multiple_ref_obj_use(enum bpf_func_id func_id, ··· 877 875 878 876 if (reg->id) 879 877 verbose_a("id=%d", reg->id); 880 - if (reg_type_may_be_refcounted_or_null(t) && reg->ref_obj_id) 878 + if (reg->ref_obj_id) 881 879 verbose_a("ref_obj_id=%d", reg->ref_obj_id); 882 880 if (t != SCALAR_VALUE) 883 881 verbose_a("off=%d", reg->off); ··· 1402 1400 /* transfer reg's id which is unique for every map_lookup_elem 1403 1401 * as UID of the inner map. 
1404 1402 */ 1405 - if (map_value_has_timer(map->inner_map_meta)) 1403 + if (btf_record_has_field(map->inner_map_meta->record, BPF_TIMER)) 1406 1404 reg->map_uid = reg->id; 1407 1405 } else if (map->map_type == BPF_MAP_TYPE_XSKMAP) { 1408 1406 reg->type = PTR_TO_XDP_SOCK; ··· 1691 1689 reg->type = SCALAR_VALUE; 1692 1690 reg->var_off = tnum_unknown; 1693 1691 reg->frameno = 0; 1694 - reg->precise = env->subprog_cnt > 1 || !env->bpf_capable; 1692 + reg->precise = !env->bpf_capable; 1695 1693 __mark_reg_unbounded(reg); 1696 1694 } 1697 1695 ··· 2660 2658 if (opcode == BPF_CALL) { 2661 2659 if (insn->src_reg == BPF_PSEUDO_CALL) 2662 2660 return -ENOTSUPP; 2661 + /* BPF helpers that invoke callback subprogs are 2662 + * equivalent to BPF_PSEUDO_CALL above 2663 + */ 2664 + if (insn->src_reg == 0 && is_callback_calling_function(insn->imm)) 2665 + return -ENOTSUPP; 2663 2666 /* regular helper call sets R0 */ 2664 2667 *reg_mask &= ~1; 2665 2668 if (*reg_mask & 0x3f) { ··· 2754 2747 2755 2748 /* big hammer: mark all scalars precise in this path. 2756 2749 * pop_stack may still get !precise scalars. 2750 + * We also skip current state and go straight to first parent state, 2751 + * because precision markings in current non-checkpointed state are 2752 + * not needed. See why in the comment in __mark_chain_precision below. 
2757 2753 */ 2758 - for (; st; st = st->parent) 2754 + for (st = st->parent; st; st = st->parent) { 2759 2755 for (i = 0; i <= st->curframe; i++) { 2760 2756 func = st->frame[i]; 2761 2757 for (j = 0; j < BPF_REG_FP; j++) { ··· 2776 2766 reg->precise = true; 2777 2767 } 2778 2768 } 2769 + } 2779 2770 } 2780 2771 2781 - static int __mark_chain_precision(struct bpf_verifier_env *env, int regno, 2772 + static void mark_all_scalars_imprecise(struct bpf_verifier_env *env, struct bpf_verifier_state *st) 2773 + { 2774 + struct bpf_func_state *func; 2775 + struct bpf_reg_state *reg; 2776 + int i, j; 2777 + 2778 + for (i = 0; i <= st->curframe; i++) { 2779 + func = st->frame[i]; 2780 + for (j = 0; j < BPF_REG_FP; j++) { 2781 + reg = &func->regs[j]; 2782 + if (reg->type != SCALAR_VALUE) 2783 + continue; 2784 + reg->precise = false; 2785 + } 2786 + for (j = 0; j < func->allocated_stack / BPF_REG_SIZE; j++) { 2787 + if (!is_spilled_reg(&func->stack[j])) 2788 + continue; 2789 + reg = &func->stack[j].spilled_ptr; 2790 + if (reg->type != SCALAR_VALUE) 2791 + continue; 2792 + reg->precise = false; 2793 + } 2794 + } 2795 + } 2796 + 2797 + /* 2798 + * __mark_chain_precision() backtracks BPF program instruction sequence and 2799 + * chain of verifier states making sure that register *regno* (if regno >= 0) 2800 + * and/or stack slot *spi* (if spi >= 0) are marked as precisely tracked 2801 + * SCALARS, as well as any other registers and slots that contribute to 2802 + * a tracked state of given registers/stack slots, depending on specific BPF 2803 + * assembly instructions (see backtrack_insns() for exact instruction handling 2804 + * logic). This backtracking relies on recorded jmp_history and is able to 2805 + * traverse entire chain of parent states. This process ends only when all the 2806 + * necessary registers/slots and their transitive dependencies are marked as 2807 + * precise. 
2808 + * 2809 + * One important and subtle aspect is that precise marks *do not matter* in 2810 + * the currently verified state (current state). It is important to understand 2811 + * why this is the case. 2812 + * 2813 + * First, note that current state is the state that is not yet "checkpointed", 2814 + * i.e., it is not yet put into env->explored_states, and it has no children 2815 + * states as well. It's ephemeral, and can end up either a) being discarded if 2816 + * compatible explored state is found at some point or BPF_EXIT instruction is 2817 + * reached or b) checkpointed and put into env->explored_states, branching out 2818 + * into one or more children states. 2819 + * 2820 + * In the former case, precise markings in current state are completely 2821 + * ignored by state comparison code (see regsafe() for details). Only 2822 + * checkpointed ("old") state precise markings are important, and if old 2823 + * state's register/slot is precise, regsafe() assumes current state's 2824 + * register/slot as precise and checks value ranges exactly and precisely. If 2825 + * states turn out to be compatible, current state's necessary precise 2826 + * markings and any required parent states' precise markings are enforced 2827 + * after the fact with propagate_precision() logic, after the fact. But it's 2828 + * important to realize that in this case, even after marking current state 2829 + * registers/slots as precise, we immediately discard current state. So what 2830 + * actually matters is any of the precise markings propagated into current 2831 + * state's parent states, which are always checkpointed (due to b) case above). 2832 + * As such, for scenario a) it doesn't matter if current state has precise 2833 + * markings set or not. 2834 + * 2835 + * Now, for the scenario b), checkpointing and forking into child(ren) 2836 + * state(s). 
Note that before current state gets to checkpointing step, any 2837 + * processed instruction always assumes precise SCALAR register/slot 2838 + * knowledge: if precise value or range is useful to prune jump branch, BPF 2839 + * verifier takes this opportunity enthusiastically. Similarly, when 2840 + * register's value is used to calculate offset or memory address, exact 2841 + * knowledge of SCALAR range is assumed, checked, and enforced. So, similar to 2842 + * how state comparison ignores precise markings, 2843 + * BPF verifier ignores and also assumes precise 2844 + * markings *at will* during instruction verification process. But as verifier 2845 + * assumes precision, it also propagates any precision dependencies across 2846 + * parent states, which are not yet finalized, so can be further restricted 2847 + * based on new knowledge gained from restrictions enforced by their children 2848 + * states. This is so that once those parent states are finalized, i.e., when 2849 + * they have no more active children state, state comparison logic in 2850 + * is_state_visited() would enforce strict and precise SCALAR ranges, if 2851 + * required for correctness. 2852 + * 2853 + * To build a bit more intuition, note also that once a state is checkpointed, 2854 + * the path we took to get to that state is not important. This is a crucial 2855 + * property for state pruning. When state is checkpointed and finalized at 2856 + * some instruction index, it can be correctly and safely used to "short 2857 + * circuit" any *compatible* state that reaches exactly the same instruction 2858 + * index. I.e., if we jumped to that instruction from a completely different 2859 + * code path than original finalized state was derived from, it doesn't 2860 + * matter, current state can be discarded because from that instruction 2861 + * forward having a compatible state will ensure we will safely reach the 2862 + * exit.
States describe preconditions for further exploration, but completely 2863 + * forget the history of how we got here. 2864 + * 2865 + * This also means that even if we needed precise SCALAR range to get to 2866 + * finalized state, but from that point forward *that same* SCALAR register is 2867 + * never used in a precise context (i.e., its precise value is not needed for 2868 + * correctness), it's correct and safe to mark such register as "imprecise" 2869 + * (i.e., precise marking set to false). This is what we rely on when we do 2870 + * not set precise marking in current state. If no child state requires 2871 + * precision for any given SCALAR register, it's safe to dictate that it can 2872 + * be imprecise. If any child state does require this register to be precise, 2873 + * we'll mark it precise retroactively during precise markings 2874 + * propagation from child state to parent states. 2875 + * 2876 + * Skipping precise marking in current state is a mild version of 2877 + * relying on the above observation. But we can utilize this property even 2878 + * more aggressively by proactively forgetting any precise marking in the 2879 + * current state (which we inherited from the parent state), right before we 2880 + * checkpoint it and branch off into new child state. This is done by 2881 + * mark_all_scalars_imprecise() to hopefully get more permissive and generic 2882 + * finalized states which help in short circuiting more future states. 2883 + */ 2884 + static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int regno, 2782 2885 int spi) 2783 2886 { 2784 2887 struct bpf_verifier_state *st = env->cur_state; ··· 2908 2785 if (!env->bpf_capable) 2909 2786 return 0; 2910 2787 2911 - func = st->frame[st->curframe]; 2788 + /* Do sanity checks against current state of register and/or stack 2789 + * slot, but don't set precise flag in current state, as precision 2790 + * tracking in the current state is unnecessary. 
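The idea behind the comment above can be sketched in a few lines of plain userspace C. This is a toy model, not kernel code: `toy_state`, `mark_parents_precise()` and `forget_precision()` are illustrative stand-ins for the verifier's state chain, the `for (st = st->parent; st; st = st->parent)` walk, and `mark_all_scalars_imprecise()` respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a verifier state: a parent pointer plus a per-register
 * "precise" flag for R0..R10. */
struct toy_state {
	struct toy_state *parent;
	bool precise[11];
};

/* Mark a register precise in every *parent* (checkpointed) state,
 * skipping the current, not-yet-checkpointed state. */
static void mark_parents_precise(struct toy_state *cur, int regno)
{
	struct toy_state *st;

	for (st = cur->parent; st; st = st->parent)
		st->precise[regno] = true;
}

/* Forget inherited precise markings right before checkpointing:
 * the toy analogue of mark_all_scalars_imprecise(). */
static void forget_precision(struct toy_state *st)
{
	for (int i = 0; i < 11; i++)
		st->precise[i] = false;
}
```

The key property mirrored here: precision only ever needs to be enforced in parent (checkpointed) states, so the current state can safely drop any markings it inherited.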
2791 + */ 2792 + func = st->frame[frame]; 2912 2793 if (regno >= 0) { 2913 2794 reg = &func->regs[regno]; 2914 2795 if (reg->type != SCALAR_VALUE) { 2915 2796 WARN_ONCE(1, "backtracing misuse"); 2916 2797 return -EFAULT; 2917 2798 } 2918 - if (!reg->precise) 2919 - new_marks = true; 2920 - else 2921 - reg_mask = 0; 2922 - reg->precise = true; 2799 + new_marks = true; 2923 2800 } 2924 2801 2925 2802 while (spi >= 0) { ··· 2932 2809 stack_mask = 0; 2933 2810 break; 2934 2811 } 2935 - if (!reg->precise) 2936 - new_marks = true; 2937 - else 2938 - stack_mask = 0; 2939 - reg->precise = true; 2812 + new_marks = true; 2940 2813 break; 2941 2814 } 2942 2815 ··· 2940 2821 return 0; 2941 2822 if (!reg_mask && !stack_mask) 2942 2823 return 0; 2824 + 2943 2825 for (;;) { 2944 2826 DECLARE_BITMAP(mask, 64); 2945 2827 u32 history = st->jmp_history_cnt; 2946 2828 2947 2829 if (env->log.level & BPF_LOG_LEVEL2) 2948 2830 verbose(env, "last_idx %d first_idx %d\n", last_idx, first_idx); 2831 + 2832 + if (last_idx < 0) { 2833 + /* we are at the entry into subprog, which 2834 + * is expected for global funcs, but only if 2835 + * requested precise registers are R1-R5 2836 + * (which are global func's input arguments) 2837 + */ 2838 + if (st->curframe == 0 && 2839 + st->frame[0]->subprogno > 0 && 2840 + st->frame[0]->callsite == BPF_MAIN_FUNC && 2841 + stack_mask == 0 && (reg_mask & ~0x3e) == 0) { 2842 + bitmap_from_u64(mask, reg_mask); 2843 + for_each_set_bit(i, mask, 32) { 2844 + reg = &st->frame[0]->regs[i]; 2845 + if (reg->type != SCALAR_VALUE) { 2846 + reg_mask &= ~(1u << i); 2847 + continue; 2848 + } 2849 + reg->precise = true; 2850 + } 2851 + return 0; 2852 + } 2853 + 2854 + verbose(env, "BUG backtracing func entry subprog %d reg_mask %x stack_mask %llx\n", 2855 + st->frame[0]->subprogno, reg_mask, stack_mask); 2856 + WARN_ONCE(1, "verifier backtracking bug"); 2857 + return -EFAULT; 2858 + } 2859 + 2949 2860 for (i = last_idx;;) { 2950 2861 if (skip_first) { 2951 2862 err = 0; 
··· 3015 2866 break; 3016 2867 3017 2868 new_marks = false; 3018 - func = st->frame[st->curframe]; 2869 + func = st->frame[frame]; 3019 2870 bitmap_from_u64(mask, reg_mask); 3020 2871 for_each_set_bit(i, mask, 32) { 3021 2872 reg = &func->regs[i]; ··· 3081 2932 3082 2933 int mark_chain_precision(struct bpf_verifier_env *env, int regno) 3083 2934 { 3084 - return __mark_chain_precision(env, regno, -1); 2935 + return __mark_chain_precision(env, env->cur_state->curframe, regno, -1); 3085 2936 } 3086 2937 3087 - static int mark_chain_precision_stack(struct bpf_verifier_env *env, int spi) 2938 + static int mark_chain_precision_frame(struct bpf_verifier_env *env, int frame, int regno) 3088 2939 { 3089 - return __mark_chain_precision(env, -1, spi); 2940 + return __mark_chain_precision(env, frame, regno, -1); 2941 + } 2942 + 2943 + static int mark_chain_precision_stack_frame(struct bpf_verifier_env *env, int frame, int spi) 2944 + { 2945 + return __mark_chain_precision(env, frame, -1, spi); 3090 2946 } 3091 2947 3092 2948 static bool is_spillable_regtype(enum bpf_reg_type type) ··· 3340 3186 stype = &state->stack[spi].slot_type[slot % BPF_REG_SIZE]; 3341 3187 mark_stack_slot_scratched(env, spi); 3342 3188 3343 - if (!env->allow_ptr_leaks 3344 - && *stype != NOT_INIT 3345 - && *stype != SCALAR_VALUE) { 3346 - /* Reject the write if there's are spilled pointers in 3347 - * range. If we didn't reject here, the ptr status 3348 - * would be erased below (even though not all slots are 3349 - * actually overwritten), possibly opening the door to 3350 - * leaks. 3189 + if (!env->allow_ptr_leaks && *stype != STACK_MISC && *stype != STACK_ZERO) { 3190 + /* Reject the write if range we may write to has not 3191 + * been initialized beforehand. If we didn't reject 3192 + * here, the ptr status would be erased below (even 3193 + * though not all slots are actually overwritten), 3194 + * possibly opening the door to leaks. 
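The `(reg_mask & ~0x3e) == 0` test above encodes "every requested register is one of R1-R5", since bits 1 through 5 of the register bitmask correspond to a global function's input arguments. A small standalone C illustration (not kernel code, function name is made up):

```c
#include <stdbool.h>
#include <stdint.h>

/* 0x3e == 0b0011'1110: bits 1..5 set, i.e. registers R1-R5.
 * The check passes only when no bit outside R1-R5 is requested. */
static bool only_input_args_requested(uint32_t reg_mask)
{
	return (reg_mask & ~0x3eu) == 0;
}
```

Requesting precision for R0 or any callee-saved register (R6+) at the entry of a global subprog is a verifier bug, which is why the code path falls through to `WARN_ONCE()` in that case.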
3195 + * 3196 + * We do however catch STACK_INVALID case below, and 3197 + * only allow reading possibly uninitialized memory 3198 + * later for CAP_PERFMON, as the write may not happen to 3199 + * that slot. 3351 3200 */ 3352 3201 verbose(env, "spilled ptr in range of var-offset stack write; insn %d, ptr off: %d", 3353 3202 insn_idx, i); ··· 3840 3683 } 3841 3684 3842 3685 static int map_kptr_match_type(struct bpf_verifier_env *env, 3843 - struct bpf_map_value_off_desc *off_desc, 3686 + struct btf_field *kptr_field, 3844 3687 struct bpf_reg_state *reg, u32 regno) 3845 3688 { 3846 - const char *targ_name = kernel_type_name(off_desc->kptr.btf, off_desc->kptr.btf_id); 3689 + const char *targ_name = kernel_type_name(kptr_field->kptr.btf, kptr_field->kptr.btf_id); 3847 3690 int perm_flags = PTR_MAYBE_NULL; 3848 3691 const char *reg_name = ""; 3849 3692 3850 3693 /* Only unreferenced case accepts untrusted pointers */ 3851 - if (off_desc->type == BPF_KPTR_UNREF) 3694 + if (kptr_field->type == BPF_KPTR_UNREF) 3852 3695 perm_flags |= PTR_UNTRUSTED; 3853 3696 3854 3697 if (base_type(reg->type) != PTR_TO_BTF_ID || (type_flag(reg->type) & ~perm_flags)) ··· 3895 3738 * strict mode to true for type match. 
3896 3739 */ 3897 3740 if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, reg->off, 3898 - off_desc->kptr.btf, off_desc->kptr.btf_id, 3899 - off_desc->type == BPF_KPTR_REF)) 3741 + kptr_field->kptr.btf, kptr_field->kptr.btf_id, 3742 + kptr_field->type == BPF_KPTR_REF)) 3900 3743 goto bad_type; 3901 3744 return 0; 3902 3745 bad_type: 3903 3746 verbose(env, "invalid kptr access, R%d type=%s%s ", regno, 3904 3747 reg_type_str(env, reg->type), reg_name); 3905 3748 verbose(env, "expected=%s%s", reg_type_str(env, PTR_TO_BTF_ID), targ_name); 3906 - if (off_desc->type == BPF_KPTR_UNREF) 3749 + if (kptr_field->type == BPF_KPTR_UNREF) 3907 3750 verbose(env, " or %s%s\n", reg_type_str(env, PTR_TO_BTF_ID | PTR_UNTRUSTED), 3908 3751 targ_name); 3909 3752 else ··· 3913 3756 3914 3757 static int check_map_kptr_access(struct bpf_verifier_env *env, u32 regno, 3915 3758 int value_regno, int insn_idx, 3916 - struct bpf_map_value_off_desc *off_desc) 3759 + struct btf_field *kptr_field) 3917 3760 { 3918 3761 struct bpf_insn *insn = &env->prog->insnsi[insn_idx]; 3919 3762 int class = BPF_CLASS(insn->code); ··· 3923 3766 * - Reject cases where variable offset may touch kptr 3924 3767 * - size of access (must be BPF_DW) 3925 3768 * - tnum_is_const(reg->var_off) 3926 - * - off_desc->offset == off + reg->var_off.value 3769 + * - kptr_field->offset == off + reg->var_off.value 3927 3770 */ 3928 3771 /* Only BPF_[LDX,STX,ST] | BPF_MEM | BPF_DW is supported */ 3929 3772 if (BPF_MODE(insn->code) != BPF_MEM) { ··· 3934 3777 /* We only allow loading referenced kptr, since it will be marked as 3935 3778 * untrusted, similar to unreferenced kptr. 
3936 3779 */ 3937 - if (class != BPF_LDX && off_desc->type == BPF_KPTR_REF) { 3780 + if (class != BPF_LDX && kptr_field->type == BPF_KPTR_REF) { 3938 3781 verbose(env, "store to referenced kptr disallowed\n"); 3939 3782 return -EACCES; 3940 3783 } ··· 3944 3787 /* We can simply mark the value_regno receiving the pointer 3945 3788 * value from map as PTR_TO_BTF_ID, with the correct type. 3946 3789 */ 3947 - mark_btf_ld_reg(env, cur_regs(env), value_regno, PTR_TO_BTF_ID, off_desc->kptr.btf, 3948 - off_desc->kptr.btf_id, PTR_MAYBE_NULL | PTR_UNTRUSTED); 3790 + mark_btf_ld_reg(env, cur_regs(env), value_regno, PTR_TO_BTF_ID, kptr_field->kptr.btf, 3791 + kptr_field->kptr.btf_id, PTR_MAYBE_NULL | PTR_UNTRUSTED); 3949 3792 /* For mark_ptr_or_null_reg */ 3950 3793 val_reg->id = ++env->id_gen; 3951 3794 } else if (class == BPF_STX) { 3952 3795 val_reg = reg_state(env, value_regno); 3953 3796 if (!register_is_null(val_reg) && 3954 - map_kptr_match_type(env, off_desc, val_reg, value_regno)) 3797 + map_kptr_match_type(env, kptr_field, val_reg, value_regno)) 3955 3798 return -EACCES; 3956 3799 } else if (class == BPF_ST) { 3957 3800 if (insn->imm) { 3958 3801 verbose(env, "BPF_ST imm must be 0 when storing to kptr at off=%u\n", 3959 - off_desc->offset); 3802 + kptr_field->offset); 3960 3803 return -EACCES; 3961 3804 } 3962 3805 } else { ··· 3975 3818 struct bpf_func_state *state = vstate->frame[vstate->curframe]; 3976 3819 struct bpf_reg_state *reg = &state->regs[regno]; 3977 3820 struct bpf_map *map = reg->map_ptr; 3978 - int err; 3821 + struct btf_record *rec; 3822 + int err, i; 3979 3823 3980 3824 err = check_mem_region_access(env, regno, off, size, map->value_size, 3981 3825 zero_size_allowed); 3982 3826 if (err) 3983 3827 return err; 3984 3828 3985 - if (map_value_has_spin_lock(map)) { 3986 - u32 lock = map->spin_lock_off; 3829 + if (IS_ERR_OR_NULL(map->record)) 3830 + return 0; 3831 + rec = map->record; 3832 + for (i = 0; i < rec->cnt; i++) { 3833 + struct btf_field *field 
= &rec->fields[i]; 3834 + u32 p = field->offset; 3987 3835 3988 - /* if any part of struct bpf_spin_lock can be touched by 3989 - * load/store reject this program. 3990 - * To check that [x1, x2) overlaps with [y1, y2) 3836 + /* If any part of a field can be touched by load/store, reject 3837 + * this program. To check that [x1, x2) overlaps with [y1, y2), 3991 3838 * it is sufficient to check x1 < y2 && y1 < x2. 3992 3839 */ 3993 - if (reg->smin_value + off < lock + sizeof(struct bpf_spin_lock) && 3994 - lock < reg->umax_value + off + size) { 3995 - verbose(env, "bpf_spin_lock cannot be accessed directly by load/store\n"); 3996 - return -EACCES; 3997 - } 3998 - } 3999 - if (map_value_has_timer(map)) { 4000 - u32 t = map->timer_off; 4001 - 4002 - if (reg->smin_value + off < t + sizeof(struct bpf_timer) && 4003 - t < reg->umax_value + off + size) { 4004 - verbose(env, "bpf_timer cannot be accessed directly by load/store\n"); 4005 - return -EACCES; 4006 - } 4007 - } 4008 - if (map_value_has_kptrs(map)) { 4009 - struct bpf_map_value_off *tab = map->kptr_off_tab; 4010 - int i; 4011 - 4012 - for (i = 0; i < tab->nr_off; i++) { 4013 - u32 p = tab->off[i].offset; 4014 - 4015 - if (reg->smin_value + off < p + sizeof(u64) && 4016 - p < reg->umax_value + off + size) { 3840 + if (reg->smin_value + off < p + btf_field_type_size(field->type) && 3841 + p < reg->umax_value + off + size) { 3842 + switch (field->type) { 3843 + case BPF_KPTR_UNREF: 3844 + case BPF_KPTR_REF: 4017 3845 if (src != ACCESS_DIRECT) { 4018 3846 verbose(env, "kptr cannot be accessed indirectly by helper\n"); 4019 3847 return -EACCES; ··· 4017 3875 return -EACCES; 4018 3876 } 4019 3877 break; 3878 + default: 3879 + verbose(env, "%s cannot be accessed directly by load/store\n", 3880 + btf_field_type_name(field->type)); 3881 + return -EACCES; 4020 3882 } 4021 3883 } 4022 3884 } 4023 - return err; 3885 + return 0; 4024 3886 } 4025 3887 4026 3888 #define MAX_PACKET_OFF 0xffff ··· 4897 4751 if (value_regno >= 0) 
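The overlap check quoted in the hunk above ("[x1, x2) overlaps with [y1, y2) iff x1 < y2 && y1 < x2") is the standard half-open interval test, now applied uniformly to every `btf_field` in the record. A minimal sketch in plain C (toy names, not the kernel's):

```c
#include <stdbool.h>
#include <stdint.h>

/* An access [off, off + size) touches a special field [p, p + fsz)
 * exactly when the two half-open intervals overlap. */
static bool access_touches_field(uint64_t off, uint64_t size,
				 uint64_t p, uint64_t fsz)
{
	return off < p + fsz && p < off + size;
}
```

Note the half-open form means an access ending exactly at the field's start offset, or starting exactly at its end, is allowed.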
4898 4752 mark_reg_unknown(env, regs, value_regno); 4899 4753 } else if (reg->type == PTR_TO_MAP_VALUE) { 4900 - struct bpf_map_value_off_desc *kptr_off_desc = NULL; 4754 + struct btf_field *kptr_field = NULL; 4901 4755 4902 4756 if (t == BPF_WRITE && value_regno >= 0 && 4903 4757 is_pointer_value(env, value_regno)) { ··· 4911 4765 if (err) 4912 4766 return err; 4913 4767 if (tnum_is_const(reg->var_off)) 4914 - kptr_off_desc = bpf_map_kptr_off_contains(reg->map_ptr, 4915 - off + reg->var_off.value); 4916 - if (kptr_off_desc) { 4917 - err = check_map_kptr_access(env, regno, value_regno, insn_idx, 4918 - kptr_off_desc); 4768 + kptr_field = btf_record_find(reg->map_ptr->record, 4769 + off + reg->var_off.value, BPF_KPTR); 4770 + if (kptr_field) { 4771 + err = check_map_kptr_access(env, regno, value_regno, insn_idx, kptr_field); 4919 4772 } else if (t == BPF_READ && value_regno >= 0) { 4920 4773 struct bpf_map *map = reg->map_ptr; 4921 4774 ··· 5305 5160 } 5306 5161 5307 5162 if (is_spilled_reg(&state->stack[spi]) && 5308 - base_type(state->stack[spi].spilled_ptr.type) == PTR_TO_BTF_ID) 5309 - goto mark; 5310 - 5311 - if (is_spilled_reg(&state->stack[spi]) && 5312 5163 (state->stack[spi].spilled_ptr.type == SCALAR_VALUE || 5313 5164 env->allow_ptr_leaks)) { 5314 5165 if (clobber) { ··· 5334 5193 mark_reg_read(env, &state->stack[spi].spilled_ptr, 5335 5194 state->stack[spi].spilled_ptr.parent, 5336 5195 REG_LIVE_READ64); 5196 + /* We do not set REG_LIVE_WRITTEN for stack slot, as we can not 5197 + * be sure that whether stack slot is written to or not. Hence, 5198 + * we must still conservatively propagate reads upwards even if 5199 + * helper may write to the entire memory range. 
5200 + */ 5337 5201 } 5338 5202 return update_stack_depth(env, state, min_off); 5339 5203 } ··· 5588 5442 map->name); 5589 5443 return -EINVAL; 5590 5444 } 5591 - if (!map_value_has_spin_lock(map)) { 5592 - if (map->spin_lock_off == -E2BIG) 5593 - verbose(env, 5594 - "map '%s' has more than one 'struct bpf_spin_lock'\n", 5595 - map->name); 5596 - else if (map->spin_lock_off == -ENOENT) 5597 - verbose(env, 5598 - "map '%s' doesn't have 'struct bpf_spin_lock'\n", 5599 - map->name); 5600 - else 5601 - verbose(env, 5602 - "map '%s' is not a struct type or bpf_spin_lock is mangled\n", 5603 - map->name); 5445 + if (!btf_record_has_field(map->record, BPF_SPIN_LOCK)) { 5446 + verbose(env, "map '%s' has no valid bpf_spin_lock\n", map->name); 5604 5447 return -EINVAL; 5605 5448 } 5606 - if (map->spin_lock_off != val + reg->off) { 5607 - verbose(env, "off %lld doesn't point to 'struct bpf_spin_lock'\n", 5608 - val + reg->off); 5449 + if (map->record->spin_lock_off != val + reg->off) { 5450 + verbose(env, "off %lld doesn't point to 'struct bpf_spin_lock' that is at %d\n", 5451 + val + reg->off, map->record->spin_lock_off); 5609 5452 return -EINVAL; 5610 5453 } 5611 5454 if (is_lock) { ··· 5637 5502 map->name); 5638 5503 return -EINVAL; 5639 5504 } 5640 - if (!map_value_has_timer(map)) { 5641 - if (map->timer_off == -E2BIG) 5642 - verbose(env, 5643 - "map '%s' has more than one 'struct bpf_timer'\n", 5644 - map->name); 5645 - else if (map->timer_off == -ENOENT) 5646 - verbose(env, 5647 - "map '%s' doesn't have 'struct bpf_timer'\n", 5648 - map->name); 5649 - else 5650 - verbose(env, 5651 - "map '%s' is not a struct type or bpf_timer is mangled\n", 5652 - map->name); 5505 + if (!btf_record_has_field(map->record, BPF_TIMER)) { 5506 + verbose(env, "map '%s' has no valid bpf_timer\n", map->name); 5653 5507 return -EINVAL; 5654 5508 } 5655 - if (map->timer_off != val + reg->off) { 5509 + if (map->record->timer_off != val + reg->off) { 5656 5510 verbose(env, "off %lld doesn't point 
to 'struct bpf_timer' that is at %d\n", 5657 - val + reg->off, map->timer_off); 5511 + val + reg->off, map->record->timer_off); 5658 5512 return -EINVAL; 5659 5513 } 5660 5514 if (meta->map_ptr) { ··· 5659 5535 struct bpf_call_arg_meta *meta) 5660 5536 { 5661 5537 struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno]; 5662 - struct bpf_map_value_off_desc *off_desc; 5663 5538 struct bpf_map *map_ptr = reg->map_ptr; 5539 + struct btf_field *kptr_field; 5664 5540 u32 kptr_off; 5665 - int ret; 5666 5541 5667 5542 if (!tnum_is_const(reg->var_off)) { 5668 5543 verbose(env, ··· 5674 5551 map_ptr->name); 5675 5552 return -EINVAL; 5676 5553 } 5677 - if (!map_value_has_kptrs(map_ptr)) { 5678 - ret = PTR_ERR_OR_ZERO(map_ptr->kptr_off_tab); 5679 - if (ret == -E2BIG) 5680 - verbose(env, "map '%s' has more than %d kptr\n", map_ptr->name, 5681 - BPF_MAP_VALUE_OFF_MAX); 5682 - else if (ret == -EEXIST) 5683 - verbose(env, "map '%s' has repeating kptr BTF tags\n", map_ptr->name); 5684 - else 5685 - verbose(env, "map '%s' has no valid kptr\n", map_ptr->name); 5554 + if (!btf_record_has_field(map_ptr->record, BPF_KPTR)) { 5555 + verbose(env, "map '%s' has no valid kptr\n", map_ptr->name); 5686 5556 return -EINVAL; 5687 5557 } 5688 5558 5689 5559 meta->map_ptr = map_ptr; 5690 5560 kptr_off = reg->off + reg->var_off.value; 5691 - off_desc = bpf_map_kptr_off_contains(map_ptr, kptr_off); 5692 - if (!off_desc) { 5561 + kptr_field = btf_record_find(map_ptr->record, kptr_off, BPF_KPTR); 5562 + if (!kptr_field) { 5693 5563 verbose(env, "off=%d doesn't point to kptr\n", kptr_off); 5694 5564 return -EACCES; 5695 5565 } 5696 - if (off_desc->type != BPF_KPTR_REF) { 5566 + if (kptr_field->type != BPF_KPTR_REF) { 5697 5567 verbose(env, "off=%d kptr isn't referenced kptr\n", kptr_off); 5698 5568 return -EACCES; 5699 5569 } 5700 - meta->kptr_off_desc = off_desc; 5570 + meta->kptr_field = kptr_field; 5701 5571 return 0; 5702 5572 } 5703 5573 ··· 5912 5796 } 5913 5797 5914 5798 if 
(meta->func_id == BPF_FUNC_kptr_xchg) { 5915 - if (map_kptr_match_type(env, meta->kptr_off_desc, reg, regno)) 5799 + if (map_kptr_match_type(env, meta->kptr_field, reg, regno)) 5916 5800 return -EACCES; 5917 5801 } else { 5918 5802 if (arg_btf_id == BPF_PTR_POISON) { ··· 6767 6651 struct bpf_func_state *callee, 6768 6652 int insn_idx); 6769 6653 6654 + static int set_callee_state(struct bpf_verifier_env *env, 6655 + struct bpf_func_state *caller, 6656 + struct bpf_func_state *callee, int insn_idx); 6657 + 6770 6658 static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, 6771 6659 int *insn_idx, int subprog, 6772 6660 set_callee_state_fn set_callee_state_cb) ··· 6819 6699 /* continue with next insn after call */ 6820 6700 return 0; 6821 6701 } 6702 + } 6703 + 6704 + /* set_callee_state is used for direct subprog calls, but we are 6705 + * interested in validating only BPF helpers that can call subprogs as 6706 + * callbacks 6707 + */ 6708 + if (set_callee_state_cb != set_callee_state && !is_callback_calling_function(insn->imm)) { 6709 + verbose(env, "verifier bug: helper %s#%d is not marked as callback-calling\n", 6710 + func_id_name(insn->imm), insn->imm); 6711 + return -EFAULT; 6822 6712 } 6823 6713 6824 6714 if (insn->code == (BPF_JMP | BPF_CALL) && ··· 7614 7484 regs[BPF_REG_0].map_uid = meta.map_uid; 7615 7485 regs[BPF_REG_0].type = PTR_TO_MAP_VALUE | ret_flag; 7616 7486 if (!type_may_be_null(ret_type) && 7617 - map_value_has_spin_lock(meta.map_ptr)) { 7487 + btf_record_has_field(meta.map_ptr->record, BPF_SPIN_LOCK)) { 7618 7488 regs[BPF_REG_0].id = ++env->id_gen; 7619 7489 } 7620 7490 break; ··· 7678 7548 mark_reg_known_zero(env, regs, BPF_REG_0); 7679 7549 regs[BPF_REG_0].type = PTR_TO_BTF_ID | ret_flag; 7680 7550 if (func_id == BPF_FUNC_kptr_xchg) { 7681 - ret_btf = meta.kptr_off_desc->kptr.btf; 7682 - ret_btf_id = meta.kptr_off_desc->kptr.btf_id; 7551 + ret_btf = meta.kptr_field->kptr.btf; 7552 + ret_btf_id = 
meta.kptr_field->kptr.btf_id; 7683 7553 } else { 7684 7554 if (fn->ret_btf_id == BPF_PTR_POISON) { 7685 7555 verbose(env, "verifier internal error:"); ··· 9337 9207 return err; 9338 9208 return adjust_ptr_min_max_vals(env, insn, 9339 9209 dst_reg, src_reg); 9210 + } else if (dst_reg->precise) { 9211 + /* if dst_reg is precise, src_reg should be precise as well */ 9212 + err = mark_chain_precision(env, insn->src_reg); 9213 + if (err) 9214 + return err; 9340 9215 } 9341 9216 } else { 9342 9217 /* Pretend the src is a reg with a known value, since we only ··· 10530 10395 insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE) { 10531 10396 dst_reg->type = PTR_TO_MAP_VALUE; 10532 10397 dst_reg->off = aux->map_off; 10533 - if (map_value_has_spin_lock(map)) 10398 + if (btf_record_has_field(map->record, BPF_SPIN_LOCK)) 10534 10399 dst_reg->id = ++env->id_gen; 10535 10400 } else if (insn->src_reg == BPF_PSEUDO_MAP_FD || 10536 10401 insn->src_reg == BPF_PSEUDO_MAP_IDX) { ··· 11655 11520 if (env->explore_alu_limits) 11656 11521 return false; 11657 11522 if (rcur->type == SCALAR_VALUE) { 11658 - if (!rold->precise && !rcur->precise) 11523 + if (!rold->precise) 11659 11524 return true; 11660 11525 /* new val must satisfy old val knowledge */ 11661 11526 return range_within(rold, rcur) && ··· 11978 11843 { 11979 11844 struct bpf_reg_state *state_reg; 11980 11845 struct bpf_func_state *state; 11981 - int i, err = 0; 11846 + int i, err = 0, fr; 11982 11847 11983 - state = old->frame[old->curframe]; 11984 - state_reg = state->regs; 11985 - for (i = 0; i < BPF_REG_FP; i++, state_reg++) { 11986 - if (state_reg->type != SCALAR_VALUE || 11987 - !state_reg->precise) 11988 - continue; 11989 - if (env->log.level & BPF_LOG_LEVEL2) 11990 - verbose(env, "propagating r%d\n", i); 11991 - err = mark_chain_precision(env, i); 11992 - if (err < 0) 11993 - return err; 11994 - } 11848 + for (fr = old->curframe; fr >= 0; fr--) { 11849 + state = old->frame[fr]; 11850 + state_reg = state->regs; 11851 + for (i = 0; 
i < BPF_REG_FP; i++, state_reg++) { 11852 + if (state_reg->type != SCALAR_VALUE || 11853 + !state_reg->precise) 11854 + continue; 11855 + if (env->log.level & BPF_LOG_LEVEL2) 11856 + verbose(env, "frame %d: propagating r%d\n", fr, i); 11857 + err = mark_chain_precision_frame(env, fr, i); 11858 + if (err < 0) 11859 + return err; 11860 + } 11995 - for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) { 11996 - if (!is_spilled_reg(&state->stack[i])) 11997 - continue; 11998 - state_reg = &state->stack[i].spilled_ptr; 11999 - if (state_reg->type != SCALAR_VALUE || 12000 - !state_reg->precise) 12001 - continue; 12002 - if (env->log.level & BPF_LOG_LEVEL2) 12003 - verbose(env, "propagating fp%d\n", 12004 - (-i - 1) * BPF_REG_SIZE); 12005 - err = mark_chain_precision_stack(env, i); 12006 - if (err < 0) 12007 - return err; 11862 + for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) { 11863 + if (!is_spilled_reg(&state->stack[i])) 11864 + continue; 11865 + state_reg = &state->stack[i].spilled_ptr; 11866 + if (state_reg->type != SCALAR_VALUE || 11867 + !state_reg->precise) 11868 + continue; 11869 + if (env->log.level & BPF_LOG_LEVEL2) 11870 + verbose(env, "frame %d: propagating fp%d\n", 11871 + fr, (-i - 1) * BPF_REG_SIZE); 11872 + err = mark_chain_precision_stack_frame(env, fr, i); 11873 + if (err < 0) 11874 + return err; 11875 + } 12009 11876 } 12010 11877 return 0; 12011 11878 } ··· 12201 12064 env->peak_states++; 12202 12065 env->prev_jmps_processed = env->jmps_processed; 12203 12066 env->prev_insn_processed = env->insn_processed; 12067 + 12068 + /* forget precise markings we inherited, see __mark_chain_precision */ 12069 + if (env->bpf_capable) 12070 + mark_all_scalars_imprecise(env, cur); 12204 12071 12205 12072 /* add new state to the head of linked list */ 12206 12073 new = &new_sl->state; ··· 12814 12673 { 12815 12674 enum bpf_prog_type prog_type = resolve_prog_type(prog); 12816 12675 12817 - if (map_value_has_spin_lock(map)) { 12676 + if 
(btf_record_has_field(map->record, BPF_SPIN_LOCK)) { 12818 12677 if (prog_type == BPF_PROG_TYPE_SOCKET_FILTER) { 12819 12678 verbose(env, "socket filter progs cannot use bpf_spin_lock yet\n"); 12820 12679 return -EINVAL; ··· 12831 12690 } 12832 12691 } 12833 12692 12834 - if (map_value_has_timer(map)) { 12693 + if (btf_record_has_field(map->record, BPF_TIMER)) { 12835 12694 if (is_tracing_prog_type(prog_type)) { 12836 12695 verbose(env, "tracing progs cannot use bpf_timer yet\n"); 12837 12696 return -EINVAL; ··· 14754 14613 BPF_MAIN_FUNC /* callsite */, 14755 14614 0 /* frameno */, 14756 14615 subprog); 14616 + state->first_insn_idx = env->subprog_info[subprog].start; 14617 + state->last_insn_idx = -1; 14757 14618 14758 14619 regs = state->frame[state->curframe]->regs; 14759 14620 if (subprog || env->prog->type == BPF_PROG_TYPE_EXT) {
+2 -2
net/core/bpf_sk_storage.c
··· 147 147 if (!copy_selem) 148 148 return NULL; 149 149 150 - if (map_value_has_spin_lock(&smap->map)) 150 + if (btf_record_has_field(smap->map.record, BPF_SPIN_LOCK)) 151 151 copy_map_value_locked(&smap->map, SDATA(copy_selem)->data, 152 152 SDATA(selem)->data, true); 153 153 else ··· 566 566 if (!nla_value) 567 567 goto errout; 568 568 569 - if (map_value_has_spin_lock(&smap->map)) 569 + if (btf_record_has_field(smap->map.record, BPF_SPIN_LOCK)) 570 570 copy_map_value_locked(&smap->map, nla_data(nla_value), 571 571 sdata->data, true); 572 572 else
+35 -8
net/core/filter.c
··· 2126 2126 2127 2127 if (mlen) { 2128 2128 __skb_pull(skb, mlen); 2129 + if (unlikely(!skb->len)) { 2130 + kfree_skb(skb); 2131 + return -ERANGE; 2132 + } 2129 2133 2130 2134 /* At ingress, the mac header has already been pulled once. 2131 2135 * At egress, skb_pospull_rcsum has to be done in case that ··· 8925 8921 bpf_ctx_record_field_size(info, size_default); 8926 8922 return bpf_ctx_narrow_access_ok(off, size, 8927 8923 size_default); 8924 + case offsetof(struct bpf_sock_ops, skb_hwtstamp): 8925 + if (size != sizeof(__u64)) 8926 + return false; 8927 + break; 8928 8928 default: 8929 8929 if (size != size_default) 8930 8930 return false; ··· 9112 9104 return insn; 9113 9105 } 9114 9106 9115 - static struct bpf_insn *bpf_convert_shinfo_access(const struct bpf_insn *si, 9107 + static struct bpf_insn *bpf_convert_shinfo_access(__u8 dst_reg, __u8 skb_reg, 9116 9108 struct bpf_insn *insn) 9117 9109 { 9118 9110 /* si->dst_reg = skb_shinfo(SKB); */ 9119 9111 #ifdef NET_SKBUFF_DATA_USES_OFFSET 9120 9112 *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, end), 9121 - BPF_REG_AX, si->src_reg, 9113 + BPF_REG_AX, skb_reg, 9122 9114 offsetof(struct sk_buff, end)); 9123 9115 *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, head), 9124 - si->dst_reg, si->src_reg, 9116 + dst_reg, skb_reg, 9125 9117 offsetof(struct sk_buff, head)); 9126 - *insn++ = BPF_ALU64_REG(BPF_ADD, si->dst_reg, BPF_REG_AX); 9118 + *insn++ = BPF_ALU64_REG(BPF_ADD, dst_reg, BPF_REG_AX); 9127 9119 #else 9128 9120 *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, end), 9129 - si->dst_reg, si->src_reg, 9121 + dst_reg, skb_reg, 9130 9122 offsetof(struct sk_buff, end)); 9131 9123 #endif 9132 9124 ··· 9517 9509 break; 9518 9510 9519 9511 case offsetof(struct __sk_buff, gso_segs): 9520 - insn = bpf_convert_shinfo_access(si, insn); 9512 + insn = bpf_convert_shinfo_access(si->dst_reg, si->src_reg, insn); 9521 9513 *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct skb_shared_info, gso_segs), 9522 9514 
si->dst_reg, si->dst_reg, 9523 9515 bpf_target_off(struct skb_shared_info, ··· 9525 9517 target_size)); 9526 9518 break; 9527 9519 case offsetof(struct __sk_buff, gso_size): 9528 - insn = bpf_convert_shinfo_access(si, insn); 9520 + insn = bpf_convert_shinfo_access(si->dst_reg, si->src_reg, insn); 9529 9521 *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct skb_shared_info, gso_size), 9530 9522 si->dst_reg, si->dst_reg, 9531 9523 bpf_target_off(struct skb_shared_info, ··· 9552 9544 BUILD_BUG_ON(sizeof_field(struct skb_shared_hwtstamps, hwtstamp) != 8); 9553 9545 BUILD_BUG_ON(offsetof(struct skb_shared_hwtstamps, hwtstamp) != 0); 9554 9546 9555 - insn = bpf_convert_shinfo_access(si, insn); 9547 + insn = bpf_convert_shinfo_access(si->dst_reg, si->src_reg, insn); 9556 9548 *insn++ = BPF_LDX_MEM(BPF_DW, 9557 9549 si->dst_reg, si->dst_reg, 9558 9550 bpf_target_off(struct skb_shared_info, ··· 10402 10394 tcp_flags), 10403 10395 si->dst_reg, si->dst_reg, off); 10404 10396 break; 10397 + case offsetof(struct bpf_sock_ops, skb_hwtstamp): { 10398 + struct bpf_insn *jmp_on_null_skb; 10399 + 10400 + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_sock_ops_kern, 10401 + skb), 10402 + si->dst_reg, si->src_reg, 10403 + offsetof(struct bpf_sock_ops_kern, 10404 + skb)); 10405 + /* Reserve one insn to test skb == NULL */ 10406 + jmp_on_null_skb = insn++; 10407 + insn = bpf_convert_shinfo_access(si->dst_reg, si->dst_reg, insn); 10408 + *insn++ = BPF_LDX_MEM(BPF_DW, si->dst_reg, si->dst_reg, 10409 + bpf_target_off(struct skb_shared_info, 10410 + hwtstamps, 8, 10411 + target_size)); 10412 + *jmp_on_null_skb = BPF_JMP_IMM(BPF_JEQ, si->dst_reg, 0, 10413 + insn - jmp_on_null_skb - 1); 10414 + break; 10415 + } 10405 10416 } 10406 10417 return insn - insn_buf; 10407 10418 }
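The `skb->len == 0` fix at the top of the filter.c hunk can be modelled in a few lines of userspace C. This is a toy sketch with illustrative names (`toy_pkt`, `pull_mac_header`), not the kernel's skb API: after pulling the mac-header length off the front, a packet that has become empty must be dropped instead of redirected onward.

```c
#include <stdbool.h>

struct toy_pkt {
	unsigned int len;	/* stand-in for skb->len */
};

/* Returns true if the packet survives the pull; false means the
 * caller must free it (the kfree_skb() + -ERANGE path above). */
static bool pull_mac_header(struct toy_pkt *pkt, unsigned int mlen)
{
	if (mlen > pkt->len)
		return false;		/* malformed: nothing to pull */
	pkt->len -= mlen;		/* __skb_pull() analogue */
	return pkt->len != 0;		/* the new skb->len == 0 check */
}
```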
+53 -42
samples/bpf/sockex3_kern.c
···
 #define IP_MF 0x2000
 #define IP_OFFSET 0x1FFF
 
-#define PROG(F) SEC("socket/"__stringify(F)) int bpf_func_##F
-
-struct {
-	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
-	__uint(key_size, sizeof(u32));
-	__uint(value_size, sizeof(u32));
-	__uint(max_entries, 8);
-} jmp_table SEC(".maps");
-
 #define PARSE_VLAN 1
 #define PARSE_MPLS 2
 #define PARSE_IP 3
 #define PARSE_IPV6 4
-
-/* Protocol dispatch routine. It tail-calls next BPF program depending
- * on eth proto. Note, we could have used ...
- *
- * bpf_tail_call(skb, &jmp_table, proto);
- *
- * ... but it would need large prog_array and cannot be optimised given
- * the map key is not static.
- */
-static inline void parse_eth_proto(struct __sk_buff *skb, u32 proto)
-{
-	switch (proto) {
-	case ETH_P_8021Q:
-	case ETH_P_8021AD:
-		bpf_tail_call(skb, &jmp_table, PARSE_VLAN);
-		break;
-	case ETH_P_MPLS_UC:
-	case ETH_P_MPLS_MC:
-		bpf_tail_call(skb, &jmp_table, PARSE_MPLS);
-		break;
-	case ETH_P_IP:
-		bpf_tail_call(skb, &jmp_table, PARSE_IP);
-		break;
-	case ETH_P_IPV6:
-		bpf_tail_call(skb, &jmp_table, PARSE_IPV6);
-		break;
-	}
-}
 
 struct vlan_hdr {
 	__be16 h_vlan_TCI;
···
 	};
 	__u32 ip_proto;
 };
+
+static inline void parse_eth_proto(struct __sk_buff *skb, u32 proto);
 
 static inline int ip_is_fragment(struct __sk_buff *ctx, __u64 nhoff)
 {
···
 	}
 }
 
-PROG(PARSE_IP)(struct __sk_buff *skb)
+SEC("socket")
+int bpf_func_ip(struct __sk_buff *skb)
 {
 	struct globals *g = this_cpu_globals();
 	__u32 nhoff, verlen, ip_proto;
···
 	return 0;
 }
 
-PROG(PARSE_IPV6)(struct __sk_buff *skb)
+SEC("socket")
+int bpf_func_ipv6(struct __sk_buff *skb)
 {
 	struct globals *g = this_cpu_globals();
 	__u32 nhoff, ip_proto;
···
 	return 0;
 }
 
-PROG(PARSE_VLAN)(struct __sk_buff *skb)
+SEC("socket")
+int bpf_func_vlan(struct __sk_buff *skb)
 {
 	__u32 nhoff, proto;
 
···
 	return 0;
 }
 
-PROG(PARSE_MPLS)(struct __sk_buff *skb)
+SEC("socket")
+int bpf_func_mpls(struct __sk_buff *skb)
 {
 	__u32 nhoff, label;
 
···
 	return 0;
 }
 
-SEC("socket/0")
+struct {
+	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+	__uint(key_size, sizeof(u32));
+	__uint(max_entries, 8);
+	__array(values, u32 (void *));
+} prog_array_init SEC(".maps") = {
+	.values = {
+		[PARSE_VLAN] = (void *)&bpf_func_vlan,
+		[PARSE_IP] = (void *)&bpf_func_ip,
+		[PARSE_IPV6] = (void *)&bpf_func_ipv6,
+		[PARSE_MPLS] = (void *)&bpf_func_mpls,
+	},
+};
+
+/* Protocol dispatch routine. It tail-calls next BPF program depending
+ * on eth proto. Note, we could have used ...
+ *
+ * bpf_tail_call(skb, &prog_array_init, proto);
+ *
+ * ... but it would need large prog_array and cannot be optimised given
+ * the map key is not static.
+ */
+static inline void parse_eth_proto(struct __sk_buff *skb, u32 proto)
+{
+	switch (proto) {
+	case ETH_P_8021Q:
+	case ETH_P_8021AD:
+		bpf_tail_call(skb, &prog_array_init, PARSE_VLAN);
+		break;
+	case ETH_P_MPLS_UC:
+	case ETH_P_MPLS_MC:
+		bpf_tail_call(skb, &prog_array_init, PARSE_MPLS);
+		break;
+	case ETH_P_IP:
+		bpf_tail_call(skb, &prog_array_init, PARSE_IP);
+		break;
+	case ETH_P_IPV6:
+		bpf_tail_call(skb, &prog_array_init, PARSE_IPV6);
+		break;
+	}
+}
+
+SEC("socket")
 int main_prog(struct __sk_buff *skb)
 {
 	__u32 nhoff = ETH_HLEN;
+10 -13
samples/bpf/sockex3_user.c
···
 
 int main(int argc, char **argv)
 {
-	int i, sock, key, fd, main_prog_fd, jmp_table_fd, hash_map_fd;
+	int i, sock, fd, main_prog_fd, hash_map_fd;
 	struct bpf_program *prog;
 	struct bpf_object *obj;
-	const char *section;
 	char filename[256];
 	FILE *f;
···
 		goto cleanup;
 	}
 
-	jmp_table_fd = bpf_object__find_map_fd_by_name(obj, "jmp_table");
 	hash_map_fd = bpf_object__find_map_fd_by_name(obj, "hash_map");
-	if (jmp_table_fd < 0 || hash_map_fd < 0) {
+	if (hash_map_fd < 0) {
 		fprintf(stderr, "ERROR: finding a map in obj file failed\n");
 		goto cleanup;
 	}
 
+	/* find BPF main program */
+	main_prog_fd = 0;
 	bpf_object__for_each_program(prog, obj) {
 		fd = bpf_program__fd(prog);
 
-		section = bpf_program__section_name(prog);
-		if (sscanf(section, "socket/%d", &key) != 1) {
-			fprintf(stderr, "ERROR: finding prog failed\n");
-			goto cleanup;
-		}
-
-		if (key == 0)
+		if (!strcmp(bpf_program__name(prog), "main_prog"))
 			main_prog_fd = fd;
-		else
-			bpf_map_update_elem(jmp_table_fd, &key, &fd, BPF_ANY);
+	}
+
+	if (main_prog_fd == 0) {
+		fprintf(stderr, "ERROR: can't find main_prog\n");
+		goto cleanup;
 	}
 
 	sock = open_raw_sock("lo");
+2 -2
samples/bpf/tracex2_kern.c
···
 /* kprobe is NOT a stable ABI. If kernel internals change this bpf+kprobe
  * example will no longer be meaningful
  */
-SEC("kprobe/kfree_skb")
+SEC("kprobe/kfree_skb_reason")
 int bpf_prog2(struct pt_regs *ctx)
 {
 	long loc = 0;
 	long init_val = 1;
 	long *value;
 
-	/* read ip of kfree_skb caller.
+	/* read ip of kfree_skb_reason caller.
 	 * non-portable version of __builtin_return_address(0)
 	 */
 	BPF_KPROBE_READ_RET_IP(loc, ctx);
+2 -1
samples/bpf/tracex2_user.c
···
 	signal(SIGINT, int_exit);
 	signal(SIGTERM, int_exit);
 
-	/* start 'ping' in the background to have some kfree_skb events */
+	/* start 'ping' in the background to have some kfree_skb_reason
+	 * events */
 	f = popen("ping -4 -c5 localhost", "r");
 	(void) f;
+9 -16
tools/bpf/bpftool/btf.c
···
 		if (!btf_id)
 			continue;
 
-		err = hashmap__append(tab, u32_as_hash_field(btf_id),
-				      u32_as_hash_field(id));
+		err = hashmap__append(tab, btf_id, id);
 		if (err) {
 			p_err("failed to append entry to hashmap for BTF ID %u, object ID %u: %s",
 			      btf_id, id, strerror(-err));
···
 		printf("size %uB", info->btf_size);
 
 	n = 0;
-	hashmap__for_each_key_entry(btf_prog_table, entry,
-				    u32_as_hash_field(info->id)) {
-		printf("%s%u", n++ == 0 ? " prog_ids " : ",",
-		       hash_field_as_u32(entry->value));
+	hashmap__for_each_key_entry(btf_prog_table, entry, info->id) {
+		printf("%s%lu", n++ == 0 ? " prog_ids " : ",", entry->value);
 	}
 
 	n = 0;
-	hashmap__for_each_key_entry(btf_map_table, entry,
-				    u32_as_hash_field(info->id)) {
-		printf("%s%u", n++ == 0 ? " map_ids " : ",",
-		       hash_field_as_u32(entry->value));
+	hashmap__for_each_key_entry(btf_map_table, entry, info->id) {
+		printf("%s%lu", n++ == 0 ? " map_ids " : ",", entry->value);
 	}
 
 	emit_obj_refs_plain(refs_table, info->id, "\n\tpids ");
···
 
 	jsonw_name(json_wtr, "prog_ids");
 	jsonw_start_array(json_wtr);	/* prog_ids */
-	hashmap__for_each_key_entry(btf_prog_table, entry,
-				    u32_as_hash_field(info->id)) {
-		jsonw_uint(json_wtr, hash_field_as_u32(entry->value));
+	hashmap__for_each_key_entry(btf_prog_table, entry, info->id) {
+		jsonw_uint(json_wtr, entry->value);
 	}
 	jsonw_end_array(json_wtr);	/* prog_ids */
 
 	jsonw_name(json_wtr, "map_ids");
 	jsonw_start_array(json_wtr);	/* map_ids */
-	hashmap__for_each_key_entry(btf_map_table, entry,
-				    u32_as_hash_field(info->id)) {
-		jsonw_uint(json_wtr, hash_field_as_u32(entry->value));
+	hashmap__for_each_key_entry(btf_map_table, entry, info->id) {
+		jsonw_uint(json_wtr, entry->value);
 	}
 	jsonw_end_array(json_wtr);	/* map_ids */
+5 -5
tools/bpf/bpftool/common.c
···
 		goto out_close;
 	}
 
-	err = hashmap__append(build_fn_table, u32_as_hash_field(pinned_info.id), path);
+	err = hashmap__append(build_fn_table, pinned_info.id, path);
 	if (err) {
 		p_err("failed to append entry to hashmap for ID %u, path '%s': %s",
 		      pinned_info.id, path, strerror(errno));
···
 		return;
 
 	hashmap__for_each_entry(map, entry, bkt)
-		free(entry->value);
+		free(entry->pvalue);
 
 	hashmap__free(map);
 }
···
 	return fd;
 }
 
-size_t hash_fn_for_key_as_id(const void *key, void *ctx)
+size_t hash_fn_for_key_as_id(long key, void *ctx)
 {
-	return (size_t)key;
+	return key;
 }
 
-bool equal_fn_for_key_as_id(const void *k1, const void *k2, void *ctx)
+bool equal_fn_for_key_as_id(long k1, long k2, void *ctx)
 {
 	return k1 == k2;
 }
+7 -12
tools/bpf/bpftool/gen.c
···
 	struct btf *marked_btf; /* btf structure used to mark used types */
 };
 
-static size_t btfgen_hash_fn(const void *key, void *ctx)
+static size_t btfgen_hash_fn(long key, void *ctx)
 {
-	return (size_t)key;
+	return key;
 }
 
-static bool btfgen_equal_fn(const void *k1, const void *k2, void *ctx)
+static bool btfgen_equal_fn(long k1, long k2, void *ctx)
 {
 	return k1 == k2;
-}
-
-static void *u32_as_hash_key(__u32 x)
-{
-	return (void *)(uintptr_t)x;
 }
 
 static void btfgen_free_info(struct btfgen_info *info)
···
 	struct bpf_core_spec specs_scratch[3] = {};
 	struct bpf_core_relo_res targ_res = {};
 	struct bpf_core_cand_list *cands = NULL;
-	const void *type_key = u32_as_hash_key(relo->type_id);
 	const char *sec_name = btf__name_by_offset(btf, sec->sec_name_off);
 
 	if (relo->kind != BPF_CORE_TYPE_ID_LOCAL &&
-	    !hashmap__find(cand_cache, type_key, (void **)&cands)) {
+	    !hashmap__find(cand_cache, relo->type_id, &cands)) {
 		cands = btfgen_find_cands(btf, info->src_btf, relo->type_id);
 		if (!cands) {
 			err = -errno;
 			goto out;
 		}
 
-		err = hashmap__set(cand_cache, type_key, cands, NULL, NULL);
+		err = hashmap__set(cand_cache, relo->type_id, cands,
+				   NULL, NULL);
 		if (err)
 			goto out;
 	}
···
 
 	if (!IS_ERR_OR_NULL(cand_cache)) {
 		hashmap__for_each_entry(cand_cache, entry, i) {
-			bpf_core_free_cands(entry->value);
+			bpf_core_free_cands(entry->pvalue);
 		}
 		hashmap__free(cand_cache);
 	}
+4 -6
tools/bpf/bpftool/link.c
···
 
 		jsonw_name(json_wtr, "pinned");
 		jsonw_start_array(json_wtr);
-		hashmap__for_each_key_entry(link_table, entry,
-					    u32_as_hash_field(info->id))
-			jsonw_string(json_wtr, entry->value);
+		hashmap__for_each_key_entry(link_table, entry, info->id)
+			jsonw_string(json_wtr, entry->pvalue);
 		jsonw_end_array(json_wtr);
 	}
···
 	if (!hashmap__empty(link_table)) {
 		struct hashmap_entry *entry;
 
-		hashmap__for_each_key_entry(link_table, entry,
-					    u32_as_hash_field(info->id))
-			printf("\n\tpinned %s", (char *)entry->value);
+		hashmap__for_each_key_entry(link_table, entry, info->id)
+			printf("\n\tpinned %s", (char *)entry->pvalue);
 	}
 	emit_obj_refs_plain(refs_table, info->id, "\n\tpids ");
+2 -12
tools/bpf/bpftool/main.h
···
 int print_all_levels(__maybe_unused enum libbpf_print_level level,
 		     const char *format, va_list args);
 
-size_t hash_fn_for_key_as_id(const void *key, void *ctx);
-bool equal_fn_for_key_as_id(const void *k1, const void *k2, void *ctx);
+size_t hash_fn_for_key_as_id(long key, void *ctx);
+bool equal_fn_for_key_as_id(long k1, long k2, void *ctx);
 
 /* bpf_attach_type_input_str - convert the provided attach type value into a
  * textual representation that we accept for input purposes.
···
  * returned for unknown bpf_attach_type values.
  */
 const char *bpf_attach_type_input_str(enum bpf_attach_type t);
-
-static inline void *u32_as_hash_field(__u32 x)
-{
-	return (void *)(uintptr_t)x;
-}
-
-static inline __u32 hash_field_as_u32(const void *x)
-{
-	return (__u32)(uintptr_t)x;
-}
 
 static inline bool hashmap__empty(struct hashmap *map)
 {
+4 -6
tools/bpf/bpftool/map.c
···
 
 		jsonw_name(json_wtr, "pinned");
 		jsonw_start_array(json_wtr);
-		hashmap__for_each_key_entry(map_table, entry,
-					    u32_as_hash_field(info->id))
-			jsonw_string(json_wtr, entry->value);
+		hashmap__for_each_key_entry(map_table, entry, info->id)
+			jsonw_string(json_wtr, entry->pvalue);
 		jsonw_end_array(json_wtr);
 	}
···
 	if (!hashmap__empty(map_table)) {
 		struct hashmap_entry *entry;
 
-		hashmap__for_each_key_entry(map_table, entry,
-					    u32_as_hash_field(info->id))
-			printf("\n\tpinned %s", (char *)entry->value);
+		hashmap__for_each_key_entry(map_table, entry, info->id)
+			printf("\n\tpinned %s", (char *)entry->pvalue);
 	}
 
 	if (frozen_str) {
+8 -8
tools/bpf/bpftool/pids.c
···
 	int err, i;
 	void *tmp;
 
-	hashmap__for_each_key_entry(map, entry, u32_as_hash_field(e->id)) {
-		refs = entry->value;
+	hashmap__for_each_key_entry(map, entry, e->id) {
+		refs = entry->pvalue;
 
 		for (i = 0; i < refs->ref_cnt; i++) {
 			if (refs->refs[i].pid == e->pid)
···
 	refs->has_bpf_cookie = e->has_bpf_cookie;
 	refs->bpf_cookie = e->bpf_cookie;
 
-	err = hashmap__append(map, u32_as_hash_field(e->id), refs);
+	err = hashmap__append(map, e->id, refs);
 	if (err)
 		p_err("failed to append entry to hashmap for ID %u: %s",
 		      e->id, strerror(errno));
···
 		return;
 
 	hashmap__for_each_entry(map, entry, bkt) {
-		struct obj_refs *refs = entry->value;
+		struct obj_refs *refs = entry->pvalue;
 
 		free(refs->refs);
 		free(refs);
···
 	if (hashmap__empty(map))
 		return;
 
-	hashmap__for_each_key_entry(map, entry, u32_as_hash_field(id)) {
-		struct obj_refs *refs = entry->value;
+	hashmap__for_each_key_entry(map, entry, id) {
+		struct obj_refs *refs = entry->pvalue;
 		int i;
 
 		if (refs->ref_cnt == 0)
···
 	if (hashmap__empty(map))
 		return;
 
-	hashmap__for_each_key_entry(map, entry, u32_as_hash_field(id)) {
-		struct obj_refs *refs = entry->value;
+	hashmap__for_each_key_entry(map, entry, id) {
+		struct obj_refs *refs = entry->pvalue;
 		int i;
 
 		if (refs->ref_cnt == 0)
+4 -6
tools/bpf/bpftool/prog.c
···
 
 		jsonw_name(json_wtr, "pinned");
 		jsonw_start_array(json_wtr);
-		hashmap__for_each_key_entry(prog_table, entry,
-					    u32_as_hash_field(info->id))
-			jsonw_string(json_wtr, entry->value);
+		hashmap__for_each_key_entry(prog_table, entry, info->id)
+			jsonw_string(json_wtr, entry->pvalue);
 		jsonw_end_array(json_wtr);
 	}
···
 	if (!hashmap__empty(prog_table)) {
 		struct hashmap_entry *entry;
 
-		hashmap__for_each_key_entry(prog_table, entry,
-					    u32_as_hash_field(info->id))
-			printf("\n\tpinned %s", (char *)entry->value);
+		hashmap__for_each_key_entry(prog_table, entry, info->id)
+			printf("\n\tpinned %s", (char *)entry->pvalue);
 	}
 
 	if (info->btf_id)
+1
tools/include/uapi/linux/bpf.h
···
 					 * the outgoing header has not
 					 * been written yet.
 					 */
+	__u64 skb_hwtstamp;
 };
 
 /* Definitions for bpf_sock_ops_cb_flags */
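For context, `skb_hwtstamp` exposes the skb's hardware timestamp (in nanoseconds, 0 when the NIC took none) to sockops programs. A minimal consumer could look like the sketch below; the program name, section placement, and logging are illustrative only (not part of this patch) and assume a kernel with this field plus libbpf's vmlinux.h:

/* Illustrative sockops sketch (not from this series): log the hardware
 * RX timestamp when the NIC provided one; skb_hwtstamp reads as 0 otherwise. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

SEC("sockops")
int log_hwtstamp(struct bpf_sock_ops *skops)
{
	if (skops->skb_hwtstamp)
		bpf_printk("hw rx tstamp: %llu ns", skops->skb_hwtstamp);
	return 1;
}

char _license[] SEC("license") = "GPL";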
+184 -75
tools/lib/bpf/btf.c
···
 static int btf_rewrite_str(__u32 *str_off, void *ctx)
 {
 	struct btf_pipe *p = ctx;
-	void *mapped_off;
+	long mapped_off;
 	int off, err;
 
 	if (!*str_off) /* nothing to do for empty strings */
 		return 0;
 
 	if (p->str_off_map &&
-	    hashmap__find(p->str_off_map, (void *)(long)*str_off, &mapped_off)) {
-		*str_off = (__u32)(long)mapped_off;
+	    hashmap__find(p->str_off_map, *str_off, &mapped_off)) {
+		*str_off = mapped_off;
 		return 0;
 	}
···
 	 * performing expensive string comparisons.
 	 */
 	if (p->str_off_map) {
-		err = hashmap__append(p->str_off_map, (void *)(long)*str_off, (void *)(long)off);
+		err = hashmap__append(p->str_off_map, *str_off, off);
 		if (err)
 			return err;
 	}
···
 	return 0;
 }
 
-static size_t btf_dedup_identity_hash_fn(const void *key, void *ctx);
-static bool btf_dedup_equal_fn(const void *k1, const void *k2, void *ctx);
+static size_t btf_dedup_identity_hash_fn(long key, void *ctx);
+static bool btf_dedup_equal_fn(long k1, long k2, void *ctx);
 
 int btf__add_btf(struct btf *btf, const struct btf *src_btf)
 {
···
 static int btf_dedup_prim_types(struct btf_dedup *d);
 static int btf_dedup_struct_types(struct btf_dedup *d);
 static int btf_dedup_ref_types(struct btf_dedup *d);
+static int btf_dedup_resolve_fwds(struct btf_dedup *d);
 static int btf_dedup_compact_types(struct btf_dedup *d);
 static int btf_dedup_remap_types(struct btf_dedup *d);
 
···
  * Algorithm summary
  * =================
  *
- * Algorithm completes its work in 6 separate passes:
+ * Algorithm completes its work in 7 separate passes:
  *
  * 1. Strings deduplication.
  * 2. Primitive types deduplication (int, enum, fwd).
  * 3. Struct/union types deduplication.
- * 4. Reference types deduplication (pointers, typedefs, arrays, funcs, func
+ * 4. Resolve unambiguous forward declarations.
+ * 5. Reference types deduplication (pointers, typedefs, arrays, funcs, func
  *    protos, and const/volatile/restrict modifiers).
- * 5. Types compaction.
- * 6. Types remapping.
+ * 6. Types compaction.
+ * 7. Types remapping.
  *
  * Algorithm determines canonical type descriptor, which is a single
  * representative type for each truly unique type. This canonical type is the
···
 	err = btf_dedup_struct_types(d);
 	if (err < 0) {
 		pr_debug("btf_dedup_struct_types failed:%d\n", err);
+		goto done;
+	}
+	err = btf_dedup_resolve_fwds(d);
+	if (err < 0) {
+		pr_debug("btf_dedup_resolve_fwds failed:%d\n", err);
 		goto done;
 	}
 	err = btf_dedup_ref_types(d);
···
 }
 
 #define for_each_dedup_cand(d, node, hash) \
-	hashmap__for_each_key_entry(d->dedup_table, node, (void *)hash)
+	hashmap__for_each_key_entry(d->dedup_table, node, hash)
 
 static int btf_dedup_table_add(struct btf_dedup *d, long hash, __u32 type_id)
 {
-	return hashmap__append(d->dedup_table,
-			       (void *)hash, (void *)(long)type_id);
+	return hashmap__append(d->dedup_table, hash, type_id);
 }
 
 static int btf_dedup_hypot_map_add(struct btf_dedup *d,
···
 	free(d);
 }
 
-static size_t btf_dedup_identity_hash_fn(const void *key, void *ctx)
+static size_t btf_dedup_identity_hash_fn(long key, void *ctx)
 {
-	return (size_t)key;
+	return key;
 }
 
-static size_t btf_dedup_collision_hash_fn(const void *key, void *ctx)
+static size_t btf_dedup_collision_hash_fn(long key, void *ctx)
 {
 	return 0;
 }
 
-static bool btf_dedup_equal_fn(const void *k1, const void *k2, void *ctx)
+static bool btf_dedup_equal_fn(long k1, long k2, void *ctx)
 {
 	return k1 == k2;
 }
···
 {
 	long h;
 
-	/* don't hash vlen and enum members to support enum fwd resolving */
+	/* don't hash vlen, enum members and size to support enum fwd resolving */
 	h = hash_combine(0, t->name_off);
-	h = hash_combine(h, t->info & ~0xffff);
-	h = hash_combine(h, t->size);
 	return h;
 }
 
-/* Check structural equality of two ENUMs. */
-static bool btf_equal_enum(struct btf_type *t1, struct btf_type *t2)
+static bool btf_equal_enum_members(struct btf_type *t1, struct btf_type *t2)
 {
 	const struct btf_enum *m1, *m2;
 	__u16 vlen;
 	int i;
-
-	if (!btf_equal_common(t1, t2))
-		return false;
 
 	vlen = btf_vlen(t1);
 	m1 = btf_enum(t1);
···
 	return true;
 }
 
-static bool btf_equal_enum64(struct btf_type *t1, struct btf_type *t2)
+static bool btf_equal_enum64_members(struct btf_type *t1, struct btf_type *t2)
 {
 	const struct btf_enum64 *m1, *m2;
 	__u16 vlen;
 	int i;
-
-	if (!btf_equal_common(t1, t2))
-		return false;
 
 	vlen = btf_vlen(t1);
 	m1 = btf_enum64(t1);
···
 	return true;
 }
 
+/* Check structural equality of two ENUMs or ENUM64s. */
+static bool btf_equal_enum(struct btf_type *t1, struct btf_type *t2)
+{
+	if (!btf_equal_common(t1, t2))
+		return false;
+
+	/* t1 & t2 kinds are identical because of btf_equal_common */
+	if (btf_kind(t1) == BTF_KIND_ENUM)
+		return btf_equal_enum_members(t1, t2);
+	else
+		return btf_equal_enum64_members(t1, t2);
+}
+
 static inline bool btf_is_enum_fwd(struct btf_type *t)
 {
 	return btf_is_any_enum(t) && btf_vlen(t) == 0;
···
 {
 	if (!btf_is_enum_fwd(t1) && !btf_is_enum_fwd(t2))
 		return btf_equal_enum(t1, t2);
-	/* ignore vlen when comparing */
+	/* At this point either t1 or t2 or both are forward declarations, thus:
+	 * - skip comparing vlen because it is zero for forward declarations;
+	 * - skip comparing size to allow enum forward declarations
+	 *   to be compatible with enum64 full declarations;
+	 * - skip comparing kind for the same reason.
+	 */
 	return t1->name_off == t2->name_off &&
-	       (t1->info & ~0xffff) == (t2->info & ~0xffff) &&
-	       t1->size == t2->size;
-}
-
-static bool btf_compat_enum64(struct btf_type *t1, struct btf_type *t2)
-{
-	if (!btf_is_enum_fwd(t1) && !btf_is_enum_fwd(t2))
-		return btf_equal_enum64(t1, t2);
-
-	/* ignore vlen when comparing */
-	return t1->name_off == t2->name_off &&
-	       (t1->info & ~0xffff) == (t2->info & ~0xffff) &&
-	       t1->size == t2->size;
+	       btf_is_any_enum(t1) && btf_is_any_enum(t2);
 }
 
 /*
···
 	case BTF_KIND_INT:
 		h = btf_hash_int_decl_tag(t);
 		for_each_dedup_cand(d, hash_entry, h) {
-			cand_id = (__u32)(long)hash_entry->value;
+			cand_id = hash_entry->value;
 			cand = btf_type_by_id(d->btf, cand_id);
 			if (btf_equal_int_tag(t, cand)) {
 				new_id = cand_id;
···
 		break;
 
 	case BTF_KIND_ENUM:
+	case BTF_KIND_ENUM64:
 		h = btf_hash_enum(t);
 		for_each_dedup_cand(d, hash_entry, h) {
-			cand_id = (__u32)(long)hash_entry->value;
+			cand_id = hash_entry->value;
 			cand = btf_type_by_id(d->btf, cand_id);
 			if (btf_equal_enum(t, cand)) {
 				new_id = cand_id;
···
 		}
 		break;
 
-	case BTF_KIND_ENUM64:
-		h = btf_hash_enum(t);
-		for_each_dedup_cand(d, hash_entry, h) {
-			cand_id = (__u32)(long)hash_entry->value;
-			cand = btf_type_by_id(d->btf, cand_id);
-			if (btf_equal_enum64(t, cand)) {
-				new_id = cand_id;
-				break;
-			}
-			if (btf_compat_enum64(t, cand)) {
-				if (btf_is_enum_fwd(t)) {
-					/* resolve fwd to full enum */
-					new_id = cand_id;
-					break;
-				}
-				/* resolve canonical enum fwd to full enum */
-				d->map[cand_id] = type_id;
-			}
-		}
-		break;
-
 	case BTF_KIND_FWD:
 	case BTF_KIND_FLOAT:
 		h = btf_hash_common(t);
 		for_each_dedup_cand(d, hash_entry, h) {
-			cand_id = (__u32)(long)hash_entry->value;
+			cand_id = hash_entry->value;
 			cand = btf_type_by_id(d->btf, cand_id);
 			if (btf_equal_common(t, cand)) {
 				new_id = cand_id;
···
 		return btf_equal_int_tag(cand_type, canon_type);
 
 	case BTF_KIND_ENUM:
-		return btf_compat_enum(cand_type, canon_type);
-
 	case BTF_KIND_ENUM64:
-		return btf_compat_enum64(cand_type, canon_type);
+		return btf_compat_enum(cand_type, canon_type);
 
 	case BTF_KIND_FWD:
 	case BTF_KIND_FLOAT:
···
 
 	h = btf_hash_struct(t);
 	for_each_dedup_cand(d, hash_entry, h) {
-		__u32 cand_id = (__u32)(long)hash_entry->value;
+		__u32 cand_id = hash_entry->value;
 		int eq;
 
 		/*
···
 
 	h = btf_hash_common(t);
 	for_each_dedup_cand(d, hash_entry, h) {
-		cand_id = (__u32)(long)hash_entry->value;
+		cand_id = hash_entry->value;
 		cand = btf_type_by_id(d->btf, cand_id);
 		if (btf_equal_common(t, cand)) {
 			new_id = cand_id;
···
 
 	h = btf_hash_int_decl_tag(t);
 	for_each_dedup_cand(d, hash_entry, h) {
-		cand_id = (__u32)(long)hash_entry->value;
+		cand_id = hash_entry->value;
 		cand = btf_type_by_id(d->btf, cand_id);
 		if (btf_equal_int_tag(t, cand)) {
 			new_id = cand_id;
···
 
 	h = btf_hash_array(t);
 	for_each_dedup_cand(d, hash_entry, h) {
-		cand_id = (__u32)(long)hash_entry->value;
+		cand_id = hash_entry->value;
 		cand = btf_type_by_id(d->btf, cand_id);
 		if (btf_equal_array(t, cand)) {
 			new_id = cand_id;
···
 
 	h = btf_hash_fnproto(t);
 	for_each_dedup_cand(d, hash_entry, h) {
-		cand_id = (__u32)(long)hash_entry->value;
+		cand_id = hash_entry->value;
 		cand = btf_type_by_id(d->btf, cand_id);
 		if (btf_equal_fnproto(t, cand)) {
 			new_id = cand_id;
···
 	hashmap__free(d->dedup_table);
 	d->dedup_table = NULL;
 	return 0;
+}
+
+/*
+ * Collect a map from type names to type ids for all canonical structs
+ * and unions. If the same name is shared by several canonical types
+ * use a special value 0 to indicate this fact.
+ */
+static int btf_dedup_fill_unique_names_map(struct btf_dedup *d, struct hashmap *names_map)
+{
+	__u32 nr_types = btf__type_cnt(d->btf);
+	struct btf_type *t;
+	__u32 type_id;
+	__u16 kind;
+	int err;
+
+	/*
+	 * Iterate over base and split module ids in order to get all
+	 * available structs in the map.
+	 */
+	for (type_id = 1; type_id < nr_types; ++type_id) {
+		t = btf_type_by_id(d->btf, type_id);
+		kind = btf_kind(t);
+
+		if (kind != BTF_KIND_STRUCT && kind != BTF_KIND_UNION)
+			continue;
+
+		/* Skip non-canonical types */
+		if (type_id != d->map[type_id])
+			continue;
+
+		err = hashmap__add(names_map, t->name_off, type_id);
+		if (err == -EEXIST)
+			err = hashmap__set(names_map, t->name_off, 0, NULL, NULL);
+
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int btf_dedup_resolve_fwd(struct btf_dedup *d, struct hashmap *names_map, __u32 type_id)
+{
+	struct btf_type *t = btf_type_by_id(d->btf, type_id);
+	enum btf_fwd_kind fwd_kind = btf_kflag(t);
+	__u16 cand_kind, kind = btf_kind(t);
+	struct btf_type *cand_t;
+	uintptr_t cand_id;
+
+	if (kind != BTF_KIND_FWD)
+		return 0;
+
+	/* Skip if this FWD already has a mapping */
+	if (type_id != d->map[type_id])
+		return 0;
+
+	if (!hashmap__find(names_map, t->name_off, &cand_id))
+		return 0;
+
+	/* Zero is a special value indicating that name is not unique */
+	if (!cand_id)
+		return 0;
+
+	cand_t = btf_type_by_id(d->btf, cand_id);
+	cand_kind = btf_kind(cand_t);
+	if ((cand_kind == BTF_KIND_STRUCT && fwd_kind != BTF_FWD_STRUCT) ||
+	    (cand_kind == BTF_KIND_UNION && fwd_kind != BTF_FWD_UNION))
+		return 0;
+
+	d->map[type_id] = cand_id;
+
+	return 0;
+}
+
+/*
+ * Resolve unambiguous forward declarations.
+ *
+ * The lion's share of all FWD declarations is resolved during
+ * `btf_dedup_struct_types` phase when different type graphs are
+ * compared against each other. However, if in some compilation unit a
+ * FWD declaration is not a part of a type graph compared against
+ * another type graph that declaration's canonical type would not be
+ * changed. Example:
+ *
+ * CU #1:
+ *
+ * struct foo;
+ * struct foo *some_global;
+ *
+ * CU #2:
+ *
+ * struct foo { int u; };
+ * struct foo *another_global;
+ *
+ * After `btf_dedup_struct_types` the BTF looks as follows:
+ *
+ * [1] STRUCT 'foo' size=4 vlen=1 ...
+ * [2] INT 'int' size=4 ...
+ * [3] PTR '(anon)' type_id=1
+ * [4] FWD 'foo' fwd_kind=struct
+ * [5] PTR '(anon)' type_id=4
+ *
+ * This pass assumes that such FWD declarations should be mapped to
+ * structs or unions with identical name in case if the name is not
+ * ambiguous.
+ */
+static int btf_dedup_resolve_fwds(struct btf_dedup *d)
+{
+	int i, err;
+	struct hashmap *names_map;
+
+	names_map = hashmap__new(btf_dedup_identity_hash_fn, btf_dedup_equal_fn, NULL);
+	if (IS_ERR(names_map))
+		return PTR_ERR(names_map);
+
+	err = btf_dedup_fill_unique_names_map(d, names_map);
+	if (err < 0)
+		goto exit;
+
+	for (i = 0; i < d->btf->nr_types; i++) {
+		err = btf_dedup_resolve_fwd(d, names_map, d->btf->start_id + i);
+		if (err < 0)
+			break;
+	}
+
+exit:
+	hashmap__free(names_map);
+	return err;
 }
 
 /*
+7 -8
tools/lib/bpf/btf_dump.c
···
 	struct btf_dump_data *typed_dump;
 };
 
-static size_t str_hash_fn(const void *key, void *ctx)
+static size_t str_hash_fn(long key, void *ctx)
 {
-	return str_hash(key);
+	return str_hash((void *)key);
 }
 
-static bool str_equal_fn(const void *a, const void *b, void *ctx)
+static bool str_equal_fn(long a, long b, void *ctx)
 {
-	return strcmp(a, b) == 0;
+	return strcmp((void *)a, (void *)b) == 0;
 }
 
 static const char *btf_name_of(const struct btf_dump *d, __u32 name_off)
···
 	struct hashmap_entry *cur;
 
 	hashmap__for_each_entry(map, cur, bkt)
-		free((void *)cur->key);
+		free((void *)cur->pkey);
 
 	hashmap__free(map);
 }
···
 	if (!new_name)
 		return 1;
 
-	hashmap__find(name_map, orig_name, (void **)&dup_cnt);
+	hashmap__find(name_map, orig_name, &dup_cnt);
 	dup_cnt++;
 
-	err = hashmap__set(name_map, new_name, (void *)dup_cnt,
-			   (const void **)&old_name, NULL);
+	err = hashmap__set(name_map, new_name, dup_cnt, &old_name, NULL);
 	if (err)
 		free(new_name);
+9 -9
tools/lib/bpf/hashmap.c
···
 }
 
 static bool hashmap_find_entry(const struct hashmap *map,
-			       const void *key, size_t hash,
+			       const long key, size_t hash,
 			       struct hashmap_entry ***pprev,
 			       struct hashmap_entry **entry)
 {
···
 	return false;
 }
 
-int hashmap__insert(struct hashmap *map, const void *key, void *value,
-		    enum hashmap_insert_strategy strategy,
-		    const void **old_key, void **old_value)
+int hashmap_insert(struct hashmap *map, long key, long value,
+		   enum hashmap_insert_strategy strategy,
+		   long *old_key, long *old_value)
 {
 	struct hashmap_entry *entry;
 	size_t h;
 	int err;
 
 	if (old_key)
-		*old_key = NULL;
+		*old_key = 0;
 	if (old_value)
-		*old_value = NULL;
+		*old_value = 0;
 
 	h = hash_bits(map->hash_fn(key, map->ctx), map->cap_bits);
 	if (strategy != HASHMAP_APPEND &&
···
 	return 0;
 }
 
-bool hashmap__find(const struct hashmap *map, const void *key, void **value)
+bool hashmap_find(const struct hashmap *map, long key, long *value)
 {
 	struct hashmap_entry *entry;
 	size_t h;
···
 	return true;
 }
 
-bool hashmap__delete(struct hashmap *map, const void *key,
-		     const void **old_key, void **old_value)
+bool hashmap_delete(struct hashmap *map, long key,
+		    long *old_key, long *old_value)
 {
 	struct hashmap_entry **pprev, *entry;
 	size_t h;
+57 -34
tools/lib/bpf/hashmap.h
···
 	return h;
 }
 
-typedef size_t (*hashmap_hash_fn)(const void *key, void *ctx);
-typedef bool (*hashmap_equal_fn)(const void *key1, const void *key2, void *ctx);
+typedef size_t (*hashmap_hash_fn)(long key, void *ctx);
+typedef bool (*hashmap_equal_fn)(long key1, long key2, void *ctx);
 
+/*
+ * Hashmap interface is polymorphic, keys and values could be either
+ * long-sized integers or pointers, this is achieved as follows:
+ * - interface functions that operate on keys and values are hidden
+ *   behind auxiliary macros, e.g. hashmap_insert <-> hashmap__insert;
+ * - these auxiliary macros cast the key and value parameters as
+ *   long or long *, so the user does not have to specify the casts explicitly;
+ * - for pointer parameters (e.g. old_key) the size of the pointed
+ *   type is verified by hashmap_cast_ptr using _Static_assert;
+ * - when iterating using hashmap__for_each_* forms
+ *   hashmap_entry->key should be used for integer keys and
+ *   hashmap_entry->pkey should be used for pointer keys,
+ *   same goes for values.
+ */
 struct hashmap_entry {
-	const void *key;
-	void *value;
+	union {
+		long key;
+		const void *pkey;
+	};
+	union {
+		long value;
+		void *pvalue;
+	};
 	struct hashmap_entry *next;
 };
 
···
 	HASHMAP_APPEND,
 };
 
+#define hashmap_cast_ptr(p) ({								\
+	_Static_assert((__builtin_constant_p((p)) ? (p) == NULL : 0) ||			\
+		       sizeof(*(p)) == sizeof(long),					\
+		       #p " pointee should be a long-sized integer or a pointer");	\
+	(long *)(p);									\
+})
+
 /*
  * hashmap__insert() adds key/value entry w/ various semantics, depending on
  * provided strategy value. If a given key/value pair replaced already
···
  * through old_key and old_value to allow calling code do proper memory
  * management.
  */
-int hashmap__insert(struct hashmap *map, const void *key, void *value,
-		    enum hashmap_insert_strategy strategy,
-		    const void **old_key, void **old_value);
+int hashmap_insert(struct hashmap *map, long key, long value,
+		   enum hashmap_insert_strategy strategy,
+		   long *old_key, long *old_value);
 
-static inline int hashmap__add(struct hashmap *map,
-			       const void *key, void *value)
-{
-	return hashmap__insert(map, key, value, HASHMAP_ADD, NULL, NULL);
-}
+#define hashmap__insert(map, key, value, strategy, old_key, old_value)	\
+	hashmap_insert((map), (long)(key), (long)(value), (strategy),	\
+		       hashmap_cast_ptr(old_key),			\
+		       hashmap_cast_ptr(old_value))
 
-static inline int hashmap__set(struct hashmap *map,
-			       const void *key, void *value,
-			       const void **old_key, void **old_value)
-{
-	return hashmap__insert(map, key, value, HASHMAP_SET,
-			       old_key, old_value);
-}
+#define hashmap__add(map, key, value) \
+	hashmap__insert((map), (key), (value), HASHMAP_ADD, NULL, NULL)
 
-static inline int hashmap__update(struct hashmap *map,
-				  const void *key, void *value,
-				  const void **old_key, void **old_value)
-{
-	return hashmap__insert(map, key, value, HASHMAP_UPDATE,
-			       old_key, old_value);
-}
+#define hashmap__set(map, key, value, old_key, old_value) \
+	hashmap__insert((map), (key), (value), HASHMAP_SET, (old_key), (old_value))
 
-static inline int hashmap__append(struct hashmap *map,
-				  const void *key, void *value)
-{
-	return hashmap__insert(map, key, value, HASHMAP_APPEND, NULL, NULL);
-}
+#define hashmap__update(map, key, value, old_key, old_value) \
+	hashmap__insert((map), (key), (value), HASHMAP_UPDATE, (old_key), (old_value))
 
-bool hashmap__delete(struct hashmap *map, const void *key,
-		     const void **old_key, void **old_value);
+#define hashmap__append(map, key, value) \
+	hashmap__insert((map), (key), (value), HASHMAP_APPEND, NULL, NULL)
 
-bool hashmap__find(const struct hashmap *map, const void *key, void **value);
+bool hashmap_delete(struct hashmap *map, long key, long *old_key, long *old_value);
+
+#define hashmap__delete(map, key, old_key, old_value)	\
+	hashmap_delete((map), (long)(key),		\
+		       hashmap_cast_ptr(old_key),	\
+		       hashmap_cast_ptr(old_value))
+
+bool hashmap_find(const struct hashmap *map, long key, long *value);
+
+#define hashmap__find(map, key, value) \
+	hashmap_find((map), (long)(key), hashmap_cast_ptr(value))
 
 /*
  * hashmap__for_each_entry - iterate over all entries in hashmap
+6 -12
tools/lib/bpf/libbpf.c
··· 5601 5601 return __bpf_core_types_match(local_btf, local_id, targ_btf, targ_id, false, 32); 5602 5602 } 5603 5603 5604 - static size_t bpf_core_hash_fn(const void *key, void *ctx) 5604 + static size_t bpf_core_hash_fn(const long key, void *ctx) 5605 5605 { 5606 - return (size_t)key; 5606 + return key; 5607 5607 } 5608 5608 5609 - static bool bpf_core_equal_fn(const void *k1, const void *k2, void *ctx) 5609 + static bool bpf_core_equal_fn(const long k1, const long k2, void *ctx) 5610 5610 { 5611 5611 return k1 == k2; 5612 - } 5613 - 5614 - static void *u32_as_hash_key(__u32 x) 5615 - { 5616 - return (void *)(uintptr_t)x; 5617 5612 } 5618 5613 5619 5614 static int record_relo_core(struct bpf_program *prog, ··· 5653 5658 struct bpf_core_relo_res *targ_res) 5654 5659 { 5655 5660 struct bpf_core_spec specs_scratch[3] = {}; 5656 - const void *type_key = u32_as_hash_key(relo->type_id); 5657 5661 struct bpf_core_cand_list *cands = NULL; 5658 5662 const char *prog_name = prog->name; 5659 5663 const struct btf_type *local_type; ··· 5669 5675 return -EINVAL; 5670 5676 5671 5677 if (relo->kind != BPF_CORE_TYPE_ID_LOCAL && 5672 - !hashmap__find(cand_cache, type_key, (void **)&cands)) { 5678 + !hashmap__find(cand_cache, local_id, &cands)) { 5673 5679 cands = bpf_core_find_cands(prog->obj, local_btf, local_id); 5674 5680 if (IS_ERR(cands)) { 5675 5681 pr_warn("prog '%s': relo #%d: target candidate search failed for [%d] %s %s: %ld\n", ··· 5677 5683 local_name, PTR_ERR(cands)); 5678 5684 return PTR_ERR(cands); 5679 5685 } 5680 - err = hashmap__set(cand_cache, type_key, cands, NULL, NULL); 5686 + err = hashmap__set(cand_cache, local_id, cands, NULL, NULL); 5681 5687 if (err) { 5682 5688 bpf_core_free_cands(cands); 5683 5689 return err; ··· 5800 5806 5801 5807 if (!IS_ERR_OR_NULL(cand_cache)) { 5802 5808 hashmap__for_each_entry(cand_cache, entry, i) { 5803 - bpf_core_free_cands(entry->value); 5809 + bpf_core_free_cands(entry->pvalue); 5804 5810 } 5805 5811 
hashmap__free(cand_cache); 5806 5812 }
+9 -9
tools/lib/bpf/strset.c
··· 19 19 struct hashmap *strs_hash; 20 20 }; 21 21 22 - static size_t strset_hash_fn(const void *key, void *ctx) 22 + static size_t strset_hash_fn(long key, void *ctx) 23 23 { 24 24 const struct strset *s = ctx; 25 - const char *str = s->strs_data + (long)key; 25 + const char *str = s->strs_data + key; 26 26 27 27 return str_hash(str); 28 28 } 29 29 30 - static bool strset_equal_fn(const void *key1, const void *key2, void *ctx) 30 + static bool strset_equal_fn(long key1, long key2, void *ctx) 31 31 { 32 32 const struct strset *s = ctx; 33 - const char *str1 = s->strs_data + (long)key1; 34 - const char *str2 = s->strs_data + (long)key2; 33 + const char *str1 = s->strs_data + key1; 34 + const char *str2 = s->strs_data + key2; 35 35 36 36 return strcmp(str1, str2) == 0; 37 37 } ··· 67 67 /* hashmap__add() returns EEXIST if string with the same 68 68 * content already is in the hash map 69 69 */ 70 - err = hashmap__add(hash, (void *)off, (void *)off); 70 + err = hashmap__add(hash, off, off); 71 71 if (err == -EEXIST) 72 72 continue; /* duplicate */ 73 73 if (err) ··· 127 127 new_off = set->strs_data_len; 128 128 memcpy(p, s, len); 129 129 130 - if (hashmap__find(set->strs_hash, (void *)new_off, (void **)&old_off)) 130 + if (hashmap__find(set->strs_hash, new_off, &old_off)) 131 131 return old_off; 132 132 133 133 return -ENOENT; ··· 165 165 * contents doesn't exist already (HASHMAP_ADD strategy). If such 166 166 * string exists, we'll get its offset in old_off (that's old_key). 167 167 */ 168 - err = hashmap__insert(set->strs_hash, (void *)new_off, (void *)new_off, 169 - HASHMAP_ADD, (const void **)&old_off, NULL); 168 + err = hashmap__insert(set->strs_hash, new_off, new_off, 169 + HASHMAP_ADD, &old_off, NULL); 170 170 if (err == -EEXIST) 171 171 return old_off; /* duplicated string, return existing offset */ 172 172 if (err)
+12 -16
tools/lib/bpf/usdt.c
··· 873 873 free(usdt_link); 874 874 } 875 875 876 - static size_t specs_hash_fn(const void *key, void *ctx) 876 + static size_t specs_hash_fn(long key, void *ctx) 877 877 { 878 - const char *s = key; 879 - 880 - return str_hash(s); 878 + return str_hash((char *)key); 881 879 } 882 880 883 - static bool specs_equal_fn(const void *key1, const void *key2, void *ctx) 881 + static bool specs_equal_fn(long key1, long key2, void *ctx) 884 882 { 885 - const char *s1 = key1; 886 - const char *s2 = key2; 887 - 888 - return strcmp(s1, s2) == 0; 883 + return strcmp((char *)key1, (char *)key2) == 0; 889 884 } 890 885 891 886 static int allocate_spec_id(struct usdt_manager *man, struct hashmap *specs_hash, 892 887 struct bpf_link_usdt *link, struct usdt_target *target, 893 888 int *spec_id, bool *is_new) 894 889 { 895 - void *tmp; 890 + long tmp; 891 + void *new_ids; 896 892 int err; 897 893 898 894 /* check if we already allocated spec ID for this spec string */ 899 895 if (hashmap__find(specs_hash, target->spec_str, &tmp)) { 900 - *spec_id = (long)tmp; 896 + *spec_id = tmp; 901 897 *is_new = false; 902 898 return 0; 903 899 } ··· 901 905 /* otherwise it's a new ID that needs to be set up in specs map and 902 906 * returned back to usdt_manager when USDT link is detached 903 907 */ 904 - tmp = libbpf_reallocarray(link->spec_ids, link->spec_cnt + 1, sizeof(*link->spec_ids)); 905 - if (!tmp) 908 + new_ids = libbpf_reallocarray(link->spec_ids, link->spec_cnt + 1, sizeof(*link->spec_ids)); 909 + if (!new_ids) 906 910 return -ENOMEM; 907 - link->spec_ids = tmp; 911 + link->spec_ids = new_ids; 908 912 909 913 /* get next free spec ID, giving preference to free list, if not empty */ 910 914 if (man->free_spec_cnt) { 911 915 *spec_id = man->free_spec_ids[man->free_spec_cnt - 1]; 912 916 913 917 /* cache spec ID for current spec string for future lookups */ 914 - err = hashmap__add(specs_hash, target->spec_str, (void *)(long)*spec_id); 918 + err = hashmap__add(specs_hash, 
target->spec_str, *spec_id); 915 919 if (err) 916 920 return err; 917 921 ··· 924 928 *spec_id = man->next_free_spec_id; 925 929 926 930 /* cache spec ID for current spec string for future lookups */ 927 - err = hashmap__add(specs_hash, target->spec_str, (void *)(long)*spec_id); 931 + err = hashmap__add(specs_hash, target->spec_str, *spec_id); 928 932 if (err) 929 933 return err; 930 934
+10 -18
tools/perf/tests/expr.c
··· 130 130 expr__find_ids("FOO + BAR + BAZ + BOZO", "FOO", 131 131 ctx) == 0); 132 132 TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 3); 133 - TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "BAR", 134 - (void **)&val_ptr)); 135 - TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "BAZ", 136 - (void **)&val_ptr)); 137 - TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "BOZO", 138 - (void **)&val_ptr)); 133 + TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "BAR", &val_ptr)); 134 + TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "BAZ", &val_ptr)); 135 + TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "BOZO", &val_ptr)); 139 136 140 137 expr__ctx_clear(ctx); 141 138 ctx->sctx.runtime = 3; ··· 140 143 expr__find_ids("EVENT1\\,param\\=?@ + EVENT2\\,param\\=?@", 141 144 NULL, ctx) == 0); 142 145 TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 2); 143 - TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "EVENT1,param=3@", 144 - (void **)&val_ptr)); 145 - TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "EVENT2,param=3@", 146 - (void **)&val_ptr)); 146 + TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "EVENT1,param=3@", &val_ptr)); 147 + TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "EVENT2,param=3@", &val_ptr)); 147 148 148 149 expr__ctx_clear(ctx); 149 150 TEST_ASSERT_VAL("find ids", 150 151 expr__find_ids("dash\\-event1 - dash\\-event2", 151 152 NULL, ctx) == 0); 152 153 TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 2); 153 - TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "dash-event1", 154 - (void **)&val_ptr)); 155 - TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "dash-event2", 156 - (void **)&val_ptr)); 154 + TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "dash-event1", &val_ptr)); 155 + TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "dash-event2", &val_ptr)); 157 156 158 157 /* Only EVENT1 or EVENT2 need be measured depending on the value of smt_on. 
*/ 159 158 { ··· 167 174 TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 1); 168 175 TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, 169 176 smton ? "EVENT1" : "EVENT2", 170 - (void **)&val_ptr)); 177 + &val_ptr)); 171 178 172 179 expr__ctx_clear(ctx); 173 180 TEST_ASSERT_VAL("find ids", ··· 176 183 TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 1); 177 184 TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, 178 185 corewide ? "EVENT1" : "EVENT2", 179 - (void **)&val_ptr)); 186 + &val_ptr)); 180 187 181 188 } 182 189 /* The expression is a constant 1.0 without needing to evaluate EVENT1. */ ··· 213 220 expr__find_ids("source_count(EVENT1)", 214 221 NULL, ctx) == 0); 215 222 TEST_ASSERT_VAL("source count", hashmap__size(ctx->ids) == 1); 216 - TEST_ASSERT_VAL("source count", hashmap__find(ctx->ids, "EVENT1", 217 - (void **)&val_ptr)); 223 + TEST_ASSERT_VAL("source count", hashmap__find(ctx->ids, "EVENT1", &val_ptr)); 218 224 219 225 expr__ctx_free(ctx); 220 226
+3 -3
tools/perf/tests/pmu-events.c
··· 986 986 */ 987 987 i = 1; 988 988 hashmap__for_each_entry(ctx->ids, cur, bkt) 989 - expr__add_id_val(ctx, strdup(cur->key), i++); 989 + expr__add_id_val(ctx, strdup(cur->pkey), i++); 990 990 991 991 hashmap__for_each_entry(ctx->ids, cur, bkt) { 992 - if (check_parse_fake(cur->key)) { 992 + if (check_parse_fake(cur->pkey)) { 993 993 pr_err("check_parse_fake failed\n"); 994 994 goto out; 995 995 } ··· 1003 1003 */ 1004 1004 i = 1024; 1005 1005 hashmap__for_each_entry(ctx->ids, cur, bkt) 1006 - expr__add_id_val(ctx, strdup(cur->key), i--); 1006 + expr__add_id_val(ctx, strdup(cur->pkey), i--); 1007 1007 if (expr__parse(&result, ctx, str)) { 1008 1008 pr_err("expr__parse failed\n"); 1009 1009 ret = -1;
+5 -6
tools/perf/util/bpf-loader.c
··· 318 318 return; 319 319 320 320 hashmap__for_each_entry(bpf_program_hash, cur, bkt) 321 - clear_prog_priv(cur->key, cur->value); 321 + clear_prog_priv(cur->pkey, cur->pvalue); 322 322 323 323 hashmap__free(bpf_program_hash); 324 324 bpf_program_hash = NULL; ··· 339 339 bpf_map_hash_free(); 340 340 } 341 341 342 - static size_t ptr_hash(const void *__key, void *ctx __maybe_unused) 342 + static size_t ptr_hash(const long __key, void *ctx __maybe_unused) 343 343 { 344 - return (size_t) __key; 344 + return __key; 345 345 } 346 346 347 - static bool ptr_equal(const void *key1, const void *key2, 348 - void *ctx __maybe_unused) 347 + static bool ptr_equal(long key1, long key2, void *ctx __maybe_unused) 349 348 { 350 349 return key1 == key2; 351 350 } ··· 1184 1185 return; 1185 1186 1186 1187 hashmap__for_each_entry(bpf_map_hash, cur, bkt) 1187 - bpf_map_priv__clear(cur->key, cur->value); 1188 + bpf_map_priv__clear(cur->pkey, cur->pvalue); 1188 1189 1189 1190 hashmap__free(bpf_map_hash); 1190 1191 bpf_map_hash = NULL;
+1 -1
tools/perf/util/evsel.c
··· 3123 3123 3124 3124 if (evsel->per_pkg_mask) { 3125 3125 hashmap__for_each_entry(evsel->per_pkg_mask, cur, bkt) 3126 - free((char *)cur->key); 3126 + free((void *)cur->pkey); 3127 3127 3128 3128 hashmap__clear(evsel->per_pkg_mask); 3129 3129 }
+15 -21
tools/perf/util/expr.c
··· 46 46 } kind; 47 47 }; 48 48 49 - static size_t key_hash(const void *key, void *ctx __maybe_unused) 49 + static size_t key_hash(long key, void *ctx __maybe_unused) 50 50 { 51 51 const char *str = (const char *)key; 52 52 size_t hash = 0; ··· 59 59 return hash; 60 60 } 61 61 62 - static bool key_equal(const void *key1, const void *key2, 63 - void *ctx __maybe_unused) 62 + static bool key_equal(long key1, long key2, void *ctx __maybe_unused) 64 63 { 65 64 return !strcmp((const char *)key1, (const char *)key2); 66 65 } ··· 83 84 return; 84 85 85 86 hashmap__for_each_entry(ids, cur, bkt) { 86 - free((char *)cur->key); 87 - free(cur->value); 87 + free((void *)cur->pkey); 88 + free((void *)cur->pvalue); 88 89 } 89 90 90 91 hashmap__free(ids); ··· 96 97 char *old_key = NULL; 97 98 int ret; 98 99 99 - ret = hashmap__set(ids, id, data_ptr, 100 - (const void **)&old_key, (void **)&old_data); 100 + ret = hashmap__set(ids, id, data_ptr, &old_key, &old_data); 101 101 if (ret) 102 102 free(data_ptr); 103 103 free(old_key); ··· 125 127 ids2 = tmp; 126 128 } 127 129 hashmap__for_each_entry(ids2, cur, bkt) { 128 - ret = hashmap__set(ids1, cur->key, cur->value, 129 - (const void **)&old_key, (void **)&old_data); 130 + ret = hashmap__set(ids1, cur->key, cur->value, &old_key, &old_data); 130 131 free(old_key); 131 132 free(old_data); 132 133 ··· 166 169 data_ptr->val.source_count = source_count; 167 170 data_ptr->kind = EXPR_ID_DATA__VALUE; 168 171 169 - ret = hashmap__set(ctx->ids, id, data_ptr, 170 - (const void **)&old_key, (void **)&old_data); 172 + ret = hashmap__set(ctx->ids, id, data_ptr, &old_key, &old_data); 171 173 if (ret) 172 174 free(data_ptr); 173 175 free(old_key); ··· 201 205 data_ptr->ref.metric_expr = ref->metric_expr; 202 206 data_ptr->kind = EXPR_ID_DATA__REF; 203 207 204 - ret = hashmap__set(ctx->ids, name, data_ptr, 205 - (const void **)&old_key, (void **)&old_data); 208 + ret = hashmap__set(ctx->ids, name, data_ptr, &old_key, &old_data); 206 209 if (ret) 207 
210 free(data_ptr); 208 211 ··· 216 221 int expr__get_id(struct expr_parse_ctx *ctx, const char *id, 217 222 struct expr_id_data **data) 218 223 { 219 - return hashmap__find(ctx->ids, id, (void **)data) ? 0 : -1; 224 + return hashmap__find(ctx->ids, id, data) ? 0 : -1; 220 225 } 221 226 222 227 bool expr__subset_of_ids(struct expr_parse_ctx *haystack, ··· 227 232 struct expr_id_data *data; 228 233 229 234 hashmap__for_each_entry(needles->ids, cur, bkt) { 230 - if (expr__get_id(haystack, cur->key, &data)) 235 + if (expr__get_id(haystack, cur->pkey, &data)) 231 236 return false; 232 237 } 233 238 return true; ··· 277 282 struct expr_id_data *old_val = NULL; 278 283 char *old_key = NULL; 279 284 280 - hashmap__delete(ctx->ids, id, 281 - (const void **)&old_key, (void **)&old_val); 285 + hashmap__delete(ctx->ids, id, &old_key, &old_val); 282 286 free(old_key); 283 287 free(old_val); 284 288 } ··· 308 314 size_t bkt; 309 315 310 316 hashmap__for_each_entry(ctx->ids, cur, bkt) { 311 - free((char *)cur->key); 312 - free(cur->value); 317 + free((void *)cur->pkey); 318 + free(cur->pvalue); 313 319 } 314 320 hashmap__clear(ctx->ids); 315 321 } ··· 324 330 325 331 free(ctx->sctx.user_requested_cpu_list); 326 332 hashmap__for_each_entry(ctx->ids, cur, bkt) { 327 - free((char *)cur->key); 328 - free(cur->value); 333 + free((void *)cur->pkey); 334 + free(cur->pvalue); 329 335 } 330 336 hashmap__free(ctx->ids); 331 337 free(ctx);
+9 -9
tools/perf/util/hashmap.c
··· 128 128 } 129 129 130 130 static bool hashmap_find_entry(const struct hashmap *map, 131 - const void *key, size_t hash, 131 + const long key, size_t hash, 132 132 struct hashmap_entry ***pprev, 133 133 struct hashmap_entry **entry) 134 134 { ··· 151 151 return false; 152 152 } 153 153 154 - int hashmap__insert(struct hashmap *map, const void *key, void *value, 155 - enum hashmap_insert_strategy strategy, 156 - const void **old_key, void **old_value) 154 + int hashmap_insert(struct hashmap *map, long key, long value, 155 + enum hashmap_insert_strategy strategy, 156 + long *old_key, long *old_value) 157 157 { 158 158 struct hashmap_entry *entry; 159 159 size_t h; 160 160 int err; 161 161 162 162 if (old_key) 163 - *old_key = NULL; 163 + *old_key = 0; 164 164 if (old_value) 165 - *old_value = NULL; 165 + *old_value = 0; 166 166 167 167 h = hash_bits(map->hash_fn(key, map->ctx), map->cap_bits); 168 168 if (strategy != HASHMAP_APPEND && ··· 203 203 return 0; 204 204 } 205 205 206 - bool hashmap__find(const struct hashmap *map, const void *key, void **value) 206 + bool hashmap_find(const struct hashmap *map, long key, long *value) 207 207 { 208 208 struct hashmap_entry *entry; 209 209 size_t h; ··· 217 217 return true; 218 218 } 219 219 220 - bool hashmap__delete(struct hashmap *map, const void *key, 221 - const void **old_key, void **old_value) 220 + bool hashmap_delete(struct hashmap *map, long key, 221 + long *old_key, long *old_value) 222 222 { 223 223 struct hashmap_entry **pprev, *entry; 224 224 size_t h;
+57 -34
tools/perf/util/hashmap.h
··· 40 40 return h; 41 41 } 42 42 43 - typedef size_t (*hashmap_hash_fn)(const void *key, void *ctx); 44 - typedef bool (*hashmap_equal_fn)(const void *key1, const void *key2, void *ctx); 43 + typedef size_t (*hashmap_hash_fn)(long key, void *ctx); 44 + typedef bool (*hashmap_equal_fn)(long key1, long key2, void *ctx); 45 45 46 + /* 47 + * Hashmap interface is polymorphic, keys and values could be either 48 + * long-sized integers or pointers, this is achieved as follows: 49 + * - interface functions that operate on keys and values are hidden 50 + * behind auxiliary macros, e.g. hashmap_insert <-> hashmap__insert; 51 + * - these auxiliary macros cast the key and value parameters as 52 + * long or long *, so the user does not have to specify the casts explicitly; 53 + * - for pointer parameters (e.g. old_key) the size of the pointed 54 + * type is verified by hashmap_cast_ptr using _Static_assert; 55 + * - when iterating using hashmap__for_each_* forms 56 + * hashmap_entry->key should be used for integer keys and 57 + * hashmap_entry->pkey should be used for pointer keys, 58 + * same goes for values. 59 + */ 46 60 struct hashmap_entry { 47 - const void *key; 48 - void *value; 61 + union { 62 + long key; 63 + const void *pkey; 64 + }; 65 + union { 66 + long value; 67 + void *pvalue; 68 + }; 49 69 struct hashmap_entry *next; 50 70 }; 51 71 ··· 122 102 HASHMAP_APPEND, 123 103 }; 124 104 105 + #define hashmap_cast_ptr(p) ({ \ 106 + _Static_assert((__builtin_constant_p((p)) ? (p) == NULL : 0) || \ 107 + sizeof(*(p)) == sizeof(long), \ 108 + #p " pointee should be a long-sized integer or a pointer"); \ 109 + (long *)(p); \ 110 + }) 111 + 125 112 /* 126 113 * hashmap__insert() adds key/value entry w/ various semantics, depending on 127 114 * provided strategy value. If a given key/value pair replaced already ··· 136 109 * through old_key and old_value to allow calling code do proper memory 137 110 * management. 
138 111 */ 139 - int hashmap__insert(struct hashmap *map, const void *key, void *value, 140 - enum hashmap_insert_strategy strategy, 141 - const void **old_key, void **old_value); 112 + int hashmap_insert(struct hashmap *map, long key, long value, 113 + enum hashmap_insert_strategy strategy, 114 + long *old_key, long *old_value); 142 115 143 - static inline int hashmap__add(struct hashmap *map, 144 - const void *key, void *value) 145 - { 146 - return hashmap__insert(map, key, value, HASHMAP_ADD, NULL, NULL); 147 - } 116 + #define hashmap__insert(map, key, value, strategy, old_key, old_value) \ 117 + hashmap_insert((map), (long)(key), (long)(value), (strategy), \ 118 + hashmap_cast_ptr(old_key), \ 119 + hashmap_cast_ptr(old_value)) 148 120 149 - static inline int hashmap__set(struct hashmap *map, 150 - const void *key, void *value, 151 - const void **old_key, void **old_value) 152 - { 153 - return hashmap__insert(map, key, value, HASHMAP_SET, 154 - old_key, old_value); 155 - } 121 + #define hashmap__add(map, key, value) \ 122 + hashmap__insert((map), (key), (value), HASHMAP_ADD, NULL, NULL) 156 123 157 - static inline int hashmap__update(struct hashmap *map, 158 - const void *key, void *value, 159 - const void **old_key, void **old_value) 160 - { 161 - return hashmap__insert(map, key, value, HASHMAP_UPDATE, 162 - old_key, old_value); 163 - } 124 + #define hashmap__set(map, key, value, old_key, old_value) \ 125 + hashmap__insert((map), (key), (value), HASHMAP_SET, (old_key), (old_value)) 164 126 165 - static inline int hashmap__append(struct hashmap *map, 166 - const void *key, void *value) 167 - { 168 - return hashmap__insert(map, key, value, HASHMAP_APPEND, NULL, NULL); 169 - } 127 + #define hashmap__update(map, key, value, old_key, old_value) \ 128 + hashmap__insert((map), (key), (value), HASHMAP_UPDATE, (old_key), (old_value)) 170 129 171 - bool hashmap__delete(struct hashmap *map, const void *key, 172 - const void **old_key, void **old_value); 130 + #define 
hashmap__append(map, key, value) \ 131 + hashmap__insert((map), (key), (value), HASHMAP_APPEND, NULL, NULL) 173 132 174 - bool hashmap__find(const struct hashmap *map, const void *key, void **value); 133 + bool hashmap_delete(struct hashmap *map, long key, long *old_key, long *old_value); 134 + 135 + #define hashmap__delete(map, key, old_key, old_value) \ 136 + hashmap_delete((map), (long)(key), \ 137 + hashmap_cast_ptr(old_key), \ 138 + hashmap_cast_ptr(old_value)) 139 + 140 + bool hashmap_find(const struct hashmap *map, long key, long *value); 141 + 142 + #define hashmap__find(map, key, value) \ 143 + hashmap_find((map), (long)(key), hashmap_cast_ptr(value)) 175 144 176 145 /* 177 146 * hashmap__for_each_entry - iterate over all entries in hashmap
+5 -5
tools/perf/util/metricgroup.c
··· 288 288 * combined or shared groups, this metric may not care 289 289 * about this event. 290 290 */ 291 - if (hashmap__find(ids, metric_id, (void **)&val_ptr)) { 291 + if (hashmap__find(ids, metric_id, &val_ptr)) { 292 292 metric_events[matched_events++] = ev; 293 293 294 294 if (matched_events >= ids_size) ··· 764 764 #define RETURN_IF_NON_ZERO(x) do { if (x) return x; } while (0) 765 765 766 766 hashmap__for_each_entry(ctx->ids, cur, bkt) { 767 - const char *sep, *rsep, *id = cur->key; 767 + const char *sep, *rsep, *id = cur->pkey; 768 768 enum perf_tool_event ev; 769 769 770 770 pr_debug("found event %s\n", id); ··· 945 945 hashmap__for_each_entry(root_metric->pctx->ids, cur, bkt) { 946 946 struct pmu_event pe; 947 947 948 - if (metricgroup__find_metric(cur->key, table, &pe)) { 948 + if (metricgroup__find_metric(cur->pkey, table, &pe)) { 949 949 pending = realloc(pending, 950 950 (pending_cnt + 1) * sizeof(struct to_resolve)); 951 951 if (!pending) 952 952 return -ENOMEM; 953 953 954 954 memcpy(&pending[pending_cnt].pe, &pe, sizeof(pe)); 955 - pending[pending_cnt].key = cur->key; 955 + pending[pending_cnt].key = cur->pkey; 956 956 pending_cnt++; 957 957 } 958 958 } ··· 1433 1433 list_for_each_entry(m, metric_list, nd) { 1434 1434 if (m->has_constraint && !m->modifier) { 1435 1435 hashmap__for_each_entry(m->pctx->ids, cur, bkt) { 1436 - dup = strdup(cur->key); 1436 + dup = strdup(cur->pkey); 1437 1437 if (!dup) { 1438 1438 ret = -ENOMEM; 1439 1439 goto err_out;
+1 -1
tools/perf/util/stat-shadow.c
··· 398 398 399 399 i = 0; 400 400 hashmap__for_each_entry(ctx->ids, cur, bkt) { 401 - const char *metric_name = (const char *)cur->key; 401 + const char *metric_name = cur->pkey; 402 402 403 403 found = false; 404 404 if (leader) {
+4 -5
tools/perf/util/stat.c
··· 278 278 } 279 279 } 280 280 281 - static size_t pkg_id_hash(const void *__key, void *ctx __maybe_unused) 281 + static size_t pkg_id_hash(long __key, void *ctx __maybe_unused) 282 282 { 283 283 uint64_t *key = (uint64_t *) __key; 284 284 285 285 return *key & 0xffffffff; 286 286 } 287 287 288 - static bool pkg_id_equal(const void *__key1, const void *__key2, 289 - void *ctx __maybe_unused) 288 + static bool pkg_id_equal(long __key1, long __key2, void *ctx __maybe_unused) 290 289 { 291 290 uint64_t *key1 = (uint64_t *) __key1; 292 291 uint64_t *key2 = (uint64_t *) __key2; ··· 346 347 return -ENOMEM; 347 348 348 349 *key = (uint64_t)d << 32 | s; 349 - if (hashmap__find(mask, (void *)key, NULL)) { 350 + if (hashmap__find(mask, key, NULL)) { 350 351 *skip = true; 351 352 free(key); 352 353 } else 353 - ret = hashmap__add(mask, (void *)key, (void *)1); 354 + ret = hashmap__add(mask, key, 1); 354 355 355 356 return ret; 356 357 }
+4 -3
tools/testing/selftests/bpf/Makefile
··· 182 182 $(OUTPUT)/liburandom_read.so: urandom_read_lib1.c urandom_read_lib2.c 183 183 $(call msg,LIB,,$@) 184 184 $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS) \ 185 - -fuse-ld=$(LLD) -Wl,-znoseparate-code -fPIC -shared -o $@ 185 + -fuse-ld=$(LLD) -Wl,-znoseparate-code -Wl,--build-id=sha1 \ 186 + -fPIC -shared -o $@ 186 187 187 188 $(OUTPUT)/urandom_read: urandom_read.c urandom_read_aux.c $(OUTPUT)/liburandom_read.so 188 189 $(call msg,BINARY,,$@) 189 190 $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^) \ 190 191 liburandom_read.so $(LDLIBS) \ 191 - -fuse-ld=$(LLD) -Wl,-znoseparate-code \ 192 - -Wl,-rpath=. -Wl,--build-id=sha1 -o $@ 192 + -fuse-ld=$(LLD) -Wl,-znoseparate-code -Wl,--build-id=sha1 \ 193 + -Wl,-rpath=. -o $@ 193 194 194 195 $(OUTPUT)/sign-file: ../../../../scripts/sign-file.c 195 196 $(call msg,SIGN-FILE,,$@)
+19
tools/testing/selftests/bpf/bpf_util.h
··· 20 20 return possible_cpus; 21 21 } 22 22 23 + /* Copy up to sz - 1 bytes from zero-terminated src string and ensure that dst 24 + * is a zero-terminated string no matter what (unless sz == 0, in which case 25 + * it's a no-op). It's conceptually close to FreeBSD's strlcpy(), but differs 26 + * in what is returned. Given this is an internal helper, it's trivial to extend 27 + * this, when necessary. Use this instead of strncpy inside libbpf source code. 28 + */ 29 + static inline void bpf_strlcpy(char *dst, const char *src, size_t sz) 30 + { 31 + size_t i; 32 + 33 + if (sz == 0) 34 + return; 35 + 36 + sz--; 37 + for (i = 0; i < sz && src[i]; i++) 38 + dst[i] = src[i]; 39 + dst[i] = '\0'; 40 + } 41 + 23 42 #define __bpf_percpu_val_align __attribute__((__aligned__(8))) 24 43 25 44 #define BPF_DECLARE_PERCPU(type, name) \
+2 -1
tools/testing/selftests/bpf/cgroup_helpers.c
··· 13 13 #include <ftw.h> 14 14 15 15 #include "cgroup_helpers.h" 16 + #include "bpf_util.h" 16 17 17 18 /* 18 19 * To avoid relying on the system setup, when setup_cgroup_env is called ··· 78 77 enable[len] = 0; 79 78 close(fd); 80 79 } else { 81 - strncpy(enable, controllers, sizeof(enable)); 80 + bpf_strlcpy(enable, controllers, sizeof(enable)); 82 81 } 83 82 84 83 snprintf(path, sizeof(path), "%s/cgroup.subtree_control", cgroup_path);
+24 -14
tools/testing/selftests/bpf/prog_tests/align.c
··· 2 2 #include <test_progs.h> 3 3 4 4 #define MAX_INSNS 512 5 - #define MAX_MATCHES 16 5 + #define MAX_MATCHES 24 6 6 7 7 struct bpf_reg_match { 8 8 unsigned int line; ··· 267 267 */ 268 268 BPF_MOV64_REG(BPF_REG_5, BPF_REG_2), 269 269 BPF_ALU64_REG(BPF_ADD, BPF_REG_5, BPF_REG_6), 270 + BPF_MOV64_REG(BPF_REG_4, BPF_REG_5), 270 271 BPF_ALU64_IMM(BPF_ADD, BPF_REG_5, 14), 271 272 BPF_MOV64_REG(BPF_REG_4, BPF_REG_5), 272 273 BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, 4), ··· 281 280 BPF_MOV64_REG(BPF_REG_5, BPF_REG_2), 282 281 BPF_ALU64_IMM(BPF_ADD, BPF_REG_5, 14), 283 282 BPF_ALU64_REG(BPF_ADD, BPF_REG_5, BPF_REG_6), 283 + BPF_MOV64_REG(BPF_REG_4, BPF_REG_5), 284 284 BPF_ALU64_IMM(BPF_ADD, BPF_REG_5, 4), 285 285 BPF_ALU64_REG(BPF_ADD, BPF_REG_5, BPF_REG_6), 286 286 BPF_MOV64_REG(BPF_REG_4, BPF_REG_5), ··· 313 311 {15, "R4=pkt(id=1,off=18,r=18,umax=1020,var_off=(0x0; 0x3fc))"}, 314 312 {15, "R5=pkt(id=1,off=14,r=18,umax=1020,var_off=(0x0; 0x3fc))"}, 315 313 /* Variable offset is added to R5 packet pointer, 316 - * resulting in auxiliary alignment of 4. 314 + * resulting in auxiliary alignment of 4. To avoid BPF 315 + * verifier's precision backtracking logging 316 + * interfering we also have a no-op R4 = R5 317 + * instruction to validate R5 state. We also check 318 + * that R4 is what it should be in such case. 317 319 */ 318 - {17, "R5_w=pkt(id=2,off=0,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 320 + {18, "R4_w=pkt(id=2,off=0,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 321 + {18, "R5_w=pkt(id=2,off=0,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 319 322 /* Constant offset is added to R5, resulting in 320 323 * reg->off of 14. 321 324 */ 322 - {18, "R5_w=pkt(id=2,off=14,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 325 + {19, "R5_w=pkt(id=2,off=14,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 323 326 /* At the time the word size load is performed from R5, 324 327 * its total fixed offset is NET_IP_ALIGN + reg->off 325 328 * (14) which is 16. 
Then the variable offset is 4-byte 326 329 * aligned, so the total offset is 4-byte aligned and 327 330 * meets the load's requirements. 328 331 */ 329 - {23, "R4=pkt(id=2,off=18,r=18,umax=1020,var_off=(0x0; 0x3fc))"}, 330 - {23, "R5=pkt(id=2,off=14,r=18,umax=1020,var_off=(0x0; 0x3fc))"}, 332 + {24, "R4=pkt(id=2,off=18,r=18,umax=1020,var_off=(0x0; 0x3fc))"}, 333 + {24, "R5=pkt(id=2,off=14,r=18,umax=1020,var_off=(0x0; 0x3fc))"}, 331 334 /* Constant offset is added to R5 packet pointer, 332 335 * resulting in reg->off value of 14. 333 336 */ 334 - {25, "R5_w=pkt(off=14,r=8"}, 337 + {26, "R5_w=pkt(off=14,r=8"}, 335 338 /* Variable offset is added to R5, resulting in a 336 - * variable offset of (4n). 339 + * variable offset of (4n). See comment for insn #18 340 + * for R4 = R5 trick. 337 341 */ 338 - {26, "R5_w=pkt(id=3,off=14,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 342 + {28, "R4_w=pkt(id=3,off=14,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 343 + {28, "R5_w=pkt(id=3,off=14,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 339 344 /* Constant is added to R5 again, setting reg->off to 18. */ 340 - {27, "R5_w=pkt(id=3,off=18,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 345 + {29, "R5_w=pkt(id=3,off=18,r=0,umax=1020,var_off=(0x0; 0x3fc))"}, 341 346 /* And once more we add a variable; resulting var_off 342 347 * is still (4n), fixed offset is not changed. 343 348 * Also, we create a new reg->id. 344 349 */ 345 - {28, "R5_w=pkt(id=4,off=18,r=0,umax=2040,var_off=(0x0; 0x7fc)"}, 350 + {31, "R4_w=pkt(id=4,off=18,r=0,umax=2040,var_off=(0x0; 0x7fc)"}, 351 + {31, "R5_w=pkt(id=4,off=18,r=0,umax=2040,var_off=(0x0; 0x7fc)"}, 346 352 /* At the time the word size load is performed from R5, 347 353 * its total fixed offset is NET_IP_ALIGN + reg->off (18) 348 354 * which is 20. Then the variable offset is (4n), so 349 355 * the total offset is 4-byte aligned and meets the 350 356 * load's requirements. 
351 357 */ 352 - {33, "R4=pkt(id=4,off=22,r=22,umax=2040,var_off=(0x0; 0x7fc)"}, 353 - {33, "R5=pkt(id=4,off=18,r=22,umax=2040,var_off=(0x0; 0x7fc)"}, 358 + {35, "R4=pkt(id=4,off=22,r=22,umax=2040,var_off=(0x0; 0x7fc)"}, 359 + {35, "R5=pkt(id=4,off=18,r=22,umax=2040,var_off=(0x0; 0x7fc)"}, 354 360 }, 355 361 }, 356 362 { ··· 691 681 if (!test__start_subtest(test->descr)) 692 682 continue; 693 683 694 - CHECK_FAIL(do_test_single(test)); 684 + ASSERT_OK(do_test_single(test), test->descr); 695 685 } 696 686 }
+259 -5
tools/testing/selftests/bpf/prog_tests/btf.c
··· 7133 7133 BTF_ENUM_ENC(NAME_NTH(4), 456), 7134 7134 /* [4] fwd enum 'e2' after full enum */ 7135 7135 BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 0), 4), 7136 - /* [5] incompatible fwd enum with different size */ 7136 + /* [5] fwd enum with different size, size does not matter for fwd */ 7137 7137 BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 0), 1), 7138 7138 /* [6] incompatible full enum with different value */ 7139 7139 BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4), ··· 7150 7150 /* [2] full enum 'e2' */ 7151 7151 BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4), 7152 7152 BTF_ENUM_ENC(NAME_NTH(4), 456), 7153 - /* [3] incompatible fwd enum with different size */ 7154 - BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 0), 1), 7155 - /* [4] incompatible full enum with different value */ 7153 + /* [3] incompatible full enum with different value */ 7156 7154 BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4), 7157 7155 BTF_ENUM_ENC(NAME_NTH(2), 321), 7158 7156 BTF_END_RAW, ··· 7609 7611 BTF_STR_SEC("\0e1\0e1_val"), 7610 7612 }, 7611 7613 }, 7612 - 7614 + { 7615 + .descr = "dedup: enum of different size: no dedup", 7616 + .input = { 7617 + .raw_types = { 7618 + /* [1] enum 'e1' */ 7619 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4), 7620 + BTF_ENUM_ENC(NAME_NTH(2), 1), 7621 + /* [2] enum 'e1' */ 7622 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 2), 7623 + BTF_ENUM_ENC(NAME_NTH(2), 1), 7624 + BTF_END_RAW, 7625 + }, 7626 + BTF_STR_SEC("\0e1\0e1_val"), 7627 + }, 7628 + .expect = { 7629 + .raw_types = { 7630 + /* [1] enum 'e1' */ 7631 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4), 7632 + BTF_ENUM_ENC(NAME_NTH(2), 1), 7633 + /* [2] enum 'e1' */ 7634 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 2), 7635 + BTF_ENUM_ENC(NAME_NTH(2), 1), 7636 + BTF_END_RAW, 7637 + }, 7638 + BTF_STR_SEC("\0e1\0e1_val"), 7639 + }, 7640 + }, 
7641 + { 7642 + .descr = "dedup: enum fwd to enum64", 7643 + .input = { 7644 + .raw_types = { 7645 + /* [1] enum64 'e1' */ 7646 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 1), 8), 7647 + BTF_ENUM64_ENC(NAME_NTH(2), 1, 0), 7648 + /* [2] enum 'e1' fwd */ 7649 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 0), 4), 7650 + /* [3] typedef enum 'e1' td */ 7651 + BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_TYPEDEF, 0, 0), 2), 7652 + BTF_END_RAW, 7653 + }, 7654 + BTF_STR_SEC("\0e1\0e1_val\0td"), 7655 + }, 7656 + .expect = { 7657 + .raw_types = { 7658 + /* [1] enum64 'e1' */ 7659 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 1), 8), 7660 + BTF_ENUM64_ENC(NAME_NTH(2), 1, 0), 7661 + /* [2] typedef enum 'e1' td */ 7662 + BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_TYPEDEF, 0, 0), 1), 7663 + BTF_END_RAW, 7664 + }, 7665 + BTF_STR_SEC("\0e1\0e1_val\0td"), 7666 + }, 7667 + }, 7668 + { 7669 + .descr = "dedup: enum64 fwd to enum", 7670 + .input = { 7671 + .raw_types = { 7672 + /* [1] enum 'e1' */ 7673 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4), 7674 + BTF_ENUM_ENC(NAME_NTH(2), 1), 7675 + /* [2] enum64 'e1' fwd */ 7676 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 0), 8), 7677 + /* [3] typedef enum 'e1' td */ 7678 + BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_TYPEDEF, 0, 0), 2), 7679 + BTF_END_RAW, 7680 + }, 7681 + BTF_STR_SEC("\0e1\0e1_val\0td"), 7682 + }, 7683 + .expect = { 7684 + .raw_types = { 7685 + /* [1] enum 'e1' */ 7686 + BTF_TYPE_ENC(NAME_NTH(1), BTF_INFO_ENC(BTF_KIND_ENUM, 0, 1), 4), 7687 + BTF_ENUM_ENC(NAME_NTH(2), 1), 7688 + /* [2] typedef enum 'e1' td */ 7689 + BTF_TYPE_ENC(NAME_NTH(3), BTF_INFO_ENC(BTF_KIND_TYPEDEF, 0, 0), 1), 7690 + BTF_END_RAW, 7691 + }, 7692 + BTF_STR_SEC("\0e1\0e1_val\0td"), 7693 + }, 7694 + }, 7695 + { 7696 + .descr = "dedup: standalone fwd declaration struct", 7697 + /* 7698 + * Verify that CU1:foo and CU2:foo would be unified and that 7699 + * 
typedef/ptr would be updated to point to CU1:foo. 7700 + * 7701 + * // CU 1: 7702 + * struct foo { int x; }; 7703 + * 7704 + * // CU 2: 7705 + * struct foo; 7706 + * typedef struct foo *foo_ptr; 7707 + */ 7708 + .input = { 7709 + .raw_types = { 7710 + /* CU 1 */ 7711 + BTF_STRUCT_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7712 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7713 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7714 + /* CU 2 */ 7715 + BTF_FWD_ENC(NAME_NTH(1), 0), /* [3] */ 7716 + BTF_PTR_ENC(3), /* [4] */ 7717 + BTF_TYPEDEF_ENC(NAME_NTH(3), 4), /* [5] */ 7718 + BTF_END_RAW, 7719 + }, 7720 + BTF_STR_SEC("\0foo\0x\0foo_ptr"), 7721 + }, 7722 + .expect = { 7723 + .raw_types = { 7724 + BTF_STRUCT_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7725 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7726 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7727 + BTF_PTR_ENC(1), /* [3] */ 7728 + BTF_TYPEDEF_ENC(NAME_NTH(3), 3), /* [4] */ 7729 + BTF_END_RAW, 7730 + }, 7731 + BTF_STR_SEC("\0foo\0x\0foo_ptr"), 7732 + }, 7733 + }, 7734 + { 7735 + .descr = "dedup: standalone fwd declaration union", 7736 + /* 7737 + * Verify that CU1:foo and CU2:foo would be unified and that 7738 + * typedef/ptr would be updated to point to CU1:foo. 7739 + * Same as "dedup: standalone fwd declaration struct" but for unions. 
7740 + * 7741 + * // CU 1: 7742 + * union foo { int x; }; 7743 + * 7744 + * // CU 2: 7745 + * union foo; 7746 + * typedef union foo *foo_ptr; 7747 + */ 7748 + .input = { 7749 + .raw_types = { 7750 + /* CU 1 */ 7751 + BTF_UNION_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7752 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7753 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7754 + /* CU 2 */ 7755 + BTF_FWD_ENC(NAME_TBD, 1), /* [3] */ 7756 + BTF_PTR_ENC(3), /* [4] */ 7757 + BTF_TYPEDEF_ENC(NAME_NTH(3), 4), /* [5] */ 7758 + BTF_END_RAW, 7759 + }, 7760 + BTF_STR_SEC("\0foo\0x\0foo_ptr"), 7761 + }, 7762 + .expect = { 7763 + .raw_types = { 7764 + BTF_UNION_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7765 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7766 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7767 + BTF_PTR_ENC(1), /* [3] */ 7768 + BTF_TYPEDEF_ENC(NAME_NTH(3), 3), /* [4] */ 7769 + BTF_END_RAW, 7770 + }, 7771 + BTF_STR_SEC("\0foo\0x\0foo_ptr"), 7772 + }, 7773 + }, 7774 + { 7775 + .descr = "dedup: standalone fwd declaration wrong kind", 7776 + /* 7777 + * Negative test for btf_dedup_resolve_fwds: 7778 + * - CU1:foo is a struct, C2:foo is a union, thus CU2:foo is not deduped; 7779 + * - typedef/ptr should remain unchanged as well. 
7780 + * 7781 + * // CU 1: 7782 + * struct foo { int x; }; 7783 + * 7784 + * // CU 2: 7785 + * union foo; 7786 + * typedef union foo *foo_ptr; 7787 + */ 7788 + .input = { 7789 + .raw_types = { 7790 + /* CU 1 */ 7791 + BTF_STRUCT_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7792 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7793 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7794 + /* CU 2 */ 7795 + BTF_FWD_ENC(NAME_NTH(3), 1), /* [3] */ 7796 + BTF_PTR_ENC(3), /* [4] */ 7797 + BTF_TYPEDEF_ENC(NAME_NTH(3), 4), /* [5] */ 7798 + BTF_END_RAW, 7799 + }, 7800 + BTF_STR_SEC("\0foo\0x\0foo_ptr"), 7801 + }, 7802 + .expect = { 7803 + .raw_types = { 7804 + /* CU 1 */ 7805 + BTF_STRUCT_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7806 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7807 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7808 + /* CU 2 */ 7809 + BTF_FWD_ENC(NAME_NTH(3), 1), /* [3] */ 7810 + BTF_PTR_ENC(3), /* [4] */ 7811 + BTF_TYPEDEF_ENC(NAME_NTH(3), 4), /* [5] */ 7812 + BTF_END_RAW, 7813 + }, 7814 + BTF_STR_SEC("\0foo\0x\0foo_ptr"), 7815 + }, 7816 + }, 7817 + { 7818 + .descr = "dedup: standalone fwd declaration name conflict", 7819 + /* 7820 + * Negative test for btf_dedup_resolve_fwds: 7821 + * - two candidates for CU2:foo dedup, thus it is unchanged; 7822 + * - typedef/ptr should remain unchanged as well. 
7823 + * 7824 + * // CU 1: 7825 + * struct foo { int x; }; 7826 + * 7827 + * // CU 2: 7828 + * struct foo; 7829 + * typedef struct foo *foo_ptr; 7830 + * 7831 + * // CU 3: 7832 + * struct foo { int x; int y; }; 7833 + */ 7834 + .input = { 7835 + .raw_types = { 7836 + /* CU 1 */ 7837 + BTF_STRUCT_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7838 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7839 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7840 + /* CU 2 */ 7841 + BTF_FWD_ENC(NAME_NTH(1), 0), /* [3] */ 7842 + BTF_PTR_ENC(3), /* [4] */ 7843 + BTF_TYPEDEF_ENC(NAME_NTH(4), 4), /* [5] */ 7844 + /* CU 3 */ 7845 + BTF_STRUCT_ENC(NAME_NTH(1), 2, 8), /* [6] */ 7846 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7847 + BTF_MEMBER_ENC(NAME_NTH(3), 2, 0), 7848 + BTF_END_RAW, 7849 + }, 7850 + BTF_STR_SEC("\0foo\0x\0y\0foo_ptr"), 7851 + }, 7852 + .expect = { 7853 + .raw_types = { 7854 + /* CU 1 */ 7855 + BTF_STRUCT_ENC(NAME_NTH(1), 1, 4), /* [1] */ 7856 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7857 + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [2] */ 7858 + /* CU 2 */ 7859 + BTF_FWD_ENC(NAME_NTH(1), 0), /* [3] */ 7860 + BTF_PTR_ENC(3), /* [4] */ 7861 + BTF_TYPEDEF_ENC(NAME_NTH(4), 4), /* [5] */ 7862 + /* CU 3 */ 7863 + BTF_STRUCT_ENC(NAME_NTH(1), 2, 8), /* [6] */ 7864 + BTF_MEMBER_ENC(NAME_NTH(2), 2, 0), 7865 + BTF_MEMBER_ENC(NAME_NTH(3), 2, 0), 7866 + BTF_END_RAW, 7867 + }, 7868 + BTF_STR_SEC("\0foo\0x\0y\0foo_ptr"), 7869 + }, 7870 + }, 7613 7871 }; 7614 7872 7615 7873 static int btf_type_size(const struct btf_type *t)
+30 -15
tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
··· 143 143 btf__add_struct(btf1, "s2", 4); /* [5] struct s2 { */ 144 144 btf__add_field(btf1, "f1", 1, 0, 0); /* int f1; */ 145 145 /* } */ 146 + /* keep this not a part of type the graph to test btf_dedup_resolve_fwds */ 147 + btf__add_struct(btf1, "s3", 4); /* [6] struct s3 { */ 148 + btf__add_field(btf1, "f1", 1, 0, 0); /* int f1; */ 149 + /* } */ 146 150 147 151 VALIDATE_RAW_BTF( 148 152 btf1, ··· 157 153 "\t'f1' type_id=2 bits_offset=0\n" 158 154 "\t'f2' type_id=3 bits_offset=64", 159 155 "[5] STRUCT 's2' size=4 vlen=1\n" 156 + "\t'f1' type_id=1 bits_offset=0", 157 + "[6] STRUCT 's3' size=4 vlen=1\n" 160 158 "\t'f1' type_id=1 bits_offset=0"); 161 159 162 160 btf2 = btf__new_empty_split(btf1); 163 161 if (!ASSERT_OK_PTR(btf2, "empty_split_btf")) 164 162 goto cleanup; 165 163 166 - btf__add_int(btf2, "int", 4, BTF_INT_SIGNED); /* [6] int */ 167 - btf__add_ptr(btf2, 10); /* [7] ptr to struct s1 */ 168 - btf__add_fwd(btf2, "s2", BTF_FWD_STRUCT); /* [8] fwd for struct s2 */ 169 - btf__add_ptr(btf2, 8); /* [9] ptr to fwd struct s2 */ 170 - btf__add_struct(btf2, "s1", 16); /* [10] struct s1 { */ 171 - btf__add_field(btf2, "f1", 7, 0, 0); /* struct s1 *f1; */ 172 - btf__add_field(btf2, "f2", 9, 64, 0); /* struct s2 *f2; */ 164 + btf__add_int(btf2, "int", 4, BTF_INT_SIGNED); /* [7] int */ 165 + btf__add_ptr(btf2, 11); /* [8] ptr to struct s1 */ 166 + btf__add_fwd(btf2, "s2", BTF_FWD_STRUCT); /* [9] fwd for struct s2 */ 167 + btf__add_ptr(btf2, 9); /* [10] ptr to fwd struct s2 */ 168 + btf__add_struct(btf2, "s1", 16); /* [11] struct s1 { */ 169 + btf__add_field(btf2, "f1", 8, 0, 0); /* struct s1 *f1; */ 170 + btf__add_field(btf2, "f2", 10, 64, 0); /* struct s2 *f2; */ 173 171 /* } */ 172 + btf__add_fwd(btf2, "s3", BTF_FWD_STRUCT); /* [12] fwd for struct s3 */ 173 + btf__add_ptr(btf2, 12); /* [13] ptr to struct s1 */ 174 174 175 175 VALIDATE_RAW_BTF( 176 176 btf2, ··· 186 178 "\t'f2' type_id=3 bits_offset=64", 187 179 "[5] STRUCT 's2' size=4 vlen=1\n" 188 180 "\t'f1' 
type_id=1 bits_offset=0", 189 - "[6] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", 190 - "[7] PTR '(anon)' type_id=10", 191 - "[8] FWD 's2' fwd_kind=struct", 192 - "[9] PTR '(anon)' type_id=8", 193 - "[10] STRUCT 's1' size=16 vlen=2\n" 194 - "\t'f1' type_id=7 bits_offset=0\n" 195 - "\t'f2' type_id=9 bits_offset=64"); 181 + "[6] STRUCT 's3' size=4 vlen=1\n" 182 + "\t'f1' type_id=1 bits_offset=0", 183 + "[7] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", 184 + "[8] PTR '(anon)' type_id=11", 185 + "[9] FWD 's2' fwd_kind=struct", 186 + "[10] PTR '(anon)' type_id=9", 187 + "[11] STRUCT 's1' size=16 vlen=2\n" 188 + "\t'f1' type_id=8 bits_offset=0\n" 189 + "\t'f2' type_id=10 bits_offset=64", 190 + "[12] FWD 's3' fwd_kind=struct", 191 + "[13] PTR '(anon)' type_id=12"); 196 192 197 193 err = btf__dedup(btf2, NULL); 198 194 if (!ASSERT_OK(err, "btf_dedup")) ··· 211 199 "\t'f1' type_id=2 bits_offset=0\n" 212 200 "\t'f2' type_id=3 bits_offset=64", 213 201 "[5] STRUCT 's2' size=4 vlen=1\n" 214 - "\t'f1' type_id=1 bits_offset=0"); 202 + "\t'f1' type_id=1 bits_offset=0", 203 + "[6] STRUCT 's3' size=4 vlen=1\n" 204 + "\t'f1' type_id=1 bits_offset=0", 205 + "[7] PTR '(anon)' type_id=6"); 215 206 216 207 cleanup: 217 208 btf__free(btf2);
+2 -2
tools/testing/selftests/bpf/prog_tests/btf_dump.c
··· 791 791 TEST_BTF_DUMP_DATA_OVER(btf, d, "struct", str, struct bpf_sock_ops, 792 792 sizeof(struct bpf_sock_ops) - 1, 793 793 "(struct bpf_sock_ops){\n\t.op = (__u32)1,\n", 794 - { .op = 1, .skb_tcp_flags = 2}); 794 + { .op = 1, .skb_hwtstamp = 2}); 795 795 TEST_BTF_DUMP_DATA_OVER(btf, d, "struct", str, struct bpf_sock_ops, 796 796 sizeof(struct bpf_sock_ops) - 1, 797 797 "(struct bpf_sock_ops){\n\t.op = (__u32)1,\n", 798 - { .op = 1, .skb_tcp_flags = 0}); 798 + { .op = 1, .skb_hwtstamp = 0}); 799 799 } 800 800 801 801 static void test_btf_dump_var_data(struct btf *btf, struct btf_dump *d,
+136 -54
tools/testing/selftests/bpf/prog_tests/hashmap.c
··· 7 7 */ 8 8 #include "test_progs.h" 9 9 #include "bpf/hashmap.h" 10 + #include <stddef.h> 10 11 11 12 static int duration = 0; 12 13 13 - static size_t hash_fn(const void *k, void *ctx) 14 + static size_t hash_fn(long k, void *ctx) 14 15 { 15 - return (long)k; 16 + return k; 16 17 } 17 18 18 - static bool equal_fn(const void *a, const void *b, void *ctx) 19 + static bool equal_fn(long a, long b, void *ctx) 19 20 { 20 - return (long)a == (long)b; 21 + return a == b; 21 22 } 22 23 23 24 static inline size_t next_pow_2(size_t n) ··· 53 52 return; 54 53 55 54 for (i = 0; i < ELEM_CNT; i++) { 56 - const void *oldk, *k = (const void *)(long)i; 57 - void *oldv, *v = (void *)(long)(1024 + i); 55 + long oldk, k = i; 56 + long oldv, v = 1024 + i; 58 57 59 58 err = hashmap__update(map, k, v, &oldk, &oldv); 60 59 if (CHECK(err != -ENOENT, "hashmap__update", ··· 65 64 err = hashmap__add(map, k, v); 66 65 } else { 67 66 err = hashmap__set(map, k, v, &oldk, &oldv); 68 - if (CHECK(oldk != NULL || oldv != NULL, "check_kv", 69 - "unexpected k/v: %p=%p\n", oldk, oldv)) 67 + if (CHECK(oldk != 0 || oldv != 0, "check_kv", 68 + "unexpected k/v: %ld=%ld\n", oldk, oldv)) 70 69 goto cleanup; 71 70 } 72 71 73 - if (CHECK(err, "elem_add", "failed to add k/v %ld = %ld: %d\n", 74 - (long)k, (long)v, err)) 72 + if (CHECK(err, "elem_add", "failed to add k/v %ld = %ld: %d\n", k, v, err)) 75 73 goto cleanup; 76 74 77 75 if (CHECK(!hashmap__find(map, k, &oldv), "elem_find", 78 - "failed to find key %ld\n", (long)k)) 76 + "failed to find key %ld\n", k)) 79 77 goto cleanup; 80 - if (CHECK(oldv != v, "elem_val", 81 - "found value is wrong: %ld\n", (long)oldv)) 78 + if (CHECK(oldv != v, "elem_val", "found value is wrong: %ld\n", oldv)) 82 79 goto cleanup; 83 80 } 84 81 ··· 90 91 91 92 found_msk = 0; 92 93 hashmap__for_each_entry(map, entry, bkt) { 93 - long k = (long)entry->key; 94 - long v = (long)entry->value; 94 + long k = entry->key; 95 + long v = entry->value; 95 96 96 97 found_msk |= 1ULL << k; 
97 98 if (CHECK(v - k != 1024, "check_kv", ··· 103 104 goto cleanup; 104 105 105 106 for (i = 0; i < ELEM_CNT; i++) { 106 - const void *oldk, *k = (const void *)(long)i; 107 - void *oldv, *v = (void *)(long)(256 + i); 107 + long oldk, k = i; 108 + long oldv, v = 256 + i; 108 109 109 110 err = hashmap__add(map, k, v); 110 111 if (CHECK(err != -EEXIST, "hashmap__add", ··· 118 119 119 120 if (CHECK(err, "elem_upd", 120 121 "failed to update k/v %ld = %ld: %d\n", 121 - (long)k, (long)v, err)) 122 + k, v, err)) 122 123 goto cleanup; 123 124 if (CHECK(!hashmap__find(map, k, &oldv), "elem_find", 124 - "failed to find key %ld\n", (long)k)) 125 + "failed to find key %ld\n", k)) 125 126 goto cleanup; 126 127 if (CHECK(oldv != v, "elem_val", 127 - "found value is wrong: %ld\n", (long)oldv)) 128 + "found value is wrong: %ld\n", oldv)) 128 129 goto cleanup; 129 130 } 130 131 ··· 138 139 139 140 found_msk = 0; 140 141 hashmap__for_each_entry_safe(map, entry, tmp, bkt) { 141 - long k = (long)entry->key; 142 - long v = (long)entry->value; 142 + long k = entry->key; 143 + long v = entry->value; 143 144 144 145 found_msk |= 1ULL << k; 145 146 if (CHECK(v - k != 256, "elem_check", ··· 151 152 goto cleanup; 152 153 153 154 found_cnt = 0; 154 - hashmap__for_each_key_entry(map, entry, (void *)0) { 155 + hashmap__for_each_key_entry(map, entry, 0) { 155 156 found_cnt++; 156 157 } 157 158 if (CHECK(!found_cnt, "found_cnt", ··· 160 161 161 162 found_msk = 0; 162 163 found_cnt = 0; 163 - hashmap__for_each_key_entry_safe(map, entry, tmp, (void *)0) { 164 - const void *oldk, *k; 165 - void *oldv, *v; 164 + hashmap__for_each_key_entry_safe(map, entry, tmp, 0) { 165 + long oldk, k; 166 + long oldv, v; 166 167 167 168 k = entry->key; 168 169 v = entry->value; 169 170 170 171 found_cnt++; 171 - found_msk |= 1ULL << (long)k; 172 + found_msk |= 1ULL << k; 172 173 173 174 if (CHECK(!hashmap__delete(map, k, &oldk, &oldv), "elem_del", 174 - "failed to delete k/v %ld = %ld\n", 175 - (long)k, (long)v)) 
175 + "failed to delete k/v %ld = %ld\n", k, v)) 176 176 goto cleanup; 177 177 if (CHECK(oldk != k || oldv != v, "check_old", 178 178 "invalid deleted k/v: expected %ld = %ld, got %ld = %ld\n", 179 - (long)k, (long)v, (long)oldk, (long)oldv)) 179 + k, v, oldk, oldv)) 180 180 goto cleanup; 181 181 if (CHECK(hashmap__delete(map, k, &oldk, &oldv), "elem_del", 182 - "unexpectedly deleted k/v %ld = %ld\n", 183 - (long)oldk, (long)oldv)) 182 + "unexpectedly deleted k/v %ld = %ld\n", oldk, oldv)) 184 183 goto cleanup; 185 184 } 186 185 ··· 195 198 goto cleanup; 196 199 197 200 hashmap__for_each_entry_safe(map, entry, tmp, bkt) { 198 - const void *oldk, *k; 199 - void *oldv, *v; 201 + long oldk, k; 202 + long oldv, v; 200 203 201 204 k = entry->key; 202 205 v = entry->value; 203 206 204 207 found_cnt++; 205 - found_msk |= 1ULL << (long)k; 208 + found_msk |= 1ULL << k; 206 209 207 210 if (CHECK(!hashmap__delete(map, k, &oldk, &oldv), "elem_del", 208 - "failed to delete k/v %ld = %ld\n", 209 - (long)k, (long)v)) 211 + "failed to delete k/v %ld = %ld\n", k, v)) 210 212 goto cleanup; 211 213 if (CHECK(oldk != k || oldv != v, "elem_check", 212 214 "invalid old k/v: expect %ld = %ld, got %ld = %ld\n", 213 - (long)k, (long)v, (long)oldk, (long)oldv)) 215 + k, v, oldk, oldv)) 214 216 goto cleanup; 215 217 if (CHECK(hashmap__delete(map, k, &oldk, &oldv), "elem_del", 216 - "unexpectedly deleted k/v %ld = %ld\n", 217 - (long)k, (long)v)) 218 + "unexpectedly deleted k/v %ld = %ld\n", k, v)) 218 219 goto cleanup; 219 220 } 220 221 ··· 230 235 hashmap__for_each_entry(map, entry, bkt) { 231 236 CHECK(false, "elem_exists", 232 237 "unexpected map entries left: %ld = %ld\n", 233 - (long)entry->key, (long)entry->value); 238 + entry->key, entry->value); 234 239 goto cleanup; 235 240 } 236 241 ··· 238 243 hashmap__for_each_entry(map, entry, bkt) { 239 244 CHECK(false, "elem_exists", 240 245 "unexpected map entries left: %ld = %ld\n", 241 - (long)entry->key, (long)entry->value); 246 + 
entry->key, entry->value); 242 247 goto cleanup; 243 248 } 244 249 ··· 246 251 hashmap__free(map); 247 252 } 248 253 249 - static size_t collision_hash_fn(const void *k, void *ctx) 254 + static size_t str_hash_fn(long a, void *ctx) 255 + { 256 + return str_hash((char *)a); 257 + } 258 + 259 + static bool str_equal_fn(long a, long b, void *ctx) 260 + { 261 + return strcmp((char *)a, (char *)b) == 0; 262 + } 263 + 264 + /* Verify that hashmap interface works with pointer keys and values */ 265 + static void test_hashmap_ptr_iface(void) 266 + { 267 + const char *key, *value, *old_key, *old_value; 268 + struct hashmap_entry *cur; 269 + struct hashmap *map; 270 + int err, i, bkt; 271 + 272 + map = hashmap__new(str_hash_fn, str_equal_fn, NULL); 273 + if (CHECK(!map, "hashmap__new", "can't allocate hashmap\n")) 274 + goto cleanup; 275 + 276 + #define CHECK_STR(fn, var, expected) \ 277 + CHECK(strcmp(var, (expected)), (fn), \ 278 + "wrong value of " #var ": '%s' instead of '%s'\n", var, (expected)) 279 + 280 + err = hashmap__insert(map, "a", "apricot", HASHMAP_ADD, NULL, NULL); 281 + if (CHECK(err, "hashmap__insert", "unexpected error: %d\n", err)) 282 + goto cleanup; 283 + 284 + err = hashmap__insert(map, "a", "apple", HASHMAP_SET, &old_key, &old_value); 285 + if (CHECK(err, "hashmap__insert", "unexpected error: %d\n", err)) 286 + goto cleanup; 287 + CHECK_STR("hashmap__update", old_key, "a"); 288 + CHECK_STR("hashmap__update", old_value, "apricot"); 289 + 290 + err = hashmap__add(map, "b", "banana"); 291 + if (CHECK(err, "hashmap__add", "unexpected error: %d\n", err)) 292 + goto cleanup; 293 + 294 + err = hashmap__set(map, "b", "breadfruit", &old_key, &old_value); 295 + if (CHECK(err, "hashmap__set", "unexpected error: %d\n", err)) 296 + goto cleanup; 297 + CHECK_STR("hashmap__set", old_key, "b"); 298 + CHECK_STR("hashmap__set", old_value, "banana"); 299 + 300 + err = hashmap__update(map, "b", "blueberry", &old_key, &old_value); 301 + if (CHECK(err, "hashmap__update", 
"unexpected error: %d\n", err)) 302 + goto cleanup; 303 + CHECK_STR("hashmap__update", old_key, "b"); 304 + CHECK_STR("hashmap__update", old_value, "breadfruit"); 305 + 306 + err = hashmap__append(map, "c", "cherry"); 307 + if (CHECK(err, "hashmap__append", "unexpected error: %d\n", err)) 308 + goto cleanup; 309 + 310 + if (CHECK(!hashmap__delete(map, "c", &old_key, &old_value), 311 + "hashmap__delete", "expected to have entry for 'c'\n")) 312 + goto cleanup; 313 + CHECK_STR("hashmap__delete", old_key, "c"); 314 + CHECK_STR("hashmap__delete", old_value, "cherry"); 315 + 316 + CHECK(!hashmap__find(map, "b", &value), "hashmap__find", "can't find value for 'b'\n"); 317 + CHECK_STR("hashmap__find", value, "blueberry"); 318 + 319 + if (CHECK(!hashmap__delete(map, "b", NULL, NULL), 320 + "hashmap__delete", "expected to have entry for 'b'\n")) 321 + goto cleanup; 322 + 323 + i = 0; 324 + hashmap__for_each_entry(map, cur, bkt) { 325 + if (CHECK(i != 0, "hashmap__for_each_entry", "too many entries")) 326 + goto cleanup; 327 + key = cur->pkey; 328 + value = cur->pvalue; 329 + CHECK_STR("entry", key, "a"); 330 + CHECK_STR("entry", value, "apple"); 331 + i++; 332 + } 333 + #undef CHECK_STR 334 + 335 + cleanup: 336 + hashmap__free(map); 337 + } 338 + 339 + static size_t collision_hash_fn(long k, void *ctx) 250 340 { 251 341 return 0; 252 342 } 253 343 254 344 static void test_hashmap_multimap(void) 255 345 { 256 - void *k1 = (void *)0, *k2 = (void *)1; 346 + long k1 = 0, k2 = 1; 257 347 struct hashmap_entry *entry; 258 348 struct hashmap *map; 259 349 long found_msk; ··· 353 273 * [0] -> 1, 2, 4; 354 274 * [1] -> 8, 16, 32; 355 275 */ 356 - err = hashmap__append(map, k1, (void *)1); 276 + err = hashmap__append(map, k1, 1); 357 277 if (CHECK(err, "elem_add", "failed to add k/v: %d\n", err)) 358 278 goto cleanup; 359 - err = hashmap__append(map, k1, (void *)2); 279 + err = hashmap__append(map, k1, 2); 360 280 if (CHECK(err, "elem_add", "failed to add k/v: %d\n", err)) 361 281 
goto cleanup; 362 - err = hashmap__append(map, k1, (void *)4); 282 + err = hashmap__append(map, k1, 4); 363 283 if (CHECK(err, "elem_add", "failed to add k/v: %d\n", err)) 364 284 goto cleanup; 365 285 366 - err = hashmap__append(map, k2, (void *)8); 286 + err = hashmap__append(map, k2, 8); 367 287 if (CHECK(err, "elem_add", "failed to add k/v: %d\n", err)) 368 288 goto cleanup; 369 - err = hashmap__append(map, k2, (void *)16); 289 + err = hashmap__append(map, k2, 16); 370 290 if (CHECK(err, "elem_add", "failed to add k/v: %d\n", err)) 371 291 goto cleanup; 372 - err = hashmap__append(map, k2, (void *)32); 292 + err = hashmap__append(map, k2, 32); 373 293 if (CHECK(err, "elem_add", "failed to add k/v: %d\n", err)) 374 294 goto cleanup; 375 295 ··· 380 300 /* verify global iteration still works and sees all values */ 381 301 found_msk = 0; 382 302 hashmap__for_each_entry(map, entry, bkt) { 383 - found_msk |= (long)entry->value; 303 + found_msk |= entry->value; 384 304 } 385 305 if (CHECK(found_msk != (1 << 6) - 1, "found_msk", 386 306 "not all keys iterated: %lx\n", found_msk)) ··· 389 309 /* iterate values for key 1 */ 390 310 found_msk = 0; 391 311 hashmap__for_each_key_entry(map, entry, k1) { 392 - found_msk |= (long)entry->value; 312 + found_msk |= entry->value; 393 313 } 394 314 if (CHECK(found_msk != (1 | 2 | 4), "found_msk", 395 315 "invalid k1 values: %lx\n", found_msk)) ··· 398 318 /* iterate values for key 2 */ 399 319 found_msk = 0; 400 320 hashmap__for_each_key_entry(map, entry, k2) { 401 - found_msk |= (long)entry->value; 321 + found_msk |= entry->value; 402 322 } 403 323 if (CHECK(found_msk != (8 | 16 | 32), "found_msk", 404 324 "invalid k2 values: %lx\n", found_msk)) ··· 413 333 struct hashmap_entry *entry; 414 334 int bkt; 415 335 struct hashmap *map; 416 - void *k = (void *)0; 336 + long k = 0; 417 337 418 338 /* force collisions */ 419 339 map = hashmap__new(hash_fn, equal_fn, NULL); ··· 454 374 test_hashmap_multimap(); 455 375 if 
(test__start_subtest("empty")) 456 376 test_hashmap_empty(); 377 + if (test__start_subtest("ptr_iface")) 378 + test_hashmap_ptr_iface(); 457 379 }
+3 -3
tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
··· 312 312 return (__u64) t.tv_sec * 1000000000 + t.tv_nsec; 313 313 } 314 314 315 - static size_t symbol_hash(const void *key, void *ctx __maybe_unused) 315 + static size_t symbol_hash(long key, void *ctx __maybe_unused) 316 316 { 317 317 return str_hash((const char *) key); 318 318 } 319 319 320 - static bool symbol_equal(const void *key1, const void *key2, void *ctx __maybe_unused) 320 + static bool symbol_equal(long key1, long key2, void *ctx __maybe_unused) 321 321 { 322 322 return strcmp((const char *) key1, (const char *) key2) == 0; 323 323 } ··· 372 372 sizeof("__ftrace_invalid_address__") - 1)) 373 373 continue; 374 374 375 - err = hashmap__add(map, name, NULL); 375 + err = hashmap__add(map, name, 0); 376 376 if (err == -EEXIST) 377 377 continue; 378 378 if (err)
+4 -2
tools/testing/selftests/bpf/prog_tests/tcp_hdr_options.c
··· 485 485 goto check_linum; 486 486 487 487 ret = read(sk_fds.passive_fd, recv_msg, sizeof(recv_msg)); 488 - if (ASSERT_EQ(ret, sizeof(send_msg), "read(msg)")) 488 + if (!ASSERT_EQ(ret, sizeof(send_msg), "read(msg)")) 489 489 goto check_linum; 490 490 } 491 491 ··· 504 504 misc_skel->bss->nr_pure_ack); 505 505 506 506 ASSERT_EQ(misc_skel->bss->nr_fin, 1, "unexpected nr_fin"); 507 + 508 + ASSERT_EQ(misc_skel->bss->nr_hwtstamp, 0, "nr_hwtstamp"); 507 509 508 510 check_linum: 509 511 ASSERT_FALSE(check_error_linum(&sk_fds), "check_error_linum"); ··· 541 539 goto skel_destroy; 542 540 543 541 cg_fd = test__join_cgroup(CG_NAME); 544 - if (ASSERT_GE(cg_fd, 0, "join_cgroup")) 542 + if (!ASSERT_GE(cg_fd, 0, "join_cgroup")) 545 543 goto skel_destroy; 546 544 547 545 for (i = 0; i < ARRAY_SIZE(tests); i++) {
+4
tools/testing/selftests/bpf/progs/test_misc_tcp_hdr_options.c
··· 27 27 unsigned int nr_data = 0; 28 28 unsigned int nr_syn = 0; 29 29 unsigned int nr_fin = 0; 30 + unsigned int nr_hwtstamp = 0; 30 31 31 32 /* Check the header received from the active side */ 32 33 static int __check_active_hdr_in(struct bpf_sock_ops *skops, bool check_syn) ··· 146 145 147 146 if (th->ack && !th->fin && tcp_hdrlen(th) == skops->skb_len) 148 147 nr_pure_ack++; 148 + 149 + if (skops->skb_hwtstamp) 150 + nr_hwtstamp++; 149 151 150 152 return CG_OK; 151 153 }
+22 -16
tools/testing/selftests/bpf/test_progs.c
··· 222 222 return failed ? "FAIL" : (skipped ? "SKIP" : "OK"); 223 223 } 224 224 225 + #define TEST_NUM_WIDTH 7 226 + 227 + static void print_test_result(const struct prog_test_def *test, const struct test_state *test_state) 228 + { 229 + int skipped_cnt = test_state->skip_cnt; 230 + int subtests_cnt = test_state->subtest_num; 231 + 232 + fprintf(env.stdout, "#%-*d %s:", TEST_NUM_WIDTH, test->test_num, test->test_name); 233 + if (test_state->error_cnt) 234 + fprintf(env.stdout, "FAIL"); 235 + else if (!skipped_cnt) 236 + fprintf(env.stdout, "OK"); 237 + else if (skipped_cnt == subtests_cnt || !subtests_cnt) 238 + fprintf(env.stdout, "SKIP"); 239 + else 240 + fprintf(env.stdout, "OK (SKIP: %d/%d)", skipped_cnt, subtests_cnt); 241 + 242 + fprintf(env.stdout, "\n"); 243 + } 244 + 225 245 static void print_test_log(char *log_buf, size_t log_cnt) 226 246 { 227 247 log_buf[log_cnt] = '\0'; 228 248 fprintf(env.stdout, "%s", log_buf); 229 249 if (log_buf[log_cnt - 1] != '\n') 230 250 fprintf(env.stdout, "\n"); 231 - } 232 - 233 - #define TEST_NUM_WIDTH 7 234 - 235 - static void print_test_name(int test_num, const char *test_name, char *result) 236 - { 237 - fprintf(env.stdout, "#%-*d %s", TEST_NUM_WIDTH, test_num, test_name); 238 - 239 - if (result) 240 - fprintf(env.stdout, ":%s", result); 241 - 242 - fprintf(env.stdout, "\n"); 243 251 } 244 252 245 253 static void print_subtest_name(int test_num, int subtest_num, ··· 315 307 subtest_state->skipped)); 316 308 } 317 309 318 - print_test_name(test->test_num, test->test_name, 319 - test_result(test_failed, test_state->skip_cnt)); 310 + print_test_result(test, test_state); 320 311 } 321 312 322 313 static void stdio_restore(void); ··· 1077 1070 state->tested = true; 1078 1071 1079 1072 if (verbose() && env.worker_id == -1) 1080 - print_test_name(test_num + 1, test->test_name, 1081 - test_result(state->error_cnt, state->skip_cnt)); 1073 + print_test_result(test, state); 1082 1074 1083 1075 reset_affinity(); 1084 1076 
restore_netns();
+737 -164
tools/testing/selftests/bpf/veristat.c
··· 17 17 #include <bpf/libbpf.h> 18 18 #include <libelf.h> 19 19 #include <gelf.h> 20 + #include <float.h> 20 21 21 22 enum stat_id { 22 23 VERDICT, ··· 35 34 NUM_STATS_CNT = FILE_NAME - VERDICT, 36 35 }; 37 36 37 + /* In comparison mode each stat can specify up to four different values: 38 + * - A side value; 39 + * - B side value; 40 + * - absolute diff value; 41 + * - relative (percentage) diff value. 42 + * 43 + * When specifying stat specs in comparison mode, user can use one of the 44 + * following variant suffixes to specify which exact variant should be used for 45 + * ordering or filtering: 46 + * - `_a` for A side value; 47 + * - `_b` for B side value; 48 + * - `_diff` for absolute diff value; 49 + * - `_pct` for relative (percentage) diff value. 50 + * 51 + * If no variant suffix is provided, then `_b` (control data) is assumed. 52 + * 53 + * As an example, let's say instructions stat has the following output: 54 + * 55 + * Insns (A) Insns (B) Insns (DIFF) 56 + * --------- --------- -------------- 57 + * 21547 20920 -627 (-2.91%) 58 + * 59 + * Then: 60 + * - 21547 is A side value (insns_a); 61 + * - 20920 is B side value (insns_b); 62 + * - -627 is absolute diff value (insns_diff); 63 + * - -2.91% is relative diff value (insns_pct). 64 + * 65 + * For verdict there is no verdict_pct variant. 66 + * For file and program name, _a and _b variants are equivalent and there are 67 + * no _diff or _pct variants. 
68 + */ 69 + enum stat_variant { 70 + VARIANT_A, 71 + VARIANT_B, 72 + VARIANT_DIFF, 73 + VARIANT_PCT, 74 + }; 75 + 38 76 struct verif_stats { 39 77 char *file_name; 40 78 char *prog_name; ··· 81 41 long stats[NUM_STATS_CNT]; 82 42 }; 83 43 44 + /* joined comparison mode stats */ 45 + struct verif_stats_join { 46 + char *file_name; 47 + char *prog_name; 48 + 49 + const struct verif_stats *stats_a; 50 + const struct verif_stats *stats_b; 51 + }; 52 + 84 53 struct stat_specs { 85 54 int spec_cnt; 86 55 enum stat_id ids[ALL_STATS_CNT]; 56 + enum stat_variant variants[ALL_STATS_CNT]; 87 57 bool asc[ALL_STATS_CNT]; 88 58 int lens[ALL_STATS_CNT * 3]; /* 3x for comparison mode */ 89 59 }; ··· 104 54 RESFMT_CSV, 105 55 }; 106 56 57 + enum filter_kind { 58 + FILTER_NAME, 59 + FILTER_STAT, 60 + }; 61 + 62 + enum operator_kind { 63 + OP_EQ, /* == or = */ 64 + OP_NEQ, /* != or <> */ 65 + OP_LT, /* < */ 66 + OP_LE, /* <= */ 67 + OP_GT, /* > */ 68 + OP_GE, /* >= */ 69 + }; 70 + 107 71 struct filter { 72 + enum filter_kind kind; 73 + /* FILTER_NAME */ 74 + char *any_glob; 108 75 char *file_glob; 109 76 char *prog_glob; 77 + /* FILTER_STAT */ 78 + enum operator_kind op; 79 + int stat_id; 80 + enum stat_variant stat_var; 81 + long value; 110 82 }; 111 83 112 84 static struct env { ··· 139 67 int log_level; 140 68 enum resfmt out_fmt; 141 69 bool comparison_mode; 70 + bool replay_mode; 142 71 143 72 struct verif_stats *prog_stats; 144 73 int prog_stat_cnt; ··· 147 74 /* baseline_stats is allocated and used only in comparsion mode */ 148 75 struct verif_stats *baseline_stats; 149 76 int baseline_stat_cnt; 77 + 78 + struct verif_stats_join *join_stats; 79 + int join_stat_cnt; 150 80 151 81 struct stat_specs output_spec; 152 82 struct stat_specs sort_spec; ··· 191 115 { "sort", 's', "SPEC", 0, "Specify sort order" }, 192 116 { "output-format", 'o', "FMT", 0, "Result output format (table, csv), default is table." 
}, 193 117 { "compare", 'C', NULL, 0, "Comparison mode" }, 118 + { "replay", 'R', NULL, 0, "Replay mode" }, 194 119 { "filter", 'f', "FILTER", 0, "Filter expressions (or @filename for file with expressions)." }, 195 120 {}, 196 121 }; ··· 245 168 break; 246 169 case 'C': 247 170 env.comparison_mode = true; 171 + break; 172 + case 'R': 173 + env.replay_mode = true; 248 174 break; 249 175 case 'f': 250 176 if (arg[0] == '@') ··· 306 226 return !*str && !*pat; 307 227 } 308 228 309 - static bool should_process_file(const char *filename) 310 - { 311 - int i; 312 - 313 - if (env.deny_filter_cnt > 0) { 314 - for (i = 0; i < env.deny_filter_cnt; i++) { 315 - if (glob_matches(filename, env.deny_filters[i].file_glob)) 316 - return false; 317 - } 318 - } 319 - 320 - if (env.allow_filter_cnt == 0) 321 - return true; 322 - 323 - for (i = 0; i < env.allow_filter_cnt; i++) { 324 - if (glob_matches(filename, env.allow_filters[i].file_glob)) 325 - return true; 326 - } 327 - 328 - return false; 329 - } 330 - 331 229 static bool is_bpf_obj_file(const char *path) { 332 230 Elf64_Ehdr *ehdr; 333 231 int fd, err = -EINVAL; ··· 338 280 return err == 0; 339 281 } 340 282 341 - static bool should_process_prog(const char *path, const char *prog_name) 283 + static bool should_process_file_prog(const char *filename, const char *prog_name) 342 284 { 343 - const char *filename = basename(path); 344 - int i; 285 + struct filter *f; 286 + int i, allow_cnt = 0; 345 287 346 - if (env.deny_filter_cnt > 0) { 347 - for (i = 0; i < env.deny_filter_cnt; i++) { 348 - if (glob_matches(filename, env.deny_filters[i].file_glob)) 349 - return false; 350 - if (!env.deny_filters[i].prog_glob) 288 + for (i = 0; i < env.deny_filter_cnt; i++) { 289 + f = &env.deny_filters[i]; 290 + if (f->kind != FILTER_NAME) 291 + continue; 292 + 293 + if (f->any_glob && glob_matches(filename, f->any_glob)) 294 + return false; 295 + if (f->any_glob && prog_name && glob_matches(prog_name, f->any_glob)) 296 + return false; 297 + 
if (f->file_glob && glob_matches(filename, f->file_glob)) 298 + return false; 299 + if (f->prog_glob && prog_name && glob_matches(prog_name, f->prog_glob)) 300 + return false; 301 + } 302 + 303 + for (i = 0; i < env.allow_filter_cnt; i++) { 304 + f = &env.allow_filters[i]; 305 + if (f->kind != FILTER_NAME) 306 + continue; 307 + 308 + allow_cnt++; 309 + if (f->any_glob) { 310 + if (glob_matches(filename, f->any_glob)) 311 + return true; 312 + /* If we don't know program name yet, any_glob filter 313 + * has to assume that current BPF object file might be 314 + * relevant; we'll check again later on after opening 315 + * BPF object file, at which point program name will 316 + * be known finally. 317 + */ 318 + if (!prog_name || glob_matches(prog_name, f->any_glob)) 319 + return true; 320 + } else { 321 + if (f->file_glob && !glob_matches(filename, f->file_glob)) 351 322 continue; 352 - if (glob_matches(prog_name, env.deny_filters[i].prog_glob)) 353 - return false; 323 + if (f->prog_glob && prog_name && !glob_matches(prog_name, f->prog_glob)) 324 + continue; 325 + return true; 354 326 } 355 327 } 356 328 357 - if (env.allow_filter_cnt == 0) 358 - return true; 359 - 360 - for (i = 0; i < env.allow_filter_cnt; i++) { 361 - if (!glob_matches(filename, env.allow_filters[i].file_glob)) 362 - continue; 363 - /* if filter specifies only filename glob part, it implicitly 364 - * allows all progs within that file 365 - */ 366 - if (!env.allow_filters[i].prog_glob) 367 - return true; 368 - if (glob_matches(prog_name, env.allow_filters[i].prog_glob)) 369 - return true; 370 - } 371 - 372 - return false; 329 + /* if there are no file/prog name allow filters, allow all progs, 330 + * unless they are denied earlier explicitly 331 + */ 332 + return allow_cnt == 0; 373 333 } 334 + 335 + static struct { 336 + enum operator_kind op_kind; 337 + const char *op_str; 338 + } operators[] = { 339 + /* Order of these definitions matter to avoid situations like '<' 340 + * matching part of what 
is actually a '<>' operator. That is,
+	 * substrings should go last.
+	 */
+	{ OP_EQ, "==" },
+	{ OP_NEQ, "!=" },
+	{ OP_NEQ, "<>" },
+	{ OP_LE, "<=" },
+	{ OP_LT, "<" },
+	{ OP_GE, ">=" },
+	{ OP_GT, ">" },
+	{ OP_EQ, "=" },
+};
+
+static bool parse_stat_id_var(const char *name, size_t len, int *id, enum stat_variant *var);
 
 static int append_filter(struct filter **filters, int *cnt, const char *str)
 {
 	struct filter *f;
 	void *tmp;
 	const char *p;
+	int i;
 
 	tmp = realloc(*filters, (*cnt + 1) * sizeof(**filters));
 	if (!tmp)
···
 	*filters = tmp;
 
 	f = &(*filters)[*cnt];
-	f->file_glob = f->prog_glob = NULL;
+	memset(f, 0, sizeof(*f));
 
-	/* filter can be specified either as "<obj-glob>" or "<obj-glob>/<prog-glob>" */
+	/* First, let's check if it's a stats filter of the following form:
+	 * <stat><op><value>, where:
+	 * - <stat> is one of the supported numerical stats (verdict is also
+	 *   considered numerical, failure == 0, success == 1);
+	 * - <op> is a comparison operator (see `operators` definitions);
+	 * - <value> is an integer (or failure/success, or false/true as
+	 *   special aliases for 0 and 1, respectively).
+	 * If the form doesn't match what the user provided, we assume a
+	 * file/prog glob filter.
340 + */ 341 + for (i = 0; i < ARRAY_SIZE(operators); i++) { 342 + enum stat_variant var; 343 + int id; 344 + long val; 345 + const char *end = str; 346 + const char *op_str; 347 + 348 + op_str = operators[i].op_str; 349 + p = strstr(str, op_str); 350 + if (!p) 351 + continue; 352 + 353 + if (!parse_stat_id_var(str, p - str, &id, &var)) { 354 + fprintf(stderr, "Unrecognized stat name in '%s'!\n", str); 355 + return -EINVAL; 356 + } 357 + if (id >= FILE_NAME) { 358 + fprintf(stderr, "Non-integer stat is specified in '%s'!\n", str); 359 + return -EINVAL; 360 + } 361 + 362 + p += strlen(op_str); 363 + 364 + if (strcasecmp(p, "true") == 0 || 365 + strcasecmp(p, "t") == 0 || 366 + strcasecmp(p, "success") == 0 || 367 + strcasecmp(p, "succ") == 0 || 368 + strcasecmp(p, "s") == 0 || 369 + strcasecmp(p, "match") == 0 || 370 + strcasecmp(p, "m") == 0) { 371 + val = 1; 372 + } else if (strcasecmp(p, "false") == 0 || 373 + strcasecmp(p, "f") == 0 || 374 + strcasecmp(p, "failure") == 0 || 375 + strcasecmp(p, "fail") == 0 || 376 + strcasecmp(p, "mismatch") == 0 || 377 + strcasecmp(p, "mis") == 0) { 378 + val = 0; 379 + } else { 380 + errno = 0; 381 + val = strtol(p, (char **)&end, 10); 382 + if (errno || end == p || *end != '\0' ) { 383 + fprintf(stderr, "Invalid integer value in '%s'!\n", str); 384 + return -EINVAL; 385 + } 386 + } 387 + 388 + f->kind = FILTER_STAT; 389 + f->stat_id = id; 390 + f->stat_var = var; 391 + f->op = operators[i].op_kind; 392 + f->value = val; 393 + 394 + *cnt += 1; 395 + return 0; 396 + } 397 + 398 + /* File/prog filter can be specified either as '<glob>' or 399 + * '<file-glob>/<prog-glob>'. In the former case <glob> is applied to 400 + * both file and program names. This seems to be way more useful in 401 + * practice. If user needs full control, they can use '/<prog-glob>' 402 + * form to glob just program name, or '<file-glob>/' to glob only file 403 + * name. But usually common <glob> seems to be the most useful and 404 + * ergonomic way. 
405 + */ 406 + f->kind = FILTER_NAME; 429 407 p = strchr(str, '/'); 430 408 if (!p) { 431 - f->file_glob = strdup(str); 432 - if (!f->file_glob) 409 + f->any_glob = strdup(str); 410 + if (!f->any_glob) 433 411 return -ENOMEM; 434 412 } else { 435 - f->file_glob = strndup(str, p - str); 436 - f->prog_glob = strdup(p + 1); 437 - if (!f->file_glob || !f->prog_glob) { 438 - free(f->file_glob); 439 - free(f->prog_glob); 440 - f->file_glob = f->prog_glob = NULL; 441 - return -ENOMEM; 413 + if (str != p) { 414 + /* non-empty file glob */ 415 + f->file_glob = strndup(str, p - str); 416 + if (!f->file_glob) 417 + return -ENOMEM; 418 + } 419 + if (strlen(p + 1) > 0) { 420 + /* non-empty prog glob */ 421 + f->prog_glob = strdup(p + 1); 422 + if (!f->prog_glob) { 423 + free(f->file_glob); 424 + f->file_glob = NULL; 425 + return -ENOMEM; 426 + } 442 427 } 443 428 } 444 429 445 - *cnt = *cnt + 1; 430 + *cnt += 1; 446 431 return 0; 447 432 } 448 433 ··· 567 388 }, 568 389 }; 569 390 391 + static const struct stat_specs default_csv_output_spec = { 392 + .spec_cnt = 9, 393 + .ids = { 394 + FILE_NAME, PROG_NAME, VERDICT, DURATION, 395 + TOTAL_INSNS, TOTAL_STATES, PEAK_STATES, 396 + MAX_STATES_PER_INSN, MARK_READ_MAX_LEN, 397 + }, 398 + }; 399 + 570 400 static const struct stat_specs default_sort_spec = { 401 + .spec_cnt = 2, 402 + .ids = { 403 + FILE_NAME, PROG_NAME, 404 + }, 405 + .asc = { true, true, }, 406 + }; 407 + 408 + /* sorting for comparison mode to join two data sets */ 409 + static const struct stat_specs join_sort_spec = { 571 410 .spec_cnt = 2, 572 411 .ids = { 573 412 FILE_NAME, PROG_NAME, ··· 597 400 const char *header; 598 401 const char *names[4]; 599 402 bool asc_by_default; 403 + bool left_aligned; 600 404 } stat_defs[] = { 601 - [FILE_NAME] = { "File", {"file_name", "filename", "file"}, true /* asc */ }, 602 - [PROG_NAME] = { "Program", {"prog_name", "progname", "prog"}, true /* asc */ }, 603 - [VERDICT] = { "Verdict", {"verdict"}, true /* asc: failure, success 
*/ }, 405 + [FILE_NAME] = { "File", {"file_name", "filename", "file"}, true /* asc */, true /* left */ }, 406 + [PROG_NAME] = { "Program", {"prog_name", "progname", "prog"}, true /* asc */, true /* left */ }, 407 + [VERDICT] = { "Verdict", {"verdict"}, true /* asc: failure, success */, true /* left */ }, 604 408 [DURATION] = { "Duration (us)", {"duration", "dur"}, }, 605 - [TOTAL_INSNS] = { "Total insns", {"total_insns", "insns"}, }, 606 - [TOTAL_STATES] = { "Total states", {"total_states", "states"}, }, 409 + [TOTAL_INSNS] = { "Insns", {"total_insns", "insns"}, }, 410 + [TOTAL_STATES] = { "States", {"total_states", "states"}, }, 607 411 [PEAK_STATES] = { "Peak states", {"peak_states"}, }, 608 412 [MAX_STATES_PER_INSN] = { "Max states per insn", {"max_states_per_insn"}, }, 609 413 [MARK_READ_MAX_LEN] = { "Max mark read length", {"max_mark_read_len", "mark_read"}, }, 610 414 }; 611 415 416 + static bool parse_stat_id_var(const char *name, size_t len, int *id, enum stat_variant *var) 417 + { 418 + static const char *var_sfxs[] = { 419 + [VARIANT_A] = "_a", 420 + [VARIANT_B] = "_b", 421 + [VARIANT_DIFF] = "_diff", 422 + [VARIANT_PCT] = "_pct", 423 + }; 424 + int i, j, k; 425 + 426 + for (i = 0; i < ARRAY_SIZE(stat_defs); i++) { 427 + struct stat_def *def = &stat_defs[i]; 428 + size_t alias_len, sfx_len; 429 + const char *alias; 430 + 431 + for (j = 0; j < ARRAY_SIZE(stat_defs[i].names); j++) { 432 + alias = def->names[j]; 433 + if (!alias) 434 + continue; 435 + 436 + alias_len = strlen(alias); 437 + if (strncmp(name, alias, alias_len) != 0) 438 + continue; 439 + 440 + if (alias_len == len) { 441 + /* If no variant suffix is specified, we 442 + * assume control group (just in case we are 443 + * in comparison mode. Variant is ignored in 444 + * non-comparison mode. 
445 + */ 446 + *var = VARIANT_B; 447 + *id = i; 448 + return true; 449 + } 450 + 451 + for (k = 0; k < ARRAY_SIZE(var_sfxs); k++) { 452 + sfx_len = strlen(var_sfxs[k]); 453 + if (alias_len + sfx_len != len) 454 + continue; 455 + 456 + if (strncmp(name + alias_len, var_sfxs[k], sfx_len) == 0) { 457 + *var = (enum stat_variant)k; 458 + *id = i; 459 + return true; 460 + } 461 + } 462 + } 463 + } 464 + 465 + return false; 466 + } 467 + 468 + static bool is_asc_sym(char c) 469 + { 470 + return c == '^'; 471 + } 472 + 473 + static bool is_desc_sym(char c) 474 + { 475 + return c == 'v' || c == 'V' || c == '.' || c == '!' || c == '_'; 476 + } 477 + 612 478 static int parse_stat(const char *stat_name, struct stat_specs *specs) 613 479 { 614 - int id, i; 480 + int id; 481 + bool has_order = false, is_asc = false; 482 + size_t len = strlen(stat_name); 483 + enum stat_variant var; 615 484 616 485 if (specs->spec_cnt >= ARRAY_SIZE(specs->ids)) { 617 486 fprintf(stderr, "Can't specify more than %zd stats\n", ARRAY_SIZE(specs->ids)); 618 487 return -E2BIG; 619 488 } 620 489 621 - for (id = 0; id < ARRAY_SIZE(stat_defs); id++) { 622 - struct stat_def *def = &stat_defs[id]; 623 - 624 - for (i = 0; i < ARRAY_SIZE(stat_defs[id].names); i++) { 625 - if (!def->names[i] || strcmp(def->names[i], stat_name) != 0) 626 - continue; 627 - 628 - specs->ids[specs->spec_cnt] = id; 629 - specs->asc[specs->spec_cnt] = def->asc_by_default; 630 - specs->spec_cnt++; 631 - 632 - return 0; 633 - } 490 + if (len > 1 && (is_asc_sym(stat_name[len - 1]) || is_desc_sym(stat_name[len - 1]))) { 491 + has_order = true; 492 + is_asc = is_asc_sym(stat_name[len - 1]); 493 + len -= 1; 634 494 } 635 495 636 - fprintf(stderr, "Unrecognized stat name '%s'\n", stat_name); 637 - return -ESRCH; 496 + if (!parse_stat_id_var(stat_name, len, &id, &var)) { 497 + fprintf(stderr, "Unrecognized stat name '%s'\n", stat_name); 498 + return -ESRCH; 499 + } 500 + 501 + specs->ids[specs->spec_cnt] = id; 502 + 
specs->variants[specs->spec_cnt] = var; 503 + specs->asc[specs->spec_cnt] = has_order ? is_asc : stat_defs[id].asc_by_default; 504 + specs->spec_cnt++; 505 + 506 + return 0; 638 507 } 639 508 640 509 static int parse_stats(const char *stats_str, struct stat_specs *specs) ··· 803 540 int err = 0; 804 541 void *tmp; 805 542 806 - if (!should_process_prog(filename, bpf_program__name(prog))) { 543 + if (!should_process_file_prog(basename(filename), bpf_program__name(prog))) { 807 544 env.progs_skipped++; 808 545 return 0; 809 546 } ··· 859 596 LIBBPF_OPTS(bpf_object_open_opts, opts); 860 597 int err = 0, prog_cnt = 0; 861 598 862 - if (!should_process_file(basename(filename))) { 599 + if (!should_process_file_prog(basename(filename), NULL)) { 863 600 if (env.verbose) 864 601 printf("Skipping '%s' due to filters...\n", filename); 865 602 env.files_skipped++; ··· 979 716 return cmp; 980 717 } 981 718 982 - return 0; 719 + /* always disambiguate with file+prog, which are unique */ 720 + cmp = strcmp(s1->file_name, s2->file_name); 721 + if (cmp != 0) 722 + return cmp; 723 + return strcmp(s1->prog_name, s2->prog_name); 724 + } 725 + 726 + static void fetch_join_stat_value(const struct verif_stats_join *s, 727 + enum stat_id id, enum stat_variant var, 728 + const char **str_val, 729 + double *num_val) 730 + { 731 + long v1, v2; 732 + 733 + if (id == FILE_NAME) { 734 + *str_val = s->file_name; 735 + return; 736 + } 737 + if (id == PROG_NAME) { 738 + *str_val = s->prog_name; 739 + return; 740 + } 741 + 742 + v1 = s->stats_a ? s->stats_a->stats[id] : 0; 743 + v2 = s->stats_b ? 
s->stats_b->stats[id] : 0; 744 + 745 + switch (var) { 746 + case VARIANT_A: 747 + if (!s->stats_a) 748 + *num_val = -DBL_MAX; 749 + else 750 + *num_val = s->stats_a->stats[id]; 751 + return; 752 + case VARIANT_B: 753 + if (!s->stats_b) 754 + *num_val = -DBL_MAX; 755 + else 756 + *num_val = s->stats_b->stats[id]; 757 + return; 758 + case VARIANT_DIFF: 759 + if (!s->stats_a || !s->stats_b) 760 + *num_val = -DBL_MAX; 761 + else if (id == VERDICT) 762 + *num_val = v1 == v2 ? 1.0 /* MATCH */ : 0.0 /* MISMATCH */; 763 + else 764 + *num_val = (double)(v2 - v1); 765 + return; 766 + case VARIANT_PCT: 767 + if (!s->stats_a || !s->stats_b) { 768 + *num_val = -DBL_MAX; 769 + } else if (v1 == 0) { 770 + if (v1 == v2) 771 + *num_val = 0.0; 772 + else 773 + *num_val = v2 < v1 ? -100.0 : 100.0; 774 + } else { 775 + *num_val = (v2 - v1) * 100.0 / v1; 776 + } 777 + return; 778 + } 779 + } 780 + 781 + static int cmp_join_stat(const struct verif_stats_join *s1, 782 + const struct verif_stats_join *s2, 783 + enum stat_id id, enum stat_variant var, bool asc) 784 + { 785 + const char *str1 = NULL, *str2 = NULL; 786 + double v1, v2; 787 + int cmp = 0; 788 + 789 + fetch_join_stat_value(s1, id, var, &str1, &v1); 790 + fetch_join_stat_value(s2, id, var, &str2, &v2); 791 + 792 + if (str1) 793 + cmp = strcmp(str1, str2); 794 + else if (v1 != v2) 795 + cmp = v1 < v2 ? -1 : 1; 796 + 797 + return asc ? 
cmp : -cmp; 798 + } 799 + 800 + static int cmp_join_stats(const void *v1, const void *v2) 801 + { 802 + const struct verif_stats_join *s1 = v1, *s2 = v2; 803 + int i, cmp; 804 + 805 + for (i = 0; i < env.sort_spec.spec_cnt; i++) { 806 + cmp = cmp_join_stat(s1, s2, 807 + env.sort_spec.ids[i], 808 + env.sort_spec.variants[i], 809 + env.sort_spec.asc[i]); 810 + if (cmp != 0) 811 + return cmp; 812 + } 813 + 814 + /* always disambiguate with file+prog, which are unique */ 815 + cmp = strcmp(s1->file_name, s2->file_name); 816 + if (cmp != 0) 817 + return cmp; 818 + return strcmp(s1->prog_name, s2->prog_name); 983 819 } 984 820 985 821 #define HEADER_CHAR '-' ··· 1100 738 1101 739 static void output_headers(enum resfmt fmt) 1102 740 { 741 + const char *fmt_str; 1103 742 int i, len; 1104 743 1105 744 for (i = 0; i < env.output_spec.spec_cnt; i++) { ··· 1114 751 *max_len = len; 1115 752 break; 1116 753 case RESFMT_TABLE: 1117 - printf("%s%-*s", i == 0 ? "" : COLUMN_SEP, *max_len, stat_defs[id].header); 754 + fmt_str = stat_defs[id].left_aligned ? "%s%-*s" : "%s%*s"; 755 + printf(fmt_str, i == 0 ? "" : COLUMN_SEP, *max_len, stat_defs[id].header); 1118 756 if (i == env.output_spec.spec_cnt - 1) 1119 757 printf("\n"); 1120 758 break; ··· 1136 772 { 1137 773 switch (id) { 1138 774 case FILE_NAME: 1139 - *str = s->file_name; 775 + *str = s ? s->file_name : "N/A"; 1140 776 break; 1141 777 case PROG_NAME: 1142 - *str = s->prog_name; 778 + *str = s ? s->prog_name : "N/A"; 1143 779 break; 1144 780 case VERDICT: 1145 - *str = s->stats[VERDICT] ? "success" : "failure"; 781 + if (!s) 782 + *str = "N/A"; 783 + else 784 + *str = s->stats[VERDICT] ? "success" : "failure"; 1146 785 break; 1147 786 case DURATION: 1148 787 case TOTAL_INSNS: ··· 1153 786 case PEAK_STATES: 1154 787 case MAX_STATES_PER_INSN: 1155 788 case MARK_READ_MAX_LEN: 1156 - *val = s->stats[id]; 789 + *val = s ? 
s->stats[id] : 0; 1157 790 break; 1158 791 default: 1159 792 fprintf(stderr, "Unrecognized stat #%d\n", id); ··· 1206 839 printf("Done. Processed %d files, %d programs. Skipped %d files, %d programs.\n", 1207 840 env.files_processed, env.files_skipped, env.progs_processed, env.progs_skipped); 1208 841 } 1209 - } 1210 - 1211 - static int handle_verif_mode(void) 1212 - { 1213 - int i, err; 1214 - 1215 - if (env.filename_cnt == 0) { 1216 - fprintf(stderr, "Please provide path to BPF object file!\n"); 1217 - argp_help(&argp, stderr, ARGP_HELP_USAGE, "veristat"); 1218 - return -EINVAL; 1219 - } 1220 - 1221 - for (i = 0; i < env.filename_cnt; i++) { 1222 - err = process_obj(env.filenames[i]); 1223 - if (err) { 1224 - fprintf(stderr, "Failed to process '%s': %d\n", env.filenames[i], err); 1225 - return err; 1226 - } 1227 - } 1228 - 1229 - qsort(env.prog_stats, env.prog_stat_cnt, sizeof(*env.prog_stats), cmp_prog_stats); 1230 - 1231 - if (env.out_fmt == RESFMT_TABLE) { 1232 - /* calculate column widths */ 1233 - output_headers(RESFMT_TABLE_CALCLEN); 1234 - for (i = 0; i < env.prog_stat_cnt; i++) 1235 - output_stats(&env.prog_stats[i], RESFMT_TABLE_CALCLEN, false); 1236 - } 1237 - 1238 - /* actually output the table */ 1239 - output_headers(env.out_fmt); 1240 - for (i = 0; i < env.prog_stat_cnt; i++) { 1241 - output_stats(&env.prog_stats[i], env.out_fmt, i == env.prog_stat_cnt - 1); 1242 - } 1243 - 1244 - return 0; 1245 842 } 1246 843 1247 844 static int parse_stat_value(const char *str, enum stat_id id, struct verif_stats *st) ··· 1339 1008 * parsed entire line; if row should be ignored we pretend we 1340 1009 * never parsed it 1341 1010 */ 1342 - if (!should_process_prog(st->file_name, st->prog_name)) { 1011 + if (!should_process_file_prog(st->file_name, st->prog_name)) { 1343 1012 free(st->file_name); 1344 1013 free(st->prog_name); 1345 1014 *stat_cntp -= 1; ··· 1428 1097 output_comp_header_underlines(); 1429 1098 } 1430 1099 1431 - static void output_comp_stats(const 
struct verif_stats *base, const struct verif_stats *comp, 1100 + static void output_comp_stats(const struct verif_stats_join *join_stats, 1432 1101 enum resfmt fmt, bool last) 1433 1102 { 1103 + const struct verif_stats *base = join_stats->stats_a; 1104 + const struct verif_stats *comp = join_stats->stats_b; 1434 1105 char base_buf[1024] = {}, comp_buf[1024] = {}, diff_buf[1024] = {}; 1435 1106 int i; 1436 1107 ··· 1450 1117 /* normalize all the outputs to be in string buffers for simplicity */ 1451 1118 if (is_key_stat(id)) { 1452 1119 /* key stats (file and program name) are always strings */ 1453 - if (base != &fallback_stats) 1120 + if (base) 1454 1121 snprintf(base_buf, sizeof(base_buf), "%s", base_str); 1455 1122 else 1456 1123 snprintf(base_buf, sizeof(base_buf), "%s", comp_str); 1457 1124 } else if (base_str) { 1458 1125 snprintf(base_buf, sizeof(base_buf), "%s", base_str); 1459 1126 snprintf(comp_buf, sizeof(comp_buf), "%s", comp_str); 1460 - if (strcmp(base_str, comp_str) == 0) 1127 + if (!base || !comp) 1128 + snprintf(diff_buf, sizeof(diff_buf), "%s", "N/A"); 1129 + else if (strcmp(base_str, comp_str) == 0) 1461 1130 snprintf(diff_buf, sizeof(diff_buf), "%s", "MATCH"); 1462 1131 else 1463 1132 snprintf(diff_buf, sizeof(diff_buf), "%s", "MISMATCH"); 1464 1133 } else { 1465 1134 double p = 0.0; 1466 1135 1467 - snprintf(base_buf, sizeof(base_buf), "%ld", base_val); 1468 - snprintf(comp_buf, sizeof(comp_buf), "%ld", comp_val); 1136 + if (base) 1137 + snprintf(base_buf, sizeof(base_buf), "%ld", base_val); 1138 + else 1139 + snprintf(base_buf, sizeof(base_buf), "%s", "N/A"); 1140 + if (comp) 1141 + snprintf(comp_buf, sizeof(comp_buf), "%ld", comp_val); 1142 + else 1143 + snprintf(comp_buf, sizeof(comp_buf), "%s", "N/A"); 1469 1144 1470 1145 diff_val = comp_val - base_val; 1471 - if (base == &fallback_stats || comp == &fallback_stats || base_val == 0) { 1472 - if (comp_val == base_val) 1473 - p = 0.0; /* avoid +0 (+100%) case */ 1474 - else 1475 - p = 
comp_val < base_val ? -100.0 : 100.0; 1146 + if (!base || !comp) { 1147 + snprintf(diff_buf, sizeof(diff_buf), "%s", "N/A"); 1476 1148 } else { 1477 - p = diff_val * 100.0 / base_val; 1149 + if (base_val == 0) { 1150 + if (comp_val == base_val) 1151 + p = 0.0; /* avoid +0 (+100%) case */ 1152 + else 1153 + p = comp_val < base_val ? -100.0 : 100.0; 1154 + } else { 1155 + p = diff_val * 100.0 / base_val; 1156 + } 1157 + snprintf(diff_buf, sizeof(diff_buf), "%+ld (%+.2lf%%)", diff_val, p); 1478 1158 } 1479 - snprintf(diff_buf, sizeof(diff_buf), "%+ld (%+.2lf%%)", diff_val, p); 1480 1159 } 1481 1160 1482 1161 switch (fmt) { ··· 1544 1199 return strcmp(base->prog_name, comp->prog_name); 1545 1200 } 1546 1201 1202 + static bool is_join_stat_filter_matched(struct filter *f, const struct verif_stats_join *stats) 1203 + { 1204 + static const double eps = 1e-9; 1205 + const char *str = NULL; 1206 + double value = 0.0; 1207 + 1208 + fetch_join_stat_value(stats, f->stat_id, f->stat_var, &str, &value); 1209 + 1210 + switch (f->op) { 1211 + case OP_EQ: return value > f->value - eps && value < f->value + eps; 1212 + case OP_NEQ: return value < f->value - eps || value > f->value + eps; 1213 + case OP_LT: return value < f->value - eps; 1214 + case OP_LE: return value <= f->value + eps; 1215 + case OP_GT: return value > f->value + eps; 1216 + case OP_GE: return value >= f->value - eps; 1217 + } 1218 + 1219 + fprintf(stderr, "BUG: unknown filter op %d!\n", f->op); 1220 + return false; 1221 + } 1222 + 1223 + static bool should_output_join_stats(const struct verif_stats_join *stats) 1224 + { 1225 + struct filter *f; 1226 + int i, allow_cnt = 0; 1227 + 1228 + for (i = 0; i < env.deny_filter_cnt; i++) { 1229 + f = &env.deny_filters[i]; 1230 + if (f->kind != FILTER_STAT) 1231 + continue; 1232 + 1233 + if (is_join_stat_filter_matched(f, stats)) 1234 + return false; 1235 + } 1236 + 1237 + for (i = 0; i < env.allow_filter_cnt; i++) { 1238 + f = &env.allow_filters[i]; 1239 + if (f->kind != 
FILTER_STAT) 1240 + continue; 1241 + allow_cnt++; 1242 + 1243 + if (is_join_stat_filter_matched(f, stats)) 1244 + return true; 1245 + } 1246 + 1247 + /* if there are no stat allowed filters, pass everything through */ 1248 + return allow_cnt == 0; 1249 + } 1250 + 1547 1251 static int handle_comparison_mode(void) 1548 1252 { 1549 1253 struct stat_specs base_specs = {}, comp_specs = {}; 1254 + struct stat_specs tmp_sort_spec; 1550 1255 enum resfmt cur_fmt; 1551 - int err, i, j; 1256 + int err, i, j, last_idx; 1552 1257 1553 1258 if (env.filename_cnt != 2) { 1554 - fprintf(stderr, "Comparison mode expects exactly two input CSV files!\n"); 1259 + fprintf(stderr, "Comparison mode expects exactly two input CSV files!\n\n"); 1555 1260 argp_help(&argp, stderr, ARGP_HELP_USAGE, "veristat"); 1556 1261 return -EINVAL; 1557 1262 } ··· 1639 1244 } 1640 1245 } 1641 1246 1247 + /* Replace user-specified sorting spec with file+prog sorting rule to 1248 + * be able to join two datasets correctly. Once we are done, we will 1249 + * restore the original sort spec. 1250 + */ 1251 + tmp_sort_spec = env.sort_spec; 1252 + env.sort_spec = join_sort_spec; 1642 1253 qsort(env.prog_stats, env.prog_stat_cnt, sizeof(*env.prog_stats), cmp_prog_stats); 1643 1254 qsort(env.baseline_stats, env.baseline_stat_cnt, sizeof(*env.baseline_stats), cmp_prog_stats); 1255 + env.sort_spec = tmp_sort_spec; 1644 1256 1645 - /* for human-readable table output we need to do extra pass to 1646 - * calculate column widths, so we substitute current output format 1647 - * with RESFMT_TABLE_CALCLEN and later revert it back to RESFMT_TABLE 1648 - * and do everything again. 
1649 - */ 1650 - if (env.out_fmt == RESFMT_TABLE) 1651 - cur_fmt = RESFMT_TABLE_CALCLEN; 1652 - else 1653 - cur_fmt = env.out_fmt; 1654 - 1655 - one_more_time: 1656 - output_comp_headers(cur_fmt); 1657 - 1658 - /* If baseline and comparison datasets have different subset of rows 1659 - * (we match by 'object + prog' as a unique key) then assume 1660 - * empty/missing/zero value for rows that are missing in the opposite 1661 - * data set 1257 + /* Join two datasets together. If baseline and comparison datasets 1258 + * have different subset of rows (we match by 'object + prog' as 1259 + * a unique key) then assume empty/missing/zero value for rows that 1260 + * are missing in the opposite data set. 1662 1261 */ 1663 1262 i = j = 0; 1664 1263 while (i < env.baseline_stat_cnt || j < env.prog_stat_cnt) { 1665 - bool last = (i == env.baseline_stat_cnt - 1) || (j == env.prog_stat_cnt - 1); 1666 1264 const struct verif_stats *base, *comp; 1265 + struct verif_stats_join *join; 1266 + void *tmp; 1667 1267 int r; 1668 1268 1669 1269 base = i < env.baseline_stat_cnt ? 
&env.baseline_stats[i] : &fallback_stats;
···
 			return -EINVAL;
 		}
 
+		tmp = realloc(env.join_stats, (env.join_stat_cnt + 1) * sizeof(*env.join_stats));
+		if (!tmp)
+			return -ENOMEM;
+		env.join_stats = tmp;
+
+		join = &env.join_stats[env.join_stat_cnt];
+		memset(join, 0, sizeof(*join));
+
 		r = cmp_stats_key(base, comp);
 		if (r == 0) {
-			output_comp_stats(base, comp, cur_fmt, last);
+			join->file_name = base->file_name;
+			join->prog_name = base->prog_name;
+			join->stats_a = base;
+			join->stats_b = comp;
 			i++;
 			j++;
 		} else if (comp == &fallback_stats || r < 0) {
-			output_comp_stats(base, &fallback_stats, cur_fmt, last);
+			join->file_name = base->file_name;
+			join->prog_name = base->prog_name;
+			join->stats_a = base;
+			join->stats_b = NULL;
 			i++;
 		} else {
-			output_comp_stats(&fallback_stats, comp, cur_fmt, last);
+			join->file_name = comp->file_name;
+			join->prog_name = comp->prog_name;
+			join->stats_a = NULL;
+			join->stats_b = comp;
 			j++;
 		}
+		env.join_stat_cnt += 1;
+	}
+
+	/* now sort joined results according to sort spec */
+	qsort(env.join_stats, env.join_stat_cnt, sizeof(*env.join_stats), cmp_join_stats);
+
+	/* for human-readable table output we need to do an extra pass to
+	 * calculate column widths, so we substitute current output format
+	 * with RESFMT_TABLE_CALCLEN and later revert it back to RESFMT_TABLE
+	 * and do everything again.
+ */
+	if (env.out_fmt == RESFMT_TABLE)
+		cur_fmt = RESFMT_TABLE_CALCLEN;
+	else
+		cur_fmt = env.out_fmt;
+
+one_more_time:
+	output_comp_headers(cur_fmt);
+
+	for (i = 0; i < env.join_stat_cnt; i++) {
+		const struct verif_stats_join *join = &env.join_stats[i];
+
+		if (!should_output_join_stats(join))
+			continue;
+
+		if (cur_fmt == RESFMT_TABLE_CALCLEN)
+			last_idx = i;
+
+		output_comp_stats(join, cur_fmt, i == last_idx);
 	}
 
 	if (cur_fmt == RESFMT_TABLE_CALCLEN) {
 		cur_fmt = RESFMT_TABLE;
 		goto one_more_time; /* ... this time with feeling */
 	}
+
+	return 0;
+}
+
+static bool is_stat_filter_matched(struct filter *f, const struct verif_stats *stats)
+{
+	long value = stats->stats[f->stat_id];
+
+	switch (f->op) {
+	case OP_EQ: return value == f->value;
+	case OP_NEQ: return value != f->value;
+	case OP_LT: return value < f->value;
+	case OP_LE: return value <= f->value;
+	case OP_GT: return value > f->value;
+	case OP_GE: return value >= f->value;
+	}
+
+	fprintf(stderr, "BUG: unknown filter op %d!\n", f->op);
+	return false;
+}
+
+static bool should_output_stats(const struct verif_stats *stats)
+{
+	struct filter *f;
+	int i, allow_cnt = 0;
+
+	for (i = 0; i < env.deny_filter_cnt; i++) {
+		f = &env.deny_filters[i];
+		if (f->kind != FILTER_STAT)
+			continue;
+
+		if (is_stat_filter_matched(f, stats))
+			return false;
+	}
+
+	for (i = 0; i < env.allow_filter_cnt; i++) {
+		f = &env.allow_filters[i];
+		if (f->kind != FILTER_STAT)
+			continue;
+		allow_cnt++;
+
+		if (is_stat_filter_matched(f, stats))
+			return true;
+	}
+
+	/* if there are no stat allowed filters, pass everything through */
+	return allow_cnt == 0;
+}
+
+static void output_prog_stats(void)
+{
+	const struct verif_stats *stats;
+	int i, last_stat_idx = 0;
+
+	if (env.out_fmt == RESFMT_TABLE) {
+		/* calculate column widths */
+		output_headers(RESFMT_TABLE_CALCLEN);
+		for (i = 0; i < env.prog_stat_cnt; i++) {
+			stats = &env.prog_stats[i];
+			if (!should_output_stats(stats))
+				continue;
+			output_stats(stats, RESFMT_TABLE_CALCLEN, false);
+			last_stat_idx = i;
+		}
+	}
+
+	/* actually output the table */
+	output_headers(env.out_fmt);
+	for (i = 0; i < env.prog_stat_cnt; i++) {
+		stats = &env.prog_stats[i];
+		if (!should_output_stats(stats))
+			continue;
+		output_stats(stats, env.out_fmt, i == last_stat_idx);
+	}
+}
+
+static int handle_verif_mode(void)
+{
+	int i, err;
+
+	if (env.filename_cnt == 0) {
+		fprintf(stderr, "Please provide path to BPF object file!\n\n");
+		argp_help(&argp, stderr, ARGP_HELP_USAGE, "veristat");
+		return -EINVAL;
+	}
+
+	for (i = 0; i < env.filename_cnt; i++) {
+		err = process_obj(env.filenames[i]);
+		if (err) {
+			fprintf(stderr, "Failed to process '%s': %d\n", env.filenames[i], err);
+			return err;
+		}
+	}
+
+	qsort(env.prog_stats, env.prog_stat_cnt, sizeof(*env.prog_stats), cmp_prog_stats);
+
+	output_prog_stats();
+
+	return 0;
+}
+
+static int handle_replay_mode(void)
+{
+	struct stat_specs specs = {};
+	int err;
+
+	if (env.filename_cnt != 1) {
+		fprintf(stderr, "Replay mode expects exactly one input CSV file!\n\n");
+		argp_help(&argp, stderr, ARGP_HELP_USAGE, "veristat");
+		return -EINVAL;
+	}
+
+	err = parse_stats_csv(env.filenames[0], &specs,
+			      &env.prog_stats, &env.prog_stat_cnt);
+	if (err) {
+		fprintf(stderr, "Failed to parse stats from '%s': %d\n", env.filenames[0], err);
+		return err;
+	}
+
+	qsort(env.prog_stats, env.prog_stat_cnt, sizeof(*env.prog_stats), cmp_prog_stats);
+
+	output_prog_stats();
 
 	return 0;
 }
···
 		return 1;
 
 	if (env.verbose && env.quiet) {
-		fprintf(stderr, "Verbose and quiet modes are incompatible, please specify just one or neither!\n");
+		fprintf(stderr, "Verbose and quiet modes are incompatible, please specify just one or neither!\n\n");
 		argp_help(&argp, stderr, ARGP_HELP_USAGE, "veristat");
 		return 1;
 	}
 	if (env.verbose && env.log_level == 0)
 		env.log_level = 1;
 
-	if (env.output_spec.spec_cnt == 0)
-		env.output_spec = default_output_spec;
+	if (env.output_spec.spec_cnt == 0) {
+		if (env.out_fmt == RESFMT_CSV)
+			env.output_spec = default_csv_output_spec;
+		else
+			env.output_spec = default_output_spec;
+	}
 	if (env.sort_spec.spec_cnt == 0)
 		env.sort_spec = default_sort_spec;
 
+	if (env.comparison_mode && env.replay_mode) {
+		fprintf(stderr, "Can't specify replay and comparison mode at the same time!\n\n");
+		argp_help(&argp, stderr, ARGP_HELP_USAGE, "veristat");
+		return 1;
+	}
+
 	if (env.comparison_mode)
 		err = handle_comparison_mode();
+	else if (env.replay_mode)
+		err = handle_replay_mode();
 	else
 		err = handle_verif_mode();
 
 	free_verif_stats(env.prog_stats, env.prog_stat_cnt);
 	free_verif_stats(env.baseline_stats, env.baseline_stat_cnt);
+	free(env.join_stats);
 	for (i = 0; i < env.filename_cnt; i++)
 		free(env.filenames[i]);
 	free(env.filenames);
 	for (i = 0; i < env.allow_filter_cnt; i++) {
+		free(env.allow_filters[i].any_glob);
 		free(env.allow_filters[i].file_glob);
 		free(env.allow_filters[i].prog_glob);
 	}
 	free(env.allow_filters);
 	for (i = 0; i < env.deny_filter_cnt; i++) {
+		free(env.deny_filters[i].any_glob);
 		free(env.deny_filters[i].file_glob);
 		free(env.deny_filters[i].prog_glob);
 	}
+3 -2
tools/testing/selftests/bpf/xdp_synproxy.c
···
 		{ "tc", no_argument, NULL, 'c' },
 		{ NULL, 0, NULL, 0 },
 	};
-	unsigned long mss4, mss6, wscale, ttl;
+	unsigned long mss4, wscale, ttl;
+	unsigned long long mss6;
 	unsigned int tcpipopts_mask = 0;
 
 	if (argc < 2)
···
 
 	prog_info = (struct bpf_prog_info) {
 		.nr_map_ids = 8,
-		.map_ids = (__u64)map_ids,
+		.map_ids = (__u64)(unsigned long)map_ids,
 	};
 	info_len = sizeof(prog_info);
 
+4 -22
tools/testing/selftests/bpf/xsk.c
···
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
 #include "xsk.h"
+#include "bpf_util.h"
 
 #ifndef SOL_XDP
 #define SOL_XDP 283
···
 	return 0;
 }
 
-/* Copy up to sz - 1 bytes from zero-terminated src string and ensure that dst
- * is zero-terminated string no matter what (unless sz == 0, in which case
- * it's a no-op). It's conceptually close to FreeBSD's strlcpy(), but differs
- * in what is returned. Given this is internal helper, it's trivial to extend
- * this, when necessary. Use this instead of strncpy inside libbpf source code.
- */
-static inline void libbpf_strlcpy(char *dst, const char *src, size_t sz)
-{
-	size_t i;
-
-	if (sz == 0)
-		return;
-
-	sz--;
-	for (i = 0; i < sz && src[i]; i++)
-		dst[i] = src[i];
-	dst[i] = '\0';
-}
-
 static int xsk_get_max_queues(struct xsk_socket *xsk)
 {
 	struct ethtool_channels channels = { .cmd = ETHTOOL_GCHANNELS };
···
 		return -errno;
 
 	ifr.ifr_data = (void *)&channels;
-	libbpf_strlcpy(ifr.ifr_name, ctx->ifname, IFNAMSIZ);
+	bpf_strlcpy(ifr.ifr_name, ctx->ifname, IFNAMSIZ);
 	err = ioctl(fd, SIOCETHTOOL, &ifr);
 	if (err && errno != EOPNOTSUPP) {
 		ret = -errno;
···
 	}
 
 	ctx->ifindex = ifindex;
-	libbpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ);
+	bpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ);
 
 	xsk->ctx = ctx;
 	xsk->ctx->has_bpf_link = xsk_probe_bpf_link();
···
 	ctx->refcount = 1;
 	ctx->umem = umem;
 	ctx->queue_id = queue_id;
-	libbpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ);
+	bpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ);
 
 	ctx->fill = fill;
 	ctx->comp = comp;
+2 -1
tools/testing/selftests/bpf/xskxceiver.c
··· 1006 1006 { 1007 1007 struct xsk_socket_info *xsk = ifobject->xsk; 1008 1008 bool use_poll = ifobject->use_poll; 1009 - u32 i, idx = 0, ret, valid_pkts = 0; 1009 + u32 i, idx = 0, valid_pkts = 0; 1010 + int ret; 1010 1011 1011 1012 while (xsk_ring_prod__reserve(&xsk->tx, BATCH_SIZE, &idx) < BATCH_SIZE) { 1012 1013 if (use_poll) {