Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Daniel Borkmann says:

====================
bpf-next 2022-11-25

We've added 101 non-merge commits during the last 11 day(s) which contain
a total of 109 files changed, 8827 insertions(+), 1129 deletions(-).

The main changes are:

1) Support for user defined BPF objects: the use case is to let programs
   allocate their own objects, build their own object hierarchies, and use
   these building blocks to flexibly construct their own data structures,
   for example, linked lists in BPF, from Kumar Kartikeya Dwivedi.

2) Add bpf_rcu_read_{,un}lock() support for sleepable programs,
from Yonghong Song.

3) Add support for storing struct task_struct objects as kptrs in maps,
from David Vernet.

4) Batch of BPF map documentation improvements, from Maryam Tahhan
and Donald Hunter.

5) Improve BPF verifier to propagate nullness information for branches
of register to register comparisons, from Eduard Zingerman.

6) Fix cgroup BPF iter infra to hold reference on the start cgroup,
from Hou Tao.

7) Fix BPF verifier to not mark fentry/fexit program arguments as trusted,
   since that is not actually the case for them, from Alexei Starovoitov.

8) Improve BPF verifier's realloc handling to better play along with dynamic
runtime analysis tools like KASAN and friends, from Kees Cook.

9) Remove legacy libbpf mode support from bpftool,
from Sahid Orentino Ferdjaoui.

10) Rework zero-len skb redirection checks to avoid potentially breaking
existing BPF test infra users, from Stanislav Fomichev.

11) Two small refactorings which are independent and have been split out
of the XDP queueing RFC series, from Toke Høiland-Jørgensen.

12) Fix a memory leak in LSM cgroup BPF selftest, from Wang Yufen.

13) Documentation on how to run BPF CI without patch submission,
from Daniel Müller.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================

Link: https://lore.kernel.org/r/20221125012450.441-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+8695 -997
+6 -5
Documentation/bpf/bpf_design_QA.rst
···
332 332 In other words, no backwards compatibility is guaranteed if one using a type
333 333 in BTF with 'bpf\_' prefix.
334 334
335 - Q: What is the compatibility story for special BPF types in local kptrs?
336 - ------------------------------------------------------------------------
337 - Q: Same as above, but for local kptrs (i.e. pointers to objects allocated using
338 - bpf_obj_new for user defined structures). Will the kernel preserve backwards
335 + Q: What is the compatibility story for special BPF types in allocated objects?
336 + ------------------------------------------------------------------------------
337 + Q: Same as above, but for allocated objects (i.e. objects allocated using
338 + bpf_obj_new for user defined types). Will the kernel preserve backwards
339 339 compatibility for these features?
340 340
341 341 A: NO.
342 342
343 343 Unlike map value types, there are no stability guarantees for this case. The
344 - whole local kptr API itself is unstable (since it is exposed through kfuncs).
344 + whole API to work with allocated objects and any support for special fields
345 + inside them is unstable (since it is exposed through kfuncs).
+27
Documentation/bpf/bpf_devel_QA.rst
···
44 44 Submitting patches
45 45 ==================
46 46
47 + Q: How do I run BPF CI on my changes before sending them out for review?
48 + ------------------------------------------------------------------------
49 + A: BPF CI is GitHub based and hosted at https://github.com/kernel-patches/bpf.
50 + While GitHub also provides a CLI that can be used to accomplish the same
51 + results, here we focus on the UI based workflow.
52 +
53 + The following steps lay out how to start a CI run for your patches:
54 +
55 + - Create a fork of the aforementioned repository in your own account (one time
56 + action)
57 +
58 + - Clone the fork locally, check out a new branch tracking either the bpf-next
59 + or bpf branch, and apply your to-be-tested patches on top of it
60 +
61 + - Push the local branch to your fork and create a pull request against
62 + kernel-patches/bpf's bpf-next_base or bpf_base branch, respectively
63 +
64 + Shortly after the pull request has been created, the CI workflow will run. Note
65 + that capacity is shared with patches submitted upstream being checked and so
66 + depending on utilization the run can take a while to finish.
67 +
68 + Note furthermore that both base branches (bpf-next_base and bpf_base) will be
69 + updated as patches are pushed to the respective upstream branches they track. As
70 + such, your patch set will automatically (be attempted to) be rebased as well.
71 + This behavior can result in a CI run being aborted and restarted with the new
72 + base line.
73 +
47 74 Q: To which mailing list do I need to submit my BPF patches?
48 75 ------------------------------------------------------------
49 76 A: Please submit your BPF patches to the bpf kernel mailing list:
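The command-line half of the CI steps above might look roughly as follows. This is a workflow sketch, not part of the documentation being added: the account name, branch name, and patch path are placeholders, and the final pull request itself is opened in the GitHub UI as the text describes.

```shell
# One-time: fork https://github.com/kernel-patches/bpf into your own
# account via the GitHub UI, then clone your fork (placeholder account).
git clone https://github.com/<your-account>/bpf.git
cd bpf

# New branch tracking bpf-next (use the bpf branch for fixes instead).
git checkout -b my-ci-run origin/bpf-next

# Apply the to-be-tested patches on top of it.
git am /path/to/patches/*.patch

# Push the branch to the fork; then open a pull request against
# kernel-patches/bpf's bpf-next_base (or bpf_base) branch in the UI.
git push -u origin my-ci-run
```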
+6 -1
Documentation/bpf/btf.rst
···
1062 1062 7. Testing
1063 1063 ==========
1064 1064
1065 - Kernel bpf selftest `test_btf.c` provides extensive set of BTF-related tests.
1065 + The kernel BPF selftest `tools/testing/selftests/bpf/prog_tests/btf.c`_
1066 + provides an extensive set of BTF-related tests.
1067 +
1068 + .. Links
1069 + .. _tools/testing/selftests/bpf/prog_tests/btf.c:
1070 + https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/tools/testing/selftests/bpf/prog_tests/btf.c
+1
Documentation/bpf/index.rst
···
29 29 clang-notes
30 30 linux-notes
31 31 other
32 + redirect
32 33
33 34 .. only:: subproject and html
34 35
+35 -13
Documentation/bpf/kfuncs.rst
···
72 72 of the pointer is used. Without __sz annotation, a kfunc cannot accept a void
73 73 pointer.
74 74
75 + 2.2.2 __k Annotation
76 + --------------------
77 +
78 + This annotation is only understood for scalar arguments, where it indicates that
79 + the verifier must check the scalar argument to be a known constant, which does
80 + not indicate a size parameter, and the value of the constant is relevant to the
81 + safety of the program.
82 +
83 + An example is given below::
84 +
85 + void *bpf_obj_new(u32 local_type_id__k, ...)
86 + {
87 + ...
88 + }
89 +
90 + Here, bpf_obj_new uses local_type_id argument to find out the size of that type
91 + ID in program's BTF and return a sized pointer to it. Each type ID will have a
92 + distinct size, hence it is crucial to treat each such call as distinct when
93 + values don't match during verifier state pruning checks.
94 +
95 + Hence, whenever a constant scalar argument is accepted by a kfunc which is not a
96 + size parameter, and the value of the constant matters for program safety, __k
97 + suffix should be used.
98 +
75 99 .. _BPF_kfunc_nodef:
76 100
77 101 2.3 Using an existing kernel function
···
161 137 --------------------------
162 138
163 139 The KF_TRUSTED_ARGS flag is used for kfuncs taking pointer arguments. It
164 - indicates that the all pointer arguments will always have a guaranteed lifetime,
165 - and pointers to kernel objects are always passed to helpers in their unmodified
166 - form (as obtained from acquire kfuncs).
140 + indicates that all pointer arguments are valid, and that all pointers to
141 + BTF objects have been passed in their unmodified form (that is, at a zero
142 + offset, and without having been obtained from walking another pointer).
167 143
168 - It can be used to enforce that a pointer to a refcounted object acquired from a
169 - kfunc or BPF helper is passed as an argument to this kfunc without any
170 - modifications (e.g. pointer arithmetic) such that it is trusted and points to
171 - the original object.
144 + There are two types of pointers to kernel objects which are considered "valid":
172 145
173 - Meanwhile, it is also allowed pass pointers to normal memory to such kfuncs,
174 - but those can have a non-zero offset.
146 + 1. Pointers which are passed as tracepoint or struct_ops callback arguments.
147 + 2. Pointers which were returned from a KF_ACQUIRE or KF_KPTR_GET kfunc.
175 148
176 - This flag is often used for kfuncs that operate (change some property, perform
177 - some operation) on an object that was obtained using an acquire kfunc. Such
178 - kfuncs need an unchanged pointer to ensure the integrity of the operation being
179 - performed on the expected object.
149 + Pointers to non-BTF objects (e.g. scalar pointers) may also be passed to
150 + KF_TRUSTED_ARGS kfuncs, and may have a non-zero offset.
151 +
152 + The definition of "valid" pointers is subject to change at any time, and has
153 + absolutely no ABI stability guarantees.
180 154
181 155 2.4.6 KF_SLEEPABLE flag
182 156 -----------------------
+3
Documentation/bpf/libbpf/index.rst
···
1 1 .. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
2 2
3 + .. _libbpf:
4 +
3 5 libbpf
4 6 ======
5 7
···
9 7 :maxdepth: 1
10 8
11 9 API Documentation <https://libbpf.readthedocs.io/en/latest/api.html>
10 + program_types
12 11 libbpf_naming_convention
13 12 libbpf_build
14 13
+203
Documentation/bpf/libbpf/program_types.rst
···
1 + .. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
2 +
3 + .. _program_types_and_elf:
4 +
5 + Program Types and ELF Sections
6 + ==============================
7 +
8 + The table below lists the program types, their attach types where relevant and the ELF section
9 + names supported by libbpf for them. The ELF section names follow these rules:
10 +
11 + - ``type`` is an exact match, e.g. ``SEC("socket")``
12 + - ``type+`` means it can be either exact ``SEC("type")`` or well-formed ``SEC("type/extras")``
13 + with a '``/``' separator between ``type`` and ``extras``.
14 +
15 + When ``extras`` are specified, they provide details of how to auto-attach the BPF program. The
16 + format of ``extras`` depends on the program type, e.g. ``SEC("tracepoint/<category>/<name>")``
17 + for tracepoints or ``SEC("usdt/<path>:<provider>:<name>")`` for USDT probes. The extras are
18 + described in more detail in the footnotes.
19 +
20 +
21 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
22 + | Program Type | Attach Type | ELF Section Name | Sleepable |
23 + +===========================================+========================================+==================================+===========+
24 + | ``BPF_PROG_TYPE_CGROUP_DEVICE`` | ``BPF_CGROUP_DEVICE`` | ``cgroup/dev`` | |
25 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
26 + | ``BPF_PROG_TYPE_CGROUP_SKB`` | | ``cgroup/skb`` | |
27 + + +----------------------------------------+----------------------------------+-----------+
28 + | | ``BPF_CGROUP_INET_EGRESS`` | ``cgroup_skb/egress`` | |
29 + + +----------------------------------------+----------------------------------+-----------+
30 + | | ``BPF_CGROUP_INET_INGRESS`` | ``cgroup_skb/ingress`` | |
31 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
32 + | ``BPF_PROG_TYPE_CGROUP_SOCKOPT`` | ``BPF_CGROUP_GETSOCKOPT`` | ``cgroup/getsockopt`` | |
33 + + +----------------------------------------+----------------------------------+-----------+
34 + | | ``BPF_CGROUP_SETSOCKOPT`` | ``cgroup/setsockopt`` | |
35 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
36 + | ``BPF_PROG_TYPE_CGROUP_SOCK_ADDR`` | ``BPF_CGROUP_INET4_BIND`` | ``cgroup/bind4`` | |
37 + + +----------------------------------------+----------------------------------+-----------+
38 + | | ``BPF_CGROUP_INET4_CONNECT`` | ``cgroup/connect4`` | |
39 + + +----------------------------------------+----------------------------------+-----------+
40 + | | ``BPF_CGROUP_INET4_GETPEERNAME`` | ``cgroup/getpeername4`` | |
41 + + +----------------------------------------+----------------------------------+-----------+
42 + | | ``BPF_CGROUP_INET4_GETSOCKNAME`` | ``cgroup/getsockname4`` | |
43 + + +----------------------------------------+----------------------------------+-----------+
44 + | | ``BPF_CGROUP_INET6_BIND`` | ``cgroup/bind6`` | |
45 + + +----------------------------------------+----------------------------------+-----------+
46 + | | ``BPF_CGROUP_INET6_CONNECT`` | ``cgroup/connect6`` | |
47 + + +----------------------------------------+----------------------------------+-----------+
48 + | | ``BPF_CGROUP_INET6_GETPEERNAME`` | ``cgroup/getpeername6`` | |
49 + + +----------------------------------------+----------------------------------+-----------+
50 + | | ``BPF_CGROUP_INET6_GETSOCKNAME`` | ``cgroup/getsockname6`` | |
51 + + +----------------------------------------+----------------------------------+-----------+
52 + | | ``BPF_CGROUP_UDP4_RECVMSG`` | ``cgroup/recvmsg4`` | |
53 + + +----------------------------------------+----------------------------------+-----------+
54 + | | ``BPF_CGROUP_UDP4_SENDMSG`` | ``cgroup/sendmsg4`` | |
55 + + +----------------------------------------+----------------------------------+-----------+
56 + | | ``BPF_CGROUP_UDP6_RECVMSG`` | ``cgroup/recvmsg6`` | |
57 + + +----------------------------------------+----------------------------------+-----------+
58 + | | ``BPF_CGROUP_UDP6_SENDMSG`` | ``cgroup/sendmsg6`` | |
59 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
60 + | ``BPF_PROG_TYPE_CGROUP_SOCK`` | ``BPF_CGROUP_INET4_POST_BIND`` | ``cgroup/post_bind4`` | |
61 + + +----------------------------------------+----------------------------------+-----------+
62 + | | ``BPF_CGROUP_INET6_POST_BIND`` | ``cgroup/post_bind6`` | |
63 + + +----------------------------------------+----------------------------------+-----------+
64 + | | ``BPF_CGROUP_INET_SOCK_CREATE`` | ``cgroup/sock_create`` | |
65 + + + +----------------------------------+-----------+
66 + | | | ``cgroup/sock`` | |
67 + + +----------------------------------------+----------------------------------+-----------+
68 + | | ``BPF_CGROUP_INET_SOCK_RELEASE`` | ``cgroup/sock_release`` | |
69 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
70 + | ``BPF_PROG_TYPE_CGROUP_SYSCTL`` | ``BPF_CGROUP_SYSCTL`` | ``cgroup/sysctl`` | |
71 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
72 + | ``BPF_PROG_TYPE_EXT`` | | ``freplace+`` [#fentry]_ | |
73 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
74 + | ``BPF_PROG_TYPE_FLOW_DISSECTOR`` | ``BPF_FLOW_DISSECTOR`` | ``flow_dissector`` | |
75 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
76 + | ``BPF_PROG_TYPE_KPROBE`` | | ``kprobe+`` [#kprobe]_ | |
77 + + + +----------------------------------+-----------+
78 + | | | ``kretprobe+`` [#kprobe]_ | |
79 + + + +----------------------------------+-----------+
80 + | | | ``ksyscall+`` [#ksyscall]_ | |
81 + + + +----------------------------------+-----------+
82 + | | | ``kretsyscall+`` [#ksyscall]_ | |
83 + + + +----------------------------------+-----------+
84 + | | | ``uprobe+`` [#uprobe]_ | |
85 + + + +----------------------------------+-----------+
86 + | | | ``uprobe.s+`` [#uprobe]_ | Yes |
87 + + + +----------------------------------+-----------+
88 + | | | ``uretprobe+`` [#uprobe]_ | |
89 + + + +----------------------------------+-----------+
90 + | | | ``uretprobe.s+`` [#uprobe]_ | Yes |
91 + + + +----------------------------------+-----------+
92 + | | | ``usdt+`` [#usdt]_ | |
93 + + +----------------------------------------+----------------------------------+-----------+
94 + | | ``BPF_TRACE_KPROBE_MULTI`` | ``kprobe.multi+`` [#kpmulti]_ | |
95 + + + +----------------------------------+-----------+
96 + | | | ``kretprobe.multi+`` [#kpmulti]_ | |
97 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
98 + | ``BPF_PROG_TYPE_LIRC_MODE2`` | ``BPF_LIRC_MODE2`` | ``lirc_mode2`` | |
99 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
100 + | ``BPF_PROG_TYPE_LSM`` | ``BPF_LSM_CGROUP`` | ``lsm_cgroup+`` | |
101 + + +----------------------------------------+----------------------------------+-----------+
102 + | | ``BPF_LSM_MAC`` | ``lsm+`` [#lsm]_ | |
103 + + + +----------------------------------+-----------+
104 + | | | ``lsm.s+`` [#lsm]_ | Yes |
105 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
106 + | ``BPF_PROG_TYPE_LWT_IN`` | | ``lwt_in`` | |
107 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
108 + | ``BPF_PROG_TYPE_LWT_OUT`` | | ``lwt_out`` | |
109 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
110 + | ``BPF_PROG_TYPE_LWT_SEG6LOCAL`` | | ``lwt_seg6local`` | |
111 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
112 + | ``BPF_PROG_TYPE_LWT_XMIT`` | | ``lwt_xmit`` | |
113 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
114 + | ``BPF_PROG_TYPE_PERF_EVENT`` | | ``perf_event`` | |
115 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
116 + | ``BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE`` | | ``raw_tp.w+`` [#rawtp]_ | |
117 + + + +----------------------------------+-----------+
118 + | | | ``raw_tracepoint.w+`` | |
119 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
120 + | ``BPF_PROG_TYPE_RAW_TRACEPOINT`` | | ``raw_tp+`` [#rawtp]_ | |
121 + + + +----------------------------------+-----------+
122 + | | | ``raw_tracepoint+`` | |
123 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
124 + | ``BPF_PROG_TYPE_SCHED_ACT`` | | ``action`` | |
125 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
126 + | ``BPF_PROG_TYPE_SCHED_CLS`` | | ``classifier`` | |
127 + + + +----------------------------------+-----------+
128 + | | | ``tc`` | |
129 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
130 + | ``BPF_PROG_TYPE_SK_LOOKUP`` | ``BPF_SK_LOOKUP`` | ``sk_lookup`` | |
131 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
132 + | ``BPF_PROG_TYPE_SK_MSG`` | ``BPF_SK_MSG_VERDICT`` | ``sk_msg`` | |
133 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
134 + | ``BPF_PROG_TYPE_SK_REUSEPORT`` | ``BPF_SK_REUSEPORT_SELECT_OR_MIGRATE`` | ``sk_reuseport/migrate`` | |
135 + + +----------------------------------------+----------------------------------+-----------+
136 + | | ``BPF_SK_REUSEPORT_SELECT`` | ``sk_reuseport`` | |
137 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
138 + | ``BPF_PROG_TYPE_SK_SKB`` | | ``sk_skb`` | |
139 + + +----------------------------------------+----------------------------------+-----------+
140 + | | ``BPF_SK_SKB_STREAM_PARSER`` | ``sk_skb/stream_parser`` | |
141 + + +----------------------------------------+----------------------------------+-----------+
142 + | | ``BPF_SK_SKB_STREAM_VERDICT`` | ``sk_skb/stream_verdict`` | |
143 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
144 + | ``BPF_PROG_TYPE_SOCKET_FILTER`` | | ``socket`` | |
145 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
146 + | ``BPF_PROG_TYPE_SOCK_OPS`` | ``BPF_CGROUP_SOCK_OPS`` | ``sockops`` | |
147 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
148 + | ``BPF_PROG_TYPE_STRUCT_OPS`` | | ``struct_ops+`` | |
149 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
150 + | ``BPF_PROG_TYPE_SYSCALL`` | | ``syscall`` | Yes |
151 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
152 + | ``BPF_PROG_TYPE_TRACEPOINT`` | | ``tp+`` [#tp]_ | |
153 + + + +----------------------------------+-----------+
154 + | | | ``tracepoint+`` [#tp]_ | |
155 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
156 + | ``BPF_PROG_TYPE_TRACING`` | ``BPF_MODIFY_RETURN`` | ``fmod_ret+`` [#fentry]_ | |
157 + + + +----------------------------------+-----------+
158 + | | | ``fmod_ret.s+`` [#fentry]_ | Yes |
159 + + +----------------------------------------+----------------------------------+-----------+
160 + | | ``BPF_TRACE_FENTRY`` | ``fentry+`` [#fentry]_ | |
161 + + + +----------------------------------+-----------+
162 + | | | ``fentry.s+`` [#fentry]_ | Yes |
163 + + +----------------------------------------+----------------------------------+-----------+
164 + | | ``BPF_TRACE_FEXIT`` | ``fexit+`` [#fentry]_ | |
165 + + + +----------------------------------+-----------+
166 + | | | ``fexit.s+`` [#fentry]_ | Yes |
167 + + +----------------------------------------+----------------------------------+-----------+
168 + | | ``BPF_TRACE_ITER`` | ``iter+`` [#iter]_ | |
169 + + + +----------------------------------+-----------+
170 + | | | ``iter.s+`` [#iter]_ | Yes |
171 + + +----------------------------------------+----------------------------------+-----------+
172 + | | ``BPF_TRACE_RAW_TP`` | ``tp_btf+`` [#fentry]_ | |
173 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
174 + | ``BPF_PROG_TYPE_XDP`` | ``BPF_XDP_CPUMAP`` | ``xdp.frags/cpumap`` | |
175 + + + +----------------------------------+-----------+
176 + | | | ``xdp/cpumap`` | |
177 + + +----------------------------------------+----------------------------------+-----------+
178 + | | ``BPF_XDP_DEVMAP`` | ``xdp.frags/devmap`` | |
179 + + + +----------------------------------+-----------+
180 + | | | ``xdp/devmap`` | |
181 + + +----------------------------------------+----------------------------------+-----------+
182 + | | ``BPF_XDP`` | ``xdp.frags`` | |
183 + + + +----------------------------------+-----------+
184 + | | | ``xdp`` | |
185 + +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
186 +
187 +
188 + .. rubric:: Footnotes
189 +
190 + .. [#fentry] The ``fentry`` attach format is ``fentry[.s]/<function>``.
191 + .. [#kprobe] The ``kprobe`` attach format is ``kprobe/<function>[+<offset>]``. Valid
192 + characters for ``function`` are ``a-zA-Z0-9_.`` and ``offset`` must be a valid
193 + non-negative integer.
194 + .. [#ksyscall] The ``ksyscall`` attach format is ``ksyscall/<syscall>``.
195 + .. [#uprobe] The ``uprobe`` attach format is ``uprobe[.s]/<path>:<function>[+<offset>]``.
196 + .. [#usdt] The ``usdt`` attach format is ``usdt/<path>:<provider>:<name>``.
197 + .. [#kpmulti] The ``kprobe.multi`` attach format is ``kprobe.multi/<pattern>`` where ``pattern``
198 + supports ``*`` and ``?`` wildcards. Valid characters for pattern are
199 + ``a-zA-Z0-9_.*?``.
200 + .. [#lsm] The ``lsm`` attachment format is ``lsm[.s]/<hook>``.
201 + .. [#rawtp] The ``raw_tp`` attach format is ``raw_tracepoint[.w]/<tracepoint>``.
202 + .. [#tp] The ``tracepoint`` attach format is ``tracepoint/<category>/<name>``.
203 + .. [#iter] The ``iter`` attach format is ``iter[.s]/<struct-name>``.
+17 -5
Documentation/bpf/map_array.rst
···
32 32 Kernel BPF
33 33 ----------
34 34
35 - .. c:function::
35 + bpf_map_lookup_elem()
36 + ~~~~~~~~~~~~~~~~~~~~~
37 +
38 + .. code-block:: c
39 +
36 40 void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)
37 41
38 42 Array elements can be retrieved using the ``bpf_map_lookup_elem()`` helper.
···
44 40 with userspace reading the value, the user must use primitives like
45 41 ``__sync_fetch_and_add()`` when updating the value in-place.
46 42
47 - .. c:function::
43 + bpf_map_update_elem()
44 + ~~~~~~~~~~~~~~~~~~~~~
45 +
46 + .. code-block:: c
47 +
48 48 long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags)
49 49
50 50 Array elements can be updated using the ``bpf_map_update_elem()`` helper.
···
61 53 zero value to that index.
62 54
63 55 Per CPU Array
64 - ~~~~~~~~~~~~~
56 + -------------
65 57
66 58 Values stored in ``BPF_MAP_TYPE_ARRAY`` can be accessed by multiple programs
67 59 across different CPUs. To restrict storage to a single CPU, you may use a
···
71 63 ``bpf_map_lookup_elem()`` helpers automatically access the slot for the current
72 64 CPU.
73 65
74 - .. c:function::
66 + bpf_map_lookup_percpu_elem()
67 + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
68 +
69 + .. code-block:: c
70 +
75 71 void *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void *key, u32 cpu)
76 72
77 73 The ``bpf_map_lookup_percpu_elem()`` helper can be used to lookup the array
···
131 119 index = ip.protocol;
132 120 value = bpf_map_lookup_elem(&my_map, &index);
133 121 if (value)
134 - __sync_fetch_and_add(&value, skb->len);
122 + __sync_fetch_and_add(value, skb->len);
135 123
136 124 return 0;
137 125 }
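The map_array text above (and the corrected example in the hunk) relies on ``__sync_fetch_and_add()`` for in-place updates that race with concurrent readers and writers. A plain userspace pthreads sketch shows what the primitive buys you; the thread counts and function names here are invented for the illustration, and this is ordinary C, not BPF program code:

```c
#include <pthread.h>

#define NTHREADS 4
#define ITERS    100000

static long counter; /* stands in for an array map value */

/* Each thread bumps the shared counter with an atomic
 * read-modify-write, as the documentation advises. */
static void *bump(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERS; i++)
		__sync_fetch_and_add(&counter, 1);
	return NULL;
}

/* Run NTHREADS concurrent writers and return the final count. */
long run_writers(void)
{
	pthread_t tids[NTHREADS];

	counter = 0;
	for (int i = 0; i < NTHREADS; i++)
		pthread_create(&tids[i], NULL, bump, NULL);
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(tids[i], NULL);
	return counter;
}
```

With a plain ``counter++`` the final value can fall short of ``NTHREADS * ITERS`` because increments are lost between the read and the write; the atomic add guarantees the full total.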
+174
Documentation/bpf/map_bloom_filter.rst
···
1 + .. SPDX-License-Identifier: GPL-2.0-only
2 + .. Copyright (C) 2022 Red Hat, Inc.
3 +
4 + =========================
5 + BPF_MAP_TYPE_BLOOM_FILTER
6 + =========================
7 +
8 + .. note::
9 + - ``BPF_MAP_TYPE_BLOOM_FILTER`` was introduced in kernel version 5.16
10 +
11 + ``BPF_MAP_TYPE_BLOOM_FILTER`` provides a BPF bloom filter map. Bloom
12 + filters are a space-efficient probabilistic data structure used to
13 + quickly test whether an element exists in a set. In a bloom filter,
14 + false positives are possible whereas false negatives are not.
15 +
16 + The bloom filter map does not have keys, only values. When the bloom
17 + filter map is created, it must be created with a ``key_size`` of 0. The
18 + bloom filter map supports two operations:
19 +
20 + - push: adding an element to the map
21 + - peek: determining whether an element is present in the map
22 +
23 + BPF programs must use ``bpf_map_push_elem`` to add an element to the
24 + bloom filter map and ``bpf_map_peek_elem`` to query the map. These
25 + operations are exposed to userspace applications using the existing
26 + ``bpf`` syscall in the following way:
27 +
28 + - ``BPF_MAP_UPDATE_ELEM`` -> push
29 + - ``BPF_MAP_LOOKUP_ELEM`` -> peek
30 +
31 + The ``max_entries`` size that is specified at map creation time is used
32 + to approximate a reasonable bitmap size for the bloom filter, and is not
33 + otherwise strictly enforced. If the user wishes to insert more entries
34 + into the bloom filter than ``max_entries``, this may lead to a higher
35 + false positive rate.
36 +
37 + The number of hashes to use for the bloom filter is configurable using
38 + the lower 4 bits of ``map_extra`` in ``union bpf_attr`` at map creation
39 + time. If no number is specified, the default used will be 5 hash
40 + functions. In general, using more hashes decreases both the false
41 + positive rate and the speed of a lookup.
42 +
43 + It is not possible to delete elements from a bloom filter map. A bloom
44 + filter map may be used as an inner map. The user is responsible for
45 + synchronising concurrent updates and lookups to ensure no false negative
46 + lookups occur.
47 +
48 + Usage
49 + =====
50 +
51 + Kernel BPF
52 + ----------
53 +
54 + bpf_map_push_elem()
55 + ~~~~~~~~~~~~~~~~~~~
56 +
57 + .. code-block:: c
58 +
59 + long bpf_map_push_elem(struct bpf_map *map, const void *value, u64 flags)
60 +
61 + A ``value`` can be added to a bloom filter using the
62 + ``bpf_map_push_elem()`` helper. The ``flags`` parameter must be set to
63 + ``BPF_ANY`` when adding an entry to the bloom filter. This helper
64 + returns ``0`` on success, or negative error in case of failure.
65 +
66 + bpf_map_peek_elem()
67 + ~~~~~~~~~~~~~~~~~~~
68 +
69 + .. code-block:: c
70 +
71 + long bpf_map_peek_elem(struct bpf_map *map, void *value)
72 +
73 + The ``bpf_map_peek_elem()`` helper is used to determine whether
74 + ``value`` is present in the bloom filter map. This helper returns ``0``
75 + if ``value`` is probably present in the map, or ``-ENOENT`` if ``value``
76 + is definitely not present in the map.
77 +
78 + Userspace
79 + ---------
80 +
81 + bpf_map_update_elem()
82 + ~~~~~~~~~~~~~~~~~~~~~
83 +
84 + .. code-block:: c
85 +
86 + int bpf_map_update_elem (int fd, const void *key, const void *value, __u64 flags)
87 +
88 + A userspace program can add a ``value`` to a bloom filter using libbpf's
89 + ``bpf_map_update_elem`` function. The ``key`` parameter must be set to
90 + ``NULL`` and ``flags`` must be set to ``BPF_ANY``. Returns ``0`` on
91 + success, or negative error in case of failure.
92 +
93 + bpf_map_lookup_elem()
94 + ~~~~~~~~~~~~~~~~~~~~~
95 +
96 + .. code-block:: c
97 +
98 + int bpf_map_lookup_elem (int fd, const void *key, void *value)
99 +
100 + A userspace program can determine the presence of ``value`` in a bloom
101 + filter using libbpf's ``bpf_map_lookup_elem`` function. The ``key``
102 + parameter must be set to ``NULL``. Returns ``0`` if ``value`` is
103 + probably present in the map, or ``-ENOENT`` if ``value`` is definitely
104 + not present in the map.
105 +
106 + Examples
107 + ========
108 +
109 + Kernel BPF
110 + ----------
111 +
112 + This snippet shows how to declare a bloom filter in a BPF program:
113 +
114 + .. code-block:: c
115 +
116 + struct {
117 + __uint(type, BPF_MAP_TYPE_BLOOM_FILTER);
118 + __type(value, __u32);
119 + __uint(max_entries, 1000);
120 + __uint(map_extra, 3);
121 + } bloom_filter SEC(".maps");
122 +
123 + This snippet shows how to determine presence of a value in a bloom
124 + filter in a BPF program:
125 +
126 + .. code-block:: c
127 +
128 + void *lookup(__u32 key)
129 + {
130 + if (bpf_map_peek_elem(&bloom_filter, &key) == 0) {
131 + /* Verify not a false positive and fetch an associated
132 + * value using a secondary lookup, e.g. in a hash table
133 + */
134 + return bpf_map_lookup_elem(&hash_table, &key);
135 + }
136 + return 0;
137 + }
138 +
139 + Userspace
140 + ---------
141 +
142 + This snippet shows how to use libbpf to create a bloom filter map from
143 + userspace:
144 +
145 + .. code-block:: c
146 +
147 + int create_bloom()
148 + {
149 + LIBBPF_OPTS(bpf_map_create_opts, opts,
150 + .map_extra = 3); /* number of hashes */
151 +
152 + return bpf_map_create(BPF_MAP_TYPE_BLOOM_FILTER,
153 + "ipv6_bloom", /* name */
154 + 0, /* key size, must be zero */
155 + sizeof(ipv6_addr), /* value size */
156 + 10000, /* max entries */
157 + &opts); /* create options */
158 + }
159 +
160 + This snippet shows how to add an element to a bloom filter from
161 + userspace:
162 +
163 + .. code-block:: c
164 +
165 + int add_element(struct bpf_map *bloom_map, __u32 value)
166 + {
167 + int bloom_fd = bpf_map__fd(bloom_map);
168 + return bpf_map_update_elem(bloom_fd, NULL, &value, BPF_ANY);
169 + }
170 +
171 + References
172 + ==========
173 +
174 + https://lwn.net/ml/bpf/20210831225005.2762202-1-joannekoong@fb.com/
+35 -24
Documentation/bpf/map_cpumap.rst
···
 Kernel BPF
 ----------
-.. c:function::
+bpf_redirect_map()
+^^^^^^^^^^^^^^^^^^
+.. code-block:: c
+
    long bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)

- Redirect the packet to the endpoint referenced by ``map`` at index ``key``.
- For ``BPF_MAP_TYPE_CPUMAP`` this map contains references to CPUs.
+Redirect the packet to the endpoint referenced by ``map`` at index ``key``.
+For ``BPF_MAP_TYPE_CPUMAP`` this map contains references to CPUs.

- The lower two bits of ``flags`` are used as the return code if the map lookup
- fails. This is so that the return value can be one of the XDP program return
- codes up to ``XDP_TX``, as chosen by the caller.
+The lower two bits of ``flags`` are used as the return code if the map lookup
+fails. This is so that the return value can be one of the XDP program return
+codes up to ``XDP_TX``, as chosen by the caller.

-Userspace
----------
+User space
+----------
 .. note::
    CPUMAP entries can only be updated/looked up/deleted from user space and not
    from an eBPF program. Trying to call these functions from a kernel eBPF
    program will result in the program failing to load and a verifier warning.

-.. c:function::
-   int bpf_map_update_elem(int fd, const void *key, const void *value,
-   __u64 flags);
+bpf_map_update_elem()
+^^^^^^^^^^^^^^^^^^^^^
+.. code-block:: c

- CPU entries can be added or updated using the ``bpf_map_update_elem()``
- helper. This helper replaces existing elements atomically. The ``value`` parameter
- can be ``struct bpf_cpumap_val``.
+   int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags);
+
+CPU entries can be added or updated using the ``bpf_map_update_elem()``
+helper. This helper replaces existing elements atomically. The ``value`` parameter
+can be ``struct bpf_cpumap_val``.

 .. code-block:: c
···
    } bpf_prog;
 };

-The flags argument can be one of the following:
+The flags argument can be one of the following:
  - BPF_ANY: Create a new element or update an existing element.
  - BPF_NOEXIST: Create a new element only if it did not exist.
  - BPF_EXIST: Update an existing element.

-.. c:function::
+bpf_map_lookup_elem()
+^^^^^^^^^^^^^^^^^^^^^
+.. code-block:: c
+
    int bpf_map_lookup_elem(int fd, const void *key, void *value);

- CPU entries can be retrieved using the ``bpf_map_lookup_elem()``
- helper.
+CPU entries can be retrieved using the ``bpf_map_lookup_elem()``
+helper.

-.. c:function::
+bpf_map_delete_elem()
+^^^^^^^^^^^^^^^^^^^^^
+.. code-block:: c
+
    int bpf_map_delete_elem(int fd, const void *key);

- CPU entries can be deleted using the ``bpf_map_delete_elem()``
- helper. This helper will return 0 on success, or negative error in case of
- failure.
+CPU entries can be deleted using the ``bpf_map_delete_elem()``
+helper. This helper will return 0 on success, or negative error in case of
+failure.

 Examples
 ========
···
        return bpf_redirect_map(&cpu_map, cpu_dest, 0);
 }

-Userspace
----------
+User space
+----------

 The following code snippet shows how to dynamically set the max_entries for a
 CPUMAP to the max number of cpus available on the system.
+238
Documentation/bpf/map_devmap.rst
.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

=================================================
BPF_MAP_TYPE_DEVMAP and BPF_MAP_TYPE_DEVMAP_HASH
=================================================

.. note::
   - ``BPF_MAP_TYPE_DEVMAP`` was introduced in kernel version 4.14
   - ``BPF_MAP_TYPE_DEVMAP_HASH`` was introduced in kernel version 5.4

``BPF_MAP_TYPE_DEVMAP`` and ``BPF_MAP_TYPE_DEVMAP_HASH`` are BPF maps primarily
used as backend maps for the XDP BPF helper call ``bpf_redirect_map()``.
``BPF_MAP_TYPE_DEVMAP`` is backed by an array that uses the key as
the index to look up a reference to a net device, while ``BPF_MAP_TYPE_DEVMAP_HASH``
is backed by a hash table that uses a key to look up a reference to a net device.
The user provides either <``key``/ ``ifindex``> or <``key``/ ``struct bpf_devmap_val``>
pairs to update the maps with new net devices.

.. note::
   - The key to a hash map doesn't have to be an ``ifindex``.
   - While ``BPF_MAP_TYPE_DEVMAP_HASH`` allows for densely packing the net devices,
     it comes at the cost of a hash of the key when performing a lookup.

The setup and packet enqueue/send code is shared between the two types of
devmap; only the lookup and insertion is different.

Usage
=====
Kernel BPF
----------
bpf_redirect_map()
^^^^^^^^^^^^^^^^^^
.. code-block:: c

   long bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)

Redirect the packet to the endpoint referenced by ``map`` at index ``key``.
For ``BPF_MAP_TYPE_DEVMAP`` and ``BPF_MAP_TYPE_DEVMAP_HASH`` this map contains
references to net devices (for forwarding packets through other ports).

The lower two bits of ``flags`` are used as the return code if the map lookup
fails. This is so that the return value can be one of the XDP program return
codes up to ``XDP_TX``, as chosen by the caller. The higher bits of ``flags``
can be set to ``BPF_F_BROADCAST`` or ``BPF_F_EXCLUDE_INGRESS`` as defined
below.

With ``BPF_F_BROADCAST`` the packet will be broadcast to all the interfaces
in the map, with ``BPF_F_EXCLUDE_INGRESS`` the ingress interface will be excluded
from the broadcast.

.. note::
   - The key is ignored if BPF_F_BROADCAST is set.
   - The broadcast feature can also be used to implement multicast forwarding:
     simply create multiple DEVMAPs, each one corresponding to a single multicast group.

This helper will return ``XDP_REDIRECT`` on success, or the value of the two
lower bits of the ``flags`` argument if the map lookup fails.

More information about redirection can be found in :doc:`redirect`.

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

Net device entries can be retrieved using the ``bpf_map_lookup_elem()``
helper.

User space
----------
.. note::
   DEVMAP entries can only be updated/deleted from user space and not
   from an eBPF program. Trying to call these functions from a kernel eBPF
   program will result in the program failing to load and a verifier warning.

bpf_map_update_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags);

Net device entries can be added or updated using the ``bpf_map_update_elem()``
helper. This helper replaces existing elements atomically. The ``value`` parameter
can be ``struct bpf_devmap_val`` or a simple ``int ifindex`` for backwards
compatibility.

.. code-block:: c

   struct bpf_devmap_val {
       __u32 ifindex;   /* device index */
       union {
           int   fd;    /* prog fd on map write */
           __u32 id;    /* prog id on map read */
       } bpf_prog;
   };

The ``flags`` argument can be one of the following:
  - ``BPF_ANY``: Create a new element or update an existing element.
  - ``BPF_NOEXIST``: Create a new element only if it did not exist.
  - ``BPF_EXIST``: Update an existing element.

DEVMAPs can associate a program with a device entry by adding a ``bpf_prog.fd``
to ``struct bpf_devmap_val``. Programs are run after ``XDP_REDIRECT`` and have
access to both Rx device and Tx device. The program associated with the ``fd``
must have type XDP with expected attach type ``xdp_devmap``.
When a program is associated with a device index, the program is run on an
``XDP_REDIRECT`` and before the buffer is added to the per-cpu queue. Examples
of how to attach/use xdp_devmap progs can be found in the kernel selftests:

- ``tools/testing/selftests/bpf/prog_tests/xdp_devmap_attach.c``
- ``tools/testing/selftests/bpf/progs/test_xdp_with_devmap_helpers.c``

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   int bpf_map_lookup_elem(int fd, const void *key, void *value);

Net device entries can be retrieved using the ``bpf_map_lookup_elem()``
helper.

bpf_map_delete_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   int bpf_map_delete_elem(int fd, const void *key);

Net device entries can be deleted using the ``bpf_map_delete_elem()``
helper. This helper will return 0 on success, or negative error in case of
failure.

Examples
========

Kernel BPF
----------

The following code snippet shows how to declare a ``BPF_MAP_TYPE_DEVMAP``
called tx_port.

.. code-block:: c

    struct {
        __uint(type, BPF_MAP_TYPE_DEVMAP);
        __type(key, __u32);
        __type(value, __u32);
        __uint(max_entries, 256);
    } tx_port SEC(".maps");

The following code snippet shows how to declare a ``BPF_MAP_TYPE_DEVMAP_HASH``
called forward_map.

.. code-block:: c

    struct {
        __uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
        __type(key, __u32);
        __type(value, struct bpf_devmap_val);
        __uint(max_entries, 32);
    } forward_map SEC(".maps");

.. note::

    The value type in the DEVMAP above is a ``struct bpf_devmap_val``

The following code snippet shows a simple xdp_redirect_map program. This program
would work with a user space program that populates the devmap ``forward_map`` based
on ingress ifindexes. The BPF program (below) is redirecting packets using the
ingress ``ifindex`` as the ``key``.

.. code-block:: c

    SEC("xdp")
    int xdp_redirect_map_func(struct xdp_md *ctx)
    {
        int index = ctx->ingress_ifindex;

        return bpf_redirect_map(&forward_map, index, 0);
    }

The following code snippet shows a BPF program that is broadcasting packets to
all the interfaces in the ``tx_port`` devmap.

.. code-block:: c

    SEC("xdp")
    int xdp_redirect_map_func(struct xdp_md *ctx)
    {
        return bpf_redirect_map(&tx_port, 0, BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS);
    }

User space
----------

The following code snippet shows how to update a devmap called ``tx_port``.

.. code-block:: c

    int update_devmap(int ifindex, int redirect_ifindex)
    {
        int ret;

        ret = bpf_map_update_elem(bpf_map__fd(tx_port), &ifindex, &redirect_ifindex, 0);
        if (ret < 0) {
            fprintf(stderr, "Failed to update devmap value: %s\n",
                    strerror(errno));
        }

        return ret;
    }

The following code snippet shows how to update a hash_devmap called ``forward_map``.

.. code-block:: c

    int update_devmap(int ifindex, int redirect_ifindex)
    {
        struct bpf_devmap_val devmap_val = { .ifindex = redirect_ifindex };
        int ret;

        ret = bpf_map_update_elem(bpf_map__fd(forward_map), &ifindex, &devmap_val, 0);
        if (ret < 0) {
            fprintf(stderr, "Failed to update devmap value: %s\n",
                    strerror(errno));
        }
        return ret;
    }

References
==========

- https://lwn.net/Articles/728146/
- https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=6f9d451ab1a33728adb72d7ff66a7b374d665176
- https://elixir.bootlin.com/linux/latest/source/net/core/filter.c#L4106
+28 -5
Documentation/bpf/map_hash.rst
···
 Usage
 =====

-.. c:function::
+Kernel BPF
+----------
+
+bpf_map_update_elem()
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags)

 Hash entries can be added or updated using the ``bpf_map_update_elem()``
···
 ``bpf_map_update_elem()`` returns 0 on success, or negative error in
 case of failure.

-.. c:function::
+bpf_map_lookup_elem()
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

 Hash entries can be retrieved using the ``bpf_map_lookup_elem()``
 helper. This helper returns a pointer to the value associated with
 ``key``, or ``NULL`` if no entry was found.

-.. c:function::
+bpf_map_delete_elem()
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    long bpf_map_delete_elem(struct bpf_map *map, const void *key)

 Hash entries can be deleted using the ``bpf_map_delete_elem()``
···
 the ``bpf_map_update_elem()`` and ``bpf_map_lookup_elem()`` helpers
 automatically access the hash slot for the current CPU.

-.. c:function::
+bpf_map_lookup_percpu_elem()
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    void *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void *key, u32 cpu)

 The ``bpf_map_lookup_percpu_elem()`` helper can be used to lookup the
···
 Userspace
 ---------

-.. c:function::
+bpf_map_get_next_key()
+~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    int bpf_map_get_next_key(int fd, const void *cur_key, void *next_key)

 In userspace, it is possible to iterate through the keys of a hash using
+20 -4
Documentation/bpf/map_lpm_trie.rst
···
 Kernel BPF
 ----------

-.. c:function::
+bpf_map_lookup_elem()
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

 The longest prefix entry for a given data value can be found using the
···
 longest prefix match for an IPv4 address, ``prefixlen`` should be set to
 ``32``.

-.. c:function::
+bpf_map_update_elem()
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags)

 Prefix entries can be added or updated using the ``bpf_map_update_elem()``
···
 The flags parameter must be one of BPF_ANY, BPF_NOEXIST or BPF_EXIST,
 but the value is ignored, giving BPF_ANY semantics.

-.. c:function::
+bpf_map_delete_elem()
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    long bpf_map_delete_elem(struct bpf_map *map, const void *key)

 Prefix entries can be deleted using the ``bpf_map_delete_elem()``
···
 Access from userspace uses libbpf APIs with the same names as above, with
 the map identified by ``fd``.

-.. c:function::
+bpf_map_get_next_key()
+~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    int bpf_map_get_next_key (int fd, const void *cur_key, void *next_key)

 A userspace program can iterate through the entries in an LPM trie using
+5 -1
Documentation/bpf/map_of_maps.rst
···
 Kernel BPF Helper
 -----------------

-.. c:function::
+bpf_map_lookup_elem()
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

 Inner maps can be retrieved using the ``bpf_map_lookup_elem()`` helper. This
+30 -6
Documentation/bpf/map_queue_stack.rst
···
 Kernel BPF
 ----------

-.. c:function::
+bpf_map_push_elem()
+~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    long bpf_map_push_elem(struct bpf_map *map, const void *value, u64 flags)

 An element ``value`` can be added to a queue or stack using the
···
 make room for ``value`` to be added. Returns ``0`` on success, or
 negative error in case of failure.

-.. c:function::
+bpf_map_peek_elem()
+~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    long bpf_map_peek_elem(struct bpf_map *map, void *value)

 This helper fetches an element ``value`` from a queue or stack without
 removing it. Returns ``0`` on success, or negative error in case of
 failure.

-.. c:function::
+bpf_map_pop_elem()
+~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    long bpf_map_pop_elem(struct bpf_map *map, void *value)

 This helper removes an element into ``value`` from a queue or
···
 Userspace
 ---------

-.. c:function::
+bpf_map_update_elem()
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    int bpf_map_update_elem (int fd, const void *key, const void *value, __u64 flags)

 A userspace program can push ``value`` onto a queue or stack using libbpf's
···
 same semantics as the ``bpf_map_push_elem`` kernel helper. Returns ``0`` on
 success, or negative error in case of failure.

-.. c:function::
+bpf_map_lookup_elem()
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    int bpf_map_lookup_elem (int fd, const void *key, void *value)

 A userspace program can peek at the ``value`` at the head of a queue or stack
···
 set to ``NULL``. Returns ``0`` on success, or negative error in case of
 failure.

-.. c:function::
+bpf_map_lookup_and_delete_elem()
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
    int bpf_map_lookup_and_delete_elem (int fd, const void *key, void *value)

 A userspace program can pop a ``value`` from the head of a queue or stack using
+192
Documentation/bpf/map_xskmap.rst
.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

===================
BPF_MAP_TYPE_XSKMAP
===================

.. note::
   - ``BPF_MAP_TYPE_XSKMAP`` was introduced in kernel version 4.18

The ``BPF_MAP_TYPE_XSKMAP`` is used as a backend map for XDP BPF helper
call ``bpf_redirect_map()`` and ``XDP_REDIRECT`` action, like 'devmap' and 'cpumap'.
This map type redirects raw XDP frames to `AF_XDP`_ sockets (XSKs), a new type of
address family in the kernel that allows redirection of frames from a driver to
user space without having to traverse the full network stack. An AF_XDP socket
binds to a single netdev queue. A mapping of XSKs to queues is shown below:

.. code-block:: none

    +---------------------------------------------------+
    |     xsk A      |     xsk B      |      xsk C      |<---+ User space
    =========================================================|==========
    |    Queue 0     |     Queue 1    |     Queue 2     |    | Kernel
    +---------------------------------------------------+    |
    |                  Netdev eth0                      |    |
    +---------------------------------------------------+    |
    |            +=============+                        |    |
    |            | key |  xsk  |                        |    |
    |  +---------+ +=============+                      |    |
    |  |         | | 0   | xsk A |                      |    |
    |  |         | +-------------+                      |    |
    |  |         | | 1   | xsk B |                      |    |
    |  |   BPF   |-- redirect -->+-------------+-------------+
    |  |  prog   | | 2   | xsk C |                      |
    |  |         | +-------------+                      |
    |  |         |                                      |
    |  +---------+                                      |
    |                                                   |
    +---------------------------------------------------+

.. note::
    An AF_XDP socket that is bound to a certain <netdev/queue_id> will *only*
    accept XDP frames from that <netdev/queue_id>. If an XDP program tries to redirect
    from a <netdev/queue_id> other than what the socket is bound to, the frame will
    not be received on the socket.

Typically an XSKMAP is created per netdev. This map contains an array of XSK File
Descriptors (FDs). The number of array elements is typically set or adjusted using
the ``max_entries`` map parameter. For AF_XDP ``max_entries`` is equal to the number
of queues supported by the netdev.

.. note::
    Both the map key and map value size must be 4 bytes.

Usage
=====

Kernel BPF
----------
bpf_redirect_map()
^^^^^^^^^^^^^^^^^^
.. code-block:: c

   long bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)

Redirect the packet to the endpoint referenced by ``map`` at index ``key``.
For ``BPF_MAP_TYPE_XSKMAP`` this map contains references to XSK FDs
for sockets attached to a netdev's queues.

.. note::
    If the map is empty at an index, the packet is dropped. This means that it is
    necessary to have an XDP program loaded with at least one XSK in the
    XSKMAP to be able to get any traffic to user space through the socket.

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

XSK entry references of type ``struct xdp_sock *`` can be retrieved using the
``bpf_map_lookup_elem()`` helper.

User space
----------
.. note::
    XSK entries can only be updated/deleted from user space and not from
    a BPF program. Trying to call these functions from a kernel BPF program will
    result in the program failing to load and a verifier warning.

bpf_map_update_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags)

XSK entries can be added or updated using the ``bpf_map_update_elem()``
helper. The ``key`` parameter is equal to the queue_id of the queue the XSK
is attaching to, and the ``value`` parameter is the FD value of that socket.

Under the hood, the XSKMAP update function uses the XSK FD value to retrieve the
associated ``struct xdp_sock`` instance.

The flags argument can be one of the following:

- BPF_ANY: Create a new element or update an existing element.
- BPF_NOEXIST: Create a new element only if it did not exist.
- BPF_EXIST: Update an existing element.

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   int bpf_map_lookup_elem(int fd, const void *key, void *value)

Returns ``struct xdp_sock *`` or negative error in case of failure.

bpf_map_delete_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

   int bpf_map_delete_elem(int fd, const void *key)

XSK entries can be deleted using the ``bpf_map_delete_elem()``
helper. This helper will return 0 on success, or negative error in case of
failure.

.. note::
    When `libxdp`_ deletes an XSK it also removes the associated socket
    entry from the XSKMAP.

Examples
========
Kernel
------

The following code snippet shows how to declare a ``BPF_MAP_TYPE_XSKMAP`` called
``xsks_map`` and how to redirect packets to an XSK.

.. code-block:: c

   struct {
        __uint(type, BPF_MAP_TYPE_XSKMAP);
        __type(key, __u32);
        __type(value, __u32);
        __uint(max_entries, 64);
   } xsks_map SEC(".maps");


   SEC("xdp")
   int xsk_redir_prog(struct xdp_md *ctx)
   {
        __u32 index = ctx->rx_queue_index;

        if (bpf_map_lookup_elem(&xsks_map, &index))
            return bpf_redirect_map(&xsks_map, index, 0);
        return XDP_PASS;
   }

User space
----------

The following code snippet shows how to update an XSKMAP with an XSK entry.

.. code-block:: c

   int update_xsks_map(struct bpf_map *xsks_map, int queue_id, int xsk_fd)
   {
        int ret;

        ret = bpf_map_update_elem(bpf_map__fd(xsks_map), &queue_id, &xsk_fd, 0);
        if (ret < 0)
            fprintf(stderr, "Failed to update xsks_map: %s\n", strerror(errno));

        return ret;
   }

For an example of how to create AF_XDP sockets, please see the AF_XDP-example and
AF_XDP-forwarding programs in the `bpf-examples`_ directory in the `libxdp`_ repository.
For a detailed explanation of the AF_XDP interface please see:

- `libxdp-readme`_.
- `AF_XDP`_ kernel documentation.

.. note::
    The most comprehensive resource for using XSKMAPs and AF_XDP is `libxdp`_.

.. _libxdp: https://github.com/xdp-project/xdp-tools/tree/master/lib/libxdp
.. _AF_XDP: https://www.kernel.org/doc/html/latest/networking/af_xdp.html
.. _bpf-examples: https://github.com/xdp-project/bpf-examples
.. _libxdp-readme: https://github.com/xdp-project/xdp-tools/tree/master/lib/libxdp#using-af_xdp-sockets
+3
Documentation/bpf/programs.rst
···
    :glob:

    prog_*
+
+For a list of all program types, see :ref:`program_types_and_elf` in
+the :ref:`libbpf` documentation.
+81
Documentation/bpf/redirect.rst
.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

========
Redirect
========
XDP_REDIRECT
############
Supported maps
--------------

XDP_REDIRECT works with the following map types:

- ``BPF_MAP_TYPE_DEVMAP``
- ``BPF_MAP_TYPE_DEVMAP_HASH``
- ``BPF_MAP_TYPE_CPUMAP``
- ``BPF_MAP_TYPE_XSKMAP``

For more information on these maps, please see the specific map documentation.

Process
-------

.. kernel-doc:: net/core/filter.c
   :doc: xdp redirect

.. note::
    Not all drivers support transmitting frames after a redirect, and for
    those that do, not all of them support non-linear frames. Non-linear xdp
    bufs/frames are bufs/frames that contain more than one fragment.

Debugging packet drops
----------------------
Silent packet drops for XDP_REDIRECT can be debugged using:

- bpf_trace
- perf_record

bpf_trace
^^^^^^^^^
The following bpftrace command can be used to capture and count all XDP tracepoints:

.. code-block:: none

    sudo bpftrace -e 'tracepoint:xdp:* { @cnt[probe] = count(); }'
    Attaching 12 probes...
    ^C

    @cnt[tracepoint:xdp:mem_connect]: 18
    @cnt[tracepoint:xdp:mem_disconnect]: 18
    @cnt[tracepoint:xdp:xdp_exception]: 19605
    @cnt[tracepoint:xdp:xdp_devmap_xmit]: 1393604
    @cnt[tracepoint:xdp:xdp_redirect]: 22292200

.. note::
    The various xdp tracepoints can be found in ``source/include/trace/events/xdp.h``

The following bpftrace command can be used to extract the ``ERRNO`` being returned as
part of the err parameter:

.. code-block:: none

    sudo bpftrace -e \
    'tracepoint:xdp:xdp_redirect*_err {@redir_errno[-args->err] = count();}
    tracepoint:xdp:xdp_devmap_xmit {@devmap_errno[-args->err] = count();}'

perf record
^^^^^^^^^^^
The perf tool also supports recording tracepoints:

.. code-block:: none

    perf record -a -e xdp:xdp_redirect_err \
        -e xdp:xdp_redirect_map_err \
        -e xdp:xdp_exception \
        -e xdp:xdp_devmap_xmit

References
==========

- https://github.com/xdp-project/xdp-tutorial/tree/master/tracing02-xdp-monitor
+107 -47
include/linux/bpf.h
···
 extern struct idr btf_idr;
 extern spinlock_t btf_idr_lock;
 extern struct kobject *btf_kobj;
+extern struct bpf_mem_alloc bpf_global_ma;
+extern bool bpf_global_ma_set;

 typedef u64 (*bpf_callback_t)(u64, u64, u64, u64, u64);
 typedef int (*bpf_iter_init_seq_priv_t)(void *private_data,
···
        int (*map_lookup_and_delete_batch)(struct bpf_map *map,
                                           const union bpf_attr *attr,
                                           union bpf_attr __user *uattr);
-       int (*map_update_batch)(struct bpf_map *map, const union bpf_attr *attr,
+       int (*map_update_batch)(struct bpf_map *map, struct file *map_file,
+                               const union bpf_attr *attr,
                                union bpf_attr __user *uattr);
        int (*map_delete_batch)(struct bpf_map *map, const union bpf_attr *attr,
                                union bpf_attr __user *uattr);
···
        struct bpf_local_storage __rcu ** (*map_owner_storage_ptr)(void *owner);

        /* Misc helpers.*/
-       int (*map_redirect)(struct bpf_map *map, u32 ifindex, u64 flags);
+       int (*map_redirect)(struct bpf_map *map, u64 key, u64 flags);

        /* map_meta_equal must be implemented for maps that can be
         * used as an inner map.  It is a runtime check to ensure
···
 };

 enum {
-       /* Support at most 8 pointers in a BTF type */
-       BTF_FIELDS_MAX = 10,
-       BPF_MAP_OFF_ARR_MAX = BTF_FIELDS_MAX,
+       /* Support at most 10 fields in a BTF type */
+       BTF_FIELDS_MAX = 10,
 };

 enum btf_field_type {
···
        BPF_KPTR_UNREF = (1 << 2),
        BPF_KPTR_REF   = (1 << 3),
        BPF_KPTR       = BPF_KPTR_UNREF | BPF_KPTR_REF,
+       BPF_LIST_HEAD  = (1 << 4),
+       BPF_LIST_NODE  = (1 << 5),
 };

 struct btf_field_kptr {
···
        u32 btf_id;
 };

+struct btf_field_list_head {
+       struct btf *btf;
+       u32 value_btf_id;
+       u32 node_offset;
+       struct btf_record *value_rec;
+};
+
 struct btf_field {
        u32 offset;
        enum btf_field_type type;
        union {
                struct btf_field_kptr kptr;
+               struct btf_field_list_head list_head;
        };
 };
···
 struct btf_field_offs {
        u32 cnt;
-       u32 field_off[BPF_MAP_OFF_ARR_MAX];
-       u8 field_sz[BPF_MAP_OFF_ARR_MAX];
+       u32 field_off[BTF_FIELDS_MAX];
+       u8 field_sz[BTF_FIELDS_MAX];
 };

 struct bpf_map {
···
        case BPF_KPTR_UNREF:
        case BPF_KPTR_REF:
                return "kptr";
+       case BPF_LIST_HEAD:
+               return "bpf_list_head";
+       case BPF_LIST_NODE:
+               return "bpf_list_node";
        default:
                WARN_ON_ONCE(1);
                return "unknown";
···
        case BPF_KPTR_UNREF:
        case BPF_KPTR_REF:
                return sizeof(u64);
+       case BPF_LIST_HEAD:
+               return sizeof(struct bpf_list_head);
+       case BPF_LIST_NODE:
+               return sizeof(struct bpf_list_node);
        default:
                WARN_ON_ONCE(1);
                return 0;
···
        case BPF_KPTR_UNREF:
        case BPF_KPTR_REF:
                return __alignof__(u64);
+       case BPF_LIST_HEAD:
+               return __alignof__(struct bpf_list_head);
+       case BPF_LIST_NODE:
+               return __alignof__(struct bpf_list_node);
        default:
                WARN_ON_ONCE(1);
                return 0;
···
        return rec->field_mask & type;
 }

+static inline void bpf_obj_init(const struct btf_field_offs *foffs, void *obj)
+{
+       int i;
+
+       if (!foffs)
+               return;
+       for (i = 0; i < foffs->cnt; i++)
+               memset(obj + foffs->field_off[i], 0, foffs->field_sz[i]);
+}
+
 static inline void check_and_init_map_value(struct bpf_map *map, void *dst)
 {
-       if (!IS_ERR_OR_NULL(map->record)) {
-               struct btf_field *fields = map->record->fields;
-               u32 cnt = map->record->cnt;
-               int i;
-
-               for (i = 0; i < cnt; i++)
-                       memset(dst + fields[i].offset, 0, btf_field_type_size(fields[i].type));
-       }
+       bpf_obj_init(map->field_offs, dst);
 }

 /* memcpy that is used with 8-byte aligned pointers, power-of-8 size and
···
                u32 sz = next_off - curr_off;

                memcpy(dst + curr_off, src + curr_off, sz);
-               curr_off = next_off + foffs->field_sz[i];
+               curr_off += foffs->field_sz[i] + sz;
        }
        memcpy(dst + curr_off, src + curr_off, size - curr_off);
 }
···
                u32 sz = next_off - curr_off;

                memset(dst + curr_off, 0, sz);
-               curr_off = next_off + foffs->field_sz[i];
+               curr_off += foffs->field_sz[i] + sz;
        }
        memset(dst + curr_off, 0, size - curr_off);
 }
···
 void copy_map_value_locked(struct bpf_map *map, void *dst, void *src,
                           bool lock_src);
 void bpf_timer_cancel_and_free(void *timer);
+void bpf_list_head_free(const struct btf_field *field, void *list_head,
+                       struct bpf_spin_lock *spin_lock);
+
 int bpf_obj_name_cpy(char *dst, const char *src, unsigned int size);

 struct bpf_offload_dev;
···
         */
        MEM_RDONLY              = BIT(1 + BPF_BASE_TYPE_BITS),

-       /* MEM was "allocated" from a different helper, and cannot be mixed
-        * with regular non-MEM_ALLOC'ed MEM types.
-        */
-       MEM_ALLOC               = BIT(2 + BPF_BASE_TYPE_BITS),
+       /* MEM points to BPF ring buffer reservation. */
+       MEM_RINGBUF             = BIT(2 + BPF_BASE_TYPE_BITS),

        /* MEM is in user address space. */
        MEM_USER                = BIT(3 + BPF_BASE_TYPE_BITS),
···

        /* Size is known at compile time. */
        MEM_FIXED_SIZE          = BIT(10 + BPF_BASE_TYPE_BITS),
+
+       /* MEM is of an allocated object of type in program BTF. This is used to
+        * tag PTR_TO_BTF_ID allocated using bpf_obj_new.
+        */
+       MEM_ALLOC               = BIT(11 + BPF_BASE_TYPE_BITS),
+
+       /* PTR was passed from the kernel in a trusted context, and may be
+        * passed to KF_TRUSTED_ARGS kfuncs or BPF helper functions.
+        * Confusingly, this is _not_ the opposite of PTR_UNTRUSTED above.
+        * PTR_UNTRUSTED refers to a kptr that was read directly from a map
+        * without invoking bpf_kptr_xchg(). What we really need to know is
+        * whether a pointer is safe to pass to a kfunc or BPF helper function.
+        * While PTR_UNTRUSTED pointers are unsafe to pass to kfuncs and BPF
+        * helpers, they do not cover all possible instances of unsafe
+        * pointers. For example, a pointer that was obtained from walking a
+        * struct will _not_ get the PTR_UNTRUSTED type modifier, despite the
+        * fact that it may be NULL, invalid, etc. This is due to backwards
+        * compatibility requirements, as this was the behavior that was first
+        * introduced when kptrs were added. The behavior is now considered
+        * deprecated, and PTR_UNTRUSTED will eventually be removed.
+        *
+        * PTR_TRUSTED, on the other hand, is a pointer that the kernel
+        * guarantees to be valid and safe to pass to kfuncs and BPF helpers.
+        * For example, pointers passed to tracepoint arguments are considered
+        * PTR_TRUSTED, as are pointers that are passed to struct_ops
+        * callbacks. As alluded to above, pointers that are obtained from
+        * walking PTR_TRUSTED pointers are _not_ trusted. For example, if a
+        * struct task_struct *task is PTR_TRUSTED, then accessing
+        * task->last_wakee will lose the PTR_TRUSTED modifier when it's stored
+        * in a BPF register. Similarly, pointers passed to certain programs
+        * types such as kretprobes are not guaranteed to be valid, as they may
+        * for example contain an object that was recently freed.
+        */
+       PTR_TRUSTED             = BIT(12 + BPF_BASE_TYPE_BITS),
+
+       /* MEM is tagged with rcu and memory access needs rcu_read_lock protection. */
+       MEM_RCU                 = BIT(13 + BPF_BASE_TYPE_BITS),

        __BPF_TYPE_FLAG_MAX,
        __BPF_TYPE_LAST_FLAG    = __BPF_TYPE_FLAG_MAX - 1,
···
        ARG_PTR_TO_LONG,        /* pointer to long */
        ARG_PTR_TO_SOCKET,      /* pointer to bpf_sock (fullsock) */
        ARG_PTR_TO_BTF_ID,      /* pointer to in-kernel struct */
-       ARG_PTR_TO_ALLOC_MEM,   /* pointer to dynamically allocated memory */
+       ARG_PTR_TO_RINGBUF_MEM, /* pointer to dynamically reserved ringbuf memory */
        ARG_CONST_ALLOC_SIZE_OR_ZERO,   /* number of allocated bytes requested */
        ARG_PTR_TO_BTF_ID_SOCK_COMMON,  /* pointer to in-kernel sock_common or bpf-mirrored bpf_sock */
        ARG_PTR_TO_PERCPU_BTF_ID,       /* pointer to in-kernel percpu type */
···
        ARG_PTR_TO_MEM_OR_NULL          = PTR_MAYBE_NULL | ARG_PTR_TO_MEM,
        ARG_PTR_TO_CTX_OR_NULL          = PTR_MAYBE_NULL | ARG_PTR_TO_CTX,
        ARG_PTR_TO_SOCKET_OR_NULL       = PTR_MAYBE_NULL | ARG_PTR_TO_SOCKET,
-       ARG_PTR_TO_ALLOC_MEM_OR_NULL    = PTR_MAYBE_NULL | ARG_PTR_TO_ALLOC_MEM,
        ARG_PTR_TO_STACK_OR_NULL        = PTR_MAYBE_NULL | ARG_PTR_TO_STACK,
        ARG_PTR_TO_BTF_ID_OR_NULL       = PTR_MAYBE_NULL | ARG_PTR_TO_BTF_ID,
        /* pointer to memory does not need to be initialized, helper function must fill
···
        RET_PTR_TO_SOCKET,      /* returns a pointer to a socket */
        RET_PTR_TO_TCP_SOCK,    /* returns a pointer to a
tcp_sock */ 657 593 RET_PTR_TO_SOCK_COMMON, /* returns a pointer to a sock_common */ 658 - RET_PTR_TO_ALLOC_MEM, /* returns a pointer to dynamically allocated memory */ 594 + RET_PTR_TO_MEM, /* returns a pointer to memory */ 659 595 RET_PTR_TO_MEM_OR_BTF_ID, /* returns a pointer to a valid memory or a btf_id */ 660 596 RET_PTR_TO_BTF_ID, /* returns a pointer to a btf_id */ 661 597 __BPF_RET_TYPE_MAX, ··· 665 601 RET_PTR_TO_SOCKET_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_SOCKET, 666 602 RET_PTR_TO_TCP_SOCK_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_TCP_SOCK, 667 603 RET_PTR_TO_SOCK_COMMON_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_SOCK_COMMON, 668 - RET_PTR_TO_ALLOC_MEM_OR_NULL = PTR_MAYBE_NULL | MEM_ALLOC | RET_PTR_TO_ALLOC_MEM, 669 - RET_PTR_TO_DYNPTR_MEM_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_ALLOC_MEM, 604 + RET_PTR_TO_RINGBUF_MEM_OR_NULL = PTR_MAYBE_NULL | MEM_RINGBUF | RET_PTR_TO_MEM, 605 + RET_PTR_TO_DYNPTR_MEM_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_MEM, 670 606 RET_PTR_TO_BTF_ID_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_BTF_ID, 607 + RET_PTR_TO_BTF_ID_TRUSTED = PTR_TRUSTED | RET_PTR_TO_BTF_ID, 671 608 672 609 /* This must be the last entry. Its purpose is to ensure the enum is 673 610 * wide enough to hold the higher bits reserved for bpf_type_flag. 
··· 685 620 u64 (*func)(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); 686 621 bool gpl_only; 687 622 bool pkt_access; 623 + bool might_sleep; 688 624 enum bpf_return_type ret_type; 689 625 union { 690 626 struct { ··· 824 758 union bpf_attr __user *uattr); 825 759 }; 826 760 761 + struct bpf_reg_state; 827 762 struct bpf_verifier_ops { 828 763 /* return eBPF function prototype for verification */ 829 764 const struct bpf_func_proto * ··· 846 779 struct bpf_insn *dst, 847 780 struct bpf_prog *prog, u32 *target_size); 848 781 int (*btf_struct_access)(struct bpf_verifier_log *log, 849 - const struct btf *btf, 850 - const struct btf_type *t, int off, int size, 851 - enum bpf_access_type atype, 782 + const struct bpf_reg_state *reg, 783 + int off, int size, enum bpf_access_type atype, 852 784 u32 *next_btf_id, enum bpf_type_flag *flag); 853 785 }; 854 786 ··· 1860 1794 int generic_map_lookup_batch(struct bpf_map *map, 1861 1795 const union bpf_attr *attr, 1862 1796 union bpf_attr __user *uattr); 1863 - int generic_map_update_batch(struct bpf_map *map, 1797 + int generic_map_update_batch(struct bpf_map *map, struct file *map_file, 1864 1798 const union bpf_attr *attr, 1865 1799 union bpf_attr __user *uattr); 1866 1800 int generic_map_delete_batch(struct bpf_map *map, ··· 2151 2085 return btf_ctx_access(off, size, type, prog, info); 2152 2086 } 2153 2087 2154 - int btf_struct_access(struct bpf_verifier_log *log, const struct btf *btf, 2155 - const struct btf_type *t, int off, int size, 2156 - enum bpf_access_type atype, 2088 + int btf_struct_access(struct bpf_verifier_log *log, 2089 + const struct bpf_reg_state *reg, 2090 + int off, int size, enum bpf_access_type atype, 2157 2091 u32 *next_btf_id, enum bpf_type_flag *flag); 2158 2092 bool btf_struct_ids_match(struct bpf_verifier_log *log, 2159 2093 const struct btf *btf, u32 id, int off, ··· 2166 2100 const char *func_name, 2167 2101 struct btf_func_model *m); 2168 2102 2169 - struct bpf_kfunc_arg_meta { 2170 - u64 r0_size; 
2171 - bool r0_rdonly; 2172 - int ref_obj_id; 2173 - u32 flags; 2174 - }; 2175 - 2176 2103 struct bpf_reg_state; 2177 2104 int btf_check_subprog_arg_match(struct bpf_verifier_env *env, int subprog, 2178 2105 struct bpf_reg_state *regs); 2179 2106 int btf_check_subprog_call(struct bpf_verifier_env *env, int subprog, 2180 2107 struct bpf_reg_state *regs); 2181 - int btf_check_kfunc_arg_match(struct bpf_verifier_env *env, 2182 - const struct btf *btf, u32 func_id, 2183 - struct bpf_reg_state *regs, 2184 - struct bpf_kfunc_arg_meta *meta); 2185 2108 int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog, 2186 2109 struct bpf_reg_state *reg); 2187 2110 int btf_check_type_match(struct bpf_verifier_log *log, const struct bpf_prog *prog, ··· 2393 2338 } 2394 2339 2395 2340 static inline int btf_struct_access(struct bpf_verifier_log *log, 2396 - const struct btf *btf, 2397 - const struct btf_type *t, int off, int size, 2398 - enum bpf_access_type atype, 2341 + const struct bpf_reg_state *reg, 2342 + int off, int size, enum bpf_access_type atype, 2399 2343 u32 *next_btf_id, enum bpf_type_flag *flag) 2400 2344 { 2401 2345 return -EACCES; ··· 2851 2797 bool has_ref; 2852 2798 }; 2853 2799 #endif /* CONFIG_KEYS */ 2800 + 2801 + static inline bool type_is_alloc(u32 type) 2802 + { 2803 + return type & MEM_ALLOC; 2804 + } 2805 + 2854 2806 #endif /* _LINUX_BPF_H */
+33 -4
include/linux/bpf_verifier.h
··· 19 19 */ 20 20 #define BPF_MAX_VAR_SIZ (1 << 29) 21 21 /* size of type_str_buf in bpf_verifier. */ 22 - #define TYPE_STR_BUF_LEN 64 22 + #define TYPE_STR_BUF_LEN 128 23 23 24 24 /* Liveness marks, used for registers and spilled-regs (in stack slots). 25 25 * Read marks propagate upwards until they find a write mark; they record that ··· 223 223 * exiting a callback function. 224 224 */ 225 225 int callback_ref; 226 + /* Mark the reference state to release the registers sharing the same id 227 + * on bpf_spin_unlock (for nodes that we will lose ownership to but are 228 + * safe to access inside the critical section). 229 + */ 230 + bool release_on_unlock; 226 231 }; 227 232 228 233 /* state of the program: ··· 328 323 u32 branches; 329 324 u32 insn_idx; 330 325 u32 curframe; 331 - u32 active_spin_lock; 326 + /* For every reg representing a map value or allocated object pointer, 327 + * we consider the tuple of (ptr, id) for them to be unique in verifier 328 + * context and consider them to not alias each other for the purposes of 329 + * tracking lock state. 330 + */ 331 + struct { 332 + /* This can either be reg->map_ptr or reg->btf. If ptr is NULL, 333 + * there's no active lock held, and other fields have no 334 + * meaning. If non-NULL, it indicates that a lock is held and 335 + * id member has the reg->id of the register which can be >= 0.
336 + */ 337 + void *ptr; 338 + /* This will be reg->id */ 339 + u32 id; 340 + } active_lock; 332 341 bool speculative; 342 + bool active_rcu_lock; 333 343 334 344 /* first and last insn idx of this verifier state */ 335 345 u32 first_insn_idx; ··· 439 419 */ 440 420 struct bpf_loop_inline_state loop_inline_state; 441 421 }; 422 + u64 obj_new_size; /* remember the size of type passed to bpf_obj_new to rewrite R1 */ 423 + struct btf_struct_meta *kptr_struct_meta; 442 424 u64 map_key_state; /* constant (32 bit) key tracking for maps */ 443 425 int ctx_field_size; /* the ctx field size for load insn, maybe 0 */ 444 426 u32 seen; /* this insn was processed by the verifier at env->pass_cnt */ 445 427 bool sanitize_stack_spill; /* subject to Spectre v4 sanitation */ 446 428 bool zext_dst; /* this insn zero extends dst reg */ 429 + bool storage_get_func_atomic; /* bpf_*_storage_get() with atomic memory alloc */ 447 430 u8 alu_state; /* used in combination with alu_limit */ 448 431 449 432 /* below fields are initialized once */ ··· 536 513 bool bypass_spec_v1; 537 514 bool bypass_spec_v4; 538 515 bool seen_direct_write; 516 + bool rcu_tag_supported; 539 517 struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */ 540 518 const struct bpf_line_info *prev_linfo; 541 519 struct bpf_verifier_log log; ··· 613 589 int check_func_arg_reg_off(struct bpf_verifier_env *env, 614 590 const struct bpf_reg_state *reg, int regno, 615 591 enum bpf_arg_type arg_type); 616 - int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg, 617 - u32 regno); 618 592 int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg, 619 593 u32 regno, u32 mem_size); 620 594 bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, ··· 681 659 default: 682 660 return true; 683 661 } 662 + } 663 + 664 + #define BPF_REG_TRUSTED_MODIFIERS (MEM_ALLOC | MEM_RCU | PTR_TRUSTED) 665 + 666 + static inline bool bpf_type_has_unsafe_modifiers(u32 type) 667 + { 
668 + return type_flag(type) & ~BPF_REG_TRUSTED_MODIFIERS; 684 669 } 685 670 686 671 #endif /* _LINUX_BPF_VERIFIER_H */
+113 -26
include/linux/btf.h
··· 6 6 7 7 #include <linux/types.h> 8 8 #include <linux/bpfptr.h> 9 + #include <linux/bsearch.h> 10 + #include <linux/btf_ids.h> 9 11 #include <uapi/linux/btf.h> 10 12 #include <uapi/linux/bpf.h> 11 13 ··· 19 17 #define KF_RELEASE (1 << 1) /* kfunc is a release function */ 20 18 #define KF_RET_NULL (1 << 2) /* kfunc returns a pointer that may be NULL */ 21 19 #define KF_KPTR_GET (1 << 3) /* kfunc returns reference to a kptr */ 22 - /* Trusted arguments are those which are meant to be referenced arguments with 23 - * unchanged offset. It is used to enforce that pointers obtained from acquire 24 - * kfuncs remain unmodified when being passed to helpers taking trusted args. 20 + /* Trusted arguments are those which are guaranteed to be valid when passed to 21 + * the kfunc. It is used to enforce that pointers obtained from either acquire 22 + * kfuncs, or from the main kernel on a tracepoint or struct_ops callback 23 + * invocation, remain unmodified when being passed to helpers taking trusted 24 + * args. 25 25 * 26 - * Consider 27 - * struct foo { 28 - * int data; 29 - * struct foo *next; 30 - * }; 26 + * Consider, for example, the following new task tracepoint: 31 27 * 32 - * struct bar { 33 - * int data; 34 - * struct foo f; 35 - * }; 28 + * SEC("tp_btf/task_newtask") 29 + * int BPF_PROG(new_task_tp, struct task_struct *task, u64 clone_flags) 30 + * { 31 + * ... 
32 + * } 36 33 * 37 - * struct foo *f = alloc_foo(); // Acquire kfunc 38 - * struct bar *b = alloc_bar(); // Acquire kfunc 34 + * And the following kfunc: 39 35 * 40 - * If a kfunc set_foo_data() wants to operate only on the allocated object, it 41 - * will set the KF_TRUSTED_ARGS flag, which will prevent unsafe usage like: 36 + * BTF_ID_FLAGS(func, bpf_task_acquire, KF_ACQUIRE | KF_TRUSTED_ARGS) 42 37 * 43 - * set_foo_data(f, 42); // Allowed 44 - * set_foo_data(f->next, 42); // Rejected, non-referenced pointer 45 - * set_foo_data(&f->next, 42);// Rejected, referenced, but wrong type 46 - * set_foo_data(&b->f, 42); // Rejected, referenced, but bad offset 38 + * All invocations to the kfunc must pass the unmodified, unwalked task: 47 39 * 48 - * In the final case, usually for the purposes of type matching, it is deduced 49 - * by looking at the type of the member at the offset, but due to the 50 - * requirement of trusted argument, this deduction will be strict and not done 51 - * for this case. 40 + * bpf_task_acquire(task); // Allowed 41 + * bpf_task_acquire(task->last_wakee); // Rejected, walked task 42 + * 43 + * Programs may also pass referenced tasks directly to the kfunc: 44 + * 45 + * struct task_struct *acquired; 46 + * 47 + * acquired = bpf_task_acquire(task); // Allowed, same as above 48 + * bpf_task_acquire(acquired); // Allowed 49 + * bpf_task_acquire(task); // Allowed 50 + * bpf_task_acquire(acquired->last_wakee); // Rejected, walked task 51 + * 52 + * Programs may _not_, however, pass a task from an arbitrary fentry/fexit, or 53 + * kprobe/kretprobe to the kfunc, as BPF cannot guarantee that all of these 54 + * pointers are guaranteed to be safe. 
For example, the following BPF program 55 + * would be rejected: 56 + * 57 + * SEC("kretprobe/free_task") 58 + * int BPF_PROG(free_task_probe, struct task_struct *tsk) 59 + * { 60 + * struct task_struct *acquired; 61 + * 62 + * acquired = bpf_task_acquire(acquired); // Rejected, not a trusted pointer 63 + * bpf_task_release(acquired); 64 + * 65 + * return 0; 66 + * } 52 67 */ 53 68 #define KF_TRUSTED_ARGS (1 << 4) /* kfunc only takes trusted pointer arguments */ 54 69 #define KF_SLEEPABLE (1 << 5) /* kfunc may sleep */ ··· 95 76 struct btf_id_dtor_kfunc { 96 77 u32 btf_id; 97 78 u32 kfunc_btf_id; 79 + }; 80 + 81 + struct btf_struct_meta { 82 + u32 btf_id; 83 + struct btf_record *record; 84 + struct btf_field_offs *field_offs; 85 + }; 86 + 87 + struct btf_struct_metas { 88 + u32 cnt; 89 + struct btf_struct_meta types[]; 98 90 }; 99 91 100 92 typedef void (*btf_dtor_kfunc_t)(void *); ··· 195 165 int btf_find_timer(const struct btf *btf, const struct btf_type *t); 196 166 struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type *t, 197 167 u32 field_mask, u32 value_size); 168 + int btf_check_and_fixup_fields(const struct btf *btf, struct btf_record *rec); 198 169 struct btf_field_offs *btf_parse_field_offs(struct btf_record *rec); 199 170 bool btf_type_is_void(const struct btf_type *t); 200 171 s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind); ··· 355 324 return kind == BTF_KIND_STRUCT || kind == BTF_KIND_UNION; 356 325 } 357 326 327 + static inline bool __btf_type_is_struct(const struct btf_type *t) 328 + { 329 + return BTF_INFO_KIND(t->info) == BTF_KIND_STRUCT; 330 + } 331 + 332 + static inline bool btf_type_is_array(const struct btf_type *t) 333 + { 334 + return BTF_INFO_KIND(t->info) == BTF_KIND_ARRAY; 335 + } 336 + 358 337 static inline u16 btf_type_vlen(const struct btf_type *t) 359 338 { 360 339 return BTF_INFO_VLEN(t->info); ··· 449 408 return (struct btf_param *)(t + 1); 450 409 } 451 410 452 - #ifdef 
CONFIG_BPF_SYSCALL 453 - struct bpf_prog; 411 + static inline int btf_id_cmp_func(const void *a, const void *b) 412 + { 413 + const int *pa = a, *pb = b; 454 414 415 + return *pa - *pb; 416 + } 417 + 418 + static inline bool btf_id_set_contains(const struct btf_id_set *set, u32 id) 419 + { 420 + return bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func) != NULL; 421 + } 422 + 423 + static inline void *btf_id_set8_contains(const struct btf_id_set8 *set, u32 id) 424 + { 425 + return bsearch(&id, set->pairs, set->cnt, sizeof(set->pairs[0]), btf_id_cmp_func); 426 + } 427 + 428 + struct bpf_prog; 429 + struct bpf_verifier_log; 430 + 431 + #ifdef CONFIG_BPF_SYSCALL 455 432 const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id); 456 433 const char *btf_name_by_offset(const struct btf *btf, u32 offset); 457 434 struct btf *btf_parse_vmlinux(void); ··· 482 423 s32 btf_find_dtor_kfunc(struct btf *btf, u32 btf_id); 483 424 int register_btf_id_dtor_kfuncs(const struct btf_id_dtor_kfunc *dtors, u32 add_cnt, 484 425 struct module *owner); 426 + struct btf_struct_meta *btf_find_struct_meta(const struct btf *btf, u32 btf_id); 427 + const struct btf_member * 428 + btf_get_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf, 429 + const struct btf_type *t, enum bpf_prog_type prog_type, 430 + int arg); 431 + int get_kern_ctx_btf_id(struct bpf_verifier_log *log, enum bpf_prog_type prog_type); 432 + bool btf_types_are_same(const struct btf *btf1, u32 id1, 433 + const struct btf *btf2, u32 id2); 485 434 #else 486 435 static inline const struct btf_type *btf_type_by_id(const struct btf *btf, 487 436 u32 type_id) ··· 520 453 u32 add_cnt, struct module *owner) 521 454 { 522 455 return 0; 456 + } 457 + static inline struct btf_struct_meta *btf_find_struct_meta(const struct btf *btf, u32 btf_id) 458 + { 459 + return NULL; 460 + } 461 + static inline const struct btf_member * 462 + btf_get_prog_ctx_type(struct bpf_verifier_log *log, const struct btf 
*btf, 463 + const struct btf_type *t, enum bpf_prog_type prog_type, 464 + int arg) 465 + { 466 + return NULL; 467 + } 468 + static inline int get_kern_ctx_btf_id(struct bpf_verifier_log *log, 469 + enum bpf_prog_type prog_type) { 470 + return -EINVAL; 471 + } 472 + static inline bool btf_types_are_same(const struct btf *btf1, u32 id1, 473 + const struct btf *btf2, u32 id2) 474 + { 475 + return false; 523 476 } 524 477 #endif 525 478
+1 -1
include/linux/btf_ids.h
··· 204 204 205 205 #else 206 206 207 - #define BTF_ID_LIST(name) static u32 __maybe_unused name[5]; 207 + #define BTF_ID_LIST(name) static u32 __maybe_unused name[16]; 208 208 #define BTF_ID(prefix, name) 209 209 #define BTF_ID_FLAGS(prefix, name, ...) 210 210 #define BTF_ID_UNUSED
+2 -1
include/linux/compiler_types.h
··· 49 49 # endif 50 50 # define __iomem 51 51 # define __percpu BTF_TYPE_TAG(percpu) 52 - # define __rcu 52 + # define __rcu BTF_TYPE_TAG(rcu) 53 + 53 54 # define __chk_user_ptr(x) (void)0 54 55 # define __chk_io_ptr(x) (void)0 55 56 /* context/locking */
+10 -10
include/linux/filter.h
··· 568 568 DECLARE_STATIC_KEY_FALSE(bpf_stats_enabled_key); 569 569 570 570 extern struct mutex nf_conn_btf_access_lock; 571 - extern int (*nfct_btf_struct_access)(struct bpf_verifier_log *log, const struct btf *btf, 572 - const struct btf_type *t, int off, int size, 573 - enum bpf_access_type atype, u32 *next_btf_id, 574 - enum bpf_type_flag *flag); 571 + extern int (*nfct_btf_struct_access)(struct bpf_verifier_log *log, 572 + const struct bpf_reg_state *reg, 573 + int off, int size, enum bpf_access_type atype, 574 + u32 *next_btf_id, enum bpf_type_flag *flag); 575 575 576 576 typedef unsigned int (*bpf_dispatcher_fn)(const void *ctx, 577 577 const struct bpf_insn *insnsi, ··· 643 643 }; 644 644 645 645 struct bpf_redirect_info { 646 - u32 flags; 647 - u32 tgt_index; 646 + u64 tgt_index; 648 647 void *tgt_value; 649 648 struct bpf_map *map; 649 + u32 flags; 650 + u32 kern_flags; 650 651 u32 map_id; 651 652 enum bpf_map_type map_type; 652 - u32 kern_flags; 653 653 struct bpf_nh_params nh; 654 654 }; 655 655 ··· 1504 1504 } 1505 1505 #endif /* IS_ENABLED(CONFIG_IPV6) */ 1506 1506 1507 - static __always_inline int __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifindex, 1507 + static __always_inline int __bpf_xdp_redirect_map(struct bpf_map *map, u64 index, 1508 1508 u64 flags, const u64 flag_mask, 1509 1509 void *lookup_elem(struct bpf_map *map, u32 key)) 1510 1510 { ··· 1515 1515 if (unlikely(flags & ~(action_mask | flag_mask))) 1516 1516 return XDP_ABORTED; 1517 1517 1518 - ri->tgt_value = lookup_elem(map, ifindex); 1518 + ri->tgt_value = lookup_elem(map, index); 1519 1519 if (unlikely(!ri->tgt_value) && !(flags & BPF_F_BROADCAST)) { 1520 1520 /* If the lookup fails we want to clear out the state in the 1521 1521 * redirect_info struct completely, so that if an eBPF program ··· 1527 1527 return flags & action_mask; 1528 1528 } 1529 1529 1530 - ri->tgt_index = ifindex; 1530 + ri->tgt_index = index; 1531 1531 ri->map_id = map->id; 1532 1532 ri->map_type = 
map->map_type; 1533 1533
+1 -1
include/linux/netdevice.h
··· 3135 3135 /* stats */ 3136 3136 unsigned int processed; 3137 3137 unsigned int time_squeeze; 3138 - unsigned int received_rps; 3139 3138 #ifdef CONFIG_RPS 3140 3139 struct softnet_data *rps_ipi_list; 3141 3140 #endif ··· 3167 3168 unsigned int cpu; 3168 3169 unsigned int input_queue_tail; 3169 3170 #endif 3171 + unsigned int received_rps; 3170 3172 unsigned int dropped; 3171 3173 struct sk_buff_head input_pkt_queue; 3172 3174 struct napi_struct backlog;
+23 -10
include/uapi/linux/bpf.h
··· 2584 2584 * * **SOL_SOCKET**, which supports the following *optname*\ s: 2585 2585 * **SO_RCVBUF**, **SO_SNDBUF**, **SO_MAX_PACING_RATE**, 2586 2586 * **SO_PRIORITY**, **SO_RCVLOWAT**, **SO_MARK**, 2587 - * **SO_BINDTODEVICE**, **SO_KEEPALIVE**. 2587 + * **SO_BINDTODEVICE**, **SO_KEEPALIVE**, **SO_REUSEADDR**, 2588 + * **SO_REUSEPORT**, **SO_BINDTOIFINDEX**, **SO_TXREHASH**. 2588 2589 * * **IPPROTO_TCP**, which supports the following *optname*\ s: 2589 2590 * **TCP_CONGESTION**, **TCP_BPF_IW**, 2590 2591 * **TCP_BPF_SNDCWND_CLAMP**, **TCP_SAVE_SYN**, 2591 2592 * **TCP_KEEPIDLE**, **TCP_KEEPINTVL**, **TCP_KEEPCNT**, 2592 - * **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**. 2593 + * **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**, 2594 + * **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**, 2595 + * **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**, 2596 + * **TCP_BPF_RTO_MIN**. 2593 2597 * * **IPPROTO_IP**, which supports *optname* **IP_TOS**. 2594 - * * **IPPROTO_IPV6**, which supports *optname* **IPV6_TCLASS**. 2598 + * * **IPPROTO_IPV6**, which supports the following *optname*\ s: 2599 + * **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**. 2595 2600 * Return 2596 2601 * 0 on success, or a negative error in case of failure. 2597 2602 * ··· 2652 2647 * Return 2653 2648 * 0 on success, or a negative error in case of failure. 2654 2649 * 2655 - * long bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags) 2650 + * long bpf_redirect_map(struct bpf_map *map, u64 key, u64 flags) 2656 2651 * Description 2657 2652 * Redirect the packet to the endpoint referenced by *map* at 2658 2653 * index *key*. Depending on its type, this *map* can contain ··· 2813 2808 * and **BPF_CGROUP_INET6_CONNECT**. 2814 2809 * 2815 2810 * This helper actually implements a subset of **getsockopt()**. 2816 - * It supports the following *level*\ s: 2817 - * 2818 - * * **IPPROTO_TCP**, which supports *optname* 2819 - * **TCP_CONGESTION**. 
2820 - * * **IPPROTO_IP**, which supports *optname* **IP_TOS**. 2821 - * * **IPPROTO_IPV6**, which supports *optname* **IPV6_TCLASS**. 2811 + * It supports the same set of *optname*\ s that is supported by 2812 + * the **bpf_setsockopt**\ () helper. The exceptions are 2813 + * **TCP_BPF_*** is **bpf_setsockopt**\ () only and 2814 + * **TCP_SAVED_SYN** is **bpf_getsockopt**\ () only. 2822 2815 * Return 2823 2816 * 0 on success, or a negative error in case of failure. 2824 2817 * ··· 6887 6884 } __attribute__((aligned(8))); 6888 6885 6889 6886 struct bpf_dynptr { 6887 + __u64 :64; 6888 + __u64 :64; 6889 + } __attribute__((aligned(8))); 6890 + 6891 + struct bpf_list_head { 6892 + __u64 :64; 6893 + __u64 :64; 6894 + } __attribute__((aligned(8))); 6895 + 6896 + struct bpf_list_node { 6890 6897 __u64 :64; 6891 6898 __u64 :64; 6892 6899 } __attribute__((aligned(8)));
-1
kernel/bpf/arraymap.c
··· 430 430 for (i = 0; i < array->map.max_entries; i++) 431 431 bpf_obj_free_fields(map->record, array_map_elem_ptr(array, i)); 432 432 } 433 - bpf_map_free_record(map); 434 433 } 435 434 436 435 if (array->map.map_type == BPF_MAP_TYPE_PERCPU_ARRAY)
+4 -2
kernel/bpf/bpf_lsm.c
··· 151 151 static const struct bpf_func_proto bpf_ima_inode_hash_proto = { 152 152 .func = bpf_ima_inode_hash, 153 153 .gpl_only = false, 154 + .might_sleep = true, 154 155 .ret_type = RET_INTEGER, 155 156 .arg1_type = ARG_PTR_TO_BTF_ID, 156 157 .arg1_btf_id = &bpf_ima_inode_hash_btf_ids[0], ··· 170 169 static const struct bpf_func_proto bpf_ima_file_hash_proto = { 171 170 .func = bpf_ima_file_hash, 172 171 .gpl_only = false, 172 + .might_sleep = true, 173 173 .ret_type = RET_INTEGER, 174 174 .arg1_type = ARG_PTR_TO_BTF_ID, 175 175 .arg1_btf_id = &bpf_ima_file_hash_btf_ids[0], ··· 223 221 case BPF_FUNC_bprm_opts_set: 224 222 return &bpf_bprm_opts_set_proto; 225 223 case BPF_FUNC_ima_inode_hash: 226 - return prog->aux->sleepable ? &bpf_ima_inode_hash_proto : NULL; 224 + return &bpf_ima_inode_hash_proto; 227 225 case BPF_FUNC_ima_file_hash: 228 - return prog->aux->sleepable ? &bpf_ima_file_hash_proto : NULL; 226 + return &bpf_ima_file_hash_proto; 229 227 case BPF_FUNC_get_attach_cookie: 230 228 return bpf_prog_has_trampoline(prog) ? &bpf_get_attach_cookie_proto : NULL; 231 229 #ifdef CONFIG_NET
+478 -404
kernel/bpf/btf.c
··· 199 199 DEFINE_SPINLOCK(btf_idr_lock); 200 200 201 201 enum btf_kfunc_hook { 202 + BTF_KFUNC_HOOK_COMMON, 202 203 BTF_KFUNC_HOOK_XDP, 203 204 BTF_KFUNC_HOOK_TC, 204 205 BTF_KFUNC_HOOK_STRUCT_OPS, ··· 238 237 struct rcu_head rcu; 239 238 struct btf_kfunc_set_tab *kfunc_set_tab; 240 239 struct btf_id_dtor_kfunc_tab *dtor_kfunc_tab; 240 + struct btf_struct_metas *struct_meta_tab; 241 241 242 242 /* split BTF support */ 243 243 struct btf *base_btf; ··· 477 475 static bool btf_type_nosize_or_null(const struct btf_type *t) 478 476 { 479 477 return !t || btf_type_nosize(t); 480 - } 481 - 482 - static bool __btf_type_is_struct(const struct btf_type *t) 483 - { 484 - return BTF_INFO_KIND(t->info) == BTF_KIND_STRUCT; 485 - } 486 - 487 - static bool btf_type_is_array(const struct btf_type *t) 488 - { 489 - return BTF_INFO_KIND(t->info) == BTF_KIND_ARRAY; 490 478 } 491 479 492 480 static bool btf_type_is_datasec(const struct btf_type *t) ··· 1634 1642 btf->dtor_kfunc_tab = NULL; 1635 1643 } 1636 1644 1645 + static void btf_struct_metas_free(struct btf_struct_metas *tab) 1646 + { 1647 + int i; 1648 + 1649 + if (!tab) 1650 + return; 1651 + for (i = 0; i < tab->cnt; i++) { 1652 + btf_record_free(tab->types[i].record); 1653 + kfree(tab->types[i].field_offs); 1654 + } 1655 + kfree(tab); 1656 + } 1657 + 1658 + static void btf_free_struct_meta_tab(struct btf *btf) 1659 + { 1660 + struct btf_struct_metas *tab = btf->struct_meta_tab; 1661 + 1662 + btf_struct_metas_free(tab); 1663 + btf->struct_meta_tab = NULL; 1664 + } 1665 + 1637 1666 static void btf_free(struct btf *btf) 1638 1667 { 1668 + btf_free_struct_meta_tab(btf); 1639 1669 btf_free_dtor_kfunc_tab(btf); 1640 1670 btf_free_kfunc_set_tab(btf); 1641 1671 kvfree(btf->types); ··· 3219 3205 struct btf_field_info { 3220 3206 enum btf_field_type type; 3221 3207 u32 off; 3222 - struct { 3223 - u32 type_id; 3224 - } kptr; 3208 + union { 3209 + struct { 3210 + u32 type_id; 3211 + } kptr; 3212 + struct { 3213 + const char *node_name; 
3214 + u32 value_btf_id; 3215 + } list_head; 3216 + }; 3225 3217 }; 3226 3218 3227 3219 static int btf_find_struct(const struct btf *btf, const struct btf_type *t, ··· 3281 3261 return BTF_FIELD_FOUND; 3282 3262 } 3283 3263 3264 + static const char *btf_find_decl_tag_value(const struct btf *btf, 3265 + const struct btf_type *pt, 3266 + int comp_idx, const char *tag_key) 3267 + { 3268 + int i; 3269 + 3270 + for (i = 1; i < btf_nr_types(btf); i++) { 3271 + const struct btf_type *t = btf_type_by_id(btf, i); 3272 + int len = strlen(tag_key); 3273 + 3274 + if (!btf_type_is_decl_tag(t)) 3275 + continue; 3276 + if (pt != btf_type_by_id(btf, t->type) || 3277 + btf_type_decl_tag(t)->component_idx != comp_idx) 3278 + continue; 3279 + if (strncmp(__btf_name_by_offset(btf, t->name_off), tag_key, len)) 3280 + continue; 3281 + return __btf_name_by_offset(btf, t->name_off) + len; 3282 + } 3283 + return NULL; 3284 + } 3285 + 3286 + static int btf_find_list_head(const struct btf *btf, const struct btf_type *pt, 3287 + const struct btf_type *t, int comp_idx, 3288 + u32 off, int sz, struct btf_field_info *info) 3289 + { 3290 + const char *value_type; 3291 + const char *list_node; 3292 + s32 id; 3293 + 3294 + if (!__btf_type_is_struct(t)) 3295 + return BTF_FIELD_IGNORE; 3296 + if (t->size != sz) 3297 + return BTF_FIELD_IGNORE; 3298 + value_type = btf_find_decl_tag_value(btf, pt, comp_idx, "contains:"); 3299 + if (!value_type) 3300 + return -EINVAL; 3301 + list_node = strstr(value_type, ":"); 3302 + if (!list_node) 3303 + return -EINVAL; 3304 + value_type = kstrndup(value_type, list_node - value_type, GFP_KERNEL | __GFP_NOWARN); 3305 + if (!value_type) 3306 + return -ENOMEM; 3307 + id = btf_find_by_name_kind(btf, value_type, BTF_KIND_STRUCT); 3308 + kfree(value_type); 3309 + if (id < 0) 3310 + return id; 3311 + list_node++; 3312 + if (str_is_empty(list_node)) 3313 + return -EINVAL; 3314 + info->type = BPF_LIST_HEAD; 3315 + info->off = off; 3316 + info->list_head.value_btf_id = id; 3317 
+ info->list_head.node_name = list_node; 3318 + return BTF_FIELD_FOUND; 3319 + } 3320 + 3284 3321 static int btf_get_field_type(const char *name, u32 field_mask, u32 *seen_mask, 3285 3322 int *align, int *sz) 3286 3323 { ··· 3358 3281 return -E2BIG; 3359 3282 *seen_mask |= BPF_TIMER; 3360 3283 type = BPF_TIMER; 3284 + goto end; 3285 + } 3286 + } 3287 + if (field_mask & BPF_LIST_HEAD) { 3288 + if (!strcmp(name, "bpf_list_head")) { 3289 + type = BPF_LIST_HEAD; 3290 + goto end; 3291 + } 3292 + } 3293 + if (field_mask & BPF_LIST_NODE) { 3294 + if (!strcmp(name, "bpf_list_node")) { 3295 + type = BPF_LIST_NODE; 3361 3296 goto end; 3362 3297 } 3363 3298 } ··· 3416 3327 switch (field_type) { 3417 3328 case BPF_SPIN_LOCK: 3418 3329 case BPF_TIMER: 3330 + case BPF_LIST_NODE: 3419 3331 ret = btf_find_struct(btf, member_type, off, sz, field_type, 3420 3332 idx < info_cnt ? &info[idx] : &tmp); 3421 3333 if (ret < 0) ··· 3426 3336 case BPF_KPTR_REF: 3427 3337 ret = btf_find_kptr(btf, member_type, off, sz, 3428 3338 idx < info_cnt ? &info[idx] : &tmp); 3339 + if (ret < 0) 3340 + return ret; 3341 + break; 3342 + case BPF_LIST_HEAD: 3343 + ret = btf_find_list_head(btf, t, member_type, i, off, sz, 3344 + idx < info_cnt ? &info[idx] : &tmp); 3429 3345 if (ret < 0) 3430 3346 return ret; 3431 3347 break; ··· 3477 3381 switch (field_type) { 3478 3382 case BPF_SPIN_LOCK: 3479 3383 case BPF_TIMER: 3384 + case BPF_LIST_NODE: 3480 3385 ret = btf_find_struct(btf, var_type, off, sz, field_type, 3481 3386 idx < info_cnt ? &info[idx] : &tmp); 3482 3387 if (ret < 0) ··· 3487 3390 case BPF_KPTR_REF: 3488 3391 ret = btf_find_kptr(btf, var_type, off, sz, 3489 3392 idx < info_cnt ? &info[idx] : &tmp); 3393 + if (ret < 0) 3394 + return ret; 3395 + break; 3396 + case BPF_LIST_HEAD: 3397 + ret = btf_find_list_head(btf, var, var_type, -1, off, sz, 3398 + idx < info_cnt ? 
&info[idx] : &tmp); 3490 3399 if (ret < 0) 3491 3400 return ret; 3492 3401 break; ··· 3594 3491 return ret; 3595 3492 } 3596 3493 3494 + static int btf_parse_list_head(const struct btf *btf, struct btf_field *field, 3495 + struct btf_field_info *info) 3496 + { 3497 + const struct btf_type *t, *n = NULL; 3498 + const struct btf_member *member; 3499 + u32 offset; 3500 + int i; 3501 + 3502 + t = btf_type_by_id(btf, info->list_head.value_btf_id); 3503 + /* We've already checked that value_btf_id is a struct type. We 3504 + * just need to figure out the offset of the list_node, and 3505 + * verify its type. 3506 + */ 3507 + for_each_member(i, t, member) { 3508 + if (strcmp(info->list_head.node_name, __btf_name_by_offset(btf, member->name_off))) 3509 + continue; 3510 + /* Invalid BTF, two members with same name */ 3511 + if (n) 3512 + return -EINVAL; 3513 + n = btf_type_by_id(btf, member->type); 3514 + if (!__btf_type_is_struct(n)) 3515 + return -EINVAL; 3516 + if (strcmp("bpf_list_node", __btf_name_by_offset(btf, n->name_off))) 3517 + return -EINVAL; 3518 + offset = __btf_member_bit_offset(n, member); 3519 + if (offset % 8) 3520 + return -EINVAL; 3521 + offset /= 8; 3522 + if (offset % __alignof__(struct bpf_list_node)) 3523 + return -EINVAL; 3524 + 3525 + field->list_head.btf = (struct btf *)btf; 3526 + field->list_head.value_btf_id = info->list_head.value_btf_id; 3527 + field->list_head.node_offset = offset; 3528 + } 3529 + if (!n) 3530 + return -ENOENT; 3531 + return 0; 3532 + } 3533 + 3597 3534 struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type *t, 3598 3535 u32 field_mask, u32 value_size) 3599 3536 { 3600 3537 struct btf_field_info info_arr[BTF_FIELDS_MAX]; 3601 3538 struct btf_record *rec; 3539 + u32 next_off = 0; 3602 3540 int ret, i, cnt; 3603 3541 3604 3542 ret = btf_find_field(btf, t, field_mask, info_arr, ARRAY_SIZE(info_arr)); ··· 3649 3505 return NULL; 3650 3506 3651 3507 cnt = ret; 3508 + /* This needs to be kzalloc to zero 
out padding and unused fields, see 3509 + * comment in btf_record_equal. 3510 + */ 3652 3511 rec = kzalloc(offsetof(struct btf_record, fields[cnt]), GFP_KERNEL | __GFP_NOWARN); 3653 3512 if (!rec) 3654 3513 return ERR_PTR(-ENOMEM); ··· 3664 3517 ret = -EFAULT; 3665 3518 goto end; 3666 3519 } 3520 + if (info_arr[i].off < next_off) { 3521 + ret = -EEXIST; 3522 + goto end; 3523 + } 3524 + next_off = info_arr[i].off + btf_field_type_size(info_arr[i].type); 3667 3525 3668 3526 rec->field_mask |= info_arr[i].type; 3669 3527 rec->fields[i].offset = info_arr[i].off; ··· 3691 3539 if (ret < 0) 3692 3540 goto end; 3693 3541 break; 3542 + case BPF_LIST_HEAD: 3543 + ret = btf_parse_list_head(btf, &rec->fields[i], &info_arr[i]); 3544 + if (ret < 0) 3545 + goto end; 3546 + break; 3547 + case BPF_LIST_NODE: 3548 + break; 3694 3549 default: 3695 3550 ret = -EFAULT; 3696 3551 goto end; 3697 3552 } 3698 3553 rec->cnt++; 3699 3554 } 3555 + 3556 + /* bpf_list_head requires bpf_spin_lock */ 3557 + if (btf_record_has_field(rec, BPF_LIST_HEAD) && rec->spin_lock_off < 0) { 3558 + ret = -EINVAL; 3559 + goto end; 3560 + } 3561 + 3700 3562 return rec; 3701 3563 end: 3702 3564 btf_record_free(rec); 3703 3565 return ERR_PTR(ret); 3566 + } 3567 + 3568 + int btf_check_and_fixup_fields(const struct btf *btf, struct btf_record *rec) 3569 + { 3570 + int i; 3571 + 3572 + /* There are two owning types, kptr_ref and bpf_list_head. The former 3573 + * only supports storing kernel types, which can never store references 3574 + * to program allocated local types, atleast not yet. Hence we only need 3575 + * to ensure that bpf_list_head ownership does not form cycles. 
3576 + */ 3577 + if (IS_ERR_OR_NULL(rec) || !(rec->field_mask & BPF_LIST_HEAD)) 3578 + return 0; 3579 + for (i = 0; i < rec->cnt; i++) { 3580 + struct btf_struct_meta *meta; 3581 + u32 btf_id; 3582 + 3583 + if (!(rec->fields[i].type & BPF_LIST_HEAD)) 3584 + continue; 3585 + btf_id = rec->fields[i].list_head.value_btf_id; 3586 + meta = btf_find_struct_meta(btf, btf_id); 3587 + if (!meta) 3588 + return -EFAULT; 3589 + rec->fields[i].list_head.value_rec = meta->record; 3590 + 3591 + if (!(rec->field_mask & BPF_LIST_NODE)) 3592 + continue; 3593 + 3594 + /* We need to ensure ownership acyclicity among all types. The 3595 + * proper way to do it would be to topologically sort all BTF 3596 + * IDs based on the ownership edges, since there can be multiple 3597 + * bpf_list_head in a type. Instead, we use the following 3598 + * reasoning: 3599 + * 3600 + * - A type can only be owned by another type in user BTF if it 3601 + * has a bpf_list_node. 3602 + * - A type can only _own_ another type in user BTF if it has a 3603 + * bpf_list_head. 3604 + * 3605 + * We ensure that if a type has both bpf_list_head and 3606 + * bpf_list_node, its element types cannot be owning types. 3607 + * 3608 + * To ensure acyclicity: 3609 + * 3610 + * When A only has bpf_list_head, ownership chain can be: 3611 + * A -> B -> C 3612 + * Where: 3613 + * - B has both bpf_list_head and bpf_list_node. 3614 + * - C only has bpf_list_node. 3615 + * 3616 + * When A has both bpf_list_head and bpf_list_node, some other 3617 + * type already owns it in the BTF domain, hence it can not own 3618 + * another owning type through any of the bpf_list_head edges. 3619 + * A -> B 3620 + * Where: 3621 + * - B only has bpf_list_node. 
3622 + */ 3623 + if (meta->record->field_mask & BPF_LIST_HEAD) 3624 + return -ELOOP; 3625 + } 3626 + return 0; 3704 3627 } 3705 3628 3706 3629 static int btf_field_offs_cmp(const void *_a, const void *_b, const void *priv) ··· 3811 3584 u8 *sz; 3812 3585 3813 3586 BUILD_BUG_ON(ARRAY_SIZE(foffs->field_off) != ARRAY_SIZE(foffs->field_sz)); 3814 - if (IS_ERR_OR_NULL(rec) || WARN_ON_ONCE(rec->cnt > sizeof(foffs->field_off))) 3587 + if (IS_ERR_OR_NULL(rec)) 3815 3588 return NULL; 3816 3589 3817 3590 foffs = kzalloc(sizeof(*foffs), GFP_KERNEL | __GFP_NOWARN); ··· 4779 4552 nr_args--; 4780 4553 } 4781 4554 4782 - err = 0; 4783 4555 for (i = 0; i < nr_args; i++) { 4784 4556 const struct btf_type *arg_type; 4785 4557 u32 arg_type_id; ··· 4787 4561 arg_type = btf_type_by_id(btf, arg_type_id); 4788 4562 if (!arg_type) { 4789 4563 btf_verifier_log_type(env, t, "Invalid arg#%u", i + 1); 4790 - err = -EINVAL; 4791 - break; 4564 + return -EINVAL; 4565 + } 4566 + 4567 + if (btf_type_is_resolve_source_only(arg_type)) { 4568 + btf_verifier_log_type(env, t, "Invalid arg#%u", i + 1); 4569 + return -EINVAL; 4792 4570 } 4793 4571 4794 4572 if (args[i].name_off && ··· 4800 4570 !btf_name_valid_identifier(btf, args[i].name_off))) { 4801 4571 btf_verifier_log_type(env, t, 4802 4572 "Invalid arg#%u", i + 1); 4803 - err = -EINVAL; 4804 - break; 4573 + return -EINVAL; 4805 4574 } 4806 4575 4807 4576 if (btf_type_needs_resolve(arg_type) && 4808 4577 !env_type_is_resolved(env, arg_type_id)) { 4809 4578 err = btf_resolve(env, arg_type, arg_type_id); 4810 4579 if (err) 4811 - break; 4580 + return err; 4812 4581 } 4813 4582 4814 4583 if (!btf_type_id_size(btf, &arg_type_id, NULL)) { 4815 4584 btf_verifier_log_type(env, t, "Invalid arg#%u", i + 1); 4816 - err = -EINVAL; 4817 - break; 4585 + return -EINVAL; 4818 4586 } 4819 4587 } 4820 4588 4821 - return err; 4589 + return 0; 4822 4590 } 4823 4591 4824 4592 static int btf_func_check(struct btf_verifier_env *env, ··· 5230 5002 return 
btf_check_sec_info(env, btf_data_size); 5231 5003 } 5232 5004 5005 + static const char *alloc_obj_fields[] = { 5006 + "bpf_spin_lock", 5007 + "bpf_list_head", 5008 + "bpf_list_node", 5009 + }; 5010 + 5011 + static struct btf_struct_metas * 5012 + btf_parse_struct_metas(struct bpf_verifier_log *log, struct btf *btf) 5013 + { 5014 + union { 5015 + struct btf_id_set set; 5016 + struct { 5017 + u32 _cnt; 5018 + u32 _ids[ARRAY_SIZE(alloc_obj_fields)]; 5019 + } _arr; 5020 + } aof; 5021 + struct btf_struct_metas *tab = NULL; 5022 + int i, n, id, ret; 5023 + 5024 + BUILD_BUG_ON(offsetof(struct btf_id_set, cnt) != 0); 5025 + BUILD_BUG_ON(sizeof(struct btf_id_set) != sizeof(u32)); 5026 + 5027 + memset(&aof, 0, sizeof(aof)); 5028 + for (i = 0; i < ARRAY_SIZE(alloc_obj_fields); i++) { 5029 + /* Try to find whether this special type exists in user BTF, and 5030 + * if so remember its ID so we can easily find it among members 5031 + * of structs that we iterate in the next loop. 5032 + */ 5033 + id = btf_find_by_name_kind(btf, alloc_obj_fields[i], BTF_KIND_STRUCT); 5034 + if (id < 0) 5035 + continue; 5036 + aof.set.ids[aof.set.cnt++] = id; 5037 + } 5038 + 5039 + if (!aof.set.cnt) 5040 + return NULL; 5041 + sort(&aof.set.ids, aof.set.cnt, sizeof(aof.set.ids[0]), btf_id_cmp_func, NULL); 5042 + 5043 + n = btf_nr_types(btf); 5044 + for (i = 1; i < n; i++) { 5045 + struct btf_struct_metas *new_tab; 5046 + const struct btf_member *member; 5047 + struct btf_field_offs *foffs; 5048 + struct btf_struct_meta *type; 5049 + struct btf_record *record; 5050 + const struct btf_type *t; 5051 + int j, tab_cnt; 5052 + 5053 + t = btf_type_by_id(btf, i); 5054 + if (!t) { 5055 + ret = -EINVAL; 5056 + goto free; 5057 + } 5058 + if (!__btf_type_is_struct(t)) 5059 + continue; 5060 + 5061 + cond_resched(); 5062 + 5063 + for_each_member(j, t, member) { 5064 + if (btf_id_set_contains(&aof.set, member->type)) 5065 + goto parse; 5066 + } 5067 + continue; 5068 + parse: 5069 + tab_cnt = tab ? 
tab->cnt : 0; 5070 + new_tab = krealloc(tab, offsetof(struct btf_struct_metas, types[tab_cnt + 1]), 5071 + GFP_KERNEL | __GFP_NOWARN); 5072 + if (!new_tab) { 5073 + ret = -ENOMEM; 5074 + goto free; 5075 + } 5076 + if (!tab) 5077 + new_tab->cnt = 0; 5078 + tab = new_tab; 5079 + 5080 + type = &tab->types[tab->cnt]; 5081 + type->btf_id = i; 5082 + record = btf_parse_fields(btf, t, BPF_SPIN_LOCK | BPF_LIST_HEAD | BPF_LIST_NODE, t->size); 5083 + /* The record cannot be unset, treat it as an error if so */ 5084 + if (IS_ERR_OR_NULL(record)) { 5085 + ret = PTR_ERR_OR_ZERO(record) ?: -EFAULT; 5086 + goto free; 5087 + } 5088 + foffs = btf_parse_field_offs(record); 5089 + /* We need the field_offs to be valid for a valid record, 5090 + * either both should be set or both should be unset. 5091 + */ 5092 + if (IS_ERR_OR_NULL(foffs)) { 5093 + btf_record_free(record); 5094 + ret = -EFAULT; 5095 + goto free; 5096 + } 5097 + type->record = record; 5098 + type->field_offs = foffs; 5099 + tab->cnt++; 5100 + } 5101 + return tab; 5102 + free: 5103 + btf_struct_metas_free(tab); 5104 + return ERR_PTR(ret); 5105 + } 5106 + 5107 + struct btf_struct_meta *btf_find_struct_meta(const struct btf *btf, u32 btf_id) 5108 + { 5109 + struct btf_struct_metas *tab; 5110 + 5111 + BUILD_BUG_ON(offsetof(struct btf_struct_meta, btf_id) != 0); 5112 + tab = btf->struct_meta_tab; 5113 + if (!tab) 5114 + return NULL; 5115 + return bsearch(&btf_id, tab->types, tab->cnt, sizeof(tab->types[0]), btf_id_cmp_func); 5116 + } 5117 + 5233 5118 static int btf_check_type_tags(struct btf_verifier_env *env, 5234 5119 struct btf *btf, int start_id) 5235 5120 { ··· 5393 5052 static struct btf *btf_parse(bpfptr_t btf_data, u32 btf_data_size, 5394 5053 u32 log_level, char __user *log_ubuf, u32 log_size) 5395 5054 { 5055 + struct btf_struct_metas *struct_meta_tab; 5396 5056 struct btf_verifier_env *env = NULL; 5397 5057 struct bpf_verifier_log *log; 5398 5058 struct btf *btf = NULL; ··· 5462 5120 if (err) 5463 5121 goto 
errout; 5464 5122 5123 + struct_meta_tab = btf_parse_struct_metas(log, btf); 5124 + if (IS_ERR(struct_meta_tab)) { 5125 + err = PTR_ERR(struct_meta_tab); 5126 + goto errout; 5127 + } 5128 + btf->struct_meta_tab = struct_meta_tab; 5129 + 5130 + if (struct_meta_tab) { 5131 + int i; 5132 + 5133 + for (i = 0; i < struct_meta_tab->cnt; i++) { 5134 + err = btf_check_and_fixup_fields(btf, struct_meta_tab->types[i].record); 5135 + if (err < 0) 5136 + goto errout_meta; 5137 + } 5138 + } 5139 + 5465 5140 if (log->level && bpf_verifier_log_full(log)) { 5466 5141 err = -ENOSPC; 5467 - goto errout; 5142 + goto errout_meta; 5468 5143 } 5469 5144 5470 5145 btf_verifier_env_free(env); 5471 5146 refcount_set(&btf->refcnt, 1); 5472 5147 return btf; 5473 5148 5149 + errout_meta: 5150 + btf_free_struct_meta_tab(btf); 5474 5151 errout: 5475 5152 btf_verifier_env_free(env); 5476 5153 if (btf) ··· 5531 5170 #undef BPF_MAP_TYPE 5532 5171 #undef BPF_LINK_TYPE 5533 5172 5534 - static const struct btf_member * 5173 + const struct btf_member * 5535 5174 btf_get_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf, 5536 5175 const struct btf_type *t, enum bpf_prog_type prog_type, 5537 5176 int arg) ··· 5602 5241 return -ENOENT; 5603 5242 kern_ctx_type = prog_ctx_type + 1; 5604 5243 return kern_ctx_type->type; 5244 + } 5245 + 5246 + int get_kern_ctx_btf_id(struct bpf_verifier_log *log, enum bpf_prog_type prog_type) 5247 + { 5248 + const struct btf_member *kctx_member; 5249 + const struct btf_type *conv_struct; 5250 + const struct btf_type *kctx_type; 5251 + u32 kctx_type_id; 5252 + 5253 + conv_struct = bpf_ctx_convert.t; 5254 + /* get member for kernel ctx type */ 5255 + kctx_member = btf_type_member(conv_struct) + bpf_ctx_convert_map[prog_type] * 2 + 1; 5256 + kctx_type_id = kctx_member->type; 5257 + kctx_type = btf_type_by_id(btf_vmlinux, kctx_type_id); 5258 + if (!btf_type_is_struct(kctx_type)) { 5259 + bpf_log(log, "kern ctx type id %u is not a struct\n", kctx_type_id); 5260 + 
return -EINVAL; 5261 + } 5262 + 5263 + return kctx_type_id; 5605 5264 } 5606 5265 5607 5266 BTF_ID_LIST(bpf_ctx_convert_btf_id) ··· 5821 5440 return nr_args + 1; 5822 5441 } 5823 5442 5443 + static bool prog_args_trusted(const struct bpf_prog *prog) 5444 + { 5445 + enum bpf_attach_type atype = prog->expected_attach_type; 5446 + 5447 + switch (prog->type) { 5448 + case BPF_PROG_TYPE_TRACING: 5449 + return atype == BPF_TRACE_RAW_TP || atype == BPF_TRACE_ITER; 5450 + case BPF_PROG_TYPE_LSM: 5451 + case BPF_PROG_TYPE_STRUCT_OPS: 5452 + return true; 5453 + default: 5454 + return false; 5455 + } 5456 + } 5457 + 5824 5458 bool btf_ctx_access(int off, int size, enum bpf_access_type type, 5825 5459 const struct bpf_prog *prog, 5826 5460 struct bpf_insn_access_aux *info) ··· 5979 5583 } 5980 5584 5981 5585 info->reg_type = PTR_TO_BTF_ID; 5586 + if (prog_args_trusted(prog)) 5587 + info->reg_type |= PTR_TRUSTED; 5588 + 5982 5589 if (tgt_prog) { 5983 5590 enum bpf_prog_type tgt_type; 5984 5591 ··· 6248 5849 /* check __percpu tag */ 6249 5850 if (strcmp(tag_value, "percpu") == 0) 6250 5851 tmp_flag = MEM_PERCPU; 5852 + /* check __rcu tag */ 5853 + if (strcmp(tag_value, "rcu") == 0) 5854 + tmp_flag = MEM_RCU; 6251 5855 } 6252 5856 6253 5857 stype = btf_type_skip_modifiers(btf, mtype->type, &id); ··· 6280 5878 return -EINVAL; 6281 5879 } 6282 5880 6283 - int btf_struct_access(struct bpf_verifier_log *log, const struct btf *btf, 6284 - const struct btf_type *t, int off, int size, 6285 - enum bpf_access_type atype __maybe_unused, 5881 + int btf_struct_access(struct bpf_verifier_log *log, 5882 + const struct bpf_reg_state *reg, 5883 + int off, int size, enum bpf_access_type atype __maybe_unused, 6286 5884 u32 *next_btf_id, enum bpf_type_flag *flag) 6287 5885 { 5886 + const struct btf *btf = reg->btf; 6288 5887 enum bpf_type_flag tmp_flag = 0; 5888 + const struct btf_type *t; 5889 + u32 id = reg->btf_id; 6289 5890 int err; 6290 - u32 id; 6291 5891 5892 + while 
(type_is_alloc(reg->type)) { 5893 + struct btf_struct_meta *meta; 5894 + struct btf_record *rec; 5895 + int i; 5896 + 5897 + meta = btf_find_struct_meta(btf, id); 5898 + if (!meta) 5899 + break; 5900 + rec = meta->record; 5901 + for (i = 0; i < rec->cnt; i++) { 5902 + struct btf_field *field = &rec->fields[i]; 5903 + u32 offset = field->offset; 5904 + if (off < offset + btf_field_type_size(field->type) && offset < off + size) { 5905 + bpf_log(log, 5906 + "direct access to %s is disallowed\n", 5907 + btf_field_type_name(field->type)); 5908 + return -EACCES; 5909 + } 5910 + } 5911 + break; 5912 + } 5913 + 5914 + t = btf_type_by_id(btf, id); 6292 5915 do { 6293 5916 err = btf_struct_walk(log, btf, t, off, size, &id, &tmp_flag); 6294 5917 6295 5918 switch (err) { 6296 5919 case WALK_PTR: 5920 + /* For local types, the destination register cannot 5921 + * become a pointer again. 5922 + */ 5923 + if (type_is_alloc(reg->type)) 5924 + return SCALAR_VALUE; 6297 5925 /* If we found the pointer or scalar on t+off, 6298 5926 * we're done. 6299 5927 */ ··· 6358 5926 * end up with two different module BTFs, but IDs point to the common type in 6359 5927 * vmlinux BTF. 
6360 5928 */ 6361 - static bool btf_types_are_same(const struct btf *btf1, u32 id1, 6362 - const struct btf *btf2, u32 id2) 5929 + bool btf_types_are_same(const struct btf *btf1, u32 id1, 5930 + const struct btf *btf2, u32 id2) 6363 5931 { 6364 5932 if (id1 != id2) 6365 5933 return false; ··· 6641 6209 return btf_check_func_type_match(log, btf1, t1, btf2, t2); 6642 6210 } 6643 6211 6644 - static u32 *reg2btf_ids[__BPF_REG_TYPE_MAX] = { 6645 - #ifdef CONFIG_NET 6646 - [PTR_TO_SOCKET] = &btf_sock_ids[BTF_SOCK_TYPE_SOCK], 6647 - [PTR_TO_SOCK_COMMON] = &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON], 6648 - [PTR_TO_TCP_SOCK] = &btf_sock_ids[BTF_SOCK_TYPE_TCP], 6649 - #endif 6650 - }; 6651 - 6652 - /* Returns true if struct is composed of scalars, 4 levels of nesting allowed */ 6653 - static bool __btf_type_is_scalar_struct(struct bpf_verifier_log *log, 6654 - const struct btf *btf, 6655 - const struct btf_type *t, int rec) 6656 - { 6657 - const struct btf_type *member_type; 6658 - const struct btf_member *member; 6659 - u32 i; 6660 - 6661 - if (!btf_type_is_struct(t)) 6662 - return false; 6663 - 6664 - for_each_member(i, t, member) { 6665 - const struct btf_array *array; 6666 - 6667 - member_type = btf_type_skip_modifiers(btf, member->type, NULL); 6668 - if (btf_type_is_struct(member_type)) { 6669 - if (rec >= 3) { 6670 - bpf_log(log, "max struct nesting depth exceeded\n"); 6671 - return false; 6672 - } 6673 - if (!__btf_type_is_scalar_struct(log, btf, member_type, rec + 1)) 6674 - return false; 6675 - continue; 6676 - } 6677 - if (btf_type_is_array(member_type)) { 6678 - array = btf_type_array(member_type); 6679 - if (!array->nelems) 6680 - return false; 6681 - member_type = btf_type_skip_modifiers(btf, array->type, NULL); 6682 - if (!btf_type_is_scalar(member_type)) 6683 - return false; 6684 - continue; 6685 - } 6686 - if (!btf_type_is_scalar(member_type)) 6687 - return false; 6688 - } 6689 - return true; 6690 - } 6691 - 6692 - static bool is_kfunc_arg_mem_size(const struct 
btf *btf, 6693 - const struct btf_param *arg, 6694 - const struct bpf_reg_state *reg) 6695 - { 6696 - int len, sfx_len = sizeof("__sz") - 1; 6697 - const struct btf_type *t; 6698 - const char *param_name; 6699 - 6700 - t = btf_type_skip_modifiers(btf, arg->type, NULL); 6701 - if (!btf_type_is_scalar(t) || reg->type != SCALAR_VALUE) 6702 - return false; 6703 - 6704 - /* In the future, this can be ported to use BTF tagging */ 6705 - param_name = btf_name_by_offset(btf, arg->name_off); 6706 - if (str_is_empty(param_name)) 6707 - return false; 6708 - len = strlen(param_name); 6709 - if (len < sfx_len) 6710 - return false; 6711 - param_name += len - sfx_len; 6712 - if (strncmp(param_name, "__sz", sfx_len)) 6713 - return false; 6714 - 6715 - return true; 6716 - } 6717 - 6718 - static bool btf_is_kfunc_arg_mem_size(const struct btf *btf, 6719 - const struct btf_param *arg, 6720 - const struct bpf_reg_state *reg, 6721 - const char *name) 6722 - { 6723 - int len, target_len = strlen(name); 6724 - const struct btf_type *t; 6725 - const char *param_name; 6726 - 6727 - t = btf_type_skip_modifiers(btf, arg->type, NULL); 6728 - if (!btf_type_is_scalar(t) || reg->type != SCALAR_VALUE) 6729 - return false; 6730 - 6731 - param_name = btf_name_by_offset(btf, arg->name_off); 6732 - if (str_is_empty(param_name)) 6733 - return false; 6734 - len = strlen(param_name); 6735 - if (len != target_len) 6736 - return false; 6737 - if (strcmp(param_name, name)) 6738 - return false; 6739 - 6740 - return true; 6741 - } 6742 - 6743 6212 static int btf_check_func_arg_match(struct bpf_verifier_env *env, 6744 6213 const struct btf *btf, u32 func_id, 6745 6214 struct bpf_reg_state *regs, 6746 6215 bool ptr_to_mem_ok, 6747 - struct bpf_kfunc_arg_meta *kfunc_meta, 6748 6216 bool processing_call) 6749 6217 { 6750 6218 enum bpf_prog_type prog_type = resolve_prog_type(env->prog); 6751 - bool rel = false, kptr_get = false, trusted_args = false; 6752 - bool sleepable = false; 6753 6219 struct 
bpf_verifier_log *log = &env->log; 6754 - u32 i, nargs, ref_id, ref_obj_id = 0; 6755 - bool is_kfunc = btf_is_kernel(btf); 6756 6220 const char *func_name, *ref_tname; 6757 6221 const struct btf_type *t, *ref_t; 6758 6222 const struct btf_param *args; 6759 - int ref_regno = 0, ret; 6223 + u32 i, nargs, ref_id; 6224 + int ret; 6760 6225 6761 6226 t = btf_type_by_id(btf, func_id); 6762 6227 if (!t || !btf_type_is_func(t)) { ··· 6679 6350 return -EINVAL; 6680 6351 } 6681 6352 6682 - if (is_kfunc && kfunc_meta) { 6683 - /* Only kfunc can be release func */ 6684 - rel = kfunc_meta->flags & KF_RELEASE; 6685 - kptr_get = kfunc_meta->flags & KF_KPTR_GET; 6686 - trusted_args = kfunc_meta->flags & KF_TRUSTED_ARGS; 6687 - sleepable = kfunc_meta->flags & KF_SLEEPABLE; 6688 - } 6689 - 6690 6353 /* check that BTF function arguments match actual types that the 6691 6354 * verifier sees. 6692 6355 */ ··· 6686 6365 enum bpf_arg_type arg_type = ARG_DONTCARE; 6687 6366 u32 regno = i + 1; 6688 6367 struct bpf_reg_state *reg = &regs[regno]; 6689 - bool obj_ptr = false; 6690 6368 6691 6369 t = btf_type_skip_modifiers(btf, args[i].type, NULL); 6692 6370 if (btf_type_is_scalar(t)) { 6693 - if (is_kfunc && kfunc_meta) { 6694 - bool is_buf_size = false; 6695 - 6696 - /* check for any const scalar parameter of name "rdonly_buf_size" 6697 - * or "rdwr_buf_size" 6698 - */ 6699 - if (btf_is_kfunc_arg_mem_size(btf, &args[i], reg, 6700 - "rdonly_buf_size")) { 6701 - kfunc_meta->r0_rdonly = true; 6702 - is_buf_size = true; 6703 - } else if (btf_is_kfunc_arg_mem_size(btf, &args[i], reg, 6704 - "rdwr_buf_size")) 6705 - is_buf_size = true; 6706 - 6707 - if (is_buf_size) { 6708 - if (kfunc_meta->r0_size) { 6709 - bpf_log(log, "2 or more rdonly/rdwr_buf_size parameters for kfunc"); 6710 - return -EINVAL; 6711 - } 6712 - 6713 - if (!tnum_is_const(reg->var_off)) { 6714 - bpf_log(log, "R%d is not a const\n", regno); 6715 - return -EINVAL; 6716 - } 6717 - 6718 - kfunc_meta->r0_size = reg->var_off.value; 
6719 - ret = mark_chain_precision(env, regno); 6720 - if (ret) 6721 - return ret; 6722 - } 6723 - } 6724 - 6725 6371 if (reg->type == SCALAR_VALUE) 6726 6372 continue; 6727 6373 bpf_log(log, "R%d is not a scalar\n", regno); ··· 6701 6413 return -EINVAL; 6702 6414 } 6703 6415 6704 - /* These register types have special constraints wrt ref_obj_id 6705 - * and offset checks. The rest of trusted args don't. 6706 - */ 6707 - obj_ptr = reg->type == PTR_TO_CTX || reg->type == PTR_TO_BTF_ID || 6708 - reg2btf_ids[base_type(reg->type)]; 6709 - 6710 - /* Check if argument must be a referenced pointer, args + i has 6711 - * been verified to be a pointer (after skipping modifiers). 6712 - * PTR_TO_CTX is ok without having non-zero ref_obj_id. 6713 - */ 6714 - if (is_kfunc && trusted_args && (obj_ptr && reg->type != PTR_TO_CTX) && !reg->ref_obj_id) { 6715 - bpf_log(log, "R%d must be referenced\n", regno); 6716 - return -EINVAL; 6717 - } 6718 - 6719 6416 ref_t = btf_type_skip_modifiers(btf, t->type, &ref_id); 6720 6417 ref_tname = btf_name_by_offset(btf, ref_t->name_off); 6721 6418 6722 - /* Trusted args have the same offset checks as release arguments */ 6723 - if ((trusted_args && obj_ptr) || (rel && reg->ref_obj_id)) 6724 - arg_type |= OBJ_RELEASE; 6725 6419 ret = check_func_arg_reg_off(env, reg, regno, arg_type); 6726 6420 if (ret < 0) 6727 6421 return ret; 6728 6422 6729 - if (is_kfunc && reg->ref_obj_id) { 6730 - /* Ensure only one argument is referenced PTR_TO_BTF_ID */ 6731 - if (ref_obj_id) { 6732 - bpf_log(log, "verifier internal error: more than one arg with ref_obj_id R%d %u %u\n", 6733 - regno, reg->ref_obj_id, ref_obj_id); 6734 - return -EFAULT; 6735 - } 6736 - ref_regno = regno; 6737 - ref_obj_id = reg->ref_obj_id; 6738 - } 6739 - 6740 - /* kptr_get is only true for kfunc */ 6741 - if (i == 0 && kptr_get) { 6742 - struct btf_field *kptr_field; 6743 - 6744 - if (reg->type != PTR_TO_MAP_VALUE) { 6745 - bpf_log(log, "arg#0 expected pointer to map value\n"); 6746 - 
return -EINVAL; 6747 - } 6748 - 6749 - /* check_func_arg_reg_off allows var_off for 6750 - * PTR_TO_MAP_VALUE, but we need fixed offset to find 6751 - * off_desc. 6752 - */ 6753 - if (!tnum_is_const(reg->var_off)) { 6754 - bpf_log(log, "arg#0 must have constant offset\n"); 6755 - return -EINVAL; 6756 - } 6757 - 6758 - kptr_field = btf_record_find(reg->map_ptr->record, reg->off + reg->var_off.value, BPF_KPTR); 6759 - if (!kptr_field || kptr_field->type != BPF_KPTR_REF) { 6760 - bpf_log(log, "arg#0 no referenced kptr at map value offset=%llu\n", 6761 - reg->off + reg->var_off.value); 6762 - return -EINVAL; 6763 - } 6764 - 6765 - if (!btf_type_is_ptr(ref_t)) { 6766 - bpf_log(log, "arg#0 BTF type must be a double pointer\n"); 6767 - return -EINVAL; 6768 - } 6769 - 6770 - ref_t = btf_type_skip_modifiers(btf, ref_t->type, &ref_id); 6771 - ref_tname = btf_name_by_offset(btf, ref_t->name_off); 6772 - 6773 - if (!btf_type_is_struct(ref_t)) { 6774 - bpf_log(log, "kernel function %s args#%d pointer type %s %s is not supported\n", 6775 - func_name, i, btf_type_str(ref_t), ref_tname); 6776 - return -EINVAL; 6777 - } 6778 - if (!btf_struct_ids_match(log, btf, ref_id, 0, kptr_field->kptr.btf, 6779 - kptr_field->kptr.btf_id, true)) { 6780 - bpf_log(log, "kernel function %s args#%d expected pointer to %s %s\n", 6781 - func_name, i, btf_type_str(ref_t), ref_tname); 6782 - return -EINVAL; 6783 - } 6784 - /* rest of the arguments can be anything, like normal kfunc */ 6785 - } else if (btf_get_prog_ctx_type(log, btf, t, prog_type, i)) { 6423 + if (btf_get_prog_ctx_type(log, btf, t, prog_type, i)) { 6786 6424 /* If function expects ctx type in BTF check that caller 6787 6425 * is passing PTR_TO_CTX. 
6788 6426 */ ··· 6718 6504 i, btf_type_str(t)); 6719 6505 return -EINVAL; 6720 6506 } 6721 - } else if (is_kfunc && (reg->type == PTR_TO_BTF_ID || 6722 - (reg2btf_ids[base_type(reg->type)] && !type_flag(reg->type)))) { 6723 - const struct btf_type *reg_ref_t; 6724 - const struct btf *reg_btf; 6725 - const char *reg_ref_tname; 6726 - u32 reg_ref_id; 6727 - 6728 - if (!btf_type_is_struct(ref_t)) { 6729 - bpf_log(log, "kernel function %s args#%d pointer type %s %s is not supported\n", 6730 - func_name, i, btf_type_str(ref_t), 6731 - ref_tname); 6732 - return -EINVAL; 6733 - } 6734 - 6735 - if (reg->type == PTR_TO_BTF_ID) { 6736 - reg_btf = reg->btf; 6737 - reg_ref_id = reg->btf_id; 6738 - } else { 6739 - reg_btf = btf_vmlinux; 6740 - reg_ref_id = *reg2btf_ids[base_type(reg->type)]; 6741 - } 6742 - 6743 - reg_ref_t = btf_type_skip_modifiers(reg_btf, reg_ref_id, 6744 - &reg_ref_id); 6745 - reg_ref_tname = btf_name_by_offset(reg_btf, 6746 - reg_ref_t->name_off); 6747 - if (!btf_struct_ids_match(log, reg_btf, reg_ref_id, 6748 - reg->off, btf, ref_id, 6749 - trusted_args || (rel && reg->ref_obj_id))) { 6750 - bpf_log(log, "kernel function %s args#%d expected pointer to %s %s but R%d has a pointer to %s %s\n", 6751 - func_name, i, 6752 - btf_type_str(ref_t), ref_tname, 6753 - regno, btf_type_str(reg_ref_t), 6754 - reg_ref_tname); 6755 - return -EINVAL; 6756 - } 6757 6507 } else if (ptr_to_mem_ok && processing_call) { 6758 6508 const struct btf_type *resolve_ret; 6759 6509 u32 type_size; 6760 - 6761 - if (is_kfunc) { 6762 - bool arg_mem_size = i + 1 < nargs && is_kfunc_arg_mem_size(btf, &args[i + 1], &regs[regno + 1]); 6763 - bool arg_dynptr = btf_type_is_struct(ref_t) && 6764 - !strcmp(ref_tname, 6765 - stringify_struct(bpf_dynptr_kern)); 6766 - 6767 - /* Permit pointer to mem, but only when argument 6768 - * type is pointer to scalar, or struct composed 6769 - * (recursively) of scalars. 6770 - * When arg_mem_size is true, the pointer can be 6771 - * void *. 
6772 - * Also permit initialized local dynamic pointers. 6773 - */ 6774 - if (!btf_type_is_scalar(ref_t) && 6775 - !__btf_type_is_scalar_struct(log, btf, ref_t, 0) && 6776 - !arg_dynptr && 6777 - (arg_mem_size ? !btf_type_is_void(ref_t) : 1)) { 6778 - bpf_log(log, 6779 - "arg#%d pointer type %s %s must point to %sscalar, or struct with scalar\n", 6780 - i, btf_type_str(ref_t), ref_tname, arg_mem_size ? "void, " : ""); 6781 - return -EINVAL; 6782 - } 6783 - 6784 - if (arg_dynptr) { 6785 - if (reg->type != PTR_TO_STACK) { 6786 - bpf_log(log, "arg#%d pointer type %s %s not to stack\n", 6787 - i, btf_type_str(ref_t), 6788 - ref_tname); 6789 - return -EINVAL; 6790 - } 6791 - 6792 - if (!is_dynptr_reg_valid_init(env, reg)) { 6793 - bpf_log(log, 6794 - "arg#%d pointer type %s %s must be valid and initialized\n", 6795 - i, btf_type_str(ref_t), 6796 - ref_tname); 6797 - return -EINVAL; 6798 - } 6799 - 6800 - if (!is_dynptr_type_expected(env, reg, 6801 - ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL)) { 6802 - bpf_log(log, 6803 - "arg#%d pointer type %s %s points to unsupported dynamic pointer type\n", 6804 - i, btf_type_str(ref_t), 6805 - ref_tname); 6806 - return -EINVAL; 6807 - } 6808 - 6809 - continue; 6810 - } 6811 - 6812 - /* Check for mem, len pair */ 6813 - if (arg_mem_size) { 6814 - if (check_kfunc_mem_size_reg(env, &regs[regno + 1], regno + 1)) { 6815 - bpf_log(log, "arg#%d arg#%d memory, len pair leads to invalid memory access\n", 6816 - i, i + 1); 6817 - return -EINVAL; 6818 - } 6819 - i++; 6820 - continue; 6821 - } 6822 - } 6823 6510 6824 6511 resolve_ret = btf_resolve_size(btf, ref_t, &type_size); 6825 6512 if (IS_ERR(resolve_ret)) { ··· 6734 6619 if (check_mem_reg(env, reg, regno, type_size)) 6735 6620 return -EINVAL; 6736 6621 } else { 6737 - bpf_log(log, "reg type unsupported for arg#%d %sfunction %s#%d\n", i, 6738 - is_kfunc ? 
"kernel " : "", func_name, func_id); 6622 + bpf_log(log, "reg type unsupported for arg#%d function %s#%d\n", i, 6623 + func_name, func_id); 6739 6624 return -EINVAL; 6740 6625 } 6741 6626 } 6742 6627 6743 - /* Either both are set, or neither */ 6744 - WARN_ON_ONCE((ref_obj_id && !ref_regno) || (!ref_obj_id && ref_regno)); 6745 - /* We already made sure ref_obj_id is set only for one argument. We do 6746 - * allow (!rel && ref_obj_id), so that passing such referenced 6747 - * PTR_TO_BTF_ID to other kfuncs works. Note that rel is only true when 6748 - * is_kfunc is true. 6749 - */ 6750 - if (rel && !ref_obj_id) { 6751 - bpf_log(log, "release kernel function %s expects refcounted PTR_TO_BTF_ID\n", 6752 - func_name); 6753 - return -EINVAL; 6754 - } 6755 - 6756 - if (sleepable && !env->prog->aux->sleepable) { 6757 - bpf_log(log, "kernel function %s is sleepable but the program is not\n", 6758 - func_name); 6759 - return -EINVAL; 6760 - } 6761 - 6762 - if (kfunc_meta && ref_obj_id) 6763 - kfunc_meta->ref_obj_id = ref_obj_id; 6764 - 6765 - /* returns argument register number > 0 in case of reference release kfunc */ 6766 - return rel ? ref_regno : 0; 6628 + return 0; 6767 6629 } 6768 6630 6769 6631 /* Compare BTF of a function declaration with given bpf_reg_state. ··· 6770 6678 return -EINVAL; 6771 6679 6772 6680 is_global = prog->aux->func_info_aux[subprog].linkage == BTF_FUNC_GLOBAL; 6773 - err = btf_check_func_arg_match(env, btf, btf_id, regs, is_global, NULL, false); 6681 + err = btf_check_func_arg_match(env, btf, btf_id, regs, is_global, false); 6774 6682 6775 6683 /* Compiler optimizations can remove arguments from static functions 6776 6684 * or mismatched type can be passed into a global function. 
··· 6813 6721 return -EINVAL; 6814 6722 6815 6723 is_global = prog->aux->func_info_aux[subprog].linkage == BTF_FUNC_GLOBAL; 6816 - err = btf_check_func_arg_match(env, btf, btf_id, regs, is_global, NULL, true); 6724 + err = btf_check_func_arg_match(env, btf, btf_id, regs, is_global, true); 6817 6725 6818 6726 /* Compiler optimizations can remove arguments from static functions 6819 6727 * or mismatched type can be passed into a global function. ··· 6822 6730 if (err) 6823 6731 prog->aux->func_info_aux[subprog].unreliable = true; 6824 6732 return err; 6825 - } 6826 - 6827 - int btf_check_kfunc_arg_match(struct bpf_verifier_env *env, 6828 - const struct btf *btf, u32 func_id, 6829 - struct bpf_reg_state *regs, 6830 - struct bpf_kfunc_arg_meta *meta) 6831 - { 6832 - return btf_check_func_arg_match(env, btf, func_id, regs, true, meta, true); 6833 6733 } 6834 6734 6835 6735 /* Convert BTF of a function into bpf_reg_state if possible ··· 7206 7122 return btf->kernel_btf && strcmp(btf->name, "vmlinux") != 0; 7207 7123 } 7208 7124 7209 - static int btf_id_cmp_func(const void *a, const void *b) 7210 - { 7211 - const int *pa = a, *pb = b; 7212 - 7213 - return *pa - *pb; 7214 - } 7215 - 7216 - bool btf_id_set_contains(const struct btf_id_set *set, u32 id) 7217 - { 7218 - return bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func) != NULL; 7219 - } 7220 - 7221 - static void *btf_id_set8_contains(const struct btf_id_set8 *set, u32 id) 7222 - { 7223 - return bsearch(&id, set->pairs, set->cnt, sizeof(set->pairs[0]), btf_id_cmp_func); 7224 - } 7225 - 7226 7125 enum { 7227 7126 BTF_MODULE_F_LIVE = (1 << 0), 7228 7127 }; ··· 7566 7499 static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type) 7567 7500 { 7568 7501 switch (prog_type) { 7502 + case BPF_PROG_TYPE_UNSPEC: 7503 + return BTF_KFUNC_HOOK_COMMON; 7569 7504 case BPF_PROG_TYPE_XDP: 7570 7505 return BTF_KFUNC_HOOK_XDP; 7571 7506 case BPF_PROG_TYPE_SCHED_CLS: ··· 7596 7527 u32 kfunc_btf_id) 7597 7528 { 7598 
7529 enum btf_kfunc_hook hook; 7530 + u32 *kfunc_flags; 7531 + 7532 + kfunc_flags = __btf_kfunc_id_set_contains(btf, BTF_KFUNC_HOOK_COMMON, kfunc_btf_id); 7533 + if (kfunc_flags) 7534 + return kfunc_flags; 7599 7535 7600 7536 hook = bpf_prog_type_to_kfunc_hook(prog_type); 7601 7537 return __btf_kfunc_id_set_contains(btf, hook, kfunc_btf_id);
+14
kernel/bpf/cgroup_iter.c
··· 164 164 struct cgroup_iter_priv *p = (struct cgroup_iter_priv *)priv; 165 165 struct cgroup *cgrp = aux->cgroup.start; 166 166 167 + /* bpf_iter_attach_cgroup() has already acquired an extra reference 168 + * for the start cgroup, but the reference may be released after 169 + * cgroup_iter_seq_init(), so acquire another reference for the 170 + * start cgroup. 171 + */ 167 172 p->start_css = &cgrp->self; 173 + css_get(p->start_css); 168 174 p->terminate = false; 169 175 p->visited_all = false; 170 176 p->order = aux->cgroup.order; 171 177 return 0; 172 178 } 173 179 180 + static void cgroup_iter_seq_fini(void *priv) 181 + { 182 + struct cgroup_iter_priv *p = (struct cgroup_iter_priv *)priv; 183 + 184 + css_put(p->start_css); 185 + } 186 + 174 187 static const struct bpf_iter_seq_info cgroup_iter_seq_info = { 175 188 .seq_ops = &cgroup_iter_seq_ops, 176 189 .init_seq_private = cgroup_iter_seq_init, 190 + .fini_seq_private = cgroup_iter_seq_fini, 177 191 .seq_priv_size = sizeof(struct cgroup_iter_priv), 178 192 }; 179 193
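The init/fini pairing added above follows a common lifetime pattern: the iterator takes its own reference in `init_seq_private` so it no longer depends on the attach-time reference, and drops it in `fini_seq_private`. A minimal userspace sketch of that pattern, with a plain C11 atomic counter standing in for `css_get()`/`css_put()` (all names here are illustrative, not kernel API):

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative stand-in for a refcounted object such as a css. */
struct obj { atomic_int refs; };

static void obj_get(struct obj *o) { atomic_fetch_add(&o->refs, 1); }
static int  obj_put(struct obj *o) { return atomic_fetch_sub(&o->refs, 1) - 1; }

struct iter_priv { struct obj *start; };

/* seq_init: take our own reference, so the attach-time reference may be
 * released before the iterator is done with the object. */
static void iter_seq_init(struct iter_priv *p, struct obj *start)
{
	p->start = start;
	obj_get(p->start);
}

/* seq_fini: drop the reference taken in init. */
static void iter_seq_fini(struct iter_priv *p)
{
	obj_put(p->start);
}
```

The point is that the two references have independent owners: dropping the attach-time reference early leaves the iterator's copy intact.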
+16
kernel/bpf/core.c
··· 34 34 #include <linux/log2.h> 35 35 #include <linux/bpf_verifier.h> 36 36 #include <linux/nodemask.h> 37 + #include <linux/bpf_mem_alloc.h> 37 38 38 39 #include <asm/barrier.h> 39 40 #include <asm/unaligned.h> ··· 60 59 #define ARG1 regs[BPF_REG_ARG1] 61 60 #define CTX regs[BPF_REG_CTX] 62 61 #define IMM insn->imm 62 + 63 + struct bpf_mem_alloc bpf_global_ma; 64 + bool bpf_global_ma_set; 63 65 64 66 /* No hurry in this branch 65 67 * ··· 2749 2745 { 2750 2746 return -ENOTSUPP; 2751 2747 } 2748 + 2749 + #ifdef CONFIG_BPF_SYSCALL 2750 + static int __init bpf_global_ma_init(void) 2751 + { 2752 + int ret; 2753 + 2754 + ret = bpf_mem_alloc_init(&bpf_global_ma, 0, false); 2755 + bpf_global_ma_set = !ret; 2756 + return ret; 2757 + } 2758 + late_initcall(bpf_global_ma_init); 2759 + #endif 2752 2760 2753 2761 DEFINE_STATIC_KEY_FALSE(bpf_stats_enabled_key); 2754 2762 EXPORT_SYMBOL(bpf_stats_enabled_key);
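The `bpf_global_ma_init()` initcall above records whether setup succeeded in `bpf_global_ma_set`, so later users can test a flag rather than probe the allocator. A userspace sketch of that init-once-with-flag shape (names are illustrative stand-ins, not the kernel symbols):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative stand-ins for bpf_global_ma / bpf_global_ma_set. */
static void *global_pool;
static bool global_pool_set;

/* Mirrors the shape of bpf_global_ma_init(): remember success in a flag
 * so callers can cheaply check availability later. */
static int global_pool_init(size_t size)
{
	global_pool = malloc(size);
	global_pool_set = (global_pool != NULL);
	return global_pool_set ? 0 : -1;
}

static void *global_pool_base(void)
{
	return global_pool_set ? global_pool : NULL;
}
```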
+2 -2
kernel/bpf/cpumap.c
··· 667 667 return 0; 668 668 } 669 669 670 - static int cpu_map_redirect(struct bpf_map *map, u32 ifindex, u64 flags) 670 + static int cpu_map_redirect(struct bpf_map *map, u64 index, u64 flags) 671 671 { 672 - return __bpf_xdp_redirect_map(map, ifindex, flags, 0, 672 + return __bpf_xdp_redirect_map(map, index, flags, 0, 673 673 __cpu_map_lookup_elem); 674 674 } 675 675
+2 -2
kernel/bpf/devmap.c
··· 992 992 map, key, value, map_flags); 993 993 } 994 994 995 - static int dev_map_redirect(struct bpf_map *map, u32 ifindex, u64 flags) 995 + static int dev_map_redirect(struct bpf_map *map, u64 ifindex, u64 flags) 996 996 { 997 997 return __bpf_xdp_redirect_map(map, ifindex, flags, 998 998 BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS, 999 999 __dev_map_lookup_elem); 1000 1000 } 1001 1001 1002 - static int dev_hash_map_redirect(struct bpf_map *map, u32 ifindex, u64 flags) 1002 + static int dev_hash_map_redirect(struct bpf_map *map, u64 ifindex, u64 flags) 1003 1003 { 1004 1004 return __bpf_xdp_redirect_map(map, ifindex, flags, 1005 1005 BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS,
-1
kernel/bpf/hashtab.c
··· 1511 1511 prealloc_destroy(htab); 1512 1512 } 1513 1513 1514 - bpf_map_free_record(map); 1515 1514 free_percpu(htab->extra_elems); 1516 1515 bpf_map_area_free(htab->buckets); 1517 1516 bpf_mem_alloc_destroy(&htab->pcpu_ma);
+358 -5
kernel/bpf/helpers.c
··· 4 4 #include <linux/bpf.h> 5 5 #include <linux/btf.h> 6 6 #include <linux/bpf-cgroup.h> 7 + #include <linux/cgroup.h> 7 8 #include <linux/rcupdate.h> 8 9 #include <linux/random.h> 9 10 #include <linux/smp.h> ··· 20 19 #include <linux/proc_ns.h> 21 20 #include <linux/security.h> 22 21 #include <linux/btf_ids.h> 22 + #include <linux/bpf_mem_alloc.h> 23 23 24 24 #include "../../lib/kstrtox.h" 25 25 ··· 338 336 .gpl_only = false, 339 337 .ret_type = RET_VOID, 340 338 .arg1_type = ARG_PTR_TO_SPIN_LOCK, 339 + .arg1_btf_id = BPF_PTR_POISON, 341 340 }; 342 341 343 342 static inline void __bpf_spin_unlock_irqrestore(struct bpf_spin_lock *lock) ··· 361 358 .gpl_only = false, 362 359 .ret_type = RET_VOID, 363 360 .arg1_type = ARG_PTR_TO_SPIN_LOCK, 361 + .arg1_btf_id = BPF_PTR_POISON, 364 362 }; 365 363 366 364 void copy_map_value_locked(struct bpf_map *map, void *dst, void *src, ··· 661 657 const struct bpf_func_proto bpf_copy_from_user_proto = { 662 658 .func = bpf_copy_from_user, 663 659 .gpl_only = false, 660 + .might_sleep = true, 664 661 .ret_type = RET_INTEGER, 665 662 .arg1_type = ARG_PTR_TO_UNINIT_MEM, 666 663 .arg2_type = ARG_CONST_SIZE_OR_ZERO, ··· 692 687 const struct bpf_func_proto bpf_copy_from_user_task_proto = { 693 688 .func = bpf_copy_from_user_task, 694 689 .gpl_only = true, 690 + .might_sleep = true, 695 691 .ret_type = RET_INTEGER, 696 692 .arg1_type = ARG_PTR_TO_UNINIT_MEM, 697 693 .arg2_type = ARG_CONST_SIZE_OR_ZERO, ··· 1712 1706 } 1713 1707 } 1714 1708 1715 - BTF_SET8_START(tracing_btf_ids) 1709 + void bpf_list_head_free(const struct btf_field *field, void *list_head, 1710 + struct bpf_spin_lock *spin_lock) 1711 + { 1712 + struct list_head *head = list_head, *orig_head = list_head; 1713 + 1714 + BUILD_BUG_ON(sizeof(struct list_head) > sizeof(struct bpf_list_head)); 1715 + BUILD_BUG_ON(__alignof__(struct list_head) > __alignof__(struct bpf_list_head)); 1716 + 1717 + /* Do the actual list draining outside the lock to not hold the lock for 1718 + * 
too long, and also prevent deadlocks if tracing programs end up 1719 + * executing on entry/exit of functions called inside the critical 1720 + * section, and end up doing map ops that call bpf_list_head_free for 1721 + * the same map value again. 1722 + */ 1723 + __bpf_spin_lock_irqsave(spin_lock); 1724 + if (!head->next || list_empty(head)) 1725 + goto unlock; 1726 + head = head->next; 1727 + unlock: 1728 + INIT_LIST_HEAD(orig_head); 1729 + __bpf_spin_unlock_irqrestore(spin_lock); 1730 + 1731 + while (head != orig_head) { 1732 + void *obj = head; 1733 + 1734 + obj -= field->list_head.node_offset; 1735 + head = head->next; 1736 + /* The contained type can also have resources, including a 1737 + * bpf_list_head which needs to be freed. 1738 + */ 1739 + bpf_obj_free_fields(field->list_head.value_rec, obj); 1740 + /* bpf_mem_free requires migrate_disable(), since we can be 1741 + * called from map free path as well apart from BPF program (as 1742 + * part of map ops doing bpf_obj_free_fields). 
1743 + */ 1744 + migrate_disable(); 1745 + bpf_mem_free(&bpf_global_ma, obj); 1746 + migrate_enable(); 1747 + } 1748 + } 1749 + 1750 + __diag_push(); 1751 + __diag_ignore_all("-Wmissing-prototypes", 1752 + "Global functions as their definitions will be in vmlinux BTF"); 1753 + 1754 + void *bpf_obj_new_impl(u64 local_type_id__k, void *meta__ign) 1755 + { 1756 + struct btf_struct_meta *meta = meta__ign; 1757 + u64 size = local_type_id__k; 1758 + void *p; 1759 + 1760 + p = bpf_mem_alloc(&bpf_global_ma, size); 1761 + if (!p) 1762 + return NULL; 1763 + if (meta) 1764 + bpf_obj_init(meta->field_offs, p); 1765 + return p; 1766 + } 1767 + 1768 + void bpf_obj_drop_impl(void *p__alloc, void *meta__ign) 1769 + { 1770 + struct btf_struct_meta *meta = meta__ign; 1771 + void *p = p__alloc; 1772 + 1773 + if (meta) 1774 + bpf_obj_free_fields(meta->record, p); 1775 + bpf_mem_free(&bpf_global_ma, p); 1776 + } 1777 + 1778 + static void __bpf_list_add(struct bpf_list_node *node, struct bpf_list_head *head, bool tail) 1779 + { 1780 + struct list_head *n = (void *)node, *h = (void *)head; 1781 + 1782 + if (unlikely(!h->next)) 1783 + INIT_LIST_HEAD(h); 1784 + if (unlikely(!n->next)) 1785 + INIT_LIST_HEAD(n); 1786 + tail ? list_add_tail(n, h) : list_add(n, h); 1787 + } 1788 + 1789 + void bpf_list_push_front(struct bpf_list_head *head, struct bpf_list_node *node) 1790 + { 1791 + return __bpf_list_add(node, head, false); 1792 + } 1793 + 1794 + void bpf_list_push_back(struct bpf_list_head *head, struct bpf_list_node *node) 1795 + { 1796 + return __bpf_list_add(node, head, true); 1797 + } 1798 + 1799 + static struct bpf_list_node *__bpf_list_del(struct bpf_list_head *head, bool tail) 1800 + { 1801 + struct list_head *n, *h = (void *)head; 1802 + 1803 + if (unlikely(!h->next)) 1804 + INIT_LIST_HEAD(h); 1805 + if (list_empty(h)) 1806 + return NULL; 1807 + n = tail ? 
h->prev : h->next; 1808 + list_del_init(n); 1809 + return (struct bpf_list_node *)n; 1810 + } 1811 + 1812 + struct bpf_list_node *bpf_list_pop_front(struct bpf_list_head *head) 1813 + { 1814 + return __bpf_list_del(head, false); 1815 + } 1816 + 1817 + struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head) 1818 + { 1819 + return __bpf_list_del(head, true); 1820 + } 1821 + 1822 + /** 1823 + * bpf_task_acquire - Acquire a reference to a task. A task acquired by this 1824 + * kfunc which is not stored in a map as a kptr, must be released by calling 1825 + * bpf_task_release(). 1826 + * @p: The task on which a reference is being acquired. 1827 + */ 1828 + struct task_struct *bpf_task_acquire(struct task_struct *p) 1829 + { 1830 + refcount_inc(&p->rcu_users); 1831 + return p; 1832 + } 1833 + 1834 + /** 1835 + * bpf_task_kptr_get - Acquire a reference on a struct task_struct kptr. A task 1836 + * kptr acquired by this kfunc which is not subsequently stored in a map, must 1837 + * be released by calling bpf_task_release(). 1838 + * @pp: A pointer to a task kptr on which a reference is being acquired. 1839 + */ 1840 + struct task_struct *bpf_task_kptr_get(struct task_struct **pp) 1841 + { 1842 + struct task_struct *p; 1843 + 1844 + rcu_read_lock(); 1845 + p = READ_ONCE(*pp); 1846 + 1847 + /* Another context could remove the task from the map and release it at 1848 + * any time, including after we've done the lookup above. This is safe 1849 + * because we're in an RCU read region, so the task is guaranteed to 1850 + * remain valid until at least the rcu_read_unlock() below. 1851 + */ 1852 + if (p && !refcount_inc_not_zero(&p->rcu_users)) 1853 + /* If the task had been removed from the map and freed as 1854 + * described above, refcount_inc_not_zero() will return false. 1855 + * The task will be freed at some point after the current RCU 1856 + * gp has ended, so just return NULL to the user. 
1857 + */ 1858 + p = NULL; 1859 + rcu_read_unlock(); 1860 + 1861 + return p; 1862 + } 1863 + 1864 + /** 1865 + * bpf_task_release - Release the reference acquired on a struct task_struct *. 1866 + * If this kfunc is invoked in an RCU read region, the task_struct is 1867 + * guaranteed to not be freed until the current grace period has ended, even if 1868 + * its refcount drops to 0. 1869 + * @p: The task on which a reference is being released. 1870 + */ 1871 + void bpf_task_release(struct task_struct *p) 1872 + { 1873 + if (!p) 1874 + return; 1875 + 1876 + put_task_struct_rcu_user(p); 1877 + } 1878 + 1879 + #ifdef CONFIG_CGROUPS 1880 + /** 1881 + * bpf_cgroup_acquire - Acquire a reference to a cgroup. A cgroup acquired by 1882 + * this kfunc which is not stored in a map as a kptr, must be released by 1883 + * calling bpf_cgroup_release(). 1884 + * @cgrp: The cgroup on which a reference is being acquired. 1885 + */ 1886 + struct cgroup *bpf_cgroup_acquire(struct cgroup *cgrp) 1887 + { 1888 + cgroup_get(cgrp); 1889 + return cgrp; 1890 + } 1891 + 1892 + /** 1893 + * bpf_cgroup_kptr_get - Acquire a reference on a struct cgroup kptr. A cgroup 1894 + * kptr acquired by this kfunc which is not subsequently stored in a map, must 1895 + * be released by calling bpf_cgroup_release(). 1896 + * @cgrpp: A pointer to a cgroup kptr on which a reference is being acquired. 1897 + */ 1898 + struct cgroup *bpf_cgroup_kptr_get(struct cgroup **cgrpp) 1899 + { 1900 + struct cgroup *cgrp; 1901 + 1902 + rcu_read_lock(); 1903 + /* Another context could remove the cgroup from the map and release it 1904 + * at any time, including after we've done the lookup above. This is 1905 + * safe because we're in an RCU read region, so the cgroup is 1906 + * guaranteed to remain valid until at least the rcu_read_unlock() 1907 + * below. 
1908 + */ 1909 + cgrp = READ_ONCE(*cgrpp); 1910 + 1911 + if (cgrp && !cgroup_tryget(cgrp)) 1912 + /* If the cgroup had been removed from the map and freed as 1913 + * described above, cgroup_tryget() will return false. The 1914 + * cgroup will be freed at some point after the current RCU gp 1915 + * has ended, so just return NULL to the user. 1916 + */ 1917 + cgrp = NULL; 1918 + rcu_read_unlock(); 1919 + 1920 + return cgrp; 1921 + } 1922 + 1923 + /** 1924 + * bpf_cgroup_release - Release the reference acquired on a struct cgroup *. 1925 + * If this kfunc is invoked in an RCU read region, the cgroup is guaranteed to 1926 + * not be freed until the current grace period has ended, even if its refcount 1927 + * drops to 0. 1928 + * @cgrp: The cgroup on which a reference is being released. 1929 + */ 1930 + void bpf_cgroup_release(struct cgroup *cgrp) 1931 + { 1932 + if (!cgrp) 1933 + return; 1934 + 1935 + cgroup_put(cgrp); 1936 + } 1937 + 1938 + /** 1939 + * bpf_cgroup_ancestor - Perform a lookup on an entry in a cgroup's ancestor 1940 + * array. A cgroup returned by this kfunc which is not subsequently stored in a 1941 + * map, must be released by calling bpf_cgroup_release(). 1942 + * @cgrp: The cgroup for which we're performing a lookup. 1943 + * @level: The level of ancestor to look up. 1944 + */ 1945 + struct cgroup *bpf_cgroup_ancestor(struct cgroup *cgrp, int level) 1946 + { 1947 + struct cgroup *ancestor; 1948 + 1949 + if (level > cgrp->level || level < 0) 1950 + return NULL; 1951 + 1952 + ancestor = cgrp->ancestors[level]; 1953 + cgroup_get(ancestor); 1954 + return ancestor; 1955 + } 1956 + #endif /* CONFIG_CGROUPS */ 1957 + 1958 + /** 1959 + * bpf_task_from_pid - Find a struct task_struct from its pid by looking it up 1960 + * in the root pid namespace idr. If a task is returned, it must either be 1961 + * stored in a map, or released with bpf_task_release(). 1962 + * @pid: The pid of the task being looked up. 
1963 + */ 1964 + struct task_struct *bpf_task_from_pid(s32 pid) 1965 + { 1966 + struct task_struct *p; 1967 + 1968 + rcu_read_lock(); 1969 + p = find_task_by_pid_ns(pid, &init_pid_ns); 1970 + if (p) 1971 + bpf_task_acquire(p); 1972 + rcu_read_unlock(); 1973 + 1974 + return p; 1975 + } 1976 + 1977 + void *bpf_cast_to_kern_ctx(void *obj) 1978 + { 1979 + return obj; 1980 + } 1981 + 1982 + void *bpf_rdonly_cast(void *obj__ign, u32 btf_id__k) 1983 + { 1984 + return obj__ign; 1985 + } 1986 + 1987 + void bpf_rcu_read_lock(void) 1988 + { 1989 + rcu_read_lock(); 1990 + } 1991 + 1992 + void bpf_rcu_read_unlock(void) 1993 + { 1994 + rcu_read_unlock(); 1995 + } 1996 + 1997 + __diag_pop(); 1998 + 1999 + BTF_SET8_START(generic_btf_ids) 1716 2000 #ifdef CONFIG_KEXEC_CORE 1717 2001 BTF_ID_FLAGS(func, crash_kexec, KF_DESTRUCTIVE) 1718 2002 #endif 1719 - BTF_SET8_END(tracing_btf_ids) 2003 + BTF_ID_FLAGS(func, bpf_obj_new_impl, KF_ACQUIRE | KF_RET_NULL) 2004 + BTF_ID_FLAGS(func, bpf_obj_drop_impl, KF_RELEASE) 2005 + BTF_ID_FLAGS(func, bpf_list_push_front) 2006 + BTF_ID_FLAGS(func, bpf_list_push_back) 2007 + BTF_ID_FLAGS(func, bpf_list_pop_front, KF_ACQUIRE | KF_RET_NULL) 2008 + BTF_ID_FLAGS(func, bpf_list_pop_back, KF_ACQUIRE | KF_RET_NULL) 2009 + BTF_ID_FLAGS(func, bpf_task_acquire, KF_ACQUIRE | KF_TRUSTED_ARGS) 2010 + BTF_ID_FLAGS(func, bpf_task_kptr_get, KF_ACQUIRE | KF_KPTR_GET | KF_RET_NULL) 2011 + BTF_ID_FLAGS(func, bpf_task_release, KF_RELEASE) 2012 + #ifdef CONFIG_CGROUPS 2013 + BTF_ID_FLAGS(func, bpf_cgroup_acquire, KF_ACQUIRE | KF_TRUSTED_ARGS) 2014 + BTF_ID_FLAGS(func, bpf_cgroup_kptr_get, KF_ACQUIRE | KF_KPTR_GET | KF_RET_NULL) 2015 + BTF_ID_FLAGS(func, bpf_cgroup_release, KF_RELEASE) 2016 + BTF_ID_FLAGS(func, bpf_cgroup_ancestor, KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL) 2017 + #endif 2018 + BTF_ID_FLAGS(func, bpf_task_from_pid, KF_ACQUIRE | KF_RET_NULL) 2019 + BTF_SET8_END(generic_btf_ids) 1720 2020 1721 - static const struct btf_kfunc_id_set tracing_kfunc_set = { 
2021 + static const struct btf_kfunc_id_set generic_kfunc_set = { 1722 2022 .owner = THIS_MODULE, 1723 - .set = &tracing_btf_ids, 2023 + .set = &generic_btf_ids, 2024 + }; 2025 + 2026 + 2027 + BTF_ID_LIST(generic_dtor_ids) 2028 + BTF_ID(struct, task_struct) 2029 + BTF_ID(func, bpf_task_release) 2030 + #ifdef CONFIG_CGROUPS 2031 + BTF_ID(struct, cgroup) 2032 + BTF_ID(func, bpf_cgroup_release) 2033 + #endif 2034 + 2035 + BTF_SET8_START(common_btf_ids) 2036 + BTF_ID_FLAGS(func, bpf_cast_to_kern_ctx) 2037 + BTF_ID_FLAGS(func, bpf_rdonly_cast) 2038 + BTF_ID_FLAGS(func, bpf_rcu_read_lock) 2039 + BTF_ID_FLAGS(func, bpf_rcu_read_unlock) 2040 + BTF_SET8_END(common_btf_ids) 2041 + 2042 + static const struct btf_kfunc_id_set common_kfunc_set = { 2043 + .owner = THIS_MODULE, 2044 + .set = &common_btf_ids, 1724 2045 }; 1725 2046 1726 2047 static int __init kfunc_init(void) 1727 2048 { 1728 - return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &tracing_kfunc_set); 2049 + int ret; 2050 + const struct btf_id_dtor_kfunc generic_dtors[] = { 2051 + { 2052 + .btf_id = generic_dtor_ids[0], 2053 + .kfunc_btf_id = generic_dtor_ids[1] 2054 + }, 2055 + #ifdef CONFIG_CGROUPS 2056 + { 2057 + .btf_id = generic_dtor_ids[2], 2058 + .kfunc_btf_id = generic_dtor_ids[3] 2059 + }, 2060 + #endif 2061 + }; 2062 + 2063 + ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &generic_kfunc_set); 2064 + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &generic_kfunc_set); 2065 + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &generic_kfunc_set); 2066 + ret = ret ?: register_btf_id_dtor_kfuncs(generic_dtors, 2067 + ARRAY_SIZE(generic_dtors), 2068 + THIS_MODULE); 2069 + return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_UNSPEC, &common_kfunc_set); 1729 2070 } 1730 2071 1731 2072 late_initcall(kfunc_init);
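The `__bpf_list_add()`/`__bpf_list_del()` helpers above build a circular doubly-linked list out of zeroed map-value memory: a head whose `next` is still NULL is lazily initialized on first use, and pops use `list_del_init()` semantics so a removed node is safe to reinsert. A self-contained userspace sketch of the same shape (node-side lazy init omitted for brevity; all names illustrative, not kernel API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal intrusive circular list, mirroring the lazy-init checks like
 * `if (unlikely(!h->next)) INIT_LIST_HEAD(h)` above. */
struct node { struct node *next, *prev; };

static void list_init(struct node *h) { h->next = h->prev = h; }

static void list_push(struct node *h, struct node *n, bool tail)
{
	struct node *at;

	if (!h->next)
		list_init(h);        /* lazy init: map values start zeroed */
	at = tail ? h->prev : h;     /* insert before head == push_back */
	n->next = at->next;
	n->prev = at;
	at->next->prev = n;
	at->next = n;
}

static struct node *list_pop(struct node *h, bool tail)
{
	struct node *n;

	if (!h->next)
		list_init(h);
	if (h->next == h)
		return NULL;                 /* empty */
	n = tail ? h->prev : h->next;
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);                        /* list_del_init() semantics */
	return n;
}
```

With this layout, push-front/push-back and pop-front/pop-back differ only in whether the anchor is the head or its `prev`.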
+35 -13
kernel/bpf/map_in_map.c
··· 12 12 struct bpf_map *inner_map, *inner_map_meta; 13 13 u32 inner_map_meta_size; 14 14 struct fd f; 15 + int ret; 15 16 16 17 f = fdget(inner_map_ufd); 17 18 inner_map = __bpf_map_get(f); ··· 21 20 22 21 /* Does not support >1 level map-in-map */ 23 22 if (inner_map->inner_map_meta) { 24 - fdput(f); 25 - return ERR_PTR(-EINVAL); 23 + ret = -EINVAL; 24 + goto put; 26 25 } 27 26 28 27 if (!inner_map->ops->map_meta_equal) { 29 - fdput(f); 30 - return ERR_PTR(-ENOTSUPP); 31 - } 32 - 33 - if (btf_record_has_field(inner_map->record, BPF_SPIN_LOCK)) { 34 - fdput(f); 35 - return ERR_PTR(-ENOTSUPP); 28 + ret = -ENOTSUPP; 29 + goto put; 36 30 } 37 31 38 32 inner_map_meta_size = sizeof(*inner_map_meta); ··· 37 41 38 42 inner_map_meta = kzalloc(inner_map_meta_size, GFP_USER); 39 43 if (!inner_map_meta) { 40 - fdput(f); 41 - return ERR_PTR(-ENOMEM); 44 + ret = -ENOMEM; 45 + goto put; 42 46 } 43 47 44 48 inner_map_meta->map_type = inner_map->map_type; ··· 46 50 inner_map_meta->value_size = inner_map->value_size; 47 51 inner_map_meta->map_flags = inner_map->map_flags; 48 52 inner_map_meta->max_entries = inner_map->max_entries; 53 + 49 54 inner_map_meta->record = btf_record_dup(inner_map->record); 50 55 if (IS_ERR(inner_map_meta->record)) { 51 56 /* btf_record_dup returns NULL or valid pointer in case of 52 57 * invalid/empty/valid, but ERR_PTR in case of errors. During 53 58 * equality NULL or IS_ERR is equivalent. 54 59 */ 55 - fdput(f); 56 - return ERR_CAST(inner_map_meta->record); 60 + ret = PTR_ERR(inner_map_meta->record); 61 + goto free; 57 62 } 63 + if (inner_map_meta->record) { 64 + struct btf_field_offs *field_offs; 65 + /* If btf_record is !IS_ERR_OR_NULL, then field_offs is always 66 + * valid. 
67 + */ 68 + field_offs = kmemdup(inner_map->field_offs, sizeof(*inner_map->field_offs), GFP_KERNEL | __GFP_NOWARN); 69 + if (!field_offs) { 70 + ret = -ENOMEM; 71 + goto free_rec; 72 + } 73 + inner_map_meta->field_offs = field_offs; 74 + } 75 + /* Note: We must use the same BTF, as we also used btf_record_dup above 76 + * which relies on BTF being same for both maps, as some members like 77 + * record->fields.list_head have pointers like value_rec pointing into 78 + * inner_map->btf. 79 + */ 58 80 if (inner_map->btf) { 59 81 btf_get(inner_map->btf); 60 82 inner_map_meta->btf = inner_map->btf; ··· 88 74 89 75 fdput(f); 90 76 return inner_map_meta; 77 + free_rec: 78 + btf_record_free(inner_map_meta->record); 79 + free: 80 + kfree(inner_map_meta); 81 + put: 82 + fdput(f); 83 + return ERR_PTR(ret); 91 84 } 92 85 93 86 void bpf_map_meta_free(struct bpf_map *map_meta) 94 87 { 88 + kfree(map_meta->field_offs); 95 89 bpf_map_free_record(map_meta); 96 90 btf_put(map_meta->btf); 97 91 kfree(map_meta);
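The `bpf_map_meta_alloc()` rewrite above replaces repeated `fdput(f); return ERR_PTR(...)` exits with a single unwind tail of `put`/`free`/`free_rec` labels, so each failure frees exactly what has been allocated so far. A userspace sketch of that centralized-unwind shape (simulated failures via `fail_at`; all names illustrative):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct meta { void *record; void *field_offs; };

/* Each failure jumps to the label that unwinds everything allocated up
 * to that point, mirroring the goto tail in bpf_map_meta_alloc(). */
static int meta_alloc(int fail_at, struct meta **out)
{
	struct meta *m;
	int ret;

	m = calloc(1, sizeof(*m));
	if (!m)
		return -ENOMEM;

	m->record = (fail_at == 1) ? NULL : malloc(16);
	if (!m->record) {
		ret = -ENOMEM;
		goto free_meta;
	}

	m->field_offs = (fail_at == 2) ? NULL : malloc(16);
	if (!m->field_offs) {
		ret = -ENOMEM;
		goto free_rec;
	}

	*out = m;
	return 0;

free_rec:
	free(m->record);
free_meta:
	free(m);
	return ret;          /* the kernel version returns ERR_PTR(ret) */
}
```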
+3 -3
kernel/bpf/ringbuf.c
··· 447 447 448 448 const struct bpf_func_proto bpf_ringbuf_reserve_proto = { 449 449 .func = bpf_ringbuf_reserve, 450 - .ret_type = RET_PTR_TO_ALLOC_MEM_OR_NULL, 450 + .ret_type = RET_PTR_TO_RINGBUF_MEM_OR_NULL, 451 451 .arg1_type = ARG_CONST_MAP_PTR, 452 452 .arg2_type = ARG_CONST_ALLOC_SIZE_OR_ZERO, 453 453 .arg3_type = ARG_ANYTHING, ··· 490 490 const struct bpf_func_proto bpf_ringbuf_submit_proto = { 491 491 .func = bpf_ringbuf_submit, 492 492 .ret_type = RET_VOID, 493 - .arg1_type = ARG_PTR_TO_ALLOC_MEM | OBJ_RELEASE, 493 + .arg1_type = ARG_PTR_TO_RINGBUF_MEM | OBJ_RELEASE, 494 494 .arg2_type = ARG_ANYTHING, 495 495 }; 496 496 ··· 503 503 const struct bpf_func_proto bpf_ringbuf_discard_proto = { 504 504 .func = bpf_ringbuf_discard, 505 505 .ret_type = RET_VOID, 506 - .arg1_type = ARG_PTR_TO_ALLOC_MEM | OBJ_RELEASE, 506 + .arg1_type = ARG_PTR_TO_RINGBUF_MEM | OBJ_RELEASE, 507 507 .arg2_type = ARG_ANYTHING, 508 508 }; 509 509
+72 -24
kernel/bpf/syscall.c
··· 175 175 synchronize_rcu(); 176 176 } 177 177 178 - static int bpf_map_update_value(struct bpf_map *map, struct fd f, void *key, 179 - void *value, __u64 flags) 178 + static int bpf_map_update_value(struct bpf_map *map, struct file *map_file, 179 + void *key, void *value, __u64 flags) 180 180 { 181 181 int err; 182 182 ··· 190 190 map->map_type == BPF_MAP_TYPE_SOCKMAP) { 191 191 return sock_map_update_elem_sys(map, key, value, flags); 192 192 } else if (IS_FD_PROG_ARRAY(map)) { 193 - return bpf_fd_array_map_update_elem(map, f.file, key, value, 193 + return bpf_fd_array_map_update_elem(map, map_file, key, value, 194 194 flags); 195 195 } 196 196 ··· 205 205 flags); 206 206 } else if (IS_FD_ARRAY(map)) { 207 207 rcu_read_lock(); 208 - err = bpf_fd_array_map_update_elem(map, f.file, key, value, 208 + err = bpf_fd_array_map_update_elem(map, map_file, key, value, 209 209 flags); 210 210 rcu_read_unlock(); 211 211 } else if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS) { 212 212 rcu_read_lock(); 213 - err = bpf_fd_htab_map_update_elem(map, f.file, key, value, 213 + err = bpf_fd_htab_map_update_elem(map, map_file, key, value, 214 214 flags); 215 215 rcu_read_unlock(); 216 216 } else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) { ··· 536 536 module_put(rec->fields[i].kptr.module); 537 537 btf_put(rec->fields[i].kptr.btf); 538 538 break; 539 + case BPF_LIST_HEAD: 540 + case BPF_LIST_NODE: 541 + /* Nothing to release for bpf_list_head */ 542 + break; 539 543 default: 540 544 WARN_ON_ONCE(1); 541 545 continue; ··· 582 578 goto free; 583 579 } 584 580 break; 581 + case BPF_LIST_HEAD: 582 + case BPF_LIST_NODE: 583 + /* Nothing to acquire for bpf_list_head */ 584 + break; 585 585 default: 586 586 ret = -EFAULT; 587 587 WARN_ON_ONCE(1); ··· 611 603 if (rec_a->cnt != rec_b->cnt) 612 604 return false; 613 605 size = offsetof(struct btf_record, fields[rec_a->cnt]); 606 + /* btf_parse_fields uses kzalloc to allocate a btf_record, so unused 607 + * members are zeroed out. 
So memcmp is safe to do without worrying 608 + * about padding/unused fields. 609 + * 610 + * While spin_lock, timer, and kptr have no relation to map BTF, 611 + * list_head metadata is specific to map BTF, the btf and value_rec 612 + * members in particular. btf is the map BTF, while value_rec points to 613 + * btf_record in that map BTF. 614 + * 615 + * So while by default, we don't rely on the map BTF (which the records 616 + * were parsed from) matching for both records, which is not backwards 617 + * compatible, in case list_head is part of it, we implicitly rely on 618 + * that by way of depending on memcmp succeeding for it. 619 + */ 614 620 return !memcmp(rec_a, rec_b, size); 615 621 } 616 622 ··· 659 637 case BPF_KPTR_REF: 660 638 field->kptr.dtor((void *)xchg((unsigned long *)field_ptr, 0)); 661 639 break; 640 + case BPF_LIST_HEAD: 641 + if (WARN_ON_ONCE(rec->spin_lock_off < 0)) 642 + continue; 643 + bpf_list_head_free(field, field_ptr, obj + rec->spin_lock_off); 644 + break; 645 + case BPF_LIST_NODE: 646 + break; 662 647 default: 663 648 WARN_ON_ONCE(1); 664 649 continue; ··· 677 648 static void bpf_map_free_deferred(struct work_struct *work) 678 649 { 679 650 struct bpf_map *map = container_of(work, struct bpf_map, work); 651 + struct btf_field_offs *foffs = map->field_offs; 652 + struct btf_record *rec = map->record; 680 653 681 654 security_bpf_map_free(map); 682 - kfree(map->field_offs); 683 655 bpf_map_release_memcg(map); 684 - /* implementation dependent freeing, map_free callback also does 685 - * bpf_map_free_record, if needed. 686 - */ 656 + /* implementation dependent freeing */ 687 657 map->ops->map_free(map); 658 + /* Delay freeing of field_offs and btf_record for maps, as map_free 659 + * callback usually needs access to them. It is better to do it here 660 + * than require each callback to do the free itself manually. 
661 + * 662 + * Note that the btf_record stashed in map->inner_map_meta->record was 663 + * already freed using the map_free callback for map in map case which 664 + * eventually calls bpf_map_free_meta, since inner_map_meta is only a 665 + * template bpf_map struct used during verification. 666 + */ 667 + kfree(foffs); 668 + btf_record_free(rec); 688 669 } 689 670 690 671 static void bpf_map_put_uref(struct bpf_map *map) ··· 1004 965 if (!value_type || value_size != map->value_size) 1005 966 return -EINVAL; 1006 967 1007 - map->record = btf_parse_fields(btf, value_type, BPF_SPIN_LOCK | BPF_TIMER | BPF_KPTR, 968 + map->record = btf_parse_fields(btf, value_type, 969 + BPF_SPIN_LOCK | BPF_TIMER | BPF_KPTR | BPF_LIST_HEAD, 1008 970 map->value_size); 1009 971 if (!IS_ERR_OR_NULL(map->record)) { 1010 972 int i; ··· 1038 998 if (map->map_type != BPF_MAP_TYPE_HASH && 1039 999 map->map_type != BPF_MAP_TYPE_LRU_HASH && 1040 1000 map->map_type != BPF_MAP_TYPE_ARRAY) { 1041 - return -EOPNOTSUPP; 1001 + ret = -EOPNOTSUPP; 1042 1002 goto free_map_tab; 1043 1003 } 1044 1004 break; ··· 1052 1012 goto free_map_tab; 1053 1013 } 1054 1014 break; 1015 + case BPF_LIST_HEAD: 1016 + if (map->map_type != BPF_MAP_TYPE_HASH && 1017 + map->map_type != BPF_MAP_TYPE_LRU_HASH && 1018 + map->map_type != BPF_MAP_TYPE_ARRAY) { 1019 + ret = -EOPNOTSUPP; 1020 + goto free_map_tab; 1021 + } 1022 + break; 1055 1023 default: 1056 1024 /* Fail if map_type checks are missing for a field type */ 1057 1025 ret = -EOPNOTSUPP; ··· 1067 1019 } 1068 1020 } 1069 1021 } 1022 + 1023 + ret = btf_check_and_fixup_fields(btf, map->record); 1024 + if (ret < 0) 1025 + goto free_map_tab; 1070 1026 1071 1027 if (map->ops->map_check_btf) { 1072 1028 ret = map->ops->map_check_btf(map, btf, key_type, value_type); ··· 1442 1390 goto free_key; 1443 1391 } 1444 1392 1445 - err = bpf_map_update_value(map, f, key, value, attr->flags); 1393 + err = bpf_map_update_value(map, f.file, key, value, attr->flags); 1446 1394 1447 1395 
kvfree(value); 1448 1396 free_key: ··· 1628 1576 return err; 1629 1577 } 1630 1578 1631 - int generic_map_update_batch(struct bpf_map *map, 1579 + int generic_map_update_batch(struct bpf_map *map, struct file *map_file, 1632 1580 const union bpf_attr *attr, 1633 1581 union bpf_attr __user *uattr) 1634 1582 { 1635 1583 void __user *values = u64_to_user_ptr(attr->batch.values); 1636 1584 void __user *keys = u64_to_user_ptr(attr->batch.keys); 1637 1585 u32 value_size, cp, max_count; 1638 - int ufd = attr->batch.map_fd; 1639 1586 void *key, *value; 1640 - struct fd f; 1641 1587 int err = 0; 1642 1588 1643 1589 if (attr->batch.elem_flags & ~BPF_F_LOCK) ··· 1662 1612 return -ENOMEM; 1663 1613 } 1664 1614 1665 - f = fdget(ufd); /* bpf_map_do_batch() guarantees ufd is valid */ 1666 1615 for (cp = 0; cp < max_count; cp++) { 1667 1616 err = -EFAULT; 1668 1617 if (copy_from_user(key, keys + cp * map->key_size, ··· 1669 1620 copy_from_user(value, values + cp * value_size, value_size)) 1670 1621 break; 1671 1622 1672 - err = bpf_map_update_value(map, f, key, value, 1623 + err = bpf_map_update_value(map, map_file, key, value, 1673 1624 attr->batch.elem_flags); 1674 1625 1675 1626 if (err) ··· 1682 1633 1683 1634 kvfree(value); 1684 1635 kvfree(key); 1685 - fdput(f); 1686 1636 return err; 1687 1637 } 1688 1638 ··· 4474 4426 4475 4427 #define BPF_MAP_BATCH_LAST_FIELD batch.flags 4476 4428 4477 - #define BPF_DO_BATCH(fn) \ 4429 + #define BPF_DO_BATCH(fn, ...) 
\ 4478 4430 do { \ 4479 4431 if (!fn) { \ 4480 4432 err = -ENOTSUPP; \ 4481 4433 goto err_put; \ 4482 4434 } \ 4483 - err = fn(map, attr, uattr); \ 4435 + err = fn(__VA_ARGS__); \ 4484 4436 } while (0) 4485 4437 4486 4438 static int bpf_map_do_batch(const union bpf_attr *attr, ··· 4514 4466 } 4515 4467 4516 4468 if (cmd == BPF_MAP_LOOKUP_BATCH) 4517 - BPF_DO_BATCH(map->ops->map_lookup_batch); 4469 + BPF_DO_BATCH(map->ops->map_lookup_batch, map, attr, uattr); 4518 4470 else if (cmd == BPF_MAP_LOOKUP_AND_DELETE_BATCH) 4519 - BPF_DO_BATCH(map->ops->map_lookup_and_delete_batch); 4471 + BPF_DO_BATCH(map->ops->map_lookup_and_delete_batch, map, attr, uattr); 4520 4472 else if (cmd == BPF_MAP_UPDATE_BATCH) 4521 - BPF_DO_BATCH(map->ops->map_update_batch); 4473 + BPF_DO_BATCH(map->ops->map_update_batch, map, f.file, attr, uattr); 4522 4474 else 4523 - BPF_DO_BATCH(map->ops->map_delete_batch); 4475 + BPF_DO_BATCH(map->ops->map_delete_batch, map, attr, uattr); 4524 4476 err_put: 4525 4477 if (has_write) 4526 4478 bpf_map_write_active_dec(map);
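The `btf_record_equal()` comment above notes that `memcmp()` is only safe because records come from kzalloc, so padding and unused members compare as zero. A small userspace illustration of that point, with `calloc()` standing in for kzalloc (types are illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct rec { char type; long off; };  /* padding after 'type' on most ABIs */

/* Byte-wise equality is sound only when both records were allocated
 * zeroed (calloc here, kzalloc in the kernel), so padding bytes and
 * unused members cannot differ spuriously. */
static bool rec_equal(const struct rec *a, const struct rec *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}

static struct rec *rec_new(char type, long off)
{
	struct rec *r = calloc(1, sizeof(*r));  /* zeroed, padding included */

	if (r) {
		r->type = type;
		r->off = off;
	}
	return r;
}
```

Had the records come from plain `malloc()`, the padding bytes would be indeterminate and two logically equal records could compare unequal.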
+1414 -117
kernel/bpf/verifier.c
··· 451 451 type == PTR_TO_SOCK_COMMON; 452 452 } 453 453 454 + static struct btf_record *reg_btf_record(const struct bpf_reg_state *reg) 455 + { 456 + struct btf_record *rec = NULL; 457 + struct btf_struct_meta *meta; 458 + 459 + if (reg->type == PTR_TO_MAP_VALUE) { 460 + rec = reg->map_ptr->record; 461 + } else if (reg->type == (PTR_TO_BTF_ID | MEM_ALLOC)) { 462 + meta = btf_find_struct_meta(reg->btf, reg->btf_id); 463 + if (meta) 464 + rec = meta->record; 465 + } 466 + return rec; 467 + } 468 + 454 469 static bool reg_may_point_to_spin_lock(const struct bpf_reg_state *reg) 455 470 { 456 - return reg->type == PTR_TO_MAP_VALUE && 457 - btf_record_has_field(reg->map_ptr->record, BPF_SPIN_LOCK); 471 + return btf_record_has_field(reg_btf_record(reg), BPF_SPIN_LOCK); 458 472 } 459 473 460 474 static bool type_is_rdonly_mem(u32 type) ··· 527 513 func_id == BPF_FUNC_user_ringbuf_drain; 528 514 } 529 515 516 + static bool is_storage_get_function(enum bpf_func_id func_id) 517 + { 518 + return func_id == BPF_FUNC_sk_storage_get || 519 + func_id == BPF_FUNC_inode_storage_get || 520 + func_id == BPF_FUNC_task_storage_get || 521 + func_id == BPF_FUNC_cgrp_storage_get; 522 + } 523 + 530 524 static bool helper_multiple_ref_obj_use(enum bpf_func_id func_id, 531 525 const struct bpf_map *map) 532 526 { ··· 565 543 static const char *reg_type_str(struct bpf_verifier_env *env, 566 544 enum bpf_reg_type type) 567 545 { 568 - char postfix[16] = {0}, prefix[32] = {0}; 546 + char postfix[16] = {0}, prefix[64] = {0}; 569 547 static const char * const str[] = { 570 548 [NOT_INIT] = "?", 571 549 [SCALAR_VALUE] = "scalar", ··· 597 575 strncpy(postfix, "_or_null", 16); 598 576 } 599 577 600 - if (type & MEM_RDONLY) 601 - strncpy(prefix, "rdonly_", 32); 602 - if (type & MEM_ALLOC) 603 - strncpy(prefix, "alloc_", 32); 604 - if (type & MEM_USER) 605 - strncpy(prefix, "user_", 32); 606 - if (type & MEM_PERCPU) 607 - strncpy(prefix, "percpu_", 32); 608 - if (type & PTR_UNTRUSTED) 609 - 
strncpy(prefix, "untrusted_", 32); 578 + snprintf(prefix, sizeof(prefix), "%s%s%s%s%s%s%s", 579 + type & MEM_RDONLY ? "rdonly_" : "", 580 + type & MEM_RINGBUF ? "ringbuf_" : "", 581 + type & MEM_USER ? "user_" : "", 582 + type & MEM_PERCPU ? "percpu_" : "", 583 + type & MEM_RCU ? "rcu_" : "", 584 + type & PTR_UNTRUSTED ? "untrusted_" : "", 585 + type & PTR_TRUSTED ? "trusted_" : "" 586 + ); 610 587 611 588 snprintf(env->type_str_buf, TYPE_STR_BUF_LEN, "%s%s%s", 612 589 prefix, str[base_type(type)], postfix); ··· 1031 1010 if (unlikely(check_mul_overflow(n, size, &bytes))) 1032 1011 return NULL; 1033 1012 1034 - if (ksize(dst) < bytes) { 1013 + if (ksize(dst) < ksize(src)) { 1035 1014 kfree(dst); 1036 - dst = kmalloc_track_caller(bytes, flags); 1015 + dst = kmalloc_track_caller(kmalloc_size_roundup(bytes), flags); 1037 1016 if (!dst) 1038 1017 return NULL; 1039 1018 } ··· 1050 1029 */ 1051 1030 static void *realloc_array(void *arr, size_t old_n, size_t new_n, size_t size) 1052 1031 { 1032 + size_t alloc_size; 1053 1033 void *new_arr; 1054 1034 1055 1035 if (!new_n || old_n == new_n) 1056 1036 goto out; 1057 1037 1058 - new_arr = krealloc_array(arr, new_n, size, GFP_KERNEL); 1038 + alloc_size = kmalloc_size_roundup(size_mul(new_n, size)); 1039 + new_arr = krealloc(arr, alloc_size, GFP_KERNEL); 1059 1040 if (!new_arr) { 1060 1041 kfree(arr); 1061 1042 return NULL; ··· 1229 1206 dst_state->frame[i] = NULL; 1230 1207 } 1231 1208 dst_state->speculative = src->speculative; 1209 + dst_state->active_rcu_lock = src->active_rcu_lock; 1232 1210 dst_state->curframe = src->curframe; 1233 - dst_state->active_spin_lock = src->active_spin_lock; 1211 + dst_state->active_lock.ptr = src->active_lock.ptr; 1212 + dst_state->active_lock.id = src->active_lock.id; 1234 1213 dst_state->branches = src->branches; 1235 1214 dst_state->parent = src->parent; 1236 1215 dst_state->first_insn_idx = src->first_insn_idx; ··· 2531 2506 { 2532 2507 u32 cnt = cur->jmp_history_cnt; 2533 2508 struct 
bpf_idx_pair *p; 2509 + size_t alloc_size; 2534 2510 2535 2511 cnt++; 2536 - p = krealloc(cur->jmp_history, cnt * sizeof(*p), GFP_USER); 2512 + alloc_size = kmalloc_size_roundup(size_mul(cnt, sizeof(*p))); 2513 + p = krealloc(cur->jmp_history, alloc_size, GFP_USER); 2537 2514 if (!p) 2538 2515 return -ENOMEM; 2539 2516 p[cnt - 1].idx = env->insn_idx; ··· 3871 3844 struct bpf_reg_state *reg, u32 regno) 3872 3845 { 3873 3846 const char *targ_name = kernel_type_name(kptr_field->kptr.btf, kptr_field->kptr.btf_id); 3874 - int perm_flags = PTR_MAYBE_NULL; 3847 + int perm_flags = PTR_MAYBE_NULL | PTR_TRUSTED; 3875 3848 const char *reg_name = ""; 3876 3849 3877 3850 /* Only unreferenced case accepts untrusted pointers */ ··· 4266 4239 4267 4240 /* Separate to is_ctx_reg() since we still want to allow BPF_ST here. */ 4268 4241 return reg->type == PTR_TO_FLOW_KEYS; 4242 + } 4243 + 4244 + static bool is_trusted_reg(const struct bpf_reg_state *reg) 4245 + { 4246 + /* A referenced register is always trusted. */ 4247 + if (reg->ref_obj_id) 4248 + return true; 4249 + 4250 + /* If a register is not referenced, it is trusted if it has the 4251 + * MEM_ALLOC, MEM_RCU or PTR_TRUSTED type modifiers, and no others. Some of the 4252 + * other type modifiers may be safe, but we elect to take an opt-in 4253 + * approach here as some (e.g. PTR_UNTRUSTED and PTR_MAYBE_NULL) are 4254 + * not. 4255 + * 4256 + * Eventually, we should make PTR_TRUSTED the single source of truth 4257 + * for whether a register is trusted. 
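The reg_type_str() hunk earlier in this series replaces a chain of strncpy() calls, where each modifier clobbered the previous prefix, with a single snprintf() that concatenates every applicable modifier; that is also why the prefix buffer grows from 32 to 64 bytes. A standalone sketch with illustrative flag values (not the kernel's actual bit assignments):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative modifier bits; the kernel defines these in bpf.h. */
#define MEM_RDONLY    (1 << 0)
#define MEM_RINGBUF   (1 << 1)
#define MEM_USER      (1 << 2)
#define MEM_PERCPU    (1 << 3)
#define MEM_RCU       (1 << 4)
#define PTR_UNTRUSTED (1 << 5)
#define PTR_TRUSTED   (1 << 6)

/* One snprintf() emits every set modifier in a fixed order, so combined
 * flags such as MEM_RDONLY | PTR_TRUSTED render as "rdonly_trusted_". */
static void type_prefix(unsigned type, char *prefix, size_t n)
{
	snprintf(prefix, n, "%s%s%s%s%s%s%s",
		 type & MEM_RDONLY    ? "rdonly_"    : "",
		 type & MEM_RINGBUF   ? "ringbuf_"   : "",
		 type & MEM_USER      ? "user_"      : "",
		 type & MEM_PERCPU    ? "percpu_"    : "",
		 type & MEM_RCU       ? "rcu_"       : "",
		 type & PTR_UNTRUSTED ? "untrusted_" : "",
		 type & PTR_TRUSTED   ? "trusted_"   : "");
}
```

Under the old strncpy() scheme only the last matching modifier survived; the snprintf() form makes every modifier visible in verifier logs.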
4258 + */ 4259 + return type_flag(reg->type) & BPF_REG_TRUSTED_MODIFIERS && 4260 + !bpf_type_has_unsafe_modifiers(reg->type); 4269 4261 } 4270 4262 4271 4263 static int check_pkt_ptr_alignment(struct bpf_verifier_env *env, ··· 4733 4687 return -EACCES; 4734 4688 } 4735 4689 4736 - if (env->ops->btf_struct_access) { 4737 - ret = env->ops->btf_struct_access(&env->log, reg->btf, t, 4738 - off, size, atype, &btf_id, &flag); 4690 + if (env->ops->btf_struct_access && !type_is_alloc(reg->type)) { 4691 + if (!btf_is_kernel(reg->btf)) { 4692 + verbose(env, "verifier internal error: reg->btf must be kernel btf\n"); 4693 + return -EFAULT; 4694 + } 4695 + ret = env->ops->btf_struct_access(&env->log, reg, off, size, atype, &btf_id, &flag); 4739 4696 } else { 4740 - if (atype != BPF_READ) { 4697 + /* Writes are permitted with default btf_struct_access for 4698 + * program allocated objects (which always have ref_obj_id > 0), 4699 + * but not for untrusted PTR_TO_BTF_ID | MEM_ALLOC. 4700 + */ 4701 + if (atype != BPF_READ && reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) { 4741 4702 verbose(env, "only read is supported\n"); 4742 4703 return -EACCES; 4743 4704 } 4744 4705 4745 - ret = btf_struct_access(&env->log, reg->btf, t, off, size, 4746 - atype, &btf_id, &flag); 4706 + if (type_is_alloc(reg->type) && !reg->ref_obj_id) { 4707 + verbose(env, "verifier internal error: ref_obj_id for allocated object must be non-zero\n"); 4708 + return -EFAULT; 4709 + } 4710 + 4711 + ret = btf_struct_access(&env->log, reg, off, size, atype, &btf_id, &flag); 4747 4712 } 4748 4713 4749 4714 if (ret < 0) ··· 4765 4708 */ 4766 4709 if (type_flag(reg->type) & PTR_UNTRUSTED) 4767 4710 flag |= PTR_UNTRUSTED; 4711 + 4712 + /* By default any pointer obtained from walking a trusted pointer is 4713 + * no longer trusted except the rcu case below. 
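The trust policy spelled out in the is_trusted_reg() comment reduces to a pair of bitmask checks. A userspace model, with illustrative flag values standing in for BPF_REG_TRUSTED_MODIFIERS and bpf_type_has_unsafe_modifiers():

```c
#include <assert.h>

/* Illustrative flag bits, not the kernel's. */
#define MEM_ALLOC      (1 << 0)
#define MEM_RCU        (1 << 1)
#define PTR_TRUSTED    (1 << 2)
#define PTR_UNTRUSTED  (1 << 3)
#define PTR_MAYBE_NULL (1 << 4)

#define TRUSTED_MODIFIERS (MEM_ALLOC | MEM_RCU | PTR_TRUSTED)
#define UNSAFE_MODIFIERS  (PTR_UNTRUSTED | PTR_MAYBE_NULL)

/* Opt-in policy: a referenced register is always trusted; otherwise it
 * must carry at least one trusted modifier and no unsafe one. */
static int is_trusted(unsigned ref_obj_id, unsigned type_flags)
{
	if (ref_obj_id)
		return 1;
	return (type_flags & TRUSTED_MODIFIERS) &&
	       !(type_flags & UNSAFE_MODIFIERS);
}
```

Note the opt-in shape: a flag the model does not know about leaves the register untrusted, which is the conservative default the comment argues for.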
4714 + */ 4715 + flag &= ~PTR_TRUSTED; 4716 + 4717 + if (flag & MEM_RCU) { 4718 + /* Mark value register as MEM_RCU only if it is protected by 4719 + * bpf_rcu_read_lock() and the ptr reg is trusted. MEM_RCU 4720 + * itself can already indicate trustedness inside the rcu 4721 + * read lock region. Also mark it as PTR_TRUSTED. 4722 + */ 4723 + if (!env->cur_state->active_rcu_lock || !is_trusted_reg(reg)) 4724 + flag &= ~MEM_RCU; 4725 + else 4726 + flag |= PTR_TRUSTED; 4727 + } else if (reg->type & MEM_RCU) { 4728 + /* ptr (reg) is marked as MEM_RCU, but the struct field is not tagged 4729 + * with __rcu. Mark the flag as PTR_UNTRUSTED conservatively. 4730 + */ 4731 + flag |= PTR_UNTRUSTED; 4732 + } 4768 4733 4769 4734 if (atype == BPF_READ && value_regno >= 0) 4770 4735 mark_btf_ld_reg(env, regs, value_regno, ret, reg->btf, btf_id, flag); ··· 4802 4723 { 4803 4724 struct bpf_reg_state *reg = regs + regno; 4804 4725 struct bpf_map *map = reg->map_ptr; 4726 + struct bpf_reg_state map_reg; 4805 4727 enum bpf_type_flag flag = 0; 4806 4728 const struct btf_type *t; 4807 4729 const char *tname; ··· 4841 4761 return -EACCES; 4842 4762 } 4843 4763 4844 - ret = btf_struct_access(&env->log, btf_vmlinux, t, off, size, atype, &btf_id, &flag); 4764 + /* Simulate access to a PTR_TO_BTF_ID */ 4765 + memset(&map_reg, 0, sizeof(map_reg)); 4766 + mark_btf_ld_reg(env, &map_reg, 0, PTR_TO_BTF_ID, btf_vmlinux, *map->ops->map_btf_id, 0); 4767 + ret = btf_struct_access(&env->log, &map_reg, off, size, atype, &btf_id, &flag); 4845 4768 if (ret < 0) 4846 4769 return ret; 4847 4770 ··· 5603 5520 return err; 5604 5521 } 5605 5522 5606 - int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg, 5607 - u32 regno) 5523 + static int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg, 5524 + u32 regno) 5608 5525 { 5609 5526 struct bpf_reg_state *mem_reg = &cur_regs(env)[regno - 1]; 5610 5527 bool may_be_null = 
type_may_be_null(mem_reg->type); ··· 5632 5549 } 5633 5550 5634 5551 /* Implementation details: 5635 - * bpf_map_lookup returns PTR_TO_MAP_VALUE_OR_NULL 5552 + * bpf_map_lookup returns PTR_TO_MAP_VALUE_OR_NULL. 5553 + * bpf_obj_new returns PTR_TO_BTF_ID | MEM_ALLOC | PTR_MAYBE_NULL. 5636 5554 * Two bpf_map_lookups (even with the same key) will have different reg->id. 5637 - * For traditional PTR_TO_MAP_VALUE the verifier clears reg->id after 5638 - * value_or_null->value transition, since the verifier only cares about 5639 - * the range of access to valid map value pointer and doesn't care about actual 5640 - * address of the map element. 5555 + * Two separate bpf_obj_new will also have different reg->id. 5556 + * For traditional PTR_TO_MAP_VALUE or PTR_TO_BTF_ID | MEM_ALLOC, the verifier 5557 + * clears reg->id after value_or_null->value transition, since the verifier only 5558 + * cares about the range of access to valid map value pointer and doesn't care 5559 + * about actual address of the map element. 5641 5560 * For maps with 'struct bpf_spin_lock' inside map value the verifier keeps 5642 5561 * reg->id > 0 after value_or_null->value transition. By doing so 5643 5562 * two bpf_map_lookups will be considered two different pointers that 5644 - * point to different bpf_spin_locks. 5563 + * point to different bpf_spin_locks. Likewise for pointers to allocated objects 5564 + * returned from bpf_obj_new. 5645 5565 * The verifier allows taking only one bpf_spin_lock at a time to avoid 5646 5566 * dead-locks. 5647 5567 * Since only one bpf_spin_lock is allowed the checks are simpler than 5648 5568 * reg_is_refcounted() logic. The verifier needs to remember only 5649 5569 * one spin_lock instead of array of acquired_refs. 5650 - * cur_state->active_spin_lock remembers which map value element got locked 5651 - * and clears it after bpf_spin_unlock. 
5570 + * cur_state->active_lock remembers which map value element or allocated 5571 + * object got locked and clears it after bpf_spin_unlock. 5652 5572 */ 5653 5573 static int process_spin_lock(struct bpf_verifier_env *env, int regno, 5654 5574 bool is_lock) ··· 5659 5573 struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno]; 5660 5574 struct bpf_verifier_state *cur = env->cur_state; 5661 5575 bool is_const = tnum_is_const(reg->var_off); 5662 - struct bpf_map *map = reg->map_ptr; 5663 5576 u64 val = reg->var_off.value; 5577 + struct bpf_map *map = NULL; 5578 + struct btf *btf = NULL; 5579 + struct btf_record *rec; 5664 5580 5665 5581 if (!is_const) { 5666 5582 verbose(env, ··· 5670 5582 regno); 5671 5583 return -EINVAL; 5672 5584 } 5673 - if (!map->btf) { 5674 - verbose(env, 5675 - "map '%s' has to have BTF in order to use bpf_spin_lock\n", 5676 - map->name); 5585 + if (reg->type == PTR_TO_MAP_VALUE) { 5586 + map = reg->map_ptr; 5587 + if (!map->btf) { 5588 + verbose(env, 5589 + "map '%s' has to have BTF in order to use bpf_spin_lock\n", 5590 + map->name); 5591 + return -EINVAL; 5592 + } 5593 + } else { 5594 + btf = reg->btf; 5595 + } 5596 + 5597 + rec = reg_btf_record(reg); 5598 + if (!btf_record_has_field(rec, BPF_SPIN_LOCK)) { 5599 + verbose(env, "%s '%s' has no valid bpf_spin_lock\n", map ? "map" : "local", 5600 + map ? 
map->name : "kptr"); 5677 5601 return -EINVAL; 5678 5602 } 5679 - if (!btf_record_has_field(map->record, BPF_SPIN_LOCK)) { 5680 - verbose(env, "map '%s' has no valid bpf_spin_lock\n", map->name); 5681 - return -EINVAL; 5682 - } 5683 - if (map->record->spin_lock_off != val + reg->off) { 5603 + if (rec->spin_lock_off != val + reg->off) { 5684 5604 verbose(env, "off %lld doesn't point to 'struct bpf_spin_lock' that is at %d\n", 5685 - val + reg->off, map->record->spin_lock_off); 5605 + val + reg->off, rec->spin_lock_off); 5686 5606 return -EINVAL; 5687 5607 } 5688 5608 if (is_lock) { 5689 - if (cur->active_spin_lock) { 5609 + if (cur->active_lock.ptr) { 5690 5610 verbose(env, 5691 5611 "Locking two bpf_spin_locks are not allowed\n"); 5692 5612 return -EINVAL; 5693 5613 } 5694 - cur->active_spin_lock = reg->id; 5614 + if (map) 5615 + cur->active_lock.ptr = map; 5616 + else 5617 + cur->active_lock.ptr = btf; 5618 + cur->active_lock.id = reg->id; 5695 5619 } else { 5696 - if (!cur->active_spin_lock) { 5620 + struct bpf_func_state *fstate = cur_func(env); 5621 + void *ptr; 5622 + int i; 5623 + 5624 + if (map) 5625 + ptr = map; 5626 + else 5627 + ptr = btf; 5628 + 5629 + if (!cur->active_lock.ptr) { 5697 5630 verbose(env, "bpf_spin_unlock without taking a lock\n"); 5698 5631 return -EINVAL; 5699 5632 } 5700 - if (cur->active_spin_lock != reg->id) { 5633 + if (cur->active_lock.ptr != ptr || 5634 + cur->active_lock.id != reg->id) { 5701 5635 verbose(env, "bpf_spin_unlock of different lock\n"); 5702 5636 return -EINVAL; 5703 5637 } 5704 - cur->active_spin_lock = 0; 5638 + cur->active_lock.ptr = NULL; 5639 + cur->active_lock.id = 0; 5640 + 5641 + for (i = 0; i < fstate->acquired_refs; i++) { 5642 + int err; 5643 + 5644 + /* Complain on error because this reference state cannot 5645 + * be freed before this point, as bpf_spin_lock critical 5646 + * section does not allow functions that release the 5647 + * allocated object immediately. 
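Because a map value and an allocated object could, in principle, produce the same reg->id, the lock state is now a (ptr, id) pair rather than the old scalar active_spin_lock. A minimal userspace model of the matching done in process_spin_lock():

```c
#include <assert.h>
#include <stddef.h>

/* ptr is the base object (map or btf), id the register's unique id. */
struct active_lock { void *ptr; unsigned id; };

static int do_lock(struct active_lock *cur, void *ptr, unsigned id)
{
	if (cur->ptr)
		return -1;	/* only one bpf_spin_lock at a time */
	cur->ptr = ptr;
	cur->id = id;
	return 0;
}

static int do_unlock(struct active_lock *cur, void *ptr, unsigned id)
{
	if (!cur->ptr)
		return -1;	/* unlock without taking a lock */
	if (cur->ptr != ptr || cur->id != id)
		return -1;	/* unlock of a different lock */
	cur->ptr = NULL;
	cur->id = 0;
	return 0;
}
```

Both components must match at unlock time: the same id on a different base object, or a different id on the same object, is rejected.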
5648 + */ 5649 + if (!fstate->refs[i].release_on_unlock) 5650 + continue; 5651 + err = release_reference(env, fstate->refs[i].id); 5652 + if (err) { 5653 + verbose(env, "failed to release release_on_unlock reference"); 5654 + return err; 5655 + } 5656 + } 5705 5657 } 5706 5658 return 0; 5707 5659 } ··· 5900 5772 PTR_TO_TCP_SOCK, 5901 5773 PTR_TO_XDP_SOCK, 5902 5774 PTR_TO_BTF_ID, 5775 + PTR_TO_BTF_ID | PTR_TRUSTED, 5903 5776 }, 5904 5777 .btf_id = &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON], 5905 5778 }; ··· 5914 5785 PTR_TO_MAP_KEY, 5915 5786 PTR_TO_MAP_VALUE, 5916 5787 PTR_TO_MEM, 5917 - PTR_TO_MEM | MEM_ALLOC, 5788 + PTR_TO_MEM | MEM_RINGBUF, 5918 5789 PTR_TO_BUF, 5919 5790 }, 5920 5791 }; ··· 5929 5800 }, 5930 5801 }; 5931 5802 5803 + static const struct bpf_reg_types spin_lock_types = { 5804 + .types = { 5805 + PTR_TO_MAP_VALUE, 5806 + PTR_TO_BTF_ID | MEM_ALLOC, 5807 + } 5808 + }; 5809 + 5932 5810 static const struct bpf_reg_types fullsock_types = { .types = { PTR_TO_SOCKET } }; 5933 5811 static const struct bpf_reg_types scalar_types = { .types = { SCALAR_VALUE } }; 5934 5812 static const struct bpf_reg_types context_types = { .types = { PTR_TO_CTX } }; 5935 - static const struct bpf_reg_types alloc_mem_types = { .types = { PTR_TO_MEM | MEM_ALLOC } }; 5813 + static const struct bpf_reg_types ringbuf_mem_types = { .types = { PTR_TO_MEM | MEM_RINGBUF } }; 5936 5814 static const struct bpf_reg_types const_map_ptr_types = { .types = { CONST_PTR_TO_MAP } }; 5937 - static const struct bpf_reg_types btf_ptr_types = { .types = { PTR_TO_BTF_ID } }; 5938 - static const struct bpf_reg_types spin_lock_types = { .types = { PTR_TO_MAP_VALUE } }; 5939 - static const struct bpf_reg_types percpu_btf_ptr_types = { .types = { PTR_TO_BTF_ID | MEM_PERCPU } }; 5815 + static const struct bpf_reg_types btf_ptr_types = { 5816 + .types = { 5817 + PTR_TO_BTF_ID, 5818 + PTR_TO_BTF_ID | PTR_TRUSTED, 5819 + PTR_TO_BTF_ID | MEM_RCU | PTR_TRUSTED, 5820 + }, 5821 + }; 5822 + static const struct 
bpf_reg_types percpu_btf_ptr_types = { 5823 + .types = { 5824 + PTR_TO_BTF_ID | MEM_PERCPU, 5825 + PTR_TO_BTF_ID | MEM_PERCPU | PTR_TRUSTED, 5826 + } 5827 + }; 5940 5828 static const struct bpf_reg_types func_ptr_types = { .types = { PTR_TO_FUNC } }; 5941 5829 static const struct bpf_reg_types stack_ptr_types = { .types = { PTR_TO_STACK } }; 5942 5830 static const struct bpf_reg_types const_str_ptr_types = { .types = { PTR_TO_MAP_VALUE } }; ··· 5982 5836 [ARG_PTR_TO_BTF_ID] = &btf_ptr_types, 5983 5837 [ARG_PTR_TO_SPIN_LOCK] = &spin_lock_types, 5984 5838 [ARG_PTR_TO_MEM] = &mem_types, 5985 - [ARG_PTR_TO_ALLOC_MEM] = &alloc_mem_types, 5839 + [ARG_PTR_TO_RINGBUF_MEM] = &ringbuf_mem_types, 5986 5840 [ARG_PTR_TO_INT] = &int_ptr_types, 5987 5841 [ARG_PTR_TO_LONG] = &int_ptr_types, 5988 5842 [ARG_PTR_TO_PERCPU_BTF_ID] = &percpu_btf_ptr_types, ··· 6041 5895 return -EACCES; 6042 5896 6043 5897 found: 6044 - if (reg->type == PTR_TO_BTF_ID) { 5898 + if (reg->type == PTR_TO_BTF_ID || reg->type & PTR_TRUSTED) { 6045 5899 /* For bpf_sk_release, it needs to match against first member 6046 5900 * 'struct sock_common', hence make an exception for it. This 6047 5901 * allows bpf_sk_release to work for multiple socket types. ··· 6077 5931 return -EACCES; 6078 5932 } 6079 5933 } 5934 + } else if (type_is_alloc(reg->type)) { 5935 + if (meta->func_id != BPF_FUNC_spin_lock && meta->func_id != BPF_FUNC_spin_unlock) { 5936 + verbose(env, "verifier internal error: unimplemented handling of MEM_ALLOC\n"); 5937 + return -EFAULT; 5938 + } 6080 5939 } 6081 5940 6082 5941 return 0; ··· 6108 5957 case PTR_TO_MAP_VALUE: 6109 5958 case PTR_TO_MEM: 6110 5959 case PTR_TO_MEM | MEM_RDONLY: 6111 - case PTR_TO_MEM | MEM_ALLOC: 5960 + case PTR_TO_MEM | MEM_RINGBUF: 6112 5961 case PTR_TO_BUF: 6113 5962 case PTR_TO_BUF | MEM_RDONLY: 6114 5963 case SCALAR_VALUE: 6115 5964 /* Some of the argument types nevertheless require a 6116 5965 * zero register offset. 
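check_reg_type() resolves an argument type to one of the bpf_reg_types tables above and then requires the register's full type, base type plus modifier flags, to appear in the set, which is why the tables grow PTR_TRUSTED variants here. A simplified sketch of that membership test (flag values illustrative, and the real check does some extra normalization):

```c
#include <assert.h>

/* Illustrative type encodings. */
#define PTR_TO_BTF_ID 1
#define MEM_PERCPU    (1 << 8)
#define PTR_TRUSTED   (1 << 9)

/* A register type is acceptable only if it matches an entry verbatim,
 * modifiers included: adding PTR_TRUSTED to a register therefore needs
 * a corresponding PTR_TRUSTED entry in the candidate table. */
static int type_in_set(unsigned reg_type, const unsigned *set, int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (set[i] == reg_type)
			return 1;
	return 0;
}
```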
6117 5966 */ 6118 - if (base_type(arg_type) != ARG_PTR_TO_ALLOC_MEM) 5967 + if (base_type(arg_type) != ARG_PTR_TO_RINGBUF_MEM) 6119 5968 return 0; 6120 5969 break; 6121 5970 /* All the rest must be rejected, except PTR_TO_BTF_ID which allows 6122 5971 * fixed offset. 6123 5972 */ 6124 5973 case PTR_TO_BTF_ID: 5974 + case PTR_TO_BTF_ID | MEM_ALLOC: 5975 + case PTR_TO_BTF_ID | PTR_TRUSTED: 5976 + case PTR_TO_BTF_ID | MEM_RCU | PTR_TRUSTED: 5977 + case PTR_TO_BTF_ID | MEM_ALLOC | PTR_TRUSTED: 6125 5978 /* When referenced PTR_TO_BTF_ID is passed to release function, 6126 5979 * it's fixed offset must be 0. In the other cases, fixed offset 6127 5980 * can be non-zero. ··· 6201 6046 goto skip_type_check; 6202 6047 6203 6048 /* arg_btf_id and arg_size are in a union. */ 6204 - if (base_type(arg_type) == ARG_PTR_TO_BTF_ID) 6049 + if (base_type(arg_type) == ARG_PTR_TO_BTF_ID || 6050 + base_type(arg_type) == ARG_PTR_TO_SPIN_LOCK) 6205 6051 arg_btf_id = fn->arg_btf_id[arg]; 6206 6052 6207 6053 err = check_reg_type(env, regno, arg_type, arg_btf_id, meta); ··· 6820 6664 int i; 6821 6665 6822 6666 for (i = 0; i < ARRAY_SIZE(fn->arg_type); i++) { 6823 - if (base_type(fn->arg_type[i]) == ARG_PTR_TO_BTF_ID && !fn->arg_btf_id[i]) 6824 - return false; 6825 - 6667 + if (base_type(fn->arg_type[i]) == ARG_PTR_TO_BTF_ID) 6668 + return !!fn->arg_btf_id[i]; 6669 + if (base_type(fn->arg_type[i]) == ARG_PTR_TO_SPIN_LOCK) 6670 + return fn->arg_btf_id[i] == BPF_PTR_POISON; 6826 6671 if (base_type(fn->arg_type[i]) != ARG_PTR_TO_BTF_ID && fn->arg_btf_id[i] && 6827 6672 /* arg_btf_id and arg_size are in a union. */ 6828 6673 (base_type(fn->arg_type[i]) != ARG_PTR_TO_MEM || ··· 7570 7413 return -EINVAL; 7571 7414 } 7572 7415 7416 + if (!env->prog->aux->sleepable && fn->might_sleep) { 7417 + verbose(env, "helper call might sleep in a non-sleepable prog\n"); 7418 + return -EINVAL; 7419 + } 7420 + 7573 7421 /* With LD_ABS/IND some JITs save/restore skb from r1. 
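The fixed-offset rules in check_func_arg_reg_off() above can be summarized as: variable offsets are rejected for the PTR_TO_BTF_ID flavors, and a referenced pointer handed to a release function must point at the start of the object. A rough userspace model (simplified; the real check covers more register classes and arg types):

```c
#include <assert.h>

/* Hypothetical condensed rule for PTR_TO_BTF_ID-flavored arguments. */
static int btf_id_arg_off_ok(int is_release, unsigned ref_obj_id,
			     int has_var_off, long fixed_off)
{
	if (has_var_off)
		return 0;	/* only fixed offsets for these types */
	if (is_release && ref_obj_id)
		return fixed_off == 0;	/* release takes the whole object */
	return 1;	/* otherwise any fixed offset is tolerated */
}
```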
*/ 7574 7422 changes_data = bpf_helper_changes_pkt_data(fn->func); 7575 7423 if (changes_data && fn->arg1_type != ARG_PTR_TO_CTX) { ··· 7591 7429 verbose(env, "kernel subsystem misconfigured func %s#%d\n", 7592 7430 func_id_name(func_id), func_id); 7593 7431 return err; 7432 + } 7433 + 7434 + if (env->cur_state->active_rcu_lock) { 7435 + if (fn->might_sleep) { 7436 + verbose(env, "sleepable helper %s#%d in rcu_read_lock region\n", 7437 + func_id_name(func_id), func_id); 7438 + return -EINVAL; 7439 + } 7440 + 7441 + if (env->prog->aux->sleepable && is_storage_get_function(func_id)) 7442 + env->insn_aux_data[insn_idx].storage_get_func_atomic = true; 7594 7443 } 7595 7444 7596 7445 meta.func_id = func_id; ··· 7807 7634 mark_reg_known_zero(env, regs, BPF_REG_0); 7808 7635 regs[BPF_REG_0].type = PTR_TO_TCP_SOCK | ret_flag; 7809 7636 break; 7810 - case RET_PTR_TO_ALLOC_MEM: 7637 + case RET_PTR_TO_MEM: 7811 7638 mark_reg_known_zero(env, regs, BPF_REG_0); 7812 7639 regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag; 7813 7640 regs[BPF_REG_0].mem_size = meta.mem_size; ··· 7970 7797 } 7971 7798 } 7972 7799 7800 + struct bpf_kfunc_call_arg_meta { 7801 + /* In parameters */ 7802 + struct btf *btf; 7803 + u32 func_id; 7804 + u32 kfunc_flags; 7805 + const struct btf_type *func_proto; 7806 + const char *func_name; 7807 + /* Out parameters */ 7808 + u32 ref_obj_id; 7809 + u8 release_regno; 7810 + bool r0_rdonly; 7811 + u32 ret_btf_id; 7812 + u64 r0_size; 7813 + struct { 7814 + u64 value; 7815 + bool found; 7816 + } arg_constant; 7817 + struct { 7818 + struct btf *btf; 7819 + u32 btf_id; 7820 + } arg_obj_drop; 7821 + struct { 7822 + struct btf_field *field; 7823 + } arg_list_head; 7824 + }; 7825 + 7826 + static bool is_kfunc_acquire(struct bpf_kfunc_call_arg_meta *meta) 7827 + { 7828 + return meta->kfunc_flags & KF_ACQUIRE; 7829 + } 7830 + 7831 + static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta) 7832 + { 7833 + return meta->kfunc_flags & KF_RET_NULL; 7834 + } 7835 + 
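The per-argument kfunc annotations used below (__sz, __k, __ign, __alloc) are encoded purely in the BTF parameter name and detected by __kfunc_param_match_suffix(). The core string check is easy to reproduce in userspace:

```c
#include <assert.h>
#include <string.h>

/* Does the parameter name end with the given suffix? Mirrors the
 * convention where e.g. "rdonly_buf_size__sz" marks a size argument
 * and "local_type_id__k" a constant one. */
static int param_match_suffix(const char *param_name, const char *suffix)
{
	size_t suffix_len = strlen(suffix), len;

	if (!param_name || !*param_name)
		return 0;
	len = strlen(param_name);
	if (len < suffix_len)
		return 0;
	return !strncmp(param_name + len - suffix_len, suffix, suffix_len);
}
```

Encoding the annotation in the name keeps the kfunc signature plain C; the comment in the hunk notes this could eventually move to BTF tagging.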
7836 + static bool is_kfunc_release(struct bpf_kfunc_call_arg_meta *meta) 7837 + { 7838 + return meta->kfunc_flags & KF_RELEASE; 7839 + } 7840 + 7841 + static bool is_kfunc_trusted_args(struct bpf_kfunc_call_arg_meta *meta) 7842 + { 7843 + return meta->kfunc_flags & KF_TRUSTED_ARGS; 7844 + } 7845 + 7846 + static bool is_kfunc_sleepable(struct bpf_kfunc_call_arg_meta *meta) 7847 + { 7848 + return meta->kfunc_flags & KF_SLEEPABLE; 7849 + } 7850 + 7851 + static bool is_kfunc_destructive(struct bpf_kfunc_call_arg_meta *meta) 7852 + { 7853 + return meta->kfunc_flags & KF_DESTRUCTIVE; 7854 + } 7855 + 7856 + static bool is_kfunc_arg_kptr_get(struct bpf_kfunc_call_arg_meta *meta, int arg) 7857 + { 7858 + return arg == 0 && (meta->kfunc_flags & KF_KPTR_GET); 7859 + } 7860 + 7861 + static bool __kfunc_param_match_suffix(const struct btf *btf, 7862 + const struct btf_param *arg, 7863 + const char *suffix) 7864 + { 7865 + int suffix_len = strlen(suffix), len; 7866 + const char *param_name; 7867 + 7868 + /* In the future, this can be ported to use BTF tagging */ 7869 + param_name = btf_name_by_offset(btf, arg->name_off); 7870 + if (str_is_empty(param_name)) 7871 + return false; 7872 + len = strlen(param_name); 7873 + if (len < suffix_len) 7874 + return false; 7875 + param_name += len - suffix_len; 7876 + return !strncmp(param_name, suffix, suffix_len); 7877 + } 7878 + 7879 + static bool is_kfunc_arg_mem_size(const struct btf *btf, 7880 + const struct btf_param *arg, 7881 + const struct bpf_reg_state *reg) 7882 + { 7883 + const struct btf_type *t; 7884 + 7885 + t = btf_type_skip_modifiers(btf, arg->type, NULL); 7886 + if (!btf_type_is_scalar(t) || reg->type != SCALAR_VALUE) 7887 + return false; 7888 + 7889 + return __kfunc_param_match_suffix(btf, arg, "__sz"); 7890 + } 7891 + 7892 + static bool is_kfunc_arg_constant(const struct btf *btf, const struct btf_param *arg) 7893 + { 7894 + return __kfunc_param_match_suffix(btf, arg, "__k"); 7895 + } 7896 + 7897 + static bool 
is_kfunc_arg_ignore(const struct btf *btf, const struct btf_param *arg) 7898 + { 7899 + return __kfunc_param_match_suffix(btf, arg, "__ign"); 7900 + } 7901 + 7902 + static bool is_kfunc_arg_alloc_obj(const struct btf *btf, const struct btf_param *arg) 7903 + { 7904 + return __kfunc_param_match_suffix(btf, arg, "__alloc"); 7905 + } 7906 + 7907 + static bool is_kfunc_arg_scalar_with_name(const struct btf *btf, 7908 + const struct btf_param *arg, 7909 + const char *name) 7910 + { 7911 + int len, target_len = strlen(name); 7912 + const char *param_name; 7913 + 7914 + param_name = btf_name_by_offset(btf, arg->name_off); 7915 + if (str_is_empty(param_name)) 7916 + return false; 7917 + len = strlen(param_name); 7918 + if (len != target_len) 7919 + return false; 7920 + if (strcmp(param_name, name)) 7921 + return false; 7922 + 7923 + return true; 7924 + } 7925 + 7926 + enum { 7927 + KF_ARG_DYNPTR_ID, 7928 + KF_ARG_LIST_HEAD_ID, 7929 + KF_ARG_LIST_NODE_ID, 7930 + }; 7931 + 7932 + BTF_ID_LIST(kf_arg_btf_ids) 7933 + BTF_ID(struct, bpf_dynptr_kern) 7934 + BTF_ID(struct, bpf_list_head) 7935 + BTF_ID(struct, bpf_list_node) 7936 + 7937 + static bool __is_kfunc_ptr_arg_type(const struct btf *btf, 7938 + const struct btf_param *arg, int type) 7939 + { 7940 + const struct btf_type *t; 7941 + u32 res_id; 7942 + 7943 + t = btf_type_skip_modifiers(btf, arg->type, NULL); 7944 + if (!t) 7945 + return false; 7946 + if (!btf_type_is_ptr(t)) 7947 + return false; 7948 + t = btf_type_skip_modifiers(btf, t->type, &res_id); 7949 + if (!t) 7950 + return false; 7951 + return btf_types_are_same(btf, res_id, btf_vmlinux, kf_arg_btf_ids[type]); 7952 + } 7953 + 7954 + static bool is_kfunc_arg_dynptr(const struct btf *btf, const struct btf_param *arg) 7955 + { 7956 + return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_DYNPTR_ID); 7957 + } 7958 + 7959 + static bool is_kfunc_arg_list_head(const struct btf *btf, const struct btf_param *arg) 7960 + { 7961 + return __is_kfunc_ptr_arg_type(btf, arg, 
KF_ARG_LIST_HEAD_ID); 7962 + } 7963 + 7964 + static bool is_kfunc_arg_list_node(const struct btf *btf, const struct btf_param *arg) 7965 + { 7966 + return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_LIST_NODE_ID); 7967 + } 7968 + 7969 + /* Returns true if struct is composed of scalars, 4 levels of nesting allowed */ 7970 + static bool __btf_type_is_scalar_struct(struct bpf_verifier_env *env, 7971 + const struct btf *btf, 7972 + const struct btf_type *t, int rec) 7973 + { 7974 + const struct btf_type *member_type; 7975 + const struct btf_member *member; 7976 + u32 i; 7977 + 7978 + if (!btf_type_is_struct(t)) 7979 + return false; 7980 + 7981 + for_each_member(i, t, member) { 7982 + const struct btf_array *array; 7983 + 7984 + member_type = btf_type_skip_modifiers(btf, member->type, NULL); 7985 + if (btf_type_is_struct(member_type)) { 7986 + if (rec >= 3) { 7987 + verbose(env, "max struct nesting depth exceeded\n"); 7988 + return false; 7989 + } 7990 + if (!__btf_type_is_scalar_struct(env, btf, member_type, rec + 1)) 7991 + return false; 7992 + continue; 7993 + } 7994 + if (btf_type_is_array(member_type)) { 7995 + array = btf_array(member_type); 7996 + if (!array->nelems) 7997 + return false; 7998 + member_type = btf_type_skip_modifiers(btf, array->type, NULL); 7999 + if (!btf_type_is_scalar(member_type)) 8000 + return false; 8001 + continue; 8002 + } 8003 + if (!btf_type_is_scalar(member_type)) 8004 + return false; 8005 + } 8006 + return true; 8007 + } 8008 + 8009 + 8010 + static u32 *reg2btf_ids[__BPF_REG_TYPE_MAX] = { 8011 + #ifdef CONFIG_NET 8012 + [PTR_TO_SOCKET] = &btf_sock_ids[BTF_SOCK_TYPE_SOCK], 8013 + [PTR_TO_SOCK_COMMON] = &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON], 8014 + [PTR_TO_TCP_SOCK] = &btf_sock_ids[BTF_SOCK_TYPE_TCP], 8015 + #endif 8016 + }; 8017 + 8018 + enum kfunc_ptr_arg_type { 8019 + KF_ARG_PTR_TO_CTX, 8020 + KF_ARG_PTR_TO_ALLOC_BTF_ID, /* Allocated object */ 8021 + KF_ARG_PTR_TO_KPTR, /* PTR_TO_KPTR but type specific */ 8022 + KF_ARG_PTR_TO_DYNPTR, 
8023 + KF_ARG_PTR_TO_LIST_HEAD, 8024 + KF_ARG_PTR_TO_LIST_NODE, 8025 + KF_ARG_PTR_TO_BTF_ID, /* Also covers reg2btf_ids conversions */ 8026 + KF_ARG_PTR_TO_MEM, 8027 + KF_ARG_PTR_TO_MEM_SIZE, /* Size derived from next argument, skip it */ 8028 + }; 8029 + 8030 + enum special_kfunc_type { 8031 + KF_bpf_obj_new_impl, 8032 + KF_bpf_obj_drop_impl, 8033 + KF_bpf_list_push_front, 8034 + KF_bpf_list_push_back, 8035 + KF_bpf_list_pop_front, 8036 + KF_bpf_list_pop_back, 8037 + KF_bpf_cast_to_kern_ctx, 8038 + KF_bpf_rdonly_cast, 8039 + KF_bpf_rcu_read_lock, 8040 + KF_bpf_rcu_read_unlock, 8041 + }; 8042 + 8043 + BTF_SET_START(special_kfunc_set) 8044 + BTF_ID(func, bpf_obj_new_impl) 8045 + BTF_ID(func, bpf_obj_drop_impl) 8046 + BTF_ID(func, bpf_list_push_front) 8047 + BTF_ID(func, bpf_list_push_back) 8048 + BTF_ID(func, bpf_list_pop_front) 8049 + BTF_ID(func, bpf_list_pop_back) 8050 + BTF_ID(func, bpf_cast_to_kern_ctx) 8051 + BTF_ID(func, bpf_rdonly_cast) 8052 + BTF_SET_END(special_kfunc_set) 8053 + 8054 + BTF_ID_LIST(special_kfunc_list) 8055 + BTF_ID(func, bpf_obj_new_impl) 8056 + BTF_ID(func, bpf_obj_drop_impl) 8057 + BTF_ID(func, bpf_list_push_front) 8058 + BTF_ID(func, bpf_list_push_back) 8059 + BTF_ID(func, bpf_list_pop_front) 8060 + BTF_ID(func, bpf_list_pop_back) 8061 + BTF_ID(func, bpf_cast_to_kern_ctx) 8062 + BTF_ID(func, bpf_rdonly_cast) 8063 + BTF_ID(func, bpf_rcu_read_lock) 8064 + BTF_ID(func, bpf_rcu_read_unlock) 8065 + 8066 + static bool is_kfunc_bpf_rcu_read_lock(struct bpf_kfunc_call_arg_meta *meta) 8067 + { 8068 + return meta->func_id == special_kfunc_list[KF_bpf_rcu_read_lock]; 8069 + } 8070 + 8071 + static bool is_kfunc_bpf_rcu_read_unlock(struct bpf_kfunc_call_arg_meta *meta) 8072 + { 8073 + return meta->func_id == special_kfunc_list[KF_bpf_rcu_read_unlock]; 8074 + } 8075 + 8076 + static enum kfunc_ptr_arg_type 8077 + get_kfunc_ptr_arg_type(struct bpf_verifier_env *env, 8078 + struct bpf_kfunc_call_arg_meta *meta, 8079 + const struct btf_type *t, const 
struct btf_type *ref_t, 8080 + const char *ref_tname, const struct btf_param *args, 8081 + int argno, int nargs) 8082 + { 8083 + u32 regno = argno + 1; 8084 + struct bpf_reg_state *regs = cur_regs(env); 8085 + struct bpf_reg_state *reg = &regs[regno]; 8086 + bool arg_mem_size = false; 8087 + 8088 + if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx]) 8089 + return KF_ARG_PTR_TO_CTX; 8090 + 8091 + /* In this function, we verify the kfunc's BTF as per the argument type, 8092 + * leaving the rest of the verification with respect to the register 8093 + * type to our caller. When a set of conditions hold in the BTF type of 8094 + * arguments, we resolve it to a known kfunc_ptr_arg_type. 8095 + */ 8096 + if (btf_get_prog_ctx_type(&env->log, meta->btf, t, resolve_prog_type(env->prog), argno)) 8097 + return KF_ARG_PTR_TO_CTX; 8098 + 8099 + if (is_kfunc_arg_alloc_obj(meta->btf, &args[argno])) 8100 + return KF_ARG_PTR_TO_ALLOC_BTF_ID; 8101 + 8102 + if (is_kfunc_arg_kptr_get(meta, argno)) { 8103 + if (!btf_type_is_ptr(ref_t)) { 8104 + verbose(env, "arg#0 BTF type must be a double pointer for kptr_get kfunc\n"); 8105 + return -EINVAL; 8106 + } 8107 + ref_t = btf_type_by_id(meta->btf, ref_t->type); 8108 + ref_tname = btf_name_by_offset(meta->btf, ref_t->name_off); 8109 + if (!btf_type_is_struct(ref_t)) { 8110 + verbose(env, "kernel function %s args#0 pointer type %s %s is not supported\n", 8111 + meta->func_name, btf_type_str(ref_t), ref_tname); 8112 + return -EINVAL; 8113 + } 8114 + return KF_ARG_PTR_TO_KPTR; 8115 + } 8116 + 8117 + if (is_kfunc_arg_dynptr(meta->btf, &args[argno])) 8118 + return KF_ARG_PTR_TO_DYNPTR; 8119 + 8120 + if (is_kfunc_arg_list_head(meta->btf, &args[argno])) 8121 + return KF_ARG_PTR_TO_LIST_HEAD; 8122 + 8123 + if (is_kfunc_arg_list_node(meta->btf, &args[argno])) 8124 + return KF_ARG_PTR_TO_LIST_NODE; 8125 + 8126 + if ((base_type(reg->type) == PTR_TO_BTF_ID || reg2btf_ids[base_type(reg->type)])) { 8127 + if (!btf_type_is_struct(ref_t)) { 
8128 + verbose(env, "kernel function %s args#%d pointer type %s %s is not supported\n", 8129 + meta->func_name, argno, btf_type_str(ref_t), ref_tname); 8130 + return -EINVAL; 8131 + } 8132 + return KF_ARG_PTR_TO_BTF_ID; 8133 + } 8134 + 8135 + if (argno + 1 < nargs && is_kfunc_arg_mem_size(meta->btf, &args[argno + 1], &regs[regno + 1])) 8136 + arg_mem_size = true; 8137 + 8138 + /* This is the catch all argument type of register types supported by 8139 + * check_helper_mem_access. However, we only allow when argument type is 8140 + * pointer to scalar, or struct composed (recursively) of scalars. When 8141 + * arg_mem_size is true, the pointer can be void *. 8142 + */ 8143 + if (!btf_type_is_scalar(ref_t) && !__btf_type_is_scalar_struct(env, meta->btf, ref_t, 0) && 8144 + (arg_mem_size ? !btf_type_is_void(ref_t) : 1)) { 8145 + verbose(env, "arg#%d pointer type %s %s must point to %sscalar, or struct with scalar\n", 8146 + argno, btf_type_str(ref_t), ref_tname, arg_mem_size ? "void, " : ""); 8147 + return -EINVAL; 8148 + } 8149 + return arg_mem_size ? 
KF_ARG_PTR_TO_MEM_SIZE : KF_ARG_PTR_TO_MEM; 8150 + } 8151 + 8152 + static int process_kf_arg_ptr_to_btf_id(struct bpf_verifier_env *env, 8153 + struct bpf_reg_state *reg, 8154 + const struct btf_type *ref_t, 8155 + const char *ref_tname, u32 ref_id, 8156 + struct bpf_kfunc_call_arg_meta *meta, 8157 + int argno) 8158 + { 8159 + const struct btf_type *reg_ref_t; 8160 + bool strict_type_match = false; 8161 + const struct btf *reg_btf; 8162 + const char *reg_ref_tname; 8163 + u32 reg_ref_id; 8164 + 8165 + if (base_type(reg->type) == PTR_TO_BTF_ID) { 8166 + reg_btf = reg->btf; 8167 + reg_ref_id = reg->btf_id; 8168 + } else { 8169 + reg_btf = btf_vmlinux; 8170 + reg_ref_id = *reg2btf_ids[base_type(reg->type)]; 8171 + } 8172 + 8173 + if (is_kfunc_trusted_args(meta) || (is_kfunc_release(meta) && reg->ref_obj_id)) 8174 + strict_type_match = true; 8175 + 8176 + reg_ref_t = btf_type_skip_modifiers(reg_btf, reg_ref_id, &reg_ref_id); 8177 + reg_ref_tname = btf_name_by_offset(reg_btf, reg_ref_t->name_off); 8178 + if (!btf_struct_ids_match(&env->log, reg_btf, reg_ref_id, reg->off, meta->btf, ref_id, strict_type_match)) { 8179 + verbose(env, "kernel function %s args#%d expected pointer to %s %s but R%d has a pointer to %s %s\n", 8180 + meta->func_name, argno, btf_type_str(ref_t), ref_tname, argno + 1, 8181 + btf_type_str(reg_ref_t), reg_ref_tname); 8182 + return -EINVAL; 8183 + } 8184 + return 0; 8185 + } 8186 + 8187 + static int process_kf_arg_ptr_to_kptr(struct bpf_verifier_env *env, 8188 + struct bpf_reg_state *reg, 8189 + const struct btf_type *ref_t, 8190 + const char *ref_tname, 8191 + struct bpf_kfunc_call_arg_meta *meta, 8192 + int argno) 8193 + { 8194 + struct btf_field *kptr_field; 8195 + 8196 + /* check_func_arg_reg_off allows var_off for 8197 + * PTR_TO_MAP_VALUE, but we need fixed offset to find 8198 + * off_desc. 
8199 + */ 8200 + if (!tnum_is_const(reg->var_off)) { 8201 + verbose(env, "arg#0 must have constant offset\n"); 8202 + return -EINVAL; 8203 + } 8204 + 8205 + kptr_field = btf_record_find(reg->map_ptr->record, reg->off + reg->var_off.value, BPF_KPTR); 8206 + if (!kptr_field || kptr_field->type != BPF_KPTR_REF) { 8207 + verbose(env, "arg#0 no referenced kptr at map value offset=%llu\n", 8208 + reg->off + reg->var_off.value); 8209 + return -EINVAL; 8210 + } 8211 + 8212 + if (!btf_struct_ids_match(&env->log, meta->btf, ref_t->type, 0, kptr_field->kptr.btf, 8213 + kptr_field->kptr.btf_id, true)) { 8214 + verbose(env, "kernel function %s args#%d expected pointer to %s %s\n", 8215 + meta->func_name, argno, btf_type_str(ref_t), ref_tname); 8216 + return -EINVAL; 8217 + } 8218 + return 0; 8219 + } 8220 + 8221 + static int ref_set_release_on_unlock(struct bpf_verifier_env *env, u32 ref_obj_id) 8222 + { 8223 + struct bpf_func_state *state = cur_func(env); 8224 + struct bpf_reg_state *reg; 8225 + int i; 8226 + 8227 + /* bpf_spin_lock only allows calling list_push and list_pop, no BPF 8228 + * subprogs, no global functions. This means that the references would 8229 + * not be released inside the critical section but they may be added to 8230 + * the reference state, and the acquired_refs are never copied out for a 8231 + * different frame as BPF to BPF calls don't work in bpf_spin_lock 8232 + * critical sections. 
8233 + */ 8234 + if (!ref_obj_id) { 8235 + verbose(env, "verifier internal error: ref_obj_id is zero for release_on_unlock\n"); 8236 + return -EFAULT; 8237 + } 8238 + for (i = 0; i < state->acquired_refs; i++) { 8239 + if (state->refs[i].id == ref_obj_id) { 8240 + if (state->refs[i].release_on_unlock) { 8241 + verbose(env, "verifier internal error: expected false release_on_unlock"); 8242 + return -EFAULT; 8243 + } 8244 + state->refs[i].release_on_unlock = true; 8245 + /* Now mark everyone sharing same ref_obj_id as untrusted */ 8246 + bpf_for_each_reg_in_vstate(env->cur_state, state, reg, ({ 8247 + if (reg->ref_obj_id == ref_obj_id) 8248 + reg->type |= PTR_UNTRUSTED; 8249 + })); 8250 + return 0; 8251 + } 8252 + } 8253 + verbose(env, "verifier internal error: ref state missing for ref_obj_id\n"); 8254 + return -EFAULT; 8255 + } 8256 + 8257 + /* Implementation details: 8258 + * 8259 + * Each register points to some region of memory, which we define as an 8260 + * allocation. Each allocation may embed a bpf_spin_lock which protects any 8261 + * special BPF objects (bpf_list_head, bpf_rb_root, etc.) part of the same 8262 + * allocation. The lock and the data it protects are colocated in the same 8263 + * memory region. 8264 + * 8265 + * Hence, everytime a register holds a pointer value pointing to such 8266 + * allocation, the verifier preserves a unique reg->id for it. 8267 + * 8268 + * The verifier remembers the lock 'ptr' and the lock 'id' whenever 8269 + * bpf_spin_lock is called. 8270 + * 8271 + * To enable this, lock state in the verifier captures two values: 8272 + * active_lock.ptr = Register's type specific pointer 8273 + * active_lock.id = A unique ID for each register pointer value 8274 + * 8275 + * Currently, PTR_TO_MAP_VALUE and PTR_TO_BTF_ID | MEM_ALLOC are the two 8276 + * supported register types. 8277 + * 8278 + * The active_lock.ptr in case of map values is the reg->map_ptr, and in case of 8279 + * allocated objects is the reg->btf pointer. 
8280 + * 8281 + * The active_lock.id is non-unique for maps supporting direct_value_addr, as we 8282 + * can establish the provenance of the map value statically for each distinct 8283 + * lookup into such maps. They always contain a single map value hence unique 8284 + * IDs for each pseudo load pessimizes the algorithm and rejects valid programs. 8285 + * 8286 + * So, in case of global variables, they use array maps with max_entries = 1, 8287 + * hence their active_lock.ptr becomes map_ptr and id = 0 (since they all point 8288 + * into the same map value as max_entries is 1, as described above). 8289 + * 8290 + * In case of inner map lookups, the inner map pointer has same map_ptr as the 8291 + * outer map pointer (in verifier context), but each lookup into an inner map 8292 + * assigns a fresh reg->id to the lookup, so while lookups into distinct inner 8293 + * maps from the same outer map share the same map_ptr as active_lock.ptr, they 8294 + * will get different reg->id assigned to each lookup, hence different 8295 + * active_lock.id. 8296 + * 8297 + * In case of allocated objects, active_lock.ptr is the reg->btf, and the 8298 + * reg->id is a unique ID preserved after the NULL pointer check on the pointer 8299 + * returned from bpf_obj_new. Each allocation receives a new reg->id. 
8300 + */ 8301 + static int check_reg_allocation_locked(struct bpf_verifier_env *env, struct bpf_reg_state *reg) 8302 + { 8303 + void *ptr; 8304 + u32 id; 8305 + 8306 + switch ((int)reg->type) { 8307 + case PTR_TO_MAP_VALUE: 8308 + ptr = reg->map_ptr; 8309 + break; 8310 + case PTR_TO_BTF_ID | MEM_ALLOC: 8311 + case PTR_TO_BTF_ID | MEM_ALLOC | PTR_TRUSTED: 8312 + ptr = reg->btf; 8313 + break; 8314 + default: 8315 + verbose(env, "verifier internal error: unknown reg type for lock check\n"); 8316 + return -EFAULT; 8317 + } 8318 + id = reg->id; 8319 + 8320 + if (!env->cur_state->active_lock.ptr) 8321 + return -EINVAL; 8322 + if (env->cur_state->active_lock.ptr != ptr || 8323 + env->cur_state->active_lock.id != id) { 8324 + verbose(env, "held lock and object are not in the same allocation\n"); 8325 + return -EINVAL; 8326 + } 8327 + return 0; 8328 + } 8329 + 8330 + static bool is_bpf_list_api_kfunc(u32 btf_id) 8331 + { 8332 + return btf_id == special_kfunc_list[KF_bpf_list_push_front] || 8333 + btf_id == special_kfunc_list[KF_bpf_list_push_back] || 8334 + btf_id == special_kfunc_list[KF_bpf_list_pop_front] || 8335 + btf_id == special_kfunc_list[KF_bpf_list_pop_back]; 8336 + } 8337 + 8338 + static int process_kf_arg_ptr_to_list_head(struct bpf_verifier_env *env, 8339 + struct bpf_reg_state *reg, u32 regno, 8340 + struct bpf_kfunc_call_arg_meta *meta) 8341 + { 8342 + struct btf_field *field; 8343 + struct btf_record *rec; 8344 + u32 list_head_off; 8345 + 8346 + if (meta->btf != btf_vmlinux || !is_bpf_list_api_kfunc(meta->func_id)) { 8347 + verbose(env, "verifier internal error: bpf_list_head argument for unknown kfunc\n"); 8348 + return -EFAULT; 8349 + } 8350 + 8351 + if (!tnum_is_const(reg->var_off)) { 8352 + verbose(env, 8353 + "R%d doesn't have constant offset. 
bpf_list_head has to be at the constant offset\n", 8354 + regno); 8355 + return -EINVAL; 8356 + } 8357 + 8358 + rec = reg_btf_record(reg); 8359 + list_head_off = reg->off + reg->var_off.value; 8360 + field = btf_record_find(rec, list_head_off, BPF_LIST_HEAD); 8361 + if (!field) { 8362 + verbose(env, "bpf_list_head not found at offset=%u\n", list_head_off); 8363 + return -EINVAL; 8364 + } 8365 + 8366 + /* All functions require bpf_list_head to be protected using a bpf_spin_lock */ 8367 + if (check_reg_allocation_locked(env, reg)) { 8368 + verbose(env, "bpf_spin_lock at off=%d must be held for bpf_list_head\n", 8369 + rec->spin_lock_off); 8370 + return -EINVAL; 8371 + } 8372 + 8373 + if (meta->arg_list_head.field) { 8374 + verbose(env, "verifier internal error: repeating bpf_list_head arg\n"); 8375 + return -EFAULT; 8376 + } 8377 + meta->arg_list_head.field = field; 8378 + return 0; 8379 + } 8380 + 8381 + static int process_kf_arg_ptr_to_list_node(struct bpf_verifier_env *env, 8382 + struct bpf_reg_state *reg, u32 regno, 8383 + struct bpf_kfunc_call_arg_meta *meta) 8384 + { 8385 + const struct btf_type *et, *t; 8386 + struct btf_field *field; 8387 + struct btf_record *rec; 8388 + u32 list_node_off; 8389 + 8390 + if (meta->btf != btf_vmlinux || 8391 + (meta->func_id != special_kfunc_list[KF_bpf_list_push_front] && 8392 + meta->func_id != special_kfunc_list[KF_bpf_list_push_back])) { 8393 + verbose(env, "verifier internal error: bpf_list_node argument for unknown kfunc\n"); 8394 + return -EFAULT; 8395 + } 8396 + 8397 + if (!tnum_is_const(reg->var_off)) { 8398 + verbose(env, 8399 + "R%d doesn't have constant offset. 
bpf_list_node has to be at the constant offset\n", 8400 + regno); 8401 + return -EINVAL; 8402 + } 8403 + 8404 + rec = reg_btf_record(reg); 8405 + list_node_off = reg->off + reg->var_off.value; 8406 + field = btf_record_find(rec, list_node_off, BPF_LIST_NODE); 8407 + if (!field || field->offset != list_node_off) { 8408 + verbose(env, "bpf_list_node not found at offset=%u\n", list_node_off); 8409 + return -EINVAL; 8410 + } 8411 + 8412 + field = meta->arg_list_head.field; 8413 + 8414 + et = btf_type_by_id(field->list_head.btf, field->list_head.value_btf_id); 8415 + t = btf_type_by_id(reg->btf, reg->btf_id); 8416 + if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, 0, field->list_head.btf, 8417 + field->list_head.value_btf_id, true)) { 8418 + verbose(env, "operation on bpf_list_head expects arg#1 bpf_list_node at offset=%d " 8419 + "in struct %s, but arg is at offset=%d in struct %s\n", 8420 + field->list_head.node_offset, btf_name_by_offset(field->list_head.btf, et->name_off), 8421 + list_node_off, btf_name_by_offset(reg->btf, t->name_off)); 8422 + return -EINVAL; 8423 + } 8424 + 8425 + if (list_node_off != field->list_head.node_offset) { 8426 + verbose(env, "arg#1 offset=%d, but expected bpf_list_node at offset=%d in struct %s\n", 8427 + list_node_off, field->list_head.node_offset, 8428 + btf_name_by_offset(field->list_head.btf, et->name_off)); 8429 + return -EINVAL; 8430 + } 8431 + /* Set arg#1 for expiration after unlock */ 8432 + return ref_set_release_on_unlock(env, reg->ref_obj_id); 8433 + } 8434 + 8435 + static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_arg_meta *meta) 8436 + { 8437 + const char *func_name = meta->func_name, *ref_tname; 8438 + const struct btf *btf = meta->btf; 8439 + const struct btf_param *args; 8440 + u32 i, nargs; 8441 + int ret; 8442 + 8443 + args = (const struct btf_param *)(meta->func_proto + 1); 8444 + nargs = btf_type_vlen(meta->func_proto); 8445 + if (nargs > MAX_BPF_FUNC_REG_ARGS) { 8446 + 
verbose(env, "Function %s has %d > %d args\n", func_name, nargs, 8447 + MAX_BPF_FUNC_REG_ARGS); 8448 + return -EINVAL; 8449 + } 8450 + 8451 + /* Check that BTF function arguments match actual types that the 8452 + * verifier sees. 8453 + */ 8454 + for (i = 0; i < nargs; i++) { 8455 + struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[i + 1]; 8456 + const struct btf_type *t, *ref_t, *resolve_ret; 8457 + enum bpf_arg_type arg_type = ARG_DONTCARE; 8458 + u32 regno = i + 1, ref_id, type_size; 8459 + bool is_ret_buf_sz = false; 8460 + int kf_arg_type; 8461 + 8462 + t = btf_type_skip_modifiers(btf, args[i].type, NULL); 8463 + 8464 + if (is_kfunc_arg_ignore(btf, &args[i])) 8465 + continue; 8466 + 8467 + if (btf_type_is_scalar(t)) { 8468 + if (reg->type != SCALAR_VALUE) { 8469 + verbose(env, "R%d is not a scalar\n", regno); 8470 + return -EINVAL; 8471 + } 8472 + 8473 + if (is_kfunc_arg_constant(meta->btf, &args[i])) { 8474 + if (meta->arg_constant.found) { 8475 + verbose(env, "verifier internal error: only one constant argument permitted\n"); 8476 + return -EFAULT; 8477 + } 8478 + if (!tnum_is_const(reg->var_off)) { 8479 + verbose(env, "R%d must be a known constant\n", regno); 8480 + return -EINVAL; 8481 + } 8482 + ret = mark_chain_precision(env, regno); 8483 + if (ret < 0) 8484 + return ret; 8485 + meta->arg_constant.found = true; 8486 + meta->arg_constant.value = reg->var_off.value; 8487 + } else if (is_kfunc_arg_scalar_with_name(btf, &args[i], "rdonly_buf_size")) { 8488 + meta->r0_rdonly = true; 8489 + is_ret_buf_sz = true; 8490 + } else if (is_kfunc_arg_scalar_with_name(btf, &args[i], "rdwr_buf_size")) { 8491 + is_ret_buf_sz = true; 8492 + } 8493 + 8494 + if (is_ret_buf_sz) { 8495 + if (meta->r0_size) { 8496 + verbose(env, "2 or more rdonly/rdwr_buf_size parameters for kfunc"); 8497 + return -EINVAL; 8498 + } 8499 + 8500 + if (!tnum_is_const(reg->var_off)) { 8501 + verbose(env, "R%d is not a const\n", regno); 8502 + return -EINVAL; 8503 + } 8504 + 8505 + 
meta->r0_size = reg->var_off.value; 8506 + ret = mark_chain_precision(env, regno); 8507 + if (ret) 8508 + return ret; 8509 + } 8510 + continue; 8511 + } 8512 + 8513 + if (!btf_type_is_ptr(t)) { 8514 + verbose(env, "Unrecognized arg#%d type %s\n", i, btf_type_str(t)); 8515 + return -EINVAL; 8516 + } 8517 + 8518 + if (reg->ref_obj_id) { 8519 + if (is_kfunc_release(meta) && meta->ref_obj_id) { 8520 + verbose(env, "verifier internal error: more than one arg with ref_obj_id R%d %u %u\n", 8521 + regno, reg->ref_obj_id, 8522 + meta->ref_obj_id); 8523 + return -EFAULT; 8524 + } 8525 + meta->ref_obj_id = reg->ref_obj_id; 8526 + if (is_kfunc_release(meta)) 8527 + meta->release_regno = regno; 8528 + } 8529 + 8530 + ref_t = btf_type_skip_modifiers(btf, t->type, &ref_id); 8531 + ref_tname = btf_name_by_offset(btf, ref_t->name_off); 8532 + 8533 + kf_arg_type = get_kfunc_ptr_arg_type(env, meta, t, ref_t, ref_tname, args, i, nargs); 8534 + if (kf_arg_type < 0) 8535 + return kf_arg_type; 8536 + 8537 + switch (kf_arg_type) { 8538 + case KF_ARG_PTR_TO_ALLOC_BTF_ID: 8539 + case KF_ARG_PTR_TO_BTF_ID: 8540 + if (!is_kfunc_trusted_args(meta)) 8541 + break; 8542 + 8543 + if (!is_trusted_reg(reg)) { 8544 + verbose(env, "R%d must be referenced or trusted\n", regno); 8545 + return -EINVAL; 8546 + } 8547 + fallthrough; 8548 + case KF_ARG_PTR_TO_CTX: 8549 + /* Trusted arguments have the same offset checks as release arguments */ 8550 + arg_type |= OBJ_RELEASE; 8551 + break; 8552 + case KF_ARG_PTR_TO_KPTR: 8553 + case KF_ARG_PTR_TO_DYNPTR: 8554 + case KF_ARG_PTR_TO_LIST_HEAD: 8555 + case KF_ARG_PTR_TO_LIST_NODE: 8556 + case KF_ARG_PTR_TO_MEM: 8557 + case KF_ARG_PTR_TO_MEM_SIZE: 8558 + /* Trusted by default */ 8559 + break; 8560 + default: 8561 + WARN_ON_ONCE(1); 8562 + return -EFAULT; 8563 + } 8564 + 8565 + if (is_kfunc_release(meta) && reg->ref_obj_id) 8566 + arg_type |= OBJ_RELEASE; 8567 + ret = check_func_arg_reg_off(env, reg, regno, arg_type); 8568 + if (ret < 0) 8569 + return ret; 8570 + 
8571 + switch (kf_arg_type) { 8572 + case KF_ARG_PTR_TO_CTX: 8573 + if (reg->type != PTR_TO_CTX) { 8574 + verbose(env, "arg#%d expected pointer to ctx, but got %s\n", i, btf_type_str(t)); 8575 + return -EINVAL; 8576 + } 8577 + 8578 + if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx]) { 8579 + ret = get_kern_ctx_btf_id(&env->log, resolve_prog_type(env->prog)); 8580 + if (ret < 0) 8581 + return -EINVAL; 8582 + meta->ret_btf_id = ret; 8583 + } 8584 + break; 8585 + case KF_ARG_PTR_TO_ALLOC_BTF_ID: 8586 + if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) { 8587 + verbose(env, "arg#%d expected pointer to allocated object\n", i); 8588 + return -EINVAL; 8589 + } 8590 + if (!reg->ref_obj_id) { 8591 + verbose(env, "allocated object must be referenced\n"); 8592 + return -EINVAL; 8593 + } 8594 + if (meta->btf == btf_vmlinux && 8595 + meta->func_id == special_kfunc_list[KF_bpf_obj_drop_impl]) { 8596 + meta->arg_obj_drop.btf = reg->btf; 8597 + meta->arg_obj_drop.btf_id = reg->btf_id; 8598 + } 8599 + break; 8600 + case KF_ARG_PTR_TO_KPTR: 8601 + if (reg->type != PTR_TO_MAP_VALUE) { 8602 + verbose(env, "arg#0 expected pointer to map value\n"); 8603 + return -EINVAL; 8604 + } 8605 + ret = process_kf_arg_ptr_to_kptr(env, reg, ref_t, ref_tname, meta, i); 8606 + if (ret < 0) 8607 + return ret; 8608 + break; 8609 + case KF_ARG_PTR_TO_DYNPTR: 8610 + if (reg->type != PTR_TO_STACK) { 8611 + verbose(env, "arg#%d expected pointer to stack\n", i); 8612 + return -EINVAL; 8613 + } 8614 + 8615 + if (!is_dynptr_reg_valid_init(env, reg)) { 8616 + verbose(env, "arg#%d pointer type %s %s must be valid and initialized\n", 8617 + i, btf_type_str(ref_t), ref_tname); 8618 + return -EINVAL; 8619 + } 8620 + 8621 + if (!is_dynptr_type_expected(env, reg, ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL)) { 8622 + verbose(env, "arg#%d pointer type %s %s points to unsupported dynamic pointer type\n", 8623 + i, btf_type_str(ref_t), ref_tname); 8624 + return -EINVAL; 8625 + } 8626 + break; 8627 + case 
KF_ARG_PTR_TO_LIST_HEAD: 8628 + if (reg->type != PTR_TO_MAP_VALUE && 8629 + reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) { 8630 + verbose(env, "arg#%d expected pointer to map value or allocated object\n", i); 8631 + return -EINVAL; 8632 + } 8633 + if (reg->type == (PTR_TO_BTF_ID | MEM_ALLOC) && !reg->ref_obj_id) { 8634 + verbose(env, "allocated object must be referenced\n"); 8635 + return -EINVAL; 8636 + } 8637 + ret = process_kf_arg_ptr_to_list_head(env, reg, regno, meta); 8638 + if (ret < 0) 8639 + return ret; 8640 + break; 8641 + case KF_ARG_PTR_TO_LIST_NODE: 8642 + if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) { 8643 + verbose(env, "arg#%d expected pointer to allocated object\n", i); 8644 + return -EINVAL; 8645 + } 8646 + if (!reg->ref_obj_id) { 8647 + verbose(env, "allocated object must be referenced\n"); 8648 + return -EINVAL; 8649 + } 8650 + ret = process_kf_arg_ptr_to_list_node(env, reg, regno, meta); 8651 + if (ret < 0) 8652 + return ret; 8653 + break; 8654 + case KF_ARG_PTR_TO_BTF_ID: 8655 + /* Only base_type is checked, further checks are done here */ 8656 + if ((base_type(reg->type) != PTR_TO_BTF_ID || 8657 + bpf_type_has_unsafe_modifiers(reg->type)) && 8658 + !reg2btf_ids[base_type(reg->type)]) { 8659 + verbose(env, "arg#%d is %s ", i, reg_type_str(env, reg->type)); 8660 + verbose(env, "expected %s or socket\n", 8661 + reg_type_str(env, base_type(reg->type) | 8662 + (type_flag(reg->type) & BPF_REG_TRUSTED_MODIFIERS))); 8663 + return -EINVAL; 8664 + } 8665 + ret = process_kf_arg_ptr_to_btf_id(env, reg, ref_t, ref_tname, ref_id, meta, i); 8666 + if (ret < 0) 8667 + return ret; 8668 + break; 8669 + case KF_ARG_PTR_TO_MEM: 8670 + resolve_ret = btf_resolve_size(btf, ref_t, &type_size); 8671 + if (IS_ERR(resolve_ret)) { 8672 + verbose(env, "arg#%d reference type('%s %s') size cannot be determined: %ld\n", 8673 + i, btf_type_str(ref_t), ref_tname, PTR_ERR(resolve_ret)); 8674 + return -EINVAL; 8675 + } 8676 + ret = check_mem_reg(env, reg, regno, type_size); 8677 
+ if (ret < 0) 8678 + return ret; 8679 + break; 8680 + case KF_ARG_PTR_TO_MEM_SIZE: 8681 + ret = check_kfunc_mem_size_reg(env, &regs[regno + 1], regno + 1); 8682 + if (ret < 0) { 8683 + verbose(env, "arg#%d arg#%d memory, len pair leads to invalid memory access\n", i, i + 1); 8684 + return ret; 8685 + } 8686 + /* Skip next '__sz' argument */ 8687 + i++; 8688 + break; 8689 + } 8690 + } 8691 + 8692 + if (is_kfunc_release(meta) && !meta->release_regno) { 8693 + verbose(env, "release kernel function %s expects refcounted PTR_TO_BTF_ID\n", 8694 + func_name); 8695 + return -EINVAL; 8696 + } 8697 + 8698 + return 0; 8699 + } 8700 + 7973 8701 static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, 7974 8702 int *insn_idx_p) 7975 8703 { 7976 8704 const struct btf_type *t, *func, *func_proto, *ptr_type; 7977 8705 struct bpf_reg_state *regs = cur_regs(env); 7978 - struct bpf_kfunc_arg_meta meta = { 0 }; 7979 8706 const char *func_name, *ptr_type_name; 8707 + bool sleepable, rcu_lock, rcu_unlock; 8708 + struct bpf_kfunc_call_arg_meta meta; 7980 8709 u32 i, nargs, func_id, ptr_type_id; 7981 8710 int err, insn_idx = *insn_idx_p; 7982 8711 const struct btf_param *args; 8712 + const struct btf_type *ret_t; 7983 8713 struct btf *desc_btf; 7984 8714 u32 *kfunc_flags; 7985 - bool acq; 7986 8715 7987 8716 /* skip for now, but return error when we find this in fixup_kfunc_call */ 7988 8717 if (!insn->imm) ··· 8905 7830 func_name); 8906 7831 return -EACCES; 8907 7832 } 8908 - if (*kfunc_flags & KF_DESTRUCTIVE && !capable(CAP_SYS_BOOT)) { 8909 - verbose(env, "destructive kfunc calls require CAP_SYS_BOOT capabilities\n"); 7833 + 7834 + /* Prepare kfunc call metadata */ 7835 + memset(&meta, 0, sizeof(meta)); 7836 + meta.btf = desc_btf; 7837 + meta.func_id = func_id; 7838 + meta.kfunc_flags = *kfunc_flags; 7839 + meta.func_proto = func_proto; 7840 + meta.func_name = func_name; 7841 + 7842 + if (is_kfunc_destructive(&meta) && !capable(CAP_SYS_BOOT)) { 7843 + 
verbose(env, "destructive kfunc calls require CAP_SYS_BOOT capability\n"); 8910 7844 return -EACCES; 8911 7845 } 8912 7846 8913 - acq = *kfunc_flags & KF_ACQUIRE; 7847 + sleepable = is_kfunc_sleepable(&meta); 7848 + if (sleepable && !env->prog->aux->sleepable) { 7849 + verbose(env, "program must be sleepable to call sleepable kfunc %s\n", func_name); 7850 + return -EACCES; 7851 + } 8914 7852 8915 - meta.flags = *kfunc_flags; 7853 + rcu_lock = is_kfunc_bpf_rcu_read_lock(&meta); 7854 + rcu_unlock = is_kfunc_bpf_rcu_read_unlock(&meta); 7855 + if ((rcu_lock || rcu_unlock) && !env->rcu_tag_supported) { 7856 + verbose(env, "no vmlinux btf rcu tag support for kfunc %s\n", func_name); 7857 + return -EACCES; 7858 + } 7859 + 7860 + if (env->cur_state->active_rcu_lock) { 7861 + struct bpf_func_state *state; 7862 + struct bpf_reg_state *reg; 7863 + 7864 + if (rcu_lock) { 7865 + verbose(env, "nested rcu read lock (kernel function %s)\n", func_name); 7866 + return -EINVAL; 7867 + } else if (rcu_unlock) { 7868 + bpf_for_each_reg_in_vstate(env->cur_state, state, reg, ({ 7869 + if (reg->type & MEM_RCU) { 7870 + reg->type &= ~(MEM_RCU | PTR_TRUSTED); 7871 + reg->type |= PTR_UNTRUSTED; 7872 + } 7873 + })); 7874 + env->cur_state->active_rcu_lock = false; 7875 + } else if (sleepable) { 7876 + verbose(env, "kernel func %s is sleepable within rcu_read_lock region\n", func_name); 7877 + return -EACCES; 7878 + } 7879 + } else if (rcu_lock) { 7880 + env->cur_state->active_rcu_lock = true; 7881 + } else if (rcu_unlock) { 7882 + verbose(env, "unmatched rcu read unlock (kernel function %s)\n", func_name); 7883 + return -EINVAL; 7884 + } 8916 7885 8917 7886 /* Check the arguments */ 8918 - err = btf_check_kfunc_arg_match(env, desc_btf, func_id, regs, &meta); 7887 + err = check_kfunc_args(env, &meta); 8919 7888 if (err < 0) 8920 7889 return err; 8921 7890 /* In case of release function, we get register number of refcounted 8922 - * PTR_TO_BTF_ID back from btf_check_kfunc_arg_match, do the 
release now 7891 + * PTR_TO_BTF_ID in bpf_kfunc_arg_meta, do the release now. 8923 7892 */ 8924 - if (err) { 8925 - err = release_reference(env, regs[err].ref_obj_id); 7893 + if (meta.release_regno) { 7894 + err = release_reference(env, regs[meta.release_regno].ref_obj_id); 8926 7895 if (err) { 8927 7896 verbose(env, "kfunc %s#%d reference has not been acquired before\n", 8928 7897 func_name, func_id); ··· 8980 7861 /* Check return type */ 8981 7862 t = btf_type_skip_modifiers(desc_btf, func_proto->type, NULL); 8982 7863 8983 - if (acq && !btf_type_is_struct_ptr(desc_btf, t)) { 8984 - verbose(env, "acquire kernel function does not return PTR_TO_BTF_ID\n"); 8985 - return -EINVAL; 7864 + if (is_kfunc_acquire(&meta) && !btf_type_is_struct_ptr(meta.btf, t)) { 7865 + /* Only exception is bpf_obj_new_impl */ 7866 + if (meta.btf != btf_vmlinux || meta.func_id != special_kfunc_list[KF_bpf_obj_new_impl]) { 7867 + verbose(env, "acquire kernel function does not return PTR_TO_BTF_ID\n"); 7868 + return -EINVAL; 7869 + } 8986 7870 } 8987 7871 8988 7872 if (btf_type_is_scalar(t)) { 8989 7873 mark_reg_unknown(env, regs, BPF_REG_0); 8990 7874 mark_btf_func_reg_size(env, BPF_REG_0, t->size); 8991 7875 } else if (btf_type_is_ptr(t)) { 8992 - ptr_type = btf_type_skip_modifiers(desc_btf, t->type, 8993 - &ptr_type_id); 8994 - if (!btf_type_is_struct(ptr_type)) { 7876 + ptr_type = btf_type_skip_modifiers(desc_btf, t->type, &ptr_type_id); 7877 + 7878 + if (meta.btf == btf_vmlinux && btf_id_set_contains(&special_kfunc_set, meta.func_id)) { 7879 + if (meta.func_id == special_kfunc_list[KF_bpf_obj_new_impl]) { 7880 + struct btf *ret_btf; 7881 + u32 ret_btf_id; 7882 + 7883 + if (unlikely(!bpf_global_ma_set)) 7884 + return -ENOMEM; 7885 + 7886 + if (((u64)(u32)meta.arg_constant.value) != meta.arg_constant.value) { 7887 + verbose(env, "local type ID argument must be in range [0, U32_MAX]\n"); 7888 + return -EINVAL; 7889 + } 7890 + 7891 + ret_btf = env->prog->aux->btf; 7892 + ret_btf_id = 
meta.arg_constant.value; 7893 + 7894 + /* This may be NULL due to user not supplying a BTF */ 7895 + if (!ret_btf) { 7896 + verbose(env, "bpf_obj_new requires prog BTF\n"); 7897 + return -EINVAL; 7898 + } 7899 + 7900 + ret_t = btf_type_by_id(ret_btf, ret_btf_id); 7901 + if (!ret_t || !__btf_type_is_struct(ret_t)) { 7902 + verbose(env, "bpf_obj_new type ID argument must be of a struct\n"); 7903 + return -EINVAL; 7904 + } 7905 + 7906 + mark_reg_known_zero(env, regs, BPF_REG_0); 7907 + regs[BPF_REG_0].type = PTR_TO_BTF_ID | MEM_ALLOC; 7908 + regs[BPF_REG_0].btf = ret_btf; 7909 + regs[BPF_REG_0].btf_id = ret_btf_id; 7910 + 7911 + env->insn_aux_data[insn_idx].obj_new_size = ret_t->size; 7912 + env->insn_aux_data[insn_idx].kptr_struct_meta = 7913 + btf_find_struct_meta(ret_btf, ret_btf_id); 7914 + } else if (meta.func_id == special_kfunc_list[KF_bpf_obj_drop_impl]) { 7915 + env->insn_aux_data[insn_idx].kptr_struct_meta = 7916 + btf_find_struct_meta(meta.arg_obj_drop.btf, 7917 + meta.arg_obj_drop.btf_id); 7918 + } else if (meta.func_id == special_kfunc_list[KF_bpf_list_pop_front] || 7919 + meta.func_id == special_kfunc_list[KF_bpf_list_pop_back]) { 7920 + struct btf_field *field = meta.arg_list_head.field; 7921 + 7922 + mark_reg_known_zero(env, regs, BPF_REG_0); 7923 + regs[BPF_REG_0].type = PTR_TO_BTF_ID | MEM_ALLOC; 7924 + regs[BPF_REG_0].btf = field->list_head.btf; 7925 + regs[BPF_REG_0].btf_id = field->list_head.value_btf_id; 7926 + regs[BPF_REG_0].off = field->list_head.node_offset; 7927 + } else if (meta.func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx]) { 7928 + mark_reg_known_zero(env, regs, BPF_REG_0); 7929 + regs[BPF_REG_0].type = PTR_TO_BTF_ID | PTR_TRUSTED; 7930 + regs[BPF_REG_0].btf = desc_btf; 7931 + regs[BPF_REG_0].btf_id = meta.ret_btf_id; 7932 + } else if (meta.func_id == special_kfunc_list[KF_bpf_rdonly_cast]) { 7933 + ret_t = btf_type_by_id(desc_btf, meta.arg_constant.value); 7934 + if (!ret_t || !btf_type_is_struct(ret_t)) { 7935 + verbose(env, 
7936 + "kfunc bpf_rdonly_cast type ID argument must be of a struct\n"); 7937 + return -EINVAL; 7938 + } 7939 + 7940 + mark_reg_known_zero(env, regs, BPF_REG_0); 7941 + regs[BPF_REG_0].type = PTR_TO_BTF_ID | PTR_UNTRUSTED; 7942 + regs[BPF_REG_0].btf = desc_btf; 7943 + regs[BPF_REG_0].btf_id = meta.arg_constant.value; 7944 + } else { 7945 + verbose(env, "kernel function %s unhandled dynamic return type\n", 7946 + meta.func_name); 7947 + return -EFAULT; 7948 + } 7949 + } else if (!__btf_type_is_struct(ptr_type)) { 8995 7950 if (!meta.r0_size) { 8996 7951 ptr_type_name = btf_name_by_offset(desc_btf, 8997 7952 ptr_type->name_off); ··· 9093 7900 regs[BPF_REG_0].type = PTR_TO_BTF_ID; 9094 7901 regs[BPF_REG_0].btf_id = ptr_type_id; 9095 7902 } 9096 - if (*kfunc_flags & KF_RET_NULL) { 7903 + 7904 + if (is_kfunc_ret_null(&meta)) { 9097 7905 regs[BPF_REG_0].type |= PTR_MAYBE_NULL; 9098 7906 /* For mark_ptr_or_null_reg, see 93c230e3f5bd6 */ 9099 7907 regs[BPF_REG_0].id = ++env->id_gen; 9100 7908 } 9101 7909 mark_btf_func_reg_size(env, BPF_REG_0, sizeof(void *)); 9102 - if (acq) { 7910 + if (is_kfunc_acquire(&meta)) { 9103 7911 int id = acquire_reference_state(env, insn_idx); 9104 7912 9105 7913 if (id < 0) 9106 7914 return id; 9107 - regs[BPF_REG_0].id = id; 7915 + if (is_kfunc_ret_null(&meta)) 7916 + regs[BPF_REG_0].id = id; 9108 7917 regs[BPF_REG_0].ref_obj_id = id; 9109 7918 } 7919 + if (reg_may_point_to_spin_lock(&regs[BPF_REG_0]) && !regs[BPF_REG_0].id) 7920 + regs[BPF_REG_0].id = ++env->id_gen; 9110 7921 } /* else { add_kfunc_call() ensures it is btf_type_is_void(t) } */ 9111 7922 9112 7923 nargs = btf_type_vlen(func_proto); ··· 11283 10086 { 11284 10087 if (type_may_be_null(reg->type) && reg->id == id && 11285 10088 !WARN_ON_ONCE(!reg->id)) { 11286 - if (WARN_ON_ONCE(reg->smin_value || reg->smax_value || 11287 - !tnum_equals_const(reg->var_off, 0) || 11288 - reg->off)) { 11289 - /* Old offset (both fixed and variable parts) should 11290 - * have been known-zero, because 
we don't allow pointer 11291 - * arithmetic on pointers that might be NULL. If we 11292 - * see this happening, don't convert the register. 11293 - */ 10089 + /* Old offset (both fixed and variable parts) should have been 10090 + * known-zero, because we don't allow pointer arithmetic on 10091 + * pointers that might be NULL. If we see this happening, don't 10092 + * convert the register. 10093 + * 10094 + * But in some cases, some helpers that return local kptrs 10095 + * advance offset for the returned pointer. In those cases, it 10096 + * is fine to expect to see reg->off. 10097 + */ 10098 + if (WARN_ON_ONCE(reg->smin_value || reg->smax_value || !tnum_equals_const(reg->var_off, 0))) 11294 10099 return; 11295 - } 10100 + if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC | PTR_MAYBE_NULL) && WARN_ON_ONCE(reg->off)) 10101 + return; 11296 10102 if (is_null) { 11297 10103 reg->type = SCALAR_VALUE; 11298 10104 /* We don't need id and ref_obj_id from this point ··· 11469 10269 struct bpf_verifier_state *other_branch; 11470 10270 struct bpf_reg_state *regs = this_branch->frame[this_branch->curframe]->regs; 11471 10271 struct bpf_reg_state *dst_reg, *other_branch_regs, *src_reg = NULL; 10272 + struct bpf_reg_state *eq_branch_regs; 11472 10273 u8 opcode = BPF_OP(insn->code); 11473 10274 bool is_jmp32; 11474 10275 int pred = -1; ··· 11579 10378 /* detect if we are comparing against a constant value so we can adjust 11580 10379 * our min/max values for our dst register. 11581 10380 * this is only legit if both are scalars (or pointers to the same 11582 - * object, I suppose, but we don't support that right now), because 11583 - * otherwise the different base pointers mean the offsets aren't 10381 + * object, I suppose, see the PTR_MAYBE_NULL related if block below), 10382 + * because otherwise the different base pointers mean the offsets aren't 11584 10383 * comparable. 
11585 10384 */ 11586 10385 if (BPF_SRC(insn->code) == BPF_X) { ··· 11627 10426 !WARN_ON_ONCE(dst_reg->id != other_branch_regs[insn->dst_reg].id)) { 11628 10427 find_equal_scalars(this_branch, dst_reg); 11629 10428 find_equal_scalars(other_branch, &other_branch_regs[insn->dst_reg]); 10429 + } 10430 + 10431 + /* if one pointer register is compared to another pointer 10432 + * register check if PTR_MAYBE_NULL could be lifted. 10433 + * E.g. register A - maybe null 10434 + * register B - not null 10435 + * for JNE A, B, ... - A is not null in the false branch; 10436 + * for JEQ A, B, ... - A is not null in the true branch. 10437 + */ 10438 + if (!is_jmp32 && BPF_SRC(insn->code) == BPF_X && 10439 + __is_pointer_value(false, src_reg) && __is_pointer_value(false, dst_reg) && 10440 + type_may_be_null(src_reg->type) != type_may_be_null(dst_reg->type)) { 10441 + eq_branch_regs = NULL; 10442 + switch (opcode) { 10443 + case BPF_JEQ: 10444 + eq_branch_regs = other_branch_regs; 10445 + break; 10446 + case BPF_JNE: 10447 + eq_branch_regs = regs; 10448 + break; 10449 + default: 10450 + /* do nothing */ 10451 + break; 10452 + } 10453 + if (eq_branch_regs) { 10454 + if (type_may_be_null(src_reg->type)) 10455 + mark_ptr_not_null_reg(&eq_branch_regs[insn->src_reg]); 10456 + else 10457 + mark_ptr_not_null_reg(&eq_branch_regs[insn->dst_reg]); 10458 + } 11630 10459 } 11631 10460 11632 10461 /* detect if R == 0 where R is returned from bpf_map_lookup_elem(). 
··· 11765 10534 insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE) { 11766 10535 dst_reg->type = PTR_TO_MAP_VALUE; 11767 10536 dst_reg->off = aux->map_off; 11768 - if (btf_record_has_field(map->record, BPF_SPIN_LOCK)) 11769 - dst_reg->id = ++env->id_gen; 10537 + WARN_ON_ONCE(map->max_entries != 1); 10538 + /* We want reg->id to be same (0) as map_value is not distinct */ 11770 10539 } else if (insn->src_reg == BPF_PSEUDO_MAP_FD || 11771 10540 insn->src_reg == BPF_PSEUDO_MAP_IDX) { 11772 10541 dst_reg->type = CONST_PTR_TO_MAP; ··· 11844 10613 return err; 11845 10614 } 11846 10615 11847 - if (env->cur_state->active_spin_lock) { 10616 + if (env->cur_state->active_lock.ptr) { 11848 10617 verbose(env, "BPF_LD_[ABS|IND] cannot be used inside bpf_spin_lock-ed region\n"); 10618 + return -EINVAL; 10619 + } 10620 + 10621 + if (env->cur_state->active_rcu_lock) { 10622 + verbose(env, "BPF_LD_[ABS|IND] cannot be used inside bpf_rcu_read_lock-ed region\n"); 11849 10623 return -EINVAL; 11850 10624 } 11851 10625 ··· 13115 11879 if (old->speculative && !cur->speculative) 13116 11880 return false; 13117 11881 13118 - if (old->active_spin_lock != cur->active_spin_lock) 11882 + if (old->active_lock.ptr != cur->active_lock.ptr || 11883 + old->active_lock.id != cur->active_lock.id) 11884 + return false; 11885 + 11886 + if (old->active_rcu_lock != cur->active_rcu_lock) 13119 11887 return false; 13120 11888 13121 11889 /* for states to be equal callsites have to be the same ··· 13764 12524 return -EINVAL; 13765 12525 } 13766 12526 13767 - if (env->cur_state->active_spin_lock && 13768 - (insn->src_reg == BPF_PSEUDO_CALL || 13769 - insn->imm != BPF_FUNC_spin_unlock)) { 13770 - verbose(env, "function calls are not allowed while holding a lock\n"); 13771 - return -EINVAL; 12527 + if (env->cur_state->active_lock.ptr) { 12528 + if ((insn->src_reg == BPF_REG_0 && insn->imm != BPF_FUNC_spin_unlock) || 12529 + (insn->src_reg == BPF_PSEUDO_CALL) || 12530 + (insn->src_reg == BPF_PSEUDO_KFUNC_CALL && 12531 + 
(insn->off != 0 || !is_bpf_list_api_kfunc(insn->imm)))) { 12532 + verbose(env, "function calls are not allowed while holding a lock\n"); 12533 + return -EINVAL; 12534 + } 13772 12535 } 13773 12536 if (insn->src_reg == BPF_PSEUDO_CALL) 13774 12537 err = check_func_call(env, insn, &env->insn_idx); ··· 13804 12561 return -EINVAL; 13805 12562 } 13806 12563 13807 - if (env->cur_state->active_spin_lock) { 12564 + if (env->cur_state->active_lock.ptr) { 13808 12565 verbose(env, "bpf_spin_unlock is missing\n"); 12566 + return -EINVAL; 12567 + } 12568 + 12569 + if (env->cur_state->active_rcu_lock) { 12570 + verbose(env, "bpf_rcu_read_unlock is missing\n"); 13809 12571 return -EINVAL; 13810 12572 } 13811 12573 ··· 14065 12817 14066 12818 { 14067 12819 enum bpf_prog_type prog_type = resolve_prog_type(prog); 12820 + 12821 + if (btf_record_has_field(map->record, BPF_LIST_HEAD)) { 12822 + if (is_tracing_prog_type(prog_type)) { 12823 + verbose(env, "tracing progs cannot use bpf_list_head yet\n"); 12824 + return -EINVAL; 12825 + } 12826 + } 14068 12827 14069 12828 if (btf_record_has_field(map->record, BPF_SPIN_LOCK)) { 14070 12829 if (prog_type == BPF_PROG_TYPE_SOCKET_FILTER) { ··· 14909 13654 break; 14910 13655 case PTR_TO_BTF_ID: 14911 13656 case PTR_TO_BTF_ID | PTR_UNTRUSTED: 13657 + /* PTR_TO_BTF_ID | MEM_ALLOC always has a valid lifetime, unlike 13658 + * PTR_TO_BTF_ID, and an active ref_obj_id, but the same cannot 13659 + * be said once it is marked PTR_UNTRUSTED, hence we must handle 13660 + * any faults for loads into such types. BPF_WRITE is disallowed 13661 + * for this case. 
13662 + */ 13663 + case PTR_TO_BTF_ID | MEM_ALLOC | PTR_UNTRUSTED: 14912 13664 if (type == BPF_READ) { 14913 13665 insn->code = BPF_LDX | BPF_PROBE_MEM | 14914 13666 BPF_SIZE((insn)->code); ··· 15281 14019 return err; 15282 14020 } 15283 14021 15284 - static int fixup_kfunc_call(struct bpf_verifier_env *env, 15285 - struct bpf_insn *insn) 14022 + static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, 14023 + struct bpf_insn *insn_buf, int insn_idx, int *cnt) 15286 14024 { 15287 14025 const struct bpf_kfunc_desc *desc; 15288 14026 ··· 15301 14039 return -EFAULT; 15302 14040 } 15303 14041 14042 + *cnt = 0; 15304 14043 insn->imm = desc->imm; 14044 + if (insn->off) 14045 + return 0; 14046 + if (desc->func_id == special_kfunc_list[KF_bpf_obj_new_impl]) { 14047 + struct btf_struct_meta *kptr_struct_meta = env->insn_aux_data[insn_idx].kptr_struct_meta; 14048 + struct bpf_insn addr[2] = { BPF_LD_IMM64(BPF_REG_2, (long)kptr_struct_meta) }; 14049 + u64 obj_new_size = env->insn_aux_data[insn_idx].obj_new_size; 15305 14050 14051 + insn_buf[0] = BPF_MOV64_IMM(BPF_REG_1, obj_new_size); 14052 + insn_buf[1] = addr[0]; 14053 + insn_buf[2] = addr[1]; 14054 + insn_buf[3] = *insn; 14055 + *cnt = 4; 14056 + } else if (desc->func_id == special_kfunc_list[KF_bpf_obj_drop_impl]) { 14057 + struct btf_struct_meta *kptr_struct_meta = env->insn_aux_data[insn_idx].kptr_struct_meta; 14058 + struct bpf_insn addr[2] = { BPF_LD_IMM64(BPF_REG_2, (long)kptr_struct_meta) }; 14059 + 14060 + insn_buf[0] = addr[0]; 14061 + insn_buf[1] = addr[1]; 14062 + insn_buf[2] = *insn; 14063 + *cnt = 3; 14064 + } else if (desc->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx] || 14065 + desc->func_id == special_kfunc_list[KF_bpf_rdonly_cast]) { 14066 + insn_buf[0] = BPF_MOV64_REG(BPF_REG_0, BPF_REG_1); 14067 + *cnt = 1; 14068 + } 15306 14069 return 0; 15307 14070 } 15308 14071 ··· 15469 14182 if (insn->src_reg == BPF_PSEUDO_CALL) 15470 14183 continue; 15471 14184 if (insn->src_reg == 
BPF_PSEUDO_KFUNC_CALL) { 15472 - ret = fixup_kfunc_call(env, insn); 14185 + ret = fixup_kfunc_call(env, insn, insn_buf, i + delta, &cnt); 15473 14186 if (ret) 15474 14187 return ret; 14188 + if (cnt == 0) 14189 + continue; 14190 + 14191 + new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt); 14192 + if (!new_prog) 14193 + return -ENOMEM; 14194 + 14195 + delta += cnt - 1; 14196 + env->prog = prog = new_prog; 14197 + insn = new_prog->insnsi + i + delta; 15475 14198 continue; 15476 14199 } 15477 14200 ··· 15599 14302 goto patch_call_imm; 15600 14303 } 15601 14304 15602 - if (insn->imm == BPF_FUNC_task_storage_get || 15603 - insn->imm == BPF_FUNC_sk_storage_get || 15604 - insn->imm == BPF_FUNC_inode_storage_get || 15605 - insn->imm == BPF_FUNC_cgrp_storage_get) { 15606 - if (env->prog->aux->sleepable) 15607 - insn_buf[0] = BPF_MOV64_IMM(BPF_REG_5, (__force __s32)GFP_KERNEL); 15608 - else 14305 + if (is_storage_get_function(insn->imm)) { 14306 + if (!env->prog->aux->sleepable || 14307 + env->insn_aux_data[i + delta].storage_get_func_atomic) 15609 14308 insn_buf[0] = BPF_MOV64_IMM(BPF_REG_5, (__force __s32)GFP_ATOMIC); 14309 + else 14310 + insn_buf[0] = BPF_MOV64_IMM(BPF_REG_5, (__force __s32)GFP_KERNEL); 15610 14311 insn_buf[1] = *insn; 15611 14312 cnt = 2; 15612 14313 ··· 15674 14379 BUILD_BUG_ON(!__same_type(ops->map_peek_elem, 15675 14380 (int (*)(struct bpf_map *map, void *value))NULL)); 15676 14381 BUILD_BUG_ON(!__same_type(ops->map_redirect, 15677 - (int (*)(struct bpf_map *map, u32 ifindex, u64 flags))NULL)); 14382 + (int (*)(struct bpf_map *map, u64 index, u64 flags))NULL)); 15678 14383 BUILD_BUG_ON(!__same_type(ops->map_for_each_callback, 15679 14384 (int (*)(struct bpf_map *map, 15680 14385 bpf_callback_t callback_fn, ··· 16683 15388 env->bypass_spec_v1 = bpf_bypass_spec_v1(); 16684 15389 env->bypass_spec_v4 = bpf_bypass_spec_v4(); 16685 15390 env->bpf_capable = bpf_capable(); 15391 + env->rcu_tag_supported = btf_vmlinux && 15392 + 
btf_find_by_name_kind(btf_vmlinux, "rcu", BTF_KIND_TYPE_TAG) > 0; 16686 15393 16687 15394 if (is_priv) 16688 15395 env->test_state_freq = attr->prog_flags & BPF_F_TEST_STATE_FREQ;
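The verifier hunks above replace the single `active_spin_lock` value with an `active_lock` `(ptr, id)` pair and add a separate `active_rcu_lock` flag, and `states_equal()` must now compare all three. A minimal userspace sketch of that equivalence rule (the struct here is illustrative, not the kernel's `bpf_verifier_state`):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model of the verifier state fields touched by this series:
 * the old boolean-ish active_spin_lock becomes a (ptr, id) pair so the
 * verifier can pair each unlock with the exact lock that was taken, and
 * active_rcu_lock tracks bpf_rcu_read_lock() sections separately.
 * Illustrative layout only, not the kernel's actual struct. */
struct verifier_state {
	struct {
		void *ptr;		/* map value / allocated object holding the lock */
		unsigned int id;	/* reg->id of the register the lock was taken on */
	} active_lock;
	bool active_rcu_lock;
};

/* Mirrors the new states_equal() checks: states are only equivalent if
 * they hold the same lock (same ptr AND same id) and agree on whether
 * an RCU read-side section is currently open. */
static bool lock_state_equal(const struct verifier_state *old,
			     const struct verifier_state *cur)
{
	if (old->active_lock.ptr != cur->active_lock.ptr ||
	    old->active_lock.id != cur->active_lock.id)
		return false;
	if (old->active_rcu_lock != cur->active_rcu_lock)
		return false;
	return true;
}
```

Tracking the id alongside the pointer is what lets the verifier match a `bpf_spin_unlock()` against the specific object whose lock was taken, instead of treating "some lock is held" as a single global bit.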
+3 -3
kernel/trace/bpf_trace.c
··· 774 774 const struct bpf_func_proto bpf_get_current_task_btf_proto = { 775 775 .func = bpf_get_current_task_btf, 776 776 .gpl_only = true, 777 - .ret_type = RET_PTR_TO_BTF_ID, 777 + .ret_type = RET_PTR_TO_BTF_ID_TRUSTED, 778 778 .ret_btf_id = &btf_tracing_ids[BTF_TRACING_TYPE_TASK], 779 779 }; 780 780 ··· 1485 1485 case BPF_FUNC_get_task_stack: 1486 1486 return &bpf_get_task_stack_proto; 1487 1487 case BPF_FUNC_copy_from_user: 1488 - return prog->aux->sleepable ? &bpf_copy_from_user_proto : NULL; 1488 + return &bpf_copy_from_user_proto; 1489 1489 case BPF_FUNC_copy_from_user_task: 1490 - return prog->aux->sleepable ? &bpf_copy_from_user_task_proto : NULL; 1490 + return &bpf_copy_from_user_task_proto; 1491 1491 case BPF_FUNC_snprintf_btf: 1492 1492 return &bpf_snprintf_btf_proto; 1493 1493 case BPF_FUNC_per_cpu_ptr:
+7 -7
net/bpf/bpf_dummy_struct_ops.c
··· 156 156 } 157 157 158 158 static int bpf_dummy_ops_btf_struct_access(struct bpf_verifier_log *log, 159 - const struct btf *btf, 160 - const struct btf_type *t, int off, 161 - int size, enum bpf_access_type atype, 159 + const struct bpf_reg_state *reg, 160 + int off, int size, enum bpf_access_type atype, 162 161 u32 *next_btf_id, 163 162 enum bpf_type_flag *flag) 164 163 { 165 164 const struct btf_type *state; 165 + const struct btf_type *t; 166 166 s32 type_id; 167 167 int err; 168 168 169 - type_id = btf_find_by_name_kind(btf, "bpf_dummy_ops_state", 169 + type_id = btf_find_by_name_kind(reg->btf, "bpf_dummy_ops_state", 170 170 BTF_KIND_STRUCT); 171 171 if (type_id < 0) 172 172 return -EINVAL; 173 173 174 - state = btf_type_by_id(btf, type_id); 174 + t = btf_type_by_id(reg->btf, reg->btf_id); 175 + state = btf_type_by_id(reg->btf, type_id); 175 176 if (t != state) { 176 177 bpf_log(log, "only access to bpf_dummy_ops_state is supported\n"); 177 178 return -EACCES; 178 179 } 179 180 180 - err = btf_struct_access(log, btf, t, off, size, atype, next_btf_id, 181 - flag); 181 + err = btf_struct_access(log, reg, off, size, atype, next_btf_id, flag); 182 182 if (err < 0) 183 183 return err; 184 184
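The `btf_struct_access()` refactor above drops the separate `(btf, t)` parameters in favour of the register state, since `reg->btf` and `reg->btf_id` already identify the type being accessed and the callee can look it up itself. A toy model of that calling convention (all names here are hypothetical, standing in for the kernel's BTF machinery):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of the signature change: callers used to pass
 * a BTF object and a resolved type; now they pass the register state
 * and the callback derives the type from reg->btf / reg->btf_id. */
struct mini_btf {
	const char **type_names;	/* indexed by type id */
};

struct mini_reg {
	const struct mini_btf *btf;	/* which BTF the id refers to */
	int btf_id;			/* type id within that BTF */
};

/* What each converted callback now does internally instead of
 * receiving the type as an argument. */
static const char *reg_type_name(const struct mini_reg *reg)
{
	return reg->btf->type_names[reg->btf_id];
}
```

Passing the register instead of the pre-resolved type means callbacks can also consult other register metadata (offsets, flags) without further signature churn.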
-3
net/bpf/test_run.c
··· 980 980 { 981 981 struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb; 982 982 983 - if (!skb->len) 984 - return -EINVAL; 985 - 986 983 if (!__skb) 987 984 return 0; 988 985
+28 -29
net/core/filter.c
··· 2124 2124 { 2125 2125 unsigned int mlen = skb_network_offset(skb); 2126 2126 2127 + if (unlikely(skb->len <= mlen)) { 2128 + kfree_skb(skb); 2129 + return -ERANGE; 2130 + } 2131 + 2127 2132 if (mlen) { 2128 2133 __skb_pull(skb, mlen); 2129 - if (unlikely(!skb->len)) { 2130 - kfree_skb(skb); 2131 - return -ERANGE; 2132 - } 2133 2134 2134 2135 /* At ingress, the mac header has already been pulled once. 2135 2136 * At egress, skb_pospull_rcsum has to be done in case that ··· 2150 2149 u32 flags) 2151 2150 { 2152 2151 /* Verify that a link layer header is carried */ 2153 - if (unlikely(skb->mac_header >= skb->network_header)) { 2152 + if (unlikely(skb->mac_header >= skb->network_header || skb->len == 0)) { 2154 2153 kfree_skb(skb); 2155 2154 return -ERANGE; 2156 2155 } ··· 4109 4108 .arg2_type = ARG_ANYTHING, 4110 4109 }; 4111 4110 4112 - /* XDP_REDIRECT works by a three-step process, implemented in the functions 4111 + /** 4112 + * DOC: xdp redirect 4113 + * 4114 + * XDP_REDIRECT works by a three-step process, implemented in the functions 4113 4115 * below: 4114 4116 * 4115 4117 * 1. The bpf_redirect() and bpf_redirect_map() helpers will lookup the target ··· 4127 4123 * 3. Before exiting its NAPI poll loop, the driver will call xdp_do_flush(), 4128 4124 * which will flush all the different bulk queues, thus completing the 4129 4125 * redirect. 4130 - * 4126 + */ 4127 + /* 4131 4128 * Pointers to the map entries will be kept around for this whole sequence of 4132 4129 * steps, protected by RCU. 
However, there is no top-level rcu_read_lock() in 4133 4130 * the core code; instead, the RCU protection relies on everything happening ··· 4419 4414 .arg2_type = ARG_ANYTHING, 4420 4415 }; 4421 4416 4422 - BPF_CALL_3(bpf_xdp_redirect_map, struct bpf_map *, map, u32, ifindex, 4417 + BPF_CALL_3(bpf_xdp_redirect_map, struct bpf_map *, map, u64, key, 4423 4418 u64, flags) 4424 4419 { 4425 - return map->ops->map_redirect(map, ifindex, flags); 4420 + return map->ops->map_redirect(map, key, flags); 4426 4421 } 4427 4422 4428 4423 static const struct bpf_func_proto bpf_xdp_redirect_map_proto = { ··· 8656 8651 DEFINE_MUTEX(nf_conn_btf_access_lock); 8657 8652 EXPORT_SYMBOL_GPL(nf_conn_btf_access_lock); 8658 8653 8659 - int (*nfct_btf_struct_access)(struct bpf_verifier_log *log, const struct btf *btf, 8660 - const struct btf_type *t, int off, int size, 8661 - enum bpf_access_type atype, u32 *next_btf_id, 8662 - enum bpf_type_flag *flag); 8654 + int (*nfct_btf_struct_access)(struct bpf_verifier_log *log, 8655 + const struct bpf_reg_state *reg, 8656 + int off, int size, enum bpf_access_type atype, 8657 + u32 *next_btf_id, enum bpf_type_flag *flag); 8663 8658 EXPORT_SYMBOL_GPL(nfct_btf_struct_access); 8664 8659 8665 8660 static int tc_cls_act_btf_struct_access(struct bpf_verifier_log *log, 8666 - const struct btf *btf, 8667 - const struct btf_type *t, int off, 8668 - int size, enum bpf_access_type atype, 8669 - u32 *next_btf_id, 8670 - enum bpf_type_flag *flag) 8661 + const struct bpf_reg_state *reg, 8662 + int off, int size, enum bpf_access_type atype, 8663 + u32 *next_btf_id, enum bpf_type_flag *flag) 8671 8664 { 8672 8665 int ret = -EACCES; 8673 8666 8674 8667 if (atype == BPF_READ) 8675 - return btf_struct_access(log, btf, t, off, size, atype, next_btf_id, 8676 - flag); 8668 + return btf_struct_access(log, reg, off, size, atype, next_btf_id, flag); 8677 8669 8678 8670 mutex_lock(&nf_conn_btf_access_lock); 8679 8671 if (nfct_btf_struct_access) 8680 - ret = 
nfct_btf_struct_access(log, btf, t, off, size, atype, next_btf_id, flag); 8672 + ret = nfct_btf_struct_access(log, reg, off, size, atype, next_btf_id, flag); 8681 8673 mutex_unlock(&nf_conn_btf_access_lock); 8682 8674 8683 8675 return ret; ··· 8740 8738 EXPORT_SYMBOL_GPL(bpf_warn_invalid_xdp_action); 8741 8739 8742 8740 static int xdp_btf_struct_access(struct bpf_verifier_log *log, 8743 - const struct btf *btf, 8744 - const struct btf_type *t, int off, 8745 - int size, enum bpf_access_type atype, 8746 - u32 *next_btf_id, 8747 - enum bpf_type_flag *flag) 8741 + const struct bpf_reg_state *reg, 8742 + int off, int size, enum bpf_access_type atype, 8743 + u32 *next_btf_id, enum bpf_type_flag *flag) 8748 8744 { 8749 8745 int ret = -EACCES; 8750 8746 8751 8747 if (atype == BPF_READ) 8752 - return btf_struct_access(log, btf, t, off, size, atype, next_btf_id, 8753 - flag); 8748 + return btf_struct_access(log, reg, off, size, atype, next_btf_id, flag); 8754 8749 8755 8750 mutex_lock(&nf_conn_btf_access_lock); 8756 8751 if (nfct_btf_struct_access) 8757 - ret = nfct_btf_struct_access(log, btf, t, off, size, atype, next_btf_id, flag); 8752 + ret = nfct_btf_struct_access(log, reg, off, size, atype, next_btf_id, flag); 8758 8753 mutex_unlock(&nf_conn_btf_access_lock); 8759 8754 8760 8755 return ret;
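The filter.c rework above rejects a too-short skb up front, before pulling the mac header, instead of pulling first and only then noticing the skb became empty (the companion test_run.c hunk drops its now-redundant zero-length check). Reduced to plain integers, the new bound looks like this (a sketch, not the kernel code):

```c
#include <assert.h>

/* Sketch of the reordered length check in the skb redirect path: any
 * skb whose total length does not exceed the bytes about to be pulled
 * is rejected before the pull, so we never end up holding an empty skb. */
#define SKETCH_ERANGE 34

static int check_redirect_len(unsigned int skb_len, unsigned int mac_len)
{
	if (skb_len <= mac_len)
		return -SKETCH_ERANGE;	/* would leave a zero-length skb */
	return 0;
}
```

Doing the comparison against the full length before mutating the skb is what lets existing BPF_PROG_TEST_RUN users keep passing zero-length inputs through paths that never reach the redirect.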
+9 -8
net/ipv4/bpf_tcp_ca.c
··· 61 61 if (!bpf_tracing_btf_ctx_access(off, size, type, prog, info)) 62 62 return false; 63 63 64 - if (info->reg_type == PTR_TO_BTF_ID && info->btf_id == sock_id) 64 + if (base_type(info->reg_type) == PTR_TO_BTF_ID && 65 + !bpf_type_has_unsafe_modifiers(info->reg_type) && 66 + info->btf_id == sock_id) 65 67 /* promote it to tcp_sock */ 66 68 info->btf_id = tcp_sock_id; 67 69 ··· 71 69 } 72 70 73 71 static int bpf_tcp_ca_btf_struct_access(struct bpf_verifier_log *log, 74 - const struct btf *btf, 75 - const struct btf_type *t, int off, 76 - int size, enum bpf_access_type atype, 77 - u32 *next_btf_id, 78 - enum bpf_type_flag *flag) 72 + const struct bpf_reg_state *reg, 73 + int off, int size, enum bpf_access_type atype, 74 + u32 *next_btf_id, enum bpf_type_flag *flag) 79 75 { 76 + const struct btf_type *t; 80 77 size_t end; 81 78 82 79 if (atype == BPF_READ) 83 - return btf_struct_access(log, btf, t, off, size, atype, next_btf_id, 84 - flag); 80 + return btf_struct_access(log, reg, off, size, atype, next_btf_id, flag); 85 81 82 + t = btf_type_by_id(reg->btf, reg->btf_id); 86 83 if (t != tcp_sock_type) { 87 84 bpf_log(log, "only read is supported\n"); 88 85 return -EACCES;
+7 -10
net/netfilter/nf_conntrack_bpf.c
··· 191 191 192 192 /* Check writes into `struct nf_conn` */ 193 193 static int _nf_conntrack_btf_struct_access(struct bpf_verifier_log *log, 194 - const struct btf *btf, 195 - const struct btf_type *t, int off, 196 - int size, enum bpf_access_type atype, 197 - u32 *next_btf_id, 198 - enum bpf_type_flag *flag) 194 + const struct bpf_reg_state *reg, 195 + int off, int size, enum bpf_access_type atype, 196 + u32 *next_btf_id, enum bpf_type_flag *flag) 199 197 { 200 - const struct btf_type *ncit; 201 - const struct btf_type *nct; 198 + const struct btf_type *ncit, *nct, *t; 202 199 size_t end; 203 200 204 - ncit = btf_type_by_id(btf, btf_nf_conn_ids[1]); 205 - nct = btf_type_by_id(btf, btf_nf_conn_ids[0]); 206 - 201 + ncit = btf_type_by_id(reg->btf, btf_nf_conn_ids[1]); 202 + nct = btf_type_by_id(reg->btf, btf_nf_conn_ids[0]); 203 + t = btf_type_by_id(reg->btf, reg->btf_id); 207 204 if (t != nct && t != ncit) { 208 205 bpf_log(log, "only read is supported\n"); 209 206 return -EACCES;
+2 -2
net/xdp/xskmap.c
··· 231 231 return 0; 232 232 } 233 233 234 - static int xsk_map_redirect(struct bpf_map *map, u32 ifindex, u64 flags) 234 + static int xsk_map_redirect(struct bpf_map *map, u64 index, u64 flags) 235 235 { 236 - return __bpf_xdp_redirect_map(map, ifindex, flags, 0, 236 + return __bpf_xdp_redirect_map(map, index, flags, 0, 237 237 __xsk_map_lookup_elem); 238 238 } 239 239
+1 -1
samples/bpf/test_cgrp2_tc.sh
··· 115 115 if [ "$DEBUG" == "yes" ] && [ "$MODE" != 'cleanuponly' ] 116 116 then 117 117 echo "------ DEBUG ------" 118 - echo "mount: "; mount | egrep '(cgroup2|bpf)'; echo 118 + echo "mount: "; mount | grep -E '(cgroup2|bpf)'; echo 119 119 echo "$CGRP2_TC_LEAF: "; ls -l $CGRP2_TC_LEAF; echo 120 120 if [ -d "$BPF_FS_TC_SHARE" ] 121 121 then
+1 -1
samples/bpf/xdp_router_ipv4_user.c
··· 162 162 __be32 gw; 163 163 } *prefix_value; 164 164 165 - prefix_key = alloca(sizeof(*prefix_key) + 3); 165 + prefix_key = alloca(sizeof(*prefix_key) + 4); 166 166 prefix_value = alloca(sizeof(*prefix_value)); 167 167 168 168 prefix_key->prefixlen = 32;
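The one-byte fix above (`+ 3` becoming `+ 4`) accounts for the LPM trie key's data portion being a full 4-byte IPv4 address, not 3 bytes. Spelling the key out as a struct makes the required allocation size explicit (illustrative layout, assuming the usual 4-byte `prefixlen` header with no padding):

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the IPv4 LPM trie key the sample allocates with alloca():
 * a 4-byte prefix length followed by the address bytes themselves.
 * For IPv4 the data portion is 4 bytes, so the total key is 8 bytes -
 * allocating sizeof(prefixlen) + 3 was one byte short. */
struct lpm_v4_key {
	uint32_t prefixlen;	/* up to 32 for IPv4 */
	uint8_t  data[4];	/* the full address: 4 bytes, not 3 */
};
```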
-9
tools/bpf/bpftool/Documentation/common_options.rst
··· 23 23 Print all logs available, even debug-level information. This includes 24 24 logs from libbpf as well as from the verifier, when attempting to 25 25 load programs. 26 - 27 - -l, --legacy 28 - Use legacy libbpf mode which has more relaxed BPF program 29 - requirements. By default, bpftool has more strict requirements 30 - about section names, changes pinning logic and doesn't support 31 - some of the older non-BTF map declarations. 32 - 33 - See https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0 34 - for details.
+1 -1
tools/bpf/bpftool/Documentation/substitutions.rst
··· 1 1 .. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) 2 2 3 - .. |COMMON_OPTIONS| replace:: { **-j** | **--json** } [{ **-p** | **--pretty** }] | { **-d** | **--debug** } | { **-l** | **--legacy** } 3 + .. |COMMON_OPTIONS| replace:: { **-j** | **--json** } [{ **-p** | **--pretty** }] | { **-d** | **--debug** }
+1 -1
tools/bpf/bpftool/bash-completion/bpftool
··· 261 261 # Deal with options 262 262 if [[ ${words[cword]} == -* ]]; then 263 263 local c='--version --json --pretty --bpffs --mapcompat --debug \ 264 - --use-loader --base-btf --legacy' 264 + --use-loader --base-btf' 265 265 COMPREPLY=( $( compgen -W "$c" -- "$cur" ) ) 266 266 return 0 267 267 fi
+8 -11
tools/bpf/bpftool/btf.c
··· 467 467 int err = 0, i; 468 468 469 469 d = btf_dump__new(btf, btf_dump_printf, NULL, NULL); 470 - err = libbpf_get_error(d); 471 - if (err) 472 - return err; 470 + if (!d) 471 + return -errno; 473 472 474 473 printf("#ifndef __VMLINUX_H__\n"); 475 474 printf("#define __VMLINUX_H__\n"); ··· 511 512 struct btf *base; 512 513 513 514 base = btf__parse(sysfs_vmlinux, NULL); 514 - if (libbpf_get_error(base)) { 515 - p_err("failed to parse vmlinux BTF at '%s': %ld\n", 516 - sysfs_vmlinux, libbpf_get_error(base)); 517 - base = NULL; 518 - } 515 + if (!base) 516 + p_err("failed to parse vmlinux BTF at '%s': %d\n", 517 + sysfs_vmlinux, -errno); 519 518 520 519 return base; 521 520 } ··· 556 559 __u32 btf_id = -1; 557 560 const char *src; 558 561 int fd = -1; 559 - int err; 562 + int err = 0; 560 563 561 564 if (!REQ_ARGS(2)) { 562 565 usage(); ··· 631 634 base = get_vmlinux_btf_from_sysfs(); 632 635 633 636 btf = btf__parse_split(*argv, base ?: base_btf); 634 - err = libbpf_get_error(btf); 635 637 if (!btf) { 638 + err = -errno; 636 639 p_err("failed to load BTF from %s: %s", 637 640 *argv, strerror(errno)); 638 641 goto done; ··· 678 681 } 679 682 680 683 btf = btf__load_from_kernel_by_id_split(btf_id, base_btf); 681 - err = libbpf_get_error(btf); 682 684 if (!btf) { 685 + err = -errno; 683 686 p_err("get btf by id (%u): %s", btf_id, strerror(errno)); 684 687 goto done; 685 688 }
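The bpftool conversions above (and the similar hunks in gen.c, iter.c, map.c, prog.c and struct_ops.c) follow the libbpf 1.0 error convention: constructors return NULL and set errno on failure rather than encoding the error into the returned pointer, so `libbpf_get_error()` calls become plain NULL checks. A hypothetical constructor showing the caller-side pattern (`alloc_thing` stands in for any libbpf constructor such as `btf_dump__new()`):

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for a libbpf 1.0-style constructor: on failure it returns
 * NULL with errno set; it never returns an encoded error pointer. */
static void *alloc_thing(int should_fail)
{
	if (should_fail) {
		errno = ENOMEM;
		return NULL;
	}
	return (void *)0x1;	/* stand-in for a real heap object */
}

/* Caller-side pattern after the conversion: check the pointer, then
 * capture -errno immediately, before anything can clobber it. */
static int open_thing(int should_fail)
{
	void *d = alloc_thing(should_fail);

	if (!d)
		return -errno;
	return 0;
}
```

Capturing `-errno` right at the failed call is the important part; the removed `libbpf_get_error()` pattern allowed the check and the error retrieval to drift apart.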
+1 -1
tools/bpf/bpftool/btf_dumper.c
··· 75 75 goto print; 76 76 77 77 prog_btf = btf__load_from_kernel_by_id(info.btf_id); 78 - if (libbpf_get_error(prog_btf)) 78 + if (!prog_btf) 79 79 goto print; 80 80 func_type = btf__type_by_id(prog_btf, finfo.type_id); 81 81 if (!func_type || !btf_is_func(func_type))
+4 -6
tools/bpf/bpftool/gen.c
··· 252 252 int err = 0; 253 253 254 254 d = btf_dump__new(btf, codegen_btf_dump_printf, NULL, NULL); 255 - err = libbpf_get_error(d); 256 - if (err) 257 - return err; 255 + if (!d) 256 + return -errno; 258 257 259 258 bpf_object__for_each_map(map, obj) { 260 259 /* only generate definitions for memory-mapped internal maps */ ··· 975 976 /* log_level1 + log_level2 + stats, but not stable UAPI */ 976 977 opts.kernel_log_level = 1 + 2 + 4; 977 978 obj = bpf_object__open_mem(obj_data, file_sz, &opts); 978 - err = libbpf_get_error(obj); 979 - if (err) { 979 + if (!obj) { 980 980 char err_buf[256]; 981 981 982 + err = -errno; 982 983 libbpf_strerror(err, err_buf, sizeof(err_buf)); 983 984 p_err("failed to open BPF object file: %s", err_buf); 984 - obj = NULL; 985 985 goto out; 986 986 } 987 987
+6 -4
tools/bpf/bpftool/iter.c
··· 4 4 #ifndef _GNU_SOURCE 5 5 #define _GNU_SOURCE 6 6 #endif 7 + #include <errno.h> 7 8 #include <unistd.h> 8 9 #include <linux/err.h> 9 10 #include <bpf/libbpf.h> ··· 49 48 } 50 49 51 50 obj = bpf_object__open(objfile); 52 - err = libbpf_get_error(obj); 53 - if (err) { 51 + if (!obj) { 52 + err = -errno; 54 53 p_err("can't open objfile %s", objfile); 55 54 goto close_map_fd; 56 55 } ··· 63 62 64 63 prog = bpf_object__next_program(obj, NULL); 65 64 if (!prog) { 65 + err = -errno; 66 66 p_err("can't find bpf program in objfile %s", objfile); 67 67 goto close_obj; 68 68 } 69 69 70 70 link = bpf_program__attach_iter(prog, &iter_opts); 71 - err = libbpf_get_error(link); 72 - if (err) { 71 + if (!link) { 72 + err = -errno; 73 73 p_err("attach_iter failed for program %s", 74 74 bpf_program__name(prog)); 75 75 goto close_obj;
+6 -22
tools/bpf/bpftool/main.c
··· 31 31 bool verifier_logs; 32 32 bool relaxed_maps; 33 33 bool use_loader; 34 - bool legacy_libbpf; 35 34 struct btf *base_btf; 36 35 struct hashmap *refs_table; 37 36 ··· 159 160 jsonw_start_object(json_wtr); /* features */ 160 161 jsonw_bool_field(json_wtr, "libbfd", has_libbfd); 161 162 jsonw_bool_field(json_wtr, "llvm", has_llvm); 162 - jsonw_bool_field(json_wtr, "libbpf_strict", !legacy_libbpf); 163 163 jsonw_bool_field(json_wtr, "skeletons", has_skeletons); 164 164 jsonw_bool_field(json_wtr, "bootstrap", bootstrap); 165 165 jsonw_end_object(json_wtr); /* features */ ··· 177 179 printf("features:"); 178 180 print_feature("libbfd", has_libbfd, &nb_features); 179 181 print_feature("llvm", has_llvm, &nb_features); 180 - print_feature("libbpf_strict", !legacy_libbpf, &nb_features); 181 182 print_feature("skeletons", has_skeletons, &nb_features); 182 183 print_feature("bootstrap", bootstrap, &nb_features); 183 184 printf("\n"); ··· 334 337 if (argc < 2) { 335 338 p_err("too few parameters for batch"); 336 339 return -1; 337 - } else if (!is_prefix(*argv, "file")) { 338 - p_err("expected 'file', got: %s", *argv); 339 - return -1; 340 340 } else if (argc > 2) { 341 341 p_err("too many parameters for batch"); 342 + return -1; 343 + } else if (!is_prefix(*argv, "file")) { 344 + p_err("expected 'file', got: %s", *argv); 342 345 return -1; 343 346 } 344 347 NEXT_ARG(); ··· 448 451 { "debug", no_argument, NULL, 'd' }, 449 452 { "use-loader", no_argument, NULL, 'L' }, 450 453 { "base-btf", required_argument, NULL, 'B' }, 451 - { "legacy", no_argument, NULL, 'l' }, 452 454 { 0 } 453 455 }; 454 456 bool version_requested = false; ··· 510 514 break; 511 515 case 'B': 512 516 base_btf = btf__parse(optarg, NULL); 513 - if (libbpf_get_error(base_btf)) { 514 - p_err("failed to parse base BTF at '%s': %ld\n", 515 - optarg, libbpf_get_error(base_btf)); 516 - base_btf = NULL; 517 + if (!base_btf) { 518 + p_err("failed to parse base BTF at '%s': %d\n", 519 + optarg, -errno); 517 
520 return -1; 518 521 } 519 522 break; 520 523 case 'L': 521 524 use_loader = true; 522 - break; 523 - case 'l': 524 - legacy_libbpf = true; 525 525 break; 526 526 default: 527 527 p_err("unrecognized option '%s'", argv[optind - 1]); ··· 526 534 else 527 535 usage(); 528 536 } 529 - } 530 - 531 - if (!legacy_libbpf) { 532 - /* Allow legacy map definitions for skeleton generation. 533 - * It will still be rejected if users use LIBBPF_STRICT_ALL 534 - * mode for loading generated skeleton. 535 - */ 536 - libbpf_set_strict_mode(LIBBPF_STRICT_ALL & ~LIBBPF_STRICT_MAP_DEFINITIONS); 537 537 } 538 538 539 539 argc -= optind;
+1 -2
tools/bpf/bpftool/main.h
··· 57 57 #define HELP_SPEC_PROGRAM \ 58 58 "PROG := { id PROG_ID | pinned FILE | tag PROG_TAG | name PROG_NAME }" 59 59 #define HELP_SPEC_OPTIONS \ 60 - "OPTIONS := { {-j|--json} [{-p|--pretty}] | {-d|--debug} | {-l|--legacy}" 60 + "OPTIONS := { {-j|--json} [{-p|--pretty}] | {-d|--debug}" 61 61 #define HELP_SPEC_MAP \ 62 62 "MAP := { id MAP_ID | pinned FILE | name MAP_NAME }" 63 63 #define HELP_SPEC_LINK \ ··· 82 82 extern bool verifier_logs; 83 83 extern bool relaxed_maps; 84 84 extern bool use_loader; 85 - extern bool legacy_libbpf; 86 85 extern struct btf *base_btf; 87 86 extern struct hashmap *refs_table; 88 87
+7 -13
tools/bpf/bpftool/map.c
··· 786 786 if (info->btf_vmlinux_value_type_id) { 787 787 if (!btf_vmlinux) { 788 788 btf_vmlinux = libbpf_find_kernel_btf(); 789 - err = libbpf_get_error(btf_vmlinux); 790 - if (err) { 789 + if (!btf_vmlinux) { 791 790 p_err("failed to get kernel btf"); 792 - return err; 791 + return -errno; 793 792 } 794 793 } 795 794 *btf = btf_vmlinux; 796 795 } else if (info->btf_value_type_id) { 797 796 *btf = btf__load_from_kernel_by_id(info->btf_id); 798 - err = libbpf_get_error(*btf); 799 - if (err) 797 + if (!*btf) { 798 + err = -errno; 800 799 p_err("failed to get btf"); 800 + } 801 801 } else { 802 802 *btf = NULL; 803 803 } ··· 807 807 808 808 static void free_map_kv_btf(struct btf *btf) 809 809 { 810 - if (!libbpf_get_error(btf) && btf != btf_vmlinux) 810 + if (btf != btf_vmlinux) 811 811 btf__free(btf); 812 - } 813 - 814 - static void free_btf_vmlinux(void) 815 - { 816 - if (!libbpf_get_error(btf_vmlinux)) 817 - btf__free(btf_vmlinux); 818 812 } 819 813 820 814 static int ··· 947 953 close(fds[i]); 948 954 exit_free: 949 955 free(fds); 950 - free_btf_vmlinux(); 956 + btf__free(btf_vmlinux); 951 957 return err; 952 958 } 953 959
+5 -10
tools/bpf/bpftool/prog.c
··· 322 322 return; 323 323 324 324 btf = btf__load_from_kernel_by_id(map_info.btf_id); 325 - if (libbpf_get_error(btf)) 325 + if (!btf) 326 326 goto out_free; 327 327 328 328 t_datasec = btf__type_by_id(btf, map_info.btf_value_type_id); ··· 726 726 727 727 if (info->btf_id) { 728 728 btf = btf__load_from_kernel_by_id(info->btf_id); 729 - if (libbpf_get_error(btf)) { 729 + if (!btf) { 730 730 p_err("failed to get btf"); 731 731 return -1; 732 732 } ··· 1663 1663 open_opts.kernel_log_level = 1 + 2 + 4; 1664 1664 1665 1665 obj = bpf_object__open_file(file, &open_opts); 1666 - if (libbpf_get_error(obj)) { 1666 + if (!obj) { 1667 1667 p_err("failed to open object file"); 1668 1668 goto err_free_reuse_maps; 1669 1669 } ··· 1802 1802 else 1803 1803 bpf_object__unpin_programs(obj, pinfile); 1804 1804 err_close_obj: 1805 - if (!legacy_libbpf) { 1806 - p_info("Warning: bpftool is now running in libbpf strict mode and has more stringent requirements about BPF programs.\n" 1807 - "If it used to work for this object file but now doesn't, see --legacy option for more details.\n"); 1808 - } 1809 - 1810 1805 bpf_object__close(obj); 1811 1806 err_free_reuse_maps: 1812 1807 for (i = 0; i < old_map_fds; i++) ··· 1882 1887 open_opts.kernel_log_level = 1 + 2 + 4; 1883 1888 1884 1889 obj = bpf_object__open_file(file, &open_opts); 1885 - if (libbpf_get_error(obj)) { 1890 + if (!obj) { 1886 1891 p_err("failed to open object file"); 1887 1892 goto err_close_obj; 1888 1893 } ··· 2199 2204 } 2200 2205 2201 2206 btf = btf__load_from_kernel_by_id(info.btf_id); 2202 - if (libbpf_get_error(btf)) { 2207 + if (!btf) { 2203 2208 p_err("failed to load btf for prog FD %d", tgt_fd); 2204 2209 goto out; 2205 2210 }
+9 -13
tools/bpf/bpftool/struct_ops.c
··· 32 32 return btf_vmlinux; 33 33 34 34 btf_vmlinux = libbpf_find_kernel_btf(); 35 - if (libbpf_get_error(btf_vmlinux)) 35 + if (!btf_vmlinux) 36 36 p_err("struct_ops requires kernel CONFIG_DEBUG_INFO_BTF=y"); 37 37 38 38 return btf_vmlinux; ··· 45 45 const char *st_ops_name; 46 46 47 47 kern_btf = get_btf_vmlinux(); 48 - if (libbpf_get_error(kern_btf)) 48 + if (!kern_btf) 49 49 return "<btf_vmlinux_not_found>"; 50 50 51 51 t = btf__type_by_id(kern_btf, info->btf_vmlinux_value_type_id); ··· 63 63 return map_info_type_id; 64 64 65 65 kern_btf = get_btf_vmlinux(); 66 - if (libbpf_get_error(kern_btf)) { 67 - map_info_type_id = PTR_ERR(kern_btf); 68 - return map_info_type_id; 69 - } 66 + if (!kern_btf) 67 + return 0; 70 68 71 69 map_info_type_id = btf__find_by_name_kind(kern_btf, "bpf_map_info", 72 70 BTF_KIND_STRUCT); ··· 413 415 } 414 416 415 417 kern_btf = get_btf_vmlinux(); 416 - if (libbpf_get_error(kern_btf)) 418 + if (!kern_btf) 417 419 return -1; 418 420 419 421 if (!json_output) { ··· 496 498 open_opts.kernel_log_level = 1 + 2 + 4; 497 499 498 500 obj = bpf_object__open_file(file, &open_opts); 499 - if (libbpf_get_error(obj)) 501 + if (!obj) 500 502 return -1; 501 503 502 504 set_max_rlimit(); ··· 511 513 continue; 512 514 513 515 link = bpf_map__attach_struct_ops(map); 514 - if (libbpf_get_error(link)) { 516 + if (!link) { 515 517 p_err("can't register struct_ops %s: %s", 516 - bpf_map__name(map), 517 - strerror(-PTR_ERR(link))); 518 + bpf_map__name(map), strerror(errno)); 518 519 nr_errs++; 519 520 continue; 520 521 } ··· 590 593 591 594 err = cmd_select(cmds, argc, argv, do_help); 592 595 593 - if (!libbpf_get_error(btf_vmlinux)) 594 - btf__free(btf_vmlinux); 596 + btf__free(btf_vmlinux); 595 597 596 598 return err; 597 599 }
+23 -10
tools/include/uapi/linux/bpf.h
··· 2584 2584 * * **SOL_SOCKET**, which supports the following *optname*\ s: 2585 2585 * **SO_RCVBUF**, **SO_SNDBUF**, **SO_MAX_PACING_RATE**, 2586 2586 * **SO_PRIORITY**, **SO_RCVLOWAT**, **SO_MARK**, 2587 - * **SO_BINDTODEVICE**, **SO_KEEPALIVE**. 2587 + * **SO_BINDTODEVICE**, **SO_KEEPALIVE**, **SO_REUSEADDR**, 2588 + * **SO_REUSEPORT**, **SO_BINDTOIFINDEX**, **SO_TXREHASH**. 2588 2589 * * **IPPROTO_TCP**, which supports the following *optname*\ s: 2589 2590 * **TCP_CONGESTION**, **TCP_BPF_IW**, 2590 2591 * **TCP_BPF_SNDCWND_CLAMP**, **TCP_SAVE_SYN**, 2591 2592 * **TCP_KEEPIDLE**, **TCP_KEEPINTVL**, **TCP_KEEPCNT**, 2592 - * **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**. 2593 + * **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**, 2594 + * **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**, 2595 + * **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**, 2596 + * **TCP_BPF_RTO_MIN**. 2593 2597 * * **IPPROTO_IP**, which supports *optname* **IP_TOS**. 2594 - * * **IPPROTO_IPV6**, which supports *optname* **IPV6_TCLASS**. 2598 + * * **IPPROTO_IPV6**, which supports the following *optname*\ s: 2599 + * **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**. 2595 2600 * Return 2596 2601 * 0 on success, or a negative error in case of failure. 2597 2602 * ··· 2652 2647 * Return 2653 2648 * 0 on success, or a negative error in case of failure. 2654 2649 * 2655 - * long bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags) 2650 + * long bpf_redirect_map(struct bpf_map *map, u64 key, u64 flags) 2656 2651 * Description 2657 2652 * Redirect the packet to the endpoint referenced by *map* at 2658 2653 * index *key*. Depending on its type, this *map* can contain ··· 2813 2808 * and **BPF_CGROUP_INET6_CONNECT**. 2814 2809 * 2815 2810 * This helper actually implements a subset of **getsockopt()**. 2816 - * It supports the following *level*\ s: 2817 - * 2818 - * * **IPPROTO_TCP**, which supports *optname* 2819 - * **TCP_CONGESTION**. 
2820 - * * **IPPROTO_IP**, which supports *optname* **IP_TOS**. 2821 - * * **IPPROTO_IPV6**, which supports *optname* **IPV6_TCLASS**. 2811 + * It supports the same set of *optname*\ s that is supported by 2812 + * the **bpf_setsockopt**\ () helper. The exceptions are 2813 + * **TCP_BPF_*** is **bpf_setsockopt**\ () only and 2814 + * **TCP_SAVED_SYN** is **bpf_getsockopt**\ () only. 2822 2815 * Return 2823 2816 * 0 on success, or a negative error in case of failure. 2824 2817 * ··· 6887 6884 } __attribute__((aligned(8))); 6888 6885 6889 6886 struct bpf_dynptr { 6887 + __u64 :64; 6888 + __u64 :64; 6889 + } __attribute__((aligned(8))); 6890 + 6891 + struct bpf_list_head { 6892 + __u64 :64; 6893 + __u64 :64; 6894 + } __attribute__((aligned(8))); 6895 + 6896 + struct bpf_list_node { 6890 6897 __u64 :64; 6891 6898 __u64 :64; 6892 6899 } __attribute__((aligned(8)));
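The new `bpf_list_head` and `bpf_list_node` UAPI types above are opaque placeholders: two reserved 64-bit words, 8-byte aligned, sized to match the in-kernel structures so BPF map values can embed them without seeing their internals. A self-contained re-declaration (named members substituted for the header's unnamed bit-fields, purely so this sketch compiles on its own) lets that size/alignment contract be checked directly:

```c
#include <assert.h>

/* Local re-declarations mirroring the new UAPI types' contract:
 * 16 bytes of opaque storage, 8-byte aligned. The real definitions in
 * tools/include/uapi/linux/bpf.h use two unnamed __u64 bit-fields so
 * the fields cannot be touched from BPF programs by name. */
struct sketch_bpf_list_head {
	unsigned long long r0, r1;
} __attribute__((aligned(8)));

struct sketch_bpf_list_node {
	unsigned long long r0, r1;
} __attribute__((aligned(8)));
```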
+3 -2
tools/lib/bpf/btf.c
··· 1724 1724 memset(btf->strs_data + old_strs_len, 0, btf->hdr->str_len - old_strs_len); 1725 1725 1726 1726 /* and now restore original strings section size; types data size 1727 - * wasn't modified, so doesn't need restoring, see big comment above */ 1727 + * wasn't modified, so doesn't need restoring, see big comment above 1728 + */ 1728 1729 btf->hdr->str_len = old_strs_len; 1729 1730 1730 1731 hashmap__free(p.str_off_map); ··· 2330 2329 */ 2331 2330 int btf__add_type_tag(struct btf *btf, const char *value, int ref_type_id) 2332 2331 { 2333 - if (!value|| !value[0]) 2332 + if (!value || !value[0]) 2334 2333 return libbpf_err(-EINVAL); 2335 2334 2336 2335 return btf_add_ref_kind(btf, BTF_KIND_TYPE_TAG, value, ref_type_id);
+2 -2
tools/lib/bpf/btf_dump.c
··· 1543 1543 if (!new_name) 1544 1544 return 1; 1545 1545 1546 - hashmap__find(name_map, orig_name, &dup_cnt); 1546 + (void)hashmap__find(name_map, orig_name, &dup_cnt); 1547 1547 dup_cnt++; 1548 1548 1549 1549 err = hashmap__set(name_map, new_name, dup_cnt, &old_name, NULL); ··· 1989 1989 { 1990 1990 const struct btf_member *m = btf_members(t); 1991 1991 __u16 n = btf_vlen(t); 1992 - int i, err; 1992 + int i, err = 0; 1993 1993 1994 1994 /* note that we increment depth before calling btf_dump_print() below; 1995 1995 * this is intentional. btf_dump_data_newline() will not print a
+30 -18
tools/lib/bpf/libbpf.c
··· 347 347 SEC_ATTACHABLE = 2, 348 348 SEC_ATTACHABLE_OPT = SEC_ATTACHABLE | SEC_EXP_ATTACH_OPT, 349 349 /* attachment target is specified through BTF ID in either kernel or 350 - * other BPF program's BTF object */ 350 + * other BPF program's BTF object 351 + */ 351 352 SEC_ATTACH_BTF = 4, 352 353 /* BPF program type allows sleeping/blocking in kernel */ 353 354 SEC_SLEEPABLE = 8, ··· 489 488 char *name; 490 489 /* real_name is defined for special internal maps (.rodata*, 491 490 * .data*, .bss, .kconfig) and preserves their original ELF section 492 - * name. This is important to be be able to find corresponding BTF 491 + * name. This is important to be able to find corresponding BTF 493 492 * DATASEC information. 494 493 */ 495 494 char *real_name; ··· 1864 1863 return -ERANGE; 1865 1864 } 1866 1865 switch (ext->kcfg.sz) { 1867 - case 1: *(__u8 *)ext_val = value; break; 1868 - case 2: *(__u16 *)ext_val = value; break; 1869 - case 4: *(__u32 *)ext_val = value; break; 1870 - case 8: *(__u64 *)ext_val = value; break; 1871 - default: 1872 - return -EINVAL; 1866 + case 1: 1867 + *(__u8 *)ext_val = value; 1868 + break; 1869 + case 2: 1870 + *(__u16 *)ext_val = value; 1871 + break; 1872 + case 4: 1873 + *(__u32 *)ext_val = value; 1874 + break; 1875 + case 8: 1876 + *(__u64 *)ext_val = value; 1877 + break; 1878 + default: 1879 + return -EINVAL; 1873 1880 } 1874 1881 ext->is_set = true; 1875 1882 return 0; ··· 2779 2770 m->type = enum64_placeholder_id; 2780 2771 m->offset = 0; 2781 2772 } 2782 - } 2773 + } 2783 2774 } 2784 2775 2785 2776 return 0; ··· 3511 3502 sec_desc->sec_type = SEC_RELO; 3512 3503 sec_desc->shdr = sh; 3513 3504 sec_desc->data = data; 3514 - } else if (sh->sh_type == SHT_NOBITS && strcmp(name, BSS_SEC) == 0) { 3505 + } else if (sh->sh_type == SHT_NOBITS && (strcmp(name, BSS_SEC) == 0 || 3506 + str_has_pfx(name, BSS_SEC "."))) { 3515 3507 sec_desc->sec_type = SEC_BSS; 3516 3508 sec_desc->shdr = sh; 3517 3509 sec_desc->data = data; ··· 3528 3518 } 3529 
3519 3530 3520 /* sort BPF programs by section name and in-section instruction offset 3531 - * for faster search */ 3521 + * for faster search 3522 + */ 3532 3523 if (obj->nr_programs) 3533 3524 qsort(obj->programs, obj->nr_programs, sizeof(*obj->programs), cmp_progs); 3534 3525 ··· 3828 3817 return -EINVAL; 3829 3818 } 3830 3819 ext->kcfg.type = find_kcfg_type(obj->btf, t->type, 3831 - &ext->kcfg.is_signed); 3820 + &ext->kcfg.is_signed); 3832 3821 if (ext->kcfg.type == KCFG_UNKNOWN) { 3833 3822 pr_warn("extern (kcfg) '%s': type is unsupported\n", ext_name); 3834 3823 return -ENOTSUP; ··· 4976 4965 4977 4966 err = bpf_map__reuse_fd(map, pin_fd); 4978 4967 close(pin_fd); 4979 - if (err) { 4968 + if (err) 4980 4969 return err; 4981 - } 4970 + 4982 4971 map->pinned = true; 4983 4972 pr_debug("reused pinned map at '%s'\n", map->pin_path); 4984 4973 ··· 5496 5485 } 5497 5486 5498 5487 err = libbpf_ensure_mem((void **)&obj->btf_modules, &obj->btf_module_cap, 5499 - sizeof(*obj->btf_modules), obj->btf_module_cnt + 1); 5488 + sizeof(*obj->btf_modules), obj->btf_module_cnt + 1); 5500 5489 if (err) 5501 5490 goto err_out; 5502 5491 ··· 6248 6237 * prog; each main prog can have a different set of 6249 6238 * subprograms appended (potentially in different order as 6250 6239 * well), so position of any subprog can be different for 6251 - * different main programs */ 6240 + * different main programs 6241 + */ 6252 6242 insn->imm = subprog->sub_insn_off - (prog->sub_insn_off + insn_idx) - 1; 6253 6243 6254 6244 pr_debug("prog '%s': insn #%zu relocated, imm %d points to subprog '%s' (now at %zu offset)\n", ··· 11007 10995 11008 10996 usdt_cookie = OPTS_GET(opts, usdt_cookie, 0); 11009 10997 link = usdt_manager_attach_usdt(obj->usdt_man, prog, pid, binary_path, 11010 - usdt_provider, usdt_name, usdt_cookie); 10998 + usdt_provider, usdt_name, usdt_cookie); 11011 10999 err = libbpf_get_error(link); 11012 11000 if (err) 11013 11001 return libbpf_err_ptr(err); ··· 12316 12304 btf = 
bpf_object__btf(s->obj); 12317 12305 if (!btf) { 12318 12306 pr_warn("subskeletons require BTF at runtime (object %s)\n", 12319 - bpf_object__name(s->obj)); 12307 + bpf_object__name(s->obj)); 12320 12308 return libbpf_err(-errno); 12321 12309 } 12322 12310
+2 -2
tools/lib/bpf/ringbuf.c
··· 128 128 /* Map read-only producer page and data pages. We map twice as big 129 129 * data size to allow simple reading of samples that wrap around the 130 130 * end of a ring buffer. See kernel implementation for details. 131 - * */ 131 + */ 132 132 tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ, 133 133 MAP_SHARED, map_fd, rb->page_size); 134 134 if (tmp == MAP_FAILED) { ··· 220 220 return (len + 7) / 8 * 8; 221 221 } 222 222 223 - static int64_t ringbuf_process_ring(struct ring* r) 223 + static int64_t ringbuf_process_ring(struct ring *r) 224 224 { 225 225 int *len_ptr, len, err; 226 226 /* 64-bit to avoid overflow in case of extreme application behavior */
+2
tools/testing/selftests/bpf/DENYLIST.aarch64
··· 38 38 ksyms_module/libbpf # 'bpf_testmod_ksym_percpu': not found in kernel BTF 39 39 ksyms_module/lskel # test_ksyms_module_lskel__open_and_load unexpected error: -2 40 40 libbpf_get_fd_by_id_opts # test_libbpf_get_fd_by_id_opts__attach unexpected error: -524 (errno 524) 41 + linked_list 41 42 lookup_key # test_lookup_key__attach unexpected error: -524 (errno 524) 42 43 lru_bug # lru_bug__attach unexpected error: -524 (errno 524) 43 44 modify_return # modify_return__attach failed unexpected error: -524 (errno 524) 44 45 module_attach # skel_attach skeleton attach failed: -524 45 46 mptcp/base # run_test mptcp unexpected error: -524 (errno 524) 46 47 netcnt # packets unexpected packets: actual 10001 != expected 10000 48 + rcu_read_lock # failed to attach: ERROR: strerror_r(-524)=22 47 49 recursion # skel_attach unexpected error: -524 (errno 524) 48 50 ringbuf # skel_attach skeleton attachment failed: -1 49 51 setget_sockopt # attach_cgroup unexpected error: -524
+5
tools/testing/selftests/bpf/DENYLIST.s390x
··· 10 10 bpf_tcp_ca # JIT does not support calling kernel function (kfunc) 11 11 cb_refs # expected error message unexpected error: -524 (trampoline) 12 12 cgroup_hierarchical_stats # JIT does not support calling kernel function (kfunc) 13 + cgrp_kfunc # JIT does not support calling kernel function 13 14 cgrp_local_storage # prog_attach unexpected error: -524 (trampoline) 14 15 core_read_macros # unknown func bpf_probe_read#4 (overlapping) 15 16 d_path # failed to auto-attach program 'prog_stat': -524 (trampoline) ··· 34 33 ksyms_module_libbpf # JIT does not support calling kernel function (kfunc) 35 34 ksyms_module_lskel # test_ksyms_module_lskel__open_and_load unexpected error: -9 (?) 36 35 libbpf_get_fd_by_id_opts # failed to attach: ERROR: strerror_r(-524)=22 (trampoline) 36 + linked_list # JIT does not support calling kernel function (kfunc) 37 37 lookup_key # JIT does not support calling kernel function (kfunc) 38 38 lru_bug # prog 'printk': failed to auto-attach: -524 39 39 map_kptr # failed to open_and_load program: -524 (trampoline) ··· 43 41 mptcp 44 42 netcnt # failed to load BPF skeleton 'netcnt_prog': -7 (?) 45 43 probe_user # check_kprobe_res wrong kprobe res from probe read (?) 44 + rcu_read_lock # failed to find kernel BTF type ID of '__x64_sys_getpgid': -3 (?) 46 45 recursion # skel_attach unexpected error: -524 (trampoline) 47 46 ringbuf # skel_load skeleton load failed (?) 48 47 select_reuseport # intermittently fails on new s390x setup ··· 56 53 socket_cookie # prog_attach unexpected error: -524 (trampoline) 57 54 stacktrace_build_id # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?) 58 55 tailcalls # tail_calls are not allowed in non-JITed programs with bpf-to-bpf calls (?) 
56 + task_kfunc # JIT does not support calling kernel function 59 57 task_local_storage # failed to auto-attach program 'trace_exit_creds': -524 (trampoline) 60 58 test_bpffs # bpffs test failed 255 (iterator) 61 59 test_bprm_opts # failed to auto-attach program 'secure_exec': -524 (trampoline) ··· 73 69 trace_vprintk # trace_vprintk__open_and_load unexpected error: -9 (?) 74 70 tracing_struct # failed to auto-attach: -524 (trampoline) 75 71 trampoline_count # prog 'prog1': failed to attach: ERROR: strerror_r(-524)=22 (trampoline) 72 + type_cast # JIT does not support calling kernel function 76 73 unpriv_bpf_disabled # fentry 77 74 user_ringbuf # failed to find kernel BTF type ID of '__s390x_sys_prctl': -3 (?) 78 75 verif_stats # trace_vprintk__open_and_load unexpected error: -9 (?)
+9 -5
tools/testing/selftests/bpf/Makefile
··· 201 201 $(OUTPUT)/bpf_testmod.ko: $(VMLINUX_BTF) $(wildcard bpf_testmod/Makefile bpf_testmod/*.[ch]) 202 202 $(call msg,MOD,,$@) 203 203 $(Q)$(RM) bpf_testmod/bpf_testmod.ko # force re-compilation 204 - $(Q)$(MAKE) $(submake_extras) -C bpf_testmod 204 + $(Q)$(MAKE) $(submake_extras) RESOLVE_BTFIDS=$(RESOLVE_BTFIDS) -C bpf_testmod 205 205 $(Q)cp bpf_testmod/bpf_testmod.ko $@ 206 206 207 207 DEFAULT_BPFTOOL := $(HOST_SCRATCH_DIR)/sbin/bpftool ··· 310 310 # Use '-idirafter': Don't interfere with include mechanics except where the 311 311 # build would have failed anyways. 312 312 define get_sys_includes 313 - $(shell $(1) -v -E - </dev/null 2>&1 \ 313 + $(shell $(1) $(2) -v -E - </dev/null 2>&1 \ 314 314 | sed -n '/<...> search starts here:/,/End of search list./{ s| \(/.*\)|-idirafter \1|p }') \ 315 - $(shell $(1) -dM -E - </dev/null | grep '__riscv_xlen ' | awk '{printf("-D__riscv_xlen=%d -D__BITS_PER_LONG=%d", $$3, $$3)}') 315 + $(shell $(1) $(2) -dM -E - </dev/null | grep '__riscv_xlen ' | awk '{printf("-D__riscv_xlen=%d -D__BITS_PER_LONG=%d", $$3, $$3)}') 316 316 endef 317 317 318 318 # Determine target endianness. ··· 320 320 grep 'define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__') 321 321 MENDIAN=$(if $(IS_LITTLE_ENDIAN),-mlittle-endian,-mbig-endian) 322 322 323 - CLANG_SYS_INCLUDES = $(call get_sys_includes,$(CLANG)) 323 + ifneq ($(CROSS_COMPILE),) 324 + CLANG_TARGET_ARCH = --target=$(notdir $(CROSS_COMPILE:%-=%)) 325 + endif 326 + 327 + CLANG_SYS_INCLUDES = $(call get_sys_includes,$(CLANG),$(CLANG_TARGET_ARCH)) 324 328 BPF_CFLAGS = -g -Werror -D__TARGET_ARCH_$(SRCARCH) $(MENDIAN) \ 325 329 -I$(INCLUDE_DIR) -I$(CURDIR) -I$(APIDIR) \ 326 330 -I$(abspath $(OUTPUT)/../usr/include) ··· 546 542 # Define test_progs BPF-GCC-flavored test runner. 
547 543 ifneq ($(BPF_GCC),) 548 544 TRUNNER_BPF_BUILD_RULE := GCC_BPF_BUILD_RULE 549 - TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(call get_sys_includes,gcc) 545 + TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(call get_sys_includes,gcc,) 550 546 $(eval $(call DEFINE_TEST_RUNNER,test_progs,bpf_gcc)) 551 547 endif 552 548
+68
tools/testing/selftests/bpf/bpf_experimental.h
··· 1 + #ifndef __BPF_EXPERIMENTAL__ 2 + #define __BPF_EXPERIMENTAL__ 3 + 4 + #include <vmlinux.h> 5 + #include <bpf/bpf_tracing.h> 6 + #include <bpf/bpf_helpers.h> 7 + #include <bpf/bpf_core_read.h> 8 + 9 + #define __contains(name, node) __attribute__((btf_decl_tag("contains:" #name ":" #node))) 10 + 11 + /* Description 12 + * Allocates an object of the type represented by 'local_type_id' in 13 + * program BTF. User may use the bpf_core_type_id_local macro to pass the 14 + * type ID of a struct in program BTF. 15 + * 16 + * The 'local_type_id' parameter must be a known constant. 17 + * The 'meta' parameter is a hidden argument that is ignored. 18 + * Returns 19 + * A pointer to an object of the type corresponding to the passed in 20 + * 'local_type_id', or NULL on failure. 21 + */ 22 + extern void *bpf_obj_new_impl(__u64 local_type_id, void *meta) __ksym; 23 + 24 + /* Convenience macro to wrap over bpf_obj_new_impl */ 25 + #define bpf_obj_new(type) ((type *)bpf_obj_new_impl(bpf_core_type_id_local(type), NULL)) 26 + 27 + /* Description 28 + * Free an allocated object. All fields of the object that require 29 + * destruction will be destructed before the storage is freed. 30 + * 31 + * The 'meta' parameter is a hidden argument that is ignored. 32 + * Returns 33 + * Void. 34 + */ 35 + extern void bpf_obj_drop_impl(void *kptr, void *meta) __ksym; 36 + 37 + /* Convenience macro to wrap over bpf_obj_drop_impl */ 38 + #define bpf_obj_drop(kptr) bpf_obj_drop_impl(kptr, NULL) 39 + 40 + /* Description 41 + * Add a new entry to the beginning of the BPF linked list. 42 + * Returns 43 + * Void. 44 + */ 45 + extern void bpf_list_push_front(struct bpf_list_head *head, struct bpf_list_node *node) __ksym; 46 + 47 + /* Description 48 + * Add a new entry to the end of the BPF linked list. 49 + * Returns 50 + * Void. 
51 + */ 52 + extern void bpf_list_push_back(struct bpf_list_head *head, struct bpf_list_node *node) __ksym; 53 + 54 + /* Description 55 + * Remove the entry at the beginning of the BPF linked list. 56 + * Returns 57 + * Pointer to bpf_list_node of deleted entry, or NULL if list is empty. 58 + */ 59 + extern struct bpf_list_node *bpf_list_pop_front(struct bpf_list_head *head) __ksym; 60 + 61 + /* Description 62 + * Remove the entry at the end of the BPF linked list. 63 + * Returns 64 + * Pointer to bpf_list_node of deleted entry, or NULL if list is empty. 65 + */ 66 + extern struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head) __ksym; 67 + 68 + #endif
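Putting the bpf_experimental.h pieces together, a BPF-side user of the new list kfuncs looks roughly like the sketch below. This is not compiled here: `struct elem`, the field layout, and the `private()` placement of the lock are illustrative assumptions borrowed from the selftest style, though the `__contains` tag, the lock-held requirement, and the push/pop/drop flow match the API documented above.

```c
/* Sketch only -- BPF program fragment; assumes vmlinux.h, bpf_helpers.h,
 * and bpf_experimental.h are included, as in the selftests. */
struct elem {
	int value;
	struct bpf_list_node node;
};

/* List head and its protecting spin lock in the same allocation; the
 * __contains tag tells the verifier which node field links elements. */
private(LOCK) struct bpf_spin_lock lock;
private(LOCK) struct bpf_list_head head __contains(elem, node);

SEC("tc")
int list_demo(struct __sk_buff *ctx)
{
	struct bpf_list_node *n;
	struct elem *e;

	e = bpf_obj_new(typeof(*e));
	if (!e)
		return 0;
	e->value = 42;

	bpf_spin_lock(&lock);
	bpf_list_push_front(&head, &e->node);
	n = bpf_list_pop_back(&head);	/* single element: pops what we pushed */
	bpf_spin_unlock(&lock);

	if (n) {
		e = container_of(n, struct elem, node);
		bpf_obj_drop(e);	/* release the allocation */
	}
	return 0;
}
```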
+19
tools/testing/selftests/bpf/cgroup_helpers.c
··· 333 333 return fd; 334 334 } 335 335 336 + /* 337 + * remove_cgroup() - Remove a cgroup 338 + * @relative_path: The cgroup path, relative to the workdir, to remove 339 + * 340 + * This function expects a cgroup to already be created, relative to the cgroup 341 + * work dir. It also expects the cgroup doesn't have any children or live 342 + * processes and it removes the cgroup. 343 + * 344 + * On failure, it will print an error to stderr. 345 + */ 346 + void remove_cgroup(const char *relative_path) 347 + { 348 + char cgroup_path[PATH_MAX + 1]; 349 + 350 + format_cgroup_path(cgroup_path, relative_path); 351 + if (rmdir(cgroup_path)) 352 + log_err("rmdiring cgroup %s .. %s", relative_path, cgroup_path); 353 + } 354 + 336 355 /** 337 356 * create_and_get_cgroup() - Create a cgroup, relative to workdir, and get the FD 338 357 * @relative_path: The cgroup path, relative to the workdir, to join
+1
tools/testing/selftests/bpf/cgroup_helpers.h
··· 18 18 int cgroup_setup_and_join(const char *relative_path); 19 19 int get_root_cgroup(void); 20 20 int create_and_get_cgroup(const char *relative_path); 21 + void remove_cgroup(const char *relative_path); 21 22 unsigned long long get_cgroup_id(const char *relative_path); 22 23 23 24 int join_cgroup(const char *relative_path);
+1
tools/testing/selftests/bpf/config
··· 8 8 CONFIG_BPF_LSM=y 9 9 CONFIG_BPF_STREAM_PARSER=y 10 10 CONFIG_BPF_SYSCALL=y 11 + CONFIG_BPF_UNPRIV_DEFAULT_OFF=n 11 12 CONFIG_CGROUP_BPF=y 12 13 CONFIG_CRYPTO_HMAC=y 13 14 CONFIG_CRYPTO_SHA256=y
+4
tools/testing/selftests/bpf/network_helpers.c
··· 426 426 if (!ASSERT_OK(err, "mount /sys/fs/bpf")) 427 427 return err; 428 428 429 + err = mount("debugfs", "/sys/kernel/debug", "debugfs", 0, NULL); 430 + if (!ASSERT_OK(err, "mount /sys/kernel/debug")) 431 + return err; 432 + 429 433 return 0; 430 434 } 431 435
+14
tools/testing/selftests/bpf/prog_tests/btf.c
··· 3949 3949 .err_str = "Invalid return type", 3950 3950 }, 3951 3951 { 3952 + .descr = "decl_tag test #17, func proto, argument", 3953 + .raw_types = { 3954 + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DECL_TAG, 0, 0), 4), (-1), /* [1] */ 3955 + BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 0), /* [2] */ 3956 + BTF_FUNC_PROTO_ENC(0, 1), /* [3] */ 3957 + BTF_FUNC_PROTO_ARG_ENC(NAME_TBD, 1), 3958 + BTF_VAR_ENC(NAME_TBD, 2, 0), /* [4] */ 3959 + BTF_END_RAW, 3960 + }, 3961 + BTF_STR_SEC("\0local\0tag1\0var"), 3962 + .btf_load_err = true, 3963 + .err_str = "Invalid arg#1", 3964 + }, 3965 + { 3952 3966 .descr = "type_tag test #1", 3953 3967 .raw_types = { 3954 3968 BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
+76
tools/testing/selftests/bpf/prog_tests/cgroup_iter.c
··· 189 189 BPF_CGROUP_ITER_SELF_ONLY, "self_only"); 190 190 } 191 191 192 + static void test_walk_dead_self_only(struct cgroup_iter *skel) 193 + { 194 + DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts); 195 + char expected_output[128], buf[128]; 196 + const char *cgrp_name = "/dead"; 197 + union bpf_iter_link_info linfo; 198 + int len, cgrp_fd, iter_fd; 199 + struct bpf_link *link; 200 + size_t left; 201 + char *p; 202 + 203 + cgrp_fd = create_and_get_cgroup(cgrp_name); 204 + if (!ASSERT_GE(cgrp_fd, 0, "create cgrp")) 205 + return; 206 + 207 + /* The cgroup will be dead during read() iteration, so it only has 208 + * epilogue in the output 209 + */ 210 + snprintf(expected_output, sizeof(expected_output), EPILOGUE); 211 + 212 + memset(&linfo, 0, sizeof(linfo)); 213 + linfo.cgroup.cgroup_fd = cgrp_fd; 214 + linfo.cgroup.order = BPF_CGROUP_ITER_SELF_ONLY; 215 + opts.link_info = &linfo; 216 + opts.link_info_len = sizeof(linfo); 217 + 218 + link = bpf_program__attach_iter(skel->progs.cgroup_id_printer, &opts); 219 + if (!ASSERT_OK_PTR(link, "attach_iter")) 220 + goto close_cgrp; 221 + 222 + iter_fd = bpf_iter_create(bpf_link__fd(link)); 223 + if (!ASSERT_GE(iter_fd, 0, "iter_create")) 224 + goto free_link; 225 + 226 + /* Close link fd and cgroup fd */ 227 + bpf_link__destroy(link); 228 + close(cgrp_fd); 229 + 230 + /* Remove cgroup to mark it as dead */ 231 + remove_cgroup(cgrp_name); 232 + 233 + /* Two kern_sync_rcu() and usleep() pairs are used to wait for the 234 + * releases of cgroup css, and the last kern_sync_rcu() and usleep() 235 + * pair is used to wait for the free of cgroup itself. 
236 + */ 237 + kern_sync_rcu(); 238 + usleep(8000); 239 + kern_sync_rcu(); 240 + usleep(8000); 241 + kern_sync_rcu(); 242 + usleep(1000); 243 + 244 + memset(buf, 0, sizeof(buf)); 245 + left = ARRAY_SIZE(buf); 246 + p = buf; 247 + while ((len = read(iter_fd, p, left)) > 0) { 248 + p += len; 249 + left -= len; 250 + } 251 + 252 + ASSERT_STREQ(buf, expected_output, "dead cgroup output"); 253 + 254 + /* read() after iter finishes should be ok. */ 255 + if (len == 0) 256 + ASSERT_OK(read(iter_fd, buf, sizeof(buf)), "second_read"); 257 + 258 + close(iter_fd); 259 + return; 260 + free_link: 261 + bpf_link__destroy(link); 262 + close_cgrp: 263 + close(cgrp_fd); 264 + } 265 + 192 266 void test_cgroup_iter(void) 193 267 { 194 268 struct cgroup_iter *skel = NULL; ··· 291 217 test_early_termination(skel); 292 218 if (test__start_subtest("cgroup_iter__self_only")) 293 219 test_walk_self_only(skel); 220 + if (test__start_subtest("cgroup_iter__dead_self_only")) 221 + test_walk_dead_self_only(skel); 294 222 out: 295 223 cgroup_iter__destroy(skel); 296 224 cleanup_cgroups();
+175
tools/testing/selftests/bpf/prog_tests/cgrp_kfunc.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */ 3 + 4 + #define _GNU_SOURCE 5 + #include <cgroup_helpers.h> 6 + #include <test_progs.h> 7 + 8 + #include "cgrp_kfunc_failure.skel.h" 9 + #include "cgrp_kfunc_success.skel.h" 10 + 11 + static size_t log_buf_sz = 1 << 20; /* 1 MB */ 12 + static char obj_log_buf[1048576]; 13 + 14 + static struct cgrp_kfunc_success *open_load_cgrp_kfunc_skel(void) 15 + { 16 + struct cgrp_kfunc_success *skel; 17 + int err; 18 + 19 + skel = cgrp_kfunc_success__open(); 20 + if (!ASSERT_OK_PTR(skel, "skel_open")) 21 + return NULL; 22 + 23 + skel->bss->pid = getpid(); 24 + 25 + err = cgrp_kfunc_success__load(skel); 26 + if (!ASSERT_OK(err, "skel_load")) 27 + goto cleanup; 28 + 29 + return skel; 30 + 31 + cleanup: 32 + cgrp_kfunc_success__destroy(skel); 33 + return NULL; 34 + } 35 + 36 + static int mkdir_rm_test_dir(void) 37 + { 38 + int fd; 39 + const char *cgrp_path = "cgrp_kfunc"; 40 + 41 + fd = create_and_get_cgroup(cgrp_path); 42 + if (!ASSERT_GT(fd, 0, "mkdir_cgrp_fd")) 43 + return -1; 44 + 45 + close(fd); 46 + remove_cgroup(cgrp_path); 47 + 48 + return 0; 49 + } 50 + 51 + static void run_success_test(const char *prog_name) 52 + { 53 + struct cgrp_kfunc_success *skel; 54 + struct bpf_program *prog; 55 + struct bpf_link *link = NULL; 56 + 57 + skel = open_load_cgrp_kfunc_skel(); 58 + if (!ASSERT_OK_PTR(skel, "open_load_skel")) 59 + return; 60 + 61 + if (!ASSERT_OK(skel->bss->err, "pre_mkdir_err")) 62 + goto cleanup; 63 + 64 + prog = bpf_object__find_program_by_name(skel->obj, prog_name); 65 + if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name")) 66 + goto cleanup; 67 + 68 + link = bpf_program__attach(prog); 69 + if (!ASSERT_OK_PTR(link, "attached_link")) 70 + goto cleanup; 71 + 72 + ASSERT_EQ(skel->bss->invocations, 0, "pre_rmdir_count"); 73 + if (!ASSERT_OK(mkdir_rm_test_dir(), "cgrp_mkdir")) 74 + goto cleanup; 75 + 76 + ASSERT_EQ(skel->bss->invocations, 1, 
"post_rmdir_count"); 77 + ASSERT_OK(skel->bss->err, "post_rmdir_err"); 78 + 79 + cleanup: 80 + bpf_link__destroy(link); 81 + cgrp_kfunc_success__destroy(skel); 82 + } 83 + 84 + static const char * const success_tests[] = { 85 + "test_cgrp_acquire_release_argument", 86 + "test_cgrp_acquire_leave_in_map", 87 + "test_cgrp_xchg_release", 88 + "test_cgrp_get_release", 89 + "test_cgrp_get_ancestors", 90 + }; 91 + 92 + static struct { 93 + const char *prog_name; 94 + const char *expected_err_msg; 95 + } failure_tests[] = { 96 + {"cgrp_kfunc_acquire_untrusted", "R1 must be referenced or trusted"}, 97 + {"cgrp_kfunc_acquire_fp", "arg#0 pointer type STRUCT cgroup must point"}, 98 + {"cgrp_kfunc_acquire_unsafe_kretprobe", "reg type unsupported for arg#0 function"}, 99 + {"cgrp_kfunc_acquire_trusted_walked", "R1 must be referenced or trusted"}, 100 + {"cgrp_kfunc_acquire_null", "arg#0 pointer type STRUCT cgroup must point"}, 101 + {"cgrp_kfunc_acquire_unreleased", "Unreleased reference"}, 102 + {"cgrp_kfunc_get_non_kptr_param", "arg#0 expected pointer to map value"}, 103 + {"cgrp_kfunc_get_non_kptr_acquired", "arg#0 expected pointer to map value"}, 104 + {"cgrp_kfunc_get_null", "arg#0 expected pointer to map value"}, 105 + {"cgrp_kfunc_xchg_unreleased", "Unreleased reference"}, 106 + {"cgrp_kfunc_get_unreleased", "Unreleased reference"}, 107 + {"cgrp_kfunc_release_untrusted", "arg#0 is untrusted_ptr_or_null_ expected ptr_ or socket"}, 108 + {"cgrp_kfunc_release_fp", "arg#0 pointer type STRUCT cgroup must point"}, 109 + {"cgrp_kfunc_release_null", "arg#0 is ptr_or_null_ expected ptr_ or socket"}, 110 + {"cgrp_kfunc_release_unacquired", "release kernel function bpf_cgroup_release expects"}, 111 + }; 112 + 113 + static void verify_fail(const char *prog_name, const char *expected_err_msg) 114 + { 115 + LIBBPF_OPTS(bpf_object_open_opts, opts); 116 + struct cgrp_kfunc_failure *skel; 117 + int err, i; 118 + 119 + opts.kernel_log_buf = obj_log_buf; 120 + opts.kernel_log_size = 
log_buf_sz; 121 + opts.kernel_log_level = 1; 122 + 123 + skel = cgrp_kfunc_failure__open_opts(&opts); 124 + if (!ASSERT_OK_PTR(skel, "cgrp_kfunc_failure__open_opts")) 125 + goto cleanup; 126 + 127 + for (i = 0; i < ARRAY_SIZE(failure_tests); i++) { 128 + struct bpf_program *prog; 129 + const char *curr_name = failure_tests[i].prog_name; 130 + 131 + prog = bpf_object__find_program_by_name(skel->obj, curr_name); 132 + if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name")) 133 + goto cleanup; 134 + 135 + bpf_program__set_autoload(prog, !strcmp(curr_name, prog_name)); 136 + } 137 + 138 + err = cgrp_kfunc_failure__load(skel); 139 + if (!ASSERT_ERR(err, "unexpected load success")) 140 + goto cleanup; 141 + 142 + if (!ASSERT_OK_PTR(strstr(obj_log_buf, expected_err_msg), "expected_err_msg")) { 143 + fprintf(stderr, "Expected err_msg: %s\n", expected_err_msg); 144 + fprintf(stderr, "Verifier output: %s\n", obj_log_buf); 145 + } 146 + 147 + cleanup: 148 + cgrp_kfunc_failure__destroy(skel); 149 + } 150 + 151 + void test_cgrp_kfunc(void) 152 + { 153 + int i, err; 154 + 155 + err = setup_cgroup_environment(); 156 + if (!ASSERT_OK(err, "cgrp_env_setup")) 157 + goto cleanup; 158 + 159 + for (i = 0; i < ARRAY_SIZE(success_tests); i++) { 160 + if (!test__start_subtest(success_tests[i])) 161 + continue; 162 + 163 + run_success_test(success_tests[i]); 164 + } 165 + 166 + for (i = 0; i < ARRAY_SIZE(failure_tests); i++) { 167 + if (!test__start_subtest(failure_tests[i].prog_name)) 168 + continue; 169 + 170 + verify_fail(failure_tests[i].prog_name, failure_tests[i].expected_err_msg); 171 + } 172 + 173 + cleanup: 174 + cleanup_cgroup_environment(); 175 + }
+1 -1
tools/testing/selftests/bpf/prog_tests/dynptr.c
··· 17 17 {"ringbuf_missing_release2", "Unreleased reference id=2"}, 18 18 {"ringbuf_missing_release_callback", "Unreleased reference id"}, 19 19 {"use_after_invalid", "Expected an initialized dynptr as arg #3"}, 20 - {"ringbuf_invalid_api", "type=mem expected=alloc_mem"}, 20 + {"ringbuf_invalid_api", "type=mem expected=ringbuf_mem"}, 21 21 {"add_dynptr_to_map1", "invalid indirect read from stack"}, 22 22 {"add_dynptr_to_map2", "invalid indirect read from stack"}, 23 23 {"data_slice_out_of_bounds_ringbuf", "value is outside of the allowed memory range"},
+146
tools/testing/selftests/bpf/prog_tests/empty_skb.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <test_progs.h> 3 + #include <network_helpers.h> 4 + #include <net/if.h> 5 + #include "empty_skb.skel.h" 6 + 7 + #define SYS(cmd) ({ \ 8 + if (!ASSERT_OK(system(cmd), (cmd))) \ 9 + goto out; \ 10 + }) 11 + 12 + void serial_test_empty_skb(void) 13 + { 14 + LIBBPF_OPTS(bpf_test_run_opts, tattr); 15 + struct empty_skb *bpf_obj = NULL; 16 + struct nstoken *tok = NULL; 17 + struct bpf_program *prog; 18 + char eth_hlen_pp[15]; 19 + char eth_hlen[14]; 20 + int veth_ifindex; 21 + int ipip_ifindex; 22 + int err; 23 + int i; 24 + 25 + struct { 26 + const char *msg; 27 + const void *data_in; 28 + __u32 data_size_in; 29 + int *ifindex; 30 + int err; 31 + int ret; 32 + bool success_on_tc; 33 + } tests[] = { 34 + /* Empty packets are always rejected. */ 35 + 36 + { 37 + /* BPF_PROG_RUN ETH_HLEN size check */ 38 + .msg = "veth empty ingress packet", 39 + .data_in = NULL, 40 + .data_size_in = 0, 41 + .ifindex = &veth_ifindex, 42 + .err = -EINVAL, 43 + }, 44 + { 45 + /* BPF_PROG_RUN ETH_HLEN size check */ 46 + .msg = "ipip empty ingress packet", 47 + .data_in = NULL, 48 + .data_size_in = 0, 49 + .ifindex = &ipip_ifindex, 50 + .err = -EINVAL, 51 + }, 52 + 53 + /* ETH_HLEN-sized packets: 54 + * - can not be redirected at LWT_XMIT 55 + * - can be redirected at TC to non-tunneling dest 56 + */ 57 + 58 + { 59 + /* __bpf_redirect_common */ 60 + .msg = "veth ETH_HLEN packet ingress", 61 + .data_in = eth_hlen, 62 + .data_size_in = sizeof(eth_hlen), 63 + .ifindex = &veth_ifindex, 64 + .ret = -ERANGE, 65 + .success_on_tc = true, 66 + }, 67 + { 68 + /* __bpf_redirect_no_mac 69 + * 70 + * lwt: skb->len=0 <= skb_network_offset=0 71 + * tc: skb->len=14 <= skb_network_offset=14 72 + */ 73 + .msg = "ipip ETH_HLEN packet ingress", 74 + .data_in = eth_hlen, 75 + .data_size_in = sizeof(eth_hlen), 76 + .ifindex = &ipip_ifindex, 77 + .ret = -ERANGE, 78 + }, 79 + 80 + /* ETH_HLEN+1-sized packet should be redirected. 
*/ 81 + 82 + { 83 + .msg = "veth ETH_HLEN+1 packet ingress", 84 + .data_in = eth_hlen_pp, 85 + .data_size_in = sizeof(eth_hlen_pp), 86 + .ifindex = &veth_ifindex, 87 + }, 88 + { 89 + .msg = "ipip ETH_HLEN+1 packet ingress", 90 + .data_in = eth_hlen_pp, 91 + .data_size_in = sizeof(eth_hlen_pp), 92 + .ifindex = &ipip_ifindex, 93 + }, 94 + }; 95 + 96 + SYS("ip netns add empty_skb"); 97 + tok = open_netns("empty_skb"); 98 + SYS("ip link add veth0 type veth peer veth1"); 99 + SYS("ip link set dev veth0 up"); 100 + SYS("ip link set dev veth1 up"); 101 + SYS("ip addr add 10.0.0.1/8 dev veth0"); 102 + SYS("ip addr add 10.0.0.2/8 dev veth1"); 103 + veth_ifindex = if_nametoindex("veth0"); 104 + 105 + SYS("ip link add ipip0 type ipip local 10.0.0.1 remote 10.0.0.2"); 106 + SYS("ip link set ipip0 up"); 107 + SYS("ip addr add 192.168.1.1/16 dev ipip0"); 108 + ipip_ifindex = if_nametoindex("ipip0"); 109 + 110 + bpf_obj = empty_skb__open_and_load(); 111 + if (!ASSERT_OK_PTR(bpf_obj, "open skeleton")) 112 + goto out; 113 + 114 + for (i = 0; i < ARRAY_SIZE(tests); i++) { 115 + bpf_object__for_each_program(prog, bpf_obj->obj) { 116 + char buf[128]; 117 + bool at_tc = !strncmp(bpf_program__section_name(prog), "tc", 2); 118 + 119 + tattr.data_in = tests[i].data_in; 120 + tattr.data_size_in = tests[i].data_size_in; 121 + 122 + tattr.data_size_out = 0; 123 + bpf_obj->bss->ifindex = *tests[i].ifindex; 124 + bpf_obj->bss->ret = 0; 125 + err = bpf_prog_test_run_opts(bpf_program__fd(prog), &tattr); 126 + sprintf(buf, "err: %s [%s]", tests[i].msg, bpf_program__name(prog)); 127 + 128 + if (at_tc && tests[i].success_on_tc) 129 + ASSERT_GE(err, 0, buf); 130 + else 131 + ASSERT_EQ(err, tests[i].err, buf); 132 + sprintf(buf, "ret: %s [%s]", tests[i].msg, bpf_program__name(prog)); 133 + if (at_tc && tests[i].success_on_tc) 134 + ASSERT_GE(bpf_obj->bss->ret, 0, buf); 135 + else 136 + ASSERT_EQ(bpf_obj->bss->ret, tests[i].ret, buf); 137 + } 138 + } 139 + 140 + out: 141 + if (bpf_obj) 142 + 
empty_skb__destroy(bpf_obj); 143 + if (tok) 144 + close_netns(tok); 145 + system("ip netns del empty_skb"); 146 + }
+1 -1
tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
··· 22 22 "arg#0 pointer type STRUCT bpf_dynptr_kern points to unsupported dynamic pointer type", 0}, 23 23 {"not_valid_dynptr", 24 24 "arg#0 pointer type STRUCT bpf_dynptr_kern must be valid and initialized", 0}, 25 - {"not_ptr_to_stack", "arg#0 pointer type STRUCT bpf_dynptr_kern not to stack", 0}, 25 + {"not_ptr_to_stack", "arg#0 expected pointer to stack", 0}, 26 26 {"dynptr_data_null", NULL, -EBADMSG}, 27 27 }; 28 28
+740
tools/testing/selftests/bpf/prog_tests/linked_list.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <bpf/btf.h> 3 + #include <test_btf.h> 4 + #include <linux/btf.h> 5 + #include <test_progs.h> 6 + #include <network_helpers.h> 7 + 8 + #include "linked_list.skel.h" 9 + #include "linked_list_fail.skel.h" 10 + 11 + static char log_buf[1024 * 1024]; 12 + 13 + static struct { 14 + const char *prog_name; 15 + const char *err_msg; 16 + } linked_list_fail_tests[] = { 17 + #define TEST(test, off) \ 18 + { #test "_missing_lock_push_front", \ 19 + "bpf_spin_lock at off=" #off " must be held for bpf_list_head" }, \ 20 + { #test "_missing_lock_push_back", \ 21 + "bpf_spin_lock at off=" #off " must be held for bpf_list_head" }, \ 22 + { #test "_missing_lock_pop_front", \ 23 + "bpf_spin_lock at off=" #off " must be held for bpf_list_head" }, \ 24 + { #test "_missing_lock_pop_back", \ 25 + "bpf_spin_lock at off=" #off " must be held for bpf_list_head" }, 26 + TEST(kptr, 32) 27 + TEST(global, 16) 28 + TEST(map, 0) 29 + TEST(inner_map, 0) 30 + #undef TEST 31 + #define TEST(test, op) \ 32 + { #test "_kptr_incorrect_lock_" #op, \ 33 + "held lock and object are not in the same allocation\n" \ 34 + "bpf_spin_lock at off=32 must be held for bpf_list_head" }, \ 35 + { #test "_global_incorrect_lock_" #op, \ 36 + "held lock and object are not in the same allocation\n" \ 37 + "bpf_spin_lock at off=16 must be held for bpf_list_head" }, \ 38 + { #test "_map_incorrect_lock_" #op, \ 39 + "held lock and object are not in the same allocation\n" \ 40 + "bpf_spin_lock at off=0 must be held for bpf_list_head" }, \ 41 + { #test "_inner_map_incorrect_lock_" #op, \ 42 + "held lock and object are not in the same allocation\n" \ 43 + "bpf_spin_lock at off=0 must be held for bpf_list_head" }, 44 + TEST(kptr, push_front) 45 + TEST(kptr, push_back) 46 + TEST(kptr, pop_front) 47 + TEST(kptr, pop_back) 48 + TEST(global, push_front) 49 + TEST(global, push_back) 50 + TEST(global, pop_front) 51 + TEST(global, pop_back) 52 + TEST(map, push_front) 53 + 
TEST(map, push_back) 54 + TEST(map, pop_front) 55 + TEST(map, pop_back) 56 + TEST(inner_map, push_front) 57 + TEST(inner_map, push_back) 58 + TEST(inner_map, pop_front) 59 + TEST(inner_map, pop_back) 60 + #undef TEST 61 + { "map_compat_kprobe", "tracing progs cannot use bpf_list_head yet" }, 62 + { "map_compat_kretprobe", "tracing progs cannot use bpf_list_head yet" }, 63 + { "map_compat_tp", "tracing progs cannot use bpf_list_head yet" }, 64 + { "map_compat_perf", "tracing progs cannot use bpf_list_head yet" }, 65 + { "map_compat_raw_tp", "tracing progs cannot use bpf_list_head yet" }, 66 + { "map_compat_raw_tp_w", "tracing progs cannot use bpf_list_head yet" }, 67 + { "obj_type_id_oor", "local type ID argument must be in range [0, U32_MAX]" }, 68 + { "obj_new_no_composite", "bpf_obj_new type ID argument must be of a struct" }, 69 + { "obj_new_no_struct", "bpf_obj_new type ID argument must be of a struct" }, 70 + { "obj_drop_non_zero_off", "R1 must have zero offset when passed to release func" }, 71 + { "new_null_ret", "R0 invalid mem access 'ptr_or_null_'" }, 72 + { "obj_new_acq", "Unreleased reference id=" }, 73 + { "use_after_drop", "invalid mem access 'scalar'" }, 74 + { "ptr_walk_scalar", "type=scalar expected=percpu_ptr_" }, 75 + { "direct_read_lock", "direct access to bpf_spin_lock is disallowed" }, 76 + { "direct_write_lock", "direct access to bpf_spin_lock is disallowed" }, 77 + { "direct_read_head", "direct access to bpf_list_head is disallowed" }, 78 + { "direct_write_head", "direct access to bpf_list_head is disallowed" }, 79 + { "direct_read_node", "direct access to bpf_list_node is disallowed" }, 80 + { "direct_write_node", "direct access to bpf_list_node is disallowed" }, 81 + { "write_after_push_front", "only read is supported" }, 82 + { "write_after_push_back", "only read is supported" }, 83 + { "use_after_unlock_push_front", "invalid mem access 'scalar'" }, 84 + { "use_after_unlock_push_back", "invalid mem access 'scalar'" }, 85 + { 
"double_push_front", "arg#1 expected pointer to allocated object" }, 86 + { "double_push_back", "arg#1 expected pointer to allocated object" }, 87 + { "no_node_value_type", "bpf_list_node not found at offset=0" }, 88 + { "incorrect_value_type", 89 + "operation on bpf_list_head expects arg#1 bpf_list_node at offset=0 in struct foo, " 90 + "but arg is at offset=0 in struct bar" }, 91 + { "incorrect_node_var_off", "variable ptr_ access var_off=(0x0; 0xffffffff) disallowed" }, 92 + { "incorrect_node_off1", "bpf_list_node not found at offset=1" }, 93 + { "incorrect_node_off2", "arg#1 offset=40, but expected bpf_list_node at offset=0 in struct foo" }, 94 + { "no_head_type", "bpf_list_head not found at offset=0" }, 95 + { "incorrect_head_var_off1", "R1 doesn't have constant offset" }, 96 + { "incorrect_head_var_off2", "variable ptr_ access var_off=(0x0; 0xffffffff) disallowed" }, 97 + { "incorrect_head_off1", "bpf_list_head not found at offset=17" }, 98 + { "incorrect_head_off2", "bpf_list_head not found at offset=1" }, 99 + { "pop_front_off", 100 + "15: (bf) r1 = r6 ; R1_w=ptr_or_null_foo(id=4,ref_obj_id=4,off=40,imm=0) " 101 + "R6_w=ptr_or_null_foo(id=4,ref_obj_id=4,off=40,imm=0) refs=2,4\n" 102 + "16: (85) call bpf_this_cpu_ptr#154\nR1 type=ptr_or_null_ expected=percpu_ptr_" }, 103 + { "pop_back_off", 104 + "15: (bf) r1 = r6 ; R1_w=ptr_or_null_foo(id=4,ref_obj_id=4,off=40,imm=0) " 105 + "R6_w=ptr_or_null_foo(id=4,ref_obj_id=4,off=40,imm=0) refs=2,4\n" 106 + "16: (85) call bpf_this_cpu_ptr#154\nR1 type=ptr_or_null_ expected=percpu_ptr_" }, 107 + }; 108 + 109 + static void test_linked_list_fail_prog(const char *prog_name, const char *err_msg) 110 + { 111 + LIBBPF_OPTS(bpf_object_open_opts, opts, .kernel_log_buf = log_buf, 112 + .kernel_log_size = sizeof(log_buf), 113 + .kernel_log_level = 1); 114 + struct linked_list_fail *skel; 115 + struct bpf_program *prog; 116 + int ret; 117 + 118 + skel = linked_list_fail__open_opts(&opts); 119 + if (!ASSERT_OK_PTR(skel, 
"linked_list_fail__open_opts")) 120 + return; 121 + 122 + prog = bpf_object__find_program_by_name(skel->obj, prog_name); 123 + if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name")) 124 + goto end; 125 + 126 + bpf_program__set_autoload(prog, true); 127 + 128 + ret = linked_list_fail__load(skel); 129 + if (!ASSERT_ERR(ret, "linked_list_fail__load must fail")) 130 + goto end; 131 + 132 + if (!ASSERT_OK_PTR(strstr(log_buf, err_msg), "expected error message")) { 133 + fprintf(stderr, "Expected: %s\n", err_msg); 134 + fprintf(stderr, "Verifier: %s\n", log_buf); 135 + } 136 + 137 + end: 138 + linked_list_fail__destroy(skel); 139 + } 140 + 141 + static void clear_fields(struct bpf_map *map) 142 + { 143 + char buf[24]; 144 + int key = 0; 145 + 146 + memset(buf, 0xff, sizeof(buf)); 147 + ASSERT_OK(bpf_map__update_elem(map, &key, sizeof(key), buf, sizeof(buf), 0), "check_and_free_fields"); 148 + } 149 + 150 + enum { 151 + TEST_ALL, 152 + PUSH_POP, 153 + PUSH_POP_MULT, 154 + LIST_IN_LIST, 155 + }; 156 + 157 + static void test_linked_list_success(int mode, bool leave_in_map) 158 + { 159 + LIBBPF_OPTS(bpf_test_run_opts, opts, 160 + .data_in = &pkt_v4, 161 + .data_size_in = sizeof(pkt_v4), 162 + .repeat = 1, 163 + ); 164 + struct linked_list *skel; 165 + int ret; 166 + 167 + skel = linked_list__open_and_load(); 168 + if (!ASSERT_OK_PTR(skel, "linked_list__open_and_load")) 169 + return; 170 + 171 + if (mode == LIST_IN_LIST) 172 + goto lil; 173 + if (mode == PUSH_POP_MULT) 174 + goto ppm; 175 + 176 + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.map_list_push_pop), &opts); 177 + ASSERT_OK(ret, "map_list_push_pop"); 178 + ASSERT_OK(opts.retval, "map_list_push_pop retval"); 179 + if (!leave_in_map) 180 + clear_fields(skel->maps.array_map); 181 + 182 + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.inner_map_list_push_pop), &opts); 183 + ASSERT_OK(ret, "inner_map_list_push_pop"); 184 + ASSERT_OK(opts.retval, "inner_map_list_push_pop retval"); 185 + if 
(!leave_in_map) 186 + clear_fields(skel->maps.inner_map); 187 + 188 + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.global_list_push_pop), &opts); 189 + ASSERT_OK(ret, "global_list_push_pop"); 190 + ASSERT_OK(opts.retval, "global_list_push_pop retval"); 191 + if (!leave_in_map) 192 + clear_fields(skel->maps.bss_A); 193 + 194 + if (mode == PUSH_POP) 195 + goto end; 196 + 197 + ppm: 198 + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.map_list_push_pop_multiple), &opts); 199 + ASSERT_OK(ret, "map_list_push_pop_multiple"); 200 + ASSERT_OK(opts.retval, "map_list_push_pop_multiple retval"); 201 + if (!leave_in_map) 202 + clear_fields(skel->maps.array_map); 203 + 204 + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.inner_map_list_push_pop_multiple), &opts); 205 + ASSERT_OK(ret, "inner_map_list_push_pop_multiple"); 206 + ASSERT_OK(opts.retval, "inner_map_list_push_pop_multiple retval"); 207 + if (!leave_in_map) 208 + clear_fields(skel->maps.inner_map); 209 + 210 + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.global_list_push_pop_multiple), &opts); 211 + ASSERT_OK(ret, "global_list_push_pop_multiple"); 212 + ASSERT_OK(opts.retval, "global_list_push_pop_multiple retval"); 213 + if (!leave_in_map) 214 + clear_fields(skel->maps.bss_A); 215 + 216 + if (mode == PUSH_POP_MULT) 217 + goto end; 218 + 219 + lil: 220 + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.map_list_in_list), &opts); 221 + ASSERT_OK(ret, "map_list_in_list"); 222 + ASSERT_OK(opts.retval, "map_list_in_list retval"); 223 + if (!leave_in_map) 224 + clear_fields(skel->maps.array_map); 225 + 226 + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.inner_map_list_in_list), &opts); 227 + ASSERT_OK(ret, "inner_map_list_in_list"); 228 + ASSERT_OK(opts.retval, "inner_map_list_in_list retval"); 229 + if (!leave_in_map) 230 + clear_fields(skel->maps.inner_map); 231 + 232 + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.global_list_in_list), &opts); 233 
+ ASSERT_OK(ret, "global_list_in_list"); 234 + ASSERT_OK(opts.retval, "global_list_in_list retval"); 235 + if (!leave_in_map) 236 + clear_fields(skel->maps.bss_A); 237 + end: 238 + linked_list__destroy(skel); 239 + } 240 + 241 + #define SPIN_LOCK 2 242 + #define LIST_HEAD 3 243 + #define LIST_NODE 4 244 + 245 + static struct btf *init_btf(void) 246 + { 247 + int id, lid, hid, nid; 248 + struct btf *btf; 249 + 250 + btf = btf__new_empty(); 251 + if (!ASSERT_OK_PTR(btf, "btf__new_empty")) 252 + return NULL; 253 + id = btf__add_int(btf, "int", 4, BTF_INT_SIGNED); 254 + if (!ASSERT_EQ(id, 1, "btf__add_int")) 255 + goto end; 256 + lid = btf__add_struct(btf, "bpf_spin_lock", 4); 257 + if (!ASSERT_EQ(lid, SPIN_LOCK, "btf__add_struct bpf_spin_lock")) 258 + goto end; 259 + hid = btf__add_struct(btf, "bpf_list_head", 16); 260 + if (!ASSERT_EQ(hid, LIST_HEAD, "btf__add_struct bpf_list_head")) 261 + goto end; 262 + nid = btf__add_struct(btf, "bpf_list_node", 16); 263 + if (!ASSERT_EQ(nid, LIST_NODE, "btf__add_struct bpf_list_node")) 264 + goto end; 265 + return btf; 266 + end: 267 + btf__free(btf); 268 + return NULL; 269 + } 270 + 271 + static void test_btf(void) 272 + { 273 + struct btf *btf = NULL; 274 + int id, err; 275 + 276 + while (test__start_subtest("btf: too many locks")) { 277 + btf = init_btf(); 278 + if (!ASSERT_OK_PTR(btf, "init_btf")) 279 + break; 280 + id = btf__add_struct(btf, "foo", 24); 281 + if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) 282 + break; 283 + err = btf__add_field(btf, "a", SPIN_LOCK, 0, 0); 284 + if (!ASSERT_OK(err, "btf__add_field foo::a")) 285 + break; 286 + err = btf__add_field(btf, "b", SPIN_LOCK, 32, 0); 287 + if (!ASSERT_OK(err, "btf__add_field foo::b")) 288 + break; 289 + err = btf__add_field(btf, "c", LIST_HEAD, 64, 0); 290 + if (!ASSERT_OK(err, "btf__add_field foo::c")) 291 + break; 292 + 293 + err = btf__load_into_kernel(btf); 294 + ASSERT_EQ(err, -E2BIG, "check btf"); 295 + btf__free(btf); 296 + break; 297 + } 298 + 299 + while
(test__start_subtest("btf: missing lock")) { 300 + btf = init_btf(); 301 + if (!ASSERT_OK_PTR(btf, "init_btf")) 302 + break; 303 + id = btf__add_struct(btf, "foo", 16); 304 + if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) 305 + break; 306 + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); 307 + if (!ASSERT_OK(err, "btf__add_struct foo::a")) 308 + break; 309 + id = btf__add_decl_tag(btf, "contains:baz:a", 5, 0); 310 + if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:baz:a")) 311 + break; 312 + id = btf__add_struct(btf, "baz", 16); 313 + if (!ASSERT_EQ(id, 7, "btf__add_struct baz")) 314 + break; 315 + err = btf__add_field(btf, "a", LIST_NODE, 0, 0); 316 + if (!ASSERT_OK(err, "btf__add_field baz::a")) 317 + break; 318 + 319 + err = btf__load_into_kernel(btf); 320 + ASSERT_EQ(err, -EINVAL, "check btf"); 321 + btf__free(btf); 322 + break; 323 + } 324 + 325 + while (test__start_subtest("btf: bad offset")) { 326 + btf = init_btf(); 327 + if (!ASSERT_OK_PTR(btf, "init_btf")) 328 + break; 329 + id = btf__add_struct(btf, "foo", 36); 330 + if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) 331 + break; 332 + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); 333 + if (!ASSERT_OK(err, "btf__add_field foo::a")) 334 + break; 335 + err = btf__add_field(btf, "b", LIST_NODE, 0, 0); 336 + if (!ASSERT_OK(err, "btf__add_field foo::b")) 337 + break; 338 + err = btf__add_field(btf, "c", SPIN_LOCK, 0, 0); 339 + if (!ASSERT_OK(err, "btf__add_field foo::c")) 340 + break; 341 + id = btf__add_decl_tag(btf, "contains:foo:b", 5, 0); 342 + if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:foo:b")) 343 + break; 344 + 345 + err = btf__load_into_kernel(btf); 346 + ASSERT_EQ(err, -EEXIST, "check btf"); 347 + btf__free(btf); 348 + break; 349 + } 350 + 351 + while (test__start_subtest("btf: missing contains:")) { 352 + btf = init_btf(); 353 + if (!ASSERT_OK_PTR(btf, "init_btf")) 354 + break; 355 + id = btf__add_struct(btf, "foo", 24); 356 + if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) 357 + break; 358 + 
err = btf__add_field(btf, "a", SPIN_LOCK, 0, 0); 359 + if (!ASSERT_OK(err, "btf__add_field foo::a")) 360 + break; 361 + err = btf__add_field(btf, "b", LIST_HEAD, 64, 0); 362 + if (!ASSERT_OK(err, "btf__add_field foo::b")) 363 + break; 364 + 365 + err = btf__load_into_kernel(btf); 366 + ASSERT_EQ(err, -EINVAL, "check btf"); 367 + btf__free(btf); 368 + break; 369 + } 370 + 371 + while (test__start_subtest("btf: missing struct")) { 372 + btf = init_btf(); 373 + if (!ASSERT_OK_PTR(btf, "init_btf")) 374 + break; 375 + id = btf__add_struct(btf, "foo", 24); 376 + if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) 377 + break; 378 + err = btf__add_field(btf, "a", SPIN_LOCK, 0, 0); 379 + if (!ASSERT_OK(err, "btf__add_field foo::a")) 380 + break; 381 + err = btf__add_field(btf, "b", LIST_HEAD, 64, 0); 382 + if (!ASSERT_OK(err, "btf__add_field foo::b")) 383 + break; 384 + id = btf__add_decl_tag(btf, "contains:bar:bar", 5, 1); 385 + if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:bar:bar")) 386 + break; 387 + 388 + err = btf__load_into_kernel(btf); 389 + ASSERT_EQ(err, -ENOENT, "check btf"); 390 + btf__free(btf); 391 + break; 392 + } 393 + 394 + while (test__start_subtest("btf: missing node")) { 395 + btf = init_btf(); 396 + if (!ASSERT_OK_PTR(btf, "init_btf")) 397 + break; 398 + id = btf__add_struct(btf, "foo", 24); 399 + if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) 400 + break; 401 + err = btf__add_field(btf, "a", SPIN_LOCK, 0, 0); 402 + if (!ASSERT_OK(err, "btf__add_field foo::a")) 403 + break; 404 + err = btf__add_field(btf, "b", LIST_HEAD, 64, 0); 405 + if (!ASSERT_OK(err, "btf__add_field foo::b")) 406 + break; 407 + id = btf__add_decl_tag(btf, "contains:foo:c", 5, 1); 408 + if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:foo:c")) 409 + break; 410 + 411 + err = btf__load_into_kernel(btf); 412 + btf__free(btf); 413 + ASSERT_EQ(err, -ENOENT, "check btf"); 414 + break; 415 + } 416 + 417 + while (test__start_subtest("btf: node incorrect type")) { 418 + btf = init_btf(); 419 + 
if (!ASSERT_OK_PTR(btf, "init_btf")) 420 + break; 421 + id = btf__add_struct(btf, "foo", 20); 422 + if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) 423 + break; 424 + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); 425 + if (!ASSERT_OK(err, "btf__add_field foo::a")) 426 + break; 427 + err = btf__add_field(btf, "b", SPIN_LOCK, 128, 0); 428 + if (!ASSERT_OK(err, "btf__add_field foo::b")) 429 + break; 430 + id = btf__add_decl_tag(btf, "contains:bar:a", 5, 0); 431 + if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:bar:a")) 432 + break; 433 + id = btf__add_struct(btf, "bar", 4); 434 + if (!ASSERT_EQ(id, 7, "btf__add_struct bar")) 435 + break; 436 + err = btf__add_field(btf, "a", SPIN_LOCK, 0, 0); 437 + if (!ASSERT_OK(err, "btf__add_field bar::a")) 438 + break; 439 + 440 + err = btf__load_into_kernel(btf); 441 + ASSERT_EQ(err, -EINVAL, "check btf"); 442 + btf__free(btf); 443 + break; 444 + } 445 + 446 + while (test__start_subtest("btf: multiple bpf_list_node with name b")) { 447 + btf = init_btf(); 448 + if (!ASSERT_OK_PTR(btf, "init_btf")) 449 + break; 450 + id = btf__add_struct(btf, "foo", 52); 451 + if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) 452 + break; 453 + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); 454 + if (!ASSERT_OK(err, "btf__add_field foo::a")) 455 + break; 456 + err = btf__add_field(btf, "b", LIST_NODE, 128, 0); 457 + if (!ASSERT_OK(err, "btf__add_field foo::b")) 458 + break; 459 + err = btf__add_field(btf, "b", LIST_NODE, 256, 0); 460 + if (!ASSERT_OK(err, "btf__add_field foo::c")) 461 + break; 462 + err = btf__add_field(btf, "d", SPIN_LOCK, 384, 0); 463 + if (!ASSERT_OK(err, "btf__add_field foo::d")) 464 + break; 465 + id = btf__add_decl_tag(btf, "contains:foo:b", 5, 0); 466 + if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:foo:b")) 467 + break; 468 + 469 + err = btf__load_into_kernel(btf); 470 + ASSERT_EQ(err, -EINVAL, "check btf"); 471 + btf__free(btf); 472 + break; 473 + } 474 + 475 + while (test__start_subtest("btf: owning | owned AA 
cycle")) { 476 + btf = init_btf(); 477 + if (!ASSERT_OK_PTR(btf, "init_btf")) 478 + break; 479 + id = btf__add_struct(btf, "foo", 36); 480 + if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) 481 + break; 482 + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); 483 + if (!ASSERT_OK(err, "btf__add_field foo::a")) 484 + break; 485 + err = btf__add_field(btf, "b", LIST_NODE, 128, 0); 486 + if (!ASSERT_OK(err, "btf__add_field foo::b")) 487 + break; 488 + err = btf__add_field(btf, "c", SPIN_LOCK, 256, 0); 489 + if (!ASSERT_OK(err, "btf__add_field foo::c")) 490 + break; 491 + id = btf__add_decl_tag(btf, "contains:foo:b", 5, 0); 492 + if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:foo:b")) 493 + break; 494 + 495 + err = btf__load_into_kernel(btf); 496 + ASSERT_EQ(err, -ELOOP, "check btf"); 497 + btf__free(btf); 498 + break; 499 + } 500 + 501 + while (test__start_subtest("btf: owning | owned ABA cycle")) { 502 + btf = init_btf(); 503 + if (!ASSERT_OK_PTR(btf, "init_btf")) 504 + break; 505 + id = btf__add_struct(btf, "foo", 36); 506 + if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) 507 + break; 508 + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); 509 + if (!ASSERT_OK(err, "btf__add_field foo::a")) 510 + break; 511 + err = btf__add_field(btf, "b", LIST_NODE, 128, 0); 512 + if (!ASSERT_OK(err, "btf__add_field foo::b")) 513 + break; 514 + err = btf__add_field(btf, "c", SPIN_LOCK, 256, 0); 515 + if (!ASSERT_OK(err, "btf__add_field foo::c")) 516 + break; 517 + id = btf__add_decl_tag(btf, "contains:bar:b", 5, 0); 518 + if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:bar:b")) 519 + break; 520 + id = btf__add_struct(btf, "bar", 36); 521 + if (!ASSERT_EQ(id, 7, "btf__add_struct bar")) 522 + break; 523 + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); 524 + if (!ASSERT_OK(err, "btf__add_field bar::a")) 525 + break; 526 + err = btf__add_field(btf, "b", LIST_NODE, 128, 0); 527 + if (!ASSERT_OK(err, "btf__add_field bar::b")) 528 + break; 529 + err = btf__add_field(btf, "c", SPIN_LOCK, 
256, 0); 530 + if (!ASSERT_OK(err, "btf__add_field bar::c")) 531 + break; 532 + id = btf__add_decl_tag(btf, "contains:foo:b", 7, 0); 533 + if (!ASSERT_EQ(id, 8, "btf__add_decl_tag contains:foo:b")) 534 + break; 535 + 536 + err = btf__load_into_kernel(btf); 537 + ASSERT_EQ(err, -ELOOP, "check btf"); 538 + btf__free(btf); 539 + break; 540 + } 541 + 542 + while (test__start_subtest("btf: owning -> owned")) { 543 + btf = init_btf(); 544 + if (!ASSERT_OK_PTR(btf, "init_btf")) 545 + break; 546 + id = btf__add_struct(btf, "foo", 20); 547 + if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) 548 + break; 549 + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); 550 + if (!ASSERT_OK(err, "btf__add_field foo::a")) 551 + break; 552 + err = btf__add_field(btf, "b", SPIN_LOCK, 128, 0); 553 + if (!ASSERT_OK(err, "btf__add_field foo::b")) 554 + break; 555 + id = btf__add_decl_tag(btf, "contains:bar:a", 5, 0); 556 + if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:bar:a")) 557 + break; 558 + id = btf__add_struct(btf, "bar", 16); 559 + if (!ASSERT_EQ(id, 7, "btf__add_struct bar")) 560 + break; 561 + err = btf__add_field(btf, "a", LIST_NODE, 0, 0); 562 + if (!ASSERT_OK(err, "btf__add_field bar::a")) 563 + break; 564 + 565 + err = btf__load_into_kernel(btf); 566 + ASSERT_EQ(err, 0, "check btf"); 567 + btf__free(btf); 568 + break; 569 + } 570 + 571 + while (test__start_subtest("btf: owning -> owning | owned -> owned")) { 572 + btf = init_btf(); 573 + if (!ASSERT_OK_PTR(btf, "init_btf")) 574 + break; 575 + id = btf__add_struct(btf, "foo", 20); 576 + if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) 577 + break; 578 + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); 579 + if (!ASSERT_OK(err, "btf__add_field foo::a")) 580 + break; 581 + err = btf__add_field(btf, "b", SPIN_LOCK, 128, 0); 582 + if (!ASSERT_OK(err, "btf__add_field foo::b")) 583 + break; 584 + id = btf__add_decl_tag(btf, "contains:bar:b", 5, 0); 585 + if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:bar:b")) 586 + break; 587 + id = 
btf__add_struct(btf, "bar", 36); 588 + if (!ASSERT_EQ(id, 7, "btf__add_struct bar")) 589 + break; 590 + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); 591 + if (!ASSERT_OK(err, "btf__add_field bar::a")) 592 + break; 593 + err = btf__add_field(btf, "b", LIST_NODE, 128, 0); 594 + if (!ASSERT_OK(err, "btf__add_field bar::b")) 595 + break; 596 + err = btf__add_field(btf, "c", SPIN_LOCK, 256, 0); 597 + if (!ASSERT_OK(err, "btf__add_field bar::c")) 598 + break; 599 + id = btf__add_decl_tag(btf, "contains:baz:a", 7, 0); 600 + if (!ASSERT_EQ(id, 8, "btf__add_decl_tag contains:baz:a")) 601 + break; 602 + id = btf__add_struct(btf, "baz", 16); 603 + if (!ASSERT_EQ(id, 9, "btf__add_struct baz")) 604 + break; 605 + err = btf__add_field(btf, "a", LIST_NODE, 0, 0); 606 + if (!ASSERT_OK(err, "btf__add_field baz:a")) 607 + break; 608 + 609 + err = btf__load_into_kernel(btf); 610 + ASSERT_EQ(err, 0, "check btf"); 611 + btf__free(btf); 612 + break; 613 + } 614 + 615 + while (test__start_subtest("btf: owning | owned -> owning | owned -> owned")) { 616 + btf = init_btf(); 617 + if (!ASSERT_OK_PTR(btf, "init_btf")) 618 + break; 619 + id = btf__add_struct(btf, "foo", 36); 620 + if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) 621 + break; 622 + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); 623 + if (!ASSERT_OK(err, "btf__add_field foo::a")) 624 + break; 625 + err = btf__add_field(btf, "b", LIST_NODE, 128, 0); 626 + if (!ASSERT_OK(err, "btf__add_field foo::b")) 627 + break; 628 + err = btf__add_field(btf, "c", SPIN_LOCK, 256, 0); 629 + if (!ASSERT_OK(err, "btf__add_field foo::c")) 630 + break; 631 + id = btf__add_decl_tag(btf, "contains:bar:b", 5, 0); 632 + if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:bar:b")) 633 + break; 634 + id = btf__add_struct(btf, "bar", 36); 635 + if (!ASSERT_EQ(id, 7, "btf__add_struct bar")) 636 + break; 637 + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); 638 + if (!ASSERT_OK(err, "btf__add_field bar:a")) 639 + break; 640 + err = btf__add_field(btf, 
"b", LIST_NODE, 128, 0); 641 + if (!ASSERT_OK(err, "btf__add_field bar:b")) 642 + break; 643 + err = btf__add_field(btf, "c", SPIN_LOCK, 256, 0); 644 + if (!ASSERT_OK(err, "btf__add_field bar:c")) 645 + break; 646 + id = btf__add_decl_tag(btf, "contains:baz:a", 7, 0); 647 + if (!ASSERT_EQ(id, 8, "btf__add_decl_tag contains:baz:a")) 648 + break; 649 + id = btf__add_struct(btf, "baz", 16); 650 + if (!ASSERT_EQ(id, 9, "btf__add_struct baz")) 651 + break; 652 + err = btf__add_field(btf, "a", LIST_NODE, 0, 0); 653 + if (!ASSERT_OK(err, "btf__add_field baz:a")) 654 + break; 655 + 656 + err = btf__load_into_kernel(btf); 657 + ASSERT_EQ(err, -ELOOP, "check btf"); 658 + btf__free(btf); 659 + break; 660 + } 661 + 662 + while (test__start_subtest("btf: owning -> owning | owned -> owning | owned -> owned")) { 663 + btf = init_btf(); 664 + if (!ASSERT_OK_PTR(btf, "init_btf")) 665 + break; 666 + id = btf__add_struct(btf, "foo", 20); 667 + if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) 668 + break; 669 + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); 670 + if (!ASSERT_OK(err, "btf__add_field foo::a")) 671 + break; 672 + err = btf__add_field(btf, "b", SPIN_LOCK, 128, 0); 673 + if (!ASSERT_OK(err, "btf__add_field foo::b")) 674 + break; 675 + id = btf__add_decl_tag(btf, "contains:bar:b", 5, 0); 676 + if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:bar:b")) 677 + break; 678 + id = btf__add_struct(btf, "bar", 36); 679 + if (!ASSERT_EQ(id, 7, "btf__add_struct bar")) 680 + break; 681 + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); 682 + if (!ASSERT_OK(err, "btf__add_field bar::a")) 683 + break; 684 + err = btf__add_field(btf, "b", LIST_NODE, 128, 0); 685 + if (!ASSERT_OK(err, "btf__add_field bar::b")) 686 + break; 687 + err = btf__add_field(btf, "c", SPIN_LOCK, 256, 0); 688 + if (!ASSERT_OK(err, "btf__add_field bar::c")) 689 + break; 690 + id = btf__add_decl_tag(btf, "contains:baz:b", 7, 0); 691 + if (!ASSERT_EQ(id, 8, "btf__add_decl_tag")) 692 + break; 693 + id = 
btf__add_struct(btf, "baz", 36); 694 + if (!ASSERT_EQ(id, 9, "btf__add_struct baz")) 695 + break; 696 + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); 697 + if (!ASSERT_OK(err, "btf__add_field baz::a")) 698 + break; 699 + err = btf__add_field(btf, "b", LIST_NODE, 128, 0); 700 + if (!ASSERT_OK(err, "btf__add_field baz::b")) 701 + break; 702 + err = btf__add_field(btf, "c", SPIN_LOCK, 256, 0); 703 + if (!ASSERT_OK(err, "btf__add_field baz::c")) 704 + break; 705 + id = btf__add_decl_tag(btf, "contains:bam:a", 9, 0); 706 + if (!ASSERT_EQ(id, 10, "btf__add_decl_tag contains:bam:a")) 707 + break; 708 + id = btf__add_struct(btf, "bam", 16); 709 + if (!ASSERT_EQ(id, 11, "btf__add_struct bam")) 710 + break; 711 + err = btf__add_field(btf, "a", LIST_NODE, 0, 0); 712 + if (!ASSERT_OK(err, "btf__add_field bam::a")) 713 + break; 714 + 715 + err = btf__load_into_kernel(btf); 716 + ASSERT_EQ(err, -ELOOP, "check btf"); 717 + btf__free(btf); 718 + break; 719 + } 720 + } 721 + 722 + void test_linked_list(void) 723 + { 724 + int i; 725 + 726 + for (i = 0; i < ARRAY_SIZE(linked_list_fail_tests); i++) { 727 + if (!test__start_subtest(linked_list_fail_tests[i].prog_name)) 728 + continue; 729 + test_linked_list_fail_prog(linked_list_fail_tests[i].prog_name, 730 + linked_list_fail_tests[i].err_msg); 731 + } 732 + test_btf(); 733 + test_linked_list_success(PUSH_POP, false); 734 + test_linked_list_success(PUSH_POP, true); 735 + test_linked_list_success(PUSH_POP_MULT, false); 736 + test_linked_list_success(PUSH_POP_MULT, true); 737 + test_linked_list_success(LIST_IN_LIST, false); 738 + test_linked_list_success(LIST_IN_LIST, true); 739 + test_linked_list_success(TEST_ALL, false); 740 + }
+13 -4
tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
··· 173 173 ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 4, "total prog count"); 174 174 ASSERT_EQ(query_prog_cnt(cgroup_fd2, NULL), 1, "total prog count"); 175 175 176 - /* AF_UNIX is prohibited. */ 177 - 178 176 fd = socket(AF_UNIX, SOCK_STREAM, 0); 179 - ASSERT_LT(fd, 0, "socket(AF_UNIX)"); 177 + if (!(skel->kconfig->CONFIG_SECURITY_APPARMOR 178 + || skel->kconfig->CONFIG_SECURITY_SELINUX 179 + || skel->kconfig->CONFIG_SECURITY_SMACK)) 180 + /* AF_UNIX is prohibited. */ 181 + ASSERT_LT(fd, 0, "socket(AF_UNIX)"); 180 182 close(fd); 181 183 182 184 /* AF_INET6 gets default policy (sk_priority). */ ··· 235 233 236 234 /* AF_INET6+SOCK_STREAM 237 235 * AF_PACKET+SOCK_RAW 236 + * AF_UNIX+SOCK_RAW if already have non-bpf lsms installed 238 237 * listen_fd 239 238 * client_fd 240 239 * accepted_fd 241 240 */ 242 - ASSERT_EQ(skel->bss->called_socket_post_create2, 5, "called_create2"); 241 + if (skel->kconfig->CONFIG_SECURITY_APPARMOR 242 + || skel->kconfig->CONFIG_SECURITY_SELINUX 243 + || skel->kconfig->CONFIG_SECURITY_SMACK) 244 + /* AF_UNIX+SOCK_RAW if already have non-bpf lsms installed */ 245 + ASSERT_EQ(skel->bss->called_socket_post_create2, 6, "called_create2"); 246 + else 247 + ASSERT_EQ(skel->bss->called_socket_post_create2, 5, "called_create2"); 243 248 244 249 /* start_server 245 250 * bind(ETH_P_ALL)
+158
tools/testing/selftests/bpf/prog_tests/rcu_read_lock.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2022 Meta Platforms, Inc. and affiliates.*/ 3 + 4 + #define _GNU_SOURCE 5 + #include <unistd.h> 6 + #include <sys/syscall.h> 7 + #include <sys/types.h> 8 + #include <test_progs.h> 9 + #include <bpf/btf.h> 10 + #include "rcu_read_lock.skel.h" 11 + #include "cgroup_helpers.h" 12 + 13 + static unsigned long long cgroup_id; 14 + 15 + static void test_success(void) 16 + { 17 + struct rcu_read_lock *skel; 18 + int err; 19 + 20 + skel = rcu_read_lock__open(); 21 + if (!ASSERT_OK_PTR(skel, "skel_open")) 22 + return; 23 + 24 + skel->bss->target_pid = syscall(SYS_gettid); 25 + 26 + bpf_program__set_autoload(skel->progs.get_cgroup_id, true); 27 + bpf_program__set_autoload(skel->progs.task_succ, true); 28 + bpf_program__set_autoload(skel->progs.no_lock, true); 29 + bpf_program__set_autoload(skel->progs.two_regions, true); 30 + bpf_program__set_autoload(skel->progs.non_sleepable_1, true); 31 + bpf_program__set_autoload(skel->progs.non_sleepable_2, true); 32 + err = rcu_read_lock__load(skel); 33 + if (!ASSERT_OK(err, "skel_load")) 34 + goto out; 35 + 36 + err = rcu_read_lock__attach(skel); 37 + if (!ASSERT_OK(err, "skel_attach")) 38 + goto out; 39 + 40 + syscall(SYS_getpgid); 41 + 42 + ASSERT_EQ(skel->bss->task_storage_val, 2, "task_storage_val"); 43 + ASSERT_EQ(skel->bss->cgroup_id, cgroup_id, "cgroup_id"); 44 + out: 45 + rcu_read_lock__destroy(skel); 46 + } 47 + 48 + static void test_rcuptr_acquire(void) 49 + { 50 + struct rcu_read_lock *skel; 51 + int err; 52 + 53 + skel = rcu_read_lock__open(); 54 + if (!ASSERT_OK_PTR(skel, "skel_open")) 55 + return; 56 + 57 + skel->bss->target_pid = syscall(SYS_gettid); 58 + 59 + bpf_program__set_autoload(skel->progs.task_acquire, true); 60 + err = rcu_read_lock__load(skel); 61 + if (!ASSERT_OK(err, "skel_load")) 62 + goto out; 63 + 64 + err = rcu_read_lock__attach(skel); 65 + ASSERT_OK(err, "skel_attach"); 66 + out: 67 + rcu_read_lock__destroy(skel); 68 + } 69 + 70 + static 
const char * const inproper_region_tests[] = { 71 + "miss_lock", 72 + "miss_unlock", 73 + "non_sleepable_rcu_mismatch", 74 + "inproper_sleepable_helper", 75 + "inproper_sleepable_kfunc", 76 + "nested_rcu_region", 77 + }; 78 + 79 + static void test_inproper_region(void) 80 + { 81 + struct rcu_read_lock *skel; 82 + struct bpf_program *prog; 83 + int i, err; 84 + 85 + for (i = 0; i < ARRAY_SIZE(inproper_region_tests); i++) { 86 + skel = rcu_read_lock__open(); 87 + if (!ASSERT_OK_PTR(skel, "skel_open")) 88 + return; 89 + 90 + prog = bpf_object__find_program_by_name(skel->obj, inproper_region_tests[i]); 91 + if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name")) 92 + goto out; 93 + bpf_program__set_autoload(prog, true); 94 + err = rcu_read_lock__load(skel); 95 + ASSERT_ERR(err, "skel_load"); 96 + out: 97 + rcu_read_lock__destroy(skel); 98 + } 99 + } 100 + 101 + static const char * const rcuptr_misuse_tests[] = { 102 + "task_untrusted_non_rcuptr", 103 + "task_untrusted_rcuptr", 104 + "cross_rcu_region", 105 + }; 106 + 107 + static void test_rcuptr_misuse(void) 108 + { 109 + struct rcu_read_lock *skel; 110 + struct bpf_program *prog; 111 + int i, err; 112 + 113 + for (i = 0; i < ARRAY_SIZE(rcuptr_misuse_tests); i++) { 114 + skel = rcu_read_lock__open(); 115 + if (!ASSERT_OK_PTR(skel, "skel_open")) 116 + return; 117 + 118 + prog = bpf_object__find_program_by_name(skel->obj, rcuptr_misuse_tests[i]); 119 + if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name")) 120 + goto out; 121 + bpf_program__set_autoload(prog, true); 122 + err = rcu_read_lock__load(skel); 123 + ASSERT_ERR(err, "skel_load"); 124 + out: 125 + rcu_read_lock__destroy(skel); 126 + } 127 + } 128 + 129 + void test_rcu_read_lock(void) 130 + { 131 + struct btf *vmlinux_btf; 132 + int cgroup_fd; 133 + 134 + vmlinux_btf = btf__load_vmlinux_btf(); 135 + if (!ASSERT_OK_PTR(vmlinux_btf, "could not load vmlinux BTF")) 136 + return; 137 + if (btf__find_by_name_kind(vmlinux_btf, "rcu", BTF_KIND_TYPE_TAG) < 
0) { 138 + test__skip(); 139 + goto out; 140 + } 141 + 142 + cgroup_fd = test__join_cgroup("/rcu_read_lock"); 143 + if (!ASSERT_GE(cgroup_fd, 0, "join_cgroup /rcu_read_lock")) 144 + goto out; 145 + 146 + cgroup_id = get_cgroup_id("/rcu_read_lock"); 147 + if (test__start_subtest("success")) 148 + test_success(); 149 + if (test__start_subtest("rcuptr_acquire")) 150 + test_rcuptr_acquire(); 151 + if (test__start_subtest("negative_tests_inproper_region")) 152 + test_inproper_region(); 153 + if (test__start_subtest("negative_tests_rcuptr_misuse")) 154 + test_rcuptr_misuse(); 155 + close(cgroup_fd); 156 + out: 157 + btf__free(vmlinux_btf); 158 + }
+142
tools/testing/selftests/bpf/prog_tests/spin_lock.c
// SPDX-License-Identifier: GPL-2.0
#include <test_progs.h>
#include <network_helpers.h>

#include "test_spin_lock.skel.h"
#include "test_spin_lock_fail.skel.h"

static char log_buf[1024 * 1024];

static struct {
	const char *prog_name;
	const char *err_msg;
} spin_lock_fail_tests[] = {
	{ "lock_id_kptr_preserve",
	  "5: (bf) r1 = r0 ; R0_w=ptr_foo(id=2,ref_obj_id=2,off=0,imm=0) "
	  "R1_w=ptr_foo(id=2,ref_obj_id=2,off=0,imm=0) refs=2\n6: (85) call bpf_this_cpu_ptr#154\n"
	  "R1 type=ptr_ expected=percpu_ptr_" },
	{ "lock_id_global_zero",
	  "; R1_w=map_value(off=0,ks=4,vs=4,imm=0)\n2: (85) call bpf_this_cpu_ptr#154\n"
	  "R1 type=map_value expected=percpu_ptr_" },
	{ "lock_id_mapval_preserve",
	  "8: (bf) r1 = r0 ; R0_w=map_value(id=1,off=0,ks=4,vs=8,imm=0) "
	  "R1_w=map_value(id=1,off=0,ks=4,vs=8,imm=0)\n9: (85) call bpf_this_cpu_ptr#154\n"
	  "R1 type=map_value expected=percpu_ptr_" },
	{ "lock_id_innermapval_preserve",
	  "13: (bf) r1 = r0 ; R0=map_value(id=2,off=0,ks=4,vs=8,imm=0) "
	  "R1_w=map_value(id=2,off=0,ks=4,vs=8,imm=0)\n14: (85) call bpf_this_cpu_ptr#154\n"
	  "R1 type=map_value expected=percpu_ptr_" },
	{ "lock_id_mismatch_kptr_kptr", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_kptr_global", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_kptr_mapval", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_kptr_innermapval", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_global_global", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_global_kptr", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_global_mapval", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_global_innermapval", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_mapval_mapval", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_mapval_kptr", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_mapval_global", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_mapval_innermapval", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_innermapval_innermapval1", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_innermapval_innermapval2", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_innermapval_kptr", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_innermapval_global", "bpf_spin_unlock of different lock" },
	{ "lock_id_mismatch_innermapval_mapval", "bpf_spin_unlock of different lock" },
};

static void test_spin_lock_fail_prog(const char *prog_name, const char *err_msg)
{
	LIBBPF_OPTS(bpf_object_open_opts, opts, .kernel_log_buf = log_buf,
						.kernel_log_size = sizeof(log_buf),
						.kernel_log_level = 1);
	struct test_spin_lock_fail *skel;
	struct bpf_program *prog;
	int ret;

	skel = test_spin_lock_fail__open_opts(&opts);
	if (!ASSERT_OK_PTR(skel, "test_spin_lock_fail__open_opts"))
		return;

	prog = bpf_object__find_program_by_name(skel->obj, prog_name);
	if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
		goto end;

	bpf_program__set_autoload(prog, true);

	ret = test_spin_lock_fail__load(skel);
	if (!ASSERT_ERR(ret, "test_spin_lock_fail__load must fail"))
		goto end;

	/* Skip check if JIT does not support kfuncs */
	if (strstr(log_buf, "JIT does not support calling kernel function")) {
		test__skip();
		goto end;
	}

	if (!ASSERT_OK_PTR(strstr(log_buf, err_msg), "expected error message")) {
		fprintf(stderr, "Expected: %s\n", err_msg);
		fprintf(stderr, "Verifier: %s\n", log_buf);
	}

end:
	test_spin_lock_fail__destroy(skel);
}

static void *spin_lock_thread(void *arg)
{
	int err, prog_fd = *(u32 *) arg;
	LIBBPF_OPTS(bpf_test_run_opts, topts,
		.data_in = &pkt_v4,
		.data_size_in = sizeof(pkt_v4),
		.repeat = 10000,
	);

	err = bpf_prog_test_run_opts(prog_fd, &topts);
	ASSERT_OK(err, "test_run");
	ASSERT_OK(topts.retval, "test_run retval");
	pthread_exit(arg);
}

void test_spin_lock_success(void)
{
	struct test_spin_lock *skel;
	pthread_t thread_id[4];
	int prog_fd, i;
	void *ret;

	skel = test_spin_lock__open_and_load();
	if (!ASSERT_OK_PTR(skel, "test_spin_lock__open_and_load"))
		return;
	prog_fd = bpf_program__fd(skel->progs.bpf_spin_lock_test);
	for (i = 0; i < 4; i++) {
		int err;

		err = pthread_create(&thread_id[i], NULL, &spin_lock_thread, &prog_fd);
		if (!ASSERT_OK(err, "pthread_create"))
			goto end;
	}

	for (i = 0; i < 4; i++) {
		if (!ASSERT_OK(pthread_join(thread_id[i], &ret), "pthread_join"))
			goto end;
		if (!ASSERT_EQ(ret, &prog_fd, "ret == prog_fd"))
			goto end;
	}
end:
	test_spin_lock__destroy(skel);
}

void test_spin_lock(void)
{
	int i;

	test_spin_lock_success();

	for (i = 0; i < ARRAY_SIZE(spin_lock_fail_tests); i++) {
		if (!test__start_subtest(spin_lock_fail_tests[i].prog_name))
			continue;
		test_spin_lock_fail_prog(spin_lock_fail_tests[i].prog_name,
					 spin_lock_fail_tests[i].err_msg);
	}
}
-45
tools/testing/selftests/bpf/prog_tests/spinlock.c
// SPDX-License-Identifier: GPL-2.0
#include <test_progs.h>
#include <network_helpers.h>

static void *spin_lock_thread(void *arg)
{
	int err, prog_fd = *(u32 *) arg;
	LIBBPF_OPTS(bpf_test_run_opts, topts,
		.data_in = &pkt_v4,
		.data_size_in = sizeof(pkt_v4),
		.repeat = 10000,
	);

	err = bpf_prog_test_run_opts(prog_fd, &topts);
	ASSERT_OK(err, "test_run");
	ASSERT_OK(topts.retval, "test_run retval");
	pthread_exit(arg);
}

void test_spinlock(void)
{
	const char *file = "./test_spin_lock.bpf.o";
	pthread_t thread_id[4];
	struct bpf_object *obj = NULL;
	int prog_fd;
	int err = 0, i;
	void *ret;

	err = bpf_prog_test_load(file, BPF_PROG_TYPE_CGROUP_SKB, &obj, &prog_fd);
	if (CHECK_FAIL(err)) {
		printf("test_spin_lock:bpf_prog_test_load errno %d\n", errno);
		goto close_prog;
	}
	for (i = 0; i < 4; i++)
		if (CHECK_FAIL(pthread_create(&thread_id[i], NULL,
					      &spin_lock_thread, &prog_fd)))
			goto close_prog;

	for (i = 0; i < 4; i++)
		if (CHECK_FAIL(pthread_join(thread_id[i], &ret) ||
			       ret != (void *)&prog_fd))
			goto close_prog;
close_prog:
	bpf_object__close(obj);
}
+163
tools/testing/selftests/bpf/prog_tests/task_kfunc.c
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */

#define _GNU_SOURCE
#include <sys/wait.h>
#include <test_progs.h>
#include <unistd.h>

#include "task_kfunc_failure.skel.h"
#include "task_kfunc_success.skel.h"

static size_t log_buf_sz = 1 << 20; /* 1 MB */
static char obj_log_buf[1048576];

static struct task_kfunc_success *open_load_task_kfunc_skel(void)
{
	struct task_kfunc_success *skel;
	int err;

	skel = task_kfunc_success__open();
	if (!ASSERT_OK_PTR(skel, "skel_open"))
		return NULL;

	skel->bss->pid = getpid();

	err = task_kfunc_success__load(skel);
	if (!ASSERT_OK(err, "skel_load"))
		goto cleanup;

	return skel;

cleanup:
	task_kfunc_success__destroy(skel);
	return NULL;
}

static void run_success_test(const char *prog_name)
{
	struct task_kfunc_success *skel;
	int status;
	pid_t child_pid;
	struct bpf_program *prog;
	struct bpf_link *link = NULL;

	skel = open_load_task_kfunc_skel();
	if (!ASSERT_OK_PTR(skel, "open_load_skel"))
		return;

	if (!ASSERT_OK(skel->bss->err, "pre_spawn_err"))
		goto cleanup;

	prog = bpf_object__find_program_by_name(skel->obj, prog_name);
	if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
		goto cleanup;

	link = bpf_program__attach(prog);
	if (!ASSERT_OK_PTR(link, "attached_link"))
		goto cleanup;

	child_pid = fork();
	if (!ASSERT_GT(child_pid, -1, "child_pid"))
		goto cleanup;
	if (child_pid == 0)
		_exit(0);
	waitpid(child_pid, &status, 0);

	ASSERT_OK(skel->bss->err, "post_wait_err");

cleanup:
	bpf_link__destroy(link);
	task_kfunc_success__destroy(skel);
}

static const char * const success_tests[] = {
	"test_task_acquire_release_argument",
	"test_task_acquire_release_current",
	"test_task_acquire_leave_in_map",
	"test_task_xchg_release",
	"test_task_get_release",
	"test_task_current_acquire_release",
	"test_task_from_pid_arg",
	"test_task_from_pid_current",
	"test_task_from_pid_invalid",
};

static struct {
	const char *prog_name;
	const char *expected_err_msg;
} failure_tests[] = {
	{"task_kfunc_acquire_untrusted", "R1 must be referenced or trusted"},
	{"task_kfunc_acquire_fp", "arg#0 pointer type STRUCT task_struct must point"},
	{"task_kfunc_acquire_unsafe_kretprobe", "reg type unsupported for arg#0 function"},
	{"task_kfunc_acquire_trusted_walked", "R1 must be referenced or trusted"},
	{"task_kfunc_acquire_null", "arg#0 pointer type STRUCT task_struct must point"},
	{"task_kfunc_acquire_unreleased", "Unreleased reference"},
	{"task_kfunc_get_non_kptr_param", "arg#0 expected pointer to map value"},
	{"task_kfunc_get_non_kptr_acquired", "arg#0 expected pointer to map value"},
	{"task_kfunc_get_null", "arg#0 expected pointer to map value"},
	{"task_kfunc_xchg_unreleased", "Unreleased reference"},
	{"task_kfunc_get_unreleased", "Unreleased reference"},
	{"task_kfunc_release_untrusted", "arg#0 is untrusted_ptr_or_null_ expected ptr_ or socket"},
	{"task_kfunc_release_fp", "arg#0 pointer type STRUCT task_struct must point"},
	{"task_kfunc_release_null", "arg#0 is ptr_or_null_ expected ptr_ or socket"},
	{"task_kfunc_release_unacquired", "release kernel function bpf_task_release expects"},
	{"task_kfunc_from_pid_no_null_check", "arg#0 is ptr_or_null_ expected ptr_ or socket"},
};

static void verify_fail(const char *prog_name, const char *expected_err_msg)
{
	LIBBPF_OPTS(bpf_object_open_opts, opts);
	struct task_kfunc_failure *skel;
	int err, i;

	opts.kernel_log_buf = obj_log_buf;
	opts.kernel_log_size = log_buf_sz;
	opts.kernel_log_level = 1;

	skel = task_kfunc_failure__open_opts(&opts);
	if (!ASSERT_OK_PTR(skel, "task_kfunc_failure__open_opts"))
		goto cleanup;

	for (i = 0; i < ARRAY_SIZE(failure_tests); i++) {
		struct bpf_program *prog;
		const char *curr_name = failure_tests[i].prog_name;

		prog = bpf_object__find_program_by_name(skel->obj, curr_name);
		if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
			goto cleanup;

		bpf_program__set_autoload(prog, !strcmp(curr_name, prog_name));
	}

	err = task_kfunc_failure__load(skel);
	if (!ASSERT_ERR(err, "unexpected load success"))
		goto cleanup;

	if (!ASSERT_OK_PTR(strstr(obj_log_buf, expected_err_msg), "expected_err_msg")) {
		fprintf(stderr, "Expected err_msg: %s\n", expected_err_msg);
		fprintf(stderr, "Verifier output: %s\n", obj_log_buf);
	}

cleanup:
	task_kfunc_failure__destroy(skel);
}

void test_task_kfunc(void)
{
	int i;

	for (i = 0; i < ARRAY_SIZE(success_tests); i++) {
		if (!test__start_subtest(success_tests[i]))
			continue;

		run_success_test(success_tests[i]);
	}

	for (i = 0; i < ARRAY_SIZE(failure_tests); i++) {
		if (!test__start_subtest(failure_tests[i].prog_name))
			continue;

		verify_fail(failure_tests[i].prog_name, failure_tests[i].expected_err_msg);
	}
}
+114
tools/testing/selftests/bpf/prog_tests/type_cast.c
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
#include <test_progs.h>
#include <network_helpers.h>
#include "type_cast.skel.h"

static void test_xdp(void)
{
	struct type_cast *skel;
	int err, prog_fd;
	char buf[128];

	LIBBPF_OPTS(bpf_test_run_opts, topts,
		.data_in = &pkt_v4,
		.data_size_in = sizeof(pkt_v4),
		.data_out = buf,
		.data_size_out = sizeof(buf),
		.repeat = 1,
	);

	skel = type_cast__open();
	if (!ASSERT_OK_PTR(skel, "skel_open"))
		return;

	bpf_program__set_autoload(skel->progs.md_xdp, true);
	err = type_cast__load(skel);
	if (!ASSERT_OK(err, "skel_load"))
		goto out;

	prog_fd = bpf_program__fd(skel->progs.md_xdp);
	err = bpf_prog_test_run_opts(prog_fd, &topts);
	ASSERT_OK(err, "test_run");
	ASSERT_EQ(topts.retval, XDP_PASS, "xdp test_run retval");

	ASSERT_EQ(skel->bss->ifindex, 1, "xdp_md ifindex");
	ASSERT_EQ(skel->bss->ifindex, skel->bss->ingress_ifindex, "xdp_md ingress_ifindex");
	ASSERT_STREQ(skel->bss->name, "lo", "xdp_md name");
	ASSERT_NEQ(skel->bss->inum, 0, "xdp_md inum");

out:
	type_cast__destroy(skel);
}

static void test_tc(void)
{
	struct type_cast *skel;
	int err, prog_fd;

	LIBBPF_OPTS(bpf_test_run_opts, topts,
		.data_in = &pkt_v4,
		.data_size_in = sizeof(pkt_v4),
		.repeat = 1,
	);

	skel = type_cast__open();
	if (!ASSERT_OK_PTR(skel, "skel_open"))
		return;

	bpf_program__set_autoload(skel->progs.md_skb, true);
	err = type_cast__load(skel);
	if (!ASSERT_OK(err, "skel_load"))
		goto out;

	prog_fd = bpf_program__fd(skel->progs.md_skb);
	err = bpf_prog_test_run_opts(prog_fd, &topts);
	ASSERT_OK(err, "test_run");
	ASSERT_EQ(topts.retval, 0, "tc test_run retval");

	ASSERT_EQ(skel->bss->meta_len, 0, "skb meta_len");
	ASSERT_EQ(skel->bss->frag0_len, 0, "skb frag0_len");
	ASSERT_NEQ(skel->bss->kskb_len, 0, "skb len");
	ASSERT_NEQ(skel->bss->kskb2_len, 0, "skb2 len");
	ASSERT_EQ(skel->bss->kskb_len, skel->bss->kskb2_len, "skb len compare");

out:
	type_cast__destroy(skel);
}

static const char * const negative_tests[] = {
	"untrusted_ptr",
	"kctx_u64",
};

static void test_negative(void)
{
	struct bpf_program *prog;
	struct type_cast *skel;
	int i, err;

	for (i = 0; i < ARRAY_SIZE(negative_tests); i++) {
		skel = type_cast__open();
		if (!ASSERT_OK_PTR(skel, "skel_open"))
			return;

		prog = bpf_object__find_program_by_name(skel->obj, negative_tests[i]);
		if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
			goto out;
		bpf_program__set_autoload(prog, true);
		err = type_cast__load(skel);
		ASSERT_ERR(err, "skel_load");
out:
		type_cast__destroy(skel);
	}
}

void test_type_cast(void)
{
	if (test__start_subtest("xdp"))
		test_xdp();
	if (test__start_subtest("tc"))
		test_tc();
	if (test__start_subtest("negative"))
		test_negative();
}
+1 -1
tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
 }

 #define NUM_PKTS 10000
-void test_xdp_do_redirect(void)
+void serial_test_xdp_do_redirect(void)
 {
 	int err, xdp_prog_fd, tc_prog_fd, ifindex_src, ifindex_dst;
 	char data[sizeof(pkt_udp) + sizeof(__u32)];
+1 -1
tools/testing/selftests/bpf/prog_tests/xdp_synproxy.c
 	system("ip netns del synproxy");
 }

-void test_xdp_synproxy(void)
+void serial_test_xdp_synproxy(void)
 {
 	if (test__start_subtest("xdp"))
 		test_synproxy(true);
+72
tools/testing/selftests/bpf/progs/cgrp_kfunc_common.h
/* SPDX-License-Identifier: GPL-2.0 */
/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */

#ifndef _CGRP_KFUNC_COMMON_H
#define _CGRP_KFUNC_COMMON_H

#include <errno.h>
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct __cgrps_kfunc_map_value {
	struct cgroup __kptr_ref * cgrp;
};

struct hash_map {
	__uint(type, BPF_MAP_TYPE_HASH);
	__type(key, int);
	__type(value, struct __cgrps_kfunc_map_value);
	__uint(max_entries, 1);
} __cgrps_kfunc_map SEC(".maps");

struct cgroup *bpf_cgroup_acquire(struct cgroup *p) __ksym;
struct cgroup *bpf_cgroup_kptr_get(struct cgroup **pp) __ksym;
void bpf_cgroup_release(struct cgroup *p) __ksym;
struct cgroup *bpf_cgroup_ancestor(struct cgroup *cgrp, int level) __ksym;

static inline struct __cgrps_kfunc_map_value *cgrps_kfunc_map_value_lookup(struct cgroup *cgrp)
{
	s32 id;
	long status;

	status = bpf_probe_read_kernel(&id, sizeof(id), &cgrp->self.id);
	if (status)
		return NULL;

	return bpf_map_lookup_elem(&__cgrps_kfunc_map, &id);
}

static inline int cgrps_kfunc_map_insert(struct cgroup *cgrp)
{
	struct __cgrps_kfunc_map_value local, *v;
	long status;
	struct cgroup *acquired, *old;
	s32 id;

	status = bpf_probe_read_kernel(&id, sizeof(id), &cgrp->self.id);
	if (status)
		return status;

	local.cgrp = NULL;
	status = bpf_map_update_elem(&__cgrps_kfunc_map, &id, &local, BPF_NOEXIST);
	if (status)
		return status;

	v = bpf_map_lookup_elem(&__cgrps_kfunc_map, &id);
	if (!v) {
		bpf_map_delete_elem(&__cgrps_kfunc_map, &id);
		return -ENOENT;
	}

	acquired = bpf_cgroup_acquire(cgrp);
	old = bpf_kptr_xchg(&v->cgrp, acquired);
	if (old) {
		bpf_cgroup_release(old);
		return -EEXIST;
	}

	return 0;
}

#endif /* _CGRP_KFUNC_COMMON_H */
+260
tools/testing/selftests/bpf/progs/cgrp_kfunc_failure.c
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */

#include <vmlinux.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_helpers.h>

#include "cgrp_kfunc_common.h"

char _license[] SEC("license") = "GPL";

/* Prototype for all of the program trace events below:
 *
 * TRACE_EVENT(cgroup_mkdir,
 *         TP_PROTO(struct cgroup *cgrp, const char *path),
 *         TP_ARGS(cgrp, path)
 */

static struct __cgrps_kfunc_map_value *insert_lookup_cgrp(struct cgroup *cgrp)
{
	int status;

	status = cgrps_kfunc_map_insert(cgrp);
	if (status)
		return NULL;

	return cgrps_kfunc_map_value_lookup(cgrp);
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(cgrp_kfunc_acquire_untrusted, struct cgroup *cgrp, const char *path)
{
	struct cgroup *acquired;
	struct __cgrps_kfunc_map_value *v;

	v = insert_lookup_cgrp(cgrp);
	if (!v)
		return 0;

	/* Can't invoke bpf_cgroup_acquire() on an untrusted pointer. */
	acquired = bpf_cgroup_acquire(v->cgrp);
	bpf_cgroup_release(acquired);

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(cgrp_kfunc_acquire_fp, struct cgroup *cgrp, const char *path)
{
	struct cgroup *acquired, *stack_cgrp = (struct cgroup *)&path;

	/* Can't invoke bpf_cgroup_acquire() on a random frame pointer. */
	acquired = bpf_cgroup_acquire((struct cgroup *)&stack_cgrp);
	bpf_cgroup_release(acquired);

	return 0;
}

SEC("kretprobe/cgroup_destroy_locked")
int BPF_PROG(cgrp_kfunc_acquire_unsafe_kretprobe, struct cgroup *cgrp)
{
	struct cgroup *acquired;

	/* Can't acquire an untrusted struct cgroup * pointer. */
	acquired = bpf_cgroup_acquire(cgrp);
	bpf_cgroup_release(acquired);

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(cgrp_kfunc_acquire_trusted_walked, struct cgroup *cgrp, const char *path)
{
	struct cgroup *acquired;

	/* Can't invoke bpf_cgroup_acquire() on a pointer obtained from walking a trusted cgroup. */
	acquired = bpf_cgroup_acquire(cgrp->old_dom_cgrp);
	bpf_cgroup_release(acquired);

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(cgrp_kfunc_acquire_null, struct cgroup *cgrp, const char *path)
{
	struct cgroup *acquired;

	/* Can't invoke bpf_cgroup_acquire() on a NULL pointer. */
	acquired = bpf_cgroup_acquire(NULL);
	if (!acquired)
		return 0;
	bpf_cgroup_release(acquired);

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(cgrp_kfunc_acquire_unreleased, struct cgroup *cgrp, const char *path)
{
	struct cgroup *acquired;

	acquired = bpf_cgroup_acquire(cgrp);

	/* Acquired cgroup is never released. */

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(cgrp_kfunc_get_non_kptr_param, struct cgroup *cgrp, const char *path)
{
	struct cgroup *kptr;

	/* Cannot use bpf_cgroup_kptr_get() on a non-kptr, even on a valid cgroup. */
	kptr = bpf_cgroup_kptr_get(&cgrp);
	if (!kptr)
		return 0;

	bpf_cgroup_release(kptr);

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(cgrp_kfunc_get_non_kptr_acquired, struct cgroup *cgrp, const char *path)
{
	struct cgroup *kptr, *acquired;

	acquired = bpf_cgroup_acquire(cgrp);

	/* Cannot use bpf_cgroup_kptr_get() on a non-map-value, even if the kptr was acquired. */
	kptr = bpf_cgroup_kptr_get(&acquired);
	bpf_cgroup_release(acquired);
	if (!kptr)
		return 0;

	bpf_cgroup_release(kptr);

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(cgrp_kfunc_get_null, struct cgroup *cgrp, const char *path)
{
	struct cgroup *kptr;

	/* Cannot use bpf_cgroup_kptr_get() on a NULL pointer. */
	kptr = bpf_cgroup_kptr_get(NULL);
	if (!kptr)
		return 0;

	bpf_cgroup_release(kptr);

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(cgrp_kfunc_xchg_unreleased, struct cgroup *cgrp, const char *path)
{
	struct cgroup *kptr;
	struct __cgrps_kfunc_map_value *v;

	v = insert_lookup_cgrp(cgrp);
	if (!v)
		return 0;

	kptr = bpf_kptr_xchg(&v->cgrp, NULL);
	if (!kptr)
		return 0;

	/* Kptr retrieved from map is never released. */

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(cgrp_kfunc_get_unreleased, struct cgroup *cgrp, const char *path)
{
	struct cgroup *kptr;
	struct __cgrps_kfunc_map_value *v;

	v = insert_lookup_cgrp(cgrp);
	if (!v)
		return 0;

	kptr = bpf_cgroup_kptr_get(&v->cgrp);
	if (!kptr)
		return 0;

	/* Kptr acquired above is never released. */

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(cgrp_kfunc_release_untrusted, struct cgroup *cgrp, const char *path)
{
	struct __cgrps_kfunc_map_value *v;

	v = insert_lookup_cgrp(cgrp);
	if (!v)
		return 0;

	/* Can't invoke bpf_cgroup_release() on an untrusted pointer. */
	bpf_cgroup_release(v->cgrp);

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(cgrp_kfunc_release_fp, struct cgroup *cgrp, const char *path)
{
	struct cgroup *acquired = (struct cgroup *)&path;

	/* Cannot release random frame pointer. */
	bpf_cgroup_release(acquired);

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(cgrp_kfunc_release_null, struct cgroup *cgrp, const char *path)
{
	struct __cgrps_kfunc_map_value local, *v;
	long status;
	struct cgroup *acquired, *old;
	s32 id;

	status = bpf_probe_read_kernel(&id, sizeof(id), &cgrp->self.id);
	if (status)
		return 0;

	local.cgrp = NULL;
	status = bpf_map_update_elem(&__cgrps_kfunc_map, &id, &local, BPF_NOEXIST);
	if (status)
		return status;

	v = bpf_map_lookup_elem(&__cgrps_kfunc_map, &id);
	if (!v)
		return -ENOENT;

	acquired = bpf_cgroup_acquire(cgrp);

	old = bpf_kptr_xchg(&v->cgrp, acquired);

	/* old cannot be passed to bpf_cgroup_release() without a NULL check. */
	bpf_cgroup_release(old);

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(cgrp_kfunc_release_unacquired, struct cgroup *cgrp, const char *path)
{
	/* Cannot release trusted cgroup pointer which was not acquired. */
	bpf_cgroup_release(cgrp);

	return 0;
}
+170
tools/testing/selftests/bpf/progs/cgrp_kfunc_success.c
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */

#include <vmlinux.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_helpers.h>

#include "cgrp_kfunc_common.h"

char _license[] SEC("license") = "GPL";

int err, pid, invocations;

/* Prototype for all of the program trace events below:
 *
 * TRACE_EVENT(cgroup_mkdir,
 *         TP_PROTO(struct cgroup *cgrp, const char *path),
 *         TP_ARGS(cgrp, path)
 */

static bool is_test_kfunc_task(void)
{
	int cur_pid = bpf_get_current_pid_tgid() >> 32;
	bool same = pid == cur_pid;

	if (same)
		__sync_fetch_and_add(&invocations, 1);

	return same;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(test_cgrp_acquire_release_argument, struct cgroup *cgrp, const char *path)
{
	struct cgroup *acquired;

	if (!is_test_kfunc_task())
		return 0;

	acquired = bpf_cgroup_acquire(cgrp);
	bpf_cgroup_release(acquired);

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(test_cgrp_acquire_leave_in_map, struct cgroup *cgrp, const char *path)
{
	long status;

	if (!is_test_kfunc_task())
		return 0;

	status = cgrps_kfunc_map_insert(cgrp);
	if (status)
		err = 1;

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(test_cgrp_xchg_release, struct cgroup *cgrp, const char *path)
{
	struct cgroup *kptr;
	struct __cgrps_kfunc_map_value *v;
	long status;

	if (!is_test_kfunc_task())
		return 0;

	status = cgrps_kfunc_map_insert(cgrp);
	if (status) {
		err = 1;
		return 0;
	}

	v = cgrps_kfunc_map_value_lookup(cgrp);
	if (!v) {
		err = 2;
		return 0;
	}

	kptr = bpf_kptr_xchg(&v->cgrp, NULL);
	if (!kptr) {
		err = 3;
		return 0;
	}

	bpf_cgroup_release(kptr);

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(test_cgrp_get_release, struct cgroup *cgrp, const char *path)
{
	struct cgroup *kptr;
	struct __cgrps_kfunc_map_value *v;
	long status;

	if (!is_test_kfunc_task())
		return 0;

	status = cgrps_kfunc_map_insert(cgrp);
	if (status) {
		err = 1;
		return 0;
	}

	v = cgrps_kfunc_map_value_lookup(cgrp);
	if (!v) {
		err = 2;
		return 0;
	}

	kptr = bpf_cgroup_kptr_get(&v->cgrp);
	if (!kptr) {
		err = 3;
		return 0;
	}

	bpf_cgroup_release(kptr);

	return 0;
}

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(test_cgrp_get_ancestors, struct cgroup *cgrp, const char *path)
{
	struct cgroup *self, *ancestor1, *invalid;

	if (!is_test_kfunc_task())
		return 0;

	self = bpf_cgroup_ancestor(cgrp, cgrp->level);
	if (!self) {
		err = 1;
		return 0;
	}

	if (self->self.id != cgrp->self.id) {
		bpf_cgroup_release(self);
		err = 2;
		return 0;
	}
	bpf_cgroup_release(self);

	ancestor1 = bpf_cgroup_ancestor(cgrp, cgrp->level - 1);
	if (!ancestor1) {
		err = 3;
		return 0;
	}
	bpf_cgroup_release(ancestor1);

	invalid = bpf_cgroup_ancestor(cgrp, 10000);
	if (invalid) {
		bpf_cgroup_release(invalid);
		err = 4;
		return 0;
	}

	invalid = bpf_cgroup_ancestor(cgrp, -1);
	if (invalid) {
		bpf_cgroup_release(invalid);
		err = 5;
		return 0;
	}

	return 0;
}
+37
tools/testing/selftests/bpf/progs/empty_skb.c
// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

char _license[] SEC("license") = "GPL";

int ifindex;
int ret;

SEC("lwt_xmit")
int redirect_ingress(struct __sk_buff *skb)
{
	ret = bpf_clone_redirect(skb, ifindex, BPF_F_INGRESS);
	return 0;
}

SEC("lwt_xmit")
int redirect_egress(struct __sk_buff *skb)
{
	ret = bpf_clone_redirect(skb, ifindex, 0);
	return 0;
}

SEC("tc")
int tc_redirect_ingress(struct __sk_buff *skb)
{
	ret = bpf_clone_redirect(skb, ifindex, BPF_F_INGRESS);
	return 0;
}

SEC("tc")
int tc_redirect_egress(struct __sk_buff *skb)
{
	ret = bpf_clone_redirect(skb, ifindex, 0);
	return 0;
}
+370
tools/testing/selftests/bpf/progs/linked_list.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <vmlinux.h> 3 + #include <bpf/bpf_tracing.h> 4 + #include <bpf/bpf_helpers.h> 5 + #include <bpf/bpf_core_read.h> 6 + #include "bpf_experimental.h" 7 + 8 + #ifndef ARRAY_SIZE 9 + #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) 10 + #endif 11 + 12 + #include "linked_list.h" 13 + 14 + static __always_inline 15 + int list_push_pop(struct bpf_spin_lock *lock, struct bpf_list_head *head, bool leave_in_map) 16 + { 17 + struct bpf_list_node *n; 18 + struct foo *f; 19 + 20 + f = bpf_obj_new(typeof(*f)); 21 + if (!f) 22 + return 2; 23 + 24 + bpf_spin_lock(lock); 25 + n = bpf_list_pop_front(head); 26 + bpf_spin_unlock(lock); 27 + if (n) { 28 + bpf_obj_drop(container_of(n, struct foo, node)); 29 + bpf_obj_drop(f); 30 + return 3; 31 + } 32 + 33 + bpf_spin_lock(lock); 34 + n = bpf_list_pop_back(head); 35 + bpf_spin_unlock(lock); 36 + if (n) { 37 + bpf_obj_drop(container_of(n, struct foo, node)); 38 + bpf_obj_drop(f); 39 + return 4; 40 + } 41 + 42 + 43 + bpf_spin_lock(lock); 44 + f->data = 42; 45 + bpf_list_push_front(head, &f->node); 46 + bpf_spin_unlock(lock); 47 + if (leave_in_map) 48 + return 0; 49 + bpf_spin_lock(lock); 50 + n = bpf_list_pop_back(head); 51 + bpf_spin_unlock(lock); 52 + if (!n) 53 + return 5; 54 + f = container_of(n, struct foo, node); 55 + if (f->data != 42) { 56 + bpf_obj_drop(f); 57 + return 6; 58 + } 59 + 60 + bpf_spin_lock(lock); 61 + f->data = 13; 62 + bpf_list_push_front(head, &f->node); 63 + bpf_spin_unlock(lock); 64 + bpf_spin_lock(lock); 65 + n = bpf_list_pop_front(head); 66 + bpf_spin_unlock(lock); 67 + if (!n) 68 + return 7; 69 + f = container_of(n, struct foo, node); 70 + if (f->data != 13) { 71 + bpf_obj_drop(f); 72 + return 8; 73 + } 74 + bpf_obj_drop(f); 75 + 76 + bpf_spin_lock(lock); 77 + n = bpf_list_pop_front(head); 78 + bpf_spin_unlock(lock); 79 + if (n) { 80 + bpf_obj_drop(container_of(n, struct foo, node)); 81 + return 9; 82 + } 83 + 84 + bpf_spin_lock(lock); 85 + n = 
	bpf_list_pop_back(head);
	bpf_spin_unlock(lock);
	if (n) {
		bpf_obj_drop(container_of(n, struct foo, node));
		return 10;
	}
	return 0;
}

static __always_inline
int list_push_pop_multiple(struct bpf_spin_lock *lock, struct bpf_list_head *head, bool leave_in_map)
{
	struct bpf_list_node *n;
	struct foo *f[8], *pf;
	int i;

	for (i = 0; i < ARRAY_SIZE(f); i++) {
		f[i] = bpf_obj_new(typeof(**f));
		if (!f[i])
			return 2;
		f[i]->data = i;
		bpf_spin_lock(lock);
		bpf_list_push_front(head, &f[i]->node);
		bpf_spin_unlock(lock);
	}

	for (i = 0; i < ARRAY_SIZE(f); i++) {
		bpf_spin_lock(lock);
		n = bpf_list_pop_front(head);
		bpf_spin_unlock(lock);
		if (!n)
			return 3;
		pf = container_of(n, struct foo, node);
		if (pf->data != (ARRAY_SIZE(f) - i - 1)) {
			bpf_obj_drop(pf);
			return 4;
		}
		bpf_spin_lock(lock);
		bpf_list_push_back(head, &pf->node);
		bpf_spin_unlock(lock);
	}

	if (leave_in_map)
		return 0;

	for (i = 0; i < ARRAY_SIZE(f); i++) {
		bpf_spin_lock(lock);
		n = bpf_list_pop_back(head);
		bpf_spin_unlock(lock);
		if (!n)
			return 5;
		pf = container_of(n, struct foo, node);
		if (pf->data != i) {
			bpf_obj_drop(pf);
			return 6;
		}
		bpf_obj_drop(pf);
	}
	bpf_spin_lock(lock);
	n = bpf_list_pop_back(head);
	bpf_spin_unlock(lock);
	if (n) {
		bpf_obj_drop(container_of(n, struct foo, node));
		return 7;
	}

	bpf_spin_lock(lock);
	n = bpf_list_pop_front(head);
	bpf_spin_unlock(lock);
	if (n) {
		bpf_obj_drop(container_of(n, struct foo, node));
		return 8;
	}
	return 0;
}

static __always_inline
int list_in_list(struct bpf_spin_lock *lock, struct bpf_list_head *head, bool leave_in_map)
{
	struct bpf_list_node *n;
	struct bar *ba[8], *b;
	struct foo *f;
	int i;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 2;
	for (i = 0; i < ARRAY_SIZE(ba); i++) {
		b = bpf_obj_new(typeof(*b));
		if (!b) {
			bpf_obj_drop(f);
			return 3;
		}
		b->data = i;
		bpf_spin_lock(&f->lock);
		bpf_list_push_back(&f->head, &b->node);
		bpf_spin_unlock(&f->lock);
	}

	bpf_spin_lock(lock);
	f->data = 42;
	bpf_list_push_front(head, &f->node);
	bpf_spin_unlock(lock);

	if (leave_in_map)
		return 0;

	bpf_spin_lock(lock);
	n = bpf_list_pop_front(head);
	bpf_spin_unlock(lock);
	if (!n)
		return 4;
	f = container_of(n, struct foo, node);
	if (f->data != 42) {
		bpf_obj_drop(f);
		return 5;
	}

	for (i = 0; i < ARRAY_SIZE(ba); i++) {
		bpf_spin_lock(&f->lock);
		n = bpf_list_pop_front(&f->head);
		bpf_spin_unlock(&f->lock);
		if (!n) {
			bpf_obj_drop(f);
			return 6;
		}
		b = container_of(n, struct bar, node);
		if (b->data != i) {
			bpf_obj_drop(f);
			bpf_obj_drop(b);
			return 7;
		}
		bpf_obj_drop(b);
	}
	bpf_spin_lock(&f->lock);
	n = bpf_list_pop_front(&f->head);
	bpf_spin_unlock(&f->lock);
	if (n) {
		bpf_obj_drop(f);
		bpf_obj_drop(container_of(n, struct bar, node));
		return 8;
	}
	bpf_obj_drop(f);
	return 0;
}

static __always_inline
int test_list_push_pop(struct bpf_spin_lock *lock, struct bpf_list_head *head)
{
	int ret;

	ret = list_push_pop(lock, head, false);
	if (ret)
		return ret;
	return list_push_pop(lock, head, true);
}

static __always_inline
int test_list_push_pop_multiple(struct bpf_spin_lock *lock, struct bpf_list_head *head)
{
	int ret;

	ret = list_push_pop_multiple(lock, head, false);
	if (ret)
		return ret;
	return list_push_pop_multiple(lock, head, true);
}

static __always_inline
int test_list_in_list(struct bpf_spin_lock *lock, struct bpf_list_head *head)
{
	int ret;

	ret = list_in_list(lock, head, false);
	if (ret)
		return ret;
	return list_in_list(lock, head, true);
}

SEC("tc")
int map_list_push_pop(void *ctx)
{
	struct map_value *v;

	v = bpf_map_lookup_elem(&array_map, &(int){0});
	if (!v)
		return 1;
	return test_list_push_pop(&v->lock, &v->head);
}

SEC("tc")
int inner_map_list_push_pop(void *ctx)
{
	struct map_value *v;
	void *map;

	map = bpf_map_lookup_elem(&map_of_maps, &(int){0});
	if (!map)
		return 1;
	v = bpf_map_lookup_elem(map, &(int){0});
	if (!v)
		return 1;
	return test_list_push_pop(&v->lock, &v->head);
}

SEC("tc")
int global_list_push_pop(void *ctx)
{
	return test_list_push_pop(&glock, &ghead);
}

SEC("tc")
int map_list_push_pop_multiple(void *ctx)
{
	struct map_value *v;

	v = bpf_map_lookup_elem(&array_map, &(int){0});
	if (!v)
		return 1;
	return test_list_push_pop_multiple(&v->lock, &v->head);
}

SEC("tc")
int inner_map_list_push_pop_multiple(void *ctx)
{
	struct map_value *v;
	void *map;

	map = bpf_map_lookup_elem(&map_of_maps, &(int){0});
	if (!map)
		return 1;
	v = bpf_map_lookup_elem(map, &(int){0});
	if (!v)
		return 1;
	return test_list_push_pop_multiple(&v->lock, &v->head);
}

SEC("tc")
int global_list_push_pop_multiple(void *ctx)
{
	int ret;

	ret = list_push_pop_multiple(&glock, &ghead, false);
	if (ret)
		return ret;
	return list_push_pop_multiple(&glock, &ghead, true);
}

SEC("tc")
int map_list_in_list(void *ctx)
{
	struct map_value *v;

	v = bpf_map_lookup_elem(&array_map, &(int){0});
	if (!v)
		return 1;
	return test_list_in_list(&v->lock, &v->head);
}

SEC("tc")
int inner_map_list_in_list(void *ctx)
{
	struct map_value *v;
	void *map;

	map = bpf_map_lookup_elem(&map_of_maps, &(int){0});
	if (!map)
		return 1;
	v = bpf_map_lookup_elem(map, &(int){0});
	if (!v)
		return 1;
	return test_list_in_list(&v->lock, &v->head);
}

SEC("tc")
int global_list_in_list(void *ctx)
{
	return test_list_in_list(&glock, &ghead);
}

char _license[] SEC("license") = "GPL";
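The push/pop ordering that list_push_pop_multiple() above verifies can be reproduced in plain user-space C, for readers without a BPF toolchain. This is an illustrative sketch only: the struct and function names below are hypothetical, and the real kfuncs operate on bpf_list_head objects under the verifier's lock checking, not on a heap list like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins for bpf_list_push_front/back
 * and bpf_list_pop_front/back on a doubly-linked list.
 */
struct node {
	int data;
	struct node *prev, *next;
};

struct list {
	struct node *first, *last;
};

static void push_front(struct list *l, struct node *n)
{
	n->prev = NULL;
	n->next = l->first;
	if (l->first)
		l->first->prev = n;
	else
		l->last = n;
	l->first = n;
}

static void push_back(struct list *l, struct node *n)
{
	n->next = NULL;
	n->prev = l->last;
	if (l->last)
		l->last->next = n;
	else
		l->first = n;
	l->last = n;
}

static struct node *pop_front(struct list *l)
{
	struct node *n = l->first;

	if (!n)
		return NULL;
	l->first = n->next;
	if (l->first)
		l->first->prev = NULL;
	else
		l->last = NULL;
	return n;
}

static struct node *pop_back(struct list *l)
{
	struct node *n = l->last;

	if (!n)
		return NULL;
	l->last = n->prev;
	if (l->last)
		l->last->next = NULL;
	else
		l->first = NULL;
	return n;
}

/* Mirror of list_push_pop_multiple(): push 0..7 at the front, so
 * pop_front() sees 7..0; each popped node is re-pushed at the back,
 * after which pop_back() drains them as 0..7, matching the
 * selftest's expectations. Returns 0 on success.
 */
static int check_ordering(void)
{
	struct list l = { NULL, NULL };
	struct node *n;
	int i;

	for (i = 0; i < 8; i++) {
		n = malloc(sizeof(*n));
		if (!n)
			return 2;
		n->data = i;
		push_front(&l, n);
	}
	for (i = 0; i < 8; i++) {
		n = pop_front(&l);
		if (!n || n->data != 8 - i - 1)
			return 3;
		push_back(&l, n);
	}
	for (i = 0; i < 8; i++) {
		n = pop_back(&l);
		if (!n || n->data != i)
			return 4;
		free(n);
	}
	if (pop_back(&l) || pop_front(&l))
		return 5;
	return 0;
}
```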
tools/testing/selftests/bpf/progs/linked_list.h (new file, +56 lines)
// SPDX-License-Identifier: GPL-2.0
#ifndef LINKED_LIST_H
#define LINKED_LIST_H

#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include "bpf_experimental.h"

struct bar {
	struct bpf_list_node node;
	int data;
};

struct foo {
	struct bpf_list_node node;
	struct bpf_list_head head __contains(bar, node);
	struct bpf_spin_lock lock;
	int data;
	struct bpf_list_node node2;
};

struct map_value {
	struct bpf_spin_lock lock;
	int data;
	struct bpf_list_head head __contains(foo, node);
};

struct array_map {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__type(key, int);
	__type(value, struct map_value);
	__uint(max_entries, 1);
};

struct array_map array_map SEC(".maps");
struct array_map inner_map SEC(".maps");

struct {
	__uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
	__uint(max_entries, 1);
	__type(key, int);
	__type(value, int);
	__array(values, struct array_map);
} map_of_maps SEC(".maps") = {
	.values = {
		[0] = &inner_map,
	},
};

#define private(name) SEC(".bss." #name) __hidden __attribute__((aligned(8)))

private(A) struct bpf_spin_lock glock;
private(A) struct bpf_list_head ghead __contains(foo, node);
private(B) struct bpf_spin_lock glock2;

#endif
tools/testing/selftests/bpf/progs/linked_list_fail.c (new file, +581 lines)
// SPDX-License-Identifier: GPL-2.0
#include <vmlinux.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>
#include "bpf_experimental.h"

#include "linked_list.h"

#define INIT \
	struct map_value *v, *v2, *iv, *iv2; \
	struct foo *f, *f1, *f2; \
	struct bar *b; \
	void *map; \
	\
	map = bpf_map_lookup_elem(&map_of_maps, &(int){ 0 }); \
	if (!map) \
		return 0; \
	v = bpf_map_lookup_elem(&array_map, &(int){ 0 }); \
	if (!v) \
		return 0; \
	v2 = bpf_map_lookup_elem(&array_map, &(int){ 0 }); \
	if (!v2) \
		return 0; \
	iv = bpf_map_lookup_elem(map, &(int){ 0 }); \
	if (!iv) \
		return 0; \
	iv2 = bpf_map_lookup_elem(map, &(int){ 0 }); \
	if (!iv2) \
		return 0; \
	f = bpf_obj_new(typeof(*f)); \
	if (!f) \
		return 0; \
	f1 = f; \
	f2 = bpf_obj_new(typeof(*f2)); \
	if (!f2) { \
		bpf_obj_drop(f1); \
		return 0; \
	} \
	b = bpf_obj_new(typeof(*b)); \
	if (!b) { \
		bpf_obj_drop(f2); \
		bpf_obj_drop(f1); \
		return 0; \
	}

#define CHECK(test, op, hexpr) \
	SEC("?tc") \
	int test##_missing_lock_##op(void *ctx) \
	{ \
		INIT; \
		void (*p)(void *) = (void *)&bpf_list_##op; \
		p(hexpr); \
		return 0; \
	}

CHECK(kptr, push_front, &f->head);
CHECK(kptr, push_back, &f->head);
CHECK(kptr, pop_front, &f->head);
CHECK(kptr, pop_back, &f->head);

CHECK(global, push_front, &ghead);
CHECK(global, push_back, &ghead);
CHECK(global, pop_front, &ghead);
CHECK(global, pop_back, &ghead);

CHECK(map, push_front, &v->head);
CHECK(map, push_back, &v->head);
CHECK(map, pop_front, &v->head);
CHECK(map, pop_back, &v->head);

CHECK(inner_map, push_front, &iv->head);
CHECK(inner_map, push_back, &iv->head);
CHECK(inner_map, pop_front, &iv->head);
CHECK(inner_map, pop_back, &iv->head);

#undef CHECK

#define CHECK(test, op, lexpr, hexpr) \
	SEC("?tc") \
	int test##_incorrect_lock_##op(void *ctx) \
	{ \
		INIT; \
		void (*p)(void *) = (void *)&bpf_list_##op; \
		bpf_spin_lock(lexpr); \
		p(hexpr); \
		return 0; \
	}

#define CHECK_OP(op) \
	CHECK(kptr_kptr, op, &f1->lock, &f2->head); \
	CHECK(kptr_global, op, &f1->lock, &ghead); \
	CHECK(kptr_map, op, &f1->lock, &v->head); \
	CHECK(kptr_inner_map, op, &f1->lock, &iv->head); \
	\
	CHECK(global_global, op, &glock2, &ghead); \
	CHECK(global_kptr, op, &glock, &f1->head); \
	CHECK(global_map, op, &glock, &v->head); \
	CHECK(global_inner_map, op, &glock, &iv->head); \
	\
	CHECK(map_map, op, &v->lock, &v2->head); \
	CHECK(map_kptr, op, &v->lock, &f2->head); \
	CHECK(map_global, op, &v->lock, &ghead); \
	CHECK(map_inner_map, op, &v->lock, &iv->head); \
	\
	CHECK(inner_map_inner_map, op, &iv->lock, &iv2->head); \
	CHECK(inner_map_kptr, op, &iv->lock, &f2->head); \
	CHECK(inner_map_global, op, &iv->lock, &ghead); \
	CHECK(inner_map_map, op, &iv->lock, &v->head);

CHECK_OP(push_front);
CHECK_OP(push_back);
CHECK_OP(pop_front);
CHECK_OP(pop_back);

#undef CHECK
#undef CHECK_OP
#undef INIT

SEC("?kprobe/xyz")
int map_compat_kprobe(void *ctx)
{
	bpf_list_push_front(&ghead, NULL);
	return 0;
}

SEC("?kretprobe/xyz")
int map_compat_kretprobe(void *ctx)
{
	bpf_list_push_front(&ghead, NULL);
	return 0;
}

SEC("?tracepoint/xyz")
int map_compat_tp(void *ctx)
{
	bpf_list_push_front(&ghead, NULL);
	return 0;
}

SEC("?perf_event")
int map_compat_perf(void *ctx)
{
	bpf_list_push_front(&ghead, NULL);
	return 0;
}

SEC("?raw_tp/xyz")
int map_compat_raw_tp(void *ctx)
{
	bpf_list_push_front(&ghead, NULL);
	return 0;
}

SEC("?raw_tp.w/xyz")
int map_compat_raw_tp_w(void *ctx)
{
	bpf_list_push_front(&ghead, NULL);
	return 0;
}

SEC("?tc")
int obj_type_id_oor(void *ctx)
{
	bpf_obj_new_impl(~0UL, NULL);
	return 0;
}

SEC("?tc")
int obj_new_no_composite(void *ctx)
{
	bpf_obj_new_impl(bpf_core_type_id_local(int), (void *)42);
	return 0;
}

SEC("?tc")
int obj_new_no_struct(void *ctx)
{
	bpf_obj_new(union { int data; unsigned udata; });
	return 0;
}

SEC("?tc")
int obj_drop_non_zero_off(void *ctx)
{
	void *f;

	f = bpf_obj_new(struct foo);
	if (!f)
		return 0;
	bpf_obj_drop(f + 1);
	return 0;
}

SEC("?tc")
int new_null_ret(void *ctx)
{
	return bpf_obj_new(struct foo)->data;
}

SEC("?tc")
int obj_new_acq(void *ctx)
{
	bpf_obj_new(struct foo);
	return 0;
}

SEC("?tc")
int use_after_drop(void *ctx)
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	bpf_obj_drop(f);
	return f->data;
}

SEC("?tc")
int ptr_walk_scalar(void *ctx)
{
	struct test1 {
		struct test2 {
			struct test2 *next;
		} *ptr;
	} *p;

	p = bpf_obj_new(typeof(*p));
	if (!p)
		return 0;
	bpf_this_cpu_ptr(p->ptr);
	return 0;
}

SEC("?tc")
int direct_read_lock(void *ctx)
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	return *(int *)&f->lock;
}

SEC("?tc")
int direct_write_lock(void *ctx)
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	*(int *)&f->lock = 0;
	return 0;
}

SEC("?tc")
int direct_read_head(void *ctx)
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	return *(int *)&f->head;
}

SEC("?tc")
int direct_write_head(void *ctx)
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	*(int *)&f->head = 0;
	return 0;
}

SEC("?tc")
int direct_read_node(void *ctx)
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	return *(int *)&f->node;
}

SEC("?tc")
int direct_write_node(void *ctx)
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	*(int *)&f->node = 0;
	return 0;
}

static __always_inline
int write_after_op(void (*push_op)(void *head, void *node))
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	bpf_spin_lock(&glock);
	push_op(&ghead, &f->node);
	f->data = 42;
	bpf_spin_unlock(&glock);

	return 0;
}

SEC("?tc")
int write_after_push_front(void *ctx)
{
	return write_after_op((void *)bpf_list_push_front);
}

SEC("?tc")
int write_after_push_back(void *ctx)
{
	return write_after_op((void *)bpf_list_push_back);
}

static __always_inline
int use_after_unlock(void (*op)(void *head, void *node))
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	bpf_spin_lock(&glock);
	f->data = 42;
	op(&ghead, &f->node);
	bpf_spin_unlock(&glock);

	return f->data;
}

SEC("?tc")
int use_after_unlock_push_front(void *ctx)
{
	return use_after_unlock((void *)bpf_list_push_front);
}

SEC("?tc")
int use_after_unlock_push_back(void *ctx)
{
	return use_after_unlock((void *)bpf_list_push_back);
}

static __always_inline
int list_double_add(void (*op)(void *head, void *node))
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	bpf_spin_lock(&glock);
	op(&ghead, &f->node);
	op(&ghead, &f->node);
	bpf_spin_unlock(&glock);

	return 0;
}

SEC("?tc")
int double_push_front(void *ctx)
{
	return list_double_add((void *)bpf_list_push_front);
}

SEC("?tc")
int double_push_back(void *ctx)
{
	return list_double_add((void *)bpf_list_push_back);
}

SEC("?tc")
int no_node_value_type(void *ctx)
{
	void *p;

	p = bpf_obj_new(struct { int data; });
	if (!p)
		return 0;
	bpf_spin_lock(&glock);
	bpf_list_push_front(&ghead, p);
	bpf_spin_unlock(&glock);

	return 0;
}

SEC("?tc")
int incorrect_value_type(void *ctx)
{
	struct bar *b;

	b = bpf_obj_new(typeof(*b));
	if (!b)
		return 0;
	bpf_spin_lock(&glock);
	bpf_list_push_front(&ghead, &b->node);
	bpf_spin_unlock(&glock);

	return 0;
}

SEC("?tc")
int incorrect_node_var_off(struct __sk_buff *ctx)
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	bpf_spin_lock(&glock);
	bpf_list_push_front(&ghead, (void *)&f->node + ctx->protocol);
	bpf_spin_unlock(&glock);

	return 0;
}

SEC("?tc")
int incorrect_node_off1(void *ctx)
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	bpf_spin_lock(&glock);
	bpf_list_push_front(&ghead, (void *)&f->node + 1);
	bpf_spin_unlock(&glock);

	return 0;
}

SEC("?tc")
int incorrect_node_off2(void *ctx)
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	bpf_spin_lock(&glock);
	bpf_list_push_front(&ghead, &f->node2);
	bpf_spin_unlock(&glock);

	return 0;
}

SEC("?tc")
int no_head_type(void *ctx)
{
	void *p;

	p = bpf_obj_new(typeof(struct { int data; }));
	if (!p)
		return 0;
	bpf_spin_lock(&glock);
	bpf_list_push_front(p, NULL);
	bpf_spin_lock(&glock);

	return 0;
}

SEC("?tc")
int incorrect_head_var_off1(struct __sk_buff *ctx)
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	bpf_spin_lock(&glock);
	bpf_list_push_front((void *)&ghead + ctx->protocol, &f->node);
	bpf_spin_unlock(&glock);

	return 0;
}

SEC("?tc")
int incorrect_head_var_off2(struct __sk_buff *ctx)
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	bpf_spin_lock(&glock);
	bpf_list_push_front((void *)&f->head + ctx->protocol, &f->node);
	bpf_spin_unlock(&glock);

	return 0;
}

SEC("?tc")
int incorrect_head_off1(void *ctx)
{
	struct foo *f;
	struct bar *b;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	b = bpf_obj_new(typeof(*b));
	if (!b) {
		bpf_obj_drop(f);
		return 0;
	}

	bpf_spin_lock(&f->lock);
	bpf_list_push_front((void *)&f->head + 1, &b->node);
	bpf_spin_unlock(&f->lock);

	return 0;
}

SEC("?tc")
int incorrect_head_off2(void *ctx)
{
	struct foo *f;
	struct bar *b;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;

	bpf_spin_lock(&glock);
	bpf_list_push_front((void *)&ghead + 1, &f->node);
	bpf_spin_unlock(&glock);

	return 0;
}

static __always_inline
int pop_ptr_off(void *(*op)(void *head))
{
	struct {
		struct bpf_list_head head __contains(foo, node2);
		struct bpf_spin_lock lock;
	} *p;
	struct bpf_list_node *n;

	p = bpf_obj_new(typeof(*p));
	if (!p)
		return 0;
	bpf_spin_lock(&p->lock);
	n = op(&p->head);
	bpf_spin_unlock(&p->lock);

	bpf_this_cpu_ptr(n);
	return 0;
}

SEC("?tc")
int pop_front_off(void *ctx)
{
	return pop_ptr_off((void *)bpf_list_pop_front);
}

SEC("?tc")
int pop_back_off(void *ctx)
{
	return pop_ptr_off((void *)bpf_list_pop_back);
}

char _license[] SEC("license") = "GPL";
tools/testing/selftests/bpf/progs/lsm_cgroup.c (+8 lines)
 char _license[] SEC("license") = "GPL";

+extern bool CONFIG_SECURITY_SELINUX __kconfig __weak;
+extern bool CONFIG_SECURITY_SMACK __kconfig __weak;
+extern bool CONFIG_SECURITY_APPARMOR __kconfig __weak;
+
 #ifndef AF_PACKET
 #define AF_PACKET 17
 #endif
···
 int BPF_PROG(socket_alloc, struct sock *sk, int family, gfp_t priority)
 {
 	called_socket_alloc++;
+	/* if non-BPF LSMs are already installed, returning EPERM here would leak memory in those LSMs */
+	if (CONFIG_SECURITY_SELINUX || CONFIG_SECURITY_SMACK || CONFIG_SECURITY_APPARMOR)
+		return 1;
+
 	if (family == AF_UNIX)
 		return 0; /* EPERM */
tools/testing/selftests/bpf/progs/rcu_read_lock.c (new file, +290 lines)
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include "bpf_tracing_net.h"
#include "bpf_misc.h"

char _license[] SEC("license") = "GPL";

struct {
	__uint(type, BPF_MAP_TYPE_TASK_STORAGE);
	__uint(map_flags, BPF_F_NO_PREALLOC);
	__type(key, int);
	__type(value, long);
} map_a SEC(".maps");

__u32 user_data, key_serial, target_pid;
__u64 flags, task_storage_val, cgroup_id;

struct bpf_key *bpf_lookup_user_key(__u32 serial, __u64 flags) __ksym;
void bpf_key_put(struct bpf_key *key) __ksym;
void bpf_rcu_read_lock(void) __ksym;
void bpf_rcu_read_unlock(void) __ksym;
struct task_struct *bpf_task_acquire(struct task_struct *p) __ksym;
void bpf_task_release(struct task_struct *p) __ksym;

SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
int get_cgroup_id(void *ctx)
{
	struct task_struct *task;

	task = bpf_get_current_task_btf();
	if (task->pid != target_pid)
		return 0;

	/* simulate bpf_get_current_cgroup_id() helper */
	bpf_rcu_read_lock();
	cgroup_id = task->cgroups->dfl_cgrp->kn->id;
	bpf_rcu_read_unlock();
	return 0;
}

SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
int task_succ(void *ctx)
{
	struct task_struct *task, *real_parent;
	long init_val = 2;
	long *ptr;

	task = bpf_get_current_task_btf();
	if (task->pid != target_pid)
		return 0;

	bpf_rcu_read_lock();
	/* region including helper using rcu ptr real_parent */
	real_parent = task->real_parent;
	ptr = bpf_task_storage_get(&map_a, real_parent, &init_val,
				   BPF_LOCAL_STORAGE_GET_F_CREATE);
	if (!ptr)
		goto out;
	ptr = bpf_task_storage_get(&map_a, real_parent, 0, 0);
	if (!ptr)
		goto out;
	task_storage_val = *ptr;
out:
	bpf_rcu_read_unlock();
	return 0;
}

SEC("?fentry.s/" SYS_PREFIX "sys_nanosleep")
int no_lock(void *ctx)
{
	struct task_struct *task, *real_parent;

	/* no bpf_rcu_read_lock(), old code still works */
	task = bpf_get_current_task_btf();
	real_parent = task->real_parent;
	(void)bpf_task_storage_get(&map_a, real_parent, 0, 0);
	return 0;
}

SEC("?fentry.s/" SYS_PREFIX "sys_nanosleep")
int two_regions(void *ctx)
{
	struct task_struct *task, *real_parent;

	/* two regions */
	task = bpf_get_current_task_btf();
	bpf_rcu_read_lock();
	bpf_rcu_read_unlock();
	bpf_rcu_read_lock();
	real_parent = task->real_parent;
	(void)bpf_task_storage_get(&map_a, real_parent, 0, 0);
	bpf_rcu_read_unlock();
	return 0;
}

SEC("?fentry/" SYS_PREFIX "sys_getpgid")
int non_sleepable_1(void *ctx)
{
	struct task_struct *task, *real_parent;

	task = bpf_get_current_task_btf();
	bpf_rcu_read_lock();
	real_parent = task->real_parent;
	(void)bpf_task_storage_get(&map_a, real_parent, 0, 0);
	bpf_rcu_read_unlock();
	return 0;
}

SEC("?fentry/" SYS_PREFIX "sys_getpgid")
int non_sleepable_2(void *ctx)
{
	struct task_struct *task, *real_parent;

	bpf_rcu_read_lock();
	task = bpf_get_current_task_btf();
	bpf_rcu_read_unlock();

	bpf_rcu_read_lock();
	real_parent = task->real_parent;
	(void)bpf_task_storage_get(&map_a, real_parent, 0, 0);
	bpf_rcu_read_unlock();
	return 0;
}

SEC("?fentry.s/" SYS_PREFIX "sys_nanosleep")
int task_acquire(void *ctx)
{
	struct task_struct *task, *real_parent;

	task = bpf_get_current_task_btf();
	bpf_rcu_read_lock();
	real_parent = task->real_parent;
	/* acquire a reference which can be used outside rcu read lock region */
	real_parent = bpf_task_acquire(real_parent);
	bpf_rcu_read_unlock();
	(void)bpf_task_storage_get(&map_a, real_parent, 0, 0);
	bpf_task_release(real_parent);
	return 0;
}

SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
int miss_lock(void *ctx)
{
	struct task_struct *task;
	struct css_set *cgroups;
	struct cgroup *dfl_cgrp;

	/* missing bpf_rcu_read_lock() */
	task = bpf_get_current_task_btf();
	bpf_rcu_read_lock();
	(void)bpf_task_storage_get(&map_a, task, 0, 0);
	bpf_rcu_read_unlock();
	bpf_rcu_read_unlock();
	return 0;
}

SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
int miss_unlock(void *ctx)
{
	struct task_struct *task;
	struct css_set *cgroups;
	struct cgroup *dfl_cgrp;

	/* missing bpf_rcu_read_unlock() */
	task = bpf_get_current_task_btf();
	bpf_rcu_read_lock();
	(void)bpf_task_storage_get(&map_a, task, 0, 0);
	return 0;
}

SEC("?fentry/" SYS_PREFIX "sys_getpgid")
int non_sleepable_rcu_mismatch(void *ctx)
{
	struct task_struct *task, *real_parent;

	task = bpf_get_current_task_btf();
	/* non-sleepable: missing bpf_rcu_read_unlock() in one path */
	bpf_rcu_read_lock();
	real_parent = task->real_parent;
	(void)bpf_task_storage_get(&map_a, real_parent, 0, 0);
	if (real_parent)
		bpf_rcu_read_unlock();
	return 0;
}

SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
int inproper_sleepable_helper(void *ctx)
{
	struct task_struct *task, *real_parent;
	struct pt_regs *regs;
	__u32 value = 0;
	void *ptr;

	task = bpf_get_current_task_btf();
	/* sleepable helper in rcu read lock region */
	bpf_rcu_read_lock();
	real_parent = task->real_parent;
	regs = (struct pt_regs *)bpf_task_pt_regs(real_parent);
	if (!regs) {
		bpf_rcu_read_unlock();
		return 0;
	}

	ptr = (void *)PT_REGS_IP(regs);
	(void)bpf_copy_from_user_task(&value, sizeof(uint32_t), ptr, task, 0);
	user_data = value;
	(void)bpf_task_storage_get(&map_a, real_parent, 0, 0);
	bpf_rcu_read_unlock();
	return 0;
}

SEC("?lsm.s/bpf")
int BPF_PROG(inproper_sleepable_kfunc, int cmd, union bpf_attr *attr, unsigned int size)
{
	struct bpf_key *bkey;

	/* sleepable kfunc in rcu read lock region */
	bpf_rcu_read_lock();
	bkey = bpf_lookup_user_key(key_serial, flags);
	bpf_rcu_read_unlock();
	if (!bkey)
		return -1;
	bpf_key_put(bkey);

	return 0;
}

SEC("?fentry.s/" SYS_PREFIX "sys_nanosleep")
int nested_rcu_region(void *ctx)
{
	struct task_struct *task, *real_parent;

	/* nested rcu read lock regions */
	task = bpf_get_current_task_btf();
	bpf_rcu_read_lock();
	bpf_rcu_read_lock();
	real_parent = task->real_parent;
	(void)bpf_task_storage_get(&map_a, real_parent, 0, 0);
	bpf_rcu_read_unlock();
	bpf_rcu_read_unlock();
	return 0;
}

SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
int task_untrusted_non_rcuptr(void *ctx)
{
	struct task_struct *task, *last_wakee;

	task = bpf_get_current_task_btf();
	bpf_rcu_read_lock();
	/* the pointer last_wakee is marked as untrusted */
	last_wakee = task->real_parent->last_wakee;
	(void)bpf_task_storage_get(&map_a, last_wakee, 0, 0);
	bpf_rcu_read_unlock();
	return 0;
}

SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
int task_untrusted_rcuptr(void *ctx)
{
	struct task_struct *task, *real_parent;

	task = bpf_get_current_task_btf();
	bpf_rcu_read_lock();
	real_parent = task->real_parent;
	bpf_rcu_read_unlock();
	/* helper use of rcu ptr outside the rcu read lock region */
	(void)bpf_task_storage_get(&map_a, real_parent, 0, 0);
	return 0;
}

SEC("?fentry.s/" SYS_PREFIX "sys_nanosleep")
int cross_rcu_region(void *ctx)
{
	struct task_struct *task, *real_parent;

	/* rcu ptr define/use in different regions */
	task = bpf_get_current_task_btf();
	bpf_rcu_read_lock();
	real_parent = task->real_parent;
	bpf_rcu_read_unlock();
	bpf_rcu_read_lock();
	(void)bpf_task_storage_get(&map_a, real_parent, 0, 0);
	bpf_rcu_read_unlock();
	return 0;
}
tools/testing/selftests/bpf/progs/task_kfunc_common.h (new file, +72 lines)
/* SPDX-License-Identifier: GPL-2.0 */
/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */

#ifndef _TASK_KFUNC_COMMON_H
#define _TASK_KFUNC_COMMON_H

#include <errno.h>
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct __tasks_kfunc_map_value {
	struct task_struct __kptr_ref *task;
};

struct hash_map {
	__uint(type, BPF_MAP_TYPE_HASH);
	__type(key, int);
	__type(value, struct __tasks_kfunc_map_value);
	__uint(max_entries, 1);
} __tasks_kfunc_map SEC(".maps");

struct task_struct *bpf_task_acquire(struct task_struct *p) __ksym;
struct task_struct *bpf_task_kptr_get(struct task_struct **pp) __ksym;
void bpf_task_release(struct task_struct *p) __ksym;
struct task_struct *bpf_task_from_pid(s32 pid) __ksym;

static inline struct __tasks_kfunc_map_value *tasks_kfunc_map_value_lookup(struct task_struct *p)
{
	s32 pid;
	long status;

	status = bpf_probe_read_kernel(&pid, sizeof(pid), &p->pid);
	if (status)
		return NULL;

	return bpf_map_lookup_elem(&__tasks_kfunc_map, &pid);
}

static inline int tasks_kfunc_map_insert(struct task_struct *p)
{
	struct __tasks_kfunc_map_value local, *v;
	long status;
	struct task_struct *acquired, *old;
	s32 pid;

	status = bpf_probe_read_kernel(&pid, sizeof(pid), &p->pid);
	if (status)
		return status;

	local.task = NULL;
	status = bpf_map_update_elem(&__tasks_kfunc_map, &pid, &local, BPF_NOEXIST);
	if (status)
		return status;

	v = bpf_map_lookup_elem(&__tasks_kfunc_map, &pid);
	if (!v) {
		bpf_map_delete_elem(&__tasks_kfunc_map, &pid);
		return -ENOENT;
	}

	acquired = bpf_task_acquire(p);
	old = bpf_kptr_xchg(&v->task, acquired);
	if (old) {
		bpf_task_release(old);
		return -EEXIST;
	}

	return 0;
}

#endif /* _TASK_KFUNC_COMMON_H */
tools/testing/selftests/bpf/progs/task_kfunc_failure.c (new file, +273 lines)
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */

#include <vmlinux.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_helpers.h>

#include "task_kfunc_common.h"

char _license[] SEC("license") = "GPL";

/* Prototype for all of the program trace events below:
 *
 * TRACE_EVENT(task_newtask,
 *	TP_PROTO(struct task_struct *p, u64 clone_flags)
 */

static struct __tasks_kfunc_map_value *insert_lookup_task(struct task_struct *task)
{
	int status;

	status = tasks_kfunc_map_insert(task);
	if (status)
		return NULL;

	return tasks_kfunc_map_value_lookup(task);
}

SEC("tp_btf/task_newtask")
int BPF_PROG(task_kfunc_acquire_untrusted, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *acquired;
	struct __tasks_kfunc_map_value *v;

	v = insert_lookup_task(task);
	if (!v)
		return 0;

	/* Can't invoke bpf_task_acquire() on an untrusted pointer. */
	acquired = bpf_task_acquire(v->task);
	bpf_task_release(acquired);

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(task_kfunc_acquire_fp, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *acquired, *stack_task = (struct task_struct *)&clone_flags;

	/* Can't invoke bpf_task_acquire() on a random frame pointer. */
	acquired = bpf_task_acquire((struct task_struct *)&stack_task);
	bpf_task_release(acquired);

	return 0;
}

SEC("kretprobe/free_task")
int BPF_PROG(task_kfunc_acquire_unsafe_kretprobe, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *acquired;

	acquired = bpf_task_acquire(task);
	/* Can't release a bpf_task_acquire()'d task without a NULL check. */
	bpf_task_release(acquired);

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(task_kfunc_acquire_trusted_walked, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *acquired;

	/* Can't invoke bpf_task_acquire() on a trusted pointer obtained from walking a struct. */
	acquired = bpf_task_acquire(task->last_wakee);
	bpf_task_release(acquired);

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(task_kfunc_acquire_null, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *acquired;

	/* Can't invoke bpf_task_acquire() on a NULL pointer. */
	acquired = bpf_task_acquire(NULL);
	if (!acquired)
		return 0;
	bpf_task_release(acquired);

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(task_kfunc_acquire_unreleased, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *acquired;

	acquired = bpf_task_acquire(task);

	/* Acquired task is never released. */

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(task_kfunc_get_non_kptr_param, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *kptr;

	/* Cannot use bpf_task_kptr_get() on a non-kptr, even on a valid task. */
	kptr = bpf_task_kptr_get(&task);
	if (!kptr)
		return 0;

	bpf_task_release(kptr);

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(task_kfunc_get_non_kptr_acquired, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *kptr, *acquired;

	acquired = bpf_task_acquire(task);

	/* Cannot use bpf_task_kptr_get() on a non-kptr, even if it was acquired. */
	kptr = bpf_task_kptr_get(&acquired);
	bpf_task_release(acquired);
	if (!kptr)
		return 0;

	bpf_task_release(kptr);

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(task_kfunc_get_null, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *kptr;

	/* Cannot use bpf_task_kptr_get() on a NULL pointer. */
	kptr = bpf_task_kptr_get(NULL);
	if (!kptr)
		return 0;

	bpf_task_release(kptr);

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(task_kfunc_xchg_unreleased, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *kptr;
	struct __tasks_kfunc_map_value *v;

	v = insert_lookup_task(task);
	if (!v)
		return 0;

	kptr = bpf_kptr_xchg(&v->task, NULL);
	if (!kptr)
		return 0;

	/* Kptr retrieved from map is never released. */

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(task_kfunc_get_unreleased, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *kptr;
	struct __tasks_kfunc_map_value *v;

	v = insert_lookup_task(task);
	if (!v)
		return 0;

	kptr = bpf_task_kptr_get(&v->task);
	if (!kptr)
		return 0;

	/* Kptr acquired above is never released. */

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(task_kfunc_release_untrusted, struct task_struct *task, u64 clone_flags)
{
	struct __tasks_kfunc_map_value *v;

	v = insert_lookup_task(task);
	if (!v)
		return 0;

	/* Can't invoke bpf_task_release() on an untrusted pointer. */
	bpf_task_release(v->task);

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(task_kfunc_release_fp, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *acquired = (struct task_struct *)&clone_flags;

	/* Cannot release random frame pointer. */
	bpf_task_release(acquired);

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(task_kfunc_release_null, struct task_struct *task, u64 clone_flags)
{
	struct __tasks_kfunc_map_value local, *v;
	long status;
	struct task_struct *acquired, *old;
	s32 pid;

	status = bpf_probe_read_kernel(&pid, sizeof(pid), &task->pid);
	if (status)
		return 0;

	local.task = NULL;
	status = bpf_map_update_elem(&__tasks_kfunc_map, &pid, &local, BPF_NOEXIST);
	if (status)
		return status;

	v = bpf_map_lookup_elem(&__tasks_kfunc_map, &pid);
	if (!v)
		return -ENOENT;

	acquired = bpf_task_acquire(task);

	old = bpf_kptr_xchg(&v->task, acquired);

	/* old cannot be passed to bpf_task_release() without a NULL check. */
	bpf_task_release(old);
	bpf_task_release(old);

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(task_kfunc_release_unacquired, struct task_struct *task, u64 clone_flags)
{
	/* Cannot release trusted task pointer which was not acquired. */
	bpf_task_release(task);

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(task_kfunc_from_pid_no_null_check, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *acquired;

	acquired = bpf_task_from_pid(task->pid);

	/* Releasing bpf_task_from_pid() lookup without a NULL check. */
	bpf_task_release(acquired);

	return 0;
}
+222
tools/testing/selftests/bpf/progs/task_kfunc_success.c
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */

#include <vmlinux.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_helpers.h>

#include "task_kfunc_common.h"

char _license[] SEC("license") = "GPL";

int err, pid;

/* Prototype for all of the program trace events below:
 *
 * TRACE_EVENT(task_newtask,
 *	TP_PROTO(struct task_struct *p, u64 clone_flags)
 */

static bool is_test_kfunc_task(void)
{
	int cur_pid = bpf_get_current_pid_tgid() >> 32;

	return pid == cur_pid;
}

static int test_acquire_release(struct task_struct *task)
{
	struct task_struct *acquired;

	acquired = bpf_task_acquire(task);
	bpf_task_release(acquired);

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(test_task_acquire_release_argument, struct task_struct *task, u64 clone_flags)
{
	if (!is_test_kfunc_task())
		return 0;

	return test_acquire_release(task);
}

SEC("tp_btf/task_newtask")
int BPF_PROG(test_task_acquire_release_current, struct task_struct *task, u64 clone_flags)
{
	if (!is_test_kfunc_task())
		return 0;

	return test_acquire_release(bpf_get_current_task_btf());
}

SEC("tp_btf/task_newtask")
int BPF_PROG(test_task_acquire_leave_in_map, struct task_struct *task, u64 clone_flags)
{
	long status;

	if (!is_test_kfunc_task())
		return 0;

	status = tasks_kfunc_map_insert(task);
	if (status)
		err = 1;

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(test_task_xchg_release, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *kptr;
	struct __tasks_kfunc_map_value *v;
	long status;

	if (!is_test_kfunc_task())
		return 0;

	status = tasks_kfunc_map_insert(task);
	if (status) {
		err = 1;
		return 0;
	}

	v = tasks_kfunc_map_value_lookup(task);
	if (!v) {
		err = 2;
		return 0;
	}

	kptr = bpf_kptr_xchg(&v->task, NULL);
	if (!kptr) {
		err = 3;
		return 0;
	}

	bpf_task_release(kptr);

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(test_task_get_release, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *kptr;
	struct __tasks_kfunc_map_value *v;
	long status;

	if (!is_test_kfunc_task())
		return 0;

	status = tasks_kfunc_map_insert(task);
	if (status) {
		err = 1;
		return 0;
	}

	v = tasks_kfunc_map_value_lookup(task);
	if (!v) {
		err = 2;
		return 0;
	}

	kptr = bpf_task_kptr_get(&v->task);
	if (!kptr) {
		err = 3;
		return 0;
	}

	bpf_task_release(kptr);

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(test_task_current_acquire_release, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *current, *acquired;

	if (!is_test_kfunc_task())
		return 0;

	current = bpf_get_current_task_btf();
	acquired = bpf_task_acquire(current);
	bpf_task_release(acquired);

	return 0;
}

static void lookup_compare_pid(const struct task_struct *p)
{
	struct task_struct *acquired;

	acquired = bpf_task_from_pid(p->pid);
	if (!acquired) {
		err = 1;
		return;
	}

	if (acquired->pid != p->pid)
		err = 2;
	bpf_task_release(acquired);
}

SEC("tp_btf/task_newtask")
int BPF_PROG(test_task_from_pid_arg, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *acquired;

	if (!is_test_kfunc_task())
		return 0;

	lookup_compare_pid(task);
	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(test_task_from_pid_current, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *current, *acquired;

	if (!is_test_kfunc_task())
		return 0;

	lookup_compare_pid(bpf_get_current_task_btf());
	return 0;
}

static int is_pid_lookup_valid(s32 pid)
{
	struct task_struct *acquired;

	acquired = bpf_task_from_pid(pid);
	if (acquired) {
		bpf_task_release(acquired);
		return 1;
	}

	return 0;
}

SEC("tp_btf/task_newtask")
int BPF_PROG(test_task_from_pid_invalid, struct task_struct *task, u64 clone_flags)
{
	struct task_struct *acquired;

	if (!is_test_kfunc_task())
		return 0;

	if (is_pid_lookup_valid(-1)) {
		err = 1;
		return 0;
	}

	if (is_pid_lookup_valid(0xcafef00d)) {
		err = 2;
		return 0;
	}

	return 0;
}
+2 -2
tools/testing/selftests/bpf/progs/test_spin_lock.c
···

 #define CREDIT_PER_NS(delta, rate) (((delta) * rate) >> 20)

-SEC("tc")
-int bpf_sping_lock_test(struct __sk_buff *skb)
+SEC("cgroup_skb/ingress")
+int bpf_spin_lock_test(struct __sk_buff *skb)
 {
 	volatile int credit = 0, max_credit = 100, pkt_len = 64;
 	struct hmap_elem zero = {}, *val;
+204
tools/testing/selftests/bpf/progs/test_spin_lock_fail.c
// SPDX-License-Identifier: GPL-2.0
#include <vmlinux.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_helpers.h>
#include "bpf_experimental.h"

struct foo {
	struct bpf_spin_lock lock;
	int data;
};

struct array_map {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__type(key, int);
	__type(value, struct foo);
	__uint(max_entries, 1);
} array_map SEC(".maps");

struct {
	__uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
	__uint(max_entries, 1);
	__type(key, int);
	__type(value, int);
	__array(values, struct array_map);
} map_of_maps SEC(".maps") = {
	.values = {
		[0] = &array_map,
	},
};

SEC(".data.A") struct bpf_spin_lock lockA;
SEC(".data.B") struct bpf_spin_lock lockB;

SEC("?tc")
int lock_id_kptr_preserve(void *ctx)
{
	struct foo *f;

	f = bpf_obj_new(typeof(*f));
	if (!f)
		return 0;
	bpf_this_cpu_ptr(f);
	return 0;
}

SEC("?tc")
int lock_id_global_zero(void *ctx)
{
	bpf_this_cpu_ptr(&lockA);
	return 0;
}

SEC("?tc")
int lock_id_mapval_preserve(void *ctx)
{
	struct foo *f;
	int key = 0;

	f = bpf_map_lookup_elem(&array_map, &key);
	if (!f)
		return 0;
	bpf_this_cpu_ptr(f);
	return 0;
}

SEC("?tc")
int lock_id_innermapval_preserve(void *ctx)
{
	struct foo *f;
	int key = 0;
	void *map;

	map = bpf_map_lookup_elem(&map_of_maps, &key);
	if (!map)
		return 0;
	f = bpf_map_lookup_elem(map, &key);
	if (!f)
		return 0;
	bpf_this_cpu_ptr(f);
	return 0;
}

#define CHECK(test, A, B)					\
	SEC("?tc")						\
	int lock_id_mismatch_##test(void *ctx)			\
	{							\
		struct foo *f1, *f2, *v, *iv;			\
		int key = 0;					\
		void *map;					\
								\
		map = bpf_map_lookup_elem(&map_of_maps, &key);	\
		if (!map)					\
			return 0;				\
		iv = bpf_map_lookup_elem(map, &key);		\
		if (!iv)					\
			return 0;				\
		v = bpf_map_lookup_elem(&array_map, &key);	\
		if (!v)						\
			return 0;				\
		f1 = bpf_obj_new(typeof(*f1));			\
		if (!f1)					\
			return 0;				\
		f2 = bpf_obj_new(typeof(*f2));			\
		if (!f2) {					\
			bpf_obj_drop(f1);			\
			return 0;				\
		}						\
		bpf_spin_lock(A);				\
		bpf_spin_unlock(B);				\
		return 0;					\
	}

CHECK(kptr_kptr, &f1->lock, &f2->lock);
CHECK(kptr_global, &f1->lock, &lockA);
CHECK(kptr_mapval, &f1->lock, &v->lock);
CHECK(kptr_innermapval, &f1->lock, &iv->lock);

CHECK(global_global, &lockA, &lockB);
CHECK(global_kptr, &lockA, &f1->lock);
CHECK(global_mapval, &lockA, &v->lock);
CHECK(global_innermapval, &lockA, &iv->lock);

SEC("?tc")
int lock_id_mismatch_mapval_mapval(void *ctx)
{
	struct foo *f1, *f2;
	int key = 0;

	f1 = bpf_map_lookup_elem(&array_map, &key);
	if (!f1)
		return 0;
	f2 = bpf_map_lookup_elem(&array_map, &key);
	if (!f2)
		return 0;

	bpf_spin_lock(&f1->lock);
	f1->data = 42;
	bpf_spin_unlock(&f2->lock);

	return 0;
}

CHECK(mapval_kptr, &v->lock, &f1->lock);
CHECK(mapval_global, &v->lock, &lockB);
CHECK(mapval_innermapval, &v->lock, &iv->lock);

SEC("?tc")
int lock_id_mismatch_innermapval_innermapval1(void *ctx)
{
	struct foo *f1, *f2;
	int key = 0;
	void *map;

	map = bpf_map_lookup_elem(&map_of_maps, &key);
	if (!map)
		return 0;
	f1 = bpf_map_lookup_elem(map, &key);
	if (!f1)
		return 0;
	f2 = bpf_map_lookup_elem(map, &key);
	if (!f2)
		return 0;

	bpf_spin_lock(&f1->lock);
	f1->data = 42;
	bpf_spin_unlock(&f2->lock);

	return 0;
}

SEC("?tc")
int lock_id_mismatch_innermapval_innermapval2(void *ctx)
{
	struct foo *f1, *f2;
	int key = 0;
	void *map;

	map = bpf_map_lookup_elem(&map_of_maps, &key);
	if (!map)
		return 0;
	f1 = bpf_map_lookup_elem(map, &key);
	if (!f1)
		return 0;
	map = bpf_map_lookup_elem(&map_of_maps, &key);
	if (!map)
		return 0;
	f2 = bpf_map_lookup_elem(map, &key);
	if (!f2)
		return 0;

	bpf_spin_lock(&f1->lock);
	f1->data = 42;
	bpf_spin_unlock(&f2->lock);

	return 0;
}

CHECK(innermapval_kptr, &iv->lock, &f1->lock);
CHECK(innermapval_global, &iv->lock, &lockA);
CHECK(innermapval_mapval, &iv->lock, &v->lock);

#undef CHECK

char _license[] SEC("license") = "GPL";
+83
tools/testing/selftests/bpf/progs/type_cast.c
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

struct {
	__uint(type, BPF_MAP_TYPE_TASK_STORAGE);
	__uint(map_flags, BPF_F_NO_PREALLOC);
	__type(key, int);
	__type(value, long);
} enter_id SEC(".maps");

#define IFNAMSIZ 16

int ifindex, ingress_ifindex;
char name[IFNAMSIZ];
unsigned int inum;
unsigned int meta_len, frag0_len, kskb_len, kskb2_len;

void *bpf_cast_to_kern_ctx(void *) __ksym;
void *bpf_rdonly_cast(void *, __u32) __ksym;

SEC("?xdp")
int md_xdp(struct xdp_md *ctx)
{
	struct xdp_buff *kctx = bpf_cast_to_kern_ctx(ctx);
	struct net_device *dev;

	dev = kctx->rxq->dev;
	ifindex = dev->ifindex;
	inum = dev->nd_net.net->ns.inum;
	__builtin_memcpy(name, dev->name, IFNAMSIZ);
	ingress_ifindex = ctx->ingress_ifindex;
	return XDP_PASS;
}

SEC("?tc")
int md_skb(struct __sk_buff *skb)
{
	struct sk_buff *kskb = bpf_cast_to_kern_ctx(skb);
	struct skb_shared_info *shared_info;
	struct sk_buff *kskb2;

	kskb_len = kskb->len;

	/* Simulate the following kernel macro:
	 * #define skb_shinfo(SKB) ((struct skb_shared_info *)(skb_end_pointer(SKB)))
	 */
	shared_info = bpf_rdonly_cast(kskb->head + kskb->end,
				      bpf_core_type_id_kernel(struct skb_shared_info));
	meta_len = shared_info->meta_len;
	frag0_len = shared_info->frag_list->len;

	/* kskb2 should be equal to kskb */
	kskb2 = bpf_rdonly_cast(kskb, bpf_core_type_id_kernel(struct sk_buff));
	kskb2_len = kskb2->len;
	return 0;
}

SEC("?tp_btf/sys_enter")
int BPF_PROG(untrusted_ptr, struct pt_regs *regs, long id)
{
	struct task_struct *task, *task_dup;
	long *ptr;

	task = bpf_get_current_task_btf();
	task_dup = bpf_rdonly_cast(task, bpf_core_type_id_kernel(struct task_struct));
	(void)bpf_task_storage_get(&enter_id, task_dup, 0, 0);
	return 0;
}

SEC("?tracepoint/syscalls/sys_enter_nanosleep")
int kctx_u64(void *ctx)
{
	u64 *kctx = bpf_rdonly_cast(ctx, bpf_core_type_id_kernel(u64));

	(void)kctx;
	return 0;
}

char _license[] SEC("license") = "GPL";
+3 -3
tools/testing/selftests/bpf/test_bpftool_synctypes.py
···
 commands), which looks to the lists of options in other source files
 but has different start and end markers:

-"OPTIONS := { {-j|--json} [{-p|--pretty}] | {-d|--debug} | {-l|--legacy}"
+"OPTIONS := { {-j|--json} [{-p|--pretty}] | {-d|--debug}"

 Return a set containing all options, such as:

-{'-p', '-d', '--legacy', '--pretty', '--debug', '--json', '-l', '-j'}
+{'-p', '-d', '--pretty', '--debug', '--json', '-j'}

 start_marker = re.compile(f'"OPTIONS :=')
 pattern = re.compile('([\w-]+) ?(?:\||}[ }\]"])')
···

 Return a set containing all options, such as:

-{'-p', '-d', '--legacy', '--pretty', '--debug', '--json', '-l', '-j'}
+{'-p', '-d', '--pretty', '--debug', '--json', '-j'}

 start_marker = re.compile('\|COMMON_OPTIONS\| replace:: {')
 pattern = re.compile('\*\*([\w/-]+)\*\*')
+1 -1
tools/testing/selftests/bpf/verifier/calls.c
···
 },
 .prog_type = BPF_PROG_TYPE_SCHED_CLS,
 .result = REJECT,
-.errstr = "arg#0 pointer type STRUCT prog_test_ref_kfunc must point",
+.errstr = "arg#0 is ptr_or_null_ expected ptr_ or socket",
 .fixup_kfunc_btf_id = {
 	{ "bpf_kfunc_call_test_acquire", 3 },
 	{ "bpf_kfunc_call_test_release", 5 },
+174
tools/testing/selftests/bpf/verifier/jeq_infer_not_null.c
{
	/* This is equivalent to the following program:
	 *
	 *   r6 = skb->sk;
	 *   r7 = sk_fullsock(r6);
	 *   r0 = sk_fullsock(r6);
	 *   if (r0 == 0) return 0;    (a)
	 *   if (r0 != r7) return 0;   (b)
	 *   *r7->type;                (c)
	 *   return 0;
	 *
	 * It is safe to dereference r7 at point (c), because of (a) and (b).
	 * The test verifies that relation r0 == r7 is propagated from (b) to (c).
	 */
	"jne/jeq infer not null, PTR_TO_SOCKET_OR_NULL -> PTR_TO_SOCKET for JNE false branch",
	.insns = {
	/* r6 = skb->sk; */
	BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1, offsetof(struct __sk_buff, sk)),
	/* if (r6 == 0) return 0; */
	BPF_JMP_IMM(BPF_JEQ, BPF_REG_6, 0, 8),
	/* r7 = sk_fullsock(skb); */
	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
	BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
	BPF_MOV64_REG(BPF_REG_7, BPF_REG_0),
	/* r0 = sk_fullsock(skb); */
	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
	BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
	/* if (r0 == null) return 0; */
	BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
	/* if (r0 == r7) r0 = *(r7->type); */
	BPF_JMP_REG(BPF_JNE, BPF_REG_0, BPF_REG_7, 1), /* Use ! JNE ! */
	BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, offsetof(struct bpf_sock, type)),
	/* return 0 */
	BPF_MOV64_IMM(BPF_REG_0, 0),
	BPF_EXIT_INSN(),
	},
	.prog_type = BPF_PROG_TYPE_CGROUP_SKB,
	.result = ACCEPT,
	.result_unpriv = REJECT,
	.errstr_unpriv = "R7 pointer comparison",
},
{
	/* Same as above, but verify that the other branch of JNE still
	 * prohibits access to PTR_MAYBE_NULL.
	 */
	"jne/jeq infer not null, PTR_TO_SOCKET_OR_NULL unchanged for JNE true branch",
	.insns = {
	/* r6 = skb->sk */
	BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1, offsetof(struct __sk_buff, sk)),
	/* if (r6 == 0) return 0; */
	BPF_JMP_IMM(BPF_JEQ, BPF_REG_6, 0, 9),
	/* r7 = sk_fullsock(skb); */
	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
	BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
	BPF_MOV64_REG(BPF_REG_7, BPF_REG_0),
	/* r0 = sk_fullsock(skb); */
	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
	BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
	/* if (r0 == null) return 0; */
	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 3),
	/* if (r0 == r7) return 0; */
	BPF_JMP_REG(BPF_JNE, BPF_REG_0, BPF_REG_7, 1), /* Use ! JNE ! */
	BPF_JMP_IMM(BPF_JA, 0, 0, 1),
	/* r0 = *(r7->type); */
	BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, offsetof(struct bpf_sock, type)),
	/* return 0 */
	BPF_MOV64_IMM(BPF_REG_0, 0),
	BPF_EXIT_INSN(),
	},
	.prog_type = BPF_PROG_TYPE_CGROUP_SKB,
	.result = REJECT,
	.errstr = "R7 invalid mem access 'sock_or_null'",
	.result_unpriv = REJECT,
	.errstr_unpriv = "R7 pointer comparison",
},
{
	/* Same as the first test, but not-null should be inferred for the JEQ branch */
	"jne/jeq infer not null, PTR_TO_SOCKET_OR_NULL -> PTR_TO_SOCKET for JEQ true branch",
	.insns = {
	/* r6 = skb->sk; */
	BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1, offsetof(struct __sk_buff, sk)),
	/* if (r6 == null) return 0; */
	BPF_JMP_IMM(BPF_JEQ, BPF_REG_6, 0, 9),
	/* r7 = sk_fullsock(skb); */
	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
	BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
	BPF_MOV64_REG(BPF_REG_7, BPF_REG_0),
	/* r0 = sk_fullsock(skb); */
	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
	BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
	/* if (r0 == null) return 0; */
	BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 3),
	/* if (r0 != r7) return 0; */
	BPF_JMP_REG(BPF_JEQ, BPF_REG_0, BPF_REG_7, 1), /* Use ! JEQ ! */
	BPF_JMP_IMM(BPF_JA, 0, 0, 1),
	/* r0 = *(r7->type); */
	BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, offsetof(struct bpf_sock, type)),
	/* return 0; */
	BPF_MOV64_IMM(BPF_REG_0, 0),
	BPF_EXIT_INSN(),
	},
	.prog_type = BPF_PROG_TYPE_CGROUP_SKB,
	.result = ACCEPT,
	.result_unpriv = REJECT,
	.errstr_unpriv = "R7 pointer comparison",
},
{
	/* Same as above, but verify that the other branch of JEQ still
	 * prohibits access to PTR_MAYBE_NULL.
	 */
	"jne/jeq infer not null, PTR_TO_SOCKET_OR_NULL unchanged for JEQ false branch",
	.insns = {
	/* r6 = skb->sk; */
	BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1, offsetof(struct __sk_buff, sk)),
	/* if (r6 == null) return 0; */
	BPF_JMP_IMM(BPF_JEQ, BPF_REG_6, 0, 8),
	/* r7 = sk_fullsock(skb); */
	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
	BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
	BPF_MOV64_REG(BPF_REG_7, BPF_REG_0),
	/* r0 = sk_fullsock(skb); */
	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
	BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
	/* if (r0 == null) return 0; */
	BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
	/* if (r0 != r7) r0 = *(r7->type); */
	BPF_JMP_REG(BPF_JEQ, BPF_REG_0, BPF_REG_7, 1), /* Use ! JEQ ! */
	BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, offsetof(struct bpf_sock, type)),
	/* return 0; */
	BPF_MOV64_IMM(BPF_REG_0, 0),
	BPF_EXIT_INSN(),
	},
	.prog_type = BPF_PROG_TYPE_CGROUP_SKB,
	.result = REJECT,
	.errstr = "R7 invalid mem access 'sock_or_null'",
	.result_unpriv = REJECT,
	.errstr_unpriv = "R7 pointer comparison",
},
{
	/* Maps are treated in a different branch of `mark_ptr_not_null_reg`,
	 * so separate test for maps case.
	 */
	"jne/jeq infer not null, PTR_TO_MAP_VALUE_OR_NULL -> PTR_TO_MAP_VALUE",
	.insns = {
	/* r9 = &some stack to use as key */
	BPF_ST_MEM(BPF_W, BPF_REG_10, -8, 0),
	BPF_MOV64_REG(BPF_REG_9, BPF_REG_10),
	BPF_ALU64_IMM(BPF_ADD, BPF_REG_9, -8),
	/* r8 = process local map */
	BPF_LD_MAP_FD(BPF_REG_8, 0),
	/* r6 = map_lookup_elem(r8, r9); */
	BPF_MOV64_REG(BPF_REG_1, BPF_REG_8),
	BPF_MOV64_REG(BPF_REG_2, BPF_REG_9),
	BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
	BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
	/* r7 = map_lookup_elem(r8, r9); */
	BPF_MOV64_REG(BPF_REG_1, BPF_REG_8),
	BPF_MOV64_REG(BPF_REG_2, BPF_REG_9),
	BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
	BPF_MOV64_REG(BPF_REG_7, BPF_REG_0),
	/* if (r6 == 0) return 0; */
	BPF_JMP_IMM(BPF_JEQ, BPF_REG_6, 0, 2),
	/* if (r6 != r7) return 0; */
	BPF_JMP_REG(BPF_JNE, BPF_REG_6, BPF_REG_7, 1),
	/* read *r7; */
	BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, offsetof(struct bpf_xdp_sock, queue_id)),
	/* return 0; */
	BPF_MOV64_IMM(BPF_REG_0, 0),
	BPF_EXIT_INSN(),
	},
	.fixup_map_xskmap = { 3 },
	.prog_type = BPF_PROG_TYPE_XDP,
	.result = ACCEPT,
},
+2 -2
tools/testing/selftests/bpf/verifier/ref_tracking.c
···
 .kfunc = "bpf",
 .expected_attach_type = BPF_LSM_MAC,
 .flags = BPF_F_SLEEPABLE,
-.errstr = "arg#0 pointer type STRUCT bpf_key must point to scalar, or struct with scalar",
+.errstr = "arg#0 is ptr_or_null_ expected ptr_ or socket",
 .fixup_kfunc_btf_id = {
 	{ "bpf_lookup_user_key", 2 },
 	{ "bpf_key_put", 4 },
···
 .kfunc = "bpf",
 .expected_attach_type = BPF_LSM_MAC,
 .flags = BPF_F_SLEEPABLE,
-.errstr = "arg#0 pointer type STRUCT bpf_key must point to scalar, or struct with scalar",
+.errstr = "arg#0 is ptr_or_null_ expected ptr_ or socket",
 .fixup_kfunc_btf_id = {
 	{ "bpf_lookup_system_key", 1 },
 	{ "bpf_key_put", 3 },
+1 -1
tools/testing/selftests/bpf/verifier/ringbuf.c
···
 },
 .fixup_map_ringbuf = { 1 },
 .result = REJECT,
-.errstr = "dereference of modified alloc_mem ptr R1",
+.errstr = "dereference of modified ringbuf_mem ptr R1",
 },
 {
 "ringbuf: invalid reservation offset 2",
+1 -1
tools/testing/selftests/bpf/verifier/spill_fill.c
···
 },
 .fixup_map_ringbuf = { 1 },
 .result = REJECT,
-.errstr = "R0 pointer arithmetic on alloc_mem_or_null prohibited",
+.errstr = "R0 pointer arithmetic on ringbuf_mem_or_null prohibited",
 },
 {
 "check corrupted spill/fill",