.. SPDX-License-Identifier: GPL-2.0+
.. Copyright (C) 2020 Google LLC.

================
LSM BPF Programs
================

These BPF programs allow runtime instrumentation of the LSM hooks by
privileged users to implement system-wide MAC (Mandatory Access Control) and
Audit policies using eBPF.

Structure
---------

The example shows an eBPF program that can be attached to the ``file_mprotect``
LSM hook:

.. c:function:: int file_mprotect(struct vm_area_struct *vma, unsigned long reqprot, unsigned long prot);

Other LSM hooks which can be instrumented can be found in
``include/linux/lsm_hooks.h``.

eBPF programs that use :doc:`/bpf/btf` do not need to include kernel headers
for accessing information from the attached eBPF program's context. They can
simply declare the structures in the eBPF program and only specify the fields
that need to be accessed.

.. code-block:: c

        struct mm_struct {
                unsigned long start_brk, brk, start_stack;
        } __attribute__((preserve_access_index));

        struct vm_area_struct {
                unsigned long vm_start, vm_end;
                struct mm_struct *vm_mm;
        } __attribute__((preserve_access_index));


.. note:: The order of the fields is irrelevant.

This can be further simplified (if one has access to the BTF information at
build time) by generating the ``vmlinux.h`` with:

.. code-block:: console

        # bpftool btf dump file <path-to-btf-vmlinux> format c > vmlinux.h
.. note:: ``path-to-btf-vmlinux`` can be ``/sys/kernel/btf/vmlinux`` if the
          build environment matches the environment the BPF programs are
          deployed in.

The ``vmlinux.h`` can then simply be included in the BPF programs without
requiring the definition of the types.

The eBPF programs can be declared using the ``BPF_PROG``
macros defined in `tools/lib/bpf/bpf_tracing.h`_. In this
example:

 * ``"lsm/file_mprotect"`` indicates the LSM hook that the program must
   be attached to
 * ``mprotect_audit`` is the name of the eBPF program

.. code-block:: c

        SEC("lsm/file_mprotect")
        int BPF_PROG(mprotect_audit, struct vm_area_struct *vma,
                     unsigned long reqprot, unsigned long prot, int ret)
        {
                /* ret is the return value from the previous BPF program
                 * or 0 if it's the first hook.
                 */
                if (ret != 0)
                        return ret;

                int is_heap;

                is_heap = (vma->vm_start >= vma->vm_mm->start_brk &&
                           vma->vm_end <= vma->vm_mm->brk);

                /* Return an -EPERM or write information to the perf events buffer
                 * for auditing
                 */
                if (is_heap)
                        return -EPERM;
                return 0;
        }

The ``__attribute__((preserve_access_index))`` is a clang feature that allows
the BPF verifier to update the offsets for the access at runtime using the
:doc:`/bpf/btf` information. Since the BPF verifier is aware of the types, it
also validates all the accesses made to the various types in the eBPF program.

Loading
-------

eBPF programs can be loaded with the :manpage:`bpf(2)` syscall's
``BPF_PROG_LOAD`` operation:

.. code-block:: c

        struct bpf_object *obj;

        obj = bpf_object__open("./my_prog.o");
        bpf_object__load(obj);
This can be simplified by using a skeleton header generated by ``bpftool``:

.. code-block:: console

        # bpftool gen skeleton my_prog.o > my_prog.skel.h

and the program can be loaded by including ``my_prog.skel.h`` and using
the generated helper, ``my_prog__open_and_load``.

Attachment to LSM Hooks
-----------------------

The LSM allows attachment of eBPF programs as LSM hooks using the
:manpage:`bpf(2)` syscall's ``BPF_RAW_TRACEPOINT_OPEN`` operation or, more
simply, the libbpf helper ``bpf_program__attach_lsm``.

The program can be detached from the LSM hook by *destroying* the ``link``
returned by ``bpf_program__attach_lsm`` using ``bpf_link__destroy``.

One can also use the helpers generated in ``my_prog.skel.h``, i.e.
``my_prog__attach`` for attachment and ``my_prog__destroy`` for cleaning up.

Examples
--------

An example eBPF program can be found in
`tools/testing/selftests/bpf/progs/lsm.c`_ and the corresponding
userspace code in `tools/testing/selftests/bpf/prog_tests/test_lsm.c`_.

.. Links
.. _tools/lib/bpf/bpf_tracing.h:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/tools/lib/bpf/bpf_tracing.h
.. _tools/testing/selftests/bpf/progs/lsm.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/tools/testing/selftests/bpf/progs/lsm.c
.. _tools/testing/selftests/bpf/prog_tests/test_lsm.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/tools/testing/selftests/bpf/prog_tests/test_lsm.c
Documentation/bpf/drgn.rst
.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

==============
BPF drgn tools
==============

drgn scripts are a convenient and easy to use mechanism to retrieve arbitrary
kernel data structures. drgn does not rely on the kernel UAPI to read the
data. Instead it reads directly from ``/proc/kcore`` or vmcore and pretty
prints the data based on DWARF debug information from vmlinux.

This document describes BPF related drgn tools.

See `drgn/tools`_ for all tools available at the moment and `drgn/doc`_ for
more details on drgn itself.

bpf_inspect.py
--------------

Description
===========

`bpf_inspect.py`_ is a tool intended to inspect BPF programs and maps. It can
iterate over all programs and maps in the system and print basic information
about these objects, including id, type and name.

The main use-case `bpf_inspect.py`_ covers is to show BPF programs of types
``BPF_PROG_TYPE_EXT`` and ``BPF_PROG_TYPE_TRACING`` attached to other BPF
programs via ``freplace``/``fentry``/``fexit`` mechanisms, since there is no
user-space API to get this information.

Getting started
===============

List BPF programs (full names are obtained from BTF)::

    % sudo bpf_inspect.py prog
    27: BPF_PROG_TYPE_TRACEPOINT tracepoint__tcp__tcp_send_reset
    4632: BPF_PROG_TYPE_CGROUP_SOCK_ADDR tw_ipt_bind
    49464: BPF_PROG_TYPE_RAW_TRACEPOINT raw_tracepoint__sched_process_exit

List BPF maps::

    % sudo bpf_inspect.py map
    2577: BPF_MAP_TYPE_HASH tw_ipt_vips
    4050: BPF_MAP_TYPE_STACK_TRACE stack_traces
    4069: BPF_MAP_TYPE_PERCPU_ARRAY ned_dctcp_cntr

Find BPF programs attached to BPF program ``test_pkt_access``::

    % sudo bpf_inspect.py p | grep test_pkt_access
    650: BPF_PROG_TYPE_SCHED_CLS test_pkt_access
    654: BPF_PROG_TYPE_TRACING test_main linked:[650->25: BPF_TRAMP_FEXIT test_pkt_access->test_pkt_access()]
    655: BPF_PROG_TYPE_TRACING test_subprog1 linked:[650->29: BPF_TRAMP_FEXIT test_pkt_access->test_pkt_access_subprog1()]
    656: BPF_PROG_TYPE_TRACING test_subprog2 linked:[650->31: BPF_TRAMP_FEXIT test_pkt_access->test_pkt_access_subprog2()]
    657: BPF_PROG_TYPE_TRACING test_subprog3 linked:[650->21: BPF_TRAMP_FEXIT test_pkt_access->test_pkt_access_subprog3()]
    658: BPF_PROG_TYPE_EXT new_get_skb_len linked:[650->16: BPF_TRAMP_REPLACE test_pkt_access->get_skb_len()]
    659: BPF_PROG_TYPE_EXT new_get_skb_ifindex linked:[650->23: BPF_TRAMP_REPLACE test_pkt_access->get_skb_ifindex()]
    660: BPF_PROG_TYPE_EXT new_get_constant linked:[650->19: BPF_TRAMP_REPLACE test_pkt_access->get_constant()]

It can be seen that there is a program ``test_pkt_access``, id 650, and that
multiple other tracing and ext programs are attached to functions in
``test_pkt_access``.

For example the line::

    658: BPF_PROG_TYPE_EXT new_get_skb_len linked:[650->16: BPF_TRAMP_REPLACE test_pkt_access->get_skb_len()]

means that BPF program id 658, type ``BPF_PROG_TYPE_EXT``, name
``new_get_skb_len`` replaces (``BPF_TRAMP_REPLACE``) the function
``get_skb_len()`` that has BTF id 16 in BPF program id 650, name
``test_pkt_access``.

Getting help:
.. code-block:: none

    % sudo bpf_inspect.py
    usage: bpf_inspect.py [-h] {prog,p,map,m} ...

    drgn script to list BPF programs or maps and their properties
    unavailable via kernel API.

    See https://github.com/osandov/drgn/ for more details on drgn.

    optional arguments:
      -h, --help      show this help message and exit

    subcommands:
      {prog,p,map,m}
        prog (p)      list BPF programs
        map (m)       list BPF maps

Customization
=============

The script is intended to be customized by developers to print relevant
information about BPF programs, maps and other objects.

For example, to print ``struct bpf_prog_aux`` for BPF program id 53077:

.. code-block:: none

    % git diff
    diff --git a/tools/bpf_inspect.py b/tools/bpf_inspect.py
    index 650e228..aea2357 100755
    --- a/tools/bpf_inspect.py
    +++ b/tools/bpf_inspect.py
    @@ -112,7 +112,9 @@ def list_bpf_progs(args):
             if linked:
                 linked = f" linked:[{linked}]"

    -        print(f"{id_:>6}: {type_:32} {name:32} {linked}")
    +        if id_ == 53077:
    +            print(f"{id_:>6}: {type_:32} {name:32}")
    +            print(f"{bpf_prog.aux}")


     def list_bpf_maps(args):

It produces the output::

    % sudo bpf_inspect.py p
     53077: BPF_PROG_TYPE_XDP tw_xdp_policer
    *(struct bpf_prog_aux *)0xffff8893fad4b400 = {
            .refcnt = (atomic64_t){
                    .counter = (long)58,
            },
            .used_map_cnt = (u32)1,
            .max_ctx_offset = (u32)8,
            .max_pkt_offset = (u32)15,
            .max_tp_access = (u32)0,
            .stack_depth = (u32)8,
            .id = (u32)53077,
            .func_cnt = (u32)0,
            .func_idx = (u32)0,
            .attach_btf_id = (u32)0,
            .linked_prog = (struct bpf_prog *)0x0,
            .verifier_zext = (bool)0,
            .offload_requested = (bool)0,
            .attach_btf_trace = (bool)0,
            .func_proto_unreliable = (bool)0,
            .trampoline_prog_type = (enum bpf_tramp_prog_type)BPF_TRAMP_FENTRY,
            .trampoline = (struct bpf_trampoline *)0x0,
            .tramp_hlist = (struct hlist_node){
                    .next = (struct hlist_node *)0x0,
                    .pprev = (struct hlist_node **)0x0,
            },
            .attach_func_proto = (const struct btf_type *)0x0,
            .attach_func_name = (const char *)0x0,
            .func = (struct bpf_prog **)0x0,
            .jit_data = (void *)0x0,
            .poke_tab = (struct bpf_jit_poke_descriptor *)0x0,
            .size_poke_tab = (u32)0,
            .ksym_tnode = (struct latch_tree_node){
                    .node = (struct rb_node [2]){
                            {
                                    .__rb_parent_color = (unsigned long)18446612956263126665,
                                    .rb_right = (struct rb_node *)0x0,
                                    .rb_left = (struct rb_node *)0xffff88a0be3d0088,
                            },
                            {
                                    .__rb_parent_color = (unsigned long)18446612956263126689,
                                    .rb_right = (struct rb_node *)0x0,
                                    .rb_left = (struct rb_node *)0xffff88a0be3d00a0,
                            },
                    },
            },
            .ksym_lnode = (struct list_head){
                    .next = (struct list_head *)0xffff88bf481830b8,
                    .prev = (struct list_head *)0xffff888309f536b8,
            },
            .ops = (const struct bpf_prog_ops *)xdp_prog_ops+0x0 = 0xffffffff820fa350,
            .used_maps = (struct bpf_map **)0xffff889ff795de98,
            .prog = (struct bpf_prog *)0xffffc9000cf2d000,
            .user = (struct user_struct *)root_user+0x0 = 0xffffffff82444820,
            .load_time = (u64)2408348759285319,
            .cgroup_storage = (struct bpf_map *[2]){},
            .name = (char [16])"tw_xdp_policer",
            .security = (void *)0xffff889ff795d548,
            .offload = (struct bpf_prog_offload *)0x0,
            .btf = (struct btf *)0xffff8890ce6d0580,
            .func_info = (struct bpf_func_info *)0xffff889ff795d240,
            .func_info_aux = (struct bpf_func_info_aux *)0xffff889ff795de20,
            .linfo = (struct bpf_line_info *)0xffff888a707afc00,
            .jited_linfo = (void **)0xffff8893fad48600,
            .func_info_cnt = (u32)1,
            .nr_linfo = (u32)37,
            .linfo_idx = (u32)0,
            .num_exentries = (u32)0,
            .extable = (struct exception_table_entry *)0xffffffffa032d950,
            .stats = (struct bpf_prog_stats *)0x603fe3a1f6d0,
            .work = (struct work_struct){
                    .data = (atomic_long_t){
                            .counter = (long)0,
                    },
                    .entry = (struct list_head){
                            .next = (struct list_head *)0x0,
                            .prev = (struct list_head *)0x0,
                    },
                    .func = (work_func_t)0x0,
            },
            .rcu = (struct callback_head){
                    .next = (struct callback_head *)0x0,
                    .func = (void (*)(struct callback_head *))0x0,
            },
    }


.. Links
.. _drgn/doc: https://drgn.readthedocs.io/en/latest/
.. _drgn/tools: https://github.com/osandov/drgn/tree/master/tools
.. _bpf_inspect.py:
   https://github.com/osandov/drgn/blob/master/tools/bpf_inspect.py
MAINTAINERS

 R:	Song Liu <songliubraving@fb.com>
 R:	Yonghong Song <yhs@fb.com>
 R:	Andrii Nakryiko <andriin@fb.com>
+R:	John Fastabend <john.fastabend@gmail.com>
+R:	KP Singh <kpsingh@chromium.org>
 L:	netdev@vger.kernel.org
 L:	bpf@vger.kernel.org
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git
include/linux/bpf_verifier.h

 	s64 smax_value; /* maximum possible (s64)value */
 	u64 umin_value; /* minimum possible (u64)value */
 	u64 umax_value; /* maximum possible (u64)value */
+	s32 s32_min_value; /* minimum possible (s32)value */
+	s32 s32_max_value; /* maximum possible (s32)value */
+	u32 u32_min_value; /* minimum possible (u32)value */
+	u32 u32_max_value; /* maximum possible (u32)value */
 	/* parentage chain for liveness checking */
 	struct bpf_reg_state *parent;
 	/* Inside the callee two registers can be both PTR_TO_STACK like
include/linux/lsm_hooks.h

  * @what: kernel feature being accessed
  */
 union security_list_options {
-	int (*binder_set_context_mgr)(struct task_struct *mgr);
-	int (*binder_transaction)(struct task_struct *from, struct task_struct *to);
-	int (*binder_transfer_binder)(struct task_struct *from, struct task_struct *to);
-	int (*binder_transfer_file)(struct task_struct *from, struct task_struct *to, struct file *file);
-
-	int (*ptrace_access_check)(struct task_struct *child, unsigned int mode);
-	int (*ptrace_traceme)(struct task_struct *parent);
-	int (*capget)(struct task_struct *target, kernel_cap_t *effective, kernel_cap_t *inheritable, kernel_cap_t *permitted);
-	int (*capset)(struct cred *new, const struct cred *old, const kernel_cap_t *effective, const kernel_cap_t *inheritable, const kernel_cap_t *permitted);
-	int (*capable)(const struct cred *cred, struct user_namespace *ns, int cap, unsigned int opts);
-	int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
-	int (*quota_on)(struct dentry *dentry);
-	int (*syslog)(int type);
-	int (*settime)(const struct timespec64 *ts, const struct timezone *tz);
-	int (*vm_enough_memory)(struct mm_struct *mm, long pages);
-
-	int (*bprm_set_creds)(struct linux_binprm *bprm);
-	int (*bprm_check_security)(struct linux_binprm *bprm);
-	void (*bprm_committing_creds)(struct linux_binprm *bprm);
-	void (*bprm_committed_creds)(struct linux_binprm *bprm);
-
-	int (*fs_context_dup)(struct fs_context *fc, struct fs_context *src_sc);
-	int (*fs_context_parse_param)(struct fs_context *fc, struct fs_parameter *param);
-
-	int (*sb_alloc_security)(struct super_block *sb);
-	void (*sb_free_security)(struct super_block *sb);
-	void (*sb_free_mnt_opts)(void *mnt_opts);
-	int (*sb_eat_lsm_opts)(char *orig, void **mnt_opts);
-	int (*sb_remount)(struct super_block *sb, void *mnt_opts);
-	int (*sb_kern_mount)(struct super_block *sb);
-	int (*sb_show_options)(struct seq_file *m, struct super_block *sb);
-	int (*sb_statfs)(struct dentry *dentry);
-	int (*sb_mount)(const char *dev_name, const struct path *path, const char *type, unsigned long flags, void *data);
-	int (*sb_umount)(struct vfsmount *mnt, int flags);
-	int (*sb_pivotroot)(const struct path *old_path, const struct path *new_path);
-	int (*sb_set_mnt_opts)(struct super_block *sb, void *mnt_opts, unsigned long kern_flags, unsigned long *set_kern_flags);
-	int (*sb_clone_mnt_opts)(const struct super_block *oldsb, struct super_block *newsb, unsigned long kern_flags, unsigned long *set_kern_flags);
-	int (*sb_add_mnt_opt)(const char *option, const char *val, int len, void **mnt_opts);
-	int (*move_mount)(const struct path *from_path, const struct path *to_path);
-	int (*dentry_init_security)(struct dentry *dentry, int mode, const struct qstr *name, void **ctx, u32 *ctxlen);
-	int (*dentry_create_files_as)(struct dentry *dentry, int mode, struct qstr *name, const struct cred *old, struct cred *new);
-
-
-#ifdef CONFIG_SECURITY_PATH
-	int (*path_unlink)(const struct path *dir, struct dentry *dentry);
-	int (*path_mkdir)(const struct path *dir, struct dentry *dentry, umode_t mode);
-	int (*path_rmdir)(const struct path *dir, struct dentry *dentry);
-	int (*path_mknod)(const struct path *dir, struct dentry *dentry, umode_t mode, unsigned int dev);
-	int (*path_truncate)(const struct path *path);
-	int (*path_symlink)(const struct path *dir, struct dentry *dentry, const char *old_name);
-	int (*path_link)(struct dentry *old_dentry, const struct path *new_dir, struct dentry *new_dentry);
-	int (*path_rename)(const struct path *old_dir, struct dentry *old_dentry, const struct path *new_dir, struct dentry *new_dentry);
-	int (*path_chmod)(const struct path *path, umode_t mode);
-	int (*path_chown)(const struct path *path, kuid_t uid, kgid_t gid);
-	int (*path_chroot)(const struct path *path);
-#endif
-	/* Needed for inode based security check */
-	int (*path_notify)(const struct path *path, u64 mask, unsigned int obj_type);
-	int (*inode_alloc_security)(struct inode *inode);
-	void (*inode_free_security)(struct inode *inode);
-	int (*inode_init_security)(struct inode *inode, struct inode *dir, const struct qstr *qstr, const char **name, void **value, size_t *len);
-	int (*inode_create)(struct inode *dir, struct dentry *dentry, umode_t mode);
-	int (*inode_link)(struct dentry *old_dentry, struct inode *dir, struct dentry *new_dentry);
-	int (*inode_unlink)(struct inode *dir, struct dentry *dentry);
-	int (*inode_symlink)(struct inode *dir, struct dentry *dentry, const char *old_name);
-	int (*inode_mkdir)(struct inode *dir, struct dentry *dentry, umode_t mode);
-	int (*inode_rmdir)(struct inode *dir, struct dentry *dentry);
-	int (*inode_mknod)(struct inode *dir, struct dentry *dentry, umode_t mode, dev_t dev);
-	int (*inode_rename)(struct inode *old_dir, struct dentry *old_dentry, struct inode *new_dir, struct dentry *new_dentry);
-	int (*inode_readlink)(struct dentry *dentry);
-	int (*inode_follow_link)(struct dentry *dentry, struct inode *inode, bool rcu);
-	int (*inode_permission)(struct inode *inode, int mask);
-	int (*inode_setattr)(struct dentry *dentry, struct iattr *attr);
-	int (*inode_getattr)(const struct path *path);
-	int (*inode_setxattr)(struct dentry *dentry, const char *name, const void *value, size_t size, int flags);
-	void (*inode_post_setxattr)(struct dentry *dentry, const char *name, const void *value, size_t size, int flags);
-	int (*inode_getxattr)(struct dentry *dentry, const char *name);
-	int (*inode_listxattr)(struct dentry *dentry);
-	int (*inode_removexattr)(struct dentry *dentry, const char *name);
-	int (*inode_need_killpriv)(struct dentry *dentry);
-	int (*inode_killpriv)(struct dentry *dentry);
-	int (*inode_getsecurity)(struct inode *inode, const char *name, void **buffer, bool alloc);
-	int (*inode_setsecurity)(struct inode *inode, const char *name, const void *value, size_t size, int flags);
-	int (*inode_listsecurity)(struct inode *inode, char *buffer, size_t buffer_size);
-	void (*inode_getsecid)(struct inode *inode, u32 *secid);
-	int (*inode_copy_up)(struct dentry *src, struct cred **new);
-	int (*inode_copy_up_xattr)(const char *name);
-
-	int (*kernfs_init_security)(struct kernfs_node *kn_dir, struct kernfs_node *kn);
-
-	int (*file_permission)(struct file *file, int mask);
-	int (*file_alloc_security)(struct file *file);
-	void (*file_free_security)(struct file *file);
-	int (*file_ioctl)(struct file *file, unsigned int cmd, unsigned long arg);
-	int (*mmap_addr)(unsigned long addr);
-	int (*mmap_file)(struct file *file, unsigned long reqprot, unsigned long prot, unsigned long flags);
-	int (*file_mprotect)(struct vm_area_struct *vma, unsigned long reqprot, unsigned long prot);
-	int (*file_lock)(struct file *file, unsigned int cmd);
-	int (*file_fcntl)(struct file *file, unsigned int cmd, unsigned long arg);
-	void (*file_set_fowner)(struct file *file);
-	int (*file_send_sigiotask)(struct task_struct *tsk, struct fown_struct *fown, int sig);
-	int (*file_receive)(struct file *file);
-	int (*file_open)(struct file *file);
-
-	int (*task_alloc)(struct task_struct *task, unsigned long clone_flags);
-	void (*task_free)(struct task_struct *task);
-	int (*cred_alloc_blank)(struct cred *cred, gfp_t gfp);
-	void (*cred_free)(struct cred *cred);
-	int (*cred_prepare)(struct cred *new, const struct cred *old, gfp_t gfp);
-	void (*cred_transfer)(struct cred *new, const struct cred *old);
-	void (*cred_getsecid)(const struct cred *c, u32 *secid);
-	int (*kernel_act_as)(struct cred *new, u32 secid);
-	int (*kernel_create_files_as)(struct cred *new, struct inode *inode);
-	int (*kernel_module_request)(char *kmod_name);
-	int (*kernel_load_data)(enum kernel_load_data_id id);
-	int (*kernel_read_file)(struct file *file, enum kernel_read_file_id id);
-	int (*kernel_post_read_file)(struct file *file, char *buf, loff_t size, enum kernel_read_file_id id);
-	int (*task_fix_setuid)(struct cred *new, const struct cred *old, int flags);
-	int (*task_setpgid)(struct task_struct *p, pid_t pgid);
-	int (*task_getpgid)(struct task_struct *p);
-	int (*task_getsid)(struct task_struct *p);
-	void (*task_getsecid)(struct task_struct *p, u32 *secid);
-	int (*task_setnice)(struct task_struct *p, int nice);
-	int (*task_setioprio)(struct task_struct *p, int ioprio);
-	int (*task_getioprio)(struct task_struct *p);
-	int (*task_prlimit)(const struct cred *cred, const struct cred *tcred, unsigned int flags);
-	int (*task_setrlimit)(struct task_struct *p, unsigned int resource, struct rlimit *new_rlim);
-	int (*task_setscheduler)(struct task_struct *p);
-	int (*task_getscheduler)(struct task_struct *p);
-	int (*task_movememory)(struct task_struct *p);
-	int (*task_kill)(struct task_struct *p, struct kernel_siginfo *info, int sig, const struct cred *cred);
-	int (*task_prctl)(int option, unsigned long arg2, unsigned long arg3, unsigned long arg4, unsigned long arg5);
-	void (*task_to_inode)(struct task_struct *p, struct inode *inode);
-
-	int (*ipc_permission)(struct kern_ipc_perm *ipcp, short flag);
-	void (*ipc_getsecid)(struct kern_ipc_perm *ipcp, u32 *secid);
-
-	int (*msg_msg_alloc_security)(struct msg_msg *msg);
-	void (*msg_msg_free_security)(struct msg_msg *msg);
-
-	int (*msg_queue_alloc_security)(struct kern_ipc_perm *perm);
-	void (*msg_queue_free_security)(struct kern_ipc_perm *perm);
-	int (*msg_queue_associate)(struct kern_ipc_perm *perm, int msqflg);
-	int (*msg_queue_msgctl)(struct kern_ipc_perm *perm, int cmd);
-	int (*msg_queue_msgsnd)(struct kern_ipc_perm *perm, struct msg_msg *msg, int msqflg);
-	int (*msg_queue_msgrcv)(struct kern_ipc_perm *perm, struct msg_msg *msg, struct task_struct *target, long type, int mode);
-
-	int (*shm_alloc_security)(struct kern_ipc_perm *perm);
-	void (*shm_free_security)(struct kern_ipc_perm *perm);
-	int (*shm_associate)(struct kern_ipc_perm *perm, int shmflg);
-	int (*shm_shmctl)(struct kern_ipc_perm *perm, int cmd);
-	int (*shm_shmat)(struct kern_ipc_perm *perm, char __user *shmaddr, int shmflg);
-
-	int (*sem_alloc_security)(struct kern_ipc_perm *perm);
-	void (*sem_free_security)(struct kern_ipc_perm *perm);
-	int (*sem_associate)(struct kern_ipc_perm *perm, int semflg);
-	int (*sem_semctl)(struct kern_ipc_perm *perm, int cmd);
-	int (*sem_semop)(struct kern_ipc_perm *perm, struct sembuf *sops, unsigned nsops, int alter);
-
-	int (*netlink_send)(struct sock *sk, struct sk_buff *skb);
-
-	void (*d_instantiate)(struct dentry *dentry, struct inode *inode);
-
-	int (*getprocattr)(struct task_struct *p, char *name, char **value);
-	int (*setprocattr)(const char *name, void *value, size_t size);
-	int (*ismaclabel)(const char *name);
-	int (*secid_to_secctx)(u32 secid, char **secdata, u32 *seclen);
-	int (*secctx_to_secid)(const char *secdata, u32 seclen, u32 *secid);
-	void (*release_secctx)(char *secdata, u32 seclen);
-
-	void (*inode_invalidate_secctx)(struct inode *inode);
-	int (*inode_notifysecctx)(struct inode *inode, void *ctx, u32 ctxlen);
-	int (*inode_setsecctx)(struct dentry *dentry, void *ctx, u32 ctxlen);
-	int (*inode_getsecctx)(struct inode *inode, void **ctx, u32 *ctxlen);
-
-#ifdef CONFIG_SECURITY_NETWORK
-	int (*unix_stream_connect)(struct sock *sock, struct sock *other, struct sock *newsk);
-	int (*unix_may_send)(struct socket *sock, struct socket *other);
-
-	int (*socket_create)(int family, int type, int protocol, int kern);
-	int (*socket_post_create)(struct socket *sock, int family, int type, int protocol, int kern);
-	int (*socket_socketpair)(struct socket *socka, struct socket *sockb);
-	int (*socket_bind)(struct socket *sock, struct sockaddr *address, int addrlen);
-	int (*socket_connect)(struct socket *sock, struct sockaddr *address, int addrlen);
-	int (*socket_listen)(struct socket *sock, int backlog);
-	int (*socket_accept)(struct socket *sock, struct socket *newsock);
-	int (*socket_sendmsg)(struct socket *sock, struct msghdr *msg, int size);
-	int (*socket_recvmsg)(struct socket *sock, struct msghdr *msg, int size, int flags);
-	int (*socket_getsockname)(struct socket *sock);
-	int (*socket_getpeername)(struct socket *sock);
-	int (*socket_getsockopt)(struct socket *sock, int level, int optname);
-	int (*socket_setsockopt)(struct socket *sock, int level, int optname);
-	int (*socket_shutdown)(struct socket *sock, int how);
-	int (*socket_sock_rcv_skb)(struct sock *sk, struct sk_buff *skb);
-	int (*socket_getpeersec_stream)(struct socket *sock, char __user *optval, int __user *optlen, unsigned len);
-	int (*socket_getpeersec_dgram)(struct socket *sock, struct sk_buff *skb, u32 *secid);
-	int (*sk_alloc_security)(struct sock *sk, int family, gfp_t priority);
-	void (*sk_free_security)(struct sock *sk);
-	void (*sk_clone_security)(const struct sock *sk, struct sock *newsk);
-	void (*sk_getsecid)(struct sock *sk, u32 *secid);
-	void (*sock_graft)(struct sock *sk, struct socket *parent);
-	int (*inet_conn_request)(struct sock *sk, struct sk_buff *skb, struct request_sock *req);
-	void (*inet_csk_clone)(struct sock *newsk, const struct request_sock *req);
-	void (*inet_conn_established)(struct sock *sk, struct sk_buff *skb);
-	int (*secmark_relabel_packet)(u32 secid);
-	void (*secmark_refcount_inc)(void);
-	void (*secmark_refcount_dec)(void);
-	void (*req_classify_flow)(const struct request_sock *req, struct flowi *fl);
-	int (*tun_dev_alloc_security)(void **security);
-	void (*tun_dev_free_security)(void *security);
-	int (*tun_dev_create)(void);
-	int (*tun_dev_attach_queue)(void *security);
-	int (*tun_dev_attach)(struct sock *sk, void *security);
-	int (*tun_dev_open)(void *security);
-	int (*sctp_assoc_request)(struct sctp_endpoint *ep, struct sk_buff *skb);
-	int (*sctp_bind_connect)(struct sock *sk, int optname, struct sockaddr *address, int addrlen);
-	void (*sctp_sk_clone)(struct sctp_endpoint *ep, struct sock *sk, struct sock *newsk);
-#endif	/* CONFIG_SECURITY_NETWORK */
-
-#ifdef CONFIG_SECURITY_INFINIBAND
-	int (*ib_pkey_access)(void *sec, u64 subnet_prefix, u16 pkey);
-	int (*ib_endport_manage_subnet)(void *sec, const char *dev_name, u8 port_num);
-	int (*ib_alloc_security)(void **sec);
-	void (*ib_free_security)(void *sec);
-#endif	/* CONFIG_SECURITY_INFINIBAND */
-
-#ifdef CONFIG_SECURITY_NETWORK_XFRM
-	int (*xfrm_policy_alloc_security)(struct xfrm_sec_ctx **ctxp, struct xfrm_user_sec_ctx *sec_ctx, gfp_t gfp);
-	int (*xfrm_policy_clone_security)(struct xfrm_sec_ctx *old_ctx, struct xfrm_sec_ctx **new_ctx);
-	void (*xfrm_policy_free_security)(struct xfrm_sec_ctx *ctx);
-	int (*xfrm_policy_delete_security)(struct xfrm_sec_ctx *ctx);
-	int (*xfrm_state_alloc)(struct xfrm_state *x, struct xfrm_user_sec_ctx *sec_ctx);
-	int (*xfrm_state_alloc_acquire)(struct xfrm_state *x, struct xfrm_sec_ctx *polsec, u32 secid);
-	void (*xfrm_state_free_security)(struct xfrm_state *x);
-	int (*xfrm_state_delete_security)(struct xfrm_state *x);
-	int (*xfrm_policy_lookup)(struct xfrm_sec_ctx *ctx, u32 fl_secid, u8 dir);
-	int (*xfrm_state_pol_flow_match)(struct xfrm_state *x, struct xfrm_policy *xp, const struct flowi *fl);
-	int (*xfrm_decode_session)(struct sk_buff *skb, u32 *secid, int ckall);
-#endif	/* CONFIG_SECURITY_NETWORK_XFRM */
-
-	/* key management security hooks */
-#ifdef CONFIG_KEYS
-	int (*key_alloc)(struct key *key, const struct cred *cred, unsigned long flags);
-	void (*key_free)(struct key *key);
-	int (*key_permission)(key_ref_t key_ref, const struct cred *cred, unsigned perm);
-	int (*key_getsecurity)(struct key *key, char **_buffer);
-#endif	/* CONFIG_KEYS */
-
-#ifdef CONFIG_AUDIT
-	int (*audit_rule_init)(u32 field, u32 op, char *rulestr, void **lsmrule);
-	int (*audit_rule_known)(struct audit_krule *krule);
-	int (*audit_rule_match)(u32 secid, u32 field, u32 op, void *lsmrule);
-	void (*audit_rule_free)(void *lsmrule);
-#endif	/* CONFIG_AUDIT */
-
-#ifdef CONFIG_BPF_SYSCALL
-	int (*bpf)(int cmd, union bpf_attr *attr, unsigned int size);
-	int (*bpf_map)(struct bpf_map *map, fmode_t fmode);
-	int (*bpf_prog)(struct bpf_prog *prog);
-	int (*bpf_map_alloc_security)(struct bpf_map *map);
-	void (*bpf_map_free_security)(struct bpf_map *map);
-	int (*bpf_prog_alloc_security)(struct bpf_prog_aux *aux);
-	void (*bpf_prog_free_security)(struct bpf_prog_aux *aux);
-#endif	/* CONFIG_BPF_SYSCALL */
-	int (*locked_down)(enum lockdown_reason what);
-#ifdef CONFIG_PERF_EVENTS
-	int (*perf_event_open)(struct perf_event_attr *attr, int type);
-	int (*perf_event_alloc)(struct perf_event *event);
-	void (*perf_event_free)(struct perf_event *event);
-	int (*perf_event_read)(struct perf_event *event);
-	int (*perf_event_write)(struct perf_event *event);
-
-#endif
+	#define LSM_HOOK(RET, DEFAULT, NAME, ...) RET (*NAME)(__VA_ARGS__);
+	#include "lsm_hook_defs.h"
+	#undef LSM_HOOK
 };

 struct security_hook_heads {
-	struct hlist_head binder_set_context_mgr;
-	struct hlist_head binder_transaction;
-	struct hlist_head binder_transfer_binder;
-	struct hlist_head binder_transfer_file;
-	struct hlist_head ptrace_access_check;
-	struct hlist_head ptrace_traceme;
-	struct hlist_head capget;
-	struct hlist_head capset;
-	struct hlist_head capable;
-	struct hlist_head quotactl;
-	struct hlist_head quota_on;
-	struct hlist_head syslog;
-	struct hlist_head settime;
-	struct hlist_head vm_enough_memory;
-	struct hlist_head bprm_set_creds;
-	struct hlist_head bprm_check_security;
-	struct hlist_head bprm_committing_creds;
-	struct hlist_head bprm_committed_creds;
-	struct hlist_head fs_context_dup;
-	struct hlist_head fs_context_parse_param;
-	struct hlist_head sb_alloc_security;
-	struct hlist_head sb_free_security;
-	struct hlist_head sb_free_mnt_opts;
-	struct hlist_head sb_eat_lsm_opts;
-	struct hlist_head sb_remount;
-	struct hlist_head sb_kern_mount;
-	struct hlist_head sb_show_options;
-	struct hlist_head sb_statfs;
-	struct hlist_head sb_mount;
-	struct hlist_head sb_umount;
-	struct hlist_head sb_pivotroot;
-	struct hlist_head sb_set_mnt_opts;
-	struct hlist_head sb_clone_mnt_opts;
-	struct hlist_head sb_add_mnt_opt;
-	struct hlist_head move_mount;
-	struct hlist_head dentry_init_security;
-	struct hlist_head dentry_create_files_as;
-#ifdef CONFIG_SECURITY_PATH
-	struct hlist_head path_unlink;
-	struct hlist_head path_mkdir;
-	struct hlist_head path_rmdir;
-	struct hlist_head path_mknod;
-	struct hlist_head path_truncate;
hlist_head path_symlink;18761876- struct hlist_head path_link;18771877- struct hlist_head path_rename;18781878- struct hlist_head path_chmod;18791879- struct hlist_head path_chown;18801880- struct hlist_head path_chroot;18811881-#endif18821882- /* Needed for inode based modules as well */18831883- struct hlist_head path_notify;18841884- struct hlist_head inode_alloc_security;18851885- struct hlist_head inode_free_security;18861886- struct hlist_head inode_init_security;18871887- struct hlist_head inode_create;18881888- struct hlist_head inode_link;18891889- struct hlist_head inode_unlink;18901890- struct hlist_head inode_symlink;18911891- struct hlist_head inode_mkdir;18921892- struct hlist_head inode_rmdir;18931893- struct hlist_head inode_mknod;18941894- struct hlist_head inode_rename;18951895- struct hlist_head inode_readlink;18961896- struct hlist_head inode_follow_link;18971897- struct hlist_head inode_permission;18981898- struct hlist_head inode_setattr;18991899- struct hlist_head inode_getattr;19001900- struct hlist_head inode_setxattr;19011901- struct hlist_head inode_post_setxattr;19021902- struct hlist_head inode_getxattr;19031903- struct hlist_head inode_listxattr;19041904- struct hlist_head inode_removexattr;19051905- struct hlist_head inode_need_killpriv;19061906- struct hlist_head inode_killpriv;19071907- struct hlist_head inode_getsecurity;19081908- struct hlist_head inode_setsecurity;19091909- struct hlist_head inode_listsecurity;19101910- struct hlist_head inode_getsecid;19111911- struct hlist_head inode_copy_up;19121912- struct hlist_head inode_copy_up_xattr;19131913- struct hlist_head kernfs_init_security;19141914- struct hlist_head file_permission;19151915- struct hlist_head file_alloc_security;19161916- struct hlist_head file_free_security;19171917- struct hlist_head file_ioctl;19181918- struct hlist_head mmap_addr;19191919- struct hlist_head mmap_file;19201920- struct hlist_head file_mprotect;19211921- struct hlist_head file_lock;19221922- 
struct hlist_head file_fcntl;19231923- struct hlist_head file_set_fowner;19241924- struct hlist_head file_send_sigiotask;19251925- struct hlist_head file_receive;19261926- struct hlist_head file_open;19271927- struct hlist_head task_alloc;19281928- struct hlist_head task_free;19291929- struct hlist_head cred_alloc_blank;19301930- struct hlist_head cred_free;19311931- struct hlist_head cred_prepare;19321932- struct hlist_head cred_transfer;19331933- struct hlist_head cred_getsecid;19341934- struct hlist_head kernel_act_as;19351935- struct hlist_head kernel_create_files_as;19361936- struct hlist_head kernel_load_data;19371937- struct hlist_head kernel_read_file;19381938- struct hlist_head kernel_post_read_file;19391939- struct hlist_head kernel_module_request;19401940- struct hlist_head task_fix_setuid;19411941- struct hlist_head task_setpgid;19421942- struct hlist_head task_getpgid;19431943- struct hlist_head task_getsid;19441944- struct hlist_head task_getsecid;19451945- struct hlist_head task_setnice;19461946- struct hlist_head task_setioprio;19471947- struct hlist_head task_getioprio;19481948- struct hlist_head task_prlimit;19491949- struct hlist_head task_setrlimit;19501950- struct hlist_head task_setscheduler;19511951- struct hlist_head task_getscheduler;19521952- struct hlist_head task_movememory;19531953- struct hlist_head task_kill;19541954- struct hlist_head task_prctl;19551955- struct hlist_head task_to_inode;19561956- struct hlist_head ipc_permission;19571957- struct hlist_head ipc_getsecid;19581958- struct hlist_head msg_msg_alloc_security;19591959- struct hlist_head msg_msg_free_security;19601960- struct hlist_head msg_queue_alloc_security;19611961- struct hlist_head msg_queue_free_security;19621962- struct hlist_head msg_queue_associate;19631963- struct hlist_head msg_queue_msgctl;19641964- struct hlist_head msg_queue_msgsnd;19651965- struct hlist_head msg_queue_msgrcv;19661966- struct hlist_head shm_alloc_security;19671967- struct hlist_head 
shm_free_security;19681968- struct hlist_head shm_associate;19691969- struct hlist_head shm_shmctl;19701970- struct hlist_head shm_shmat;19711971- struct hlist_head sem_alloc_security;19721972- struct hlist_head sem_free_security;19731973- struct hlist_head sem_associate;19741974- struct hlist_head sem_semctl;19751975- struct hlist_head sem_semop;19761976- struct hlist_head netlink_send;19771977- struct hlist_head d_instantiate;19781978- struct hlist_head getprocattr;19791979- struct hlist_head setprocattr;19801980- struct hlist_head ismaclabel;19811981- struct hlist_head secid_to_secctx;19821982- struct hlist_head secctx_to_secid;19831983- struct hlist_head release_secctx;19841984- struct hlist_head inode_invalidate_secctx;19851985- struct hlist_head inode_notifysecctx;19861986- struct hlist_head inode_setsecctx;19871987- struct hlist_head inode_getsecctx;19881988-#ifdef CONFIG_SECURITY_NETWORK19891989- struct hlist_head unix_stream_connect;19901990- struct hlist_head unix_may_send;19911991- struct hlist_head socket_create;19921992- struct hlist_head socket_post_create;19931993- struct hlist_head socket_socketpair;19941994- struct hlist_head socket_bind;19951995- struct hlist_head socket_connect;19961996- struct hlist_head socket_listen;19971997- struct hlist_head socket_accept;19981998- struct hlist_head socket_sendmsg;19991999- struct hlist_head socket_recvmsg;20002000- struct hlist_head socket_getsockname;20012001- struct hlist_head socket_getpeername;20022002- struct hlist_head socket_getsockopt;20032003- struct hlist_head socket_setsockopt;20042004- struct hlist_head socket_shutdown;20052005- struct hlist_head socket_sock_rcv_skb;20062006- struct hlist_head socket_getpeersec_stream;20072007- struct hlist_head socket_getpeersec_dgram;20082008- struct hlist_head sk_alloc_security;20092009- struct hlist_head sk_free_security;20102010- struct hlist_head sk_clone_security;20112011- struct hlist_head sk_getsecid;20122012- struct hlist_head sock_graft;20132013- 
struct hlist_head inet_conn_request;20142014- struct hlist_head inet_csk_clone;20152015- struct hlist_head inet_conn_established;20162016- struct hlist_head secmark_relabel_packet;20172017- struct hlist_head secmark_refcount_inc;20182018- struct hlist_head secmark_refcount_dec;20192019- struct hlist_head req_classify_flow;20202020- struct hlist_head tun_dev_alloc_security;20212021- struct hlist_head tun_dev_free_security;20222022- struct hlist_head tun_dev_create;20232023- struct hlist_head tun_dev_attach_queue;20242024- struct hlist_head tun_dev_attach;20252025- struct hlist_head tun_dev_open;20262026- struct hlist_head sctp_assoc_request;20272027- struct hlist_head sctp_bind_connect;20282028- struct hlist_head sctp_sk_clone;20292029-#endif /* CONFIG_SECURITY_NETWORK */20302030-#ifdef CONFIG_SECURITY_INFINIBAND20312031- struct hlist_head ib_pkey_access;20322032- struct hlist_head ib_endport_manage_subnet;20332033- struct hlist_head ib_alloc_security;20342034- struct hlist_head ib_free_security;20352035-#endif /* CONFIG_SECURITY_INFINIBAND */20362036-#ifdef CONFIG_SECURITY_NETWORK_XFRM20372037- struct hlist_head xfrm_policy_alloc_security;20382038- struct hlist_head xfrm_policy_clone_security;20392039- struct hlist_head xfrm_policy_free_security;20402040- struct hlist_head xfrm_policy_delete_security;20412041- struct hlist_head xfrm_state_alloc;20422042- struct hlist_head xfrm_state_alloc_acquire;20432043- struct hlist_head xfrm_state_free_security;20442044- struct hlist_head xfrm_state_delete_security;20452045- struct hlist_head xfrm_policy_lookup;20462046- struct hlist_head xfrm_state_pol_flow_match;20472047- struct hlist_head xfrm_decode_session;20482048-#endif /* CONFIG_SECURITY_NETWORK_XFRM */20492049-#ifdef CONFIG_KEYS20502050- struct hlist_head key_alloc;20512051- struct hlist_head key_free;20522052- struct hlist_head key_permission;20532053- struct hlist_head key_getsecurity;20542054-#endif /* CONFIG_KEYS */20552055-#ifdef CONFIG_AUDIT20562056- struct 
hlist_head audit_rule_init;20572057- struct hlist_head audit_rule_known;20582058- struct hlist_head audit_rule_match;20592059- struct hlist_head audit_rule_free;20602060-#endif /* CONFIG_AUDIT */20612061-#ifdef CONFIG_BPF_SYSCALL20622062- struct hlist_head bpf;20632063- struct hlist_head bpf_map;20642064- struct hlist_head bpf_prog;20652065- struct hlist_head bpf_map_alloc_security;20662066- struct hlist_head bpf_map_free_security;20672067- struct hlist_head bpf_prog_alloc_security;20682068- struct hlist_head bpf_prog_free_security;20692069-#endif /* CONFIG_BPF_SYSCALL */20702070- struct hlist_head locked_down;20712071-#ifdef CONFIG_PERF_EVENTS20722072- struct hlist_head perf_event_open;20732073- struct hlist_head perf_event_alloc;20742074- struct hlist_head perf_event_free;20752075- struct hlist_head perf_event_read;20762076- struct hlist_head perf_event_write;20772077-#endif14651465+ #define LSM_HOOK(RET, DEFAULT, NAME, ...) struct hlist_head NAME;14661466+ #include "lsm_hook_defs.h"14671467+ #undef LSM_HOOK20781468} __randomize_layout;2079146920801470/*···14892099 int lbs_msg_msg;14902100 int lbs_task;14912101};21022102+21032103+/*21042104+ * LSM_RET_VOID is used as the default value in LSM_HOOK definitions for void21052105+ * LSM hooks (in include/linux/lsm_hook_defs.h).21062106+ */21072107+#define LSM_RET_VOID ((void) 0)1492210814932109/*14942110 * Initializing a security_hook_list structure takes
+1-1
include/linux/netdevice.h
···
 typedef int (*bpf_op_t)(struct net_device *dev, struct netdev_bpf *bpf);
 int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
-		      int fd, u32 flags);
+		      int fd, int expected_fd, u32 flags);
 u32 __dev_xdp_query(struct net_device *dev, bpf_op_t xdp_op,
 		    enum bpf_netdev_command cmd);
 int xdp_umem_query(struct net_device *dev, u16 queue_id);
+12
include/linux/tnum.h
···
 /* Format a tnum as tristate binary expansion */
 int tnum_sbin(char *str, size_t size, struct tnum a);
 
+/* Returns the 32-bit subreg */
+struct tnum tnum_subreg(struct tnum a);
+/* Returns the tnum with the lower 32-bit subreg cleared */
+struct tnum tnum_clear_subreg(struct tnum a);
+/* Returns the tnum with the lower 32-bit subreg set to value */
+struct tnum tnum_const_subreg(struct tnum a, u32 value);
+/* Returns true if 32-bit subreg @a is a known constant */
+static inline bool tnum_subreg_is_const(struct tnum a)
+{
+	return !(tnum_subreg(a)).mask;
+}
+
 #endif /* _LINUX_TNUM_H */
+6-1
include/net/cls_cgroup.h
···
 	sock_cgroup_set_classid(skcd, classid);
 }
 
+static inline u32 __task_get_classid(struct task_struct *task)
+{
+	return task_cls_state(task)->classid;
+}
+
 static inline u32 task_get_classid(const struct sk_buff *skb)
 {
-	u32 classid = task_cls_state(current)->classid;
+	u32 classid = __task_get_classid(current);
 
 	/* Due to the nature of the classifier it is required to ignore all
 	 * packets originating from softirq context as accessing `current'
+1-2
include/net/inet6_hashtables.h
···
 					    int iif, int sdif,
 					    bool *refcounted)
 {
-	struct sock *sk = skb_steal_sock(skb);
+	struct sock *sk = skb_steal_sock(skb, refcounted);
 
-	*refcounted = true;
 	if (sk)
 		return sk;
 
include/net/net_namespace.h
···
 #ifdef CONFIG_XFRM
 	struct netns_xfrm	xfrm;
 #endif
+
+	atomic64_t		net_cookie; /* written once */
+
 #if IS_ENABLED(CONFIG_IP_VS)
 	struct netns_ipvs	*ipvs;
 #endif
···
 
 struct net *get_net_ns_by_pid(pid_t pid);
 struct net *get_net_ns_by_fd(int fd);
+
+u64 net_gen_cookie(struct net *net);
 
 #ifdef CONFIG_SYSCTL
 void ipx_register_sysctl(void);
+37-9
include/net/sock.h
···
 void sock_efree(struct sk_buff *skb);
 #ifdef CONFIG_INET
 void sock_edemux(struct sk_buff *skb);
+void sock_pfree(struct sk_buff *skb);
 #else
 #define sock_edemux sock_efree
 #endif
···
 	write_pnet(&sk->sk_net, net);
 }
 
-static inline struct sock *skb_steal_sock(struct sk_buff *skb)
+static inline bool
+skb_sk_is_prefetched(struct sk_buff *skb)
 {
-	if (skb->sk) {
-		struct sock *sk = skb->sk;
-
-		skb->destructor = NULL;
-		skb->sk = NULL;
-		return sk;
-	}
-	return NULL;
+#ifdef CONFIG_INET
+	return skb->destructor == sock_pfree;
+#else
+	return false;
+#endif /* CONFIG_INET */
 }
 
 /* This helper checks if a socket is a full socket,
···
 static inline bool sk_fullsock(const struct sock *sk)
 {
 	return (1 << sk->sk_state) & ~(TCPF_TIME_WAIT | TCPF_NEW_SYN_RECV);
+}
+
+static inline bool
+sk_is_refcounted(struct sock *sk)
+{
+	/* Only full sockets have sk->sk_flags. */
+	return !sk_fullsock(sk) || !sock_flag(sk, SOCK_RCU_FREE);
+}
+
+/**
+ * skb_steal_sock - steal the socket attached to an skb
+ * @skb: sk_buff to steal the socket from
+ * @refcounted: is set to true if the socket is reference-counted
+ */
+static inline struct sock *
+skb_steal_sock(struct sk_buff *skb, bool *refcounted)
+{
+	if (skb->sk) {
+		struct sock *sk = skb->sk;
+
+		*refcounted = true;
+		if (skb_sk_is_prefetched(skb))
+			*refcounted = sk_is_refcounted(sk);
+		skb->destructor = NULL;
+		skb->sk = NULL;
+		return sk;
+	}
+	*refcounted = false;
+	return NULL;
 }
 
 /* Checks if this SKB belongs to an HW offloaded socket
-2
include/net/tcp.h
···
 #ifdef CONFIG_NET_SOCK_MSG
 int tcp_bpf_sendmsg_redir(struct sock *sk, struct sk_msg *msg, u32 bytes,
 			  int flags);
-int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
-		    int nonblock, int flags, int *addr_len);
 int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock,
 		      struct msghdr *msg, int len, int flags);
 #endif /* CONFIG_NET_SOCK_MSG */
+80-2
include/uapi/linux/bpf.h
···
 	BPF_MAP_LOOKUP_AND_DELETE_BATCH,
 	BPF_MAP_UPDATE_BATCH,
 	BPF_MAP_DELETE_BATCH,
+	BPF_LINK_CREATE,
+	BPF_LINK_UPDATE,
 };
 
 enum bpf_map_type {
···
 	BPF_PROG_TYPE_TRACING,
 	BPF_PROG_TYPE_STRUCT_OPS,
 	BPF_PROG_TYPE_EXT,
+	BPF_PROG_TYPE_LSM,
 };
 
 enum bpf_attach_type {
···
 	BPF_TRACE_FENTRY,
 	BPF_TRACE_FEXIT,
 	BPF_MODIFY_RETURN,
+	BPF_LSM_MAC,
 	__MAX_BPF_ATTACH_TYPE
 };
···
 		__u32		prog_cnt;
 	} query;
 
-	struct {
+	struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
 		__u64 name;
 		__u32 prog_fd;
 	} raw_tracepoint;
···
 		__u64		probe_offset;	/* output: probe_offset */
 		__u64		probe_addr;	/* output: probe_addr */
 	} task_fd_query;
+
+	struct { /* struct used by BPF_LINK_CREATE command */
+		__u32		prog_fd;	/* eBPF program to attach */
+		__u32		target_fd;	/* object to attach to */
+		__u32		attach_type;	/* attach type */
+		__u32		flags;		/* extra flags */
+	} link_create;
+
+	struct { /* struct used by BPF_LINK_UPDATE command */
+		__u32		link_fd;	/* link fd */
+		/* new program fd to update link with */
+		__u32		new_prog_fd;
+		__u32		flags;		/* extra flags */
+		/* expected link's program fd; is specified only if
+		 * BPF_F_REPLACE flag is set in flags
+		 */
+		__u32		old_prog_fd;
+	} link_update;
+
 } __attribute__((aligned(8)));
 
 /* The description below is an attempt at providing documentation to eBPF
···
 *		restricted to raw_tracepoint bpf programs.
 *	Return
 *		0 on success, or a negative error in case of failure.
+ *
+ * u64 bpf_get_netns_cookie(void *ctx)
+ *	Description
+ *		Retrieve the cookie (generated by the kernel) of the network
+ *		namespace the input *ctx* is associated with. The network
+ *		namespace cookie remains stable for its lifetime and provides
+ *		a global identifier that can be assumed unique. If *ctx* is
+ *		NULL, then the helper returns the cookie for the initial
+ *		network namespace. The cookie itself is very similar to that
+ *		of the **bpf_get_socket_cookie**\ () helper, but for network
+ *		namespaces instead of sockets.
+ *	Return
+ *		An 8-byte long opaque number.
+ *
+ * u64 bpf_get_current_ancestor_cgroup_id(int ancestor_level)
+ *	Description
+ *		Return id of cgroup v2 that is ancestor of the cgroup associated
+ *		with the current task at the *ancestor_level*. The root cgroup
+ *		is at *ancestor_level* zero and each step down the hierarchy
+ *		increments the level. If *ancestor_level* == level of cgroup
+ *		associated with the current task, then return value will be the
+ *		same as that of **bpf_get_current_cgroup_id**\ ().
+ *
+ *		The helper is useful to implement policies based on cgroups
+ *		that are upper in hierarchy than immediate cgroup associated
+ *		with the current task.
+ *
+ *		The format of returned id and helper limitations are same as in
+ *		**bpf_get_current_cgroup_id**\ ().
+ *	Return
+ *		The id is returned or 0 in case the id could not be retrieved.
+ *
+ * int bpf_sk_assign(struct sk_buff *skb, struct bpf_sock *sk, u64 flags)
+ *	Description
+ *		Assign the *sk* to the *skb*. When combined with appropriate
+ *		routing configuration to receive the packet towards the socket,
+ *		will cause *skb* to be delivered to the specified socket.
+ *		Subsequent redirection of *skb* via **bpf_redirect**\ (),
+ *		**bpf_clone_redirect**\ () or other methods outside of BPF may
+ *		interfere with successful delivery to the socket.
+ *
+ *		This operation is only valid from TC ingress path.
+ *
+ *		The *flags* argument must be zero.
+ *	Return
+ *		0 on success, or a negative errno in case of failure.
+ *
+ *		* **-EINVAL**		Unsupported flags specified.
+ *		* **-ENOENT**		Socket is unavailable for assignment.
+ *		* **-ENETUNREACH**	Socket is unreachable (wrong netns).
+ *		* **-EOPNOTSUPP**	Unsupported operation, for example a
+ *					call from outside of TC ingress.
+ *		* **-ESOCKTNOSUPPORT**	Socket type not supported (reuseport).
 */
 #define __BPF_FUNC_MAPPER(FN)	\
···
 	FN(jiffies64),			\
 	FN(read_branch_records),	\
 	FN(get_ns_current_pid_tgid),	\
-	FN(xdp_output),
+	FN(xdp_output),			\
+	FN(get_netns_cookie),		\
+	FN(get_current_ancestor_cgroup_id),	\
+	FN(sk_assign),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
init/Kconfig
···
 # end of the "standard kernel features (expert users)" menu
 
 # syscall, maps, verifier
+
+config BPF_LSM
+	bool "LSM Instrumentation with BPF"
+	depends on BPF_EVENTS
+	depends on BPF_SYSCALL
+	depends on SECURITY
+	depends on BPF_JIT
+	help
+	  Enables instrumentation of the security hooks with eBPF programs for
+	  implementing dynamic MAC and Audit Policies.
+
+	  If you are unsure how to answer this question, answer N.
+
 config BPF_SYSCALL
 	bool "Enable bpf() system call"
 	select BPF
kernel/bpf/trampoline.c
···
 #include <linux/ftrace.h>
 #include <linux/rbtree_latch.h>
 #include <linux/perf_event.h>
+#include <linux/btf.h>
 
 /* dummy _ops. The verifier will operate on target program's ops. */
 const struct bpf_verifier_ops bpf_extension_verifier_ops = {
···
 	return err;
 }
 
-static enum bpf_tramp_prog_type bpf_attach_type_to_tramp(enum bpf_attach_type t)
+static enum bpf_tramp_prog_type bpf_attach_type_to_tramp(struct bpf_prog *prog)
 {
-	switch (t) {
+	switch (prog->expected_attach_type) {
 	case BPF_TRACE_FENTRY:
 		return BPF_TRAMP_FENTRY;
 	case BPF_MODIFY_RETURN:
 		return BPF_TRAMP_MODIFY_RETURN;
 	case BPF_TRACE_FEXIT:
 		return BPF_TRAMP_FEXIT;
+	case BPF_LSM_MAC:
+		if (!prog->aux->attach_func_proto->type)
+			/* The function returns void, we cannot modify its
+			 * return value.
+			 */
+			return BPF_TRAMP_FEXIT;
+		else
+			return BPF_TRAMP_MODIFY_RETURN;
 	default:
 		return BPF_TRAMP_REPLACE;
 	}
 }
···
 	int cnt;
 
 	tr = prog->aux->trampoline;
-	kind = bpf_attach_type_to_tramp(prog->expected_attach_type);
+	kind = bpf_attach_type_to_tramp(prog);
 	mutex_lock(&tr->mutex);
 	if (tr->extension_prog) {
 		/* cannot attach fentry/fexit if extension prog is attached.
···
 	int err;
 
 	tr = prog->aux->trampoline;
-	kind = bpf_attach_type_to_tramp(prog->expected_attach_type);
+	kind = bpf_attach_type_to_tramp(prog);
 	mutex_lock(&tr->mutex);
 	if (kind == BPF_TRAMP_REPLACE) {
 		WARN_ON_ONCE(!tr->extension_prog);
+1089-485
kernel/bpf/verifier.c
···
 #include <linux/perf_event.h>
 #include <linux/ctype.h>
 #include <linux/error-injection.h>
+#include <linux/bpf_lsm.h>
 
 #include "disasm.h"
 
···
 	bool pkt_access;
 	int regno;
 	int access_size;
-	s64 msize_smax_value;
-	u64 msize_umax_value;
+	u64 msize_max_value;
 	int ref_obj_id;
 	int func_id;
 	u32 btf_id;
···
 			tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off);
 			verbose(env, ",var_off=%s", tn_buf);
 		}
+		if (reg->s32_min_value != reg->smin_value &&
+		    reg->s32_min_value != S32_MIN)
+			verbose(env, ",s32_min_value=%d",
+				(int)(reg->s32_min_value));
+		if (reg->s32_max_value != reg->smax_value &&
+		    reg->s32_max_value != S32_MAX)
+			verbose(env, ",s32_max_value=%d",
+				(int)(reg->s32_max_value));
+		if (reg->u32_min_value != reg->umin_value &&
+		    reg->u32_min_value != U32_MIN)
+			verbose(env, ",u32_min_value=%d",
+				(int)(reg->u32_min_value));
+		if (reg->u32_max_value != reg->umax_value &&
+		    reg->u32_max_value != U32_MAX)
+			verbose(env, ",u32_max_value=%d",
+				(int)(reg->u32_max_value));
 	}
 	verbose(env, ")");
 }
···
 	reg->smax_value = (s64)imm;
 	reg->umin_value = imm;
 	reg->umax_value = imm;
+
+	reg->s32_min_value = (s32)imm;
+	reg->s32_max_value = (s32)imm;
+	reg->u32_min_value = (u32)imm;
+	reg->u32_max_value = (u32)imm;
+}
+
+static void __mark_reg32_known(struct bpf_reg_state *reg, u64 imm)
+{
+	reg->var_off = tnum_const_subreg(reg->var_off, imm);
+	reg->s32_min_value = (s32)imm;
+	reg->s32_max_value = (s32)imm;
+	reg->u32_min_value = (u32)imm;
+	reg->u32_max_value = (u32)imm;
 }
 
 /* Mark the 'variable offset' part of a register as zero. This should be
···
 	       tnum_equals_const(reg->var_off, 0);
 }
 
-/* Attempts to improve min/max values based on var_off information */
-static void __update_reg_bounds(struct bpf_reg_state *reg)
+/* Reset the min/max bounds of a register */
+static void __mark_reg_unbounded(struct bpf_reg_state *reg)
+{
+	reg->smin_value = S64_MIN;
+	reg->smax_value = S64_MAX;
+	reg->umin_value = 0;
+	reg->umax_value = U64_MAX;
+
+	reg->s32_min_value = S32_MIN;
+	reg->s32_max_value = S32_MAX;
+	reg->u32_min_value = 0;
+	reg->u32_max_value = U32_MAX;
+}
+
+static void __mark_reg64_unbounded(struct bpf_reg_state *reg)
+{
+	reg->smin_value = S64_MIN;
+	reg->smax_value = S64_MAX;
+	reg->umin_value = 0;
+	reg->umax_value = U64_MAX;
+}
+
+static void __mark_reg32_unbounded(struct bpf_reg_state *reg)
+{
+	reg->s32_min_value = S32_MIN;
+	reg->s32_max_value = S32_MAX;
+	reg->u32_min_value = 0;
+	reg->u32_max_value = U32_MAX;
+}
+
+static void __update_reg32_bounds(struct bpf_reg_state *reg)
+{
+	struct tnum var32_off = tnum_subreg(reg->var_off);
+
+	/* min signed is max(sign bit) | min(other bits) */
+	reg->s32_min_value = max_t(s32, reg->s32_min_value,
+			var32_off.value | (var32_off.mask & S32_MIN));
+	/* max signed is min(sign bit) | max(other bits) */
+	reg->s32_max_value = min_t(s32, reg->s32_max_value,
+			var32_off.value | (var32_off.mask & S32_MAX));
+	reg->u32_min_value = max_t(u32, reg->u32_min_value, (u32)var32_off.value);
+	reg->u32_max_value = min(reg->u32_max_value,
+				 (u32)(var32_off.value | var32_off.mask));
+}
+
+static void __update_reg64_bounds(struct bpf_reg_state *reg)
 {
 	/* min signed is max(sign bit) | min(other bits) */
 	reg->smin_value = max_t(s64, reg->smin_value,
···
 			       reg->var_off.value | reg->var_off.mask);
 }
 
+static void __update_reg_bounds(struct bpf_reg_state *reg)
+{
+	__update_reg32_bounds(reg);
+	__update_reg64_bounds(reg);
+}
+
 /* Uses signed min/max values to inform unsigned, and vice-versa */
-static void __reg_deduce_bounds(struct bpf_reg_state *reg)
+static void __reg32_deduce_bounds(struct bpf_reg_state *reg)
+{
+	/* Learn sign from signed bounds.
+	 * If we cannot cross the sign boundary, then signed and unsigned bounds
+	 * are the same, so combine. This works even in the negative case, e.g.
+	 * -3 s<= x s<= -1 implies 0xf...fd u<= x u<= 0xf...ff.
+	 */
+	if (reg->s32_min_value >= 0 || reg->s32_max_value < 0) {
+		reg->s32_min_value = reg->u32_min_value =
+			max_t(u32, reg->s32_min_value, reg->u32_min_value);
+		reg->s32_max_value = reg->u32_max_value =
+			min_t(u32, reg->s32_max_value, reg->u32_max_value);
+		return;
+	}
+	/* Learn sign from unsigned bounds. Signed bounds cross the sign
+	 * boundary, so we must be careful.
+	 */
+	if ((s32)reg->u32_max_value >= 0) {
+		/* Positive. We can't learn anything from the smin, but smax
+		 * is positive, hence safe.
+		 */
+		reg->s32_min_value = reg->u32_min_value;
+		reg->s32_max_value = reg->u32_max_value =
+			min_t(u32, reg->s32_max_value, reg->u32_max_value);
+	} else if ((s32)reg->u32_min_value < 0) {
+		/* Negative. We can't learn anything from the smax, but smin
+		 * is negative, hence safe.
+		 */
+		reg->s32_min_value = reg->u32_min_value =
+			max_t(u32, reg->s32_min_value, reg->u32_min_value);
+		reg->s32_max_value = reg->u32_max_value;
+	}
+}
+
+static void __reg64_deduce_bounds(struct bpf_reg_state *reg)
 {
 	/* Learn sign from signed bounds.
 	 * If we cannot cross the sign boundary, then signed and unsigned bounds
···
 	}
 }
 
+static void __reg_deduce_bounds(struct bpf_reg_state *reg)
+{
+	__reg32_deduce_bounds(reg);
+	__reg64_deduce_bounds(reg);
+}
+
 /* Attempts to improve var_off based on unsigned min/max information */
 static void __reg_bound_offset(struct bpf_reg_state *reg)
 {
-	reg->var_off = tnum_intersect(reg->var_off,
-				      tnum_range(reg->umin_value,
-						 reg->umax_value));
+	struct tnum var64_off = tnum_intersect(reg->var_off,
+					       tnum_range(reg->umin_value,
+							  reg->umax_value));
+	struct tnum var32_off = tnum_intersect(tnum_subreg(reg->var_off),
+					       tnum_range(reg->u32_min_value,
+							  reg->u32_max_value));
+
+	reg->var_off = tnum_or(tnum_clear_subreg(var64_off), var32_off);
 }
 
-static void __reg_bound_offset32(struct bpf_reg_state *reg)
+static void __reg_assign_32_into_64(struct bpf_reg_state *reg)
 {
-	u64 mask = 0xffffFFFF;
-	struct tnum range = tnum_range(reg->umin_value & mask,
-				       reg->umax_value & mask);
-	struct tnum lo32 = tnum_cast(reg->var_off, 4);
-	struct tnum hi32 = tnum_lshift(tnum_rshift(reg->var_off, 32), 32);
-
-	reg->var_off = tnum_or(hi32, tnum_intersect(lo32, range));
+	reg->umin_value = reg->u32_min_value;
+	reg->umax_value = reg->u32_max_value;
+	/* Attempt to pull 32-bit signed bounds into 64-bit bounds
+	 * but must be positive otherwise set to worse case bounds
+	 * and refine later from tnum.
+	 */
+	if (reg->s32_min_value > 0)
+		reg->smin_value = reg->s32_min_value;
+	else
+		reg->smin_value = 0;
+	if (reg->s32_max_value > 0)
+		reg->smax_value = reg->s32_max_value;
+	else
+		reg->smax_value = U32_MAX;
 }
 
-/* Reset the min/max bounds of a register */
-static void __mark_reg_unbounded(struct bpf_reg_state *reg)
+static void __reg_combine_32_into_64(struct bpf_reg_state *reg)
 {
-	reg->smin_value = S64_MIN;
-	reg->smax_value = S64_MAX;
-	reg->umin_value = 0;
-	reg->umax_value = U64_MAX;
+	/* special case when 64-bit register has upper 32-bit register
+	 * zeroed. Typically happens after zext or <<32, >>32 sequence
+	 * allowing us to use 32-bit bounds directly,
+	 */
+	if (tnum_equals_const(tnum_clear_subreg(reg->var_off), 0)) {
+		__reg_assign_32_into_64(reg);
+	} else {
+		/* Otherwise the best we can do is push lower 32bit known and
+		 * unknown bits into register (var_off set from jmp logic)
+		 * then learn as much as possible from the 64-bit tnum
+		 * known and unknown bits. The previous smin/smax bounds are
+		 * invalid here because of jmp32 compare so mark them unknown
+		 * so they do not impact tnum bounds calculation.
+		 */
+		__mark_reg64_unbounded(reg);
+		__update_reg_bounds(reg);
+	}
+
+	/* Intersecting with the old var_off might have improved our bounds
+	 * slightly, e.g. if umax was 0x7f...f and var_off was (0; 0xf...fc),
+	 * then new var_off is (0; 0x7f...fc) which improves our umax.
+	 */
+	__reg_deduce_bounds(reg);
+	__reg_bound_offset(reg);
+	__update_reg_bounds(reg);
+}
+
+static bool __reg64_bound_s32(s64 a)
+{
+	if (a > S32_MIN && a < S32_MAX)
+		return true;
+	return false;
+}
+
+static bool __reg64_bound_u32(u64 a)
+{
+	if (a > U32_MIN && a < U32_MAX)
+		return true;
+	return false;
+}
+
+static void __reg_combine_64_into_32(struct bpf_reg_state *reg)
+{
+	__mark_reg32_unbounded(reg);
+
+	if (__reg64_bound_s32(reg->smin_value))
+		reg->s32_min_value = (s32)reg->smin_value;
+	if (__reg64_bound_s32(reg->smax_value))
+		reg->s32_max_value = (s32)reg->smax_value;
+	if (__reg64_bound_u32(reg->umin_value))
+		reg->u32_min_value = (u32)reg->umin_value;
+	if (__reg64_bound_u32(reg->umax_value))
+		reg->u32_max_value = (u32)reg->umax_value;
+
+	/* Intersecting with the old var_off might have improved our bounds
+	 * slightly, e.g. if umax was 0x7f...f and var_off was (0; 0xf...fc),
+	 * then new var_off is (0; 0x7f...fc) which improves our umax.
+	 */
+	__reg_deduce_bounds(reg);
+	__reg_bound_offset(reg);
+	__update_reg_bounds(reg);
 }
 
 /* Mark a register as having a completely unknown (scalar) value.
*/···29732785 return 0;29742786}2975278727882788+/* BPF architecture zero extends alu32 ops into 64-bit registesr */27892789+static void zext_32_to_64(struct bpf_reg_state *reg)27902790+{27912791+ reg->var_off = tnum_subreg(reg->var_off);27922792+ __reg_assign_32_into_64(reg);27932793+}2976279429772795/* truncate register to smaller size (in bytes)29782796 * must be called with size < BPF_REG_SIZE···30012807 }30022808 reg->smin_value = reg->umin_value;30032809 reg->smax_value = reg->umax_value;28102810+28112811+ /* If size is smaller than 32bit register the 32bit register28122812+ * values are also truncated so we push 64-bit bounds into28132813+ * 32-bit bounds. Above were truncated < 32-bits already.28142814+ */28152815+ if (size >= 4)28162816+ return;28172817+ __reg_combine_64_into_32(reg);30042818}3005281930062820static bool bpf_map_is_rdonly(const struct bpf_map *map)···36633461 expected_type = CONST_PTR_TO_MAP;36643462 if (type != expected_type)36653463 goto err_type;36663666- } else if (arg_type == ARG_PTR_TO_CTX) {34643464+ } else if (arg_type == ARG_PTR_TO_CTX ||34653465+ arg_type == ARG_PTR_TO_CTX_OR_NULL) {36673466 expected_type = PTR_TO_CTX;36683668- if (type != expected_type)36693669- goto err_type;36703670- err = check_ctx_reg(env, reg, regno);36713671- if (err < 0)36723672- return err;34673467+ if (!(register_is_null(reg) &&34683468+ arg_type == ARG_PTR_TO_CTX_OR_NULL)) {34693469+ if (type != expected_type)34703470+ goto err_type;34713471+ err = check_ctx_reg(env, reg, regno);34723472+ if (err < 0)34733473+ return err;34743474+ }36733475 } else if (arg_type == ARG_PTR_TO_SOCK_COMMON) {36743476 expected_type = PTR_TO_SOCK_COMMON;36753477 /* Any sk pointer can be ARG_PTR_TO_SOCK_COMMON */···37833577 } else if (arg_type_is_mem_size(arg_type)) {37843578 bool zero_size_allowed = (arg_type == ARG_CONST_SIZE_OR_ZERO);3785357937863786- /* remember the mem_size which may be used later37873787- * to refine return values.35803580+ /* This is used to refine r0 
return value bounds for helpers35813581+ * that enforce this value as an upper bound on return values.35823582+ * See do_refine_retval_range() for helpers that can refine35833583+ * the return value. C type of helper is u32 so we pull register35843584+ * bound from umax_value however, if negative verifier errors35853585+ * out. Only upper bounds can be learned because retval is an35863586+ * int type and negative retvals are allowed.37883587 */37893789- meta->msize_smax_value = reg->smax_value;37903790- meta->msize_umax_value = reg->umax_value;35883588+ meta->msize_max_value = reg->umax_value;3791358937923590 /* The register is SCALAR_VALUE; the access check37933591 * happens using its boundaries.···43344124 func_id != BPF_FUNC_probe_read_str))43354125 return;4336412643374337- ret_reg->smax_value = meta->msize_smax_value;43384338- ret_reg->umax_value = meta->msize_umax_value;41274127+ ret_reg->smax_value = meta->msize_max_value;41284128+ ret_reg->s32_max_value = meta->msize_max_value;43394129 __reg_deduce_bounds(ret_reg);43404130 __reg_bound_offset(ret_reg);41314131+ __update_reg_bounds(ret_reg);43414132}4342413343434134static int···46454434 return res < a;46464435}4647443646484648-static bool signed_sub_overflows(s64 a, s64 b)44374437+static bool signed_add32_overflows(s64 a, s64 b)44384438+{44394439+ /* Do the add in u32, where overflow is well-defined */44404440+ s32 res = (s32)((u32)a + (u32)b);44414441+44424442+ if (b < 0)44434443+ return res > a;44444444+ return res < a;44454445+}44464446+44474447+static bool signed_sub_overflows(s32 a, s32 b)46494448{46504449 /* Do the sub in u64, where overflow is well-defined */46514450 s64 res = (s64)((u64)a - (u64)b);44514451+44524452+ if (b < 0)44534453+ return res < a;44544454+ return res > a;44554455+}44564456+44574457+static bool signed_sub32_overflows(s32 a, s32 b)44584458+{44594459+ /* Do the sub in u64, where overflow is well-defined */44604460+ s32 res = (s32)((u32)a - (u32)b);4652446146534462 if (b < 0)46544463 
return res < a;···49114680 !check_reg_sane_offset(env, ptr_reg, ptr_reg->type))49124681 return -EINVAL;4913468246834683+ /* pointer types do not carry 32-bit bounds at the moment. */46844684+ __mark_reg32_unbounded(dst_reg);46854685+49144686 switch (opcode) {49154687 case BPF_ADD:49164688 ret = sanitize_ptr_alu(env, insn, ptr_reg, dst_reg, smin_val < 0);···50774843 return 0;50784844}5079484548464846+static void scalar32_min_max_add(struct bpf_reg_state *dst_reg,48474847+ struct bpf_reg_state *src_reg)48484848+{48494849+ s32 smin_val = src_reg->s32_min_value;48504850+ s32 smax_val = src_reg->s32_max_value;48514851+ u32 umin_val = src_reg->u32_min_value;48524852+ u32 umax_val = src_reg->u32_max_value;48534853+48544854+ if (signed_add32_overflows(dst_reg->s32_min_value, smin_val) ||48554855+ signed_add32_overflows(dst_reg->s32_max_value, smax_val)) {48564856+ dst_reg->s32_min_value = S32_MIN;48574857+ dst_reg->s32_max_value = S32_MAX;48584858+ } else {48594859+ dst_reg->s32_min_value += smin_val;48604860+ dst_reg->s32_max_value += smax_val;48614861+ }48624862+ if (dst_reg->u32_min_value + umin_val < umin_val ||48634863+ dst_reg->u32_max_value + umax_val < umax_val) {48644864+ dst_reg->u32_min_value = 0;48654865+ dst_reg->u32_max_value = U32_MAX;48664866+ } else {48674867+ dst_reg->u32_min_value += umin_val;48684868+ dst_reg->u32_max_value += umax_val;48694869+ }48704870+}48714871+48724872+static void scalar_min_max_add(struct bpf_reg_state *dst_reg,48734873+ struct bpf_reg_state *src_reg)48744874+{48754875+ s64 smin_val = src_reg->smin_value;48764876+ s64 smax_val = src_reg->smax_value;48774877+ u64 umin_val = src_reg->umin_value;48784878+ u64 umax_val = src_reg->umax_value;48794879+48804880+ if (signed_add_overflows(dst_reg->smin_value, smin_val) ||48814881+ signed_add_overflows(dst_reg->smax_value, smax_val)) {48824882+ dst_reg->smin_value = S64_MIN;48834883+ dst_reg->smax_value = S64_MAX;48844884+ } else {48854885+ dst_reg->smin_value += smin_val;48864886+ 
dst_reg->smax_value += smax_val;48874887+ }48884888+ if (dst_reg->umin_value + umin_val < umin_val ||48894889+ dst_reg->umax_value + umax_val < umax_val) {48904890+ dst_reg->umin_value = 0;48914891+ dst_reg->umax_value = U64_MAX;48924892+ } else {48934893+ dst_reg->umin_value += umin_val;48944894+ dst_reg->umax_value += umax_val;48954895+ }48964896+}48974897+48984898+static void scalar32_min_max_sub(struct bpf_reg_state *dst_reg,48994899+ struct bpf_reg_state *src_reg)49004900+{49014901+ s32 smin_val = src_reg->s32_min_value;49024902+ s32 smax_val = src_reg->s32_max_value;49034903+ u32 umin_val = src_reg->u32_min_value;49044904+ u32 umax_val = src_reg->u32_max_value;49054905+49064906+ if (signed_sub32_overflows(dst_reg->s32_min_value, smax_val) ||49074907+ signed_sub32_overflows(dst_reg->s32_max_value, smin_val)) {49084908+ /* Overflow possible, we know nothing */49094909+ dst_reg->s32_min_value = S32_MIN;49104910+ dst_reg->s32_max_value = S32_MAX;49114911+ } else {49124912+ dst_reg->s32_min_value -= smax_val;49134913+ dst_reg->s32_max_value -= smin_val;49144914+ }49154915+ if (dst_reg->u32_min_value < umax_val) {49164916+ /* Overflow possible, we know nothing */49174917+ dst_reg->u32_min_value = 0;49184918+ dst_reg->u32_max_value = U32_MAX;49194919+ } else {49204920+ /* Cannot overflow (as long as bounds are consistent) */49214921+ dst_reg->u32_min_value -= umax_val;49224922+ dst_reg->u32_max_value -= umin_val;49234923+ }49244924+}49254925+49264926+static void scalar_min_max_sub(struct bpf_reg_state *dst_reg,49274927+ struct bpf_reg_state *src_reg)49284928+{49294929+ s64 smin_val = src_reg->smin_value;49304930+ s64 smax_val = src_reg->smax_value;49314931+ u64 umin_val = src_reg->umin_value;49324932+ u64 umax_val = src_reg->umax_value;49334933+49344934+ if (signed_sub_overflows(dst_reg->smin_value, smax_val) ||49354935+ signed_sub_overflows(dst_reg->smax_value, smin_val)) {49364936+ /* Overflow possible, we know nothing */49374937+ dst_reg->smin_value = 
S64_MIN;49384938+ dst_reg->smax_value = S64_MAX;49394939+ } else {49404940+ dst_reg->smin_value -= smax_val;49414941+ dst_reg->smax_value -= smin_val;49424942+ }49434943+ if (dst_reg->umin_value < umax_val) {49444944+ /* Overflow possible, we know nothing */49454945+ dst_reg->umin_value = 0;49464946+ dst_reg->umax_value = U64_MAX;49474947+ } else {49484948+ /* Cannot overflow (as long as bounds are consistent) */49494949+ dst_reg->umin_value -= umax_val;49504950+ dst_reg->umax_value -= umin_val;49514951+ }49524952+}49534953+49544954+static void scalar32_min_max_mul(struct bpf_reg_state *dst_reg,49554955+ struct bpf_reg_state *src_reg)49564956+{49574957+ s32 smin_val = src_reg->s32_min_value;49584958+ u32 umin_val = src_reg->u32_min_value;49594959+ u32 umax_val = src_reg->u32_max_value;49604960+49614961+ if (smin_val < 0 || dst_reg->s32_min_value < 0) {49624962+ /* Ain't nobody got time to multiply that sign */49634963+ __mark_reg32_unbounded(dst_reg);49644964+ return;49654965+ }49664966+ /* Both values are positive, so we can work with unsigned and49674967+ * copy the result to signed (unless it exceeds S32_MAX).49684968+ */49694969+ if (umax_val > U16_MAX || dst_reg->u32_max_value > U16_MAX) {49704970+ /* Potential overflow, we know nothing */49714971+ __mark_reg32_unbounded(dst_reg);49724972+ return;49734973+ }49744974+ dst_reg->u32_min_value *= umin_val;49754975+ dst_reg->u32_max_value *= umax_val;49764976+ if (dst_reg->u32_max_value > S32_MAX) {49774977+ /* Overflow possible, we know nothing */49784978+ dst_reg->s32_min_value = S32_MIN;49794979+ dst_reg->s32_max_value = S32_MAX;49804980+ } else {49814981+ dst_reg->s32_min_value = dst_reg->u32_min_value;49824982+ dst_reg->s32_max_value = dst_reg->u32_max_value;49834983+ }49844984+}49854985+49864986+static void scalar_min_max_mul(struct bpf_reg_state *dst_reg,49874987+ struct bpf_reg_state *src_reg)49884988+{49894989+ s64 smin_val = src_reg->smin_value;49904990+ u64 umin_val = src_reg->umin_value;49914991+ u64 
umax_val = src_reg->umax_value;49924992+49934993+ if (smin_val < 0 || dst_reg->smin_value < 0) {49944994+ /* Ain't nobody got time to multiply that sign */49954995+ __mark_reg64_unbounded(dst_reg);49964996+ return;49974997+ }49984998+ /* Both values are positive, so we can work with unsigned and49994999+ * copy the result to signed (unless it exceeds S64_MAX).50005000+ */50015001+ if (umax_val > U32_MAX || dst_reg->umax_value > U32_MAX) {50025002+ /* Potential overflow, we know nothing */50035003+ __mark_reg64_unbounded(dst_reg);50045004+ return;50055005+ }50065006+ dst_reg->umin_value *= umin_val;50075007+ dst_reg->umax_value *= umax_val;50085008+ if (dst_reg->umax_value > S64_MAX) {50095009+ /* Overflow possible, we know nothing */50105010+ dst_reg->smin_value = S64_MIN;50115011+ dst_reg->smax_value = S64_MAX;50125012+ } else {50135013+ dst_reg->smin_value = dst_reg->umin_value;50145014+ dst_reg->smax_value = dst_reg->umax_value;50155015+ }50165016+}50175017+50185018+static void scalar32_min_max_and(struct bpf_reg_state *dst_reg,50195019+ struct bpf_reg_state *src_reg)50205020+{50215021+ bool src_known = tnum_subreg_is_const(src_reg->var_off);50225022+ bool dst_known = tnum_subreg_is_const(dst_reg->var_off);50235023+ struct tnum var32_off = tnum_subreg(dst_reg->var_off);50245024+ s32 smin_val = src_reg->s32_min_value;50255025+ u32 umax_val = src_reg->u32_max_value;50265026+50275027+ /* Assuming scalar64_min_max_and will be called so its safe50285028+ * to skip updating register for known 32-bit case.50295029+ */50305030+ if (src_known && dst_known)50315031+ return;50325032+50335033+ /* We get our minimum from the var_off, since that's inherently50345034+ * bitwise. 
Our maximum is the minimum of the operands' maxima.50355035+ */50365036+ dst_reg->u32_min_value = var32_off.value;50375037+ dst_reg->u32_max_value = min(dst_reg->u32_max_value, umax_val);50385038+ if (dst_reg->s32_min_value < 0 || smin_val < 0) {50395039+ /* Lose signed bounds when ANDing negative numbers,50405040+ * ain't nobody got time for that.50415041+ */50425042+ dst_reg->s32_min_value = S32_MIN;50435043+ dst_reg->s32_max_value = S32_MAX;50445044+ } else {50455045+ /* ANDing two positives gives a positive, so safe to50465046+ * cast result into s64.50475047+ */50485048+ dst_reg->s32_min_value = dst_reg->u32_min_value;50495049+ dst_reg->s32_max_value = dst_reg->u32_max_value;50505050+ }50515051+50525052+}50535053+50545054+static void scalar_min_max_and(struct bpf_reg_state *dst_reg,50555055+ struct bpf_reg_state *src_reg)50565056+{50575057+ bool src_known = tnum_is_const(src_reg->var_off);50585058+ bool dst_known = tnum_is_const(dst_reg->var_off);50595059+ s64 smin_val = src_reg->smin_value;50605060+ u64 umax_val = src_reg->umax_value;50615061+50625062+ if (src_known && dst_known) {50635063+ __mark_reg_known(dst_reg, dst_reg->var_off.value &50645064+ src_reg->var_off.value);50655065+ return;50665066+ }50675067+50685068+ /* We get our minimum from the var_off, since that's inherently50695069+ * bitwise. 
Our maximum is the minimum of the operands' maxima.50705070+ */50715071+ dst_reg->umin_value = dst_reg->var_off.value;50725072+ dst_reg->umax_value = min(dst_reg->umax_value, umax_val);50735073+ if (dst_reg->smin_value < 0 || smin_val < 0) {50745074+ /* Lose signed bounds when ANDing negative numbers,50755075+ * ain't nobody got time for that.50765076+ */50775077+ dst_reg->smin_value = S64_MIN;50785078+ dst_reg->smax_value = S64_MAX;50795079+ } else {50805080+ /* ANDing two positives gives a positive, so safe to50815081+ * cast result into s64.50825082+ */50835083+ dst_reg->smin_value = dst_reg->umin_value;50845084+ dst_reg->smax_value = dst_reg->umax_value;50855085+ }50865086+ /* We may learn something more from the var_off */50875087+ __update_reg_bounds(dst_reg);50885088+}50895089+50905090+static void scalar32_min_max_or(struct bpf_reg_state *dst_reg,50915091+ struct bpf_reg_state *src_reg)50925092+{50935093+ bool src_known = tnum_subreg_is_const(src_reg->var_off);50945094+ bool dst_known = tnum_subreg_is_const(dst_reg->var_off);50955095+ struct tnum var32_off = tnum_subreg(dst_reg->var_off);50965096+ s32 smin_val = src_reg->smin_value;50975097+ u32 umin_val = src_reg->umin_value;50985098+50995099+ /* Assuming scalar64_min_max_or will be called so it is safe51005100+ * to skip updating register for known case.51015101+ */51025102+ if (src_known && dst_known)51035103+ return;51045104+51055105+ /* We get our maximum from the var_off, and our minimum is the51065106+ * maximum of the operands' minima51075107+ */51085108+ dst_reg->u32_min_value = max(dst_reg->u32_min_value, umin_val);51095109+ dst_reg->u32_max_value = var32_off.value | var32_off.mask;51105110+ if (dst_reg->s32_min_value < 0 || smin_val < 0) {51115111+ /* Lose signed bounds when ORing negative numbers,51125112+ * ain't nobody got time for that.51135113+ */51145114+ dst_reg->s32_min_value = S32_MIN;51155115+ dst_reg->s32_max_value = S32_MAX;51165116+ } else {51175117+ /* ORing two positives gives a 
positive, so safe to51185118+ * cast result into s64.51195119+ */51205120+ dst_reg->s32_min_value = dst_reg->umin_value;51215121+ dst_reg->s32_max_value = dst_reg->umax_value;51225122+ }51235123+}51245124+51255125+static void scalar_min_max_or(struct bpf_reg_state *dst_reg,51265126+ struct bpf_reg_state *src_reg)51275127+{51285128+ bool src_known = tnum_is_const(src_reg->var_off);51295129+ bool dst_known = tnum_is_const(dst_reg->var_off);51305130+ s64 smin_val = src_reg->smin_value;51315131+ u64 umin_val = src_reg->umin_value;51325132+51335133+ if (src_known && dst_known) {51345134+ __mark_reg_known(dst_reg, dst_reg->var_off.value |51355135+ src_reg->var_off.value);51365136+ return;51375137+ }51385138+51395139+ /* We get our maximum from the var_off, and our minimum is the51405140+ * maximum of the operands' minima51415141+ */51425142+ dst_reg->umin_value = max(dst_reg->umin_value, umin_val);51435143+ dst_reg->umax_value = dst_reg->var_off.value | dst_reg->var_off.mask;51445144+ if (dst_reg->smin_value < 0 || smin_val < 0) {51455145+ /* Lose signed bounds when ORing negative numbers,51465146+ * ain't nobody got time for that.51475147+ */51485148+ dst_reg->smin_value = S64_MIN;51495149+ dst_reg->smax_value = S64_MAX;51505150+ } else {51515151+ /* ORing two positives gives a positive, so safe to51525152+ * cast result into s64.51535153+ */51545154+ dst_reg->smin_value = dst_reg->umin_value;51555155+ dst_reg->smax_value = dst_reg->umax_value;51565156+ }51575157+ /* We may learn something more from the var_off */51585158+ __update_reg_bounds(dst_reg);51595159+}51605160+51615161+static void __scalar32_min_max_lsh(struct bpf_reg_state *dst_reg,51625162+ u64 umin_val, u64 umax_val)51635163+{51645164+ /* We lose all sign bit information (except what we can pick51655165+ * up from var_off)51665166+ */51675167+ dst_reg->s32_min_value = S32_MIN;51685168+ dst_reg->s32_max_value = S32_MAX;51695169+ /* If we might shift our top bit out, then we know nothing */51705170+ if 
(umax_val > 31 || dst_reg->u32_max_value > 1ULL << (31 - umax_val)) {51715171+ dst_reg->u32_min_value = 0;51725172+ dst_reg->u32_max_value = U32_MAX;51735173+ } else {51745174+ dst_reg->u32_min_value <<= umin_val;51755175+ dst_reg->u32_max_value <<= umax_val;51765176+ }51775177+}51785178+51795179+static void scalar32_min_max_lsh(struct bpf_reg_state *dst_reg,51805180+ struct bpf_reg_state *src_reg)51815181+{51825182+ u32 umax_val = src_reg->u32_max_value;51835183+ u32 umin_val = src_reg->u32_min_value;51845184+ /* u32 alu operation will zext upper bits */51855185+ struct tnum subreg = tnum_subreg(dst_reg->var_off);51865186+51875187+ __scalar32_min_max_lsh(dst_reg, umin_val, umax_val);51885188+ dst_reg->var_off = tnum_subreg(tnum_lshift(subreg, umin_val));51895189+ /* Not required but being careful mark reg64 bounds as unknown so51905190+ * that we are forced to pick them up from tnum and zext later and51915191+ * if some path skips this step we are still safe.51925192+ */51935193+ __mark_reg64_unbounded(dst_reg);51945194+ __update_reg32_bounds(dst_reg);51955195+}51965196+51975197+static void __scalar64_min_max_lsh(struct bpf_reg_state *dst_reg,51985198+ u64 umin_val, u64 umax_val)51995199+{52005200+ /* Special case <<32 because it is a common compiler pattern to sign52015201+ * extend subreg by doing <<32 s>>32. In this case if 32bit bounds are52025202+ * positive we know this shift will also be positive so we can track52035203+ * bounds correctly. Otherwise we lose all sign bit information except52045204+ * what we can pick up from var_off. 
Perhaps we can generalize this52055205+ * later to shifts of any length.52065206+ */52075207+ if (umin_val == 32 && umax_val == 32 && dst_reg->s32_max_value >= 0)52085208+ dst_reg->smax_value = (s64)dst_reg->s32_max_value << 32;52095209+ else52105210+ dst_reg->smax_value = S64_MAX;52115211+52125212+ if (umin_val == 32 && umax_val == 32 && dst_reg->s32_min_value >= 0)52135213+ dst_reg->smin_value = (s64)dst_reg->s32_min_value << 32;52145214+ else52155215+ dst_reg->smin_value = S64_MIN;52165216+52175217+ /* If we might shift our top bit out, then we know nothing */52185218+ if (dst_reg->umax_value > 1ULL << (63 - umax_val)) {52195219+ dst_reg->umin_value = 0;52205220+ dst_reg->umax_value = U64_MAX;52215221+ } else {52225222+ dst_reg->umin_value <<= umin_val;52235223+ dst_reg->umax_value <<= umax_val;52245224+ }52255225+}52265226+52275227+static void scalar_min_max_lsh(struct bpf_reg_state *dst_reg,52285228+ struct bpf_reg_state *src_reg)52295229+{52305230+ u64 umax_val = src_reg->umax_value;52315231+ u64 umin_val = src_reg->umin_value;52325232+52335233+ /* scalar64 calc uses 32bit unshifted bounds so must be called first */52345234+ __scalar64_min_max_lsh(dst_reg, umin_val, umax_val);52355235+ __scalar32_min_max_lsh(dst_reg, umin_val, umax_val);52365236+52375237+ dst_reg->var_off = tnum_lshift(dst_reg->var_off, umin_val);52385238+ /* We may learn something more from the var_off */52395239+ __update_reg_bounds(dst_reg);52405240+}52415241+52425242+static void scalar32_min_max_rsh(struct bpf_reg_state *dst_reg,52435243+ struct bpf_reg_state *src_reg)52445244+{52455245+ struct tnum subreg = tnum_subreg(dst_reg->var_off);52465246+ u32 umax_val = src_reg->u32_max_value;52475247+ u32 umin_val = src_reg->u32_min_value;52485248+52495249+ /* BPF_RSH is an unsigned shift. 
If the value in dst_reg might52505250+ * be negative, then either:52515251+ * 1) src_reg might be zero, so the sign bit of the result is52525252+ * unknown, so we lose our signed bounds52535253+ * 2) it's known negative, thus the unsigned bounds capture the52545254+ * signed bounds52555255+ * 3) the signed bounds cross zero, so they tell us nothing52565256+ * about the result52575257+ * If the value in dst_reg is known nonnegative, then again the52585258+ * unsigned bounts capture the signed bounds.52595259+ * Thus, in all cases it suffices to blow away our signed bounds52605260+ * and rely on inferring new ones from the unsigned bounds and52615261+ * var_off of the result.52625262+ */52635263+ dst_reg->s32_min_value = S32_MIN;52645264+ dst_reg->s32_max_value = S32_MAX;52655265+52665266+ dst_reg->var_off = tnum_rshift(subreg, umin_val);52675267+ dst_reg->u32_min_value >>= umax_val;52685268+ dst_reg->u32_max_value >>= umin_val;52695269+52705270+ __mark_reg64_unbounded(dst_reg);52715271+ __update_reg32_bounds(dst_reg);52725272+}52735273+52745274+static void scalar_min_max_rsh(struct bpf_reg_state *dst_reg,52755275+ struct bpf_reg_state *src_reg)52765276+{52775277+ u64 umax_val = src_reg->umax_value;52785278+ u64 umin_val = src_reg->umin_value;52795279+52805280+ /* BPF_RSH is an unsigned shift. 
If the value in dst_reg might52815281+ * be negative, then either:52825282+ * 1) src_reg might be zero, so the sign bit of the result is52835283+ * unknown, so we lose our signed bounds52845284+ * 2) it's known negative, thus the unsigned bounds capture the52855285+ * signed bounds52865286+ * 3) the signed bounds cross zero, so they tell us nothing52875287+ * about the result52885288+ * If the value in dst_reg is known nonnegative, then again the52895289+ * unsigned bounts capture the signed bounds.52905290+ * Thus, in all cases it suffices to blow away our signed bounds52915291+ * and rely on inferring new ones from the unsigned bounds and52925292+ * var_off of the result.52935293+ */52945294+ dst_reg->smin_value = S64_MIN;52955295+ dst_reg->smax_value = S64_MAX;52965296+ dst_reg->var_off = tnum_rshift(dst_reg->var_off, umin_val);52975297+ dst_reg->umin_value >>= umax_val;52985298+ dst_reg->umax_value >>= umin_val;52995299+53005300+ /* Its not easy to operate on alu32 bounds here because it depends53015301+ * on bits being shifted in. 
Take easy way out and mark unbounded53025302+ * so we can recalculate later from tnum.53035303+ */53045304+ __mark_reg32_unbounded(dst_reg);53055305+ __update_reg_bounds(dst_reg);53065306+}53075307+53085308+static void scalar32_min_max_arsh(struct bpf_reg_state *dst_reg,53095309+ struct bpf_reg_state *src_reg)53105310+{53115311+ u64 umin_val = src_reg->u32_min_value;53125312+53135313+ /* Upon reaching here, src_known is true and53145314+ * umax_val is equal to umin_val.53155315+ */53165316+ dst_reg->s32_min_value = (u32)(((s32)dst_reg->s32_min_value) >> umin_val);53175317+ dst_reg->s32_max_value = (u32)(((s32)dst_reg->s32_max_value) >> umin_val);53185318+53195319+ dst_reg->var_off = tnum_arshift(tnum_subreg(dst_reg->var_off), umin_val, 32);53205320+53215321+ /* blow away the dst_reg umin_value/umax_value and rely on53225322+ * dst_reg var_off to refine the result.53235323+ */53245324+ dst_reg->u32_min_value = 0;53255325+ dst_reg->u32_max_value = U32_MAX;53265326+53275327+ __mark_reg64_unbounded(dst_reg);53285328+ __update_reg32_bounds(dst_reg);53295329+}53305330+53315331+static void scalar_min_max_arsh(struct bpf_reg_state *dst_reg,53325332+ struct bpf_reg_state *src_reg)53335333+{53345334+ u64 umin_val = src_reg->umin_value;53355335+53365336+ /* Upon reaching here, src_known is true and umax_val is equal53375337+ * to umin_val.53385338+ */53395339+ dst_reg->smin_value >>= umin_val;53405340+ dst_reg->smax_value >>= umin_val;53415341+53425342+ dst_reg->var_off = tnum_arshift(dst_reg->var_off, umin_val, 64);53435343+53445344+ /* blow away the dst_reg umin_value/umax_value and rely on53455345+ * dst_reg var_off to refine the result.53465346+ */53475347+ dst_reg->umin_value = 0;53485348+ dst_reg->umax_value = U64_MAX;53495349+53505350+ /* Its not easy to operate on alu32 bounds here because it depends53515351+ * on bits being shifted in from upper 32-bits. 
Take easy way out53525352+ * and mark unbounded so we can recalculate later from tnum.53535353+ */53545354+ __mark_reg32_unbounded(dst_reg);53555355+ __update_reg_bounds(dst_reg);53565356+}53575357+50805358/* WARNING: This function does calculations on 64-bit values, but the actual50815359 * execution may occur on 32-bit values. Therefore, things like bitshifts50825360 * need extra checks in the 32-bit case.···56034857 bool src_known, dst_known;56044858 s64 smin_val, smax_val;56054859 u64 umin_val, umax_val;48604860+ s32 s32_min_val, s32_max_val;48614861+ u32 u32_min_val, u32_max_val;56064862 u64 insn_bitness = (BPF_CLASS(insn->code) == BPF_ALU64) ? 64 : 32;56074863 u32 dst = insn->dst_reg;56084864 int ret;56095609-56105610- if (insn_bitness == 32) {56115611- /* Relevant for 32-bit RSH: Information can propagate towards56125612- * LSB, so it isn't sufficient to only truncate the output to56135613- * 32 bits.56145614- */56155615- coerce_reg_to_size(dst_reg, 4);56165616- coerce_reg_to_size(&src_reg, 4);56175617- }48654865+ bool alu32 = (BPF_CLASS(insn->code) != BPF_ALU64);5618486656194867 smin_val = src_reg.smin_value;56204868 smax_val = src_reg.smax_value;56214869 umin_val = src_reg.umin_value;56224870 umax_val = src_reg.umax_value;56235623- src_known = tnum_is_const(src_reg.var_off);56245624- dst_known = tnum_is_const(dst_reg->var_off);5625487156265626- if ((src_known && (smin_val != smax_val || umin_val != umax_val)) ||56275627- smin_val > smax_val || umin_val > umax_val) {56285628- /* Taint dst register if offset had invalid bounds derived from56295629- * e.g. 
-		 * e.g. dead branches.
-		 */
-		__mark_reg_unknown(env, dst_reg);
-		return 0;
+	s32_min_val = src_reg.s32_min_value;
+	s32_max_val = src_reg.s32_max_value;
+	u32_min_val = src_reg.u32_min_value;
+	u32_max_val = src_reg.u32_max_value;
+
+	if (alu32) {
+		src_known = tnum_subreg_is_const(src_reg.var_off);
+		dst_known = tnum_subreg_is_const(dst_reg->var_off);
+		if ((src_known &&
+		     (s32_min_val != s32_max_val || u32_min_val != u32_max_val)) ||
+		    s32_min_val > s32_max_val || u32_min_val > u32_max_val) {
+			/* Taint dst register if offset had invalid bounds
+			 * derived from e.g. dead branches.
+			 */
+			__mark_reg_unknown(env, dst_reg);
+			return 0;
+		}
+	} else {
+		src_known = tnum_is_const(src_reg.var_off);
+		dst_known = tnum_is_const(dst_reg->var_off);
+		if ((src_known &&
+		     (smin_val != smax_val || umin_val != umax_val)) ||
+		    smin_val > smax_val || umin_val > umax_val) {
+			/* Taint dst register if offset had invalid bounds
+			 * derived from e.g. dead branches.
+			 */
+			__mark_reg_unknown(env, dst_reg);
+			return 0;
+		}
 	}
 
 	if (!src_known &&
···
 		return 0;
 	}
 
+	/* Calculate sign/unsigned bounds and tnum for alu32 and alu64 bit ops.
+	 * There are two classes of instructions: The first class we track both
+	 * alu32 and alu64 sign/unsigned bounds independently this provides the
+	 * greatest amount of precision when alu operations are mixed with jmp32
+	 * operations. These operations are BPF_ADD, BPF_SUB, BPF_MUL, BPF_AND,
+	 * and BPF_OR. This is possible because these ops have fairly easy to
+	 * understand and calculate behavior in both 32-bit and 64-bit alu ops.
+	 * See alu32 verifier tests for examples. The second class of
+	 * operations, BPF_LSH, BPF_RSH, and BPF_ARSH, however are not so easy
+	 * with regards to tracking sign/unsigned bounds because the bits may
+	 * cross subreg boundaries in the alu64 case. When this happens we mark
+	 * the reg unbounded in the subreg bound space and use the resulting
+	 * tnum to calculate an approximation of the sign/unsigned bounds.
+	 */
 	switch (opcode) {
 	case BPF_ADD:
 		ret = sanitize_val_alu(env, insn);
···
 			verbose(env, "R%d tried to add from different pointers or scalars\n", dst);
 			return ret;
 		}
-		if (signed_add_overflows(dst_reg->smin_value, smin_val) ||
-		    signed_add_overflows(dst_reg->smax_value, smax_val)) {
-			dst_reg->smin_value = S64_MIN;
-			dst_reg->smax_value = S64_MAX;
-		} else {
-			dst_reg->smin_value += smin_val;
-			dst_reg->smax_value += smax_val;
-		}
-		if (dst_reg->umin_value + umin_val < umin_val ||
-		    dst_reg->umax_value + umax_val < umax_val) {
-			dst_reg->umin_value = 0;
-			dst_reg->umax_value = U64_MAX;
-		} else {
-			dst_reg->umin_value += umin_val;
-			dst_reg->umax_value += umax_val;
-		}
+		scalar32_min_max_add(dst_reg, &src_reg);
+		scalar_min_max_add(dst_reg, &src_reg);
 		dst_reg->var_off = tnum_add(dst_reg->var_off, src_reg.var_off);
 		break;
 	case BPF_SUB:
···
 			verbose(env, "R%d tried to sub from different pointers or scalars\n", dst);
 			return ret;
 		}
-		if (signed_sub_overflows(dst_reg->smin_value, smax_val) ||
-		    signed_sub_overflows(dst_reg->smax_value, smin_val)) {
-			/* Overflow possible, we know nothing */
-			dst_reg->smin_value = S64_MIN;
-			dst_reg->smax_value = S64_MAX;
-		} else {
-			dst_reg->smin_value -= smax_val;
-			dst_reg->smax_value -= smin_val;
-		}
-		if (dst_reg->umin_value < umax_val) {
-			/* Overflow possible, we know nothing */
-			dst_reg->umin_value = 0;
-			dst_reg->umax_value = U64_MAX;
-		} else {
-			/* Cannot overflow (as long as bounds are consistent) */
-			dst_reg->umin_value -= umax_val;
-			dst_reg->umax_value -= umin_val;
-		}
+		scalar32_min_max_sub(dst_reg, &src_reg);
+		scalar_min_max_sub(dst_reg, &src_reg);
 		dst_reg->var_off = tnum_sub(dst_reg->var_off, src_reg.var_off);
 		break;
 	case BPF_MUL:
 		dst_reg->var_off = tnum_mul(dst_reg->var_off, src_reg.var_off);
-		if (smin_val < 0 || dst_reg->smin_value < 0) {
-			/* Ain't nobody got time to multiply that sign */
-			__mark_reg_unbounded(dst_reg);
-			__update_reg_bounds(dst_reg);
-			break;
-		}
-		/* Both values are positive, so we can work with unsigned and
-		 * copy the result to signed (unless it exceeds S64_MAX).
-		 */
-		if (umax_val > U32_MAX || dst_reg->umax_value > U32_MAX) {
-			/* Potential overflow, we know nothing */
-			__mark_reg_unbounded(dst_reg);
-			/* (except what we can learn from the var_off) */
-			__update_reg_bounds(dst_reg);
-			break;
-		}
-		dst_reg->umin_value *= umin_val;
-		dst_reg->umax_value *= umax_val;
-		if (dst_reg->umax_value > S64_MAX) {
-			/* Overflow possible, we know nothing */
-			dst_reg->smin_value = S64_MIN;
-			dst_reg->smax_value = S64_MAX;
-		} else {
-			dst_reg->smin_value = dst_reg->umin_value;
-			dst_reg->smax_value = dst_reg->umax_value;
-		}
+		scalar32_min_max_mul(dst_reg, &src_reg);
+		scalar_min_max_mul(dst_reg, &src_reg);
 		break;
 	case BPF_AND:
-		if (src_known && dst_known) {
-			__mark_reg_known(dst_reg, dst_reg->var_off.value &
-						  src_reg.var_off.value);
-			break;
-		}
-		/* We get our minimum from the var_off, since that's inherently
-		 * bitwise.  Our maximum is the minimum of the operands' maxima.
-		 */
 		dst_reg->var_off = tnum_and(dst_reg->var_off, src_reg.var_off);
-		dst_reg->umin_value = dst_reg->var_off.value;
-		dst_reg->umax_value = min(dst_reg->umax_value, umax_val);
-		if (dst_reg->smin_value < 0 || smin_val < 0) {
-			/* Lose signed bounds when ANDing negative numbers,
-			 * ain't nobody got time for that.
-			 */
-			dst_reg->smin_value = S64_MIN;
-			dst_reg->smax_value = S64_MAX;
-		} else {
-			/* ANDing two positives gives a positive, so safe to
-			 * cast result into s64.
-			 */
-			dst_reg->smin_value = dst_reg->umin_value;
-			dst_reg->smax_value = dst_reg->umax_value;
-		}
-		/* We may learn something more from the var_off */
-		__update_reg_bounds(dst_reg);
+		scalar32_min_max_and(dst_reg, &src_reg);
+		scalar_min_max_and(dst_reg, &src_reg);
 		break;
 	case BPF_OR:
-		if (src_known && dst_known) {
-			__mark_reg_known(dst_reg, dst_reg->var_off.value |
-						  src_reg.var_off.value);
-			break;
-		}
-		/* We get our maximum from the var_off, and our minimum is the
-		 * maximum of the operands' minima
-		 */
 		dst_reg->var_off = tnum_or(dst_reg->var_off, src_reg.var_off);
-		dst_reg->umin_value = max(dst_reg->umin_value, umin_val);
-		dst_reg->umax_value = dst_reg->var_off.value |
-				      dst_reg->var_off.mask;
-		if (dst_reg->smin_value < 0 || smin_val < 0) {
-			/* Lose signed bounds when ORing negative numbers,
-			 * ain't nobody got time for that.
-			 */
-			dst_reg->smin_value = S64_MIN;
-			dst_reg->smax_value = S64_MAX;
-		} else {
-			/* ORing two positives gives a positive, so safe to
-			 * cast result into s64.
-			 */
-			dst_reg->smin_value = dst_reg->umin_value;
-			dst_reg->smax_value = dst_reg->umax_value;
-		}
-		/* We may learn something more from the var_off */
-		__update_reg_bounds(dst_reg);
+		scalar32_min_max_or(dst_reg, &src_reg);
+		scalar_min_max_or(dst_reg, &src_reg);
 		break;
 	case BPF_LSH:
 		if (umax_val >= insn_bitness) {
···
 			mark_reg_unknown(env, regs, insn->dst_reg);
 			break;
 		}
-		/* We lose all sign bit information (except what we can pick
-		 * up from var_off)
-		 */
-		dst_reg->smin_value = S64_MIN;
-		dst_reg->smax_value = S64_MAX;
-		/* If we might shift our top bit out, then we know nothing */
-		if (dst_reg->umax_value > 1ULL << (63 - umax_val)) {
-			dst_reg->umin_value = 0;
-			dst_reg->umax_value = U64_MAX;
-		} else {
-			dst_reg->umin_value <<= umin_val;
-			dst_reg->umax_value <<= umax_val;
-		}
-		dst_reg->var_off = tnum_lshift(dst_reg->var_off, umin_val);
-		/* We may learn something more from the var_off */
-		__update_reg_bounds(dst_reg);
+		if (alu32)
+			scalar32_min_max_lsh(dst_reg, &src_reg);
+		else
+			scalar_min_max_lsh(dst_reg, &src_reg);
 		break;
 	case BPF_RSH:
 		if (umax_val >= insn_bitness) {
···
 			mark_reg_unknown(env, regs, insn->dst_reg);
 			break;
 		}
-		/* BPF_RSH is an unsigned shift.  If the value in dst_reg might
-		 * be negative, then either:
-		 * 1) src_reg might be zero, so the sign bit of the result is
-		 *    unknown, so we lose our signed bounds
-		 * 2) it's known negative, thus the unsigned bounds capture the
-		 *    signed bounds
-		 * 3) the signed bounds cross zero, so they tell us nothing
-		 *    about the result
-		 * If the value in dst_reg is known nonnegative, then again the
-		 * unsigned bounts capture the signed bounds.
-		 * Thus, in all cases it suffices to blow away our signed bounds
-		 * and rely on inferring new ones from the unsigned bounds and
-		 * var_off of the result.
-		 */
-		dst_reg->smin_value = S64_MIN;
-		dst_reg->smax_value = S64_MAX;
-		dst_reg->var_off = tnum_rshift(dst_reg->var_off, umin_val);
-		dst_reg->umin_value >>= umax_val;
-		dst_reg->umax_value >>= umin_val;
-		/* We may learn something more from the var_off */
-		__update_reg_bounds(dst_reg);
+		if (alu32)
+			scalar32_min_max_rsh(dst_reg, &src_reg);
+		else
+			scalar_min_max_rsh(dst_reg, &src_reg);
 		break;
 	case BPF_ARSH:
 		if (umax_val >= insn_bitness) {
···
 			mark_reg_unknown(env, regs, insn->dst_reg);
 			break;
 		}
-
-		/* Upon reaching here, src_known is true and
-		 * umax_val is equal to umin_val.
-		 */
-		if (insn_bitness == 32) {
-			dst_reg->smin_value = (u32)(((s32)dst_reg->smin_value) >> umin_val);
-			dst_reg->smax_value = (u32)(((s32)dst_reg->smax_value) >> umin_val);
-		} else {
-			dst_reg->smin_value >>= umin_val;
-			dst_reg->smax_value >>= umin_val;
-		}
-
-		dst_reg->var_off = tnum_arshift(dst_reg->var_off, umin_val,
-						insn_bitness);
-
-		/* blow away the dst_reg umin_value/umax_value and rely on
-		 * dst_reg var_off to refine the result.
-		 */
-		dst_reg->umin_value = 0;
-		dst_reg->umax_value = U64_MAX;
-		__update_reg_bounds(dst_reg);
+		if (alu32)
+			scalar32_min_max_arsh(dst_reg, &src_reg);
+		else
+			scalar_min_max_arsh(dst_reg, &src_reg);
 		break;
 	default:
 		mark_reg_unknown(env, regs, insn->dst_reg);
 		break;
 	}
 
-	if (BPF_CLASS(insn->code) != BPF_ALU64) {
-		/* 32-bit ALU ops are (32,32)->32 */
-		coerce_reg_to_size(dst_reg, 4);
-	}
+	/* ALU32 ops are zero extended into 64bit register */
+	if (alu32)
+		zext_32_to_64(dst_reg);
 
+	__update_reg_bounds(dst_reg);
 	__reg_deduce_bounds(dst_reg);
 	__reg_bound_offset(dst_reg);
 	return 0;
···
 				mark_reg_unknown(env, regs,
 						 insn->dst_reg);
 			}
-			coerce_reg_to_size(dst_reg, 4);
+			zext_32_to_64(dst_reg);
 		}
 	} else {
 		/* case: R = imm
···
 			       new_range);
 }
 
-/* compute branch direction of the expression "if (reg opcode val) goto target;"
- * and return:
- *  1 - branch will be taken and "goto target" will be executed
- *  0 - branch will not be taken and fall-through to next insn
- * -1 - unknown. Example: "if (reg < 5)" is unknown when register value range [0,10]
- */
-static int is_branch_taken(struct bpf_reg_state *reg, u64 val, u8 opcode,
-			   bool is_jmp32)
+static int is_branch32_taken(struct bpf_reg_state *reg, u32 val, u8 opcode)
 {
-	struct bpf_reg_state reg_lo;
-	s64 sval;
+	struct tnum subreg = tnum_subreg(reg->var_off);
+	s32 sval = (s32)val;
 
-	if (__is_pointer_value(false, reg))
-		return -1;
-
-	if (is_jmp32) {
-		reg_lo = *reg;
-		reg = &reg_lo;
-		/* For JMP32, only low 32 bits are compared, coerce_reg_to_size
-		 * could truncate high bits and update umin/umax according to
-		 * information of low bits.
-		 */
-		coerce_reg_to_size(reg, 4);
-		/* smin/smax need special handling. For example, after coerce,
-		 * if smin_value is 0x00000000ffffffffLL, the value is -1 when
-		 * used as operand to JMP32. It is a negative number from s32's
-		 * point of view, while it is a positive number when seen as
-		 * s64. The smin/smax are kept as s64, therefore, when used with
-		 * JMP32, they need to be transformed into s32, then sign
-		 * extended back to s64.
-		 *
-		 * Also, smin/smax were copied from umin/umax. If umin/umax has
-		 * different sign bit, then min/max relationship doesn't
-		 * maintain after casting into s32, for this case, set smin/smax
-		 * to safest range.
-		 */
-		if ((reg->umax_value ^ reg->umin_value) &
-		    (1ULL << 31)) {
-			reg->smin_value = S32_MIN;
-			reg->smax_value = S32_MAX;
-		}
-		reg->smin_value = (s64)(s32)reg->smin_value;
-		reg->smax_value = (s64)(s32)reg->smax_value;
-
-		val = (u32)val;
-		sval = (s64)(s32)val;
-	} else {
-		sval = (s64)val;
+	switch (opcode) {
+	case BPF_JEQ:
+		if (tnum_is_const(subreg))
+			return !!tnum_equals_const(subreg, val);
+		break;
+	case BPF_JNE:
+		if (tnum_is_const(subreg))
+			return !tnum_equals_const(subreg, val);
+		break;
+	case BPF_JSET:
+		if ((~subreg.mask & subreg.value) & val)
+			return 1;
+		if (!((subreg.mask | subreg.value) & val))
+			return 0;
+		break;
+	case BPF_JGT:
+		if (reg->u32_min_value > val)
+			return 1;
+		else if (reg->u32_max_value <= val)
+			return 0;
+		break;
+	case BPF_JSGT:
+		if (reg->s32_min_value > sval)
+			return 1;
+		else if (reg->s32_max_value < sval)
+			return 0;
+		break;
+	case BPF_JLT:
+		if (reg->u32_max_value < val)
+			return 1;
+		else if (reg->u32_min_value >= val)
+			return 0;
+		break;
+	case BPF_JSLT:
+		if (reg->s32_max_value < sval)
+			return 1;
+		else if (reg->s32_min_value >= sval)
+			return 0;
+		break;
+	case BPF_JGE:
+		if (reg->u32_min_value >= val)
+			return 1;
+		else if (reg->u32_max_value < val)
+			return 0;
+		break;
+	case BPF_JSGE:
+		if (reg->s32_min_value >= sval)
+			return 1;
+		else if (reg->s32_max_value < sval)
+			return 0;
+		break;
+	case BPF_JLE:
+		if (reg->u32_max_value <= val)
+			return 1;
+		else if (reg->u32_min_value > val)
+			return 0;
+		break;
+	case BPF_JSLE:
+		if (reg->s32_max_value <= sval)
+			return 1;
+		else if (reg->s32_min_value > sval)
+			return 0;
+		break;
 	}
+
+	return -1;
+}
+
+
+static int is_branch64_taken(struct bpf_reg_state *reg, u64 val, u8 opcode)
+{
+	s64 sval = (s64)val;
 
 	switch (opcode) {
 	case BPF_JEQ:
···
 	return -1;
 }
 
-/* Generate min value of the high 32-bit from TNUM info. */
-static u64 gen_hi_min(struct tnum var)
-{
-	return var.value & ~0xffffffffULL;
-}
-
-/* Generate max value of the high 32-bit from TNUM info. */
-static u64 gen_hi_max(struct tnum var)
-{
-	return (var.value | var.mask) & ~0xffffffffULL;
-}
-
-/* Return true if VAL is compared with a s64 sign extended from s32, and they
- * are with the same signedness.
+/* compute branch direction of the expression "if (reg opcode val) goto target;"
+ * and return:
+ *  1 - branch will be taken and "goto target" will be executed
+ *  0 - branch will not be taken and fall-through to next insn
+ * -1 - unknown. Example: "if (reg < 5)" is unknown when register value
+ *      range [0,10]
 */
-static bool cmp_val_with_extended_s64(s64 sval, struct bpf_reg_state *reg)
+static int is_branch_taken(struct bpf_reg_state *reg, u64 val, u8 opcode,
+			   bool is_jmp32)
 {
-	return ((s32)sval >= 0 &&
-		reg->smin_value >= 0 && reg->smax_value <= S32_MAX) ||
-	       ((s32)sval < 0 &&
-		reg->smax_value <= 0 && reg->smin_value >= S32_MIN);
+	if (__is_pointer_value(false, reg))
+		return -1;
+
+	if (is_jmp32)
+		return is_branch32_taken(reg, val, opcode);
+	return is_branch64_taken(reg, val, opcode);
 }
 
 /* Adjusts the register min/max values in the case that the dst_reg is the
···
 * In JEQ/JNE cases we also adjust the var_off values.
 */
 static void reg_set_min_max(struct bpf_reg_state *true_reg,
-			    struct bpf_reg_state *false_reg, u64 val,
+			    struct bpf_reg_state *false_reg,
+			    u64 val, u32 val32,
 			    u8 opcode, bool is_jmp32)
 {
-	s64 sval;
+	struct tnum false_32off = tnum_subreg(false_reg->var_off);
+	struct tnum false_64off = false_reg->var_off;
+	struct tnum true_32off = tnum_subreg(true_reg->var_off);
+	struct tnum true_64off = true_reg->var_off;
+	s64 sval = (s64)val;
+	s32 sval32 = (s32)val32;
 
 	/* If the dst_reg is a pointer, we can't learn anything about its
 	 * variable offset from the compare (unless src_reg were a pointer into
···
 	 */
 	if (__is_pointer_value(false, false_reg))
 		return;
-
-	val = is_jmp32 ? (u32)val : val;
-	sval = is_jmp32 ? (s64)(s32)val : (s64)val;
 
 	switch (opcode) {
 	case BPF_JEQ:
···
 		 * if it is true we know the value for sure. Likewise for
 		 * BPF_JNE.
 		 */
-		if (is_jmp32) {
-			u64 old_v = reg->var_off.value;
-			u64 hi_mask = ~0xffffffffULL;
-
-			reg->var_off.value = (old_v & hi_mask) | val;
-			reg->var_off.mask &= hi_mask;
-		} else {
+		if (is_jmp32)
+			__mark_reg32_known(reg, val32);
+		else
 			__mark_reg_known(reg, val);
-		}
 		break;
 	}
 	case BPF_JSET:
-		false_reg->var_off = tnum_and(false_reg->var_off,
-					      tnum_const(~val));
-		if (is_power_of_2(val))
-			true_reg->var_off = tnum_or(true_reg->var_off,
-						    tnum_const(val));
+		if (is_jmp32) {
+			false_32off = tnum_and(false_32off, tnum_const(~val32));
+			if (is_power_of_2(val32))
+				true_32off = tnum_or(true_32off,
+						     tnum_const(val32));
+		} else {
+			false_64off = tnum_and(false_64off, tnum_const(~val));
+			if (is_power_of_2(val))
+				true_64off = tnum_or(true_64off,
+						     tnum_const(val));
+		}
 		break;
 	case BPF_JGE:
 	case BPF_JGT:
 	{
-		u64 false_umax = opcode == BPF_JGT ? val    : val - 1;
-		u64 true_umin = opcode == BPF_JGT ? val + 1 : val;
-
 		if (is_jmp32) {
-			false_umax += gen_hi_max(false_reg->var_off);
-			true_umin += gen_hi_min(true_reg->var_off);
+			u32 false_umax = opcode == BPF_JGT ? val32  : val32 - 1;
+			u32 true_umin = opcode == BPF_JGT ? val32 + 1 : val32;
+
+			false_reg->u32_max_value = min(false_reg->u32_max_value,
+						       false_umax);
+			true_reg->u32_min_value = max(true_reg->u32_min_value,
+						      true_umin);
+		} else {
+			u64 false_umax = opcode == BPF_JGT ? val    : val - 1;
+			u64 true_umin = opcode == BPF_JGT ? val + 1 : val;
+
+			false_reg->umax_value = min(false_reg->umax_value, false_umax);
+			true_reg->umin_value = max(true_reg->umin_value, true_umin);
 		}
-		false_reg->umax_value = min(false_reg->umax_value, false_umax);
-		true_reg->umin_value = max(true_reg->umin_value, true_umin);
 		break;
 	}
 	case BPF_JSGE:
 	case BPF_JSGT:
 	{
-		s64 false_smax = opcode == BPF_JSGT ? sval    : sval - 1;
-		s64 true_smin = opcode == BPF_JSGT ? sval + 1 : sval;
+		if (is_jmp32) {
+			s32 false_smax = opcode == BPF_JSGT ? sval32    : sval32 - 1;
+			s32 true_smin = opcode == BPF_JSGT ? sval32 + 1 : sval32;
 
-		/* If the full s64 was not sign-extended from s32 then don't
-		 * deduct further info.
-		 */
-		if (is_jmp32 && !cmp_val_with_extended_s64(sval, false_reg))
-			break;
-		false_reg->smax_value = min(false_reg->smax_value, false_smax);
-		true_reg->smin_value = max(true_reg->smin_value, true_smin);
+			false_reg->s32_max_value = min(false_reg->s32_max_value, false_smax);
+			true_reg->s32_min_value = max(true_reg->s32_min_value, true_smin);
+		} else {
+			s64 false_smax = opcode == BPF_JSGT ? sval    : sval - 1;
+			s64 true_smin = opcode == BPF_JSGT ? sval + 1 : sval;
+
+			false_reg->smax_value = min(false_reg->smax_value, false_smax);
+			true_reg->smin_value = max(true_reg->smin_value, true_smin);
+		}
 		break;
 	}
 	case BPF_JLE:
 	case BPF_JLT:
 	{
-		u64 false_umin = opcode == BPF_JLT ? val    : val + 1;
-		u64 true_umax = opcode == BPF_JLT ? val - 1 : val;
-
 		if (is_jmp32) {
-			false_umin += gen_hi_min(false_reg->var_off);
-			true_umax += gen_hi_max(true_reg->var_off);
+			u32 false_umin = opcode == BPF_JLT ? val32  : val32 + 1;
+			u32 true_umax = opcode == BPF_JLT ? val32 - 1 : val32;
+
+			false_reg->u32_min_value = max(false_reg->u32_min_value,
+						       false_umin);
+			true_reg->u32_max_value = min(true_reg->u32_max_value,
+						      true_umax);
+		} else {
+			u64 false_umin = opcode == BPF_JLT ? val    : val + 1;
+			u64 true_umax = opcode == BPF_JLT ? val - 1 : val;
+
+			false_reg->umin_value = max(false_reg->umin_value, false_umin);
+			true_reg->umax_value = min(true_reg->umax_value, true_umax);
 		}
-		false_reg->umin_value = max(false_reg->umin_value, false_umin);
-		true_reg->umax_value = min(true_reg->umax_value, true_umax);
 		break;
 	}
 	case BPF_JSLE:
 	case BPF_JSLT:
 	{
-		s64 false_smin = opcode == BPF_JSLT ? sval    : sval + 1;
-		s64 true_smax = opcode == BPF_JSLT ? sval - 1 : sval;
+		if (is_jmp32) {
+			s32 false_smin = opcode == BPF_JSLT ? sval32    : sval32 + 1;
+			s32 true_smax = opcode == BPF_JSLT ? sval32 - 1 : sval32;
 
-		if (is_jmp32 && !cmp_val_with_extended_s64(sval, false_reg))
-			break;
-		false_reg->smin_value = max(false_reg->smin_value, false_smin);
-		true_reg->smax_value = min(true_reg->smax_value, true_smax);
+			false_reg->s32_min_value = max(false_reg->s32_min_value, false_smin);
+			true_reg->s32_max_value = min(true_reg->s32_max_value, true_smax);
+		} else {
+			s64 false_smin = opcode == BPF_JSLT ? sval    : sval + 1;
+			s64 true_smax = opcode == BPF_JSLT ? sval - 1 : sval;
+
+			false_reg->smin_value = max(false_reg->smin_value, false_smin);
+			true_reg->smax_value = min(true_reg->smax_value, true_smax);
+		}
 		break;
 	}
 	default:
-		break;
+		return;
 	}
 
-	__reg_deduce_bounds(false_reg);
-	__reg_deduce_bounds(true_reg);
-	/* We might have learned some bits from the bounds. */
-	__reg_bound_offset(false_reg);
-	__reg_bound_offset(true_reg);
 	if (is_jmp32) {
-		__reg_bound_offset32(false_reg);
-		__reg_bound_offset32(true_reg);
+		false_reg->var_off = tnum_or(tnum_clear_subreg(false_64off),
+					     tnum_subreg(false_32off));
+		true_reg->var_off = tnum_or(tnum_clear_subreg(true_64off),
+					    tnum_subreg(true_32off));
+		__reg_combine_32_into_64(false_reg);
+		__reg_combine_32_into_64(true_reg);
+	} else {
+		false_reg->var_off = false_64off;
+		true_reg->var_off = true_64off;
+		__reg_combine_64_into_32(false_reg);
+		__reg_combine_64_into_32(true_reg);
 	}
-	/* Intersecting with the old var_off might have improved our bounds
-	 * slightly.  e.g. if umax was 0x7f...f and var_off was (0; 0xf...fc),
-	 * then new var_off is (0; 0x7f...fc) which improves our umax.
-	 */
-	__update_reg_bounds(false_reg);
-	__update_reg_bounds(true_reg);
 }
 
 /* Same as above, but for the case that dst_reg holds a constant and src_reg is
 * the variable reg.
 */
 static void reg_set_min_max_inv(struct bpf_reg_state *true_reg,
-				struct bpf_reg_state *false_reg, u64 val,
+				struct bpf_reg_state *false_reg,
+				u64 val, u32 val32,
 				u8 opcode, bool is_jmp32)
 {
-	s64 sval;
-
-	if (__is_pointer_value(false, false_reg))
-		return;
-
-	val = is_jmp32 ? (u32)val : val;
-	sval = is_jmp32 ? (s64)(s32)val : (s64)val;
-
-	switch (opcode) {
-	case BPF_JEQ:
-	case BPF_JNE:
-	{
-		struct bpf_reg_state *reg =
-			opcode == BPF_JEQ ? true_reg : false_reg;
-
-		if (is_jmp32) {
-			u64 old_v = reg->var_off.value;
-			u64 hi_mask = ~0xffffffffULL;
-
-			reg->var_off.value = (old_v & hi_mask) | val;
-			reg->var_off.mask &= hi_mask;
-		} else {
-			__mark_reg_known(reg, val);
-		}
-		break;
-	}
-	case BPF_JSET:
-		false_reg->var_off = tnum_and(false_reg->var_off,
-					      tnum_const(~val));
-		if (is_power_of_2(val))
-			true_reg->var_off = tnum_or(true_reg->var_off,
-						    tnum_const(val));
-		break;
-	case BPF_JGE:
-	case BPF_JGT:
-	{
-		u64 false_umin = opcode == BPF_JGT ? val    : val + 1;
-		u64 true_umax = opcode == BPF_JGT ? val - 1 : val;
-
-		if (is_jmp32) {
-			false_umin += gen_hi_min(false_reg->var_off);
-			true_umax += gen_hi_max(true_reg->var_off);
-		}
-		false_reg->umin_value = max(false_reg->umin_value, false_umin);
-		true_reg->umax_value = min(true_reg->umax_value, true_umax);
-		break;
-	}
-	case BPF_JSGE:
-	case BPF_JSGT:
-	{
-		s64 false_smin = opcode == BPF_JSGT ? sval    : sval + 1;
-		s64 true_smax = opcode == BPF_JSGT ? sval - 1 : sval;
-
-		if (is_jmp32 && !cmp_val_with_extended_s64(sval, false_reg))
-			break;
-		false_reg->smin_value = max(false_reg->smin_value, false_smin);
-		true_reg->smax_value = min(true_reg->smax_value, true_smax);
-		break;
-	}
-	case BPF_JLE:
-	case BPF_JLT:
-	{
-		u64 false_umax = opcode == BPF_JLT ? val    : val - 1;
-		u64 true_umin = opcode == BPF_JLT ? val + 1 : val;
-
-		if (is_jmp32) {
-			false_umax += gen_hi_max(false_reg->var_off);
-			true_umin += gen_hi_min(true_reg->var_off);
-		}
-		false_reg->umax_value = min(false_reg->umax_value, false_umax);
-		true_reg->umin_value = max(true_reg->umin_value, true_umin);
-		break;
-	}
-	case BPF_JSLE:
-	case BPF_JSLT:
-	{
-		s64 false_smax = opcode == BPF_JSLT ? sval    : sval - 1;
-		s64 true_smin = opcode == BPF_JSLT ? sval + 1 : sval;
-
-		if (is_jmp32 && !cmp_val_with_extended_s64(sval, false_reg))
-			break;
-		false_reg->smax_value = min(false_reg->smax_value, false_smax);
-		true_reg->smin_value = max(true_reg->smin_value, true_smin);
-		break;
-	}
-	default:
-		break;
-	}
-
-	__reg_deduce_bounds(false_reg);
-	__reg_deduce_bounds(true_reg);
-	/* We might have learned some bits from the bounds. */
-	__reg_bound_offset(false_reg);
-	__reg_bound_offset(true_reg);
-	if (is_jmp32) {
-		__reg_bound_offset32(false_reg);
-		__reg_bound_offset32(true_reg);
-	}
-	/* Intersecting with the old var_off might have improved our bounds
-	 * slightly.  e.g. if umax was 0x7f...f and var_off was (0; 0xf...fc),
-	 * then new var_off is (0; 0x7f...fc) which improves our umax.
+	/* How can we transform "a <op> b" into "b <op> a"? */
+	static const u8 opcode_flip[16] = {
+		/* these stay the same */
+		[BPF_JEQ  >> 4] = BPF_JEQ,
+		[BPF_JNE  >> 4] = BPF_JNE,
+		[BPF_JSET >> 4] = BPF_JSET,
+		/* these swap "lesser" and "greater" (L and G in the opcodes) */
+		[BPF_JGE  >> 4] = BPF_JLE,
+		[BPF_JGT  >> 4] = BPF_JLT,
+		[BPF_JLE  >> 4] = BPF_JGE,
+		[BPF_JLT  >> 4] = BPF_JGT,
+		[BPF_JSGE >> 4] = BPF_JSLE,
+		[BPF_JSGT >> 4] = BPF_JSLT,
+		[BPF_JSLE >> 4] = BPF_JSGE,
+		[BPF_JSLT >> 4] = BPF_JSGT
+	};
+	opcode = opcode_flip[opcode >> 4];
+	/* This uses zero as "not present in table"; luckily the zero opcode,
+	 * BPF_JA, can't get here.
 	 */
-	__update_reg_bounds(false_reg);
-	__update_reg_bounds(true_reg);
+	if (opcode)
+		reg_set_min_max(true_reg, false_reg, val, val32, opcode, is_jmp32);
 }
 
 /* Regs are known to be equal, so intersect their min/max/var_off */
···
 	dst_reg = &regs[insn->dst_reg];
 	is_jmp32 = BPF_CLASS(insn->code) == BPF_JMP32;
 
-	if (BPF_SRC(insn->code) == BPF_K)
-		pred = is_branch_taken(dst_reg, insn->imm,
-				       opcode, is_jmp32);
-	else if (src_reg->type == SCALAR_VALUE &&
-		 tnum_is_const(src_reg->var_off))
-		pred = is_branch_taken(dst_reg, src_reg->var_off.value,
-				       opcode, is_jmp32);
+	if (BPF_SRC(insn->code) == BPF_K) {
+		pred = is_branch_taken(dst_reg, insn->imm, opcode, is_jmp32);
+	} else if (src_reg->type == SCALAR_VALUE &&
+		   is_jmp32 && tnum_is_const(tnum_subreg(src_reg->var_off))) {
+		pred = is_branch_taken(dst_reg,
+				       tnum_subreg(src_reg->var_off).value,
+				       opcode,
+				       is_jmp32);
+	} else if (src_reg->type == SCALAR_VALUE &&
+		   !is_jmp32 && tnum_is_const(src_reg->var_off)) {
+		pred = is_branch_taken(dst_reg,
+				       src_reg->var_off.value,
+				       opcode,
+				       is_jmp32);
+	}
+
 	if (pred >= 0) {
 		err = mark_chain_precision(env, insn->dst_reg);
 		if (BPF_SRC(insn->code) == BPF_X && !err)
···
 	 */
 	if (BPF_SRC(insn->code) == BPF_X) {
 		struct bpf_reg_state *src_reg = &regs[insn->src_reg];
-		struct bpf_reg_state lo_reg0 = *dst_reg;
-		struct bpf_reg_state lo_reg1 = *src_reg;
-		struct bpf_reg_state *src_lo, *dst_lo;
-
-		dst_lo = &lo_reg0;
-		src_lo = &lo_reg1;
-		coerce_reg_to_size(dst_lo, 4);
-		coerce_reg_to_size(src_lo, 4);
 
 		if (dst_reg->type == SCALAR_VALUE &&
 		    src_reg->type == SCALAR_VALUE) {
 			if (tnum_is_const(src_reg->var_off) ||
-			    (is_jmp32 && tnum_is_const(src_lo->var_off)))
+			    (is_jmp32 &&
+			     tnum_is_const(tnum_subreg(src_reg->var_off))))
 				reg_set_min_max(&other_branch_regs[insn->dst_reg],
 						dst_reg,
-						is_jmp32
-						? src_lo->var_off.value
-						: src_reg->var_off.value,
+						src_reg->var_off.value,
+						tnum_subreg(src_reg->var_off).value,
 						opcode, is_jmp32);
 			else if (tnum_is_const(dst_reg->var_off) ||
-				 (is_jmp32 && tnum_is_const(dst_lo->var_off)))
+				 (is_jmp32 &&
+				  tnum_is_const(tnum_subreg(dst_reg->var_off))))
 				reg_set_min_max_inv(&other_branch_regs[insn->src_reg],
 						    src_reg,
-						    is_jmp32
-						    ? dst_lo->var_off.value
-						    : dst_reg->var_off.value,
+						    dst_reg->var_off.value,
+						    tnum_subreg(dst_reg->var_off).value,
 						    opcode, is_jmp32);
 			else if (!is_jmp32 &&
 				 (opcode == BPF_JEQ || opcode == BPF_JNE))
···
 		}
 	} else if (dst_reg->type == SCALAR_VALUE) {
 		reg_set_min_max(&other_branch_regs[insn->dst_reg],
-					dst_reg, insn->imm, opcode, is_jmp32);
+					dst_reg, insn->imm, (u32)insn->imm,
+					opcode, is_jmp32);
 	}
 
 	/* detect if R == 0 where R is returned from bpf_map_lookup_elem().
···
 	struct tnum range = tnum_range(0, 1);
 	int err;
 
-	/* The struct_ops func-ptr's return type could be "void" */
-	if (env->prog->type == BPF_PROG_TYPE_STRUCT_OPS &&
+	/* LSM and struct_ops func-ptr's return type could be "void" */
+	if ((env->prog->type == BPF_PROG_TYPE_STRUCT_OPS ||
+	     env->prog->type == BPF_PROG_TYPE_LSM) &&
 	    !prog->aux->attach_func_proto->type)
 		return 0;
 
···
 	if (prog->type == BPF_PROG_TYPE_STRUCT_OPS)
 		return check_struct_ops_btf_id(env);
 
-	if (prog->type != BPF_PROG_TYPE_TRACING && !prog_extension)
+	if (prog->type != BPF_PROG_TYPE_TRACING &&
+	    prog->type != BPF_PROG_TYPE_LSM &&
+	    !prog_extension)
 		return 0;
 
 	if (!btf_id) {
···
 			return -EINVAL;
 		/* fallthrough */
 	case BPF_MODIFY_RETURN:
+	case BPF_LSM_MAC:
 	case BPF_TRACE_FENTRY:
 	case BPF_TRACE_FEXIT:
+		prog->aux->attach_func_name = tname;
+		if (prog->type == BPF_PROG_TYPE_LSM) {
+			ret = bpf_lsm_verify_prog(&env->log, prog);
+			if (ret < 0)
+				return ret;
+		}
+
 		if (!btf_type_is_func(t)) {
 			verbose(env, "attach_btf_id %u is not a function\n",
 				btf_id);
···
 		tr = bpf_trampoline_lookup(key);
 		if (!tr)
 			return -ENOMEM;
-		prog->aux->attach_func_name = tname;
 		/* t is either vmlinux type or another program's type */
 		prog->aux->attach_func_proto = t;
 		mutex_lock(&tr->mutex);
+36-5
kernel/cgroup/cgroup.c
···
 #endif /* CONFIG_SOCK_CGROUP_DATA */
 
 #ifdef CONFIG_CGROUP_BPF
-int cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
-		      struct bpf_prog *replace_prog, enum bpf_attach_type type,
+int cgroup_bpf_attach(struct cgroup *cgrp,
+		      struct bpf_prog *prog, struct bpf_prog *replace_prog,
+		      struct bpf_cgroup_link *link,
+		      enum bpf_attach_type type,
 		      u32 flags)
 {
 	int ret;
 
 	mutex_lock(&cgroup_mutex);
-	ret = __cgroup_bpf_attach(cgrp, prog, replace_prog, type, flags);
+	ret = __cgroup_bpf_attach(cgrp, prog, replace_prog, link, type, flags);
 	mutex_unlock(&cgroup_mutex);
 	return ret;
 }
+
+int cgroup_bpf_replace(struct bpf_link *link, struct bpf_prog *old_prog,
+		       struct bpf_prog *new_prog)
+{
+	struct bpf_cgroup_link *cg_link;
+	int ret;
+
+	if (link->ops != &bpf_cgroup_link_lops)
+		return -EINVAL;
+
+	cg_link = container_of(link, struct bpf_cgroup_link, link);
+
+	mutex_lock(&cgroup_mutex);
+	/* link might have been auto-released by dying cgroup, so fail */
+	if (!cg_link->cgroup) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+	if (old_prog && link->prog != old_prog) {
+		ret = -EPERM;
+		goto out_unlock;
+	}
+	ret = __cgroup_bpf_replace(cg_link->cgroup, cg_link, new_prog);
+out_unlock:
+	mutex_unlock(&cgroup_mutex);
+	return ret;
+}
+
 int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
-		      enum bpf_attach_type type, u32 flags)
+		      enum bpf_attach_type type)
 {
 	int ret;
 
 	mutex_lock(&cgroup_mutex);
-	ret = __cgroup_bpf_detach(cgrp, prog, type);
+	ret = __cgroup_bpf_detach(cgrp, prog, NULL, type);
 	mutex_unlock(&cgroup_mutex);
 	return ret;
 }
+
 int cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 		     union bpf_attr __user *uattr)
 {
···
  * architecture dependent calling conventions. 7+ can be supported in the
  * future.
  */
+__diag_push();
+__diag_ignore(GCC, 8, "-Wmissing-prototypes",
+	      "Global functions as their definitions will be in vmlinux BTF");
 int noinline bpf_fentry_test1(int a)
 {
 	return a + 1;
···
 	*b += 1;
 	return a + *b;
 }
+__diag_pop();

 ALLOW_ERROR_INJECTION(bpf_modify_return_test, ERRNO);
+21-5
net/core/dev.c
···
  *	@dev: device
  *	@extack: netlink extended ack
  *	@fd: new program fd or negative value to clear
+ *	@expected_fd: old program fd that userspace expects to replace or clear
  *	@flags: xdp-related flags
  *
  *	Set or clear a bpf program for a device
  */
 int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
-		      int fd, u32 flags)
+		      int fd, int expected_fd, u32 flags)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
 	enum bpf_netdev_command query;
+	u32 prog_id, expected_id = 0;
 	struct bpf_prog *prog = NULL;
 	bpf_op_t bpf_op, bpf_chk;
 	bool offload;
···
 	if (bpf_op == bpf_chk)
 		bpf_chk = generic_xdp_install;

-	if (fd >= 0) {
-		u32 prog_id;
+	prog_id = __dev_xdp_query(dev, bpf_op, query);
+	if (flags & XDP_FLAGS_REPLACE) {
+		if (expected_fd >= 0) {
+			prog = bpf_prog_get_type_dev(expected_fd,
+						     BPF_PROG_TYPE_XDP,
+						     bpf_op == ops->ndo_bpf);
+			if (IS_ERR(prog))
+				return PTR_ERR(prog);
+			expected_id = prog->aux->id;
+			bpf_prog_put(prog);
+		}

+		if (prog_id != expected_id) {
+			NL_SET_ERR_MSG(extack, "Active program does not match expected");
+			return -EEXIST;
+		}
+	}
+	if (fd >= 0) {
 		if (!offload && __dev_xdp_query(dev, bpf_chk, XDP_QUERY_PROG)) {
 			NL_SET_ERR_MSG(extack, "native and generic XDP can't be active at the same time");
 			return -EEXIST;
 		}

-		prog_id = __dev_xdp_query(dev, bpf_op, query);
 		if ((flags & XDP_FLAGS_UPDATE_IF_NOEXIST) && prog_id) {
 			NL_SET_ERR_MSG(extack, "XDP program already attached");
 			return -EBUSY;
···
 			return 0;
 		}
 	} else {
-		if (!__dev_xdp_query(dev, bpf_op, query))
+		if (!prog_id)
 			return 0;
 	}
···
 }
 EXPORT_SYMBOL(sock_efree);

+/* Buffer destructor for prefetch/receive path where reference count may
+ * not be held, e.g. for listen sockets.
+ */
+#ifdef CONFIG_INET
+void sock_pfree(struct sk_buff *skb)
+{
+	if (sk_is_refcounted(skb->sk))
+		sock_gen_put(skb->sk);
+}
+EXPORT_SYMBOL(sock_pfree);
+#endif /* CONFIG_INET */
+
 kuid_t sock_i_uid(struct sock *sk)
 {
 	kuid_t uid;
+33
net/ipv4/bpf_tcp_ca.c
···
 #include <linux/btf.h>
 #include <linux/filter.h>
 #include <net/tcp.h>
+#include <net/bpf_sk_storage.h>

 static u32 optional_ops[] = {
 	offsetof(struct tcp_congestion_ops, init),
···
 static const struct btf_type *tcp_sock_type;
 static u32 tcp_sock_id, sock_id;

+static int btf_sk_storage_get_ids[5];
+static struct bpf_func_proto btf_sk_storage_get_proto __read_mostly;
+
+static int btf_sk_storage_delete_ids[5];
+static struct bpf_func_proto btf_sk_storage_delete_proto __read_mostly;
+
+static void convert_sk_func_proto(struct bpf_func_proto *to, int *to_btf_ids,
+				  const struct bpf_func_proto *from)
+{
+	int i;
+
+	*to = *from;
+	to->btf_id = to_btf_ids;
+	for (i = 0; i < ARRAY_SIZE(to->arg_type); i++) {
+		if (to->arg_type[i] == ARG_PTR_TO_SOCKET) {
+			to->arg_type[i] = ARG_PTR_TO_BTF_ID;
+			to->btf_id[i] = tcp_sock_id;
+		}
+	}
+}
+
 static int bpf_tcp_ca_init(struct btf *btf)
 {
 	s32 type_id;
···
 		return -EINVAL;
 	tcp_sock_id = type_id;
 	tcp_sock_type = btf_type_by_id(btf, tcp_sock_id);
+
+	convert_sk_func_proto(&btf_sk_storage_get_proto,
+			      btf_sk_storage_get_ids,
+			      &bpf_sk_storage_get_proto);
+	convert_sk_func_proto(&btf_sk_storage_delete_proto,
+			      btf_sk_storage_delete_ids,
+			      &bpf_sk_storage_delete_proto);

 	return 0;
 }
···
 	switch (func_id) {
 	case BPF_FUNC_tcp_send_ack:
 		return &bpf_tcp_send_ack_proto;
+	case BPF_FUNC_sk_storage_get:
+		return &btf_sk_storage_get_proto;
+	case BPF_FUNC_sk_storage_delete:
+		return &btf_sk_storage_delete_proto;
 	default:
 		return bpf_base_func_proto(func_id);
 	}
+2-1
net/ipv4/ip_input.c
···
 	IPCB(skb)->iif = skb->skb_iif;

 	/* Must drop socket now because of tproxy. */
-	skb_orphan(skb);
+	if (!skb_sk_is_prefetched(skb))
+		skb_orphan(skb);

 	return skb;
+76-76
net/ipv4/tcp_bpf.c
···
 #include <net/inet_common.h>
 #include <net/tls.h>

-static bool tcp_bpf_stream_read(const struct sock *sk)
-{
-	struct sk_psock *psock;
-	bool empty = true;
-
-	rcu_read_lock();
-	psock = sk_psock(sk);
-	if (likely(psock))
-		empty = list_empty(&psock->ingress_msg);
-	rcu_read_unlock();
-	return !empty;
-}
-
-static int tcp_bpf_wait_data(struct sock *sk, struct sk_psock *psock,
-			     int flags, long timeo, int *err)
-{
-	DEFINE_WAIT_FUNC(wait, woken_wake_function);
-	int ret = 0;
-
-	if (!timeo)
-		return ret;
-
-	add_wait_queue(sk_sleep(sk), &wait);
-	sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk);
-	ret = sk_wait_event(sk, &timeo,
-			    !list_empty(&psock->ingress_msg) ||
-			    !skb_queue_empty(&sk->sk_receive_queue), &wait);
-	sk_clear_bit(SOCKWQ_ASYNC_WAITDATA, sk);
-	remove_wait_queue(sk_sleep(sk), &wait);
-	return ret;
-}
-
 int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock,
 		      struct msghdr *msg, int len, int flags)
 {
···
 	return copied;
 }
 EXPORT_SYMBOL_GPL(__tcp_bpf_recvmsg);
-
-int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
-		    int nonblock, int flags, int *addr_len)
-{
-	struct sk_psock *psock;
-	int copied, ret;
-
-	psock = sk_psock_get(sk);
-	if (unlikely(!psock))
-		return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
-	if (unlikely(flags & MSG_ERRQUEUE))
-		return inet_recv_error(sk, msg, len, addr_len);
-	if (!skb_queue_empty(&sk->sk_receive_queue) &&
-	    sk_psock_queue_empty(psock))
-		return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
-	lock_sock(sk);
-msg_bytes_ready:
-	copied = __tcp_bpf_recvmsg(sk, psock, msg, len, flags);
-	if (!copied) {
-		int data, err = 0;
-		long timeo;
-
-		timeo = sock_rcvtimeo(sk, nonblock);
-		data = tcp_bpf_wait_data(sk, psock, flags, timeo, &err);
-		if (data) {
-			if (!sk_psock_queue_empty(psock))
-				goto msg_bytes_ready;
-			release_sock(sk);
-			sk_psock_put(sk, psock);
-			return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
-		}
-		if (err) {
-			ret = err;
-			goto out;
-		}
-		copied = -EAGAIN;
-	}
-	ret = copied;
-out:
-	release_sock(sk);
-	sk_psock_put(sk, psock);
-	return ret;
-}

 static int bpf_tcp_ingress(struct sock *sk, struct sk_psock *psock,
 			   struct sk_msg *msg, u32 apply_bytes, int flags)
···
 	return ret;
 }
 EXPORT_SYMBOL_GPL(tcp_bpf_sendmsg_redir);
+
+#ifdef CONFIG_BPF_STREAM_PARSER
+static bool tcp_bpf_stream_read(const struct sock *sk)
+{
+	struct sk_psock *psock;
+	bool empty = true;
+
+	rcu_read_lock();
+	psock = sk_psock(sk);
+	if (likely(psock))
+		empty = list_empty(&psock->ingress_msg);
+	rcu_read_unlock();
+	return !empty;
+}
+
+static int tcp_bpf_wait_data(struct sock *sk, struct sk_psock *psock,
+			     int flags, long timeo, int *err)
+{
+	DEFINE_WAIT_FUNC(wait, woken_wake_function);
+	int ret = 0;
+
+	if (!timeo)
+		return ret;
+
+	add_wait_queue(sk_sleep(sk), &wait);
+	sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk);
+	ret = sk_wait_event(sk, &timeo,
+			    !list_empty(&psock->ingress_msg) ||
+			    !skb_queue_empty(&sk->sk_receive_queue), &wait);
+	sk_clear_bit(SOCKWQ_ASYNC_WAITDATA, sk);
+	remove_wait_queue(sk_sleep(sk), &wait);
+	return ret;
+}
+
+static int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
+			   int nonblock, int flags, int *addr_len)
+{
+	struct sk_psock *psock;
+	int copied, ret;
+
+	psock = sk_psock_get(sk);
+	if (unlikely(!psock))
+		return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
+	if (unlikely(flags & MSG_ERRQUEUE))
+		return inet_recv_error(sk, msg, len, addr_len);
+	if (!skb_queue_empty(&sk->sk_receive_queue) &&
+	    sk_psock_queue_empty(psock))
+		return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
+	lock_sock(sk);
+msg_bytes_ready:
+	copied = __tcp_bpf_recvmsg(sk, psock, msg, len, flags);
+	if (!copied) {
+		int data, err = 0;
+		long timeo;
+
+		timeo = sock_rcvtimeo(sk, nonblock);
+		data = tcp_bpf_wait_data(sk, psock, flags, timeo, &err);
+		if (data) {
+			if (!sk_psock_queue_empty(psock))
+				goto msg_bytes_ready;
+			release_sock(sk);
+			sk_psock_put(sk, psock);
+			return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
+		}
+		if (err) {
+			ret = err;
+			goto out;
+		}
+		copied = -EAGAIN;
+	}
+	ret = copied;
+out:
+	release_sock(sk);
+	sk_psock_put(sk, psock);
+	return ret;
+}

 static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 				struct sk_msg *msg, int *copied, int flags)
···
 	return copied ? copied : err;
 }

-#ifdef CONFIG_BPF_STREAM_PARSER
 enum {
 	TCP_BPF_IPV4,
 	TCP_BPF_IPV6,
···
 	rcu_read_unlock();

 	/* Must drop socket now because of tproxy. */
-	skb_orphan(skb);
+	if (!skb_sk_is_prefetched(skb))
+		skb_orphan(skb);

 	return skb;
 err:
+6-3
net/ipv6/udp.c
···
 	struct net *net = dev_net(skb->dev);
 	struct udphdr *uh;
 	struct sock *sk;
+	bool refcounted;
 	u32 ulen = 0;

 	if (!pskb_may_pull(skb, sizeof(struct udphdr)))
···
 		goto csum_error;

 	/* Check if the socket is already available, e.g. due to early demux */
-	sk = skb_steal_sock(skb);
+	sk = skb_steal_sock(skb, &refcounted);
 	if (sk) {
 		struct dst_entry *dst = skb_dst(skb);
 		int ret;
···
 			udp6_sk_rx_dst_set(sk, dst);

 		if (!uh->check && !udp_sk(sk)->no_check6_rx) {
-			sock_put(sk);
+			if (refcounted)
+				sock_put(sk);
 			goto report_csum_error;
 		}

 		ret = udp6_unicast_rcv_skb(sk, skb, uh);
-		sock_put(sk);
+		if (refcounted)
+			sock_put(sk);
 		return ret;
 	}
+3
net/sched/act_bpf.c
···
 #include <linux/bpf.h>

 #include <net/netlink.h>
+#include <net/sock.h>
 #include <net/pkt_sched.h>
 #include <net/pkt_cls.h>
···
 		bpf_compute_data_pointers(skb);
 		filter_res = BPF_PROG_RUN(filter, skb);
 	}
+	if (skb_sk_is_prefetched(skb) && filter_res != TC_ACT_OK)
+		skb_orphan(skb);
 	rcu_read_unlock();

 	/* A BPF program may overwrite the default action opcode.
···
 #include <bpf/bpf.h>
 #include "bpf_load.h"
 #include <sys/resource.h>
+#include "trace_helpers.h"

 /* install fake seccomp program to enable seccomp code path inside the kernel,
  * so that our kprobe attached to seccomp_phase1() can be triggered
+10-14
scripts/link-vmlinux.sh
···
 gen_btf()
 {
 	local pahole_ver
-	local bin_arch
-	local bin_format
-	local bin_file

 	if ! [ -x "$(command -v ${PAHOLE})" ]; then
 		echo >&2 "BTF: ${1}: pahole (${PAHOLE}) is not available"
···
 	info "BTF" ${2}
 	LLVM_OBJCOPY=${OBJCOPY} ${PAHOLE} -J ${1}

-	# dump .BTF section into raw binary file to link with final vmlinux
-	bin_arch=$(LANG=C ${OBJDUMP} -f ${1} | grep architecture | \
-		cut -d, -f1 | cut -d' ' -f2)
-	bin_format=$(LANG=C ${OBJDUMP} -f ${1} | grep 'file format' | \
-		awk '{print $4}')
-	bin_file=.btf.vmlinux.bin
-	${OBJCOPY} --change-section-address .BTF=0 \
-		--set-section-flags .BTF=alloc -O binary \
-		--only-section=.BTF ${1} $bin_file
-	${OBJCOPY} -I binary -O ${bin_format} -B ${bin_arch} \
-		--rename-section .data=.BTF $bin_file ${2}
+	# Create ${2} which contains just .BTF section but no symbols. Add
+	# SHF_ALLOC because .BTF will be part of the vmlinux image. --strip-all
+	# deletes all symbols including __start_BTF and __stop_BTF, which will
+	# be redefined in the linker script. Add 2>/dev/null to suppress GNU
+	# objcopy warnings: "empty loadable segment detected at ..."
+	${OBJCOPY} --only-section=.BTF --set-section-flags .BTF=alloc,readonly \
+		--strip-all ${1} ${2} 2>/dev/null
+	# Change e_type to ET_REL so that it can be used to link final vmlinux.
+	# Unlike GNU ld, lld does not allow an ET_EXEC input.
+	printf '\1' | dd of=${2} conv=notrunc bs=1 seek=16 status=none
 }

 # Create ${2} .o file with all symbols from the ${1} object file
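The `printf '\1' | dd ...` trick above patches a single byte at offset 16 of the output file, which is where the ELF header stores `e_type` (ET_REL is 1). A standalone sketch of just the byte-patching mechanics, using a hypothetical filler file rather than a real ELF object:

```shell
# Write 20 filler bytes, then overwrite only byte 16 in place, as gen_btf()
# does for e_type. conv=notrunc keeps the rest of the file intact.
printf 'AAAAAAAAAAAAAAAAAAAA' > /tmp/fake_elf.bin
printf '\1' | dd of=/tmp/fake_elf.bin conv=notrunc bs=1 seek=16 status=none
# Inspect the patched byte (decimal 1) and confirm the length is unchanged.
od -An -tu1 -j16 -N1 /tmp/fake_elf.bin
wc -c < /tmp/fake_elf.bin
```

`status=none` is a GNU dd option (also used by the script itself); drop it on dd implementations that lack it.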
+5-5
security/Kconfig
···
 config LSM
 	string "Ordered list of enabled LSMs"
-	default "lockdown,yama,loadpin,safesetid,integrity,smack,selinux,tomoyo,apparmor" if DEFAULT_SECURITY_SMACK
-	default "lockdown,yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo" if DEFAULT_SECURITY_APPARMOR
-	default "lockdown,yama,loadpin,safesetid,integrity,tomoyo" if DEFAULT_SECURITY_TOMOYO
-	default "lockdown,yama,loadpin,safesetid,integrity" if DEFAULT_SECURITY_DAC
-	default "lockdown,yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
+	default "lockdown,yama,loadpin,safesetid,integrity,smack,selinux,tomoyo,apparmor,bpf" if DEFAULT_SECURITY_SMACK
+	default "lockdown,yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo,bpf" if DEFAULT_SECURITY_APPARMOR
+	default "lockdown,yama,loadpin,safesetid,integrity,tomoyo,bpf" if DEFAULT_SECURITY_TOMOYO
+	default "lockdown,yama,loadpin,safesetid,integrity,bpf" if DEFAULT_SECURITY_DAC
+	default "lockdown,yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor,bpf"
 	help
 	  A comma-separated list of LSMs, in initialization order.
 	  Any LSMs left off this list will be ignored. This can be
···
==================
bpftool-struct_ops
==================
-------------------------------------------------------------------------------
tool to register/unregister/introspect BPF struct_ops
-------------------------------------------------------------------------------

:Manual section: 8

SYNOPSIS
========

	**bpftool** [*OPTIONS*] **struct_ops** *COMMAND*

	*OPTIONS* := { { **-j** | **--json** } [{ **-p** | **--pretty** }] }

	*COMMANDS* :=
	{ **show** | **list** | **dump** | **register** | **unregister** | **help** }

STRUCT_OPS COMMANDS
===================

|	**bpftool** **struct_ops { show | list }** [*STRUCT_OPS_MAP*]
|	**bpftool** **struct_ops dump** [*STRUCT_OPS_MAP*]
|	**bpftool** **struct_ops register** *OBJ*
|	**bpftool** **struct_ops unregister** *STRUCT_OPS_MAP*
|	**bpftool** **struct_ops help**
|
|	*STRUCT_OPS_MAP* := { **id** *STRUCT_OPS_MAP_ID* | **name** *STRUCT_OPS_MAP_NAME* }
|	*OBJ* := /a/file/of/bpf_struct_ops.o


DESCRIPTION
===========
	**bpftool struct_ops { show | list }** [*STRUCT_OPS_MAP*]
		  Show brief information about the struct_ops in the system.
		  If *STRUCT_OPS_MAP* is specified, it shows information only
		  for the given struct_ops.  Otherwise, it lists all struct_ops
		  currently existing in the system.

		  Output will start with the struct_ops map ID, followed by its
		  map name and its struct_ops's kernel type.

	**bpftool struct_ops dump** [*STRUCT_OPS_MAP*]
		  Dump detailed information about the struct_ops in the system.
		  If *STRUCT_OPS_MAP* is specified, it dumps information only
		  for the given struct_ops.  Otherwise, it dumps all struct_ops
		  currently existing in the system.

	**bpftool struct_ops register** *OBJ*
		  Register bpf struct_ops from *OBJ*.  All struct_ops under
		  the ELF section ".struct_ops" will be registered to
		  its kernel subsystem.

	**bpftool struct_ops unregister**  *STRUCT_OPS_MAP*
		  Unregister the *STRUCT_OPS_MAP* from the kernel subsystem.

	**bpftool struct_ops help**
		  Print short help message.

OPTIONS
=======
	-h, --help
		  Print short generic help message (similar to **bpftool help**).

	-V, --version
		  Print version number (similar to **bpftool version**).

	-j, --json
		  Generate JSON output. For commands that cannot produce JSON, this
		  option has no effect.

	-p, --pretty
		  Generate human-readable JSON output. Implies **-j**.

	-d, --debug
		  Print all logs available, even debug-level information. This
		  includes logs from libbpf as well as from the verifier, when
		  attempting to load programs.

EXAMPLES
========
**# bpftool struct_ops show**

::

    100: dctcp           tcp_congestion_ops
    105: cubic           tcp_congestion_ops

**# bpftool struct_ops unregister id 105**

::

   Unregistered tcp_congestion_ops cubic id 105

**# bpftool struct_ops register bpf_cubic.o**

::

   Registered tcp_congestion_ops cubic id 110


SEE ALSO
========
	**bpf**\ (2),
	**bpf-helpers**\ (7),
	**bpftool**\ (8),
	**bpftool-prog**\ (8),
	**bpftool-map**\ (8),
	**bpftool-cgroup**\ (8),
	**bpftool-feature**\ (8),
	**bpftool-net**\ (8),
	**bpftool-perf**\ (8),
	**bpftool-btf**\ (8),
	**bpftool-gen**\ (8)
+28
tools/bpf/bpftool/bash-completion/bpftool
···
 			;;
 		esac
 		;;
+	struct_ops)
+		local STRUCT_OPS_TYPE='id name'
+		case $command in
+		show|list|dump|unregister)
+			case $prev in
+			$command)
+				COMPREPLY=( $( compgen -W "$STRUCT_OPS_TYPE" -- "$cur" ) )
+				;;
+			id)
+				_bpftool_get_map_ids_for_type struct_ops
+				;;
+			name)
+				_bpftool_get_map_names_for_type struct_ops
+				;;
+			esac
+			return 0
+			;;
+		register)
+			_filedir
+			return 0
+			;;
+		*)
+			[[ $prev == $object ]] && \
+				COMPREPLY=( $( compgen -W 'register unregister show list dump help' \
+					-- "$cur" ) )
+			;;
+		esac
+		;;
 	map)
 		local MAP_TYPE='id pinned name'
 		case $command in
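The completion logic above repeatedly uses `compgen -W wordlist -- $cur` to filter candidates against the user's partial word and store the matches in `COMPREPLY`. A minimal demonstration of that pattern outside the completion machinery (run through bash explicitly, since `compgen` is a bash builtin; the partial word `i` is illustrative):

```shell
# compgen -W returns only the words that match the current prefix,
# mirroring how $cur narrows the struct_ops subcommand candidates.
bash -c 'cur="i"; COMPREPLY=( $(compgen -W "id name" -- "$cur") ); echo "${COMPREPLY[@]}"'
```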
+183-16
tools/bpf/bpftool/btf_dumper.c
···
 #include <ctype.h>
 #include <stdio.h> /* for (FILE *) used by json_writer */
 #include <string.h>
+#include <unistd.h>
 #include <asm/byteorder.h>
 #include <linux/bitops.h>
 #include <linux/btf.h>
 #include <linux/err.h>
 #include <bpf/btf.h>
+#include <bpf/bpf.h>

 #include "json_writer.h"
 #include "main.h"
···
 static int btf_dumper_do_type(const struct btf_dumper *d, __u32 type_id,
 			      __u8 bit_offset, const void *data);

-static void btf_dumper_ptr(const void *data, json_writer_t *jw,
-			   bool is_plain_text)
+static int btf_dump_func(const struct btf *btf, char *func_sig,
+			 const struct btf_type *func_proto,
+			 const struct btf_type *func, int pos, int size);
+
+static int dump_prog_id_as_func_ptr(const struct btf_dumper *d,
+				    const struct btf_type *func_proto,
+				    __u32 prog_id)
 {
-	if (is_plain_text)
-		jsonw_printf(jw, "%p", *(void **)data);
+	struct bpf_prog_info_linear *prog_info = NULL;
+	const struct btf_type *func_type;
+	const char *prog_name = NULL;
+	struct bpf_func_info *finfo;
+	struct btf *prog_btf = NULL;
+	struct bpf_prog_info *info;
+	int prog_fd, func_sig_len;
+	char prog_str[1024];
+
+	/* Get the ptr's func_proto */
+	func_sig_len = btf_dump_func(d->btf, prog_str, func_proto, NULL, 0,
+				     sizeof(prog_str));
+	if (func_sig_len == -1)
+		return -1;
+
+	if (!prog_id)
+		goto print;
+
+	/* Get the bpf_prog's name.  Obtain from func_info. */
+	prog_fd = bpf_prog_get_fd_by_id(prog_id);
+	if (prog_fd == -1)
+		goto print;
+
+	prog_info = bpf_program__get_prog_info_linear(prog_fd,
+						      1UL << BPF_PROG_INFO_FUNC_INFO);
+	close(prog_fd);
+	if (IS_ERR(prog_info)) {
+		prog_info = NULL;
+		goto print;
+	}
+	info = &prog_info->info;
+
+	if (!info->btf_id || !info->nr_func_info ||
+	    btf__get_from_id(info->btf_id, &prog_btf))
+		goto print;
+	finfo = (struct bpf_func_info *)info->func_info;
+	func_type = btf__type_by_id(prog_btf, finfo->type_id);
+	if (!func_type || !btf_is_func(func_type))
+		goto print;
+
+	prog_name = btf__name_by_offset(prog_btf, func_type->name_off);
+
+print:
+	if (!prog_id)
+		snprintf(&prog_str[func_sig_len],
+			 sizeof(prog_str) - func_sig_len, " 0");
+	else if (prog_name)
+		snprintf(&prog_str[func_sig_len],
+			 sizeof(prog_str) - func_sig_len,
+			 " %s/prog_id:%u", prog_name, prog_id);
 	else
-		jsonw_printf(jw, "%lu", *(unsigned long *)data);
+		snprintf(&prog_str[func_sig_len],
+			 sizeof(prog_str) - func_sig_len,
+			 " <unknown_prog_name>/prog_id:%u", prog_id);
+
+	prog_str[sizeof(prog_str) - 1] = '\0';
+	jsonw_string(d->jw, prog_str);
+	btf__free(prog_btf);
+	free(prog_info);
+	return 0;
+}
+
+static void btf_dumper_ptr(const struct btf_dumper *d,
+			   const struct btf_type *t,
+			   const void *data)
+{
+	unsigned long value = *(unsigned long *)data;
+	const struct btf_type *ptr_type;
+	__s32 ptr_type_id;
+
+	if (!d->prog_id_as_func_ptr || value > UINT32_MAX)
+		goto print_ptr_value;
+
+	ptr_type_id = btf__resolve_type(d->btf, t->type);
+	if (ptr_type_id < 0)
+		goto print_ptr_value;
+	ptr_type = btf__type_by_id(d->btf, ptr_type_id);
+	if (!ptr_type || !btf_is_func_proto(ptr_type))
+		goto print_ptr_value;
+
+	if (!dump_prog_id_as_func_ptr(d, ptr_type, value))
+		return;
+
+print_ptr_value:
+	if (d->is_plain_text)
+		jsonw_printf(d->jw, "%p", (void *)value);
+	else
+		jsonw_printf(d->jw, "%lu", value);
 }

 static int btf_dumper_modifier(const struct btf_dumper *d, __u32 type_id,
···
 	return btf_dumper_do_type(d, actual_type_id, bit_offset, data);
 }

-static void btf_dumper_enum(const void *data, json_writer_t *jw)
+static int btf_dumper_enum(const struct btf_dumper *d,
+			   const struct btf_type *t,
+			   const void *data)
 {
-	jsonw_printf(jw, "%d", *(int *)data);
+	const struct btf_enum *enums = btf_enum(t);
+	__s64 value;
+	__u16 i;
+
+	switch (t->size) {
+	case 8:
+		value = *(__s64 *)data;
+		break;
+	case 4:
+		value = *(__s32 *)data;
+		break;
+	case 2:
+		value = *(__s16 *)data;
+		break;
+	case 1:
+		value = *(__s8 *)data;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	for (i = 0; i < btf_vlen(t); i++) {
+		if (value == enums[i].val) {
+			jsonw_string(d->jw,
+				     btf__name_by_offset(d->btf,
+							 enums[i].name_off));
+			return 0;
+		}
+	}
+
+	jsonw_int(d->jw, value);
+	return 0;
+}
+
+static bool is_str_array(const struct btf *btf, const struct btf_array *arr,
+			 const char *s)
+{
+	const struct btf_type *elem_type;
+	const char *end_s;
+
+	if (!arr->nelems)
+		return false;
+
+	elem_type = btf__type_by_id(btf, arr->type);
+	/* Not skipping typedef.  typedef to char does not count as
+	 * a string now.
+	 */
+	while (elem_type && btf_is_mod(elem_type))
+		elem_type = btf__type_by_id(btf, elem_type->type);
+
+	if (!elem_type || !btf_is_int(elem_type) || elem_type->size != 1)
+		return false;
+
+	if (btf_int_encoding(elem_type) != BTF_INT_CHAR &&
+	    strcmp("char", btf__name_by_offset(btf, elem_type->name_off)))
+		return false;
+
+	end_s = s + arr->nelems;
+	while (s < end_s) {
+		if (!*s)
+			return true;
+		if (*s <= 0x1f || *s >= 0x7f)
+			return false;
+		s++;
+	}
+
+	/* '\0' is not found */
+	return false;
 }

 static int btf_dumper_array(const struct btf_dumper *d, __u32 type_id,
···
 	long long elem_size;
 	int ret = 0;
 	__u32 i;
+
+	if (is_str_array(d->btf, arr, data)) {
+		jsonw_string(d->jw, data);
+		return 0;
+	}

 	elem_size = btf__resolve_size(d->btf, arr->type);
 	if (elem_size < 0)
···
 	case BTF_KIND_ARRAY:
 		return btf_dumper_array(d, type_id, data);
 	case BTF_KIND_ENUM:
-		btf_dumper_enum(data, d->jw);
-		return 0;
+		return btf_dumper_enum(d, t, data);
 	case BTF_KIND_PTR:
-		btf_dumper_ptr(data, d->jw, d->is_plain_text);
+		btf_dumper_ptr(d, t, data);
 		return 0;
 	case BTF_KIND_UNKN:
 		jsonw_printf(d->jw, "(unknown)");
···
 		if (pos == -1) \
 			return -1; \
 	} while (0)
-
-static int btf_dump_func(const struct btf *btf, char *func_sig,
-			 const struct btf_type *func_proto,
-			 const struct btf_type *func, int pos, int size);

 static int __btf_dumper_type_only(const struct btf *btf, __u32 type_id,
 				  char *func_sig, int pos, int size)
···
 			BTF_PRINT_ARG(", ");
 		if (arg->type) {
 			BTF_PRINT_TYPE(arg->type);
-			BTF_PRINT_ARG("%s",
-				      btf__name_by_offset(btf, arg->name_off));
+			if (arg->name_off)
+				BTF_PRINT_ARG("%s",
+					      btf__name_by_offset(btf, arg->name_off));
+			else if (pos && func_sig[pos - 1] == ' ')
+				/* Remove unnecessary space for
+				 * FUNC_PROTO that does not have
+				 * arg->name_off
+				 */
+				func_sig[--pos] = '\0';
 		} else {
 			BTF_PRINT_ARG("...");
 		}
···
 int do_feature(int argc, char **argv);
 int do_btf(int argc, char **argv);
 int do_gen(int argc, char **argv);
+int do_struct_ops(int argc, char **argv);

 int parse_u32_arg(int *argc, char ***argv, __u32 *val, const char *what);
 int prog_parse_fd(int *argc, char ***argv);
···
 	const struct btf *btf;
 	json_writer_t *jw;
 	bool is_plain_text;
+	bool prog_id_as_func_ptr;
 };

 /* btf_dumper_type - print data along with type information
+596
tools/bpf/bpftool/struct_ops.c
···
// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
/* Copyright (C) 2020 Facebook */

#include <errno.h>
#include <stdio.h>
#include <unistd.h>

#include <linux/err.h>

#include <bpf/bpf.h>
#include <bpf/btf.h>
#include <bpf/libbpf.h>

#include "json_writer.h"
#include "main.h"

#define STRUCT_OPS_VALUE_PREFIX "bpf_struct_ops_"

static const struct btf_type *map_info_type;
static __u32 map_info_alloc_len;
static struct btf *btf_vmlinux;
static __s32 map_info_type_id;

struct res {
	unsigned int nr_maps;
	unsigned int nr_errs;
};

static const struct btf *get_btf_vmlinux(void)
{
	if (btf_vmlinux)
		return btf_vmlinux;

	btf_vmlinux = libbpf_find_kernel_btf();
	if (IS_ERR(btf_vmlinux))
		p_err("struct_ops requires kernel CONFIG_DEBUG_INFO_BTF=y");

	return btf_vmlinux;
}

static const char *get_kern_struct_ops_name(const struct bpf_map_info *info)
{
	const struct btf *kern_btf;
	const struct btf_type *t;
	const char *st_ops_name;

	kern_btf = get_btf_vmlinux();
	if (IS_ERR(kern_btf))
		return "<btf_vmlinux_not_found>";

	t = btf__type_by_id(kern_btf, info->btf_vmlinux_value_type_id);
	st_ops_name = btf__name_by_offset(kern_btf, t->name_off);
	st_ops_name += strlen(STRUCT_OPS_VALUE_PREFIX);

	return st_ops_name;
}

static __s32 get_map_info_type_id(void)
{
	const struct btf *kern_btf;

	if (map_info_type_id)
		return map_info_type_id;

	kern_btf = get_btf_vmlinux();
	if (IS_ERR(kern_btf)) {
		map_info_type_id = PTR_ERR(kern_btf);
		return map_info_type_id;
	}

	map_info_type_id = btf__find_by_name_kind(kern_btf, "bpf_map_info",
						  BTF_KIND_STRUCT);
	if (map_info_type_id < 0) {
		p_err("can't find bpf_map_info from btf_vmlinux");
		return map_info_type_id;
	}
	map_info_type = btf__type_by_id(kern_btf, map_info_type_id);

	/* Ensure map_info_alloc() has at least what the bpftool needs */
	map_info_alloc_len = map_info_type->size;
	if (map_info_alloc_len < sizeof(struct bpf_map_info))
		map_info_alloc_len = sizeof(struct bpf_map_info);

	return map_info_type_id;
}

/* If the subcmd needs to print out the bpf_map_info,
 * it should always call map_info_alloc to allocate
 * a bpf_map_info object instead of allocating it
 * on the stack.
 *
 * map_info_alloc() will take the running kernel's btf
 * into account.  i.e. it will consider the
 * sizeof(struct bpf_map_info) of the running kernel.
 *
 * It will enable the "struct_ops" cmd to print the latest
 * "struct bpf_map_info".
 *
 * [ Recall that "struct_ops" requires the kernel's btf to
 *   be available ]
 */
static struct bpf_map_info *map_info_alloc(__u32 *alloc_len)
{
	struct bpf_map_info *info;

	if (get_map_info_type_id() < 0)
		return NULL;

	info = calloc(1, map_info_alloc_len);
	if (!info)
		p_err("mem alloc failed");
	else
		*alloc_len = map_info_alloc_len;

	return info;
}

/* It iterates all struct_ops maps of the system.
 * It returns the fd in "*res_fd" and map_info in "*info".
 * In the very first iteration, info->id should be 0.
 * An optional map "*name" filter can be specified.
 * The filter can be made more flexible in the future.
 * e.g. filter by kernel-struct-ops-name, regex-name, glob-name, ...etc.
 *
 * Return value:
 *     1: A struct_ops map found.  It is returned in "*res_fd" and "*info".
 *        The caller can continue to call get_next in the future.
 *     0: No struct_ops map is returned.
 *        All struct_ops map has been found.
 *    -1: Error and the caller should abort the iteration.
 */
static int get_next_struct_ops_map(const char *name, int *res_fd,
				   struct bpf_map_info *info, __u32 info_len)
{
	__u32 id = info->id;
	int err, fd;

	while (true) {
		err = bpf_map_get_next_id(id, &id);
		if (err) {
			if (errno == ENOENT)
				return 0;
			p_err("can't get next map: %s", strerror(errno));
			return -1;
		}

		fd = bpf_map_get_fd_by_id(id);
		if (fd < 0) {
			if (errno == ENOENT)
				continue;
			p_err("can't get map by id (%u): %s",
			      id, strerror(errno));
			return -1;
		}

		err = bpf_obj_get_info_by_fd(fd, info, &info_len);
		if (err) {
			p_err("can't get map info: %s", strerror(errno));
			close(fd);
			return -1;
		}

		if (info->type == BPF_MAP_TYPE_STRUCT_OPS &&
		    (!name || !strcmp(name, info->name))) {
			*res_fd = fd;
			return 1;
		}
		close(fd);
	}
}

static int cmd_retval(const struct res *res, bool must_have_one_map)
{
	if (res->nr_errs || (!res->nr_maps && must_have_one_map))
		return -1;

	return 0;
}

/* "data" is the work_func private storage */
typedef int (*work_func)(int fd, const struct bpf_map_info *info, void *data,
			 struct json_writer *wtr);

/* Find all struct_ops map in the system.
 * Filter out by "name" (if specified).
 * Then call "func(fd, info, data, wtr)" on each struct_ops map found.
 */
static struct res do_search(const char *name, work_func func, void *data,
			    struct json_writer *wtr)
{
	struct bpf_map_info *info;
	struct res res = {};
	__u32 info_len;
	int fd, err;

	info = map_info_alloc(&info_len);
	if (!info) {
		res.nr_errs++;
		return res;
	}

	if (wtr)
		jsonw_start_array(wtr);
	while ((err = get_next_struct_ops_map(name, &fd, info, info_len)) == 1) {
		res.nr_maps++;
		err = func(fd, info, data, wtr);
		if (err)
			res.nr_errs++;
		close(fd);
	}
	if (wtr)
		jsonw_end_array(wtr);

	if (err)
		res.nr_errs++;

	if (!wtr && name && !res.nr_errs && !res.nr_maps)
		/* It is not printing empty [].
		 * Thus, needs to specifically say nothing found
		 * for "name" here.
		 */
		p_err("no struct_ops found for %s", name);
	else if (!wtr && json_output && !res.nr_errs)
		/* The "func()" above is not writing any json (i.e. !wtr
		 * test here).
		 *
		 * However, "-j" is enabled and there is no errs here,
		 * so call json_null() as the current convention of
		 * other cmds.
		 */
		jsonw_null(json_wtr);

	free(info);
	return res;
}

static struct res do_one_id(const char *id_str, work_func func, void *data,
			    struct json_writer *wtr)
{
	struct bpf_map_info *info;
	struct res res = {};
	unsigned long id;
	__u32 info_len;
	char *endptr;
	int fd;

	id = strtoul(id_str, &endptr, 0);
	if (*endptr || !id || id > UINT32_MAX) {
		p_err("invalid id %s", id_str);
		res.nr_errs++;
		return res;
	}

	fd = bpf_map_get_fd_by_id(id);
	if (fd == -1) {
		p_err("can't get map by id (%lu): %s", id, strerror(errno));
		res.nr_errs++;
		return res;
	}

	info = map_info_alloc(&info_len);
	if (!info) {
		res.nr_errs++;
		goto done;
	}

	if (bpf_obj_get_info_by_fd(fd, info, &info_len)) {
		p_err("can't get map info: %s", strerror(errno));
		res.nr_errs++;
		goto done;
	}

	if (info->type != BPF_MAP_TYPE_STRUCT_OPS) {
		p_err("%s id %u is not a struct_ops map", info->name, info->id);
		res.nr_errs++;
		goto done;
	}

	res.nr_maps++;

	if (func(fd, info, data, wtr))
		res.nr_errs++;
	else if (!wtr && json_output)
		/* The "func()" above is not writing any json (i.e. !wtr
		 * test here).
		 *
		 * However, "-j" is enabled and there is no errs here,
		 * so call json_null() as the current convention of
		 * other cmds.
		 */
		jsonw_null(json_wtr);

done:
	free(info);
	close(fd);

	return res;
}

static struct res do_work_on_struct_ops(const char *search_type,
					const char *search_term,
					work_func func, void *data,
					struct json_writer *wtr)
{
	if (search_type) {
		if (is_prefix(search_type, "id"))
			return do_one_id(search_term, func, data, wtr);
		else if (!is_prefix(search_type, "name"))
			usage();
	}

	return do_search(search_term, func, data, wtr);
}

static int __do_show(int fd, const struct bpf_map_info *info, void *data,
		     struct json_writer *wtr)
{
	if (wtr) {
		jsonw_start_object(wtr);
		jsonw_uint_field(wtr, "id", info->id);
		jsonw_string_field(wtr, "name", info->name);
		jsonw_string_field(wtr, "kernel_struct_ops",
				   get_kern_struct_ops_name(info));
		jsonw_end_object(wtr);
	} else {
		printf("%u: %-15s %-32s\n", info->id, info->name,
		       get_kern_struct_ops_name(info));
	}

	return 0;
}

static int do_show(int argc, char **argv)
{
	const char *search_type = NULL, *search_term = NULL;
	struct res res;

	if (argc && argc != 2)
		usage();

	if (argc == 2) {
search_type = GET_ARG();343343+ search_term = GET_ARG();344344+ }345345+346346+ res = do_work_on_struct_ops(search_type, search_term, __do_show,347347+ NULL, json_wtr);348348+349349+ return cmd_retval(&res, !!search_term);350350+}351351+352352+static int __do_dump(int fd, const struct bpf_map_info *info, void *data,353353+ struct json_writer *wtr)354354+{355355+ struct btf_dumper *d = (struct btf_dumper *)data;356356+ const struct btf_type *struct_ops_type;357357+ const struct btf *kern_btf = d->btf;358358+ const char *struct_ops_name;359359+ int zero = 0;360360+ void *value;361361+362362+ /* note: d->jw == wtr */363363+364364+ kern_btf = d->btf;365365+366366+ /* The kernel supporting BPF_MAP_TYPE_STRUCT_OPS must have367367+ * btf_vmlinux_value_type_id.368368+ */369369+ struct_ops_type = btf__type_by_id(kern_btf,370370+ info->btf_vmlinux_value_type_id);371371+ struct_ops_name = btf__name_by_offset(kern_btf,372372+ struct_ops_type->name_off);373373+ value = calloc(1, info->value_size);374374+ if (!value) {375375+ p_err("mem alloc failed");376376+ return -1;377377+ }378378+379379+ if (bpf_map_lookup_elem(fd, &zero, value)) {380380+ p_err("can't lookup struct_ops map %s id %u",381381+ info->name, info->id);382382+ free(value);383383+ return -1;384384+ }385385+386386+ jsonw_start_object(wtr);387387+ jsonw_name(wtr, "bpf_map_info");388388+ btf_dumper_type(d, map_info_type_id, (void *)info);389389+ jsonw_end_object(wtr);390390+391391+ jsonw_start_object(wtr);392392+ jsonw_name(wtr, struct_ops_name);393393+ btf_dumper_type(d, info->btf_vmlinux_value_type_id, value);394394+ jsonw_end_object(wtr);395395+396396+ free(value);397397+398398+ return 0;399399+}400400+401401+static int do_dump(int argc, char **argv)402402+{403403+ const char *search_type = NULL, *search_term = NULL;404404+ json_writer_t *wtr = json_wtr;405405+ const struct btf *kern_btf;406406+ struct btf_dumper d = {};407407+ struct res res;408408+409409+ if (argc && argc != 2)410410+ usage();411411+412412+ if 
(argc == 2) {413413+ search_type = GET_ARG();414414+ search_term = GET_ARG();415415+ }416416+417417+ kern_btf = get_btf_vmlinux();418418+ if (IS_ERR(kern_btf))419419+ return -1;420420+421421+ if (!json_output) {422422+ wtr = jsonw_new(stdout);423423+ if (!wtr) {424424+ p_err("can't create json writer");425425+ return -1;426426+ }427427+ jsonw_pretty(wtr, true);428428+ }429429+430430+ d.btf = kern_btf;431431+ d.jw = wtr;432432+ d.is_plain_text = !json_output;433433+ d.prog_id_as_func_ptr = true;434434+435435+ res = do_work_on_struct_ops(search_type, search_term, __do_dump, &d,436436+ wtr);437437+438438+ if (!json_output)439439+ jsonw_destroy(&wtr);440440+441441+ return cmd_retval(&res, !!search_term);442442+}443443+444444+static int __do_unregister(int fd, const struct bpf_map_info *info, void *data,445445+ struct json_writer *wtr)446446+{447447+ int zero = 0;448448+449449+ if (bpf_map_delete_elem(fd, &zero)) {450450+ p_err("can't unload %s %s id %u: %s",451451+ get_kern_struct_ops_name(info), info->name,452452+ info->id, strerror(errno));453453+ return -1;454454+ }455455+456456+ p_info("Unregistered %s %s id %u",457457+ get_kern_struct_ops_name(info), info->name,458458+ info->id);459459+460460+ return 0;461461+}462462+463463+static int do_unregister(int argc, char **argv)464464+{465465+ const char *search_type, *search_term;466466+ struct res res;467467+468468+ if (argc != 2)469469+ usage();470470+471471+ search_type = GET_ARG();472472+ search_term = GET_ARG();473473+474474+ res = do_work_on_struct_ops(search_type, search_term,475475+ __do_unregister, NULL, NULL);476476+477477+ return cmd_retval(&res, true);478478+}479479+480480+static int do_register(int argc, char **argv)481481+{482482+ const struct bpf_map_def *def;483483+ struct bpf_map_info info = {};484484+ __u32 info_len = sizeof(info);485485+ int nr_errs = 0, nr_maps = 0;486486+ struct bpf_object *obj;487487+ struct bpf_link *link;488488+ struct bpf_map *map;489489+ const char *file;490490+491491+ if (argc 
!= 1)492492+ usage();493493+494494+ file = GET_ARG();495495+496496+ obj = bpf_object__open(file);497497+ if (IS_ERR_OR_NULL(obj))498498+ return -1;499499+500500+ set_max_rlimit();501501+502502+ if (bpf_object__load(obj)) {503503+ bpf_object__close(obj);504504+ return -1;505505+ }506506+507507+ bpf_object__for_each_map(map, obj) {508508+ def = bpf_map__def(map);509509+ if (def->type != BPF_MAP_TYPE_STRUCT_OPS)510510+ continue;511511+512512+ link = bpf_map__attach_struct_ops(map);513513+ if (IS_ERR(link)) {514514+ p_err("can't register struct_ops %s: %s",515515+ bpf_map__name(map),516516+ strerror(-PTR_ERR(link)));517517+ nr_errs++;518518+ continue;519519+ }520520+ nr_maps++;521521+522522+ bpf_link__disconnect(link);523523+ bpf_link__destroy(link);524524+525525+ if (!bpf_obj_get_info_by_fd(bpf_map__fd(map), &info,526526+ &info_len))527527+ p_info("Registered %s %s id %u",528528+ get_kern_struct_ops_name(&info),529529+ bpf_map__name(map),530530+ info.id);531531+ else532532+ /* Not p_err. The struct_ops was attached533533+ * successfully.534534+ */535535+ p_info("Registered %s but can't find id: %s",536536+ bpf_map__name(map), strerror(errno));537537+ }538538+539539+ bpf_object__close(obj);540540+541541+ if (nr_errs)542542+ return -1;543543+544544+ if (!nr_maps) {545545+ p_err("no struct_ops found in %s", file);546546+ return -1;547547+ }548548+549549+ if (json_output)550550+ jsonw_null(json_wtr);551551+552552+ return 0;553553+}554554+555555+static int do_help(int argc, char **argv)556556+{557557+ if (json_output) {558558+ jsonw_null(json_wtr);559559+ return 0;560560+ }561561+562562+ fprintf(stderr,563563+ "Usage: %s %s { show | list } [STRUCT_OPS_MAP]\n"564564+ " %s %s dump [STRUCT_OPS_MAP]\n"565565+ " %s %s register OBJ\n"566566+ " %s %s unregister STRUCT_OPS_MAP\n"567567+ " %s %s help\n"568568+ "\n"569569+ " OPTIONS := { {-j|--json} [{-p|--pretty}] }\n"570570+ " STRUCT_OPS_MAP := [ id STRUCT_OPS_MAP_ID | name STRUCT_OPS_MAP_NAME ]\n",571571+ bin_name, argv[-2], 
bin_name, argv[-2],572572+ bin_name, argv[-2], bin_name, argv[-2],573573+ bin_name, argv[-2]);574574+575575+ return 0;576576+}577577+578578+static const struct cmd cmds[] = {579579+ { "show", do_show },580580+ { "list", do_show },581581+ { "register", do_register },582582+ { "unregister", do_unregister },583583+ { "dump", do_dump },584584+ { "help", do_help },585585+ { 0 }586586+};587587+588588+int do_struct_ops(int argc, char **argv)589589+{590590+ int err;591591+592592+ err = cmd_select(cmds, argc, argv, do_help);593593+594594+ btf__free(btf_vmlinux);595595+ return err;596596+}
tools/include/uapi/linux/bpf.h (+80, -2):
.. code-block:: diff

    	BPF_MAP_LOOKUP_AND_DELETE_BATCH,
    	BPF_MAP_UPDATE_BATCH,
    	BPF_MAP_DELETE_BATCH,
   +	BPF_LINK_CREATE,
   +	BPF_LINK_UPDATE,
    };

    enum bpf_map_type {

    	BPF_PROG_TYPE_TRACING,
    	BPF_PROG_TYPE_STRUCT_OPS,
    	BPF_PROG_TYPE_EXT,
   +	BPF_PROG_TYPE_LSM,
    };

    enum bpf_attach_type {

    	BPF_TRACE_FENTRY,
    	BPF_TRACE_FEXIT,
    	BPF_MODIFY_RETURN,
   +	BPF_LSM_MAC,
    	__MAX_BPF_ATTACH_TYPE
    };

    		__u32		prog_cnt;
    	} query;

   -	struct {
   +	struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
    		__u64 name;
    		__u32 prog_fd;
    	} raw_tracepoint;

    		__u64		probe_offset;	/* output: probe_offset */
    		__u64		probe_addr;	/* output: probe_addr */
    	} task_fd_query;
   +
   +	struct { /* struct used by BPF_LINK_CREATE command */
   +		__u32		prog_fd;	/* eBPF program to attach */
   +		__u32		target_fd;	/* object to attach to */
   +		__u32		attach_type;	/* attach type */
   +		__u32		flags;		/* extra flags */
   +	} link_create;
   +
   +	struct { /* struct used by BPF_LINK_UPDATE command */
   +		__u32		link_fd;	/* link fd */
   +		/* new program fd to update link with */
   +		__u32		new_prog_fd;
   +		__u32		flags;		/* extra flags */
   +		/* expected link's program fd; is specified only if
   +		 * BPF_F_REPLACE flag is set in flags */
   +		__u32		old_prog_fd;
   +	} link_update;
   +
    } __attribute__((aligned(8)));

    /* The description below is an attempt at providing documentation to eBPF

    *		restricted to raw_tracepoint bpf programs.
    *	Return
    *		0 on success, or a negative error in case of failure.
   + *
   + * u64 bpf_get_netns_cookie(void *ctx)
   + *	Description
   + *		Retrieve the cookie (generated by the kernel) of the network
   + *		namespace the input *ctx* is associated with. The network
   + *		namespace cookie remains stable for its lifetime and provides
   + *		a global identifier that can be assumed unique. If *ctx* is
   + *		NULL, then the helper returns the cookie for the initial
   + *		network namespace. The cookie itself is very similar to that
   + *		of bpf_get_socket_cookie() helper, but for network namespaces
   + *		instead of sockets.
   + *	Return
   + *		A 8-byte long opaque number.
   + *
   + * u64 bpf_get_current_ancestor_cgroup_id(int ancestor_level)
   + *	Description
   + *		Return id of cgroup v2 that is ancestor of the cgroup associated
   + *		with the current task at the *ancestor_level*. The root cgroup
   + *		is at *ancestor_level* zero and each step down the hierarchy
   + *		increments the level. If *ancestor_level* == level of cgroup
   + *		associated with the current task, then return value will be the
   + *		same as that of **bpf_get_current_cgroup_id**\ ().
   + *
   + *		The helper is useful to implement policies based on cgroups
   + *		that are upper in hierarchy than immediate cgroup associated
   + *		with the current task.
   + *
   + *		The format of returned id and helper limitations are same as in
   + *		**bpf_get_current_cgroup_id**\ ().
   + *	Return
   + *		The id is returned or 0 in case the id could not be retrieved.
   + *
   + * int bpf_sk_assign(struct sk_buff *skb, struct bpf_sock *sk, u64 flags)
   + *	Description
   + *		Assign the *sk* to the *skb*. When combined with appropriate
   + *		routing configuration to receive the packet towards the socket,
   + *		will cause *skb* to be delivered to the specified socket.
   + *		Subsequent redirection of *skb* via **bpf_redirect**\ (),
   + *		**bpf_clone_redirect**\ () or other methods outside of BPF may
   + *		interfere with successful delivery to the socket.
   + *
   + *		This operation is only valid from TC ingress path.
   + *
   + *		The *flags* argument must be zero.
   + *	Return
   + *		0 on success, or a negative errno in case of failure.
   + *
   + *		* **-EINVAL**		Unsupported flags specified.
   + *		* **-ENOENT**		Socket is unavailable for assignment.
   + *		* **-ENETUNREACH**	Socket is unreachable (wrong netns).
   + *		* **-EOPNOTSUPP**	Unsupported operation, for example a
   + *					call from outside of TC ingress.
   + *		* **-ESOCKTNOSUPPORT**	Socket type not supported (reuseport).
    */
    #define __BPF_FUNC_MAPPER(FN)		\

    	FN(jiffies64),			\
    	FN(read_branch_records),	\
    	FN(get_ns_current_pid_tgid),	\
   -	FN(xdp_output),
   +	FN(xdp_output),			\
   +	FN(get_netns_cookie),		\
   +	FN(get_current_ancestor_cgroup_id),	\
   +	FN(sk_assign),

    /* integer value in 'imm' field of BPF_CALL instruction selects which helper
     * function eBPF program intends to call
The corresponding ``LIBBPF_API`` declarations:

.. code-block:: diff

    LIBBPF_API int bpf_prog_detach2(int prog_fd, int attachable_fd,
    				enum bpf_attach_type type);

   +struct bpf_link_create_opts {
   +	size_t sz; /* size of this struct for forward/backward compatibility */
   +};
   +#define bpf_link_create_opts__last_field sz
   +
   +LIBBPF_API int bpf_link_create(int prog_fd, int target_fd,
   +			       enum bpf_attach_type attach_type,
   +			       const struct bpf_link_create_opts *opts);
   +
   +struct bpf_link_update_opts {
   +	size_t sz; /* size of this struct for forward/backward compatibility */
   +	__u32 flags;	   /* extra flags */
   +	__u32 old_prog_fd; /* expected old program FD */
   +};
   +#define bpf_link_update_opts__last_field old_prog_fd
   +
   +LIBBPF_API int bpf_link_update(int link_fd, int new_prog_fd,
   +			       const struct bpf_link_update_opts *opts);
   +
    struct bpf_prog_test_run_attr {
    	int prog_fd;
    	int repeat;
The new program type is also added to an existing switch in libbpf:

.. code-block:: diff

    	case BPF_PROG_TYPE_TRACING:
    	case BPF_PROG_TYPE_STRUCT_OPS:
    	case BPF_PROG_TYPE_EXT:
   +	case BPF_PROG_TYPE_LSM:
    	default:
    		break;
    	}
tools/lib/bpf/netlink.c (+33, -1):
.. code-block:: diff

    	return ret;
    }

   -int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
   +static int __bpf_set_link_xdp_fd_replace(int ifindex, int fd, int old_fd,
   +					 __u32 flags)
    {
    	int sock, seq = 0, ret;
    	struct nlattr *nla, *nla_xdp;

    		nla->nla_len += nla_xdp->nla_len;
    	}

   +	if (flags & XDP_FLAGS_REPLACE) {
   +		nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
   +		nla_xdp->nla_type = IFLA_XDP_EXPECTED_FD;
   +		nla_xdp->nla_len = NLA_HDRLEN + sizeof(old_fd);
   +		memcpy((char *)nla_xdp + NLA_HDRLEN, &old_fd, sizeof(old_fd));
   +		nla->nla_len += nla_xdp->nla_len;
   +	}
   +
    	req.nh.nlmsg_len += NLA_ALIGN(nla->nla_len);

    	if (send(sock, &req, req.nh.nlmsg_len, 0) < 0) {

    cleanup:
    	close(sock);
    	return ret;
   +}
   +
   +int bpf_set_link_xdp_fd_opts(int ifindex, int fd, __u32 flags,
   +			     const struct bpf_xdp_set_link_opts *opts)
   +{
   +	int old_fd = -1;
   +
   +	if (!OPTS_VALID(opts, bpf_xdp_set_link_opts))
   +		return -EINVAL;
   +
   +	if (OPTS_HAS(opts, old_fd)) {
   +		old_fd = OPTS_GET(opts, old_fd, -1);
   +		flags |= XDP_FLAGS_REPLACE;
   +	}
   +
   +	return __bpf_set_link_xdp_fd_replace(ifindex, fd,
   +					     old_fd,
   +					     flags);
   +}
   +
   +int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
   +{
   +	return __bpf_set_link_xdp_fd_replace(ifindex, fd, 0, flags);
    }

    static int __dump_link_nlmsg(struct nlmsghdr *nlh,