Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

bpf, docs: document BPF_MAP_TYPE_HASH and variants

Add documentation for BPF_MAP_TYPE_HASH including kernel version
introduced, usage and examples. Document BPF_MAP_TYPE_PERCPU_HASH,
BPF_MAP_TYPE_LRU_HASH and BPF_MAP_TYPE_LRU_PERCPU_HASH variations.

Note that this file is included in the BPF documentation by the glob in
Documentation/bpf/maps.rst

v3:
Fix typos reported by Stanislav Fomichev and Yonghong Song.
Add note about iteration and deletion as requested by Yonghong Song.

v2:
Describe memory allocation semantics as suggested by Stanislav Fomichev.
Fix u64 typo reported by Stanislav Fomichev.
Cut down usage examples to only show usage in context.
Updated patch description to follow style recommendation, reported by
Bagas Sanjaya.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220718125847.1390-1-donald.hunter@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

authored by

Donald Hunter and committed by
Alexei Starovoitov
979855d3 e1346019

+185
+185
Documentation/bpf/map_hash.rst
··· 1 + .. SPDX-License-Identifier: GPL-2.0-only 2 + .. Copyright (C) 2022 Red Hat, Inc. 3 + 4 + =============================================== 5 + BPF_MAP_TYPE_HASH, with PERCPU and LRU Variants 6 + =============================================== 7 + 8 + .. note:: 9 + - ``BPF_MAP_TYPE_HASH`` was introduced in kernel version 3.19 10 + - ``BPF_MAP_TYPE_PERCPU_HASH`` was introduced in version 4.6 11 + - Both ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` 12 + were introduced in version 4.10 13 + 14 + ``BPF_MAP_TYPE_HASH`` and ``BPF_MAP_TYPE_PERCPU_HASH`` provide general 15 + purpose hash map storage. Both the key and the value can be structs, 16 + allowing for composite keys and values. 17 + 18 + The kernel is responsible for allocating and freeing key/value pairs, up 19 + to the max_entries limit that you specify. Hash maps use pre-allocation 20 + of hash table elements by default. The ``BPF_F_NO_PREALLOC`` flag can be 21 + used to disable pre-allocation when it is too memory expensive. 22 + 23 + ``BPF_MAP_TYPE_PERCPU_HASH`` provides a separate value slot per 24 + CPU. The per-cpu values are stored internally in an array. 25 + 26 + The ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` 27 + variants add LRU semantics to their respective hash tables. An LRU hash 28 + will automatically evict the least recently used entries when the hash 29 + table reaches capacity. An LRU hash maintains an internal LRU list that 30 + is used to select elements for eviction. This internal LRU list is 31 + shared across CPUs but it is possible to request a per CPU LRU list with 32 + the ``BPF_F_NO_COMMON_LRU`` flag when calling ``bpf_map_create``. 33 + 34 + Usage 35 + ===== 36 + 37 + .. c:function:: 38 + long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags) 39 + 40 + Hash entries can be added or updated using the ``bpf_map_update_elem()`` 41 + helper. This helper replaces existing elements atomically. The ``flags`` 42 + parameter can be used to control the update behaviour: 43 + 44 + - ``BPF_ANY`` will create a new element or update an existing element 45 + - ``BPF_NOEXIST`` will create a new element only if one did not already 46 + exist 47 + - ``BPF_EXIST`` will update an existing element 48 + 49 + ``bpf_map_update_elem()`` returns 0 on success, or negative error in 50 + case of failure. 51 + 52 + .. c:function:: 53 + void *bpf_map_lookup_elem(struct bpf_map *map, const void *key) 54 + 55 + Hash entries can be retrieved using the ``bpf_map_lookup_elem()`` 56 + helper. This helper returns a pointer to the value associated with 57 + ``key``, or ``NULL`` if no entry was found. 58 + 59 + .. c:function:: 60 + long bpf_map_delete_elem(struct bpf_map *map, const void *key) 61 + 62 + Hash entries can be deleted using the ``bpf_map_delete_elem()`` 63 + helper. This helper will return 0 on success, or negative error in case 64 + of failure. 65 + 66 + Per CPU Hashes 67 + -------------- 68 + 69 + For ``BPF_MAP_TYPE_PERCPU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` 70 + the ``bpf_map_update_elem()`` and ``bpf_map_lookup_elem()`` helpers 71 + automatically access the hash slot for the current CPU. 72 + 73 + .. c:function:: 74 + void *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void *key, u32 cpu) 75 + 76 + The ``bpf_map_lookup_percpu_elem()`` helper can be used to lookup the 77 + value in the hash slot for a specific CPU. Returns value associated with 78 + ``key`` on ``cpu`` , or ``NULL`` if no entry was found or ``cpu`` is 79 + invalid. 80 + 81 + Concurrency 82 + ----------- 83 + 84 + Values stored in ``BPF_MAP_TYPE_HASH`` can be accessed concurrently by 85 + programs running on different CPUs. Since Kernel version 5.1, the BPF 86 + infrastructure provides ``struct bpf_spin_lock`` to synchronise access. 87 + See ``tools/testing/selftests/bpf/progs/test_spin_lock.c``. 88 + 89 + Userspace 90 + --------- 91 + 92 + .. c:function:: 93 + int bpf_map_get_next_key(int fd, const void *cur_key, void *next_key) 94 + 95 + In userspace, it is possible to iterate through the keys of a hash using 96 + libbpf's ``bpf_map_get_next_key()`` function. The first key can be fetched by 97 + calling ``bpf_map_get_next_key()`` with ``cur_key`` set to 98 + ``NULL``. Subsequent calls will fetch the next key that follows the 99 + current key. ``bpf_map_get_next_key()`` returns 0 on success, -ENOENT if 100 + cur_key is the last key in the hash, or negative error in case of 101 + failure. 102 + 103 + Note that if ``cur_key`` gets deleted then ``bpf_map_get_next_key()`` 104 + will instead return the *first* key in the hash table which is 105 + undesirable. It is recommended to use batched lookup if there is going 106 + to be key deletion intermixed with ``bpf_map_get_next_key()``. 107 + 108 + Examples 109 + ======== 110 + 111 + Please see the ``tools/testing/selftests/bpf`` directory for functional 112 + examples. The code snippets below demonstrates API usage. 113 + 114 + This example shows how to declare an LRU Hash with a struct key and a 115 + struct value. 116 + 117 + .. code-block:: c 118 + 119 + #include <linux/bpf.h> 120 + #include <bpf/bpf_helpers.h> 121 + 122 + struct key { 123 + __u32 srcip; 124 + }; 125 + 126 + struct value { 127 + __u64 packets; 128 + __u64 bytes; 129 + }; 130 + 131 + struct { 132 + __uint(type, BPF_MAP_TYPE_LRU_HASH); 133 + __uint(max_entries, 32); 134 + __type(key, struct key); 135 + __type(value, struct value); 136 + } packet_stats SEC(".maps"); 137 + 138 + This example shows how to create or update hash values using atomic 139 + instructions: 140 + 141 + .. code-block:: c 142 + 143 + static void update_stats(__u32 srcip, int bytes) 144 + { 145 + struct key key = { 146 + .srcip = srcip, 147 + }; 148 + struct value *value = bpf_map_lookup_elem(&packet_stats, &key); 149 + 150 + if (value) { 151 + __sync_fetch_and_add(&value->packets, 1); 152 + __sync_fetch_and_add(&value->bytes, bytes); 153 + } else { 154 + struct value newval = { 1, bytes }; 155 + 156 + bpf_map_update_elem(&packet_stats, &key, &newval, BPF_NOEXIST); 157 + } 158 + } 159 + 160 + Userspace walking the map elements from the map declared above: 161 + 162 + .. code-block:: c 163 + 164 + #include <bpf/libbpf.h> 165 + #include <bpf/bpf.h> 166 + 167 + static void walk_hash_elements(int map_fd) 168 + { 169 + struct key *cur_key = NULL; 170 + struct key next_key; 171 + struct value value; 172 + int err; 173 + 174 + for (;;) { 175 + err = bpf_map_get_next_key(map_fd, cur_key, &next_key); 176 + if (err) 177 + break; 178 + 179 + bpf_map_lookup_elem(map_fd, &next_key, &value); 180 + 181 + // Use key and value here 182 + 183 + cur_key = &next_key; 184 + } 185 + }