Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

bpf: btf: add btf documentation

This patch adds documentation for BTF (BPF Type Format).
The document is placed under the linux:Documentation/bpf directory.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Authored by Yonghong Song and committed by Alexei Starovoitov
ffcf7ce9 cbeaad90

+877
+870
Documentation/bpf/btf.rst
+=====================
+BPF Type Format (BTF)
+=====================
+
+1. Introduction
+***************
+
+BTF (BPF Type Format) is the metadata format which
+encodes the debug info related to BPF program/map.
+The name BTF was used initially to describe
+data types. The BTF was later extended to include
+function info for defined subroutines, and line info
+for source/line information.
+
+The debug info is used for map pretty print, function
+signature, etc. The function signature enables better
+bpf program/function kernel symbols.
+The line info helps generate
+source-annotated translated bytecode, jited code
+and verifier log.
+
+The BTF specification contains two parts:
+  * BTF kernel API
+  * BTF ELF file format
+
+The kernel API is the contract between
+user space and kernel. The kernel verifies
+the BTF info before using it.
+The ELF file format is a user space contract
+between ELF file and libbpf loader.
+
+The type and string sections are part of the
+BTF kernel API, describing the debug info
+(mostly types related) referenced by the bpf program.
+These two sections are discussed in
+detail in :ref:`BTF_Type_String`.
+
+.. _BTF_Type_String:
+
+2. BTF Type and String Encoding
+*******************************
+
+The file ``include/uapi/linux/btf.h`` provides a high-level
+definition of how types/strings are encoded.
+
+The beginning of the data blob must be::
+
+    struct btf_header {
+        __u16   magic;
+        __u8    version;
+        __u8    flags;
+        __u32   hdr_len;
+
+        /* All offsets are in bytes relative to the end of this header */
+        __u32   type_off;       /* offset of type section       */
+        __u32   type_len;       /* length of type section       */
+        __u32   str_off;        /* offset of string section     */
+        __u32   str_len;        /* length of string section     */
+    };
+
+The magic is ``0xeB9F``, which is encoded differently on big- and
+little-endian systems, and can be used to test whether BTF is generated
+for a big- or little-endian target.
+The ``btf_header`` is designed to be extensible, with ``hdr_len`` equal to
+``sizeof(struct btf_header)`` when the data blob is generated.
+
+2.1 String Encoding
+===================
+
+The first string in the string section must be a null string.
+The rest of the string table is a concatenation of other null-terminated
+strings.
+
+2.2 Type Encoding
+=================
+
+The type id ``0`` is reserved for the ``void`` type.
+The type section is parsed sequentially, and a type id is assigned to
+each recognized type starting from id ``1``.
+Currently, the following types are supported::
+
+    #define BTF_KIND_INT            1       /* Integer      */
+    #define BTF_KIND_PTR            2       /* Pointer      */
+    #define BTF_KIND_ARRAY          3       /* Array        */
+    #define BTF_KIND_STRUCT         4       /* Struct       */
+    #define BTF_KIND_UNION          5       /* Union        */
+    #define BTF_KIND_ENUM           6       /* Enumeration  */
+    #define BTF_KIND_FWD            7       /* Forward      */
+    #define BTF_KIND_TYPEDEF        8       /* Typedef      */
+    #define BTF_KIND_VOLATILE       9       /* Volatile     */
+    #define BTF_KIND_CONST          10      /* Const        */
+    #define BTF_KIND_RESTRICT       11      /* Restrict     */
+    #define BTF_KIND_FUNC           12      /* Function     */
+    #define BTF_KIND_FUNC_PROTO     13      /* Function Proto       */
+
+Note that the type section encodes debug info, not just pure types.
+``BTF_KIND_FUNC`` is not a type; it represents a defined subprogram.
+
+Each type contains the following common data::
+
+    struct btf_type {
+        __u32 name_off;
+        /* "info" bits arrangement
+         * bits  0-15: vlen (e.g. # of struct's members)
+         * bits 16-23: unused
+         * bits 24-27: kind (e.g. int, ptr, array...etc)
+         * bits 28-30: unused
+         * bit     31: kind_flag, currently used by
+         *             struct, union and fwd
+         */
+        __u32 info;
+        /* "size" is used by INT, ENUM, STRUCT and UNION.
+         * "size" tells the size of the type it is describing.
+         *
+         * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
+         * FUNC and FUNC_PROTO.
+         * "type" is a type_id referring to another type.
+         */
+        union {
+                __u32 size;
+                __u32 type;
+        };
+    };
+
+For certain kinds, the common data are followed by kind-specific data.
+The ``name_off`` in ``struct btf_type`` specifies the offset in the string table.
+The following sections detail the encoding of each kind.
+
+2.2.1 BTF_KIND_INT
+~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+ * ``name_off``: any valid offset
+ * ``info.kind_flag``: 0
+ * ``info.kind``: BTF_KIND_INT
+ * ``info.vlen``: 0
+ * ``size``: the size of the int type in bytes.
+
+``btf_type`` is followed by a ``u32`` with the following bits arrangement::
+
+  #define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
+  #define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
+  #define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)
+
+The ``BTF_INT_ENCODING`` has the following attributes::
+
+  #define BTF_INT_SIGNED  (1 << 0)
+  #define BTF_INT_CHAR    (1 << 1)
+  #define BTF_INT_BOOL    (1 << 2)
+
+The ``BTF_INT_ENCODING()`` provides extra information (signedness,
+char, or bool) for the int type. The char and bool encodings
+are mostly useful for pretty print.
+At most one encoding can
+be specified for the int type.
+
+The ``BTF_INT_BITS()`` specifies the number of actual bits held by
+this int type. For example, a 4-bit bitfield encodes
+``BTF_INT_BITS()`` equal to 4. The ``btf_type.size * 8``
+must be equal to or greater than ``BTF_INT_BITS()`` for the type.
+The maximum value of ``BTF_INT_BITS()`` is 128.
+
+The ``BTF_INT_OFFSET()`` specifies the starting bit offset to
+calculate values for this int. For example, a bitfield struct
+member has
+
+ * btf member bit offset 100 from the start of the structure,
+ * btf member pointing to an int type,
+ * the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``
+
+Then in the struct memory layout, this member will occupy
+``4`` bits starting from bit ``100 + 2 = 102``.
+
+Alternatively, the bitfield struct member can be the following to
+access the same bits as the above:
+
+ * btf member bit offset 102,
+ * btf member pointing to an int type,
+ * the int type has ``BTF_INT_OFFSET() = 0`` and ``BTF_INT_BITS() = 4``
+
+The original intention of ``BTF_INT_OFFSET()`` is to provide
+flexibility of bitfield encoding.
+Currently, both llvm and pahole generate ``BTF_INT_OFFSET() = 0``
+for all int types.
+
+2.2.2 BTF_KIND_PTR
+~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_PTR
+  * ``info.vlen``: 0
+  * ``type``: the pointee type of the pointer
+
+No additional type data follow ``btf_type``.
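The ``info`` bit layout and the ``BTF_INT_*`` macros described so far can be exercised with a few small helpers. This is a minimal sketch rather than kernel code: the helper names (``info_vlen``, ``int_bits``, ``bitfield_start`` and so on) are hypothetical and chosen for illustration, and only the masks and shifts are taken from the ``btf_type`` comment and the macros quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helpers (hypothetical names). The masks follow the
 * "info" bit layout and the BTF_INT_* macros quoted in this section. */
static inline uint32_t info_vlen(uint32_t info)      { return info & 0xffff; }
static inline uint32_t info_kind(uint32_t info)      { return (info >> 24) & 0x0f; }
static inline uint32_t info_kind_flag(uint32_t info) { return info >> 31; }

static inline uint32_t int_encoding(uint32_t v) { return (v & 0x0f000000) >> 24; }
static inline uint32_t int_offset(uint32_t v)   { return (v & 0x00ff0000) >> 16; }
static inline uint32_t int_bits(uint32_t v)     { return v & 0x000000ff; }

/* First bit occupied by a bitfield member: the member's bit offset in the
 * struct plus BTF_INT_OFFSET() of the int type it points to. */
static inline uint32_t bitfield_start(uint32_t member_bit_off, uint32_t int_data)
{
    assert(int_bits(int_data) <= 128); /* documented maximum of BTF_INT_BITS() */
    return member_bit_off + int_offset(int_data);
}
```

For the bitfield example in 2.2.1, an int data word of ``0x00020004`` (offset 2, 4 bits) combined with a member bit offset of 100 gives ``bitfield_start(100, 0x00020004) == 102``.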
+
+2.2.3 BTF_KIND_ARRAY
+~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_ARRAY
+  * ``info.vlen``: 0
+  * ``size/type``: 0, not used
+
+``btf_type`` is followed by one ``struct btf_array``::
+
+    struct btf_array {
+        __u32   type;
+        __u32   index_type;
+        __u32   nelems;
+    };
+
+The ``struct btf_array`` encoding:
+  * ``type``: the element type
+  * ``index_type``: the index type
+  * ``nelems``: the number of elements for this array (``0`` is also allowed).
+
+The ``index_type`` can be any regular int type
+(u8, u16, u32, u64, unsigned __int128).
+The original design of including ``index_type`` follows DWARF,
+which has an ``index_type`` for its array type.
+Currently in BTF, beyond type verification, the ``index_type`` is not used.
+
+The ``struct btf_array`` allows chaining through the element type to represent
+multidimensional arrays. For example, for ``int a[5][6]``, the following
+type system illustrates the chaining:
+
+  * [1]: int
+  * [2]: array, ``btf_array.type = [1]``, ``btf_array.nelems = 6``
+  * [3]: array, ``btf_array.type = [2]``, ``btf_array.nelems = 5``
+
+Currently, both pahole and llvm collapse a multidimensional array
+into a one-dimensional array, e.g., for ``a[5][6]``, the ``btf_array.nelems``
+is equal to ``30``. This is because the original use case is map pretty
+print, where the whole array is dumped out, so a one-dimensional array
+is enough. As more BTF usage is explored, pahole and llvm can be
+changed to generate a proper chained representation for
+multidimensional arrays.
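The chaining described for ``int a[5][6]`` can be sketched as a walk over element types. The code below is an assumption-laden illustration: ``struct toy_type``, ``chained_a`` and ``total_elems`` are hypothetical names, and the table is a simplified in-memory view indexed by type id, not the on-disk BTF encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { KIND_INT = 1, KIND_ARRAY = 3 }; /* same values as BTF_KIND_INT / BTF_KIND_ARRAY */

/* Simplified in-memory view of a type; not the on-disk BTF encoding. */
struct toy_type {
    uint32_t kind;
    uint32_t elem_type; /* btf_array.type   (arrays only) */
    uint32_t nelems;    /* btf_array.nelems (arrays only) */
};

/* The chained encoding of "int a[5][6]"; the array index is the type id. */
static const struct toy_type chained_a[] = {
    [0] = { 0, 0, 0 },          /* id 0: void */
    [1] = { KIND_INT, 0, 0 },   /* [1]: int */
    [2] = { KIND_ARRAY, 1, 6 }, /* [2]: array of 6 x [1] */
    [3] = { KIND_ARRAY, 2, 5 }, /* [3]: array of 5 x [2] */
};

/* Multiply nelems along the element-type chain starting at type id "id". */
static uint64_t total_elems(const struct toy_type *types, size_t ntypes, uint32_t id)
{
    uint64_t total = 1;
    size_t depth;

    assert(types != NULL);
    /* Bound the walk by ntypes so a malformed, cyclic chain cannot loop forever. */
    for (depth = 0; depth < ntypes && id < ntypes; depth++) {
        if (types[id].kind != KIND_ARRAY)
            break;
        total *= types[id].nelems;
        id = types[id].elem_type;
    }
    return total;
}
```

Calling ``total_elems(chained_a, 4, 3)`` walks [3], [2], [1] and yields ``5 * 6 = 30``, the same count that the collapsed one-dimensional form stores directly in ``nelems``.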
+
+2.2.4 BTF_KIND_STRUCT
+~~~~~~~~~~~~~~~~~~~~~
+2.2.5 BTF_KIND_UNION
+~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0 or offset to a valid C identifier
+  * ``info.kind_flag``: 0 or 1
+  * ``info.kind``: BTF_KIND_STRUCT or BTF_KIND_UNION
+  * ``info.vlen``: the number of struct/union members
+  * ``size``: the size of the struct/union in bytes
+
+``btf_type`` is followed by ``info.vlen`` number of ``struct btf_member``.::
+
+    struct btf_member {
+        __u32   name_off;
+        __u32   type;
+        __u32   offset;
+    };
+
+``struct btf_member`` encoding:
+  * ``name_off``: offset to a valid C identifier
+  * ``type``: the member type
+  * ``offset``: <see below>
+
+If the type info ``kind_flag`` is not set, the offset contains
+only the bit offset of the member. Note that the base type of the
+bitfield can only be an int or enum type. If the bitfield size
+is 32, the base type can be either an int or enum type.
+If the bitfield size is not 32, the base type must be int,
+and the int type ``BTF_INT_BITS()`` encodes the bitfield size.
+
+If the ``kind_flag`` is set, the ``btf_member.offset``
+contains both the member bitfield size and the bit offset. The
+bitfield size and bit offset are calculated as below.::
+
+  #define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
+  #define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)
+
+In this case, if the base type is an int type, it must
+be a regular int type:
+
+  * ``BTF_INT_OFFSET()`` must be 0.
+  * ``BTF_INT_BITS()`` must be equal to ``{1,2,4,8,16} * 8``.
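The two interpretations of ``btf_member.offset`` can be put side by side in one decoder. This is a sketch under stated assumptions: ``struct member_bits``, ``decode_member_offset`` and ``encode_member_offset`` are hypothetical names, and only the shift and mask come from ``BTF_MEMBER_BITFIELD_SIZE()`` and ``BTF_MEMBER_BIT_OFFSET()`` above.

```c
#include <assert.h>
#include <stdint.h>

struct member_bits {
    uint32_t bitfield_size; /* 0 when the offset word carries no bitfield size */
    uint32_t bit_offset;
};

/* Decode btf_member.offset according to the enclosing struct's kind_flag. */
static struct member_bits decode_member_offset(uint32_t kind_flag, uint32_t offset)
{
    struct member_bits m;

    if (kind_flag) {
        m.bitfield_size = offset >> 24;      /* BTF_MEMBER_BITFIELD_SIZE() */
        m.bit_offset    = offset & 0xffffff; /* BTF_MEMBER_BIT_OFFSET()    */
    } else {
        m.bitfield_size = 0;      /* size is carried by the int type's BTF_INT_BITS() */
        m.bit_offset    = offset; /* the whole word is the bit offset */
    }
    return m;
}

/* Hypothetical inverse of the kind_flag = 1 case, handy for building inputs. */
static uint32_t encode_member_offset(uint32_t bitfield_size, uint32_t bit_offset)
{
    assert(bitfield_size <= 0xff && bit_offset <= 0xffffff);
    return (bitfield_size << 24) | bit_offset;
}
```

As an illustration, a member with a bitfield size of 3 at bit offset 2 in a ``kind_flag = 1`` struct would carry ``0x03000002`` in ``btf_member.offset``; with ``kind_flag = 0`` the same word would instead be read as a plain bit offset.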
+
+The following kernel patch introduced ``kind_flag`` and
+explained why both modes exist:
+
+  https://github.com/torvalds/linux/commit/9d5f9f701b1891466fb3dbb1806ad97716f95cc3#diff-fa650a64fdd3968396883d2fe8215ff3
+
+2.2.6 BTF_KIND_ENUM
+~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0 or offset to a valid C identifier
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_ENUM
+  * ``info.vlen``: number of enum values
+  * ``size``: 4
+
+``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum``.::
+
+    struct btf_enum {
+        __u32   name_off;
+        __s32   val;
+    };
+
+The ``btf_enum`` encoding:
+  * ``name_off``: offset to a valid C identifier
+  * ``val``: any value
+
+2.2.7 BTF_KIND_FWD
+~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: offset to a valid C identifier
+  * ``info.kind_flag``: 0 for struct, 1 for union
+  * ``info.kind``: BTF_KIND_FWD
+  * ``info.vlen``: 0
+  * ``type``: 0
+
+No additional type data follow ``btf_type``.
+
+2.2.8 BTF_KIND_TYPEDEF
+~~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: offset to a valid C identifier
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_TYPEDEF
+  * ``info.vlen``: 0
+  * ``type``: the type which can be referred to by the name at ``name_off``
+
+No additional type data follow ``btf_type``.
+
+2.2.9 BTF_KIND_VOLATILE
+~~~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_VOLATILE
+  * ``info.vlen``: 0
+  * ``type``: the type with ``volatile`` qualifier
+
+No additional type data follow ``btf_type``.
+
+2.2.10 BTF_KIND_CONST
+~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_CONST
+  * ``info.vlen``: 0
+  * ``type``: the type with ``const`` qualifier
+
+No additional type data follow ``btf_type``.
+
+2.2.11 BTF_KIND_RESTRICT
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_RESTRICT
+  * ``info.vlen``: 0
+  * ``type``: the type with ``restrict`` qualifier
+
+No additional type data follow ``btf_type``.
+
+2.2.12 BTF_KIND_FUNC
+~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: offset to a valid C identifier
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_FUNC
+  * ``info.vlen``: 0
+  * ``type``: a BTF_KIND_FUNC_PROTO type
+
+No additional type data follow ``btf_type``.
+
+A BTF_KIND_FUNC defines not a type, but a subprogram (function) whose
+signature is defined by ``type``. The subprogram is thus an instance of
+that type. The BTF_KIND_FUNC may in turn be referenced by a func_info in
+the :ref:`BTF_Ext_Section` (ELF) or in the arguments to
+:ref:`BPF_Prog_Load` (ABI).
+
+2.2.13 BTF_KIND_FUNC_PROTO
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_FUNC_PROTO
+  * ``info.vlen``: # of parameters
+  * ``type``: the return type
+
+``btf_type`` is followed by ``info.vlen`` number of ``struct btf_param``.::
+
+    struct btf_param {
+        __u32   name_off;
+        __u32   type;
+    };
+
+If a BTF_KIND_FUNC_PROTO type is referred to by a BTF_KIND_FUNC type,
+then ``btf_param.name_off`` must point to a valid C identifier,
+except for the possible last argument representing the variable
+argument. The ``btf_param.type`` refers to the parameter type.
+
+If the function has variable arguments, the last parameter
+is encoded with ``name_off = 0`` and ``type = 0``.
+
+3. BTF Kernel API
+*****************
+
+The following bpf syscall commands involve BTF:
+   * BPF_BTF_LOAD: load a blob of BTF data into kernel
+   * BPF_MAP_CREATE: map creation with btf key and value type info.
+   * BPF_PROG_LOAD: prog load with btf function and line info.
+   * BPF_BTF_GET_FD_BY_ID: get a btf fd
+   * BPF_OBJ_GET_INFO_BY_FD: btf, func_info, line_info
+     and other btf related info are returned.
+
+The workflow typically looks like:
+::
+
+  Application:
+      BPF_BTF_LOAD
+          |
+          v
+      BPF_MAP_CREATE and BPF_PROG_LOAD
+          |
+          V
+      ......
+
+  Introspection tool:
+      ......
+      BPF_{PROG,MAP}_GET_NEXT_ID  (get prog/map id's)
+          |
+          V
+      BPF_{PROG,MAP}_GET_FD_BY_ID (get a prog/map fd)
+          |
+          V
+      BPF_OBJ_GET_INFO_BY_FD (get bpf_prog_info/bpf_map_info with btf_id)
+          |                                     |
+          V                                     |
+      BPF_BTF_GET_FD_BY_ID (get btf_fd)         |
+          |                                     |
+          V                                     |
+      BPF_OBJ_GET_INFO_BY_FD (get btf)          |
+          |                                     |
+          V                                     V
+      pretty print types, dump func signatures and line info, etc.
+
+
+3.1 BPF_BTF_LOAD
+================
+
+Load a blob of BTF data into kernel. A blob of data
+described in :ref:`BTF_Type_String`
+can be directly loaded into the kernel.
+A ``btf_fd`` is returned to userspace.
+
+3.2 BPF_MAP_CREATE
+==================
+
+A map can be created with ``btf_fd`` and specified key/value type id.::
+
+    __u32   btf_fd;         /* fd pointing to a BTF type data */
+    __u32   btf_key_type_id;        /* BTF type_id of the key */
+    __u32   btf_value_type_id;      /* BTF type_id of the value */
+
+In libbpf, the map can be defined with extra annotation like below:
+::
+
+    struct bpf_map_def SEC("maps") btf_map = {
+        .type = BPF_MAP_TYPE_ARRAY,
+        .key_size = sizeof(int),
+        .value_size = sizeof(struct ipv_counts),
+        .max_entries = 4,
+    };
+    BPF_ANNOTATE_KV_PAIR(btf_map, int, struct ipv_counts);
+
+Here, the parameters for macro BPF_ANNOTATE_KV_PAIR are the map name
+and the key and value types for the map.
+During ELF parsing, libbpf is able to extract key/value type_id's
+and assign them to BPF_MAP_CREATE attributes automatically.
+
+.. _BPF_Prog_Load:
+
+3.3 BPF_PROG_LOAD
+=================
+
+During prog_load, func_info and line_info can be passed to kernel with
+proper values for the following attributes:
+::
+
+    __u32           insn_cnt;
+    __aligned_u64   insns;
+    ......
+    __u32           prog_btf_fd;    /* fd pointing to BTF type data */
+    __u32           func_info_rec_size;     /* userspace bpf_func_info size */
+    __aligned_u64   func_info;      /* func info */
+    __u32           func_info_cnt;  /* number of bpf_func_info records */
+    __u32           line_info_rec_size;     /* userspace bpf_line_info size */
+    __aligned_u64   line_info;      /* line info */
+    __u32           line_info_cnt;  /* number of bpf_line_info records */
+
+The func_info and line_info are arrays of the structures below, respectively.::
+
+    struct bpf_func_info {
+        __u32   insn_off; /* [0, insn_cnt - 1] */
+        __u32   type_id;  /* pointing to a BTF_KIND_FUNC type */
+    };
+    struct bpf_line_info {
+        __u32   insn_off; /* [0, insn_cnt - 1] */
+        __u32   file_name_off; /* offset to string table for the filename */
+        __u32   line_off; /* offset to string table for the source line */
+        __u32   line_col; /* line number and column number */
+    };
+
+func_info_rec_size is the size of each func_info record, and line_info_rec_size
+is the size of each line_info record. Passing the record size to kernel makes
+it possible to extend the record itself in the future.
+
+Below are requirements for func_info:
+  * func_info[0].insn_off must be 0.
+  * the func_info insn_off is in strictly increasing order and matches
+    bpf func boundaries.
+
+Below are requirements for line_info:
+  * the first insn in each func must point to a line_info record.
+  * the line_info insn_off is in strictly increasing order.
+
+For line_info, the line number and column number are defined as below:
+::
+
+    #define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
+    #define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)
+
+3.4 BPF_{PROG,MAP}_GET_NEXT_ID
+==============================
+
+In kernel, every loaded program, map or btf has a unique id.
+The id won't change during the lifetime of the program, map or btf.
+
+The bpf syscall command BPF_{PROG,MAP}_GET_NEXT_ID
+returns all ids, one per command invocation, to user space, for bpf
+programs or maps,
+so the inspection tool can inspect all programs and maps.
+
+3.5 BPF_{PROG,MAP}_GET_FD_BY_ID
+===============================
+
+The introspection tool cannot use an id to get details about a program or map.
+A file descriptor needs to be obtained first for reference counting purposes.
+
+3.6 BPF_OBJ_GET_INFO_BY_FD
+==========================
+
+Once a program/map fd is acquired, the introspection tool can
+get the detailed information from kernel about this fd,
+some of which is btf related. For example,
+``bpf_map_info`` returns ``btf_id`` and the key/value type ids.
+``bpf_prog_info`` returns ``btf_id``, func_info and line info
+for translated bpf byte codes, and jited_line_info.
+
+3.7 BPF_BTF_GET_FD_BY_ID
+========================
+
+With ``btf_id`` obtained in ``bpf_map_info`` and ``bpf_prog_info``,
+bpf syscall command BPF_BTF_GET_FD_BY_ID can retrieve a btf fd.
+Then, with command BPF_OBJ_GET_INFO_BY_FD, the btf blob, originally
+loaded into the kernel with BPF_BTF_LOAD, can be retrieved.
+
+With the btf blob, ``bpf_map_info`` and ``bpf_prog_info``, the introspection
+tool has full btf knowledge and is able to pretty print map key/values,
+dump func signatures, and dump line info along with byte/jit codes.
+
+4. ELF File Format Interface
+****************************
+
+4.1 .BTF section
+================
+
+The .BTF section contains type and string data. The format of this section
+is the same as the one described in :ref:`BTF_Type_String`.
+
+.. _BTF_Ext_Section:
+
+4.2 .BTF.ext section
+====================
+
+The .BTF.ext section encodes func_info and line_info, which
+need loader manipulation before loading into the kernel.
+
+The specification for the .BTF.ext section is defined at
+``tools/lib/bpf/btf.h`` and ``tools/lib/bpf/btf.c``.
+
+The current header of the .BTF.ext section::
+
+    struct btf_ext_header {
+        __u16   magic;
+        __u8    version;
+        __u8    flags;
+        __u32   hdr_len;
+
+        /* All offsets are in bytes relative to the end of this header */
+        __u32   func_info_off;
+        __u32   func_info_len;
+        __u32   line_info_off;
+        __u32   line_info_len;
+    };
+
+It is very similar to the .BTF section. Instead of type/string sections,
+it contains func_info and line_info sections. See :ref:`BPF_Prog_Load`
+for details about the func_info and line_info record format.
+
+The func_info is organized as below.::
+
+     func_info_rec_size
+     btf_ext_info_sec for section #1 /* func_info for section #1 */
+     btf_ext_info_sec for section #2 /* func_info for section #2 */
+     ...
+
+``func_info_rec_size`` specifies the size of the ``bpf_func_info`` structure
+when .BTF.ext is generated. ``btf_ext_info_sec``, defined below, is
+the func_info for each specific ELF section.::
+
+     struct btf_ext_info_sec {
+        __u32   sec_name_off; /* offset to section name */
+        __u32   num_info;
+        /* Followed by num_info * record_size number of bytes */
+        __u8    data[0];
+     };
+
+Here, num_info must be greater than 0.
+
+The line_info is organized as below.::
+
+     line_info_rec_size
+     btf_ext_info_sec for section #1 /* line_info for section #1 */
+     btf_ext_info_sec for section #2 /* line_info for section #2 */
+     ...
+
+``line_info_rec_size`` specifies the size of the ``bpf_line_info`` structure
+when .BTF.ext is generated.
+
+The interpretation of ``bpf_func_info->insn_off`` and
+``bpf_line_info->insn_off`` is different between kernel API and ELF API.
+For kernel API, the ``insn_off`` is the instruction offset in the unit
+of ``struct bpf_insn``. For ELF API, the ``insn_off`` is the byte offset
+from the beginning of section (``btf_ext_info_sec->sec_name_off``).
+
+5. Using BTF
+************
+
+5.1 bpftool map pretty print
+============================
+
+With BTF, the map key/value can be printed based on fields rather than
+simply raw bytes. This is especially
+valuable for large structures or if your data structure
+has bitfields. For example, for the following map,::
+
+      enum A { A1, A2, A3, A4, A5 };
+      typedef enum A ___A;
+      struct tmp_t {
+           char a1:4;
+           int  a2:4;
+           int  :4;
+           __u32 a3:4;
+           int b;
+           ___A b1:4;
+           enum A b2:4;
+      };
+      struct bpf_map_def SEC("maps") tmpmap = {
+           .type = BPF_MAP_TYPE_ARRAY,
+           .key_size = sizeof(__u32),
+           .value_size = sizeof(struct tmp_t),
+           .max_entries = 1,
+      };
+      BPF_ANNOTATE_KV_PAIR(tmpmap, int, struct tmp_t);
+
+bpftool is able to pretty print like below:
+::
+
+      [{
+            "key": 0,
+            "value": {
+                "a1": 0x2,
+                "a2": 0x4,
+                "a3": 0x6,
+                "b": 7,
+                "b1": 0x8,
+                "b2": 0xa
+            }
+        }
+      ]
+
+5.2 bpftool prog dump
+=====================
+
+The following is an example showing how func_info and line_info
+can help prog dump with better kernel symbol names, function prototypes
+and line information.::
+
+    $ bpftool prog dump jited pinned /sys/fs/bpf/test_btf_haskv
+    [...]
+    int test_long_fname_2(struct dummy_tracepoint_args * arg):
+    bpf_prog_44a040bf25481309_test_long_fname_2:
+    ; static int test_long_fname_2(struct dummy_tracepoint_args *arg)
+       0:   push   %rbp
+       1:   mov    %rsp,%rbp
+       4:   sub    $0x30,%rsp
+       b:   sub    $0x28,%rbp
+       f:   mov    %rbx,0x0(%rbp)
+      13:   mov    %r13,0x8(%rbp)
+      17:   mov    %r14,0x10(%rbp)
+      1b:   mov    %r15,0x18(%rbp)
+      1f:   xor    %eax,%eax
+      21:   mov    %rax,0x20(%rbp)
+      25:   xor    %esi,%esi
+    ; int key = 0;
+      27:   mov    %esi,-0x4(%rbp)
+    ; if (!arg->sock)
+      2a:   mov    0x8(%rdi),%rdi
+    ; if (!arg->sock)
+      2e:   cmp    $0x0,%rdi
+      32:   je     0x0000000000000070
+      34:   mov    %rbp,%rsi
+    ; counts = bpf_map_lookup_elem(&btf_map, &key);
+    [...]
+
+5.3 verifier log
+================
+
+The following is an example of how line_info can help debug verifier failures.::
+
+       /* The code at tools/testing/selftests/bpf/test_xdp_noinline.c
+        * is modified as below.
+        */
+       data = (void *)(long)xdp->data;
+       data_end = (void *)(long)xdp->data_end;
+       /*
+       if (data + 4 > data_end)
+               return XDP_DROP;
+       */
+       *(u32 *)data = dst->dst;
+
+    $ bpftool prog load ./test_xdp_noinline.o /sys/fs/bpf/test_xdp_noinline type xdp
+        ; data = (void *)(long)xdp->data;
+        224: (79) r2 = *(u64 *)(r10 -112)
+        225: (61) r2 = *(u32 *)(r2 +0)
+        ; *(u32 *)data = dst->dst;
+        226: (63) *(u32 *)(r2 +0) = r1
+        invalid access to packet, off=0 size=4, R2(id=0,off=0,r=0)
+        R2 offset is outside of the packet
+
+6. BTF Generation
+*****************
+
+You need the latest pahole
+
+  https://git.kernel.org/pub/scm/devel/pahole/pahole.git/
+
+or llvm (8.0 or later). pahole acts as a dwarf2btf converter. It doesn't support .BTF.ext
+and the btf BTF_KIND_FUNC type yet.
+For example::
+
+      -bash-4.4$ cat t.c
+      struct t {
+        int a:2;
+        int b:3;
+        int c:2;
+      } g;
+      -bash-4.4$ gcc -c -O2 -g t.c
+      -bash-4.4$ pahole -JV t.o
+      File t.o:
+      [1] STRUCT t kind_flag=1 size=4 vlen=3
+              a type_id=2 bitfield_size=2 bits_offset=0
+              b type_id=2 bitfield_size=3 bits_offset=2
+              c type_id=2 bitfield_size=2 bits_offset=5
+      [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
+
+llvm is able to generate .BTF and .BTF.ext directly with -g for the bpf target only.
+The assembly code (-S) is able to show the BTF encoding in assembly format.::
+
+    -bash-4.4$ cat t2.c
+    typedef int __int32;
+    struct t2 {
+      int a2;
+      int (*f2)(char q1, __int32 q2, ...);
+      int (*f3)();
+    } g2;
+    int main() { return 0; }
+    int test() { return 0; }
+    -bash-4.4$ clang -c -g -O2 -target bpf t2.c
+    -bash-4.4$ readelf -S t2.o
+      ......
+      [ 8] .BTF              PROGBITS         0000000000000000  00000247
+           000000000000016e  0000000000000000           0     0     1
+      [ 9] .BTF.ext          PROGBITS         0000000000000000  000003b5
+           0000000000000060  0000000000000000           0     0     1
+      [10] .rel.BTF.ext      REL              0000000000000000  000007e0
+           0000000000000040  0000000000000010          16     9     8
+      ......
+    -bash-4.4$ clang -S -g -O2 -target bpf t2.c
+    -bash-4.4$ cat t2.s
+      ......
+            .section        .BTF,"",@progbits
+            .short  60319                   # 0xeb9f
+            .byte   1
+            .byte   0
+            .long   24
+            .long   0
+            .long   220
+            .long   220
+            .long   122
+            .long   0                       # BTF_KIND_FUNC_PROTO(id = 1)
+            .long   218103808               # 0xd000000
+            .long   2
+            .long   83                      # BTF_KIND_INT(id = 2)
+            .long   16777216                # 0x1000000
+            .long   4
+            .long   16777248                # 0x1000020
+      ......
+            .byte   0                       # string offset=0
+            .ascii  ".text"                 # string offset=1
+            .byte   0
+            .ascii  "/home/yhs/tmp-pahole/t2.c" # string offset=7
+            .byte   0
+            .ascii  "int main() { return 0; }" # string offset=33
+            .byte   0
+            .ascii  "int test() { return 0; }" # string offset=58
+            .byte   0
+            .ascii  "int"                   # string offset=83
+      ......
+            .section        .BTF.ext,"",@progbits
+            .short  60319                   # 0xeb9f
+            .byte   1
+            .byte   0
+            .long   24
+            .long   0
+            .long   28
+            .long   28
+            .long   44
+            .long   8                       # FuncInfo
+            .long   1                       # FuncInfo section string offset=1
+            .long   2
+            .long   .Lfunc_begin0
+            .long   3
+            .long   .Lfunc_begin1
+            .long   5
+            .long   16                      # LineInfo
+            .long   1                       # LineInfo section string offset=1
+            .long   2
+            .long   .Ltmp0
+            .long   7
+            .long   33
+            .long   7182                    # Line 7 Col 14
+            .long   .Ltmp3
+            .long   7
+            .long   58
+            .long   8206                    # Line 8 Col 14
+
+7. Testing
+**********
+
+The kernel bpf selftest ``test_btf.c`` provides an extensive set of BTF related tests.
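As a closing cross-check, the numbers in the ``t2.s`` dump from section 6 can be verified by hand against the header layout in section 2 and the ``line_col`` macros in section 3.3. The sketch below is illustrative only: all function and type names are hypothetical, and it assumes, as the dump suggests, that the string section is the last thing in the ``.BTF`` blob.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum blob_endian { ENDIAN_UNKNOWN = 0, ENDIAN_LITTLE, ENDIAN_BIG };

/* Guess the target endianness from how the 0xeB9F magic was laid out. */
static enum blob_endian btf_blob_endian(const uint8_t *blob, size_t len)
{
    assert(blob != NULL);
    if (len < 2)
        return ENDIAN_UNKNOWN;
    if (blob[0] == 0x9f && blob[1] == 0xeb)
        return ENDIAN_LITTLE;
    if (blob[0] == 0xeb && blob[1] == 0x9f)
        return ENDIAN_BIG;
    return ENDIAN_UNKNOWN;
}

/* Offsets are relative to the end of the header, so the string section
 * ends hdr_len + str_off + str_len bytes into the blob. */
static uint32_t btf_string_section_end(uint32_t hdr_len, uint32_t str_off,
                                       uint32_t str_len)
{
    return hdr_len + str_off + str_len;
}

/* Same arithmetic as BPF_LINE_INFO_LINE_NUM / BPF_LINE_INFO_LINE_COL. */
static uint32_t line_num(uint32_t line_col)    { return line_col >> 10; }
static uint32_t line_column(uint32_t line_col) { return line_col & 0x3ff; }

/* First two bytes of a blob generated for a little-endian target. */
static const uint8_t sample_le_magic[] = { 0x9f, 0xeb };
```

With the dump's header values (``hdr_len = 24``, ``str_off = 220``, ``str_len = 122``), ``btf_string_section_end()`` gives 366, which is ``0x16e``, the ``.BTF`` section size reported by ``readelf``. Likewise, ``7182`` decodes to line 7, column 14, matching the comment in the ``.BTF.ext`` dump.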
+7
Documentation/bpf/index.rst
 The primary info for the bpf syscall is available in the `man-pages`_
 for `bpf(2)`_.
 
+BPF Type Format (BTF)
+=====================
+
+.. toctree::
+   :maxdepth: 1
+
+   btf
 
 
 Frequently asked questions (FAQ)