Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

docs/bpf: Add description for CO-RE relocations

Add a section on CO-RE relocations to llvm_relo.rst. Describe relevant .BTF.ext
structure, `enum bpf_core_relo_kind` and `struct bpf_core_relo` in some detail.

Description is based on doc-strings from:

- include/uapi/linux/bpf.h:struct bpf_core_relo
- tools/lib/bpf/relo_core.c:__bpf_core_types_match()

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20230826222912.2560865-2-eddyz87@gmail.com

Authored by Eduard Zingerman and committed by Daniel Borkmann
be4033d3 2d71a90f

+329 -6
+25 -6
Documentation/bpf/btf.rst
···
 4.2 .BTF.ext section
 --------------------

-The .BTF.ext section encodes func_info and line_info which needs loader
-manipulation before loading into the kernel.
+The .BTF.ext section encodes func_info, line_info and CO-RE relocations
+which needs loader manipulation before loading into the kernel.

 The specification for .BTF.ext section is defined at ``tools/lib/bpf/btf.h``
 and ``tools/lib/bpf/btf.c``.
···
     __u32 func_info_len;
     __u32 line_info_off;
     __u32 line_info_len;
+
+    /* optional part of .BTF.ext header */
+    __u32 core_relo_off;
+    __u32 core_relo_len;
 };

 It is very similar to .BTF section. Instead of type/string section, it
-contains func_info and line_info section. See :ref:`BPF_Prog_Load` for details
-about func_info and line_info record format.
+contains func_info, line_info and core_relo sub-sections.
+See :ref:`BPF_Prog_Load` for details about func_info and line_info
+record format.

 The func_info is organized as below.::

-     func_info_rec_size
+     func_info_rec_size              /* __u32 value */
      btf_ext_info_sec for section #1 /* func_info for section #1 */
      btf_ext_info_sec for section #2 /* func_info for section #2 */
      ...
···

 The line_info is organized as below.::

-     line_info_rec_size
+     line_info_rec_size              /* __u32 value */
      btf_ext_info_sec for section #1 /* line_info for section #1 */
      btf_ext_info_sec for section #2 /* line_info for section #2 */
      ...
···
 kernel API, the ``insn_off`` is the instruction offset in the unit of ``struct
 bpf_insn``. For ELF API, the ``insn_off`` is the byte offset from the
 beginning of section (``btf_ext_info_sec->sec_name_off``).
+
+The core_relo is organized as below.::
+
+     core_relo_rec_size              /* __u32 value */
+     btf_ext_info_sec for section #1 /* core_relo for section #1 */
+     btf_ext_info_sec for section #2 /* core_relo for section #2 */
+
+``core_relo_rec_size`` specifies the size of ``bpf_core_relo``
+structure when .BTF.ext is generated. All ``bpf_core_relo`` structures
+within a single ``btf_ext_info_sec`` describe relocations applied to
+section named by ``btf_ext_info_sec->sec_name_off``.
+
+See :ref:`Documentation/bpf/llvm_reloc <btf-co-re-relocations>`
+for more information on CO-RE relocations.

 4.2 .BTF_ids section
 --------------------
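To illustrate the core_relo layout this hunk documents, here is a minimal sketch that walks the core_relo sub-section of an in-memory .BTF.ext blob and counts its relocation records. It rests on several assumptions: the data is host-endian, offsets are relative to the end of the header (as the libbpf sources describe), and each ``btf_ext_info_sec`` starts with two ``__u32`` fields (``sec_name_off``, ``num_info``) followed by ``num_info`` records of ``core_relo_rec_size`` bytes. The helpers ``read_u32`` and ``count_core_relos`` are hypothetical names, not libbpf API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the header from the hunk above, using fixed-width types. */
struct btf_ext_header {
	uint16_t magic;
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;
	/* offsets below are assumed relative to the end of the header */
	uint32_t func_info_off;
	uint32_t func_info_len;
	uint32_t line_info_off;
	uint32_t line_info_len;
	/* optional part of .BTF.ext header */
	uint32_t core_relo_off;
	uint32_t core_relo_len;
};

/* Hypothetical helper: read a host-endian u32 at byte offset 'off'. */
static uint32_t read_u32(const uint8_t *p, uint64_t off)
{
	uint32_t v;

	memcpy(&v, p + off, sizeof(v));
	return v;
}

/*
 * Hypothetical helper: count CO-RE relocation records in a .BTF.ext blob.
 * Returns 0 when the optional header part is absent or empty,
 * and -1 when the blob is malformed.
 */
static long count_core_relos(const uint8_t *data, size_t size)
{
	const size_t full_hdr = sizeof(struct btf_ext_header);
	struct btf_ext_header hdr;
	uint64_t pos, end;
	uint32_t rec_size;
	long total = 0;

	if (size < 8)	/* need at least magic..hdr_len */
		return -1;
	memset(&hdr, 0, sizeof(hdr));
	memcpy(&hdr, data, size < full_hdr ? size : full_hdr);
	if (hdr.hdr_len > size)
		return -1;
	/* Older producers emit a shorter header without core_relo_* fields. */
	if (hdr.hdr_len < full_hdr || hdr.core_relo_len == 0)
		return 0;

	pos = (uint64_t)hdr.hdr_len + hdr.core_relo_off;
	end = pos + hdr.core_relo_len;
	if (end > size || hdr.core_relo_len < sizeof(uint32_t))
		return -1;

	rec_size = read_u32(data, pos);	/* core_relo_rec_size */
	pos += sizeof(uint32_t);
	if (rec_size == 0)
		return -1;

	while (pos < end) {
		uint32_t num_info;

		if (end - pos < 2 * sizeof(uint32_t))
			return -1;
		/* sec_name_off sits at 'pos', num_info right after it */
		num_info = read_u32(data, pos + sizeof(uint32_t));
		pos += 2 * sizeof(uint32_t);
		if ((uint64_t)num_info * rec_size > end - pos)
			return -1;
		pos += (uint64_t)num_info * rec_size;
		total += num_info;
	}
	return total;
}
```

Checking ``hdr_len`` before touching ``core_relo_off`` and ``core_relo_len`` is what makes the new header fields optional: a blob produced without CO-RE support simply reports zero records.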
+304
Documentation/bpf/llvm_reloc.rst
···
   Offset             Info             Type               Symbol's Value  Symbol's Name
   000000000000002c  0000000200000004 R_BPF_64_NODYLD32      0000000000000000 .text
   0000000000000040  0000000200000004 R_BPF_64_NODYLD32      0000000000000000 .text
+
+.. _btf-co-re-relocations:
+
+=================
+CO-RE Relocations
+=================
+
+From object file point of view CO-RE mechanism is implemented as a set
+of CO-RE specific relocation records. These relocation records are not
+related to ELF relocations and are encoded in .BTF.ext section.
+See :ref:`Documentation/bpf/btf <BTF_Ext_Section>` for more
+information on .BTF.ext structure.
+
+CO-RE relocations are applied to BPF instructions to update immediate
+or offset fields of the instruction at load time with information
+relevant for target kernel.
+
+Field to patch is selected basing on the instruction class:
+
+* For BPF_ALU, BPF_ALU64, BPF_LD `immediate` field is patched;
+* For BPF_LDX, BPF_STX, BPF_ST `offset` field is patched;
+* BPF_JMP, BPF_JMP32 instructions **should not** be patched.
+
+Relocation kinds
+================
+
+There are several kinds of CO-RE relocations that could be split in
+three groups:
+
+* Field-based - patch instruction with field related information, e.g.
+  change offset field of the BPF_LDX instruction to reflect offset
+  of a specific structure field in the target kernel.
+
+* Type-based - patch instruction with type related information, e.g.
+  change immediate field of the BPF_ALU move instruction to 0 or 1 to
+  reflect if specific type is present in the target kernel.
+
+* Enum-based - patch instruction with enum related information, e.g.
+  change immediate field of the BPF_LD_IMM64 instruction to reflect
+  value of a specific enum literal in the target kernel.
+
+The complete list of relocation kinds is represented by the following enum:
+
+.. code-block:: c
+
+ enum bpf_core_relo_kind {
+     BPF_CORE_FIELD_BYTE_OFFSET = 0,  /* field byte offset */
+     BPF_CORE_FIELD_BYTE_SIZE   = 1,  /* field size in bytes */
+     BPF_CORE_FIELD_EXISTS      = 2,  /* field existence in target kernel */
+     BPF_CORE_FIELD_SIGNED      = 3,  /* field signedness (0 - unsigned, 1 - signed) */
+     BPF_CORE_FIELD_LSHIFT_U64  = 4,  /* bitfield-specific left bitshift */
+     BPF_CORE_FIELD_RSHIFT_U64  = 5,  /* bitfield-specific right bitshift */
+     BPF_CORE_TYPE_ID_LOCAL     = 6,  /* type ID in local BPF object */
+     BPF_CORE_TYPE_ID_TARGET    = 7,  /* type ID in target kernel */
+     BPF_CORE_TYPE_EXISTS       = 8,  /* type existence in target kernel */
+     BPF_CORE_TYPE_SIZE         = 9,  /* type size in bytes */
+     BPF_CORE_ENUMVAL_EXISTS    = 10, /* enum value existence in target kernel */
+     BPF_CORE_ENUMVAL_VALUE     = 11, /* enum value integer value */
+     BPF_CORE_TYPE_MATCHES      = 12, /* type match in target kernel */
+ };
+
+Notes:
+
+* ``BPF_CORE_FIELD_LSHIFT_U64`` and ``BPF_CORE_FIELD_RSHIFT_U64`` are
+  supposed to be used to read bitfield values using the following
+  algorithm:
+
+  .. code-block:: c
+
+     // To read bitfield ``f`` from ``struct s``
+     is_signed = relo(s->f, BPF_CORE_FIELD_SIGNED)
+     off = relo(s->f, BPF_CORE_FIELD_BYTE_OFFSET)
+     sz  = relo(s->f, BPF_CORE_FIELD_BYTE_SIZE)
+     l   = relo(s->f, BPF_CORE_FIELD_LSHIFT_U64)
+     r   = relo(s->f, BPF_CORE_FIELD_RSHIFT_U64)
+     // define ``v`` as signed or unsigned integer of size ``sz``
+     v = *({s|u}<sz> *)((void *)s + off)
+     v <<= l
+     v >>= r
+
+* The ``BPF_CORE_TYPE_MATCHES`` queries matching relation, defined as
+  follows:
+
+  * for integers: types match if size and signedness match;
+  * for arrays & pointers: target types are recursively matched;
+  * for structs & unions:
+
+    * local members need to exist in target with the same name;
+
+    * for each member we recursively check match unless it is already behind a
+      pointer, in which case we only check matching names and compatible kind;
+
+  * for enums:
+
+    * local variants have to have a match in target by symbolic name (but not
+      numeric value);
+
+    * size has to match (but enum may match enum64 and vice versa);
+
+  * for function pointers:
+
+    * number and position of arguments in local type has to match target;
+    * for each argument and the return value we recursively check match.
+
+CO-RE Relocation Record
+=======================
+
+Relocation record is encoded as the following structure:
+
+.. code-block:: c
+
+ struct bpf_core_relo {
+     __u32 insn_off;
+     __u32 type_id;
+     __u32 access_str_off;
+     enum bpf_core_relo_kind kind;
+ };
+
+* ``insn_off`` - instruction offset (in bytes) within a code section
+  associated with this relocation;
+
+* ``type_id`` - BTF type ID of the "root" (containing) entity of a
+  relocatable type or field;
+
+* ``access_str_off`` - offset into corresponding .BTF string section.
+  String interpretation depends on specific relocation kind:
+
+  * for field-based relocations, string encodes an accessed field using
+    a sequence of field and array indices, separated by colon (:). It's
+    conceptually very close to LLVM's `getelementptr <GEP_>`_ instruction's
+    arguments for identifying offset to a field. For example, consider the
+    following C code:
+
+    .. code-block:: c
+
+       struct sample {
+           int a;
+           int b;
+           struct { int c[10]; };
+       } __attribute__((preserve_access_index));
+       struct sample *s;
+
+    * Access to ``s[0].a`` would be encoded as ``0:0``:
+
+      * ``0``: first element of ``s`` (as if ``s`` is an array);
+      * ``0``: index of field ``a`` in ``struct sample``.
+
+    * Access to ``s->a`` would be encoded as ``0:0`` as well.
+    * Access to ``s->b`` would be encoded as ``0:1``:
+
+      * ``0``: first element of ``s``;
+      * ``1``: index of field ``b`` in ``struct sample``.
+
+    * Access to ``s[1].c[5]`` would be encoded as ``1:2:0:5``:
+
+      * ``1``: second element of ``s``;
+      * ``2``: index of anonymous structure field in ``struct sample``;
+      * ``0``: index of field ``c`` in anonymous structure;
+      * ``5``: access to array element #5.
+
+  * for type-based relocations, string is expected to be just "0";
+
+  * for enum value-based relocations, string contains an index of enum
+    value within its enum type;
+
+* ``kind`` - one of ``enum bpf_core_relo_kind``.
+
+.. _GEP: https://llvm.org/docs/LangRef.html#getelementptr-instruction
+
+.. _btf_co_re_relocation_examples:
+
+CO-RE Relocation Examples
+=========================
+
+For the following C code:
+
+.. code-block:: c
+
+ struct foo {
+   int a;
+   int b;
+   unsigned c:15;
+ } __attribute__((preserve_access_index));
+
+ enum bar { U, V };
+
+With the following BTF definitions:
+
+.. code-block::
+
+ ...
+ [2] STRUCT 'foo' size=8 vlen=2
+        'a' type_id=3 bits_offset=0
+        'b' type_id=3 bits_offset=32
+        'c' type_id=4 bits_offset=64 bitfield_size=15
+ [3] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
+ [4] INT 'unsigned int' size=4 bits_offset=0 nr_bits=32 encoding=(none)
+ ...
+ [16] ENUM 'bar' encoding=UNSIGNED size=4 vlen=2
+        'U' val=0
+        'V' val=1
+
+Field offset relocations are generated automatically when
+``__attribute__((preserve_access_index))`` is used, for example:
+
+.. code-block:: c
+
+  void alpha(struct foo *s, volatile unsigned long *g) {
+    *g = s->a;
+    s->a = 1;
+  }
+
+   00 <alpha>:
+     0:  r3 = *(s32 *)(r1 + 0x0)
+        00:  CO-RE <byte_off> [2] struct foo::a (0:0)
+     1:  *(u64 *)(r2 + 0x0) = r3
+     2:  *(u32 *)(r1 + 0x0) = 0x1
+        10:  CO-RE <byte_off> [2] struct foo::a (0:0)
+     3:  exit
+
+
+All relocation kinds could be requested via built-in functions.
+E.g. field-based relocations:
+
+.. code-block:: c
+
+  void bravo(struct foo *s, volatile unsigned long *g) {
+    *g = __builtin_preserve_field_info(s->b, 0 /* field byte offset */);
+    *g = __builtin_preserve_field_info(s->b, 1 /* field byte size */);
+    *g = __builtin_preserve_field_info(s->b, 2 /* field existence */);
+    *g = __builtin_preserve_field_info(s->b, 3 /* field signedness */);
+    *g = __builtin_preserve_field_info(s->c, 4 /* bitfield left shift */);
+    *g = __builtin_preserve_field_info(s->c, 5 /* bitfield right shift */);
+  }
+
+   20 <bravo>:
+     4:  r1 = 0x4
+        20:  CO-RE <byte_off> [2] struct foo::b (0:1)
+     5:  *(u64 *)(r2 + 0x0) = r1
+     6:  r1 = 0x4
+        30:  CO-RE <byte_sz> [2] struct foo::b (0:1)
+     7:  *(u64 *)(r2 + 0x0) = r1
+     8:  r1 = 0x1
+        40:  CO-RE <field_exists> [2] struct foo::b (0:1)
+     9:  *(u64 *)(r2 + 0x0) = r1
+    10:  r1 = 0x1
+        50:  CO-RE <signed> [2] struct foo::b (0:1)
+    11:  *(u64 *)(r2 + 0x0) = r1
+    12:  r1 = 0x31
+        60:  CO-RE <lshift_u64> [2] struct foo::c (0:2)
+    13:  *(u64 *)(r2 + 0x0) = r1
+    14:  r1 = 0x31
+        70:  CO-RE <rshift_u64> [2] struct foo::c (0:2)
+    15:  *(u64 *)(r2 + 0x0) = r1
+    16:  exit
+
+
+Type-based relocations:
+
+.. code-block:: c
+
+  void charlie(struct foo *s, volatile unsigned long *g) {
+    *g = __builtin_preserve_type_info(*s, 0 /* type existence */);
+    *g = __builtin_preserve_type_info(*s, 1 /* type size */);
+    *g = __builtin_preserve_type_info(*s, 2 /* type matches */);
+    *g = __builtin_btf_type_id(*s, 0 /* type id in this object file */);
+    *g = __builtin_btf_type_id(*s, 1 /* type id in target kernel */);
+  }
+
+   88 <charlie>:
+    17:  r1 = 0x1
+        88:  CO-RE <type_exists> [2] struct foo
+    18:  *(u64 *)(r2 + 0x0) = r1
+    19:  r1 = 0xc
+        98:  CO-RE <type_size> [2] struct foo
+    20:  *(u64 *)(r2 + 0x0) = r1
+    21:  r1 = 0x1
+        a8:  CO-RE <type_matches> [2] struct foo
+    22:  *(u64 *)(r2 + 0x0) = r1
+    23:  r1 = 0x2 ll
+        b8:  CO-RE <local_type_id> [2] struct foo
+    25:  *(u64 *)(r2 + 0x0) = r1
+    26:  r1 = 0x2 ll
+        d0:  CO-RE <target_type_id> [2] struct foo
+    28:  *(u64 *)(r2 + 0x0) = r1
+    29:  exit
+
+Enum-based relocations:
+
+.. code-block:: c
+
+  void delta(struct foo *s, volatile unsigned long *g) {
+    *g = __builtin_preserve_enum_value(*(enum bar *)U, 0 /* enum literal existence */);
+    *g = __builtin_preserve_enum_value(*(enum bar *)V, 1 /* enum literal value */);
+  }
+
+   f0 <delta>:
+    30:  r1 = 0x1 ll
+        f0:  CO-RE <enumval_exists> [16] enum bar::U = 0
+    32:  *(u64 *)(r2 + 0x0) = r1
+    33:  r1 = 0x1 ll
+       108:  CO-RE <enumval_value> [16] enum bar::V = 1
+    35:  *(u64 *)(r2 + 0x0) = r1
+    36:  exit
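The bitfield read algorithm from the notes on relocation kinds is given only as pseudocode, so here is a minimal plain-C sketch of it that runs on the host. It is an illustration under stated assumptions, not the loader's or libbpf's implementation: ``struct bitfield_relo`` and ``read_bitfield`` are hypothetical names, and the five relocation results are passed in as ordinary values instead of being patched into instructions. The shift amounts 49 (``0x31``) come from the ``bravo`` listing for ``s->c``. The byte offset 8 and byte size 4 used in the usage note are assumptions derived from the BTF dump (``bits_offset=64``, underlying ``unsigned int``).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the values the relocations would supply. */
struct bitfield_relo {
	uint32_t byte_off;    /* BPF_CORE_FIELD_BYTE_OFFSET */
	uint32_t byte_sz;     /* BPF_CORE_FIELD_BYTE_SIZE */
	uint32_t lshift_u64;  /* BPF_CORE_FIELD_LSHIFT_U64 */
	uint32_t rshift_u64;  /* BPF_CORE_FIELD_RSHIFT_U64 */
	int      is_signed;   /* BPF_CORE_FIELD_SIGNED */
};

/*
 * Hypothetical helper: load byte_sz bytes at base + byte_off into a
 * 64-bit value, shift left to drop the bits above the field, then shift
 * right to drop the bits below it (sign-extending for signed fields).
 */
static uint64_t read_bitfield(const void *base, const struct bitfield_relo *r)
{
	const uint8_t *p = (const uint8_t *)base + r->byte_off;
	uint64_t v;

	switch (r->byte_sz) {
	case 1: { uint8_t  t; memcpy(&t, p, sizeof(t)); v = t; break; }
	case 2: { uint16_t t; memcpy(&t, p, sizeof(t)); v = t; break; }
	case 4: { uint32_t t; memcpy(&t, p, sizeof(t)); v = t; break; }
	case 8: { uint64_t t; memcpy(&t, p, sizeof(t)); v = t; break; }
	default: return 0;	/* unsupported size in this sketch */
	}

	v <<= r->lshift_u64;
	if (r->is_signed)
		/*
		 * Right shift of a negative value is implementation-defined
		 * in C; mainstream compilers perform an arithmetic shift.
		 */
		return (uint64_t)((int64_t)v >> r->rshift_u64);
	return v >> r->rshift_u64;
}
```

For the 15-bit field ``c``, calling ``read_bitfield`` with ``{ .byte_off = 8, .byte_sz = 4, .lshift_u64 = 49, .rshift_u64 = 49, .is_signed = 0 }`` keeps only the low 15 bits of the 32-bit word at offset 8. The shift amounts depend on the target's endianness and field layout, which is why they are relocated and not hard-coded.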