Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

bpf: Fix reg_set_min_max corruption of fake_reg

Juan reported that after doing some changes to buzzer [0] and implementing
a new fuzzing strategy guided by coverage, they noticed the following in
one of the probes:

[...]
13: (79) r6 = *(u64 *)(r0 +0) ; R0=map_value(ks=4,vs=8) R6_w=scalar()
14: (b7) r0 = 0 ; R0_w=0
15: (b4) w0 = -1 ; R0_w=0xffffffff
16: (74) w0 >>= 1 ; R0_w=0x7fffffff
17: (5c) w6 &= w0 ; R0_w=0x7fffffff R6_w=scalar(smin=smin32=0,smax=umax=umax32=0x7fffffff,var_off=(0x0; 0x7fffffff))
18: (44) w6 |= 2 ; R6_w=scalar(smin=umin=smin32=umin32=2,smax=umax=umax32=0x7fffffff,var_off=(0x2; 0x7ffffffd))
19: (56) if w6 != 0x7ffffffd goto pc+1
REG INVARIANTS VIOLATION (true_reg2): range bounds violation u64=[0x7fffffff, 0x7ffffffd] s64=[0x7fffffff, 0x7ffffffd] u32=[0x7fffffff, 0x7ffffffd] s32=[0x7fffffff, 0x7ffffffd] var_off=(0x7fffffff, 0x0)
REG INVARIANTS VIOLATION (false_reg1): range bounds violation u64=[0x7fffffff, 0x7ffffffd] s64=[0x7fffffff, 0x7ffffffd] u32=[0x7fffffff, 0x7ffffffd] s32=[0x7fffffff, 0x7ffffffd] var_off=(0x7fffffff, 0x0)
REG INVARIANTS VIOLATION (false_reg2): const tnum out of sync with range bounds u64=[0x0, 0xffffffffffffffff] s64=[0x8000000000000000, 0x7fffffffffffffff] u32=[0x0, 0xffffffff] s32=[0x80000000, 0x7fffffff] var_off=(0x7fffffff, 0x0)
19: R6_w=0x7fffffff
20: (95) exit

from 19 to 21: R0=0x7fffffff R6=scalar(smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=0x7ffffffe,var_off=(0x2; 0x7ffffffd)) R7=map_ptr(ks=4,vs=8) R9=ctx() R10=fp0 fp-24=map_ptr(ks=4,vs=8) fp-40=mmmmmmmm
21: R0=0x7fffffff R6=scalar(smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=0x7ffffffe,var_off=(0x2; 0x7ffffffd)) R7=map_ptr(ks=4,vs=8) R9=ctx() R10=fp0 fp-24=map_ptr(ks=4,vs=8) fp-40=mmmmmmmm
21: (14) w6 -= 2147483632 ; R6_w=scalar(smin=umin=umin32=2,smax=umax=0xffffffff,smin32=0x80000012,smax32=14,var_off=(0x2; 0xfffffffd))
22: (76) if w6 s>= 0xe goto pc+1 ; R6_w=scalar(smin=umin=umin32=2,smax=umax=0xffffffff,smin32=0x80000012,smax32=13,var_off=(0x2; 0xfffffffd))
23: (95) exit

from 22 to 24: R0=0x7fffffff R6_w=14 R7=map_ptr(ks=4,vs=8) R9=ctx() R10=fp0 fp-24=map_ptr(ks=4,vs=8) fp-40=mmmmmmmm
24: R0=0x7fffffff R6_w=14 R7=map_ptr(ks=4,vs=8) R9=ctx() R10=fp0 fp-24=map_ptr(ks=4,vs=8) fp-40=mmmmmmmm
24: (14) w6 -= 14 ; R6_w=0
[...]

What can be seen here is a register invariant violation on line 19. After
the binary-or in line 18, the verifier knows that bit 2 is set but knows
nothing about the rest of the content which was loaded from a map value,
meaning, range is [2,0x7fffffff] with var_off=(0x2; 0x7ffffffd). When in
line 19 the verifier analyzes the branch, it splits the register states
in reg_set_min_max() into the registers of the true branch (true_reg1,
true_reg2) and the registers of the false branch (false_reg1, false_reg2).

Since the test is w6 != 0x7ffffffd, the src_reg is a known constant.
Internally, the verifier creates a "fake" register initialized as scalar
to the value of 0x7ffffffd, and then passes it onto reg_set_min_max(). Now,
for line 19, it is mathematically impossible to take the false branch of
this program, yet the verifier analyzes it. It is impossible because the
second bit of r6 will be set due to the prior or operation and the
constant in the condition has that bit unset (hex(fd) == binary(1111 1101).

When the verifier first analyzes the false / fall-through branch, it will
compute an intersection between the var_off of r6 and of the constant. This
is because the verifier creates a "fake" register initialized to the value
of the constant. The intersection result later refines both registers in
regs_refine_cond_op():

[...]
t = tnum_intersect(tnum_subreg(reg1->var_off), tnum_subreg(reg2->var_off));
reg1->var_off = tnum_with_subreg(reg1->var_off, t);
reg2->var_off = tnum_with_subreg(reg2->var_off, t);
[...]

Since the verifier is analyzing the false branch of the conditional jump,
reg1 is equal to false_reg1 and reg2 is equal to false_reg2, i.e. the reg2
is the "fake" register that was meant to hold a constant value. The resulting
var_off of the intersection says that both registers now hold a known value
of var_off=(0x7fffffff, 0x0) or in other words: this operation manages to
make the verifier think that the "constant" value that was passed in the
jump operation now holds a different value.

Normally this would not be an issue since it should not influence the true
branch, however, false_reg2 and true_reg2 are pointers to the same "fake"
register. Meaning, the false branch can influence the results of the true
branch. In line 24, the verifier assumes R6_w=0, but the actual runtime
value in this case is 1. The fix is simply not passing in the same "fake"
register location as inputs to reg_set_min_max(), but instead making a
copy. Moving the fake_reg into the env also reduces stack consumption by
120 bytes. With this, the verifier successfully rejects invalid accesses
from the test program.

[0] https://github.com/google/buzzer

Fixes: 67420501e868 ("bpf: generalize reg_set_min_max() to handle non-const register comparisons")
Reported-by: Juan José López Jaimez <jjlopezjaimez@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20240613115310.25383-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

authored by

Daniel Borkmann and committed by
Alexei Starovoitov
92424801 26ba7c3f

+12 -4
+2
include/linux/bpf_verifier.h
··· 746 746 /* Same as scratched_regs but for stack slots */ 747 747 u64 scratched_stack_slots; 748 748 u64 prev_log_pos, prev_insn_print_pos; 749 + /* buffer used to temporary hold constants as scalar registers */ 750 + struct bpf_reg_state fake_reg[2]; 749 751 /* buffer used to generate temporary string representations, 750 752 * e.g., in reg_type_str() to generate reg_type string 751 753 */
+10 -4
kernel/bpf/verifier.c
··· 15113 15113 struct bpf_reg_state *regs = this_branch->frame[this_branch->curframe]->regs; 15114 15114 struct bpf_reg_state *dst_reg, *other_branch_regs, *src_reg = NULL; 15115 15115 struct bpf_reg_state *eq_branch_regs; 15116 - struct bpf_reg_state fake_reg = {}; 15117 15116 u8 opcode = BPF_OP(insn->code); 15118 15117 bool is_jmp32; 15119 15118 int pred = -1; ··· 15178 15179 verbose(env, "BPF_JMP/JMP32 uses reserved fields\n"); 15179 15180 return -EINVAL; 15180 15181 } 15181 - src_reg = &fake_reg; 15182 + src_reg = &env->fake_reg[0]; 15183 + memset(src_reg, 0, sizeof(*src_reg)); 15182 15184 src_reg->type = SCALAR_VALUE; 15183 15185 __mark_reg_known(src_reg, insn->imm); 15184 15186 } ··· 15239 15239 &other_branch_regs[insn->src_reg], 15240 15240 dst_reg, src_reg, opcode, is_jmp32); 15241 15241 } else /* BPF_SRC(insn->code) == BPF_K */ { 15242 + /* reg_set_min_max() can mangle the fake_reg. Make a copy 15243 + * so that these are two different memory locations. The 15244 + * src_reg is not used beyond here in context of K. 15245 + */ 15246 + memcpy(&env->fake_reg[1], &env->fake_reg[0], 15247 + sizeof(env->fake_reg[0])); 15242 15248 err = reg_set_min_max(env, 15243 15249 &other_branch_regs[insn->dst_reg], 15244 - src_reg /* fake one */, 15245 - dst_reg, src_reg /* same fake one */, 15250 + &env->fake_reg[0], 15251 + dst_reg, &env->fake_reg[1], 15246 15252 opcode, is_jmp32); 15247 15253 } 15248 15254 if (err)