This rename is made to clarify that the flag causes a nospec to be
added after the insn it is set on, and that it can therefore be relied
upon to reduce speculative path analysis.
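For illustration, when the flag is set on a store, the rewrite pass in
convert_ctx_accesses() (see the hunk below) emits a barrier directly
after it, yielding roughly:

	*(u64 *)(r10 - 8) = r1	/* insn with aux->nospec_result set */
	nospec			/* BPF_ST_NOSPEC() patched in afterwards */

Speculative state downstream of the barrier can thus be disregarded.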
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Cc: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
include/linux/bpf_verifier.h | 2 +-
kernel/bpf/verifier.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index d6cfc4ee6820..da586dd4703e 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -563,7 +563,7 @@ struct bpf_insn_aux_data {
u64 map_key_state; /* constant (32 bit) key tracking for maps */
int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
u32 seen; /* this insn was processed by the verifier at env->pass_cnt */
- bool sanitize_stack_spill; /* subject to Spectre v4 sanitation */
+ bool nospec_result; /* result is unsafe under speculation, nospec must follow */
bool zext_dst; /* this insn zero extends dst reg */
bool needs_zext; /* alu op needs to clear upper bits */
bool storage_get_func_atomic; /* bpf_*_storage_get() with atomic memory alloc */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index eb65038682b0..4c1ed31d86af 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5007,7 +5007,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
}
if (sanitize)
- env->insn_aux_data[insn_idx].sanitize_stack_spill = true;
+ env->insn_aux_data[insn_idx].nospec_result = true;
}
err = destroy_if_dynptr_stack_slot(env, state, spi);
@@ -20719,7 +20719,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
}
if (type == BPF_WRITE &&
- env->insn_aux_data[i + delta].sanitize_stack_spill) {
+ env->insn_aux_data[i + delta].nospec_result) {
struct bpf_insn patch[] = {
*insn,
BPF_ST_NOSPEC(),
--
2.48.1
This implements the core of the series and causes the verifier to fall
back to mitigating Spectre v1 using speculation barriers. The approach
was presented at LPC'24 [1] and RAID'24 [2].
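For example, take the first case in verifier/dead_code.c (updated
below): the load at insn 1 is architecturally dead, but can be reached
speculatively if the conditional jump at insn 4 is mispredicted.
Instead of rejecting the program for unpriv with "R9 !read_ok", the
verifier now inserts a nospec in front of the load:

	BPF_JMP_IMM(BPF_JA, 0, 0, 2),
	/* unpriv: nospec inserted here by the verifier */
	BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
	BPF_JMP_IMM(BPF_JA, 0, 0, 2),
	BPF_MOV64_IMM(BPF_REG_0, 7),
	BPF_JMP_IMM(BPF_JGE, BPF_REG_0, 10, -4),
	BPF_EXIT_INSN(),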
This has the unfortunate consequence that Spectre v1 mitigations now
only support architectures which implement BPF_NOSPEC. Before this
commit, Spectre v1 mitigations prevented exploits by rejecting the
programs on all architectures. Because some JITs do not implement
BPF_NOSPEC, this patch may therefore regress unpriv BPF's security to a
limited extent:
* The regression is limited to systems that are vulnerable to Spectre
v1, have unprivileged BPF enabled, and do NOT emit insns for
BPF_NOSPEC. The latter is not the case for x86 64- and 32-bit, arm64,
and powerpc 64-bit, which are therefore not affected by the
regression.
According to commit a6f6a95f2580 ("LoongArch, bpf: Fix jit to skip
speculation barrier opcode"), LoongArch is not vulnerable to Spectre
v1 and therefore also not affected by the regression.
* To the best of my knowledge this regression may therefore only affect
MIPS. This is deemed acceptable because unpriv BPF is still disabled
there by default. As stated in a previous commit, BPF_NOSPEC could be
implemented for MIPS based on GCC's speculation_barrier
implementation.
* It is unclear which other architectures (besides x86 64- and 32-bit,
ARM64, PowerPC 64-bit, LoongArch, and MIPS) supported by the kernel
are vulnerable to Spectre v1. Also, it is not clear if barriers are
available on these architectures. Implementing BPF_NOSPEC on these
architectures is therefore non-trivial. Searching GCC and the kernel
for speculation barrier implementations for these architectures
yielded no result.
* If any of those regressed systems is also vulnerable to Spectre v4,
the system was already vulnerable to Spectre v4 attacks based on
unpriv BPF before this patch and the impact is therefore further
limited.
As an alternative to regressing security, one could still reject
programs if the architecture does not emit BPF_NOSPEC (e.g., by
removing the empty BPF_NOSPEC case from all JITs except for LoongArch,
where it appears justified). However, this would cause rejections on
these archs that are likely unfounded in the vast majority of cases.
In the tests, some programs that previously triggered a false-positive
rejection are now accepted. Change those tests to indicate (as a
comment) where the nospec is inserted, and update the expected error
message where the nospec mitigates a problem that previously shadowed
another one.
I briefly went through all occurrences of EPERM, EINVAL, and EACCES in
the verifier in order to validate that catching them like this makes
sense.
[1] https://lpc.events/event/18/contributions/1954/ ("Mitigating
Spectre-PHT using Speculation Barriers in Linux eBPF")
[2] https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and
Precise Spectre Defenses for Untrusted Linux Kernel Extensions")
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
include/linux/bpf_verifier.h | 1 +
kernel/bpf/verifier.c | 68 +++++++++++++++++--
.../selftests/bpf/progs/verifier_and.c | 3 +-
.../selftests/bpf/progs/verifier_bounds.c | 30 ++++----
.../selftests/bpf/progs/verifier_movsx.c | 6 +-
.../selftests/bpf/progs/verifier_unpriv.c | 3 +-
.../bpf/progs/verifier_value_ptr_arith.c | 14 ++--
.../selftests/bpf/verifier/dead_code.c | 3 +-
tools/testing/selftests/bpf/verifier/jmp32.c | 33 +++------
tools/testing/selftests/bpf/verifier/jset.c | 10 ++-
10 files changed, 115 insertions(+), 56 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index da586dd4703e..588c6ac79987 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -563,6 +563,7 @@ struct bpf_insn_aux_data {
u64 map_key_state; /* constant (32 bit) key tracking for maps */
int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
u32 seen; /* this insn was processed by the verifier at env->pass_cnt */
+ bool nospec; /* do not execute this instruction speculatively */
bool nospec_result; /* result is unsafe under speculation, nospec must follow */
bool zext_dst; /* this insn zero extends dst reg */
bool needs_zext; /* alu op needs to clear upper bits */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 4c1ed31d86af..8093a5bac4d1 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1990,6 +1990,18 @@ static int pop_stack(struct bpf_verifier_env *env, int *prev_insn_idx,
return 0;
}
+static bool error_recoverable_with_nospec(int err)
+{
+ /* Should only return true for non-fatal errors that are allowed to
+ * occur during speculative verification. For these we can insert a
+ * nospec and the program might still be accepted. Do not include
+ * something like ENOMEM because it is likely to re-occur for the next
+ * architectural path once it has been recovered-from in all speculative
+ * paths.
+ */
+ return err == -EPERM || err == -EACCES || err == -EINVAL;
+}
+
static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
int insn_idx, int prev_insn_idx,
bool speculative)
@@ -11084,7 +11096,7 @@ static int check_get_func_ip(struct bpf_verifier_env *env)
return -ENOTSUPP;
}
-static struct bpf_insn_aux_data *cur_aux(struct bpf_verifier_env *env)
+static struct bpf_insn_aux_data *cur_aux(const struct bpf_verifier_env *env)
{
return &env->insn_aux_data[env->insn_idx];
}
@@ -13828,7 +13840,9 @@ static int retrieve_ptr_limit(const struct bpf_reg_state *ptr_reg,
static bool can_skip_alu_sanitation(const struct bpf_verifier_env *env,
const struct bpf_insn *insn)
{
- return env->bypass_spec_v1 || BPF_SRC(insn->code) == BPF_K;
+ return env->bypass_spec_v1 ||
+ BPF_SRC(insn->code) == BPF_K ||
+ cur_aux(env)->nospec;
}
static int update_alu_sanitation_state(struct bpf_insn_aux_data *aux,
@@ -19514,13 +19528,34 @@ static int do_check(struct bpf_verifier_env *env)
sanitize_mark_insn_seen(env);
prev_insn_idx = env->insn_idx;
+ /* Reduce verification complexity by stopping speculative path
+ * verification when a nospec is encountered.
+ */
+ if (state->speculative && cur_aux(env)->nospec)
+ goto process_bpf_exit;
+
err = do_check_insn(env, insn, pop_log, &do_print_state, regs, state,
&prev_insn_idx);
- if (err < 0) {
+ if (state->speculative && error_recoverable_with_nospec(err)) {
+ /* Prevent this speculative path from ever reaching the
+ * insn that would have been unsafe to execute.
+ */
+ cur_aux(env)->nospec = true;
+ /* If it was an ADD/SUB insn, potentially remove any
+ * markings for alu sanitization.
+ */
+ cur_aux(env)->alu_state = 0;
+ goto process_bpf_exit;
+ } else if (err < 0) {
return err;
} else if (err == INSN_IDX_MODIFIED) {
continue;
} else if (err == PROCESS_BPF_EXIT) {
+ goto process_bpf_exit;
+ }
+ WARN_ON_ONCE(err);
+
+ if (state->speculative && cur_aux(env)->nospec_result) {
process_bpf_exit:
mark_verifier_state_scratched(env);
update_branch_counts(env, env->cur_state);
@@ -19539,7 +19574,6 @@ static int do_check(struct bpf_verifier_env *env)
continue;
}
}
- WARN_ON_ONCE(err);
env->insn_idx++;
}
@@ -20670,6 +20704,29 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
bpf_convert_ctx_access_t convert_ctx_access;
u8 mode;
+ if (env->insn_aux_data[i + delta].nospec) {
+ struct bpf_insn patch[] = {
+ BPF_ST_NOSPEC(),
+ *insn,
+ };
+
+ WARN_ON_ONCE(env->insn_aux_data[i + delta].alu_state);
+ cnt = ARRAY_SIZE(patch);
+ new_prog = bpf_patch_insn_data(env, i + delta, patch, cnt);
+ if (!new_prog)
+ return -ENOMEM;
+
+ delta += cnt - 1;
+ env->prog = new_prog;
+ insn = new_prog->insnsi + i + delta;
+ /* This can not be easily merged with the
+ * nospec_result-case, because an insn may require a
+ * nospec before and after itself. Therefore also do not
+ * 'continue' here but potentially apply further
+ * patching to insn. *insn should equal patch[1] now.
+ */
+ }
+
if (insn->code == (BPF_LDX | BPF_MEM | BPF_B) ||
insn->code == (BPF_LDX | BPF_MEM | BPF_H) ||
insn->code == (BPF_LDX | BPF_MEM | BPF_W) ||
@@ -20720,6 +20777,9 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
if (type == BPF_WRITE &&
env->insn_aux_data[i + delta].nospec_result) {
+ /* nospec_result is only used to mitigate Spectre v4 and
+ * to limit verification-time for Spectre v1.
+ */
struct bpf_insn patch[] = {
*insn,
BPF_ST_NOSPEC(),
diff --git a/tools/testing/selftests/bpf/progs/verifier_and.c b/tools/testing/selftests/bpf/progs/verifier_and.c
index e97e518516b6..98bafce6d8f3 100644
--- a/tools/testing/selftests/bpf/progs/verifier_and.c
+++ b/tools/testing/selftests/bpf/progs/verifier_and.c
@@ -85,7 +85,7 @@ l0_%=: r0 = r0; \
SEC("socket")
__description("check known subreg with unknown reg")
-__success __failure_unpriv __msg_unpriv("R1 !read_ok")
+__success __success_unpriv
__retval(0)
__naked void known_subreg_with_unknown_reg(void)
{
@@ -96,6 +96,7 @@ __naked void known_subreg_with_unknown_reg(void)
r0 &= 0xFFFF1234; \
/* Upper bits are unknown but AND above masks out 1 zero'ing lower bits */\
if w0 < 1 goto l0_%=; \
+ /* unpriv: nospec (inserted to prevent `R1 !read_ok`) */\
r1 = *(u32*)(r1 + 512); \
l0_%=: r0 = 0; \
exit; \
diff --git a/tools/testing/selftests/bpf/progs/verifier_bounds.c b/tools/testing/selftests/bpf/progs/verifier_bounds.c
index 0eb33bb801b5..0488ef05a7a4 100644
--- a/tools/testing/selftests/bpf/progs/verifier_bounds.c
+++ b/tools/testing/selftests/bpf/progs/verifier_bounds.c
@@ -620,7 +620,7 @@ l1_%=: exit; \
SEC("socket")
__description("bounds check mixed 32bit and 64bit arithmetic. test1")
-__success __failure_unpriv __msg_unpriv("R0 invalid mem access 'scalar'")
+__success __success_unpriv
__retval(0)
__naked void _32bit_and_64bit_arithmetic_test1(void)
{
@@ -636,6 +636,7 @@ __naked void _32bit_and_64bit_arithmetic_test1(void)
if w1 > 2 goto l0_%=; \
goto l1_%=; \
l0_%=: /* invalid ldx if bounds are lost above */ \
+ /* unpriv: nospec (inserted to prevent `R0 invalid mem access 'scalar'`) */\
r0 = *(u64*)(r0 - 1); \
l1_%=: exit; \
" ::: __clobber_all);
@@ -643,7 +644,7 @@ l1_%=: exit; \
SEC("socket")
__description("bounds check mixed 32bit and 64bit arithmetic. test2")
-__success __failure_unpriv __msg_unpriv("R0 invalid mem access 'scalar'")
+__success __success_unpriv
__retval(0)
__naked void _32bit_and_64bit_arithmetic_test2(void)
{
@@ -660,6 +661,7 @@ __naked void _32bit_and_64bit_arithmetic_test2(void)
if r1 > r2 goto l0_%=; \
goto l1_%=; \
l0_%=: /* invalid ldx if bounds are lost above */ \
+ /* unpriv: nospec (inserted to prevent `R0 invalid mem access 'scalar'`) */\
r0 = *(u64*)(r0 - 1); \
l1_%=: exit; \
" ::: __clobber_all);
@@ -691,8 +693,7 @@ l0_%=: r0 = 0; \
SEC("socket")
__description("bounds check for reg = 0, reg xor 1")
-__success __failure_unpriv
-__msg_unpriv("R0 min value is outside of the allowed memory range")
+__success __success_unpriv
__retval(0)
__naked void reg_0_reg_xor_1(void)
{
@@ -708,6 +709,7 @@ __naked void reg_0_reg_xor_1(void)
l0_%=: r1 = 0; \
r1 ^= 1; \
if r1 != 0 goto l1_%=; \
+ /* unpriv: nospec (inserted to prevent `R0 min value is outside of the allowed memory range`) */\
r0 = *(u64*)(r0 + 8); \
l1_%=: r0 = 0; \
exit; \
@@ -719,8 +721,7 @@ l1_%=: r0 = 0; \
SEC("socket")
__description("bounds check for reg32 = 0, reg32 xor 1")
-__success __failure_unpriv
-__msg_unpriv("R0 min value is outside of the allowed memory range")
+__success __success_unpriv
__retval(0)
__naked void reg32_0_reg32_xor_1(void)
{
@@ -736,6 +737,7 @@ __naked void reg32_0_reg32_xor_1(void)
l0_%=: w1 = 0; \
w1 ^= 1; \
if w1 != 0 goto l1_%=; \
+ /* unpriv: nospec (inserted to prevent `R0 min value is outside of the allowed memory range`) */\
r0 = *(u64*)(r0 + 8); \
l1_%=: r0 = 0; \
exit; \
@@ -747,8 +749,7 @@ l1_%=: r0 = 0; \
SEC("socket")
__description("bounds check for reg = 2, reg xor 3")
-__success __failure_unpriv
-__msg_unpriv("R0 min value is outside of the allowed memory range")
+__success __success_unpriv
__retval(0)
__naked void reg_2_reg_xor_3(void)
{
@@ -764,6 +765,7 @@ __naked void reg_2_reg_xor_3(void)
l0_%=: r1 = 2; \
r1 ^= 3; \
if r1 > 0 goto l1_%=; \
+ /* unpriv: nospec (inserted to prevent `R0 min value is outside of the allowed memory range`) */\
r0 = *(u64*)(r0 + 8); \
l1_%=: r0 = 0; \
exit; \
@@ -829,8 +831,7 @@ l1_%=: r0 = 0; \
SEC("socket")
__description("bounds check for reg > 0, reg xor 3")
-__success __failure_unpriv
-__msg_unpriv("R0 min value is outside of the allowed memory range")
+__success __success_unpriv
__retval(0)
__naked void reg_0_reg_xor_3(void)
{
@@ -843,7 +844,8 @@ __naked void reg_0_reg_xor_3(void)
call %[bpf_map_lookup_elem]; \
if r0 != 0 goto l0_%=; \
exit; \
-l0_%=: r1 = *(u64*)(r0 + 0); \
+l0_%=: /* unpriv: nospec (inserted to prevent `R0 min value is outside of the allowed memory range`) */\
+ r1 = *(u64*)(r0 + 0); \
if r1 <= 0 goto l1_%=; \
r1 ^= 3; \
if r1 >= 0 goto l1_%=; \
@@ -858,8 +860,7 @@ l1_%=: r0 = 0; \
SEC("socket")
__description("bounds check for reg32 > 0, reg32 xor 3")
-__success __failure_unpriv
-__msg_unpriv("R0 min value is outside of the allowed memory range")
+__success __success_unpriv
__retval(0)
__naked void reg32_0_reg32_xor_3(void)
{
@@ -872,7 +873,8 @@ __naked void reg32_0_reg32_xor_3(void)
call %[bpf_map_lookup_elem]; \
if r0 != 0 goto l0_%=; \
exit; \
-l0_%=: r1 = *(u64*)(r0 + 0); \
+l0_%=: /* unpriv: nospec (inserted to prevent `R0 min value is outside of the allowed memory range`) */\
+ r1 = *(u64*)(r0 + 0); \
if w1 <= 0 goto l1_%=; \
w1 ^= 3; \
if w1 >= 0 goto l1_%=; \
diff --git a/tools/testing/selftests/bpf/progs/verifier_movsx.c b/tools/testing/selftests/bpf/progs/verifier_movsx.c
index 994bbc346d25..5de9bcbe7425 100644
--- a/tools/testing/selftests/bpf/progs/verifier_movsx.c
+++ b/tools/testing/selftests/bpf/progs/verifier_movsx.c
@@ -245,7 +245,7 @@ l0_%=: \
SEC("socket")
__description("MOV32SX, S8, var_off not u32_max, positive after s8 extension")
__success __retval(0)
-__failure_unpriv __msg_unpriv("frame pointer is read only")
+__success_unpriv
__naked void mov64sx_s32_varoff_2(void)
{
asm volatile (" \
@@ -257,6 +257,7 @@ __naked void mov64sx_s32_varoff_2(void)
w0 = 0; \
exit; \
l0_%=: \
+ /* nospec (inserted to prevent `frame pointer is read only`) */\
r10 = 1; \
exit; \
" :
@@ -267,7 +268,7 @@ l0_%=: \
SEC("socket")
__description("MOV32SX, S8, var_off not u32_max, negative after s8 extension")
__success __retval(0)
-__failure_unpriv __msg_unpriv("frame pointer is read only")
+__success_unpriv
__naked void mov64sx_s32_varoff_3(void)
{
asm volatile (" \
@@ -280,6 +281,7 @@ __naked void mov64sx_s32_varoff_3(void)
w0 = 0; \
exit; \
l0_%=: \
+ /* nospec (inserted to prevent `frame pointer is read only`) */\
r10 = 1; \
exit; \
" :
diff --git a/tools/testing/selftests/bpf/progs/verifier_unpriv.c b/tools/testing/selftests/bpf/progs/verifier_unpriv.c
index a4a5e2071604..af35365558ce 100644
--- a/tools/testing/selftests/bpf/progs/verifier_unpriv.c
+++ b/tools/testing/selftests/bpf/progs/verifier_unpriv.c
@@ -572,7 +572,7 @@ l0_%=: exit; \
SEC("socket")
__description("alu32: mov u32 const")
-__success __failure_unpriv __msg_unpriv("R7 invalid mem access 'scalar'")
+__success __success_unpriv
__retval(0)
__naked void alu32_mov_u32_const(void)
{
@@ -581,6 +581,7 @@ __naked void alu32_mov_u32_const(void)
w7 &= 1; \
w0 = w7; \
if r0 == 0 goto l0_%=; \
+ /* unpriv: nospec (inserted to prevent `R7 invalid mem access 'scalar'`) */\
r0 = *(u64*)(r7 + 0); \
l0_%=: exit; \
" ::: __clobber_all);
diff --git a/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c b/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c
index 5ba6e53571c8..2acd7769bd91 100644
--- a/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c
+++ b/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c
@@ -398,7 +398,8 @@ l2_%=: r0 = 1; \
SEC("socket")
__description("map access: mixing value pointer and scalar, 1")
-__success __failure_unpriv __msg_unpriv("R2 pointer comparison prohibited")
+__success __failure_unpriv
+__msg_unpriv("R2 tried to add from different maps, paths or scalars, pointer arithmetic with it prohibited for !root")
__retval(0)
__naked void value_pointer_and_scalar_1(void)
{
@@ -433,6 +434,7 @@ l2_%=: /* common instruction */ \
l3_%=: /* branch B */ \
r0 = 0x13371337; \
/* verifier follows fall-through */ \
+ /* unpriv: nospec (inserted to prevent `R2 pointer comparison prohibited`) */\
if r2 != 0x100000 goto l4_%=; \
r0 = 0; \
exit; \
@@ -450,7 +452,8 @@ l4_%=: /* fake-dead code; targeted from branch A to \
SEC("socket")
__description("map access: mixing value pointer and scalar, 2")
-__success __failure_unpriv __msg_unpriv("R0 invalid mem access 'scalar'")
+__success __failure_unpriv
+__msg_unpriv("R2 tried to add from different maps, paths or scalars, pointer arithmetic with it prohibited for !root")
__retval(0)
__naked void value_pointer_and_scalar_2(void)
{
@@ -466,6 +469,7 @@ __naked void value_pointer_and_scalar_2(void)
if r0 != 0 goto l0_%=; \
exit; \
l0_%=: /* load some number from the map into r1 */ \
+ /* unpriv: nospec (inserted to prevent `R0 invalid mem access 'scalar'`) */\
r1 = *(u8*)(r0 + 0); \
/* depending on r1, branch: */ \
if r1 == 0 goto l1_%=; \
@@ -1296,11 +1300,13 @@ l0_%=: r0 = 1; \
SEC("socket")
__description("map access: value_ptr -= unknown scalar, 2")
-__success __failure_unpriv
-__msg_unpriv("R0 pointer arithmetic of map value goes out of range")
+__success __success_unpriv
__retval(1)
__naked void value_ptr_unknown_scalar_2_2(void)
{
+ /* unpriv: nospec inserted by verifier to mitigate 'R0 pointer
+ * arithmetic of map value goes out of range'.
+ */
asm volatile (" \
r1 = 0; \
*(u64*)(r10 - 8) = r1; \
diff --git a/tools/testing/selftests/bpf/verifier/dead_code.c b/tools/testing/selftests/bpf/verifier/dead_code.c
index ee454327e5c6..77207b498c6f 100644
--- a/tools/testing/selftests/bpf/verifier/dead_code.c
+++ b/tools/testing/selftests/bpf/verifier/dead_code.c
@@ -2,14 +2,13 @@
"dead code: start",
.insns = {
BPF_JMP_IMM(BPF_JA, 0, 0, 2),
+ /* unpriv: nospec (inserted to prevent "R9 !read_ok") */
BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
BPF_JMP_IMM(BPF_JA, 0, 0, 2),
BPF_MOV64_IMM(BPF_REG_0, 7),
BPF_JMP_IMM(BPF_JGE, BPF_REG_0, 10, -4),
BPF_EXIT_INSN(),
},
- .errstr_unpriv = "R9 !read_ok",
- .result_unpriv = REJECT,
.result = ACCEPT,
.retval = 7,
},
diff --git a/tools/testing/selftests/bpf/verifier/jmp32.c b/tools/testing/selftests/bpf/verifier/jmp32.c
index 43776f6f92f4..91d83e9cb148 100644
--- a/tools/testing/selftests/bpf/verifier/jmp32.c
+++ b/tools/testing/selftests/bpf/verifier/jmp32.c
@@ -84,11 +84,10 @@
BPF_JMP32_IMM(BPF_JSET, BPF_REG_7, 0x10, 1),
BPF_EXIT_INSN(),
BPF_JMP32_IMM(BPF_JGE, BPF_REG_7, 0x10, 1),
+ /* unpriv: nospec (inserted to prevent "R9 !read_ok") */
BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
BPF_EXIT_INSN(),
},
- .errstr_unpriv = "R9 !read_ok",
- .result_unpriv = REJECT,
.result = ACCEPT,
},
{
@@ -149,11 +148,10 @@
BPF_JMP32_IMM(BPF_JEQ, BPF_REG_7, 0x10, 1),
BPF_EXIT_INSN(),
BPF_JMP32_IMM(BPF_JSGE, BPF_REG_7, 0xf, 1),
+ /* unpriv: nospec (inserted to prevent "R9 !read_ok") */
BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
BPF_EXIT_INSN(),
},
- .errstr_unpriv = "R9 !read_ok",
- .result_unpriv = REJECT,
.result = ACCEPT,
},
{
@@ -214,11 +212,10 @@
BPF_JMP32_IMM(BPF_JNE, BPF_REG_7, 0x10, 1),
BPF_JMP_IMM(BPF_JNE, BPF_REG_7, 0x10, 1),
BPF_EXIT_INSN(),
+ /* unpriv: nospec (inserted to prevent "R9 !read_ok") */
BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
BPF_EXIT_INSN(),
},
- .errstr_unpriv = "R9 !read_ok",
- .result_unpriv = REJECT,
.result = ACCEPT,
},
{
@@ -283,11 +280,10 @@
BPF_JMP32_REG(BPF_JGE, BPF_REG_7, BPF_REG_8, 1),
BPF_EXIT_INSN(),
BPF_JMP32_IMM(BPF_JGE, BPF_REG_7, 0x7ffffff0, 1),
+ /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
- .errstr_unpriv = "R0 invalid mem access 'scalar'",
- .result_unpriv = REJECT,
.result = ACCEPT,
.retval = 2,
.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
@@ -354,11 +350,10 @@
BPF_JMP32_REG(BPF_JGT, BPF_REG_7, BPF_REG_8, 1),
BPF_EXIT_INSN(),
BPF_JMP_IMM(BPF_JGT, BPF_REG_7, 0x7ffffff0, 1),
+ /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
- .errstr_unpriv = "R0 invalid mem access 'scalar'",
- .result_unpriv = REJECT,
.result = ACCEPT,
.retval = 2,
.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
@@ -425,11 +420,10 @@
BPF_JMP32_REG(BPF_JLE, BPF_REG_7, BPF_REG_8, 1),
BPF_EXIT_INSN(),
BPF_JMP32_IMM(BPF_JLE, BPF_REG_7, 0x7ffffff0, 1),
+ /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
- .errstr_unpriv = "R0 invalid mem access 'scalar'",
- .result_unpriv = REJECT,
.result = ACCEPT,
.retval = 2,
.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
@@ -496,11 +490,10 @@
BPF_JMP32_REG(BPF_JLT, BPF_REG_7, BPF_REG_8, 1),
BPF_EXIT_INSN(),
BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0x7ffffff0, 1),
+ /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
- .errstr_unpriv = "R0 invalid mem access 'scalar'",
- .result_unpriv = REJECT,
.result = ACCEPT,
.retval = 2,
.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
@@ -567,11 +560,10 @@
BPF_JMP32_REG(BPF_JSGE, BPF_REG_7, BPF_REG_8, 1),
BPF_EXIT_INSN(),
BPF_JMP_IMM(BPF_JSGE, BPF_REG_7, 0x7ffffff0, 1),
+ /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
- .errstr_unpriv = "R0 invalid mem access 'scalar'",
- .result_unpriv = REJECT,
.result = ACCEPT,
.retval = 2,
.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
@@ -638,11 +630,10 @@
BPF_JMP32_REG(BPF_JSGT, BPF_REG_7, BPF_REG_8, 1),
BPF_EXIT_INSN(),
BPF_JMP_IMM(BPF_JSGT, BPF_REG_7, -2, 1),
+ /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
- .errstr_unpriv = "R0 invalid mem access 'scalar'",
- .result_unpriv = REJECT,
.result = ACCEPT,
.retval = 2,
.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
@@ -709,11 +700,10 @@
BPF_JMP32_REG(BPF_JSLE, BPF_REG_7, BPF_REG_8, 1),
BPF_EXIT_INSN(),
BPF_JMP_IMM(BPF_JSLE, BPF_REG_7, 0x7ffffff0, 1),
+ /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
- .errstr_unpriv = "R0 invalid mem access 'scalar'",
- .result_unpriv = REJECT,
.result = ACCEPT,
.retval = 2,
.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
@@ -780,11 +770,10 @@
BPF_JMP32_REG(BPF_JSLT, BPF_REG_7, BPF_REG_8, 1),
BPF_EXIT_INSN(),
BPF_JMP32_IMM(BPF_JSLT, BPF_REG_7, -1, 1),
+ /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
- .errstr_unpriv = "R0 invalid mem access 'scalar'",
- .result_unpriv = REJECT,
.result = ACCEPT,
.retval = 2,
.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
diff --git a/tools/testing/selftests/bpf/verifier/jset.c b/tools/testing/selftests/bpf/verifier/jset.c
index 11fc68da735e..e901eefd774a 100644
--- a/tools/testing/selftests/bpf/verifier/jset.c
+++ b/tools/testing/selftests/bpf/verifier/jset.c
@@ -78,12 +78,11 @@
.insns = {
BPF_MOV64_IMM(BPF_REG_0, 1),
BPF_JMP_IMM(BPF_JSET, BPF_REG_0, 1, 1),
+ /* unpriv: nospec (inserted to prevent "R9 !read_ok") */
BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
BPF_EXIT_INSN(),
},
.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
- .errstr_unpriv = "R9 !read_ok",
- .result_unpriv = REJECT,
.retval = 1,
.result = ACCEPT,
},
@@ -136,13 +135,12 @@
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_prandom_u32),
BPF_ALU64_IMM(BPF_OR, BPF_REG_0, 2),
BPF_JMP_IMM(BPF_JSET, BPF_REG_0, 3, 1),
+ /* unpriv: nospec (inserted to prevent "R9 !read_ok") */
BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
- .errstr_unpriv = "R9 !read_ok",
- .result_unpriv = REJECT,
.result = ACCEPT,
},
{
@@ -154,16 +152,16 @@
BPF_ALU64_IMM(BPF_AND, BPF_REG_1, 0xff),
BPF_JMP_IMM(BPF_JSET, BPF_REG_1, 0xf0, 3),
BPF_JMP_IMM(BPF_JLT, BPF_REG_1, 0x10, 1),
+ /* unpriv: nospec (inserted to prevent "R9 !read_ok") */
BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
BPF_EXIT_INSN(),
BPF_JMP_IMM(BPF_JSET, BPF_REG_1, 0x10, 1),
BPF_EXIT_INSN(),
BPF_JMP_IMM(BPF_JGE, BPF_REG_1, 0x10, 1),
+ /* unpriv: nospec (inserted to prevent "R9 !read_ok") */
BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
BPF_EXIT_INSN(),
},
.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
- .errstr_unpriv = "R9 !read_ok",
- .result_unpriv = REJECT,
.result = ACCEPT,
},
--
2.48.1
Insert a nospec before the access to prevent it from ever using an index
that is subject to speculative scalar-confusion.
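For illustration, a hypothetical unpriv fragment in the style of the
selftests (assuming the access goes through
check_stack_range_initialized(); previously rejected with "R2 variable
offset stack access prohibited for !root"):

	r1 = 0;
	*(u64 *)(r10 - 8) = r1;	/* initialize the slot */
	call %[bpf_get_prandom_u32];
	r0 &= 7;		/* architecturally in [0, 7] */
	r2 = r10;
	r2 += -8;
	r2 += r0;		/* variable offset into the stack */
	/* unpriv: nospec inserted before the access, so it never
	 * executes with a speculatively out-of-bounds offset */
	r3 = *(u8 *)(r2 + 0);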
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
kernel/bpf/verifier.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8093a5bac4d1..683a76aceffa 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7858,6 +7858,11 @@ static int check_atomic(struct bpf_verifier_env *env, struct bpf_insn *insn)
}
}
+static struct bpf_insn_aux_data *cur_aux(const struct bpf_verifier_env *env)
+{
+ return &env->insn_aux_data[env->insn_idx];
+}
+
/* When register 'regno' is used to read the stack (either directly or through
* a helper function) make sure that it's within stack boundary and, depending
* on the access type and privileges, that all elements of the stack are
@@ -7897,18 +7902,18 @@ static int check_stack_range_initialized(
if (tnum_is_const(reg->var_off)) {
min_off = max_off = reg->var_off.value + off;
} else {
- /* Variable offset is prohibited for unprivileged mode for
+ /* Variable offset requires a nospec in unprivileged mode for
* simplicity since it requires corresponding support in
* Spectre masking for stack ALU.
* See also retrieve_ptr_limit().
*/
if (!env->bypass_spec_v1) {
- char tn_buf[48];
-
- tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off);
- verbose(env, "R%d variable offset stack access prohibited for !root, var_off=%s\n",
- regno, tn_buf);
- return -EACCES;
+ /* Allow the access, but insert a nospec before the
+ * dereference op to prevent it from using a
+ * speculative offset.
+ */
+ cur_aux(env)->nospec = true;
+ WARN_ON_ONCE(cur_aux(env)->alu_state);
}
/* Only initialized buffer on stack is allowed to be accessed
* with variable offset. With uninitialized buffer it's hard to
@@ -11096,11 +11101,6 @@ static int check_get_func_ip(struct bpf_verifier_env *env)
return -ENOTSUPP;
}
-static struct bpf_insn_aux_data *cur_aux(const struct bpf_verifier_env *env)
-{
- return &env->insn_aux_data[env->insn_idx];
-}
-
static bool loop_flag_is_zero(struct bpf_verifier_env *env)
{
struct bpf_reg_state *regs = cur_regs(env);
--
2.48.1
The main reason is that this will later allow us to fall back to a
nospec for certain errors in push_stack().
This changes the sanitization case to return -ENOMEM instead of
-EFAULT. This is more fitting, as -EFAULT would indicate a
verifier-internal bug.
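The resulting calling convention at the call sites (as in the hunks
below):

	struct bpf_verifier_state *queued_st;

	queued_st = push_stack(env, insn_idx + 1, insn_idx, false);
	if (IS_ERR(queued_st))
		return PTR_ERR(queued_st);	/* e.g., -ENOMEM */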
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
kernel/bpf/verifier.c | 72 ++++++++++++++++++++++++++-----------------
1 file changed, 43 insertions(+), 29 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 683a76aceffa..610f9567af7c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2011,8 +2011,10 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
int err;
elem = kzalloc(sizeof(struct bpf_verifier_stack_elem), GFP_KERNEL);
- if (!elem)
- goto err;
+ if (!elem) {
+ err = -ENOMEM;
+ goto unrecoverable_err;
+ }
elem->insn_idx = insn_idx;
elem->prev_insn_idx = prev_insn_idx;
@@ -2022,12 +2024,19 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
env->stack_size++;
err = copy_verifier_state(&elem->st, cur);
if (err)
- goto err;
+ goto unrecoverable_err;
elem->st.speculative |= speculative;
if (env->stack_size > BPF_COMPLEXITY_LIMIT_JMP_SEQ) {
verbose(env, "The sequence of %d jumps is too complex.\n",
env->stack_size);
- goto err;
+ /* Do not return -EINVAL to prevent main loop from trying to
+ * mitigate this using nospec if we are on a speculative path.
+ * If it was tried anyway, we would encounter an -ENOMEM (from
+ * which we can not recover) again shortly on the next
+ * non-speculative path that has to be checked.
+ */
+ err = -ENOMEM;
+ goto unrecoverable_err;
}
if (elem->st.parent) {
++elem->st.parent->branches;
@@ -2042,12 +2051,14 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
*/
}
return &elem->st;
-err:
+unrecoverable_err:
free_verifier_state(env->cur_state, true);
env->cur_state = NULL;
/* pop all elements and return */
while (!pop_stack(env, NULL, NULL, false));
- return NULL;
+ WARN_ON_ONCE(err >= 0);
+ WARN_ON_ONCE(error_recoverable_with_nospec(err));
+ return ERR_PTR(err);
}
#define CALLER_SAVED_REGS 6
@@ -8856,8 +8867,8 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
prev_st = find_prev_entry(env, cur_st->parent, insn_idx);
/* branch out active iter state */
queued_st = push_stack(env, insn_idx + 1, insn_idx, false);
- if (!queued_st)
- return -ENOMEM;
+ if (IS_ERR(queued_st))
+ return PTR_ERR(queued_st);
queued_iter = get_iter_from_state(queued_st, meta);
queued_iter->iter.state = BPF_ITER_STATE_ACTIVE;
@@ -10440,8 +10451,8 @@ static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *ins
* proceed with next instruction within current frame.
*/
callback_state = push_stack(env, env->subprog_info[subprog].start, insn_idx, false);
- if (!callback_state)
- return -ENOMEM;
+ if (IS_ERR(callback_state))
+ return PTR_ERR(callback_state);
err = setup_func_entry(env, subprog, insn_idx, set_callee_state_cb,
callback_state);
@@ -13892,7 +13903,7 @@ sanitize_speculative_path(struct bpf_verifier_env *env,
struct bpf_reg_state *regs;
branch = push_stack(env, next_idx, curr_idx, true);
- if (branch && insn) {
+ if (!IS_ERR(branch) && insn) {
regs = branch->frame[branch->curframe]->regs;
if (BPF_SRC(insn->code) == BPF_K) {
mark_reg_unknown(env, regs, insn->dst_reg);
@@ -13920,7 +13931,7 @@ static int sanitize_ptr_alu(struct bpf_verifier_env *env,
u8 opcode = BPF_OP(insn->code);
u32 alu_state, alu_limit;
struct bpf_reg_state tmp;
- bool ret;
+ struct bpf_verifier_state *branch;
int err;
if (can_skip_alu_sanitation(env, insn))
@@ -13993,11 +14004,11 @@ static int sanitize_ptr_alu(struct bpf_verifier_env *env,
tmp = *dst_reg;
copy_register_state(dst_reg, ptr_reg);
}
- ret = sanitize_speculative_path(env, NULL, env->insn_idx + 1,
- env->insn_idx);
- if (!ptr_is_dst_reg && ret)
+ branch = sanitize_speculative_path(env, NULL, env->insn_idx + 1,
+ env->insn_idx);
+ if (!ptr_is_dst_reg && !IS_ERR(branch))
*dst_reg = tmp;
- return !ret ? REASON_STACK : 0;
+ return IS_ERR(branch) ? REASON_STACK : 0;
}
static void sanitize_mark_insn_seen(struct bpf_verifier_env *env)
@@ -16246,8 +16257,8 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
/* branch out 'fallthrough' insn as a new state to explore */
queued_st = push_stack(env, idx + 1, idx, false);
- if (!queued_st)
- return -ENOMEM;
+ if (IS_ERR(queued_st))
+ return PTR_ERR(queued_st);
queued_st->may_goto_depth++;
if (prev_st)
@@ -16311,10 +16322,12 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
* the fall-through branch for simulation under speculative
* execution.
*/
- if (!env->bypass_spec_v1 &&
- !sanitize_speculative_path(env, insn, *insn_idx + 1,
- *insn_idx))
- return -EFAULT;
+ if (!env->bypass_spec_v1) {
+ struct bpf_verifier_state *branch = sanitize_speculative_path(
+ env, insn, *insn_idx + 1, *insn_idx);
+ if (IS_ERR(branch))
+ return PTR_ERR(branch);
+ }
if (env->log.level & BPF_LOG_LEVEL)
print_insn_state(env, this_branch, this_branch->curframe);
*insn_idx += insn->off;
@@ -16324,11 +16337,12 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
* program will go. If needed, push the goto branch for
* simulation under speculative execution.
*/
- if (!env->bypass_spec_v1 &&
- !sanitize_speculative_path(env, insn,
- *insn_idx + insn->off + 1,
- *insn_idx))
- return -EFAULT;
+ if (!env->bypass_spec_v1) {
+ struct bpf_verifier_state *branch = sanitize_speculative_path(
+ env, insn, *insn_idx + insn->off + 1, *insn_idx);
+ if (IS_ERR(branch))
+ return PTR_ERR(branch);
+ }
if (env->log.level & BPF_LOG_LEVEL)
print_insn_state(env, this_branch, this_branch->curframe);
return 0;
@@ -16351,8 +16365,8 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
other_branch = push_stack(env, *insn_idx + insn->off + 1, *insn_idx,
false);
- if (!other_branch)
- return -EFAULT;
+ if (IS_ERR(other_branch))
+ return PTR_ERR(other_branch);
other_branch_regs = other_branch->frame[other_branch->curframe]->regs;
if (BPF_SRC(insn->code) == BPF_X) {
--
2.48.1
On Thu, 2025-03-13 at 18:41 +0100, Luis Gerhorst wrote:
[...]
> @@ -2011,8 +2011,10 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
> int err;
>
> elem = kzalloc(sizeof(struct bpf_verifier_stack_elem), GFP_KERNEL);
> - if (!elem)
> - goto err;
> + if (!elem) {
> + err = -ENOMEM;
> + goto unrecoverable_err;
> + }
Could you please point me to a location where the exact error code
returned by the updated push_stack() matters?
I checked the push_stack() call graph (in the attachment), but can't
find anything.
>
> elem->insn_idx = insn_idx;
> elem->prev_insn_idx = prev_insn_idx;
> @@ -2022,12 +2024,19 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
[...]
Eduard Zingerman <eddyz87@gmail.com> writes:
> Could you please point me to a location where the exact error code
> returned by the updated push_stack() matters?
> I checked the push_stack() call graph (in the attachment), but can't
> find anything.
The error code only matters once the final patch 11 ("bpf: Fall back
to nospec for spec path verification") is applied. Then, the error code
either matches `state->speculative &&
error_recoverable_with_nospec(err)` in do_check() if it was EINVAL (in
this case, we heuristically avoided nested speculative path
verification but have to add a nospec), or we continue to raise the
error (e.g., ENOMEM) from do_check().
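In other words, with patch 11 applied the main loop in do_check()
roughly does:

	err = do_check_insn(env, insn, ...);
	if (state->speculative && error_recoverable_with_nospec(err)) {
		/* EPERM/EACCES/EINVAL on a speculative path: insert a
		 * barrier and terminate the path instead of rejecting
		 * the program.
		 */
		cur_aux(env)->nospec = true;
		goto process_bpf_exit;
	} else if (err < 0) {
		return err;	/* e.g., ENOMEM remains fatal */
	}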
Or is your question on this part from the commit message of patch 9?
This changes the sanitization-case to returning -ENOMEM. However, this
is more fitting as -EFAULT would indicate a verifier-internal bug.
This was referring to the sanitize_speculative_path() calls in
check_cond_jmp_op(). For that case, the error should also only be used
in do_check() with patch 11 applied. However, EFAULT and ENOMEM are
treated the same there (neither satisfies
error_recoverable_with_nospec()), so this change is primarily made to
keep the code simple.
I just became aware that there is some special handling of EFAULT as
discussed in c7a897843224 ("bpf: don't leave partial mangled prog in
jit_subprogs error path"). I will have a look at this in detail to make
sure changing push_stack() from EFAULT to ENOMEM is OK.
Hope this answers your question.
Adding some of these details to v2 won't hurt I guess.