From nobody Wed Sep 10 21:48:03 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A56EC61DA4 for ; Wed, 22 Feb 2023 19:29:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232122AbjBVT3o (ORCPT ); Wed, 22 Feb 2023 14:29:44 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35188 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232348AbjBVT3j (ORCPT ); Wed, 22 Feb 2023 14:29:39 -0500 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DA8792D140 for ; Wed, 22 Feb 2023 11:29:37 -0800 (PST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-536c8bcae3bso77967907b3.2 for ; Wed, 22 Feb 2023 11:29:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=6tasE976sed6Z9eekcv/3y3HqpG7M1LvNFXgzwFh9Rk=; b=MKMHno+wsx9foc5KwhZQMbLnWr5HNiKCWvxRPdknfsHmV/b0GFXdSkeesCbvDHC2Ty N9Sv00E4OIuOitE400fJYQOSPC7O+JijnQIsWQ9mdJsxdJuE2rzP4Q3ydJ+Xv3qOxWV+ 1EP0Wznfhw5mgTlYlx3Ka2xCiE5HJs5sahb9zz6Ru82ntP4xhyk/YRvZX1UH7TWi0is2 nx32DG0HarWrA/ao2c8MwOePvxgFZbHC+z3frPlbHSOFUoWvH5ydA4Bkw853LnwemR/P YN+R5JyGEI5Vf9BcBtU38GGZ6tcPd/s67R+8Jrj+MTy00I6yGi5ZQcJYoMRC0H/rziu+ n3TA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=6tasE976sed6Z9eekcv/3y3HqpG7M1LvNFXgzwFh9Rk=; b=DGguz3KwsMwWm85zLBF1+alh443S/XqmT3FaE13G1CfEfcGO7seF0BBNcuL7uodiVw FFs7dVmDItctYFuB2Wzm0NtVEo1IxnIgytg0YD1rj0/QTJ2uTS79sUp4fHecB2yX3UMb UrGiWx9+zUB4JLNFkXD/iI5TCSbymFuktTPlTeWJOn+YeovN0zr+ezY/miof0FWUXVTk Vhs2BuFqArzHvvwMHhw+CruLCdbfiV84gq7AHQMElAi4VzDlfFQ4Sdl3oTqhknLZTTDE 7sVE5wSvHHytqa1UkQ4eNhjHinRs5pqCUJrtfGm0+MP7QwuvbdaGAPz1sP35F/n22OcY xCow== X-Gm-Message-State: AO0yUKVmgSrdVUgQMY9Sua2QfQAPM8megsDZWQcTd9XIi5XqwG+9XxHn y1rJC6RLOENiCAIc8wi9NBwBoRFENGQ= X-Google-Smtp-Source: AK7set/SJbJM8Dxbbt+8qEYFy5lTpS7wwzsR9FZD6mlxxuDkgAtg7r2wZSAItN2hh36FLwfRJrTiOgkMsA0= X-Received: from edliaw.c.googlers.com ([fda3:e722:ac3:cc00:24:72f4:c0a8:305d]) (user=edliaw job=sendgmr) by 2002:a05:6902:1028:b0:a27:3ecd:6 with SMTP id x8-20020a056902102800b00a273ecd0006mr172380ybt.1.1677094177130; Wed, 22 Feb 2023 11:29:37 -0800 (PST) Date: Wed, 22 Feb 2023 19:29:21 +0000 In-Reply-To: <20230222192925.1778183-1-edliaw@google.com> Mime-Version: 1.0 References: <20230222192925.1778183-1-edliaw@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230222192925.1778183-2-edliaw@google.com> Subject: [PATCH 4.14 v2 1/4] bpf: Do not use ax register in interpreter on div/mod From: Edward Liaw To: stable@vger.kernel.org, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" Cc: bpf@vger.kernel.org, kernel-team@android.com, Edward Liaw , netdev@vger.kernel.org, linux-kernel@vger.kernel.org, John Fastabend , Thadeu Lima de Souza Cascardo Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Daniel Borkmann Partially undo old commit 144cd91c4c2b ("bpf: move tmp variable into ax register in interpreter"). The reason we need this here is because ax register will be used for holding temporary state for div/mod instruction which otherwise interpreter would corrupt. This will cause a small +8 byte stack increase for interpreter, but with the gain that we can use it from verifier rewrites as scratch register. Signed-off-by: Daniel Borkmann Reviewed-by: John Fastabend [cascardo: This partial revert is needed in order to support using AX for the following two commits, as there is no JMP32 on 4.19.y] Signed-off-by: Thadeu Lima de Souza Cascardo [edliaw: Removed redeclaration of tmp] Signed-off-by: Edward Liaw --- kernel/bpf/core.c | 31 ++++++++++++++----------------- 1 file changed, 14 insertions(+), 17 deletions(-) diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 5d649983de07..4ddb846693bb 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -663,9 +663,6 @@ static int bpf_jit_blind_insn(const struct bpf_insn *fr= om, * below. * * Constant blinding is only used by JITs, not in the interpreter. - * The interpreter uses AX in some occasions as a local temporary - * register e.g. in DIV or MOD instructions. - * * In restricted circumstances, the verifier can also use the AX * register for rewrites as long as they do not interfere with * the above cases! @@ -1060,22 +1057,22 @@ static unsigned int ___bpf_prog_run(u64 *regs, cons= t struct bpf_insn *insn, ALU64_MOD_X: if (unlikely(SRC =3D=3D 0)) return 0; - div64_u64_rem(DST, SRC, &AX); - DST =3D AX; + div64_u64_rem(DST, SRC, &tmp); + DST =3D tmp; CONT; ALU_MOD_X: if (unlikely((u32)SRC =3D=3D 0)) return 0; - AX =3D (u32) DST; - DST =3D do_div(AX, (u32) SRC); + tmp =3D (u32) DST; + DST =3D do_div(tmp, (u32) SRC); CONT; ALU64_MOD_K: - div64_u64_rem(DST, IMM, &AX); - DST =3D AX; + div64_u64_rem(DST, IMM, &tmp); + DST =3D tmp; CONT; ALU_MOD_K: - AX =3D (u32) DST; - DST =3D do_div(AX, (u32) IMM); + tmp =3D (u32) DST; + DST =3D do_div(tmp, (u32) IMM); CONT; ALU64_DIV_X: if (unlikely(SRC =3D=3D 0)) @@ -1085,17 +1082,17 @@ static unsigned int ___bpf_prog_run(u64 *regs, cons= t struct bpf_insn *insn, ALU_DIV_X: if (unlikely((u32)SRC =3D=3D 0)) return 0; - AX =3D (u32) DST; - do_div(AX, (u32) SRC); - DST =3D (u32) AX; + tmp =3D (u32) DST; + do_div(tmp, (u32) SRC); + DST =3D (u32) tmp; CONT; ALU64_DIV_K: DST =3D div64_u64(DST, IMM); CONT; ALU_DIV_K: - AX =3D (u32) DST; - do_div(AX, (u32) IMM); - DST =3D (u32) AX; + tmp =3D (u32) DST; + do_div(tmp, (u32) IMM); + DST =3D (u32) tmp; CONT; ALU_END_TO_BE: switch (IMM) { --=20 2.39.2.637.g21b0678d19-goog From nobody Wed Sep 10 21:48:03 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4B28EC64EC7 for ; Wed, 22 Feb 2023 19:30:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232315AbjBVT3s (ORCPT ); Wed, 22 Feb 2023 14:29:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35296 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231820AbjBVT3m (ORCPT ); Wed, 22 Feb 2023 14:29:42 -0500 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 163C22B2AD for ; Wed, 22 Feb 2023 11:29:40 -0800 (PST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-536e8d6d9ceso55372177b3.12 for ; Wed, 22 Feb 2023 11:29:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ET5uK4G9BnYd+mgYhqQ9kutIdmfzaa4o47ZwKb3KpKg=; b=amhiaTMtNJatGLJukmVySR1HZRRjtveJ4zbpPuOmg0XvedYwF0sMIUQhaT1qRMENj0 NhQC5h7QmOJX/mxgfXSr5xJPjAILx9zcTEFlVkOLjjFU0zTNKvzUjfvmUimwIE1zDBd0 kjKGJchFsATjiZYwxKylrVqV/+QBBBOBSj2lf6hx2BU3b/0Nt82ppYe8mycpDdMcxO6O RCSL4ENYrfSKTc5ephkac/ztcCXuUdBs3uMwijtJx1zoqDMwkozzDbNchsf4p84Z3bCl VX/GoZqqwnunYpbU9uPQtVSos7PuXSJroid+a3wLi9gTDrDPI553h384J8vppKNfe0PY FLCQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ET5uK4G9BnYd+mgYhqQ9kutIdmfzaa4o47ZwKb3KpKg=; b=s7XVPZiLE4uzNRLDvE2gWB43DIWzm+NYHGsHYfPYyvZSZlu+y5Hv+7/PTk+XJcbjAl Cm2mXoSXw1yLN4AsQC9bbtDktB7RoIQT54CIQu5zAPT98P44mdtg5eL943zFgkhej9eK NXwO1YoFvr3z4dYYlWtiN+iDyIKqVYlkjtdDOukQrjGJs2bgW1CkaYEDMRyUTZ2Ygsj8 UA9N7HO43mPBY96/JNrYVIlgoR2fe0xSTCv9wXQ5dB6SyckjSSP6mbh35oTXVDoWtoTQ OuZ7tBRg2sMkbq2T2TYT3F6Cx3pfLKYianECltL2tm/nTANWnGyA2Gig+2L2WCiWVje9 +lCA== X-Gm-Message-State: AO0yUKWVuvLFP27i3ZlozLwe1nP7VKxRqGtAU2RGaeQdaXaqd9otZfhs M6GZSskubgL1KDfEaEnczaij2qThl0E= X-Google-Smtp-Source: AK7set/Pa04XjzSQIQpgmKbOq9P2op1nWrFy5MZnIcn7RhtaK/5kiY2YHZG8PWqLvf8NJSncTO0uoME0lXQ= X-Received: from edliaw.c.googlers.com ([fda3:e722:ac3:cc00:24:72f4:c0a8:305d]) (user=edliaw job=sendgmr) by 2002:a05:6902:10c9:b0:855:fdcb:4467 with SMTP id w9-20020a05690210c900b00855fdcb4467mr1578916ybu.0.1677094179755; Wed, 22 Feb 2023 11:29:39 -0800 (PST) Date: Wed, 22 Feb 2023 19:29:22 +0000 In-Reply-To: <20230222192925.1778183-1-edliaw@google.com> Mime-Version: 1.0 References: <20230222192925.1778183-1-edliaw@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230222192925.1778183-3-edliaw@google.com> Subject: [PATCH 4.14 v2 2/4] bpf: fix subprog verifier bypass by div/mod by 0 exception From: Edward Liaw To: stable@vger.kernel.org, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" Cc: bpf@vger.kernel.org, kernel-team@android.com, Edward Liaw , netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Thadeu Lima de Souza Cascardo Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Daniel Borkmann Commit f6b1b3bf0d5f681631a293cfe1ca934b81716f1e upstream. One of the ugly leftovers from the early eBPF days is that div/mod operations based on registers have a hard-coded src_reg =3D=3D 0 test in the interpreter as well as in JIT code generators that would return from the BPF program with exit code 0. This was basically adopted from cBPF interpreter for historical reasons. There are multiple reasons why this is very suboptimal and prone to bugs. To name one: the return code mapping for such abnormal program exit of 0 does not always match with a suitable program type's exit code mapping. For example, '0' in tc means action 'ok' where the packet gets passed further up the stack, which is just undesirable for such cases (e.g. when implementing policy) and also does not match with other program types. While trying to work out an exception handling scheme, I also noticed that programs crafted like the following will currently pass the verifier: 0: (bf) r6 =3D r1 1: (85) call pc+8 caller: R6=3Dctx(id=3D0,off=3D0,imm=3D0) R10=3Dfp0,call_-1 callee: frame1: R1=3Dctx(id=3D0,off=3D0,imm=3D0) R10=3Dfp0,call_1 10: (b4) (u32) r2 =3D (u32) 0 11: (b4) (u32) r3 =3D (u32) 1 12: (3c) (u32) r3 /=3D (u32) r2 13: (61) r0 =3D *(u32 *)(r1 +76) 14: (95) exit returning from callee: frame1: R0_w=3Dpkt(id=3D0,off=3D0,r=3D0,imm=3D0) R1=3Dctx(id=3D0,off=3D0,imm=3D0) R2_w=3Dinv0 R3_w=3Dinv(id=3D0,umax_value=3D4294967295,var_off=3D(0x0; 0xffff= ffff)) R10=3Dfp0,call_1 to caller at 2: R0_w=3Dpkt(id=3D0,off=3D0,r=3D0,imm=3D0) R6=3Dctx(id=3D0,off=3D0,imm=3D0) R10=3Dfp0,call_-1 from 14 to 2: R0=3Dpkt(id=3D0,off=3D0,r=3D0,imm=3D0) R6=3Dctx(id=3D0,off=3D0,imm=3D0) R10=3Dfp0,call_-1 2: (bf) r1 =3D r6 3: (61) r1 =3D *(u32 *)(r1 +80) 4: (bf) r2 =3D r0 5: (07) r2 +=3D 8 6: (2d) if r2 > r1 goto pc+1 R0=3Dpkt(id=3D0,off=3D0,r=3D8,imm=3D0) R1=3Dpkt_end(id=3D0,off=3D0,imm= =3D0) R2=3Dpkt(id=3D0,off=3D8,r=3D8,imm=3D0) R6=3Dctx(id=3D0,off=3D0,imm=3D0) R10=3Dfp0,call_-1 7: (71) r0 =3D *(u8 *)(r0 +0) 8: (b7) r0 =3D 1 9: (95) exit from 6 to 8: safe processed 16 insns (limit 131072), stack depth 0+0 Basically what happens is that in the subprog we make use of a div/mod by 0 exception and in the 'normal' subprog's exit path we just return skb->data back to the main prog. This has the implication that the verifier thinks we always get a pkt pointer in R0 while we still have the implicit 'return 0' from the div as an alternative unconditional return path earlier. Thus, R0 then contains 0, meaning back in the parent prog we get the address range of [0x0, skb->data_end] as read and writeable. Similar can be crafted with other pointer register types. Since i) BPF_ABS/IND is not allowed in programs that contain BPF to BPF calls (and generally it's also disadvised to use in native eBPF context), ii) unknown opcodes don't return zero anymore, iii) we don't return an exception code in dead branches, the only last missing case affected and to fix is the div/mod handling. What we would really need is some infrastructure to propagate exceptions all the way to the original prog unwinding the current stack and returning that code to the caller of the BPF program. In user space such exception handling for similar runtimes is typically implemented with setjmp(3) and longjmp(3) as one possibility which is not available in the kernel, though (kgdb used to implement it in kernel long time ago). I implemented a PoC exception handling mechanism into the BPF interpreter with porting setjmp()/longjmp() into x86_64 and adding a new internal BPF_ABRT opcode that can use a program specific exception code for all exception cases we have (e.g. div/mod by 0, unknown opcodes, etc). While this seems to work in the constrained BPF environment (meaning, here, we don't need to deal with state e.g. from memory allocations that we would need to undo before going into exception state), it still has various drawbacks: i) we would need to implement the setjmp()/longjmp() for every arch supported in the kernel and for x86_64, arm64, sparc64 JITs currently supporting calls, ii) it has unconditional additional cost on main program entry to store CPU register state in initial setjmp() call, and we would need some way to pass the jmp_buf down into ___bpf_prog_run() for main prog and all subprogs, but also storing on stack is not really nice (other option would be per-cpu storage for this, but it also has the drawback that we need to disable preemption for every BPF program types). All in all this approach would add a lot of complexity. Another poor-man's solution would be to have some sort of additional shared register or scratch buffer to hold state for exceptions, and test that after every call return to chain returns and pass R0 all the way down to BPF prog caller. This is also problematic in various ways: i) an additional register doesn't map well into JITs, and some other scratch space could only be on per-cpu storage, which, again has the side-effect that this only works when we disable preemption, or somewhere in the input context which is not available everywhere either, and ii) this adds significant runtime overhead by putting conditionals after each and every call, as well as implementation complexity. Yet another option is to teach verifier that div/mod can return an integer, which however is also complex to implement as verifier would need to walk such fake 'mov r0,; exit;' sequeuence and there would still be no guarantee for having propagation of this further down to the BPF caller as proper exception code. For parent prog, it is also is not distinguishable from a normal return of a constant scalar value. The approach taken here is a completely different one with little complexity and no additional overhead involved in that we make use of the fact that a div/mod by 0 is undefined behavior. Instead of bailing out, we adapt the same behavior as on some major archs like ARMv8 [0] into eBPF as well: X div 0 results in 0, and X mod 0 results in X. aarch64 and aarch32 ISA do not generate any traps or otherwise aborts of program execution for unsigned divides. I verified this also with a test program compiled by gcc and clang, and the behavior matches with the spec. Going forward we adapt the eBPF verifier to emit such rewrites once div/mod by register was seen. cBPF is not touched and will keep existing 'return 0' semantics. Given the options, it seems the most suitable from all of them, also since major archs have similar schemes in place. Given this is all in the realm of undefined behavior, we still have the option to adapt if deemed necessary and this way we would also have the option of more flexibility from LLVM code generation side (which is then fully visible to verifier). Thus, this patch i) fixes the panic seen in above program and ii) doesn't bypass the verifier observations. [0] ARM Architecture Reference Manual, ARMv8 [ARM DDI 0487B.b] http://infocenter.arm.com/help/topic/com.arm.doc.ddi0487b.b/DDI0487B_= b_armv8_arm.pdf 1) aarch64 instruction set: section C3.4.7 and C6.2.279 (UDIV) "A division by zero results in a zero being written to the destination register, without any indication that the division by zero occurred." 2) aarch32 instruction set: section F1.4.8 and F5.1.263 (UDIV) "For the SDIV and UDIV instructions, division by zero always returns a zero result." Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)") Signed-off-by: Daniel Borkmann Acked-by: Alexei Starovoitov Signed-off-by: Alexei Starovoitov Signed-off-by: Thadeu Lima de Souza Cascardo --- kernel/bpf/core.c | 8 -------- kernel/bpf/verifier.c | 38 ++++++++++++++++++++++++++++++-------- net/core/filter.c | 9 ++++++++- 3 files changed, 38 insertions(+), 17 deletions(-) diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 4ddb846693bb..2ca36bb440de 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -1055,14 +1055,10 @@ static unsigned int ___bpf_prog_run(u64 *regs, cons= t struct bpf_insn *insn, (*(s64 *) &DST) >>=3D IMM; CONT; ALU64_MOD_X: - if (unlikely(SRC =3D=3D 0)) - return 0; div64_u64_rem(DST, SRC, &tmp); DST =3D tmp; CONT; ALU_MOD_X: - if (unlikely((u32)SRC =3D=3D 0)) - return 0; tmp =3D (u32) DST; DST =3D do_div(tmp, (u32) SRC); CONT; @@ -1075,13 +1071,9 @@ static unsigned int ___bpf_prog_run(u64 *regs, const= struct bpf_insn *insn, DST =3D do_div(tmp, (u32) IMM); CONT; ALU64_DIV_X: - if (unlikely(SRC =3D=3D 0)) - return 0; DST =3D div64_u64(DST, SRC); CONT; ALU_DIV_X: - if (unlikely((u32)SRC =3D=3D 0)) - return 0; tmp =3D (u32) DST; do_div(tmp, (u32) SRC); DST =3D (u32) tmp; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index e8d9ddd5cb18..836791403129 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4839,15 +4839,37 @@ static int fixup_bpf_calls(struct bpf_verifier_env = *env) struct bpf_insn_aux_data *aux; =20 for (i =3D 0; i < insn_cnt; i++, insn++) { - if (insn->code =3D=3D (BPF_ALU | BPF_MOD | BPF_X) || + if (insn->code =3D=3D (BPF_ALU64 | BPF_MOD | BPF_X) || + insn->code =3D=3D (BPF_ALU64 | BPF_DIV | BPF_X) || + insn->code =3D=3D (BPF_ALU | BPF_MOD | BPF_X) || insn->code =3D=3D (BPF_ALU | BPF_DIV | BPF_X)) { - /* due to JIT bugs clear upper 32-bits of src register - * before div/mod operation - */ - insn_buf[0] =3D BPF_MOV32_REG(insn->src_reg, insn->src_reg); - insn_buf[1] =3D *insn; - cnt =3D 2; - new_prog =3D bpf_patch_insn_data(env, i + delta, insn_buf, cnt); + bool is64 =3D BPF_CLASS(insn->code) =3D=3D BPF_ALU64; + struct bpf_insn mask_and_div[] =3D { + BPF_MOV32_REG(insn->src_reg, insn->src_reg), + /* Rx div 0 -> 0 */ + BPF_JMP_IMM(BPF_JNE, insn->src_reg, 0, 2), + BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg), + BPF_JMP_IMM(BPF_JA, 0, 0, 1), + *insn, + }; + struct bpf_insn mask_and_mod[] =3D { + BPF_MOV32_REG(insn->src_reg, insn->src_reg), + /* Rx mod 0 -> Rx */ + BPF_JMP_IMM(BPF_JEQ, insn->src_reg, 0, 1), + *insn, + }; + struct bpf_insn *patchlet; + + if (insn->code =3D=3D (BPF_ALU64 | BPF_DIV | BPF_X) || + insn->code =3D=3D (BPF_ALU | BPF_DIV | BPF_X)) { + patchlet =3D mask_and_div + (is64 ? 1 : 0); + cnt =3D ARRAY_SIZE(mask_and_div) - (is64 ? 1 : 0); + } else { + patchlet =3D mask_and_mod + (is64 ? 1 : 0); + cnt =3D ARRAY_SIZE(mask_and_mod) - (is64 ? 1 : 0); + } + + new_prog =3D bpf_patch_insn_data(env, i + delta, patchlet, cnt); if (!new_prog) return -ENOMEM; =20 diff --git a/net/core/filter.c b/net/core/filter.c index 29d85a20f4fc..fc665885d57c 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -458,8 +458,15 @@ static int bpf_convert_filter(struct sock_filter *prog= , int len, break; =20 if (fp->code =3D=3D (BPF_ALU | BPF_DIV | BPF_X) || - fp->code =3D=3D (BPF_ALU | BPF_MOD | BPF_X)) + fp->code =3D=3D (BPF_ALU | BPF_MOD | BPF_X)) { *insn++ =3D BPF_MOV32_REG(BPF_REG_X, BPF_REG_X); + /* Error with exception code on div/mod by 0. + * For cBPF programs, this was always return 0. + */ + *insn++ =3D BPF_JMP_IMM(BPF_JNE, BPF_REG_X, 0, 2); + *insn++ =3D BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A); + *insn++ =3D BPF_EXIT_INSN(); + } =20 *insn =3D BPF_RAW_INSN(fp->code, BPF_REG_A, BPF_REG_X, 0, fp->k); break; --=20 2.39.2.637.g21b0678d19-goog From nobody Wed Sep 10 21:48:03 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 39E90C636D6 for ; Wed, 22 Feb 2023 19:30:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232762AbjBVTaJ (ORCPT ); Wed, 22 Feb 2023 14:30:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35448 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232499AbjBVT3q (ORCPT ); Wed, 22 Feb 2023 14:29:46 -0500 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9DCF82E810 for ; Wed, 22 Feb 2023 11:29:42 -0800 (PST) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-536bf649e70so84648197b3.0 for ; Wed, 22 Feb 2023 11:29:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=dgCd2Bq61u3wl6Yp7z0XuSdAv+g3K86zE1oWtNCB1AY=; b=NkB1j9GeQtSiQfQxNB9XJOu1e98NtprLGhOfr0QPGUq0JRgeKWxChZAM+Wk02FEJvk ec34vKMBQi0MmmPHrXcwZEJr9+ngC86YNWtmQsFsWEjXrdjbxgIP7Dw4CVLH+MZJHSAe RSPYNYY1WKq+7YO3Pv3SKQUiGxK7Z+ktbK13Yx15aYg8LH21JYBLOjSosHx7GCK9Yp9d i3FBlaiFOPuqCDD7oXf5Sa36n4ztlTmZ8ETwSkZhTpKX92GDXVJnwGxKYPos3ud/K4O8 Ox1WV0FEKBwYSYKVBQ8nynGhHe1BUz7AGRG1QGWbiUr2PVFYyQQ0V+W4btos0/ulrRfu 2rVQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=dgCd2Bq61u3wl6Yp7z0XuSdAv+g3K86zE1oWtNCB1AY=; b=ik76374XaTIPCC5ToA5adh3UKYEjsnZ1kCFLma7yWvwpZbG9plXWReNP7RdWpBP564 E7UHE290zgJtdyP5rsBW9DgIAs63wCuyLrizOQwKMxus3Z+C4X7L0BZHKNfWt2LyUaBB zDqtDXvQAUVGSQzofc5XLXvxE0WCMq/WTqyEtjJyylXuXBCGaGSO/YvBsrIYXyYUeiif J1DtSMEteiwcXK1ETV+pQEIIpm5LbQfCzXf6oph8rbc5XZw1bDUFatvwLJFMZkVpT2U4 MdEz5Cn5IHdp+ugFDHU0pkfVHgAHUbbj9P+YKUZzZnPRctGXAUHLaL+XRha/0oWGgxyS 8xXA== X-Gm-Message-State: AO0yUKVt+7dqrZ5o+/9gmDZUr5zyWjHjkPAHQni8Xezb0x+nvQ78ZWNE ZYAWlcKDDmEDnOjUz73gijZL+2WdXyY= X-Google-Smtp-Source: AK7set+uDJRa/2xaJdjVO33ZNktOZvNh3okiFPhtwrQLqmeO3QO7pZdn9kniqELhOHVlgAV25qc7NdNAuzc= X-Received: from edliaw.c.googlers.com ([fda3:e722:ac3:cc00:24:72f4:c0a8:305d]) (user=edliaw job=sendgmr) by 2002:a05:6902:50d:b0:8cc:1051:2f2f with SMTP id x13-20020a056902050d00b008cc10512f2fmr892239ybs.12.1677094181876; Wed, 22 Feb 2023 11:29:41 -0800 (PST) Date: Wed, 22 Feb 2023 19:29:23 +0000 In-Reply-To: <20230222192925.1778183-1-edliaw@google.com> Mime-Version: 1.0 References: <20230222192925.1778183-1-edliaw@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230222192925.1778183-4-edliaw@google.com> Subject: [PATCH 4.14 v2 3/4] bpf: Fix 32 bit src register truncation on div/mod From: Edward Liaw To: stable@vger.kernel.org, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" Cc: bpf@vger.kernel.org, kernel-team@android.com, Edward Liaw , netdev@vger.kernel.org, linux-kernel@vger.kernel.org, John Fastabend , Salvatore Bonaccorso , Thadeu Lima de Souza Cascardo Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Daniel Borkmann Commit e88b2c6e5a4d9ce30d75391e4d950da74bb2bd90 upstream. While reviewing a different fix, John and I noticed an oddity in one of the BPF program dumps that stood out, for example: # bpftool p d x i 13 0: (b7) r0 =3D 808464450 1: (b4) w4 =3D 808464432 2: (bc) w0 =3D w0 3: (15) if r0 =3D=3D 0x0 goto pc+1 4: (9c) w4 %=3D w0 [...] In line 2 we noticed that the mov32 would 32 bit truncate the original src register for the div/mod operation. While for the two operations the dst register is typically marked unknown e.g. from adjust_scalar_min_max_vals() the src register is not, and thus verifier keeps tracking original bounds, simplified: 0: R1=3Dctx(id=3D0,off=3D0,imm=3D0) R10=3Dfp0 0: (b7) r0 =3D -1 1: R0_w=3DinvP-1 R1=3Dctx(id=3D0,off=3D0,imm=3D0) R10=3Dfp0 1: (b7) r1 =3D -1 2: R0_w=3DinvP-1 R1_w=3DinvP-1 R10=3Dfp0 2: (3c) w0 /=3D w1 3: R0_w=3DinvP(id=3D0,umax_value=3D4294967295,var_off=3D(0x0; 0xffffffff)= ) R1_w=3DinvP-1 R10=3Dfp0 3: (77) r1 >>=3D 32 4: R0_w=3DinvP(id=3D0,umax_value=3D4294967295,var_off=3D(0x0; 0xffffffff)= ) R1_w=3DinvP4294967295 R10=3Dfp0 4: (bf) r0 =3D r1 5: R0_w=3DinvP4294967295 R1_w=3DinvP4294967295 R10=3Dfp0 5: (95) exit processed 6 insns (limit 1000000) max_states_per_insn 0 total_states 0 pe= ak_states 0 mark_read 0 Runtime result of r0 at exit is 0 instead of expected -1. Remove the verifier mov32 src rewrite in div/mod and replace it with a jmp32 test instead. After the fix, we result in the following code generation when having dividend r1 and divisor r6: div, 64 bit: div, 32 bit: 0: (b7) r6 =3D 8 0: (b7) r6 =3D 8 1: (b7) r1 =3D 8 1: (b7) r1 =3D 8 2: (55) if r6 !=3D 0x0 goto pc+2 2: (56) if w6 !=3D 0x0 goto p= c+2 3: (ac) w1 ^=3D w1 3: (ac) w1 ^=3D w1 4: (05) goto pc+1 4: (05) goto pc+1 5: (3f) r1 /=3D r6 5: (3c) w1 /=3D w6 6: (b7) r0 =3D 0 6: (b7) r0 =3D 0 7: (95) exit 7: (95) exit mod, 64 bit: mod, 32 bit: 0: (b7) r6 =3D 8 0: (b7) r6 =3D 8 1: (b7) r1 =3D 8 1: (b7) r1 =3D 8 2: (15) if r6 =3D=3D 0x0 goto pc+1 2: (16) if w6 =3D=3D 0x0 go= to pc+1 3: (9f) r1 %=3D r6 3: (9c) w1 %=3D w6 4: (b7) r0 =3D 0 4: (b7) r0 =3D 0 5: (95) exit 5: (95) exit x86 in particular can throw a 'divide error' exception for div instruction not only for divisor being zero, but also for the case when the quotient is too large for the designated register. For the edx:eax and rdx:rax dividend pair it is not an issue in x86 BPF JIT since we always zero edx (rdx). Hence really the only protection needed is against divisor being zero. [Salvatore Bonaccorso: This is an earlier version of the patch provided by Daniel Borkmann which does not rely on availability of the BPF_JMP32 instruction class. This means it is not even strictly a backport of the upstream commit mentioned but based on Daniel's and John's work to address the issue.] Fixes: 68fda450a7df ("bpf: fix 32-bit divide by zero") Co-developed-by: John Fastabend Signed-off-by: John Fastabend Signed-off-by: Daniel Borkmann Tested-by: Salvatore Bonaccorso Signed-off-by: Thadeu Lima de Souza Cascardo --- include/linux/filter.h | 24 ++++++++++++++++++++++++ kernel/bpf/verifier.c | 22 +++++++++++----------- 2 files changed, 35 insertions(+), 11 deletions(-) diff --git a/include/linux/filter.h b/include/linux/filter.h index 7c0e616362f0..55a639609070 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -67,6 +67,14 @@ struct bpf_prog_aux; =20 /* ALU ops on registers, bpf_add|sub|...: dst_reg +=3D src_reg */ =20 +#define BPF_ALU_REG(CLASS, OP, DST, SRC) \ + ((struct bpf_insn) { \ + .code =3D CLASS | BPF_OP(OP) | BPF_X, \ + .dst_reg =3D DST, \ + .src_reg =3D SRC, \ + .off =3D 0, \ + .imm =3D 0 }) + #define BPF_ALU64_REG(OP, DST, SRC) \ ((struct bpf_insn) { \ .code =3D BPF_ALU64 | BPF_OP(OP) | BPF_X, \ @@ -113,6 +121,14 @@ struct bpf_prog_aux; =20 /* Short form of mov, dst_reg =3D src_reg */ =20 +#define BPF_MOV_REG(CLASS, DST, SRC) \ + ((struct bpf_insn) { \ + .code =3D CLASS | BPF_MOV | BPF_X, \ + .dst_reg =3D DST, \ + .src_reg =3D SRC, \ + .off =3D 0, \ + .imm =3D 0 }) + #define BPF_MOV64_REG(DST, SRC) \ ((struct bpf_insn) { \ .code =3D BPF_ALU64 | BPF_MOV | BPF_X, \ @@ -147,6 +163,14 @@ struct bpf_prog_aux; .off =3D 0, \ .imm =3D IMM }) =20 +#define BPF_RAW_REG(insn, DST, SRC) \ + ((struct bpf_insn) { \ + .code =3D (insn).code, \ + .dst_reg =3D DST, \ + .src_reg =3D SRC, \ + .off =3D (insn).off, \ + .imm =3D (insn).imm }) + /* BPF_LD_IMM64 macro encodes single 'load 64-bit immediate' insn */ #define BPF_LD_IMM64(DST, IMM) \ BPF_LD_IMM64_RAW(DST, 0, IMM) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 836791403129..9f04d413df92 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4845,28 +4845,28 @@ static int fixup_bpf_calls(struct bpf_verifier_env = *env) insn->code =3D=3D (BPF_ALU | BPF_DIV | BPF_X)) { bool is64 =3D BPF_CLASS(insn->code) =3D=3D BPF_ALU64; struct bpf_insn mask_and_div[] =3D { - BPF_MOV32_REG(insn->src_reg, insn->src_reg), + BPF_MOV_REG(BPF_CLASS(insn->code), BPF_REG_AX, insn->src_reg), /* Rx div 0 -> 0 */ - BPF_JMP_IMM(BPF_JNE, insn->src_reg, 0, 2), - BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_AX, 0, 2), + BPF_RAW_REG(*insn, insn->dst_reg, BPF_REG_AX), BPF_JMP_IMM(BPF_JA, 0, 0, 1), - *insn, + BPF_ALU_REG(BPF_CLASS(insn->code), BPF_XOR, insn->dst_reg, insn->dst_r= eg), }; struct bpf_insn mask_and_mod[] =3D { - BPF_MOV32_REG(insn->src_reg, insn->src_reg), + BPF_MOV_REG(BPF_CLASS(insn->code), BPF_REG_AX, insn->src_reg), /* Rx mod 0 -> Rx */ - BPF_JMP_IMM(BPF_JEQ, insn->src_reg, 0, 1), - *insn, + BPF_JMP_IMM(BPF_JEQ, BPF_REG_AX, 0, 1), + BPF_RAW_REG(*insn, insn->dst_reg, BPF_REG_AX), }; struct bpf_insn *patchlet; =20 if (insn->code =3D=3D (BPF_ALU64 | BPF_DIV | BPF_X) || insn->code =3D=3D (BPF_ALU | BPF_DIV | BPF_X)) { - patchlet =3D mask_and_div + (is64 ? 1 : 0); - cnt =3D ARRAY_SIZE(mask_and_div) - (is64 ? 1 : 0); + patchlet =3D mask_and_div; + cnt =3D ARRAY_SIZE(mask_and_div); } else { - patchlet =3D mask_and_mod + (is64 ? 1 : 0); - cnt =3D ARRAY_SIZE(mask_and_mod) - (is64 ? 1 : 0); + patchlet =3D mask_and_mod; + cnt =3D ARRAY_SIZE(mask_and_mod); } =20 new_prog =3D bpf_patch_insn_data(env, i + delta, patchlet, cnt); --=20 2.39.2.637.g21b0678d19-goog From nobody Wed Sep 10 21:48:03 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B5CBDC636D6 for ; Wed, 22 Feb 2023 19:30:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231479AbjBVTaF (ORCPT ); Wed, 22 Feb 2023 14:30:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35496 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231236AbjBVT3q (ORCPT ); Wed, 22 Feb 2023 14:29:46 -0500 Received: from mail-pf1-x449.google.com (mail-pf1-x449.google.com [IPv6:2607:f8b0:4864:20::449]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4E9C732513 for ; Wed, 22 Feb 2023 11:29:44 -0800 (PST) Received: by mail-pf1-x449.google.com with SMTP id bw25-20020a056a00409900b005a9d0e66a7aso4976559pfb.5 for ; Wed, 22 Feb 2023 11:29:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=CBY1EQ8rgvYwbAE0sNfh/NLDoFtyGrAczu8biZVyLX0=; b=o3icHyjxwvdHAhPRUh6Mt6hvF5TRYma19mD+TxcsNWR4dcwvzykuLrLgvdV+PDLtlF cdZvsk7or8RrIsU067FeiJHYoz+MPT+JgTTv0DBnRbYGcxzIwktwKnMLlU3p6gt6+w4i ids2WRyltylT6UatxWBrSQgsF5ZnyBuaHlLUMx9hJo5AXRIVF3HN2y7GZmBuKJv63Iha f/huLn96P2Rk1WA2a4IwinWnGiJJogk5K83YwSdyxlr2FWjfv6tZ657eAnaoBWAEcKHc g/qOWXwx5Qo25UH0yfAfr1AdI+cxaQHLySxnQAtq75ko8f0mKMqAWb1N3s+dJ5RVBYXZ VSlw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=CBY1EQ8rgvYwbAE0sNfh/NLDoFtyGrAczu8biZVyLX0=; b=V0dNND1oD/CbenQBKYVM3TM/pweqoFZIYmrEQwuMJ1I6UQgDdB/HsQYnIUTBkFEqJW NMmS/U7P8pDY9gaR0ZkpTP8USF/JSwDgR8Ggw/3Jsd6FX5PRHq8jEeCLJEbW7jgpTCSt TA75XANyKcjRWUc/wf5w9jK5ckGqLkqPtKezUjO3ADO6FK7KLmeiTqARx1rv/P90ehTa bRtmrj4iniyskzrOVAvvrPTEX5Q9LgRYXWT30rQ7UpynqPTcNuDqiCPrEzGdGV7UbyO7 DmRRtT2tuz4ftaQXgCCdSnUp4vuoR6nu6wiFTL3adelDARL5hW17VtiWjiLx/UYN8F5l Vf7Q== X-Gm-Message-State: AO0yUKUEzyIQf6WAyfRVeA3Tu5gPZOcogx0reqbEPFe/wUU1eTn/oZqQ shEZDfhr4XsNvAuMHfe3Wiz0tJGI6XU= X-Google-Smtp-Source: AK7set8IjtejlcQuK1gmqMMMCxx3rSvp07+jvv64f31bXlPR3CPC0jh/v1XrF0KwPfYEha3WiyIHCV9t+9A= X-Received: from edliaw.c.googlers.com ([fda3:e722:ac3:cc00:24:72f4:c0a8:305d]) (user=edliaw job=sendgmr) by 2002:a62:a110:0:b0:5a8:c0ee:876f with SMTP id b16-20020a62a110000000b005a8c0ee876fmr1332747pff.3.1677094183762; Wed, 22 Feb 2023 11:29:43 -0800 (PST) Date: Wed, 22 Feb 2023 19:29:24 +0000 In-Reply-To: <20230222192925.1778183-1-edliaw@google.com> Mime-Version: 1.0 References: <20230222192925.1778183-1-edliaw@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230222192925.1778183-5-edliaw@google.com> Subject: [PATCH 4.14 v2 4/4] bpf: Fix truncation handling for mod32 dst reg wrt zero From: Edward Liaw To: stable@vger.kernel.org, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" Cc: bpf@vger.kernel.org, kernel-team@android.com, Edward Liaw , netdev@vger.kernel.org, linux-kernel@vger.kernel.org, John Fastabend , Salvatore Bonaccorso , Thadeu Lima de Souza Cascardo Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Daniel Borkmann Commit 9b00f1b78809309163dda2d044d9e94a3c0248a3 upstream. Recently noticed that when mod32 with a known src reg of 0 is performed, then the dst register is 32-bit truncated in verifier: 0: R1=3Dctx(id=3D0,off=3D0,imm=3D0) R10=3Dfp0 0: (b7) r0 =3D 0 1: R0_w=3Dinv0 R1=3Dctx(id=3D0,off=3D0,imm=3D0) R10=3Dfp0 1: (b7) r1 =3D -1 2: R0_w=3Dinv0 R1_w=3Dinv-1 R10=3Dfp0 2: (b4) w2 =3D -1 3: R0_w=3Dinv0 R1_w=3Dinv-1 R2_w=3Dinv4294967295 R10=3Dfp0 3: (9c) w1 %=3D w0 4: R0_w=3Dinv0 R1_w=3Dinv(id=3D0,umax_value=3D4294967295,var_off=3D(0x0; = 0xffffffff)) R2_w=3Dinv4294967295 R10=3Dfp0 4: (b7) r0 =3D 1 5: R0_w=3Dinv1 R1_w=3Dinv(id=3D0,umax_value=3D4294967295,var_off=3D(0x0; = 0xffffffff)) R2_w=3Dinv4294967295 R10=3Dfp0 5: (1d) if r1 =3D=3D r2 goto pc+1 R0_w=3Dinv1 R1_w=3Dinv(id=3D0,umax_value=3D4294967295,var_off=3D(0x0; 0x= ffffffff)) R2_w=3Dinv4294967295 R10=3Dfp0 6: R0_w=3Dinv1 R1_w=3Dinv(id=3D0,umax_value=3D4294967295,var_off=3D(0x0; = 0xffffffff)) R2_w=3Dinv4294967295 R10=3Dfp0 6: (b7) r0 =3D 2 7: R0_w=3Dinv2 R1_w=3Dinv(id=3D0,umax_value=3D4294967295,var_off=3D(0x0; = 0xffffffff)) R2_w=3Dinv4294967295 R10=3Dfp0 7: (95) exit 7: R0=3Dinv1 R1=3Dinv(id=3D0,umin_value=3D4294967295,umax_value=3D4294967= 295,var_off=3D(0x0; 0xffffffff)) R2=3Dinv4294967295 R10=3Dfp0 7: (95) exit However, as a runtime result, we get 2 instead of 1, meaning the dst register does not contain (u32)-1 in this case. The reason is fairly straight forward given the 0 test leaves the dst register as-is: # ./bpftool p d x i 23 0: (b7) r0 =3D 0 1: (b7) r1 =3D -1 2: (b4) w2 =3D -1 3: (16) if w0 =3D=3D 0x0 goto pc+1 4: (9c) w1 %=3D w0 5: (b7) r0 =3D 1 6: (1d) if r1 =3D=3D r2 goto pc+1 7: (b7) r0 =3D 2 8: (95) exit This was originally not an issue given the dst register was marked as completely unknown (aka 64 bit unknown). However, after 468f6eafa6c4 ("bpf: fix 32-bit ALU op verification") the verifier casts the register output to 32 bit, and hence it becomes 32 bit unknown. Note that for the case where the src register is unknown, the dst register is marked 64 bit unknown. After the fix, the register is truncated by the runtime and the test passes: # ./bpftool p d x i 23 0: (b7) r0 =3D 0 1: (b7) r1 =3D -1 2: (b4) w2 =3D -1 3: (16) if w0 =3D=3D 0x0 goto pc+2 4: (9c) w1 %=3D w0 5: (05) goto pc+1 6: (bc) w1 =3D w1 7: (b7) r0 =3D 1 8: (1d) if r1 =3D=3D r2 goto pc+1 9: (b7) r0 =3D 2 10: (95) exit Semantics also match with {R,W}x mod{64,32} 0 -> {R,W}x. Invalid div has always been {R,W}x div{64,32} 0 -> 0. Rewrites are as follows: mod32: mod64: (16) if w0 =3D=3D 0x0 goto pc+2 (15) if r0 =3D=3D 0x0 goto pc+1 (9c) w1 %=3D w0 (9f) r1 %=3D r0 (05) goto pc+1 (bc) w1 =3D w1 [Salvatore Bonaccorso: This is an earlier version based on work by Daniel and John which does not rely on availability of the BPF_JMP32 instruction class. This means it is not even strictly a backport of the upstream commit mentioned but based on Daniel's and John's work to address the issue and was finalized by Thadeu Lima de Souza Cascardo.] Fixes: 468f6eafa6c4 ("bpf: fix 32-bit ALU op verification") Signed-off-by: Daniel Borkmann Reviewed-by: John Fastabend Tested-by: Salvatore Bonaccorso Signed-off-by: Thadeu Lima de Souza Cascardo --- kernel/bpf/verifier.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 9f04d413df92..a55e264cdb54 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4846,7 +4846,7 @@ static int fixup_bpf_calls(struct bpf_verifier_env *e= nv) bool is64 =3D BPF_CLASS(insn->code) =3D=3D BPF_ALU64; struct bpf_insn mask_and_div[] =3D { BPF_MOV_REG(BPF_CLASS(insn->code), BPF_REG_AX, insn->src_reg), - /* Rx div 0 -> 0 */ + /* [R,W]x div 0 -> 0 */ BPF_JMP_IMM(BPF_JEQ, BPF_REG_AX, 0, 2), BPF_RAW_REG(*insn, insn->dst_reg, BPF_REG_AX), BPF_JMP_IMM(BPF_JA, 0, 0, 1), @@ -4854,9 +4854,10 @@ static int fixup_bpf_calls(struct bpf_verifier_env *= env) }; struct bpf_insn mask_and_mod[] =3D { BPF_MOV_REG(BPF_CLASS(insn->code), BPF_REG_AX, insn->src_reg), - /* Rx mod 0 -> Rx */ - BPF_JMP_IMM(BPF_JEQ, BPF_REG_AX, 0, 1), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_AX, 0, 1 + (is64 ? 0 : 1)), BPF_RAW_REG(*insn, insn->dst_reg, BPF_REG_AX), + BPF_JMP_IMM(BPF_JA, 0, 0, 1), + BPF_MOV32_REG(insn->dst_reg, insn->dst_reg), }; struct bpf_insn *patchlet; =20 @@ -4866,7 +4867,7 @@ static int fixup_bpf_calls(struct bpf_verifier_env *e= nv) cnt =3D ARRAY_SIZE(mask_and_div); } else { patchlet =3D mask_and_mod; - cnt =3D ARRAY_SIZE(mask_and_mod); + cnt =3D ARRAY_SIZE(mask_and_mod) - (is64 ? 2 : 0); } =20 new_prog =3D bpf_patch_insn_data(env, i + delta, patchlet, cnt); --=20 2.39.2.637.g21b0678d19-goog