From nobody Fri Dec 19 17:34:00 2025 Received: from mail-wr1-f42.google.com (mail-wr1-f42.google.com [209.85.221.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F28C332E739 for ; Thu, 6 Nov 2025 12:53:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.42 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762433622; cv=none; b=Q70IbWLgKo/PX61n3qsJFkFQC3EYGoRoXBNR/2vNqwCcGVvZdywsrehstu5LeHzirScpnaZdCVuQ3HqpWdKZ5AGoLSNXC0p+nM/0Yj/nyWMh1StjSa1Ll2rX/NicTvK3AHLrN6wEuG1sETWw6hS0rx65nrmydkbDCU/nGlfb+uI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762433622; c=relaxed/simple; bh=EnSIiRq6VHyzg36EE4lisdtnJkdNAd019Xn1/RbfzCI=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=MXoFC2OsHFZG7D6Qj+M18/oO/GQdHQv7SKaQvfKlozNLcpchONK4aJh6pn8IkwuzZuM5C3fOH4c2Bj11xfKxmpXDat9n7RXPC8BLG7JTLbG4d9tpEuT4YNTM7TFYfyVS/EXG2xfFjvBEoR+x12vEXnfUMkIEiJZsxedUdXQxHB4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=YaojCmQB; arc=none smtp.client-ip=209.85.221.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="YaojCmQB" Received: by mail-wr1-f42.google.com with SMTP id ffacd0b85a97d-429eb7fafc7so651274f8f.2 for ; Thu, 06 Nov 2025 04:53:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1762433618; x=1763038418; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=J5UtbFN3SeUqWauxQUFhPcEKoaIiq/3X5SZjNQXVMzY=; b=YaojCmQB0Fg3OgZso8vv+nvyabzPAmSaWeZXdFQoQpHXACpMIxuVDBFMyyDfYFN8in fVTRbo6/Acqj30Qio+HFABgeffUKSQzl5qvVdpRq8h4o7a3guo3QAw6RxWMPuY84wAuK RWQaAHBdnDSkdsvs9N7ejWKx2coqRHezQkK3w4+2KESWJ7nOiZcAADXORDo8lqFCeQxa DhQg2n3H3ERtmz6niWqEKNmCGUUHfFc3yUk63v3hfDbVOne8cvjRWV3KEELgRJ2Min+h 4ZDJU7sAsUJnc6VIXw8R4IKVdcFUf7c0VP47gN3RFpcJwqUkShrm8jcVeGL8PxVPIt1m 1EoA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1762433618; x=1763038418; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=J5UtbFN3SeUqWauxQUFhPcEKoaIiq/3X5SZjNQXVMzY=; b=mqxpYxZLo6rlbXxMFp5+KTVZEB/drqBawdQku8+syMrNQP7qd46QrEq38WAoEP5VIP S0mY+bexts+oFFzHsMYMudwA8TXLeGx9kCr9zeCZ5Zn1BDdsK0Wv+vAkq7yASUohqh/N UVBuwhg53bU14GeIimXBlCRbaLMZA0wndNZ2prUx5OyqgED2G3UTTXmT3+qVTUyI7Ilc Pt7kfI/Vb2u6Hm0Gd7GuldjIpRtElrNw2f6Kwa5bIX+RPxe6CeToRThWL/UGTj9aHH+P AVd4mJJqkKU82iUpGf8tZjk4+wXE5LQ/2xZZhjhR6384r4pIl9SczmcYHuOUYqhK5/eO 3XeQ== X-Forwarded-Encrypted: i=1; AJvYcCXZgMFObs8vTi6bwncphNbZAXCfxrHbbP7P5KP7P22Cx7SmFLzpHsRaDrz7lV5LNu+2xRc6RYY3N2uICP8=@vger.kernel.org X-Gm-Message-State: AOJu0Yzq0vbEQ46ck3kl5irIABNqlMRYSSLK9vZHi2kXqRIjxxekjVB0 pvgq3u0BWQ2ZcYl//Ux0Avr6jQQdPpPd8zu7RzDCQJjLZXx5R9jX0to= X-Gm-Gg: ASbGncv4StgXl9LqD6CtORsGPOJsPF4kk7Muu07DD9HIAB+D9ZmnUx5GT91jvavJCJS x3DAMYY+OeznZNJCgA0rF7JuqZayrxDmuqe3EhAGvukmUsXogAn3w3D7hjZIKpeUTTKAsp438Uc jpYPwUyifyRG/VT6FY8vxHwGSAjMbuf2CNPr8t7GMSWdtrAM4eTGshG2NaqFApRFxNaQlrg6h7w vMh98HGRjdUPkOK3u7M8X8JZUprURtbr7irJbWAsCVUACjFIZTAYJNKr8xP8BR1qTeGZzsjCJMx 3+RecAA2VhEXcOMPSxbeCEyIP3SAy7/O7oUXa0c4VlinaYaP7rli0jho7YyBOBQm9/wPXH7lon7 dnr3/oiYl606tZMDddNBa4oP+OHyDkYG1Wik/GbMdeYCgQQnOtrTtUmMqv40MVwy0H0/qbB33b+ sTJUIVFykTcvVXSmjIgKtgxisjwbOhZb601unDtyexpQXs9s3Rj/+0i+g= X-Google-Smtp-Source: AGHT+IETLShyKsHT25JdeNK9Sz1/yzYm3e12YwX3zT1yZFekr4gNRNP0A8afPQDiZ+0+kfpa/ZLupA== X-Received: by 2002:adf:9d91:0:b0:429:ec94:67ec with SMTP id ffacd0b85a97d-429ec9469cbmr1719997f8f.1.1762433618072; Thu, 06 Nov 2025 04:53:38 -0800 (PST) Received: from ast-epyc5.inf.ethz.ch (ast-epyc5.inf.ethz.ch. [129.132.161.180]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-429eb40379esm4788856f8f.9.2025.11.06.04.53.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Nov 2025 04:53:37 -0800 (PST) From: Hao Sun X-Google-Original-From: Hao Sun To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, eddyz87@gmail.com, john.fastabend@gmail.com, martin.lau@linux.dev, song@kernel.org, yonghong.song@linux.dev, linux-kernel@vger.kernel.org, sunhao.th@gmail.com, Hao Sun Subject: [PATCH RFC 04/17] bpf: Add top-level workflow of bcf_refine() Date: Thu, 6 Nov 2025 13:52:42 +0100 Message-Id: <20251106125255.1969938-5-hao.sun@inf.ethz.ch> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251106125255.1969938-1-hao.sun@inf.ethz.ch> References: <20251106125255.1969938-1-hao.sun@inf.ethz.ch> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add the top-level refinement hook `bcf_refine()`: - Add `struct bcf_refine_state` to `struct bpf_verifier_env`: stores the li= st of parent states forming the relevant path suffix (`parents`, `vstate_cnt= `), and traversal cursors (`cur_vstate`, `cur_jmp_entry`). A boolean `availab= le` marks whether BCF can be used in this verification. - Implement `backtrack_states()`: walking parents with `backtrack_insn()` a= nd recorded jump history and find a base state to start the symbolic tracking. - Add a stub `bcf_track()` () and `bcf_refine()` routine that: (a) derives default `reg_masks` when not provided by selecting interesting regs, (b) backtracks parents, (c) runs `bcf_track()` and a refinement callback, and (d) marks the aux flag to request a proof when a condition is produce= d. The actual symbolic tracking, condition build and concrete refinements appear in subsequent patches. Signed-off-by: Hao Sun --- include/linux/bpf.h | 1 + include/linux/bpf_verifier.h | 13 +++ kernel/bpf/verifier.c | 156 +++++++++++++++++++++++++++++++++++ 3 files changed, 170 insertions(+) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index a47d67db3be5..690b0b2b84ba 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1656,6 +1656,7 @@ struct bpf_prog_aux { bool changes_pkt_data; bool might_sleep; bool kprobe_write_ctx; + bool bcf_requested; u64 prog_array_member_cnt; /* counts how many times as member of prog_arr= ay */ struct mutex ext_mutex; /* mutex for is_extended and prog_array_member_cn= t */ struct bpf_arena *arena; diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index c6eb68b6389c..090430168523 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -732,6 +732,18 @@ struct bpf_scc_info { =20 struct bpf_liveness; =20 +struct bcf_refine_state { + /* The state list that decides the path suffix, on which bcf_track() + * collects symbolic information for target registers. + */ + struct bpf_verifier_state **parents; + u32 vstate_cnt; + u32 cur_vstate; + u32 cur_jmp_entry; + + bool available; /* if bcf_buf is provided. */ +}; + /* single container for all structs * one verifier_env per bpf_check() call */ @@ -838,6 +850,7 @@ struct bpf_verifier_env { struct bpf_scc_info **scc_info; u32 scc_cnt; struct bpf_iarray *succ; + struct bcf_refine_state bcf; }; =20 static inline struct bpf_func_info_aux *subprog_aux(struct bpf_verifier_en= v *env, int subprog) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index e4928846e763..7125f7434e6f 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -608,6 +608,23 @@ static bool is_atomic_load_insn(const struct bpf_insn = *insn) insn->imm =3D=3D BPF_LOAD_ACQ; } =20 +typedef int (*refine_state_fn)(struct bpf_verifier_env *env, + struct bpf_verifier_state *st, void *ctx); + +static int bcf_refine(struct bpf_verifier_env *env, + struct bpf_verifier_state *st, u32 reg_masks, + refine_state_fn refine_cb, void *ctx); + +static bool bcf_requested(const struct bpf_verifier_env *env) +{ + return env->prog->aux->bcf_requested; +} + +static void mark_bcf_requested(struct bpf_verifier_env *env) +{ + env->prog->aux->bcf_requested =3D true; +} + static int __get_spi(s32 off) { return (-off - 1) / BPF_REG_SIZE; @@ -23378,6 +23395,145 @@ static int do_check_common(struct bpf_verifier_en= v *env, int subprog) return ret; } =20 +static int bcf_track(struct bpf_verifier_env *env, + struct bpf_verifier_state *st, + struct bpf_verifier_state *base) +{ + return -EOPNOTSUPP; +} + +/* + * Backtracks through parent verifier states to identify the suffix of the= path + * that is relevant for register refinement in bcf_track(). Using backtrac= k_insn(), + * this routine locates the instructions that define the target registers = and any + * registers that are transitively related. All states visited during this= process + * collectively define the path suffix. + * + * Returns the parent state of the last visited state, which serves as the= base + * state from which bcf_track() begins its analysis. + * The jump history from the collected states determines the suffix to fol= low. + */ +static struct bpf_verifier_state * +backtrack_states(struct bpf_verifier_env *env, struct bpf_verifier_state *= cur, + u32 reg_masks) +{ + struct bpf_verifier_state *base =3D NULL, *st =3D cur; + struct backtrack_state *bt =3D &env->bt; + struct bcf_refine_state *bcf =3D &env->bcf; + int first_idx =3D cur->first_insn_idx; + int last_idx =3D cur->insn_idx; + int subseq_idx =3D -1; + bool skip_first =3D true; + int i, err, log_level =3D 0; + u32 vstate_cnt; + + if (!reg_masks) + return ERR_PTR(-EFAULT); + + bt_init(bt, st->curframe); + bt->reg_masks[bt->frame] =3D reg_masks; + swap(env->log.level, log_level); /* Disable backtrack_insn() log. */ + + for (;;) { + u32 history =3D st->jmp_history_cnt; + struct bpf_jmp_history_entry *hist; + + if (last_idx < 0 || !st->parent) + break; + + for (i =3D last_idx;;) { + if (skip_first) { + err =3D 0; + skip_first =3D false; + } else { + hist =3D get_jmp_hist_entry(st, history, i); + err =3D backtrack_insn(env, i, subseq_idx, hist, bt); + } + if (err) /* Track the entire path. */ + goto out; + if (bt_empty(bt)) { /* Base state found. */ + base =3D st->parent; + goto out; + } + subseq_idx =3D i; + i =3D get_prev_insn_idx(st, i, &history); + if (i =3D=3D -ENOENT) + break; + if (i >=3D env->prog->len) + goto out; + } + + st =3D st->parent; + subseq_idx =3D first_idx; + last_idx =3D st->last_insn_idx; + first_idx =3D st->first_insn_idx; + } + +out: + bt_reset(bt); + swap(env->log.level, log_level); + + /* Collect parents and follow their jmp history. */ + vstate_cnt =3D 1; + st =3D cur->parent; + while (st !=3D base) { + vstate_cnt++; + st =3D st->parent; + } + bcf->parents =3D kmalloc_array(vstate_cnt, sizeof(st), GFP_KERNEL_ACCOUNT= ); + if (!bcf->parents) + return ERR_PTR(-ENOMEM); + bcf->vstate_cnt =3D vstate_cnt; + st =3D cur; + while (vstate_cnt) { + bcf->parents[--vstate_cnt] =3D st; + st =3D st->parent; + } + bcf->cur_vstate =3D 0; + bcf->cur_jmp_entry =3D 0; + return base; +} + +static int __used bcf_refine(struct bpf_verifier_env *env, + struct bpf_verifier_state *st, u32 reg_masks, + refine_state_fn refine_cb, void *ctx) +{ + struct bpf_reg_state *regs =3D st->frame[st->curframe]->regs; + struct bpf_verifier_state *base; + int i, err; + + if (!env->bcf.available || st->speculative) + return 0; + /* BCF requested multiple times in an error path. */ + if (bcf_requested(env)) + return -EFAULT; + + if (!reg_masks) { + for (i =3D 0; i < BPF_REG_FP; i++) { + if (regs[i].type =3D=3D NOT_INIT) + continue; + if (regs[i].type !=3D SCALAR_VALUE && + tnum_is_const(regs[i].var_off)) + continue; + reg_masks |=3D (1 << i); + } + } + + base =3D backtrack_states(env, st, reg_masks); + if (IS_ERR(base)) + return PTR_ERR(base); + + err =3D bcf_track(env, st, base); + if (!err && refine_cb) + err =3D refine_cb(env, st, ctx); + + if (!err) + mark_bcf_requested(env); + + kfree(env->bcf.parents); + return err ?: 1; +} + /* Lazily verify all global functions based on their BTF, if they are call= ed * from main BPF program or any of subprograms transitively. * BPF global subprogs called from dead code are not validated. --=20 2.34.1