From: Yunhui Cui
To: aou@eecs.berkeley.edu, alex@ghiti.fr, andii@kernel.org,
	andybnac@gmail.com, apatel@ventanamicro.com, ast@kernel.org,
	ben.dooks@codethink.co.uk, bjorn@kernel.org, bpf@vger.kernel.org,
	charlie@rivosinc.com, cl@gentwo.org, conor.dooley@microchip.com,
	cuiyunhui@bytedance.com, cyrilbur@tenstorrent.com,
	daniel@iogearbox.net, debug@rivosinc.com, dennis@kernel.org,
	eddyz87@gmail.com, haoluo@google.com, john.fastabend@gmail.com,
	jolsa@kernel.org, kpsingh@kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-riscv@lists.infradead.org,
	linux@rasmusvillemoes.dk, martin.lau@linux.dev, palmer@dabbelt.com,
	pjw@kernel.org, puranjay@kernel.org, pulehui@huawei.com,
	ruanjinjie@huawei.com, rkrcmar@ventanamicro.com,
	samuel.holland@sifive.com, sdf@fomichev.me, song@kernel.org,
	tglx@linutronix.de, tj@kernel.org, thuth@redhat.com,
	yonghong.song@linux.dev, yury.norov@gmail.com, zong.li@sifive.com
Subject: [PATCH v3 3/3] riscv: store percpu offset into thread_info
Date: Tue, 16 Dec 2025 09:47:21 +0800
Message-Id: <20251216014721.42262-4-cuiyunhui@bytedance.com>
In-Reply-To: <20251216014721.42262-1-cuiyunhui@bytedance.com>
References: <20251216014721.42262-1-cuiyunhui@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Originally we planned to add a register for the percpu offset, which
would speed up percpu variable reads/writes and reduce the number of
instructions per access. After discussion [1], the offset is now stored
in thread_info instead.

The offset is initialized in smp_prepare_boot_cpu() and smp_callin(),
and copied to the incoming task in switch_to(). asm_per_cpu and the BPF
JIT then load it directly from thread_info rather than indexing
__per_cpu_offset[] by CPU number.

[1] https://lists.riscv.org/g/tech-privileged/topic/risc_v_tech_arch_review/113437553?page=2

Signed-off-by: Yunhui Cui
---
 arch/riscv/include/asm/asm.h         | 6 +-----
 arch/riscv/include/asm/percpu.h      | 4 ++++
 arch/riscv/include/asm/switch_to.h   | 8 ++++++++
 arch/riscv/include/asm/thread_info.h | 5 +++--
 arch/riscv/kernel/asm-offsets.c      | 1 +
 arch/riscv/kernel/smpboot.c          | 7 +++++++
 arch/riscv/net/bpf_jit_comp64.c      | 9 +--------
 7 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h
index e9e8ba83e632f..137a49488325e 100644
--- a/arch/riscv/include/asm/asm.h
+++ b/arch/riscv/include/asm/asm.h
@@ -91,11 +91,7 @@
 
 #ifdef CONFIG_SMP
 .macro asm_per_cpu dst sym tmp
-	lw \tmp, TASK_TI_CPU_NUM(tp)
-	slli \tmp, \tmp, RISCV_LGPTR
-	la \dst, __per_cpu_offset
-	add \dst, \dst, \tmp
-	REG_L \tmp, 0(\dst)
+	REG_L \tmp, TASK_TI_PCPU_OFFSET(tp)
 	la \dst, \sym
 	add \dst, \dst, \tmp
 .endm
diff --git a/arch/riscv/include/asm/percpu.h b/arch/riscv/include/asm/percpu.h
index c5bacf6d864ee..35a63420a76a4 100644
--- a/arch/riscv/include/asm/percpu.h
+++ b/arch/riscv/include/asm/percpu.h
@@ -7,7 +7,9 @@
 
 #include
 #include
+#include
 #include
+#include
 
 #define PERCPU_RW_OPS(sz)	\
 static inline unsigned long __percpu_read_##sz(void *ptr)	\
@@ -239,6 +241,8 @@ _pcp_protect_return(__percpu_add_return_amo_case_64, pcp, val)
 })
 #endif
 
+#define __my_cpu_offset (((struct thread_info *)current)->pcpu_offset)
+
 #include
 
 #endif /* __ASM_PERCPU_H */
diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h
index 0e71eb82f920c..733b6cd306e40 100644
--- a/arch/riscv/include/asm/switch_to.h
+++ b/arch/riscv/include/asm/switch_to.h
@@ -88,6 +88,13 @@ static inline void __switch_to_envcfg(struct task_struct *next)
 			:: "r" (next->thread.envcfg) : "memory");
 }
 
+static inline void __switch_to_pcpu_offset(struct task_struct *next)
+{
+#ifdef CONFIG_SMP
+	next->thread_info.pcpu_offset = __my_cpu_offset;
+#endif
+}
+
 extern struct task_struct *__switch_to(struct task_struct *,
 				       struct task_struct *);
 
@@ -122,6 +129,7 @@ do {	\
 	if (switch_to_should_flush_icache(__next))	\
 		local_flush_icache_all();	\
 	__switch_to_envcfg(__next);	\
+	__switch_to_pcpu_offset(__next);	\
 	((last) = __switch_to(__prev, __next));	\
 } while (0)
 
diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
index 36918c9200c92..8d7d43cc9c405 100644
--- a/arch/riscv/include/asm/thread_info.h
+++ b/arch/riscv/include/asm/thread_info.h
@@ -52,7 +52,8 @@
  */
 struct thread_info {
 	unsigned long flags; /* low level flags */
-	int                     preempt_count;  /* 0=>preemptible, <0=>BUG */
+	int			preempt_count;	/* 0=>preemptible, <0=>BUG */
+	int cpu;
 	/*
 	 * These stack pointers are overwritten on every system call or
 	 * exception. SP is also saved to the stack it can be recovered when
@@ -60,8 +61,8 @@ struct thread_info {
 	 */
 	long kernel_sp; /* Kernel stack pointer */
 	long user_sp; /* User stack pointer */
-	int cpu;
 	unsigned long syscall_work; /* SYSCALL_WORK_ flags */
+	unsigned long pcpu_offset;
 #ifdef CONFIG_SHADOW_CALL_STACK
 	void *scs_base;
 	void *scs_sp;
diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
index af827448a609e..fbf53b66b0e06 100644
--- a/arch/riscv/kernel/asm-offsets.c
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -38,6 +38,7 @@ void asm_offsets(void)
 	OFFSET(TASK_THREAD_SUM, task_struct, thread.sum);
 
 	OFFSET(TASK_TI_CPU, task_struct, thread_info.cpu);
+	OFFSET(TASK_TI_PCPU_OFFSET, task_struct, thread_info.pcpu_offset);
 	OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count);
 	OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp);
 	OFFSET(TASK_TI_USER_SP, task_struct, thread_info.user_sp);
diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index d85916a3660c3..9e95c068b966b 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -209,6 +209,11 @@ int __cpu_up(unsigned int cpu, struct task_struct *tidle)
 }
 #endif
 
+void __init smp_prepare_boot_cpu(void)
+{
+	__my_cpu_offset = per_cpu_offset(smp_processor_id());
+}
+
 void __init smp_cpus_done(unsigned int max_cpus)
 {
 }
@@ -234,6 +239,8 @@ asmlinkage __visible void smp_callin(void)
 	mmgrab(mm);
 	current->active_mm = mm;
 
+	__my_cpu_offset = per_cpu_offset(smp_processor_id());
+
 #ifdef CONFIG_HOTPLUG_PARALLEL
 	cpuhp_ap_sync_alive();
 #endif
diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
index 5f9457e910e87..4a492a6a1cc1e 100644
--- a/arch/riscv/net/bpf_jit_comp64.c
+++ b/arch/riscv/net/bpf_jit_comp64.c
@@ -1345,15 +1345,8 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
 		if (rd != rs)
 			emit_mv(rd, rs, ctx);
 #ifdef CONFIG_SMP
-		/* Load current CPU number in T1 */
-		emit_lw(RV_REG_T1, offsetof(struct thread_info, cpu),
+		emit_ld(RV_REG_T1, offsetof(struct thread_info, pcpu_offset),
 			RV_REG_TP, ctx);
-		/* Load address of __per_cpu_offset array in T2 */
-		emit_addr(RV_REG_T2, (u64)&__per_cpu_offset, extra_pass, ctx);
-		/* Get address of __per_cpu_offset[cpu] in T1 */
-		emit_sh3add(RV_REG_T1, RV_REG_T1, RV_REG_T2, ctx);
-		/* Load __per_cpu_offset[cpu] in T1 */
-		emit_ld(RV_REG_T1, 0, RV_REG_T1, ctx);
 		/* Add the offset to Rd */
 		emit_add(rd, rd, RV_REG_T1, ctx);
 #endif
-- 
2.39.5