From nobody Thu Apr 2 17:31:03 2026
Date: Mon, 16 Feb 2026 15:16:22 +0100
In-Reply-To: <20260216142436.2207937-2-elver@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20260216142436.2207937-2-elver@google.com>
X-Mailer: git-send-email 2.53.0.335.g19a08e0c02-goog
Message-ID: <20260216142436.2207937-3-elver@google.com>
Subject: [PATCH v4 1/2] arm64: Optimize __READ_ONCE() with CONFIG_LTO=y
From: Marco Elver
To: elver@google.com, Peter Zijlstra, Will Deacon
Cc: Ingo Molnar, Thomas Gleixner, Boqun Feng, Waiman Long,
	Bart Van Assche, llvm@lists.linux.dev, David Laight, Al Viro,
	Catalin Marinas, Nathan Chancellor, Arnd Bergmann,
	linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Rework arm64 LTO __READ_ONCE() to improve code generation as follows:

1. Replace _Generic-based __unqual_scalar_typeof() with more complete
   __rwonce_typeof_unqual(). This strips qualifiers from all types, not
   just integer types, which is required to be able to assign (must be
   non-const) to __u.__val in the non-atomic case (required for #2).

   One subtle point here is that non-integer types of __val could be const
   or volatile within the union with the old __unqual_scalar_typeof(), if
   the passed variable is const or volatile. This would then result in a
   forced load from the stack if __u.__val is volatile; in the case of
   const, it does look odd if the underlying storage changes, but the
   compiler is told said member is "const" -- it smells like UB.

2. Eliminate the atomic flag and ternary conditional expression. Move
   the fallback volatile load into the default case of the switch,
   ensuring __u is unconditionally initialized across all paths.
   The statement expression now unconditionally returns __u.__val.

This refactoring appears to help the compiler improve (or fix) code
generation.
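
For illustration only, the shape described in point 2 can be sketched in
user space. This is not the kernel macro: the LDAPR/LDAR inline assembly is
replaced by plain volatile loads, and every demo_* name (including the
may_alias typedefs and the struct) is a hypothetical stand-in. It assumes
GNU C with a C11 or later language mode, so that qualifiers are dropped from
function return types.

```c
#include <assert.h>
#include <stdint.h>

/* The qualifier-dropping cast may warn under -Wextra; silence it here. */
#pragma GCC diagnostic ignored "-Wignored-qualifiers"

/* Hypothetical may_alias integer types so the sized loads may alias any object. */
typedef uint8_t  __attribute__((__may_alias__)) demo_u8;
typedef uint16_t __attribute__((__may_alias__)) demo_u16;
typedef uint32_t __attribute__((__may_alias__)) demo_u32;
typedef uint64_t __attribute__((__may_alias__)) demo_u64;

/* Hypothetical stand-in for __rwonce_typeof_unqual(). */
#define demo_typeof_unqual(x) __typeof__(((__typeof__(x) (*)(void))0)())

/*
 * Every path through the switch writes __u, so the statement expression
 * can return __u.__val unconditionally (no 'atomic' flag, no ternary).
 */
#define demo_read_once(x)						\
({									\
	__typeof__(&(x)) __x = &(x);					\
	union { demo_typeof_unqual(*__x) __val; char __c[1]; } __u;	\
	switch (sizeof(x)) {						\
	case 1:								\
		*(demo_u8 *)__u.__c = *(const volatile demo_u8 *)__x;	\
		break;							\
	case 2:								\
		*(demo_u16 *)__u.__c = *(const volatile demo_u16 *)__x;	\
		break;							\
	case 4:								\
		*(demo_u32 *)__u.__c = *(const volatile demo_u32 *)__x;	\
		break;							\
	case 8:								\
		*(demo_u64 *)__u.__c = *(const volatile demo_u64 *)__x;	\
		break;							\
	default:							\
		/* Fallback for other sizes: plain volatile load. */	\
		__u.__val = *(volatile __typeof__(*__x) *)__x;		\
	}								\
	__u.__val;							\
})

struct demo_triple { int a, b, c; };	/* 12 bytes: takes the default path */

static int demo_sum(void)
{
	const int small = 7;				/* 4-byte case */
	volatile struct demo_triple big = { 1, 2, 3 };	/* default case */
	int v = demo_read_once(small);
	struct demo_triple t = demo_read_once(big);

	assert(sizeof(t) == sizeof(big));
	return v + t.a + t.b + t.c;
}
```

Because __u.__val has an unqualified type even when the argument is const or
volatile, the assignment in the default case compiles; with a qualified
member it would either be rejected (const) or force a reload (volatile).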

With a defconfig + LTO + debug options build, we observe different
codegen for the following functions:

	btrfs_reclaim_sweep (708 -> 1032 bytes)
	btrfs_sinfo_bg_reclaim_threshold_store (200 -> 204 bytes)
	check_mem_access (3652 -> 3692 bytes) [inlined bpf_map_is_rdonly]
	console_flush_all (1268 -> 1264 bytes)
	console_lock_spinning_disable_and_check (180 -> 176 bytes)
	igb_add_filter (640 -> 636 bytes)
	igb_config_tx_modes (2404 -> 2400 bytes)
	kvm_vcpu_on_spin (480 -> 476 bytes)
	map_freeze (376 -> 380 bytes)
	netlink_bind (1664 -> 1656 bytes)
	nmi_cpu_backtrace (404 -> 400 bytes)
	set_rps_cpu (516 -> 520 bytes)
	swap_cluster_readahead (944 -> 932 bytes)
	tcp_accecn_third_ack (328 -> 336 bytes)
	tcp_create_openreq_child (1764 -> 1772 bytes)
	tcp_data_queue (5784 -> 5892 bytes)
	tcp_ecn_rcv_synack (620 -> 628 bytes)
	xen_manage_runstate_time (944 -> 896 bytes)
	xen_steal_clock (340 -> 296 bytes)

The size increases of some functions are due to more aggressive inlining
enabled by better codegen (in this build, e.g. bpf_map_is_rdonly is no
longer present because it was inlined completely).

NOTE: The return-value-of-function-drops-qualifiers hack was first
suggested by Al Viro in [1], which notes some limitations that make it
unsuitable as a general __unqual_scalar_typeof() replacement. Notably,
array types are not supported, and GCC 8.1-8.3 still fail. Why should we
use it here? READ_ONCE() does not support reading whole arrays, and the
GCC version problem only affects 3 minor releases of a very ancient
still-supported GCC version; not only that, this arm64 READ_ONCE()
version is currently only activated by LTO builds, which to date are
*only supported by Clang*!

Link: https://lore.kernel.org/all/20260111182010.GH3634291@ZenIV/ [1]
Signed-off-by: Marco Elver
---
v4:
* Use the return-value-of-function-drops-qualifiers hack to paint the
  bikeshed.

v3:
* Comment.

v2:
* Add __rwonce_typeof_unqual() as fallback for old compilers.
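
As a rough user-space illustration of the return-value-of-function-drops-
qualifiers hack mentioned in the NOTE, the sketch below checks that the
resulting type is assignable for both a scalar and a struct. The demo_*
names are hypothetical, the kernel's __diag_*() wrappers are replaced by a
file-level pragma, and it assumes a compiler and language mode (C11 or
later) in which qualifiers on function return types are dropped.

```c
#include <assert.h>

/* The cast to a function type returning a qualified type may warn under -Wextra. */
#pragma GCC diagnostic ignored "-Wignored-qualifiers"

/*
 * Hypothetical stand-in for __rwonce_typeof_unqual(): the type of a call
 * through a (never evaluated) function pointer returning typeof(x).
 * As noted in the commit message, array types cannot be handled this way.
 */
#define demo_typeof_unqual(x) __typeof__(((__typeof__(x) (*)(void))0)())

struct demo_pair { int a, b; };

/* Scalar case: classify the pointer type of each variable via _Generic. */
static int demo_scalar_kinds(void)
{
	const int c = 5;
	__typeof__(c) still_const = c;		/* keeps 'const' */
	demo_typeof_unqual(c) plain = c;	/* 'const' dropped */
	int k_const = _Generic(&still_const, const int *: 1, int *: 2, default: 0);
	int k_plain = _Generic(&plain, const int *: 1, int *: 2, default: 0);

	plain += 1;	/* compiles only because 'plain' is not const */
	assert(plain == 6 && still_const == 5);
	return k_const * 10 + k_plain;
}

/* Struct case: the old _Generic-based helper would keep the qualifiers here. */
static int demo_struct_assign(void)
{
	const volatile struct demo_pair src = { 1, 2 };
	demo_typeof_unqual(src) dst = { 0, 0 };

	dst.a = src.a + 10;	/* would be rejected if dst were const */
	dst.b = src.b;
	return dst.a + dst.b;
}
```

The struct case is the one that matters for the union member in
__READ_ONCE(): a member that stays const could not be assigned in the
default path.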
---
 arch/arm64/include/asm/rwonce.h | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/rwonce.h b/arch/arm64/include/asm/rwonce.h
index fc0fb42b0b64..9fd24cef3376 100644
--- a/arch/arm64/include/asm/rwonce.h
+++ b/arch/arm64/include/asm/rwonce.h
@@ -19,6 +19,17 @@
 		"ldapr"	#sfx "\t" #regs,				\
 	ARM64_HAS_LDAPR)
 
+/*
+ * Replace this with typeof_unqual() when minimum compiler versions are
+ * increased to GCC 14 and Clang 19. For the time being, we need this
+ * workaround, which relies on function return values dropping qualifiers.
+ */
+#define __rwonce_typeof_unqual(x) typeof(({				\
+	__diag_push()							\
+	__diag_ignore_all("-Wignored-qualifiers", "")			\
+	((typeof(x)(*)(void))0)();					\
+	__diag_pop() }))
+
 /*
  * When building with LTO, there is an increased risk of the compiler
  * converting an address dependency headed by a READ_ONCE() invocation
@@ -32,8 +43,7 @@
 #define __READ_ONCE(x)							\
 ({									\
 	typeof(&(x)) __x = &(x);					\
-	int atomic = 1;							\
-	union { __unqual_scalar_typeof(*__x) __val; char __c[1]; } __u;	\
+	union { __rwonce_typeof_unqual(*__x) __val; char __c[1]; } __u;	\
 	switch (sizeof(x)) {						\
 	case 1:								\
 		asm volatile(__LOAD_RCPC(b, %w0, %1)			\
@@ -56,9 +66,9 @@
 			: "Q" (*__x) : "memory");			\
 		break;							\
 	default:							\
-		atomic = 0;						\
+		__u.__val = *(volatile typeof(*__x) *)__x;		\
 	}								\
-	atomic ? (typeof(*__x))__u.__val : (*(volatile typeof(*__x) *)__x);\
+	__u.__val;							\
 })
 
 #endif	/* !BUILD_VDSO */
-- 
2.53.0.335.g19a08e0c02-goog

From nobody Thu Apr 2 17:31:03 2026
Date: Mon, 16 Feb 2026 15:16:23 +0100
In-Reply-To: <20260216142436.2207937-2-elver@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20260216142436.2207937-2-elver@google.com>
X-Mailer: git-send-email 2.53.0.335.g19a08e0c02-goog
Message-ID: <20260216142436.2207937-4-elver@google.com>
Subject: [PATCH v4 2/2] arm64, compiler-context-analysis: Permit alias
 analysis through __READ_ONCE() with CONFIG_LTO=y
From: Marco Elver
To: elver@google.com, Peter Zijlstra, Will Deacon
Cc: Ingo Molnar, Thomas Gleixner, Boqun Feng, Waiman Long,
	Bart Van Assche, llvm@lists.linux.dev, David Laight, Al Viro,
	Catalin Marinas, Nathan Chancellor, Arnd Bergmann,
	linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
	kernel test robot, Boqun Feng
Content-Type: text/plain; charset="utf-8"

When enabling Clang's Context Analysis (aka. Thread Safety Analysis) on
kernel/futex/core.o (see Peter's changes at [1]), in arm64 LTO builds we
could see:

| kernel/futex/core.c:982:1: warning: spinlock 'atomic ? __u.__val : q->lock_ptr' is still held at the end of function [-Wthread-safety-analysis]
|      982 | }
|          | ^
| kernel/futex/core.c:976:2: note: spinlock acquired here
|      976 |         spin_lock(lock_ptr);
|          |         ^
| kernel/futex/core.c:982:1: warning: expecting spinlock 'q->lock_ptr' to be held at the end of function [-Wthread-safety-analysis]
|      982 | }
|          | ^
| kernel/futex/core.c:966:6: note: spinlock acquired here
|      966 | void futex_q_lockptr_lock(struct futex_q *q)
|          |      ^
| 2 warnings generated.

Where we have:

	extern void futex_q_lockptr_lock(struct futex_q *q) __acquires(q->lock_ptr);
	..
	void futex_q_lockptr_lock(struct futex_q *q)
	{
		spinlock_t *lock_ptr;

		/*
		 * See futex_unqueue() why lock_ptr can change.
		 */
		guard(rcu)();
	retry:
	>>	lock_ptr = READ_ONCE(q->lock_ptr);
		spin_lock(lock_ptr);
		...
	}

At the time of the above report (prior to removal of the 'atomic' flag),
Clang Thread Safety Analysis's alias analysis resolved 'lock_ptr' to
'atomic ? __u.__val : q->lock_ptr' (now just '__u.__val'), and used
this as the identity of the context lock given it cannot "see through"
the inline assembly; however, we want 'q->lock_ptr' as the canonical
context lock.

While for code generation the compiler simplified to '__u.__val' for
pointers (8 byte case -> 'atomic' was set), TSA's analysis (a) happens
much earlier on the AST, and (b) would be the wrong deduction.

Now that we've gotten rid of the 'atomic' ternary comparison, we can
return '__u.__val' through a pointer that we initialize with '&x', but
then update via a pointer-to-pointer. When READ_ONCE()'ing a context
lock pointer, TSA's alias analysis does not invalidate the initial alias
when updated through the pointer-to-pointer, and we make it effectively
"see through" the __READ_ONCE().

Code generation is unchanged.

Link: https://lkml.kernel.org/r/20260121110704.221498346@infradead.org [1]
Reported-by: kernel test robot
Closes: https://lore.kernel.org/oe-kbuild-all/202601221040.TeM0ihff-lkp@intel.com/
Cc: Peter Zijlstra
Tested-by: Boqun Feng
Reviewed-by: David Laight
Signed-off-by: Marco Elver
---
v3:
* Use 'typeof(*__ret)'.
* Commit message.

v2:
* Rebase.
---
 arch/arm64/include/asm/rwonce.h | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/rwonce.h b/arch/arm64/include/asm/rwonce.h
index 9fd24cef3376..0f3a01d30f66 100644
--- a/arch/arm64/include/asm/rwonce.h
+++ b/arch/arm64/include/asm/rwonce.h
@@ -42,8 +42,12 @@
  */
 #define __READ_ONCE(x)							\
 ({									\
-	typeof(&(x)) __x = &(x);					\
-	union { __rwonce_typeof_unqual(*__x) __val; char __c[1]; } __u;	\
+	auto __x = &(x);						\
+	auto __ret = (__rwonce_typeof_unqual(*__x) *)__x;		\
+	/* Hides alias reassignment from Clang's -Wthread-safety. */	\
+	auto __retp = &__ret;						\
+	union { typeof(*__ret) __val; char __c[1]; } __u;		\
+	*__retp = &__u.__val;						\
 	switch (sizeof(x)) {						\
 	case 1:								\
 		asm volatile(__LOAD_RCPC(b, %w0, %1)			\
@@ -68,7 +72,7 @@
 	default:							\
 		__u.__val = *(volatile typeof(*__x) *)__x;		\
 	}								\
-	__u.__val;							\
+	*__ret;								\
 })
 
 #endif	/* !BUILD_VDSO */
-- 
2.53.0.335.g19a08e0c02-goog
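
The pointer-to-pointer data flow described in the commit message of patch
2/2 can be sketched in user space as follows. This only shows how the value
travels: __ret starts out pointing at the original object, is redirected to
the local copy through __retp, and the result is read through __ret. How
Clang's analysis treats this is as reported above and is not demonstrated
here. The demo_* names are hypothetical, __auto_type stands in for the
kernel's 'auto', and the sized acquire loads are collapsed into a single
volatile load.

```c
#include <assert.h>

/* The qualifier-dropping cast may warn under -Wextra; silence it here. */
#pragma GCC diagnostic ignored "-Wignored-qualifiers"

/* Hypothetical stand-in for __rwonce_typeof_unqual(). */
#define demo_typeof_unqual(x) __typeof__(((__typeof__(x) (*)(void))0)())

#define demo_read_through(x)						\
({									\
	__auto_type __x = &(x);						\
	/* Initially an alias of the original object. */		\
	__auto_type __ret = (demo_typeof_unqual(*__x) *)__x;		\
	/* Redirect __ret indirectly, via a pointer to it. */		\
	__auto_type __retp = &__ret;					\
	union { __typeof__(*__ret) __val; char __c[1]; } __u;		\
	*__retp = &__u.__val;						\
	/* Stand-in for the switch with the sized acquire loads. */	\
	__u.__val = *(volatile __typeof__(*__x) *)__x;			\
	*__ret;								\
})

static int demo_lock_word = 41;		/* hypothetical "lock" object */

static int demo_follow_lock_ptr(void)
{
	int *const lock_ptr_field = &demo_lock_word;	/* qualified pointer */
	int *lock_ptr = demo_read_through(lock_ptr_field);

	assert(lock_ptr == &demo_lock_word);
	return *lock_ptr + 1;
}
```

The value returned by the statement expression is the copy held in
__u.__val, so the caller observes the same pointer as a direct read of the
field.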