From nobody Wed Nov 27 20:49:00 2024 Received: from smtpout.efficios.com (smtpout.efficios.com [167.114.26.122]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CD9031DF279; Tue, 8 Oct 2024 13:52:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=167.114.26.122 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728395572; cv=none; b=kvg/rOQ6zRYDCkVGW8P2+q477mkXAii7zoQZ2Fht77WboXt1OvpEd8jLWbJT0p2TDmzadqFB0CmHKTWRQhGkcclSvf3mT7uQ9wuykW0bwAJRNpjyKMH4c6JA49cyqvcArI0vMbPcDiXvBon4yUwhqzuqWzZCPfPQGDw2W7YimRg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728395572; c=relaxed/simple; bh=ruHajFuAP+9RCH6285NJzkwm+MUgE21IicKMAn8zEWg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=JFXEIri7yPfXLWUUVGH88xgoX/1z6D890tGLj69Rv+H5OkzKDTUhmahD//Z6AWYFu3Em4Oc4U9/78AhS0SDkaTOSB7RNoXrYNMeMqS6tKlmr9d5TbRtTePU2yG7AW4zK/mxjEBSOpSE4b46WU73/TxpkExBgT5tqpIfVo8iSO10= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=efficios.com; spf=pass smtp.mailfrom=efficios.com; dkim=pass (2048-bit key) header.d=efficios.com header.i=@efficios.com header.b=LtITutnp; arc=none smtp.client-ip=167.114.26.122 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=efficios.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=efficios.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=efficios.com header.i=@efficios.com header.b="LtITutnp" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=efficios.com; s=smtpout1; t=1728395561; bh=ruHajFuAP+9RCH6285NJzkwm+MUgE21IicKMAn8zEWg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=LtITutnpgtMp+C31zbrNesLxnf5B4k6XtwV8rHMbjoI0AAVE6gW+o+wi5mUtCkwar Ga57E27euyLob/B6qOuupcvTmjgOcIIXfFLr+b0/ycaUgq34HFGeWkG+30IY/OvYQl NtqKnB1ArwD7ms3LYC446z5H5ihWusKxOvh9HM4Sf90OOJa29PjrdcDhTPFQZZaplZ QNc2KdnFDVBg5Xmdeo3dpG6TnmZse20WHqosPv+be60G3vRxnRYZBjG8g9NlNBBtP7 D9ichiqHRmRyCVDonzZX7nmCcCU31tnjxH4MBfropSezGXysozpRBQa++c1cHbWoYU 4aECQfjirUreg== Received: from thinkos.internal.efficios.com (96-127-217-162.qc.cable.ebox.net [96.127.217.162]) by smtpout.efficios.com (Postfix) with ESMTPSA id 4XNHX91sCszM4s; Tue, 8 Oct 2024 09:52:41 -0400 (EDT) From: Mathieu Desnoyers To: Boqun Feng Cc: linux-kernel@vger.kernel.org, Mathieu Desnoyers , Linus Torvalds , Andrew Morton , Peter Zijlstra , Nicholas Piggin , Michael Ellerman , Greg Kroah-Hartman , Sebastian Andrzej Siewior , "Paul E. McKenney" , Will Deacon , Alan Stern , John Stultz , Neeraj Upadhyay , Frederic Weisbecker , Joel Fernandes , Josh Triplett , Uladzislau Rezki , Steven Rostedt , Lai Jiangshan , Zqiang , Ingo Molnar , Waiman Long , Mark Rutland , Thomas Gleixner , Vlastimil Babka , maged.michael@gmail.com, Mateusz Guzik , Jonas Oberhauser , rcu@vger.kernel.org, linux-mm@kvack.org, lkmm@lists.linux.dev, Gary Guo , Nikita Popov , llvm@lists.linux.dev Subject: [RFC PATCH v3 1/4] compiler.h: Introduce ptr_eq() to preserve address dependency Date: Tue, 8 Oct 2024 09:50:31 -0400 Message-Id: <20241008135034.1982519-2-mathieu.desnoyers@efficios.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20241008135034.1982519-1-mathieu.desnoyers@efficios.com> References: <20241008135034.1982519-1-mathieu.desnoyers@efficios.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Compiler CSE and SSA GVN optimizations can cause the address dependency of addresses returned by rcu_dereference to be lost when comparing those pointers with either constants or previously loaded pointers. Introduce ptr_eq() to compare two addresses while preserving the address dependencies for later use of the address. It should be used when comparing an address returned by rcu_dereference(). This is needed to prevent the compiler CSE and SSA GVN optimizations from using @a (or @b) in places where the source refers to @b (or @a) based on the fact that after the comparison, the two are known to be equal, which does not preserve address dependencies and allows the following misordering speculations: - If @b is a constant, the compiler can issue the loads which depend on @a before loading @a. - If @b is a register populated by a prior load, weakly-ordered CPUs can speculate loads which depend on @a before loading @a. The same logic applies with @a and @b swapped. Suggested-by: Linus Torvalds Suggested-by: Boqun Feng Signed-off-by: Mathieu Desnoyers Reviewed-by: Boqun Feng Reviewed-by: Joel Fernandes (Google) Tested-by: Joel Fernandes (Google) Acked-by: "Paul E. McKenney" Acked-by: Alan Stern Cc: Greg Kroah-Hartman Cc: Sebastian Andrzej Siewior Cc: "Paul E. McKenney" Cc: Will Deacon Cc: Peter Zijlstra Cc: Boqun Feng Cc: Alan Stern Cc: John Stultz Cc: Neeraj Upadhyay Cc: Linus Torvalds Cc: Boqun Feng Cc: Frederic Weisbecker Cc: Joel Fernandes Cc: Josh Triplett Cc: Uladzislau Rezki Cc: Steven Rostedt Cc: Lai Jiangshan Cc: Zqiang Cc: Ingo Molnar Cc: Waiman Long Cc: Mark Rutland Cc: Thomas Gleixner Cc: Vlastimil Babka Cc: maged.michael@gmail.com Cc: Mateusz Guzik Cc: Gary Guo Cc: Jonas Oberhauser Cc: rcu@vger.kernel.org Cc: linux-mm@kvack.org Cc: lkmm@lists.linux.dev Cc: Nikita Popov Cc: llvm@lists.linux.dev --- Changes since v0: - Include feedback from Alan Stern. --- include/linux/compiler.h | 63 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 63 insertions(+) diff --git a/include/linux/compiler.h b/include/linux/compiler.h index 2df665fa2964..75a378ae7af1 100644 --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -186,6 +186,69 @@ void ftrace_likely_update(struct ftrace_likely_data *f= , int val, __asm__ ("" : "=3Dr" (var) : "0" (var)) #endif =20 +/* + * Compare two addresses while preserving the address dependencies for + * later use of the address. It should be used when comparing an address + * returned by rcu_dereference(). + * + * This is needed to prevent the compiler CSE and SSA GVN optimizations + * from using @a (or @b) in places where the source refers to @b (or @a) + * based on the fact that after the comparison, the two are known to be + * equal, which does not preserve address dependencies and allows the + * following misordering speculations: + * + * - If @b is a constant, the compiler can issue the loads which depend + * on @a before loading @a. + * - If @b is a register populated by a prior load, weakly-ordered + * CPUs can speculate loads which depend on @a before loading @a. + * + * The same logic applies with @a and @b swapped. + * + * Return value: true if pointers are equal, false otherwise. + * + * The compiler barrier() is ineffective at fixing this issue. It does + * not prevent the compiler CSE from losing the address dependency: + * + * int fct_2_volatile_barriers(void) + * { + * int *a, *b; + * + * do { + * a =3D READ_ONCE(p); + * asm volatile ("" : : : "memory"); + * b =3D READ_ONCE(p); + * } while (a !=3D b); + * asm volatile ("" : : : "memory"); <-- barrier() + * return *b; + * } + * + * With gcc 14.2 (arm64): + * + * fct_2_volatile_barriers: + * adrp x0, .LANCHOR0 + * add x0, x0, :lo12:.LANCHOR0 + * .L2: + * ldr x1, [x0] <-- x1 populated by first load. + * ldr x2, [x0] + * cmp x1, x2 + * bne .L2 + * ldr w0, [x1] <-- x1 is used for access which should depend= on b. + * ret + * + * On weakly-ordered architectures, this lets CPU speculation use the + * result from the first load to speculate "ldr w0, [x1]" before + * "ldr x2, [x0]". + * Based on the RCU documentation, the control dependency does not + * prevent the CPU from speculating loads. + */ +static __always_inline +int ptr_eq(const volatile void *a, const volatile void *b) +{ + OPTIMIZER_HIDE_VAR(a); + OPTIMIZER_HIDE_VAR(b); + return a =3D=3D b; +} + #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNT= ER__) =20 /** --=20 2.39.2