Date: Thu, 27 Nov 2025 10:22:31 +0100
In-Reply-To: <20251127092226.1439196-8-ardb+git@google.com>
References: <20251127092226.1439196-8-ardb+git@google.com>
Message-ID: <20251127092226.1439196-12-ardb+git@google.com>
Subject: [RFC/RFT PATCH 4/6] random: Use a lockless fast path for get_random_uXX()
From: Ard Biesheuvel
To: linux-hardening@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 Ard Biesheuvel, Kees Cook, Ryan Roberts, Will Deacon, Arnd Bergmann,
 Jeremy Linton, Catalin Marinas, Mark Rutland, "Jason A. Donenfeld"

From: Ard Biesheuvel

Currently, the implementations of the get_random_uXX() API protect their
critical section by taking a local lock and disabling interrupts, to
ensure that the code does not race with itself when called from
interrupt context.

Given that the fast path does nothing more than read a single uXX
quantity from a linear buffer and bump the position pointer, poking the
hardware registers to disable and re-enable interrupts is
disproportionately costly, and best avoided.

There are two conditions under which the batched entropy buffer is
replenished, which is what forms the critical section:

- the buffer is exhausted;
- the base_crng generation counter has incremented.

By combining the position and generation counters into a single u64, we
can use compare and exchange to implement the fast path without taking
the local lock or disabling interrupts. By constructing the expected and
next values carefully, the compare and exchange will only succeed if

- we did not race with ourselves, i.e., the compare and exchange
  increments the position counter by exactly 1;
- the buffer is not exhausted;
- the generation counter equals the base_crng generation counter.

Only if the compare and exchange fails is the original slow path taken,
and only in that case do we take the local lock.

This results in a considerable speedup (3-5x) when benchmarking
get_random_u8() in a tight loop.

Signed-off-by: Ard Biesheuvel
---
 drivers/char/random.c | 44 ++++++++++++++------
 1 file changed, 31 insertions(+), 13 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 0e04bc60d034..71bd74871540 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -496,6 +496,12 @@ static ssize_t get_random_bytes_user(struct iov_iter *iter)
  * should be called and return 0 at least once at any point prior.
  */
 
+#ifdef __LITTLE_ENDIAN
+#define LOHI(lo, hi)	lo, hi
+#else
+#define LOHI(lo, hi)	hi, lo
+#endif
+
 #define DEFINE_BATCHED_ENTROPY(type) \
 struct batch_ ##type { \
 	/* \
@@ -507,8 +513,12 @@ struct batch_ ##type { \
 	 */ \
 	type entropy[CHACHA_BLOCK_SIZE * 3 / (2 * sizeof(type))]; \
 	local_lock_t lock; \
-	unsigned int generation; \
-	unsigned int position; \
+	union { \
+		struct { \
+			unsigned int LOHI(position, generation); \
+		}; \
+		u64 posgen; \
+	}; \
 }; \
 	\
 static DEFINE_PER_CPU(struct batch_ ##type, batched_entropy_ ##type) = { \
@@ -522,6 +532,7 @@ type get_random_ ##type(void) \
 	unsigned long flags; \
 	struct batch_ ##type *batch; \
 	unsigned int next_gen; \
+	u64 next; \
 	\
 	warn_unseeded_randomness(); \
 	\
@@ -530,21 +541,28 @@ type get_random_ ##type(void) \
 		return ret; \
 	} \
 	\
-	local_lock_irqsave(&batched_entropy_ ##type.lock, flags); \
-	batch = raw_cpu_ptr(&batched_entropy_##type); \
+	batch = &get_cpu_var(batched_entropy_##type); \
 	\
 	next_gen = (unsigned int)READ_ONCE(base_crng.generation); \
-	if (batch->position >= ARRAY_SIZE(batch->entropy) || \
-	    next_gen != batch->generation) { \
-		_get_random_bytes(batch->entropy, sizeof(batch->entropy)); \
-		batch->position = 0; \
-		batch->generation = next_gen; \
+	next = (u64)next_gen << 32; \
+	if (likely(batch->position < ARRAY_SIZE(batch->entropy))) { \
+		next |= batch->position + 1; /* next-1 is bogus otherwise */ \
+		ret = batch->entropy[batch->position]; \
+	} \
+	if (cmpxchg64_local(&batch->posgen, next, next - 1) != next - 1) { \
+		local_lock_irqsave(&batched_entropy_ ##type.lock, flags); \
+		if (batch->position >= ARRAY_SIZE(batch->entropy) || \
+		    next_gen != batch->generation) { \
+			_get_random_bytes(batch->entropy, sizeof(batch->entropy));\
+			batch->position = 0; \
+			batch->generation = next_gen; \
+		} \
+		ret = batch->entropy[batch->position++]; \
+		local_unlock_irqrestore(&batched_entropy_ ##type.lock, flags); \
 	} \
 	\
-	ret = batch->entropy[batch->position]; \
-	batch->entropy[batch->position] = 0; \
-	++batch->position; \
-	local_unlock_irqrestore(&batched_entropy_ ##type.lock, flags); \
+	batch->entropy[batch->position - 1] = 0; \
+	put_cpu_var(batched_entropy_##type); \
 	return ret; \
 } \
 EXPORT_SYMBOL(get_random_ ##type);
-- 
2.52.0.107.ga0afd4fd5b-goog
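
For anyone who wants to experiment with the scheme outside the kernel,
the fast path described in the commit message can be modelled in
userspace. The following is only a minimal sketch under stated
assumptions, not the patch's implementation: it uses C11 atomics in
place of cmpxchg64_local(), packs the two counters with explicit shifts
instead of the LOHI() union, and bails out explicitly when the buffer is
exhausted rather than relying on a deliberately mismatching expected
value. The names batch_model, pack(), fast_path() and BATCH_LEN are
hypothetical, and per-CPU data, interrupts and zeroing of consumed slots
are not modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical batch size; the kernel derives its own from CHACHA_BLOCK_SIZE. */
#define BATCH_LEN 8u

struct batch_model {
	uint8_t entropy[BATCH_LEN];
	/* generation in bits 63..32, position in bits 31..0 */
	_Atomic uint64_t posgen;
};

/* Combine generation and position into the single 64-bit word. */
static inline uint64_t pack(uint32_t gen, uint32_t pos)
{
	return ((uint64_t)gen << 32) | pos;
}

/*
 * Model of the lockless fast path. Returns true if one value was
 * consumed; *out is only meaningful in that case. A false return
 * means the caller has to fall back to the locked slow path.
 */
static inline bool fast_path(struct batch_model *b, uint32_t base_gen,
			     uint8_t *out)
{
	uint32_t pos = (uint32_t)atomic_load(&b->posgen);
	uint64_t expected, next;

	if (pos >= BATCH_LEN)
		return false;		/* buffer exhausted */

	/*
	 * The exchange succeeds only if the stored word still holds the
	 * position we read AND the stored generation equals base_gen.
	 */
	expected = pack(base_gen, pos);
	next = pack(base_gen, pos + 1);
	*out = b->entropy[pos];

	return atomic_compare_exchange_strong(&b->posgen, &expected, next);
}
```

With posgen initialised to pack(1, 0), calling fast_path() with
base_gen == 1 consumes entropy[0] and advances the position to 1. A call
with base_gen == 2 fails and leaves posgen untouched, which is the case
where the real code would take the local lock and refill the buffer.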