From nobody Sun Dec 14 06:15:19 2025
From: David Laight
To: Andrew Morton, linux-kernel@vger.kernel.org
Cc: David Laight, u.kleine-koenig@baylibre.com, Nicolas Pitre, Oleg Nesterov, Peter Zijlstra, Biju Das, Borislav Petkov, Dave Hansen, "H. Peter Anvin", Ingo Molnar, Thomas Gleixner, Li RongQing, Khazhismel Kumykov, Jens Axboe, x86@kernel.org
Subject: [PATCH v5 next 1/9] lib: mul_u64_u64_div_u64() rename parameter 'c' to 'd'
Date: Wed, 5 Nov 2025 20:10:27 +0000
Message-Id: <20251105201035.64043-2-david.laight.linux@gmail.com>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20251105201035.64043-1-david.laight.linux@gmail.com>
References: <20251105201035.64043-1-david.laight.linux@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Change the prototype from mul_u64_u64_div_u64(u64 a, u64 b, u64 c)
to mul_u64_u64_div_u64(u64 a, u64 b, u64 d).
Using 'd' for 'divisor' makes more sense.

An upcoming change adds a 'c' parameter to calculate (a * b + c)/d.

Signed-off-by: David Laight
Reviewed-by: Nicolas Pitre
---
 lib/math/div64.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/lib/math/div64.c b/lib/math/div64.c
index bf77b9843175..0ebff850fd4d 100644
--- a/lib/math/div64.c
+++ b/lib/math/div64.c
@@ -184,10 +184,10 @@ u32 iter_div_u64_rem(u64 dividend, u32 divisor, u64 *remainder)
 EXPORT_SYMBOL(iter_div_u64_rem);
 
 #ifndef mul_u64_u64_div_u64
-u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 c)
+u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d)
 {
 	if (ilog2(a) + ilog2(b) <= 62)
-		return div64_u64(a * b, c);
+		return div64_u64(a * b, d);
 
 #if defined(__SIZEOF_INT128__)
 
@@ -212,37 +212,37 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 c)
 
 #endif
 
-	/* make sure c is not zero, trigger runtime exception otherwise */
-	if (unlikely(c == 0)) {
+	/* make sure d is not zero, trigger runtime exception otherwise */
+	if (unlikely(d == 0)) {
 		unsigned long zero = 0;
 
 		OPTIMIZER_HIDE_VAR(zero);
 		return ~0UL/zero;
 	}
 
-	int shift = __builtin_ctzll(c);
+	int shift = __builtin_ctzll(d);
 
 	/* try reducing the fraction in case the dividend becomes <= 64 bits */
 	if ((n_hi >> shift) == 0) {
 		u64 n = shift ? (n_lo >> shift) | (n_hi << (64 - shift)) : n_lo;
 
-		return div64_u64(n, c >> shift);
+		return div64_u64(n, d >> shift);
 		/*
 		 * The remainder value if needed would be:
-		 *   res = div64_u64_rem(n, c >> shift, &rem);
+		 *   res = div64_u64_rem(n, d >> shift, &rem);
 		 *   rem = (rem << shift) + (n_lo - (n << shift));
 		 */
 	}
 
-	if (n_hi >= c) {
+	if (n_hi >= d) {
 		/* overflow: result is unrepresentable in a u64 */
 		return -1;
 	}
 
 	/* Do the full 128 by 64 bits division */
 
-	shift = __builtin_clzll(c);
-	c <<= shift;
+	shift = __builtin_clzll(d);
+	d <<= shift;
 
 	int p = 64 + shift;
 	u64 res = 0;
@@ -257,8 +257,8 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 c)
 		n_hi <<= shift;
 		n_hi |= n_lo >> (64 - shift);
 		n_lo <<= shift;
-		if (carry || (n_hi >= c)) {
-			n_hi -= c;
+		if (carry || (n_hi >= d)) {
+			n_hi -= d;
 			res |= 1ULL << p;
 		}
 	} while (n_hi);
-- 
2.39.5

From nobody Sun Dec 14 06:15:19 2025
From: David Laight
To: Andrew Morton, linux-kernel@vger.kernel.org
Cc: David Laight, u.kleine-koenig@baylibre.com, Nicolas Pitre, Oleg Nesterov, Peter Zijlstra, Biju Das, Borislav Petkov, Dave Hansen, "H. Peter Anvin", Ingo Molnar, Thomas Gleixner, Li RongQing, Khazhismel Kumykov, Jens Axboe, x86@kernel.org
Subject: [PATCH v5 next 2/9] lib: mul_u64_u64_div_u64() Combine overflow and divide by zero checks
Date: Wed, 5 Nov 2025 20:10:28 +0000
Message-Id: <20251105201035.64043-3-david.laight.linux@gmail.com>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20251105201035.64043-1-david.laight.linux@gmail.com>
References: <20251105201035.64043-1-david.laight.linux@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Since the overflow check always triggers when the divisor is zero,
move the check for divide by zero inside the overflow check.
This means there is only one test in the normal path.

Signed-off-by: David Laight
Reviewed-by: Nicolas Pitre
---
V3 contained a different patch 2 that did different changes to the
error paths.

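A minimal userspace sketch of why one test is enough on the normal path. The
enum and the function name classify() are made up for illustration and are not
part of the patch; only the comparison n_hi >= d is taken from it.

```c
#include <stdint.h>

/* Hypothetical result codes, for illustration only. */
enum div_check { DIV_OK, DIV_OVERFLOW, DIV_BY_ZERO };

/*
 * n_hi is the high 64 bits of the 128-bit dividend, d is the divisor.
 * The quotient fits in 64 bits only when n_hi < d.  Since n_hi is
 * unsigned, n_hi >= 0 always holds, so d == 0 can never get past the
 * outer test and needs no check of its own on the normal path.
 */
static enum div_check classify(uint64_t n_hi, uint64_t d)
{
	if (n_hi >= d)
		return d == 0 ? DIV_BY_ZERO : DIV_OVERFLOW;
	return DIV_OK;
}
```

With this shape the common case (n_hi < d) costs a single compare, and the
zero-divisor test is only reached on the already unlikely error path.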
 lib/math/div64.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/lib/math/div64.c b/lib/math/div64.c
index 0ebff850fd4d..1092f41e878e 100644
--- a/lib/math/div64.c
+++ b/lib/math/div64.c
@@ -212,12 +212,16 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d)
 
 #endif
 
-	/* make sure d is not zero, trigger runtime exception otherwise */
-	if (unlikely(d == 0)) {
-		unsigned long zero = 0;
+	if (unlikely(n_hi >= d)) {
+		/* trigger runtime exception if divisor is zero */
+		if (d == 0) {
+			unsigned long zero = 0;
 
-		OPTIMIZER_HIDE_VAR(zero);
-		return ~0UL/zero;
+			OPTIMIZER_HIDE_VAR(zero);
+			return ~0UL/zero;
+		}
+		/* overflow: result is unrepresentable in a u64 */
+		return ~0ULL;
 	}
 
 	int shift = __builtin_ctzll(d);
@@ -234,11 +238,6 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d)
 		 */
 	}
 
-	if (n_hi >= d) {
-		/* overflow: result is unrepresentable in a u64 */
-		return -1;
-	}
-
 	/* Do the full 128 by 64 bits division */
 
 	shift = __builtin_clzll(d);
-- 
2.39.5

From nobody Sun Dec 14 06:15:19 2025
From: David Laight
To: Andrew Morton, linux-kernel@vger.kernel.org
Cc: David Laight, u.kleine-koenig@baylibre.com, Nicolas Pitre, Oleg Nesterov, Peter Zijlstra, Biju Das, Borislav Petkov, Dave Hansen, "H. Peter Anvin", Ingo Molnar, Thomas Gleixner, Li RongQing, Khazhismel Kumykov, Jens Axboe, x86@kernel.org
Subject: [PATCH v5 next 3/9] lib: mul_u64_u64_div_u64() simplify check for a 64bit product
Date: Wed, 5 Nov 2025 20:10:29 +0000
Message-Id: <20251105201035.64043-4-david.laight.linux@gmail.com>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20251105201035.64043-1-david.laight.linux@gmail.com>
References: <20251105201035.64043-1-david.laight.linux@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

If the product is only 64bits div64_u64() can be used for the divide.
Replace the pre-multiply check (ilog2(a) + ilog2(b) <= 62) with a
simple post-multiply check that the high 64bits are zero.

This has the advantage of being simpler, more accurate and less code.
It will always be faster when the product is larger than 64bits.

Most 64bit cpu have a native 64x64=128 bit multiply, this is needed
(for the low 64bits) even when div64_u64() is called - so the early
check gains nothing and is just extra code.

32bit cpu will need a compare (etc) to generate the 64bit ilog2()
from two 32bit bit scans - so that is non-trivial.
(Never mind the mess of x86's 'bsr' and any oddball cpu without
fast bit-scan instructions.)
Whereas the additional instructions for the 128bit multiply result
are pretty much one multiply and two adds (typically the 'adc $0,%reg'
can be run in parallel with the instruction that follows).

The only outliers are 64bit systems without 128bit multiply and
simple in-order 32bit ones with fast bit scan but needing extra
instructions to get the high bits of the multiply result.
I doubt it makes much difference to either, the latter is definitely
not mainstream.

If anyone is worried about the analysis they can look at the generated
code for x86 (especially when cmov isn't used).

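To make the 'more accurate' point concrete, here is a small userspace sketch
comparing the two tests. It assumes a 64-bit GCC or Clang target (for
unsigned __int128 and __builtin_clzll); the helper names are invented for the
example and ilog2_u64() only stands in for the kernel's ilog2() on non-zero
input.

```c
#include <stdint.h>

/* floor(log2(x)) for x != 0; a stand-in for the kernel's ilog2(). */
static int ilog2_u64(uint64_t x)
{
	return 63 - __builtin_clzll(x);
}

/* Old style: conservative estimate made before multiplying. */
static int fits_pre_check(uint64_t a, uint64_t b)
{
	return ilog2_u64(a) + ilog2_u64(b) <= 62;
}

/* New style: multiply first, then test the high 64 bits. */
static int fits_post_check(uint64_t a, uint64_t b)
{
	unsigned __int128 prod = (unsigned __int128)a * b;

	return (uint64_t)(prod >> 64) == 0;
}
```

For a = 3 and b = 0x5555555555555555 the product is 0xffffffffffffffff, which
fits in 64 bits, yet ilog2(a) + ilog2(b) is 1 + 62 = 63, so the pre-multiply
test rejects it while the post-multiply test accepts it.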
Signed-off-by: David Laight
Reviewed-by: Nicolas Pitre
---
Split from patch 3 for v2.

Changes for v5:
- Test for small dividends before overflow.

 lib/math/div64.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/math/div64.c b/lib/math/div64.c
index 1092f41e878e..4a4b1aa9e6e1 100644
--- a/lib/math/div64.c
+++ b/lib/math/div64.c
@@ -186,9 +186,6 @@ EXPORT_SYMBOL(iter_div_u64_rem);
 #ifndef mul_u64_u64_div_u64
 u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d)
 {
-	if (ilog2(a) + ilog2(b) <= 62)
-		return div64_u64(a * b, d);
-
 #if defined(__SIZEOF_INT128__)
 
 	/* native 64x64=128 bits multiplication */
@@ -212,6 +209,9 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d)
 
 #endif
 
+	if (!n_hi)
+		return div64_u64(n_lo, d);
+
 	if (unlikely(n_hi >= d)) {
 		/* trigger runtime exception if divisor is zero */
 		if (d == 0) {
-- 
2.39.5

From nobody Sun Dec 14 06:15:19 2025
From: David Laight
To: Andrew Morton, linux-kernel@vger.kernel.org
Cc: David Laight, u.kleine-koenig@baylibre.com, Nicolas Pitre, Oleg Nesterov, Peter Zijlstra, Biju Das, Borislav Petkov, Dave Hansen, "H. Peter Anvin", Ingo Molnar, Thomas Gleixner, Li RongQing, Khazhismel Kumykov, Jens Axboe, x86@kernel.org
Subject: [PATCH v5 next 4/9] lib: Add mul_u64_add_u64_div_u64() and mul_u64_u64_div_u64_roundup()
Date: Wed, 5 Nov 2025 20:10:30 +0000
Message-Id: <20251105201035.64043-5-david.laight.linux@gmail.com>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20251105201035.64043-1-david.laight.linux@gmail.com>
References: <20251105201035.64043-1-david.laight.linux@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The existing mul_u64_u64_div_u64() rounds down; a 'rounding up'
variant needs 'divisor - 1' added between the multiply and the
divide, so it cannot easily be done by a caller.

Add mul_u64_add_u64_div_u64(a, b, c, d) that calculates (a * b + c)/d
and implement the 'round down' and 'round up' using it.

Update the x86-64 asm to optimise for 'c' being a constant zero.

Add kernel-doc definitions for all three functions.

Signed-off-by: David Laight
Reviewed-by: Nicolas Pitre
---
Changes for v2 (formerly patch 1/3):
- Reinstate the early call to div64_u64() on 32bit when 'c' is zero.
  Although I'm not convinced the path is common enough to be worth
  the two ilog2() calls.

Changes for v3 (formerly patch 3/4):
- The early call to div64_u64() has been removed by patch 3.
  Pretty much guaranteed to be a pessimisation.

Changes for v4:
- For x86-64 split the multiply, add and divide into three asm blocks.
  (gcc makes a pig's breakfast of (u128)a * b + c)
- Change the kernel-doc since divide by zero will (probably) fault.

Changes for v5:
- Fix test that excludes the add/adc asm block for constant zero 'add'.

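For readers who want to check the arithmetic, a rough userspace reference
model of the three operations follows. It is a sketch under assumptions, not
the kernel code: it relies on unsigned __int128 (64-bit GCC or Clang), the
model_* names are invented here, the caller must pass a non-zero divisor, and
the ~0 return on overflow mirrors what the generic implementation is
documented to do.

```c
#include <stdint.h>

/*
 * Reference model of (a * b + c) / d with a 128-bit intermediate.
 * a * b + c cannot wrap: (2^64 - 1)^2 + (2^64 - 1) = 2^128 - 2^64.
 * d must be non-zero.  Returns ~0 when the quotient needs more than
 * 64 bits.
 */
static uint64_t model_mul_add_div(uint64_t a, uint64_t b, uint64_t c,
				  uint64_t d)
{
	unsigned __int128 n = (unsigned __int128)a * b + c;
	unsigned __int128 q = n / d;

	return q > UINT64_MAX ? UINT64_MAX : (uint64_t)q;
}

/* Round down: the addend is zero. */
static uint64_t model_mul_div(uint64_t a, uint64_t b, uint64_t d)
{
	return model_mul_add_div(a, b, 0, d);
}

/* Round up: add d - 1 to the 128-bit product before dividing. */
static uint64_t model_mul_div_roundup(uint64_t a, uint64_t b, uint64_t d)
{
	return model_mul_add_div(a, b, d - 1, d);
}
```

The round-up case shows why a caller cannot do this alone: the 'd - 1' has to
be added to the full 128-bit product, which is never visible outside the
function.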
arch/x86/include/asm/div64.h | 20 +++++++++------ include/linux/math64.h | 48 +++++++++++++++++++++++++++++++++++- lib/math/div64.c | 14 ++++++----- 3 files changed, 67 insertions(+), 15 deletions(-) diff --git a/arch/x86/include/asm/div64.h b/arch/x86/include/asm/div64.h index 9931e4c7d73f..6d8a3de3f43a 100644 --- a/arch/x86/include/asm/div64.h +++ b/arch/x86/include/asm/div64.h @@ -84,21 +84,25 @@ static inline u64 mul_u32_u32(u32 a, u32 b) * Will generate an #DE when the result doesn't fit u64, could fix with an * __ex_table[] entry when it becomes an issue. */ -static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div) +static inline u64 mul_u64_add_u64_div_u64(u64 rax, u64 mul, u64 add, u64 d= iv) { - u64 q; + u64 rdx; =20 - asm ("mulq %2; divq %3" : "=3Da" (q) - : "a" (a), "rm" (mul), "rm" (div) - : "rdx"); + asm ("mulq %[mul]" : "+a" (rax), "=3Dd" (rdx) : [mul] "rm" (mul)); =20 - return q; + if (!statically_true(!add)) + asm ("addq %[add], %[lo]; adcq $0, %[hi]" : + [lo] "+r" (rax), [hi] "+r" (rdx) : [add] "irm" (add)); + + asm ("divq %[div]" : "+a" (rax), "+d" (rdx) : [div] "rm" (div)); + + return rax; } -#define mul_u64_u64_div_u64 mul_u64_u64_div_u64 +#define mul_u64_add_u64_div_u64 mul_u64_add_u64_div_u64 =20 static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 div) { - return mul_u64_u64_div_u64(a, mul, div); + return mul_u64_add_u64_div_u64(a, mul, 0, div); } #define mul_u64_u32_div mul_u64_u32_div =20 diff --git a/include/linux/math64.h b/include/linux/math64.h index 6aaccc1626ab..e889d850b7f1 100644 --- a/include/linux/math64.h +++ b/include/linux/math64.h @@ -282,7 +282,53 @@ static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 = divisor) } #endif /* mul_u64_u32_div */ =20 -u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div); +/** + * mul_u64_add_u64_div_u64 - unsigned 64bit multiply, add, and divide + * @a: first unsigned 64bit multiplicand + * @b: second unsigned 64bit multiplicand + * @c: unsigned 64bit addend + * @d: unsigned 64bit divisor + 
* + * Multiply two 64bit values together to generate a 128bit product + * add a third value and then divide by a fourth. + * The Generic code divides by 0 if @d is zero and returns ~0 on overflow. + * Architecture specific code may trap on zero or overflow. + * + * Return: (@a * @b + @c) / @d + */ +u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d); + +/** + * mul_u64_u64_div_u64 - unsigned 64bit multiply and divide + * @a: first unsigned 64bit multiplicand + * @b: second unsigned 64bit multiplicand + * @d: unsigned 64bit divisor + * + * Multiply two 64bit values together to generate a 128bit product + * and then divide by a third value. + * The Generic code divides by 0 if @d is zero and returns ~0 on overflow. + * Architecture specific code may trap on zero or overflow. + * + * Return: @a * @b / @d + */ +#define mul_u64_u64_div_u64(a, b, d) mul_u64_add_u64_div_u64(a, b, 0, d) + +/** + * mul_u64_u64_div_u64_roundup - unsigned 64bit multiply and divide rounde= d up + * @a: first unsigned 64bit multiplicand + * @b: second unsigned 64bit multiplicand + * @d: unsigned 64bit divisor + * + * Multiply two 64bit values together to generate a 128bit product + * and then divide and round up. + * The Generic code divides by 0 if @d is zero and returns ~0 on overflow. + * Architecture specific code may trap on zero or overflow. 
+ * + * Return: (@a * @b + @d - 1) / @d + */ +#define mul_u64_u64_div_u64_roundup(a, b, d) \ + ({ u64 _tmp =3D (d); mul_u64_add_u64_div_u64(a, b, _tmp - 1, _tmp); }) + =20 /** * DIV64_U64_ROUND_UP - unsigned 64bit divide with 64bit divisor rounded up diff --git a/lib/math/div64.c b/lib/math/div64.c index 4a4b1aa9e6e1..a88391b8ada0 100644 --- a/lib/math/div64.c +++ b/lib/math/div64.c @@ -183,13 +183,13 @@ u32 iter_div_u64_rem(u64 dividend, u32 divisor, u64 *= remainder) } EXPORT_SYMBOL(iter_div_u64_rem); =20 -#ifndef mul_u64_u64_div_u64 -u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d) +#ifndef mul_u64_add_u64_div_u64 +u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d) { #if defined(__SIZEOF_INT128__) =20 /* native 64x64=3D128 bits multiplication */ - u128 prod =3D (u128)a * b; + u128 prod =3D (u128)a * b + c; u64 n_lo =3D prod, n_hi =3D prod >> 64; =20 #else @@ -198,8 +198,10 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d) u32 a_lo =3D a, a_hi =3D a >> 32, b_lo =3D b, b_hi =3D b >> 32; u64 x, y, z; =20 - x =3D (u64)a_lo * b_lo; - y =3D (u64)a_lo * b_hi + (u32)(x >> 32); + /* Since (x-1)(x-1) + 2(x-1) =3D=3D x.x - 1 two u32 can be added to a u64= */ + x =3D (u64)a_lo * b_lo + (u32)c; + y =3D (u64)a_lo * b_hi + (u32)(c >> 32); + y +=3D (u32)(x >> 32); z =3D (u64)a_hi * b_hi + (u32)(y >> 32); y =3D (u64)a_hi * b_lo + (u32)y; z +=3D (u32)(y >> 32); @@ -265,5 +267,5 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d) =20 return res; } -EXPORT_SYMBOL(mul_u64_u64_div_u64); +EXPORT_SYMBOL(mul_u64_add_u64_div_u64); #endif --=20 2.39.5 From nobody Sun Dec 14 06:15:19 2025 Received: from mail-wr1-f51.google.com (mail-wr1-f51.google.com [209.85.221.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AED42345CC6 for ; Wed, 5 Nov 2025 20:10:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.51 ARC-Seal: i=1; a=rsa-sha256; 
From: David Laight
To: Andrew Morton, linux-kernel@vger.kernel.org
Cc: David Laight, u.kleine-koenig@baylibre.com, Nicolas Pitre, Oleg Nesterov, Peter Zijlstra, Biju Das, Borislav Petkov, Dave Hansen, "H. Peter Anvin", Ingo Molnar, Thomas Gleixner, Li RongQing, Khazhismel Kumykov, Jens Axboe, x86@kernel.org
Subject: [PATCH v5 next 5/9] lib: Add tests for mul_u64_u64_div_u64_roundup()
Date: Wed, 5 Nov 2025 20:10:31 +0000
Message-Id: <20251105201035.64043-6-david.laight.linux@gmail.com>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20251105201035.64043-1-david.laight.linux@gmail.com>
References: <20251105201035.64043-1-david.laight.linux@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Replicate the existing mul_u64_u64_div_u64() test cases with round up.
Update the shell script that verifies the table, and remove the comment
markers so that it can be directly pasted into a shell.

Rename the divisor from 'c' to 'd' to match mul_u64_add_u64_div_u64().

If any tests fail then fail the module load with -EINVAL.
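
For readers who want to see what the round-up test cases check, here is a
minimal userspace sketch. It is not the kernel code: it assumes a compiler
that provides unsigned __int128 (gcc or clang on a 64bit target), and every
ref_* name is made up for this illustration. The last helper mirrors the
check described above: the expected round-up result is the truncated result
plus the row's round_up flag.

```c
#include <stdint.h>

/* Assumption: the compiler supports the __int128 extension. */
typedef unsigned __int128 ref_u128;

/*
 * Hypothetical reference: (a * b + c) / d with a 128bit intermediate.
 * a * b + c cannot overflow 128 bits for any 64bit inputs.
 * The caller must ensure d != 0 and that the quotient fits in 64 bits.
 */
static uint64_t ref_mul_add_div(uint64_t a, uint64_t b, uint64_t c, uint64_t d)
{
	return (uint64_t)(((ref_u128)a * b + c) / d);
}

/* Truncating divide: a * b / d */
static uint64_t ref_mul_div(uint64_t a, uint64_t b, uint64_t d)
{
	return ref_mul_add_div(a, b, 0, d);
}

/* Round-up divide: (a * b + d - 1) / d */
static uint64_t ref_mul_div_roundup(uint64_t a, uint64_t b, uint64_t d)
{
	return ref_mul_add_div(a, b, d - 1, d);
}

/*
 * Check one table row: 'result' is the truncated quotient and 'round_up'
 * is 1 when a * b is not an exact multiple of d, otherwise 0.
 * Returns 1 if both the truncated and rounded-up values match.
 */
static int ref_check_row(uint64_t a, uint64_t b, uint64_t d,
			 uint64_t result, unsigned int round_up)
{
	return ref_mul_div(a, b, d) == result &&
	       ref_mul_div_roundup(a, b, d) == result + round_up;
}
```

For example, ref_check_row(0xb, 0x7, 0x3, 0x19, 1) corresponds to the first
row of the table: 77 / 3 truncates to 25 and rounds up to 26.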
Signed-off-by: David Laight
Reviewed-by: Nicolas Pitre
---

Changes for v3:
- Rename 'c' to 'd' to match mul_u64_add_u64_div_u64()

Changes for v4:
- Fix shell script that verifies the table

 lib/math/test_mul_u64_u64_div_u64.c | 122 +++++++++++++++++-----------
 1 file changed, 73 insertions(+), 49 deletions(-)

diff --git a/lib/math/test_mul_u64_u64_div_u64.c b/lib/math/test_mul_u64_u64_div_u64.c
index 58d058de4e73..4d5e4e5dac67 100644
--- a/lib/math/test_mul_u64_u64_div_u64.c
+++ b/lib/math/test_mul_u64_u64_div_u64.c
@@ -10,61 +10,73 @@
 #include
 #include
 
-typedef struct { u64 a; u64 b; u64 c; u64 result; } test_params;
+typedef struct { u64 a; u64 b; u64 d; u64 result; uint round_up;} test_params;
 
 static test_params test_values[] = {
 /* this contains many edge values followed by a couple random values */
-{                0xb,                0x7,                0x3,               0x19 },
-{         0xffff0000,         0xffff0000,                0xf, 0x1110eeef00000000 },
-{         0xffffffff,         0xffffffff,                0x1, 0xfffffffe00000001 },
-{         0xffffffff,         0xffffffff,                0x2, 0x7fffffff00000000 },
-{        0x1ffffffff,         0xffffffff,                0x2, 0xfffffffe80000000 },
-{        0x1ffffffff,         0xffffffff,                0x3, 0xaaaaaaa9aaaaaaab },
-{        0x1ffffffff,        0x1ffffffff,                0x4, 0xffffffff00000000 },
-{ 0xffff000000000000, 0xffff000000000000, 0xffff000000000001, 0xfffeffffffffffff },
-{ 0x3333333333333333, 0x3333333333333333, 0x5555555555555555, 0x1eb851eb851eb851 },
-{ 0x7fffffffffffffff,                0x2,                0x3, 0x5555555555555554 },
-{ 0xffffffffffffffff,                0x2, 0x8000000000000000,                0x3 },
-{ 0xffffffffffffffff,                0x2, 0xc000000000000000,                0x2 },
-{ 0xffffffffffffffff, 0x4000000000000004, 0x8000000000000000, 0x8000000000000007 },
-{ 0xffffffffffffffff, 0x4000000000000001, 0x8000000000000000, 0x8000000000000001 },
-{ 0xffffffffffffffff, 0x8000000000000001, 0xffffffffffffffff, 0x8000000000000001 },
-{ 0xfffffffffffffffe, 0x8000000000000001, 0xffffffffffffffff, 0x8000000000000000 },
-{ 0xffffffffffffffff, 0x8000000000000001, 0xfffffffffffffffe, 0x8000000000000001 },
-{ 0xffffffffffffffff, 0x8000000000000001, 0xfffffffffffffffd, 0x8000000000000002 },
-{ 0x7fffffffffffffff, 0xffffffffffffffff, 0xc000000000000000, 0xaaaaaaaaaaaaaaa8 },
-{ 0xffffffffffffffff, 0x7fffffffffffffff, 0xa000000000000000, 0xccccccccccccccca },
-{ 0xffffffffffffffff, 0x7fffffffffffffff, 0x9000000000000000, 0xe38e38e38e38e38b },
-{ 0x7fffffffffffffff, 0x7fffffffffffffff, 0x5000000000000000, 0xccccccccccccccc9 },
-{ 0xffffffffffffffff, 0xfffffffffffffffe, 0xffffffffffffffff, 0xfffffffffffffffe },
-{ 0xe6102d256d7ea3ae, 0x70a77d0be4c31201, 0xd63ec35ab3220357, 0x78f8bf8cc86c6e18 },
-{ 0xf53bae05cb86c6e1, 0x3847b32d2f8d32e0, 0xcfd4f55a647f403c, 0x42687f79d8998d35 },
-{ 0x9951c5498f941092, 0x1f8c8bfdf287a251, 0xa3c8dc5f81ea3fe2, 0x1d887cb25900091f },
-{ 0x374fee9daa1bb2bb, 0x0d0bfbff7b8ae3ef, 0xc169337bd42d5179, 0x03bb2dbaffcbb961 },
-{ 0xeac0d03ac10eeaf0, 0x89be05dfa162ed9b, 0x92bb1679a41f0e4b, 0xdc5f5cc9e270d216 },
+{                0xb,                0x7,                0x3,               0x19, 1 },
+{         0xffff0000,         0xffff0000,                0xf, 0x1110eeef00000000, 0 },
+{         0xffffffff,         0xffffffff,                0x1, 0xfffffffe00000001, 0 },
+{         0xffffffff,         0xffffffff,                0x2, 0x7fffffff00000000, 1 },
+{        0x1ffffffff,         0xffffffff,                0x2, 0xfffffffe80000000, 1 },
+{        0x1ffffffff,         0xffffffff,                0x3, 0xaaaaaaa9aaaaaaab, 0 },
+{        0x1ffffffff,        0x1ffffffff,                0x4, 0xffffffff00000000, 1 },
+{ 0xffff000000000000, 0xffff000000000000, 0xffff000000000001, 0xfffeffffffffffff, 1 },
+{ 0x3333333333333333, 0x3333333333333333, 0x5555555555555555, 0x1eb851eb851eb851, 1 },
+{ 0x7fffffffffffffff,                0x2,                0x3, 0x5555555555555554, 1 },
+{ 0xffffffffffffffff,                0x2, 0x8000000000000000,                0x3, 1 },
+{ 0xffffffffffffffff,                0x2, 0xc000000000000000,                0x2, 1 },
+{ 0xffffffffffffffff, 0x4000000000000004, 0x8000000000000000, 0x8000000000000007, 1 },
+{ 0xffffffffffffffff, 0x4000000000000001, 0x8000000000000000, 0x8000000000000001, 1 },
+{ 0xffffffffffffffff, 0x8000000000000001, 0xffffffffffffffff, 0x8000000000000001, 0 },
+{ 0xfffffffffffffffe, 0x8000000000000001, 0xffffffffffffffff, 0x8000000000000000, 1 },
+{ 0xffffffffffffffff, 0x8000000000000001, 0xfffffffffffffffe, 0x8000000000000001, 1 },
+{ 0xffffffffffffffff, 0x8000000000000001, 0xfffffffffffffffd, 0x8000000000000002, 1 },
+{ 0x7fffffffffffffff, 0xffffffffffffffff, 0xc000000000000000, 0xaaaaaaaaaaaaaaa8, 1 },
+{ 0xffffffffffffffff, 0x7fffffffffffffff, 0xa000000000000000, 0xccccccccccccccca, 1 },
+{ 0xffffffffffffffff, 0x7fffffffffffffff, 0x9000000000000000, 0xe38e38e38e38e38b, 1 },
+{ 0x7fffffffffffffff, 0x7fffffffffffffff, 0x5000000000000000, 0xccccccccccccccc9, 1 },
+{ 0xffffffffffffffff, 0xfffffffffffffffe, 0xffffffffffffffff, 0xfffffffffffffffe, 0 },
+{ 0xe6102d256d7ea3ae, 0x70a77d0be4c31201, 0xd63ec35ab3220357, 0x78f8bf8cc86c6e18, 1 },
+{ 0xf53bae05cb86c6e1, 0x3847b32d2f8d32e0, 0xcfd4f55a647f403c, 0x42687f79d8998d35, 1 },
+{ 0x9951c5498f941092, 0x1f8c8bfdf287a251, 0xa3c8dc5f81ea3fe2, 0x1d887cb25900091f, 1 },
+{ 0x374fee9daa1bb2bb, 0x0d0bfbff7b8ae3ef, 0xc169337bd42d5179, 0x03bb2dbaffcbb961, 1 },
+{ 0xeac0d03ac10eeaf0, 0x89be05dfa162ed9b, 0x92bb1679a41f0e4b, 0xdc5f5cc9e270d216, 1 },
 };
 
 /*
  * The above table can be verified with the following shell script:
- *
- * #!/bin/sh
- * sed -ne 's/^{ \+\(.*\), \+\(.*\), \+\(.*\), \+\(.*\) },$/\1 \2 \3 \4/p' \
- *     lib/math/test_mul_u64_u64_div_u64.c |
- * while read a b c r; do
- *   expected=$( printf "obase=16; ibase=16; %X * %X / %X\n" $a $b $c | bc )
- *   given=$( printf "%X\n" $r )
- *   if [ "$expected" = "$given" ]; then
- *     echo "$a * $b / $c = $r OK"
- *   else
- *     echo "$a * $b / $c = $r is wrong" >&2
- *     echo "should be equivalent to 0x$expected" >&2
- *     exit 1
- *   fi
- * done
+
+#!/bin/sh
+sed -ne 's/^{ \+\(.*\), \+\(.*\), \+\(.*\), \+\(.*\), \+\(.*\) },$/\1 \2 \3 \4 \5/p' \
+    lib/math/test_mul_u64_u64_div_u64.c |
+while read a b d r e; do
+  expected=$( printf "obase=16; ibase=16; %X * %X / %X\n" $a $b $d | bc )
+  given=$( printf "%X\n" $r )
+  if [ "$expected" = "$given" ]; then
+    echo "$a * $b / $d = $r OK"
+  else
+    echo "$a * $b / $d = $r is wrong" >&2
+    echo "should be equivalent to 0x$expected" >&2
+    exit 1
+  fi
+  expected=$( printf "obase=16; ibase=16; (%X * %X + %X) / %X\n" $a $b $((d-1)) $d | bc )
+  given=$( printf "%X\n" $((r + e)) )
+  if [ "$expected" = "$given" ]; then
+    echo "$a * $b +/ $d = $(printf '%#x' $((r + e))) OK"
+  else
+    echo "$a * $b +/ $d = $(printf '%#x' $((r + e))) is wrong" >&2
+    echo "should be equivalent to 0x$expected" >&2
+    exit 1
+  fi
+done
+
  */
 
 static int __init test_init(void)
 {
+	int errors = 0;
+	int tests = 0;
 	int i;
 
 	pr_info("Starting mul_u64_u64_div_u64() test\n");
@@ -72,19 +84,31 @@ static int __init test_init(void)
 	for (i = 0; i < ARRAY_SIZE(test_values); i++) {
 		u64 a = test_values[i].a;
 		u64 b = test_values[i].b;
-		u64 c = test_values[i].c;
+		u64 d = test_values[i].d;
 		u64 expected_result = test_values[i].result;
-		u64 result = mul_u64_u64_div_u64(a, b, c);
+		u64 result = mul_u64_u64_div_u64(a, b, d);
+		u64 result_up = mul_u64_u64_div_u64_roundup(a, b, d);
+
+		tests += 2;
 
 		if (result != expected_result) {
-			pr_err("ERROR: 0x%016llx * 0x%016llx / 0x%016llx\n", a, b, c);
+			pr_err("ERROR: 0x%016llx * 0x%016llx / 0x%016llx\n", a, b, d);
 			pr_err("ERROR: expected result: %016llx\n", expected_result);
 			pr_err("ERROR: obtained result: %016llx\n", result);
+			errors++;
+		}
+		expected_result += test_values[i].round_up;
+		if (result_up != expected_result) {
+			pr_err("ERROR: 0x%016llx * 0x%016llx +/ 0x%016llx\n", a, b, d);
+			pr_err("ERROR: expected result: %016llx\n", expected_result);
+			pr_err("ERROR: obtained result: %016llx\n", result_up);
+			errors++;
 		}
 	}
 
-	pr_info("Completed mul_u64_u64_div_u64() test\n");
-	return 0;
+	pr_info("Completed mul_u64_u64_div_u64() test, %d tests, %d errors\n",
+		tests, errors);
+	return errors ? -EINVAL : 0;
 }
 
 static void __exit test_exit(void)
-- 
2.39.5

From nobody Sun Dec 14 06:15:19 2025
From: David Laight
To: Andrew Morton, linux-kernel@vger.kernel.org
Cc: David Laight, u.kleine-koenig@baylibre.com, Nicolas Pitre, Oleg Nesterov, Peter Zijlstra, Biju Das, Borislav Petkov, Dave Hansen, "H. Peter Anvin", Ingo Molnar, Thomas Gleixner, Li RongQing, Khazhismel Kumykov, Jens Axboe, x86@kernel.org
Subject: [PATCH v5 next 6/9] lib: test_mul_u64_u64_div_u64: Test both generic and arch versions
Date: Wed, 5 Nov 2025 20:10:32 +0000
Message-Id: <20251105201035.64043-7-david.laight.linux@gmail.com>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20251105201035.64043-1-david.laight.linux@gmail.com>
References: <20251105201035.64043-1-david.laight.linux@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Change the #if in div64.c so that test_mul_u64_u64_div_u64.c
can compile and test the generic version (including the 'long multiply')
on architectures (eg amd64) that define their own copy.

Test the kernel version and the locally compiled version on all arch.
Output the time taken (in ns) on the 'test completed' trace.

For reference, on my zen 5, the optimised version takes ~220ns and the
generic version ~3350ns.
Using the native multiply saves ~200ns and adding back the ilog2() 'optimisation'
test adds ~50ms.

Signed-off-by: David Laight
Reviewed-by: Nicolas Pitre
---

Changes for v4:
- Fix build on non x86 (eg arm32)

Changes for v5:
- Fix build on 32bit x86

 lib/math/div64.c                    |  8 +++--
 lib/math/test_mul_u64_u64_div_u64.c | 52 +++++++++++++++++++++++++----
 2 files changed, 51 insertions(+), 9 deletions(-)

diff --git a/lib/math/div64.c b/lib/math/div64.c
index a88391b8ada0..18a9ba26c418 100644
--- a/lib/math/div64.c
+++ b/lib/math/div64.c
@@ -177,16 +177,18 @@ EXPORT_SYMBOL(div64_s64);
  * Iterative div/mod for use when dividend is not expected to be much
  * bigger than divisor.
  */
+#ifndef iter_div_u64_rem
 u32 iter_div_u64_rem(u64 dividend, u32 divisor, u64 *remainder)
 {
 	return __iter_div_u64_rem(dividend, divisor, remainder);
 }
 EXPORT_SYMBOL(iter_div_u64_rem);
+#endif
 
-#ifndef mul_u64_add_u64_div_u64
+#if !defined(mul_u64_add_u64_div_u64) || defined(test_mul_u64_add_u64_div_u64)
 u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d)
 {
-#if defined(__SIZEOF_INT128__)
+#if defined(__SIZEOF_INT128__) && !defined(test_mul_u64_add_u64_div_u64)
 
 	/* native 64x64=128 bits multiplication */
 	u128 prod = (u128)a * b + c;
@@ -267,5 +269,7 @@ u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d)
 
 	return res;
 }
+#if !defined(test_mul_u64_add_u64_div_u64)
 EXPORT_SYMBOL(mul_u64_add_u64_div_u64);
 #endif
+#endif
diff --git a/lib/math/test_mul_u64_u64_div_u64.c b/lib/math/test_mul_u64_u64_div_u64.c
index 4d5e4e5dac67..d8d2c18c4614 100644
--- a/lib/math/test_mul_u64_u64_div_u64.c
+++ b/lib/math/test_mul_u64_u64_div_u64.c
@@ -73,21 +73,34 @@ done
 
  */
 
-static int __init test_init(void)
+static u64 test_mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d);
+
+static int __init test_run(unsigned int fn_no, const char *fn_name)
 {
+	u64 start_time;
 	int errors = 0;
 	int tests = 0;
 	int i;
 
-	pr_info("Starting mul_u64_u64_div_u64() test\n");
+	start_time = ktime_get_ns();
 
 	for (i = 0; i < ARRAY_SIZE(test_values); i++) {
 		u64 a = test_values[i].a;
 		u64 b = test_values[i].b;
 		u64 d = test_values[i].d;
 		u64 expected_result = test_values[i].result;
-		u64 result = mul_u64_u64_div_u64(a, b, d);
-		u64 result_up = mul_u64_u64_div_u64_roundup(a, b, d);
+		u64 result, result_up;
+
+		switch (fn_no) {
+		default:
+			result = mul_u64_u64_div_u64(a, b, d);
+			result_up = mul_u64_u64_div_u64_roundup(a, b, d);
+			break;
+		case 1:
+			result = test_mul_u64_add_u64_div_u64(a, b, 0, d);
+			result_up = test_mul_u64_add_u64_div_u64(a, b, d - 1, d);
+			break;
+		}
 
 		tests += 2;
 
@@ -106,15 +119,40 @@ static int __init test_init(void)
 		}
 	}
 
-	pr_info("Completed mul_u64_u64_div_u64() test, %d tests, %d errors\n",
-		tests, errors);
-	return errors ? -EINVAL : 0;
+	pr_info("Completed %s() test, %d tests, %d errors, %llu ns\n",
+		fn_name, tests, errors, ktime_get_ns() - start_time);
+	return errors;
+}
+
+static int __init test_init(void)
+{
+	pr_info("Starting mul_u64_u64_div_u64() test\n");
+	if (test_run(0, "mul_u64_u64_div_u64"))
+		return -EINVAL;
+	if (test_run(1, "test_mul_u64_u64_div_u64"))
+		return -EINVAL;
+	return 0;
 }
 
 static void __exit test_exit(void)
 {
 }
 
+/* Compile the generic mul_u64_add_u64_div_u64() code */
+#undef __div64_32
+#define __div64_32 __div64_32
+#define div_s64_rem div_s64_rem
+#define div64_u64_rem div64_u64_rem
+#define div64_u64 div64_u64
+#define div64_s64 div64_s64
+#define iter_div_u64_rem iter_div_u64_rem
+
+#undef mul_u64_add_u64_div_u64
+#define mul_u64_add_u64_div_u64 test_mul_u64_add_u64_div_u64
+#define test_mul_u64_add_u64_div_u64 test_mul_u64_add_u64_div_u64
+
+#include "div64.c"
+
 module_init(test_init);
 module_exit(test_exit);
 
-- 
2.39.5

From nobody Sun Dec 14 06:15:19 2025
From: David Laight
To: Andrew Morton, linux-kernel@vger.kernel.org
Cc: David Laight, u.kleine-koenig@baylibre.com, Nicolas Pitre, Oleg Nesterov, Peter Zijlstra, Biju Das, Borislav Petkov, Dave Hansen, "H. Peter Anvin", Ingo Molnar, Thomas Gleixner, Li RongQing, Khazhismel Kumykov, Jens Axboe, x86@kernel.org
Subject: [PATCH v5 next 7/9] lib: mul_u64_u64_div_u64() optimise multiply on 32bit x86
Date: Wed, 5 Nov 2025 20:10:33 +0000
Message-Id: <20251105201035.64043-8-david.laight.linux@gmail.com>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20251105201035.64043-1-david.laight.linux@gmail.com>
References: <20251105201035.64043-1-david.laight.linux@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

gcc generates horrid code for both ((u64)u32_a * u32_b) and (u64_a + u32_b).
As well as the extra instructions it can generate a lot of spills to stack
(including spills of constant zeros and even multiplies by constant zero).

mul_u32_u32() already exists to optimise the multiply.
Add a similar add_u64_u32() for the addition.
Disable both for clang - it generates better code without them.

Move the 64x64 => 128 multiply into a static inline helper function
for code clarity.
No need for the a/b_hi/lo variables, the implicit casts on the function
calls do the work for us.
Should have minimal effect on the generated code.

Use mul_u32_u32() and add_u64_u32() in the 64x64 => 128 multiply
in mul_u64_add_u64_div_u64().
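
As a rough illustration of the 64x64 => 128 multiply in 32bit chunks
described above, here is a portable userspace sketch. It is not the kernel
implementation: the sketch_* names are invented for this example and plain C
arithmetic stands in for mul_u32_u32() and add_u64_u32(). The property it
relies on is that with x = 2^32, (x-1)(x-1) + 2(x-1) == x.x - 1, so the
product of two u32 values plus two more u32 values still fits in a u64.

```c
#include <stdint.h>

/* Hypothetical helper: 32x32 => 64 multiply plus one 32bit addend. */
static uint64_t sketch_mul_add(uint32_t a, uint32_t b, uint32_t c)
{
	return (uint64_t)a * b + c;
}

/*
 * Hypothetical helper: compute a * b + c as a 128bit value using only
 * 32x32 => 64 multiplies. Returns the high 64 bits and stores the low
 * 64 bits in *p_lo.
 */
static uint64_t sketch_mul_u64_u64_add_u64(uint64_t *p_lo, uint64_t a,
					   uint64_t b, uint64_t c)
{
	uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
	uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);
	uint64_t x, y, z;

	/* Bits 0..63: a_lo * b_lo plus the low half of c */
	x = sketch_mul_add(a_lo, b_lo, (uint32_t)c);
	/* Bits 32..95: a_lo * b_hi plus the high half of c, plus carry from x */
	y = sketch_mul_add(a_lo, b_hi, (uint32_t)(c >> 32));
	y += (uint32_t)(x >> 32);
	/* Bits 64..127: a_hi * b_hi plus the high half of y */
	z = sketch_mul_add(a_hi, b_hi, (uint32_t)(y >> 32));
	/* Bits 32..95 again: a_hi * b_lo plus the low half of y */
	y = sketch_mul_add(a_hi, b_lo, (uint32_t)y);

	*p_lo = (y << 32) + (uint32_t)x;
	return z + (uint32_t)(y >> 32);
}
```

None of the intermediate additions can overflow 64 bits, which is why no
explicit carry handling is needed between the steps.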
Signed-off-by: David Laight
Reviewed-by: Nicolas Pitre
---

Changes for v4:
- merge in patch 8.
- Add comments about gcc being 'broken' for mixed 32/64 bit maths.
  clang doesn't have the same issues.
- Use a #define for define mul_add() to avoid 'defined but not used'
  errors.

 arch/x86/include/asm/div64.h | 19 +++++++++++++++++
 include/linux/math64.h       | 11 ++++++++++
 lib/math/div64.c             | 40 +++++++++++++++++++++++-------------
 3 files changed, 56 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/div64.h b/arch/x86/include/asm/div64.h
index 6d8a3de3f43a..30fd06ede751 100644
--- a/arch/x86/include/asm/div64.h
+++ b/arch/x86/include/asm/div64.h
@@ -60,6 +60,12 @@ static inline u64 div_u64_rem(u64 dividend, u32 divisor, u32 *remainder)
 }
 #define div_u64_rem div_u64_rem
 
+/*
+ * gcc tends to zero extend 32bit values and do full 64bit maths.
+ * Define asm functions that avoid this.
+ * (clang generates better code for the C versions.)
+ */
+#ifndef __clang__
 static inline u64 mul_u32_u32(u32 a, u32 b)
 {
 	u32 high, low;
@@ -71,6 +77,19 @@ static inline u64 mul_u32_u32(u32 a, u32 b)
 }
 #define mul_u32_u32 mul_u32_u32
 
+static inline u64 add_u64_u32(u64 a, u32 b)
+{
+	u32 high = a >> 32, low = a;
+
+	asm ("addl %[b], %[low]; adcl $0, %[high]"
+		: [low] "+r" (low), [high] "+r" (high)
+		: [b] "rm" (b) );
+
+	return low | (u64)high << 32;
+}
+#define add_u64_u32 add_u64_u32
+#endif
+
 /*
  * __div64_32() is never called on x86, so prevent the
  * generic definition from getting built.
diff --git a/include/linux/math64.h b/include/linux/math64.h
index e889d850b7f1..cc305206d89f 100644
--- a/include/linux/math64.h
+++ b/include/linux/math64.h
@@ -158,6 +158,17 @@ static inline u64 mul_u32_u32(u32 a, u32 b)
 }
 #endif
 
+#ifndef add_u64_u32
+/*
+ * Many a GCC version also messes this up.
+ * Zero extending b and then spilling everything to stack.
+ */
+static inline u64 add_u64_u32(u64 a, u32 b)
+{
+	return a + b;
+}
+#endif
+
 #if defined(CONFIG_ARCH_SUPPORTS_INT128) && defined(__SIZEOF_INT128__)
 
 #ifndef mul_u64_u32_shr
diff --git a/lib/math/div64.c b/lib/math/div64.c
index 18a9ba26c418..bb57a48ce36a 100644
--- a/lib/math/div64.c
+++ b/lib/math/div64.c
@@ -186,33 +186,45 @@ EXPORT_SYMBOL(iter_div_u64_rem);
 #endif
 
 #if !defined(mul_u64_add_u64_div_u64) || defined(test_mul_u64_add_u64_div_u64)
-u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d)
-{
+
+#define mul_add(a, b, c) add_u64_u32(mul_u32_u32(a, b), c)
+
 #if defined(__SIZEOF_INT128__) && !defined(test_mul_u64_add_u64_div_u64)
 
+static inline u64 mul_u64_u64_add_u64(u64 *p_lo, u64 a, u64 b, u64 c)
+{
 	/* native 64x64=128 bits multiplication */
 	u128 prod = (u128)a * b + c;
-	u64 n_lo = prod, n_hi = prod >> 64;
+
+	*p_lo = prod;
+	return prod >> 64;
+}
 
 #else
 
-	/* perform a 64x64=128 bits multiplication manually */
-	u32 a_lo = a, a_hi = a >> 32, b_lo = b, b_hi = b >> 32;
+static inline u64 mul_u64_u64_add_u64(u64 *p_lo, u64 a, u64 b, u64 c)
+{
+	/* perform a 64x64=128 bits multiplication in 32bit chunks */
 	u64 x, y, z;
 
 	/* Since (x-1)(x-1) + 2(x-1) == x.x - 1 two u32 can be added to a u64 */
-	x = (u64)a_lo * b_lo + (u32)c;
-	y = (u64)a_lo * b_hi + (u32)(c >> 32);
-	y += (u32)(x >> 32);
-	z = (u64)a_hi * b_hi + (u32)(y >> 32);
-	y = (u64)a_hi * b_lo + (u32)y;
-	z += (u32)(y >> 32);
-	x = (y << 32) + (u32)x;
-
-	u64 n_lo = x, n_hi = z;
+	x = mul_add(a, b, c);
+	y = mul_add(a, b >> 32, c >> 32);
+	y = add_u64_u32(y, x >> 32);
+	z = mul_add(a >> 32, b >> 32, y >> 32);
+	y = mul_add(a >> 32, b, y);
+	*p_lo = (y << 32) + (u32)x;
+	return add_u64_u32(z, y >> 32);
+}
 
 #endif
 
+u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d)
+{
+	u64 n_lo, n_hi;
+
+	n_hi = mul_u64_u64_add_u64(&n_lo, a, b, c);
+
 	if (!n_hi)
 		return div64_u64(n_lo, d);
 
-- 
2.39.5

From nobody Sun
Dec 14 06:15:19 2025 Received: from mail-wr1-f41.google.com (mail-wr1-f41.google.com [209.85.221.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0C2C9347BBE for ; Wed, 5 Nov 2025 20:10:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762373461; cv=none; b=G5emPpadDmuTKrQPckr7xp6p8Jt93s1zB8J1bK5MVx44FPxuH+O8vaMJ8yM822zhf1i8BYR1wUNGEAynDfVOygDEWPPitbniIZhwXmSbyfIxoHbiGhj7064kVDXhclIpIuAwHPTfWmj1ilk1YUdLZaxTXS3nfN5lHq0D5U/9a0Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762373461; c=relaxed/simple; bh=mDaZKjqXz4ljxzmOjQjNiJfdKq/E8XB6DbTrYhSWAXY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=TAUd+CgBFVXZg4fynloI7glgpsdwVir83nu15NFVlHnPtW1lWfPOaoSLOHYcFp46AV5QSacgFd8FMwFoxLAsIA3YlL28Gmam8UOZxI0/6u+TTe5BhAW9jWMNDO7eNNVbbDRj8jNzR4QbEHVl1wX8Z88HdoyGnGeJyvJoE/Tj9mA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=E4rUUjaU; arc=none smtp.client-ip=209.85.221.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="E4rUUjaU" Received: by mail-wr1-f41.google.com with SMTP id ffacd0b85a97d-42557c5cedcso159225f8f.0 for ; Wed, 05 Nov 2025 12:10:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1762373457; x=1762978257; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=IGQCHSX+VBuD1rwvjJ6Wh0BG1UvLo9/YmQdrfi1ZSTU=; b=E4rUUjaUjef8YEHFaLxYsO7yRs5zOCtLwqhJuN8K0SlmaEUeEFUXddUVoMvglNN9ZA zauk1wasQJRKtJQ8PM2FPJ9NHpWWRmHnSAaAJhWIaZxtFQjDWIsllPy0BdxJanB6rnNb j2Pf3FT3xmT1eiUmg3ezHghInZL8DwKyjES/q29urrxnemVAqHOQe/iiuV+/4EoUOr68 Yj91tV/7BxjTDOePcfETiraUzDu64AWXldgMiIWjP0iYEFJ3x4RCpdLK8vNdmbR2JOr8 Er7c94G8IjPXMw7SbUB9DM1KUiCweiVEPwUcb0g1cwnNQE9rAAhC8+Oj/0Cra1i0/B+N lK6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1762373457; x=1762978257; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=IGQCHSX+VBuD1rwvjJ6Wh0BG1UvLo9/YmQdrfi1ZSTU=; b=qSit2TYDpdPl2eA5yoiyQ6gwGfUeSURkVX0/Yr+phZc8T8SBkRDJG5h37vdNHmbufZ T7XDEAqmTxOufBhCuJWFU2IeyL0YiseF2Hq6nUrIcUq46mNIUBzQWQLLPZJJSV7XDPBi sVxfOwGYjN5yzcWihOokYanHtM9gltCXD9OtTtVwZZ4dIAE0c3DKTPeYKSqetPppI3e2 q7SypdqSOwIBGx/mVhxIsf1rKvb6QxM/ivUTClZMmi0BeZvFnFBjGgATufAuqkfbbEQb clIMl3KWmwSfDtGNnkEchGitneim85iQJCpWEKAfVNxKrx9uYtJP33C4Zb3WDTb12uay Sm5A== X-Forwarded-Encrypted: i=1; AJvYcCUXcHpw0YuzUDRCKwrGbkaO8gvdx0x6kndf4ujqDJr+03kqWhXIdCE5H/1hpyJbmXjVWEXTolLaU3w9pjY=@vger.kernel.org X-Gm-Message-State: AOJu0YyBLfjYvubrxAHXVKjXOAeZVPeeFYIxKOCHBGj1x1pwLOCmBdUE Sl96OTpiZClad1aY2OFxBjKztRK85v0Qgo3+TdojSiJMbdJuoneyunec X-Gm-Gg: ASbGncvQsO/kO8asJFoO93jqPQmA/WnM60BBL9uKEHUf34LDYPo4WvQuv+hdZb2e3Q7 yAEclSiY/cUXzmXndDxyMF4Nsw44FAYkjGJNShVFjflHce4Joonu8zplhm2sGGuVNxMI+q2NFDU cDrvbnIdVtjk6VoK3r6cMFo6o6jlPdScA4NcQHrdPHRgLrwRp/LKgW+KnfTjYU9z7mZPs9NSDQN SMy/0s+yZhVWc0pkUgx5dNyBSbmXQiW67a+sIZItQTBKgcwRZSpDGUQlE1DE4il43FHjI5yRbkj 5DejWQQXdu4NlpsPDMgLX1DfN5wxjeLODp5W6BnsSA7W4yhW+usJrgNi9lbA088/WDS+BVNSVoa qb7IbPM0GdzMBJP15c5ZG5b195k21eR9sQGpNzKEiHYqsAfbQAnmAReaRAuTD6t4ru7pRG6yf3R u9OR9H/iu2OzPHy8RD7QMYoWiNOFlsXfkIayUu6MEIs5vpm3cS1YH7GYi81mxgu/uolYKIwH80 X-Google-Smtp-Source: 
AGHT+IHEC6a0tb0CmYj1fixrjFAg0hcYGOZWGSCeLDUOu/kqJ+eVN76eroHUy1G9kEWbbuy8Nw3iWQ== X-Received: by 2002:a05:6000:4304:b0:40e:31a2:7efe with SMTP id ffacd0b85a97d-429e32e17c1mr3926274f8f.14.1762373457246; Wed, 05 Nov 2025 12:10:57 -0800 (PST) Received: from snowdrop.snailnet.com (82-69-66-36.dsl.in-addr.zen.co.uk. [82.69.66.36]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-429eb41102bsm619857f8f.17.2025.11.05.12.10.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 05 Nov 2025 12:10:57 -0800 (PST) From: David Laight To: Andrew Morton , linux-kernel@vger.kernel.org Cc: David Laight , u.kleine-koenig@baylibre.com, Nicolas Pitre , Oleg Nesterov , Peter Zijlstra , Biju Das , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , Ingo Molnar , Thomas Gleixner , Li RongQing , Khazhismel Kumykov , Jens Axboe , x86@kernel.org Subject: [PATCH v5 next 8/9] lib: mul_u64_u64_div_u64() Optimise the divide code Date: Wed, 5 Nov 2025 20:10:34 +0000 Message-Id: <20251105201035.64043-9-david.laight.linux@gmail.com> X-Mailer: git-send-email 2.39.5 In-Reply-To: <20251105201035.64043-1-david.laight.linux@gmail.com> References: <20251105201035.64043-1-david.laight.linux@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Replace the bit by bit algorithm with one that generates 16 bits per iteration on 32bit architectures and 32 bits on 64bit ones. On my zen 5 this reduces the time for the tests (using the generic code) from ~3350ns to ~1000ns. Running the 32bit algorithm on 64bit x86 takes ~1500ns. It'll be slightly slower on a real 32bit system, mostly due to register pressure. The savings for 32bit x86 are much higher (tested in userspace). The worst case (lots of bits in the quotient) drops from ~900 clocks to ~130 (pretty much independent of the arguments).
Other 32bit architectures may see better savings.

It is possible to optimise for divisors that span less than
__LONG_WIDTH__/2 bits. However I suspect they don't happen that often
and it doesn't remove any slow cpu divide instructions which dominate
the result.

Typical improvements for 64bit random divides:
               old   new
sandy bridge:  470   150
haswell:       400   144
piledriver:    960   467   I think rdpmc is very slow.
zen5:          244    80
(Timing is 'rdpmc; mul_div(); rdpmc' with the multiply depending on the
first rdpmc and the second rdpmc depending on the quotient.)

Object code (64bit x86 test program): old 0x173 new 0x141.

Signed-off-by: David Laight Reviewed-by: Nicolas Pitre --- Algorithm unchanged since v3. lib/math/div64.c | 124 ++++++++++++++++++++++++++++++++--------------- 1 file changed, 85 insertions(+), 39 deletions(-) diff --git a/lib/math/div64.c b/lib/math/div64.c index bb57a48ce36a..d1e92ea24fce 100644 --- a/lib/math/div64.c +++ b/lib/math/div64.c @@ -190,7 +190,6 @@ EXPORT_SYMBOL(iter_div_u64_rem); #define mul_add(a, b, c) add_u64_u32(mul_u32_u32(a, b), c) =20 #if defined(__SIZEOF_INT128__) && !defined(test_mul_u64_add_u64_div_u64) - static inline u64 mul_u64_u64_add_u64(u64 *p_lo, u64 a, u64 b, u64 c) { /* native 64x64=3D128 bits multiplication */ @@ -199,9 +198,7 @@ static inline u64 mul_u64_u64_add_u64(u64 *p_lo, u64 a,= u64 b, u64 c) *p_lo =3D prod; return prod >> 64; } - #else - static inline u64 mul_u64_u64_add_u64(u64 *p_lo, u64 a, u64 b, u64 c) { /* perform a 64x64=3D128 bits multiplication in 32bit chunks */ @@ -216,12 +213,37 @@ static inline u64 mul_u64_u64_add_u64(u64 *p_lo, u64 = a, u64 b, u64 c) *p_lo =3D (y << 32) + (u32)x; return add_u64_u32(z, y >> 32); } +#endif + +#ifndef BITS_PER_ITER +#define BITS_PER_ITER (__LONG_WIDTH__ >=3D 64 ?
32 : 16) +#endif + +#if BITS_PER_ITER =3D=3D 32 +#define mul_u64_long_add_u64(p_lo, a, b, c) mul_u64_u64_add_u64(p_lo, a, b= , c) +#define add_u64_long(a, b) ((a) + (b)) +#else +#undef BITS_PER_ITER +#define BITS_PER_ITER 16 +static inline u32 mul_u64_long_add_u64(u64 *p_lo, u64 a, u32 b, u64 c) +{ + u64 n_lo =3D mul_add(a, b, c); + u64 n_med =3D mul_add(a >> 32, b, c >> 32); + + n_med =3D add_u64_u32(n_med, n_lo >> 32); + *p_lo =3D n_med << 32 | (u32)n_lo; + return n_med >> 32; +} =20 +#define add_u64_long(a, b) add_u64_u32(a, b) #endif =20 u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d) { - u64 n_lo, n_hi; + unsigned long d_msig, q_digit; + unsigned int reps, d_z_hi; + u64 quotient, n_lo, n_hi; + u32 overflow; =20 n_hi =3D mul_u64_u64_add_u64(&n_lo, a, b, c); =20 @@ -240,46 +262,70 @@ u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 = d) return ~0ULL; } =20 - int shift =3D __builtin_ctzll(d); - - /* try reducing the fraction in case the dividend becomes <=3D 64 bits */ - if ((n_hi >> shift) =3D=3D 0) { - u64 n =3D shift ? 
(n_lo >> shift) | (n_hi << (64 - shift)) : n_lo; - - return div64_u64(n, d >> shift); - /* - * The remainder value if needed would be: - * res =3D div64_u64_rem(n, d >> shift, &rem); - * rem =3D (rem << shift) + (n_lo - (n << shift)); - */ + /* Left align the divisor, shifting the dividend to match */ + d_z_hi =3D __builtin_clzll(d); + if (d_z_hi) { + d <<=3D d_z_hi; + n_hi =3D n_hi << d_z_hi | n_lo >> (64 - d_z_hi); + n_lo <<=3D d_z_hi; } =20 - /* Do the full 128 by 64 bits division */ - - shift =3D __builtin_clzll(d); - d <<=3D shift; - - int p =3D 64 + shift; - u64 res =3D 0; - bool carry; + reps =3D 64 / BITS_PER_ITER; + /* Optimise loop count for small dividends */ + if (!(u32)(n_hi >> 32)) { + reps -=3D 32 / BITS_PER_ITER; + n_hi =3D n_hi << 32 | n_lo >> 32; + n_lo <<=3D 32; + } +#if BITS_PER_ITER =3D=3D 16 + if (!(u32)(n_hi >> 48)) { + reps--; + n_hi =3D add_u64_u32(n_hi << 16, n_lo >> 48); + n_lo <<=3D 16; + } +#endif =20 - do { - carry =3D n_hi >> 63; - shift =3D carry ? 1 : __builtin_clzll(n_hi); - if (p < shift) - break; - p -=3D shift; - n_hi <<=3D shift; - n_hi |=3D n_lo >> (64 - shift); - n_lo <<=3D shift; - if (carry || (n_hi >=3D d)) { - n_hi -=3D d; - res |=3D 1ULL << p; + /* Invert the dividend so we can use add instead of subtract. */ + n_lo =3D ~n_lo; + n_hi =3D ~n_hi; + + /* + * Get the most significant BITS_PER_ITER bits of the divisor. + * This is used to get a low 'guestimate' of the quotient digit. + */ + d_msig =3D (d >> (64 - BITS_PER_ITER)) + 1; + + /* + * Now do a 'long division' with BITS_PER_ITER bit 'digits'. + * The 'guess' quotient digit can be low and BITS_PER_ITER+1 bits. + * The worst case is dividing ~0 by 0x8000 which requires two subtracts. 
+ */ + quotient =3D 0; + while (reps--) { + q_digit =3D (unsigned long)(~n_hi >> (64 - 2 * BITS_PER_ITER)) / d_msig; + /* Shift 'n' left to align with the product q_digit * d */ + overflow =3D n_hi >> (64 - BITS_PER_ITER); + n_hi =3D add_u64_u32(n_hi << BITS_PER_ITER, n_lo >> (64 - BITS_PER_ITER)= ); + n_lo <<=3D BITS_PER_ITER; + /* Add product to negated divisor */ + overflow +=3D mul_u64_long_add_u64(&n_hi, d, q_digit, n_hi); + /* Adjust for the q_digit 'guestimate' being low */ + while (overflow < 0xffffffff >> (32 - BITS_PER_ITER)) { + q_digit++; + n_hi +=3D d; + overflow +=3D n_hi < d; } - } while (n_hi); - /* The remainder value if needed would be n_hi << p */ + quotient =3D add_u64_long(quotient << BITS_PER_ITER, q_digit); + } =20 - return res; + /* + * The above only ensures the remainder doesn't overflow, + * it can still be possible to add (aka subtract) another copy + * of the divisor. + */ + if ((n_hi + d) > n_hi) + quotient++; + return quotient; } #if !defined(test_mul_u64_add_u64_div_u64) EXPORT_SYMBOL(mul_u64_add_u64_div_u64); --=20 2.39.5 From nobody Sun Dec 14 06:15:19 2025 Received: from mail-wr1-f48.google.com (mail-wr1-f48.google.com [209.85.221.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BFC5C30CDB9 for ; Wed, 5 Nov 2025 20:10:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762373461; cv=none; b=ELYyDzfzPYOzQtylasA+DDeF4Cfr03NsnZBwfyD/FIL1Bct4Gaxt28pgxJwoKYfa/Snp3HoSeUrx49MX4GW8I4PDXtd7JfZXFRBQjuNIq8tHFaFWPhAJP08HQfcp+SmfZHmdGCft0hx/lJaqe6AElbLsghbOcbHGAfdD9z+CKXA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762373461; c=relaxed/simple; bh=HVjUwdOw78msn3/jdQXnWRqAvOvh+8n3kTj/k7a1RvQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; 
b=PRPBoWPDuODTOQHxDbUnlnLWIeEpUllOh+Hcd/AETvzD3/UVQNn9MRzG5gDzrmzLeYFekYsFxS3ZmjKwhEnq9qKbc2GRx3iSWwR5BvuHk27DLEuGspWRN/EtBC42GUHJd1v77/rGn4O2Ck9yTDsawE2FZxmNu5xWzNDNWRE8xzM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=bnmqisLF; arc=none smtp.client-ip=209.85.221.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="bnmqisLF" Received: by mail-wr1-f48.google.com with SMTP id ffacd0b85a97d-429bccca1e8so171299f8f.0 for ; Wed, 05 Nov 2025 12:10:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1762373458; x=1762978258; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=lfqqsBnisQAavJV9LRrMQzrEZPCcNFYgFzwVnKLBfWg=; b=bnmqisLFrRS5wl2hXc6eIhWsvIvlvWLatg3Nd7KQ/5f9d3YQ45HZrFzccW5HKd7Oux XeqmcFrzpoqwoocqQA4pFF/bkSHtwCJb/ioOpAwf3zGFperc53lkZ5dmt9Uamqvfm4Xy 75VV+2TAEmRV+n+1PTvPcIo/NWwB7yRF+awABKkhi1hTrinb32A1aX66A5ltl9YvgfQ5 nDxwBFm+i40YwF2du8YS1udoNck6R9x8daqgfY7KdKyZbiAZWrAzQn0uIa8ClZ1ptDLx U9pjink9krsakUnBTiYmdmE2kaj/Ps2yYWQ6+JAAH8ZW8qs+TylURPFlwaRi2ja7IsGg RFWg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1762373458; x=1762978258; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=lfqqsBnisQAavJV9LRrMQzrEZPCcNFYgFzwVnKLBfWg=; b=K40FYglvgjtHBSJLkpsiAexMEIr9POCDpMXXVDfjUhfCelvPqAeyvv0EdaOmc5S8IV 
3o5w1Ipk+DjMLmtT1RLb7TN3n2eC2fq9PdNpLbOMnD+x+YuJq/UTZqne6cqTXhpW/6Vn t7rv4miaOaPGcXKocjB484bl12Gy5TQJlDMUOb/tvJrz+LTrtm16WAO/DLcYmgy9iqLN QMXOzgdUwLi7D9Lc/dBX8wRj95Pv1FOmdU7tqHNEopaBd3NHIocKF+6rcDyeb5YUOFvU xoja8FFtQYpA+oc9diGdzEttAdFgq7pYo5b8xjWUbYqerzPExG3xn7OIeoRyfgDdoCMo Pi+A== X-Forwarded-Encrypted: i=1; AJvYcCUp20yUWBqe8+ccdYcarMFm/OiJtmt/AYsE6Iwgc+uKzPdVMA6DxklWFJZ0Fa7KzsGWRRyxThTynL+Xlw4=@vger.kernel.org X-Gm-Message-State: AOJu0YzkNrOTX57TJAL8q6itsxNQOW1a/3TZt608op9lidGLsdCCYFoi Oyh/gxHf16El5OV1gYVuvbJun6JjUj3xqgu0XMpTqMe3DPbqBoD8Luhe X-Gm-Gg: ASbGncttz1cIWsQHU2SeD/vT0389UJ9P5BwEImcSjrsmPMvWCzqPAh3vQXNLUS9TkvK 7jAWR6oROXqZKBNRGOrtUv4UV+iDcqv3ZMX5/ENwXdfLqmgBWY5zKqlgPslre9iOd5wl0OyxibX sHRtjXk9UCLrtupavkF+mDif8gWTl37JGUhOnndo6jrZbnHR1ePLyqhCRV2iafLvMPpSsm1Ou+x pQHVP1xxs7LvNPpGH42tlglqfOUp1UXPl19mB83Os6G5RMb1k5TqN/eulAizdsZKQRkj4LvYz+W Xm6yIO/L2nEBg+kZMkTyWbZFAVARWYl5zJexgKAkoHxOMFnSgg04dgPi+ydxyBFZA8BTquf53D8 keI7UYzROp4U7fVTAEZ1D9MPrBHuoImYXTlYMe5if/PX+y63UlkAoaFufBvXlUO75gEJalr+ui+ iJ3QHZSJsg29RZoiuEF9gFqqeeAY2qMH9ngMDMi0WuwQWHp8ts2soUgCWlmk3vt0xuO3L0gRrM X-Google-Smtp-Source: AGHT+IE3JU2/MHg/OIZde/XCPMdKXkgSkZSj6exPteW1+q13R6TsC2yTPJbcDctJMHoID6XiIDqcVg== X-Received: by 2002:a05:6000:178a:b0:426:d82f:889e with SMTP id ffacd0b85a97d-429e32e3fb0mr4204906f8f.14.1762373457967; Wed, 05 Nov 2025 12:10:57 -0800 (PST) Received: from snowdrop.snailnet.com (82-69-66-36.dsl.in-addr.zen.co.uk. [82.69.66.36]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-429eb41102bsm619857f8f.17.2025.11.05.12.10.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 05 Nov 2025 12:10:57 -0800 (PST) From: David Laight To: Andrew Morton , linux-kernel@vger.kernel.org Cc: David Laight , u.kleine-koenig@baylibre.com, Nicolas Pitre , Oleg Nesterov , Peter Zijlstra , Biju Das , Borislav Petkov , Dave Hansen , "H. 
Peter Anvin" , Ingo Molnar , Thomas Gleixner , Li RongQing , Khazhismel Kumykov , Jens Axboe , x86@kernel.org Subject: [PATCH v5 next 9/9] lib: test_mul_u64_u64_div_u64: Test the 32bit code on 64bit Date: Wed, 5 Nov 2025 20:10:35 +0000 Message-Id: <20251105201035.64043-10-david.laight.linux@gmail.com> X-Mailer: git-send-email 2.39.5 In-Reply-To: <20251105201035.64043-1-david.laight.linux@gmail.com> References: <20251105201035.64043-1-david.laight.linux@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" There are slight differences in the mul_u64_add_u64_div_u64() code between 32bit and 64bit systems. Compile and test the 32bit version on 64bit hosts for better test coverage. Signed-off-by: David Laight Reviewed-by: Nicolas Pitre --- Changes for v4:=20 - Fix build on non x86-64 lib/math/test_mul_u64_u64_div_u64.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/lib/math/test_mul_u64_u64_div_u64.c b/lib/math/test_mul_u64_u6= 4_div_u64.c index d8d2c18c4614..338d014f0c73 100644 --- a/lib/math/test_mul_u64_u64_div_u64.c +++ b/lib/math/test_mul_u64_u64_div_u64.c @@ -74,6 +74,10 @@ done */ =20 static u64 test_mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d); +#if __LONG_WIDTH__ >=3D 64 +#define TEST_32BIT_DIV +static u64 test_mul_u64_add_u64_div_u64_32bit(u64 a, u64 b, u64 c, u64 d); +#endif =20 static int __init test_run(unsigned int fn_no, const char *fn_name) { @@ -100,6 +104,12 @@ static int __init test_run(unsigned int fn_no, const c= har *fn_name) result =3D test_mul_u64_add_u64_div_u64(a, b, 0, d); result_up =3D test_mul_u64_add_u64_div_u64(a, b, d - 1, d); break; +#ifdef TEST_32BIT_DIV + case 2: + result =3D test_mul_u64_add_u64_div_u64_32bit(a, b, 0, d); + result_up =3D test_mul_u64_add_u64_div_u64_32bit(a, b, d - 1, d); + break; +#endif } =20 tests +=3D 2; @@ 
-131,6 +141,10 @@ static int __init test_init(void) return -EINVAL; if (test_run(1, "test_mul_u64_u64_div_u64")) return -EINVAL; +#ifdef TEST_32BIT_DIV + if (test_run(2, "test_mul_u64_u64_div_u64_32bit")) + return -EINVAL; +#endif return 0; } =20 @@ -153,6 +167,21 @@ static void __exit test_exit(void) =20 #include "div64.c" =20 +#ifdef TEST_32BIT_DIV +/* Recompile the generic code for 32bit long */ +#undef test_mul_u64_add_u64_div_u64 +#define test_mul_u64_add_u64_div_u64 test_mul_u64_add_u64_div_u64_32bit +#undef BITS_PER_ITER +#define BITS_PER_ITER 16 + +#define mul_u64_u64_add_u64 mul_u64_u64_add_u64_32bit +#undef mul_u64_long_add_u64 +#undef add_u64_long +#undef mul_add + +#include "div64.c" +#endif + module_init(test_init); module_exit(test_exit); =20 --=20 2.39.5
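
For anyone wanting to sanity-check the 32bit-chunk multiply from this series outside the kernel, a minimal userspace sketch follows. It is not part of the patches. The uintN_t types, the plain-C stand-ins for mul_u32_u32() and add_u64_u32(), and the names mul_u64_u64_add_u64_chunked(), chunked_matches_native() and self_check() are all assumptions made for illustration. It also assumes a compiler that provides unsigned __int128 (gcc or clang on a 64bit target) for the reference result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel helpers of the same name. */
static inline uint64_t mul_u32_u32(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}

static inline uint64_t add_u64_u32(uint64_t a, uint32_t b)
{
	return a + b;
}

/* The u32 parameters truncate wider arguments to their low 32 bits. */
#define mul_add(a, b, c) add_u64_u32(mul_u32_u32(a, b), c)

/*
 * Compute a * b + c as a 128bit value using 32x32=64 multiplies.
 * Returns the high 64 bits and stores the low 64 bits in *p_lo.
 * Since (x-1)(x-1) + 2(x-1) == x.x - 1, two u32 values can be added
 * to a 32x32 product without overflowing a u64.
 */
static uint64_t mul_u64_u64_add_u64_chunked(uint64_t *p_lo, uint64_t a,
					    uint64_t b, uint64_t c)
{
	uint64_t x, y, z;

	x = mul_add(a, b, c);                   /* a_lo * b_lo + c_lo */
	y = mul_add(a, b >> 32, c >> 32);       /* a_lo * b_hi + c_hi */
	y = add_u64_u32(y, x >> 32);            /* carry from x */
	z = mul_add(a >> 32, b >> 32, y >> 32); /* a_hi * b_hi + carry */
	y = mul_add(a >> 32, b, y);             /* a_hi * b_lo + low half of y */
	*p_lo = (y << 32) + (uint32_t)x;
	return add_u64_u32(z, y >> 32);
}

/* Compare against the compiler's native 128bit arithmetic; 1 on match. */
static int chunked_matches_native(uint64_t a, uint64_t b, uint64_t c)
{
	unsigned __int128 ref = (unsigned __int128)a * b + c;
	uint64_t lo, hi;

	hi = mul_u64_u64_add_u64_chunked(&lo, a, b, c);
	return lo == (uint64_t)ref && hi == (uint64_t)(ref >> 64);
}

/* Exercise the boundary inputs where the carries matter most. */
static void self_check(void)
{
	assert(chunked_matches_native(0, 0, 0));
	assert(chunked_matches_native(~0ULL, ~0ULL, ~0ULL));
	assert(chunked_matches_native(~0ULL, 1, ~0ULL));
	assert(chunked_matches_native(0xffffffffULL, 0xffffffffULL, 0xffffffffULL));
	assert(chunked_matches_native(1ULL << 63, 2, 0));
}
```

Calling self_check() from a small main() covers the all-ones case, where every intermediate sum sits at its maximum. Feeding chunked_matches_native() random inputs is the obvious next step.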