From nobody Sun Dec 14 11:13:48 2025
From: David Laight
To: Andrew Morton, linux-kernel@vger.kernel.org
Cc: David Laight, u.kleine-koenig@baylibre.com, Nicolas Pitre,
 Oleg Nesterov, Peter Zijlstra, Biju Das, Borislav Petkov, Dave Hansen,
 "H. Peter Anvin", Ingo Molnar, Thomas Gleixner, Li RongQing, Yu Kuai,
 Khazhismel Kumykov, Jens Axboe, x86@kernel.org
Subject: [PATCH v4 next 1/9] lib: mul_u64_u64_div_u64() rename parameter 'c' to 'd'
Date: Wed, 29 Oct 2025 17:38:20 +0000
Message-Id: <20251029173828.3682-2-david.laight.linux@gmail.com>
In-Reply-To: <20251029173828.3682-1-david.laight.linux@gmail.com>
References: <20251029173828.3682-1-david.laight.linux@gmail.com>

Change the prototype from mul_u64_u64_div_u64(u64 a, u64 b, u64 c)
to mul_u64_u64_div_u64(u64 a, u64 b, u64 d).
Using 'd' for 'divisor' makes more sense.

An upcoming change adds a 'c' parameter to calculate (a * b + c)/d.

Signed-off-by: David Laight
Reviewed-by: Nicolas Pitre
---
 lib/math/div64.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/lib/math/div64.c b/lib/math/div64.c
index bf77b9843175..0ebff850fd4d 100644
--- a/lib/math/div64.c
+++ b/lib/math/div64.c
@@ -184,10 +184,10 @@ u32 iter_div_u64_rem(u64 dividend, u32 divisor, u64 *remainder)
 EXPORT_SYMBOL(iter_div_u64_rem);
 
 #ifndef mul_u64_u64_div_u64
-u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 c)
+u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d)
 {
 	if (ilog2(a) + ilog2(b) <= 62)
-		return div64_u64(a * b, c);
+		return div64_u64(a * b, d);
 
 #if defined(__SIZEOF_INT128__)
 
@@ -212,37 +212,37 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 c)
 
 #endif
 
-	/* make sure c is not zero, trigger runtime exception otherwise */
-	if (unlikely(c == 0)) {
+	/* make sure d is not zero, trigger runtime exception otherwise */
+	if (unlikely(d == 0)) {
 		unsigned long zero = 0;
 
 		OPTIMIZER_HIDE_VAR(zero);
 		return ~0UL/zero;
 	}
 
-	int shift = __builtin_ctzll(c);
+	int shift = __builtin_ctzll(d);
 
 	/* try reducing the fraction in case the dividend becomes <= 64 bits */
 	if ((n_hi >> shift) == 0) {
 		u64 n = shift ? (n_lo >> shift) | (n_hi << (64 - shift)) : n_lo;
 
-		return div64_u64(n, c >> shift);
+		return div64_u64(n, d >> shift);
 		/*
 		 * The remainder value if needed would be:
-		 *   res = div64_u64_rem(n, c >> shift, &rem);
+		 *   res = div64_u64_rem(n, d >> shift, &rem);
 		 *   rem = (rem << shift) + (n_lo - (n << shift));
 		 */
 	}
 
-	if (n_hi >= c) {
+	if (n_hi >= d) {
 		/* overflow: result is unrepresentable in a u64 */
 		return -1;
 	}
 
 	/* Do the full 128 by 64 bits division */
 
-	shift = __builtin_clzll(c);
-	c <<= shift;
+	shift = __builtin_clzll(d);
+	d <<= shift;
 
 	int p = 64 + shift;
 	u64 res = 0;
@@ -257,8 +257,8 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 c)
 		n_hi <<= shift;
 		n_hi |= n_lo >> (64 - shift);
 		n_lo <<= shift;
-		if (carry || (n_hi >= c)) {
-			n_hi -= c;
+		if (carry || (n_hi >= d)) {
+			n_hi -= d;
 			res |= 1ULL << p;
 		}
 	} while (n_hi);
-- 
2.39.5

From nobody Sun Dec 14 11:13:48 2025
From: David Laight
To: Andrew Morton, linux-kernel@vger.kernel.org
Cc: David Laight, u.kleine-koenig@baylibre.com, Nicolas Pitre,
 Oleg Nesterov, Peter Zijlstra, Biju Das, Borislav Petkov, Dave Hansen,
 "H. Peter Anvin", Ingo Molnar, Thomas Gleixner, Li RongQing, Yu Kuai,
 Khazhismel Kumykov, Jens Axboe, x86@kernel.org
Subject: [PATCH v4 next 2/9] lib: mul_u64_u64_div_u64() Combine overflow and divide by zero checks
Date: Wed, 29 Oct 2025 17:38:21 +0000
Message-Id: <20251029173828.3682-3-david.laight.linux@gmail.com>
In-Reply-To: <20251029173828.3682-1-david.laight.linux@gmail.com>
References: <20251029173828.3682-1-david.laight.linux@gmail.com>

Since the overflow check (n_hi >= d) always triggers when the divisor
is zero, move the check for divide by zero inside the overflow check.
This means there is only one test in the normal path.

Signed-off-by: David Laight
Reviewed-by: Nicolas Pitre
---
V3 contained a different patch 2 that made different changes to the
error paths.

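The reason a single test is enough: n_hi is unsigned, so n_hi >= 0 is
always true and a zero divisor can never get past the overflow test.
Below is a minimal userspace sketch of that idea, not the kernel code.
It assumes a compiler that provides unsigned __int128; the function name
checked_mul_div() and the status enum are hypothetical, and the sketch
reports divide by zero through a status code instead of trapping.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status codes; the kernel code traps or returns ~0 instead. */
enum md_status { MD_OK, MD_OVERFLOW, MD_DIV_ZERO };

/*
 * Userspace sketch of the combined check (assumes unsigned __int128).
 * The quotient of the 128-bit product by d fits in 64 bits only when
 * n_hi < d. Because n_hi >= 0 always holds, d == 0 is caught by the
 * same comparison and only needs to be told apart on the slow path.
 */
static enum md_status checked_mul_div(uint64_t a, uint64_t b, uint64_t d,
				      uint64_t *res)
{
	unsigned __int128 prod = (unsigned __int128)a * b;
	uint64_t n_hi = (uint64_t)(prod >> 64);

	if (n_hi >= d) {		/* the only test on the normal path */
		*res = ~0ULL;
		return d == 0 ? MD_DIV_ZERO : MD_OVERFLOW;
	}

	/* Here d != 0 and the quotient is known to fit in 64 bits. */
	*res = (uint64_t)(prod / d);
	return MD_OK;
}

/* Small self-check showing the three outcomes. */
static void checked_mul_div_selftest(void)
{
	uint64_t r;

	assert(checked_mul_div(6, 7, 2, &r) == MD_OK && r == 21);
	assert(checked_mul_div(1, 1, 0, &r) == MD_DIV_ZERO);
	assert(checked_mul_div(~0ULL, ~0ULL, 1, &r) == MD_OVERFLOW);
}
```

Calling checked_mul_div_selftest() from a test program exercises the
normal, divide-by-zero and overflow paths.
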
 lib/math/div64.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/lib/math/div64.c b/lib/math/div64.c
index 0ebff850fd4d..1092f41e878e 100644
--- a/lib/math/div64.c
+++ b/lib/math/div64.c
@@ -212,12 +212,16 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d)
 
 #endif
 
-	/* make sure d is not zero, trigger runtime exception otherwise */
-	if (unlikely(d == 0)) {
-		unsigned long zero = 0;
+	if (unlikely(n_hi >= d)) {
+		/* trigger runtime exception if divisor is zero */
+		if (d == 0) {
+			unsigned long zero = 0;
 
-		OPTIMIZER_HIDE_VAR(zero);
-		return ~0UL/zero;
+			OPTIMIZER_HIDE_VAR(zero);
+			return ~0UL/zero;
+		}
+		/* overflow: result is unrepresentable in a u64 */
+		return ~0ULL;
 	}
 
 	int shift = __builtin_ctzll(d);
@@ -234,11 +238,6 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d)
 		 */
 	}
 
-	if (n_hi >= d) {
-		/* overflow: result is unrepresentable in a u64 */
-		return -1;
-	}
-
 	/* Do the full 128 by 64 bits division */
 
 	shift = __builtin_clzll(d);
-- 
2.39.5

From nobody Sun Dec 14 11:13:48 2025
From: David Laight
To: Andrew Morton, linux-kernel@vger.kernel.org
Cc: David Laight, u.kleine-koenig@baylibre.com, Nicolas Pitre,
 Oleg Nesterov, Peter Zijlstra, Biju Das, Borislav Petkov, Dave Hansen,
 "H. Peter Anvin", Ingo Molnar, Thomas Gleixner, Li RongQing, Yu Kuai,
 Khazhismel Kumykov, Jens Axboe, x86@kernel.org
Subject: [PATCH v4 next 3/9] lib: mul_u64_u64_div_u64() simplify check for a 64bit product
Date: Wed, 29 Oct 2025 17:38:22 +0000
Message-Id: <20251029173828.3682-4-david.laight.linux@gmail.com>
In-Reply-To: <20251029173828.3682-1-david.laight.linux@gmail.com>
References: <20251029173828.3682-1-david.laight.linux@gmail.com>

If the product is only 64bits div64_u64() can be used for the divide.
Replace the pre-multiply check (ilog2(a) + ilog2(b) <= 62) with a
simple post-multiply check that the high 64bits are zero.

This has the advantage of being simpler, more accurate and less code.
It will always be faster when the product is larger than 64bits.

Most 64bit CPUs have a native 64x64=128 bit multiply, and this is
needed (for the low 64bits) even when div64_u64() is called - so the
early check gains nothing and is just extra code.

32bit CPUs will need a compare (etc) to generate the 64bit ilog2()
from two 32bit bit scans - so that is non-trivial.
(Never mind the mess of x86's 'bsr' and any oddball CPU without
fast bit-scan instructions.)
Whereas the additional instructions for the 128bit multiply result
are pretty much one multiply and two adds (typically the 'adc $0,%reg'
can be run in parallel with the instruction that follows).

The only outliers are 64bit systems without a 128bit multiply and
simple in-order 32bit ones with fast bit scan but needing extra
instructions to get the high bits of the multiply result.
I doubt it makes much difference to either, and the latter is
definitely not mainstream.

If anyone is worried about the analysis they can look at the
generated code for x86 (especially when cmov isn't used).

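To illustrate "more accurate": the ilog2() sum is only a conservative
bound, so it rejects some products that do fit in 64 bits. The following
userspace sketch compares the two tests. It assumes a compiler with
unsigned __int128 and __builtin_clzll(); ilog2_u64() is a hypothetical
stand-in for the kernel's ilog2() and is only valid for non-zero input,
and both fits_*() names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's ilog2(); x must be non-zero. */
static int ilog2_u64(uint64_t x)
{
	return 63 - __builtin_clzll(x);
}

/* Old test: conservative bound evaluated before the multiply. */
static bool fits_pre_multiply(uint64_t a, uint64_t b)
{
	return ilog2_u64(a) + ilog2_u64(b) <= 62;
}

/* New test: the high 64 bits of the 128-bit product are zero. */
static bool fits_post_multiply(uint64_t a, uint64_t b)
{
	unsigned __int128 prod = (unsigned __int128)a * b;

	return (uint64_t)(prod >> 64) == 0;
}

/* Shows one product that only the post-multiply test accepts. */
static void fits_selftest(void)
{
	/* 2^32 * 2^31 == 2^63 fits in 64 bits, but 32 + 31 > 62. */
	assert(!fits_pre_multiply(1ULL << 32, 1ULL << 31));
	assert(fits_post_multiply(1ULL << 32, 1ULL << 31));

	/* 2^32 * 2^32 == 2^64 does not fit; both tests agree. */
	assert(!fits_pre_multiply(1ULL << 32, 1ULL << 32));
	assert(!fits_post_multiply(1ULL << 32, 1ULL << 32));
}
```

Whenever the pre-multiply test accepts a pair, the post-multiply test
accepts it too, since a < 2^(ilog2(a)+1) and b < 2^(ilog2(b)+1) bound
the product below 2^64.
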
Signed-off-by: David Laight
---
Split from patch 3 for v2, unchanged since.

 lib/math/div64.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/math/div64.c b/lib/math/div64.c
index 1092f41e878e..7158d141b6e9 100644
--- a/lib/math/div64.c
+++ b/lib/math/div64.c
@@ -186,9 +186,6 @@ EXPORT_SYMBOL(iter_div_u64_rem);
 #ifndef mul_u64_u64_div_u64
 u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d)
 {
-	if (ilog2(a) + ilog2(b) <= 62)
-		return div64_u64(a * b, d);
-
 #if defined(__SIZEOF_INT128__)
 
 	/* native 64x64=128 bits multiplication */
@@ -224,6 +221,9 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d)
 		return ~0ULL;
 	}
 
+	if (!n_hi)
+		return div64_u64(n_lo, d);
+
 	int shift = __builtin_ctzll(d);
 
 	/* try reducing the fraction in case the dividend becomes <= 64 bits */
-- 
2.39.5

From nobody Sun Dec 14 11:13:48 2025
From: David Laight
To: Andrew Morton, linux-kernel@vger.kernel.org
Cc: David Laight, u.kleine-koenig@baylibre.com, Nicolas Pitre,
 Oleg Nesterov, Peter Zijlstra, Biju Das, Borislav Petkov, Dave Hansen,
 "H. Peter Anvin", Ingo Molnar, Thomas Gleixner, Li RongQing, Yu Kuai,
 Khazhismel Kumykov, Jens Axboe, x86@kernel.org
Subject: [PATCH v4 next 4/9] lib: Add mul_u64_add_u64_div_u64() and mul_u64_u64_div_u64_roundup()
Date: Wed, 29 Oct 2025 17:38:23 +0000
Message-Id: <20251029173828.3682-5-david.laight.linux@gmail.com>
In-Reply-To: <20251029173828.3682-1-david.laight.linux@gmail.com>
References: <20251029173828.3682-1-david.laight.linux@gmail.com>

The existing mul_u64_u64_div_u64() rounds down. A 'rounding up'
variant needs 'divisor - 1' added between the multiply and the
divide, so it cannot easily be done by a caller.

Add mul_u64_add_u64_div_u64(a, b, c, d) that calculates (a * b + c)/d
and implement the 'round down' and 'round up' using it.

Update the x86-64 asm to optimise for 'c' being a constant zero.

Add kernel-doc definitions for all three functions.

Signed-off-by: David Laight
Reviewed-by: Nicolas Pitre
---
Changes for v2 (formerly patch 1/3):
- Reinstate the early call to div64_u64() on 32bit when 'c' is zero.
  Although I'm not convinced the path is common enough to be worth
  the two ilog2() calls.

Changes for v3 (formerly patch 3/4):
- The early call to div64_u64() has been removed by patch 3.
  Pretty much guaranteed to be a pessimisation.

Changes for v4:
- For x86-64 split the multiply, add and divide into three asm blocks.
  (gcc makes a pig's breakfast of (u128)a * b + c)
- Change the kernel-doc since divide by zero will (probably) fault.

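As a reading aid, here is a userspace sketch of what the new helpers
compute. It is not the kernel implementation. It assumes a compiler with
unsigned __int128; the names mul_add_div_ref(), mul_div_roundup() and
mul_add_32() are hypothetical. The callers must ensure d is non-zero and
that the quotient fits in 64 bits. mul_add_32() follows the 32bit
partial-product scheme used by the generic code, and its final
recombination step is my assumption about code outside the diff context.

```c
#include <assert.h>
#include <stdint.h>

/* Reference (a * b + c) / d; caller guarantees d != 0 and no overflow. */
static uint64_t mul_add_div_ref(uint64_t a, uint64_t b, uint64_t c, uint64_t d)
{
	/* (2^64-1)^2 + (2^64-1) < 2^128, so the 128-bit sum cannot wrap. */
	unsigned __int128 n = (unsigned __int128)a * b + c;

	return (uint64_t)(n / d);
}

/* Round-up variant: add d - 1 between the multiply and the divide. */
static uint64_t mul_div_roundup(uint64_t a, uint64_t b, uint64_t d)
{
	return mul_add_div_ref(a, b, d - 1, d);
}

/*
 * a * b + c from 32x32=64 multiplies only. A 32x32 product is at most
 * (2^32-1)^2, which leaves room to add two u32 values without
 * overflowing a u64: (x-1)(x-1) + 2(x-1) == x.x - 1.
 */
static void mul_add_32(uint64_t a, uint64_t b, uint64_t c,
		       uint64_t *n_hi, uint64_t *n_lo)
{
	uint32_t a_lo = a, a_hi = a >> 32, b_lo = b, b_hi = b >> 32;
	uint64_t x, y, z;

	x = (uint64_t)a_lo * b_lo + (uint32_t)c;
	y = (uint64_t)a_lo * b_hi + (uint32_t)(c >> 32);
	y += (uint32_t)(x >> 32);
	z = (uint64_t)a_hi * b_hi + (uint32_t)(y >> 32);
	y = (uint64_t)a_hi * b_lo + (uint32_t)y;
	z += (uint32_t)(y >> 32);

	/* Assumed recombination of the partial sums into two u64 halves. */
	*n_lo = (y << 32) + (uint32_t)x;
	*n_hi = z;
}

/* Cross-check the 32bit scheme against the 128-bit reference. */
static void mul_add_selftest(uint64_t a, uint64_t b, uint64_t c)
{
	unsigned __int128 n = (unsigned __int128)a * b + c;
	uint64_t hi, lo;

	mul_add_32(a, b, c, &hi, &lo);
	assert(hi == (uint64_t)(n >> 64) && lo == (uint64_t)n);
}
```

For example, mul_add_div_ref(6, 7, 0, 4) gives 10 while
mul_div_roundup(6, 7, 4) gives 11, and an exact division such as
mul_div_roundup(6, 6, 4) stays at 9.
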
 arch/x86/include/asm/div64.h | 20 +++++++++------
 include/linux/math64.h       | 48 +++++++++++++++++++++++++++++++++++-
 lib/math/div64.c             | 14 ++++++-----
 3 files changed, 67 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/div64.h b/arch/x86/include/asm/div64.h
index 9931e4c7d73f..cabdc2d5a68f 100644
--- a/arch/x86/include/asm/div64.h
+++ b/arch/x86/include/asm/div64.h
@@ -84,21 +84,25 @@ static inline u64 mul_u32_u32(u32 a, u32 b)
  * Will generate an #DE when the result doesn't fit u64, could fix with an
  * __ex_table[] entry when it becomes an issue.
  */
-static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
+static inline u64 mul_u64_add_u64_div_u64(u64 rax, u64 mul, u64 add, u64 div)
 {
-	u64 q;
+	u64 rdx;
 
-	asm ("mulq %2; divq %3" : "=a" (q)
-				: "a" (a), "rm" (mul), "rm" (div)
-				: "rdx");
+	asm ("mulq %[mul]" : "+a" (rax), "=d" (rdx) : [mul] "rm" (mul));
 
-	return q;
+	if (!statically_true(!add))
+		asm ("addq %[add], %[lo]; adcq $0, %[hi]" :
+			[lo] "+r" (rax), [hi] "+r" (rdx) : [add] "irm" (add));
+
+	asm ("divq %[div]" : "+a" (rax), "+d" (rdx) : [div] "rm" (div));
+
+	return rax;
 }
-#define mul_u64_u64_div_u64 mul_u64_u64_div_u64
+#define mul_u64_add_u64_div_u64 mul_u64_add_u64_div_u64
 
 static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 div)
 {
-	return mul_u64_u64_div_u64(a, mul, div);
+	return mul_u64_add_u64_div_u64(a, mul, 0, div);
 }
 #define mul_u64_u32_div mul_u64_u32_div
 
diff --git a/include/linux/math64.h b/include/linux/math64.h
index 6aaccc1626ab..e889d850b7f1 100644
--- a/include/linux/math64.h
+++ b/include/linux/math64.h
@@ -282,7 +282,53 @@ static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 divisor)
 }
 #endif /* mul_u64_u32_div */
 
-u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div);
+/**
+ * mul_u64_add_u64_div_u64 - unsigned 64bit multiply, add, and divide
+ * @a: first unsigned 64bit multiplicand
+ * @b: second unsigned 64bit multiplicand
+ * @c: unsigned 64bit addend
+ * @d: unsigned 64bit divisor
+ *
+ * Multiply two 64bit values together to generate a 128bit product
+ * add a third value and then divide by a fourth.
+ * The Generic code divides by 0 if @d is zero and returns ~0 on overflow.
+ * Architecture specific code may trap on zero or overflow.
+ *
+ * Return: (@a * @b + @c) / @d
+ */
+u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d);
+
+/**
+ * mul_u64_u64_div_u64 - unsigned 64bit multiply and divide
+ * @a: first unsigned 64bit multiplicand
+ * @b: second unsigned 64bit multiplicand
+ * @d: unsigned 64bit divisor
+ *
+ * Multiply two 64bit values together to generate a 128bit product
+ * and then divide by a third value.
+ * The Generic code divides by 0 if @d is zero and returns ~0 on overflow.
+ * Architecture specific code may trap on zero or overflow.
+ *
+ * Return: @a * @b / @d
+ */
+#define mul_u64_u64_div_u64(a, b, d) mul_u64_add_u64_div_u64(a, b, 0, d)
+
+/**
+ * mul_u64_u64_div_u64_roundup - unsigned 64bit multiply and divide rounded up
+ * @a: first unsigned 64bit multiplicand
+ * @b: second unsigned 64bit multiplicand
+ * @d: unsigned 64bit divisor
+ *
+ * Multiply two 64bit values together to generate a 128bit product
+ * and then divide and round up.
+ * The Generic code divides by 0 if @d is zero and returns ~0 on overflow.
+ * Architecture specific code may trap on zero or overflow.
+ *
+ * Return: (@a * @b + @d - 1) / @d
+ */
+#define mul_u64_u64_div_u64_roundup(a, b, d) \
+	({ u64 _tmp = (d); mul_u64_add_u64_div_u64(a, b, _tmp - 1, _tmp); })
+
 
 /**
  * DIV64_U64_ROUND_UP - unsigned 64bit divide with 64bit divisor rounded up
diff --git a/lib/math/div64.c b/lib/math/div64.c
index 7158d141b6e9..25295daebde9 100644
--- a/lib/math/div64.c
+++ b/lib/math/div64.c
@@ -183,13 +183,13 @@ u32 iter_div_u64_rem(u64 dividend, u32 divisor, u64 *remainder)
 }
 EXPORT_SYMBOL(iter_div_u64_rem);
 
-#ifndef mul_u64_u64_div_u64
-u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d)
+#ifndef mul_u64_add_u64_div_u64
+u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d)
 {
 #if defined(__SIZEOF_INT128__)
 
 	/* native 64x64=128 bits multiplication */
-	u128 prod = (u128)a * b;
+	u128 prod = (u128)a * b + c;
 	u64 n_lo = prod, n_hi = prod >> 64;
 
 #else
@@ -198,8 +198,10 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d)
 	u32 a_lo = a, a_hi = a >> 32, b_lo = b, b_hi = b >> 32;
 	u64 x, y, z;
 
-	x = (u64)a_lo * b_lo;
-	y = (u64)a_lo * b_hi + (u32)(x >> 32);
+	/* Since (x-1)(x-1) + 2(x-1) == x.x - 1 two u32 can be added to a u64 */
+	x = (u64)a_lo * b_lo + (u32)c;
+	y = (u64)a_lo * b_hi + (u32)(c >> 32);
+	y += (u32)(x >> 32);
 	z = (u64)a_hi * b_hi + (u32)(y >> 32);
 	y = (u64)a_hi * b_lo + (u32)y;
 	z += (u32)(y >> 32);
@@ -265,5 +267,5 @@ u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d)
 
 	return res;
 }
-EXPORT_SYMBOL(mul_u64_u64_div_u64);
+EXPORT_SYMBOL(mul_u64_add_u64_div_u64);
 #endif
-- 
2.39.5

From nobody Sun Dec 14 11:13:48 2025 Received: from mail-wm1-f47.google.com (mail-wm1-f47.google.com [209.85.128.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EC4ED3546FA for ; Wed, 29 Oct 2025 17:39:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.47 ARC-Seal: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1761759561; cv=none; b=QUKaEXtJ1r7ZPvX4/hCwrNfqQYMwXlh2i6pajJ9vQ9rsP/c4NbLiANPM6VudQcawIIUh2PfbmqQ1Ad6qzhjV1ZmIz+KV53vjK7MVJp0QjzUghNK11xDECSzB1Te/qe43NDlhN+hFD3Z58MxX2KxNDn6sK8M6pw+LgYT/WV/eySw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1761759561; c=relaxed/simple; bh=brQAKaBI2JfxL3nHimpmQqMYkcAGVahG+LND6RewIig=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=h0bFs+DUelSyHYv4Dk8GkoV/nZ46kAkz+J9T2D42rY0ByFxlbdwU0enVjhsLGK6sldPsXEO7F94PLsQtwO0ZOJ87DwuRqqzsjfi6Xmvgpl9RL6Mj55svVgwqP23pr1KIqsteL+GMY8B5Bef5XODIB+JoscPqciBTb6Fez/uWtWM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=RvYTQF4O; arc=none smtp.client-ip=209.85.128.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="RvYTQF4O" Received: by mail-wm1-f47.google.com with SMTP id 5b1f17b1804b1-4711b95226dso1634295e9.0 for ; Wed, 29 Oct 2025 10:39:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1761759557; x=1762364357; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=R7HOVAqPQPihhZufx2AkZx0/NAZTeNOfsgPjqdl0hO8=; b=RvYTQF4OxdAChN9ssfyJn/TFdrOS6CaXp/M+hXZW+knXqSHvBYSxgQmOTmBHg61Mih Mdn8BdAgufPvXvyYka/H6ep2L3yjqt59wQt5cJEn38EyNqkFJ7O3deF0SfiHv/CktixV URwfT4Q1l8xeT7L9zH8iTXKs9EoTPbU3aFs454wf+RxgwqQJ9y/RmVQ0f6eWy+0KPGsy x44UfbGgBLDjA/ekXxLlc7Atb+btUSvDD6zatR37DsZ0umK63LoJ2MoeU2ZS7HJmkUlx 
CEv5hLM4IIQeQ2BhD1biAjG9ETuPJunoP8wQveqGayr6jXtz9WjEzqi1eV4wmZVeXGOs wpaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761759557; x=1762364357; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=R7HOVAqPQPihhZufx2AkZx0/NAZTeNOfsgPjqdl0hO8=; b=rkXKU8as08gBr1ifeAekKNPscW1Ou3M6POlQEm6lk7CJYlXPBQmR+bG1PDm2nUnvZG RD4VLmC4TPdadNk1HgLo4RxGhdWGhPVSD9gBI6Bsru34rNagPSDX7gFbJQHtScuXD0ea e+S3LJKfcsI2fgnv84bRf6xN8UN6ccNOaD8qE2PttHhqVYgQfjjJPJNU9bSSvtalD2WG FSwJAtGTdJtt289gOlELDf3U5BGX4/foKYTTnZWm7iBmYn8YpefNBImmXP4mGQocYmoU tnv+bR5FS24dhz7IVzFLJ4Cyn17CWE7/waoko6lATdAQrSVryI/HXschVkrnBn+mTwGu zwUQ== X-Forwarded-Encrypted: i=1; AJvYcCXYI6n9ROCcBdVo4GdIhJAYeAb5dzu7ESr8D5fHSWHKxQ336UlZ/PS0lsPHVL51jdZWVR7d8Hx2mKqFfLQ=@vger.kernel.org X-Gm-Message-State: AOJu0YymTpxpCW94BHA3/fXDGMNP9caHKuStu1RX55zf8RsxS8g1hHCR SYfNStdZwno9padnK3ACE236vx0Sl9EIJwTEb/TChUvvL52IeFfbngvH X-Gm-Gg: ASbGncv9aJpr7/cH+p12bJKTVHOniNnxlk74Wt6MnAl+sIl4ZoeC2BzhYqfXUlNedac JE9AG8rUpuEwhwjhOX4zrQo0pQu4DYLvvlFJaPcMl6HqWZ+zVm6hWSLC73Cjifubb7C0dSl86cR +W83lIqlNXwN8Yo+/9nIJwd653ZwKGCdHW/zxU2paVPZbxM2uVyKbl+OxpXgXvmFx6M8p/L3OJs NEc3kvag/i7W5/x45CQgpaiByggfdoB88qV6Z/TE1uyFFGDgopIE3DySJTE6LAbx1WqitDXchc2 cPwywbK3hHZLx4dyPXVo5FhbL0wdwkl3Bd4qun3JsF5a7oO1d36xmgl4+ocw2h/oHvmC7NfOdn2 NLhDC/O3GVvOeDw5AWoicVFIuVGuP9UnkIKdU2mNtluSxDrbTMtcmPUzkZuQCNNk+bUBTSxanDm GH1JtzYcK0lkPmfru3tBq8XZngjzGWuRO9k0AGB3yGf4oI9GTF29Pef3KNaqxb X-Google-Smtp-Source: AGHT+IGFqmYWf+rtSScU1XDloRcGj38Kn6CXlOyj9qu6LQygMLJUmRHYLZRh5gN1e522TppcU6eDUg== X-Received: by 2002:a05:600c:1f93:b0:477:1bb6:17e5 with SMTP id 5b1f17b1804b1-4771e3fbbf8mr31470175e9.30.1761759556433; Wed, 29 Oct 2025 10:39:16 -0700 (PDT) Received: from snowdrop.snailnet.com (82-69-66-36.dsl.in-addr.zen.co.uk. 
[82.69.66.36]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-4771e235ae1sm70646865e9.17.2025.10.29.10.39.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Oct 2025 10:39:16 -0700 (PDT) From: David Laight To: Andrew Morton , linux-kernel@vger.kernel.org Cc: David Laight , u.kleine-koenig@baylibre.com, Nicolas Pitre , Oleg Nesterov , Peter Zijlstra , Biju Das , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , Ingo Molnar , Thomas Gleixner , Li RongQing , Yu Kuai , Khazhismel Kumykov , Jens Axboe , x86@kernel.org Subject: [PATCH v4 next 5/9] lib: Add tests for mul_u64_u64_div_u64_roundup() Date: Wed, 29 Oct 2025 17:38:24 +0000 Message-Id: <20251029173828.3682-6-david.laight.linux@gmail.com> X-Mailer: git-send-email 2.39.5 In-Reply-To: <20251029173828.3682-1-david.laight.linux@gmail.com> References: <20251029173828.3682-1-david.laight.linux@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Replicate the existing mul_u64_u64_div_u64() test cases with round up. Update the shell script that verifies the table and remove the comment markers so that it can be directly pasted into a shell. Rename the divisor from 'c' to 'd' to match mul_u64_add_u64_div_u64(). If any tests fail then fail the module load with -EINVAL.
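For illustration only (not part of the patch): a minimal userspace sketch of what the new 'round_up' column encodes, assuming a compiler that provides unsigned __int128 (gcc or clang on a 64bit target). The names u128_ref, ref_mul_div(), ref_mul_div_roundup() and ref_check_row() are hypothetical. Each row stores the truncated quotient in 'result', and 'round_up' is 1 when the division is inexact, so the rounded-up quotient is expected to be result + round_up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference helpers, not taken from the kernel sources. */
typedef unsigned __int128 u128_ref;

/* floor(a * b / d); the caller must ensure d != 0 and a 64bit quotient */
static uint64_t ref_mul_div(uint64_t a, uint64_t b, uint64_t d)
{
	return (uint64_t)(((u128_ref)a * b) / d);
}

/* floor((a * b + d - 1) / d), i.e. the quotient rounded up */
static uint64_t ref_mul_div_roundup(uint64_t a, uint64_t b, uint64_t d)
{
	/* a * b + (d - 1) always fits in 128 bits, so no overflow here */
	return (uint64_t)(((u128_ref)a * b + (d - 1)) / d);
}

/* Check one table row against both reference calculations */
static void ref_check_row(uint64_t a, uint64_t b, uint64_t d,
			  uint64_t result, unsigned int round_up)
{
	assert(ref_mul_div(a, b, d) == result);
	assert(ref_mul_div_roundup(a, b, d) == result + round_up);
}
```

Calling ref_check_row(0xb, 0x7, 0x3, 0x19, 1) mirrors the first table entry: 0xb * 0x7 is 77, 77 / 3 truncates to 25 (0x19) with a non-zero remainder, so the rounded-up value is 26.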
Signed-off-by: David Laight Reviewed-by: Nicolas Pitre --- Changes for v3: - Rename 'c' to 'd' to match mul_u64_add_u64_div_u64() Changes for v4: - Fix shell script that verifies the table lib/math/test_mul_u64_u64_div_u64.c | 122 +++++++++++++++++----------- 1 file changed, 73 insertions(+), 49 deletions(-) diff --git a/lib/math/test_mul_u64_u64_div_u64.c b/lib/math/test_mul_u64_u6= 4_div_u64.c index 58d058de4e73..4d5e4e5dac67 100644 --- a/lib/math/test_mul_u64_u64_div_u64.c +++ b/lib/math/test_mul_u64_u64_div_u64.c @@ -10,61 +10,73 @@ #include #include =20 -typedef struct { u64 a; u64 b; u64 c; u64 result; } test_params; +typedef struct { u64 a; u64 b; u64 d; u64 result; uint round_up;} test_par= ams; =20 static test_params test_values[] =3D { /* this contains many edge values followed by a couple random values */ -{ 0xb, 0x7, 0x3, = 0x19 }, -{ 0xffff0000, 0xffff0000, 0xf, 0x1110eeef00= 000000 }, -{ 0xffffffff, 0xffffffff, 0x1, 0xfffffffe00= 000001 }, -{ 0xffffffff, 0xffffffff, 0x2, 0x7fffffff00= 000000 }, -{ 0x1ffffffff, 0xffffffff, 0x2, 0xfffffffe80= 000000 }, -{ 0x1ffffffff, 0xffffffff, 0x3, 0xaaaaaaa9aa= aaaaab }, -{ 0x1ffffffff, 0x1ffffffff, 0x4, 0xffffffff00= 000000 }, -{ 0xffff000000000000, 0xffff000000000000, 0xffff000000000001, 0xfffeffffff= ffffff }, -{ 0x3333333333333333, 0x3333333333333333, 0x5555555555555555, 0x1eb851eb85= 1eb851 }, -{ 0x7fffffffffffffff, 0x2, 0x3, 0x5555555555= 555554 }, -{ 0xffffffffffffffff, 0x2, 0x8000000000000000, = 0x3 }, -{ 0xffffffffffffffff, 0x2, 0xc000000000000000, = 0x2 }, -{ 0xffffffffffffffff, 0x4000000000000004, 0x8000000000000000, 0x8000000000= 000007 }, -{ 0xffffffffffffffff, 0x4000000000000001, 0x8000000000000000, 0x8000000000= 000001 }, -{ 0xffffffffffffffff, 0x8000000000000001, 0xffffffffffffffff, 0x8000000000= 000001 }, -{ 0xfffffffffffffffe, 0x8000000000000001, 0xffffffffffffffff, 0x8000000000= 000000 }, -{ 0xffffffffffffffff, 0x8000000000000001, 0xfffffffffffffffe, 0x8000000000= 000001 }, -{ 0xffffffffffffffff, 
0x8000000000000001, 0xfffffffffffffffd, 0x8000000000= 000002 }, -{ 0x7fffffffffffffff, 0xffffffffffffffff, 0xc000000000000000, 0xaaaaaaaaaa= aaaaa8 }, -{ 0xffffffffffffffff, 0x7fffffffffffffff, 0xa000000000000000, 0xcccccccccc= ccccca }, -{ 0xffffffffffffffff, 0x7fffffffffffffff, 0x9000000000000000, 0xe38e38e38e= 38e38b }, -{ 0x7fffffffffffffff, 0x7fffffffffffffff, 0x5000000000000000, 0xcccccccccc= ccccc9 }, -{ 0xffffffffffffffff, 0xfffffffffffffffe, 0xffffffffffffffff, 0xffffffffff= fffffe }, -{ 0xe6102d256d7ea3ae, 0x70a77d0be4c31201, 0xd63ec35ab3220357, 0x78f8bf8cc8= 6c6e18 }, -{ 0xf53bae05cb86c6e1, 0x3847b32d2f8d32e0, 0xcfd4f55a647f403c, 0x42687f79d8= 998d35 }, -{ 0x9951c5498f941092, 0x1f8c8bfdf287a251, 0xa3c8dc5f81ea3fe2, 0x1d887cb259= 00091f }, -{ 0x374fee9daa1bb2bb, 0x0d0bfbff7b8ae3ef, 0xc169337bd42d5179, 0x03bb2dbaff= cbb961 }, -{ 0xeac0d03ac10eeaf0, 0x89be05dfa162ed9b, 0x92bb1679a41f0e4b, 0xdc5f5cc9e2= 70d216 }, +{ 0xb, 0x7, 0x3, = 0x19, 1 }, +{ 0xffff0000, 0xffff0000, 0xf, 0x1110eeef00= 000000, 0 }, +{ 0xffffffff, 0xffffffff, 0x1, 0xfffffffe00= 000001, 0 }, +{ 0xffffffff, 0xffffffff, 0x2, 0x7fffffff00= 000000, 1 }, +{ 0x1ffffffff, 0xffffffff, 0x2, 0xfffffffe80= 000000, 1 }, +{ 0x1ffffffff, 0xffffffff, 0x3, 0xaaaaaaa9aa= aaaaab, 0 }, +{ 0x1ffffffff, 0x1ffffffff, 0x4, 0xffffffff00= 000000, 1 }, +{ 0xffff000000000000, 0xffff000000000000, 0xffff000000000001, 0xfffeffffff= ffffff, 1 }, +{ 0x3333333333333333, 0x3333333333333333, 0x5555555555555555, 0x1eb851eb85= 1eb851, 1 }, +{ 0x7fffffffffffffff, 0x2, 0x3, 0x5555555555= 555554, 1 }, +{ 0xffffffffffffffff, 0x2, 0x8000000000000000, = 0x3, 1 }, +{ 0xffffffffffffffff, 0x2, 0xc000000000000000, = 0x2, 1 }, +{ 0xffffffffffffffff, 0x4000000000000004, 0x8000000000000000, 0x8000000000= 000007, 1 }, +{ 0xffffffffffffffff, 0x4000000000000001, 0x8000000000000000, 0x8000000000= 000001, 1 }, +{ 0xffffffffffffffff, 0x8000000000000001, 0xffffffffffffffff, 0x8000000000= 000001, 0 }, +{ 0xfffffffffffffffe, 0x8000000000000001, 
0xffffffffffffffff, 0x8000000000= 000000, 1 }, +{ 0xffffffffffffffff, 0x8000000000000001, 0xfffffffffffffffe, 0x8000000000= 000001, 1 }, +{ 0xffffffffffffffff, 0x8000000000000001, 0xfffffffffffffffd, 0x8000000000= 000002, 1 }, +{ 0x7fffffffffffffff, 0xffffffffffffffff, 0xc000000000000000, 0xaaaaaaaaaa= aaaaa8, 1 }, +{ 0xffffffffffffffff, 0x7fffffffffffffff, 0xa000000000000000, 0xcccccccccc= ccccca, 1 }, +{ 0xffffffffffffffff, 0x7fffffffffffffff, 0x9000000000000000, 0xe38e38e38e= 38e38b, 1 }, +{ 0x7fffffffffffffff, 0x7fffffffffffffff, 0x5000000000000000, 0xcccccccccc= ccccc9, 1 }, +{ 0xffffffffffffffff, 0xfffffffffffffffe, 0xffffffffffffffff, 0xffffffffff= fffffe, 0 }, +{ 0xe6102d256d7ea3ae, 0x70a77d0be4c31201, 0xd63ec35ab3220357, 0x78f8bf8cc8= 6c6e18, 1 }, +{ 0xf53bae05cb86c6e1, 0x3847b32d2f8d32e0, 0xcfd4f55a647f403c, 0x42687f79d8= 998d35, 1 }, +{ 0x9951c5498f941092, 0x1f8c8bfdf287a251, 0xa3c8dc5f81ea3fe2, 0x1d887cb259= 00091f, 1 }, +{ 0x374fee9daa1bb2bb, 0x0d0bfbff7b8ae3ef, 0xc169337bd42d5179, 0x03bb2dbaff= cbb961, 1 }, +{ 0xeac0d03ac10eeaf0, 0x89be05dfa162ed9b, 0x92bb1679a41f0e4b, 0xdc5f5cc9e2= 70d216, 1 }, }; =20 /* * The above table can be verified with the following shell script: - * - * #!/bin/sh - * sed -ne 's/^{ \+\(.*\), \+\(.*\), \+\(.*\), \+\(.*\) },$/\1 \2 \3 \4/p'= \ - * lib/math/test_mul_u64_u64_div_u64.c | - * while read a b c r; do - * expected=3D$( printf "obase=3D16; ibase=3D16; %X * %X / %X\n" $a $b $= c | bc ) - * given=3D$( printf "%X\n" $r ) - * if [ "$expected" =3D "$given" ]; then - * echo "$a * $b / $c =3D $r OK" - * else - * echo "$a * $b / $c =3D $r is wrong" >&2 - * echo "should be equivalent to 0x$expected" >&2 - * exit 1 - * fi - * done + +#!/bin/sh +sed -ne 's/^{ \+\(.*\), \+\(.*\), \+\(.*\), \+\(.*\), \+\(.*\) },$/\1 \2 \= 3 \4 \5/p' \ + lib/math/test_mul_u64_u64_div_u64.c | +while read a b d r e; do + expected=3D$( printf "obase=3D16; ibase=3D16; %X * %X / %X\n" $a $b $d |= bc ) + given=3D$( printf "%X\n" $r ) + if [ "$expected" =3D 
"$given" ]; then + echo "$a * $b / $d =3D $r OK" + else + echo "$a * $b / $d =3D $r is wrong" >&2 + echo "should be equivalent to 0x$expected" >&2 + exit 1 + fi + expected=3D$( printf "obase=3D16; ibase=3D16; (%X * %X + %X) / %X\n" $a = $b $((d-1)) $d | bc ) + given=3D$( printf "%X\n" $((r + e)) ) + if [ "$expected" =3D "$given" ]; then + echo "$a * $b +/ $d =3D $(printf '%#x' $((r + e))) OK" + else + echo "$a * $b +/ $d =3D $(printf '%#x' $((r + e))) is wrong" >&2 + echo "should be equivalent to 0x$expected" >&2 + exit 1 + fi +done + */ =20 static int __init test_init(void) { + int errors =3D 0; + int tests =3D 0; int i; =20 pr_info("Starting mul_u64_u64_div_u64() test\n"); @@ -72,19 +84,31 @@ static int __init test_init(void) for (i =3D 0; i < ARRAY_SIZE(test_values); i++) { u64 a =3D test_values[i].a; u64 b =3D test_values[i].b; - u64 c =3D test_values[i].c; + u64 d =3D test_values[i].d; u64 expected_result =3D test_values[i].result; - u64 result =3D mul_u64_u64_div_u64(a, b, c); + u64 result =3D mul_u64_u64_div_u64(a, b, d); + u64 result_up =3D mul_u64_u64_div_u64_roundup(a, b, d); + + tests +=3D 2; =20 if (result !=3D expected_result) { - pr_err("ERROR: 0x%016llx * 0x%016llx / 0x%016llx\n", a, b, c); + pr_err("ERROR: 0x%016llx * 0x%016llx / 0x%016llx\n", a, b, d); pr_err("ERROR: expected result: %016llx\n", expected_result); pr_err("ERROR: obtained result: %016llx\n", result); + errors++; + } + expected_result +=3D test_values[i].round_up; + if (result_up !=3D expected_result) { + pr_err("ERROR: 0x%016llx * 0x%016llx +/ 0x%016llx\n", a, b, d); + pr_err("ERROR: expected result: %016llx\n", expected_result); + pr_err("ERROR: obtained result: %016llx\n", result_up); + errors++; } } =20 - pr_info("Completed mul_u64_u64_div_u64() test\n"); - return 0; + pr_info("Completed mul_u64_u64_div_u64() test, %d tests, %d errors\n", + tests, errors); + return errors ? 
-EINVAL : 0; } =20 static void __exit test_exit(void) --=20 2.39.5 From nobody Sun Dec 14 11:13:48 2025 Received: from mail-wm1-f52.google.com (mail-wm1-f52.google.com [209.85.128.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 691193446B5 for ; Wed, 29 Oct 2025 17:39:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1761759561; cv=none; b=lXSg1jvVzS59V961iMT5E2kA+MAoxHSin30Eg172BBQDfE19PQt3LkLIDLXNd7VvjluCGpR/tdt/dLha5ZtRvs92jThLIQ0NRMQLu33hPIpNLWED06v0RTd5GKXdlkIa63Tuu0T6za/iBEJZ3+ZmcrqG7fXscphktxVK3K2OQVs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1761759561; c=relaxed/simple; bh=UCLOuHRe9JTMAMFSws4hOXFjQBZHy75iVHd2PkK7mQg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=kqvz+HCBm1NPByXSK/IHDQq7xXJdvbSKTQVnD3bLhEjFc1rydeIjO3R3jx9TMgfvf6rCjSXbGaqYB6N6Z7SCiL2MY/PQJTWcA9nJtkIDJaxr2nu3nxRwT/XqUVF2+7jqG7vq+ShL/HjjkQnd3WddDXm1feOkenZk9me8h2iw5oQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=IYkH+7LG; arc=none smtp.client-ip=209.85.128.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="IYkH+7LG" Received: by mail-wm1-f52.google.com with SMTP id 5b1f17b1804b1-471191ac79dso1126415e9.3 for ; Wed, 29 Oct 2025 10:39:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1761759557; x=1762364357; 
darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=edTaeeVtqohQLLYXZn1p4Mx6bseq0sNKUYhIKbAAn3o=; b=IYkH+7LGFNLhHyHKa+dXU8KQIvHIpzgyToNCMYvYTNG0g/Dj7OdTzTgPsrnjH387ub x4L9Jg/BEZaGSmF5BXDJlgigJ16gepBZRN0bhj9bwUylzQs7bvX0F3V/K1URG+sT9YYS S+gUOaCbS8bqmbk+t81bocWcFtMzmjgee20EEsnE3SRYRPVFBTEpFxh9M/dTlZSmAsp3 SfqTH1SnGV7qrGPUtPx1FnvVU7s+yNOxcj5PT5mNb7IWw7x60raqoAv6hSE6UirTkffz 3XUwREXGsUoYn6uTzC4a4OcSRZJXufTgXtn8wwS379WqDG6FzOqP4f496vw32nO+yfGr nJEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761759557; x=1762364357; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=edTaeeVtqohQLLYXZn1p4Mx6bseq0sNKUYhIKbAAn3o=; b=W6ZEgOabkprmrSpmlki4HTr9fEdEC7sgzcoiUKnOFkBHWE5C55XV4yeXjJKBDPVhTQ VWoI34+3nAC127oV3WxMn7utEHTrCnMPKmghCANeMN5EG6ES2pm2pvK412v19ImuD8MM WZZlMH7gyNEbZ9Wxa8YtaJmBdyNk3m+0mfMYmGmL422gf8ecoeTfFcat/CvLSD9Fa+ML AqoX4cQOErLayrn1Zz6XQ0c94E4eXuv5pqaiwHkHLxxvHQYW8b7lPJEzh2JtYPlcr3+n Ss6K0bgGhzydIz2wEeugcNUceEQWHkYDRiun/HAZoc/vnHF7w6Ffm7JEBvmR1/rMVCpT BXnA== X-Forwarded-Encrypted: i=1; AJvYcCXl2IeWL7Ru5/LvwocfMEpmKpR5U0aBsx3T1PW7tayrgpoAEnkMBndGdsu+YigTcyU2NTjrKN2e1Xcgyes=@vger.kernel.org X-Gm-Message-State: AOJu0YxMfrKGL7/lRPfpam0xZdoptA60qZdH4b5OIqSYwT4i/wY9p7pA p4Tqf3zatSpmeR4ks5uE00+mYXx7/5YYH33Fmo/ysCSKcF+g5PYtkaU8 X-Gm-Gg: ASbGnctqkYDrdIqfiH0c0ndpGeKm1c2Wj2Rij7LNPIGH3ujhm29bFQ0Cv6tKif39DwP 5Fd/08INJBgfs17ARtugrpqdeKJEQfRIiw3idehK5Zx1UGX53wzGRJNhxOAbhI1fn2uELS+Pc0Z tnUqxfpEK/rjASJWpqSpAuEWm4H6Spie8GRw0jHZo9YPJaeq3lm+LrbW2YnAm9IQ0Pse7G+00ft 5m+P5BDFIwsRB1gwYLi27+XfAG4a5wKyJrgXJVDQN2xINBpEvdUwKuXXmx5SdD54OUhHIRbDBxX Z5ockZYnrTlbKNX4Etc5fb3QuGpUWwTLPDGKo1ZhQrsGGkHfyyNGtOWDVi/N53nBLx3R2Jzqy3X Zz0KuZi2Gwz6xpGlCN7l0DRF/Cu1ab7XwzNTOWVrlPKKI3Cqbkti/LblcNHX6/HifyMXh343amR 
80/y2ht5rPgHOsEOtKTILlWR8LMrk84gYsKp3ZvS8JSYztmYysI9ndNx48fYWHTjwiD/RfTPg= X-Google-Smtp-Source: AGHT+IEa++kmTpR/Sg1CLMMa4kRHbHLSYdevatxIQ21q/SlUuM94yltxHoX05TYGMgJzOyf0gR8BYA== X-Received: by 2002:a05:600c:1d1c:b0:475:da1a:5418 with SMTP id 5b1f17b1804b1-4771e1c9da8mr30359655e9.1.1761759557231; Wed, 29 Oct 2025 10:39:17 -0700 (PDT) Received: from snowdrop.snailnet.com (82-69-66-36.dsl.in-addr.zen.co.uk. [82.69.66.36]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-4771e235ae1sm70646865e9.17.2025.10.29.10.39.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Oct 2025 10:39:17 -0700 (PDT) From: David Laight To: Andrew Morton , linux-kernel@vger.kernel.org Cc: David Laight , u.kleine-koenig@baylibre.com, Nicolas Pitre , Oleg Nesterov , Peter Zijlstra , Biju Das , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , Ingo Molnar , Thomas Gleixner , Li RongQing , Yu Kuai , Khazhismel Kumykov , Jens Axboe , x86@kernel.org Subject: [PATCH v4 next 6/9] lib: test_mul_u64_u64_div_u64: Test both generic and arch versions Date: Wed, 29 Oct 2025 17:38:25 +0000 Message-Id: <20251029173828.3682-7-david.laight.linux@gmail.com> X-Mailer: git-send-email 2.39.5 In-Reply-To: <20251029173828.3682-1-david.laight.linux@gmail.com> References: <20251029173828.3682-1-david.laight.linux@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Change the #if in div64.c so that test_mul_u64_u64_div_u64.c can compile and test the generic version (including the 'long multiply') on architectures (eg amd64) that define their own copy. Test the kernel version and the locally compiled version on all arch. Output the time taken (in ns) on the 'test completed' trace. For reference, on my zen 5, the optimised version takes ~220ns and the generic version ~3350ns. 
Using the native multiply saves ~200ns and adding back the ilog2() 'optimis= ation' test adds ~50ms. Signed-off-by: David Laight Reviewed-by: Nicolas Pitre --- Changes for v4: - Fix build on non x86 (eg arm32) lib/math/div64.c | 8 +++-- lib/math/test_mul_u64_u64_div_u64.c | 51 +++++++++++++++++++++++++---- 2 files changed, 50 insertions(+), 9 deletions(-) diff --git a/lib/math/div64.c b/lib/math/div64.c index 25295daebde9..f92e7160feb6 100644 --- a/lib/math/div64.c +++ b/lib/math/div64.c @@ -177,16 +177,18 @@ EXPORT_SYMBOL(div64_s64); * Iterative div/mod for use when dividend is not expected to be much * bigger than divisor. */ +#ifndef iter_div_u64_rem u32 iter_div_u64_rem(u64 dividend, u32 divisor, u64 *remainder) { return __iter_div_u64_rem(dividend, divisor, remainder); } EXPORT_SYMBOL(iter_div_u64_rem); +#endif =20 -#ifndef mul_u64_add_u64_div_u64 +#if !defined(mul_u64_add_u64_div_u64) || defined(test_mul_u64_add_u64_div_= u64) u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d) { -#if defined(__SIZEOF_INT128__) +#if defined(__SIZEOF_INT128__) && !defined(test_mul_u64_add_u64_div_u64) =20 /* native 64x64=3D128 bits multiplication */ u128 prod =3D (u128)a * b + c; @@ -267,5 +269,7 @@ u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d) =20 return res; } +#if !defined(test_mul_u64_add_u64_div_u64) EXPORT_SYMBOL(mul_u64_add_u64_div_u64); #endif +#endif diff --git a/lib/math/test_mul_u64_u64_div_u64.c b/lib/math/test_mul_u64_u6= 4_div_u64.c index 4d5e4e5dac67..a3c5e54f37ef 100644 --- a/lib/math/test_mul_u64_u64_div_u64.c +++ b/lib/math/test_mul_u64_u64_div_u64.c @@ -73,21 +73,34 @@ done =20 */ =20 -static int __init test_init(void) +static u64 test_mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d); + +static int __init test_run(unsigned int fn_no, const char *fn_name) { + u64 start_time; int errors =3D 0; int tests =3D 0; int i; =20 - pr_info("Starting mul_u64_u64_div_u64() test\n"); + start_time =3D ktime_get_ns(); =20 for (i =3D 0; i < 
ARRAY_SIZE(test_values); i++) { u64 a =3D test_values[i].a; u64 b =3D test_values[i].b; u64 d =3D test_values[i].d; u64 expected_result =3D test_values[i].result; - u64 result =3D mul_u64_u64_div_u64(a, b, d); - u64 result_up =3D mul_u64_u64_div_u64_roundup(a, b, d); + u64 result, result_up; + + switch (fn_no) { + default: + result =3D mul_u64_u64_div_u64(a, b, d); + result_up =3D mul_u64_u64_div_u64_roundup(a, b, d); + break; + case 1: + result =3D test_mul_u64_add_u64_div_u64(a, b, 0, d); + result_up =3D test_mul_u64_add_u64_div_u64(a, b, d - 1, d); + break; + } =20 tests +=3D 2; =20 @@ -106,15 +119,39 @@ static int __init test_init(void) } } =20 - pr_info("Completed mul_u64_u64_div_u64() test, %d tests, %d errors\n", - tests, errors); - return errors ? -EINVAL : 0; + pr_info("Completed %s() test, %d tests, %d errors, %llu ns\n", + fn_name, tests, errors, ktime_get_ns() - start_time); + return errors; +} + +static int __init test_init(void) +{ + pr_info("Starting mul_u64_u64_div_u64() test\n"); + if (test_run(0, "mul_u64_u64_div_u64")) + return -EINVAL; + if (test_run(1, "test_mul_u64_u64_div_u64")) + return -EINVAL; + return 0; } =20 static void __exit test_exit(void) { } =20 +/* Compile the generic mul_u64_add_u64_div_u64() code */ +#define __div64_32 __div64_32 +#define div_s64_rem div_s64_rem +#define div64_u64_rem div64_u64_rem +#define div64_u64 div64_u64 +#define div64_s64 div64_s64 +#define iter_div_u64_rem iter_div_u64_rem + +#undef mul_u64_add_u64_div_u64 +#define mul_u64_add_u64_div_u64 test_mul_u64_add_u64_div_u64 +#define test_mul_u64_add_u64_div_u64 test_mul_u64_add_u64_div_u64 + +#include "div64.c" + module_init(test_init); module_exit(test_exit); =20 --=20 2.39.5 From nobody Sun Dec 14 11:13:48 2025 Received: from mail-wm1-f49.google.com (mail-wm1-f49.google.com [209.85.128.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 07CEA1A9F96 
for ; Wed, 29 Oct 2025 17:39:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.49 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1761759562; cv=none; b=aUpERoXIy+AGyRs86aMCiJHX4maH01f8tG7iYRJYlJGkOl7umGxsCZ3hG+hL4/+jrccxSWvxz+m/2Ch7IHrYCFS1rfJ+N+BAVLyvUTD1L8fq2uTKHmNBo4jDD1QbHFkIFbWbIYHZHpSQixuV1hjALdfjvn+r9DEWriMbH2VvswQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1761759562; c=relaxed/simple; bh=Z3FqXguUg8T1r6oH2VNVIbBBg9pZco2x6EyPSCHtG0E=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=uO7RQpRTsXBywRF5BhcPPXTb7AYmanywqoCCN1Bf3Ha9uIKvBcrLwNh3FrvmhNdsCJSbLRkVXIh1yENNr91o48w1sb0l9LiI0PZyU4taxLUK9FyZZonKYq3Le8EYBmLaAU84SWzXtC8IqrPEcFn+/hB3ai23tTtlXFSu0gfLGOw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=ltoonu9h; arc=none smtp.client-ip=209.85.128.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ltoonu9h" Received: by mail-wm1-f49.google.com with SMTP id 5b1f17b1804b1-475dc0ed8aeso727595e9.2 for ; Wed, 29 Oct 2025 10:39:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1761759558; x=1762364358; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=6rOHLgMiqxXYiPO+1o8K7NwRMVoG3THlCo2i4GBN/+c=; b=ltoonu9hAlLp9v92S7hSN4zqLU/GbhG/OY74v0EkbC68LXrvb7/1qoIkYRvY4p3fwK N1cWZngoppn2TQxZQw7if2WieF3qkaaXueiA2tjmIELOAOrAe5Qp1ZeZkoyZE60sXcQf 
nPUqLwfaam9lRs3WIC1dHDRIZJy5U821VQw7LCuu0vaVbX25Rrd2KGhPaSGw9bSq11aG MPumSyrMUL9Qp2p0ONt46uRJNhIZ9nRQP9lx2sYKac3bjUyfjizmvOgwChvXXLnOliTr mjVDxwclc5IAQTHjWtwbeq234xq+x/V7YXLNQuBknUFNQdDGnVjM9LQggB9Iuh/Siwi9 +Hcg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761759558; x=1762364358; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=6rOHLgMiqxXYiPO+1o8K7NwRMVoG3THlCo2i4GBN/+c=; b=b/7EgPpZ04euv9+AzmfCqB2fRkWsZCWuRMvN9q4bpclsV+jShiz9PEKrTGl2mPGC4r zTR4Oyiw78yiu3h+/jAbKttwTq2TGNfg9UyUE+ullmGjw4JTjB54vQjIb5kID9ShJP43 8eqCQM/2PvgWzPU/2SGQwESLEtNt7PGaR6NML5M6fwkqWCrUhSh1UbBoIeKhnehaBFkg +UpRj8sZotbpdeTX8uFartNOnJbUQ28TpDN/XkyyTC6YDHnRbYbMfwxhg3uOP7zLevCT C/SDtC9hjUifu8Q5nDznWNBTA+VTu6/FfZEl6/2wa2wZxMtN+RQuSqeYlClC0ogjyfzq X1NA== X-Forwarded-Encrypted: i=1; AJvYcCVPjGeSQsip7yHC9CV8TcvLLpXSk9r2q2zWBFfB5HoqPxigqbPMKNJQYu0C43XQDGZgyNJ6vEKfxlcHu0k=@vger.kernel.org X-Gm-Message-State: AOJu0YzfCpxK2ysWmCE4LxG1zuZGPyHsqHKD9XUpshBkfXIJwBVgSebY FFM8sSiRb8fZcblqRBazJmkWMgIJEjzXU+ncgIbia4X9zaEYkQ/q/ACb X-Gm-Gg: ASbGncvupSRJ/ENQieY2X4vw2uWuLkH6VWHRlHvVh/iW4wHs+9GxCZMIcoftEnNlItE mVlkDxZFiIQasEulapsvApygbTXggPk5c1fO0kIYrkCyPZBEJpqZpY5uJ0vq3vgq0nR6cgZ1eLC Lh3MIrk3hVP8944Rvqg4iar+ZH2zNKy1bjHbcc6ZrOBPNQz9yrkGczEGL51Phbt92tSU47Twhpg /HB5tZacSZRqy0TJCvD9tVL46Io5zfNNR1Kipyh+DzOisFAO6RS7RqYHFKKDlM/KaAu5rfG2q/8 AQtGZJqO8P0ZvqgP/+QdKyEjmM2hvwAOAvyiSdXf5vZjEalEkwPOwH1X3ubCtbhIWqsFdhyFW+s LoCPkhlIXEA56MbCQS8dqsSp7aR3wvN3obmFGFV40U8SVP7YXBzDksYsW46b0n6WY3PqxT7U+KH 6KXZgcBBwzXbMoZFc3QeXCG5VUZcj/d4jGpqk3WXx9y9AWCzX4+gWbE3Bg0P0M X-Google-Smtp-Source: AGHT+IEu+ef78TU5BohY955TnLfLd2FOTrbixoN6WxzQUUhRNw24BuA/i4qVmfrFqxuIN5aC3Q1pwA== X-Received: by 2002:a05:600c:628e:b0:477:19ad:1c43 with SMTP id 5b1f17b1804b1-477267115fcmr2089875e9.5.1761759558078; Wed, 29 Oct 2025 10:39:18 -0700 (PDT) Received: from snowdrop.snailnet.com 
(82-69-66-36.dsl.in-addr.zen.co.uk. [82.69.66.36]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-4771e235ae1sm70646865e9.17.2025.10.29.10.39.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Oct 2025 10:39:17 -0700 (PDT) From: David Laight To: Andrew Morton , linux-kernel@vger.kernel.org Cc: David Laight , u.kleine-koenig@baylibre.com, Nicolas Pitre , Oleg Nesterov , Peter Zijlstra , Biju Das , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , Ingo Molnar , Thomas Gleixner , Li RongQing , Yu Kuai , Khazhismel Kumykov , Jens Axboe , x86@kernel.org Subject: [PATCH v4 next 7/9] lib: mul_u64_u64_div_u64() optimise multiply on 32bit x86 Date: Wed, 29 Oct 2025 17:38:26 +0000 Message-Id: <20251029173828.3682-8-david.laight.linux@gmail.com> X-Mailer: git-send-email 2.39.5 In-Reply-To: <20251029173828.3682-1-david.laight.linux@gmail.com> References: <20251029173828.3682-1-david.laight.linux@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" gcc generates horrid code for both ((u64)u32_a * u32_b) and (u64_a + u32_b). As well as the extra instructions it can generate a lot of spills to stack (including spills of constant zeros and even multiplies by constant zero). mul_u32_u32() already exists to optimise the multiply. Add a similar add_u64_u32() for the addition. Disable both for clang - it generates better code without them. Move the 64x64 =3D> 128 multiply into a static inline helper function for code clarity. No need for the a/b_hi/lo variables, the implicit casts on the function calls do the work for us. Should have minimal effect on the generated code. Use mul_u32_u32() and add_u64_u32() in the 64x64 =3D> 128 multiply in mul_u64_add_u64_div_u64(). Signed-off-by: David Laight Reviewed-by: Nicolas Pitre --- Changes for v4: - merge in patch 8.
- Add comments about gcc being 'broken' for mixed 32/64 bit maths. clang doesn't have the same issues. - use a #define for mul_add() to avoid 'defined but not used' errors. arch/x86/include/asm/div64.h | 19 +++++++++++++++++ include/linux/math64.h | 11 ++++++++++ lib/math/div64.c | 40 +++++++++++++++++++++++------------- 3 files changed, 56 insertions(+), 14 deletions(-) diff --git a/arch/x86/include/asm/div64.h b/arch/x86/include/asm/div64.h index cabdc2d5a68f..a18c045aa8a1 100644 --- a/arch/x86/include/asm/div64.h +++ b/arch/x86/include/asm/div64.h @@ -60,6 +60,12 @@ static inline u64 div_u64_rem(u64 dividend, u32 divisor,= u32 *remainder) } #define div_u64_rem div_u64_rem =20 +/* + * gcc tends to zero extend 32bit values and do full 64bit maths. + * Define asm functions that avoid this. + * (clang generates better code for the C versions.) + */ +#ifndef __clang__ static inline u64 mul_u32_u32(u32 a, u32 b) { u32 high, low; @@ -71,6 +77,19 @@ static inline u64 mul_u32_u32(u32 a, u32 b) } #define mul_u32_u32 mul_u32_u32 =20 +static inline u64 add_u64_u32(u64 a, u32 b) +{ + u32 high =3D a >> 32, low =3D a; + + asm ("addl %[b], %[low]; adcl $0, %[high]" + : [low] "+r" (low), [high] "+r" (high) + : [b] "rm" (b) ); + + return low | (u64)high << 32; +} +#define add_u64_u32 add_u64_u32 +#endif + /* * __div64_32() is never called on x86, so prevent the * generic definition from getting built. diff --git a/include/linux/math64.h b/include/linux/math64.h index e889d850b7f1..cc305206d89f 100644 --- a/include/linux/math64.h +++ b/include/linux/math64.h @@ -158,6 +158,17 @@ static inline u64 mul_u32_u32(u32 a, u32 b) } #endif =20 +#ifndef add_u64_u32 +/* + * Many a GCC version also messes this up. + * Zero extending b and then spilling everything to stack.
+ */ +static inline u64 add_u64_u32(u64 a, u32 b) +{ + return a + b; +} +#endif + #if defined(CONFIG_ARCH_SUPPORTS_INT128) && defined(__SIZEOF_INT128__) =20 #ifndef mul_u64_u32_shr diff --git a/lib/math/div64.c b/lib/math/div64.c index f92e7160feb6..f6da7b5fb69e 100644 --- a/lib/math/div64.c +++ b/lib/math/div64.c @@ -186,33 +186,45 @@ EXPORT_SYMBOL(iter_div_u64_rem); #endif =20 #if !defined(mul_u64_add_u64_div_u64) || defined(test_mul_u64_add_u64_div_= u64) -u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d) -{ + +#define mul_add(a, b, c) add_u64_u32(mul_u32_u32(a, b), c) + #if defined(__SIZEOF_INT128__) && !defined(test_mul_u64_add_u64_div_u64) =20 +static inline u64 mul_u64_u64_add_u64(u64 *p_lo, u64 a, u64 b, u64 c) +{ /* native 64x64=3D128 bits multiplication */ u128 prod =3D (u128)a * b + c; - u64 n_lo =3D prod, n_hi =3D prod >> 64; + + *p_lo =3D prod; + return prod >> 64; +} =20 #else =20 - /* perform a 64x64=3D128 bits multiplication manually */ - u32 a_lo =3D a, a_hi =3D a >> 32, b_lo =3D b, b_hi =3D b >> 32; +static inline u64 mul_u64_u64_add_u64(u64 *p_lo, u64 a, u64 b, u64 c) +{ + /* perform a 64x64=3D128 bits multiplication in 32bit chunks */ u64 x, y, z; =20 /* Since (x-1)(x-1) + 2(x-1) =3D=3D x.x - 1 two u32 can be added to a u64= */ - x =3D (u64)a_lo * b_lo + (u32)c; - y =3D (u64)a_lo * b_hi + (u32)(c >> 32); - y +=3D (u32)(x >> 32); - z =3D (u64)a_hi * b_hi + (u32)(y >> 32); - y =3D (u64)a_hi * b_lo + (u32)y; - z +=3D (u32)(y >> 32); - x =3D (y << 32) + (u32)x; - - u64 n_lo =3D x, n_hi =3D z; + x =3D mul_add(a, b, c); + y =3D mul_add(a, b >> 32, c >> 32); + y =3D add_u64_u32(y, x >> 32); + z =3D mul_add(a >> 32, b >> 32, y >> 32); + y =3D mul_add(a >> 32, b, y); + *p_lo =3D (y << 32) + (u32)x; + return add_u64_u32(z, y >> 32); +} =20 #endif =20 +u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d) +{ + u64 n_lo, n_hi; + + n_hi =3D mul_u64_u64_add_u64(&n_lo, a, b, c); + if (unlikely(n_hi >=3D d)) { /* trigger runtime exception if divisor is 
zero */ if (d =3D=3D 0) { --=20 2.39.5 From nobody Sun Dec 14 11:13:48 2025 Received: from mail-wm1-f42.google.com (mail-wm1-f42.google.com [209.85.128.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B3E4E354ADA for ; Wed, 29 Oct 2025 17:39:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.42 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1761759563; cv=none; b=geIgO4II2wQsrE9VCoCjElQE0l4Hu+GC6pYT+LkeRNNyJre/rcYJ7EY/5czcgaeFCX4f2kh919F1pgmfFfr8cD0Lj5dtOm/N+OB/149J13OlCic4hiDonxmSaBgZ0jXDxA8tw3dLaJnyiy9ykfD3oA4lPs7YrNfqMz3+iijcafU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1761759563; c=relaxed/simple; bh=lGdJFEw9ZrLRN8+wHLowhZQJmVyVzgrAF4ZGEKbDKpw=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=CBzF3rIV0jVDWrN4CBnJLOiiOc/c5GzoBDoydvbB60+hKk3IWpj7k+JfPtWjtv9LrpbSMS7IzXL1qd2U/oh7CX65+5Ac543S09++s7mHNs/fCObFjwTBZLFVpWgb8aS39Ck0OryBvehYlnoP33ore9t1RlBGhVq2b/vuFnKRL7U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=jokPCtqR; arc=none smtp.client-ip=209.85.128.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="jokPCtqR" Received: by mail-wm1-f42.google.com with SMTP id 5b1f17b1804b1-475ca9237c2so608725e9.3 for ; Wed, 29 Oct 2025 10:39:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1761759559; x=1762364359; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=t4k8wTMJbnOldNXby8wSilNtSncUAyWpauPZkY67AaQ=; b=jokPCtqR/h15eEyN/nugtkfwpS2I2tvc+rwuFNw5JAvwVM7D/SLSNGlhfyShWXwEhn 4cEWGc3Oz1Rv0hdPxPQ0YLi9/kAfHQpoAQyIgE9Xx31JDMSQrJnouSXcPGcPLm7kE59u vx8iEBbCGtrETzDr7ihzuD0rGd6N1qle4f5qhySb1j0tHOmoF1+kn/Svr0fEBZSFGJzS WB7iKU+lVXooYde40y3yGvZbJEfDkE+WTfaMdhqruyG7cvzdwvbUxN2iLhOhiVzgZpEX dBkU9gFMfVPbsoODO6q881hhtqPx6dTVVVLRERVjDDMIewN/4EnS69MluShQql46wvSr O3xw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761759559; x=1762364359; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=t4k8wTMJbnOldNXby8wSilNtSncUAyWpauPZkY67AaQ=; b=TR1Vkd4GPhudob2tggP9/RhZ5uCIOBM7lsFV4aGbT7ZTwJ+nqUy54w6Q4+ZkYLKBgb zJDlKCrStPQcQ95BrtuZJZ/EtRfYUzT8mUQzW+UZA5/MUsYNNkt1k7fnoC99fpa0NT6X 2lqsoGvg3asw0NVo6MZw0BkcDCkonhuT8j+tDvWMn0azNd8V305gUrlvJ2yPUh+C0Pzr EEwWZGbvuYvaAfCLDZctzVf0EHZOjHcsBXEnADcBNqOfvSjIrrGexgLfntnTSmtkTEmB v8Skg+DA4yRTypbnOLtBAGOYwdsSkZE+K1NQ7nD5mc/rlVRWcREkCwJW1NHWN8VaoOvx whFg== X-Forwarded-Encrypted: i=1; AJvYcCUDte1/XLRaFjbyc0Rr1jF/cNsXGa6eis/IpqUNh3bjJWgcQVo8Zk1XIhMJTC4CEbKZLS8EhCJQMxJb1Ao=@vger.kernel.org X-Gm-Message-State: AOJu0YxxDrRtxx49XS89gD18Al33lpJjaTsw2jyDrpSq3VS0PUgkc6fX fTZ53F4+Bf/jDcLC/zxhLtXzAWA7gWd54Yie7hBmOKDHHfsMvvhziAvJ X-Gm-Gg: ASbGnculG+s0pyXpINRqo2AxEmP2+58Yi9f5u1Y2VRmECXn+gRNTA2OP5oUB4cSLVjA QPkMh3p+r7wxpKiQ2h9mzS5wx4UfawUFNtU1Tp8E+a+ayEimzkBW0+t5W/nOCtfTGEZto7/Zxtb 3FfIG0vbKXlqAiHWv+3hqKfqAghMb8Y8WagQ2fH4z/mHXNWg1CZ0LaUKi2bQkO9UoG8RaDGyZwb kghD29VegMqNMLV0P31bLw4eQIyruGIr64GbsuWZ2k+93BPcvnOxdSLK0J5YXRUkYXyAY/75HcZ un0sggG3xRBwX7rf4ua2xAgMX2Nj3UcUao1WHwJOjd+0LrWRmAW9kxNSEstk9LMPJVXXwNM4jEO zWRG3ueHk3SP5Elo2lQKyPCjx9ql1YphI6ZQz3Hp28IUNkyixxYIDqAvzNpHpnb8tCmv9eZXXBo 
2W1h85yLuZvEmBLEk5uRtRDA6CAFYFKPBs4nQ2yIgIq91a9faOwpFPCiLTU4KW X-Google-Smtp-Source: AGHT+IFtPujlMsLoj3kRnKO3734HIPu4CLpfKHbGaZyOmvULCHsRYFPy4ZY8ZVzUlcZNSEIxivN0Lw== X-Received: by 2002:a05:600c:3e08:b0:475:daa7:ec60 with SMTP id 5b1f17b1804b1-4771e1cc7e6mr37879755e9.21.1761759558836; Wed, 29 Oct 2025 10:39:18 -0700 (PDT) Received: from snowdrop.snailnet.com (82-69-66-36.dsl.in-addr.zen.co.uk. [82.69.66.36]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-4771e235ae1sm70646865e9.17.2025.10.29.10.39.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Oct 2025 10:39:18 -0700 (PDT) From: David Laight To: Andrew Morton , linux-kernel@vger.kernel.org Cc: David Laight , u.kleine-koenig@baylibre.com, Nicolas Pitre , Oleg Nesterov , Peter Zijlstra , Biju Das , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , Ingo Molnar , Thomas Gleixner , Li RongQing , Yu Kuai , Khazhismel Kumykov , Jens Axboe , x86@kernel.org Subject: [PATCH v4 next 8/9] lib: mul_u64_u64_div_u64() Optimise the divide code Date: Wed, 29 Oct 2025 17:38:27 +0000 Message-Id: <20251029173828.3682-9-david.laight.linux@gmail.com> X-Mailer: git-send-email 2.39.5 In-Reply-To: <20251029173828.3682-1-david.laight.linux@gmail.com> References: <20251029173828.3682-1-david.laight.linux@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Replace the bit by bit algorithm with one that generates 16 bits per iteration on 32bit architectures and 32 bits on 64bit ones. On my zen 5 this reduces the time for the tests (using the generic code) from ~3350ns to ~1000ns. Running the 32bit algorithm on 64bit x86 takes ~1500ns. It'll be slightly slower on a real 32bit system, mostly due to register pressure. The savings for 32bit x86 are much higher (tested in userspace). 
The worst case (lots of bits in the quotient) drops from ~900 clocks
to ~130 (pretty much independent of the arguments).
Other 32bit architectures may see better savings.

It is possible to optimise for divisors that span less than
__LONG_WIDTH__/2 bits. However, I suspect they don't happen that often
and it doesn't remove any slow cpu divide instructions which dominate
the result.

Typical improvements for 64bit random divides:
               old     new
sandy bridge:  470     150
haswell:       400     144
piledriver:    960     467   I think rdpmc is very slow.
zen5:          244      80
(Timing is 'rdpmc; mul_div(); rdpmc' with the multiply depending on the
first rdpmc and the second rdpmc depending on the quotient.)

Signed-off-by: David Laight
Reviewed-by: Nicolas Pitre
---
Algorithm unchanged from v3.

 lib/math/div64.c | 124 ++++++++++++++++++++++++++++++++---------------
 1 file changed, 85 insertions(+), 39 deletions(-)

diff --git a/lib/math/div64.c b/lib/math/div64.c
index f6da7b5fb69e..4e4e962261c3 100644
--- a/lib/math/div64.c
+++ b/lib/math/div64.c
@@ -190,7 +190,6 @@ EXPORT_SYMBOL(iter_div_u64_rem);
 #define mul_add(a, b, c) add_u64_u32(mul_u32_u32(a, b), c)
=20
 #if defined(__SIZEOF_INT128__) && !defined(test_mul_u64_add_u64_div_u64)
-
 static inline u64 mul_u64_u64_add_u64(u64 *p_lo, u64 a, u64 b, u64 c)
 {
 	/* native 64x64=3D128 bits multiplication */
@@ -199,9 +198,7 @@ static inline u64 mul_u64_u64_add_u64(u64 *p_lo, u64 a,=
 u64 b, u64 c)
 	*p_lo =3D prod;
 	return prod >> 64;
 }
-
 #else
-
 static inline u64 mul_u64_u64_add_u64(u64 *p_lo, u64 a, u64 b, u64 c)
 {
 	/* perform a 64x64=3D128 bits multiplication in 32bit chunks */
@@ -216,12 +213,37 @@ static inline u64 mul_u64_u64_add_u64(u64 *p_lo, u64 =
a, u64 b, u64 c)
 	*p_lo =3D (y << 32) + (u32)x;
 	return add_u64_u32(z, y >> 32);
 }
+#endif
+
+#ifndef BITS_PER_ITER
+#define BITS_PER_ITER (__LONG_WIDTH__ >=3D 64 ? 32 : 16)
+#endif
+
+#if BITS_PER_ITER =3D=3D 32
+#define mul_u64_long_add_u64(p_lo, a, b, c) mul_u64_u64_add_u64(p_lo, a, b=
, c)
+#define add_u64_long(a, b) ((a) + (b))
+#else
+#undef BITS_PER_ITER
+#define BITS_PER_ITER 16
+static inline u32 mul_u64_long_add_u64(u64 *p_lo, u64 a, u32 b, u64 c)
+{
+	u64 n_lo =3D mul_add(a, b, c);
+	u64 n_med =3D mul_add(a >> 32, b, c >> 32);
+
+	n_med =3D add_u64_u32(n_med, n_lo >> 32);
+	*p_lo =3D n_med << 32 | (u32)n_lo;
+	return n_med >> 32;
+}
=20
+#define add_u64_long(a, b) add_u64_u32(a, b)
 #endif
=20
 u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d)
 {
-	u64 n_lo, n_hi;
+	unsigned long d_msig, q_digit;
+	unsigned int reps, d_z_hi;
+	u64 quotient, n_lo, n_hi;
+	u32 overflow;
=20
 	n_hi =3D mul_u64_u64_add_u64(&n_lo, a, b, c);
=20
@@ -240,46 +262,70 @@ u64 mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 =
d)
 	if (!n_hi)
 		return div64_u64(n_lo, d);
=20
-	int shift =3D __builtin_ctzll(d);
-
-	/* try reducing the fraction in case the dividend becomes <=3D 64 bits */
-	if ((n_hi >> shift) =3D=3D 0) {
-		u64 n =3D shift ? (n_lo >> shift) | (n_hi << (64 - shift)) : n_lo;
-
-		return div64_u64(n, d >> shift);
-		/*
-		 * The remainder value if needed would be:
-		 *   res =3D div64_u64_rem(n, d >> shift, &rem);
-		 *   rem =3D (rem << shift) + (n_lo - (n << shift));
-		 */
+	/* Left align the divisor, shifting the dividend to match */
+	d_z_hi =3D __builtin_clzll(d);
+	if (d_z_hi) {
+		d <<=3D d_z_hi;
+		n_hi =3D n_hi << d_z_hi | n_lo >> (64 - d_z_hi);
+		n_lo <<=3D d_z_hi;
 	}
=20
-	/* Do the full 128 by 64 bits division */
-
-	shift =3D __builtin_clzll(d);
-	d <<=3D shift;
-
-	int p =3D 64 + shift;
-	u64 res =3D 0;
-	bool carry;
+	reps =3D 64 / BITS_PER_ITER;
+	/* Optimise loop count for small dividends */
+	if (!(u32)(n_hi >> 32)) {
+		reps -=3D 32 / BITS_PER_ITER;
+		n_hi =3D n_hi << 32 | n_lo >> 32;
+		n_lo <<=3D 32;
+	}
+#if BITS_PER_ITER =3D=3D 16
+	if (!(u32)(n_hi >> 48)) {
+		reps--;
+		n_hi =3D add_u64_u32(n_hi << 16, n_lo >> 48);
+		n_lo <<=3D 16;
+	}
+#endif
=20
-	do {
-		carry =3D n_hi >> 63;
-		shift =3D carry ? 1 : __builtin_clzll(n_hi);
-		if (p < shift)
-			break;
-		p -=3D shift;
-		n_hi <<=3D shift;
-		n_hi |=3D n_lo >> (64 - shift);
-		n_lo <<=3D shift;
-		if (carry || (n_hi >=3D d)) {
-			n_hi -=3D d;
-			res |=3D 1ULL << p;
+	/* Invert the dividend so we can use add instead of subtract. */
+	n_lo =3D ~n_lo;
+	n_hi =3D ~n_hi;
+
+	/*
+	 * Get the most significant BITS_PER_ITER bits of the divisor.
+	 * This is used to get a low 'guestimate' of the quotient digit.
+	 */
+	d_msig =3D (d >> (64 - BITS_PER_ITER)) + 1;
+
+	/*
+	 * Now do a 'long division' with BITS_PER_ITER bit 'digits'.
+	 * The 'guess' quotient digit can be low and BITS_PER_ITER+1 bits.
+	 * The worst case is dividing ~0 by 0x8000 which requires two subtracts.
+	 */
+	quotient =3D 0;
+	while (reps--) {
+		q_digit =3D (unsigned long)(~n_hi >> (64 - 2 * BITS_PER_ITER)) / d_msig;
+		/* Shift 'n' left to align with the product q_digit * d */
+		overflow =3D n_hi >> (64 - BITS_PER_ITER);
+		n_hi =3D add_u64_u32(n_hi << BITS_PER_ITER, n_lo >> (64 - BITS_PER_ITER)=
);
+		n_lo <<=3D BITS_PER_ITER;
+		/* Add product to negated divisor */
+		overflow +=3D mul_u64_long_add_u64(&n_hi, d, q_digit, n_hi);
+		/* Adjust for the q_digit 'guestimate' being low */
+		while (overflow < 0xffffffff >> (32 - BITS_PER_ITER)) {
+			q_digit++;
+			n_hi +=3D d;
+			overflow +=3D n_hi < d;
 		}
-	} while (n_hi);
-	/* The remainder value if needed would be n_hi << p */
+		quotient =3D add_u64_long(quotient << BITS_PER_ITER, q_digit);
+	}
=20
-	return res;
+	/*
+	 * The above only ensures the remainder doesn't overflow,
+	 * it can still be possible to add (aka subtract) another copy
+	 * of the divisor.
+	 */
+	if ((n_hi + d) > n_hi)
+		quotient++;
+	return quotient;
 }
 #if !defined(test_mul_u64_add_u64_div_u64)
 EXPORT_SYMBOL(mul_u64_add_u64_div_u64);
--=20
2.39.5

From nobody Sun Dec 14 11:13:48 2025
Received: from mail-wm1-f42.google.com (mail-wm1-f42.google.com [209.85.128.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DF30735581F for ; Wed, 29 Oct 2025 17:39:21 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.42
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1761759565; cv=none; b=PSgWP/AgVvPLkIE4WQY9Qh6F8bP3JLKEsH+DeWsFeX+/LTg+PyLFuwWbIRGF6TaqV7HvYeKZkzGHjcvXi4ldVxhsWXCp93MkwID4b+CDTH1u9aISN/atjTbVo4zeooeJ3whdJukS3JG2IXJRdmlxsUYJ5ANI6HBT/SqrQY7f4Gw=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1761759565; c=relaxed/simple; bh=8vI5gDqoxAXmmgm15UQmqAAAqKT3ptVUQ39b7WRvVn4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version;
b=oFjWvU5Opcg0DdotepDN704GdOvf8VpR2tNd7Fs8yUgA0lVoSUo0VSkEZGmCLFjBw14mWI1U3P4pgoa+GKu2/XnArMftU59EEtp8239/yITzfs/3JjVGHzlv4EBQz3PPnZBZ1NgSF478UiIWpt5T1ZU0JhJGLfEXR4HOHczy1hk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=hnM67xjS; arc=none smtp.client-ip=209.85.128.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="hnM67xjS" Received: by mail-wm1-f42.google.com with SMTP id 5b1f17b1804b1-4770c34ca8eso840795e9.0 for ; Wed, 29 Oct 2025 10:39:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1761759560; x=1762364360; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=u0RMEDdukqBuP/8S5mfLuGm61Krda2i2EoSkBYnqxhg=; b=hnM67xjSIAJ/s4ndKJdaspYTxuFl44namtLAEEKaJ0MpU8wREAPKf8GEud2U2BWeV4 4LGUaS7SLbzBhM7dxr9w+RYO0MoyHIkmh/ImV0RXfj/r2wDWT1b21yh86A3aBgh9ITsn u6ayCdPfUlHO1NgupUknyOBsCW2syVBSfuKYNqnQk3UzQmQ7F/vHo92nxJeo1f9KD7xh 3bq8SAo8OkqBq7CyeGuC8tn1JFYe2Xh4C8krr10Bk1wayTZ2NDRHSJRg10+4+3rzMzAz J5w7KZuuQ0wilGbGNy6BV4r3pC7/njmYZuaRBQPsMWLff3eMBhkEgEcjIy19280Ci5mq 8LSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761759560; x=1762364360; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=u0RMEDdukqBuP/8S5mfLuGm61Krda2i2EoSkBYnqxhg=; b=nVLCrKk/tQwLhC9aEd+5SPc4pDofD9nzIvW/MUHlDQnbRM0Re21vUoHXzxX5lEGC4H 
wJWXkZtP4K0aecdq+DUyTnmqWMQC0YpxDBPbSYy8v+x89FKuxRzRmLE1TFqhLFYPZia0 PBBitLfqcK8eVl+UWl4hJbpCK9+/2i0gw2B+vQ11OlTQv2uOgy8SvQGOTbPIcydL+Cah 2FZivPEryBeq8tpoPNkWaMipaj4vPSpyCjt7IFXxXE0TAyKHNvDvf2OCc7gTlNXAQzR8 YJ3Z4V9Z/EXbZ1zNTOqzKYLS5F/NAZabcPNrT9nSFUb/lIsyB2UEoMDc/ihJstFsKGXS 6rnA== X-Forwarded-Encrypted: i=1; AJvYcCVjmKXWKj11V5QazuB6EjmyOzTbQMHCt+2/3UNJlXMkNBJbGNeabRW2001h50mIdxMLPi5vT1roiIlWbsE=@vger.kernel.org X-Gm-Message-State: AOJu0Yz073EFaQRxZVUAjgSbRTBcdUINEwmArAlLii7UI+SP+M+YPwgr zMgZsvw5bZlXAgm3AGpcAtFOE1jPKUpA9bPiVprnoH8VtBpV2P3RRgyM X-Gm-Gg: ASbGncvUqg1SzavUQBEjTIKwCd72u2I7EXn1XCgg+wnV9iF7yS7BILBSHpOeSuAufxO HYwHWVPVl8B+b/BEaAU6ypXWrlvpY+sFyjedvCZGuFawuNtCrn4UO1igbYvLgKFUEQzh348B7+F Qeggdfda9uDXQUivT3/BM/SSQeQFtC3gF6aHyJpTrM0+4CAPvagnMYsijaQE2XK9OBkngDncU4X x0lx4EjY1uaHEbHidKS/IVbFXanPOd2zr5nRni53iuPwe49IQ9woe9U9ghLcMdujaDa60t1yq6k vNppHN8sEL9VHXYEHHAlhtZOQ5xEEEsyFxkO3SrV4PoHsHnQbU5+TKBV/ZkVPCbZCg7sBNY6dFS ax7EuaflQBLFmRdQd6Bz2I8NqUGro1JyJBQct13mw6DE+UJw4RjKfbRIsqnIC5eyBOT9U55Xv/A 4RpbXR+Y+2n/xvE0hmPWD15mO8aCPwVqlHo8U4i+sAKtmcgxr+hyovbDZOxJLy X-Google-Smtp-Source: AGHT+IFYEqHaRrv4oVtIJptEbue5rUEn2kWh/F/sNXC3+oJT4xx8jZmPLAz6pZX/kKSb6t0iSAncwg== X-Received: by 2002:a05:600c:6748:b0:46e:1abc:1811 with SMTP id 5b1f17b1804b1-4771e3b7cf5mr33981705e9.27.1761759559709; Wed, 29 Oct 2025 10:39:19 -0700 (PDT) Received: from snowdrop.snailnet.com (82-69-66-36.dsl.in-addr.zen.co.uk. [82.69.66.36]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-4771e235ae1sm70646865e9.17.2025.10.29.10.39.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Oct 2025 10:39:19 -0700 (PDT) From: David Laight To: Andrew Morton , linux-kernel@vger.kernel.org Cc: David Laight , u.kleine-koenig@baylibre.com, Nicolas Pitre , Oleg Nesterov , Peter Zijlstra , Biju Das , Borislav Petkov , Dave Hansen , "H. 
Peter Anvin" , Ingo Molnar , Thomas Gleixner , Li RongQing , Yu Kuai , Khazhismel Kumykov , Jens Axboe , x86@kernel.org
Subject: [PATCH v4 next 9/9] lib: test_mul_u64_u64_div_u64: Test the 32bit code on 64bit
Date: Wed, 29 Oct 2025 17:38:28 +0000
Message-Id: <20251029173828.3682-10-david.laight.linux@gmail.com>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20251029173828.3682-1-david.laight.linux@gmail.com>
References: <20251029173828.3682-1-david.laight.linux@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

There are slight differences in the mul_u64_add_u64_div_u64() code
between 32bit and 64bit systems. Compile and test the 32bit version on
64bit hosts for better test coverage.

Signed-off-by: David Laight
Reviewed-by: Nicolas Pitre
---
Changes for v4:
- Fix build on non x86-64

 lib/math/test_mul_u64_u64_div_u64.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/lib/math/test_mul_u64_u64_div_u64.c b/lib/math/test_mul_u64_u6=
4_div_u64.c
index a3c5e54f37ef..57d5c7158b10 100644
--- a/lib/math/test_mul_u64_u64_div_u64.c
+++ b/lib/math/test_mul_u64_u64_div_u64.c
@@ -74,6 +74,10 @@ done
  */
=20
 static u64 test_mul_u64_add_u64_div_u64(u64 a, u64 b, u64 c, u64 d);
+#if __LONG_WIDTH__ >=3D 64
+#define TEST_32BIT_DIV
+static u64 test_mul_u64_add_u64_div_u64_32bit(u64 a, u64 b, u64 c, u64 d);
+#endif
=20
 static int __init test_run(unsigned int fn_no, const char *fn_name)
 {
@@ -100,6 +104,12 @@ static int __init test_run(unsigned int fn_no, const c=
har *fn_name)
 			result =3D test_mul_u64_add_u64_div_u64(a, b, 0, d);
 			result_up =3D test_mul_u64_add_u64_div_u64(a, b, d - 1, d);
 			break;
+#ifdef TEST_32BIT_DIV
+		case 2:
+			result =3D test_mul_u64_add_u64_div_u64_32bit(a, b, 0, d);
+			result_up =3D test_mul_u64_add_u64_div_u64_32bit(a, b, d - 1, d);
+			break;
+#endif
 		}
=20
 		tests +=3D 2;
@@ -131,6 +141,10 @@ static int __init test_init(void)
 		return -EINVAL;
 	if (test_run(1, "test_mul_u64_u64_div_u64"))
 		return -EINVAL;
+#ifdef TEST_32BIT_DIV
+	if (test_run(2, "test_mul_u64_u64_div_u64_32bit"))
+		return -EINVAL;
+#endif
 	return 0;
 }
=20
@@ -152,6 +166,21 @@ static void __exit test_exit(void)
=20
 #include "div64.c"
=20
+#ifdef TEST_32BIT_DIV
+/* Recompile the generic code for 32bit long */
+#undef test_mul_u64_add_u64_div_u64
+#define test_mul_u64_add_u64_div_u64 test_mul_u64_add_u64_div_u64_32bit
+#undef BITS_PER_ITER
+#define BITS_PER_ITER 16
+
+#define mul_u64_u64_add_u64 mul_u64_u64_add_u64_32bit
+#undef mul_u64_long_add_u64
+#undef add_u64_long
+#undef mul_add
+
+#include "div64.c"
+#endif
+
 module_init(test_init);
 module_exit(test_exit);
=20
--=20
2.39.5
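
For anyone wanting to check the arithmetic outside the kernel, the following is a minimal userspace sketch of the 64x64+64 =3D> 128 bit multiply done in 32bit chunks, next to a native 128bit version to compare against, in the same spirit as 9/9 running the 32bit path on 64bit hosts. It is an illustration only, not the code from the series: the names mul_u64_u64_add_u64_chunked() and mul_u64_u64_add_u64_native() are made up for this note, mul_add() is a plain C function standing in for the macro, and unsigned __int128 is assumed to be available (GCC or Clang on a 64bit target).

```c
#include <stdint.h>

/*
 * Stand-in for the series' mul_add() macro (hypothetical userspace version).
 * (2^32-1)*(2^32-1) + (2^32-1) is below 2^64, so the sum cannot overflow.
 */
static inline uint64_t mul_add(uint32_t a, uint32_t b, uint32_t c)
{
	return (uint64_t)a * b + c;
}

/* a * b + c as a 128bit value built from 32x32=64 multiplies; returns the high 64 bits. */
static uint64_t mul_u64_u64_add_u64_chunked(uint64_t *p_lo, uint64_t a, uint64_t b, uint64_t c)
{
	uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
	uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);
	uint64_t x, y, z;

	/* (x-1)(x-1) + 2(x-1) == x.x - 1, so two u32 can be added to each product */
	x = mul_add(a_lo, b_lo, (uint32_t)c);
	y = mul_add(a_lo, b_hi, (uint32_t)(c >> 32));
	y += (uint32_t)(x >> 32);		/* second u32 added to the same product */
	z = mul_add(a_hi, b_hi, (uint32_t)(y >> 32));
	y = mul_add(a_hi, b_lo, (uint32_t)y);
	*p_lo = (y << 32) + (uint32_t)x;
	return z + (uint32_t)(y >> 32);
}

/* Reference version; assumes the compiler provides unsigned __int128. */
static uint64_t mul_u64_u64_add_u64_native(uint64_t *p_lo, uint64_t a, uint64_t b, uint64_t c)
{
	unsigned __int128 prod = (unsigned __int128)a * b + c;

	*p_lo = (uint64_t)prod;
	return (uint64_t)(prod >> 64);
}
```

Calling both functions with the same (a, b, c) and comparing the returned high word and *p_lo should give identical results. The corner a =3D=3D b =3D=3D c =3D=3D ~0 is the interesting one because every partial sum sits at its maximum there.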