From nobody Thu Nov 28 04:52:38 2024
From: Nicolas Pitre
To: Arnd Bergmann, Russell King
Cc: Nicolas Pitre, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 1/4] lib/math/test_div64: add some edge cases relevant to __div64_const32()
Date: Thu, 3 Oct 2024 17:16:13 -0400
Message-ID: <20241003211829.2750436-2-nico@fluxnic.net>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20241003211829.2750436-1-nico@fluxnic.net>
References: <20241003211829.2750436-1-nico@fluxnic.net>

From: Nicolas Pitre

Be sure to test the extreme cases with and without bias.

Signed-off-by: Nicolas Pitre
---
 lib/math/test_div64.c | 85 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 83 insertions(+), 2 deletions(-)

diff --git a/lib/math/test_div64.c b/lib/math/test_div64.c
index c15edd688d..3cd699b654 100644
--- a/lib/math/test_div64.c
+++ b/lib/math/test_div64.c
@@ -26,6 +26,9 @@ static const u64 test_div64_dividends[] = {
 	0x0072db27380dd689,
 	0x0842f488162e2284,
 	0xf66745411d8ab063,
+	0xfffffffffffffffb,
+	0xfffffffffffffffc,
+	0xffffffffffffffff,
 };
 #define SIZE_DIV64_DIVIDENDS ARRAY_SIZE(test_div64_dividends)
 
@@ -37,7 +40,10 @@ static const u64 test_div64_dividends[] = {
 #define TEST_DIV64_DIVISOR_5 0x0008a880
 #define TEST_DIV64_DIVISOR_6 0x003fd3ae
 #define TEST_DIV64_DIVISOR_7 0x0b658fac
-#define TEST_DIV64_DIVISOR_8 0xdc08b349
+#define TEST_DIV64_DIVISOR_8 0x80000001
+#define TEST_DIV64_DIVISOR_9 0xdc08b349
+#define TEST_DIV64_DIVISOR_A 0xfffffffe
+#define TEST_DIV64_DIVISOR_B 0xffffffff
 
 static const u32 test_div64_divisors[] = {
 	TEST_DIV64_DIVISOR_0,
@@ -49,13 +55,16 @@ static const u32 test_div64_divisors[] = {
 	TEST_DIV64_DIVISOR_6,
 	TEST_DIV64_DIVISOR_7,
 	TEST_DIV64_DIVISOR_8,
+	TEST_DIV64_DIVISOR_9,
+	TEST_DIV64_DIVISOR_A,
+	TEST_DIV64_DIVISOR_B,
 };
 #define SIZE_DIV64_DIVISORS ARRAY_SIZE(test_div64_divisors)
 
 static const struct {
 	u64 quotient;
 	u32 remainder;
-} test_div64_results[SIZE_DIV64_DIVISORS][SIZE_DIV64_DIVIDENDS] = {
+} test_div64_results[SIZE_DIV64_DIVIDENDS][SIZE_DIV64_DIVISORS] = {
 	{
 		{ 0x0000000013045e47, 0x00000001 },
 		{ 0x000000000161596c, 0x00000030 },
@@ -65,6 +74,9 @@ static const struct {
 		{ 0x00000000000013c4, 0x0004ce80 },
 		{ 0x00000000000002ae, 0x001e143c },
 		{ 0x000000000000000f, 0x0033e56c },
+		{ 0x0000000000000001, 0x2b27507f },
+		{ 0x0000000000000000, 0xab275080 },
+		{ 0x0000000000000000, 0xab275080 },
 		{ 0x0000000000000000, 0xab275080 },
 	}, {
 		{ 0x00000001c45c02d1, 0x00000000 },
@@ -75,7 +87,10 @@ static const struct {
 		{ 0x000000000001d637, 0x0004e5d9 },
 		{ 0x0000000000003fc9, 0x000713bb },
 		{ 0x0000000000000165, 0x029abe7d },
+		{ 0x000000000000001f, 0x673c193a },
 		{ 0x0000000000000012, 0x6e9f7e37 },
+		{ 0x000000000000000f, 0xe73c1977 },
+		{ 0x000000000000000f, 0xe73c1968 },
 	}, {
 		{ 0x000000197a3a0cf7, 0x00000002 },
 		{ 0x00000001d9632e5c, 0x00000021 },
@@ -85,7 +100,10 @@ static const struct {
 		{ 0x00000000001a7bb3, 0x00072331 },
 		{ 0x00000000000397ad, 0x0002c61b },
 		{ 0x000000000000141e, 0x06ea2e89 },
+		{ 0x00000000000001ca, 0x4c0a72e7 },
 		{ 0x000000000000010a, 0xab002ad7 },
+		{ 0x00000000000000e5, 0x4c0a767b },
+		{ 0x00000000000000e5, 0x4c0a7596 },
 	}, {
 		{ 0x0000017949e37538, 0x00000001 },
 		{ 0x0000001b62441f37, 0x00000055 },
@@ -95,7 +113,10 @@ static const struct {
 		{ 0x0000000001882ec6, 0x0005cbf9 },
 		{ 0x000000000035333b, 0x0017abdf },
 		{ 0x00000000000129f1, 0x0ab4520d },
+		{ 0x0000000000001a87, 0x18ff0472 },
 		{ 0x0000000000000f6e, 0x8ac0ce9b },
+		{ 0x0000000000000d43, 0x98ff397f },
+		{ 0x0000000000000d43, 0x98ff2c3c },
 	}, {
 		{ 0x000011f321a74e49, 0x00000006 },
 		{ 0x0000014d8481d211, 0x0000005b },
@@ -105,7 +126,10 @@ static const struct {
 		{ 0x0000000012a88828, 0x00036c97 },
 		{ 0x000000000287f16f, 0x002c2a25 },
 		{ 0x00000000000e2cc7, 0x02d581e3 },
+		{ 0x0000000000014318, 0x2ee07d7f },
 		{ 0x000000000000bbf4, 0x1ba08c03 },
+		{ 0x000000000000a18c, 0x2ee303af },
+		{ 0x000000000000a18c, 0x2ee26223 },
 	}, {
 		{ 0x0000d8db8f72935d, 0x00000005 },
 		{ 0x00000fbd5aed7a2e, 0x00000002 },
@@ -115,7 +139,10 @@ static const struct {
 		{ 0x00000000e16b20fa, 0x0002a14a },
 		{ 0x000000001e940d22, 0x00353b2e },
 		{ 0x0000000000ab40ac, 0x06fba6ba },
+		{ 0x00000000000f3f70, 0x0af7eeda },
 		{ 0x000000000008debd, 0x72d98365 },
+		{ 0x0000000000079fb8, 0x0b166dba },
+		{ 0x0000000000079fb8, 0x0b0ece02 },
 	}, {
 		{ 0x000cc3045b8fc281, 0x00000000 },
 		{ 0x0000ed1f48b5c9fc, 0x00000079 },
@@ -125,7 +152,10 @@ static const struct {
 		{ 0x0000000d43fce827, 0x00082b09 },
 		{ 0x00000001ccaba11a, 0x0037e8dd },
 		{ 0x000000000a13f729, 0x0566dffd },
+		{ 0x0000000000e5b64e, 0x3728203b },
 		{ 0x000000000085a14b, 0x23d36726 },
+		{ 0x000000000072db27, 0x38f38cd7 },
+		{ 0x000000000072db27, 0x3880b1b0 },
 	}, {
 		{ 0x00eafeb9c993592b, 0x00000001 },
 		{ 0x00110e5befa9a991, 0x00000048 },
@@ -135,7 +165,10 @@ static const struct {
 		{ 0x000000f4459740fc, 0x00084484 },
 		{ 0x0000002122c47bf9, 0x002ca446 },
 		{ 0x00000000b9936290, 0x004979c4 },
+		{ 0x000000001085e910, 0x05a83974 },
 		{ 0x00000000099ca89d, 0x9db446bf },
+		{ 0x000000000842f488, 0x26b40b94 },
+		{ 0x000000000842f488, 0x1e71170c },
 	}, {
 		{ 0x1b60cece589da1d2, 0x00000001 },
 		{ 0x01fcb42be1453f5b, 0x0000004f },
@@ -145,7 +178,49 @@ static const struct {
 		{ 0x00001c757dfab350, 0x00048863 },
 		{ 0x000003dc4979c652, 0x00224ea7 },
 		{ 0x000000159edc3144, 0x06409ab3 },
+		{ 0x00000001ecce8a7e, 0x30bc25e5 },
 		{ 0x000000011eadfee3, 0xa99c48a8 },
+		{ 0x00000000f6674543, 0x0a593ae9 },
+		{ 0x00000000f6674542, 0x13f1f5a5 },
+	}, {
+		{ 0x1c71c71c71c71c71, 0x00000002 },
+		{ 0x0210842108421084, 0x0000000b },
+		{ 0x007f01fc07f01fc0, 0x000000fb },
+		{ 0x00014245eabf1f9a, 0x0000a63d },
+		{ 0x0000ffffffffffff, 0x0000fffb },
+		{ 0x00001d913cecc509, 0x0007937b },
+		{ 0x00000402c70c678f, 0x0005bfc9 },
+		{ 0x00000016766cb70b, 0x045edf97 },
+		{ 0x00000001fffffffb, 0x80000000 },
+		{ 0x0000000129d84b3a, 0xa2e8fe71 },
+		{ 0x0000000100000001, 0xfffffffd },
+		{ 0x0000000100000000, 0xfffffffb },
+	}, {
+		{ 0x1c71c71c71c71c71, 0x00000003 },
+		{ 0x0210842108421084, 0x0000000c },
+		{ 0x007f01fc07f01fc0, 0x000000fc },
+		{ 0x00014245eabf1f9a, 0x0000a63e },
+		{ 0x0000ffffffffffff, 0x0000fffc },
+		{ 0x00001d913cecc509, 0x0007937c },
+		{ 0x00000402c70c678f, 0x0005bfca },
+		{ 0x00000016766cb70b, 0x045edf98 },
+		{ 0x00000001fffffffc, 0x00000000 },
+		{ 0x0000000129d84b3a, 0xa2e8fe72 },
+		{ 0x0000000100000002, 0x00000000 },
+		{ 0x0000000100000000, 0xfffffffc },
+	}, {
+		{ 0x1c71c71c71c71c71, 0x00000006 },
+		{ 0x0210842108421084, 0x0000000f },
+		{ 0x007f01fc07f01fc0, 0x000000ff },
+		{ 0x00014245eabf1f9a, 0x0000a641 },
+		{ 0x0000ffffffffffff, 0x0000ffff },
+		{ 0x00001d913cecc509, 0x0007937f },
+		{ 0x00000402c70c678f, 0x0005bfcd },
+		{ 0x00000016766cb70b, 0x045edf9b },
+		{ 0x00000001fffffffc, 0x00000003 },
+		{ 0x0000000129d84b3a, 0xa2e8fe75 },
+		{ 0x0000000100000002, 0x00000003 },
+		{ 0x0000000100000001, 0x00000000 },
 	},
 };
 
@@ -208,6 +283,12 @@ static bool __init test_div64(void)
 			return false;
 		if (!test_div64_one(dividend, TEST_DIV64_DIVISOR_8, i, 8))
 			return false;
+		if (!test_div64_one(dividend, TEST_DIV64_DIVISOR_9, i, 9))
+			return false;
+		if (!test_div64_one(dividend, TEST_DIV64_DIVISOR_A, i, 10))
+			return false;
+		if (!test_div64_one(dividend, TEST_DIV64_DIVISOR_B, i, 11))
+			return false;
 		for (j = 0; j < SIZE_DIV64_DIVISORS; j++) {
 			if (!test_div64_one(dividend, test_div64_divisors[j],
 					    i, j))
-- 
2.46.1

From nobody Thu Nov 28 04:52:38 2024
From: Nicolas Pitre
To: Arnd Bergmann, Russell King
Cc: Nicolas Pitre, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 2/4] asm-generic/div64: optimize/simplify __div64_const32()
Date: Thu, 3 Oct 2024 17:16:14 -0400
Message-ID: <20241003211829.2750436-3-nico@fluxnic.net>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20241003211829.2750436-1-nico@fluxnic.net>
References: <20241003211829.2750436-1-nico@fluxnic.net>

From: Nicolas Pitre

Several years later I just realized that this code could be greatly
simplified.

First, let's formalize the need for overflow handling in
__arch_xprod_64(). Assuming n = UINT64_MAX, there are 2 cases where
an overflow may occur:

1) If a bias must be added, we have m_lo * n_lo + m or
   m_lo * 0xffffffff + ((m_hi << 32) + m_lo) or
   ((m_lo << 32) - m_lo) + ((m_hi << 32) + m_lo) or
   (m_lo + m_hi) << 32 which must be < (1 << 64). So the criterion for
   no overflow is m_lo + m_hi < (1 << 32).

2) The cross product m_lo * n_hi + m_hi * n_lo or
   m_lo * 0xffffffff + m_hi * 0xffffffff or
   ((m_lo << 32) - m_lo) + ((m_hi << 32) - m_hi). Assuming the top
   result from the previous step (m_lo + m_hi) that must be added to
   this, we get (m_lo + m_hi) << 32 again.

So let's have a straight and simpler version when this is true.
Otherwise some reordering allows for taking care of possible overflows
without any actual conditionals. Also, prevent both code variants from
being generated by making sure this is considered only if m is
perceived as constant by the compiler.

This, in turn, allows for greatly simplifying __div64_const32(). The
"special case" may go as well, as the regular case works just fine
without needing a bias. Then reduction should be applied all the time as
minimizing m is the key.

Signed-off-by: Nicolas Pitre
---
 include/asm-generic/div64.h | 114 +++++++++++-------------------------
 1 file changed, 35 insertions(+), 79 deletions(-)

diff --git a/include/asm-generic/div64.h b/include/asm-generic/div64.h
index 13f5aa68a4..5d59cf7e73 100644
--- a/include/asm-generic/div64.h
+++ b/include/asm-generic/div64.h
@@ -74,7 +74,8 @@
 	 * do the trick here).	\
 	 */	\
 	uint64_t ___res, ___x, ___t, ___m, ___n = (n);	\
-	uint32_t ___p, ___bias;	\
+	uint32_t ___p;	\
+	bool ___bias = false;	\
 	\
 	/* determine MSB of b */	\
 	___p = 1 << ilog2(___b);	\
@@ -87,22 +88,14 @@
 	___x = ~0ULL / ___b * ___b - 1;	\
 	\
 	/* test our ___m with res = m * x / (p << 64) */	\
-	___res = ((___m & 0xffffffff) * (___x & 0xffffffff)) >> 32;	\
-	___t = ___res += (___m & 0xffffffff) * (___x >> 32);	\
-	___res += (___x & 0xffffffff) * (___m >> 32);	\
-	___t = (___res < ___t) ? (1ULL << 32) : 0;	\
-	___res = (___res >> 32) + ___t;	\
-	___res += (___m >> 32) * (___x >> 32);	\
-	___res /= ___p;	\
+	___res = (___m & 0xffffffff) * (___x & 0xffffffff);	\
+	___t = (___m & 0xffffffff) * (___x >> 32) + (___res >> 32);	\
+	___res = (___m >> 32) * (___x >> 32) + (___t >> 32);	\
+	___t = (___m >> 32) * (___x & 0xffffffff) + (___t & 0xffffffff);\
+	___res = (___res + (___t >> 32)) / ___p;	\
 	\
-	/* Now sanitize and optimize what we've got. */	\
-	if (~0ULL % (___b / (___b & -___b)) == 0) {	\
-		/* special case, can be simplified to ... */	\
-		___n /= (___b & -___b);	\
-		___m = ~0ULL / (___b / (___b & -___b));	\
-		___p = 1;	\
-		___bias = 1;	\
-	} else if (___res != ___x / ___b) {	\
+	/* Now validate what we've got. */	\
+	if (___res != ___x / ___b) {	\
 		/*	\
 		 * We can't get away without a bias to compensate	\
 		 * for bit truncation errors. To avoid it we'd need an	\
@@ -111,45 +104,18 @@
 		 *	\
 		 * Instead we do m = p / b and n / b = (n * m + m) / p.	\
 		 */	\
-		___bias = 1;	\
+		___bias = true;	\
 		/* Compute m = (p << 64) / b */	\
 		___m = (~0ULL / ___b) * ___p;	\
 		___m += ((~0ULL % ___b + 1) * ___p) / ___b;	\
-	} else {	\
-		/*	\
-		 * Reduce m / p, and try to clear bit 31 of m when	\
-		 * possible, otherwise that'll need extra overflow	\
-		 * handling later.	\
-		 */	\
-		uint32_t ___bits = -(___m & -___m);	\
-		___bits |= ___m >> 32;	\
-		___bits = (~___bits) << 1;	\
-		/*	\
-		 * If ___bits == 0 then setting bit 31 is unavoidable.	\
-		 * Simply apply the maximum possible reduction in that	\
-		 * case. Otherwise the MSB of ___bits indicates the	\
-		 * best reduction we should apply.	\
-		 */	\
-		if (!___bits) {	\
-			___p /= (___m & -___m);	\
-			___m /= (___m & -___m);	\
-		} else {	\
-			___p >>= ilog2(___bits);	\
-			___m >>= ilog2(___bits);	\
-		}	\
-		/* No bias needed. */	\
-		___bias = 0;	\
 	}	\
 	\
+	/* Reduce m / p to help avoid overflow handling later. */	\
+	___p /= (___m & -___m);	\
+	___m /= (___m & -___m);	\
+	\
 	/*	\
-	 * Now we have a combination of 2 conditions:	\
-	 *	\
-	 * 1) whether or not we need to apply a bias, and	\
-	 *	\
-	 * 2) whether or not there might be an overflow in the cross	\
-	 *    product determined by (___m & ((1 << 63) | (1 << 31))).	\
-	 *	\
-	 * Select the best way to do (m_bias + m * n) / (1 << 64).	\
+	 * Perform (m_bias + m * n) / (1 << 64).	\
 	 * From now on there will be actual runtime code generated.	\
 	 */	\
 	___res = __arch_xprod_64(___m, ___n, ___bias);	\
@@ -165,7 +131,7 @@
  * Semantic: retval = ((bias ? m : 0) + m * n) >> 64
  *
  * The product is a 128-bit value, scaled down to 64 bits.
- * Assuming constant propagation to optimize away unused conditional code.
+ * Hoping for compile-time optimization of conditional code.
  * Architectures may provide their own optimized assembly implementation.
  */
 static inline uint64_t __arch_xprod_64(const uint64_t m, uint64_t n, bool bias)
@@ -174,38 +140,28 @@ static inline uint64_t __arch_xprod_64(const uint64_t m, uint64_t n, bool bias)
 	uint32_t m_hi = m >> 32;
 	uint32_t n_lo = n;
 	uint32_t n_hi = n >> 32;
-	uint64_t res;
-	uint32_t res_lo, res_hi, tmp;
-
-	if (!bias) {
-		res = ((uint64_t)m_lo * n_lo) >> 32;
-	} else if (!(m & ((1ULL << 63) | (1ULL << 31)))) {
-		/* there can't be any overflow here */
-		res = (m + (uint64_t)m_lo * n_lo) >> 32;
-	} else {
-		res = m + (uint64_t)m_lo * n_lo;
-		res_lo = res >> 32;
-		res_hi = (res_lo < m_hi);
-		res = res_lo | ((uint64_t)res_hi << 32);
-	}
-
-	if (!(m & ((1ULL << 63) | (1ULL << 31)))) {
-		/* there can't be any overflow here */
-		res += (uint64_t)m_lo * n_hi;
-		res += (uint64_t)m_hi * n_lo;
-		res >>= 32;
+	uint64_t x, y;
+
+	/* Determine if overflow handling can be dispensed with. */
+	bool no_ovf = __builtin_constant_p(m) &&
+		      ((m >> 32) + (m & 0xffffffff) < 0x100000000);
+
+	if (no_ovf) {
+		x = (uint64_t)m_lo * n_lo + (bias ? m : 0);
+		x >>= 32;
+		x += (uint64_t)m_lo * n_hi;
+		x += (uint64_t)m_hi * n_lo;
+		x >>= 32;
+		x += (uint64_t)m_hi * n_hi;
 	} else {
-		res += (uint64_t)m_lo * n_hi;
-		tmp = res >> 32;
-		res += (uint64_t)m_hi * n_lo;
-		res_lo = res >> 32;
-		res_hi = (res_lo < tmp);
-		res = res_lo | ((uint64_t)res_hi << 32);
+		x = (uint64_t)m_lo * n_lo + (bias ? m_lo : 0);
+		y = (uint64_t)m_lo * n_hi + (uint32_t)(x >> 32) + (bias ? m_hi : 0);
+		x = (uint64_t)m_hi * n_hi + (uint32_t)(y >> 32);
+		y = (uint64_t)m_hi * n_lo + (uint32_t)y;
+		x += (uint32_t)(y >> 32);
 	}
 
-	res += (uint64_t)m_hi * n_hi;
-
-	return res;
+	return x;
 }
 #endif
 
-- 
2.46.1

From nobody Thu Nov 28 04:52:38 2024
From: Nicolas Pitre
To: Arnd Bergmann, Russell King
Cc: Nicolas Pitre, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 3/4] ARM: div64: improve __arch_xprod_64()
Date: Thu, 3 Oct 2024 17:16:15 -0400
Message-ID: <20241003211829.2750436-4-nico@fluxnic.net>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20241003211829.2750436-1-nico@fluxnic.net>
References: <20241003211829.2750436-1-nico@fluxnic.net>

From: Nicolas Pitre

Let's use the same criteria for overflow handling necessity as the
generic code.
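
For illustration only (a userspace sketch, not part of this patch), the
shared criterion can be exercised as below. The helper names are
hypothetical, and unsigned __int128 is assumed to be available as a
GCC/Clang extension so that a full 128-bit reference can be computed.
The straight multiplication path is meant to be used only when
m_lo + m_hi < (1 << 32):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Criterion from the series: no overflow handling if m_lo + m_hi < 2^32. */
static bool xprod_no_overflow(uint64_t m)
{
	return (m >> 32) + (m & 0xffffffff) < 0x100000000ULL;
}

/*
 * Reference for ((bias ? m : 0) + m * n) >> 64 using 128-bit arithmetic.
 * Assumes the compiler provides unsigned __int128 (GCC/Clang extension).
 */
static uint64_t xprod_64_ref(uint64_t m, uint64_t n, bool bias)
{
	unsigned __int128 p = (unsigned __int128)m * n;

	if (bias)
		p += m;
	return (uint64_t)(p >> 64);
}

/* Straight variant; only valid when xprod_no_overflow(m) holds. */
static uint64_t xprod_64_straight(uint64_t m, uint64_t n, bool bias)
{
	uint32_t m_lo = (uint32_t)m;
	uint32_t m_hi = (uint32_t)(m >> 32);
	uint32_t n_lo = (uint32_t)n;
	uint32_t n_hi = (uint32_t)(n >> 32);
	uint64_t x;

	x = (uint64_t)m_lo * n_lo + (bias ? m : 0);	/* low product plus bias */
	x >>= 32;					/* keep only its carry-out */
	x += (uint64_t)m_lo * n_hi;			/* cross products */
	x += (uint64_t)m_hi * n_lo;
	x >>= 32;
	x += (uint64_t)m_hi * n_hi;			/* high product */
	return x;
}
```

With m = 0x7fffffff80000000 the two halves sum to 0xffffffff, so the
straight path applies and should agree with the 128-bit reference even
for n = UINT64_MAX. With m = 0x8000000080000000 the sum reaches 1 << 32
and the overflow-aware path is needed instead.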

Signed-off-by: Nicolas Pitre
---
 arch/arm/include/asm/div64.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h
index 4b69cf8504..562d5376ae 100644
--- a/arch/arm/include/asm/div64.h
+++ b/arch/arm/include/asm/div64.h
@@ -56,6 +56,8 @@ static inline uint64_t __arch_xprod_64(uint64_t m, uint64_t n, bool bias)
 {
 	unsigned long long res;
 	register unsigned int tmp asm("ip") = 0;
+	bool no_ovf = __builtin_constant_p(m) &&
+		      ((m >> 32) + (m & 0xffffffff) < 0x100000000);
 
 	if (!bias) {
 		asm (	"umull	%Q0, %R0, %Q1, %Q2\n\t"
@@ -63,7 +65,7 @@ static inline uint64_t __arch_xprod_64(uint64_t m, uint64_t n, bool bias)
 			: "=&r" (res)
 			: "r" (m), "r" (n)
 			: "cc");
-	} else if (!(m & ((1ULL << 63) | (1ULL << 31)))) {
+	} else if (no_ovf) {
 		res = m;
 		asm (	"umlal	%Q0, %R0, %Q1, %Q2\n\t"
 			"mov	%Q0, #0"
@@ -80,7 +82,7 @@ static inline uint64_t __arch_xprod_64(uint64_t m, uint64_t n, bool bias)
 			: "cc");
 	}
 
-	if (!(m & ((1ULL << 63) | (1ULL << 31)))) {
+	if (no_ovf) {
 		asm (	"umlal	%R0, %Q0, %R1, %Q2\n\t"
 			"umlal	%R0, %Q0, %Q1, %R2\n\t"
 			"mov	%R0, #0\n\t"
-- 
2.46.1

From nobody Thu Nov 28 04:52:38 2024
From: Nicolas Pitre
To: Arnd Bergmann, Russell King
Cc: Nicolas Pitre, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 4/4] __arch_xprod64(): make __always_inline when optimizing for performance
Date: Thu, 3 Oct 2024 17:16:16 -0400
Message-ID: <20241003211829.2750436-5-nico@fluxnic.net>
In-Reply-To: <20241003211829.2750436-1-nico@fluxnic.net>
References: <20241003211829.2750436-1-nico@fluxnic.net>

Recent gcc versions no longer systematically inline __arch_xprod_64(),
and that has performance implications. Give the compiler the freedom to
decide only when optimizing for size.

Here are some timing numbers from lib/math/test_div64.c.

Using __always_inline:

```
test_div64: Starting 64bit/32bit division and modulo test
test_div64: Completed 64bit/32bit division and modulo test, 0.048285584s elapsed
```

Without __always_inline:

```
test_div64: Starting 64bit/32bit division and modulo test
test_div64: Completed 64bit/32bit division and modulo test, 0.053023584s elapsed
```

Forcing constant base through the non-constant base code path:

```
test_div64: Starting 64bit/32bit division and modulo test
test_div64: Completed 64bit/32bit division and modulo test, 0.103263776s elapsed
```

It is worth noting that test_div64 already runs half of its tests with
non-constant divisors, so the real impact is greater than these numbers
show. For what it is worth, these numbers were obtained using QEMU. The
gcc version is 14.1.0.

Signed-off-by: Nicolas Pitre
---
 arch/arm/include/asm/div64.h | 7 ++++++-
 include/asm-generic/div64.h  | 7 ++++++-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h
index 562d5376ae..d3ef8e416b 100644
--- a/arch/arm/include/asm/div64.h
+++ b/arch/arm/include/asm/div64.h
@@ -52,7 +52,12 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base)
 
 #else
 
-static inline uint64_t __arch_xprod_64(uint64_t m, uint64_t n, bool bias)
+#ifdef CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE
+static __always_inline
+#else
+static inline
+#endif
+uint64_t __arch_xprod_64(uint64_t m, uint64_t n, bool bias)
 {
 	unsigned long long res;
 	register unsigned int tmp asm("ip") = 0;
diff --git a/include/asm-generic/div64.h b/include/asm-generic/div64.h
index 5d59cf7e73..25e7b4b58d 100644
--- a/include/asm-generic/div64.h
+++ b/include/asm-generic/div64.h
@@ -134,7 +134,12 @@
  * Hoping for compile-time optimization of conditional code.
  * Architectures may provide their own optimized assembly implementation.
  */
-static inline uint64_t __arch_xprod_64(const uint64_t m, uint64_t n, bool bias)
+#ifdef CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE
+static __always_inline
+#else
+static inline
+#endif
+uint64_t __arch_xprod_64(const uint64_t m, uint64_t n, bool bias)
 {
 	uint32_t m_lo = m;
 	uint32_t m_hi = m >> 32;
-- 
2.46.1