Date: Sun, 11 Jan 2026 21:21:57 +0000 (GMT)
From: "Maciej W. Rozycki"
To: Andrew Morton, Jens Axboe, John Garry, Su Hui, "Martin K. Petersen"
cc: David Laight, linux-kernel@vger.kernel.org
Subject: [PATCH] linux/log2.h: Reduce instruction count for is_power_of_2()

Follow an observation that (n ^ (n - 1)) will only ever retain the most
significant bit set in the word operated on if that is the only bit set
in the first place, and use it to determine whether a number is a whole
power of 2, avoiding the need for an explicit check for nonzero.

This reduces the sequence produced to 3 instructions only across Alpha,
MIPS, and RISC-V targets, down from 4, 5, and 4 respectively, removing a
branch in the two latter cases.  And it's 5 instructions on POWER and
x86-64 vs 8 and 9 respectively.  There are no branches now emitted here
for targets that have a suitable conditional set operation, although an
inline expansion will often end with one, depending on what code a call
to this function is used in.

Credit goes to GCC authors for coming up with this optimisation used as
the fallback for (__builtin_popcountl(n) == 1), equivalent to this code,
for targets where the hardware population count operation is considered
expensive.

Signed-off-by: Maciej W. Rozycki
---
Hi,

 As discussed here[1] a further improvement might be possible with targets
that support a population count operation in hardware, but care needs to
be taken for a libcall not to be produced instead, so such code would have
to be conditionalised on the presence of a population count instruction
and its correct handling in the compiler (which GCC gets right for POWER,
but wrong for Alpha, apparently owing to an incorrect cost taken for the
operation).
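
 For illustration only, the conditionalisation described above might look
roughly like the sketch below.  HYPOTHETICAL_HAVE_FAST_POPCOUNT is a made-up
macro standing in for whatever mechanism would indicate both a hardware
population count instruction and its correct handling by the compiler; no
such symbol exists in the kernel or is proposed here, and the function
names are likewise invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch only.  HYPOTHETICAL_HAVE_FAST_POPCOUNT is an assumed macro,
 * not an existing kernel or compiler symbol.
 */
static inline bool is_power_of_2_sketch(unsigned long n)
{
#ifdef HYPOTHETICAL_HAVE_FAST_POPCOUNT
	/* Exactly one bit set; only worthwhile if this expands inline. */
	return __builtin_popcountl(n) == 1;
#else
	/* Fallback: the comparison introduced by this patch. */
	return n - 1 < (n ^ (n - 1));
#endif
}

/* Hypothetical self-check covering zero, powers of 2, and other values. */
static inline void is_power_of_2_sketch_selftest(void)
{
	assert(!is_power_of_2_sketch(0));
	assert(is_power_of_2_sketch(1));
	assert(is_power_of_2_sketch(1UL << 12));
	assert(!is_power_of_2_sketch(6));
}
```

 Both branches give the same result for every input, so the macro would
only select which instruction sequence is emitted.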

 Anyway, this seems like a worthwhile if small improvement regardless, so
please apply.

References:

[1] "treewide, bits: use ffs_val() where it is open-coded".

  Maciej

---
 include/linux/log2.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

linux-log2-pow2-opt.diff
Index: linux-macro/include/linux/log2.h
===================================================================
--- linux-macro.orig/include/linux/log2.h
+++ linux-macro/include/linux/log2.h
@@ -44,7 +44,7 @@ int __ilog2_u64(u64 n)
 static __always_inline __attribute__((const))
 bool is_power_of_2(unsigned long n)
 {
-	return (n != 0 && ((n & (n - 1)) == 0));
+	return n - 1 < (n ^ (n - 1));
 }
 
 /**
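
The equivalence of the removed and the added expression can be checked in
user space with a sketch along the following lines.  It is not part of the
patch; the function names are hypothetical and chosen for this example
only.  It compares both forms for all values below 2^20 and then for the
values adjacent to every single-bit value of unsigned long.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Expression removed by the patch: nonzero check plus mask test. */
static bool is_power_of_2_old(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Expression added by the patch: a single unsigned comparison. */
static bool is_power_of_2_new(unsigned long n)
{
	return n - 1 < (n ^ (n - 1));
}

/*
 * Hypothetical helper: count inputs for which the two forms disagree,
 * over all n below 2^20 and over p - 1, p, p + 1 for every p = 1 << bit.
 */
static unsigned long count_mismatches(void)
{
	unsigned long mismatches = 0;
	unsigned long n;
	unsigned int bit;

	for (n = 0; n < (1UL << 20); n++)
		mismatches += is_power_of_2_old(n) != is_power_of_2_new(n);

	for (bit = 0; bit < sizeof(n) * CHAR_BIT; bit++) {
		unsigned long p = 1UL << bit;

		mismatches += is_power_of_2_old(p - 1) != is_power_of_2_new(p - 1);
		mismatches += is_power_of_2_old(p) != is_power_of_2_new(p);
		mismatches += is_power_of_2_old(p + 1) != is_power_of_2_new(p + 1);
	}
	return mismatches;
}

/* Convenience wrapper: abort if the two forms ever disagree. */
static void check_equivalence(void)
{
	assert(count_mismatches() == 0);
}
```

Worked by hand: for n = 8, n - 1 is 7 and n ^ (n - 1) is 15, so the
comparison holds; for n = 12, n - 1 is 11 and n ^ (n - 1) is 7, so it
fails; for n = 0 both sides wrap to all-ones and the strict comparison
fails, which is what makes the explicit nonzero check unnecessary.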