The calculation of "((~_ULL(0)) - (_ULL(1) << (l)) + 1)" is to generate
a bitmask with "l" trailing zeroes, which is equivalent to
"(~_ULL(0) << (l))".
Refactor the calculation so the number of arithmetic instruction will be
reduced from 3 to 1.
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
---
Test is done to show the speed-up we can get from reducing the number of
instruction. The test machine runs with 6.9.0-0-generic kernel on x86_64
architecture with processor AMD Ryzen 7 5700X3D 8-Core Processor.
The test executes in the form of kernel module.
Test result:
[452858.156718] new __GENMASK_ULL() : 5908678
[452858.156722] old __GENMASK_ULL() : 5950998
Test script snippet:
/* switch to __BITS_PER_LONG_LONG when testing __GENMASK_ULL() */
\#define __GENMASK_ULL_NEW(h, l) \
((~_ULL(0) << (l)) & \
(~_ULL(0) >> (__BITS_PER_LONG_LONG - 1 - (h))))
int init_module(void)
{
ktime_t start, end, total1 = 0, total2 = 0;
for (int k = 0; k < 100; k++) {
for (int i = 0; i < __BITS_PER_LONG_LONG; i++) {
for (int j = 0; j <= i; j++) {
unsigned result1, result2;
start = ktime_get();
result1 = __GENMASK_ULL_NEW(i, j);
end = ktime_get();
total1 += (end - start);
start = ktime_get();
result2 = __GENMASK_ULL(i, j);
end = ktime_get();
total2 += (end - start);
if (result1 != result2) {
pr_info("Wrong calculation of GENMASK_ULL_NEW()\n");
return 0;
}
}
}
}
pr_info("new __GENMASK_ULL() : %lld\n", total1);
pr_info("old __GENMASK_ULL() : %lld\n", total2);
return 0;
}
---
include/uapi/linux/bits.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/uapi/linux/bits.h b/include/uapi/linux/bits.h
index 8fc7fea65288..965f6eb91ff3 100644
--- a/include/uapi/linux/bits.h
+++ b/include/uapi/linux/bits.h
@@ -9,7 +9,7 @@
(~_UL(0) >> (__BITS_PER_LONG - 1 - (h))))
#define __GENMASK_ULL(h, l) \
- (((~_ULL(0)) - (_ULL(1) << (l)) + 1) & \
+ ((~_ULL(0) << (l)) & \
(~_ULL(0) >> (__BITS_PER_LONG_LONG - 1 - (h))))
#define __GENMASK_U128(h, l) \
--
2.43.0
On Wed, 12 Feb 2025 00:24:12 +0800 I Hsin Cheng <richard120310@gmail.com> wrote: > The calculation of "((~_ULL(0)) - (_ULL(1) << (l)) + 1)" is to generate > a bitmask with "l" trailing zeroes, which is equivalent to > "(~_ULL(0) << (l))". Yes, and if you look through the commit history you'll see it was changed to avoid a compiler warning about the shift losing significant bits. So you are just reverting that change and the compiler warnings will reappear. For non-constants I suspect that (2ul << hi) - (1ul << lo) is the best answer. But the compiler (clang with some options?) will still complain for constants when trying to set the high bit. That version also doesn't need BITS_PER_[U]LONG. While that is valid for C, the _ULL() are there for the assembler (when they are no-ops) so there is no way asm copies can be right for both GENMASK() ans GENMASK_ULL(). David
On Tue, Feb 11, 2025 at 10:30:45PM +0000, David Laight wrote: > On Wed, 12 Feb 2025 00:24:12 +0800 > I Hsin Cheng <richard120310@gmail.com> wrote: > > > The calculation of "((~_ULL(0)) - (_ULL(1) << (l)) + 1)" is to generate > > a bitmask with "l" trailing zeroes, which is equivalent to > > "(~_ULL(0) << (l))". > > Yes, and if you look through the commit history you'll see it was changed > to avoid a compiler warning about the shift losing significant bits. > So you are just reverting that change and the compiler warnings will > reappear. > > For non-constants I suspect that (2ul << hi) - (1ul << lo) is the > best answer. > But the compiler (clang with some options?) will still complain > for constants when trying to set the high bit. > > That version also doesn't need BITS_PER_[U]LONG. > While that is valid for C, the _ULL() are there for the assembler > (when they are no-ops) so there is no way asm copies can be right > for both GENMASK() ans GENMASK_ULL(). > > David Hi David, Thanks for the review! > Yes, and if you look through the commit history you'll see it was changed > to avoid a compiler warning about the shift losing significant bits. I've browse through the commits of include/linux/bits.h , where GENMASK() was placed under. Though I didn't find specific commit of it, would you be so kind to paste the link of the commit? I assume you're talking about warnings like the following? warning: left shift count >= width of type [-Wshift-count-overflow] If this is the case then it happens for the current version as well, I mean from the perspective of operations, "(~_ULL(0) << (l))" and "(_ULL(1) << (1))" are basically the same, they all shift a constant left with an amount of "l". When "l" is large enough the compiler will complain about losing msb. > While that is valid for C, the _ULL() are there for the assembler > (when they are no-ops) so there is no way asm copies can be right > for both GENMASK() ans GENMASK_ULL(). I don't understand this part sorry, if asm copies can't be right for "~_ULL(0) << (l)" , how can it be right for "(_ULL(1) << (l))" ? At least from my pespective these 2 ops only differs at the value of constant. Please let me know about it, I'm not familiar with assembler and would love to know more about it. Thanks. Best regards, I Hsin Cheng
On Wed, 12 Feb 2025 20:39:09 +0800 I Hsin Cheng <richard120310@gmail.com> wrote: > On Tue, Feb 11, 2025 at 10:30:45PM +0000, David Laight wrote: > > On Wed, 12 Feb 2025 00:24:12 +0800 > > I Hsin Cheng <richard120310@gmail.com> wrote: > > > > > The calculation of "((~_ULL(0)) - (_ULL(1) << (l)) + 1)" is to generate > > > a bitmask with "l" trailing zeroes, which is equivalent to > > > "(~_ULL(0) << (l))". > > > > Yes, and if you look through the commit history you'll see it was changed > > to avoid a compiler warning about the shift losing significant bits. > > So you are just reverting that change and the compiler warnings will > > reappear. > > > > For non-constants I suspect that (2ul << hi) - (1ul << lo) is the > > best answer. > > But the compiler (clang with some options?) will still complain > > for constants when trying to set the high bit. > > > > That version also doesn't need BITS_PER_[U]LONG. > > While that is valid for C, the _ULL() are there for the assembler > > (when they are no-ops) so there is no way asm copies can be right > > for both GENMASK() ans GENMASK_ULL(). > > > > David > > Hi David, > > Thanks for the review! > > > Yes, and if you look through the commit history you'll see it was changed > > to avoid a compiler warning about the shift losing significant bits. > > I've browse through the commits of include/linux/bits.h , where > GENMASK() was placed under. Though I didn't find specific commit of it, > would you be so kind to paste the link of the commit? > > I assume you're talking about warnings like the following? > warning: left shift count >= width of type [-Wshift-count-overflow] No, there is one about clang 'objecting' to ~0u << count because significant bits get discarded. The 'strange' expression was used to avoid that - even though it is mathematically equivalent. > > If this is the case then it happens for the current version as well, I > mean from the perspective of operations, "(~_ULL(0) << (l))" and > "(_ULL(1) << (1))" are basically the same, they all shift a constant > left with an amount of "l". When "l" is large enough the compiler will > complain about losing msb. > > > While that is valid for C, the _ULL() are there for the assembler > > (when they are no-ops) so there is no way asm copies can be right > > for both GENMASK() ans GENMASK_ULL(). > > I don't understand this part sorry, if asm copies can't be right for > "~_ULL(0) << (l)" , how can it be right for "(_ULL(1) << (l))" ? > At least from my pespective these 2 ops only differs at the value of > constant. Please let me know about it, I'm not familiar with assembler > and would love to know more about it. Thanks. It is the >> that go wrong, since they rely on the number of bits in the integer type being used. For asm both ~_UL(0) and ~_ULL(0) are just ~0, but the >> uses a different constant. I'm not even sure that the _ULL() should even be in a uapi header. I proposed using: ((1u << (hi)) - 1) * 2) + 1 - (((1u << (lo)) - 1) not that long ago. That avoid any bits being shifted off the top and I'm sure it that gcc optimised add the +/- away. It also has the advantage that the type of the result only depends of the type of the 1u (so can be 1ul, 1ull etc). David > > Best regards, > I Hsin Cheng >
+ Matthias, Andrew
On Wed, Feb 12, 2025 at 08:39:09PM +0800, I Hsin Cheng wrote:
> On Tue, Feb 11, 2025 at 10:30:45PM +0000, David Laight wrote:
> > On Wed, 12 Feb 2025 00:24:12 +0800
> > I Hsin Cheng <richard120310@gmail.com> wrote:
> >
> > > The calculation of "((~_ULL(0)) - (_ULL(1) << (l)) + 1)" is to generate
> > > a bitmask with "l" trailing zeroes, which is equivalent to
> > > "(~_ULL(0) << (l))".
> >
> > Yes, and if you look through the commit history you'll see it was changed
> > to avoid a compiler warning about the shift losing significant bits.
> > So you are just reverting that change and the compiler warnings will
> > reappear.
> > For non-constants I suspect that (2ul << hi) - (1ul << lo) is the
> > best answer.
> > But the compiler (clang with some options?) will still complain
> > for constants when trying to set the high bit.
> >
> > That version also doesn't need BITS_PER_[U]LONG.
> > While that is valid for C, the _ULL() are there for the assembler
> > (when they are no-ops) so there is no way asm copies can be right
> > for both GENMASK() ans GENMASK_ULL().
> >
> > David
>
> Hi David,
>
> Thanks for the review!
>
> > Yes, and if you look through the commit history you'll see it was changed
> > to avoid a compiler warning about the shift losing significant bits.
>
> I've browse through the commits of include/linux/bits.h , where
> GENMASK() was placed under. Though I didn't find specific commit of it,
> would you be so kind to paste the link of the commit?
It's c32ee3d9abd284b4f ("bitops: avoid integer overflow in GENMASK(_ULL)").
[From cover letter]
> $ ./scripts/bloat-o-meter vmlinux_old vmlinux_new
> add/remove: 0/2 grow/shrink: 46/510 up/down: 464/-1733 (-1269)
You can see, Andrew tested it and found no changes in .text size. If that
has changed, we need to revert the original patch and suppress warning.
Can you please print the full bloat-o-meter result? Which compiler(s) did
you try?
Thanks,
Yury
© 2016 - 2026 Red Hat, Inc.