[PATCH] printk: Fix _DESCS_COUNT type for 64-bit systems

feng.zhou posted 1 patch 5 days, 12 hours ago
kernel/printk/printk_ringbuffer.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH] printk: Fix _DESCS_COUNT type for 64-bit systems
Posted by feng.zhou 5 days, 12 hours ago
The _DESCS_COUNT macro currently uses 1U (32-bit unsigned) instead of
1UL (unsigned long), which breaks the intended overflow testing design
on 64-bit systems.

Problem Analysis:
----------------
The printk_ringbuffer uses a deliberate design choice to initialize
descriptor IDs near the maximum 62-bit value to trigger overflow early
in the system's lifetime. This is documented in printk_ringbuffer.h:

  "initial values are chosen that map to the correct initial array
   indexes, but will result in overflows soon."

The DESC0_ID macro calculates:
  DESC0_ID(ct_bits) = DESC_ID(-(_DESCS_COUNT(ct_bits) + 1))

On 64-bit systems with typical configuration (descbits=16):
- Current buggy behavior: DESC0_ID = 0xfffeffff
- Expected behavior:      DESC0_ID = 0x3ffffffffffeffff

The buggy version only uses 32 bits, which means:
1. The initial ID is nowhere near 2^62
2. It would take ~140 trillion wraps to trigger 62-bit overflow
3. The overflow handling code is never tested in practice

Root Cause:
----------
The issue is in this line:
  #define _DESCS_COUNT(ct_bits)    (1U << (ct_bits))

When _DESCS_COUNT(16) is calculated:
  1U << 16 = 0x10000 (32-bit value)
  -(0x10000 + 1) = -0x10001 = 0xFFFEFFFF (32-bit two's complement)

On 64-bit systems, this 32-bit value doesn't get extended to create
the intended 62-bit ID near the maximum value.

Impact:
------
While index calculations still work correctly in the short term, this
bug has several implications:

1. Violates the design intention documented in the code
2. Overflow handling code paths remain untested
3. ABA detection code doesn't get exercised under overflow conditions
4. In extreme long-term running scenarios (though unlikely), could
   potentially cause issues when ID actually reaches 2^62

Verification:
------------
Tested on ARM64 system with CONFIG_LOG_BUF_SHIFT=20 (descbits=15):
- Before fix: DESC0_ID(16) = 0xfffeffff
- After fix:  DESC0_ID(16) = 0x3fffffffffff7fff

The fix aligns _DESCS_COUNT with _DATA_SIZE, which already correctly
uses 1UL:
  #define _DATA_SIZE(sz_bits)    (1UL << (sz_bits))

Signed-off-by: feng.zhou <realsummitzhou@gmail.com>
---
 kernel/printk/printk_ringbuffer.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/printk/printk_ringbuffer.h b/kernel/printk/printk_ringbuffer.h
index 4ef81349d9fb..4f4949700676 100644
--- a/kernel/printk/printk_ringbuffer.h
+++ b/kernel/printk/printk_ringbuffer.h
@@ -122,7 +122,7 @@ enum desc_state {
 };
 
 #define _DATA_SIZE(sz_bits)	(1UL << (sz_bits))
-#define _DESCS_COUNT(ct_bits)	(1U << (ct_bits))
+#define _DESCS_COUNT(ct_bits)	(1UL << (ct_bits))
 #define DESC_SV_BITS		BITS_PER_LONG
 #define DESC_FLAGS_SHIFT	(DESC_SV_BITS - 2)
 #define DESC_FLAGS_MASK		(3UL << DESC_FLAGS_SHIFT)
-- 
2.43.0
Re: [PATCH] printk: Fix _DESCS_COUNT type for 64-bit systems
Posted by John Ogness 5 days, 10 hours ago
Hi Zhou,

Thanks for reporting and fixing this!

On 2026-02-02, "feng.zhou" <realsummitzhou@gmail.com> wrote:
> The _DESCS_COUNT macro currently uses 1U (32-bit unsigned) instead of
> 1UL (unsigned long), which breaks the intended overflow testing design
> on 64-bit systems.
>
> Problem Analysis:
> ----------------
> The printk_ringbuffer uses a deliberate design choice to initialize
> descriptor IDs near the maximum 62-bit value to trigger overflow early
> in the system's lifetime. This is documented in printk_ringbuffer.h:
>
>   "initial values are chosen that map to the correct initial array
>    indexes, but will result in overflows soon."
>
> The DESC0_ID macro calculates:
>   DESC0_ID(ct_bits) = DESC_ID(-(_DESCS_COUNT(ct_bits) + 1))
>
> On 64-bit systems with typical configuration (descbits=16):
> - Current buggy behavior: DESC0_ID = 0xfffeffff
> - Expected behavior:      DESC0_ID = 0x3ffffffffffeffff
>
> The buggy version only uses 32 bits, which means:
> 1. The initial ID is nowhere near 2^62
> 2. It would take ~140 trillion wraps to trigger 62-bit overflow
> 3. The overflow handling code is never tested in practice
>
> Root Cause:
> ----------
> The issue is in this line:
>   #define _DESCS_COUNT(ct_bits)    (1U << (ct_bits))
>
> When _DESCS_COUNT(16) is calculated:
>   1U << 16 = 0x10000 (32-bit value)
>   -(0x10000 + 1) = -0x10001 = 0xFFFEFFFF (32-bit two's complement)
>
> On 64-bit systems, this 32-bit value doesn't get extended to create
> the intended 62-bit ID near the maximum value.

I was surprised to see that 1U is used here. I did some historical
digging and it has existed since the very first private version of this
ringbuffer implementation. It was a private email to Petr:

Date: 12 Oct 2019
Subject: ringbuffer v1
Message-ID: <87lftqwvhp.fsf@linutronix.de>

That was the very first appearance of a descriptor initializer:

+#define _DESCS_COUNT(ct_bits)	(1U << (ct_bits))
+#define DESCS_COUNT(dr)		_DESCS_COUNT((dr)->count_bits)
+#define DESC0_ID(ct_bits)	DESC_ID(-_DESCS_COUNT(ct_bits))

> Impact:
> ------
> While index calculations still work correctly in the short term, this
> bug has several implications:
>
> 1. Violates the design intention documented in the code
> 2. Overflow handling code paths remain untested
> 3. ABA detection code doesn't get exercised under overflow conditions

The ABA detection is only relevant for 32-bit systems. The patch is only
fixing an issue on 64-bit systems.

> 4. In extreme long-term running scenarios (though unlikely), could
>    potentially cause issues when ID actually reaches 2^62

It is worth investigating if this is an issue. Although it is quite
unlikely. A machine would need to generate 100k printk messages per
second for 490k years!

> Verification:
> ------------
> Tested on ARM64 system with CONFIG_LOG_BUF_SHIFT=20 (descbits=15):
> - Before fix: DESC0_ID(16) = 0xfffeffff
> - After fix:  DESC0_ID(16) = 0x3fffffffffff7fff
>
> The fix aligns _DESCS_COUNT with _DATA_SIZE, which already correctly
> uses 1UL:
>   #define _DATA_SIZE(sz_bits)    (1UL << (sz_bits))
>
> Signed-off-by: feng.zhou <realsummitzhou@gmail.com>
> ---
>  kernel/printk/printk_ringbuffer.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/printk/printk_ringbuffer.h b/kernel/printk/printk_ringbuffer.h
> index 4ef81349d9fb..4f4949700676 100644
> --- a/kernel/printk/printk_ringbuffer.h
> +++ b/kernel/printk/printk_ringbuffer.h
> @@ -122,7 +122,7 @@ enum desc_state {
>  };
>  
>  #define _DATA_SIZE(sz_bits)	(1UL << (sz_bits))
> -#define _DESCS_COUNT(ct_bits)	(1U << (ct_bits))
> +#define _DESCS_COUNT(ct_bits)	(1UL << (ct_bits))
>  #define DESC_SV_BITS		BITS_PER_LONG
>  #define DESC_FLAGS_SHIFT	(DESC_SV_BITS - 2)
>  #define DESC_FLAGS_MASK		(3UL << DESC_FLAGS_SHIFT)

The change looks correct. But I would like to take some more time to
understand why I never used UL in the first place.

Also note (unrelated) that we are using 1U for the buffer of the data
ring. Not really an issue, but we should probably switch that to UL as
well.

John Ogness
[RESEND] Re: [PATCH] printk: Fix _DESCS_COUNT type for 64-bit systems
Posted by feng.zhou 5 days, 7 hours ago
Hi John and maintainers,

(Resending to the list - apologies for the initial off-list reply)

On Mon, Feb 2, 2026, John Ogness wrote:
> This is important to us as well! And we really do want those
> wrap-arounds to happen early.

> Your v1 is certainly OK as-is. I just want to do some more investigating
> and testing. With your patch it will be the first time we experience
> descriptor ID wraps.
>
> As to the other U->UL conversion, I really have no preference. Let's
> just wait a bit in case Petr wants to comment.

Thank you for the feedback! I understand v1 is acceptable and you're doing
additional testing. I'll wait for Petr's comments and your investigation
results before any further action.

For the record, here's a summary of our off-list discussion:

1. Early wrap-around testing is important for the printk_ringbuffer design
   itself, not just for matching documentation.

2. John noted this is interesting because it will be the first time descriptor
   ID wraps are actually exercised in practice.

3. There's a similar issue with the text buffer array size definition using 1U,
   but we agreed to wait for broader maintainer input on whether to address it
   in the same patch or separately.

I apologize for not using "Reply All" initially - I'm still learning the
kernel development process. I'll make sure all future communications stay
on-list.

Thanks again for your patience and guidance!

Best regards,
Zhou
Re: [RESEND] Re: [PATCH] printk: Fix _DESCS_COUNT type for 64-bit systems
Posted by Petr Mladek 4 days, 5 hours ago
On Mon 2026-02-02 23:07:50, feng.zhou wrote:
> Hi John and maintainers,
> 
> (Resending to the list - apologies for the initial off-list reply)
> 
> On Mon, Feb 2, 2026, John Ogness wrote:
> > This is important to us as well! And we really do want those
> > wrap-arounds to happen early.
> 
> > Your v1 is certainly OK as-is. I just want to do some more investigating
> > and testing. With your patch it will be the first time we experience
> > descriptor ID wraps.
> >
> > As to the other U->UL conversion, I really have no preference. Let's
> > just wait a bit in case Petr wants to comment.
>
> Thank you for the feedback! I understand v1 is acceptable and you're doing
> additional testing. I'll wait for Petr's comments and your investigation
> results before any further action.

JFYI, this patch is on my radar. But the review might take some time:

  1. I am a bit overloaded.

  2. This patch has low priority. It helps to test a rather
     non-realistic scenario.

  3. The review is far from easy because the macro _DESCS_COUNT()
     is used on many cricial parts of the lock-less code via
     DESC_COUNT() macro.

> For the record, here's a summary of our off-list discussion:
> 
> 1. Early wrap-around testing is important for the printk_ringbuffer design
>    itself, not just for matching documentation.
> 
> 2. John noted this is interesting because it will be the first time descriptor
>    ID wraps are actually exercised in practice.
> 
> 3. There's a similar issue with the text buffer array size definition using 1U,
>    but we agreed to wait for broader maintainer input on whether to address it
>    in the same patch or separately.

This is another complicated problem. The printk log buffer size is
limited to 2GB. I am not exactly sure why. Some clue can be found
at https://lore.kernel.org/all/20181008135916.gg4kkmoki5bgtco5@pathway.suse.cz/


> I apologize for not using "Reply All" initially - I'm still learning the
> kernel development process. I'll make sure all future communications stay
> on-list.

No problem. It happens. It is great that you are learning.

> Thanks again for your patience and guidance!

You are welcome. Please, have patience with us as well.

Best Regards,
Petr