irq: simplify irq_im_handle_irq()

[PATCH] irq: simplify irq_im_handle_irq()

Posted by Yury Norov 6 months, 3 weeks ago

From: Yury Norov (NVIDIA) <yury.norov@gmail.com>

Hi Thomas,

The function calls bitmap_empty() for potentially every bit in
work_ctx->pending, which makes a simple bitmap traverse O(N^2).
Fix it by switching to the dedicated for_each_set_bit().

While there, fix using atomic clear_bit() in a context where atomicity
cannot be guaranteed.

Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
---
 kernel/irq/irq_sim.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/kernel/irq/irq_sim.c b/kernel/irq/irq_sim.c
index ae4c9cbd1b4b..e05904da7e3d 100644
--- a/kernel/irq/irq_sim.c
+++ b/kernel/irq/irq_sim.c
@@ -128,15 +128,13 @@ static struct irq_chip irq_sim_irqchip = {
 static void irq_sim_handle_irq(struct irq_work *work)
 {
 	struct irq_sim_work_ctx *work_ctx;
-	unsigned int offset = 0;
+	unsigned int offset;
 	int irqnum;
 
 	work_ctx = container_of(work, struct irq_sim_work_ctx, work);
 
-	while (!bitmap_empty(work_ctx->pending, work_ctx->irq_count)) {
-		offset = find_next_bit(work_ctx->pending,
-				       work_ctx->irq_count, offset);
-		clear_bit(offset, work_ctx->pending);
+	for_each_set_bit(offset, work_ctx->pending, work_ctx->irq_count) {
+		__clear_bit(offset, work_ctx->pending);
 		irqnum = irq_find_mapping(work_ctx->domain, offset);
 		handle_simple_irq(irq_to_desc(irqnum));
 	}
-- 
2.43.0

Re: [PATCH] irq: simplify irq_im_handle_irq()

Posted by Jiri Slaby 6 months, 3 weeks ago

On 19. 07. 25, 23:18, Yury Norov wrote:
> From: Yury Norov (NVIDIA) <yury.norov@gmail.com>
> 
> Hi Thomas,

This does not belong to a commit log ^^.

> The function calls bitmap_empty() for potentially every bit in
> work_ctx->pending, which makes a simple bitmap traverse O(N^2).
> Fix it by switching to the dedicated for_each_set_bit().

Looks good.

> While there, fix using atomic clear_bit() in a context where atomicity
> cannot be guaranteed.

What does this mean? __clear_bit() can corrupt the bitmap when there is 
an in-flight set_bit(), right?

> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
> ---
>   kernel/irq/irq_sim.c | 8 +++-----
>   1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/irq/irq_sim.c b/kernel/irq/irq_sim.c
> index ae4c9cbd1b4b..e05904da7e3d 100644
> --- a/kernel/irq/irq_sim.c
> +++ b/kernel/irq/irq_sim.c
> @@ -128,15 +128,13 @@ static struct irq_chip irq_sim_irqchip = {
>   static void irq_sim_handle_irq(struct irq_work *work)
>   {
>   	struct irq_sim_work_ctx *work_ctx;
> -	unsigned int offset = 0;
> +	unsigned int offset;
>   	int irqnum;
>   
>   	work_ctx = container_of(work, struct irq_sim_work_ctx, work);
>   
> -	while (!bitmap_empty(work_ctx->pending, work_ctx->irq_count)) {
> -		offset = find_next_bit(work_ctx->pending,
> -				       work_ctx->irq_count, offset);
> -		clear_bit(offset, work_ctx->pending);
> +	for_each_set_bit(offset, work_ctx->pending, work_ctx->irq_count) {
> +		__clear_bit(offset, work_ctx->pending);
>   		irqnum = irq_find_mapping(work_ctx->domain, offset);
>   		handle_simple_irq(irq_to_desc(irqnum));
>   	}

-- 
js
suse labs

Re: [PATCH] irq: simplify irq_im_handle_irq()

Posted by Thomas Gleixner 6 months, 3 weeks ago

Yury!

On Sat, Jul 19 2025 at 17:18, Yury Norov wrote:

'irq:' is not the correct prefix here. See:

https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#patch-submission-notes

Also irq_im_handle_irq() is not a known function name.

> From: Yury Norov (NVIDIA) <yury.norov@gmail.com>
>
> Hi Thomas,

Since when is a greeting part of the changelog?

> The function calls bitmap_empty() for potentially every bit in
> work_ctx->pending, which makes a simple bitmap traverse O(N^2).
> Fix it by switching to the dedicated for_each_set_bit().
>
> While there, fix using atomic clear_bit() in a context where atomicity
> cannot be guaranteed.

Seriously? See below.

>  static void irq_sim_handle_irq(struct irq_work *work)
>  {
>  	struct irq_sim_work_ctx *work_ctx;
> -	unsigned int offset = 0;
> +	unsigned int offset;
>  	int irqnum;
>  
>  	work_ctx = container_of(work, struct irq_sim_work_ctx, work);
>  
> -	while (!bitmap_empty(work_ctx->pending, work_ctx->irq_count)) {
> -		offset = find_next_bit(work_ctx->pending,
> -				       work_ctx->irq_count, offset);
> -		clear_bit(offset, work_ctx->pending);
> +	for_each_set_bit(offset, work_ctx->pending, work_ctx->irq_count) {
> +		__clear_bit(offset, work_ctx->pending);

This is just wrong.

__clear_bit() can only be used when there is _NO_ concurrency
possible. But this has concurrency:

irq_sim_set_irqchip_state()
...
        assign_bit(hwirq, irq_ctx->work_ctx->pending, state);

That function can be executed on a different CPU concurrently while the
other CPU walks the bitmap and tries to clear a bit. The function
documentation of __clear_bit() has this documented very clearly:

 * Unlike clear_bit(), this function is non-atomic. If it is called on the same
 * region of memory concurrently, the effect may be that only one operation                                                                                    * succeeds.

No?

Thanks,

        tglx

Re: [PATCH] irq: simplify irq_im_handle_irq()

Posted by Yury Norov 6 months, 3 weeks ago

On Mon, Jul 21, 2025 at 04:07:22PM +0200, Thomas Gleixner wrote:
> Yury!
> 
> On Sat, Jul 19 2025 at 17:18, Yury Norov wrote:
> 
> 'irq:' is not the correct prefix here. See:
> 
> https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#patch-submission-notes
> 
> Also irq_im_handle_irq() is not a known function name.
> 
> > From: Yury Norov (NVIDIA) <yury.norov@gmail.com>
> >
> > Hi Thomas,
> 
> Since when is a greeting part of the changelog?
> 
> > The function calls bitmap_empty() for potentially every bit in
> > work_ctx->pending, which makes a simple bitmap traverse O(N^2).
> > Fix it by switching to the dedicated for_each_set_bit().
> >
> > While there, fix using atomic clear_bit() in a context where atomicity
> > cannot be guaranteed.
> 
> Seriously? See below.
> 
> >  static void irq_sim_handle_irq(struct irq_work *work)
> >  {
> >  	struct irq_sim_work_ctx *work_ctx;
> > -	unsigned int offset = 0;
> > +	unsigned int offset;
> >  	int irqnum;
> >  
> >  	work_ctx = container_of(work, struct irq_sim_work_ctx, work);
> >  
> > -	while (!bitmap_empty(work_ctx->pending, work_ctx->irq_count)) {
> > -		offset = find_next_bit(work_ctx->pending,
> > -				       work_ctx->irq_count, offset);
> > -		clear_bit(offset, work_ctx->pending);
> > +	for_each_set_bit(offset, work_ctx->pending, work_ctx->irq_count) {
> > +		__clear_bit(offset, work_ctx->pending);
> 
> This is just wrong.
> 
> __clear_bit() can only be used when there is _NO_ concurrency
> possible. But this has concurrency:
> 
> irq_sim_set_irqchip_state()
> ...
>         assign_bit(hwirq, irq_ctx->work_ctx->pending, state);
> 
> That function can be executed on a different CPU concurrently while the
> other CPU walks the bitmap and tries to clear a bit. The function
> documentation of __clear_bit() has this documented very clearly:
> 
>  * Unlike clear_bit(), this function is non-atomic. If it is called on the same
>  * region of memory concurrently, the effect may be that only one operation                                                                                    * succeeds.
> 
> No?

find_next_bit() and for_each_bit() cannot be used in concurrent
environment, and having atomic clear_bit() is meaningless here.
Two concurrent processes, if running in parallel, may pick the
same offset, ending up executing the handle_simple_irq() twice.

So, the work_ctx->pending must be local or protected bitmap to make
this all working.

It simply doesn't matter how do you clean the offset - atomically
or not.

I have a series for atomic find_bit() API, not merged though. In
I described it in details there [1].

Or I miss something in the IRQ handling logic?

Thanks,
Yury

[1] https://lists.infradead.org/pipermail/ath10k/2024-June/015900.html

Re: [PATCH] irq: simplify irq_im_handle_irq()

Posted by Thomas Gleixner 6 months, 3 weeks ago

On Mon, Jul 21 2025 at 10:27, Yury Norov wrote:
> On Mon, Jul 21, 2025 at 04:07:22PM +0200, Thomas Gleixner wrote:
> find_next_bit() and for_each_bit() cannot be used in concurrent
> environment, and having atomic clear_bit() is meaningless here.
> Two concurrent processes, if running in parallel, may pick the
> same offset, ending up executing the handle_simple_irq() twice.

The irq work cannot be run in parallel on multiple CPUs. It's guaranteed
that only one irq work handler runs at a time. So irq_sim_handle_irq()
is fully serialized by the irq work magic.

But the bitmap can be modified concurrently, which is not a problem.

Thanks,

        tglx

Re: [PATCH] irq: simplify irq_im_handle_irq()

Posted by Jiri Slaby 6 months, 1 week ago

Hi,

On 21. 07. 25, 17:44, Thomas Gleixner wrote:
> On Mon, Jul 21 2025 at 10:27, Yury Norov wrote:
>> On Mon, Jul 21, 2025 at 04:07:22PM +0200, Thomas Gleixner wrote:
>> find_next_bit() and for_each_bit() cannot be used in concurrent
>> environment, and having atomic clear_bit() is meaningless here.
>> Two concurrent processes, if running in parallel, may pick the
>> same offset, ending up executing the handle_simple_irq() twice.
> 
> The irq work cannot be run in parallel on multiple CPUs. It's guaranteed
> that only one irq work handler runs at a time. So irq_sim_handle_irq()
> is fully serialized by the irq work magic.
> 
> But the bitmap can be modified concurrently, which is not a problem.

Actually, it is (IMO):

         while (!bitmap_empty(work_ctx->pending, work_ctx->irq_count)) {
                 offset = find_next_bit(work_ctx->pending,
                                        work_ctx->irq_count, offset);
                 clear_bit(offset, work_ctx->pending);
                 irqnum = irq_find_mapping(work_ctx->domain, offset);
                 handle_simple_irq(irq_to_desc(irqnum));
         }

If another CPU sets a bit X in the beginning of the work_ctx->pending 
bitmap while this is running for some time already (that means offset is 
already greater that that X), bitmap_empty() will be always true and 
this spins forever (or crashes). It is because find_next_bit() will 
never return that bit X -- so clear_bit() will never happen on that.

What is worse, find_next_bit() will return work_ctx->irq_count and both 
clear_bit() and irq_find_mapping() will touch an OOB memory.

Or what am I missing?

find_next_bit_wrap() would cure that.

thanks,
-- 
js
suse labs