[PATCH V1] block: Fix null pointer dereference issue on struct io_cq

Pradeep P V K posted 1 patch 2 years, 7 months ago
block/blk-ioc.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
[PATCH V1] block: Fix null pointer dereference issue on struct io_cq
Posted by Pradeep P V K 2 years, 7 months ago
There is a potential race between ioc_clear_fn() and
exit_io_context() as shown below, due to which below
crash is observed. It can also result into use-after-free
issue.

context#1:                           context#2:
ioc_release_fn()                     do_exit();
->spin_lock(&ioc->lock);             ->exit_io_context();
->ioc_destroy_icq(icq);              ->ioc_exit_icqs();
 ->list_del_init(&icq->q_node);       ->spin_lock_irq(&ioc->lock);
 ->call_rcu(&icq->__rcu_head,
     icq_free_icq_rcu);
->spin_unlock(&ioc->lock);
                                      ->ioc_exit_icq(); gets the same icq
				       ->bfq_exit_icq();
                                  This results into below crash as bic
				  is NULL as it is derived from icq.
				  There is a chance that icq could be
				  free'd as well.

[33.245722][ T8666] Unable to handle kernel NULL pointer dereference
at virtual address 0000000000000018.
...
Call trace:
[33.325782][ T8666]  bfq_exit_icq+0x28/0xa8
[33.325785][ T8666]  exit_io_context+0xcc/0x100
[33.325786][ T8666]  do_exit+0x764/0xa58
[33.325791][ T8666]  do_group_exit+0x0/0xa0
[33.325793][ T8666]  invoke_syscall+0x48/0x114
[33.325802][ T8666]  el0_svc_common+0xcc/0x118
[33.325805][ T8666]  do_el0_svc+0x34/0xd0
[33.325807][ T8666]  el0_svc+0x38/0xd0
[33.325812][ T8666]  el0t_64_sync_handler+0x8c/0xfc
[33.325813][ T8666]  el0t_64_sync+0x1a0/0x1a4

Fix this by checking with ICQ_DESTROYED flags in ioc_exit_icqs().
Also, ensure ioc_exit_icq() is accessing icq within rcu_read_lock/unlock
so that icq doesn't get free'd up while it is still using it.

Signed-off-by: Pradeep P V K <quic_pragalla@quicinc.com>
---
 block/blk-ioc.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/block/blk-ioc.c b/block/blk-ioc.c
index 63fc02042408..1aa34fd46ac8 100644
--- a/block/blk-ioc.c
+++ b/block/blk-ioc.c
@@ -60,10 +60,14 @@ static void ioc_exit_icqs(struct io_context *ioc)
 {
 	struct io_cq *icq;
 
+	rcu_read_lock();
 	spin_lock_irq(&ioc->lock);
-	hlist_for_each_entry(icq, &ioc->icq_list, ioc_node)
-		ioc_exit_icq(icq);
+	hlist_for_each_entry(icq, &ioc->icq_list, ioc_node) {
+		if (!(icq->flags & ICQ_DESTROYED))
+			ioc_exit_icq(icq);
+	}
 	spin_unlock_irq(&ioc->lock);
+	rcu_read_unlock();
 }
 
 /*
-- 
2.17.1
Re: [PATCH V1] block: Fix null pointer dereference issue on struct io_cq
Posted by Yu Kuai 2 years, 7 months ago
Hi,

在 2023/05/17 16:44, Pradeep P V K 写道:
> There is a potential race between ioc_clear_fn() and
> exit_io_context() as shown below, due to which below
> crash is observed. It can also result into use-after-free
> issue.
> 
> context#1:                           context#2:
> ioc_release_fn()                     do_exit();
> ->spin_lock(&ioc->lock);             ->exit_io_context();
> ->ioc_destroy_icq(icq);              ->ioc_exit_icqs();
>   ->list_del_init(&icq->q_node);       ->spin_lock_irq(&ioc->lock);
>   ->call_rcu(&icq->__rcu_head,
>       icq_free_icq_rcu);
> ->spin_unlock(&ioc->lock);
>                                        ->ioc_exit_icq(); gets the same icq
I don't understand how is this possible, the list is protected by
'ioc->lock', both hlist_del_init and hlist_for_each_entry are called
inside the lock.

Thanks,
Kuai
> 				       ->bfq_exit_icq();
>                                    This results into below crash as bic
> 				  is NULL as it is derived from icq.
> 				  There is a chance that icq could be
> 				  free'd as well.
> 
> [33.245722][ T8666] Unable to handle kernel NULL pointer dereference
> at virtual address 0000000000000018.
> ...
> Call trace:
> [33.325782][ T8666]  bfq_exit_icq+0x28/0xa8
> [33.325785][ T8666]  exit_io_context+0xcc/0x100
> [33.325786][ T8666]  do_exit+0x764/0xa58
> [33.325791][ T8666]  do_group_exit+0x0/0xa0
> [33.325793][ T8666]  invoke_syscall+0x48/0x114
> [33.325802][ T8666]  el0_svc_common+0xcc/0x118
> [33.325805][ T8666]  do_el0_svc+0x34/0xd0
> [33.325807][ T8666]  el0_svc+0x38/0xd0
> [33.325812][ T8666]  el0t_64_sync_handler+0x8c/0xfc
> [33.325813][ T8666]  el0t_64_sync+0x1a0/0x1a4
> 
> Fix this by checking with ICQ_DESTROYED flags in ioc_exit_icqs().
> Also, ensure ioc_exit_icq() is accessing icq within rcu_read_lock/unlock
> so that icq doesn't get free'd up while it is still using it.
> 
> Signed-off-by: Pradeep P V K <quic_pragalla@quicinc.com>
> ---
>   block/blk-ioc.c | 8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
> index 63fc02042408..1aa34fd46ac8 100644
> --- a/block/blk-ioc.c
> +++ b/block/blk-ioc.c
> @@ -60,10 +60,14 @@ static void ioc_exit_icqs(struct io_context *ioc)
>   {
>   	struct io_cq *icq;
>   
> +	rcu_read_lock();
>   	spin_lock_irq(&ioc->lock);
> -	hlist_for_each_entry(icq, &ioc->icq_list, ioc_node)
> -		ioc_exit_icq(icq);
> +	hlist_for_each_entry(icq, &ioc->icq_list, ioc_node) {
> +		if (!(icq->flags & ICQ_DESTROYED))
> +			ioc_exit_icq(icq);
> +	}
>   	spin_unlock_irq(&ioc->lock);
> +	rcu_read_unlock();
>   }
>   
>   /*
> 

Re: [PATCH V1] block: Fix null pointer dereference issue on struct io_cq
Posted by Yu Kuai 2 years, 7 months ago
Hi,

在 2023/05/17 16:58, Yu Kuai 写道:
> Hi,
> 
> 在 2023/05/17 16:44, Pradeep P V K 写道:
>> There is a potential race between ioc_clear_fn() and
>> exit_io_context() as shown below, due to which below
>> crash is observed. It can also result into use-after-free
>> issue.
>>
>> context#1:                           context#2:
>> ioc_release_fn()                     do_exit();
>> ->spin_lock(&ioc->lock);             ->exit_io_context();
>> ->ioc_destroy_icq(icq);              ->ioc_exit_icqs();
>>   ->list_del_init(&icq->q_node);       ->spin_lock_irq(&ioc->lock);
>>   ->call_rcu(&icq->__rcu_head,
>>       icq_free_icq_rcu);
>> ->spin_unlock(&ioc->lock);

I think above concurrent scenario is not possible as well.

exit_io_context() doesn't release ioc refcount before ioc_exit_icqs() is
done, so that ioc_release_fn() can never concurrent with
ioc_exit_icqs().

do_exit
  exit_io_context
   ioc_exit_icqs
   put_io_context -> ioc_release_fn won't be called before this

>>                                        ->ioc_exit_icq(); gets the same 
>> icq
> I don't understand how is this possible, the list is protected by
> 'ioc->lock', both hlist_del_init and hlist_for_each_entry are called
> inside the lock.
> 
> Thanks,
> Kuai
>>                        ->bfq_exit_icq();
>>                                    This results into below crash as bic
>>                   is NULL as it is derived from icq.
>>                   There is a chance that icq could be
>>                   free'd as well.
>>
>> [33.245722][ T8666] Unable to handle kernel NULL pointer dereference
>> at virtual address 0000000000000018.
>> ...
>> Call trace:
>> [33.325782][ T8666]  bfq_exit_icq+0x28/0xa8
>> [33.325785][ T8666]  exit_io_context+0xcc/0x100
>> [33.325786][ T8666]  do_exit+0x764/0xa58
>> [33.325791][ T8666]  do_group_exit+0x0/0xa0
>> [33.325793][ T8666]  invoke_syscall+0x48/0x114
>> [33.325802][ T8666]  el0_svc_common+0xcc/0x118
>> [33.325805][ T8666]  do_el0_svc+0x34/0xd0
>> [33.325807][ T8666]  el0_svc+0x38/0xd0
>> [33.325812][ T8666]  el0t_64_sync_handler+0x8c/0xfc
>> [33.325813][ T8666]  el0t_64_sync+0x1a0/0x1a4
>>
>> Fix this by checking with ICQ_DESTROYED flags in ioc_exit_icqs().
>> Also, ensure ioc_exit_icq() is accessing icq within rcu_read_lock/unlock
>> so that icq doesn't get free'd up while it is still using it.
>>
>> Signed-off-by: Pradeep P V K <quic_pragalla@quicinc.com>
>> ---
>>   block/blk-ioc.c | 8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
>> index 63fc02042408..1aa34fd46ac8 100644
>> --- a/block/blk-ioc.c
>> +++ b/block/blk-ioc.c
>> @@ -60,10 +60,14 @@ static void ioc_exit_icqs(struct io_context *ioc)
>>   {
>>       struct io_cq *icq;
>> +    rcu_read_lock();
>>       spin_lock_irq(&ioc->lock);
>> -    hlist_for_each_entry(icq, &ioc->icq_list, ioc_node)
>> -        ioc_exit_icq(icq);
>> +    hlist_for_each_entry(icq, &ioc->icq_list, ioc_node) {
>> +        if (!(icq->flags & ICQ_DESTROYED))
By the way, above change doesn't make sense to me as well.
ioc_exit_icq() is called before setting ICQ_DESTROYED, hence if
ICQ_DESTROYED is set, then ICQ_EXITED is set as well, in this case
ioc_exit_icq() won't do anything.

>> +            ioc_exit_icq(icq);
>> +    }
>>       spin_unlock_irq(&ioc->lock);
>> +    rcu_read_unlock();
>>   }
>>   /*

I think I do found a problem, but I'm not sure it's the same in your
case, can you try the following patch?

diff --git a/block/blk-ioc.c b/block/blk-ioc.c
index 63fc02042408..37a56f2bb040 100644
--- a/block/blk-ioc.c
+++ b/block/blk-ioc.c
@@ -78,6 +78,9 @@ static void ioc_destroy_icq(struct io_cq *icq)

         lockdep_assert_held(&ioc->lock);

+       if (icq->flags & ICQ_DESTROYED)
+               return;
+
         radix_tree_delete(&ioc->icq_tree, icq->q->id);
         hlist_del_init(&icq->ioc_node);
         list_del_init(&icq->q_node);
@@ -128,12 +131,7 @@ static void ioc_release_fn(struct work_struct *work)
                         spin_lock(&q->queue_lock);
                         spin_lock(&ioc->lock);

-                       /*
-                        * The icq may have been destroyed when the ioc lock
-                        * was released.
-                        */
-                       if (!(icq->flags & ICQ_DESTROYED))
-                               ioc_destroy_icq(icq);
+                       ioc_destroy_icq(icq);

                         spin_unlock(&q->queue_lock);
                         rcu_read_unlock();
@@ -173,18 +171,17 @@ void ioc_clear_queue(struct request_queue *q)
  {
         LIST_HEAD(icq_list);

+       rcu_read_lock();
         spin_lock_irq(&q->queue_lock);
         list_splice_init(&q->icq_list, &icq_list);
         spin_unlock_irq(&q->queue_lock);

-       rcu_read_lock();
         while (!list_empty(&icq_list)) {
                 struct io_cq *icq =
                         list_entry(icq_list.next, struct io_cq, q_node);

                 spin_lock_irq(&icq->ioc->lock);
-               if (!(icq->flags & ICQ_DESTROYED))
-                       ioc_destroy_icq(icq);
+               ioc_destroy_icq(icq);
                 spin_unlock_irq(&icq->ioc->lock);
         }
         rcu_read_unlock();

Re: [PATCH V1] block: Fix null pointer dereference issue on struct io_cq
Posted by Yu Kuai 2 years, 7 months ago
Hi,

在 2023/05/18 20:16, Yu Kuai 写道:

> @@ -173,18 +171,17 @@ void ioc_clear_queue(struct request_queue *q)
>   {
>          LIST_HEAD(icq_list);
> 
> +       rcu_read_lock();

Sorry that I realized this is still not enough, following list_empty()
and list_entry() can still concurrent with list_del(). Please try the
following patch:
>          spin_lock_irq(&q->queue_lock);
>          list_splice_init(&q->icq_list, &icq_list);
>          spin_unlock_irq(&q->queue_lock);
> 
> -       rcu_read_lock();
>          while (!list_empty(&icq_list)) {
>                  struct io_cq *icq =
>                          list_entry(icq_list.next, struct io_cq, q_node);
> 
>                  spin_lock_irq(&icq->ioc->lock);
> -               if (!(icq->flags & ICQ_DESTROYED))
> -                       ioc_destroy_icq(icq);
> +               ioc_destroy_icq(icq);
>                  spin_unlock_irq(&icq->ioc->lock);
>          }
>          rcu_read_unlock();
> 
> .
> 

diff --git a/block/blk-ioc.c b/block/blk-ioc.c
index 63fc02042408..47684d1e9006 100644
--- a/block/blk-ioc.c
+++ b/block/blk-ioc.c
@@ -78,6 +78,9 @@ static void ioc_destroy_icq(struct io_cq *icq)

         lockdep_assert_held(&ioc->lock);

+       if (icq->flags & ICQ_DESTROYED)
+               return;
+
         radix_tree_delete(&ioc->icq_tree, icq->q->id);
         hlist_del_init(&icq->ioc_node);
         list_del_init(&icq->q_node);
@@ -128,12 +131,7 @@ static void ioc_release_fn(struct work_struct *work)
                         spin_lock(&q->queue_lock);
                         spin_lock(&ioc->lock);

-                       /*
-                        * The icq may have been destroyed when the ioc lock
-                        * was released.
-                        */
-                       if (!(icq->flags & ICQ_DESTROYED))
-                               ioc_destroy_icq(icq);
+                       ioc_destroy_icq(icq);

                         spin_unlock(&q->queue_lock);
                         rcu_read_unlock();
@@ -175,19 +173,16 @@ void ioc_clear_queue(struct request_queue *q)

         spin_lock_irq(&q->queue_lock);
         list_splice_init(&q->icq_list, &icq_list);
-       spin_unlock_irq(&q->queue_lock);

-       rcu_read_lock();
         while (!list_empty(&icq_list)) {
                 struct io_cq *icq =
                         list_entry(icq_list.next, struct io_cq, q_node);

                 spin_lock_irq(&icq->ioc->lock);
-               if (!(icq->flags & ICQ_DESTROYED))
-                       ioc_destroy_icq(icq);
+               ioc_destroy_icq(icq);
                 spin_unlock_irq(&icq->ioc->lock);
         }
-       rcu_read_unlock();
+       spin_unlock_irq(&q->queue_lock);
  }
  #else /* CONFIG_BLK_ICQ */
  static inline void ioc_exit_icqs(struct io_context *ioc)

Re: [PATCH V1] block: Fix null pointer dereference issue on struct io_cq
Posted by Pradeep Pragallapati 2 years, 7 months ago
On 5/18/2023 6:14 PM, Yu Kuai wrote:
> Hi,
>
> 在 2023/05/18 20:16, Yu Kuai 写道:
>
>> @@ -173,18 +171,17 @@ void ioc_clear_queue(struct request_queue *q)
>>   {
>>          LIST_HEAD(icq_list);
>>
>> +       rcu_read_lock();
>
> Sorry that I realized this is still not enough, following list_empty()
> and list_entry() can still concurrent with list_del(). Please try the
> following patch:
sure will try and update the results.
>> spin_lock_irq(&q->queue_lock);
>>          list_splice_init(&q->icq_list, &icq_list);
>>          spin_unlock_irq(&q->queue_lock);
>>
>> -       rcu_read_lock();
>>          while (!list_empty(&icq_list)) {
>>                  struct io_cq *icq =
>>                          list_entry(icq_list.next, struct io_cq, 
>> q_node);
>>
>>                  spin_lock_irq(&icq->ioc->lock);
>> -               if (!(icq->flags & ICQ_DESTROYED))
>> -                       ioc_destroy_icq(icq);
>> +               ioc_destroy_icq(icq);
>>                  spin_unlock_irq(&icq->ioc->lock);
>>          }
>>          rcu_read_unlock();
>>
>> .
>>
>
> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
> index 63fc02042408..47684d1e9006 100644
> --- a/block/blk-ioc.c
> +++ b/block/blk-ioc.c
> @@ -78,6 +78,9 @@ static void ioc_destroy_icq(struct io_cq *icq)
>
>         lockdep_assert_held(&ioc->lock);
>
> +       if (icq->flags & ICQ_DESTROYED)
> +               return;
> +
>         radix_tree_delete(&ioc->icq_tree, icq->q->id);
>         hlist_del_init(&icq->ioc_node);
>         list_del_init(&icq->q_node);
> @@ -128,12 +131,7 @@ static void ioc_release_fn(struct work_struct *work)
>                         spin_lock(&q->queue_lock);
>                         spin_lock(&ioc->lock);
>
> -                       /*
> -                        * The icq may have been destroyed when the 
> ioc lock
> -                        * was released.
> -                        */
> -                       if (!(icq->flags & ICQ_DESTROYED))
> -                               ioc_destroy_icq(icq);
> +                       ioc_destroy_icq(icq);
>
>                         spin_unlock(&q->queue_lock);
>                         rcu_read_unlock();
> @@ -175,19 +173,16 @@ void ioc_clear_queue(struct request_queue *q)
>
>         spin_lock_irq(&q->queue_lock);
>         list_splice_init(&q->icq_list, &icq_list);
> -       spin_unlock_irq(&q->queue_lock);
>
> -       rcu_read_lock();
>         while (!list_empty(&icq_list)) {
>                 struct io_cq *icq =
>                         list_entry(icq_list.next, struct io_cq, q_node);
>
>                 spin_lock_irq(&icq->ioc->lock);
> -               if (!(icq->flags & ICQ_DESTROYED))
> -                       ioc_destroy_icq(icq);
> +               ioc_destroy_icq(icq);
>                 spin_unlock_irq(&icq->ioc->lock);
>         }
> -       rcu_read_unlock();
> +       spin_unlock_irq(&q->queue_lock);
>  }
>  #else /* CONFIG_BLK_ICQ */
>  static inline void ioc_exit_icqs(struct io_context *ioc)
>
Re: [PATCH V1] block: Fix null pointer dereference issue on struct io_cq
Posted by Pradeep Pragallapati 2 years, 6 months ago
Hi,

On 5/22/2023 11:49 AM, Pradeep Pragallapati wrote:
>
> On 5/18/2023 6:14 PM, Yu Kuai wrote:
>> Hi,
>>
>> 在 2023/05/18 20:16, Yu Kuai 写道:
>>
>>> @@ -173,18 +171,17 @@ void ioc_clear_queue(struct request_queue *q)
>>>   {
>>>          LIST_HEAD(icq_list);
>>>
>>> +       rcu_read_lock();
>>
>> Sorry that I realized this is still not enough, following list_empty()
>> and list_entry() can still concurrent with list_del(). Please try the
>> following patch:
> sure will try and update the results.


At least for 80+hrs of testing, i didn't see the issue reproduced. seems 
like it is helping my case.

>>> spin_lock_irq(&q->queue_lock);
>>>          list_splice_init(&q->icq_list, &icq_list);
>>>          spin_unlock_irq(&q->queue_lock);
>>>
>>> -       rcu_read_lock();
>>>          while (!list_empty(&icq_list)) {
>>>                  struct io_cq *icq =
>>>                          list_entry(icq_list.next, struct io_cq, 
>>> q_node);
>>>
>>>                  spin_lock_irq(&icq->ioc->lock);
>>> -               if (!(icq->flags & ICQ_DESTROYED))
>>> -                       ioc_destroy_icq(icq);
>>> +               ioc_destroy_icq(icq);
>>>                  spin_unlock_irq(&icq->ioc->lock);
>>>          }
>>>          rcu_read_unlock();
>>>
>>> .
>>>
>>
>> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
>> index 63fc02042408..47684d1e9006 100644
>> --- a/block/blk-ioc.c
>> +++ b/block/blk-ioc.c
>> @@ -78,6 +78,9 @@ static void ioc_destroy_icq(struct io_cq *icq)
>>
>>         lockdep_assert_held(&ioc->lock);
>>
>> +       if (icq->flags & ICQ_DESTROYED)
>> +               return;
>> +
>>         radix_tree_delete(&ioc->icq_tree, icq->q->id);
>>         hlist_del_init(&icq->ioc_node);
>>         list_del_init(&icq->q_node);
>> @@ -128,12 +131,7 @@ static void ioc_release_fn(struct work_struct 
>> *work)
>>                         spin_lock(&q->queue_lock);
>>                         spin_lock(&ioc->lock);
>>
>> -                       /*
>> -                        * The icq may have been destroyed when the 
>> ioc lock
>> -                        * was released.
>> -                        */
>> -                       if (!(icq->flags & ICQ_DESTROYED))
>> -                               ioc_destroy_icq(icq);
>> +                       ioc_destroy_icq(icq);
>>
>>                         spin_unlock(&q->queue_lock);
>>                         rcu_read_unlock();
>> @@ -175,19 +173,16 @@ void ioc_clear_queue(struct request_queue *q)
>>
>>         spin_lock_irq(&q->queue_lock);
>>         list_splice_init(&q->icq_list, &icq_list);
>> -       spin_unlock_irq(&q->queue_lock);
>>
>> -       rcu_read_lock();
>>         while (!list_empty(&icq_list)) {
>>                 struct io_cq *icq =
>>                         list_entry(icq_list.next, struct io_cq, q_node);
>>
>>                 spin_lock_irq(&icq->ioc->lock);
>> -               if (!(icq->flags & ICQ_DESTROYED))
>> -                       ioc_destroy_icq(icq);
>> +               ioc_destroy_icq(icq);
>>                 spin_unlock_irq(&icq->ioc->lock);
>>         }
>> -       rcu_read_unlock();
>> +       spin_unlock_irq(&q->queue_lock);
>>  }
>>  #else /* CONFIG_BLK_ICQ */
>>  static inline void ioc_exit_icqs(struct io_context *ioc)
>>
Re: [PATCH V1] block: Fix null pointer dereference issue on struct io_cq
Posted by Yu Kuai 2 years, 6 months ago
Hi,

在 2023/05/30 21:15, Pradeep Pragallapati 写道:
> Hi,
> 
> On 5/22/2023 11:49 AM, Pradeep Pragallapati wrote:
>>
>> On 5/18/2023 6:14 PM, Yu Kuai wrote:
>>> Hi,
>>>
>>> 在 2023/05/18 20:16, Yu Kuai 写道:
>>>
>>>> @@ -173,18 +171,17 @@ void ioc_clear_queue(struct request_queue *q)
>>>>   {
>>>>          LIST_HEAD(icq_list);
>>>>
>>>> +       rcu_read_lock();
>>>
>>> Sorry that I realized this is still not enough, following list_empty()
>>> and list_entry() can still concurrent with list_del(). Please try the
>>> following patch:
>> sure will try and update the results.
> 
> 
> At least for 80+hrs of testing, i didn't see the issue reproduced. seems 
> like it is helping my case.

Thanks for the test, I'll send a patch soon.

Kuai

Re: [PATCH V1] block: Fix null pointer dereference issue on struct io_cq
Posted by Damien Le Moal 2 years, 7 months ago
On 5/17/23 17:58, Yu Kuai wrote:
> Hi,
> 
> 在 2023/05/17 16:44, Pradeep P V K 写道:
>> There is a potential race between ioc_clear_fn() and
>> exit_io_context() as shown below, due to which below
>> crash is observed. It can also result into use-after-free
>> issue.
>>
>> context#1:                           context#2:
>> ioc_release_fn()                     do_exit();
>> ->spin_lock(&ioc->lock);             ->exit_io_context();
>> ->ioc_destroy_icq(icq);              ->ioc_exit_icqs();
>>   ->list_del_init(&icq->q_node);       ->spin_lock_irq(&ioc->lock);
>>   ->call_rcu(&icq->__rcu_head,
>>       icq_free_icq_rcu);
>> ->spin_unlock(&ioc->lock);
>>                                        ->ioc_exit_icq(); gets the same icq
> I don't understand how is this possible, the list is protected by
> 'ioc->lock', both hlist_del_init and hlist_for_each_entry are called
> inside the lock.

Given that ioc_destroy_icq() calls ioc_exit_icq(), ioc_exit_icqs() should ignore
all icqs that have been destroyed already, otherwise, ioc_exit_icq() gets called
twice for the same icq. The missing rcu lock in ioc_exit_icqs() already was in
itself a bug, and the missing flag check is another.

> 
> Thanks,
> Kuai
>> 				       ->bfq_exit_icq();
>>                                    This results into below crash as bic
>> 				  is NULL as it is derived from icq.
>> 				  There is a chance that icq could be
>> 				  free'd as well.
>>
>> [33.245722][ T8666] Unable to handle kernel NULL pointer dereference
>> at virtual address 0000000000000018.
>> ...
>> Call trace:
>> [33.325782][ T8666]  bfq_exit_icq+0x28/0xa8
>> [33.325785][ T8666]  exit_io_context+0xcc/0x100
>> [33.325786][ T8666]  do_exit+0x764/0xa58
>> [33.325791][ T8666]  do_group_exit+0x0/0xa0
>> [33.325793][ T8666]  invoke_syscall+0x48/0x114
>> [33.325802][ T8666]  el0_svc_common+0xcc/0x118
>> [33.325805][ T8666]  do_el0_svc+0x34/0xd0
>> [33.325807][ T8666]  el0_svc+0x38/0xd0
>> [33.325812][ T8666]  el0t_64_sync_handler+0x8c/0xfc
>> [33.325813][ T8666]  el0t_64_sync+0x1a0/0x1a4
>>
>> Fix this by checking with ICQ_DESTROYED flags in ioc_exit_icqs().
>> Also, ensure ioc_exit_icq() is accessing icq within rcu_read_lock/unlock
>> so that icq doesn't get free'd up while it is still using it.
>>
>> Signed-off-by: Pradeep P V K <quic_pragalla@quicinc.com>

Pradeep, this needs a Fixes tag and cc-stable I think.

>> ---
>>   block/blk-ioc.c | 8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
>> index 63fc02042408..1aa34fd46ac8 100644
>> --- a/block/blk-ioc.c
>> +++ b/block/blk-ioc.c
>> @@ -60,10 +60,14 @@ static void ioc_exit_icqs(struct io_context *ioc)
>>   {
>>   	struct io_cq *icq;
>>   
>> +	rcu_read_lock();
>>   	spin_lock_irq(&ioc->lock);
>> -	hlist_for_each_entry(icq, &ioc->icq_list, ioc_node)
>> -		ioc_exit_icq(icq);
>> +	hlist_for_each_entry(icq, &ioc->icq_list, ioc_node) {
>> +		if (!(icq->flags & ICQ_DESTROYED))
>> +			ioc_exit_icq(icq);
>> +	}
>>   	spin_unlock_irq(&ioc->lock);
>> +	rcu_read_unlock();
>>   }
>>   
>>   /*
>>
> 

-- 
Damien Le Moal
Western Digital Research

Re: [PATCH V1] block: Fix null pointer dereference issue on struct io_cq
Posted by Christoph Hellwig 2 years, 7 months ago
On Wed, May 17, 2023 at 06:20:19PM +0900, Damien Le Moal wrote:
> twice for the same icq. The missing rcu lock in ioc_exit_icqs() already was in
> itself a bug, and the missing flag check is another.

spinlocks imply a rcu critical section, no need to duplicate it.
Re: [PATCH V1] block: Fix null pointer dereference issue on struct io_cq
Posted by Damien Le Moal 2 years, 7 months ago
On 5/17/23 18:21, Christoph Hellwig wrote:
> On Wed, May 17, 2023 at 06:20:19PM +0900, Damien Le Moal wrote:
>> twice for the same icq. The missing rcu lock in ioc_exit_icqs() already was in
>> itself a bug, and the missing flag check is another.
> 
> spinlocks imply a rcu critical section, no need to duplicate it.

Right. And I misread the code. As Yu said, given that ioc_exit_icqs() iterates
the list of icqs under ioc->lock and the ioc is removed from that list under the
same lock, ioc_exit_icqs() should never see an icq that went through
ioc_destroy_icq()...
Very weird.


-- 
Damien Le Moal
Western Digital Research
RE: [PATCH V1] block: Fix null pointer dereference issue on struct io_cq
Posted by Pradeep Pragallapati (QUIC) 2 years, 7 months ago
Hi,

-----Original Message-----
From: Damien Le Moal <dlemoal@kernel.org> 
Sent: Wednesday, May 17, 2023 3:02 PM
To: Christoph Hellwig <hch@infradead.org>
Cc: Yu Kuai <yukuai1@huaweicloud.com>; Pradeep Pragallapati (QUIC) <quic_pragalla@quicinc.com>; axboe@kernel.dk; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; yukuai (C) <yukuai3@huawei.com>
Subject: Re: [PATCH V1] block: Fix null pointer dereference issue on struct io_cq

On 5/17/23 18:21, Christoph Hellwig wrote:
> On Wed, May 17, 2023 at 06:20:19PM +0900, Damien Le Moal wrote:
>> twice for the same icq. The missing rcu lock in ioc_exit_icqs() 
>> already was in itself a bug, and the missing flag check is another.
> 
> spinlocks imply a rcu critical section, no need to duplicate it.

Right. And I misread the code. As Yu said, given that ioc_exit_icqs() iterates the list of icqs under ioc->lock and the ioc is removed from that list under the same lock, ioc_exit_icqs() should never see an icq that went through ioc_destroy_icq()...
Very weird.

This weird can be possible 
1. updating icq_hint which is annotated as __rcu type without RCU-protected context in ioc_destroy_icq().
Moreover, this was taken care in else part of ioc_release_fn() by rcu_read_lock/unlock() but missed in if statement which can lead to this weird.

2. extracting icq from hlist/list elements are done using rcu locks protected in ioc_clear_queue() but same was not at ioc_exit_icqs().

So, far we have seen 10+ instances of this crash on 6.1 kernel during stability testing (Involves IO, reboots, device suspend/resume, and few more).
With the V1 patch, we didn't observe the issue for at least 48hrs+ of stability testing.


--
Damien Le Moal
Western Digital Research

Re: [PATCH V1] block: Fix null pointer dereference issue on struct io_cq
Posted by Chaitanya Kulkarni 2 years, 7 months ago
On 5/17/23 01:44, Pradeep P V K wrote:
> There is a potential race between ioc_clear_fn() and
> exit_io_context() as shown below, due to which below
> crash is observed. It can also result into use-after-free
> issue.
>
> context#1:                           context#2:
> ioc_release_fn()                     do_exit();
> ->spin_lock(&ioc->lock);             ->exit_io_context();
> ->ioc_destroy_icq(icq);              ->ioc_exit_icqs();
>   ->list_del_init(&icq->q_node);       ->spin_lock_irq(&ioc->lock);
>   ->call_rcu(&icq->__rcu_head,
>       icq_free_icq_rcu);
> ->spin_unlock(&ioc->lock);
>                                        ->ioc_exit_icq(); gets the same icq
> 				       ->bfq_exit_icq();
>                                    This results into below crash as bic
> 				  is NULL as it is derived from icq.
> 				  There is a chance that icq could be
> 				  free'd as well.
>
> [33.245722][ T8666] Unable to handle kernel NULL pointer dereference
> at virtual address 0000000000000018.
> ...
> Call trace:
> [33.325782][ T8666]  bfq_exit_icq+0x28/0xa8
> [33.325785][ T8666]  exit_io_context+0xcc/0x100
> [33.325786][ T8666]  do_exit+0x764/0xa58
> [33.325791][ T8666]  do_group_exit+0x0/0xa0
> [33.325793][ T8666]  invoke_syscall+0x48/0x114
> [33.325802][ T8666]  el0_svc_common+0xcc/0x118
> [33.325805][ T8666]  do_el0_svc+0x34/0xd0
> [33.325807][ T8666]  el0_svc+0x38/0xd0
> [33.325812][ T8666]  el0t_64_sync_handler+0x8c/0xfc
> [33.325813][ T8666]  el0t_64_sync+0x1a0/0x1a4
>
> Fix this by checking with ICQ_DESTROYED flags in ioc_exit_icqs().
> Also, ensure ioc_exit_icq() is accessing icq within rcu_read_lock/unlock
> so that icq doesn't get free'd up while it is still using it.
>
> Signed-off-by: Pradeep P V K <quic_pragalla@quicinc.com>
> ---

Is it possible to add a null_blk based blktests for this ?

-ck