[PATCH v1.1] xen/commom/dt-overlay: Fix missing lock when remove the device

Henry Wang posted 1 patch 1 year, 9 months ago
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20240426015550.577986-1-xin.wang2@amd.com
xen/common/dt-overlay.c | 2 ++
1 file changed, 2 insertions(+)
[PATCH v1.1] xen/commom/dt-overlay: Fix missing lock when remove the device
Posted by Henry Wang 1 year, 9 months ago
If CONFIG_DEBUG=y, below assertion will be triggered:
(XEN) Assertion 'rw_is_locked(&dt_host_lock)' failed at drivers/passthrough/device_tree.c:146
(XEN) ----[ Xen-4.19-unstable  arm64  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) PC:     00000a0000257418 iommu_remove_dt_device+0x8c/0xd4
(XEN) LR:     00000a00002573a0
(XEN) SP:     00008000fff7fb30
(XEN) CPSR:   0000000000000249 MODE:64-bit EL2h (Hypervisor, handler)
[...]

(XEN) Xen call trace:
(XEN)    [<00000a0000257418>] iommu_remove_dt_device+0x8c/0xd4 (PC)
(XEN)    [<00000a00002573a0>] iommu_remove_dt_device+0x14/0xd4 (LR)
(XEN)    [<00000a000020797c>] dt-overlay.c#remove_node_resources+0x8c/0x90
(XEN)    [<00000a0000207f14>] dt-overlay.c#remove_nodes+0x524/0x648
(XEN)    [<00000a0000208460>] dt_overlay_sysctl+0x428/0xc68
(XEN)    [<00000a00002707f8>] arch_do_sysctl+0x1c/0x2c
(XEN)    [<00000a0000230b40>] do_sysctl+0x96c/0x9ec
(XEN)    [<00000a0000271e08>] traps.c#do_trap_hypercall+0x1e8/0x288
(XEN)    [<00000a0000273490>] do_trap_guest_sync+0x448/0x63c
(XEN)    [<00000a000025c480>] entry.o#guest_sync_slowpath+0xa8/0xd8
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'rw_is_locked(&dt_host_lock)' failed at drivers/passthrough/device_tree.c:146
(XEN) ****************************************

This is because iommu_remove_dt_device() is called without taking the
dt_host_lock. Fix the issue by taking and releasing the lock properly.

Fixes: 7e5c4a8b86f1 ("xen/arm: Implement device tree node removal functionalities")
Signed-off-by: Henry Wang <xin.wang2@amd.com>
---
v1.1:
- Move the unlock position before the check of rc.
---
 xen/common/dt-overlay.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/common/dt-overlay.c b/xen/common/dt-overlay.c
index 1b197381f6..ab8f43aea2 100644
--- a/xen/common/dt-overlay.c
+++ b/xen/common/dt-overlay.c
@@ -381,7 +381,9 @@ static int remove_node_resources(struct dt_device_node *device_node)
     {
         if ( dt_device_is_protected(device_node) )
         {
+            write_lock(&dt_host_lock);
             rc = iommu_remove_dt_device(device_node);
+            write_unlock(&dt_host_lock);
             if ( rc < 0 )
                 return rc;
         }
-- 
2.34.1


Re: [PATCH v1.1] xen/commom/dt-overlay: Fix missing lock when remove the device
Posted by Julien Grall 1 year, 9 months ago
Hi Henry,

On 26/04/2024 02:55, Henry Wang wrote:
> If CONFIG_DEBUG=y, below assertion will be triggered:
> (XEN) Assertion 'rw_is_locked(&dt_host_lock)' failed at drivers/passthrough/device_tree.c:146
> (XEN) ----[ Xen-4.19-unstable  arm64  debug=y  Not tainted ]----
> (XEN) CPU:    0
> (XEN) PC:     00000a0000257418 iommu_remove_dt_device+0x8c/0xd4
> (XEN) LR:     00000a00002573a0
> (XEN) SP:     00008000fff7fb30
> (XEN) CPSR:   0000000000000249 MODE:64-bit EL2h (Hypervisor, handler)
> [...]
> 
> (XEN) Xen call trace:
> (XEN)    [<00000a0000257418>] iommu_remove_dt_device+0x8c/0xd4 (PC)
> (XEN)    [<00000a00002573a0>] iommu_remove_dt_device+0x14/0xd4 (LR)
> (XEN)    [<00000a000020797c>] dt-overlay.c#remove_node_resources+0x8c/0x90
> (XEN)    [<00000a0000207f14>] dt-overlay.c#remove_nodes+0x524/0x648
> (XEN)    [<00000a0000208460>] dt_overlay_sysctl+0x428/0xc68
> (XEN)    [<00000a00002707f8>] arch_do_sysctl+0x1c/0x2c
> (XEN)    [<00000a0000230b40>] do_sysctl+0x96c/0x9ec
> (XEN)    [<00000a0000271e08>] traps.c#do_trap_hypercall+0x1e8/0x288
> (XEN)    [<00000a0000273490>] do_trap_guest_sync+0x448/0x63c
> (XEN)    [<00000a000025c480>] entry.o#guest_sync_slowpath+0xa8/0xd8
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Assertion 'rw_is_locked(&dt_host_lock)' failed at drivers/passthrough/device_tree.c:146
> (XEN) ****************************************
> 
> This is because iommu_remove_dt_device() is called without taking the
> dt_host_lock. Fix the issue by taking and releasing the lock properly.
> 
> Fixes: 7e5c4a8b86f1 ("xen/arm: Implement device tree node removal functionalities")
> Signed-off-by: Henry Wang <xin.wang2@amd.com>
> ---
> v1.1:
> - Move the unlock position before the check of rc.
> ---
>   xen/common/dt-overlay.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/xen/common/dt-overlay.c b/xen/common/dt-overlay.c
> index 1b197381f6..ab8f43aea2 100644
> --- a/xen/common/dt-overlay.c
> +++ b/xen/common/dt-overlay.c
> @@ -381,7 +381,9 @@ static int remove_node_resources(struct dt_device_node *device_node)
>       {
>           if ( dt_device_is_protected(device_node) )
>           {
> +            write_lock(&dt_host_lock);

Looking at the code, we are not modifying the device_node, so shouldn't 
this be a read_lock()?

That said, even though either fix your issue, I am not entirely 
convinced this is the correct position for the lock. From my 
understanding, dt_host_lock is meant to ensure that the DT node will not 
disappear behind your back. So in theory, shouldn't the lock be taken as 
soon as you get hold of device_node?

Cheers,

-- 
Julien Grall

Re: [PATCH v1.1] xen/commom/dt-overlay: Fix missing lock when remove the device
Posted by Henry Wang 1 year, 9 months ago
Hi Julien,

On 5/3/2024 9:04 PM, Julien Grall wrote:
> Hi Henry,
>
> On 26/04/2024 02:55, Henry Wang wrote:
>> If CONFIG_DEBUG=y, below assertion will be triggered:
>> (XEN) Assertion 'rw_is_locked(&dt_host_lock)' failed at 
>> drivers/passthrough/device_tree.c:146
>> (XEN) ----[ Xen-4.19-unstable  arm64  debug=y  Not tainted ]----
>> (XEN) CPU:    0
>> (XEN) PC:     00000a0000257418 iommu_remove_dt_device+0x8c/0xd4
>> (XEN) LR:     00000a00002573a0
>> (XEN) SP:     00008000fff7fb30
>> (XEN) CPSR:   0000000000000249 MODE:64-bit EL2h (Hypervisor, handler)
>> [...]
>>
>> (XEN) Xen call trace:
>> (XEN)    [<00000a0000257418>] iommu_remove_dt_device+0x8c/0xd4 (PC)
>> (XEN)    [<00000a00002573a0>] iommu_remove_dt_device+0x14/0xd4 (LR)
>> (XEN)    [<00000a000020797c>] 
>> dt-overlay.c#remove_node_resources+0x8c/0x90
>> (XEN)    [<00000a0000207f14>] dt-overlay.c#remove_nodes+0x524/0x648
>> (XEN)    [<00000a0000208460>] dt_overlay_sysctl+0x428/0xc68
>> (XEN)    [<00000a00002707f8>] arch_do_sysctl+0x1c/0x2c
>> (XEN)    [<00000a0000230b40>] do_sysctl+0x96c/0x9ec
>> (XEN)    [<00000a0000271e08>] traps.c#do_trap_hypercall+0x1e8/0x288
>> (XEN)    [<00000a0000273490>] do_trap_guest_sync+0x448/0x63c
>> (XEN)    [<00000a000025c480>] entry.o#guest_sync_slowpath+0xa8/0xd8
>> (XEN)
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 0:
>> (XEN) Assertion 'rw_is_locked(&dt_host_lock)' failed at 
>> drivers/passthrough/device_tree.c:146
>> (XEN) ****************************************
>>
>> This is because iommu_remove_dt_device() is called without taking the
>> dt_host_lock. Fix the issue by taking and releasing the lock properly.
>>
>> Fixes: 7e5c4a8b86f1 ("xen/arm: Implement device tree node removal 
>> functionalities")
>> Signed-off-by: Henry Wang <xin.wang2@amd.com>
>> ---
>> v1.1:
>> - Move the unlock position before the check of rc.
>> ---
>>   xen/common/dt-overlay.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/xen/common/dt-overlay.c b/xen/common/dt-overlay.c
>> index 1b197381f6..ab8f43aea2 100644
>> --- a/xen/common/dt-overlay.c
>> +++ b/xen/common/dt-overlay.c
>> @@ -381,7 +381,9 @@ static int remove_node_resources(struct 
>> dt_device_node *device_node)
>>       {
>>           if ( dt_device_is_protected(device_node) )
>>           {
>> +            write_lock(&dt_host_lock);
>
> Looking at the code, we are not modifying the device_node, so 
> shouldn't this be a read_lock()?

Hmm yes, however after seeing your comment...

>
> That said, even though either fix your issue, I am not entirely 
> convinced this is the correct position for the lock. From my 
> understanding, dt_host_lock is meant to ensure that the DT node will 
> not disappear behind your back. So in theory, shouldn't the lock be 
> taken as soon as you get hold of device_node?

...here. I believe you made a point here so I think I will just move the 
write_lock(&dt_host_lock) as soon as getting  overlay_node, i.e. on top 
of the call to remove_descendant_nodes_resources(). Therefore we can 
solve the assertion issue of this patch together.

Kind regards,
Henry

>
> Cheers,
>


Re: [PATCH v1.1] xen/commom/dt-overlay: Fix missing lock when remove the device
Posted by Stefano Stabellini 1 year, 9 months ago
On Fri, 26 Apr 2024, Henry Wang wrote:
> If CONFIG_DEBUG=y, below assertion will be triggered:
> (XEN) Assertion 'rw_is_locked(&dt_host_lock)' failed at drivers/passthrough/device_tree.c:146
> (XEN) ----[ Xen-4.19-unstable  arm64  debug=y  Not tainted ]----
> (XEN) CPU:    0
> (XEN) PC:     00000a0000257418 iommu_remove_dt_device+0x8c/0xd4
> (XEN) LR:     00000a00002573a0
> (XEN) SP:     00008000fff7fb30
> (XEN) CPSR:   0000000000000249 MODE:64-bit EL2h (Hypervisor, handler)
> [...]
> 
> (XEN) Xen call trace:
> (XEN)    [<00000a0000257418>] iommu_remove_dt_device+0x8c/0xd4 (PC)
> (XEN)    [<00000a00002573a0>] iommu_remove_dt_device+0x14/0xd4 (LR)
> (XEN)    [<00000a000020797c>] dt-overlay.c#remove_node_resources+0x8c/0x90
> (XEN)    [<00000a0000207f14>] dt-overlay.c#remove_nodes+0x524/0x648
> (XEN)    [<00000a0000208460>] dt_overlay_sysctl+0x428/0xc68
> (XEN)    [<00000a00002707f8>] arch_do_sysctl+0x1c/0x2c
> (XEN)    [<00000a0000230b40>] do_sysctl+0x96c/0x9ec
> (XEN)    [<00000a0000271e08>] traps.c#do_trap_hypercall+0x1e8/0x288
> (XEN)    [<00000a0000273490>] do_trap_guest_sync+0x448/0x63c
> (XEN)    [<00000a000025c480>] entry.o#guest_sync_slowpath+0xa8/0xd8
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Assertion 'rw_is_locked(&dt_host_lock)' failed at drivers/passthrough/device_tree.c:146
> (XEN) ****************************************
> 
> This is because iommu_remove_dt_device() is called without taking the
> dt_host_lock. Fix the issue by taking and releasing the lock properly.
> 
> Fixes: 7e5c4a8b86f1 ("xen/arm: Implement device tree node removal functionalities")
> Signed-off-by: Henry Wang <xin.wang2@amd.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> v1.1:
> - Move the unlock position before the check of rc.
> ---
>  xen/common/dt-overlay.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/xen/common/dt-overlay.c b/xen/common/dt-overlay.c
> index 1b197381f6..ab8f43aea2 100644
> --- a/xen/common/dt-overlay.c
> +++ b/xen/common/dt-overlay.c
> @@ -381,7 +381,9 @@ static int remove_node_resources(struct dt_device_node *device_node)
>      {
>          if ( dt_device_is_protected(device_node) )
>          {
> +            write_lock(&dt_host_lock);
>              rc = iommu_remove_dt_device(device_node);
> +            write_unlock(&dt_host_lock);
>              if ( rc < 0 )
>                  return rc;
>          }
> -- 
> 2.34.1
>