[PATCH v3 1/8] xen/common/dt-overlay: Fix lock issue when add/remove the device

Henry Wang posted 8 patches 6 months ago
There is a newer version of this series
[PATCH v3 1/8] xen/common/dt-overlay: Fix lock issue when add/remove the device
Posted by Henry Wang 6 months ago
If CONFIG_DEBUG=y, below assertion will be triggered:
(XEN) Assertion 'rw_is_locked(&dt_host_lock)' failed at drivers/passthrough/device_tree.c:146
(XEN) ----[ Xen-4.19-unstable  arm64  debug=y  Not tainted ]----
[...]
(XEN) Xen call trace:
(XEN)    [<00000a0000257418>] iommu_remove_dt_device+0x8c/0xd4 (PC)
(XEN)    [<00000a00002573a0>] iommu_remove_dt_device+0x14/0xd4 (LR)
(XEN)    [<00000a000020797c>] dt-overlay.c#remove_node_resources+0x8c/0x90
(XEN)    [<00000a0000207f14>] dt-overlay.c#remove_nodes+0x524/0x648
(XEN)    [<00000a0000208460>] dt_overlay_sysctl+0x428/0xc68
(XEN)    [<00000a00002707f8>] arch_do_sysctl+0x1c/0x2c
(XEN)    [<00000a0000230b40>] do_sysctl+0x96c/0x9ec
(XEN)    [<00000a0000271e08>] traps.c#do_trap_hypercall+0x1e8/0x288
(XEN)    [<00000a0000273490>] do_trap_guest_sync+0x448/0x63c
(XEN)    [<00000a000025c480>] entry.o#guest_sync_slowpath+0xa8/0xd8
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'rw_is_locked(&dt_host_lock)' failed at drivers/passthrough/device_tree.c:146
(XEN) ****************************************

This is because iommu_remove_dt_device() is called without taking the
dt_host_lock. dt_host_lock is meant to ensure that the DT node will not
disappear behind back. So fix the issue by taking the lock as soon as
getting hold of overlay_node.

Similar issue will be observed in adding the dtbo:
(XEN) Assertion 'system_state < SYS_STATE_active || rw_is_locked(&dt_host_lock)'
failed at xen-source/xen/drivers/passthrough/device_tree.c:192
(XEN) ----[ Xen-4.19-unstable  arm64  debug=y  Not tainted ]----
[...]
(XEN) Xen call trace:
(XEN)    [<00000a00002594f4>] iommu_add_dt_device+0x7c/0x17c (PC)
(XEN)    [<00000a0000259494>] iommu_add_dt_device+0x1c/0x17c (LR)
(XEN)    [<00000a0000267db4>] handle_device+0x68/0x1e8
(XEN)    [<00000a0000208ba8>] dt_overlay_sysctl+0x9d4/0xb84
(XEN)    [<00000a000027342c>] arch_do_sysctl+0x24/0x38
(XEN)    [<00000a0000231ac8>] do_sysctl+0x9ac/0xa34
(XEN)    [<00000a0000274b70>] traps.c#do_trap_hypercall+0x230/0x2dc
(XEN)    [<00000a0000276330>] do_trap_guest_sync+0x478/0x688
(XEN)    [<00000a000025e480>] entry.o#guest_sync_slowpath+0xa8/0xd8

This is because the lock is released too early. So fix the issue by
releasing the lock after handle_device().

Fixes: 7e5c4a8b86f1 ("xen/arm: Implement device tree node removal functionalities")
Signed-off-by: Henry Wang <xin.wang2@amd.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
v3:
- Add Julien's Reviewed-by tag.
v2:
- Take the lock as soon as getting hold of overlay_node. Also
  release the lock after handle_device() when adding dtbo.
v1.1:
- Move the unlock position before the check of rc.
---
 xen/common/dt-overlay.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/xen/common/dt-overlay.c b/xen/common/dt-overlay.c
index 1b197381f6..9cece79067 100644
--- a/xen/common/dt-overlay.c
+++ b/xen/common/dt-overlay.c
@@ -429,18 +429,24 @@ static int remove_nodes(const struct overlay_track *tracker)
         if ( overlay_node == NULL )
             return -EINVAL;
 
+        write_lock(&dt_host_lock);
+
         rc = remove_descendant_nodes_resources(overlay_node);
         if ( rc )
+        {
+            write_unlock(&dt_host_lock);
             return rc;
+        }
 
         rc = remove_node_resources(overlay_node);
         if ( rc )
+        {
+            write_unlock(&dt_host_lock);
             return rc;
+        }
 
         dt_dprintk("Removing node: %s\n", overlay_node->full_name);
 
-        write_lock(&dt_host_lock);
-
         rc = dt_overlay_remove_node(overlay_node);
         if ( rc )
         {
@@ -604,8 +610,6 @@ static long add_nodes(struct overlay_track *tr, char **nodes_full_path)
             return rc;
         }
 
-        write_unlock(&dt_host_lock);
-
         prev_node->allnext = next_node;
 
         overlay_node = dt_find_node_by_path(overlay_node->full_name);
@@ -619,6 +623,7 @@ static long add_nodes(struct overlay_track *tr, char **nodes_full_path)
         rc = handle_device(hardware_domain, overlay_node, p2m_mmio_direct_c,
                            tr->iomem_ranges,
                            tr->irq_ranges);
+        write_unlock(&dt_host_lock);
         if ( rc )
         {
             printk(XENLOG_ERR "Adding IRQ and IOMMU failed\n");
-- 
2.34.1