[PATCH v2] coresight: Fix possible deadlock in coresight_panic_cb

Sean Anderson posted 1 patch 3 weeks ago
There is a newer version of this series
[PATCH v2] coresight: Fix possible deadlock in coresight_panic_cb
Posted by Sean Anderson 3 weeks ago
coresight_panic_cb is called with interrupts disabled during panics.
However, bus_for_each_dev calls bus_to_subsys which takes
bus_kset->list_lock without disabling IRQs. This will cause a deadlock
if a panic occurs while one of the other coresight functions that uses
bus_for_each_dev is running.

Maintain a separate list of coresight devices to access during a panic.

Fixes: 46006ceb5d02 ("coresight: core: Add provision for panic callbacks")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
---

Changes in v2:
- Add a comment describing csdev_lock/list
- Consolidate list removal in coresight_device_release

 drivers/hwtracing/coresight/coresight-core.c | 43 +++++++++++---------
 include/linux/coresight.h                    |  1 +
 2 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
index fa758cc21827..4e28e56f2e30 100644
--- a/drivers/hwtracing/coresight/coresight-core.c
+++ b/drivers/hwtracing/coresight/coresight-core.c
@@ -1042,10 +1042,19 @@ static void coresight_clear_default_sink(struct coresight_device *csdev)
 	}
 }
 
+/*
+ * Dedicated list of devices for use during panic (which may occur with
+ * interrupts disabled).
+ */
+static DEFINE_SPINLOCK(csdev_lock);
+static LIST_HEAD(csdev_list);
+
 static void coresight_device_release(struct device *dev)
 {
 	struct coresight_device *csdev = to_coresight_device(dev);
 
+	scoped_guard(spinlock_irq, &csdev_lock)
+		list_del(&csdev->csdev_list);
 	fwnode_handle_put(csdev->dev.fwnode);
 	free_percpu(csdev->perf_sink_id_map.cpu_map);
 	kfree(csdev);
@@ -1357,6 +1366,10 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
 			goto err_out;
 		}
 	}
+
+	scoped_guard(spinlock_irq, &csdev_lock)
+		list_add(&csdev->csdev_list, &csdev_list);
+
 	/*
 	 * Make sure the device registration and the connection fixup
 	 * are synchronised, so that we don't see uninitialised devices
@@ -1563,28 +1576,20 @@ const struct bus_type coresight_bustype = {
 	.name	= "coresight",
 };
 
-static int coresight_panic_sync(struct device *dev, void *data)
-{
-	int mode;
-	struct coresight_device *csdev;
-
-	/* Run through panic sync handlers for all enabled devices */
-	csdev = container_of(dev, struct coresight_device, dev);
-	mode = coresight_get_mode(csdev);
-
-	if ((mode == CS_MODE_SYSFS) || (mode == CS_MODE_PERF)) {
-		if (panic_ops(csdev))
-			panic_ops(csdev)->sync(csdev);
-	}
-
-	return 0;
-}
-
 static int coresight_panic_cb(struct notifier_block *self,
 			       unsigned long v, void *p)
 {
-	bus_for_each_dev(&coresight_bustype, NULL, NULL,
-				 coresight_panic_sync);
+	struct coresight_device *csdev;
+
+	guard(spinlock)(&csdev_lock);
+	list_for_each_entry(csdev, &csdev_list, csdev_list) {
+		/* Run through panic sync handlers for all enabled devices */
+		int mode = coresight_get_mode(csdev);
+
+		if ((mode == CS_MODE_SYSFS || mode == CS_MODE_PERF) &&
+		    panic_ops(csdev))
+			panic_ops(csdev)->sync(csdev);
+	}
 
 	return 0;
 }
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index 4ac65c68bbf4..a5e62ebd03b5 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -302,6 +302,7 @@ struct coresight_device {
 	/* system configuration and feature lists */
 	struct list_head feature_csdev_list;
 	struct list_head config_csdev_list;
+	struct list_head csdev_list;
 	raw_spinlock_t cscfg_csdev_lock;
 	void *active_cscfg_ctxt;
 };
-- 
2.35.1.1320.gc452695387.dirty
Re: [PATCH v2] coresight: Fix possible deadlock in coresight_panic_cb
Posted by Leo Yan 2 weeks, 6 days ago
Hi Sean,

On Thu, Sep 11, 2025 at 11:33:15AM -0400, Sean Anderson wrote:
> coresight_panic_cb is called with interrupts disabled during panics.
> However, bus_for_each_dev calls bus_to_subsys which takes
> bus_kset->list_lock without disabling IRQs. This will cause a deadlock
> if a panic occurs while one of the other coresight functions that uses
> bus_for_each_dev is running.

The description is a bit misleading. Even when IRQs are disabled, if an
exception happens, a CPU can still be trapped into handling a kernel panic.

> Maintain a separate list of coresight devices to access during a panic.

Rather than maintaining a separate list and introducing a new spinlock,
I wonder whether we can simply register panic notifiers in the TMC ETR
and ETF drivers (see tmc_panic_sync_etr() and tmc_panic_sync_etf()).

If there is no dependency between CoreSight modules in the panic sync flow,
it is not necessary to maintain a list (and lock) for these modules.

I have not been involved in the panic patches before, so I would like to
know the maintainers' opinion.

Thanks,
Leo
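
The per-driver approach suggested above can be illustrated with a small
userspace model. This is only a sketch, not kernel code: struct
notifier_block, chain_register() and chain_call() below are simplified
stand-ins for the kernel notifier API (atomic_notifier_chain_register() on
panic_notifier_list), and struct tmc_model, its fields and the MODEL_*
modes are hypothetical. It shows each device embedding its own notifier
block, so the panic path walks the notifier chain and needs no separate
device list or device-list lock.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: these types mimic, but are not, the kernel's. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long action, void *data);
	struct notifier_block *next;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

enum model_mode { MODEL_DISABLED, MODEL_SYSFS, MODEL_PERF };

/* Hypothetical stand-in for one TMC ETR/ETF driver instance. */
struct tmc_model {
	enum model_mode mode;
	int synced;			/* set once the panic sync has run */
	struct notifier_block panic_nb;	/* one notifier per device */
};

/* Head of the modelled panic notifier chain. */
static struct notifier_block *panic_chain;

/* Push a notifier onto the chain (stand-in for chain registration). */
static void chain_register(struct notifier_block *nb)
{
	assert(nb && nb->notifier_call);
	nb->next = panic_chain;
	panic_chain = nb;
}

/* Invoke every registered notifier, as the panic path would. */
static void chain_call(unsigned long action, void *data)
{
	for (struct notifier_block *nb = panic_chain; nb; nb = nb->next)
		nb->notifier_call(nb, action, data);
}

/* Per-device callback: recover the device from its embedded notifier. */
static int tmc_model_panic_cb(struct notifier_block *nb,
			      unsigned long action, void *data)
{
	struct tmc_model *tmc = container_of(nb, struct tmc_model, panic_nb);

	(void)action;
	(void)data;

	/* Only sync devices that are actually enabled. */
	if (tmc->mode == MODEL_SYSFS || tmc->mode == MODEL_PERF)
		tmc->synced = 1;

	return 0;
}

/* Modelled probe: initialise the device and register its notifier. */
static void tmc_model_probe(struct tmc_model *tmc, enum model_mode mode)
{
	tmc->mode = mode;
	tmc->synced = 0;
	tmc->panic_nb.notifier_call = tmc_model_panic_cb;
	tmc->panic_nb.next = NULL;
	chain_register(&tmc->panic_nb);
}
```

In this model, probing one enabled and one disabled device and then
calling chain_call(0, NULL) syncs only the enabled one. A real driver
would also have to unregister its notifier on removal, which the model
leaves out.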
Re: [PATCH v2] coresight: Fix possible deadlock in coresight_panic_cb
Posted by Yeoreum Yun 2 weeks, 6 days ago
Hi,

> Hi Sean,
>
> On Thu, Sep 11, 2025 at 11:33:15AM -0400, Sean Anderson wrote:
> > coresight_panic_cb is called with interrupts disabled during panics.
> > However, bus_for_each_dev calls bus_to_subsys which takes
> > bus_kset->list_lock without disabling IRQs. This will cause a deadlock
> > if a panic occurs while one of the other coresight functions that uses
> > bus_for_each_dev is running.
>
> The description is a bit misleading. Even when IRQs are disabled, if an
> exception happens, a CPU can still be trapped into handling a kernel panic.
>
> > Maintain a separate list of coresight devices to access during a panic.
>
> Rather than maintaining a separate list and introducing a new spinlock,
> I wonder whether we can simply register panic notifiers in the TMC ETR
> and ETF drivers (see tmc_panic_sync_etr() and tmc_panic_sync_etf()).
>
> If there is no dependency between CoreSight modules in the panic sync flow,
> it is not necessary to maintain a list (and lock) for these modules.

+1 for this.
Also, using a spinlock in the panic_cb doesn't work on PREEMPT_RT.

Thanks.

--
Sincerely,
Yeoreum Yun
Re: [PATCH v2] coresight: Fix possible deadlock in coresight_panic_cb
Posted by Sean Anderson 2 weeks, 6 days ago
On 9/12/25 07:03, Yeoreum Yun wrote:
> Hi,
> 
>> Hi Sean,
>>
>> On Thu, Sep 11, 2025 at 11:33:15AM -0400, Sean Anderson wrote:
>> > coresight_panic_cb is called with interrupts disabled during panics.
>> > However, bus_for_each_dev calls bus_to_subsys which takes
>> > bus_kset->list_lock without disabling IRQs. This will cause a deadlock
>> > if a panic occurs while one of the other coresight functions that uses
>> > bus_for_each_dev is running.
>>
>> The description is a bit misleading. Even when IRQs are disabled, if an
>> exception happens, a CPU can still be trapped into handling a kernel panic.
>>
>> > Maintain a separate list of coresight devices to access during a panic.
>>
>> Rather than maintaining a separate list and introducing a new spinlock,
>> I wonder whether we can simply register panic notifiers in the TMC ETR
>> and ETF drivers (see tmc_panic_sync_etr() and tmc_panic_sync_etf()).
>>
>> If there is no dependency between CoreSight modules in the panic sync flow,
>> it is not necessary to maintain a list (and lock) for these modules.

Yeah, I was thinking about this as I was preparing v2 of this patch.

> +1 for this.
> Also, using a spinlock in the panic_cb doesn't work on PREEMPT_RT.

What do you mean by this? I am using lockdep and it did not warn about this,
so I assume that on PREEMPT_RT IRQs remain enabled in this path.

--Sean
Re: [PATCH v2] coresight: Fix possible deadlock in coresight_panic_cb
Posted by Yeoreum Yun 2 weeks, 5 days ago
Hi,

> > Hi,
> >
> >> Hi Sean,
> >>
> >> On Thu, Sep 11, 2025 at 11:33:15AM -0400, Sean Anderson wrote:
> >> > coresight_panic_cb is called with interrupts disabled during panics.
> >> > However, bus_for_each_dev calls bus_to_subsys which takes
> >> > bus_kset->list_lock without disabling IRQs. This will cause a deadlock
> >> > if a panic occurs while one of the other coresight functions that uses
> >> > bus_for_each_dev is running.
> >>
> >> The description is a bit misleading. Even when IRQs are disabled, if an
> >> exception happens, a CPU can still be trapped into handling a kernel panic.
> >>
> >> > Maintain a separate list of coresight devices to access during a panic.
> >>
> >> Rather than maintaining a separate list and introducing a new spinlock,
> >> I wonder whether we can simply register panic notifiers in the TMC ETR
> >> and ETF drivers (see tmc_panic_sync_etr() and tmc_panic_sync_etf()).
> >>
> >> If there is no dependency between CoreSight modules in the panic sync flow,
> >> it is not necessary to maintain a list (and lock) for these modules.
>
> Yeah, I was thinking about this as I was preparing v2 of this patch.
>
> > +1 for this.
> > Also, using a spinlock in the panic_cb doesn't work on PREEMPT_RT.
>
> What do you mean by this? I am using lockdep and it did not warn about this,
> so I assume that on PREEMPT_RT IRQs remain enabled in this path.

Hmm, I don't think so.
If you look at panic(), it explicitly disables IRQs and preemption before
calling atomic_notifier_call_chain(&panic_notifier_list, 0, buf);

Also, atomic_notifier_call_chain() is an RCU read-side critical section.

As you know, a spinlock becomes a sleepable lock on PREEMPT_RT, so
this is a problem.

Lockdep doesn't report this problem because it was disabled by
debug_locks_off() before the panic notifier chain was called.

Thanks.

--
Sincerely,
Yeoreum Yun
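
The reasoning in this message can be restated as a toy userspace model.
It is a sketch under stated assumptions: struct ctx_model and the model_*
helpers are hypothetical, and the ordering in model_enter_panic() simply
follows the description above (IRQs and preemption disabled, lock
debugging switched off, then the notifier chain runs); it has not been
checked here against a particular kernel version. The point it makes is
that a lock that may sleep is invalid in that context, and that a checker
which has already been disabled cannot report it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the execution context, for illustration. */
struct ctx_model {
	bool irqs_disabled;
	bool preempt_disabled;
	bool lock_debugging_on;	/* models lockdep still being active */
};

/* State changes the message attributes to panic() before the notifiers. */
static void model_enter_panic(struct ctx_model *c)
{
	assert(c);
	c->irqs_disabled = true;	/* IRQs explicitly disabled */
	c->preempt_disabled = true;	/* preemption disabled */
	c->lock_debugging_on = false;	/* debug_locks_off() */
}

/*
 * Taking a lock that may sleep is invalid once IRQs or preemption are
 * disabled. On PREEMPT_RT a spinlock_t may sleep; a raw_spinlock_t
 * keeps spinning and does not.
 */
static bool model_sleeping_lock_is_bug(const struct ctx_model *c,
				       bool lock_may_sleep)
{
	return lock_may_sleep && (c->irqs_disabled || c->preempt_disabled);
}

/* A disabled checker stays silent even when the bug is present. */
static bool model_bug_is_reported(const struct ctx_model *c,
				  bool lock_may_sleep)
{
	return c->lock_debugging_on &&
	       model_sleeping_lock_is_bug(c, lock_may_sleep);
}
```

With lock_may_sleep set to true (spinlock_t on PREEMPT_RT), the model
flags a bug after model_enter_panic() but reports nothing, which matches
the lack of a lockdep warning. With lock_may_sleep set to false (for
example a raw_spinlock_t, like cscfg_csdev_lock in struct
coresight_device), no bug is flagged.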
Re: [PATCH v2] coresight: Fix possible deadlock in coresight_panic_cb
Posted by Sean Anderson 2 weeks, 3 days ago
On 9/13/25 00:30, Yeoreum Yun wrote:
> Hi,
> 
>> > Hi,
>> >
>> >> Hi Sean,
>> >>
>> >> On Thu, Sep 11, 2025 at 11:33:15AM -0400, Sean Anderson wrote:
>> >> > coresight_panic_cb is called with interrupts disabled during panics.
>> >> > However, bus_for_each_dev calls bus_to_subsys which takes
>> >> > bus_kset->list_lock without disabling IRQs. This will cause a deadlock
>> >> > if a panic occurs while one of the other coresight functions that uses
>> >> > bus_for_each_dev is running.
>> >>
>> >> The description is a bit misleading. Even when IRQs are disabled, if an
>> >> exception happens, a CPU can still be trapped into handling a kernel panic.
>> >>
>> >> > Maintain a separate list of coresight devices to access during a panic.
>> >>
>> >> Rather than maintaining a separate list and introducing a new spinlock,
>> >> I wonder whether we can simply register panic notifiers in the TMC ETR
>> >> and ETF drivers (see tmc_panic_sync_etr() and tmc_panic_sync_etf()).
>> >>
>> >> If there is no dependency between CoreSight modules in the panic sync flow,
>> >> it is not necessary to maintain a list (and lock) for these modules.
>>
>> Yeah, I was thinking about this as I was preparing v2 of this patch.
>>
>> > +1 for this.
>> > Also, using a spinlock in the panic_cb doesn't work on PREEMPT_RT.
>>
>> What do you mean by this? I am using lockdep and it did not warn about this,
>> so I assume that on PREEMPT_RT IRQs remain enabled in this path.
> 
> Hmm, I don't think so.
> If you look at panic(), it explicitly disables IRQs and preemption before
> calling atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
>
> Also, atomic_notifier_call_chain() is an RCU read-side critical section.
>
> As you know, a spinlock becomes a sleepable lock on PREEMPT_RT, so
> this is a problem.
>
> Lockdep doesn't report this problem because it was disabled by
> debug_locks_off() before the panic notifier chain was called.

Ah, that makes sense.

--Sean