drivers/base/core.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
Modify device_shutdown() so that supplier devices do not wait for
consumer devices to be shut down first when the devlink is sync state
only, since the consumer is not dependent on the supplier in this case.
Without this change, a circular dependency could hang the system.
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
---
drivers/base/core.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index b69b82da8837..76513e360496 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -4898,8 +4898,16 @@ void device_shutdown(void)
idx = device_links_read_lock();
list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
- device_links_read_lock_held())
+ device_links_read_lock_held()) {
+ /*
+ * sync_state_only suppliers don't need to wait,
+ * aren't reordered on devices_kset, so making them
+ * wait could result in a hang
+ */
+ if (device_link_flag_is_sync_state_only(link->flags))
+ continue;
link->supplier->p->shutdown_after = cookie;
+ }
device_links_read_unlock(idx);
put_device(dev);
--
2.39.3
On Tue, Sep 17, 2024 at 03:15:17PM -0500, Stuart Hayes wrote: > Modify device_shutdown() so that supplier devices do not wait for > consumer devices to be shut down first when the devlink is sync state > only, since the consumer is not dependent on the supplier in this case. > > Without this change, a circular dependency could hang the system. > > Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com> What commit id does this fix? Should it go to stable? And what driver is causing this problem, is this a regression or for something new that just got added to the tree? thanks, greg k-h
On 9/17/2024 3:42 PM, Greg Kroah-Hartman wrote: > On Tue, Sep 17, 2024 at 03:15:17PM -0500, Stuart Hayes wrote: >> Modify device_shutdown() so that supplier devices do not wait for >> consumer devices to be shut down first when the devlink is sync state >> only, since the consumer is not dependent on the supplier in this case. >> >> Without this change, a circular dependency could hang the system. >> >> Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com> > > What commit id does this fix? Should it go to stable? > > And what driver is causing this problem, is this a regression or for > something new that just got added to the tree? > > thanks, > > greg k-h This fixes commit 8064952c65045f05ee2671fe437770e50c151776, in driver-core-next & linux-next... it's problem with code that was just added to the tree (in drivers/base/core.c). It is not in stable. Apologies, I should have mentioned that from the start. The issue was found using qemu... a pl061 device (supplier) and gpio-keys device (consumer), from a qemu-generated device tree with the aarch64 architecture. I don't know why the devlink is in the sync_state_only state in this case. I didn't dig into that, because I figured the shutdown code shouldn't hang regardless.
On Tue, Sep 17, 2024 at 07:20:41PM -0500, stuart hayes wrote: > > > On 9/17/2024 3:42 PM, Greg Kroah-Hartman wrote: > > On Tue, Sep 17, 2024 at 03:15:17PM -0500, Stuart Hayes wrote: > > > Modify device_shutdown() so that supplier devices do not wait for > > > consumer devices to be shut down first when the devlink is sync state > > > only, since the consumer is not dependent on the supplier in this case. > > > > > > Without this change, a circular dependency could hang the system. > > > > > > Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com> > > > > What commit id does this fix? Should it go to stable? > > > > And what driver is causing this problem, is this a regression or for > > something new that just got added to the tree? > > > > thanks, > > > > greg k-h > > This fixes commit 8064952c65045f05ee2671fe437770e50c151776, in > driver-core-next & linux-next... it's problem with code that was just > added to the tree (in drivers/base/core.c). It is not in stable. Ah, that wasn't obvious, sorry. > Apologies, I should have mentioned that from the start. Can you resend this with a "Fixes:" tag in it so I can just take it that way and not have to edit it by hand? thanks, greg k-h
On Wed, 2024-09-18 at 08:12 +0200, Greg Kroah-Hartman wrote: > On Tue, Sep 17, 2024 at 07:20:41PM -0500, stuart hayes wrote: > > > > > > On 9/17/2024 3:42 PM, Greg Kroah-Hartman wrote: > > > On Tue, Sep 17, 2024 at 03:15:17PM -0500, Stuart Hayes wrote: > > > > Modify device_shutdown() so that supplier devices do not wait > > > > for > > > > consumer devices to be shut down first when the devlink is sync > > > > state > > > > only, since the consumer is not dependent on the supplier in > > > > this case. > > > > > > > > Without this change, a circular dependency could hang the > > > > system. > > > > > > > > Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com> > > > > > > What commit id does this fix? Should it go to stable? > > > > > > And what driver is causing this problem, is this a regression or > > > for > > > something new that just got added to the tree? > > > > > > thanks, > > > > > > greg k-h > > > > This fixes commit 8064952c65045f05ee2671fe437770e50c151776, in > > driver-core-next & linux-next... it's problem with code that was > > just > > added to the tree (in drivers/base/core.c). It is not in stable. > > Ah, that wasn't obvious, sorry. > > > Apologies, I should have mentioned that from the start. > > Can you resend this with a "Fixes:" tag in it so I can just take it > that > way and not have to edit it by hand? > > thanks, > > greg k-h > FYI This patch plus the rest of the original Bundle has been tested at Red Hat on a system with 24 nvme devices. Improvement was almost 8 times faster to shutdown. Tested-by: Laurence Oberman <loberman@redhat.com>
© 2016 - 2024 Red Hat, Inc.