cxl/region: Fix a race bug in delete_region_store

[PATCH] cxl/region: Fix a race bug in delete_region_store

Posted by Sungwoo Kim 1 month ago

A race exists when two concurrent sysfs writes to delete_region specify
the same region name. Both calls succeed in cxl_find_region_by_name()
(which only does device_find_child_by_name and takes a reference), and
both then proceed to call devm_release_action(). The first call atomically
removes and releases the devres entry successfully. The second call finds
no matching entry, causing devres_release() to return -ENOENT, which trips
the WARN_ON.

Fix this by replacing devm_release_action() with devm_remove_action_nowarn()
followed by a manual call to unregister_region(). devm_remove_action_nowarn()
removes the devres tracking entry and returns an error code.

------------[ cut here ]------------
WARNING: drivers/base/devres.c:824 at devm_release_action drivers/base/devres.c:824 [inline], CPU#0: syz.1.12224/47589
WARNING: drivers/base/devres.c:824 at devm_release_action+0x2b2/0x360 drivers/base/devres.c:817, CPU#0: syz.1.12224/47589
Modules linked in:
CPU: 0 UID: 0 PID: 47589 Comm: syz.1.12224 Not tainted 6.19.0-g15a37b05f387 #10 PREEMPT(lazy) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
RIP: 0010:devm_release_action drivers/base/devres.c:824 [inline]
RIP: 0010:devm_release_action+0x2b2/0x360 drivers/base/devres.c:817
Code: 41 5c 41 5d 41 5e 41 5f e9 86 1a b3 fd e8 86 71 33 fe e8 81 71 33 fe 48 8b 74 24 08 4c 89 f7 e8 04 34 8a 01 e8 6f 71 33 fe 90 <0f> 0b 90 eb c6 e8 64 71 33 fe 90 0f 0b e8 2c f1 71 fe e9 f4 fd ff
RSP: 0018:ffff88813e5f7bf8 EFLAGS: 00010282
RAX: 0000000000000125 RBX: dffffc0000000000 RCX: ffffffff835d26a1
RDX: 0000000000080000 RSI: ffffc9000685b000 RDI: ffffffff8e875adf
RBP: ffff888101059010 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff888101059428
R13: ffff888100e05c30 R14: ffff8881010593e8 R15: ffff888101059428
FS:  00007fae6afb46c0(0000) GS:ffff8882a526d000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fae6a12b8c0 CR3: 0000000127f1b005 CR4: 0000000000770ef0
PKRU: 80000000
Call Trace:
 <TASK>
 delete_region_store+0x106/0x1d0 drivers/cxl/core/region.c:2753
 dev_attr_store+0x58/0x80 drivers/base/core.c:2437
 sysfs_kf_write+0xf2/0x150 fs/sysfs/file.c:142
 kernfs_fop_write_iter+0x3d6/0x5d0 fs/kernfs/file.c:352
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x68e/0x1050 fs/read_write.c:688
 ksys_write+0x12a/0x250 fs/read_write.c:740
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfc/0x670 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fae6a19c669
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fae6afb4028 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007fae6a415fa0 RCX: 00007fae6a19c669
RDX: 0000000000000008 RSI: 0000200000000580 RDI: 0000000000000005
RBP: 00007fae6a232c71 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fae6a416038 R14: 00007fae6a415fa0 R15: 00007ffc10522c58
 </TASK>
irq event stamp: 885
hardirqs last  enabled at (893): [<ffffffff8b104c4b>] __up_console_sem+0x7b/0x90 kernel/printk/printk.c:347
hardirqs last disabled at (902): [<ffffffff8b104c30>] __up_console_sem+0x60/0x90 kernel/printk/printk.c:345
softirqs last  enabled at (822): [<ffffffff8af73b93>] __do_softirq kernel/softirq.c:656 [inline]
softirqs last  enabled at (822): [<ffffffff8af73b93>] invoke_softirq kernel/softirq.c:496 [inline]
softirqs last  enabled at (822): [<ffffffff8af73b93>] __irq_exit_rcu+0xb3/0xe0 kernel/softirq.c:723
softirqs last disabled at (811): [<ffffffff8af73b93>] __do_softirq kernel/softirq.c:656 [inline]
softirqs last disabled at (811): [<ffffffff8af73b93>] invoke_softirq kernel/softirq.c:496 [inline]
softirqs last disabled at (811): [<ffffffff8af73b93>] __irq_exit_rcu+0xb3/0xe0 kernel/softirq.c:723
---[ end trace 0000000000000000 ]---

Fixes: 779dd20cfb56 ("cxl/region: Add region creation support")
Signed-off-by: Sungwoo Kim <iam@sung-woo.kim>
---
 drivers/cxl/core/region.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 08fa3deef70ab..7ade9aa2aeecc 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2745,12 +2745,19 @@ static ssize_t delete_region_store(struct device *dev,
 	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev);
 	struct cxl_port *port = to_cxl_port(dev->parent);
 	struct cxl_region *cxlr;
+	int err;
 
 	cxlr = cxl_find_region_by_name(cxlrd, buf);
 	if (IS_ERR(cxlr))
 		return PTR_ERR(cxlr);
 
-	devm_release_action(port->uport_dev, unregister_region, cxlr);
+	err = devm_remove_action_nowarn(port->uport_dev, unregister_region,
+				       cxlr);
+	if (err) {
+		put_device(&cxlr->dev);
+		return err;
+	}
+	unregister_region(cxlr);
 	put_device(&cxlr->dev);
 
 	return len;
-- 
2.47.3

Re: [PATCH] cxl/region: Fix a race bug in delete_region_store

Posted by Dan Williams 4 weeks, 1 day ago

Sungwoo Kim wrote:
> A race exists when two concurrent sysfs writes to delete_region specify
> the same region name. Both calls succeed in cxl_find_region_by_name()
> (which only does device_find_child_by_name and takes a reference), and
> both then proceed to call devm_release_action(). The first call atomically
> removes and releases the devres entry successfully. The second call finds
> no matching entry, causing devres_release() to return -ENOENT, which trips
> the WARN_ON.

Ugh, yes, good find.

I think I understand the pathology of how this happened. Touching devres
actions needs to happen under device lock or with the knowledge that the
device hosting the action is currently bound to a driver. The
port->uport_dev is known to be bound in this path because decoder
devices are only registered in the bound lifetime of ->uport_dev.

The device_lock() is not needed to make sure the release action can be
called, however, *some* synchronization is needed for racing requesters
as you found.

> Fix this by replacing devm_release_action() with devm_remove_action_nowarn()
> followed by a manual call to unregister_region(). devm_remove_action_nowarn()
> removes the devres tracking entry and returns an error code.

No, that is not an acceptable or comprehensive fix. A subsystem should
never try to double unregister a device. Keep in mind the decoder sysfs
stays alive while the devres_release_all() actions are running for
port->uport_dev.  As a result, yes, Davidlohr is right. You can
theoretically observe the unregister_region() release action firing
while delete_region_store() is running.

The only comprehensive fix I currently see to that problem is to indeed
get it back synchronized under the device lock. This prevents multiple
requesters from colliding, and from devres_release_all() deleting an
action that delete_region_store() already committed to handling.

This looks something like schedule the devres action to be released in
workqueue under guard(device)(&port->udev).

It is ugly, but it may be the case that this wants to synchronously
delete the region, and asynchronously cleanup the release action.

I.e. unregister_region() grows a new:

	if (!test_and_set_bit(CXL_REGION_F_DELETE, ...)

...flag to enforce only one path successfully deletes.
delete_region_store() calls unregister_region() directly. Then the
workqueue's job is to trigger the release action under the lock only if
port->uport_dev is still driver-attached by the time the workqueue runs.

In the meantime the workqueue needs to hold a reference on the region
device until it can observe the state of CXL_REGION_F_DELETE under the
device lock.

Otherwise I suspect if you put a device_lock() in delete_region_store()
it will deadlock.

> ------------[ cut here ]------------
> WARNING: drivers/base/devres.c:824 at devm_release_action drivers/base/devres.c:824 [inline], CPU#0: syz.1.12224/47589
> WARNING: drivers/base/devres.c:824 at devm_release_action+0x2b2/0x360 drivers/base/devres.c:817, CPU#0: syz.1.12224/47589

For the next posting, these 2 lines are sufficient, no need for all the
extra detail of a full backtrace.

Re: [PATCH] cxl/region: Fix a race bug in delete_region_store

Posted by Sungwoo Kim 4 weeks, 1 day ago

On Tue, Mar 10, 2026 at 6:54 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> Sungwoo Kim wrote:
> > Fix this by replacing devm_release_action() with devm_remove_action_nowarn()
> > followed by a manual call to unregister_region(). devm_remove_action_nowarn()
> > removes the devres tracking entry and returns an error code.
>
> No, that is not an acceptable or comprehensive fix. A subsystem should
> never try to double unregister a device. Keep in mind the decoder sysfs
> stays alive while the devres_release_all() actions are running for
> port->uport_dev.  As a result, yes, Davidlohr is right. You can
> theoretically observe the unregister_region() release action firing
> while delete_region_store() is running.

I appreciate your review and suggestion of a new patch. It seems more
complicated than I expected!

I'm new to CXL and device drivers, so I would like to confirm that my
understanding is correct.
- CXL tolpology could be: [upstream] host -> switch(s) -> endpoints(s)
[downstream].
- The host can configure a unique address range (= region) to access
each device.
- The bug happened because releasing the same regions can race.
- Although the prior patch handled this, it's insufficient because it
can also race with releasing the device (what Davidlohr had reported).

>
> The only comprehensive fix I currently see to that problem is to indeed
> get it back synchronized under the device lock. This prevents multiple
> requesters from colliding, and from devres_release_all() deleting an
> action that delete_region_store() already committed to handling.
>
> This looks something like schedule the devres action to be released in
> workqueue under guard(device)(&port->udev).

I assume you meant guard(device)(port->uport_dev).

>
> It is ugly, but it may be the case that this wants to synchronously
> delete the region, and asynchronously cleanup the release action.
>
> I.e. unregister_region() grows a new:
>
>         if (!test_and_set_bit(CXL_REGION_F_DELETE, ...)
>
> ...flag to enforce only one path successfully deletes.
> delete_region_store() calls unregister_region() directly. Then the

To fix this, you suggested:
- Add a flag to make unregister_region() idempotent.
- Make delete_region_store() call unregister_region() directly.

Now unregister_region() is race-free. But we need to release an action.

> workqueue's job is to trigger the release action under the lock only if
> port->uport_dev is still driver-attached by the time the workqueue runs.
>

What I'm missing is this part.
How could asynchronous cleanup solve the race? In the worst case, the
workqueue could start immediately after we schedule it. In this case,
it's the same as the synchronous execution.
Or do we have other reasons to make it asynchronous?

> In the meantime the workqueue needs to hold a reference on the region
> device until it can observe the state of CXL_REGION_F_DELETE under the
> device lock.

If it's the same for synchronous execution, we might execute this
directly in unregister_region(), holding
guard(device)(port->uport_dev).

>
> Otherwise I suspect if you put a device_lock() in delete_region_store()
> it will deadlock.
>
> > ------------[ cut here ]------------
> > WARNING: drivers/base/devres.c:824 at devm_release_action drivers/base/devres.c:824 [inline], CPU#0: syz.1.12224/47589
> > WARNING: drivers/base/devres.c:824 at devm_release_action+0x2b2/0x360 drivers/base/devres.c:817, CPU#0: syz.1.12224/47589
>
> For the next posting, these 2 lines are sufficient, no need for all the
> extra detail of a full backtrace.
>

Copy that.

Thanks again for your detailed reviews.
Sungwoo.

Re: [PATCH] cxl/region: Fix a race bug in delete_region_store

Posted by Dan Williams 1 week, 1 day ago

Sungwoo Kim wrote:
> On Tue, Mar 10, 2026 at 6:54 PM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > Sungwoo Kim wrote:
> > > Fix this by replacing devm_release_action() with devm_remove_action_nowarn()
> > > followed by a manual call to unregister_region(). devm_remove_action_nowarn()
> > > removes the devres tracking entry and returns an error code.
> >
> > No, that is not an acceptable or comprehensive fix. A subsystem should
> > never try to double unregister a device. Keep in mind the decoder sysfs
> > stays alive while the devres_release_all() actions are running for
> > port->uport_dev.  As a result, yes, Davidlohr is right. You can
> > theoretically observe the unregister_region() release action firing
> > while delete_region_store() is running.
> 
> I appreciate your review and suggestion of a new patch. It seems more
> complicated than I expected!

Yes, the CXL subsystem is unfortunately... complicated.

> I'm new to CXL and device drivers, so I would like to confirm that my
> understanding is correct.
> - CXL tolpology could be: [upstream] host -> switch(s) -> endpoints(s)
> [downstream].
> - The host can configure a unique address range (= region) to access
> each device.
> - The bug happened because releasing the same regions can race.
> - Although the prior patch handled this, it's insufficient because it
> can also race with releasing the device (what Davidlohr had reported).

The patch was technically correct but it relies on a design that
requires depending on a double free semantic. Apparently Rust needs it
to enforce a higher level semantic, but in C we need to handle that
semantic directly.

> > The only comprehensive fix I currently see to that problem is to indeed
> > get it back synchronized under the device lock. This prevents multiple
> > requesters from colliding, and from devres_release_all() deleting an
> > action that delete_region_store() already committed to handling.
> >
> > This looks something like schedule the devres action to be released in
> > workqueue under guard(device)(&port->udev).
> 
> I assume you meant guard(device)(port->uport_dev).

Correct. When I say "something like" it almost always means "be careful
what I am about to say is full of bugs" so, good catch.

> > It is ugly, but it may be the case that this wants to synchronously
> > delete the region, and asynchronously cleanup the release action.
> >
> > I.e. unregister_region() grows a new:
> >
> >         if (!test_and_set_bit(CXL_REGION_F_DELETE, ...)
> >
> > ...flag to enforce only one path successfully deletes.
> > delete_region_store() calls unregister_region() directly. Then the
> 
> To fix this, you suggested:
> - Add a flag to make unregister_region() idempotent.
> - Make delete_region_store() call unregister_region() directly.
> 
> Now unregister_region() is race-free. But we need to release an action.
> 
> > workqueue's job is to trigger the release action under the lock only if
> > port->uport_dev is still driver-attached by the time the workqueue runs.
> >
> 
> What I'm missing is this part.
> How could asynchronous cleanup solve the race? In the worst case, the
> workqueue could start immediately after we schedule it. In this case,
> it's the same as the synchronous execution.

Asynchronous cleanup gets us into a context where we are safe to acquire
locks. delete_region_store can not hold guard(device)(port->uport_dev).
You can try it to see the full lockdep splat. It is setup by fact that
unregistering the root decoders is performed under that lock.
Unregistering root decoder flushes userspace out of sysfs attribute
handlers. If those are holding the lock themselves it is an ABBA
deadlock.

I circled back to this because the CXL accelerator patches also have a
reason to autocleanup regions so this would be another source of
cxl_region deletion to rendezvous.

Re: [PATCH] cxl/region: Fix a race bug in delete_region_store

Posted by Sungwoo Kim 6 days, 16 hours ago

On Wed, Apr 1, 2026 at 12:51 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> Sungwoo Kim wrote:
> > On Tue, Mar 10, 2026 at 6:54 PM Dan Williams <dan.j.williams@intel.com> wrote:
> > >
> > > Sungwoo Kim wrote:
> > > > Fix this by replacing devm_release_action() with devm_remove_action_nowarn()
> > > > followed by a manual call to unregister_region(). devm_remove_action_nowarn()
> > > > removes the devres tracking entry and returns an error code.
> > >
> > > No, that is not an acceptable or comprehensive fix. A subsystem should
> > > never try to double unregister a device. Keep in mind the decoder sysfs
> > > stays alive while the devres_release_all() actions are running for
> > > port->uport_dev.  As a result, yes, Davidlohr is right. You can
> > > theoretically observe the unregister_region() release action firing
> > > while delete_region_store() is running.
> >
> > I appreciate your review and suggestion of a new patch. It seems more
> > complicated than I expected!
>
> Yes, the CXL subsystem is unfortunately... complicated.
>
> > I'm new to CXL and device drivers, so I would like to confirm that my
> > understanding is correct.
> > - CXL tolpology could be: [upstream] host -> switch(s) -> endpoints(s)
> > [downstream].
> > - The host can configure a unique address range (= region) to access
> > each device.
> > - The bug happened because releasing the same regions can race.
> > - Although the prior patch handled this, it's insufficient because it
> > can also race with releasing the device (what Davidlohr had reported).
>
> The patch was technically correct but it relies on a design that
> requires depending on a double free semantic. Apparently Rust needs it
> to enforce a higher level semantic, but in C we need to handle that
> semantic directly.
>
> > > The only comprehensive fix I currently see to that problem is to indeed
> > > get it back synchronized under the device lock. This prevents multiple
> > > requesters from colliding, and from devres_release_all() deleting an
> > > action that delete_region_store() already committed to handling.
> > >
> > > This looks something like schedule the devres action to be released in
> > > workqueue under guard(device)(&port->udev).
> >
> > I assume you meant guard(device)(port->uport_dev).
>
> Correct. When I say "something like" it almost always means "be careful
> what I am about to say is full of bugs" so, good catch.
>
> > > It is ugly, but it may be the case that this wants to synchronously
> > > delete the region, and asynchronously cleanup the release action.
> > >
> > > I.e. unregister_region() grows a new:
> > >
> > >         if (!test_and_set_bit(CXL_REGION_F_DELETE, ...)
> > >
> > > ...flag to enforce only one path successfully deletes.
> > > delete_region_store() calls unregister_region() directly. Then the
> >
> > To fix this, you suggested:
> > - Add a flag to make unregister_region() idempotent.
> > - Make delete_region_store() call unregister_region() directly.
> >
> > Now unregister_region() is race-free. But we need to release an action.
> >
> > > workqueue's job is to trigger the release action under the lock only if
> > > port->uport_dev is still driver-attached by the time the workqueue runs.
> > >
> >
> > What I'm missing is this part.
> > How could asynchronous cleanup solve the race? In the worst case, the
> > workqueue could start immediately after we schedule it. In this case,
> > it's the same as the synchronous execution.
>
> Asynchronous cleanup gets us into a context where we are safe to acquire
> locks. delete_region_store can not hold guard(device)(port->uport_dev).
> You can try it to see the full lockdep splat. It is setup by fact that
> unregistering the root decoders is performed under that lock.
> Unregistering root decoder flushes userspace out of sysfs attribute
> handlers. If those are holding the lock themselves it is an ABBA
> deadlock.

I really appreciate your detailed comments. I understood that the
async work can safely wait for and attain the lock.
The lock is required to ensure that port->uport_dev is still driver-attached.
The previous patch was working but it was over-tailored for preventing
double-free.
So, using the workqueue, we can perform devm_release_action() without
losing a certain abstraction level (no over-tailoring).

>
> I circled back to this because the CXL accelerator patches also have a
> reason to autocleanup regions so this would be another source of
> cxl_region deletion to rendezvous.

Okay. other CXL types could be another reason for not depending on the
double-free semantics.
I'll add a workqueue for devm_release_action() in V3.

Thank you again,
Sungwoo.

Re: [PATCH] cxl/region: Fix a race bug in delete_region_store

Posted by Davidlohr Bueso 4 weeks, 1 day ago

On Sun, 08 Mar 2026, Sungwoo Kim wrote:

>A race exists when two concurrent sysfs writes to delete_region specify
>the same region name. Both calls succeed in cxl_find_region_by_name()
>(which only does device_find_child_by_name and takes a reference), and
>both then proceed to call devm_release_action(). The first call atomically
>removes and releases the devres entry successfully. The second call finds
>no matching entry, causing devres_release() to return -ENOENT, which trips
>the WARN_ON.

afaict the splat is also triggable via devres_release_all(), ie: unbinding
the host bridge. Basically cxl_find_region_by_name() succeeds because the
region hasn't been device_del()'d yet:

CPU0					CPU1
devres_release_all()
   // take devres_lock
   remove_nodes(devres_head) // mv to local todo
   // drop devres_lock                   delete_region_store()
                                           cxlr = cxl_find_region_by_name()  // success
                                           devm_release_action(unregister_region)
                                             devres_release()
                                               devres_remove()
                                                 // hold devres_lock
                                                 find_dr(devres_head) // does not find it
                                             WARN_ON(-ENOENT)
   release_nodes() // drain todo
     unregister_region(cxlr) // release() cb
       device_del()
       
>Fix this by replacing devm_release_action() with devm_remove_action_nowarn()
>followed by a manual call to unregister_region(). devm_remove_action_nowarn()
>removes the devres tracking entry and returns an error code.

While devm_remove_action_nowarn() has only a single driver user (gpio), using it
here would seem to fit the requirement of independent lifetime management; and
ultimately these races seem benign as unregister_region() is only being called
once.

>------------[ cut here ]------------
>WARNING: drivers/base/devres.c:824 at devm_release_action drivers/base/devres.c:824 [inline], CPU#0: syz.1.12224/47589
>WARNING: drivers/base/devres.c:824 at devm_release_action+0x2b2/0x360 drivers/base/devres.c:817, CPU#0: syz.1.12224/47589

I see you are using syzkaller; I added cxl support as well a while back based
on the usb fuzzying approach, and also triggered this issue (which was in my
to-investigate backlog, so glad you ran into this).

Thanks,
Davidlohr

Re: [PATCH] cxl/region: Fix a race bug in delete_region_store

Posted by Ira Weiny 1 month ago

Sungwoo Kim wrote:
> A race exists when two concurrent sysfs writes to delete_region specify
> the same region name. Both calls succeed in cxl_find_region_by_name()
> (which only does device_find_child_by_name and takes a reference), and
> both then proceed to call devm_release_action(). The first call atomically
> removes and releases the devres entry successfully. The second call finds
> no matching entry, causing devres_release() to return -ENOENT, which trips
> the WARN_ON.
> 
> Fix this by replacing devm_release_action() with devm_remove_action_nowarn()
> followed by a manual call to unregister_region(). devm_remove_action_nowarn()
> removes the devres tracking entry and returns an error code.
> 

[snip]

> 
> Fixes: 779dd20cfb56 ("cxl/region: Add region creation support")
> Signed-off-by: Sungwoo Kim <iam@sung-woo.kim>

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

[snip]

Re: [PATCH] cxl/region: Fix a race bug in delete_region_store

Posted by Jonathan Cameron 1 month ago

On Sun, 8 Mar 2026 14:59:58 -0400
Sungwoo Kim <iam@sung-woo.kim> wrote:

> A race exists when two concurrent sysfs writes to delete_region specify
> the same region name. Both calls succeed in cxl_find_region_by_name()
> (which only does device_find_child_by_name and takes a reference), and
> both then proceed to call devm_release_action(). The first call atomically
> removes and releases the devres entry successfully. The second call finds
> no matching entry, causing devres_release() to return -ENOENT, which trips
> the WARN_ON.
> 
> Fix this by replacing devm_release_action() with devm_remove_action_nowarn()
> followed by a manual call to unregister_region(). devm_remove_action_nowarn()
> removes the devres tracking entry and returns an error code.

Naive question (or just me being lazy).  Why can't we take the
write lock on cxl_rwsem.region?

Jonathan

Re: [PATCH] cxl/region: Fix a race bug in delete_region_store

Posted by Sungwoo Kim 1 month ago

On Mon, Mar 9, 2026 at 8:00 AM Jonathan Cameron
<jonathan.cameron@huawei.com> wrote:
>
> On Sun, 8 Mar 2026 14:59:58 -0400
> Sungwoo Kim <iam@sung-woo.kim> wrote:
>
> > A race exists when two concurrent sysfs writes to delete_region specify
> > the same region name. Both calls succeed in cxl_find_region_by_name()
> > (which only does device_find_child_by_name and takes a reference), and
> > both then proceed to call devm_release_action(). The first call atomically
> > removes and releases the devres entry successfully. The second call finds
> > no matching entry, causing devres_release() to return -ENOENT, which trips
> > the WARN_ON.
> >
> > Fix this by replacing devm_release_action() with devm_remove_action_nowarn()
> > followed by a manual call to unregister_region(). devm_remove_action_nowarn()
> > removes the devres tracking entry and returns an error code.
>
> Naive question (or just me being lazy).  Why can't we take the
> write lock on cxl_rwsem.region?

Thanks for your review. I've just tested your suggestion, but it
caused an ABBA deadlock:

task 1:
create_pmem_region_store
  __device_attach()         ...dev_lock()
    cxl_region_can_probe()  ...lock(cxl_rwsem.region)

task 2:
delete_region_store()       ...lock(cxl_rwsem.region)
  unregister_region()
    device_del()            ...dev_lock()

One way to avoid a deadlock might be to not add an additional lock.

Re: [PATCH] cxl/region: Fix a race bug in delete_region_store

Posted by Jonathan Cameron 1 month ago

On Mon, 9 Mar 2026 13:56:33 -0400
Sungwoo Kim <iam@sung-woo.kim> wrote:

> On Mon, Mar 9, 2026 at 8:00 AM Jonathan Cameron
> <jonathan.cameron@huawei.com> wrote:
> >
> > On Sun, 8 Mar 2026 14:59:58 -0400
> > Sungwoo Kim <iam@sung-woo.kim> wrote:
> >  
> > > A race exists when two concurrent sysfs writes to delete_region specify
> > > the same region name. Both calls succeed in cxl_find_region_by_name()
> > > (which only does device_find_child_by_name and takes a reference), and
> > > both then proceed to call devm_release_action(). The first call atomically
> > > removes and releases the devres entry successfully. The second call finds
> > > no matching entry, causing devres_release() to return -ENOENT, which trips
> > > the WARN_ON.
> > >
> > > Fix this by replacing devm_release_action() with devm_remove_action_nowarn()
> > > followed by a manual call to unregister_region(). devm_remove_action_nowarn()
> > > removes the devres tracking entry and returns an error code.  
> >
> > Naive question (or just me being lazy).  Why can't we take the
> > write lock on cxl_rwsem.region?  
> 
> Thanks for your review. I've just tested your suggestion, but it
> caused an ABBA deadlock:
> 
> task 1:
> create_pmem_region_store
>   __device_attach()         ...dev_lock()
>     cxl_region_can_probe()  ...lock(cxl_rwsem.region)
> 
> task 2:
> delete_region_store()       ...lock(cxl_rwsem.region)
>   unregister_region()
>     device_del()            ...dev_lock()
> 

Thanks for chasing that down. (I was indeed just being lazy!)

Let's wait a few days to get inputs from others on this.

One horrible option would just be to have a single purpose lock to
serialize handling writes to the sysfs file. I don't much like
that solution however!

Thanks,

Jonathan

> One way to avoid a deadlock might be to not add an additional lock.