[PATCH] devcoredump : Serialize devcd_del work

Mukesh Ojha posted 1 patch 4 years ago
There is a newer version of this series
Posted by Mukesh Ojha 4 years ago
In the following scenario (diagram below), one thread X running dev_coredumpm() adds a devcd
device to the framework, which sends a uevent notification to userspace,
while another thread Y reads this uevent and calls devcd_data_write(),
which eventually tries to delete a timer that has not been initialized or queued yet.

So, debug objects report a warning; in the meantime the timer is initialized
and queued from X's path, while from Y's path it gets reinitialized with
timer->entry.pprev = NULL, and try_to_grab_pending() gets stuck.

To fix this, introduce a mutex to serialize the behaviour.

 	cpu0(X)			                      cpu1(Y)

    dev_coredumpm() uevent sent to userspace
    device_add()  =========================> userspace process Y reads the uevents
                                             writes to devcd fd which
                                             results into writes to

                                            devcd_data_write()
                                              mod_delayed_work()
                                                try_to_grab_pending()
                                                  del_timer()
                                                    debug_assert_init()
   INIT_DELAYED_WORK
   schedule_delayed_work
                                                     debug_object_fixup()
                                                      timer_fixup_assert_init()
                                                       timer_setup()
                                                         do_init_timer()   ==> reinitialized the
                                                                                 timer to
                                                                                 timer->entry.pprev=NULL

                                                  timer_pending()
                                                   !hlist_unhashed_lockless(&timer->entry)
                                                     !h->pprev  ==> del_timer checks
                                                                  and finds it to be NULL
                                                                  try_to_grab_pending() gets stuck.

Link: https://lore.kernel.org/lkml/2e1f81e2-428c-f11f-ce92-eb11048cb271@quicinc.com/
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
---
 drivers/base/devcoredump.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
index f4d794d..316f566 100644
--- a/drivers/base/devcoredump.c
+++ b/drivers/base/devcoredump.c
@@ -25,6 +25,7 @@ struct devcd_entry {
 	struct device devcd_dev;
 	void *data;
 	size_t datalen;
+	struct mutex mutex;
 	struct module *owner;
 	ssize_t (*read)(char *buffer, loff_t offset, size_t count,
 			void *data, size_t datalen);
@@ -84,7 +85,9 @@ static ssize_t devcd_data_write(struct file *filp, struct kobject *kobj,
 	struct device *dev = kobj_to_dev(kobj);
 	struct devcd_entry *devcd = dev_to_devcd(dev);
 
+	mutex_lock(&devcd->mutex);
 	mod_delayed_work(system_wq, &devcd->del_wk, 0);
+	mutex_unlock(&devcd->mutex);
 
 	return count;
 }
@@ -112,7 +115,9 @@ static int devcd_free(struct device *dev, void *data)
 {
 	struct devcd_entry *devcd = dev_to_devcd(dev);
 
+	mutex_lock(&devcd->mutex);
 	flush_delayed_work(&devcd->del_wk);
+	mutex_unlock(&devcd->mutex);
 	return 0;
 }
 
@@ -278,13 +283,14 @@ void dev_coredumpm(struct device *dev, struct module *owner,
 	devcd->read = read;
 	devcd->free = free;
 	devcd->failing_dev = get_device(dev);
-
+	mutex_init(&devcd->mutex);
 	device_initialize(&devcd->devcd_dev);
 
 	dev_set_name(&devcd->devcd_dev, "devcd%d",
 		     atomic_inc_return(&devcd_count));
 	devcd->devcd_dev.class = &devcd_class;
 
+	mutex_lock(&devcd->mutex);
 	if (device_add(&devcd->devcd_dev))
 		goto put_device;
 
@@ -301,10 +307,11 @@ void dev_coredumpm(struct device *dev, struct module *owner,
 
 	INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
 	schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);
-
+	mutex_unlock(&devcd->mutex);
 	return;
  put_device:
 	put_device(&devcd->devcd_dev);
+	mutex_unlock(&devcd->mutex);
  put_module:
 	module_put(owner);
  free:
-- 
2.7.4

Re: [PATCH] devcoredump : Serialize devcd_del work
Posted by Johannes Berg 4 years ago
On Tue, 2022-04-19 at 15:57 +0530, Mukesh Ojha wrote:
> In following scenario(diagram), when one thread X running dev_coredumpm() adds devcd
> device to the framework which sends uevent notification to userspace
> and another thread Y reads this uevent and call to devcd_data_write()
> which eventually try to delete the queued timer that is not initialized/queued yet.
> 
> So, debug object reports some warning and in the meantime, timer is initialized
> and queued from X path. and from Y path, it gets reinitialized again and
> timer->entry.pprev=NULL and try_to_grab_pending() stucks.
> 
> To fix this, introduce mutex to serialize the behaviour.
> 
>  	cpu0(X)			                      cpu1(Y)
> 
>     dev_coredump() uevent sent to userspace
>     device_add()  =========================> userspace process Y reads the uevents
>                                              writes to devcd fd which
>                                              results into writes to
> 
>                                             devcd_data_write()
>                                               mod_delayed_work()
>                                                 try_to_grab_pending()
>                                                   del_timer()
>                                                     debug_assert_init()
>    INIT_DELAYED_WORK
>    schedule_delayed_work
> 

Wouldn't it be easier to simply schedule this before adding it to sysfs
and sending the uevent?

johannes
Re: [PATCH] devcoredump : Serialize devcd_del work
Posted by Johannes Berg 4 years ago
On Fri, 2022-04-22 at 15:41 +0200, Johannes Berg wrote:
> On Tue, 2022-04-19 at 15:57 +0530, Mukesh Ojha wrote:
> > In following scenario(diagram), when one thread X running dev_coredumpm() adds devcd
> > device to the framework which sends uevent notification to userspace
> > and another thread Y reads this uevent and call to devcd_data_write()
> > which eventually try to delete the queued timer that is not initialized/queued yet.
> > 
> > So, debug object reports some warning and in the meantime, timer is initialized
> > and queued from X path. and from Y path, it gets reinitialized again and
> > timer->entry.pprev=NULL and try_to_grab_pending() stucks.
> > 
> > To fix this, introduce mutex to serialize the behaviour.
> > 
> >  	cpu0(X)			                      cpu1(Y)
> > 
> >     dev_coredump() uevent sent to userspace
> >     device_add()  =========================> userspace process Y reads the uevents
> >                                              writes to devcd fd which
> >                                              results into writes to
> > 
> >                                             devcd_data_write()
> >                                               mod_delayed_work()
> >                                                 try_to_grab_pending()
> >                                                   del_timer()
> >                                                     debug_assert_init()
> >    INIT_DELAYED_WORK
> >    schedule_delayed_work
> > 
> 
> Wouldn't it be easier to simply schedule this before adding it to sysfs
> and sending the uevent?
> 

Hm. I think that would solve this problem, but not all of the problems
here ...

Even with your change, I believe it's still racy wrt. disabled_store(),
since that flushes the work but devcd_data_write() remains reachable
(and might in fact be waiting for the mutex after your change), so I
think we need an additional flag somewhere (in addition to the mutex) to
serialize all of these things against each other.

johannes
Re: [PATCH] devcoredump : Serialize devcd_del work
Posted by Mukesh Ojha 4 years ago
On Fri, Apr 22, 2022 at 03:53:35PM +0200, Johannes Berg wrote:
> On Fri, 2022-04-22 at 15:41 +0200, Johannes Berg wrote:
> > On Tue, 2022-04-19 at 15:57 +0530, Mukesh Ojha wrote:
> > > In following scenario(diagram), when one thread X running dev_coredumpm() adds devcd
> > > device to the framework which sends uevent notification to userspace
> > > and another thread Y reads this uevent and call to devcd_data_write()
> > > which eventually try to delete the queued timer that is not initialized/queued yet.
> > > 
> > > So, debug object reports some warning and in the meantime, timer is initialized
> > > and queued from X path. and from Y path, it gets reinitialized again and
> > > timer->entry.pprev=NULL and try_to_grab_pending() stucks.
> > > 
> > > To fix this, introduce mutex to serialize the behaviour.
> > > 
> > >  	cpu0(X)			                      cpu1(Y)
> > > 
> > >     dev_coredump() uevent sent to userspace
> > >     device_add()  =========================> userspace process Y reads the uevents
> > >                                              writes to devcd fd which
> > >                                              results into writes to
> > > 
> > >                                             devcd_data_write()
> > >                                               mod_delayed_work()
> > >                                                 try_to_grab_pending()
> > >                                                   del_timer()
> > >                                                     debug_assert_init()
> > >    INIT_DELAYED_WORK
> > >    schedule_delayed_work
> > > 
> > 
> > Wouldn't it be easier to simply schedule this before adding it to sysfs
> > and sending the uevent?
> > 
> 
> Hm. I think that would solve this problem, but not all of the problems
> here ...
> 
> Even with your change, I believe it's still racy wrt. disabled_store(),
> since that flushes the work but devcd_data_write() remains reachable
> (and might in fact be waiting for the mutex after your change), so I
> think we need an additional flag somewhere (in addition to the mutex) to
> serialize all of these things against each other.

Can we do something like this in v2?

https://lore.kernel.org/lkml/1650892193-12888-1-git-send-email-quic_mojha@quicinc.com/

Thanks,
-Mukesh

> 
> johannes
Re: [PATCH] devcoredump : Serialize devcd_del work
Posted by Thomas Gleixner 4 years ago
On Fri, Apr 22 2022 at 15:53, Johannes Berg wrote:
> On Fri, 2022-04-22 at 15:41 +0200, Johannes Berg wrote:
>> On Tue, 2022-04-19 at 15:57 +0530, Mukesh Ojha wrote:
>> >    INIT_DELAYED_WORK
>> >    schedule_delayed_work
>> > 
>> 
>> Wouldn't it be easier to simply schedule this before adding it to sysfs
>> and sending the uevent?

Only if it's guaranteed that the timer will not expire before
device_add() succeeds. It's likely, but there is virt....

> Hm. I think that would solve this problem, but not all of the problems
> here ...
>
> Even with your change, I believe it's still racy wrt. disabled_store(),
> since that flushes the work but devcd_data_write() remains reachable
> (and might in fact be waiting for the mutex after your change), so I
> think we need an additional flag somewhere (in addition to the mutex) to
> serialize all of these things against each other.

Plus there is disabled_store() which iterates over the devices and
reschedules the work. How is that supposed to be protected against a
concurrent devcd_del()?

Why needs disabled_store() to do that at all?

Thanks,

        tglx
Re: [PATCH] devcoredump : Serialize devcd_del work
Posted by Mukesh Ojha 4 years ago
Hi All,

Request you all the review comments on the fix of the described problem?

-Mukesh


On 4/19/2022 3:57 PM, Mukesh Ojha wrote:
> [...]
Re: [PATCH] devcoredump : Serialize devcd_del work
Posted by Greg KH 4 years ago
A: http://en.wikipedia.org/wiki/Top_post
Q: Were do I find info about this thing called top-posting?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Fri, Apr 22, 2022 at 03:33:08PM +0530, Mukesh Ojha wrote:
> 
> Hi All,
> 
> Request you all the review comments on the fix of the described problem?
> 
> -Mukesh
> 
> 
> On 4/19/2022 3:57 PM, Mukesh Ojha wrote:

You sent this 3 days ago, please be realistic.

If you are concerned about the delay in patch reviews, please help us
out and review patches sent to the list.  Otherwise, don't start to
worry until at least after 2 weeks.

thanks,

greg k-h
Re: [PATCH] devcoredump : Serialize devcd_del work
Posted by Greg KH 4 years ago
On Fri, Apr 22, 2022 at 02:13:47PM +0200, Greg KH wrote:
> 
> A: http://en.wikipedia.org/wiki/Top_post
> Q: Were do I find info about this thing called top-posting?
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?
> 
> A: No.
> Q: Should I include quotations after my reply?
> 
> http://daringfireball.net/2007/07/on_top
> 
> On Fri, Apr 22, 2022 at 03:33:08PM +0530, Mukesh Ojha wrote:
> > 
> > Hi All,
> > 
> > Request you all the review comments on the fix of the described problem?
> > 
> > -Mukesh
> > 
> > 
> > On 4/19/2022 3:57 PM, Mukesh Ojha wrote:
> 
> You sent this 3 days ago, please be realistic.
> 
> If you are concerned about the delay in patch reviews, please help us
> out and review patches sent to the list.  Otherwise, don't start to
> worry until at least after 2 weeks.

You also didn't even cc: me, so how could I see this at all?

So, nothing to review in my inbox, nice!

{sigh}