drivers/base/devcoredump.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
There are cases where depending on the size of the devcoredump and the speed
at which the usermode reads the dump, it can take longer than the current 5 mins
timeout.
This can lead to incomplete dumps as the device is deleted once the timeout expires.
One example is below where it took 6 mins for the devcoredump to be completely read.
04:22:24.668 23916 23994 I HWDeviceDRM::DumpDebugData: Opening /sys/class/devcoredump/devcd6/data
04:28:35.377 23916 23994 W HWDeviceDRM::DumpDebugData: Freeing devcoredump node
Increase the timeout to 10 mins to accommodate system delays and large coredump
sizes.
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
---
drivers/base/devcoredump.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
index f4d794d..6b83ae5 100644
--- a/drivers/base/devcoredump.c
+++ b/drivers/base/devcoredump.c
@@ -18,8 +18,8 @@ static struct class devcd_class;
/* global disable flag, for security purposes */
static bool devcd_disabled;
-/* if data isn't read by userspace after 5 minutes then delete it */
-#define DEVCD_TIMEOUT (HZ * 60 * 5)
+/* if data isn't read by userspace after 10 minutes then delete it */
+#define DEVCD_TIMEOUT (HZ * 60 * 10)
struct devcd_entry {
struct device devcd_dev;
--
2.7.4
On Tue, 2022-02-08 at 11:44 -0800, Abhinav Kumar wrote: > There are cases where depending on the size of the devcoredump and the speed > at which the usermode reads the dump, it can take longer than the current 5 mins > timeout. > > This can lead to incomplete dumps as the device is deleted once the timeout expires. > > One example is below where it took 6 mins for the devcoredump to be completely read. > > 04:22:24.668 23916 23994 I HWDeviceDRM::DumpDebugData: Opening /sys/class/devcoredump/devcd6/data > 04:28:35.377 23916 23994 W HWDeviceDRM::DumpDebugData: Freeing devcoredump node > > Increase the timeout to 10 mins to accommodate system delays and large coredump > sizes. > No real objection, I guess, but can the data actually disappear *while* the sysfs file is open?! Or did it take 5 minutes to open the file? If the former, maybe we should fix that too (or instead)? johannes
Hi Johannes
Thanks for the response.
On 2/8/2022 12:35 PM, Johannes Berg wrote:
> On Tue, 2022-02-08 at 11:44 -0800, Abhinav Kumar wrote:
>> There are cases where depending on the size of the devcoredump and the speed
>> at which the usermode reads the dump, it can take longer than the current 5 mins
>> timeout.
>>
>> This can lead to incomplete dumps as the device is deleted once the timeout expires.
>>
>> One example is below where it took 6 mins for the devcoredump to be completely read.
>>
>> 04:22:24.668 23916 23994 I HWDeviceDRM::DumpDebugData: Opening /sys/class/devcoredump/devcd6/data
>> 04:28:35.377 23916 23994 W HWDeviceDRM::DumpDebugData: Freeing devcoredump node
>>
>> Increase the timeout to 10 mins to accommodate system delays and large coredump
>> sizes.
>>
>
> No real objection, I guess, but can the data actually disappear *while*
> the sysfs file is open?!
>
> Or did it take 5 minutes to open the file?
>
> If the former, maybe we should fix that too (or instead)?
>
> johannes
It opened the file rightaway but could not finish reading.
The device gets deleted so the corresponding /data will disappear too (
as the data node is under devcd*/data)
60 static void devcd_del(struct work_struct *wk)
61 {
62 struct devcd_entry *devcd;
63
64 devcd = container_of(wk, struct devcd_entry, del_wk.work);
65
66 device_del(&devcd->devcd_dev);
67 put_device(&devcd->devcd_dev);
68 }
Are you suggesting we implement a logic like :
a) if the usermode has started reading the data but has not finished yet
( we can detect the former with something like devcd->data_read_ongoing
= 1 and we know it has finished when it acks and we can clear this flag
then), in the timeout del_wk then we can delay the the delete timer by
another TIMEOUT amount of time to give usermode time to finish the data?
b) If usermode acks, we will clear both the flag and delete the device
as usual
But there is a corner case here:
c) If usermode starts the read, but then for some reason crashes, the
timer will timeout and try to delete the device but will detect that
usermode is still reading and will keep the device. How do we detect
this case?
Thats why i thought maybe the easier way right now is to try increasing
the timeout.
On Tue, 2022-02-08 at 13:04 -0800, Abhinav Kumar wrote: > > It opened the file rightaway but could not finish reading. > > The device gets deleted so the corresponding /data will disappear too ( > as the data node is under devcd*/data) Yeah but even if the file disappears, the open file descriptor is still there, no? Does sysfs somehow make those disappear? I know debugfs does (now, to some extent, it didn't always), but I thought sysfs was refcounting things and didn't do that? What did the userspace actually see? read() returned 0 so EOF? (I guess I could test it, but it's getting late) Your other questions are related - you need to consider the file in sysfs and the open file descriptor separately. johannes
Hi Johannes On 2/8/2022 1:12 PM, Johannes Berg wrote: > On Tue, 2022-02-08 at 13:04 -0800, Abhinav Kumar wrote: >> >> It opened the file rightaway but could not finish reading. >> >> The device gets deleted so the corresponding /data will disappear too ( >> as the data node is under devcd*/data) > > Yeah but even if the file disappears, the open file descriptor is still > there, no? Does sysfs somehow make those disappear? I know debugfs does > (now, to some extent, it didn't always), but I thought sysfs was > refcounting things and didn't do that? > > What did the userspace actually see? read() returned 0 so EOF? > > (I guess I could test it, but it's getting late) > > Your other questions are related - you need to consider the file in > sysfs and the open file descriptor separately. > > johannes > I am checking what usermode sees and will get back ( I didnt see an error do most likely it was EOF ). I didnt follow the second part. If the file descriptor read returns EOF, even if we consider them separate how will it resolve this issue? My earlier questions were related to fixing it in devcoredump to detect and fix it there. Are you suggesting to fix in usermode instead? How? Thanks Abhinav
On Tue, 2022-02-08 at 13:40 -0800, Abhinav Kumar wrote: > > > I am checking what usermode sees and will get back ( I didnt see an > error do most likely it was EOF ). I didnt follow the second part. I think probably it got -ENODEV, looking at kernfs_file_read_iter(). > If the file descriptor read returns EOF, even if we consider them > separate how will it resolve this issue? > > My earlier questions were related to fixing it in devcoredump to detect > and fix it there. Are you suggesting to fix in usermode instead? How? > Yeah, no, you cannot fix it in userspace. But I just followed the rabbit hole down kernfs and all, and it looks like indeed the read would be cut short with -ENODEV, sorry. It doesn't look like there's good API for this, but it seems at least from the underlying kernfs POV it should be possible to get_device() in open and put_device() in release, so that the device sticks around while somebody has the file open? It's entirely virtual, so this should be OK? johannes
Hi Johannes
On 2/8/2022 1:54 PM, Johannes Berg wrote:
> On Tue, 2022-02-08 at 13:40 -0800, Abhinav Kumar wrote:
>>>
>> I am checking what usermode sees and will get back ( I didnt see an
>> error do most likely it was EOF ). I didnt follow the second part.
>
> I think probably it got -ENODEV, looking at kernfs_file_read_iter().
>
>> If the file descriptor read returns EOF, even if we consider them
>> separate how will it resolve this issue?
>>
>> My earlier questions were related to fixing it in devcoredump to detect
>> and fix it there. Are you suggesting to fix in usermode instead? How?
>>
>
> Yeah, no, you cannot fix it in userspace.
>
> But I just followed the rabbit hole down kernfs and all, and it looks
> like indeed the read would be cut short with -ENODEV, sorry.
>
> It doesn't look like there's good API for this, but it seems at least
> from the underlying kernfs POV it should be possible to get_device() in
> open and put_device() in release, so that the device sticks around while
> somebody has the file open? It's entirely virtual, so this should be OK?
>
> johannes
Are you suggesting something like below?
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 42dcf96..14203d0 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -32,6 +32,22 @@ static const struct sysfs_ops *sysfs_file_ops(struct
kernfs_node *kn)
return kobj->ktype ? kobj->ktype->sysfs_ops : NULL;
}
+static int sysfs_kf_open(struct kernfs_open_file *of)
+{
+ struct kobject *kobj = of->kn->parent->priv;
+ struct device *dev = kobj_to_dev(kobj);
+
+ get_device(dev);
+}
+
+static void sysfs_kf_release(struct kernfs_open_file *of)
+{
+ struct kobject *kobj = of->kn->parent->priv;
+ struct device *dev = kobj_to_dev(kobj);
+
+ put_device(dev);
+}
+
/*
* Reads on sysfs are handled through seq_file, which takes care of hairy
* details like buffering and seeking. The following function pipes
@@ -211,6 +227,8 @@ static const struct kernfs_ops sysfs_file_kfops_wo = {
};
static const struct kernfs_ops sysfs_file_kfops_rw = {
+ .open = sysfs_kf_open;
+ .release = sysfs_kf_release;
.seq_show = sysfs_kf_seq_show,
.write = sysfs_kf_write,
};
If so, dont you think this will be a more intrusive change just for the
sake of devcoredump? Any other way to keep the changes limited to
devcoredump?
Thanks
Abhinav
On Tue, 2022-02-08 at 17:55 -0800, Abhinav Kumar wrote: > > Are you suggesting something like below? > > diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c > index 42dcf96..14203d0 100644 > --- a/fs/sysfs/file.c > No, for sure not, but I guess from the looks of this patch there's no way to do something like that for just an individual attribute... Oh well. johannes
Hi Johannes On 2/8/2022 11:50 PM, Johannes Berg wrote: > On Tue, 2022-02-08 at 17:55 -0800, Abhinav Kumar wrote: >> >> Are you suggesting something like below? >> >> diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c >> index 42dcf96..14203d0 100644 >> --- a/fs/sysfs/file.c >> > > No, for sure not, but I guess from the looks of this patch there's no > way to do something like that for just an individual attribute... > > Oh well. > > johannes In that case, I was not clear on the previous solution you suggested. Are you suggesting then we can go ahead with the timeout increase? If so, can I please get your ack? Thanks Abhinav
On Tue, Feb 08, 2022 at 05:55:18PM -0800, Abhinav Kumar wrote:
> Hi Johannes
>
> On 2/8/2022 1:54 PM, Johannes Berg wrote:
> > On Tue, 2022-02-08 at 13:40 -0800, Abhinav Kumar wrote:
> > > >
> > > I am checking what usermode sees and will get back ( I didnt see an
> > > error do most likely it was EOF ). I didnt follow the second part.
> >
> > I think probably it got -ENODEV, looking at kernfs_file_read_iter().
> >
> > > If the file descriptor read returns EOF, even if we consider them
> > > separate how will it resolve this issue?
> > >
> > > My earlier questions were related to fixing it in devcoredump to detect
> > > and fix it there. Are you suggesting to fix in usermode instead? How?
> > >
> >
> > Yeah, no, you cannot fix it in userspace.
> >
> > But I just followed the rabbit hole down kernfs and all, and it looks
> > like indeed the read would be cut short with -ENODEV, sorry.
> >
> > It doesn't look like there's good API for this, but it seems at least
> > from the underlying kernfs POV it should be possible to get_device() in
> > open and put_device() in release, so that the device sticks around while
> > somebody has the file open? It's entirely virtual, so this should be OK?
> >
> > johannes
>
> Are you suggesting something like below?
>
> diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
> index 42dcf96..14203d0 100644
> --- a/fs/sysfs/file.c
> +++ b/fs/sysfs/file.c
> @@ -32,6 +32,22 @@ static const struct sysfs_ops *sysfs_file_ops(struct
> kernfs_node *kn)
> return kobj->ktype ? kobj->ktype->sysfs_ops : NULL;
> }
>
> +static int sysfs_kf_open(struct kernfs_open_file *of)
> +{
> + struct kobject *kobj = of->kn->parent->priv;
> + struct device *dev = kobj_to_dev(kobj);
> +
> + get_device(dev);
> +}
> +
> +static void sysfs_kf_release(struct kernfs_open_file *of)
> +{
> + struct kobject *kobj = of->kn->parent->priv;
> + struct device *dev = kobj_to_dev(kobj);
> +
> + put_device(dev);
> +}
That obviously does not work as not everything in sysfs is a struct
device :(
On Tue, Feb 08, 2022 at 11:44:32AM -0800, Abhinav Kumar wrote: > There are cases where depending on the size of the devcoredump and the speed > at which the usermode reads the dump, it can take longer than the current 5 mins > timeout. > > This can lead to incomplete dumps as the device is deleted once the timeout expires. > > One example is below where it took 6 mins for the devcoredump to be completely read. > > 04:22:24.668 23916 23994 I HWDeviceDRM::DumpDebugData: Opening /sys/class/devcoredump/devcd6/data > 04:28:35.377 23916 23994 W HWDeviceDRM::DumpDebugData: Freeing devcoredump node What makes this so slow? Reading from the kernel shouldn't be the limit, is it where the data is being sent to? > Increase the timeout to 10 mins to accommodate system delays and large coredump > sizes. Nit, please wrap your changelog texts at 72 columns. And what is "large"? thanks, greg k-h
Hi Greg Thanks for the response. On 2/11/2022 3:09 AM, Greg KH wrote: > On Tue, Feb 08, 2022 at 11:44:32AM -0800, Abhinav Kumar wrote: >> There are cases where depending on the size of the devcoredump and the speed >> at which the usermode reads the dump, it can take longer than the current 5 mins >> timeout. >> >> This can lead to incomplete dumps as the device is deleted once the timeout expires. >> >> One example is below where it took 6 mins for the devcoredump to be completely read. >> >> 04:22:24.668 23916 23994 I HWDeviceDRM::DumpDebugData: Opening /sys/class/devcoredump/devcd6/data >> 04:28:35.377 23916 23994 W HWDeviceDRM::DumpDebugData: Freeing devcoredump node > > What makes this so slow? Reading from the kernel shouldn't be the > limit, is it where the data is being sent to? We are still checking this. We are seeing better read times when we bump up the thread priority of the thread which was reading this. We are also trying to check if bumping up CPU speed is helping. But, results have not been consistently good enough. So we thought we should also increase the timeout to be safe. > >> Increase the timeout to 10 mins to accommodate system delays and large coredump >> sizes. > > Nit, please wrap your changelog texts at 72 columns. > Yes, i will fix this when I re-post. > And what is "large"? We are seeing devcoredumps in the range of 2.5MB-3MB. I can also mention this in the commit text in the next post. Thanks Abhinav > > thanks, > > greg k-h
© 2016 - 2026 Red Hat, Inc.