Hi,

In the following sequence:
  of_platform_depopulate(); /* Remove devices from a DT overlay node */
  of_overlay_remove(); /* Remove the DT overlay node itself */

Some warnings are raised by __of_changeset_entry_destroy() which was
called from of_overlay_remove():
  ERROR: memory leak, expected refcount 1 instead of 2 ...

The issue is that, during the device devlink removals triggered from
of_platform_depopulate(), jobs are put in a workqueue.
These jobs drop the reference to the devices. When a device is no longer
referenced (refcount == 0), it is released and the reference to its
of_node is dropped by a call to of_node_put().
These operations are fully correct except that, because of the
workqueue, they are done asynchronously with respect to function calls.

In the sequence provided, the jobs are run too late, after the call to
__of_changeset_entry_destroy(), and so a missing of_node_put() call is
detected by __of_changeset_entry_destroy().
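
The ordering problem described above can be reproduced in a small
userspace model. This is only a sketch and not kernel code: toy_node,
toy_device, toy_queue_release(), toy_run_pending() and
toy_changeset_check() are hypothetical names, and a plain array stands
in for the workqueue. It only shows that a refcount check made before
the deferred job has run still sees the extra reference.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a device tree node: only its reference count matters. */
struct toy_node {
	int refcount;
};

/* Toy stand-in for a device holding one reference on its node. */
struct toy_device {
	struct toy_node *node;
	int released;
};

#define TOY_MAX_PENDING 8

/* Deferred release jobs; a plain array models the workqueue (assumption). */
static struct toy_device *toy_pending[TOY_MAX_PENDING];
static size_t toy_npending;

/* Bind a device to a node, taking a reference (models of_node_get()). */
static void toy_device_init(struct toy_device *dev, struct toy_node *node)
{
	node->refcount++;
	dev->node = node;
	dev->released = 0;
}

/* Models the devlink removal: the final release is deferred, not done inline. */
static void toy_queue_release(struct toy_device *dev)
{
	assert(toy_npending < TOY_MAX_PENDING);
	toy_pending[toy_npending++] = dev;
}

/* Run every deferred job: drop the node reference and mark the device released. */
static void toy_run_pending(void)
{
	for (size_t i = 0; i < toy_npending; i++) {
		struct toy_device *dev = toy_pending[i];

		dev->node->refcount--;	/* models of_node_put() */
		dev->released = 1;
	}
	toy_npending = 0;
}

/* Models the changeset teardown check: exactly one reference is expected. */
static int toy_changeset_check(const struct toy_node *node)
{
	return node->refcount == 1;
}
```

With a node whose refcount starts at 1, calling toy_device_init() and
then toy_queue_release() leaves the count at 2, so toy_changeset_check()
fails if it is called at that point. It passes once toy_run_pending()
has run, which corresponds to the job that arrives too late in the
sequence above.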

This series fixes this issue by introducing device_link_wait_removal(),
which waits for the end of jobs execution (patch 1), and by using this
function to synchronize the overlay removal with the end of jobs
execution (patch 2).
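
As an illustration of the synchronization idea only, here is a
userspace sketch. It assumes that waiting for removal amounts to
draining a dedicated queue of deferred jobs before the refcount check
runs; the real body of device_link_wait_removal() is in patch 1 and is
not reproduced here. toy_wq, toy_queue_work(), toy_flush(),
toy_put_ref() and toy_overlay_remove() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* One deferred job: a callback plus its argument. */
struct toy_work {
	void (*fn)(void *arg);
	void *arg;
};

#define TOY_WQ_DEPTH 16

/* Fixed-size ring buffer standing in for a dedicated workqueue (assumption). */
struct toy_wq {
	struct toy_work items[TOY_WQ_DEPTH];
	size_t head;	/* next job to run */
	size_t tail;	/* next free slot */
};

/* Queue a job; returns 0 on success, -1 if the ring is full. */
static int toy_queue_work(struct toy_wq *wq, void (*fn)(void *), void *arg)
{
	if (wq->tail - wq->head == TOY_WQ_DEPTH)
		return -1;
	wq->items[wq->tail % TOY_WQ_DEPTH] = (struct toy_work){ fn, arg };
	wq->tail++;
	return 0;
}

/* Run jobs until the queue is empty, including jobs queued by other jobs. */
static void toy_flush(struct toy_wq *wq)
{
	while (wq->head != wq->tail) {
		struct toy_work work = wq->items[wq->head % TOY_WQ_DEPTH];

		wq->head++;
		assert(work.fn != NULL);
		work.fn(work.arg);
	}
}

/* Deferred job used in this sketch: drop one reference (models of_node_put()). */
static void toy_put_ref(void *arg)
{
	int *refcount = arg;

	(*refcount)--;
}

/*
 * Models the overlay removal path. When wait_first is non-zero, pending
 * jobs are drained before the check, which is where the wait would sit.
 * Returns 0 if exactly one reference remains, -1 otherwise.
 */
static int toy_overlay_remove(struct toy_wq *wq, const int *refcount,
			      int wait_first)
{
	if (wait_first)
		toy_flush(wq);
	return *refcount == 1 ? 0 : -1;
}
```

With one toy_put_ref() job queued against a refcount of 2,
toy_overlay_remove() returns -1 when wait_first is 0 and returns 0 when
wait_first is 1, because the drain lets the pending put complete before
the check.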

Compared to the previous iteration:
  https://lore.kernel.org/linux-kernel/20231130174126.688486-1-herve.codina@bootlin.com/
this v3 series:
  - adds the missing device.h

This series handles cases reported by Luca [1] and Nuno [2].
  [1]: https://lore.kernel.org/all/20231220181627.341e8789@booty/
  [2]: https://lore.kernel.org/all/20240205-fix-device-links-overlays-v2-2-5344f8c79d57@analog.com/

Best regards,
Hervé

Changes v2 -> v3
  - Patch 1
    No changes

  - Patch 2
    Add missing device.h

Changes v1 -> v2
  - Patch 1
    Rename the workqueue to 'device_link_wq'
    Add 'Fixes' tag and Cc stable

  - Patch 2
    Add device.h inclusion.
    Call device_link_wait_removal() later in the overlay removal
    sequence (i.e. in free_overlay_changeset() function).
    Drop of_mutex lock while calling device_link_wait_removal().
    Add 'Fixes' tag and Cc stable

Herve Codina (2):
  driver core: Introduce device_link_wait_removal()
  of: overlay: Synchronize of_overlay_remove() with the devlink removals

 drivers/base/core.c    | 26 +++++++++++++++++++++++---
 drivers/of/overlay.c   | 10 +++++++++-
 include/linux/device.h |  1 +
 3 files changed, 33 insertions(+), 4 deletions(-)
--
2.43.0
On Thu, Feb 29, 2024 at 11:52:01AM +0100, Herve Codina wrote:
> Hi,

Please CC Saravana on this.

[...]
On Mon, Mar 4, 2024 at 7:02 AM Rob Herring <robh@kernel.org> wrote:
>
> On Thu, Feb 29, 2024 at 11:52:01AM +0100, Herve Codina wrote:
> > Hi,
>
> Please CC Saravana on this.

Nuno, this is why I was replying to the older series. I didn't even
get this one.

[...]
On Mon, 2024-03-04 at 22:17 -0800, Saravana Kannan wrote:
> On Mon, Mar 4, 2024 at 7:02 AM Rob Herring <robh@kernel.org> wrote:
> >
> > On Thu, Feb 29, 2024 at 11:52:01AM +0100, Herve Codina wrote:
> > > Hi,
> >
> > Please CC Saravana on this.
>
> Nuno, this is why I was replying to the older series. I didn't even
> get this one.

Arghh, I see... In lots of replies I was mentioning you :)

- Nuno Sá

[...]