[PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point

James Morse posted 38 patches 1 year, 6 months ago
There is a newer version of this series
[PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
Posted by James Morse 1 year, 6 months ago
resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
resctrl can't be built as a module, and the kernfs helpers are not exported
so this is unlikely to change. MPAM has an error interrupt which indicates
the MPAM driver has gone haywire. Should this occur tasks could run with
the wrong control values, leading to bad performance for impoartant tasks.
The MPAM driver needs a way to tell resctrl that no further configuration
should be attempted.

Using resctrl_exit() for this leaves the system in a funny state as
resctrl is still mounted, but cannot be un-mounted because the sysfs
directory that is typically used has been removed. Dave Martin suggests
this may cause systemd trouble in the future as not all filesystems
can be unmounted.

Add calls to remove all the files and directories in resctrl, and
remove the sysfs_remove_mount_point() call that leaves the system
in a funny state. When triggered, this causes all the resctrl files
to disappear. resctrl can be unmounted, but not mounted again.

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 403118fdabd4..ddf3d0a26517 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4255,9 +4255,9 @@ int __init resctrl_init(void)
 
 void __exit resctrl_exit(void)
 {
+	rdtgroup_destroy_root();
 	debugfs_remove_recursive(debugfs_resctrl);
 	unregister_filesystem(&rdt_fs_type);
-	sysfs_remove_mount_point(fs_kobj, "resctrl");
 
 	resctrl_mon_resource_exit();
 }
-- 
2.39.2
Re: [PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
Posted by Reinette Chatre 1 year, 5 months ago
Hi James,

On 6/14/24 8:00 AM, James Morse wrote:
> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
> resctrl can't be built as a module, and the kernfs helpers are not exported
> so this is unlikely to change. MPAM has an error interrupt which indicates
> the MPAM driver has gone haywire. Should this occur tasks could run with
> the wrong control values, leading to bad performance for impoartant tasks.

impoartant -> important

> The MPAM driver needs a way to tell resctrl that no further configuration
> should be attempted.
> 
> Using resctrl_exit() for this leaves the system in a funny state as
> resctrl is still mounted, but cannot be un-mounted because the sysfs
> directory that is typically used has been removed. Dave Martin suggests
> this may cause systemd trouble in the future as not all filesystems
> can be unmounted.
> 
> Add calls to remove all the files and directories in resctrl, and
> remove the sysfs_remove_mount_point() call that leaves the system
> in a funny state. When triggered, this causes all the resctrl files
> to disappear. resctrl can be unmounted, but not mounted again.

I am not familiar with these flows so I would like to confirm ...
In this scenario the resctrl filesystem will be unregistered, are
you saying that it is possible to unmount a filesystem after it has
been unregistered?

Reinette
Re: [PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
Posted by James Morse 1 year, 5 months ago
Hi Reinette,

On 28/06/2024 17:53, Reinette Chatre wrote:
> On 6/14/24 8:00 AM, James Morse wrote:
>> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
>> resctrl can't be built as a module, and the kernfs helpers are not exported
>> so this is unlikely to change. MPAM has an error interrupt which indicates
>> the MPAM driver has gone haywire. Should this occur tasks could run with
>> the wrong control values, leading to bad performance for impoartant tasks.
> 
> impoartant -> important
> 
>> The MPAM driver needs a way to tell resctrl that no further configuration
>> should be attempted.
>>
>> Using resctrl_exit() for this leaves the system in a funny state as
>> resctrl is still mounted, but cannot be un-mounted because the sysfs
>> directory that is typically used has been removed. Dave Martin suggests
>> this may cause systemd trouble in the future as not all filesystems
>> can be unmounted.
>>
>> Add calls to remove all the files and directories in resctrl, and
>> remove the sysfs_remove_mount_point() call that leaves the system
>> in a funny state. When triggered, this causes all the resctrl files
>> to disappear. resctrl can be unmounted, but not mounted again.

> I am not familiar with these flows so I would like to confirm ...
> In this scenario the resctrl filesystem will be unregistered, are
> you saying that it is possible to unmount a filesystem after it has
> been unregistered?

Counter-intuitively: yes.

The rules are described in fs/filesystems.c: We can access the members of the struct
file_system_type if the list lock is held, or a reference is held to the module. This is
how /proc/mounts is able to print the filesystem name from struct file_system_type without
taking the lock - it holds a reference to any module to prevent the structure from being
freed. Because resctrl can't be built as a module, we can say there is always a reference
held, and we can never free struct file_system_type.


Thanks,

James
Re: [PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
Posted by Reinette Chatre 1 year, 5 months ago
Hi James,

On 7/4/24 9:41 AM, James Morse wrote:
> Hi Reinette,
> 
> On 28/06/2024 17:53, Reinette Chatre wrote:
>> On 6/14/24 8:00 AM, James Morse wrote:
>>> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
>>> resctrl can't be built as a module, and the kernfs helpers are not exported
>>> so this is unlikely to change. MPAM has an error interrupt which indicates
>>> the MPAM driver has gone haywire. Should this occur tasks could run with
>>> the wrong control values, leading to bad performance for impoartant tasks.
>>
>> impoartant -> important
>>
>>> The MPAM driver needs a way to tell resctrl that no further configuration
>>> should be attempted.
>>>
>>> Using resctrl_exit() for this leaves the system in a funny state as
>>> resctrl is still mounted, but cannot be un-mounted because the sysfs
>>> directory that is typically used has been removed. Dave Martin suggests
>>> this may cause systemd trouble in the future as not all filesystems
>>> can be unmounted.
>>>
>>> Add calls to remove all the files and directories in resctrl, and
>>> remove the sysfs_remove_mount_point() call that leaves the system
>>> in a funny state. When triggered, this causes all the resctrl files
>>> to disappear. resctrl can be unmounted, but not mounted again.
> 
>> I am not familiar with these flows so I would like to confirm ...
>> In this scenario the resctrl filesystem will be unregistered, are
>> you saying that it is possible to unmount a filesystem after it has
>> been unregistered?
> 
> Counter-intuitively: yes.
> 
> The rules are described in fs/filesystems.c: We can access the members of the struct
> file_system_type if the list lock is held, or a reference is held to the module. This is
> how /proc/mounts is able to print the filesystem name from struct file_system_type without
> taking the lock - it holds a reference to any module to prevent the structure from being

hmmm ... does this mean I am supposed to find calls to try_module_get() in the flow from
mounts_open_common()?

> freed. Because resctrl can't be built as a module, we can say there is always a reference
> held, and we can never free struct file_system_type.

unregister_filesystem() continues to be called and as I understand in new MPAM usages will be
called during runtime. unregister_filesystem() comments state "Once this function has returned
the &struct file_system_type structure may be freed or reused.". Could you please highlight to me
what gives the confidence of "we can say there is always a reference held"? Could you please
point to me where that reference is obtained that will prevent the structure from being
freed?

Thank you

Reinette
Re: [PATCH v3 31/38] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
Posted by James Morse 1 year, 4 months ago
Hi Reinette,

On 08/07/2024 18:47, Reinette Chatre wrote:
> On 7/4/24 9:41 AM, James Morse wrote:
>> On 28/06/2024 17:53, Reinette Chatre wrote:
>>> On 6/14/24 8:00 AM, James Morse wrote:
>>>> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
>>>> resctrl can't be built as a module, and the kernfs helpers are not exported
>>>> so this is unlikely to change. MPAM has an error interrupt which indicates
>>>> the MPAM driver has gone haywire. Should this occur tasks could run with
>>>> the wrong control values, leading to bad performance for impoartant tasks.
>>>
>>> impoartant -> important
>>>
>>>> The MPAM driver needs a way to tell resctrl that no further configuration
>>>> should be attempted.
>>>>
>>>> Using resctrl_exit() for this leaves the system in a funny state as
>>>> resctrl is still mounted, but cannot be un-mounted because the sysfs
>>>> directory that is typically used has been removed. Dave Martin suggests
>>>> this may cause systemd trouble in the future as not all filesystems
>>>> can be unmounted.
>>>>
>>>> Add calls to remove all the files and directories in resctrl, and
>>>> remove the sysfs_remove_mount_point() call that leaves the system
>>>> in a funny state. When triggered, this causes all the resctrl files
>>>> to disappear. resctrl can be unmounted, but not mounted again.
>>
>>> I am not familiar with these flows so I would like to confirm ...
>>> In this scenario the resctrl filesystem will be unregistered, are
>>> you saying that it is possible to unmount a filesystem after it has
>>> been unregistered?
>>
>> Counter-intuitively: yes.
>>
>> The rules are described in fs/filesystems.c: We can access the members of the struct
>> file_system_type if the list lock is held, or a reference is held to the module. This is
>> how /proc/mounts is able to print the filesystem name from struct file_system_type without
>> taking the lock - it holds a reference to any module to prevent the structure from being
> 
> hmmm ... does this mean I am supposed to find calls to try_module_get() in the flow from
> mounts_open_common()?

There may be, but when a filesystem is mounted the code in super.c holds a reference to
the filesystem - which translates to a reference on the module/filesystem->owner.

My point was only that its possible to unregister a filesystem while its mounted. The
reference counting takes care of this - and is unnecessary in our case.


>> freed. Because resctrl can't be built as a module, we can say there is always a reference
>> held, and we can never free struct file_system_type.
> 
> unregister_filesystem() continues to be called and as I understand in new MPAM usages will be
> called during runtime. unregister_filesystem() comments state "Once this function has
> returned
> the &struct file_system_type structure may be freed or reused.". Could you please
> highlight to me
> what gives the confidence of "we can say there is always a reference held"? Could you please
> point to me where that reference is obtained that will prevent the structure from being
> freed?

I think we are rat-holing on something that doesn't matter:
 * resctrl can't be built as a module - it is always built in.
 * rdt_fs_type is therefore part of the kernel data section - it can't be freed.
 * likewise the code that is part of resctrl can't be freed either.


Thanks,

James