[PATCH v3 0/9] x86,fs/resctrl: Fix long-standing issues

Reinette Chatre posted 9 patches 3 weeks ago
There is a newer version of this series
[PATCH v3 0/9] x86,fs/resctrl: Fix long-standing issues
Posted by Reinette Chatre 3 weeks ago
v2: https://lore.kernel.org/lkml/20260515193944.15114-1-tony.luck@intel.com/
v1: https://lore.kernel.org/all/20260508182143.14592-1-tony.luck@intel.com/

While reviewing the AET series [1] Sashiko reported a deadlock during mount,
and a use-after-free when an L3 domain is removed during CPU offline. Reinette
found a memory leak in the mount error path while refactoring code for a
solution to the mount hang.

During review of V1 of this series Sashiko found a new UAF-on-unmount issue
that was fixed in V2.

During review of V2 Sashiko uncovered several more new issues: a TOCTOU
involving rdtgroup_kn_put() that may lead to UAF or double-free, a double
free of pseudo-locked regions, and a potential deadlock between resctrl
unmount and info file readers. Sashiko also found that the CPU offline fix
in V2 is flawed in its use of is_percpu_thread().

Address all issues identified. This version is significantly different from V2
because of the additional fixes and the reworking of the CPU offline fix. I do
not consider this version quite "polished", but after all the changes made to
address the issues identified by Sashiko I would like to check in with folks
(and Sashiko) on where the fixes are headed and would appreciate any feedback.

Applies against tip/master to ensure it considers pending x86/cache changes.

[1] https://sashiko.dev/#/patchset/20260429184858.36423-1-tony.luck%40intel.com

Reinette Chatre (6):
  fs/resctrl: Fix deadlock for errors during mount
  fs/resctrl: Prevent use-after-free in rdtgroup_kn_put()
  fs/resctrl: Fix pseudo-locking lifetime handling
  fs/resctrl: Prevent deadlock and use-after-free in info file handlers
  x86/resctrl: Ensure domain fully initialized before placed on RCU list
  fs/resctrl: Fix UAF from worker threads when domains are removed

Tony Luck (3):
  fs/resctrl: Move functions to avoid forward references in subsequent
    fixes
  fs/resctrl: Free mon_data structures on rdt_get_tree() failure
  fs/resctrl: Fix use-after-free during unmount

 arch/x86/kernel/cpu/resctrl/core.c      |  18 +-
 arch/x86/kernel/cpu/resctrl/intel_aet.c |   5 +-
 fs/resctrl/ctrlmondata.c                |  38 +-
 fs/resctrl/internal.h                   |  15 +-
 fs/resctrl/monitor.c                    | 100 ++-
 fs/resctrl/pseudo_lock.c                |  44 +-
 fs/resctrl/rdtgroup.c                   | 847 +++++++++++++++---------
 7 files changed, 680 insertions(+), 387 deletions(-)

-- 
2.50.1
Re: [PATCH v3 0/9] x86,fs/resctrl: Fix long-standing issues
Posted by Luck, Tony 2 weeks, 1 day ago
On Fri, May 22, 2026 at 12:15:04PM -0700, Reinette Chatre wrote:
> v2: https://lore.kernel.org/lkml/20260515193944.15114-1-tony.luck@intel.com/
> v1: https://lore.kernel.org/all/20260508182143.14592-1-tony.luck@intel.com/
> 
> While reviewing the AET series [1] Sashiko reported a deadlock during mount,
> and a use-after-free when an L3 domain is removed during CPU offline. Reinette
> found a memory leak in the mount error path while refactoring code for a
> solution to the mount hang.
> 
> During review of V1 of this series Sashiko found a new UAF-on-unmount issue
> that was fixed in V2.
> 
> During review of V2 Sashiko uncovered several more new issues: a TOCTOU
> involving rdtgroup_kn_put() that may lead to UAF or double-free, a double
> free of pseudo-locked regions, and a potential deadlock between resctrl
> unmount and info file readers. Sashiko also found that the CPU offline fix
> in V2 is flawed in its use of is_percpu_thread().
> 
> Address all issues identified. This version is significantly different from V2
> because of the additional fixes and the reworking of the CPU offline fix. I do
> not consider this version quite "polished", but after all the changes made to
> address the issues identified by Sashiko I would like to check in with folks
> (and Sashiko) on where the fixes are headed and would appreciate any feedback.

Several of these patches are either authored or co-authored by me, so
I'm uncertain about the ethics of piling on a Reviewed-by tag.

Apart from patch 6, which fixes the Sashiko-reported issues in
pseudo-locking, everything looks good. I agree with Reinette's
assessment [1] that pursuing the bizarre corner-case races for
pseudo-locking should not be a priority. That patch can be dropped.

-Tony

[1] https://lore.kernel.org/all/e40a924f-5398-43bd-821a-2ff9873c5a4c@intel.com/
Re: [PATCH v3 0/9] x86,fs/resctrl: Fix long-standing issues
Posted by Reinette Chatre 2 weeks ago
Hi Tony,

On 5/28/26 1:08 PM, Luck, Tony wrote:
> On Fri, May 22, 2026 at 12:15:04PM -0700, Reinette Chatre wrote:
>> v2: https://lore.kernel.org/lkml/20260515193944.15114-1-tony.luck@intel.com/
>> v1: https://lore.kernel.org/all/20260508182143.14592-1-tony.luck@intel.com/
>>
>> While reviewing the AET series [1] Sashiko reported a deadlock during mount,
>> and a use-after-free when an L3 domain is removed during CPU offline. Reinette
>> found a memory leak in the mount error path while refactoring code for a
>> solution to the mount hang.
>>
>> During review of V1 of this series Sashiko found a new UAF-on-unmount issue
>> that was fixed in V2.
>>
>> During review of V2 Sashiko uncovered several more new issues: a TOCTOU
>> involving rdtgroup_kn_put() that may lead to UAF or double-free, a double
>> free of pseudo-locked regions, and a potential deadlock between resctrl
>> unmount and info file readers. Sashiko also found that the CPU offline fix
>> in V2 is flawed in its use of is_percpu_thread().
>>
>> Address all issues identified. This version is significantly different from V2
>> because of the additional fixes and the reworking of the CPU offline fix. I do
>> not consider this version quite "polished", but after all the changes made to
>> address the issues identified by Sashiko I would like to check in with folks
>> (and Sashiko) on where the fixes are headed and would appreciate any feedback.
> 
> Several of these patches are either authored or co-authored by me, so
> I'm uncertain about the ethics of piling on a Reviewed-by tag.
> 
> Apart from patch 6, which fixes the Sashiko-reported issues in
> pseudo-locking, everything looks good. I agree with Reinette's
> assessment [1] that pursuing the bizarre corner-case races for
> pseudo-locking should not be a priority. That patch can be dropped.

Thank you for taking a look.

After thinking about patch 6 more I plan to drop the fixes surrounding the races
but keep the fix for the RMID double-add. Since this will touch pseudo-locking region
lifetime management I expect Sashiko would bring up the pseudo-locking races
during its review but I believe this is a worthy fix.

As a summary of your tags for this series I see:
- patches 1, 2, 3, 4, and 9 have your Signed-off-by.
- patches 5, 7, and 8 do not have any tags from you.

Reinette
RE: [PATCH v3 0/9] x86,fs/resctrl: Fix long-standing issues
Posted by Luck, Tony 2 weeks ago
> > Apart from patch 6, which fixes the Sashiko-reported issues in
> > pseudo-locking, everything looks good. I agree with Reinette's
> > assessment [1] that pursuing the bizarre corner-case races for
> > pseudo-locking should not be a priority. That patch can be dropped.
>
> Thank you for taking a look.
>
> After thinking about patch 6 more I plan to drop the fixes surrounding the races
> but keep the fix for the RMID double-add. Since this will touch pseudo-locking region
> lifetime management I expect Sashiko would bring up the pseudo-locking races
> during its review but I believe this is a worthy fix.

Seems good.

> As a summary of your tags for this series I see:
> - patches 1, 2, 3, 4, and 9 have your Signed-off-by.
> - patches 5, 7, and 8 do not have any tags from you.

You can add my Reviewed-by: Tony Luck <tony.luck@intel.com> to patches 5, 7, and 8.

-Tony
Re: [PATCH v3 0/9] x86,fs/resctrl: Fix long-standing issues
Posted by Reinette Chatre 2 weeks ago

On 5/29/26 12:06 PM, Luck, Tony wrote:
>>> Apart from patch 6, which fixes the Sashiko-reported issues in
>>> pseudo-locking, everything looks good. I agree with Reinette's
>>> assessment [1] that pursuing the bizarre corner-case races for
>>> pseudo-locking should not be a priority. That patch can be dropped.
>>
>> Thank you for taking a look.
>>
>> After thinking about patch 6 more I plan to drop the fixes surrounding the races
>> but keep the fix for the RMID double-add. Since this will touch pseudo-locking region
>> lifetime management I expect Sashiko would bring up the pseudo-locking races
>> during its review but I believe this is a worthy fix.
> 
> Seems good.
> 
>> As a summary of your tags for this series I see:
>> - patches 1, 2, 3, 4, and 9 have your Signed-off-by.
>> - patches 5, 7, and 8 do not have any tags from you.
> 
> You can add my Reviewed-by: Tony Luck <tony.luck@intel.com> to patches 5, 7, and 8.
> 

Thank you very much Tony.

Reinette