[PATCH v2] cgroup/cpuset: call fmeter_init() when cpuset.memory_pressure disabled

Jin Guojie posted 1 patch 11 months ago
kernel/cgroup/cpuset-v1.c | 4 +++-
kernel/cgroup/cpuset.c    | 2 ++
2 files changed, 5 insertions(+), 1 deletion(-)
[PATCH v2] cgroup/cpuset: call fmeter_init() when cpuset.memory_pressure disabled
Posted by Jin Guojie 11 months ago
When running LTP's cpuset_memory_pressure program, the following error occurs:

(1) Create a cgroup, enable cpuset subsystem, set memory limit, and
then set cpuset_memory_pressure to 1
(2) In this cgroup, create a process to allocate a large amount of
memory and generate pressure counts
(3) Set cpuset_memory_pressure to 0
(4) Check cpuset.memory_pressure: LTP thinks it should be 0, but the
current kernel returns a value of 1, so LTP determines it as FAIL

V2:
* call fmeter_init() when writing 0 to the memory_pressure_enabled

Compared with patch v1 [1], this version implements clearer logic.

[1] https://lore.kernel.org/cgroups/CA+B+MYRNsdKcYxC8kbyzVrdH9fT8c2if5UxGguKep36ZHe6HMQ@mail.gmail.com/T/#u

Signed-off-by: Jin Guojie <guojie.jin@gmail.com>
Suggested-by: Michal Koutný <mkoutny@suse.com>
Suggested-by: Waiman Long <longman@redhat.com>
---
 kernel/cgroup/cpuset-v1.c | 4 +++-
 kernel/cgroup/cpuset.c    | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c
index 25c1d7b77e2f..7520eb31598a 100644
--- a/kernel/cgroup/cpuset-v1.c
+++ b/kernel/cgroup/cpuset-v1.c
@@ -66,7 +66,6 @@ void fmeter_init(struct fmeter *fmp)
        fmp->cnt = 0;
        fmp->val = 0;
        fmp->time = 0;
-       spin_lock_init(&fmp->lock);
 }

 /* Internal meter update - process cnt events and update value */
@@ -437,6 +436,9 @@ static int cpuset_write_u64(struct
cgroup_subsys_state *css, struct cftype *cft,
                break;
        case FILE_MEMORY_PRESSURE_ENABLED:
                cpuset_memory_pressure_enabled = !!val;
+               if (cpuset_memory_pressure_enabled == 0) {
+                       fmeter_init(&cs->fmeter);
+               }
                break;
        case FILE_SPREAD_PAGE:
                retval = cpuset_update_flag(CS_SPREAD_PAGE, cs, val);
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 0f910c828973..3583c898ff77 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -3378,6 +3378,7 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css)

        __set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
        fmeter_init(&cs->fmeter);
+       spin_lock_init(&cs->fmeter.lock);
        cs->relax_domain_level = -1;
        INIT_LIST_HEAD(&cs->remote_sibling);

@@ -3650,6 +3651,7 @@ int __init cpuset_init(void)
        nodes_setall(top_cpuset.effective_mems);

        fmeter_init(&top_cpuset.fmeter);
+       spin_lock_init(&top_cpuset.fmeter.lock);
        INIT_LIST_HEAD(&remote_children);

        BUG_ON(!alloc_cpumask_var(&cpus_attach, GFP_KERNEL));
--
2.34.1
Re: [PATCH v2] cgroup/cpuset: call fmeter_init() when cpuset.memory_pressure disabled
Posted by Waiman Long 11 months ago
On 1/15/25 12:05 AM, Jin Guojie wrote:
> When running LTP's cpuset_memory_pressure program, the following error occurs:
>
> (1) Create a cgroup, enable cpuset subsystem, set memory limit, and
> then set cpuset_memory_pressure to 1
> (2) In this cgroup, create a process to allocate a large amount of
> memory and generate pressure counts
> (3) Set cpuset_memory_pressure to 0
> (4) Check cpuset.memory_pressure: LTP thinks it should be 0, but the
> current kernel returns a value of 1, so LTP determines it as FAIL
>
> V2:
> * call fmeter_init() when writing 0 to the memory_pressure_enabled
>
> Compared with patch v1 [1], this version implements clearer logic.
>
> [1] https://lore.kernel.org/cgroups/CA+B+MYRNsdKcYxC8kbyzVrdH9fT8c2if5UxGguKep36ZHe6HMQ@mail.gmail.com/T/#u
>
> Signed-off-by: Jin Guojie <guojie.jin@gmail.com>
> Suggested-by: Michal Koutný <mkoutny@suse.com>
> Suggested-by: Waiman Long <longman@redhat.com>
> ---
>   kernel/cgroup/cpuset-v1.c | 4 +++-
>   kernel/cgroup/cpuset.c    | 2 ++
>   2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c
> index 25c1d7b77e2f..7520eb31598a 100644
> --- a/kernel/cgroup/cpuset-v1.c
> +++ b/kernel/cgroup/cpuset-v1.c
> @@ -66,7 +66,6 @@ void fmeter_init(struct fmeter *fmp)
>          fmp->cnt = 0;
>          fmp->val = 0;
>          fmp->time = 0;
> -       spin_lock_init(&fmp->lock);
>   }
>
>   /* Internal meter update - process cnt events and update value */
> @@ -437,6 +436,9 @@ static int cpuset_write_u64(struct
> cgroup_subsys_state *css, struct cftype *cft,
>                  break;
>          case FILE_MEMORY_PRESSURE_ENABLED:
>                  cpuset_memory_pressure_enabled = !!val;
> +               if (cpuset_memory_pressure_enabled == 0) {
> +                       fmeter_init(&cs->fmeter);
> +               }
Nit: you don't need parentheses when there is only one statement 
underneath "if".
>                  break;
>          case FILE_SPREAD_PAGE:
>                  retval = cpuset_update_flag(CS_SPREAD_PAGE, cs, val);
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 0f910c828973..3583c898ff77 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -3378,6 +3378,7 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
>
>          __set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
>          fmeter_init(&cs->fmeter);
> +       spin_lock_init(&cs->fmeter.lock);
>          cs->relax_domain_level = -1;
>          INIT_LIST_HEAD(&cs->remote_sibling);
>
> @@ -3650,6 +3651,7 @@ int __init cpuset_init(void)
>          nodes_setall(top_cpuset.effective_mems);
>
>          fmeter_init(&top_cpuset.fmeter);
> +       spin_lock_init(&top_cpuset.fmeter.lock);
>          INIT_LIST_HEAD(&remote_children);
>
>          BUG_ON(!alloc_cpumask_var(&cpus_attach, GFP_KERNEL));
> --
> 2.34.1
>
I just realize that cpuset.memory_pressure_enabled is on root cgroup 
only and affect a global flag that impact the behavior of all the 
existing cpusets. Your current patch will clear the memory pressure data 
in the root cgroup only. The other child cpusets will not be affected 
and will still show existing data. This inconsistency isn't good.

OTOH, I also don't think iterating the whole cpuset hierarchy and 
clearing all the fmeter data is worth the effort given that cgroup v1 is 
in maintenance mode. Perhaps just a simple check to return 0 if 
cpuset.memory_pressure_enabled isn't set like in the v1 patch. I also 
don't think we need to clear the fmeter data in that case as it will 
lead to data clearing only on cpusets where cpuset.memory_pressure is 
read while cpuset.memory_pressure_enabled has been cleared.

Cheers,
Longman


Re: [PATCH v2] cgroup/cpuset: call fmeter_init() when cpuset.memory_pressure disabled
Posted by Jin Guojie 10 months ago
On Wed, Jan 15, 2025 at 11:38 PM Waiman Long <llong@redhat.com> wrote:
>
> On 1/15/25 12:05 AM, Jin Guojie wrote:
> > When running LTP's cpuset_memory_pressure program, the following error occurs:
> >
> > (1) Create a cgroup, enable cpuset subsystem, set memory limit, and
> > then set cpuset_memory_pressure to 1
> > (2) In this cgroup, create a process to allocate a large amount of
> > memory and generate pressure counts
> > (3) Set cpuset_memory_pressure to 0
> > (4) Check cpuset.memory_pressure: LTP thinks it should be 0, but the
> > current kernel returns a value of 1, so LTP determines it as FAIL
> >
> > V2:
> > * call fmeter_init() when writing 0 to the memory_pressure_enabled
> >
> > Compared with patch v1 [1], this version implements clearer logic.
> >
> > [1] https://lore.kernel.org/cgroups/CA+B+MYRNsdKcYxC8kbyzVrdH9fT8c2if5UxGguKep36ZHe6HMQ@mail.gmail.com/T/#u
> >
> > Signed-off-by: Jin Guojie <guojie.jin@gmail.com>
> > Suggested-by: Michal Koutný <mkoutny@suse.com>
> > Suggested-by: Waiman Long <longman@redhat.com>
> > ---
> >   kernel/cgroup/cpuset-v1.c | 4 +++-
> >   kernel/cgroup/cpuset.c    | 2 ++
> >   2 files changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c
> > index 25c1d7b77e2f..7520eb31598a 100644
> > --- a/kernel/cgroup/cpuset-v1.c
> > +++ b/kernel/cgroup/cpuset-v1.c
> > @@ -66,7 +66,6 @@ void fmeter_init(struct fmeter *fmp)
> >          fmp->cnt = 0;
> >          fmp->val = 0;
> >          fmp->time = 0;
> > -       spin_lock_init(&fmp->lock);
> >   }
> >
> >   /* Internal meter update - process cnt events and update value */
> > @@ -437,6 +436,9 @@ static int cpuset_write_u64(struct
> > cgroup_subsys_state *css, struct cftype *cft,
> >                  break;
> >          case FILE_MEMORY_PRESSURE_ENABLED:
> >                  cpuset_memory_pressure_enabled = !!val;
> > +               if (cpuset_memory_pressure_enabled == 0) {
> > +                       fmeter_init(&cs->fmeter);
> > +               }
> Nit: you don't need parentheses when there is only one statement
> underneath "if".
> >                  break;
> >          case FILE_SPREAD_PAGE:
> >                  retval = cpuset_update_flag(CS_SPREAD_PAGE, cs, val);
> > diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> > index 0f910c828973..3583c898ff77 100644
> > --- a/kernel/cgroup/cpuset.c
> > +++ b/kernel/cgroup/cpuset.c
> > @@ -3378,6 +3378,7 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
> >
> >          __set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
> >          fmeter_init(&cs->fmeter);
> > +       spin_lock_init(&cs->fmeter.lock);
> >          cs->relax_domain_level = -1;
> >          INIT_LIST_HEAD(&cs->remote_sibling);
> >
> > @@ -3650,6 +3651,7 @@ int __init cpuset_init(void)
> >          nodes_setall(top_cpuset.effective_mems);
> >
> >          fmeter_init(&top_cpuset.fmeter);
> > +       spin_lock_init(&top_cpuset.fmeter.lock);
> >          INIT_LIST_HEAD(&remote_children);
> >
> >          BUG_ON(!alloc_cpumask_var(&cpus_attach, GFP_KERNEL));
> > --
> > 2.34.1
> >
> I just realize that cpuset.memory_pressure_enabled is on root cgroup
> only and affect a global flag that impact the behavior of all the
> existing cpusets. Your current patch will clear the memory pressure data
> in the root cgroup only. The other child cpusets will not be affected
> and will still show existing data. This inconsistency isn't good.
>
> OTOH, I also don't think iterating the whole cpuset hierarchy and
> clearing all the fmeter data is worth the effort given that cgroup v1 is
> in maintenance mode. Perhaps just a simple check to return 0 if
> cpuset.memory_pressure_enabled isn't set like in the v1 patch. I also
> don't think we need to clear the fmeter data in that case as it will
> lead to data clearing only on cpusets where cpuset.memory_pressure is
> read while cpuset.memory_pressure_enabled has been cleared.
>
> Cheers,
> Longman

Hi Waiman,

I understand that your meaning is to avoid clearing the fmeter data.
Based on this idea, I made some modifications based on patch v1.

Instead of writing 0 to the member variables of fmeter, we only need
to return 0 in fmeter_getrate(). The update logic for fmeter data is kept
unchanged, which minimizes the impact on the kernel's current behavior.

The whole patch is greatly simplified as follows.
Please see if this idea is correct, thanks!

diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c
index 25c1d7b77e2f..14564e91e2b3 100644
--- a/kernel/cgroup/cpuset-v1.c
+++ b/kernel/cgroup/cpuset-v1.c
@@ -108,7 +108,7 @@ static int fmeter_getrate(struct fmeter *fmp)
        fmeter_update(fmp);
        val = fmp->val;
        spin_unlock(&fmp->lock);
-       return val;
+       return cpuset_memory_pressure_enabled ? val : 0;
}



Jin
Re: [PATCH v2] cgroup/cpuset: call fmeter_init() when cpuset.memory_pressure disabled
Posted by Waiman Long 10 months ago
On 2/14/25 5:34 AM, Jin Guojie wrote:
> On Wed, Jan 15, 2025 at 11:38 PM Waiman Long <llong@redhat.com> wrote:
>> On 1/15/25 12:05 AM, Jin Guojie wrote:
>>> When running LTP's cpuset_memory_pressure program, the following error occurs:
>>>
>>> (1) Create a cgroup, enable cpuset subsystem, set memory limit, and
>>> then set cpuset_memory_pressure to 1
>>> (2) In this cgroup, create a process to allocate a large amount of
>>> memory and generate pressure counts
>>> (3) Set cpuset_memory_pressure to 0
>>> (4) Check cpuset.memory_pressure: LTP thinks it should be 0, but the
>>> current kernel returns a value of 1, so LTP determines it as FAIL
>>>
>>> V2:
>>> * call fmeter_init() when writing 0 to the memory_pressure_enabled
>>>
>>> Compared with patch v1 [1], this version implements clearer logic.
>>>
>>> [1] https://lore.kernel.org/cgroups/CA+B+MYRNsdKcYxC8kbyzVrdH9fT8c2if5UxGguKep36ZHe6HMQ@mail.gmail.com/T/#u
>>>
>>> Signed-off-by: Jin Guojie <guojie.jin@gmail.com>
>>> Suggested-by: Michal Koutný <mkoutny@suse.com>
>>> Suggested-by: Waiman Long <longman@redhat.com>
>>> ---
>>>    kernel/cgroup/cpuset-v1.c | 4 +++-
>>>    kernel/cgroup/cpuset.c    | 2 ++
>>>    2 files changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c
>>> index 25c1d7b77e2f..7520eb31598a 100644
>>> --- a/kernel/cgroup/cpuset-v1.c
>>> +++ b/kernel/cgroup/cpuset-v1.c
>>> @@ -66,7 +66,6 @@ void fmeter_init(struct fmeter *fmp)
>>>           fmp->cnt = 0;
>>>           fmp->val = 0;
>>>           fmp->time = 0;
>>> -       spin_lock_init(&fmp->lock);
>>>    }
>>>
>>>    /* Internal meter update - process cnt events and update value */
>>> @@ -437,6 +436,9 @@ static int cpuset_write_u64(struct
>>> cgroup_subsys_state *css, struct cftype *cft,
>>>                   break;
>>>           case FILE_MEMORY_PRESSURE_ENABLED:
>>>                   cpuset_memory_pressure_enabled = !!val;
>>> +               if (cpuset_memory_pressure_enabled == 0) {
>>> +                       fmeter_init(&cs->fmeter);
>>> +               }
>> Nit: you don't need parentheses when there is only one statement
>> underneath "if".
>>>                   break;
>>>           case FILE_SPREAD_PAGE:
>>>                   retval = cpuset_update_flag(CS_SPREAD_PAGE, cs, val);
>>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>>> index 0f910c828973..3583c898ff77 100644
>>> --- a/kernel/cgroup/cpuset.c
>>> +++ b/kernel/cgroup/cpuset.c
>>> @@ -3378,6 +3378,7 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
>>>
>>>           __set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
>>>           fmeter_init(&cs->fmeter);
>>> +       spin_lock_init(&cs->fmeter.lock);
>>>           cs->relax_domain_level = -1;
>>>           INIT_LIST_HEAD(&cs->remote_sibling);
>>>
>>> @@ -3650,6 +3651,7 @@ int __init cpuset_init(void)
>>>           nodes_setall(top_cpuset.effective_mems);
>>>
>>>           fmeter_init(&top_cpuset.fmeter);
>>> +       spin_lock_init(&top_cpuset.fmeter.lock);
>>>           INIT_LIST_HEAD(&remote_children);
>>>
>>>           BUG_ON(!alloc_cpumask_var(&cpus_attach, GFP_KERNEL));
>>> --
>>> 2.34.1
>>>
>> I just realize that cpuset.memory_pressure_enabled is on root cgroup
>> only and affect a global flag that impact the behavior of all the
>> existing cpusets. Your current patch will clear the memory pressure data
>> in the root cgroup only. The other child cpusets will not be affected
>> and will still show existing data. This inconsistency isn't good.
>>
>> OTOH, I also don't think iterating the whole cpuset hierarchy and
>> clearing all the fmeter data is worth the effort given that cgroup v1 is
>> in maintenance mode. Perhaps just a simple check to return 0 if
>> cpuset.memory_pressure_enabled isn't set like in the v1 patch. I also
>> don't think we need to clear the fmeter data in that case as it will
>> lead to data clearing only on cpusets where cpuset.memory_pressure is
>> read while cpuset.memory_pressure_enabled has been cleared.
>>
>> Cheers,
>> Longman
> Hi Waiman,
>
> I understand that your meaning is to avoid clearing the fmeter data.
> Based on this idea, I made some modifications based on patch v1.
>
> Instead of writing 0 to the member variables of fmeter, we only need
> to return 0 in fmeter_getrate(). The update logic for fmeter data is kept
> unchanged, which minimizes the impact on the kernel's current behavior.
>
> The whole patch is greatly simplified as follows.
> Please see if this idea is correct, thanks!
>
> diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c
> index 25c1d7b77e2f..14564e91e2b3 100644
> --- a/kernel/cgroup/cpuset-v1.c
> +++ b/kernel/cgroup/cpuset-v1.c
> @@ -108,7 +108,7 @@ static int fmeter_getrate(struct fmeter *fmp)
>          fmeter_update(fmp);
>          val = fmp->val;
>          spin_unlock(&fmp->lock);
> -       return val;
> +       return cpuset_memory_pressure_enabled ? val : 0;
> }

That looks good to me. You can send out a v3 with this simpler change.

Cheers,
Longman

Re: [PATCH v2] cgroup/cpuset: call fmeter_init() when cpuset.memory_pressure disabled
Posted by Michal Koutný 11 months ago
On Wed, Jan 15, 2025 at 01:05:21PM +0800, Jin Guojie <guojie.jin@gmail.com> wrote:
> V2:
> * call fmeter_init() when writing 0 to the memory_pressure_enabled

Thanks for taking into account the feedback.

I'm still curious -- is this:
	a) a new LTP test,
	b) a new failure,
	c) an old failure, new look
or anything else?

Michal
Re: [PATCH v2] cgroup/cpuset: call fmeter_init() when cpuset.memory_pressure disabled
Posted by Jin Guojie 10 months ago
On Wed, Jan 15, 2025 at 7:01 PM Michal Koutný <mkoutny@suse.com> wrote:
>
> On Wed, Jan 15, 2025 at 01:05:21PM +0800, Jin Guojie <guojie.jin@gmail.com> wrote:
> > V2:
> > * call fmeter_init() when writing 0 to the memory_pressure_enabled
>
> Thanks for taking into account the feedback.
>
> I'm still curious -- is this:
>         a) a new LTP test,
>         b) a new failure,
>         c) an old failure, new look
> or anything else?

For the three situations a, b, and c you mentioned, none of them is true.
In fact, this "case cpuset_memory_pressure" has been in LTP for a long time,
and the same error will occur when running in multiple kernel versions.

The method to reproduce the error is provided in the description of patch v1.

https://lore.kernel.org/cgroups/CA+B+MYRNsdKcYxC8kbyzVrdH9fT8c2if5UxGguKep36ZHe6HMQ@mail.gmail.com/T/#u

The fmeter_getrate() function in the kernel has not been modified for
a long time,
so currently LTP "case cpuset_memory_pressure" still produces the same error.

>
> Michal
Re: [PATCH v2] cgroup/cpuset: call fmeter_init() when cpuset.memory_pressure disabled
Posted by Michal Koutný 10 months ago
On Fri, Feb 14, 2025 at 06:04:18PM +0800, Jin Guojie <guojie.jin@gmail.com> wrote:
> In fact, this "case cpuset_memory_pressure" has been in LTP for a long time,
> and the same error will occur when running in multiple kernel versions.

So it's an old failure that no one has looked at for a long time, i.e.
c). Thanks for the clarifier. (It might be benefitial to look at v2
nowadays ;-)

Michal