From: Muchun Song <songmuchun@bytedance.com>
In the near future, a folio will no longer pin its corresponding
memory cgroup. To ensure safety, it will only be appropriate to
hold the rcu read lock or acquire a reference to the memory cgroup
returned by folio_memcg(), thereby preventing it from being released.
In the current patch, the rcu read lock is employed to safeguard
against the release of the memory cgroup in folio_migrate_mapping().
This serves as a preparatory measure for the reparenting of the
LRU pages.
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
---
mm/migrate.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/migrate.c b/mm/migrate.c
index 5169f9717f606..8bcd588c083ca 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -671,6 +671,7 @@ static int __folio_migrate_mapping(struct address_space *mapping,
struct lruvec *old_lruvec, *new_lruvec;
struct mem_cgroup *memcg;
+ rcu_read_lock();
memcg = folio_memcg(folio);
old_lruvec = mem_cgroup_lruvec(memcg, oldzone->zone_pgdat);
new_lruvec = mem_cgroup_lruvec(memcg, newzone->zone_pgdat);
@@ -698,6 +699,7 @@ static int __folio_migrate_mapping(struct address_space *mapping,
mod_lruvec_state(new_lruvec, NR_FILE_DIRTY, nr);
__mod_zone_page_state(newzone, NR_ZONE_WRITE_PENDING, nr);
}
+ rcu_read_unlock();
}
local_irq_enable();
--
2.20.1
On 12/17/25 08:27, Qi Zheng wrote: > From: Muchun Song <songmuchun@bytedance.com> > > In the near future, a folio will no longer pin its corresponding > memory cgroup. To ensure safety, it will only be appropriate to > hold the rcu read lock or acquire a reference to the memory cgroup > returned by folio_memcg(), thereby preventing it from being released. > > In the current patch, the rcu read lock is employed to safeguard > against the release of the memory cgroup in folio_migrate_mapping(). We usually avoid talking about "patches". In __folio_migrate_mapping(), the rcu read lock ... > > This serves as a preparatory measure for the reparenting of the > LRU pages. > > Signed-off-by: Muchun Song <songmuchun@bytedance.com> > Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> > Reviewed-by: Harry Yoo <harry.yoo@oracle.com> > --- > mm/migrate.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/mm/migrate.c b/mm/migrate.c > index 5169f9717f606..8bcd588c083ca 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -671,6 +671,7 @@ static int __folio_migrate_mapping(struct address_space *mapping, > struct lruvec *old_lruvec, *new_lruvec; > struct mem_cgroup *memcg; > > + rcu_read_lock(); > memcg = folio_memcg(folio); In general, LGTM I wonder, though, whether we should embed that in the ABI. Like "lock RCU and get the memcg" in one operation, to the "return memcg and unock rcu" in another operation. Something like "start / end" semantics. -- Cheers David
On Thu, Dec 18, 2025 at 10:09:21AM +0100, David Hildenbrand (Red Hat) wrote: > On 12/17/25 08:27, Qi Zheng wrote: > > From: Muchun Song <songmuchun@bytedance.com> > > > > In the near future, a folio will no longer pin its corresponding > > memory cgroup. To ensure safety, it will only be appropriate to > > hold the rcu read lock or acquire a reference to the memory cgroup > > returned by folio_memcg(), thereby preventing it from being released. > > > > In the current patch, the rcu read lock is employed to safeguard > > against the release of the memory cgroup in folio_migrate_mapping(). > > We usually avoid talking about "patches". > > In __folio_migrate_mapping(), the rcu read lock ... > > > > > This serves as a preparatory measure for the reparenting of the > > LRU pages. > > > > Signed-off-by: Muchun Song <songmuchun@bytedance.com> > > Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> > > Reviewed-by: Harry Yoo <harry.yoo@oracle.com> > > --- > > mm/migrate.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/mm/migrate.c b/mm/migrate.c > > index 5169f9717f606..8bcd588c083ca 100644 > > --- a/mm/migrate.c > > +++ b/mm/migrate.c > > @@ -671,6 +671,7 @@ static int __folio_migrate_mapping(struct address_space *mapping, > > struct lruvec *old_lruvec, *new_lruvec; > > struct mem_cgroup *memcg; > > > > + rcu_read_lock(); > > memcg = folio_memcg(folio); > > In general, LGTM > > I wonder, though, whether we should embed that in the ABI. > > Like "lock RCU and get the memcg" in one operation, to the "return memcg > and unock rcu" in another operation. > > Something like "start / end" semantics. The advantage of open-coding this particular one is that 1) rcu_read_lock() is something the caller could already be holding/using, implicitly or explicitly; and 2) it's immediately obvious that this is an atomic section (which was already useful in spotting a bug in the workingset patch of this series). "start/end" terminology hides this. "lock" we can't use because it would suggest binding stability. The only other idea I'd have would be to spell it all out: memcg = folio_memcg_rcu_read_lock(folio); stuff(memcg); otherstuff(); rcu_read_unlock(); But that might not be worth it. Maybe somebody can think of a better name. But I'd be hesitant to trade off the obviousness of what's going on given how simple the locking + access scheme is.
On 12/18/25 5:09 PM, David Hildenbrand (Red Hat) wrote: > On 12/17/25 08:27, Qi Zheng wrote: >> From: Muchun Song <songmuchun@bytedance.com> >> >> In the near future, a folio will no longer pin its corresponding >> memory cgroup. To ensure safety, it will only be appropriate to >> hold the rcu read lock or acquire a reference to the memory cgroup >> returned by folio_memcg(), thereby preventing it from being released. >> >> In the current patch, the rcu read lock is employed to safeguard >> against the release of the memory cgroup in folio_migrate_mapping(). > > We usually avoid talking about "patches". Got it. > > In __folio_migrate_mapping(), the rcu read lock ... Will do. > >> >> This serves as a preparatory measure for the reparenting of the >> LRU pages. >> >> Signed-off-by: Muchun Song <songmuchun@bytedance.com> >> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> >> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> >> --- >> mm/migrate.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/mm/migrate.c b/mm/migrate.c >> index 5169f9717f606..8bcd588c083ca 100644 >> --- a/mm/migrate.c >> +++ b/mm/migrate.c >> @@ -671,6 +671,7 @@ static int __folio_migrate_mapping(struct >> address_space *mapping, >> struct lruvec *old_lruvec, *new_lruvec; >> struct mem_cgroup *memcg; >> + rcu_read_lock(); >> memcg = folio_memcg(folio); > > In general, LGTM > > I wonder, though, whether we should embed that in the ABI. > > Like "lock RCU and get the memcg" in one operation, to the "return memcg > and unock rcu" in another operation. Do you mean adding a helper function like get_mem_cgroup_from_folio()? > > Something like "start / end" semantics. > > -- > Cheers > > David
On 12/18/25 10:36, Qi Zheng wrote: > > > On 12/18/25 5:09 PM, David Hildenbrand (Red Hat) wrote: >> On 12/17/25 08:27, Qi Zheng wrote: >>> From: Muchun Song <songmuchun@bytedance.com> >>> >>> In the near future, a folio will no longer pin its corresponding >>> memory cgroup. To ensure safety, it will only be appropriate to >>> hold the rcu read lock or acquire a reference to the memory cgroup >>> returned by folio_memcg(), thereby preventing it from being released. >>> >>> In the current patch, the rcu read lock is employed to safeguard >>> against the release of the memory cgroup in folio_migrate_mapping(). >> >> We usually avoid talking about "patches". > > Got it. > >> >> In __folio_migrate_mapping(), the rcu read lock ... > > Will do. > >> >>> >>> This serves as a preparatory measure for the reparenting of the >>> LRU pages. >>> >>> Signed-off-by: Muchun Song <songmuchun@bytedance.com> >>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> >>> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> >>> --- >>> mm/migrate.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/mm/migrate.c b/mm/migrate.c >>> index 5169f9717f606..8bcd588c083ca 100644 >>> --- a/mm/migrate.c >>> +++ b/mm/migrate.c >>> @@ -671,6 +671,7 @@ static int __folio_migrate_mapping(struct >>> address_space *mapping, >>> struct lruvec *old_lruvec, *new_lruvec; >>> struct mem_cgroup *memcg; >>> + rcu_read_lock(); >>> memcg = folio_memcg(folio); >> >> In general, LGTM >> >> I wonder, though, whether we should embed that in the ABI. >> >> Like "lock RCU and get the memcg" in one operation, to the "return memcg >> and unock rcu" in another operation. > > Do you mean adding a helper function like get_mem_cgroup_from_folio()? Right, something like memcg = folio_memcg_begin(folio); folio_memcg_end(memcg); Maybe someone reading along has a better idea. Then you can nicely document the requirements in the kerneldocs, and it is clear why the RCU lock is used (internally). -- Cheers David
On 12/18/25 5:43 PM, David Hildenbrand (Red Hat) wrote:
> On 12/18/25 10:36, Qi Zheng wrote:
>>
>>
>> On 12/18/25 5:09 PM, David Hildenbrand (Red Hat) wrote:
>>> On 12/17/25 08:27, Qi Zheng wrote:
>>>> From: Muchun Song <songmuchun@bytedance.com>
>>>>
>>>> In the near future, a folio will no longer pin its corresponding
>>>> memory cgroup. To ensure safety, it will only be appropriate to
>>>> hold the rcu read lock or acquire a reference to the memory cgroup
>>>> returned by folio_memcg(), thereby preventing it from being released.
>>>>
>>>> In the current patch, the rcu read lock is employed to safeguard
>>>> against the release of the memory cgroup in folio_migrate_mapping().
>>>
>>> We usually avoid talking about "patches".
>>
>> Got it.
>>
>>>
>>> In __folio_migrate_mapping(), the rcu read lock ...
>>
>> Will do.
>>
>>>
>>>>
>>>> This serves as a preparatory measure for the reparenting of the
>>>> LRU pages.
>>>>
>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>>>> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
>>>> ---
>>>> mm/migrate.c | 2 ++
>>>> 1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>> index 5169f9717f606..8bcd588c083ca 100644
>>>> --- a/mm/migrate.c
>>>> +++ b/mm/migrate.c
>>>> @@ -671,6 +671,7 @@ static int __folio_migrate_mapping(struct
>>>> address_space *mapping,
>>>> struct lruvec *old_lruvec, *new_lruvec;
>>>> struct mem_cgroup *memcg;
>>>> + rcu_read_lock();
>>>> memcg = folio_memcg(folio);
>>>
>>> In general, LGTM
>>>
>>> I wonder, though, whether we should embed that in the ABI.
>>>
>>> Like "lock RCU and get the memcg" in one operation, to the "return memcg
>>> and unock rcu" in another operation.
>>
>> Do you mean adding a helper function like get_mem_cgroup_from_folio()?
>
> Right, something like
>
> memcg = folio_memcg_begin(folio);
> folio_memcg_end(memcg);
For some longer or might-sleep critical sections (such as those pointed
by Johannes), perhaps it can be defined like this:
struct mem_cgroup *folio_memcg_begin(struct folio *folio)
{
return get_mem_cgroup_from_folio(folio);
}
void folio_memcg_end(struct mem_cgroup *memcg)
{
mem_cgroup_put(memcg);
}
But for some short critical sections, using RCU lock directly might
be the most convention option?
>
> Maybe someone reading along has a better idea. Then you can nicely
> document the requirements in the kerneldocs, and it is clear why the RCU
> lock is used (internally).
>
On 12/18/25 12:40, Qi Zheng wrote:
>
>
> On 12/18/25 5:43 PM, David Hildenbrand (Red Hat) wrote:
>> On 12/18/25 10:36, Qi Zheng wrote:
>>>
>>>
>>> On 12/18/25 5:09 PM, David Hildenbrand (Red Hat) wrote:
>>>> On 12/17/25 08:27, Qi Zheng wrote:
>>>>> From: Muchun Song <songmuchun@bytedance.com>
>>>>>
>>>>> In the near future, a folio will no longer pin its corresponding
>>>>> memory cgroup. To ensure safety, it will only be appropriate to
>>>>> hold the rcu read lock or acquire a reference to the memory cgroup
>>>>> returned by folio_memcg(), thereby preventing it from being released.
>>>>>
>>>>> In the current patch, the rcu read lock is employed to safeguard
>>>>> against the release of the memory cgroup in folio_migrate_mapping().
>>>>
>>>> We usually avoid talking about "patches".
>>>
>>> Got it.
>>>
>>>>
>>>> In __folio_migrate_mapping(), the rcu read lock ...
>>>
>>> Will do.
>>>
>>>>
>>>>>
>>>>> This serves as a preparatory measure for the reparenting of the
>>>>> LRU pages.
>>>>>
>>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>>>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>>>>> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
>>>>> ---
>>>>> mm/migrate.c | 2 ++
>>>>> 1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>>> index 5169f9717f606..8bcd588c083ca 100644
>>>>> --- a/mm/migrate.c
>>>>> +++ b/mm/migrate.c
>>>>> @@ -671,6 +671,7 @@ static int __folio_migrate_mapping(struct
>>>>> address_space *mapping,
>>>>> struct lruvec *old_lruvec, *new_lruvec;
>>>>> struct mem_cgroup *memcg;
>>>>> + rcu_read_lock();
>>>>> memcg = folio_memcg(folio);
>>>>
>>>> In general, LGTM
>>>>
>>>> I wonder, though, whether we should embed that in the ABI.
>>>>
>>>> Like "lock RCU and get the memcg" in one operation, to the "return memcg
>>>> and unock rcu" in another operation.
>>>
>>> Do you mean adding a helper function like get_mem_cgroup_from_folio()?
>>
>> Right, something like
>>
>> memcg = folio_memcg_begin(folio);
>> folio_memcg_end(memcg);
>
> For some longer or might-sleep critical sections (such as those pointed
> by Johannes), perhaps it can be defined like this:
>
> struct mem_cgroup *folio_memcg_begin(struct folio *folio)
> {
> return get_mem_cgroup_from_folio(folio);
> }
>
> void folio_memcg_end(struct mem_cgroup *memcg)
> {
> mem_cgroup_put(memcg);
> }
>
> But for some short critical sections, using RCU lock directly might
> be the most convention option?
>
Then put the rcu read locking in there instead?
--
Cheers
David
On 12/18/25 7:56 PM, David Hildenbrand (Red Hat) wrote:
> On 12/18/25 12:40, Qi Zheng wrote:
>>
>>
>> On 12/18/25 5:43 PM, David Hildenbrand (Red Hat) wrote:
>>> On 12/18/25 10:36, Qi Zheng wrote:
>>>>
>>>>
>>>> On 12/18/25 5:09 PM, David Hildenbrand (Red Hat) wrote:
>>>>> On 12/17/25 08:27, Qi Zheng wrote:
>>>>>> From: Muchun Song <songmuchun@bytedance.com>
>>>>>>
>>>>>> In the near future, a folio will no longer pin its corresponding
>>>>>> memory cgroup. To ensure safety, it will only be appropriate to
>>>>>> hold the rcu read lock or acquire a reference to the memory cgroup
>>>>>> returned by folio_memcg(), thereby preventing it from being released.
>>>>>>
>>>>>> In the current patch, the rcu read lock is employed to safeguard
>>>>>> against the release of the memory cgroup in folio_migrate_mapping().
>>>>>
>>>>> We usually avoid talking about "patches".
>>>>
>>>> Got it.
>>>>
>>>>>
>>>>> In __folio_migrate_mapping(), the rcu read lock ...
>>>>
>>>> Will do.
>>>>
>>>>>
>>>>>>
>>>>>> This serves as a preparatory measure for the reparenting of the
>>>>>> LRU pages.
>>>>>>
>>>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>>>>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>>>>>> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
>>>>>> ---
>>>>>> mm/migrate.c | 2 ++
>>>>>> 1 file changed, 2 insertions(+)
>>>>>>
>>>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>>>> index 5169f9717f606..8bcd588c083ca 100644
>>>>>> --- a/mm/migrate.c
>>>>>> +++ b/mm/migrate.c
>>>>>> @@ -671,6 +671,7 @@ static int __folio_migrate_mapping(struct
>>>>>> address_space *mapping,
>>>>>> struct lruvec *old_lruvec, *new_lruvec;
>>>>>> struct mem_cgroup *memcg;
>>>>>> + rcu_read_lock();
>>>>>> memcg = folio_memcg(folio);
>>>>>
>>>>> In general, LGTM
>>>>>
>>>>> I wonder, though, whether we should embed that in the ABI.
>>>>>
>>>>> Like "lock RCU and get the memcg" in one operation, to the "return
>>>>> memcg
>>>>> and unock rcu" in another operation.
>>>>
>>>> Do you mean adding a helper function like get_mem_cgroup_from_folio()?
>>>
>>> Right, something like
>>>
>>> memcg = folio_memcg_begin(folio);
>>> folio_memcg_end(memcg);
>>
>> For some longer or might-sleep critical sections (such as those pointed
>> by Johannes), perhaps it can be defined like this:
>>
>> struct mem_cgroup *folio_memcg_begin(struct folio *folio)
>> {
>> return get_mem_cgroup_from_folio(folio);
>> }
>>
>> void folio_memcg_end(struct mem_cgroup *memcg)
>> {
>> mem_cgroup_put(memcg);
>> }
>>
>> But for some short critical sections, using RCU lock directly might
>> be the most convention option?
>>
>
> Then put the rcu read locking in there instead?
So for some longer or might-sleep critical sections, using:
memcg = folio_memcg_begin(folio);
do_some_thing(memcg);
folio_memcg_end(folio);
for some short critical sections, using:
rcu_read_lock();
memcg = folio_memcg(folio);
do_some_thing(memcg);
rcu_read_unlock();
Right?
>
On 12/18/25 14:00, Qi Zheng wrote:
>
>
> On 12/18/25 7:56 PM, David Hildenbrand (Red Hat) wrote:
>> On 12/18/25 12:40, Qi Zheng wrote:
>>>
>>>
>>> On 12/18/25 5:43 PM, David Hildenbrand (Red Hat) wrote:
>>>> On 12/18/25 10:36, Qi Zheng wrote:
>>>>>
>>>>>
>>>>> On 12/18/25 5:09 PM, David Hildenbrand (Red Hat) wrote:
>>>>>> On 12/17/25 08:27, Qi Zheng wrote:
>>>>>>> From: Muchun Song <songmuchun@bytedance.com>
>>>>>>>
>>>>>>> In the near future, a folio will no longer pin its corresponding
>>>>>>> memory cgroup. To ensure safety, it will only be appropriate to
>>>>>>> hold the rcu read lock or acquire a reference to the memory cgroup
>>>>>>> returned by folio_memcg(), thereby preventing it from being released.
>>>>>>>
>>>>>>> In the current patch, the rcu read lock is employed to safeguard
>>>>>>> against the release of the memory cgroup in folio_migrate_mapping().
>>>>>>
>>>>>> We usually avoid talking about "patches".
>>>>>
>>>>> Got it.
>>>>>
>>>>>>
>>>>>> In __folio_migrate_mapping(), the rcu read lock ...
>>>>>
>>>>> Will do.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> This serves as a preparatory measure for the reparenting of the
>>>>>>> LRU pages.
>>>>>>>
>>>>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>>>>>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>>>>>>> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
>>>>>>> ---
>>>>>>> mm/migrate.c | 2 ++
>>>>>>> 1 file changed, 2 insertions(+)
>>>>>>>
>>>>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>>>>> index 5169f9717f606..8bcd588c083ca 100644
>>>>>>> --- a/mm/migrate.c
>>>>>>> +++ b/mm/migrate.c
>>>>>>> @@ -671,6 +671,7 @@ static int __folio_migrate_mapping(struct
>>>>>>> address_space *mapping,
>>>>>>> struct lruvec *old_lruvec, *new_lruvec;
>>>>>>> struct mem_cgroup *memcg;
>>>>>>> + rcu_read_lock();
>>>>>>> memcg = folio_memcg(folio);
>>>>>>
>>>>>> In general, LGTM
>>>>>>
>>>>>> I wonder, though, whether we should embed that in the ABI.
>>>>>>
>>>>>> Like "lock RCU and get the memcg" in one operation, to the "return
>>>>>> memcg
>>>>>> and unock rcu" in another operation.
>>>>>
>>>>> Do you mean adding a helper function like get_mem_cgroup_from_folio()?
>>>>
>>>> Right, something like
>>>>
>>>> memcg = folio_memcg_begin(folio);
>>>> folio_memcg_end(memcg);
>>>
>>> For some longer or might-sleep critical sections (such as those pointed
>>> by Johannes), perhaps it can be defined like this:
>>>
>>> struct mem_cgroup *folio_memcg_begin(struct folio *folio)
>>> {
>>> return get_mem_cgroup_from_folio(folio);
>>> }
>>>
>>> void folio_memcg_end(struct mem_cgroup *memcg)
>>> {
>>> mem_cgroup_put(memcg);
>>> }
>>>
>>> But for some short critical sections, using RCU lock directly might
>>> be the most convention option?
>>>
>>
>> Then put the rcu read locking in there instead?
>
> So for some longer or might-sleep critical sections, using:
>
> memcg = folio_memcg_begin(folio);
> do_some_thing(memcg);
> folio_memcg_end(folio);
>
> for some short critical sections, using:
>
> rcu_read_lock();
> memcg = folio_memcg(folio);
> do_some_thing(memcg);
> rcu_read_unlock();
>
> Right?
What I mean is:
memcg = folio_memcg_begin(folio);
do_some_thing(memcg);
folio_memcg_end(folio);
but do the rcu_read_lock() in folio_memcg_begin() and the
rcu_read_unlock() in folio_memcg_end().
You could also have (expensive) variants, as you describe, that mess
with getting/dopping the memcg.
But my points was about hiding the rcu details in a set of helpers.
Sorry if what I say is confusing.
--
Cheers
David
On 12/18/25 9:04 PM, David Hildenbrand (Red Hat) wrote:
> On 12/18/25 14:00, Qi Zheng wrote:
>>
>>
>> On 12/18/25 7:56 PM, David Hildenbrand (Red Hat) wrote:
>>> On 12/18/25 12:40, Qi Zheng wrote:
>>>>
>>>>
>>>> On 12/18/25 5:43 PM, David Hildenbrand (Red Hat) wrote:
>>>>> On 12/18/25 10:36, Qi Zheng wrote:
>>>>>>
>>>>>>
>>>>>> On 12/18/25 5:09 PM, David Hildenbrand (Red Hat) wrote:
>>>>>>> On 12/17/25 08:27, Qi Zheng wrote:
>>>>>>>> From: Muchun Song <songmuchun@bytedance.com>
>>>>>>>>
>>>>>>>> In the near future, a folio will no longer pin its corresponding
>>>>>>>> memory cgroup. To ensure safety, it will only be appropriate to
>>>>>>>> hold the rcu read lock or acquire a reference to the memory cgroup
>>>>>>>> returned by folio_memcg(), thereby preventing it from being
>>>>>>>> released.
>>>>>>>>
>>>>>>>> In the current patch, the rcu read lock is employed to safeguard
>>>>>>>> against the release of the memory cgroup in
>>>>>>>> folio_migrate_mapping().
>>>>>>>
>>>>>>> We usually avoid talking about "patches".
>>>>>>
>>>>>> Got it.
>>>>>>
>>>>>>>
>>>>>>> In __folio_migrate_mapping(), the rcu read lock ...
>>>>>>
>>>>>> Will do.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> This serves as a preparatory measure for the reparenting of the
>>>>>>>> LRU pages.
>>>>>>>>
>>>>>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>>>>>>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>>>>>>>> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
>>>>>>>> ---
>>>>>>>> mm/migrate.c | 2 ++
>>>>>>>> 1 file changed, 2 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>>>>>> index 5169f9717f606..8bcd588c083ca 100644
>>>>>>>> --- a/mm/migrate.c
>>>>>>>> +++ b/mm/migrate.c
>>>>>>>> @@ -671,6 +671,7 @@ static int __folio_migrate_mapping(struct
>>>>>>>> address_space *mapping,
>>>>>>>> struct lruvec *old_lruvec, *new_lruvec;
>>>>>>>> struct mem_cgroup *memcg;
>>>>>>>> + rcu_read_lock();
>>>>>>>> memcg = folio_memcg(folio);
>>>>>>>
>>>>>>> In general, LGTM
>>>>>>>
>>>>>>> I wonder, though, whether we should embed that in the ABI.
>>>>>>>
>>>>>>> Like "lock RCU and get the memcg" in one operation, to the "return
>>>>>>> memcg
>>>>>>> and unock rcu" in another operation.
>>>>>>
>>>>>> Do you mean adding a helper function like
>>>>>> get_mem_cgroup_from_folio()?
>>>>>
>>>>> Right, something like
>>>>>
>>>>> memcg = folio_memcg_begin(folio);
>>>>> folio_memcg_end(memcg);
>>>>
>>>> For some longer or might-sleep critical sections (such as those pointed
>>>> by Johannes), perhaps it can be defined like this:
>>>>
>>>> struct mem_cgroup *folio_memcg_begin(struct folio *folio)
>>>> {
>>>> return get_mem_cgroup_from_folio(folio);
>>>> }
>>>>
>>>> void folio_memcg_end(struct mem_cgroup *memcg)
>>>> {
>>>> mem_cgroup_put(memcg);
>>>> }
>>>>
>>>> But for some short critical sections, using RCU lock directly might
>>>> be the most convention option?
>>>>
>>>
>>> Then put the rcu read locking in there instead?
>>
>> So for some longer or might-sleep critical sections, using:
>>
>> memcg = folio_memcg_begin(folio);
>> do_some_thing(memcg);
>> folio_memcg_end(folio);
>>
>> for some short critical sections, using:
>>
>> rcu_read_lock();
>> memcg = folio_memcg(folio);
>> do_some_thing(memcg);
>> rcu_read_unlock();
>>
>> Right?
>
> What I mean is:
>
> memcg = folio_memcg_begin(folio);
> do_some_thing(memcg);
> folio_memcg_end(folio);
>
> but do the rcu_read_lock() in folio_memcg_begin() and the
> rcu_read_unlock() in folio_memcg_end().
>
> You could also have (expensive) variants, as you describe, that mess
> with getting/dopping the memcg.
Or simple use folio_memcg_begin(memcg)/folio_memcg_end(memcg) in all cases.
Or add a parameter to them:
struct mem_cgroup *folio_memcg_begin(struct folio *folio, bool get_refcnt)
{
struct mem_cgroup *memcg;
if (get_refcnt)
memcg = get_mem_cgroup_from_folio(folio);
else {
rcu_read_lock();
memcg = folio_memcg(folio);
}
return memcg;
}
void folio_memcg_end(struct mem_cgroup *memcg, bool get_refcnt)
{
if (get_refcnt)
mem_cgroup_put(memcg);
else
rcu_read_unlock();
}
>
> But my points was about hiding the rcu details in a set of helpers.
>
> Sorry if what I say is confusing.
>
On Thu, Dec 18, 2025 at 09:16:11PM +0800, Qi Zheng wrote:
>
>
> On 12/18/25 9:04 PM, David Hildenbrand (Red Hat) wrote:
> > On 12/18/25 14:00, Qi Zheng wrote:
> > >
> > >
> > > On 12/18/25 7:56 PM, David Hildenbrand (Red Hat) wrote:
> > > > On 12/18/25 12:40, Qi Zheng wrote:
> > > > >
> > > > >
> > > > > On 12/18/25 5:43 PM, David Hildenbrand (Red Hat) wrote:
> > > > > > On 12/18/25 10:36, Qi Zheng wrote:
> > > > > > >
> > > > > > >
> > > > > > > On 12/18/25 5:09 PM, David Hildenbrand (Red Hat) wrote:
> > > > > > > > On 12/17/25 08:27, Qi Zheng wrote:
> > > > > > > > > From: Muchun Song <songmuchun@bytedance.com>
> > > > > > > > >
> > > > > > > > > In the near future, a folio will no longer pin its corresponding
> > > > > > > > > memory cgroup. To ensure safety, it will only be appropriate to
> > > > > > > > > hold the rcu read lock or acquire a reference to the memory cgroup
> > > > > > > > > returned by folio_memcg(), thereby
> > > > > > > > > preventing it from being released.
> > > > > > > > >
> > > > > > > > > In the current patch, the rcu read lock is employed to safeguard
> > > > > > > > > against the release of the memory cgroup in
> > > > > > > > > folio_migrate_mapping().
> > > > > > > >
> > > > > > > > We usually avoid talking about "patches".
> > > > > > >
> > > > > > > Got it.
> > > > > > >
> > > > > > > >
> > > > > > > > In __folio_migrate_mapping(), the rcu read lock ...
> > > > > > >
> > > > > > > Will do.
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > This serves as a preparatory measure for the reparenting of the
> > > > > > > > > LRU pages.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > > > > > > > > Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> > > > > > > > > Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
> > > > > > > > > ---
> > > > > > > > > mm/migrate.c | 2 ++
> > > > > > > > > 1 file changed, 2 insertions(+)
> > > > > > > > >
> > > > > > > > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > > > > > > > index 5169f9717f606..8bcd588c083ca 100644
> > > > > > > > > --- a/mm/migrate.c
> > > > > > > > > +++ b/mm/migrate.c
> > > > > > > > > @@ -671,6 +671,7 @@ static int __folio_migrate_mapping(struct
> > > > > > > > > address_space *mapping,
> > > > > > > > > struct lruvec *old_lruvec, *new_lruvec;
> > > > > > > > > struct mem_cgroup *memcg;
> > > > > > > > > + rcu_read_lock();
> > > > > > > > > memcg = folio_memcg(folio);
> > > > > > > >
> > > > > > > > In general, LGTM
> > > > > > > >
> > > > > > > > I wonder, though, whether we should embed that in the ABI.
> > > > > > > >
> > > > > > > > Like "lock RCU and get the memcg" in one operation, to the "return
> > > > > > > > memcg
> > > > > > > > and unock rcu" in another operation.
> > > > > > >
> > > > > > > Do you mean adding a helper function like
> > > > > > > get_mem_cgroup_from_folio()?
> > > > > >
> > > > > > Right, something like
> > > > > >
> > > > > > memcg = folio_memcg_begin(folio);
> > > > > > folio_memcg_end(memcg);
> > > > >
> > > > > For some longer or might-sleep critical sections (such as those pointed
> > > > > by Johannes), perhaps it can be defined like this:
> > > > >
> > > > > struct mem_cgroup *folio_memcg_begin(struct folio *folio)
> > > > > {
> > > > > return get_mem_cgroup_from_folio(folio);
> > > > > }
> > > > >
> > > > > void folio_memcg_end(struct mem_cgroup *memcg)
> > > > > {
> > > > > mem_cgroup_put(memcg);
> > > > > }
> > > > >
> > > > > But for some short critical sections, using RCU lock directly might
> > > > > be the most convention option?
> > > > >
> > > >
> > > > Then put the rcu read locking in there instead?
> > >
> > > So for some longer or might-sleep critical sections, using:
> > >
> > > memcg = folio_memcg_begin(folio);
> > > do_some_thing(memcg);
> > > folio_memcg_end(folio);
> > >
> > > for some short critical sections, using:
> > >
> > > rcu_read_lock();
> > > memcg = folio_memcg(folio);
> > > do_some_thing(memcg);
> > > rcu_read_unlock();
> > >
> > > Right?
> >
> > What I mean is:
> >
> > memcg = folio_memcg_begin(folio);
> > do_some_thing(memcg);
> > folio_memcg_end(folio);
> >
> > but do the rcu_read_lock() in folio_memcg_begin() and the
> > rcu_read_unlock() in folio_memcg_end().
> >
> > You could also have (expensive) variants, as you describe, that mess
> > with getting/dopping the memcg.
>
> Or simple use folio_memcg_begin(memcg)/folio_memcg_end(memcg) in all cases.
>
> Or add a parameter to them:
>
> struct mem_cgroup *folio_memcg_begin(struct folio *folio, bool get_refcnt)
> {
> struct mem_cgroup *memcg;
>
> if (get_refcnt)
> memcg = get_mem_cgroup_from_folio(folio);
> else {
> rcu_read_lock();
> memcg = folio_memcg(folio);
> }
>
> return memcg;
> }
>
> void folio_memcg_end(struct mem_cgroup *memcg, bool get_refcnt)
> {
> if (get_refcnt)
> mem_cgroup_put(memcg);
> else
> rcu_read_unlock();
> }
I would like to vote for open coding as we do now, because I think hiding
the RCU lock / refcount acquisition into a less obvious API doesn't make
it more readable.
No strong opinion on introducing new helpers, but at least it should be
obvious that each variant has different restrictions.
> > But my points was about hiding the rcu details in a set of helpers.
> >
> > Sorry if what I say is confusing.
--
Cheers,
Harry / Hyeonggon
On 12/19/25 05:12, Harry Yoo wrote:
> On Thu, Dec 18, 2025 at 09:16:11PM +0800, Qi Zheng wrote:
>>
>>
>> On 12/18/25 9:04 PM, David Hildenbrand (Red Hat) wrote:
>>> On 12/18/25 14:00, Qi Zheng wrote:
>>>>
>>>>
>>>> On 12/18/25 7:56 PM, David Hildenbrand (Red Hat) wrote:
>>>>> On 12/18/25 12:40, Qi Zheng wrote:
>>>>>>
>>>>>>
>>>>>> On 12/18/25 5:43 PM, David Hildenbrand (Red Hat) wrote:
>>>>>>> On 12/18/25 10:36, Qi Zheng wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/18/25 5:09 PM, David Hildenbrand (Red Hat) wrote:
>>>>>>>>> On 12/17/25 08:27, Qi Zheng wrote:
>>>>>>>>>> From: Muchun Song <songmuchun@bytedance.com>
>>>>>>>>>>
>>>>>>>>>> In the near future, a folio will no longer pin its corresponding
>>>>>>>>>> memory cgroup. To ensure safety, it will only be appropriate to
>>>>>>>>>> hold the rcu read lock or acquire a reference to the memory cgroup
>>>>>>>>>> returned by folio_memcg(), thereby
>>>>>>>>>> preventing it from being released.
>>>>>>>>>>
>>>>>>>>>> In the current patch, the rcu read lock is employed to safeguard
>>>>>>>>>> against the release of the memory cgroup in
>>>>>>>>>> folio_migrate_mapping().
>>>>>>>>>
>>>>>>>>> We usually avoid talking about "patches".
>>>>>>>>
>>>>>>>> Got it.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> In __folio_migrate_mapping(), the rcu read lock ...
>>>>>>>>
>>>>>>>> Will do.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This serves as a preparatory measure for the reparenting of the
>>>>>>>>>> LRU pages.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>>>>>>>>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>>>>>>>>>> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
>>>>>>>>>> ---
>>>>>>>>>> mm/migrate.c | 2 ++
>>>>>>>>>> 1 file changed, 2 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>>>>>>>> index 5169f9717f606..8bcd588c083ca 100644
>>>>>>>>>> --- a/mm/migrate.c
>>>>>>>>>> +++ b/mm/migrate.c
>>>>>>>>>> @@ -671,6 +671,7 @@ static int __folio_migrate_mapping(struct
>>>>>>>>>> address_space *mapping,
>>>>>>>>>> struct lruvec *old_lruvec, *new_lruvec;
>>>>>>>>>> struct mem_cgroup *memcg;
>>>>>>>>>> + rcu_read_lock();
>>>>>>>>>> memcg = folio_memcg(folio);
>>>>>>>>>
>>>>>>>>> In general, LGTM
>>>>>>>>>
>>>>>>>>> I wonder, though, whether we should embed that in the ABI.
>>>>>>>>>
>>>>>>>>> Like "lock RCU and get the memcg" in one operation, to the "return
>>>>>>>>> memcg
>>>>>>>>> and unock rcu" in another operation.
>>>>>>>>
>>>>>>>> Do you mean adding a helper function like
>>>>>>>> get_mem_cgroup_from_folio()?
>>>>>>>
>>>>>>> Right, something like
>>>>>>>
>>>>>>> memcg = folio_memcg_begin(folio);
>>>>>>> folio_memcg_end(memcg);
>>>>>>
>>>>>> For some longer or might-sleep critical sections (such as those pointed
>>>>>> by Johannes), perhaps it can be defined like this:
>>>>>>
>>>>>> struct mem_cgroup *folio_memcg_begin(struct folio *folio)
>>>>>> {
>>>>>> return get_mem_cgroup_from_folio(folio);
>>>>>> }
>>>>>>
>>>>>> void folio_memcg_end(struct mem_cgroup *memcg)
>>>>>> {
>>>>>> mem_cgroup_put(memcg);
>>>>>> }
>>>>>>
>>>>>> But for some short critical sections, using RCU lock directly might
>>>>>> be the most convention option?
>>>>>>
>>>>>
>>>>> Then put the rcu read locking in there instead?
>>>>
>>>> So for some longer or might-sleep critical sections, using:
>>>>
>>>> memcg = folio_memcg_begin(folio);
>>>> do_some_thing(memcg);
>>>> folio_memcg_end(folio);
>>>>
>>>> for some short critical sections, using:
>>>>
>>>> rcu_read_lock();
>>>> memcg = folio_memcg(folio);
>>>> do_some_thing(memcg);
>>>> rcu_read_unlock();
>>>>
>>>> Right?
>>>
>>> What I mean is:
>>>
>>> memcg = folio_memcg_begin(folio);
>>> do_some_thing(memcg);
>>> folio_memcg_end(folio);
>>>
>>> but do the rcu_read_lock() in folio_memcg_begin() and the
>>> rcu_read_unlock() in folio_memcg_end().
>>>
>>> You could also have (expensive) variants, as you describe, that mess
>>> with getting/dopping the memcg.
>>
>> Or simple use folio_memcg_begin(memcg)/folio_memcg_end(memcg) in all cases.
>>
>> Or add a parameter to them:
>>
>> struct mem_cgroup *folio_memcg_begin(struct folio *folio, bool get_refcnt)
>> {
>> struct mem_cgroup *memcg;
>>
>> if (get_refcnt)
>> memcg = get_mem_cgroup_from_folio(folio);
>> else {
>> rcu_read_lock();
>> memcg = folio_memcg(folio);
>> }
>>
>> return memcg;
>> }
>>
>> void folio_memcg_end(struct mem_cgroup *memcg, bool get_refcnt)
>> {
>> if (get_refcnt)
>> mem_cgroup_put(memcg);
>> else
>> rcu_read_unlock();
>> }
>
> I would like to vote for open coding as we do now, because I think hiding
> the RCU lock / refcount acquisition into a less obvious API doesn't make
> it more readable.
I wouldn't do it in an API as proposed above.
I prefer to not have magical RCU locking in every caller. Easy to get wrong.
See how we did something similar in the pte_*map*() vs. pte_unmap() API,
without requiring all callers to open-code this.
--
Cheers
David
On Wed, Dec 17, 2025 at 03:27:37PM +0800, Qi Zheng wrote: > From: Muchun Song <songmuchun@bytedance.com> > > In the near future, a folio will no longer pin its corresponding > memory cgroup. To ensure safety, it will only be appropriate to > hold the rcu read lock or acquire a reference to the memory cgroup > returned by folio_memcg(), thereby preventing it from being released. > > In the current patch, the rcu read lock is employed to safeguard > against the release of the memory cgroup in folio_migrate_mapping(). > > This serves as a preparatory measure for the reparenting of the > LRU pages. > > Signed-off-by: Muchun Song <songmuchun@bytedance.com> > Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> > Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
© 2016 - 2025 Red Hat, Inc.