mm/cma: add an API to enable/disable concurrent memory allocation for the CMA

[PATCH] mm/cma: add an API to enable/disable concurrent memory allocation for the CMA

Posted by yangge1116@126.com 1 year ago

From: yangge <yangge1116@126.com>

Commit 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
simply reverts to the original method of using the cma_mutex to ensure
that alloc_contig_range() runs sequentially. This change was made to avoid
concurrency allocation failures. However, it can negatively impact
performance when concurrent allocation of CMA memory is required.

To address this issue, we could introduce an API for concurrency settings,
allowing users to decide whether their CMA can perform concurrent memory
allocations or not.

Fixes: 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
Signed-off-by: yangge <yangge1116@126.com>
Cc: <stable@vger.kernel.org>
---
 include/linux/cma.h |  2 ++
 mm/cma.c            | 22 ++++++++++++++++++++--
 mm/cma.h            |  1 +
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/include/linux/cma.h b/include/linux/cma.h
index d15b64f..2384624 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -53,6 +53,8 @@ extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
 
 extern void cma_reserve_pages_on_error(struct cma *cma);
 
+extern bool cma_set_concurrency(struct cma *cma, bool concurrency);
+
 #ifdef CONFIG_CMA
 struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
 bool cma_free_folio(struct cma *cma, const struct folio *folio);
diff --git a/mm/cma.c b/mm/cma.c
index de5bc0c..49a7186 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -460,9 +460,17 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
 		spin_unlock_irq(&cma->lock);
 
 		pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
-		mutex_lock(&cma_mutex);
+
+		/*
+		 * If the user sets the concurr_alloc of CMA to true, concurrent
+		 * memory allocation is allowed. If the user sets it to false or
+		 * does not set it, concurrent memory allocation is not allowed.
+		 */
+		if (!cma->concurr_alloc)
+			mutex_lock(&cma_mutex);
 		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, gfp);
-		mutex_unlock(&cma_mutex);
+		if (!cma->concurr_alloc)
+			mutex_unlock(&cma_mutex);
 		if (ret == 0) {
 			page = pfn_to_page(pfn);
 			break;
@@ -610,3 +618,13 @@ int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
 
 	return 0;
 }
+
+bool cma_set_concurrency(struct cma *cma, bool concurrency)
+{
+	if (!cma)
+		return false;
+
+	cma->concurr_alloc = concurrency;
+
+	return true;
+}
diff --git a/mm/cma.h b/mm/cma.h
index 8485ef8..30f489d 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -16,6 +16,7 @@ struct cma {
 	unsigned long   *bitmap;
 	unsigned int order_per_bit; /* Order of pages represented by one bit */
 	spinlock_t	lock;
+	bool concurr_alloc;
 #ifdef CONFIG_CMA_DEBUGFS
 	struct hlist_head mem_head;
 	spinlock_t mem_head_lock;
-- 
2.7.4

Re: [PATCH] mm/cma: add an API to enable/disable concurrent memory allocation for the CMA

Posted by Barry Song 1 year ago

On Sat, Jan 25, 2025 at 12:21 AM <yangge1116@126.com> wrote:
>
> From: yangge <yangge1116@126.com>
>
> Commit 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
> simply reverts to the original method of using the cma_mutex to ensure
> that alloc_contig_range() runs sequentially. This change was made to avoid
> concurrency allocation failures. However, it can negatively impact
> performance when concurrent allocation of CMA memory is required.

Do we have some data?

>
> To address this issue, we could introduce an API for concurrency settings,
> allowing users to decide whether their CMA can perform concurrent memory
> allocations or not.

Who is the intended user of cma_set_concurrency? I also feel it is somewhat
unsafe since cma->concurr_alloc is not protected by any locks.

Will a user setting cma->concurr_alloc = 1 encounter the original issue that
commit 60a60e32cf91 was attempting to fix?

>
> Fixes: 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
> Signed-off-by: yangge <yangge1116@126.com>
> Cc: <stable@vger.kernel.org>
> ---
>  include/linux/cma.h |  2 ++
>  mm/cma.c            | 22 ++++++++++++++++++++--
>  mm/cma.h            |  1 +
>  3 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/cma.h b/include/linux/cma.h
> index d15b64f..2384624 100644
> --- a/include/linux/cma.h
> +++ b/include/linux/cma.h
> @@ -53,6 +53,8 @@ extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
>
>  extern void cma_reserve_pages_on_error(struct cma *cma);
>
> +extern bool cma_set_concurrency(struct cma *cma, bool concurrency);
> +
>  #ifdef CONFIG_CMA
>  struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
>  bool cma_free_folio(struct cma *cma, const struct folio *folio);
> diff --git a/mm/cma.c b/mm/cma.c
> index de5bc0c..49a7186 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -460,9 +460,17 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
>                 spin_unlock_irq(&cma->lock);
>
>                 pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
> -               mutex_lock(&cma_mutex);
> +
> +               /*
> +                * If the user sets the concurr_alloc of CMA to true, concurrent
> +                * memory allocation is allowed. If the user sets it to false or
> +                * does not set it, concurrent memory allocation is not allowed.
> +                */
> +               if (!cma->concurr_alloc)
> +                       mutex_lock(&cma_mutex);
>                 ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, gfp);
> -               mutex_unlock(&cma_mutex);
> +               if (!cma->concurr_alloc)
> +                       mutex_unlock(&cma_mutex);
>                 if (ret == 0) {
>                         page = pfn_to_page(pfn);
>                         break;
> @@ -610,3 +618,13 @@ int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
>
>         return 0;
>  }
> +
> +bool cma_set_concurrency(struct cma *cma, bool concurrency)
> +{
> +       if (!cma)
> +               return false;
> +
> +       cma->concurr_alloc = concurrency;
> +
> +       return true;
> +}
> diff --git a/mm/cma.h b/mm/cma.h
> index 8485ef8..30f489d 100644
> --- a/mm/cma.h
> +++ b/mm/cma.h
> @@ -16,6 +16,7 @@ struct cma {
>         unsigned long   *bitmap;
>         unsigned int order_per_bit; /* Order of pages represented by one bit */
>         spinlock_t      lock;
> +       bool concurr_alloc;
>  #ifdef CONFIG_CMA_DEBUGFS
>         struct hlist_head mem_head;
>         spinlock_t mem_head_lock;
> --
> 2.7.4
>
>

Thanks
Barry

Re: [PATCH] mm/cma: add an API to enable/disable concurrent memory allocation for the CMA

Posted by Ge Yang 1 year ago


在 2025/1/28 17:58, Barry Song 写道:
> On Sat, Jan 25, 2025 at 12:21 AM <yangge1116@126.com> wrote:
>>
>> From: yangge <yangge1116@126.com>
>>
>> Commit 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
>> simply reverts to the original method of using the cma_mutex to ensure
>> that alloc_contig_range() runs sequentially. This change was made to avoid
>> concurrency allocation failures. However, it can negatively impact
>> performance when concurrent allocation of CMA memory is required.
> 
> Do we have some data?
Yes, I will add it in the next version, thanks.
> 
>>
>> To address this issue, we could introduce an API for concurrency settings,
>> allowing users to decide whether their CMA can perform concurrent memory
>> allocations or not.
> 
> Who is the intended user of cma_set_concurrency? 
We have some drivers that use cma_set_concurrency(), but they have not 
yet been merged into the mainline. The cma_alloc_mem() function in the 
mainline also supports concurrent allocation of CMA memory. By applying 
this patch, we can also achieve significant performance improvements in 
certain scenarios. I will provide performance data in the next version.
I also feel it is somewhat
> unsafe since cma->concurr_alloc is not protected by any locks.
Ok, thanks.
> 
> Will a user setting cma->concurr_alloc = 1 encounter the original issue that
> commit 60a60e32cf91 was attempting to fix?
> 
Yes, if a user encounters the issue described in commit 60a60e32cf91, 
they will not be able to set cma->concurr_alloc to 1.
>>
>> Fixes: 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
>> Signed-off-by: yangge <yangge1116@126.com>
>> Cc: <stable@vger.kernel.org>
>> ---
>>   include/linux/cma.h |  2 ++
>>   mm/cma.c            | 22 ++++++++++++++++++++--
>>   mm/cma.h            |  1 +
>>   3 files changed, 23 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/linux/cma.h b/include/linux/cma.h
>> index d15b64f..2384624 100644
>> --- a/include/linux/cma.h
>> +++ b/include/linux/cma.h
>> @@ -53,6 +53,8 @@ extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
>>
>>   extern void cma_reserve_pages_on_error(struct cma *cma);
>>
>> +extern bool cma_set_concurrency(struct cma *cma, bool concurrency);
>> +
>>   #ifdef CONFIG_CMA
>>   struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
>>   bool cma_free_folio(struct cma *cma, const struct folio *folio);
>> diff --git a/mm/cma.c b/mm/cma.c
>> index de5bc0c..49a7186 100644
>> --- a/mm/cma.c
>> +++ b/mm/cma.c
>> @@ -460,9 +460,17 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
>>                  spin_unlock_irq(&cma->lock);
>>
>>                  pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
>> -               mutex_lock(&cma_mutex);
>> +
>> +               /*
>> +                * If the user sets the concurr_alloc of CMA to true, concurrent
>> +                * memory allocation is allowed. If the user sets it to false or
>> +                * does not set it, concurrent memory allocation is not allowed.
>> +                */
>> +               if (!cma->concurr_alloc)
>> +                       mutex_lock(&cma_mutex);
>>                  ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, gfp);
>> -               mutex_unlock(&cma_mutex);
>> +               if (!cma->concurr_alloc)
>> +                       mutex_unlock(&cma_mutex);
>>                  if (ret == 0) {
>>                          page = pfn_to_page(pfn);
>>                          break;
>> @@ -610,3 +618,13 @@ int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
>>
>>          return 0;
>>   }
>> +
>> +bool cma_set_concurrency(struct cma *cma, bool concurrency)
>> +{
>> +       if (!cma)
>> +               return false;
>> +
>> +       cma->concurr_alloc = concurrency;
>> +
>> +       return true;
>> +}
>> diff --git a/mm/cma.h b/mm/cma.h
>> index 8485ef8..30f489d 100644
>> --- a/mm/cma.h
>> +++ b/mm/cma.h
>> @@ -16,6 +16,7 @@ struct cma {
>>          unsigned long   *bitmap;
>>          unsigned int order_per_bit; /* Order of pages represented by one bit */
>>          spinlock_t      lock;
>> +       bool concurr_alloc;
>>   #ifdef CONFIG_CMA_DEBUGFS
>>          struct hlist_head mem_head;
>>          spinlock_t mem_head_lock;
>> --
>> 2.7.4
>>
>>
> 
> Thanks
> Barry

Re: [PATCH] mm/cma: add an API to enable/disable concurrent memory allocation for the CMA

Posted by Barry Song 1 year ago

On Sat, Feb 8, 2025 at 9:50 PM Ge Yang <yangge1116@126.com> wrote:
>
>
>
> 在 2025/1/28 17:58, Barry Song 写道:
> > On Sat, Jan 25, 2025 at 12:21 AM <yangge1116@126.com> wrote:
> >>
> >> From: yangge <yangge1116@126.com>
> >>
> >> Commit 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
> >> simply reverts to the original method of using the cma_mutex to ensure
> >> that alloc_contig_range() runs sequentially. This change was made to avoid
> >> concurrency allocation failures. However, it can negatively impact
> >> performance when concurrent allocation of CMA memory is required.
> >
> > Do we have some data?
> Yes, I will add it in the next version, thanks.
> >
> >>
> >> To address this issue, we could introduce an API for concurrency settings,
> >> allowing users to decide whether their CMA can perform concurrent memory
> >> allocations or not.
> >
> > Who is the intended user of cma_set_concurrency?
> We have some drivers that use cma_set_concurrency(), but they have not
> yet been merged into the mainline. The cma_alloc_mem() function in the
> mainline also supports concurrent allocation of CMA memory. By applying
> this patch, we can also achieve significant performance improvements in
> certain scenarios. I will provide performance data in the next version.
> I also feel it is somewhat
> > unsafe since cma->concurr_alloc is not protected by any locks.
> Ok, thanks.
> >
> > Will a user setting cma->concurr_alloc = 1 encounter the original issue that
> > commit 60a60e32cf91 was attempting to fix?
> >
> Yes, if a user encounters the issue described in commit 60a60e32cf91,
> they will not be able to set cma->concurr_alloc to 1.

A user who hasn't encountered a problem yet doesn't mean they won't
encounter it; it most likely just means the testing time hasn't been long
enough.

Is it possible to implement a per-CMA lock or range lock that simultaneously
improves performance and prevents the original issue that commit
60a60e32cf91 aimed to fix?

I strongly believe that cma->concurr_alloc is not the right approach. Let's
not waste our time on this kind of hack or workaround.  Instead, we should
find a proper fix that remains transparent to users.

> >>
> >> Fixes: 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
> >> Signed-off-by: yangge <yangge1116@126.com>
> >> Cc: <stable@vger.kernel.org>
> >> ---
> >>   include/linux/cma.h |  2 ++
> >>   mm/cma.c            | 22 ++++++++++++++++++++--
> >>   mm/cma.h            |  1 +
> >>   3 files changed, 23 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/include/linux/cma.h b/include/linux/cma.h
> >> index d15b64f..2384624 100644
> >> --- a/include/linux/cma.h
> >> +++ b/include/linux/cma.h
> >> @@ -53,6 +53,8 @@ extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
> >>
> >>   extern void cma_reserve_pages_on_error(struct cma *cma);
> >>
> >> +extern bool cma_set_concurrency(struct cma *cma, bool concurrency);
> >> +
> >>   #ifdef CONFIG_CMA
> >>   struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
> >>   bool cma_free_folio(struct cma *cma, const struct folio *folio);
> >> diff --git a/mm/cma.c b/mm/cma.c
> >> index de5bc0c..49a7186 100644
> >> --- a/mm/cma.c
> >> +++ b/mm/cma.c
> >> @@ -460,9 +460,17 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
> >>                  spin_unlock_irq(&cma->lock);
> >>
> >>                  pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
> >> -               mutex_lock(&cma_mutex);
> >> +
> >> +               /*
> >> +                * If the user sets the concurr_alloc of CMA to true, concurrent
> >> +                * memory allocation is allowed. If the user sets it to false or
> >> +                * does not set it, concurrent memory allocation is not allowed.
> >> +                */
> >> +               if (!cma->concurr_alloc)
> >> +                       mutex_lock(&cma_mutex);
> >>                  ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, gfp);
> >> -               mutex_unlock(&cma_mutex);
> >> +               if (!cma->concurr_alloc)
> >> +                       mutex_unlock(&cma_mutex);
> >>                  if (ret == 0) {
> >>                          page = pfn_to_page(pfn);
> >>                          break;
> >> @@ -610,3 +618,13 @@ int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
> >>
> >>          return 0;
> >>   }
> >> +
> >> +bool cma_set_concurrency(struct cma *cma, bool concurrency)
> >> +{
> >> +       if (!cma)
> >> +               return false;
> >> +
> >> +       cma->concurr_alloc = concurrency;
> >> +
> >> +       return true;
> >> +}
> >> diff --git a/mm/cma.h b/mm/cma.h
> >> index 8485ef8..30f489d 100644
> >> --- a/mm/cma.h
> >> +++ b/mm/cma.h
> >> @@ -16,6 +16,7 @@ struct cma {
> >>          unsigned long   *bitmap;
> >>          unsigned int order_per_bit; /* Order of pages represented by one bit */
> >>          spinlock_t      lock;
> >> +       bool concurr_alloc;
> >>   #ifdef CONFIG_CMA_DEBUGFS
> >>          struct hlist_head mem_head;
> >>          spinlock_t mem_head_lock;
> >> --
> >> 2.7.4
> >>
> >>
> >

Thanks
Barry

Re: [PATCH] mm/cma: add an API to enable/disable concurrent memory allocation for the CMA

Posted by David Hildenbrand 12 months ago

On 08.02.25 22:34, Barry Song wrote:
> On Sat, Feb 8, 2025 at 9:50 PM Ge Yang <yangge1116@126.com> wrote:
>>
>>
>>
>> 在 2025/1/28 17:58, Barry Song 写道:
>>> On Sat, Jan 25, 2025 at 12:21 AM <yangge1116@126.com> wrote:
>>>>
>>>> From: yangge <yangge1116@126.com>
>>>>
>>>> Commit 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
>>>> simply reverts to the original method of using the cma_mutex to ensure
>>>> that alloc_contig_range() runs sequentially. This change was made to avoid
>>>> concurrency allocation failures. However, it can negatively impact
>>>> performance when concurrent allocation of CMA memory is required.
>>>
>>> Do we have some data?
>> Yes, I will add it in the next version, thanks.
>>>
>>>>
>>>> To address this issue, we could introduce an API for concurrency settings,
>>>> allowing users to decide whether their CMA can perform concurrent memory
>>>> allocations or not.
>>>
>>> Who is the intended user of cma_set_concurrency?
>> We have some drivers that use cma_set_concurrency(), but they have not
>> yet been merged into the mainline. The cma_alloc_mem() function in the
>> mainline also supports concurrent allocation of CMA memory. By applying
>> this patch, we can also achieve significant performance improvements in
>> certain scenarios. I will provide performance data in the next version.
>> I also feel it is somewhat
>>> unsafe since cma->concurr_alloc is not protected by any locks.
>> Ok, thanks.
>>>
>>> Will a user setting cma->concurr_alloc = 1 encounter the original issue that
>>> commit 60a60e32cf91 was attempting to fix?
>>>
>> Yes, if a user encounters the issue described in commit 60a60e32cf91,
>> they will not be able to set cma->concurr_alloc to 1.
> 
> A user who hasn't encountered a problem yet doesn't mean they won't
> encounter it; it most likely just means the testing time hasn't been long
> enough.
> 
> Is it possible to implement a per-CMA lock or range lock that simultaneously
> improves performance and prevents the original issue that commit
> 60a60e32cf91 aimed to fix?
> 
> I strongly believe that cma->concurr_alloc is not the right approach. Let's
> not waste our time on this kind of hack or workaround.  Instead, we should
> find a proper fix that remains transparent to users.

Fully agreed.

IIUC, the problem is that we find a pageblock is already isolated. It 
might be sufficient to return -EAGAIN in that case from 
alloc_contig_range_noprof()->start_isolate_page_range() and retry in 
CMA. ideally, we'd have a way to wait on some event (e.g., any pageblock 
transitioning from isolated -> !isolated).

-- 
Cheers,

David / dhildenb

Re: [PATCH] mm/cma: add an API to enable/disable concurrent memory allocation for the CMA

Posted by Ge Yang 1 year ago


在 2025/2/9 5:34, Barry Song 写道:
> On Sat, Feb 8, 2025 at 9:50 PM Ge Yang <yangge1116@126.com> wrote:
>>
>>
>>
>> 在 2025/1/28 17:58, Barry Song 写道:
>>> On Sat, Jan 25, 2025 at 12:21 AM <yangge1116@126.com> wrote:
>>>>
>>>> From: yangge <yangge1116@126.com>
>>>>
>>>> Commit 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
>>>> simply reverts to the original method of using the cma_mutex to ensure
>>>> that alloc_contig_range() runs sequentially. This change was made to avoid
>>>> concurrency allocation failures. However, it can negatively impact
>>>> performance when concurrent allocation of CMA memory is required.
>>>
>>> Do we have some data?
>> Yes, I will add it in the next version, thanks.
>>>
>>>>
>>>> To address this issue, we could introduce an API for concurrency settings,
>>>> allowing users to decide whether their CMA can perform concurrent memory
>>>> allocations or not.
>>>
>>> Who is the intended user of cma_set_concurrency?
>> We have some drivers that use cma_set_concurrency(), but they have not
>> yet been merged into the mainline. The cma_alloc_mem() function in the
>> mainline also supports concurrent allocation of CMA memory. By applying
>> this patch, we can also achieve significant performance improvements in
>> certain scenarios. I will provide performance data in the next version.
>> I also feel it is somewhat
>>> unsafe since cma->concurr_alloc is not protected by any locks.
>> Ok, thanks.
>>>
>>> Will a user setting cma->concurr_alloc = 1 encounter the original issue that
>>> commit 60a60e32cf91 was attempting to fix?
>>>
>> Yes, if a user encounters the issue described in commit 60a60e32cf91,
>> they will not be able to set cma->concurr_alloc to 1.
> 
> A user who hasn't encountered a problem yet doesn't mean they won't
> encounter it; it most likely just means the testing time hasn't been long
> enough.
> 
> Is it possible to implement a per-CMA lock or range lock that simultaneously
> improves performance and prevents the original issue that commit
> 60a60e32cf91 aimed to fix?
> 
Using per-CMA locks can improve performance and prevent the original 
issue. I am currently preparing the patch. Thanks.
> I strongly believe that cma->concurr_alloc is not the right approach. Let's
> not waste our time on this kind of hack or workaround.  Instead, we should
> find a proper fix that remains transparent to users.
> 
>>>>
>>>> Fixes: 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
>>>> Signed-off-by: yangge <yangge1116@126.com>
>>>> Cc: <stable@vger.kernel.org>
>>>> ---
>>>>    include/linux/cma.h |  2 ++
>>>>    mm/cma.c            | 22 ++++++++++++++++++++--
>>>>    mm/cma.h            |  1 +
>>>>    3 files changed, 23 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/include/linux/cma.h b/include/linux/cma.h
>>>> index d15b64f..2384624 100644
>>>> --- a/include/linux/cma.h
>>>> +++ b/include/linux/cma.h
>>>> @@ -53,6 +53,8 @@ extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
>>>>
>>>>    extern void cma_reserve_pages_on_error(struct cma *cma);
>>>>
>>>> +extern bool cma_set_concurrency(struct cma *cma, bool concurrency);
>>>> +
>>>>    #ifdef CONFIG_CMA
>>>>    struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
>>>>    bool cma_free_folio(struct cma *cma, const struct folio *folio);
>>>> diff --git a/mm/cma.c b/mm/cma.c
>>>> index de5bc0c..49a7186 100644
>>>> --- a/mm/cma.c
>>>> +++ b/mm/cma.c
>>>> @@ -460,9 +460,17 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
>>>>                   spin_unlock_irq(&cma->lock);
>>>>
>>>>                   pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
>>>> -               mutex_lock(&cma_mutex);
>>>> +
>>>> +               /*
>>>> +                * If the user sets the concurr_alloc of CMA to true, concurrent
>>>> +                * memory allocation is allowed. If the user sets it to false or
>>>> +                * does not set it, concurrent memory allocation is not allowed.
>>>> +                */
>>>> +               if (!cma->concurr_alloc)
>>>> +                       mutex_lock(&cma_mutex);
>>>>                   ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, gfp);
>>>> -               mutex_unlock(&cma_mutex);
>>>> +               if (!cma->concurr_alloc)
>>>> +                       mutex_unlock(&cma_mutex);
>>>>                   if (ret == 0) {
>>>>                           page = pfn_to_page(pfn);
>>>>                           break;
>>>> @@ -610,3 +618,13 @@ int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
>>>>
>>>>           return 0;
>>>>    }
>>>> +
>>>> +bool cma_set_concurrency(struct cma *cma, bool concurrency)
>>>> +{
>>>> +       if (!cma)
>>>> +               return false;
>>>> +
>>>> +       cma->concurr_alloc = concurrency;
>>>> +
>>>> +       return true;
>>>> +}
>>>> diff --git a/mm/cma.h b/mm/cma.h
>>>> index 8485ef8..30f489d 100644
>>>> --- a/mm/cma.h
>>>> +++ b/mm/cma.h
>>>> @@ -16,6 +16,7 @@ struct cma {
>>>>           unsigned long   *bitmap;
>>>>           unsigned int order_per_bit; /* Order of pages represented by one bit */
>>>>           spinlock_t      lock;
>>>> +       bool concurr_alloc;
>>>>    #ifdef CONFIG_CMA_DEBUGFS
>>>>           struct hlist_head mem_head;
>>>>           spinlock_t mem_head_lock;
>>>> --
>>>> 2.7.4
>>>>
>>>>
>>>
> 
> Thanks
> Barry

Re: [PATCH] mm/cma: add an API to enable/disable concurrent memory allocation for the CMA

Posted by Christoph Hellwig 1 year ago

On Fri, Jan 24, 2025 at 07:21:27PM +0800, yangge1116@126.com wrote:
> From: yangge <yangge1116@126.com>
> 
> Commit 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
> simply reverts to the original method of using the cma_mutex to ensure
> that alloc_contig_range() runs sequentially. This change was made to avoid
> concurrency allocation failures. However, it can negatively impact
> performance when concurrent allocation of CMA memory is required.
> 
> To address this issue, we could introduce an API for concurrency settings,
> allowing users to decide whether their CMA can perform concurrent memory
> allocations or not.
> 
> Fixes: 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
> Signed-off-by: yangge <yangge1116@126.com>
> Cc: <stable@vger.kernel.org>

Umm, you're adding new unused functions while not even reporting what
the problem is.  This looks sketchy as hell and surely is not a stable
candidate.

Re: [PATCH] mm/cma: add an API to enable/disable concurrent memory allocation for the CMA

Posted by Andrew Morton 1 year ago

On Fri, 24 Jan 2025 19:21:27 +0800 yangge1116@126.com wrote:

> From: yangge <yangge1116@126.com>
> 
> Commit 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
> simply reverts to the original method of using the cma_mutex to ensure
> that alloc_contig_range() runs sequentially. This change was made to avoid
> concurrency allocation failures. However, it can negatively impact
> performance when concurrent allocation of CMA memory is required.
> 
> To address this issue, we could introduce an API for concurrency settings,
> allowing users to decide whether their CMA can perform concurrent memory
> allocations or not.

The term "users" tends to refer to userspace code.  Here I'm thinking
you mean in-kernel code, so a better term to use is "callers".

This new interface has no callers.  We prefer not to merge unused code!
Please send along the patch which calls cma_set_concurrency() so we
can better understand this proposal and so that the new code is
testable.  In fact the patch has cc:stable, which makes things
stranger.  Why should the -stable maintainers merge a patch which
doesn't do anything?

And please quantify the benefit.  "negatively impact" is too vague. 
How much benefit can we expect our users to see from this?  Some
runtime testing results would be good.

And please describe in more detail why this particular caller doesn't
require concurrency protection.  And help other developers understand
when it is safe for them to use concurr_alloc==false.

Re: [PATCH] mm/cma: add an API to enable/disable concurrent memory allocation for the CMA

Posted by Ge Yang 1 year ago


在 2025/1/28 7:04, Andrew Morton 写道:
> On Fri, 24 Jan 2025 19:21:27 +0800 yangge1116@126.com wrote:
> 
>> From: yangge <yangge1116@126.com>
>>
>> Commit 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
>> simply reverts to the original method of using the cma_mutex to ensure
>> that alloc_contig_range() runs sequentially. This change was made to avoid
>> concurrency allocation failures. However, it can negatively impact
>> performance when concurrent allocation of CMA memory is required.
>>
>> To address this issue, we could introduce an API for concurrency settings,
>> allowing users to decide whether their CMA can perform concurrent memory
>> allocations or not.
> 
> The term "users" tends to refer to userspace code.  Here I'm thinking
> you mean in-kernel code, so a better term to use is "callers".
Ok, thank you. I will change it in the next version.

> 
> This new interface has no callers.  We prefer not to merge unused code!
> Please send along the patch which calls cma_set_concurrency() so we
> can better understand this proposal and so that the new code is
> testable.
Ok, thank you. I will add the caller in the next version.

   In fact the patch has cc:stable, which makes things
> stranger.  Why should the -stable maintainers merge a patch which
> doesn't do anything?
> 
> And please quantify the benefit.  "negatively impact" is too vague.
> How much benefit can we expect our users to see from this?  Some
> runtime testing results would be good.
> 
> And please describe in more detail why this particular caller doesn't
> require concurrency protection.  And help other developers understand
> when it is safe for them to use concurr_alloc==false.
Ok, thank you.