When vmap_pfn() is invoked, it calls vmap_pfn_apply() to set each
pfn into a pte. That function checks whether the pfn is valid and,
if so, warns and returns.

This check is inverted: we actually need to set a valid pfn into the
pte, not an invalid one.

This patch fixes it.
Signed-off-by: Huan Yang <link@vivo.com>
Reported-by: Bingbu Cao <bingbu.cao@linux.intel.com>
Closes: https://lore.kernel.org/dri-devel/eb7e0137-3508-4287-98c4-816c5fd98e10@vivo.com/T/#mbda4f64a3532b32e061f4e8763bc8e307bea3ca8
Fixes: b3f78e749865 ("mm: vmalloc must set pte via arch code")
---
mm/vmalloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 044af7088359..ebadb8ceeba7 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3488,7 +3488,7 @@ static int vmap_pfn_apply(pte_t *pte, unsigned long addr, void *private)
unsigned long pfn = data->pfns[data->idx];
pte_t ptent;
- if (WARN_ON_ONCE(pfn_valid(pfn)))
+ if (WARN_ON_ONCE(!pfn_valid(pfn)))
return -EINVAL;
ptent = pte_mkspecial(pfn_pte(pfn, data->prot));
--
2.39.0
On Wed, Mar 12, 2025 at 02:15:12PM +0800, Huan Yang wrote:
> When vmap_pfn() is invoked, it calls vmap_pfn_apply() to set each
> pfn into a pte. That function checks whether the pfn is valid and,
> if so, warns and returns.
>
> This check is inverted: we actually need to set a valid pfn into the
> pte, not an invalid one.

As just discussed this is wrong.  vmap_pfn is for mapping non-page
PFNs and the check is what enforces that.  What is the point of
having that detailed discussion if you just send the broken patch
anyway with a commit log not even acknowledging the facts?
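For context, vmap_pfn()'s signature is void *vmap_pfn(unsigned long
*pfns, unsigned int count, pgprot_t prot), and a minimal sketch of the
intended use described above, mapping PFNs with no struct page such as
a PCIe BAR, might look like this (bar_phys and the error handling are
illustrative assumptions, not taken from any real driver):

	/*
	 * Sketch: map four pages of a PCIe BAR, i.e. memory with no
	 * struct page backing, into the vmalloc area.  bar_phys is an
	 * assumed, device-specific physical address.
	 */
	unsigned long pfns[4];
	void *vaddr;
	int i;

	for (i = 0; i < 4; i++)
		pfns[i] = PHYS_PFN(bar_phys) + i;	/* !pfn_valid() pfns */

	vaddr = vmap_pfn(pfns, 4, pgprot_noncached(PAGE_KERNEL));
	if (!vaddr)
		return -ENOMEM;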
Hi Christoph,

Thanks for your reply, and I'm sorry for my late reply. Your response
didn't appear in my email client; I only saw it on the website. :(

> On Wed, Mar 12, 2025 at 02:15:12PM +0800, Huan Yang wrote:
>> When vmap_pfn() is invoked, it calls vmap_pfn_apply() to set each
>> pfn into a pte. That function checks whether the pfn is valid and,
>> if so, warns and returns.
>>
>> This check is inverted: we actually need to set a valid pfn into the
>> pte, not an invalid one.
>
> As just discussed this is wrong.  vmap_pfn is for mapping non-page

Thank you for your explanation. I now understand that the behaviour of
vmap_pfn() is indeed intentional; it is designed to do this.

> PFNs and the check is what enforces that.  What is the point of
> having that detailed discussion if you just send the broken patch
> anyway with a commit log not even acknowledging the facts?

Sorry about that.

We now have a new use case: in udmabuf, memory is passed via memfd
and can be either shmem or hugetlb. When the memory is hugetlb and
HVO is enabled, the tail pages' struct pages are no longer reliable
because they have been freed, so we can't use vmap().

Therefore, when making the modifications, I recorded each pfn as the
folio's base pfn plus an offset and called vmap_pfn(). These pfns are
valid, so they are rejected by vmap_pfn().

Can we just remove the pfn_valid() check in vmap_pfn() so that it
suits both cases? If you agree, I want to send a new patch.

Thank you,
Huan Yang
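To make the scheme above concrete, a rough sketch of the described
approach; the ubuf->folios[], ubuf->offsets[] and ubuf->pagecount
names are hypothetical stand-ins for udmabuf's bookkeeping, not its
actual fields:

	/*
	 * Sketch: record each pfn as the folio's base pfn plus the
	 * page offset within the folio, then hand the array to
	 * vmap_pfn().  Allocation failure handling is elided.
	 */
	unsigned long *pfns;
	void *vaddr;
	pgoff_t i;

	pfns = kvmalloc_array(ubuf->pagecount, sizeof(*pfns), GFP_KERNEL);
	for (i = 0; i < ubuf->pagecount; i++)
		pfns[i] = folio_pfn(ubuf->folios[i]) +
			  (ubuf->offsets[i] >> PAGE_SHIFT);

	/* This call trips WARN_ON_ONCE(pfn_valid(pfn)) today. */
	vaddr = vmap_pfn(pfns, ubuf->pagecount, PAGE_KERNEL);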
On 3/17/25 10:12 AM, Huan Yang wrote:
> Hi Christoph,
>
> [...]
>
> We now have a new use case: in udmabuf, memory is passed via memfd
> and can be either shmem or hugetlb. When the memory is hugetlb and
> HVO is enabled, the tail pages' struct pages are no longer reliable
> because they have been freed, so we can't use vmap().
>
> Therefore, when making the modifications, I recorded each pfn as the
> folio's base pfn plus an offset and called vmap_pfn(). These pfns are
> valid, so they are rejected by vmap_pfn().
>
> Can we just remove the pfn_valid() check in vmap_pfn() so that it
> suits both cases? If you agree, I want to send a new patch.

Huan,

Why not update udmabuf to make it work with both vmap_pfn() and
vmap(), as only udmabuf knows what it is actually working on?

I don't think it's a good idea to hack the common API, the WARN_ON()
is really a mandatory check, and the current case is a good example.

> Thank you,
> Huan Yang

--
Best regards,
Bingbu Cao
Hi Bingbu,

On 2025/3/17 13:29, Bingbu Cao wrote:
> On 3/17/25 10:12 AM, Huan Yang wrote:
>> [...]
>>
>> Can we just remove the pfn_valid() check in vmap_pfn() so that it
>> suits both cases? If you agree, I want to send a new patch.
>
> Huan,
>
> Why not update udmabuf to make it work with both vmap_pfn() and
> vmap(), as only udmabuf knows what it is actually working on?

You mean udmabuf should invoke vmap() when it is a normal page-backed
folio range, and vmap_pfn() when it is HVO-backed?

udmabuf can contain a range within a folio plus an offset. What if
that range covers the folio's head (with a page struct) and the
remaining tail (without page structs, freed by HVO)? I don't think
there is a suitable way to map that into the vmalloc area. :)

Or else, just block hugetlb folios from being mapped into the vmalloc
area? I don't think that's a good way either.

> I don't think it's a good idea to hack the common API, the WARN_ON()

You say "common", but I think vmalloc could provide a more general API
that does not care whether the page struct exists: given a pfn, just
map it. :)

Or else, document that vmap_pfn() simply does not welcome page-backed
pfn mappings? (Just IMO.)

Thanks,
Huan Yang

> is really a mandatory check, and the current case is a good example.
>
>> Thank you,
>> Huan Yang
On Mon, Mar 17, 2025 at 01:29:05PM +0800, Bingbu Cao wrote:
> Why not update udmabuf to make it work with both vmap_pfn() and
> vmap(), as only udmabuf knows what it is actually working on?
>
> I don't think it's a good idea to hack the common API, the WARN_ON()
> is really a mandatory check, and the current case is a good example.

What non-page backed memory does udmabuf try to work on, and more
importantly how does it actually work on them given that the normal
DMA APIs require page backed memory?  Or is this just made up and it
doesn't work at all, given that it also tries to dma map to the fake
miscdevice struct device, which can't work for most cases?

Mapping non-page memory is difficult, and without having a coherent
theory of what non-page memory you are mapping and being very careful
you are extremely unlikely to get it right.
Hi Christoph,

> On Mon, Mar 17, 2025 at 01:29:05PM +0800, Bingbu Cao wrote:
>> Why not update udmabuf to make it work with both vmap_pfn() and
>> vmap(), as only udmabuf knows what it is actually working on?
>>
>> I don't think it's a good idea to hack the common API, the WARN_ON()
>> is really a mandatory check, and the current case is a good example.
>
> What non-page backed memory does udmabuf try to work on, and more

It's hugetlb with HVO (HUGETLB_PAGE_OPTIMIZE_VMEMMAP) enabled: every
tail page's struct page refers back to the head page's, and the tail
page structs are then released.

> importantly how does it actually work on them given that the normal
> DMA APIs require page backed memory?  Or is this just made up

udmabuf's sg_table references only folio + offset, without referencing
any individual page structs, so all of the DMA API usage is folio
based. It pins each folio given by memfd + offset (the memfd can be
shmem or hugetlb). A shmem memfd can provide page structs; a hugetlb
one may not.

> and it doesn't work at all, given that it also tries to dma map to
> the fake miscdevice struct device, which can't work for most cases?

It implements map_dma_buf, mmap, vmap and begin/end_cpu_access. It's a
simple implementation.

> Mapping non-page memory is difficult, and without having a coherent
> theory of what non-page memory you are mapping and being very careful
> you are extremely unlikely to get it right.
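A sketch of the folio + offset sg_table filling described above,
assuming one PAGE_SIZE entry per pinned chunk; only the folio head's
struct page is referenced, which is what keeps HVO-freed tail page
structs out of the picture. The sgt and ubuf names are assumptions:

	/*
	 * Sketch: build sg entries from folio + offset only.  Passing
	 * folio_page(folio, 0) means no tail struct page is touched.
	 */
	struct scatterlist *sgl = sgt->sgl;
	pgoff_t i;

	for (i = 0; i < ubuf->pagecount; i++) {
		sg_set_page(sgl, folio_page(ubuf->folios[i], 0),
			    PAGE_SIZE, ubuf->offsets[i]);
		sgl = sg_next(sgl);
	}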
So you want a folio based version of vmap. Please work on that instead
of doing crazy hacks using vmap_pfn, which is just for memory that is
not backed by pages (historically) or folios (now).
Hi Christoph,

On 2025/3/18 14:48, Christoph Hellwig wrote:
> So you want a folio based version of vmap. Please work on that instead

The folio-based vmap implementation you mention may indeed be
necessary, but it has limited direct relevance to the specific issue
I'm currently addressing.

> of doing crazy hacks using vmap_pfn, which is just for memory that is

As I mentioned, if we use HVO-based hugetlb, we cannot access the page
structures corresponding to the physical memory that needs to be
mapped. This prevents us from properly invoking vmap(), which is why
we turned to vmap_pfn() instead.

Even if a folio-based vmap were implemented, it still could not handle
mapping multiple folio ranges of physical memory into vmalloc regions.
A range within a folio is important; it may be an offset in the memfd,
so the entire folio is not needed.

So I still consider vmap_pfn() the optimal solution for this specific
scenario. :)

> not backed by pages (historically) or folios (now).

With HVO it is also not backed by pages: only the folio head remains,
and each tail pfn's page struct goes away. Did I misunderstand this
part? Please correct me.

Thank you,
Huan Yang
On Tue, Mar 18, 2025 at 04:20:17PM +0800, Huan Yang wrote:
> This prevents us from properly invoking vmap(), which is why we
> turned to vmap_pfn() instead.
>
> Even if a folio-based vmap were implemented, it still could not
> handle mapping multiple folio ranges of physical memory into vmalloc
> regions. A range within a folio is important; it may be an offset in
> the memfd, so the entire folio is not needed.
>
> So I still consider vmap_pfn() the optimal solution for this specific
> scenario. :)

No, vmap_pfn is entirely for memory not backed by pages or folios,
i.e. PCIe BARs and similar memory.  This must not be mixed with proper
folio backed memory.

So you'll need a vmap for folios to support this use case.

>> not backed by pages (historically) or folios (now).
>
> With HVO it is also not backed by pages: only the folio head remains,
> and each tail pfn's page struct goes away.

And a fully folio-based vmap solves that problem.
On Tue, Mar 18, 2025 at 09:33:30AM +0100, Christoph Hellwig wrote:
> On Tue, Mar 18, 2025 at 04:20:17PM +0800, Huan Yang wrote:
>> [...]
>>
>> So I still consider vmap_pfn() the optimal solution for this
>> specific scenario. :)
>
> No, vmap_pfn is entirely for memory not backed by pages or folios,
> i.e. PCIe BARs and similar memory.  This must not be mixed with
> proper folio backed memory.
>
> So you'll need a vmap for folios to support this use case.

https://lore.kernel.org/linux-mm/20250131001806.92349-1-vishal.moola@gmail.com/
seems like a good match for "we need to vmap a range from a file".
Vishal has some updates since this version, and I'm sure he can send
them out after LSFMM.
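For concreteness, a purely hypothetical sketch of what such a
folio-range interface could look like; neither the name nor the
signature exists in mainline, and the series linked above may take a
different shape:

	/*
	 * Hypothetical, not an existing kernel API: map nr ranges,
	 * where range i covers lens[i] bytes starting offs[i] bytes
	 * into folios[i].  Only head struct pages would be needed,
	 * so HVO tail pages would never be dereferenced.
	 */
	void *vmap_folio_ranges(struct folio **folios, unsigned long *offs,
				size_t *lens, unsigned int nr,
				pgprot_t prot);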
Hi Christoph,

On 2025/3/18 16:33, Christoph Hellwig wrote:
> On Tue, Mar 18, 2025 at 04:20:17PM +0800, Huan Yang wrote:
>> [...]
>>
>> So I still consider vmap_pfn() the optimal solution for this
>> specific scenario. :)
>
> No, vmap_pfn is entirely for memory not backed by pages or folios,
> i.e. PCIe BARs and similar memory.  This must not be mixed with
> proper folio backed memory.

OK, now I understand it better.

> So you'll need a vmap for folios to support this use case.

Maybe it can't, as explained below.

>>> not backed by pages (historically) or folios (now).
>> With HVO it is also not backed by pages: only the folio head
>> remains, and each tail pfn's page struct goes away.
> And a fully folio-based vmap solves that problem.

A folio may be 2MB, or even as large as 1GB. What if we only need a
little, say 1MB or 512MB? Can a folio-based vmap solve that?

Normally we could offer a 4K-page-based array to map it, but with HVO
we can't. That's why I want to base it on pfns.

Thank you,
Huan Yang
On Tue, Mar 18, 2025 at 04:39:40PM +0800, Huan Yang wrote:
> A folio may be 2MB, or even as large as 1GB. What if we only need a
> little, say 1MB or 512MB? Can a folio-based vmap solve that?

Then you only map part of it by passing a length argument.  Note that
in general when you have these large folios you also don't have
highmem, so if you only map one of them, or part of one of them, you
don't actually need vmap at all and can just use folio_address().

> Normally we could offer a 4K-page-based array to map it, but with HVO
> we can't. That's why I want to base it on pfns.

Well, for any large folio using this 4k based page interface is
actually highly inefficient.  So let's fix that.  And maybe loop in
willy as Mr. Folio while you're at it.
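A minimal sketch of the no-vmap shortcut described above, assuming the
caller already holds one folio plus a byte offset into it:

	/*
	 * Sketch: without highmem, (part of) a single folio is already
	 * reachable through the kernel direct map, so no vmap is
	 * needed.  Only the head struct page is used, which is safe
	 * under HVO.
	 */
	void *vaddr = NULL;

	if (!folio_test_highmem(folio))
		vaddr = folio_address(folio) + offset;	/* offset < folio_size() */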
On 2025/3/18 16:44, Christoph Hellwig wrote:
> On Tue, Mar 18, 2025 at 04:39:40PM +0800, Huan Yang wrote:
>> A folio may be 2MB, or even as large as 1GB. What if we only need a
>> little, say 1MB or 512MB? Can a folio-based vmap solve that?
>
> Then you only map part of it by passing a length argument.  Note that
> in general when you have these large folios you also don't have
> highmem, so if you only map one of them, or part of one of them, you
> don't actually need vmap at all and can just use folio_address().
>
>> Normally we could offer a 4K-page-based array to map it, but with
>> HVO we can't. That's why I want to base it on pfns.
>
> Well, for any large folio using this 4k based page interface is
> actually highly inefficient.  So let's fix that.  And maybe loop in
> willy as Mr. Folio while you're at it.

The minimum map unit is the page size, not the variable-size folio.

For many cases, vmap (to combine many partial folios) is useful,
instead of splitting all folios into order-0 folios in advance, but I
agree a page array may be inefficient.

So I don't think another folio-based vmap() version is better overall
anyway.

Thanks,
Gao Xiang
On Wed, Mar 19, 2025 at 05:09:06PM +0800, Gao Xiang wrote:
> The minimum map unit is the page size, not the variable-size folio.
>
> For many cases, vmap (to combine many partial folios) is useful,
> instead of splitting all folios into order-0 folios in advance, but I
> agree a page array may be inefficient.
>
> So I don't think another folio-based vmap() version is better overall
> anyway.

Then just reject the mappings for now.  vmap/vm_map_ram isn't really
intended for mapping totally arbitrary scattered memory anyway.

As mentioned before udmabuf also has a grave bug in the dma mapping
part, so just marking it broken would be another very sensible option.
Hi Christoph,

Sorry for the late reply.

On 2025/3/20 13:31, Christoph Hellwig wrote:
> As mentioned before udmabuf also has a grave bug in the dma mapping
> part, so just marking it broken would be another very sensible
> option.

Could you please describe this in more detail? I'm not the maintainer,
but I see that udmabuf's initial patch describes it as:

  A driver to let userspace turn memfd regions into dma-bufs.

  Use case: allows qemu to create dmabufs for the vga framebuffer or
  virtio-gpu resources. Then they can be passed around to display
  those guest things on the host: to a spice client for classic full
  framebuffer display, and hopefully some day to a wayland server for
  seamless guest window display.

  https://patchwork.freedesktop.org/patch/246100/

Was it also meant to do this? I'd like to understand your thinking.

Thank you,
Huan Yang
Hi Christoph,

On 2025/3/18 16:44, Christoph Hellwig wrote:
> On Tue, Mar 18, 2025 at 04:39:40PM +0800, Huan Yang wrote:
>> A folio may be 2MB, or even as large as 1GB. What if we only need a
>> little, say 1MB or 512MB? Can a folio-based vmap solve that?
>
> Then you only map part of it by passing a length argument.  Note that
> in general when you have these large folios you also don't have
> highmem, so if you only map one of them, or part of one of them, you
> don't actually need vmap at all and can just use folio_address().

The situation remains complex. We are not referring only to a single
folio; the buffer may be composed of multiple folios and start with a
small piece of the first one. For example:

  Folio 1: 2M, use 1M
  Folio 2: 2M, use 2M
  ...

These folios are combined into the memory used by udmabuf, so a direct
virtual address is useless here. :)

>> Normally we could offer a 4K-page-based array to map it, but with
>> HVO we can't. That's why I want to base it on pfns.
>
> Well, for any large folio using this 4k based page interface is
> actually highly inefficient.  So let's fix that.  And maybe loop in
> willy as Mr. Folio while you're at it.

Yes, indeed. You mention it would be good to map an entire folio array
into vmalloc; I guess many users would like that.

I'd like to research it, but not too fast. :)

Thank you,
Huan Yang
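A sketch of the multi-folio, partial-range case in the example above:
the pfn array is built piecewise from each folio's base pfn. All of
the folios, offs, lens, nr_folios and pfns names are illustrative
assumptions:

	/*
	 * Sketch: folio i contributes lens[i] bytes starting offs[i]
	 * bytes in; both are assumed PAGE_SIZE aligned.
	 */
	unsigned int idx = 0, i, j;
	void *vaddr;

	for (i = 0; i < nr_folios; i++) {
		unsigned long first = folio_pfn(folios[i]) +
				      (offs[i] >> PAGE_SHIFT);
		unsigned long npages = lens[i] >> PAGE_SHIFT;

		for (j = 0; j < npages; j++)
			pfns[idx++] = first + j;
	}

	/* Rejected today: every pfn here is pfn_valid(). */
	vaddr = vmap_pfn(pfns, idx, PAGE_KERNEL);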