[PATCH] vfio/container: Restrict dma_map_file() to shared RAM

Chenyi Qiang posted 1 patch 1 week, 4 days ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20260519055756.70575-1-chenyi.qiang@intel.com
Maintainers: Alex Williamson <alex@shazbot.org>, "Cédric Le Goater" <clg@redhat.com>
hw/vfio/container.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH] vfio/container: Restrict dma_map_file() to shared RAM
Posted by Chenyi Qiang 1 week, 4 days ago
vfio_container_dma_map() uses dma_map_file() whenever a RAMBlock has an
fd and the VFIO IOMMU backend supports file-based DMA mapping. That is
not correct for private file-backed RAM.

dma_map_file() resolves PFNs from the backing file, but private
mappings can run on different PFNs than the file itself. As a result,
using dma_map_file() on a private RAMBlock can program DMA against pages
that do not back QEMU's actual guest memory.

This was observed with hugetlbfs-backed guest memory and iommufd/VFIO:
share=on works, while share=off can fault because the file-backed PFNs
can diverge from the PFNs backing QEMU's private mapping.

Fix this by using dma_map_file() only for shared RAMBlocks.

Fixes: fb32965b6dd8 ("vfio/iommufd: use IOMMU_IOAS_MAP_FILE")
Reported-by: Farrah Chen <farrah.chen@intel.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220776
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
---
 hw/vfio/container.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 4c2816b574..c5a3c60a27 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -82,7 +82,7 @@ int vfio_container_dma_map(VFIOContainer *bcontainer,
     RAMBlock *rb = mr->ram_block;
     int mfd = rb ? qemu_ram_get_fd(rb) : -1;
 
-    if (mfd >= 0 && vioc->dma_map_file) {
+    if (mfd >= 0 && vioc->dma_map_file && qemu_ram_is_shared(rb)) {
         unsigned long start = vaddr - qemu_ram_get_host_addr(rb);
         unsigned long offset = qemu_ram_get_fd_offset(rb);
 
-- 
2.43.5
Re: [PATCH] vfio/container: Restrict dma_map_file() to shared RAM
Posted by Cédric Le Goater 3 days, 21 hours ago
Hello Chenyi,


On 5/19/26 07:57, Chenyi Qiang wrote:
> vfio_container_dma_map() uses dma_map_file() whenever a RAMBlock has an
> fd and the VFIO IOMMU backend supports file-based DMA mapping. That is
> not correct for private file-backed RAM.
> 
> dma_map_file() resolves PFNs from the backing file, but private
> mappings can run on different PFNs than the file itself. As a result,
> using dma_map_file() on a private RAMBlock can program DMA against pages
> that do not back QEMU's actual guest memory.
> 
> This was observed with hugetlbfs-backed guest memory and iommufd/VFIO:
> share=on works, while share=off can fault because the file-backed PFNs
> can diverge from the PFNs backing QEMU's private mapping.
> 
> Fix this by using dma_map_file() only for shared RAMBlocks.
> 
> Fixes: fb32965b6dd8 ("vfio/iommufd: use IOMMU_IOAS_MAP_FILE")
> Reported-by: Farrah Chen <farrah.chen@intel.com>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220776
> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
> ---
>   hw/vfio/container.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
> index 4c2816b574..c5a3c60a27 100644
> --- a/hw/vfio/container.c
> +++ b/hw/vfio/container.c
> @@ -82,7 +82,7 @@ int vfio_container_dma_map(VFIOContainer *bcontainer,
>       RAMBlock *rb = mr->ram_block;
>       int mfd = rb ? qemu_ram_get_fd(rb) : -1;
>   
> -    if (mfd >= 0 && vioc->dma_map_file) {
> +    if (mfd >= 0 && vioc->dma_map_file && qemu_ram_is_shared(rb)) {

I think we should introduce an helper to check that the conditions
for using the .dma_map_file() handler are met. Something like :

     if (vfio_container_can_dma_map_file(bcontainer, ...)) {
           RAMBlock *rb = mr->ram_block;
     

and

  /*
   * We can use IOMMU DMA mapping (IOMMU_IOAS_MAP_FILE) for :
   *
   * 1) Guest RAM blocks explicitly configured as shared (MAP_SHARED)
   * 2) RAM device sub-regions (MMIO BARs)
   *
   * Private RAM mappings (MAP_PRIVATE) are strictly excluded. Because
   * they are subject to copy-on-write (COW) anomalies, their
   * underlying PFNs can permanently diverge from the backing file
   */
   bool vfio_container_can_dma_map_file(VFIOContainer *bcontainer, ...)
   {
      return ...
   }

How's that ?

Thanks,

C.
Re: [PATCH] vfio/container: Restrict dma_map_file() to shared RAM
Posted by Chenyi Qiang 3 days, 14 hours ago

On 5/27/2026 3:44 AM, Cédric Le Goater wrote:
> Hello Chenyi,
> 
> 
> On 5/19/26 07:57, Chenyi Qiang wrote:
>> vfio_container_dma_map() uses dma_map_file() whenever a RAMBlock has an
>> fd and the VFIO IOMMU backend supports file-based DMA mapping. That is
>> not correct for private file-backed RAM.
>>
>> dma_map_file() resolves PFNs from the backing file, but private
>> mappings can run on different PFNs than the file itself. As a result,
>> using dma_map_file() on a private RAMBlock can program DMA against pages
>> that do not back QEMU's actual guest memory.
>>
>> This was observed with hugetlbfs-backed guest memory and iommufd/VFIO:
>> share=on works, while share=off can fault because the file-backed PFNs
>> can diverge from the PFNs backing QEMU's private mapping.
>>
>> Fix this by using dma_map_file() only for shared RAMBlocks.
>>
>> Fixes: fb32965b6dd8 ("vfio/iommufd: use IOMMU_IOAS_MAP_FILE")
>> Reported-by: Farrah Chen <farrah.chen@intel.com>
>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220776
>> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
>> ---
>>   hw/vfio/container.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>> index 4c2816b574..c5a3c60a27 100644
>> --- a/hw/vfio/container.c
>> +++ b/hw/vfio/container.c
>> @@ -82,7 +82,7 @@ int vfio_container_dma_map(VFIOContainer *bcontainer,
>>       RAMBlock *rb = mr->ram_block;
>>       int mfd = rb ? qemu_ram_get_fd(rb) : -1;
>>   -    if (mfd >= 0 && vioc->dma_map_file) {
>> +    if (mfd >= 0 && vioc->dma_map_file && qemu_ram_is_shared(rb)) {
> 
> I think we should introduce an helper to check that the conditions
> for using the .dma_map_file() handler are met. Something like :
> 
>     if (vfio_container_can_dma_map_file(bcontainer, ...)) {
>           RAMBlock *rb = mr->ram_block;
>    
> and
> 
>  /*
>   * We can use IOMMU DMA mapping (IOMMU_IOAS_MAP_FILE) for :
>   *
>   * 1) Guest RAM blocks explicitly configured as shared (MAP_SHARED)
>   * 2) RAM device sub-regions (MMIO BARs)
>   *
>   * Private RAM mappings (MAP_PRIVATE) are strictly excluded. Because
>   * they are subject to copy-on-write (COW) anomalies, their
>   * underlying PFNs can permanently diverge from the backing file
>   */
>   bool vfio_container_can_dma_map_file(VFIOContainer *bcontainer, ...)
>   {
>      return ...
>   }
> 
> How's that ?
Thanks for the review. I'm fine with introducing the helper.

But I'm not sure if it's cleaner to include duplicated calculations in both the caller and helper.
 
+static bool vfio_container_can_dma_map_file(VFIOContainer *bcontainer, RAMBlock *rb)
+{
+    VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
+
+    return rb && qemu_ram_get_fd(rb) >= 0 && vioc->dma_map_file &&
+           qemu_ram_is_shared(rb);
+}
+

int vfio_container_dma_map(VFIOContainer *bcontainer, ...)
{
     ...
     RAMBlock *rb = mr->ram_block;
     int mfd = rb ? qemu_ram_get_fd(rb) : -1;
 
-    if (mfd >= 0 && vioc->dma_map_file && qemu_ram_is_shared(rb)) {
+    if (vfio_container_can_dma_map_file(bcontainer, rb)) {
         unsigned long start = vaddr - qemu_ram_get_host_addr(rb);
         unsigned long offset = qemu_ram_get_fd_offset(rb);

Perhaps it can be simplified slightly.


Re: [PATCH] vfio/container: Restrict dma_map_file() to shared RAM
Posted by Cédric Le Goater 3 days, 10 hours ago
On 5/27/26 05:14, Chenyi Qiang wrote:
> 
> 
> On 5/27/2026 3:44 AM, Cédric Le Goater wrote:
>> Hello Chenyi,
>>
>>
>> On 5/19/26 07:57, Chenyi Qiang wrote:
>>> vfio_container_dma_map() uses dma_map_file() whenever a RAMBlock has an
>>> fd and the VFIO IOMMU backend supports file-based DMA mapping. That is
>>> not correct for private file-backed RAM.
>>>
>>> dma_map_file() resolves PFNs from the backing file, but private
>>> mappings can run on different PFNs than the file itself. As a result,
>>> using dma_map_file() on a private RAMBlock can program DMA against pages
>>> that do not back QEMU's actual guest memory.
>>>
>>> This was observed with hugetlbfs-backed guest memory and iommufd/VFIO:
>>> share=on works, while share=off can fault because the file-backed PFNs
>>> can diverge from the PFNs backing QEMU's private mapping.
>>>
>>> Fix this by using dma_map_file() only for shared RAMBlocks.
>>>
>>> Fixes: fb32965b6dd8 ("vfio/iommufd: use IOMMU_IOAS_MAP_FILE")
>>> Reported-by: Farrah Chen <farrah.chen@intel.com>
>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220776
>>> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
>>> ---
>>>    hw/vfio/container.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>>> index 4c2816b574..c5a3c60a27 100644
>>> --- a/hw/vfio/container.c
>>> +++ b/hw/vfio/container.c
>>> @@ -82,7 +82,7 @@ int vfio_container_dma_map(VFIOContainer *bcontainer,
>>>        RAMBlock *rb = mr->ram_block;
>>>        int mfd = rb ? qemu_ram_get_fd(rb) : -1;
>>>    -    if (mfd >= 0 && vioc->dma_map_file) {
>>> +    if (mfd >= 0 && vioc->dma_map_file && qemu_ram_is_shared(rb)) {
>>
>> I think we should introduce an helper to check that the conditions
>> for using the .dma_map_file() handler are met. Something like :
>>
>>      if (vfio_container_can_dma_map_file(bcontainer, ...)) {
>>            RAMBlock *rb = mr->ram_block;
>>     
>> and
>>
>>   /*
>>    * We can use IOMMU DMA mapping (IOMMU_IOAS_MAP_FILE) for :
>>    *
>>    * 1) Guest RAM blocks explicitly configured as shared (MAP_SHARED)
>>    * 2) RAM device sub-regions (MMIO BARs)
>>    *
>>    * Private RAM mappings (MAP_PRIVATE) are strictly excluded. Because
>>    * they are subject to copy-on-write (COW) anomalies, their
>>    * underlying PFNs can permanently diverge from the backing file
>>    */
>>    bool vfio_container_can_dma_map_file(VFIOContainer *bcontainer, ...)
>>    {
>>       return ...
>>    }
>>
>> How's that ?
> Thanks for the review. I'm fine with introducing the helper.
> 
> But I'm not sure if it's cleaner to include duplicated calculations in both the caller and helper.
>   
> +static bool vfio_container_can_dma_map_file(VFIOContainer *bcontainer, RAMBlock *rb)
> +{
> +    VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
> +
> +    return rb && qemu_ram_get_fd(rb) >= 0 && vioc->dma_map_file &&
> +           qemu_ram_is_shared(rb);
> +}
> +
> 
> int vfio_container_dma_map(VFIOContainer *bcontainer, ...)
> {
>       ...
>       RAMBlock *rb = mr->ram_block;
>       int mfd = rb ? qemu_ram_get_fd(rb) : -1;
>   
> -    if (mfd >= 0 && vioc->dma_map_file && qemu_ram_is_shared(rb)) {
> +    if (vfio_container_can_dma_map_file(bcontainer, rb)) {
>           unsigned long start = vaddr - qemu_ram_get_host_addr(rb);
>           unsigned long offset = qemu_ram_get_fd_offset(rb);
> 
> Perhaps it can be simplified slightly.
> 

I was thinking moving the check on qemu_ram_get_fd(rb) within the
new routine. Something like :
   

   static bool vfio_container_can_dma_map_file(VFIOContainer *bcontainer,
                                               MemoryRegion *mr, int *fd)
   {
       VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
       RAMBlock *rb = mr->ram_block;
   
       if (!vioc->dma_map_file || !rb) {
           return false;
       }
   
       *fd = qemu_ram_get_fd(rb);
       if (*fd < 0) {
           return false;
       }
   
       /*
        * We can use IOMMU DMA mapping (IOMMU_IOAS_MAP_FILE) for :
        *
        * 1) Guest RAM blocks explicitly configured as shared (MAP_SHARED)
        * 2) RAM device sub-regions (MMIO BARs)
        *
        * Private RAM mappings (MAP_PRIVATE) are strictly excluded. Because
        * they are subject to copy-on-write (COW) anomalies, their underlying
        * PFNs can permanently diverge from the backing file
        */
       return qemu_ram_is_shared(rb) || memory_region_is_ram_device(mr);
   }

or simpler :

   static bool vfio_container_can_dma_map_file(MemoryRegion *mr, int *fd)
   {
       RAMBlock *rb = mr->ram_block;
  
       if (!rb) {
           return false;
       }
  
       *fd = qemu_ram_get_fd(rb);
       if (*fd < 0) {
           return false;
       }
  
       /* ... */	
       return qemu_ram_is_shared(rb) || memory_region_is_ram_device(mr);
   }
  
I think option 2. is best. you choose.

Thanks,

C.


RE: [PATCH] vfio/container: Restrict dma_map_file() to shared RAM
Posted by Duan, Zhenzhong 5 days, 7 hours ago
Hi,

>-----Original Message-----
>From: Qiang, Chenyi <chenyi.qiang@intel.com>
>Subject: [PATCH] vfio/container: Restrict dma_map_file() to shared RAM
>
>vfio_container_dma_map() uses dma_map_file() whenever a RAMBlock has an
>fd and the VFIO IOMMU backend supports file-based DMA mapping. That is
>not correct for private file-backed RAM.
>
>dma_map_file() resolves PFNs from the backing file, but private
>mappings can run on different PFNs than the file itself. As a result,
>using dma_map_file() on a private RAMBlock can program DMA against pages
>that do not back QEMU's actual guest memory.
>
>This was observed with hugetlbfs-backed guest memory and iommufd/VFIO:
>share=on works, while share=off can fault because the file-backed PFNs
>can diverge from the PFNs backing QEMU's private mapping.

Good finding, did you see same issue with common file(not hugetlbfs file) backed private mapping?

Thanks
Zhenzhong

>
>Fix this by using dma_map_file() only for shared RAMBlocks.
>
>Fixes: fb32965b6dd8 ("vfio/iommufd: use IOMMU_IOAS_MAP_FILE")
>Reported-by: Farrah Chen <farrah.chen@intel.com>
>Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220776
>Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
>---
> hw/vfio/container.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>index 4c2816b574..c5a3c60a27 100644
>--- a/hw/vfio/container.c
>+++ b/hw/vfio/container.c
>@@ -82,7 +82,7 @@ int vfio_container_dma_map(VFIOContainer *bcontainer,
>     RAMBlock *rb = mr->ram_block;
>     int mfd = rb ? qemu_ram_get_fd(rb) : -1;
>
>-    if (mfd >= 0 && vioc->dma_map_file) {
>+    if (mfd >= 0 && vioc->dma_map_file && qemu_ram_is_shared(rb)) {
>         unsigned long start = vaddr - qemu_ram_get_host_addr(rb);
>         unsigned long offset = qemu_ram_get_fd_offset(rb);
>
>--
>2.43.5
Re: [PATCH] vfio/container: Restrict dma_map_file() to shared RAM
Posted by Chenyi Qiang 5 days, 6 hours ago

On 5/25/2026 6:31 PM, Duan, Zhenzhong wrote:
> Hi,
> 
>> -----Original Message-----
>> From: Qiang, Chenyi <chenyi.qiang@intel.com>
>> Subject: [PATCH] vfio/container: Restrict dma_map_file() to shared RAM
>>
>> vfio_container_dma_map() uses dma_map_file() whenever a RAMBlock has an
>> fd and the VFIO IOMMU backend supports file-based DMA mapping. That is
>> not correct for private file-backed RAM.
>>
>> dma_map_file() resolves PFNs from the backing file, but private
>> mappings can run on different PFNs than the file itself. As a result,
>> using dma_map_file() on a private RAMBlock can program DMA against pages
>> that do not back QEMU's actual guest memory.
>>
>> This was observed with hugetlbfs-backed guest memory and iommufd/VFIO:
>> share=on works, while share=off can fault because the file-backed PFNs
>> can diverge from the PFNs backing QEMU's private mapping.
> 
> Good finding, did you see same issue with common file(not hugetlbfs file) backed private mapping?

Yes, I also tried with "-object memory-backend-memfd,id=mem1,size=2G,share=off", and it can also be reproduced.
RE: [PATCH] vfio/container: Restrict dma_map_file() to shared RAM
Posted by Duan, Zhenzhong 4 days, 8 hours ago

>-----Original Message-----
>From: Qiang, Chenyi <chenyi.qiang@intel.com>
>Subject: Re: [PATCH] vfio/container: Restrict dma_map_file() to shared RAM
>
>
>
>On 5/25/2026 6:31 PM, Duan, Zhenzhong wrote:
>> Hi,
>>
>>> -----Original Message-----
>>> From: Qiang, Chenyi <chenyi.qiang@intel.com>
>>> Subject: [PATCH] vfio/container: Restrict dma_map_file() to shared RAM
>>>
>>> vfio_container_dma_map() uses dma_map_file() whenever a RAMBlock has an
>>> fd and the VFIO IOMMU backend supports file-based DMA mapping. That is
>>> not correct for private file-backed RAM.
>>>
>>> dma_map_file() resolves PFNs from the backing file, but private
>>> mappings can run on different PFNs than the file itself. As a result,
>>> using dma_map_file() on a private RAMBlock can program DMA against pages
>>> that do not back QEMU's actual guest memory.
>>>
>>> This was observed with hugetlbfs-backed guest memory and iommufd/VFIO:
>>> share=on works, while share=off can fault because the file-backed PFNs
>>> can diverge from the PFNs backing QEMU's private mapping.
>>
>> Good finding, did you see same issue with common file(not hugetlbfs file) backed
>private mapping?
>
>Yes, I also tried with "-object memory-backend-
>memfd,id=mem1,size=2G,share=off", and it can also be reproduced.

Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>