[PATCH v2 16/17] intel_iommu: Introduce a property to control FS1GP cap bit setting

Zhenzhong Duan posted 17 patches 1 month, 2 weeks ago
There is a newer version of this series
[PATCH v2 16/17] intel_iommu: Introduce a property to control FS1GP cap bit setting
Posted by Zhenzhong Duan 1 month, 2 weeks ago
When host IOMMU doesn't support FS1GP but vIOMMU does, host IOMMU
can't translate stage-1 page table from guest correctly.

Add a property x-cap-fs1gp for user to turn FS1GP off so that
nested page table on host side works.

This property has no effect when vIOMMU isn't in scalable modern
mode.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/hw/i386/intel_iommu.h | 1 +
 hw/i386/intel_iommu.c         | 5 ++++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 650641544c..f6d9b41b80 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -308,6 +308,7 @@ struct IntelIOMMUState {
     bool dma_drain;                 /* Whether DMA r/w draining enabled */
     bool dma_translation;           /* Whether DMA translation supported */
     bool pasid;                     /* Whether to support PASID */
+    bool fs1gp;                     /* First Stage 1-GByte Page Support */
 
     /*
      * Protects IOMMU states in general.  Currently it protects the
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 9e973bd710..d7e7354db4 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3778,6 +3778,7 @@ static Property vtd_properties[] = {
     DEFINE_PROP_BOOL("x-pasid-mode", IntelIOMMUState, pasid, false),
     DEFINE_PROP_BOOL("dma-drain", IntelIOMMUState, dma_drain, true),
     DEFINE_PROP_BOOL("dma-translation", IntelIOMMUState, dma_translation, true),
+    DEFINE_PROP_BOOL("x-cap-fs1gp", IntelIOMMUState, fs1gp, true),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -4506,7 +4507,9 @@ static void vtd_cap_init(IntelIOMMUState *s)
     /* TODO: read cap/ecap from host to decide which cap to be exposed. */
     if (s->scalable_modern) {
         s->ecap |= VTD_ECAP_SMTS | VTD_ECAP_FLTS;
-        s->cap |= VTD_CAP_FS1GP;
+        if (s->fs1gp) {
+            s->cap |= VTD_CAP_FS1GP;
+        }
     } else if (s->scalable_mode) {
         s->ecap |= VTD_ECAP_SMTS | VTD_ECAP_SRS | VTD_ECAP_SLTS;
     }
-- 
2.34.1
Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control FS1GP cap bit setting
Posted by Yi Liu 1 month, 1 week ago
On 2024/8/5 14:27, Zhenzhong Duan wrote:
> When host IOMMU doesn't support FS1GP but vIOMMU does, host IOMMU
> can't translate stage-1 page table from guest correctly.

this series is for emulated devices, so the above statement does not
belong to this series. Is there any other reason to have this option?

> Add a property x-cap-fs1gp for user to turn FS1GP off so that
> nested page table on host side works.

I guess you would need to sync the FS1GP cap with host before reporting it
in vIOMMU when comes to support passthrough devices.

> This property has no effect when vIOMMU isn't in scalable modern
> mode.
> 
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>   include/hw/i386/intel_iommu.h | 1 +
>   hw/i386/intel_iommu.c         | 5 ++++-
>   2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> index 650641544c..f6d9b41b80 100644
> --- a/include/hw/i386/intel_iommu.h
> +++ b/include/hw/i386/intel_iommu.h
> @@ -308,6 +308,7 @@ struct IntelIOMMUState {
>       bool dma_drain;                 /* Whether DMA r/w draining enabled */
>       bool dma_translation;           /* Whether DMA translation supported */
>       bool pasid;                     /* Whether to support PASID */
> +    bool fs1gp;                     /* First Stage 1-GByte Page Support */
>   
>       /*
>        * Protects IOMMU states in general.  Currently it protects the
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 9e973bd710..d7e7354db4 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -3778,6 +3778,7 @@ static Property vtd_properties[] = {
>       DEFINE_PROP_BOOL("x-pasid-mode", IntelIOMMUState, pasid, false),
>       DEFINE_PROP_BOOL("dma-drain", IntelIOMMUState, dma_drain, true),
>       DEFINE_PROP_BOOL("dma-translation", IntelIOMMUState, dma_translation, true),
> +    DEFINE_PROP_BOOL("x-cap-fs1gp", IntelIOMMUState, fs1gp, true),
>       DEFINE_PROP_END_OF_LIST(),
>   };
>   
> @@ -4506,7 +4507,9 @@ static void vtd_cap_init(IntelIOMMUState *s)
>       /* TODO: read cap/ecap from host to decide which cap to be exposed. */
>       if (s->scalable_modern) {
>           s->ecap |= VTD_ECAP_SMTS | VTD_ECAP_FLTS;
> -        s->cap |= VTD_CAP_FS1GP;
> +        if (s->fs1gp) {
> +            s->cap |= VTD_CAP_FS1GP;
> +        }
>       } else if (s->scalable_mode) {
>           s->ecap |= VTD_ECAP_SMTS | VTD_ECAP_SRS | VTD_ECAP_SLTS;
>       }

-- 
Regards,
Yi Liu
RE: [PATCH v2 16/17] intel_iommu: Introduce a property to control FS1GP cap bit setting
Posted by Duan, Zhenzhong 1 month ago

>-----Original Message-----
>From: Liu, Yi L <yi.l.liu@intel.com>
>Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control
>FS1GP cap bit setting
>
>On 2024/8/5 14:27, Zhenzhong Duan wrote:
>> When host IOMMU doesn't support FS1GP but vIOMMU does, host
>IOMMU
>> can't translate stage-1 page table from guest correctly.
>
>this series is for emulated devices, so the above statement does not
>belong to this series. Is there any other reason to have this option?

Good catch, will remove this comment.
In fact, this patch is mainly for passthrough device where host IOMMU doesn't support fs1gp.

>
>> Add a property x-cap-fs1gp for user to turn FS1GP off so that
>> nested page table on host side works.
>
>I guess you would need to sync the FS1GP cap with host before reporting it
>in vIOMMU when comes to support passthrough devices.

Yes, we already have this check, see https://github.com/yiliu1765/qemu/commit/b7ac7ce3a2e21eb1b3172743ee6f73e80fe67b3a

Thanks
Zhenzhong
Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control FS1GP cap bit setting
Posted by Yi Liu 1 month ago
On 2024/8/15 11:46, Duan, Zhenzhong wrote:
> 
> 
>> -----Original Message-----
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control
>> FS1GP cap bit setting
>>
>> On 2024/8/5 14:27, Zhenzhong Duan wrote:
>>> When host IOMMU doesn't support FS1GP but vIOMMU does, host
>> IOMMU
>>> can't translate stage-1 page table from guest correctly.
>>
>> this series is for emulated devices, so the above statement does not
>> belong to this series. Is there any other reason to have this option?
> 
> Good catch, will remove this comment.
> In fact, this patch is mainly for passthrough device where host IOMMU doesn't support fs1gp.

I see. To me, as long as the vIOMMU page walk logic supports 1GP large
pages, it's ok to report the FS1GP cap to VM. But it is still fine to
have this property to opt-out FS1GP if admin/orchestration layer(e.g. libvirt)
knows no hw iommu has this capability, so it is better to opt out it
before invoking QEMU.

Is this your motivation for this property?

>>
>>> Add a property x-cap-fs1gp for user to turn FS1GP off so that
>>> nested page table on host side works.
>>
>> I guess you would need to sync the FS1GP cap with host before reporting it
>> in vIOMMU when comes to support passthrough devices.
> 
> Yes, we already have this check, see https://github.com/yiliu1765/qemu/commit/b7ac7ce3a2e21eb1b3172743ee6f73e80fe67b3a

good to know it. :) Will you fail the VM if the device's iommu does not
support FS1GP or just mask out the FS1GP?

-- 
Regards,
Yi Liu
RE: [PATCH v2 16/17] intel_iommu: Introduce a property to control FS1GP cap bit setting
Posted by Duan, Zhenzhong 1 month ago

>-----Original Message-----
>From: Liu, Yi L <yi.l.liu@intel.com>
>Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control
>FS1GP cap bit setting
>
>On 2024/8/15 11:46, Duan, Zhenzhong wrote:
>>
>>
>>> -----Original Message-----
>>> From: Liu, Yi L <yi.l.liu@intel.com>
>>> Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to
>control
>>> FS1GP cap bit setting
>>>
>>> On 2024/8/5 14:27, Zhenzhong Duan wrote:
>>>> When host IOMMU doesn't support FS1GP but vIOMMU does, host
>>> IOMMU
>>>> can't translate stage-1 page table from guest correctly.
>>>
>>> this series is for emulated devices, so the above statement does not
>>> belong to this series. Is there any other reason to have this option?
>>
>> Good catch, will remove this comment.
>> In fact, this patch is mainly for passthrough device where host IOMMU
>doesn't support fs1gp.
>
>I see. To me, as long as the vIOMMU page walk logic supports 1GP large
>pages, it's ok to report the FS1GP cap to VM. But it is still fine to
>have this property to opt-out FS1GP if admin/orchestration layer(e.g. libvirt)
>knows no hw iommu has this capability, so it is better to opt out it
>before invoking QEMU.
>
>Is this your motivation for this property?

Exactly.

>
>>>
>>>> Add a property x-cap-fs1gp for user to turn FS1GP off so that
>>>> nested page table on host side works.
>>>
>>> I guess you would need to sync the FS1GP cap with host before reporting
>it
>>> in vIOMMU when comes to support passthrough devices.
>>
>> Yes, we already have this check, see
>https://github.com/yiliu1765/qemu/commit/b7ac7ce3a2e21eb1b3172743
>ee6f73e80fe67b3a
>
>good to know it. :) Will you fail the VM if the device's iommu does not
>support FS1GP or just mask out the FS1GP?

For cold plugged VFIO device, it will fail the VM with "Stage-1 1GB huge page is unsupported by host IOMMU" error report.
For hotplug VFIO device, only hotplug fails with "Stage-1 1GB huge page is unsupported by host IOMMU".

We don't update vIOMMU cap/ecap from host cap/ecap per Michael's suggestion, only vIOMMU properties can control them.

Thanks
Zhenzhong
Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control FS1GP cap bit setting
Posted by Yi Liu 1 month ago
On 2024/8/19 17:41, Duan, Zhenzhong wrote:
> 
> 
>> -----Original Message-----
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control
>> FS1GP cap bit setting
>>
>> On 2024/8/15 11:46, Duan, Zhenzhong wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Liu, Yi L <yi.l.liu@intel.com>
>>>> Subject: Re: [PATCH v2 16/17] intel_iommu: Introduce a property to
>> control
>>>> FS1GP cap bit setting
>>>>
>>>> On 2024/8/5 14:27, Zhenzhong Duan wrote:
>>>>> When host IOMMU doesn't support FS1GP but vIOMMU does, host
>>>> IOMMU
>>>>> can't translate stage-1 page table from guest correctly.
>>>>
>>>> this series is for emulated devices, so the above statement does not
>>>> belong to this series. Is there any other reason to have this option?
>>>
>>> Good catch, will remove this comment.
>>> In fact, this patch is mainly for passthrough device where host IOMMU
>> doesn't support fs1gp.
>>
>> I see. To me, as long as the vIOMMU page walk logic supports 1GP large
>> pages, it's ok to report the FS1GP cap to VM. But it is still fine to
>> have this property to opt-out FS1GP if admin/orchestration layer(e.g. libvirt)
>> knows no hw iommu has this capability, so it is better to opt out it
>> before invoking QEMU.
>>
>> Is this your motivation for this property?
> 
> Exactly.

ok, then let's keep it in this series after refining the commit message a
bit.

>>
>>>>
>>>>> Add a property x-cap-fs1gp for user to turn FS1GP off so that
>>>>> nested page table on host side works.
>>>>
>>>> I guess you would need to sync the FS1GP cap with host before reporting
>> it
>>>> in vIOMMU when comes to support passthrough devices.
>>>
>>> Yes, we already have this check, see
>> https://github.com/yiliu1765/qemu/commit/b7ac7ce3a2e21eb1b3172743
>> ee6f73e80fe67b3a
>>
>> good to know it. :) Will you fail the VM if the device's iommu does not
>> support FS1GP or just mask out the FS1GP?
> 
> For cold plugged VFIO device, it will fail the VM with "Stage-1 1GB huge page is unsupported by host IOMMU" error report.
> For hotplug VFIO device, only hotplug fails with "Stage-1 1GB huge page is unsupported by host IOMMU".
> 
> We don't update vIOMMU cap/ecap from host cap/ecap per Michael's suggestion, only vIOMMU properties can control them.

I see. yeah, it makes sense.

-- 
Regards,
Yi Liu
Re: [PATCH v2 16/17] intel_iommu: Introduce a property to control FS1GP cap bit setting
Posted by CLEMENT MATHIEU--DRIF 1 month, 2 weeks ago
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>



On 05/08/2024 08:27, Zhenzhong Duan wrote:
> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.
>
>
> When host IOMMU doesn't support FS1GP but vIOMMU does, host IOMMU
> can't translate stage-1 page table from guest correctly.
>
> Add a property x-cap-fs1gp for user to turn FS1GP off so that
> nested page table on host side works.
>
> This property has no effect when vIOMMU isn't in scalable modern
> mode.
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>   include/hw/i386/intel_iommu.h | 1 +
>   hw/i386/intel_iommu.c         | 5 ++++-
>   2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> index 650641544c..f6d9b41b80 100644
> --- a/include/hw/i386/intel_iommu.h
> +++ b/include/hw/i386/intel_iommu.h
> @@ -308,6 +308,7 @@ struct IntelIOMMUState {
>       bool dma_drain;                 /* Whether DMA r/w draining enabled */
>       bool dma_translation;           /* Whether DMA translation supported */
>       bool pasid;                     /* Whether to support PASID */
> +    bool fs1gp;                     /* First Stage 1-GByte Page Support */
>
>       /*
>        * Protects IOMMU states in general.  Currently it protects the
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 9e973bd710..d7e7354db4 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -3778,6 +3778,7 @@ static Property vtd_properties[] = {
>       DEFINE_PROP_BOOL("x-pasid-mode", IntelIOMMUState, pasid, false),
>       DEFINE_PROP_BOOL("dma-drain", IntelIOMMUState, dma_drain, true),
>       DEFINE_PROP_BOOL("dma-translation", IntelIOMMUState, dma_translation, true),
> +    DEFINE_PROP_BOOL("x-cap-fs1gp", IntelIOMMUState, fs1gp, true),
>       DEFINE_PROP_END_OF_LIST(),
>   };
>
> @@ -4506,7 +4507,9 @@ static void vtd_cap_init(IntelIOMMUState *s)
>       /* TODO: read cap/ecap from host to decide which cap to be exposed. */
>       if (s->scalable_modern) {
>           s->ecap |= VTD_ECAP_SMTS | VTD_ECAP_FLTS;
> -        s->cap |= VTD_CAP_FS1GP;
> +        if (s->fs1gp) {
> +            s->cap |= VTD_CAP_FS1GP;
> +        }
>       } else if (s->scalable_mode) {
>           s->ecap |= VTD_ECAP_SMTS | VTD_ECAP_SRS | VTD_ECAP_SLTS;
>       }
> --
> 2.34.1
>