[PATCH v2] x86/PVH: modify permission checking in hwdom_fixup_p2m()

Jan Beulich posted 1 patch 3 months, 2 weeks ago
Posted by Jan Beulich 3 months, 2 weeks ago
We're generally striving to minimize behavioral differences between PV
and PVH Dom0. Using is_memory_hole() in the PVH case looks quite a bit
weaker to me, compared to the page ownership check done in the PV case.
Change checking accordingly.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -176,13 +176,27 @@ static int hwdom_fixup_p2m(paddr_t addr)
     ASSERT(is_hardware_domain(currd));
     ASSERT(!altp2m_active(currd));
 
+    if ( !iomem_access_permitted(currd, gfn, gfn) )
+        return -EPERM;
+
     /*
      * Fixups are only applied for MMIO holes, and rely on the hardware domain
      * having identity mappings for non RAM regions (gfn == mfn).
+     *
+     * Much like get_page_from_l1e() for PV Dom0 does, check that the page
+     * accessed is actually an MMIO one: Either its MFN is out of range, or
+     * it's owned by DOM_IO.
      */
-    if ( !iomem_access_permitted(currd, gfn, gfn) ||
-         !is_memory_hole(_mfn(gfn), _mfn(gfn)) )
-        return -EPERM;
+    if ( mfn_valid(_mfn(gfn)) )
+    {
+        struct page_info *pg = mfn_to_page(_mfn(gfn));
+        const struct domain *owner = page_get_owner_and_reference(pg);
+
+        if ( owner )
+            put_page(pg);
+        if ( owner != dom_io )
+            return -EPERM;
+    }
 
     mfn = get_gfn(currd, gfn, &type);
     if ( !mfn_eq(mfn, INVALID_MFN) || !p2m_is_hole(type) )
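
For readers who want the decision logic of the new check in isolation, the
following is a minimal standalone sketch, not Xen code. Everything prefixed
with model_ is hypothetical: a NULL page pointer stands in for an MFN for
which mfn_valid() is false, and the owner field stands in for what
page_get_owner_and_reference() would return. Reference counting (the
put_page() call) is deliberately left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for Xen's struct domain / struct page_info. */
struct model_domain { int id; };
struct model_page { const struct model_domain *owner; };

/* Hypothetical stand-in for dom_io, the owner of MMIO pages. */
static const struct model_domain model_dom_io = { .id = -1 };

/*
 * Returns 0 if a p2m fixup may proceed, -EPERM otherwise.
 *
 * iomem_ok: models the result of iomem_access_permitted().
 * pg:       NULL models an out-of-range MFN (no page_info exists);
 *           otherwise the page's owner decides.
 */
static int model_fixup_permitted(bool iomem_ok, const struct model_page *pg)
{
    if ( !iomem_ok )
        return -EPERM;

    /* A page with a page_info must be owned by DOM_IO to count as MMIO. */
    if ( pg != NULL && pg->owner != &model_dom_io )
        return -EPERM;

    return 0;
}
```

Under this model an unowned page (owner == NULL) is rejected just like a
page owned by an ordinary domain, which matches the "owner != dom_io"
comparison in the hunk above.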
Re: [PATCH v2] x86/PVH: modify permission checking in hwdom_fixup_p2m()
Posted by Roger Pau Monné 3 months, 2 weeks ago
On Mon, Jul 14, 2025 at 06:09:27PM +0200, Jan Beulich wrote:
> We're generally striving to minimize behavioral differences between PV
> and PVH Dom0. Using is_memory_hole() in the PVH case looks quite a bit
> weaker to me, compared to the page ownership check done in the PV case.
> Change checking accordingly.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Kind of unrelated to this specific patch, but what's our opinion on
turning on pf-fixup by default before the release?

Thanks, Roger.

Re: [PATCH v2] x86/PVH: modify permission checking in hwdom_fixup_p2m()
Posted by Jan Beulich 3 months, 2 weeks ago
On 15.07.2025 12:09, Roger Pau Monné wrote:
> On Mon, Jul 14, 2025 at 06:09:27PM +0200, Jan Beulich wrote:
>> We're generally striving to minimize behavioral differences between PV
>> and PVH Dom0. Using is_memory_hole() in the PVH case looks quite a bit
>> weaker to me, compared to the page ownership check done in the PV case.
>> Change checking accordingly.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

> Kind of unrelated to this specific patch, but what's our opinion on
> turning on pf-fixup by default before the release?

As far as the patch here goes, the relationship is very tight. I came to
make this patch only while investigating whether we couldn't have Dom0
report the resource (MMIO) ranges early enough for us to not even need
such fixing-up. Sadly, as per [1] that turned out pretty much impossible.
Which means that while I'm still pretty hesitant of us doing something
like this by default, I can't currently see a way around doing so. Hence
perhaps yes, we may want (or even need) to turn this on by default.

Jan

[1] https://lists.xen.org/archives/html/xen-devel/2025-07/msg00446.html

Re: [PATCH v2] x86/PVH: modify permission checking in hwdom_fixup_p2m()
Posted by Roger Pau Monné 3 months, 2 weeks ago
On Tue, Jul 15, 2025 at 12:47:15PM +0200, Jan Beulich wrote:
> On 15.07.2025 12:09, Roger Pau Monné wrote:
> > On Mon, Jul 14, 2025 at 06:09:27PM +0200, Jan Beulich wrote:
> >> We're generally striving to minimize behavioral differences between PV
> >> and PVH Dom0. Using is_memory_hole() in the PVH case looks quite a bit
> >> weaker to me, compared to the page ownership check done in the PV case.
> >> Change checking accordingly.
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> > 
> > Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Thanks.
> 
> > Kind of unrelated to this specific patch, but what's our opinion on
> > turning on pf-fixup by default before the release?
> 
> As far as the patch here goes, the relationship is very tight. I came to
> make this patch only while investigating whether we couldn't have Dom0
> report the resource (MMIO) ranges early enough for us to not even need
> such fixing-up. Sadly, as per [1] that turned out pretty much impossible.
> Which means that while I'm still pretty hesitant of us doing something
> like this by default, I can't currently see a way around doing so. Hence
> perhaps yes, we may want (or even need) to turn this on by default.

Sorry, I wanted to reply to your previous email about the alternative
approach, but got distracted with something else and forgot about it.

While I won't be opposed to having a way for dom0 to notify extra MMIO
regions it wants added to the p2m, I think this is likely too much
fuss.  For example for FreeBSD I wouldn't consider adding such logic
to the kernel, simply because I think it's likely to be too intrusive,
and would rather rely on pf-fixup.  Overall the number of p2m fixups
that Xen ends up doing is always fairly small (I usually see maybe 4
pages tops), and only as a result of ACPI related accesses.  IMO it's an
acceptable compromise to map those as individual 4K pages.

I would only consider the alternative approach of using a hypercall if
we saw big regions being mapped by pf-fixup, because in that case it
would better be using p2m superpage(s).

I think we want to enable pf-fixup by default at some point, the
question is whether you would consider it appropriate to do now.
Given it's limited to PVH dom0 only, I think we should enable for this
release already.

Thanks, Roger.

Re: [PATCH v2] x86/PVH: modify permission checking in hwdom_fixup_p2m()
Posted by Jan Beulich 3 months, 2 weeks ago
On 15.07.2025 13:04, Roger Pau Monné wrote:
> On Tue, Jul 15, 2025 at 12:47:15PM +0200, Jan Beulich wrote:
>> On 15.07.2025 12:09, Roger Pau Monné wrote:
>>> On Mon, Jul 14, 2025 at 06:09:27PM +0200, Jan Beulich wrote:
>>>> We're generally striving to minimize behavioral differences between PV
>>>> and PVH Dom0. Using is_memory_hole() in the PVH case looks quite a bit
>>>> weaker to me, compared to the page ownership check done in the PV case.
>>>> Change checking accordingly.
>>>>
>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>
>>> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>>
>> Thanks.
>>
>>> Kind of unrelated to this specific patch, but what's our opinion on
>>> turning on pf-fixup by default before the release?
>>
>> As far as the patch here goes, the relationship is very tight. I came to
>> make this patch only while investigating whether we couldn't have Dom0
>> report the resource (MMIO) ranges early enough for us to not even need
>> such fixing-up. Sadly, as per [1] that turned out pretty much impossible.
>> Which means that while I'm still pretty hesitant of us doing something
>> like this by default, I can't currently see a way around doing so. Hence
>> perhaps yes, we may want (or even need) to turn this on by default.
> 
> Sorry, wanted to reply to your previous commit alternative approach
> email, but got distracted with something else and forgot about it.
> 
> While I won't be opposed to having a way for dom0 to notify extra MMIO
> regions it wants added to the p2m, I think this is likely too much
> fuss.  For example for FreeBSD I wouldn't consider adding such logic
> to the kernel, simply because I think it's likely to be too intrusive,
> and would rather rely on pf-fixup.  Overall the number of p2m fixups
> that Xen ends up doing is always fairly small (I usually see maybe 4
> pages tops), and only as a result of ACPI related accesses.  IMO it's an
> acceptable compromise to map those as individual 4K pages.

Yes, and my concern isn't so much what we map, or how many pages there
are, but that we do this behind the back of Dom0 (and also not ahead
of actually launching it).

As to the amount of accesses, these are the ranges that my SKL reports
through the temporary hypercall (as described on the v1 thread):

(XEN) sysmem: fed1c000 (24000 bytes)
(XEN) sysmem: fed45000 (47000 bytes)
(XEN) sysmem: ff000000 (1000000 bytes)
(XEN) sysmem: fed1b000 (1000 bytes)
(XEN) sysmem: fd000000 (ac0000 bytes)
(XEN) sysmem: fdad0000 (10000 bytes)
(XEN) sysmem: fe000000 (10000 bytes)
(XEN) sysmem: fe011000 (f000 bytes)
(XEN) sysmem: fe036000 (6000 bytes)
(XEN) sysmem: fe03d000 (3c3000 bytes)
(XEN) sysmem: fe410000 (3f0000 bytes)
(XEN) sysmem: fdaf0000 (10000 bytes)
(XEN) sysmem: fdae0000 (10000 bytes)
(XEN) sysmem: fdac0000 (10000 bytes)

Some of these ranges are also E820_RESERVED, so would (by default) be
mapped anyway. That's most notably the ff000000 one. The other regions
exceeding 2MiB in size aren't visible in E820, though.

As they're all reported by ACPI, they all could in principle be accessed.
Just requires the right drivers to be loaded, I expect.
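
To put numbers on the list above, here is a small standalone sketch (not
Xen code) that counts how many 4K pages the reported ranges span and how
many of them are larger than 2MiB. It assumes the sizes in the log are
hexadecimal, which is consistent with ff000000 + 1000000 ending exactly at
4GiB. The struct and function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_4K  0x1000ULL
#define MODEL_SIZE_2M  0x200000ULL

/* Hypothetical record for one "sysmem" line: base address and size. */
struct model_range { uint64_t base, size; };

/* The ranges from the log above, sizes taken as hexadecimal. */
static const struct model_range skl_ranges[] = {
    { 0xfed1c000, 0x24000 },   { 0xfed45000, 0x47000 },
    { 0xff000000, 0x1000000 }, { 0xfed1b000, 0x1000 },
    { 0xfd000000, 0xac0000 },  { 0xfdad0000, 0x10000 },
    { 0xfe000000, 0x10000 },   { 0xfe011000, 0xf000 },
    { 0xfe036000, 0x6000 },    { 0xfe03d000, 0x3c3000 },
    { 0xfe410000, 0x3f0000 },  { 0xfdaf0000, 0x10000 },
    { 0xfdae0000, 0x10000 },   { 0xfdac0000, 0x10000 },
};

#define SKL_NR_RANGES (sizeof(skl_ranges) / sizeof(skl_ranges[0]))

/* Total number of 4K pages covered, rounding each range up. */
static uint64_t model_total_pages(const struct model_range *r, size_t n)
{
    uint64_t pages = 0;

    for ( size_t i = 0; i < n; i++ )
        pages += (r[i].size + MODEL_PAGE_4K - 1) / MODEL_PAGE_4K;

    return pages;
}

/* Number of ranges strictly larger than 2MiB. */
static size_t model_count_over_2m(const struct model_range *r, size_t n)
{
    size_t count = 0;

    for ( size_t i = 0; i < n; i++ )
        if ( r[i].size > MODEL_SIZE_2M )
            count++;

    return count;
}
```

With these inputs, model_total_pages() yields 9028 pages and
model_count_over_2m() yields 4 (the ranges at ff000000, fd000000, fe03d000
and fe410000). Whether any of them could actually use a superpage mapping
would additionally depend on alignment, which this sketch does not check.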

> I would only consider the alternative approach of using a hypercall if
> we saw big regions being mapped by pf-fixup, because in that case it
> would better be using p2m superpage(s).
> 
> I think we want to enable pf-fixup by default at some point, the
> question is whether you would consider it appropriate to do now.
> Given it's limited to PVH dom0 only, I think we should enable for this
> release already.

As said, since I see no alternative, we may as well do it for 4.21, even
though I remain hesitant about it.

Jan