[PATCH] domain: Validate __copy_to_guest in VCPUOP_register_runstate_memory_area

Michal Orzel posted 1 patch 1 month ago
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20241126102653.25487-1-michal.orzel@amd.com
xen/common/domain.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
[PATCH] domain: Validate __copy_to_guest in VCPUOP_register_runstate_memory_area
Posted by Michal Orzel 1 month ago
For guests with paging mode external, guest_handle_okay() always returns
success, even if the guest handle is invalid (e.g. address not in P2M).
In VCPUOP_register_runstate_memory_area, we would then blindly set
runstate_guest() for a given vCPU to invalid handle. Moreover, we don't
check the return value from __copy_to_guest() and return success to the
guest, even in case of a failure during copy. Fix it, by checking the
return value from __copy_to_guest() and set runstate_guest() only on
success.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
---
I'm not sure what would be the appropriate commit for a Fixes tag. Arm port
only moved this code to common in 8968bafa3170d46d21d8f6ee2d0856f270c864ad,
so if at all, it would be:
Fixes: 8968bafa3170 ("xen: move VCPUOP_register_runstate_memory_area to common code")
---
 xen/common/domain.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 92263a4fbdc5..3f6fb0798fa3 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -2046,19 +2046,21 @@ long common_vcpu_op(int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( !guest_handle_okay(area.addr.h, 1) )
             break;
 
-        rc = 0;
-        runstate_guest(v) = area.addr.h;
-
         if ( v == current )
         {
-            __copy_to_guest(runstate_guest(v), &v->runstate, 1);
+            if ( __copy_to_guest(area.addr.h, &v->runstate, 1) )
+                break;
         }
         else
         {
             vcpu_runstate_get(v, &runstate);
-            __copy_to_guest(runstate_guest(v), &runstate, 1);
+            if ( __copy_to_guest(area.addr.h, &runstate, 1) )
+                break;
         }
 
+        rc = 0;
+        runstate_guest(v) = area.addr.h;
+
         break;
     }
 
-- 
2.25.1
Re: [PATCH] domain: Validate __copy_to_guest in VCPUOP_register_runstate_memory_area
Posted by Jan Beulich 1 month ago
On 26.11.2024 11:26, Michal Orzel wrote:
> For guests with paging mode external, guest_handle_okay() always returns
> success, even if the guest handle is invalid (e.g. address not in P2M).
> In VCPUOP_register_runstate_memory_area, we would then blindly set
> runstate_guest() for a given vCPU to invalid handle. Moreover, we don't
> check the return value from __copy_to_guest() and return success to the
> guest, even in case of a failure during copy.

I'm afraid this is all deliberate, providing best effort behavior. For a
paging mode external guest, the handle may become valid subsequently. If
any __copy_to_guest() fail here, subsequent update_runstate_area() may
succeed (and success of the actual copying isn't checked there either).

> Fix it, by checking the
> return value from __copy_to_guest() and set runstate_guest() only on
> success.

_If_ such a change was wanted (despite its potential for regressions,
as guests may leverage present behavior to establish handles before
putting in place mappings), x86'es compat_vcpu_op() would need updating,
too. Plus what about VCPUOP_register_vcpu_time_memory_area, behaving
similarly?

Jan
Re: [PATCH] domain: Validate __copy_to_guest in VCPUOP_register_runstate_memory_area
Posted by Michal Orzel 1 month ago

On 26/11/2024 11:40, Jan Beulich wrote:
> 
> 
> On 26.11.2024 11:26, Michal Orzel wrote:
>> For guests with paging mode external, guest_handle_okay() always returns
>> success, even if the guest handle is invalid (e.g. address not in P2M).
>> In VCPUOP_register_runstate_memory_area, we would then blindly set
>> runstate_guest() for a given vCPU to invalid handle. Moreover, we don't
>> check the return value from __copy_to_guest() and return success to the
>> guest, even in case of a failure during copy.
> 
> I'm afraid this is all deliberate, providing best effort behavior. For a
> paging mode external guest, the handle may become valid subsequently. If
> any __copy_to_guest() fail here, subsequent update_runstate_area() may
> succeed (and success of the actual copying isn't checked there either).
Hmm, I've never looked at that this way. I always thought that mapping must be in place
before setting up handle. Is this concept really something used (e.g. in Linux) or just a
way for us to provide excuse for not hardening the code because of a fear of regressions, etc.?

> 
>> Fix it, by checking the
>> return value from __copy_to_guest() and set runstate_guest() only on
>> success.
> 
> _If_ such a change was wanted (despite its potential for regressions,
> as guests may leverage present behavior to establish handles before
> putting in place mappings), x86'es compat_vcpu_op() would need updating,
> too. Plus what about VCPUOP_register_vcpu_time_memory_area, behaving
> similarly?
It's up to us. If the concept you mentioned is valid and people are aware of it, then
let's keep as is. I for one was amazed that Xen returned success but structure was not
copied.

~Michal
Re: [PATCH] domain: Validate __copy_to_guest in VCPUOP_register_runstate_memory_area
Posted by Jan Beulich 1 month ago
On 26.11.2024 12:24, Michal Orzel wrote:
> On 26/11/2024 11:40, Jan Beulich wrote:
>> On 26.11.2024 11:26, Michal Orzel wrote:
>>> For guests with paging mode external, guest_handle_okay() always returns
>>> success, even if the guest handle is invalid (e.g. address not in P2M).
>>> In VCPUOP_register_runstate_memory_area, we would then blindly set
>>> runstate_guest() for a given vCPU to invalid handle. Moreover, we don't
>>> check the return value from __copy_to_guest() and return success to the
>>> guest, even in case of a failure during copy.
>>
>> I'm afraid this is all deliberate, providing best effort behavior. For a
>> paging mode external guest, the handle may become valid subsequently. If
>> any __copy_to_guest() fail here, subsequent update_runstate_area() may
>> succeed (and success of the actual copying isn't checked there either).
> Hmm, I've never looked at that this way. I always thought that mapping must be in place
> before setting up handle. Is this concept really something used (e.g. in Linux) or just a
> way for us to provide excuse for not hardening the code because of a fear of regressions, etc.?

I'm pretty certain some common guest OS used such "early" registration at
some point. I don't recall whether that was Windows or Linux, nor - for the
latter - whether it was upstream (pv-ops) or the old forward ports from the
2.6.18-xen tree. And I can't say whether that may still be the case
somewhere. Nor that it was specifically the runstate area (it seems more
likely that it was the vCPU info area). Hence my desire for us to be
cautious.

Jan