[PATCH] xen: detect uninitialized xenbus in xenbus_init

Stefano Stabellini posted 1 patch 2 years, 4 months ago
Failed in applying to current master (apply log)
There is a newer version of this series
drivers/xen/xenbus/xenbus_probe.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
[PATCH] xen: detect uninitialized xenbus in xenbus_init
Posted by Stefano Stabellini 2 years, 4 months ago
From: Stefano Stabellini <stefano.stabellini@xilinx.com>

If the xenstore page hasn't been allocated properly, reading the value
of the related hvm_param (HVM_PARAM_STORE_PFN) won't actually return
error. Instead, it will succeed and return zero.

Instead of attempting to xen_remap a bad guest physical address, detect
this condition and return early.

Note that although a guest physical address of zero for
HVM_PARAM_STORE_PFN is theoretically possible, it is not a good choice
and zero has never been validly used in that capacity.

Cc: stable@vger.kernel.org
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
---
 drivers/xen/xenbus/xenbus_probe.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 94405bb3829e..c89de0062399 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -951,6 +951,18 @@ static int __init xenbus_init(void)
 		err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
 		if (err)
 			goto out_error;
+		/*
+		 * Uninitialized hvm_params are zero and return no error.
+		 * Although it is theoretically possible to have
+		 * HVM_PARAM_STORE_PFN set to zero on purpose, in reality it is
+		 * not zero when valid. If zero, it means that Xenstore hasn't
+		 * been properly initialized. Instead of attempting to map a
+		 * wrong guest physical address return error.
+		 */
+		if (v == 0) {
+			err = -ENOENT;
+			goto out_error;
+		}
 		xen_store_gfn = (unsigned long)v;
 		xen_store_interface =
 			xen_remap(xen_store_gfn << XEN_PAGE_SHIFT,
-- 
2.25.1

Re: [PATCH] xen: detect uninitialized xenbus in xenbus_init
Posted by Jan Beulich 2 years, 4 months ago
On 17.11.2021 03:11, Stefano Stabellini wrote:
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -951,6 +951,18 @@ static int __init xenbus_init(void)
>  		err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
>  		if (err)
>  			goto out_error;
> +		/*
> +		 * Uninitialized hvm_params are zero and return no error.
> +		 * Although it is theoretically possible to have
> +		 * HVM_PARAM_STORE_PFN set to zero on purpose, in reality it is
> +		 * not zero when valid. If zero, it means that Xenstore hasn't
> +		 * been properly initialized. Instead of attempting to map a
> +		 * wrong guest physical address return error.
> +		 */
> +		if (v == 0) {
> +			err = -ENOENT;
> +			goto out_error;
> +		}

If such a check gets added, then I think known-invalid frame numbers
should be covered at even higher a priority than zero. This would,
for example, also mean to ...

>  		xen_store_gfn = (unsigned long)v;

... stop silently truncating a value here.

By covering them we would then have the option to pre-fill PFN params
with, say, ~0 in the hypervisor (to clearly identify them as invalid,
rather than having to guess at the validity of 0). I haven't really
checked yet whether such a change would be compatible with existing
software ...

Jan

Re: [PATCH] xen: detect uninitialized xenbus in xenbus_init
Posted by Stefano Stabellini 2 years, 4 months ago
On Wed, 17 Nov 2021, Jan Beulich wrote:
> On 17.11.2021 03:11, Stefano Stabellini wrote:
> > --- a/drivers/xen/xenbus/xenbus_probe.c
> > +++ b/drivers/xen/xenbus/xenbus_probe.c
> > @@ -951,6 +951,18 @@ static int __init xenbus_init(void)
> >  		err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
> >  		if (err)
> >  			goto out_error;
> > +		/*
> > +		 * Uninitialized hvm_params are zero and return no error.
> > +		 * Although it is theoretically possible to have
> > +		 * HVM_PARAM_STORE_PFN set to zero on purpose, in reality it is
> > +		 * not zero when valid. If zero, it means that Xenstore hasn't
> > +		 * been properly initialized. Instead of attempting to map a
> > +		 * wrong guest physical address return error.
> > +		 */
> > +		if (v == 0) {
> > +			err = -ENOENT;
> > +			goto out_error;
> > +		}
> 
> If such a check gets added, then I think known-invalid frame numbers
> should be covered at even higher a priority than zero.

Uhm, that's a good point. We could check for 0 and also ULONG_MAX


> This would, for example, also mean to ...
>
> >  		xen_store_gfn = (unsigned long)v;
> 
> ... stop silently truncating a value here.

Yeah, it can only happen on 32-bit but you have a point.


> By covering them we would then have the option to pre-fill PFN params
> with, say, ~0 in the hypervisor (to clearly identify them as invalid,
> rather than having to guess at the validity of 0). I haven't really
> checked yet whether such a change would be compatible with existing
> software ...

I had the same idea. I think the hvm_params should be initialized to an
invalid value in Xen. But here in Linux we need to be able to cope with
older Xen versions too so it still makes sense to check for zero in
places where it is very obviously incorrect (HVM_PARAM_STORE_PFN).


What do you think of the appended?



diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 94405bb3829e..04558d3a5562 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -951,6 +951,28 @@ static int __init xenbus_init(void)
 		err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
 		if (err)
 			goto out_error;
+		/*
+		 * Return error on an invalid value.
+		 *
+		 * Uninitialized hvm_params are zero and return no error.
+		 * Although it is theoretically possible to have
+		 * HVM_PARAM_STORE_PFN set to zero on purpose, in reality it is
+		 * not zero when valid. If zero, it means that Xenstore hasn't
+		 * been properly initialized. Instead of attempting to map a
+		 * wrong guest physical address return error.
+		 */
+		if (v == 0) {
+			err = -ENOENT;
+			goto out_error;
+		}
+		/*
+		 * ULONG_MAX is invalid on 64-bit because is INVALID_PFN.
+		 * On 32-bit return error to avoid truncation.
+		 */
+		if (v >= ULONG_MAX) {
+			err = -EINVAL;
+			goto out_error;
+		}
 		xen_store_gfn = (unsigned long)v;
 		xen_store_interface =
 			xen_remap(xen_store_gfn << XEN_PAGE_SHIFT,

Re: [PATCH] xen: detect uninitialized xenbus in xenbus_init
Posted by Juergen Gross 2 years, 4 months ago
On 18.11.21 03:37, Stefano Stabellini wrote:
> On Wed, 17 Nov 2021, Jan Beulich wrote:
>> On 17.11.2021 03:11, Stefano Stabellini wrote:
>>> --- a/drivers/xen/xenbus/xenbus_probe.c
>>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>>> @@ -951,6 +951,18 @@ static int __init xenbus_init(void)
>>>   		err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
>>>   		if (err)
>>>   			goto out_error;
>>> +		/*
>>> +		 * Uninitialized hvm_params are zero and return no error.
>>> +		 * Although it is theoretically possible to have
>>> +		 * HVM_PARAM_STORE_PFN set to zero on purpose, in reality it is
>>> +		 * not zero when valid. If zero, it means that Xenstore hasn't
>>> +		 * been properly initialized. Instead of attempting to map a
>>> +		 * wrong guest physical address return error.
>>> +		 */
>>> +		if (v == 0) {
>>> +			err = -ENOENT;
>>> +			goto out_error;
>>> +		}
>>
>> If such a check gets added, then I think known-invalid frame numbers
>> should be covered at even higher a priority than zero.
> 
> Uhm, that's a good point. We could check for 0 and also ULONG_MAX
> 
> 
>> This would, for example, also mean to ...
>>
>>>   		xen_store_gfn = (unsigned long)v;
>>
>> ... stop silently truncating a value here.
> 
> Yeah, it can only happen on 32-bit but you have a point.
> 
> 
>> By covering them we would then have the option to pre-fill PFN params
>> with, say, ~0 in the hypervisor (to clearly identify them as invalid,
>> rather than having to guess at the validity of 0). I haven't really
>> checked yet whether such a change would be compatible with existing
>> software ...
> 
> I had the same idea. I think the hvm_params should be initialized to an
> invalid value in Xen. But here in Linux we need to be able to cope with
> older Xen versions too so it still makes sense to check for zero in
> places where it is very obviously incorrect (HVM_PARAM_STORE_PFN).
> 
> 
> What do you think of the appended?
> 
> 
> 
> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
> index 94405bb3829e..04558d3a5562 100644
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -951,6 +951,28 @@ static int __init xenbus_init(void)
>   		err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
>   		if (err)
>   			goto out_error;
> +		/*
> +		 * Return error on an invalid value.
> +		 *
> +		 * Uninitialized hvm_params are zero and return no error.
> +		 * Although it is theoretically possible to have
> +		 * HVM_PARAM_STORE_PFN set to zero on purpose, in reality it is
> +		 * not zero when valid. If zero, it means that Xenstore hasn't
> +		 * been properly initialized. Instead of attempting to map a
> +		 * wrong guest physical address return error.
> +		 */
> +		if (v == 0) {

Make this "if (v == ULONG_MAX || v== 0)" instead?
This would result in the same err on a new and an old hypervisor
(assuming we switch the hypervisor to init params with ~0UL).

> +			err = -ENOENT;
> +			goto out_error;
> +		}
> +		/*
> +		 * ULONG_MAX is invalid on 64-bit because is INVALID_PFN.
> +		 * On 32-bit return error to avoid truncation.
> +		 */
> +		if (v >= ULONG_MAX) {
> +			err = -EINVAL;
> +			goto out_error;
> +		}

Does it make sense to continue the system running in case of
truncation? This would be a 32-bit guest with more than 16TB of RAM
and the Xen tools decided to place the Xenstore ring page above the
16TB boundary. This is a completely insane scenario IMO.

A proper panic() in this case would make diagnosis of that much
easier (me having doubts that this will ever be hit, though).


Juergen
Re: [PATCH] xen: detect uninitialized xenbus in xenbus_init
Posted by Jan Beulich 2 years, 4 months ago
On 18.11.2021 06:32, Juergen Gross wrote:
> On 18.11.21 03:37, Stefano Stabellini wrote:
>> --- a/drivers/xen/xenbus/xenbus_probe.c
>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>> @@ -951,6 +951,28 @@ static int __init xenbus_init(void)
>>   		err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
>>   		if (err)
>>   			goto out_error;
>> +		/*
>> +		 * Return error on an invalid value.
>> +		 *
>> +		 * Uninitialized hvm_params are zero and return no error.
>> +		 * Although it is theoretically possible to have
>> +		 * HVM_PARAM_STORE_PFN set to zero on purpose, in reality it is
>> +		 * not zero when valid. If zero, it means that Xenstore hasn't
>> +		 * been properly initialized. Instead of attempting to map a
>> +		 * wrong guest physical address return error.
>> +		 */
>> +		if (v == 0) {
> 
> Make this "if (v == ULONG_MAX || v== 0)" instead?
> This would result in the same err on a new and an old hypervisor
> (assuming we switch the hypervisor to init params with ~0UL).
> 
>> +			err = -ENOENT;
>> +			goto out_error;
>> +		}
>> +		/*
>> +		 * ULONG_MAX is invalid on 64-bit because is INVALID_PFN.
>> +		 * On 32-bit return error to avoid truncation.
>> +		 */
>> +		if (v >= ULONG_MAX) {
>> +			err = -EINVAL;
>> +			goto out_error;
>> +		}
> 
> Does it make sense to continue the system running in case of
> truncation? This would be a 32-bit guest with more than 16TB of RAM
> and the Xen tools decided to place the Xenstore ring page above the
> 16TB boundary. This is a completely insane scenario IMO.
> 
> A proper panic() in this case would make diagnosis of that much
> easier (me having doubts that this will ever be hit, though).

While I agree panic() may be an option here (albeit I'm not sure why
that would be better than trying to cope with 0 and hence without
xenbus), I'd like to point out that the amount of RAM assigned to a
guest is unrelated to the choice of GFNs for the various "magic"
items.

Jan


Re: [PATCH] xen: detect uninitialized xenbus in xenbus_init
Posted by Juergen Gross 2 years, 4 months ago
On 18.11.21 09:47, Jan Beulich wrote:
> On 18.11.2021 06:32, Juergen Gross wrote:
>> On 18.11.21 03:37, Stefano Stabellini wrote:
>>> --- a/drivers/xen/xenbus/xenbus_probe.c
>>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>>> @@ -951,6 +951,28 @@ static int __init xenbus_init(void)
>>>    		err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
>>>    		if (err)
>>>    			goto out_error;
>>> +		/*
>>> +		 * Return error on an invalid value.
>>> +		 *
>>> +		 * Uninitialized hvm_params are zero and return no error.
>>> +		 * Although it is theoretically possible to have
>>> +		 * HVM_PARAM_STORE_PFN set to zero on purpose, in reality it is
>>> +		 * not zero when valid. If zero, it means that Xenstore hasn't
>>> +		 * been properly initialized. Instead of attempting to map a
>>> +		 * wrong guest physical address return error.
>>> +		 */
>>> +		if (v == 0) {
>>
>> Make this "if (v == ULONG_MAX || v== 0)" instead?
>> This would result in the same err on a new and an old hypervisor
>> (assuming we switch the hypervisor to init params with ~0UL).
>>
>>> +			err = -ENOENT;
>>> +			goto out_error;
>>> +		}
>>> +		/*
>>> +		 * ULONG_MAX is invalid on 64-bit because is INVALID_PFN.
>>> +		 * On 32-bit return error to avoid truncation.
>>> +		 */
>>> +		if (v >= ULONG_MAX) {
>>> +			err = -EINVAL;
>>> +			goto out_error;
>>> +		}
>>
>> Does it make sense to continue the system running in case of
>> truncation? This would be a 32-bit guest with more than 16TB of RAM
>> and the Xen tools decided to place the Xenstore ring page above the
>> 16TB boundary. This is a completely insane scenario IMO.
>>
>> A proper panic() in this case would make diagnosis of that much
>> easier (me having doubts that this will ever be hit, though).
> 
> While I agree panic() may be an option here (albeit I'm not sure why
> that would be better than trying to cope with 0 and hence without

I could imagine someone wanting to run a guest without Xenstore access,
which BTW will happen in case of a guest created by the hypervisor at
boot time.

> xenbus), I'd like to point out that the amount of RAM assigned to a
> guest is unrelated to the choice of GFNs for the various "magic"
> items.

Yes, but this would still be a major tools problem which probably
would render the whole guest rather unusable.


Juergen
Re: [PATCH] xen: detect uninitialized xenbus in xenbus_init
Posted by Stefano Stabellini 2 years, 4 months ago
On Thu, 18 Nov 2021, Juergen Gross wrote:
> On 18.11.21 09:47, Jan Beulich wrote:
> > On 18.11.2021 06:32, Juergen Gross wrote:
> > > On 18.11.21 03:37, Stefano Stabellini wrote:
> > > > --- a/drivers/xen/xenbus/xenbus_probe.c
> > > > +++ b/drivers/xen/xenbus/xenbus_probe.c
> > > > @@ -951,6 +951,28 @@ static int __init xenbus_init(void)
> > > >    		err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
> > > >    		if (err)
> > > >    			goto out_error;
> > > > +		/*
> > > > +		 * Return error on an invalid value.
> > > > +		 *
> > > > +		 * Uninitialized hvm_params are zero and return no error.
> > > > +		 * Although it is theoretically possible to have
> > > > +		 * HVM_PARAM_STORE_PFN set to zero on purpose, in reality it
> > > > is
> > > > +		 * not zero when valid. If zero, it means that Xenstore hasn't
> > > > +		 * been properly initialized. Instead of attempting to map a
> > > > +		 * wrong guest physical address return error.
> > > > +		 */
> > > > +		if (v == 0) {
> > > 
> > > Make this "if (v == ULONG_MAX || v== 0)" instead?
> > > This would result in the same err on a new and an old hypervisor
> > > (assuming we switch the hypervisor to init params with ~0UL).

Sure, I can do that


> > > > +			err = -ENOENT;
> > > > +			goto out_error;
> > > > +		}
> > > > +		/*
> > > > +		 * ULONG_MAX is invalid on 64-bit because is INVALID_PFN.
> > > > +		 * On 32-bit return error to avoid truncation.
> > > > +		 */
> > > > +		if (v >= ULONG_MAX) {
> > > > +			err = -EINVAL;
> > > > +			goto out_error;
> > > > +		}
> > > 
> > > Does it make sense to continue the system running in case of
> > > truncation? This would be a 32-bit guest with more than 16TB of RAM
> > > and the Xen tools decided to place the Xenstore ring page above the
> > > 16TB boundary. This is a completely insane scenario IMO.
> > > 
> > > A proper panic() in this case would make diagnosis of that much
> > > easier (me having doubts that this will ever be hit, though).
> > 
> > While I agree panic() may be an option here (albeit I'm not sure why
> > that would be better than trying to cope with 0 and hence without
> 
> I could imagine someone wanting to run a guest without Xenstore access,
> which BTW will happen in case of a guest created by the hypervisor at
> boot time.
> 
> > xenbus), I'd like to point out that the amount of RAM assigned to a
> > guest is unrelated to the choice of GFNs for the various "magic"
> > items.
> 
> Yes, but this would still be a major tools problem which probably
> would render the whole guest rather unusable.

First let's distinguish between an error due to "hvm_param not
initialized" and an error due to more serious conditions, such as "pfn
above max".

"hvm_param not initialized" could mean v == 0 (as it would be today) or
v == ~0UL (if we change Xen to initialize all hvm_param to ~0UL). I
don't think we want to panic in these cases as they are not actually
true erroneous configurations. We should just stop trying to initialize
xenstore and continue with the rest.


The "pfn above max" case could happen if v is greater than the max pfn.
This is a true error in the configuration because the toolstack should
know that the guest is 32-bit so it should give it a pfn that the guest
is able to use. As Jan wrote in another email, for 32-bit the actual
limit depends on the physical address bits but actually Linux has never
been able to cope with a pfn > ULONG_MAX on 32-bit because xen_store_gfn
is defined as unsigned long. So Linux 32-bit has been truncating
HVM_PARAM_STORE_PFN all along.

There is also an argument that depending on kconfig Linux 32-bit might
only be able to handle addresses < 4G, so I don't think the toolstack
can assume that a 32-bit guest is able to cope with HVM_PARAM_STORE_PFN
> ULONG_MAX.  If Linux is 32-bit and HVM_PARAM_STORE_PFN > ULONG_MAX,
even if HVM_PARAM_STORE_PFN < address_bits_max I think it would be fair
to still consider it an error, but I can see it could be argued either
way. Certainly if HVM_PARAM_STORE_PFN > address_bits_max is an error.

In any case, I think it is still better for Linux to stop trying to
initialize Xenstore but continue with the rest because there is a bunch
of other useful things Linux can do without it. Panic should only be the
last resort if there is nothing else to do. In this case we haven't even
initialized the service and the service is not essential, at least it is
not essential in certain ARM setups.


So in conclusion, I think this patch should:
- if v == 0 return error (uninitialized)
- if v == ~0ULL (INVALID_PFN) return error (uinitialized)
- if v >= ~0UL (32-bit) return error (even if this case could be made to
  work for v < max_address_bits depending on kconfig)

Which leads to something like:

        /* uninitialized */
		if (v == 0 || v == ~0ULL) {
			err = -ENOENT;
			goto out_error;
		}
        /* 
         * Avoid truncation on 32-bit.
         * TODO: handle addresses >= 4G
         */
        if ( v >= ~0UL ) {
            err = -EINVAL;
            goto out_error;
        }

Re: [PATCH] xen: detect uninitialized xenbus in xenbus_init
Posted by Boris Ostrovsky 2 years, 4 months ago
On 11/18/21 4:00 PM, Stefano Stabellini wrote:
>
>          /*
>           * Avoid truncation on 32-bit.
>           * TODO: handle addresses >= 4G
>           */
>          if ( v >= ~0UL ) {
>              err = -EINVAL;
>              goto out_error;
>          }


Since this is only relevant to 32-bit kernels then "#if BITS_PER_LONG == 32".


-boris


Re: [PATCH] xen: detect uninitialized xenbus in xenbus_init
Posted by Jan Beulich 2 years, 4 months ago
On 18.11.2021 23:24, Boris Ostrovsky wrote:
> On 11/18/21 4:00 PM, Stefano Stabellini wrote:
>>
>>          /*
>>           * Avoid truncation on 32-bit.
>>           * TODO: handle addresses >= 4G
>>           */
>>          if ( v >= ~0UL ) {
>>              err = -EINVAL;
>>              goto out_error;
>>          }
> 
> 
> Since this is only relevant to 32-bit kernels then "#if BITS_PER_LONG == 32".

Plus then > instead of >= (thus also making the comment actually decribe
the code) and ULONG_MAX instead of ~0UL, I would say.

Jan


Re: [PATCH] xen: detect uninitialized xenbus in xenbus_init
Posted by Juergen Gross 2 years, 4 months ago
On 18.11.21 22:00, Stefano Stabellini wrote:
> On Thu, 18 Nov 2021, Juergen Gross wrote:
>> On 18.11.21 09:47, Jan Beulich wrote:
>>> On 18.11.2021 06:32, Juergen Gross wrote:
>>>> On 18.11.21 03:37, Stefano Stabellini wrote:
>>>>> --- a/drivers/xen/xenbus/xenbus_probe.c
>>>>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>>>>> @@ -951,6 +951,28 @@ static int __init xenbus_init(void)
>>>>>     		err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
>>>>>     		if (err)
>>>>>     			goto out_error;
>>>>> +		/*
>>>>> +		 * Return error on an invalid value.
>>>>> +		 *
>>>>> +		 * Uninitialized hvm_params are zero and return no error.
>>>>> +		 * Although it is theoretically possible to have
>>>>> +		 * HVM_PARAM_STORE_PFN set to zero on purpose, in reality it
>>>>> is
>>>>> +		 * not zero when valid. If zero, it means that Xenstore hasn't
>>>>> +		 * been properly initialized. Instead of attempting to map a
>>>>> +		 * wrong guest physical address return error.
>>>>> +		 */
>>>>> +		if (v == 0) {
>>>>
>>>> Make this "if (v == ULONG_MAX || v== 0)" instead?
>>>> This would result in the same err on a new and an old hypervisor
>>>> (assuming we switch the hypervisor to init params with ~0UL).
> 
> Sure, I can do that
> 
> 
>>>>> +			err = -ENOENT;
>>>>> +			goto out_error;
>>>>> +		}
>>>>> +		/*
>>>>> +		 * ULONG_MAX is invalid on 64-bit because is INVALID_PFN.
>>>>> +		 * On 32-bit return error to avoid truncation.
>>>>> +		 */
>>>>> +		if (v >= ULONG_MAX) {
>>>>> +			err = -EINVAL;
>>>>> +			goto out_error;
>>>>> +		}
>>>>
>>>> Does it make sense to continue the system running in case of
>>>> truncation? This would be a 32-bit guest with more than 16TB of RAM
>>>> and the Xen tools decided to place the Xenstore ring page above the
>>>> 16TB boundary. This is a completely insane scenario IMO.
>>>>
>>>> A proper panic() in this case would make diagnosis of that much
>>>> easier (me having doubts that this will ever be hit, though).
>>>
>>> While I agree panic() may be an option here (albeit I'm not sure why
>>> that would be better than trying to cope with 0 and hence without
>>
>> I could imagine someone wanting to run a guest without Xenstore access,
>> which BTW will happen in case of a guest created by the hypervisor at
>> boot time.
>>
>>> xenbus), I'd like to point out that the amount of RAM assigned to a
>>> guest is unrelated to the choice of GFNs for the various "magic"
>>> items.
>>
>> Yes, but this would still be a major tools problem which probably
>> would render the whole guest rather unusable.
> 
> First let's distinguish between an error due to "hvm_param not
> initialized" and an error due to more serious conditions, such as "pfn
> above max".
> 
> "hvm_param not initialized" could mean v == 0 (as it would be today) or
> v == ~0UL (if we change Xen to initialize all hvm_param to ~0UL). I
> don't think we want to panic in these cases as they are not actually
> true erroneous configurations. We should just stop trying to initialize
> xenstore and continue with the rest.
> 
> 
> The "pfn above max" case could happen if v is greater than the max pfn.
> This is a true error in the configuration because the toolstack should
> know that the guest is 32-bit so it should give it a pfn that the guest

I don't think so. All x86 PVH/HVM guests start booting in 32-bit mode.

> is able to use. As Jan wrote in another email, for 32-bit the actual
> limit depends on the physical address bits but actually Linux has never
> been able to cope with a pfn > ULONG_MAX on 32-bit because xen_store_gfn
> is defined as unsigned long. So Linux 32-bit has been truncating
> HVM_PARAM_STORE_PFN all along.

The question is whether the number of physical address bits as presented
to the guest is always >= 44. If not the actual limit is less than
ULONG_MAX. Other than that you are right: a PFN larger than a 32-bit
ULONG_MAX will be truncated by a 32-bit guest.

> There is also an argument that depending on kconfig Linux 32-bit might
> only be able to handle addresses < 4G, so I don't think the toolstack
> can assume that a 32-bit guest is able to cope with HVM_PARAM_STORE_PFN
>> ULONG_MAX.  If Linux is 32-bit and HVM_PARAM_STORE_PFN > ULONG_MAX,
> even if HVM_PARAM_STORE_PFN < address_bits_max I think it would be fair
> to still consider it an error, but I can see it could be argued either
> way. Certainly if HVM_PARAM_STORE_PFN > address_bits_max is an error.

Right. The tools should NEVER put the frame above 4G for a non-PV guest.

> In any case, I think it is still better for Linux to stop trying to
> initialize Xenstore but continue with the rest because there is a bunch
> of other useful things Linux can do without it. Panic should only be the
> last resort if there is nothing else to do. In this case we haven't even
> initialized the service and the service is not essential, at least it is
> not essential in certain ARM setups.
> 
> 
> So in conclusion, I think this patch should:
> - if v == 0 return error (uninitialized)
> - if v == ~0ULL (INVALID_PFN) return error (uinitialized)
> - if v >= ~0UL (32-bit) return error (even if this case could be made to
>    work for v < max_address_bits depending on kconfig)
> 
> Which leads to something like:
> 
>          /* uninitialized */
> 		if (v == 0 || v == ~0ULL) {
> 			err = -ENOENT;
> 			goto out_error;
> 		}
>          /*
>           * Avoid truncation on 32-bit.
>           * TODO: handle addresses >= 4G
>           */
>          if ( v >= ~0UL ) {
>              err = -EINVAL;
>              goto out_error;

I think at least in this case a pr_err("...") should be added.

Silent failure is not nice.


Juergen
Re: [PATCH] xen: detect uninitialized xenbus in xenbus_init
Posted by Jan Beulich 2 years, 4 months ago
On 18.11.2021 03:37, Stefano Stabellini wrote:
> On Wed, 17 Nov 2021, Jan Beulich wrote:
>> On 17.11.2021 03:11, Stefano Stabellini wrote:
>>> --- a/drivers/xen/xenbus/xenbus_probe.c
>>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>>> @@ -951,6 +951,18 @@ static int __init xenbus_init(void)
>>>  		err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
>>>  		if (err)
>>>  			goto out_error;
>>> +		/*
>>> +		 * Uninitialized hvm_params are zero and return no error.
>>> +		 * Although it is theoretically possible to have
>>> +		 * HVM_PARAM_STORE_PFN set to zero on purpose, in reality it is
>>> +		 * not zero when valid. If zero, it means that Xenstore hasn't
>>> +		 * been properly initialized. Instead of attempting to map a
>>> +		 * wrong guest physical address return error.
>>> +		 */
>>> +		if (v == 0) {
>>> +			err = -ENOENT;
>>> +			goto out_error;
>>> +		}
>>
>> If such a check gets added, then I think known-invalid frame numbers
>> should be covered at even higher a priority than zero.
> 
> Uhm, that's a good point. We could check for 0 and also ULONG_MAX

Why ULONG_MAX? The upper bound is determined by the number of physical
address bits (in a guest: the virtual counterpart thereof). In a 32-bit
environment ULONG_MAX could in principle even represent a valid frame
number.

Jan