[PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...

Michal Orzel posted 1 patch 2 years, 4 months ago
Test gitlab-ci failed
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20211206142032.27536-1-michal.orzel@arm.com
There is a newer version of this series
xen/arch/arm/arm64/entry.S | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
[PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Michal Orzel 2 years, 4 months ago
to hypervisor when switching to AArch32 state.

According to section D1.20.2 of Arm Arm(DDI 0487A.j):
"If the general-purpose register was accessible from AArch32 state the
upper 32 bits either become zero, or hold the value that the same
architectural register held before any AArch32 execution.
The choice between these two options is IMPLEMENTATIONDEFINED"

Currently Xen does not ensure that the top 32 bits are zeroed and this
needs to be fixed.

Fix this bug by zeroing the upper 32 bits of these registers on an
entry to hypervisor when switching to AArch32 state.

Set default value of parameter compat of macro entry to 0 (AArch64 mode
as we are on 64-bit hypervisor) to avoid checking if parameter is blank
when not passed.

Signed-off-by: Michal Orzel <michal.orzel@arm.com>
---
 xen/arch/arm/arm64/entry.S | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
index fc3811ad0a..d364128175 100644
--- a/xen/arch/arm/arm64/entry.S
+++ b/xen/arch/arm/arm64/entry.S
@@ -109,8 +109,16 @@
  * If 0, we rely on the on x0/x1 to have been saved at the correct
  * position on the stack before.
  */
-        .macro  entry, hyp, compat, save_x0_x1=1
+        .macro  entry, hyp, compat=0, save_x0_x1=1
         sub     sp, sp, #(UREGS_SPSR_el1 - UREGS_LR) /* CPSR, PC, SP, LR */
+
+        /* Zero the upper 32 bits of the registers when switching to AArch32 */
+        .if \compat == 1      /* AArch32 mode */
+        .irp nr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
+        mov w\nr, w\nr
+        .endr
+        .endif
+
         push    x28, x29
         push    x26, x27
         push    x24, x25
-- 
2.29.0


Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Jan Beulich 2 years, 4 months ago
On 06.12.2021 15:20, Michal Orzel wrote:
> to hypervisor when switching to AArch32 state.

Do you perhaps mean "from AArch32 state" (also in further places below?
The 64-bit hypervisor runs in AArch64 state in all cases aiui.

> --- a/xen/arch/arm/arm64/entry.S
> +++ b/xen/arch/arm/arm64/entry.S
> @@ -109,8 +109,16 @@
>   * If 0, we rely on the on x0/x1 to have been saved at the correct
>   * position on the stack before.
>   */
> -        .macro  entry, hyp, compat, save_x0_x1=1
> +        .macro  entry, hyp, compat=0, save_x0_x1=1
>          sub     sp, sp, #(UREGS_SPSR_el1 - UREGS_LR) /* CPSR, PC, SP, LR */
> +
> +        /* Zero the upper 32 bits of the registers when switching to AArch32 */
> +        .if \compat == 1      /* AArch32 mode */
> +        .irp nr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
> +        mov w\nr, w\nr
> +        .endr
> +        .endif

Don't you at least want, perhaps even need to respect save_x0_x1 being
zero here?

Jan


Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Julien Grall 2 years, 4 months ago
Hi,

On 06/12/2021 14:20, Michal Orzel wrote:
> to hypervisor when switching to AArch32 state.
> 
> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
> "If the general-purpose register was accessible from AArch32 state the
> upper 32 bits either become zero, or hold the value that the same
> architectural register held before any AArch32 execution.
> The choice between these two options is IMPLEMENTATIONDEFINED"

Typo: Missing space between IMPLEMENTATION and DEFINED.

> 
> Currently Xen does not ensure that the top 32 bits are zeroed and this
> needs to be fixed.

Can you outline why this is a problem and why we need to protect? IIRC, 
the main concern is Xen may misinterpret what the guest requested but we 
are not concerned about Xen using wrong value.

> 
> Fix this bug by zeroing the upper 32 bits of these registers on an
> entry to hypervisor when switching to AArch32 state.
> 
> Set default value of parameter compat of macro entry to 0 (AArch64 mode
> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
> when not passed.

Which error do you see otherwise? Is it a compilation error?

> 
> Signed-off-by: Michal Orzel <michal.orzel@arm.com>
> ---
>   xen/arch/arm/arm64/entry.S | 10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
> index fc3811ad0a..d364128175 100644
> --- a/xen/arch/arm/arm64/entry.S
> +++ b/xen/arch/arm/arm64/entry.S
> @@ -109,8 +109,16 @@
>    * If 0, we rely on the on x0/x1 to have been saved at the correct
>    * position on the stack before.
>    */
> -        .macro  entry, hyp, compat, save_x0_x1=1
> +        .macro  entry, hyp, compat=0, save_x0_x1=1
>           sub     sp, sp, #(UREGS_SPSR_el1 - UREGS_LR) /* CPSR, PC, SP, LR */
> +
> +        /* Zero the upper 32 bits of the registers when switching to AArch32 */
> +        .if \compat == 1      /* AArch32 mode */
> +        .irp nr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
> +        mov w\nr, w\nr
> +        .endr
> +        .endif

So Jan mentioned, the x0/x1 may have already been saved. So you may need 
to fetch them from the stack and then clobber the top 32-bit.

Cheers,

-- 
Julien Grall

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Michal Orzel 2 years, 4 months ago
Hi Julien,

On 06.12.2021 16:29, Julien Grall wrote:
> Hi,
> 
> On 06/12/2021 14:20, Michal Orzel wrote:
>> to hypervisor when switching to AArch32 state.
>>
I will change to "from AArch32 state".
>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>> "If the general-purpose register was accessible from AArch32 state the
>> upper 32 bits either become zero, or hold the value that the same
>> architectural register held before any AArch32 execution.
>> The choice between these two options is IMPLEMENTATIONDEFINED"
> 
> Typo: Missing space between IMPLEMENTATION and DEFINED.
> 
Ok.
>>
>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>> needs to be fixed.
> 
> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
> 
I would say:
"
The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
if the command passed as the first argument was clobbered,
"
>>
>> Fix this bug by zeroing the upper 32 bits of these registers on an
>> entry to hypervisor when switching to AArch32 state.
>>
>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>> when not passed.
> 
> Which error do you see otherwise? Is it a compilation error?
> 
Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
So basically in all the places where entry is called with hyp=1.
When taking the current patch and removing default value for compat you will get:
```
entry.S:254: Error: ".endif" without ".if"
entry.S:258: Error: symbol `.if' is already defined
entry.S:258: Error: ".endif" without ".if"
entry.S:262: Error: symbol `.if' is already defined
entry.S:262: Error: ".endif" without ".if"
entry.S:266: Error: symbol `.if' is already defined
entry.S:266: Error: ".endif" without ".if"
entry.S:278: Error: symbol `.if' is already defined
entry.S:278: Error: ".endif" without ".if"
entry.S:292: Error: symbol `.if' is already defined
entry.S:292: Error: ".endif" without ".if"
entry.S:317: Error: symbol `.if' is already defined
entry.S:317: Error: ".endif" without ".if"
```

>>
>> Signed-off-by: Michal Orzel <michal.orzel@arm.com>
>> ---
>>   xen/arch/arm/arm64/entry.S | 10 +++++++++-
>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
>> index fc3811ad0a..d364128175 100644
>> --- a/xen/arch/arm/arm64/entry.S
>> +++ b/xen/arch/arm/arm64/entry.S
>> @@ -109,8 +109,16 @@
>>    * If 0, we rely on the on x0/x1 to have been saved at the correct
>>    * position on the stack before.
>>    */
>> -        .macro  entry, hyp, compat, save_x0_x1=1
>> +        .macro  entry, hyp, compat=0, save_x0_x1=1
>>           sub     sp, sp, #(UREGS_SPSR_el1 - UREGS_LR) /* CPSR, PC, SP, LR */
>> +
>> +        /* Zero the upper 32 bits of the registers when switching to AArch32 */
>> +        .if \compat == 1      /* AArch32 mode */
>> +        .irp nr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
>> +        mov w\nr, w\nr
>> +        .endr
>> +        .endif
> 
> So Jan mentioned, the x0/x1 may have already been saved. So you may need to fetch them from the stack and then clobber the top 32-bit.
> 
So I would do the following:
-fetch x0/x1 from the stack
-clobber them
-store them again on the stack

/*
 * Zero the upper 32 bits of the gp registers when switching
 * from AArch32.
 */
.if \compat == 1      /* AArch32 mode */

/* x0/x1 have already been saved so fetch them to zero top 32 bits */
.if \save_x0_x1 == 0
ldp     x0, x1, [sp], #(UREGS_kernel_sizeof - UREGS_X0)
.endif

.irp nr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
mov w\nr, w\nr
.endr

.if \save_x0_x1 == 0
stp     x0, x1, [sp, #-(UREGS_kernel_sizeof - UREGS_X0)]
.endif

.endif

> Cheers,
> 

Cheers,
Michal

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Bertrand Marquis 2 years, 4 months ago
Hi Michal,

> On 7 Dec 2021, at 08:37, Michal Orzel <michal.orzel@arm.com> wrote:
> 
> Hi Julien,
> 
> On 06.12.2021 16:29, Julien Grall wrote:
>> Hi,
>> 
>> On 06/12/2021 14:20, Michal Orzel wrote:
>>> to hypervisor when switching to AArch32 state.
>>> 
> I will change to "from AArch32 state".
>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>> "If the general-purpose register was accessible from AArch32 state the
>>> upper 32 bits either become zero, or hold the value that the same
>>> architectural register held before any AArch32 execution.
>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>> 
>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>> 
> Ok.
>>> 
>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>> needs to be fixed.
>> 
>> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
>> 
> I would say:
> "
> The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
> If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
> For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
> if the command passed as the first argument was clobbered,
> "
>>> 
>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>> entry to hypervisor when switching to AArch32 state.
>>> 
>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>> when not passed.
>> 
>> Which error do you see otherwise? Is it a compilation error?
>> 
> Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
> So basically in all the places where entry is called with hyp=1.
> When taking the current patch and removing default value for compat you will get:
> ```
> entry.S:254: Error: ".endif" without ".if"
> entry.S:258: Error: symbol `.if' is already defined
> entry.S:258: Error: ".endif" without ".if"
> entry.S:262: Error: symbol `.if' is already defined
> entry.S:262: Error: ".endif" without ".if"
> entry.S:266: Error: symbol `.if' is already defined
> entry.S:266: Error: ".endif" without ".if"
> entry.S:278: Error: symbol `.if' is already defined
> entry.S:278: Error: ".endif" without ".if"
> entry.S:292: Error: symbol `.if' is already defined
> entry.S:292: Error: ".endif" without ".if"
> entry.S:317: Error: symbol `.if' is already defined
> entry.S:317: Error: ".endif" without ".if"
> ```
> 
>>> 
>>> Signed-off-by: Michal Orzel <michal.orzel@arm.com>
>>> ---
>>>   xen/arch/arm/arm64/entry.S | 10 +++++++++-
>>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
>>> index fc3811ad0a..d364128175 100644
>>> --- a/xen/arch/arm/arm64/entry.S
>>> +++ b/xen/arch/arm/arm64/entry.S
>>> @@ -109,8 +109,16 @@
>>>    * If 0, we rely on the on x0/x1 to have been saved at the correct
>>>    * position on the stack before.
>>>    */
>>> -        .macro  entry, hyp, compat, save_x0_x1=1
>>> +        .macro  entry, hyp, compat=0, save_x0_x1=1
>>>           sub     sp, sp, #(UREGS_SPSR_el1 - UREGS_LR) /* CPSR, PC, SP, LR */
>>> +
>>> +        /* Zero the upper 32 bits of the registers when switching to AArch32 */
>>> +        .if \compat == 1      /* AArch32 mode */
>>> +        .irp nr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
>>> +        mov w\nr, w\nr
>>> +        .endr
>>> +        .endif
>> 
>> So Jan mentioned, the x0/x1 may have already been saved. So you may need to fetch them from the stack and then clobber the top 32-bit.
>> 
> So I would do the following:
> -fetch x0/x1 from the stack
> -clobber them
> -store them again on the stack
> 
> /*
> * Zero the upper 32 bits of the gp registers when switching
> * from AArch32.
> */
> .if \compat == 1      /* AArch32 mode */
> 
> /* x0/x1 have already been saved so fetch them to zero top 32 bits */
> .if \save_x0_x1 == 0
> ldp     x0, x1, [sp], #(UREGS_kernel_sizeof - UREGS_X0)
> .endif
> 
> .irp nr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
> mov w\nr, w\nr
> .endr
> 
> .if \save_x0_x1 == 0
> stp     x0, x1, [sp, #-(UREGS_kernel_sizeof - UREGS_X0)]
> .endif
> 
> .endif

This solution looks ok. X0 and x1 when they are used is as scratch register for x1 or using w0 for x0 so it is ok to clean them here and not earlier.

Cheers
Bertrand

> 
>> Cheers,
>> 
> 
> Cheers,
> Michal


Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Julien Grall 2 years, 4 months ago

On 07/12/2021 08:37, Michal Orzel wrote:
> Hi Julien,

Hi,

> On 06.12.2021 16:29, Julien Grall wrote:
>> Hi,
>>
>> On 06/12/2021 14:20, Michal Orzel wrote:
>>> to hypervisor when switching to AArch32 state.
>>>
> I will change to "from AArch32 state".
>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>> "If the general-purpose register was accessible from AArch32 state the
>>> upper 32 bits either become zero, or hold the value that the same
>>> architectural register held before any AArch32 execution.
>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>
>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>
> Ok.
>>>
>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>> needs to be fixed.
>>
>> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
>>
> I would say:
> "
> The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
> If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
> For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
> if the command passed as the first argument was clobbered,
> "
>>>
>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>> entry to hypervisor when switching to AArch32 state.
>>>
>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>> when not passed.
>>
>> Which error do you see otherwise? Is it a compilation error?
>>
> Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
> So basically in all the places where entry is called with hyp=1.
> When taking the current patch and removing default value for compat you will get:
> ```
> entry.S:254: Error: ".endif" without ".if"
> entry.S:258: Error: symbol `.if' is already defined
> entry.S:258: Error: ".endif" without ".if"
> entry.S:262: Error: symbol `.if' is already defined
> entry.S:262: Error: ".endif" without ".if"
> entry.S:266: Error: symbol `.if' is already defined
> entry.S:266: Error: ".endif" without ".if"
> entry.S:278: Error: symbol `.if' is already defined
> entry.S:278: Error: ".endif" without ".if"
> entry.S:292: Error: symbol `.if' is already defined
> entry.S:292: Error: ".endif" without ".if"
> entry.S:317: Error: symbol `.if' is already defined
> entry.S:317: Error: ".endif" without ".if"
> ```

Thanks for input. I am concerned with your suggested approach (or using 
.if 0\compat as suggested by Jan) because they allow the caller to not 
properly specify compat when hyp=0. The risk here is we may generate the 
wrong entry.

compat should need to be specified when hyp=1 as we will always run in 
aarch64 mode. So could we protect this code with hyp=0?

Cheers,

-- 
Julien Grall

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Jan Beulich 2 years, 4 months ago
On 07.12.2021 20:11, Julien Grall wrote:
> 
> 
> On 07/12/2021 08:37, Michal Orzel wrote:
>> Hi Julien,
> 
> Hi,
> 
>> On 06.12.2021 16:29, Julien Grall wrote:
>>> Hi,
>>>
>>> On 06/12/2021 14:20, Michal Orzel wrote:
>>>> to hypervisor when switching to AArch32 state.
>>>>
>> I will change to "from AArch32 state".
>>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>>> "If the general-purpose register was accessible from AArch32 state the
>>>> upper 32 bits either become zero, or hold the value that the same
>>>> architectural register held before any AArch32 execution.
>>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>>
>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>
>> Ok.
>>>>
>>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>>> needs to be fixed.
>>>
>>> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
>>>
>> I would say:
>> "
>> The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
>> If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
>> For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>> if the command passed as the first argument was clobbered,
>> "
>>>>
>>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>>> entry to hypervisor when switching to AArch32 state.
>>>>
>>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>>> when not passed.
>>>
>>> Which error do you see otherwise? Is it a compilation error?
>>>
>> Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
>> So basically in all the places where entry is called with hyp=1.
>> When taking the current patch and removing default value for compat you will get:
>> ```
>> entry.S:254: Error: ".endif" without ".if"
>> entry.S:258: Error: symbol `.if' is already defined
>> entry.S:258: Error: ".endif" without ".if"
>> entry.S:262: Error: symbol `.if' is already defined
>> entry.S:262: Error: ".endif" without ".if"
>> entry.S:266: Error: symbol `.if' is already defined
>> entry.S:266: Error: ".endif" without ".if"
>> entry.S:278: Error: symbol `.if' is already defined
>> entry.S:278: Error: ".endif" without ".if"
>> entry.S:292: Error: symbol `.if' is already defined
>> entry.S:292: Error: ".endif" without ".if"
>> entry.S:317: Error: symbol `.if' is already defined
>> entry.S:317: Error: ".endif" without ".if"
>> ```
> 
> Thanks for input. I am concerned with your suggested approach (or using 
> .if 0\compat as suggested by Jan) because they allow the caller to not 
> properly specify compat when hyp=0. The risk here is we may generate the 
> wrong entry.
> 
> compat should need to be specified when hyp=1 as we will always run in 
> aarch64 mode. So could we protect this code with hyp=0?

Since my suggestion was only to avoid the need for specifying a default
for the parameter (which you didn't seem to be happy about), it would
then merely extend to

.if !0\hyp && 0\compat

or something along those lines.

Jan


Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Julien Grall 2 years, 4 months ago
Hi,

On 08/12/2021 07:20, Jan Beulich wrote:
> On 07.12.2021 20:11, Julien Grall wrote:
>>
>>
>> On 07/12/2021 08:37, Michal Orzel wrote:
>>> Hi Julien,
>>
>> Hi,
>>
>>> On 06.12.2021 16:29, Julien Grall wrote:
>>>> Hi,
>>>>
>>>> On 06/12/2021 14:20, Michal Orzel wrote:
>>>>> to hypervisor when switching to AArch32 state.
>>>>>
>>> I will change to "from AArch32 state".
>>>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>>>> "If the general-purpose register was accessible from AArch32 state the
>>>>> upper 32 bits either become zero, or hold the value that the same
>>>>> architectural register held before any AArch32 execution.
>>>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>>>
>>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>>
>>> Ok.
>>>>>
>>>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>>>> needs to be fixed.
>>>>
>>>> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
>>>>
>>> I would say:
>>> "
>>> The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
>>> If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
>>> For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>>> if the command passed as the first argument was clobbered,
>>> "
>>>>>
>>>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>>>> entry to hypervisor when switching to AArch32 state.
>>>>>
>>>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>>>> when not passed.
>>>>
>>>> Which error do you see otherwise? Is it a compilation error?
>>>>
>>> Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
>>> So basically in all the places where entry is called with hyp=1.
>>> When taking the current patch and removing default value for compat you will get:
>>> ```
>>> entry.S:254: Error: ".endif" without ".if"
>>> entry.S:258: Error: symbol `.if' is already defined
>>> entry.S:258: Error: ".endif" without ".if"
>>> entry.S:262: Error: symbol `.if' is already defined
>>> entry.S:262: Error: ".endif" without ".if"
>>> entry.S:266: Error: symbol `.if' is already defined
>>> entry.S:266: Error: ".endif" without ".if"
>>> entry.S:278: Error: symbol `.if' is already defined
>>> entry.S:278: Error: ".endif" without ".if"
>>> entry.S:292: Error: symbol `.if' is already defined
>>> entry.S:292: Error: ".endif" without ".if"
>>> entry.S:317: Error: symbol `.if' is already defined
>>> entry.S:317: Error: ".endif" without ".if"
>>> ```
>>
>> Thanks for input. I am concerned with your suggested approach (or using
>> .if 0\compat as suggested by Jan) because they allow the caller to not
>> properly specify compat when hyp=0. The risk here is we may generate the
>> wrong entry.
>>
>> compat should need to be specified when hyp=1 as we will always run in
>> aarch64 mode. So could we protect this code with hyp=0?
> 
> Since my suggestion was only to avoid the need for specifying a default
> for the parameter (which you didn't seem to be happy about), it would
> then merely extend to
> 
> .if !0\hyp && 0\compat
Isn't it effectively the same as setting a default value?

The reason we seem to get away is because other part of the macro (e.g. 
entry_guest) will need compat to be valid.

But that seems pretty fragile to me. So I would prefer if the new code 
it added within a macro that is only called when hyp==0.

Cheers,

> 
> or something along those lines.
> 
> Jan
> 

-- 
Julien Grall

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Jan Beulich 2 years, 4 months ago
On 08.12.2021 10:55, Julien Grall wrote:
> Hi,
> 
> On 08/12/2021 07:20, Jan Beulich wrote:
>> On 07.12.2021 20:11, Julien Grall wrote:
>>>
>>>
>>> On 07/12/2021 08:37, Michal Orzel wrote:
>>>> Hi Julien,
>>>
>>> Hi,
>>>
>>>> On 06.12.2021 16:29, Julien Grall wrote:
>>>>> Hi,
>>>>>
>>>>> On 06/12/2021 14:20, Michal Orzel wrote:
>>>>>> to hypervisor when switching to AArch32 state.
>>>>>>
>>>> I will change to "from AArch32 state".
>>>>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>>>>> "If the general-purpose register was accessible from AArch32 state the
>>>>>> upper 32 bits either become zero, or hold the value that the same
>>>>>> architectural register held before any AArch32 execution.
>>>>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>>>>
>>>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>>>
>>>> Ok.
>>>>>>
>>>>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>>>>> needs to be fixed.
>>>>>
>>>>> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
>>>>>
>>>> I would say:
>>>> "
>>>> The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
>>>> If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
>>>> For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>>>> if the command passed as the first argument was clobbered,
>>>> "
>>>>>>
>>>>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>>>>> entry to hypervisor when switching to AArch32 state.
>>>>>>
>>>>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>>>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>>>>> when not passed.
>>>>>
>>>>> Which error do you see otherwise? Is it a compilation error?
>>>>>
>>>> Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
>>>> So basically in all the places where entry is called with hyp=1.
>>>> When taking the current patch and removing default value for compat you will get:
>>>> ```
>>>> entry.S:254: Error: ".endif" without ".if"
>>>> entry.S:258: Error: symbol `.if' is already defined
>>>> entry.S:258: Error: ".endif" without ".if"
>>>> entry.S:262: Error: symbol `.if' is already defined
>>>> entry.S:262: Error: ".endif" without ".if"
>>>> entry.S:266: Error: symbol `.if' is already defined
>>>> entry.S:266: Error: ".endif" without ".if"
>>>> entry.S:278: Error: symbol `.if' is already defined
>>>> entry.S:278: Error: ".endif" without ".if"
>>>> entry.S:292: Error: symbol `.if' is already defined
>>>> entry.S:292: Error: ".endif" without ".if"
>>>> entry.S:317: Error: symbol `.if' is already defined
>>>> entry.S:317: Error: ".endif" without ".if"
>>>> ```
>>>
>>> Thanks for input. I am concerned with your suggested approach (or using
>>> .if 0\compat as suggested by Jan) because they allow the caller to not
>>> properly specify compat when hyp=0. The risk here is we may generate the
>>> wrong entry.
>>>
>>> compat should need to be specified when hyp=1 as we will always run in
>>> aarch64 mode. So could we protect this code with hyp=0?
>>
>> Since my suggestion was only to avoid the need for specifying a default
>> for the parameter (which you didn't seem to be happy about), it would
>> then merely extend to
>>
>> .if !0\hyp && 0\compat
> Isn't it effectively the same as setting a default value?

Hmm, yes, it is.

Jan


Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Michal Orzel 2 years, 4 months ago
Hi Julien, Jan

On 08.12.2021 10:55, Julien Grall wrote:
> Hi,
> 
> On 08/12/2021 07:20, Jan Beulich wrote:
>> On 07.12.2021 20:11, Julien Grall wrote:
>>>
>>>
>>> On 07/12/2021 08:37, Michal Orzel wrote:
>>>> Hi Julien,
>>>
>>> Hi,
>>>
>>>> On 06.12.2021 16:29, Julien Grall wrote:
>>>>> Hi,
>>>>>
>>>>> On 06/12/2021 14:20, Michal Orzel wrote:
>>>>>> to hypervisor when switching to AArch32 state.
>>>>>>
>>>> I will change to "from AArch32 state".
>>>>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>>>>> "If the general-purpose register was accessible from AArch32 state the
>>>>>> upper 32 bits either become zero, or hold the value that the same
>>>>>> architectural register held before any AArch32 execution.
>>>>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>>>>
>>>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>>>
>>>> Ok.
>>>>>>
>>>>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>>>>> needs to be fixed.
>>>>>
>>>>> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
>>>>>
>>>> I would say:
>>>> "
>>>> The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
>>>> If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
>>>> For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>>>> if the command passed as the first argument was clobbered,
>>>> "
>>>>>>
>>>>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>>>>> entry to hypervisor when switching to AArch32 state.
>>>>>>
>>>>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>>>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>>>>> when not passed.
>>>>>
>>>>> Which error do you see otherwise? Is it a compilation error?
>>>>>
>>>> Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
>>>> So basically in all the places where entry is called with hyp=1.
>>>> When taking the current patch and removing default value for compat you will get:
>>>> ```
>>>> entry.S:254: Error: ".endif" without ".if"
>>>> entry.S:258: Error: symbol `.if' is already defined
>>>> entry.S:258: Error: ".endif" without ".if"
>>>> entry.S:262: Error: symbol `.if' is already defined
>>>> entry.S:262: Error: ".endif" without ".if"
>>>> entry.S:266: Error: symbol `.if' is already defined
>>>> entry.S:266: Error: ".endif" without ".if"
>>>> entry.S:278: Error: symbol `.if' is already defined
>>>> entry.S:278: Error: ".endif" without ".if"
>>>> entry.S:292: Error: symbol `.if' is already defined
>>>> entry.S:292: Error: ".endif" without ".if"
>>>> entry.S:317: Error: symbol `.if' is already defined
>>>> entry.S:317: Error: ".endif" without ".if"
>>>> ```
>>>
>>> Thanks for input. I am concerned with your suggested approach (or using
>>> .if 0\compat as suggested by Jan) because they allow the caller to not
>>> properly specify compat when hyp=0. The risk here is we may generate the
>>> wrong entry.
>>>
>>> compat should need to be specified when hyp=1 as we will always run in
>>> aarch64 mode. So could we protect this code with hyp=0?
>>
>> Since my suggestion was only to avoid the need for specifying a default
>> for the parameter (which you didn't seem to be happy about), it would
>> then merely extend to
>>
>> .if !0\hyp && 0\compat
> Isn't it effectively the same as setting a default value?
> 
> The reason we seem to get away is because other part of the macro (e.g. entry_guest) will need compat to be valid.
> 
> But that seems pretty fragile to me. So I would prefer if the new code it added within a macro that is only called when hyp==0.
> 
So you would like to have a macro that is called if hyp=0 (which means compat had to be passed) and inside this macro additional check if compat is 1?
> Cheers,
> 
>>
>> or something along those lines.
>>
>> Jan
>>
> 
So when it comes to zeroing the top 32bits by pushing zero to higher halves of stack slots I would do in a loop:
stp wzr, wzr, [sp #8 * 0]
stp wzr, wzr, [sp #8 * 1]
...

FWIK this would store 0 in the upper addresses. Am I correct?

Cheers,
Michal

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Julien Grall 2 years, 4 months ago

On 14/12/2021 09:17, Michal Orzel wrote:
> Hi Julien, Jan

Hi,

> On 08.12.2021 10:55, Julien Grall wrote:
>> Hi,
>>
>> On 08/12/2021 07:20, Jan Beulich wrote:
>>> On 07.12.2021 20:11, Julien Grall wrote:
>>>>
>>>>
>>>> On 07/12/2021 08:37, Michal Orzel wrote:
>>>>> Hi Julien,
>>>>
>>>> Hi,
>>>>
>>>>> On 06.12.2021 16:29, Julien Grall wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 06/12/2021 14:20, Michal Orzel wrote:
>>>>>>> to hypervisor when switching to AArch32 state.
>>>>>>>
>>>>> I will change to "from AArch32 state".
>>>>>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>>>>>> "If the general-purpose register was accessible from AArch32 state the
>>>>>>> upper 32 bits either become zero, or hold the value that the same
>>>>>>> architectural register held before any AArch32 execution.
>>>>>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>>>>>
>>>>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>>>>
>>>>> Ok.
>>>>>>>
>>>>>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>>>>>> needs to be fixed.
>>>>>>
>>>>>> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
>>>>>>
>>>>> I would say:
>>>>> "
>>>>> The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
>>>>> If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
>>>>> For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>>>>> if the command passed as the first argument was clobbered,
>>>>> "
>>>>>>>
>>>>>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>>>>>> entry to hypervisor when switching to AArch32 state.
>>>>>>>
>>>>>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>>>>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>>>>>> when not passed.
>>>>>>
>>>>>> Which error do you see otherwise? Is it a compilation error?
>>>>>>
>>>>> Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
>>>>> So basically in all the places where entry is called with hyp=1.
>>>>> When taking the current patch and removing default value for compat you will get:
>>>>> ```
>>>>> entry.S:254: Error: ".endif" without ".if"
>>>>> entry.S:258: Error: symbol `.if' is already defined
>>>>> entry.S:258: Error: ".endif" without ".if"
>>>>> entry.S:262: Error: symbol `.if' is already defined
>>>>> entry.S:262: Error: ".endif" without ".if"
>>>>> entry.S:266: Error: symbol `.if' is already defined
>>>>> entry.S:266: Error: ".endif" without ".if"
>>>>> entry.S:278: Error: symbol `.if' is already defined
>>>>> entry.S:278: Error: ".endif" without ".if"
>>>>> entry.S:292: Error: symbol `.if' is already defined
>>>>> entry.S:292: Error: ".endif" without ".if"
>>>>> entry.S:317: Error: symbol `.if' is already defined
>>>>> entry.S:317: Error: ".endif" without ".if"
>>>>> ```
>>>>
>>>> Thanks for input. I am concerned with your suggested approach (or using
>>>> .if 0\compat as suggested by Jan) because they allow the caller to not
>>>> properly specify compat when hyp=0. The risk here is we may generate the
>>>> wrong entry.
>>>>
>>>> compat should need to be specified when hyp=1 as we will always run in
>>>> aarch64 mode. So could we protect this code with hyp=0?
>>>
>>> Since my suggestion was only to avoid the need for specifying a default
>>> for the parameter (which you didn't seem to be happy about), it would
>>> then merely extend to
>>>
>>> .if !0\hyp && 0\compat
>> Isn't it effectively the same as setting a default value?
>>
>> The reason we seem to get away is because other part of the macro (e.g. entry_guest) will need compat to be valid.
>>
>> But that seems pretty fragile to me. So I would prefer if the new code it added within a macro that is only called when hyp==0.
>>
> So you would like to have a macro that is called if hyp=0 (which means compat had to be passed) and inside this macro additional check if compat is 1?

Yes. This is the only way I could think to avoid making 'compat'optional.

>> Cheers,
>>
>>>
>>> or something along those lines.
>>>
>>> Jan
>>>
>>
> So when it comes to zeroing the top 32bits by pushing zero to higher halves of stack slots I would do in a loop:
> stp wzr, wzr, [sp #8 * 0]
> stp wzr, wzr, [sp #8 * 1]
> ...

I don't think you can use stp here because this would store two 32-bit 
values consecutively. Instead, you would need to use ldr to store one 
32-bit value at the time.

Cheers,

-- 
Julien Grall

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Michal Orzel 2 years, 4 months ago
Hi Julien,

On 14.12.2021 10:33, Julien Grall wrote:
> 
> 
> On 14/12/2021 09:17, Michal Orzel wrote:
>> Hi Julien, Jan
> 
> Hi,
> 
>> On 08.12.2021 10:55, Julien Grall wrote:
>>> Hi,
>>>
>>> On 08/12/2021 07:20, Jan Beulich wrote:
>>>> On 07.12.2021 20:11, Julien Grall wrote:
>>>>>
>>>>>
>>>>> On 07/12/2021 08:37, Michal Orzel wrote:
>>>>>> Hi Julien,
>>>>>
>>>>> Hi,
>>>>>
>>>>>> On 06.12.2021 16:29, Julien Grall wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 06/12/2021 14:20, Michal Orzel wrote:
>>>>>>>> to hypervisor when switching to AArch32 state.
>>>>>>>>
>>>>>> I will change to "from AArch32 state".
>>>>>>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>>>>>>> "If the general-purpose register was accessible from AArch32 state the
>>>>>>>> upper 32 bits either become zero, or hold the value that the same
>>>>>>>> architectural register held before any AArch32 execution.
>>>>>>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>>>>>>
>>>>>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>>>>>
>>>>>> Ok.
>>>>>>>>
>>>>>>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>>>>>>> needs to be fixed.
>>>>>>>
>>>>>>> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
>>>>>>>
>>>>>> I would say:
>>>>>> "
>>>>>> The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
>>>>>> If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
>>>>>> For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>>>>>> if the command passed as the first argument was clobbered,
>>>>>> "
>>>>>>>>
>>>>>>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>>>>>>> entry to hypervisor when switching to AArch32 state.
>>>>>>>>
>>>>>>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>>>>>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>>>>>>> when not passed.
>>>>>>>
>>>>>>> Which error do you see otherwise? Is it a compilation error?
>>>>>>>
>>>>>> Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
>>>>>> So basically in all the places where entry is called with hyp=1.
>>>>>> When taking the current patch and removing default value for compat you will get:
>>>>>> ```
>>>>>> entry.S:254: Error: ".endif" without ".if"
>>>>>> entry.S:258: Error: symbol `.if' is already defined
>>>>>> entry.S:258: Error: ".endif" without ".if"
>>>>>> entry.S:262: Error: symbol `.if' is already defined
>>>>>> entry.S:262: Error: ".endif" without ".if"
>>>>>> entry.S:266: Error: symbol `.if' is already defined
>>>>>> entry.S:266: Error: ".endif" without ".if"
>>>>>> entry.S:278: Error: symbol `.if' is already defined
>>>>>> entry.S:278: Error: ".endif" without ".if"
>>>>>> entry.S:292: Error: symbol `.if' is already defined
>>>>>> entry.S:292: Error: ".endif" without ".if"
>>>>>> entry.S:317: Error: symbol `.if' is already defined
>>>>>> entry.S:317: Error: ".endif" without ".if"
>>>>>> ```
>>>>>
>>>>> Thanks for input. I am concerned with your suggested approach (or using
>>>>> .if 0\compat as suggested by Jan) because they allow the caller to not
>>>>> properly specify compat when hyp=0. The risk here is we may generate the
>>>>> wrong entry.
>>>>>
>>>>> compat should need to be specified when hyp=1 as we will always run in
>>>>> aarch64 mode. So could we protect this code with hyp=0?
>>>>
>>>> Since my suggestion was only to avoid the need for specifying a default
>>>> for the parameter (which you didn't seem to be happy about), it would
>>>> then merely extend to
>>>>
>>>> .if !0\hyp && 0\compat
>>> Isn't it effectively the same as setting a default value?
>>>
>>> The reason we seem to get away is because other part of the macro (e.g. entry_guest) will need compat to be valid.
>>>
>>> But that seems pretty fragile to me. So I would prefer if the new code it added within a macro that is only called when hyp==0.
>>>
>> So you would like to have a macro that is called if hyp=0 (which means compat had to be passed) and inside this macro additional check if compat is 1?
> 
> Yes. This is the only way I could think to avoid making 'compat'optional.
> 
>>> Cheers,
>>>
>>>>
>>>> or something along those lines.
>>>>
>>>> Jan
>>>>
>>>
>> So when it comes to zeroing the top 32bits by pushing zero to higher halves of stack slots I would do in a loop:
>> stp wzr, wzr, [sp #8 * 0]
>> stp wzr, wzr, [sp #8 * 1]
>> ...
> 
> I don't think you can use stp here because this would store two 32-bit values consecutively. Instead, you would need to use ldr to store one 32-bit value at the time.
> 
I hope you meant str and not ldr.
So a loop would look like that:
str wzr, [sp, #8 * 1]
str wzr, [sp, #8 * 3]
...
> Cheers,
> 

Cheers,
Michal

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Jan Beulich 2 years, 4 months ago
On 14.12.2021 10:51, Michal Orzel wrote:
> Hi Julien,
> 
> On 14.12.2021 10:33, Julien Grall wrote:
>>
>>
>> On 14/12/2021 09:17, Michal Orzel wrote:
>>> Hi Julien, Jan
>>
>> Hi,
>>
>>> On 08.12.2021 10:55, Julien Grall wrote:
>>>> Hi,
>>>>
>>>> On 08/12/2021 07:20, Jan Beulich wrote:
>>>>> On 07.12.2021 20:11, Julien Grall wrote:
>>>>>>
>>>>>>
>>>>>> On 07/12/2021 08:37, Michal Orzel wrote:
>>>>>>> Hi Julien,
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>>> On 06.12.2021 16:29, Julien Grall wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 06/12/2021 14:20, Michal Orzel wrote:
>>>>>>>>> to hypervisor when switching to AArch32 state.
>>>>>>>>>
>>>>>>> I will change to "from AArch32 state".
>>>>>>>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>>>>>>>> "If the general-purpose register was accessible from AArch32 state the
>>>>>>>>> upper 32 bits either become zero, or hold the value that the same
>>>>>>>>> architectural register held before any AArch32 execution.
>>>>>>>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>>>>>>>
>>>>>>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>>>>>>
>>>>>>> Ok.
>>>>>>>>>
>>>>>>>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>>>>>>>> needs to be fixed.
>>>>>>>>
>>>>>>>> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
>>>>>>>>
>>>>>>> I would say:
>>>>>>> "
>>>>>>> The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
>>>>>>> If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
>>>>>>> For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>>>>>>> if the command passed as the first argument was clobbered,
>>>>>>> "
>>>>>>>>>
>>>>>>>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>>>>>>>> entry to hypervisor when switching to AArch32 state.
>>>>>>>>>
>>>>>>>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>>>>>>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>>>>>>>> when not passed.
>>>>>>>>
>>>>>>>> Which error do you see otherwise? Is it a compilation error?
>>>>>>>>
>>>>>>> Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
>>>>>>> So basically in all the places where entry is called with hyp=1.
>>>>>>> When taking the current patch and removing default value for compat you will get:
>>>>>>> ```
>>>>>>> entry.S:254: Error: ".endif" without ".if"
>>>>>>> entry.S:258: Error: symbol `.if' is already defined
>>>>>>> entry.S:258: Error: ".endif" without ".if"
>>>>>>> entry.S:262: Error: symbol `.if' is already defined
>>>>>>> entry.S:262: Error: ".endif" without ".if"
>>>>>>> entry.S:266: Error: symbol `.if' is already defined
>>>>>>> entry.S:266: Error: ".endif" without ".if"
>>>>>>> entry.S:278: Error: symbol `.if' is already defined
>>>>>>> entry.S:278: Error: ".endif" without ".if"
>>>>>>> entry.S:292: Error: symbol `.if' is already defined
>>>>>>> entry.S:292: Error: ".endif" without ".if"
>>>>>>> entry.S:317: Error: symbol `.if' is already defined
>>>>>>> entry.S:317: Error: ".endif" without ".if"
>>>>>>> ```
>>>>>>
>>>>>> Thanks for input. I am concerned with your suggested approach (or using
>>>>>> .if 0\compat as suggested by Jan) because they allow the caller to not
>>>>>> properly specify compat when hyp=0. The risk here is we may generate the
>>>>>> wrong entry.
>>>>>>
>>>>>> compat should need to be specified when hyp=1 as we will always run in
>>>>>> aarch64 mode. So could we protect this code with hyp=0?
>>>>>
>>>>> Since my suggestion was only to avoid the need for specifying a default
>>>>> for the parameter (which you didn't seem to be happy about), it would
>>>>> then merely extend to
>>>>>
>>>>> .if !0\hyp && 0\compat
>>>> Isn't it effectively the same as setting a default value?
>>>>
>>>> The reason we seem to get away is because other part of the macro (e.g. entry_guest) will need compat to be valid.
>>>>
>>>> But that seems pretty fragile to me. So I would prefer if the new code it added within a macro that is only called when hyp==0.
>>>>
>>> So you would like to have a macro that is called if hyp=0 (which means compat had to be passed) and inside this macro additional check if compat is 1?
>>
>> Yes. This is the only way I could think to avoid making 'compat'optional.
>>
>>>> Cheers,
>>>>
>>>>>
>>>>> or something along those lines.
>>>>>
>>>>> Jan
>>>>>
>>>>
>>> So when it comes to zeroing the top 32bits by pushing zero to higher halves of stack slots I would do in a loop:
>>> stp wzr, wzr, [sp #8 * 0]
>>> stp wzr, wzr, [sp #8 * 1]
>>> ...
>>
>> I don't think you can use stp here because this would store two 32-bit values consecutively. Instead, you would need to use ldr to store one 32-bit value at the time.
>>
> I hope you meant str and not ldr.
> So a loop would look like that:
> str wzr, [sp, #8 * 1]
> str wzr, [sp, #8 * 3]
> ...

Why "a loop" and why #8 (I'd have expected #4)?

There's another oddity which I'm noticing only now, but which also
may look odd to me only because I lack sufficient Arm details: On
x86, it would not be advisable to store anything below the stack
pointer (like is done here when storing x0 and x1 early), unless
it's absolutely certain that no further interruptions could clobber
that part of the stack.

Jan


Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Michal Orzel 2 years, 4 months ago

On 14.12.2021 11:01, Jan Beulich wrote:
> On 14.12.2021 10:51, Michal Orzel wrote:
>> Hi Julien,
>>
>> On 14.12.2021 10:33, Julien Grall wrote:
>>>
>>>
>>> On 14/12/2021 09:17, Michal Orzel wrote:
>>>> Hi Julien, Jan
>>>
>>> Hi,
>>>
>>>> On 08.12.2021 10:55, Julien Grall wrote:
>>>>> Hi,
>>>>>
>>>>> On 08/12/2021 07:20, Jan Beulich wrote:
>>>>>> On 07.12.2021 20:11, Julien Grall wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 07/12/2021 08:37, Michal Orzel wrote:
>>>>>>>> Hi Julien,
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>>> On 06.12.2021 16:29, Julien Grall wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> On 06/12/2021 14:20, Michal Orzel wrote:
>>>>>>>>>> to hypervisor when switching to AArch32 state.
>>>>>>>>>>
>>>>>>>> I will change to "from AArch32 state".
>>>>>>>>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>>>>>>>>> "If the general-purpose register was accessible from AArch32 state the
>>>>>>>>>> upper 32 bits either become zero, or hold the value that the same
>>>>>>>>>> architectural register held before any AArch32 execution.
>>>>>>>>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>>>>>>>>
>>>>>>>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>>>>>>>
>>>>>>>> Ok.
>>>>>>>>>>
>>>>>>>>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>>>>>>>>> needs to be fixed.
>>>>>>>>>
>>>>>>>>> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
>>>>>>>>>
>>>>>>>> I would say:
>>>>>>>> "
>>>>>>>> The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
>>>>>>>> If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
>>>>>>>> For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>>>>>>>> if the command passed as the first argument was clobbered,
>>>>>>>> "
>>>>>>>>>>
>>>>>>>>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>>>>>>>>> entry to hypervisor when switching to AArch32 state.
>>>>>>>>>>
>>>>>>>>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>>>>>>>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>>>>>>>>> when not passed.
>>>>>>>>>
>>>>>>>>> Which error do you see otherwise? Is it a compilation error?
>>>>>>>>>
>>>>>>>> Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
>>>>>>>> So basically in all the places where entry is called with hyp=1.
>>>>>>>> When taking the current patch and removing default value for compat you will get:
>>>>>>>> ```
>>>>>>>> entry.S:254: Error: ".endif" without ".if"
>>>>>>>> entry.S:258: Error: symbol `.if' is already defined
>>>>>>>> entry.S:258: Error: ".endif" without ".if"
>>>>>>>> entry.S:262: Error: symbol `.if' is already defined
>>>>>>>> entry.S:262: Error: ".endif" without ".if"
>>>>>>>> entry.S:266: Error: symbol `.if' is already defined
>>>>>>>> entry.S:266: Error: ".endif" without ".if"
>>>>>>>> entry.S:278: Error: symbol `.if' is already defined
>>>>>>>> entry.S:278: Error: ".endif" without ".if"
>>>>>>>> entry.S:292: Error: symbol `.if' is already defined
>>>>>>>> entry.S:292: Error: ".endif" without ".if"
>>>>>>>> entry.S:317: Error: symbol `.if' is already defined
>>>>>>>> entry.S:317: Error: ".endif" without ".if"
>>>>>>>> ```
>>>>>>>
>>>>>>> Thanks for input. I am concerned with your suggested approach (or using
>>>>>>> .if 0\compat as suggested by Jan) because they allow the caller to not
>>>>>>> properly specify compat when hyp=0. The risk here is we may generate the
>>>>>>> wrong entry.
>>>>>>>
>>>>>>> compat should need to be specified when hyp=1 as we will always run in
>>>>>>> aarch64 mode. So could we protect this code with hyp=0?
>>>>>>
>>>>>> Since my suggestion was only to avoid the need for specifying a default
>>>>>> for the parameter (which you didn't seem to be happy about), it would
>>>>>> then merely extend to
>>>>>>
>>>>>> .if !0\hyp && 0\compat
>>>>> Isn't it effectively the same as setting a default value?
>>>>>
>>>>> The reason we seem to get away is because other part of the macro (e.g. entry_guest) will need compat to be valid.
>>>>>
>>>>> But that seems pretty fragile to me. So I would prefer if the new code it added within a macro that is only called when hyp==0.
>>>>>
>>>> So you would like to have a macro that is called if hyp=0 (which means compat had to be passed) and inside this macro additional check if compat is 1?
>>>
>>> Yes. This is the only way I could think to avoid making 'compat'optional.
>>>
>>>>> Cheers,
>>>>>
>>>>>>
>>>>>> or something along those lines.
>>>>>>
>>>>>> Jan
>>>>>>
>>>>>
>>>> So when it comes to zeroing the top 32bits by pushing zero to higher halves of stack slots I would do in a loop:
>>>> stp wzr, wzr, [sp #8 * 0]
>>>> stp wzr, wzr, [sp #8 * 1]
>>>> ...
>>>
>>> I don't think you can use stp here because this would store two 32-bit values consecutively. Instead, you would need to use ldr to store one 32-bit value at the time.
>>>
>> I hope you meant str and not ldr.
>> So a loop would look like that:
>> str wzr, [sp, #8 * 1]
>> str wzr, [sp, #8 * 3]
>> ...
> 
> Why "a loop" and why #8 (I'd have expected #4)?
> 
You are right. I confused it with stp. #4 is correct.

> There's another oddity which I'm noticing only now, but which also
> may look odd to me only because I lack sufficient Arm details: On
> x86, it would not be advisable to store anything below the stack
> pointer (like is done here when storing x0 and x1 early), unless
> it's absolutely certain that no further interruptions could clobber
> that part of the stack.
> 
I cannot answer this question. Sorry.

> Jan
> 
Cheers

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Julien Grall 2 years, 4 months ago
Hi,

Replying in one e-mail the comments from Jan and Michal.

On 14/12/2021 10:01, Jan Beulich wrote:
> On 14.12.2021 10:51, Michal Orzel wrote:
>> Hi Julien,
>>
>> On 14.12.2021 10:33, Julien Grall wrote:
>>>
>>>
>>> On 14/12/2021 09:17, Michal Orzel wrote:
>>>> Hi Julien, Jan
>>>
>>> Hi,
>>>
>>>> On 08.12.2021 10:55, Julien Grall wrote:
>>>>> Hi,
>>>>>
>>>>> On 08/12/2021 07:20, Jan Beulich wrote:
>>>>>> On 07.12.2021 20:11, Julien Grall wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 07/12/2021 08:37, Michal Orzel wrote:
>>>>>>>> Hi Julien,
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>>> On 06.12.2021 16:29, Julien Grall wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> On 06/12/2021 14:20, Michal Orzel wrote:
>>>>>>>>>> to hypervisor when switching to AArch32 state.
>>>>>>>>>>
>>>>>>>> I will change to "from AArch32 state".
>>>>>>>>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>>>>>>>>> "If the general-purpose register was accessible from AArch32 state the
>>>>>>>>>> upper 32 bits either become zero, or hold the value that the same
>>>>>>>>>> architectural register held before any AArch32 execution.
>>>>>>>>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>>>>>>>>
>>>>>>>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>>>>>>>
>>>>>>>> Ok.
>>>>>>>>>>
>>>>>>>>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>>>>>>>>> needs to be fixed.
>>>>>>>>>
>>>>>>>>> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
>>>>>>>>>
>>>>>>>> I would say:
>>>>>>>> "
>>>>>>>> The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
>>>>>>>> If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
>>>>>>>> For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>>>>>>>> if the command passed as the first argument was clobbered,
>>>>>>>> "
>>>>>>>>>>
>>>>>>>>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>>>>>>>>> entry to hypervisor when switching to AArch32 state.
>>>>>>>>>>
>>>>>>>>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>>>>>>>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>>>>>>>>> when not passed.
>>>>>>>>>
>>>>>>>>> Which error do you see otherwise? Is it a compilation error?
>>>>>>>>>
>>>>>>>> Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
>>>>>>>> So basically in all the places where entry is called with hyp=1.
>>>>>>>> When taking the current patch and removing default value for compat you will get:
>>>>>>>> ```
>>>>>>>> entry.S:254: Error: ".endif" without ".if"
>>>>>>>> entry.S:258: Error: symbol `.if' is already defined
>>>>>>>> entry.S:258: Error: ".endif" without ".if"
>>>>>>>> entry.S:262: Error: symbol `.if' is already defined
>>>>>>>> entry.S:262: Error: ".endif" without ".if"
>>>>>>>> entry.S:266: Error: symbol `.if' is already defined
>>>>>>>> entry.S:266: Error: ".endif" without ".if"
>>>>>>>> entry.S:278: Error: symbol `.if' is already defined
>>>>>>>> entry.S:278: Error: ".endif" without ".if"
>>>>>>>> entry.S:292: Error: symbol `.if' is already defined
>>>>>>>> entry.S:292: Error: ".endif" without ".if"
>>>>>>>> entry.S:317: Error: symbol `.if' is already defined
>>>>>>>> entry.S:317: Error: ".endif" without ".if"
>>>>>>>> ```
>>>>>>>
>>>>>>> Thanks for input. I am concerned with your suggested approach (or using
>>>>>>> .if 0\compat as suggested by Jan) because they allow the caller to not
>>>>>>> properly specify compat when hyp=0. The risk here is we may generate the
>>>>>>> wrong entry.
>>>>>>>
>>>>>>> compat should need to be specified when hyp=1 as we will always run in
>>>>>>> aarch64 mode. So could we protect this code with hyp=0?
>>>>>>
>>>>>> Since my suggestion was only to avoid the need for specifying a default
>>>>>> for the parameter (which you didn't seem to be happy about), it would
>>>>>> then merely extend to
>>>>>>
>>>>>> .if !0\hyp && 0\compat
>>>>> Isn't it effectively the same as setting a default value?
>>>>>
>>>>> The reason we seem to get away is because other part of the macro (e.g. entry_guest) will need compat to be valid.
>>>>>
>>>>> But that seems pretty fragile to me. So I would prefer if the new code it added within a macro that is only called when hyp==0.
>>>>>
>>>> So you would like to have a macro that is called if hyp=0 (which means compat had to be passed) and inside this macro additional check if compat is 1?
>>>
>>> Yes. This is the only way I could think to avoid making 'compat'optional.
>>>
>>>>> Cheers,
>>>>>
>>>>>>
>>>>>> or something along those lines.
>>>>>>
>>>>>> Jan
>>>>>>
>>>>>
>>>> So when it comes to zeroing the top 32bits by pushing zero to higher halves of stack slots I would do in a loop:
>>>> stp wzr, wzr, [sp #8 * 0]
>>>> stp wzr, wzr, [sp #8 * 1]
>>>> ...
>>>
>>> I don't think you can use stp here because this would store two 32-bit values consecutively. Instead, you would need to use ldr to store one 32-bit value at the time.
>>>
>> I hope you meant str and not ldr.

Yes. I am not sure why I wrote ldr.

>> So a loop would look like that:
>> str wzr, [sp, #8 * 1]
>> str wzr, [sp, #8 * 3]
>> ...
> 
> Why "a loop" and why #8 (I'd have expected #4)?
> 
> There's another oddity which I'm noticing only now, but which also
> may look odd to me only because I lack sufficient Arm details: On
> x86, it would not be advisable to store anything below the stack
> pointer (like is done here when storing x0 and x1 early), unless
> it's absolutely certain that no further interruptions could clobber
> that part of the stack.

We are entering the hypervisor with both Interrupts and SErrors masked. 
They will only be unmasked after the guest registers have been saved on 
the stack.

You may still receive a Data Abort before the macro 'entry' has 
completed. But this is going to result to an hypervisor crash because 
they are not meant to happen in those paths.

So I believe, we are safe to modify sp before.

Cheers,

-- 
Julien Grall

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Julien Grall 2 years, 4 months ago

On 14/12/2021 11:01, Julien Grall wrote:
> Hi,
> 
> Replying in one e-mail the comments from Jan and Michal.
> 
> On 14/12/2021 10:01, Jan Beulich wrote:
>> On 14.12.2021 10:51, Michal Orzel wrote:
>>> Hi Julien,
>>>
>>> On 14.12.2021 10:33, Julien Grall wrote:
>>>>
>>>>
>>>> On 14/12/2021 09:17, Michal Orzel wrote:
>>>>> Hi Julien, Jan
>>>>
>>>> Hi,
>>>>
>>>>> On 08.12.2021 10:55, Julien Grall wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 08/12/2021 07:20, Jan Beulich wrote:
>>>>>>> On 07.12.2021 20:11, Julien Grall wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 07/12/2021 08:37, Michal Orzel wrote:
>>>>>>>>> Hi Julien,
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>>> On 06.12.2021 16:29, Julien Grall wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> On 06/12/2021 14:20, Michal Orzel wrote:
>>>>>>>>>>> to hypervisor when switching to AArch32 state.
>>>>>>>>>>>
>>>>>>>>> I will change to "from AArch32 state".
>>>>>>>>>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>>>>>>>>>> "If the general-purpose register was accessible from AArch32 
>>>>>>>>>>> state the
>>>>>>>>>>> upper 32 bits either become zero, or hold the value that the 
>>>>>>>>>>> same
>>>>>>>>>>> architectural register held before any AArch32 execution.
>>>>>>>>>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>>>>>>>>>
>>>>>>>>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>>>>>>>>
>>>>>>>>> Ok.
>>>>>>>>>>>
>>>>>>>>>>> Currently Xen does not ensure that the top 32 bits are zeroed 
>>>>>>>>>>> and this
>>>>>>>>>>> needs to be fixed.
>>>>>>>>>>
>>>>>>>>>> Can you outline why this is a problem and why we need to 
>>>>>>>>>> protect? IIRC, the main concern is Xen may misinterpret what 
>>>>>>>>>> the guest requested but we are not concerned about Xen using 
>>>>>>>>>> wrong value.
>>>>>>>>>>
>>>>>>>>> I would say:
>>>>>>>>> "
>>>>>>>>> The reason why this is a problem is that there are places in 
>>>>>>>>> Xen where we assume that top 32bits are zero for AArch32 guests.
>>>>>>>>> If they are not, this can lead to misinterpretation of Xen 
>>>>>>>>> regarding what the guest requested.
>>>>>>>>> For example hypercalls returning an error encoded in a signed 
>>>>>>>>> long like do_sched_op, do_hmv_op, do_memory_op would return 
>>>>>>>>> -ENOSYS
>>>>>>>>> if the command passed as the first argument was clobbered,
>>>>>>>>> "
>>>>>>>>>>>
>>>>>>>>>>> Fix this bug by zeroing the upper 32 bits of these registers 
>>>>>>>>>>> on an
>>>>>>>>>>> entry to hypervisor when switching to AArch32 state.
>>>>>>>>>>>
>>>>>>>>>>> Set default value of parameter compat of macro entry to 0 
>>>>>>>>>>> (AArch64 mode
>>>>>>>>>>> as we are on 64-bit hypervisor) to avoid checking if 
>>>>>>>>>>> parameter is blank
>>>>>>>>>>> when not passed.
>>>>>>>>>>
>>>>>>>>>> Which error do you see otherwise? Is it a compilation error?
>>>>>>>>>>
>>>>>>>>> Yes, this is a compilation error. The errors appear at each 
>>>>>>>>> line when "entry" is called without passing value for "compat".
>>>>>>>>> So basically in all the places where entry is called with hyp=1.
>>>>>>>>> When taking the current patch and removing default value for 
>>>>>>>>> compat you will get:
>>>>>>>>> ```
>>>>>>>>> entry.S:254: Error: ".endif" without ".if"
>>>>>>>>> entry.S:258: Error: symbol `.if' is already defined
>>>>>>>>> entry.S:258: Error: ".endif" without ".if"
>>>>>>>>> entry.S:262: Error: symbol `.if' is already defined
>>>>>>>>> entry.S:262: Error: ".endif" without ".if"
>>>>>>>>> entry.S:266: Error: symbol `.if' is already defined
>>>>>>>>> entry.S:266: Error: ".endif" without ".if"
>>>>>>>>> entry.S:278: Error: symbol `.if' is already defined
>>>>>>>>> entry.S:278: Error: ".endif" without ".if"
>>>>>>>>> entry.S:292: Error: symbol `.if' is already defined
>>>>>>>>> entry.S:292: Error: ".endif" without ".if"
>>>>>>>>> entry.S:317: Error: symbol `.if' is already defined
>>>>>>>>> entry.S:317: Error: ".endif" without ".if"
>>>>>>>>> ```
>>>>>>>>
>>>>>>>> Thanks for input. I am concerned with your suggested approach 
>>>>>>>> (or using
>>>>>>>> .if 0\compat as suggested by Jan) because they allow the caller 
>>>>>>>> to not
>>>>>>>> properly specify compat when hyp=0. The risk here is we may 
>>>>>>>> generate the
>>>>>>>> wrong entry.
>>>>>>>>
>>>>>>>> compat should need to be specified when hyp=1 as we will always 
>>>>>>>> run in
>>>>>>>> aarch64 mode. So could we protect this code with hyp=0?
>>>>>>>
>>>>>>> Since my suggestion was only to avoid the need for specifying a 
>>>>>>> default
>>>>>>> for the parameter (which you didn't seem to be happy about), it 
>>>>>>> would
>>>>>>> then merely extend to
>>>>>>>
>>>>>>> .if !0\hyp && 0\compat
>>>>>> Isn't it effectively the same as setting a default value?
>>>>>>
>>>>>> The reason we seem to get away is because other part of the macro 
>>>>>> (e.g. entry_guest) will need compat to be valid.
>>>>>>
>>>>>> But that seems pretty fragile to me. So I would prefer if the new 
>>>>>> code it added within a macro that is only called when hyp==0.
>>>>>>
>>>>> So you would like to have a macro that is called if hyp=0 (which 
>>>>> means compat had to be passed) and inside this macro additional 
>>>>> check if compat is 1?
>>>>
>>>> Yes. This is the only way I could think to avoid making 
>>>> 'compat'optional.
>>>>
>>>>>> Cheers,
>>>>>>
>>>>>>>
>>>>>>> or something along those lines.
>>>>>>>
>>>>>>> Jan
>>>>>>>
>>>>>>
>>>>> So when it comes to zeroing the top 32bits by pushing zero to 
>>>>> higher halves of stack slots I would do in a loop:
>>>>> stp wzr, wzr, [sp #8 * 0]
>>>>> stp wzr, wzr, [sp #8 * 1]
>>>>> ...
>>>>
>>>> I don't think you can use stp here because this would store two 
>>>> 32-bit values consecutively. Instead, you would need to use ldr to 
>>>> store one 32-bit value at the time.
>>>>
>>> I hope you meant str and not ldr.
> 
> Yes. I am not sure why I wrote ldr.
> 
>>> So a loop would look like that:
>>> str wzr, [sp, #8 * 1]
>>> str wzr, [sp, #8 * 3]
>>> ...
>>
>> Why "a loop" and why #8 (I'd have expected #4)?
>>
>> There's another oddity which I'm noticing only now, but which also
>> may look odd to me only because I lack sufficient Arm details: On
>> x86, it would not be advisable to store anything below the stack
>> pointer (like is done here when storing x0 and x1 early), unless
>> it's absolutely certain that no further interruptions could clobber
>> that part of the stack.
> 
> We are entering the hypervisor with both Interrupts and SErrors masked. 
> They will only be unmasked after the guest registers have been saved on 
> the stack.
> 
> You may still receive a Data Abort before the macro 'entry' has 
> completed. But this is going to result to an hypervisor crash because 
> they are not meant to happen in those paths.
> 
> So I believe, we are safe to modify sp before.

Hmmm... I meant to write on the stack before sp is modified.

Cheers,

-- 
Julien Grall

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Michal Orzel 2 years, 4 months ago
Replying to both Julien and Jan,

On 14.12.2021 12:30, Julien Grall wrote:
> 
> 
> On 14/12/2021 11:01, Julien Grall wrote:
>> Hi,
>>
>> Replying in one e-mail the comments from Jan and Michal.
>>
>> On 14/12/2021 10:01, Jan Beulich wrote:
>>> On 14.12.2021 10:51, Michal Orzel wrote:
>>>> Hi Julien,
>>>>
>>>> On 14.12.2021 10:33, Julien Grall wrote:
>>>>>
>>>>>
>>>>> On 14/12/2021 09:17, Michal Orzel wrote:
>>>>>> Hi Julien, Jan
>>>>>
>>>>> Hi,
>>>>>
>>>>>> On 08.12.2021 10:55, Julien Grall wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 08/12/2021 07:20, Jan Beulich wrote:
>>>>>>>> On 07.12.2021 20:11, Julien Grall wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 07/12/2021 08:37, Michal Orzel wrote:
>>>>>>>>>> Hi Julien,
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>>> On 06.12.2021 16:29, Julien Grall wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> On 06/12/2021 14:20, Michal Orzel wrote:
>>>>>>>>>>>> to hypervisor when switching to AArch32 state.
>>>>>>>>>>>>
>>>>>>>>>> I will change to "from AArch32 state".
>>>>>>>>>>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>>>>>>>>>>> "If the general-purpose register was accessible from AArch32 state the
>>>>>>>>>>>> upper 32 bits either become zero, or hold the value that the same
>>>>>>>>>>>> architectural register held before any AArch32 execution.
>>>>>>>>>>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>>>>>>>>>>
>>>>>>>>>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>>>>>>>>>
>>>>>>>>>> Ok.
>>>>>>>>>>>>
>>>>>>>>>>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>>>>>>>>>>> needs to be fixed.
>>>>>>>>>>>
>>>>>>>>>>> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
>>>>>>>>>>>
>>>>>>>>>> I would say:
>>>>>>>>>> "
>>>>>>>>>> The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
>>>>>>>>>> If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
>>>>>>>>>> For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>>>>>>>>>> if the command passed as the first argument was clobbered,
>>>>>>>>>> "
>>>>>>>>>>>>
>>>>>>>>>>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>>>>>>>>>>> entry to hypervisor when switching to AArch32 state.
>>>>>>>>>>>>
>>>>>>>>>>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>>>>>>>>>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>>>>>>>>>>> when not passed.
>>>>>>>>>>>
>>>>>>>>>>> Which error do you see otherwise? Is it a compilation error?
>>>>>>>>>>>
>>>>>>>>>> Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
>>>>>>>>>> So basically in all the places where entry is called with hyp=1.
>>>>>>>>>> When taking the current patch and removing default value for compat you will get:
>>>>>>>>>> ```
>>>>>>>>>> entry.S:254: Error: ".endif" without ".if"
>>>>>>>>>> entry.S:258: Error: symbol `.if' is already defined
>>>>>>>>>> entry.S:258: Error: ".endif" without ".if"
>>>>>>>>>> entry.S:262: Error: symbol `.if' is already defined
>>>>>>>>>> entry.S:262: Error: ".endif" without ".if"
>>>>>>>>>> entry.S:266: Error: symbol `.if' is already defined
>>>>>>>>>> entry.S:266: Error: ".endif" without ".if"
>>>>>>>>>> entry.S:278: Error: symbol `.if' is already defined
>>>>>>>>>> entry.S:278: Error: ".endif" without ".if"
>>>>>>>>>> entry.S:292: Error: symbol `.if' is already defined
>>>>>>>>>> entry.S:292: Error: ".endif" without ".if"
>>>>>>>>>> entry.S:317: Error: symbol `.if' is already defined
>>>>>>>>>> entry.S:317: Error: ".endif" without ".if"
>>>>>>>>>> ```
>>>>>>>>>
>>>>>>>>> Thanks for input. I am concerned with your suggested approach (or using
>>>>>>>>> .if 0\compat as suggested by Jan) because they allow the caller to not
>>>>>>>>> properly specify compat when hyp=0. The risk here is we may generate the
>>>>>>>>> wrong entry.
>>>>>>>>>
>>>>>>>>> compat should need to be specified when hyp=1 as we will always run in
>>>>>>>>> aarch64 mode. So could we protect this code with hyp=0?
>>>>>>>>
>>>>>>>> Since my suggestion was only to avoid the need for specifying a default
>>>>>>>> for the parameter (which you didn't seem to be happy about), it would
>>>>>>>> then merely extend to
>>>>>>>>
>>>>>>>> .if !0\hyp && 0\compat
>>>>>>> Isn't it effectively the same as setting a default value?
>>>>>>>
>>>>>>> The reason we seem to get away is because other part of the macro (e.g. entry_guest) will need compat to be valid.
>>>>>>>
>>>>>>> But that seems pretty fragile to me. So I would prefer if the new code it added within a macro that is only called when hyp==0.
>>>>>>>
>>>>>> So you would like to have a macro that is called if hyp=0 (which means compat had to be passed) and inside this macro additional check if compat is 1?
>>>>>
>>>>> Yes. This is the only way I could think to avoid making 'compat'optional.
>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>>>
>>>>>>>> or something along those lines.
>>>>>>>>
>>>>>>>> Jan
>>>>>>>>
>>>>>>>
>>>>>> So when it comes to zeroing the top 32bits by pushing zero to higher halves of stack slots I would do in a loop:
>>>>>> stp wzr, wzr, [sp #8 * 0]
>>>>>> stp wzr, wzr, [sp #8 * 1]
>>>>>> ...
>>>>>
>>>>> I don't think you can use stp here because this would store two 32-bit values consecutively. Instead, you would need to use ldr to store one 32-bit value at the time.
>>>>>
>>>> I hope you meant str and not ldr.
>>
>> Yes. I am not sure why I wrote ldr.
>>
>>>> So a loop would look like that:
>>>> str wzr, [sp, #8 * 1]
>>>> str wzr, [sp, #8 * 3]
>>>> ...
>>>
>>> Why "a loop" and why #8 (I'd have expected #4)?
>>>
>>> There's another oddity which I'm noticing only now, but which also
>>> may look odd to me only because I lack sufficient Arm details: On
>>> x86, it would not be advisable to store anything below the stack
>>> pointer (like is done here when storing x0 and x1 early), unless
>>> it's absolutely certain that no further interruptions could clobber
>>> that part of the stack.
>>
>> We are entering the hypervisor with both Interrupts and SErrors masked. They will only be unmasked after the guest registers have been saved on the stack.
>>
>> You may still receive a Data Abort before the macro 'entry' has completed. But this is going to result to an hypervisor crash because they are not meant to happen in those paths.
>>
>> So I believe, we are safe to modify sp before.
> 
> Hmmm... I meant to write on the stack before sp is modified.
> 
> Cheers,
> 

I would like to summarize what we discussed before pushing v2.
Changes since v1:
-update commit message adding information why do we need to zero top 32bits
-zero corresponding stack slots instead of zeroing directly gp registers
-create a macro called by entry, protected by if hyp=0. In macro add if compat=1

Now when it comes to implementation.

1. Regarding save_x0_x1, it is 0 only for guest_sync_slowpath, which is called by guest_sync.
So as we are dealing only with compat=1, save_x0_x1 cannot be 0.
The conclusion is that we do not need to worry about it.

2. Regarding clearing high halves of stack slots.
The new macro (called zero_stack_top_halves) will be called in entry before the first instruction sub sp,sp.
To avoid saving sp position/moving it, the simplest would be to execute 30 times:
str wzr, [sp, #-(UREGS_kernel_sizeof - 4)]
str wzr, [sp, #-(UREGS_kernel_sizeof - 12)]
...
I could also use .irp loop like (.irp n,1,3,5,7,...) and then:
str wzr, [sp, #-(UREGS_kernel_sizeof - (4 * n))]
but FWIK Jan does not like loops :)

Let me know what u think.

Cheers,
Michal

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Jan Beulich 2 years, 4 months ago
On 15.12.2021 10:27, Michal Orzel wrote:
> Replying to both Julien and Jan,
> 
> On 14.12.2021 12:30, Julien Grall wrote:
>>
>>
>> On 14/12/2021 11:01, Julien Grall wrote:
>>> Hi,
>>>
>>> Replying in one e-mail the comments from Jan and Michal.
>>>
>>> On 14/12/2021 10:01, Jan Beulich wrote:
>>>> On 14.12.2021 10:51, Michal Orzel wrote:
>>>>> Hi Julien,
>>>>>
>>>>> On 14.12.2021 10:33, Julien Grall wrote:
>>>>>>
>>>>>>
>>>>>> On 14/12/2021 09:17, Michal Orzel wrote:
>>>>>>> Hi Julien, Jan
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>>> On 08.12.2021 10:55, Julien Grall wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 08/12/2021 07:20, Jan Beulich wrote:
>>>>>>>>> On 07.12.2021 20:11, Julien Grall wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 07/12/2021 08:37, Michal Orzel wrote:
>>>>>>>>>>> Hi Julien,
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>>> On 06.12.2021 16:29, Julien Grall wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> On 06/12/2021 14:20, Michal Orzel wrote:
>>>>>>>>>>>>> to hypervisor when switching to AArch32 state.
>>>>>>>>>>>>>
>>>>>>>>>>> I will change to "from AArch32 state".
>>>>>>>>>>>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>>>>>>>>>>>> "If the general-purpose register was accessible from AArch32 state the
>>>>>>>>>>>>> upper 32 bits either become zero, or hold the value that the same
>>>>>>>>>>>>> architectural register held before any AArch32 execution.
>>>>>>>>>>>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>>>>>>>>>>>
>>>>>>>>>>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>>>>>>>>>>
>>>>>>>>>>> Ok.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>>>>>>>>>>>> needs to be fixed.
>>>>>>>>>>>>
>>>>>>>>>>>> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
>>>>>>>>>>>>
>>>>>>>>>>> I would say:
>>>>>>>>>>> "
>>>>>>>>>>> The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
>>>>>>>>>>> If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
>>>>>>>>>>> For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>>>>>>>>>>> if the command passed as the first argument was clobbered,
>>>>>>>>>>> "
>>>>>>>>>>>>>
>>>>>>>>>>>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>>>>>>>>>>>> entry to hypervisor when switching to AArch32 state.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>>>>>>>>>>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>>>>>>>>>>>> when not passed.
>>>>>>>>>>>>
>>>>>>>>>>>> Which error do you see otherwise? Is it a compilation error?
>>>>>>>>>>>>
>>>>>>>>>>> Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
>>>>>>>>>>> So basically in all the places where entry is called with hyp=1.
>>>>>>>>>>> When taking the current patch and removing default value for compat you will get:
>>>>>>>>>>> ```
>>>>>>>>>>> entry.S:254: Error: ".endif" without ".if"
>>>>>>>>>>> entry.S:258: Error: symbol `.if' is already defined
>>>>>>>>>>> entry.S:258: Error: ".endif" without ".if"
>>>>>>>>>>> entry.S:262: Error: symbol `.if' is already defined
>>>>>>>>>>> entry.S:262: Error: ".endif" without ".if"
>>>>>>>>>>> entry.S:266: Error: symbol `.if' is already defined
>>>>>>>>>>> entry.S:266: Error: ".endif" without ".if"
>>>>>>>>>>> entry.S:278: Error: symbol `.if' is already defined
>>>>>>>>>>> entry.S:278: Error: ".endif" without ".if"
>>>>>>>>>>> entry.S:292: Error: symbol `.if' is already defined
>>>>>>>>>>> entry.S:292: Error: ".endif" without ".if"
>>>>>>>>>>> entry.S:317: Error: symbol `.if' is already defined
>>>>>>>>>>> entry.S:317: Error: ".endif" without ".if"
>>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>> Thanks for input. I am concerned with your suggested approach (or using
>>>>>>>>>> .if 0\compat as suggested by Jan) because they allow the caller to not
>>>>>>>>>> properly specify compat when hyp=0. The risk here is we may generate the
>>>>>>>>>> wrong entry.
>>>>>>>>>>
>>>>>>>>>> compat should need to be specified when hyp=1 as we will always run in
>>>>>>>>>> aarch64 mode. So could we protect this code with hyp=0?
>>>>>>>>>
>>>>>>>>> Since my suggestion was only to avoid the need for specifying a default
>>>>>>>>> for the parameter (which you didn't seem to be happy about), it would
>>>>>>>>> then merely extend to
>>>>>>>>>
>>>>>>>>> .if !0\hyp && 0\compat
>>>>>>>> Isn't it effectively the same as setting a default value?
>>>>>>>>
>>>>>>>> The reason we seem to get away is because other part of the macro (e.g. entry_guest) will need compat to be valid.
>>>>>>>>
>>>>>>>> But that seems pretty fragile to me. So I would prefer if the new code it added within a macro that is only called when hyp==0.
>>>>>>>>
>>>>>>> So you would like to have a macro that is called if hyp=0 (which means compat had to be passed) and inside this macro additional check if compat is 1?
>>>>>>
>>>>>> Yes. This is the only way I could think to avoid making 'compat'optional.
>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>>>
>>>>>>>>> or something along those lines.
>>>>>>>>>
>>>>>>>>> Jan
>>>>>>>>>
>>>>>>>>
>>>>>>> So when it comes to zeroing the top 32bits by pushing zero to higher halves of stack slots I would do in a loop:
>>>>>>> stp wzr, wzr, [sp #8 * 0]
>>>>>>> stp wzr, wzr, [sp #8 * 1]
>>>>>>> ...
>>>>>>
>>>>>> I don't think you can use stp here because this would store two 32-bit values consecutively. Instead, you would need to use ldr to store one 32-bit value at the time.
>>>>>>
>>>>> I hope you meant str and not ldr.
>>>
>>> Yes. I am not sure why I wrote ldr.
>>>
>>>>> So a loop would look like that:
>>>>> str wzr, [sp, #8 * 1]
>>>>> str wzr, [sp, #8 * 3]
>>>>> ...
>>>>
>>>> Why "a loop" and why #8 (I'd have expected #4)?
>>>>
>>>> There's another oddity which I'm noticing only now, but which also
>>>> may look odd to me only because I lack sufficient Arm details: On
>>>> x86, it would not be advisable to store anything below the stack
>>>> pointer (like is done here when storing x0 and x1 early), unless
>>>> it's absolutely certain that no further interruptions could clobber
>>>> that part of the stack.
>>>
>>> We are entering the hypervisor with both Interrupts and SErrors masked. They will only be unmasked after the guest registers have been saved on the stack.
>>>
>>> You may still receive a Data Abort before the macro 'entry' has completed. But this is going to result to an hypervisor crash because they are not meant to happen in those paths.
>>>
>>> So I believe, we are safe to modify sp before.
>>
>> Hmmm... I meant to write on the stack before sp is modified.
>>
>> Cheers,
>>
> 
> I would like to summarize what we discussed before pushing v2.
> Changes since v1:
> -update commit message adding information why do we need to zero top 32bits
> -zero corresponding stack slots instead of zeroing directly gp registers
> -create a macro called by entry, protected by if hyp=0. In macro add if compat=1
> 
> Now when it comes to implementation.
> 
> 1. Regarding save_x0_x1, it is 0 only for guest_sync_slowpath, which is called by guest_sync.
> So as we are dealing only with compat=1, save_x0_x1 cannot be 0.
> The conclusion is that we do not need to worry about it.

Oh, good point. I guess you may want to add a build time check to
avoid silently introducing a user of the macro violating that
assumption.

> 2. Regarding clearing high halves of stack slots.

I don't think I understood earlier responses that way. I think
fiddling with the stack was meant solely for x0 and x1 when they
were saved earlier on (i.e. instead of re-loading, zero-extending,
and then storing them back). That's also why ...

> The new macro (called zero_stack_top_halves) will be called in entry before the first instruction sub sp,sp.
> To avoid saving sp position/moving it, the simplest would be to execute 30 times:
> str wzr, [sp, #-(UREGS_kernel_sizeof - 4)]
> str wzr, [sp, #-(UREGS_kernel_sizeof - 12)]
> ...
> I could also use .irp loop like (.irp n,1,3,5,7,...) and then:
> str wzr, [sp, #-(UREGS_kernel_sizeof - (4 * n))]
> but FWIK Jan does not like loops :)

... in an earlier reply I expressed my surprise of you mentioning
loops - I simply didn't see how a loop would come into play when
dealing with just x0 and x1.

Jan


Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Julien Grall 2 years, 4 months ago
Hi,

On 15/12/2021 09:35, Jan Beulich wrote:
>> 1. Regarding save_x0_x1, it is 0 only for guest_sync_slowpath, which is called by guest_sync.
>> So as we are dealing only with compat=1, save_x0_x1 cannot be 0.
>> The conclusion is that we do not need to worry about it.
> 
> Oh, good point. I guess you may want to add a build time check to
> avoid silently introducing a user of the macro violating that
> assumption.

+1

Cheers,

-- 
Julien Grall

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Michal Orzel 2 years, 4 months ago

On 15.12.2021 10:35, Jan Beulich wrote:
> On 15.12.2021 10:27, Michal Orzel wrote:
>> Replying to both Julien and Jan,
>>
>> On 14.12.2021 12:30, Julien Grall wrote:
>>>
>>>
>>> On 14/12/2021 11:01, Julien Grall wrote:
>>>> Hi,
>>>>
>>>> Replying in one e-mail the comments from Jan and Michal.
>>>>
>>>> On 14/12/2021 10:01, Jan Beulich wrote:
>>>>> On 14.12.2021 10:51, Michal Orzel wrote:
>>>>>> Hi Julien,
>>>>>>
>>>>>> On 14.12.2021 10:33, Julien Grall wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 14/12/2021 09:17, Michal Orzel wrote:
>>>>>>>> Hi Julien, Jan
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>>> On 08.12.2021 10:55, Julien Grall wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> On 08/12/2021 07:20, Jan Beulich wrote:
>>>>>>>>>> On 07.12.2021 20:11, Julien Grall wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 07/12/2021 08:37, Michal Orzel wrote:
>>>>>>>>>>>> Hi Julien,
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>>> On 06.12.2021 16:29, Julien Grall wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 06/12/2021 14:20, Michal Orzel wrote:
>>>>>>>>>>>>>> to hypervisor when switching to AArch32 state.
>>>>>>>>>>>>>>
>>>>>>>>>>>> I will change to "from AArch32 state".
>>>>>>>>>>>>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>>>>>>>>>>>>> "If the general-purpose register was accessible from AArch32 state the
>>>>>>>>>>>>>> upper 32 bits either become zero, or hold the value that the same
>>>>>>>>>>>>>> architectural register held before any AArch32 execution.
>>>>>>>>>>>>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>>>>>>>>>>>>
>>>>>>>>>>>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>>>>>>>>>>>
>>>>>>>>>>>> Ok.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>>>>>>>>>>>>> needs to be fixed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
>>>>>>>>>>>>>
>>>>>>>>>>>> I would say:
>>>>>>>>>>>> "
>>>>>>>>>>>> The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
>>>>>>>>>>>> If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
>>>>>>>>>>>> For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>>>>>>>>>>>> if the command passed as the first argument was clobbered,
>>>>>>>>>>>> "
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>>>>>>>>>>>>> entry to hypervisor when switching to AArch32 state.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>>>>>>>>>>>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>>>>>>>>>>>>> when not passed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Which error do you see otherwise? Is it a compilation error?
>>>>>>>>>>>>>
>>>>>>>>>>>> Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
>>>>>>>>>>>> So basically in all the places where entry is called with hyp=1.
>>>>>>>>>>>> When taking the current patch and removing default value for compat you will get:
>>>>>>>>>>>> ```
>>>>>>>>>>>> entry.S:254: Error: ".endif" without ".if"
>>>>>>>>>>>> entry.S:258: Error: symbol `.if' is already defined
>>>>>>>>>>>> entry.S:258: Error: ".endif" without ".if"
>>>>>>>>>>>> entry.S:262: Error: symbol `.if' is already defined
>>>>>>>>>>>> entry.S:262: Error: ".endif" without ".if"
>>>>>>>>>>>> entry.S:266: Error: symbol `.if' is already defined
>>>>>>>>>>>> entry.S:266: Error: ".endif" without ".if"
>>>>>>>>>>>> entry.S:278: Error: symbol `.if' is already defined
>>>>>>>>>>>> entry.S:278: Error: ".endif" without ".if"
>>>>>>>>>>>> entry.S:292: Error: symbol `.if' is already defined
>>>>>>>>>>>> entry.S:292: Error: ".endif" without ".if"
>>>>>>>>>>>> entry.S:317: Error: symbol `.if' is already defined
>>>>>>>>>>>> entry.S:317: Error: ".endif" without ".if"
>>>>>>>>>>>> ```
>>>>>>>>>>>
>>>>>>>>>>> Thanks for input. I am concerned with your suggested approach (or using
>>>>>>>>>>> .if 0\compat as suggested by Jan) because they allow the caller to not
>>>>>>>>>>> properly specify compat when hyp=0. The risk here is we may generate the
>>>>>>>>>>> wrong entry.
>>>>>>>>>>>
>>>>>>>>>>> compat should need to be specified when hyp=1 as we will always run in
>>>>>>>>>>> aarch64 mode. So could we protect this code with hyp=0?
>>>>>>>>>>
>>>>>>>>>> Since my suggestion was only to avoid the need for specifying a default
>>>>>>>>>> for the parameter (which you didn't seem to be happy about), it would
>>>>>>>>>> then merely extend to
>>>>>>>>>>
>>>>>>>>>> .if !0\hyp && 0\compat
>>>>>>>>> Isn't it effectively the same as setting a default value?
>>>>>>>>>
>>>>>>>>> The reason we seem to get away is because other part of the macro (e.g. entry_guest) will need compat to be valid.
>>>>>>>>>
>>>>>>>>> But that seems pretty fragile to me. So I would prefer if the new code it added within a macro that is only called when hyp==0.
>>>>>>>>>
>>>>>>>> So you would like to have a macro that is called if hyp=0 (which means compat had to be passed) and inside this macro additional check if compat is 1?
>>>>>>>
>>>>>>> Yes. This is the only way I could think to avoid making 'compat'optional.
>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> or something along those lines.
>>>>>>>>>>
>>>>>>>>>> Jan
>>>>>>>>>>
>>>>>>>>>
>>>>>>>> So when it comes to zeroing the top 32bits by pushing zero to higher halves of stack slots I would do in a loop:
>>>>>>>> stp wzr, wzr, [sp #8 * 0]
>>>>>>>> stp wzr, wzr, [sp #8 * 1]
>>>>>>>> ...
>>>>>>>
>>>>>>> I don't think you can use stp here because this would store two 32-bit values consecutively. Instead, you would need to use ldr to store one 32-bit value at the time.
>>>>>>>
>>>>>> I hope you meant str and not ldr.
>>>>
>>>> Yes. I am not sure why I wrote ldr.
>>>>
>>>>>> So a loop would look like that:
>>>>>> str wzr, [sp, #8 * 1]
>>>>>> str wzr, [sp, #8 * 3]
>>>>>> ...
>>>>>
>>>>> Why "a loop" and why #8 (I'd have expected #4)?
>>>>>
>>>>> There's another oddity which I'm noticing only now, but which also
>>>>> may look odd to me only because I lack sufficient Arm details: On
>>>>> x86, it would not be advisable to store anything below the stack
>>>>> pointer (like is done here when storing x0 and x1 early), unless
>>>>> it's absolutely certain that no further interruptions could clobber
>>>>> that part of the stack.
>>>>
>>>> We are entering the hypervisor with both Interrupts and SErrors masked. They will only be unmasked after the guest registers have been saved on the stack.
>>>>
>>>> You may still receive a Data Abort before the macro 'entry' has completed. But this is going to result to an hypervisor crash because they are not meant to happen in those paths.
>>>>
>>>> So I believe, we are safe to modify sp before.
>>>
>>> Hmmm... I meant to write on the stack before sp is modified.
>>>
>>> Cheers,
>>>
>>
>> I would like to summarize what we discussed before pushing v2.
>> Changes since v1:
>> -update commit message adding information why do we need to zero top 32bits
>> -zero corresponding stack slots instead of zeroing directly gp registers
>> -create a macro called by entry, protected by if hyp=0. In macro add if compat=1
>>
>> Now when it comes to implementation.
>>
>> 1. Regarding save_x0_x1, it is 0 only for guest_sync_slowpath, which is called by guest_sync.
>> So as we are dealing only with compat=1, save_x0_x1 cannot be 0.
>> The conclusion is that we do not need to worry about it.
> 
> Oh, good point. I guess you may want to add a build time check to
> avoid silently introducing a user of the macro violating that
> assumption.
> 
>> 2. Regarding clearing high halves of stack slots.
> 
> I don't think I understood earlier responses that way. I think
> fiddling with the stack was meant solely for x0 and x1 when they
> were saved earlier on (i.e. instead of re-loading, zero-extending,
> and then storing them back). That's also why ...
> 
This patch and the problem it solves is about clearing top 32bits of all gp registers so not only x0,x1.

>> The new macro (called zero_stack_top_halves) will be called in entry before the first instruction sub sp,sp.
>> To avoid saving sp position/moving it, the simplest would be to execute 30 times:
>> str wzr, [sp, #-(UREGS_kernel_sizeof - 4)]
>> str wzr, [sp, #-(UREGS_kernel_sizeof - 12)]
>> ...
>> I could also use .irp loop like (.irp n,1,3,5,7,...) and then:
>> str wzr, [sp, #-(UREGS_kernel_sizeof - (4 * n))]
>> but FWIK Jan does not like loops :)
> 
> ... in an earlier reply I expressed my surprise of you mentioning
> loops - I simply didn't see how a loop would come into play when
> dealing with just x0 and x1.
> 
> Jan
> 

Michal

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Jan Beulich 2 years, 4 months ago
(Re-sending an abridged version, as apparently spam filters didn't like
the original message with more retained context; I'll have to see whether
this one also isn't liked. Sorry.)

On 15.12.2021 10:48, Michal Orzel wrote:
> This patch and the problem it solves is about clearing top 32bits of all gp registers so not only x0,x1.

That's well understood. Yet for everything still in registers simply
using mov ahead of the respective push (as you had it) is still
preferable imo.

Jan


Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Michal Orzel 2 years, 4 months ago

On 15.12.2021 11:32, Jan Beulich wrote:
> (Re-sending an abridged version, as apparently spam filters didn't like
> the original message with more retained context; I'll have to see whether
> this one also isn't liked. Sorry.)
> 
> On 15.12.2021 10:48, Michal Orzel wrote:
>> This patch and the problem it solves is about clearing top 32bits of all gp registers so not only x0,x1.
> 
> That's well understood. Yet for everything still in registers simply
> using mov ahead of the respective push (as you had it) is still
> preferable imo.
> 
> Jan
> 

In that case let's wait for Julien's opinion to decide whether I should get back to the previous
solution with mov or to the stack solution.

Cheers,
Michal

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Julien Grall 2 years, 4 months ago
Hi Michal,

On 15/12/2021 10:40, Michal Orzel wrote:
> On 15.12.2021 11:32, Jan Beulich wrote:
>> (Re-sending an abridged version, as apparently spam filters didn't like
>> the original message with more retained context; I'll have to see whether
>> this one also isn't liked. Sorry.)
>>
>> On 15.12.2021 10:48, Michal Orzel wrote:
>>> This patch and the problem it solves is about clearing top 32bits of all gp registers so not only x0,x1.
>>
>> That's well understood. Yet for everything still in registers simply
>> using mov ahead of the respective push (as you had it) is still
>> preferable imo.
>>
>> Jan
>>
> 
> In that case let's wait for Julien's opinion to decide whether I should get back to the previous
> solution with mov or to the stack solution.

IIUC, your proposal is to:
    1) Push all the 64-bit registers
    2) Zero the top 32-bit

Jan's suggestion is to:
    1) clobber the top 32-bit using mov wX, wX
    2) Push all the registers

My preference is for the latter because there will be less memory/cache 
access.

So, this would be your original patch + a compile time check to ensure 
save_x0_x1 is 0 when compat=1.

Cheers,

-- 
Julien Grall

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Michal Orzel 2 years, 4 months ago
Hi Julien,

On 15.12.2021 19:25, Julien Grall wrote:
> Hi Michal,
> 
> On 15/12/2021 10:40, Michal Orzel wrote:
>> On 15.12.2021 11:32, Jan Beulich wrote:
>>> (Re-sending an abridged version, as apparently spam filters didn't like
>>> the original message with more retained context; I'll have to see whether
>>> this one also isn't liked. Sorry.)
>>>
>>> On 15.12.2021 10:48, Michal Orzel wrote:
>>>> This patch and the problem it solves is about clearing top 32bits of all gp registers so not only x0,x1.
>>>
>>> That's well understood. Yet for everything still in registers simply
>>> using mov ahead of the respective push (as you had it) is still
>>> preferable imo.
>>>
>>> Jan
>>>
>>
>> In that case let's wait for Julien's opinion to decide whether I should get back to the previous
>> solution with mov or to the stack solution.
> 
> IIUC, your proposal is to:
>    1) Push all the 64-bit registers
>    2) Zero the top 32-bit
> 
> Jan's suggestion is to:
>    1) clobber the top 32-bit using mov wX, wX
>    2) Push all the registers
> 
> My preference is for the latter because there will be less memory/cache access.
> 
> So, this would be your original patch + a compile time check to ensure save_x0_x1 is 0 when compat=1.
> 
That is exactly what I would like to do. I will send this as v2.

> Cheers,
> 

Cheers,
Michal

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Jan Beulich 2 years, 4 months ago
On 07.12.2021 09:37, Michal Orzel wrote:
> On 06.12.2021 16:29, Julien Grall wrote:
>> On 06/12/2021 14:20, Michal Orzel wrote:
>>> to hypervisor when switching to AArch32 state.
>>>
> I will change to "from AArch32 state".
>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>> "If the general-purpose register was accessible from AArch32 state the
>>> upper 32 bits either become zero, or hold the value that the same
>>> architectural register held before any AArch32 execution.
>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>
>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>
> Ok.
>>>
>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>> needs to be fixed.
>>
>> Can you outline why this is a problem and why we need to protect? IIRC, the main concern is Xen may misinterpret what the guest requested but we are not concerned about Xen using wrong value.
>>
> I would say:
> "
> The reason why this is a problem is that there are places in Xen where we assume that top 32bits are zero for AArch32 guests.
> If they are not, this can lead to misinterpretation of Xen regarding what the guest requested.
> For example hypercalls returning an error encoded in a signed long like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
> if the command passed as the first argument was clobbered,
> "
>>>
>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>> entry to hypervisor when switching to AArch32 state.
>>>
>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>> when not passed.
>>
>> Which error do you see otherwise? Is it a compilation error?
>>
> Yes, this is a compilation error. The errors appear at each line when "entry" is called without passing value for "compat".
> So basically in all the places where entry is called with hyp=1.
> When taking the current patch and removing default value for compat you will get:
> ```
> entry.S:254: Error: ".endif" without ".if"
> entry.S:258: Error: symbol `.if' is already defined
> entry.S:258: Error: ".endif" without ".if"
> entry.S:262: Error: symbol `.if' is already defined
> entry.S:262: Error: ".endif" without ".if"
> entry.S:266: Error: symbol `.if' is already defined
> entry.S:266: Error: ".endif" without ".if"
> entry.S:278: Error: symbol `.if' is already defined
> entry.S:278: Error: ".endif" without ".if"
> entry.S:292: Error: symbol `.if' is already defined
> entry.S:292: Error: ".endif" without ".if"
> entry.S:317: Error: symbol `.if' is already defined
> entry.S:317: Error: ".endif" without ".if"
> ```

An alternative might be to use

.if 0\compat

>>> --- a/xen/arch/arm/arm64/entry.S
>>> +++ b/xen/arch/arm/arm64/entry.S
>>> @@ -109,8 +109,16 @@
>>>    * If 0, we rely on the on x0/x1 to have been saved at the correct
>>>    * position on the stack before.
>>>    */
>>> -        .macro  entry, hyp, compat, save_x0_x1=1
>>> +        .macro  entry, hyp, compat=0, save_x0_x1=1
>>>           sub     sp, sp, #(UREGS_SPSR_el1 - UREGS_LR) /* CPSR, PC, SP, LR */
>>> +
>>> +        /* Zero the upper 32 bits of the registers when switching to AArch32 */
>>> +        .if \compat == 1      /* AArch32 mode */
>>> +        .irp nr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
>>> +        mov w\nr, w\nr
>>> +        .endr
>>> +        .endif
>>
>> So Jan mentioned, the x0/x1 may have already been saved. So you may need to fetch them from the stack and then clobber the top 32-bit.
>>
> So I would do the following:
> -fetch x0/x1 from the stack
> -clobber them
> -store them again on the stack
> 
> /*
>  * Zero the upper 32 bits of the gp registers when switching
>  * from AArch32.
>  */
> .if \compat == 1      /* AArch32 mode */
> 
> /* x0/x1 have already been saved so fetch them to zero top 32 bits */
> .if \save_x0_x1 == 0
> ldp     x0, x1, [sp], #(UREGS_kernel_sizeof - UREGS_X0)
> .endif
> 
> .irp nr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
> mov w\nr, w\nr
> .endr
> 
> .if \save_x0_x1 == 0
> stp     x0, x1, [sp, #-(UREGS_kernel_sizeof - UREGS_X0)]
> .endif
> 
> .endif

Wouldn't it be more efficient to store 32 bits of zero each into the
high halves of the respective stack slots? Afaict same code size, but
less memory / cache traffic. Plus it would avoid the latent issue of
a user of the macro actually expecting the two registers to retain
their values across the macro invocation.

I'm also puzzled by the two different memory addressing forms, but I
can easily see that I may be lacking enough Arm knowledge there.

Jan


Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...
Posted by Julien Grall 2 years, 4 months ago
Hi Jan,

On 07/12/2021 09:55, Jan Beulich wrote:
>>>> --- a/xen/arch/arm/arm64/entry.S
>>>> +++ b/xen/arch/arm/arm64/entry.S
>>>> @@ -109,8 +109,16 @@
>>>>     * If 0, we rely on the on x0/x1 to have been saved at the correct
>>>>     * position on the stack before.
>>>>     */
>>>> -        .macro  entry, hyp, compat, save_x0_x1=1
>>>> +        .macro  entry, hyp, compat=0, save_x0_x1=1
>>>>            sub     sp, sp, #(UREGS_SPSR_el1 - UREGS_LR) /* CPSR, PC, SP, LR */
>>>> +
>>>> +        /* Zero the upper 32 bits of the registers when switching to AArch32 */
>>>> +        .if \compat == 1      /* AArch32 mode */
>>>> +        .irp nr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
>>>> +        mov w\nr, w\nr
>>>> +        .endr
>>>> +        .endif
>>>
>>> So Jan mentioned, the x0/x1 may have already been saved. So you may need to fetch them from the stack and then clobber the top 32-bit.
>>>
>> So I would do the following:
>> -fetch x0/x1 from the stack
>> -clobber them
>> -store them again on the stack
>>
>> /*
>>   * Zero the upper 32 bits of the gp registers when switching
>>   * from AArch32.
>>   */
>> .if \compat == 1      /* AArch32 mode */
>>
>> /* x0/x1 have already been saved so fetch them to zero top 32 bits */
>> .if \save_x0_x1 == 0
>> ldp     x0, x1, [sp], #(UREGS_kernel_sizeof - UREGS_X0)
>> .endif
>>
>> .irp nr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
>> mov w\nr, w\nr
>> .endr
>>
>> .if \save_x0_x1 == 0
>> stp     x0, x1, [sp, #-(UREGS_kernel_sizeof - UREGS_X0)]
>> .endif
>>
>> .endif
> 
> Wouldn't it be more efficient to store 32 bits of zero each into the
> high halves of the respective stack slots? Afaict same code size, but
> less memory / cache traffic.

It would indeed be more efficient.

> Plus it would avoid the latent issue of
> a user of the macro actually expecting the two registers to retain
> their values across the macro invocation.

While this is not explicitely written, a caller cannot expect the 
registers to be preserved by this macro.

> 
> I'm also puzzled by the two different memory addressing forms, but I
> can easily see that I may be lacking enough Arm knowledge there.

I agree this is quite puzzling. The first one would update 'sp' after 
loading the register but I don't quite understand why it is necessary.

Assuming the this is happening before the instruction 'sub sp, sp, ...', 
then we should only need to load/store from sp with an offset (i.e. the 
second form).

Cheers,

-- 
Julien Grall