[XEN PATCH] xen/vpci: Fix UB in mask_write

Mykyta Poturai posted 1 patch 2 months, 4 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/559dfac91b8f097bc59c4de194fd2ae2b5b4144c.1730880005.git.mykyta._5Fpoturai@epam.com
xen/drivers/vpci/msi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[XEN PATCH] xen/vpci: Fix UB in mask_write
Posted by Mykyta Poturai 2 months, 4 weeks ago
During the construction of dmask value, it gets shifted by
(32 - msi->vectors) bits. If msi->vectors is 0, the result of the shift
becomes undefined due to shifting by a size of the type. While this
works fine on x86, on ARM the resulting mask becomes 0xFFFFFFFF, which
is incorrect.

Fix this by adding an explicit check for msi->vectors == 0.

Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
---
 xen/drivers/vpci/msi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
index 7bda47e7fc..787296fd42 100644
--- a/xen/drivers/vpci/msi.c
+++ b/xen/drivers/vpci/msi.c
@@ -172,7 +172,7 @@ static void cf_check mask_write(
     struct vpci_msi *msi = data;
     uint32_t dmask = msi->mask ^ val;
 
-    if ( !dmask )
+    if ( !dmask || msi->vectors == 0 )
         return;
 
     if ( msi->enabled )
-- 
2.34.1
Re: [XEN PATCH] xen/vpci: Fix UB in mask_write
Posted by Jan Beulich 2 months, 4 weeks ago
On 06.11.2024 09:05, Mykyta Poturai wrote:
> During the construction of dmask value, it gets shifted by
> (32 - msi->vectors) bits. If msi->vectors is 0, the result of the shift
> becomes undefined due to shifting by a size of the type. While this
> works fine on x86,

Oh, also - what made you think this would be fine on x86? Afaict ...

> on ARM the resulting mask becomes 0xFFFFFFFF, which
> is incorrect.

... the exact same thing would happen (if msi->vectors indeed could ever
be zero) there, due to the type of the value shifted being unsigned int,
not unsigned long.

Jan
Re: [XEN PATCH] xen/vpci: Fix UB in mask_write
Posted by Roger Pau Monné 2 months, 4 weeks ago
On Wed, Nov 06, 2024 at 08:05:19AM +0000, Mykyta Poturai wrote:
> During the construction of dmask value, it gets shifted by
> (32 - msi->vectors) bits. If msi->vectors is 0, the result of the shift
> becomes undefined due to shifting by a size of the type. While this
> works fine on x86, on ARM the resulting mask becomes 0xFFFFFFFF, which
> is incorrect.
> 
> Fix this by adding an explicit check for msi->vectors == 0.

I would also add:

Fixes: 188fa82305e7 ('xen/vpci: Improve code generation in mask_write()')

> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> ---
>  xen/drivers/vpci/msi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
> index 7bda47e7fc..787296fd42 100644
> --- a/xen/drivers/vpci/msi.c
> +++ b/xen/drivers/vpci/msi.c
> @@ -172,7 +172,7 @@ static void cf_check mask_write(
>      struct vpci_msi *msi = data;
>      uint32_t dmask = msi->mask ^ val;
>  
> -    if ( !dmask )
> +    if ( !dmask || msi->vectors == 0 )
>          return;

I'm afraid returning this early is not correct - the cached mask needs
to be updated, even if there are no vectors currently enabled.

The adjustment likely needs to be:

if ( msi->enabled && msi->vectors )
...

So that the update of msi->mask is not skipped.

Thanks, Roger.
Re: [XEN PATCH] xen/vpci: Fix UB in mask_write
Posted by Roger Pau Monné 2 months, 4 weeks ago
On Wed, Nov 06, 2024 at 10:00:07AM +0100, Roger Pau Monné wrote:
> On Wed, Nov 06, 2024 at 08:05:19AM +0000, Mykyta Poturai wrote:
> > During the construction of dmask value, it gets shifted by
> > (32 - msi->vectors) bits. If msi->vectors is 0, the result of the shift
> > becomes undefined due to shifting by a size of the type. While this
> > works fine on x86, on ARM the resulting mask becomes 0xFFFFFFFF, which
> > is incorrect.
> > 
> > Fix this by adding an explicit check for msi->vectors == 0.

Wait - how can msi->vectors ever be 0?  AFAICT there's no way in the
MSI logic to configure 0 vectors, there will always be at least 1 vector
enabled.

Maybe what you want, if this fix is for compliance reasons, is an
assert unreachable that msi->vectors > 0?

Thanks, Roger.

Re: [XEN PATCH] xen/vpci: Fix UB in mask_write
Posted by Jan Beulich 2 months, 4 weeks ago
On 06.11.2024 10:07, Roger Pau Monné wrote:
> On Wed, Nov 06, 2024 at 10:00:07AM +0100, Roger Pau Monné wrote:
>> On Wed, Nov 06, 2024 at 08:05:19AM +0000, Mykyta Poturai wrote:
>>> During the construction of dmask value, it gets shifted by
>>> (32 - msi->vectors) bits. If msi->vectors is 0, the result of the shift
>>> becomes undefined due to shifting by a size of the type. While this
>>> works fine on x86, on ARM the resulting mask becomes 0xFFFFFFFF, which
>>> is incorrect.
>>>
>>> Fix this by adding an explicit check for msi->vectors == 0.
> 
> Wait - how can msi->vectors ever be 0?  AFAICT there's no way in the
> MSI logic to configure 0 vectors, there will always be at least 1 vector
> enabled.
> 
> Maybe what you want, if this fix is for compliance reasons, is an
> assert unreachable that msi->vectors > 0?

Which raises a question as to (lack of) context: Was this spotted by
mere code inspection? Or by a static analyzer? If so, which one? That
may help figure whether some workaround like the one suggested is
necessary, or whether it can simply be left alone.

Jan

Re: [XEN PATCH] xen/vpci: Fix UB in mask_write
Posted by Mykyta Poturai 2 months, 4 weeks ago
On 06.11.2024 10:07, Roger Pau Monné wrote:>
> Wait - how can msi->vectors ever be 0?  AFAICT there's no way in the
> MSI logic to configure 0 vectors, there will always be at least 1 vector
> enabled.
>
> Maybe what you want, if this fix is for compliance reasons, is an
> assert unreachable that msi->vectors > 0?

I did some investigation and figured out that the value of 0 is being
set by guest writing to msi_control_reg. As far as I understand, the
control_write() function only checks that vectors are not greater than
the maximum allowed value, but does not check for 0.
So I am not sure if this is a valid scenario or not. Is this incorrect
guest behavior and it should be forbidden from setting vectors to 0
and enable to 1 at the same time?

On 06.11.24 13:31, Jan Beulich wrote:
> 
> Which raises a question as to (lack of) context: Was this spotted by
> mere code inspection? Or by a static analyzer? If so, which one? That
> may help figure whether some workaround like the one suggested is
> necessary, or whether it can simply be left alone.
> 
> Jan

I have found this while porting the PCI passthrough patches to Xen 4.20.
After checking the previous version which was on 4.18 it seems that
on it msi->vectors are also set to 0 but nothing breaks due to it being
the explicit end of the loop. So I have assumed that setting it to 0 is
a valid scenario.

I am testing all of this on Rcar Gen4 boards.

Mykyta
Re: [XEN PATCH] xen/vpci: Fix UB in mask_write
Posted by Jan Beulich 2 months, 4 weeks ago
On 06.11.2024 13:26, Mykyta Poturai wrote:
> On 06.11.2024 10:07, Roger Pau Monné wrote:>
>> Wait - how can msi->vectors ever be 0?  AFAICT there's no way in the
>> MSI logic to configure 0 vectors, there will always be at least 1 vector
>> enabled.
>>
>> Maybe what you want, if this fix is for compliance reasons, is an
>> assert unreachable that msi->vectors > 0?
> 
> I did some investigation and figured out that the value of 0 is being
> set by guest writing to msi_control_reg. As far as I understand, the
> control_write() function only checks that vectors are not greater than
> the maximum allowed value, but does not check for 0.

How that? How could it even check for 0, when 0 isn't possible? Quoting
the code there:

    unsigned int vectors = min_t(uint8_t,
                                 1u << MASK_EXTR(val, PCI_MSI_FLAGS_QSIZE),
                                 pdev->msi_maxvec);

"val" in the guest written value. As that's used as a shift count, how
could 0 result there? The only way I can see 0 ending up in vectors is
when pdev->msi_maxvec was still zero. Yet that's then a bug in device
initialization.

Jan

Re: [XEN PATCH] xen/vpci: Fix UB in mask_write
Posted by Roger Pau Monné 2 months, 4 weeks ago
On Thu, Nov 07, 2024 at 10:25:02AM +0100, Jan Beulich wrote:
> On 06.11.2024 13:26, Mykyta Poturai wrote:
> > On 06.11.2024 10:07, Roger Pau Monné wrote:>
> >> Wait - how can msi->vectors ever be 0?  AFAICT there's no way in the
> >> MSI logic to configure 0 vectors, there will always be at least 1 vector
> >> enabled.
> >>
> >> Maybe what you want, if this fix is for compliance reasons, is an
> >> assert unreachable that msi->vectors > 0?
> > 
> > I did some investigation and figured out that the value of 0 is being
> > set by guest writing to msi_control_reg. As far as I understand, the
> > control_write() function only checks that vectors are not greater than
> > the maximum allowed value, but does not check for 0.
> 
> How that? How could it even check for 0, when 0 isn't possible? Quoting
> the code there:
> 
>     unsigned int vectors = min_t(uint8_t,
>                                  1u << MASK_EXTR(val, PCI_MSI_FLAGS_QSIZE),
>                                  pdev->msi_maxvec);
> 
> "val" in the guest written value. As that's used as a shift count, how
> could 0 result there? The only way I can see 0 ending up in vectors is
> when pdev->msi_maxvec was still zero. Yet that's then a bug in device
> initialization.

See followup emails, I've arrived at the same conclusion and Mykyta
confirmed it's msi_maxvec that's indeed 0.  Still waiting for them to
figure out why msi_maxvec is 0.

Thanks, Roger.

Re: [XEN PATCH] xen/vpci: Fix UB in mask_write
Posted by Roger Pau Monné 2 months, 4 weeks ago
On Wed, Nov 06, 2024 at 12:26:55PM +0000, Mykyta Poturai wrote:
> On 06.11.2024 10:07, Roger Pau Monné wrote:>
> > Wait - how can msi->vectors ever be 0?  AFAICT there's no way in the
> > MSI logic to configure 0 vectors, there will always be at least 1 vector
> > enabled.
> >
> > Maybe what you want, if this fix is for compliance reasons, is an
> > assert unreachable that msi->vectors > 0?
> 
> I did some investigation and figured out that the value of 0 is being
> set by guest writing to msi_control_reg. As far as I understand, the
> control_write() function only checks that vectors are not greater than
> the maximum allowed value, but does not check for 0.

control_write() will set vectors to (1UL << val), so even if user
provides val == 0, vectors will be 1.

Can you provide an example input value of control_write() that will
lead to msi->vectors == 0?

Is maybe msi_maxvec not set correctly in your use case if you indeed
see vectors == 0?

Thanks, Roger.

Re: [XEN PATCH] xen/vpci: Fix UB in mask_write
Posted by Mykyta Poturai 2 months, 4 weeks ago
On 06.11.24 14:42, Roger Pau Monné wrote:
> On Wed, Nov 06, 2024 at 12:26:55PM +0000, Mykyta Poturai wrote:
>> On 06.11.2024 10:07, Roger Pau Monné wrote:
>>> Wait - how can msi->vectors ever be 0?  AFAICT there's no way in the
>>> MSI logic to configure 0 vectors, there will always be at least 1 vector
>>> enabled.
>>>
>>> Maybe what you want, if this fix is for compliance reasons, is an
>>> assert unreachable that msi->vectors > 0?
>>
>> I did some investigation and figured out that the value of 0 is being
>> set by guest writing to msi_control_reg. As far as I understand, the
>> control_write() function only checks that vectors are not greater than
>> the maximum allowed value, but does not check for 0.
> 
> control_write() will set vectors to (1UL << val), so even if user
> provides val == 0, vectors will be 1.
> 
> Can you provide an example input value of control_write() that will
> lead to msi->vectors == 0?
> 
> Is maybe msi_maxvec not set correctly in your use case if you indeed
> see vectors == 0?
> 
> Thanks, Roger.

Indeed, I have checked and msi_maxvec is set to 0. Thanks for pointing
this out. I will investigate further why this is happening. It is quite
strange that it somehow worked on 4.18 with the same problem.

I will change the check to an assert then, so if something similar
happens again it can be caught earlier.

Mykyta
Re: [XEN PATCH] xen/vpci: Fix UB in mask_write
Posted by Roger Pau Monné 2 months, 4 weeks ago
On Wed, Nov 06, 2024 at 02:32:13PM +0000, Mykyta Poturai wrote:
> On 06.11.24 14:42, Roger Pau Monné wrote:
> > On Wed, Nov 06, 2024 at 12:26:55PM +0000, Mykyta Poturai wrote:
> >> On 06.11.2024 10:07, Roger Pau Monné wrote:
> >>> Wait - how can msi->vectors ever be 0?  AFAICT there's no way in the
> >>> MSI logic to configure 0 vectors, there will always be at least 1 vector
> >>> enabled.
> >>>
> >>> Maybe what you want, if this fix is for compliance reasons, is an
> >>> assert unreachable that msi->vectors > 0?
> >>
> >> I did some investigation and figured out that the value of 0 is being
> >> set by guest writing to msi_control_reg. As far as I understand, the
> >> control_write() function only checks that vectors are not greater than
> >> the maximum allowed value, but does not check for 0.
> > 
> > control_write() will set vectors to (1UL << val), so even if user
> > provides val == 0, vectors will be 1.
> > 
> > Can you provide an example input value of control_write() that will
> > lead to msi->vectors == 0?
> > 
> > Is maybe msi_maxvec not set correctly in your use case if you indeed
> > see vectors == 0?
> > 
> > Thanks, Roger.
> 
> Indeed, I have checked and msi_maxvec is set to 0. Thanks for pointing
> this out. I will investigate further why this is happening. It is quite
> strange that it somehow worked on 4.18 with the same problem.

Check whether pdev_msi_init() is called during device addition, as
that's what initializes msi_maxvec.  Another cause could be memory
corruption.

> I will change the check to an assert then, so if something similar
> happens again it can be caught earlier.

Let's try to figure out what causes msi_maxvec to be 0 in your case
and then we can see how to better detect this.  If msi_maxvec needs to
be checked it should likely be done in init_msi().

Regards, Roger.

Re: [XEN PATCH] xen/vpci: Fix UB in mask_write
Posted by Mykyta Poturai 2 months, 3 weeks ago
On 06.11.24 16:57, Roger Pau Monné wrote:
> 
> Let's try to figure out what causes msi_maxvec to be 0 in your case
> and then we can see how to better detect this.  If msi_maxvec needs to
> be checked it should likely be done in init_msi().
> 
> Regards, Roger.

Hi everyone,
So I have done some more investigations, and I think it finally makes 
sense. The real cause of my crashes was a long-standing bug in yet to be 
upstreamed vpci patches where the register value and offset were swapped 
by mistake. And this bug was hidden for a long time because mask_write 
skipped actually doing anything, respecting vectors = 0, so I failed to 
spot it from the get-go.

Regarding msi_maxvec there seems to be an implicit dependency between 
CONFIG_HAS_VPCI and CONFIG_HAS_PCI_MSI. If HAS_PCI_MSI=n, then 
pdev_msi_init gets replaced with a stub and msi_maxvec remains 0, but it 
is still used in control_write unconditionally.

I see two possible solutions to this: either adding an explicit 
dependency or, if msi_maxvec can't be 0 anyway, always initializing it 
to 1. But I am not sure which one is better.

Mykyta
Re: [XEN PATCH] xen/vpci: Fix UB in mask_write
Posted by Roger Pau Monné 2 months, 3 weeks ago
On Mon, Nov 11, 2024 at 12:21:32PM +0000, Mykyta Poturai wrote:
> On 06.11.24 16:57, Roger Pau Monné wrote:
> > 
> > Let's try to figure out what causes msi_maxvec to be 0 in your case
> > and then we can see how to better detect this.  If msi_maxvec needs to
> > be checked it should likely be done in init_msi().
> > 
> > Regards, Roger.
> 
> Hi everyone,
> So I have done some more investigations, and I think it finally makes 
> sense. The real cause of my crashes was a long-standing bug in yet to be 
> upstreamed vpci patches where the register value and offset were swapped 
> by mistake. And this bug was hidden for a long time because mask_write 
> skipped actually doing anything, respecting vectors = 0, so I failed to 
> spot it from the get-go.
> 
> Regarding msi_maxvec there seems to be an implicit dependency between 
> CONFIG_HAS_VPCI and CONFIG_HAS_PCI_MSI. If HAS_PCI_MSI=n, then 
> pdev_msi_init gets replaced with a stub and msi_maxvec remains 0, but it 
> is still used in control_write unconditionally.

Hello,

If HAS_PCI_MSI=n then vPCI shouldn't attempt to handle the MSI(-X)
capabilities in the first place.  However that can lead to incorrect
scenarios if vPCI is used for dom0, as vPCI not supporting MSI(-X)
grants dom0 unmediated access to the capability registers, which won't
result in a functional system at least on x86.

I think the more robust solution is for vPCI to require
HAS_PCI_MSI=y.

Thanks, Roger.