[Xen-devel] [PATCH] x86/msi: fix loop termination condition in pci_msi_conf_write_intercept()

Paul Durrant posted 1 patch 4 years, 9 months ago
git fetch https://github.com/patchew-project/xen tags/patchew/20190702093414.27798-1-paul.durrant@citrix.com
xen/arch/x86/msi.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
[Xen-devel] [PATCH] x86/msi: fix loop termination condition in pci_msi_conf_write_intercept()
Posted by Paul Durrant 4 years, 9 months ago
The for loop that deals with MSI masking is coded as follows:

for ( pos = 0; pos < entry->msi.nvec; ++pos, ++entry )

Thus the loop termination condition is dereferencing a struct pointer that
is being incremented by the loop. However, following the code paths in
msi_capability_init() shows that this is unsafe: for instance, in the case
of nvec == 1, entry points at a single struct msi_desc allocation, so the
loop walks beyond the bounds of the allocation before dereferencing the
memory to determine whether it should terminate.
Also, because the body of the loop writes via the entry pointer, this can
then lead to heap memory corruption, or indeed corruption of anything in
the direct map.

This patch simply initializes a stack variable to the value of
entry->msi.nvec before starting the loop and then uses that in the
termination condition instead.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Igor Druzhinin <igor.druzhinin@citrix.com>

Credit to Andrew Cooper and Igor Druzhinin for helping narrow down the
source of the memory corruption. It has taken many weeks of head-scratching
to get to this fix.
---
 xen/arch/x86/msi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
index babc4147c4..89e61160e9 100644
--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -1328,6 +1328,7 @@ int pci_msi_conf_write_intercept(struct pci_dev *pdev, unsigned int reg,
     {
         uint16_t cntl;
         uint32_t unused;
+        unsigned int nvec = entry->msi.nvec;
 
         pos = entry->msi_attrib.pos;
         if ( reg < pos || reg >= entry->msi.mpos + 8 )
@@ -1340,7 +1341,7 @@ int pci_msi_conf_write_intercept(struct pci_dev *pdev, unsigned int reg,
 
         cntl = pci_conf_read16(seg, bus, slot, func, msi_control_reg(pos));
         unused = ~(uint32_t)0 >> (32 - multi_msi_capable(cntl));
-        for ( pos = 0; pos < entry->msi.nvec; ++pos, ++entry )
+        for ( pos = 0; pos < nvec; ++pos, ++entry )
         {
             entry->msi_attrib.guest_masked =
                 *data >> entry->msi_attrib.entry_nr;
-- 
2.20.1.2.gb21ebb671
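
The nvec == 1 case described in the commit message can be modelled in a
small standalone C sketch. This is not the Xen code: struct demo_desc,
walk() and init_heap() are hypothetical names, and the three-slot array only
stands in for a one-entry allocation plus whatever happens to sit next to it
in memory, so that the demonstration itself never reads or writes outside an
object it owns.

```c
/* Hypothetical, simplified stand-in for Xen's struct msi_desc. */
struct demo_desc {
    unsigned int nvec;          /* number of vectors in the block */
    unsigned int guest_masked;  /* field written by the loop body */
};

/*
 * Walk a block of entries and set guest_masked in each one visited.
 * latch_bound == 0: original loop shape (bound re-read via the moving pointer).
 * latch_bound != 0: patched loop shape (bound read once, before the loop).
 * Returns the number of entries written.
 */
static unsigned int walk(struct demo_desc *entry, int latch_bound)
{
    unsigned int nvec = entry->nvec;
    unsigned int pos;

    for ( pos = 0; pos < (latch_bound ? nvec : entry->nvec); ++pos, ++entry )
        entry->guest_masked = 1;

    return pos;
}

/*
 * Model of the memory around a single-vector allocation. Only slot 0 is the
 * "real" entry; slots 1 and 2 are assumed neighbouring data.
 */
static void init_heap(struct demo_desc heap[3])
{
    heap[0] = (struct demo_desc){ .nvec = 1 };  /* the one valid entry */
    heap[1] = (struct demo_desc){ .nvec = 7 };  /* neighbour: non-zero word */
    heap[2] = (struct demo_desc){ .nvec = 0 };  /* neighbour: zero, ends the walk */
}
```

With the original loop shape, walk(heap, 0) writes two slots, so the
neighbour in heap[1] is modified. With the latched bound, walk(heap, 1)
writes only heap[0]. In the real hypervisor the neighbouring contents are
arbitrary, which is why the corruption was intermittent and hard to trace.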


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/msi: fix loop termination condition in pci_msi_conf_write_intercept()
Posted by Andrew Cooper 4 years, 9 months ago
On 02/07/2019 10:34, Paul Durrant wrote:
> The for loop that deals with MSI masking is coded as follows:
>
> for ( pos = 0; pos < entry->msi.nvec; ++pos, ++entry )
>
> Thus the loop termination condition is dereferencing a struct pointer that
> is being incremented by the loop. However, it is clear from following code
> paths in msi_capability_init() that this is unsafe as for instance, in the
> case of nvec == 1, entry will point at a single struct msi_desc allocation
> and thus the loop will walk beyond the bounds of the allocation before
> dereferencing the memory to determine whether the loop should terminate.

More specifically, only entry[0].msi.nvec is correct.  All subsequent
nvec fields are 0 in a block of entries.
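
A minimal sketch of that effect, assuming a simplified layout: struct
demo_desc, mask_old() and mask_new() are hypothetical names rather than Xen
identifiers, and the shift uses the loop counter where the real code uses
entry->msi_attrib.entry_nr.

```c
/* Hypothetical, simplified stand-in for Xen's struct msi_desc. */
struct demo_desc {
    unsigned int nvec;          /* only meaningful in the first entry of a block */
    unsigned int guest_masked;  /* mask bit taken from the guest's write */
};

/* Original loop shape: the bound is re-read through the moving pointer. */
static unsigned int mask_old(struct demo_desc *entry, unsigned int data)
{
    unsigned int pos;

    for ( pos = 0; pos < entry->nvec; ++pos, ++entry )
        entry->guest_masked = (data >> pos) & 1;

    return pos;  /* number of entries updated */
}

/* Patched loop shape: the bound is latched before the loop starts. */
static unsigned int mask_new(struct demo_desc *entry, unsigned int data)
{
    unsigned int nvec = entry->nvec;
    unsigned int pos;

    for ( pos = 0; pos < nvec; ++pos, ++entry )
        entry->guest_masked = (data >> pos) & 1;

    return pos;  /* number of entries updated */
}
```

For a four-entry block whose nvec fields are { 4, 0, 0, 0 }, mask_old()
updates only the first entry, because the second iteration compares pos
against entry[1].nvec, which is 0. mask_new() updates all four.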

> Also, because the body of the loop writes via the entry pointer, this can
> then lead to heap memory corruption, or indeed corruption of anything in
> the direct map.
>
> This patch simply initializes a stack variable to the value of
> entry->msi.nvec before starting the loop and then uses that in the
> termination condition instead.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> ---
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Wei Liu <wl@xen.org>
> Cc: "Roger Pau Monné" <roger.pau@citrix.com>
> Cc: Igor Druzhinin <igor.druzhinin@citrix.com>
>
> Credit to Andrew Cooper and Igor Druzhinin for helping narrow down the
> source of the memory corruption. It has taken many weeks of head-scratching
> to get to this fix.

This has taken an embarrassingly long time to figure out, even after
debugging hinted that the assignment to guest_masked (in context) was
the culprit of memory corruption.

Needless to say, this wants backporting to all trees.

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

> ---
>  xen/arch/x86/msi.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
> index babc4147c4..89e61160e9 100644
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -1328,6 +1328,7 @@ int pci_msi_conf_write_intercept(struct pci_dev *pdev, unsigned int reg,
>      {
>          uint16_t cntl;
>          uint32_t unused;
> +        unsigned int nvec = entry->msi.nvec;
>  
>          pos = entry->msi_attrib.pos;
>          if ( reg < pos || reg >= entry->msi.mpos + 8 )
> @@ -1340,7 +1341,7 @@ int pci_msi_conf_write_intercept(struct pci_dev *pdev, unsigned int reg,
>  
>          cntl = pci_conf_read16(seg, bus, slot, func, msi_control_reg(pos));
>          unused = ~(uint32_t)0 >> (32 - multi_msi_capable(cntl));
> -        for ( pos = 0; pos < entry->msi.nvec; ++pos, ++entry )
> +        for ( pos = 0; pos < nvec; ++pos, ++entry )
>          {
>              entry->msi_attrib.guest_masked =
>                  *data >> entry->msi_attrib.entry_nr;


Re: [Xen-devel] [PATCH] x86/msi: fix loop termination condition in pci_msi_conf_write_intercept()
Posted by Andrew Cooper 4 years, 9 months ago
On 02/07/2019 10:47, Andrew Cooper wrote:
> On 02/07/2019 10:34, Paul Durrant wrote:
>> The for loop that deals with MSI masking is coded as follows:
>>
>> for ( pos = 0; pos < entry->msi.nvec; ++pos, ++entry )
>>
>> Thus the loop termination condition is dereferencing a struct pointer that
>> is being incremented by the loop. However, it is clear from following code
>> paths in msi_capability_init() that this is unsafe as for instance, in the
>> case of nvec == 1, entry will point at a single struct msi_desc allocation
>> and thus the loop will walk beyond the bounds of the allocation before
>> dereferencing the memory to determine whether the loop should terminate.
> More specifically, only entry[0].msi.nvec is correct.  All subsequent
> nvec fields are 0 in a block of entries.
>
>> Also, because the body of the loop writes via the entry pointer, this can
>> then lead to heap memory corruption, or indeed corruption of anything in
>> the direct map.
>>
>> This patch simply initializes a stack variable to the value of
>> entry->msi.nvec before starting the loop and then uses that in the
>> termination condition instead.

There is actually a second bug here which is being fixed.  How about
this for the commit message?

x86/msi: fix loop termination condition in pci_msi_conf_write_intercept()

The for loop that deals with MSI masking is coded as follows:

for ( pos = 0; pos < entry->msi.nvec; ++pos, ++entry )

Thus the loop termination condition is dereferencing a struct pointer that
is being incremented by the loop.

A block of MSI entries stores the number of vectors in entry[0].msi.nvec,
with all subsequent entries using a value of 0.  Therefore, for a block of
two or more MSIs, the loop will terminate early, as entry[1].msi.nvec is 0.

However, for a single MSI, ++entry moves the pointer out of bounds, and a
bogus read is used for the termination condition.  In the case that the
loop body gets entered, there are subsequent OoB writes which clobber
adjacent memory in the heap.

This patch simply initializes a stack variable to the value of
entry->msi.nvec before starting the loop and then uses that in the
termination condition instead.

~Andrew

Re: [Xen-devel] [PATCH] x86/msi: fix loop termination condition in pci_msi_conf_write_intercept()
Posted by Paul Durrant 4 years, 9 months ago
> -----Original Message-----
> From: Andrew Cooper <Andrew.Cooper3@citrix.com>
> Sent: 02 July 2019 11:29
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Igor Druzhinin <igor.druzhinin@citrix.com>; Wei Liu <wl@xen.org>; Jan Beulich <jbeulich@suse.com>;
> Roger Pau Monne <roger.pau@citrix.com>
> Subject: Re: [Xen-devel] [PATCH] x86/msi: fix loop termination condition in
> pci_msi_conf_write_intercept()
> 
> On 02/07/2019 10:47, Andrew Cooper wrote:
> > On 02/07/2019 10:34, Paul Durrant wrote:
> >> The for loop that deals with MSI masking is coded as follows:
> >>
> >> for ( pos = 0; pos < entry->msi.nvec; ++pos, ++entry )
> >>
> >> Thus the loop termination condition is dereferencing a struct pointer that
> >> is being incremented by the loop. However, it is clear from following code
> >> paths in msi_capability_init() that this is unsafe as for instance, in the
> >> case of nvec == 1, entry will point at a single struct msi_desc allocation
> >> and thus the loop will walk beyond the bounds of the allocation before
> >> dereferencing the memory to determine whether the loop should terminate.
> > More specifically, only entry[0].msi.nvec is correct.  All subsequent
> > nvec fields are 0 in a block of entries.
> >
> >> Also, because the body of the loop writes via the entry pointer, this can
> >> then lead to heap memory corruption, or indeed corruption of anything in
> >> the direct map.
> >>
> >> This patch simply initializes a stack variable to the value of
> >> entry->msi.nvec before starting the loop and then uses that in the
> >> termination condition instead.
> 
> There is actually a second bug here which is being fixed.  How about
> this for the commit message?
> 

Apart from exchange/outlook terminally mangling it (as you can probably see below... unless it miraculously unmangles this reply), it looks ok to me. I assume you are happy to fix on commit?

  Paul

> x86/msi: fix loop termination condition in
> pci_msi_conf_write_intercept()
> 
> 
> 
> 
> 
> 
> The for loop that deals with MSI masking is coded as
> follows:
> 
> 
> 
> 
> 
> 
> for ( pos = 0; pos < entry->msi.nvec; ++pos, ++entry
> )
> 
> 
> 
> 
> 
> 
> Thus the loop termination condition is dereferencing a struct pointer
> that
> 
> 
> is being incremented by the
> loop.
> 
> 
> 
> 
> 
> 
> A block of MSI entries stores the number of vectors in
> entry[0].msi.nvec,
> 
> 
> with all subsequent entries using a value of 0.  Therefore, for a block
> of
> 
> 
> two or more MSIs will terminate the loop early, as entry[1].msi.nvec is
> 0.
> 
> 
> 
> 
> 
> 
> However, for a single MSI, ++entry moves the pointer out of bounds, and
> a
> 
> 
> bogus read is used for the termination condition.  In the case that
> the
> 
> 
> loop body gets entered, there are subsequent OoB writes which
> clobber
> 
> 
> adjacent memory in the
> heap.
> 
> 
> 
> 
> 
> 
> This patch simply initializes a stack variable to the value
> of
> 
> 
> entry->msi.nvec before starting the loop and then uses that in
> the
> 
> 
> termination condition instead.
> 
> ~Andrew
Re: [Xen-devel] [PATCH] x86/msi: fix loop termination condition in pci_msi_conf_write_intercept()
Posted by Andrew Cooper 4 years, 9 months ago
On 02/07/2019 11:31, Paul Durrant wrote:
>> -----Original Message-----
>> From: Andrew Cooper <Andrew.Cooper3@citrix.com>
>> Sent: 02 July 2019 11:29
>> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
>> Cc: Igor Druzhinin <igor.druzhinin@citrix.com>; Wei Liu <wl@xen.org>; Jan Beulich <jbeulich@suse.com>;
>> Roger Pau Monne <roger.pau@citrix.com>
>> Subject: Re: [Xen-devel] [PATCH] x86/msi: fix loop termination condition in
>> pci_msi_conf_write_intercept()
>>
>> On 02/07/2019 10:47, Andrew Cooper wrote:
>>> On 02/07/2019 10:34, Paul Durrant wrote:
>>>> The for loop that deals with MSI masking is coded as follows:
>>>>
>>>> for ( pos = 0; pos < entry->msi.nvec; ++pos, ++entry )
>>>>
>>>> Thus the loop termination condition is dereferencing a struct pointer that
>>>> is being incremented by the loop. However, it is clear from following code
>>>> paths in msi_capability_init() that this is unsafe as for instance, in the
>>>> case of nvec == 1, entry will point at a single struct msi_desc allocation
>>>> and thus the loop will walk beyond the bounds of the allocation before
>>>> dereferencing the memory to determine whether the loop should terminate.
>>> More specifically, only entry[0].msi.nvec is correct.  All subsequent
>>> nvec fields are 0 in a block of entries.
>>>
>>>> Also, because the body of the loop writes via the entry pointer, this can
>>>> then lead to heap memory corruption, or indeed corruption of anything in
>>>> the direct map.
>>>>
>>>> This patch simply initializes a stack variable to the value of
>>>> entry->msi.nvec before starting the loop and then uses that in the
>>>> termination condition instead.
>> There is actually a second bug here which is being fixed.  How about
>> this for the commit message?
>>
> Apart from exchange/outlook terminally mangling it (as you can probably see below... unless it miraculously unmangles this reply), it looks ok to me. I assume you are happy to fix on commit?

Yeah - that is horrifically mangled.  The actual commit reads sensibly. 
I'm happy to fix on commit.

~Andrew
