[PATCH v3 8/8] hw/ide/ahci: fix broken SError handling

Niklas Cassel posted 8 patches 2 years, 8 months ago
Maintainers: John Snow <jsnow@redhat.com>, Thomas Huth <thuth@redhat.com>, Laurent Vivier <lvivier@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
[PATCH v3 8/8] hw/ide/ahci: fix broken SError handling
Posted by Niklas Cassel 2 years, 8 months ago
From: Niklas Cassel <niklas.cassel@wdc.com>

When encountering an NCQ error, you should not write the NCQ tag to the
SError register. This is completely wrong.

The SError register has a clear definition, where each bit represents a
different error, see PxSERR definition in AHCI 1.3.1.

If we write a random value (like the NCQ tag) in SError, e.g. Linux will
read SError, and will trigger arbitrary error handling depending on the
NCQ tag that happened to be executing.

In case of success, ncq_cb() will call ncq_finish().
In case of error, ncq_cb() will call ncq_err() (which will clear
ncq_tfs->used), and then call ncq_finish(), thus using ncq_tfs->used is
sufficient to tell if finished should get set or not.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
---
 hw/ide/ahci.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index ef6c9fc378..d0a774bc17 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -1012,7 +1012,6 @@ static void ncq_err(NCQTransferState *ncq_tfs)
 
     ide_state->error = ABRT_ERR;
     ide_state->status = READY_STAT | ERR_STAT;
-    ncq_tfs->drive->port_regs.scr_err |= (1 << ncq_tfs->tag);
     qemu_sglist_destroy(&ncq_tfs->sglist);
     ncq_tfs->used = 0;
 }
@@ -1022,7 +1021,7 @@ static void ncq_finish(NCQTransferState *ncq_tfs)
     /* If we didn't error out, set our finished bit. Errored commands
      * do not get a bit set for the SDB FIS ACT register, nor do they
      * clear the outstanding bit in scr_act (PxSACT). */
-    if (!(ncq_tfs->drive->port_regs.scr_err & (1 << ncq_tfs->tag))) {
+    if (ncq_tfs->used) {
         ncq_tfs->drive->finished |= (1 << ncq_tfs->tag);
     }
 
-- 
2.40.1
Re: [PATCH v3 8/8] hw/ide/ahci: fix broken SError handling
Posted by Philippe Mathieu-Daudé 2 years, 6 months ago
On 9/6/23 16:08, Niklas Cassel wrote:
> From: Niklas Cassel <niklas.cassel@wdc.com>
> 
> When encountering an NCQ error, you should not write the NCQ tag to the
> SError register. This is completely wrong.
> 
> The SError register has a clear definition, where each bit represents a
> different error, see PxSERR definition in AHCI 1.3.1.
> 
> If we write a random value (like the NCQ tag) in SError, e.g. Linux will
> read SError, and will trigger arbitrary error handling depending on the
> NCQ tag that happened to be executing.
> 
> In case of success, ncq_cb() will call ncq_finish().
> In case of error, ncq_cb() will call ncq_err() (which will clear
> ncq_tfs->used), and then call ncq_finish(), thus using ncq_tfs->used is
> sufficient to tell if finished should get set or not.
> 

Cc: qemu-stable@nongnu.org
Fixes: f6ad2e32f8 ("ahci: add ahci emulation")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
> ---
>   hw/ide/ahci.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)


Re: [PATCH v3 8/8] hw/ide/ahci: fix broken SError handling
Posted by Michael Tokarev 2 years, 6 months ago
19.07.2023 14:59, Philippe Mathieu-Daudé wrote:
> On 9/6/23 16:08, Niklas Cassel wrote:
>> From: Niklas Cassel <niklas.cassel@wdc.com>
>>
>> When encountering an NCQ error, you should not write the NCQ tag to the
>> SError register. This is completely wrong.
>>
>> The SError register has a clear definition, where each bit represents a
>> different error, see PxSERR definition in AHCI 1.3.1.
>>
>> If we write a random value (like the NCQ tag) in SError, e.g. Linux will
>> read SError, and will trigger arbitrary error handling depending on the
>> NCQ tag that happened to be executing.
>>
>> In case of success, ncq_cb() will call ncq_finish().
>> In case of error, ncq_cb() will call ncq_err() (which will clear
>> ncq_tfs->used), and then call ncq_finish(), thus using ncq_tfs->used is
>> sufficient to tell if finished should get set or not.
>>
> 
> Cc: qemu-stable@nongnu.org
> Fixes: f6ad2e32f8 ("ahci: add ahci emulation")

If going this route, it feels like 7/8 "hw/ide/ahci: fix ahci_write_fis_sdb()"
should be there (-stable) too.

/mjt


Re: [PATCH v3 8/8] hw/ide/ahci: fix broken SError handling
Posted by Philippe Mathieu-Daudé 2 years, 6 months ago
On 19/7/23 14:06, Michael Tokarev wrote:
> 19.07.2023 14:59, Philippe Mathieu-Daudé wrote:
>> On 9/6/23 16:08, Niklas Cassel wrote:
>>> From: Niklas Cassel <niklas.cassel@wdc.com>
>>>
>>> When encountering an NCQ error, you should not write the NCQ tag to the
>>> SError register. This is completely wrong.
>>>
>>> The SError register has a clear definition, where each bit represents a
>>> different error, see PxSERR definition in AHCI 1.3.1.
>>>
>>> If we write a random value (like the NCQ tag) in SError, e.g. Linux will
>>> read SError, and will trigger arbitrary error handling depending on the
>>> NCQ tag that happened to be executing.
>>>
>>> In case of success, ncq_cb() will call ncq_finish().
>>> In case of error, ncq_cb() will call ncq_err() (which will clear
>>> ncq_tfs->used), and then call ncq_finish(), thus using ncq_tfs->used is
>>> sufficient to tell if finished should get set or not.
>>>
>>
>> Cc: qemu-stable@nongnu.org
>> Fixes: f6ad2e32f8 ("ahci: add ahci emulation")
> 
> If going this route, it feels like 7/8 "hw/ide/ahci: fix 
> ahci_write_fis_sdb()"
> should be there (-stable) too.

Certainly; I realized this deserve to be in stable at this final
patch :/