[PATCH v3] hw/ide/ahci: fix legacy software reset

Niklas Cassel posted 1 patch 6 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20231108222657.117984-1-nks@flawful.org
Maintainers: John Snow <jsnow@redhat.com>
hw/ide/ahci.c | 27 ++++++++++++++++++++++++++-
1 file changed, 26 insertions(+), 1 deletion(-)
[PATCH v3] hw/ide/ahci: fix legacy software reset
Posted by Niklas Cassel 6 months ago
From: Niklas Cassel <niklas.cassel@wdc.com>

Legacy software contains a standard mechanism for generating a reset to a
Serial ATA device - setting the SRST (software reset) bit in the Device
Control register.

Serial ATA has a more robust mechanism called COMRESET, also referred to
as port reset. A port reset is the preferred mechanism for error
recovery and should be used in place of software reset.

Commit e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
improved the handling of PxCI, such that PxCI gets cleared after handling
a non-NCQ, or NCQ command (instead of incorrectly clearing PxCI after
receiving anything - even a FIS that failed to parse, which should NOT
clear PxCI, so that you can see which command slot that caused an error).

However, simply clearing PxCI after a non-NCQ, or NCQ command, is not
enough, we also need to clear PxCI when receiving a SRST in the Device
Control register.

A legacy software reset is performed by the host sending two H2D FISes,
the first H2D FIS asserts SRST, and the second H2D FIS deasserts SRST.

The first H2D FIS will not get a D2H reply, and requires the FIS to have
the C bit set to one, such that the HBA itself will clear the bit in PxCI.

The second H2D FIS will get a D2H reply once the diagnostic is completed.
The clearing of the bit in PxCI for this command should ideally be done
in ahci_init_d2h() (if it was a legacy software reset that caused the
reset (a COMRESET does not use a command slot)). However, since the reset
value for PxCI is 0, modify ahci_reset_port() to actually clear PxCI to 0,
that way we can avoid complex logic in ahci_init_d2h().

This fixes an issue for FreeBSD where the device would fail to reset.
The problem was not noticed in Linux, because Linux uses a COMRESET
instead of a legacy software reset by default.

Fixes: e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
---
Changes since v3: remove superfluous ahci_write_fis_d2h(), and add
comments describing how the (quite confusing) spec actually works.

 hw/ide/ahci.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index fcc5476e9e..56c7672eb9 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -623,9 +623,13 @@ static void ahci_init_d2h(AHCIDevice *ad)
         return;
     }
 
+    /*
+     * For simplicity, do not call ahci_clear_cmd_issue() for this
+     * ahci_write_fis_d2h(). (The reset value for PxCI is 0.)
+     */
     if (ahci_write_fis_d2h(ad, true)) {
         ad->init_d2h_sent = true;
-        /* We're emulating receiving the first Reg H2D Fis from the device;
+        /* We're emulating receiving the first Reg D2H FIS from the device;
          * Update the SIG register, but otherwise proceed as normal. */
         pr->sig = ((uint32_t)ide_state->hcyl << 24) |
             (ide_state->lcyl << 16) |
@@ -663,6 +667,7 @@ static void ahci_reset_port(AHCIState *s, int port)
     pr->scr_act = 0;
     pr->tfdata = 0x7F;
     pr->sig = 0xFFFFFFFF;
+    pr->cmd_issue = 0;
     d->busy_slot = -1;
     d->init_d2h_sent = false;
 
@@ -1243,10 +1248,30 @@ static void handle_reg_h2d_fis(AHCIState *s, int port,
         case STATE_RUN:
             if (cmd_fis[15] & ATA_SRST) {
                 s->dev[port].port_state = STATE_RESET;
+                /*
+                 * When setting SRST in the first H2D FIS in the reset sequence,
+                 * the device does not send a D2H FIS. Host software thus has to
+                 * set the "Clear Busy upon R_OK" bit such that PxCI (and BUSY)
+                 * gets cleared. See AHCI 1.3.1, section 10.4.1 Software Reset.
+                 */
+                if (opts & AHCI_CMD_CLR_BUSY) {
+                    ahci_clear_cmd_issue(ad, slot);
+                }
             }
             break;
         case STATE_RESET:
             if (!(cmd_fis[15] & ATA_SRST)) {
+                /*
+                 * When clearing SRST in the second H2D FIS in the reset
+                 * sequence, the device will execute diagnostics. When this is
+                 * done, the device will send a D2H FIS with the good status.
+                 * See SATA 3.5a Gold, section 11.4 Software reset protocol.
+                 *
+                 * This D2H FIS is the first D2H FIS received from the device,
+                 * and is received regardless if the reset was performed by a
+                 * COMRESET or by setting and clearing the SRST bit. Therefore,
+                 * the logic for this is found in ahci_init_d2h() and not here.
+                 */
                 ahci_reset_port(s, port);
             }
             break;
-- 
2.41.0
Re: [PATCH v3] hw/ide/ahci: fix legacy software reset
Posted by Kevin Wolf 6 months ago
Am 08.11.2023 um 23:26 hat Niklas Cassel geschrieben:
> From: Niklas Cassel <niklas.cassel@wdc.com>
> 
> Legacy software contains a standard mechanism for generating a reset to a
> Serial ATA device - setting the SRST (software reset) bit in the Device
> Control register.
> 
> Serial ATA has a more robust mechanism called COMRESET, also referred to
> as port reset. A port reset is the preferred mechanism for error
> recovery and should be used in place of software reset.
> 
> Commit e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
> improved the handling of PxCI, such that PxCI gets cleared after handling
> a non-NCQ, or NCQ command (instead of incorrectly clearing PxCI after
> receiving anything - even a FIS that failed to parse, which should NOT
> clear PxCI, so that you can see which command slot that caused an error).
> 
> However, simply clearing PxCI after a non-NCQ, or NCQ command, is not
> enough, we also need to clear PxCI when receiving a SRST in the Device
> Control register.
> 
> A legacy software reset is performed by the host sending two H2D FISes,
> the first H2D FIS asserts SRST, and the second H2D FIS deasserts SRST.
> 
> The first H2D FIS will not get a D2H reply, and requires the FIS to have
> the C bit set to one, such that the HBA itself will clear the bit in PxCI.
> 
> The second H2D FIS will get a D2H reply once the diagnostic is completed.
> The clearing of the bit in PxCI for this command should ideally be done
> in ahci_init_d2h() (if it was a legacy software reset that caused the
> reset (a COMRESET does not use a command slot)). However, since the reset
> value for PxCI is 0, modify ahci_reset_port() to actually clear PxCI to 0,
> that way we can avoid complex logic in ahci_init_d2h().
> 
> This fixes an issue for FreeBSD where the device would fail to reset.
> The problem was not noticed in Linux, because Linux uses a COMRESET
> instead of a legacy software reset by default.
> 
> Fixes: e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
> Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>

Thanks, applied to the block branch.

Kevin
Re: [PATCH v3] hw/ide/ahci: fix legacy software reset
Posted by Michael Tokarev 5 months, 3 weeks ago
10.11.2023 12:51, Kevin Wolf:
> Am 08.11.2023 um 23:26 hat Niklas Cassel geschrieben:
>> From: Niklas Cassel <niklas.cassel@wdc.com>
>>
>> Legacy software contains a standard mechanism for generating a reset to a
>> Serial ATA device - setting the SRST (software reset) bit in the Device
>> Control register.
...
>> Fixes: e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
>> Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
>> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
> 
> Thanks, applied to the block branch.

Another friendly ping?

This change, once again, missed the next stable-8.1.x release, it seems
(the deadline for which is tomorrow).

/mjt
Re: [PATCH v3] hw/ide/ahci: fix legacy software reset
Posted by Marcin Juszkiewicz 6 months ago
W dniu 8.11.2023 o 23:26, Niklas Cassel pisze:
> From: Niklas Cassel <niklas.cassel@wdc.com>

> This fixes an issue for FreeBSD where the device would fail to reset.
> The problem was not noticed in Linux, because Linux uses a COMRESET
> instead of a legacy software reset by default.
> 
> Fixes: e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
> Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>

Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>

FreeBSD 14-rc3 boots fine on AArch64 with this patch:

Trying to mount root from cd9660:/dev/iso9660/14_0_RC3_AARCH64_BO [ro]...
cd0 at ahcich0 bus 0 scbus0 target 0 lun 0
cd0: <QEMU QEMU DVD-ROM 2.5+> Removable CD-ROM SCSI device
cd0: Serial Number QM00001
cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)
cd0: 347MB (177954 2048 byte sectors)
ada0 at ahcich1 bus 0 scbus1 target 0 lun 0
ada0: <QEMU HARDDISK 2.5+> ATA-7 SATA device
ada0: Serial Number QM00003
ada0: 150.000MB/s transfers (SATA 1.x, UDMA5, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 504MB (1032192 512 byte sectors)
ada1 at ahcich2 bus 0 scbus2 target 0 lun 0
ada1: <QEMU HARDDISK 2.5+> ATA-7 SATA device
ada1: Serial Number QM00005
ada1: 150.000MB/s transfers (SATA 1.x, UDMA5, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 8192MB (16777216 512 byte sectors)