[PATCH 0/9] misc AHCI cleanups

Niklas Cassel posted 9 patches 1 year ago
Failed in applying to current master (apply log)
There is a newer version of this series
hw/ide/ahci.c | 111 +++++++++++++++++++++++++++++++++++---------------
hw/ide/core.c |   2 +-
2 files changed, 80 insertions(+), 33 deletions(-)
[PATCH 0/9] misc AHCI cleanups
Posted by Niklas Cassel 1 year ago
From: Niklas Cassel <niklas.cassel@wdc.com>

Hello John,

Here comes some misc AHCI cleanups.

Most are related to error handling.

Please review.

(I'm also working on a second series which will add support for
READ LOG EXT and READ LOG DMA EXT, but I will send that one out
once it is ready.)


Kind regards,
Niklas


Niklas Cassel (9):
  hw/ide/ahci: remove stray backslash
  hw/ide/core: set ERR_STAT in unsupported command completion
  hw/ide/ahci: write D2H FIS on when processing NCQ command
  hw/ide/ahci: simplify and document PxCI handling
  hw/ide/ahci: PxCI should not get cleared when ERR_STAT is set
  hw/ide/ahci: PxSACT and PxCI is cleared when PxCMD.ST is cleared
  hw/ide/ahci: trigger either error IRQ or regular IRQ, not both
  hw/ide/ahci: fix ahci_write_fis_sdb()
  hw/ide/ahci: fix broken SError handling

 hw/ide/ahci.c | 111 +++++++++++++++++++++++++++++++++++---------------
 hw/ide/core.c |   2 +-
 2 files changed, 80 insertions(+), 33 deletions(-)

-- 
2.40.0
Re: [PATCH 0/9] misc AHCI cleanups
Posted by John Snow 11 months, 2 weeks ago
On Fri, Apr 28, 2023 at 9:22 AM Niklas Cassel <nks@flawful.org> wrote:
>
> From: Niklas Cassel <niklas.cassel@wdc.com>
>
> Hello John,
>

Hi Niklas!

I haven't been actively involved with AHCI for a while, so I am not
sure I can find the time to give this a deep scrub. I'm going to
assume based on '@wdc.com` that you probably know a thing or two more
about AHCI than I do, though. Can you tell me what testing you've
performed on this? As long as it doesn't cause any obvious
regressions, we might be able to push it through, but it might not be
up to me anymore. I can give it a review on technical merit, but with
regards to "correctness" I have to admit I am flying blind.

(I haven't looked at the patches yet, but if in your commit messages
you can point to the relevant sections of the relevant specifications,
that'd help immensely.)

> Here comes some misc AHCI cleanups.
>
> Most are related to error handling.

I've always found this the most difficult part to verify. In a real
system, the registers between AHCI and the actual hard disk are
*physically separate*, and they update at specific times based on the
transmission of the FIS packets. The model in QEMU doesn't bother with
a perfect reproduction of that, and so it's been a little tough to
verify correctness. I tried to improve it a bit back in the day, but
my understanding has always been a bit limited :)

Are there any vendor tools you're aware of that have test suites we
could use to verify behavior? Our AHCI tests are very rudimentary - I
wrote them straight out of undergrad as my first project at RH - and
likely gloss over and misunderstand many things about the
ATA/SATA/AHCI specifications.

>
> Please review.
>
> (I'm also working on a second series which will add support for
> READ LOG EXT and READ LOG DMA EXT, but I will send that one out
> once it is ready.)

Wow!

A question for you: is it worth solidifying which ATA specification we
conform to? I don't believe we adhere to any one specific model,
because a lot of the code is shared between PATA and SATA, and we "in
theory" support IDE hard drives for fairly old guest operating systems
that may or may not predate the DMA extensions. As a result, the
actual device emulation is kind of a mish-mash of different ATA
specifications, generally whichever version provided the
least-abstracted detail and was easy to implement.

If you're adding the logging features, that seems to push us towards
the newer end of the spectrum, but I'm not sure if this causes any
problems for guest operating systems doing probing to guess what kind
of device they're talking to.

Any input?

>
>
> Kind regards,
> Niklas
>
>
> Niklas Cassel (9):
>   hw/ide/ahci: remove stray backslash
>   hw/ide/core: set ERR_STAT in unsupported command completion
>   hw/ide/ahci: write D2H FIS on when processing NCQ command
>   hw/ide/ahci: simplify and document PxCI handling
>   hw/ide/ahci: PxCI should not get cleared when ERR_STAT is set
>   hw/ide/ahci: PxSACT and PxCI is cleared when PxCMD.ST is cleared
>   hw/ide/ahci: trigger either error IRQ or regular IRQ, not both
>   hw/ide/ahci: fix ahci_write_fis_sdb()
>   hw/ide/ahci: fix broken SError handling
>
>  hw/ide/ahci.c | 111 +++++++++++++++++++++++++++++++++++---------------
>  hw/ide/core.c |   2 +-
>  2 files changed, 80 insertions(+), 33 deletions(-)
>
> --
> 2.40.0
>
Re: [PATCH 0/9] misc AHCI cleanups
Posted by Niklas Cassel 11 months ago
On Wed, May 17, 2023 at 01:06:06PM -0400, John Snow wrote:
> On Fri, Apr 28, 2023 at 9:22 AM Niklas Cassel <nks@flawful.org> wrote:
> >
> > From: Niklas Cassel <niklas.cassel@wdc.com>
> >
> > Hello John,
> >
> 
> Hi Niklas!
> 
> I haven't been actively involved with AHCI for a while, so I am not
> sure I can find the time to give this a deep scrub. I'm going to
> assume based on '@wdc.com` that you probably know a thing or two more
> about AHCI than I do, though. Can you tell me what testing you've
> performed on this? As long as it doesn't cause any obvious
> regressions, we might be able to push it through, but it might not be
> up to me anymore. I can give it a review on technical merit, but with
> regards to "correctness" I have to admit I am flying blind.

Hello John,

The testing is mostly using linux and injecting NCQ errors using some
additional QEMU patches that are not part of this series.

> 
> (I haven't looked at the patches yet, but if in your commit messages
> you can point to the relevant sections of the relevant specifications,
> that'd help immensely.)
> 
> > Here comes some misc AHCI cleanups.
> >
> > Most are related to error handling.
> 
> I've always found this the most difficult part to verify. In a real
> system, the registers between AHCI and the actual hard disk are
> *physically separate*, and they update at specific times based on the
> transmission of the FIS packets. The model in QEMU doesn't bother with
> a perfect reproduction of that, and so it's been a little tough to
> verify correctness. I tried to improve it a bit back in the day, but
> my understanding has always been a bit limited :)
> 
> Are there any vendor tools you're aware of that have test suites we
> could use to verify behavior?

Unfortunately, I don't know of any good test suite.

> A question for you: is it worth solidifying which ATA specification we
> conform to? I don't believe we adhere to any one specific model,
> because a lot of the code is shared between PATA and SATA, and we "in
> theory" support IDE hard drives for fairly old guest operating systems
> that may or may not predate the DMA extensions. As a result, the
> actual device emulation is kind of a mish-mash of different ATA
> specifications, generally whichever version provided the
> least-abstracted detail and was easy to implement.
> 
> If you're adding the logging features, that seems to push us towards
> the newer end of the spectrum, but I'm not sure if this causes any
> problems for guest operating systems doing probing to guess what kind
> of device they're talking to.
> 
> Any input?

I agree.

In my next series, after we have General Purpose Logging support,
I intend to bump the major version to indicate ACS-5 support.
I will need to verify that we are not missing any other major
feature from ACS-5 first though (other than GPL).


Kind regards,
Niklas