[PATCH] scsi-disk: convert more errno values back to SCSI statuses

Paolo Bonzini posted 1 patch 3 years, 6 months ago
Test checkpatch passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20201112095220.52590-1-pbonzini@redhat.com
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, Fam Zheng <fam@euphon.net>
hw/scsi/scsi-disk.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
[PATCH] scsi-disk: convert more errno values back to SCSI statuses
Posted by Paolo Bonzini 3 years, 6 months ago
Linux has some OS-specific (and sometimes weird) mappings for various SCSI
statuses and sense codes.  The most important is probably RESERVATION
CONFLICT.  Add them so that they can be reported back to the guest
kernel.

Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/scsi/scsi-disk.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 424bc192b7..fa14d1527a 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -461,6 +461,25 @@ static bool scsi_handle_rw_error(SCSIDiskReq *r, int error, bool acct_failed)
             }
             error = scsi_sense_buf_to_errno(r->req.sense, sizeof(r->req.sense));
             break;
+#ifdef CONFIG_LINUX
+            /* These errno mapping are specific to Linux.  For more information:
+             * - scsi_decide_disposition in drivers/scsi/scsi_error.c
+             * - scsi_result_to_blk_status in drivers/scsi/scsi_lib.c
+             * - blk_errors[] in block/blk-core.c
+             */
+        case EBADE:
+            /* DID_NEXUS_FAILURE -> BLK_STS_NEXUS.  */
+            scsi_req_complete(&r->req, RESERVATION_CONFLICT);
+            break;
+        case ENODATA:
+            /* DID_MEDIUM_ERROR -> BLK_STS_MEDIUM.  */
+            scsi_check_condition(r, SENSE_CODE(READ_ERROR));
+            break;
+        case EREMOTEIO:
+            /* DID_TARGET_FAILURE -> BLK_STS_TARGET.  */
+            scsi_req_complete(&r->req, HARDWARE_ERROR);
+            break;
+#endif
         case ENOMEDIUM:
             scsi_check_condition(r, SENSE_CODE(NO_MEDIUM));
             break;
-- 
2.28.0


Re: [PATCH] scsi-disk: convert more errno values back to SCSI statuses
Posted by Hannes Reinecke 3 years, 6 months ago
On 11/12/20 10:52 AM, Paolo Bonzini wrote:
> Linux has some OS-specific (and sometimes weird) mappings for various SCSI
> statuses and sense codes.  The most important is probably RESERVATION
> CONFLICT.  Add them so that they can be reported back to the guest
> kernel.
> 
> Cc: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   hw/scsi/scsi-disk.c | 19 +++++++++++++++++++
>   1 file changed, 19 insertions(+)
> 
> diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> index 424bc192b7..fa14d1527a 100644
> --- a/hw/scsi/scsi-disk.c
> +++ b/hw/scsi/scsi-disk.c
> @@ -461,6 +461,25 @@ static bool scsi_handle_rw_error(SCSIDiskReq *r, int error, bool acct_failed)
>               }
>               error = scsi_sense_buf_to_errno(r->req.sense, sizeof(r->req.sense));
>               break;
> +#ifdef CONFIG_LINUX
> +            /* These errno mapping are specific to Linux.  For more information:
> +             * - scsi_decide_disposition in drivers/scsi/scsi_error.c
> +             * - scsi_result_to_blk_status in drivers/scsi/scsi_lib.c
> +             * - blk_errors[] in block/blk-core.c
> +             */
> +        case EBADE:
> +            /* DID_NEXUS_FAILURE -> BLK_STS_NEXUS.  */
> +            scsi_req_complete(&r->req, RESERVATION_CONFLICT);
> +            break;
> +        case ENODATA:
> +            /* DID_MEDIUM_ERROR -> BLK_STS_MEDIUM.  */
> +            scsi_check_condition(r, SENSE_CODE(READ_ERROR));
> +            break;
> +        case EREMOTEIO:
> +            /* DID_TARGET_FAILURE -> BLK_STS_TARGET.  */
> +            scsi_req_complete(&r->req, HARDWARE_ERROR);
> +            break;
> +#endif
>           case ENOMEDIUM:
>               scsi_check_condition(r, SENSE_CODE(NO_MEDIUM));
>               break;
> 
Well, ironically I'm currently debugging a customer escalation which 
touches exactly this area.
It revolves more around the SG_IO handling; qemu ignores the host_status 
setting completely, leading to data corruption if the host has to abort 
commands.
(Not that the host _does_ abort commands, as qemu SG_IO sets an infinite 
timeout. But that's another story).

And part of the patchset is to enable passing of the host_status code 
back to the drivers. In particular virtio_scsi has a 'response' code 
which matches pretty closely to the linux SCSI DID_XXX codes.
So my idea is to pass the host_status directly down to the drivers, 
allowing virtio-scsi to do a mapping between DID_XX and virtio response 
codes. That way we can get rid of the weird mapping in utils/scsi.c 
where we try to map the DID_XXX codes onto Sense or SCSI status.
And with that we could map the error codes above onto DID_XXX codes, 
too, allowing us to have a more streamlined mapping.

Thoughts?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

Re: [PATCH] scsi-disk: convert more errno values back to SCSI statuses
Posted by Paolo Bonzini 3 years, 6 months ago
On 12/11/20 16:04, Hannes Reinecke wrote:
> On 11/12/20 10:52 AM, Paolo Bonzini wrote:
> Well, ironically I'm currently debugging a customer escalation which 
> touches exactly this area. It revolves more around the SG_IO handling;

Technically this patch is for *non* passthrough, but yeah it's a similar 
case.

> qemu ignores the host_status setting completely, leading to data
> corruption if the host has to abort commands. (Not that the host
> _does_ abort commands, as qemu SG_IO sets an infinite timeout. But
> that's another story).

> And part of the patchset is to enable passing of the host_status code 
> back to the drivers. In particular virtio_scsi has a 'response' code 
> which matches pretty closely to the linux SCSI DID_XXX codes.

Yeah, most of the time that's just because it's what can go wrong in 
SCSI.  Sometimes it's because I had no clue when writing the virtio-scsi 
spec and just copied blindly from Linux.  For example 
VIRTIO_SCSI_S_NEXUS_FAILURE probably should have never existed, since 
DID_NEXUS_FAILURE really should have been DID_RESRVATION_CONFLICT.

As an aside, I hate Linux host_status.  It's never clear when looking at 
the code if the statuses have been mapped back to BLK_STS codes or not, 
so you don't know if you already have gotten rid of DID_TARGET_FAILURE, 
DID_NEXUS_FAILURE, DID_ALLOC_FAILURE and DID_MEDIUM_ERROR (which are 
just weird way to store the SCSI status or sense key for future use, and 
not really "host statuses), and would really be a nexus failure only in 
the rare case of path-specific reservations).

> So my idea is to pass the host_status directly down to the drivers, 
> allowing virtio-scsi to do a mapping between DID_XX and virtio response 
> codes.

But yeah, this is a good idea.  But since I hate host_status, please 
define your own enum instead of DID_*.  Of course you can use the same 
values as DID_* and assert with QEMU_BUILD_BUG_ON that they are the 
same, but I don't want people to read the code and have to think of 
DID_ALLOC_FAILURE and the like.

Paolo