[PATCH] scsi: lpfc: use unsigned type for num_sge

Daniel Wagner posted 1 patch 1 year, 12 months ago
[PATCH] scsi: lpfc: use unsigned type for num_sge
Posted by Daniel Wagner 1 year, 12 months ago
From: Hannes Reinecke <hare@suse.de>

LUNs were observed going into the “failed ready running” state for
sizes above 1T and for even sizes (2T, 4T, 6T, 8T and 10T). The issue
occurs when DIF is enabled at the host.

The kernel logs:

  Cannot setup S/G List for HBAIO segs 1/1 SGL 512 SCSI 256: 3 0

The host lpfc driver fails to set up the scatter/gather list
(protection data) for the I/Os.

The signed return type of lpfc_bg_setup_sgl()/lpfc_bg_setup_sgl_prot()
causes the most significant bit of the returned SGE count to be treated
as the sign bit. Use an unsigned type instead.

Signed-off-by: Hannes Reinecke <hare@suse.de>
[dwagner: added commit message]
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/scsi/lpfc/lpfc_scsi.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index d26941b131fd..bf879d81846b 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -1918,7 +1918,7 @@ lpfc_bg_setup_bpl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc,
  *
  * Returns the number of SGEs added to the SGL.
  **/
-static int
+static uint32_t
 lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc,
 		struct sli4_sge *sgl, int datasegcnt,
 		struct lpfc_io_buf *lpfc_cmd)
@@ -1926,8 +1926,8 @@ lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc,
 	struct scatterlist *sgde = NULL; /* s/g data entry */
 	struct sli4_sge_diseed *diseed = NULL;
 	dma_addr_t physaddr;
-	int i = 0, num_sge = 0, status;
-	uint32_t reftag;
+	int i = 0, status;
+	uint32_t reftag, num_sge = 0;
 	uint8_t txop, rxop;
 #ifdef CONFIG_SCSI_LPFC_DEBUG_FS
 	uint32_t rc;
@@ -2099,7 +2099,7 @@ lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc,
  *
  * Returns the number of SGEs added to the SGL.
  **/
-static int
+static uint32_t
 lpfc_bg_setup_sgl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc,
 		struct sli4_sge *sgl, int datacnt, int protcnt,
 		struct lpfc_io_buf *lpfc_cmd)
@@ -2123,8 +2123,8 @@ lpfc_bg_setup_sgl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc,
 	uint32_t rc;
 #endif
 	uint32_t checking = 1;
-	uint32_t dma_offset = 0;
-	int num_sge = 0, j = 2;
+	uint32_t dma_offset = 0, num_sge = 0;
+	int j = 2;
 	struct sli4_hybrid_sgl *sgl_xtra = NULL;
 
 	sgpe = scsi_prot_sglist(sc);
-- 
2.43.0
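
As a minimal illustration of the signed/unsigned distinction the commit message refers to, the standalone C sketch below shows how one 32-bit pattern reads as a large count when held in a uint32_t and as a negative number when held in an int32_t. It is not driver code: bits_as_signed() and treated_as_failure() are hypothetical helpers, and the "negative means failure" check is an assumption made for illustration, not a statement about how lpfc validates num_sge.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the lpfc driver): reinterpret a 32-bit
 * pattern as a signed value without relying on implementation-defined
 * integer conversion.
 */
static int32_t bits_as_signed(uint32_t bits)
{
	int32_t value;

	memcpy(&value, &bits, sizeof(value));
	return value;
}

/* Hypothetical caller-side check of the form "negative means failure". */
static int treated_as_failure(int32_t count)
{
	return count < 0;
}
```

With these helpers, a pattern with bit 31 set, such as 0x80000003, is negative once viewed as signed and would trip the assumed check, while any pattern up to 0x7fffffff would not.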

Re: [PATCH] scsi: lpfc: use unsigned type for num_sge
Posted by Martin K. Petersen 1 year, 10 months ago
On Wed, 20 Dec 2023 17:26:58 +0100, Daniel Wagner wrote:

> LUNs were observed going into the “failed ready running” state for
> sizes above 1T and for even sizes (2T, 4T, 6T, 8T and 10T). The issue
> occurs when DIF is enabled at the host.
> 
> The kernel logs:
> 
>   Cannot setup S/G List for HBAIO segs 1/1 SGL 512 SCSI 256: 3 0
> 
> [...]

Applied to 6.8/scsi-fixes, thanks!

[1/1] scsi: lpfc: use unsigned type for num_sge
      https://git.kernel.org/mkp/scsi/c/d6c1b19153f9

-- 
Martin K. Petersen	Oracle Linux Engineering
Re: [PATCH] scsi: lpfc: use unsigned type for num_sge
Posted by Dick Kennedy 1 year, 12 months ago
The change is good; however, I don't think this was really the problem.

We would like to know whether this patch really solved the issue they observed.

A good data point would be which adapter they're using.
If that adapter supports hybrid SGL (i.e. phba->cfg_xpsgl), then we
would have set the max sg_tablesize = LPFC_MAX_SG_TABLESIZE = 0xffff.

But even then, this patch implies that dma_map_sg() returned a crazy
huge count with the MSB set.
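
The bound argument above can be sketched in a few lines of standalone C. This assumes the SGE count never exceeds the table-size limit, which the thread does not confirm. SKETCH_MAX_SG_TABLESIZE reuses the 0xffff value quoted for LPFC_MAX_SG_TABLESIZE, and both helper names are hypothetical.

```c
#include <stdint.h>

/* Value quoted in the thread for LPFC_MAX_SG_TABLESIZE; renamed here. */
#define SKETCH_MAX_SG_TABLESIZE 0xffffu

/* Returns 1 if bit 31 (the MSB of a 32-bit value) is set. */
static int msb_is_set(uint32_t count)
{
	return (count & 0x80000000u) != 0;
}

/* Returns 1 if count is within the quoted table-size limit. */
static int within_table_limit(uint32_t count)
{
	return count <= SKETCH_MAX_SG_TABLESIZE;
}
```

Any count for which within_table_limit() holds uses at most the low 16 bits, so msb_is_set() is false for it. Under that assumption, a signed int and a uint32_t would hold the same value.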

On Wed, Dec 20, 2023 at 8:29 AM Daniel Wagner <dwagner@suse.de> wrote:

> [...]

Re: Re: [PATCH] scsi: lpfc: use unsigned type for num_sge
Posted by Daniel Wagner 1 year, 11 months ago
Hi Dick,

On Fri, Dec 22, 2023 at 10:04:50AM -0800, Dick Kennedy wrote:
> The change is good, however, I don't think this was really the
> problem.

I tried to write the commit message based on the bug report we got. So
yes, it's possible that it is not correct, as I was not really involved
and might have misinterpreted it.

> We would like to know if this patch really solved an issue they
> observed.

Yes, it fixes the reported problem.

>  A good data point to know is what adapter they're using.

A bunch of different HPE cards which show this log entry: SN1700E,
SN1610E and SN1200E.

>  If that adapter supports hybrid sgl (i.e. phba->cfg_xpsgl), then we
>  would have set the max sg_tablesize = LPFC_MAX_SG_TABLESIZE = 0xffff.
> 
>  But even then, this patch implies that dma_map_sg() returned a crazy
>  huge amount with the MSB set.

Sure, though this seems to be the case.

One noteworthy detail is that DIF needs to be enabled to trigger
it:

# cat /sys/module/lpfc/parameters/lpfc_enable_bg
1
# cat /sys/module/lpfc/parameters/lpfc_prot_guard
2
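
For anyone who wants to collect the same two values programmatically, here is a minimal C sketch. read_int_param() is a hypothetical helper and not part of any lpfc tooling; it simply reads one integer from a file such as the two sysfs paths shown above, and reports failure if the file is missing (for example when the lpfc module is not loaded).

```c
#include <stdio.h>

/*
 * Hypothetical helper: read one integer from a sysfs-style file.
 * Returns 0 on success and -1 if the file is missing or unparsable.
 */
static int read_int_param(const char *path, long *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%ld", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

A caller would pass "/sys/module/lpfc/parameters/lpfc_enable_bg" or "/sys/module/lpfc/parameters/lpfc_prot_guard" as the path and check the return value before using the result.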

Thanks,
Daniel
Re: Re: Re: [PATCH] scsi: lpfc: use unsigned type for num_sge
Posted by Daniel Wagner 1 year, 10 months ago
On Wed, Jan 17, 2024 at 11:56:27AM +0100, Daniel Wagner wrote:
> Hi Dick,
> 
> On Fri, Dec 22, 2023 at 10:04:50AM -0800, Dick Kennedy wrote:
> > The change is good, however, I don't think this was really the
> > problem.
> 
> I tried to write the commit message based on the bug report we got. So
> yes, it's possible that it is not correct, as I was not really involved
> and might have misinterpreted it.

Any chance to get this moving forward?