[PATCH v2 0/1] scsi: sas: fix mkfs.xfs failure due to bogus optimal_io_size

Ionut Nechita (Wind River) posted 1 patch 2 weeks, 5 days ago
There is a newer version of this series
drivers/scsi/scsi_transport_sas.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
Posted by Ionut Nechita (Wind River) 2 weeks, 5 days ago
From: Ionut Nechita <ionut.nechita@windriver.com>

v2:
  - Dropped the dma_opt_mapping_size() change per Robin Murphy's feedback:
    the DMA core semantics are correct, the bug is in the caller.
  - Dropped the nvme-pci patch (no longer needed).
  - Single patch now fixes the actual bug in scsi_transport_sas.c by
    checking if dma_opt_mapping_size() == dma_max_mapping_size() before
    setting opt_sectors.  When they are equal, no backend provided a
    real hint.
  - Added concrete values from the affected system (Dell PowerEdge R750,
    mpt3sas, SAMSUNG MZILT800HBHQ0D3) to the commit message.
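
For illustration only, the check described in the third v2 item can be
modelled in userspace roughly as follows. This is a sketch, not the patch
itself: model_opt_sectors() is a hypothetical name, and the two size
arguments stand in for the values that dma_opt_mapping_size() and
dma_max_mapping_size() would return in the kernel. Returning 0 for
"no hint" is an assumption made for this model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SHIFT 9

/*
 * Hypothetical userspace model of the caller-side check.
 * dma_opt and dma_max stand in for the results of dma_opt_mapping_size()
 * and dma_max_mapping_size(); they are plain parameters here.
 * Returns 0 when no real hint exists (assumed to mean "leave unset").
 */
static unsigned int model_opt_sectors(size_t dma_opt, size_t dma_max,
                                      unsigned int max_sectors)
{
    size_t opt;

    assert(max_sectors != 0);

    /* Equal values: no backend provided a real optimization hint. */
    if (dma_opt == dma_max)
        return 0;

    /* Convert bytes to 512-byte sectors, then cap at max_sectors. */
    opt = dma_opt >> SECTOR_SHIFT;
    if (opt > max_sectors)
        opt = max_sectors;

    return (unsigned int)opt;
}
```

With both sizes equal the model returns 0, whereas a genuine 1 MiB hint
yields 2048 sectors.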

v1 feedback summary:
  - Robin Murphy: dma_opt_mapping_size() semantics are correct; if no
    restriction exists, the largest efficient size IS the largest size.
    Fix the caller, not the common code.
  - John Garry: Asked for concrete max_sectors/opt_sectors values and
    questioned whether sd_revalidate_disk() would override opt_sectors
    via opt_xfer_blocks.
  - Damien Le Moal: Suggested min_not_zero() for nvme-pci (now moot).

Answer to John's question about opt_xfer_blocks:
  The SAS disks on this system do not report Optimal Transfer Length in
  VPD page B0, so sdkp->opt_xfer_blocks = 0.  sd_revalidate_disk() uses
  min_not_zero(0, opt_sectors) which returns opt_sectors, propagating
  the bogus value.  Observed values:

    shost->max_sectors      = 32767
    opt_sectors             = 32767  (capped at max_sectors)
    optimal_io_size         = 16773120  (visible in lsblk --topology)
    minimum_io_size         = 8192

  mkfs.xfs computes swidth=4095 and sunit=2, then fails because
  4095 % 2 != 0.
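
The propagation and the resulting geometry can be reproduced with a small
userspace sketch. min_not_zero_u() mirrors the behaviour of the kernel's
min_not_zero() macro; bytes_to_fs_blocks() and stripe_geometry_ok() are
hypothetical helpers, and the 4096-byte filesystem block size is an
assumption inferred from the swidth/sunit values above.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's min_not_zero() macro. */
static unsigned int min_not_zero_u(unsigned int a, unsigned int b)
{
    if (a == 0)
        return b;
    if (b == 0)
        return a;
    return a < b ? a : b;
}

/* Hypothetical helper: express a byte count in filesystem blocks. */
static unsigned int bytes_to_fs_blocks(unsigned int bytes,
                                       unsigned int block_size)
{
    assert(block_size != 0);
    return bytes / block_size;
}

/* Hypothetical helper: swidth must be a whole number of stripe units. */
static int stripe_geometry_ok(unsigned int swidth, unsigned int sunit)
{
    return sunit != 0 && swidth % sunit == 0;
}
```

With opt_xfer_blocks = 0, min_not_zero_u(0, 32767) returns 32767, and the
observed topology values give 4095 and 2 blocks, which
stripe_geometry_ok() rejects.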

Answer to John's question about blk_validate_limits() rounding:
  blk_validate_limits() rounds optimal_io_size down to a multiple of
  physical_block_size (4096), but does NOT enforce that optimal_io_size
  is a multiple of minimum_io_size (8192).  So optimal_io_size=16773120
  survives validation unchanged, since it is already a multiple of 4096.
  The mismatch only shows
  up when mkfs.xfs divides optimal_io_size by minimum_io_size and expects
  an integer result: 16773120 / 8192 = 2047.5, giving swidth=4095 and
  sunit=2, with 4095 % 2 != 0.
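
As a worked check of the numbers, the sketch below rounds 32767 sectors
down to a multiple of the physical block size. round_down_to() and
model_io_opt() are hypothetical userspace helpers that only imitate the
rounding described above; they are not the kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Round value down to the nearest multiple of 'multiple'. */
static uint32_t round_down_to(uint32_t value, uint32_t multiple)
{
    assert(multiple != 0);
    return value - (value % multiple);
}

/*
 * Hypothetical model: convert opt_sectors to bytes, then round down to
 * a multiple of the physical block size.
 */
static uint32_t model_io_opt(uint32_t opt_sectors,
                             uint32_t physical_block_size)
{
    return round_down_to(opt_sectors * SECTOR_SIZE, physical_block_size);
}
```

model_io_opt(32767, 4096) gives 16773120, which is a multiple of 4096 but
leaves a remainder of 4096 when divided by 8192.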

Test environment:
  - Dell PowerEdge R750
  - SAS Controller: Broadcom/LSI mpt3sas (SAS3816, FW 33.15.00.00)
  - Disks: SAMSUNG MZILT800HBHQ0D3 (800GB SCSI SAS SSD)
  - Kernel: 6.12.0-1-amd64 with intel_iommu=off
  - IOMMU: Disabled (DMAR: IOMMU disabled), default domain: Passthrough

Based on linux-next (next-20260316).

Link: https://lore.kernel.org/lkml/20260316203956.64515-1-ionut.nechita@windriver.com/

Ionut Nechita (1):
  scsi: sas: skip opt_sectors when DMA reports no real optimization hint

 drivers/scsi/scsi_transport_sas.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

--
2.43.0
Re: [PATCH v2 0/1] scsi: sas: fix mkfs.xfs failure due to bogus optimal_io_size
Posted by John Garry 2 weeks, 5 days ago
On 18/03/2026 07:43, Ionut Nechita (Wind River) wrote:
> Answer to John's question about blk_validate_limits() rounding:
>    blk_validate_limits() rounds optimal_io_size down to a multiple of
>    physical_block_size (4096), but does NOT enforce that optimal_io_size
>    is a multiple of minimum_io_size (8192).  So optimal_io_size=16773120
>    survives validation unchanged, since it is already a multiple of 4096.
>    The mismatch only shows
>    up when mkfs.xfs divides optimal_io_size by minimum_io_size and expects
>    an integer result: 16773120 / 8192 = 2047.5, giving swidth=4095 and
>    sunit=2, with 4095 % 2 != 0.

Thanks for the info. I feel that io_opt should be a multiple of io_min
and that we should enforce it in blk queue limits validation, but that
can mask problems like the one you have seen.
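
One possible shape of such enforcement, as a userspace sketch only: the
thread does not say whether validation should round io_opt down or reject
it, so rounding down is an assumption here, and align_io_opt_to_io_min()
is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: round io_opt down to a multiple of io_min.
 * An io_min of 0 is treated as "no constraint" (assumption).
 */
static uint32_t align_io_opt_to_io_min(uint32_t io_opt, uint32_t io_min)
{
    uint32_t aligned;

    if (io_min == 0)
        return io_opt;

    aligned = io_opt - (io_opt % io_min);
    assert(aligned % io_min == 0);
    return aligned;
}
```

For the values in this thread, 16773120 would become 16769024
(2047 * 8192), which is the kind of silent adjustment that could hide a
bogus hint from the driver.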