This series implements PHY tuning support for the Cadence QSPI controller
to enable reliable high-speed operations. Without PHY tuning, controllers
use conservative timing that limits performance. PHY tuning calibrates
RX/TX delay lines to find optimal data capture timing windows, enabling
operation up to the controller's maximum frequency.
Background:
High-speed SPI memory controllers require precise timing calibration for
reliable operation. At higher frequencies, board-to-board variations make
fixed timing parameters inadequate. The Cadence QSPI controller includes
a PHY interface with programmable delay lines (0-127 taps) for RX and TX
paths, but these require runtime calibration to find the valid timing
window.
Approach:
Add SDR/DDR PHY tuning algorithms for the Cadence controller:
SDR Mode Tuning (1D search):
- Searches for two consecutive valid RX delay windows
- Selects the larger window and uses its midpoint for maximum margin
- TX delay fixed at maximum (127) as it's less critical in SDR
DDR Mode Tuning (2D search):
- Finds RX boundaries (rxlow/rxhigh) using TX window sweeps
- Finds TX boundaries (txlow/txhigh) at fixed RX positions
- Defines valid region corners and detects gaps via binary search
- Applies temperature compensation for optimal point selection
- Handles single or dual passing regions with different strategies
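
The SDR search described above can be sketched in a few lines of C. This is
only an illustration under stated assumptions, not the code from patch 8: the
tap_passes_fn callback, struct window, find_window() and sdr_pick_rx_tap() are
hypothetical names, and the fallback to a single window when only one exists
is an assumption. The pass/fail predicate stands in for "read the tuning
pattern at this RX delay and compare it".

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_TAPS 128 /* delay line taps 0..127, as stated in the cover letter */

/* Hypothetical stand-in for "read the pattern at this RX tap and compare". */
typedef bool (*tap_passes_fn)(int rx_tap);

/* Inclusive range of passing taps; start < 0 means "no window found". */
struct window {
	int start;
	int end;
};

static int window_len(struct window w)
{
	return w.start < 0 ? 0 : w.end - w.start + 1;
}

/* Return the first contiguous run of passing taps at or after 'from'. */
static struct window find_window(tap_passes_fn passes, int from)
{
	struct window w = { -1, -1 };
	int tap = from;

	assert(from >= 0 && from <= NUM_TAPS);

	while (tap < NUM_TAPS && !passes(tap))
		tap++;
	if (tap == NUM_TAPS)
		return w;

	w.start = tap;
	while (tap < NUM_TAPS && passes(tap))
		tap++;
	w.end = tap - 1;
	return w;
}

/*
 * Look for two consecutive passing windows, keep the larger one and
 * return its midpoint. Returns -1 if no tap passes at all.
 */
static int sdr_pick_rx_tap(tap_passes_fn passes)
{
	struct window first = find_window(passes, 0);
	struct window second, best;

	if (first.start < 0)
		return -1;

	second = find_window(passes, first.end + 1);
	best = window_len(second) > window_len(first) ? second : first;
	return best.start + (best.end - best.start) / 2;
}

/* Made-up example: taps 10..29 and 60..99 pass, everything else fails. */
static bool demo_passes(int tap)
{
	return (tap >= 10 && tap <= 29) || (tap >= 60 && tap <= 99);
}
```

With the made-up demo_passes() predicate, the second window (60..99) is the
larger one, so sdr_pick_rx_tap(demo_passes) returns its midpoint, tap 79.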
Patch description:
Infrastructure (1-5):
- Patch 1: Extend spi-max-frequency DT binding to accept an optional
second value forming a [base-freq, max-freq] pair
- Patch 2: Add Cadence-specific cdns,phy-pattern-partition phandle for
NOR flash PHY tuning pattern location
- Patch 3: Parse two-element spi-max-frequency in spi.c; adds
spi_device.base_speed_hz (0 when a single value is used,
keeping all existing DT fully compatible)
- Patch 4: Add spi_mem_apply_base_freq_cap(), called from
spi_mem_exec_op() to cap non-PHY ops to base_speed_hz;
tuned ops bypass the cap because execute_tuning() marks
them with op->max_freq = max_speed_hz
- Patch 5: Add execute_tuning callback to spi_controller_mem_ops and
spi_mem_execute_tuning() wrapper in SPI-MEM core
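
The capping rule from patches 3 and 4 can be modelled roughly as below. This
is a simplified sketch, not the code in spi-mem.c: struct dev_model,
struct op_model and apply_base_freq_cap() are hypothetical stand-ins for
struct spi_device, struct spi_mem_op and spi_mem_apply_base_freq_cap(), and
treating max_freq == 0 as "no per-op limit" is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the relevant spi_device fields. */
struct dev_model {
	uint32_t max_speed_hz;  /* calibration target (second DT value) */
	uint32_t base_speed_hz; /* 0 when DT has a single spi-max-frequency */
};

/* Hypothetical, simplified stand-in for the relevant spi_mem_op field. */
struct op_model {
	uint32_t max_freq;      /* assumed: 0 means "no per-op limit" */
};

static void apply_base_freq_cap(const struct dev_model *dev,
				struct op_model *op)
{
	/* Single-value DT: nothing to cap, behaviour is unchanged. */
	if (!dev->base_speed_hz)
		return;

	/* Tuned ops were marked with max_freq = max_speed_hz: bypass the cap. */
	if (op->max_freq == dev->max_speed_hz)
		return;

	/* Every other op is limited to the conservative base frequency. */
	if (!op->max_freq || op->max_freq > dev->base_speed_hz)
		op->max_freq = dev->base_speed_hz;
}
```

In this model an untuned op on a device described with a [25 MHz, 166 MHz]
pair ends up at 25 MHz, while an op marked with 166 MHz keeps its frequency.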
Cadence QSPI Implementation (6-10):
- Patch 6: Move cqspi_readdata_capture() earlier (preparatory)
- Patch 7: Add DQS bit to cqspi_readdata_capture() (preparatory)
- Patch 8: Add complete PHY tuning support: DLL management, pattern
verification (NOR via cdns,phy-pattern-partition phandle,
NAND via write-to-cache), SDR 1D and DDR 2D search
algorithms with temperature compensation, AM654-specific
execute_tuning entry point; base_speed_hz is cleared during
the tuning loop and restored unconditionally on return
- Patch 9: Reject 2-byte-address DDR operations via a new
CQSPI_NO_2BYTE_ADDR_PHY_DDR quirk flag to work around
AM654 OSPI erratum i2383
- Patch 10: Enable PHY for direct memory-mapped reads (aligned body
region only; unaligned head and tail run without PHY) and
for indirect writes >= 1 KB
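
The head/body/tail split mentioned for patch 10 can be illustrated as
follows. This is a sketch only: struct read_split and split_read() are
hypothetical names, and the alignment is a parameter because the cover letter
does not state the value the driver uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte counts for the three parts of one direct read. */
struct read_split {
	size_t head; /* unaligned start, read without PHY */
	size_t body; /* aligned middle, read with PHY */
	size_t tail; /* unaligned end, read without PHY */
};

/* 'align' must be a power of two; the real alignment is not given here. */
static struct read_split split_read(uint64_t from, size_t len, size_t align)
{
	struct read_split s = { 0, 0, 0 };
	size_t misalign;

	assert(align && (align & (align - 1)) == 0);

	/* Bytes needed to reach the next aligned offset, clamped to len. */
	misalign = (size_t)(from & (align - 1));
	if (misalign) {
		s.head = align - misalign;
		if (s.head > len)
			s.head = len;
	}

	/* Largest multiple of 'align' that fits in what remains. */
	s.body = (len - s.head) & ~(align - 1);
	s.tail = len - s.head - s.body;
	return s;
}
```

For example, with a 16-byte alignment, a 100-byte read starting at offset 5
splits into an 11-byte head, an 80-byte body and a 9-byte tail.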
MTD core (11-13):
- Patch 11: Integrate tuning in SPI-NAND probe; propagate the validated
frequency to all plane dirmaps (primary and secondary op
templates) and to the persistent write dirmap template
- Patch 12: Extract spi_nor_spimem_get_read_op() helper (preparatory)
- Patch 13: Integrate tuning in SPI-NOR probe; patch the dirmap op
template with the validated frequency; store the result in
nor->max_read_op so all subsequent reads (dirmap and direct)
pick up the tuned speed automatically
Series dependency:
Merge after: https://lore.kernel.org/linux-spi/20260527173736.2243004-1-s-k6@ti.com/T/#u
Testing:
This series was tested on TI's
AM62Ax SK with OSPI NAND flash and
AM62Px SK with OSPI NOR flash:
Read throughput:
|-------------------------------------|
|           | without PHY | with PHY  |
|-------------------------------------|
| OSPI NOR  | 37.5 MB/s   | 216 MB/s  |
|-------------------------------------|
| OSPI NAND | 9.2 MB/s    | 35.1 MB/s |
|-------------------------------------|
Write throughput:
|-------------------------------------|
|           | without PHY | with PHY  |
|-------------------------------------|
| OSPI NAND | 6 MB/s      | 9.2 MB/s  |
|-------------------------------------|
Test log: https://gist.github.com/santhosh21/3434d062f31622c5877a375218cd49c7
Repo: https://github.com/santhosh21/linux/commits/phy_tuning_v3/
Changes in v3:
- Drop spi-has-dqs DT property; DQS is now enabled automatically when
the selected read operation uses DDR signalling (dtr flags in the op)
- Extend spi-max-frequency to accept an optional second value forming a
[base-freq, max-freq] pair; the presence of two values signals PHY
tuning intent and encodes both the conservative base speed and the
calibration target in one property
- Add base_speed_hz to struct spi_device (spi.c/spi.h) and parse the
two-element array there; single-value DT is fully backward-compatible
- Move frequency enforcement from the Cadence driver to the core: new
spi_mem_apply_base_freq_cap() called from spi_mem_exec_op() replaces
the per-driver cqspi_op_matches_tuned() and non_phy_clk_rate field
- Propagate the tuned max_freq to dirmap op templates after
execute_tuning() succeeds; store persistent op templates in
spi_nor.max_read_op and spinand.{max_read,max_write}_op so the
frequency writeback survives across the probe call
- Replace NOR pattern partition lookup by name with a
cdns,phy-pattern-partition DT phandle pointing directly to the
partition node
- Add CQSPI_NO_2BYTE_ADDR_PHY_DDR quirk and reject 2-byte-address DDR
ops in cqspi_supports_mem_op() to work around AM654 erratum i2383
- Remove RFC tag
- Rebase on v7.1-rc5
- Collect tags from Miquel
- Link to v2: https://lore.kernel.org/linux-spi/20260113141617.1905039-1-s-k6@ti.com/
Changes in v2:
- Call .execute_tuning() from the spi-mem clients instead of mtdcore,
passing the best read_op and an optional write_op
- Add a compatible-specific .execute_tuning() callback, which
spi_mem_execute_tuning() calls if it exists
- Let the controller, rather than the spi-mem clients, check whether
tuning is required
- Write the phy_pattern to cache if a relevant write_op is passed;
otherwise look up the offset of the partition that holds the phy_pattern
- Add tuning algorithm for DDR mode
- Add support for DQS
- Restrict PHY frequency to tuned operations
- Link to v1: https://lore.kernel.org/linux-spi/20250811193219.731851-1-s-k6@ti.com/
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
Pratyush Yadav (1):
mtd: spi-nor: extract read op template construction into helper
Santhosh Kumar K (12):
spi: dt-bindings: allow spi-max-frequency to specify a frequency pair
spi: dt-bindings: cdns,qspi-nor: add PHY tuning pattern partition
property
spi: parse two-element spi-max-frequency property
spi: spi-mem: add spi_mem_apply_base_freq_cap()
spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning()
spi: cadence-quadspi: move cqspi_readdata_capture earlier
spi: cadence-quadspi: add DQS support to read data capture
spi: cadence-quadspi: add PHY tuning support
spi: cadence-quadspi: reject 2-byte-address DDR ops on PHY-tunable
hardware
spi: cadence-quadspi: enable PHY for direct reads and indirect writes
mtd: spinand: run PHY tuning after init and update dirmap frequencies
mtd: spi-nor: run PHY tuning after init and update dirmap frequency
.../spi/cdns,qspi-nor-peripheral-props.yaml | 8 +
.../bindings/spi/spi-peripheral-props.yaml | 10 +-
drivers/mtd/nand/spi/core.c | 35 +
drivers/mtd/spi-nor/core.c | 85 +-
drivers/spi/spi-cadence-quadspi.c | 2267 +++++++++++++++--
drivers/spi/spi-mem.c | 57 +-
drivers/spi/spi.c | 17 +-
include/linux/mtd/spi-nor.h | 3 +
include/linux/mtd/spinand.h | 4 +
include/linux/spi/spi-mem.h | 10 +
include/linux/spi/spi.h | 2 +
11 files changed, 2308 insertions(+), 190 deletions(-)
--
2.34.1
Hi Santhosh,

Very happy to see this v3! Looks pretty neat overall.

On 27/05/2026 at 23:25:14 +0530, Santhosh Kumar K <s-k6@ti.com> wrote:

> This series implements PHY tuning support for the Cadence QSPI controller
> to enable reliable high-speed operations. Without PHY tuning, controllers
> use conservative timing that limits performance. PHY tuning calibrates
> RX/TX delay lines to find optimal data capture timing windows, enabling
> operation up to the controller's maximum frequency.
>
> Background:
> High-speed SPI memory controllers require precise timing calibration for
> reliable operation. At higher frequencies, board-to-board variations make
> fixed timing parameters inadequate. The Cadence QSPI controller includes
> a PHY interface with programmable delay lines (0-127 taps) for RX and TX
> paths, but these require runtime calibration to find the valid timing
> window.
>
> Approach:
> Add SDR/DDR PHY tuning algorithms for the Cadence controller:
>
> SDR Mode Tuning (1D search):
> - Searches for two consecutive valid RX delay windows
> - Selects the larger window and uses its midpoint for maximum margin
> - TX delay fixed at maximum (127) as it's less critical in SDR
>
> DDR Mode Tuning (2D search):
> - Finds RX boundaries (rxlow/rxhigh) using TX window sweeps
> - Finds TX boundaries (txlow/txhigh) at fixed RX positions
> - Defines valid region corners and detects gaps via binary search
> - Applies temperature compensation for optimal point selection
> - Handles single or dual passing regions with different strategies
>
> Patch description:
> Infrastructure (1-5):
> - Patch 1: Extend spi-max-frequency DT binding to accept an optional
> second value forming a [base-freq, max-freq] pair
> - Patch 2: Add cadence-specific cdns,phy-pattern-partition phandle for
> NOR flash PHY tuning pattern location
> - Patch 3: Parse two-element spi-max-frequency in spi.c; adds
> spi_device.base_speed_hz (0 when a single value is used,
> keeping all existing DT fully compatible)
> - Patch 4: Add spi_mem_apply_base_freq_cap(), called from
> spi_mem_exec_op() to cap non-PHY ops to base_speed_hz;
> tuned ops bypass the cap because execute_tuning() marks
> them with op->max_freq = max_speed_hz
> - Patch 5: Add execute_tuning callback to spi_controller_mem_ops and
> spi_mem_execute_tuning() wrapper in SPI-MEM core
>
> Cadence QSPI Implementation (6-10):
> - Patch 6: Move cqspi_readdata_capture() earlier (preparatory)
> - Patch 7: Add DQS bit to cqspi_readdata_capture() (preparatory)
> - Patch 8: Add complete PHY tuning support: DLL management, pattern
> verification (NOR via cdns,phy-pattern-partition phandle,
> NAND via write-to-cache), SDR 1D and DDR 2D search
> algorithms with temperature compensation, AM654-specific
> execute_tuning entry point; base_speed_hz is cleared during
> the tuning loop and restored unconditionally on return
> - Patch 9: Reject 2-byte-address DDR operations via a new
> CQSPI_NO_2BYTE_ADDR_PHY_DDR quirk flag to work around
> AM654 OSPI erratum i2383
> - Patch 10: Enable PHY for direct memory-mapped reads (aligned body
> region only; unaligned head and tail run without PHY) and
> for indirect writes >= 1 KB
>
> MTD core (11-13):
> - Patch 11: Integrate tuning in SPI-NAND probe; propagate the validated
> frequency to all plane dirmaps (primary and secondary op
> templates) and to the persistent write dirmap template
> - Patch 12: Extract spi_nor_spimem_get_read_op() helper (preparatory)
> - Patch 13: Integrate tuning in SPI-NOR probe; patch the dirmap op
> template with the validated frequency; store the result in
> nor->max_read_op so all subsequent reads (dirmap and direct)
> pick up the tuned speed automatically
>
> Series dependency:
> Merge after:
> https://lore.kernel.org/linux-spi/20260527173736.2243004-1-s-k6@ti.com/T/#u

Isn't the DQS series a prerequisite as well? I sent it as an RFC, we can
definitely consider it for merge together with this series once
ready.

Link: https://lore.kernel.org/linux-mtd/20260205-winbond-nand-next-phy-tuning-v1-0-5e7d3976f0f1@bootlin.com/

Do you confirm that you have "[PATCH DO NOT MERGE RFC 4/4] spi: cadence-qspi: Retrieve
DQS capability using the core helper" in your branch for the PHY tuning
series to work?

> Testing:
> This series was tested on TI's
> AM62Ax SK with OSPI NAND flash and
> AM62Px SK with OSPI NOR flash:
>
> Read throughput:
> |-------------------------------------|
> |           | without PHY | with PHY  |
> |-------------------------------------|
> | OSPI NOR  | 37.5 MB/s   | 216 MB/s  |

I am impressed by the SPI NOR improvement o_O

> |-------------------------------------|
> | OSPI NAND | 9.2 MB/s    | 35.1 MB/s |
> |-------------------------------------|

Was this tested in 8D-8D-8D mode?

> Write throughput:
> |-------------------------------------|
> |           | without PHY | with PHY  |
> |-------------------------------------|
> | OSPI NAND | 6 MB/s      | 9.2 MB/s  |
> |-------------------------------------|

Thanks,
Miquèl
Hello Miquel,

On 28/05/26 14:00, Miquel Raynal wrote:
> Hi Santhosh,
>
> Very happy to see this v3! Looks pretty neat overall.
>
> On 27/05/2026 at 23:25:14 +0530, Santhosh Kumar K <s-k6@ti.com> wrote:
>
>> This series implements PHY tuning support for the Cadence QSPI controller
>> to enable reliable high-speed operations. Without PHY tuning, controllers
>> use conservative timing that limits performance. PHY tuning calibrates
>> RX/TX delay lines to find optimal data capture timing windows, enabling
>> operation up to the controller's maximum frequency.
>>
>> Background:
>> High-speed SPI memory controllers require precise timing calibration for
>> reliable operation. At higher frequencies, board-to-board variations make
>> fixed timing parameters inadequate. The Cadence QSPI controller includes
>> a PHY interface with programmable delay lines (0-127 taps) for RX and TX
>> paths, but these require runtime calibration to find the valid timing
>> window.
>>
>> Approach:
>> Add SDR/DDR PHY tuning algorithms for the Cadence controller:
>>
>> SDR Mode Tuning (1D search):
>> - Searches for two consecutive valid RX delay windows
>> - Selects the larger window and uses its midpoint for maximum margin
>> - TX delay fixed at maximum (127) as it's less critical in SDR
>>
>> DDR Mode Tuning (2D search):
>> - Finds RX boundaries (rxlow/rxhigh) using TX window sweeps
>> - Finds TX boundaries (txlow/txhigh) at fixed RX positions
>> - Defines valid region corners and detects gaps via binary search
>> - Applies temperature compensation for optimal point selection
>> - Handles single or dual passing regions with different strategies
>>
>> Patch description:
>> Infrastructure (1-5):
>> - Patch 1: Extend spi-max-frequency DT binding to accept an optional
>> second value forming a [base-freq, max-freq] pair
>> - Patch 2: Add cadence-specific cdns,phy-pattern-partition phandle for
>> NOR flash PHY tuning pattern location
>> - Patch 3: Parse two-element spi-max-frequency in spi.c; adds
>> spi_device.base_speed_hz (0 when a single value is used,
>> keeping all existing DT fully compatible)
>> - Patch 4: Add spi_mem_apply_base_freq_cap(), called from
>> spi_mem_exec_op() to cap non-PHY ops to base_speed_hz;
>> tuned ops bypass the cap because execute_tuning() marks
>> them with op->max_freq = max_speed_hz
>> - Patch 5: Add execute_tuning callback to spi_controller_mem_ops and
>> spi_mem_execute_tuning() wrapper in SPI-MEM core
>>
>> Cadence QSPI Implementation (6-10):
>> - Patch 6: Move cqspi_readdata_capture() earlier (preparatory)
>> - Patch 7: Add DQS bit to cqspi_readdata_capture() (preparatory)
>> - Patch 8: Add complete PHY tuning support: DLL management, pattern
>> verification (NOR via cdns,phy-pattern-partition phandle,
>> NAND via write-to-cache), SDR 1D and DDR 2D search
>> algorithms with temperature compensation, AM654-specific
>> execute_tuning entry point; base_speed_hz is cleared during
>> the tuning loop and restored unconditionally on return
>> - Patch 9: Reject 2-byte-address DDR operations via a new
>> CQSPI_NO_2BYTE_ADDR_PHY_DDR quirk flag to work around
>> AM654 OSPI erratum i2383
>> - Patch 10: Enable PHY for direct memory-mapped reads (aligned body
>> region only; unaligned head and tail run without PHY) and
>> for indirect writes >= 1 KB
>>
>> MTD core (11-13):
>> - Patch 11: Integrate tuning in SPI-NAND probe; propagate the validated
>> frequency to all plane dirmaps (primary and secondary op
>> templates) and to the persistent write dirmap template
>> - Patch 12: Extract spi_nor_spimem_get_read_op() helper (preparatory)
>> - Patch 13: Integrate tuning in SPI-NOR probe; patch the dirmap op
>> template with the validated frequency; store the result in
>> nor->max_read_op so all subsequent reads (dirmap and direct)
>> pick up the tuned speed automatically
>>
>> Series dependency:
>> Merge after:
>> https://lore.kernel.org/linux-spi/20260527173736.2243004-1-s-k6@ti.com/T/#u
>
> Isn't the DQS series a prerequisite as well? I sent it as an RFC, we can
> definitely consider it for merge together with this series once
> ready.
>
> Link: https://lore.kernel.org/linux-mtd/20260205-winbond-nand-next-phy-tuning-v1-0-5e7d3976f0f1@bootlin.com/
>
> Do you confirm that you have "[PATCH DO NOT MERGE RFC 4/4] spi: cadence-qspi: Retrieve
> DQS capability using the core helper" in your branch for the PHY tuning
> series to work?

The DQS configuration is now derived from the selected read_op variant
(SDR vs DDR), which in turn selects the corresponding tuning algorithm.
The SDR and DDR tuning algorithms are designed such that SDR tuning runs
with DQS disabled, while DDR tuning runs with DQS enabled.

Because of this, the DQS support series is no longer a prerequisite for
the PHY tuning series. However, it can be a useful follow-up to make the
implementation more optimal. Once use_dqs is enabled, we can additionally
check has_dqs to ensure the flash advertises DQS support before enabling
it.

>
>> Testing:
>> This series was tested on TI's
>> AM62Ax SK with OSPI NAND flash and
>> AM62Px SK with OSPI NOR flash:
>>
>> Read throughput:
>> |-------------------------------------|
>> |           | without PHY | with PHY  |
>> |-------------------------------------|
>> | OSPI NOR  | 37.5 MB/s   | 216 MB/s  |
>
> I am impressed by the SPI NOR improvement o_O
>
>> |-------------------------------------|
>> | OSPI NAND | 9.2 MB/s    | 35.1 MB/s |
>> |-------------------------------------|
>
> Was this tested in 8D-8D-8D mode?

Tested in 8S-PHY mode. 8D-PHY mode is not supported with 2-byte
addressing due to Errata-i2383. [0]

[0] https://www.ti.com/lit/er/sprz544c/sprz544c.pdf

Regards,
Santhosh.

>
>> Write throughput:
>> |-------------------------------------|
>> |           | without PHY | with PHY  |
>> |-------------------------------------|
>> | OSPI NAND | 6 MB/s      | 9.2 MB/s  |
>> |-------------------------------------|
>
> Thanks,
> Miquèl