[PATCH RFC] mtd: spi-nor: Sleep between ready checks

William A. Kennington III posted 1 patch 6 months, 3 weeks ago
drivers/mtd/spi-nor/core.c  | 56 +++++++++++++++++++++++++++----------
include/linux/mtd/spi-nor.h |  2 ++
2 files changed, 44 insertions(+), 14 deletions(-)
[PATCH RFC] mtd: spi-nor: Sleep between ready checks
Posted by William A. Kennington III 6 months, 3 weeks ago
We have SPI NOR devices that respond very slowly (200ms+) to some
commands. This causes our CPU to stall while repeatedly polling the NOR
with no cooldown in between attempts. In order to reduce overhead, this
patch introduces an exponential backoff in SPI readiness polling, that
self tunes the polling interval to match the speed of most transactions.

Signed-off-by: William A. Kennington III <william@wkennington.com>
---
 drivers/mtd/spi-nor/core.c  | 56 +++++++++++++++++++++++++++----------
 include/linux/mtd/spi-nor.h |  2 ++
 2 files changed, 44 insertions(+), 14 deletions(-)

diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index ac4b960101cc..75d3a4a1c1e3 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -714,27 +714,47 @@ static int spi_nor_ready(struct spi_nor *nor)
 static int spi_nor_wait_till_ready_with_timeout(struct spi_nor *nor,
 						unsigned long timeout_jiffies)
 {
-	unsigned long deadline;
-	int timeout = 0, ret;
-
-	deadline = jiffies + timeout_jiffies;
-
-	while (!timeout) {
-		if (time_after_eq(jiffies, deadline))
-			timeout = 1;
+	unsigned long deadline = jiffies + timeout_jiffies;
+	unsigned long sleep = nor->ready_sleep;
+	int ret, sleeps = 0;
 
+	while (true) {
 		ret = spi_nor_ready(nor);
 		if (ret < 0)
 			return ret;
-		if (ret)
+		if (ret) {
+			/*
+			 * We want to decrease the polling interval in cases
+			 * where multiple requests finish in a single iteration
+			 * or less. We don't want to do it for single outliers,
+			 * but we give more weight to short transactions.
+			 */
+			if (sleeps < 2)
+				nor->ready_sleep_down += 2;
+			else if (nor->ready_sleep_down > 0)
+				nor->ready_sleep_down--;
+
+			if (nor->ready_sleep_down >= 5) {
+				nor->ready_sleep >>= 1;
+				nor->ready_sleep_down = 0;
+			}
 			return 0;
+		}
+		if (time_after_eq(jiffies, deadline)) {
+			dev_dbg(nor->dev, "flash operation timed out\n");
+			return -ETIMEDOUT;
+		}
 
-		cond_resched();
-	}
-
-	dev_dbg(nor->dev, "flash operation timed out\n");
+		fsleep(sleep);
+		sleeps++;
 
-	return -ETIMEDOUT;
+		/*
+		 * Exponentially backoff the sleep, but hard limit at
+		 * 1ms to avoid responsiveness issues.
+		 */
+		if (sleep < 1000)
+			sleep += (sleep >> 1) + 1;
+	}
 }
 
 /**
@@ -3449,6 +3469,14 @@ int spi_nor_scan(struct spi_nor *nor, const char *name,
 
 	nor->info = info;
 
+	/*
+	 * Pick an initial sleep value that will get downtuned.
+	 * We want this value to be higher than needed for most flashes
+	 * as it will clamp down to the fastest transactions.
+	 */
+	nor->ready_sleep = 127;
+	nor->ready_sleep_down = 0;
+
 	mutex_init(&nor->lock);
 
 	/* Init flash parameters based on flash_info struct and SFDP */
diff --git a/include/linux/mtd/spi-nor.h b/include/linux/mtd/spi-nor.h
index cdcfe0fd2e7d..27bb4243db64 100644
--- a/include/linux/mtd/spi-nor.h
+++ b/include/linux/mtd/spi-nor.h
@@ -406,6 +406,8 @@ struct spi_nor {
 	enum spi_nor_protocol	reg_proto;
 	bool			sst_write_second;
 	u32			flags;
+	unsigned long		ready_sleep;
+	unsigned int		ready_sleep_down;
 	enum spi_nor_cmd_ext	cmd_ext_type;
 	struct sfdp		*sfdp;
 	struct dentry		*debugfs_root;
-- 
2.49.0.1151.ga128411c76-goog
Re: [PATCH RFC] mtd: spi-nor: Sleep between ready checks
Posted by Pratyush Yadav 6 months, 2 weeks ago
Hi William,

On Thu, May 22 2025, William A. Kennington III wrote:

> We have SPI NOR devices that respond very slowly (200ms+) to some
> commands. This causes our CPU to stall while repeatedly polling the NOR
> with no cooldown in between attempts. In order to reduce overhead, this
> patch introduces an exponential backoff in SPI readiness polling, that
> self tunes the polling interval to match the speed of most transactions.
>
> Signed-off-by: William A. Kennington III <william@wkennington.com>

I didn't read the code too deeply, but it seems full of magic values and
I am honestly not a big fan of how it reads. At the very least the logic
should be less black-magic. But since I didn't spend too much time
understanding the code, I don't have specific feedback for you.

BTW, you say it causes CPU stalls. The loop has a cond_resched() after
every iteration which I think gives the scheduler a chance to run other
tasks. Does that not work for some reason?

[...]

-- 
Regards,
Pratyush Yadav