From: Cao Guanghui <caoguanghui@kylinos.cn>
Replace the FIXME in end_reshape_write(). Instead of failing the device
immediately on a write error during reshape, try to record a bad block
first. Reshape writes target the new layout, so the range is recorded
relative to new_data_offset (is_new=1).

rdev_set_badblocks() returns true on success. On failure (e.g., the
badblocks table is full), it has already called md_error() internally to
fail the device, so no further action is needed. When the bad block is
recorded, set WantReplacement on member devices and flag
MD_RECOVERY_NEEDED so a replacement is queued. Skip this for replacement
devices to avoid replacement loops.

On a successful write, clear any stale badblock records at the new
location, since the data there is now valid.

Signed-off-by: Cao Guanghui <caoguanghui@kylinos.cn>
---
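Note for reviewers: the write-completion paths described above can be
sketched as a small userspace model. This is an illustration only and
not kernel code. Every name below (struct model_rdev, the MODEL_* flags,
model_set_badblocks(), model_clear_badblocks(),
model_end_reshape_write()) is hypothetical, and MD_RECOVERY_NEEDED,
which lives in mddev->recovery in the real driver, is folded into the
per-device flags for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum model_flag {
	MODEL_WRITE_ERROR_SEEN = 1u << 0,
	MODEL_WANT_REPLACEMENT = 1u << 1,
	MODEL_FAULTY           = 1u << 2,
	MODEL_RECOVERY_NEEDED  = 1u << 3, /* really an mddev-level bit */
};

struct model_rdev {
	unsigned int flags;
	bool table_full; /* assumed: the bad-block table has no free slot */
	bool range_bad;  /* assumed: the written range is marked bad */
};

/*
 * Stand-in for rdev_set_badblocks(): returns true on success. On
 * failure it marks the device faulty itself, mirroring the md_error()
 * call described in the commit message.
 */
static bool model_set_badblocks(struct model_rdev *rdev)
{
	if (rdev->table_full) {
		rdev->flags |= MODEL_FAULTY;
		return false;
	}
	rdev->range_bad = true;
	return true;
}

/* Stand-in for rdev_clear_badblocks(): drop the record for the range. */
static void model_clear_badblocks(struct model_rdev *rdev)
{
	rdev->range_bad = false;
}

/* Models the decision flow of the patched end_reshape_write(). */
static void model_end_reshape_write(struct model_rdev *rdev,
				    bool is_replacement, bool write_failed)
{
	assert(rdev);

	if (!write_failed) {
		/* Data at the new location is valid again. */
		model_clear_badblocks(rdev);
		return;
	}

	rdev->flags |= MODEL_WRITE_ERROR_SEEN;

	/* Only a member device with a recorded bad block asks for a spare. */
	if (model_set_badblocks(rdev) && !is_replacement) {
		if (!(rdev->flags & MODEL_WANT_REPLACEMENT))
			rdev->flags |= MODEL_WANT_REPLACEMENT |
				       MODEL_RECOVERY_NEEDED;
	}
}
```

Calling model_end_reshape_write() with each combination of
is_replacement and write_failed, with table_full set or clear, walks
through the four outcomes: bad block recorded and replacement queued,
bad block recorded only, device failed, and stale record cleared.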
 drivers/md/raid10.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 4901ebe45c87..08d58a1c680e 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -4991,9 +4991,30 @@ static void end_reshape_write(struct bio *bio)
 		      conf->mirrors[d].rdev;
 
 	if (bio->bi_status) {
-		/* FIXME should record badblock */
-		md_error(mddev, rdev);
-	}
+		set_bit(WriteErrorSeen, &rdev->flags);
+
+		/* rdev_set_badblocks returns true on success.
+		 * On failure, it has already called md_error() internally.
+		 * Use is_new=1 as reshape writes target the new layout
+		 * (new_data_offset).
+		 */
+		if (rdev_set_badblocks(rdev, r10_bio->devs[slot].addr,
+				       r10_bio->sectors, 1)) {
+			/* Queue async replacement for member devices
+			 * For replacement devices, do not trigger WantReplacement
+			 * to avoid circular replacement storms.
+			 */
+			if (!repl) {
+				if (!test_and_set_bit(WantReplacement, &rdev->flags))
+					set_bit(MD_RECOVERY_NEEDED,
+						&rdev->mddev->recovery);
+			}
+		}
+	} else {
+		/* Write succeeded, clear stale badblock records */
+		rdev_clear_badblocks(rdev, r10_bio->devs[slot].addr,
+				     r10_bio->sectors, 1);
+	}
 
 	rdev_dec_pending(rdev, mddev);
 	end_reshape_request(r10_bio);
--
2.25.1