From nobody Tue Dec 16 05:41:15 2025 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3655B1F461A; Tue, 27 May 2025 08:19:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748333954; cv=none; b=ZnXwpyKNHQA4GSineaB62iPk1IUEoOngoqiUih2gl2aGrvwrSUuBiKw/kt+ndNzUUobC3PfaO2Z+kIhV0tEFjjirQCXfc49uUgtFXDap88ecOAdaBWHvtLFUHu015uRe0gozIli9/pzh7hIK/wJZAjiTOln4G6lmrFm/ajEw4Ak= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748333954; c=relaxed/simple; bh=6OT1aY7nCSjt1c4k46yinppRblyAjsUwLOczVANdIkI=; h=From:To:Cc:Subject:Date:Message-Id:MIME-Version; b=I57ZDJHgfgibsLuE652r48c0scnA1BRrKQ7AfDlX12A77WuKFh91tps/FfpMgr3AgaFBbcholXtMs/ZK9K3//TpLuoa9Q88BiH53HMg9GYmCUpqvodUmldOn4GsOIqdMJpC45GwYt817sOfx/Z+TMUh4DUZSfhGJUJiLDS9lZb4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4b65BB2sBGz4f3jXX; Tue, 27 May 2025 16:18:42 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id A26A81A26D3; Tue, 27 May 2025 16:19:07 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgCnC2B4dTVoMj6fNg--.29944S4; Tue, 27 May 2025 16:19:06 +0800 (CST) From: Yu Kuai To: mpatocka@redhat.com, zdenek.kabelac@gmail.com, song@kernel.org, yukuai3@huawei.com Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com, johnny.chenyi@huawei.com Subject: [PATCH v2] md/raid1,raid10: don't handle IO error for REQ_RAHEAD and REQ_NOWAIT Date: Tue, 27 May 2025 16:14:07 +0800 Message-Id: <20250527081407.3004055-1-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgCnC2B4dTVoMj6fNg--.29944S4 X-Coremail-Antispam: 1UD129KBjvJXoWxWw18ArWruFW5KFyrXryDGFg_yoW7Jryxp3 y7Ca9Iv39rGFW5XF1DJFW7Za4Fkw1FgFW3C395J34xZ3WY9rZ8Aa1DJ3yFgrs8ZFWru3W2 vF1vgw4UuayayFUanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUU9Y14x267AKxVW8JVW5JwAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2ocxC64kIII0Yj41l84x0c7CEw4AK67xGY2AK02 1l84ACjcxK6xIIjxv20xvE14v26F1j6w1UM28EF7xvwVC0I7IYx2IY6xkF7I0E14v26r4U JVWxJr1l84ACjcxK6I8E87Iv67AKxVW0oVCq3wA2z4x0Y4vEx4A2jsIEc7CjxVAFwI0_Gc CE3s1le2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4xG64xvF2IEw4CE5I8CrVC2j2WlYx0E 2Ix0cI8IcVAFwI0_Jr0_Jr4lYx0Ex4A2jsIE14v26r1j6r4UMcvjeVCFs4IE7xkEbVWUJV W8JwACjcxG0xvY0x0EwIxGrwACjI8F5VA0II8E6IAqYI8I648v4I1lFIxGxcIEc7CjxVA2 Y2ka0xkIwI1lc7CjxVAaw2AFwI0_Jw0_GFyl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x 0Yz7v_Jr0_Gr1lx2IqxVAqx4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2 zVAF1VAY17CE14v26r1q6r43MIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_Jr0_JF 4lIxAIcVC0I7IYx2IY6xkF7I0E14v26r1j6r4UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWU CwCI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCT nIWIevJa73UjIFyTuYvjfUonmRUUUUU X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai IO with REQ_RAHEAD or REQ_NOWAIT can fail early, even if the storage medium is fine, hence record badblocks or remove the disk from array does not make sense. This problem if found by lvm2 test lvcreate-large-raid, where dm-zero will fail read ahead IO directly. Reported-and-tested-by: Mikulas Patocka Closes: https://lore.kernel.org/all/34fa755d-62c8-4588-8ee1-33cb1249bdf2@re= dhat.com/ Signed-off-by: Yu Kuai --- Changes in v2: - handle REQ_NOWAIT as well. drivers/md/raid1-10.c | 10 ++++++++++ drivers/md/raid1.c | 19 ++++++++++--------- drivers/md/raid10.c | 11 ++++++----- 3 files changed, 26 insertions(+), 14 deletions(-) diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c index c7efd8aab675..b8b3a9069701 100644 --- a/drivers/md/raid1-10.c +++ b/drivers/md/raid1-10.c @@ -293,3 +293,13 @@ static inline bool raid1_should_read_first(struct mdde= v *mddev, =20 return false; } + +/* + * bio with REQ_RAHEAD or REQ_NOWAIT can fail at anytime, before such IO is + * submitted to the underlying disks, hence don't record badblocks or retry + * in this case. + */ +static inline bool raid1_should_handle_error(struct bio *bio) +{ + return !(bio->bi_opf & (REQ_RAHEAD | REQ_NOWAIT)); +} diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 657d481525be..19c5a0ce5a40 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -373,14 +373,16 @@ static void raid1_end_read_request(struct bio *bio) */ update_head_pos(r1_bio->read_disk, r1_bio); =20 - if (uptodate) + if (uptodate) { set_bit(R1BIO_Uptodate, &r1_bio->state); - else if (test_bit(FailFast, &rdev->flags) && - test_bit(R1BIO_FailFast, &r1_bio->state)) + } else if (test_bit(FailFast, &rdev->flags) && + test_bit(R1BIO_FailFast, &r1_bio->state)) { /* This was a fail-fast read so we definitely * want to retry */ ; - else { + } else if (!raid1_should_handle_error(bio)) { + uptodate =3D 1; + } else { /* If all other devices have failed, we want to return * the error upwards rather than fail the last device. * Here we redefine "uptodate" to mean "Don't want to retry" @@ -451,16 +453,15 @@ static void raid1_end_write_request(struct bio *bio) struct bio *to_put =3D NULL; int mirror =3D find_bio_disk(r1_bio, bio); struct md_rdev *rdev =3D conf->mirrors[mirror].rdev; - bool discard_error; sector_t lo =3D r1_bio->sector; sector_t hi =3D r1_bio->sector + r1_bio->sectors; - - discard_error =3D bio->bi_status && bio_op(bio) =3D=3D REQ_OP_DISCARD; + bool ignore_error =3D !raid1_should_handle_error(bio) || + (bio->bi_status && bio_op(bio) =3D=3D REQ_OP_DISCARD); =20 /* * 'one mirror IO has finished' event handler: */ - if (bio->bi_status && !discard_error) { + if (bio->bi_status && !ignore_error) { set_bit(WriteErrorSeen, &rdev->flags); if (!test_and_set_bit(WantReplacement, &rdev->flags)) set_bit(MD_RECOVERY_NEEDED, & @@ -511,7 +512,7 @@ static void raid1_end_write_request(struct bio *bio) =20 /* Maybe we can clear some bad blocks. */ if (rdev_has_badblock(rdev, r1_bio->sector, r1_bio->sectors) && - !discard_error) { + !ignore_error) { r1_bio->bios[mirror] =3D IO_MADE_GOOD; set_bit(R1BIO_MadeGood, &r1_bio->state); } diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index dce06bf65016..b74780af4c22 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -399,6 +399,8 @@ static void raid10_end_read_request(struct bio *bio) * wait for the 'master' bio. */ set_bit(R10BIO_Uptodate, &r10_bio->state); + } else if (!raid1_should_handle_error(bio)) { + uptodate =3D 1; } else { /* If all other devices that store this block have * failed, we want to return the error upwards rather @@ -456,9 +458,8 @@ static void raid10_end_write_request(struct bio *bio) int slot, repl; struct md_rdev *rdev =3D NULL; struct bio *to_put =3D NULL; - bool discard_error; - - discard_error =3D bio->bi_status && bio_op(bio) =3D=3D REQ_OP_DISCARD; + bool ignore_error =3D !raid1_should_handle_error(bio) || + (bio->bi_status && bio_op(bio) =3D=3D REQ_OP_DISCARD); =20 dev =3D find_bio_disk(conf, r10_bio, bio, &slot, &repl); =20 @@ -472,7 +473,7 @@ static void raid10_end_write_request(struct bio *bio) /* * this branch is our 'one mirror IO has finished' event handler: */ - if (bio->bi_status && !discard_error) { + if (bio->bi_status && !ignore_error) { if (repl) /* Never record new bad blocks to replacement, * just fail it. @@ -527,7 +528,7 @@ static void raid10_end_write_request(struct bio *bio) /* Maybe we can clear some bad blocks. */ if (rdev_has_badblock(rdev, r10_bio->devs[slot].addr, r10_bio->sectors) && - !discard_error) { + !ignore_error) { bio_put(bio); if (repl) r10_bio->devs[slot].repl_bio =3D IO_MADE_GOOD; --=20 2.39.2