From: Yu Kuai
To: Song Liu, Yu Kuai
Cc: Li Nan, Xiao Ni, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 1/3] md/raid5: account discard IO
Date: Fri, 5 Jun 2026 15:26:37 +0800
Message-ID: <20260605072639.2434847-2-yukuai@kernel.org>
In-Reply-To: <20260605072639.2434847-1-yukuai@kernel.org>
References: <20260605072639.2434847-1-yukuai@kernel.org>

From: Yu Kuai

Raid5 handles discard bios internally through make_discard_request() and
never passes them through md_account_bio(). As a result, discard IO is
missing the md-device iostat accounting that normal raid5 IO and discard
IO in other raid levels get from md_account_bio().

Before accounting the bio, trim the request to the full data stripes that
raid5 will actually discard. The first full stripe is the ceiling of the
bio start divided by data-stripe sectors, and the last full stripe is the
floor of the bio end divided by data-stripe sectors. Account that exact MD
logical full-stripe range, then restore the original iterator so bio
completion and iostat still cover the original request.

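For anyone who wants to check the arithmetic, here is a small userspace
sketch of the trimming described above. It is not the kernel code: sector_t
is modelled as uint64_t, and struct stripe_range, trim_to_full_stripes(),
accounted_start() and accounted_bytes() are hypothetical names used only for
illustration. stripe_sectors is assumed to be chunk_sectors times the number
of data disks, as in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's sector_t (512-byte units). */
typedef uint64_t sector_t;

/* Illustrative result type: a range of full data stripes. */
struct stripe_range {
	sector_t first_stripe;	/* first full stripe, inclusive */
	sector_t last_stripe;	/* end of the range, exclusive */
	bool empty;		/* true when no full stripe is covered */
};

/*
 * Trim the logical range [start, end) to whole data stripes.
 * The start is rounded up (ceiling), the end is rounded down (floor).
 */
static struct stripe_range trim_to_full_stripes(sector_t start, sector_t end,
						sector_t stripe_sectors)
{
	struct stripe_range r;

	assert(stripe_sectors > 0);
	assert(start <= end);

	r.first_stripe = start / stripe_sectors +
			 (start % stripe_sectors != 0);
	r.last_stripe = end / stripe_sectors;
	r.empty = r.first_stripe >= r.last_stripe;
	return r;
}

/* First logical sector of the range that would be accounted. */
static sector_t accounted_start(struct stripe_range r, sector_t stripe_sectors)
{
	return r.first_stripe * stripe_sectors;
}

/* Size in bytes of the accounted range; 0 when no full stripe is covered. */
static uint64_t accounted_bytes(struct stripe_range r, sector_t stripe_sectors)
{
	if (r.empty)
		return 0;
	return ((r.last_stripe - r.first_stripe) * stripe_sectors) << 9;
}
```

With a 1024-sector chunk and three data disks (stripe_sectors = 3072), a
discard of sectors [1000, 10000) trims to stripes 1 and 2: the accounted
range starts at sector 3072 and is 6144 sectors long. A discard of
[100, 3000) covers no full stripe and is reported as empty.
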
Signed-off-by: Yu Kuai
---
 drivers/md/raid5.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 65ae7d8930fc..debf35342ae0 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -5688,34 +5688,47 @@ static void release_stripe_plug(struct mddev *mddev,
 
 static void make_discard_request(struct mddev *mddev, struct bio *bi)
 {
 	struct r5conf *conf = mddev->private;
 	sector_t logical_sector, last_sector;
+	sector_t first_stripe, last_stripe;
 	struct stripe_head *sh;
+	struct bvec_iter bi_iter;
+	struct bio *orig_bi = bi;
 	int stripe_sectors;
 
 	/* We need to handle this when io_uring supports discard/trim */
 	if (WARN_ON_ONCE(bi->bi_opf & REQ_NOWAIT))
 		return;
 
 	if (mddev->reshape_position != MaxSector)
 		/* Skip discard while reshape is happening */
 		return;
 
-	logical_sector = bi->bi_iter.bi_sector & ~((sector_t)RAID5_STRIPE_SECTORS(conf)-1);
-	last_sector = bio_end_sector(bi);
-
-	bi->bi_next = NULL;
-
 	stripe_sectors = conf->chunk_sectors *
 		(conf->raid_disks - conf->max_degraded);
-	logical_sector = DIV_ROUND_UP_SECTOR_T(logical_sector,
-					       stripe_sectors);
-	sector_div(last_sector, stripe_sectors);
+	first_stripe = DIV_ROUND_UP_SECTOR_T(bi->bi_iter.bi_sector,
+					     stripe_sectors);
+	last_stripe = bio_end_sector(bi);
+	sector_div(last_stripe, stripe_sectors);
+
+	if (first_stripe >= last_stripe) {
+		bio_endio(bi);
+		return;
+	}
+
+	bi_iter = bi->bi_iter;
+	bi->bi_iter.bi_sector = first_stripe * stripe_sectors;
+	bi->bi_iter.bi_size = ((last_stripe - first_stripe) *
+			       stripe_sectors) << 9;
+	md_account_bio(mddev, &bi);
+	orig_bi->bi_iter = bi_iter;
+	bi->bi_iter = bi_iter;
+	bi->bi_next = NULL;
 
-	logical_sector *= conf->chunk_sectors;
-	last_sector *= conf->chunk_sectors;
+	logical_sector = first_stripe * conf->chunk_sectors;
+	last_sector = last_stripe * conf->chunk_sectors;
 
 	for (; logical_sector < last_sector;
 	     logical_sector += RAID5_STRIPE_SECTORS(conf)) {
 		DEFINE_WAIT(w);
 		int d;
-- 
2.51.0

From: Yu Kuai
To: Song Liu, Yu Kuai
Cc: Li Nan, Xiao Ni, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 2/3] md/raid5: validate discard support at request time
Date: Fri, 5 Jun 2026 15:26:38 +0800
Message-ID: <20260605072639.2434847-3-yukuai@kernel.org>
In-Reply-To: <20260605072639.2434847-1-yukuai@kernel.org>
References: <20260605072639.2434847-1-yukuai@kernel.org>

From: Yu Kuai

Raid5 used to disable discard limits when devices_handle_discard_safely
was not set or when stacked member limits could not support a full-stripe
discard. That hides discard from userspace before raid5 can decide whether
a request can be handled safely.

Follow other virtual drivers and advertise a UINT_MAX discard limit for
the md device. Cache lower discard support in r5conf when setting queue
limits, and reject unsupported discard bios before queuing stripe work.

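The split between configuration time and request time can be modelled in
userspace roughly as follows. This is only a sketch under stated
assumptions: struct model_limits, struct model_conf, model_set_limits() and
model_check_discard() are hypothetical names, the stripe size is assumed to
be in bytes (the diff shifts it right by 9 before comparing against a
sector count), and MODEL_NOTSUPP merely stands in for BLK_STS_NOTSUPP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the stacked member limits. */
struct model_limits {
	uint32_t max_discard_sectors;	/* in 512-byte sectors */
	uint32_t discard_granularity;	/* in bytes */
};

/* Hypothetical stand-in for the flag cached in r5conf. */
struct model_conf {
	bool discard_unsupported;
};

enum model_status { MODEL_OK, MODEL_NOTSUPP };

/* Configuration time: remember whether member discard is usable. */
static void model_set_limits(struct model_conf *conf,
			     const struct model_limits *lim,
			     bool handle_discard_safely,
			     uint32_t stripe_bytes)
{
	assert(stripe_bytes > 0);

	conf->discard_unsupported =
		!handle_discard_safely ||
		lim->max_discard_sectors < (stripe_bytes >> 9) ||
		lim->discard_granularity < stripe_bytes;
}

/* Request time: decide before any stripe work would be queued. */
static enum model_status model_check_discard(const struct model_conf *conf)
{
	return conf->discard_unsupported ? MODEL_NOTSUPP : MODEL_OK;
}
```

The point of the model is that the advertised limit no longer depends on
the member devices; only the cached flag does, so userspace can still
submit a discard and gets an explicit not-supported status back when the
members cannot handle it.
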
Signed-off-by: Yu Kuai
---
 drivers/md/raid5.c | 34 +++++++++++++++++++---------------
 drivers/md/raid5.h |  1 +
 2 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index debf35342ae0..76e736ee48d3 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -1132,10 +1132,22 @@ static void defer_issue_bios(struct r5conf *conf, sector_t sector,
 	spin_unlock(&conf->pending_bios_lock);
 
 	dispatch_bio_list(&tmp);
 }
 
+static bool raid5_discard_limits(struct mddev *mddev, struct bio *bi)
+{
+	struct r5conf *conf = mddev->private;
+
+	if (!conf->raid5_discard_unsupported)
+		return true;
+
+	bi->bi_status = BLK_STS_NOTSUPP;
+	bio_endio(bi);
+	return false;
+}
+
 static void
 raid5_end_read_request(struct bio *bi);
 static void
 raid5_end_write_request(struct bio *bi);
 
@@ -5702,10 +5714,13 @@ static void make_discard_request(struct mddev *mddev, struct bio *bi)
 
 	if (mddev->reshape_position != MaxSector)
 		/* Skip discard while reshape is happening */
 		return;
 
+	if (!raid5_discard_limits(mddev, bi))
+		return;
+
 	stripe_sectors = conf->chunk_sectors *
 		(conf->raid_disks - conf->max_degraded);
 	first_stripe = DIV_ROUND_UP_SECTOR_T(bi->bi_iter.bi_sector,
 					     stripe_sectors);
 	last_stripe = bio_end_sector(bi);
@@ -7815,36 +7830,25 @@ static int raid5_set_limits(struct mddev *mddev)
 	mddev_stack_rdev_limits(mddev, &lim, 0);
 	rdev_for_each(rdev, mddev)
 		queue_limits_stack_bdev(&lim, rdev->bdev, rdev->new_data_offset,
 				mddev->gendisk->disk_name);
 
-	/*
-	 * Zeroing is required for discard, otherwise data could be lost.
-	 *
-	 * Consider a scenario: discard a stripe (the stripe could be
-	 * inconsistent if discard_zeroes_data is 0); write one disk of the
-	 * stripe (the stripe could be inconsistent again depending on which
-	 * disks are used to calculate parity); the disk is broken; The stripe
-	 * data of this disk is lost.
-	 *
-	 * We only allow DISCARD if the sysadmin has confirmed that only safe
-	 * devices are in use by setting a module parameter. A better idea
-	 * might be to turn DISCARD into WRITE_ZEROES requests, as that is
-	 * required to be safe.
-	 */
 	if (!devices_handle_discard_safely ||
 	    lim.max_discard_sectors < (stripe >> 9) ||
 	    lim.discard_granularity < stripe)
-		lim.max_hw_discard_sectors = 0;
+		conf->raid5_discard_unsupported = true;
+	else
+		conf->raid5_discard_unsupported = false;
 
 	/*
 	 * Requests require having a bitmap for each stripe.
 	 * Limit the max sectors based on this.
 	 */
 	lim.max_hw_sectors = RAID5_MAX_REQ_STRIPES << RAID5_STRIPE_SHIFT(conf);
 	if ((lim.max_hw_sectors << 9) < lim.io_opt)
 		lim.max_hw_sectors = lim.io_opt >> 9;
+	lim.max_hw_discard_sectors = UINT_MAX;
 
 	/* No restrictions on the number of segments in the request */
 	lim.max_segments = USHRT_MAX;
 
 	return queue_limits_set(mddev->gendisk->queue, &lim);
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index 1c7b710fc9c1..ba06cf88aa24 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -687,10 +687,11 @@ struct r5conf {
 	struct r5pending_data *pending_data;
 	struct list_head free_list;
 	struct list_head pending_list;
 	int pending_data_cnt;
 	struct r5pending_data *next_pending_data;
+	bool raid5_discard_unsupported;
 
 	mempool_t *ctx_pool;
 	int ctx_size;
 };
 
-- 
2.51.0

From: Yu Kuai
To: Song Liu, Yu Kuai
Cc: Li Nan, Xiao Ni, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 3/3] md/raid5: always convert llbitmap bits for discard
Date: Fri, 5 Jun 2026 15:26:39 +0800
Message-ID: <20260605072639.2434847-4-yukuai@kernel.org>
In-Reply-To: <20260605072639.2434847-1-yukuai@kernel.org>
References: <20260605072639.2434847-1-yukuai@kernel.org>

From: Yu Kuai

llbitmap discard is useful even when no underlying member device supports
it. The discard still converts the llbitmap range to unwritten, so later
reads and recovery do not rely on stale parity for that range.

Let llbitmap discard bypass the raid5 lower discard support check. If
lower discard is not safe or not supported, complete the accounted clone
after md_account_bio() so the llbitmap conversion callbacks run without
member discard bios.

Signed-off-by: Yu Kuai
---
 drivers/md/raid5.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 76e736ee48d3..180ff0660b6a 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -1136,10 +1136,13 @@ static void defer_issue_bios(struct r5conf *conf, sector_t sector,
 
 static bool raid5_discard_limits(struct mddev *mddev, struct bio *bi)
 {
 	struct r5conf *conf = mddev->private;
 
+	if (mddev->bitmap_id == ID_LLBITMAP)
+		return true;
+
 	if (!conf->raid5_discard_unsupported)
 		return true;
 
 	bi->bi_status = BLK_STS_NOTSUPP;
 	bio_endio(bi);
@@ -5738,10 +5741,16 @@ static void make_discard_request(struct mddev *mddev, struct bio *bi)
 	md_account_bio(mddev, &bi);
 	orig_bi->bi_iter = bi_iter;
 	bi->bi_iter = bi_iter;
 	bi->bi_next = NULL;
 
+	if (mddev->bitmap_id == ID_LLBITMAP &&
+	    conf->raid5_discard_unsupported) {
+		bio_endio(bi);
+		return;
+	}
+
 	logical_sector = first_stripe * conf->chunk_sectors;
 	last_sector = last_stripe * conf->chunk_sectors;
 
 	for (; logical_sector < last_sector;
 	     logical_sector += RAID5_STRIPE_SECTORS(conf)) {
-- 
2.51.0
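
Taken together with patch 2, the discard path ends up with three possible
outcomes once the reshape and empty-range checks have passed. The sketch
below restates that decision as a pure function so the combinations are
easy to check. enum discard_action and classify_discard() are hypothetical
names, not kernel symbols; the two booleans stand for
"mddev->bitmap_id == ID_LLBITMAP" and "conf->raid5_discard_unsupported".

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative outcomes of a raid5 discard request. */
enum discard_action {
	DISCARD_REJECT,		/* end the bio with a not-supported status */
	DISCARD_BITMAP_ONLY,	/* account, then complete with no member discards */
	DISCARD_ISSUE,		/* account and queue full-stripe member discards */
};

/*
 * Decide what happens to a discard, given whether the array uses
 * llbitmap and whether member discard was flagged as unsupported
 * when the queue limits were set.
 */
static enum discard_action classify_discard(bool uses_llbitmap,
					    bool lower_unsupported)
{
	if (!lower_unsupported)
		return DISCARD_ISSUE;

	/* Members cannot discard: only llbitmap still benefits. */
	return uses_llbitmap ? DISCARD_BITMAP_ONLY : DISCARD_REJECT;
}
```

Only the combination of llbitmap with unsupported member discard takes the
new early-completion path; an array without llbitmap behaves as it did
after patch 2.
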