From nobody Wed Dec 17 08:55:55 2025
From: Yu Kuai
Subject: [PATCH RFC md-6.16 v3 01/19] md/md-bitmap: add {start, end}_discard in bitmap_operations
Date: Mon, 12 May 2025 09:19:09 +0800
Message-Id: <20250512011927.2809400-2-yukuai1@huaweicloud.com>
In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com>

From: Yu Kuai

Prepare to support discard in the new md bitmap.

Signed-off-by: Yu Kuai
Reviewed-by: Christoph Hellwig
---
 drivers/md/md-bitmap.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h
index d3d50629af91..c3fc051c88e9 100644
--- a/drivers/md/md-bitmap.h
+++ b/drivers/md/md-bitmap.h
@@ -94,6 +94,11 @@ struct bitmap_operations {
 			  unsigned long sectors);
 	void (*endwrite)(struct mddev *mddev, sector_t offset,
 			 unsigned long sectors);
+	int (*start_discard)(struct mddev *mddev, sector_t offset,
+			     unsigned long sectors);
+	void (*end_discard)(struct mddev *mddev, sector_t offset,
+			    unsigned long sectors);
+
 	bool (*start_sync)(struct mddev *mddev, sector_t offset,
 			   sector_t *blocks, bool degraded);
 	void (*end_sync)(struct mddev *mddev, sector_t offset, sector_t *blocks);
-- 
2.39.2

From nobody Wed Dec 17 08:55:55 2025
From: Yu Kuai
Subject: [PATCH RFC md-6.16 v3 02/19] md: support discard for bitmap ops
Date: Mon, 12 May 2025 09:19:10 +0800
Message-Id: <20250512011927.2809400-3-yukuai1@huaweicloud.com>
In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com>

From: Yu Kuai

If {start, end}_discard are implemented in bitmap_operations, they will
be used to handle discard IO. Currently md-bitmap handles discard the
same way as a normal write; this prepares for discard support in the new
md bitmap.
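As a minimal sketch (hypothetical backend, not part of this patch): a
bitmap implementation opts in to discard handling by filling in the two
new callbacks from the previous patch, while backends that leave them
NULL keep the current behaviour of treating discard like a normal write:

  static int demo_start_discard(struct mddev *mddev, sector_t offset,
                                unsigned long sectors)
  {
          /* e.g. account the discarded range without treating it as dirty data */
          return 0;
  }

  static void demo_end_discard(struct mddev *mddev, sector_t offset,
                               unsigned long sectors)
  {
  }

  static struct bitmap_operations demo_bitmap_ops = {
          /* ... existing callbacks ... */
          .start_discard  = demo_start_discard,
          .end_discard    = demo_end_discard,
  };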
Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 19 +++++++++++++++----
 drivers/md/md.h |  1 +
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 32b997dfe6f4..c72c13ed4253 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8849,14 +8849,24 @@ static void md_bitmap_start(struct mddev *mddev,
 		mddev->pers->bitmap_sector(mddev, &md_io_clone->offset,
 					   &md_io_clone->sectors);
 
-	mddev->bitmap_ops->startwrite(mddev, md_io_clone->offset,
-				      md_io_clone->sectors);
+	if (unlikely(md_io_clone->rw == STAT_DISCARD) &&
+	    mddev->bitmap_ops->start_discard)
+		mddev->bitmap_ops->start_discard(mddev, md_io_clone->offset,
+						 md_io_clone->sectors);
+	else
+		mddev->bitmap_ops->startwrite(mddev, md_io_clone->offset,
+					      md_io_clone->sectors);
 }
 
 static void md_bitmap_end(struct mddev *mddev, struct md_io_clone *md_io_clone)
 {
-	mddev->bitmap_ops->endwrite(mddev, md_io_clone->offset,
-				    md_io_clone->sectors);
+	if (unlikely(md_io_clone->rw == STAT_DISCARD) &&
+	    mddev->bitmap_ops->end_discard)
+		mddev->bitmap_ops->end_discard(mddev, md_io_clone->offset,
+					       md_io_clone->sectors);
+	else
+		mddev->bitmap_ops->endwrite(mddev, md_io_clone->offset,
+					    md_io_clone->sectors);
 }
 
 static void md_end_clone_io(struct bio *bio)
@@ -8895,6 +8905,7 @@ static void md_clone_bio(struct mddev *mddev, struct bio **bio)
 	if (bio_data_dir(*bio) == WRITE && md_bitmap_enabled(mddev)) {
 		md_io_clone->offset = (*bio)->bi_iter.bi_sector;
 		md_io_clone->sectors = bio_sectors(*bio);
+		md_io_clone->rw = op_stat_group(bio_op(*bio));
 		md_bitmap_start(mddev, md_io_clone);
 	}
 
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 6eb5dfdf2f55..c474bf74c345 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -850,6 +850,7 @@ struct md_io_clone {
 	unsigned long	start_time;
 	sector_t	offset;
 	unsigned long	sectors;
+	enum stat_group	rw;
 	struct bio	bio_clone;
 };
 
-- 
2.39.2

From nobody Wed Dec 17 08:55:55 2025
From: Yu Kuai
Subject: [PATCH RFC md-6.16 v3 03/19] md/md-bitmap: remove parameter slot from bitmap_create()
Date: Mon, 12 May 2025 09:19:11 +0800
Message-Id: <20250512011927.2809400-4-yukuai1@huaweicloud.com>
In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com>

From: Yu Kuai

All callers pass in '-1' for 'slot', hence it can be removed.
Signed-off-by: Yu Kuai
Reviewed-by: Christoph Hellwig
---
 drivers/md/md-bitmap.c | 6 +++---
 drivers/md/md-bitmap.h | 2 +-
 drivers/md/md.c        | 6 +++---
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 56a1430a398e..4c5067783b7a 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -2183,9 +2183,9 @@ static struct bitmap *__bitmap_create(struct mddev *mddev, int slot)
 	return ERR_PTR(err);
 }
 
-static int bitmap_create(struct mddev *mddev, int slot)
+static int bitmap_create(struct mddev *mddev)
 {
-	struct bitmap *bitmap = __bitmap_create(mddev, slot);
+	struct bitmap *bitmap = __bitmap_create(mddev, -1);
 
 	if (IS_ERR(bitmap))
 		return PTR_ERR(bitmap);
@@ -2647,7 +2647,7 @@ location_store(struct mddev *mddev, const char *buf, size_t len)
 		}
 
 		mddev->bitmap_info.offset = offset;
-		rv = bitmap_create(mddev, -1);
+		rv = bitmap_create(mddev);
 		if (rv)
 			goto out;
 
diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h
index c3fc051c88e9..41d09c6d0c14 100644
--- a/drivers/md/md-bitmap.h
+++ b/drivers/md/md-bitmap.h
@@ -74,7 +74,7 @@ struct bitmap_operations {
 	struct md_submodule_head head;
 
 	bool (*enabled)(void *data);
-	int (*create)(struct mddev *mddev, int slot);
+	int (*create)(struct mddev *mddev);
 	int (*resize)(struct mddev *mddev, sector_t blocks, int chunksize);
 
 	int (*load)(struct mddev *mddev);
diff --git a/drivers/md/md.c b/drivers/md/md.c
index c72c13ed4253..dc2b2b274677 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6255,7 +6255,7 @@ int md_run(struct mddev *mddev)
 	}
 	if (err == 0 && pers->sync_request && md_bitmap_registered(mddev) &&
 	    (mddev->bitmap_info.file || mddev->bitmap_info.offset)) {
-		err = mddev->bitmap_ops->create(mddev, -1);
+		err = mddev->bitmap_ops->create(mddev);
 		if (err)
 			pr_warn("%s: failed to create bitmap (%d)\n",
 				mdname(mddev), err);
@@ -7324,7 +7324,7 @@ static int set_bitmap_file(struct mddev *mddev, int fd)
 	err = 0;
 	if (mddev->pers) {
 		if (fd >= 0) {
-			err = mddev->bitmap_ops->create(mddev, -1);
+			err = mddev->bitmap_ops->create(mddev);
 			if (!err)
 				err = mddev->bitmap_ops->load(mddev);
 
@@ -7648,7 +7648,7 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
 					mddev->bitmap_info.default_offset;
 				mddev->bitmap_info.space =
 					mddev->bitmap_info.default_space;
-				rv = mddev->bitmap_ops->create(mddev, -1);
+				rv = mddev->bitmap_ops->create(mddev);
 				if (!rv)
 					rv = mddev->bitmap_ops->load(mddev);
 
-- 
2.39.2

From nobody Wed Dec 17 08:55:55 2025
From: Yu Kuai
Subject: [PATCH RFC md-6.16 v3 04/19] md: add a new sysfs api bitmap_version
Date: Mon, 12 May 2025 09:19:12 +0800
Message-Id: <20250512011927.2809400-5-yukuai1@huaweicloud.com>
In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com>

From: Yu Kuai

This API will be used by mdadm to set the bitmap version while creating
or assembling an array, in preparation for adding a new bitmap. The
currently available options are:

  cat /sys/block/md0/md/bitmap_version
  none [bitmap]
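A usage sketch (md0 is only an example device, and the listed values
depend on which bitmap backends are registered; only "none" and the
existing "bitmap" exist at this point in the series):

  # pick a backend before the bitmap is created
  echo bitmap > /sys/block/md0/md/bitmap_version
  # or run the array without any bitmap
  echo none > /sys/block/md0/md/bitmap_version

Writing a registered bitmap name (or its numeric submodule id) selects
it; once bitmap_ops is already set up the write fails with -EBUSY, and
unknown names return -ENOENT.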
Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 87 ++++++++++++++++++++++++++++++++++++++++++++++---
 drivers/md/md.h |  2 ++
 2 files changed, 84 insertions(+), 5 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index dc2b2b274677..e16d3b4033d5 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -672,13 +672,13 @@ static void active_io_release(struct percpu_ref *ref)
 
 static void no_op(struct percpu_ref *r) {}
 
-static void mddev_set_bitmap_ops(struct mddev *mddev, enum md_submodule_id id)
+static void mddev_set_bitmap_ops(struct mddev *mddev)
 {
 	xa_lock(&md_submodule);
-	mddev->bitmap_ops = xa_load(&md_submodule, id);
+	mddev->bitmap_ops = xa_load(&md_submodule, mddev->bitmap_id);
 	xa_unlock(&md_submodule);
 	if (!mddev->bitmap_ops)
-		pr_warn_once("md: can't find bitmap id %d\n", id);
+		pr_warn_once("md: can't find bitmap id %d\n", mddev->bitmap_id);
 }
 
 static void mddev_clear_bitmap_ops(struct mddev *mddev)
@@ -688,8 +688,8 @@ static void mddev_clear_bitmap_ops(struct mddev *mddev)
 
 int mddev_init(struct mddev *mddev)
 {
-	/* TODO: support more versions */
-	mddev_set_bitmap_ops(mddev, ID_BITMAP);
+	mddev->bitmap_id = ID_BITMAP;
+	mddev_set_bitmap_ops(mddev);
 
 	if (percpu_ref_init(&mddev->active_io, active_io_release,
 			    PERCPU_REF_ALLOW_REINIT, GFP_KERNEL)) {
@@ -4155,6 +4155,82 @@ new_level_store(struct mddev *mddev, const char *buf, size_t len)
 static struct md_sysfs_entry md_new_level =
 __ATTR(new_level, 0664, new_level_show, new_level_store);
 
+static ssize_t
+bitmap_version_show(struct mddev *mddev, char *page)
+{
+	struct md_submodule_head *head;
+	unsigned long i;
+	ssize_t len = 0;
+
+	if (mddev->bitmap_id == ID_BITMAP_NONE)
+		len += sprintf(page + len, "[none] ");
+	else
+		len += sprintf(page + len, "none ");
+
+	xa_lock(&md_submodule);
+	xa_for_each(&md_submodule, i, head) {
+		if (head->type != MD_BITMAP)
+			continue;
+
+		if (mddev->bitmap_id == head->id)
+			len += sprintf(page + len, "[%s] ", head->name);
+		else
+			len += sprintf(page + len, "%s ", head->name);
+	}
+	xa_unlock(&md_submodule);
+
+	len += sprintf(page + len, "\n");
+	return len;
+}
+
+static ssize_t
+bitmap_version_store(struct mddev *mddev, const char *buf, size_t len)
+{
+	struct md_submodule_head *head;
+	enum md_submodule_id id;
+	unsigned long i;
+	int err;
+
+	if (mddev->bitmap_ops)
+		return -EBUSY;
+
+	err = kstrtoint(buf, 10, &id);
+	if (!err) {
+		if (id == ID_BITMAP_NONE) {
+			mddev->bitmap_id = id;
+			return len;
+		}
+
+		xa_lock(&md_submodule);
+		head = xa_load(&md_submodule, id);
+		xa_unlock(&md_submodule);
+
+		if (head && head->type == MD_BITMAP) {
+			mddev->bitmap_id = id;
+			return len;
+		}
+	}
+
+	if (cmd_match(buf, "none")) {
+		mddev->bitmap_id = ID_BITMAP_NONE;
+		return len;
+	}
+
+	xa_lock(&md_submodule);
+	xa_for_each(&md_submodule, i, head) {
+		if (head->type == MD_BITMAP && cmd_match(buf, head->name)) {
+			mddev->bitmap_id = head->id;
+			xa_unlock(&md_submodule);
+			return len;
+		}
+	}
+	xa_unlock(&md_submodule);
+	return -ENOENT;
+}
+
+static struct md_sysfs_entry md_bitmap_version =
+__ATTR(bitmap_version, 0664, bitmap_version_show, bitmap_version_store);
+
 static ssize_t
 layout_show(struct mddev *mddev, char *page)
 {
@@ -5719,6 +5795,7 @@ __ATTR(serialize_policy, S_IRUGO | S_IWUSR, serialize_policy_show,
 static struct attribute *md_default_attrs[] = {
 	&md_level.attr,
 	&md_new_level.attr,
+	&md_bitmap_version.attr,
 	&md_layout.attr,
 	&md_raid_disks.attr,
 	&md_uuid.attr,
diff --git a/drivers/md/md.h b/drivers/md/md.h
index c474bf74c345..135d95ba1ebb 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -40,6 +40,7 @@ enum md_submodule_id {
 	ID_CLUSTER,
 	ID_BITMAP,
 	ID_LLBITMAP,	/* TODO */
+	ID_BITMAP_NONE,
 };
 
 struct md_submodule_head {
@@ -565,6 +566,7 @@ struct mddev {
 	struct percpu_ref		writes_pending;
 	int				sync_checkers;	/* # of threads checking writes_pending */
 
+	enum md_submodule_id		bitmap_id;
 	void				*bitmap; /* the bitmap for the device */
 	struct bitmap_operations	*bitmap_ops;
 	struct {
-- 
2.39.2

From nobody Wed Dec 17 08:55:55 2025
From: Yu Kuai
Subject: [PATCH RFC md-6.16 v3 05/19] md: delay registration of bitmap_ops until creating bitmap
Date: Mon, 12 May 2025 09:19:13 +0800
Message-Id: <20250512011927.2809400-6-yukuai1@huaweicloud.com>
In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com>

From: Yu Kuai

Currently bitmap_ops is registered while allocating the mddev. This is
fine while there is only one bitmap_ops; however, once a new bitmap_ops
is introduced, user space needs a time window to choose which bitmap_ops
to use while creating a new array.

Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 85 ++++++++++++++++++++++++++++++++-----------------
 1 file changed, 56 insertions(+), 29 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index e16d3b4033d5..ba2b981b017c 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -674,32 +674,47 @@ static void no_op(struct percpu_ref *r) {}
 
 static void mddev_set_bitmap_ops(struct mddev *mddev)
 {
+	struct bitmap_operations *old = mddev->bitmap_ops;
+	struct md_submodule_head *head;
+
+	if (mddev->bitmap_id == ID_BITMAP_NONE ||
+	    (old && old->head.id == mddev->bitmap_id))
+		return;
+
 	xa_lock(&md_submodule);
-	mddev->bitmap_ops = xa_load(&md_submodule, mddev->bitmap_id);
+	head = xa_load(&md_submodule, mddev->bitmap_id);
 	xa_unlock(&md_submodule);
-	if (!mddev->bitmap_ops)
-		pr_warn_once("md: can't find bitmap id %d\n", mddev->bitmap_id);
+
+	if (WARN_ON_ONCE(!head || head->type != MD_BITMAP)) {
+		pr_err("md: can't find bitmap id %d\n", mddev->bitmap_id);
+		return;
+	}
+
+	if (old && old->group)
+		sysfs_remove_group(&mddev->kobj, old->group);
+
+	mddev->bitmap_ops = (void *)head;
+	if (mddev->bitmap_ops && mddev->bitmap_ops->group &&
+	    sysfs_create_group(&mddev->kobj, mddev->bitmap_ops->group))
+		pr_warn("md: cannot register extra bitmap attributes for %s\n",
+			mdname(mddev));
 }
 
 static void mddev_clear_bitmap_ops(struct mddev *mddev)
 {
+	if (mddev->bitmap_ops && mddev->bitmap_ops->group)
+		sysfs_remove_group(&mddev->kobj, mddev->bitmap_ops->group);
 	mddev->bitmap_ops = NULL;
 }
 
 int mddev_init(struct mddev *mddev)
 {
-	mddev->bitmap_id = ID_BITMAP;
-	mddev_set_bitmap_ops(mddev);
-
 	if (percpu_ref_init(&mddev->active_io, active_io_release,
-			    PERCPU_REF_ALLOW_REINIT, GFP_KERNEL)) {
-		mddev_clear_bitmap_ops(mddev);
+			    PERCPU_REF_ALLOW_REINIT, GFP_KERNEL))
 		return -ENOMEM;
-	}
 
 	if (percpu_ref_init(&mddev->writes_pending, no_op,
 			    PERCPU_REF_ALLOW_REINIT, GFP_KERNEL)) {
-		mddev_clear_bitmap_ops(mddev);
 		percpu_ref_exit(&mddev->active_io);
 		return -ENOMEM;
 	}
@@ -727,6 +742,7 @@ int mddev_init(struct mddev *mddev)
 	mddev->resync_min = 0;
 	mddev->resync_max = MaxSector;
 	mddev->level = LEVEL_NONE;
+	mddev->bitmap_id = ID_BITMAP;
 
 	INIT_WORK(&mddev->sync_work, md_start_sync);
 	INIT_WORK(&mddev->del_work, mddev_delayed_delete);
@@ -737,7 +753,6 @@ EXPORT_SYMBOL_GPL(mddev_init);
 
 void mddev_destroy(struct mddev *mddev)
 {
-	mddev_clear_bitmap_ops(mddev);
 	percpu_ref_exit(&mddev->active_io);
 	percpu_ref_exit(&mddev->writes_pending);
 }
@@ -6086,11 +6101,6 @@ struct mddev *md_alloc(dev_t dev, char *name)
 		return ERR_PTR(error);
 	}
 
-	if (md_bitmap_registered(mddev) && mddev->bitmap_ops->group)
-		if (sysfs_create_group(&mddev->kobj, mddev->bitmap_ops->group))
-			pr_warn("md: cannot register extra bitmap attributes for %s\n",
-				mdname(mddev));
-
 	kobject_uevent(&mddev->kobj, KOBJ_ADD);
 	mddev->sysfs_state = sysfs_get_dirent_safe(mddev->kobj.sd, "array_state");
 	mddev->sysfs_level = sysfs_get_dirent_safe(mddev->kobj.sd, "level");
@@ -6166,6 +6176,25 @@ static void md_safemode_timeout(struct timer_list *t)
 
 static int start_dirty_degraded;
 
+static int md_bitmap_create(struct mddev *mddev)
+{
+	if (!md_bitmap_registered(mddev))
+		mddev_set_bitmap_ops(mddev);
+	if (!mddev->bitmap_ops)
+		return -ENOENT;
+
+	return mddev->bitmap_ops->create(mddev);
+}
+
+static void md_bitmap_destroy(struct mddev *mddev)
+{
+	if (!md_bitmap_registered(mddev))
+		return;
+
+	mddev->bitmap_ops->destroy(mddev);
+	mddev_clear_bitmap_ops(mddev);
+}
+
 int md_run(struct mddev *mddev)
 {
 	int err;
@@ -6330,9 +6359,9 @@ int md_run(struct mddev *mddev)
 			(unsigned long long)pers->size(mddev, 0, 0) / 2);
 		err = -EINVAL;
 	}
-	if (err == 0 && pers->sync_request && md_bitmap_registered(mddev) &&
+	if (err == 0 && pers->sync_request &&
 	    (mddev->bitmap_info.file || mddev->bitmap_info.offset)) {
-		err = mddev->bitmap_ops->create(mddev);
+		err = md_bitmap_create(mddev);
 		if (err)
 			pr_warn("%s: failed to create bitmap (%d)\n",
 				mdname(mddev), err);
@@ -6405,8 +6434,7 @@ int md_run(struct mddev *mddev)
 	pers->free(mddev, mddev->private);
 	mddev->private = NULL;
 	put_pers(pers);
-	if (md_bitmap_registered(mddev))
-		mddev->bitmap_ops->destroy(mddev);
+	md_bitmap_destroy(mddev);
 abort:
 	bioset_exit(&mddev->io_clone_set);
 exit_sync_set:
@@ -6429,7 +6457,7 @@ int do_md_run(struct mddev *mddev)
 	if (md_bitmap_registered(mddev)) {
 		err = mddev->bitmap_ops->load(mddev);
 		if (err) {
-			mddev->bitmap_ops->destroy(mddev);
+			md_bitmap_destroy(mddev);
 			goto out;
 		}
 	}
@@ -6620,8 +6648,7 @@ static void __md_stop(struct mddev *mddev)
 {
 	struct md_personality *pers = mddev->pers;
 
-	if (md_bitmap_registered(mddev))
-		mddev->bitmap_ops->destroy(mddev);
+	md_bitmap_destroy(mddev);
 	mddev_detach(mddev);
 	spin_lock(&mddev->lock);
 	mddev->pers = NULL;
@@ -7401,16 +7428,16 @@ static int set_bitmap_file(struct mddev *mddev, int fd)
 	err = 0;
 	if (mddev->pers) {
 		if (fd >= 0) {
-			err = mddev->bitmap_ops->create(mddev);
+			err = md_bitmap_create(mddev);
 			if (!err)
 				err = mddev->bitmap_ops->load(mddev);
 
 			if (err) {
-				mddev->bitmap_ops->destroy(mddev);
+				md_bitmap_destroy(mddev);
 				fd = -1;
 			}
 		} else if (fd < 0) {
-			mddev->bitmap_ops->destroy(mddev);
+			md_bitmap_destroy(mddev);
 		}
 	}
 
@@ -7725,12 +7752,12 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
 					mddev->bitmap_info.default_offset;
 				mddev->bitmap_info.space =
 					mddev->bitmap_info.default_space;
-				rv = mddev->bitmap_ops->create(mddev);
+				rv = md_bitmap_create(mddev);
 				if (!rv)
 					rv = mddev->bitmap_ops->load(mddev);
 
 				if (rv)
-					mddev->bitmap_ops->destroy(mddev);
+					md_bitmap_destroy(mddev);
 			} else {
 				struct md_bitmap_stats stats;
 
@@ -7756,7 +7783,7 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
 					put_cluster_ops(mddev);
 					mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY;
 				}
-				mddev->bitmap_ops->destroy(mddev);
+				md_bitmap_destroy(mddev);
 				mddev->bitmap_info.offset = 0;
 			}
 		}
-- 
2.39.2

From nobody Wed Dec 17 08:55:55 2025
From: Yu Kuai
Subject: [PATCH RFC md-6.16 v3 06/19] md: add a new parameter 'offset' to md_super_write()
Date: Mon, 12 May 2025 09:19:14 +0800
Message-Id: <20250512011927.2809400-7-yukuai1@huaweicloud.com>
In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com>

From: Yu Kuai

The parameter is always set to 0 for now; following patches will use
this helper to write the llbitmap to the underlying disks, allowing only
the dirty sectors to be written instead of the whole page.
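For illustration, a sketch of a future caller (hypothetical, not part of
this patch): with the extra argument a caller can flush a single dirty
512-byte sector that sits 'offset' bytes into the bitmap page, instead
of the whole page:

  /* hypothetical: write one dirty sector from the middle of 'page' */
  md_super_write(mddev, rdev, sboff + ps, 512, page, offset);

All callers converted in this patch keep passing 0, so behaviour is
unchanged.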
Signed-off-by: Yu Kuai
---
 drivers/md/md-bitmap.c |  3 ++-
 drivers/md/md.c        | 13 +++++++------
 drivers/md/md.h        |  3 ++-
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 4c5067783b7a..a5602cb5d756 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -468,7 +468,8 @@ static int __write_sb_page(struct md_rdev *rdev, struct bitmap *bitmap,
 		return -EINVAL;
 	}
 
-	md_super_write(mddev, rdev, sboff + ps, (int)min(size, bitmap_limit), page);
+	md_super_write(mddev, rdev, sboff + ps, (int)min(size, bitmap_limit),
+		       page, 0);
 	return 0;
 }
 
diff --git a/drivers/md/md.c b/drivers/md/md.c
index ba2b981b017c..4329ecfbe8ff 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -1037,7 +1037,8 @@ static void super_written(struct bio *bio)
 }
 
 void md_super_write(struct mddev *mddev, struct md_rdev *rdev,
-		    sector_t sector, int size, struct page *page)
+		    sector_t sector, int size, struct page *page,
+		    unsigned int offset)
 {
 	/* write first size bytes of page to sector of rdev
 	 * Increment mddev->pending_writes before returning
@@ -1062,7 +1063,7 @@ void md_super_write(struct mddev *mddev, struct md_rdev *rdev,
 	atomic_inc(&rdev->nr_pending);
 
 	bio->bi_iter.bi_sector = sector;
-	__bio_add_page(bio, page, size, 0);
+	__bio_add_page(bio, page, size, offset);
 	bio->bi_private = rdev;
 	bio->bi_end_io = super_written;
 
@@ -1673,7 +1674,7 @@ super_90_rdev_size_change(struct md_rdev *rdev, sector_t num_sectors)
 		num_sectors = (sector_t)(2ULL << 32) - 2;
 	do {
 		md_super_write(rdev->mddev, rdev, rdev->sb_start, rdev->sb_size,
-			       rdev->sb_page);
+			       rdev->sb_page, 0);
 	} while (md_super_wait(rdev->mddev) < 0);
 	return num_sectors;
 }
@@ -2322,7 +2323,7 @@ super_1_rdev_size_change(struct md_rdev *rdev, sector_t num_sectors)
 	sb->sb_csum = calc_sb_1_csum(sb);
 	do {
 		md_super_write(rdev->mddev, rdev, rdev->sb_start, rdev->sb_size,
-			       rdev->sb_page);
+			       rdev->sb_page, 0);
 	} while (md_super_wait(rdev->mddev) < 0);
 	return num_sectors;
 
@@ -2833,7 +2834,7 @@ void md_update_sb(struct mddev *mddev, int force_change)
 		if (!test_bit(Faulty, &rdev->flags)) {
 			md_super_write(mddev,rdev,
 				       rdev->sb_start, rdev->sb_size,
-				       rdev->sb_page);
+				       rdev->sb_page, 0);
 			pr_debug("md: (write) %pg's sb offset: %llu\n",
 				 rdev->bdev,
 				 (unsigned long long)rdev->sb_start);
@@ -2842,7 +2843,7 @@ void md_update_sb(struct mddev *mddev, int force_change)
 			md_super_write(mddev, rdev,
 				       rdev->badblocks.sector,
 				       rdev->badblocks.size << 9,
-				       rdev->bb_page);
+				       rdev->bb_page, 0);
 			rdev->badblocks.size = 0;
 		}
 
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 135d95ba1ebb..99f6c7a92b48 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -890,7 +890,8 @@ void md_free_cloned_bio(struct bio *bio);
 
 extern bool __must_check md_flush_request(struct mddev *mddev, struct bio *bio);
 extern void md_super_write(struct mddev *mddev, struct md_rdev *rdev,
-			   sector_t sector, int size, struct page *page);
+			   sector_t sector, int size, struct page *page,
+			   unsigned int offset);
 extern int md_super_wait(struct mddev *mddev);
 extern int sync_page_io(struct md_rdev *rdev, sector_t sector, int size,
 			struct page *page, blk_opf_t opf, bool metadata_op);
-- 
2.39.2

From nobody Wed Dec 17 08:55:55 2025
From: Yu Kuai
Subject: [PATCH RFC md-6.16 v3 07/19] md/md-bitmap: add a new helper skip_sync_blocks() in bitmap_operations
Date: Mon, 12 May 2025 09:19:15 +0800
Message-Id: <20250512011927.2809400-8-yukuai1@huaweicloud.com>
In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com>

From: Yu Kuai

This helper is used to check whether blocks can be skipped before
calling into pers->sync_request(). llbitmap will use it to skip resync
for unwritten/clean data blocks, and recovery/check/repair for unwritten
data blocks.
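As a rough sketch (hypothetical backend; the demo_* helpers are
assumptions, llbitmap's real logic comes later in the series), an
implementation returns 0 when the range still has to go through
pers->sync_request(), or the number of sectors md_do_sync() may jump
over:

  static sector_t demo_skip_sync_blocks(struct mddev *mddev, sector_t offset)
  {
          /* assumed helper: the whole region was never written */
          if (demo_region_unwritten(mddev, offset))
                  return demo_region_sectors(mddev, offset);

          return 0;       /* fall through to pers->sync_request() */
  }

md_do_sync() then advances the sync cursor by the returned value,
exactly as if sync_request() had skipped those sectors.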
Signed-off-by: Yu Kuai
Reviewed-by: Christoph Hellwig
---
 drivers/md/md-bitmap.h | 1 +
 drivers/md/md.c        | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h
index 41d09c6d0c14..13be2a10801a 100644
--- a/drivers/md/md-bitmap.h
+++ b/drivers/md/md-bitmap.h
@@ -99,6 +99,7 @@ struct bitmap_operations {
 	void (*end_discard)(struct mddev *mddev, sector_t offset,
 			    unsigned long sectors);
 
+	sector_t (*skip_sync_blocks)(struct mddev *mddev, sector_t offset);
 	bool (*start_sync)(struct mddev *mddev, sector_t offset,
 			   sector_t *blocks, bool degraded);
 	void (*end_sync)(struct mddev *mddev, sector_t offset, sector_t *blocks);
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4329ecfbe8ff..c23ee9c19cf9 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9370,6 +9370,12 @@ void md_do_sync(struct md_thread *thread)
 		if (test_bit(MD_RECOVERY_INTR, &mddev->recovery))
 			break;
 
+		if (mddev->bitmap_ops && mddev->bitmap_ops->skip_sync_blocks) {
+			sectors = mddev->bitmap_ops->skip_sync_blocks(mddev, j);
+			if (sectors)
+				goto update;
+		}
+
 		sectors = mddev->pers->sync_request(mddev, j, max_sectors,
 						    &skipped);
 		if (sectors == 0) {
@@ -9385,6 +9391,7 @@ void md_do_sync(struct md_thread *thread)
 		if (test_bit(MD_RECOVERY_INTR, &mddev->recovery))
 			break;
 
+update:
 		j += sectors;
 		if (j > max_sectors)
 			/* when skipping, extra large numbers can be returned. */
-- 
2.39.2

From nobody Wed Dec 17 08:55:55 2025
From: Yu Kuai
Subject: [PATCH RFC md-6.16 v3 08/19] md/md-bitmap: add a new helper blocks_synced() in bitmap_operations
Date: Mon, 12 May 2025 09:19:16 +0800
Message-Id: <20250512011927.2809400-9-yukuai1@huaweicloud.com>
In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com>

From: Yu Kuai

Currently, raid456 must perform a whole-array initial recovery to build
the initial xor data, so that IO to the array won't have to read all the
blocks from the underlying disks. This behavior hurts IO performance a
lot, and with today's huge disks the initial recovery can take a long
time. Hence llbitmap will support lazy initial recovery in following
patches.

This helper is used to check whether data blocks are synced; if not, IO
will still have to read all blocks.

Signed-off-by: Yu Kuai
Reviewed-by: Christoph Hellwig
---
 drivers/md/md-bitmap.h | 1 +
 drivers/md/raid5.c     | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h
index 13be2a10801a..4e27f5f793b7 100644
--- a/drivers/md/md-bitmap.h
+++ b/drivers/md/md-bitmap.h
@@ -99,6 +99,7 @@ struct bitmap_operations {
 	void (*end_discard)(struct mddev *mddev, sector_t offset,
 			    unsigned long sectors);
 
+	bool (*blocks_synced)(struct mddev *mddev, sector_t offset);
 	sector_t (*skip_sync_blocks)(struct mddev *mddev, sector_t offset);
 	bool (*start_sync)(struct mddev *mddev, sector_t offset,
 			   sector_t *blocks, bool degraded);
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 7e66a99f29af..e5d3d8facb4b 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3748,6 +3748,7 @@ static int want_replace(struct stripe_head *sh, int disk_idx)
 static int need_this_block(struct stripe_head *sh, struct stripe_head_state *s,
 			   int disk_idx, int disks)
 {
+	struct mddev *mddev = sh->raid_conf->mddev;
 	struct r5dev *dev = &sh->dev[disk_idx];
 	struct r5dev *fdev[2] = { &sh->dev[s->failed_num[0]],
 				  &sh->dev[s->failed_num[1]] };
@@ -3762,6 +3763,11 @@ static int need_this_block(struct stripe_head *sh, struct stripe_head_state *s,
 		 */
 		return 0;
 
+	/* The initial recover is not done, must read everything */
+	if (mddev->bitmap_ops && mddev->bitmap_ops->blocks_synced &&
+	    !mddev->bitmap_ops->blocks_synced(mddev, sh->sector))
+		return 1;
+
 	if (dev->toread ||
 	    (dev->towrite && !test_bit(R5_OVERWRITE, &dev->flags)))
 		/* We need this block to directly satisfy a request */
-- 
2.39.2

From nobody Wed Dec 17 08:55:55 2025
bh=5u/VTQgT10rVteQE3bLcRKRug+hOln5hZySpZiNE76o=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=ttt00ZmYagClmNZ37pfhOd4XyftQ7SuGcDU1JHLoGGfGW0t7+ecJmfc59ZiRKhDYoW2c3zwj2iR03Ts0pvN5NN3r22wA+XOL5EKKpERhmp/O1036A/Ud0YDEGfxgcpWPGt+KzijpsSN5YiUMrML+3IpRDykI1EUPgdO3lYv+d0s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Zwhmz699cz4f3jt8; Mon, 12 May 2025 09:27:47 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 547DB1A0359; Mon, 12 May 2025 09:28:07 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgCnC2CdTiFoNFCWMA--.55093S13; Mon, 12 May 2025 09:28:07 +0800 (CST) From: Yu Kuai To: hch@lst.de, xni@redhat.com, colyli@kernel.org, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, song@kernel.org, yukuai3@huawei.com Cc: linux-kernel@vger.kernel.org, dm-devel@lists.linux.dev, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com, johnny.chenyi@huawei.com Subject: [PATCH RFC md-6.16 v3 09/19] md: add a new recovery_flag MD_RECOVERY_LAZY_RECOVER Date: Mon, 12 May 2025 09:19:17 +0800 Message-Id: <20250512011927.2809400-10-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com> References: <20250512011927.2809400-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgCnC2CdTiFoNFCWMA--.55093S13 X-Coremail-Antispam: 1UD129KBjvJXoWxGry5JF1rWF1xAFyDCF17Jrb_yoW5Gr1kpa yxAF93CrWUAFWfZ3yUt3WDWFW5Zw10qryqyFy3uas5JF90kFn3ZF1UW3W7JrWDJa9aqa12 qw1DJFsrZF1F9w7anT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUQFxUUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai This flag is used by llbitmap in later patches to skip raid456 initial recover and delay building initial xor data to first write. 
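For illustration only (not part of this patch): a minimal sketch, assuming a hypothetical helper name, of how a bitmap backend is expected to request the deferred raid456 initial recovery once the first write lands on an unwritten region; the flag handling mirrors llbitmap_state_machine() introduced later in this series.

/*
 * Hypothetical helper, for illustration only: instead of a full-array
 * initial recovery, ask the md sync thread to run the lazy raid456
 * recovery for the chunks that were just written.
 */
static void llbitmap_request_lazy_recover(struct mddev *mddev)
{
	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
	set_bit(MD_RECOVERY_LAZY_RECOVER, &mddev->recovery);
	md_wakeup_thread(mddev->thread);
}
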
Signed-off-by: Yu Kuai Reviewed-by: Christoph Hellwig --- drivers/md/md.c | 12 +++++++++++- drivers/md/md.h | 2 ++ 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index c23ee9c19cf9..a5dd7a403ea5 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -9133,6 +9133,14 @@ static sector_t md_sync_position(struct mddev *mddev= , enum sync_action action) start =3D rdev->recovery_offset; rcu_read_unlock(); =20 + /* + * If there are no spares, and raid456 lazy initial recover is + * requested. + */ + if (test_bit(MD_RECOVERY_LAZY_RECOVER, &mddev->recovery) && + start =3D=3D MaxSector) + start =3D 0; + /* If there is a bitmap, we need to make sure all * writes that started before we added a spare * complete before we start doing a recovery. @@ -9697,6 +9705,7 @@ static bool md_choose_sync_action(struct mddev *mddev= , int *spares) if (mddev->recovery_cp < MaxSector) { set_bit(MD_RECOVERY_SYNC, &mddev->recovery); clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery); + clear_bit(MD_RECOVERY_LAZY_RECOVER, &mddev->recovery); return true; } =20 @@ -9706,7 +9715,7 @@ static bool md_choose_sync_action(struct mddev *mddev= , int *spares) * re-add. */ *spares =3D remove_and_add_spares(mddev, NULL); - if (*spares) { + if (*spares || test_bit(MD_RECOVERY_LAZY_RECOVER, &mddev->recovery)) { clear_bit(MD_RECOVERY_SYNC, &mddev->recovery); clear_bit(MD_RECOVERY_CHECK, &mddev->recovery); clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery); @@ -10029,6 +10038,7 @@ void md_reap_sync_thread(struct mddev *mddev) clear_bit(MD_RECOVERY_RESHAPE, &mddev->recovery); clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery); clear_bit(MD_RECOVERY_CHECK, &mddev->recovery); + clear_bit(MD_RECOVERY_LAZY_RECOVER, &mddev->recovery); /* * We call mddev->cluster_ops->update_size here because sync_size could * be changed by md_update_sb, and MD_RECOVERY_RESHAPE is cleared, diff --git a/drivers/md/md.h b/drivers/md/md.h index 99f6c7a92b48..0c89bf0e8e4f 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -667,6 +667,8 @@ enum recovery_flags { MD_RECOVERY_RESHAPE, /* remote node is running resync thread */ MD_RESYNCING_REMOTE, + /* raid456 lazy initial recover */ + MD_RECOVERY_LAZY_RECOVER, }; =20 enum md_ro_state { --=20 2.39.2 From nobody Wed Dec 17 08:55:55 2025 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BCADC1E5B8A; Mon, 12 May 2025 01:28:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013294; cv=none; b=ikHIq1mr/LtrDrTRUs2PJFiz1FqC8l9O3GLzCOWI88QWMhy0tkQM2fbfuCcTX7rywyxSnXJKjsWhsOlp8NOZsXsjfPHAKOu9ueKL8+56tph6mqhdCL8xEp+m6lD3gI1Dj7iYOlV1lmH8+CVws4fvRkAvTqN5vlACTXgwF4LP9do= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013294; c=relaxed/simple; bh=m4pjRq7129TZ6hLY++gKH7ZOY1JonomcS9hFruvGQ9o=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=nPFXRY7LhTPhGMFd3iTWJ5yytV3fei6naXWdnir8VSugNlpJuzycDcGPZKghahEdTPDdg55USWiT1jFBKirmXpO3ljAYVl5vl4Vy7hDF7d2KbRBIcZJI+jCSOtnA9ebfZrS7V1MKRysu2PJn0zXAiuNUEiuVQvnsuAALBzujI7Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=none smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 
Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4Zwhmv1kyYz4f3jXl; Mon, 12 May 2025 09:27:43 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id E5A3E1A13FA; Mon, 12 May 2025 09:28:07 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgCnC2CdTiFoNFCWMA--.55093S14; Mon, 12 May 2025 09:28:07 +0800 (CST) From: Yu Kuai To: hch@lst.de, xni@redhat.com, colyli@kernel.org, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, song@kernel.org, yukuai3@huawei.com Cc: linux-kernel@vger.kernel.org, dm-devel@lists.linux.dev, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com, johnny.chenyi@huawei.com Subject: [PATCH RFC md-6.16 v3 10/19] md/md-llbitmap: add data structure definition and comments Date: Mon, 12 May 2025 09:19:18 +0800 Message-Id: <20250512011927.2809400-11-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com> References: <20250512011927.2809400-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgCnC2CdTiFoNFCWMA--.55093S14 X-Coremail-Antispam: 1UD129KBjvJXoWxKF15uw43Ww17tryxtF1DWrg_yoWDGryfpF W3ZrnxJrs8J3yxK347AFy2qFyftw4kAw13try3A3WF9w1YyF9avF92gFWrW3y7G3y7G3W7 ZFs8Kr98Ga98ArJanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUQFxUUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai Signed-off-by: Yu Kuai --- drivers/md/md-llbitmap.c | 281 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 281 insertions(+) create mode 100644 drivers/md/md-llbitmap.c diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c new file mode 100644 index 000000000000..8ab4c77abd32 --- /dev/null +++ b/drivers/md/md-llbitmap.c @@ -0,0 +1,281 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "md.h" +#include "md-bitmap.h" + +/* + * #### Background + * + * Redundant data is used to enhance data fault tolerance, and the storage + 
* method for redundant data vary depending on the RAID levels. And it's + * important to maintain the consistency of redundant data. + * + * Bitmap is used to record which data blocks have been synchronized and w= hich + * ones need to be resynchronized or recovered. Each bit in the bitmap + * represents a segment of data in the array. When a bit is set, it indica= tes + * that the multiple redundant copies of that data segment may not be + * consistent. Data synchronization can be performed based on the bitmap a= fter + * power failure or readding a disk. If there is no bitmap, a full disk + * synchronization is required. + * + * #### Key Features + * + * - IO fastpath is lockless, if user issues lots of write IO to the same + * bitmap bit in a short time, only the first write have additional overh= ead + * to update bitmap bit, no additional overhead for the following writes; + * - support only resync or recover written data, means in the case creat= ing + * new array or replacing with a new disk, there is no need to do a full = disk + * resync/recovery; + * + * #### Key Concept + * + * ##### State Machine + * + * Each bit is one byte, contain 6 difference state, see llbitmap_state. A= nd + * there are total 8 differenct actions, see llbitmap_action, can change s= tate: + * + * llbitmap state machine: transitions between states + * + * | | Startwrite | Startsync | Endsync | Abortsync| Reload | = Daemon | Discard | Stale | + * | --------- | ---------- | --------- | ------- | ------- | -------- | = ------ | --------- | --------- | + * | Unwritten | Dirty | x | x | x | x | = x | x | x | + * | Clean | Dirty | x | x | x | x | = x | Unwritten | NeedSync | + * | Dirty | x | x | x | x | NeedSync | = Clean | Unwritten | NeedSync | + * | NeedSync | x | Syncing | x | x | x | = x | Unwritten | x | + * | Syncing | x | Syncing | Dirty | NeedSync | NeedSync | = x | Unwritten | NeedSync | + * + * Typical scenarios: + * + * 1) Create new array + * All bits will be set to Unwritten by default, if --assume-clean is set, + * All bits will be set to Clean instead. + * + * 2) write data, raid1/raid10 have full copy of data, while raid456 doesn= 't and + * rely on xor data + * + * 2.1) write new data to raid1/raid10: + * Unwritten --StartWrite--> Dirty + * + * 2.2) write new data to raid456: + * Unwritten --StartWrite--> NeedSync + * + * Because the initial recover for raid456 is skipped, the xor data is not= build + * yet, the bit must set to NeedSync first and after lazy initial recover = is + * finished, the bit will finially set to Dirty(see 4.1 and 4.4); + * + * 2.3) cover write + * Clean --StartWrite--> Dirty + * + * 3) daemon, if the array is not degraded: + * Dirty --Daemon--> Clean + * + * For degraded array, the Dirty bit will never be cleared, prevent full d= isk + * recovery while readding a removed disk. 
+ * + * 4) discard + * {Clean, Dirty, NeedSync, Syncing} --Discard--> Unwritten + * + * 5) resync and recover + * + * 5.1) common process + * NeedSync --Startsync--> Syncing --Endsync--> Dirty --Daemon--> Clean + * + * 5.2) resync after power failure + * Dirty --Reload--> NeedSync + * + * 5.3) recover while replacing with a new disk + * By default, the old bitmap framework will recover all data, and llbitmap + * implement this by a new helper llbitmap_skip_sync_blocks: + * + * skip recover for bits other than dirty or clean; + * + * 5.4) lazy initial recover for raid5: + * By default, the old bitmap framework will only allow new recover when t= here + * are spares(new disk), a new recovery flag MD_RECOVERY_LAZY_RECOVER is a= dd + * to perform raid456 lazy recover for set bits(from 2.2). + * + * ##### Bitmap IO + * + * ##### Chunksize + * + * The default bitmap size is 128k, incluing 1k bitmap super block, and + * the default size of segment of data in the array each bit(chunksize) is= 64k, + * and chunksize will adjust to twice the old size each time if the total = number + * bits is not less than 127k.(see llbitmap_init) + * + * ##### READ + * + * While creating bitmap, all pages will be allocated and read for llbitma= p, + * there won't be read afterwards + * + * ##### WRITE + * + * WRITE IO is divided into logical_block_size of the array, the dirty sta= te + * of each block is tracked independently, for example: + * + * each page is 4k, contain 8 blocks; each block is 512 bytes contain 512 = bit; + * + * | page0 | page1 | ... | page 31 | + * | | + * | \-----------------------\ + * | | + * | block0 | block1 | ... | block 8| + * | | + * | \-----------------\ + * | | + * | bit0 | bit1 | ... | bit511 | + * + * From IO path, if one bit is changed to Dirty or NeedSync, the correspon= ding + * block will be marked dirty, such block must write first before the IO is + * issued. This behaviour will affect IO performance, to reduce the impact= , if + * multiple bits are changed in the same block in a short time, all bits i= n this + * block will be changed to Dirty/NeedSync, so that there won't be any ove= rhead + * until daemon clears dirty bits. + * + * ##### Dirty Bits syncronization + * + * IO fast path will set bits to dirty, and those dirty bits will be clear= ed + * by daemon after IO is done. 
llbitmap_barrier is used to synchronize bet= ween + * IO path and daemon; + * + * IO path: + * 1) try to grab a reference, if succeed, set expire time after 5s and r= eturn; + * 2) if failed to grab a reference, wait for daemon to finish clearing d= irty + * bits; + * + * Daemon(Daemon will be waken up every daemon_sleep seconds): + * For each page: + * 1) check if page expired, if not skip this page; for expired page: + * 2) suspend the page and wait for inflight write IO to be done; + * 3) change dirty page to clean; + * 4) resume the page; + */ + +#define LLBITMAP_MAJOR_HI 6 + +#define BITMAP_MAX_SECTOR (128 * 2) +#define BITMAP_MAX_PAGES 32 +#define BITMAP_SB_SIZE 1024 +/* 64k is the max IO size of sync IO for raid1/raid10 */ +#define MIN_CHUNK_SIZE (64 * 2) + +#define DEFAULT_DAEMON_SLEEP 30 + +#define BARRIER_IDLE 5 + +enum llbitmap_state { + /* No valid data, init state after assemble the array */ + BitUnwritten =3D 0, + /* data is consistent */ + BitClean, + /* data will be consistent after IO is done, set directly for writes */ + BitDirty, + /* + * data need to be resynchronized: + * 1) set directly for writes if array is degraded, prevent full disk + * synchronization after readding a disk; + * 2) reassemble the array after power failure, and dirty bits are + * found after reloading the bitmap; + */ + BitNeedSync, + /* data is synchronizing */ + BitSyncing, + nr_llbitmap_state, + BitNone =3D 0xff, +}; + +enum llbitmap_action { + /* User write new data, this is the only acton from IO fast path */ + BitmapActionStartwrite =3D 0, + /* Start recovery */ + BitmapActionStartsync, + /* Finish recovery */ + BitmapActionEndsync, + /* Failed recovery */ + BitmapActionAbortsync, + /* Reassemble the array */ + BitmapActionReload, + /* Daemon thread is trying to clear dirty bits */ + BitmapActionDaemon, + /* Data is deleted */ + BitmapActionDiscard, + /* + * Bitmap is stale, mark all bits in addition to BitUnwritten to + * BitNeedSync. + */ + BitmapActionStale, + nr_llbitmap_action, + /* Init state is BitUnwritten */ + BitmapActionInit, +}; + +enum barrier_state { + LLPageFlush =3D 0, + LLPageDirty, +}; +/* + * page level barrier to synchronize between dirty bit by write IO and cle= an bit + * by daemon. 
+ */ +struct llbitmap_barrier { + char *data; + struct percpu_ref active; + unsigned long expire; + unsigned long flags; + /* Per block size dirty state, maximum 64k page / 512 sector =3D 128 */ + DECLARE_BITMAP(dirty, 128); + wait_queue_head_t wait; +} ____cacheline_aligned_in_smp; + +struct llbitmap { + struct mddev *mddev; + int nr_pages; + struct page *pages[BITMAP_MAX_PAGES]; + struct llbitmap_barrier barrier[BITMAP_MAX_PAGES]; + + /* shift of one chunk */ + unsigned long chunkshift; + /* size of one chunk in sector */ + unsigned long chunksize; + /* total number of chunks */ + unsigned long chunks; + int io_size; + int bits_per_page; + /* fires on first BitDirty state */ + struct timer_list pending_timer; + struct work_struct daemon_work; + + unsigned long flags; + __u64 events_cleared; +}; + +struct llbitmap_unplug_work { + struct work_struct work; + struct llbitmap *llbitmap; + struct completion *done; +}; + +static struct workqueue_struct *md_llbitmap_io_wq; +static struct workqueue_struct *md_llbitmap_unplug_wq; + +static char state_machine[nr_llbitmap_state][nr_llbitmap_action] =3D { + [BitUnwritten] =3D {BitDirty, BitNone, BitNone, BitNone, BitNone, BitNone= , BitNone, BitNone}, + [BitClean] =3D {BitDirty, BitNone, BitNone, BitNone, BitNone, BitNone, Bi= tUnwritten, BitNeedSync}, + [BitDirty] =3D {BitNone, BitNone, BitNone, BitNone, BitNeedSync, BitClean= , BitUnwritten, BitNeedSync}, + [BitNeedSync] =3D {BitNone, BitSyncing, BitNone, BitNone, BitNone, BitNon= e, BitUnwritten, BitNone}, + [BitSyncing] =3D {BitNone, BitSyncing, BitDirty, BitNeedSync, BitNeedSync= , BitNone, BitUnwritten, BitNeedSync}, +}; --=20 2.39.2 From nobody Wed Dec 17 08:55:55 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6A6FE1E834A; Mon, 12 May 2025 01:28:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013295; cv=none; b=UKH4b+iNky9EI6YKz8d1My2HCEMSt+/JNC/FVIKHs71uJGZzntqhgg454rQGx1JxSFsucfmvaT9WPK+kf6/4xGDn7X8wliqrfrGp8cEjXY9vl5AHUlm8zocsRS77PJUVuxkNYrkPTzwY7Y+iqauJ48b30XyBEhfIEaRtoY7k6R0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013295; c=relaxed/simple; bh=cUklt6u9HRLvdv51EjrnOmOQXl8kpjw+uv3yoeOSjI4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=eavDYDiccCv14hUuy0xeku4m/F4EG0U9rX9vtTei9y/lDfSZyuhDcb4bwTK2HBJP07oVE0QRq1lwwEMOyIWCrqzMLrlsFWzTpAQXhIvnK99hwrkp6icEstY6mBv59F1IwvmpR8caZobOSAuvrwASD4GZjQm2mN0cDAddswtVC10= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Zwhmt2SMJz4f3lVX; Mon, 12 May 2025 09:27:42 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 7E36E1A0359; Mon, 12 May 2025 09:28:08 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP 
id gCh0CgCnC2CdTiFoNFCWMA--.55093S15; Mon, 12 May 2025 09:28:08 +0800 (CST) From: Yu Kuai To: hch@lst.de, xni@redhat.com, colyli@kernel.org, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, song@kernel.org, yukuai3@huawei.com Cc: linux-kernel@vger.kernel.org, dm-devel@lists.linux.dev, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com, johnny.chenyi@huawei.com Subject: [PATCH RFC md-6.16 v3 11/19] md/md-llbitmap: implement bitmap IO Date: Mon, 12 May 2025 09:19:19 +0800 Message-Id: <20250512011927.2809400-12-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com> References: <20250512011927.2809400-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgCnC2CdTiFoNFCWMA--.55093S15 X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai

READ

While creating the bitmap, all pages will be allocated and read for llbitmap; there won't be any reads afterwards.

WRITE

WRITE IO is divided into logical_block_size of the array, and the dirty state of each block is tracked independently. For example:

each page is 4k and contains 8 blocks; each block is 512 bytes and contains 512 bits;

| page0 | page1 | ... | page 31 |
|       |
|        \-----------------------\
|                                |
| block0 | block1 | ... | block 8|
|        |
|         \-----------------\
|                           |
| bit0 | bit1 | ... | bit511 |

From the IO path, if one bit is changed to Dirty or NeedSync, the corresponding block will be marked dirty, and such a block must be written out before the IO is issued. This behaviour will affect IO performance; to reduce the impact, if multiple bits are changed in the same block in a short time, all bits in this block will be changed to Dirty/NeedSync, so that there won't be any overhead until the daemon clears dirty bits.
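For illustration only (not part of this patch): a minimal sketch of the bit-to-page/block mapping described above, assuming the 1k bitmap super block sits in front of the bits and io_size is the logical block size of the array; the helper name is hypothetical, while the arithmetic mirrors llbitmap_write() and llbitmap_set_page_dirty() below. For example, with 4k pages and an io_size of 512, chunk 5000 lands at byte 6024, i.e. page 1, byte offset 1928 within the page, dirty block 3.

/*
 * Hypothetical helper, for illustration only: locate the bitmap page and
 * the io_size-sized block inside it that must be written out once the
 * bit for @chunk turns Dirty or NeedSync.
 */
static void llbitmap_locate_bit(loff_t chunk, unsigned int io_size,
				int *page_idx, int *block, int *offset)
{
	loff_t pos = chunk + BITMAP_SB_SIZE;	/* skip the 1k super block */

	*page_idx = pos >> PAGE_SHIFT;		/* which bitmap page */
	*offset = offset_in_page(pos);		/* byte (one bit per byte) inside the page */
	*block = *offset / io_size;		/* dirty-tracked block inside the page */
}
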
Signed-off-by: Yu Kuai --- drivers/md/md-llbitmap.c | 183 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 183 insertions(+) diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c index 8ab4c77abd32..b27d10661387 100644 --- a/drivers/md/md-llbitmap.c +++ b/drivers/md/md-llbitmap.c @@ -279,3 +279,186 @@ static char state_machine[nr_llbitmap_state][nr_llbit= map_action] =3D { [BitNeedSync] =3D {BitNone, BitSyncing, BitNone, BitNone, BitNone, BitNon= e, BitUnwritten, BitNone}, [BitSyncing] =3D {BitNone, BitSyncing, BitDirty, BitNeedSync, BitNeedSync= , BitNone, BitUnwritten, BitNeedSync}, }; + +static bool is_raid456(struct mddev *mddev) +{ + return (mddev->level =3D=3D 4 || mddev->level =3D=3D 5 || mddev->level = =3D=3D 6); +} + +static int llbitmap_read(struct llbitmap *llbitmap, enum llbitmap_state *s= tate, + loff_t pos) +{ + pos +=3D BITMAP_SB_SIZE; + *state =3D llbitmap->barrier[pos >> PAGE_SHIFT].data[offset_in_page(pos)]; + return 0; +} + +static void llbitmap_set_page_dirty(struct llbitmap *llbitmap, int idx, in= t offset) +{ + struct llbitmap_barrier *barrier =3D &llbitmap->barrier[idx]; + bool level_456 =3D is_raid456(llbitmap->mddev); + int io_size =3D llbitmap->io_size; + int bit =3D offset / io_size; + bool infectious =3D false; + int pos; + + if (!test_bit(LLPageDirty, &barrier->flags)) + set_bit(LLPageDirty, &barrier->flags); + + /* + * if the bit is already dirty, or other page bytes is the same bit is + * already BitDirty, then mark the whole bytes in the bit as dirty + */ + if (test_and_set_bit(bit, barrier->dirty)) { + infectious =3D true; + } else { + for (pos =3D bit * io_size; pos < (bit + 1) * io_size - 1; + pos++) { + if (pos =3D=3D offset) + continue; + if (barrier->data[pos] =3D=3D BitDirty || + barrier->data[pos] =3D=3D BitNeedSync) { + infectious =3D true; + break; + } + } + + } + + if (!infectious) + return; + + for (pos =3D bit * io_size; pos < (bit + 1) * io_size - 1; pos++) { + if (pos =3D=3D offset) + continue; + + switch (barrier->data[pos]) { + case BitUnwritten: + barrier->data[pos] =3D level_456 ? 
BitNeedSync : BitDirty; + break; + case BitClean: + barrier->data[pos] =3D BitDirty; + break; + }; + } +} + +static int llbitmap_write(struct llbitmap *llbitmap, enum llbitmap_state s= tate, + loff_t pos) +{ + int idx; + int offset; + + pos +=3D BITMAP_SB_SIZE; + idx =3D pos >> PAGE_SHIFT; + offset =3D offset_in_page(pos); + + llbitmap->barrier[idx].data[offset] =3D state; + if (state =3D=3D BitDirty || state =3D=3D BitNeedSync) + llbitmap_set_page_dirty(llbitmap, idx, offset); + return 0; +} + +static void llbitmap_free_pages(struct llbitmap *llbitmap) +{ + int i; + + for (i =3D 0; i < BITMAP_MAX_PAGES; i++) { + struct page *page =3D llbitmap->pages[i]; + + if (!page) + return; + + llbitmap->pages[i] =3D NULL; + __free_page(page); + percpu_ref_exit(&llbitmap->barrier[i].active); + } +} + +static struct page *llbitmap_read_page(struct llbitmap *llbitmap, int idx) +{ + struct page *page =3D llbitmap->pages[idx]; + struct mddev *mddev =3D llbitmap->mddev; + struct md_rdev *rdev; + + if (page) + return page; + + page =3D alloc_page(GFP_KERNEL | __GFP_ZERO); + if (!page) + return ERR_PTR(-ENOMEM); + + rdev_for_each(rdev, mddev) { + sector_t sector; + + if (rdev->raid_disk < 0 || test_bit(Faulty, &rdev->flags)) + continue; + + sector =3D mddev->bitmap_info.offset + (idx << PAGE_SECTORS_SHIFT); + + if (sync_page_io(rdev, sector, PAGE_SIZE, page, REQ_OP_READ, true)) + return page; + + md_error(mddev, rdev); + } + + __free_page(page); + return ERR_PTR(-EIO); +} + +static void llbitmap_write_page(struct llbitmap *llbitmap, int idx) +{ + struct page *page =3D llbitmap->pages[idx]; + struct mddev *mddev =3D llbitmap->mddev; + struct md_rdev *rdev; + int bit; + + for (bit =3D 0; bit < llbitmap->bits_per_page; bit++) { + struct llbitmap_barrier *barrier =3D &llbitmap->barrier[idx]; + + if (!test_and_clear_bit(bit, barrier->dirty)) + continue; + + rdev_for_each(rdev, mddev) { + sector_t sector; + sector_t bit_sector =3D llbitmap->io_size >> SECTOR_SHIFT; + + if (rdev->raid_disk < 0 || test_bit(Faulty, &rdev->flags)) + continue; + + sector =3D mddev->bitmap_info.offset + rdev->sb_start + + (idx << PAGE_SECTORS_SHIFT) + + bit * bit_sector; + md_super_write(mddev, rdev, sector, llbitmap->io_size, + page, bit * llbitmap->io_size); + } + } +} + +static int llbitmap_cache_pages(struct llbitmap *llbitmap) +{ + int nr_pages =3D (llbitmap->chunks + BITMAP_SB_SIZE + PAGE_SIZE - 1) / PA= GE_SIZE; + struct page *page; + int i =3D 0; + + llbitmap->nr_pages =3D nr_pages; + while (i < nr_pages) { + page =3D llbitmap_read_page(llbitmap, i); + if (IS_ERR(page)) { + llbitmap_free_pages(llbitmap); + return PTR_ERR(page); + } + + if (percpu_ref_init(&llbitmap->barrier[i].active, active_release, + PERCPU_REF_ALLOW_REINIT, GFP_KERNEL)) { + __free_page(page); + return -ENOMEM; + } + + init_waitqueue_head(&llbitmap->barrier[i].wait); + llbitmap->barrier[i].data =3D page_address(page); + llbitmap->pages[i++] =3D page; + } + + return 0; +} --=20 2.39.2 From nobody Wed Dec 17 08:55:55 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E4F601E9B0B; Mon, 12 May 2025 01:28:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013295; cv=none; 
b=nybbg/jp1NEEFcj+WnYAl5C/f26r4PUz1FWCgNZ4OZLs0DRxw7uBR4DCMMkDkJzXyexcBofFpeJRftwRlDOs6LKLyTo9t7W4Uq78YPLr1JefEoOWWQBtcUZ7g5HjBT3JeDEZZXVY1qHtTRmqzLn80EQgqfd/Io0ePdn5yTfQMoI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013295; c=relaxed/simple; bh=5kMQtNH0vL/8NUzJ8hwKVaz0ZDkw8lGeSo6POxNb4pA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=DVKRCgBOkvKkO7zqzkU+cEKs+vEsqAvqx5rAbesIvKm51bcSEs6yJjKfRXc6o/NqsewuYLdPTm5YGgt6sXaIv2bQ1hQqEjXQlYRaKcW0ZtHGngaX9yJtYv/CyiRrLwSysKKTO4uMlqgrUcTCnomUehglniPrEeH8tare3df/oPM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Zwhmt6RmGz4f3lVM; Mon, 12 May 2025 09:27:42 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 12CCC1A0B39; Mon, 12 May 2025 09:28:09 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgCnC2CdTiFoNFCWMA--.55093S16; Mon, 12 May 2025 09:28:08 +0800 (CST) From: Yu Kuai To: hch@lst.de, xni@redhat.com, colyli@kernel.org, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, song@kernel.org, yukuai3@huawei.com Cc: linux-kernel@vger.kernel.org, dm-devel@lists.linux.dev, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com, johnny.chenyi@huawei.com Subject: [PATCH RFC md-6.16 v3 12/19] md/md-llbitmap: implement bit state machine Date: Mon, 12 May 2025 09:19:20 +0800 Message-Id: <20250512011927.2809400-13-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com> References: <20250512011927.2809400-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgCnC2CdTiFoNFCWMA--.55093S16 X-Coremail-Antispam: 1UD129KBjvJXoW3Xw43Cw4kXr18JryfWr4xXrb_yoW7Xw4xpw sxZrnxGrs8JF4xW3y7t34xtF95Kr4kt3sIqF93A3s5Ww1YyrZI9r1kWFy8J3yUWryFqF1D JFs8Gr95GF4UZrJanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUQFxUUUUUU= X-CM-SenderInfo: 
51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai Each bit is one byte, contain 6 difference state, see llbitmap_state. And there are total 8 differenct actions, see llbitmap_action, can change state: llbitmap state machine: transitions between states | | Startwrite | Startsync | Endsync | Abortsync| Reload | Daem= on | Discard | Stale | | --------- | ---------- | --------- | ------- | ------- | -------- | ----= -- | --------- | --------- | | Unwritten | Dirty | x | x | x | x | x = | x | x | | Clean | Dirty | x | x | x | x | x = | Unwritten | NeedSync | | Dirty | x | x | x | x | NeedSync | Clea= n | Unwritten | NeedSync | | NeedSync | x | Syncing | x | x | x | x = | Unwritten | x | | Syncing | x | Syncing | Dirty | NeedSync | NeedSync | x = | Unwritten | NeedSync | Typical scenarios: 1) Create new array All bits will be set to Unwritten by default, if --assume-clean is set, All bits will be set to Clean instead. 2) write data, raid1/raid10 have full copy of data, while raid456 donen't a= nd rely on xor data 2.1) write new data to raid1/raid10: Unwritten --StartWrite--> Dirty 2.2) write new data to raid456: Unwritten --StartWrite--> NeedSync Because the initial recover for raid456 is skipped, the xor data is not bui= ld yet, the bit must set to NeedSync first and after lazy initial recover is finished, the bit will finially set to Dirty(see 4.1 and 4.4); 2.3) cover write Clean --StartWrite--> Dirty 3) daemon, if the array is not degraded: Dirty --Daemon--> Clean For degraded array, the Dirty bit will never be cleared, prevent full disk recovery while readding a removed disk. 4) discard {Clean, Dirty, NeedSync, Syncing} --Discard--> Unwritten 4) resync and recover 4.1) common process NeedSync --Startsync--> Syncing --Endsync--> Dirty --Daemon--> Clean 4.2) resync after power failure Dirty --Reload--> NeedSync 4.3) recover while replacing with a new disk By default, the old bitmap framework will recover all data, and llbitmap implement this by a new helper llbitmap_skip_sync_blocks: skip recover for bits other than dirty or clean; 4.4) lazy initial recover for raid5: By default, the old bitmap framework will only allow new recover when there are spares(new disk), a new recovery flag MD_RECOVERY_LAZY_RECOVER is add to perform raid456 lazy recover for set bits(from 2.2). Signed-off-by: Yu Kuai --- drivers/md/md-llbitmap.c | 100 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 100 insertions(+) diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c index b27d10661387..315a4eb7627c 100644 --- a/drivers/md/md-llbitmap.c +++ b/drivers/md/md-llbitmap.c @@ -462,3 +462,103 @@ static int llbitmap_cache_pages(struct llbitmap *llbi= tmap) =20 return 0; } + +static void llbitmap_init_state(struct llbitmap *llbitmap) +{ + enum llbitmap_state state =3D BitUnwritten; + unsigned long i; + + if (test_and_clear_bit(BITMAP_CLEAN, &llbitmap->flags)) + state =3D BitClean; + + for (i =3D 0; i < llbitmap->chunks; i++) { + int ret =3D llbitmap_write(llbitmap, state, i); + + if (ret < 0) { + set_bit(BITMAP_WRITE_ERROR, &llbitmap->flags); + break; + } + } +} + +/* The return value is only used from resync, where @start =3D=3D @end. 
*/ +static enum llbitmap_state llbitmap_state_machine(struct llbitmap *llbitma= p, + unsigned long start, + unsigned long end, + enum llbitmap_action action) +{ + struct mddev *mddev =3D llbitmap->mddev; + enum llbitmap_state state =3D BitNone; + bool need_resync =3D false; + bool need_recovery =3D false; + + if (test_bit(BITMAP_WRITE_ERROR, &llbitmap->flags)) + return BitNone; + + if (action =3D=3D BitmapActionInit) { + llbitmap_init_state(llbitmap); + return BitNone; + } + + while (start <=3D end) { + ssize_t ret; + enum llbitmap_state c; + + ret =3D llbitmap_read(llbitmap, &c, start); + if (ret < 0) { + set_bit(BITMAP_WRITE_ERROR, &llbitmap->flags); + return BitNone; + } + + if (c < 0 || c >=3D nr_llbitmap_state) { + pr_err("%s: invalid bit %lu state %d action %d, forcing resync\n", + __func__, start, c, action); + state =3D BitNeedSync; + goto write_bitmap; + } + + if (c =3D=3D BitNeedSync) + need_resync =3D true; + + state =3D state_machine[c][action]; + if (state =3D=3D BitNone) { + start++; + continue; + } + +write_bitmap: + /* Delay raid456 initial recovery to first write. */ + if (c =3D=3D BitUnwritten && state =3D=3D BitDirty && + action =3D=3D BitmapActionStartwrite && is_raid456(mddev)) { + state =3D BitNeedSync; + need_recovery =3D true; + } + + ret =3D llbitmap_write(llbitmap, state, start); + if (ret < 0) { + set_bit(BITMAP_WRITE_ERROR, &llbitmap->flags); + return BitNone; + } + + if (state =3D=3D BitNeedSync) + need_resync =3D true; + else if (state =3D=3D BitDirty && + !timer_pending(&llbitmap->pending_timer)) + mod_timer(&llbitmap->pending_timer, + jiffies + mddev->bitmap_info.daemon_sleep * HZ); + + start++; + } + + if (need_recovery) { + set_bit(MD_RECOVERY_NEEDED, &mddev->recovery); + set_bit(MD_RECOVERY_LAZY_RECOVER, &mddev->recovery); + md_wakeup_thread(mddev->thread); + } else if (need_resync) { + set_bit(MD_RECOVERY_NEEDED, &mddev->recovery); + set_bit(MD_RECOVERY_SYNC, &mddev->recovery); + md_wakeup_thread(mddev->thread); + } + + return state; +} --=20 2.39.2 From nobody Wed Dec 17 08:55:55 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9194C1EB18D; Mon, 12 May 2025 01:28:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013294; cv=none; b=PVKvmHQQmtTZXP3AwCtfNZpBgKHy/Iy8nbEt+EEZ9KRbPTbz9Xzalg+DsWlqcvp8VV36AyUm/3JbA+0REsYNTP09IVQ6S8RK5yUPZc68fOSPRwz2xB6mzZdtKqWNuOJltkYlmUbvRRosJbREEiJtpJ7m3jR/tDL7ZlCtM0GMpH4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013294; c=relaxed/simple; bh=ELvkmyOJBVEyNAiLctJDaPZmLafbprHc9jlgLasppsQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=hYVFkrqCt0tpdX1iUwLxRwa9NepXQN5qONDT8hG4Pyi4EJQNT6jgXtUi1xMFh7osCv/gJg4iTZSj2C247jMnDtNK/tc8BKTGXSgSSSTwHeSTkVGVUKWSWZ4a0bfQC6xrDUhkc+UKNfI3/psUQA/ogg7f6fOzQqwcwYc1doUNs+w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown 
[172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Zwhmv3RCCz4f3lVX; Mon, 12 May 2025 09:27:43 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id A16FF1A0359; Mon, 12 May 2025 09:28:09 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgCnC2CdTiFoNFCWMA--.55093S17; Mon, 12 May 2025 09:28:09 +0800 (CST) From: Yu Kuai To: hch@lst.de, xni@redhat.com, colyli@kernel.org, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, song@kernel.org, yukuai3@huawei.com Cc: linux-kernel@vger.kernel.org, dm-devel@lists.linux.dev, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com, johnny.chenyi@huawei.com Subject: [PATCH RFC md-6.16 v3 13/19] md/md-llbitmap: implement APIs for page level dirty bits synchronization Date: Mon, 12 May 2025 09:19:21 +0800 Message-Id: <20250512011927.2809400-14-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com> References: <20250512011927.2809400-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgCnC2CdTiFoNFCWMA--.55093S17 X-Coremail-Antispam: 1UD129KBjvJXoWxAry8tF1DXF43JF18Cr4DJwb_yoW5XF15pF WxXr15Gr45tF1xWw43ArW7AFyrtr4kt39agasak34F9F1jkrZagF4xCFyDZw4UWrn5GFnr Aan8Cw1fGw48XF7anT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUQFxUUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai IO fast path will set bits to dirty, and those dirty bits will be cleared by daemon after IO is done. 
llbitmap_barrier is used to synchronize between IO path and daemon; IO path: 1) try to grab a reference, if succeed, set expire time after 5s and return; 2) if failed to grab a reference, wait for daemon to finish clearing dirty bits; Daemon(Daemon will be waken up every daemon_sleep seconds): For each page: 1) check if page expired, if not skip this page; for expired page: 2) suspend the page and wait for inflight write IO to be done; 3) change dirty page to clean; 4) resume the page; Signed-off-by: Yu Kuai --- drivers/md/md-llbitmap.c | 46 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c index 315a4eb7627c..994ca0be3d17 100644 --- a/drivers/md/md-llbitmap.c +++ b/drivers/md/md-llbitmap.c @@ -435,6 +435,14 @@ static void llbitmap_write_page(struct llbitmap *llbit= map, int idx) } } =20 +static void active_release(struct percpu_ref *ref) +{ + struct llbitmap_barrier *barrier =3D + container_of(ref, struct llbitmap_barrier, active); + + wake_up(&barrier->wait); +} + static int llbitmap_cache_pages(struct llbitmap *llbitmap) { int nr_pages =3D (llbitmap->chunks + BITMAP_SB_SIZE + PAGE_SIZE - 1) / PA= GE_SIZE; @@ -562,3 +570,41 @@ static enum llbitmap_state llbitmap_state_machine(stru= ct llbitmap *llbitmap, =20 return state; } + +static void llbitmap_raise_barrier(struct llbitmap *llbitmap, int page_idx) +{ + struct llbitmap_barrier *barrier =3D &llbitmap->barrier[page_idx]; + +retry: + if (likely(percpu_ref_tryget_live(&barrier->active))) { + WRITE_ONCE(barrier->expire, jiffies + BARRIER_IDLE * HZ); + return; + } + + wait_event(barrier->wait, !percpu_ref_is_dying(&barrier->active)); + goto retry; +} + +static void llbitmap_release_barrier(struct llbitmap *llbitmap, int page_i= dx) +{ + struct llbitmap_barrier *barrier =3D &llbitmap->barrier[page_idx]; + + percpu_ref_put(&barrier->active); +} + +static void llbitmap_suspend(struct llbitmap *llbitmap, int page_idx) +{ + struct llbitmap_barrier *barrier =3D &llbitmap->barrier[page_idx]; + + percpu_ref_kill(&barrier->active); + wait_event(barrier->wait, percpu_ref_is_zero(&barrier->active)); +} + +static void llbitmap_resume(struct llbitmap *llbitmap, int page_idx) +{ + struct llbitmap_barrier *barrier =3D &llbitmap->barrier[page_idx]; + + barrier->expire =3D LONG_MAX; + percpu_ref_resurrect(&barrier->active); + wake_up(&barrier->wait); +} --=20 2.39.2 From nobody Wed Dec 17 08:55:55 2025 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E0C571E1E1F; Mon, 12 May 2025 01:28:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013295; cv=none; b=QYYw23AWX6erthnmh77AGzv/YNRHk9j5O6aywTHXUEE57BB5gnEHNmPFsO0pqVC+GhRAqyFwOlZKgNm1rKabHF+4kIu4hovKdQnaitfeOFoN6Qnt3VKyMrScWGX49zBdj50j85SlQXN4LCwlyg0mro/qGt3jQOUuKN4Go/zL524= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013295; c=relaxed/simple; bh=ewKWniCbL1+SvJAu6q35eL3fOUVNHxLxAfx8fzZwWMQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Sx/5c1XWMrMbjhyKk3BrmQA+C4dXQRgnw7XPqysnJ7WZGQOuPPfLpoXVMXUje/Krko5+b+1cprg+ABqlEyxAKKqlUd4TqFpeA7whEhzxm/4qeKycB9bVa63NtOE74xeZQv2+Gmm1ObOxN+aiwyQ4d6I0ZXC1kS7RkDek7VwOkHM= ARC-Authentication-Results: 
i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=none smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4Zwhmx3wVjz4f3jXr; Mon, 12 May 2025 09:27:45 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 3B1331A1073; Mon, 12 May 2025 09:28:10 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgCnC2CdTiFoNFCWMA--.55093S18; Mon, 12 May 2025 09:28:10 +0800 (CST) From: Yu Kuai To: hch@lst.de, xni@redhat.com, colyli@kernel.org, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, song@kernel.org, yukuai3@huawei.com Cc: linux-kernel@vger.kernel.org, dm-devel@lists.linux.dev, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com, johnny.chenyi@huawei.com Subject: [PATCH RFC md-6.16 v3 14/19] md/md-llbitmap: implement APIs to mange bitmap lifetime Date: Mon, 12 May 2025 09:19:22 +0800 Message-Id: <20250512011927.2809400-15-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com> References: <20250512011927.2809400-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgCnC2CdTiFoNFCWMA--.55093S18 X-Coremail-Antispam: 1UD129KBjvJXoWxtr1xGFy3WFyUuFyrAF4rAFb_yoW3AFyfpa 1Sv3W5KrWrJr1rXr47Xr93ZFWrXr4kJr9FqFZ7Aas5Cr17ZrsxKryrWFyUJw18Zw1rGFs8 Ja15GF45GF1UWFDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUQFxUUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai Include following APIs: - llbitmap_create - llbitmap_resize - llbitmap_load - llbitmap_destroy Signed-off-by: Yu Kuai --- drivers/md/md-llbitmap.c | 262 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 262 insertions(+) diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c index 994ca0be3d17..4b54aa6fbe40 100644 --- a/drivers/md/md-llbitmap.c +++ b/drivers/md/md-llbitmap.c @@ -608,3 +608,265 @@ static void llbitmap_resume(struct llbitmap *llbitmap= , int page_idx) 
percpu_ref_resurrect(&barrier->active); wake_up(&barrier->wait); } + +static int llbitmap_check_support(struct mddev *mddev) +{ + if (test_bit(MD_HAS_JOURNAL, &mddev->flags)) { + pr_notice("md/llbitmap: %s: array with journal cannot have bitmap\n", + mdname(mddev)); + return -EBUSY; + } + + if (mddev->bitmap_info.space =3D=3D 0) { + if (mddev->bitmap_info.default_space =3D=3D 0) { + pr_notice("md/llbitmap: %s: no space for bitmap\n", + mdname(mddev)); + return -ENOSPC; + } + } + + if (!mddev->persistent) { + pr_notice("md/llbitmap: %s: array must be persistent\n", + mdname(mddev)); + return -EOPNOTSUPP; + } + + if (mddev->bitmap_info.file) { + pr_notice("md/llbitmap: %s: doesn't support bitmap file\n", + mdname(mddev)); + return -EOPNOTSUPP; + } + + if (mddev->bitmap_info.external) { + pr_notice("md/llbitmap: %s: doesn't support external metadata\n", + mdname(mddev)); + return -EOPNOTSUPP; + } + + if (mddev_is_dm(mddev)) { + pr_notice("md/llbitmap: %s: doesn't support dm-raid\n", + mdname(mddev)); + return -EOPNOTSUPP; + } + + return 0; +} + +static int llbitmap_init(struct llbitmap *llbitmap) +{ + struct mddev *mddev =3D llbitmap->mddev; + sector_t blocks =3D mddev->resync_max_sectors; + unsigned long chunksize =3D MIN_CHUNK_SIZE; + unsigned long chunks =3D DIV_ROUND_UP(blocks, chunksize); + unsigned long space =3D mddev->bitmap_info.space << SECTOR_SHIFT; + int ret; + + while (chunks > space) { + chunksize =3D chunksize << 1; + chunks =3D DIV_ROUND_UP(blocks, chunksize); + } + + llbitmap->chunkshift =3D ffz(~chunksize); + llbitmap->chunksize =3D chunksize; + llbitmap->chunks =3D chunks; + mddev->bitmap_info.daemon_sleep =3D DEFAULT_DAEMON_SLEEP; + + ret =3D llbitmap_cache_pages(llbitmap); + if (ret) + return ret; + + llbitmap_state_machine(llbitmap, 0, llbitmap->chunks - 1, BitmapActionIni= t); + return 0; +} + +static int llbitmap_read_sb(struct llbitmap *llbitmap) +{ + struct mddev *mddev =3D llbitmap->mddev; + unsigned long daemon_sleep; + unsigned long chunksize; + unsigned long events; + struct page *sb_page; + bitmap_super_t *sb; + int ret =3D -EINVAL; + + if (!mddev->bitmap_info.offset) { + pr_err("md/llbitmap: %s: no super block found", mdname(mddev)); + return -EINVAL; + } + + sb_page =3D llbitmap_read_page(llbitmap, 0); + if (IS_ERR(sb_page)) { + pr_err("md/llbitmap: %s: read super block failed", + mdname(mddev)); + ret =3D -EIO; + goto out; + } + + sb =3D kmap_local_page(sb_page); + if (sb->magic !=3D cpu_to_le32(BITMAP_MAGIC)) { + pr_err("md/llbitmap: %s: invalid super block magic number", + mdname(mddev)); + goto out_put_page; + } + + if (sb->version !=3D cpu_to_le32(LLBITMAP_MAJOR_HI)) { + pr_err("md/llbitmap: %s: invalid super block version", + mdname(mddev)); + goto out_put_page; + } + + if (memcmp(sb->uuid, mddev->uuid, 16)) { + pr_err("md/llbitmap: %s: bitmap superblock UUID mismatch\n", + mdname(mddev)); + goto out_put_page; + } + + if (mddev->bitmap_info.space =3D=3D 0) { + int room =3D le32_to_cpu(sb->sectors_reserved); + + if (room) + mddev->bitmap_info.space =3D room; + else + mddev->bitmap_info.space =3D mddev->bitmap_info.default_space; + } + llbitmap->flags =3D le32_to_cpu(sb->state); + if (test_and_clear_bit(BITMAP_FIRST_USE, &llbitmap->flags)) { + ret =3D llbitmap_init(llbitmap); + goto out_put_page; + } + + chunksize =3D le32_to_cpu(sb->chunksize); + if (!is_power_of_2(chunksize)) { + pr_err("md/llbitmap: %s: chunksize not a power of 2", + mdname(mddev)); + goto out_put_page; + } + + if (chunksize < DIV_ROUND_UP(mddev->resync_max_sectors, + 
mddev->bitmap_info.space << SECTOR_SHIFT)) { + pr_err("md/llbitmap: %s: chunksize too small %lu < %llu / %lu", + mdname(mddev), chunksize, mddev->resync_max_sectors, + mddev->bitmap_info.space); + goto out_put_page; + } + + daemon_sleep =3D le32_to_cpu(sb->daemon_sleep); + if (daemon_sleep < 1 || daemon_sleep > MAX_SCHEDULE_TIMEOUT / HZ) { + pr_err("md/llbitmap: %s: daemon sleep %lu period out of range", + mdname(mddev), daemon_sleep); + goto out_put_page; + } + + if (le32_to_cpu(sb->write_behind)) + pr_warn("md/llbitmap: %s: slow disk is not supported", + mdname(mddev)); + + events =3D le64_to_cpu(sb->events); + if (events < mddev->events) { + pr_warn("md/llbitmap :%s: bitmap file is out of date (%lu < %llu) -- for= cing full recovery", + mdname(mddev), events, mddev->events); + set_bit(BITMAP_STALE, &llbitmap->flags); + } + + sb->sync_size =3D cpu_to_le64(mddev->resync_max_sectors); + mddev->bitmap_info.chunksize =3D chunksize; + mddev->bitmap_info.daemon_sleep =3D daemon_sleep; + + llbitmap->chunksize =3D chunksize; + llbitmap->chunks =3D DIV_ROUND_UP(mddev->resync_max_sectors, chunksize); + llbitmap->chunkshift =3D ffz(~chunksize); + ret =3D llbitmap_cache_pages(llbitmap); + +out_put_page: + __free_page(sb_page); +out: + kunmap_local(sb); + return ret; +} + +static int llbitmap_create(struct mddev *mddev) +{ + struct llbitmap *llbitmap; + int ret; + + ret =3D llbitmap_check_support(mddev); + if (ret) + return ret; + + llbitmap =3D kzalloc(sizeof(*llbitmap), GFP_KERNEL); + if (!llbitmap) + return -ENOMEM; + + llbitmap->mddev =3D mddev; + llbitmap->io_size =3D bdev_logical_block_size(mddev->gendisk->part0); + llbitmap->bits_per_page =3D PAGE_SIZE / llbitmap->io_size; + + timer_setup(&llbitmap->pending_timer, llbitmap_pending_timer_fn, 0); + INIT_WORK(&llbitmap->daemon_work, md_llbitmap_daemon_fn); + + mutex_lock(&mddev->bitmap_info.mutex); + mddev->bitmap =3D llbitmap; + ret =3D llbitmap_read_sb(llbitmap); + mutex_unlock(&mddev->bitmap_info.mutex); + if (ret) + goto err_out; + + return 0; + +err_out: + kfree(llbitmap); + return ret; +} + +static int llbitmap_resize(struct mddev *mddev, sector_t blocks, int chunk= size) +{ + struct llbitmap *llbitmap =3D mddev->bitmap; + unsigned long chunks; + + if (chunksize =3D=3D 0) + chunksize =3D llbitmap->chunksize; + + /* If there is enough space, leave the chunksize unchanged. 
*/ + chunks =3D DIV_ROUND_UP(blocks, chunksize); + while (chunks > mddev->bitmap_info.space << SECTOR_SHIFT) { + chunksize =3D chunksize << 1; + chunks =3D DIV_ROUND_UP(blocks, chunksize); + } + + llbitmap->chunkshift =3D ffz(~chunksize); + llbitmap->chunksize =3D chunksize; + llbitmap->chunks =3D chunks; + + return 0; +} + +static int llbitmap_load(struct mddev *mddev) +{ + enum llbitmap_action action =3D BitmapActionReload; + struct llbitmap *llbitmap =3D mddev->bitmap; + + if (test_and_clear_bit(BITMAP_STALE, &llbitmap->flags)) + action =3D BitmapActionStale; + + llbitmap_state_machine(llbitmap, 0, llbitmap->chunks - 1, action); + return 0; +} + +static void llbitmap_destroy(struct mddev *mddev) +{ + struct llbitmap *llbitmap =3D mddev->bitmap; + + if (!llbitmap) + return; + + mutex_lock(&mddev->bitmap_info.mutex); + + timer_delete_sync(&llbitmap->pending_timer); + flush_workqueue(md_llbitmap_io_wq); + flush_workqueue(md_llbitmap_unplug_wq); + + mddev->bitmap =3D NULL; + llbitmap_free_pages(llbitmap); + kfree(llbitmap); + mutex_unlock(&mddev->bitmap_info.mutex); +} --=20 2.39.2 From nobody Wed Dec 17 08:55:55 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B15451EDA14; Mon, 12 May 2025 01:28:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013296; cv=none; b=Glpzp0uKZhTOUauXOUU5MrKkjuGA+kvsxmg43fCANSKzOMHMU6i9W/bVRtN0Z7WjDW+UZn54YX9Y2SFQjFSSbAACyIrQQ8AKerObeW0SaRSy7u0PMpVTC1pLj9G72cnBEuAisATeHQC3gIS93Yd2cT1lXTo339ZpBTkKZubGAlI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013296; c=relaxed/simple; bh=6lQLbigT0kljTaoLCqvvkTdaRkHHixs/2bFGqCRhK40=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=LRHYZ7BVQf8/Vox3F8YR+WcOZF8gXo2FCbVZTdXx1WCpC11QwYkuo9r3mXgYvm8pMOOITmyhNtPrjTtimjdBn+hYGzXYA3Dhhrhz4ZhlDIrxLbwhPYdS8+9ApildVz6j6s3Z0HOOl/Yp9fqgj6n8xecLBOarSucMPVk+8rWcagE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=none smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Zwhmw4h53z4f3lVb; Mon, 12 May 2025 09:27:44 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id CB8671A0359; Mon, 12 May 2025 09:28:10 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgCnC2CdTiFoNFCWMA--.55093S19; Mon, 12 May 2025 09:28:10 +0800 (CST) From: Yu Kuai To: hch@lst.de, xni@redhat.com, colyli@kernel.org, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, song@kernel.org, yukuai3@huawei.com Cc: linux-kernel@vger.kernel.org, dm-devel@lists.linux.dev, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com, johnny.chenyi@huawei.com Subject: [PATCH RFC md-6.16 v3 15/19] md/md-llbitmap: implement APIs to dirty bits and clear bits Date: Mon, 12 May 2025 09:19:23 +0800 
Message-Id: <20250512011927.2809400-16-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com> References: <20250512011927.2809400-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgCnC2CdTiFoNFCWMA--.55093S19 X-Coremail-Antispam: 1UD129KBjvJXoWxtw1kKw1DCry8Jr45WFW5Awb_yoWxZF43pF 43Xw15Kr45J34Fq3y7Jr97ZF15tr4kJwnFqF93A34rGr1UArZ8KF48GFy0yw18ur93WFn8 Aw4Ykry5Cw4fWrDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUQFxUUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai Include following APIs: - llbitmap_startwrite - llbitmap_endwrite - llbitmap_start_discard - llbitmap_end_discard - llbitmap_unplug - llbitmap_flush Signed-off-by: Yu Kuai --- drivers/md/md-llbitmap.c | 206 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 206 insertions(+) diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c index 4b54aa6fbe40..71234c0ae160 100644 --- a/drivers/md/md-llbitmap.c +++ b/drivers/md/md-llbitmap.c @@ -784,6 +784,68 @@ static int llbitmap_read_sb(struct llbitmap *llbitmap) return ret; } =20 +static void llbitmap_pending_timer_fn(struct timer_list *t) +{ + struct llbitmap *llbitmap =3D from_timer(llbitmap, t, pending_timer); + + if (work_busy(&llbitmap->daemon_work)) { + pr_warn("daemon_work not finished\n"); + set_bit(BITMAP_DAEMON_BUSY, &llbitmap->flags); + return; + } + + queue_work(md_llbitmap_io_wq, &llbitmap->daemon_work); +} + +static void md_llbitmap_daemon_fn(struct work_struct *work) +{ + struct llbitmap *llbitmap =3D + container_of(work, struct llbitmap, daemon_work); + unsigned long start; + unsigned long end; + bool restart; + int idx; + + if (llbitmap->mddev->degraded) + return; + +retry: + start =3D 0; + end =3D min(llbitmap->chunks, PAGE_SIZE - BITMAP_SB_SIZE) - 1; + restart =3D false; + + for (idx =3D 0; idx < llbitmap->nr_pages; idx++) { + struct llbitmap_barrier *barrier =3D &llbitmap->barrier[idx]; + + if (idx > 0) { + start =3D end + 1; + end =3D min(end + PAGE_SIZE, llbitmap->chunks - 1); + } + + if (!test_bit(LLPageFlush, &barrier->flags) && + time_before(jiffies, barrier->expire)) { + restart =3D true; + continue; + } + + llbitmap_suspend(llbitmap, idx); + llbitmap_state_machine(llbitmap, start, end, BitmapActionDaemon); + llbitmap_resume(llbitmap, idx); + } + + /* + * If the daemon took a long time to finish, retry to prevent missing + * 
clearing dirty bits. + */ + if (test_and_clear_bit(BITMAP_DAEMON_BUSY, &llbitmap->flags)) + goto retry; + + /* If some page is dirty but not expired, setup timer again */ + if (restart) + mod_timer(&llbitmap->pending_timer, + jiffies + llbitmap->mddev->bitmap_info.daemon_sleep * HZ); +} + static int llbitmap_create(struct mddev *mddev) { struct llbitmap *llbitmap; @@ -870,3 +932,147 @@ static void llbitmap_destroy(struct mddev *mddev) kfree(llbitmap); mutex_unlock(&mddev->bitmap_info.mutex); } + +static int llbitmap_startwrite(struct mddev *mddev, sector_t offset, + unsigned long sectors) +{ + struct llbitmap *llbitmap =3D mddev->bitmap; + unsigned long start =3D offset >> llbitmap->chunkshift; + unsigned long end =3D (offset + sectors - 1) >> llbitmap->chunkshift; + int page_start =3D (start + BITMAP_SB_SIZE) >> PAGE_SHIFT; + int page_end =3D (end + BITMAP_SB_SIZE) >> PAGE_SHIFT; + + llbitmap_state_machine(llbitmap, start, end, BitmapActionStartwrite); + + + while (page_start <=3D page_end) { + llbitmap_raise_barrier(llbitmap, page_start); + page_start++; + } + + return 0; +} + +static void llbitmap_endwrite(struct mddev *mddev, sector_t offset, + unsigned long sectors) +{ + struct llbitmap *llbitmap =3D mddev->bitmap; + unsigned long start =3D offset >> llbitmap->chunkshift; + unsigned long end =3D (offset + sectors - 1) >> llbitmap->chunkshift; + int page_start =3D (start + BITMAP_SB_SIZE) >> PAGE_SHIFT; + int page_end =3D (end + BITMAP_SB_SIZE) >> PAGE_SHIFT; + + while (page_start <=3D page_end) { + llbitmap_release_barrier(llbitmap, page_start); + page_start++; + } +} + +static int llbitmap_start_discard(struct mddev *mddev, sector_t offset, + unsigned long sectors) +{ + struct llbitmap *llbitmap =3D mddev->bitmap; + unsigned long start =3D DIV_ROUND_UP(offset, llbitmap->chunksize); + unsigned long end =3D (offset + sectors - 1) >> llbitmap->chunkshift; + int page_start =3D (start + BITMAP_SB_SIZE) >> PAGE_SHIFT; + int page_end =3D (end + BITMAP_SB_SIZE) >> PAGE_SHIFT; + + llbitmap_state_machine(llbitmap, start, end, BitmapActionDiscard); + + while (page_start <=3D page_end) { + llbitmap_raise_barrier(llbitmap, page_start); + page_start++; + } + + return 0; +} + +static void llbitmap_end_discard(struct mddev *mddev, sector_t offset, + unsigned long sectors) +{ + struct llbitmap *llbitmap =3D mddev->bitmap; + unsigned long start =3D DIV_ROUND_UP(offset, llbitmap->chunksize); + unsigned long end =3D (offset + sectors - 1) >> llbitmap->chunkshift; + int page_start =3D (start + BITMAP_SB_SIZE) >> PAGE_SHIFT; + int page_end =3D (end + BITMAP_SB_SIZE) >> PAGE_SHIFT; + + while (page_start <=3D page_end) { + llbitmap_release_barrier(llbitmap, page_start); + page_start++; + } +} + +static void llbitmap_unplug_fn(struct work_struct *work) +{ + struct llbitmap_unplug_work *unplug_work =3D + container_of(work, struct llbitmap_unplug_work, work); + struct llbitmap *llbitmap =3D unplug_work->llbitmap; + struct blk_plug plug; + int i; + + blk_start_plug(&plug); + + for (i =3D 0; i < llbitmap->nr_pages; i++) { + if (!test_bit(LLPageDirty, &llbitmap->barrier[i].flags) || + !test_and_clear_bit(LLPageDirty, &llbitmap->barrier[i].flags)) + continue; + + llbitmap_write_page(llbitmap, i); + } + + blk_finish_plug(&plug); + md_super_wait(llbitmap->mddev); + complete(unplug_work->done); +} + +static bool llbitmap_dirty(struct llbitmap *llbitmap) +{ + int i; + + for (i =3D 0; i < llbitmap->nr_pages; i++) + if (test_bit(LLPageDirty, &llbitmap->barrier[i].flags)) + return true; + + return false; +} + +static 
void llbitmap_unplug(struct mddev *mddev, bool sync) +{ + DECLARE_COMPLETION_ONSTACK(done); + struct llbitmap *llbitmap =3D mddev->bitmap; + struct llbitmap_unplug_work unplug_work =3D { + .llbitmap =3D llbitmap, + .done =3D &done, + }; + + if (!llbitmap_dirty(llbitmap)) + return; + + INIT_WORK_ONSTACK(&unplug_work.work, llbitmap_unplug_fn); + queue_work(md_llbitmap_unplug_wq, &unplug_work.work); + wait_for_completion(&done); + destroy_work_on_stack(&unplug_work.work); +} + +static void llbitmap_flush(struct mddev *mddev) +{ + struct llbitmap *llbitmap =3D mddev->bitmap; + struct blk_plug plug; + int i; + + for (i =3D 0; i < llbitmap->nr_pages; i++) + set_bit(LLPageFlush, &llbitmap->barrier[i].flags); + + timer_delete_sync(&llbitmap->pending_timer); + queue_work(md_llbitmap_io_wq, &llbitmap->daemon_work); + flush_work(&llbitmap->daemon_work); + + blk_start_plug(&plug); + for (i =3D 0; i < llbitmap->nr_pages; i++) { + /* mark all bits as dirty */ + bitmap_fill(llbitmap->barrier[i].dirty, llbitmap->bits_per_page); + llbitmap_write_page(llbitmap, i); + } + blk_finish_plug(&plug); + md_super_wait(llbitmap->mddev); +} --=20 2.39.2 From nobody Wed Dec 17 08:55:55 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 45F011EE7C6; Mon, 12 May 2025 01:28:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013296; cv=none; b=BAEAzJfIsQ94GQiFI7cIiyLZSF/JVSeqKpU01MrU5TjBbsXP44QbYYU49VlR8llz8+Aq86V645hF+rBxzXwz2Gw2b4FuYyKCID4EhIulBVUn270+K3Aezs5V7wkzMh41pa5poYoflVrf1QuUTkx6WdA0lDGMs6Hf7dqAlFIXjlc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013296; c=relaxed/simple; bh=0IvTBzWZKFcrsB6nAPprxdcFl8beEnwLfIdCqrsDtG0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=cFgbEpHVrpF4pTh1Dz/rVb0Ry5kqCDZnOnu4Nkud6O9UdXM1KYk/xSkqouy8KvUepdupDAG9fo9/HUYcWGPENU73+uKUnVrd/U5r21unA8fOizOCPW3xsAFR4IDybsswnNFSbaweuZ079uNqpcmzPH1T/LQSnA/R33AOrQAvMtQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=none smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Zwhn36ch1z4f3jtT; Mon, 12 May 2025 09:27:51 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 6337C1A0B15; Mon, 12 May 2025 09:28:11 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgCnC2CdTiFoNFCWMA--.55093S20; Mon, 12 May 2025 09:28:11 +0800 (CST) From: Yu Kuai To: hch@lst.de, xni@redhat.com, colyli@kernel.org, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, song@kernel.org, yukuai3@huawei.com Cc: linux-kernel@vger.kernel.org, dm-devel@lists.linux.dev, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com, johnny.chenyi@huawei.com Subject: [PATCH RFC md-6.16 v3 16/19] md/md-llbitmap: implement APIs for sync_thread Date: Mon, 
12 May 2025 09:19:24 +0800 Message-Id: <20250512011927.2809400-17-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com> References: <20250512011927.2809400-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgCnC2CdTiFoNFCWMA--.55093S20 X-Coremail-Antispam: 1UD129KBjvJXoWxWFykGw45uw4DGw4kJw43Wrg_yoW5ZrW8pF 47Xw15Gr45X34fX3y3Jr97Aa4Fqr4ktr9FqF93A34rGF1Yyrs8KFWkGFyUXa1jgr1rGF1D X3Z8GrW5Cr1rXFJanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUQFxUUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai Include following APIs: - llbitmap_blocks_synced - llbitmap_skip_sync_blocks - llbitmap_start_sync - llbitmap_end_sync - llbitmap_close_sync Signed-off-by: Yu Kuai --- drivers/md/md-llbitmap.c | 83 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 83 insertions(+) diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c index 71234c0ae160..3169ae8b72be 100644 --- a/drivers/md/md-llbitmap.c +++ b/drivers/md/md-llbitmap.c @@ -1076,3 +1076,86 @@ static void llbitmap_flush(struct mddev *mddev) blk_finish_plug(&plug); md_super_wait(llbitmap->mddev); } + +/* This is used for raid5 lazy initial recovery */ +static bool llbitmap_blocks_synced(struct mddev *mddev, sector_t offset) +{ + struct llbitmap *llbitmap =3D mddev->bitmap; + unsigned long p =3D offset >> llbitmap->chunkshift; + enum llbitmap_state c; + int ret; + + ret =3D llbitmap_read(llbitmap, &c, p); + if (ret < 0) { + set_bit(BITMAP_WRITE_ERROR, &llbitmap->flags); + return false; + } + + return c =3D=3D BitClean || c =3D=3D BitDirty; +} + +static sector_t llbitmap_skip_sync_blocks(struct mddev *mddev, sector_t of= fset) +{ + struct llbitmap *llbitmap =3D mddev->bitmap; + unsigned long p =3D offset >> llbitmap->chunkshift; + int blocks =3D llbitmap->chunksize - (offset & (llbitmap->chunksize - 1)); + enum llbitmap_state c; + int ret; + + ret =3D llbitmap_read(llbitmap, &c, p); + if (ret < 0) { + set_bit(BITMAP_WRITE_ERROR, &llbitmap->flags); + return 0; + } + + /* always skip unwritten blocks */ + if (c =3D=3D BitUnwritten) + return blocks; + + /* For resync also skip clean/dirty blocks */ + if ((c =3D=3D BitClean || c =3D=3D BitDirty) && + test_bit(MD_RECOVERY_SYNC, &mddev->recovery) && + !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) + return blocks; + + return 0; +} + +static bool llbitmap_start_sync(struct 
mddev *mddev, sector_t offset, + sector_t *blocks, bool degraded) +{ + struct llbitmap *llbitmap =3D mddev->bitmap; + unsigned long p =3D offset >> llbitmap->chunkshift; + + /* + * Handle one bit at a time; this is much simpler. And it doesn't matter + * if md_do_sync() loops more times. + */ + *blocks =3D llbitmap->chunksize - (offset & (llbitmap->chunksize - 1)); + return llbitmap_state_machine(llbitmap, p, p, BitmapActionStartsync) =3D=3D BitSyncing; +} + +static void llbitmap_end_sync(struct mddev *mddev, sector_t offset, + sector_t *blocks) +{ + struct llbitmap *llbitmap =3D mddev->bitmap; + unsigned long p =3D offset >> llbitmap->chunkshift; + + *blocks =3D llbitmap->chunksize - (offset & (llbitmap->chunksize - 1)); + llbitmap_state_machine(llbitmap, p, llbitmap->chunks - 1, BitmapActionAbortsync); +} + +static void llbitmap_close_sync(struct mddev *mddev) +{ + struct llbitmap *llbitmap =3D mddev->bitmap; + int i; + + for (i =3D 0; i < llbitmap->nr_pages; i++) { + struct llbitmap_barrier *barrier =3D &llbitmap->barrier[i]; + + /* let daemon_fn clear dirty bits immediately */ + WRITE_ONCE(barrier->expire, jiffies); + } + + llbitmap_state_machine(llbitmap, 0, llbitmap->chunks - 1, BitmapActionEndsync); +} --=20 2.39.2 From nobody Wed Dec 17 08:55:55 2025 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 43AA51EE031; Mon, 12 May 2025 01:28:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013296; cv=none; b=Q9l7Cx7il9eCwTXyvoduCXKzwAxYxKEt5VKcs1dLOcvJoCT867ddZndNJ40PBaoBq3x5huY0L0nJYTjMPBx8z+0FYWI0CEZY+9AmLMveLLOTD8moxjnpHQMpravx2/YEw0jIDF+Q8io1gCwjS/VrFLrhm97fbL/vNgYCvSraD70= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013296; c=relaxed/simple; bh=jsIGs1zqdocllTHIIfn+pRw2YRn8Hm6j68XrsyidPSk=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=S7ITWkIDl8kvPFnyx7zcwgJlTKSMj6yMI/fm0KGeQ5VhaimokCrn6LSW7ehyGWYC6b6cr82tG++MkfzkcqFC+5LQlbDLK2a8o9XKs7mnbPKmyEtOuUzxQXzuoYrR2WRHm25gn3WAoWjjTlXn3I8Akc5tWt5HEUHz7AWcU4nf2nA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4ZwhnT26HhzKHMnN; Mon, 12 May 2025 09:28:13 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id F3ED61A108A; Mon, 12 May 2025 09:28:11 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgCnC2CdTiFoNFCWMA--.55093S21; Mon, 12 May 2025 09:28:11 +0800 (CST) From: Yu Kuai To: hch@lst.de, xni@redhat.com, colyli@kernel.org, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, song@kernel.org, yukuai3@huawei.com Cc: linux-kernel@vger.kernel.org, dm-devel@lists.linux.dev, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com,
johnny.chenyi@huawei.com Subject: [PATCH RFC md-6.16 v3 17/19] md/md-llbitmap: implement all bitmap operations Date: Mon, 12 May 2025 09:19:25 +0800 Message-Id: <20250512011927.2809400-18-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com> References: <20250512011927.2809400-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgCnC2CdTiFoNFCWMA--.55093S21 X-Coremail-Antispam: 1UD129KBjvJXoWxuF15JFykZF1kKrWkAry5urg_yoWrCF1DpF 4aqFy5Gr45JFyfWw13Jr9rZF1Fyrs7tr9Fqr97C34rGF15CrZxKF48WFyUJ34DXryfJFn8 Aw45GF4rCrWrXF7anT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUQFxUUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai Include following left APIs - llbitmap_enabled - llbitmap_dirty_bits - llbitmap_update_sb And following APIs that are not needed: - llbitmap_write_all, used in old bitmap to mark all pages need writeback; - llbitmap_daemon_work, used in old bitmap, llbitmap use timer to trigger daemon; - llbitmap_cond_end_sync, use to end sync for completed sectors(TODO, don't affect functionality) And following APIs that are not supported: - llbitmap_start_behind_write - llbitmap_end_behind_write - llbitmap_wait_behind_writes - llbitmap_sync_with_cluster - llbitmap_get_from_slot - llbitmap_copy_from_slot - llbitmap_set_pages - llbitmap_free Signed-off-by: Yu Kuai --- drivers/md/md-llbitmap.c | 125 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 125 insertions(+) diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c index 3169ae8b72be..e381859efcd7 100644 --- a/drivers/md/md-llbitmap.c +++ b/drivers/md/md-llbitmap.c @@ -1159,3 +1159,128 @@ static void llbitmap_close_sync(struct mddev *mddev) =20 llbitmap_state_machine(llbitmap, 0, llbitmap->chunks - 1, BitmapActionEnd= sync); } + +static bool llbitmap_enabled(void *data) +{ + struct llbitmap *llbitmap =3D data; + + return llbitmap && !test_bit(BITMAP_WRITE_ERROR, &llbitmap->flags); +} + +static void llbitmap_dirty_bits(struct mddev *mddev, unsigned long s, + unsigned long e) +{ + llbitmap_state_machine(mddev->bitmap, s, e, BitmapActionStartwrite); +} + +static void llbitmap_write_sb(struct llbitmap *llbitmap) +{ + int nr_bits =3D round_up(BITMAP_SB_SIZE, llbitmap->io_size) / llbitmap->i= o_size; + + bitmap_fill(llbitmap->barrier[0].dirty, nr_bits); + llbitmap_write_page(llbitmap, 0); + 
md_super_wait(llbitmap->mddev); +} + +static void llbitmap_update_sb(void *data) +{ + struct llbitmap *llbitmap =3D data; + struct mddev *mddev =3D llbitmap->mddev; + struct page *sb_page; + bitmap_super_t *sb; + + if (test_bit(BITMAP_WRITE_ERROR, &llbitmap->flags)) + return; + + sb_page =3D llbitmap_read_page(llbitmap, 0); + if (IS_ERR(sb_page)) { + pr_err("%s: %s: read super block failed", __func__, + mdname(mddev)); + set_bit(BITMAP_WRITE_ERROR, &llbitmap->flags); + return; + } + + if (mddev->events < llbitmap->events_cleared) + llbitmap->events_cleared =3D mddev->events; + + sb =3D kmap_local_page(sb_page); + sb->events =3D cpu_to_le64(mddev->events); + sb->state =3D cpu_to_le32(llbitmap->flags); + sb->chunksize =3D cpu_to_le32(llbitmap->chunksize); + sb->sync_size =3D cpu_to_le64(mddev->resync_max_sectors); + sb->events_cleared =3D cpu_to_le64(llbitmap->events_cleared); + sb->sectors_reserved =3D cpu_to_le32(mddev->bitmap_info.space); + sb->daemon_sleep =3D cpu_to_le32(mddev->bitmap_info.daemon_sleep); + + kunmap_local(sb); + llbitmap_write_sb(llbitmap); +} + +static int llbitmap_get_stats(void *data, struct md_bitmap_stats *stats) +{ + struct llbitmap *llbitmap =3D data; + + memset(stats, 0, sizeof(*stats)); + + stats->missing_pages =3D 0; + stats->pages =3D llbitmap->nr_pages; + stats->file_pages =3D llbitmap->nr_pages; + + return 0; +} + +static void llbitmap_write_all(struct mddev *mddev) +{ + +} + +static void llbitmap_daemon_work(struct mddev *mddev) +{ + +} + +static void llbitmap_start_behind_write(struct mddev *mddev) +{ + +} + +static void llbitmap_end_behind_write(struct mddev *mddev) +{ + +} + +static void llbitmap_wait_behind_writes(struct mddev *mddev) +{ + +} + +static void llbitmap_cond_end_sync(struct mddev *mddev, sector_t sector, + bool force) +{ +} + +static void llbitmap_sync_with_cluster(struct mddev *mddev, + sector_t old_lo, sector_t old_hi, + sector_t new_lo, sector_t new_hi) +{ + +} + +static void *llbitmap_get_from_slot(struct mddev *mddev, int slot) +{ + return ERR_PTR(-EOPNOTSUPP); +} + +static int llbitmap_copy_from_slot(struct mddev *mddev, int slot, sector_t= *low, + sector_t *high, bool clear_bits) +{ + return -EOPNOTSUPP; +} + +static void llbitmap_set_pages(void *data, unsigned long pages) +{ +} + +static void llbitmap_free(void *data) +{ +} --=20 2.39.2 From nobody Wed Dec 17 08:55:55 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 620201EEA28; Mon, 12 May 2025 01:28:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013298; cv=none; b=c4E9L1dfXGj50Q1e7gGo5BBtEMMtFcNH+0HG46yzY4F6NwwGQ5PVHCv8vNI8B5OVIMk7Bhduq8n3sX6CbnAE8qU5sErIXZ6BLO8/9+6UbZU11J5uFJs75cVARRBbprWnY3QLnLkL0BDyRl+UjM/jHisgfBxcKVQfA3DTAN9kKBg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013298; c=relaxed/simple; bh=4P4LGu5G1QFA/OBpLIiEhbuv6KcZkfa5jBQMhi8CdvA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=UVJN9va6h2LccIK5YlKD8rQWyaxW804qU3O6r/ADtKijhhn9GAVvuPNhFaKolXxkXFrq/WMcm/HSBn08i3kYo2OzNo+63cWluY9wHWRg2UlQOwSCqRMev56Qz+SrAlc8CfMU12lBKLG7Evl73xBxRbZNe4oC9A169Zzns0ypz3w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; 
spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4ZwhnT1nTBzYQtss; Mon, 12 May 2025 09:28:13 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 8AA141A0930; Mon, 12 May 2025 09:28:12 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgCnC2CdTiFoNFCWMA--.55093S22; Mon, 12 May 2025 09:28:12 +0800 (CST) From: Yu Kuai To: hch@lst.de, xni@redhat.com, colyli@kernel.org, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, song@kernel.org, yukuai3@huawei.com Cc: linux-kernel@vger.kernel.org, dm-devel@lists.linux.dev, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com, johnny.chenyi@huawei.com Subject: [PATCH RFC md-6.16 v3 18/19] md/md-llbitmap: implement sysfs APIs Date: Mon, 12 May 2025 09:19:26 +0800 Message-Id: <20250512011927.2809400-19-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com> References: <20250512011927.2809400-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgCnC2CdTiFoNFCWMA--.55093S22 X-Coremail-Antispam: 1UD129KBjvJXoWxWF1DWryUCw18JFy3JFW8JFb_yoW5trWrpa ySg345GrW5Jr1xWr13JrZrZFWrWws3WasFqr97Ca4rCF1UArsIgry8GFyUGw1kWryfGF1q yan0grZ8GF4UXFDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUQFxUUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai There are 3 APIs for now: - bits: readonly, show status of bitmap bits, the number of each value; - metadata: readonly show bitmap metadata, include chunksize, chunkshift, chunks, offset and daemon_sleep; - daemon_sleep: read-write, default value is 30; Signed-off-by: Yu Kuai --- drivers/md/md-llbitmap.c | 99 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 99 insertions(+) diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c index e381859efcd7..6993be132127 100644 --- a/drivers/md/md-llbitmap.c +++ b/drivers/md/md-llbitmap.c @@ -1284,3 +1284,102 @@ static void llbitmap_set_pages(void *data, unsigned= 
long pages) static void llbitmap_free(void *data) { } + +static ssize_t bits_show(struct mddev *mddev, char *page) +{ + struct llbitmap *llbitmap; + int bits[nr_llbitmap_state] =3D {0}; + loff_t start =3D 0; + + mutex_lock(&mddev->bitmap_info.mutex); + llbitmap =3D mddev->bitmap; + if (!llbitmap) { + mutex_unlock(&mddev->bitmap_info.mutex); + return sprintf(page, "no bitmap\n"); + } + + if (test_bit(BITMAP_WRITE_ERROR, &llbitmap->flags)) { + mutex_unlock(&mddev->bitmap_info.mutex); + return sprintf(page, "bitmap io error\n"); + } + + while (start < llbitmap->chunks) { + ssize_t ret; + enum llbitmap_state c; + + ret =3D llbitmap_read(llbitmap, &c, start); + if (ret < 0) { + set_bit(BITMAP_WRITE_ERROR, &llbitmap->flags); + mutex_unlock(&mddev->bitmap_info.mutex); + return sprintf(page, "bitmap io error\n"); + } + + if (c < 0 || c >=3D nr_llbitmap_state) + pr_err("%s: invalid bit %llu state %d\n", + __func__, start, c); + else + bits[c]++; + start++; + } + + mutex_unlock(&mddev->bitmap_info.mutex); + return sprintf(page, "unwritten %d\nclean %d\ndirty %d\nneed sync %d\nsyn= cing %d\n", + bits[BitUnwritten], bits[BitClean], bits[BitDirty], + bits[BitNeedSync], bits[BitSyncing]); +} + +static struct md_sysfs_entry llbitmap_bits =3D +__ATTR_RO(bits); + +static ssize_t metadata_show(struct mddev *mddev, char *page) +{ + struct llbitmap *llbitmap; + ssize_t ret; + + mutex_lock(&mddev->bitmap_info.mutex); + llbitmap =3D mddev->bitmap; + if (!llbitmap) { + mutex_unlock(&mddev->bitmap_info.mutex); + return sprintf(page, "no bitmap\n"); + } + + ret =3D sprintf(page, "chunksize %lu\nchunkshift %lu\nchunks %lu\noffset= %llu\ndaemon_sleep %lu\n", + llbitmap->chunksize, llbitmap->chunkshift, + llbitmap->chunks, mddev->bitmap_info.offset, + llbitmap->mddev->bitmap_info.daemon_sleep); + mutex_unlock(&mddev->bitmap_info.mutex); + + return ret; +} + +static struct md_sysfs_entry llbitmap_metadata =3D +__ATTR_RO(metadata); + +static ssize_t +daemon_sleep_show(struct mddev *mddev, char *page) +{ + return sprintf(page, "%lu\n", mddev->bitmap_info.daemon_sleep); +} + +static ssize_t +daemon_sleep_store(struct mddev *mddev, const char *buf, size_t len) +{ + unsigned long timeout; + int rv =3D kstrtoul(buf, 10, &timeout); + + if (rv) + return rv; + + mddev->bitmap_info.daemon_sleep =3D timeout; + return len; +} + +static struct md_sysfs_entry llbitmap_daemon_sleep =3D +__ATTR_RW(daemon_sleep); + +static struct attribute *md_llbitmap_attrs[] =3D { + &llbitmap_bits.attr, + &llbitmap_metadata.attr, + &llbitmap_daemon_sleep.attr, + NULL +}; --=20 2.39.2 From nobody Wed Dec 17 08:55:55 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 157071EE7DA; Mon, 12 May 2025 01:28:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013299; cv=none; b=ePNGaDBEKXQ1CMohN2TZUeLEAV4kmppF9jNxaM2wJkN+mjw30FarNlONlKVK4pKHewlCbwUat4ck3xnGHn2NIM0TiZ7rmhqNP/4DLBmkZ2NjnfF5h16kDNfBncQvXJlCUNnsRdMLTabeOdxno8uoDqfYp7OSzfdTBL+JCxzg5z4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747013299; c=relaxed/simple; bh=3qyIzPyheqrZy522eEuD2AFtuIz3klS4ja7thurjK0o=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; 
b=UGO40YC/e45Zs2W9a6QuQyKGRWh6oLMJbPfsdKRh9t53YAsp4ZW1kCgFFaKg/jjQ4/rc0zpg23T7TzicwsDfw44BMjhiD6BNFu0/RGNSapoW0FlA5n3VUyGrZUii+rMxSsD02W+xYng637B10J7BAJbUIEpf2mhzHjRUYdq23wA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Zwhmy6zkGz4f3lDc; Mon, 12 May 2025 09:27:46 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 2662E1A0359; Mon, 12 May 2025 09:28:13 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgCnC2CdTiFoNFCWMA--.55093S23; Mon, 12 May 2025 09:28:12 +0800 (CST) From: Yu Kuai To: hch@lst.de, xni@redhat.com, colyli@kernel.org, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, song@kernel.org, yukuai3@huawei.com Cc: linux-kernel@vger.kernel.org, dm-devel@lists.linux.dev, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com, johnny.chenyi@huawei.com Subject: [PATCH RFC md-6.16 v3 19/19] md/md-llbitmap: add Kconfig Date: Mon, 12 May 2025 09:19:27 +0800 Message-Id: <20250512011927.2809400-20-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20250512011927.2809400-1-yukuai1@huaweicloud.com> References: <20250512011927.2809400-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgCnC2CdTiFoNFCWMA--.55093S23 X-Coremail-Antispam: 1UD129KBjvJXoWxuFWxKFW5Zr1xZrWxtry7ZFb_yoW7CFWfpF WfXry3Cw15tF4xXw15A347uFyrJws3tr9Fvrn3C34ruFyUArZIqr4xKFyUtw1DWrsxJFn8 J3W5Kr95G3W5XaUanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUQFxUUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai A new config MD_LLBITMAP is added, and users can now use llbitmap to replace the old bitmap.
Signed-off-by: Yu Kuai --- drivers/md/Kconfig | 12 ++++++ drivers/md/Makefile | 1 + drivers/md/md-bitmap.h | 16 ++++++++ drivers/md/md-llbitmap.c | 80 ++++++++++++++++++++++++++++++++++++++++ drivers/md/md.c | 6 +++ 5 files changed, 115 insertions(+) diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig index f913579e731c..655c4e381f7d 100644 --- a/drivers/md/Kconfig +++ b/drivers/md/Kconfig @@ -52,6 +52,18 @@ config MD_BITMAP =20 If unsure, say Y. =20 +config MD_LLBITMAP + bool "MD RAID lockless bitmap support" + default n + depends on BLK_DEV_MD + help + If you say Y here, support for the lockless write intent bitmap will + be enabled. + + Note, this is an experimental feature. + + If unsure, say N. + config MD_AUTODETECT bool "Autodetect RAID arrays during kernel boot" depends on BLK_DEV_MD=3Dy diff --git a/drivers/md/Makefile b/drivers/md/Makefile index 811731840a5c..e70e4d3cbe29 100644 --- a/drivers/md/Makefile +++ b/drivers/md/Makefile @@ -39,6 +39,7 @@ linear-y +=3D md-linear.o obj-$(CONFIG_MD_LINEAR) +=3D linear.o obj-$(CONFIG_MD_RAID0) +=3D raid0.o obj-$(CONFIG_MD_BITMAP) +=3D md-bitmap.o +obj-$(CONFIG_MD_LLBITMAP) +=3D md-llbitmap.o obj-$(CONFIG_MD_RAID1) +=3D raid1.o obj-$(CONFIG_MD_RAID10) +=3D raid10.o obj-$(CONFIG_MD_RAID456) +=3D raid456.o diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h index 4e27f5f793b7..dd23b6fedb70 100644 --- a/drivers/md/md-bitmap.h +++ b/drivers/md/md-bitmap.h @@ -22,6 +22,9 @@ typedef __u16 bitmap_counter_t; enum bitmap_state { BITMAP_STALE =3D 1, /* the bitmap file is out of date or had -EIO */ BITMAP_WRITE_ERROR =3D 2, /* A write error has occurred */ + BITMAP_FIRST_USE =3D 3, /* llbitmap is just created */ + BITMAP_CLEAN =3D 4, /* llbitmap is created with assume_clean */ + BITMAP_DAEMON_BUSY =3D 5, /* llbitmap daemon is not finished after daemon= _sleep */ BITMAP_HOSTENDIAN =3D15, }; =20 @@ -176,4 +179,17 @@ static inline void md_bitmap_exit(void) } #endif =20 +#ifdef CONFIG_MD_LLBITMAP +int md_llbitmap_init(void); +void md_llbitmap_exit(void); +#else +static inline int md_llbitmap_init(void) +{ + return 0; +} +static inline void md_llbitmap_exit(void) +{ +} +#endif + #endif diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c index 6993be132127..5bb60340c7e2 100644 --- a/drivers/md/md-llbitmap.c +++ b/drivers/md/md-llbitmap.c @@ -1383,3 +1383,83 @@ static struct attribute *md_llbitmap_attrs[] =3D { &llbitmap_daemon_sleep.attr, NULL }; + +static struct attribute_group md_llbitmap_group =3D { + .name =3D "llbitmap", + .attrs =3D md_llbitmap_attrs, +}; + +static struct bitmap_operations llbitmap_ops =3D { + .head =3D { + .type =3D MD_BITMAP, + .id =3D ID_LLBITMAP, + .name =3D "llbitmap", + }, + + .enabled =3D llbitmap_enabled, + .create =3D llbitmap_create, + .resize =3D llbitmap_resize, + .load =3D llbitmap_load, + .destroy =3D llbitmap_destroy, + + .startwrite =3D llbitmap_startwrite, + .endwrite =3D llbitmap_endwrite, + .start_discard =3D llbitmap_start_discard, + .end_discard =3D llbitmap_end_discard, + .unplug =3D llbitmap_unplug, + .flush =3D llbitmap_flush, + + .blocks_synced =3D llbitmap_blocks_synced, + .skip_sync_blocks =3D llbitmap_skip_sync_blocks, + .start_sync =3D llbitmap_start_sync, + .end_sync =3D llbitmap_end_sync, + .close_sync =3D llbitmap_close_sync, + + .update_sb =3D llbitmap_update_sb, + .get_stats =3D llbitmap_get_stats, + .dirty_bits =3D llbitmap_dirty_bits, + + /* not needed */ + .write_all =3D llbitmap_write_all, + .daemon_work =3D llbitmap_daemon_work, + .cond_end_sync =3D llbitmap_cond_end_sync, + 
+ /* not supported */ + .start_behind_write =3D llbitmap_start_behind_write, + .end_behind_write =3D llbitmap_end_behind_write, + .wait_behind_writes =3D llbitmap_wait_behind_writes, + .sync_with_cluster =3D llbitmap_sync_with_cluster, + .get_from_slot =3D llbitmap_get_from_slot, + .copy_from_slot =3D llbitmap_copy_from_slot, + .set_pages =3D llbitmap_set_pages, + .free =3D llbitmap_free, + + .group =3D &md_llbitmap_group, +}; + +int md_llbitmap_init(void) +{ + md_llbitmap_io_wq =3D alloc_workqueue("md_llbitmap_io", + WQ_MEM_RECLAIM | WQ_UNBOUND, 0); + if (!md_llbitmap_io_wq) + return -ENOMEM; + + md_llbitmap_unplug_wq =3D alloc_workqueue("md_llbitmap_unplug", + WQ_MEM_RECLAIM | WQ_UNBOUND, 0); + if (!md_llbitmap_unplug_wq) { + destroy_workqueue(md_llbitmap_io_wq); + md_llbitmap_io_wq =3D NULL; + return -ENOMEM; + } + + return register_md_submodule(&llbitmap_ops.head); +} + +void md_llbitmap_exit(void) +{ + destroy_workqueue(md_llbitmap_io_wq); + md_llbitmap_io_wq =3D NULL; + destroy_workqueue(md_llbitmap_unplug_wq); + md_llbitmap_unplug_wq =3D NULL; + unregister_md_submodule(&llbitmap_ops.head); +} diff --git a/drivers/md/md.c b/drivers/md/md.c index a5dd7a403ea5..6ac5747738dd 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -10191,6 +10191,10 @@ static int __init md_init(void) if (ret) return ret; =20 + ret =3D md_llbitmap_init(); + if (ret) + goto err_bitmap; + ret =3D -ENOMEM; md_wq =3D alloc_workqueue("md", WQ_MEM_RECLAIM, 0); if (!md_wq) @@ -10222,6 +10226,8 @@ static int __init md_init(void) err_misc_wq: destroy_workqueue(md_wq); err_wq: + md_llbitmap_exit(); +err_bitmap: md_bitmap_exit(); return ret; } --=20 2.39.2