From nobody Sun Feb 8 16:11:42 2026
From: Hongbo Li
To: ,
CC: , , , , ,
Subject: [PATCH v2 1/4] erofs: decouple the iterator on folio
Date: Tue, 11 Feb 2025 21:53:28 +0800
Message-ID: <20250211135331.933681-2-lihongbo22@huawei.com>
In-Reply-To: <20250211135331.933681-1-lihongbo22@huawei.com>
References: <20250211135331.933681-1-lihongbo22@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

When reading data in the file-backed mount case, we need to iterate over
each mapping item to read the real data into memory. Currently, the
iterator is based on the folio structure. To make the code more
compatible with other callers, move the folio-related logic out of the
iteration so that it depends only on the iov_iter structure. This allows
other read paths (such as direct I/O) to reuse the iterator without
interacting with the folio structure.

We ran a basic performance test with fio (I/O size is 4k), and the
modifications did not affect performance.

[Before]
 - first round
   seq read: IOPS=96.6k
   rand read: IOPS=4101
 - multi-round
   seq read: IOPS=188k
   rand read: IOPS=35.2k
[After]
 - first round
   seq read: IOPS=96.3k
   rand read: IOPS=4245
 - multi-round
   seq read: IOPS=184k
   rand read: IOPS=34.3k

Signed-off-by: Hongbo Li
---
 fs/erofs/fileio.c | 72 +++++++++++++++++++++++++++++++++--------------
 1 file changed, 51 insertions(+), 21 deletions(-)

diff --git a/fs/erofs/fileio.c b/fs/erofs/fileio.c
index 0ffd1c63beeb..616dc93c0dc5 100644
--- a/fs/erofs/fileio.c
+++ b/fs/erofs/fileio.c
@@ -3,6 +3,7 @@
  * Copyright (C) 2024, Alibaba Cloud
  */
 #include "internal.h"
+#include
 #include
 
 struct erofs_fileio_rq {
@@ -12,10 +13,15 @@ struct erofs_fileio_rq {
 	struct super_block *sb;
 };
 
+typedef void (fileio_rq_split_t)(void *data);
+
 struct erofs_fileio {
 	struct erofs_map_blocks map;
 	struct erofs_map_dev dev;
 	struct erofs_fileio_rq *rq;
+	struct inode *inode;
+	fileio_rq_split_t *split;
+	void *private;
 };
 
 static void erofs_fileio_ki_complete(struct kiocb *iocb, long ret)
@@ -43,6 +49,11 @@ static void erofs_fileio_ki_complete(struct kiocb *iocb, long ret)
 	kfree(rq);
 }
 
+static void erofs_folio_split(void *data)
+{
+	erofs_onlinefolio_split((struct folio *)data);
+}
+
 static void erofs_fileio_rq_submit(struct erofs_fileio_rq *rq)
 {
 	struct iov_iter iter;
@@ -85,17 +96,15 @@ void erofs_fileio_submit_bio(struct bio *bio)
 						   bio));
 }
 
-static int erofs_fileio_scan_folio(struct erofs_fileio *io, struct folio *folio)
+static int erofs_fileio_scan(struct erofs_fileio *io,
+			     loff_t pos, struct iov_iter *iter)
 {
-	struct inode *inode = folio_inode(folio);
+	struct inode *inode = io->inode;
 	struct erofs_map_blocks *map = &io->map;
-	unsigned int cur = 0, end = folio_size(folio), len, attached = 0;
-	loff_t pos = folio_pos(folio), ofs;
-	struct iov_iter iter;
-	struct bio_vec bv;
+	unsigned int cur = 0, end = iov_iter_count(iter), len, attached = 0;
+	loff_t ofs;
 	int err = 0;
 
-	erofs_onlinefolio_init(folio);
 	while (cur < end) {
 		if (!in_range(pos + cur, map->m_la, map->m_llen)) {
 			map->m_la = pos + cur;
@@ -105,7 +114,7 @@ static int erofs_fileio_scan_folio(struct erofs_fileio *io, struct folio *folio)
 				break;
 		}
 
-		ofs = folio_pos(folio) + cur - map->m_la;
+		ofs = pos + cur - map->m_la;
 		len = min_t(loff_t, map->m_llen - ofs, end - cur);
 		if (map->m_flags & EROFS_MAP_META) {
 			struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
@@ -117,21 +126,17 @@ static int erofs_fileio_scan_folio(struct erofs_fileio *io, struct folio *folio)
 				err = PTR_ERR(src);
 				break;
 			}
-			bvec_set_folio(&bv, folio, len, cur);
-			iov_iter_bvec(&iter, ITER_DEST, &bv, 1, len);
-			if (copy_to_iter(src, len, &iter) != len) {
+			if (copy_to_iter(src, len, iter) != len) {
 				erofs_put_metabuf(&buf);
 				err = -EIO;
 				break;
 			}
 			erofs_put_metabuf(&buf);
 		} else if (!(map->m_flags & EROFS_MAP_MAPPED)) {
-			folio_zero_segment(folio, cur, cur + len);
-			attached = 0;
+			iov_iter_zero(len, iter);
 		} else {
 			if (io->rq && (map->m_pa + ofs != io->dev.m_pa ||
 				       map->m_deviceid != io->dev.m_deviceid)) {
-io_retry:
 				erofs_fileio_rq_submit(io->rq);
 				io->rq = NULL;
 			}
@@ -148,26 +153,39 @@ static int erofs_fileio_scan_folio(struct erofs_fileio *io, struct folio *folio)
 				io->rq->bio.bi_iter.bi_sector = io->dev.m_pa >> 9;
 				attached = 0;
 			}
-			if (!attached++)
-				erofs_onlinefolio_split(folio);
-			if (!bio_add_folio(&io->rq->bio, folio, len, cur))
-				goto io_retry;
+			if (bio_iov_iter_get_pages(&io->rq->bio, iter)) {
+				err = -EIO;
+				break;
+			}
+			if (io->split && !attached++)
+				io->split(io->private);
 			io->dev.m_pa += len;
 		}
 		cur += len;
 	}
-	erofs_onlinefolio_end(folio, err);
 	return err;
 }
 
 static int erofs_fileio_read_folio(struct file *file, struct folio *folio)
 {
 	struct erofs_fileio io = {};
+	struct folio_queue folioq;
+	struct iov_iter iter;
 	int err;
 
+	folioq_init(&folioq, 0);
+	folioq_append(&folioq, folio);
+	iov_iter_folio_queue(&iter, ITER_DEST, &folioq, 0, 0, folio_size(folio));
+	io.inode = folio_inode(folio);
+	io.split = erofs_folio_split;
+	io.private = folio;
+
 	trace_erofs_read_folio(folio, true);
-	err = erofs_fileio_scan_folio(&io, folio);
+	erofs_onlinefolio_init(folio);
+	err = erofs_fileio_scan(&io, folio_pos(folio), &iter);
+	erofs_onlinefolio_end(folio, err);
 	erofs_fileio_rq_submit(io.rq);
+
 	return err;
 }
 
@@ -175,13 +193,25 @@ static void erofs_fileio_readahead(struct readahead_control *rac)
 {
 	struct inode *inode = rac->mapping->host;
 	struct erofs_fileio io = {};
+	struct folio_queue folioq;
+	struct iov_iter iter;
 	struct folio *folio;
 	int err;
 
+	io.inode = inode;
+	io.split = erofs_folio_split;
 	trace_erofs_readpages(inode, readahead_index(rac),
 			      readahead_count(rac), true);
 	while ((folio = readahead_folio(rac))) {
-		err = erofs_fileio_scan_folio(&io, folio);
+		folioq_init(&folioq, 0);
+		folioq_append(&folioq, folio);
+		iov_iter_folio_queue(&iter, ITER_DEST, &folioq, 0, 0, folio_size(folio));
+
+		io.private = folio;
+		erofs_onlinefolio_init(folio);
+		err = erofs_fileio_scan(&io, folio_pos(folio), &iter);
+		erofs_onlinefolio_end(folio, err);
+
 		if (err && err != -EINTR)
 			erofs_err(inode->i_sb, "readahead error at folio %lu @ nid %llu",
 				  folio->index, EROFS_I(inode)->nid);
-- 
2.34.1

From nobody Sun Feb 8 16:11:42 2026
From: Hongbo Li
To: ,
CC: , , , , ,
Subject: [PATCH v2 2/4] erofs: decouple callback action for fileio bio
Date: Tue, 11 Feb 2025 21:53:29 +0800
Message-ID: <20250211135331.933681-3-lihongbo22@huawei.com>
In-Reply-To: <20250211135331.933681-1-lihongbo22@huawei.com>
References: <20250211135331.933681-1-lihongbo22@huawei.com>

Introduce erofs_fileio_end_folio as the .bi_end_io callback for fileio
bio.

Signed-off-by: Hongbo Li
---
 fs/erofs/fileio.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/fs/erofs/fileio.c b/fs/erofs/fileio.c
index 616dc93c0dc5..cdd432ec266c 100644
--- a/fs/erofs/fileio.c
+++ b/fs/erofs/fileio.c
@@ -11,6 +11,7 @@ struct erofs_fileio_rq {
 	struct bio bio;
 	struct kiocb iocb;
 	struct super_block *sb;
+	ssize_t ret;
 };
 
 typedef void (fileio_rq_split_t)(void *data);
@@ -22,14 +23,15 @@ struct erofs_fileio {
 	struct inode *inode;
 	fileio_rq_split_t *split;
 	void *private;
+	bio_end_io_t *end;
 };
 
 static void erofs_fileio_ki_complete(struct kiocb *iocb, long ret)
 {
 	struct erofs_fileio_rq *rq =
 			container_of(iocb, struct erofs_fileio_rq, iocb);
-	struct folio_iter fi;
 
+	rq->ret = ret;
 	if (ret > 0) {
 		if (ret != rq->bio.bi_iter.bi_size) {
 			bio_advance(&rq->bio, ret);
@@ -37,14 +39,8 @@ static void erofs_fileio_ki_complete(struct kiocb *iocb, long ret)
 		}
 		ret = 0;
 	}
-	if (rq->bio.bi_end_io) {
+	if (rq->bio.bi_end_io)
 		rq->bio.bi_end_io(&rq->bio);
-	} else {
-		bio_for_each_folio_all(fi, &rq->bio) {
-			DBG_BUGON(folio_test_uptodate(fi.folio));
-			erofs_onlinefolio_end(fi.folio, ret);
-		}
-	}
 	bio_uninit(&rq->bio);
 	kfree(rq);
 }
@@ -54,6 +50,18 @@ static void erofs_folio_split(void *data)
 	erofs_onlinefolio_split((struct folio *)data);
 }
 
+static void erofs_fileio_end_folio(struct bio *bio)
+{
+	struct erofs_fileio_rq *rq =
+			container_of(bio, struct erofs_fileio_rq, bio);
+	struct folio_iter fi;
+
+	bio_for_each_folio_all(fi, &rq->bio) {
+		DBG_BUGON(folio_test_uptodate(fi.folio));
+		erofs_onlinefolio_end(fi.folio, rq->ret >= 0 ? 0 : rq->ret);
+	}
+}
+
 static void erofs_fileio_rq_submit(struct erofs_fileio_rq *rq)
 {
 	struct iov_iter iter;
@@ -151,6 +159,7 @@ static int erofs_fileio_scan(struct erofs_fileio *io,
 					break;
 				io->rq = erofs_fileio_rq_alloc(&io->dev);
 				io->rq->bio.bi_iter.bi_sector = io->dev.m_pa >> 9;
+				io->rq->bio.bi_end_io = io->end;
 				attached = 0;
 			}
 			if (bio_iov_iter_get_pages(&io->rq->bio, iter)) {
@@ -177,6 +186,7 @@ static int erofs_fileio_read_folio(struct file *file, struct folio *folio)
 	folioq_append(&folioq, folio);
 	iov_iter_folio_queue(&iter, ITER_DEST, &folioq, 0, 0, folio_size(folio));
 	io.inode = folio_inode(folio);
+	io.end = erofs_fileio_end_folio;
 	io.split = erofs_folio_split;
 	io.private = folio;
 
@@ -199,6 +209,7 @@ static void erofs_fileio_readahead(struct readahead_control *rac)
 	int err;
 
 	io.inode = inode;
+	io.end = erofs_fileio_end_folio;
 	io.split = erofs_folio_split;
 	trace_erofs_readpages(inode, readahead_index(rac),
 			      readahead_count(rac), true);
-- 
2.34.1

From nobody Sun Feb 8 16:11:42 2026
From: Hongbo Li
To: ,
CC: , , , , ,
Subject: [PATCH v2 3/4] erofs: add erofs_fileio_direct_io helper to handle direct io
Date: Tue, 11 Feb 2025 21:53:30 +0800
Message-ID: <20250211135331.933681-4-lihongbo22@huawei.com>
In-Reply-To: <20250211135331.933681-1-lihongbo22@huawei.com>
References: <20250211135331.933681-1-lihongbo22@huawei.com>

erofs has added file-backed mount support. In this scenario, only
buffered I/O is allowed so far, so enhance the I/O mode by implementing
direct I/O. This also lets the iov_iter (user buffer) interact directly
with the backing file's page cache.

Note that the direct I/O is atomic (all-or-nothing): if any part of the
iov_iter fails, the whole direct I/O fails.

Signed-off-by: Hongbo Li
---
 fs/erofs/fileio.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/fs/erofs/fileio.c b/fs/erofs/fileio.c
index cdd432ec266c..b652e3df050c 100644
--- a/fs/erofs/fileio.c
+++ b/fs/erofs/fileio.c
@@ -12,6 +12,7 @@ struct erofs_fileio_rq {
 	struct kiocb iocb;
 	struct super_block *sb;
 	ssize_t ret;
+	void *private;
 };
 
 typedef void (fileio_rq_split_t)(void *data);
@@ -24,6 +25,11 @@ struct erofs_fileio {
 	fileio_rq_split_t *split;
 	void *private;
 	bio_end_io_t *end;
+	/* the following members control the sync call */
+	struct completion ctr;
+	refcount_t ref;
+	size_t total;
+	size_t done;
 };
 
 static void erofs_fileio_ki_complete(struct kiocb *iocb, long ret)
@@ -50,6 +56,13 @@ static void erofs_folio_split(void *data)
 	erofs_onlinefolio_split((struct folio *)data);
 }
 
+static void erofs_iter_split(void *data)
+{
+	struct erofs_fileio *io = (struct erofs_fileio *)data;
+
+	refcount_inc(&io->ref);
+}
+
 static void erofs_fileio_end_folio(struct bio *bio)
 {
 	struct erofs_fileio_rq *rq =
@@ -62,6 +75,25 @@ static void erofs_fileio_end_folio(struct bio *bio)
 	}
 }
 
+static void erofs_fileio_iter_complete(struct erofs_fileio *io)
+{
+	if (!refcount_dec_and_test(&io->ref))
+		return;
+	complete(&io->ctr);
+}
+
+static void erofs_fileio_end_iter(struct bio *bio)
+{
+	struct erofs_fileio_rq *rq =
+			container_of(bio, struct erofs_fileio_rq, bio);
+	struct erofs_fileio *io = (struct erofs_fileio *)rq->private;
+
+	if (rq->ret > 0)
+		io->done += rq->ret;
+
+	erofs_fileio_iter_complete(io);
+}
+
 static void erofs_fileio_rq_submit(struct erofs_fileio_rq *rq)
 {
 	struct iov_iter iter;
@@ -158,6 +190,7 @@ static int erofs_fileio_scan(struct erofs_fileio *io,
 				if (err)
 					break;
 				io->rq = erofs_fileio_rq_alloc(&io->dev);
+				io->rq->private = io;
 				io->rq->bio.bi_iter.bi_sector = io->dev.m_pa >> 9;
 				io->rq->bio.bi_end_io = io->end;
 				attached = 0;
@@ -230,7 +263,45 @@ static void erofs_fileio_readahead(struct readahead_control *rac)
 	erofs_fileio_rq_submit(io.rq);
 }
 
+static ssize_t erofs_fileio_direct_io(struct kiocb *iocb, struct iov_iter *iter)
+{
+	struct file *file = iocb->ki_filp;
+	struct inode *inode = file_inode(file);
+	size_t i_size = i_size_read(inode);
+	struct erofs_fileio io = {};
+	int err;
+
+	if (unlikely(iocb->ki_pos >= i_size))
+		return 0;
+
+	iter->count = min_t(size_t, iter->count,
+			    max_t(size_t, 0, i_size - iocb->ki_pos));
+	io.total = iter->count;
+	if (!io.total)
+		return 0;
+
+	io.inode = inode;
+	io.done = 0;
+	io.split = erofs_iter_split;
+	io.private = &io;
+	io.end = erofs_fileio_end_iter;
+	init_completion(&io.ctr);
+	refcount_set(&io.ref, 1);
+	err = erofs_fileio_scan(&io, iocb->ki_pos, iter);
+	erofs_fileio_rq_submit(io.rq);
+
+	erofs_fileio_iter_complete(&io);
+	wait_for_completion(&io.ctr);
+	if (io.total != io.done) {
+		iov_iter_revert(iter, io.done);
+		return err ?: -EIO;
+	}
+
+	return io.done;
+}
+
 const struct address_space_operations erofs_fileio_aops = {
 	.read_folio = erofs_fileio_read_folio,
 	.readahead = erofs_fileio_readahead,
+	.direct_IO = erofs_fileio_direct_io,
 };
-- 
2.34.1

From nobody Sun Feb 8 16:11:42 2026
From: Hongbo Li
To: ,
CC: , , , , ,
Subject: [PATCH v2 4/4] erofs: file-backed mount supports direct io
Date: Tue, 11 Feb 2025 21:53:31 +0800
Message-ID: <20250211135331.933681-5-lihongbo22@huawei.com>
In-Reply-To: <20250211135331.933681-1-lihongbo22@huawei.com>
References: <20250211135331.933681-1-lihongbo22@huawei.com>

Now that .direct_IO is hooked up, it is easy to handle direct I/O in the
fileio mount case. We ran a basic fio test comparing direct I/O with
buffered I/O. The results show that direct I/O decreases the memory
overhead. It is slower than buffered I/O for sequential reads, because
buffered I/O benefits from the erofs page cache and readahead, but for
random reads direct I/O performs similarly to buffered I/O. The results
are reasonable.

```
- buffer io
               total        used        free      shared  buff/cache   available
Mem:            54Gi       2.4Gi        52Gi        11Mi       254Mi        51Gi
Swap:          4.0Gi          0B       4.0Gi

after read
               total        used        free      shared  buff/cache   available
Mem:            54Gi       2.5Gi        50Gi        11Mi       2.3Gi        51Gi
Swap:          4.0Gi          0B       4.0Gi

cost 2GB memory (the test file is 1GB)

- direct io
               total        used        free      shared  buff/cache   available
Mem:            54Gi       2.4Gi        52Gi        11Mi       280Mi        51Gi
Swap:          4.0Gi          0B       4.0Gi

after read
               total        used        free      shared  buff/cache   available
Mem:            54Gi       2.6Gi        51Gi        11Mi       1.2Gi        51Gi
Swap:          4.0Gi          0B       4.0Gi

only cost 1GB memory (the test file is 1GB)

buffer io: 96.6k (seq read), 4245 (rand read)
direct io: 21.6k (seq read), 4187 (rand read)
```

Signed-off-by: Hongbo Li
---
 fs/erofs/data.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 0cd6b5c4df98..d58496225381 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -395,9 +395,13 @@ static ssize_t erofs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	if (IS_DAX(inode))
 		return dax_iomap_rw(iocb, to, &erofs_iomap_ops);
 #endif
-	if ((iocb->ki_flags & IOCB_DIRECT) && inode->i_sb->s_bdev)
-		return iomap_dio_rw(iocb, to, &erofs_iomap_ops,
-				    NULL, 0, NULL, 0);
+	if (iocb->ki_flags & IOCB_DIRECT) {
+		if (inode->i_sb->s_bdev)
+			return iomap_dio_rw(iocb, to, &erofs_iomap_ops,
+					    NULL, 0, NULL, 0);
+		if (erofs_is_fileio_mode(EROFS_SB(inode->i_sb)))
+			return generic_file_read_iter(iocb, to);
+	}
 	return filemap_read(iocb, to, 0);
 }
 
-- 
2.34.1
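
Reader's note on patch 3/4: the synchronous wait in erofs_fileio_direct_io() relies on a biased reference count (the submitter starts with one reference, each request takes another, and the completion fires when the count reaches zero) together with an all-or-nothing byte check. The following is only a single-threaded userspace sketch of that accounting, not kernel code. The dio_model type and all dio_model_* helpers are hypothetical names invented for illustration; a plain counter stands in for refcount_t and a boolean stands in for the struct completion.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical stand-in for the sync-related members of struct erofs_fileio. */
struct dio_model {
	unsigned int ref;	/* models refcount_t ref */
	size_t total;		/* bytes requested by the caller */
	size_t done;		/* bytes reported by completed requests */
	bool completed;		/* models complete(&io->ctr) having fired */
};

static void dio_model_init(struct dio_model *io, size_t total)
{
	io->ref = 1;		/* the submitter's own (bias) reference */
	io->total = total;
	io->done = 0;
	io->completed = false;
}

/* Models erofs_iter_split(): take one reference per request. */
static void dio_model_split(struct dio_model *io)
{
	io->ref++;
}

/* Models erofs_fileio_iter_complete(): the last put signals the waiter. */
static void dio_model_put(struct dio_model *io)
{
	assert(io->ref > 0);
	if (--io->ref == 0)
		io->completed = true;
}

/* Models erofs_fileio_end_iter(): ret is a byte count or a negative errno. */
static void dio_model_end(struct dio_model *io, ssize_t ret)
{
	if (ret > 0)
		io->done += (size_t)ret;
	dio_model_put(io);
}

/* Models the tail of erofs_fileio_direct_io(): all bytes or an error. */
static ssize_t dio_model_result(const struct dio_model *io, int err)
{
	if (io->total != io->done)
		return err ? err : -EIO;
	return (ssize_t)io->done;
}
```

In this model, the submitter calls dio_model_put() once after issuing every request, so the completed flag cannot be set while requests are still being issued, whatever order the requests finish in. A short or failed request leaves done below total, and dio_model_result() then reports an error instead of a partial byte count.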