From nobody Wed Sep 10 05:31:03 2025
From: David Howells <dhowells@redhat.com>
To: Jens Axboe, Al Viro, Christoph Hellwig
Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton, Linus Torvalds, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig, John Hubbard
Subject: [PATCH v17 01/14] splice: Clean up direct_splice_read() a bit
Date: Wed, 8 Mar 2023 16:52:38 +0000
Message-Id: <20230308165251.2078898-2-dhowells@redhat.com>
In-Reply-To: <20230308165251.2078898-1-dhowells@redhat.com>
References: <20230308165251.2078898-1-dhowells@redhat.com>

Do a couple of cleanups to direct_splice_read():

 (1) Cast to struct page **, not void *.

 (2) Simplify the calculation of the number of pages to keep/reclaim in
     direct_splice_read().
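As an aside, purely for illustration and not part of the patch: the old
reclaim-based arithmetic and the new keep-based arithmetic retain the same
number of pages.  A minimal user-space sketch, assuming PAGE_SIZE is 4096:

/* Illustrative only: both calculations keep 2 of 4 pages for a 5KiB read. */
#include <stdio.h>

#define PAGE_SIZE 4096
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

int main(void)
{
	size_t npages = 4;	/* pages allocated for the read */
	size_t ret = 5120;	/* bytes actually read */

	/* Old scheme: count the whole untouched pages and release those. */
	size_t reclaim = (npages * PAGE_SIZE - ret) / PAGE_SIZE;
	size_t kept_old = npages - reclaim;

	/* New scheme: directly count the pages that contain data. */
	size_t keep = DIV_ROUND_UP(ret, PAGE_SIZE);

	printf("old keeps %zu, new keeps %zu\n", kept_old, keep);
	return 0;
}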
Suggested-by: Christoph Hellwig
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jens Axboe
cc: Christoph Hellwig
cc: Al Viro
cc: David Hildenbrand
cc: John Hubbard
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig
---
 fs/splice.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 2e76dbb81a8f..abd21a455a2b 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -295,7 +295,7 @@ ssize_t direct_splice_read(struct file *in, loff_t *ppos,
 	struct kiocb kiocb;
 	struct page **pages;
 	ssize_t ret;
-	size_t used, npages, chunk, remain, reclaim;
+	size_t used, npages, chunk, remain, keep = 0;
 	int i;
 
 	/* Work out how much data we can actually add into the pipe */
@@ -309,7 +309,7 @@ ssize_t direct_splice_read(struct file *in, loff_t *ppos,
 	if (!bv)
 		return -ENOMEM;
 
-	pages = (void *)(bv + npages);
+	pages = (struct page **)(bv + npages);
 	npages = alloc_pages_bulk_array(GFP_USER, npages, pages);
 	if (!npages) {
 		kfree(bv);
@@ -332,11 +332,8 @@ ssize_t direct_splice_read(struct file *in, loff_t *ppos,
 	kiocb.ki_pos = *ppos;
 	ret = call_read_iter(in, &kiocb, &to);
 
-	reclaim = npages * PAGE_SIZE;
-	remain = 0;
 	if (ret > 0) {
-		reclaim -= ret;
-		remain = ret;
+		keep = DIV_ROUND_UP(ret, PAGE_SIZE);
 		*ppos = kiocb.ki_pos;
 		file_accessed(in);
 	} else if (ret < 0) {
@@ -349,14 +346,12 @@ ssize_t direct_splice_read(struct file *in, loff_t *ppos,
 	}
 
 	/* Free any pages that didn't get touched at all. */
-	reclaim /= PAGE_SIZE;
-	if (reclaim) {
-		npages -= reclaim;
-		release_pages(pages + npages, reclaim);
-	}
+	if (keep < npages)
+		release_pages(pages + keep, npages - keep);
 
 	/* Push the remaining pages into the pipe. */
-	for (i = 0; i < npages; i++) {
+	remain = ret;
+	for (i = 0; i < keep; i++) {
 		struct pipe_buffer *buf = pipe_head_buf(pipe);
 
 		chunk = min_t(size_t, remain, PAGE_SIZE);
From nobody Wed Sep 10 05:31:03 2025
From: David Howells <dhowells@redhat.com>
To: Jens Axboe, Al Viro, Christoph Hellwig
Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton, Linus Torvalds, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Miklos Szeredi, Christoph Hellwig, John Hubbard, linux-unionfs@vger.kernel.org
Subject: [PATCH v17 02/14] splice: Make do_splice_to() generic and export it
Date: Wed, 8 Mar 2023 16:52:39 +0000
Message-Id: <20230308165251.2078898-3-dhowells@redhat.com>
In-Reply-To: <20230308165251.2078898-1-dhowells@redhat.com>
References: <20230308165251.2078898-1-dhowells@redhat.com>

Rename do_splice_to() to vfs_splice_read() and export it so that it can be
used as a helper when calling down to a lower layer filesystem as it
performs all the necessary checks[1].

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Miklos Szeredi
cc: Christoph Hellwig
cc: Jens Axboe
cc: Al Viro
cc: John Hubbard
cc: David Hildenbrand
cc: Matthew Wilcox
cc: linux-unionfs@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/CAJfpeguGksS3sCigmRi9hJdUec8qtM9f+_9jC1rJhsXT+dV01w@mail.gmail.com/ [1]
Reviewed-by: Christoph Hellwig
---
 fs/splice.c            | 27 ++++++++++++++++++++-------
 include/linux/splice.h |  3 +++
 2 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index abd21a455a2b..90ccd3666dca 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -851,12 +851,24 @@ static long do_splice_from(struct pipe_inode_info *pipe, struct file *out,
 	return out->f_op->splice_write(pipe, out, ppos, len, flags);
 }
 
-/*
- * Attempt to initiate a splice from a file to a pipe.
+/**
+ * vfs_splice_read - Read data from a file and splice it into a pipe
+ * @in:		File to splice from
+ * @ppos:	Input file offset
+ * @pipe:	Pipe to splice to
+ * @len:	Number of bytes to splice
+ * @flags:	Splice modifier flags (SPLICE_F_*)
+ *
+ * Splice the requested amount of data from the input file to the pipe.  This
+ * is synchronous as the caller must hold the pipe lock across the entire
+ * operation.
+ *
+ * If successful, it returns the amount of data spliced, 0 if it hit the EOF or
+ * a hole and a negative error code otherwise.
  */
-static long do_splice_to(struct file *in, loff_t *ppos,
-			 struct pipe_inode_info *pipe, size_t len,
-			 unsigned int flags)
+long vfs_splice_read(struct file *in, loff_t *ppos,
+		     struct pipe_inode_info *pipe, size_t len,
+		     unsigned int flags)
 {
 	unsigned int p_space;
 	int ret;
@@ -879,6 +891,7 @@ static long do_splice_to(struct file *in, loff_t *ppos,
 		return warn_unsupported(in, "read");
 	return in->f_op->splice_read(in, ppos, pipe, len, flags);
 }
+EXPORT_SYMBOL_GPL(vfs_splice_read);
 
 /**
  * splice_direct_to_actor - splices data directly between two non-pipes
@@ -949,7 +962,7 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
 		size_t read_len;
 		loff_t pos = sd->pos, prev_pos = pos;
 
-		ret = do_splice_to(in, &pos, pipe, len, flags);
+		ret = vfs_splice_read(in, &pos, pipe, len, flags);
 		if (unlikely(ret <= 0))
 			goto out_release;
 
@@ -1097,7 +1110,7 @@ long splice_file_to_pipe(struct file *in,
 	pipe_lock(opipe);
 	ret = wait_for_space(opipe, flags);
 	if (!ret)
-		ret = do_splice_to(in, offset, opipe, len, flags);
+		ret = vfs_splice_read(in, offset, opipe, len, flags);
 	pipe_unlock(opipe);
 	if (ret > 0)
 		wakeup_pipe_readers(opipe);
diff --git a/include/linux/splice.h b/include/linux/splice.h
index a55179fd60fc..8f052c3dae95 100644
--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -76,6 +76,9 @@ extern ssize_t splice_to_pipe(struct pipe_inode_info *,
 			      struct splice_pipe_desc *);
 extern ssize_t add_to_pipe(struct pipe_inode_info *,
 			   struct pipe_buffer *);
+long vfs_splice_read(struct file *in, loff_t *ppos,
+		     struct pipe_inode_info *pipe, size_t len,
+		     unsigned int flags);
 extern ssize_t splice_direct_to_actor(struct file *, struct splice_desc *,
 				      splice_direct_actor *);
 extern long do_splice(struct file *in, loff_t *off_in,
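Purely as an illustration of the newly exported helper, and not part of this
patch: a stacking filesystem's ->splice_read() can now forward the request to
the file it stacks over, roughly as sketched below.  The example_* names are
hypothetical; the overlayfs and coda patches later in this series do this
with their own credential and access-intent handling around the call.

/*
 * Hypothetical sketch of a stacking filesystem forwarding splice-read to
 * its backing file via the exported vfs_splice_read() helper, which does
 * the rw_verify_area() and ->splice_read checks itself.
 */
static ssize_t example_splice_read(struct file *file, loff_t *ppos,
				   struct pipe_inode_info *pipe,
				   size_t len, unsigned int flags)
{
	/*
	 * example_backing_file() stands in for however the filesystem
	 * reaches the lower file (e.g. a field in file->private_data).
	 */
	struct file *lower = example_backing_file(file);

	return vfs_splice_read(lower, ppos, pipe, len, flags);
}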
From nobody Wed Sep 10 05:31:03 2025
From: David Howells <dhowells@redhat.com>
To: Jens Axboe, Al Viro, Christoph Hellwig
Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton, Linus Torvalds, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Daniel Golle, Guenter Roeck, Christoph Hellwig, John Hubbard, Hugh Dickins
Subject: [PATCH v17 03/14] shmem: Implement splice-read
Date: Wed, 8 Mar 2023 16:52:40 +0000
Message-Id: <20230308165251.2078898-4-dhowells@redhat.com>
In-Reply-To: <20230308165251.2078898-1-dhowells@redhat.com>
References: <20230308165251.2078898-1-dhowells@redhat.com>

The new filemap_splice_read() has an implicit expectation via
filemap_get_pages() that ->read_folio() exists if ->readahead() doesn't
fully populate the pagecache of the file it is reading from[1], potentially
leading to a jump to NULL if this doesn't exist.  shmem, however (and, by
extension, tmpfs, ramfs and rootfs), doesn't have ->read_folio().

Work around this by equipping shmem with its own splice-read
implementation, based on filemap_splice_read(), but able to paste in
zero_page when there's a page missing.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Daniel Golle
cc: Guenter Roeck
cc: Christoph Hellwig
cc: Jens Axboe
cc: Al Viro
cc: John Hubbard
cc: David Hildenbrand
cc: Matthew Wilcox
cc: Hugh Dickins
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/Y+pdHFFTk1TTEBsO@makrotopia.org/ [1]
---
 mm/shmem.c | 124 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 123 insertions(+), 1 deletion(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 448f393d8ab2..3cbec1d56112 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2719,6 +2719,128 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	return retval ?
retval : error; } =20 +static bool zero_pipe_buf_try_steal(struct pipe_inode_info *pipe, + struct pipe_buffer *buf) +{ + return false; +} + +static const struct pipe_buf_operations zero_pipe_buf_ops =3D { + .release =3D generic_pipe_buf_release, + .try_steal =3D zero_pipe_buf_try_steal, + .get =3D generic_pipe_buf_get, +}; + +static size_t splice_zeropage_into_pipe(struct pipe_inode_info *pipe, + loff_t fpos, size_t size) +{ + size_t offset =3D fpos & ~PAGE_MASK; + + size =3D min_t(size_t, size, PAGE_SIZE - offset); + + if (!pipe_full(pipe->head, pipe->tail, pipe->max_usage)) { + struct pipe_buffer *buf =3D pipe_head_buf(pipe); + + *buf =3D (struct pipe_buffer) { + .ops =3D &zero_pipe_buf_ops, + .page =3D ZERO_PAGE(0), + .offset =3D offset, + .len =3D size, + }; + get_page(buf->page); + pipe->head++; + } + + return size; +} + +static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos, + struct pipe_inode_info *pipe, + size_t len, unsigned int flags) +{ + struct inode *inode =3D file_inode(in); + struct address_space *mapping =3D inode->i_mapping; + struct folio *folio =3D NULL; + size_t total_spliced =3D 0, used, npages, n, part; + loff_t isize; + int error =3D 0; + + /* Work out how much data we can actually add into the pipe */ + used =3D pipe_occupancy(pipe->head, pipe->tail); + npages =3D max_t(ssize_t, pipe->max_usage - used, 0); + len =3D min_t(size_t, len, npages * PAGE_SIZE); + + do { + if (*ppos >=3D i_size_read(inode)) + break; + + error =3D shmem_get_folio(inode, *ppos / PAGE_SIZE, &folio, SGP_READ); + if (error) { + if (error =3D=3D -EINVAL) + error =3D 0; + break; + } + if (folio) { + folio_unlock(folio); + + if (folio_test_hwpoison(folio)) { + error =3D -EIO; + break; + } + } + + /* + * i_size must be checked after we know the pages are Uptodate. + * + * Checking i_size after the check allows us to calculate + * the correct value for "nr", which means the zero-filled + * part of the page is not copied back to userspace (unless + * another truncate extends the file - this is desired though). + */ + isize =3D i_size_read(inode); + if (unlikely(*ppos >=3D isize)) + break; + part =3D min_t(loff_t, isize - *ppos, len); + + if (folio) { + /* + * If users can be writing to this page using arbitrary + * virtual addresses, take care about potential aliasing + * before reading the page on the kernel side. + */ + if (mapping_writably_mapped(mapping)) + flush_dcache_folio(folio); + folio_mark_accessed(folio); + /* + * Ok, we have the page, and it's up-to-date, so we can + * now splice it into the pipe. + */ + n =3D splice_folio_into_pipe(pipe, folio, *ppos, part); + folio_put(folio); + folio =3D NULL; + } else { + n =3D splice_zeropage_into_pipe(pipe, *ppos, len); + } + + if (!n) + break; + len -=3D n; + total_spliced +=3D n; + *ppos +=3D n; + in->f_ra.prev_pos =3D *ppos; + if (pipe_full(pipe->head, pipe->tail, pipe->max_usage)) + break; + + cond_resched(); + } while (len); + + if (folio) + folio_put(folio); + + file_accessed(in); + return total_spliced ? 
total_spliced : error;
+}
+
 static loff_t shmem_file_llseek(struct file *file, loff_t offset, int whence)
 {
 	struct address_space *mapping = file->f_mapping;
@@ -3938,7 +4060,7 @@ static const struct file_operations shmem_file_operations = {
 	.read_iter	= shmem_file_read_iter,
 	.write_iter	= generic_file_write_iter,
 	.fsync		= noop_fsync,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= shmem_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= shmem_fallocate,
 #endif

From nobody Wed Sep 10 05:31:03 2025
From: David Howells <dhowells@redhat.com>
To: Jens Axboe, Al Viro, Christoph Hellwig
Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton, Linus Torvalds, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig, John Hubbard, Miklos Szeredi, linux-unionfs@vger.kernel.org
Subject: [PATCH v17 04/14] overlayfs: Implement splice-read
Date: Wed, 8 Mar 2023 16:52:41 +0000
Message-Id: <20230308165251.2078898-5-dhowells@redhat.com>
In-Reply-To: <20230308165251.2078898-1-dhowells@redhat.com>
References: <20230308165251.2078898-1-dhowells@redhat.com>
Content-Type: text/plain; charset="utf-8" Implement splice-read for overlayfs by passing the request down a layer rather than going through generic_file_splice_read() which is going to be changed to assume that ->read_folio() is present on buffered files. Signed-off-by: David Howells cc: Christoph Hellwig cc: Jens Axboe cc: Al Viro cc: John Hubbard cc: David Hildenbrand cc: Matthew Wilcox cc: Miklos Szeredi cc: linux-unionfs@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org --- Notes: ver #17) - Use vfs_splice_read() helper rather than open-coding checks. =20 ver #15) - Remove redundant FMODE_CAN_ODIRECT check on real file. - Do rw_verify_area() on the real file, not the overlay file. - Fix a file leak. fs/overlayfs/file.c | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c index 7c04f033aadd..86197882ff35 100644 --- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -419,6 +419,27 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, stru= ct iov_iter *iter) return ret; } =20 +static ssize_t ovl_splice_read(struct file *in, loff_t *ppos, + struct pipe_inode_info *pipe, size_t len, + unsigned int flags) +{ + const struct cred *old_cred; + struct fd real; + ssize_t ret; + + ret =3D ovl_real_fdget(in, &real); + if (ret) + return ret; + + old_cred =3D ovl_override_creds(file_inode(in)->i_sb); + ret =3D vfs_splice_read(real.file, ppos, pipe, len, flags); + revert_creds(old_cred); + ovl_file_accessed(in); + + fdput(real); + return ret; +} + /* * Calling iter_file_splice_write() directly from overlay's f_op may deadl= ock * due to lock order inversion between pipe->mutex in iter_file_splice_wri= te() @@ -695,7 +716,7 @@ const struct file_operations ovl_file_operations =3D { .fallocate =3D ovl_fallocate, .fadvise =3D ovl_fadvise, .flush =3D ovl_flush, - .splice_read =3D generic_file_splice_read, + .splice_read =3D ovl_splice_read, .splice_write =3D ovl_splice_write, =20 .copy_file_range =3D ovl_copy_file_range, From nobody Wed Sep 10 05:31:03 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7DE45C6FD1E for ; Wed, 8 Mar 2023 16:54:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230102AbjCHQye (ORCPT ); Wed, 8 Mar 2023 11:54:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49380 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229935AbjCHQyI (ORCPT ); Wed, 8 Mar 2023 11:54:08 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0CE1B61A9D for ; Wed, 8 Mar 2023 08:53:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678294400; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=PhbxTzWahXdd/JlmGrRowyeASqSlq5EbDSo0Q9TZAGc=; b=hM0nRbrF1MoXBP7RU9/atMQ+2yAGE/JncDHrVqEMsvniy4HISLKJqLedPdFkehZJDswxF6 z87kAhTPw4nEe5oY7VUupjLNUG+lOZ8rXiOwfsnSFAQLQcfjDq0IqMuy4tRJnXlMxIJirg x7n03L5S2ldAK0xcz54MSEVzuwymcYI= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com 
From: David Howells <dhowells@redhat.com>
To: Jens Axboe, Al Viro, Christoph Hellwig
Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton, Linus Torvalds, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Jan Harkes, Christoph Hellwig, John Hubbard, coda@cs.cmu.edu, codalist@coda.cs.cmu.edu, linux-unionfs@vger.kernel.org
Subject: [PATCH v17 05/14] coda: Implement splice-read
Date: Wed, 8 Mar 2023 16:52:42 +0000
Message-Id: <20230308165251.2078898-6-dhowells@redhat.com>
In-Reply-To: <20230308165251.2078898-1-dhowells@redhat.com>
References: <20230308165251.2078898-1-dhowells@redhat.com>

Implement splice-read for coda by passing the request down a layer rather
than going through generic_file_splice_read() which is going to be changed
to assume that ->read_folio() is present on buffered files.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jan Harkes
cc: Christoph Hellwig
cc: Jens Axboe
cc: Al Viro
cc: John Hubbard
cc: David Hildenbrand
cc: Matthew Wilcox
cc: coda@cs.cmu.edu
cc: codalist@coda.cs.cmu.edu
cc: linux-unionfs@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Acked-by: Jan Harkes
---
Notes:
    ver #17)
     - Use vfs_splice_read() helper rather than open-coding checks.
fs/coda/file.c | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/fs/coda/file.c b/fs/coda/file.c index 3f3c81e6b1ab..12b26bd13564 100644 --- a/fs/coda/file.c +++ b/fs/coda/file.c @@ -23,6 +23,7 @@ #include #include #include +#include =20 #include #include "coda_psdev.h" @@ -94,6 +95,32 @@ coda_file_write_iter(struct kiocb *iocb, struct iov_iter= *to) return ret; } =20 +static ssize_t +coda_file_splice_read(struct file *coda_file, loff_t *ppos, + struct pipe_inode_info *pipe, + size_t len, unsigned int flags) +{ + struct inode *coda_inode =3D file_inode(coda_file); + struct coda_file_info *cfi =3D coda_ftoc(coda_file); + struct file *in =3D cfi->cfi_container; + loff_t ki_pos =3D *ppos; + ssize_t ret; + + ret =3D venus_access_intent(coda_inode->i_sb, coda_i2f(coda_inode), + &cfi->cfi_access_intent, + len, ki_pos, CODA_ACCESS_TYPE_READ); + if (ret) + goto finish_read; + + ret =3D vfs_splice_read(in, ppos, pipe, len, flags); + +finish_read: + venus_access_intent(coda_inode->i_sb, coda_i2f(coda_inode), + &cfi->cfi_access_intent, + len, ki_pos, CODA_ACCESS_TYPE_READ_FINISH); + return ret; +} + static void coda_vm_open(struct vm_area_struct *vma) { @@ -302,5 +329,5 @@ const struct file_operations coda_file_operations =3D { .open =3D coda_open, .release =3D coda_release, .fsync =3D coda_fsync, - .splice_read =3D generic_file_splice_read, + .splice_read =3D coda_file_splice_read, }; From nobody Wed Sep 10 05:31:03 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6FDC1C64EC4 for ; Wed, 8 Mar 2023 16:55:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230044AbjCHQy6 (ORCPT ); Wed, 8 Mar 2023 11:54:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50794 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229895AbjCHQye (ORCPT ); Wed, 8 Mar 2023 11:54:34 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 206A661AA9 for ; Wed, 8 Mar 2023 08:53:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678294401; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WFhWed43Sc7081L0WA9mMaa7Q22YwjW6QNRxMcW8x9U=; b=D/TBRPvTxq+MoHqP12DpmUbIO1sb/Q1Wi3pn5URSmud8PXZYI3X5rtSWV7VtFfiJ9GZpch 3WquNCsLgYLfcniaOnPxJC9tVzGTsvD8HnCU6I5vCmYNgmMMWh+mVzVJLBT1hWc0ch1ypG NDl5jzxjbnL3Xv8pnDT0BKyZJlneLh8= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-286-n7_IuBJlMQ-LcFthELANRQ-1; Wed, 08 Mar 2023 11:53:17 -0500 X-MC-Unique: n7_IuBJlMQ-LcFthELANRQ-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 3D8BE8027FD; Wed, 8 Mar 2023 16:53:16 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with 
ESMTP id C81E414171B6; Wed, 8 Mar 2023 16:53:13 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Greg Kroah-Hartman , Christoph Hellwig , John Hubbard , Miklos Szeredi , Arnd Bergmann Subject: [PATCH v17 06/14] tty, proc, kernfs, random: Use direct_splice_read() Date: Wed, 8 Mar 2023 16:52:43 +0000 Message-Id: <20230308165251.2078898-7-dhowells@redhat.com> In-Reply-To: <20230308165251.2078898-1-dhowells@redhat.com> References: <20230308165251.2078898-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Use direct_splice_read() for tty, procfs, kernfs and random files rather than going through generic_file_splice_read() as they just copy the file into the output buffer and don't splice pages. This avoids the need for them to have a ->read_folio() to satisfy filemap_splice_read(). Signed-off-by: David Howells Acked-by: Greg Kroah-Hartman cc: Christoph Hellwig cc: Jens Axboe cc: Al Viro cc: John Hubbard cc: David Hildenbrand cc: Matthew Wilcox cc: Miklos Szeredi cc: Arnd Bergmann cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org --- drivers/char/random.c | 4 ++-- drivers/tty/tty_io.c | 4 ++-- fs/kernfs/file.c | 2 +- fs/proc/inode.c | 4 ++-- fs/proc/proc_sysctl.c | 2 +- fs/proc_namespace.c | 6 +++--- 6 files changed, 11 insertions(+), 11 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index ce3ccd172cc8..792713616ba8 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -1546,7 +1546,7 @@ const struct file_operations random_fops =3D { .compat_ioctl =3D compat_ptr_ioctl, .fasync =3D random_fasync, .llseek =3D noop_llseek, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .splice_write =3D iter_file_splice_write, }; =20 @@ -1557,7 +1557,7 @@ const struct file_operations urandom_fops =3D { .compat_ioctl =3D compat_ptr_ioctl, .fasync =3D random_fasync, .llseek =3D noop_llseek, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .splice_write =3D iter_file_splice_write, }; =20 diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c index 36fb945fdad4..9d117e579dfb 100644 --- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -466,7 +466,7 @@ static const struct file_operations tty_fops =3D { .llseek =3D no_llseek, .read_iter =3D tty_read, .write_iter =3D tty_write, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .splice_write =3D iter_file_splice_write, .poll =3D tty_poll, .unlocked_ioctl =3D tty_ioctl, @@ -481,7 +481,7 @@ static const struct file_operations console_fops =3D { .llseek =3D no_llseek, .read_iter =3D tty_read, .write_iter =3D redirected_tty_write, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .splice_write =3D iter_file_splice_write, .poll =3D tty_poll, .unlocked_ioctl =3D tty_ioctl, diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c index e4a50e4ff0d2..9d23b8141db7 100644 --- a/fs/kernfs/file.c +++ b/fs/kernfs/file.c @@ -1011,7 +1011,7 @@ const struct file_operations kernfs_file_fops =3D { .release =3D 
kernfs_fop_release, .poll =3D kernfs_fop_poll, .fsync =3D noop_fsync, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .splice_write =3D iter_file_splice_write, }; =20 diff --git a/fs/proc/inode.c b/fs/proc/inode.c index f495fdb39151..711f12706469 100644 --- a/fs/proc/inode.c +++ b/fs/proc/inode.c @@ -591,7 +591,7 @@ static const struct file_operations proc_iter_file_ops = =3D { .llseek =3D proc_reg_llseek, .read_iter =3D proc_reg_read_iter, .write =3D proc_reg_write, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .poll =3D proc_reg_poll, .unlocked_ioctl =3D proc_reg_unlocked_ioctl, .mmap =3D proc_reg_mmap, @@ -617,7 +617,7 @@ static const struct file_operations proc_reg_file_ops_c= ompat =3D { static const struct file_operations proc_iter_file_ops_compat =3D { .llseek =3D proc_reg_llseek, .read_iter =3D proc_reg_read_iter, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .write =3D proc_reg_write, .poll =3D proc_reg_poll, .unlocked_ioctl =3D proc_reg_unlocked_ioctl, diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c index 5851eb5bc726..e49f99657d1c 100644 --- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -869,7 +869,7 @@ static const struct file_operations proc_sys_file_opera= tions =3D { .poll =3D proc_sys_poll, .read_iter =3D proc_sys_read, .write_iter =3D proc_sys_write, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .splice_write =3D iter_file_splice_write, .llseek =3D default_llseek, }; diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c index 846f9455ae22..492abbbeff5e 100644 --- a/fs/proc_namespace.c +++ b/fs/proc_namespace.c @@ -324,7 +324,7 @@ static int mountstats_open(struct inode *inode, struct = file *file) const struct file_operations proc_mounts_operations =3D { .open =3D mounts_open, .read_iter =3D seq_read_iter, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .llseek =3D seq_lseek, .release =3D mounts_release, .poll =3D mounts_poll, @@ -333,7 +333,7 @@ const struct file_operations proc_mounts_operations =3D= { const struct file_operations proc_mountinfo_operations =3D { .open =3D mountinfo_open, .read_iter =3D seq_read_iter, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .llseek =3D seq_lseek, .release =3D mounts_release, .poll =3D mounts_poll, @@ -342,7 +342,7 @@ const struct file_operations proc_mountinfo_operations = =3D { const struct file_operations proc_mountstats_operations =3D { .open =3D mountstats_open, .read_iter =3D seq_read_iter, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .llseek =3D seq_lseek, .release =3D mounts_release, }; From nobody Wed Sep 10 05:31:03 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB5E2C64EC4 for ; Wed, 8 Mar 2023 16:55:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230186AbjCHQzT (ORCPT ); Wed, 8 Mar 2023 11:55:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50880 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230126AbjCHQyq (ORCPT ); Wed, 8 Mar 2023 11:54:46 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by 
From: David Howells <dhowells@redhat.com>
To: Jens Axboe, Al Viro, Christoph Hellwig
Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton, Linus Torvalds, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Steve French, Christoph Hellwig, John Hubbard, linux-cifs@vger.kernel.org
Subject: [PATCH v17 07/14] splice: Do splice read from a file without using ITER_PIPE
Date: Wed, 8 Mar 2023 16:52:44 +0000
Message-Id: <20230308165251.2078898-8-dhowells@redhat.com>
In-Reply-To: <20230308165251.2078898-1-dhowells@redhat.com>
References: <20230308165251.2078898-1-dhowells@redhat.com>

Make generic_file_splice_read() use filemap_splice_read() and
direct_splice_read() rather than using an ITER_PIPE and call_read_iter().

Make cifs use generic_file_splice_read() rather than doing it for itself.

Unexport filemap_splice_read().

With this, ITER_PIPE is no longer used.
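For context, and not part of the patch: a minimal user-space splice(2)
sequence that exercises this read path is sketched below.  After this change
a buffered file is serviced by filemap_splice_read() and an O_DIRECT file by
direct_splice_read(); the file path used here is arbitrary.

/* Copy up to 4KiB from a file into a pipe and then onto stdout. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	int pfd[2];
	int fd = open("/etc/hostname", O_RDONLY);	/* any readable file */
	ssize_t n;

	if (fd < 0 || pipe(pfd) < 0)
		return 1;

	/* File -> pipe: the ->splice_read() side reworked by this series. */
	n = splice(fd, NULL, pfd[1], NULL, 4096, 0);
	if (n > 0)
		/* Pipe -> stdout: the ->splice_write() side. */
		splice(pfd[0], NULL, STDOUT_FILENO, NULL, n, 0);

	close(fd);
	close(pfd[0]);
	close(pfd[1]);
	return 0;
}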
Signed-off-by: David Howells cc: Jens Axboe cc: Steve French cc: Christoph Hellwig cc: Al Viro cc: David Hildenbrand cc: John Hubbard cc: linux-mm@kvack.org cc: linux-block@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig --- fs/cifs/cifsfs.c | 8 ++++---- fs/cifs/cifsfs.h | 3 --- fs/cifs/file.c | 16 ---------------- fs/splice.c | 30 +++++++----------------------- mm/filemap.c | 1 - 5 files changed, 11 insertions(+), 47 deletions(-) diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c index cbcf210d56e4..ba963a26cb19 100644 --- a/fs/cifs/cifsfs.c +++ b/fs/cifs/cifsfs.c @@ -1359,7 +1359,7 @@ const struct file_operations cifs_file_ops =3D { .fsync =3D cifs_fsync, .flush =3D cifs_flush, .mmap =3D cifs_file_mmap, - .splice_read =3D cifs_splice_read, + .splice_read =3D generic_file_splice_read, .splice_write =3D iter_file_splice_write, .llseek =3D cifs_llseek, .unlocked_ioctl =3D cifs_ioctl, @@ -1379,7 +1379,7 @@ const struct file_operations cifs_file_strict_ops =3D= { .fsync =3D cifs_strict_fsync, .flush =3D cifs_flush, .mmap =3D cifs_file_strict_mmap, - .splice_read =3D cifs_splice_read, + .splice_read =3D generic_file_splice_read, .splice_write =3D iter_file_splice_write, .llseek =3D cifs_llseek, .unlocked_ioctl =3D cifs_ioctl, @@ -1417,7 +1417,7 @@ const struct file_operations cifs_file_nobrl_ops =3D { .fsync =3D cifs_fsync, .flush =3D cifs_flush, .mmap =3D cifs_file_mmap, - .splice_read =3D cifs_splice_read, + .splice_read =3D generic_file_splice_read, .splice_write =3D iter_file_splice_write, .llseek =3D cifs_llseek, .unlocked_ioctl =3D cifs_ioctl, @@ -1435,7 +1435,7 @@ const struct file_operations cifs_file_strict_nobrl_o= ps =3D { .fsync =3D cifs_strict_fsync, .flush =3D cifs_flush, .mmap =3D cifs_file_strict_mmap, - .splice_read =3D cifs_splice_read, + .splice_read =3D generic_file_splice_read, .splice_write =3D iter_file_splice_write, .llseek =3D cifs_llseek, .unlocked_ioctl =3D cifs_ioctl, diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h index 71fe0a0a7992..8b239854e590 100644 --- a/fs/cifs/cifsfs.h +++ b/fs/cifs/cifsfs.h @@ -100,9 +100,6 @@ extern ssize_t cifs_strict_readv(struct kiocb *iocb, st= ruct iov_iter *to); extern ssize_t cifs_user_writev(struct kiocb *iocb, struct iov_iter *from); extern ssize_t cifs_direct_writev(struct kiocb *iocb, struct iov_iter *fro= m); extern ssize_t cifs_strict_writev(struct kiocb *iocb, struct iov_iter *fro= m); -extern ssize_t cifs_splice_read(struct file *in, loff_t *ppos, - struct pipe_inode_info *pipe, size_t len, - unsigned int flags); extern int cifs_flock(struct file *pfile, int cmd, struct file_lock *plock= ); extern int cifs_lock(struct file *, int, struct file_lock *); extern int cifs_fsync(struct file *, loff_t, loff_t, int); diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 4d4a2d82636d..321f9b7c84c9 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -5066,19 +5066,3 @@ const struct address_space_operations cifs_addr_ops_= smallbuf =3D { .launder_folio =3D cifs_launder_folio, .migrate_folio =3D filemap_migrate_folio, }; - -/* - * Splice data from a file into a pipe. 
- */ -ssize_t cifs_splice_read(struct file *in, loff_t *ppos, - struct pipe_inode_info *pipe, size_t len, - unsigned int flags) -{ - if (unlikely(*ppos >=3D file_inode(in)->i_sb->s_maxbytes)) - return 0; - if (unlikely(!len)) - return 0; - if (in->f_flags & O_DIRECT) - return direct_splice_read(in, ppos, pipe, len, flags); - return filemap_splice_read(in, ppos, pipe, len, flags); -} diff --git a/fs/splice.c b/fs/splice.c index 90ccd3666dca..f46dd1fb367b 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -387,29 +387,13 @@ ssize_t generic_file_splice_read(struct file *in, lof= f_t *ppos, struct pipe_inode_info *pipe, size_t len, unsigned int flags) { - struct iov_iter to; - struct kiocb kiocb; - int ret; - - iov_iter_pipe(&to, ITER_DEST, pipe, len); - init_sync_kiocb(&kiocb, in); - kiocb.ki_pos =3D *ppos; - ret =3D call_read_iter(in, &kiocb, &to); - if (ret > 0) { - *ppos =3D kiocb.ki_pos; - file_accessed(in); - } else if (ret < 0) { - /* free what was emitted */ - pipe_discard_from(pipe, to.start_head); - /* - * callers of ->splice_read() expect -EAGAIN on - * "can't put anything in there", rather than -EFAULT. - */ - if (ret =3D=3D -EFAULT) - ret =3D -EAGAIN; - } - - return ret; + if (unlikely(*ppos >=3D file_inode(in)->i_sb->s_maxbytes)) + return 0; + if (unlikely(!len)) + return 0; + if (in->f_flags & O_DIRECT) + return direct_splice_read(in, ppos, pipe, len, flags); + return filemap_splice_read(in, ppos, pipe, len, flags); } EXPORT_SYMBOL(generic_file_splice_read); =20 diff --git a/mm/filemap.c b/mm/filemap.c index 2723104cc06a..3a93515ae2ed 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2967,7 +2967,6 @@ ssize_t filemap_splice_read(struct file *in, loff_t *= ppos, =20 return total_spliced ? total_spliced : error; } -EXPORT_SYMBOL(filemap_splice_read); =20 static inline loff_t folio_seek_hole_data(struct xa_state *xas, struct address_space *mapping, struct folio *folio, From nobody Wed Sep 10 05:31:03 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F375C64EC4 for ; Wed, 8 Mar 2023 16:55:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230137AbjCHQzH (ORCPT ); Wed, 8 Mar 2023 11:55:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50930 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230096AbjCHQyp (ORCPT ); Wed, 8 Mar 2023 11:54:45 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CDC089311A for ; Wed, 8 Mar 2023 08:53:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678294408; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XsRpjIP88f7UuB0TiBadjal83kAgF2un+t2WHwzlM/w=; b=eLRPOIJ3I+w0pLbE6IETg8W1speVPurC+sD6dF8L9F7ueqYtQxXIFqn3zNQ6sLRwMMSgOB FpBEmOw82QKOhpHwQwZDUzW4Qs3SXzWL4h4tbV7xtnhbK+bLLmkaJKKyxnz7GfJHFIhoJd l1uEzVdlubL0qVd/3XAu5du73ulITcs= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-558-iBdOJoOTPZWMtmN2YA3RcQ-1; Wed, 
08 Mar 2023 11:53:22 -0500 X-MC-Unique: iBdOJoOTPZWMtmN2YA3RcQ-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id E3AD8857A8E; Wed, 8 Mar 2023 16:53:21 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id C8EFD492B05; Wed, 8 Mar 2023 16:53:19 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , John Hubbard Subject: [PATCH v17 08/14] iov_iter: Kill ITER_PIPE Date: Wed, 8 Mar 2023 16:52:45 +0000 Message-Id: <20230308165251.2078898-9-dhowells@redhat.com> In-Reply-To: <20230308165251.2078898-1-dhowells@redhat.com> References: <20230308165251.2078898-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The ITER_PIPE-type iterator was only used for generic_file_splice_read(), but that has now been switched to either pull pages directly from the pagecache for buffered file splice-reads or to use ITER_BVEC instead for O_DIRECT file splice-reads. This leaves ITER_PIPE unused - so remove it. Signed-off-by: David Howells Reviewed-by: Christoph Hellwig cc: Jens Axboe cc: Al Viro cc: David Hildenbrand cc: John Hubbard cc: linux-mm@kvack.org cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org --- include/linux/uio.h | 14 -- lib/iov_iter.c | 429 +------------------------------------------- mm/filemap.c | 3 +- 3 files changed, 4 insertions(+), 442 deletions(-) diff --git a/include/linux/uio.h b/include/linux/uio.h index 27e3fd942960..74598426edb4 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -11,7 +11,6 @@ #include =20 struct page; -struct pipe_inode_info; =20 typedef unsigned int __bitwise iov_iter_extraction_t; =20 @@ -25,7 +24,6 @@ enum iter_type { ITER_IOVEC, ITER_KVEC, ITER_BVEC, - ITER_PIPE, ITER_XARRAY, ITER_DISCARD, ITER_UBUF, @@ -55,15 +53,10 @@ struct iov_iter { const struct kvec *kvec; const struct bio_vec *bvec; struct xarray *xarray; - struct pipe_inode_info *pipe; void __user *ubuf; }; union { unsigned long nr_segs; - struct { - unsigned int head; - unsigned int start_head; - }; loff_t xarray_start; }; }; @@ -101,11 +94,6 @@ static inline bool iov_iter_is_bvec(const struct iov_it= er *i) return iov_iter_type(i) =3D=3D ITER_BVEC; } =20 -static inline bool iov_iter_is_pipe(const struct iov_iter *i) -{ - return iov_iter_type(i) =3D=3D ITER_PIPE; -} - static inline bool iov_iter_is_discard(const struct iov_iter *i) { return iov_iter_type(i) =3D=3D ITER_DISCARD; @@ -247,8 +235,6 @@ void iov_iter_kvec(struct iov_iter *i, unsigned int dir= ection, const struct kvec unsigned long nr_segs, size_t count); void iov_iter_bvec(struct iov_iter *i, unsigned int direction, const struc= t bio_vec *bvec, unsigned long nr_segs, size_t count); -void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe= _inode_info *pipe, - size_t count); void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t c= ount); void 
iov_iter_xarray(struct iov_iter *i, unsigned int direction, struct xa= rray *xarray, loff_t start, size_t count); diff --git a/lib/iov_iter.c b/lib/iov_iter.c index 274014e4eafe..fad95e4cf372 100644 --- a/lib/iov_iter.c +++ b/lib/iov_iter.c @@ -14,8 +14,6 @@ #include #include =20 -#define PIPE_PARANOIA /* for now */ - /* covers ubuf and kbuf alike */ #define iterate_buf(i, n, base, len, off, __p, STEP) { \ size_t __maybe_unused off =3D 0; \ @@ -186,150 +184,6 @@ static int copyin(void *to, const void __user *from, = size_t n) return res; } =20 -#ifdef PIPE_PARANOIA -static bool sanity(const struct iov_iter *i) -{ - struct pipe_inode_info *pipe =3D i->pipe; - unsigned int p_head =3D pipe->head; - unsigned int p_tail =3D pipe->tail; - unsigned int p_occupancy =3D pipe_occupancy(p_head, p_tail); - unsigned int i_head =3D i->head; - unsigned int idx; - - if (i->last_offset) { - struct pipe_buffer *p; - if (unlikely(p_occupancy =3D=3D 0)) - goto Bad; // pipe must be non-empty - if (unlikely(i_head !=3D p_head - 1)) - goto Bad; // must be at the last buffer... - - p =3D pipe_buf(pipe, i_head); - if (unlikely(p->offset + p->len !=3D abs(i->last_offset))) - goto Bad; // ... at the end of segment - } else { - if (i_head !=3D p_head) - goto Bad; // must be right after the last buffer - } - return true; -Bad: - printk(KERN_ERR "idx =3D %d, offset =3D %d\n", i_head, i->last_offset); - printk(KERN_ERR "head =3D %d, tail =3D %d, buffers =3D %d\n", - p_head, p_tail, pipe->ring_size); - for (idx =3D 0; idx < pipe->ring_size; idx++) - printk(KERN_ERR "[%p %p %d %d]\n", - pipe->bufs[idx].ops, - pipe->bufs[idx].page, - pipe->bufs[idx].offset, - pipe->bufs[idx].len); - WARN_ON(1); - return false; -} -#else -#define sanity(i) true -#endif - -static struct page *push_anon(struct pipe_inode_info *pipe, unsigned size) -{ - struct page *page =3D alloc_page(GFP_USER); - if (page) { - struct pipe_buffer *buf =3D pipe_buf(pipe, pipe->head++); - *buf =3D (struct pipe_buffer) { - .ops =3D &default_pipe_buf_ops, - .page =3D page, - .offset =3D 0, - .len =3D size - }; - } - return page; -} - -static void push_page(struct pipe_inode_info *pipe, struct page *page, - unsigned int offset, unsigned int size) -{ - struct pipe_buffer *buf =3D pipe_buf(pipe, pipe->head++); - *buf =3D (struct pipe_buffer) { - .ops =3D &page_cache_pipe_buf_ops, - .page =3D page, - .offset =3D offset, - .len =3D size - }; - get_page(page); -} - -static inline int last_offset(const struct pipe_buffer *buf) -{ - if (buf->ops =3D=3D &default_pipe_buf_ops) - return buf->len; // buf->offset is 0 for those - else - return -(buf->offset + buf->len); -} - -static struct page *append_pipe(struct iov_iter *i, size_t size, - unsigned int *off) -{ - struct pipe_inode_info *pipe =3D i->pipe; - int offset =3D i->last_offset; - struct pipe_buffer *buf; - struct page *page; - - if (offset > 0 && offset < PAGE_SIZE) { - // some space in the last buffer; add to it - buf =3D pipe_buf(pipe, pipe->head - 1); - size =3D min_t(size_t, size, PAGE_SIZE - offset); - buf->len +=3D size; - i->last_offset +=3D size; - i->count -=3D size; - *off =3D offset; - return buf->page; - } - // OK, we need a new buffer - *off =3D 0; - size =3D min_t(size_t, size, PAGE_SIZE); - if (pipe_full(pipe->head, pipe->tail, pipe->max_usage)) - return NULL; - page =3D push_anon(pipe, size); - if (!page) - return NULL; - i->head =3D pipe->head - 1; - i->last_offset =3D size; - i->count -=3D size; - return page; -} - -static size_t copy_page_to_iter_pipe(struct page *page, size_t offset, siz= e_t 
bytes, - struct iov_iter *i) -{ - struct pipe_inode_info *pipe =3D i->pipe; - unsigned int head =3D pipe->head; - - if (unlikely(bytes > i->count)) - bytes =3D i->count; - - if (unlikely(!bytes)) - return 0; - - if (!sanity(i)) - return 0; - - if (offset && i->last_offset =3D=3D -offset) { // could we merge it? - struct pipe_buffer *buf =3D pipe_buf(pipe, head - 1); - if (buf->page =3D=3D page) { - buf->len +=3D bytes; - i->last_offset -=3D bytes; - i->count -=3D bytes; - return bytes; - } - } - if (pipe_full(pipe->head, pipe->tail, pipe->max_usage)) - return 0; - - push_page(pipe, page, offset, bytes); - i->last_offset =3D -(offset + bytes); - i->head =3D head; - i->count -=3D bytes; - return bytes; -} - /* * fault_in_iov_iter_readable - fault in iov iterator for reading * @i: iterator @@ -433,46 +287,6 @@ void iov_iter_init(struct iov_iter *i, unsigned int di= rection, } EXPORT_SYMBOL(iov_iter_init); =20 -// returns the offset in partial buffer (if any) -static inline unsigned int pipe_npages(const struct iov_iter *i, int *npag= es) -{ - struct pipe_inode_info *pipe =3D i->pipe; - int used =3D pipe->head - pipe->tail; - int off =3D i->last_offset; - - *npages =3D max((int)pipe->max_usage - used, 0); - - if (off > 0 && off < PAGE_SIZE) { // anon and not full - (*npages)++; - return off; - } - return 0; -} - -static size_t copy_pipe_to_iter(const void *addr, size_t bytes, - struct iov_iter *i) -{ - unsigned int off, chunk; - - if (unlikely(bytes > i->count)) - bytes =3D i->count; - if (unlikely(!bytes)) - return 0; - - if (!sanity(i)) - return 0; - - for (size_t n =3D bytes; n; n -=3D chunk) { - struct page *page =3D append_pipe(i, n, &off); - chunk =3D min_t(size_t, n, PAGE_SIZE - off); - if (!page) - return bytes - n; - memcpy_to_page(page, off, addr, chunk); - addr +=3D chunk; - } - return bytes; -} - static __wsum csum_and_memcpy(void *to, const void *from, size_t len, __wsum sum, size_t off) { @@ -480,44 +294,10 @@ static __wsum csum_and_memcpy(void *to, const void *f= rom, size_t len, return csum_block_add(sum, next, off); } =20 -static size_t csum_and_copy_to_pipe_iter(const void *addr, size_t bytes, - struct iov_iter *i, __wsum *sump) -{ - __wsum sum =3D *sump; - size_t off =3D 0; - unsigned int chunk, r; - - if (unlikely(bytes > i->count)) - bytes =3D i->count; - if (unlikely(!bytes)) - return 0; - - if (!sanity(i)) - return 0; - - while (bytes) { - struct page *page =3D append_pipe(i, bytes, &r); - char *p; - - if (!page) - break; - chunk =3D min_t(size_t, bytes, PAGE_SIZE - r); - p =3D kmap_local_page(page); - sum =3D csum_and_memcpy(p + r, addr + off, chunk, sum, off); - kunmap_local(p); - off +=3D chunk; - bytes -=3D chunk; - } - *sump =3D sum; - return off; -} - size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i) { if (WARN_ON_ONCE(i->data_source)) return 0; - if (unlikely(iov_iter_is_pipe(i))) - return copy_pipe_to_iter(addr, bytes, i); if (user_backed_iter(i)) might_fault(); iterate_and_advance(i, bytes, base, len, off, @@ -539,42 +319,6 @@ static int copyout_mc(void __user *to, const void *fro= m, size_t n) return n; } =20 -static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes, - struct iov_iter *i) -{ - size_t xfer =3D 0; - unsigned int off, chunk; - - if (unlikely(bytes > i->count)) - bytes =3D i->count; - if (unlikely(!bytes)) - return 0; - - if (!sanity(i)) - return 0; - - while (bytes) { - struct page *page =3D append_pipe(i, bytes, &off); - unsigned long rem; - char *p; - - if (!page) - break; - chunk =3D min_t(size_t, bytes, 
PAGE_SIZE - off); - p =3D kmap_local_page(page); - rem =3D copy_mc_to_kernel(p + off, addr + xfer, chunk); - chunk -=3D rem; - kunmap_local(p); - xfer +=3D chunk; - bytes -=3D chunk; - if (rem) { - iov_iter_revert(i, rem); - break; - } - } - return xfer; -} - /** * _copy_mc_to_iter - copy to iter with source memory error exception hand= ling * @addr: source kernel address @@ -594,9 +338,8 @@ static size_t copy_mc_pipe_to_iter(const void *addr, si= ze_t bytes, * alignment and poison alignment assumptions to avoid re-triggering * hardware exceptions. * - * * ITER_KVEC, ITER_PIPE, and ITER_BVEC can return short copies. - * Compare to copy_to_iter() where only ITER_IOVEC attempts might return - * a short copy. + * * ITER_KVEC and ITER_BVEC can return short copies. Compare to + * copy_to_iter() where only ITER_IOVEC attempts might return a short co= py. * * Return: number of bytes copied (may be %0) */ @@ -604,8 +347,6 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes,= struct iov_iter *i) { if (WARN_ON_ONCE(i->data_source)) return 0; - if (unlikely(iov_iter_is_pipe(i))) - return copy_mc_pipe_to_iter(addr, bytes, i); if (user_backed_iter(i)) might_fault(); __iterate_and_advance(i, bytes, base, len, off, @@ -711,8 +452,6 @@ size_t copy_page_to_iter(struct page *page, size_t offs= et, size_t bytes, return 0; if (WARN_ON_ONCE(i->data_source)) return 0; - if (unlikely(iov_iter_is_pipe(i))) - return copy_page_to_iter_pipe(page, offset, bytes, i); page +=3D offset / PAGE_SIZE; // first subpage offset %=3D PAGE_SIZE; while (1) { @@ -761,36 +500,8 @@ size_t copy_page_from_iter(struct page *page, size_t o= ffset, size_t bytes, } EXPORT_SYMBOL(copy_page_from_iter); =20 -static size_t pipe_zero(size_t bytes, struct iov_iter *i) -{ - unsigned int chunk, off; - - if (unlikely(bytes > i->count)) - bytes =3D i->count; - if (unlikely(!bytes)) - return 0; - - if (!sanity(i)) - return 0; - - for (size_t n =3D bytes; n; n -=3D chunk) { - struct page *page =3D append_pipe(i, n, &off); - char *p; - - if (!page) - return bytes - n; - chunk =3D min_t(size_t, n, PAGE_SIZE - off); - p =3D kmap_local_page(page); - memset(p + off, 0, chunk); - kunmap_local(p); - } - return bytes; -} - size_t iov_iter_zero(size_t bytes, struct iov_iter *i) { - if (unlikely(iov_iter_is_pipe(i))) - return pipe_zero(bytes, i); iterate_and_advance(i, bytes, base, len, count, clear_user(base, len), memset(base, 0, len) @@ -821,32 +532,6 @@ size_t copy_page_from_iter_atomic(struct page *page, u= nsigned offset, size_t byt } EXPORT_SYMBOL(copy_page_from_iter_atomic); =20 -static void pipe_advance(struct iov_iter *i, size_t size) -{ - struct pipe_inode_info *pipe =3D i->pipe; - int off =3D i->last_offset; - - if (!off && !size) { - pipe_discard_from(pipe, i->start_head); // discard everything - return; - } - i->count -=3D size; - while (1) { - struct pipe_buffer *buf =3D pipe_buf(pipe, i->head); - if (off) /* make it relative to the beginning of buffer */ - size +=3D abs(off) - buf->offset; - if (size <=3D buf->len) { - buf->len =3D size; - i->last_offset =3D last_offset(buf); - break; - } - size -=3D buf->len; - i->head++; - off =3D 0; - } - pipe_discard_from(pipe, i->head + 1); // discard everything past this one -} - static void iov_iter_bvec_advance(struct iov_iter *i, size_t size) { const struct bio_vec *bvec, *end; @@ -898,8 +583,6 @@ void iov_iter_advance(struct iov_iter *i, size_t size) iov_iter_iovec_advance(i, size); } else if (iov_iter_is_bvec(i)) { iov_iter_bvec_advance(i, size); - } else if (iov_iter_is_pipe(i)) { - 
pipe_advance(i, size); } else if (iov_iter_is_discard(i)) { i->count -=3D size; } @@ -913,26 +596,6 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll) if (WARN_ON(unroll > MAX_RW_COUNT)) return; i->count +=3D unroll; - if (unlikely(iov_iter_is_pipe(i))) { - struct pipe_inode_info *pipe =3D i->pipe; - unsigned int head =3D pipe->head; - - while (head > i->start_head) { - struct pipe_buffer *b =3D pipe_buf(pipe, --head); - if (unroll < b->len) { - b->len -=3D unroll; - i->last_offset =3D last_offset(b); - i->head =3D head; - return; - } - unroll -=3D b->len; - pipe_buf_release(pipe, b); - pipe->head--; - } - i->last_offset =3D 0; - i->head =3D head; - return; - } if (unlikely(iov_iter_is_discard(i))) return; if (unroll <=3D i->iov_offset) { @@ -1020,24 +683,6 @@ void iov_iter_bvec(struct iov_iter *i, unsigned int d= irection, } EXPORT_SYMBOL(iov_iter_bvec); =20 -void iov_iter_pipe(struct iov_iter *i, unsigned int direction, - struct pipe_inode_info *pipe, - size_t count) -{ - BUG_ON(direction !=3D READ); - WARN_ON(pipe_full(pipe->head, pipe->tail, pipe->ring_size)); - *i =3D (struct iov_iter){ - .iter_type =3D ITER_PIPE, - .data_source =3D false, - .pipe =3D pipe, - .head =3D pipe->head, - .start_head =3D pipe->head, - .last_offset =3D 0, - .count =3D count - }; -} -EXPORT_SYMBOL(iov_iter_pipe); - /** * iov_iter_xarray - Initialise an I/O iterator to use the pages in an xar= ray * @i: The iterator to initialise. @@ -1162,19 +807,6 @@ bool iov_iter_is_aligned(const struct iov_iter *i, un= signed addr_mask, if (iov_iter_is_bvec(i)) return iov_iter_aligned_bvec(i, addr_mask, len_mask); =20 - if (iov_iter_is_pipe(i)) { - size_t size =3D i->count; - - if (size & len_mask) - return false; - if (size && i->last_offset > 0) { - if (i->last_offset & addr_mask) - return false; - } - - return true; - } - if (iov_iter_is_xarray(i)) { if (i->count & len_mask) return false; @@ -1244,14 +876,6 @@ unsigned long iov_iter_alignment(const struct iov_ite= r *i) if (iov_iter_is_bvec(i)) return iov_iter_alignment_bvec(i); =20 - if (iov_iter_is_pipe(i)) { - size_t size =3D i->count; - - if (size && i->last_offset > 0) - return size | i->last_offset; - return size; - } - if (iov_iter_is_xarray(i)) return (i->xarray_start + i->iov_offset) | i->count; =20 @@ -1303,36 +927,6 @@ static int want_pages_array(struct page ***res, size_= t size, return count; } =20 -static ssize_t pipe_get_pages(struct iov_iter *i, - struct page ***pages, size_t maxsize, unsigned maxpages, - size_t *start) -{ - unsigned int npages, count, off, chunk; - struct page **p; - size_t left; - - if (!sanity(i)) - return -EFAULT; - - *start =3D off =3D pipe_npages(i, &npages); - if (!npages) - return -EFAULT; - count =3D want_pages_array(pages, maxsize, off, min(npages, maxpages)); - if (!count) - return -ENOMEM; - p =3D *pages; - for (npages =3D 0, left =3D maxsize ; npages < count; npages++, left -=3D= chunk) { - struct page *page =3D append_pipe(i, left, &off); - if (!page) - break; - chunk =3D min_t(size_t, left, PAGE_SIZE - off); - get_page(*p++ =3D page); - } - if (!npages) - return -EFAULT; - return maxsize - left; -} - static ssize_t iter_xarray_populate_pages(struct page **pages, struct xarr= ay *xa, pgoff_t index, unsigned int nr_pages) { @@ -1482,8 +1076,6 @@ static ssize_t __iov_iter_get_pages_alloc(struct iov_= iter *i, } return maxsize; } - if (iov_iter_is_pipe(i)) - return pipe_get_pages(i, pages, maxsize, maxpages, start); if (iov_iter_is_xarray(i)) return iter_xarray_get_pages(i, pages, maxsize, maxpages, start); return 
-EFAULT; @@ -1573,9 +1165,7 @@ size_t csum_and_copy_to_iter(const void *addr, size_t= bytes, void *_csstate, } =20 sum =3D csum_shift(csstate->csum, csstate->off); - if (unlikely(iov_iter_is_pipe(i))) - bytes =3D csum_and_copy_to_pipe_iter(addr, bytes, i, &sum); - else iterate_and_advance(i, bytes, base, len, off, ({ + iterate_and_advance(i, bytes, base, len, off, ({ next =3D csum_and_copy_to_user(addr + off, base, len); sum =3D csum_block_add(sum, next, off); next ? 0 : len; @@ -1660,15 +1250,6 @@ int iov_iter_npages(const struct iov_iter *i, int ma= xpages) return iov_npages(i, maxpages); if (iov_iter_is_bvec(i)) return bvec_npages(i, maxpages); - if (iov_iter_is_pipe(i)) { - int npages; - - if (!sanity(i)) - return 0; - - pipe_npages(i, &npages); - return min(npages, maxpages); - } if (iov_iter_is_xarray(i)) { unsigned offset =3D (i->xarray_start + i->iov_offset) % PAGE_SIZE; int npages =3D DIV_ROUND_UP(offset + i->count, PAGE_SIZE); @@ -1681,10 +1262,6 @@ EXPORT_SYMBOL(iov_iter_npages); const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t fla= gs) { *new =3D *old; - if (unlikely(iov_iter_is_pipe(new))) { - WARN_ON(1); - return NULL; - } if (iov_iter_is_bvec(new)) return new->bvec =3D kmemdup(new->bvec, new->nr_segs * sizeof(struct bio_vec), diff --git a/mm/filemap.c b/mm/filemap.c index 3a93515ae2ed..470be06b6096 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2690,8 +2690,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_i= ter *iter, if (unlikely(iocb->ki_pos >=3D i_size_read(inode))) break; =20 - error =3D filemap_get_pages(iocb, iter->count, &fbatch, - iov_iter_is_pipe(iter)); + error =3D filemap_get_pages(iocb, iter->count, &fbatch, false); if (error < 0) break; From nobody Wed Sep 10 05:31:03 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 53D71C678D5 for ; Wed, 8 Mar 2023 16:55:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230118AbjCHQzE (ORCPT ); Wed, 8 Mar 2023 11:55:04 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50884 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230019AbjCHQyp (ORCPT ); Wed, 8 Mar 2023 11:54:45 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6D2FE8C95C for ; Wed, 8 Mar 2023 08:53:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678294408; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=T0MgyEE2up5lyn6F4xZiFP7ZLu50ZEhkrR3VW4b8xrI=; b=VLjXJGwPI+1KcVVUYkRJOzjTh7MfAC07bUeSwGDXqPcY+E39/Oc+Curo+m9DMIjByOJtK8 MZpQ4C+yZBOgFBVWmQLxZRVNig5SkE8AXh4TzbBpYwUPJgWpoNcB2VmPx25W7fsBSgsd80 ctJHgcUwImyFFGnP1+qseSP5yrZiSnM= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-279-GmxZAvM4NvKOIUhelF-1kA-1; Wed, 08 Mar 2023 11:53:25 -0500 X-MC-Unique: GmxZAvM4NvKOIUhelF-1kA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with 
cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 71C0287B2A3; Wed, 8 Mar 2023 16:53:24 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 858592166B2A; Wed, 8 Mar 2023 16:53:22 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, John Hubbard Subject: [PATCH v17 09/14] iomap: Don't get an reference on ZERO_PAGE for direct I/O block zeroing Date: Wed, 8 Mar 2023 16:52:46 +0000 Message-Id: <20230308165251.2078898-10-dhowells@redhat.com> In-Reply-To: <20230308165251.2078898-1-dhowells@redhat.com> References: <20230308165251.2078898-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" ZERO_PAGE can't go away, no need to hold an extra reference. Signed-off-by: David Howells Reviewed-by: David Hildenbrand Reviewed-by: John Hubbard cc: Al Viro cc: David Hildenbrand cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig Reviewed-by: Dave Chinner --- fs/iomap/direct-io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index f771001574d0..850fb9870c2f 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -202,7 +202,7 @@ static void iomap_dio_zero(const struct iomap_iter *ite= r, struct iomap_dio *dio, bio->bi_private =3D dio; bio->bi_end_io =3D iomap_dio_bio_end_io; =20 - get_page(page); + bio_set_flag(bio, BIO_NO_PAGE_REF); __bio_add_page(bio, page, len, 0); iomap_dio_submit_bio(iter, dio, bio, pos); } From nobody Wed Sep 10 05:31:03 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E3BEC64EC4 for ; Wed, 8 Mar 2023 16:55:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230145AbjCHQzM (ORCPT ); Wed, 8 Mar 2023 11:55:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50890 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230116AbjCHQyp (ORCPT ); Wed, 8 Mar 2023 11:54:45 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4C1F59661A for ; Wed, 8 Mar 2023 08:53:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678294411; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0+QymJ+GCZW28tJ8JK9HTXv9GxubpscA9882LeMgv9w=; b=cgaoeWDA4GLtYnWKdXmE89cDwJGZvbmzo2xBJLx+bLIoqJvxpSum8duznH7Io5cyC5FGZO MEGDEGdkvuA6NFqFiY9wZyyEgoRNrhJjs5RRV15o5Gb5qfI7ypTXn5lEN3Gaq5k9t1o2gc 9zkiri3K1tzue68Koxg0Lyns0RmioMU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with 
STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-581-t3_NrMduOtmhP4jO257PkQ-1; Wed, 08 Mar 2023 11:53:28 -0500 X-MC-Unique: t3_NrMduOtmhP4jO257PkQ-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 32CB7811E9C; Wed, 8 Mar 2023 16:53:27 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2B99618EC7; Wed, 8 Mar 2023 16:53:25 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , John Hubbard Subject: [PATCH v17 10/14] block: Fix bio_flagged() so that gcc can better optimise it Date: Wed, 8 Mar 2023 16:52:47 +0000 Message-Id: <20230308165251.2078898-11-dhowells@redhat.com> In-Reply-To: <20230308165251.2078898-1-dhowells@redhat.com> References: <20230308165251.2078898-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.5 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Fix bio_flagged() so that multiple instances of it, such as: if (bio_flagged(bio, BIO_PAGE_REFFED) || bio_flagged(bio, BIO_PAGE_PINNED)) can be combined by the gcc optimiser into a single test in assembly (arguably, this is a compiler optimisation issue[1]). The missed optimisation stems from bio_flagged() comparing the result of the bitwise-AND to zero. This results in an out-of-line bio_release_page() being compiled to something like: <+0>: mov 0x14(%rdi),%eax <+3>: test $0x1,%al <+5>: jne 0xffffffff816dac53 <+7>: test $0x2,%al <+9>: je 0xffffffff816dac5c <+11>: movzbl %sil,%esi <+15>: jmp 0xffffffff816daba1 <__bio_release_pages> <+20>: jmp 0xffffffff81d0b800 <__x86_return_thunk> However, the test is superfluous as the return type is bool. Removing it results in: <+0>: testb $0x3,0x14(%rdi) <+4>: je 0xffffffff816e4af4 <+6>: movzbl %sil,%esi <+10>: jmp 0xffffffff816dab7c <__bio_release_pages> <+15>: jmp 0xffffffff81d0b7c0 <__x86_return_thunk> instead. Also, the MOVZBL instruction looks unnecessary[2] - I think it's just 're-booling' the mark_dirty parameter. 
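[Editorial illustration, not part of the patch.] The effect can be seen with a minimal user-space sketch of the same pattern. FLAG_A, FLAG_B, flag_test() and needs_release() below are made-up stand-ins rather than kernel identifiers; the point is only that, once the helper returns bool without an explicit "!= 0", the compiler is free to fold both bit checks into a single test of the flags word, much like the "testb $0x3" in the generated code quoted above.

        #include <stdbool.h>

        /* Illustrative stand-ins for two bio flag bits. */
        enum { FLAG_A = 0, FLAG_B = 1 };

        static inline bool flag_test(unsigned int flags, unsigned int bit)
        {
                /* No "!= 0": conversion to bool already normalises to 0/1. */
                return flags & (1U << bit);
        }

        bool needs_release(unsigned int flags)
        {
                /* Both bits may now be checked with one combined test. */
                return flag_test(flags, FLAG_A) || flag_test(flags, FLAG_B);
        }
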
Signed-off-by: David Howells Reviewed-by: Christoph Hellwig Reviewed-by: John Hubbard cc: Jens Axboe cc: linux-block@vger.kernel.org Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D108370 [1] Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D108371 [2] Link: https://lore.kernel.org/r/167391056756.2311931.356007731815807265.stg= it@warthog.procyon.org.uk/ # v6 --- include/linux/bio.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/bio.h b/include/linux/bio.h index d766be7152e1..d9d6df62ea57 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -229,7 +229,7 @@ static inline void bio_cnt_set(struct bio *bio, unsigne= d int count) =20 static inline bool bio_flagged(struct bio *bio, unsigned int bit) { - return (bio->bi_flags & (1U << bit)) !=3D 0; + return bio->bi_flags & (1U << bit); } =20 static inline void bio_set_flag(struct bio *bio, unsigned int bit) From nobody Wed Sep 10 05:31:03 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4450BC64EC4 for ; Wed, 8 Mar 2023 16:55:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229667AbjCHQzQ (ORCPT ); Wed, 8 Mar 2023 11:55:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50946 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230122AbjCHQyq (ORCPT ); Wed, 8 Mar 2023 11:54:46 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B95B4968F4 for ; Wed, 8 Mar 2023 08:53:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678294416; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=cRZ6zwkATo5WH/jIVyGNX8/GaSvHJQOFy9f4tGJFuL8=; b=Ets9cLeRlqQaPe+RAtCtYok0yk5hFNlujDS3nwC1Nv/WKwjhqLNvEIsDVCz4jaAwvSd0iw lEBDm8GE+dc6zVGB4yogLvkH53+sP1QG/gJ4az/9bJh1qxN9Okg3n4cAHx8q+IRs/lRl7N q7FcY5b9Yg0lsauJHLkXICU2Zhq6gNE= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-481-sN0k0v4zMtK8j61b-GBk5g-1; Wed, 08 Mar 2023 11:53:30 -0500 X-MC-Unique: sN0k0v4zMtK8j61b-GBk5g-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C3113800B23; Wed, 8 Mar 2023 16:53:29 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id C2335492B07; Wed, 8 Mar 2023 16:53:27 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , John Hubbard Subject: [PATCH v17 11/14] block: Replace BIO_NO_PAGE_REF with BIO_PAGE_REFFED with inverted logic Date: Wed, 8 Mar 
2023 16:52:48 +0000 Message-Id: <20230308165251.2078898-12-dhowells@redhat.com> In-Reply-To: <20230308165251.2078898-1-dhowells@redhat.com> References: <20230308165251.2078898-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Christoph Hellwig Replace BIO_NO_PAGE_REF with a BIO_PAGE_REFFED flag that has the inverted meaning is only set when a page reference has been acquired that needs to be released by bio_release_pages(). Signed-off-by: Christoph Hellwig Signed-off-by: David Howells Reviewed-by: John Hubbard cc: Al Viro cc: Jens Axboe cc: Jan Kara cc: Matthew Wilcox cc: Logan Gunthorpe cc: linux-block@vger.kernel.org --- Notes: ver #8) - Split out from another patch [hch]. - Don't default to BIO_PAGE_REFFED [hch]. =20 ver #5) - Split from patch that uses iov_iter_extract_pages(). block/bio.c | 2 +- block/blk-map.c | 1 + fs/direct-io.c | 2 ++ fs/iomap/direct-io.c | 1 - include/linux/bio.h | 2 +- include/linux/blk_types.h | 2 +- 6 files changed, 6 insertions(+), 4 deletions(-) diff --git a/block/bio.c b/block/bio.c index fd11614bba4d..4ff96a0e4091 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1190,7 +1190,6 @@ void bio_iov_bvec_set(struct bio *bio, struct iov_ite= r *iter) bio->bi_io_vec =3D (struct bio_vec *)iter->bvec; bio->bi_iter.bi_bvec_done =3D iter->iov_offset; bio->bi_iter.bi_size =3D size; - bio_set_flag(bio, BIO_NO_PAGE_REF); bio_set_flag(bio, BIO_CLONED); } =20 @@ -1335,6 +1334,7 @@ int bio_iov_iter_get_pages(struct bio *bio, struct io= v_iter *iter) return 0; } =20 + bio_set_flag(bio, BIO_PAGE_REFFED); do { ret =3D __bio_iov_iter_get_pages(bio, iter); } while (!ret && iov_iter_count(iter) && !bio_full(bio, 0)); diff --git a/block/blk-map.c b/block/blk-map.c index 9137d16cecdc..c77fdb1fbda7 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -281,6 +281,7 @@ static int bio_map_user_iov(struct request *rq, struct = iov_iter *iter, if (blk_queue_pci_p2pdma(rq->q)) extraction_flags |=3D ITER_ALLOW_P2PDMA; =20 + bio_set_flag(bio, BIO_PAGE_REFFED); while (iov_iter_count(iter)) { struct page **pages, *stack_pages[UIO_FASTIOV]; ssize_t bytes; diff --git a/fs/direct-io.c b/fs/direct-io.c index ab0d7ea89813..47b90c68b369 100644 --- a/fs/direct-io.c +++ b/fs/direct-io.c @@ -403,6 +403,8 @@ dio_bio_alloc(struct dio *dio, struct dio_submit *sdio, bio->bi_end_io =3D dio_bio_end_aio; else bio->bi_end_io =3D dio_bio_end_io; + /* for now require references for all pages */ + bio_set_flag(bio, BIO_PAGE_REFFED); sdio->bio =3D bio; sdio->logical_offset_in_bio =3D sdio->cur_page_fs_offset; } diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index 850fb9870c2f..ceeb0a183cea 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -202,7 +202,6 @@ static void iomap_dio_zero(const struct iomap_iter *ite= r, struct iomap_dio *dio, bio->bi_private =3D dio; bio->bi_end_io =3D iomap_dio_bio_end_io; =20 - bio_set_flag(bio, BIO_NO_PAGE_REF); __bio_add_page(bio, page, len, 0); iomap_dio_submit_bio(iter, dio, bio, pos); } diff --git a/include/linux/bio.h b/include/linux/bio.h index d9d6df62ea57..b537d03377f0 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -488,7 +488,7 @@ void zero_fill_bio(struct bio *bio); =20 static inline void bio_release_pages(struct bio *bio, bool mark_dirty) { - if (!bio_flagged(bio, BIO_NO_PAGE_REF)) + if (bio_flagged(bio, BIO_PAGE_REFFED)) 
__bio_release_pages(bio, mark_dirty); } =20 diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 99be590f952f..7daa261f4f98 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -318,7 +318,7 @@ struct bio { * bio flags */ enum { - BIO_NO_PAGE_REF, /* don't put release vec pages */ + BIO_PAGE_REFFED, /* put pages in bio_release_pages() */ BIO_CLONED, /* doesn't own data */ BIO_BOUNCED, /* bio is a bounce bio */ BIO_QUIET, /* Make BIO Quiet */ From nobody Wed Sep 10 05:31:03 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D0B7CC678D5 for ; Wed, 8 Mar 2023 16:55:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230210AbjCHQz4 (ORCPT ); Wed, 8 Mar 2023 11:55:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51124 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230183AbjCHQyv (ORCPT ); Wed, 8 Mar 2023 11:54:51 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AFB91BDD12 for ; Wed, 8 Mar 2023 08:53:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678294426; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=HEPrb4kIqMqW7eebyDDmobfSBttcdx5+yTOr0krGf5g=; b=ZOBLq2FzJP55pDzVYqU4jwCmk+0lAPMZh7qjadxj34/rYUrypTMZ5A5jifzpHJwFqlJnjs f++1St9VZXKupgqIDITXamRYbIatfJaNHK1omg2YWGWGfgjao5f4RKLtheN3Ow8O5oiwOS sLcAa/3yRxeRAIkyGETtylf7YIwvLC0= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-297-lMpGSRQOMG6okmNVI3BTqA-1; Wed, 08 Mar 2023 11:53:41 -0500 X-MC-Unique: lMpGSRQOMG6okmNVI3BTqA-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 83C6418A650A; Wed, 8 Mar 2023 16:53:32 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 81ADA492C3E; Wed, 8 Mar 2023 16:53:30 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , John Hubbard Subject: [PATCH v17 12/14] block: Add BIO_PAGE_PINNED and associated infrastructure Date: Wed, 8 Mar 2023 16:52:49 +0000 Message-Id: <20230308165251.2078898-13-dhowells@redhat.com> In-Reply-To: <20230308165251.2078898-1-dhowells@redhat.com> References: <20230308165251.2078898-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add BIO_PAGE_PINNED to 
indicate that the pages in a bio are pinned (FOLL_PIN) and that the pin will need removing. Signed-off-by: David Howells Reviewed-by: Christoph Hellwig Reviewed-by: John Hubbard cc: Al Viro cc: Jens Axboe cc: Jan Kara cc: Matthew Wilcox cc: Logan Gunthorpe cc: linux-block@vger.kernel.org --- Notes: ver #10) - Drop bio_set_cleanup_mode(), open coding it instead. =20 ver #9) - Only consider pinning in bio_set_cleanup_mode(). Ref'ing pages in struct bio is going away. - page_put_unpin() is removed; call unpin_user_page() and put_page() directly. - Use bio_release_page() in __bio_release_pages(). - BIO_PAGE_PINNED and BIO_PAGE_REFFED can't both be set, so use if-else when testing both of them. =20 ver #8) - Move the infrastructure to clean up pinned pages to this patch [hch]. - Put BIO_PAGE_PINNED before BIO_PAGE_REFFED as the latter should probably be removed at some point. FOLL_PIN can then be renumbered first. block/bio.c | 6 +++--- block/blk.h | 12 ++++++++++++ include/linux/bio.h | 3 ++- include/linux/blk_types.h | 1 + 4 files changed, 18 insertions(+), 4 deletions(-) diff --git a/block/bio.c b/block/bio.c index 4ff96a0e4091..51ae957cc4b6 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1168,7 +1168,7 @@ void __bio_release_pages(struct bio *bio, bool mark_d= irty) bio_for_each_segment_all(bvec, bio, iter_all) { if (mark_dirty && !PageCompound(bvec->bv_page)) set_page_dirty_lock(bvec->bv_page); - put_page(bvec->bv_page); + bio_release_page(bio, bvec->bv_page); } } EXPORT_SYMBOL_GPL(__bio_release_pages); @@ -1488,8 +1488,8 @@ void bio_set_pages_dirty(struct bio *bio) * the BIO and re-dirty the pages in process context. * * It is expected that bio_check_pages_dirty() will wholly own the BIO from - * here on. It will run one put_page() against each page and will run one - * bio_put() against the BIO. + * here on. It will unpin each page and will run one bio_put() against the + * BIO. */ =20 static void bio_dirty_fn(struct work_struct *work); diff --git a/block/blk.h b/block/blk.h index cc4e8873dfde..d65d96994a94 100644 --- a/block/blk.h +++ b/block/blk.h @@ -432,6 +432,18 @@ int bio_add_hw_page(struct request_queue *q, struct bi= o *bio, struct page *page, unsigned int len, unsigned int offset, unsigned int max_sectors, bool *same_page); =20 +/* + * Clean up a page appropriately, where the page may be pinned, may have a + * ref taken on it or neither. 
+ */ +static inline void bio_release_page(struct bio *bio, struct page *page) +{ + if (bio_flagged(bio, BIO_PAGE_PINNED)) + unpin_user_page(page); + else if (bio_flagged(bio, BIO_PAGE_REFFED)) + put_page(page); +} + struct request_queue *blk_alloc_queue(int node_id); =20 int disk_scan_partitions(struct gendisk *disk, fmode_t mode); diff --git a/include/linux/bio.h b/include/linux/bio.h index b537d03377f0..d8c30c791a9a 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -488,7 +488,8 @@ void zero_fill_bio(struct bio *bio); =20 static inline void bio_release_pages(struct bio *bio, bool mark_dirty) { - if (bio_flagged(bio, BIO_PAGE_REFFED)) + if (bio_flagged(bio, BIO_PAGE_REFFED) || + bio_flagged(bio, BIO_PAGE_PINNED)) __bio_release_pages(bio, mark_dirty); } =20 diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 7daa261f4f98..a0e339ff3d09 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -318,6 +318,7 @@ struct bio { * bio flags */ enum { + BIO_PAGE_PINNED, /* Unpin pages in bio_release_pages() */ BIO_PAGE_REFFED, /* put pages in bio_release_pages() */ BIO_CLONED, /* doesn't own data */ BIO_BOUNCED, /* bio is a bounce bio */ From nobody Wed Sep 10 05:31:03 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9AC21C64EC4 for ; Wed, 8 Mar 2023 16:55:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230193AbjCHQzX (ORCPT ); Wed, 8 Mar 2023 11:55:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50962 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230131AbjCHQyq (ORCPT ); Wed, 8 Mar 2023 11:54:46 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C2428B8607 for ; Wed, 8 Mar 2023 08:53:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678294423; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7b6tezcmsCRtrHxO2uDV+B0gtFM5s5ObrJQQe5H2QZI=; b=M59NXBAEfbQxjG8HniuSdwqy2zfnfHh5E1ztWTGS+eoRxNUupcaqAAJ9x7Q+Urthatx3rW qDmOJPuKCVUrFpor+0EQPcVVP/8w9gTvl8KwoNF+vqrpmzeyPjkQ3WD122BvKZ0Jo83j6V JKLQJ7ZVShbTfAS0f+hPPdVWcgybG+Y= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-457-phLgeZkLN8-e2tx3s-Pb0w-1; Wed, 08 Mar 2023 11:53:39 -0500 X-MC-Unique: phLgeZkLN8-e2tx3s-Pb0w-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 2E66E88B7DE; Wed, 8 Mar 2023 16:53:35 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 20FAF492B09; Wed, 8 Mar 2023 16:53:33 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf 
Danton , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , John Hubbard Subject: [PATCH v17 13/14] block: Convert bio_iov_iter_get_pages to use iov_iter_extract_pages Date: Wed, 8 Mar 2023 16:52:50 +0000 Message-Id: <20230308165251.2078898-14-dhowells@redhat.com> In-Reply-To: <20230308165251.2078898-1-dhowells@redhat.com> References: <20230308165251.2078898-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" This will pin pages or leave them unaltered rather than getting a ref on them as appropriate to the iterator. The pages need to be pinned for DIO rather than having refs taken on them to prevent VM copy-on-write from malfunctioning during a concurrent fork() (the result of the I/O could otherwise end up being affected by/visible to the child process). Signed-off-by: David Howells Reviewed-by: Christoph Hellwig Reviewed-by: John Hubbard cc: Al Viro cc: Jens Axboe cc: Jan Kara cc: Matthew Wilcox cc: Logan Gunthorpe cc: linux-block@vger.kernel.org --- Notes: ver #10) - Drop bio_set_cleanup_mode(), open coding it instead. =20 ver #8) - Split the patch up a bit [hch]. - We should only be using pinned/non-pinned pages and not ref'd pages, so adjust the comments appropriately. =20 ver #7) - Don't treat BIO_PAGE_REFFED/PINNED as being the same as FOLL_GET/PIN. =20 ver #5) - Transcribe the FOLL_* flags returned by iov_iter_extract_pages() to BIO_* flags and got rid of bi_cleanup_mode. - Replaced BIO_NO_PAGE_REF to BIO_PAGE_REFFED in the preceding patch. block/bio.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/block/bio.c b/block/bio.c index 51ae957cc4b6..fc98c1c723ca 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1204,7 +1204,7 @@ static int bio_iov_add_page(struct bio *bio, struct p= age *page, } =20 if (same_page) - put_page(page); + bio_release_page(bio, page); return 0; } =20 @@ -1218,7 +1218,7 @@ static int bio_iov_add_zone_append_page(struct bio *b= io, struct page *page, queue_max_zone_append_sectors(q), &same_page) !=3D len) return -EINVAL; if (same_page) - put_page(page); + bio_release_page(bio, page); return 0; } =20 @@ -1229,10 +1229,10 @@ static int bio_iov_add_zone_append_page(struct bio = *bio, struct page *page, * @bio: bio to add pages to * @iter: iov iterator describing the region to be mapped * - * Pins pages from *iter and appends them to @bio's bvec array. The - * pages will have to be released using put_page() when done. - * For multi-segment *iter, this function only adds pages from the - * next non-empty segment of the iov iterator. + * Extracts pages from *iter and appends them to @bio's bvec array. The p= ages + * will have to be cleaned up in the way indicated by the BIO_PAGE_PINNED = flag. + * For a multi-segment *iter, this function only adds pages from the next + * non-empty segment of the iov iterator. */ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) { @@ -1264,9 +1264,9 @@ static int __bio_iov_iter_get_pages(struct bio *bio, = struct iov_iter *iter) * result to ensure the bio's total size is correct. The remainder of * the iov data will be picked up in the next bio iteration. 
*/ - size =3D iov_iter_get_pages(iter, pages, - UINT_MAX - bio->bi_iter.bi_size, - nr_pages, &offset, extraction_flags); + size =3D iov_iter_extract_pages(iter, &pages, + UINT_MAX - bio->bi_iter.bi_size, + nr_pages, extraction_flags, &offset); if (unlikely(size <=3D 0)) return size ? size : -EFAULT; =20 @@ -1299,7 +1299,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, = struct iov_iter *iter) iov_iter_revert(iter, left); out: while (i < nr_pages) - put_page(pages[i++]); + bio_release_page(bio, pages[i++]); =20 return ret; } @@ -1334,7 +1334,8 @@ int bio_iov_iter_get_pages(struct bio *bio, struct io= v_iter *iter) return 0; } =20 - bio_set_flag(bio, BIO_PAGE_REFFED); + if (iov_iter_extract_will_pin(iter)) + bio_set_flag(bio, BIO_PAGE_PINNED); do { ret =3D __bio_iov_iter_get_pages(bio, iter); } while (!ret && iov_iter_count(iter) && !bio_full(bio, 0)); From nobody Wed Sep 10 05:31:03 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1321BC678D5 for ; Wed, 8 Mar 2023 16:55:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230201AbjCHQz0 (ORCPT ); Wed, 8 Mar 2023 11:55:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50864 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230135AbjCHQyr (ORCPT ); Wed, 8 Mar 2023 11:54:47 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 05BE3B950B for ; Wed, 8 Mar 2023 08:53:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678294424; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gIKMSgtKI0+T2jjDihR3uNZ9dKHoZzHE/RnqYef6EBU=; b=eHw0F4t8s0Z5aYBqVQRDV13CFqoOsDkIc0HwmiAV+5qQ+NG2LS2I1O+MWODGPXEiPSH9mR jHjQSrEw57Wm4gHBeng5hXO6K5HtxPzlIbogEQAmKaVMTl0GxOzN4+mTpzLISrxcV7GH2Q 2HnO1W319nNWLVMGHNXYCQN2+uV7orA= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-653-kBmAFUeENu2Sp64iGCfrUw-1; Wed, 08 Mar 2023 11:53:40 -0500 X-MC-Unique: kBmAFUeENu2Sp64iGCfrUw-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id E02FE882823; Wed, 8 Mar 2023 16:53:38 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id DCEEA492B05; Wed, 8 Mar 2023 16:53:35 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , John Hubbard Subject: [PATCH v17 14/14] block: convert bio_map_user_iov to use iov_iter_extract_pages Date: Wed, 8 Mar 2023 16:52:51 +0000 Message-Id: 
<20230308165251.2078898-15-dhowells@redhat.com> In-Reply-To: <20230308165251.2078898-1-dhowells@redhat.com> References: <20230308165251.2078898-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" This will pin pages or leave them unaltered rather than getting a ref on them as appropriate to the iterator. The pages need to be pinned for DIO rather than having refs taken on them to prevent VM copy-on-write from malfunctioning during a concurrent fork() (the result of the I/O could otherwise end up being visible to/affected by the child process). Signed-off-by: David Howells Reviewed-by: Christoph Hellwig Reviewed-by: John Hubbard cc: Al Viro cc: Jens Axboe cc: Jan Kara cc: Matthew Wilcox cc: Logan Gunthorpe cc: linux-block@vger.kernel.org --- Notes: ver #10) - Drop bio_set_cleanup_mode(), open coding it instead. =20 ver #8) - Split the patch up a bit [hch]. - We should only be using pinned/non-pinned pages and not ref'd pages, so adjust the comments appropriately. =20 ver #7) - Don't treat BIO_PAGE_REFFED/PINNED as being the same as FOLL_GET/PIN. =20 ver #5) - Transcribe the FOLL_* flags returned by iov_iter_extract_pages() to BIO_* flags and got rid of bi_cleanup_mode. - Replaced BIO_NO_PAGE_REF to BIO_PAGE_REFFED in the preceding patch. block/blk-map.c | 23 +++++++++++------------ 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/block/blk-map.c b/block/blk-map.c index c77fdb1fbda7..7b12f4bb4d4c 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -280,22 +280,21 @@ static int bio_map_user_iov(struct request *rq, struc= t iov_iter *iter, =20 if (blk_queue_pci_p2pdma(rq->q)) extraction_flags |=3D ITER_ALLOW_P2PDMA; + if (iov_iter_extract_will_pin(iter)) + bio_set_flag(bio, BIO_PAGE_PINNED); =20 - bio_set_flag(bio, BIO_PAGE_REFFED); while (iov_iter_count(iter)) { - struct page **pages, *stack_pages[UIO_FASTIOV]; + struct page *stack_pages[UIO_FASTIOV]; + struct page **pages =3D stack_pages; ssize_t bytes; size_t offs; int npages; =20 - if (nr_vecs <=3D ARRAY_SIZE(stack_pages)) { - pages =3D stack_pages; - bytes =3D iov_iter_get_pages(iter, pages, LONG_MAX, - nr_vecs, &offs, extraction_flags); - } else { - bytes =3D iov_iter_get_pages_alloc(iter, &pages, - LONG_MAX, &offs, extraction_flags); - } + if (nr_vecs > ARRAY_SIZE(stack_pages)) + pages =3D NULL; + + bytes =3D iov_iter_extract_pages(iter, &pages, LONG_MAX, + nr_vecs, extraction_flags, &offs); if (unlikely(bytes <=3D 0)) { ret =3D bytes ? bytes : -EFAULT; goto out_unmap; @@ -317,7 +316,7 @@ static int bio_map_user_iov(struct request *rq, struct = iov_iter *iter, if (!bio_add_hw_page(rq->q, bio, page, n, offs, max_sectors, &same_page)) { if (same_page) - put_page(page); + bio_release_page(bio, page); break; } =20 @@ -329,7 +328,7 @@ static int bio_map_user_iov(struct request *rq, struct = iov_iter *iter, * release the pages we didn't map into the bio, if any */ while (j < npages) - put_page(pages[j++]); + bio_release_page(bio, pages[j++]); if (pages !=3D stack_pages) kvfree(pages); /* couldn't stuff something into bio? */