From nobody Wed Feb 11 10:19:24 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D85C0C61DA4 for ; Wed, 15 Mar 2023 16:37:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232284AbjCOQhL (ORCPT ); Wed, 15 Mar 2023 12:37:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36632 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232190AbjCOQg5 (ORCPT ); Wed, 15 Mar 2023 12:36:57 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4055A7042E for ; Wed, 15 Mar 2023 09:36:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678898161; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fEHZtUBR4B7sSEpj4V24Co6bhAobBEWGUt1eQB77Iqg=; b=HSUrVCYuVWzQ94ISqK4yiL55PIFlorse0niiDDscGX50Rh43co4F+XEsMWTzio9KUdSAfk ZeUkFLHjYFybeSXTJqOJagqP/h59VlDVs5W2mfY1/Lo5fjbi1RqWgADUJjzxuowu8sGw2q tI5oeHmBeb5e+AU9gMiXXsGcxqwS+cY= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-202-Kzhf41MDNSGA4ALqB2MQAQ-1; Wed, 15 Mar 2023 12:35:57 -0400 X-MC-Unique: Kzhf41MDNSGA4ALqB2MQAQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id E32C2857F81; Wed, 15 Mar 2023 16:35:55 
+0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id C33412166B26; Wed, 15 Mar 2023 16:35:53 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Christian Brauner , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , John Hubbard Subject: [PATCH v19 01/15] splice: Clean up direct_splice_read() a bit Date: Wed, 15 Mar 2023 16:35:35 +0000 Message-Id: <20230315163549.295454-2-dhowells@redhat.com> In-Reply-To: <20230315163549.295454-1-dhowells@redhat.com> References: <20230315163549.295454-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Do a couple of cleanups to direct_splice_read(): (1) Cast to struct page **, not void *. (2) Simplify the calculation of the number of pages to keep/reclaim in direct_splice_read(). 
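The arithmetic behind cleanup (2) can be checked in isolation. What follows is a minimal userspace sketch, not the kernel code: SKETCH_PAGE_SIZE, SKETCH_DIV_ROUND_UP(), reclaim_old() and keep_new() are names invented for this illustration, and a 4096-byte page is assumed. It compares the old "count the untouched bytes, then divide" scheme with the new "round the byte count up to pages to keep" scheme, assuming the read never returns more than npages * PAGE_SIZE bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size, illustration only */
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Old scheme: count the untouched bytes, then convert to whole pages. */
static size_t reclaim_old(size_t npages, ssize_t ret)
{
	size_t reclaim = npages * SKETCH_PAGE_SIZE;

	if (ret > 0)
		reclaim -= (size_t)ret;
	return reclaim / SKETCH_PAGE_SIZE;
}

/* New scheme: count the pages that hold data; the rest get released. */
static size_t keep_new(ssize_t ret)
{
	if (ret <= 0)
		return 0;
	return SKETCH_DIV_ROUND_UP((size_t)ret, SKETCH_PAGE_SIZE);
}
```

For any read result in range, npages - reclaim_old(npages, ret) and keep_new(ret) should agree, which is why the patch can drop the reclaim variable and release the pages from index keep onwards.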
Suggested-by: Christoph Hellwig Signed-off-by: David Howells Reviewed-by: Christoph Hellwig cc: Jens Axboe cc: Al Viro cc: David Hildenbrand cc: John Hubbard cc: linux-mm@kvack.org cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Reviewed-by: David Hildenbrand --- fs/splice.c | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/fs/splice.c b/fs/splice.c index 2e76dbb81a8f..abd21a455a2b 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -295,7 +295,7 @@ ssize_t direct_splice_read(struct file *in, loff_t *ppo= s, struct kiocb kiocb; struct page **pages; ssize_t ret; - size_t used, npages, chunk, remain, reclaim; + size_t used, npages, chunk, remain, keep =3D 0; int i; =20 /* Work out how much data we can actually add into the pipe */ @@ -309,7 +309,7 @@ ssize_t direct_splice_read(struct file *in, loff_t *ppo= s, if (!bv) return -ENOMEM; =20 - pages =3D (void *)(bv + npages); + pages =3D (struct page **)(bv + npages); npages =3D alloc_pages_bulk_array(GFP_USER, npages, pages); if (!npages) { kfree(bv); @@ -332,11 +332,8 @@ ssize_t direct_splice_read(struct file *in, loff_t *pp= os, kiocb.ki_pos =3D *ppos; ret =3D call_read_iter(in, &kiocb, &to); =20 - reclaim =3D npages * PAGE_SIZE; - remain =3D 0; if (ret > 0) { - reclaim -=3D ret; - remain =3D ret; + keep =3D DIV_ROUND_UP(ret, PAGE_SIZE); *ppos =3D kiocb.ki_pos; file_accessed(in); } else if (ret < 0) { @@ -349,14 +346,12 @@ ssize_t direct_splice_read(struct file *in, loff_t *p= pos, } =20 /* Free any pages that didn't get touched at all. */ - reclaim /=3D PAGE_SIZE; - if (reclaim) { - npages -=3D reclaim; - release_pages(pages + npages, reclaim); - } + if (keep < npages) + release_pages(pages + keep, npages - keep); =20 /* Push the remaining pages into the pipe. 
*/ - for (i =3D 0; i < npages; i++) { + remain =3D ret; + for (i =3D 0; i < keep; i++) { struct pipe_buffer *buf =3D pipe_head_buf(pipe); =20 chunk =3D min_t(size_t, remain, PAGE_SIZE); From nobody Wed Feb 11 10:19:24 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E6D5DC61DA4 for ; Wed, 15 Mar 2023 16:37:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231667AbjCOQhP (ORCPT ); Wed, 15 Mar 2023 12:37:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36734 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232204AbjCOQg5 (ORCPT ); Wed, 15 Mar 2023 12:36:57 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 779646F636 for ; Wed, 15 Mar 2023 09:36:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678898167; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=UYOaJrv+k3qRHTtCyS3jh/xO0Z1XCGBD7xHPEpqnnTw=; b=UsJA5sxLYzxWwZSz+4/jCzBhCDS+T0/S1ObJ3n5d1J+135GHMf2aUJE8wPtutUefukkzN7 cvsyqWY6iQsTNKtlYliFfA/X8NJwdJCtKdOt/vnXogMT1TNV5+XyfF/9flHY2pDBtnaocq hDaK6BjOBY7ZYKRD3KncrfMPUPVzdUs= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-610-HjThJPlUMZe-gype1NTd7Q-1; Wed, 15 Mar 2023 12:36:04 -0400 X-MC-Unique: HjThJPlUMZe-gype1NTd7Q-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using 
TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id E0D933C6675C; Wed, 15 Mar 2023 16:35:58 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 95B66C164E7; Wed, 15 Mar 2023 16:35:56 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Christian Brauner , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , Miklos Szeredi , John Hubbard , linux-unionfs@vger.kernel.org Subject: [PATCH v19 02/15] splice: Make do_splice_to() generic and export it Date: Wed, 15 Mar 2023 16:35:36 +0000 Message-Id: <20230315163549.295454-3-dhowells@redhat.com> In-Reply-To: <20230315163549.295454-1-dhowells@redhat.com> References: <20230315163549.295454-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Rename do_splice_to() to vfs_splice_read() and export it so that it can be used as a helper when calling down to a lower layer filesystem as it performs all the necessary checks[1]. 
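For orientation, the file-to-pipe direction that this helper serves can be exercised from userspace with splice(2). The sketch below is an illustration under assumptions, not part of the patch: splice_roundtrip() is a hypothetical helper, /tmp is assumed to be writable, and the message is assumed to fit in the default pipe capacity.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write msg to a temporary file, splice the file
 * into a pipe, read the pipe back and compare.  Returns 0 if the bytes
 * that came out of the pipe match msg, -1 otherwise.
 */
static int splice_roundtrip(const char *msg)
{
	char path[] = "/tmp/splice-sketch-XXXXXX";
	char buf[256];
	size_t len = strlen(msg);
	loff_t off = 0;
	ssize_t n;
	int pfd[2], fd, ret = -1;

	assert(len > 0 && len <= sizeof(buf));

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);			/* the open fd keeps the file alive */

	if (write(fd, msg, len) != (ssize_t)len || pipe(pfd) < 0) {
		close(fd);
		return -1;
	}

	/* File to pipe: the direction served by the renamed helper. */
	n = splice(fd, &off, pfd[1], NULL, len, 0);
	if (n == (ssize_t)len && read(pfd[0], buf, len) == n &&
	    memcmp(buf, msg, len) == 0)
		ret = 0;

	close(pfd[0]);
	close(pfd[1]);
	close(fd);
	return ret;
}
```

A caller would treat a return of 0 from splice_roundtrip("some text") as confirmation that the data made it through the pipe unchanged.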
Signed-off-by: David Howells Reviewed-by: Christoph Hellwig cc: Miklos Szeredi cc: Jens Axboe cc: Al Viro cc: John Hubbard cc: David Hildenbrand cc: Matthew Wilcox cc: linux-unionfs@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/CAJfpeguGksS3sCigmRi9hJdUec8qtM9f+_9jC1rJhs= XT+dV01w@mail.gmail.com/ [1] Reviewed-by: David Hildenbrand --- fs/splice.c | 27 ++++++++++++++++++++------- include/linux/splice.h | 3 +++ 2 files changed, 23 insertions(+), 7 deletions(-) diff --git a/fs/splice.c b/fs/splice.c index abd21a455a2b..90ccd3666dca 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -851,12 +851,24 @@ static long do_splice_from(struct pipe_inode_info *pi= pe, struct file *out, return out->f_op->splice_write(pipe, out, ppos, len, flags); } =20 -/* - * Attempt to initiate a splice from a file to a pipe. +/** + * vfs_splice_read - Read data from a file and splice it into a pipe + * @in: File to splice from + * @ppos: Input file offset + * @pipe: Pipe to splice to + * @len: Number of bytes to splice + * @flags: Splice modifier flags (SPLICE_F_*) + * + * Splice the requested amount of data from the input file to the pipe. T= his + * is synchronous as the caller must hold the pipe lock across the entire + * operation. + * + * If successful, it returns the amount of data spliced, 0 if it hit the E= OF or + * a hole and a negative error code otherwise. 
*/ -static long do_splice_to(struct file *in, loff_t *ppos, - struct pipe_inode_info *pipe, size_t len, - unsigned int flags) +long vfs_splice_read(struct file *in, loff_t *ppos, + struct pipe_inode_info *pipe, size_t len, + unsigned int flags) { unsigned int p_space; int ret; @@ -879,6 +891,7 @@ static long do_splice_to(struct file *in, loff_t *ppos, return warn_unsupported(in, "read"); return in->f_op->splice_read(in, ppos, pipe, len, flags); } +EXPORT_SYMBOL_GPL(vfs_splice_read); =20 /** * splice_direct_to_actor - splices data directly between two non-pipes @@ -949,7 +962,7 @@ ssize_t splice_direct_to_actor(struct file *in, struct = splice_desc *sd, size_t read_len; loff_t pos =3D sd->pos, prev_pos =3D pos; =20 - ret =3D do_splice_to(in, &pos, pipe, len, flags); + ret =3D vfs_splice_read(in, &pos, pipe, len, flags); if (unlikely(ret <=3D 0)) goto out_release; =20 @@ -1097,7 +1110,7 @@ long splice_file_to_pipe(struct file *in, pipe_lock(opipe); ret =3D wait_for_space(opipe, flags); if (!ret) - ret =3D do_splice_to(in, offset, opipe, len, flags); + ret =3D vfs_splice_read(in, offset, opipe, len, flags); pipe_unlock(opipe); if (ret > 0) wakeup_pipe_readers(opipe); diff --git a/include/linux/splice.h b/include/linux/splice.h index a55179fd60fc..8f052c3dae95 100644 --- a/include/linux/splice.h +++ b/include/linux/splice.h @@ -76,6 +76,9 @@ extern ssize_t splice_to_pipe(struct pipe_inode_info *, struct splice_pipe_desc *); extern ssize_t add_to_pipe(struct pipe_inode_info *, struct pipe_buffer *); +long vfs_splice_read(struct file *in, loff_t *ppos, + struct pipe_inode_info *pipe, size_t len, + unsigned int flags); extern ssize_t splice_direct_to_actor(struct file *, struct splice_desc *, splice_direct_actor *); extern long do_splice(struct file *in, loff_t *off_in, From nobody Wed Feb 11 10:19:24 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org 
[23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F7CFC6FD1D for ; Wed, 15 Mar 2023 16:37:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232212AbjCOQh0 (ORCPT ); Wed, 15 Mar 2023 12:37:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36728 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232154AbjCOQhD (ORCPT ); Wed, 15 Mar 2023 12:37:03 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A26F36F4B3 for ; Wed, 15 Mar 2023 09:36:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678898168; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=imlbxN8+e/h/7fdYpvOZBQo8e/sPfLa+LUB+/4kjN58=; b=I8lhhXuV6OzXKczJwuv/rsrOynpk8jbTSfyLimvcQx83gAQ7U/qwRf9T5KMLKdyIFjbUCW YmzDeCLiSwQZvdAOsIpViNhUfSXk9J2gMvOX7HoSzUCJVBviegG6ackZQPBTDsLmR8AXGs YNkyFdWH6hqBT7Z8R1sc5qSqN57s7Ig= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-296-Ki0JtdioOVChXpS3qKCPXQ-1; Wed, 15 Mar 2023 12:36:06 -0400 X-MC-Unique: Ki0JtdioOVChXpS3qKCPXQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 2BDF888B7A0; Wed, 15 Mar 2023 16:36:02 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7DE122166B26; Wed, 15 Mar 2023 16:35:59 +0000 (UTC) From: David Howells To: Jens 
Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Christian Brauner , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Daniel Golle , Guenter Roeck , Christoph Hellwig , John Hubbard , Hugh Dickins Subject: [PATCH v19 03/15] shmem: Implement splice-read Date: Wed, 15 Mar 2023 16:35:37 +0000 Message-Id: <20230315163549.295454-4-dhowells@redhat.com> In-Reply-To: <20230315163549.295454-1-dhowells@redhat.com> References: <20230315163549.295454-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The new filemap_splice_read() has an implicit expectation via filemap_get_pages() that ->read_folio() exists if ->readahead() doesn't fully populate the pagecache of the file it is reading from[1], potentially leading to a jump to NULL if it doesn't exist. shmem, however (and by extension tmpfs, ramfs and rootfs), doesn't have ->read_folio(). Work around this by equipping shmem with its own splice-read implementation, based on filemap_splice_read(), but able to paste in zero_page when there's a page missing. Signed-off-by: David Howells cc: Daniel Golle cc: Guenter Roeck cc: Christoph Hellwig cc: Jens Axboe cc: Al Viro cc: John Hubbard cc: David Hildenbrand cc: Matthew Wilcox cc: Hugh Dickins cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/Y+pdHFFTk1TTEBsO@makrotopia.org/ [1] --- Notes: ver #19) - Remove a missed get_page() on the zero page. =20 ver #18) - Don't take/release a ref on the zero page. 
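The missing-page case can be observed from userspace, since a memfd is backed by shmem and a freshly extended one is all hole. This is a rough sketch under assumptions, not a test from the series: count_zeros_spliced_from_hole() is a hypothetical helper, memfd_create() is assumed to be available (glibc 2.27 or later), and the size is capped at one 4096-byte buffer to keep the example small.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a shmem-backed file that is one hole of
 * `size` bytes, splice it into a pipe and count how many of the bytes
 * read back are zero.  Returns that count, or -1 on error.
 */
static ssize_t count_zeros_spliced_from_hole(size_t size)
{
	unsigned char buf[4096];
	loff_t off = 0;
	ssize_t n, zeros = -1;
	int pfd[2], fd;

	if (size > sizeof(buf))
		size = sizeof(buf);

	fd = memfd_create("splice-hole-sketch", 0);
	if (fd < 0)
		return -1;

	/* Extending an empty file leaves no pages behind the new range. */
	if (ftruncate(fd, (off_t)size) < 0 || pipe(pfd) < 0) {
		close(fd);
		return -1;
	}

	n = splice(fd, &off, pfd[1], NULL, size, 0);
	if (n > 0)
		n = read(pfd[0], buf, (size_t)n);
	if (n >= 0) {
		zeros = 0;
		for (ssize_t i = 0; i < n; i++)
			zeros += (buf[i] == 0);
	}

	close(pfd[0]);
	close(pfd[1]);
	close(fd);
	return zeros;
}
```

Whether the kernel serves the hole from the shared zero page, as this patch does, or by some other route is not visible to the caller; the sketch only shows that the spliced bytes should read back as zeros.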
mm/shmem.c | 134 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 133 insertions(+), 1 deletion(-) diff --git a/mm/shmem.c b/mm/shmem.c index 448f393d8ab2..a0c268dcf7b8 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -2719,6 +2719,138 @@ static ssize_t shmem_file_read_iter(struct kiocb *i= ocb, struct iov_iter *to) return retval ? retval : error; } =20 +static bool zero_pipe_buf_get(struct pipe_inode_info *pipe, + struct pipe_buffer *buf) +{ + return true; +} + +static void zero_pipe_buf_release(struct pipe_inode_info *pipe, + struct pipe_buffer *buf) +{ +} + +static bool zero_pipe_buf_try_steal(struct pipe_inode_info *pipe, + struct pipe_buffer *buf) +{ + return false; +} + +static const struct pipe_buf_operations zero_pipe_buf_ops =3D { + .release =3D zero_pipe_buf_release, + .try_steal =3D zero_pipe_buf_try_steal, + .get =3D zero_pipe_buf_get, +}; + +static size_t splice_zeropage_into_pipe(struct pipe_inode_info *pipe, + loff_t fpos, size_t size) +{ + size_t offset =3D fpos & ~PAGE_MASK; + + size =3D min_t(size_t, size, PAGE_SIZE - offset); + + if (!pipe_full(pipe->head, pipe->tail, pipe->max_usage)) { + struct pipe_buffer *buf =3D pipe_head_buf(pipe); + + *buf =3D (struct pipe_buffer) { + .ops =3D &zero_pipe_buf_ops, + .page =3D ZERO_PAGE(0), + .offset =3D offset, + .len =3D size, + }; + pipe->head++; + } + + return size; +} + +static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos, + struct pipe_inode_info *pipe, + size_t len, unsigned int flags) +{ + struct inode *inode =3D file_inode(in); + struct address_space *mapping =3D inode->i_mapping; + struct folio *folio =3D NULL; + size_t total_spliced =3D 0, used, npages, n, part; + loff_t isize; + int error =3D 0; + + /* Work out how much data we can actually add into the pipe */ + used =3D pipe_occupancy(pipe->head, pipe->tail); + npages =3D max_t(ssize_t, pipe->max_usage - used, 0); + len =3D min_t(size_t, len, npages * PAGE_SIZE); + + do { + if (*ppos >=3D i_size_read(inode)) + 
break; + + error =3D shmem_get_folio(inode, *ppos / PAGE_SIZE, &folio, SGP_READ); + if (error) { + if (error =3D=3D -EINVAL) + error =3D 0; + break; + } + if (folio) { + folio_unlock(folio); + + if (folio_test_hwpoison(folio)) { + error =3D -EIO; + break; + } + } + + /* + * i_size must be checked after we know the pages are Uptodate. + * + * Checking i_size after the check allows us to calculate + * the correct value for "nr", which means the zero-filled + * part of the page is not copied back to userspace (unless + * another truncate extends the file - this is desired though). + */ + isize =3D i_size_read(inode); + if (unlikely(*ppos >=3D isize)) + break; + part =3D min_t(loff_t, isize - *ppos, len); + + if (folio) { + /* + * If users can be writing to this page using arbitrary + * virtual addresses, take care about potential aliasing + * before reading the page on the kernel side. + */ + if (mapping_writably_mapped(mapping)) + flush_dcache_folio(folio); + folio_mark_accessed(folio); + /* + * Ok, we have the page, and it's up-to-date, so we can + * now splice it into the pipe. + */ + n =3D splice_folio_into_pipe(pipe, folio, *ppos, part); + folio_put(folio); + folio =3D NULL; + } else { + n =3D splice_zeropage_into_pipe(pipe, *ppos, len); + } + + if (!n) + break; + len -=3D n; + total_spliced +=3D n; + *ppos +=3D n; + in->f_ra.prev_pos =3D *ppos; + if (pipe_full(pipe->head, pipe->tail, pipe->max_usage)) + break; + + cond_resched(); + } while (len); + + if (folio) + folio_put(folio); + + file_accessed(in); + return total_spliced ? 
total_spliced : error; +} + static loff_t shmem_file_llseek(struct file *file, loff_t offset, int when= ce) { struct address_space *mapping =3D file->f_mapping; @@ -3938,7 +4070,7 @@ static const struct file_operations shmem_file_operat= ions =3D { .read_iter =3D shmem_file_read_iter, .write_iter =3D generic_file_write_iter, .fsync =3D noop_fsync, - .splice_read =3D generic_file_splice_read, + .splice_read =3D shmem_file_splice_read, .splice_write =3D iter_file_splice_write, .fallocate =3D shmem_fallocate, #endif From nobody Wed Feb 11 10:19:24 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 59CD1C7618B for ; Wed, 15 Mar 2023 16:37:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232279AbjCOQhb (ORCPT ); Wed, 15 Mar 2023 12:37:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36642 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232263AbjCOQhI (ORCPT ); Wed, 15 Mar 2023 12:37:08 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B303174A6C for ; Wed, 15 Mar 2023 09:36:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678898171; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=74aMisHJvcrsbS5g8R7662ZNEA+1A6hMiypJVvVd5yo=; b=JogAK2/AhjBAWlxTzpsgVTypL9q3zjnUGTTOwkHrk9HebcVKeyQJns0RECDhHi/tRZPIcD OMriBO2+k/pHJOMziVCsX8SEb2JoAGZ0gCDyQ/QwFsPiGUyBu7lvEj1j29c6u/HsT+EWt2 +ld6cukSjpMdFTGugLbTtelh8Jjvabc= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com 
[66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-552-0l7YBVunPwOyKLm554H8jA-1; Wed, 15 Mar 2023 12:36:08 -0400 X-MC-Unique: 0l7YBVunPwOyKLm554H8jA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 38265101A556; Wed, 15 Mar 2023 16:36:05 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id D47622027040; Wed, 15 Mar 2023 16:36:02 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Christian Brauner , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , John Hubbard , Miklos Szeredi , linux-unionfs@vger.kernel.org Subject: [PATCH v19 04/15] overlayfs: Implement splice-read Date: Wed, 15 Mar 2023 16:35:38 +0000 Message-Id: <20230315163549.295454-5-dhowells@redhat.com> In-Reply-To: <20230315163549.295454-1-dhowells@redhat.com> References: <20230315163549.295454-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Implement splice-read for overlayfs by passing the request down a layer rather than going through generic_file_splice_read() which is going to be changed to assume that ->read_folio() is present on buffered files. 
Signed-off-by: David Howells cc: Christoph Hellwig cc: Jens Axboe cc: Al Viro cc: John Hubbard cc: David Hildenbrand cc: Matthew Wilcox cc: Miklos Szeredi cc: linux-unionfs@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Reviewed-by: Christian Brauner Reviewed-by: David Hildenbrand --- Notes: ver #17) - Use vfs_splice_read() helper rather than open-coding checks. =20 ver #15) - Remove redundant FMODE_CAN_ODIRECT check on real file. - Do rw_verify_area() on the real file, not the overlay file. - Fix a file leak. fs/overlayfs/file.c | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c index 7c04f033aadd..86197882ff35 100644 --- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -419,6 +419,27 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, stru= ct iov_iter *iter) return ret; } =20 +static ssize_t ovl_splice_read(struct file *in, loff_t *ppos, + struct pipe_inode_info *pipe, size_t len, + unsigned int flags) +{ + const struct cred *old_cred; + struct fd real; + ssize_t ret; + + ret =3D ovl_real_fdget(in, &real); + if (ret) + return ret; + + old_cred =3D ovl_override_creds(file_inode(in)->i_sb); + ret =3D vfs_splice_read(real.file, ppos, pipe, len, flags); + revert_creds(old_cred); + ovl_file_accessed(in); + + fdput(real); + return ret; +} + /* * Calling iter_file_splice_write() directly from overlay's f_op may deadl= ock * due to lock order inversion between pipe->mutex in iter_file_splice_wri= te() @@ -695,7 +716,7 @@ const struct file_operations ovl_file_operations =3D { .fallocate =3D ovl_fallocate, .fadvise =3D ovl_fadvise, .flush =3D ovl_flush, - .splice_read =3D generic_file_splice_read, + .splice_read =3D ovl_splice_read, .splice_write =3D ovl_splice_write, =20 .copy_file_range =3D ovl_copy_file_range, From nobody Wed Feb 11 10:19:24 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 571EAC76195 for ; Wed, 15 Mar 2023 16:37:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232482AbjCOQhp (ORCPT ); Wed, 15 Mar 2023 12:37:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36640 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232204AbjCOQhQ (ORCPT ); Wed, 15 Mar 2023 12:37:16 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B4AA07339A for ; Wed, 15 Mar 2023 09:36:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678898174; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Mo6NWBR6hkfydtMnZeO+F5U5GhN+vbQRFCzRyccP/1I=; b=Ax5X/Dcy2Q7ZssjDd/P6clUV3srPYXL7uvQnCgNua452c2w9Xq3D0I9tK3Ku1wNyCeXW6j EmKnNWT73beFYNTXzO+uDzeN5LJRpf/ZoPDTD5MKwo03uOa9g+G+l0VUofDn1cNUucfCyS 3BBwKQ2M/Ifp2nShRrsxCtag+Ey5Gfw= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-217-WMsh2JaSO0acjNRTd_X4qA-1; Wed, 15 Mar 2023 12:36:11 -0400 X-MC-Unique: WMsh2JaSO0acjNRTd_X4qA-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 9123C2807D8F; Wed, 15 Mar 2023 16:36:08 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with 
ESMTP id CEEBB4042AC2; Wed, 15 Mar 2023 16:36:05 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Christian Brauner , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Jan Harkes , Christoph Hellwig , John Hubbard , coda@cs.cmu.edu, codalist@coda.cs.cmu.edu, linux-unionfs@vger.kernel.org Subject: [PATCH v19 05/15] coda: Implement splice-read Date: Wed, 15 Mar 2023 16:35:39 +0000 Message-Id: <20230315163549.295454-6-dhowells@redhat.com> In-Reply-To: <20230315163549.295454-1-dhowells@redhat.com> References: <20230315163549.295454-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Implement splice-read for coda by passing the request down a layer rather than going through generic_file_splice_read() which is going to be changed to assume that ->read_folio() is present on buffered files. Signed-off-by: David Howells Acked-by: Jan Harkes cc: Christoph Hellwig cc: Jens Axboe cc: Al Viro cc: John Hubbard cc: David Hildenbrand cc: Matthew Wilcox cc: coda@cs.cmu.edu cc: codalist@coda.cs.cmu.edu cc: linux-unionfs@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org --- Notes: ver #17) - Use vfs_splice_read() helper rather than open-coding checks. 
fs/coda/file.c | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/fs/coda/file.c b/fs/coda/file.c index 3f3c81e6b1ab..12b26bd13564 100644 --- a/fs/coda/file.c +++ b/fs/coda/file.c @@ -23,6 +23,7 @@ #include #include #include +#include =20 #include #include "coda_psdev.h" @@ -94,6 +95,32 @@ coda_file_write_iter(struct kiocb *iocb, struct iov_iter= *to) return ret; } =20 +static ssize_t +coda_file_splice_read(struct file *coda_file, loff_t *ppos, + struct pipe_inode_info *pipe, + size_t len, unsigned int flags) +{ + struct inode *coda_inode =3D file_inode(coda_file); + struct coda_file_info *cfi =3D coda_ftoc(coda_file); + struct file *in =3D cfi->cfi_container; + loff_t ki_pos =3D *ppos; + ssize_t ret; + + ret =3D venus_access_intent(coda_inode->i_sb, coda_i2f(coda_inode), + &cfi->cfi_access_intent, + len, ki_pos, CODA_ACCESS_TYPE_READ); + if (ret) + goto finish_read; + + ret =3D vfs_splice_read(in, ppos, pipe, len, flags); + +finish_read: + venus_access_intent(coda_inode->i_sb, coda_i2f(coda_inode), + &cfi->cfi_access_intent, + len, ki_pos, CODA_ACCESS_TYPE_READ_FINISH); + return ret; +} + static void coda_vm_open(struct vm_area_struct *vma) { @@ -302,5 +329,5 @@ const struct file_operations coda_file_operations =3D { .open =3D coda_open, .release =3D coda_release, .fsync =3D coda_fsync, - .splice_read =3D generic_file_splice_read, + .splice_read =3D coda_file_splice_read, }; From nobody Wed Feb 11 10:19:24 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4D1C1C61DA4 for ; Wed, 15 Mar 2023 16:37:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232327AbjCOQhs (ORCPT ); Wed, 15 Mar 2023 12:37:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37398 "EHLO lindbergh.monkeyblade.net" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232313AbjCOQhU (ORCPT ); Wed, 15 Mar 2023 12:37:20 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F2BD677E25 for ; Wed, 15 Mar 2023 09:36:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678898178; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WFhWed43Sc7081L0WA9mMaa7Q22YwjW6QNRxMcW8x9U=; b=UAG477Rv4Ne6G+8L2JwJELH5tc5m3D0CzRSFT9tMhcvws9MlyRRDkTsGiKkcb/mp8tGKjb ij5pQjoS7iY+GiKPRaA9GKlzmvFk7fNrWVkaCzhLkUGjaG+BTuH1s3KEgS78tNq8q+mEgB Y9JmfYc5kNzIL0v4ilmRxFD9VTKRcBI= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-483-kqwLoURCP6-7ZbSIoOf5VA-1; Wed, 15 Mar 2023 12:36:13 -0400 X-MC-Unique: kqwLoURCP6-7ZbSIoOf5VA-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C5C89185A794; Wed, 15 Mar 2023 16:36:11 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 47217400F4F; Wed, 15 Mar 2023 16:36:09 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Christian Brauner , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Greg Kroah-Hartman , Christoph 
Hellwig , John Hubbard , Miklos Szeredi , Arnd Bergmann Subject: [PATCH v19 06/15] tty, proc, kernfs, random: Use direct_splice_read() Date: Wed, 15 Mar 2023 16:35:40 +0000 Message-Id: <20230315163549.295454-7-dhowells@redhat.com> In-Reply-To: <20230315163549.295454-1-dhowells@redhat.com> References: <20230315163549.295454-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Use direct_splice_read() for tty, procfs, kernfs and random files rather than going through generic_file_splice_read() as they just copy the file into the output buffer and don't splice pages. This avoids the need for them to have a ->read_folio() to satisfy filemap_splice_read(). Signed-off-by: David Howells Acked-by: Greg Kroah-Hartman cc: Christoph Hellwig cc: Jens Axboe cc: Al Viro cc: John Hubbard cc: David Hildenbrand cc: Matthew Wilcox cc: Miklos Szeredi cc: Arnd Bergmann cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org --- drivers/char/random.c | 4 ++-- drivers/tty/tty_io.c | 4 ++-- fs/kernfs/file.c | 2 +- fs/proc/inode.c | 4 ++-- fs/proc/proc_sysctl.c | 2 +- fs/proc_namespace.c | 6 +++--- 6 files changed, 11 insertions(+), 11 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index ce3ccd172cc8..792713616ba8 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -1546,7 +1546,7 @@ const struct file_operations random_fops =3D { .compat_ioctl =3D compat_ptr_ioctl, .fasync =3D random_fasync, .llseek =3D noop_llseek, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .splice_write =3D iter_file_splice_write, }; =20 @@ -1557,7 +1557,7 @@ const struct file_operations urandom_fops =3D { .compat_ioctl =3D compat_ptr_ioctl, .fasync =3D random_fasync, .llseek =3D noop_llseek, - .splice_read =3D 
generic_file_splice_read, + .splice_read =3D direct_splice_read, .splice_write =3D iter_file_splice_write, }; =20 diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c index 36fb945fdad4..9d117e579dfb 100644 --- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -466,7 +466,7 @@ static const struct file_operations tty_fops =3D { .llseek =3D no_llseek, .read_iter =3D tty_read, .write_iter =3D tty_write, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .splice_write =3D iter_file_splice_write, .poll =3D tty_poll, .unlocked_ioctl =3D tty_ioctl, @@ -481,7 +481,7 @@ static const struct file_operations console_fops =3D { .llseek =3D no_llseek, .read_iter =3D tty_read, .write_iter =3D redirected_tty_write, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .splice_write =3D iter_file_splice_write, .poll =3D tty_poll, .unlocked_ioctl =3D tty_ioctl, diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c index e4a50e4ff0d2..9d23b8141db7 100644 --- a/fs/kernfs/file.c +++ b/fs/kernfs/file.c @@ -1011,7 +1011,7 @@ const struct file_operations kernfs_file_fops =3D { .release =3D kernfs_fop_release, .poll =3D kernfs_fop_poll, .fsync =3D noop_fsync, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .splice_write =3D iter_file_splice_write, }; =20 diff --git a/fs/proc/inode.c b/fs/proc/inode.c index f495fdb39151..711f12706469 100644 --- a/fs/proc/inode.c +++ b/fs/proc/inode.c @@ -591,7 +591,7 @@ static const struct file_operations proc_iter_file_ops = =3D { .llseek =3D proc_reg_llseek, .read_iter =3D proc_reg_read_iter, .write =3D proc_reg_write, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .poll =3D proc_reg_poll, .unlocked_ioctl =3D proc_reg_unlocked_ioctl, .mmap =3D proc_reg_mmap, @@ -617,7 +617,7 @@ static const struct file_operations proc_reg_file_ops_c= ompat =3D { static const struct file_operations proc_iter_file_ops_compat =3D { .llseek 
=3D proc_reg_llseek, .read_iter =3D proc_reg_read_iter, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .write =3D proc_reg_write, .poll =3D proc_reg_poll, .unlocked_ioctl =3D proc_reg_unlocked_ioctl, diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c index 5851eb5bc726..e49f99657d1c 100644 --- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -869,7 +869,7 @@ static const struct file_operations proc_sys_file_opera= tions =3D { .poll =3D proc_sys_poll, .read_iter =3D proc_sys_read, .write_iter =3D proc_sys_write, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .splice_write =3D iter_file_splice_write, .llseek =3D default_llseek, }; diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c index 846f9455ae22..492abbbeff5e 100644 --- a/fs/proc_namespace.c +++ b/fs/proc_namespace.c @@ -324,7 +324,7 @@ static int mountstats_open(struct inode *inode, struct = file *file) const struct file_operations proc_mounts_operations =3D { .open =3D mounts_open, .read_iter =3D seq_read_iter, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .llseek =3D seq_lseek, .release =3D mounts_release, .poll =3D mounts_poll, @@ -333,7 +333,7 @@ const struct file_operations proc_mounts_operations =3D= { const struct file_operations proc_mountinfo_operations =3D { .open =3D mountinfo_open, .read_iter =3D seq_read_iter, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .llseek =3D seq_lseek, .release =3D mounts_release, .poll =3D mounts_poll, @@ -342,7 +342,7 @@ const struct file_operations proc_mountinfo_operations = =3D { const struct file_operations proc_mountstats_operations =3D { .open =3D mountstats_open, .read_iter =3D seq_read_iter, - .splice_read =3D generic_file_splice_read, + .splice_read =3D direct_splice_read, .llseek =3D seq_lseek, .release =3D mounts_release, }; From nobody Wed Feb 11 10:19:24 2026 Return-Path: 
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 76116C61DA4 for ; Wed, 15 Mar 2023 16:38:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232128AbjCOQiF (ORCPT ); Wed, 15 Mar 2023 12:38:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37458 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232252AbjCOQh2 (ORCPT ); Wed, 15 Mar 2023 12:37:28 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1B5196C18F for ; Wed, 15 Mar 2023 09:36:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678898181; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tPHDzTeiqkehQq1hzvPvPGknkMXTxgyXWdFbtANbawg=; b=YO9ETLPnnohMqmlfQR79OPpBzZvoSoa5LArSZ9uLU+IGs5JVvsg6/0vMGUu9lpF0lMwK/8 k8pHYzlOTsnhVrFgW3tPnAeMqQHM3tk4wY9MOjzOOnHj9C8yHTPUAGGctyvD0lzMddntvH cY1SHcAudqJL2DFfvhfPqerqRgpdopM= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-84-WUgqb5XdM1OrTaCbnHUsDQ-1; Wed, 15 Mar 2023 12:36:15 -0400 X-MC-Unique: WUgqb5XdM1OrTaCbnHUsDQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id B3540185A78F; Wed, 15 Mar 2023 16:36:14 +0000 (UTC) Received: from warthog.procyon.org.uk 
(unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 639B01121314; Wed, 15 Mar 2023 16:36:12 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Christian Brauner , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , Steve French , John Hubbard , linux-cifs@vger.kernel.org Subject: [PATCH v19 07/15] splice: Do splice read from a file without using ITER_PIPE Date: Wed, 15 Mar 2023 16:35:41 +0000 Message-Id: <20230315163549.295454-8-dhowells@redhat.com> In-Reply-To: <20230315163549.295454-1-dhowells@redhat.com> References: <20230315163549.295454-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Make generic_file_splice_read() use filemap_splice_read() and direct_splice_read() rather than using an ITER_PIPE and call_read_iter(). With this, ITER_PIPE is no longer used. Signed-off-by: David Howells Reviewed-by: Christoph Hellwig cc: Jens Axboe cc: Steve French cc: Al Viro cc: David Hildenbrand cc: John Hubbard cc: linux-mm@kvack.org cc: linux-block@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christian Brauner --- Notes: ver #18) - Split out the change to cifs to make it use generic_file_splice_read= (). - Split out the unexport of filemap_splice_read() (still needed by cif= s). 
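For anyone wanting to exercise this path from userspace, here is a minimal sketch; it is not part of the patch. It assumes Linux with a libc that exposes splice(2) under _GNU_SOURCE and a writable /tmp. splice_file_to_buffer() and splice_demo() are hypothetical names made up for the example. The file is opened without O_DIRECT, so on a filesystem whose ->splice_read is generic_file_splice_read() the request should end up in filemap_splice_read(); opening with O_DIRECT would be expected to go through direct_splice_read() instead.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* needed for splice(2) with glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: splice up to @len bytes from
 * the start of @path into a pipe, then drain the pipe into @buf.
 * Returns the number of bytes stored in @buf, or -1 on failure.
 */
static ssize_t splice_file_to_buffer(const char *path, char *buf, size_t len)
{
	int pipefd[2], fd, saved_errno;
	loff_t pos = 0;
	ssize_t spliced, got, ret = -1;
	size_t done = 0;

	fd = open(path, O_RDONLY);	/* buffered open: no O_DIRECT */
	if (fd < 0)
		return -1;
	if (pipe(pipefd) < 0)
		goto out_close_file;

	/* file -> pipe; the kernel services this via ->splice_read() */
	spliced = splice(fd, &pos, pipefd[1], NULL, len, 0);
	if (spliced < 0)
		goto out_close_pipe;

	/* pipe -> user buffer, so that the spliced data can be inspected */
	while (done < (size_t)spliced) {
		got = read(pipefd[0], buf + done, (size_t)spliced - done);
		if (got <= 0)
			goto out_close_pipe;
		done += (size_t)got;
	}
	ret = (ssize_t)done;

out_close_pipe:
	saved_errno = errno;
	close(pipefd[0]);
	close(pipefd[1]);
	errno = saved_errno;
out_close_file:
	saved_errno = errno;
	close(fd);
	errno = saved_errno;
	return ret;
}

/*
 * Hypothetical self-check: write a short string to a temporary file and
 * splice it back.  Returns 0 if the bytes round-trip, -1 otherwise.
 */
int splice_demo(void)
{
	static const char msg[] = "splice me";
	char path[] = "/tmp/splice-demo-XXXXXX";
	char buf[64];
	ssize_t n = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	if (write(fd, msg, sizeof(msg) - 1) == (ssize_t)(sizeof(msg) - 1))
		n = splice_file_to_buffer(path, buf, sizeof(buf));
	close(fd);
	unlink(path);

	if (n != (ssize_t)(sizeof(msg) - 1))
		return -1;
	return memcmp(buf, msg, sizeof(msg) - 1) ? -1 : 0;
}
```

Calling splice_demo() from a small main() should return 0 wherever splicing from regular files is supported. A single splice() call may move fewer bytes than requested, since the pipe capacity limits it, which is why the helper drains only what was actually spliced.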
fs/splice.c | 30 +++++++----------------------- 1 file changed, 7 insertions(+), 23 deletions(-) diff --git a/fs/splice.c b/fs/splice.c index 90ccd3666dca..f46dd1fb367b 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -387,29 +387,13 @@ ssize_t generic_file_splice_read(struct file *in, lof= f_t *ppos, struct pipe_inode_info *pipe, size_t len, unsigned int flags) { - struct iov_iter to; - struct kiocb kiocb; - int ret; - - iov_iter_pipe(&to, ITER_DEST, pipe, len); - init_sync_kiocb(&kiocb, in); - kiocb.ki_pos =3D *ppos; - ret =3D call_read_iter(in, &kiocb, &to); - if (ret > 0) { - *ppos =3D kiocb.ki_pos; - file_accessed(in); - } else if (ret < 0) { - /* free what was emitted */ - pipe_discard_from(pipe, to.start_head); - /* - * callers of ->splice_read() expect -EAGAIN on - * "can't put anything in there", rather than -EFAULT. - */ - if (ret =3D=3D -EFAULT) - ret =3D -EAGAIN; - } - - return ret; + if (unlikely(*ppos >=3D file_inode(in)->i_sb->s_maxbytes)) + return 0; + if (unlikely(!len)) + return 0; + if (in->f_flags & O_DIRECT) + return direct_splice_read(in, ppos, pipe, len, flags); + return filemap_splice_read(in, ppos, pipe, len, flags); } EXPORT_SYMBOL(generic_file_splice_read); From nobody Wed Feb 11 10:19:24 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 09270C76195 for ; Wed, 15 Mar 2023 16:38:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231436AbjCOQiJ (ORCPT ); Wed, 15 Mar 2023 12:38:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37622 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229556AbjCOQhc (ORCPT ); Wed, 15 Mar 2023 12:37:32 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net 
(Postfix) with ESMTPS id 66DDB6A9F4 for ; Wed, 15 Mar 2023 09:36:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678898184; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yFE/8AaBiLqJLRp8UNCaoGhfMwL8QvYGE4o0n7cdHXc=; b=DNWVp+LVl3nz9bBBm3oJweZj3KYJRwC3mfzA3aQjziEurNS4C+jrP04x80/VjWIEIkroiH NJx+7HVE/pnc4LIyiG95jY+TGj0OFX+SHskWJ+xz9NcGIslbs2i6lW8+KOy4lOmqXVFONw 9emUuxRe1xDVdQo9zFQXnUZBm+r75q4= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-643-PMihXXm7NQapVKheb2ClNg-1; Wed, 15 Mar 2023 12:36:19 -0400 X-MC-Unique: PMihXXm7NQapVKheb2ClNg-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id D87E91C08967; Wed, 15 Mar 2023 16:36:17 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6FCB147507A; Wed, 15 Mar 2023 16:36:15 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Christian Brauner , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , Paulo Alcantara , Steve French , John Hubbard , linux-cifs@vger.kernel.org Subject: [PATCH v19 08/15] cifs: Use generic_file_splice_read() Date: Wed, 15 Mar 2023 16:35:42 +0000 Message-Id: <20230315163549.295454-9-dhowells@redhat.com> In-Reply-To: 
<20230315163549.295454-1-dhowells@redhat.com> References: <20230315163549.295454-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Make cifs use generic_file_splice_read() rather than doing it for itself. As a consequence, filemap_splice_read() no longer needs to be exported. Signed-off-by: David Howells Reviewed-by: Christoph Hellwig Reviewed-by: Paulo Alcantara (SUSE) cc: Jens Axboe cc: Steve French cc: Al Viro cc: David Hildenbrand cc: John Hubbard cc: linux-mm@kvack.org cc: linux-block@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org --- Notes: ver #18) - Split out from change to generic_file_splice_read(). fs/cifs/cifsfs.c | 8 ++++---- fs/cifs/cifsfs.h | 3 --- fs/cifs/file.c | 16 ---------------- mm/filemap.c | 1 - 4 files changed, 4 insertions(+), 24 deletions(-) diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c index cbcf210d56e4..ba963a26cb19 100644 --- a/fs/cifs/cifsfs.c +++ b/fs/cifs/cifsfs.c @@ -1359,7 +1359,7 @@ const struct file_operations cifs_file_ops =3D { .fsync =3D cifs_fsync, .flush =3D cifs_flush, .mmap =3D cifs_file_mmap, - .splice_read =3D cifs_splice_read, + .splice_read =3D generic_file_splice_read, .splice_write =3D iter_file_splice_write, .llseek =3D cifs_llseek, .unlocked_ioctl =3D cifs_ioctl, @@ -1379,7 +1379,7 @@ const struct file_operations cifs_file_strict_ops =3D= { .fsync =3D cifs_strict_fsync, .flush =3D cifs_flush, .mmap =3D cifs_file_strict_mmap, - .splice_read =3D cifs_splice_read, + .splice_read =3D generic_file_splice_read, .splice_write =3D iter_file_splice_write, .llseek =3D cifs_llseek, .unlocked_ioctl =3D cifs_ioctl, @@ -1417,7 +1417,7 @@ const struct file_operations cifs_file_nobrl_ops =3D { .fsync =3D cifs_fsync, .flush =3D cifs_flush, .mmap =3D cifs_file_mmap, - .splice_read =3D cifs_splice_read, + 
.splice_read =3D generic_file_splice_read, .splice_write =3D iter_file_splice_write, .llseek =3D cifs_llseek, .unlocked_ioctl =3D cifs_ioctl, @@ -1435,7 +1435,7 @@ const struct file_operations cifs_file_strict_nobrl_o= ps =3D { .fsync =3D cifs_strict_fsync, .flush =3D cifs_flush, .mmap =3D cifs_file_strict_mmap, - .splice_read =3D cifs_splice_read, + .splice_read =3D generic_file_splice_read, .splice_write =3D iter_file_splice_write, .llseek =3D cifs_llseek, .unlocked_ioctl =3D cifs_ioctl, diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h index 71fe0a0a7992..8b239854e590 100644 --- a/fs/cifs/cifsfs.h +++ b/fs/cifs/cifsfs.h @@ -100,9 +100,6 @@ extern ssize_t cifs_strict_readv(struct kiocb *iocb, st= ruct iov_iter *to); extern ssize_t cifs_user_writev(struct kiocb *iocb, struct iov_iter *from); extern ssize_t cifs_direct_writev(struct kiocb *iocb, struct iov_iter *fro= m); extern ssize_t cifs_strict_writev(struct kiocb *iocb, struct iov_iter *fro= m); -extern ssize_t cifs_splice_read(struct file *in, loff_t *ppos, - struct pipe_inode_info *pipe, size_t len, - unsigned int flags); extern int cifs_flock(struct file *pfile, int cmd, struct file_lock *plock= ); extern int cifs_lock(struct file *, int, struct file_lock *); extern int cifs_fsync(struct file *, loff_t, loff_t, int); diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 4d4a2d82636d..321f9b7c84c9 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -5066,19 +5066,3 @@ const struct address_space_operations cifs_addr_ops_= smallbuf =3D { .launder_folio =3D cifs_launder_folio, .migrate_folio =3D filemap_migrate_folio, }; - -/* - * Splice data from a file into a pipe. 
- */ -ssize_t cifs_splice_read(struct file *in, loff_t *ppos, - struct pipe_inode_info *pipe, size_t len, - unsigned int flags) -{ - if (unlikely(*ppos >=3D file_inode(in)->i_sb->s_maxbytes)) - return 0; - if (unlikely(!len)) - return 0; - if (in->f_flags & O_DIRECT) - return direct_splice_read(in, ppos, pipe, len, flags); - return filemap_splice_read(in, ppos, pipe, len, flags); -} diff --git a/mm/filemap.c b/mm/filemap.c index 2723104cc06a..3a93515ae2ed 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2967,7 +2967,6 @@ ssize_t filemap_splice_read(struct file *in, loff_t *= ppos, =20 return total_spliced ? total_spliced : error; } -EXPORT_SYMBOL(filemap_splice_read); =20 static inline loff_t folio_seek_hole_data(struct xa_state *xas, struct address_space *mapping, struct folio *folio, From nobody Wed Feb 11 10:19:24 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 32D96C7618B for ; Wed, 15 Mar 2023 16:38:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232438AbjCOQiL (ORCPT ); Wed, 15 Mar 2023 12:38:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37788 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232208AbjCOQhh (ORCPT ); Wed, 15 Mar 2023 12:37:37 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 900B96EB9C for ; Wed, 15 Mar 2023 09:36:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678898185; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=XsRpjIP88f7UuB0TiBadjal83kAgF2un+t2WHwzlM/w=; b=c/jGgXEVzT6740oVvMpOiIOeV1nMCADBzoz46RBbA68BvmraPw1e3W+SgfGQA8GN//7M9D FPF9fjehRoPOVxMT8XUInp1vjuNZ3vT05U4TDZGZyaBHFf+QxiBEnO8V7/L+Ep41XboPPD ac4xS56Lrmb1ALxfgPaPDiFI5iqdVpU= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-48-L1wUMuA2PWupoSIpynxtJg-1; Wed, 15 Mar 2023 12:36:21 -0400 X-MC-Unique: L1wUMuA2PWupoSIpynxtJg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id B3F843C0F23B; Wed, 15 Mar 2023 16:36:20 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7985F2166B26; Wed, 15 Mar 2023 16:36:18 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Christian Brauner , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , John Hubbard Subject: [PATCH v19 09/15] iov_iter: Kill ITER_PIPE Date: Wed, 15 Mar 2023 16:35:43 +0000 Message-Id: <20230315163549.295454-10-dhowells@redhat.com> In-Reply-To: <20230315163549.295454-1-dhowells@redhat.com> References: <20230315163549.295454-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The ITER_PIPE-type iterator was only used for generic_file_splice_read(), but that has now been switched to either pull pages directly from the pagecache for 
buffered file splice-reads or to use ITER_BVEC instead for O_DIRECT file splice-reads. This leaves ITER_PIPE unused - so remove it. Signed-off-by: David Howells Reviewed-by: Christoph Hellwig cc: Jens Axboe cc: Al Viro cc: David Hildenbrand cc: John Hubbard cc: linux-mm@kvack.org cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christian Brauner --- include/linux/uio.h | 14 -- lib/iov_iter.c | 429 +------------------------------------------- mm/filemap.c | 3 +- 3 files changed, 4 insertions(+), 442 deletions(-) diff --git a/include/linux/uio.h b/include/linux/uio.h index 27e3fd942960..74598426edb4 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -11,7 +11,6 @@ #include =20 struct page; -struct pipe_inode_info; =20 typedef unsigned int __bitwise iov_iter_extraction_t; =20 @@ -25,7 +24,6 @@ enum iter_type { ITER_IOVEC, ITER_KVEC, ITER_BVEC, - ITER_PIPE, ITER_XARRAY, ITER_DISCARD, ITER_UBUF, @@ -55,15 +53,10 @@ struct iov_iter { const struct kvec *kvec; const struct bio_vec *bvec; struct xarray *xarray; - struct pipe_inode_info *pipe; void __user *ubuf; }; union { unsigned long nr_segs; - struct { - unsigned int head; - unsigned int start_head; - }; loff_t xarray_start; }; }; @@ -101,11 +94,6 @@ static inline bool iov_iter_is_bvec(const struct iov_it= er *i) return iov_iter_type(i) =3D=3D ITER_BVEC; } =20 -static inline bool iov_iter_is_pipe(const struct iov_iter *i) -{ - return iov_iter_type(i) =3D=3D ITER_PIPE; -} - static inline bool iov_iter_is_discard(const struct iov_iter *i) { return iov_iter_type(i) =3D=3D ITER_DISCARD; @@ -247,8 +235,6 @@ void iov_iter_kvec(struct iov_iter *i, unsigned int dir= ection, const struct kvec unsigned long nr_segs, size_t count); void iov_iter_bvec(struct iov_iter *i, unsigned int direction, const struc= t bio_vec *bvec, unsigned long nr_segs, size_t count); -void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe= _inode_info *pipe, - size_t count); void 
iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t c= ount); void iov_iter_xarray(struct iov_iter *i, unsigned int direction, struct xa= rray *xarray, loff_t start, size_t count); diff --git a/lib/iov_iter.c b/lib/iov_iter.c index 274014e4eafe..fad95e4cf372 100644 --- a/lib/iov_iter.c +++ b/lib/iov_iter.c @@ -14,8 +14,6 @@ #include #include =20 -#define PIPE_PARANOIA /* for now */ - /* covers ubuf and kbuf alike */ #define iterate_buf(i, n, base, len, off, __p, STEP) { \ size_t __maybe_unused off =3D 0; \ @@ -186,150 +184,6 @@ static int copyin(void *to, const void __user *from, = size_t n) return res; } =20 -#ifdef PIPE_PARANOIA -static bool sanity(const struct iov_iter *i) -{ - struct pipe_inode_info *pipe =3D i->pipe; - unsigned int p_head =3D pipe->head; - unsigned int p_tail =3D pipe->tail; - unsigned int p_occupancy =3D pipe_occupancy(p_head, p_tail); - unsigned int i_head =3D i->head; - unsigned int idx; - - if (i->last_offset) { - struct pipe_buffer *p; - if (unlikely(p_occupancy =3D=3D 0)) - goto Bad; // pipe must be non-empty - if (unlikely(i_head !=3D p_head - 1)) - goto Bad; // must be at the last buffer... - - p =3D pipe_buf(pipe, i_head); - if (unlikely(p->offset + p->len !=3D abs(i->last_offset))) - goto Bad; // ... 
at the end of segment - } else { - if (i_head !=3D p_head) - goto Bad; // must be right after the last buffer - } - return true; -Bad: - printk(KERN_ERR "idx =3D %d, offset =3D %d\n", i_head, i->last_offset); - printk(KERN_ERR "head =3D %d, tail =3D %d, buffers =3D %d\n", - p_head, p_tail, pipe->ring_size); - for (idx =3D 0; idx < pipe->ring_size; idx++) - printk(KERN_ERR "[%p %p %d %d]\n", - pipe->bufs[idx].ops, - pipe->bufs[idx].page, - pipe->bufs[idx].offset, - pipe->bufs[idx].len); - WARN_ON(1); - return false; -} -#else -#define sanity(i) true -#endif - -static struct page *push_anon(struct pipe_inode_info *pipe, unsigned size) -{ - struct page *page =3D alloc_page(GFP_USER); - if (page) { - struct pipe_buffer *buf =3D pipe_buf(pipe, pipe->head++); - *buf =3D (struct pipe_buffer) { - .ops =3D &default_pipe_buf_ops, - .page =3D page, - .offset =3D 0, - .len =3D size - }; - } - return page; -} - -static void push_page(struct pipe_inode_info *pipe, struct page *page, - unsigned int offset, unsigned int size) -{ - struct pipe_buffer *buf =3D pipe_buf(pipe, pipe->head++); - *buf =3D (struct pipe_buffer) { - .ops =3D &page_cache_pipe_buf_ops, - .page =3D page, - .offset =3D offset, - .len =3D size - }; - get_page(page); -} - -static inline int last_offset(const struct pipe_buffer *buf) -{ - if (buf->ops =3D=3D &default_pipe_buf_ops) - return buf->len; // buf->offset is 0 for those - else - return -(buf->offset + buf->len); -} - -static struct page *append_pipe(struct iov_iter *i, size_t size, - unsigned int *off) -{ - struct pipe_inode_info *pipe =3D i->pipe; - int offset =3D i->last_offset; - struct pipe_buffer *buf; - struct page *page; - - if (offset > 0 && offset < PAGE_SIZE) { - // some space in the last buffer; add to it - buf =3D pipe_buf(pipe, pipe->head - 1); - size =3D min_t(size_t, size, PAGE_SIZE - offset); - buf->len +=3D size; - i->last_offset +=3D size; - i->count -=3D size; - *off =3D offset; - return buf->page; - } - // OK, we need a new buffer - 
*off =3D 0; - size =3D min_t(size_t, size, PAGE_SIZE); - if (pipe_full(pipe->head, pipe->tail, pipe->max_usage)) - return NULL; - page =3D push_anon(pipe, size); - if (!page) - return NULL; - i->head =3D pipe->head - 1; - i->last_offset =3D size; - i->count -=3D size; - return page; -} - -static size_t copy_page_to_iter_pipe(struct page *page, size_t offset, siz= e_t bytes, - struct iov_iter *i) -{ - struct pipe_inode_info *pipe =3D i->pipe; - unsigned int head =3D pipe->head; - - if (unlikely(bytes > i->count)) - bytes =3D i->count; - - if (unlikely(!bytes)) - return 0; - - if (!sanity(i)) - return 0; - - if (offset && i->last_offset =3D=3D -offset) { // could we merge it? - struct pipe_buffer *buf =3D pipe_buf(pipe, head - 1); - if (buf->page =3D=3D page) { - buf->len +=3D bytes; - i->last_offset -=3D bytes; - i->count -=3D bytes; - return bytes; - } - } - if (pipe_full(pipe->head, pipe->tail, pipe->max_usage)) - return 0; - - push_page(pipe, page, offset, bytes); - i->last_offset =3D -(offset + bytes); - i->head =3D head; - i->count -=3D bytes; - return bytes; -} - /* * fault_in_iov_iter_readable - fault in iov iterator for reading * @i: iterator @@ -433,46 +287,6 @@ void iov_iter_init(struct iov_iter *i, unsigned int di= rection, } EXPORT_SYMBOL(iov_iter_init); =20 -// returns the offset in partial buffer (if any) -static inline unsigned int pipe_npages(const struct iov_iter *i, int *npag= es) -{ - struct pipe_inode_info *pipe =3D i->pipe; - int used =3D pipe->head - pipe->tail; - int off =3D i->last_offset; - - *npages =3D max((int)pipe->max_usage - used, 0); - - if (off > 0 && off < PAGE_SIZE) { // anon and not full - (*npages)++; - return off; - } - return 0; -} - -static size_t copy_pipe_to_iter(const void *addr, size_t bytes, - struct iov_iter *i) -{ - unsigned int off, chunk; - - if (unlikely(bytes > i->count)) - bytes =3D i->count; - if (unlikely(!bytes)) - return 0; - - if (!sanity(i)) - return 0; - - for (size_t n =3D bytes; n; n -=3D chunk) { - struct 
page *page =3D append_pipe(i, n, &off); - chunk =3D min_t(size_t, n, PAGE_SIZE - off); - if (!page) - return bytes - n; - memcpy_to_page(page, off, addr, chunk); - addr +=3D chunk; - } - return bytes; -} - static __wsum csum_and_memcpy(void *to, const void *from, size_t len, __wsum sum, size_t off) { @@ -480,44 +294,10 @@ static __wsum csum_and_memcpy(void *to, const void *f= rom, size_t len, return csum_block_add(sum, next, off); } =20 -static size_t csum_and_copy_to_pipe_iter(const void *addr, size_t bytes, - struct iov_iter *i, __wsum *sump) -{ - __wsum sum =3D *sump; - size_t off =3D 0; - unsigned int chunk, r; - - if (unlikely(bytes > i->count)) - bytes =3D i->count; - if (unlikely(!bytes)) - return 0; - - if (!sanity(i)) - return 0; - - while (bytes) { - struct page *page =3D append_pipe(i, bytes, &r); - char *p; - - if (!page) - break; - chunk =3D min_t(size_t, bytes, PAGE_SIZE - r); - p =3D kmap_local_page(page); - sum =3D csum_and_memcpy(p + r, addr + off, chunk, sum, off); - kunmap_local(p); - off +=3D chunk; - bytes -=3D chunk; - } - *sump =3D sum; - return off; -} - size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i) { if (WARN_ON_ONCE(i->data_source)) return 0; - if (unlikely(iov_iter_is_pipe(i))) - return copy_pipe_to_iter(addr, bytes, i); if (user_backed_iter(i)) might_fault(); iterate_and_advance(i, bytes, base, len, off, @@ -539,42 +319,6 @@ static int copyout_mc(void __user *to, const void *fro= m, size_t n) return n; } =20 -static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes, - struct iov_iter *i) -{ - size_t xfer =3D 0; - unsigned int off, chunk; - - if (unlikely(bytes > i->count)) - bytes =3D i->count; - if (unlikely(!bytes)) - return 0; - - if (!sanity(i)) - return 0; - - while (bytes) { - struct page *page =3D append_pipe(i, bytes, &off); - unsigned long rem; - char *p; - - if (!page) - break; - chunk =3D min_t(size_t, bytes, PAGE_SIZE - off); - p =3D kmap_local_page(page); - rem =3D copy_mc_to_kernel(p + 
off, addr + xfer, chunk); - chunk -=3D rem; - kunmap_local(p); - xfer +=3D chunk; - bytes -=3D chunk; - if (rem) { - iov_iter_revert(i, rem); - break; - } - } - return xfer; -} - /** * _copy_mc_to_iter - copy to iter with source memory error exception hand= ling * @addr: source kernel address @@ -594,9 +338,8 @@ static size_t copy_mc_pipe_to_iter(const void *addr, si= ze_t bytes, * alignment and poison alignment assumptions to avoid re-triggering * hardware exceptions. * - * * ITER_KVEC, ITER_PIPE, and ITER_BVEC can return short copies. - * Compare to copy_to_iter() where only ITER_IOVEC attempts might return - * a short copy. + * * ITER_KVEC and ITER_BVEC can return short copies. Compare to + * copy_to_iter() where only ITER_IOVEC attempts might return a short co= py. * * Return: number of bytes copied (may be %0) */ @@ -604,8 +347,6 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes,= struct iov_iter *i) { if (WARN_ON_ONCE(i->data_source)) return 0; - if (unlikely(iov_iter_is_pipe(i))) - return copy_mc_pipe_to_iter(addr, bytes, i); if (user_backed_iter(i)) might_fault(); __iterate_and_advance(i, bytes, base, len, off, @@ -711,8 +452,6 @@ size_t copy_page_to_iter(struct page *page, size_t offs= et, size_t bytes, return 0; if (WARN_ON_ONCE(i->data_source)) return 0; - if (unlikely(iov_iter_is_pipe(i))) - return copy_page_to_iter_pipe(page, offset, bytes, i); page +=3D offset / PAGE_SIZE; // first subpage offset %=3D PAGE_SIZE; while (1) { @@ -761,36 +500,8 @@ size_t copy_page_from_iter(struct page *page, size_t o= ffset, size_t bytes, } EXPORT_SYMBOL(copy_page_from_iter); =20 -static size_t pipe_zero(size_t bytes, struct iov_iter *i) -{ - unsigned int chunk, off; - - if (unlikely(bytes > i->count)) - bytes =3D i->count; - if (unlikely(!bytes)) - return 0; - - if (!sanity(i)) - return 0; - - for (size_t n =3D bytes; n; n -=3D chunk) { - struct page *page =3D append_pipe(i, n, &off); - char *p; - - if (!page) - return bytes - n; - chunk =3D min_t(size_t, n, 
PAGE_SIZE - off); - p =3D kmap_local_page(page); - memset(p + off, 0, chunk); - kunmap_local(p); - } - return bytes; -} - size_t iov_iter_zero(size_t bytes, struct iov_iter *i) { - if (unlikely(iov_iter_is_pipe(i))) - return pipe_zero(bytes, i); iterate_and_advance(i, bytes, base, len, count, clear_user(base, len), memset(base, 0, len) @@ -821,32 +532,6 @@ size_t copy_page_from_iter_atomic(struct page *page, u= nsigned offset, size_t byt } EXPORT_SYMBOL(copy_page_from_iter_atomic); =20 -static void pipe_advance(struct iov_iter *i, size_t size) -{ - struct pipe_inode_info *pipe =3D i->pipe; - int off =3D i->last_offset; - - if (!off && !size) { - pipe_discard_from(pipe, i->start_head); // discard everything - return; - } - i->count -=3D size; - while (1) { - struct pipe_buffer *buf =3D pipe_buf(pipe, i->head); - if (off) /* make it relative to the beginning of buffer */ - size +=3D abs(off) - buf->offset; - if (size <=3D buf->len) { - buf->len =3D size; - i->last_offset =3D last_offset(buf); - break; - } - size -=3D buf->len; - i->head++; - off =3D 0; - } - pipe_discard_from(pipe, i->head + 1); // discard everything past this one -} - static void iov_iter_bvec_advance(struct iov_iter *i, size_t size) { const struct bio_vec *bvec, *end; @@ -898,8 +583,6 @@ void iov_iter_advance(struct iov_iter *i, size_t size) iov_iter_iovec_advance(i, size); } else if (iov_iter_is_bvec(i)) { iov_iter_bvec_advance(i, size); - } else if (iov_iter_is_pipe(i)) { - pipe_advance(i, size); } else if (iov_iter_is_discard(i)) { i->count -=3D size; } @@ -913,26 +596,6 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll) if (WARN_ON(unroll > MAX_RW_COUNT)) return; i->count +=3D unroll; - if (unlikely(iov_iter_is_pipe(i))) { - struct pipe_inode_info *pipe =3D i->pipe; - unsigned int head =3D pipe->head; - - while (head > i->start_head) { - struct pipe_buffer *b =3D pipe_buf(pipe, --head); - if (unroll < b->len) { - b->len -=3D unroll; - i->last_offset =3D last_offset(b); - i->head =3D 
head; - return; - } - unroll -=3D b->len; - pipe_buf_release(pipe, b); - pipe->head--; - } - i->last_offset =3D 0; - i->head =3D head; - return; - } if (unlikely(iov_iter_is_discard(i))) return; if (unroll <=3D i->iov_offset) { @@ -1020,24 +683,6 @@ void iov_iter_bvec(struct iov_iter *i, unsigned int d= irection, } EXPORT_SYMBOL(iov_iter_bvec); =20 -void iov_iter_pipe(struct iov_iter *i, unsigned int direction, - struct pipe_inode_info *pipe, - size_t count) -{ - BUG_ON(direction !=3D READ); - WARN_ON(pipe_full(pipe->head, pipe->tail, pipe->ring_size)); - *i =3D (struct iov_iter){ - .iter_type =3D ITER_PIPE, - .data_source =3D false, - .pipe =3D pipe, - .head =3D pipe->head, - .start_head =3D pipe->head, - .last_offset =3D 0, - .count =3D count - }; -} -EXPORT_SYMBOL(iov_iter_pipe); - /** * iov_iter_xarray - Initialise an I/O iterator to use the pages in an xar= ray * @i: The iterator to initialise. @@ -1162,19 +807,6 @@ bool iov_iter_is_aligned(const struct iov_iter *i, un= signed addr_mask, if (iov_iter_is_bvec(i)) return iov_iter_aligned_bvec(i, addr_mask, len_mask); =20 - if (iov_iter_is_pipe(i)) { - size_t size =3D i->count; - - if (size & len_mask) - return false; - if (size && i->last_offset > 0) { - if (i->last_offset & addr_mask) - return false; - } - - return true; - } - if (iov_iter_is_xarray(i)) { if (i->count & len_mask) return false; @@ -1244,14 +876,6 @@ unsigned long iov_iter_alignment(const struct iov_ite= r *i) if (iov_iter_is_bvec(i)) return iov_iter_alignment_bvec(i); =20 - if (iov_iter_is_pipe(i)) { - size_t size =3D i->count; - - if (size && i->last_offset > 0) - return size | i->last_offset; - return size; - } - if (iov_iter_is_xarray(i)) return (i->xarray_start + i->iov_offset) | i->count; =20 @@ -1303,36 +927,6 @@ static int want_pages_array(struct page ***res, size_= t size, return count; } =20 -static ssize_t pipe_get_pages(struct iov_iter *i, - struct page ***pages, size_t maxsize, unsigned maxpages, - size_t *start) -{ - unsigned int 
npages, count, off, chunk; - struct page **p; - size_t left; - - if (!sanity(i)) - return -EFAULT; - - *start =3D off =3D pipe_npages(i, &npages); - if (!npages) - return -EFAULT; - count =3D want_pages_array(pages, maxsize, off, min(npages, maxpages)); - if (!count) - return -ENOMEM; - p =3D *pages; - for (npages =3D 0, left =3D maxsize ; npages < count; npages++, left -=3D= chunk) { - struct page *page =3D append_pipe(i, left, &off); - if (!page) - break; - chunk =3D min_t(size_t, left, PAGE_SIZE - off); - get_page(*p++ =3D page); - } - if (!npages) - return -EFAULT; - return maxsize - left; -} - static ssize_t iter_xarray_populate_pages(struct page **pages, struct xarr= ay *xa, pgoff_t index, unsigned int nr_pages) { @@ -1482,8 +1076,6 @@ static ssize_t __iov_iter_get_pages_alloc(struct iov_= iter *i, } return maxsize; } - if (iov_iter_is_pipe(i)) - return pipe_get_pages(i, pages, maxsize, maxpages, start); if (iov_iter_is_xarray(i)) return iter_xarray_get_pages(i, pages, maxsize, maxpages, start); return -EFAULT; @@ -1573,9 +1165,7 @@ size_t csum_and_copy_to_iter(const void *addr, size_t= bytes, void *_csstate, } =20 sum =3D csum_shift(csstate->csum, csstate->off); - if (unlikely(iov_iter_is_pipe(i))) - bytes =3D csum_and_copy_to_pipe_iter(addr, bytes, i, &sum); - else iterate_and_advance(i, bytes, base, len, off, ({ + iterate_and_advance(i, bytes, base, len, off, ({ next =3D csum_and_copy_to_user(addr + off, base, len); sum =3D csum_block_add(sum, next, off); next ? 
0 : len; @@ -1660,15 +1250,6 @@ int iov_iter_npages(const struct iov_iter *i, int ma= xpages) return iov_npages(i, maxpages); if (iov_iter_is_bvec(i)) return bvec_npages(i, maxpages); - if (iov_iter_is_pipe(i)) { - int npages; - - if (!sanity(i)) - return 0; - - pipe_npages(i, &npages); - return min(npages, maxpages); - } if (iov_iter_is_xarray(i)) { unsigned offset =3D (i->xarray_start + i->iov_offset) % PAGE_SIZE; int npages =3D DIV_ROUND_UP(offset + i->count, PAGE_SIZE); @@ -1681,10 +1262,6 @@ EXPORT_SYMBOL(iov_iter_npages); const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t fla= gs) { *new =3D *old; - if (unlikely(iov_iter_is_pipe(new))) { - WARN_ON(1); - return NULL; - } if (iov_iter_is_bvec(new)) return new->bvec =3D kmemdup(new->bvec, new->nr_segs * sizeof(struct bio_vec), diff --git a/mm/filemap.c b/mm/filemap.c index 3a93515ae2ed..470be06b6096 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2690,8 +2690,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_i= ter *iter, if (unlikely(iocb->ki_pos >=3D i_size_read(inode))) break; =20 - error =3D filemap_get_pages(iocb, iter->count, &fbatch, - iov_iter_is_pipe(iter)); + error =3D filemap_get_pages(iocb, iter->count, &fbatch, false); if (error < 0) break; From nobody Wed Feb 11 10:19:24 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4696DC61DA4 for ; Wed, 15 Mar 2023 16:38:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232620AbjCOQid (ORCPT ); Wed, 15 Mar 2023 12:38:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38092 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232340AbjCOQhq (ORCPT ); Wed, 15 Mar 2023 12:37:46 -0400 Received: from us-smtp-delivery-124.mimecast.com 
(us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0AD717C9F2 for ; Wed, 15 Mar 2023 09:36:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678898192; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=w0CyxxuCfKyDxUhqZI8uZNDs4yTA+LT97JFfaSxmf5E=; b=cRPRo2QBHtHJdxiU+7z5cjEFsuN+VeQDn5nQRuWBYUzOGa5n4f68ya2pYVR1BJZTo/BvFR JLFA47h/6BRiOFfDJHKBA6wMvHXU/T3hJxzCoAl6m9LSkbyI6eUY4iKFFUsbVHoHE6znpT z16+b8JLJ73yi7CV861MolR9KOe+3Yk= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-503-KRZAb1KrMEG4we92PR2NZA-1; Wed, 15 Mar 2023 12:36:27 -0400 X-MC-Unique: KRZAb1KrMEG4we92PR2NZA-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 316A48828C6; Wed, 15 Mar 2023 16:36:24 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6A4C2C164E7; Wed, 15 Mar 2023 16:36:21 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Christian Brauner , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, John Hubbard , Dave Chinner , Christoph Hellwig Subject: [PATCH v19 10/15] iomap: Don't get an reference on ZERO_PAGE for direct I/O block zeroing Date: Wed, 15 Mar 2023 16:35:44 +0000 Message-Id: 
<20230315163549.295454-11-dhowells@redhat.com> In-Reply-To: <20230315163549.295454-1-dhowells@redhat.com> References: <20230315163549.295454-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" ZERO_PAGE can't go away, no need to hold an extra reference. Signed-off-by: David Howells Reviewed-by: David Hildenbrand Reviewed-by: John Hubbard Reviewed-by: Dave Chinner Reviewed-by: Christoph Hellwig cc: Al Viro cc: linux-fsdevel@vger.kernel.org --- fs/iomap/direct-io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index f771001574d0..850fb9870c2f 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -202,7 +202,7 @@ static void iomap_dio_zero(const struct iomap_iter *ite= r, struct iomap_dio *dio, bio->bi_private =3D dio; bio->bi_end_io =3D iomap_dio_bio_end_io; =20 - get_page(page); + bio_set_flag(bio, BIO_NO_PAGE_REF); __bio_add_page(bio, page, len, 0); iomap_dio_submit_bio(iter, dio, bio, pos); } From nobody Wed Feb 11 10:19:24 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 98787C76195 for ; Wed, 15 Mar 2023 16:38:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232613AbjCOQia (ORCPT ); Wed, 15 Mar 2023 12:38:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38194 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232160AbjCOQhq (ORCPT ); Wed, 15 Mar 2023 12:37:46 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 
CBE33769F1 for ; Wed, 15 Mar 2023 09:36:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678898194; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0+QymJ+GCZW28tJ8JK9HTXv9GxubpscA9882LeMgv9w=; b=NAl65g0XRxxRMByGwfkB6b2XFF1k6l4LoHxnoxY/0emKTJKeHUD6ENCPtcMRREi6oxXZhv HW/HegjFqpdAIO/1VFOSA2m9dnogfvDsEkMHXQAs2zIECQMKBSydIwUvhKqVDMdiZJ+HYm ls2xd94lDC4z/pf87Nmh2rTGkL+/o2E= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-132-0_two9ifOYW8-fQttFHzIw-1; Wed, 15 Mar 2023 12:36:28 -0400 X-MC-Unique: 0_two9ifOYW8-fQttFHzIw-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id DCA38858297; Wed, 15 Mar 2023 16:36:26 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id C161AC164E7; Wed, 15 Mar 2023 16:36:24 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Christian Brauner , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , John Hubbard Subject: [PATCH v19 11/15] block: Fix bio_flagged() so that gcc can better optimise it Date: Wed, 15 Mar 2023 16:35:45 +0000 Message-Id: <20230315163549.295454-12-dhowells@redhat.com> In-Reply-To: <20230315163549.295454-1-dhowells@redhat.com> References: 
<20230315163549.295454-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Fix bio_flagged() so that multiple instances of it, such as: if (bio_flagged(bio, BIO_PAGE_REFFED) || bio_flagged(bio, BIO_PAGE_PINNED)) can be combined by the gcc optimiser into a single test in assembly (arguably, this is a compiler optimisation issue[1]). The missed optimisation stems from bio_flagged() comparing the result of the bitwise-AND to zero. This results in an out-of-line bio_release_page() being compiled to something like: <+0>: mov 0x14(%rdi),%eax <+3>: test $0x1,%al <+5>: jne 0xffffffff816dac53 <+7>: test $0x2,%al <+9>: je 0xffffffff816dac5c <+11>: movzbl %sil,%esi <+15>: jmp 0xffffffff816daba1 <__bio_release_pages> <+20>: jmp 0xffffffff81d0b800 <__x86_return_thunk> However, the test is superfluous as the return type is bool. Removing it results in: <+0>: testb $0x3,0x14(%rdi) <+4>: je 0xffffffff816e4af4 <+6>: movzbl %sil,%esi <+10>: jmp 0xffffffff816dab7c <__bio_release_pages> <+15>: jmp 0xffffffff81d0b7c0 <__x86_return_thunk> instead. Also, the MOVZBL instruction looks unnecessary[2] - I think it's just 're-booling' the mark_dirty parameter. 
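For illustration only, here is a minimal userspace sketch of the change described above. struct mock_bio and every mock_* helper are hypothetical stand-ins, not kernel API. The sketch assumes nothing beyond standard C: conversion to bool already maps any non-zero value to true, so the explicit comparison against zero does not change the result for any flag bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct bio; only the flags word is modelled. */
struct mock_bio {
	unsigned short bi_flags;
};

/* Old form: explicit comparison of the masked value against zero. */
static inline bool mock_flagged_old(const struct mock_bio *bio, unsigned int bit)
{
	return (bio->bi_flags & (1U << bit)) != 0;
}

/* New form: rely on the implicit conversion of the masked value to bool. */
static inline bool mock_flagged_new(const struct mock_bio *bio, unsigned int bit)
{
	return bio->bi_flags & (1U << bit);
}

/* Returns true if both forms agree for every bit of every 16-bit flags value. */
static inline bool mock_forms_agree(void)
{
	for (unsigned int flags = 0; flags <= 0xffff; flags++) {
		struct mock_bio bio = { .bi_flags = (unsigned short)flags };

		for (unsigned int bit = 0; bit < 16; bit++)
			if (mock_flagged_old(&bio, bit) != mock_flagged_new(&bio, bit))
				return false;
	}
	return true;
}

/* Optional self-check a caller can run from its own main(). */
static inline void mock_self_check(void)
{
	assert(mock_forms_agree());
}
```

The sketch only demonstrates that the two forms are equivalent; whether a given compiler merges two adjacent tests into one depends on the compiler version and is not shown here.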
Signed-off-by: David Howells Reviewed-by: Christoph Hellwig Reviewed-by: John Hubbard cc: Jens Axboe cc: linux-block@vger.kernel.org Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D108370 [1] Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D108371 [2] Link: https://lore.kernel.org/r/167391056756.2311931.356007731815807265.stg= it@warthog.procyon.org.uk/ # v6 --- include/linux/bio.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/bio.h b/include/linux/bio.h index d766be7152e1..d9d6df62ea57 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -229,7 +229,7 @@ static inline void bio_cnt_set(struct bio *bio, unsigne= d int count) =20 static inline bool bio_flagged(struct bio *bio, unsigned int bit) { - return (bio->bi_flags & (1U << bit)) !=3D 0; + return bio->bi_flags & (1U << bit); } =20 static inline void bio_set_flag(struct bio *bio, unsigned int bit) From nobody Wed Feb 11 10:19:24 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 20E7EC6FD1D for ; Wed, 15 Mar 2023 16:40:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232145AbjCOQkI (ORCPT ); Wed, 15 Mar 2023 12:40:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38080 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229634AbjCOQjt (ORCPT ); Wed, 15 Mar 2023 12:39:49 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4375F99C13 for ; Wed, 15 Mar 2023 09:37:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678898248; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: 
to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=cRZ6zwkATo5WH/jIVyGNX8/GaSvHJQOFy9f4tGJFuL8=; b=S2CEr5aVKIvd7bjjos5aUUAwsUW49PCZiro8LwoahWnG34N3GtqWTVUV7BOumhtmziHMf9 WfA263us4OLVeDDtub7VXGdsNRnbJ2ntqlfJGX3kbz1wfFBW/p06RbP+B783PUsLYdFJmx boxGcQIV5o8fbOo49A+9yUIgqqdh+Bo= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-295-MaPgOrIuMoe73ChQ23sWJA-1; Wed, 15 Mar 2023 12:37:22 -0400 X-MC-Unique: MaPgOrIuMoe73ChQ23sWJA-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id B2A3B2806054; Wed, 15 Mar 2023 16:36:29 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 95C974042AC2; Wed, 15 Mar 2023 16:36:27 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Christian Brauner , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , John Hubbard Subject: [PATCH v19 12/15] block: Replace BIO_NO_PAGE_REF with BIO_PAGE_REFFED with inverted logic Date: Wed, 15 Mar 2023 16:35:46 +0000 Message-Id: <20230315163549.295454-13-dhowells@redhat.com> In-Reply-To: <20230315163549.295454-1-dhowells@redhat.com> References: <20230315163549.295454-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: 
text/plain; charset="utf-8" From: Christoph Hellwig Replace BIO_NO_PAGE_REF with a BIO_PAGE_REFFED flag that has the inverted meaning is only set when a page reference has been acquired that needs to be released by bio_release_pages(). Signed-off-by: Christoph Hellwig Signed-off-by: David Howells Reviewed-by: John Hubbard cc: Al Viro cc: Jens Axboe cc: Jan Kara cc: Matthew Wilcox cc: Logan Gunthorpe cc: linux-block@vger.kernel.org --- Notes: ver #8) - Split out from another patch [hch]. - Don't default to BIO_PAGE_REFFED [hch]. =20 ver #5) - Split from patch that uses iov_iter_extract_pages(). block/bio.c | 2 +- block/blk-map.c | 1 + fs/direct-io.c | 2 ++ fs/iomap/direct-io.c | 1 - include/linux/bio.h | 2 +- include/linux/blk_types.h | 2 +- 6 files changed, 6 insertions(+), 4 deletions(-) diff --git a/block/bio.c b/block/bio.c index fd11614bba4d..4ff96a0e4091 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1190,7 +1190,6 @@ void bio_iov_bvec_set(struct bio *bio, struct iov_ite= r *iter) bio->bi_io_vec =3D (struct bio_vec *)iter->bvec; bio->bi_iter.bi_bvec_done =3D iter->iov_offset; bio->bi_iter.bi_size =3D size; - bio_set_flag(bio, BIO_NO_PAGE_REF); bio_set_flag(bio, BIO_CLONED); } =20 @@ -1335,6 +1334,7 @@ int bio_iov_iter_get_pages(struct bio *bio, struct io= v_iter *iter) return 0; } =20 + bio_set_flag(bio, BIO_PAGE_REFFED); do { ret =3D __bio_iov_iter_get_pages(bio, iter); } while (!ret && iov_iter_count(iter) && !bio_full(bio, 0)); diff --git a/block/blk-map.c b/block/blk-map.c index 9137d16cecdc..c77fdb1fbda7 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -281,6 +281,7 @@ static int bio_map_user_iov(struct request *rq, struct = iov_iter *iter, if (blk_queue_pci_p2pdma(rq->q)) extraction_flags |=3D ITER_ALLOW_P2PDMA; =20 + bio_set_flag(bio, BIO_PAGE_REFFED); while (iov_iter_count(iter)) { struct page **pages, *stack_pages[UIO_FASTIOV]; ssize_t bytes; diff --git a/fs/direct-io.c b/fs/direct-io.c index ab0d7ea89813..47b90c68b369 100644 --- a/fs/direct-io.c 
+++ b/fs/direct-io.c @@ -403,6 +403,8 @@ dio_bio_alloc(struct dio *dio, struct dio_submit *sdio, bio->bi_end_io =3D dio_bio_end_aio; else bio->bi_end_io =3D dio_bio_end_io; + /* for now require references for all pages */ + bio_set_flag(bio, BIO_PAGE_REFFED); sdio->bio =3D bio; sdio->logical_offset_in_bio =3D sdio->cur_page_fs_offset; } diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index 850fb9870c2f..ceeb0a183cea 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -202,7 +202,6 @@ static void iomap_dio_zero(const struct iomap_iter *ite= r, struct iomap_dio *dio, bio->bi_private =3D dio; bio->bi_end_io =3D iomap_dio_bio_end_io; =20 - bio_set_flag(bio, BIO_NO_PAGE_REF); __bio_add_page(bio, page, len, 0); iomap_dio_submit_bio(iter, dio, bio, pos); } diff --git a/include/linux/bio.h b/include/linux/bio.h index d9d6df62ea57..b537d03377f0 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -488,7 +488,7 @@ void zero_fill_bio(struct bio *bio); =20 static inline void bio_release_pages(struct bio *bio, bool mark_dirty) { - if (!bio_flagged(bio, BIO_NO_PAGE_REF)) + if (bio_flagged(bio, BIO_PAGE_REFFED)) __bio_release_pages(bio, mark_dirty); } =20 diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 99be590f952f..7daa261f4f98 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -318,7 +318,7 @@ struct bio { * bio flags */ enum { - BIO_NO_PAGE_REF, /* don't put release vec pages */ + BIO_PAGE_REFFED, /* put pages in bio_release_pages() */ BIO_CLONED, /* doesn't own data */ BIO_BOUNCED, /* bio is a bounce bio */ BIO_QUIET, /* Make BIO Quiet */ From nobody Wed Feb 11 10:19:24 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 32014C6FD1D for ; Wed, 15 Mar 2023 16:38:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by 
vger.kernel.org via listexpand id S232637AbjCOQih (ORCPT ); Wed, 15 Mar 2023 12:38:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38066 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231545AbjCOQht (ORCPT ); Wed, 15 Mar 2023 12:37:49 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E590677E12 for ; Wed, 15 Mar 2023 09:36:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678898199; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=HEPrb4kIqMqW7eebyDDmobfSBttcdx5+yTOr0krGf5g=; b=dMKgSZeVfp/Bm7VGSwSE3HO/Uh+Gp+sccC70e3qxrqSznQc6tNOylgdzjySyEAnmQ0SoLk rbsemhB4ZCwP2yaOr5uMHjj2AQFl/nosBOmSZODwWzGbwgEXPzkgGq0ev3Ly0JDI6eSjgT zKZzyW/mvTXRhQuABixikV90pxwmUqk= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-493-vcA44ovXN1e4qlJZz1bDQA-1; Wed, 15 Mar 2023 12:36:33 -0400 X-MC-Unique: vcA44ovXN1e4qlJZz1bDQA-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 6ED473814594; Wed, 15 Mar 2023 16:36:32 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 54816400F4F; Wed, 15 Mar 2023 16:36:30 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf 
Danton , Christian Brauner , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , John Hubbard Subject: [PATCH v19 13/15] block: Add BIO_PAGE_PINNED and associated infrastructure Date: Wed, 15 Mar 2023 16:35:47 +0000 Message-Id: <20230315163549.295454-14-dhowells@redhat.com> In-Reply-To: <20230315163549.295454-1-dhowells@redhat.com> References: <20230315163549.295454-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add BIO_PAGE_PINNED to indicate that the pages in a bio are pinned (FOLL_PIN) and that the pin will need removing. Signed-off-by: David Howells Reviewed-by: Christoph Hellwig Reviewed-by: John Hubbard cc: Al Viro cc: Jens Axboe cc: Jan Kara cc: Matthew Wilcox cc: Logan Gunthorpe cc: linux-block@vger.kernel.org --- Notes: ver #10) - Drop bio_set_cleanup_mode(), open coding it instead. =20 ver #9) - Only consider pinning in bio_set_cleanup_mode(). Ref'ing pages in struct bio is going away. - page_put_unpin() is removed; call unpin_user_page() and put_page() directly. - Use bio_release_page() in __bio_release_pages(). - BIO_PAGE_PINNED and BIO_PAGE_REFFED can't both be set, so use if-else when testing both of them. =20 ver #8) - Move the infrastructure to clean up pinned pages to this patch [hch]. - Put BIO_PAGE_PINNED before BIO_PAGE_REFFED as the latter should probably be removed at some point. FOLL_PIN can then be renumbered first. 
block/bio.c | 6 +++--- block/blk.h | 12 ++++++++++++ include/linux/bio.h | 3 ++- include/linux/blk_types.h | 1 + 4 files changed, 18 insertions(+), 4 deletions(-) diff --git a/block/bio.c b/block/bio.c index 4ff96a0e4091..51ae957cc4b6 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1168,7 +1168,7 @@ void __bio_release_pages(struct bio *bio, bool mark_d= irty) bio_for_each_segment_all(bvec, bio, iter_all) { if (mark_dirty && !PageCompound(bvec->bv_page)) set_page_dirty_lock(bvec->bv_page); - put_page(bvec->bv_page); + bio_release_page(bio, bvec->bv_page); } } EXPORT_SYMBOL_GPL(__bio_release_pages); @@ -1488,8 +1488,8 @@ void bio_set_pages_dirty(struct bio *bio) * the BIO and re-dirty the pages in process context. * * It is expected that bio_check_pages_dirty() will wholly own the BIO from - * here on. It will run one put_page() against each page and will run one - * bio_put() against the BIO. + * here on. It will unpin each page and will run one bio_put() against the + * BIO. */ =20 static void bio_dirty_fn(struct work_struct *work); diff --git a/block/blk.h b/block/blk.h index cc4e8873dfde..d65d96994a94 100644 --- a/block/blk.h +++ b/block/blk.h @@ -432,6 +432,18 @@ int bio_add_hw_page(struct request_queue *q, struct bi= o *bio, struct page *page, unsigned int len, unsigned int offset, unsigned int max_sectors, bool *same_page); =20 +/* + * Clean up a page appropriately, where the page may be pinned, may have a + * ref taken on it or neither. 
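The cleanup rule from the notes above (pinned and ref'd are mutually exclusive, so an if-else is enough) can be modelled in userspace roughly as follows. This is a sketch under stated assumptions: the mock_* names are hypothetical, and a page is simplified to two independent counters, which is not how the kernel actually accounts for pins.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, in the order the patch adds them to the enum. */
enum {
	MOCK_BIO_PAGE_PINNED,
	MOCK_BIO_PAGE_REFFED,
};

/* Simplified page model: separate counters for pins and plain references. */
struct mock_page {
	int pins;
	int refs;
};

/* Hypothetical stand-in for struct bio; only the flags word is modelled. */
struct mock_bio {
	unsigned short bi_flags;
};

static inline bool mock_bio_flagged(const struct mock_bio *bio, unsigned int bit)
{
	return bio->bi_flags & (1U << bit);
}

/* Mirrors the if/else shape of bio_release_page(): unpin, put, or do nothing. */
static inline void mock_bio_release_page(const struct mock_bio *bio,
					 struct mock_page *page)
{
	if (mock_bio_flagged(bio, MOCK_BIO_PAGE_PINNED))
		page->pins--;	/* stands in for unpin_user_page() */
	else if (mock_bio_flagged(bio, MOCK_BIO_PAGE_REFFED))
		page->refs--;	/* stands in for put_page() */
}

/* Release one page (starting with one pin and one ref) under the given flags. */
static inline struct mock_page mock_release_with(unsigned short flags)
{
	struct mock_bio bio = { .bi_flags = flags };
	struct mock_page page = { .pins = 1, .refs = 1 };

	mock_bio_release_page(&bio, &page);
	return page;
}

/* Optional self-check: with no flag set, the page is left untouched. */
static inline void mock_self_check(void)
{
	struct mock_page page = mock_release_with(0);

	assert(page.pins == 1 && page.refs == 1);
}
```

With neither flag set the helper does nothing, matching the "or neither" case in the comment on bio_release_page().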
+ */ +static inline void bio_release_page(struct bio *bio, struct page *page) +{ + if (bio_flagged(bio, BIO_PAGE_PINNED)) + unpin_user_page(page); + else if (bio_flagged(bio, BIO_PAGE_REFFED)) + put_page(page); +} + struct request_queue *blk_alloc_queue(int node_id); =20 int disk_scan_partitions(struct gendisk *disk, fmode_t mode); diff --git a/include/linux/bio.h b/include/linux/bio.h index b537d03377f0..d8c30c791a9a 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -488,7 +488,8 @@ void zero_fill_bio(struct bio *bio); =20 static inline void bio_release_pages(struct bio *bio, bool mark_dirty) { - if (bio_flagged(bio, BIO_PAGE_REFFED)) + if (bio_flagged(bio, BIO_PAGE_REFFED) || + bio_flagged(bio, BIO_PAGE_PINNED)) __bio_release_pages(bio, mark_dirty); } =20 diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 7daa261f4f98..a0e339ff3d09 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -318,6 +318,7 @@ struct bio { * bio flags */ enum { + BIO_PAGE_PINNED, /* Unpin pages in bio_release_pages() */ BIO_PAGE_REFFED, /* put pages in bio_release_pages() */ BIO_CLONED, /* doesn't own data */ BIO_BOUNCED, /* bio is a bounce bio */ From nobody Wed Feb 11 10:19:24 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 948EFC6FD1D for ; Wed, 15 Mar 2023 16:38:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232643AbjCOQik (ORCPT ); Wed, 15 Mar 2023 12:38:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38090 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231680AbjCOQhw (ORCPT ); Wed, 15 Mar 2023 12:37:52 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net 
(Postfix) with ESMTPS id 439746A062 for ; Wed, 15 Mar 2023 09:36:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678898199; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7b6tezcmsCRtrHxO2uDV+B0gtFM5s5ObrJQQe5H2QZI=; b=CzHXOZlvjc3E6YhTBEcv010ybEjbPk4TekUElUbi+hUJdEtXW1euRH5yIA0ayPZ7Ah/S8M pZhUDPmW8q2U4pcZ9udCAlSbaQJJoXQILj8s+pnZvLqMeY5PkwBiQppzi5cyzi4bvDjtBr vthhhjA5GCYBRnJXwAPe2v3ISrNn9pQ= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-166-eXTazPypPp6eK0h5OKSezQ-1; Wed, 15 Mar 2023 12:36:36 -0400 X-MC-Unique: eXTazPypPp6eK0h5OKSezQ-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 40B00185A792; Wed, 15 Mar 2023 16:36:35 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 22FDE2A68; Wed, 15 Mar 2023 16:36:33 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , Christian Brauner , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , John Hubbard Subject: [PATCH v19 14/15] block: Convert bio_iov_iter_get_pages to use iov_iter_extract_pages Date: Wed, 15 Mar 2023 16:35:48 +0000 Message-Id: <20230315163549.295454-15-dhowells@redhat.com> In-Reply-To: 
<20230315163549.295454-1-dhowells@redhat.com> References: <20230315163549.295454-1-dhowells@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.5 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" This will pin pages or leave them unaltered rather than getting a ref on them as appropriate to the iterator. The pages need to be pinned for DIO rather than having refs taken on them to prevent VM copy-on-write from malfunctioning during a concurrent fork() (the result of the I/O could otherwise end up being affected by/visible to the child process). Signed-off-by: David Howells Reviewed-by: Christoph Hellwig Reviewed-by: John Hubbard cc: Al Viro cc: Jens Axboe cc: Jan Kara cc: Matthew Wilcox cc: Logan Gunthorpe cc: linux-block@vger.kernel.org --- Notes: ver #10) - Drop bio_set_cleanup_mode(), open coding it instead. =20 ver #8) - Split the patch up a bit [hch]. - We should only be using pinned/non-pinned pages and not ref'd pages, so adjust the comments appropriately. =20 ver #7) - Don't treat BIO_PAGE_REFFED/PINNED as being the same as FOLL_GET/PIN. =20 ver #5) - Transcribe the FOLL_* flags returned by iov_iter_extract_pages() to BIO_* flags and got rid of bi_cleanup_mode. - Replaced BIO_NO_PAGE_REF to BIO_PAGE_REFFED in the preceding patch. 
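As a rough illustration of "pin or leave unaltered, as appropriate to the iterator", the flag choice can be sketched in userspace like this. All mock_* names are hypothetical, and the sketch assumes that only user-backed iterator kinds (UBUF and IOVEC) have their pages pinned on extraction; other kinds are left alone and get no cleanup flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical iterator kinds, loosely following the kernel's ITER_* types. */
enum mock_iter_type {
	MOCK_ITER_UBUF,
	MOCK_ITER_IOVEC,
	MOCK_ITER_BVEC,
	MOCK_ITER_KVEC,
	MOCK_ITER_XARRAY,
};

/* Hypothetical flag bit standing in for BIO_PAGE_PINNED. */
enum { MOCK_BIO_PAGE_PINNED };

/* Assumption: only user-backed iterators have their pages pinned. */
static inline bool mock_extract_will_pin(enum mock_iter_type type)
{
	return type == MOCK_ITER_UBUF || type == MOCK_ITER_IOVEC;
}

/* Returns the flags word a freshly set up bio would carry for this kind. */
static inline unsigned short mock_bio_flags_for(enum mock_iter_type type)
{
	unsigned short flags = 0;

	if (mock_extract_will_pin(type))
		flags |= 1U << MOCK_BIO_PAGE_PINNED;
	return flags;
}

/* Optional self-check: kernel-backed kinds never get the pinned flag. */
static inline void mock_self_check(void)
{
	assert(mock_bio_flags_for(MOCK_ITER_BVEC) == 0);
	assert(mock_bio_flags_for(MOCK_ITER_KVEC) == 0);
}
```

Because the flag is decided once per bio from the iterator kind, the per-page cleanup path only has to test the bio's flags rather than inspect each page.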
 block/bio.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 51ae957cc4b6..fc98c1c723ca 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1204,7 +1204,7 @@ static int bio_iov_add_page(struct bio *bio, struct page *page,
 	}
 
 	if (same_page)
-		put_page(page);
+		bio_release_page(bio, page);
 	return 0;
 }
 
@@ -1218,7 +1218,7 @@ static int bio_iov_add_zone_append_page(struct bio *bio, struct page *page,
 			queue_max_zone_append_sectors(q), &same_page) != len)
 		return -EINVAL;
 	if (same_page)
-		put_page(page);
+		bio_release_page(bio, page);
 	return 0;
 }
 
@@ -1229,10 +1229,10 @@ static int bio_iov_add_zone_append_page(struct bio *bio, struct page *page,
  * @bio: bio to add pages to
  * @iter: iov iterator describing the region to be mapped
  *
- * Pins pages from *iter and appends them to @bio's bvec array. The
- * pages will have to be released using put_page() when done.
- * For multi-segment *iter, this function only adds pages from the
- * next non-empty segment of the iov iterator.
+ * Extracts pages from *iter and appends them to @bio's bvec array. The pages
+ * will have to be cleaned up in the way indicated by the BIO_PAGE_PINNED flag.
+ * For a multi-segment *iter, this function only adds pages from the next
+ * non-empty segment of the iov iterator.
  */
 static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 {
@@ -1264,9 +1264,9 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	 * result to ensure the bio's total size is correct. The remainder of
 	 * the iov data will be picked up in the next bio iteration.
 	 */
-	size = iov_iter_get_pages(iter, pages,
-				  UINT_MAX - bio->bi_iter.bi_size,
-				  nr_pages, &offset, extraction_flags);
+	size = iov_iter_extract_pages(iter, &pages,
+				      UINT_MAX - bio->bi_iter.bi_size,
+				      nr_pages, extraction_flags, &offset);
 	if (unlikely(size <= 0))
 		return size ? size : -EFAULT;
 
@@ -1299,7 +1299,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	iov_iter_revert(iter, left);
 out:
 	while (i < nr_pages)
-		put_page(pages[i++]);
+		bio_release_page(bio, pages[i++]);
 
 	return ret;
 }
@@ -1334,7 +1334,8 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 		return 0;
 	}
 
-	bio_set_flag(bio, BIO_PAGE_REFFED);
+	if (iov_iter_extract_will_pin(iter))
+		bio_set_flag(bio, BIO_PAGE_PINNED);
 	do {
 		ret = __bio_iov_iter_get_pages(bio, iter);
 	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));

From nobody Wed Feb 11 10:19:24 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4B021C7618D for ; Wed, 15 Mar 2023 16:39:11 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232570AbjCOQjK (ORCPT ); Wed, 15 Mar 2023 12:39:10 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36728 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232208AbjCOQiM (ORCPT ); Wed, 15 Mar 2023 12:38:12 -0400
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BFF9B7C3C7 for ; Wed, 15 Mar 2023 09:36:51 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678898210; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gIKMSgtKI0+T2jjDihR3uNZ9dKHoZzHE/RnqYef6EBU=; b=P2AOPD+bppm4Wo/8L6cSalSo3tl8nwERthPm3V6ZaZ8sVGOu8tDbyNhZhH2bj6c2TD1Dtg /ifMqHO/1bZa/RrvB6phZ5aRUljxPSCY+Q9x47q1IOZ2gRtgu2XGsigtoexTEnjUuXv/d2
Yq6r9S3D0jmmiu4V4MwOT6Rjf5FpOMU=
Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-623-xwMSkeoeMNWvPvIQXGfjEA-1; Wed, 15 Mar 2023 12:36:47 -0400
X-MC-Unique: xwMSkeoeMNWvPvIQXGfjEA-1
Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 02AAC88B7A7; Wed, 15 Mar 2023 16:36:38 +0000 (UTC)
Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id D85421121314; Wed, 15 Mar 2023 16:36:35 +0000 (UTC)
From: David Howells
To: Jens Axboe, Al Viro, Christoph Hellwig
Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig, John Hubbard
Subject: [PATCH v19 15/15] block: convert bio_map_user_iov to use iov_iter_extract_pages
Date: Wed, 15 Mar 2023 16:35:49 +0000
Message-Id: <20230315163549.295454-16-dhowells@redhat.com>
In-Reply-To: <20230315163549.295454-1-dhowells@redhat.com>
References: <20230315163549.295454-1-dhowells@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Depending on the iterator type, this pins the pages or leaves them
unaltered, rather than taking a ref on them.

The pages need to be pinned for DIO rather than having refs taken on them,
to prevent VM copy-on-write from malfunctioning during a concurrent fork()
(the result of the I/O could otherwise end up being visible to/affected by
the child process).

Signed-off-by: David Howells
Reviewed-by: Christoph Hellwig
Reviewed-by: John Hubbard
cc: Al Viro
cc: Jens Axboe
cc: Jan Kara
cc: Matthew Wilcox
cc: Logan Gunthorpe
cc: linux-block@vger.kernel.org
---

Notes:
    ver #10)
     - Drop bio_set_cleanup_mode(), open coding it instead.

    ver #8)
     - Split the patch up a bit [hch].
     - We should only be using pinned/non-pinned pages and not ref'd pages,
       so adjust the comments appropriately.

    ver #7)
     - Don't treat BIO_PAGE_REFFED/PINNED as being the same as FOLL_GET/PIN.

    ver #5)
     - Transcribe the FOLL_* flags returned by iov_iter_extract_pages() to
       BIO_* flags and get rid of bi_cleanup_mode.
     - Replace BIO_NO_PAGE_REF with BIO_PAGE_REFFED in the preceding patch.

 block/blk-map.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index c77fdb1fbda7..7b12f4bb4d4c 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -280,22 +280,21 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 
 	if (blk_queue_pci_p2pdma(rq->q))
 		extraction_flags |= ITER_ALLOW_P2PDMA;
+	if (iov_iter_extract_will_pin(iter))
+		bio_set_flag(bio, BIO_PAGE_PINNED);
 
-	bio_set_flag(bio, BIO_PAGE_REFFED);
 	while (iov_iter_count(iter)) {
-		struct page **pages, *stack_pages[UIO_FASTIOV];
+		struct page *stack_pages[UIO_FASTIOV];
+		struct page **pages = stack_pages;
 		ssize_t bytes;
 		size_t offs;
 		int npages;
 
-		if (nr_vecs <= ARRAY_SIZE(stack_pages)) {
-			pages = stack_pages;
-			bytes = iov_iter_get_pages(iter, pages, LONG_MAX,
-						   nr_vecs, &offs, extraction_flags);
-		} else {
-			bytes = iov_iter_get_pages_alloc(iter, &pages,
-						LONG_MAX, &offs, extraction_flags);
-		}
+		if (nr_vecs > ARRAY_SIZE(stack_pages))
+			pages = NULL;
+
+		bytes = iov_iter_extract_pages(iter, &pages, LONG_MAX,
+					       nr_vecs, extraction_flags, &offs);
 		if (unlikely(bytes <= 0)) {
 			ret = bytes ? bytes : -EFAULT;
 			goto out_unmap;
@@ -317,7 +316,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 				if (!bio_add_hw_page(rq->q, bio, page, n, offs,
 						     max_sectors, &same_page)) {
 					if (same_page)
-						put_page(page);
+						bio_release_page(bio, page);
 					break;
 				}
 
@@ -329,7 +328,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 		 * release the pages we didn't map into the bio, if any
 		 */
 		while (j < npages)
-			put_page(pages[j++]);
+			bio_release_page(bio, pages[j++]);
 		if (pages != stack_pages)
 			kvfree(pages);
 		/* couldn't stuff something into bio? */