Subject: [PATCH v6 21/34] 9p: Pin pages rather than ref'ing if appropriate
From: David Howells
To: Al Viro
Cc: Dominique Martinet, Eric Van Hensbergen, Latchesar Ionkov,
    Christian Schoenebeck, v9fs-developer@lists.sourceforge.net,
    dhowells@redhat.com, Christoph Hellwig, Matthew Wilcox, Jens Axboe,
    Jan Kara, Jeff Layton, Logan Gunthorpe, linux-fsdevel@vger.kernel.org,
    linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Mon, 16 Jan 2023 23:10:32 +0000
Message-ID: <167391063242.2311931.3275290816918213423.stgit@warthog.procyon.org.uk>
In-Reply-To: <167391047703.2311931.8115712773222260073.stgit@warthog.procyon.org.uk>
References: <167391047703.2311931.8115712773222260073.stgit@warthog.procyon.org.uk>

Convert the 9p filesystem to use iov_iter_extract_pages() instead of
iov_iter_get_pages().  Depending on the iterator type, this pins the pages
or leaves them unaltered rather than taking a ref on them.

For DIO-read, the pages need to be pinned rather than ref'd so that VM
copy-on-write does not malfunction during a concurrent fork(); otherwise the
result of the I/O could end up visible only to the child process and not to
the parent.

Signed-off-by: David Howells
cc: Dominique Martinet
cc: Eric Van Hensbergen
cc: Latchesar Ionkov
cc: Christian Schoenebeck
cc: v9fs-developer@lists.sourceforge.net
---

 net/9p/trans_common.c |    6 ++-
 net/9p/trans_common.h |    3 +-
 net/9p/trans_virtio.c |   89 ++++++++++++++------------------------------
 3 files changed, 31 insertions(+), 67 deletions(-)

diff --git a/net/9p/trans_common.c b/net/9p/trans_common.c
index c827f694551c..31d133412677 100644
--- a/net/9p/trans_common.c
+++ b/net/9p/trans_common.c
@@ -12,13 +12,15 @@
  * p9_release_pages - Release pages after the transaction.
  * @pages: array of pages to be put
  * @nr_pages: size of array
+ * @cleanup_mode: How to clean up the pages.
  */
-void p9_release_pages(struct page **pages, int nr_pages)
+void p9_release_pages(struct page **pages, int nr_pages,
+		      unsigned int cleanup_mode)
 {
 	int i;
 
 	for (i = 0; i < nr_pages; i++)
 		if (pages[i])
-			put_page(pages[i]);
+			page_put_unpin(pages[i], cleanup_mode);
 }
 EXPORT_SYMBOL(p9_release_pages);
diff --git a/net/9p/trans_common.h b/net/9p/trans_common.h
index 32134db6abf3..9b20eb4f2359 100644
--- a/net/9p/trans_common.h
+++ b/net/9p/trans_common.h
@@ -4,4 +4,5 @@
  * Author Venkateswararao Jujjuri
  */
 
-void p9_release_pages(struct page **pages, int nr_pages);
+void p9_release_pages(struct page **pages, int nr_pages,
+		      unsigned int cleanup_mode);
diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index eb28b54fe5f6..561f7cbd79da 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -310,73 +310,34 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
 			       struct iov_iter *data,
 			       int count,
 			       size_t *offs,
-			       int *need_drop,
+			       int *cleanup_mode,
 			       unsigned int gup_flags)
 {
 	int nr_pages;
 	int err;
+	int n;
 
 	if (!iov_iter_count(data))
 		return 0;
 
-	if (!iov_iter_is_kvec(data)) {
-		int n;
-		/*
-		 * We allow only p9_max_pages pinned. We wait for the
-		 * Other zc request to finish here
-		 */
-		if (atomic_read(&vp_pinned) >= chan->p9_max_pages) {
-			err = wait_event_killable(vp_wq,
-			      (atomic_read(&vp_pinned) < chan->p9_max_pages));
-			if (err == -ERESTARTSYS)
-				return err;
-		}
-		n = iov_iter_get_pages_alloc(data, pages, count, offs,
-					     gup_flags);
-		if (n < 0)
-			return n;
-		*need_drop = 1;
-		nr_pages = DIV_ROUND_UP(n + *offs, PAGE_SIZE);
-		atomic_add(nr_pages, &vp_pinned);
-		return n;
-	} else {
-		/* kernel buffer, no need to pin pages */
-		int index;
-		size_t len;
-		void *p;
-
-		/* we'd already checked that it's non-empty */
-		while (1) {
-			len = iov_iter_single_seg_count(data);
-			if (likely(len)) {
-				p = data->kvec->iov_base + data->iov_offset;
-				break;
-			}
-			iov_iter_advance(data, 0);
-		}
-		if (len > count)
-			len = count;
-
-		nr_pages = DIV_ROUND_UP((unsigned long)p + len, PAGE_SIZE) -
-			   (unsigned long)p / PAGE_SIZE;
-
-		*pages = kmalloc_array(nr_pages, sizeof(struct page *),
-				       GFP_NOFS);
-		if (!*pages)
-			return -ENOMEM;
-
-		*need_drop = 0;
-		p -= (*offs = offset_in_page(p));
-		for (index = 0; index < nr_pages; index++) {
-			if (is_vmalloc_addr(p))
-				(*pages)[index] = vmalloc_to_page(p);
-			else
-				(*pages)[index] = kmap_to_page(p);
-			p += PAGE_SIZE;
-		}
-		iov_iter_advance(data, len);
-		return len;
+	/*
+	 * We allow only p9_max_pages pinned. We wait for the
+	 * Other zc request to finish here
+	 */
+	if (atomic_read(&vp_pinned) >= chan->p9_max_pages) {
+		err = wait_event_killable(vp_wq,
+			      (atomic_read(&vp_pinned) < chan->p9_max_pages));
+		if (err == -ERESTARTSYS)
+			return err;
 	}
+
+	n = iov_iter_extract_pages(data, pages, count, offs, gup_flags);
+	if (n < 0)
+		return n;
+	*cleanup_mode = iov_iter_extract_mode(data, gup_flags);
+	nr_pages = DIV_ROUND_UP(n + *offs, PAGE_SIZE);
+	atomic_add(nr_pages, &vp_pinned);
+	return n;
 }
 
 static void handle_rerror(struct p9_req_t *req, int in_hdr_len,
@@ -431,7 +392,7 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 	struct virtio_chan *chan = client->trans;
 	struct scatterlist *sgs[4];
 	size_t offs;
-	int need_drop = 0;
+	int cleanup_mode = 0;
 	int kicked = 0;
 
 	p9_debug(P9_DEBUG_TRANS, "virtio request\n");
@@ -439,7 +400,7 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 	if (uodata) {
 		__le32 sz;
 		int n = p9_get_mapped_pages(chan, &out_pages, uodata,
-					    outlen, &offs, &need_drop,
+					    outlen, &offs, &cleanup_mode,
 					    FOLL_DEST_BUF);
 		if (n < 0) {
 			err = n;
@@ -459,7 +420,7 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 		memcpy(&req->tc.sdata[0], &sz, sizeof(sz));
 	} else if (uidata) {
 		int n = p9_get_mapped_pages(chan, &in_pages, uidata,
-					    inlen, &offs, &need_drop,
+					    inlen, &offs, &cleanup_mode,
 					    FOLL_SOURCE_BUF);
 		if (n < 0) {
 			err = n;
@@ -546,14 +507,14 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 	 * Non kernel buffers are pinned, unpin them
 	 */
 err_out:
-	if (need_drop) {
+	if (cleanup_mode) {
 		if (in_pages) {
-			p9_release_pages(in_pages, in_nr_pages);
+			p9_release_pages(in_pages, in_nr_pages, cleanup_mode);
 			atomic_sub(in_nr_pages, &vp_pinned);
 		}
 		if (out_pages) {
-			p9_release_pages(out_pages, out_nr_pages);
+			p9_release_pages(out_pages, out_nr_pages, cleanup_mode);
 			atomic_sub(out_nr_pages, &vp_pinned);
 		}
 		/* wakeup anybody waiting for slots to pin pages */
 		wake_up(&vp_wq);
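
For readers following the cleanup path, here is a rough userspace model of
what the new cleanup_mode argument is meant to select.  It is only a sketch
and not kernel code.  It assumes that page_put_unpin() (introduced elsewhere
in this series) drops a pin when the mode has the pin bit set, drops a ref
when it has the get bit set, and does nothing otherwise.  The names
model_page, MODEL_FOLL_GET, MODEL_FOLL_PIN, MODEL_PAGE_SIZE and the model_*
functions are all made up for illustration, and the flag values are not the
kernel's.  model_pages_spanned() shows the DIV_ROUND_UP(n + *offs, PAGE_SIZE)
arithmetic used for the vp_pinned accounting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's FOLL_GET / FOLL_PIN bits. */
#define MODEL_FOLL_GET	0x1u
#define MODEL_FOLL_PIN	0x2u
#define MODEL_PAGE_SIZE	4096u

/* Toy page with separate counters so a caller can see which one dropped. */
struct model_page {
	int refs;
	int pins;
};

/* Round-up division, the same shape as the kernel's DIV_ROUND_UP(). */
static size_t model_div_round_up(size_t n, size_t d)
{
	return (n + d - 1) / d;
}

/* Pages spanned by n bytes that start offs bytes into the first page. */
static size_t model_pages_spanned(size_t n, size_t offs)
{
	return model_div_round_up(n + offs, MODEL_PAGE_SIZE);
}

/* Undo whatever the extraction took: a pin, a ref, or nothing at all. */
static void model_page_put_unpin(struct model_page *page, unsigned int mode)
{
	if (mode & MODEL_FOLL_PIN) {
		assert(page->pins > 0);
		page->pins--;
	} else if (mode & MODEL_FOLL_GET) {
		assert(page->refs > 0);
		page->refs--;
	}
	/* mode == 0: the page was left unaltered, so nothing to undo. */
}

/* Shaped like p9_release_pages(): skip NULL slots, pass the mode through. */
static void model_release_pages(struct model_page **pages, int nr_pages,
				unsigned int mode)
{
	for (int i = 0; i < nr_pages; i++)
		if (pages[i])
			model_page_put_unpin(pages[i], mode);
}
```

The point of carrying the mode from extraction to release is that the release
side no longer has to guess how the pages were obtained.  A mode of 0 (as
expected for kernel-buffer iterators) makes the release a no-op.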