From nobody Fri Jun 19 07:45:00 2026
From: David Howells
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs@lists.linux.dev,
	linux-afs@lists.infradead.org, linux-cifs@vger.kernel.org,
	ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Mark Brown, Christian Brauner
Subject: [PATCH 1/4] netfs: Fix wrong return from netfs_read_sizes() on 32-bit SMP arches
Date: Thu, 23 Apr 2026 23:22:04 +0100
Message-ID: <20260423222209.3054909-2-dhowells@redhat.com>
In-Reply-To: <20260423222209.3054909-1-dhowells@redhat.com>
References: <20260423222209.3054909-1-dhowells@redhat.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Fix netfs_read_sizes() on 32-bit SMP arches so that it does not return a
value: the function is declared void, but the seqcount-protected branch
still ended with "return zero_point;".

Fixes: 756f72b6d8db ("netfs: Fix potential for tearing in ->remote_i_size and ->zero_point")
Reported-by: Mark Brown
Closes: https://lore.kernel.org/r/aeoIAXzqh0n54mxl@sirena.org.uk
Signed-off-by: David Howells
cc: Christian Brauner
cc: Paulo Alcantara
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
 include/linux/netfs.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index d72bc2f11734..ad394c088578 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -669,7 +669,6 @@ static inline void netfs_read_sizes(const struct netfs_inode *ictx,
 		*remote_i_size = ictx->_remote_i_size;
 		*zero_point = ictx->_zero_point;
 	} while (read_seqcount_retry(&inode->i_size_seqcount, seq));
-	return zero_point;
 #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION)
 	unsigned long long zero_point;

From nobody Fri Jun 19 07:45:00 2026
From: David Howells
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs@lists.linux.dev,
	linux-afs@lists.infradead.org, linux-cifs@vger.kernel.org,
	ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH 2/4] netfs: Fix missing barriers when accessing stream->subrequests locklessly
Date: Thu, 23 Apr 2026 23:22:05 +0100
Message-ID: <20260423222209.3054909-3-dhowells@redhat.com>
In-Reply-To: <20260423222209.3054909-1-dhowells@redhat.com>
References: <20260423222209.3054909-1-dhowells@redhat.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The list of subrequests attached to stream->subrequests is accessed without
locks by netfs_collect_read_results() and netfs_collect_write_results(),
but they access subreq->flags without issuing a barrier after getting the
subreq pointer from the list.

Relatedly, the functions that build the list don't use any sort of write
barrier when constructing the list to make sure that the
NETFS_SREQ_IN_PROGRESS flag is perceived to be set first if no lock is
taken.

Fix this as follows:

 (1) Add a new list_add_tail_release() function that uses a release barrier
     to set the pointer to the new member of the list.

 (2) Add a new list_first_entry_acquire() function that uses an acquire
     barrier to read the pointer to the first member in a list (or returns
     NULL if the list is empty).

 (3) Use list_add_tail_release() when adding a subreq to ->subrequests.

 (4) Make direct-read and read-single use netfs_queue_read() so that they
     share the relevant bit of code with buffered-read.

 (5) Use list_first_entry_acquire() when initially accessing the front of
     the list (when an item is removed, the pointer to the new front item
     is obtained under the same lock).

Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item")
Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
Link: https://sashiko.dev/#/patchset/20260326104544.509518-1-dhowells%40redhat.com
Signed-off-by: David Howells
cc: Paulo Alcantara
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/buffered_read.c |  9 +++++----
 fs/netfs/direct_read.c   | 15 +--------------
 fs/netfs/internal.h      |  3 +++
 fs/netfs/read_collect.c  |  4 +++-
 fs/netfs/read_single.c   | 12 +-----------
 fs/netfs/write_collect.c |  4 +++-
 fs/netfs/write_issue.c   |  3 ++-
 include/linux/list.h     | 37 +++++++++++++++++++++++++++++++++++++
 8 files changed, 55 insertions(+), 32 deletions(-)

diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index dcfe51eec266..5c78ec14e46b 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -156,9 +156,9 @@ static void netfs_read_cache_to_pagecache(struct netfs_io_request *rreq,
 			     netfs_cache_read_terminated, subreq);
 }
 
-static void netfs_queue_read(struct netfs_io_request *rreq,
-			     struct netfs_io_subrequest *subreq,
-			     bool last_subreq)
+void netfs_queue_read(struct netfs_io_request *rreq,
+		      struct netfs_io_subrequest *subreq,
+		      bool last_subreq)
 {
 	struct netfs_io_stream *stream = &rreq->io_streams[0];
 
@@ -169,7 +169,8 @@ static void netfs_queue_read(struct netfs_io_request *rreq,
 	 * remove entries off of the front.
 	 */
 	spin_lock(&rreq->lock);
-	list_add_tail(&subreq->rreq_link, &stream->subrequests);
+	/* Write IN_PROGRESS before pointer to new subreq */
+	list_add_tail_release(&subreq->rreq_link, &stream->subrequests);
 	if (list_is_first(&subreq->rreq_link, &stream->subrequests)) {
 		if (!stream->active) {
 			stream->collected_to = subreq->start;
diff --git a/fs/netfs/direct_read.c b/fs/netfs/direct_read.c
index f72e6da88cca..69a1a1e26143 100644
--- a/fs/netfs/direct_read.c
+++ b/fs/netfs/direct_read.c
@@ -47,7 +47,6 @@ static void netfs_prepare_dio_read_iterator(struct netfs_io_subrequest *subreq)
  */
 static int netfs_dispatch_unbuffered_reads(struct netfs_io_request *rreq)
 {
-	struct netfs_io_stream *stream = &rreq->io_streams[0];
 	unsigned long long start = rreq->start;
 	ssize_t size = rreq->len;
 	int ret = 0;
@@ -66,19 +65,7 @@ static int netfs_dispatch_unbuffered_reads(struct netfs_io_request *rreq)
 		subreq->start = start;
 		subreq->len = size;
 
-		__set_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags);
-
-		spin_lock(&rreq->lock);
-		list_add_tail(&subreq->rreq_link, &stream->subrequests);
-		if (list_is_first(&subreq->rreq_link, &stream->subrequests)) {
-			if (!stream->active) {
-				stream->collected_to = subreq->start;
-				/* Store list pointers before active flag */
-				smp_store_release(&stream->active, true);
-			}
-		}
-		trace_netfs_sreq(subreq, netfs_sreq_trace_added);
-		spin_unlock(&rreq->lock);
+		netfs_queue_read(rreq, subreq, false);
 
 		netfs_stat(&netfs_n_rh_download);
 		if (rreq->netfs_ops->prepare_read) {
diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
index d436e20d3418..964479335ff7 100644
--- a/fs/netfs/internal.h
+++ b/fs/netfs/internal.h
@@ -23,6 +23,9 @@
 /*
  * buffered_read.c
  */
+void netfs_queue_read(struct netfs_io_request *rreq,
+		      struct netfs_io_subrequest *subreq,
+		      bool last_subreq);
 void netfs_cache_read_terminated(void *priv, ssize_t transferred_or_error);
 int netfs_prefetch_for_write(struct file *file, struct folio *folio,
 			     size_t offset, size_t len);
diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c
index eae067e3eaa5..5847796b54ec 100644
--- a/fs/netfs/read_collect.c
+++ b/fs/netfs/read_collect.c
@@ -205,8 +205,10 @@ static void netfs_collect_read_results(struct netfs_io_request *rreq)
 	 * in progress. The issuer thread may be adding stuff to the tail
 	 * whilst we're doing this.
 	 */
-	front = list_first_entry_or_null(&stream->subrequests,
+	front = list_first_entry_acquire(&stream->subrequests,
 					 struct netfs_io_subrequest, rreq_link);
+	/* Read first subreq pointer before IN_PROGRESS flag. */
+
 	while (front) {
 		size_t transferred;
 
diff --git a/fs/netfs/read_single.c b/fs/netfs/read_single.c
index d0e23bc42445..30e184caadb2 100644
--- a/fs/netfs/read_single.c
+++ b/fs/netfs/read_single.c
@@ -89,7 +89,6 @@ static void netfs_single_read_cache(struct netfs_io_request *rreq,
  */
 static int netfs_single_dispatch_read(struct netfs_io_request *rreq)
 {
-	struct netfs_io_stream *stream = &rreq->io_streams[0];
 	struct netfs_io_subrequest *subreq;
 	int ret = 0;
 
@@ -102,14 +101,7 @@ static int netfs_single_dispatch_read(struct netfs_io_request *rreq)
 	subreq->len = rreq->len;
 	subreq->io_iter = rreq->buffer.iter;
 
-	__set_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags);
-
-	spin_lock(&rreq->lock);
-	list_add_tail(&subreq->rreq_link, &stream->subrequests);
-	trace_netfs_sreq(subreq, netfs_sreq_trace_added);
-	/* Store list pointers before active flag */
-	smp_store_release(&stream->active, true);
-	spin_unlock(&rreq->lock);
+	netfs_queue_read(rreq, subreq, true);
 
 	netfs_single_cache_prepare_read(rreq, subreq);
 	switch (subreq->source) {
@@ -137,8 +129,6 @@ static int netfs_single_dispatch_read(struct netfs_io_request *rreq)
 		break;
 	}
 
-	smp_wmb(); /* Write lists before ALL_QUEUED. */
-	set_bit(NETFS_RREQ_ALL_QUEUED, &rreq->flags);
 	return ret;
 cancel:
 	netfs_put_subrequest(subreq, netfs_sreq_trace_put_cancel);
diff --git a/fs/netfs/write_collect.c b/fs/netfs/write_collect.c
index 4718e5174d65..f0cafa1d5835 100644
--- a/fs/netfs/write_collect.c
+++ b/fs/netfs/write_collect.c
@@ -227,8 +227,10 @@ static void netfs_collect_write_results(struct netfs_io_request *wreq)
 		if (!smp_load_acquire(&stream->active))
 			continue;
 
-		front = list_first_entry_or_null(&stream->subrequests,
+		front = list_first_entry_acquire(&stream->subrequests,
 						 struct netfs_io_subrequest, rreq_link);
+		/* Read first subreq pointer before IN_PROGRESS flag. */
+
 		while (front) {
 			trace_netfs_collect_sreq(wreq, front);
 			//_debug("sreq [%x] %llx %zx/%zx",
diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
index 2db688f94125..b0e9690bb90c 100644
--- a/fs/netfs/write_issue.c
+++ b/fs/netfs/write_issue.c
@@ -204,7 +204,8 @@ void netfs_prepare_write(struct netfs_io_request *wreq,
 	 * remove entries off of the front.
 	 */
 	spin_lock(&wreq->lock);
-	list_add_tail(&subreq->rreq_link, &stream->subrequests);
+	/* Write IN_PROGRESS before pointer to new subreq */
+	list_add_tail_release(&subreq->rreq_link, &stream->subrequests);
 	if (list_is_first(&subreq->rreq_link, &stream->subrequests)) {
 		if (!stream->active) {
 			stream->collected_to = subreq->start;
diff --git a/include/linux/list.h b/include/linux/list.h
index 00ea8e5fb88b..5af356efd725 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -191,6 +191,29 @@ static inline void list_add_tail(struct list_head *new, struct list_head *head)
 	__list_add(new, head->prev, head);
 }
 
+/**
+ * list_add_tail_release - add a new entry with release barrier
+ * @new: new entry to be added
+ * @head: list head to add it before
+ *
+ * Insert a new entry before the specified head, using a release barrier to set
+ * the ->next pointer that points to it. This is useful for implementing
+ * queues, in particular one that the elements will be walked through forwards
+ * locklessly.
+ */
+static inline void list_add_tail_release(struct list_head *new,
+					 struct list_head *head)
+{
+	struct list_head *prev = head->prev;
+
+	if (__list_add_valid(new, prev, head)) {
+		new->next = head;
+		new->prev = prev;
+		head->prev = new;
+		smp_store_release(&prev->next, new);
+	}
+}
+
 /*
  * Delete a list entry by making the prev/next entries
  * point to each other.
@@ -644,6 +667,20 @@ static inline void list_splice_tail_init(struct list_head *list,
 	pos__ != head__ ? list_entry(pos__, type, member) : NULL; \
 })
 
+/**
+ * list_first_entry_acquire - get the first element from a list with barrier
+ * @ptr: the list head to take the element from.
+ * @type: the type of the struct this is embedded in.
+ * @member: the name of the list_head within the struct.
+ *
+ * Note that if the list is empty, it returns NULL.
+ */
+#define list_first_entry_acquire(ptr, type, member) ({ \
+	struct list_head *head__ = (ptr); \
+	struct list_head *pos__ = smp_load_acquire(&head__->next); \
+	pos__ != head__ ? list_entry(pos__, type, member) : NULL; \
+})
+
 /**
  * list_last_entry_or_null - get the last element from a list
  * @ptr: the list head to take the element from.

From nobody Fri Jun 19 07:45:00 2026
From: David Howells
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs@lists.linux.dev,
	linux-afs@lists.infradead.org, linux-cifs@vger.kernel.org,
	ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Marc Dionne
Subject: [PATCH 3/4] afs: Fix afs_get_link() to take validate_lock around afs_read_single()
Date: Thu, 23 Apr 2026 23:22:06 +0100
Message-ID: <20260423222209.3054909-4-dhowells@redhat.com>
In-Reply-To: <20260423222209.3054909-1-dhowells@redhat.com>
References: <20260423222209.3054909-1-dhowells@redhat.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Fix afs_get_link() to take the validate_lock around afs_read_single() to
prevent races between multiple ->get_link() calls.

Fixes: eae9e78951bb ("afs: Use netfslib for symlinks, allowing them to be cached")
Closes: https://sashiko.dev/#/patchset/20260326104544.509518-1-dhowells%40redhat.com
Signed-off-by: David Howells
cc: Marc Dionne
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
---
 fs/afs/inode.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 06e25e1b12df..5207c4a003f6 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -78,10 +78,19 @@ const char *afs_get_link(struct dentry *dentry, struct inode *inode,
 		goto good;
 
 fetch:
-	ret = afs_read_single(vnode, NULL);
-	if (ret < 0)
-		return ERR_PTR(ret);
-	set_bit(AFS_VNODE_DIR_READ, &vnode->flags);
+	if (down_write_killable(&vnode->validate_lock) < 0)
+		return ERR_PTR(-ERESTARTSYS);
+	if (test_and_clear_bit(AFS_VNODE_ZAP_DATA, &vnode->flags) ||
+	    !test_bit(AFS_VNODE_DIR_READ, &vnode->flags)) {
+		ret = afs_read_single(vnode, NULL);
+		if (ret < 0) {
+			up_write(&vnode->validate_lock);
+			return ERR_PTR(ret);
+		}
+		set_bit(AFS_VNODE_DIR_READ, &vnode->flags);
+	}
+
+	up_write(&vnode->validate_lock);
 
 good:
 	folio = folioq_folio(vnode->directory, 0);

From nobody Fri Jun 19 07:45:00 2026
From: David Howells
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs@lists.linux.dev,
	linux-afs@lists.infradead.org, linux-cifs@vger.kernel.org,
	ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Marc Dionne
Subject: [PATCH 4/4] afs: Fix RCU handling of symlinks in RCU pathwalk
Date: Thu, 23 Apr 2026 23:22:07 +0100
Message-ID: <20260423222209.3054909-5-dhowells@redhat.com>
In-Reply-To: <20260423222209.3054909-1-dhowells@redhat.com>
References: <20260423222209.3054909-1-dhowells@redhat.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The afs filesystem in the kernel doesn't handle RCU pathwalk of symlinks
correctly. The problem is twofold: firstly, it doesn't treat the buffer
pointers as RCU pointers with the appropriate barriering; and secondly, it
can race with another thread updating the contents of the symlink because a
third party updated it on the server.

Fix this by the following means:

 (1) Keep a separate copy of the symlink contents with an rcu_head. This is
     always going to be a lot smaller than a page, so it can be kmalloc'd,
     which saves quite a bit of memory. It also needs a refcount for
     non-RCU pathwalk.

 (2) Split the symlink read and write-to-cache routines in afs from those
     for directories.

 (3) Discard the I/O buffer as soon as the write-to-cache completes as this
     is a full page (plus a folio_queue).

 (4) If there's no cache, discard the I/O buffer immediately after reading
     and copying.

Fixes: 6698c02d64b2 ("afs: Locally initialise the contents of a new symlink on creation")
Closes: https://sashiko.dev/#/patchset/20260326104544.509518-1-dhowells%40redhat.com
Signed-off-by: David Howells
cc: Marc Dionne
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
---
 fs/afs/Makefile    |   1 +
 fs/afs/dir.c       |  33 +++++--
 fs/afs/fsclient.c  |   4 +-
 fs/afs/inode.c     | 105 +-------------------
 fs/afs/internal.h  |  35 +++++--
 fs/afs/symlink.c   | 242 +++++++++++++++++++++++++++++++++++++++++++++
 fs/afs/yfsclient.c |   4 +-
 7 files changed, 303 insertions(+), 121 deletions(-)
 create mode 100644 fs/afs/symlink.c

diff --git a/fs/afs/Makefile b/fs/afs/Makefile
index b49b8fe682f3..0d8f1982d596 100644
--- a/fs/afs/Makefile
+++ b/fs/afs/Makefile
@@ -30,6 +30,7 @@ kafs-y := \
 	server.o \
 	server_list.o \
 	super.o \
+	symlink.o \
 	validation.o \
 	vlclient.o \
 	vl_alias.o \
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index aaaa55878ffd..40f6791114ec 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -68,7 +68,7 @@ const struct inode_operations afs_dir_inode_operations = {
 };
 
 const struct address_space_operations afs_dir_aops = {
-	.writepages = afs_single_writepages,
+	.writepages = afs_dir_writepages,
 };
 
 const struct dentry_operations afs_fs_dentry_operations = {
@@ -294,7 +294,7 @@ static ssize_t afs_do_read_single(struct afs_vnode *dvnode, struct file *file)
 	return ret;
 }
 
-ssize_t afs_read_single(struct afs_vnode *dvnode, struct file *file)
+static ssize_t afs_read_single(struct afs_vnode *dvnode, struct file *file)
 {
 	ssize_t ret;
 
@@ -1763,13 +1763,20 @@ static int afs_link(struct dentry *from, struct inode *dir,
 	return ret;
 }
 
+static void afs_symlink_put(struct afs_operation *op)
+{
+	kfree(op->create.symlink);
+	op->create.symlink = NULL;
+	afs_create_put(op);
+}
+
 static const struct afs_operation_ops afs_symlink_operation = {
 	.issue_afs_rpc = afs_fs_symlink,
 	.issue_yfs_rpc = yfs_fs_symlink,
 	.success = afs_create_success,
 	.aborted = afs_check_for_remote_deletion,
 	.edit_dir = afs_create_edit_dir,
-	.put = afs_create_put,
+	.put = afs_symlink_put,
 };
 
 /*
@@ -1779,7 +1786,9 @@ static int afs_symlink(struct mnt_idmap *idmap, struct inode *dir,
 		       struct dentry *dentry, const char *content)
 {
 	struct afs_operation *op;
+	struct afs_symlink *symlink;
 	struct afs_vnode *dvnode = AFS_FS_I(dir);
+	size_t clen = strlen(content);
 	int ret;
 
 	_enter("{%llx:%llu},{%pd},%s",
@@ -1791,12 +1800,20 @@ static int afs_symlink(struct mnt_idmap *idmap, struct inode *dir,
 		goto error;
 
 	ret = -EINVAL;
-	if (strlen(content) >= AFSPATHMAX)
+	if (clen >= AFSPATHMAX)
+		goto error;
+
+	ret = -ENOMEM;
+	symlink = kmalloc_flex(struct afs_symlink, content, clen + 1, GFP_KERNEL);
+	if (!symlink)
 		goto error;
+	refcount_set(&symlink->ref, 1);
+	memcpy(symlink->content, content, clen + 1);
 
 	op = afs_alloc_operation(NULL, dvnode->volume);
 	if (IS_ERR(op)) {
 		ret = PTR_ERR(op);
+		kfree(symlink);
 		goto error;
 	}
 
@@ -1808,7 +1825,7 @@ static int afs_symlink(struct mnt_idmap *idmap, struct inode *dir,
 	op->dentry = dentry;
 	op->ops = &afs_symlink_operation;
 	op->create.reason = afs_edit_dir_for_symlink;
-	op->create.symlink = content;
+	op->create.symlink = symlink;
 	op->mtime = current_time(dir);
 	ret = afs_do_sync_operation(op);
 	afs_dir_unuse_cookie(dvnode, ret);
@@ -2192,10 +2209,10 @@ static int afs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
 }
 
 /*
- * Write the file contents to the cache as a single blob.
+ * Write the directory contents to the cache as a single blob.
  */
-int afs_single_writepages(struct address_space *mapping,
-			  struct writeback_control *wbc)
+int afs_dir_writepages(struct address_space *mapping,
+		       struct writeback_control *wbc)
 {
 	struct afs_vnode *dvnode = AFS_FS_I(mapping->host);
 	struct iov_iter iter;
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index 95494d5f2b8a..a2ffd60889f8 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -886,7 +886,7 @@ void afs_fs_symlink(struct afs_operation *op)
 	namesz = name->len;
 	padsz = (4 - (namesz & 3)) & 3;
 
-	c_namesz = strlen(op->create.symlink);
+	c_namesz = strlen(op->create.symlink->content);
 	c_padsz = (4 - (c_namesz & 3)) & 3;
 
 	reqsz = (6 * 4) + namesz + padsz + c_namesz + c_padsz + (6 * 4);
@@ -910,7 +910,7 @@ void afs_fs_symlink(struct afs_operation *op)
 		bp = (void *) bp + padsz;
 	}
 	*bp++ = htonl(c_namesz);
-	memcpy(bp, op->create.symlink, c_namesz);
+	memcpy(bp, op->create.symlink->content, c_namesz);
 	bp = (void *) bp + c_namesz;
 	if (c_padsz > 0) {
 		memset(bp, 0, c_padsz);
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 5207c4a003f6..ff2b8fc7f3df 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -25,105 +25,6 @@
 #include "internal.h"
 #include "afs_fs.h"
 
-void afs_init_new_symlink(struct afs_vnode *vnode, struct afs_operation *op)
-{
-	size_t size = strlen(op->create.symlink) + 1;
-	size_t dsize = 0;
-	char *p;
-
-	if (netfs_alloc_folioq_buffer(NULL, &vnode->directory, &dsize, size,
-				      mapping_gfp_mask(vnode->netfs.inode.i_mapping)) < 0)
-		return;
-
-	vnode->directory_size = dsize;
-	p = kmap_local_folio(folioq_folio(vnode->directory, 0), 0);
-	memcpy(p, op->create.symlink, size);
-	kunmap_local(p);
-	set_bit(AFS_VNODE_DIR_READ, &vnode->flags);
-	netfs_single_mark_inode_dirty(&vnode->netfs.inode);
-}
-
-static void afs_put_link(void *arg)
-{
-	struct folio *folio = virt_to_folio(arg);
-
-	kunmap_local(arg);
-	folio_put(folio);
-}
-
-const char *afs_get_link(struct dentry *dentry, struct inode *inode,
-			 struct delayed_call *callback)
-{
-	struct afs_vnode *vnode = AFS_FS_I(inode);
-	struct folio *folio;
-	char *content;
-	ssize_t ret;
-
-	if (!dentry) {
-		/* RCU pathwalk. */
-		if (!test_bit(AFS_VNODE_DIR_READ, &vnode->flags) || !afs_check_validity(vnode))
-			return ERR_PTR(-ECHILD);
-		goto good;
-	}
-
-	if (test_bit(AFS_VNODE_DIR_READ, &vnode->flags))
-		goto fetch;
-
-	ret = afs_validate(vnode, NULL);
-	if (ret < 0)
-		return ERR_PTR(ret);
-
-	if (!test_and_clear_bit(AFS_VNODE_ZAP_DATA, &vnode->flags) &&
-	    test_bit(AFS_VNODE_DIR_READ, &vnode->flags))
-		goto good;
-
-fetch:
-	if (down_write_killable(&vnode->validate_lock) < 0)
-		return ERR_PTR(-ERESTARTSYS);
-	if (test_and_clear_bit(AFS_VNODE_ZAP_DATA, &vnode->flags) ||
-	    !test_bit(AFS_VNODE_DIR_READ, &vnode->flags)) {
-		ret = afs_read_single(vnode, NULL);
-		if (ret < 0) {
-			up_write(&vnode->validate_lock);
-			return ERR_PTR(ret);
-		}
-		set_bit(AFS_VNODE_DIR_READ, &vnode->flags);
-	}
-
-	up_write(&vnode->validate_lock);
-
-good:
-	folio = folioq_folio(vnode->directory, 0);
-	folio_get(folio);
-	content = kmap_local_folio(folio, 0);
-	set_delayed_call(callback, afs_put_link, content);
-	return content;
-}
-
-int afs_readlink(struct dentry *dentry, char __user *buffer, int buflen)
-{
-	DEFINE_DELAYED_CALL(done);
-	const char *content;
-	int len;
-
-	content = afs_get_link(dentry, d_inode(dentry), &done);
-	if (IS_ERR(content)) {
-		do_delayed_call(&done);
-		return PTR_ERR(content);
-	}
-
-	len = umin(strlen(content), buflen);
-	if (copy_to_user(buffer, content, len))
-		len = -EFAULT;
-	do_delayed_call(&done);
-	return len;
-}
-
-static const struct inode_operations afs_symlink_inode_operations = {
-	.get_link = afs_get_link,
-	.readlink = afs_readlink,
-};
-
 static noinline void dump_vnode(struct afs_vnode
*vnode, struct afs_vnode = *parent_vnode) { static unsigned long once_only; @@ -223,7 +124,7 @@ static int afs_inode_init_from_status(struct afs_operat= ion *op, inode->i_mode =3D S_IFLNK | status->mode; inode->i_op =3D &afs_symlink_inode_operations; } - inode->i_mapping->a_ops =3D &afs_dir_aops; + inode->i_mapping->a_ops =3D &afs_symlink_aops; inode_nohighmem(inode); mapping_set_release_always(inode->i_mapping); break; @@ -765,12 +666,14 @@ void afs_evict_inode(struct inode *inode) .range_end =3D LLONG_MAX, }; =20 - afs_single_writepages(inode->i_mapping, &wbc); + inode->i_mapping->a_ops->writepages(inode->i_mapping, &wbc); } =20 netfs_wait_for_outstanding_io(inode); truncate_inode_pages_final(&inode->i_data); netfs_free_folioq_buffer(vnode->directory); + if (vnode->symlink) + afs_replace_symlink(vnode, NULL); =20 afs_set_cache_aux(vnode, &aux); netfs_clear_inode_writeback(inode, &aux); diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 009064b8d661..802ae22133ae 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -711,6 +711,7 @@ struct afs_vnode { #define AFS_VNODE_DIR_READ 11 /* Set if we've read a dir's contents */ =20 struct folio_queue *directory; /* Directory contents */ + struct afs_symlink __rcu *symlink; /* Symlink content */ struct list_head wb_keys; /* List of keys available for writeback */ struct list_head pending_locks; /* locks waiting to be granted */ struct list_head granted_locks; /* locks granted on this file */ @@ -777,6 +778,15 @@ struct afs_permits { struct afs_permit permits[] __counted_by(nr_permits); /* List of permits = sorted by key pointer */ }; =20 +/* + * Copy of symlink content for normal use. + */ +struct afs_symlink { + struct rcu_head rcu; + refcount_t ref; + char content[]; +}; + /* * Error prioritisation and accumulation. 
*/ @@ -888,7 +898,7 @@ struct afs_operation { struct { int reason; /* enum afs_edit_dir_reason */ mode_t mode; - const char *symlink; + struct afs_symlink *symlink; } create; struct { bool need_rehash; @@ -1099,13 +1109,12 @@ extern const struct inode_operations afs_dir_inode_= operations; extern const struct address_space_operations afs_dir_aops; extern const struct dentry_operations afs_fs_dentry_operations; =20 -ssize_t afs_read_single(struct afs_vnode *dvnode, struct file *file); ssize_t afs_read_dir(struct afs_vnode *dvnode, struct file *file) __acquires(&dvnode->validate_lock); extern void afs_d_release(struct dentry *); extern void afs_check_for_remote_deletion(struct afs_operation *); -int afs_single_writepages(struct address_space *mapping, - struct writeback_control *wbc); +int afs_dir_writepages(struct address_space *mapping, + struct writeback_control *wbc); =20 /* * dir_edit.c @@ -1247,10 +1256,6 @@ extern void afs_fs_probe_cleanup(struct afs_net *); */ extern const struct afs_operation_ops afs_fetch_status_operation; =20 -void afs_init_new_symlink(struct afs_vnode *vnode, struct afs_operation *o= p); -const char *afs_get_link(struct dentry *dentry, struct inode *inode, - struct delayed_call *callback); -int afs_readlink(struct dentry *dentry, char __user *buffer, int buflen); extern void afs_vnode_commit_status(struct afs_operation *, struct afs_vno= de_param *); extern int afs_fetch_status(struct afs_vnode *, struct key *, bool, afs_ac= cess_t *); extern int afs_ilookup5_test_by_fid(struct inode *, void *); @@ -1600,6 +1605,20 @@ void afs_detach_volume_from_servers(struct afs_volum= e *volume, struct afs_server extern int __init afs_fs_init(void); extern void afs_fs_exit(void); =20 +/* + * symlink.c + */ +extern const struct inode_operations afs_symlink_inode_operations; +extern const struct address_space_operations afs_symlink_aops; + +void afs_replace_symlink(struct afs_vnode *vnode, struct afs_symlink *syml= ink); +void afs_init_new_symlink(struct 
afs_vnode *vnode, struct afs_operation *o= p); +const char *afs_get_link(struct dentry *dentry, struct inode *inode, + struct delayed_call *callback); +int afs_readlink(struct dentry *dentry, char __user *buffer, int buflen); +int afs_symlink_writepages(struct address_space *mapping, + struct writeback_control *wbc); + /* * validation.c */ diff --git a/fs/afs/symlink.c b/fs/afs/symlink.c new file mode 100644 index 000000000000..8d2521c5f19d --- /dev/null +++ b/fs/afs/symlink.c @@ -0,0 +1,242 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* AFS filesystem symbolic link handling + * + * Copyright (C) 2026 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + */ + +#include +#include +#include +#include +#include +#include "internal.h" + +static void afs_put_symlink(struct afs_symlink *symlink) +{ + if (refcount_dec_and_test(&symlink->ref)) + kfree_rcu(symlink, rcu); +} + +void afs_replace_symlink(struct afs_vnode *vnode, struct afs_symlink *syml= ink) +{ + struct afs_symlink *old; + + old =3D rcu_replace_pointer(vnode->symlink, symlink, + lockdep_is_held(&vnode->validate_lock)); + if (old) + afs_put_symlink(old); +} + +/* + * Set up a locally created symlink inode for immediate write to the cache. + */ +void afs_init_new_symlink(struct afs_vnode *vnode, struct afs_operation *o= p) +{ + size_t dsize =3D 0; + size_t size =3D strlen(op->create.symlink->content) + 1; + char *p; + + rcu_assign_pointer(vnode->symlink, op->create.symlink); + op->create.symlink =3D NULL; + + if (!fscache_cookie_enabled(netfs_i_cookie(&vnode->netfs))) + return; + + if (netfs_alloc_folioq_buffer(NULL, &vnode->directory, &dsize, size, + mapping_gfp_mask(vnode->netfs.inode.i_mapping)) < 0) + return; + + vnode->directory_size =3D dsize; + p =3D kmap_local_folio(folioq_folio(vnode->directory, 0), 0); + memcpy(p, vnode->symlink->content, size); + kunmap_local(p); + netfs_single_mark_inode_dirty(&vnode->netfs.inode); +} + +/* + * Read a symlink in a single download.
+ */ +static ssize_t afs_do_read_symlink(struct afs_vnode *vnode) +{ + struct afs_symlink *symlink; + struct iov_iter iter; + ssize_t ret; + loff_t i_size; + + i_size =3D i_size_read(&vnode->netfs.inode); + if (i_size > PAGE_SIZE - 1) { + trace_afs_file_error(vnode, -EFBIG, afs_file_error_dir_big); + return -EFBIG; + } + + if (!vnode->directory) { + size_t cur_size =3D 0; + + ret =3D netfs_alloc_folioq_buffer(NULL, + &vnode->directory, &cur_size, PAGE_SIZE, + mapping_gfp_mask(vnode->netfs.inode.i_mapping)); + vnode->directory_size =3D PAGE_SIZE - 1; + if (ret < 0) + return ret; + } + + iov_iter_folio_queue(&iter, ITER_DEST, vnode->directory, 0, 0, PAGE_SIZE); + + /* AFS requires us to perform the read of a symlink as a single unit to + * avoid issues with the content being changed between reads. + */ + ret =3D netfs_read_single(&vnode->netfs.inode, NULL, &iter); + if (ret >=3D 0) { + i_size =3D i_size_read(&vnode->netfs.inode); + if (i_size > PAGE_SIZE - 1) { + trace_afs_file_error(vnode, -EFBIG, afs_file_error_dir_big); + return -EFBIG; + } + vnode->directory_size =3D i_size; + + /* Copy the symlink. 
*/ + symlink =3D kmalloc_flex(struct afs_symlink, content, i_size + 1, + GFP_KERNEL); + if (!symlink) + return -ENOMEM; + + refcount_set(&symlink->ref, 1); + symlink->content[i_size] =3D 0; + + const char *s =3D kmap_local_folio(folioq_folio(vnode->directory, 0), 0); + + memcpy(symlink->content, s, i_size); + kunmap_local(s); + + afs_replace_symlink(vnode, symlink); + } + + if (!fscache_cookie_enabled(netfs_i_cookie(&vnode->netfs))) { + netfs_free_folioq_buffer(vnode->directory); + vnode->directory =3D NULL; + vnode->directory_size =3D 0; + } + + return ret; +} + +static ssize_t afs_read_symlink(struct afs_vnode *vnode) +{ + ssize_t ret; + + fscache_use_cookie(afs_vnode_cache(vnode), false); + ret =3D afs_do_read_symlink(vnode); + fscache_unuse_cookie(afs_vnode_cache(vnode), NULL, NULL); + return ret; +} + +static void afs_put_link(void *arg) +{ + afs_put_symlink(arg); +} + +const char *afs_get_link(struct dentry *dentry, struct inode *inode, + struct delayed_call *callback) +{ + struct afs_symlink *symlink; + struct afs_vnode *vnode =3D AFS_FS_I(inode); + ssize_t ret; + + if (!dentry) { + /* RCU pathwalk. 
*/ + if (!vnode->symlink || !afs_check_validity(vnode)) + return ERR_PTR(-ECHILD); + set_delayed_call(callback, NULL, NULL); + return rcu_dereference(vnode->symlink)->content; + } + + if (vnode->symlink) { + ret =3D afs_validate(vnode, NULL); + if (ret < 0) + return ERR_PTR(ret); + + down_read(&vnode->validate_lock); + if (!test_bit(AFS_VNODE_ZAP_DATA, &vnode->flags)) + goto good; + up_read(&vnode->validate_lock); + } + + if (down_write_killable(&vnode->validate_lock) < 0) + return ERR_PTR(-ERESTARTSYS); + if (!vnode->symlink || + test_and_clear_bit(AFS_VNODE_ZAP_DATA, &vnode->flags)) { + ret =3D afs_read_symlink(vnode); + if (ret < 0) { + up_write(&vnode->validate_lock); + return ERR_PTR(ret); + } + } + + downgrade_write(&vnode->validate_lock); +=09 +good: + symlink =3D rcu_dereference_protected(vnode->symlink, + lockdep_is_held(&vnode->validate_lock)); + refcount_inc(&symlink->ref); + up_read(&vnode->validate_lock); + + set_delayed_call(callback, afs_put_link, symlink); + return symlink->content; +} + +int afs_readlink(struct dentry *dentry, char __user *buffer, int buflen) +{ + DEFINE_DELAYED_CALL(done); + const char *content; + int len; + + content =3D afs_get_link(dentry, d_inode(dentry), &done); + if (IS_ERR(content)) { + do_delayed_call(&done); + return PTR_ERR(content); + } + + len =3D umin(strlen(content), buflen); + if (copy_to_user(buffer, content, len)) + len =3D -EFAULT; + do_delayed_call(&done); + return len; +} + +/* + * Write the symlink contents to the cache as a single blob. We then throw + * away the page we used to receive it. 
+ */ +int afs_symlink_writepages(struct address_space *mapping, + struct writeback_control *wbc) +{ + struct afs_vnode *vnode =3D AFS_FS_I(mapping->host); + struct iov_iter iter; + int ret =3D 0; + + down_write(&vnode->validate_lock); + + if (vnode->directory && + atomic64_read(&vnode->cb_expires_at) !=3D AFS_NO_CB_PROMISE) { + iov_iter_folio_queue(&iter, ITER_SOURCE, vnode->directory, 0, 0, + i_size_read(&vnode->netfs.inode)); + ret =3D netfs_writeback_single(mapping, wbc, &iter); + } + + netfs_free_folioq_buffer(vnode->directory); + vnode->directory =3D NULL; + vnode->directory_size =3D 0; + + up_write(&vnode->validate_lock); + return ret; +} + +const struct inode_operations afs_symlink_inode_operations =3D { + .get_link =3D afs_get_link, + .readlink =3D afs_readlink, +}; + +const struct address_space_operations afs_symlink_aops =3D { + .writepages =3D afs_symlink_writepages, +}; diff --git a/fs/afs/yfsclient.c b/fs/afs/yfsclient.c index 24fb562ebd33..d941179730a9 100644 --- a/fs/afs/yfsclient.c +++ b/fs/afs/yfsclient.c @@ -960,7 +960,7 @@ void yfs_fs_symlink(struct afs_operation *op) =20 _enter(""); =20 - contents_sz =3D strlen(op->create.symlink); + contents_sz =3D strlen(op->create.symlink->content); call =3D afs_alloc_flat_call(op->net, &yfs_RXYFSSymlink, sizeof(__be32) + sizeof(struct yfs_xdr_RPCFlags) + @@ -981,7 +981,7 @@ void yfs_fs_symlink(struct afs_operation *op) bp =3D xdr_encode_u32(bp, 0); /* RPC flags */ bp =3D xdr_encode_YFSFid(bp, &dvp->fid); bp =3D xdr_encode_name(bp, name); - bp =3D xdr_encode_string(bp, op->create.symlink, contents_sz); + bp =3D xdr_encode_string(bp, op->create.symlink->content, contents_sz); bp =3D xdr_encode_YFSStoreStatus(bp, &mode, &op->mtime); yfs_check_req(call, bp);