From nobody Fri Dec 27 10:23:20 2024
Date: Mon, 9 Dec 2024 12:39:55 -0500
From: Nikhil Jha
To: Trond Myklebust, Anna Schumaker
Cc: linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH] nfs: propagate fileid changed errors back to syscall

Hello!

This is the first kernel patch I have tried to upstream. I'm following
the kernel newbies guide, but apologies if I got anything wrong.

Currently, if there is a mismatch between the request and response
fileids in an NFS request, the kernel logs an error and attempts to
return -ESTALE. However, this error is dropped before it makes it all
the way to userspace. This appears to be a mistake, since as far as I
can tell that -ESTALE value is never consumed anywhere.

Callstack for async NFS write, at time of error:

        nfs_update_inode <- returns -ESTALE
        nfs_refresh_inode_locked
        nfs_writeback_update_inode <- error is dropped here
        nfs3_write_done
        nfs_writeback_done
        nfs_pgio_result <- other errors are collected here
        rpc_exit_task
        __rpc_execute
        rpc_async_schedule
        process_one_work
        worker_thread
        kthread
        ret_from_fork

We ran into this issue ourselves, and seeing the -ESTALE in the kernel
source code but not from userspace was surprising.

I tested a rebased version of this patch on an el8 kernel (v6.1.114),
and it seems to correctly propagate this error.

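To make the dropped error concrete, here is a minimal user-space sketch
of the pattern in the callstack above. It is not kernel code: every
model_* name is a hypothetical stand-in, and the fileid comparison is
an assumed simplification of what nfs_update_inode() checks.

```c
#include <errno.h>

/* Hypothetical stand-in for the inode refresh: fails with -ESTALE when
 * the fileid in the response does not match the one in the request. */
int model_update_inode(unsigned long req_fileid, unsigned long resp_fileid)
{
	return req_fileid == resp_fileid ? 0 : -ESTALE;
}

/* Old shape: the helper returns void, so the callee's result has
 * nowhere to go and is discarded. */
void model_writeback_update_void(unsigned long req, unsigned long resp)
{
	(void)model_update_inode(req, resp);
}

/* New shape: same work, but the result is handed back to the caller. */
int model_writeback_update_int(unsigned long req, unsigned long resp)
{
	return model_update_inode(req, resp);
}

/* Model of a write_done callback before the change: always reports 0
 * when the RPC itself succeeded. */
int model_write_done_before(int tk_status, unsigned long req, unsigned long resp)
{
	if (tk_status >= 0)
		model_writeback_update_void(req, resp);
	return 0;
}

/* Model of a write_done callback after the change: a fileid mismatch
 * becomes the callback's return value. */
int model_write_done_after(int tk_status, unsigned long req, unsigned long resp)
{
	if (tk_status >= 0)
		return model_writeback_update_int(req, resp);
	return 0;
}
```

In this model, a successful RPC with mismatched fileids makes
model_write_done_before() return 0, while model_write_done_after()
returns -ESTALE. That is the difference the callstack is pointing at.
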
>8------------------------------------------------------8<

If an NFS server returns a response with a different file id than the
request, the kernel currently prints out an error and attempts to return
-ESTALE. However, this -ESTALE value is never surfaced anywhere.

This patch modifies nfs_writeback_update_inode() to propagate these
errors up the call stack, and modifies nfs_pgio_result() to report the
resulting error.

Signed-off-by: Nikhil Jha
---
 fs/nfs/filelayout/filelayout.c         | 2 +-
 fs/nfs/flexfilelayout/flexfilelayout.c | 2 +-
 fs/nfs/internal.h                      | 2 +-
 fs/nfs/nfs3proc.c                      | 2 +-
 fs/nfs/nfs4proc.c                      | 2 +-
 fs/nfs/pagelist.c                      | 5 ++++-
 fs/nfs/proc.c                          | 2 +-
 fs/nfs/write.c                         | 9 ++++++---
 8 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/fs/nfs/filelayout/filelayout.c b/fs/nfs/filelayout/filelayout.c
index d39a1f58e18d..4e80a13e9639 100644
--- a/fs/nfs/filelayout/filelayout.c
+++ b/fs/nfs/filelayout/filelayout.c
@@ -335,7 +335,7 @@ static int filelayout_write_done_cb(struct rpc_task *task,
 	/* zero out the fattr */
 	hdr->fattr.valid = 0;
 	if (task->tk_status >= 0)
-		nfs_writeback_update_inode(hdr);
+		return nfs_writeback_update_inode(hdr);
 
 	return 0;
 }
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index f78115c6c2c1..d15e3799a351 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -1503,7 +1503,7 @@ static int ff_layout_write_done_cb(struct rpc_task *task,
 	/* zero out fattr since we don't care DS attr at all */
 	hdr->fattr.valid = 0;
 	if (task->tk_status >= 0)
-		nfs_writeback_update_inode(hdr);
+		return nfs_writeback_update_inode(hdr);
 
 	return 0;
 }
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index e564bd11ba60..5c4e2fa88324 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -592,7 +592,7 @@ void nfs_mark_request_commit(struct nfs_page *req,
 			      struct nfs_commit_info *cinfo,
 			      u32 ds_commit_idx);
 int nfs_write_need_commit(struct nfs_pgio_header *);
-void nfs_writeback_update_inode(struct nfs_pgio_header *hdr);
+int nfs_writeback_update_inode(struct nfs_pgio_header *hdr);
 int nfs_generic_commit_list(struct inode *inode, struct list_head *head,
 			    int how, struct nfs_commit_info *cinfo);
 void nfs_retry_commit(struct list_head *page_list,
diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
index 1566163c6d85..42ddbc21fb05 100644
--- a/fs/nfs/nfs3proc.c
+++ b/fs/nfs/nfs3proc.c
@@ -887,7 +887,7 @@ static int nfs3_write_done(struct rpc_task *task, struct nfs_pgio_header *hdr)
 	if (nfs3_async_handle_jukebox(task, inode))
 		return -EAGAIN;
 	if (task->tk_status >= 0)
-		nfs_writeback_update_inode(hdr);
+		return nfs_writeback_update_inode(hdr);
 	return 0;
 }
 
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 405f17e6e0b4..7ec372a1eb98 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5661,7 +5661,7 @@ static int nfs4_write_done_cb(struct rpc_task *task,
 	}
 	if (task->tk_status >= 0) {
 		renew_lease(NFS_SERVER(inode), hdr->timestamp);
-		nfs_writeback_update_inode(hdr);
+		return nfs_writeback_update_inode(hdr);
 	}
 	return 0;
 }
diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index e27c07bd8929..19cb080653e3 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -854,9 +854,12 @@ static void nfs_pgio_result(struct rpc_task *task, void *calldata)
 {
 	struct nfs_pgio_header *hdr = calldata;
 	struct inode *inode = hdr->inode;
+	int status = hdr->rw_ops->rw_done(task, hdr, inode);
 
-	if (hdr->rw_ops->rw_done(task, hdr, inode) != 0)
+	if (status != 0) {
+		nfs_set_pgio_error(hdr, status, hdr->args.offset);
 		return;
+	}
 	if (task->tk_status < 0)
 		nfs_set_pgio_error(hdr, task->tk_status, hdr->args.offset);
 	else
diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c
index 6c09cd090c34..72ffbdfc7ae6 100644
--- a/fs/nfs/proc.c
+++ b/fs/nfs/proc.c
@@ -629,7 +629,7 @@ static int nfs_write_done(struct rpc_task *task, struct nfs_pgio_header *hdr)
 {
 	if (task->tk_status >= 0) {
 		hdr->res.count = hdr->args.count;
-		nfs_writeback_update_inode(hdr);
+		return nfs_writeback_update_inode(hdr);
 	}
 	return 0;
 }
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 50fa539611f5..151da29175fd 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1507,22 +1507,25 @@ static void nfs_writeback_check_extend(struct nfs_pgio_header *hdr,
 	fattr->valid |= NFS_ATTR_FATTR_SIZE;
 }
 
-void nfs_writeback_update_inode(struct nfs_pgio_header *hdr)
+int nfs_writeback_update_inode(struct nfs_pgio_header *hdr)
 {
 	struct nfs_fattr *fattr = &hdr->fattr;
 	struct inode *inode = hdr->inode;
+	int ret = 0;
 
 	if (nfs_have_delegated_mtime(inode)) {
 		spin_lock(&inode->i_lock);
 		nfs_set_cache_invalid(inode, NFS_INO_INVALID_BLOCKS);
 		spin_unlock(&inode->i_lock);
-		return;
+		return 0;
 	}
 
 	spin_lock(&inode->i_lock);
 	nfs_writeback_check_extend(hdr, fattr);
-	nfs_post_op_update_inode_force_wcc_locked(inode, fattr);
+	ret = nfs_post_op_update_inode_force_wcc_locked(inode, fattr);
 	spin_unlock(&inode->i_lock);
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(nfs_writeback_update_inode);
 
-- 
2.39.3
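
For readers skimming the pagelist.c hunk, the following is a rough
user-space sketch of the control-flow change in nfs_pgio_result(). It
is not kernel code. The struct and the model_* functions are
hypothetical, and model_set_error() assumes a simple "keep the first
error" policy, which is much less than the real nfs_set_pgio_error()
does.

```c
#include <errno.h>

/* Hypothetical, heavily reduced stand-in for the per-I/O header. */
struct model_hdr {
	int error;           /* first error recorded, 0 if none */
	long long error_pos; /* offset the error was recorded at */
	long long offset;    /* offset of this I/O request */
};

/* Assumed policy for the sketch: remember only the first error. */
void model_set_error(struct model_hdr *hdr, int error, long long pos)
{
	if (hdr->error == 0) {
		hdr->error = error;
		hdr->error_pos = pos;
	}
}

/* Old flow: a nonzero status from the done callback ends processing,
 * but nothing is recorded on the header. */
void model_pgio_result_before(struct model_hdr *hdr, int done_status, int tk_status)
{
	if (done_status != 0)
		return;
	if (tk_status < 0)
		model_set_error(hdr, tk_status, hdr->offset);
}

/* New flow: the done callback's status is recorded before returning,
 * the same way a negative tk_status is. */
void model_pgio_result_after(struct model_hdr *hdr, int done_status, int tk_status)
{
	if (done_status != 0) {
		model_set_error(hdr, done_status, hdr->offset);
		return;
	}
	if (tk_status < 0)
		model_set_error(hdr, tk_status, hdr->offset);
}
```

With done_status set to -ESTALE and tk_status set to 0, the "before"
variant leaves hdr->error at 0, while the "after" variant stores
-ESTALE together with the request offset.
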