From nobody Fri Nov 29 23:33:16 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9C4551DC053; Fri, 13 Sep 2024 13:54:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235663; cv=none; b=uPtZwPhpWRVCmCpSEQiKt0hasFWeSIIlu83oWASmB1j5x6twpdDFQSb02DVix21NYJcLp0pRjEteAsudqj5Y2HSFGDoS8XCmVvijypS6DHSA5dQGsAUZMtB8M5S0vFBqCYt17WCcBdJ/ER4pNy+trQP8GA28WorwLhp83sd1MiE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235663; c=relaxed/simple; bh=1wQqjayOZU9OU7OIyp79413cwp1jIKED+/ddvabpxRE=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=J625yz485ik744zandWN7v2VUWVaI+So3RZlFas0DKLm0t2orry5fklVcFTrr5zAfq87eSVO8Y8T4ZFXLEhDLMXquStn4bs7l2fH8stoez8XsNw6D9fcGnQ4WgoYbqac6SYE/i91mrIohIX5/DTem+bPrvxXwpwQBoFVvUiCM/E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=MSnStM37; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="MSnStM37" Received: by smtp.kernel.org (Postfix) with ESMTPSA id F1FF1C4CECE; Fri, 13 Sep 2024 13:54:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1726235663; bh=1wQqjayOZU9OU7OIyp79413cwp1jIKED+/ddvabpxRE=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=MSnStM37sPnp9DXk4F6EIH6b72VjVzSahHG8G3GrAM200gv2K3Q39nAW3S5z3EOtE YdNIC+JimHrlIa7gtqOhHM521Fior7jUBndNkAbd529aesrqar5Nwxs7WMIQ1ynSLt qyp+qe8ZzfaDvLGTiVAMQR3llWrqqGicXXd510d8kigpkfYCkdYbPmnFL3hn6uh/Ky 
B+htVPC9d6ON+9bsUM8ujYAuur0rLfdN14R7QQrFOKUU/8LHZCEe6GPNCR+eU0Tunv caCGQZZBsW3OvNBhbBOihZkDcWUqE0qreZ8YpbezQZVWmOMSC/JMhMgDnV9wpxSoYW NZ50YB4avb+8Q== From: Jeff Layton Date: Fri, 13 Sep 2024 09:54:10 -0400 Subject: [PATCH v7 01/11] timekeeping: move multigrain timestamp floor handling into timekeeper Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20240913-mgtime-v7-1-92d4020e3b00@kernel.org> References: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> In-Reply-To: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Chandan Babu R , "Darrick J. Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.1 X-Developer-Signature: v=1; a=openpgp-sha256; l=5382; i=jlayton@kernel.org; h=from:subject:message-id; bh=1wQqjayOZU9OU7OIyp79413cwp1jIKED+/ddvabpxRE=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm5EQHdrxoA4Zv70eWUKOVfFaSb4FDc6+mmDM7U djCsOXWJIqJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZuREBwAKCRAADmhBGVaC FahsD/kBt0wSc8RQuJxojIJblnZIg2g0aJ85YqPnmoTQvcCecxKf6H0dOsXyVro9MKCuVISYgOy RfqTSz7o4LRR/lbBCwc7T2w3n0WRmW4SM2LpjyJBP7pxAvnzVaoXcBkkxE0OOEl6QnBnGqBvkah n3pXUc0xiOQarhuwPSzTNc8oRsz0UD6nk5p0pFdyFwnsi9CeVXkWat6VjuOW/iKnvjdz0mW3lAs uqb8yLh139vG5amFjLrdGvlUc1zuSoFqy+psf+DNCdq/thiUJdueGvjN/pNooci0W4j09oXs5bV 
dCyMx7CanVzHwG8k8HP/TBR/HFXopkhtTMZavqvTmNx/LqK1IV1m5g+4mMNZn1zEiv7pjMzgzVT Op7Rk8bqkRddOdSlWbUPnE1bSaSRE7cd2RDX9BThB6UT0a3eiePLw1J9me1BdjWpHwdWek0OuRI ar8r0yN4lrmoUl3qWKFCyqLOAXsxqzihaAbJexTcpSBF+8t+WwD+laHMrtLMWKNVbbJAdhiSQxv w9IXUdEcB/NTRfacm1UeULZsgoMMXqWNZ2WFORMx1wgI3Df6PpBnwaoQ+Ot5P248reTCTnkKBdO 1BPzWFcTt+qzByEH0lJMjIT6kNjOL8nU5Z52XyHttqAYCIyPpWxjWtaSAsRQK+TWrUCtfVw3nbp ko5CuOLpgy0RNxA==
X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215

For multigrain timestamps, we must keep track of the latest timestamp
that has ever been handed out, and never hand out a coarse time below
that value.

Add a static singleton atomic64_t into timekeeping.c that we can use to
keep track of the latest fine-grained time ever handed out. This is
tracked as a monotonic ktime_t value to ensure that it isn't affected by
clock jumps.

Add two new public interfaces:

- ktime_get_coarse_real_ts64_mg() fills a timespec64 with the later of
  the coarse-grained clock and the floor time

- ktime_get_real_ts64_mg() gets the fine-grained clock value, and tries
  to swap it into the floor. A timespec64 is filled with the result.

Since the floor is global, we take great pains to avoid updating it
unless it's absolutely necessary. If we do the cmpxchg and find that the
value has been updated since we fetched it, then we discard the
fine-grained time that was fetched in favor of the recent update.

To maximize the window of this occurring when multiple tasks are racing
to update the floor, ktime_get_coarse_real_ts64_mg() returns a cookie
value that represents the state of the floor tracking word, and
ktime_get_real_ts64_mg() accepts a cookie value that it uses as the
"old" value when calling cmpxchg().

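As a rough illustration of the cookie handling described above, the
following user-space sketch models the floor with a C11 atomic. It is
not the kernel code: demo_floor, demo_get_coarse() and demo_get_fine()
are hypothetical names, clock readings are passed in as plain nanosecond
counts, and the monotonic-to-realtime offset is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the kernel's mg_floor word. */
static _Atomic int64_t demo_floor;

/*
 * Model of the coarse path: report the later of the coarse clock and
 * the floor in *out, and return the observed floor as a cookie.
 */
static int64_t demo_get_coarse(int64_t coarse_now, int64_t *out)
{
	int64_t floor = atomic_load(&demo_floor);

	*out = (floor > coarse_now) ? floor : coarse_now;
	return floor;
}

/*
 * Model of the fine path: try to install fine_now as the new floor,
 * using the cookie as the expected old value. If another updater got
 * there first, adopt the floor that it installed instead.
 */
static int64_t demo_get_fine(int64_t fine_now, int64_t cookie)
{
	int64_t old = cookie;

	if (atomic_compare_exchange_strong(&demo_floor, &old, fine_now))
		return fine_now;

	/* The failed compare-exchange stored the current floor in old. */
	return old;
}
```

A caller would fetch the coarse value and cookie first, and only call
demo_get_fine() with that cookie if a fine-grained value turns out to be
needed. A caller holding a stale cookie ends up with the floor installed
by whichever task won the race.
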
Signed-off-by: Jeff Layton
---
 include/linux/timekeeping.h |  4 +++
 kernel/time/timekeeping.c   | 81 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 85 insertions(+)

diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index fc12a9ba2c88..cf2293158c65 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -45,6 +45,10 @@ extern void ktime_get_real_ts64(struct timespec64 *tv);
 extern void ktime_get_coarse_ts64(struct timespec64 *ts);
 extern void ktime_get_coarse_real_ts64(struct timespec64 *ts);
 
+/* Multigrain timestamp interfaces */
+extern u64 ktime_get_coarse_real_ts64_mg(struct timespec64 *ts);
+extern void ktime_get_real_ts64_mg(struct timespec64 *ts, u64 cookie);
+
 void getboottime64(struct timespec64 *ts);
 
 /*
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 5391e4167d60..ee11006a224f 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -114,6 +114,13 @@ static struct tk_fast tk_fast_raw ____cacheline_aligned = {
 	.base[1] = FAST_TK_INIT,
 };
 
+/*
+ * This represents the latest fine-grained time that we have handed out as a
+ * timestamp on the system. Tracked as a monotonic ktime_t, and converted to the
+ * realtime clock on an as-needed basis.
+ */
+static __cacheline_aligned_in_smp atomic64_t mg_floor;
+
 static inline void tk_normalize_xtime(struct timekeeper *tk)
 {
 	while (tk->tkr_mono.xtime_nsec >= ((u64)NSEC_PER_SEC << tk->tkr_mono.shift)) {
@@ -2394,6 +2401,80 @@ void ktime_get_coarse_real_ts64(struct timespec64 *ts)
 }
 EXPORT_SYMBOL(ktime_get_coarse_real_ts64);
 
+/**
+ * ktime_get_coarse_real_ts64_mg - get later of coarse grained time or floor
+ * @ts: timespec64 to be filled
+ *
+ * Adjust floor to realtime and compare it to the coarse time. Fill
+ * @ts with the latest one. Returns opaque cookie suitable for passing
+ * to ktime_get_real_ts64_mg().
+ */
+u64 ktime_get_coarse_real_ts64_mg(struct timespec64 *ts)
+{
+	struct timekeeper *tk = &tk_core.timekeeper;
+	u64 floor = atomic64_read(&mg_floor);
+	ktime_t f_real, offset, coarse;
+	unsigned int seq;
+
+	WARN_ON(timekeeping_suspended);
+
+	do {
+		seq = read_seqcount_begin(&tk_core.seq);
+		*ts = tk_xtime(tk);
+		offset = *offsets[TK_OFFS_REAL];
+	} while (read_seqcount_retry(&tk_core.seq, seq));
+
+	coarse = timespec64_to_ktime(*ts);
+	f_real = ktime_add(floor, offset);
+	if (ktime_after(f_real, coarse))
+		*ts = ktime_to_timespec64(f_real);
+	return floor;
+}
+EXPORT_SYMBOL_GPL(ktime_get_coarse_real_ts64_mg);
+
+/**
+ * ktime_get_real_ts64_mg - attempt to update floor value and return result
+ * @ts: pointer to the timespec to be set
+ * @cookie: opaque cookie from earlier call to ktime_get_coarse_real_ts64_mg()
+ *
+ * Get a current monotonic fine-grained time value and attempt to swap
+ * it into the floor using @cookie as the "old" value. @ts will be
+ * filled with the resulting floor value, regardless of the outcome of
+ * the swap.
+ */
+void ktime_get_real_ts64_mg(struct timespec64 *ts, u64 cookie)
+{
+	struct timekeeper *tk = &tk_core.timekeeper;
+	ktime_t offset, mono, old = (ktime_t)cookie;
+	unsigned int seq;
+	u64 nsecs;
+
+	WARN_ON(timekeeping_suspended);
+
+	do {
+		seq = read_seqcount_begin(&tk_core.seq);
+
+		ts->tv_sec = tk->xtime_sec;
+		mono = tk->tkr_mono.base;
+		nsecs = timekeeping_get_ns(&tk->tkr_mono);
+		offset = *offsets[TK_OFFS_REAL];
+	} while (read_seqcount_retry(&tk_core.seq, seq));
+
+	mono = ktime_add_ns(mono, nsecs);
+
+	if (atomic64_try_cmpxchg(&mg_floor, &old, mono)) {
+		ts->tv_nsec = 0;
+		timespec64_add_ns(ts, nsecs);
+	} else {
+		/*
+		 * Something has changed mg_floor since "old" was
+		 * fetched. That value is just as valid, so accept it.
+		 */
+		*ts = ktime_to_timespec64(ktime_add(old, offset));
+	}
+}
+EXPORT_SYMBOL(ktime_get_real_ts64_mg);
+
 void ktime_get_coarse_ts64(struct timespec64 *ts)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
-- 
2.46.0

From nobody Fri Nov 29 23:33:16 2024
From: Jeff Layton
Date: Fri, 13 Sep 2024 09:54:11 -0400
Subject: [PATCH v7 02/11] fs: add infrastructure for multigrain timestamps
Message-Id: <20240913-mgtime-v7-2-92d4020e3b00@kernel.org>
References: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org>
In-Reply-To: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org>
To: John Stultz, Thomas Gleixner, Stephen Boyd, Alexander Viro,
 Christian Brauner, Jan Kara, Steven Rostedt, Masami Hiramatsu,
 Mathieu Desnoyers, Jonathan Corbet, Chandan Babu R, "Darrick J. Wong",
 Theodore Ts'o, Andreas Dilger, Chris Mason, Josef Bacik, David Sterba,
 Hugh Dickins, Andrew Morton, Chuck Lever, Vadim Fedorenko
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org,
 linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org,
 linux-mm@kvack.org, Jeff Layton

The VFS has always used coarse-grained timestamps when updating the
ctime and mtime after a change. This has the benefit of allowing
filesystems to optimize away a lot of metadata updates, down to around 1
per jiffy, even when a file is under heavy writes.

Unfortunately, this has always been an issue when we're exporting via
NFSv3, which relies on timestamps to validate caches.
A lot of changes can happen in a jiffy, so timestamps aren't sufficient to
help the client decide when to invalidate the cache. Even with NFSv4, a
lot of exported filesystems don't properly support a change attribute
and are subject to the same problems with timestamp granularity. Other
applications have similar issues with timestamps (e.g. backup
applications).

If we were to always use fine-grained timestamps, that would improve the
situation, but that becomes rather expensive, as the underlying
filesystem would have to log a lot more metadata updates.

What we need is a way to only use fine-grained timestamps when they are
being actively queried. Use the (unused) top bit in inode->i_ctime_nsec
as a flag that indicates whether the current timestamps have been
queried via stat() or the like. When it's set, we allow the kernel to
use a fine-grained timestamp iff it's necessary to make the ctime show
a different value.

This solves the problem of being able to distinguish the timestamp
between updates, but introduces a new problem: it's now possible for a
file being changed to get a fine-grained timestamp. A file that is
altered just a bit later can then get a coarse-grained one that appears
older than the earlier fine-grained time. This violates timestamp
ordering guarantees.

To remedy this, keep a global monotonic atomic64_t value that acts as a
timestamp floor. When we go to stamp a file, we first get the later of
the current floor value and the current coarse-grained time. If the
inode ctime hasn't been queried then we just attempt to stamp it with
that value.

If it has been queried, then first see whether the current coarse time
is later than the existing ctime. If it is, then we accept that value.
If it isn't, then we get a fine-grained timestamp.

Filesystems can opt into this by setting the FS_MGTIME fstype flag.
Others should be unaffected (other than being subject to the same floor
value as multigrain filesystems).

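The QUERIED flag handling described above can be sketched in user space
as follows. This is only a model under stated assumptions: DEMO_QUERIED,
struct demo_inode, demo_query_nsec() and demo_needs_fine() are
hypothetical names, the floor is ignored, and the stat-side helper
always does the fetch-or rather than checking the flag first.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for I_CTIME_QUERIED: the top bit of the nsec word. */
#define DEMO_QUERIED (UINT32_C(1) << 31)

/* Hypothetical, cut-down inode holding only the ctime fields. */
struct demo_inode {
	int64_t ctime_sec;
	_Atomic uint32_t ctime_nsec;	/* nanoseconds, plus the QUERIED flag */
};

/* stat()-like read: mark the ctime as queried, report nsec without the flag. */
static uint32_t demo_query_nsec(struct demo_inode *inode)
{
	uint32_t old = atomic_fetch_or(&inode->ctime_nsec, DEMO_QUERIED);

	return old & ~DEMO_QUERIED;
}

/*
 * Update-side decision: a fine-grained time is only needed if the ctime
 * has been queried and the coarse time is not later than the ctime that
 * is already there.
 */
static bool demo_needs_fine(struct demo_inode *inode, int64_t coarse_sec,
			    uint32_t coarse_nsec)
{
	uint32_t cns = atomic_load(&inode->ctime_nsec);
	uint32_t nsec = cns & ~DEMO_QUERIED;

	if (!(cns & DEMO_QUERIED))
		return false;
	if (coarse_sec != inode->ctime_sec)
		return coarse_sec < inode->ctime_sec;
	return coarse_nsec <= nsec;
}
```

Keeping the flag in the same word as the nanoseconds field means a
single atomic operation can both read the value and record that it was
observed, which is the property the description above relies on.
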
Signed-off-by: Jeff Layton
---
 fs/inode.c         | 132 +++++++++++++++++++++++++++++++++++++++++++----------
 fs/stat.c          |  39 +++++++++++++++-
 include/linux/fs.h |  34 ++++++++++----
 3 files changed, 170 insertions(+), 35 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 10c4619faeef..8ab36779066e 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -2172,19 +2172,53 @@ int file_remove_privs(struct file *file)
 }
 EXPORT_SYMBOL(file_remove_privs);
 
+/**
+ * current_time - Return FS time (possibly fine-grained)
+ * @inode: inode.
+ *
+ * Return the current time truncated to the time granularity supported by
+ * the fs, as suitable for a ctime/mtime change. If the ctime is flagged
+ * as having been QUERIED, get a fine-grained timestamp.
+ */
+struct timespec64 current_time(struct inode *inode)
+{
+	struct timespec64 now;
+	u32 cns;
+
+	ktime_get_coarse_real_ts64_mg(&now);
+
+	if (!is_mgtime(inode))
+		goto out;
+
+	/* If nothing has queried it, then coarse time is fine */
+	cns = smp_load_acquire(&inode->i_ctime_nsec);
+	if (cns & I_CTIME_QUERIED) {
+		/*
+		 * If there is no apparent change, then
+		 * get a fine-grained timestamp.
+		 */
+		if (now.tv_nsec == (cns & ~I_CTIME_QUERIED))
+			ktime_get_real_ts64(&now);
+	}
+out:
+	return timestamp_truncate(now, inode);
+}
+EXPORT_SYMBOL(current_time);
+
 static int inode_needs_update_time(struct inode *inode)
 {
+	struct timespec64 now, ts;
 	int sync_it = 0;
-	struct timespec64 now = current_time(inode);
-	struct timespec64 ts;
 
 	/* First try to exhaust all avenues to not sync */
 	if (IS_NOCMTIME(inode))
 		return 0;
 
+	now = current_time(inode);
+
 	ts = inode_get_mtime(inode);
 	if (!timespec64_equal(&ts, &now))
-		sync_it = S_MTIME;
+		sync_it |= S_MTIME;
 
 	ts = inode_get_ctime(inode);
 	if (!timespec64_equal(&ts, &now))
@@ -2562,6 +2596,15 @@ void inode_nohighmem(struct inode *inode)
 }
 EXPORT_SYMBOL(inode_nohighmem);
 
+struct timespec64 inode_set_ctime_to_ts(struct inode *inode, struct timespec64 ts)
+{
+	set_normalized_timespec64(&ts, ts.tv_sec, ts.tv_nsec);
+	inode->i_ctime_sec = ts.tv_sec;
+	inode->i_ctime_nsec = ts.tv_nsec;
+	return ts;
+}
+EXPORT_SYMBOL(inode_set_ctime_to_ts);
+
 /**
  * timestamp_truncate - Truncate timespec to a granularity
  * @t: Timespec
@@ -2594,36 +2637,75 @@ struct timespec64 timestamp_truncate(struct timespec64 t, struct inode *inode)
 EXPORT_SYMBOL(timestamp_truncate);
 
 /**
- * current_time - Return FS time
- * @inode: inode.
+ * inode_set_ctime_current - set the ctime to current_time
+ * @inode: inode
  *
- * Return the current time truncated to the time granularity supported by
- * the fs.
+ * Set the inode's ctime to the current value for the inode. Returns the
+ * current value that was assigned. If this is not a multigrain inode, then we
+ * set it to the later of the coarse time and floor value.
  *
- * Note that inode and inode->sb cannot be NULL.
- * Otherwise, the function warns and returns time without truncation.
+ * If it is multigrain, then we first see if the coarse-grained timestamp is
+ * distinct from what we have. If so, then we'll just use that. If we have to
+ * get a fine-grained timestamp, then do so, and try to swap it into the floor.
+ * We accept the new floor value regardless of the outcome of the cmpxchg.
+ * After that, we try to swap the new value into i_ctime_nsec. Again, we take
+ * the resulting ctime, regardless of the outcome of the swap.
  */
-struct timespec64 current_time(struct inode *inode)
+struct timespec64 inode_set_ctime_current(struct inode *inode)
 {
 	struct timespec64 now;
+	u32 cns, cur;
+	u64 cookie;
 
-	ktime_get_coarse_real_ts64(&now);
-	return timestamp_truncate(now, inode);
-}
-EXPORT_SYMBOL(current_time);
+	cookie = ktime_get_coarse_real_ts64_mg(&now);
 
-/**
- * inode_set_ctime_current - set the ctime to current_time
- * @inode: inode
- *
- * Set the inode->i_ctime to the current value for the inode. Returns
- * the current value that was assigned to i_ctime.
- */
-struct timespec64 inode_set_ctime_current(struct inode *inode)
-{
-	struct timespec64 now = current_time(inode);
+	/* Just return that if this is not a multigrain fs */
+	if (!is_mgtime(inode)) {
+		now = timestamp_truncate(now, inode);
+		inode_set_ctime_to_ts(inode, now);
+		goto out;
+	}
 
-	inode_set_ctime_to_ts(inode, now);
+	/*
+	 * We only need a fine-grained time if someone has queried it,
+	 * and the current coarse grained time isn't later than what's
+	 * already there.
+	 */
+	cns = smp_load_acquire(&inode->i_ctime_nsec);
+	if (cns & I_CTIME_QUERIED) {
+		struct timespec64 ctime = { .tv_sec = inode->i_ctime_sec,
+					    .tv_nsec = cns & ~I_CTIME_QUERIED };
+
+		if (timespec64_compare(&now, &ctime) <= 0)
+			ktime_get_real_ts64_mg(&now, cookie);
+	}
+	now = timestamp_truncate(now, inode);
+
+	/* No need to cmpxchg if it's exactly the same */
+	if (cns == now.tv_nsec && inode->i_ctime_sec == now.tv_sec)
+		goto out;
+	cur = cns;
+retry:
+	/* Try to swap the nsec value into place. */
+	if (try_cmpxchg(&inode->i_ctime_nsec, &cur, now.tv_nsec)) {
+		/* If swap occurred, then we're (mostly) done */
+		inode->i_ctime_sec = now.tv_sec;
+	} else {
+		/*
+		 * Was the change due to someone marking the old ctime QUERIED?
+		 * If so then retry the swap. This can only happen once since
+		 * the only way to clear I_CTIME_QUERIED is to stamp the inode
+		 * with a new ctime.
+		 */
+		if (!(cns & I_CTIME_QUERIED) && (cns | I_CTIME_QUERIED) == cur) {
+			cns = cur;
+			goto retry;
+		}
+		/* Otherwise, keep the existing ctime */
+		now.tv_sec = inode->i_ctime_sec;
+		now.tv_nsec = cur & ~I_CTIME_QUERIED;
+	}
+out:
 	return now;
 }
 EXPORT_SYMBOL(inode_set_ctime_current);
diff --git a/fs/stat.c b/fs/stat.c
index 89ce1be56310..a449626fd460 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -26,6 +26,35 @@
 #include "internal.h"
 #include "mount.h"
 
+/**
+ * fill_mg_cmtime - Fill in the mtime and ctime and flag ctime as QUERIED
+ * @stat: where to store the resulting values
+ * @request_mask: STATX_* values requested
+ * @inode: inode from which to grab the c/mtime
+ *
+ * Given @inode, grab the ctime and mtime out of it and store the result
+ * in @stat. When fetching the value, flag it as QUERIED (if not already)
+ * so the next write will record a distinct timestamp.
+ */
+void fill_mg_cmtime(struct kstat *stat, u32 request_mask, struct inode *inode)
+{
+	atomic_t *pcn = (atomic_t *)&inode->i_ctime_nsec;
+
+	/* If neither time was requested, then don't report them */
+	if (!(request_mask & (STATX_CTIME|STATX_MTIME))) {
+		stat->result_mask &= ~(STATX_CTIME|STATX_MTIME);
+		return;
+	}
+
+	stat->mtime = inode_get_mtime(inode);
+	stat->ctime.tv_sec = inode->i_ctime_sec;
+	stat->ctime.tv_nsec = (u32)atomic_read(pcn);
+	if (!(stat->ctime.tv_nsec & I_CTIME_QUERIED))
+		stat->ctime.tv_nsec = ((u32)atomic_fetch_or(I_CTIME_QUERIED, pcn));
+	stat->ctime.tv_nsec &= ~I_CTIME_QUERIED;
+}
+EXPORT_SYMBOL(fill_mg_cmtime);
+
 /**
  * generic_fillattr - Fill in the basic attributes from the inode struct
  * @idmap: idmap of the mount the inode was found from
@@ -58,8 +87,14 @@ void generic_fillattr(struct mnt_idmap *idmap, u32 request_mask,
 	stat->rdev = inode->i_rdev;
 	stat->size = i_size_read(inode);
 	stat->atime = inode_get_atime(inode);
-	stat->mtime = inode_get_mtime(inode);
-	stat->ctime = inode_get_ctime(inode);
+
+	if (is_mgtime(inode)) {
+		fill_mg_cmtime(stat, request_mask, inode);
+	} else {
+		stat->ctime = inode_get_ctime(inode);
+		stat->mtime = inode_get_mtime(inode);
+	}
+
 	stat->blksize = i_blocksize(inode);
 	stat->blocks = inode->i_blocks;
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6ca11e241a24..eff688e75f2f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1613,6 +1613,17 @@ static inline struct timespec64 inode_set_mtime(struct inode *inode,
 	return inode_set_mtime_to_ts(inode, ts);
 }
 
+/*
+ * Multigrain timestamps
+ *
+ * Conditionally use fine-grained ctime and mtime timestamps when there
+ * are users actively observing them via getattr. The primary use-case
+ * for this is NFS clients that use the ctime to distinguish between
+ * different states of the file, and that are often fooled by multiple
+ * operations that occur in the same coarse-grained timer tick.
+ */
+#define I_CTIME_QUERIED		((u32)BIT(31))
+
 static inline time64_t inode_get_ctime_sec(const struct inode *inode)
 {
 	return inode->i_ctime_sec;
@@ -1620,7 +1631,7 @@ static inline time64_t inode_get_ctime_sec(const struct inode *inode)
 
 static inline long inode_get_ctime_nsec(const struct inode *inode)
 {
-	return inode->i_ctime_nsec;
+	return inode->i_ctime_nsec & ~I_CTIME_QUERIED;
 }
 
 static inline struct timespec64 inode_get_ctime(const struct inode *inode)
@@ -1631,13 +1642,7 @@ static inline struct timespec64 inode_get_ctime(const struct inode *inode)
 	return ts;
 }
 
-static inline struct timespec64 inode_set_ctime_to_ts(struct inode *inode,
-						      struct timespec64 ts)
-{
-	inode->i_ctime_sec = ts.tv_sec;
-	inode->i_ctime_nsec = ts.tv_nsec;
-	return ts;
-}
+struct timespec64 inode_set_ctime_to_ts(struct inode *inode, struct timespec64 ts);
 
 /**
  * inode_set_ctime - set the ctime in the inode
@@ -2500,6 +2505,7 @@ struct file_system_type {
 #define FS_USERNS_MOUNT		8	/* Can be mounted by userns root */
 #define FS_DISALLOW_NOTIFY_PERM	16	/* Disable fanotify permission events */
 #define FS_ALLOW_IDMAP		32	/* FS has been updated to handle vfs idmappings. */
+#define FS_MGTIME		64	/* FS uses multigrain timestamps */
 #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
 	int (*init_fs_context)(struct fs_context *);
 	const struct fs_parameter_spec *parameters;
@@ -2523,6 +2529,17 @@ struct file_system_type {
 
 #define MODULE_ALIAS_FS(NAME) MODULE_ALIAS("fs-" NAME)
 
+/**
+ * is_mgtime: is this inode using multigrain timestamps
+ * @inode: inode to test for multigrain timestamps
+ *
+ * Return true if the inode uses multigrain timestamps, false otherwise.
+ */
+static inline bool is_mgtime(const struct inode *inode)
+{
+	return inode->i_sb->s_type->fs_flags & FS_MGTIME;
+}
+
 extern struct dentry *mount_bdev(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *data,
 	int (*fill_super)(struct super_block *, void *, int));
@@ -3262,6 +3279,7 @@ extern void page_put_link(void *);
 extern int page_symlink(struct inode *inode, const char *symname, int len);
 extern const struct inode_operations page_symlink_inode_operations;
 extern void kfree_link(void *);
+void fill_mg_cmtime(struct kstat *stat, u32 request_mask, struct inode *inode);
 void generic_fillattr(struct mnt_idmap *, u32, struct inode *, struct kstat *);
 void generic_fill_statx_attr(struct inode *inode, struct kstat *stat);
 void generic_fill_statx_atomic_writes(struct kstat *stat,
-- 
2.46.0

From nobody Fri Nov 29 23:33:16 2024
From: Jeff Layton
Date: Fri, 13 Sep 2024 09:54:12 -0400
Subject: [PATCH v7 03/11] fs: have setattr_copy handle multigrain timestamps appropriately
Message-Id: <20240913-mgtime-v7-3-92d4020e3b00@kernel.org>
References: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org>
In-Reply-To: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org>
To: John Stultz, Thomas Gleixner, Stephen Boyd, Alexander Viro,
 Christian Brauner, Jan Kara, Steven Rostedt, Masami Hiramatsu,
 Mathieu Desnoyers, Jonathan Corbet, Chandan Babu R, "Darrick J. Wong",
 Theodore Ts'o, Andreas Dilger, Chris Mason, Josef Bacik, David Sterba,
 Hugh Dickins, Andrew Morton, Chuck Lever, Vadim Fedorenko
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org,
 linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org,
 linux-mm@kvack.org, Jeff Layton

The setattr codepath is still using coarse-grained timestamps, even on
multigrain filesystems. To fix this, we need to fetch the timestamp for
ctime updates later, at the point where the assignment occurs in
setattr_copy.

On a multigrain inode, ignore the ia_ctime in the attrs, and always
update the ctime to the current clock value. Update the atime and mtime
with the same value (if needed) unless they are being set to other
specific values, à la utimes().

Note that we don't want to do this universally however, as some
filesystems (e.g. most networked fs) want to do an explicit update
elsewhere before updating the local inode.

Reviewed-by: Darrick J. Wong
Reviewed-by: Josef Bacik
Reviewed-by: Jan Kara
Signed-off-by: Jeff Layton
---
 fs/attr.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 46 insertions(+), 6 deletions(-)

diff --git a/fs/attr.c b/fs/attr.c
index c04d19b58f12..3bcbc45708a3 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -271,6 +271,42 @@ int inode_newsize_ok(const struct inode *inode, loff_t offset)
 }
 EXPORT_SYMBOL(inode_newsize_ok);
 
+/**
+ * setattr_copy_mgtime - update timestamps for mgtime inodes
+ * @inode: inode timestamps to be updated
+ * @attr: attrs for the update
+ *
+ * With multigrain timestamps, we need to take more care to prevent races
+ * when updating the ctime. Always update the ctime to the very latest
+ * using the standard mechanism, and use that to populate the atime and
+ * mtime appropriately (unless we're setting those to specific values).
+ */
+static void setattr_copy_mgtime(struct inode *inode, const struct iattr *attr)
+{
+	unsigned int ia_valid = attr->ia_valid;
+	struct timespec64 now;
+
+	/*
+	 * If the ctime isn't being updated then nothing else should be
+	 * either.
+ */ + if (!(ia_valid & ATTR_CTIME)) { + WARN_ON_ONCE(ia_valid & (ATTR_ATIME|ATTR_MTIME)); + return; + } + + now =3D inode_set_ctime_current(inode); + if (ia_valid & ATTR_ATIME_SET) + inode_set_atime_to_ts(inode, attr->ia_atime); + else if (ia_valid & ATTR_ATIME) + inode_set_atime_to_ts(inode, now); + + if (ia_valid & ATTR_MTIME_SET) + inode_set_mtime_to_ts(inode, attr->ia_mtime); + else if (ia_valid & ATTR_MTIME) + inode_set_mtime_to_ts(inode, now); +} + /** * setattr_copy - copy simple metadata updates into the generic inode * @idmap: idmap of the mount the inode was found from @@ -303,12 +339,6 @@ void setattr_copy(struct mnt_idmap *idmap, struct inod= e *inode, =20 i_uid_update(idmap, attr, inode); i_gid_update(idmap, attr, inode); - if (ia_valid & ATTR_ATIME) - inode_set_atime_to_ts(inode, attr->ia_atime); - if (ia_valid & ATTR_MTIME) - inode_set_mtime_to_ts(inode, attr->ia_mtime); - if (ia_valid & ATTR_CTIME) - inode_set_ctime_to_ts(inode, attr->ia_ctime); if (ia_valid & ATTR_MODE) { umode_t mode =3D attr->ia_mode; if (!in_group_or_capable(idmap, inode, @@ -316,6 +346,16 @@ void setattr_copy(struct mnt_idmap *idmap, struct inod= e *inode, mode &=3D ~S_ISGID; inode->i_mode =3D mode; } + + if (is_mgtime(inode)) + return setattr_copy_mgtime(inode, attr); + + if (ia_valid & ATTR_ATIME) + inode_set_atime_to_ts(inode, attr->ia_atime); + if (ia_valid & ATTR_MTIME) + inode_set_mtime_to_ts(inode, attr->ia_mtime); + if (ia_valid & ATTR_CTIME) + inode_set_ctime_to_ts(inode, attr->ia_ctime); } EXPORT_SYMBOL(setattr_copy); =20 --=20 2.46.0 From nobody Fri Nov 29 23:33:16 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E97EB1DCB2B; Fri, 13 Sep 2024 13:54:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: 
i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235672; cv=none; b=qEyMt1y02OpvMuNxTrHA6WbxzYmlHULfEVKg/AbYh0+9e5wrIGAT7vULqbv7fKTJmUuQc2/D1HSfWqC4vCOmFXz1N991UGn7oh3ITc7U875HxwabmNNBbXkl5jsaTOqVvX9uQgmwaS0kSx9754wV3XPP9K+kekGITcJCPP8Y5oU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235672; c=relaxed/simple; bh=hdc3HuXN78tnJke+lsjFRc8MIpdoXqfI1aIZo2Ph9do=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=kMmntjWaBWq4bhuBR4AGzbQsDws4u1notiTmqxyi0yQX/s6tD8mQ+vad4dcdompq8EfwvUeN+ifb6ZAiGo6d2H8XZlPuoaLuS/d2GjFywlizk70tfj4zwkNrP6XsF3imUbCyu0S4tWsWT28ECcoxWRSohViPmNK1UR3G/VAP9Pg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=fl8X3cLS; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="fl8X3cLS" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 44623C4CED0; Fri, 13 Sep 2024 13:54:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1726235671; bh=hdc3HuXN78tnJke+lsjFRc8MIpdoXqfI1aIZo2Ph9do=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=fl8X3cLSHojZIjmkmSYV4n4Y0MHgS2lGCRsEJknPf7WD1GMgEqaV67ufmNO5d2WUe 49XX5BJCMJJPdQdd/+/K0lkypJS229K2gE/55/J3DaoogPv0is7FMexHVQSMGR2W1j Vdrwmh0kYrFZOeyTFYJcVT8R6/XN9q1qODVb3SvTGDjDpJ/8ojpwL/u61Tu5NvSy95 nV7FUWW+5YI3dMDjIIfBOrSAamp42O2lT/fv34w1kXigWtrcLpRO1fi3xh6WXCZhC5 kJQ9wRKjcqXz1hReqjy9TiDj+w46N8W4b8wGlVxcsoETWKwF4lopHIbb6SPYyxEAWI WRhZXKraQkpsw== From: Jeff Layton Date: Fri, 13 Sep 2024 09:54:13 -0400 Subject: [PATCH v7 04/11] fs: handle delegated timestamps in setattr_copy_mgtime Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 
quoted-printable Message-Id: <20240913-mgtime-v7-4-92d4020e3b00@kernel.org> References: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> In-Reply-To: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Chandan Babu R , "Darrick J. Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.1 X-Developer-Signature: v=1; a=openpgp-sha256; l=6206; i=jlayton@kernel.org; h=from:subject:message-id; bh=hdc3HuXN78tnJke+lsjFRc8MIpdoXqfI1aIZo2Ph9do=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm5EQI05miWpOV+IKasjIrj/f2h92GkcTZGaHOX i7zmXV5tiCJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZuRECAAKCRAADmhBGVaC FawzD/0QM/rrrncZr9goBf5Fodvc0nyXf5LKp7ptAeNyS9j27/HdizuJeqJN85pMsI6Z+d7O0ZB yJtdOR5rB/LlzyQO2tyXlaGp1w9niFn45ega3Y3e3a5jBVJVSCnQfvrBEeuzNILZjt7LLBwUS86 JB40F60UHoHTsiglLUa0/NPxitwDK/RJSDNLwySzK5dfrO3xO0zOyESEvAIx43QMJw3BwkvbOPh QTDhu6GVvb0HfFXv9EbRGoiDK9h2oCyk+FVKf3DJNQezu+QIBHGTH7TJ3VaO6rdESdIXTq6S8eU eJS7QDt/VRB46z+xHYc8HozzWOrSg9rI5b65RU9GraAhX76JKl8KdUt6J2rt/M/zHuuWh5gVMi5 yAktGJy//n2pDhUgslejA0WBSRfeydc+LET6il1YWu4dBUMzn8L2RdtnvU171pbHggj2eInPkY4 EoBw4Ii07JxOm3v7sMAdGeNDXQL855ydZjyfaMLmIqbes+U/3PQp5fehiCA3v/9Qez7BJmRsID6 OdSTBy0gE61fj0imPRmKgZBTrepnfUSwY+O8G4HOGGmD0YwPgDfOJb2fuSwwqgxxh4SWeEib5JY udf6qVdd0u+unM4YE3oGKxf4qwHSGO7R6emJKLlLQ6qgVHCReeXNOtpC6Y5OULwrMsJe+TbzRHh npI0J3yTRNm5ndw== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 When updating the ctime on an inode for a SETATTR with a 
multigrain filesystem, we usually want to take the latest time we can get for the ctime. The exception to this rule is when there is a nfsd write delegation and the server is proxying timestamps from the client. When nfsd gets a CB_GETATTR response, we want to update the timestamp value in the inode to the values that the client is tracking. The client doesn't send a ctime value (since that's always determined by the exported filesystem), but it can send a mtime value. In the case where it does, then we may need to update the ctime to a value commensurate with that instead of the current time. If ATTR_DELEG is set, then use ia_ctime value instead of setting the timestamp to the current time. With the addition of delegated timestamps we can also receive a request to update only the atime, but we may not need to set the ctime. Trust the ATTR_CTIME flag in the update and only update the ctime when it's set. Signed-off-by: Jeff Layton --- fs/attr.c | 28 +++++++++++++-------- fs/inode.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++= ++++ include/linux/fs.h | 2 ++ 3 files changed, 92 insertions(+), 10 deletions(-) diff --git a/fs/attr.c b/fs/attr.c index 3bcbc45708a3..392eb62aa609 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -286,16 +286,20 @@ static void setattr_copy_mgtime(struct inode *inode, = const struct iattr *attr) unsigned int ia_valid =3D attr->ia_valid; struct timespec64 now; =20 - /* - * If the ctime isn't being updated then nothing else should be - * either. - */ - if (!(ia_valid & ATTR_CTIME)) { - WARN_ON_ONCE(ia_valid & (ATTR_ATIME|ATTR_MTIME)); - return; + if (ia_valid & ATTR_CTIME) { + /* + * In the case of an update for a write delegation, we must respect + * the value in ia_ctime and not use the current time. + */ + if (ia_valid & ATTR_DELEG) + now =3D inode_set_ctime_deleg(inode, attr->ia_ctime); + else + now =3D inode_set_ctime_current(inode); + } else { + /* If ATTR_CTIME isn't set, then ATTR_MTIME shouldn't be either. 
*/ + WARN_ON_ONCE(ia_valid & ATTR_MTIME); } =20 - now =3D inode_set_ctime_current(inode); if (ia_valid & ATTR_ATIME_SET) inode_set_atime_to_ts(inode, attr->ia_atime); else if (ia_valid & ATTR_ATIME) @@ -354,8 +358,12 @@ void setattr_copy(struct mnt_idmap *idmap, struct inod= e *inode, inode_set_atime_to_ts(inode, attr->ia_atime); if (ia_valid & ATTR_MTIME) inode_set_mtime_to_ts(inode, attr->ia_mtime); - if (ia_valid & ATTR_CTIME) - inode_set_ctime_to_ts(inode, attr->ia_ctime); + if (ia_valid & ATTR_CTIME) { + if (ia_valid & ATTR_DELEG) + inode_set_ctime_deleg(inode, attr->ia_ctime); + else + inode_set_ctime_to_ts(inode, attr->ia_ctime); + } } EXPORT_SYMBOL(setattr_copy); =20 diff --git a/fs/inode.c b/fs/inode.c index 8ab36779066e..260a8a1c1096 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -2710,6 +2710,78 @@ struct timespec64 inode_set_ctime_current(struct ino= de *inode) } EXPORT_SYMBOL(inode_set_ctime_current); =20 +/** + * inode_set_ctime_deleg - try to update the ctime on a delegated inode + * @inode: inode to update + * @update: timespec64 to set the ctime + * + * Attempt to atomically update the ctime on behalf of a delegation holder. + * + * The nfs server can call back the holder of a delegation to get updated + * inode attributes, including the mtime. When updating the mtime we may + * need to update the ctime to a value at least equal to that. + * + * This can race with concurrent updates to the inode, in which + * case we just don't do the update. + * + * Note that this works even when multigrain timestamps are not enabled, + * so use it in either case. + */ +struct timespec64 inode_set_ctime_deleg(struct inode *inode, struct timesp= ec64 update) +{ + struct timespec64 now, cur_ts; + u32 cur, old; + + /* pairs with try_cmpxchg below */ + cur =3D smp_load_acquire(&inode->i_ctime_nsec); + cur_ts.tv_nsec =3D cur & ~I_CTIME_QUERIED; + cur_ts.tv_sec =3D inode->i_ctime_sec; + + /* If the update is older than the existing value, skip it. 
*/ + if (timespec64_compare(&update, &cur_ts) <=3D 0) + return cur_ts; + + ktime_get_coarse_real_ts64_mg(&now); + + /* Clamp the update to "now" if it's in the future */ + if (timespec64_compare(&update, &now) > 0) + update =3D now; + + update =3D timestamp_truncate(update, inode); + + /* No need to update if the values are already the same */ + if (timespec64_equal(&update, &cur_ts)) + return cur_ts; + + /* + * Try to swap the nsec value into place. If it fails, that means + * we raced with an update due to a write or similar activity. That + * stamp takes precedence, so just skip the update. + */ +retry: + old =3D cur; + if (try_cmpxchg(&inode->i_ctime_nsec, &cur, update.tv_nsec)) { + inode->i_ctime_sec =3D update.tv_sec; + mgtime_counter_inc(mg_ctime_swaps); + return update; + } + + /* + * Was the change due to someone marking the old ctime QUERIED? + * If so then retry the swap. This can only happen once since + * the only way to clear I_CTIME_QUERIED is to stamp the inode + * with a new ctime. + */ + if (!(old & I_CTIME_QUERIED) && (cur =3D=3D (old | I_CTIME_QUERIED))) + goto retry; + + /* Otherwise, it was a new timestamp. 
*/ + cur_ts.tv_sec =3D inode->i_ctime_sec; + cur_ts.tv_nsec =3D cur & ~I_CTIME_QUERIED; + return cur_ts; +} +EXPORT_SYMBOL(inode_set_ctime_deleg); + /** * in_group_or_capable - check whether caller is CAP_FSETID privileged * @idmap: idmap of the mount @inode was found from diff --git a/include/linux/fs.h b/include/linux/fs.h index eff688e75f2f..ea7ed437d2b1 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1544,6 +1544,8 @@ static inline bool fsuidgid_has_mapping(struct super_= block *sb, =20 struct timespec64 current_time(struct inode *inode); struct timespec64 inode_set_ctime_current(struct inode *inode); +struct timespec64 inode_set_ctime_deleg(struct inode *inode, + struct timespec64 update); =20 static inline time64_t inode_get_atime_sec(const struct inode *inode) { --=20 2.46.0 From nobody Fri Nov 29 23:33:16 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E32F31DEFE5; Fri, 13 Sep 2024 13:54:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235675; cv=none; b=L6pUuqZ6UBcN5lZDdmlskbUKX1acHnsVK72ebioqa3BwDwFgt/PaqJgV3if44qMFAPgrUqYcs3NJzfozQBWuyaZIbdBHtldNpeNxMBpPWxqJ0MDn32NyeHl/idjx6r8lgqEPLVFRitQylkPBRcIxf+KyGsWO51hZsR1r/xAFnSU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235675; c=relaxed/simple; bh=mswhu5zQXRl2wd+0yOTYLtr97eP2EA3GtMMWzONy8wQ=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=A7WHyWIJO4WimQFHd+KCzpMc69038xdroj9u2HHSHO9c0+6LXhUH8lFUcvDZJU/sZ+9BIDNFXOroj/ZuVkujpAFN39zL9GYE9tXaIwmaedDUT1PDFkekehRIG55+0+GiovO7KI48rnMR01613Bx7GHSP0Rhsdiyh42045vkEGWk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass 
(2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=RcrZKXLg; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="RcrZKXLg" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 09398C4CEC5; Fri, 13 Sep 2024 13:54:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1726235674; bh=mswhu5zQXRl2wd+0yOTYLtr97eP2EA3GtMMWzONy8wQ=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=RcrZKXLg4N/m7lvW60c9Jd7o5yNaZ94nCF7FAgZfXsUkKYWJaNn6Qe/3GsjthU0UY mtilMPhwu0hi+dBNqnjc22CfMT161qePMwNrcFY08moeR8O6iZyHoHuHG2KtxOg/Pz bu1FrLhNG8MUPi7xnU2R2/AsOPZf5XSjQUZslaBHtwgXtwSHFXMxP8dcGHDqXlnfZO 2jk17BcaG7Q9Vl1Q5YMlcKAxy0vTHnrfcW9MdxYgVlJtSiqIPuRzXpwjvbkvxMyjhG nInzULVRcvqSvlGspFy0EoyjghywAmFbI1S7pbtnsqH3Mg8ACL0eP4F1rfalDdEZ/J zSxSMGNn+kLxA== From: Jeff Layton Date: Fri, 13 Sep 2024 09:54:14 -0400 Subject: [PATCH v7 05/11] fs: tracepoints around multigrain timestamp events Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20240913-mgtime-v7-5-92d4020e3b00@kernel.org> References: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> In-Reply-To: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Chandan Babu R , "Darrick J. 
Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.1 X-Developer-Signature: v=1; a=openpgp-sha256; l=5892; i=jlayton@kernel.org; h=from:subject:message-id; bh=mswhu5zQXRl2wd+0yOTYLtr97eP2EA3GtMMWzONy8wQ=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm5EQImnGZG1ZnDgABRhwGHmVqj6eZNVyQqpkfv dJtgBXjjC2JAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZuRECAAKCRAADmhBGVaC FTzIEADLtm0TBV+gx6hWnjRySxnJ1KNJEpkgPVq427VHXKk3lpvAO6alhv1VxXv1lc3d4S96m8I Z1Io07qNpdTkazdyVw0PyBq+FvL85w3N1fhu/A4Ykgd2wpr/jZPWshC77ziwWTUFCNYH/jyHSq6 vBmNJmn9yl0s57IILhKk7eLq5kcwmWx5Npjet/bgSUaNgvi5+fUufs87xPDJH2TmvziVOs98MtZ MHtu6ykOm2B2qUMc4C+eSUIKhHZu0Eo29AzGb53tbz/NwtbnO7F9Fxg77Lf1IM3BTGuxexDk6P/ KOMw0Tjs53rqMNRpXq6mYMhkzgXHB04k3pu65ydcl6uj4HGma163fANNEidXkjxYXHUZ4+ksNtF +d+hOg8EjbSRFouVTAO+sv6F94+WNMDP1nXUn9P7paogkC644XpyKIA73zlAPr0WOMe1fg7u5HF Nctpw/NTd1LHcrc9YfH9sW8WkiPeFtEAPZEVIYMlUEc4NYIzFFqkYTpKXySJvLdPZn4A/XwGj1t hu+VQVfTgAJM4FHgYqEqe5j43zqFm6P4oX2mxRkqKKJFDPltx+UPFPBx0XA9ijOEHUkhYhEZ5Vl F0nIKxDFj/6+w8Uo/vji5S+0JaNfeHLqF3LdmdQvqmClLhsFvSGgCakfJ1HEzhqITmUNykn2IgJ dbpGPGZFu7e9AMA== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 Add some tracepoints around various multigrain timestamp events. Reviewed-by: Josef Bacik Reviewed-by: Darrick J. 
Wong Reviewed-by: Jan Kara Signed-off-by: Jeff Layton --- fs/inode.c | 9 +- fs/stat.c | 3 + include/trace/events/timestamp.h | 124 +++++++++++++++++++++++++++++++++++= ++++ 3 files changed, 135 insertions(+), 1 deletion(-) diff --git a/fs/inode.c b/fs/inode.c index 260a8a1c1096..d19f70422a5d 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -22,6 +22,9 @@ #include #include #include +#define CREATE_TRACE_POINTS +#include + #include "internal.h" =20 /* @@ -2598,6 +2601,7 @@ EXPORT_SYMBOL(inode_nohighmem); =20 struct timespec64 inode_set_ctime_to_ts(struct inode *inode, struct timesp= ec64 ts) { + trace_inode_set_ctime_to_ts(inode, &ts); set_normalized_timespec64(&ts, ts.tv_sec, ts.tv_nsec); inode->i_ctime_sec =3D ts.tv_sec; inode->i_ctime_nsec =3D ts.tv_nsec; @@ -2683,6 +2687,8 @@ struct timespec64 inode_set_ctime_current(struct inod= e *inode) =20 /* No need to cmpxchg if it's exactly the same */ - if (cns =3D=3D now.tv_nsec && inode->i_ctime_sec =3D=3D now.tv_sec) + if (cns =3D=3D now.tv_nsec && inode->i_ctime_sec =3D=3D now.tv_sec) { + trace_ctime_xchg_skip(inode, &now); goto out; + } cur =3D cns; retry: @@ -2690,6 +2696,7 @@ struct timespec64 inode_set_ctime_current(struct inod= e *inode) if (try_cmpxchg(&inode->i_ctime_nsec, &cur, now.tv_nsec)) { /* If swap occurred, then we're (mostly) done */ inode->i_ctime_sec =3D now.tv_sec; + trace_ctime_ns_xchg(inode, cns, now.tv_nsec, cur); } else { /* * Was the change due to someone marking the old ctime QUERIED? 
diff --git a/fs/stat.c b/fs/stat.c index a449626fd460..9eb6d9b2d010 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -23,6 +23,8 @@ #include #include =20 +#include + #include "internal.h" #include "mount.h" =20 @@ -52,6 +54,7 @@ void fill_mg_cmtime(struct kstat *stat, u32 request_mask,= struct inode *inode) if (!(stat->ctime.tv_nsec & I_CTIME_QUERIED)) stat->ctime.tv_nsec =3D ((u32)atomic_fetch_or(I_CTIME_QUERIED, pcn)); stat->ctime.tv_nsec &=3D ~I_CTIME_QUERIED; + trace_fill_mg_cmtime(inode, &stat->ctime, &stat->mtime); } EXPORT_SYMBOL(fill_mg_cmtime); =20 diff --git a/include/trace/events/timestamp.h b/include/trace/events/timest= amp.h new file mode 100644 index 000000000000..c9e5ec930054 --- /dev/null +++ b/include/trace/events/timestamp.h @@ -0,0 +1,124 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM timestamp + +#if !defined(_TRACE_TIMESTAMP_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_TIMESTAMP_H + +#include +#include + +#define CTIME_QUERIED_FLAGS \ + { I_CTIME_QUERIED, "Q" } + +DECLARE_EVENT_CLASS(ctime, + TP_PROTO(struct inode *inode, + struct timespec64 *ctime), + + TP_ARGS(inode, ctime), + + TP_STRUCT__entry( + __field(dev_t, dev) + __field(ino_t, ino) + __field(time64_t, ctime_s) + __field(u32, ctime_ns) + __field(u32, gen) + ), + + TP_fast_assign( + __entry->dev =3D inode->i_sb->s_dev; + __entry->ino =3D inode->i_ino; + __entry->gen =3D inode->i_generation; + __entry->ctime_s =3D ctime->tv_sec; + __entry->ctime_ns =3D ctime->tv_nsec; + ), + + TP_printk("ino=3D%d:%d:%ld:%u ctime=3D%lld.%u", + MAJOR(__entry->dev), MINOR(__entry->dev), __entry->ino, __entry->gen, + __entry->ctime_s, __entry->ctime_ns + ) +); + +DEFINE_EVENT(ctime, inode_set_ctime_to_ts, + TP_PROTO(struct inode *inode, + struct timespec64 *ctime), + TP_ARGS(inode, ctime)); + +DEFINE_EVENT(ctime, ctime_xchg_skip, + TP_PROTO(struct inode *inode, + struct timespec64 *ctime), + TP_ARGS(inode, ctime)); + +TRACE_EVENT(ctime_ns_xchg, + TP_PROTO(struct 
inode *inode, + u32 old, + u32 new, + u32 cur), + + TP_ARGS(inode, old, new, cur), + + TP_STRUCT__entry( + __field(dev_t, dev) + __field(ino_t, ino) + __field(u32, gen) + __field(u32, old) + __field(u32, new) + __field(u32, cur) + ), + + TP_fast_assign( + __entry->dev =3D inode->i_sb->s_dev; + __entry->ino =3D inode->i_ino; + __entry->gen =3D inode->i_generation; + __entry->old =3D old; + __entry->new =3D new; + __entry->cur =3D cur; + ), + + TP_printk("ino=3D%d:%d:%ld:%u old=3D%u:%s new=3D%u cur=3D%u:%s", + MAJOR(__entry->dev), MINOR(__entry->dev), __entry->ino, __entry->gen, + __entry->old & ~I_CTIME_QUERIED, + __print_flags(__entry->old & I_CTIME_QUERIED, "|", CTIME_QUERIED_FLAGS), + __entry->new, + __entry->cur & ~I_CTIME_QUERIED, + __print_flags(__entry->cur & I_CTIME_QUERIED, "|", CTIME_QUERIED_FLAGS) + ) +); + +TRACE_EVENT(fill_mg_cmtime, + TP_PROTO(struct inode *inode, + struct timespec64 *ctime, + struct timespec64 *mtime), + + TP_ARGS(inode, ctime, mtime), + + TP_STRUCT__entry( + __field(dev_t, dev) + __field(ino_t, ino) + __field(time64_t, ctime_s) + __field(time64_t, mtime_s) + __field(u32, ctime_ns) + __field(u32, mtime_ns) + __field(u32, gen) + ), + + TP_fast_assign( + __entry->dev =3D inode->i_sb->s_dev; + __entry->ino =3D inode->i_ino; + __entry->gen =3D inode->i_generation; + __entry->ctime_s =3D ctime->tv_sec; + __entry->mtime_s =3D mtime->tv_sec; + __entry->ctime_ns =3D ctime->tv_nsec; + __entry->mtime_ns =3D mtime->tv_nsec; + ), + + TP_printk("ino=3D%d:%d:%ld:%u ctime=3D%lld.%u mtime=3D%lld.%u", + MAJOR(__entry->dev), MINOR(__entry->dev), __entry->ino, __entry->gen, + __entry->ctime_s, __entry->ctime_ns, + __entry->mtime_s, __entry->mtime_ns + ) +); +#endif /* _TRACE_TIMESTAMP_H */ + +/* This part must be outside protection */ +#include --=20 2.46.0 From nobody Fri Nov 29 23:33:16 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 
bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 67A8D1DEFE3; Fri, 13 Sep 2024 13:54:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235677; cv=none; b=b78KJ5YUHnM3FHbPLvlZ8y/LxOG92/7yAvOVhMowQ2ncxYr5y33H34Eat8bE36ePB0MKqdRYW89sEg9IP9pdj+L2xmJRL9onF4JfrkBOKkQv6W5RVwsGIctYYGLinMfWc/H0S8vb3vNSy3UYj8StrVAP039XedtqZ+PlWURuTds= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235677; c=relaxed/simple; bh=KHbT8i/Qcxqnz2+0udQZ1J3l24AVjm4qbm+DkSqt5l4=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=qyAg4tpg9HGGU1GA71NgS3vHMfxOX5JWavQYCCa5U0ACn3pMAxAa/6J/4Fd5+yxzhbr42OL5JSYuzWZ2qyDCnnb5MXus7iWx9yqIp8PtY+gESh3LJORS/+N1S8hHqZiemvVEkmyeDLEkP168+nruWTH9EPzWxjdMn/qzDokKGwk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=s4gq5z6h; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="s4gq5z6h" Received: by smtp.kernel.org (Postfix) with ESMTPSA id BF469C4CEC0; Fri, 13 Sep 2024 13:54:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1726235677; bh=KHbT8i/Qcxqnz2+0udQZ1J3l24AVjm4qbm+DkSqt5l4=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=s4gq5z6hcBvpH4Af29Xa3mpQqQYLzKD4py2xkhJFItmOHrX6Qd/Fgf6YVXRDe3MGr Y6ZtMmWbFjlzRcj4YIH5Xkc3/s3iDMH749NSTPEdbLAXKYVAyc+y0M176gv5js+3CQ sM+9flb6wNVmu5snNzxlWS3Jh0jMftXjUknTZwQMV25wcG7XoM8Unq4hH1/sr/r8sw mH/kWzpS43Xv8h79gRRUjdNRKmpwsS/AyuC+xx+pCsIl0H3CHgvLzt/mk4TvDmLOOq 6CWsCS/+/JC9Bx5hZIys1oqiaV5ujeaTgpNILKa007hMD2FDMzhYguW1uISUDErtCa 3OkcxFDgqNMtw== From: Jeff Layton Date: Fri, 13 Sep 2024 09:54:15 -0400 Subject: [PATCH v7 06/11] fs: add percpu 
counters for significant multigrain timestamp events Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20240913-mgtime-v7-6-92d4020e3b00@kernel.org> References: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> In-Reply-To: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Chandan Babu R , "Darrick J. Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.1 X-Developer-Signature: v=1; a=openpgp-sha256; l=4153; i=jlayton@kernel.org; h=from:subject:message-id; bh=KHbT8i/Qcxqnz2+0udQZ1J3l24AVjm4qbm+DkSqt5l4=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm5EQIsjChr/AGoTI8XHoboWzHgL1UrmuoBOIh2 H0F2aytFBiJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZuRECAAKCRAADmhBGVaC FQ/IEADDyLUPPS3+cN83808MzxZztxPDxXZD2JHsYTymxmOFO/RaD8qGe4hUZ7ZdN2LWKzMI5fr ppbNy6x0jKBcBM85ugKAyHqnvOuHlMrYzaKqj9+ANAywJslEBvAS/ukrrgESCwYCAKsVteGI+Z/ 8HQiNipXu2BjMQpbBsqqiNhIx9bvlR1ZXfFpndkgMklebePjxpZD1v65WqSD2tAQegFDYw9536b 0GVXFbwG2vTPWEFHPDs+aJcgA67VQdSvHHTTw3F5ZlfEphmYdXiBTWGeKajVsSEl7yMNcKaJL35 fs7kOeSLHIqh0fEj44Mxaxo7kAivA2UusbstCWI/Pv/lENOM5lsIcxYSZc/huctUzScUYdffl1W XAisX9gk2C5t1jEsqJBm0RZixHiU+Wq2C1JOZ60pg7b9hYl7f+M/nTslorqoL0d9U/FZg7DLqdO Dlw2uMvAA9rNutfj3ERGBr6Ko5LYiLULlICoyjDA8+1QW0zIohk7qhLyfCipFTPyTfVs1okIOIN 3+5eFy1j/B7UjGCZVH7xZH3x7QHe81R8CYRojAQWu4pk/js1aKMe7iyvy+xakmgmSyOWxU2Xp83 
gJuk+plif3OX9OVu1StHpvYDsqookQd8OLqsIAqAaVzGVk0Q8HvCS5liO38Z/JS5bnukCW21eR2 8bDMmGYu8mWFbAA== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 New percpu counters for counting various stats around mgtimes, and a new debugfs file for displaying them when CONFIG_DEBUG_FS is enabled: - number of attempted ctime updates - number of successful i_ctime_nsec swaps - number of fine-grained timestamp fetches Reviewed-by: Josef Bacik Reviewed-by: Darrick J. Wong Reviewed-by: Jan Kara Signed-off-by: Jeff Layton --- fs/inode.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++= ---- 1 file changed, 72 insertions(+), 5 deletions(-) diff --git a/fs/inode.c b/fs/inode.c index d19f70422a5d..749eb549dec5 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -21,6 +21,8 @@ #include #include #include +#include +#include #include #define CREATE_TRACE_POINTS #include @@ -101,6 +103,69 @@ long get_nr_dirty_inodes(void) return nr_dirty > 0 ? nr_dirty : 0; } =20 +#ifdef CONFIG_DEBUG_FS +static DEFINE_PER_CPU(unsigned long, mg_ctime_updates); +static DEFINE_PER_CPU(unsigned long, mg_fine_stamps); +static DEFINE_PER_CPU(unsigned long, mg_ctime_swaps); + +static long get_mg_ctime_updates(void) +{ + int i; + long sum =3D 0; + + for_each_possible_cpu(i) + sum +=3D per_cpu(mg_ctime_updates, i); + return sum < 0 ? 0 : sum; +} + +static long get_mg_fine_stamps(void) +{ + int i; + long sum =3D 0; + + for_each_possible_cpu(i) + sum +=3D per_cpu(mg_fine_stamps, i); + return sum < 0 ? 0 : sum; +} + +static long get_mg_ctime_swaps(void) +{ + int i; + long sum =3D 0; + + for_each_possible_cpu(i) + sum +=3D per_cpu(mg_ctime_swaps, i); + return sum < 0 ? 
0 : sum; +} + +#define mgtime_counter_inc(__var) this_cpu_inc(__var) + +static int mgts_show(struct seq_file *s, void *p) +{ + long ctime_updates =3D get_mg_ctime_updates(); + long ctime_swaps =3D get_mg_ctime_swaps(); + long fine_stamps =3D get_mg_fine_stamps(); + + seq_printf(s, "%ld %ld %ld\n", + ctime_updates, ctime_swaps, fine_stamps); + return 0; +} + +DEFINE_SHOW_ATTRIBUTE(mgts); + +static int __init mg_debugfs_init(void) +{ + debugfs_create_file("multigrain_timestamps", S_IFREG | S_IRUGO, NULL, NUL= L, &mgts_fops); + return 0; +} +late_initcall(mg_debugfs_init); + +#else /* ! CONFIG_DEBUG_FS */ + +#define mgtime_counter_inc(__var) do { } while (0) + +#endif /* CONFIG_DEBUG_FS */ + /* * Handle nr_inode sysctl */ @@ -2650,10 +2715,9 @@ EXPORT_SYMBOL(timestamp_truncate); * * If it is multigrain, then we first see if the coarse-grained timestamp = is * distinct from what we have. If so, then we'll just use that. If we have= to - * get a fine-grained timestamp, then do so, and try to swap it into the f= loor. - * We accept the new floor value regardless of the outcome of the cmpxchg. - * After that, we try to swap the new value into i_ctime_nsec. Again, we t= ake - * the resulting ctime, regardless of the outcome of the swap. + * get a fine-grained timestamp, then do so. After that, we try to swap th= e new + * value into i_ctime_nsec. We take the resulting ctime, regardless of the + * outcome of the swap. 
*/ struct timespec64 inode_set_ctime_current(struct inode *inode) { @@ -2680,8 +2744,10 @@ struct timespec64 inode_set_ctime_current(struct ino= de *inode) struct timespec64 ctime =3D { .tv_sec =3D inode->i_ctime_sec, .tv_nsec =3D cns & ~I_CTIME_QUERIED }; =20 - if (timespec64_compare(&now, &ctime) <=3D 0) + if (timespec64_compare(&now, &ctime) <=3D 0) { + mgtime_counter_inc(mg_fine_stamps); ktime_get_real_ts64_mg(&now, cookie); + } } now =3D timestamp_truncate(now, inode); =20 @@ -2696,6 +2762,7 @@ struct timespec64 inode_set_ctime_current(struct inod= e *inode) /* If swap occurred, then we're (mostly) done */ inode->i_ctime_sec =3D now.tv_sec; trace_ctime_ns_xchg(inode, cns, now.tv_nsec, cur); + mgtime_counter_inc(mg_ctime_swaps); } else { /* * Was the change due to someone marking the old ctime QUERIED? --=20 2.46.0 From nobody Fri Nov 29 23:33:16 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 73CB11E009D; Fri, 13 Sep 2024 13:54:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235680; cv=none; b=rgUbMJnyBHAPkhetMResR13s0c2l1qap8jx9gjo2tyu9zz50WzbnNP/qis788akpcXziw/174hpuUhjPjKPdY8MRByFiItJLrGHq5W/hz66LPfF7iyM+I+itIAqvURn3kjpO7T9AEFZygqaVJtxwMQiNiZvsLJ+R2zIj2WZ1eG0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235680; c=relaxed/simple; bh=/XcZHkK5gYpUwCSVg7s3+A6XUqyCAQD/OQvh4jjSlT8=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=sVUep616DqZ/88pGW/WRr1hK79cqBDdzQQRD1T4h9okYwEOmaJLfbzCCvU2izoz2aM6khFMYdJbg4p0qMx9TeDulUTukQREeEOp1eM2A06TCfHtEyj9dZtkR92usF+UQkZzl1UX0+QUS6xhc51BXbvNcjmWtPEgl1BSKGOPmCyE= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=CWcayPO0; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="CWcayPO0" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 836D6C4CED0; Fri, 13 Sep 2024 13:54:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1726235680; bh=/XcZHkK5gYpUwCSVg7s3+A6XUqyCAQD/OQvh4jjSlT8=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=CWcayPO0BREmELx+qbla/1UkRoRXru8vfwvasAmBHs3OEmTzi5w2AeX7jA7cT8U4X 38UavtmBwMs4Ja9lM0InjsPOHCQYdbHPXzXSNkqGpTbgrcKWdyDC5GIu0bdj+bfJHh 8wmPVsXgw3nSgXsaEhE3Y8MvlFO4lshs8i52xCovEQwqU9BKge+LMFg/7qKGDnoMKE hGJAYizGYMPUSBbisaxLn+Eahr1+0bmRu/0l+o7CwDdN5/hS4EMu1ctLVpD216wUDI 9eMnXTM2Z/VO5BetJBI5Mruuqpz/wYGvkwPU6yaYvBBjoFwclUCzlXzemGEpWdqNi3 m/oDF1pr+zOJw== From: Jeff Layton Date: Fri, 13 Sep 2024 09:54:16 -0400 Subject: [PATCH v7 07/11] Documentation: add a new file documenting multigrain timestamps Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20240913-mgtime-v7-7-92d4020e3b00@kernel.org> References: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> In-Reply-To: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Chandan Babu R , "Darrick J. 
Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton , Randy Dunlap X-Mailer: b4 0.14.1 X-Developer-Signature: v=1; a=openpgp-sha256; l=6998; i=jlayton@kernel.org; h=from:subject:message-id; bh=/XcZHkK5gYpUwCSVg7s3+A6XUqyCAQD/OQvh4jjSlT8=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm5EQJ1zvz7D6LAhdnOHcJnvHOQRJ7Pg8fD+uTz utfCHBIZWeJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZuRECQAKCRAADmhBGVaC FbvsEACoJHq00HMhA20PmQVbSRdg1FHnGuaUNQHBn66D636/VgZoSV8WgJu70G84vvFDIevbKMV bFejZ2F0n4EZC2z9z4DSA7SXYHnBQihlV1uN9+91voMzQBrwZhoh/yaACJnKLo/hKd/GtU+bkZk 8q243t6sBl3Eej7xrnY1QkNKSt2LOay+KR9I13PRrXaZFILjykvrWtMO1m5d6MmT4yZG2x20pQh imTIRKqSrIDYq286AkvABlR5bF/bfMYlYbjh5uJejc2wAHixNrvkfkdLDcFUzAhIIJZO65WkZt9 6W589fBgMuRCGwXyK/fuomY//FGkaNhGgTO3n7ncI2MDthotR8AE3bi/kz6SJdA/KSABtbb86fE JOq0YejwvQ4jJwSot5799C85AOQakU/9RUJfbPrwZxh8caEfoqEteLrAiOqWI5tsHAKDOA0PalM zypILvN5pYhWbwnlW+I3rWyle2v+gMoa2AGGTyvmiAvNQzMmYB/j2+9a2cV9EEhm0ullKHsI7YR vamW0vUHCs2OEnNlxEGOYNcVc7nbkkKLHDQu+URoB0ovON8waJcGlXaOw3g6bYcf4cfJwRa701g e0sTRqk60QYnQ8QZMnNApaO0I3y4ahS5bMkns5vJ+Kv5uIfLiA2TXW9xw6qE3VTzctsRft6s/IK Y4jJ9dKvqkO7iJw== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 Add a high-level document that describes how multigrain timestamps work, rationale for them, and some info about implementation and tradeoffs. Reviewed-by: Josef Bacik Reviewed-by: Darrick J. 
Wong Reviewed-by: Randy Dunlap Reviewed-by: Jan Kara Signed-off-by: Jeff Layton --- Documentation/filesystems/index.rst | 1 + Documentation/filesystems/multigrain-ts.rst | 121 ++++++++++++++++++++++++= ++++ 2 files changed, 122 insertions(+) diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystem= s/index.rst index e8e496d23e1d..44e9e77ffe0d 100644 --- a/Documentation/filesystems/index.rst +++ b/Documentation/filesystems/index.rst @@ -29,6 +29,7 @@ algorithms work. fiemap files locks + multigrain-ts mount_api quota seq_file diff --git a/Documentation/filesystems/multigrain-ts.rst b/Documentation/fi= lesystems/multigrain-ts.rst new file mode 100644 index 000000000000..97877ab3d933 --- /dev/null +++ b/Documentation/filesystems/multigrain-ts.rst @@ -0,0 +1,121 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +Multigrain Timestamps +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +Introduction +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +Historically, the kernel has always used coarse time values to stamp inode= s. +This value is updated every jiffy, so any change that happens within that = jiffy +will end up with the same timestamp. + +When the kernel goes to stamp an inode (due to a read or write), it first = gets +the current time and then compares it to the existing timestamp(s) to see +whether anything will change. If nothing changed, then it can avoid updati= ng +the inode's metadata. + +Coarse timestamps are therefore good from a performance standpoint, since = they +reduce the need for metadata updates, but bad from the standpoint of +determining whether anything has changed, since a lot of things can happen= in a +jiffy. + +They are particularly troublesome with NFSv3, where unchanging timestamps = can +make it difficult to tell whether to invalidate caches. 
NFSv4 provides a
+dedicated change attribute that should always show a visible change, but not
+all filesystems implement this properly, causing the NFS server to substitute
+the ctime in many cases.
+
+Multigrain timestamps aim to remedy this by selectively using fine-grained
+timestamps when a file has had its timestamps queried recently, and the current
+coarse-grained time does not cause a change.
+
+Inode Timestamps
+================
+There are currently 3 timestamps in the inode that are updated to the current
+wallclock time on different activity:
+
+ctime:
+  The inode change time. This is stamped with the current time whenever
+  the inode's metadata is changed. Note that this value is not settable
+  from userland.
+
+mtime:
+  The inode modification time. This is stamped with the current time
+  any time a file's contents change.
+
+atime:
+  The inode access time. This is stamped whenever an inode's contents are
+  read. Widely considered to be a terrible mistake. Usually avoided with
+  options like noatime or relatime.
+
+Updating the mtime always implies a change to the ctime, but updating the
+atime due to a read request does not.
+
+Multigrain timestamps are only tracked for the ctime and the mtime. atimes are
+not affected and always use the coarse-grained value (subject to the floor).
+
+Inode Timestamp Ordering
+========================
+
+In addition to just providing info about changes to individual files, file
+timestamps also serve an important purpose in applications like "make". These
+programs measure timestamps in order to determine whether source files might be
+newer than cached objects.
+
+Userland applications like make can only determine ordering based on
+operational boundaries. For a syscall those are the syscall entry and exit
+points. For io_uring or nfsd operations, that's the request submission and
+response.
In the case of concurrent operations, userland can make no
+determination about the order in which things will occur.
+
+For instance, if a single thread modifies one file, and then another file in
+sequence, the second file must show an equal or later mtime than the first. The
+same is true if two threads are issuing similar operations that do not overlap
+in time.
+
+If, however, two threads have racing syscalls that overlap in time, then there
+is no such guarantee, and the second file may appear to have been modified
+before, after or at the same time as the first, regardless of which one was
+submitted first.
+
+Multigrain Timestamp Implementation
+===================================
+Multigrain timestamps are aimed at ensuring that changes to a single file are
+always recognizable, without violating the ordering guarantees when multiple
+different files are modified. This affects the mtime and the ctime, but the
+atime will always use coarse-grained timestamps.
+
+It uses an unused bit in the i_ctime_nsec field to indicate whether the mtime
+or ctime has been queried. If either or both have, then the kernel takes
+special care to ensure the next timestamp update will display a visible change.
+This ensures tight cache coherency for use cases like NFS, without sacrificing
+the benefits of reduced metadata updates when files aren't being watched.
+
+The Ctime Floor Value
+=====================
+It's not sufficient to simply use fine- or coarse-grained timestamps based on
+whether the mtime or ctime has been queried. A file could get a fine-grained
+timestamp, and then a second file modified later could get a coarse-grained one
+that appears earlier than the first, which would break the kernel's timestamp
+ordering guarantees.
+
+To mitigate this problem, we maintain a global floor value that ensures that
+this can't happen.
The two files in the above example may appear to have been
+modified at the same time in such a case, but they will never show the reverse
+order. To avoid problems with realtime clock jumps, the floor is managed as a
+monotonic ktime_t, and the values are converted to realtime clock values as
+needed.
+
+Implementation Notes
+====================
+Multigrain timestamps are intended for use by local filesystems that get
+ctime values from the local clock. This is in contrast to network filesystems
+and the like that just mirror timestamp values from a server.
+
+For most filesystems, it's sufficient to just set the FS_MGTIME flag in the
+fstype->fs_flags in order to opt in, provided the ctime is only ever set via
+inode_set_ctime_current(). If the filesystem has a ->getattr routine that
+doesn't call generic_fillattr(), then you should have it call fill_mg_cmtime() to
+fill those values. For setattr, it should use setattr_copy() to update the
+timestamps, or otherwise mimic its behavior.
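The interaction between the queried flag and the floor described in the document above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel implementation: every `toy_*` name, the tick length and the flag bit position are hypothetical, the clock is simulated and single-threaded, and the real code keeps the floor as a monotonic ktime_t and updates the inode with atomic operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_TICK_NS 4000000ULL   /* hypothetical coarse-clock granularity */
#define TOY_QUERIED (1ULL << 63) /* hypothetical "queried" flag bit */

uint64_t toy_fine_now; /* simulated fine-grained clock, in ns */
uint64_t toy_floor;    /* latest fine-grained stamp handed out */

struct toy_inode {
	uint64_t ctime; /* ns; TOY_QUERIED is set once the value has been read */
};

/* Coarse time: the fine clock rounded down to a tick, never below the floor. */
uint64_t toy_coarse_now(void)
{
	uint64_t coarse = toy_fine_now - (toy_fine_now % TOY_TICK_NS);

	return coarse > toy_floor ? coarse : toy_floor;
}

/* Reading the timestamp marks it as queried. */
uint64_t toy_getattr(struct toy_inode *inode)
{
	inode->ctime |= TOY_QUERIED;
	return inode->ctime & ~TOY_QUERIED;
}

/* Stamp the inode; use the fine clock only if a queried value would not change. */
uint64_t toy_set_ctime_current(struct toy_inode *inode)
{
	bool queried = (inode->ctime & TOY_QUERIED) != 0;
	uint64_t old = inode->ctime & ~TOY_QUERIED;
	uint64_t now = toy_coarse_now();

	if (queried && now <= old) {
		now = toy_fine_now;
		if (now > toy_floor)
			toy_floor = now; /* later coarse stamps may not precede this one */
	}
	inode->ctime = now; /* storing the new value clears TOY_QUERIED */
	return now;
}

/* Walk through the two-file scenario; returns 0 if all checks hold. */
int toy_demo(void)
{
	struct toy_inode a = { 0 }, b = { 0 };
	uint64_t first, second, other;

	toy_fine_now = 10 * TOY_TICK_NS + 100;
	toy_floor = 0;

	first = toy_set_ctime_current(&a);  /* unqueried: coarse stamp */
	toy_getattr(&a);                    /* someone looks at the ctime */
	toy_fine_now += 500;                /* still within the same tick */
	second = toy_set_ctime_current(&a); /* queried: fine-grained stamp */
	toy_fine_now += 500;
	other = toy_set_ctime_current(&b);  /* unqueried: coarse, clamped to floor */

	assert(second > first);  /* the change to the watched file is visible */
	assert(other >= second); /* the later file never appears older */
	return 0;
}
```

Calling `toy_demo()` from a `main()` exercises the case from "The Ctime Floor Value": without the clamp in `toy_coarse_now()`, the second file would receive the plain tick value and appear older than the first.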
--=20 2.46.0 From nobody Fri Nov 29 23:33:16 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4C4CF1E0B78; Fri, 13 Sep 2024 13:54:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235683; cv=none; b=qYHTYuZz72k9v+C/zIZT/KCuVTA1USHLNTewuMhV/h8Qrs4/JziOKyJEERLCfud57R9bH7eEiQv3w+8WV0RUhuqteaZAfm5Wyf9G0prIvUdhggYxU/KtZ+OlqxkL/47eQTjr3TViBKT2oEi8pZmfLjA2FCQyh9ljJKo2M0D7uss= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235683; c=relaxed/simple; bh=bIaGrKHNkJGKwsIp2/9hx+nM5n7ktsZpZECPjECvfD4=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=G9cjPfa7pXb68bFnOyyEQD+zzm8izXwJv4HlM8q1yrz9i7Oe26vYHIhSQGjkdm2+2Yi9KNFpnsHZ/1tvTwWzCeR66hUwUNumtKuLp3+VTJ1eGiw/f5GtHtR3v/pONqr9hiJDtvbrWERclp4ee7zxczcy9SrgMuM0xuXsiGOZl1I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Guyg9oT+; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Guyg9oT+" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5D8D3C4CECE; Fri, 13 Sep 2024 13:54:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1726235682; bh=bIaGrKHNkJGKwsIp2/9hx+nM5n7ktsZpZECPjECvfD4=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=Guyg9oT+NWYbjSrZdRkxccMwDJyyHviLQGBLIj6YRh1eSeYcj5XRoEBPD7kVxR8ZN GyKoNcrB7fZelDhspiG6fdw1NIHAh7Re3miRGHxszHzcBVQP0bPYmKCNAPMnpJqXJP fPsIg2O9jLiYclcDOv/EOPX1bOertHlhi2ReZdPLRBUxHSlL0IhJACB6QMV11/59bd 
4kjxU5edIebTsYnGfzFc52HaSqqqvT80UDf9XBc6/lqzqJdO7hZs+k2SIAU1EjKEdj SMGOddLaCCiD/ousbygLJCpd7bPcnrZjwP+bGRqKUG79rg6y/+2nqMqzrYe7ZyeZ0r nCd0ta10pN5FA== From: Jeff Layton Date: Fri, 13 Sep 2024 09:54:17 -0400 Subject: [PATCH v7 08/11] xfs: switch to multigrain timestamps Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20240913-mgtime-v7-8-92d4020e3b00@kernel.org> References: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> In-Reply-To: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Chandan Babu R , "Darrick J. Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.1 X-Developer-Signature: v=1; a=openpgp-sha256; l=2942; i=jlayton@kernel.org; h=from:subject:message-id; bh=bIaGrKHNkJGKwsIp2/9hx+nM5n7ktsZpZECPjECvfD4=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm5EQJhpcPCTpuesPObcfS93aBbr6jli76FoBN4 +jNilKD6nGJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZuRECQAKCRAADmhBGVaC FSiRD/9bpuRwwKxqNOXbJWyZS6BnOCIPYZl0DiHb+px0bvNOxAA5lV17DfkdTVRGj4cdiG6hbi3 bC1wnn994yj4E32dOkwARqFjXzPmPlpBz0WoCR2SNcHHL+h4MySabeo7A7qqL3cuGzu0bFVRVnE 7dZCwqhvcnKxar/hIzkq0yAXuzqiew/VLCussXTC9PXNj+oCyAbFmgJipL01U/M1vcOICzApCu4 1EOa0rXnVhR38691CU9nxtf4p+JoxxAYmRVi/lkQKApUceYDJQYg+nt5/Fni+oUL9wid2EPcJ1J Hp6aDFYxn+7hVzAtpzM/8E9+h5SzbDaLihsNeZMCJZ/meVGkbeQiMJc6zzeQoc66TCscfCfv/Qo 
Lfcjh0ow2eXeiAIE15FbpYp1GAicEL6VPBSysPi/1Jb8ylPr8GfHorN33d/xs5UMrtmjeV//viW ezh7rm3eHTernvy+of89bdQS2IsW6UzbWnFyDuOiVnlC9a/SPzx5qg/XyTzTDveCNUbFpHz0uQ6 EwLKOpZXD6btK89GykcHWpM8yNAimeIHZ/fOx3hFfhSs/PS6uvIdFu4R/pKTQ3wg+B/lGE9r3rU Y5+36ESqCpmzO9xeAll4MeQgsBxsIaXYzZYPN3pi+sKPwFlZf1ttfriEIfWxJxS0ttxGj3FjgoV uUfDd4iYB+/XKNg== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 Enable multigrain timestamps, which should ensure that there is an apparent change to the timestamp whenever it has been written after being actively observed via getattr. Also, anytime the mtime changes, the ctime must also change, and those are now the only two options for xfs_trans_ichgtime. Have that function unconditionally bump the ctime, and ASSERT that XFS_ICHGTIME_CHG is always set. Finally, stop setting STATX_CHANGE_COOKIE in getattr, since the ctime should give us better semantics now. Reviewed-by: Josef Bacik Reviewed-by: Darrick J. Wong Signed-off-by: Jeff Layton --- fs/xfs/libxfs/xfs_trans_inode.c | 6 +++--- fs/xfs/xfs_iops.c | 10 +++------- fs/xfs/xfs_super.c | 2 +- 3 files changed, 7 insertions(+), 11 deletions(-) diff --git a/fs/xfs/libxfs/xfs_trans_inode.c b/fs/xfs/libxfs/xfs_trans_inod= e.c index 3c40f37e82c7..c962ad64b0c1 100644 --- a/fs/xfs/libxfs/xfs_trans_inode.c +++ b/fs/xfs/libxfs/xfs_trans_inode.c @@ -62,12 +62,12 @@ xfs_trans_ichgtime( ASSERT(tp); xfs_assert_ilocked(ip, XFS_ILOCK_EXCL); =20 - tv =3D current_time(inode); + /* If the mtime changes, then ctime must also change */ + ASSERT(flags & XFS_ICHGTIME_CHG); =20 + tv =3D inode_set_ctime_current(inode); if (flags & XFS_ICHGTIME_MOD) inode_set_mtime_to_ts(inode, tv); - if (flags & XFS_ICHGTIME_CHG) - inode_set_ctime_to_ts(inode, tv); if (flags & XFS_ICHGTIME_ACCESS) inode_set_atime_to_ts(inode, tv); if (flags & XFS_ICHGTIME_CREATE) diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 1cdc8034f54d..a1c4a350a6db 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -597,8 
+597,9 @@ xfs_vn_getattr( stat->gid =3D vfsgid_into_kgid(vfsgid); stat->ino =3D ip->i_ino; stat->atime =3D inode_get_atime(inode); - stat->mtime =3D inode_get_mtime(inode); - stat->ctime =3D inode_get_ctime(inode); + + fill_mg_cmtime(stat, request_mask, inode); + stat->blocks =3D XFS_FSB_TO_BB(mp, ip->i_nblocks + ip->i_delayed_blks); =20 if (xfs_has_v3inodes(mp)) { @@ -608,11 +609,6 @@ xfs_vn_getattr( } } =20 - if ((request_mask & STATX_CHANGE_COOKIE) && IS_I_VERSION(inode)) { - stat->change_cookie =3D inode_query_iversion(inode); - stat->result_mask |=3D STATX_CHANGE_COOKIE; - } - /* * Note: If you add another clause to set an attribute flag, please * update attributes_mask below. diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index 27e9f749c4c7..210481b03fdb 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -2052,7 +2052,7 @@ static struct file_system_type xfs_fs_type =3D { .init_fs_context =3D xfs_init_fs_context, .parameters =3D xfs_fs_parameters, .kill_sb =3D xfs_kill_sb, - .fs_flags =3D FS_REQUIRES_DEV | FS_ALLOW_IDMAP, + .fs_flags =3D FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_MGTIME, }; MODULE_ALIAS_FS("xfs"); =20 --=20 2.46.0 From nobody Fri Nov 29 23:33:16 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0E92F1E1326; Fri, 13 Sep 2024 13:54:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235686; cv=none; b=A6+yMFhC78ysq3jjObvBRRePmgi9/aT4QzJvMZ6PZX2LzCbveFwSptC7g9QDaubO4VXHkDA319OabfhNcTz+uggHdDRTQPc0yF1EW4rCSvVqsnxpdiKT7vTUCfNh4vMDlI8dfuRoEw/LTJAlT82MxRtDmy3P32BI3I2iyPXa+aA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235686; c=relaxed/simple; 
bh=Fd5IxgHtKcN7b8Mow1VZm20amle3gCRV/bljD+jCFMs=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=K+OHKRNZvYV7YiUG/jSAsnR9mU67uv+ts7fyBUgObpdKWU4SMo3ka6elpq9WKHTVnTIgbBBEmY7s15lViSEhQCqf0ysPDC+eWIsOSJGyhSmRPxN23RYshiTdmf2AEiEPX5vLnmXhQp5oUj2DLszCffN0WNrfXMBDTZ5fhDemmn0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=AuAeewB9; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="AuAeewB9" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1F73AC4CECF; Fri, 13 Sep 2024 13:54:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1726235685; bh=Fd5IxgHtKcN7b8Mow1VZm20amle3gCRV/bljD+jCFMs=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=AuAeewB9Uk7nJE0YghyMUcM9Xs6hH7Buvcm/8ac7LUircjJ8rWbR972Y3zmopxj35 dHOxrhzMWZQW6+Ws79B1QbjRzDJfTaA0kiad8DS4JBRVlnsnHLYC1PwaP8lZc99++9 D1kH/eGfMk6TCGXCy8hPSW60oqJwsZ9Tkou1a7D7EGO5tcqWkwFizJT4h7VgVXr9k0 Q5obx6sQ+NBHF5bYHFWxqUIyTh4vTzgtrwe8fZ7E4HBK/l/HuSkfwOi7oiSvqa9tWY pqYldPVtk6dJdZ9BlYGDbE82T1FO68q7vsNgRcqmifwK/TCMW7RN9VYg7PKB9sj59G wRS3JooNdN2NA== From: Jeff Layton Date: Fri, 13 Sep 2024 09:54:18 -0400 Subject: [PATCH v7 09/11] ext4: switch to multigrain timestamps Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20240913-mgtime-v7-9-92d4020e3b00@kernel.org> References: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> In-Reply-To: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Chandan Babu R , "Darrick J. 
Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.1 X-Developer-Signature: v=1; a=openpgp-sha256; l=926; i=jlayton@kernel.org; h=from:subject:message-id; bh=Fd5IxgHtKcN7b8Mow1VZm20amle3gCRV/bljD+jCFMs=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm5EQJL6h7gS7IiFUm1AH2Gox/Iu4o0pKKpwApl Kim5XpJ0R6JAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZuRECQAKCRAADmhBGVaC FW87D/0a7vdE5p5oQ4JFJCRWyOBMckH3r9Y3OOWcv+EttrTbdiLCPa/rUw5s4ZYKOk/rN5BoRkQ CB2QK7M5veam7/xYvK2iv4okI6HbMig8Tzq3QFpgitnqYPwpugbI8Oti5AajHA/kqQUGPH5YQRu pZSmKs4IjJuG4rGNeu4d4j5oPW7UwslRmPL3skIKxFYYm0s+tDzeVo5+CLeMSab7n9Qxnyq1zAi McegWVYmasqTxVRAxiJ0OVPtf/3VwGbbXAd+PURarKFUkGoNLkpjm6Q2Y2uKo7N3z+FWGns+i0Q rkU7m1LmbBP16Lc4z44oh17/sgwU9zLQsZqu+UaeDNHgnMKTRE0eF+ddR2qJit+RRTw8uIkN7zj egk9PZZrC4rVou++Ngk7kTOyMRfFMHhsoo3in12P3kM91uwSbtx6w8tznVK9WnZC6wwyMJYhOYZ YMA+ZKF4oTlJAaoeWSWkp7uAbZy7eUGQO4DOUXOMayJQYdt86VFvTjQMn5xG1LkQ2ssY0+zWaS2 kEqLUYyjpYm5s1vs07Qe6/Ngh/5yaT7goaFapf3Sjxs0JXDJz0MOkxy2TccBCvY/MLT8FR0emSf RDi8FwFRjVH6VoCo3raSm2rWAvMKE2v1yByb0bWKucYzavOTCasGbt0HloiZ48/hDiQaDVl+OaX si75ypYgIGP/+Nw== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 Enable multigrain timestamps, which should ensure that there is an apparent change to the timestamp whenever it has been written after being actively observed via getattr. For ext4, we only need to enable the FS_MGTIME flag. 
Reviewed-by: Josef Bacik Reviewed-by: Jan Kara Signed-off-by: Jeff Layton --- fs/ext4/super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ext4/super.c b/fs/ext4/super.c index e72145c4ae5a..a125d9435b8a 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -7298,7 +7298,7 @@ static struct file_system_type ext4_fs_type =3D { .init_fs_context =3D ext4_init_fs_context, .parameters =3D ext4_param_specs, .kill_sb =3D ext4_kill_sb, - .fs_flags =3D FS_REQUIRES_DEV | FS_ALLOW_IDMAP, + .fs_flags =3D FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_MGTIME, }; MODULE_ALIAS_FS("ext4"); =20 --=20 2.46.0 From nobody Fri Nov 29 23:33:16 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BFCC11DC06D; Fri, 13 Sep 2024 13:54:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235688; cv=none; b=DOnytK5ysqoFEwxwr9Dhcx1FvrSgSiptx8wwkqOJ5eqEBJIBrHjHlawkzTm2W6U9SEKXFfTw7iwayu4sTKnJkXXK5D5z3q6+OMTL6so1BIiLctOUMEN/ia5QwDqY7Sm5KQhov0wpp/ZxGSrqxN/0ubZeUwONi5ZQoSaw41fbPWs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235688; c=relaxed/simple; bh=gD0wNq7Gu5j1JVptVjT60GA5HV2tmuyZD0KoREfuUWw=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Puk1hMEAqBLsY1ub2o/NwuMDcdly8K350WoT57nJUns4T06/1QHqb12qqON8WmCxmWADcRu8r1FkgiotQN4n0WAJaX1aB54erxi4/aSGRtzF45j01OwZY+l5Z0MuNgjqtT/Jcqs+AT4uP1fYosVFATDprzJ6nuM+RH0ozjb86Xs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=YV9h2BHP; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) 
header.d=kernel.org header.i=@kernel.org header.b="YV9h2BHP" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D67F4C4CEC7; Fri, 13 Sep 2024 13:54:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1726235688; bh=gD0wNq7Gu5j1JVptVjT60GA5HV2tmuyZD0KoREfuUWw=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=YV9h2BHPgRdlUgYO3F/CqZJt1Rc/2kCQcYi69lZ21K5cNlZZts+4iWAGlgUUTCoEv X3ewd9YOJGoyXgtot3C5RATvBsQMfiA8U+bY2D6M2qE5IoSDEUqnPRlFkIweuqZ/PJ p6edHrAfSGy/gWeiV1jyM678J00iCIu4ZfkPezDpqy0iiN2Y20rrMZjWUPBOMvfi3J oxtD9Cx7nCkicldPZdtC1GDNSYPk/YV+IUiN4kOYKZZobWqVk+lY4XWOfNMmqYQ8HG R/xFCBfIim+nQ8u8mt/T2s/WQQ4BpIahixAaZEp/mKbF/Bzyg95VsbaGRUQtmGcW4S FB+JjuV+ZVzvQ== From: Jeff Layton Date: Fri, 13 Sep 2024 09:54:19 -0400 Subject: [PATCH v7 10/11] btrfs: convert to multigrain timestamps Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20240913-mgtime-v7-10-92d4020e3b00@kernel.org> References: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> In-Reply-To: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Chandan Babu R , "Darrick J. 
Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.1 X-Developer-Signature: v=1; a=openpgp-sha256; l=2660; i=jlayton@kernel.org; h=from:subject:message-id; bh=gD0wNq7Gu5j1JVptVjT60GA5HV2tmuyZD0KoREfuUWw=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm5EQJki37/yNYFfRJD0xfVng3Rl9sd2OKealIZ wgN8YU7mDSJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZuRECQAKCRAADmhBGVaC FfCyEADHfVc2KbEBjV91v6AyHwyXWTddhIFLhoAxtZI3VehsvHLino/KQ15ml1Ehn1umOKyoJHb gY+BrYSpB2tTAaHIhBIdR0SJFSGZ/uJ3rD+KMcLIoLZB9GynRVKT8WLFf5/QAkuXYemXgVi7Mvk M0PyfSTPIUSkm1nEz6R7/fy7xv1zeYiX7AaW9y5JKwarbS0IA2/GEIfyyKiW3QhG0ynu60JX/lZ ECCXJWF1GG/7Rb4DBGb8UUMYNaxZ8aguK5eNDBG0NSsxUQg8zrTFKbg8HJ9ZAZ3bbYqLcRwJJ1E Db2ReX2fFW/iP5bZCikSTb2hxmA4/RGdDHSdURzG0r6Al5UbP2D7SuoUtztPW7AUq4WDKGQfXgZ F/E9+m02g7WPTNyM31BsAqkjWVBAX6MQpLU1mKyB6s++j0DFdPwo2OWwSnhR9HIqUxiyhDkZ0V/ ulmMZN+bjwIpUQ09rEzRmzL+1464X4wNk788Qosy+eceKV3wZKDgKSX14nVFiyWdbKQohlkRNPK Q+awpe8E+yDMrIFOtc/HIdvUKgq1zAkMF7IV1lfYMCAmp1bcsnLSQQOWKjWM9Kr6IywPztPFerL kZRpfsYr5QYY7quhuja5CbrvHJER8N666wveTyecwDl1wcChHfX8HQD+6wrpNoqO/5lffPcR5q7 hpIwg4Kem0wxJ5g== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 Enable multigrain timestamps, which should ensure that there is an apparent change to the timestamp whenever it has been written after being actively observed via getattr. Beyond enabling the FS_MGTIME flag, this patch eliminates update_time_for_write, which goes to great pains to avoid in-memory stores. Just have it overwrite the timestamps unconditionally. 
Note that this also drops the IS_I_VERSION check and unconditionally bumps
the change attribute, since SB_I_VERSION is always set on btrfs.

Reviewed-by: Josef Bacik
Signed-off-by: Jeff Layton
---
 fs/btrfs/file.c  | 25 ++++---------------------
 fs/btrfs/super.c |  3 ++-
 2 files changed, 6 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 2aeb8116549c..1656ad7498b8 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1120,26 +1120,6 @@ void btrfs_check_nocow_unlock(struct btrfs_inode *inode)
 	btrfs_drew_write_unlock(&inode->root->snapshot_lock);
 }
 
-static void update_time_for_write(struct inode *inode)
-{
-	struct timespec64 now, ts;
-
-	if (IS_NOCMTIME(inode))
-		return;
-
-	now = current_time(inode);
-	ts = inode_get_mtime(inode);
-	if (!timespec64_equal(&ts, &now))
-		inode_set_mtime_to_ts(inode, now);
-
-	ts = inode_get_ctime(inode);
-	if (!timespec64_equal(&ts, &now))
-		inode_set_ctime_to_ts(inode, now);
-
-	if (IS_I_VERSION(inode))
-		inode_inc_iversion(inode);
-}
-
 int btrfs_write_check(struct kiocb *iocb, struct iov_iter *from, size_t count)
 {
 	struct file *file = iocb->ki_filp;
@@ -1170,7 +1150,10 @@ int btrfs_write_check(struct kiocb *iocb, struct iov_iter *from, size_t count)
 	 * need to start yet another transaction to update the inode as we will
 	 * update the inode when we finish writing whatever data we write.
*/ - update_time_for_write(inode); + if (!IS_NOCMTIME(inode)) { + inode_set_mtime_to_ts(inode, inode_set_ctime_current(inode)); + inode_inc_iversion(inode); + } =20 start_pos =3D round_down(pos, fs_info->sectorsize); oldsize =3D i_size_read(inode); diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 98fa0f382480..d423acfe11d0 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -2198,7 +2198,8 @@ static struct file_system_type btrfs_fs_type =3D { .init_fs_context =3D btrfs_init_fs_context, .parameters =3D btrfs_fs_parameters, .kill_sb =3D btrfs_kill_super, - .fs_flags =3D FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA | FS_ALLOW_IDMAP, + .fs_flags =3D FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA | + FS_ALLOW_IDMAP | FS_MGTIME, }; =20 MODULE_ALIAS_FS("btrfs"); --=20 2.46.0 From nobody Fri Nov 29 23:33:16 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 804E61E203F; Fri, 13 Sep 2024 13:54:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235691; cv=none; b=J/41VkyVNUqrHrZnJba4Avn8MawaqRiRacAQVQ9pQYeb5pJOsAnsB/TCpz2nLp1jOaF/ETAMR+5odqPGR3TLp4w9N/O2emLl/aKcsMpWXP/JdBPCUFB0nPtTLkOIkfhvNPCm/Wti8akV58UhLPL658fdNcIqzaKyD9S7V3Q3wwg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726235691; c=relaxed/simple; bh=gE8r0Qbj4D8u7V9shLCLoJR+0WYqh7zAXrVYnVeHWEs=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=I0uE5Z5iiSH3reWzL7iAekdZR4x5Ii8cbvuqm6cu4uW3o7Fiw2Fn4MQV1jJ1equb4dHhxT11eLi/mNUTMH8EvcjPtOi8VJpcJ/sMPGaQIwYP5E3sPuRXoKlx1AZvkci8KGvk1yz8RTdoYHYrbfj2X3CbeO8wpudi2Y1YdUzQsfo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org 
header.i=@kernel.org header.b=f2lIRgow; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="f2lIRgow" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 984F9C4CEC5; Fri, 13 Sep 2024 13:54:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1726235691; bh=gE8r0Qbj4D8u7V9shLCLoJR+0WYqh7zAXrVYnVeHWEs=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=f2lIRgowJftwLcQ9w7tVB8oC70HkHxJDGKDyJpADdRiJhqArytVVoHezmLIRgh1/f 65MbjSAkLoNmZG0zGJ/4rrxWgBJ/YPPywR3n+kxUaKTBVCq2eXtTEAkBhSzlnY5nCN euY44TXbdqovwOs/hJGQ7akGkV/WNaOUgxxZAkqKnxloZmXkQ3x1EhYrznya4qzj7z xoyEBYwkF1GMT7ILCoL9sMkch1LKDhm/cyPj0K44b3EE079V3JVVwg5vlpSGlre4DQ fdqmY9C9zlDwjaILzHVoAqFRFkO9QniRuw3vFPpz0TCvMxl5tRuRD60P3aW5B2jPhL 57+P8nnHpu2iw== From: Jeff Layton Date: Fri, 13 Sep 2024 09:54:20 -0400 Subject: [PATCH v7 11/11] tmpfs: add support for multigrain timestamps Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20240913-mgtime-v7-11-92d4020e3b00@kernel.org> References: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> In-Reply-To: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Chandan Babu R , "Darrick J. 
Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.1 X-Developer-Signature: v=1; a=openpgp-sha256; l=862; i=jlayton@kernel.org; h=from:subject:message-id; bh=gE8r0Qbj4D8u7V9shLCLoJR+0WYqh7zAXrVYnVeHWEs=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm5EQJHTt+Om9HYRcT2NZhJ5f2mscuIoCqf2gYe jhTS7eDZkCJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZuRECQAKCRAADmhBGVaC FQPaD/9gG6UEiaA7vTuVgqfTbBhigO29yQX61fwL8SVjsS051BMJ8t48j4j1aviqPtTD6+16bWl s8DnyvS3b5IVxvqrrku+15cai6PFXW2hpMz6ncOe8hb//hNTQblEKLSar+bKiPqhlaNn6YemdXu F18GlMWmkLgQMMAkECO7gvp8yXBZIiqqEwc9uJWH8IytnUkKEJZS9S89pV9CI+atEhSU6OSMC5i ZnqIL98cX2Bg+Sq78zhJm0lJgwo20Arvq7B1MVHbp76yS86iKi6p8bpEsD8RwTA5jhomTVt/+UV qGiQHCZ+AdxQsrfRU+RPUyhG8uEBY0vrXDwpx2hQr+ouu+BDvjhB3ylkFyWMWaNm1OJuJ7WBv1O Loq6TgYL8/bit13EcqxlPIinPON11eRxALoPjA1VtFusW7dKrFrykV6bNy+5e17HXvpP+bCNdHV vwN28AVgozIZkBCZTl7j+qu2Ex9KY9cJGLkuRyKrHco62kfy2RUOJpT3M2mx1eW0VdLNLByadcQ 509HchpbTRFisLAArStgVSrxzAnFIdHyLu2du2y5vCjRHXyoym8UHgUX17Dl8mX3BG268V5pAHG 7DDpYgszQlnENaUwHlbsTtQsg10Hfi+pUzbv+aJdLRe39lFUd4Tkg7okBANYJaH15piBp6j60MJ Ct1p3Evc6o4ymiw== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 Enable multigrain timestamps, which should ensure that there is an apparent change to the timestamp whenever it has been written after being actively observed via getattr. tmpfs only requires the FS_MGTIME flag. 
Reviewed-by: Josef Bacik
Reviewed-by: Jan Kara
Signed-off-by: Jeff Layton
---
 mm/shmem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 5a77acf6ac6a..5f17eaaa32e2 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -4804,7 +4804,7 @@ static struct file_system_type shmem_fs_type = {
 	.parameters = shmem_fs_parameters,
 #endif
 	.kill_sb = kill_litter_super,
-	.fs_flags = FS_USERNS_MOUNT | FS_ALLOW_IDMAP,
+	.fs_flags = FS_USERNS_MOUNT | FS_ALLOW_IDMAP | FS_MGTIME,
 };
 
 void __init shmem_init(void)
-- 
2.46.0