From nobody Thu Nov 28 07:42:13 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A339E2178EE; Wed, 2 Oct 2024 21:27:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904454; cv=none; b=KB5ppNSf9Pk+SjliNwYA7AFyKovriwbRHe2s0O6qtMVOmHQuhxW8ecTZ/6abEHcqw/+9KdecNoP6g8sXUTY1jLlE2+alciTgrslWWnMYtrIi+iVJCZRh6D3HJ5fhYqhUsz93KhOpYFibPZ5r+Vpj6XOPW+3G1vEmT0g9u+s33Ok= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904454; c=relaxed/simple; bh=2+VLMo7X+TKT3T321kqTgs4zMRtNzRRVuC4rj6DW+Cc=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=V2pZYl48i6DYMlSF/7y17COGt5QUQ2VO6Omsb819X7eZPBpaK5e7jaACbb35ujTwI4TJLM1HTQKxmGVjR4O/Xw6rpYIRTWRlfzmI9+RwlYjl0pCYVsxXPR7EIuKL9v0IaAI7ol8OD4SrF71lU9sbp4d/mm7vcMB6tFeWglhLHgA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ZsBmIDPP; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ZsBmIDPP" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 93A61C4CECE; Wed, 2 Oct 2024 21:27:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727904454; bh=2+VLMo7X+TKT3T321kqTgs4zMRtNzRRVuC4rj6DW+Cc=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=ZsBmIDPP/TPJIvUs0H0Bq+7XEPNLJ1InQNnB4epFjmVdguwWioPMQS0ex58J6kYmH 76Fs+mTjp0aloR+UbceVmYtsF/mHxicbc7SCRZDtvaZJKzoQby83Om+uyM41nvZm2U nVrZ+05fYngNpI6waazdFuGMsEZt3UJicfAl2I2GxhQTquhWAMA/wnGofhAAU3a5VR 
1g4qISpmbhN2ycyH0JhDtn5sH+Ti4dfyVNQe5zHxioJaSlWTFWoFqTKFqhd+njgFgm RZb3hLGlthcy1Pc1qAypf3RxvQisdqLX8i9f050vMPUtK5g1OUzpYfBiDFiACH3HJp oLKTfZgy41pvg== From: Jeff Layton Date: Wed, 02 Oct 2024 17:27:16 -0400 Subject: [PATCH v10 01/12] timekeeping: add interfaces for handling timestamps with a floor value Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20241002-mgtime-v10-1-d1c4717f5284@kernel.org> References: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> In-Reply-To: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Randy Dunlap , Chandan Babu R , "Darrick J. Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=openpgp-sha256; l=7844; i=jlayton@kernel.org; h=from:subject:message-id; bh=2+VLMo7X+TKT3T321kqTgs4zMRtNzRRVuC4rj6DW+Cc=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm/bq/pOISz++DqIaPinjQXm8r2ux+FEodDWTqw 9MkXKbpGnyJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZv26vwAKCRAADmhBGVaC FSNmEAC/ux5B3D/H1d8eVc06UwpsjbzW9qWVQ+EdDb83LSfoT1OyhHOwvJc9orbEruZKyeuVp5F DIwOsP4z9ow//lDNM8u2BapdjY393P5MOpVBh8mYwPPpHGuTvLCcTdWKBciJC85QokG6fiWpGUh CgGtAIuTsvAM2qrN0FeWEslTcTg/XMSEC3FypKoEM2BTa5h2/FhZhsaOk6w/0p5eev9vVvn1uoQ l/YtTg8huFDZeYxEeZp5Yl3nzZGFAbl7m0XHrqGPcnOk1vErmfjY6Nbo+DBNTc+/diAsTwaQNf4 
I9IaEwbowLv5z3cEqCsUdxxEtMDvIWyDMSDphlLjhWG07WBhsneSlDyziwWQrZUl6BlewfwbfoV +ZctmC9Xo/48Oah8yVfgW/yhhXvIEBAzg0f++xLKNC6N1ou0jAsN+IWOW9Tuq7/nZKCDDgKSJEU hm4hfXDa9J6EbHhOIURnF3b6jIkhHcHDoB7kSkGoKbMrLWDOa+Cjl4m/eeLALFMnjT4vIaP8XJf x8u/MX+O7HX4faRyBPIU5WUP02BAUGV9gPBpt1XUJaluSUYv7Zh57l6X1a1nijk7nYm2Thwogan 8kG5S56vOv6LQxQEsxRUL5GSFNDQ/glnMJnA2IOXQDXq55O0JYfn/iKCpgCK5WJ+l0bEVSqyCXk amyBDzLFVkZwj3w==
X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215

Multigrain timestamps allow the kernel to use fine-grained timestamps
when an inode's attributes are being actively observed via ->getattr().
With this support, it's possible for a file to get a fine-grained
timestamp, and another modified after it to get a coarse-grained stamp
that is earlier than the fine-grained time. If this happens then the
files can appear to have been modified in reverse order, which breaks
VFS ordering guarantees [1].

To prevent this, maintain a floor value for multigrain timestamps.
Whenever a fine-grained timestamp is handed out, record it, and when
later coarse-grained stamps are handed out, ensure they are not earlier
than that value. If the coarse-grained timestamp is earlier than the
fine-grained floor, return the floor value instead.

Add a static singleton atomic64_t into timekeeping.c that is used to
keep track of the latest fine-grained time ever handed out. This is
tracked as a monotonic ktime_t value to ensure that it isn't affected by
clock jumps. Because it is updated at different times than the rest of
the timekeeper object, the floor value is managed independently of the
timekeeper via a cmpxchg() operation, and sits on its own cacheline.

Add two new public interfaces:

- ktime_get_coarse_real_ts64_mg() fills a timespec64 with the later of
  the coarse-grained clock and the floor time

- ktime_get_real_ts64_mg() gets the fine-grained clock value, and tries
  to swap it into the floor. A timespec64 is filled with the result.
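
To make the floor handling concrete, here is a minimal userspace sketch
of the two interfaces described above. It is not the kernel code: the
names (model_floor, model_get_coarse, model_get_fine) are hypothetical,
C11 atomics stand in for atomic64_t, plain int64_t nanosecond counts
stand in for ktime_t, and both the seqcount loop and the
monotonic-to-realtime offset are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model: latest fine-grained monotonic time handed out. */
static _Atomic int64_t model_floor;

/* Coarse path: return the later of the coarse clock and the floor. */
static int64_t model_get_coarse(int64_t coarse_mono)
{
	int64_t floor = atomic_load(&model_floor);

	return floor > coarse_mono ? floor : coarse_mono;
}

/*
 * Fine path: make one attempt to publish fine_mono as the new floor.
 * seen_floor is the floor value the caller read before sampling the
 * clock. There is no retry loop: on failure, "old" is overwritten with
 * the floor that another caller installed, and that value is returned.
 */
static int64_t model_get_fine(int64_t seen_floor, int64_t fine_mono)
{
	int64_t old = seen_floor;

	if (atomic_compare_exchange_strong(&model_floor, &old, fine_mono))
		return fine_mono;
	return old;
}
```

In this model, passing seen_floor explicitly is only a convenience so
that a lost race can be reproduced in a single thread.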

The floor value is global and updated via a single try_cmpxchg(). If
that fails then the operation raced with a concurrent update. Any
concurrent update must be later than the existing floor value, so any
racing tasks can accept any resulting floor value without retrying.

[1]: POSIX requires that files be stamped with realtime clock values,
     and makes no provision for dealing with backward clock jumps. If a
     backward realtime clock jump occurs, then files can appear to have
     been modified in reverse order.

Tested-by: Randy Dunlap # documentation bits
Acked-by: John Stultz
Signed-off-by: Jeff Layton
---
 include/linux/timekeeping.h |   4 ++
 kernel/time/timekeeping.c   | 105 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 109 insertions(+)

diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index fc12a9ba2c884271a75608211a72173b7ebaa24c..7aa85246c183576b039c02af4abba02b4a09ef9d 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -45,6 +45,10 @@ extern void ktime_get_real_ts64(struct timespec64 *tv);
 extern void ktime_get_coarse_ts64(struct timespec64 *ts);
 extern void ktime_get_coarse_real_ts64(struct timespec64 *ts);
 
+/* Multigrain timestamp interfaces */
+extern void ktime_get_coarse_real_ts64_mg(struct timespec64 *ts);
+extern void ktime_get_real_ts64_mg(struct timespec64 *ts);
+
 void getboottime64(struct timespec64 *ts);
 
 /*
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 5391e4167d60226dfc48c845170e36bcbeb7b292..ebfe846ebde35850c3e4d9c2cc45642c983d137f 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -114,6 +114,24 @@ static struct tk_fast tk_fast_raw ____cacheline_aligned = {
 	.base[1] = FAST_TK_INIT,
 };
 
+/*
+ * Multigrain timestamps require tracking the latest fine-grained timestamp
+ * that has been issued, and never returning a coarse-grained timestamp that is
+ * earlier than that value.
+ *
+ * mg_floor represents the latest fine-grained time that has been handed out as
+ * a file timestamp on the system. This is tracked as a monotonic ktime_t, and
+ * converted to a realtime clock value on an as-needed basis.
+ *
+ * Maintaining mg_floor ensures the multigrain interfaces never issue a
+ * timestamp earlier than one that has been previously issued.
+ *
+ * The exception to this rule is when there is a backward realtime clock jump. If
+ * such an event occurs, a timestamp can appear to be earlier than a previous one.
+ */
+
+static __cacheline_aligned_in_smp atomic64_t mg_floor;
+
 static inline void tk_normalize_xtime(struct timekeeper *tk)
 {
 	while (tk->tkr_mono.xtime_nsec >= ((u64)NSEC_PER_SEC << tk->tkr_mono.shift)) {
@@ -2394,6 +2412,93 @@ void ktime_get_coarse_real_ts64(struct timespec64 *ts)
 }
 EXPORT_SYMBOL(ktime_get_coarse_real_ts64);
 
+/**
+ * ktime_get_coarse_real_ts64_mg - return later of coarse-grained time or floor
+ * @ts:		timespec64 to be filled
+ *
+ * Fetch the global mg_floor value, convert it to realtime and
+ * compare it to the current coarse-grained time. Fill @ts with
+ * whichever is later. Note that this is a filesystem-specific
+ * interface and should be avoided outside of that context.
+ */
+void ktime_get_coarse_real_ts64_mg(struct timespec64 *ts)
+{
+	struct timekeeper *tk = &tk_core.timekeeper;
+	u64 floor = atomic64_read(&mg_floor);
+	ktime_t f_real, offset, coarse;
+	unsigned int seq;
+
+	do {
+		seq = read_seqcount_begin(&tk_core.seq);
+		*ts = tk_xtime(tk);
+		offset = tk_core.timekeeper.offs_real;
+	} while (read_seqcount_retry(&tk_core.seq, seq));
+
+	coarse = timespec64_to_ktime(*ts);
+	f_real = ktime_add(floor, offset);
+	if (ktime_after(f_real, coarse))
+		*ts = ktime_to_timespec64(f_real);
+}
+
+/**
+ * ktime_get_real_ts64_mg - attempt to update floor value and return result
+ * @ts:		pointer to the timespec to be set
+ *
+ * Get a monotonic fine-grained time value and attempt to swap it into
+ * mg_floor. If that succeeds then accept the new floor value. If it fails
+ * then another task raced in during the interim time and updated the floor.
+ * Since any update to the floor must be later than the previous floor,
+ * either outcome is acceptable.
+ *
+ * Typically this will be called after calling ktime_get_coarse_real_ts64_mg(),
+ * and determining that the resulting coarse-grained timestamp did not effect
+ * a change in the ctime. Any more recent floor value would effect a change to
+ * the ctime, so there is no need to retry the atomic64_try_cmpxchg() on failure.
+ *
+ * @ts will be filled with the latest floor value, regardless of the outcome of
+ * the cmpxchg. Note that this is a filesystem specific interface and should be
+ * avoided outside of that context.
+ */
+void ktime_get_real_ts64_mg(struct timespec64 *ts)
+{
+	struct timekeeper *tk = &tk_core.timekeeper;
+	ktime_t old = atomic64_read(&mg_floor);
+	ktime_t offset, mono;
+	unsigned int seq;
+	u64 nsecs;
+
+	do {
+		seq = read_seqcount_begin(&tk_core.seq);
+
+		ts->tv_sec = tk->xtime_sec;
+		mono = tk->tkr_mono.base;
+		nsecs = timekeeping_get_ns(&tk->tkr_mono);
+		offset = tk_core.timekeeper.offs_real;
+	} while (read_seqcount_retry(&tk_core.seq, seq));
+
+	mono = ktime_add_ns(mono, nsecs);
+
+	/*
+	 * Attempt to update the floor with the new time value. As any
+	 * update must be later than the existing floor, and would effect
+	 * a change to the ctime from the perspective of the current task,
+	 * accept the resulting floor value regardless of the outcome of
+	 * the swap.
+	 */
+	if (atomic64_try_cmpxchg(&mg_floor, &old, mono)) {
+		ts->tv_nsec = 0;
+		timespec64_add_ns(ts, nsecs);
+	} else {
+		/*
+		 * Another task changed mg_floor since "old" was fetched.
+		 * "old" has been updated with the latest value of "mg_floor".
+		 * That value is newer than the previous floor value, which
+		 * is enough to effect a change to the ctime. Accept it.
+		 */
+		*ts = ktime_to_timespec64(ktime_add(old, offset));
+	}
+}
+
 void ktime_get_coarse_ts64(struct timespec64 *ts)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
-- 
2.46.2

From nobody Thu Nov 28 07:42:13 2024
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 455A621949F; Wed, 2 Oct 2024 21:27:36 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904457; cv=none; b=gp79FopKyihvvWMGDgBzSsB3sxzCnHG5ffkLCcRq6SyldOjyNBdEt1x+QGdh6WruWGnF6zytcLTDVgxZO+XfOhKZ7zw1aximmKLpU0PGkLsnQr7OxeIj5lZcsqGp/MczEJLl5qVEPjiMYAcSIkLq0gKmRuXUYROZTKnArzEltoQ=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904457; c=relaxed/simple; bh=b4XkKth37w2ravUWOahimfnL3UOfVbEBD3lD+3rp+1s=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=sM8cNeimZ6CgS18RXKmY6Jy0PbJpKTF+bO/Psl05blLIX2xH5oCAW0jqHxoTXMN909tmpguKX/aVIM01nrYFjntoRYPr56sJRQUeLUOYjHBJfYdVmG7kra7jj2dwBlJizsFff6d7pIo8h5N31zyMtHLK1pz+vwWtSNrrBZW42js=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=G3mMkbdg; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="G3mMkbdg"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 55601C4CED1; Wed, 2 Oct 2024 21:27:34 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727904456; bh=b4XkKth37w2ravUWOahimfnL3UOfVbEBD3lD+3rp+1s=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=G3mMkbdg0nkDEyn6L/6plWzGy8rzM311X3HHbMz6R3Ai1VYHlupggE7MRYdxecBcW
Xvk6ll6KZR/g9zl7uCaka+nNH7x9Tm/RIkT5vO6uSHNqrUjFrtM95g8heH5M+lCE55 F8sgy2iJ9fYi2MZGAAxh7tC2DGiC+V5GkjwU7zP25bmpQ/aYrgzhRDQUPnIsshqeKy O7Jkn3FCOz/3z+OTJtQd0O6UxWljUiDGrmddv3D/etWQiDuDYhcmmN0fD6Ll8pFgyi DBGz2Fo7ZjiuVE0X7WaIHym/aM5Pjo+l4xsKTeBPi+9qboJDAz98MsoDdHe55s5REo dVuTX5zzfKQ1A== From: Jeff Layton Date: Wed, 02 Oct 2024 17:27:17 -0400 Subject: [PATCH v10 02/12] timekeeping: add percpu counter for tracking floor swap events Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20241002-mgtime-v10-2-d1c4717f5284@kernel.org> References: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> In-Reply-To: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Randy Dunlap , Chandan Babu R , "Darrick J. 
Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=openpgp-sha256; l=3634; i=jlayton@kernel.org; h=from:subject:message-id; bh=b4XkKth37w2ravUWOahimfnL3UOfVbEBD3lD+3rp+1s=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm/bq/KG7g3gRmiuGWZ6tpbobqHKVwsPLvCl1+2 PakkzoH/7uJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZv26vwAKCRAADmhBGVaC FZRZD/4yysdw6oqodqdrDqoDwQHs/Q4CF/xQT2tRHqA4QItHNsZTxwidAffBRsnGrGPeTIgeMCM eYGyGB9p4u3XxLEb8dhRFOW1RBgcQ/Cjr8cpZH6SZLYpRAk6GIcclb7JSx0Z+ZsLsb3raxWip/q aQkME57KDFHa5SU4k6wWAI43OaKhloW9psmHOc+VuX08saDovMS29GzzdVVAjB04fD6FVnJ0bOw LhNaYNNOKXmuoYPbwXaeS79CaJBEsoC4hSLmofWpFHtXZSM0eMXenbt01hkg0YPIERLodPld9rp iR7KGQ74j0CMmpjZaG/iu8mh33FVzNhkaSdq0w4OuN8UNcCDibrHS5/mOt4wdm3uhMR74E6vON+ 0hkVua/1WjZWCbWA9MiE7S8g3sORCoVEP2BIlNBS3fMnIDVCW9YjyntS4bsTRJnYK5cVmuROgko 7sLr656frMTXl0fy01yglfhU+MXQcqqrmxOiXyRELla8YEX+AKon8oWhvegEV/KJV2LWq3sc2PJ TaaxJQMW9FMLYqn5/ddC7uMEmiMU4fHMlL7JHWzMIzBCVTUE4F3nzpMWHemwN+a44HbOhkzhyQc 0gsIzOEweoF4Kk/yBwXoIpBJneQyP9cglaRlMWlyaD4SLndBdD4EgyTrBtFMAYZncjdhhZwr8Uz ptEgKiXkkzgo4IQ== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 The mgtime_floor value is a global variable for tracking the latest fine-grained timestamp handed out. Because it's a global, track the number of times that a new floor value is assigned. Add a new percpu counter to the timekeeping code to track the number of floor swap events that have occurred. A later patch will add a debugfs file to display this counter alongside other stats involving multigrain timestamps. 
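
As a rough illustration of the counting scheme, the sketch below models
a per-CPU counter in plain userspace C. It is an assumption-laden
stand-in, not the kernel implementation: MODEL_NR_CPUS and the model_*
names are hypothetical, a fixed array replaces DEFINE_PER_CPU(), and the
caller passes the CPU index explicitly instead of relying on
this_cpu_inc().

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4	/* hypothetical fixed CPU count */

/* One slot per CPU, so an increment never contends with other CPUs. */
static unsigned long model_floor_swaps[MODEL_NR_CPUS];

/* Writer side: bump only the slot that belongs to the given CPU. */
static void model_inc_floor_swaps(size_t cpu)
{
	model_floor_swaps[cpu]++;
}

/* Reader side: walk every slot and add them up. */
static unsigned long model_get_floor_swaps(void)
{
	unsigned long sum = 0;

	for (size_t cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += model_floor_swaps[cpu];

	return sum;
}
```

The total is a statistic rather than a synchronized value, so a reader
that runs concurrently with writers may see a slightly stale sum.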

Tested-by: Randy Dunlap # documentation bits
Signed-off-by: Jeff Layton
---
 include/linux/timekeeping.h        |  1 +
 kernel/time/timekeeping.c          |  1 +
 kernel/time/timekeeping_debug.c    | 14 ++++++++++++++
 kernel/time/timekeeping_internal.h | 15 +++++++++++++++
 4 files changed, 31 insertions(+)

diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index 7aa85246c183576b039c02af4abba02b4a09ef9d..84a035e86ac811f9e7b1649246b71c9296519149 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -48,6 +48,7 @@ extern void ktime_get_coarse_real_ts64(struct timespec64 *ts);
 /* Multigrain timestamp interfaces */
 extern void ktime_get_coarse_real_ts64_mg(struct timespec64 *ts);
 extern void ktime_get_real_ts64_mg(struct timespec64 *ts);
+extern unsigned long timekeeping_get_mg_floor_swaps(void);
 
 void getboottime64(struct timespec64 *ts);
 
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index ebfe846ebde35850c3e4d9c2cc45642c983d137f..e8b713e8ce5553f9e7de96c8e7c089714e0aa7a4 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -2488,6 +2488,7 @@ void ktime_get_real_ts64_mg(struct timespec64 *ts)
 	if (atomic64_try_cmpxchg(&mg_floor, &old, mono)) {
 		ts->tv_nsec = 0;
 		timespec64_add_ns(ts, nsecs);
+		timekeeping_inc_mg_floor_swaps();
 	} else {
 		/*
 		 * Another task changed mg_floor since "old" was fetched.
diff --git a/kernel/time/timekeeping_debug.c b/kernel/time/timekeeping_debug.c
index b73e8850e58d9c5b291559f475e67c7ed47c2db3..36d359cad7ca1d821bf42f59b3e50f89b14afd40 100644
--- a/kernel/time/timekeeping_debug.c
+++ b/kernel/time/timekeeping_debug.c
@@ -17,6 +17,9 @@
 
 #define NUM_BINS 32
 
+/* incremented every time mg_floor is updated */
+DEFINE_PER_CPU(unsigned long, timekeeping_mg_floor_swaps);
+
 static unsigned int sleep_time_bin[NUM_BINS] = {0};
 
 static int tk_debug_sleep_time_show(struct seq_file *s, void *data)
@@ -53,3 +56,14 @@ void tk_debug_account_sleep_time(const struct timespec64 *t)
 			   (s64)t->tv_sec, t->tv_nsec / NSEC_PER_MSEC);
 }
 
+unsigned long timekeeping_get_mg_floor_swaps(void)
+{
+	unsigned long sum = 0;
+	int cpu;
+
+	for_each_possible_cpu(cpu)
+		sum += data_race(per_cpu(timekeeping_mg_floor_swaps, cpu));
+
+	return sum;
+}
+
diff --git a/kernel/time/timekeeping_internal.h b/kernel/time/timekeeping_internal.h
index 4ca2787d1642e2f52bf985607ca3b03785cf9a50..0bbae825bc0226e4eed64e73fe3b454986c7573f 100644
--- a/kernel/time/timekeeping_internal.h
+++ b/kernel/time/timekeeping_internal.h
@@ -10,9 +10,24 @@
  * timekeeping debug functions
  */
 #ifdef CONFIG_DEBUG_FS
+
+DECLARE_PER_CPU(unsigned long, timekeeping_mg_floor_swaps);
+
+static inline void timekeeping_inc_mg_floor_swaps(void)
+{
+	this_cpu_inc(timekeeping_mg_floor_swaps);
+}
+
 extern void tk_debug_account_sleep_time(const struct timespec64 *t);
+
 #else
+
 #define tk_debug_account_sleep_time(x)
+
+static inline void timekeeping_inc_mg_floor_swaps(void)
+{
+}
+
 #endif
 
 #ifdef CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE
-- 
2.46.2

From nobody Thu Nov 28 07:42:13 2024
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 19DDE21B445; Wed, 2 Oct 2024 21:27:39 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904460; cv=none; b=D6ra1BX/XvAQ4j5UvtS2CbLDwCVgWdGJt+lrDcJ4RP3UgEUkVWisZZux+g4+Ip1/Fqj7Z4+8Ru8+IvjyZfl28C7L/GRC0wiwFcVOML0TsHxtdbofVtdDab8pZ9ZuYYkx+3r+fa4ryvMxm8FLbsodHLSE9IDB+SZEIg7AVFYqlwI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904460; c=relaxed/simple; bh=3KJ68ZBqqX9TlQLj4uVbA/EOzRVQe5kX+liPSYeavbQ=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=azKa9Ew43LvmtIBeed+F87VsQAbSJlUmMTLHWxBDcFft2yU1xV1bxVKXsJW4bIUzrM5YiEsmRFovf1T4VHo5CELvzGD5b4vADEY6g/i81ReEUitoR75wo1Trz+amd/kYUqV9pZQ1Uu6lvRZqN79C7r7z0uyJNqR73dEKJa7sWeA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Hu21SUfo; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Hu21SUfo" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1A4FEC4CECE; Wed, 2 Oct 2024 21:27:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727904459; bh=3KJ68ZBqqX9TlQLj4uVbA/EOzRVQe5kX+liPSYeavbQ=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=Hu21SUfoTS2GaO+kzy1uL8z5jpbOBYPaWlfLvW/XQPfYnzfjz4B7WM+GnZPvdRMqk LvNKxNlUs9Wgog+kgdiEBWH0Rxc5vx9T2O+3/fNWH8y4Xa5wBAdSajN/FE0bBVL0h/ PBd/7wRqEDLz2Td3F3nTXJNU/FJrIuRbZBFW+GDjWDcIyDaBGMseG6XOCu1SNnkRbf xfpoAtT9n8CMnEBEdKZWwIRJ93PY9Os2xo0dIaBjQmKI3t6dGS3MGNzjY8B8wVt0UF NbUq/mUb1qqlD7hLpJa+1MepawXjFF3Rhog5b4oDKIBfns7zypa3opjjocWqgZvUwQ ysyKprM7orOqA== From: Jeff Layton Date: Wed, 02 Oct 2024 17:27:18 -0400 Subject: [PATCH v10 03/12] fs: add infrastructure for multigrain timestamps Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: 
MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20241002-mgtime-v10-3-d1c4717f5284@kernel.org> References: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> In-Reply-To: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Randy Dunlap , Chandan Babu R , "Darrick J. Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=openpgp-sha256; l=13738; i=jlayton@kernel.org; h=from:subject:message-id; bh=3KJ68ZBqqX9TlQLj4uVbA/EOzRVQe5kX+liPSYeavbQ=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm/bq/NjtL0rdcXthLZsilQDOdal1o8rmwKEyJn 3seT+Yxz3eJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZv26vwAKCRAADmhBGVaC FeNAD/9gQnAjiTBHt4+Bt/faTCj5awekmmUuH1cUP28lumSOCmznEqw2QkPS0aio4bwe/m+gGOr fcTFvV/qbBBF19RtKklrv75Lo6G/XtIo+j1UzPV5i0cdXds1VPWvRRsKZeFs1A/4EiZfnmEcgzy BrxBh8Ef/0UD+ffLRWbCmXt4/+aQPTZreC8uquY9NuIVSKxVWSpc0aTOquW+V54dCmWZnf25+US 7jXHPqiYkoOxq7Mu4V2iOQCoIB3znzw5V/phPvsaXYfCQ8r4HPYwcXquH9exXLsafE3p+XtsoiD OD4p69FPFcDee9q7xl5vIu+gaTrcOxmc6ztELOx9Pv0fio84vSTuZaWalJG8TXy5zBw9RMhAAZW tqo+r9sdIvVB0wOIGpaHfLxHtKVcfZpxsqRuFl5wcyag4EMhjtLtyFRmkejoIjSXIc9OJoRnsd+ hflte/orHfFVkEDHIBqOSP5M89FQcCazVMopiqstK/Ir3U3403Cb197B2xeeXTppyUu4WL16cZB b3lfGh7ptJ3ncz6OraBFe3Sy4DIBuFShzJ2p7/UUeVf+E9t5JYyW/v12SCtWXMev4pwCu0Rk3rc artghJ6zBnGS6qNHJPX5wgqislqFiebmwNh2Ydgd+R1dRwW/yokekI08Ka4cTBEAMNQFFIgHMxw 9meaE8/e0t0Fl0A== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; 
fpr=4BC0D7B24471B2A184EAF5D3000E684119568215

The VFS has always used coarse-grained timestamps when updating the
ctime and mtime after a change. This has the benefit of allowing
filesystems to optimize away a lot of metadata updates, down to around 1
per jiffy, even when a file is under heavy writes.

Unfortunately, this has always been an issue when we're exporting via
NFSv3, which relies on timestamps to validate caches. A lot of changes
can happen in a jiffy, so timestamps aren't sufficient to help the
client decide when to invalidate the cache.

Even with NFSv4, a lot of exported filesystems don't properly support a
change attribute and are subject to the same problems with timestamp
granularity. Other applications have similar issues with timestamps
(e.g. backup applications).

If fine-grained timestamps were always used, that would improve the
situation, but that becomes rather expensive, as the underlying
filesystem would have to log a lot more metadata updates.

What is needed is a way to only use fine-grained timestamps when they
are being actively queried. Use the (unused) top bit in
inode->i_ctime_nsec as a flag that indicates whether the current
timestamps have been queried via stat() or the like. When it's set,
allow the update to use a fine-grained timestamp iff it's necessary to
make the ctime show a different value.

If it has been queried, then first see whether the current coarse time
is later than the existing ctime. If it is, accept that value. If it
isn't, then get a fine-grained timestamp and attempt to stamp the inode
ctime with that value. If that races with another concurrent stamp, then
abandon the update and take the new value without retrying.

Filesystems can opt into this by setting the FS_MGTIME fstype flag.
Others should be unaffected (other than being subject to the same floor
value as multigrain filesystems).
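
The flag handling described above can be sketched in userspace C as
follows. This is only a model under stated assumptions: struct
model_inode, MODEL_QUERIED and the model_* helpers are hypothetical
names, C11 atomics replace the kernel primitives, and the final
try_cmpxchg() that stamps the inode is not modelled. It shows how a
stat()-like read marks the ctime as observed, and how the update side
decides whether the coarse time is enough.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Top bit of the nsec word; valid nanoseconds (< 1e9) never reach it. */
#define MODEL_QUERIED (UINT32_C(1) << 31)

struct model_inode {
	int64_t ctime_sec;
	_Atomic uint32_t ctime_nsec;	/* nanoseconds plus the QUERIED flag */
};

/* stat()-like read: report the nsec value and flag it as observed. */
static uint32_t model_query_nsec(struct model_inode *inode)
{
	uint32_t cns = atomic_fetch_or(&inode->ctime_nsec, MODEL_QUERIED);

	return cns & ~MODEL_QUERIED;
}

/*
 * Update-side decision: a fine-grained stamp is wanted only if the
 * ctime has been observed and the coarse time is not later than it.
 */
static bool model_wants_fine(struct model_inode *inode,
			     int64_t coarse_sec, uint32_t coarse_nsec)
{
	uint32_t cns = atomic_load(&inode->ctime_nsec);
	uint32_t nsec = cns & ~MODEL_QUERIED;

	if (!(cns & MODEL_QUERIED))
		return false;
	if (coarse_sec != inode->ctime_sec)
		return coarse_sec < inode->ctime_sec;
	return coarse_nsec <= nsec;
}
```

Keeping the flag in the same word as the nanoseconds lets a single
atomic operation read the value and set the flag together.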

Tested-by: Randy Dunlap # documentation bits
Reviewed-by: Jan Kara
Signed-off-by: Jeff Layton
---
 fs/inode.c         | 139 +++++++++++++++++++++++++++++++++++++++++++----------
 fs/stat.c          |  43 ++++++++++++++++-
 include/linux/fs.h |  34 ++++++++++---
 3 files changed, 181 insertions(+), 35 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 10c4619faeef8cb81d84a91ec2d982d5a1a51a5c..53f56f6e1ff26e718080211880924f37cf0e5b3c 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -2172,19 +2172,58 @@ int file_remove_privs(struct file *file)
 }
 EXPORT_SYMBOL(file_remove_privs);
 
+/**
+ * current_time - Return FS time (possibly fine-grained)
+ * @inode: inode.
+ *
+ * Return the current time truncated to the time granularity supported by
+ * the fs, as suitable for a ctime/mtime change. If the ctime is flagged
+ * as having been QUERIED, get a fine-grained timestamp, but don't update
+ * the floor.
+ *
+ * For a multigrain inode, this is effectively an estimate of the timestamp
+ * that a file would receive. An actual update must go through
+ * inode_set_ctime_current().
+ */
+struct timespec64 current_time(struct inode *inode)
+{
+	struct timespec64 now;
+	u32 cns;
+
+	ktime_get_coarse_real_ts64_mg(&now);
+
+	if (!is_mgtime(inode))
+		goto out;
+
+	/* If nothing has queried it, then coarse time is fine */
+	cns = smp_load_acquire(&inode->i_ctime_nsec);
+	if (cns & I_CTIME_QUERIED) {
+		/*
+		 * If there is no apparent change, then get a fine-grained
+		 * timestamp.
+		 */
+		if (now.tv_nsec == (cns & ~I_CTIME_QUERIED))
+			ktime_get_real_ts64(&now);
+	}
+out:
+	return timestamp_truncate(now, inode);
+}
+EXPORT_SYMBOL(current_time);
+
 static int inode_needs_update_time(struct inode *inode)
 {
+	struct timespec64 now, ts;
 	int sync_it = 0;
-	struct timespec64 now = current_time(inode);
-	struct timespec64 ts;
 
 	/* First try to exhaust all avenues to not sync */
 	if (IS_NOCMTIME(inode))
 		return 0;
 
+	now = current_time(inode);
+
 	ts = inode_get_mtime(inode);
 	if (!timespec64_equal(&ts, &now))
-		sync_it = S_MTIME;
+		sync_it |= S_MTIME;
 
 	ts = inode_get_ctime(inode);
 	if (!timespec64_equal(&ts, &now))
@@ -2562,6 +2601,15 @@ void inode_nohighmem(struct inode *inode)
 }
 EXPORT_SYMBOL(inode_nohighmem);
 
+struct timespec64 inode_set_ctime_to_ts(struct inode *inode, struct timespec64 ts)
+{
+	set_normalized_timespec64(&ts, ts.tv_sec, ts.tv_nsec);
+	inode->i_ctime_sec = ts.tv_sec;
+	inode->i_ctime_nsec = ts.tv_nsec;
+	return ts;
+}
+EXPORT_SYMBOL(inode_set_ctime_to_ts);
+
 /**
  * timestamp_truncate - Truncate timespec to a granularity
  * @t: Timespec
@@ -2594,36 +2642,77 @@ struct timespec64 timestamp_truncate(struct timespec64 t, struct inode *inode)
 EXPORT_SYMBOL(timestamp_truncate);
 
 /**
- * current_time - Return FS time
- * @inode: inode.
+ * inode_set_ctime_current - set the ctime to current_time
+ * @inode: inode
  *
- * Return the current time truncated to the time granularity supported by
- * the fs.
+ * Set the inode's ctime to the current value for the inode. Returns the
+ * current value that was assigned. If this is not a multigrain inode, then we
+ * set it to the later of the coarse time and floor value.
  *
- * Note that inode and inode->sb cannot be NULL.
- * Otherwise, the function warns and returns time without truncation.
+ * If it is multigrain, then we first see if the coarse-grained timestamp is
+ * distinct from what is already there. If so, then use that. Otherwise, get a
+ * fine-grained timestamp.
+ *
+ * After that, try to swap the new value into i_ctime_nsec. Accept the
+ * resulting ctime, regardless of the outcome of the swap. If it has
+ * already been replaced, then that timestamp is later than the earlier
+ * unacceptable one, and is thus acceptable.
  */
-struct timespec64 current_time(struct inode *inode)
+struct timespec64 inode_set_ctime_current(struct inode *inode)
 {
 	struct timespec64 now;
+	u32 cns, cur;
 
-	ktime_get_coarse_real_ts64(&now);
-	return timestamp_truncate(now, inode);
-}
-EXPORT_SYMBOL(current_time);
+	ktime_get_coarse_real_ts64_mg(&now);
+	now = timestamp_truncate(now, inode);
 
-/**
- * inode_set_ctime_current - set the ctime to current_time
- * @inode: inode
- *
- * Set the inode->i_ctime to the current value for the inode. Returns
- * the current value that was assigned to i_ctime.
- */
-struct timespec64 inode_set_ctime_current(struct inode *inode)
-{
-	struct timespec64 now = current_time(inode);
+	/* Just return that if this is not a multigrain fs */
+	if (!is_mgtime(inode)) {
+		inode_set_ctime_to_ts(inode, now);
+		goto out;
+	}
 
-	inode_set_ctime_to_ts(inode, now);
+	/*
+	 * A fine-grained time is only needed if someone has queried
+	 * for timestamps, and the current coarse grained time isn't
+	 * later than what's already there.
+	 */
+	cns = smp_load_acquire(&inode->i_ctime_nsec);
+	if (cns & I_CTIME_QUERIED) {
+		struct timespec64 ctime = { .tv_sec = inode->i_ctime_sec,
+					    .tv_nsec = cns & ~I_CTIME_QUERIED };
+
+		if (timespec64_compare(&now, &ctime) <= 0) {
+			ktime_get_real_ts64_mg(&now);
+			now = timestamp_truncate(now, inode);
+		}
+	}
+
+	/* No need to cmpxchg if it's exactly the same */
+	if (cns == now.tv_nsec && inode->i_ctime_sec == now.tv_sec)
+		goto out;
+	cur = cns;
+retry:
+	/* Try to swap the nsec value into place. */
+	if (try_cmpxchg(&inode->i_ctime_nsec, &cur, now.tv_nsec)) {
+		/* If swap occurred, then we're (mostly) done */
+		inode->i_ctime_sec = now.tv_sec;
+	} else {
+		/*
+		 * Was the change due to someone marking the old ctime QUERIED?
+		 * If so then retry the swap. This can only happen once since
+		 * the only way to clear I_CTIME_QUERIED is to stamp the inode
+		 * with a new ctime.
+		 */
+		if (!(cns & I_CTIME_QUERIED) && (cns | I_CTIME_QUERIED) == cur) {
+			cns = cur;
+			goto retry;
+		}
+		/* Otherwise, keep the existing ctime */
+		now.tv_sec = inode->i_ctime_sec;
+		now.tv_nsec = cur & ~I_CTIME_QUERIED;
+	}
+out:
 	return now;
 }
 EXPORT_SYMBOL(inode_set_ctime_current);
diff --git a/fs/stat.c b/fs/stat.c
index 89ce1be563108c1bc0ecabaff5b277258eb6c398..dd480bf51a2a764e5eb1d0a213c5ec8b640db911 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -26,6 +26,39 @@
 #include "internal.h"
 #include "mount.h"
 
+/**
+ * fill_mg_cmtime - Fill in the mtime and ctime and flag ctime as QUERIED
+ * @stat: where to store the resulting values
+ * @request_mask: STATX_* values requested
+ * @inode: inode from which to grab the c/mtime
+ *
+ * Given @inode, grab the ctime and mtime out of it and store the result
+ * in @stat. When fetching the value, flag it as QUERIED (if not already)
+ * so the next write will record a distinct timestamp.
+ *
+ * NB: The QUERIED flag is tracked in the ctime, but we set it there even
+ * if only the mtime was requested, as that ensures that the next mtime
+ * change will be distinct.
+ */ +void fill_mg_cmtime(struct kstat *stat, u32 request_mask, struct inode *in= ode) +{ + atomic_t *pcn =3D (atomic_t *)&inode->i_ctime_nsec; + + /* If neither time was requested, then don't report them */ + if (!(request_mask & (STATX_CTIME|STATX_MTIME))) { + stat->result_mask &=3D ~(STATX_CTIME|STATX_MTIME); + return; + } + + stat->mtime =3D inode_get_mtime(inode); + stat->ctime.tv_sec =3D inode->i_ctime_sec; + stat->ctime.tv_nsec =3D (u32)atomic_read(pcn); + if (!(stat->ctime.tv_nsec & I_CTIME_QUERIED)) + stat->ctime.tv_nsec =3D ((u32)atomic_fetch_or(I_CTIME_QUERIED, pcn)); + stat->ctime.tv_nsec &=3D ~I_CTIME_QUERIED; +} +EXPORT_SYMBOL(fill_mg_cmtime); + /** * generic_fillattr - Fill in the basic attributes from the inode struct * @idmap: idmap of the mount the inode was found from @@ -58,8 +91,14 @@ void generic_fillattr(struct mnt_idmap *idmap, u32 reque= st_mask, stat->rdev =3D inode->i_rdev; stat->size =3D i_size_read(inode); stat->atime =3D inode_get_atime(inode); - stat->mtime =3D inode_get_mtime(inode); - stat->ctime =3D inode_get_ctime(inode); + + if (is_mgtime(inode)) { + fill_mg_cmtime(stat, request_mask, inode); + } else { + stat->ctime =3D inode_get_ctime(inode); + stat->mtime =3D inode_get_mtime(inode); + } + stat->blksize =3D i_blocksize(inode); stat->blocks =3D inode->i_blocks; =20 diff --git a/include/linux/fs.h b/include/linux/fs.h index 6ca11e241a24950d4bd44954cb285d51da2751e9..eff688e75f2f29f1c44dca96370= ee230f8c21db4 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1613,6 +1613,17 @@ static inline struct timespec64 inode_set_mtime(stru= ct inode *inode, return inode_set_mtime_to_ts(inode, ts); } =20 +/* + * Multigrain timestamps + * + * Conditionally use fine-grained ctime and mtime timestamps when there + * are users actively observing them via getattr. 
The primary use-case + * for this is NFS clients that use the ctime to distinguish between + * different states of the file, and that are often fooled by multiple + * operations that occur in the same coarse-grained timer tick. + */ +#define I_CTIME_QUERIED ((u32)BIT(31)) + static inline time64_t inode_get_ctime_sec(const struct inode *inode) { return inode->i_ctime_sec; @@ -1620,7 +1631,7 @@ static inline time64_t inode_get_ctime_sec(const stru= ct inode *inode) =20 static inline long inode_get_ctime_nsec(const struct inode *inode) { - return inode->i_ctime_nsec; + return inode->i_ctime_nsec & ~I_CTIME_QUERIED; } =20 static inline struct timespec64 inode_get_ctime(const struct inode *inode) @@ -1631,13 +1642,7 @@ static inline struct timespec64 inode_get_ctime(cons= t struct inode *inode) return ts; } =20 -static inline struct timespec64 inode_set_ctime_to_ts(struct inode *inode, - struct timespec64 ts) -{ - inode->i_ctime_sec =3D ts.tv_sec; - inode->i_ctime_nsec =3D ts.tv_nsec; - return ts; -} +struct timespec64 inode_set_ctime_to_ts(struct inode *inode, struct timesp= ec64 ts); =20 /** * inode_set_ctime - set the ctime in the inode @@ -2500,6 +2505,7 @@ struct file_system_type { #define FS_USERNS_MOUNT 8 /* Can be mounted by userns root */ #define FS_DISALLOW_NOTIFY_PERM 16 /* Disable fanotify permission events */ #define FS_ALLOW_IDMAP 32 /* FS has been updated to handle vf= s idmappings. */ +#define FS_MGTIME 64 /* FS uses multigrain timestamps */ #define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rena= me() internally. */ int (*init_fs_context)(struct fs_context *); const struct fs_parameter_spec *parameters; @@ -2523,6 +2529,17 @@ struct file_system_type { =20 #define MODULE_ALIAS_FS(NAME) MODULE_ALIAS("fs-" NAME) =20 +/** + * is_mgtime: is this inode using multigrain timestamps + * @inode: inode to test for multigrain timestamps + * + * Return true if the inode uses multigrain timestamps, false otherwise. 
+ */ +static inline bool is_mgtime(const struct inode *inode) +{ + return inode->i_sb->s_type->fs_flags & FS_MGTIME; +} + extern struct dentry *mount_bdev(struct file_system_type *fs_type, int flags, const char *dev_name, void *data, int (*fill_super)(struct super_block *, void *, int)); @@ -3262,6 +3279,7 @@ extern void page_put_link(void *); extern int page_symlink(struct inode *inode, const char *symname, int len); extern const struct inode_operations page_symlink_inode_operations; extern void kfree_link(void *); +void fill_mg_cmtime(struct kstat *stat, u32 request_mask, struct inode *in= ode); void generic_fillattr(struct mnt_idmap *, u32, struct inode *, struct ksta= t *); void generic_fill_statx_attr(struct inode *inode, struct kstat *stat); void generic_fill_statx_atomic_writes(struct kstat *stat, --=20 2.46.2 From nobody Thu Nov 28 07:42:13 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7479021BAF3; Wed, 2 Oct 2024 21:27:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904462; cv=none; b=RkDTLhOzLpuW54XA65creR0JbPvhDqCHl2+kqx13rPH3T6Rm3CJoBffczTcIdF3rYOp90vE63KeWcuGD2Q/vEnpf5frhZlJPPSrO8gQob1OAhFPwKDad95FLXqEjhdpMxznlaZAuF8yr/S8QOTfMyisyWq5jVc621CRyE0o4wUw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904462; c=relaxed/simple; bh=nWAObiv1pT63De3yrHB7U+GTf1hLNrlZ3AMOKfvjPAI=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Lp5yYXgHHkhK5ILmfCNNa3Om8EN/eejmT+YT5eoTZpqsxoNQ+EWtfnK6eAwu4lrIcTNSjDnVNhohVcqvBFYxcS0quKJS2cHDyiDqBRIzx8CSgf1QoK4rGskJpmBX1dGMoXqZF1U6EdKDiOrhVaVktTRYKSJL5sXjQ83RIuy52gs= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=rrg8jFdo; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="rrg8jFdo" Received: by smtp.kernel.org (Postfix) with ESMTPSA id CEAC9C4CED0; Wed, 2 Oct 2024 21:27:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727904462; bh=nWAObiv1pT63De3yrHB7U+GTf1hLNrlZ3AMOKfvjPAI=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=rrg8jFdosDYWMwAA92V2hbU+Yuic3cTXnqQJL7/CjaS4rr5GNRsDVyFogFc7wde8b 3+ytxmfZcVBW7BHEAGL4MUuw8r8fjtd6551m/mw3PBYEMfUcJX5EVnFh9Dl4h9g9bq E+0f3RyVBQuyVN+jR02fOAGU05QU3hphedY25tfubjbgoT7Ovjfw1tUksa+EGwTKEP cQw9zf61IzShMjY9Nc7Pn7ReERF7yn1KIs8f8bQPZs7a9O4goDuhvZZeDLeqBbBWs0 G1gFKZ/7FA/CHJ+OL4QeRprqQq6UL2esAzsdREnCvyhDIodZoeCmPj/niFarTH0M/n olAqzjyT6/E3Q== From: Jeff Layton Date: Wed, 02 Oct 2024 17:27:19 -0400 Subject: [PATCH v10 04/12] fs: have setattr_copy handle multigrain timestamps appropriately Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20241002-mgtime-v10-4-d1c4717f5284@kernel.org> References: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> In-Reply-To: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Randy Dunlap , Chandan Babu R , "Darrick J. 
Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=openpgp-sha256; l=3623; i=jlayton@kernel.org; h=from:subject:message-id; bh=nWAObiv1pT63De3yrHB7U+GTf1hLNrlZ3AMOKfvjPAI=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm/bq/ifAGWvuvl6BRwMs4Uby9r2RObyp81rJWQ t0O66nsopCJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZv26vwAKCRAADmhBGVaC FUFdEACg0WGt0WZDsR4lYHjv29oOUq3iMhe6DMxJ/AIvWuxlPjl9W5bf1/v4xbhac64hqIi9t75 Ed8baE1O4menek+QPLY2fRZnnBQGeQXQg9S9rBIhp0xF3IDD+27FJYF8TT5BItAgxM+tE08Jj79 TxEMXfS4ytrx3H/6P8gkkrxj/Vq2KgyY+pXxhr6SJL09O3NDWwkl+CmZlhsbJqaDtd0Bu2Nkqoc 7XU2I/HmGc9+oXYAi4FK+WjMwm+NdZxBDxUf+mxJELst25MI6Gn7ZqLMfihTZDlg/lkXmnjoMYc CpIIZoyJ+HTB+U9Kge0Knzi+yySpldS6Jv2RXPCtm4ptKKFDi2JO1mhHQyQ6U0oVux1XaGQMEda hhXz0O+U2tobDuICI10a29pVUBbbzgFMTN60MymsFmu7SunVDjCCb1GVEoIOOfAITwjZdcdxth/ d8tPUuSxoGy4Xm9bPfSc6kUuaWp/GqTpsjbtT22UjDOKz1DJfLRmBLrSNeW2c+1WW6yPWoA5wCX gAjJGRkY+f5UkqUIhYc/guqwfh2fkG5UVSrwX7YjFkNlNJjQxYPkFfaC5YpfqjGNJjS8Mm/XNw8 RuKnlGcHgkVhCZdFDdF9m9DCLNYtwbNGQmPRyak0iaHm3uyL+MOfkADpMxPn3YOLLNHu/1Gw7j9 rHKf7yUI65eskTQ== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 The setattr codepath is still using coarse-grained timestamps, even on multigrain filesystems. To fix this, fetch the timestamp for ctime updates later, at the point where the assignment occurs in setattr_copy. On a multigrain inode, ignore the ia_ctime in the attrs, and always update the ctime to the current clock value. Update the atime and mtime with the same value (if needed) unless they are being set to other specific values, a'la utimes(). 
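For illustration only, and not part of this patch: the selection rule described above can be modelled in userspace roughly as follows. This is a minimal sketch under stated assumptions. The demo_* types, the DEMO_ATTR_* bits (with arbitrary values) and demo_setattr_copy_mgtime() are hypothetical stand-ins, and the caller passes in "now" where the kernel would obtain it from inode_set_ctime_current().

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-ins for the kernel's ATTR_* bits; values are arbitrary. */
enum {
	DEMO_ATTR_ATIME     = 1u << 0,
	DEMO_ATTR_MTIME     = 1u << 1,
	DEMO_ATTR_CTIME     = 1u << 2,
	DEMO_ATTR_ATIME_SET = 1u << 3,
	DEMO_ATTR_MTIME_SET = 1u << 4,
};

/* Hypothetical, heavily reduced models of struct inode and struct iattr. */
struct demo_inode {
	struct timespec atime, mtime, ctime;
};

struct demo_attr {
	unsigned int valid;
	struct timespec atime, mtime, ctime;
};

/*
 * Model of the multigrain setattr path. If the ctime is not being updated,
 * nothing is touched and false is returned. Otherwise the ctime always takes
 * "now" (attr->ctime is ignored), and the atime and mtime take either the
 * explicitly supplied value (the *_SET case, as with utimes()) or "now".
 */
bool demo_setattr_copy_mgtime(struct demo_inode *inode,
			      const struct demo_attr *attr,
			      struct timespec now)
{
	assert(inode != NULL && attr != NULL);

	if (!(attr->valid & DEMO_ATTR_CTIME))
		return false;

	inode->ctime = now;

	if (attr->valid & DEMO_ATTR_ATIME_SET)
		inode->atime = attr->atime;
	else if (attr->valid & DEMO_ATTR_ATIME)
		inode->atime = now;

	if (attr->valid & DEMO_ATTR_MTIME_SET)
		inode->mtime = attr->mtime;
	else if (attr->valid & DEMO_ATTR_MTIME)
		inode->mtime = now;

	return true;
}
```

In this model, a caller that sets DEMO_ATTR_CTIME | DEMO_ATTR_MTIME | DEMO_ATTR_MTIME_SET gets the ctime stamped with "now" and the mtime set to the value it supplied, while a caller that leaves DEMO_ATTR_CTIME clear changes nothing.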
Do not do this universally however, as some filesystems (e.g. most networked fs) want to do an explicit update elsewhere before updating the local inode. Reviewed-by: Darrick J. Wong Reviewed-by: Josef Bacik Reviewed-by: Jan Kara Tested-by: Randy Dunlap # documentation bits Signed-off-by: Jeff Layton --- fs/attr.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 46 insertions(+), 6 deletions(-) diff --git a/fs/attr.c b/fs/attr.c index c04d19b58f1224c2149da57e3224b7bbbc83561f..0309c2bd8afa04bc43db6ff207f= 8a58d9f6a617d 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -271,6 +271,42 @@ int inode_newsize_ok(const struct inode *inode, loff_t= offset) } EXPORT_SYMBOL(inode_newsize_ok); =20 +/** + * setattr_copy_mgtime - update timestamps for mgtime inodes + * @inode: inode timestamps to be updated + * @attr: attrs for the update + * + * With multigrain timestamps, take more care to prevent races when + * updating the ctime. Always update the ctime to the very latest using + * the standard mechanism, and use that to populate the atime and mtime + * appropriately (unless those are being set to specific values). + */ +static void setattr_copy_mgtime(struct inode *inode, const struct iattr *a= ttr) +{ + unsigned int ia_valid =3D attr->ia_valid; + struct timespec64 now; + + /* + * If the ctime isn't being updated then nothing else should be + * either. 
+ */ + if (!(ia_valid & ATTR_CTIME)) { + WARN_ON_ONCE(ia_valid & (ATTR_ATIME|ATTR_MTIME)); + return; + } + + now =3D inode_set_ctime_current(inode); + if (ia_valid & ATTR_ATIME_SET) + inode_set_atime_to_ts(inode, attr->ia_atime); + else if (ia_valid & ATTR_ATIME) + inode_set_atime_to_ts(inode, now); + + if (ia_valid & ATTR_MTIME_SET) + inode_set_mtime_to_ts(inode, attr->ia_mtime); + else if (ia_valid & ATTR_MTIME) + inode_set_mtime_to_ts(inode, now); +} + /** * setattr_copy - copy simple metadata updates into the generic inode * @idmap: idmap of the mount the inode was found from @@ -303,12 +339,6 @@ void setattr_copy(struct mnt_idmap *idmap, struct inod= e *inode, =20 i_uid_update(idmap, attr, inode); i_gid_update(idmap, attr, inode); - if (ia_valid & ATTR_ATIME) - inode_set_atime_to_ts(inode, attr->ia_atime); - if (ia_valid & ATTR_MTIME) - inode_set_mtime_to_ts(inode, attr->ia_mtime); - if (ia_valid & ATTR_CTIME) - inode_set_ctime_to_ts(inode, attr->ia_ctime); if (ia_valid & ATTR_MODE) { umode_t mode =3D attr->ia_mode; if (!in_group_or_capable(idmap, inode, @@ -316,6 +346,16 @@ void setattr_copy(struct mnt_idmap *idmap, struct inod= e *inode, mode &=3D ~S_ISGID; inode->i_mode =3D mode; } + + if (is_mgtime(inode)) + return setattr_copy_mgtime(inode, attr); + + if (ia_valid & ATTR_ATIME) + inode_set_atime_to_ts(inode, attr->ia_atime); + if (ia_valid & ATTR_MTIME) + inode_set_mtime_to_ts(inode, attr->ia_mtime); + if (ia_valid & ATTR_CTIME) + inode_set_ctime_to_ts(inode, attr->ia_ctime); } EXPORT_SYMBOL(setattr_copy); =20 --=20 2.46.2 From nobody Thu Nov 28 07:42:13 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3576D21C173; Wed, 2 Oct 2024 21:27:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: 
i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904465; cv=none; b=EwOBMlLlvmlE/lt8g3k5m3YpPbamaZLTs76wxZ/Btw9iqpyb+Mmejk9BvpwgldiMojCXMe/QZn+dEtdO+0+4fWxZ95AY3YY5bDadbt+EnmrtG1fgKE8kxi3v1Hu9a4Zp8miucFQqHwntc3Rj+dqnACQ6exACp4qbU47y5R8bBsk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904465; c=relaxed/simple; bh=GgXDPuj4VTCO+KYXw1W6wA5jmJ8SPd2bLeg/94OOtzA=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=BYgExNlE4Twqy6SCpR+bMxG3Jl81dJWthkekw/sfQSzd65Hz+Ep6+DvQ3xuj629MjBMZZ3D7qFSSysC8TF7Z8yYEya16/xtUx5yhm8gnnEWN3ONKV27VxfIopJ0gLLAQwCTalJMspVoITunsJ7Fd13yjcBtBfc0xNEUGR6XRd4Q= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=pErORCLF; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="pErORCLF" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 90C01C4CED8; Wed, 2 Oct 2024 21:27:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727904465; bh=GgXDPuj4VTCO+KYXw1W6wA5jmJ8SPd2bLeg/94OOtzA=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=pErORCLFbUMtYV6Xg0Q8FSs6oa64pfK8QQ0Y2/53GcZECSBeypOEHAXcefZC1c2Tq nK/M2DGCQG1niK1yXv0WLJLWOr27L/5eLGvZfj6wegF9JGGH3AhxnV5eb9iH3Na9Ep CczJA9ddfxZ72S9kSmFSXcrlzuLes8ZZfp6aov8cqZu1YCkvACsPRLk5LFhVmIPDOa zNbRpMPaHvIm8fNV7gqc0l+mYtzLf4NLo2qaQvjXfMFsIDN4CyZ3MijISmLPCPNomW yrhuvZmXB7WYmqsl/t401/RJz34nyupB8rsbUKUjfJ6QidHDhDlmG0Wh1f1S+Wr0kT wqNcdegxY9Wlw== From: Jeff Layton Date: Wed, 02 Oct 2024 17:27:20 -0400 Subject: [PATCH v10 05/12] fs: handle delegated timestamps in setattr_copy_mgtime Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 
quoted-printable Message-Id: <20241002-mgtime-v10-5-d1c4717f5284@kernel.org> References: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> In-Reply-To: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Randy Dunlap , Chandan Babu R , "Darrick J. Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=openpgp-sha256; l=6385; i=jlayton@kernel.org; h=from:subject:message-id; bh=GgXDPuj4VTCO+KYXw1W6wA5jmJ8SPd2bLeg/94OOtzA=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm/bq/8ejyIg0PRylRvcQV6hmgwmnKDZHwiP42g ghVwS8PaLKJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZv26vwAKCRAADmhBGVaC FUs6EACFaoAMIiWmAxm2JDOYSGBGqYx7XgULFJuHkZorvtP2QGTa3sFuMu0UtasSpV7DnjmXAp/ 58p6rdY498tiod4Ll7n9AwjwsBnmb2P2aPP3L1wpOZV+tK1hk+BnJ7XfM4/P3Yi4Bbut6GpQl8U x8jnTYyJFrknCniKjxi+mI91aXA67q5KEqqI8W+ZAr4E8C2Ai3/cD+24VbybY41A52VgelzA5EG Ap3liaASdiktai/bt5qbndIJJ2qvjFyn3QcW4Tz0LTSoHZFEY2MRJSO/6Unnl3bsZByuCsgNzCW hgi8TX/79pDgk+WdlH6cN7JN1pNFCNC6u8hyp8H+pbBYmHQ2sgy2vTR+2fKcKKtCls6CuTBkCE8 Lw3Ok0r66ptJdcW4waC3luYXt1KfZc1JpJ/Uw0iMc+jhYjhM8G9GDqFxgudGT60UrBhGkF6juES hrhkkqG/rpPnqv6AmU4Z5DtKHp8wR0nrZeiQUu5H0RqZ/MdrgEH+qhEEM0OKOfkbmNrXYtwuTmo 54WkyE0sLh/y23R41nuZPVgLSJul2Mzx1B3o4zRyc4pX0QiWdPI9CWYp+0ZPwp8fmXkMb+8aVKO qIysDBYjA+SkV0rOsDDA7fXhNsMH40A1UN0npUBje6zvkZFqTM4n/myilmcSEbsnFyU46LlFfRC 8/zxtdFjBqxLvrg== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 An update to the inode ctime typically requires 
the latest clock value possible. The exception to this rule is when there is an nfsd write delegation and the server is proxying timestamps from the client. When nfsd gets a CB_GETATTR response, update the timestamp values in the inode to those that the client is tracking. The client doesn't send a ctime value (since that's always determined by the exported filesystem), but it can send an mtime value. In the case where it does, update the ctime to a value commensurate with that instead of the current time. If ATTR_DELEG is set, then use the ia_ctime value instead of setting the timestamp to the current time. With the addition of delegated timestamps, the server may receive a request to update only the atime, which doesn't involve a ctime update. Trust the ATTR_CTIME flag in the update and only update the ctime when it's set. Tested-by: Randy Dunlap # documentation bits Reviewed-by: Jan Kara Signed-off-by: Jeff Layton --- fs/attr.c | 28 +++++++++++++-------- fs/inode.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++= ++++ include/linux/fs.h | 2 ++ 3 files changed, 93 insertions(+), 10 deletions(-) diff --git a/fs/attr.c b/fs/attr.c index 0309c2bd8afa04bc43db6ff207f8a58d9f6a617d..c614b954bda5244cc20ee82a98a= 8e68845f23bd7 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -286,16 +286,20 @@ static void setattr_copy_mgtime(struct inode *inode, = const struct iattr *attr) unsigned int ia_valid =3D attr->ia_valid; struct timespec64 now; =20 - /* - * If the ctime isn't being updated then nothing else should be - * either. - */ - if (!(ia_valid & ATTR_CTIME)) { - WARN_ON_ONCE(ia_valid & (ATTR_ATIME|ATTR_MTIME)); - return; + if (ia_valid & ATTR_CTIME) { + /* + * In the case of an update for a write delegation, we must respect + * the value in ia_ctime and not use the current time.
+ */ + if (ia_valid & ATTR_DELEG) + now =3D inode_set_ctime_deleg(inode, attr->ia_ctime); + else + now =3D inode_set_ctime_current(inode); + } else { + /* If ATTR_CTIME isn't set, then ATTR_MTIME shouldn't be either. */ + WARN_ON_ONCE(ia_valid & ATTR_MTIME); } =20 - now =3D inode_set_ctime_current(inode); if (ia_valid & ATTR_ATIME_SET) inode_set_atime_to_ts(inode, attr->ia_atime); else if (ia_valid & ATTR_ATIME) @@ -354,8 +358,12 @@ void setattr_copy(struct mnt_idmap *idmap, struct inod= e *inode, inode_set_atime_to_ts(inode, attr->ia_atime); if (ia_valid & ATTR_MTIME) inode_set_mtime_to_ts(inode, attr->ia_mtime); - if (ia_valid & ATTR_CTIME) - inode_set_ctime_to_ts(inode, attr->ia_ctime); + if (ia_valid & ATTR_CTIME) { + if (ia_valid & ATTR_DELEG) + inode_set_ctime_deleg(inode, attr->ia_ctime); + else + inode_set_ctime_to_ts(inode, attr->ia_ctime); + } } EXPORT_SYMBOL(setattr_copy); =20 diff --git a/fs/inode.c b/fs/inode.c index 53f56f6e1ff26e718080211880924f37cf0e5b3c..7d1ede60e549683502911f3bb3a= 3a079768e449b 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -2717,6 +2717,79 @@ struct timespec64 inode_set_ctime_current(struct ino= de *inode) } EXPORT_SYMBOL(inode_set_ctime_current); =20 +/** + * inode_set_ctime_deleg - try to update the ctime on a delegated inode + * @inode: inode to update + * @update: timespec64 to set the ctime + * + * Attempt to atomically update the ctime on behalf of a delegation holder. + * + * The nfs server can call back the holder of a delegation to get updated + * inode attributes, including the mtime. When updating the mtime, update + * the ctime to a value at least equal to that. + * + * This can race with concurrent updates to the inode, in which + * case the update is skipped. + * + * Note that this works even when multigrain timestamps are not enabled, + * so it is used in either case. 
+ */ +struct timespec64 inode_set_ctime_deleg(struct inode *inode, struct timesp= ec64 update) +{ + struct timespec64 now, cur_ts; + u32 cur, old; + + /* pairs with try_cmpxchg below */ + cur =3D smp_load_acquire(&inode->i_ctime_nsec); + cur_ts.tv_nsec =3D cur & ~I_CTIME_QUERIED; + cur_ts.tv_sec =3D inode->i_ctime_sec; + + /* If the update is older than the existing value, skip it. */ + if (timespec64_compare(&update, &cur_ts) <=3D 0) + return cur_ts; + + ktime_get_coarse_real_ts64_mg(&now); + + /* Clamp the update to "now" if it's in the future */ + if (timespec64_compare(&update, &now) > 0) + update =3D now; + + update =3D timestamp_truncate(update, inode); + + /* No need to update if the values are already the same */ + if (timespec64_equal(&update, &cur_ts)) + return cur_ts; + + /* + * Try to swap the nsec value into place. If it fails, that means + * it raced with an update due to a write or similar activity. That + * stamp takes precedence, so just skip the update. + */ +retry: + old =3D cur; + if (try_cmpxchg(&inode->i_ctime_nsec, &cur, update.tv_nsec)) { + inode->i_ctime_sec =3D update.tv_sec; + mgtime_counter_inc(mg_ctime_swaps); + return update; + } + + /* + * Was the change due to another task marking the old ctime QUERIED? + * + * If so, then retry the swap. This can only happen once since + * the only way to clear I_CTIME_QUERIED is to stamp the inode + * with a new ctime. + */ + if (!(old & I_CTIME_QUERIED) && (cur =3D=3D (old | I_CTIME_QUERIED))) + goto retry; + + /* Otherwise, it was a new timestamp. 
*/ + cur_ts.tv_sec =3D inode->i_ctime_sec; + cur_ts.tv_nsec =3D cur & ~I_CTIME_QUERIED; + return cur_ts; +} +EXPORT_SYMBOL(inode_set_ctime_deleg); + /** * in_group_or_capable - check whether caller is CAP_FSETID privileged * @idmap: idmap of the mount @inode was found from diff --git a/include/linux/fs.h b/include/linux/fs.h index eff688e75f2f29f1c44dca96370ee230f8c21db4..ea7ed437d2b165debf680507aa4= 50b2662fd5839 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1544,6 +1544,8 @@ static inline bool fsuidgid_has_mapping(struct super_= block *sb, =20 struct timespec64 current_time(struct inode *inode); struct timespec64 inode_set_ctime_current(struct inode *inode); +struct timespec64 inode_set_ctime_deleg(struct inode *inode, + struct timespec64 update); =20 static inline time64_t inode_get_atime_sec(const struct inode *inode) { --=20 2.46.2 From nobody Thu Nov 28 07:42:13 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4741C21D2AC; Wed, 2 Oct 2024 21:27:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904468; cv=none; b=ecavxAeNWjJxBh9fJUlTkRmp1+oM0JBEoSjKzsfpLF+NGNfrtbit1Td4f00/TGYaaYibAJr4Xq8XAyAuercYbAThunR+pg1A0DruKyyQdoFJI4wbHgHU+9ZXQUvb8dxMjrtAYDgYatOQWaGXyz3iWDetd6G4Md1bXlTZKVDGdnU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904468; c=relaxed/simple; bh=MRGGG8rqdJzdD5KgI4WfyNMzLguXPYXob8v5j7W1QaM=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Z9v0hJxXHa+29X7cd8WUVSJH8v67mfZZLCJSS2ml/MURxve7zRr7wLldU/KVskbMuTQJsK/XJ21RDhGrWA6JIwzpNHPk8TdzDacnrO4YQ4ZEa3JcB5SFNCCLMgtdmcXJtXiFjY1zehMHduN7yHi3xb3x62i6N0omCS5wl0JHW38= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=RiQb+Vas; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="RiQb+Vas" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 53227C4CED6; Wed, 2 Oct 2024 21:27:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727904467; bh=MRGGG8rqdJzdD5KgI4WfyNMzLguXPYXob8v5j7W1QaM=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=RiQb+VasJhrhbm+efzGbpIcwLxszW5XKADxsVvHvZ4PtU9omVM1uKPRYjdDFTLrxU spC6qoc3Jx5R2FiTqiPPDvCGVFpDpJoGjjicm25vd2LLcfcz9jjV/RR/bjlnI1V4EY ad67YYwlvOrSCQNhgnodXxdPfs01FPltkNs3R5FmGOCg4ttPkExz83Q71CZ2EKDmER 3vEcrINS5svEVCEpFlZZQrG/rtN2NbUWpkxh9VaYk5MStmuCmVXUCBhdZXbYBg8jBs 7qYe48hdyaiE4eaWZsSg0Wr6BU5TlKcgxiilkkzod6H78fW5qEKHCFkhLxB0g4czZQ 211YHMvElLRgA== From: Jeff Layton Date: Wed, 02 Oct 2024 17:27:21 -0400 Subject: [PATCH v10 06/12] fs: tracepoints around multigrain timestamp events Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20241002-mgtime-v10-6-d1c4717f5284@kernel.org> References: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> In-Reply-To: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Randy Dunlap , Chandan Babu R , "Darrick J. 
Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=openpgp-sha256; l=6245; i=jlayton@kernel.org; h=from:subject:message-id; bh=MRGGG8rqdJzdD5KgI4WfyNMzLguXPYXob8v5j7W1QaM=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm/bq/rrV42gSWRnMywus+4vWn2erTXDHpugNyH p5Jt5mx0kCJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZv26vwAKCRAADmhBGVaC FfHLD/0Y+Ozd45QWOEXlGcizIJuYoklt/IclS6YYuOynkLK657rXW0pb+HYJPpKskii5RnL8UEO zVviDGgIKrEHkavVfZbP1PYA1Z+VQJxoYVWmMckxEU4D/ttgg9q8HMGi6CTbx8HZxejV2aRidO5 Q2DxmAfAokB2UKHZIppBcbZszz1P7Kl0yfRlUKeJuklq9GVYBs1Pim1fZgiXRyYL6b+tP/7ps5C ymjA4tKdOekiGiOceknMPhjV8Y9IqU+PmHDLm2KiN5LIetf6bnNXJkL52Q29NJ+eyMJwio0Ff1f Or/F8FVlAfl8FiM6wEM6D4BzMWoUdKQ4jZhMYgFdEkKNPohN1ckS5sGybaKegOQKCmz935a0+QB BPqzCoiaaZpYYAktHl5D4RtVolKmJtLGY2UYlkv0XkOyEjWDXeFzLnjTufDI846IzHPWWVDRGJ8 9EaICfw8ssZEJ1BjJ3qNb5yFKxCidh7aW86QQORcgAC80gN2aI+HZXv9ijN3+4SCsr/suRndazc kUa9Wh92buOEq+U0+WoJQBKJ5dKFUJram9X+6fl//fjCaCVByxDXkahkzeJk+k/CKqLT5eUqaWc XKmwVPbj0nWdMCahwZT6k0Suf4uVnDOSP/+bA9QNEaZGS34TkJrUBsyzRUcRoKoQpfFn6QVLS4H 6h9x5TdrOvNT+vg== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 Add some tracepoints around various multigrain timestamp events. Reviewed-by: Josef Bacik Reviewed-by: Darrick J. 
Wong Reviewed-by: Jan Kara Reviewed-by: Steven Rostedt (Google) Tested-by: Randy Dunlap # documentation bits Signed-off-by: Jeff Layton --- fs/inode.c | 9 ++- fs/stat.c | 3 + include/trace/events/timestamp.h | 124 +++++++++++++++++++++++++++++++++++= ++++ 3 files changed, 135 insertions(+), 1 deletion(-) diff --git a/fs/inode.c b/fs/inode.c index 7d1ede60e549683502911f3bb3a3a079768e449b..f7a25c511d6b7069fa235135cf3= bad0cda32815b 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -22,6 +22,9 @@ #include #include #include +#define CREATE_TRACE_POINTS +#include + #include "internal.h" =20 /* @@ -2603,6 +2606,7 @@ EXPORT_SYMBOL(inode_nohighmem); =20 struct timespec64 inode_set_ctime_to_ts(struct inode *inode, struct timesp= ec64 ts) { + trace_inode_set_ctime_to_ts(inode, &ts); set_normalized_timespec64(&ts, ts.tv_sec, ts.tv_nsec); inode->i_ctime_sec =3D ts.tv_sec; inode->i_ctime_nsec =3D ts.tv_nsec; @@ -2689,14 +2693,17 @@ struct timespec64 inode_set_ctime_current(struct in= ode *inode) } =20 /* No need to cmpxchg if it's exactly the same */ - if (cns =3D=3D now.tv_nsec && inode->i_ctime_sec =3D=3D now.tv_sec) + if (cns =3D=3D now.tv_nsec && inode->i_ctime_sec =3D=3D now.tv_sec) { + trace_ctime_xchg_skip(inode, &now); goto out; + } cur =3D cns; retry: /* Try to swap the nsec value into place. */ if (try_cmpxchg(&inode->i_ctime_nsec, &cur, now.tv_nsec)) { /* If swap occurred, then we're (mostly) done */ inode->i_ctime_sec =3D now.tv_sec; + trace_ctime_ns_xchg(inode, cns, now.tv_nsec, cur); } else { /* * Was the change due to someone marking the old ctime QUERIED? 
diff --git a/fs/stat.c b/fs/stat.c index dd480bf51a2a764e5eb1d0a213c5ec8b640db911..6eb6c39d003755f9e602996ed93= dcbd863847820 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -23,6 +23,8 @@ #include #include =20 +#include + #include "internal.h" #include "mount.h" =20 @@ -56,6 +58,7 @@ void fill_mg_cmtime(struct kstat *stat, u32 request_mask,= struct inode *inode) if (!(stat->ctime.tv_nsec & I_CTIME_QUERIED)) stat->ctime.tv_nsec =3D ((u32)atomic_fetch_or(I_CTIME_QUERIED, pcn)); stat->ctime.tv_nsec &=3D ~I_CTIME_QUERIED; + trace_fill_mg_cmtime(inode, &stat->ctime, &stat->mtime); } EXPORT_SYMBOL(fill_mg_cmtime); =20 diff --git a/include/trace/events/timestamp.h b/include/trace/events/timest= amp.h new file mode 100644 index 0000000000000000000000000000000000000000..c9e5ec930054887a6a7bae8e487= 611b5ded33d71 --- /dev/null +++ b/include/trace/events/timestamp.h @@ -0,0 +1,124 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM timestamp + +#if !defined(_TRACE_TIMESTAMP_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_TIMESTAMP_H + +#include +#include + +#define CTIME_QUERIED_FLAGS \ + { I_CTIME_QUERIED, "Q" } + +DECLARE_EVENT_CLASS(ctime, + TP_PROTO(struct inode *inode, + struct timespec64 *ctime), + + TP_ARGS(inode, ctime), + + TP_STRUCT__entry( + __field(dev_t, dev) + __field(ino_t, ino) + __field(time64_t, ctime_s) + __field(u32, ctime_ns) + __field(u32, gen) + ), + + TP_fast_assign( + __entry->dev =3D inode->i_sb->s_dev; + __entry->ino =3D inode->i_ino; + __entry->gen =3D inode->i_generation; + __entry->ctime_s =3D ctime->tv_sec; + __entry->ctime_ns =3D ctime->tv_nsec; + ), + + TP_printk("ino=3D%d:%d:%ld:%u ctime=3D%lld.%u", + MAJOR(__entry->dev), MINOR(__entry->dev), __entry->ino, __entry->gen, + __entry->ctime_s, __entry->ctime_ns + ) +); + +DEFINE_EVENT(ctime, inode_set_ctime_to_ts, + TP_PROTO(struct inode *inode, + struct timespec64 *ctime), + TP_ARGS(inode, ctime)); + +DEFINE_EVENT(ctime, ctime_xchg_skip, + TP_PROTO(struct 
inode *inode, + struct timespec64 *ctime), + TP_ARGS(inode, ctime)); + +TRACE_EVENT(ctime_ns_xchg, + TP_PROTO(struct inode *inode, + u32 old, + u32 new, + u32 cur), + + TP_ARGS(inode, old, new, cur), + + TP_STRUCT__entry( + __field(dev_t, dev) + __field(ino_t, ino) + __field(u32, gen) + __field(u32, old) + __field(u32, new) + __field(u32, cur) + ), + + TP_fast_assign( + __entry->dev =3D inode->i_sb->s_dev; + __entry->ino =3D inode->i_ino; + __entry->gen =3D inode->i_generation; + __entry->old =3D old; + __entry->new =3D new; + __entry->cur =3D cur; + ), + + TP_printk("ino=3D%d:%d:%ld:%u old=3D%u:%s new=3D%u cur=3D%u:%s", + MAJOR(__entry->dev), MINOR(__entry->dev), __entry->ino, __entry->gen, + __entry->old & ~I_CTIME_QUERIED, + __print_flags(__entry->old & I_CTIME_QUERIED, "|", CTIME_QUERIED_FLAGS), + __entry->new, + __entry->cur & ~I_CTIME_QUERIED, + __print_flags(__entry->cur & I_CTIME_QUERIED, "|", CTIME_QUERIED_FLAGS) + ) +); + +TRACE_EVENT(fill_mg_cmtime, + TP_PROTO(struct inode *inode, + struct timespec64 *ctime, + struct timespec64 *mtime), + + TP_ARGS(inode, ctime, mtime), + + TP_STRUCT__entry( + __field(dev_t, dev) + __field(ino_t, ino) + __field(time64_t, ctime_s) + __field(time64_t, mtime_s) + __field(u32, ctime_ns) + __field(u32, mtime_ns) + __field(u32, gen) + ), + + TP_fast_assign( + __entry->dev =3D inode->i_sb->s_dev; + __entry->ino =3D inode->i_ino; + __entry->gen =3D inode->i_generation; + __entry->ctime_s =3D ctime->tv_sec; + __entry->mtime_s =3D mtime->tv_sec; + __entry->ctime_ns =3D ctime->tv_nsec; + __entry->mtime_ns =3D mtime->tv_nsec; + ), + + TP_printk("ino=3D%d:%d:%ld:%u ctime=3D%lld.%u mtime=3D%lld.%u", + MAJOR(__entry->dev), MINOR(__entry->dev), __entry->ino, __entry->gen, + __entry->ctime_s, __entry->ctime_ns, + __entry->mtime_s, __entry->mtime_ns + ) +); +#endif /* _TRACE_TIMESTAMP_H */ + +/* This part must be outside protection */ +#include --=20 2.46.2 From nobody Thu Nov 28 07:42:13 2024 Received: from smtp.kernel.org 
(aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1A7D62178ED; Wed, 2 Oct 2024 21:27:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904471; cv=none; b=vDK7ptKrJkEvgp+zGvklJxdztqbnk4V6NG4hAX2GF/Eq0VKlTCBQ7GRJV6GRyswk+pXYs8G5E92ifNg5Q2UymXnqd/7WcIDwvvqFmB/Q4Zw1xebA8UPiDPvyCYH4IJKmoOEaA4BgHfWtlbamIe38Y76Z5LSsOQ68nAOjW2SB+14= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904471; c=relaxed/simple; bh=Z8ARvNSJ2qaGQw9gXXGsklzintCaz0parTyIMYimjOU=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=dTcDlVkxG5lrmLzsYS8EUfu7d5KbzExWtk6H94LWffDH7Ty1WA7L8lg4WZE1At232b6kQONQTuc5c2UKnxyQDC6BdPLT7VPVXPlWQ+N9sSUZeTHEHKERt6muX4P4SQAMlqcXNxYfcCvHu87lKAeF3KWrgflvoFu7dmIxkiiELJ4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=tzCXJ6wf; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="tzCXJ6wf" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 17306C4CED1; Wed, 2 Oct 2024 21:27:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727904470; bh=Z8ARvNSJ2qaGQw9gXXGsklzintCaz0parTyIMYimjOU=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=tzCXJ6wfC5VMb2ExRKKrzQ20m0/cMA8alt1mx22BM50xuDhE0GS9gVEFsrjSveh5f HFH5VS9B24S84IqvZUpSgC0C1q24ouXK3xTB7RbQfG8UaeRZcICfAK6HDjmfFkfBV7 3aO9o2M1mNeM3mDVEnXkN4uh72RWd1d3J5IlhvmQyO4+hhMRETQlHKR4X9G6/DFhRe YSYJkbBlozjLOXMmaq1m2Gt9EFtR/FJhOO+7ql9Vva32Q75BFsiORHAqNQJqujNmqW 
SBn90QVLpNFHTeIS+z4r+VmFCXvrhebJ9B4yq9M8D2P5LJh1lY5wtzoECPs6IH+MI2 QP5O6BUQzp8Rg== From: Jeff Layton Date: Wed, 02 Oct 2024 17:27:22 -0400 Subject: [PATCH v10 07/12] fs: add percpu counters for significant multigrain timestamp events Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20241002-mgtime-v10-7-d1c4717f5284@kernel.org> References: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> In-Reply-To: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Randy Dunlap , Chandan Babu R , "Darrick J. Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=openpgp-sha256; l=3654; i=jlayton@kernel.org; h=from:subject:message-id; bh=Z8ARvNSJ2qaGQw9gXXGsklzintCaz0parTyIMYimjOU=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm/bq/TMaGYuw/cJFUuusrG5sD9Gz0d9qf6qw5y upmhOuFiF6JAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZv26vwAKCRAADmhBGVaC FRvxD/9evDrf3HGvYTuhxkxutkxIEi68sVhkiNZD69Mm7WZA2CzmHFpkbIzJUXlNfytAPvjdSGx Se9Non3XuQRLNglgc0lhxcfcZdu/X+u5yMiBvHkBZEf5lodGBnhBUeXjm4jg9rEign2Xcke0uen zmW17gVIHwMZ3pm48WdgDnI8LGujnHwyRtaKHMgoMO1zgnVvcxwJwwUnejPgDLsmruCTvs+3DcO Y1qi5xwXYC30GgA8xLI3gOheiXnZPH7tbEJ+CH8lRjel+ECcxNYyb4BgLgnjumPTmVu+ctbB3O/ NyFL+R7Pi0fN4SNBqZmhwLTUJrcqb7DCkLXIHsvJwkrbW3Bqnf24X7QYDnJtEJQ9zSf7lLcSU6k 
V+ew1hvKLd2wb+drV/Ii7HSHdlUZvaWrk8oHWdjSCKLvUAmjwdajzx9wmNWqXhGfXmTr0ecFsKA NXFHfSfVYvzOmgE53f2k3i5mgVRZyIhHJUlyjRFJKFGzUxYaTAIh5NNmXRgDtW4ppFtQH53Hxhu kp4ZqjZv2oO3ibhlxc5JK1fBHr6GVEm5y/b+JdsECtsh3oy/dqkLWdOAHXM5vNLGLzaBO7Qa7Ds CYD7TxCxZPUyWMagc0FhPLEfyvMm79Aw6QZdmp6UVnPQO6c8VRfeyGOyuivKJVQaMeSbXMaQuV2 JHDiBQHJy0omjwA== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 New percpu counters for counting various stats around multigrain timestamp events, and a new debugfs file for displaying them when CONFIG_DEBUG_FS is enabled: - number of attempted ctime updates - number of successful i_ctime_nsec swaps - number of fine-grained timestamp fetches - number of floor value swap events Reviewed-by: Josef Bacik Reviewed-by: Darrick J. Wong Reviewed-by: Jan Kara Tested-by: Randy Dunlap # documentation bits Signed-off-by: Jeff Layton --- fs/inode.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++= ++++ 1 file changed, 69 insertions(+) diff --git a/fs/inode.c b/fs/inode.c index f7a25c511d6b7069fa235135cf3bad0cda32815b..6d501c7308aefcbb8001d64cb46= e57f1839b8a3b 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -21,6 +21,8 @@ #include #include #include +#include +#include #include #define CREATE_TRACE_POINTS #include @@ -101,6 +103,70 @@ long get_nr_dirty_inodes(void) return nr_dirty > 0 ? 
nr_dirty : 0; } =20 +#ifdef CONFIG_DEBUG_FS +static DEFINE_PER_CPU(long, mg_ctime_updates); +static DEFINE_PER_CPU(long, mg_fine_stamps); +static DEFINE_PER_CPU(long, mg_ctime_swaps); + +static unsigned long get_mg_ctime_updates(void) +{ + unsigned long sum =3D 0; + int i; + + for_each_possible_cpu(i) + sum +=3D data_race(per_cpu(mg_ctime_updates, i)); + return sum; +} + +static unsigned long get_mg_fine_stamps(void) +{ + unsigned long sum =3D 0; + int i; + + for_each_possible_cpu(i) + sum +=3D data_race(per_cpu(mg_fine_stamps, i)); + return sum; +} + +static unsigned long get_mg_ctime_swaps(void) +{ + unsigned long sum =3D 0; + int i; + + for_each_possible_cpu(i) + sum +=3D data_race(per_cpu(mg_ctime_swaps, i)); + return sum; +} + +#define mgtime_counter_inc(__var) this_cpu_inc(__var) + +static int mgts_show(struct seq_file *s, void *p) +{ + unsigned long ctime_updates =3D get_mg_ctime_updates(); + unsigned long ctime_swaps =3D get_mg_ctime_swaps(); + unsigned long fine_stamps =3D get_mg_fine_stamps(); + unsigned long floor_swaps =3D timekeeping_get_mg_floor_swaps(); + + seq_printf(s, "%lu %lu %lu %lu\n", + ctime_updates, ctime_swaps, fine_stamps, floor_swaps); + return 0; +} + +DEFINE_SHOW_ATTRIBUTE(mgts); + +static int __init mg_debugfs_init(void) +{ + debugfs_create_file("multigrain_timestamps", S_IFREG | S_IRUGO, NULL, NUL= L, &mgts_fops); + return 0; +} +late_initcall(mg_debugfs_init); + +#else /* ! 
CONFIG_DEBUG_FS */ + +#define mgtime_counter_inc(__var) do { } while (0) + +#endif /* CONFIG_DEBUG_FS */ + /* * Handle nr_inode sysctl */ @@ -2689,8 +2755,10 @@ struct timespec64 inode_set_ctime_current(struct ino= de *inode) if (timespec64_compare(&now, &ctime) <=3D 0) { ktime_get_real_ts64_mg(&now); now =3D timestamp_truncate(now, inode); + mgtime_counter_inc(mg_fine_stamps); } } + mgtime_counter_inc(mg_ctime_updates); =20 /* No need to cmpxchg if it's exactly the same */ if (cns =3D=3D now.tv_nsec && inode->i_ctime_sec =3D=3D now.tv_sec) { @@ -2704,6 +2772,7 @@ struct timespec64 inode_set_ctime_current(struct inod= e *inode) /* If swap occurred, then we're (mostly) done */ inode->i_ctime_sec =3D now.tv_sec; trace_ctime_ns_xchg(inode, cns, now.tv_nsec, cur); + mgtime_counter_inc(mg_ctime_swaps); } else { /* * Was the change due to someone marking the old ctime QUERIED? --=20 2.46.2 From nobody Thu Nov 28 07:42:13 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7976A21F438; Wed, 2 Oct 2024 21:27:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904473; cv=none; b=XLV0IZnDBGcJ64izGb4DEpUNaHs6AmAASsg9VmN8jXyeg4KejYN4Y0YRNZSZW+3L+WsNoZO5NdsA0v71x6rYJ/OwjNxigG0/8hTgvtwiC/0P0zXYbiMY1kejU+ndUxnTS7QzzEYP/UanHZckdkgxYkV5JKf+kNYlbOBHLx93nMs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904473; c=relaxed/simple; bh=qBtzxV6C3gNnpXcNznor+GUfZID0zTF7K3hXwvPKJ3M=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; 
b=D422JGoR2ZRyLl/O0q4IZMwnBwSZD9u7u0OPmygkJCIFI5d3KzMPnAVOUDXYqLT5ZhCViJ7qyn2ocCvS7UdoN6anSci3PUDLmufr4My83HLGDjCyGRHt1B6vLiA0GFsWjHnk3uHJVqstkyXGQN2U03nZYtrnwdLQlDWL5+nWKoo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=tZdd5t/F; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="tZdd5t/F" Received: by smtp.kernel.org (Postfix) with ESMTPSA id CDD5AC4CED6; Wed, 2 Oct 2024 21:27:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727904473; bh=qBtzxV6C3gNnpXcNznor+GUfZID0zTF7K3hXwvPKJ3M=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=tZdd5t/FDA6BUDMKhuyJUMzgrU7Dj+ArevBTpINQ/kUw0bID2yhWKQpKYebxZG9T4 2HlHzHF/uJLRfFA19ndqG3latGrn/u720RlCJbM13CK23otWdgZtsYGMvwyX8ndXZ/ H8igGJd+Yl0+pDHWi/44IMj7dMBiuhLvFzott0kklgeHKAfK+g3pmdflyDlxJOH9sA y36edvjXMAAWajMoebqCmpJp8QVaBZpMdHVODRyfbSppMXtTxwys10nWxxlpiHuZh2 OjTH5zGsziWEHSHh8x/FV255nV5xQkP6RCuEJqa0VcmQfpX79TKxcTXe8XDQZ4pnx0 qZT1ErIW9PRnQ== From: Jeff Layton Date: Wed, 02 Oct 2024 17:27:23 -0400 Subject: [PATCH v10 08/12] Documentation: add a new file documenting multigrain timestamps Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20241002-mgtime-v10-8-d1c4717f5284@kernel.org> References: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> In-Reply-To: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Randy Dunlap , Chandan Babu R , "Darrick J. 
Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=openpgp-sha256; l=7402; i=jlayton@kernel.org; h=from:subject:message-id; bh=qBtzxV6C3gNnpXcNznor+GUfZID0zTF7K3hXwvPKJ3M=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm/brA6I1dI8EG/U2Sb1UisGHQiXpPg4GwgYbr8 hjnuAbXthWJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZv26wAAKCRAADmhBGVaC FV/tD/9X5pKIVIniCMvGql/J6WLFzqLMJ1KN/xNGXflDxvxSCtFhy3Bk+wgy5xigEQNhlpabOFn LQ6qkRRkug8tMq2+YV3w5KKcCNUzqBJg51UTT2r48f95rUtSl+q3B5N79OsZ1HzK7fkCD99+xaO P5DiED3DEtbfEhbsGCRpRyjrpKWqS5dCYV+3ggmpHv5XXtmAJMicDBzW443AVoks3ua0yvhF7/v dpd86poXepdu9zB2qA1wm17Skp5Paejf+WJXi6ilJOoIjZYVmhmzs9p/u/XmvpWtFuOlnGN4Gou Is4gcsCS1Zbpocaf+FWUf4TZDeQ3tCqb7LlmMFHRcO9MWqKJaKiSeqkdlHcs56K5MPzYnvjq1Gl lXJJjYPdgzD6ZldrFYidpzx3sDvknnjR6lvvo7lbI6dU3QPIMx77R8vd5REqp9nuMZKx4xZXVPg V/vtq7O0C+m6q3YCj+PYh0EYBY4I5Q2YBCG6RlOR2pJUdwPZ731ke2R3gPtm9qp+6WBj4rdEymx oj2xDpllQcSOV4ieVhIL2b7GOsIObiy7LKiLmekwwpazRph8l9tHk0krQvvZw8cK93TUpfYK3VC Ll8T6GXHeXnAko/UaPT9+Xd/1TjOj7hVYYRKN+Id8tDiGX9GQPRy5BPJgcW2zD79/af/mEp1NiZ QYYMHRnTwY+vL7A== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 Add a high-level document that describes how multigrain timestamps work, rationale for them, and some info about implementation and tradeoffs. Reviewed-by: Josef Bacik Reviewed-by: Darrick J. 
Wong
Reviewed-by: Randy Dunlap
Reviewed-by: Jan Kara
Tested-by: Randy Dunlap # documentation bits
Signed-off-by: Jeff Layton
---
 Documentation/filesystems/index.rst         |   1 +
 Documentation/filesystems/multigrain-ts.rst | 125 ++++++++++++++++++++++++++++
 2 files changed, 126 insertions(+)

diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst
index e8e496d23e1dd5b523889159b464d7adf5d5c30a..44e9e77ffe0d4b9c85f9921190d33dfd21acff8f 100644
--- a/Documentation/filesystems/index.rst
+++ b/Documentation/filesystems/index.rst
@@ -29,6 +29,7 @@ algorithms work.
    fiemap
    files
    locks
+   multigrain-ts
    mount_api
    quota
    seq_file
diff --git a/Documentation/filesystems/multigrain-ts.rst b/Documentation/filesystems/multigrain-ts.rst
new file mode 100644
index 0000000000000000000000000000000000000000..c779e47284e80f54ad9fc8a6a0b03228dbbf3d59
--- /dev/null
+++ b/Documentation/filesystems/multigrain-ts.rst
@@ -0,0 +1,125 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================
+Multigrain Timestamps
+=====================
+
+Introduction
+============
+Historically, the kernel has always used coarse time values to stamp inodes.
+This value is updated every jiffy, so any change that happens within that jiffy
+will end up with the same timestamp.
+
+When the kernel goes to stamp an inode (due to a read or write), it first gets
+the current time and then compares it to the existing timestamp(s) to see
+whether anything will change. If nothing changed, then it can avoid updating
+the inode's metadata.
+
+Coarse timestamps are therefore good from a performance standpoint, since they
+reduce the need for metadata updates, but bad from the standpoint of
+determining whether anything has changed, since a lot of things can happen in a
+jiffy.
+
+They are particularly troublesome with NFSv3, where unchanging timestamps can
+make it difficult to tell whether to invalidate caches. NFSv4 provides a
+dedicated change attribute that should always show a visible change, but not
+all filesystems implement this properly, causing the NFS server to substitute
+the ctime in many cases.
+
+Multigrain timestamps aim to remedy this by selectively using fine-grained
+timestamps when a file has had its timestamps queried recently, and the current
+coarse-grained time does not cause a change.
+
+Inode Timestamps
+================
+There are currently 3 timestamps in the inode that are updated to the current
+wallclock time on different activity:
+
+ctime:
+  The inode change time. This is stamped with the current time whenever
+  the inode's metadata is changed. Note that this value is not settable
+  from userland.
+
+mtime:
+  The inode modification time. This is stamped with the current time
+  any time a file's contents change.
+
+atime:
+  The inode access time. This is stamped whenever an inode's contents are
+  read. Widely considered to be a terrible mistake. Usually avoided with
+  options like noatime or relatime.
+
+Updating the mtime always implies a change to the ctime, but updating the
+atime due to a read request does not.
+
+Multigrain timestamps are only tracked for the ctime and the mtime. atimes are
+not affected and always use the coarse-grained value (subject to the floor).
+
+Inode Timestamp Ordering
+========================
+
+In addition to just providing info about changes to individual files, file
+timestamps also serve an important purpose in applications like "make". These
+programs measure timestamps in order to determine whether source files might be
+newer than cached objects.
+
+Userland applications like make can only determine ordering based on
+operational boundaries. For a syscall those are the syscall entry and exit
+points. For io_uring or nfsd operations, that's the request submission and
+response. In the case of concurrent operations, userland can make no
+determination about the order in which things will occur.
+
+For instance, if a single thread modifies one file, and then another file in
+sequence, the second file must show an equal or later mtime than the first. The
+same is true if two threads are issuing similar operations that do not overlap
+in time.
+
+If however, two threads have racing syscalls that overlap in time, then there
+is no such guarantee, and the second file may appear to have been modified
+before, after or at the same time as the first, regardless of which one was
+submitted first.
+
+Note that the above assumes that the system doesn't experience a backward jump
+of the realtime clock. If that occurs at an inopportune time, then timestamps
+can appear to go backward, even on a properly functioning system.
+
+Multigrain Timestamp Implementation
+===================================
+Multigrain timestamps are aimed at ensuring that changes to a single file are
+always recognizable, without violating the ordering guarantees when multiple
+different files are modified. This affects the mtime and the ctime, but the
+atime will always use coarse-grained timestamps.
+
+It uses an unused bit in the i_ctime_nsec field to indicate whether the mtime
+or ctime has been queried. If either or both have, then the kernel takes
+special care to ensure the next timestamp update will display a visible change.
+This ensures tight cache coherency for use-cases like NFS, without sacrificing
+the benefits of reduced metadata updates when files aren't being watched.
+
+The Ctime Floor Value
+=====================
+It's not sufficient to simply use fine or coarse-grained timestamps based on
+whether the mtime or ctime has been queried. A file could get a fine grained
+timestamp, and then a second file modified later could get a coarse-grained one
+that appears earlier than the first, which would break the kernel's timestamp
+ordering guarantees.
+
+To mitigate this problem, maintain a global floor value that ensures that
+this can't happen. The two files in the above example may appear to have been
+modified at the same time in such a case, but they will never show the reverse
+order. To avoid problems with realtime clock jumps, the floor is managed as a
+monotonic ktime_t, and the values are converted to realtime clock values as
+needed.
+
+Implementation Notes
+====================
+Multigrain timestamps are intended for use by local filesystems that get
+ctime values from the local clock. This is in contrast to network filesystems
+and the like that just mirror timestamp values from a server.
+
+For most filesystems, it's sufficient to just set the FS_MGTIME flag in the
+fstype->fs_flags in order to opt-in, providing the ctime is only ever set via
+inode_set_ctime_current(). If the filesystem has a ->getattr routine that
+doesn't call generic_fillattr, then it should call fill_mg_cmtime() to
+fill those values. For setattr, it should use setattr_copy() to update the
+timestamps, or otherwise mimic its behavior.
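The interaction between the queried flag and the floor value described in the
document above can be sketched in userspace C. This is only a minimal
simulation under stated assumptions, not the kernel implementation: the names
sim_clock, sim_inode, sim_coarse_time(), sim_fine_time(), sim_getattr() and
sim_update_ctime(), the 4 ms tick, and the plain nanosecond counters are all
hypothetical. The real code works on i_ctime_nsec with atomic operations and
keeps the floor as a monotonic ktime_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tick length: the coarse clock only advances every 4 ms. */
#define SIM_TICK_NS 4000000ULL

/* Simulated time source; both fields are illustrative, not kernel state. */
struct sim_clock {
	uint64_t fine_ns;	/* "true" current time, nanosecond resolution */
	uint64_t floor_ns;	/* latest fine-grained value handed out so far */
};

/* Simulated inode: one timestamp plus a "has been queried" marker. */
struct sim_inode {
	uint64_t ctime_ns;
	bool queried;
};

/* Coarse time: fine time rounded down to the tick, but never below the floor. */
static uint64_t sim_coarse_time(const struct sim_clock *clk)
{
	uint64_t coarse = clk->fine_ns - (clk->fine_ns % SIM_TICK_NS);

	return coarse > clk->floor_ns ? coarse : clk->floor_ns;
}

/* Fine time: hand out the precise value and raise the floor to match it. */
static uint64_t sim_fine_time(struct sim_clock *clk)
{
	/* The simulated clock is assumed never to run backward. */
	assert(clk->fine_ns >= clk->floor_ns);
	clk->floor_ns = clk->fine_ns;
	return clk->fine_ns;
}

/* Reading the timestamp marks it as queried, as a getattr would. */
static uint64_t sim_getattr(struct sim_inode *inode)
{
	inode->queried = true;
	return inode->ctime_ns;
}

/*
 * Stamp the inode. The coarse value is used unless the old timestamp was
 * queried and the coarse value would not show a visible change.
 */
static uint64_t sim_update_ctime(struct sim_clock *clk, struct sim_inode *inode)
{
	uint64_t now = sim_coarse_time(clk);

	if (inode->queried && now <= inode->ctime_ns)
		now = sim_fine_time(clk);

	inode->ctime_ns = now;
	inode->queried = false;
	return now;
}
```

As a usage example, start the simulated clock at 10 ms and stamp inode A: it
gets the coarse value, 8 ms. If A is then queried and stamped again 1 us
later, the coarse value has not moved, so A receives the fine-grained value
and the floor rises to it. A second, never-queried inode B stamped 1 us after
that gets the coarse value, which the floor lifts from 8 ms to A's timestamp,
so B never appears older than A.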
--=20 2.46.2 From nobody Thu Nov 28 07:42:13 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 79B3822080F; Wed, 2 Oct 2024 21:27:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904476; cv=none; b=JzL26qBLm0kt18iTokvi1pRE67SY75XdCkp1Q0o7aILhjsFhMg5ZfAueu9dQlu9ysNdrOJe1KvC23hi1jmEef2eLTxr/fHC5xFs4UbjfmYQrZ5d/qhvVDGZxks5i/ShGcq9P9nN9nvKx4w1jR7+oEvDUGkkjOTOg1E82E8ZXW9I= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904476; c=relaxed/simple; bh=9HLO3XpJ6HpHWAPYsUm3tCO/nrjDoz9ag2++QNPQia4=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=RsYSSR33790YWUAJathr50UUxnydwwAX/mEO+IDco/TICYB1ZA1nDZO7q2E8ch1UDg0FPK/4OUQcBu14Vvqt3xLl/LRZXZnRnJGuuFzEKoCskfpgOp/DRz5d1AD07KnwcP3BuZxlvJpb8pEdLntsqRcqg7/cEoNWmb4Ty+unwDU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=uXWl4i2t; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="uXWl4i2t" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 914A9C4CEE1; Wed, 2 Oct 2024 21:27:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727904476; bh=9HLO3XpJ6HpHWAPYsUm3tCO/nrjDoz9ag2++QNPQia4=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=uXWl4i2tOgGIMLmuU2mUUroSOWL29/JUAjRh0jzUJWdMog/Qki/kBoa/LFSdc2CUU pbqa6d6QQFcznz/kcCHUnfBl+25eDBB/bQYt/vvfh8OZEglDW3g8h5YdVUtCjUJk7b UfrPAvDtVdFWByOtLKzs+YcAcV75KOpKC8mHwHCuEW4JgLL7C+I/NtrpeJleBPEiz5 
tR6gVzMvmjn5B2Iz5vobADmFD3VOCf+5OLU0HnF8a2ppRPEpGzBJqgqgBs1vpr/i/h xH5/4CE0eT8F0g38OYLaDbrwU3mpOZ+7mD6rmrfrTEeZSTGIi2IjWjnuNmmcOsesiL oPkUk4nFRh92A== From: Jeff Layton Date: Wed, 02 Oct 2024 17:27:24 -0400 Subject: [PATCH v10 09/12] xfs: switch to multigrain timestamps Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20241002-mgtime-v10-9-d1c4717f5284@kernel.org> References: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> In-Reply-To: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Randy Dunlap , Chandan Babu R , "Darrick J. Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=openpgp-sha256; l=3180; i=jlayton@kernel.org; h=from:subject:message-id; bh=9HLO3XpJ6HpHWAPYsUm3tCO/nrjDoz9ag2++QNPQia4=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm/brAZIqVIJKeyaxoHylxGxKgRPgUJZI1VhJRj ZnV5Y921g6JAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZv26wAAKCRAADmhBGVaC FUffEACv5L7rnY+EMR497bJG5/8S/s1wJUfmdxaoCDG+LnQbH78d++nzXjCVGoIfX/y7IJQDIfn gM0wlLMYm/efC/7UmR9O/T9jiLQgBrRtaYaPLRcn3puzRSl5C8bOeZ0l3/jOe+501U/1OnL556c FyiMZQsOWor4MSyzHHPbU1rY7gk549eNccvj+67f3WpNM6Zn9CHM7dUv3QxvweREeQo3FRm0+jN CV6uf/GSmzRs8LBhKUpsiqMSBi1unsyQTPnOZBDWdo79OU8tsyUq0vWEOxYGMunxNJNyX/lfjNC hXAZAgp1zWvKl/M1z2GlAIq6y2iez2Vi7H6zrizbo8maTFOf/4wb+de4PNAPQoYK/BjVGiPcSpz 
uRhGofidblAaoZQ8LjepNkc6jtOKrOMd4qzluTWZ3TFu6JMP2hbkupUV/0i4NiRpOzHFUPxIHcT tM+QAnh8ULcL9xo1NVhsl0FZ/l9s55MnO3A/QoR/FMr3JiT/d6j2t3Adtn4FVTk8VxZjJMldfdw IWbxlJEjfMlYoSmyYGh514J3VI0jNyFV1sAi5fJaZBzZRDVToqGzcd6wQa7KavYEMC2FjZwhfuB TNkWp8Rnabq2rXR4Wq7krJUS3NYKYM1WY6gRph1rGxF1TfhvTaOucrIB1uPPyWY9Y3fdv6EPg0N rjHopf6H5bbG1NQ== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 Enable multigrain timestamps, which should ensure that there is an apparent change to the timestamp whenever it has been written after being actively observed via getattr. Also, anytime the mtime changes, the ctime must also change, and those are now the only two options for xfs_trans_ichgtime. Have that function unconditionally bump the ctime, and ASSERT that XFS_ICHGTIME_CHG is always set. Finally, stop setting STATX_CHANGE_COOKIE in getattr, since the ctime should give us better semantics now. Reviewed-by: Josef Bacik Reviewed-by: Darrick J. Wong Tested-by: Randy Dunlap # documentation bits Signed-off-by: Jeff Layton --- fs/xfs/libxfs/xfs_trans_inode.c | 6 +++--- fs/xfs/xfs_iops.c | 10 +++------- fs/xfs/xfs_super.c | 2 +- 3 files changed, 7 insertions(+), 11 deletions(-) diff --git a/fs/xfs/libxfs/xfs_trans_inode.c b/fs/xfs/libxfs/xfs_trans_inod= e.c index 3c40f37e82c73cf871bb252553331b60c6b1973b..c962ad64b0c10058c1e2eecf664= fbc67ec7302f2 100644 --- a/fs/xfs/libxfs/xfs_trans_inode.c +++ b/fs/xfs/libxfs/xfs_trans_inode.c @@ -62,12 +62,12 @@ xfs_trans_ichgtime( ASSERT(tp); xfs_assert_ilocked(ip, XFS_ILOCK_EXCL); =20 - tv =3D current_time(inode); + /* If the mtime changes, then ctime must also change */ + ASSERT(flags & XFS_ICHGTIME_CHG); =20 + tv =3D inode_set_ctime_current(inode); if (flags & XFS_ICHGTIME_MOD) inode_set_mtime_to_ts(inode, tv); - if (flags & XFS_ICHGTIME_CHG) - inode_set_ctime_to_ts(inode, tv); if (flags & XFS_ICHGTIME_ACCESS) inode_set_atime_to_ts(inode, tv); if (flags & XFS_ICHGTIME_CREATE) diff --git a/fs/xfs/xfs_iops.c 
b/fs/xfs/xfs_iops.c index 1cdc8034f54d93f4a40ab3e3e4f91c6c9dfed7ec..a1c4a350a6dbfd19028ccecdfdb= 271879f769ccb 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -597,8 +597,9 @@ xfs_vn_getattr( stat->gid =3D vfsgid_into_kgid(vfsgid); stat->ino =3D ip->i_ino; stat->atime =3D inode_get_atime(inode); - stat->mtime =3D inode_get_mtime(inode); - stat->ctime =3D inode_get_ctime(inode); + + fill_mg_cmtime(stat, request_mask, inode); + stat->blocks =3D XFS_FSB_TO_BB(mp, ip->i_nblocks + ip->i_delayed_blks); =20 if (xfs_has_v3inodes(mp)) { @@ -608,11 +609,6 @@ xfs_vn_getattr( } } =20 - if ((request_mask & STATX_CHANGE_COOKIE) && IS_I_VERSION(inode)) { - stat->change_cookie =3D inode_query_iversion(inode); - stat->result_mask |=3D STATX_CHANGE_COOKIE; - } - /* * Note: If you add another clause to set an attribute flag, please * update attributes_mask below. diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index 27e9f749c4c7fc75c3385aba2e02ac9fc5d1719d..210481b03fdb48fd50e9a7a109d= 8bcee0e7e3a29 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -2052,7 +2052,7 @@ static struct file_system_type xfs_fs_type =3D { .init_fs_context =3D xfs_init_fs_context, .parameters =3D xfs_fs_parameters, .kill_sb =3D xfs_kill_sb, - .fs_flags =3D FS_REQUIRES_DEV | FS_ALLOW_IDMAP, + .fs_flags =3D FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_MGTIME, }; MODULE_ALIAS_FS("xfs"); =20 --=20 2.46.2 From nobody Thu Nov 28 07:42:13 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3E87C2216A8; Wed, 2 Oct 2024 21:27:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904479; cv=none; 
b=YzqgHD9/qXzZbTQLe+xNiiUVO6nOuJHDAhjg2TEQTyuoYb4MolW5yZ33i1Nfdp/ccrC97x9gLR+SxyX4xM5rXgfl/EO6e9iSfi5QR5mCepju1ujL+Zr8ub8784Y8w+9oKlba8TM4psdauWaXQSDIkWRtf9yRqEyA304nqODArR4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904479; c=relaxed/simple; bh=L8jPTqE/MEdr2Fk0QAaUZrk7fJPp2DBX4nFI13DAUpM=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=g1Tpfetz7U47DHxJaAP3hZZ077UNKOsf9ehq6Q7M2QvpH8kQB1GAuw0NyA3OBMsCH5AuZE2sBClz2ezLFNuJSqE5daOd5R+36r4ajWQbjtxm4lwhcluU8JCMS8utcqjkxZfq41rgHexoG6/GPXtsYyghXU4bD6YMDlnkKnxJI44= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=IRUjz9Pt; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="IRUjz9Pt" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 53ECAC4CED8; Wed, 2 Oct 2024 21:27:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727904478; bh=L8jPTqE/MEdr2Fk0QAaUZrk7fJPp2DBX4nFI13DAUpM=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=IRUjz9PtfkWUJi0e9T3IPCCoMESom7iKKJ3CLgq48OxKmXWJ5/EJvSKXnIrqyYPEy cXYB9JoxiwOFWnbShCRYw+L9iXqXp5S4zliQGT+jN/yU8thuWASv7U7Bjhvz9EGmmY HZlWksb1N48laxkgzaACcRF3LuZjruytZTUmrSpfwJaq8yY3L45la4hbhUzL5Ah6uF VaXG+vGAJSrI/YGDgGYKh08C3Vw0bA212OWgDbBCgSBD5cH8kPXgryCalNn7Si1cCg Dnf3IyUzLmz1xed8h41cSfmi79GmS28hKsMSkI0kkEbKETd5KxDP/1sMOPF8c1QCAs uVAvAUbwCSgcA== From: Jeff Layton Date: Wed, 02 Oct 2024 17:27:25 -0400 Subject: [PATCH v10 10/12] ext4: switch to multigrain timestamps Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20241002-mgtime-v10-10-d1c4717f5284@kernel.org> References: 
<20241002-mgtime-v10-0-d1c4717f5284@kernel.org> In-Reply-To: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Randy Dunlap , Chandan Babu R , "Darrick J. Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=openpgp-sha256; l=1052; i=jlayton@kernel.org; h=from:subject:message-id; bh=L8jPTqE/MEdr2Fk0QAaUZrk7fJPp2DBX4nFI13DAUpM=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm/brAty56j/NgjClfQP+YgB3escq8CktdBFW2L wyBzPE0QpeJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZv26wAAKCRAADmhBGVaC Fa0aEADUq4geawXWbT0XKiWVNDe5HmCEhbMP2s1/8FXe/A/d6dLhUu7Fit4zb0KvCeOllM/wApx olJNRraOJVZhluYwJX2+A/BfzNrIuGKFiuZKLpkGaKHDg7fimaX1ks2NC8uUO6v/H+MdnvezKx9 gJgvYbPPFJamQt4dC4+RerijvvEQXqyH0HLG49AFR1wty4o8scKEOh9I/tamgS2nUAy46SrtrR6 B+1XmVEtAcemRl3vx8XR2uOGwnGXjQkvw5q2EZryLqOJWHkdmoKckir5tPLfypMc4K8dGev2LcJ gZEhdTe0QOMpFsJ3YrYPySffq9mPRIgNCePbgr4c8Devtb7WMalBfhydHOfRxIzRjC8zz87RNPZ 7bSv0sZtCIwqdawRdMm6GN7iWgCjikImQMy+DBPfaqKkI730o5c+bLd4PZbDWKhm8gDD8hno5Qc WKqliyfZXKshMun31q93qXnviYb3u0Wy6bTHh+yqG/cWD8pi29uSoIx3TJVx0uNeRC+NcgXDse4 z585K1g2IEydh2r+rO5LBZp1Oi4Efe6rOoVVfceCo+FeYE9wY7u3wO/EaORyxVol55Lkxf/Unqr eUknq/qzqTcdoUs+QmZyJeUQIWD/OmqQPzyVVWdMM23OQPgLi9nk1BQB6QzYVf18et5QtWHndAQ LsWTMxYdS9V5lYA== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 Enable multigrain timestamps, which should ensure that there is an apparent change to the timestamp whenever it has been written after 
being actively observed via getattr. For ext4, we only need to enable the FS_MGTIME flag. Reviewed-by: Josef Bacik Reviewed-by: Jan Kara Tested-by: Randy Dunlap # documentation bits Signed-off-by: Jeff Layton --- fs/ext4/super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ext4/super.c b/fs/ext4/super.c index e72145c4ae5a071cf4a809d0519a01a8fb84dc2d..a125d9435b8a1c8f7a96a2a0bdd= 9ce1b4671f8a2 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -7298,7 +7298,7 @@ static struct file_system_type ext4_fs_type =3D { .init_fs_context =3D ext4_init_fs_context, .parameters =3D ext4_param_specs, .kill_sb =3D ext4_kill_sb, - .fs_flags =3D FS_REQUIRES_DEV | FS_ALLOW_IDMAP, + .fs_flags =3D FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_MGTIME, }; MODULE_ALIAS_FS("ext4"); =20 --=20 2.46.2 From nobody Thu Nov 28 07:42:13 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EEE19217908; Wed, 2 Oct 2024 21:28:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904482; cv=none; b=nIQARWfsdZP3EiN7ytURebtBqAncvkQK0uPBcPCzq3DjPJJck023C9RzY1JERib3QEu+kkkrd3WipGBAFKB13WNjNrUIa0PZMropJ88OQ7D4+EurUGbGdYR40KQM1qa7AcfRdvaczbWv1zqVrmIhiGDapRdM4cUp6JnpY2UREeE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904482; c=relaxed/simple; bh=zObmtftqo+OaaHLBavfM0NkNtNo0uG0b222Sa7Qmh1I=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=bnnDMQlY4WWNKPNCJq04BnEzOhNQQX40fjSXzFTigFSqsdelbQ8aLayvbUmjVmTLSyQ7WSekHla76QjanYjiVolJG8mZJW3o/13ey0q6ZghoaVvTJJnocqiNMJyULDfKYf4CmN65fSaIupAdP6LW5ytWH4TUbMYQDm370knG9bc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass 
(2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=gfzEwwKw; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="gfzEwwKw" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 166D1C4CEDE; Wed, 2 Oct 2024 21:27:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727904481; bh=zObmtftqo+OaaHLBavfM0NkNtNo0uG0b222Sa7Qmh1I=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=gfzEwwKwlXuv/icWZdNJWvW+FUy1Vf/0lRfHCezZJdLWd0bObWLG0v+pUO5SPBiLc sdSKXhrmmr7gR7EG/kMlBwBOIXDFpEUY1Q5UnAq0VpOtOMYAK/HuPc2D4OPjxAG28z wX/lhPGtRunaig0jh6Zkqwu0w7i0F86Tdp4KkfnTk2SejE6O+R93qb58TgPsiqWeYO 4cRuZunzYUopQWD7qcAnyzodtqqFqRj3H1Lr00OGp4ASB/P7SjZpWKYcoFHEDC3k+9 DSkuNUfOrYYmTrchTlnoIigRhbdJizzR0WJ2zW6WwSQ7jNNLL1jDQ9z50Wyi2OvURn fBlphFgrhKdLg== From: Jeff Layton Date: Wed, 02 Oct 2024 17:27:26 -0400 Subject: [PATCH v10 11/12] btrfs: convert to multigrain timestamps Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20241002-mgtime-v10-11-d1c4717f5284@kernel.org> References: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> In-Reply-To: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Randy Dunlap , Chandan Babu R , "Darrick J. 
Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=openpgp-sha256; l=2842; i=jlayton@kernel.org; h=from:subject:message-id; bh=zObmtftqo+OaaHLBavfM0NkNtNo0uG0b222Sa7Qmh1I=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm/brAIkH4ZhSMoZFwZdDHij4mmUSui26ZshlEx x57+iG3EBGJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZv26wAAKCRAADmhBGVaC FZD6D/9Sk5MXNcAbxZylsBMFOovUPgZGtFA1V7FLElPi8cEY7bFvXXBEGek1fgh/mw0fw+rRbXb oQFZRU2UfV6ph1kYWVPicQNpEenaSUhXiOXeXIwuelUAe+90WG0JpGTnnTafXP8n+UcVYlkKDiK /mMe1hk0JgbyzqOg5Y9B6BymtY2UqRP9IWF7jCUswdY5ZaiIitOJ/Zj/OTbBECAf7iWWahXA3Fz ruvOwvTybJlrz6JA4fJH1SeI+6rRl/7R/0/SsmczCVpNWOwDJYqckWGX3iI9zjxRoaxctha6TQ+ xAp3b9jijuSBtQ6buORLmDHGJw9REll+c4DNtfZFRHriVBEvcsunWRnHasihG90eKOjU14SL9MC 1ZvySYOHnxGz8tns2wmWZWeRihy1XziiKNCmiRBHl7fyT8EAyQRBuFrLeSMukQtA9aALZqhe1lc zncx27VXeFwy6QXLRi9wyrhs9eKl9nDEw6c4P+31Qh9y6+S9EuVcWcrYTkTeFbkzDAlARXsRyrY PEVG5lg2NNHpKq6HoZ7OStLMuQUkfKvb4MjwAoZXtq+pp6pFT9u8CuPukYWWfM3eYLrp1L5gVLI IZgF/zLShS/XqVzCis+igDAtyWwVPGSgO3wwlQNWQozooZ2dLNaDCj9PeNhDU1CwuRYofGX5t57 qWW84x0PuUmo0Ng== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 Enable multigrain timestamps, which should ensure that there is an apparent change to the timestamp whenever it has been written after being actively observed via getattr. Beyond enabling the FS_MGTIME flag, this patch eliminates update_time_for_write, which goes to great pains to avoid in-memory stores. Just have it overwrite the timestamps unconditionally. 
Note that this also drops the IS_I_VERSION check and unconditionally
bumps the change attribute, since SB_I_VERSION is always set on btrfs.

Reviewed-by: Josef Bacik
Tested-by: Randy Dunlap # documentation bits
Signed-off-by: Jeff Layton
---
 fs/btrfs/file.c  | 25 ++++---------------------
 fs/btrfs/super.c |  3 ++-
 2 files changed, 6 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 2aeb8116549ca970432042a315f29d9e7fa00980..1656ad7498b8161ec94a2752b0ab7cb723fada1c 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1120,26 +1120,6 @@ void btrfs_check_nocow_unlock(struct btrfs_inode *inode)
 	btrfs_drew_write_unlock(&inode->root->snapshot_lock);
 }
 
-static void update_time_for_write(struct inode *inode)
-{
-	struct timespec64 now, ts;
-
-	if (IS_NOCMTIME(inode))
-		return;
-
-	now = current_time(inode);
-	ts = inode_get_mtime(inode);
-	if (!timespec64_equal(&ts, &now))
-		inode_set_mtime_to_ts(inode, now);
-
-	ts = inode_get_ctime(inode);
-	if (!timespec64_equal(&ts, &now))
-		inode_set_ctime_to_ts(inode, now);
-
-	if (IS_I_VERSION(inode))
-		inode_inc_iversion(inode);
-}
-
 int btrfs_write_check(struct kiocb *iocb, struct iov_iter *from, size_t count)
 {
 	struct file *file = iocb->ki_filp;
@@ -1170,7 +1150,10 @@ int btrfs_write_check(struct kiocb *iocb, struct iov_iter *from, size_t count)
 	 * need to start yet another transaction to update the inode as we will
 	 * update the inode when we finish writing whatever data we write.
	 */
-	update_time_for_write(inode);
+	if (!IS_NOCMTIME(inode)) {
+		inode_set_mtime_to_ts(inode, inode_set_ctime_current(inode));
+		inode_inc_iversion(inode);
+	}
 
 	start_pos = round_down(pos, fs_info->sectorsize);
 	oldsize = i_size_read(inode);
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 98fa0f382480a2a51420d586b1e2a2fa6c58d025..d423acfe11d0d1702ff1e17a87d11d65d3ce8cdb 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2198,7 +2198,8 @@ static struct file_system_type btrfs_fs_type = {
 	.init_fs_context = btrfs_init_fs_context,
 	.parameters = btrfs_fs_parameters,
 	.kill_sb = btrfs_kill_super,
-	.fs_flags = FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA | FS_ALLOW_IDMAP,
+	.fs_flags = FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA |
+		    FS_ALLOW_IDMAP | FS_MGTIME,
 };
 
 MODULE_ALIAS_FS("btrfs");
-- 
2.46.2

From nobody Thu Nov 28 07:42:13 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C77C52225D9; Wed, 2 Oct 2024 21:28:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904484; cv=none; b=l+HcEFYaQQjtnnmY4idEFq+BA2N2v64VfmPuR0bhwRoeLmSaTPcX61xw+SPsh29YXQfhFyFoIS+R6tIYMblxProIZPdVoIWzYi1WN4NyUkyYL+I/P7PgyuS3nQgmPNJtqEpyPpmKDJvz71daYNvL+5ghpHGROCrnxvIqLMC1C5A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727904484; c=relaxed/simple; bh=5/dsLO8cNmOaM6p0rv+Qvzx/xMvPBJnx6tvOBbPfMzs=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=dcUa7mhcQ8lr/TLJfNg75UgjQkB79jsi2EJOkQko4x4xekno8YFYkWMk/2/GnF0mWwndmohbkUNdaKh9iPHHg25rrgbHcfJXDr17yXTwURXhAwEUUUXFAAZKqOYfcB3Wlg6sP584izJ1fIVBH4+mwSO8TT9xxN6Okysx517og5c= ARC-Authentication-Results: i=1;
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=pY4qgNaa; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="pY4qgNaa" Received: by smtp.kernel.org (Postfix) with ESMTPSA id CF362C4CEE1; Wed, 2 Oct 2024 21:28:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727904484; bh=5/dsLO8cNmOaM6p0rv+Qvzx/xMvPBJnx6tvOBbPfMzs=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=pY4qgNaaCqezS32qjDZEbsRs2uGb8OTkbND/gQNC3DZDk0GQyrjGRmxHYO6BtyTWX EQfJp0qmbpEp8Y4fEBEdJjYyDC1X6D/lcrtX9zhRMFR6vrtcr9gEpaw9lqhsBV4dzt hc/hRRGOV1mrRGkkpHthnO5R0XB4+yg2XN5Y8PwXRYhYvmUUyN5X088MqLAETzHPqn O26reMyyJqwCWRtn9LB+2DwWyURABcR1Pa7VdYMKNk9ArrOhg9DFEh4PfCmVFPe9Be NqUMbmrsKxgNgWyLK1sCeoxlIVRcC4VfPN2PcY5JZrL6kxFEm9kVwZtbsg1V1uskh8 uQiYUVOkUhxhA== From: Jeff Layton Date: Wed, 02 Oct 2024 17:27:27 -0400 Subject: [PATCH v10 12/12] tmpfs: add support for multigrain timestamps Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20241002-mgtime-v10-12-d1c4717f5284@kernel.org> References: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> In-Reply-To: <20241002-mgtime-v10-0-d1c4717f5284@kernel.org> To: John Stultz , Thomas Gleixner , Stephen Boyd , Alexander Viro , Christian Brauner , Jan Kara , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Jonathan Corbet , Randy Dunlap , Chandan Babu R , "Darrick J. 
Wong" , Theodore Ts'o , Andreas Dilger , Chris Mason , Josef Bacik , David Sterba , Hugh Dickins , Andrew Morton , Chuck Lever , Vadim Fedorenko Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=openpgp-sha256; l=988; i=jlayton@kernel.org; h=from:subject:message-id; bh=5/dsLO8cNmOaM6p0rv+Qvzx/xMvPBJnx6tvOBbPfMzs=; b=owEBbQKS/ZANAwAIAQAOaEEZVoIVAcsmYgBm/brALjfB22SLtWrZvbq9dOIudVwmi0I1qA+aV n4HN6LIytCJAjMEAAEIAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCZv26wAAKCRAADmhBGVaC FRSFEACPohqXeTh7KgceuIiMgaiZvFEfTN6SD/E380ADOa3A/9s4jdNxUzvvIUF5TGO3RjuqvXp vQ39lbU+sN/EReWdsZdlcuP9wIJTEDPEo0Sd1UVdKEqRfvB7MXkHnFyhigbts8E4EeZm3vtO5Um dEueBuYGU/uHUQ+C6uWNdEqboX38koaVM6De2O+XxmDB25xapHTAlc+OjmcQvs6DpWxF5RNdpJg 84pcISszclA73/fk3T5S6O5wVNqceFy6Rj/wOaY0mSjy0XDKDJKGBWz7VJtKWZKTPmcLIh46ehW D3rEEFjv/THx5b+VU8ecQynhSjyfgPRiRgZ/ZFcxmYMlGzxi639VSAmecRyY4j1lwNLBwcaxP3U qfns+SkW0ZRePXxgunj1ox6ARPkr5OFR5Pa7/h0JkK0hjguHgiSpu5K82GofgSxXHpUFOv0Dkyj Ee32pebJG1fZRa4wWzJRSNqUvn+2uXCZvk35HWRS2uPOeNohiJVtbaJ34ZRn78lHec+pf5Z7ZJ5 GKT64vZ0wq2zEDEprVAk5g9Pid6BM0mn2dixzCz1QwYLKWJdWrcK7YfU6uGpmvh/UiXzYOPesTz zpKM7FScCuy994O5F+7EBot7sYRW87SFI2Nu0MwRE5tPQdBY/CHLfBwZP8GTD2ThHGIm8ZnGudD XIxJwAEylXRaCkQ== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 Enable multigrain timestamps, which should ensure that there is an apparent change to the timestamp whenever it has been written after being actively observed via getattr. tmpfs only requires the FS_MGTIME flag. 
Reviewed-by: Josef Bacik
Reviewed-by: Jan Kara
Tested-by: Randy Dunlap # documentation bits
Signed-off-by: Jeff Layton
---
 mm/shmem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 5a77acf6ac6a621dc7b5e7b46402b2b714b45bea..5f17eaaa32e2902228be7b245c5b3b11c5fb6a56 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -4804,7 +4804,7 @@ static struct file_system_type shmem_fs_type = {
 	.parameters = shmem_fs_parameters,
 #endif
 	.kill_sb = kill_litter_super,
-	.fs_flags = FS_USERNS_MOUNT | FS_ALLOW_IDMAP,
+	.fs_flags = FS_USERNS_MOUNT | FS_ALLOW_IDMAP | FS_MGTIME,
 };
 
 void __init shmem_init(void)
-- 
2.46.2