[PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"

Fei Lv posted 1 patch 1 year, 6 months ago
Documentation/filesystems/overlayfs.rst | 39 +++++++++++++++++++++++++
fs/overlayfs/copy_up.c                  | 18 ++++++++++++
fs/overlayfs/ovl_entry.h                | 20 +++++++++++--
fs/overlayfs/params.c                   | 33 ++++++++++++++++++---
fs/overlayfs/super.c                    |  2 +-
5 files changed, 105 insertions(+), 7 deletions(-)
[PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
Posted by Fei Lv 1 year, 6 months ago
For upper filesystem which does not enforce ordering on storing of
metadata changes(e.g. ubifs), when overlayfs file is modified for
the first time, copy up will create a copy of the lower file and
its parent directories in the upper layer. Permission lost of the
new upper parent directory was observed during power-cut stress test.

Fix by adding new mount opion "fsync=strict", make sure data/metadata of
copied up directory written to disk before renaming from tmp to final
destination.

Signed-off-by: Fei Lv <feilv@asrmicro.com>
---
V1 -> V2:
 1. change open flags from "O_LARGEFILE | O_WRONLY" to "O_RDONLY".
 2. change mount option to "fsync=ordered/strict/volatile".
 3. ovl_should_sync_strict() implies ovl_should_sync().
 4. remove redundant ovl_should_sync_strict from ovl_copy_up_meta_inode_data.
 5. update commit log.
 6. update documentation overlayfs.rst.

 Documentation/filesystems/overlayfs.rst | 39 +++++++++++++++++++++++++
 fs/overlayfs/copy_up.c                  | 18 ++++++++++++
 fs/overlayfs/ovl_entry.h                | 20 +++++++++++--
 fs/overlayfs/params.c                   | 33 ++++++++++++++++++---
 fs/overlayfs/super.c                    |  2 +-
 5 files changed, 105 insertions(+), 7 deletions(-)

diff --git a/Documentation/filesystems/overlayfs.rst b/Documentation/filesystems/overlayfs.rst
index 165514401441..a783e57bdb57 100644
--- a/Documentation/filesystems/overlayfs.rst
+++ b/Documentation/filesystems/overlayfs.rst
@@ -742,6 +742,45 @@ controlled by the "uuid" mount option, which supports these values:
     mounted with "uuid=on".
 
 
+Durability and copy up
+----------------------
+
+The fsync(2) and fdatasync(2) system calls ensure that the metadata and
+data of a file, respectively, are safely written to the backing
+storage, which is expected to guarantee the existence of the information post
+system crash.
+
+Without the fdatasync(2) call, there is no guarantee that the observed
+data after a system crash will be either the old or the new data, but
+in practice, the observed data after crash is often the old or new data or a
+mix of both.
+
+When overlayfs file is modified for the first time, copy up will create
+a copy of the lower file and its parent directories in the upper layer.
+In case of a system crash, if fdatasync(2) was not called after the
+modification, the upper file could end up with no data at all (i.e.
+zeros), which would be an unusual outcome.  To avoid this experience,
+overlayfs calls fsync(2) on the upper file before completing the copy up with
+rename(2) to make the copy up "atomic".
+
+Depending on the backing filesystem (e.g. ubifs), fsync(2) before
+rename(2) may not be enough to provide the "atomic" copy up behavior
+and fsync(2) on the copied up parent directories is required as well.
+
+Overlayfs can be tuned to prefer performance or durability when storing
+to the underlying upper layer.  This is controlled by the "fsync" mount
+option, which supports these values:
+
+- "ordered": (default)
+    Call fsync(2) on upper file before completion of copy up.
+- "strict":
+    Call fsync(2) on upper file and directories before completion of copy up.
+- "volatile": [*]
+    Prefer performance over durability (see `Volatile mount`_)
+
+[*] The mount option "volatile" is an alias to "fsync=volatile".
+
+
 Volatile mount
 --------------
 
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index a5ef2005a2cc..d99a18afceb8 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -243,6 +243,21 @@ static int ovl_verify_area(loff_t pos, loff_t pos2, loff_t len, loff_t totlen)
 	return 0;
 }
 
+static int ovl_copy_up_sync(struct path *path)
+{
+	struct file *new_file;
+	int err;
+
+	new_file = ovl_path_open(path, O_RDONLY);
+	if (IS_ERR(new_file))
+		return PTR_ERR(new_file);
+
+	err = vfs_fsync(new_file, 0);
+	fput(new_file);
+
+	return err;
+}
+
 static int ovl_copy_up_file(struct ovl_fs *ofs, struct dentry *dentry,
 			    struct file *new_file, loff_t len)
 {
@@ -701,6 +716,9 @@ static int ovl_copy_up_metadata(struct ovl_copy_up_ctx *c, struct dentry *temp)
 		err = ovl_set_attr(ofs, temp, &c->stat);
 	inode_unlock(temp->d_inode);
 
+	if (!err && ovl_should_sync_strict(ofs))
+		err = ovl_copy_up_sync(&upperpath);
+
 	return err;
 }
 
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index cb449ab310a7..7f6d2effd5f1 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -5,6 +5,12 @@
  * Copyright (C) 2016 Red Hat, Inc.
  */
 
+enum {
+	OVL_FSYNC_ORDERED,
+	OVL_FSYNC_STRICT,
+	OVL_FSYNC_VOLATILE,
+};
+
 struct ovl_config {
 	char *upperdir;
 	char *workdir;
@@ -18,7 +24,7 @@ struct ovl_config {
 	int xino;
 	bool metacopy;
 	bool userxattr;
-	bool ovl_volatile;
+	int fsync_mode;
 };
 
 struct ovl_sb {
@@ -120,7 +126,17 @@ static inline struct ovl_fs *OVL_FS(struct super_block *sb)
 
 static inline bool ovl_should_sync(struct ovl_fs *ofs)
 {
-	return !ofs->config.ovl_volatile;
+	return ofs->config.fsync_mode != OVL_FSYNC_VOLATILE;
+}
+
+static inline bool ovl_should_sync_strict(struct ovl_fs *ofs)
+{
+	return ofs->config.fsync_mode == OVL_FSYNC_STRICT;
+}
+
+static inline bool ovl_is_volatile(struct ovl_config *config)
+{
+	return config->fsync_mode == OVL_FSYNC_VOLATILE;
 }
 
 static inline unsigned int ovl_numlower(struct ovl_entry *oe)
diff --git a/fs/overlayfs/params.c b/fs/overlayfs/params.c
index 4860fcc4611b..c4aac288b7e0 100644
--- a/fs/overlayfs/params.c
+++ b/fs/overlayfs/params.c
@@ -58,6 +58,7 @@ enum ovl_opt {
 	Opt_xino,
 	Opt_metacopy,
 	Opt_verity,
+	Opt_fsync,
 	Opt_volatile,
 };
 
@@ -139,6 +140,23 @@ static int ovl_verity_mode_def(void)
 	return OVL_VERITY_OFF;
 }
 
+static const struct constant_table ovl_parameter_fsync[] = {
+	{ "ordered",	OVL_FSYNC_ORDERED  },
+	{ "strict",	OVL_FSYNC_STRICT   },
+	{ "volatile",	OVL_FSYNC_VOLATILE },
+	{}
+};
+
+static const char *ovl_fsync_mode(struct ovl_config *config)
+{
+	return ovl_parameter_fsync[config->fsync_mode].name;
+}
+
+static int ovl_fsync_mode_def(void)
+{
+	return OVL_FSYNC_ORDERED;
+}
+
 const struct fs_parameter_spec ovl_parameter_spec[] = {
 	fsparam_string_empty("lowerdir",    Opt_lowerdir),
 	fsparam_string("lowerdir+",         Opt_lowerdir_add),
@@ -154,6 +172,7 @@ const struct fs_parameter_spec ovl_parameter_spec[] = {
 	fsparam_enum("xino",                Opt_xino, ovl_parameter_xino),
 	fsparam_enum("metacopy",            Opt_metacopy, ovl_parameter_bool),
 	fsparam_enum("verity",              Opt_verity, ovl_parameter_verity),
+	fsparam_enum("fsync",               Opt_fsync, ovl_parameter_fsync),
 	fsparam_flag("volatile",            Opt_volatile),
 	{}
 };
@@ -617,8 +636,11 @@ static int ovl_parse_param(struct fs_context *fc, struct fs_parameter *param)
 	case Opt_verity:
 		config->verity_mode = result.uint_32;
 		break;
+	case Opt_fsync:
+		config->fsync_mode = result.uint_32;
+		break;
 	case Opt_volatile:
-		config->ovl_volatile = true;
+		config->fsync_mode = OVL_FSYNC_VOLATILE;
 		break;
 	case Opt_userxattr:
 		config->userxattr = true;
@@ -802,9 +824,9 @@ int ovl_fs_params_verify(const struct ovl_fs_context *ctx,
 		config->index = false;
 	}
 
-	if (!config->upperdir && config->ovl_volatile) {
+	if (!config->upperdir && ovl_is_volatile(config)) {
 		pr_info("option \"volatile\" is meaningless in a non-upper mount, ignoring it.\n");
-		config->ovl_volatile = false;
+		config->fsync_mode = ovl_fsync_mode_def();
 	}
 
 	if (!config->upperdir && config->uuid == OVL_UUID_ON) {
@@ -997,8 +1019,11 @@ int ovl_show_options(struct seq_file *m, struct dentry *dentry)
 	if (ofs->config.metacopy != ovl_metacopy_def)
 		seq_printf(m, ",metacopy=%s",
 			   ofs->config.metacopy ? "on" : "off");
-	if (ofs->config.ovl_volatile)
+	if (ovl_is_volatile(&ofs->config))
 		seq_puts(m, ",volatile");
+	else if (ofs->config.fsync_mode != ovl_fsync_mode_def())
+		seq_printf(m, ",fsync=%s",
+			   ovl_fsync_mode(&ofs->config));
 	if (ofs->config.userxattr)
 		seq_puts(m, ",userxattr");
 	if (ofs->config.verity_mode != ovl_verity_mode_def())
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 06a231970cb5..824cbcf40523 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -750,7 +750,7 @@ static int ovl_make_workdir(struct super_block *sb, struct ovl_fs *ofs,
 	 * For volatile mount, create a incompat/volatile/dirty file to keep
 	 * track of it.
 	 */
-	if (ofs->config.ovl_volatile) {
+	if (ovl_is_volatile(&ofs->config)) {
 		err = ovl_create_volatile_dirty(ofs);
 		if (err < 0) {
 			pr_err("Failed to create volatile/dirty file.\n");

base-commit: 0c3836482481200ead7b416ca80c68a29cfdaabd
-- 
2.45.2
Re: [PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
Posted by Amir Goldstein 1 year, 6 months ago
On Mon, Jul 22, 2024 at 1:14 PM Fei Lv <feilv@asrmicro.com> wrote:
>
> For upper filesystem which does not enforce ordering on storing of
> metadata changes(e.g. ubifs), when overlayfs file is modified for
> the first time, copy up will create a copy of the lower file and
> its parent directories in the upper layer. Permission lost of the
> new upper parent directory was observed during power-cut stress test.
>
> Fix by adding new mount opion "fsync=strict", make sure data/metadata of
> copied up directory written to disk before renaming from tmp to final
> destination.
>
> Signed-off-by: Fei Lv <feilv@asrmicro.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

but I'd also like to wait for an ACK from Miklos on this feature.

As for timing, we are in the middle of the merge window for 6.11-rc1,
so we have some time before this can be considered for 6.12.
I will be on vacation for most of this development cycle, so either
Miklos will be able to queue it for 6.12 or I may be able to do
near the end of the 6.11 cycle.

Thanks,
Amir.

> ---
> V1 -> V2:
>  1. change open flags from "O_LARGEFILE | O_WRONLY" to "O_RDONLY".
>  2. change mount option to "fsync=ordered/strict/volatile".
>  3. ovl_should_sync_strict() implies ovl_should_sync().
>  4. remove redundant ovl_should_sync_strict from ovl_copy_up_meta_inode_data.
>  5. update commit log.
>  6. update documentation overlayfs.rst.
>
>  Documentation/filesystems/overlayfs.rst | 39 +++++++++++++++++++++++++
>  fs/overlayfs/copy_up.c                  | 18 ++++++++++++
>  fs/overlayfs/ovl_entry.h                | 20 +++++++++++--
>  fs/overlayfs/params.c                   | 33 ++++++++++++++++++---
>  fs/overlayfs/super.c                    |  2 +-
>  5 files changed, 105 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/filesystems/overlayfs.rst b/Documentation/filesystems/overlayfs.rst
> index 165514401441..a783e57bdb57 100644
> --- a/Documentation/filesystems/overlayfs.rst
> +++ b/Documentation/filesystems/overlayfs.rst
> @@ -742,6 +742,45 @@ controlled by the "uuid" mount option, which supports these values:
>      mounted with "uuid=on".
>
>
> +Durability and copy up
> +----------------------
> +
> +The fsync(2) and fdatasync(2) system calls ensure that the metadata and
> +data of a file, respectively, are safely written to the backing
> +storage, which is expected to guarantee the existence of the information post
> +system crash.
> +
> +Without the fdatasync(2) call, there is no guarantee that the observed
> +data after a system crash will be either the old or the new data, but
> +in practice, the observed data after crash is often the old or new data or a
> +mix of both.
> +
> +When overlayfs file is modified for the first time, copy up will create
> +a copy of the lower file and its parent directories in the upper layer.
> +In case of a system crash, if fdatasync(2) was not called after the
> +modification, the upper file could end up with no data at all (i.e.
> +zeros), which would be an unusual outcome.  To avoid this experience,
> +overlayfs calls fsync(2) on the upper file before completing the copy up with
> +rename(2) to make the copy up "atomic".
> +
> +Depending on the backing filesystem (e.g. ubifs), fsync(2) before
> +rename(2) may not be enough to provide the "atomic" copy up behavior
> +and fsync(2) on the copied up parent directories is required as well.
> +
> +Overlayfs can be tuned to prefer performance or durability when storing
> +to the underlying upper layer.  This is controlled by the "fsync" mount
> +option, which supports these values:
> +
> +- "ordered": (default)
> +    Call fsync(2) on upper file before completion of copy up.
> +- "strict":
> +    Call fsync(2) on upper file and directories before completion of copy up.
> +- "volatile": [*]
> +    Prefer performance over durability (see `Volatile mount`_)
> +
> +[*] The mount option "volatile" is an alias to "fsync=volatile".
> +
> +
>  Volatile mount
>  --------------
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index a5ef2005a2cc..d99a18afceb8 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -243,6 +243,21 @@ static int ovl_verify_area(loff_t pos, loff_t pos2, loff_t len, loff_t totlen)
>         return 0;
>  }
>
> +static int ovl_copy_up_sync(struct path *path)
> +{
> +       struct file *new_file;
> +       int err;
> +
> +       new_file = ovl_path_open(path, O_RDONLY);
> +       if (IS_ERR(new_file))
> +               return PTR_ERR(new_file);
> +
> +       err = vfs_fsync(new_file, 0);
> +       fput(new_file);
> +
> +       return err;
> +}
> +
>  static int ovl_copy_up_file(struct ovl_fs *ofs, struct dentry *dentry,
>                             struct file *new_file, loff_t len)
>  {
> @@ -701,6 +716,9 @@ static int ovl_copy_up_metadata(struct ovl_copy_up_ctx *c, struct dentry *temp)
>                 err = ovl_set_attr(ofs, temp, &c->stat);
>         inode_unlock(temp->d_inode);
>
> +       if (!err && ovl_should_sync_strict(ofs))
> +               err = ovl_copy_up_sync(&upperpath);
> +
>         return err;
>  }
>
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index cb449ab310a7..7f6d2effd5f1 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -5,6 +5,12 @@
>   * Copyright (C) 2016 Red Hat, Inc.
>   */
>
> +enum {
> +       OVL_FSYNC_ORDERED,
> +       OVL_FSYNC_STRICT,
> +       OVL_FSYNC_VOLATILE,
> +};
> +
>  struct ovl_config {
>         char *upperdir;
>         char *workdir;
> @@ -18,7 +24,7 @@ struct ovl_config {
>         int xino;
>         bool metacopy;
>         bool userxattr;
> -       bool ovl_volatile;
> +       int fsync_mode;
>  };
>
>  struct ovl_sb {
> @@ -120,7 +126,17 @@ static inline struct ovl_fs *OVL_FS(struct super_block *sb)
>
>  static inline bool ovl_should_sync(struct ovl_fs *ofs)
>  {
> -       return !ofs->config.ovl_volatile;
> +       return ofs->config.fsync_mode != OVL_FSYNC_VOLATILE;
> +}
> +
> +static inline bool ovl_should_sync_strict(struct ovl_fs *ofs)
> +{
> +       return ofs->config.fsync_mode == OVL_FSYNC_STRICT;
> +}
> +
> +static inline bool ovl_is_volatile(struct ovl_config *config)
> +{
> +       return config->fsync_mode == OVL_FSYNC_VOLATILE;
>  }
>
>  static inline unsigned int ovl_numlower(struct ovl_entry *oe)
> diff --git a/fs/overlayfs/params.c b/fs/overlayfs/params.c
> index 4860fcc4611b..c4aac288b7e0 100644
> --- a/fs/overlayfs/params.c
> +++ b/fs/overlayfs/params.c
> @@ -58,6 +58,7 @@ enum ovl_opt {
>         Opt_xino,
>         Opt_metacopy,
>         Opt_verity,
> +       Opt_fsync,
>         Opt_volatile,
>  };
>
> @@ -139,6 +140,23 @@ static int ovl_verity_mode_def(void)
>         return OVL_VERITY_OFF;
>  }
>
> +static const struct constant_table ovl_parameter_fsync[] = {
> +       { "ordered",    OVL_FSYNC_ORDERED  },
> +       { "strict",     OVL_FSYNC_STRICT   },
> +       { "volatile",   OVL_FSYNC_VOLATILE },
> +       {}
> +};
> +
> +static const char *ovl_fsync_mode(struct ovl_config *config)
> +{
> +       return ovl_parameter_fsync[config->fsync_mode].name;
> +}
> +
> +static int ovl_fsync_mode_def(void)
> +{
> +       return OVL_FSYNC_ORDERED;
> +}
> +
>  const struct fs_parameter_spec ovl_parameter_spec[] = {
>         fsparam_string_empty("lowerdir",    Opt_lowerdir),
>         fsparam_string("lowerdir+",         Opt_lowerdir_add),
> @@ -154,6 +172,7 @@ const struct fs_parameter_spec ovl_parameter_spec[] = {
>         fsparam_enum("xino",                Opt_xino, ovl_parameter_xino),
>         fsparam_enum("metacopy",            Opt_metacopy, ovl_parameter_bool),
>         fsparam_enum("verity",              Opt_verity, ovl_parameter_verity),
> +       fsparam_enum("fsync",               Opt_fsync, ovl_parameter_fsync),
>         fsparam_flag("volatile",            Opt_volatile),
>         {}
>  };
> @@ -617,8 +636,11 @@ static int ovl_parse_param(struct fs_context *fc, struct fs_parameter *param)
>         case Opt_verity:
>                 config->verity_mode = result.uint_32;
>                 break;
> +       case Opt_fsync:
> +               config->fsync_mode = result.uint_32;
> +               break;
>         case Opt_volatile:
> -               config->ovl_volatile = true;
> +               config->fsync_mode = OVL_FSYNC_VOLATILE;
>                 break;
>         case Opt_userxattr:
>                 config->userxattr = true;
> @@ -802,9 +824,9 @@ int ovl_fs_params_verify(const struct ovl_fs_context *ctx,
>                 config->index = false;
>         }
>
> -       if (!config->upperdir && config->ovl_volatile) {
> +       if (!config->upperdir && ovl_is_volatile(config)) {
>                 pr_info("option \"volatile\" is meaningless in a non-upper mount, ignoring it.\n");
> -               config->ovl_volatile = false;
> +               config->fsync_mode = ovl_fsync_mode_def();
>         }
>
>         if (!config->upperdir && config->uuid == OVL_UUID_ON) {
> @@ -997,8 +1019,11 @@ int ovl_show_options(struct seq_file *m, struct dentry *dentry)
>         if (ofs->config.metacopy != ovl_metacopy_def)
>                 seq_printf(m, ",metacopy=%s",
>                            ofs->config.metacopy ? "on" : "off");
> -       if (ofs->config.ovl_volatile)
> +       if (ovl_is_volatile(&ofs->config))
>                 seq_puts(m, ",volatile");
> +       else if (ofs->config.fsync_mode != ovl_fsync_mode_def())
> +               seq_printf(m, ",fsync=%s",
> +                          ovl_fsync_mode(&ofs->config));
>         if (ofs->config.userxattr)
>                 seq_puts(m, ",userxattr");
>         if (ofs->config.verity_mode != ovl_verity_mode_def())
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 06a231970cb5..824cbcf40523 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -750,7 +750,7 @@ static int ovl_make_workdir(struct super_block *sb, struct ovl_fs *ofs,
>          * For volatile mount, create a incompat/volatile/dirty file to keep
>          * track of it.
>          */
> -       if (ofs->config.ovl_volatile) {
> +       if (ovl_is_volatile(&ofs->config)) {
>                 err = ovl_create_volatile_dirty(ofs);
>                 if (err < 0) {
>                         pr_err("Failed to create volatile/dirty file.\n");
>
> base-commit: 0c3836482481200ead7b416ca80c68a29cfdaabd
> --
> 2.45.2
>
Re: [PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
Posted by Miklos Szeredi 1 year, 5 months ago
On Mon, 22 Jul 2024 at 15:56, Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Mon, Jul 22, 2024 at 1:14 PM Fei Lv <feilv@asrmicro.com> wrote:
> >
> > For upper filesystem which does not enforce ordering on storing of
> > metadata changes(e.g. ubifs), when overlayfs file is modified for
> > the first time, copy up will create a copy of the lower file and
> > its parent directories in the upper layer. Permission lost of the
> > new upper parent directory was observed during power-cut stress test.
> >
> > Fix by adding new mount opion "fsync=strict", make sure data/metadata of
> > copied up directory written to disk before renaming from tmp to final
> > destination.
> >
> > Signed-off-by: Fei Lv <feilv@asrmicro.com>
>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>
> but I'd also like to wait for an ACK from Miklos on this feature.

I'm okay with this.  I'm a little confused about sync=strict mode,
since most copy ups will have vfs_fsync() called twice.  Is this what
we want, or could this be consolidated into a single fsync?

Also is it worth optimizing away the fsync on the directory in cases
the filesystem is well behaved?  Maybe we should just move the
vfs_fsync() call into ovl_copy_up_metadata() and omit the complexity
related to the additional mount option?

To me it feels that it shouldn't matter in terms of performance, but
if reports of performance regressions come in, we can still make this
optional.

Thanks,
Miklos
Re: [PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
Posted by Amir Goldstein 1 year, 5 months ago
On Mon, Aug 26, 2024 at 5:59 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Mon, 22 Jul 2024 at 15:56, Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Mon, Jul 22, 2024 at 1:14 PM Fei Lv <feilv@asrmicro.com> wrote:
> > >
> > > For upper filesystem which does not enforce ordering on storing of
> > > metadata changes(e.g. ubifs), when overlayfs file is modified for
> > > the first time, copy up will create a copy of the lower file and
> > > its parent directories in the upper layer. Permission lost of the
> > > new upper parent directory was observed during power-cut stress test.
> > >
> > > Fix by adding new mount opion "fsync=strict", make sure data/metadata of
> > > copied up directory written to disk before renaming from tmp to final
> > > destination.
> > >
> > > Signed-off-by: Fei Lv <feilv@asrmicro.com>
> >
> > Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> >
> > but I'd also like to wait for an ACK from Miklos on this feature.
>
> I'm okay with this.  I'm a little confused about sync=strict mode,
> since most copy ups will have vfs_fsync() called twice.  Is this what
> we want, or could this be consolidated into a single fsync?
>

Maybe it could, but remember that ubifs strict mode is the odd case
if we have an extra fsync for the odd case, I think code simplicity is
a more important factor.

> Also is it worth optimizing away the fsync on the directory in cases
> the filesystem is well behaved?  Maybe we should just move the
> vfs_fsync() call into ovl_copy_up_metadata() and omit the complexity
> related to the additional mount option?
>

Hmm. Maybe you are confused by the commit message that only mentions
fsync of the parent directory (same as the reported reproducer), but
the strict mode fsync also affects metacopy, not only parent dir copy up.

> To me it feels that it shouldn't matter in terms of performance, but
> if reports of performance regressions come in, we can still make this
> optional.
>

I think that the case of chown -R with metacopy is going to be terribly crippled
if every metacopy gets and fsync.

Thanks,
Amir.
Re: [PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
Posted by Amir Goldstein 1 year, 5 months ago
On Tue, Aug 27, 2024 at 10:46 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Mon, Aug 26, 2024 at 5:59 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Mon, 22 Jul 2024 at 15:56, Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Mon, Jul 22, 2024 at 1:14 PM Fei Lv <feilv@asrmicro.com> wrote:
> > > >
> > > > For upper filesystem which does not enforce ordering on storing of
> > > > metadata changes(e.g. ubifs), when overlayfs file is modified for
> > > > the first time, copy up will create a copy of the lower file and
> > > > its parent directories in the upper layer. Permission lost of the
> > > > new upper parent directory was observed during power-cut stress test.
> > > >
> > > > Fix by adding new mount opion "fsync=strict", make sure data/metadata of
> > > > copied up directory written to disk before renaming from tmp to final
> > > > destination.
> > > >
> > > > Signed-off-by: Fei Lv <feilv@asrmicro.com>
> > >
> > > Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> > >
> > > but I'd also like to wait for an ACK from Miklos on this feature.
> >
> > I'm okay with this.  I'm a little confused about sync=strict mode,
> > since most copy ups will have vfs_fsync() called twice.  Is this what
> > we want, or could this be consolidated into a single fsync?
> >
>
> Maybe it could, but remember that ubifs strict mode is the odd case
> if we have an extra fsync for the odd case, I think code simplicity is
> a more important factor.
>
> > Also is it worth optimizing away the fsync on the directory in cases
> > the filesystem is well behaved?  Maybe we should just move the
> > vfs_fsync() call into ovl_copy_up_metadata() and omit the complexity
> > related to the additional mount option?
> >

Maybe, but note that in ovl_copy_up_meta_inode_data(),
copy up of data still requires fsync and there is no call to
ovl_copy_up_metadata() in that code path, so trying to optimize
double fsync in all the code paths in going to be a PITA IMA
and not worth the trouble.

>
> Hmm. Maybe you are confused by the commit message that only mentions
> fsync of the parent directory (same as the reported reproducer), but
> the strict mode fsync also affects metacopy, not only parent dir copy up.
>
> > To me it feels that it shouldn't matter in terms of performance, but
> > if reports of performance regressions come in, we can still make this
> > optional.
> >
>
> I think that the case of chown -R with metacopy is going to be terribly crippled
> if every metacopy gets and fsync.
>

But maybe we can ignore crash safety of metacopy on ubifs, because
1. the ubifs users may not be using this feature
2. ubifs may be nice and takes care of ordering O_TMPFILE
    metadata updates before exposing the link

Then we can do the following:
IF (metacopy_enabled)
    fsync only in ovl_copy_up_file()
ELSE
    fsync only in ovl_copy_up_metadata()

Let me know what you think.

Thanks,
Amir.
Re: [PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
Posted by Miklos Szeredi 1 year, 5 months ago
On Thu, 29 Aug 2024 at 12:29, Amir Goldstein <amir73il@gmail.com> wrote:

> But maybe we can ignore crash safety of metacopy on ubifs, because
> 1. the ubifs users may not be using this feature
> 2. ubifs may be nice and takes care of ordering O_TMPFILE
>     metadata updates before exposing the link
>
> Then we can do the following:
> IF (metacopy_enabled)
>     fsync only in ovl_copy_up_file()
> ELSE
>     fsync only in ovl_copy_up_metadata()
>
> Let me know what you think.

Sounds like a good compromise.

Thanks,
Miklos
Re: [PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
Posted by Amir Goldstein 1 year, 5 months ago
On Thu, Aug 29, 2024 at 2:51 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Thu, 29 Aug 2024 at 12:29, Amir Goldstein <amir73il@gmail.com> wrote:
>
> > But maybe we can ignore crash safety of metacopy on ubifs, because
> > 1. the ubifs users may not be using this feature
> > 2. ubifs may be nice and takes care of ordering O_TMPFILE
> >     metadata updates before exposing the link
> >
> > Then we can do the following:
> > IF (metacopy_enabled)
> >     fsync only in ovl_copy_up_file()
> > ELSE
> >     fsync only in ovl_copy_up_metadata()
> >
> > Let me know what you think.
>
> Sounds like a good compromise.
>

Fei,

Could you please test the attached patch and confirm that your
use case does not depend on metacopy enabled?

In any case, I am holding on to your patch in case someone reports
a performance regression with this unconditional fsync approach.

Thanks,
Amir.
Re: [PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
Posted by Amir Goldstein 1 year, 5 months ago
On Thu, Aug 29, 2024 at 6:23 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Thu, Aug 29, 2024 at 2:51 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Thu, 29 Aug 2024 at 12:29, Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > > But maybe we can ignore crash safety of metacopy on ubifs, because
> > > 1. the ubifs users may not be using this feature
> > > 2. ubifs may be nice and takes care of ordering O_TMPFILE
> > >     metadata updates before exposing the link
> > >
> > > Then we can do the following:
> > > IF (metacopy_enabled)
> > >     fsync only in ovl_copy_up_file()
> > > ELSE
> > >     fsync only in ovl_copy_up_metadata()
> > >
> > > Let me know what you think.
> >
> > Sounds like a good compromise.
> >
>
> Fei,
>
> Could you please test the attached patch and confirm that your
> use case does not depend on metacopy enabled?
>
> In any case, I am holding on to your patch in case someone reports
> a performance regression with this unconditional fsync approach.
>

Well, it's a good thing that I took Miklois' advice to make the fsync
option implicit, because the original patch had 2 bugs detected by fstest:
1. missing O_LARGEFILE
2. trying to fsync special files

Please see uptodate patch at:
https://github.com/amir73il/linux/commits/ovl-fsync/

If there are no complaints, I will queue this up for v6.12.
Fei, please provide your Tested-by.

Thanks,
Amir.
答复: [PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
Posted by Lv Fei(吕飞) 1 year, 5 months ago
> On Thu, Aug 29, 2024 at 6:23 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Thu, Aug 29, 2024 at 2:51 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > >
> > > On Thu, 29 Aug 2024 at 12:29, Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > > But maybe we can ignore crash safety of metacopy on ubifs, because 
> > > > 1. the ubifs users may not be using this feature 2. ubifs may be 
> > > > nice and takes care of ordering O_TMPFILE
> > > >     metadata updates before exposing the link
> > > >
> > > > Then we can do the following:
> > > > IF (metacopy_enabled)
> > > >     fsync only in ovl_copy_up_file() ELSE
> > > >     fsync only in ovl_copy_up_metadata()
> > > >
> > > > Let me know what you think.
> > >
> > > Sounds like a good compromise.
> > >
> >
> > Fei,
> >
> > Could you please test the attached patch and confirm that your use 
> > case does not depend on metacopy enabled?
> >
> > In any case, I am holding on to your patch in case someone reports a 
> > performance regression with this unconditional fsync approach.
> >
> 
> Well, it's a good thing that I took Miklois' advice to make the fsync option implicit, because > the original patch had 2 bugs detected by fstest:
> 1. missing O_LARGEFILE
> 2. trying to fsync special files
> 
> Please see uptodate patch at:
> https://github.com/amir73il/linux/commits/ovl-fsync/
> 
> If there are no complaints, I will queue this up for v6.12.
> Fei, please provide your Tested-by.

We do not enable metacopy.
Tested this patch and it also solved our issue.

Thanks,
Fei

Re: [PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
Posted by Amir Goldstein 1 year, 5 months ago
On Fri, Aug 30, 2024 at 1:52 PM Lv Fei(吕飞) <feilv@asrmicro.com> wrote:
>
> > On Thu, Aug 29, 2024 at 6:23 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Thu, Aug 29, 2024 at 2:51 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > > >
> > > > On Thu, 29 Aug 2024 at 12:29, Amir Goldstein <amir73il@gmail.com> wrote:
> > > >
> > > > > But maybe we can ignore crash safety of metacopy on ubifs, because
> > > > > 1. the ubifs users may not be using this feature 2. ubifs may be
> > > > > nice and takes care of ordering O_TMPFILE
> > > > >     metadata updates before exposing the link
> > > > >
> > > > > Then we can do the following:
> > > > > IF (metacopy_enabled)
> > > > >     fsync only in ovl_copy_up_file() ELSE
> > > > >     fsync only in ovl_copy_up_metadata()
> > > > >
> > > > > Let me know what you think.
> > > >
> > > > Sounds like a good compromise.
> > > >
> > >
> > > Fei,
> > >
> > > Could you please test the attached patch and confirm that your use
> > > case does not depend on metacopy enabled?
> > >
> > > In any case, I am holding on to your patch in case someone reports a
> > > performance regression with this unconditional fsync approach.
> > >
> >
> > Well, it's a good thing that I took Miklois' advice to make the fsync option implicit, because > the original patch had 2 bugs detected by fstest:
> > 1. missing O_LARGEFILE
> > 2. trying to fsync special files
> >
> > Please see uptodate patch at:
> > https://github.com/amir73il/linux/commits/ovl-fsync/
> >
> > If there are no complaints, I will queue this up for v6.12.
> > Fei, please provide your Tested-by.
>
> We do not enable metacopy.
> Tested this patch and it also solved our issue.

Hi Fei,

Thanks for approving.
I added Reported-and-tested-by and pushed to overlayfs-next.

Now we just need to hope that users won't come shouting about
performance regressions.

Thanks,
Amir.
Re: [PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
Posted by Amir Goldstein 1 year, 5 months ago
On Mon, Jul 22, 2024 at 3:56 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Mon, Jul 22, 2024 at 1:14 PM Fei Lv <feilv@asrmicro.com> wrote:
> >
> > For upper filesystem which does not enforce ordering on storing of
> > metadata changes(e.g. ubifs), when overlayfs file is modified for
> > the first time, copy up will create a copy of the lower file and
> > its parent directories in the upper layer. Permission lost of the
> > new upper parent directory was observed during power-cut stress test.
> >
> > Fix by adding new mount opion "fsync=strict", make sure data/metadata of
> > copied up directory written to disk before renaming from tmp to final
> > destination.
> >
> > Signed-off-by: Fei Lv <feilv@asrmicro.com>
>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>
> but I'd also like to wait for an ACK from Miklos on this feature.
>
> As for timing, we are in the middle of the merge window for 6.11-rc1,
> so we have some time before this can be considered for 6.12.
> I will be on vacation for most of this development cycle, so either
> Miklos will be able to queue it for 6.12 or I may be able to do
> near the end of the 6.11 cycle.
>

Miklos,

Please let me know what you think of this approach to handle ubifs upper.
If you like it, I can queue this up for v6.12.

Thanks,
Amir.

>
> > ---
> > V1 -> V2:
> >  1. change open flags from "O_LARGEFILE | O_WRONLY" to "O_RDONLY".
> >  2. change mount option to "fsync=ordered/strict/volatile".
> >  3. ovl_should_sync_strict() implies ovl_should_sync().
> >  4. remove redundant ovl_should_sync_strict from ovl_copy_up_meta_inode_data.
> >  5. update commit log.
> >  6. update documentation overlayfs.rst.
> >
> >  Documentation/filesystems/overlayfs.rst | 39 +++++++++++++++++++++++++
> >  fs/overlayfs/copy_up.c                  | 18 ++++++++++++
> >  fs/overlayfs/ovl_entry.h                | 20 +++++++++++--
> >  fs/overlayfs/params.c                   | 33 ++++++++++++++++++---
> >  fs/overlayfs/super.c                    |  2 +-
> >  5 files changed, 105 insertions(+), 7 deletions(-)
> >
> > diff --git a/Documentation/filesystems/overlayfs.rst b/Documentation/filesystems/overlayfs.rst
> > index 165514401441..a783e57bdb57 100644
> > --- a/Documentation/filesystems/overlayfs.rst
> > +++ b/Documentation/filesystems/overlayfs.rst
> > @@ -742,6 +742,45 @@ controlled by the "uuid" mount option, which supports these values:
> >      mounted with "uuid=on".
> >
> >
> > +Durability and copy up
> > +----------------------
> > +
> > +The fsync(2) and fdatasync(2) system calls ensure that the metadata and
> > +data of a file, respectively, are safely written to the backing
> > +storage, which is expected to guarantee the existence of the information post
> > +system crash.
> > +
> > +Without the fdatasync(2) call, there is no guarantee that the observed
> > +data after a system crash will be either the old or the new data, but
> > +in practice, the observed data after crash is often the old or new data or a
> > +mix of both.
> > +
> > +When overlayfs file is modified for the first time, copy up will create
> > +a copy of the lower file and its parent directories in the upper layer.
> > +In case of a system crash, if fdatasync(2) was not called after the
> > +modification, the upper file could end up with no data at all (i.e.
> > +zeros), which would be an unusual outcome.  To avoid this experience,
> > +overlayfs calls fsync(2) on the upper file before completing the copy up with
> > +rename(2) to make the copy up "atomic".
> > +
> > +Depending on the backing filesystem (e.g. ubifs), fsync(2) before
> > +rename(2) may not be enough to provide the "atomic" copy up behavior
> > +and fsync(2) on the copied up parent directories is required as well.
> > +
> > +Overlayfs can be tuned to prefer performance or durability when storing
> > +to the underlying upper layer.  This is controlled by the "fsync" mount
> > +option, which supports these values:
> > +
> > +- "ordered": (default)
> > +    Call fsync(2) on upper file before completion of copy up.
> > +- "strict":
> > +    Call fsync(2) on upper file and directories before completion of copy up.
> > +- "volatile": [*]
> > +    Prefer performance over durability (see `Volatile mount`_)
> > +
> > +[*] The mount option "volatile" is an alias to "fsync=volatile".
> > +
> > +
> >  Volatile mount
> >  --------------
> >
> > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> > index a5ef2005a2cc..d99a18afceb8 100644
> > --- a/fs/overlayfs/copy_up.c
> > +++ b/fs/overlayfs/copy_up.c
> > @@ -243,6 +243,21 @@ static int ovl_verify_area(loff_t pos, loff_t pos2, loff_t len, loff_t totlen)
> >         return 0;
> >  }
> >
> > +static int ovl_copy_up_sync(struct path *path)
> > +{
> > +       struct file *new_file;
> > +       int err;
> > +
> > +       new_file = ovl_path_open(path, O_RDONLY);
> > +       if (IS_ERR(new_file))
> > +               return PTR_ERR(new_file);
> > +
> > +       err = vfs_fsync(new_file, 0);
> > +       fput(new_file);
> > +
> > +       return err;
> > +}
> > +
> >  static int ovl_copy_up_file(struct ovl_fs *ofs, struct dentry *dentry,
> >                             struct file *new_file, loff_t len)
> >  {
> > @@ -701,6 +716,9 @@ static int ovl_copy_up_metadata(struct ovl_copy_up_ctx *c, struct dentry *temp)
> >                 err = ovl_set_attr(ofs, temp, &c->stat);
> >         inode_unlock(temp->d_inode);
> >
> > +       if (!err && ovl_should_sync_strict(ofs))
> > +               err = ovl_copy_up_sync(&upperpath);
> > +
> >         return err;
> >  }
> >
> > diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> > index cb449ab310a7..7f6d2effd5f1 100644
> > --- a/fs/overlayfs/ovl_entry.h
> > +++ b/fs/overlayfs/ovl_entry.h
> > @@ -5,6 +5,12 @@
> >   * Copyright (C) 2016 Red Hat, Inc.
> >   */
> >
> > +enum {
> > +       OVL_FSYNC_ORDERED,
> > +       OVL_FSYNC_STRICT,
> > +       OVL_FSYNC_VOLATILE,
> > +};
> > +
> >  struct ovl_config {
> >         char *upperdir;
> >         char *workdir;
> > @@ -18,7 +24,7 @@ struct ovl_config {
> >         int xino;
> >         bool metacopy;
> >         bool userxattr;
> > -       bool ovl_volatile;
> > +       int fsync_mode;
> >  };
> >
> >  struct ovl_sb {
> > @@ -120,7 +126,17 @@ static inline struct ovl_fs *OVL_FS(struct super_block *sb)
> >
> >  static inline bool ovl_should_sync(struct ovl_fs *ofs)
> >  {
> > -       return !ofs->config.ovl_volatile;
> > +       return ofs->config.fsync_mode != OVL_FSYNC_VOLATILE;
> > +}
> > +
> > +static inline bool ovl_should_sync_strict(struct ovl_fs *ofs)
> > +{
> > +       return ofs->config.fsync_mode == OVL_FSYNC_STRICT;
> > +}
> > +
> > +static inline bool ovl_is_volatile(struct ovl_config *config)
> > +{
> > +       return config->fsync_mode == OVL_FSYNC_VOLATILE;
> >  }
> >
> >  static inline unsigned int ovl_numlower(struct ovl_entry *oe)
> > diff --git a/fs/overlayfs/params.c b/fs/overlayfs/params.c
> > index 4860fcc4611b..c4aac288b7e0 100644
> > --- a/fs/overlayfs/params.c
> > +++ b/fs/overlayfs/params.c
> > @@ -58,6 +58,7 @@ enum ovl_opt {
> >         Opt_xino,
> >         Opt_metacopy,
> >         Opt_verity,
> > +       Opt_fsync,
> >         Opt_volatile,
> >  };
> >
> > @@ -139,6 +140,23 @@ static int ovl_verity_mode_def(void)
> >         return OVL_VERITY_OFF;
> >  }
> >
> > +static const struct constant_table ovl_parameter_fsync[] = {
> > +       { "ordered",    OVL_FSYNC_ORDERED  },
> > +       { "strict",     OVL_FSYNC_STRICT   },
> > +       { "volatile",   OVL_FSYNC_VOLATILE },
> > +       {}
> > +};
> > +
> > +static const char *ovl_fsync_mode(struct ovl_config *config)
> > +{
> > +       return ovl_parameter_fsync[config->fsync_mode].name;
> > +}
> > +
> > +static int ovl_fsync_mode_def(void)
> > +{
> > +       return OVL_FSYNC_ORDERED;
> > +}
> > +
> >  const struct fs_parameter_spec ovl_parameter_spec[] = {
> >         fsparam_string_empty("lowerdir",    Opt_lowerdir),
> >         fsparam_string("lowerdir+",         Opt_lowerdir_add),
> > @@ -154,6 +172,7 @@ const struct fs_parameter_spec ovl_parameter_spec[] = {
> >         fsparam_enum("xino",                Opt_xino, ovl_parameter_xino),
> >         fsparam_enum("metacopy",            Opt_metacopy, ovl_parameter_bool),
> >         fsparam_enum("verity",              Opt_verity, ovl_parameter_verity),
> > +       fsparam_enum("fsync",               Opt_fsync, ovl_parameter_fsync),
> >         fsparam_flag("volatile",            Opt_volatile),
> >         {}
> >  };
> > @@ -617,8 +636,11 @@ static int ovl_parse_param(struct fs_context *fc, struct fs_parameter *param)
> >         case Opt_verity:
> >                 config->verity_mode = result.uint_32;
> >                 break;
> > +       case Opt_fsync:
> > +               config->fsync_mode = result.uint_32;
> > +               break;
> >         case Opt_volatile:
> > -               config->ovl_volatile = true;
> > +               config->fsync_mode = OVL_FSYNC_VOLATILE;
> >                 break;
> >         case Opt_userxattr:
> >                 config->userxattr = true;
> > @@ -802,9 +824,9 @@ int ovl_fs_params_verify(const struct ovl_fs_context *ctx,
> >                 config->index = false;
> >         }
> >
> > -       if (!config->upperdir && config->ovl_volatile) {
> > +       if (!config->upperdir && ovl_is_volatile(config)) {
> >                 pr_info("option \"volatile\" is meaningless in a non-upper mount, ignoring it.\n");
> > -               config->ovl_volatile = false;
> > +               config->fsync_mode = ovl_fsync_mode_def();
> >         }
> >
> >         if (!config->upperdir && config->uuid == OVL_UUID_ON) {
> > @@ -997,8 +1019,11 @@ int ovl_show_options(struct seq_file *m, struct dentry *dentry)
> >         if (ofs->config.metacopy != ovl_metacopy_def)
> >                 seq_printf(m, ",metacopy=%s",
> >                            ofs->config.metacopy ? "on" : "off");
> > -       if (ofs->config.ovl_volatile)
> > +       if (ovl_is_volatile(&ofs->config))
> >                 seq_puts(m, ",volatile");
> > +       else if (ofs->config.fsync_mode != ovl_fsync_mode_def())
> > +               seq_printf(m, ",fsync=%s",
> > +                          ovl_fsync_mode(&ofs->config));
> >         if (ofs->config.userxattr)
> >                 seq_puts(m, ",userxattr");
> >         if (ofs->config.verity_mode != ovl_verity_mode_def())
> > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> > index 06a231970cb5..824cbcf40523 100644
> > --- a/fs/overlayfs/super.c
> > +++ b/fs/overlayfs/super.c
> > @@ -750,7 +750,7 @@ static int ovl_make_workdir(struct super_block *sb, struct ovl_fs *ofs,
> >          * For volatile mount, create a incompat/volatile/dirty file to keep
> >          * track of it.
> >          */
> > -       if (ofs->config.ovl_volatile) {
> > +       if (ovl_is_volatile(&ofs->config)) {
> >                 err = ovl_create_volatile_dirty(ofs);
> >                 if (err < 0) {
> >                         pr_err("Failed to create volatile/dirty file.\n");
> >
> > base-commit: 0c3836482481200ead7b416ca80c68a29cfdaabd
> > --
> > 2.45.2
> >
Re: [PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
Posted by Amir Goldstein 1 year, 5 months ago
On Fri, Aug 23, 2024 at 11:51 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Mon, Jul 22, 2024 at 3:56 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Mon, Jul 22, 2024 at 1:14 PM Fei Lv <feilv@asrmicro.com> wrote:
> > >
> > > For upper filesystem which does not enforce ordering on storing of
> > > metadata changes(e.g. ubifs), when overlayfs file is modified for
> > > the first time, copy up will create a copy of the lower file and
> > > its parent directories in the upper layer. Permission lost of the
> > > new upper parent directory was observed during power-cut stress test.
> > >
> > > Fix by adding new mount opion "fsync=strict", make sure data/metadata of
> > > copied up directory written to disk before renaming from tmp to final
> > > destination.
> > >
> > > Signed-off-by: Fei Lv <feilv@asrmicro.com>
> >
> > Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> >
> > but I'd also like to wait for an ACK from Miklos on this feature.
> >
> > As for timing, we are in the middle of the merge window for 6.11-rc1,
> > so we have some time before this can be considered for 6.12.
> > I will be on vacation for most of this development cycle, so either
> > Miklos will be able to queue it for 6.12 or I may be able to do
> > near the end of the 6.11 cycle.
> >
>
> Miklos,
>
> Please let me know what you think of this approach to handle ubifs upper.
> If you like it, I can queue this up for v6.12.
>
> Thanks,
> Amir.
>
> >
> > > ---
> > > V1 -> V2:
> > >  1. change open flags from "O_LARGEFILE | O_WRONLY" to "O_RDONLY".
> > >  2. change mount option to "fsync=ordered/strict/volatile".
> > >  3. ovl_should_sync_strict() implies ovl_should_sync().
> > >  4. remove redundant ovl_should_sync_strict from ovl_copy_up_meta_inode_data.
> > >  5. update commit log.
> > >  6. update documentation overlayfs.rst.
> > >

Hi Fei,

I started to test this patch and it occured to me that we have no test
coverage for
the "volatile" feature.

Filesystem durability tests are not easy to write and I know that you
tested your
own use case, so I will not ask you to write a regression test as a
condition for merge,
but if you are willing to help, it would be very nice to add this test coverage.

There is already one overlayfs test in fstests (overlay/078) which
tests behavior
of overlayfs copy up during power cut (a.k.a shutdown).

One thing that I do request is that you confirm that you tested that the legacy
"volatile" mount option still works as before.
I saw that you took care of preserving the legacy mount option in display,
which is good practice.

Thanks,
Amir.
答复: [PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
Posted by Lv Fei(吕飞) 1 year, 5 months ago

> 发件人: Amir Goldstein [mailto:amir73il@gmail.com] 
> 发送时间: 2024年8月23日 19:43
> 收件人: Lv Fei(吕飞) <feilv@asrmicro.com>
> 抄送: miklos@szeredi.hu; linux-unionfs@vger.kernel.org; linux-kernel@vger.kernel.org; Xu Lianghu(徐良虎) <lianghuxu@asrmicro.> com>
> 主题: Re: [PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
> 
> On Fri, Aug 23, 2024 at 11:51 AM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Mon, Jul 22, 2024 at 3:56 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Mon, Jul 22, 2024 at 1:14 PM Fei Lv <feilv@asrmicro.com> wrote:
> > > >
> > > > For upper filesystem which does not enforce ordering on storing of 
> > > > metadata changes(e.g. ubifs), when overlayfs file is modified for 
> > > > the first time, copy up will create a copy of the lower file and 
> > > > its parent directories in the upper layer. Permission lost of the 
> > > > new upper parent directory was observed during power-cut stress test.
> > > >
> > > > Fix by adding new mount opion "fsync=strict", make sure 

There is a typo here, "opion" should be "option", please help correct before merge.

> > > > data/metadata of copied up directory written to disk before 
> > > > renaming from tmp to final destination.
> > > >
> > > > Signed-off-by: Fei Lv <feilv@asrmicro.com>
> > >
> > > Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> > >
> > > but I'd also like to wait for an ACK from Miklos on this feature.
> > >
> > > As for timing, we are in the middle of the merge window for 
> > > 6.11-rc1, so we have some time before this can be considered for 6.12.
> > > I will be on vacation for most of this development cycle, so either 
> > > Miklos will be able to queue it for 6.12 or I may be able to do near 
> > > the end of the 6.11 cycle.
> > >
> >
> > Miklos,
> >
> > Please let me know what you think of this approach to handle ubifs upper.
> > If you like it, I can queue this up for v6.12.
> >
> > Thanks,
> > Amir.
> >
> > >
> > > > ---
> > > > V1 -> V2:
> > > >  1. change open flags from "O_LARGEFILE | O_WRONLY" to "O_RDONLY".
> > > >  2. change mount option to "fsync=ordered/strict/volatile".
> > > >  3. ovl_should_sync_strict() implies ovl_should_sync().
> > > >  4. remove redundant ovl_should_sync_strict from ovl_copy_up_meta_inode_data.
> > > >  5. update commit log.
> > > >  6. update documentation overlayfs.rst.
> > > >
> 
> Hi Fei,
> 
> I started to test this patch and it occured to me that we have no test coverage for the "volatile" feature.
> 
> Filesystem durability tests are not easy to write and I know that you tested your own use case, so I will not ask you to write a regression test as a condition for merge, but if you are willing to help, it would be very nice to add this test coverage.

OK, I can have a try, need some time to study test suite. This is a new thing for me.

> 
> There is already one overlayfs test in fstests (overlay/078) which tests behavior of overlayfs copy up during power cut (a.k.a shutdown).

Do you mean overlay/078 in kernel/git/brauner/xfstests-dev.git ?

> 
> One thing that I do request is that you confirm that you tested that the legacy "volatile" mount option still works as before.

Yes, I tested basic function of "volatile" mount option with this patch.

> I saw that you took care of preserving the legacy mount option in display, which is good practice.
> 
> Thanks,
> Amir.

Thanks,
Fei
Re: [PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
Posted by Amir Goldstein 1 year, 5 months ago
On Mon, Aug 26, 2024 at 8:56 AM Lv Fei(吕飞) <feilv@asrmicro.com> wrote:
>
>
>
> > 发件人: Amir Goldstein [mailto:amir73il@gmail.com]
> > 发送时间: 2024年8月23日 19:43
> > 收件人: Lv Fei(吕飞) <feilv@asrmicro.com>
> > 抄送: miklos@szeredi.hu; linux-unionfs@vger.kernel.org; linux-kernel@vger.kernel.org; Xu Lianghu(徐良虎) <lianghuxu@asrmicro.> com>
> > 主题: Re: [PATCH V2] ovl: fsync after metadata copy-up via mount option "fsync=strict"
> >
> > On Fri, Aug 23, 2024 at 11:51 AM Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Mon, Jul 22, 2024 at 3:56 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > >
> > > > On Mon, Jul 22, 2024 at 1:14 PM Fei Lv <feilv@asrmicro.com> wrote:
> > > > >
> > > > > For upper filesystem which does not enforce ordering on storing of
> > > > > metadata changes(e.g. ubifs), when overlayfs file is modified for
> > > > > the first time, copy up will create a copy of the lower file and
> > > > > its parent directories in the upper layer. Permission lost of the
> > > > > new upper parent directory was observed during power-cut stress test.
> > > > >
> > > > > Fix by adding new mount opion "fsync=strict", make sure
>
> There is a typo here, "opion" should be "option", please help correct before merge.
>

No problem, but I am still waiting for Miklos to comment on this option.

> > > > > data/metadata of copied up directory written to disk before
> > > > > renaming from tmp to final destination.
> > > > >
> > > > > Signed-off-by: Fei Lv <feilv@asrmicro.com>
> > > >
> > > > Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> > > >
> > > > but I'd also like to wait for an ACK from Miklos on this feature.
> > > >
> > > > As for timing, we are in the middle of the merge window for
> > > > 6.11-rc1, so we have some time before this can be considered for 6.12.
> > > > I will be on vacation for most of this development cycle, so either
> > > > Miklos will be able to queue it for 6.12 or I may be able to do near
> > > > the end of the 6.11 cycle.
> > > >
> > >
> > > Miklos,
> > >
> > > Please let me know what you think of this approach to handle ubifs upper.
> > > If you like it, I can queue this up for v6.12.
> > >
> > > Thanks,
> > > Amir.
> > >
> > > >
> > > > > ---
> > > > > V1 -> V2:
> > > > >  1. change open flags from "O_LARGEFILE | O_WRONLY" to "O_RDONLY".
> > > > >  2. change mount option to "fsync=ordered/strict/volatile".
> > > > >  3. ovl_should_sync_strict() implies ovl_should_sync().
> > > > >  4. remove redundant ovl_should_sync_strict from ovl_copy_up_meta_inode_data.
> > > > >  5. update commit log.
> > > > >  6. update documentation overlayfs.rst.
> > > > >
> >
> > Hi Fei,
> >
> > I started to test this patch and it occured to me that we have no test coverage for the "volatile" feature.
> >
> > Filesystem durability tests are not easy to write and I know that you tested your own use case, so I will not ask you to write a regression test as a condition for merge, but if you are willing to help, it would be very nice to add this test coverage.
>
> OK, I can have a try, need some time to study test suite. This is a new thing for me.
>

Whenever you can.

> >
> > There is already one overlayfs test in fstests (overlay/078) which tests behavior of overlayfs copy up during power cut (a.k.a shutdown).
>
> Do you mean overlay/078 in kernel/git/brauner/xfstests-dev.git ?
>

Yes overlay/078, but upstream repo is
git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git

> >
> > One thing that I do request is that you confirm that you tested that the legacy "volatile" mount option still works as before.
>
> Yes, I tested basic function of "volatile" mount option with this patch.
>

Thanks,
Amir.