[PATCH RFC] gen_init_cpio: Do fsync() only on regular files

Dmitry Safonov via B4 Relay posted 1 patch 2 months, 1 week ago
usr/gen_init_cpio.c | 4 ++++
1 file changed, 4 insertions(+)
[PATCH RFC] gen_init_cpio: Do fsync() only on regular files
Posted by Dmitry Safonov via B4 Relay 2 months, 1 week ago
From: Dmitry Safonov <dima@arista.com>

Here at Arista gen_init_cpio is used in testing in order to create
an initramfs for specific tests. Most notably, there is a test that does
essentially a fork-bomb in kdump/panic kernel, replacing build-time
generated init script: instead of doing makedumpfile, it does call
shell tests.

In comparison to usr/Makefile, which creates an intermediate .cpio file,
the Makefile that generates initrd for tests is a one-liner:
> file lib/kdump ${src_dir}/oom-crashkernel 0644 0 0 | usr/gen_init_cpio /dev/stdin | xz -z -c -e -Ccrc32 > ${target_dir}/oom-crashkernel.initrd

As fsync() on a pipe fd returns -EINVAL, that stopped working.
Check that outfd is a regular file descriptor before calling fsync().

Sending this as RFC as these are local tests, rather than upstream ones,
unfortunately. Yet, the fix is trivial and increases correctness of
gen_init_cpio (and maybe saves some time for another person debugging
it). A workaround to use temporary cpio file is also trivial, so not
insisting on merging.

Fixes: ae18b94099b0 ("gen_init_cpio: support -o <output_file> parameter")
Cc: David Disseldorp <ddiss@suse.de>
Cc: Nicolas Schier <nsc@kernel.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 usr/gen_init_cpio.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/usr/gen_init_cpio.c b/usr/gen_init_cpio.c
index 75e9561ba31392e12536e76a918245a8ea07f9b8..845e2d92f0e56b02ba5fc12ecd243ce99c53f552 100644
--- a/usr/gen_init_cpio.c
+++ b/usr/gen_init_cpio.c
@@ -6,6 +6,7 @@
 #include <stdbool.h>
 #include <sys/types.h>
 #include <sys/stat.h>
+#include <sys/socket.h>
 #include <string.h>
 #include <unistd.h>
 #include <time.h>
@@ -112,6 +113,9 @@ static int cpio_trailer(void)
 	    push_pad(padlen(offset, 512)) < 0)
 		return -1;
 
+	if (!isfdtype(outfd, S_IFREG))
+		return 0;
+
 	return fsync(outfd);
 }
 

---
base-commit: c746c3b5169831d7fb032a1051d8b45592ae8d78
change-id: 20251007-gen_init_cpio-pipe-5ad87f616a40

Best regards,
-- 
Dmitry Safonov <dima@arista.com>
Re: [PATCH RFC] gen_init_cpio: Do fsync() only on regular files
Posted by Christoph Hellwig 2 months, 1 week ago
On Tue, Oct 07, 2025 at 12:55:03AM +0100, Dmitry Safonov via B4 Relay wrote:
> From: Dmitry Safonov <dima@arista.com>
> 
> Here at Arista gen_init_cpio is used in testing in order to create
> an initramfs for specific tests. Most notably, there is a test that does
> essentially a fork-bomb in kdump/panic kernel, replacing build-time
> generated init script: instead of doing makedumpfile, it does call
> shell tests.

Why is is using fsync at all?  Seems like this was added in

commit ae18b94099b04264b32e33b057114024bc72c993
Author: David Disseldorp <ddiss@suse.de>
Date:   Tue Aug 19 13:05:45 2025 +1000

    gen_init_cpio: support -o <output_file> parameter

without any good explanation.  In general doing a per-file fsync
is going to horrible wreck performance, and given that no one is
interested in partial initramfs archives also rather pointless.
Re: [PATCH RFC] gen_init_cpio: Do fsync() only on regular files
Posted by David Disseldorp 2 months, 1 week ago
On Mon, 6 Oct 2025 21:40:03 -0700, Christoph Hellwig wrote:

> On Tue, Oct 07, 2025 at 12:55:03AM +0100, Dmitry Safonov via B4 Relay wrote:
> > From: Dmitry Safonov <dima@arista.com>
> > 
> > Here at Arista gen_init_cpio is used in testing in order to create
> > an initramfs for specific tests. Most notably, there is a test that does
> > essentially a fork-bomb in kdump/panic kernel, replacing build-time
> > generated init script: instead of doing makedumpfile, it does call
> > shell tests.  
> 
> Why is is using fsync at all?  Seems like this was added in
> 
> commit ae18b94099b04264b32e33b057114024bc72c993
> Author: David Disseldorp <ddiss@suse.de>
> Date:   Tue Aug 19 13:05:45 2025 +1000
> 
>     gen_init_cpio: support -o <output_file> parameter
> 
> without any good explanation.  In general doing a per-file fsync
> is going to horrible wreck performance, and given that no one is
> interested in partial initramfs archives also rather pointless.

I should have explained why in the commit, sorry. The intention was to
catch any FS I/O errors during output archive writeback. fsync() is
called only once as the final I/O.
Re: [PATCH RFC] gen_init_cpio: Do fsync() only on regular files
Posted by Christoph Hellwig 2 months, 1 week ago
On Tue, Oct 07, 2025 at 04:57:32PM +1100, David Disseldorp wrote:
> I should have explained why in the commit, sorry. The intention was to
> catch any FS I/O errors during output archive writeback. fsync() is
> called only once as the final I/O.

I don't parse this.  What does 'as the final I/O' mean?  If you want
to catch writeback errors, a single syncfs should be enough.
Re: [PATCH RFC] gen_init_cpio: Do fsync() only on regular files
Posted by David Disseldorp 2 months, 1 week ago
On Mon, 6 Oct 2025 23:03:57 -0700, Christoph Hellwig wrote:

> On Tue, Oct 07, 2025 at 04:57:32PM +1100, David Disseldorp wrote:
> > I should have explained why in the commit, sorry. The intention was to
> > catch any FS I/O errors during output archive writeback. fsync() is
> > called only once as the final I/O.  
> 
> I don't parse this.  What does 'as the final I/O' mean?

fsync() is called once after all buffered writes and copy_file_range()
calls for the initramfs archive have completed.

> If you want
> to catch writeback errors, a single syncfs should be enough.

gen_init_cpio should only be concerned that the output archive file is
flushed to storage, rather than the entire filesystem. Why would syncfs
be more suitable?
Re: [PATCH RFC] gen_init_cpio: Do fsync() only on regular files
Posted by Christoph Hellwig 2 months, 1 week ago
On Tue, Oct 07, 2025 at 05:25:56PM +1100, David Disseldorp wrote:
> On Mon, 6 Oct 2025 23:03:57 -0700, Christoph Hellwig wrote:
> 
> > On Tue, Oct 07, 2025 at 04:57:32PM +1100, David Disseldorp wrote:
> > > I should have explained why in the commit, sorry. The intention was to
> > > catch any FS I/O errors during output archive writeback. fsync() is
> > > called only once as the final I/O.  
> > 
> > I don't parse this.  What does 'as the final I/O' mean?
> 
> fsync() is called once after all buffered writes and copy_file_range()
> calls for the initramfs archive have completed.
> 
> > If you want
> > to catch writeback errors, a single syncfs should be enough.
> 
> gen_init_cpio should only be concerned that the output archive file is
> flushed to storage, rather than the entire filesystem. Why would syncfs
> be more suitable?

Oh, it is called on the generated archive.  Yes, that makes sense.
Sorry for the noise.
Re: [PATCH RFC] gen_init_cpio: Do fsync() only on regular files
Posted by David Disseldorp 2 months, 1 week ago
Thanks for reporting this regression, Dmitry...

On Tue, 07 Oct 2025 00:55:03 +0100, Dmitry Safonov via B4 Relay wrote:

> From: Dmitry Safonov <dima@arista.com>
> 
> Here at Arista gen_init_cpio is used in testing in order to create
> an initramfs for specific tests. Most notably, there is a test that does
> essentially a fork-bomb in kdump/panic kernel, replacing build-time
> generated init script: instead of doing makedumpfile, it does call
> shell tests.
> 
> In comparison to usr/Makefile, which creates an intermediate .cpio file,
> the Makefile that generates initrd for tests is a one-liner:
> > file lib/kdump ${src_dir}/oom-crashkernel 0644 0 0 | usr/gen_init_cpio /dev/stdin | xz -z -c -e -Ccrc32 > ${target_dir}/oom-crashkernel.initrd  
> 
> As fsync() on a pipe fd returns -EINVAL, that stopped working.
> Check that outfd is a regular file descriptor before calling fsync().
> 
> Sending this as RFC as these are local tests, rather than upstream ones,
> unfortunately. Yet, the fix is trivial and increases correctness of
> gen_init_cpio (and maybe saves some time for another person debugging
> it). A workaround to use temporary cpio file is also trivial, so not
> insisting on merging.

The code change looks fine, but the commit message is a bit verbose IMO.
Please drop the first and last paragraphs. The reproducer could also be
simplified to e.g.
echo | usr/gen_init_cpio /dev/stdin > /dev/null

With that:
Reviewed-by: David Disseldorp <ddiss@suse.de>

> Fixes: ae18b94099b0 ("gen_init_cpio: support -o <output_file> parameter")
> Cc: David Disseldorp <ddiss@suse.de>
> Cc: Nicolas Schier <nsc@kernel.org>
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> ---
>  usr/gen_init_cpio.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/usr/gen_init_cpio.c b/usr/gen_init_cpio.c
> index 75e9561ba31392e12536e76a918245a8ea07f9b8..845e2d92f0e56b02ba5fc12ecd243ce99c53f552 100644
> --- a/usr/gen_init_cpio.c
> +++ b/usr/gen_init_cpio.c
> @@ -6,6 +6,7 @@
>  #include <stdbool.h>
>  #include <sys/types.h>
>  #include <sys/stat.h>
> +#include <sys/socket.h>
>  #include <string.h>
>  #include <unistd.h>
>  #include <time.h>
> @@ -112,6 +113,9 @@ static int cpio_trailer(void)
>  	    push_pad(padlen(offset, 512)) < 0)
>  		return -1;
>  
> +	if (!isfdtype(outfd, S_IFREG))
> +		return 0;
> +
>  	return fsync(outfd);
>  }

Another option would be to catch the EINVAL error, e.g.

--- a/usr/gen_init_cpio.c
+++ b/usr/gen_init_cpio.c
@@ -112,7 +112,10 @@ static int cpio_trailer(void)
            push_pad(padlen(offset, 512)) < 0)
                return -1;
 
-       return fsync(outfd);
+       if (fsync(outfd) < 0 && errno != EINVAL)
+               return -1;
+
+       return 0;
 }

It may be a little portable than isfdtype(), but I don't feel strongly
either way.

Thanks, David
Re: [PATCH RFC] gen_init_cpio: Do fsync() only on regular files
Posted by Dmitry Safonov 2 months, 1 week ago
On Tue, Oct 7, 2025 at 2:17 AM David Disseldorp <ddiss@suse.de> wrote:
[..]
> The code change looks fine, but the commit message is a bit verbose IMO.
> Please drop the first and last paragraphs. The reproducer could also be
> simplified to e.g.
> echo | usr/gen_init_cpio /dev/stdin > /dev/null

hehe, OK, let me simplify that in v2.

>
> With that:
> Reviewed-by: David Disseldorp <ddiss@suse.de>
>
> > Fixes: ae18b94099b0 ("gen_init_cpio: support -o <output_file> parameter")
> > Cc: David Disseldorp <ddiss@suse.de>
> > Cc: Nicolas Schier <nsc@kernel.org>
> > Signed-off-by: Dmitry Safonov <dima@arista.com>
> > ---
[..]
>
> Another option would be to catch the EINVAL error, e.g.
>
> --- a/usr/gen_init_cpio.c
> +++ b/usr/gen_init_cpio.c
> @@ -112,7 +112,10 @@ static int cpio_trailer(void)
>             push_pad(padlen(offset, 512)) < 0)
>                 return -1;
>
> -       return fsync(outfd);
> +       if (fsync(outfd) < 0 && errno != EINVAL)
> +               return -1;
> +
> +       return 0;
>  }
>
> It may be a little portable than isfdtype(), but I don't feel strongly
> either way.

Yeah, maybe worth avoiding breking compilation on some weird set ups,
going to use your version for v2.

Thanks,
           Dmitry
Re: [PATCH RFC] gen_init_cpio: Do fsync() only on regular files
Posted by Dmitry Safonov 2 months, 1 week ago
On Tue, Oct 7, 2025 at 12:55 AM Dmitry Safonov via B4 Relay
<devnull+dima.arista.com@kernel.org> wrote:
>
> From: Dmitry Safonov <dima@arista.com>
>
> Here at Arista gen_init_cpio is used in testing in order to create
> an initramfs for specific tests. Most notably, there is a test that does
> essentially a fork-bomb in kdump/panic kernel, replacing build-time
> generated init script: instead of doing makedumpfile, it does call
> shell tests.
>
> In comparison to usr/Makefile, which creates an intermediate .cpio file,
> the Makefile that generates initrd for tests is a one-liner:
> > file lib/kdump ${src_dir}/oom-crashkernel 0644 0 0 | usr/gen_init_cpio /dev/stdin | xz -z -c -e -Ccrc32 > ${target_dir}/oom-crashkernel.initrd
>
> As fsync() on a pipe fd returns -EINVAL, that stopped working.

To clarify: because any sane bash script with pipes should use
set -eo pipefail
which indeed is set in the script that generates initrd.

Thanks,
           Dmitry