[PATCH v1 2/4] initramfs: Refactor to use hex2bin() instead of custom approach

Andy Shevchenko posted 4 patches 2 weeks, 5 days ago
[PATCH v1 2/4] initramfs: Refactor to use hex2bin() instead of custom approach
Posted by Andy Shevchenko 2 weeks, 5 days ago
There is a simple_strntoul() function used solely as a shortcut
for hex2bin() with proper endianess conversions. Replace that
and drop the unneeded function in the next changes.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 init/initramfs.c | 36 ++++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/init/initramfs.c b/init/initramfs.c
index 750f126e19a0..8d931ad4d239 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -6,6 +6,7 @@
 #include <linux/fcntl.h>
 #include <linux/file.h>
 #include <linux/fs.h>
+#include <linux/hex.h>
 #include <linux/init.h>
 #include <linux/init_syscalls.h>
 #include <linux/kstrtox.h>
@@ -21,6 +22,8 @@
 #include <linux/umh.h>
 #include <linux/utime.h>
 
+#include <asm/byteorder.h>
+
 #include "do_mounts.h"
 #include "initramfs_internal.h"
 
@@ -192,24 +195,25 @@ static __initdata u32 hdr_csum;
 
 static void __init parse_header(char *s)
 {
-	unsigned long parsed[13];
-	int i;
+	__be32 header[13];
+	int ret;
 
-	for (i = 0, s += 6; i < 13; i++, s += 8)
-		parsed[i] = simple_strntoul(s, NULL, 16, 8);
+	ret = hex2bin((u8 *)header, s + 6, sizeof(header));
+	if (ret)
+		error("damaged header");
 
-	ino = parsed[0];
-	mode = parsed[1];
-	uid = parsed[2];
-	gid = parsed[3];
-	nlink = parsed[4];
-	mtime = parsed[5]; /* breaks in y2106 */
-	body_len = parsed[6];
-	major = parsed[7];
-	minor = parsed[8];
-	rdev = new_encode_dev(MKDEV(parsed[9], parsed[10]));
-	name_len = parsed[11];
-	hdr_csum = parsed[12];
+	ino = be32_to_cpu(header[0]);
+	mode = be32_to_cpu(header[1]);
+	uid = be32_to_cpu(header[2]);
+	gid = be32_to_cpu(header[3]);
+	nlink = be32_to_cpu(header[4]);
+	mtime = be32_to_cpu(header[5]); /* breaks in y2106 */
+	body_len = be32_to_cpu(header[6]);
+	major = be32_to_cpu(header[7]);
+	minor = be32_to_cpu(header[8]);
+	rdev = new_encode_dev(MKDEV(be32_to_cpu(header[9]), be32_to_cpu(header[10])));
+	name_len = be32_to_cpu(header[11]);
+	hdr_csum = be32_to_cpu(header[12]);
 }
 
 /* FSM */
-- 
2.50.1
Re: [PATCH v1 2/4] initramfs: Refactor to use hex2bin() instead of custom approach
Posted by David Disseldorp 2 weeks, 4 days ago
On Mon, 19 Jan 2026 21:38:39 +0100, Andy Shevchenko wrote:

> +	ret = hex2bin((u8 *)header, s + 6, sizeof(header));
> +	if (ret)
> +		error("damaged header");

The changes look reasonable to me on first glance, but I think we really
should improve the error handling to abort the state machine on
malformed header here.

One further issue that we have is simple_strntoul()'s acceptance of
"0x" prefixes for the hex strings - any initramfs which carries such
prefixes will now result in an error.
It's a pretty obscure corner case, but cpio is really easy to generate
from printf(), so maybe there are some images out there which rely on
this.

I've written an initramfs_test regression test for the "0x" prefix
handling. I'll send it to the list.

Thanks, David
Re: [PATCH v1 2/4] initramfs: Refactor to use hex2bin() instead of custom approach
Posted by Andy Shevchenko 2 weeks, 4 days ago
On Wed, Jan 21, 2026 at 07:12:50AM +1100, David Disseldorp wrote:
> On Mon, 19 Jan 2026 21:38:39 +0100, Andy Shevchenko wrote:
> 
> > +	ret = hex2bin((u8 *)header, s + 6, sizeof(header));
> > +	if (ret)
> > +		error("damaged header");
> 
> The changes look reasonable to me on first glance, but I think we really
> should improve the error handling to abort the state machine on
> malformed header here.
> 
> One further issue that we have is simple_strntoul()'s acceptance of
> "0x" prefixes for the hex strings - any initramfs which carries such
> prefixes will now result in an error.
> It's a pretty obscure corner case, but cpio is really easy to generate
> from printf(), so maybe there are some images out there which rely on
> this.
> 
> I've written an initramfs_test regression test for the "0x" prefix
> handling. I'll send it to the list.

Is it specified?

The standard refers to octal numbers, we seem to use hexadecimal.
I don't believe the 0x will ever appear here.

Otherwise, please point out to the specifications.

-- 
With Best Regards,
Andy Shevchenko
Re: [PATCH v1 2/4] initramfs: Refactor to use hex2bin() instead of custom approach
Posted by David Disseldorp 2 weeks, 4 days ago
On Tue, 20 Jan 2026 22:34:45 +0200, Andy Shevchenko wrote:

> On Wed, Jan 21, 2026 at 07:12:50AM +1100, David Disseldorp wrote:
> > On Mon, 19 Jan 2026 21:38:39 +0100, Andy Shevchenko wrote:
> >   
> > > +	ret = hex2bin((u8 *)header, s + 6, sizeof(header));
> > > +	if (ret)
> > > +		error("damaged header");  
> > 
> > The changes look reasonable to me on first glance, but I think we really
> > should improve the error handling to abort the state machine on
> > malformed header here.
> > 
> > One further issue that we have is simple_strntoul()'s acceptance of
> > "0x" prefixes for the hex strings - any initramfs which carries such
> > prefixes will now result in an error.
> > It's a pretty obscure corner case, but cpio is really easy to generate
> > from printf(), so maybe there are some images out there which rely on
> > this.
> > 
> > I've written an initramfs_test regression test for the "0x" prefix
> > handling. I'll send it to the list.  
> 
> Is it specified?
> 
> The standard refers to octal numbers, we seem to use hexadecimal.
> I don't believe the 0x will ever appear here.
> 
> Otherwise, please point out to the specifications.

The kernel initramfs specification is at
Documentation/driver-api/early-userspace/buffer-format.rst :

  The structure of the cpio_header is as follows (all fields contain
  hexadecimal ASCII numbers fully padded with '0' on the left to the
  full width of the field, for example, the integer 4780 is represented
  by the ASCII string "000012ac"):
  ...

I.e. a "0x" isn't specified as valid prefix. I don't feel strongly
regarding diverging from existing behaviour, but it should still be
considered (and documented) as a potentially user-visible regression.

Cheers, David
Re: [PATCH v1 2/4] initramfs: Refactor to use hex2bin() instead of custom approach
Posted by Andy Shevchenko 2 weeks, 4 days ago
On Wed, Jan 21, 2026 at 08:00:15AM +1100, David Disseldorp wrote:
> On Tue, 20 Jan 2026 22:34:45 +0200, Andy Shevchenko wrote:
> > On Wed, Jan 21, 2026 at 07:12:50AM +1100, David Disseldorp wrote:
> > > On Mon, 19 Jan 2026 21:38:39 +0100, Andy Shevchenko wrote:

...

> > > > +	ret = hex2bin((u8 *)header, s + 6, sizeof(header));
> > > > +	if (ret)
> > > > +		error("damaged header");  
> > > 
> > > The changes look reasonable to me on first glance, but I think we really
> > > should improve the error handling to abort the state machine on
> > > malformed header here.
> > > 
> > > One further issue that we have is simple_strntoul()'s acceptance of
> > > "0x" prefixes for the hex strings - any initramfs which carries such
> > > prefixes will now result in an error.
> > > It's a pretty obscure corner case, but cpio is really easy to generate
> > > from printf(), so maybe there are some images out there which rely on
> > > this.
> > > 
> > > I've written an initramfs_test regression test for the "0x" prefix
> > > handling. I'll send it to the list.  
> > 
> > Is it specified?
> > 
> > The standard refers to octal numbers, we seem to use hexadecimal.
> > I don't believe the 0x will ever appear here.
> > 
> > Otherwise, please point out to the specifications.
> 
> The kernel initramfs specification is at
> Documentation/driver-api/early-userspace/buffer-format.rst :

Thanks!

>   The structure of the cpio_header is as follows (all fields contain
>   hexadecimal ASCII numbers fully padded with '0' on the left to the
>   full width of the field, for example, the integer 4780 is represented
>   by the ASCII string "000012ac"):
>   ...
> 
> I.e. a "0x" isn't specified as valid prefix. I don't feel strongly
> regarding diverging from existing behaviour,

> but it should still be
> considered (and documented) as a potentially user-visible regression.

I disagree, this is not specified and should not be used. The CPIO archive in
the original form doesn't specify leading 0 for octals (at least how I read it,
please correct me, if I'm wrong).

https://pubs.opengroup.org/onlinepubs/007908799/xcu/pax.html

-- 
With Best Regards,
Andy Shevchenko
Re: [PATCH v1 2/4] initramfs: Refactor to use hex2bin() instead of custom approach
Posted by H. Peter Anvin 1 week, 4 days ago
On 2026-01-20 13:17, Andy Shevchenko wrote:
>>
>> I.e. a "0x" isn't specified as valid prefix. I don't feel strongly
>> regarding diverging from existing behaviour,
> 
>> but it should still be
>> considered (and documented) as a potentially user-visible regression.
> 
> I disagree, this is not specified and should not be used. The CPIO archive in
> the original form doesn't specify leading 0 for octals (at least how I read it,
> please correct me, if I'm wrong).
> 
> https://pubs.opengroup.org/onlinepubs/007908799/xcu/pax.html
> 

This is the "newc" or "crc" format used by BSD (header magic 070701 and
070702, respectively), those were never standardized by POSIX.

These formats use 8-character hexadecimal (%08x).

As far as a 0x prefix, or upper case -- no, that is technically not according
to spec, but we DO NOT break user space tools that have been working for 20+
years, unless it is either (a) a security problem or (b) is holding back
further development (e.g. because the format is ambiguous. This is Postel's
law in action.

	-hpa