[PATCH 1/2] soc: qcom: qmi: Fix "invalid data length" in encoder

Bjorn Andersson posted 2 patches 1 month, 2 weeks ago
[PATCH 1/2] soc: qcom: qmi: Fix "invalid data length" in encoder
Posted by Bjorn Andersson 1 month, 2 weeks ago
When encoding QMI messages, the "source buffer" is a C-struct in the
host memory, so while the data that goes into the outgoing buffer should
be converted to little endian, the length should not be.

Commit 'fe099c387e06 ("soc: qcom: preserve CPU endianness for
QMI_DATA_LEN")' fixed this, but did it by copying a whole word from the
source into a local u32 and then operated on that.

If the length in the DATA_LEN refers to either a char or short array,
it's reasonable to expect that the struct is packed such that this word
will contain not only the length-byte (or length-short), but also the
beginning of the payload.

As the encoder loops around to encode the payload it runs into an
unreasonable value of "data_len_value" and bails, with the error message
"qmi_encode: Invalid data length".

Rather then complicating the logic with local variables of different
types we can instead pick the u8 or u16 "data_len_value" directly from
"buf_src". As "buf_src" refers to a typical C-structure in the client
drivers, we expect this field to be naturally aligned.

We can then return to the original expression of qmi_encode_basic_elem()
encoding directly from "src_buf" to "dst_buf", with the endianness
conversion, based on the size of the type.

Reported-by: David Heidelberg <david@ixit.cz>
Closes: https://lore.kernel.org/all/dfb72933-938f-43f2-87f3-2e3ab9697125@ixit.cz/
Fixes: fe099c387e06 ("soc: qcom: preserve CPU endianness for QMI_DATA_LEN")
Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
---
 drivers/soc/qcom/qmi_encdec.c | 26 ++++++++++----------------
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/soc/qcom/qmi_encdec.c b/drivers/soc/qcom/qmi_encdec.c
index 28ce6f130b6ac355820bb295c8c96f9c6a6e385f..45bb26d010da77ab8d481897026b718c2290bad7 100644
--- a/drivers/soc/qcom/qmi_encdec.c
+++ b/drivers/soc/qcom/qmi_encdec.c
@@ -368,8 +368,6 @@ static int qmi_encode(const struct qmi_elem_info *ei_array, void *out_buf,
 	const void *buf_src;
 	int encode_tlv = 0;
 	int rc;
-	u8 val8;
-	u16 val16;
 
 	if (!ei_array)
 		return 0;
@@ -406,7 +404,6 @@ static int qmi_encode(const struct qmi_elem_info *ei_array, void *out_buf,
 			break;
 
 		case QMI_DATA_LEN:
-			memcpy(&data_len_value, buf_src, sizeof(u32));
 			data_len_sz = temp_ei->elem_size == sizeof(u8) ?
 					sizeof(u8) : sizeof(u16);
 			/* Check to avoid out of range buffer access */
@@ -416,19 +413,16 @@ static int qmi_encode(const struct qmi_elem_info *ei_array, void *out_buf,
 				       __func__);
 				return -ETOOSMALL;
 			}
-			if (data_len_sz == sizeof(u8)) {
-				val8 = data_len_value;
-				rc = qmi_encode_basic_elem(buf_dst, &val8,
-							   1, data_len_sz);
-				if (rc < 0)
-					return rc;
-			} else {
-				val16 = data_len_value;
-				rc = qmi_encode_basic_elem(buf_dst, &val16,
-							   1, data_len_sz);
-				if (rc < 0)
-					return rc;
-			}
+
+			if (data_len_sz == sizeof(u8))
+				data_len_value = *(u8 *)buf_src;
+			else
+				data_len_value = *(u16 *)buf_src;
+
+			rc = qmi_encode_basic_elem(buf_dst, buf_src, 1, data_len_sz);
+			if (rc < 0)
+				return rc;
+
 			UPDATE_ENCODE_VARIABLES(temp_ei, buf_dst,
 						encoded_bytes, tlv_len,
 						encode_tlv, rc);

-- 
2.51.0
Re: [PATCH 1/2] soc: qcom: qmi: Fix "invalid data length" in encoder
Posted by Alexander Wilhelm 1 month, 2 weeks ago
On Sat, Feb 14, 2026 at 03:16:55PM -0600, Bjorn Andersson wrote:
> When encoding QMI messages, the "source buffer" is a C-struct in the
> host memory, so while the data that goes into the outgoing buffer should
> be converted to little endian, the length should not be.
> 
> Commit 'fe099c387e06 ("soc: qcom: preserve CPU endianness for
> QMI_DATA_LEN")' fixed this, but did it by copying a whole word from the
> source into a local u32 and then operated on that.
> 
> If the length in the DATA_LEN refers to either a char or short array,
> it's reasonable to expect that the struct is packed such that this word
> will contain not only the length-byte (or length-short), but also the
> beginning of the payload.
> 
> As the encoder loops around to encode the payload it runs into an
> unreasonable value of "data_len_value" and bails, with the error message
> "qmi_encode: Invalid data length".
> 
> Rather then complicating the logic with local variables of different
> types we can instead pick the u8 or u16 "data_len_value" directly from
> "buf_src". As "buf_src" refers to a typical C-structure in the client
> drivers, we expect this field to be naturally aligned.
> 
> We can then return to the original expression of qmi_encode_basic_elem()
> encoding directly from "src_buf" to "dst_buf", with the endianness
> conversion, based on the size of the type.
> 
> Reported-by: David Heidelberg <david@ixit.cz>
> Closes: https://urldefense.com/v3/__https://lore.kernel.org/all/dfb72933-938f-43f2-87f3-2e3ab9697125@ixit.cz/__;!!I9LPvj3b!BCfk4-YtwbkEy3mc_UUojT1xCH5BW5COilqBek1tBnJyWzp2eK716Cj0C_35FQwo8__BS8qk_PK5oJs9i719BCjcA-rnMg3YY71aTHHs$ 
> Fixes: fe099c387e06 ("soc: qcom: preserve CPU endianness for QMI_DATA_LEN")
> Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
> ---
>  drivers/soc/qcom/qmi_encdec.c | 26 ++++++++++----------------
>  1 file changed, 10 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/soc/qcom/qmi_encdec.c b/drivers/soc/qcom/qmi_encdec.c
> index 28ce6f130b6ac355820bb295c8c96f9c6a6e385f..45bb26d010da77ab8d481897026b718c2290bad7 100644
> --- a/drivers/soc/qcom/qmi_encdec.c
> +++ b/drivers/soc/qcom/qmi_encdec.c
> @@ -368,8 +368,6 @@ static int qmi_encode(const struct qmi_elem_info *ei_array, void *out_buf,
>  	const void *buf_src;
>  	int encode_tlv = 0;
>  	int rc;
> -	u8 val8;
> -	u16 val16;
>  
>  	if (!ei_array)
>  		return 0;
> @@ -406,7 +404,6 @@ static int qmi_encode(const struct qmi_elem_info *ei_array, void *out_buf,
>  			break;
>  
>  		case QMI_DATA_LEN:
> -			memcpy(&data_len_value, buf_src, sizeof(u32));

Hi Bjorn,

unfortunatelly, this change breaks the `ath11k`, and most likely `ath12k`,
execution on big-endian platforms:

    ath11k_pci 0001:01:00.0: BAR 0: assigned [mem 0xc00000000-0xc001fffff 64bit]
    ath11k_pci 0001:01:00.0: MSI vectors: 1
    ath11k_pci 0001:01:00.0: qcn9074 hw1.0
    ath11k_pci 0001:01:00.0: FW memory mode: 0
    ath11k_pci 0002:01:00.0: BAR 0: assigned [mem 0xc10000000-0xc101fffff 64bit]
    ath11k_pci 0002:01:00.0: MSI vectors: 1
    ath11k_pci 0002:01:00.0: qcn9074 hw1.0
    ath11k_pci 0002:01:00.0: FW memory mode: 0
    ath11k_pci 0001:01:00.0: invalid memory segment length: 83886080
    ath11k_pci 0001:01:00.0: invalid memory segment length: 419430400
    ath11k_pci 0001:01:00.0: qmi respond memory request failed: 1 0
    ath11k_pci 0001:01:00.0: qmi failed to respond fw mem req: -22
    ath11k_pci 0001:01:00.0: qmi respond memory request failed: 1 48
    ath11k_pci 0001:01:00.0: qmi failed to respond fw mem req: -22
    ath11k_pci 0002:01:00.0: invalid memory segment length: 83886080
    ath11k_pci 0002:01:00.0: invalid memory segment length: 419430400
    ath11k_pci 0002:01:00.0: qmi respond memory request failed: 1 0
    ath11k_pci 0002:01:00.0: qmi failed to respond fw mem req: -22 

I tried to analyze the regression I introduced and I think I now understand
what went wrong. Previously, the code looked like the this:

    memcpy(&data_len_value, buf_src, temp_ei->elem_size);

However, this never worked correctly on big‑endian systems. `buf_src` is a
`void *`, but `ath11k` and `ath12k` always store the data as `u32`. Assume
the element value is `0xABCD` with an elem_size of 2, that is, the
`sizeof(u16)`. The memory layout on the driver side then looks like this (X
marks unused bytes):

    +---------------+----+----+----+----+
    | Little Endian | XX | XX | AB | CD |
    +---------------+----+----+----+----+
    | Big Endian    | CD | AB | XX | XX |
    +---------------+----+----+----+----+

When `buf_src` is treated as an array of `u32` and then “reinterpreted” as
an array of `u8`, only the first 2 bytes of the `u32` are copied, which, on
big‑endian, no longer contain the actual data. After the copy,
`data_len_value` contains the following data:

    +---------------+----+----+----+----+
    | Little Endian | XX | XX | AB | CD |
    +---------------+----+----+----+----+
    | Big Endian    | XX | XX | XX | XX |
    +---------------+----+----+----+----+

So the original value `0xABCD` never gets copied at all on big‑endian
systems. This is why a simple pointer cast cannot work reliably on
big‑endian architectures. I did the following change:

    memcpy(&data_len_value, buf_src, sizeof(u32));

My attempt was to always copy the full `u32` value , but it seems that the
modem on the "Pixel 3" does not actually use a `u32` there, but rather an
array or a packed structure. I’ve CC’ed Jeff and the `ath11k/ath12k`
mailing list as well. Hopefully we can find a solution that works across
both endianness architectures.

>  			data_len_sz = temp_ei->elem_size == sizeof(u8) ?
>  					sizeof(u8) : sizeof(u16);
>  			/* Check to avoid out of range buffer access */
> @@ -416,19 +413,16 @@ static int qmi_encode(const struct qmi_elem_info *ei_array, void *out_buf,
>  				       __func__);
>  				return -ETOOSMALL;
>  			}
> -			if (data_len_sz == sizeof(u8)) {
> -				val8 = data_len_value;
> -				rc = qmi_encode_basic_elem(buf_dst, &val8,
> -							   1, data_len_sz);
> -				if (rc < 0)
> -					return rc;
> -			} else {
> -				val16 = data_len_value;
> -				rc = qmi_encode_basic_elem(buf_dst, &val16,
> -							   1, data_len_sz);
> -				if (rc < 0)
> -					return rc;
> -			}
> +
> +			if (data_len_sz == sizeof(u8))
> +				data_len_value = *(u8 *)buf_src;
> +			else
> +				data_len_value = *(u16 *)buf_src;
> +
> +			rc = qmi_encode_basic_elem(buf_dst, buf_src, 1, data_len_sz);

Here is the problem again: `buf_src` is once more being cast either to a
`u8 *` or a `u16 *`. This does not cause issues on little‑endian systems,
but it corrupts the data on big‑endian platforms.


Best regards
Alexander Wilhelm
Re: [PATCH 1/2] soc: qcom: qmi: Fix "invalid data length" in encoder
Posted by Bjorn Andersson 1 month, 2 weeks ago
On Mon, Feb 16, 2026 at 09:58:35AM +0100, Alexander Wilhelm wrote:
> On Sat, Feb 14, 2026 at 03:16:55PM -0600, Bjorn Andersson wrote:
> > When encoding QMI messages, the "source buffer" is a C-struct in the
> > host memory, so while the data that goes into the outgoing buffer should
> > be converted to little endian, the length should not be.
> > 
> > Commit 'fe099c387e06 ("soc: qcom: preserve CPU endianness for
> > QMI_DATA_LEN")' fixed this, but did it by copying a whole word from the
> > source into a local u32 and then operated on that.
> > 
> > If the length in the DATA_LEN refers to either a char or short array,
> > it's reasonable to expect that the struct is packed such that this word
> > will contain not only the length-byte (or length-short), but also the
> > beginning of the payload.
> > 
> > As the encoder loops around to encode the payload it runs into an
> > unreasonable value of "data_len_value" and bails, with the error message
> > "qmi_encode: Invalid data length".
> > 
> > Rather then complicating the logic with local variables of different
> > types we can instead pick the u8 or u16 "data_len_value" directly from
> > "buf_src". As "buf_src" refers to a typical C-structure in the client
> > drivers, we expect this field to be naturally aligned.
> > 
> > We can then return to the original expression of qmi_encode_basic_elem()
> > encoding directly from "src_buf" to "dst_buf", with the endianness
> > conversion, based on the size of the type.
> > 
> > Reported-by: David Heidelberg <david@ixit.cz>
> > Closes: https://urldefense.com/v3/__https://lore.kernel.org/all/dfb72933-938f-43f2-87f3-2e3ab9697125@ixit.cz/__;!!I9LPvj3b!BCfk4-YtwbkEy3mc_UUojT1xCH5BW5COilqBek1tBnJyWzp2eK716Cj0C_35FQwo8__BS8qk_PK5oJs9i719BCjcA-rnMg3YY71aTHHs$ 
> > Fixes: fe099c387e06 ("soc: qcom: preserve CPU endianness for QMI_DATA_LEN")
> > Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
> > ---
> >  drivers/soc/qcom/qmi_encdec.c | 26 ++++++++++----------------
> >  1 file changed, 10 insertions(+), 16 deletions(-)
> > 
> > diff --git a/drivers/soc/qcom/qmi_encdec.c b/drivers/soc/qcom/qmi_encdec.c
> > index 28ce6f130b6ac355820bb295c8c96f9c6a6e385f..45bb26d010da77ab8d481897026b718c2290bad7 100644
> > --- a/drivers/soc/qcom/qmi_encdec.c
> > +++ b/drivers/soc/qcom/qmi_encdec.c
> > @@ -368,8 +368,6 @@ static int qmi_encode(const struct qmi_elem_info *ei_array, void *out_buf,
> >  	const void *buf_src;
> >  	int encode_tlv = 0;
> >  	int rc;
> > -	u8 val8;
> > -	u16 val16;
> >  
> >  	if (!ei_array)
> >  		return 0;
> > @@ -406,7 +404,6 @@ static int qmi_encode(const struct qmi_elem_info *ei_array, void *out_buf,
> >  			break;
> >  
> >  		case QMI_DATA_LEN:
> > -			memcpy(&data_len_value, buf_src, sizeof(u32));
> 
> Hi Bjorn,
> 
> unfortunatelly, this change breaks the `ath11k`, and most likely `ath12k`,
> execution on big-endian platforms:
> 
>     ath11k_pci 0001:01:00.0: BAR 0: assigned [mem 0xc00000000-0xc001fffff 64bit]
>     ath11k_pci 0001:01:00.0: MSI vectors: 1
>     ath11k_pci 0001:01:00.0: qcn9074 hw1.0
>     ath11k_pci 0001:01:00.0: FW memory mode: 0
>     ath11k_pci 0002:01:00.0: BAR 0: assigned [mem 0xc10000000-0xc101fffff 64bit]
>     ath11k_pci 0002:01:00.0: MSI vectors: 1
>     ath11k_pci 0002:01:00.0: qcn9074 hw1.0
>     ath11k_pci 0002:01:00.0: FW memory mode: 0
>     ath11k_pci 0001:01:00.0: invalid memory segment length: 83886080
>     ath11k_pci 0001:01:00.0: invalid memory segment length: 419430400
>     ath11k_pci 0001:01:00.0: qmi respond memory request failed: 1 0
>     ath11k_pci 0001:01:00.0: qmi failed to respond fw mem req: -22
>     ath11k_pci 0001:01:00.0: qmi respond memory request failed: 1 48
>     ath11k_pci 0001:01:00.0: qmi failed to respond fw mem req: -22
>     ath11k_pci 0002:01:00.0: invalid memory segment length: 83886080
>     ath11k_pci 0002:01:00.0: invalid memory segment length: 419430400
>     ath11k_pci 0002:01:00.0: qmi respond memory request failed: 1 0
>     ath11k_pci 0002:01:00.0: qmi failed to respond fw mem req: -22 
> 
> I tried to analyze the regression I introduced and I think I now understand
> what went wrong. Previously, the code looked like the this:
> 
>     memcpy(&data_len_value, buf_src, temp_ei->elem_size);

Above log indicates that we have a problem on the decode-side. This
patch 1, changes the encoder side only. But we might have the same bug
in both cases(?)...

> 
> However, this never worked correctly on big‑endian systems. `buf_src` is a
> `void *`, but `ath11k` and `ath12k` always store the data as `u32`.

This I think is the actual bug.

Based on the "invalid memory segment length" error, I can see we're
talking about struct qmi_wlanfw_request_mem_ind_msg_v01.

This defined mem_seg_len as an u32, 

But qmi_wlanfw_request_mem_ind_msg_v01_ei[] tells us that it's an u8
(elem_size = 1).

> Assume
> the element value is `0xABCD` with an elem_size of 2, that is, the
> `sizeof(u16)`. The memory layout on the driver side then looks like this (X
> marks unused bytes):
> 
>     +---------------+----+----+----+----+
>     | Little Endian | XX | XX | AB | CD |
>     +---------------+----+----+----+----+
>     | Big Endian    | CD | AB | XX | XX |
>     +---------------+----+----+----+----+
> 
> When `buf_src` is treated as an array of `u32` and then “reinterpreted” as
> an array of `u8`, only the first 2 bytes of the `u32` are copied, which, on
> big‑endian, no longer contain the actual data. After the copy,
> `data_len_value` contains the following data:
> 
>     +---------------+----+----+----+----+
>     | Little Endian | XX | XX | AB | CD |
>     +---------------+----+----+----+----+
>     | Big Endian    | XX | XX | XX | XX |
>     +---------------+----+----+----+----+
> 
> So the original value `0xABCD` never gets copied at all on big‑endian
> systems. This is why a simple pointer cast cannot work reliably on
> big‑endian architectures. I did the following change:
> 

You're analysis is correct!

An incoming 0xcd will be written as 0xcd000000 in the 32-bit
mem_seg_len. This is confirmed by your log, which actually are the
values 5 and 25.

Looking further, I see that qmi_wlanfw_host_cap_req_msg_v01_ei[]
contains "gpios_len", which is declared as sizeof(u8), but gpios_len is
u32. So the ath11k driver has the same problem on the encode side.

In contrast, you can see struct ssctl_subsys_event_req which defines
subsys_name_len as u8 and ssctl_subsys_event_req_ei[0] specified
elem_size of sizeof(uint8_t) - which is the source of the bug that David
reported.

>     memcpy(&data_len_value, buf_src, sizeof(u32));
> 
> My attempt was to always copy the full `u32` value , but it seems that the
> modem on the "Pixel 3" does not actually use a `u32` there, but rather an
> array or a packed structure. I’ve CC’ed Jeff and the `ath11k/ath12k`
> mailing list as well. Hopefully we can find a solution that works across
> both endianness architectures.
> 

This isn't just the modem on Pixel 3, that message is sent on a wide
variety of Qualcomm SoCs.

> >  			data_len_sz = temp_ei->elem_size == sizeof(u8) ?
> >  					sizeof(u8) : sizeof(u16);
> >  			/* Check to avoid out of range buffer access */
> > @@ -416,19 +413,16 @@ static int qmi_encode(const struct qmi_elem_info *ei_array, void *out_buf,
> >  				       __func__);
> >  				return -ETOOSMALL;
> >  			}
> > -			if (data_len_sz == sizeof(u8)) {
> > -				val8 = data_len_value;
> > -				rc = qmi_encode_basic_elem(buf_dst, &val8,
> > -							   1, data_len_sz);
> > -				if (rc < 0)
> > -					return rc;
> > -			} else {
> > -				val16 = data_len_value;
> > -				rc = qmi_encode_basic_elem(buf_dst, &val16,
> > -							   1, data_len_sz);
> > -				if (rc < 0)
> > -					return rc;
> > -			}
> > +
> > +			if (data_len_sz == sizeof(u8))
> > +				data_len_value = *(u8 *)buf_src;
> > +			else
> > +				data_len_value = *(u16 *)buf_src;
> > +
> > +			rc = qmi_encode_basic_elem(buf_dst, buf_src, 1, data_len_sz);
> 
> Here is the problem again: `buf_src` is once more being cast either to a
> `u8 *` or a `u16 *`. This does not cause issues on little‑endian systems,
> but it corrupts the data on big‑endian platforms.
> 

Correct, but prior to 3ced38da5f7d ("soc: qcom: QMI encoding/decoding
for big endian") only temp_ei->elem_size bytes of buf_src was
considered. I.e. for a elem_size of 1, only the byte pointed to by
buf_src was considered to be the length - and this happens to be the
same thing with little endian.

And as the commit message of this patch states, if we change this to
always read 32-bits from the length we're likely reading "garbage"
beyond the u8 or u16 length member.


It might very well be that the underlying bug is my expectation that
elem_size should be reflected in the struct and not only in the encoded
message, and hence what I wrote in https://github.com/linux-msm/qmic.
Perhaps the length-specifier of an array should always be u32?

@Chris, what does the downstream generator produce here?

Regards,
Bjorn

> 
> Best regards
> Alexander Wilhelm
Re: [PATCH 1/2] soc: qcom: qmi: Fix "invalid data length" in encoder
Posted by Jeff Johnson 1 month, 1 week ago
On 2/16/2026 7:25 AM, Bjorn Andersson wrote:
> It might very well be that the underlying bug is my expectation that
> elem_size should be reflected in the struct and not only in the encoded
> message, and hence what I wrote in https://github.com/linux-msm/qmic.
> Perhaps the length-specifier of an array should always be u32?
> 
> @Chris, what does the downstream generator produce here?

Is this behavior just constrained to QMI_DATA_LEN TLVs?

I'm looking at downstream Android WLAN code and it has the same discrepancy,
so it appears the code generator is always producing a u32 member in the host
struct to hold a QMI_DATA_LEN member even though the actual element size as
defined in the qmi_elem_info array is either sizeof(u8) or sizeof(u16).

Does this issue get fixed if we change the member in the host struct, i.e. for
the issue mentioned (that I chopped off) modify:
struct qmi_wlanfw_request_mem_ind_msg_v01 {
-	u32 mem_seg_len;
+	u8 mem_seg_len;
Re: [PATCH 1/2] soc: qcom: qmi: Fix "invalid data length" in encoder
Posted by Bjorn Andersson 1 month, 1 week ago
On Wed, Feb 18, 2026 at 10:00:22AM -0800, Jeff Johnson wrote:
> On 2/16/2026 7:25 AM, Bjorn Andersson wrote:
> > It might very well be that the underlying bug is my expectation that
> > elem_size should be reflected in the struct and not only in the encoded
> > message, and hence what I wrote in https://github.com/linux-msm/qmic.
> > Perhaps the length-specifier of an array should always be u32?
> > 
> > @Chris, what does the downstream generator produce here?
> 
> Is this behavior just constrained to QMI_DATA_LEN TLVs?
> 
> I'm looking at downstream Android WLAN code and it has the same discrepancy,
> so it appears the code generator is always producing a u32 member in the host
> struct to hold a QMI_DATA_LEN member even though the actual element size as
> defined in the qmi_elem_info array is either sizeof(u8) or sizeof(u16).
> 
> Does this issue get fixed if we change the member in the host struct, i.e. for
> the issue mentioned (that I chopped off) modify:
> struct qmi_wlanfw_request_mem_ind_msg_v01 {
> -	u32 mem_seg_len;
> +	u8 mem_seg_len;
> 

Yes, that would fix the problem.

But if your elem_info arrays are coming from the downstream tool, then I
think the correct way to fix this would be to move in the other
direction and turn some those u8/u16 members elsewhere into u32.

I have the downstream code generator source here somewhere...

Regards,
Bjorn
Re: [PATCH 1/2] soc: qcom: qmi: Fix "invalid data length" in encoder
Posted by Bjorn Andersson 1 month, 1 week ago
On Wed, Feb 18, 2026 at 12:53:03PM -0600, Bjorn Andersson wrote:
> On Wed, Feb 18, 2026 at 10:00:22AM -0800, Jeff Johnson wrote:
> > On 2/16/2026 7:25 AM, Bjorn Andersson wrote:
> > > It might very well be that the underlying bug is my expectation that
> > > elem_size should be reflected in the struct and not only in the encoded
> > > message, and hence what I wrote in https://github.com/linux-msm/qmic.
> > > Perhaps the length-specifier of an array should always be u32?
> > > 
> > > @Chris, what does the downstream generator produce here?
> > 
> > Is this behavior just constrained to QMI_DATA_LEN TLVs?
> > 
> > I'm looking at downstream Android WLAN code and it has the same discrepancy,
> > so it appears the code generator is always producing a u32 member in the host
> > struct to hold a QMI_DATA_LEN member even though the actual element size as
> > defined in the qmi_elem_info array is either sizeof(u8) or sizeof(u16).
> > 
> > Does this issue get fixed if we change the member in the host struct, i.e. for
> > the issue mentioned (that I chopped off) modify:
> > struct qmi_wlanfw_request_mem_ind_msg_v01 {
> > -	u32 mem_seg_len;
> > +	u8 mem_seg_len;
> > 
> 
> Yes, that would fix the problem.
> 
> But if your elem_info arrays are coming from the downstream tool, then I
> think the correct way to fix this would be to move in the other
> direction and turn some those u8/u16 members elsewhere into u32.
> 
> I have the downstream code generator source here somewhere...
> 

I reviewed the downstream code generator source and documentation.

We do generate tables matching the ath12k c-structures, i.e. variable
length arrays are always prefixed with an uint32_t field - not a
uint8_t or uint16_t based on elem_size.

Looking back at the original implementation of the in-kernel
qmi_encode(), we only read elem_size bytes from the c-structure, but we
do so into the (little-endian) uint32_t on the stack, from which we
encode the message and act upon the result.

In qmi_decode() we decode elem_size bytes from the message into the
(little endian) uint32_t and then write 4 bytes to the c-structure.


The fix would as such seem to be to just update the length fields to be
all uint32_t. The problem I see with this is that qmic [1] is the only
publicly available code generator, and if we change it to always
generate uint32_t length members, we also need to fix the
encoder/decoder in libqrtr [2] - which will be an ABI breaking change.

If we go the other way around, the drawback is that we no longer support
the c-structures generated by the proprietary code generator.

Worth pointing out is that the structure of the c-code is an ABI between
the encoder/decoder, the code generator and the client - it does not
affect the wire format.

[1] https://github.com/linux-msm/qmic
[2] https://github.com/linux-msm/qrtr

Regards,
Bjorn

> Regards,
> Bjorn
Re: [PATCH 1/2] soc: qcom: qmi: Fix "invalid data length" in encoder
Posted by Jeff Johnson 1 month, 1 week ago
On 2/19/2026 11:18 AM, Bjorn Andersson wrote:
> I reviewed the downstream code generator source and documentation.
> 
> We do generate tables matching the ath12k c-structures, i.e. variable
> length arrays are always prefixed with an uint32_t field - not a
> uint8_t or uint16_t based on elem_size.
> 
> Looking back at the original implementation of the in-kernel
> qmi_encode(), we only read elem_size bytes from the c-structure, but we
> do so into the (little-endian) uint32_t on the stack, from which we
> encode the message and act upon the result.
> 
> In qmi_decode() we decode elem_size bytes from the message into the
> (little endian) uint32_t and then write 4 bytes to the c-structure.
> 
> 
> The fix would as such seem to be to just update the length fields to be
> all uint32_t. The problem I see with this is that qmic [1] is the only
> publicly available code generator, and if we change it to always
> generate uint32_t length members, we also need to fix the
> encoder/decoder in libqrtr [2] - which will be an ABI breaking change.

And IMO that is a deal breaker since it would break the interface with all
existing legacy firmware.

> 
> If we go the other way around, the drawback is that we no longer support
> the c-structures generated by the proprietary code generator.
> 
> Worth pointing out is that the structure of the c-code is an ABI between
> the encoder/decoder, the code generator and the client - it does not
> affect the wire format.
> 
> [1] https://github.com/linux-msm/qmic
> [2] https://github.com/linux-msm/qrtr

Going back to the original implementation that reads and writes a u32 on the
stack, can we stick with that but add endian logic that correctly converts
between u32 host endian on the stack and either u8 or u16 little endian in the
messages? Is this specific to QMI_DATA_LEN TLVs?

/jeff
Re: [PATCH 1/2] soc: qcom: qmi: Fix "invalid data length" in encoder
Posted by Bjorn Andersson 1 month, 1 week ago
On Thu, Feb 19, 2026 at 01:25:36PM -0800, Jeff Johnson wrote:
> On 2/19/2026 11:18 AM, Bjorn Andersson wrote:
> > I reviewed the downstream code generator source and documentation.
> > 
> > We do generate tables matching the ath12k c-structures, i.e. variable
> > length arrays are always prefixed with an uint32_t field - not a
> > uint8_t or uint16_t based on elem_size.
> > 
> > Looking back at the original implementation of the in-kernel
> > qmi_encode(), we only read elem_size bytes from the c-structure, but we
> > do so into the (little-endian) uint32_t on the stack, from which we
> > encode the message and act upon the result.
> > 
> > In qmi_decode() we decode elem_size bytes from the message into the
> > (little endian) uint32_t and then write 4 bytes to the c-structure.
> > 
> > 
> > The fix would as such seem to be to just update the length fields to be
> > all uint32_t. The problem I see with this is that qmic [1] is the only
> > publicly available code generator, and if we change it to always
> > generate uint32_t length members, we also need to fix the
> > encoder/decoder in libqrtr [2] - which will be an ABI breaking change.
> 
> And IMO that is a deal breaker since it would break the interface with all
> existing legacy firmware.
> 

No, the firwmare-facing encoded length in the messages are currently all
little-endian elem_sized, and this would be unchanged. It's merely a
question about the ABI between code generator, encoder/decoder, and the
client code.

> > 
> > If we go the other way around, the drawback is that we no longer support
> > the c-structures generated by the proprietary code generator.
> > 
> > Worth pointing out is that the structure of the c-code is an ABI between
> > the encoder/decoder, the code generator and the client - it does not
> > affect the wire format.
> > 
> > [1] https://github.com/linux-msm/qmic
> > [2] https://github.com/linux-msm/qrtr
> 
> Going back to the original implementation that reads and writes a u32 on the
> stack, can we stick with that but add endian logic that correctly converts
> between u32 host endian on the stack and either u8 or u16 little endian in the
> messages? Is this specific to QMI_DATA_LEN TLVs?
> 

I gave it some more thought, and discussed a bit with Chris Lew.

If we change qmic to produce uint32_t length entries and align the
in-kernel interfaces to use 32-bit lengths the kernel works fine - and
thanks to Alexander's work, should support both endianes (will double
check).

For the userspace library, the decoder already writes 32-bit fields
(into the u8/u16...) so the situation for currently generated c-structs
will be unchanged and it will be correct for 32-bit fields.

The encoder is reading u8/u16 from the c-struct and encodes this. Just
as with the structs in athNNk, we can change them to 32-bit without
impacting the encoder; as long as we don't change the encoder...
The only problem I think we have left is that we can't fix the userspace
encoder - as this would be incompatible with current clients (u8/u16
can't be read as u32).


I'll take another pass tomorrow, to review this, review Alexander's work
once more, and prepare some patches.

Regards,
Bjorn