[v4] net: macb: various fixes

[PATCH net v4 5/5] net: macb: avoid double endianness swap in macb_set_hwaddr()

Posted by Théo Lebrun 5 months, 3 weeks ago

writel() does a CPU->LE conversion. Drop manual cpu_to_le*() calls.

On little-endian system:
 - cpu_to_le32() is a no-op (LE->LE),
 - writel() is a no-op (LE->LE),
 - dev_addr will therefore not be swapped and written as-is.

On big-endian system:
 - cpu_to_le32() is a swap (BE->LE),
 - writel() is a swap (BE->LE),
 - dev_addr will therefore be swapped twice and written as a BE value.

This was found using sparse:
   ⟩ make C=2 drivers/net/ethernet/cadence/macb_main.o
   warning: incorrect type in assignment (different base types)
      expected unsigned int [usertype] bottom
      got restricted __le32 [usertype]
   warning: incorrect type in assignment (different base types)
      expected unsigned short [usertype] top
      got restricted __le16 [usertype]
   ...

Fixes: 89e5785fc8a6 ("[PATCH] Atmel MACB ethernet driver")
Reviewed-by: Sean Anderson <sean.anderson@linux.dev>
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
 drivers/net/ethernet/cadence/macb_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 7f31f264a6d342ea01e2f61944b12c9b9a3fe66e..fe319d77f2a8f6b1f3b698e0d11781936345ea8f 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -274,9 +274,9 @@ static void macb_set_hwaddr(struct macb *bp)
 	u32 bottom;
 	u16 top;
 
-	bottom = cpu_to_le32(*((u32 *)bp->dev->dev_addr));
+	bottom = *((u32 *)bp->dev->dev_addr);
 	macb_or_gem_writel(bp, SA1B, bottom);
-	top = cpu_to_le16(*((u16 *)(bp->dev->dev_addr + 4)));
+	top = *((u16 *)(bp->dev->dev_addr + 4));
 	macb_or_gem_writel(bp, SA1T, top);
 
 	if (gem_has_ptp(bp)) {

-- 
2.50.1

Re: [PATCH net v4 5/5] net: macb: avoid double endianness swap in macb_set_hwaddr()

Posted by Russell King (Oracle) 5 months, 3 weeks ago

On Wed, Aug 20, 2025 at 04:55:09PM +0200, Théo Lebrun wrote:
> writel() does a CPU->LE conversion. Drop manual cpu_to_le*() calls.
> 
> On little-endian system:
>  - cpu_to_le32() is a no-op (LE->LE),
>  - writel() is a no-op (LE->LE),
>  - dev_addr will therefore not be swapped and written as-is.
> 
> On big-endian system:
>  - cpu_to_le32() is a swap (BE->LE),
>  - writel() is a swap (BE->LE),
>  - dev_addr will therefore be swapped twice and written as a BE value.

I'm not convinced by this, I think you're missing something.

writel() on a BE or LE system will give you bits 7:0 of the CPU value
written to LE bit 7:0 of the register. It has to be this way, otherwise
we would need to do endian conversions everwhere where we write simple
numbers to device registers.

Why?

Remember that on a LE system with a 32-bit bus, a hex value of
0x76543210 at the CPU when written without conversion will appear
as:
	0 on bus bits 0:3
	1 on bus bits 4:7
	...
	6 on bus bits 24:27
	7 on bus bits 28:31

whereas on a BE system, this is reversed:
	6 on bus bits 0:3
	7 on bus bits 4:7
	...
	0 on bus bits 24:27
	1 on bus bits 28:31

The specification is that writel() will write in LE format even on
BE systems, so there is a need to do an endian conversion for BE
systems.

So, if a device expects bits 0:7 on the bus to be the first byte of
the MAC address (high byte of the OUI) then this must be in CPU
bits 0:7 as well.

Now, assuming that a MAC address of AA:BB:CC:DD:EE:FF gets read as
0xDDCCBBAA by the first read on a LE machine, it will get read as
0xAABBCCDD on a BE machine.

We can now see that combining these two, getting rid of the
cpu_to_le32() is likely wrong.

Therefore, I am not convinced this patch is actually correct.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH net v4 5/5] net: macb: avoid double endianness swap in macb_set_hwaddr()

Posted by Théo Lebrun 5 months, 1 week ago

Hello Russell,

On Wed Aug 20, 2025 at 5:25 PM CEST, Russell King (Oracle) wrote:
> On Wed, Aug 20, 2025 at 04:55:09PM +0200, Théo Lebrun wrote:
>> writel() does a CPU->LE conversion. Drop manual cpu_to_le*() calls.
>> 
>> On little-endian system:
>>  - cpu_to_le32() is a no-op (LE->LE),
>>  - writel() is a no-op (LE->LE),
>>  - dev_addr will therefore not be swapped and written as-is.
>> 
>> On big-endian system:
>>  - cpu_to_le32() is a swap (BE->LE),
>>  - writel() is a swap (BE->LE),
>>  - dev_addr will therefore be swapped twice and written as a BE value.
>
> I'm not convinced by this, I think you're missing something.
>
> writel() on a BE or LE system will give you bits 7:0 of the CPU value
> written to LE bit 7:0 of the register. It has to be this way, otherwise
> we would need to do endian conversions everwhere where we write simple
> numbers to device registers.
>
> Why?
>
> Remember that on a LE system with a 32-bit bus, a hex value of
> 0x76543210 at the CPU when written without conversion will appear
> as:
> 	0 on bus bits 0:3
> 	1 on bus bits 4:7
> 	...
> 	6 on bus bits 24:27
> 	7 on bus bits 28:31
>
> whereas on a BE system, this is reversed:
> 	6 on bus bits 0:3
> 	7 on bus bits 4:7
> 	...
> 	0 on bus bits 24:27
> 	1 on bus bits 28:31
>
> The specification is that writel() will write in LE format even on
> BE systems, so there is a need to do an endian conversion for BE
> systems.
>
> So, if a device expects bits 0:7 on the bus to be the first byte of
> the MAC address (high byte of the OUI) then this must be in CPU
> bits 0:7 as well.
>
>
> Now, assuming that a MAC address of AA:BB:CC:DD:EE:FF gets read as
> 0xDDCCBBAA by the first read on a LE machine, it will get read as
> 0xAABBCCDD on a BE machine.
>
> We can now see that combining these two, getting rid of the
> cpu_to_le32() is likely wrong.
>
> Therefore, I am not convinced this patch is actually correct.

Thanks for the above, in-detail, explanation. I agree with it all.
I've always have had a hard time wrapping my head around endianness
conversion.

Indeed the patch is wrong: the swap is required on BE platforms.
My gripe is more with the semantic of the current code:

   bottom = cpu_to_le32(*((u32 *)bp->dev->dev_addr));
   macb_or_gem_writel(bp, SA1B, bottom);
   top = cpu_to_le16(*((u16 *)(bp->dev->dev_addr + 4)));
   macb_or_gem_writel(bp, SA1T, top);

Notice how:
 - The type of the argument to cpu_to_le32(); pointer is to a CPU-endian
   value (u32) but in reality is to a BE32.
 - We apply cpu_to_le32() to get a swap but the semantic is wrong; input
   value is BE32 that we want to turn into CPU-endian.
 - Above two points apply to `u16 top` as well.
 - writel() are unrelated to the issue; they do the right thing by
   writing a CPU value into a LE device register.

Sparse is complaining about the second bulletpoint; it won't complain
about the first one because it trusts that you know what you are doing
with explicit casts.

   warning: incorrect type in assignment (different base types)
      expected unsigned int [usertype] bottom
      got restricted __le32 [usertype]

If we want to keep to the same structure, this does the exact same but
its semantic is more aligned to reality (to my eyes):

   bottom = le32_to_cpu(*((__le32 *)bp->dev->dev_addr));
   macb_or_gem_writel(bp, SA1B, bottom);
   top = le16_to_cpu(*((__le16 *)(bp->dev->dev_addr + 4)));
   macb_or_gem_writel(bp, SA1T, top);

Notice how:
 - Casts are fixed to signal proper types.
 - Use le32_to_cpu().

Sparse is happy and code has been tested on a BE platform.
Assembly generated is strictly identical.

However, I think we can do better. Second option:

   const unsigned char *addr = bp->dev->dev_addr;

   bottom = addr[0] << 0 | addr[1] << 8 | addr[2] << 16 | addr[3] << 24;
   top = addr[4] << 0 | addr[5] << 8;

This is a bit of a mouthful, what about this one?

   bottom = get_unaligned_le32(addr);
   top = get_unaligned_le16(addr + 4);

It is my preferred. I found those helpers reading more code that reads
the `unsigned char *dev_addr` field. Explicit and straight forward.

Can you confirm that last option fits well?

Thanks Russell,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

[PATCH net v4 1/5] dt-bindings: net: cdns,macb: allow tsu_clk without tx_clk
[PATCH net v4 2/5] net: macb: remove illusion about TBQPH/RBQPH being per-queue
[PATCH net v4 3/5] net: macb: move ring size computation to functions
[PATCH net v4 4/5] net: macb: single dma_alloc_coherent() for DMA descriptors
[PATCH net v4 5/5] net: macb: avoid double endianness swap in macb_set_hwaddr()