[PATCH v2 00/11] fw_devlink improvements

Naresh Kamboju posted 11 patches 2 years, 7 months ago
There is a newer version of this series
[PATCH v2 00/11] fw_devlink improvements
Posted by Naresh Kamboju 2 years, 7 months ago
Build test pass on arm, arm64, i386, mips, parisc, powerpc, riscv, s390, sh,
sparc and x86_64.

Boot and LTP smoke pass on qemu-arm64, qemu-armv7, qemu-i386 and qemu-x86_64.
Boot failed on FVP.

Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>

Please refer to the following link for test details.
FVP boot failure log:
https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-devicetree_20230127001141_407071-1-saravanak_google_com/testrun/14389034/suite/boot/test/gcc-12-lkftconfig-64k_page_size/details/


[    2.613437] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
[    2.613628] Mem abort info:
[    2.613756]   ESR = 0x0000000096000005
[    2.613904]   EC = 0x25: DABT (current EL), IL = 32 bits
[    2.614071]   SET = 0, FnV = 0
[    2.614215]   EA = 0, S1PTW = 0
[    2.614358]   FSC = 0x05: level 1 translation fault
[    2.614517] Data abort info:
[    2.614647]   ISV = 0, ISS = 0x00000005
[    2.614792]   CM = 0, WnR = 0
[    2.614934] [0000000000000010] user address but active_mm is swapper
[    2.615105] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
[    2.615219] Modules linked in:
[    2.615310] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc5 #1
[    2.615445] Hardware name: FVP Base RevC (DT)
[    2.615533] pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[    2.615685] pc : gpiochip_setup_dev (include/linux/err.h:41 include/linux/fwnode.h:201 drivers/gpio/gpiolib.c:586) 
[    2.615816] lr : gpiochip_add_data_with_key (drivers/gpio/gpiolib.c:871) 
[    2.615970] sp : ffff8000081af5e0
[    2.616051] x29: ffff8000081af5e0 x28: 0000000000000000 x27: ffff0008027cb5a0
[    2.616261] x26: 0000000000000000 x25: ffffd7c5d6745910 x24: ffff0008027f4800
[    2.616472] x23: 0000000000000000 x22: ffffd7c5d62b99a8 x21: 0000000000000202
[    2.616679] x20: 0000000000000000 x19: ffff0008027f4800 x18: ffffffffffffffff
[    2.616890] x17: ffffd7c5d6467928 x16: 0000000013e3690a x15: ffff8000081af3b0
[    2.617102] x14: ffff00080275cd8a x13: ffff00080275cd88 x12: 0000000000000001
[    2.617312] x11: 62726568746f6d3a x10: 0000000000000000 x9 : ffffd7c5d3b3ebe0
[    2.617522] x8 : ffff8000081af548 x7 : 0000000000000000 x6 : 0000000000000001
[    2.617727] x5 : 0000000000000000 x4 : ffff000800640000 x3 : ffffd7c5d62b99c8
[    2.617933] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
[    2.618138] Call trace:
[    2.618204] gpiochip_setup_dev (include/linux/err.h:41 include/linux/fwnode.h:201 drivers/gpio/gpiolib.c:586) 
[    2.618337] gpiochip_add_data_with_key (drivers/gpio/gpiolib.c:871) 
[    2.618493] devm_gpiochip_add_data_with_key (drivers/gpio/gpiolib-devres.c:478) 
[    2.618654] bgpio_pdev_probe (drivers/gpio/gpio-mmio.c:793) 
[    2.618785] platform_probe (drivers/base/platform.c:1401) 
[    2.618928] really_probe (drivers/base/dd.c:560 drivers/base/dd.c:639) 
[    2.619056] __driver_probe_device (drivers/base/dd.c:778) 
[    2.619193] driver_probe_device (drivers/base/dd.c:808) 
[    2.619329] __device_attach_driver (drivers/base/dd.c:937) 
[    2.619464] bus_for_each_drv (drivers/base/bus.c:427) 
[    2.619590] __device_attach (drivers/base/dd.c:1010) 
[    2.619722] device_initial_probe (drivers/base/dd.c:1058) 
[    2.619861] bus_probe_device (drivers/base/bus.c:489) 
[    2.619988] device_add (drivers/base/core.c:3637) 
[    2.620102] platform_device_add (drivers/base/platform.c:717) 
[    2.620251] mfd_add_device (drivers/mfd/mfd-core.c:297) 
[    2.620397] devm_mfd_add_devices (drivers/mfd/mfd-core.c:351 drivers/mfd/mfd-core.c:449) 
[    2.620548] vexpress_sysreg_probe (drivers/mfd/vexpress-sysreg.c:115) 
[    2.620672] platform_probe (drivers/base/platform.c:1401) 
[    2.620814] really_probe (drivers/base/dd.c:560 drivers/base/dd.c:639) 
[    2.620940] __driver_probe_device (drivers/base/dd.c:778) 
[    2.621080] driver_probe_device (drivers/base/dd.c:808) 
[    2.621216] __driver_attach (drivers/base/dd.c:1195) 
[    2.621344] bus_for_each_dev (drivers/base/bus.c:301) 
[    2.621467] driver_attach (drivers/base/dd.c:1212) 
[    2.621596] bus_add_driver (drivers/base/bus.c:618) 
[    2.621720] driver_register (drivers/base/driver.c:246) 
[    2.621859] __platform_driver_register (drivers/base/platform.c:868) 
[    2.622012] vexpress_sysreg_driver_init (drivers/mfd/vexpress-sysreg.c:134) 
[    2.622145] do_one_initcall (init/main.c:1306) 
[    2.622269] kernel_init_freeable (init/main.c:1378 init/main.c:1395 init/main.c:1414 init/main.c:1634) 
[    2.622394] kernel_init (init/main.c:1526) 
[    2.622531] ret_from_fork (arch/arm64/kernel/entry.S:864) 
[ 2.622692] Code: 910003fd a90153f3 aa0003f3 f9414c00 (f9400801)
All code
========
   0:*	fd                   	std    		<-- trapping instruction
   1:	03 00                	add    (%rax),%eax
   3:	91                   	xchg   %eax,%ecx
   4:	f3 53                	repz push %rbx
   6:	01 a9 f3 03 00 aa    	add    %ebp,-0x55fffc0d(%rcx)
   c:	00 4c 41 f9          	add    %cl,-0x7(%rcx,%rax,2)
  10:	01 08                	add    %ecx,(%rax)
  12:	40 f9                	rex stc 

Code starting with the faulting instruction
===========================================
   0:	01 08                	add    %ecx,(%rax)
   2:	40 f9                	rex stc 
[    2.622807] ---[ end trace 0000000000000000 ]---
[    2.623043] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    2.623157] SMP: stopping secondary CPUs
[    2.623303] Kernel Offset: 0x57c5cb400000 from 0xffff800008000000
[    2.623413] PHYS_OFFSET: 0x80000000
[    2.623492] CPU features: 0x00000,001439ff,cd3e772f
[    2.623591] Memory Limit: none
[    2.623679] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
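
Note on the decode above: the "All code" listing shows x86 mnemonics although the oops comes from an arm64 FVP, so the opcode words were most likely run through the wrong disassembler (this is an inference from the log, not something stated in the report). As a rough sketch, the bracketed faulting word on the "Code:" line, f9400801, can be decoded by hand as an A64 "LDR Xt, [Xn, #imm]" instruction. The helper names below are made up for illustration. The same block also splits the reported ESR value 0x96000005 into its EC, IL and ISS fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative helper type, not taken from any kernel header. */
struct a64_ldr_uimm {
	unsigned rt;     /* destination register Xt */
	unsigned rn;     /* base register Xn (31 would mean SP) */
	unsigned offset; /* byte offset: imm12 scaled by 8 */
};

/*
 * Decode the 64-bit "LDR Xt, [Xn, #imm]" form (unsigned offset).
 * Returns false if the word does not use this encoding.
 */
bool a64_decode_ldr_x_uimm(uint32_t insn, struct a64_ldr_uimm *out)
{
	/* Bits 31..22 must be 1111100101 for this form. */
	if ((insn & 0xffc00000u) != 0xf9400000u)
		return false;
	out->rt = insn & 0x1f;
	out->rn = (insn >> 5) & 0x1f;
	out->offset = ((insn >> 10) & 0xfff) * 8;
	return true;
}

/* ESR_ELx layout: EC in bits 31..26, IL in bit 25, ISS in bits 24..0. */
unsigned esr_ec(uint32_t esr)  { return esr >> 26; }
unsigned esr_il(uint32_t esr)  { return (esr >> 25) & 1; }
unsigned esr_iss(uint32_t esr) { return esr & 0x1ffffffu; }
```

For f9400801 this gives Rt = 1, Rn = 0 and a byte offset of 16, i.e. "ldr x1, [x0, #16]". With x0 = 0 in the register dump, that load targets address 0x10, which matches the reported fault address. For the ESR value, the fields come out as EC = 0x25, IL = 1 and ISS = 0x5, in line with the "Mem abort info" lines.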


ref:
 - https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-devicetree_20230127001141_407071-1-saravanak_google_com/?results_layout=table&failures_only=false#!?details=#test-results



--
Linaro LKFT
https://lkft.linaro.org
Re: [PATCH v2 00/11] fw_devlink improvements
Posted by Saravana Kannan 2 years, 7 months ago
On Mon, Jan 30, 2023 at 12:56 AM Naresh Kamboju
<naresh.kamboju@linaro.org> wrote:
>
> Build test pass on arm, arm64, i386, mips, parisc, powerpc, riscv, s390, sh,
> sparc and x86_64.
>
> Boot and LTP smoke pass on qemu-arm64, qemu-armv7, qemu-i386 and qemu-x86_64.
> Boot failed on FVP.
>
> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
>
> Please refer to the following link for test details.
> FVP boot failure log:
> https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-devicetree_20230127001141_407071-1-saravanak_google_com/testrun/14389034/suite/boot/test/gcc-12-lkftconfig-64k_page_size/details/

Sudeep pointed me to what the issue might be. But it's strange that
you are hitting an issue now. I'm pretty sure I haven't changed this
part since v1. I'd also expect the limited assumptions I made to have
not been affected between v1 and v2.

Anyway, I'll look at this and fix it in v3.

-Saravana
Re: [PATCH v2 00/11] fw_devlink improvements
Posted by Sudeep Holla 2 years, 7 months ago
Hi Saravana,

On Mon, Jan 30, 2023 at 03:03:01PM -0800, Saravana Kannan wrote:
> On Mon, Jan 30, 2023 at 12:56 AM Naresh Kamboju
> <naresh.kamboju@linaro.org> wrote:
> >
> > Build test pass on arm, arm64, i386, mips, parisc, powerpc, riscv, s390, sh,
> > sparc and x86_64.
> >
> > Boot and LTP smoke pass on qemu-arm64, qemu-armv7, qemu-i386 and qemu-x86_64.
> > Boot failed on FVP.
> >
> > Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
> >
> > Please refer to the following link for test details.
> > FVP boot failure log:
> > https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-devicetree_20230127001141_407071-1-saravanak_google_com/testrun/14389034/suite/boot/test/gcc-12-lkftconfig-64k_page_size/details/
>
> Sudeep pointed me to what the issue might be. But it's strange that
> you are hitting an issue now. I'm pretty sure I haven't changed this
> part since v1. I'd also expect the limited assumptions I made to have
> not been affected between v1 and v2.
>

Sorry I hadn't seen or tested v1.

FYI, the fwnode non-NULL check, as in your nvmem diff/suggestion and in the diff
I posted in reply on the gpiolib patch thread, fixes the issue.

> Anyway, I'll look at this and fix it in v3.
>

If you add that fwnode check, feel free to add my Tested-by.
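
The diffs mentioned above are not reproduced in this thread. As a minimal userspace sketch of the kind of guard being discussed, the block below uses hypothetical stand-ins only: "struct mock_fwnode", "MOCK_FLAG_INITIALIZED" and "mock_fwnode_set_initialized()" are not kernel names, and the real fix may look different. The point is just that a helper which updates per-node state should return early when there is no firmware node at all.

```c
#include <stdbool.h>
#include <stddef.h>

#define MOCK_FLAG_INITIALIZED (1u << 0)

/* Hypothetical stand-in for a firmware node; not the kernel structure. */
struct mock_fwnode {
	void *dev;          /* device bound to this node, if any */
	unsigned int flags; /* state bits such as MOCK_FLAG_INITIALIZED */
};

/*
 * Set or clear the "initialized" state on a node.
 * Returns true if the node was updated, false if there was no node.
 */
bool mock_fwnode_set_initialized(struct mock_fwnode *fwnode, bool initialized)
{
	/* The non-NULL check: a device without a firmware node is valid. */
	if (!fwnode)
		return false;

	if (initialized)
		fwnode->flags |= MOCK_FLAG_INITIALIZED;
	else
		fwnode->flags &= ~MOCK_FLAG_INITIALIZED;
	return true;
}
```

Without the early return, a caller passing NULL would dereference a small offset from address zero, which is the general shape of the fault in the oops reported earlier in the thread.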

--
Regards,
Sudeep
Re: [PATCH v2 00/11] fw_devlink improvements
Posted by Maxim Kiselev 2 years, 7 months ago
Hi Saravana,

> Can you try the patch at the end of this email under these
> configurations and tell me which ones fail vs pass? I don't need logs

I did these tests and here are the results:

1. On top of this series - Does not work
2. Without this series    - Works
3. On top of the series with fwnode_dev_initialized() deleted - Does not work
4. Without this series, with fwnode_dev_initialized() deleted  - Works

So your nvmem/core.c patch helps only when it is applied without the series.
But even though it helps to avoid getting stuck at probing
my ethernet device, there is still a regression.

When the ethernet module is loaded, it takes a long time to drop the dependency
on the nvmem-cell with the MAC address.

Please look at the kernel logs below.

The first log corresponds to the kernel with your nvmem/core.c patch:

    [    0.036462] ethernet@70000 Linked as a fwnode consumer to clock-gating-control@1821c
    [    0.036572] ethernet@70000 Linked as a fwnode consumer to partition@1
    [    0.045596] device: 'f1070000.ethernet': device_add
    [    0.045854] ethernet@70000 Dropping the fwnode link to clock-gating-control@1821c
    [    0.114990] device: 'platform:f1010600.spi:m25p80@0:partitions:partition@1--platform:f1070000.ethernet': device_add
    [    0.115266] devices_kset: Moving f1070000.ethernet to end of list
    [    0.115308] platform f1070000.ethernet: Linked as a consumer to f1010600.spi:m25p80@0:partitions:partition@1
    [    0.115345] ethernet@70000 Dropping the fwnode link to partition@1
    [    1.968232] platform f1070000.ethernet: error -EPROBE_DEFER: supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
    [    2.088696] devices_kset: Moving f1070000.ethernet to end of list
    [    2.088988] platform f1070000.ethernet: error -EPROBE_DEFER: supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
    [    2.152411] devices_kset: Moving f1070000.ethernet to end of list
    [    2.152735] platform f1070000.ethernet: error -EPROBE_DEFER: supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
    [    2.153870] devices_kset: Moving f1070000.ethernet to end of list
    [    2.154152] platform f1070000.ethernet: error -EPROBE_DEFER: supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
    [    2.644950] devices_kset: Moving f1070000.ethernet to end of list
    [    2.645282] platform f1070000.ethernet: error -EPROBE_DEFER: supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
    [    3.169218] devices_kset: Moving f1070000.ethernet to end of list
    [    3.169506] platform f1070000.ethernet: error -EPROBE_DEFER: supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
    [    3.170444] devices_kset: Moving f1070000.ethernet to end of list
    [    3.170721] platform f1070000.ethernet: error -EPROBE_DEFER: supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
    [    3.419068] devices_kset: Moving f1070000.ethernet to end of list
    [    3.419359] platform f1070000.ethernet: error -EPROBE_DEFER: supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
    [    3.521275] devices_kset: Moving f1070000.ethernet to end of list
    [    3.521564] platform f1070000.ethernet: error -EPROBE_DEFER: supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
    [    3.639196] devices_kset: Moving f1070000.ethernet to end of list
    [    3.639532] platform f1070000.ethernet: error -EPROBE_DEFER: supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
    [   13.960144] platform f1070000.ethernet: Relaxing link with f1010600.spi:m25p80@0:partitions:partition@1
    [   13.960260] devices_kset: Moving f1070000.ethernet to end of list
    [   13.971735] device: 'eth0': device_add
    [   13.974140] mvneta f1070000.ethernet eth0: Using device tree mac address de:fa:ce:db:ab:e1
    [   13.974275] mvneta f1070000.ethernet: Dropping the link to f1010600.spi:m25p80@0:partitions:partition@1
    [   13.974318] device: 'platform:f1010600.spi:m25p80@0:partitions:partition@1--platform:f1070000.ethernet': device_unregister

It took around 13 seconds to obtain the MAC address from the nvmem-cell and
bring up f1070000.ethernet.


And here is the second log, which corresponds to the kernel without your
nvmem/core.c patch but with change 'bcdf0315' reverted:

    [    0.036285] ethernet@70000 Linked as a fwnode consumer to clock-gating-control@1821c
    [    0.036395] ethernet@70000 Linked as a fwnode consumer to partition@1
    [    0.045416] device: 'f1070000.ethernet': device_add
    [    0.045674] ethernet@70000 Dropping the fwnode link to clock-gating-control@1821c
    [    0.116136] ethernet@70000 Dropping the fwnode link to partition@1
    [    1.977060] device: 'eth0': device_add
    [    1.979145] mvneta f1070000.ethernet eth0: Using device tree mac address de:fa:ce:db:ab:e1

It took around 1.5 seconds to obtain the MAC address from the nvmem-cell.
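
To make the timing comparison between the two logs easier to reproduce, the timestamps can be extracted mechanically. This is a minimal sketch, assuming each log line starts with the usual "[ seconds.microseconds]" stamp; the function names are illustrative only.

```c
#include <stdio.h>

/*
 * Parse the leading "[  seconds.micros]" stamp of a kernel log line.
 * Returns 0 and stores the time in *seconds on success, -1 otherwise.
 */
int dmesg_timestamp(const char *line, double *seconds)
{
	/* The leading space in the format skips any indentation. */
	return sscanf(line, " [%lf", seconds) == 1 ? 0 : -1;
}

/*
 * Time elapsed between two log lines, in seconds.
 * Returns -1.0 if either line has no parsable stamp, so this sketch
 * is only meant for pairs given in chronological order.
 */
double dmesg_elapsed(const char *from, const char *to)
{
	double t0, t1;

	if (dmesg_timestamp(from, &t0) || dmesg_timestamp(to, &t1))
		return -1.0;
	return t1 - t0;
}
```

Applied to the first log, the gap between the last -EPROBE_DEFER line (3.639532) and the "Relaxing link" line (13.960144) is about 10.3 seconds, during which nothing is logged for this device. Comparing the 'eth0' device_add lines of the two logs (13.971735 and 1.977060) gives a difference of about 12 seconds.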

P.S. Your nvmem patch definitely helps to avoid a stuck device probe,
but it looks like it is not the best way to solve the problem we discussed
in the MTD thread.

P.P.S. Also, I don't know why your nvmem-cell patch doesn't help when it is
applied on top of this series. Maybe I missed something.
Re: [PATCH v2 00/11] fw_devlink improvements
Posted by Saravana Kannan 2 years, 7 months ago
On Thu, Feb 2, 2023 at 9:36 AM Maxim Kiselev <bigunclemax@gmail.com> wrote:
>
> Hi Saravana,
>
> > Can you try the patch at the end of this email under these
> > configurations and tell me which ones fail vs pass? I don't need logs
>
> I did these tests and here are the results:

Did you hand edit the In-Reply-To: in the header? Because in the
thread you are replying to the wrong email, but the context in your email
seems to be from the right email.

For example, see how your reply isn't under the email you are replying
to in this thread overview:
https://lore.kernel.org/lkml/20230127001141.407071-1-saravanak@google.com/#r

> 1. On top of this series - Does not work
> 2. Without this series    - Works
> 3. On top of the series with fwnode_dev_initialized() deleted - Does not work
> 4. Without this series, with fwnode_dev_initialized() deleted  - Works
>
> So your nvmem/core.c patch helps only when it is applied without the series.
> But even though it helps to avoid getting stuck at probing
> my ethernet device, there is still a regression.
>
> When the ethernet module is loaded, it takes a long time to drop the dependency
> on the nvmem-cell with the MAC address.
>
> Please look at the kernel logs below.

The kernel logs below really aren't that useful for me in their
current state. See more below.

---8<---- <snip> --->8----

> P.S. Your nvmem patch definitely helps to avoid a stuck device probe,
> but it looks like it is not the best way to solve the problem we discussed
> in the MTD thread.
>
> P.P.S. Also, I don't know why your nvmem-cell patch doesn't help when it is
> applied on top of this series. Maybe I missed something.

Yeah, I'm not too sure if the test was done correctly. You also didn't
answer my question about the dts from my earlier email.
https://lore.kernel.org/lkml/CAGETcx8FpmbaRm2CCwqt3BRBpgbogwP5gNB+iA5OEtuxWVTNLA@mail.gmail.com/#t

So, can you please retest config 1 with all pr_debug and dev_dbg in
drivers/base/core.c changed to the _info variants? And then share the
kernel log from the beginning of boot? Maybe attach it to the email so
it doesn't get word wrapped by your email client. And please point me
to the .dts that corresponds to your board. Without that, I can't
debug much.

Thanks,
Saravana
Re: [PATCH v2 00/11] fw_devlink improvements
Posted by Maxim Kiselev 2 years, 7 months ago
On Fri, Feb 3, 2023 at 09:07, Saravana Kannan <saravanak@google.com> wrote:
>
> On Thu, Feb 2, 2023 at 9:36 AM Maxim Kiselev <bigunclemax@gmail.com> wrote:
> >
> > Hi Saravana,
> >
> > > Can you try the patch at the end of this email under these
> > > configurations and tell me which ones fail vs pass? I don't need logs
> >
> > I did these tests and here are the results:
>
> Did you hand edit the In-Reply-To: in the header? Because in the
> thread you are replying to the wrong email, but the context in your email
> seems to be from the right email.
>
> For example, see how your reply isn't under the email you are replying
> to in this thread overview:
> https://lore.kernel.org/lkml/20230127001141.407071-1-saravanak@google.com/#r
>
> > 1. On top of this series - Does not work
> > 2. Without this series    - Works
> > 3. On top of the series with fwnode_dev_initialized() deleted - Does not work
> > 4. Without this series, with fwnode_dev_initialized() deleted  - Works
> >
> > So your nvmem/core.c patch helps only when it is applied without the series.
> > But even though it helps to avoid getting stuck at probing
> > my ethernet device, there is still a regression.
> >
> > When the ethernet module is loaded, it takes a long time to drop the dependency
> > on the nvmem-cell with the MAC address.
> >
> > Please look at the kernel logs below.
>
> The kernel logs below really aren't that useful for me in their
> current state. See more below.
>
> ---8<---- <snip> --->8----
>
> > P.S. Your nvmem patch definitely helps to avoid a stuck device probe,
> > but it looks like it is not the best way to solve the problem we discussed
> > in the MTD thread.
> >
> > P.P.S. Also, I don't know why your nvmem-cell patch doesn't help when it is
> > applied on top of this series. Maybe I missed something.
>
> Yeah, I'm not too sure if the test was done correctly. You also didn't
> answer my question about the dts from my earlier email.
> https://lore.kernel.org/lkml/CAGETcx8FpmbaRm2CCwqt3BRBpgbogwP5gNB+iA5OEtuxWVTNLA@mail.gmail.com/#t
>
> So, can you please retest config 1 with all pr_debug and dev_dbg in
> drivers/base/core.c changed to the _info variants? And then share the
> kernel log from the beginning of boot? Maybe attach it to the email so
> it doesn't get word wrapped by your email client. And please point me
> to the .dts that corresponds to your board. Without that, I can't
> debug much.
>
> Thanks,
> Saravana

> Did you hand edit the In-Reply-To: in the header? Because in the
> thread you are replying to the wrong email, but the context in your email
> seems to be from the right email.

Sorry for that, it seems like I accidentally deleted it.

> So, can you please retest config 1 with all pr_debug and dev_dbg in
> drivers/base/core.c changed to the _info variants? And then share the
> kernel log from the beginning of boot? Maybe attach it to the email so
> it doesn't get word wrapped by your email client. And please point me
> to the .dts that corresponds to your board. Without that, I can't
> debug much.

OK, I retested config 1 with all _debug logs changed to _info. I
attached the kernel log and the dts file to this email.
Re: [PATCH v2 00/11] fw_devlink improvements
Posted by Saravana Kannan 2 years, 7 months ago
On Fri, Feb 3, 2023 at 1:39 AM Maxim Kiselev <bigunclemax@gmail.com> wrote:
>
> On Fri, Feb 3, 2023 at 09:07, Saravana Kannan <saravanak@google.com> wrote:
> >
> > On Thu, Feb 2, 2023 at 9:36 AM Maxim Kiselev <bigunclemax@gmail.com> wrote:
> > >
> > > Hi Saravana,
> > >
> > > > Can you try the patch at the end of this email under these
> > > > configurations and tell me which ones fail vs pass? I don't need logs
> > >
> > > I did these tests and here are the results:
> >
> > Did you hand edit the In-Reply-To: in the header? Because in the
> > thread you are replying to the wrong email, but the context in your email
> > seems to be from the right email.
> >
> > For example, see how your reply isn't under the email you are replying
> > to in this thread overview:
> > https://lore.kernel.org/lkml/20230127001141.407071-1-saravanak@google.com/#r
> >
> > > 1. On top of this series - Does not work
> > > 2. Without this series    - Works
> > > 3. On top of the series with fwnode_dev_initialized() deleted - Does not work
> > > 4. Without this series, with fwnode_dev_initialized() deleted  - Works
> > >
> > > So your nvmem/core.c patch helps only when it is applied without the series.
> > > But even though it helps to avoid getting stuck at probing
> > > my ethernet device, there is still a regression.
> > >
> > > When the ethernet module is loaded, it takes a long time to drop the dependency
> > > on the nvmem-cell with the MAC address.
> > >
> > > Please look at the kernel logs below.
> >
> > The kernel logs below really aren't that useful for me in their
> > current state. See more below.
> >
> > ---8<---- <snip> --->8----
> >
> > > P.S. Your nvmem patch definitely helps to avoid a stuck device probe,
> > > but it looks like it is not the best way to solve the problem we discussed
> > > in the MTD thread.
> > >
> > > P.P.S. Also, I don't know why your nvmem-cell patch doesn't help when it is
> > > applied on top of this series. Maybe I missed something.
> >
> > Yeah, I'm not too sure if the test was done correctly. You also didn't
> > answer my question about the dts from my earlier email.
> > https://lore.kernel.org/lkml/CAGETcx8FpmbaRm2CCwqt3BRBpgbogwP5gNB+iA5OEtuxWVTNLA@mail.gmail.com/#t
> >
> > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > drivers/base/core.c changed to the _info variants? And then share the
> > kernel log from the beginning of boot? Maybe attach it to the email so
> > it doesn't get word wrapped by your email client. And please point me
> > to the .dts that corresponds to your board. Without that, I can't
> > debug much.
> >
> > Thanks,
> > Saravana
>
> > Did you hand edit the In-Reply-To: in the header? Because in the
> > thread you are replying to the wrong email, but the context in your email
> > seems to be from the right email.
>
> Sorry for that, it seems like I accidentally deleted it.
>
> > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > drivers/base/core.c changed to the _info variants? And then share the
> > kernel log from the beginning of boot? Maybe attach it to the email so
> > it doesn't get word wrapped by your email client. And please point me
> > to the .dts that corresponds to your board. Without that, I can't
> > debug much.
>
> OK, I retested config 1 with all _debug logs changed to _info. I
> attached the kernel log and the dts file to this email.

Ah, so your device is not supported/present upstream? Even though it's
not upstream, I'll help fix this because it should fix what I believe
are unreported issues in upstream.

Ok I know why configs 1 - 4 behaved the way they did and why my test
patch didn't help.

After staring at the mtd/nvmem code for a few hours, I think the mtd/nvmem
interaction is kind of a mess. The mtd core creates "partition" platform
devices (including for nvmem-cells) that are probed by drivers in
drivers/nvmem. However, there's no driver for the "nvmem-cells" partition
platform device. Separately, the nvmem core creates an nvmem_device when
nvmem_register() is called by MTD or by these partition platform devices
created by MTD. These nvmem_devices are added to an nvmem_bus, but the
bus has no means to even register a driver (it should really be an
nvmem_class and not an nvmem_bus). And the nvmem_device sometimes points
to the DT node of the MTD device, sometimes to that of a partition
platform device, and sometimes to no DT node at all.

So it's a mess of multiple devices pointing to the same DT node with
no clear way to identify which ones will point to a DT node and which
ones will probe and which ones won't. In the future, we shouldn't
allow adding new compatible strings for partitions for which we don't
plan on adding nvmem drivers.

Can you give the patch at the end of the email a shot? It should fix
the issue with this series and without this series. It just avoids
this whole mess by not creating a useless platform device for
nvmem-cells compatible DT nodes.
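
The idea behind the patch at the end of this email can be illustrated outside the kernel. The following is a userspace sketch only: "struct mock_node", "MOCK_POPULATED" and both functions are hypothetical stand-ins, not kernel APIs. It assumes that a node already flagged as populated is skipped when devices are later created for the child nodes, which is the effect the patch relies on by setting OF_POPULATED.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's OF_POPULATED flag (hypothetical value). */
#define MOCK_POPULATED (1UL << 0)

/* Simplified node: a single compatible string and a flags word. */
struct mock_node {
	const char *compatible; /* may be NULL for nodes without one */
	unsigned long flags;
};

/*
 * Flag every "nvmem-cells" child as already populated.
 * Returns the number of nodes that were flagged.
 */
size_t mock_mark_nvmem_cells(struct mock_node *children, size_t n)
{
	size_t marked = 0;

	for (size_t i = 0; i < n; i++) {
		if (children[i].compatible &&
		    strcmp(children[i].compatible, "nvmem-cells") == 0) {
			children[i].flags |= MOCK_POPULATED;
			marked++;
		}
	}
	return marked;
}

/* Count the nodes a later populate pass would still create devices for. */
size_t mock_count_devices_to_create(const struct mock_node *children, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (!(children[i].flags & MOCK_POPULATED))
			count++;
	return count;
}
```

In this model, the flagged node never gets a device of its own, so nothing can end up waiting on a device that no driver will ever probe.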

Thanks,
Saravana

diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
index d442fa94c872..88a213f4d651 100644
--- a/drivers/mtd/mtdpart.c
+++ b/drivers/mtd/mtdpart.c
@@ -577,6 +577,7 @@ static int mtd_part_of_parse(struct mtd_info *master,
 {
        struct mtd_part_parser *parser;
        struct device_node *np;
+       struct device_node *child;
        struct property *prop;
        struct device *dev;
        const char *compat;
@@ -594,6 +595,10 @@ static int mtd_part_of_parse(struct mtd_info *master,
        else
                np = of_get_child_by_name(np, "partitions");

+       for_each_child_of_node(np, child)
+               if (of_device_is_compatible(child, "nvmem-cells"))
+                       of_node_set_flag(child, OF_POPULATED);
+
        of_property_for_each_string(np, "compatible", prop, compat) {
                parser = mtd_part_get_compatible_parser(compat);
                if (!parser)
Re: [PATCH v2 00/11] fw_devlink improvements
Posted by Rob Herring 2 years, 7 months ago
On Sun, Feb 5, 2023 at 7:33 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Fri, Feb 3, 2023 at 1:39 AM Maxim Kiselev <bigunclemax@gmail.com> wrote:
> >
> > On Fri, Feb 3, 2023 at 09:07, Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > On Thu, Feb 2, 2023 at 9:36 AM Maxim Kiselev <bigunclemax@gmail.com> wrote:
> > > >
> > > > Hi Saravana,
> > > >
> > > > > Can you try the patch at the end of this email under these
> > > > > configurations and tell me which ones fail vs pass? I don't need logs
> > > >
> > > > I did these tests and here are the results:
> > >
> > > Did you hand edit the In-Reply-To: in the header? Because in the
> > > thread you are replying to the wrong email, but the context in your email
> > > seems to be from the right email.
> > >
> > > For example, see how your reply isn't under the email you are replying
> > > to in this thread overview:
> > > https://lore.kernel.org/lkml/20230127001141.407071-1-saravanak@google.com/#r
> > >
> > > > 1. On top of this series - Does not work
> > > > 2. Without this series    - Works
> > > > 3. On top of the series with fwnode_dev_initialized() deleted - Does not work
> > > > 4. Without this series, with fwnode_dev_initialized() deleted  - Works
> > > >
> > > > So your nvmem/core.c patch helps only when it is applied without the series.
> > > > But even though it helps to avoid getting stuck at probing
> > > > my ethernet device, there is still a regression.
> > > >
> > > > When the ethernet module is loaded, it takes a long time to drop the dependency
> > > > on the nvmem-cell with the MAC address.
> > > >
> > > > Please look at the kernel logs below.
> > >
> > > The kernel logs below really aren't that useful for me in their
> > > current state. See more below.
> > >
> > > ---8<---- <snip> --->8----
> > >
> > > > P.S. Your nvmem patch definitely helps to avoid a stuck device probe,
> > > > but it looks like it is not the best way to solve the problem we discussed
> > > > in the MTD thread.
> > > >
> > > > P.P.S. Also, I don't know why your nvmem-cell patch doesn't help when it is
> > > > applied on top of this series. Maybe I missed something.
> > >
> > > Yeah, I'm not too sure if the test was done correctly. You also didn't
> > > answer my question about the dts from my earlier email.
> > > https://lore.kernel.org/lkml/CAGETcx8FpmbaRm2CCwqt3BRBpgbogwP5gNB+iA5OEtuxWVTNLA@mail.gmail.com/#t
> > >
> > > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > > drivers/base/core.c changed to the _info variants? And then share the
> > > kernel log from the beginning of boot? Maybe attach it to the email so
> > > it doesn't get word wrapped by your email client. And please point me
> > > to the .dts that corresponds to your board. Without that, I can't
> > > debug much.
> > >
> > > Thanks,
> > > Saravana
> >
> > > Did you hand edit the In-Reply-To: in the header? Because in the
> > > thread you are replying to the wrong email, but the context in your email
> > > seems to be from the right email.
> >
> > Sorry for that, it seems like I accidentally deleted it.
> >
> > > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > > drivers/base/core.c changed to the _info variants? And then share the
> > > kernel log from the beginning of boot? Maybe attach it to the email so
> > > it doesn't get word wrapped by your email client. And please point me
> > > to the .dts that corresponds to your board. Without that, I can't
> > > debug much.
> >
> > OK, I retested config 1 with all _debug logs changed to _info. I
> > attached the kernel log and the dts file to this email.
>
> Ah, so your device is not supported/present upstream? Even though it's
> not upstream, I'll help fix this because it should fix what I believe
> are unreported issues in upstream.
>
> Ok I know why configs 1 - 4 behaved the way they did and why my test
> patch didn't help.
>
> After staring at the mtd/nvmem code for a few hours, I think the mtd/nvmem
> interaction is kind of a mess. The mtd core creates "partition" platform
> devices (including for nvmem-cells) that are probed by drivers in
> drivers/nvmem. However, there's no driver for the "nvmem-cells" partition
> platform device. Separately, the nvmem core creates an nvmem_device when
> nvmem_register() is called by MTD or by these partition platform devices
> created by MTD. These nvmem_devices are added to an nvmem_bus, but the
> bus has no means to even register a driver (it should really be an
> nvmem_class and not an nvmem_bus). And the nvmem_device sometimes points
> to the DT node of the MTD device, sometimes to that of a partition
> platform device, and sometimes to no DT node at all.
>
> So it's a mess of multiple devices pointing to the same DT node with
> no clear way to identify which ones will point to a DT node and which
> ones will probe and which ones won't. In the future, we shouldn't
> allow adding new compatible strings for partitions for which we don't
> plan on adding nvmem drivers.

That won't work. Having a compatible string cannot mean there must be a driver.

Rob
Re: [PATCH v2 00/11] fw_devlink improvements
Posted by Saravana Kannan 2 years, 7 months ago
On Mon, Feb 6, 2023 at 7:19 AM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Sun, Feb 5, 2023 at 7:33 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > On Fri, Feb 3, 2023 at 1:39 AM Maxim Kiselev <bigunclemax@gmail.com> wrote:
> > >
> > > On Fri, Feb 3, 2023 at 09:07, Saravana Kannan <saravanak@google.com> wrote:
> > > >
> > > > On Thu, Feb 2, 2023 at 9:36 AM Maxim Kiselev <bigunclemax@gmail.com> wrote:
> > > > >
> > > > > Hi Saravana,
> > > > >
> > > > > > Can you try the patch at the end of this email under these
> > > > > > configurations and tell me which ones fail vs pass? I don't need logs
> > > > >
> > > > > I did these tests and here are the results:
> > > >
> > > > Did you hand edit the In-Reply-To: in the header? Because in the
> > > > thread you are replying to the wrong email, but the context in your email
> > > > seems to be from the right email.
> > > >
> > > > For example, see how your reply isn't under the email you are replying
> > > > to in this thread overview:
> > > > https://lore.kernel.org/lkml/20230127001141.407071-1-saravanak@google.com/#r
> > > >
> > > > > 1. On top of this series - Does not work
> > > > > 2. Without this series    - Works
> > > > > 3. On top of the series with fwnode_dev_initialized() deleted - Does not work
> > > > > 4. Without this series, with fwnode_dev_initialized() deleted  - Works
> > > > >
> > > > > So your nvmem/core.c patch helps only when it is applied without the series.
> > > > > But even though it helps to avoid getting stuck at probing
> > > > > my ethernet device, there is still a regression.
> > > > >
> > > > > When the ethernet module is loaded, it takes a long time to drop the dependency
> > > > > on the nvmem-cell with the MAC address.
> > > > >
> > > > > Please look at the kernel logs below.
> > > >
> > > > The kernel logs below really aren't that useful for me in their
> > > > current state. See more below.
> > > >
> > > > ---8<---- <snip> --->8----
> > > >
> > > > > P.S. Your nvmem patch definitely helps to avoid a device probe stuck
> > > > > but look like it is not best way to solve a problem which we discussed
> > > > > in the MTD thread.
> > > > >
> > > > > P.P.S. Also I don't know why your nvmem-cell patch doesn't help when it was
> > > > > applied on top of this series. Maybe I missed something.
> > > >
> > > > Yeah, I'm not too sure if the test was done correctly. You also didn't
> > > > answer my question about the dts from my earlier email.
> > > > https://lore.kernel.org/lkml/CAGETcx8FpmbaRm2CCwqt3BRBpgbogwP5gNB+iA5OEtuxWVTNLA@mail.gmail.com/#t
> > > >
> > > > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > > > > drivers/base/core.c changed to the _info variants? And then share the
> > > > kernel log from the beginning of boot? Maybe attach it to the email so
> > > > it doesn't get word wrapped by your email client. And please point me
> > > > to the .dts that corresponds to your board. Without that, I can't
> > > > debug much.
> > > >
> > > > Thanks,
> > > > Saravana
> > >
> > > > Did you hand edit the In-Reply-To: in the header? Because in the
> > > > thread you are reply to the wrong email, but the context in your email
> > > > seems to be from the right email.
> > >
> > > > Sorry for that, it seems like I accidentally deleted it.
> > >
> > > > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > > > drivers/base/core.c changed to the _info variants? And then share the
> > > > kernel log from the beginning of boot? Maybe attach it to the email so
> > > > it doesn't get word wrapped by your email client. And please point me
> > > > to the .dts that corresponds to your board. Without that, I can't
> > > > debug much.
> > >
> > > Ok, I retested config 1 with all _debug logs changed to the _info. I
> > > added the kernel log and the dts file to the attachment of this email.
> >
> > Ah, so your device is not supported/present upstream? Even though it's
> > not upstream, I'll help fix this because it should fix what I believe
> > are unreported issues in upstream.
> >
> > Ok I know why configs 1 - 4 behaved the way they did and why my test
> > patch didn't help.
> >
> > After staring at mtd/nvmem code for a few hours I think mtd/nvmem
> > interaction is kind of a mess. mtd core creates "partition" platform
> > devices (including for nvmem-cells) that are probed by drivers in
> > drivers/nvmem. However, there's no driver for "nvmem-cells" partition
> > platform device. However, the nvmem core creates nvmem_device when
> > nvmem_register() is called by MTD or these partition platform devices
> > created by MTD. But these nvmem_devices are added to a nvmem_bus but
> > the bus has no means to even register a driver (it should really be a
> > nvmem_class and not nvmem_bus). And the nvmem_device sometimes points
> > to the DT node of the MTD device or sometimes the partition platform
> > devices or maybe no DT node at all.
> >
> > So it's a mess of multiple devices pointing to the same DT node with
> > no clear way to identify which ones will point to a DT node and which
> > ones will probe and which ones won't. In the future, we shouldn't
> > allow adding new compatible strings for partitions for which we don't
> > plan on adding nvmem drivers.
>
> That won't work. Having a compatible string cannot mean there must be a driver.

Right, I know what you mean, Rob, and I know where you are coming from
(DT isn't just about Linux or even driver core). But what I'm saying
is that this seems to already be the case for MTD partitions after
commit:
bcdf0315a61a mtd: call of_platform_populate() for MTD partitions

So, if we are adding compatible properties only for some of them, then
I'm saying we should make sure people write drivers for them going
forward.

I don't know enough about MTD partitions to know why only some of them
have compatible properties.

-Saravana
Re: [PATCH v2 00/11] fw_devlink improvements
Posted by Miquel Raynal 2 years, 7 months ago
Hi Saravana,

+ Srinivas, nvmem maintainer

saravanak@google.com wrote on Sun, 5 Feb 2023 17:32:57 -0800:

> On Fri, Feb 3, 2023 at 1:39 AM Maxim Kiselev <bigunclemax@gmail.com> wrote:
> >
> > > On Fri, Feb 3, 2023 at 9:07 AM Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > On Thu, Feb 2, 2023 at 9:36 AM Maxim Kiselev <bigunclemax@gmail.com> wrote:  
> > > >
> > > > Hi Saravana,
> > > >  
> > > > > Can you try the patch at the end of this email under these
> > > > > configurations and tell me which ones fail vs pass? I don't need logs  
> > > >
> > > > I did these tests and here is the results:  
> > >
> > > Did you hand edit the In-Reply-To: in the header? Because in the
> > > thread you are reply to the wrong email, but the context in your email
> > > seems to be from the right email.
> > >
> > > For example, see how your reply isn't under the email you are replying
> > > to in this thread overview:
> > > https://lore.kernel.org/lkml/20230127001141.407071-1-saravanak@google.com/#r
> > >  
> > > > 1. On top of this series - Not works
> > > > 2. Without this series    - Works
> > > > 3. On top of the series with the fwnode_dev_initialized() deleted - Not works
> > > > 4. Without this series, with the fwnode_dev_initialized() deleted  - Works
> > > >
> > > > So your nvmem/core.c patch helps only when it is applied without the series.
> > > > But despite the fact that this helps to avoid getting stuck at probing
> > > > my ethernet device, there is still regression.
> > > >
> > > > When the ethernet module is loaded it takes a lot of time to drop dependency
> > > > from the nvmem-cell with mac address.
> > > >
> > > > Please look at the kernel logs below.  
> > >
> > > The kernel logs below really aren't that useful for me in their
> > > current state. See more below.
> > >
> > > ---8<---- <snip> --->8----
> > >  
> > > > P.S. Your nvmem patch definitely helps to avoid a device probe stuck
> > > > but look like it is not best way to solve a problem which we discussed
> > > > in the MTD thread.
> > > >
> > > > P.P.S. Also I don't know why your nvmem-cell patch doesn't help when it was
> > > > applied on top of this series. Maybe I missed something.  
> > >
> > > Yeah, I'm not too sure if the test was done correctly. You also didn't
> > > answer my question about the dts from my earlier email.
> > > https://lore.kernel.org/lkml/CAGETcx8FpmbaRm2CCwqt3BRBpgbogwP5gNB+iA5OEtuxWVTNLA@mail.gmail.com/#t
> > >
> > > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > > > drivers/base/core.c changed to the _info variants? And then share the
> > > kernel log from the beginning of boot? Maybe attach it to the email so
> > > it doesn't get word wrapped by your email client. And please point me
> > > to the .dts that corresponds to your board. Without that, I can't
> > > debug much.
> > >
> > > Thanks,
> > > Saravana  
> >  
> > > Did you hand edit the In-Reply-To: in the header? Because in the
> > > thread you are reply to the wrong email, but the context in your email
> > > seems to be from the right email.  
> >
> > > Sorry for that, it seems like I accidentally deleted it.
> >  
> > > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > > > drivers/base/core.c changed to the _info variants? And then share the
> > > kernel log from the beginning of boot? Maybe attach it to the email so
> > > it doesn't get word wrapped by your email client. And please point me
> > > to the .dts that corresponds to your board. Without that, I can't
> > > debug much.  
> >
> > Ok, I retested config 1 with all _debug logs changed to the _info. I
> > added the kernel log and the dts file to the attachment of this email.  
> 
> Ah, so your device is not supported/present upstream? Even though it's
> not upstream, I'll help fix this because it should fix what I believe
> are unreported issues in upstream.
> 
> Ok I know why configs 1 - 4 behaved the way they did and why my test
> patch didn't help.
> 
> After staring at mtd/nvmem code for a few hours I think mtd/nvmem
> interaction is kind of a mess.

nvmem is a recent subsystem but mtd carries a lot of legacy stuff we
cannot really re-wire without breaking users, so nvmem on top of mtd
of course inherits the fragile designs already in place.

> mtd core creates "partition" platform
> devices (including for nvmem-cells) that are probed by drivers in
> drivers/nvmem. However, there's no driver for "nvmem-cells" partition
> platform device. However, the nvmem core creates nvmem_device when
> nvmem_register() is called by MTD or these partition platform devices
> created by MTD. But these nvmem_devices are added to a nvmem_bus but
> the bus has no means to even register a driver (it should really be a
> nvmem_class and not nvmem_bus).

Srinivas, do you think we could change this?

> And the nvmem_device sometimes points
> to the DT node of the MTD device or sometimes the partition platform
> devices or maybe no DT node at all.

I guess this comes from the fact that this is not strongly defined in
mtd and depends on the situation (not to mention the 20 years of
history there). "mtd" is a bit inconsistent about what it means. Older
designs mixed controllers, ECC engines (when relevant) and memories,
even though these three components are completely separate. Hence
sometimes the mtd device ends up being the top-level controller,
sometimes it's just one partition...

But I'm surprised not all of them point to a DT node. Could you show us
an example? Because that is likely unexpected (or perhaps I am
missing something).

> So it's a mess of multiple devices pointing to the same DT node with
> no clear way to identify which ones will point to a DT node and which
> ones will probe and which ones won't. In the future, we shouldn't
> allow adding new compatible strings for partitions for which we don't
> plan on adding nvmem drivers.
>
> Can you give the patch at the end of the email a shot? It should fix
> the issue with this series and without this series. It just avoids
> this whole mess by not creating useless platform device for
> nvmem-cells compatible DT nodes.

Thanks a lot for your help.

> 
> Thanks,
> Saravana
> 
> diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
> index d442fa94c872..88a213f4d651 100644
> --- a/drivers/mtd/mtdpart.c
> +++ b/drivers/mtd/mtdpart.c
> @@ -577,6 +577,7 @@ static int mtd_part_of_parse(struct mtd_info *master,
>  {
>         struct mtd_part_parser *parser;
>         struct device_node *np;
> +       struct device_node *child;
>         struct property *prop;
>         struct device *dev;
>         const char *compat;
> @@ -594,6 +595,10 @@ static int mtd_part_of_parse(struct mtd_info *master,
>         else
>                 np = of_get_child_by_name(np, "partitions");
> 
> +       for_each_child_of_node(np, child)
> +               if (of_device_is_compatible(child, "nvmem-cells"))
> +                       of_node_set_flag(child, OF_POPULATED);

What about adding a comment in the final patch explaining why we need
that? Otherwise it's a little bit obscure.

> +
>         of_property_for_each_string(np, "compatible", prop, compat) {
>                 parser = mtd_part_get_compatible_parser(compat);
>                 if (!parser)


Thanks,
Miquèl
Re: [PATCH v2 00/11] fw_devlink improvements
Posted by Saravana Kannan 2 years, 7 months ago
On Mon, Feb 6, 2023 at 1:39 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hi Saravana,
>
> + Srinivas, nvmem maintainer
>
> saravanak@google.com wrote on Sun, 5 Feb 2023 17:32:57 -0800:
>
> > On Fri, Feb 3, 2023 at 1:39 AM Maxim Kiselev <bigunclemax@gmail.com> wrote:
> > >
> > > > > On Fri, Feb 3, 2023 at 9:07 AM Saravana Kannan <saravanak@google.com> wrote:
> > > >
> > > > On Thu, Feb 2, 2023 at 9:36 AM Maxim Kiselev <bigunclemax@gmail.com> wrote:
> > > > >
> > > > > Hi Saravana,
> > > > >
> > > > > > Can you try the patch at the end of this email under these
> > > > > > configurations and tell me which ones fail vs pass? I don't need logs
> > > > >
> > > > > I did these tests and here is the results:
> > > >
> > > > Did you hand edit the In-Reply-To: in the header? Because in the
> > > > thread you are reply to the wrong email, but the context in your email
> > > > seems to be from the right email.
> > > >
> > > > For example, see how your reply isn't under the email you are replying
> > > > to in this thread overview:
> > > > https://lore.kernel.org/lkml/20230127001141.407071-1-saravanak@google.com/#r
> > > >
> > > > > 1. On top of this series - Not works
> > > > > 2. Without this series    - Works
> > > > > 3. On top of the series with the fwnode_dev_initialized() deleted - Not works
> > > > > 4. Without this series, with the fwnode_dev_initialized() deleted  - Works
> > > > >
> > > > > So your nvmem/core.c patch helps only when it is applied without the series.
> > > > > But despite the fact that this helps to avoid getting stuck at probing
> > > > > my ethernet device, there is still regression.
> > > > >
> > > > > When the ethernet module is loaded it takes a lot of time to drop dependency
> > > > > from the nvmem-cell with mac address.
> > > > >
> > > > > Please look at the kernel logs below.
> > > >
> > > > The kernel logs below really aren't that useful for me in their
> > > > current state. See more below.
> > > >
> > > > ---8<---- <snip> --->8----
> > > >
> > > > > P.S. Your nvmem patch definitely helps to avoid a device probe stuck
> > > > > but look like it is not best way to solve a problem which we discussed
> > > > > in the MTD thread.
> > > > >
> > > > > P.P.S. Also I don't know why your nvmem-cell patch doesn't help when it was
> > > > > applied on top of this series. Maybe I missed something.
> > > >
> > > > Yeah, I'm not too sure if the test was done correctly. You also didn't
> > > > answer my question about the dts from my earlier email.
> > > > https://lore.kernel.org/lkml/CAGETcx8FpmbaRm2CCwqt3BRBpgbogwP5gNB+iA5OEtuxWVTNLA@mail.gmail.com/#t
> > > >
> > > > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > > > > > drivers/base/core.c changed to the _info variants? And then share the
> > > > kernel log from the beginning of boot? Maybe attach it to the email so
> > > > it doesn't get word wrapped by your email client. And please point me
> > > > to the .dts that corresponds to your board. Without that, I can't
> > > > debug much.
> > > >
> > > > Thanks,
> > > > Saravana
> > >
> > > > Did you hand edit the In-Reply-To: in the header? Because in the
> > > > thread you are reply to the wrong email, but the context in your email
> > > > seems to be from the right email.
> > >
> > > > > Sorry for that, it seems like I accidentally deleted it.
> > >
> > > > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > > > > drivers/base/core.c changed to the _info variants? And then share the
> > > > kernel log from the beginning of boot? Maybe attach it to the email so
> > > > it doesn't get word wrapped by your email client. And please point me
> > > > to the .dts that corresponds to your board. Without that, I can't
> > > > debug much.
> > >
> > > Ok, I retested config 1 with all _debug logs changed to the _info. I
> > > added the kernel log and the dts file to the attachment of this email.
> >
> > Ah, so your device is not supported/present upstream? Even though it's
> > not upstream, I'll help fix this because it should fix what I believe
> > are unreported issues in upstream.
> >
> > Ok I know why configs 1 - 4 behaved the way they did and why my test
> > patch didn't help.
> >
> > After staring at mtd/nvmem code for a few hours I think mtd/nvmem
> > interaction is kind of a mess.
>
> nvmem is a recent subsystem but mtd carries a lot of legacy stuff we
> cannot really re-wire without breaking users, so nvmem on top of mtd
> of course inherit from the fragile designs in place.

Thanks for the context. Yeah, I figured. That's why I explicitly
limited my comment to "interaction". Although, I'd love to see the MTD
parsers all be converted to proper drivers that probe. MTD is
essentially repeating the driver matching logic. I think it can be
cleaned up to move to proper drivers and still not break backward
compatibility. Not saying it'll be trivial, but it should be possible.
Ironically MTD uses mtd_class but has real drivers that work on the
device (compared to nvmem_bus below).

> > mtd core creates "partition" platform
> > devices (including for nvmem-cells) that are probed by drivers in
> > drivers/nvmem. However, there's no driver for "nvmem-cells" partition
> > platform device. However, the nvmem core creates nvmem_device when
> > nvmem_register() is called by MTD or these partition platform devices
> > created by MTD. But these nvmem_devices are added to a nvmem_bus but
> > the bus has no means to even register a driver (it should really be a
> > nvmem_class and not nvmem_bus).
>
> Srinivas, do you think we could change this?

Yeah, this part gets a bit tricky. It depends on whether the sysfs
files for nvmem devices are considered an ABI. Changing from bus to
class would change the sysfs path for nvmem devices from:
/sys/bus/nvmem to /sys/class/nvmem

> > And the nvmem_device sometimes points
> > to the DT node of the MTD device or sometimes the partition platform
> > devices or maybe no DT node at all.
>
> I guess this comes from the fact that this is not strongly defined in
> mtd and depends on the situation (not mentioning 20 years of history
> there as well). "mtd" is a bit inconsistent on what it means. Older
> designs mixed: controllers, ECC engines when relevant and memories;
> while these three components are completely separated. Hence
> sometimes the mtd device ends up being the top level controller,
> sometimes it's just one partition...
>
> But I'm surprised not all of them point to a DT node. Could you show us
> an example? Because that might likely be unexpected (or perhaps I am
> missing something).

Well, the logic that sets the DT node for nvmem_device is like so:

        if (config->of_node)
                nvmem->dev.of_node = config->of_node;
        else if (!config->no_of_node)
                nvmem->dev.of_node = config->dev->of_node;

So there's definitely a path (where both conditions are false) where
the DT node will not get set. I don't know if any existing user of
nvmem_register() actually hits that path, but the code allows it.
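
To make the three outcomes easier to see, here is a minimal user-space
sketch of that selection logic. The struct layouts and the helper name
pick_of_node() are hypothetical stand-ins (not the kernel's definitions),
and it assumes the nvmem device's of_node starts out NULL, so "neither
branch taken" is modelled as returning NULL:

```c
#include <assert.h>   /* for callers that want to check the outcomes */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct device_node { const char *name; };
struct device { struct device_node *of_node; };

struct nvmem_config {
	struct device *dev;           /* parent device registering the nvmem */
	struct device_node *of_node;  /* explicit DT node, may be NULL */
	bool no_of_node;              /* "do not inherit the parent's node" */
};

/*
 * Mirrors the quoted if/else chain: returns the DT node the nvmem device
 * would end up with, or NULL when neither assignment happens.
 */
static struct device_node *pick_of_node(const struct nvmem_config *config)
{
	if (config->of_node)
		return config->of_node;       /* explicit node wins */
	if (!config->no_of_node)
		return config->dev->of_node;  /* inherit from parent (may be NULL) */
	return NULL;                          /* no_of_node set: no DT node */
}
```

Note that in this model there are two ways to end up without a DT node:
no_of_node is set with no explicit node, or the parent device itself has
no of_node to inherit.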

> > So it's a mess of multiple devices pointing to the same DT node with
> > no clear way to identify which ones will point to a DT node and which
> > ones will probe and which ones won't. In the future, we shouldn't
> > allow adding new compatible strings for partitions for which we don't
> > plan on adding nvmem drivers.
> >
> > Can you give the patch at the end of the email a shot? It should fix
> > the issue with this series and without this series. It just avoids
> > this whole mess by not creating useless platform device for
> > nvmem-cells compatible DT nodes.
>
> Thanks a lot for your help.

No problem. I want fw_devlink to work for everyone.

> >
> > Thanks,
> > Saravana
> >
> > diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
> > index d442fa94c872..88a213f4d651 100644
> > --- a/drivers/mtd/mtdpart.c
> > +++ b/drivers/mtd/mtdpart.c
> > @@ -577,6 +577,7 @@ static int mtd_part_of_parse(struct mtd_info *master,
> >  {
> >         struct mtd_part_parser *parser;
> >         struct device_node *np;
> > +       struct device_node *child;
> >         struct property *prop;
> >         struct device *dev;
> >         const char *compat;
> > @@ -594,6 +595,10 @@ static int mtd_part_of_parse(struct mtd_info *master,
> >         else
> >                 np = of_get_child_by_name(np, "partitions");
> >
> > +       for_each_child_of_node(np, child)
> > +               if (of_device_is_compatible(child, "nvmem-cells"))
> > +                       of_node_set_flag(child, OF_POPULATED);
>
> What about a comment explaining why we need that in the final patch
> (with a comment)? Otherwise it's a little bit obscure.

This wasn't meant to be reviewed :) Just a quick patch to make sure
I'm going down the right path. Once Maxim confirms I was going to roll
this into a proper patch.

But point noted. Will add a comment.

Thanks,
Saravana
Re: [PATCH v2 00/11] fw_devlink improvements
Posted by Miquel Raynal 2 years, 6 months ago
Hi Saravana,

> > > > > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > > > > > drivers/base/core.c changed to the _info variants? And then share the
> > > > > kernel log from the beginning of boot? Maybe attach it to the email so
> > > > > it doesn't get word wrapped by your email client. And please point me
> > > > > to the .dts that corresponds to your board. Without that, I can't
> > > > > debug much.  
> > > >
> > > > Ok, I retested config 1 with all _debug logs changed to the _info. I
> > > > added the kernel log and the dts file to the attachment of this email.  
> > >
> > > Ah, so your device is not supported/present upstream? Even though it's
> > > not upstream, I'll help fix this because it should fix what I believe
> > > are unreported issues in upstream.
> > >
> > > Ok I know why configs 1 - 4 behaved the way they did and why my test
> > > patch didn't help.
> > >
> > > After staring at mtd/nvmem code for a few hours I think mtd/nvmem
> > > interaction is kind of a mess.  
> >
> > nvmem is a recent subsystem but mtd carries a lot of legacy stuff we
> > cannot really re-wire without breaking users, so nvmem on top of mtd
> > of course inherit from the fragile designs in place.  
> 
> Thanks for the context. Yeah, I figured. That's why I explicitly
> limited my comment to "interaction". Although, I'd love to see the MTD
> parsers all be converted to proper drivers that probe. MTD is
> essentially repeating the driver matching logic. I think it can be
> cleaned up to move to proper drivers and still not break backward
> compatibility. Not saying it'll be trivial, but it should be possible.
> Ironically MTD uses mtd_class but has real drivers that work on the
> device (compared to nvmem_bus below).
> 
> > > mtd core creates "partition" platform
> > > devices (including for nvmem-cells) that are probed by drivers in
> > > drivers/nvmem. However, there's no driver for "nvmem-cells" partition
> > > platform device. However, the nvmem core creates nvmem_device when
> > > nvmem_register() is called by MTD or these partition platform devices
> > > created by MTD. But these nvmem_devices are added to a nvmem_bus but
> > > the bus has no means to even register a driver (it should really be a
> > > nvmem_class and not nvmem_bus).  
> >
> > Srinivas, do you think we could change this?  
> 
> Yeah, this part gets a bit tricky. It depends on whether the sysfs
> files for nvmem devices are considered an ABI. Changing from bus to
> class would change the sysfs path for nvmem devices from:
> /sys/bus/nvmem to /sys/class/nvmem

Ok, so this is a no :)

> > > And the nvmem_device sometimes points
> > > to the DT node of the MTD device or sometimes the partition platform
> > > devices or maybe no DT node at all.  
> >
> > I guess this comes from the fact that this is not strongly defined in
> > mtd and depends on the situation (not mentioning 20 years of history
> > there as well). "mtd" is a bit inconsistent on what it means. Older
> > designs mixed: controllers, ECC engines when relevant and memories;
> > while these three components are completely separated. Hence
> > sometimes the mtd device ends up being the top level controller,
> > sometimes it's just one partition...
> >
> > But I'm surprised not all of them point to a DT node. Could you show us
> > an example? Because that might likely be unexpected (or perhaps I am
> > missing something).  
> 
> Well, the logic that sets the DT node for nvmem_device is like so:
> 
>         if (config->of_node)
>                 nvmem->dev.of_node = config->of_node;
>         else if (!config->no_of_node)
>                 nvmem->dev.of_node = config->dev->of_node;
> 
> So there's definitely a path (where both if's could be false) where
> the DT node will not get set. I don't know if that path is possible
> with the existing users of nvmem_register(), but it's definitely
> possible.

It's an actual path. I just checked in more detail; this is the change
from 2018 which uses the no_of_node flag:
c4dfa25ab307 ("mtd: add support for reading MTD devices via the nvmem API")

It basically allows any mtd device to be accessible (read-only) through
nvmem. So mtd partitions and the like that are not described in the DT
can still be accessed through nvmem (that is my current understanding).

There was later a patch in 2021 which prevented this flag from being
automatically set, so that if partitions (well, mtd devices in general)
were described in the DT, they would provide a valid of_node in order
to be used as cell providers (again, my understanding):
658c4448bbbf ("mtd: core: add nvmem-cells compatible to parse mtd as nvmem cells")

But I guess the major problem comes from the nvmem-cells compatible. I
am wondering if it would make sense to kind of transpose the meaning of
this compatible into a property. But, well, backward compatibility
would still be a problem I guess...

> > > So it's a mess of multiple devices pointing to the same DT node with
> > > no clear way to identify which ones will point to a DT node and which
> > > ones will probe and which ones won't. In the future, we shouldn't
> > > allow adding new compatible strings for partitions for which we don't
> > > plan on adding nvmem drivers.
> > >
> > > Can you give the patch at the end of the email a shot? It should fix
> > > the issue with this series and without this series. It just avoids
> > > this whole mess by not creating useless platform device for
> > > nvmem-cells compatible DT nodes.  
> >
> > Thanks a lot for your help.  
> 
> No problem. I want fw_devlink to work for everyone.
> 

Thanks,
Miquèl
Re: [PATCH v2 00/11] fw_devlink improvements
Posted by Saravana Kannan 2 years, 7 months ago
On Sun, Feb 5, 2023 at 5:32 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Fri, Feb 3, 2023 at 1:39 AM Maxim Kiselev <bigunclemax@gmail.com> wrote:
> >
> > > On Fri, Feb 3, 2023 at 9:07 AM Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > On Thu, Feb 2, 2023 at 9:36 AM Maxim Kiselev <bigunclemax@gmail.com> wrote:
> > > >
> > > > Hi Saravana,
> > > >
> > > > > Can you try the patch at the end of this email under these
> > > > > configurations and tell me which ones fail vs pass? I don't need logs
> > > >
> > > > I did these tests and here is the results:
> > >
> > > Did you hand edit the In-Reply-To: in the header? Because in the
> > > thread you are reply to the wrong email, but the context in your email
> > > seems to be from the right email.
> > >
> > > For example, see how your reply isn't under the email you are replying
> > > to in this thread overview:
> > > https://lore.kernel.org/lkml/20230127001141.407071-1-saravanak@google.com/#r
> > >
> > > > 1. On top of this series - Not works
> > > > 2. Without this series    - Works
> > > > 3. On top of the series with the fwnode_dev_initialized() deleted - Not works
> > > > 4. Without this series, with the fwnode_dev_initialized() deleted  - Works
> > > >
> > > > So your nvmem/core.c patch helps only when it is applied without the series.
> > > > But despite the fact that this helps to avoid getting stuck at probing
> > > > my ethernet device, there is still regression.
> > > >
> > > > When the ethernet module is loaded it takes a lot of time to drop dependency
> > > > from the nvmem-cell with mac address.
> > > >
> > > > Please look at the kernel logs below.
> > >
> > > The kernel logs below really aren't that useful for me in their
> > > current state. See more below.
> > >
> > > ---8<---- <snip> --->8----
> > >
> > > > P.S. Your nvmem patch definitely helps to avoid a device probe stuck
> > > > but look like it is not best way to solve a problem which we discussed
> > > > in the MTD thread.
> > > >
> > > > P.P.S. Also I don't know why your nvmem-cell patch doesn't help when it was
> > > > applied on top of this series. Maybe I missed something.
> > >
> > > Yeah, I'm not too sure if the test was done correctly. You also didn't
> > > answer my question about the dts from my earlier email.
> > > https://lore.kernel.org/lkml/CAGETcx8FpmbaRm2CCwqt3BRBpgbogwP5gNB+iA5OEtuxWVTNLA@mail.gmail.com/#t
> > >
> > > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > > > drivers/base/core.c changed to the _info variants? And then share the
> > > kernel log from the beginning of boot? Maybe attach it to the email so
> > > it doesn't get word wrapped by your email client. And please point me
> > > to the .dts that corresponds to your board. Without that, I can't
> > > debug much.
> > >
> > > Thanks,
> > > Saravana
> >
> > > Did you hand edit the In-Reply-To: in the header? Because in the
> > > thread you are reply to the wrong email, but the context in your email
> > > seems to be from the right email.
> >
> > > Sorry for that, it seems like I accidentally deleted it.
> >
> > > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > > > drivers/base/core.c changed to the _info variants? And then share the
> > > kernel log from the beginning of boot? Maybe attach it to the email so
> > > it doesn't get word wrapped by your email client. And please point me
> > > to the .dts that corresponds to your board. Without that, I can't
> > > debug much.
> >
> > Ok, I retested config 1 with all _debug logs changed to the _info. I
> > added the kernel log and the dts file to the attachment of this email.
>
> Ah, so your device is not supported/present upstream? Even though it's
> not upstream, I'll help fix this because it should fix what I believe
> are unreported issues in upstream.
>
> Ok I know why configs 1 - 4 behaved the way they did and why my test
> patch didn't help.
>
> After staring at mtd/nvmem code for a few hours I think mtd/nvmem
> interaction is kind of a mess. mtd core creates "partition" platform
> devices (including for nvmem-cells) that are probed by drivers in
> drivers/nvmem. However, there's no driver for "nvmem-cells" partition
> platform device. However, the nvmem core creates nvmem_device when
> nvmem_register() is called by MTD or these partition platform devices
> created by MTD. But these nvmem_devices are added to a nvmem_bus but
> the bus has no means to even register a driver (it should really be a
> nvmem_class and not nvmem_bus). And the nvmem_device sometimes points
> to the DT node of the MTD device or sometimes the partition platform
> devices or maybe no DT node at all.
>
> So it's a mess of multiple devices pointing to the same DT node with
> no clear way to identify which ones will point to a DT node and which
> ones will probe and which ones won't. In the future, we shouldn't
> allow adding new compatible strings for partitions for which we don't
> plan on adding nvmem drivers.
>
> Can you give the patch at the end of the email a shot? It should fix
> the issue both with and without this series. It just avoids this
> whole mess by not creating a useless platform device for
> nvmem-cells compatible DT nodes.

Actually, without this series, the patch below will need an additional
line of code inside the if block:
fwnode_dev_initialized(of_fwnode_handle(child), true);

-Saravana

>
> Thanks,
> Saravana
>
> diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
> index d442fa94c872..88a213f4d651 100644
> --- a/drivers/mtd/mtdpart.c
> +++ b/drivers/mtd/mtdpart.c
> @@ -577,6 +577,7 @@ static int mtd_part_of_parse(struct mtd_info *master,
>  {
>         struct mtd_part_parser *parser;
>         struct device_node *np;
> +       struct device_node *child;
>         struct property *prop;
>         struct device *dev;
>         const char *compat;
> @@ -594,6 +595,10 @@ static int mtd_part_of_parse(struct mtd_info *master,
>         else
>                 np = of_get_child_by_name(np, "partitions");
>
> +       for_each_child_of_node(np, child)
> +               if (of_device_is_compatible(child, "nvmem-cells"))
> +                       of_node_set_flag(child, OF_POPULATED);
> +
>         of_property_for_each_string(np, "compatible", prop, compat) {
>                 parser = mtd_part_get_compatible_parser(compat);
>                 if (!parser)
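
For trees without this series, the extra line mentioned above goes inside the if block, next to the OF_POPULATED flag. The following is only a userspace sketch of that control flow. `struct mock_node`, the `MOCK_*` flags and `mark_nvmem_cells()` are stand-ins invented for illustration and are not kernel APIs; in the kernel the two steps would be of_node_set_flag(child, OF_POPULATED) and fwnode_dev_initialized(of_fwnode_handle(child), true).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the kernel flags; the names are hypothetical. */
enum {
	MOCK_OF_POPULATED       = 1 << 0, /* models OF_POPULATED */
	MOCK_FWNODE_INITIALIZED = 1 << 1  /* models the effect of fwnode_dev_initialized(..., true) */
};

/* Stand-in for a child DT node under "partitions". */
struct mock_node {
	const char *compatible; /* first compatible string, or NULL */
	unsigned int flags;
};

/*
 * Walk the children and mark every "nvmem-cells" node so that no
 * platform device would be created for it. When with_series is false,
 * also mark the node as initialized, mirroring the extra line that is
 * needed without the fw_devlink series.
 * Returns the number of nodes that were marked.
 */
static size_t mark_nvmem_cells(struct mock_node *children, size_t n,
			       bool with_series)
{
	size_t marked = 0;

	for (size_t i = 0; i < n; i++) {
		struct mock_node *child = &children[i];

		if (!child->compatible ||
		    strcmp(child->compatible, "nvmem-cells") != 0)
			continue;

		child->flags |= MOCK_OF_POPULATED;
		if (!with_series)
			child->flags |= MOCK_FWNODE_INITIALIZED;
		marked++;
	}

	return marked;
}
```

Calling mark_nvmem_cells() with with_series set to false sets both flags on matching nodes, while with_series set to true sets only the populated flag; nodes with any other compatible string are left untouched.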
Re: [PATCH v2 00/11] fw_devlink improvements
Posted by Marc Zyngier 2 years, 7 months ago
On Mon, 30 Jan 2023 08:55:42 +0000,
Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> 
> Build tests pass on arm, arm64, i386, mips, parisc, powerpc, riscv, s390, sh,
> sparc and x86_64.
> 
> Boot and LTP smoke tests pass on qemu-arm64, qemu-armv7, qemu-i386 and qemu-x86_64.
> Boot failed on FVP.
> 
> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
> 
> Please refer to the following link for details of the testing.
> FVP boot failed; the boot log is at:
> https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-devicetree_20230127001141_407071-1-saravanak_google_com/testrun/14389034/suite/boot/test/gcc-12-lkftconfig-64k_page_size/details/
> 
> 
> [    2.613437] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
> [    2.613628] Mem abort info:
> [    2.613756]   ESR = 0x0000000096000005
> [    2.613904]   EC = 0x25: DABT (current EL), IL = 32 bits
> [    2.614071]   SET = 0, FnV = 0
> [    2.614215]   EA = 0, S1PTW = 0
> [    2.614358]   FSC = 0x05: level 1 translation fault
> [    2.614517] Data abort info:
> [    2.614647]   ISV = 0, ISS = 0x00000005
> [    2.614792]   CM = 0, WnR = 0
> [    2.614934] [0000000000000010] user address but active_mm is swapper
> [    2.615105] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
> [    2.615219] Modules linked in:
> [    2.615310] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc5 #1
> [    2.615445] Hardware name: FVP Base RevC (DT)
> [    2.615533] pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> [    2.615685] pc : gpiochip_setup_dev (include/linux/err.h:41 include/linux/fwnode.h:201 drivers/gpio/gpiolib.c:586) 
> [    2.615816] lr : gpiochip_add_data_with_key (drivers/gpio/gpiolib.c:871) 
> [    2.615970] sp : ffff8000081af5e0
> [    2.616051] x29: ffff8000081af5e0 x28: 0000000000000000 x27: ffff0008027cb5a0
> [    2.616261] x26: 0000000000000000 x25: ffffd7c5d6745910 x24: ffff0008027f4800
> [    2.616472] x23: 0000000000000000 x22: ffffd7c5d62b99a8 x21: 0000000000000202
> [    2.616679] x20: 0000000000000000 x19: ffff0008027f4800 x18: ffffffffffffffff
> [    2.616890] x17: ffffd7c5d6467928 x16: 0000000013e3690a x15: ffff8000081af3b0
> [    2.617102] x14: ffff00080275cd8a x13: ffff00080275cd88 x12: 0000000000000001
> [    2.617312] x11: 62726568746f6d3a x10: 0000000000000000 x9 : ffffd7c5d3b3ebe0
> [    2.617522] x8 : ffff8000081af548 x7 : 0000000000000000 x6 : 0000000000000001
> [    2.617727] x5 : 0000000000000000 x4 : ffff000800640000 x3 : ffffd7c5d62b99c8
> [    2.617933] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
> [    2.618138] Call trace:
> [    2.618204] gpiochip_setup_dev (include/linux/err.h:41 include/linux/fwnode.h:201 drivers/gpio/gpiolib.c:586) 
> [    2.618337] gpiochip_add_data_with_key (drivers/gpio/gpiolib.c:871) 
> [    2.618493] devm_gpiochip_add_data_with_key (drivers/gpio/gpiolib-devres.c:478) 
> [    2.618654] bgpio_pdev_probe (drivers/gpio/gpio-mmio.c:793) 
> [    2.618785] platform_probe (drivers/base/platform.c:1401) 
> [    2.618928] really_probe (drivers/base/dd.c:560 drivers/base/dd.c:639) 
> [    2.619056] __driver_probe_device (drivers/base/dd.c:778) 
> [    2.619193] driver_probe_device (drivers/base/dd.c:808) 
> [    2.619329] __device_attach_driver (drivers/base/dd.c:937) 
> [    2.619464] bus_for_each_drv (drivers/base/bus.c:427) 
> [    2.619590] __device_attach (drivers/base/dd.c:1010) 
> [    2.619722] device_initial_probe (drivers/base/dd.c:1058) 
> [    2.619861] bus_probe_device (drivers/base/bus.c:489) 
> [    2.619988] device_add (drivers/base/core.c:3637) 
> [    2.620102] platform_device_add (drivers/base/platform.c:717) 
> [    2.620251] mfd_add_device (drivers/mfd/mfd-core.c:297) 
> [    2.620397] devm_mfd_add_devices (drivers/mfd/mfd-core.c:351 drivers/mfd/mfd-core.c:449) 
> [    2.620548] vexpress_sysreg_probe (drivers/mfd/vexpress-sysreg.c:115) 
> [    2.620672] platform_probe (drivers/base/platform.c:1401) 
> [    2.620814] really_probe (drivers/base/dd.c:560 drivers/base/dd.c:639) 
> [    2.620940] __driver_probe_device (drivers/base/dd.c:778) 
> [    2.621080] driver_probe_device (drivers/base/dd.c:808) 
> [    2.621216] __driver_attach (drivers/base/dd.c:1195) 
> [    2.621344] bus_for_each_dev (drivers/base/bus.c:301) 
> [    2.621467] driver_attach (drivers/base/dd.c:1212) 
> [    2.621596] bus_add_driver (drivers/base/bus.c:618) 
> [    2.621720] driver_register (drivers/base/driver.c:246) 
> [    2.621859] __platform_driver_register (drivers/base/platform.c:868) 
> [    2.622012] vexpress_sysreg_driver_init (drivers/mfd/vexpress-sysreg.c:134) 
> [    2.622145] do_one_initcall (init/main.c:1306) 
> [    2.622269] kernel_init_freeable (init/main.c:1378 init/main.c:1395 init/main.c:1414 init/main.c:1634) 
> [    2.622394] kernel_init (init/main.c:1526) 
> [    2.622531] ret_from_fork (arch/arm64/kernel/entry.S:864) 
> [    2.622692] Code: 910003fd a90153f3 aa0003f3 f9414c00 (f9400801)
> All code
> ========
>    0:*	fd                   	std    		<-- trapping instruction
>    1:	03 00                	add    (%rax),%eax
>    3:	91                   	xchg   %eax,%ecx
>    4:	f3 53                	repz push %rbx
>    6:	01 a9 f3 03 00 aa    	add    %ebp,-0x55fffc0d(%rcx)
>    c:	00 4c 41 f9          	add    %cl,-0x7(%rcx,%rax,2)
>   10:	01 08                	add    %ecx,(%rax)
>   12:	40 f9                	rex stc

Could you please fix your scripts so that they report something that
matches the tested architecture? I like x86 asm as much as the next
guy, but this is an arm64 crash... :-/

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.