RFC, not for merge. End-to-end inference does not produce correct output
yet (see Status), so per the v2 discussion this is a request for design
feedback. It now probes, attaches, and submits cleanly on a stock
v7.1-rc6 tree; what remains is one hardware-internal issue.
The RK3568 has a single NVDLA-derived NPU core, the same IP family as the
RK3588 NPU the driver already supports; the register layout matches. The
RK3568 differences are a 32-bit NPU AXI/IOMMU (vs 40-bit) and explicit
PVTPLL/PMU bring-up to power and de-idle the NPU before it is reachable.
Patches:
1-2 rocket: per-SoC data struct, then derive DMA width and core count
from match data (refactors, no functional change).
3 rocket: RK3568 SoC data + PVTPLL/PMU/NOC bring-up.
4 rocket: reset the NPU before detaching the IOMMU on a job timeout
(the detach otherwise stalls a wedged AXI master and WARNs).
5 rocket: keep the IOMMU domain attached across jobs instead of
re-attaching per job (the per-job rk_iommu handshake on the idle
NPU MMU is slow and noisy).
6 iommu/rockchip: clear AUTO_GATING bit 1 on the RK356x v1 IOMMU so
the page-walker keeps its clock (else a TLB-miss walk never
completes).
7 dt-bindings: add the RK3568 NPU compatible.
8-9 arm64 dts: add the NPU and its IOMMU, and enable them on ROCK 3B.
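The refactor in patches 1-2 can be pictured with a small userspace sketch. This is not the driver's code: the struct layout, the field names, the lookup helper, and the RK3568 compatible string are all assumptions made for illustration. The DMA widths and the single RK3568 core come from the text above; the RK3588 core count of 3 is general knowledge about that SoC rather than something stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-SoC description; field names are illustrative only. */
struct rocket_soc_data {
	unsigned int dma_bits;  /* width of the NPU AXI/IOMMU address bus */
	unsigned int num_cores; /* number of NPU cores on the SoC */
};

/* Stand-in for an of_device_id entry carrying match data. */
struct rocket_match {
	const char *compatible;
	struct rocket_soc_data data;
};

static const struct rocket_match rocket_matches[] = {
	{ "rockchip,rk3588-rknn-core", { .dma_bits = 40, .num_cores = 3 } },
	/* Assumed compatible string for the RK3568 NPU. */
	{ "rockchip,rk3568-rknn-core", { .dma_bits = 32, .num_cores = 1 } },
};

/* Return the SoC data for a compatible string, or NULL if unknown. */
static const struct rocket_soc_data *
rocket_lookup_soc_data(const char *compatible)
{
	size_t n = sizeof(rocket_matches) / sizeof(rocket_matches[0]);

	assert(compatible != NULL);
	for (size_t i = 0; i < n; i++) {
		if (strcmp(rocket_matches[i].compatible, compatible) == 0)
			return &rocket_matches[i].data;
	}
	return NULL;
}
```

With a table like this, the DMA mask and the core count are read once from the matched entry instead of being rediscovered by rescanning the device tree.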
Dependency. The NPU MMU is rockchip-iommu v1 (32-bit) while the rest of
the RK3568 uses v2 (40-bit). They cannot coexist until the driver carries
per-device ops; this series is developed on top of Simon Xue's
"iommu/rockchip: Drop global rk_ops in favor of per-device ops" [1].
Without it the NPU IOMMU fails to probe on a full RK3568 boot.
Power bring-up. The NPU is brought up through the power-domain layer (no
driver hack): the NPU power-domain keeps its clocks but drops the pm_qos
phandle (qos_npu sits behind the gated NPU NoC, so genpd's power-off QoS
save faults reading it), and vdd_npu is marked always-on so the rail is
up before genpd de-idles the NoC at power-on. The PMU de-idle then ACKs
without PVTPLL running; PVTPLL is only needed for compute.
Status. On v7.1-rc6 the driver probes, creates /dev/accel/accel0,
attaches an IOMMU domain, and submits jobs; the program controller
fetches and broadcasts the command list. Inference output is still wrong,
and the cause is split across three layers:
- kernel (this series): the RK3568 differences appear handled;
- mesa/Teflon userspace: still emits RK3588-tuned config, wrong for
RK3568 (to be filed separately on mesa-dev);
- hardware: with corrected config the NPU's DMA reads the full input
and weight tensors (confirmed via its DMA bandwidth counters), but
the MAC/output stage never completes, the job times out, and the
output stays at the buffer's zero-point. I have not found the missing
step; it is not in the command list (replaying the vendor's
byte-exact command list behaves the same). Pointers welcome,
especially from anyone with RK3568 NPU experience.
Known residual. On the first IOMMU attach the NPU MMU is idle with paging
already enabled; the rk_iommu stall/reset handshake does not complete in
that state and logs one burst of timeouts before the (kept) domain
settles. It is harmless here because the job times out regardless, but it
points at an idle-MMU reconfiguration corner the rk_iommu code does not
handle on this block.
[1] https://lore.kernel.org/linux-rockchip/20260310105303.128859-1-xxm@rock-chips.com/
Changes since v2:
- Tagged RFC; now tested on a stock v7.1-rc6 tree.
- Bring-up moved into the power-domain/DT layer (no initcall hack).
- Added the IOMMU detach-on-timeout and attach-once driver fixes.
- Split the driver patch (Heiko): soc_data / match-data / RK3568.
- Derive DMA width and core count from match data; drop the DT rescans.
- Binding describes the hardware; added the missing $ref on rockchip,pmu.
- Disclosed the per-device-ops IOMMU dependency.
Midgy BALON (9):
accel: rocket: Introduce per-SoC rocket_soc_data
accel: rocket: Derive DMA width and core count from match data
accel: rocket: Add RK3568 SoC support
accel: rocket: Reset the NPU before detaching the IOMMU on timeout
accel: rocket: Keep the IOMMU domain attached across jobs
iommu/rockchip: Clear AUTO_GATING bit 1 on the RK356x v1 IOMMU
dt-bindings: npu: rockchip,rk3588-rknn-core: Add RK3568
arm64: dts: rockchip: rk356x: Add the NPU and its IOMMU
arm64: dts: rockchip: rk3568-rock-3b: Enable the NPU
.../npu/rockchip,rk3588-rknn-core.yaml | 18 ++++-
.../boot/dts/rockchip/rk3568-rock-3b.dts | 14 +++-
arch/arm64/boot/dts/rockchip/rk356x-base.dtsi | 38 +++++++++++
drivers/accel/rocket/rocket_core.c | 22 ++++++-
drivers/accel/rocket/rocket_core.h | 19 ++++++
drivers/accel/rocket/rocket_device.c | 15 ++---
drivers/accel/rocket/rocket_device.h | 3 +-
drivers/accel/rocket/rocket_drv.c | 66 ++++++++++++++++++-
drivers/accel/rocket/rocket_job.c | 35 ++++++++--
drivers/iommu/rockchip-iommu.c | 12 ++++
10 files changed, 219 insertions(+), 23 deletions(-)
base-commit: 52c800fdcf11888ebeb50c3d707f782cc15b66eb
--
2.39.5
Hello Midgy,
On 6/4/2026 9:52 PM, Midgy BALON wrote:
> Dependency. The NPU MMU is rockchip-iommu v1 (32-bit) while the rest of
> the RK3568 uses v2 (40-bit). They cannot coexist until the driver carries
> per-device ops; this series is developed on top of Simon Xue's
> "iommu/rockchip: Drop global rk_ops in favor of per-device ops" [1].
> Without it the NPU IOMMU fails to probe on a full RK3568 boot.
Hmmm. If I understand correctly, the NPU IOMMU should be v2 rather than
v1, implying it should support 40-bit PAs. Nevertheless, please note that
the upper limit for DTE is 32 bits.
> Power bring-up. The NPU is brought up through the power-domain layer (no
> driver hack): the NPU power-domain keeps its clocks but drops the pm_qos
> phandle (qos_npu sits behind the gated NPU NoC, so genpd's power-off QoS
> save faults reading it), and vdd_npu is marked always-on so the rail is
> up before genpd de-idles the NoC at power-on. The PMU de-idle then ACKs
> without PVTPLL running; PVTPLL is only needed for compute.
Can these operations not be completed via the pmdomain driver?
If some operations are controlled by TF-A, are you using open
source TF-A? Thank you.
--
Best,
Chaoyi
Hi Chaoyi,
Thanks a lot for looking at this -- input from Rockchip is exactly what this
series needs.
> Hmmm. If I understand correctly, the NPU IOMMU should be v2 rather than v1,
> implying it should support 40-bit PAs. Nevertheless, please note that the
> upper limit for DTE is 32 bits.
Understood, and that 32-bit-DTE note is the crux of the trouble I had, so let
me lay out what I see and ask how you'd prefer to solve it.
The mainline node is already v2 (rockchip,rk3568-iommu in rk356x-base.dtsi).
The problem on this 8 GiB board: with the v2 ops the page-table allocations
(gfp_flags == 0) can land above 4 GiB, so the DTE ends up > 32 bits and the
NPU's first translation faults with DMA_READ_ERROR. To work around that I had
switched the NPU MMU to the v1 compatible (rockchip,iommu), whose ops set
GFP_DMA32 and keep the DTE sub-4 GiB. That works in isolation, but because the
driver keeps a single global rk_ops, a v1 NPU MMU then trips
WARN_ON(rk_ops != ops) against the SoC's v2 instances (VOP/VDEC), which is why
I based the series on Simon's per-device-ops work.
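The constraint described above can be reduced to a simple check, sketched here in userspace C. It is a simplification under stated assumptions: the function names are hypothetical, and it only models the rule that the directory table base must sit below 4 GiB to fit a 32-bit DTE register; it does not reproduce how rockchip-iommu actually encodes or programs the address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if a directory-table physical address can be expressed in a
 * 32-bit register, i.e. it lies below 4 GiB.
 */
static bool dte_addr_fits_32bit(uint64_t dt_phys)
{
	return dt_phys <= UINT32_MAX;
}

/*
 * Hypothetical helper: value that would be written to a 32-bit DTE
 * address register. The caller must have allocated the table below
 * 4 GiB (what GFP_DMA32 is meant to guarantee in the kernel).
 */
static uint32_t dte_register_value(uint64_t dt_phys)
{
	assert(dte_addr_fits_32bit(dt_phys));
	return (uint32_t)dt_phys;
}
```

On an 8 GiB board an unconstrained allocation can return an address such as 0x1_0000_0000 or higher, for which the check fails; restricting the allocation zone is what keeps the address inside the 32-bit range.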
So my question: with per-device ops in place, what's the intended way to keep
the NPU MMU on v2 *and* cap its DTE at 32 bits on boards with >4 GiB of RAM?
A v2 ops variant carrying GFP_DMA32 for this device, or is there a register/
config bit that constrains the DTE address? I'd rather follow the Rockchip
intent here than carry the v1 workaround. (Simon, cc'd -- this is right next to
your per-device-ops series.)
> Can these operations not be completed via the pmdomain driver?
> If some operations are controlled by TF-A, are you using open source TF-A?
Most of it is in pmdomain already. Power-on and NoC de-idle are done by the
RK3568 NPU power domain (genpd) at power-on -- the driver no longer pokes the
PMU directly. Two things remain outside it:
- vdd_npu: I mark it regulator-always-on in DT rather than wiring it as the
domain's domain-supply, because as a domain-supply it created a device-link
to the I2C PMIC (rk809) and genpd's power-off QoS-save path then hung
reading the NPU QoS registers behind the (gated) NoC. If there's a clean way
to let genpd own vdd_npu without that I2C ordering deadlock I'd much prefer
that -- pointers welcome.
- the NPU compute clock (PVTPLL): set from the driver via SCMI, and only
needed for actual compute, not for bring-up.
One more pmdomain observation from testing, possibly relevant to how the NPU
domain should be modelled: the domain's power-off/on cycle doesn't reliably
re-de-idle the NoC. If the NPU is probed after genpd has already powered the
(unused) domain off, the power-on de-idle fails ("failed to set idle on domain
'npu'") and the NPU IOMMU then takes an external abort on its first MMIO access.
Probing the NPU before the unused-domain power-off, or marking the domain
always-on, both avoid it. Is the NoC de-idle expected to work on a genpd
re-power here, or should this domain effectively stay on?
On TF-A: yes -- bl31 is built from upstream arm-trusted-firmware
(github.com/ARM-software/arm-trusted-firmware, RK3568 platform), providing PSCI
and the SCMI clock service. The only closed blob in the boot chain is Rockchip's
DDR init (rkbin), which is the standard situation for mainline RK356x.
Kind regards,
Midgy
Hi Midgy,
On 6/8/2026 5:03 AM, Midgy Balon wrote:
> Hi Chaoyi,
>
> Thanks a lot for looking at this -- input from Rockchip is exactly what this
> series needs.
>
>> Hmmm. If I understand correctly, the NPU IOMMU should be v2 rather than v1,
>> implying it should support 40-bit PAs. Nevertheless, please note that the
>> upper limit for DTE is 32 bits.
>
> Understood, and that 32-bit-DTE note is the crux of the trouble I had, so let
> me lay out what I see and ask how you'd prefer to solve it.
>
> The mainline node is already v2 (rockchip,rk3568-iommu in rk356x-base.dtsi).
> The problem on this 8 GiB board: with the v2 ops the page-table allocations
> (gfp_flags == 0) can land above 4 GiB, so the DTE ends up > 32 bits and the
> NPU's first translation faults with DMA_READ_ERROR. To work around that I had
> switched the NPU MMU to the v1 compatible (rockchip,iommu), whose ops set
> GFP_DMA32 and keep the DTE sub-4 GiB. That works in isolation, but because the
> driver keeps a single global rk_ops, a v1 NPU MMU then trips
> WARN_ON(rk_ops != ops) against the SoC's v2 instances (VOP/VDEC), which is why
> I based the series on Simon's per-device-ops work.
>
> So my question: with per-device ops in place, what's the intended way to keep
> the NPU MMU on v2 *and* cap its DTE at 32 bits on boards with >4 GiB of RAM?
> A v2 ops variant carrying GFP_DMA32 for this device, or is there a register/
> config bit that constrains the DTE address? I'd rather follow the Rockchip
> intent here than carry the v1 workaround. (Simon, cc'd -- this is right next to
> your per-device-ops series.)
>
If Simon's method works, please use it :)
>> Can these operations not be completed via the pmdomain driver?
>> If some operations are controlled by TF-A, are you using open source TF-A?
>
> Most of it is in pmdomain already. Power-on and NoC de-idle are done by the
> RK3568 NPU power domain (genpd) at power-on -- the driver no longer pokes the
> PMU directly. Two things remain outside it:
>
> - vdd_npu: I mark it regulator-always-on in DT rather than wiring it as the
> domain's domain-supply, because as a domain-supply it created a device-link
> to the I2C PMIC (rk809) and genpd's power-off QoS-save path then hung
> reading the NPU QoS registers behind the (gated) NoC. If there's a clean way
> to let genpd own vdd_npu without that I2C ordering deadlock I'd much prefer
> that -- pointers welcome.
>
Please refer to the patch below regarding the RK3588 NPU pmdomain.
In short, you need to set a "need_regulator" for the RK3568 NPU pmdomain.
https://lore.kernel.org/all/20251216055247.13150-1-rmxpzlb@gmail.com/
> - the NPU compute clock (PVTPLL): set from the driver via SCMI, and only
> needed for actual compute, not for bring-up.
>
> One more pmdomain observation from testing, possibly relevant to how the NPU
> domain should be modelled: the domain's power-off/on cycle doesn't reliably
> re-de-idle the NoC. If the NPU is probed after genpd has already powered the
> (unused) domain off, the power-on de-idle fails ("failed to set idle on domain
> 'npu'") and the NPU IOMMU then takes an external abort on its first MMIO access.
> Probing the NPU before the unused-domain power-off, or marking the domain
> always-on, both avoid it. Is the NoC de-idle expected to work on a genpd
> re-power here, or should this domain effectively stay on?
>
Not quite sure what's going on with PVTPLL and NOC.
Maybe @Finley knows about this?
> On TF-A: yes -- bl31 is built from upstream arm-trusted-firmware
> (github.com/ARM-software/arm-trusted-firmware, RK3568 platform), providing PSCI
> and the SCMI clock service. The only closed blob in the boot chain is Rockchip's
> DDR init (rkbin), which is the standard situation for mainline RK356x.
--
Best,
Chaoyi