[v8] crypto/dmaengine: qce: introduce BAM locking and use DMA for register I/O

[PATCH v8 01/11] dmaengine: Add DMA_PREP_LOCK/DMA_PREP_UNLOCK flags

Posted by Bartosz Golaszewski 3 months ago

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Some DMA engines may be accessed from Linux and TrustZone
simultaneously. To allow synchronization, add lock and unlock flags
for the command descriptor. They let the caller request that the
controller be locked for the duration of the transaction in an
implementation-dependent way.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
 Documentation/driver-api/dmaengine/provider.rst | 9 +++++++++
 include/linux/dmaengine.h                       | 6 ++++++
 2 files changed, 15 insertions(+)

diff --git a/Documentation/driver-api/dmaengine/provider.rst b/Documentation/driver-api/dmaengine/provider.rst
index 1594598b331782e4dddcf992159c724111db9cf3..6428211405472dd1147e363f5786acc91d95ed43 100644
--- a/Documentation/driver-api/dmaengine/provider.rst
+++ b/Documentation/driver-api/dmaengine/provider.rst
@@ -630,6 +630,15 @@ DMA_CTRL_REUSE
   - This flag is only supported if the channel reports the DMA_LOAD_EOT
     capability.
 
+- DMA_PREP_LOCK
+
+  - If set, the DMA controller will be locked for the duration of the current
+    transaction.
+
+- DMA_PREP_UNLOCK
+
+  - If set, DMA will release he controller lock.
+
 General Design Notes
 ====================
 
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 99efe2b9b4ea9844ca6161208362ef18ef111d96..c02be4bc8ac4c3db47c7c11751b949e3479e7cb8 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -200,6 +200,10 @@ struct dma_vec {
  *  transaction is marked with DMA_PREP_REPEAT will cause the new transaction
  *  to never be processed and stay in the issued queue forever. The flag is
  *  ignored if the previous transaction is not a repeated transaction.
+ *  @DMA_PREP_LOCK: tell the driver that there is a lock bit set on command
+ *  descriptor.
+ *  @DMA_PREP_UNLOCK: tell the driver that there is a un-lock bit set on command
+ *  descriptor.
  */
 enum dma_ctrl_flags {
 	DMA_PREP_INTERRUPT = (1 << 0),
@@ -212,6 +216,8 @@ enum dma_ctrl_flags {
 	DMA_PREP_CMD = (1 << 7),
 	DMA_PREP_REPEAT = (1 << 8),
 	DMA_PREP_LOAD_EOT = (1 << 9),
+	DMA_PREP_LOCK = (1 << 10),
+	DMA_PREP_UNLOCK = (1 << 11),
 };
 
 /**

-- 
2.51.0
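
For readers following along, here is a minimal userspace sketch of how a provider might translate the two new flags into per-descriptor hardware bits. The enum values are copied from the hunk above; the HW_DESC_* bits and hw_desc_flags() are hypothetical and are not taken from this series or from any real driver.

```c
/* Subset of enum dma_ctrl_flags, with values as shown in the hunk above. */
enum dma_ctrl_flags {
	DMA_PREP_INTERRUPT = (1 << 0),
	DMA_PREP_CMD = (1 << 7),
	DMA_PREP_REPEAT = (1 << 8),
	DMA_PREP_LOAD_EOT = (1 << 9),
	DMA_PREP_LOCK = (1 << 10),
	DMA_PREP_UNLOCK = (1 << 11),
};

/* Hypothetical hardware descriptor bits (assumption, for illustration). */
#define HW_DESC_CMD	(1u << 0)
#define HW_DESC_LOCK	(1u << 1)
#define HW_DESC_UNLOCK	(1u << 2)

/* Map the generic prep flags onto the hypothetical hardware bits. */
static unsigned int hw_desc_flags(unsigned long flags)
{
	unsigned int hw = 0;

	if (flags & DMA_PREP_CMD)
		hw |= HW_DESC_CMD;
	if (flags & DMA_PREP_LOCK)
		hw |= HW_DESC_LOCK;
	if (flags & DMA_PREP_UNLOCK)
		hw |= HW_DESC_UNLOCK;

	return hw;
}
```

Because each flag occupies its own bit, a caller can OR DMA_PREP_LOCK or DMA_PREP_UNLOCK together with DMA_PREP_CMD without disturbing the existing flags.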

Re: [PATCH v8 01/11] dmaengine: Add DMA_PREP_LOCK/DMA_PREP_UNLOCK flags

Posted by Dmitry Baryshkov 3 months ago

On Thu, Nov 06, 2025 at 12:33:57PM +0100, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> 
> Some DMA engines may be accessed from linux and the TrustZone
> simultaneously. In order to allow synchronization, add lock and unlock
> flags for the command descriptor that allow the caller to request the
> controller to be locked for the duration of the transaction in an
> implementation-dependent way.

What is the expected behaviour if Linux "locks" the engine and then TZ
tries to use it before Linux has a chance to unlock it?

-- 
With best wishes
Dmitry

Re: [PATCH v8 01/11] dmaengine: Add DMA_PREP_LOCK/DMA_PREP_UNLOCK flags

Posted by Bartosz Golaszewski 2 months, 4 weeks ago

On Tue, Nov 11, 2025 at 1:30 PM Dmitry Baryshkov
<dmitry.baryshkov@oss.qualcomm.com> wrote:
>
> On Thu, Nov 06, 2025 at 12:33:57PM +0100, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >
> > Some DMA engines may be accessed from linux and the TrustZone
> > simultaneously. In order to allow synchronization, add lock and unlock
> > flags for the command descriptor that allow the caller to request the
> > controller to be locked for the duration of the transaction in an
> > implementation-dependent way.
>
> What is the expected behaviour if Linux "locks" the engine and then TZ
> tries to use it before Linux has a chance to unlock it.
>

Are you asking about the actual behavior on Qualcomm platforms, or are
you hinting that we should describe the behavior of the TZ in the docs
here? Ideally, TZ would use the same synchronization mechanism and not
get in Linux's way. On Qualcomm, the BAM, once "locked", will not fetch
the next descriptors on pipes other than the current one until it is
unlocked, so effectively DMA will just not complete on other pipes.
The flags here, however, are more general, so I'm not sure we should
describe any implementation-specific details.

We can say: "The DMA controller will be locked for the duration of the
current transaction and other users of the controller/TrustZone will
not see their transactions complete before it is unlocked"?

Bartosz
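
The stalling behaviour described above (a locked BAM does not fetch descriptors on pipes other than the one holding the lock) can be modelled with a small userspace sketch. Everything here is hypothetical: struct bam_model and its helpers are illustration only and do not mirror the real BAM hardware interface or driver.

```c
#include <stdbool.h>

#define BAM_MODEL_NO_OWNER (-1)

/* Hypothetical model: one controller-wide lock owned by at most one pipe. */
struct bam_model {
	int lock_owner;	/* pipe index holding the lock, or BAM_MODEL_NO_OWNER */
};

static void bam_model_init(struct bam_model *bam)
{
	bam->lock_owner = BAM_MODEL_NO_OWNER;
}

/* A pipe may fetch its next descriptor if nobody else holds the lock. */
static bool bam_model_can_fetch(const struct bam_model *bam, int pipe)
{
	return bam->lock_owner == BAM_MODEL_NO_OWNER ||
	       bam->lock_owner == pipe;
}

/* Descriptor with the lock bit: succeeds only if the pipe is not stalled. */
static bool bam_model_lock(struct bam_model *bam, int pipe)
{
	if (!bam_model_can_fetch(bam, pipe))
		return false;

	bam->lock_owner = pipe;
	return true;
}

/* Descriptor with the unlock bit: only the owning pipe can release. */
static bool bam_model_unlock(struct bam_model *bam, int pipe)
{
	if (bam->lock_owner != pipe)
		return false;

	bam->lock_owner = BAM_MODEL_NO_OWNER;
	return true;
}
```

In this model, a second pipe is stalled between the lock and unlock descriptors of the first one, which matches the "will not see their transactions complete before it is unlocked" wording proposed above.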

Re: [PATCH v8 01/11] dmaengine: Add DMA_PREP_LOCK/DMA_PREP_UNLOCK flags

Posted by Dmitry Baryshkov 2 months, 4 weeks ago

On Thu, Nov 13, 2025 at 11:02:11AM +0100, Bartosz Golaszewski wrote:
> On Tue, Nov 11, 2025 at 1:30 PM Dmitry Baryshkov
> <dmitry.baryshkov@oss.qualcomm.com> wrote:
> >
> > On Thu, Nov 06, 2025 at 12:33:57PM +0100, Bartosz Golaszewski wrote:
> > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > >
> > > Some DMA engines may be accessed from linux and the TrustZone
> > > simultaneously. In order to allow synchronization, add lock and unlock
> > > flags for the command descriptor that allow the caller to request the
> > > controller to be locked for the duration of the transaction in an
> > > implementation-dependent way.
> >
> > What is the expected behaviour if Linux "locks" the engine and then TZ
> > tries to use it before Linux has a chance to unlock it.
> >
> 
> Are you asking about the actual behavior on Qualcomm platforms or are
> you hinting that we should describe the behavior of the TZ in the docs
> here? Ideally TZ would use the same synchronization mechanism and not
> get in linux' way. On Qualcomm the BAM, once "locked" will not fetch
> the next descriptors on pipes other than the current one until
> unlocked so effectively DMA will just not complete on other pipes.
> These flags here however are more general so I'm not sure if we should
> describe any implementation-specific details.
> 
> We can say: "The DMA controller will be locked for the duration of the
> current transaction and other users of the controller/TrustZone will
> not see their transactions complete before it is unlocked"?

So, basically, we are providing a way to stall TZ's DMA transactions?
Doesn't sound good enough to me.

-- 
With best wishes
Dmitry

Re: [PATCH v8 01/11] dmaengine: Add DMA_PREP_LOCK/DMA_PREP_UNLOCK flags

Posted by Bartosz Golaszewski 2 months, 4 weeks ago

On Thu, Nov 13, 2025 at 1:28 PM Dmitry Baryshkov
<dmitry.baryshkov@oss.qualcomm.com> wrote:
>
> On Thu, Nov 13, 2025 at 11:02:11AM +0100, Bartosz Golaszewski wrote:
> > On Tue, Nov 11, 2025 at 1:30 PM Dmitry Baryshkov
> > <dmitry.baryshkov@oss.qualcomm.com> wrote:
> > >
> > > On Thu, Nov 06, 2025 at 12:33:57PM +0100, Bartosz Golaszewski wrote:
> > > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > > >
> > > > Some DMA engines may be accessed from linux and the TrustZone
> > > > simultaneously. In order to allow synchronization, add lock and unlock
> > > > flags for the command descriptor that allow the caller to request the
> > > > controller to be locked for the duration of the transaction in an
> > > > implementation-dependent way.
> > >
> > > What is the expected behaviour if Linux "locks" the engine and then TZ
> > > tries to use it before Linux has a chance to unlock it.
> > >
> >
> > Are you asking about the actual behavior on Qualcomm platforms or are
> > you hinting that we should describe the behavior of the TZ in the docs
> > here? Ideally TZ would use the same synchronization mechanism and not
> > get in linux' way. On Qualcomm the BAM, once "locked" will not fetch
> > the next descriptors on pipes other than the current one until
> > unlocked so effectively DMA will just not complete on other pipes.
> > These flags here however are more general so I'm not sure if we should
> > describe any implementation-specific details.
> >
> > We can say: "The DMA controller will be locked for the duration of the
> > current transaction and other users of the controller/TrustZone will
> > not see their transactions complete before it is unlocked"?
>
> So, basically, we are providing a way to stall TZ's DMA transactions?
> Doesn't sound good enough to me.

Can you elaborate because I'm not sure if you're opposed to the idea
itself or the explanation is not good enough?

Bartosz

Re: [PATCH v8 01/11] dmaengine: Add DMA_PREP_LOCK/DMA_PREP_UNLOCK flags

Posted by Dmitry Baryshkov 2 months, 4 weeks ago

On Thu, Nov 13, 2025 at 04:52:56PM +0100, Bartosz Golaszewski wrote:
> On Thu, Nov 13, 2025 at 1:28 PM Dmitry Baryshkov
> <dmitry.baryshkov@oss.qualcomm.com> wrote:
> >
> > On Thu, Nov 13, 2025 at 11:02:11AM +0100, Bartosz Golaszewski wrote:
> > > On Tue, Nov 11, 2025 at 1:30 PM Dmitry Baryshkov
> > > <dmitry.baryshkov@oss.qualcomm.com> wrote:
> > > >
> > > > On Thu, Nov 06, 2025 at 12:33:57PM +0100, Bartosz Golaszewski wrote:
> > > > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > > > >
> > > > > Some DMA engines may be accessed from linux and the TrustZone
> > > > > simultaneously. In order to allow synchronization, add lock and unlock
> > > > > flags for the command descriptor that allow the caller to request the
> > > > > controller to be locked for the duration of the transaction in an
> > > > > implementation-dependent way.
> > > >
> > > > What is the expected behaviour if Linux "locks" the engine and then TZ
> > > > tries to use it before Linux has a chance to unlock it.
> > > >
> > >
> > > Are you asking about the actual behavior on Qualcomm platforms or are
> > > you hinting that we should describe the behavior of the TZ in the docs
> > > here? Ideally TZ would use the same synchronization mechanism and not
> > > get in linux' way. On Qualcomm the BAM, once "locked" will not fetch
> > > the next descriptors on pipes other than the current one until
> > > unlocked so effectively DMA will just not complete on other pipes.
> > > These flags here however are more general so I'm not sure if we should
> > > describe any implementation-specific details.
> > >
> > > We can say: "The DMA controller will be locked for the duration of the
> > > current transaction and other users of the controller/TrustZone will
> > > not see their transactions complete before it is unlocked"?
> >
> > So, basically, we are providing a way to stall TZ's DMA transactions?
> > Doesn't sound good enough to me.
> 
> Can you elaborate because I'm not sure if you're opposed to the idea
> itself or the explanation is not good enough?

I find it a bit strange that the NS-OS (Linux) can cause side effects in
the TZ. Please correct me if I'm wrong, but I assumed that TZ should be
able to function even when Linux is misbehaving.

-- 
With best wishes
Dmitry

Re: [PATCH v8 01/11] dmaengine: Add DMA_PREP_LOCK/DMA_PREP_UNLOCK flags

Posted by Bartosz Golaszewski 2 months, 2 weeks ago

On Thu, Nov 13, 2025 at 9:12 PM Dmitry Baryshkov
<dmitry.baryshkov@oss.qualcomm.com> wrote:
>
> On Thu, Nov 13, 2025 at 04:52:56PM +0100, Bartosz Golaszewski wrote:
> > On Thu, Nov 13, 2025 at 1:28 PM Dmitry Baryshkov
> > <dmitry.baryshkov@oss.qualcomm.com> wrote:
> > >
> > > On Thu, Nov 13, 2025 at 11:02:11AM +0100, Bartosz Golaszewski wrote:
> > > > On Tue, Nov 11, 2025 at 1:30 PM Dmitry Baryshkov
> > > > <dmitry.baryshkov@oss.qualcomm.com> wrote:
> > > > >
> > > > > On Thu, Nov 06, 2025 at 12:33:57PM +0100, Bartosz Golaszewski wrote:
> > > > > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > > > > >
> > > > > > Some DMA engines may be accessed from linux and the TrustZone
> > > > > > simultaneously. In order to allow synchronization, add lock and unlock
> > > > > > flags for the command descriptor that allow the caller to request the
> > > > > > controller to be locked for the duration of the transaction in an
> > > > > > implementation-dependent way.
> > > > >
> > > > > What is the expected behaviour if Linux "locks" the engine and then TZ
> > > > > tries to use it before Linux has a chance to unlock it.
> > > > >
> > > >
> > > > Are you asking about the actual behavior on Qualcomm platforms or are
> > > > you hinting that we should describe the behavior of the TZ in the docs
> > > > here? Ideally TZ would use the same synchronization mechanism and not
> > > > get in linux' way. On Qualcomm the BAM, once "locked" will not fetch
> > > > the next descriptors on pipes other than the current one until
> > > > unlocked so effectively DMA will just not complete on other pipes.
> > > > These flags here however are more general so I'm not sure if we should
> > > > describe any implementation-specific details.
> > > >
> > > > We can say: "The DMA controller will be locked for the duration of the
> > > > current transaction and other users of the controller/TrustZone will
> > > > not see their transactions complete before it is unlocked"?
> > >
> > > So, basically, we are providing a way to stall TZ's DMA transactions?
> > > Doesn't sound good enough to me.
> >
> > Can you elaborate because I'm not sure if you're opposed to the idea
> > itself or the explanation is not good enough?
>
> I find it a bit strange that the NS-OS (Linux) can cause side-effects to
> the TZ. Please correct me if I'm wrong, but I assumed that TZ should be
> able to function even when LInux is misbehaving.
>

Ok, so the consensus after talking to Qualcomm crypto engineers - and
I understand this is Qualcomm-specific but it should apply to any
similar use-cases - is this:

If the TZ uses BAM locking, locks the BAM, and Linux then tries to
write to the registers protected by this lock, we'll get an external
abort. Making Linux use the lock too addresses that potential problem.

Linux could potentially lock and never unlock the BAM, but TZ could
also just reset it. Also: Linux could just as well turn the entire
device off. :)

For the Qualcomm use-case this is not an issue - it's about making TZ
and Linux work together. I suppose the same would apply to any other
users.

If that could be contained within the crypto driver, there would be no
issue. It's just that in order to pass this bit to the DMA controller,
we need a generic flag. If you have better suggestions, please let me
know.

The flag has to be passed to the BAM driver when
dmaengine_prep_slave_sg() is called, and attrs seems to be the only
way to do that with the current interface. Off the top of my head: we
could extend struct scatterlist to allow passing some arbitrary driver
data, but that doesn't sound like a good approach.

Bart
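
To illustrate the point about passing the flag at prep time, here is a userspace sketch of a client bracketing a batch of command descriptors with lock and unlock. It is an assumption about one possible ordering, not a description of what this series does. mock_prep() is a hypothetical stand-in for dmaengine_prep_slave_sg(), which in the kernel also takes the channel, scatterlist and direction; only the flags are modelled here.

```c
#include <stddef.h>

/* Flag values as defined in patch 01/11 of this series. */
enum dma_ctrl_flags {
	DMA_PREP_CMD = (1 << 7),
	DMA_PREP_LOCK = (1 << 10),
	DMA_PREP_UNLOCK = (1 << 11),
};

#define MOCK_QUEUE_LEN 8

/* Hypothetical channel: records the flags of each prepared descriptor. */
struct mock_chan {
	unsigned long flags[MOCK_QUEUE_LEN];
	size_t count;
};

/* Hypothetical stand-in for the prep call; returns 0 on success. */
static int mock_prep(struct mock_chan *chan, unsigned long flags)
{
	if (chan->count >= MOCK_QUEUE_LEN)
		return -1;

	chan->flags[chan->count++] = flags;
	return 0;
}

/* Queue: lock descriptor, nr_cmds plain command descriptors, unlock. */
static int mock_queue_locked_cmds(struct mock_chan *chan, size_t nr_cmds)
{
	size_t i;

	if (mock_prep(chan, DMA_PREP_CMD | DMA_PREP_LOCK))
		return -1;

	for (i = 0; i < nr_cmds; i++) {
		if (mock_prep(chan, DMA_PREP_CMD))
			return -1;
	}

	return mock_prep(chan, DMA_PREP_CMD | DMA_PREP_UNLOCK);
}
```

The sketch shows why the information has to travel with each prep call: the provider only learns which descriptor opens or closes the locked section from the flags it receives for that descriptor.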

Re: [PATCH v8 01/11] dmaengine: Add DMA_PREP_LOCK/DMA_PREP_UNLOCK flags

Posted by Dmitry Baryshkov 2 months, 2 weeks ago

On Fri, Nov 21, 2025 at 03:35:50PM +0100, Bartosz Golaszewski wrote:
> On Thu, Nov 13, 2025 at 9:12 PM Dmitry Baryshkov
> <dmitry.baryshkov@oss.qualcomm.com> wrote:
> >
> > On Thu, Nov 13, 2025 at 04:52:56PM +0100, Bartosz Golaszewski wrote:
> > > On Thu, Nov 13, 2025 at 1:28 PM Dmitry Baryshkov
> > > <dmitry.baryshkov@oss.qualcomm.com> wrote:
> > > >
> > > > On Thu, Nov 13, 2025 at 11:02:11AM +0100, Bartosz Golaszewski wrote:
> > > > > On Tue, Nov 11, 2025 at 1:30 PM Dmitry Baryshkov
> > > > > <dmitry.baryshkov@oss.qualcomm.com> wrote:
> > > > > >
> > > > > > On Thu, Nov 06, 2025 at 12:33:57PM +0100, Bartosz Golaszewski wrote:
> > > > > > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > > > > > >
> > > > > > > Some DMA engines may be accessed from linux and the TrustZone
> > > > > > > simultaneously. In order to allow synchronization, add lock and unlock
> > > > > > > flags for the command descriptor that allow the caller to request the
> > > > > > > controller to be locked for the duration of the transaction in an
> > > > > > > implementation-dependent way.
> > > > > >
> > > > > > What is the expected behaviour if Linux "locks" the engine and then TZ
> > > > > > tries to use it before Linux has a chance to unlock it.
> > > > > >
> > > > >
> > > > > Are you asking about the actual behavior on Qualcomm platforms or are
> > > > > you hinting that we should describe the behavior of the TZ in the docs
> > > > > here? Ideally TZ would use the same synchronization mechanism and not
> > > > > get in linux' way. On Qualcomm the BAM, once "locked" will not fetch
> > > > > the next descriptors on pipes other than the current one until
> > > > > unlocked so effectively DMA will just not complete on other pipes.
> > > > > These flags here however are more general so I'm not sure if we should
> > > > > describe any implementation-specific details.
> > > > >
> > > > > We can say: "The DMA controller will be locked for the duration of the
> > > > > current transaction and other users of the controller/TrustZone will
> > > > > not see their transactions complete before it is unlocked"?
> > > >
> > > > So, basically, we are providing a way to stall TZ's DMA transactions?
> > > > Doesn't sound good enough to me.
> > >
> > > Can you elaborate because I'm not sure if you're opposed to the idea
> > > itself or the explanation is not good enough?
> >
> > I find it a bit strange that the NS-OS (Linux) can cause side-effects to
> > the TZ. Please correct me if I'm wrong, but I assumed that TZ should be
> > able to function even when LInux is misbehaving.
> >
> 
> Ok, so the consensus after talking to Qualcomm crypto engineers - and
> I understand this is Qualcomm-specific but it should apply to any
> similar use-cases - is this:
> 
> If the TZ uses BAM locking and it locks the BAM and linux tries to
> write to the registers protected by this lock, we'll get an external
> abort. Making linux use it too addresses that potential problem.
> 
> Linux could potentially lock and never unlock the BAM but TZ could
> also just reset it. Also: linux could as well turn the entire device
> off. :)
> 
> For the Qualcomm use-case this is not an issue - it's about making TZ
> and linux work together. I suppose the same would apply to any other
> users.

Ack, thank you.

> 
> If that could be contained within the crypto driver, there would be no
> issue. It's just that in order to pass this bit to the DMA controller,
> we need a generic flag. If you have better suggestions, please let me
> know.
> 
> The flag has to be passed to the BAM driver at the time of calling of
> dmaengine_prep_slave_sg() and attrs seems to be the only way with the
> current interface. Off the top of my head: we could extend struct
> scatterlist to allow passing some arbitrary driver data but that
> doesn't sound like a good approach.

Can we use DMA metadata to pass the lock / unlock flags instead? I
might be missing something, but the LOCK / UNLOCK ops defined in this
patchset seem to be too use-case-specific. Using metadata seems to
allow for this kind of driver-specific side channel.

-- 
With best wishes
Dmitry

Re: [PATCH v8 01/11] dmaengine: Add DMA_PREP_LOCK/DMA_PREP_UNLOCK flags

Posted by Bartosz Golaszewski 2 months, 2 weeks ago

On Fri, Nov 21, 2025 at 5:36 PM Dmitry Baryshkov
<dmitry.baryshkov@oss.qualcomm.com> wrote:
>
> >
> > The flag has to be passed to the BAM driver at the time of calling of
> > dmaengine_prep_slave_sg() and attrs seems to be the only way with the
> > current interface. Off the top of my head: we could extend struct
> > scatterlist to allow passing some arbitrary driver data but that
> > doesn't sound like a good approach.
>
> Can we use DMA metadata in order to pass the lock / unlock flags
> instead? I might be missing something, but the LOCK / UNLOCK ops defined
> in this patchset seem to be too usecase-specific. Using metadata seems
> to allow for this kind of driver-specific sidechannel.
>

I'll look into it, thanks.

Bart

Re: [PATCH v8 01/11] dmaengine: Add DMA_PREP_LOCK/DMA_PREP_UNLOCK flags

Posted by Randy Dunlap 3 months ago


On 11/6/25 3:33 AM, Bartosz Golaszewski wrote:
>  Documentation/driver-api/dmaengine/provider.rst | 9 +++++++++
>  include/linux/dmaengine.h                       | 6 ++++++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/Documentation/driver-api/dmaengine/provider.rst b/Documentation/driver-api/dmaengine/provider.rst
> index 1594598b331782e4dddcf992159c724111db9cf3..6428211405472dd1147e363f5786acc91d95ed43 100644
> --- a/Documentation/driver-api/dmaengine/provider.rst
> +++ b/Documentation/driver-api/dmaengine/provider.rst
> @@ -630,6 +630,15 @@ DMA_CTRL_REUSE
>    - This flag is only supported if the channel reports the DMA_LOAD_EOT
>      capability.
>  
> +- DMA_PREP_LOCK
> +
> +  - If set, the DMA controller will be locked for the duration of the current
> +    transaction.
> +
> +- DMA_PREP_UNLOCK
> +
> +  - If set, DMA will release he controller lock.

                                the

> +
>  General Design Notes

-- 
~Randy