[PATCH] crypto: caam - increase the domain of write memory barrier to full system

Posted by meenakshi.aggarwal@nxp.com 2 years, 1 month ago
From: Iuliana Prodan <iuliana.prodan@nxp.com>

In caam_jr_enqueue(), under heavy DDR load, neither smp_wmb() nor
dma_wmb() ensures that the input ring is updated before the CAAM starts
reading it. As a result, the CAAM processes an old descriptor address a
second time and places it in the output ring. caam_jr_dequeue() then
fails, since this old descriptor is not in the software ring.
To fix this, use wmb(), which acts on the full system rather than only
the inner/outer shareable domains.

Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Signed-off-by: Meenakshi Aggarwal <meenakshi.aggarwal@nxp.com>
---
 drivers/crypto/caam/jr.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
index 767fbf052536..5507d5d34a4c 100644
--- a/drivers/crypto/caam/jr.c
+++ b/drivers/crypto/caam/jr.c
@@ -464,8 +464,16 @@ int caam_jr_enqueue(struct device *dev, u32 *desc,
 	 * Guarantee that the descriptor's DMA address has been written to
 	 * the next slot in the ring before the write index is updated, since
 	 * other cores may update this index independently.
+	 *
+	 * Under heavy DDR load, smp_wmb() or dma_wmb() fail to make the input
+	 * ring be updated before the CAAM starts reading it. So, CAAM will
+	 * process, again, an old descriptor address and will put it in the
+	 * output ring. This will make caam_jr_dequeue() to fail, since this
+	 * old descriptor is not in the software ring.
+	 * To fix this, use wmb() which works on the full system instead of
+	 * inner/outer shareable domains.
 	 */
-	smp_wmb();
+	wmb();
 
 	jrp->head = (head + 1) & (JOBR_DEPTH - 1);
 
-- 
2.25.1
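
The ordering requirement described in the comment above (the ring slot must be written before the write index is published) can be sketched in userspace C11. This is only a model under stated assumptions, not the driver code: `struct job_ring`, `ring_enqueue()`, `ring_dequeue()` and `RING_DEPTH` are hypothetical names, the consumer is another CPU thread rather than the CAAM, and `atomic_thread_fence(memory_order_release)` stands in for the kernel write barrier. A C11 fence only orders accesses between CPU threads, so it cannot reproduce the device-visibility problem the patch addresses. The model also uses free-running indices, whereas the driver masks `head` with `JOBR_DEPTH - 1`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_DEPTH 8u /* hypothetical depth; must be a power of two */
static_assert((RING_DEPTH & (RING_DEPTH - 1u)) == 0,
	      "RING_DEPTH must be a power of two");

struct job_ring {
	uint64_t slot[RING_DEPTH]; /* descriptor addresses read by the consumer */
	atomic_uint head;          /* free-running count of slots filled */
	atomic_uint tail;          /* free-running count of slots consumed */
};

/* Producer side. Returns 0 on success, -1 if the ring is full. */
int ring_enqueue(struct job_ring *r, uint64_t desc_addr)
{
	unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail >= RING_DEPTH)
		return -1;

	/* 1. Write the descriptor address into the next slot. */
	r->slot[head & (RING_DEPTH - 1u)] = desc_addr;

	/*
	 * 2. Order the slot write before the index update. This is the
	 *    position the write barrier occupies in the patch.
	 */
	atomic_thread_fence(memory_order_release);

	/* 3. Publish the new write index. */
	atomic_store_explicit(&r->head, head + 1u, memory_order_relaxed);
	return 0;
}

/* Consumer side. Returns 0 on success, -1 if the ring is empty. */
int ring_dequeue(struct job_ring *r, uint64_t *desc_addr)
{
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (head == tail)
		return -1;

	/* The acquire load of head pairs with the producer's release fence. */
	*desc_addr = r->slot[tail & (RING_DEPTH - 1u)];
	atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
	return 0;
}
```

If step 2 is missing or too weak for the observer in question, the consumer can see the new index while the slot still holds the previous entry, which matches the symptom described in the commit message: an old descriptor address is processed again.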
Re: [PATCH] crypto: caam - increase the domain of write memory barrier to full system
Posted by Herbert Xu 2 years ago
On Tue, Aug 08, 2023 at 12:55:26PM +0200, meenakshi.aggarwal@nxp.com wrote:
> From: Iuliana Prodan <iuliana.prodan@nxp.com>
> 
> In caam_jr_enqueue(), under heavy DDR load, neither smp_wmb() nor
> dma_wmb() ensures that the input ring is updated before the CAAM starts
> reading it. As a result, the CAAM processes an old descriptor address a
> second time and places it in the output ring. caam_jr_dequeue() then
> fails, since this old descriptor is not in the software ring.
> To fix this, use wmb(), which acts on the full system rather than only
> the inner/outer shareable domains.
> 
> Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
> Signed-off-by: Meenakshi Aggarwal <meenakshi.aggarwal@nxp.com>
> ---
>  drivers/crypto/caam/jr.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: caam - increase the domain of write memory barrier to full system
Posted by Herbert Xu 2 years ago
On Tue, Aug 08, 2023 at 12:55:26PM +0200, meenakshi.aggarwal@nxp.com wrote:
> From: Iuliana Prodan <iuliana.prodan@nxp.com>
> 
> In caam_jr_enqueue(), under heavy DDR load, neither smp_wmb() nor
> dma_wmb() ensures that the input ring is updated before the CAAM starts
> reading it. As a result, the CAAM processes an old descriptor address a
> second time and places it in the output ring. caam_jr_dequeue() then
> fails, since this old descriptor is not in the software ring.
> To fix this, use wmb(), which acts on the full system rather than only
> the inner/outer shareable domains.
> 
> Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
> Signed-off-by: Meenakshi Aggarwal <meenakshi.aggarwal@nxp.com>
> ---
>  drivers/crypto/caam/jr.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)

Indeed, smp_wmb is always wrong for barriers separating DMA writes.

I wonder if these should be changed too:

$ git grep smp_wmb drivers/crypto/
drivers/crypto/caam/jr.c:       smp_wmb();
drivers/crypto/cavium/cpt/cptvf_reqmanager.c:   smp_wmb();
drivers/crypto/hisilicon/qm.c:  smp_wmb();
drivers/crypto/marvell/octeontx/otx_cptvf_reqmgr.c:     smp_wmb();
drivers/crypto/marvell/octeontx2/otx2_cptpf_mbox.c:             smp_wmb();
drivers/crypto/marvell/octeontx2/otx2_cptpf_mbox.c:     smp_wmb();
drivers/crypto/marvell/octeontx2/otx2_cptpf_mbox.c:             smp_wmb();
drivers/crypto/marvell/octeontx2/otx2_cptpf_mbox.c:     smp_wmb();
drivers/crypto/talitos.c:       smp_wmb();
drivers/crypto/talitos.c:               smp_wmb();
$ 

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
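
For readers unfamiliar with the "inner/outer shareable domains" wording, the lookup table below sketches how the three write barriers discussed in this thread map to instructions on arm64. The mapping reflects my understanding of arch/arm64/include/asm/barrier.h and should be checked against the kernel tree in use; other architectures differ. The names `struct barrier_info`, `arm64_write_barriers` and `arm64_barrier_insn()` are hypothetical and exist only for this illustration. Separately, on kernels built without CONFIG_SMP, smp_wmb() reduces to a compiler barrier, which is one reason it is unsuitable for ordering writes that a device observes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record describing one kernel write barrier on arm64. */
struct barrier_info {
	const char *macro; /* kernel barrier macro */
	const char *insn;  /* arm64 instruction it is assumed to expand to */
	const char *scope; /* which observers the ordering applies to */
};

/* Assumed mapping; verify against arch/arm64/include/asm/barrier.h. */
static const struct barrier_info arm64_write_barriers[] = {
	{ "smp_wmb", "dmb ishst", "stores, inner shareable domain (other CPUs)" },
	{ "dma_wmb", "dmb oshst", "stores, outer shareable domain" },
	{ "wmb",     "dsb st",    "stores, full system, waits for completion" },
};

static_assert(sizeof(arm64_write_barriers) / sizeof(arm64_write_barriers[0]) == 3,
	      "table is expected to list exactly three barriers");

/* Returns the instruction for a barrier macro, or NULL if it is not listed. */
const char *arm64_barrier_insn(const char *macro)
{
	size_t n = sizeof(arm64_write_barriers) / sizeof(arm64_write_barriers[0]);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(arm64_write_barriers[i].macro, macro) == 0)
			return arm64_write_barriers[i].insn;
	}
	return NULL;
}
```

Read top to bottom, the table widens the set of observers covered by the barrier, and the last entry is the one the patch switches to.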
RE: [PATCH] crypto: caam - increase the domain of write memory barrier to full system
Posted by Gaurav Jain 2 years ago
Reviewed-by: Gaurav Jain <gaurav.jain@nxp.com>

> -----Original Message-----
> From: Meenakshi Aggarwal <meenakshi.aggarwal@nxp.com>
> Sent: Tuesday, August 8, 2023 4:25 PM
> To: Horia Geanta <horia.geanta@nxp.com>; Varun Sethi <V.Sethi@nxp.com>;
> Pankaj Gupta <pankaj.gupta@nxp.com>; Gaurav Jain <gaurav.jain@nxp.com>;
> herbert@gondor.apana.org.au; davem@davemloft.net;
> linux-crypto@vger.kernel.org; linux-kernel@vger.kernel.org
> Cc: Iuliana Prodan <iuliana.prodan@nxp.com>; Meenakshi Aggarwal
> <meenakshi.aggarwal@nxp.com>
> Subject: [PATCH] crypto: caam - increase the domain of write memory barrier to
> full system
> 
> From: Iuliana Prodan <iuliana.prodan@nxp.com>
> 
> In caam_jr_enqueue(), under heavy DDR load, neither smp_wmb() nor
> dma_wmb() ensures that the input ring is updated before the CAAM starts
> reading it. As a result, the CAAM processes an old descriptor address a
> second time and places it in the output ring. caam_jr_dequeue() then
> fails, since this old descriptor is not in the software ring.
> To fix this, use wmb(), which acts on the full system rather than only
> the inner/outer shareable domains.
> 
> Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
> Signed-off-by: Meenakshi Aggarwal <meenakshi.aggarwal@nxp.com>
> ---
>  drivers/crypto/caam/jr.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
> index 767fbf052536..5507d5d34a4c 100644
> --- a/drivers/crypto/caam/jr.c
> +++ b/drivers/crypto/caam/jr.c
> @@ -464,8 +464,16 @@ int caam_jr_enqueue(struct device *dev, u32 *desc,
>  	 * Guarantee that the descriptor's DMA address has been written to
>  	 * the next slot in the ring before the write index is updated, since
>  	 * other cores may update this index independently.
> +	 *
> +	 * Under heavy DDR load, smp_wmb() or dma_wmb() fail to make the input
> +	 * ring be updated before the CAAM starts reading it. So, CAAM will
> +	 * process, again, an old descriptor address and will put it in the
> +	 * output ring. This will make caam_jr_dequeue() to fail, since this
> +	 * old descriptor is not in the software ring.
> +	 * To fix this, use wmb() which works on the full system instead of
> +	 * inner/outer shareable domains.
>  	 */
> -	smp_wmb();
> +	wmb();
> 
>  	jrp->head = (head + 1) & (JOBR_DEPTH - 1);
> 
> --
> 2.25.1