[RFC PATCH] RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats()

Posted by Ding Hui 2 months ago
Recently we encountered an OOB write issue on a BCM957414A4142CC with the
out-of-box NetXtreme-E-235.1.160.0 driver from Broadcom. After a little
research, we found that the in-box upstream driver may have the same issue.

Commit ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters
update") introduced three counters and appended them after
BNXT_RE_OUT_OF_SEQ_ERR.

However, BNXT_RE_OUT_OF_SEQ_ERR serves as the boundary marker that decides
how many counters are allocated: hardware other than chip_gen_p5_p7 gets
only BNXT_RE_NUM_STD_COUNTERS entries in hw_stats.

On such hardware the three new counters index past the end of the
BNXT_RE_NUM_STD_COUNTERS-sized hw_stats, leading to an out-of-bounds write
in bnxt_re_copy_err_stats().
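
To make the failure mode concrete, here is a minimal standalone sketch of
the pre-fix layout. It is not the driver code: the enum is shortened, every
DEMO_* name and both helper functions are hypothetical, and the definition
of the standard count as "boundary marker plus one" is an assumption made
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, shortened stand-in for enum bnxt_re_hw_stats before the fix. */
enum demo_hw_stats_old {
	DEMO_RES_RX_PCI_ERR,		/* last generic counter */
	DEMO_OUT_OF_SEQ_ERR,		/* boundary marker */
	DEMO_TX_ATOMIC_REQ,		/* extended counters start here */
	DEMO_RX_ECN,
	DEMO_REQ_CQE_ERROR,		/* appended after the boundary */
	DEMO_RESP_CQE_ERROR,
	DEMO_RESP_REMOTE_ACCESS_ERRS,
	DEMO_NUM_EXT_COUNTERS
};

/* Assumption: the standard set ends at the boundary marker, inclusive. */
#define DEMO_NUM_STD_COUNTERS	(DEMO_OUT_OF_SEQ_ERR + 1)

/* Number of slots allocated for a given hardware generation. */
static size_t demo_num_counters(bool is_p5_p7)
{
	return is_p5_p7 ? DEMO_NUM_EXT_COUNTERS : DEMO_NUM_STD_COUNTERS;
}

/* True if writing value[idx] stays inside the allocated array. */
static bool demo_index_in_bounds(size_t idx, bool is_p5_p7)
{
	return idx < demo_num_counters(is_p5_p7);
}
```

With this layout, demo_index_in_bounds(DEMO_REQ_CQE_ERROR, false) is false,
which corresponds to the unconditional write landing outside the smaller
allocation.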

It seems that BNXT_RE_REQ_CQE_ERROR, BNXT_RE_RESP_CQE_ERROR, and
BNXT_RE_RESP_REMOTE_ACCESS_ERRS can be updated for generic hardware,
not only for p5/p7 hardware.

Fix this by moving them before BNXT_RE_OUT_OF_SEQ_ERR so they become
part of the generic counters.
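
The effect of the reordering can be checked with a small standalone sketch.
Again this is only an illustration under assumptions: the enum is shortened,
all FIXED_* names and fixed_copy_err_stats() are hypothetical, and the
standard count is assumed to be the boundary marker plus one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, shortened stand-in for the enum after the fix. */
enum demo_hw_stats_fixed {
	FIXED_RES_RX_PCI_ERR,
	FIXED_REQ_CQE_ERROR,		/* moved before the boundary */
	FIXED_RESP_CQE_ERROR,
	FIXED_RESP_REMOTE_ACCESS_ERRS,
	FIXED_OUT_OF_SEQ_ERR,		/* boundary marker */
	FIXED_TX_ATOMIC_REQ,
	FIXED_RX_ECN,
	FIXED_NUM_EXT_COUNTERS
};

/* Assumption: the standard set ends at the boundary marker, inclusive. */
#define FIXED_NUM_STD_COUNTERS	(FIXED_OUT_OF_SEQ_ERR + 1)

/* Compile-time check that the moved counters fit the smaller allocation. */
_Static_assert(FIXED_REQ_CQE_ERROR < FIXED_NUM_STD_COUNTERS,
	       "REQ_CQE_ERROR must be a standard counter");
_Static_assert(FIXED_RESP_CQE_ERROR < FIXED_NUM_STD_COUNTERS,
	       "RESP_CQE_ERROR must be a standard counter");
_Static_assert(FIXED_RESP_REMOTE_ACCESS_ERRS < FIXED_NUM_STD_COUNTERS,
	       "RESP_REMOTE_ACCESS_ERRS must be a standard counter");

/* Unconditional writes, now safe for an array with only the standard slots. */
static void fixed_copy_err_stats(unsigned long long *value)
{
	value[FIXED_REQ_CQE_ERROR] = 1;
	value[FIXED_RESP_CQE_ERROR] = 2;
	value[FIXED_RESP_REMOTE_ACCESS_ERRS] = 3;
}
```

A similar compile-time assertion next to the real enum could catch a future
counter being appended on the wrong side of the boundary, though that is a
suggestion and not part of this patch.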

Compile tested only.

Fixes: ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters update")
Reported-by: Yingying Zheng <zhengyingying@sangfor.com.cn>
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
---
 drivers/infiniband/hw/bnxt_re/hw_counters.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/hw_counters.h b/drivers/infiniband/hw/bnxt_re/hw_counters.h
index 09d371d442aa..cebec033f4a0 100644
--- a/drivers/infiniband/hw/bnxt_re/hw_counters.h
+++ b/drivers/infiniband/hw/bnxt_re/hw_counters.h
@@ -89,6 +89,9 @@ enum bnxt_re_hw_stats {
 	BNXT_RE_RES_SRQ_LOAD_ERR,
 	BNXT_RE_RES_TX_PCI_ERR,
 	BNXT_RE_RES_RX_PCI_ERR,
+	BNXT_RE_REQ_CQE_ERROR,
+	BNXT_RE_RESP_CQE_ERROR,
+	BNXT_RE_RESP_REMOTE_ACCESS_ERRS,
 	BNXT_RE_OUT_OF_SEQ_ERR,
 	BNXT_RE_TX_ATOMIC_REQ,
 	BNXT_RE_TX_READ_REQ,
@@ -110,9 +113,6 @@ enum bnxt_re_hw_stats {
 	BNXT_RE_TX_CNP,
 	BNXT_RE_RX_CNP,
 	BNXT_RE_RX_ECN,
-	BNXT_RE_REQ_CQE_ERROR,
-	BNXT_RE_RESP_CQE_ERROR,
-	BNXT_RE_RESP_REMOTE_ACCESS_ERRS,
 	BNXT_RE_NUM_EXT_COUNTERS
 };
 
-- 
2.17.1
Re: [RFC PATCH] RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats()
Posted by Leon Romanovsky 1 month, 2 weeks ago
On Mon, 08 Dec 2025 15:21:10 +0800, Ding Hui wrote:
> Recently we encountered an OOB write issue on a BCM957414A4142CC with the
> out-of-box NetXtreme-E-235.1.160.0 driver from Broadcom. After a little
> research, we found that the in-box upstream driver may have the same issue.
> 
> Commit ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters
> update") introduced three counters and appended them after
> BNXT_RE_OUT_OF_SEQ_ERR.
> 
> [...]

Applied, thanks!

[1/1] RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats()
      https://git.kernel.org/rdma/rdma/c/9b68a1cc966bc9

Best regards,
-- 
Leon Romanovsky <leon@kernel.org>
Re: [RFC PATCH] RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats()
Posted by Kalesh Anakkur Purayil 1 month, 2 weeks ago
On Mon, Dec 8, 2025 at 12:52 PM Ding Hui <dinghui@sangfor.com.cn> wrote:
> [...]

Thank you Ding, the fix looks good to me and I have verified it locally.

Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Tested-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

-- 
Regards,
Kalesh AP
Re: [RFC PATCH] RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats()
Posted by Ding Hui 1 month, 2 weeks ago
On 2025/12/21 23:47, Kalesh Anakkur Purayil wrote:
> On Mon, Dec 8, 2025 at 12:52 PM Ding Hui <dinghui@sangfor.com.cn> wrote:
>> [...]
> 
> Thank you Ding, the fix looks good to me and I have verified it locally.
> 

Thanks for confirming.

Do I need to resend the patch without the RFC prefix and with an updated
commit log, for example dropping the first paragraph about the out-of-box
driver?

-- 
Thanks,
- Ding Hui

Re: [RFC PATCH] RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats()
Posted by Leon Romanovsky 1 month, 2 weeks ago
On Mon, Dec 22, 2025 at 02:33:59PM +0800, Ding Hui wrote:
> On 2025/12/21 23:47, Kalesh Anakkur Purayil wrote:
> > On Mon, Dec 8, 2025 at 12:52 PM Ding Hui <dinghui@sangfor.com.cn> wrote:
> > > [...]
> > 
> > Thank you Ding, the fix looks good to me and I have verified it locally.
> > 
> 
> Thanks for confirming.
> 
> Do I need to resend the patch without the RFC prefix and with an updated
> commit log, for example dropping the first paragraph about the out-of-box
> driver?

No, there is no need. I'll fix it locally.

Thanks

Re: [RFC PATCH] RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats()
Posted by Ding Hui 1 month, 3 weeks ago
Friendly ping.

On 2025/12/8 15:21, Ding Hui wrote:
> [...]

-- 
Thanks,
- Ding Hui
Re: [RFC PATCH] RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats()
Posted by Leon Romanovsky 1 month, 2 weeks ago
On Thu, Dec 18, 2025 at 10:16:02AM +0800, Ding Hui wrote:
> Friendly ping.

I'm waiting for some sort of response from Broadcom people.

Thanks

Re: [RFC PATCH] RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats()
Posted by Kalesh Anakkur Purayil 1 month, 2 weeks ago
Hi Ding/Leon,

We will validate the changes on BCM957414A4142CC and confirm.

On Sun, Dec 21, 2025 at 1:31 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Thu, Dec 18, 2025 at 10:16:02AM +0800, Ding Hui wrote:
> > Friendly ping.
>
> I'm waiting for some sort of response from Broadcom people.
>
> Thanks

-- 
Regards,
Kalesh AP