Recently we encountered an OOB write issue on BCM957414A4142CC with the
outbox NetXtreme-E-235.1.160.0 driver from Broadcom. After a little
research, we found that the upstream inbox driver may have the same issue.

Commit ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters
update") introduced 3 counters and appended them after
BNXT_RE_OUT_OF_SEQ_ERR.

However, BNXT_RE_OUT_OF_SEQ_ERR serves as a boundary marker for allocating
hw stats with different num_counters for chip_gen_p5_p7 hardware.

For hw_stats allocated with only BNXT_RE_NUM_STD_COUNTERS entries, the new
counters index past the end of the array, leading to an out-of-bounds
write in bnxt_re_copy_err_stats().

It seems that BNXT_RE_REQ_CQE_ERROR, BNXT_RE_RESP_CQE_ERROR, and
BNXT_RE_RESP_REMOTE_ACCESS_ERRS can be updated for generic hardware,
not only for p5/p7 hardware.

Fix this by moving them before BNXT_RE_OUT_OF_SEQ_ERR so they become
part of the generic counters.

Compile tested only.

Fixes: ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters update")
Reported-by: Yingying Zheng <zhengyingying@sangfor.com.cn>
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
---
drivers/infiniband/hw/bnxt_re/hw_counters.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/hw_counters.h b/drivers/infiniband/hw/bnxt_re/hw_counters.h
index 09d371d442aa..cebec033f4a0 100644
--- a/drivers/infiniband/hw/bnxt_re/hw_counters.h
+++ b/drivers/infiniband/hw/bnxt_re/hw_counters.h
@@ -89,6 +89,9 @@ enum bnxt_re_hw_stats {
 	BNXT_RE_RES_SRQ_LOAD_ERR,
 	BNXT_RE_RES_TX_PCI_ERR,
 	BNXT_RE_RES_RX_PCI_ERR,
+	BNXT_RE_REQ_CQE_ERROR,
+	BNXT_RE_RESP_CQE_ERROR,
+	BNXT_RE_RESP_REMOTE_ACCESS_ERRS,
 	BNXT_RE_OUT_OF_SEQ_ERR,
 	BNXT_RE_TX_ATOMIC_REQ,
 	BNXT_RE_TX_READ_REQ,
@@ -110,9 +113,6 @@ enum bnxt_re_hw_stats {
 	BNXT_RE_TX_CNP,
 	BNXT_RE_RX_CNP,
 	BNXT_RE_RX_ECN,
-	BNXT_RE_REQ_CQE_ERROR,
-	BNXT_RE_RESP_CQE_ERROR,
-	BNXT_RE_RESP_REMOTE_ACCESS_ERRS,
 	BNXT_RE_NUM_EXT_COUNTERS
 };
--
2.17.1
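
For readers following the thread, the failure mode described in the commit message can be sketched in userspace C. This is a simplified model, not the driver code: the enums are shortened, every OLD_*/NEW_* name and the slot_in_bounds() helper are hypothetical, and defining the standard counter count as the value of the boundary enumerator is an assumption based on the "boundary marker" description above.

```c
#include <assert.h>

/* Shortened, hypothetical model of the layout BEFORE the fix. */
enum old_stats {
	OLD_RES_RX_PCI_ERR,    /* last generic counter */
	OLD_OUT_OF_SEQ_ERR,    /* boundary: first extended-only counter */
	OLD_RX_ECN,
	OLD_REQ_CQE_ERROR,     /* appended after the boundary */
	OLD_NUM_EXT_COUNTERS
};
/* Assumption: the standard allocation size equals the boundary value. */
#define OLD_NUM_STD_COUNTERS OLD_OUT_OF_SEQ_ERR

/* Same model AFTER the fix: the counter sits before the boundary. */
enum new_stats {
	NEW_RES_RX_PCI_ERR,
	NEW_REQ_CQE_ERROR,     /* now part of the generic range */
	NEW_OUT_OF_SEQ_ERR,
	NEW_RX_ECN,
	NEW_NUM_EXT_COUNTERS
};
#define NEW_NUM_STD_COUNTERS NEW_OUT_OF_SEQ_ERR

/* Returns 1 if writing slot 'index' stays inside 'num_counters' slots. */
static inline int slot_in_bounds(int index, int num_counters)
{
	return index >= 0 && index < num_counters;
}

/* With the fixed layout the check can be enforced at compile time. */
static_assert(NEW_REQ_CQE_ERROR < NEW_NUM_STD_COUNTERS,
	      "generic counters must precede the boundary marker");
```

In this model, slot_in_bounds(OLD_REQ_CQE_ERROR, OLD_NUM_STD_COUNTERS) is false, which corresponds to the out-of-bounds write on hardware that only gets the standard allocation, while the same check against NEW_NUM_STD_COUNTERS passes. A compile-time assertion of this kind is one possible guard against a later reordering; the patch itself does not add one.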
On Mon, 08 Dec 2025 15:21:10 +0800, Ding Hui wrote:
> Recently we encountered an OOB write issue on BCM957414A4142CC with outbox
> NetXtreme-E-235.1.160.0 driver from broadcom. After a litte research,
> we found the inbox driver from upstream maybe have the same issue.
>
> The commit ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters
> update") introduced 3 counters, and appended after BNXT_RE_OUT_OF_SEQ_ERR.
>
> [...]
Applied, thanks!
[1/1] RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats()
https://git.kernel.org/rdma/rdma/c/9b68a1cc966bc9
Best regards,
--
Leon Romanovsky <leon@kernel.org>
On Mon, Dec 8, 2025 at 12:52 PM Ding Hui <dinghui@sangfor.com.cn> wrote:
>
> Recently we encountered an OOB write issue on BCM957414A4142CC with outbox
> NetXtreme-E-235.1.160.0 driver from broadcom. After a litte research,
> we found the inbox driver from upstream maybe have the same issue.
>
> The commit ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters
> update") introduced 3 counters, and appended after BNXT_RE_OUT_OF_SEQ_ERR.
>
> However, BNXT_RE_OUT_OF_SEQ_ERR serves as a boundary marker for allocating
> hw stats with different num_counters for chip_gen_p5_p7 hardware.
>
> For BNXT_RE_NUM_STD_COUNTERS allocated hw_stats, leading to an
> out-of-bounds write in bnxt_re_copy_err_stats().
>
> It seems like that the BNXT_RE_REQ_CQE_ERROR, BNXT_RE_RESP_CQE_ERROR,
> and BNXT_RE_RESP_REMOTE_ACCESS_ERRS can be updated for generic hardware,
> not only for p5/p7 hardware.
>
> Fix this by moving them before BNXT_RE_OUT_OF_SEQ_ERR so they become
> part of the generic counter.
>
> Compile tested only.
>
> Fixes: ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters update")
> Reported-by: Yingying Zheng <zhengyingying@sangfor.com.cn>
> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
Thank you Ding, the fix looks good to me and I have verified it locally.
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Tested-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
--
Regards,
Kalesh AP
On 2025/12/21 23:47, Kalesh Anakkur Purayil wrote:
> On Mon, Dec 8, 2025 at 12:52 PM Ding Hui <dinghui@sangfor.com.cn> wrote:
>>
>> Recently we encountered an OOB write issue on BCM957414A4142CC with outbox
>> NetXtreme-E-235.1.160.0 driver from broadcom. After a litte research,
>> we found the inbox driver from upstream maybe have the same issue.
>>
>> The commit ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters
>> update") introduced 3 counters, and appended after BNXT_RE_OUT_OF_SEQ_ERR.
>>
>> However, BNXT_RE_OUT_OF_SEQ_ERR serves as a boundary marker for allocating
>> hw stats with different num_counters for chip_gen_p5_p7 hardware.
>>
>> For BNXT_RE_NUM_STD_COUNTERS allocated hw_stats, leading to an
>> out-of-bounds write in bnxt_re_copy_err_stats().
>>
>> It seems like that the BNXT_RE_REQ_CQE_ERROR, BNXT_RE_RESP_CQE_ERROR,
>> and BNXT_RE_RESP_REMOTE_ACCESS_ERRS can be updated for generic hardware,
>> not only for p5/p7 hardware.
>>
>> Fix this by moving them before BNXT_RE_OUT_OF_SEQ_ERR so they become
>> part of the generic counter.
>>
>> Compile tested only.
>>
>> Fixes: ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters update")
>> Reported-by: Yingying Zheng <zhengyingying@sangfor.com.cn>
>> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
>
> Thank you Ding, the fix looks good to me and I have verified it locally.
>
Thanks for confirming.
Do I need to resend the patch without the RFC prefix and update the commit
log, for example by dropping the first paragraph about the outbox driver?
> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> Tested-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
>
--
Thanks,
- Ding Hui
On Mon, Dec 22, 2025 at 02:33:59PM +0800, Ding Hui wrote:
> On 2025/12/21 23:47, Kalesh Anakkur Purayil wrote:
> > On Mon, Dec 8, 2025 at 12:52 PM Ding Hui <dinghui@sangfor.com.cn> wrote:
> > >
> > > Recently we encountered an OOB write issue on BCM957414A4142CC with outbox
> > > NetXtreme-E-235.1.160.0 driver from broadcom. After a litte research,
> > > we found the inbox driver from upstream maybe have the same issue.
> > >
> > > The commit ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters
> > > update") introduced 3 counters, and appended after BNXT_RE_OUT_OF_SEQ_ERR.
> > >
> > > However, BNXT_RE_OUT_OF_SEQ_ERR serves as a boundary marker for allocating
> > > hw stats with different num_counters for chip_gen_p5_p7 hardware.
> > >
> > > For BNXT_RE_NUM_STD_COUNTERS allocated hw_stats, leading to an
> > > out-of-bounds write in bnxt_re_copy_err_stats().
> > >
> > > It seems like that the BNXT_RE_REQ_CQE_ERROR, BNXT_RE_RESP_CQE_ERROR,
> > > and BNXT_RE_RESP_REMOTE_ACCESS_ERRS can be updated for generic hardware,
> > > not only for p5/p7 hardware.
> > >
> > > Fix this by moving them before BNXT_RE_OUT_OF_SEQ_ERR so they become
> > > part of the generic counter.
> > >
> > > Compile tested only.
> > >
> > > Fixes: ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters update")
> > > Reported-by: Yingying Zheng <zhengyingying@sangfor.com.cn>
> > > Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
> >
> > Thank you Ding, the fix looks good to me and I have verified it locally.
> >
>
> Thanks for confirming.
>
> Do I need to resend the patch without RFC prefix and update some commit log,
> such as getting rid of the first paragraph about the outbox driver?
No, there is no need. I'll fix it locally.
Thanks
>
> > Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> > Tested-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> >
>
> --
> Thanks,
> - Ding Hui
>
Friendly ping.
On 2025/12/8 15:21, Ding Hui wrote:
> Recently we encountered an OOB write issue on BCM957414A4142CC with outbox
> NetXtreme-E-235.1.160.0 driver from broadcom. After a litte research,
> we found the inbox driver from upstream maybe have the same issue.
>
> The commit ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters
> update") introduced 3 counters, and appended after BNXT_RE_OUT_OF_SEQ_ERR.
>
> However, BNXT_RE_OUT_OF_SEQ_ERR serves as a boundary marker for allocating
> hw stats with different num_counters for chip_gen_p5_p7 hardware.
>
> For BNXT_RE_NUM_STD_COUNTERS allocated hw_stats, leading to an
> out-of-bounds write in bnxt_re_copy_err_stats().
>
> It seems like that the BNXT_RE_REQ_CQE_ERROR, BNXT_RE_RESP_CQE_ERROR,
> and BNXT_RE_RESP_REMOTE_ACCESS_ERRS can be updated for generic hardware,
> not only for p5/p7 hardware.
>
> Fix this by moving them before BNXT_RE_OUT_OF_SEQ_ERR so they become
> part of the generic counter.
>
> Compile tested only.
>
> Fixes: ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters update")
> Reported-by: Yingying Zheng <zhengyingying@sangfor.com.cn>
> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
> ---
> drivers/infiniband/hw/bnxt_re/hw_counters.h | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/infiniband/hw/bnxt_re/hw_counters.h b/drivers/infiniband/hw/bnxt_re/hw_counters.h
> index 09d371d442aa..cebec033f4a0 100644
> --- a/drivers/infiniband/hw/bnxt_re/hw_counters.h
> +++ b/drivers/infiniband/hw/bnxt_re/hw_counters.h
> @@ -89,6 +89,9 @@ enum bnxt_re_hw_stats {
> BNXT_RE_RES_SRQ_LOAD_ERR,
> BNXT_RE_RES_TX_PCI_ERR,
> BNXT_RE_RES_RX_PCI_ERR,
> + BNXT_RE_REQ_CQE_ERROR,
> + BNXT_RE_RESP_CQE_ERROR,
> + BNXT_RE_RESP_REMOTE_ACCESS_ERRS,
> BNXT_RE_OUT_OF_SEQ_ERR,
> BNXT_RE_TX_ATOMIC_REQ,
> BNXT_RE_TX_READ_REQ,
> @@ -110,9 +113,6 @@ enum bnxt_re_hw_stats {
> BNXT_RE_TX_CNP,
> BNXT_RE_RX_CNP,
> BNXT_RE_RX_ECN,
> - BNXT_RE_REQ_CQE_ERROR,
> - BNXT_RE_RESP_CQE_ERROR,
> - BNXT_RE_RESP_REMOTE_ACCESS_ERRS,
> BNXT_RE_NUM_EXT_COUNTERS
> };
>
--
Thanks,
- Ding Hui
On Thu, Dec 18, 2025 at 10:16:02AM +0800, Ding Hui wrote:
> Friendly ping.
I'm waiting for some sort of response from Broadcom people.
Thanks
>
> On 2025/12/8 15:21, Ding Hui wrote:
> > Recently we encountered an OOB write issue on BCM957414A4142CC with outbox
> > NetXtreme-E-235.1.160.0 driver from broadcom. After a litte research,
> > we found the inbox driver from upstream maybe have the same issue.
> >
> > The commit ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters
> > update") introduced 3 counters, and appended after BNXT_RE_OUT_OF_SEQ_ERR.
> >
> > However, BNXT_RE_OUT_OF_SEQ_ERR serves as a boundary marker for allocating
> > hw stats with different num_counters for chip_gen_p5_p7 hardware.
> >
> > For BNXT_RE_NUM_STD_COUNTERS allocated hw_stats, leading to an
> > out-of-bounds write in bnxt_re_copy_err_stats().
> >
> > It seems like that the BNXT_RE_REQ_CQE_ERROR, BNXT_RE_RESP_CQE_ERROR,
> > and BNXT_RE_RESP_REMOTE_ACCESS_ERRS can be updated for generic hardware,
> > not only for p5/p7 hardware.
> >
> > Fix this by moving them before BNXT_RE_OUT_OF_SEQ_ERR so they become
> > part of the generic counter.
> >
> > Compile tested only.
> >
> > Fixes: ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters update")
> > Reported-by: Yingying Zheng <zhengyingying@sangfor.com.cn>
> > Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
> > ---
> > drivers/infiniband/hw/bnxt_re/hw_counters.h | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/bnxt_re/hw_counters.h b/drivers/infiniband/hw/bnxt_re/hw_counters.h
> > index 09d371d442aa..cebec033f4a0 100644
> > --- a/drivers/infiniband/hw/bnxt_re/hw_counters.h
> > +++ b/drivers/infiniband/hw/bnxt_re/hw_counters.h
> > @@ -89,6 +89,9 @@ enum bnxt_re_hw_stats {
> > BNXT_RE_RES_SRQ_LOAD_ERR,
> > BNXT_RE_RES_TX_PCI_ERR,
> > BNXT_RE_RES_RX_PCI_ERR,
> > + BNXT_RE_REQ_CQE_ERROR,
> > + BNXT_RE_RESP_CQE_ERROR,
> > + BNXT_RE_RESP_REMOTE_ACCESS_ERRS,
> > BNXT_RE_OUT_OF_SEQ_ERR,
> > BNXT_RE_TX_ATOMIC_REQ,
> > BNXT_RE_TX_READ_REQ,
> > @@ -110,9 +113,6 @@ enum bnxt_re_hw_stats {
> > BNXT_RE_TX_CNP,
> > BNXT_RE_RX_CNP,
> > BNXT_RE_RX_ECN,
> > - BNXT_RE_REQ_CQE_ERROR,
> > - BNXT_RE_RESP_CQE_ERROR,
> > - BNXT_RE_RESP_REMOTE_ACCESS_ERRS,
> > BNXT_RE_NUM_EXT_COUNTERS
> > };
>
> --
> Thanks,
> - Ding Hui
>
>
Hi Ding/Leon,
We will validate the changes on BCM957414A4142CC and confirm.
On Sun, Dec 21, 2025 at 1:31 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Thu, Dec 18, 2025 at 10:16:02AM +0800, Ding Hui wrote:
> > Friendly ping.
>
> I'm waiting for some sort of response from Broadcom people.
>
> Thanks
>
> >
> > On 2025/12/8 15:21, Ding Hui wrote:
> > > Recently we encountered an OOB write issue on BCM957414A4142CC with outbox
> > > NetXtreme-E-235.1.160.0 driver from broadcom. After a litte research,
> > > we found the inbox driver from upstream maybe have the same issue.
> > >
> > > The commit ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters
> > > update") introduced 3 counters, and appended after BNXT_RE_OUT_OF_SEQ_ERR.
> > >
> > > However, BNXT_RE_OUT_OF_SEQ_ERR serves as a boundary marker for allocating
> > > hw stats with different num_counters for chip_gen_p5_p7 hardware.
> > >
> > > For BNXT_RE_NUM_STD_COUNTERS allocated hw_stats, leading to an
> > > out-of-bounds write in bnxt_re_copy_err_stats().
> > >
> > > It seems like that the BNXT_RE_REQ_CQE_ERROR, BNXT_RE_RESP_CQE_ERROR,
> > > and BNXT_RE_RESP_REMOTE_ACCESS_ERRS can be updated for generic hardware,
> > > not only for p5/p7 hardware.
> > >
> > > Fix this by moving them before BNXT_RE_OUT_OF_SEQ_ERR so they become
> > > part of the generic counter.
> > >
> > > Compile tested only.
> > >
> > > Fixes: ef56081d1864 ("RDMA/bnxt_re: RoCE related hardware counters update")
> > > Reported-by: Yingying Zheng <zhengyingying@sangfor.com.cn>
> > > Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
> > > ---
> > > drivers/infiniband/hw/bnxt_re/hw_counters.h | 6 +++---
> > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/hw/bnxt_re/hw_counters.h b/drivers/infiniband/hw/bnxt_re/hw_counters.h
> > > index 09d371d442aa..cebec033f4a0 100644
> > > --- a/drivers/infiniband/hw/bnxt_re/hw_counters.h
> > > +++ b/drivers/infiniband/hw/bnxt_re/hw_counters.h
> > > @@ -89,6 +89,9 @@ enum bnxt_re_hw_stats {
> > > BNXT_RE_RES_SRQ_LOAD_ERR,
> > > BNXT_RE_RES_TX_PCI_ERR,
> > > BNXT_RE_RES_RX_PCI_ERR,
> > > + BNXT_RE_REQ_CQE_ERROR,
> > > + BNXT_RE_RESP_CQE_ERROR,
> > > + BNXT_RE_RESP_REMOTE_ACCESS_ERRS,
> > > BNXT_RE_OUT_OF_SEQ_ERR,
> > > BNXT_RE_TX_ATOMIC_REQ,
> > > BNXT_RE_TX_READ_REQ,
> > > @@ -110,9 +113,6 @@ enum bnxt_re_hw_stats {
> > > BNXT_RE_TX_CNP,
> > > BNXT_RE_RX_CNP,
> > > BNXT_RE_RX_ECN,
> > > - BNXT_RE_REQ_CQE_ERROR,
> > > - BNXT_RE_RESP_CQE_ERROR,
> > > - BNXT_RE_RESP_REMOTE_ACCESS_ERRS,
> > > BNXT_RE_NUM_EXT_COUNTERS
> > > };
> >
> > --
> > Thanks,
> > - Ding Hui
> >
> >
--
Regards,
Kalesh AP