init/Kconfig | 4 ++++ 1 file changed, 4 insertions(+)
With gcc 13.0.1 on x86, there are several false positives like
drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c:167:31:
error: array subscript 4 is above array bounds of ‘const struct sparx5_psfp_gce[4]’ [-Werror=array-bounds=]
167 | gce = &sg->gce[i];
| ~~~~~~~^~~
In file included from drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c:8:
drivers/net/ethernet/microchip/sparx5/sparx5_main.h:506:32: note: while referencing ‘gce’
506 | struct sparx5_psfp_gce gce[SPX5_PSFP_GCE_CNT];
| ^~~
The code lines for the reported problem
/* For each scheduling entry */
for (i = 0; i < sg->num_entries; i++) {
gce = &sg->gce[i];
i is bounded by num_entries, which is set in sparx5_tc_flower.c
if (act->gate.num_entries >= SPX5_PSFP_GCE_CNT) {
NL_SET_ERR_MSG_MOD(extack, "Invalid number of gate entries");
return -EINVAL;
}
..
sg->num_entries = act->gate.num_entries;
So disable array-bounds as was done on gcc 11 and 12
Signed-off-by: Tom Rix <trix@redhat.com>
---
init/Kconfig | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/init/Kconfig b/init/Kconfig
index 1fb5f313d18f..10d0a0020726 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -898,10 +898,14 @@ config GCC11_NO_ARRAY_BOUNDS
config GCC12_NO_ARRAY_BOUNDS
def_bool y
+config GCC13_NO_ARRAY_BOUNDS
+ def_bool y
+
config CC_NO_ARRAY_BOUNDS
bool
default y if CC_IS_GCC && GCC_VERSION >= 110000 && GCC_VERSION < 120000 && GCC11_NO_ARRAY_BOUNDS
default y if CC_IS_GCC && GCC_VERSION >= 120000 && GCC_VERSION < 130000 && GCC12_NO_ARRAY_BOUNDS
+ default y if CC_IS_GCC && GCC_VERSION >= 130000 && GCC_VERSION < 140000 && GCC13_NO_ARRAY_BOUNDS
#
# For architectures that know their GCC __int128 support is sound
--
2.27.0
*thread necromancy*
On Mon, Mar 06, 2023 at 05:09:47PM -0500, Tom Rix wrote:
> With gcc 13.0.1 on x86, there are several false positives like
>
> drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c:167:31:
> error: array subscript 4 is above array bounds of ‘const struct sparx5_psfp_gce[4]’ [-Werror=array-bounds=]
> 167 | gce = &sg->gce[i];
> | ~~~~~~~^~~
> In file included from drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c:8:
> drivers/net/ethernet/microchip/sparx5/sparx5_main.h:506:32: note: while referencing ‘gce’
> 506 | struct sparx5_psfp_gce gce[SPX5_PSFP_GCE_CNT];
> | ^~~
>
> The code lines for the reported problem
> /* For each scheduling entry */
> for (i = 0; i < sg->num_entries; i++) {
> gce = &sg->gce[i];
>
> i is bounded by num_entries, which is set in sparx5_tc_flower.c
> if (act->gate.num_entries >= SPX5_PSFP_GCE_CNT) {
> NL_SET_ERR_MSG_MOD(extack, "Invalid number of gate entries");
> return -EINVAL;
> }
> ..
> sg->num_entries = act->gate.num_entries;
>
> So disable array-bounds as was done on gcc 11 and 12
So, as it turns out, after much discussion with GCC devs (and Qing's great
patience with me), I finally see that this is not a false positive. GCC
correctly sees a test for "i >= SPX5_PSFP_GCE_CNT", hidden away in the
macros and inlines, and is reporting "Look, you have an execution path
where i is out of bounds."
(It is not a result of the shift sanitizer nor the array bounds sanitizer
-- I was convinced these were responsible, but I was wrong.)
Here are the details:
"gce" is this array of size 4:
struct sparx5_psfp_sg {
...
u32 num_entries; /* PSFPAdminControlListLength */
struct sparx5_psfp_gce gce[SPX5_PSFP_GCE_CNT];
};
#define SPX5_PSFP_GCE_CNT 4
with the loop:
for (i = 0; i < sg->num_entries; i++) {
gce = &sg->gce[i];
...
spx5_wr(ANA_AC_SG_GCL_GS_CONFIG_IPS_SET(ips) |
ANA_AC_SG_GCL_GS_CONFIG_GATE_STATE_SET(gce->gate_state),
sparx5, ANA_AC_SG_GCL_GS_CONFIG(i));
That last macro is:
#define ANA_AC_SG_GCL_GS_CONFIG(r) __REG(TARGET_ANA_AC,\
0, 1, 851584, 0, 1, 128, 0, r, 4, 4)
Note the "r, 4", which is actually "i, 4". These are passed into
spx5_wr() as rinst and rcnt respectively:
static inline void spx5_wr(u32 val, struct sparx5 *sparx5,
int id, int tinst, int tcnt,
int gbase, int ginst, int gcnt, int gwidth,
int raddr, int rinst, int rcnt, int rwidth)
{
writel(val, spx5_addr(sparx5->regs, id, tinst, tcnt,
gbase, ginst, gcnt, gwidth,
raddr, rinst, rcnt, rwidth));
}
and spx5_addr() does:
static inline void __iomem *spx5_addr(void __iomem *base[],
int id, int tinst, int tcnt,
int gbase, int ginst,
int gcnt, int gwidth,
int raddr, int rinst,
int rcnt, int rwidth)
{
...
WARN_ON((rinst) >= rcnt);
It is specifically checking for "i >= 4", and GCC is correctly reporting
that this is therefore a reachable path.
So, the correct fix here is to do the check up front and make sure an
invalid array indexing cannot happen:
diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c b/drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c
index 8dee1ab1fa75..23c40b9b74d5 100644
--- a/drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c
+++ b/drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c
@@ -163,7 +163,7 @@ static int sparx5_psfp_sg_set(struct sparx5 *sparx5, u32 id,
spx5_wr(sg->cycletimeext, sparx5, ANA_AC_SG_CONFIG_REG_5);
/* For each scheduling entry */
- for (i = 0; i < sg->num_entries; i++) {
+ for (i = 0; i < ARRAY_SIZE(sg->gce) && i < sg->num_entries; i++) {
gce = &sg->gce[i];
ips = sparx5_psfp_ipv_to_ips(gce->ipv);
/* hardware needs TimeInterval to be cumulative */
Or, if you want to retain the WARN:
diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c b/drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c
index 8dee1ab1fa75..5ffeb299fcb4 100644
--- a/drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c
+++ b/drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c
@@ -164,6 +164,8 @@ static int sparx5_psfp_sg_set(struct sparx5 *sparx5, u32 id,
/* For each scheduling entry */
for (i = 0; i < sg->num_entries; i++) {
+ if (WARN_ON_ONCE(i >= ARRAY_SIZE(sg->gce)))
+ break;
gce = &sg->gce[i];
ips = sparx5_psfp_ipv_to_ips(gce->ipv);
/* hardware needs TimeInterval to be cumulative */
--
Kees Cook
+ Kees
https://lore.kernel.org/lkml/20230306220947.1982272-1-trix@redhat.com/
On Mon, Mar 6, 2023 at 2:10 PM Tom Rix <trix@redhat.com> wrote:
>
> With gcc 13.0.1 on x86, there are several false positives like
>
> drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c:167:31:
> error: array subscript 4 is above array bounds of ‘const struct sparx5_psfp_gce[4]’ [-Werror=array-bounds=]
> 167 | gce = &sg->gce[i];
> | ~~~~~~~^~~
> In file included from drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c:8:
> drivers/net/ethernet/microchip/sparx5/sparx5_main.h:506:32: note: while referencing ‘gce’
> 506 | struct sparx5_psfp_gce gce[SPX5_PSFP_GCE_CNT];
> | ^~~
>
> The code lines for the reported problem
> /* For each scheduling entry */
> for (i = 0; i < sg->num_entries; i++) {
> gce = &sg->gce[i];
>
> i is bounded by num_entries, which is set in sparx5_tc_flower.c
> if (act->gate.num_entries >= SPX5_PSFP_GCE_CNT) {
> NL_SET_ERR_MSG_MOD(extack, "Invalid number of gate entries");
> return -EINVAL;
> }
> ..
> sg->num_entries = act->gate.num_entries;
>
> So disable array-bounds as was done on gcc 11 and 12
>
> Signed-off-by: Tom Rix <trix@redhat.com>
> ---
> init/Kconfig | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/init/Kconfig b/init/Kconfig
> index 1fb5f313d18f..10d0a0020726 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -898,10 +898,14 @@ config GCC11_NO_ARRAY_BOUNDS
> config GCC12_NO_ARRAY_BOUNDS
> def_bool y
>
> +config GCC13_NO_ARRAY_BOUNDS
> + def_bool y
> +
> config CC_NO_ARRAY_BOUNDS
> bool
> default y if CC_IS_GCC && GCC_VERSION >= 110000 && GCC_VERSION < 120000 && GCC11_NO_ARRAY_BOUNDS
> default y if CC_IS_GCC && GCC_VERSION >= 120000 && GCC_VERSION < 130000 && GCC12_NO_ARRAY_BOUNDS
> + default y if CC_IS_GCC && GCC_VERSION >= 130000 && GCC_VERSION < 140000 && GCC13_NO_ARRAY_BOUNDS
>
> #
> # For architectures that know their GCC __int128 support is sound
> --
> 2.27.0
>
--
Thanks,
~Nick Desaulniers
On March 6, 2023 2:20:50 PM PST, Nick Desaulniers <ndesaulniers@google.com> wrote:
>+ Kees
>https://lore.kernel.org/lkml/20230306220947.1982272-1-trix@redhat.com/
>
>On Mon, Mar 6, 2023 at 2:10 PM Tom Rix <trix@redhat.com> wrote:
>>
>> With gcc 13.0.1 on x86, there are several false positives like
>>
>> drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c:167:31:
>> error: array subscript 4 is above array bounds of ‘const struct sparx5_psfp_gce[4]’ [-Werror=array-bounds=]
>> 167 | gce = &sg->gce[i];
>> | ~~~~~~~^~~
>> In file included from drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c:8:
>> drivers/net/ethernet/microchip/sparx5/sparx5_main.h:506:32: note: while referencing ‘gce’
>> 506 | struct sparx5_psfp_gce gce[SPX5_PSFP_GCE_CNT];
>> | ^~~
>>
>> The code lines for the reported problem
>> /* For each scheduling entry */
>> for (i = 0; i < sg->num_entries; i++) {
>> gce = &sg->gce[i];
>>
>> i is bounded by num_entries, which is set in sparx5_tc_flower.c
>> if (act->gate.num_entries >= SPX5_PSFP_GCE_CNT) {
>> NL_SET_ERR_MSG_MOD(extack, "Invalid number of gate entries");
>> return -EINVAL;
>> }
>> ..
>> sg->num_entries = act->gate.num_entries;
>>
>> So disable array-bounds as was done on gcc 11 and 12
GCC 13 isn't released yet, and we've been working to make Linux warning-free under -Wareay-bounds. (And we succeeded briefly with GCC 11.)
I'd much rather get GCC fixed. This is due to the shift sanitizer reducing the scope of num_entries (via macro args) to 0-31, which is still >4. This seems like a hinting bug in GCC: just because the variable was used in a shift doesn't mean the compiler can make any value assumptions.
-Kees
--
Kees Cook
On 3/6/23 3:02 PM, Kees Cook wrote:
> On March 6, 2023 2:20:50 PM PST, Nick Desaulniers <ndesaulniers@google.com> wrote:
>> + Kees
>> https://lore.kernel.org/lkml/20230306220947.1982272-1-trix@redhat.com/
>>
>> On Mon, Mar 6, 2023 at 2:10 PM Tom Rix <trix@redhat.com> wrote:
>>> With gcc 13.0.1 on x86, there are several false positives like
>>>
>>> drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c:167:31:
>>> error: array subscript 4 is above array bounds of ‘const struct sparx5_psfp_gce[4]’ [-Werror=array-bounds=]
>>> 167 | gce = &sg->gce[i];
>>> | ~~~~~~~^~~
>>> In file included from drivers/net/ethernet/microchip/sparx5/sparx5_psfp.c:8:
>>> drivers/net/ethernet/microchip/sparx5/sparx5_main.h:506:32: note: while referencing ‘gce’
>>> 506 | struct sparx5_psfp_gce gce[SPX5_PSFP_GCE_CNT];
>>> | ^~~
>>>
>>> The code lines for the reported problem
>>> /* For each scheduling entry */
>>> for (i = 0; i < sg->num_entries; i++) {
>>> gce = &sg->gce[i];
>>>
>>> i is bounded by num_entries, which is set in sparx5_tc_flower.c
>>> if (act->gate.num_entries >= SPX5_PSFP_GCE_CNT) {
>>> NL_SET_ERR_MSG_MOD(extack, "Invalid number of gate entries");
>>> return -EINVAL;
>>> }
>>> ..
>>> sg->num_entries = act->gate.num_entries;
>>>
>>> So disable array-bounds as was done on gcc 11 and 12
> GCC 13 isn't released yet, and we've been working to make Linux warning-free under -Wareay-bounds. (And we succeeded briefly with GCC 11.)
>
> I'd much rather get GCC fixed. This is due to the shift sanitizer reducing the scope of num_entries (via macro args) to 0-31, which is still >4. This seems like a hinting bug in GCC: just because the variable was used in a shift doesn't mean the compiler can make any value assumptions.
The build with fail generally with gcc 13.
The warnings could be cleaned without having an error, but I looked at
multiple errors, none of them were real.
imo this is a broken compiler option.
Tom
>
> -Kees
>
>
>
On Tue, Mar 7, 2023 at 2:07 AM Tom Rix <trix@redhat.com> wrote: > > The build with fail generally with gcc 13. > > The warnings could be cleaned without having an error, but I looked at > multiple errors, none of them were real. > > imo this is a broken compiler option. I am not sure I understand -- my reading of Kees' message is that he would prefer to get the warning (rather than the kernel) fixed before GCC 13 releases. Are you asking to have the option disabled until GCC 13 releases and reevaluate then? How many warnings are you getting? Are those actual errors or `-Werror`? Cheers, Miguel
On 3/7/23 3:42 AM, Miguel Ojeda wrote: > On Tue, Mar 7, 2023 at 2:07 AM Tom Rix <trix@redhat.com> wrote: >> The build with fail generally with gcc 13. >> >> The warnings could be cleaned without having an error, but I looked at >> multiple errors, none of them were real. >> >> imo this is a broken compiler option. > I am not sure I understand -- my reading of Kees' message is that he > would prefer to get the warning (rather than the kernel) fixed before > GCC 13 releases. yes, that would be ideal. But anyone using gcc 13 in the meanwhile will have a broken build and any other aspect of testing the kernel with gcc 13 would have to be deferred to after the fix, if it happens. > > Are you asking to have the option disabled until GCC 13 releases and > reevaluate then? How many warnings are you getting? Are those actual > errors or `-Werror`? With W=1, there are 40 error:'s, sorry I do not have the default logs at hand. I looked about 10 of them over the weekend, they we all bogus. So gcc needs to be fixed first, just disabling with -Wnoerror will give folks bad problems that do not need fixing. I am asking that we at least temporarily disable this option so the build break is fixed. Tom > > Cheers, > Miguel >
© 2016 - 2026 Red Hat, Inc.