[PATCH net-next] net: mana: hardening: Reject zero max_num_queues from MANA_QUERY_VPORT_CONFIG

Erni Sri Satya Vennela posted 1 patch 6 days, 23 hours ago
drivers/net/ethernet/microsoft/mana/mana_en.c | 6 ++++++
1 file changed, 6 insertions(+)
[PATCH net-next] net: mana: hardening: Reject zero max_num_queues from MANA_QUERY_VPORT_CONFIG
Posted by Erni Sri Satya Vennela 6 days, 23 hours ago
As a part of MANA hardening for CVM, validate that max_num_sq and
max_num_rq returned by MANA_QUERY_VPORT_CONFIG are not zero. These
values flow into apc->num_queues, which is used as an allocation count
and loop bound. A zero value would result in zero-size allocations and
incorrect driver behavior.

Return -EPROTO if either value is zero.

Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
 drivers/net/ethernet/microsoft/mana/mana_en.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index b39e8b920791..a4197b4b0597 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -1249,6 +1249,12 @@ static int mana_query_vport_cfg(struct mana_port_context *apc, u32 vport_index,
 
 	*max_sq = resp.max_num_sq;
 	*max_rq = resp.max_num_rq;
+
+	if (*max_sq == 0 || *max_rq == 0) {
+		netdev_err(apc->ndev, "Invalid max queues from vPort config\n");
+		return -EPROTO;
+	}
+
 	if (resp.num_indirection_ent > 0 &&
 	    resp.num_indirection_ent <= MANA_INDIRECT_TABLE_MAX_SIZE &&
 	    is_power_of_2(resp.num_indirection_ent)) {
-- 
2.34.1
Re: [PATCH net-next] net: mana: hardening: Reject zero max_num_queues from MANA_QUERY_VPORT_CONFIG
Posted by Paolo Abeni 2 days, 7 hours ago
On 3/26/26 6:48 PM, Erni Sri Satya Vennela wrote:
> As a part of MANA hardening for CVM, validate that max_num_sq and
> max_num_rq returned by MANA_QUERY_VPORT_CONFIG are not zero. These
> values flow into apc->num_queues, which is used as an allocation count
> and loop bound. A zero value would result in zero-size allocations and
> incorrect driver behavior.
> 
> Return -EPROTO if either value is zero.
> 
> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
> ---
>  drivers/net/ethernet/microsoft/mana/mana_en.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index b39e8b920791..a4197b4b0597 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -1249,6 +1249,12 @@ static int mana_query_vport_cfg(struct mana_port_context *apc, u32 vport_index,
>  
>  	*max_sq = resp.max_num_sq;
>  	*max_rq = resp.max_num_rq;
> +
> +	if (*max_sq == 0 || *max_rq == 0) {
> +		netdev_err(apc->ndev, "Invalid max queues from vPort config\n");
> +		return -EPROTO;

AI review says:

Will returning -EPROTO here expose a pre-existing resource leak in the
driver's teardown path?
If mana_query_vport_cfg() returns an error, mana_init_port() fails and
mana_probe_port() frees the ndev, leaving ac->ports[i] as NULL. In
mana_probe(), the port initialization loop breaks upon this error, but
the err variable is then overwritten:

mana_probe() {
    ...
    for (i = 0; i < ac->num_ports; i++) {
        err = mana_probe_port(ac, i, &ac->ports[i]);
        if (err) {
            dev_err(dev, "Probe Failed for port %d\n", i);
            break;
        }
    }

    err = add_adev(gd, "eth");
    ...
}

If add_adev() succeeds, mana_probe() completes successfully instead of
failing, masking the earlier error while leaving ac->ports[0] as NULL.
Later, when the driver is unloaded or if add_adev() fails and triggers
immediate cleanup, mana_remove() is called. It iterates over ac->ports
and, upon encountering the NULL device, immediately executes goto out:

mana_remove() {
    ...
    for (i = 0; i < ac->num_ports; i++) {
        ndev = ac->ports[i];
        if (!ndev) {
            if (i == 0)
                ...
            goto out;
        }
        ...
    }

    mana_destroy_eq(ac);
out:
    ...
}

Because the out label in mana_remove() is located after the
mana_destroy_eq(ac) call, jumping there completely skips destroying the
event queues allocated earlier by mana_create_eq(ac).
In a Confidential Virtual Machine context, could an untrusted hypervisor
repeatedly return invalid configs to continuously leak guest memory and
hardware queues?
Re: [PATCH net-next] net: mana: hardening: Reject zero max_num_queues from MANA_QUERY_VPORT_CONFIG
Posted by Erni Sri Satya Vennela 1 day, 23 hours ago
 > +
> > +	if (*max_sq == 0 || *max_rq == 0) {
> > +		netdev_err(apc->ndev, "Invalid max queues from vPort config\n");
> > +		return -EPROTO;
> 
> AI review says:
> 
> Will returning -EPROTO here expose a pre-existing resource leak in the
> driver's teardown path?
> If mana_query_vport_cfg() returns an error, mana_init_port() fails and
> mana_probe_port() frees the ndev, leaving ac->ports[i] as NULL. In
> mana_probe(), the port initialization loop breaks upon this error, but
> the err variable is then overwritten:
> 
> mana_probe() {
>     ...
>     for (i = 0; i < ac->num_ports; i++) {
>         err = mana_probe_port(ac, i, &ac->ports[i]);
>         if (err) {
>             dev_err(dev, "Probe Failed for port %d\n", i);
>             break;
>         }
>     }
> 
>     err = add_adev(gd, "eth");
>     ...
> }
> 
> If add_adev() succeeds, mana_probe() completes successfully instead of
> failing, masking the earlier error while leaving ac->ports[0] as NULL.
> Later, when the driver is unloaded or if add_adev() fails and triggers
> immediate cleanup, mana_remove() is called. It iterates over ac->ports
> and, upon encountering the NULL device, immediately executes goto out:
> 
> mana_remove() {
>     ...
>     for (i = 0; i < ac->num_ports; i++) {
>         ndev = ac->ports[i];
>         if (!ndev) {
>             if (i == 0)
>                 ...
>             goto out;
>         }
>         ...
>     }
> 
>     mana_destroy_eq(ac);
> out:
>     ...
> }
> 
> Because the out label in mana_remove() is located after the
> mana_destroy_eq(ac) call, jumping there completely skips destroying the
> event queues allocated earlier by mana_create_eq(ac).
> In a Confidential Virtual Machine context, could an untrusted hypervisor
> repeatedly return invalid configs to continuously leak guest memory and
> hardware queues?

Thankyou for the review.
Since these issues are pre-existing, I will send it in a separate
patchset.
The patchset will also include the issues reported in:
[PATCH net-next] net: mana: hardening: Validate adapter_mtu from
MANA_QUERY_DEV_CONFIG

- Vennela