From: Denis Kenzior <denkenz@gmail.com>

Add support for tracking multiple endpoints that may have conflicting
node identifiers. This is achieved by using both the node and endpoint
identifiers as the key inside the radix_tree data structure.

For backward compatibility with existing clients, the previous key
schema (node identifier only) is preserved. However, this schema will
only support the first endpoint/node combination. This is acceptable
for legacy clients as support for multiple endpoints with conflicting
node identifiers was not previously possible.

Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Andy Gross <agross@kernel.org>
Signed-off-by: Mihai Moldovan <ionic@ionic.de>
---
v3:
- rebase against current master
- port the [endpoint ID|node ID] key to the generic
  solution already established for the [node ID|port number] usage
- Link to v2: https://msgid.link/4d0fe1eab4b38fb85e2ec53c07289bc0843611a2.1752947108.git.ionic@ionic.de
v2:
- rebase against current master
- no action on review comment regarding integer overflow on 32 bit
long platforms (thus far)
- Link to v1: https://msgid.link/20241018181842.1368394-4-denkenz@gmail.com
---
net/qrtr/af_qrtr.c | 50 ++++++++++++++++++++++++++++++++++------------
1 file changed, 37 insertions(+), 13 deletions(-)
diff --git a/net/qrtr/af_qrtr.c b/net/qrtr/af_qrtr.c
index 1cb13242e41b..d6efd7f2eddf 100644
--- a/net/qrtr/af_qrtr.c
+++ b/net/qrtr/af_qrtr.c
@@ -119,14 +119,15 @@ static DEFINE_XARRAY_ALLOC(qrtr_ports);
/* The radix tree API uses fixed unsigned long keys and we will have to make
* do with that.
- * These keys are often a combination of node IDs (currently u32) and
- * port numbers (also currently u32).
- * Using the high 32 bits for the node ID and the low 32 bits for the
- * port number will work fine to create keys on platforms where unsigned long
- * is 64 bits wide, but obviously is not be possible on platforms where
- * unsigned long is smaller.
+ * These keys are often a combination of node IDs and port numbers or
+ * endpoint IDs and node IDs (all currently u32).
+ * Using the high 32 bits for the node/endpoint ID and the low 32 bits for the
+ * port number/node ID will work fine to create keys on platforms where
+ * unsigned long is 64 bits wide, but obviously is not possible on
+ * platforms where unsigned long is smaller.
* Virtually split up unsigned long in half and assign the upper bits to
- * node IDs and the lower bits to the port number, however big that may be.
+ * node/endpoint IDs and the lower bits to the port number/node ID, however
+ * big that may be.
*/
#define QRTR_INDEX_HALF_BITS (RADIX_TREE_INDEX_BITS >> 1)
@@ -465,19 +466,36 @@ static struct qrtr_node *qrtr_node_lookup(unsigned int nid)
*
* This is mostly useful for automatic node id assignment, based on
* the source id in the incoming packet.
+ *
+ * Return: 0 on success; negative error code on failure
*/
-static void qrtr_node_assign(struct qrtr_node *node, unsigned int nid)
+static int qrtr_node_assign(struct qrtr_node *node, unsigned int nid)
{
unsigned long flags;
+ unsigned long key;
if (nid == QRTR_EP_NID_AUTO)
- return;
+ return 0;
spin_lock_irqsave(&qrtr_nodes_lock, flags);
- radix_tree_insert(&qrtr_nodes, nid, node);
+
+ if (node->ep->id > QRTR_INDEX_HALF_UNSIGNED_MAX ||
+ nid > QRTR_INDEX_HALF_UNSIGNED_MAX)
+ return -EINVAL;
+
+ /* Always insert with the endpoint_id + node_id */
+ key = ((unsigned long)(node->ep->id) << QRTR_INDEX_HALF_BITS) |
+ ((unsigned long)(nid) & QRTR_INDEX_HALF_UNSIGNED_MAX);
+ radix_tree_insert(&qrtr_nodes, key, node);
+
+ if (!radix_tree_lookup(&qrtr_nodes, nid))
+ radix_tree_insert(&qrtr_nodes, nid, node);
+
if (node->nid == QRTR_EP_NID_AUTO)
node->nid = nid;
spin_unlock_irqrestore(&qrtr_nodes_lock, flags);
+
+ return 0;
}
/**
@@ -571,14 +589,18 @@ int qrtr_endpoint_post(struct qrtr_endpoint *ep, const void *data, size_t len)
skb_put_data(skb, data + hdrlen, size);
- qrtr_node_assign(node, cb->src_node);
+ ret = qrtr_node_assign(node, cb->src_node);
+ if (ret)
+ goto err;
if (cb->type == QRTR_TYPE_NEW_SERVER) {
/* Remote node endpoint can bridge other distant nodes */
const struct qrtr_ctrl_pkt *pkt;
pkt = data + hdrlen;
- qrtr_node_assign(node, le32_to_cpu(pkt->server.node));
+ ret = qrtr_node_assign(node, le32_to_cpu(pkt->server.node));
+ if (ret)
+ goto err;
}
if (cb->type == QRTR_TYPE_RESUME_TX) {
@@ -670,7 +692,9 @@ int qrtr_endpoint_register(struct qrtr_endpoint *ep, unsigned int nid)
INIT_RADIX_TREE(&node->qrtr_tx_flow, GFP_KERNEL);
mutex_init(&node->qrtr_tx_lock);
- qrtr_node_assign(node, nid);
+ rc = qrtr_node_assign(node, nid);
+ if (rc < 0)
+ goto free_node;
mutex_lock(&qrtr_node_lock);
list_add(&node->item, &qrtr_all_nodes);
--
2.50.0
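The key layout described in the patch comment can be tried out in userspace. The following is a minimal sketch, not the kernel code: the `SKETCH_*` macros and `make_key()` are hypothetical names, and the index width is taken from `sizeof(unsigned long)` as an assumed stand-in for `RADIX_TREE_INDEX_BITS`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed stand-ins for the kernel macros: the index is one unsigned long,
 * split into an upper and a lower half of equal width. */
#define SKETCH_INDEX_BITS        (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_HALF_BITS         (SKETCH_INDEX_BITS >> 1)
#define SKETCH_HALF_UNSIGNED_MAX ((1UL << SKETCH_HALF_BITS) - 1)

/* Hypothetical helper: pack (hi, lo) into a single key, e.g.
 * (endpoint ID, node ID) or (node ID, port number).
 * Returns 0 on success, -1 if either value does not fit into its half. */
static int make_key(unsigned long hi, unsigned long lo, unsigned long *key)
{
	assert(key != NULL);

	if (hi > SKETCH_HALF_UNSIGNED_MAX || lo > SKETCH_HALF_UNSIGNED_MAX)
		return -1;

	*key = (hi << SKETCH_HALF_BITS) | lo;
	return 0;
}
```

Two endpoints that announce the same node ID end up with different keys because the endpoint ID occupies the upper half, which is the property the commit message relies on.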
On Thu, Jul 24, 2025 at 01:24:01AM +0200, Mihai Moldovan wrote:
> From: Denis Kenzior <denkenz@gmail.com>
>
> Add support for tracking multiple endpoints that may have conflicting
> node identifiers. This is achieved by using both the node and endpoint
> identifiers as the key inside the radix_tree data structure.
>
> For backward compatibility with existing clients, the previous key
> schema (node identifier only) is preserved. However, this schema will
> only support the first endpoint/node combination. This is acceptable
> for legacy clients as support for multiple endpoints with conflicting
> node identifiers was not previously possible.
>
> Signed-off-by: Denis Kenzior <denkenz@gmail.com>
> Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
> Reviewed-by: Andy Gross <agross@kernel.org>
> Signed-off-by: Mihai Moldovan <ionic@ionic.de>

...

> diff --git a/net/qrtr/af_qrtr.c b/net/qrtr/af_qrtr.c

...

> @@ -465,19 +466,36 @@ static struct qrtr_node *qrtr_node_lookup(unsigned int nid)
>   *
>   * This is mostly useful for automatic node id assignment, based on
>   * the source id in the incoming packet.
> + *
> + * Return: 0 on success; negative error code on failure
>   */
> -static void qrtr_node_assign(struct qrtr_node *node, unsigned int nid)
> +static int qrtr_node_assign(struct qrtr_node *node, unsigned int nid)
>  {
>  	unsigned long flags;
> +	unsigned long key;
>
>  	if (nid == QRTR_EP_NID_AUTO)
> -		return;
> +		return 0;
>
>  	spin_lock_irqsave(&qrtr_nodes_lock, flags);
> -	radix_tree_insert(&qrtr_nodes, nid, node);
> +
> +	if (node->ep->id > QRTR_INDEX_HALF_UNSIGNED_MAX ||
> +	    nid > QRTR_INDEX_HALF_UNSIGNED_MAX)
> +		return -EINVAL;

Hi Mihai, Denis, all,

This will leak holding qrtr_nodes_lock.

Flagged by Smatch.

> +
> +	/* Always insert with the endpoint_id + node_id */
> +	key = ((unsigned long)(node->ep->id) << QRTR_INDEX_HALF_BITS) |
> +	      ((unsigned long)(nid) & QRTR_INDEX_HALF_UNSIGNED_MAX);
> +	radix_tree_insert(&qrtr_nodes, key, node);
> +
> +	if (!radix_tree_lookup(&qrtr_nodes, nid))
> +		radix_tree_insert(&qrtr_nodes, nid, node);
> +
>  	if (node->nid == QRTR_EP_NID_AUTO)
>  		node->nid = nid;
>  	spin_unlock_irqrestore(&qrtr_nodes_lock, flags);
> +
> +	return 0;
>  }
>
>  /**
> @@ -571,14 +589,18 @@ int qrtr_endpoint_post(struct qrtr_endpoint *ep, const void *data, size_t len)
>
>  	skb_put_data(skb, data + hdrlen, size);

When declared, ret is assigned the value -EINVAL.
And that is still the value of ret if this line is reached.

>
> -	qrtr_node_assign(node, cb->src_node);
> +	ret = qrtr_node_assign(node, cb->src_node);
> +	if (ret)
> +		goto err;

With this patch, if we get to this line, ret is 0.
Whereas before this patch it was -EINVAL.

>
>  	if (cb->type == QRTR_TYPE_NEW_SERVER) {
>  		/* Remote node endpoint can bridge other distant nodes */
>  		const struct qrtr_ctrl_pkt *pkt;
>
>  		pkt = data + hdrlen;
> -		qrtr_node_assign(node, le32_to_cpu(pkt->server.node));
> +		ret = qrtr_node_assign(node, le32_to_cpu(pkt->server.node));
> +		if (ret)
> +			goto err;
>  	}
>
>  	if (cb->type == QRTR_TYPE_RESUME_TX) {

The next portion of this function looks like this:

		ret = qrtr_tx_resume(node, skb);
		if (ret)
			goto err;
	} else {
		ipc = qrtr_port_lookup(cb->dst_port);
		if (!ipc)
			goto err;

If we get to the line above, then the function will jump to err,
free skb, and return ret. But ret is now 0, whereas before this patch
it was -EINVAL. This seems both to be an unintentional side effect of
this patch, and incorrect.

Also flagged by Smatch.

...
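One common way to address the lock leak flagged above is a single exit label that always drops the lock. The sketch below is a userspace analogue only: `fake_lock()`, `fake_unlock()`, `lock_depth` and `node_assign_sketch()` are hypothetical stand-ins for the spinlock and for `qrtr_node_assign()`, and it is not necessarily how v4 resolves the issue.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

#define SKETCH_HALF_BITS         ((sizeof(unsigned long) * CHAR_BIT) >> 1)
#define SKETCH_HALF_UNSIGNED_MAX ((1UL << SKETCH_HALF_BITS) - 1)

/* Hypothetical stand-in for qrtr_nodes_lock: a depth counter, so a leaked
 * lock is observable as lock_depth != 0 after the call returns. */
static int lock_depth;

static void fake_lock(void)
{
	lock_depth++;
}

static void fake_unlock(void)
{
	assert(lock_depth > 0);
	lock_depth--;
}

/* Sketch of the assign path: every return goes through out_unlock. */
static int node_assign_sketch(unsigned long ep_id, unsigned long nid)
{
	int ret = 0;

	fake_lock();

	if (ep_id > SKETCH_HALF_UNSIGNED_MAX ||
	    nid > SKETCH_HALF_UNSIGNED_MAX) {
		ret = -EINVAL;
		goto out_unlock;
	}

	/* ... the tree insertions would happen here ... */

out_unlock:
	fake_unlock();
	return ret;
}
```

An alternative with the same effect would be to do the range check before taking the lock, since it does not touch any shared state.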
* On 7/24/25 15:08, Simon Horman wrote:
> [...]

Thank you for the reviews, to both you and Jakub.

> This will leak holding qrtr_nodes_lock.

It certainly does, will be fixed in v4.

> Flagged by Smatch.

I haven't used Smatch before, and probably should do so going forward.

Curiously, a simple kchecker net/qrtr/ run did not warn about the locking
issue (even though it is obvious in the patch), while it did warn about the
second issue with ret. Am I missing something?

> But ret is now 0, whereas before this patch it was -EINVAL.
> This seems both to be an unintentional side effect of this patch,
> and incorrect.

True. Will also be fixed in v4.


Mihai
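The second issue (ret silently becoming 0 before a bare "goto err") can be avoided by keeping the helper's result in its own variable. This is a hedged userspace sketch: `assign_stub()`, `port_lookup_stub()` and `endpoint_post_sketch()` are hypothetical stand-ins for `qrtr_node_assign()`, `qrtr_port_lookup()` and `qrtr_endpoint_post()`, and the actual v4 fix may look different.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for qrtr_node_assign(): fails on request. */
static int assign_stub(int fail)
{
	return fail ? -EINVAL : 0;
}

/* Hypothetical stand-in for qrtr_port_lookup(): NULL means "not found". */
static const char *port_lookup_stub(int found)
{
	return found ? "ipc" : NULL;
}

static int endpoint_post_sketch(int assign_fails, int port_found)
{
	int ret = -EINVAL;	/* default for the bare "goto err" paths */
	int rc;

	/* Keep the helper's result separate so ret is not reset to 0. */
	rc = assign_stub(assign_fails);
	if (rc) {
		ret = rc;
		goto err;
	}

	if (!port_lookup_stub(port_found))
		goto err;	/* ret is still -EINVAL here */

	return 0;

err:
	/* The real function would free the skb here. */
	return ret;
}
```

Resetting `ret = -EINVAL` after each successful call would work as well; the separate variable just makes the default harder to clobber by accident.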
+ Dan Carpenter

On Sun, Jul 27, 2025 at 03:09:38PM +0200, Mihai Moldovan wrote:
> * On 7/24/25 15:08, Simon Horman wrote:
> > [...]
>
> Thank you for the reviews, to both you and Jakub.
>
>
> > This will leak holding qrtr_nodes_lock.
>
> It certainly does, will be fixed in v4.
>
>
> > Flagged by Smatch.
>
> I haven't used smatch before, and probably should do so going forward.
>
> Curiously, a simple kchecker net/qrtr/ run did not warn about the locking
> issue (albeit it being obvious in the patch), while it did warn about the
> second issue with ret. Am I missing something?

TL;DR: No, I seem to have been able to reproduce what you see.

I ran Smatch, compiled from a recent Git commit, like this:

  kchecker net/qrtr/af_qrtr.o

The warnings I saw (new to this patch) are:

  net/qrtr/af_qrtr.c:498 qrtr_node_assign() warn: inconsistent returns 'global &qrtr_nodes_lock'.
    Locked on  : 484
    Unlocked on: 498
  net/qrtr/af_qrtr.c:613 qrtr_endpoint_post() warn: missing error code 'ret'

That was with Smatch compiled from Git [1]
commit e1d933013098 ("return_efault: don't rely on the cross function DB")

I tried again with the latest head,
commit 2fb2b9093c5d ("sleep_info: The synchronize_srcu() sleeps").
And in that case I no longer see the 1st warning, about locking.
I think this is what you saw too.

This seems to be a regression in Smatch wrt this particular case for this
code. I bisected Smatch and it looks like it was introduced in commit
d0367cd8a993 ("ranges: use absolute instead implied for possibly_true/false")

I CCed Dan in case he wants to dig into this.

[1] https://repo.or.cz/smatch.git

>
>
> > But ret is now 0, whereas before this patch it was -EINVAL.
> > This seems both to be an unintentional side effect of this patch,
> > and incorrect.
>
> True. Will also fixed in v4.
>
>
>
> Mihai
On Sun, Jul 27, 2025 at 03:40:14PM +0100, Simon Horman wrote:
> + Dan Carpenter
>
> On Sun, Jul 27, 2025 at 03:09:38PM +0200, Mihai Moldovan wrote:
> > * On 7/24/25 15:08, Simon Horman wrote:
> > > [...]
> >
> > Thank you for the reviews, to both you and Jakub.
> >
> >
> > > This will leak holding qrtr_nodes_lock.
> >
> > It certainly does, will be fixed in v4.
> >
> >
> > > Flagged by Smatch.
> >
> > I haven't used smatch before, and probably should do so going forward.
> >
> > Curiously, a simple kchecker net/qrtr/ run did not warn about the locking
> > issue (albeit it being obvious in the patch), while it did warn about the
> > second issue with ret. Am I missing something?
>
> TL;DR: No, I seem to have been able to reproduce what you see.
>
> I ran Smatch, compiled from a recent Git commit, like this:
>
>   kchecker net/qrtr/af_qrtr.o
>
> The warnings I saw (new to this patch) are:
>
>   net/qrtr/af_qrtr.c:498 qrtr_node_assign() warn: inconsistent returns 'global &qrtr_nodes_lock'.
>     Locked on  : 484
>     Unlocked on: 498
>   net/qrtr/af_qrtr.c:613 qrtr_endpoint_post() warn: missing error code 'ret'
>
> That was with Smatch compiled from Git [1]
> commit e1d933013098 ("return_efault: don't rely on the cross function DB")
>
> I tried again with the latest head,
> commit 2fb2b9093c5d ("sleep_info: The synchronize_srcu() sleeps").
> And in that case I no longer see the 1st warning, about locking.
> I think this is what you saw too.
>
> This seems to a regression in Smatch wrt this particular case for this
> code. I bisected Smatch and it looks like it was introduced in commit
> d0367cd8a993 ("ranges: use absolute instead implied for possibly_true/false")
>
> I CCed Dan in case he wants to dig into this.

The code looks like this:

	spin_lock_irqsave(&qrtr_nodes_lock, flags);

	if (node->ep->id > QRTR_INDEX_HALF_UNSIGNED_MAX ||
	    nid > QRTR_INDEX_HALF_UNSIGNED_MAX)
		return -EINVAL;

The problem is that QRTR_INDEX_HALF_UNSIGNED_MAX is U32_MAX and
node->ep->id and nid are both u32 type. The return statement is dead
code and I deliberately silenced warnings on impossible paths.

The following patch will enable the warning again and I'll test it tonight
to see what happens. If it's not too painful then I'll delete it
properly, but if it generates a bunch of false positives then, in the
end, I'm not overly stressed about bugs in dead code.

regards,
dan carpenter

diff --git a/check_inconsistent_locking.c b/check_inconsistent_locking.c
index f3cce559d7a6..e95d9110a1e1 100644
--- a/check_inconsistent_locking.c
+++ b/check_inconsistent_locking.c
@@ -67,8 +67,8 @@ static void check_lock_bool(const char *name, struct symbol *sym)
 	FOR_EACH_PTR(get_all_return_strees(), stree) {
 		orig = __swap_cur_stree(stree);
 
-		if (is_impossible_path())
-			goto swap_stree;
+//		if (is_impossible_path())
+//			goto swap_stree;
 
 		return_sm = get_sm_state(RETURN_ID, "return_ranges", NULL);
 		if (!return_sm)
@@ -145,8 +145,8 @@ static void check_lock(const char *name, struct symbol *sym)
 	FOR_EACH_PTR(get_all_return_strees(), stree) {
 		orig = __swap_cur_stree(stree);
 
-		if (is_impossible_path())
-			goto swap_stree;
+//		if (is_impossible_path())
+//			goto swap_stree;
 
 		return_sm = get_sm_state(RETURN_ID, "return_ranges", NULL);
 		if (!return_sm)
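Dan's observation that the range check is dead code on 64-bit can be illustrated with plain integer arithmetic. This is only a sketch under stated assumptions: `half_max()` and `id_rejected()` are hypothetical helpers that model the index width as a parameter rather than using the kernel's `RADIX_TREE_INDEX_BITS`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: largest value that fits into one half of an index
 * that is index_bits wide (only 32 and 64 are modelled in this sketch). */
static uint64_t half_max(unsigned int index_bits)
{
	assert(index_bits == 32 || index_bits == 64);
	return (UINT64_C(1) << (index_bits / 2)) - 1;
}

/* True if a u32 ID would be rejected by the range check for that width. */
static bool id_rejected(uint32_t id, unsigned int index_bits)
{
	return (uint64_t)id > half_max(index_bits);
}
```

With a 64-bit index the half maximum equals the largest u32 value, so no u32 ID can ever be rejected and the `return -EINVAL` is unreachable. With a 32-bit index the half is only 16 bits wide, so the same check can fire.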
On Fri, Aug 01, 2025 at 08:25:58PM +0300, Dan Carpenter wrote:
> On Sun, Jul 27, 2025 at 03:40:14PM +0100, Simon Horman wrote:
> > + Dan Carpenter
> >
> > On Sun, Jul 27, 2025 at 03:09:38PM +0200, Mihai Moldovan wrote:
> > > * On 7/24/25 15:08, Simon Horman wrote:

...

> > This seems to a regression in Smatch wrt this particular case for this
> > code. I bisected Smatch and it looks like it was introduced in commit
> > d0367cd8a993 ("ranges: use absolute instead implied for possibly_true/false")
> >
> > I CCed Dan in case he wants to dig into this.
>
> The code looks like this:
>
> 	spin_lock_irqsave(&qrtr_nodes_lock, flags);
>
> 	if (node->ep->id > QRTR_INDEX_HALF_UNSIGNED_MAX ||
> 	    nid > QRTR_INDEX_HALF_UNSIGNED_MAX)
> 		return -EINVAL;
>
> The problem is that QRTR_INDEX_HALF_UNSIGNED_MAX is U32_MAX and
> node->ep->id and nid are both u32 type. The return statement is dead
> code and I deliberately silenced warnings on impossible paths.
>
> The following patch will enable the warning again and I'll test it tonight
> to see what happens. If it's not too painful then I'll delete it
> properly, but if it's generates a bunch of false positives then, in the
> end, I'm not overly stressed about bugs in dead code.

Thanks Dan,

I think the key point here is that neither Mihai nor I noticed
the dead code. Thanks for pointing that out.

...
On Mon, Aug 04, 2025 at 10:55:22AM +0100, Simon Horman wrote:
> On Fri, Aug 01, 2025 at 08:25:58PM +0300, Dan Carpenter wrote:
> > On Sun, Jul 27, 2025 at 03:40:14PM +0100, Simon Horman wrote:
> > > + Dan Carpenter
> > >
> > > On Sun, Jul 27, 2025 at 03:09:38PM +0200, Mihai Moldovan wrote:
> > > > * On 7/24/25 15:08, Simon Horman wrote:
>
> ...
>
> > > This seems to a regression in Smatch wrt this particular case for this
> > > code. I bisected Smatch and it looks like it was introduced in commit
> > > d0367cd8a993 ("ranges: use absolute instead implied for possibly_true/false")
> > >
> > > I CCed Dan in case he wants to dig into this.
> >
> > The code looks like this:
> >
> > 	spin_lock_irqsave(&qrtr_nodes_lock, flags);
> >
> > 	if (node->ep->id > QRTR_INDEX_HALF_UNSIGNED_MAX ||
> > 	    nid > QRTR_INDEX_HALF_UNSIGNED_MAX)
> > 		return -EINVAL;
> >
> > The problem is that QRTR_INDEX_HALF_UNSIGNED_MAX is U32_MAX and
> > node->ep->id and nid are both u32 type. The return statement is dead
> > code and I deliberately silenced warnings on impossible paths.
> >
> > The following patch will enable the warning again and I'll test it tonight
> > to see what happens. If it's not too painful then I'll delete it
> > properly, but if it's generates a bunch of false positives then, in the
> > end, I'm not overly stressed about bugs in dead code.
>
> Thanks Dan,
>
> I think the key point here is that neither Mihai nor I noticed
> the dead code. Thanks for pointing that out.
>

I did test this over the weekend, btw. Warning about bugs in dead code
does find some "real" bugs and they might actually be real depending on
config. But it also triggers a bunch of false positives which are hard
to solve:

	r = dma_resv_lock(&resv, NULL);
	if (r)
		return r;

The dma_resv_lock() can't fail if you pass NULL as a parameter, so Smatch
says "every return takes the lock", and then "if we fail, we return
without dropping the lock." It's difficult to solve this.

I guess we would have to say "If we're in an impossible return and it was
the locking function which failed then we didn't take the lock." That
would work, but it's sort of a tricky rule to code.

regards,
dan carpenter
* On 7/27/25 16:40, Simon Horman wrote:
> I tried again with the latest head,
> commit 2fb2b9093c5d ("sleep_info: The synchronize_srcu() sleeps").
> And in that case I no longer see the 1st warning, about locking.
> I think this is what you saw too.

Exactly! Together with impossible condition warnings, but those are actually
fine/intended.


> This seems to a regression in Smatch wrt this particular case for this
> code. I bisected Smatch and it looks like it was introduced in commit
> d0367cd8a993 ("ranges: use absolute instead implied for possibly_true/false")

Oh, thank you very much. I suspected that I was just missing a special script
or option or even an addition to Smatch (given that Dan seems to have revamped
its locking check code in 2020), especially since it seems to be so widely
used in kernel development, rather than a bug in the software itself.



Mihai
On Sun, Jul 27, 2025 at 07:33:58PM +0200, Mihai Moldovan wrote:
> * On 7/27/25 16:40, Simon Horman wrote:
> > I tried again with the latest head,
> > commit 2fb2b9093c5d ("sleep_info: The synchronize_srcu() sleeps").
> > And in that case I no longer see the 1st warning, about locking.
> > I think this is what you saw too.
>
> Exactly! Together with impossible condition warnings, but those are actually
> fine/intended.

Yeah, I saw them too.
I agree they are not correctness issues.

>
> > This seems to a regression in Smatch wrt this particular case for this
> > code. I bisected Smatch and it looks like it was introduced in commit
> > d0367cd8a993 ("ranges: use absolute instead implied for possibly_true/false")
> Oh, thank you very much. I suspected that I'm just missing a special script
> or option or even addition to Smash (given that Dan seems to have revamped
> its locking check code in 2020), especially since it seems to be so widely
> used in kernel development, but not a bug in the software itself.

Likewise, thanks for pointing out this problem.
On Thu, 24 Jul 2025 01:24:01 +0200 Mihai Moldovan wrote:
>  	spin_lock_irqsave(&qrtr_nodes_lock, flags);
> -	radix_tree_insert(&qrtr_nodes, nid, node);
> +
> +	if (node->ep->id > QRTR_INDEX_HALF_UNSIGNED_MAX ||
> +	    nid > QRTR_INDEX_HALF_UNSIGNED_MAX)
> +		return -EINVAL;

missing unlock?
-- 
pw-bot: cr