:p
atchew
Login
In mptcp_pm_create_subflow_or_signal_addr(), rcu_read_(un)lock() are used as expected to iterate over the list of local addresses, but list_for_each_entry() was used instead of list_for_each_entry_rcu() in __lookup_addr() (and lookup_id_by_addr() before). It is important to use this variant which adds the required READ_ONCE() (and diagnostic checks if enabled). Because __lookup_addr() is also used in mptcp_pm_nl_set_flags() where it is called under the pernet->lock because the returned entry might be modified, the _rcu variant cannot be used in all cases. It is then required to create a new helper. Note that this new helper can be reused later to reduce some duplicated code elsewhere in this file. Fixes: 86e39e04482b ("mptcp: keep track of local endpoint still available for each msk") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/mptcp/pm_netlink.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ __lookup_addr(struct pm_nl_pernet *pernet, const struct mptcp_addr_info *info) return NULL; } +static struct mptcp_pm_addr_entry * +__lookup_addr_rcu(struct pm_nl_pernet *pernet, const struct mptcp_addr_info *info) +{ + struct mptcp_pm_addr_entry *entry; + + list_for_each_entry_rcu(entry, &pernet->local_addr_list, list) { + if (mptcp_addresses_equal(&entry->addr, info, entry->addr.port)) + return entry; + } + return NULL; +} + static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) { struct sock *sk = (struct sock *)msk; @@ -XXX,XX +XXX,XX @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) mptcp_local_address((struct sock_common *)msk->first, &mpc_addr); rcu_read_lock(); - entry = __lookup_addr(pernet, &mpc_addr); + entry = __lookup_addr_rcu(pernet, &mpc_addr); if (entry) { __clear_bit(entry->addr.id, msk->pm.id_avail_bitmap); msk->mpc_endpoint_id = entry->addr.id; --- base-commit: 66bee427d4db76bd1afbf0184a2fd1f238a4ee27 change-id: 20241022-mptcp-pm-lookup_addr_rcu-01833ea95155 Best regards, -- Matthieu Baerts (NGI0) <matttbe@kernel.org>
When looking at something else, I noticed that the local endpoint entries list was iterated under rcu_read_lock, but using list_for_each_entry() instead of the _rcu variant. That's what patch 1 is fixing. At the previous meeting, Mat and Christoph mentioned we could use the RCU variant elsewhere. That's what is done in patch 2, for -next then. Patch 3 is a simple change to remove duplicated code. Note: I see that we are using spin_lock_bh(), but the RCU read "locks" are always used without the _bh() variant. Is that OK here, or did we miss something? Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- Changes in v3: - Use list_for_each_entry_rcu() with a 4th parameter: lockdep_is_held() - See individual changelog in the different patches. - Link to v2: https://lore.kernel.org/r/20241025-mptcp-pm-lookup_addr_rcu-v2-0-1478f6c4b205@kernel.org Changes in v2: - Add patch 2 and 3 - Patch 1: avoid > 80 chars per line in __lookup_addr_rcu() + update commit message. - Link to v1: https://lore.kernel.org/r/20241022-mptcp-pm-lookup_addr_rcu-v1-1-19d45f26c872@kernel.org --- Geliang Tang (1): mptcp: pm: avoid code duplication to lookup endp Matthieu Baerts (NGI0) (2): mptcp: pm: use _rcu variant under rcu_read_lock mptcp: pm: lockless list traversal to dump endpoints net/mptcp/pm_netlink.c | 36 +++++++++++++++--------------------- 1 file changed, 15 insertions(+), 21 deletions(-) --- base-commit: ca26062fcd85c61922f543674c5dd0382e2059cd change-id: 20241022-mptcp-pm-lookup_addr_rcu-01833ea95155 Best regards, -- Matthieu Baerts (NGI0) <matttbe@kernel.org>
In mptcp_pm_create_subflow_or_signal_addr(), rcu_read_(un)lock() are used as expected to iterate over the list of local addresses, but list_for_each_entry() was used instead of list_for_each_entry_rcu() in __lookup_addr(). It is important to use this variant which adds the required READ_ONCE() (and diagnostic checks if enabled). Because __lookup_addr() is also used in mptcp_pm_nl_set_flags() where it is called under the pernet->lock and not rcu_read_lock(), an extra condition is then passed to help the diagnostic checks making sure either the associated spin lock or the RCU lock is held. Fixes: 86e39e04482b ("mptcp: keep track of local endpoint still available for each msk") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- v3: - Use list_for_each_entry_rcu() with lockdep_is_held(). - Update commit message accordingly. --- net/mptcp/pm_netlink.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ __lookup_addr(struct pm_nl_pernet *pernet, const struct mptcp_addr_info *info) { struct mptcp_pm_addr_entry *entry; - list_for_each_entry(entry, &pernet->local_addr_list, list) { + list_for_each_entry_rcu(entry, &pernet->local_addr_list, list, + lockdep_is_held(&pernet->lock)) { if (mptcp_addresses_equal(&entry->addr, info, entry->addr.port)) return entry; } -- 2.45.2
To return an endpoint to the userspace via Netlink, and to dump all of them, the endpoint list was iterated while holding the pernet->lock, but only to read the content of the list. In these cases, the spin locks can be replaced by RCU read ones, and use the _rcu variants to iterate over the entries list in a lockless way. Note that the __lookup_addr_by_id() helper has been modified to use the _rcu variants of list_for_each_entry(), but with an extra conditions, so it can be called either while the RCU read lock is held, or when the associated pernet->lock is held. Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- Notes: - This is not a fix, a small improvement for -next. - v3: - Use list_for_each_entry_rcu() with lockdep_is_held() instead of duplicating the helper. So no need to rename the helper with _rcu suffix. - No copy of the bitmap in dump_addr() (Mat) - No need to modify set flags. (Paolo) --- net/mptcp/pm_netlink.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ __lookup_addr_by_id(struct pm_nl_pernet *pernet, unsigned int id) { struct mptcp_pm_addr_entry *entry; - list_for_each_entry(entry, &pernet->local_addr_list, list) { + list_for_each_entry_rcu(entry, &pernet->local_addr_list, list, + lockdep_is_held(&pernet->lock)) { if (entry->addr.id == id) return entry; } @@ -XXX,XX +XXX,XX @@ int mptcp_pm_nl_get_addr(struct sk_buff *skb, struct genl_info *info) goto fail; } - spin_lock_bh(&pernet->lock); + rcu_read_lock(); entry = __lookup_addr_by_id(pernet, addr.addr.id); if (!entry) { GENL_SET_ERR_MSG(info, "address not found"); @@ -XXX,XX +XXX,XX @@ int mptcp_pm_nl_get_addr(struct sk_buff *skb, struct genl_info *info) genlmsg_end(msg, reply); ret = genlmsg_reply(msg, info); - spin_unlock_bh(&pernet->lock); + rcu_read_unlock(); return ret; unlock_fail: - spin_unlock_bh(&pernet->lock); + rcu_read_unlock(); fail: nlmsg_free(msg); @@ -XXX,XX +XXX,XX @@ int mptcp_pm_nl_dump_addr(struct sk_buff *msg, pernet = pm_nl_get_pernet(net); - spin_lock_bh(&pernet->lock); + rcu_read_lock(); for (i = id; i < MPTCP_PM_MAX_ADDR_ID + 1; i++) { if (test_bit(i, pernet->id_bitmap)) { entry = __lookup_addr_by_id(pernet, i); @@ -XXX,XX +XXX,XX @@ int mptcp_pm_nl_dump_addr(struct sk_buff *msg, genlmsg_end(msg, hdr); } } - spin_unlock_bh(&pernet->lock); + rcu_read_unlock(); cb->args[0] = id; return msg->len; -- 2.45.2
From: Geliang Tang <tanggeliang@kylinos.cn> The helper __lookup_addr() can be used in mptcp_pm_nl_get_local_id() and mptcp_pm_nl_is_backup() to simplify the code, and avoid code duplication. Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> --- Notes: - This is also for -next. - v3: - use __lookup_addr() instead of __lookup_addr_rcu() that has been removed in v3. --- net/mptcp/pm_netlink.c | 20 ++++++-------------- 1 file changed, 6 insertions(+), 14 deletions(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ int mptcp_pm_nl_get_local_id(struct mptcp_sock *msk, struct mptcp_addr_info *skc { struct mptcp_pm_addr_entry *entry; struct pm_nl_pernet *pernet; - int ret = -1; + int ret; pernet = pm_nl_get_pernet_from_msk(msk); rcu_read_lock(); - list_for_each_entry_rcu(entry, &pernet->local_addr_list, list) { - if (mptcp_addresses_equal(&entry->addr, skc, entry->addr.port)) { - ret = entry->addr.id; - break; - } - } + entry = __lookup_addr(pernet, skc); + ret = entry ? entry->addr.id : -1; rcu_read_unlock(); if (ret >= 0) return ret; @@ -XXX,XX +XXX,XX @@ bool mptcp_pm_nl_is_backup(struct mptcp_sock *msk, struct mptcp_addr_info *skc) { struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk); struct mptcp_pm_addr_entry *entry; - bool backup = false; + bool backup; rcu_read_lock(); - list_for_each_entry_rcu(entry, &pernet->local_addr_list, list) { - if (mptcp_addresses_equal(&entry->addr, skc, entry->addr.port)) { - backup = !!(entry->flags & MPTCP_PM_ADDR_FLAG_BACKUP); - break; - } - } + entry = __lookup_addr(pernet, skc); + backup = entry && !!(entry->flags & MPTCP_PM_ADDR_FLAG_BACKUP); rcu_read_unlock(); return backup; -- 2.45.2