[v1] Squash to "Add mptcp_subflow bpf_iter support"

[PATCH mptcp-next 1/3] Squash to "bpf: Add mptcp_subflow bpf_iter"

Posted by Geliang Tang 3 months, 2 weeks ago

From: Geliang Tang <tanggeliang@kylinos.cn>

Drop the NULL check as Martin suggested.

Use the "struct sock *sk" instead of "struct mptcp-sock *msk" as the
argument in the bpf_iter_mptcp_subflow_new as Martin suggested.

Use msk_owned_by_me().

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 net/mptcp/bpf.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index 923895322b2c..def4de34666d 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -234,24 +234,27 @@ bpf_mptcp_subflow_ctx(const struct sock *sk)
 
 __bpf_kfunc static int
 bpf_iter_mptcp_subflow_new(struct bpf_iter_mptcp_subflow *it,
-			   struct mptcp_sock *msk)
+			   struct sock *sk)
 {
 	struct bpf_iter_mptcp_subflow_kern *kit = (void *)it;
-	struct sock *sk = (struct sock *)msk;
+	struct mptcp_sock *msk;
 
 	BUILD_BUG_ON(sizeof(struct bpf_iter_mptcp_subflow_kern) >
 		     sizeof(struct bpf_iter_mptcp_subflow));
 	BUILD_BUG_ON(__alignof__(struct bpf_iter_mptcp_subflow_kern) !=
 		     __alignof__(struct bpf_iter_mptcp_subflow));
 
-	kit->msk = msk;
-	if (!msk)
+	if (unlikely(!sk || !sk_fullsock(sk)))
 		return -EINVAL;
 
-	if (!sock_owned_by_user_nocheck(sk) &&
-	    !spin_is_locked(&sk->sk_lock.slock))
+	if (sk->sk_protocol != IPPROTO_MPTCP)
 		return -EINVAL;
 
+	msk = mptcp_sk(sk);
+	kit->msk = msk;
+
+	msk_owned_by_me(msk);
+
 	kit->pos = &msk->conn_list;
 	return 0;
 }
-- 
2.43.0

Re: [PATCH mptcp-next 1/3] Squash to "bpf: Add mptcp_subflow bpf_iter"

Posted by Mat Martineau 3 months, 1 week ago

On Fri, 24 Jan 2025, Geliang Tang wrote:

> From: Geliang Tang <tanggeliang@kylinos.cn>
>
> Drop the NULL check as Martin suggested.
>
> Use the "struct sock *sk" instead of "struct mptcp-sock *msk" as the
> argument in the bpf_iter_mptcp_subflow_new as Martin suggested.
>
> Use msk_owned_by_me().
>
> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> ---
> net/mptcp/bpf.c | 15 +++++++++------
> 1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
> index 923895322b2c..def4de34666d 100644
> --- a/net/mptcp/bpf.c
> +++ b/net/mptcp/bpf.c
> @@ -234,24 +234,27 @@ bpf_mptcp_subflow_ctx(const struct sock *sk)
>
> __bpf_kfunc static int
> bpf_iter_mptcp_subflow_new(struct bpf_iter_mptcp_subflow *it,
> -			   struct mptcp_sock *msk)
> +			   struct sock *sk)
> {
> 	struct bpf_iter_mptcp_subflow_kern *kit = (void *)it;
> -	struct sock *sk = (struct sock *)msk;
> +	struct mptcp_sock *msk;
>
> 	BUILD_BUG_ON(sizeof(struct bpf_iter_mptcp_subflow_kern) >
> 		     sizeof(struct bpf_iter_mptcp_subflow));
> 	BUILD_BUG_ON(__alignof__(struct bpf_iter_mptcp_subflow_kern) !=
> 		     __alignof__(struct bpf_iter_mptcp_subflow));
>
> -	kit->msk = msk;
> -	if (!msk)
> +	if (unlikely(!sk || !sk_fullsock(sk)))
> 		return -EINVAL;
>
> -	if (!sock_owned_by_user_nocheck(sk) &&
> -	    !spin_is_locked(&sk->sk_lock.slock))
> +	if (sk->sk_protocol != IPPROTO_MPTCP)
> 		return -EINVAL;
>
> +	msk = mptcp_sk(sk);
> +	kit->msk = msk;
> +
> +	msk_owned_by_me(msk);

Hi Geliang -

The convention is to check the owner before assigning the msk.

I also noticed in the reply to patch 5 that Martin mentioned adding a 
"bpf_mptcp_sock_from_sock check" to bpf_iter_mptcp_subflow_new 
(https://lore.kernel.org/netdev/9b373a23-c093-42d8-b4ae-99f2e62e7681@linux.dev/)

- Mat

> +
> 	kit->pos = &msk->conn_list;
> 	return 0;
> }
> -- 
> 2.43.0
>
>
>

Re: [PATCH mptcp-next 1/3] Squash to "bpf: Add mptcp_subflow bpf_iter"

Posted by Geliang Tang 3 months, 1 week ago

Hi Mat,

Thanks for your review.

On Wed, 2025-01-29 at 12:49 -0800, Mat Martineau wrote:
> On Fri, 24 Jan 2025, Geliang Tang wrote:
> 
> > From: Geliang Tang <tanggeliang@kylinos.cn>
> > 
> > Drop the NULL check as Martin suggested.
> > 
> > Use the "struct sock *sk" instead of "struct mptcp-sock *msk" as
> > the
> > argument in the bpf_iter_mptcp_subflow_new as Martin suggested.
> > 
> > Use msk_owned_by_me().
> > 
> > Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> > ---
> > net/mptcp/bpf.c | 15 +++++++++------
> > 1 file changed, 9 insertions(+), 6 deletions(-)
> > 
> > diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
> > index 923895322b2c..def4de34666d 100644
> > --- a/net/mptcp/bpf.c
> > +++ b/net/mptcp/bpf.c
> > @@ -234,24 +234,27 @@ bpf_mptcp_subflow_ctx(const struct sock *sk)
> > 
> > __bpf_kfunc static int
> > bpf_iter_mptcp_subflow_new(struct bpf_iter_mptcp_subflow *it,
> > -			   struct mptcp_sock *msk)
> > +			   struct sock *sk)
> > {
> > 	struct bpf_iter_mptcp_subflow_kern *kit = (void *)it;
> > -	struct sock *sk = (struct sock *)msk;
> > +	struct mptcp_sock *msk;
> > 
> > 	BUILD_BUG_ON(sizeof(struct bpf_iter_mptcp_subflow_kern) >
> > 		     sizeof(struct bpf_iter_mptcp_subflow));
> > 	BUILD_BUG_ON(__alignof__(struct
> > bpf_iter_mptcp_subflow_kern) !=
> > 		     __alignof__(struct bpf_iter_mptcp_subflow));
> > 
> > -	kit->msk = msk;
> > -	if (!msk)
> > +	if (unlikely(!sk || !sk_fullsock(sk)))
> > 		return -EINVAL;
> > 
> > -	if (!sock_owned_by_user_nocheck(sk) &&
> > -	    !spin_is_locked(&sk->sk_lock.slock))
> > +	if (sk->sk_protocol != IPPROTO_MPTCP)
> > 		return -EINVAL;
> > 
> > +	msk = mptcp_sk(sk);
> > +	kit->msk = msk;
> > +
> > +	msk_owned_by_me(msk);
> 
> Hi Geliang -
> 
> The convention is to check the owner before assigning the msk.

Yes, indeed. Updated in v2.

> 
> I also noticed in the reply to patch 5 that Martin mentioned adding a
> "bpf_mptcp_sock_from_sock check" to bpf_iter_mptcp_subflow_new

bpf_mptcp_sock_from_sock should be dropped, so open-code it in
bpf_iter_mptcp_subflow_new.

Thanks,
-Geliang

>  
> (
> https://lore.kernel.org/netdev/9b373a23-c093-42d8-b4ae-99f2e62e7681@li
> nux.dev/)
> 
> - Mat
> 
> > +
> > 	kit->pos = &msk->conn_list;
> > 	return 0;
> > }
> > -- 
> > 2.43.0
> > 
> > 
> >

Re: [PATCH mptcp-next 1/3] Squash to "bpf: Add mptcp_subflow bpf_iter"

Posted by Matthieu Baerts 3 months, 2 weeks ago

Hi Geliang,

(+cc Mat who reviewed the previous versions)

On 24/01/2025 11:28, Geliang Tang wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
> 
> Drop the NULL check as Martin suggested.
> 
> Use the "struct sock *sk" instead of "struct mptcp-sock *msk" as the
> argument in the bpf_iter_mptcp_subflow_new as Martin suggested.
> 
> Use msk_owned_by_me().

Regarding this, do you mind checking my question on the thread initially
sent to the BPF mailing please?

>> This set is only showing the cg sockopt bpf prog but missing the major
>> struct_ops piece. It is hard to comment. I assumed the lock situation is
>> the same for the struct_ops where the lock will be held before calling
>> the struct_ops prog?
> 
> Can we restrict the use of this helper only for some specific callbacks?
> Or for only some specific struct_ops?
> 
> Last time I looked, I thought that you could only restrict it like that:
> "this helper can be used by (any) struct_ops". But then does it mean
> this helper could be called from a TCP BPF CC ops or another unrelated
> one for example? How can we avoid that?
Because with the suggested modification, there is no more checks
regarding the lock situation: it is probably fine now with
[gs]etsockopt(), but what about later with struct_ops? Can we restrict
this kfunc to be used only in some contexts? e.g. a part or all the
MPTCP struct_ops, but not the other ones?

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

Re: [PATCH mptcp-next 1/3] Squash to "bpf: Add mptcp_subflow bpf_iter"

Posted by Geliang Tang 3 months, 1 week ago

Hi Matt,

Thanks for your review.

On Fri, 2025-01-24 at 16:04 +0100, Matthieu Baerts wrote:
> Hi Geliang,
> 
> (+cc Mat who reviewed the previous versions)
> 
> On 24/01/2025 11:28, Geliang Tang wrote:
> > From: Geliang Tang <tanggeliang@kylinos.cn>
> > 
> > Drop the NULL check as Martin suggested.
> > 
> > Use the "struct sock *sk" instead of "struct mptcp-sock *msk" as
> > the
> > argument in the bpf_iter_mptcp_subflow_new as Martin suggested.
> > 
> > Use msk_owned_by_me().
> 
> Regarding this, do you mind checking my question on the thread
> initially
> sent to the BPF mailing please?
> 
> > > This set is only showing the cg sockopt bpf prog but missing the
> > > major
> > > struct_ops piece. It is hard to comment. I assumed the lock
> > > situation is
> > > the same for the struct_ops where the lock will be held before
> > > calling
> > > the struct_ops prog?
> > 
> > Can we restrict the use of this helper only for some specific
> > callbacks?
> > Or for only some specific struct_ops?
> > 
> > Last time I looked, I thought that you could only restrict it like
> > that:
> > "this helper can be used by (any) struct_ops". But then does it
> > mean
> > this helper could be called from a TCP BPF CC ops or another
> > unrelated
> > one for example? How can we avoid that?
> Because with the suggested modification, there is no more checks
> regarding the lock situation: it is probably fine now with
> [gs]etsockopt(), but what about later with struct_ops? Can we
> restrict
> this kfunc to be used only in some contexts? e.g. a part or all the
> MPTCP struct_ops, but not the other ones?

We use register_btf_kfunc_id_set() to allow these kfuncs to be used in
struct_ops:

register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS,
                          &bpf_mptcp_common_kfunc_set);

And we also check if the protocol of the socket is MPTCP in
bpf_iter_mptcp_subflow_new(). So this mptcp_subflow bpf_iter is
restricted to be used for MPTCP. Other struct_ops likes a TCP BPF CC
ops can't use it.

Do you think this restriction is enough?

Another way to restrict kfuncs is to set get_func_proto of struct
bpf_verifier_ops. But this way has not been applied to other BPF iters.

Thanks,
-Geliang

> 
> Cheers,
> Matt

Re: [PATCH mptcp-next 1/3] Squash to "bpf: Add mptcp_subflow bpf_iter"

Posted by Matthieu Baerts 2 months, 3 weeks ago

Hi Geliang,

On 02/02/2025 06:04, Geliang Tang wrote:
> Hi Matt,
> 
> Thanks for your review.
> 
> On Fri, 2025-01-24 at 16:04 +0100, Matthieu Baerts wrote:
>> Hi Geliang,
>>
>> (+cc Mat who reviewed the previous versions)
>>
>> On 24/01/2025 11:28, Geliang Tang wrote:
>>> From: Geliang Tang <tanggeliang@kylinos.cn>
>>>
>>> Drop the NULL check as Martin suggested.
>>>
>>> Use the "struct sock *sk" instead of "struct mptcp-sock *msk" as
>>> the
>>> argument in the bpf_iter_mptcp_subflow_new as Martin suggested.
>>>
>>> Use msk_owned_by_me().
>>
>> Regarding this, do you mind checking my question on the thread
>> initially
>> sent to the BPF mailing please?
>>
>>>> This set is only showing the cg sockopt bpf prog but missing the
>>>> major
>>>> struct_ops piece. It is hard to comment. I assumed the lock
>>>> situation is
>>>> the same for the struct_ops where the lock will be held before
>>>> calling
>>>> the struct_ops prog?
>>>
>>> Can we restrict the use of this helper only for some specific
>>> callbacks?
>>> Or for only some specific struct_ops?
>>>
>>> Last time I looked, I thought that you could only restrict it like
>>> that:
>>> "this helper can be used by (any) struct_ops". But then does it
>>> mean
>>> this helper could be called from a TCP BPF CC ops or another
>>> unrelated
>>> one for example? How can we avoid that?
>> Because with the suggested modification, there is no more checks
>> regarding the lock situation: it is probably fine now with
>> [gs]etsockopt(), but what about later with struct_ops? Can we
>> restrict
>> this kfunc to be used only in some contexts? e.g. a part or all the
>> MPTCP struct_ops, but not the other ones?
> 
> We use register_btf_kfunc_id_set() to allow these kfuncs to be used in
> struct_ops:
> 
> register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS,
>                           &bpf_mptcp_common_kfunc_set);

OK so if I understand correctly, it means that the kfunc that are part
of this 'mptcp_common' set can be used from any struct ops, right? Or in
other words, the new MPTCP kfunc could be used from the TCP BPF CC
struct ops for example, right?

If yes, it might be clearer to ask BPF maintainers how we should design
the new kfunc that are supposed to be used only in the context of the
new BPF packet scheduler, e.g. should we check who own the msk socket.

I can ask this question if you prefer.

> And we also check if the protocol of the socket is MPTCP in
> bpf_iter_mptcp_subflow_new(). So this mptcp_subflow bpf_iter is
> restricted to be used for MPTCP. Other struct_ops likes a TCP BPF CC
> ops can't use it.

I see. For the bpf_iter case, these restrictions should be fine. Maybe
not for the kfunc needed for the scheduler part.

> Do you think this restriction is enough?
> 
> Another way to restrict kfuncs is to set get_func_proto of struct
> bpf_verifier_ops. But this way has not been applied to other BPF iters.

No, I guess for bpf_iter, we don't need that. Probably best to see them
as a "generic thing" and have the previous checks we already have.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.