[PATCH net-next v2] l2tp: fix possible UAF when cleaning up tunnels

James Chapman posted 1 patch 1 year, 7 months ago
net/l2tp/l2tp_core.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
Posted by James Chapman 1 year, 7 months ago
syzbot reported a UAF caused by a race when the L2TP work queue closes a
tunnel at the same time as a userspace thread closes a session in that
tunnel.

Tunnel cleanup is handled by a work queue which iterates through the
sessions contained within a tunnel, and closes them in turn.

Meanwhile, a userspace thread may arbitrarily close a session, either
via a netlink command or by closing the pppox socket in the case of
l2tp_ppp.

The race condition may occur when l2tp_tunnel_closeall walks the list
of sessions in the tunnel and deletes each one.  Currently this is
implemented using list_for_each_safe, but because the list spinlock is
dropped in the loop body, other threads can manipulate the list during
the walk.  This can leave the list iterator pointing at an entry which
is no longer on the list, causing list_for_each_safe to spin.  One
sequence of events which may lead to this is as follows:

 * A tunnel is created, containing two sessions A and B.
 * A thread closes the tunnel, triggering tunnel cleanup via the work
   queue.
 * l2tp_tunnel_closeall runs in the context of the work queue.  It
   removes session A from the tunnel session list, then drops the list
   lock.  At this point the list_for_each_safe temporary variable is
   pointing to the other session on the list, which is session B, and
   the list can be manipulated by other threads since the list lock has
   been released.
 * Userspace closes session B, which removes the session from its parent
   tunnel via l2tp_session_delete.  Since l2tp_tunnel_closeall has
   released the tunnel list lock, l2tp_session_delete is able to call
   list_del_init on the session B list node.
 * Back on the work queue, l2tp_tunnel_closeall resumes execution and
   will now spin forever on the same list entry until the underlying
   session structure is freed, at which point UAF occurs.
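
The sequence above can be replayed in userspace as a rough sketch.  The
list helpers below are minimal stand-ins written for this example, not
the kernel's <linux/list.h>, and replay_race() with its bound parameter
is hypothetical; the real loop has no bound and keeps spinning.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Same effect as the kernel's list_del_init(): unlink, then self-link. */
static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/*
 * Replays the sequence with sessions A and B.  Returns how many loop
 * iterations visit B before the safety bound stops the walk.
 */
static int replay_race(int bound)
{
	struct list_head head, a, b;
	struct list_head *pos, *tmp;
	int visits_to_b = 0;

	list_init(&head);
	list_add_tail(&a, &head);
	list_add_tail(&b, &head);

	/* Work queue: first iteration of list_for_each_safe. */
	pos = head.next;	/* A */
	tmp = pos->next;	/* B, saved as the "safe" next pointer */
	list_del_init(pos);	/* A removed, then the lock is dropped */

	/* Userspace thread: closes session B while the lock is free. */
	list_del_init(&b);	/* b.next == &b from now on */

	/* Work queue resumes: pos = tmp, tmp = pos->next, until pos == &head. */
	for (pos = tmp, tmp = pos->next; pos != &head && bound > 0;
	     pos = tmp, tmp = pos->next, bound--) {
		assert(pos == &b);	/* the walk never leaves B */
		visits_to_b++;
	}
	return visits_to_b;
}
```

Because list_del_init leaves B pointing at itself, the walk never gets
back to the list head, so the iteration count always equals the bound.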

The solution is to iterate over the tunnel's session list using
list_first_entry_or_null to avoid the possibility of the list
iterator pointing at a list item which may be removed during the walk.

Also, have l2tp_tunnel_closeall take a reference on each session while
it processes it, to prevent another thread from freeing it.  Without
that reference, the following race is possible:

	cpu1				cpu2
	---				---
					pppol2tp_release()

	spin_lock_bh(&tunnel->list_lock);
	for (;;) {
		session = list_first_entry_or_null(&tunnel->session_list,
						   struct l2tp_session, list);
		if (!session)
			break;
		list_del_init(&session->list);
		spin_unlock_bh(&tunnel->list_lock);

					l2tp_session_delete(session);

		l2tp_session_delete(session);
		spin_lock_bh(&tunnel->list_lock);
	}
	spin_unlock_bh(&tunnel->list_lock);

Calling l2tp_session_delete on the same session twice isn't a problem
per se, but if cpu2 manages to destroy the socket and drop the session
refcount to zero before cpu1 progresses, cpu1 would hit a UAF.
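
The reworked loop can be modelled in userspace as a single-threaded
sketch.  Everything here is an assumption made for illustration: the
struct layouts, the boolean standing in for tunnel->list_lock, the
plain int refcount, and the racing_close hook which plays the part of
cpu2 while the lock is free.  It is not the l2tp code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

struct session {
	struct list_head list;	/* must stay first: see first_session_or_null() */
	int refcount;		/* stand-in for the session refcount */
	bool dead;		/* stand-in for the session's dead flag */
};

struct tunnel {
	struct list_head session_list;
	bool locked;		/* stand-in for tunnel->list_lock */
};

/* Hypothetical hook: a session that "cpu2" closes while the lock is free. */
static struct session *racing_close;

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

static struct session *first_session_or_null(struct tunnel *t)
{
	struct list_head *first = t->session_list.next;

	/* The cast is valid because list is the first member of struct session. */
	return first == &t->session_list ? NULL : (struct session *)first;
}

static void session_delete(struct tunnel *t, struct session *s)
{
	assert(!t->locked);	/* callers must have dropped the list lock */
	if (s->dead)
		return;		/* a second delete of the same session is a no-op */
	s->dead = true;
	t->locked = true;
	list_del_init(&s->list);
	t->locked = false;
}

static int tunnel_closeall(struct tunnel *t)
{
	struct session *s;
	int closed = 0;

	t->locked = true;
	for (;;) {
		s = first_session_or_null(t);
		if (!s)
			break;
		s->refcount++;		/* hold s across the unlocked window */
		list_del_init(&s->list);
		t->locked = false;
		if (racing_close) {	/* cpu2 runs while the lock is free */
			session_delete(t, racing_close);
			racing_close = NULL;
		}
		session_delete(t, s);
		t->locked = true;
		s->refcount--;
		closed++;
	}
	t->locked = false;
	return closed;
}

static int demo_closeall(void)
{
	struct tunnel t = { .locked = false };
	struct session a = { .refcount = 1 }, b = { .refcount = 1 };

	list_init(&t.session_list);
	list_add_tail(&a.list, &t.session_list);
	list_add_tail(&b.list, &t.session_list);
	racing_close = &b;	/* cpu2 will close B during A's teardown */
	return tunnel_closeall(&t);
}
```

In this model the loop re-reads the list head after retaking the lock,
so B's removal by cpu2 is simply observed as an empty list and the walk
terminates after closing A.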

Reported-by: syzbot+b471b7c936301a59745b@syzkaller.appspotmail.com
Reported-by: syzbot+c041b4ce3a6dfd1e63e2@syzkaller.appspotmail.com
Fixes: d18d3f0a24fc ("l2tp: replace hlist with simple list for per-tunnel session list")

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: Tom Parkin <tparkin@katalix.com>

---
v2:
  - hold session ref when processing tunnel close (Hillf Danton)
v1: https://lore.kernel.org/netdev/20240703185108.1752795-1-jchapman@katalix.com/
---
 net/l2tp/l2tp_core.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 64f446f0930b..2790a51e59e3 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1290,17 +1290,20 @@ static void l2tp_session_unhash(struct l2tp_session *session)
 static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
 {
 	struct l2tp_session *session;
-	struct list_head *pos;
-	struct list_head *tmp;
 
 	spin_lock_bh(&tunnel->list_lock);
 	tunnel->acpt_newsess = false;
-	list_for_each_safe(pos, tmp, &tunnel->session_list) {
-		session = list_entry(pos, struct l2tp_session, list);
+	for (;;) {
+		session = list_first_entry_or_null(&tunnel->session_list,
+						   struct l2tp_session, list);
+		if (!session)
+			break;
+		l2tp_session_inc_refcount(session);
 		list_del_init(&session->list);
 		spin_unlock_bh(&tunnel->list_lock);
 		l2tp_session_delete(session);
 		spin_lock_bh(&tunnel->list_lock);
+		l2tp_session_dec_refcount(session);
 	}
 	spin_unlock_bh(&tunnel->list_lock);
 }
-- 
2.34.1
Re: [PATCH net-next v2] l2tp: fix possible UAF when cleaning up tunnels
Posted by Hillf Danton 1 year, 7 months ago
On Thu,  4 Jul 2024 16:25:08 +0100 James Chapman <jchapman@katalix.com>
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1290,17 +1290,20 @@ static void l2tp_session_unhash(struct l2tp_session *session)
>  static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
>  {
>  	struct l2tp_session *session;
> -	struct list_head *pos;
> -	struct list_head *tmp;
>  
>  	spin_lock_bh(&tunnel->list_lock);
>  	tunnel->acpt_newsess = false;
> -	list_for_each_safe(pos, tmp, &tunnel->session_list) {
> -		session = list_entry(pos, struct l2tp_session, list);
> +	for (;;) {
> +		session = list_first_entry_or_null(&tunnel->session_list,
> +						   struct l2tp_session, list);
> +		if (!session)
> +			break;
> +		l2tp_session_inc_refcount(session);
>  		list_del_init(&session->list);
>  		spin_unlock_bh(&tunnel->list_lock);
>  		l2tp_session_delete(session);
>  		spin_lock_bh(&tunnel->list_lock);
> +		l2tp_session_dec_refcount(session);

Bumping refcount up makes it safe for the current cpu to go thru race
after releasing lock, and if it wins the race, dropping refcount makes
the peer head on uaf.
Re: [PATCH net-next v2] l2tp: fix possible UAF when cleaning up tunnels
Posted by James Chapman 1 year, 7 months ago
On 05/07/2024 11:32, Hillf Danton wrote:
> On Thu,  4 Jul 2024 16:25:08 +0100 James Chapman <jchapman@katalix.com>
>> --- a/net/l2tp/l2tp_core.c
>> +++ b/net/l2tp/l2tp_core.c
>> @@ -1290,17 +1290,20 @@ static void l2tp_session_unhash(struct l2tp_session *session)
>>   static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
>>   {
>>   	struct l2tp_session *session;
>> -	struct list_head *pos;
>> -	struct list_head *tmp;
>>   
>>   	spin_lock_bh(&tunnel->list_lock);
>>   	tunnel->acpt_newsess = false;
>> -	list_for_each_safe(pos, tmp, &tunnel->session_list) {
>> -		session = list_entry(pos, struct l2tp_session, list);
>> +	for (;;) {
>> +		session = list_first_entry_or_null(&tunnel->session_list,
>> +						   struct l2tp_session, list);
>> +		if (!session)
>> +			break;
>> +		l2tp_session_inc_refcount(session);
>>   		list_del_init(&session->list);
>>   		spin_unlock_bh(&tunnel->list_lock);
>>   		l2tp_session_delete(session);
>>   		spin_lock_bh(&tunnel->list_lock);
>> +		l2tp_session_dec_refcount(session);
> 
> Bumping refcount up makes it safe for the current cpu to go thru race
> after releasing lock, and if it wins the race, dropping refcount makes
> the peer head on uaf.

Thanks for reviewing this. Can you elaborate on what you mean by "makes 
the peer head on uaf", please?
Re: [PATCH net-next v2] l2tp: fix possible UAF when cleaning up tunnels
Posted by Hillf Danton 1 year, 7 months ago
On Mon, 8 Jul 2024 11:06:25 +0100 James Chapman <jchapman@katalix.com>
> On 05/07/2024 11:32, Hillf Danton wrote:
> > On Thu,  4 Jul 2024 16:25:08 +0100 James Chapman <jchapman@katalix.com>
> >> --- a/net/l2tp/l2tp_core.c
> >> +++ b/net/l2tp/l2tp_core.c
> >> @@ -1290,17 +1290,20 @@ static void l2tp_session_unhash(struct l2tp_session *session)
> >>   static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
> >>   {
> >>   	struct l2tp_session *session;
> >> -	struct list_head *pos;
> >> -	struct list_head *tmp;
> >>   
> >>   	spin_lock_bh(&tunnel->list_lock);
> >>   	tunnel->acpt_newsess = false;
> >> -	list_for_each_safe(pos, tmp, &tunnel->session_list) {
> >> -		session = list_entry(pos, struct l2tp_session, list);
> >> +	for (;;) {
> >> +		session = list_first_entry_or_null(&tunnel->session_list,
> >> +						   struct l2tp_session, list);
> >> +		if (!session)
> >> +			break;
> >> +		l2tp_session_inc_refcount(session);
> >>   		list_del_init(&session->list);
> >>   		spin_unlock_bh(&tunnel->list_lock);
> >>   		l2tp_session_delete(session);
> >>   		spin_lock_bh(&tunnel->list_lock);
> >> +		l2tp_session_dec_refcount(session);
> > 
> > Bumping refcount up makes it safe for the current cpu to go thru race
> > after releasing lock, and if it wins the race, dropping refcount makes
> > the peer head on uaf.
> 
> Thanks for reviewing this. Can you elaborate on what you mean by "makes 
> the peer head on uaf", please?
>
Given race, there are winner and loser. If the current cpu wins the race,
the loser hits uaf once winner drops refcount.
Re: [PATCH net-next v2] l2tp: fix possible UAF when cleaning up tunnels
Posted by James Chapman 1 year, 7 months ago
On 08/07/2024 12:59, Hillf Danton wrote:
> On Mon, 8 Jul 2024 11:06:25 +0100 James Chapman <jchapman@katalix.com>
>> On 05/07/2024 11:32, Hillf Danton wrote:
>>> On Thu,  4 Jul 2024 16:25:08 +0100 James Chapman <jchapman@katalix.com>
>>>> --- a/net/l2tp/l2tp_core.c
>>>> +++ b/net/l2tp/l2tp_core.c
>>>> @@ -1290,17 +1290,20 @@ static void l2tp_session_unhash(struct l2tp_session *session)
>>>>    static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
>>>>    {
>>>>    	struct l2tp_session *session;
>>>> -	struct list_head *pos;
>>>> -	struct list_head *tmp;
>>>>    
>>>>    	spin_lock_bh(&tunnel->list_lock);
>>>>    	tunnel->acpt_newsess = false;
>>>> -	list_for_each_safe(pos, tmp, &tunnel->session_list) {
>>>> -		session = list_entry(pos, struct l2tp_session, list);
>>>> +	for (;;) {
>>>> +		session = list_first_entry_or_null(&tunnel->session_list,
>>>> +						   struct l2tp_session, list);
>>>> +		if (!session)
>>>> +			break;
>>>> +		l2tp_session_inc_refcount(session);
>>>>    		list_del_init(&session->list);
>>>>    		spin_unlock_bh(&tunnel->list_lock);
>>>>    		l2tp_session_delete(session);
>>>>    		spin_lock_bh(&tunnel->list_lock);
>>>> +		l2tp_session_dec_refcount(session);
>>>
>>> Bumping refcount up makes it safe for the current cpu to go thru race
>>> after releasing lock, and if it wins the race, dropping refcount makes
>>> the peer head on uaf.
>>
>> Thanks for reviewing this. Can you elaborate on what you mean by "makes
>> the peer head on uaf", please?
>>
> Given race, there are winner and loser. If the current cpu wins the race,
> the loser hits uaf once winner drops refcount.

I think the session's dead flag would protect against threads racing in 
l2tp_session_delete to delete the same session.
Any thread with a pointer to a session should hold a reference on it to 
prevent the session going away while it is accessed. Am I missing a 
codepath where that's not the case?
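
As a rough illustration of the dead flag idea, here is a userspace
sketch using a C11 atomic_flag.  The struct and function names are
hypothetical and the teardown counter exists only so the effect can be
observed; the kernel code uses its own bit operations on the session.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct session {
	atomic_flag dead;	/* set by the first thread to start deletion */
	int teardown_runs;	/* how many times the teardown body has run */
};

static void session_init(struct session *s)
{
	assert(s != NULL);
	atomic_flag_clear(&s->dead);
	s->teardown_runs = 0;
}

/* Returns true only for the caller that actually performs the teardown. */
static bool session_delete(struct session *s)
{
	assert(s != NULL);
	/* test-and-set is atomic: exactly one racing caller sees "clear". */
	if (atomic_flag_test_and_set(&s->dead))
		return false;
	s->teardown_runs++;
	return true;
}
```

The flag only makes the teardown run once; each caller still needs its
own reference to keep the session memory valid while it tests the flag.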
Re: [PATCH net-next v2] l2tp: fix possible UAF when cleaning up tunnels
Posted by Paolo Abeni 1 year, 7 months ago
On Mon, 2024-07-08 at 14:57 +0100, James Chapman wrote:
> On 08/07/2024 12:59, Hillf Danton wrote:
> > On Mon, 8 Jul 2024 11:06:25 +0100 James Chapman <jchapman@katalix.com>
> > > On 05/07/2024 11:32, Hillf Danton wrote:
> > > > On Thu,  4 Jul 2024 16:25:08 +0100 James Chapman <jchapman@katalix.com>
> > > > > --- a/net/l2tp/l2tp_core.c
> > > > > +++ b/net/l2tp/l2tp_core.c
> > > > > @@ -1290,17 +1290,20 @@ static void l2tp_session_unhash(struct l2tp_session *session)
> > > > >    static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel)
> > > > >    {
> > > > >    	struct l2tp_session *session;
> > > > > -	struct list_head *pos;
> > > > > -	struct list_head *tmp;
> > > > >    
> > > > >    	spin_lock_bh(&tunnel->list_lock);
> > > > >    	tunnel->acpt_newsess = false;
> > > > > -	list_for_each_safe(pos, tmp, &tunnel->session_list) {
> > > > > -		session = list_entry(pos, struct l2tp_session, list);
> > > > > +	for (;;) {
> > > > > +		session = list_first_entry_or_null(&tunnel->session_list,
> > > > > +						   struct l2tp_session, list);
> > > > > +		if (!session)
> > > > > +			break;
> > > > > +		l2tp_session_inc_refcount(session);
> > > > >    		list_del_init(&session->list);
> > > > >    		spin_unlock_bh(&tunnel->list_lock);
> > > > >    		l2tp_session_delete(session);
> > > > >    		spin_lock_bh(&tunnel->list_lock);
> > > > > +		l2tp_session_dec_refcount(session);
> > > > 
> > > > Bumping refcount up makes it safe for the current cpu to go thru race
> > > > after releasing lock, and if it wins the race, dropping refcount makes
> > > > the peer head on uaf.
> > > 
> > > Thanks for reviewing this. Can you elaborate on what you mean by "makes
> > > the peer head on uaf", please?
> > > 
> > Given race, there are winner and loser. If the current cpu wins the race,
> > the loser hits uaf once winner drops refcount.
> 
> I think the session's dead flag would protect against threads racing in 
> l2tp_session_delete to delete the same session.
> Any thread with a pointer to a session should hold a reference on it to 
> prevent the session going away while it is accessed. Am I missing a 
> codepath where that's not the case?

AFAICS this patch is safe, as the session refcount can't be 0 at
l2tp_session_inc_refcount() time and will drop to 0 after
l2tp_session_dec_refcount() only if no other entity/thread is owning
any reference to the session.

@James: the patch has a formal issue, you should avoid any empty line
in the tag area, specifically between the 'Fixes' and SoB tags.

I'll exceptionally fix this while applying the patch, but please run
checkpatch before your next submission.

Also somewhat related, I think there is still a race condition in
l2tp_tunnel_get_session():

	rcu_read_lock_bh();
	hlist_for_each_entry_rcu(session, session_list, hlist)
		if (session->session_id == session_id) {
			l2tp_session_inc_refcount(session);

I think that at l2tp_session_inc_refcount(), the session refcount could
be 0 due to a concurrent tunnel cleanup. l2tp_session_inc_refcount()
should likely be refcount_inc_not_zero() and the caller should check
the return value.
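
For illustration only, the inc-not-zero pattern can be sketched in
userspace with C11 atomics.  ref_inc_not_zero() is a hypothetical
stand-in for the kernel's refcount_inc_not_zero(), not its
implementation, and it omits the saturation checks that refcount_t
performs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Take a reference only if the object is still live.  Returns false
 * when the count has already reached zero, in which case the caller
 * must treat the lookup as a miss and not touch the object.
 */
static bool ref_inc_not_zero(atomic_int *ref)
{
	int old;

	assert(ref != NULL);
	old = atomic_load(ref);
	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}
```

A lookup built on this would return the session only when the call
succeeds, and skip the entry otherwise.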

In any case the latter is a separate issue.

Thanks,

Paolo
Re: [PATCH net-next v2] l2tp: fix possible UAF when cleaning up tunnels
Posted by James Chapman 1 year, 7 months ago
On 09/07/2024 10:03, Paolo Abeni wrote:
[snip]
> AFAICS this patch is safe, as the session refcount can't be 0 at
> l2tp_session_inc_refcount() time and will drop to 0 after
> l2tp_session_dec_refcount() only if no other entity/thread is owning
> any reference to the session.
> 
> @James: the patch has a formal issue, you should avoid any empty line
> in the tag area, specifically between the 'Fixes' and SoB tags.
> 
> I'll exceptionally fix this while applying the patch, but please run
> checkpatch before your next submission.

Thanks Paolo. Will do. I'll be more careful next time.

> Also somewhat related, I think there is still a race condition in
> l2tp_tunnel_get_session():
> 
> 	rcu_read_lock_bh();
>          hlist_for_each_entry_rcu(session, session_list, hlist)
>                  if (session->session_id == session_id) {
>                          l2tp_session_inc_refcount(session);
> 
> I think that at l2tp_session_inc_refcount(), the session refcount could
> be 0 due to a concurrent tunnel cleanup. l2tp_session_inc_refcount()
> should likely be refcount_inc_not_zero() and the caller should check
> the return value.
> 
> In any case the latter is a separate issue.

I'm currently working on another series which will address this along 
with more l2tp cleanup improvements.