[PATCH] smb: client: Fix netns refcount imbalance causing leaks and use-after-free

Wang Zhaolong posted 1 patch 10 months ago
fs/smb/client/connect.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
[PATCH] smb: client: Fix netns refcount imbalance causing leaks and use-after-free
Posted by Wang Zhaolong 10 months ago
Commit ef7134c7fc48 ("smb: client: Fix use-after-free of network
namespace.") attempted to fix a netns use-after-free issue by manually
adjusting reference counts via sk->sk_net_refcnt and sock_inuse_add().

However, a later commit e9f2517a3e18 ("smb: client: fix TCP timers deadlock
after rmmod") pointed out that the approach of manually setting
sk->sk_net_refcnt in the first commit was technically incorrect, as
sk->sk_net_refcnt should only be set for user sockets. It led to issues
like TCP timers not being cleared properly on close. The second commit
moved to a model of just holding an extra netns reference for
server->ssocket using get_net(), and dropping it when the server is torn
down.

But there remain some gaps in the get_net()/put_net() balancing added by
these commits. The incomplete reference handling in these fixes results
in two issues:

1. Netns refcount leaks[1]

The problem process is as follows:

```
mount.cifs                        cifsd

cifs_do_mount
  cifs_mount
    cifs_mount_get_session
      cifs_get_tcp_session
        get_net()  /* First get net. */
        ip_connect
          generic_ip_connect /* Try port 445 */
            get_net()
            ->connect() /* Failed */
            put_net()
          generic_ip_connect /* Try port 139 */
            get_net() /* Missing matching put_net() for this get_net().*/
      cifs_get_smb_ses
        cifs_negotiate_protocol
          smb2_negotiate
            SMB2_negotiate
              cifs_send_recv
                wait_for_response
                                 cifs_demultiplex_thread
                                   cifs_read_from_socket
                                     cifs_readv_from_socket
                                       cifs_reconnect
                                         cifs_abort_connection
                                           sock_release();
                                           server->ssocket = NULL;
                                           /* Missing put_net() here. */
                                           generic_ip_connect
                                             get_net()
                                             ->connect() /* Failed */
                                             put_net()
                                             sock_release();
                                             server->ssocket = NULL;
          free_rsp_buf
    ...
                                   clean_demultiplex_info
                                     /* It's only called once here. */
                                     put_net()
```

When cifs_reconnect() is triggered, the server->ssocket is released
without a corresponding put_net() for the reference acquired in
generic_ip_connect() before. it ends up calling generic_ip_connect()
again to retry get_net(). After that, server->ssocket is set to NULL
in the error path of generic_ip_connect(), and the net count cannot be
released in the final clean_demultiplex_info() function.

2. Potential use-after-free

The current refcounting scheme can lead to a potential use-after-free issue
in the following scenario:

```
 cifs_do_mount
   cifs_mount
     cifs_mount_get_session
       cifs_get_tcp_session
         get_net()  /* First get net */
           ip_connect
             generic_ip_connect
               get_net()
               bind_socket
	         kernel_bind /* failed */
               put_net()
         /* after out_err_crypto_release label */
         put_net()
         /* after out_err label */
         put_net()
```

In the exception handling process where binding the socket fails, the
get_net() and put_net() calls are unbalanced, which may cause the
server->net reference count to drop to zero and be prematurely released.

To address both issues, this patch ties the netns reference counting to
the server->ssocket and server lifecycles. The extra reference is now
acquired when the server or socket is created, and released when the
socket is destroyed or the server is torn down.

[1]: https://bugzilla.kernel.org/show_bug.cgi?id=219792

Fixes: ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.")
Fixes: e9f2517a3e18 ("smb: client: fix TCP timers deadlock after rmmod")
Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
---
 fs/smb/client/connect.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
index f917de020dd5..0d454149f3b4 100644
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -300,6 +300,7 @@ cifs_abort_connection(struct TCP_Server_Info *server)
 			 server->ssocket->flags);
 		sock_release(server->ssocket);
 		server->ssocket = NULL;
+		put_net(cifs_net_ns(server));
 	}
 	server->sequence_number = 0;
 	server->session_estab = false;
@@ -3115,8 +3116,12 @@ generic_ip_connect(struct TCP_Server_Info *server)
 		/*
 		 * Grab netns reference for the socket.
 		 *
-		 * It'll be released here, on error, or in clean_demultiplex_info() upon server
-		 * teardown.
+		 * This reference will be released in several situations:
+		 * - In the failure path before the cifsd thread is started.
+		 * - In the all place where server->socket is released, it is
+		 *   also set to NULL.
+		 * - Ultimately in clean_demultiplex_info(), during the final
+		 *   teardown.
 		 */
 		get_net(net);
 
@@ -3132,10 +3137,8 @@ generic_ip_connect(struct TCP_Server_Info *server)
 	}
 
 	rc = bind_socket(server);
-	if (rc < 0) {
-		put_net(cifs_net_ns(server));
+	if (rc < 0)
 		return rc;
-	}
 
 	/*
 	 * Eventually check for other socket options to change from
@@ -3181,9 +3184,6 @@ generic_ip_connect(struct TCP_Server_Info *server)
 	if (sport == htons(RFC1001_PORT))
 		rc = ip_rfc1001_connect(server);
 
-	if (rc < 0)
-		put_net(cifs_net_ns(server));
-
 	return rc;
 }
 
-- 
2.34.3
Re: [PATCH] smb: client: Fix netns refcount imbalance causing leaks and use-after-free
Posted by Steve French 9 months, 3 weeks ago
I was looking at this patch in more detail, and it does look important
but I wanted to clarify a few things.  In your detailed description
you mention that the retry on port 139 is missing a call put_net(0

>         ip_connect
>           generic_ip_connect /* Try port 445 */
>             get_net()
>             ->connect() /* Failed */
>             put_net()
>           generic_ip_connect /* Try port 139 */
>             get_net() /* Missing matching put_net() for this get_net().*/

but I found this confusing because generic_ip_connect() doesn't seem
to treat the port 445 vs. port 139 differently (there are only two
places the function does put_net() and the latter on line 3421 looks
like the only one that matters for your example).  Here is the snippet
from generic_ip_connect().  Could you explain why the retry on port
139 example is different here?

        rc = kernel_connect(socket, saddr, slen,
                            server->noblockcnt ? O_NONBLOCK : 0);
        /*
         * When mounting SMB root file systems, we do not want to block in
         * connect. Otherwise bail out and then let cifs_reconnect() perform
         * reconnect failover - if possible.
         */
        if (server->noblockcnt && rc == -EINPROGRESS)
                rc = 0;
        if (rc < 0) {
                cifs_dbg(FYI, "Error %d connecting to server\n", rc);
                trace_smb3_connect_err(server->hostname,
server->conn_id, &server->dstaddr, rc);
                put_net(cifs_net_ns(server));
                sock_release(socket);
                server->ssocket = NULL;
                return rc;
        }




On Tue, Feb 18, 2025 at 8:34 AM Wang Zhaolong <wangzhaolong1@huawei.com> wrote:
>
> Commit ef7134c7fc48 ("smb: client: Fix use-after-free of network
> namespace.") attempted to fix a netns use-after-free issue by manually
> adjusting reference counts via sk->sk_net_refcnt and sock_inuse_add().
>
> However, a later commit e9f2517a3e18 ("smb: client: fix TCP timers deadlock
> after rmmod") pointed out that the approach of manually setting
> sk->sk_net_refcnt in the first commit was technically incorrect, as
> sk->sk_net_refcnt should only be set for user sockets. It led to issues
> like TCP timers not being cleared properly on close. The second commit
> moved to a model of just holding an extra netns reference for
> server->ssocket using get_net(), and dropping it when the server is torn
> down.
>
> But there remain some gaps in the get_net()/put_net() balancing added by
> these commits. The incomplete reference handling in these fixes results
> in two issues:
>
> 1. Netns refcount leaks[1]
>
> The problem process is as follows:
>
> ```
> mount.cifs                        cifsd
>
> cifs_do_mount
>   cifs_mount
>     cifs_mount_get_session
>       cifs_get_tcp_session
>         get_net()  /* First get net. */
>         ip_connect
>           generic_ip_connect /* Try port 445 */
>             get_net()
>             ->connect() /* Failed */
>             put_net()
>           generic_ip_connect /* Try port 139 */
>             get_net() /* Missing matching put_net() for this get_net().*/
>       cifs_get_smb_ses
>         cifs_negotiate_protocol
>           smb2_negotiate
>             SMB2_negotiate
>               cifs_send_recv
>                 wait_for_response
>                                  cifs_demultiplex_thread
>                                    cifs_read_from_socket
>                                      cifs_readv_from_socket
>                                        cifs_reconnect
>                                          cifs_abort_connection
>                                            sock_release();
>                                            server->ssocket = NULL;
>                                            /* Missing put_net() here. */
>                                            generic_ip_connect
>                                              get_net()
>                                              ->connect() /* Failed */
>                                              put_net()
>                                              sock_release();
>                                              server->ssocket = NULL;
>           free_rsp_buf
>     ...
>                                    clean_demultiplex_info
>                                      /* It's only called once here. */
>                                      put_net()
> ```
>
> When cifs_reconnect() is triggered, the server->ssocket is released
> without a corresponding put_net() for the reference acquired in
> generic_ip_connect() before. it ends up calling generic_ip_connect()
> again to retry get_net(). After that, server->ssocket is set to NULL
> in the error path of generic_ip_connect(), and the net count cannot be
> released in the final clean_demultiplex_info() function.
>
> 2. Potential use-after-free
>
> The current refcounting scheme can lead to a potential use-after-free issue
> in the following scenario:
>
> ```
>  cifs_do_mount
>    cifs_mount
>      cifs_mount_get_session
>        cifs_get_tcp_session
>          get_net()  /* First get net */
>            ip_connect
>              generic_ip_connect
>                get_net()
>                bind_socket
>                  kernel_bind /* failed */
>                put_net()
>          /* after out_err_crypto_release label */
>          put_net()
>          /* after out_err label */
>          put_net()
> ```
>
> In the exception handling process where binding the socket fails, the
> get_net() and put_net() calls are unbalanced, which may cause the
> server->net reference count to drop to zero and be prematurely released.
>
> To address both issues, this patch ties the netns reference counting to
> the server->ssocket and server lifecycles. The extra reference is now
> acquired when the server or socket is created, and released when the
> socket is destroyed or the server is torn down.
>
> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=219792
>
> Fixes: ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.")
> Fixes: e9f2517a3e18 ("smb: client: fix TCP timers deadlock after rmmod")
> Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
> ---
>  fs/smb/client/connect.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
> index f917de020dd5..0d454149f3b4 100644
> --- a/fs/smb/client/connect.c
> +++ b/fs/smb/client/connect.c
> @@ -300,6 +300,7 @@ cifs_abort_connection(struct TCP_Server_Info *server)
>                          server->ssocket->flags);
>                 sock_release(server->ssocket);
>                 server->ssocket = NULL;
> +               put_net(cifs_net_ns(server));
>         }
>         server->sequence_number = 0;
>         server->session_estab = false;
> @@ -3115,8 +3116,12 @@ generic_ip_connect(struct TCP_Server_Info *server)
>                 /*
>                  * Grab netns reference for the socket.
>                  *
> -                * It'll be released here, on error, or in clean_demultiplex_info() upon server
> -                * teardown.
> +                * This reference will be released in several situations:
> +                * - In the failure path before the cifsd thread is started.
> +                * - In the all place where server->socket is released, it is
> +                *   also set to NULL.
> +                * - Ultimately in clean_demultiplex_info(), during the final
> +                *   teardown.
>                  */
>                 get_net(net);
>
> @@ -3132,10 +3137,8 @@ generic_ip_connect(struct TCP_Server_Info *server)
>         }
>
>         rc = bind_socket(server);
> -       if (rc < 0) {
> -               put_net(cifs_net_ns(server));
> +       if (rc < 0)
>                 return rc;
> -       }
>
>         /*
>          * Eventually check for other socket options to change from
> @@ -3181,9 +3184,6 @@ generic_ip_connect(struct TCP_Server_Info *server)
>         if (sport == htons(RFC1001_PORT))
>                 rc = ip_rfc1001_connect(server);
>
> -       if (rc < 0)
> -               put_net(cifs_net_ns(server));
> -
>         return rc;
>  }
>
> --
> 2.34.3
>
>


-- 
Thanks,

Steve
Re: [PATCH] smb: client: Fix netns refcount imbalance causing leaks and use-after-free
Posted by Wang Zhaolong 9 months, 1 week ago

> I was looking at this patch in more detail, and it does look important
> but I wanted to clarify a few things.  In your detailed description
> you mention that the retry on port 139 is missing a call put_net(0
> 
>>          ip_connect
>>            generic_ip_connect /* Try port 445 */
>>              get_net()
>>              ->connect() /* Failed */
>>              put_net()
>>            generic_ip_connect /* Try port 139 */
>>              get_net() /* Missing matching put_net() for this get_net().*/
> 
> but I found this confusing because generic_ip_connect() doesn't seem
> to treat the port 445 vs. port 139 differently (there are only two
> places the function does put_net() and the latter on line 3421 looks
> like the only one that matters for your example).  Here is the snippet
> from generic_ip_connect().  Could you explain why the retry on port
> 139 example is different here?
> 
>          rc = kernel_connect(socket, saddr, slen,
>                              server->noblockcnt ? O_NONBLOCK : 0);
>          /*
>           * When mounting SMB root file systems, we do not want to block in
>           * connect. Otherwise bail out and then let cifs_reconnect() perform
>           * reconnect failover - if possible.
>           */
>          if (server->noblockcnt && rc == -EINPROGRESS)
>                  rc = 0;
>          if (rc < 0) {
>                  cifs_dbg(FYI, "Error %d connecting to server\n", rc);
>                  trace_smb3_connect_err(server->hostname,
> server->conn_id, &server->dstaddr, rc);
>                  put_net(cifs_net_ns(server));
>                  sock_release(socket);
>                  server->ssocket = NULL;
>                  return rc;
>          }


Yes, both ports are treated similarly by the client; however, according
to the network packets I captured, the access error was returned by
the server.

This issue was identified through a flawed cifs test case, where
the cifs file system was mounted on the client before the server had
fully initialized the smbd service. Here's a breakdown of the process:

1. First Connection Attempt: The cifs client first attempts to connect
    to port 445. Since the server has not yet completed its initialization,
    it returns an error -111.

2. Subsequent Connection Attempt: The cifs client then tries to connect
    to port 139. At this point, the server has finished initializing, thus
    the `kernel_connect()` API call succeeds.

However, the server seemingly does not support the older data format associated
with port 139, likely due to server configuration settings. When the cifsd thread
processes the message, it detects this incorrect data format and triggers
cifs_reconnect(). This action leads to the issue of leakage.

By maintaining the test case essentially unchanged and simply blocking the
server's port 445 with a firewall, I can easily reproduce this issue.

The network packets I captured are as follows:

5	0.896004	192.168.11.3	192.168.11.2	TCP	74	38246 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM TSval=210053990 TSecr=0 WS=128
6	0.896121	192.168.11.2	192.168.11.3	ICMP	102	Destination unreachable (Port unreachable)
7	0.896287	192.168.11.3	192.168.11.2	TCP	74	52196 → 139 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM TSval=210053990 TSecr=0 WS=128
8	0.896336	192.168.11.2	192.168.11.3	TCP	74	139 → 52196 [SYN, ACK] Seq=0 Ack=1 Win=65160 Len=0 MSS=1460 SACK_PERM TSval=3568249692 TSecr=210053990 WS=128
9	0.896410	192.168.11.3	192.168.11.2	TCP	66	52196 → 139 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=210053990 TSecr=3568249692
10	0.896511	192.168.11.3	192.168.11.2	NBSS	142	Session message
11	0.896541	192.168.11.2	192.168.11.3	TCP	66	139 → 52196 [ACK] Seq=1 Ack=77 Win=65152 Len=0 TSval=3568249692 TSecr=210053990
12	0.898382	192.168.11.3	192.168.11.2	TCP	302	52196 → 139 [PSH, ACK] Seq=77 Ack=1 Win=64256 Len=236 TSval=210053992 TSecr=3568249692 [TCP segment of a reassembled PDU]
13	0.898427	192.168.11.2	192.168.11.3	TCP	66	139 → 52196 [ACK] Seq=1 Ack=313 Win=65024 Len=0 TSval=3568249694 TSecr=210053992
14	0.906736	192.168.11.2	192.168.11.3	TCP	66	139 → 52196 [RST, ACK] Seq=1 Ack=313 Win=65024 Len=0 TSval=3568249702 TSecr=210053992
15	0.906980	192.168.11.3	192.168.11.2	TCP	74	52206 → 139 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM TSval=210054001 TSecr=0 WS=128
16	0.907054	192.168.11.2	192.168.11.3	TCP	74	139 → 52206 [SYN, ACK] Seq=0 Ack=1 Win=65160 Len=0 MSS=1460 SACK_PERM TSval=3568249703 TSecr=210054001 WS=128
17	0.907099	192.168.11.3	192.168.11.2	TCP	66	52206 → 139 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=210054001 TSecr=3568249703
18	0.907175	192.168.11.3	192.168.11.2	TCP	66	52206 → 139 [FIN, ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=210054001 TSecr=3568249703
19	0.910324	192.168.11.2	192.168.11.3	TCP	66	139 → 52206 [ACK] Seq=1 Ack=2 Win=65280 Len=0 TSval=3568249706 TSecr=210054001
20	0.918673	192.168.11.2	192.168.11.3	TCP	66	139 → 52206 [FIN, ACK] Seq=1 Ack=2 Win=65280 Len=0 TSval=3568249714 TSecr=210054001
21	0.918714	192.168.11.3	192.168.11.2	TCP	66	52206 → 139 [ACK] Seq=2 Ack=2 Win=64256 Len=0 TSval=210054012 TSecr=3568249714

The detailed analysis process is as follows:

1. Initial Connection Attempt to Port 445 (SMB Direct):
  - Client (192.168.11.3) initiates a TCP connection to server (192.168.11.2) on port 445
  - Server immediately rejects this with an ICMP "Port unreachable" message
    - This indicates port 445 is either closed or blocked on the server

2. Fallback Connection Attempt to Port 139 (NetBIOS)
  - Client attempts to connect to the server on port 139 (NetBIOS)
  - Three-way TCP handshake succeeds:
      * Client sends SYN
      * Server responds with SYN+ACK
      * Client acknowledges with ACK

3. NetBIOS Session Setup:
  - Client sends a NetBIOS Session Service (NBSS) message
  - Server acknowledges receipt
  - Client sends additional data (236 bytes) as part of a larger PDU (Protocol Data Unit)
  - Server acknowledges this data

4. Connection Reset:
  - Server unexpectedly terminates the connection with RST+ACK flags
    - This indicates the server rejected the client's request after examining the session data

5. Final Connection Attempt:
  - Client tries again on port 139 from a new source port (52206)
  - Three-way handshake completes successfully
  - Client immediately initiates connection termination with FIN+ACK
  - Server acknowledges, then sends its own FIN+ACK
  - Client acknowledges, completing the graceful connection termination

Please let me know if there is any other information you require or
further details I can provide.

Best regards,
Wang Zhaolong

Re: [PATCH] smb: client: Fix netns refcount imbalance causing leaks and use-after-free
Posted by Wang Zhaolong 9 months ago
Friendly piug.

Best regards,
Wang Zhaolong
Re: [PATCH] smb: client: Fix netns refcount imbalance causing leaks and use-after-free
Posted by Steve French 9 months ago
This was excellent work, and the test to reproduce the problem looks
excellent but didn't run on the systems I have tried so far - but main
thing I am waiting on is someone who is able to  1) run the tests you
provided and show the patch fixed it for them  and/or 2) additional
review feedback (the namespace refcounting is more confusing than I
expected) and if there is a good way to debug the namespace refcounts
that would be even more helpful

I would like to include the patch - but badly need others who are able
to reproduce the same problem and verify the fix

On Tue, Mar 18, 2025 at 8:48 AM Wang Zhaolong <wangzhaolong1@huawei.com> wrote:
>
> Friendly piug.
>
> Best regards,
> Wang Zhaolong



-- 
Thanks,

Steve
Re: [PATCH] smb: client: Fix netns refcount imbalance causing leaks and use-after-free
Posted by Steve French 10 months ago
tentatively merged into cifs-2.6.git for-next pending more review and testing

On Tue, Feb 18, 2025 at 8:34 AM Wang Zhaolong <wangzhaolong1@huawei.com> wrote:
>
> Commit ef7134c7fc48 ("smb: client: Fix use-after-free of network
> namespace.") attempted to fix a netns use-after-free issue by manually
> adjusting reference counts via sk->sk_net_refcnt and sock_inuse_add().
>
> However, a later commit e9f2517a3e18 ("smb: client: fix TCP timers deadlock
> after rmmod") pointed out that the approach of manually setting
> sk->sk_net_refcnt in the first commit was technically incorrect, as
> sk->sk_net_refcnt should only be set for user sockets. It led to issues
> like TCP timers not being cleared properly on close. The second commit
> moved to a model of just holding an extra netns reference for
> server->ssocket using get_net(), and dropping it when the server is torn
> down.
>
> But there remain some gaps in the get_net()/put_net() balancing added by
> these commits. The incomplete reference handling in these fixes results
> in two issues:
>
> 1. Netns refcount leaks[1]
>
> The problem process is as follows:
>
> ```
> mount.cifs                        cifsd
>
> cifs_do_mount
>   cifs_mount
>     cifs_mount_get_session
>       cifs_get_tcp_session
>         get_net()  /* First get net. */
>         ip_connect
>           generic_ip_connect /* Try port 445 */
>             get_net()
>             ->connect() /* Failed */
>             put_net()
>           generic_ip_connect /* Try port 139 */
>             get_net() /* Missing matching put_net() for this get_net().*/
>       cifs_get_smb_ses
>         cifs_negotiate_protocol
>           smb2_negotiate
>             SMB2_negotiate
>               cifs_send_recv
>                 wait_for_response
>                                  cifs_demultiplex_thread
>                                    cifs_read_from_socket
>                                      cifs_readv_from_socket
>                                        cifs_reconnect
>                                          cifs_abort_connection
>                                            sock_release();
>                                            server->ssocket = NULL;
>                                            /* Missing put_net() here. */
>                                            generic_ip_connect
>                                              get_net()
>                                              ->connect() /* Failed */
>                                              put_net()
>                                              sock_release();
>                                              server->ssocket = NULL;
>           free_rsp_buf
>     ...
>                                    clean_demultiplex_info
>                                      /* It's only called once here. */
>                                      put_net()
> ```
>
> When cifs_reconnect() is triggered, the server->ssocket is released
> without a corresponding put_net() for the reference acquired in
> generic_ip_connect() before. it ends up calling generic_ip_connect()
> again to retry get_net(). After that, server->ssocket is set to NULL
> in the error path of generic_ip_connect(), and the net count cannot be
> released in the final clean_demultiplex_info() function.
>
> 2. Potential use-after-free
>
> The current refcounting scheme can lead to a potential use-after-free issue
> in the following scenario:
>
> ```
>  cifs_do_mount
>    cifs_mount
>      cifs_mount_get_session
>        cifs_get_tcp_session
>          get_net()  /* First get net */
>            ip_connect
>              generic_ip_connect
>                get_net()
>                bind_socket
>                  kernel_bind /* failed */
>                put_net()
>          /* after out_err_crypto_release label */
>          put_net()
>          /* after out_err label */
>          put_net()
> ```
>
> In the exception handling process where binding the socket fails, the
> get_net() and put_net() calls are unbalanced, which may cause the
> server->net reference count to drop to zero and be prematurely released.
>
> To address both issues, this patch ties the netns reference counting to
> the server->ssocket and server lifecycles. The extra reference is now
> acquired when the server or socket is created, and released when the
> socket is destroyed or the server is torn down.
>
> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=219792
>
> Fixes: ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.")
> Fixes: e9f2517a3e18 ("smb: client: fix TCP timers deadlock after rmmod")
> Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
> ---
>  fs/smb/client/connect.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
> index f917de020dd5..0d454149f3b4 100644
> --- a/fs/smb/client/connect.c
> +++ b/fs/smb/client/connect.c
> @@ -300,6 +300,7 @@ cifs_abort_connection(struct TCP_Server_Info *server)
>                          server->ssocket->flags);
>                 sock_release(server->ssocket);
>                 server->ssocket = NULL;
> +               put_net(cifs_net_ns(server));
>         }
>         server->sequence_number = 0;
>         server->session_estab = false;
> @@ -3115,8 +3116,12 @@ generic_ip_connect(struct TCP_Server_Info *server)
>                 /*
>                  * Grab netns reference for the socket.
>                  *
> -                * It'll be released here, on error, or in clean_demultiplex_info() upon server
> -                * teardown.
> +                * This reference will be released in several situations:
> +                * - In the failure path before the cifsd thread is started.
> +                * - In the all place where server->socket is released, it is
> +                *   also set to NULL.
> +                * - Ultimately in clean_demultiplex_info(), during the final
> +                *   teardown.
>                  */
>                 get_net(net);
>
> @@ -3132,10 +3137,8 @@ generic_ip_connect(struct TCP_Server_Info *server)
>         }
>
>         rc = bind_socket(server);
> -       if (rc < 0) {
> -               put_net(cifs_net_ns(server));
> +       if (rc < 0)
>                 return rc;
> -       }
>
>         /*
>          * Eventually check for other socket options to change from
> @@ -3181,9 +3184,6 @@ generic_ip_connect(struct TCP_Server_Info *server)
>         if (sport == htons(RFC1001_PORT))
>                 rc = ip_rfc1001_connect(server);
>
> -       if (rc < 0)
> -               put_net(cifs_net_ns(server));
> -
>         return rc;
>  }
>
> --
> 2.34.3
>
>


-- 
Thanks,

Steve