[PATCH net 1/3] vsock: Fix transport_{h2g,g2h} TOCTOU

Michal Luczaj posted 3 patches 3 months, 3 weeks ago
There is a newer version of this series
[PATCH net 1/3] vsock: Fix transport_{h2g,g2h} TOCTOU
Posted by Michal Luczaj 3 months, 3 weeks ago
Checking transport_{h2g,g2h} != NULL may race with vsock_core_unregister().
Make sure pointers remain valid.

KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
RIP: 0010:vsock_dev_do_ioctl.isra.0+0x58/0xf0
Call Trace:
 __x64_sys_ioctl+0x12d/0x190
 do_syscall_64+0x92/0x1c0
 entry_SYSCALL_64_after_hwframe+0x4b/0x53

Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 net/vmw_vsock/af_vsock.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 2e7a3034e965db30b6ee295370d866e6d8b1c341..047d1bc773fab9c315a6ccd383a451fa11fb703e 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -2541,6 +2541,8 @@ static long vsock_dev_do_ioctl(struct file *filp,
 
 	switch (cmd) {
 	case IOCTL_VM_SOCKETS_GET_LOCAL_CID:
+		mutex_lock(&vsock_register_mutex);
+
 		/* To be compatible with the VMCI behavior, we prioritize the
 		 * guest CID instead of well-know host CID (VMADDR_CID_HOST).
 		 */
@@ -2549,6 +2551,8 @@ static long vsock_dev_do_ioctl(struct file *filp,
 		else if (transport_h2g)
 			cid = transport_h2g->get_local_cid();
 
+		mutex_unlock(&vsock_register_mutex);
+
 		if (put_user(cid, p) != 0)
 			retval = -EFAULT;
 		break;

-- 
2.49.0
Re: [PATCH net 1/3] vsock: Fix transport_{h2g,g2h} TOCTOU
Posted by Stefano Garzarella 3 months, 3 weeks ago
On Wed, Jun 18, 2025 at 02:34:00PM +0200, Michal Luczaj wrote:
>Checking transport_{h2g,g2h} != NULL may race with vsock_core_unregister().
>Make sure pointers remain valid.
>
>KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
>RIP: 0010:vsock_dev_do_ioctl.isra.0+0x58/0xf0
>Call Trace:
> __x64_sys_ioctl+0x12d/0x190
> do_syscall_64+0x92/0x1c0
> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>
>Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
>Signed-off-by: Michal Luczaj <mhal@rbox.co>
>---
> net/vmw_vsock/af_vsock.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 2e7a3034e965db30b6ee295370d866e6d8b1c341..047d1bc773fab9c315a6ccd383a451fa11fb703e 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -2541,6 +2541,8 @@ static long vsock_dev_do_ioctl(struct file *filp,
>
> 	switch (cmd) {
> 	case IOCTL_VM_SOCKETS_GET_LOCAL_CID:
>+		mutex_lock(&vsock_register_mutex);
>+
> 		/* To be compatible with the VMCI behavior, we prioritize the
> 		 * guest CID instead of well-know host CID (VMADDR_CID_HOST).
> 		 */
>@@ -2549,6 +2551,8 @@ static long vsock_dev_do_ioctl(struct file *filp,
> 		else if (transport_h2g)
> 			cid = transport_h2g->get_local_cid();
>
>+		mutex_unlock(&vsock_register_mutex);


What about if we introduce a new `vsock_get_local_cid`:

u32 vsock_get_local_cid() {
	u32 cid = VMADDR_CID_ANY;

	mutex_lock(&vsock_register_mutex);
	/* To be compatible with the VMCI behavior, we prioritize the
	 * guest CID instead of well-know host CID (VMADDR_CID_HOST).
	 */
	if (transport_g2h)
		cid = transport_g2h->get_local_cid();
	else if (transport_h2g)
		cid = transport_h2g->get_local_cid();
	mutex_lock(&vsock_register_mutex);

	return cid;
}


And we use it here, and in the place fixed by next patch?

I think we can fix all in a single patch, the problem here is to call 
transport_*->get_local_cid() without the lock IIUC.

Thanks,
Stefano

>+
> 		if (put_user(cid, p) != 0)
> 			retval = -EFAULT;
> 		break;
>
>-- 
>2.49.0
>
Re: [PATCH net 1/3] vsock: Fix transport_{h2g,g2h} TOCTOU
Posted by Michal Luczaj 3 months, 2 weeks ago
On 6/20/25 10:32, Stefano Garzarella wrote:
> On Wed, Jun 18, 2025 at 02:34:00PM +0200, Michal Luczaj wrote:
>> Checking transport_{h2g,g2h} != NULL may race with vsock_core_unregister().
>> Make sure pointers remain valid.
>>
>> KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
>> RIP: 0010:vsock_dev_do_ioctl.isra.0+0x58/0xf0
>> Call Trace:
>> __x64_sys_ioctl+0x12d/0x190
>> do_syscall_64+0x92/0x1c0
>> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>>
>> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
>> Signed-off-by: Michal Luczaj <mhal@rbox.co>
>> ---
>> net/vmw_vsock/af_vsock.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>> index 2e7a3034e965db30b6ee295370d866e6d8b1c341..047d1bc773fab9c315a6ccd383a451fa11fb703e 100644
>> --- a/net/vmw_vsock/af_vsock.c
>> +++ b/net/vmw_vsock/af_vsock.c
>> @@ -2541,6 +2541,8 @@ static long vsock_dev_do_ioctl(struct file *filp,
>>
>> 	switch (cmd) {
>> 	case IOCTL_VM_SOCKETS_GET_LOCAL_CID:
>> +		mutex_lock(&vsock_register_mutex);
>> +
>> 		/* To be compatible with the VMCI behavior, we prioritize the
>> 		 * guest CID instead of well-know host CID (VMADDR_CID_HOST).
>> 		 */
>> @@ -2549,6 +2551,8 @@ static long vsock_dev_do_ioctl(struct file *filp,
>> 		else if (transport_h2g)
>> 			cid = transport_h2g->get_local_cid();
>>
>> +		mutex_unlock(&vsock_register_mutex);
> 
> 
> What about if we introduce a new `vsock_get_local_cid`:
> 
> u32 vsock_get_local_cid() {
> 	u32 cid = VMADDR_CID_ANY;
> 
> 	mutex_lock(&vsock_register_mutex);
> 	/* To be compatible with the VMCI behavior, we prioritize the
> 	 * guest CID instead of well-know host CID (VMADDR_CID_HOST).
> 	 */
> 	if (transport_g2h)
> 		cid = transport_g2h->get_local_cid();
> 	else if (transport_h2g)
> 		cid = transport_h2g->get_local_cid();
> 	mutex_lock(&vsock_register_mutex);
> 
> 	return cid;
> }
> 
> 
> And we use it here, and in the place fixed by next patch?
> 
> I think we can fix all in a single patch, the problem here is to call 
> transport_*->get_local_cid() without the lock IIUC.

Do you mean:

 bool vsock_find_cid(unsigned int cid)
 {
-       if (transport_g2h && cid == transport_g2h->get_local_cid())
+       if (transport_g2h && cid == vsock_get_local_cid())
                return true;

?

So we need to check transport_g2h twice; in vsock_find_cid() and then again
in vsock_get_local_cid(), right?
Re: [PATCH net 1/3] vsock: Fix transport_{h2g,g2h} TOCTOU
Posted by Stefano Garzarella 3 months, 2 weeks ago
On Fri, Jun 20, 2025 at 02:58:49PM +0200, Michal Luczaj wrote:
>On 6/20/25 10:32, Stefano Garzarella wrote:
>> On Wed, Jun 18, 2025 at 02:34:00PM +0200, Michal Luczaj wrote:
>>> Checking transport_{h2g,g2h} != NULL may race with vsock_core_unregister().
>>> Make sure pointers remain valid.
>>>
>>> KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
>>> RIP: 0010:vsock_dev_do_ioctl.isra.0+0x58/0xf0
>>> Call Trace:
>>> __x64_sys_ioctl+0x12d/0x190
>>> do_syscall_64+0x92/0x1c0
>>> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>>>
>>> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
>>> Signed-off-by: Michal Luczaj <mhal@rbox.co>
>>> ---
>>> net/vmw_vsock/af_vsock.c | 4 ++++
>>> 1 file changed, 4 insertions(+)
>>>
>>> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>>> index 2e7a3034e965db30b6ee295370d866e6d8b1c341..047d1bc773fab9c315a6ccd383a451fa11fb703e 100644
>>> --- a/net/vmw_vsock/af_vsock.c
>>> +++ b/net/vmw_vsock/af_vsock.c
>>> @@ -2541,6 +2541,8 @@ static long vsock_dev_do_ioctl(struct file *filp,
>>>
>>> 	switch (cmd) {
>>> 	case IOCTL_VM_SOCKETS_GET_LOCAL_CID:
>>> +		mutex_lock(&vsock_register_mutex);
>>> +
>>> 		/* To be compatible with the VMCI behavior, we prioritize the
>>> 		 * guest CID instead of well-know host CID (VMADDR_CID_HOST).
>>> 		 */
>>> @@ -2549,6 +2551,8 @@ static long vsock_dev_do_ioctl(struct file *filp,
>>> 		else if (transport_h2g)
>>> 			cid = transport_h2g->get_local_cid();
>>>
>>> +		mutex_unlock(&vsock_register_mutex);
>>
>>
>> What about if we introduce a new `vsock_get_local_cid`:
>>
>> u32 vsock_get_local_cid() {
>> 	u32 cid = VMADDR_CID_ANY;
>>
>> 	mutex_lock(&vsock_register_mutex);
>> 	/* To be compatible with the VMCI behavior, we prioritize the
>> 	 * guest CID instead of well-know host CID (VMADDR_CID_HOST).
>> 	 */
>> 	if (transport_g2h)
>> 		cid = transport_g2h->get_local_cid();
>> 	else if (transport_h2g)
>> 		cid = transport_h2g->get_local_cid();
>> 	mutex_lock(&vsock_register_mutex);
>>
>> 	return cid;
>> }
>>
>>
>> And we use it here, and in the place fixed by next patch?
>>
>> I think we can fix all in a single patch, the problem here is to call
>> transport_*->get_local_cid() without the lock IIUC.
>
>Do you mean:
>
> bool vsock_find_cid(unsigned int cid)
> {
>-       if (transport_g2h && cid == transport_g2h->get_local_cid())
>+       if (transport_g2h && cid == vsock_get_local_cid())
>                return true;
>
>?

Nope, I meant:

  bool vsock_find_cid(unsigned int cid)
  {
-       if (transport_g2h && cid == transport_g2h->get_local_cid())
-               return true;
-
-       if (transport_h2g && cid == VMADDR_CID_HOST)
+       if (cid == vsock_get_local_cid())
                 return true;

         if (transport_local && cid == VMADDR_CID_LOCAL)

But now I'm thinking if we should also include `transport_local` in the 
new `vsock_get_local_cid()`.

I think that will fix an issue when calling 
IOCTL_VM_SOCKETS_GET_LOCAL_CID and only vsock-loopback kernel module is 
loaded, so maybe we can do 2 patches:

1. fix IOCTL_VM_SOCKETS_GET_LOCAL_CID to check also `transport_local`
    Fixes: 0e12190578d0 ("vsock: add local transport support in the vsock core")

2. move that code in vsock_get_local_cid() with proper locking and use 
it also in vsock_find_cid()

WDYT?

Thanks,
Stefano
Re: [PATCH net 1/3] vsock: Fix transport_{h2g,g2h} TOCTOU
Posted by Michal Luczaj 3 months, 2 weeks ago
On 6/20/25 15:20, Stefano Garzarella wrote:
> On Fri, Jun 20, 2025 at 02:58:49PM +0200, Michal Luczaj wrote:
>> On 6/20/25 10:32, Stefano Garzarella wrote:
>>> On Wed, Jun 18, 2025 at 02:34:00PM +0200, Michal Luczaj wrote:
>>>> Checking transport_{h2g,g2h} != NULL may race with vsock_core_unregister().
>>>> Make sure pointers remain valid.
>>>>
>>>> KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
>>>> RIP: 0010:vsock_dev_do_ioctl.isra.0+0x58/0xf0
>>>> Call Trace:
>>>> __x64_sys_ioctl+0x12d/0x190
>>>> do_syscall_64+0x92/0x1c0
>>>> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>>>>
>>>> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
>>>> Signed-off-by: Michal Luczaj <mhal@rbox.co>
>>>> ---
>>>> net/vmw_vsock/af_vsock.c | 4 ++++
>>>> 1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>>>> index 2e7a3034e965db30b6ee295370d866e6d8b1c341..047d1bc773fab9c315a6ccd383a451fa11fb703e 100644
>>>> --- a/net/vmw_vsock/af_vsock.c
>>>> +++ b/net/vmw_vsock/af_vsock.c
>>>> @@ -2541,6 +2541,8 @@ static long vsock_dev_do_ioctl(struct file *filp,
>>>>
>>>> 	switch (cmd) {
>>>> 	case IOCTL_VM_SOCKETS_GET_LOCAL_CID:
>>>> +		mutex_lock(&vsock_register_mutex);
>>>> +
>>>> 		/* To be compatible with the VMCI behavior, we prioritize the
>>>> 		 * guest CID instead of well-know host CID (VMADDR_CID_HOST).
>>>> 		 */
>>>> @@ -2549,6 +2551,8 @@ static long vsock_dev_do_ioctl(struct file *filp,
>>>> 		else if (transport_h2g)
>>>> 			cid = transport_h2g->get_local_cid();
>>>>
>>>> +		mutex_unlock(&vsock_register_mutex);
>>>
>>>
>>> What about if we introduce a new `vsock_get_local_cid`:
>>>
>>> u32 vsock_get_local_cid() {
>>> 	u32 cid = VMADDR_CID_ANY;
>>>
>>> 	mutex_lock(&vsock_register_mutex);
>>> 	/* To be compatible with the VMCI behavior, we prioritize the
>>> 	 * guest CID instead of well-know host CID (VMADDR_CID_HOST).
>>> 	 */
>>> 	if (transport_g2h)
>>> 		cid = transport_g2h->get_local_cid();
>>> 	else if (transport_h2g)
>>> 		cid = transport_h2g->get_local_cid();
>>> 	mutex_lock(&vsock_register_mutex);
>>>
>>> 	return cid;
>>> }
>>>
>>>
>>> And we use it here, and in the place fixed by next patch?
>>>
>>> I think we can fix all in a single patch, the problem here is to call
>>> transport_*->get_local_cid() without the lock IIUC.
>>
>> Do you mean:
>>
>> bool vsock_find_cid(unsigned int cid)
>> {
>> -       if (transport_g2h && cid == transport_g2h->get_local_cid())
>> +       if (transport_g2h && cid == vsock_get_local_cid())
>>                return true;
>>
>> ?
> 
> Nope, I meant:
> 
>   bool vsock_find_cid(unsigned int cid)
>   {
> -       if (transport_g2h && cid == transport_g2h->get_local_cid())
> -               return true;
> -
> -       if (transport_h2g && cid == VMADDR_CID_HOST)
> +       if (cid == vsock_get_local_cid())
>                  return true;
> 
>          if (transport_local && cid == VMADDR_CID_LOCAL)

But it does change the behaviour, doesn't it? With this patch, (with g2h
loaded) if cid fails to match g2h->get_local_cid(), we don't fall back to
h2g case any more, i.e. no more comparing cid with VMADDR_CID_HOST.

> But now I'm thinking if we should also include `transport_local` in the 
> new `vsock_get_local_cid()`.
> 
> I think that will fix an issue when calling 
> IOCTL_VM_SOCKETS_GET_LOCAL_CID and only vsock-loopback kernel module is 
> loaded, so maybe we can do 2 patches:
> 
> 1. fix IOCTL_VM_SOCKETS_GET_LOCAL_CID to check also `transport_local`
>     Fixes: 0e12190578d0 ("vsock: add local transport support in the vsock core")

What would be the transport priority with transport_local thrown in? E.g.
if we have both local and g2h, ioctl should return VMADDR_CID_LOCAL or
transport_g2h->get_local_cid()?

> 2. move that code in vsock_get_local_cid() with proper locking and use 
> it also in vsock_find_cid()
> 
> WDYT?

Yeah, sure about 1, I'll add it to the series. I'm just still not certain
how useful vsock_get_local_cid() would be for vsock_find_cid().
Re: [PATCH net 1/3] vsock: Fix transport_{h2g,g2h} TOCTOU
Posted by Stefano Garzarella 3 months, 2 weeks ago
On Fri, 20 Jun 2025 at 16:23, Michal Luczaj <mhal@rbox.co> wrote:
>
> On 6/20/25 15:20, Stefano Garzarella wrote:
> > On Fri, Jun 20, 2025 at 02:58:49PM +0200, Michal Luczaj wrote:
> >> On 6/20/25 10:32, Stefano Garzarella wrote:
> >>> On Wed, Jun 18, 2025 at 02:34:00PM +0200, Michal Luczaj wrote:
> >>>> Checking transport_{h2g,g2h} != NULL may race with vsock_core_unregister().
> >>>> Make sure pointers remain valid.
> >>>>
> >>>> KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
> >>>> RIP: 0010:vsock_dev_do_ioctl.isra.0+0x58/0xf0
> >>>> Call Trace:
> >>>> __x64_sys_ioctl+0x12d/0x190
> >>>> do_syscall_64+0x92/0x1c0
> >>>> entry_SYSCALL_64_after_hwframe+0x4b/0x53
> >>>>
> >>>> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
> >>>> Signed-off-by: Michal Luczaj <mhal@rbox.co>
> >>>> ---
> >>>> net/vmw_vsock/af_vsock.c | 4 ++++
> >>>> 1 file changed, 4 insertions(+)
> >>>>
> >>>> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> >>>> index 2e7a3034e965db30b6ee295370d866e6d8b1c341..047d1bc773fab9c315a6ccd383a451fa11fb703e 100644
> >>>> --- a/net/vmw_vsock/af_vsock.c
> >>>> +++ b/net/vmw_vsock/af_vsock.c
> >>>> @@ -2541,6 +2541,8 @@ static long vsock_dev_do_ioctl(struct file *filp,
> >>>>
> >>>>    switch (cmd) {
> >>>>    case IOCTL_VM_SOCKETS_GET_LOCAL_CID:
> >>>> +          mutex_lock(&vsock_register_mutex);
> >>>> +
> >>>>            /* To be compatible with the VMCI behavior, we prioritize the
> >>>>             * guest CID instead of well-know host CID (VMADDR_CID_HOST).
> >>>>             */
> >>>> @@ -2549,6 +2551,8 @@ static long vsock_dev_do_ioctl(struct file *filp,
> >>>>            else if (transport_h2g)
> >>>>                    cid = transport_h2g->get_local_cid();
> >>>>
> >>>> +          mutex_unlock(&vsock_register_mutex);
> >>>
> >>>
> >>> What about if we introduce a new `vsock_get_local_cid`:
> >>>
> >>> u32 vsock_get_local_cid() {
> >>>     u32 cid = VMADDR_CID_ANY;
> >>>
> >>>     mutex_lock(&vsock_register_mutex);
> >>>     /* To be compatible with the VMCI behavior, we prioritize the
> >>>      * guest CID instead of well-know host CID (VMADDR_CID_HOST).
> >>>      */
> >>>     if (transport_g2h)
> >>>             cid = transport_g2h->get_local_cid();
> >>>     else if (transport_h2g)
> >>>             cid = transport_h2g->get_local_cid();
> >>>     mutex_lock(&vsock_register_mutex);
> >>>
> >>>     return cid;
> >>> }
> >>>
> >>>
> >>> And we use it here, and in the place fixed by next patch?
> >>>
> >>> I think we can fix all in a single patch, the problem here is to call
> >>> transport_*->get_local_cid() without the lock IIUC.
> >>
> >> Do you mean:
> >>
> >> bool vsock_find_cid(unsigned int cid)
> >> {
> >> -       if (transport_g2h && cid == transport_g2h->get_local_cid())
> >> +       if (transport_g2h && cid == vsock_get_local_cid())
> >>                return true;
> >>
> >> ?
> >
> > Nope, I meant:
> >
> >   bool vsock_find_cid(unsigned int cid)
> >   {
> > -       if (transport_g2h && cid == transport_g2h->get_local_cid())
> > -               return true;
> > -
> > -       if (transport_h2g && cid == VMADDR_CID_HOST)
> > +       if (cid == vsock_get_local_cid())
> >                  return true;
> >
> >          if (transport_local && cid == VMADDR_CID_LOCAL)
>
> But it does change the behaviour, doesn't it? With this patch, (with g2h
> loaded) if cid fails to match g2h->get_local_cid(), we don't fall back to
> h2g case any more, i.e. no more comparing cid with VMADDR_CID_HOST.

It's friday... yep, you're right!

>
> > But now I'm thinking if we should also include `transport_local` in the
> > new `vsock_get_local_cid()`.
> >
> > I think that will fix an issue when calling
> > IOCTL_VM_SOCKETS_GET_LOCAL_CID and only vsock-loopback kernel module is
> > loaded, so maybe we can do 2 patches:
> >
> > 1. fix IOCTL_VM_SOCKETS_GET_LOCAL_CID to check also `transport_local`
> >     Fixes: 0e12190578d0 ("vsock: add local transport support in the vsock core")
>
> What would be the transport priority with transport_local thrown in? E.g.
> if we have both local and g2h, ioctl should return VMADDR_CID_LOCAL or
> transport_g2h->get_local_cid()?

Should return the G2H, LOCAL is more for debug/test, so I'd return it
only if anything else is loaded.

>
> > 2. move that code in vsock_get_local_cid() with proper locking and use
> > it also in vsock_find_cid()
> >
> > WDYT?
>
> Yeah, sure about 1, I'll add it to the series. I'm just still not certain
> how useful vsock_get_local_cid() would be for vsock_find_cid().
>

Feel free to drop 1 too, we can send it later if it's not really
related to this issue.
About the series, maybe it is better to have a single patch that fixes
the access to ->get_local_cid() with proper locking.
But I don't have a strong opinion on that. I see it like a single
problem to fix, but up to you.

Thanks,
Stefano
Re: [PATCH net 1/3] vsock: Fix transport_{h2g,g2h} TOCTOU
Posted by Michal Luczaj 3 months, 2 weeks ago
On 6/20/25 16:43, Stefano Garzarella wrote:
> On Fri, 20 Jun 2025 at 16:23, Michal Luczaj <mhal@rbox.co> wrote:
>>
>> On 6/20/25 15:20, Stefano Garzarella wrote:
>>> On Fri, Jun 20, 2025 at 02:58:49PM +0200, Michal Luczaj wrote:
>>>> On 6/20/25 10:32, Stefano Garzarella wrote:
>>>>> On Wed, Jun 18, 2025 at 02:34:00PM +0200, Michal Luczaj wrote:
>>>>>> Checking transport_{h2g,g2h} != NULL may race with vsock_core_unregister().
>>>>>> Make sure pointers remain valid.
>>>>>>
>>>>>> KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
>>>>>> RIP: 0010:vsock_dev_do_ioctl.isra.0+0x58/0xf0
>>>>>> Call Trace:
>>>>>> __x64_sys_ioctl+0x12d/0x190
>>>>>> do_syscall_64+0x92/0x1c0
>>>>>> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>>>>>>
>>>>>> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
>>>>>> Signed-off-by: Michal Luczaj <mhal@rbox.co>
>>>>>> ---
>>>>>> net/vmw_vsock/af_vsock.c | 4 ++++
>>>>>> 1 file changed, 4 insertions(+)
>>>>>>
>>>>>> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>>>>>> index 2e7a3034e965db30b6ee295370d866e6d8b1c341..047d1bc773fab9c315a6ccd383a451fa11fb703e 100644
>>>>>> --- a/net/vmw_vsock/af_vsock.c
>>>>>> +++ b/net/vmw_vsock/af_vsock.c
>>>>>> @@ -2541,6 +2541,8 @@ static long vsock_dev_do_ioctl(struct file *filp,
>>>>>>
>>>>>>    switch (cmd) {
>>>>>>    case IOCTL_VM_SOCKETS_GET_LOCAL_CID:
>>>>>> +          mutex_lock(&vsock_register_mutex);
>>>>>> +
>>>>>>            /* To be compatible with the VMCI behavior, we prioritize the
>>>>>>             * guest CID instead of well-know host CID (VMADDR_CID_HOST).
>>>>>>             */
>>>>>> @@ -2549,6 +2551,8 @@ static long vsock_dev_do_ioctl(struct file *filp,
>>>>>>            else if (transport_h2g)
>>>>>>                    cid = transport_h2g->get_local_cid();
>>>>>>
>>>>>> +          mutex_unlock(&vsock_register_mutex);
>>>>>
>>>>>
>>>>> What about if we introduce a new `vsock_get_local_cid`:
>>>>>
>>>>> u32 vsock_get_local_cid() {
>>>>>     u32 cid = VMADDR_CID_ANY;
>>>>>
>>>>>     mutex_lock(&vsock_register_mutex);
>>>>>     /* To be compatible with the VMCI behavior, we prioritize the
>>>>>      * guest CID instead of well-know host CID (VMADDR_CID_HOST).
>>>>>      */
>>>>>     if (transport_g2h)
>>>>>             cid = transport_g2h->get_local_cid();
>>>>>     else if (transport_h2g)
>>>>>             cid = transport_h2g->get_local_cid();
>>>>>     mutex_lock(&vsock_register_mutex);
>>>>>
>>>>>     return cid;
>>>>> }
>>>>>
>>>>>
>>>>> And we use it here, and in the place fixed by next patch?
>>>>>
>>>>> I think we can fix all in a single patch, the problem here is to call
>>>>> transport_*->get_local_cid() without the lock IIUC.
>>>>
>>>> Do you mean:
>>>>
>>>> bool vsock_find_cid(unsigned int cid)
>>>> {
>>>> -       if (transport_g2h && cid == transport_g2h->get_local_cid())
>>>> +       if (transport_g2h && cid == vsock_get_local_cid())
>>>>                return true;
>>>>
>>>> ?
>>>
>>> Nope, I meant:
>>>
>>>   bool vsock_find_cid(unsigned int cid)
>>>   {
>>> -       if (transport_g2h && cid == transport_g2h->get_local_cid())
>>> -               return true;
>>> -
>>> -       if (transport_h2g && cid == VMADDR_CID_HOST)
>>> +       if (cid == vsock_get_local_cid())
>>>                  return true;
>>>
>>>          if (transport_local && cid == VMADDR_CID_LOCAL)
>>
>> But it does change the behaviour, doesn't it? With this patch, (with g2h
>> loaded) if cid fails to match g2h->get_local_cid(), we don't fall back to
>> h2g case any more, i.e. no more comparing cid with VMADDR_CID_HOST.
> 
> It's friday... yep, you're right!
> 
>>
>>> But now I'm thinking if we should also include `transport_local` in the
>>> new `vsock_get_local_cid()`.
>>>
>>> I think that will fix an issue when calling
>>> IOCTL_VM_SOCKETS_GET_LOCAL_CID and only vsock-loopback kernel module is
>>> loaded, so maybe we can do 2 patches:
>>>
>>> 1. fix IOCTL_VM_SOCKETS_GET_LOCAL_CID to check also `transport_local`
>>>     Fixes: 0e12190578d0 ("vsock: add local transport support in the vsock core")
>>
>> What would be the transport priority with transport_local thrown in? E.g.
>> if we have both local and g2h, ioctl should return VMADDR_CID_LOCAL or
>> transport_g2h->get_local_cid()?
> 
> Should return the G2H, LOCAL is more for debug/test, so I'd return it
> only if anything else is loaded.
>>>> 2. move that code in vsock_get_local_cid() with proper locking and use
>>> it also in vsock_find_cid()
>>>
>>> WDYT?
>>
>> Yeah, sure about 1, I'll add it to the series. I'm just still not certain
>> how useful vsock_get_local_cid() would be for vsock_find_cid().
>>
> 
> Feel free to drop 1 too, we can send it later if it's not really
> related to this issue.

I've added it to the end of this series (and marked the series as RFC), for
ease of discussion.

> About the series, maybe it is better to have a single patch that fixes
> the access to ->get_local_cid() with proper locking.
> But I don't have a strong opinion on that. I see it like a single
> problem to fix, but up to you.

Yeah, I get your point. So I've tried something similar in v2:
https://lore.kernel.org/netdev/20250620-vsock-transports-toctou-v2-0-02ebd20b1d03@rbox.co/