[Qemu-devel] [PATCH] migration/rdma: unegister fd handler

Dr. David Alan Gilbert (git) posted 1 patch 6 years, 9 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20190122173111.29821-1-dgilbert@redhat.com
Maintainers: Juan Quintela <quintela@redhat.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>
migration/rdma.c | 1 +
1 file changed, 1 insertion(+)
[Qemu-devel] [PATCH] migration/rdma: unegister fd handler
Posted by Dr. David Alan Gilbert (git) 6 years, 9 months ago
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Unregister the fd handler before we destroy the channel,
otherwise we've got a race where we might land in the
fd handler just as we're closing the device.

(The race is quite data dependent, you just have to have
the right set of devices for it to trigger).

Corresponds to RH bz: https://bugzilla.redhat.com/show_bug.cgi?id=1666601

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/rdma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/rdma.c b/migration/rdma.c
index 9b2e7e10aa..54a3c11540 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2321,6 +2321,7 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
         rdma->connected = false;
     }
 
+    qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
     g_free(rdma->dest_blocks);
     rdma->dest_blocks = NULL;
 
-- 
2.20.1


Re: [Qemu-devel] [PATCH] migration/rdma: unegister fd handler
Posted by Peter Xu 6 years, 9 months ago
On Tue, Jan 22, 2019 at 05:31:11PM +0000, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Unregister the fd handler before we destroy the channel,
> otherwise we've got a race where we might land in the
> fd handler just as we're closing the device.
> 
> (The race is quite data dependent, you just have to have
> the right set of devices for it to trigger).
> 
> Corresponds to RH bz: https://bugzilla.redhat.com/show_bug.cgi?id=1666601
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

(Could the crash have happened because the same fd number was reused
 after the RDMA channel was destroyed?  Then, when the fd has an event,
 it would be delivered to rdma_cm_poll_handler() even though the fd is
 no longer the RDMA channel's handle.)

Reviewed-by: Peter Xu <peterx@redhat.com>

Regards,

-- 
Peter Xu
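
[Editor's illustration, not part of Peter's mail] The fd-reuse scenario
asked about above can be sketched in a few lines of C. This is a toy
model only: the handler table, MAX_FDS, and both function names are
hypothetical stand-ins, not QEMU's main-loop API. It relies on the POSIX
rule that open() returns the lowest-numbered unused descriptor, and it
assumes a single-threaded process with /dev/null available.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_FDS 1024                     /* hypothetical table size */

typedef void (*fd_handler)(void);

/* Toy handler table indexed by fd number; NOT QEMU's real main loop. */
static fd_handler handlers[MAX_FDS];

static void stale_channel_handler(void)
{
    /* Stands in for a handler that expects the old channel to exist. */
}

/*
 * Returns 1 if a handler left registered for a closed fd is found again
 * under a newly opened, unrelated fd; 0 if not; -1 on setup failure.
 */
int stale_handler_sees_new_fd(void)
{
    int old_fd = open("/dev/null", O_RDONLY);
    if (old_fd < 0) {
        return -1;
    }
    if (old_fd >= MAX_FDS) {
        close(old_fd);
        return -1;
    }

    handlers[old_fd] = stale_channel_handler;
    close(old_fd);                       /* handler deliberately left in place */

    /* open() hands back the lowest free number, i.e. old_fd again. */
    int new_fd = open("/dev/null", O_RDONLY);
    if (new_fd < 0) {
        handlers[old_fd] = NULL;
        return -1;
    }

    int stale = (new_fd < MAX_FDS) &&
                (handlers[new_fd] == stale_channel_handler);

    handlers[old_fd] = NULL;             /* tidy up the toy table */
    close(new_fd);
    return stale;
}
```

In this model, clearing the table entry before close() is the analogue
of the qemu_set_fd_handler(..., NULL, NULL, NULL) call in the patch.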

Re: [Qemu-devel] [PATCH] migration/rdma: unegister fd handler
Posted by Dr. David Alan Gilbert 6 years, 9 months ago
* Peter Xu (peterx@redhat.com) wrote:
> On Tue, Jan 22, 2019 at 05:31:11PM +0000, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > Unregister the fd handler before we destroy the channel,
> > otherwise we've got a race where we might land in the
> > fd handler just as we're closing the device.
> > 
> > (The race is quite data dependent, you just have to have
> > the right set of devices for it to trigger).
> > 
> > Corresponds to RH bz: https://bugzilla.redhat.com/show_bug.cgi?id=1666601
> > 
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> 
> (Could the crash have happened because the same fd number was reused
>  after the RDMA channel was destroyed?  Then, when the fd has an event,
>  it would be delivered to rdma_cm_poll_handler() even though the fd is
>  no longer the RDMA channel's handle.)

That's an interesting thought; I'd assumed it was just a race, but a
dependence on the fd numbering would explain why it was so delicate to
reproduce.

> Reviewed-by: Peter Xu <peterx@redhat.com>

Thanks!

Dave

> Regards,
> 
> -- 
> Peter Xu
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH] migration/rdma: unegister fd handler
Posted by Dr. David Alan Gilbert 6 years, 9 months ago
* Dr. David Alan Gilbert (git) (dgilbert@redhat.com) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Unregister the fd handler before we destroy the channel,
> otherwise we've got a race where we might land in the
> fd handler just as we're closing the device.
> 
> (The race is quite data dependent, you just have to have
> the right set of devices for it to trigger).
> 
> Corresponds to RH bz: https://bugzilla.redhat.com/show_bug.cgi?id=1666601
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Queued

> ---
>  migration/rdma.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 9b2e7e10aa..54a3c11540 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -2321,6 +2321,7 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
>          rdma->connected = false;
>      }
>  
> +    qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
>      g_free(rdma->dest_blocks);
>      rdma->dest_blocks = NULL;
>  
> -- 
> 2.20.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH] migration/rdma: unegister fd handler
Posted by Philippe Mathieu-Daudé 6 years, 9 months ago
On 1/23/19 12:44 PM, Dr. David Alan Gilbert wrote:
> * Dr. David Alan Gilbert (git) (dgilbert@redhat.com) wrote:
>> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>>
>> Unregister the fd handler before we destroy the channel,
>> otherwise we've got a race where we might land in the
>> fd handler just as we're closing the device.
>>
>> (The race is quite data dependent, you just have to have
>> the right set of devices for it to trigger).
>>
>> Corresponds to RH bz: https://bugzilla.redhat.com/show_bug.cgi?id=1666601
>>
>> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> 
> Queued

Did you fix the patch subject typo? "un(r)egister"

>> ---
>>  migration/rdma.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/migration/rdma.c b/migration/rdma.c
>> index 9b2e7e10aa..54a3c11540 100644
>> --- a/migration/rdma.c
>> +++ b/migration/rdma.c
>> @@ -2321,6 +2321,7 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
>>          rdma->connected = false;
>>      }
>>  
>> +    qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
>>      g_free(rdma->dest_blocks);
>>      rdma->dest_blocks = NULL;
>>  
>> -- 
>> 2.20.1
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 

Re: [Qemu-devel] [PATCH] migration/rdma: unegister fd handler
Posted by Dr. David Alan Gilbert 6 years, 9 months ago
* Philippe Mathieu-Daudé (philmd@redhat.com) wrote:
> On 1/23/19 12:44 PM, Dr. David Alan Gilbert wrote:
> > * Dr. David Alan Gilbert (git) (dgilbert@redhat.com) wrote:
> >> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >>
> >> Unregister the fd handler before we destroy the channel,
> >> otherwise we've got a race where we might land in the
> >> fd handler just as we're closing the device.
> >>
> >> (The race is quite data dependent, you just have to have
> >> the right set of devices for it to trigger).
> >>
> >> Corresponds to RH bz: https://bugzilla.redhat.com/show_bug.cgi?id=1666601
> >>
> >> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > 
> > Queued
> 
> Did you fix the patch subject typo? "un(r)egister"

Yes; fortunately I spotted it while building the pull :-)

Dave

> >> ---
> >>  migration/rdma.c | 1 +
> >>  1 file changed, 1 insertion(+)
> >>
> >> diff --git a/migration/rdma.c b/migration/rdma.c
> >> index 9b2e7e10aa..54a3c11540 100644
> >> --- a/migration/rdma.c
> >> +++ b/migration/rdma.c
> >> @@ -2321,6 +2321,7 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
> >>          rdma->connected = false;
> >>      }
> >>  
> >> +    qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
> >>      g_free(rdma->dest_blocks);
> >>      rdma->dest_blocks = NULL;
> >>  
> >> -- 
> >> 2.20.1
> >>
> >>
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH] migration/rdma: unegister fd handler
Posted by Peter Maydell 6 years, 8 months ago
On Tue, 22 Jan 2019 at 19:08, Dr. David Alan Gilbert (git)
<dgilbert@redhat.com> wrote:
>
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Unregister the fd handler before we destroy the channel,
> otherwise we've got a race where we might land in the
> fd handler just as we're closing the device.
>
> (The race is quite data dependent, you just have to have
> the right set of devices for it to trigger).
>
> Corresponds to RH bz: https://bugzilla.redhat.com/show_bug.cgi?id=1666601
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  migration/rdma.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 9b2e7e10aa..54a3c11540 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -2321,6 +2321,7 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
>          rdma->connected = false;
>      }
>
> +    qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
>      g_free(rdma->dest_blocks);
>      rdma->dest_blocks = NULL;

Hi -- this patch makes Coverity complain (CID 1398634),
because here we use rdma->channel without checking whether it is NULL,
but later in the function we have an "if (rdma->channel)" test.
Should this code be conditional on rdma->channel being non-NULL,
or is the later test incorrect?

thanks
-- PMM

Re: [Qemu-devel] [PATCH] migration/rdma: unegister fd handler
Posted by Dr. David Alan Gilbert 6 years, 8 months ago
* Peter Maydell (peter.maydell@linaro.org) wrote:
> On Tue, 22 Jan 2019 at 19:08, Dr. David Alan Gilbert (git)
> <dgilbert@redhat.com> wrote:
> >
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >
> > Unregister the fd handler before we destroy the channel,
> > otherwise we've got a race where we might land in the
> > fd handler just as we're closing the device.
> >
> > (The race is quite data dependent, you just have to have
> > the right set of devices for it to trigger).
> >
> > Corresponds to RH bz: https://bugzilla.redhat.com/show_bug.cgi?id=1666601
> >
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  migration/rdma.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/migration/rdma.c b/migration/rdma.c
> > index 9b2e7e10aa..54a3c11540 100644
> > --- a/migration/rdma.c
> > +++ b/migration/rdma.c
> > @@ -2321,6 +2321,7 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
> >          rdma->connected = false;
> >      }
> >
> > +    qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
> >      g_free(rdma->dest_blocks);
> >      rdma->dest_blocks = NULL;
> 
> Hi -- this patch makes Coverity complain (CID 1398634),
> because here we use rdma->channel without checking whether it is NULL,
> but later in the function we have an "if (rdma->channel)" test.
> Should this code be conditional on rdma->channel being non-NULL,
> or is the later test incorrect?

Yes, it's got a point - I can seg that.

I'll post a fix.

Dave

> thanks
> -- PMM
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
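
[Editor's illustration, not part of the thread] The follow-up fix is not
shown here, so what follows is only a minimal sketch of one option raised
in the Coverity report: making the unregister conditional on the channel
being non-NULL. All types and names below are hypothetical stand-ins
rather than the real RDMAContext or qemu_set_fd_handler(), and the
unregister step is passed in as a callback so the sketch compiles on its
own.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the event channel and RDMA context. */
struct event_channel {
    int fd;
};

struct rdma_ctx {
    struct event_channel *channel;       /* may be NULL if setup failed */
};

typedef void (*fd_unregister_fn)(int fd);

/* Records the last fd passed to the fake unregister hook. */
int last_unregistered_fd = -1;

void record_unregister(int fd)
{
    last_unregistered_fd = fd;
}

/*
 * Unregisters the handler only when a channel exists, so the pointer is
 * never dereferenced while NULL. Returns 1 if an unregister happened,
 * 0 if there was no channel.
 */
int cleanup_unregister(struct rdma_ctx *rdma, fd_unregister_fn unregister)
{
    if (rdma->channel == NULL) {
        return 0;
    }
    unregister(rdma->channel->fd);
    return 1;
}
```

In the real function the guarded call would presumably sit wherever the
existing "if (rdma->channel)" test already lives, but that placement is
an assumption and not something the thread confirms.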