[v4] migration: fix the memory overwriting risk in add_to_iovec

Lin Feng posted 1 patch 2 years, 10 months ago
Test checkpatch passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20210625062138.1899-1-linfeng23@huawei.com
Maintainers: Juan Quintela <quintela@redhat.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>
migration/qemu-file.c | 5 +++++
1 file changed, 5 insertions(+)
[v4] migration: fix the memory overwriting risk in add_to_iovec
Posted by Lin Feng 2 years, 10 months ago
From: Feng Lin <linfeng23@huawei.com>

When testing migration, a Segmentation fault qemu core is generated.
0  error_free (err=0x1)
1  0x00007f8b862df647 in qemu_fclose (f=f@entry=0x55e06c247640)
2  0x00007f8b8516d59a in migrate_fd_cleanup (s=s@entry=0x55e06c0e1ef0)
3  0x00007f8b8516d66c in migrate_fd_cleanup_bh (opaque=0x55e06c0e1ef0)
4  0x00007f8b8626a47f in aio_bh_poll (ctx=ctx@entry=0x55e06b5a16d0)
5  0x00007f8b8626e71f in aio_dispatch (ctx=0x55e06b5a16d0)
6  0x00007f8b8626a33d in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>)
7  0x00007f8b866bdba4 in g_main_context_dispatch ()
8  0x00007f8b8626cde9 in glib_pollfds_poll ()
9  0x00007f8b8626ce62 in os_host_main_loop_wait (timeout=<optimized out>)
10 0x00007f8b8626cffd in main_loop_wait (nonblocking=nonblocking@entry=0)
11 0x00007f8b862ef01f in main_loop ()
Using gdb print the struct QEMUFile f = {
  ...,
  iovcnt = 65, last_error = 21984,
  last_error_obj = 0x1, shutdown = true
}
Well iovcnt is overflow, because the max size of MAX_IOV_SIZE is 64.
struct QEMUFile {
    ...;
    struct iovec iov[MAX_IOV_SIZE];
    unsigned int iovcnt;
    int last_error;
    Error *last_error_obj;
    bool shutdown;
};
iovcnt and last_error is overwrited by add_to_iovec().
Right now, add_to_iovec() increase iovcnt before check the limit.
And it seems that add_to_iovec() assumes that iovcnt will set to zero
in qemu_fflush(). But qemu_fflush() will directly return when f->shutdown
is true.

The situation may occur when libvirtd restart during migration, after
f->shutdown is set, before calling qemu_file_set_error() in
qemu_file_shutdown().

So the safiest way is checking the iovcnt before increasing it.

Signed-off-by: Feng Lin <linfeng23@huawei.com>
---
 migration/qemu-file.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index d6e03dbc0e..6879615197 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -416,6 +416,11 @@ static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
     {
         f->iov[f->iovcnt - 1].iov_len += size;
     } else {
+        if (f->iovcnt >= MAX_IOV_SIZE) {
+            /* Should only happen if a previous fflush failed */
+            assert(f->shutdown || !qemu_file_is_writeable(f));
+            return 1;
+        }
         if (may_free) {
             set_bit(f->iovcnt, f->may_free);
         }
-- 
2.23.0


Re: [v4] migration: fix the memory overwriting risk in add_to_iovec
Posted by Dr. David Alan Gilbert 2 years, 10 months ago
* Lin Feng (linfeng23@huawei.com) wrote:
> From: Feng Lin <linfeng23@huawei.com>
> 
> When testing migration, a Segmentation fault qemu core is generated.
> 0  error_free (err=0x1)
> 1  0x00007f8b862df647 in qemu_fclose (f=f@entry=0x55e06c247640)
> 2  0x00007f8b8516d59a in migrate_fd_cleanup (s=s@entry=0x55e06c0e1ef0)
> 3  0x00007f8b8516d66c in migrate_fd_cleanup_bh (opaque=0x55e06c0e1ef0)
> 4  0x00007f8b8626a47f in aio_bh_poll (ctx=ctx@entry=0x55e06b5a16d0)
> 5  0x00007f8b8626e71f in aio_dispatch (ctx=0x55e06b5a16d0)
> 6  0x00007f8b8626a33d in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>)
> 7  0x00007f8b866bdba4 in g_main_context_dispatch ()
> 8  0x00007f8b8626cde9 in glib_pollfds_poll ()
> 9  0x00007f8b8626ce62 in os_host_main_loop_wait (timeout=<optimized out>)
> 10 0x00007f8b8626cffd in main_loop_wait (nonblocking=nonblocking@entry=0)
> 11 0x00007f8b862ef01f in main_loop ()
> Using gdb print the struct QEMUFile f = {
>   ...,
>   iovcnt = 65, last_error = 21984,
>   last_error_obj = 0x1, shutdown = true
> }
> Well iovcnt is overflow, because the max size of MAX_IOV_SIZE is 64.
> struct QEMUFile {
>     ...;
>     struct iovec iov[MAX_IOV_SIZE];
>     unsigned int iovcnt;
>     int last_error;
>     Error *last_error_obj;
>     bool shutdown;
> };
> iovcnt and last_error is overwrited by add_to_iovec().
> Right now, add_to_iovec() increase iovcnt before check the limit.
> And it seems that add_to_iovec() assumes that iovcnt will set to zero
> in qemu_fflush(). But qemu_fflush() will directly return when f->shutdown
> is true.
> 
> The situation may occur when libvirtd restart during migration, after
> f->shutdown is set, before calling qemu_file_set_error() in
> qemu_file_shutdown().
> 
> So the safiest way is checking the iovcnt before increasing it.
> 
> Signed-off-by: Feng Lin <linfeng23@huawei.com>

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  migration/qemu-file.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> index d6e03dbc0e..6879615197 100644
> --- a/migration/qemu-file.c
> +++ b/migration/qemu-file.c
> @@ -416,6 +416,11 @@ static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
>      {
>          f->iov[f->iovcnt - 1].iov_len += size;
>      } else {
> +        if (f->iovcnt >= MAX_IOV_SIZE) {
> +            /* Should only happen if a previous fflush failed */
> +            assert(f->shutdown || !qemu_file_is_writeable(f));
> +            return 1;
> +        }
>          if (may_free) {
>              set_bit(f->iovcnt, f->may_free);
>          }
> -- 
> 2.23.0
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [v4] migration: fix the memory overwriting risk in add_to_iovec
Posted by Dr. David Alan Gilbert 2 years, 10 months ago
* Lin Feng (linfeng23@huawei.com) wrote:
> From: Feng Lin <linfeng23@huawei.com>
> 
> When testing migration, a Segmentation fault qemu core is generated.
> 0  error_free (err=0x1)
> 1  0x00007f8b862df647 in qemu_fclose (f=f@entry=0x55e06c247640)
> 2  0x00007f8b8516d59a in migrate_fd_cleanup (s=s@entry=0x55e06c0e1ef0)
> 3  0x00007f8b8516d66c in migrate_fd_cleanup_bh (opaque=0x55e06c0e1ef0)
> 4  0x00007f8b8626a47f in aio_bh_poll (ctx=ctx@entry=0x55e06b5a16d0)
> 5  0x00007f8b8626e71f in aio_dispatch (ctx=0x55e06b5a16d0)
> 6  0x00007f8b8626a33d in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>)
> 7  0x00007f8b866bdba4 in g_main_context_dispatch ()
> 8  0x00007f8b8626cde9 in glib_pollfds_poll ()
> 9  0x00007f8b8626ce62 in os_host_main_loop_wait (timeout=<optimized out>)
> 10 0x00007f8b8626cffd in main_loop_wait (nonblocking=nonblocking@entry=0)
> 11 0x00007f8b862ef01f in main_loop ()
> Using gdb print the struct QEMUFile f = {
>   ...,
>   iovcnt = 65, last_error = 21984,
>   last_error_obj = 0x1, shutdown = true
> }
> Well iovcnt is overflow, because the max size of MAX_IOV_SIZE is 64.
> struct QEMUFile {
>     ...;
>     struct iovec iov[MAX_IOV_SIZE];
>     unsigned int iovcnt;
>     int last_error;
>     Error *last_error_obj;
>     bool shutdown;
> };
> iovcnt and last_error is overwrited by add_to_iovec().
> Right now, add_to_iovec() increase iovcnt before check the limit.
> And it seems that add_to_iovec() assumes that iovcnt will set to zero
> in qemu_fflush(). But qemu_fflush() will directly return when f->shutdown
> is true.
> 
> The situation may occur when libvirtd restart during migration, after
> f->shutdown is set, before calling qemu_file_set_error() in
> qemu_file_shutdown().
> 
> So the safiest way is checking the iovcnt before increasing it.
> 
> Signed-off-by: Feng Lin <linfeng23@huawei.com>

Queued
> ---
>  migration/qemu-file.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> index d6e03dbc0e..6879615197 100644
> --- a/migration/qemu-file.c
> +++ b/migration/qemu-file.c
> @@ -416,6 +416,11 @@ static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
>      {
>          f->iov[f->iovcnt - 1].iov_len += size;
>      } else {
> +        if (f->iovcnt >= MAX_IOV_SIZE) {
> +            /* Should only happen if a previous fflush failed */
> +            assert(f->shutdown || !qemu_file_is_writeable(f));
> +            return 1;
> +        }
>          if (may_free) {
>              set_bit(f->iovcnt, f->may_free);
>          }
> -- 
> 2.23.0
> 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [v4] migration: fix the memory overwriting risk in add_to_iovec
Posted by Dr. David Alan Gilbert 2 years, 10 months ago
* Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> * Lin Feng (linfeng23@huawei.com) wrote:
> > From: Feng Lin <linfeng23@huawei.com>
> > 
> > When testing migration, a Segmentation fault qemu core is generated.
> > 0  error_free (err=0x1)
> > 1  0x00007f8b862df647 in qemu_fclose (f=f@entry=0x55e06c247640)
> > 2  0x00007f8b8516d59a in migrate_fd_cleanup (s=s@entry=0x55e06c0e1ef0)
> > 3  0x00007f8b8516d66c in migrate_fd_cleanup_bh (opaque=0x55e06c0e1ef0)
> > 4  0x00007f8b8626a47f in aio_bh_poll (ctx=ctx@entry=0x55e06b5a16d0)
> > 5  0x00007f8b8626e71f in aio_dispatch (ctx=0x55e06b5a16d0)
> > 6  0x00007f8b8626a33d in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>)
> > 7  0x00007f8b866bdba4 in g_main_context_dispatch ()
> > 8  0x00007f8b8626cde9 in glib_pollfds_poll ()
> > 9  0x00007f8b8626ce62 in os_host_main_loop_wait (timeout=<optimized out>)
> > 10 0x00007f8b8626cffd in main_loop_wait (nonblocking=nonblocking@entry=0)
> > 11 0x00007f8b862ef01f in main_loop ()
> > Using gdb print the struct QEMUFile f = {
> >   ...,
> >   iovcnt = 65, last_error = 21984,
> >   last_error_obj = 0x1, shutdown = true
> > }
> > Well iovcnt is overflow, because the max size of MAX_IOV_SIZE is 64.
> > struct QEMUFile {
> >     ...;
> >     struct iovec iov[MAX_IOV_SIZE];
> >     unsigned int iovcnt;
> >     int last_error;
> >     Error *last_error_obj;
> >     bool shutdown;
> > };
> > iovcnt and last_error is overwrited by add_to_iovec().
> > Right now, add_to_iovec() increase iovcnt before check the limit.
> > And it seems that add_to_iovec() assumes that iovcnt will set to zero
> > in qemu_fflush(). But qemu_fflush() will directly return when f->shutdown
> > is true.
> > 
> > The situation may occur when libvirtd restart during migration, after
> > f->shutdown is set, before calling qemu_file_set_error() in
> > qemu_file_shutdown().
> > 
> > So the safiest way is checking the iovcnt before increasing it.
> > 
> > Signed-off-by: Feng Lin <linfeng23@huawei.com>
> 
> Queued

Hmm this didn't actually build because that function is actually
misnamed 'qemu_file_is_writable' (no e!);  I've fixed that, but can you
just reconfirm that you've tested this fixes your original problem?

Dave

> > ---
> >  migration/qemu-file.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> > index d6e03dbc0e..6879615197 100644
> > --- a/migration/qemu-file.c
> > +++ b/migration/qemu-file.c
> > @@ -416,6 +416,11 @@ static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
> >      {
> >          f->iov[f->iovcnt - 1].iov_len += size;
> >      } else {
> > +        if (f->iovcnt >= MAX_IOV_SIZE) {
> > +            /* Should only happen if a previous fflush failed */
> > +            assert(f->shutdown || !qemu_file_is_writeable(f));
> > +            return 1;
> > +        }
> >          if (may_free) {
> >              set_bit(f->iovcnt, f->may_free);
> >          }
> > -- 
> > 2.23.0
> > 
> > 
> -- 
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


RE: [v4] migration: fix the memory overwriting risk in add_to_iovec
Posted by linfeng (M) 2 years, 10 months ago
* Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> Subject: Re: [v4] migration: fix the memory overwriting risk in add_to_iovec
> 
> * Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> > * Lin Feng (linfeng23@huawei.com) wrote:
> > > From: Feng Lin <linfeng23@huawei.com>
> > >
> > > When testing migration, a Segmentation fault qemu core is generated.
> > > 0  error_free (err=0x1)
> > > 1  0x00007f8b862df647 in qemu_fclose (f=f@entry=0x55e06c247640)
> > > 2  0x00007f8b8516d59a in migrate_fd_cleanup
> > > (s=s@entry=0x55e06c0e1ef0)
> > > 3  0x00007f8b8516d66c in migrate_fd_cleanup_bh
> > > (opaque=0x55e06c0e1ef0)
> > > 4  0x00007f8b8626a47f in aio_bh_poll (ctx=ctx@entry=0x55e06b5a16d0)
> > > 5  0x00007f8b8626e71f in aio_dispatch (ctx=0x55e06b5a16d0)
> > > 6  0x00007f8b8626a33d in aio_ctx_dispatch (source=<optimized out>,
> > > callback=<optimized out>, user_data=<optimized out>)
> > > 7  0x00007f8b866bdba4 in g_main_context_dispatch ()
> > > 8  0x00007f8b8626cde9 in glib_pollfds_poll ()
> > > 9  0x00007f8b8626ce62 in os_host_main_loop_wait (timeout=<optimized
> > > out>)
> > > 10 0x00007f8b8626cffd in main_loop_wait
> > > (nonblocking=nonblocking@entry=0)
> > > 11 0x00007f8b862ef01f in main_loop () Using gdb print the struct
> > > QEMUFile f = {
> > >   ...,
> > >   iovcnt = 65, last_error = 21984,
> > >   last_error_obj = 0x1, shutdown = true } Well iovcnt is overflow,
> > > because the max size of MAX_IOV_SIZE is 64.
> > > struct QEMUFile {
> > >     ...;
> > >     struct iovec iov[MAX_IOV_SIZE];
> > >     unsigned int iovcnt;
> > >     int last_error;
> > >     Error *last_error_obj;
> > >     bool shutdown;
> > > };
> > > iovcnt and last_error is overwrited by add_to_iovec().
> > > Right now, add_to_iovec() increase iovcnt before check the limit.
> > > And it seems that add_to_iovec() assumes that iovcnt will set to
> > > zero in qemu_fflush(). But qemu_fflush() will directly return when
> > > f->shutdown is true.
> > >
> > > The situation may occur when libvirtd restart during migration,
> > > after
> > > f->shutdown is set, before calling qemu_file_set_error() in
> > > qemu_file_shutdown().
> > >
> > > So the safiest way is checking the iovcnt before increasing it.
> > >
> > > Signed-off-by: Feng Lin <linfeng23@huawei.com>
> >
> > Queued
> 
> Hmm this didn't actually build because that function is actually misnamed 'qemu_file_is_writable' (no e!);
> I've fixed that, but can you just reconfirm that you've tested this fixes your original problem?
Sorry for that rookie mistake. I have tested it again with gdb-fault injection. It can fix my original problem.
Thanks for helping me complete my first qemu patch submission. Really helped a lot.
> 
> Dave
> 
> > > ---
> > >  migration/qemu-file.c | 5 +++++
> > >  1 file changed, 5 insertions(+)
> > >
> > > diff --git a/migration/qemu-file.c b/migration/qemu-file.c index
> > > d6e03dbc0e..6879615197 100644
> > > --- a/migration/qemu-file.c
> > > +++ b/migration/qemu-file.c
> > > @@ -416,6 +416,11 @@ static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
> > >      {
> > >          f->iov[f->iovcnt - 1].iov_len += size;
> > >      } else {
> > > +        if (f->iovcnt >= MAX_IOV_SIZE) {
> > > +            /* Should only happen if a previous fflush failed */
> > > +            assert(f->shutdown || !qemu_file_is_writeable(f));
> > > +            return 1;
> > > +        }
> > >          if (may_free) {
> > >              set_bit(f->iovcnt, f->may_free);
> > >          }
> > > --
> > > 2.23.0
> > >
> > >
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [v4] migration: fix the memory overwriting risk in add_to_iovec
Posted by Dr. David Alan Gilbert 2 years, 10 months ago
* linfeng (M) (linfeng23@huawei.com) wrote:
> 
> * Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> > Subject: Re: [v4] migration: fix the memory overwriting risk in add_to_iovec
> > 
> > * Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> > > * Lin Feng (linfeng23@huawei.com) wrote:
> > > > From: Feng Lin <linfeng23@huawei.com>
> > > >
> > > > When testing migration, a Segmentation fault qemu core is generated.
> > > > 0  error_free (err=0x1)
> > > > 1  0x00007f8b862df647 in qemu_fclose (f=f@entry=0x55e06c247640)
> > > > 2  0x00007f8b8516d59a in migrate_fd_cleanup
> > > > (s=s@entry=0x55e06c0e1ef0)
> > > > 3  0x00007f8b8516d66c in migrate_fd_cleanup_bh
> > > > (opaque=0x55e06c0e1ef0)
> > > > 4  0x00007f8b8626a47f in aio_bh_poll (ctx=ctx@entry=0x55e06b5a16d0)
> > > > 5  0x00007f8b8626e71f in aio_dispatch (ctx=0x55e06b5a16d0)
> > > > 6  0x00007f8b8626a33d in aio_ctx_dispatch (source=<optimized out>,
> > > > callback=<optimized out>, user_data=<optimized out>)
> > > > 7  0x00007f8b866bdba4 in g_main_context_dispatch ()
> > > > 8  0x00007f8b8626cde9 in glib_pollfds_poll ()
> > > > 9  0x00007f8b8626ce62 in os_host_main_loop_wait (timeout=<optimized
> > > > out>)
> > > > 10 0x00007f8b8626cffd in main_loop_wait
> > > > (nonblocking=nonblocking@entry=0)
> > > > 11 0x00007f8b862ef01f in main_loop () Using gdb print the struct
> > > > QEMUFile f = {
> > > >   ...,
> > > >   iovcnt = 65, last_error = 21984,
> > > >   last_error_obj = 0x1, shutdown = true } Well iovcnt is overflow,
> > > > because the max size of MAX_IOV_SIZE is 64.
> > > > struct QEMUFile {
> > > >     ...;
> > > >     struct iovec iov[MAX_IOV_SIZE];
> > > >     unsigned int iovcnt;
> > > >     int last_error;
> > > >     Error *last_error_obj;
> > > >     bool shutdown;
> > > > };
> > > > iovcnt and last_error is overwrited by add_to_iovec().
> > > > Right now, add_to_iovec() increase iovcnt before check the limit.
> > > > And it seems that add_to_iovec() assumes that iovcnt will set to
> > > > zero in qemu_fflush(). But qemu_fflush() will directly return when
> > > > f->shutdown is true.
> > > >
> > > > The situation may occur when libvirtd restart during migration,
> > > > after
> > > > f->shutdown is set, before calling qemu_file_set_error() in
> > > > qemu_file_shutdown().
> > > >
> > > > So the safiest way is checking the iovcnt before increasing it.
> > > >
> > > > Signed-off-by: Feng Lin <linfeng23@huawei.com>
> > >
> > > Queued
> > 
> > Hmm this didn't actually build because that function is actually misnamed 'qemu_file_is_writable' (no e!);
> > I've fixed that, but can you just reconfirm that you've tested this fixes your original problem?
> Sorry for that rookie mistake. I have tested it again with gdb-fault injection. It can fix my original problem.
> Thanks for helping me complete my first qemu patch submission. Really helped a lot.

Thanks for retesting.

Dave

> > 
> > Dave
> > 
> > > > ---
> > > >  migration/qemu-file.c | 5 +++++
> > > >  1 file changed, 5 insertions(+)
> > > >
> > > > diff --git a/migration/qemu-file.c b/migration/qemu-file.c index
> > > > d6e03dbc0e..6879615197 100644
> > > > --- a/migration/qemu-file.c
> > > > +++ b/migration/qemu-file.c
> > > > @@ -416,6 +416,11 @@ static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
> > > >      {
> > > >          f->iov[f->iovcnt - 1].iov_len += size;
> > > >      } else {
> > > > +        if (f->iovcnt >= MAX_IOV_SIZE) {
> > > > +            /* Should only happen if a previous fflush failed */
> > > > +            assert(f->shutdown || !qemu_file_is_writeable(f));
> > > > +            return 1;
> > > > +        }
> > > >          if (may_free) {
> > > >              set_bit(f->iovcnt, f->may_free);
> > > >          }
> > > > --
> > > > 2.23.0
> > > >
> > > >
> > > --
> > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK