[PATCH 0/3] bugfixes for migration using compression methods

Yuan Liu posted 3 patches 3 months, 2 weeks ago
migration/multifd-nocomp.c | 3 +--
migration/multifd-qatzip.c | 1 +
migration/multifd-qpl.c    | 1 +
3 files changed, 3 insertions(+), 2 deletions(-)
[PATCH 0/3] bugfixes for migration using compression methods
Posted by Yuan Liu 3 months, 2 weeks ago
This patch set fixes bugs that lead to incorrect memory data after
migration when compression is enabled.

The bug can be reproduced as follows:
1. Run "stress-ng --class memory --all 1" on the source side. The
   stress-ng tool comes from https://github.com/ColinIanKing/stress-ng.git

2. Enable a multifd compression method and start the migration,
   e.g. migrate_set_parameter multifd-compression qpl

3. The guest kernel crashes either on its own or at shutdown after
   the migration is complete

The root cause of each bug and its solution are described in detail in
the individual patches.

My verification method is as follows:
1. Start the VM and run the stress-ng test command on the source side.
2. Start the VM with the "-S" parameter on the target side so that the
   vCPUs stay paused after migration.
3. After the migration is successful, use the dump-guest-memory command
   to export the memory data of the source and target VMs respectively.
4. Use "cmp -l source_memory target_memory" to verify memory data.
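
As a rough companion to step 4, the sketch below compares two dump files
in page-sized chunks instead of byte by byte, which makes it easier to
see how many pages were corrupted. It is only an illustration and not
part of the patches: the 4 KiB chunk size and the function name are
assumptions, and the reported values are offsets into the dump files,
not guest physical addresses.

```python
import sys

PAGE_SIZE = 4096  # assumption: 4 KiB chunks; adjust to the guest page size


def differing_pages(path_a, path_b, page_size=PAGE_SIZE):
    """Yield the file offset of every page-sized chunk that differs.

    If one file is longer than the other, the extra chunks are
    reported as differing as well.
    """
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        offset = 0
        while True:
            a = fa.read(page_size)
            b = fb.read(page_size)
            if not a and not b:
                break  # both files are exhausted
            if a != b:
                yield offset
            offset += page_size


# Hypothetical usage: python3 cmp_pages.py source_memory target_memory
if __name__ == "__main__" and len(sys.argv) == 3:
    bad = list(differing_pages(sys.argv[1], sys.argv[2]))
    for off in bad:
        print(f"chunk at file offset {off:#x} differs")
    print(f"{len(bad)} differing chunk(s)")
```

An empty result corresponds to "cmp" reporting no differences. A few
isolated chunks would point at individual pages being transferred
incorrectly, while a difference in every chunk more likely means the two
dumps are not comparable at all.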

Yuan Liu (3):
  multifd: bugfix for migration using compression methods
  multifd: bugfix for incorrect migration data with QPL compression
  multifd: bugfix for incorrect migration data with qatzip compression

 migration/multifd-nocomp.c | 3 +--
 migration/multifd-qatzip.c | 1 +
 migration/multifd-qpl.c    | 1 +
 3 files changed, 3 insertions(+), 2 deletions(-)

-- 
2.43.0
Re: [PATCH 0/3] bugfixes for migration using compression methods
Posted by Michael Tokarev 2 months, 3 weeks ago
18.12.2024 12:14, Yuan Liu wrote:
> This set of patches is used to fix the bugs of incorrect migration
> memory data when compression is enabled.
> 
> The method to reproduce this bug is as follows
> 1. Run "stress-ng --class memory --all 1" in the source side, the
> stress-ng tool comes from https://github.com/ColinIanKing/stress-ng.git
> 
> 2. Enable the multifd compression methods and start migration
>     e.g. migrate_set_parameter multifd-compression qpl
> 
> 3. The guest kernel will crash automatically or crash at shutdown after
>     the migration is complete
> 
> The root cause of the bugs and the solutions are described in detail in
> the patch.
> 
> My verification method as follows
> 1. Start the VM and run the stress-ng test command on the source side.
> 2. Start the VM with "-S" parameter on the target side, it is
>     used to pause the vCPUs after migration.
> 3. After the migration is successful, use the dump-guest-memory command
>     to export the memory data of the source and target VMs respectively.
> 4. Use "cmp -l source_memory target_memory" to verify memory data.
> 
> Yuan Liu (3):
>    multifd: bugfix for migration using compression methods
>    multifd: bugfix for incorrect migration data with QPL compression
>    multifd: bugfix for incorrect migration data with qatzip compression

Should only the first patch be applied to the qemu-stable branches, or all 3?
The first one has been Cc'd to qemu-stable, but the other two haven't.

Thanks,

/mjt
RE: [PATCH 0/3] bugfixes for migration using compression methods
Posted by Liu, Yuan1 2 months, 3 weeks ago
> -----Original Message-----
> From: Michael Tokarev <mjt@tls.msk.ru>
> Sent: Sunday, January 12, 2025 9:13 PM
> To: Liu, Yuan1 <yuan1.liu@intel.com>; peterx@redhat.com; farosas@suse.de
> Cc: qemu-devel@nongnu.org; Zeng, Jason <jason.zeng@intel.com>; Wang,
> Yichen <yichen.wang@bytedance.com>; qemu-stable <qemu-stable@nongnu.org>
> Subject: Re: [PATCH 0/3] bugfixes for migration using compression methods
> 
> 18.12.2024 12:14, Yuan Liu wrote:
> > This set of patches is used to fix the bugs of incorrect migration
> > memory data when compression is enabled.
> >
> > The method to reproduce this bug is as follows
> > 1. Run "stress-ng --class memory --all 1" in the source side, the
> > stress-ng tool comes from https://github.com/ColinIanKing/stress-ng.git
> >
> > 2. Enable the multifd compression methods and start migration
> >     e.g. migrate_set_parameter multifd-compression qpl
> >
> > 3. The guest kernel will crash automatically or crash at shutdown after
> >     the migration is complete
> >
> > The root cause of the bugs and the solutions are described in detail in
> > the patch.
> >
> > My verification method as follows
> > 1. Start the VM and run the stress-ng test command on the source side.
> > 2. Start the VM with "-S" parameter on the target side, it is
> >     used to pause the vCPUs after migration.
> > 3. After the migration is successful, use the dump-guest-memory command
> >     to export the memory data of the source and target VMs respectively.
> > 4. Use "cmp -l source_memory target_memory" to verify memory data.
> >
> > Yuan Liu (3):
> >    multifd: bugfix for migration using compression methods
> >    multifd: bugfix for incorrect migration data with QPL compression
> >    multifd: bugfix for incorrect migration data with qatzip compression
> 
> Should just the first patch be applied to qemu-stable branches, or all 3?
> The first one has been Cc'd qemu-stable, but the other two haven't?

I think all three patches should be applied; they fix three different bugs.
> 
> Thanks,
> 
> /mjt
Re: [PATCH 0/3] bugfixes for migration using compression methods
Posted by Peter Xu 3 months, 2 weeks ago
On Wed, Dec 18, 2024 at 05:14:10PM +0800, Yuan Liu wrote:
> This set of patches is used to fix the bugs of incorrect migration
> memory data when compression is enabled.
> 
> The method to reproduce this bug is as follows
> 1. Run "stress-ng --class memory --all 1" in the source side, the
> stress-ng tool comes from https://github.com/ColinIanKing/stress-ng.git
> 
> 2. Enable the multifd compression methods and start migration
>    e.g. migrate_set_parameter multifd-compression qpl
> 
> 3. The guest kernel will crash automatically or crash at shutdown after
>    the migration is complete
> 
> The root cause of the bugs and the solutions are described in detail in
> the patch.
> 
> My verification method as follows
> 1. Start the VM and run the stress-ng test command on the source side.
> 2. Start the VM with "-S" parameter on the target side, it is
>    used to pause the vCPUs after migration.
> 3. After the migration is successful, use the dump-guest-memory command
>    to export the memory data of the source and target VMs respectively.
> 4. Use "cmp -l source_memory target_memory" to verify memory data.

This looks like a good way to test memory integrity.  I wonder if we can
do that in some, or all, of our migration qtests.

I didn't check the latter 2 patches but I assume they can also have a
proper Fixes tag.

The other thing is that uadk also seems to be broken in that regard.  We
could add one patch for it, but the testing may be challenging for any of
us.  In any case, I'm copying Shameer.

-- 
Peter Xu