[libvirt PATCH 0/6] Introduce Local Migration Support in Libvirt

Daniel P. Berrangé posted 6 patches 4 years, 2 months ago
I'm (re-)sending this patch series on behalf of Shaju Abraham
<shaju.abraham@nutanix.com> who has tried to send this several times
already.

Red Hat's email infrastructure is broken, accepting the mails and then
failing to deliver them to mailman, or any other Red Hat address.
Unfortunately it means that while we can send comments back to Shaju
on this thread, subscribers will then probably fail to see any responses
Shaju tries to give :-( To say this is bad is an understatement. I have
yet another ticket open tracking & escalating this awful problem but
can't give any ETA on a fix :-(

Anyway, with that out of the way, here's Shaju's original cover letter
below....

1) What is this patch series about?

Local live migration of a VM means live migrating a VM instance within the
same node. Traditional libvirt live migration involves migrating the VM from a
source node to a remote node. Local migration is forbidden in libvirt for a
myriad of reasons. This patch series enables local migration in libvirt.

2) Why is local migration important?

The ability to live migrate a VM locally paves the way for hypervisor upgrades
without shutting down the VM. For example, to upgrade QEMU after a security
fix, we can locally migrate the VM to the new QEMU instance. By utilising
capabilities like "bypass-shared-memory" in QEMU, such hypervisor upgrades are
faster.

3) Why is local migration difficult in Libvirt?

Libvirt always assumes that the name/UUID pair is unique within a node. During
local migration there will be two different VMs with the same name/UUID pair,
which will confuse the management stack. There are other path variables, like
the monitor path, config paths etc., which assume that the name/UUID pair is
unique, so during migration the same monitor would be used by both the source
and the target. We cannot assign a temporary UUID to the target VM, since the
UUID is part of the machine ABI, which is immutable.
To decouple the dependency on name/UUID, a new field (the domain id) is
included in all the paths that libvirt uses. This ensures that every instance
of the VM gets a unique path.

4) How is local migration designed?

Libvirt manages all the VM domain objects using two hash tables, indexed by
UUID and by name. During live migration the domain entry on the source node
gets deleted and a new entry gets populated on the target node, indexed using
the same name/UUID. But for local migration there is no remote node; the
source and the target nodes are the same. So in order to model the remote
node, two more hash tables are introduced, which represent the hash tables of
the remote node during migration.
Libvirt migration involves five stages:
1) Begin
2) Prepare
3) Perform
4) Finish
5) Confirm

Begin, Perform and Confirm are executed on the source node, whereas Prepare
and Finish are executed on the target node. In the case of local migration the
Perform and Finish stages use the newly introduced 'remote hash tables' and
the rest of the stages use the 'source hash tables'. Once the migration is
completed, that is after the Confirm phase, the VM domain object is moved from
the 'remote hash tables' to the 'source hash tables'. This is required so that
other libvirt commands like 'virsh list' can display all the VMs running on
the node.

5) How to test local migration?

A new flag '--local' is added to the 'virsh migrate' command to enable local
migration. The syntax is:

  virsh migrate --live --local 'domain-id' qemu+ssh://ip-address/system

6) What are the known issues?

SELinux policies are known to have issues with creating /dev/hugepages entries
during VM launch. In order to test local migration, disable SELinux using
'setenforce 0'.

Shaju Abraham (6):
  Add VIR_MIGRATE_LOCAL flag to virsh migrate command
  Introduce remote hash tables and helper routines
  Add local migration support in QEMU Migration framework
  Modify close callback routines to handle local migration
  Make PATHs unique for a VM object instance
  Move the domain object from remote to source hash table

 include/libvirt/libvirt-domain.h |   6 +
 src/conf/virdomainobjlist.c      | 232 +++++++++++++++++++++++++++++--
 src/conf/virdomainobjlist.h      |  10 ++
 src/libvirt_private.syms         |   4 +
 src/qemu/qemu_conf.c             |   4 +-
 src/qemu/qemu_domain.c           |  28 +++-
 src/qemu/qemu_domain.h           |   2 +
 src/qemu/qemu_driver.c           |  46 +++++-
 src/qemu/qemu_migration.c        |  59 +++++---
 src/qemu/qemu_migration.h        |   5 +
 src/qemu/qemu_migration_cookie.c | 121 ++++++++--------
 src/qemu/qemu_migration_cookie.h |   2 +
 src/qemu/qemu_process.c          |   3 +-
 src/qemu/qemu_process.h          |   2 +
 src/util/virclosecallbacks.c     |  48 +++++--
 src/util/virclosecallbacks.h     |   3 +
 tools/virsh-domain.c             |   7 +
 17 files changed, 471 insertions(+), 111 deletions(-)

-- 
2.24.1

Re: [libvirt PATCH 0/6] Introduce Local Migration Support in Libvirt
Posted by Daniel Henrique Barboza 4 years, 2 months ago
Hi Daniel,

I am happy that Libvirt is pushing local migration/live patching support, but
at the same time I am wondering what changed from what you said here:

https://www.redhat.com/archives/libvir-list/2017-September/msg00489.html

To give you some background, we have had live patching enhancements in the
IBM backlog for a few years now, and one of the reasons they were postponed
time and time again was the lack of libvirt support and this direction of
"Libvirt is not interested in supporting it". The message above was being
used internally as the rationale for it.


Thanks,


DHB



Re: [libvirt PATCH 0/6] Introduce Local Migration Support in Libvirt
Posted by Daniel P. Berrangé 4 years, 2 months ago
On Mon, Feb 03, 2020 at 10:42:48AM -0300, Daniel Henrique Barboza wrote:
> Hi Daniel,
> 
> I am happy that Libvirt is pushing local migration/live patching support, but
> at the same time I am wondering what changed from what you said here:

Err, this isn't libvirt pushing local migration. I'm simply re-posting
these patches on behalf of Shaju who is unable to post the patches due
to our broken mail server.  Don't take this as meaning that I approve of
the patches. They're simply here for discussion as any other patch
proposal is.

> https://www.redhat.com/archives/libvir-list/2017-September/msg00489.html

That is largely still my view.

> To give you a background, we have live patching enhancements in IBM backlog
> since a few years ago, and one on the reasons these were being postponed
> time and time again were the lack of Libvirt support and this direction of
> "Libvirt is not interested in supporting it". And this message above was being
> used internally as the rationale for it.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [libvirt PATCH 0/6] Introduce Local Migration Support in Libvirt
Posted by Jim Fehlig 4 years, 2 months ago
On 2/3/20 5:43 AM, Daniel P. Berrangé wrote:
> 3) Why is local migration difficult in Libvirt?
> 
> Libvirt always assumes that the name/UUID pair is unique with in a node. During
> local migration there will be two different VMs with the same UUID/name pair
> which will confuse the management stack. There are other path variables like
> monitor path, config paths etc which assumes that the name/UUID pair is unique.
> So during migration the same monitor will be used by both the source and the
> target. We cannot assign a temporary UUID to the target VM, since UUID is a part
> of the machine ABI which is immutable.
> To decouple the dependecy on UUID/name, a new field (the domain id) is included
> in all the PATHs that Libvirt uses. This will ensure that all instances of the
> VM gets a unique PATH.

Since localhost migration is difficult, and there will likely be some growing 
pains until the feature is fully baked, perhaps it is best to have a knob for 
enabling/disabling it. The namespace feature suffered similar growing pains and 
having the ability to disable it in qemu.conf proved quite handy at times.
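As a sketch only (no such knob exists today; the name and default below are
hypothetical), such a qemu.conf setting might look like:

```
# Allow domains to be migrated to the same host. Off by default while
# the feature matures.
#local_migration = 0
```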

Regards,
Jim


Re: [libvirt PATCH 0/6] Introduce Local Migration Support in Libvirt
Posted by Daniel P. Berrangé 4 years, 2 months ago
On Mon, Feb 10, 2020 at 08:45:28AM -0700, Jim Fehlig wrote:
> On 2/3/20 5:43 AM, Daniel P. Berrangé wrote:
> 
> Since localhost migration is difficult, and there will likely be some
> growing pains until the feature is fully baked, perhaps it is best to have a
> knob for enabling/disabling it. The namespace feature suffered similar
> growing pains and having the ability to disable it in qemu.conf proved quite
> handy at times.

Probably an API flag VIR_MIGRATE_SAME_HOST is sufficient, as that shows
an opt-in on the part of the person/thing that initiates it. I think
we'd want this flag forever, regardless of whether it's experimental
or production quality, because there are special concerns about clashing
host files/resources.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|