[RFC] dbus-vmstate: Connect to the dbus only during the migration phase

Priyankar Jain posted 1 patch 5 months, 2 weeks ago
Test checkpatch passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/1605810535-51254-1-git-send-email-priyankar.jain@nutanix.com
Maintainers: "Marc-André Lureau" <marcandre.lureau@redhat.com>
backends/dbus-vmstate.c | 58 ++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 48 insertions(+), 10 deletions(-)

[RFC] dbus-vmstate: Connect to the dbus only during the migration phase

Posted by Priyankar Jain 5 months, 2 weeks ago
Today, dbus-vmstate maintains a constant connection to the dbus. This is
problematic for a number of reasons:
1. If dbus-vmstate is attached during power-on, then the device holds
   the unused connection for a long period of time until migration
   is triggered, thus unnecessarily occupying dbus.
2. Similarly, if the dbus is restarted in the time period between VM
   power-on (dbus-vmstate initialisation) and migration, then the
   migration will fail. The only way to recover would be by
   re-initialising the dbus-vmstate object.
3. If dbus is not available during VM power-on, then currently dbus-vmstate
   initialisation fails, causing power-on to fail.
4. For a system with large number of VMs, having multiple QEMUs connected to
   the same dbus can lead to a DoS for new connections.

To resolve these issues, this change makes dbus-vmstate connect to the dbus
only during the migration phase, inside dbus_vmstate_pre_save() and
dbus_vmstate_post_load(). The change also adds a helper, clear_dbus_connection(),
and calls it in the appropriate places to ensure dbus-vmstate closes the
connection properly post migration.

Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com>
Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com>
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
---
 backends/dbus-vmstate.c | 58 ++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 48 insertions(+), 10 deletions(-)

diff --git a/backends/dbus-vmstate.c b/backends/dbus-vmstate.c
index bd050e8..6bff11e 100644
--- a/backends/dbus-vmstate.c
+++ b/backends/dbus-vmstate.c
@@ -80,6 +80,20 @@ get_id_list_set(DBusVMState *self)
     return g_steal_pointer(&set);
 }
 
+static void
+clear_dbus_connection(DBusVMState *self)
+{
+    g_autoptr(GError) err = NULL;
+    if (self->bus) {
+        g_dbus_connection_close_sync(self->bus, NULL, &err);
+        if (err) {
+            warn_report("%s: Failed to close the Dbus connection: %s",
+                        __func__, err->message);
+        }
+        g_clear_object(&self->bus);
+    }
+}
+
 static GHashTable *
 dbus_get_proxies(DBusVMState *self, GError **err)
 {
@@ -195,9 +209,21 @@ static int dbus_vmstate_post_load(void *opaque, int version_id)
 
     trace_dbus_vmstate_post_load(version_id);
 
+    self->bus = g_dbus_connection_new_for_address_sync(self->dbus_addr,
+                    G_DBUS_CONNECTION_FLAGS_AUTHENTICATION_CLIENT |
+                    G_DBUS_CONNECTION_FLAGS_MESSAGE_BUS_CONNECTION,
+                    NULL, NULL, &err);
+    if (err) {
+        error_report("%s: Failed to connect to DBus: '%s'",
+                     __func__, err->message);
+        clear_dbus_connection(self);
+        return -1;
+    }
+
     proxies = dbus_get_proxies(self, &err);
     if (!proxies) {
         error_report("%s: Failed to get proxies: %s", __func__, err->message);
+        clear_dbus_connection(self);
         return -1;
     }
 
@@ -223,6 +249,7 @@ static int dbus_vmstate_post_load(void *opaque, int version_id)
         if (len >= 256) {
             error_report("%s: Invalid DBus vmstate proxy name %u",
                          __func__, len);
+            clear_dbus_connection(self);
             return -1;
         }
         if (!g_input_stream_read_all(G_INPUT_STREAM(s), id, len,
@@ -237,6 +264,7 @@ static int dbus_vmstate_post_load(void *opaque, int version_id)
         proxy = g_hash_table_lookup(proxies, id);
         if (!proxy) {
             error_report("%s: Failed to find proxy Id '%s'", __func__, id);
+            clear_dbus_connection(self);
             return -1;
         }
 
@@ -246,6 +274,7 @@ static int dbus_vmstate_post_load(void *opaque, int version_id)
 
         if (len > DBUS_VMSTATE_SIZE_LIMIT || len > avail) {
             error_report("%s: Invalid vmstate size: %u", __func__, len);
+            clear_dbus_connection(self);
             return -1;
         }
 
@@ -254,6 +283,7 @@ static int dbus_vmstate_post_load(void *opaque, int version_id)
                                                     NULL),
                 len) < 0) {
             error_report("%s: Failed to restore Id '%s'", __func__, id);
+            clear_dbus_connection(self);
             return -1;
         }
 
@@ -264,10 +294,12 @@ static int dbus_vmstate_post_load(void *opaque, int version_id)
         nelem -= 1;
     }
 
+    clear_dbus_connection(self);
     return 0;
 
 error:
     error_report("%s: Failed to read from stream: %s", __func__, err->message);
+    clear_dbus_connection(self);
     return -1;
 }
 
@@ -327,9 +359,21 @@ static int dbus_vmstate_pre_save(void *opaque)
 
     trace_dbus_vmstate_pre_save();
 
+    self->bus = g_dbus_connection_new_for_address_sync(self->dbus_addr,
+                    G_DBUS_CONNECTION_FLAGS_AUTHENTICATION_CLIENT |
+                    G_DBUS_CONNECTION_FLAGS_MESSAGE_BUS_CONNECTION,
+                    NULL, NULL, &err);
+    if (err) {
+        error_report("%s: Failed to connect to DBus: '%s'",
+                     __func__, err->message);
+        clear_dbus_connection(self);
+        return -1;
+    }
+
     proxies = dbus_get_proxies(self, &err);
     if (!proxies) {
         error_report("%s: Failed to get proxies: %s", __func__, err->message);
+        clear_dbus_connection(self);
         return -1;
     }
 
@@ -341,6 +385,7 @@ static int dbus_vmstate_pre_save(void *opaque)
                                          NULL, &err)) {
         error_report("%s: Failed to write to stream: %s",
                      __func__, err->message);
+        clear_dbus_connection(self);
         return -1;
     }
 
@@ -349,11 +394,13 @@ static int dbus_vmstate_pre_save(void *opaque)
     if (g_memory_output_stream_get_size(G_MEMORY_OUTPUT_STREAM(m))
         > UINT32_MAX) {
         error_report("%s: DBus vmstate buffer is too large", __func__);
+        clear_dbus_connection(self);
         return -1;
     }
 
     if (!g_output_stream_close(G_OUTPUT_STREAM(m), NULL, &err)) {
         error_report("%s: Failed to close stream: %s", __func__, err->message);
+        clear_dbus_connection(self);
         return -1;
     }
 
@@ -363,6 +410,7 @@ static int dbus_vmstate_pre_save(void *opaque)
     self->data =
         g_memory_output_stream_steal_data(G_MEMORY_OUTPUT_STREAM(m));
 
+    clear_dbus_connection(self);
     return 0;
 }
 
@@ -382,7 +430,6 @@ static void
 dbus_vmstate_complete(UserCreatable *uc, Error **errp)
 {
     DBusVMState *self = DBUS_VMSTATE(uc);
-    g_autoptr(GError) err = NULL;
 
     if (!object_resolve_path_type("", TYPE_DBUS_VMSTATE, NULL)) {
         error_setg(errp, "There is already an instance of %s",
@@ -395,15 +442,6 @@ dbus_vmstate_complete(UserCreatable *uc, Error **errp)
         return;
     }
 
-    self->bus = g_dbus_connection_new_for_address_sync(self->dbus_addr,
-                    G_DBUS_CONNECTION_FLAGS_AUTHENTICATION_CLIENT |
-                    G_DBUS_CONNECTION_FLAGS_MESSAGE_BUS_CONNECTION,
-                    NULL, NULL, &err);
-    if (err) {
-        error_setg(errp, "failed to connect to DBus: '%s'", err->message);
-        return;
-    }
-
     if (vmstate_register(VMSTATE_IF(self), VMSTATE_INSTANCE_ID_ANY,
                          &dbus_vmstate, self) < 0) {
         error_setg(errp, "Failed to register vmstate");
-- 
1.8.3.1


Re: [RFC] dbus-vmstate: Connect to the dbus only during the migration phase

Posted by Daniel P. Berrangé 5 months, 2 weeks ago
On Thu, Nov 19, 2020 at 06:28:55PM +0000, Priyankar Jain wrote:
> Today, dbus-vmstate maintains a constant connection to the dbus. This is
> problematic for a number of reasons:
> 1. If dbus-vmstate is attached during power-on, then the device holds
>    the unused connection for a long period of time until migration
>    is triggered, thus unnecessarily occupying dbus.
> 2. Similarly, if the dbus is restarted in the time period between VM
>    power-on (dbus-vmstate initialisation) and migration, then the
>    migration will fail. The only way to recover would be by
>    re-initialising the dbus-vmstate object.
> 3. If dbus is not available during VM power-on, then currently dbus-vmstate
>    initialisation fails, causing power-on to fail.
> 4. For a system with large number of VMs, having multiple QEMUs connected to
>    the same dbus can lead to a DoS for new connections.

The expectation is that there is a *separate* dbus daemon created for
each QEMU instance. There should never be multiple QEMUs connected to
the same dbus instance, nor should it ever connect to the common dbus
instances provided by most Linux distros.

None of these 4 issues should apply when each QEMU has its own dedicated
dbus instance AFAICT.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [RFC] dbus-vmstate: Connect to the dbus only during the migration phase

Posted by priyankar jain 5 months ago
On 20/11/20 12:17 am, Daniel P. Berrangé wrote:
> On Thu, Nov 19, 2020 at 06:28:55PM +0000, Priyankar Jain wrote:
>> Today, dbus-vmstate maintains a constant connection to the dbus. This is
>> problematic for a number of reasons:
>> 1. If dbus-vmstate is attached during power-on, then the device holds
>>     the unused connection for a long period of time until migration
>>     is triggered, thus unnecessarily occupying dbus.
>> 2. Similarly, if the dbus is restarted in the time period between VM
>>     power-on (dbus-vmstate initialisation) and migration, then the
>>     migration will fail. The only way to recover would be by
>>     re-initialising the dbus-vmstate object.
>> 3. If dbus is not available during VM power-on, then currently dbus-vmstate
>>     initialisation fails, causing power-on to fail.
>> 4. For a system with large number of VMs, having multiple QEMUs connected to
>>     the same dbus can lead to a DoS for new connections.
> 
> The expectation is that there is a *separate* dbus daemon created for
> each QEMU instance. There should never be multiple QEMUs connected to
> the same dbus instance, nor should it ever connect to the common dbus
> instances provided by most Linux distros.
> 
> None of these 4 issues should apply when each QEMU has its own dedicated
> dbus instance AFAICT.
> 
> 
> Regards,
> Daniel
> 

How does having a separate dbus daemon resolve issue (2)? If any daemon 
restarts between VM power-on and migration, the dbus-vmstate object for 
that VM would have to be reinitialized, no?

Secondly, on a setup with large number of VMs, having separate 
dbus-daemons leads to high cummulative memory usage by dbus daemons, is 
it a feasible approach to spawn a new dbus-daemon for every QEMU, given 
the fact that majority of the security aspect lies with the dbus peers, 
apart from the SELinux checks provided by dbus.

Regards,
Priyankar Jain

Re: [RFC] dbus-vmstate: Connect to the dbus only during the migration phase

Posted by Daniel P. Berrangé 5 months ago
On Wed, Dec 02, 2020 at 09:25:27PM +0530, priyankar jain wrote:
> On 20/11/20 12:17 am, Daniel P. Berrangé wrote:
> > On Thu, Nov 19, 2020 at 06:28:55PM +0000, Priyankar Jain wrote:
> > > Today, dbus-vmstate maintains a constant connection to the dbus. This is
> > > problematic for a number of reasons:
> > > 1. If dbus-vmstate is attached during power-on, then the device holds
> > >     the unused connection for a long period of time until migration
> > >     is triggered, thus unnecessarily occupying dbus.
> > > 2. Similarly, if the dbus is restarted in the time period between VM
> > >     power-on (dbus-vmstate initialisation) and migration, then the
> > >     migration will fail. The only way to recover would be by
> > >     re-initialising the dbus-vmstate object.
> > > 3. If dbus is not available during VM power-on, then currently dbus-vmstate
> > >     initialisation fails, causing power-on to fail.
> > > 4. For a system with large number of VMs, having multiple QEMUs connected to
> > >     the same dbus can lead to a DoS for new connections.
> > 
> > The expectation is that there is a *separate* dbus daemon created for
> > each QEMU instance. There should never be multiple QEMUs connected to
> > the same dbus instance, nor should it ever connect to the common dbus
> > instances provided by most Linux distros.
> > 
> > None of these 4 issues should apply when each QEMU has its own dedicated
> > dbus instance AFAICT.
> > 
> > 
> > Regards,
> > Daniel
> > 
> 
> How does having a separate dbus daemon resolve issue (2)? If any daemon
> restarts between VM power-on and migration, the dbus-vmstate object for that
> VM would have to be reinitialized, no?

The private dbus damon for QEMU is expected to exist for the lifetime of
that QEMU process.

> Secondly, on a setup with large number of VMs, having separate dbus-daemons
> leads to high cummulative memory usage by dbus daemons, is it a feasible
> approach to spawn a new dbus-daemon for every QEMU, given the fact that
> majority of the security aspect lies with the dbus peers, apart from the
> SELinux checks provided by dbus.

The memory usage of a dbus daemon shouldn't be that high. A large portion
of the memory footprint should be readony pages shared between all dbus
procsses. The private usage should be a functional of number of clients
and the message traffic. Do you have any measured figures you're concerned
with ?

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [RFC] dbus-vmstate: Connect to the dbus only during the migration phase

Posted by priyankar jain 5 months ago
On 02/12/20 9:46 pm, Daniel P. Berrangé wrote:
> On Wed, Dec 02, 2020 at 09:25:27PM +0530, priyankar jain wrote:
>> On 20/11/20 12:17 am, Daniel P. Berrangé wrote:
>>> On Thu, Nov 19, 2020 at 06:28:55PM +0000, Priyankar Jain wrote:
>>>> Today, dbus-vmstate maintains a constant connection to the dbus. This is
>>>> problematic for a number of reasons:
>>>> 1. If dbus-vmstate is attached during power-on, then the device holds
>>>>      the unused connection for a long period of time until migration
>>>>      is triggered, thus unnecessarily occupying dbus.
>>>> 2. Similarly, if the dbus is restarted in the time period between VM
>>>>      power-on (dbus-vmstate initialisation) and migration, then the
>>>>      migration will fail. The only way to recover would be by
>>>>      re-initialising the dbus-vmstate object.
>>>> 3. If dbus is not available during VM power-on, then currently dbus-vmstate
>>>>      initialisation fails, causing power-on to fail.
>>>> 4. For a system with large number of VMs, having multiple QEMUs connected to
>>>>      the same dbus can lead to a DoS for new connections.
>>>
>>> The expectation is that there is a *separate* dbus daemon created for
>>> each QEMU instance. There should never be multiple QEMUs connected to
>>> the same dbus instance, nor should it ever connect to the common dbus
>>> instances provided by most Linux distros.
>>>
>>> None of these 4 issues should apply when each QEMU has its own dedicated
>>> dbus instance AFAICT.
>>>
>>>
>>> Regards,
>>> Daniel
>>>
>>
>> How does having a separate dbus daemon resolve issue (2)? If any daemon
>> restarts between VM power-on and migration, the dbus-vmstate object for that
>> VM would have to be reinitialized, no?
> 
> The private dbus damon for QEMU is expected to exist for the lifetime of
> that QEMU process.

Totally agree on the expectation. But any external stimuli (maybe 
unintended) can easily break this condition, and this would indeed 
result into a situation where the VM is basically non migratable until 
the VM is powered cycled or the dbus-vmstate is removed by manual 
intervention.

Secondly, having dbus-vmstate backend connect to dbus at migration time 
eases the complexity for any management plane to recover in these 
failure situations by monitoring dbus and restarting it with the same 
params if dbus gets killed without affecting QEMU.

>> Secondly, on a setup with large number of VMs, having separate dbus-daemons
>> leads to high cummulative memory usage by dbus daemons, is it a feasible
>> approach to spawn a new dbus-daemon for every QEMU, given the fact that
>> majority of the security aspect lies with the dbus peers, apart from the
>> SELinux checks provided by dbus.
> 
> The memory usage of a dbus daemon shouldn't be that high. A large portion
> of the memory footprint should be readony pages shared between all dbus
> procsses. The private usage should be a functional of number of clients
> and the message traffic. Do you have any measured figures you're concerned
> with ?

One of our setup had a long running private dbus-daemon (nearly 4-5 
days) in the destination hypervisor after performing migration, it was 
showing the following memory usage (figures in kB):
Virt  - 90980
Rss   - 19576
Total - 110556

Extrapolating these figures for 100s of daemons results in considerable 
Rss usage. These figures were taken using `top` linux utility. But I had 
not considered the readonly shared pages aspect at the time of capture.

Regards,
Priyankar

Re: [RFC] dbus-vmstate: Connect to the dbus only during the migration phase

Posted by Daniel P. Berrangé 5 months ago
On Wed, Dec 02, 2020 at 10:33:01PM +0530, priyankar jain wrote:
> On 02/12/20 9:46 pm, Daniel P. Berrangé wrote:
> > On Wed, Dec 02, 2020 at 09:25:27PM +0530, priyankar jain wrote:
> > > On 20/11/20 12:17 am, Daniel P. Berrangé wrote:
> > > > On Thu, Nov 19, 2020 at 06:28:55PM +0000, Priyankar Jain wrote:
> > > > > Today, dbus-vmstate maintains a constant connection to the dbus. This is
> > > > > problematic for a number of reasons:
> > > > > 1. If dbus-vmstate is attached during power-on, then the device holds
> > > > >      the unused connection for a long period of time until migration
> > > > >      is triggered, thus unnecessarily occupying dbus.
> > > > > 2. Similarly, if the dbus is restarted in the time period between VM
> > > > >      power-on (dbus-vmstate initialisation) and migration, then the
> > > > >      migration will fail. The only way to recover would be by
> > > > >      re-initialising the dbus-vmstate object.
> > > > > 3. If dbus is not available during VM power-on, then currently dbus-vmstate
> > > > >      initialisation fails, causing power-on to fail.
> > > > > 4. For a system with large number of VMs, having multiple QEMUs connected to
> > > > >      the same dbus can lead to a DoS for new connections.
> > > > 
> > > > The expectation is that there is a *separate* dbus daemon created for
> > > > each QEMU instance. There should never be multiple QEMUs connected to
> > > > the same dbus instance, nor should it ever connect to the common dbus
> > > > instances provided by most Linux distros.
> > > > 
> > > > None of these 4 issues should apply when each QEMU has its own dedicated
> > > > dbus instance AFAICT.
> > > > 
> > > > 
> > > > Regards,
> > > > Daniel
> > > > 
> > > 
> > > How does having a separate dbus daemon resolve issue (2)? If any daemon
> > > restarts between VM power-on and migration, the dbus-vmstate object for that
> > > VM would have to be reinitialized, no?
> > 
> > The private dbus damon for QEMU is expected to exist for the lifetime of
> > that QEMU process.
> 
> Totally agree on the expectation. But any external stimuli (maybe
> unintended) can easily break this condition, and this would indeed result
> into a situation where the VM is basically non migratable until the VM is
> powered cycled or the dbus-vmstate is removed by manual intervention.



> Secondly, having dbus-vmstate backend connect to dbus at migration time
> eases the complexity for any management plane to recover in these failure
> situations by monitoring dbus and restarting it with the same params if dbus
> gets killed without affecting QEMU.

The vmstate support is just one use case for the dbus daemon.

The intention when specifying dbus was that this serves as a general
purpose framework on which other functionality will be built for
mgmt of helper processes associated with QEMU VM. So ultimately it
is thought that the dbus service will be relevant  throuh the lifetime
of the VM.

Obviously it doesn't appear that way right now, but I'm wary about
optimizing to dynamically connect/disconnect around migration since
we're expecting it to be in use any arbitrary times for other things
long term.

> 
> > > Secondly, on a setup with large number of VMs, having separate dbus-daemons
> > > leads to high cummulative memory usage by dbus daemons, is it a feasible
> > > approach to spawn a new dbus-daemon for every QEMU, given the fact that
> > > majority of the security aspect lies with the dbus peers, apart from the
> > > SELinux checks provided by dbus.
> > 
> > The memory usage of a dbus daemon shouldn't be that high. A large portion
> > of the memory footprint should be readony pages shared between all dbus
> > procsses. The private usage should be a functional of number of clients
> > and the message traffic. Do you have any measured figures you're concerned
> > with ?
> 
> One of our setup had a long running private dbus-daemon (nearly 4-5 days) in
> the destination hypervisor after performing migration, it was showing the
> following memory usage (figures in kB):
> Virt  - 90980
> Rss   - 19576
> Total - 110556
> 
> Extrapolating these figures for 100s of daemons results in considerable Rss
> usage. These figures were taken using `top` linux utility. But I had not
> considered the readonly shared pages aspect at the time of capture.

Yep, you can't just multiply  'Rss' by the number of instances, as that'll
way over-estimate the overhead. Need to look at the private Rss only. 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|