[libvirt] [PATCH] qemu: Forcibly mknod() even if it exists

Michal Privoznik posted 1 patch 4 years, 4 months ago
Test syntax-check failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/c6563e0fba698a54db07e13ecfe5c3753f9c4b62.1574243464.git.mprivozn@redhat.com
src/qemu/qemu_domain.c | 25 +++++++++----------------
1 file changed, 9 insertions(+), 16 deletions(-)
[libvirt] [PATCH] qemu: Forcibly mknod() even if it exists
Posted by Michal Privoznik 4 years, 4 months ago
Another weird bug appeared concerning qemu namespaces. Basically
the problem is as follows:

1) Issue an API that causes libvirt to create a node in domain's
   namespace, say /dev/sda with 8:0 as major:minor (the API can
   be attach-disk for instance). Or simply create the node from
   a console by hand.

2) Detach the disk from qemu.

3) Do something that makes /dev/sda change it's minor number.

4) Try to attach the disk again.

The problem is, in a few cases - like disk-detach - we don't
remove the corresponding /dev node from the mount namespace
(because it may be used by some other disk's backing chain).
But this creates a problem, because if the node changes its
MAJ:MIN numbers we don't propagate the change into the domain's
namespace. We do plain mknod() and ignore EEXIST which obviously
is not enough because it doesn't guarantee that the node has
updated MAJ:MIN pair.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1752978

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 src/qemu/qemu_domain.c | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 262b74d1ab..f54b9b21ff 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -13203,16 +13203,14 @@ qemuDomainCreateDeviceRecursive(const char *device,
                                             allow_noent, ttl - 1) < 0)
             goto cleanup;
     } else if (isDev) {
-        if (create &&
-            mknod(devicePath, sb.st_mode, sb.st_rdev) < 0) {
-            if (errno == EEXIST) {
-                ret = 0;
-            } else {
+        if (create) {
+            unlink(devicePath);
+            if (mknod(devicePath, sb.st_mode, sb.st_rdev) < 0) {
                 virReportSystemError(errno,
                                      _("Failed to make device %s"),
                                      devicePath);
+                goto cleanup;
             }
-            goto cleanup;
         }
     } else if (isReg) {
         if (create &&
@@ -13996,17 +13994,12 @@ qemuDomainAttachDeviceMknodHelper(pid_t pid G_GNUC_UNUSED,
     } else if (isDev) {
         VIR_DEBUG("Creating dev %s (%d,%d)",
                   data->file, major(data->sb.st_rdev), minor(data->sb.st_rdev));
+        unlink(data->file);
         if (mknod(data->file, data->sb.st_mode, data->sb.st_rdev) < 0) {
-            /* Because we are not removing devices on hotunplug, or
-             * we might be creating part of backing chain that
-             * already exist due to a different disk plugged to
-             * domain, accept EEXIST. */
-            if (errno != EEXIST) {
-                virReportSystemError(errno,
-                                     _("Unable to create device %s"),
-                                     data->file);
-                goto cleanup;
-            }
+            virReportSystemError(errno,
+                                 _("Unable to create device %s"),
+                                 data->file);
+            goto cleanup;
         } else {
             delDevice = true;
         }
-- 
2.23.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] qemu: Forcibly mknod() even if it exists
Posted by Daniel P. Berrangé 4 years, 4 months ago
On Wed, Nov 20, 2019 at 10:51:47AM +0100, Michal Privoznik wrote:
> Another weird bug appeared concerning qemu namespaces. Basically
> the problem is as follows:
> 
> 1) Issue an API that causes libvirt to create a node in domain's
>    namespace, say /dev/sda with 8:0 as major:minor (the API can
>    be attach-disk for instance). Or simply create the node from
>    a console by hand.
> 
> 2) Detach the disk from qemu.
> 
> 3) Do something that makes /dev/sda change it's minor number.

Wait, what ?

major/minor numbers for SCSI disks are a defined standard
IIUC

$ grep SCSI ./admin-guide/devices.txt | grep block
   8 block	SCSI disk devices (0-15)
  11 block	SCSI CD-ROM devices
  65 block	SCSI disk devices (16-31)
  66 block	SCSI disk devices (32-47)
  67 block	SCSI disk devices (48-63)
  68 block	SCSI disk devices (64-79)
  69 block	SCSI disk devices (80-95)
  70 block	SCSI disk devices (96-111)
  71 block	SCSI disk devices (112-127)
 128 block       SCSI disk devices (128-143)
 129 block       SCSI disk devices (144-159)
 130 block       SCSI disk devices (160-175)
 131 block       SCSI disk devices (176-191)
 132 block       SCSI disk devices (192-207)
 133 block       SCSI disk devices (208-223)
 134 block       SCSI disk devices (224-239)
 135 block       SCSI disk devices (240-255)


IOW, /dev/sda should always be 8,0

There is, however, the possibiity of dynamically assign
major/minor numbers so its possible we'll see a block
device with a changable number, but AFAIK, such a block
device should never be call /dev/sda !??!

IOW, the commit is fine, but is this commit message
really accurate ?

> 
> 4) Try to attach the disk again.
> 
> The problem is, in a few cases - like disk-detach - we don't
> remove the corresponding /dev node from the mount namespace
> (because it may be used by some other disk's backing chain).
> But this creates a problem, because if the node changes its
> MAJ:MIN numbers we don't propagate the change into the domain's
> namespace. We do plain mknod() and ignore EEXIST which obviously
> is not enough because it doesn't guarantee that the node has
> updated MAJ:MIN pair.
> 
> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1752978
> 
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
>  src/qemu/qemu_domain.c | 25 +++++++++----------------
>  1 file changed, 9 insertions(+), 16 deletions(-)
> 
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index 262b74d1ab..f54b9b21ff 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -13203,16 +13203,14 @@ qemuDomainCreateDeviceRecursive(const char *device,
>                                              allow_noent, ttl - 1) < 0)
>              goto cleanup;
>      } else if (isDev) {
> -        if (create &&
> -            mknod(devicePath, sb.st_mode, sb.st_rdev) < 0) {
> -            if (errno == EEXIST) {
> -                ret = 0;
> -            } else {
> +        if (create) {
> +            unlink(devicePath);
> +            if (mknod(devicePath, sb.st_mode, sb.st_rdev) < 0) {
>                  virReportSystemError(errno,
>                                       _("Failed to make device %s"),
>                                       devicePath);
> +                goto cleanup;
>              }
> -            goto cleanup;
>          }
>      } else if (isReg) {
>          if (create &&
> @@ -13996,17 +13994,12 @@ qemuDomainAttachDeviceMknodHelper(pid_t pid G_GNUC_UNUSED,
>      } else if (isDev) {
>          VIR_DEBUG("Creating dev %s (%d,%d)",
>                    data->file, major(data->sb.st_rdev), minor(data->sb.st_rdev));
> +        unlink(data->file);
>          if (mknod(data->file, data->sb.st_mode, data->sb.st_rdev) < 0) {
> -            /* Because we are not removing devices on hotunplug, or
> -             * we might be creating part of backing chain that
> -             * already exist due to a different disk plugged to
> -             * domain, accept EEXIST. */
> -            if (errno != EEXIST) {
> -                virReportSystemError(errno,
> -                                     _("Unable to create device %s"),
> -                                     data->file);
> -                goto cleanup;
> -            }
> +            virReportSystemError(errno,
> +                                 _("Unable to create device %s"),
> +                                 data->file);
> +            goto cleanup;
>          } else {
>              delDevice = true;
>          }
> -- 
> 2.23.0
> 
> --
> libvir-list mailing list
> libvir-list@redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] qemu: Forcibly mknod() even if it exists
Posted by Michal Privoznik 4 years, 4 months ago
On 11/20/19 11:05 AM, Daniel P. Berrangé wrote:
> On Wed, Nov 20, 2019 at 10:51:47AM +0100, Michal Privoznik wrote:
>> Another weird bug appeared concerning qemu namespaces. Basically
>> the problem is as follows:
>>
>> 1) Issue an API that causes libvirt to create a node in domain's
>>     namespace, say /dev/sda with 8:0 as major:minor (the API can
>>     be attach-disk for instance). Or simply create the node from
>>     a console by hand.
>>
>> 2) Detach the disk from qemu.
>>
>> 3) Do something that makes /dev/sda change it's minor number.
> 
> Wait, what ?
> 
> major/minor numbers for SCSI disks are a defined standard
> IIUC
> 
> $ grep SCSI ./admin-guide/devices.txt | grep block
>     8 block	SCSI disk devices (0-15)
>    11 block	SCSI CD-ROM devices
>    65 block	SCSI disk devices (16-31)
>    66 block	SCSI disk devices (32-47)
>    67 block	SCSI disk devices (48-63)
>    68 block	SCSI disk devices (64-79)
>    69 block	SCSI disk devices (80-95)
>    70 block	SCSI disk devices (96-111)
>    71 block	SCSI disk devices (112-127)
>   128 block       SCSI disk devices (128-143)
>   129 block       SCSI disk devices (144-159)
>   130 block       SCSI disk devices (160-175)
>   131 block       SCSI disk devices (176-191)
>   132 block       SCSI disk devices (192-207)
>   133 block       SCSI disk devices (208-223)
>   134 block       SCSI disk devices (224-239)
>   135 block       SCSI disk devices (240-255)
> 
> 
> IOW, /dev/sda should always be 8,0
> 
> There is, however, the possibiity of dynamically assign
> major/minor numbers so its possible we'll see a block
> device with a changable number, but AFAIK, such a block
> device should never be call /dev/sda !??!
> 
> IOW, the commit is fine, but is this commit message
> really accurate ?

Ah, the bug talks about /dev/nvmeN that has changed MIN number. I 
haven't found corresponding docs on NVMe though. Is s/sda/nvmeN/ on the 
commit message enough?

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] qemu: Forcibly mknod() even if it exists
Posted by Daniel P. Berrangé 4 years, 4 months ago
On Wed, Nov 20, 2019 at 11:34:24AM +0100, Michal Privoznik wrote:
> On 11/20/19 11:05 AM, Daniel P. Berrangé wrote:
> > On Wed, Nov 20, 2019 at 10:51:47AM +0100, Michal Privoznik wrote:
> > > Another weird bug appeared concerning qemu namespaces. Basically
> > > the problem is as follows:
> > > 
> > > 1) Issue an API that causes libvirt to create a node in domain's
> > >     namespace, say /dev/sda with 8:0 as major:minor (the API can
> > >     be attach-disk for instance). Or simply create the node from
> > >     a console by hand.
> > > 
> > > 2) Detach the disk from qemu.
> > > 
> > > 3) Do something that makes /dev/sda change it's minor number.
> > 
> > Wait, what ?
> > 
> > major/minor numbers for SCSI disks are a defined standard
> > IIUC
> > 
> > $ grep SCSI ./admin-guide/devices.txt | grep block
> >     8 block	SCSI disk devices (0-15)
> >    11 block	SCSI CD-ROM devices
> >    65 block	SCSI disk devices (16-31)
> >    66 block	SCSI disk devices (32-47)
> >    67 block	SCSI disk devices (48-63)
> >    68 block	SCSI disk devices (64-79)
> >    69 block	SCSI disk devices (80-95)
> >    70 block	SCSI disk devices (96-111)
> >    71 block	SCSI disk devices (112-127)
> >   128 block       SCSI disk devices (128-143)
> >   129 block       SCSI disk devices (144-159)
> >   130 block       SCSI disk devices (160-175)
> >   131 block       SCSI disk devices (176-191)
> >   132 block       SCSI disk devices (192-207)
> >   133 block       SCSI disk devices (208-223)
> >   134 block       SCSI disk devices (224-239)
> >   135 block       SCSI disk devices (240-255)
> > 
> > 
> > IOW, /dev/sda should always be 8,0
> > 
> > There is, however, the possibiity of dynamically assign
> > major/minor numbers so its possible we'll see a block
> > device with a changable number, but AFAIK, such a block
> > device should never be call /dev/sda !??!
> > 
> > IOW, the commit is fine, but is this commit message
> > really accurate ?
> 
> Ah, the bug talks about /dev/nvmeN that has changed MIN number. I haven't
> found corresponding docs on NVMe though. Is s/sda/nvmeN/ on the commit
> message enough?

Yeah, that makes more sense as nvme devices are allocated from the
dynamic range.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] qemu: Forcibly mknod() even if it exists
Posted by Daniel P. Berrangé 4 years, 4 months ago
On Wed, Nov 20, 2019 at 10:37:14AM +0000, Daniel P. Berrangé wrote:
> On Wed, Nov 20, 2019 at 11:34:24AM +0100, Michal Privoznik wrote:
> > On 11/20/19 11:05 AM, Daniel P. Berrangé wrote:
> > > On Wed, Nov 20, 2019 at 10:51:47AM +0100, Michal Privoznik wrote:
> > > > Another weird bug appeared concerning qemu namespaces. Basically
> > > > the problem is as follows:
> > > > 
> > > > 1) Issue an API that causes libvirt to create a node in domain's
> > > >     namespace, say /dev/sda with 8:0 as major:minor (the API can
> > > >     be attach-disk for instance). Or simply create the node from
> > > >     a console by hand.
> > > > 
> > > > 2) Detach the disk from qemu.
> > > > 
> > > > 3) Do something that makes /dev/sda change it's minor number.
> > > 
> > > Wait, what ?
> > > 
> > > major/minor numbers for SCSI disks are a defined standard
> > > IIUC
> > > 
> > > $ grep SCSI ./admin-guide/devices.txt | grep block
> > >     8 block	SCSI disk devices (0-15)
> > >    11 block	SCSI CD-ROM devices
> > >    65 block	SCSI disk devices (16-31)
> > >    66 block	SCSI disk devices (32-47)
> > >    67 block	SCSI disk devices (48-63)
> > >    68 block	SCSI disk devices (64-79)
> > >    69 block	SCSI disk devices (80-95)
> > >    70 block	SCSI disk devices (96-111)
> > >    71 block	SCSI disk devices (112-127)
> > >   128 block       SCSI disk devices (128-143)
> > >   129 block       SCSI disk devices (144-159)
> > >   130 block       SCSI disk devices (160-175)
> > >   131 block       SCSI disk devices (176-191)
> > >   132 block       SCSI disk devices (192-207)
> > >   133 block       SCSI disk devices (208-223)
> > >   134 block       SCSI disk devices (224-239)
> > >   135 block       SCSI disk devices (240-255)
> > > 
> > > 
> > > IOW, /dev/sda should always be 8,0
> > > 
> > > There is, however, the possibiity of dynamically assign
> > > major/minor numbers so its possible we'll see a block
> > > device with a changable number, but AFAIK, such a block
> > > device should never be call /dev/sda !??!
> > > 
> > > IOW, the commit is fine, but is this commit message
> > > really accurate ?
> > 
> > Ah, the bug talks about /dev/nvmeN that has changed MIN number. I haven't
> > found corresponding docs on NVMe though. Is s/sda/nvmeN/ on the commit
> > message enough?
> 
> Yeah, that makes more sense as nvme devices are allocated from the
> dynamic range.

Opps, I should have added

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list