Don't process subchannel devices where `def->driver` is not set. This
fixes the following segfault:

Thread 21 "nodedev-init" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x3ffb08fc910 (LWP 64303)]
(gdb) bt
#0 0x000003fffd1272b4 in __strcmp_vx () at /lib64/libc.so.6
#1 0x000003ffc260c3a8 in udevProcessCSS (device=0x3ff9018d130, def=0x3ff90194a90)
#2 0x000003ffc260cb78 in udevGetDeviceDetails (device=0x3ff9018d130, def=0x3ff90194a90)
#3 0x000003ffc260d126 in udevAddOneDevice (device=0x3ff9018d130)
#4 0x000003ffc260d414 in udevProcessDeviceListEntry (udev=0x3ffa810d800, list_entry=0x3ff90001990)
#5 0x000003ffc260d638 in udevEnumerateDevices (udev=0x3ffa810d800)
#6 0x000003ffc260e08e in nodeStateInitializeEnumerate (opaque=0x3ffa810d800)
#7 0x000003fffdaa14b6 in virThreadHelper (data=0x3ffa810df00)
#8 0x000003fffc309ed6 in start_thread ()
#9 0x000003fffd185e66 in thread_start ()
(gdb) p *def
$2 = {
  name = 0x0,
  sysfs_path = 0x3ff90198e80 "/sys/devices/css0/0.0.ff40",
  parent = 0x0,
  parent_sysfs_path = 0x0,
  parent_wwnn = 0x0,
  parent_wwpn = 0x0,
  parent_fabric_wwn = 0x0,
  driver = 0x0,
  devnode = 0x0,
  devlinks = 0x3ff90194670,
  caps = 0x3ff90194380
}

Fixes: 05e6cdafa6e0 ("node_device: detect CSS devices")
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
---
src/node_device/node_device_udev.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
index 5f2841bb7d8e..12e3f30badd1 100644
--- a/src/node_device/node_device_udev.c
+++ b/src/node_device/node_device_udev.c
@@ -1130,8 +1130,9 @@ udevProcessCSS(struct udev_device *device,
                virNodeDeviceDefPtr def)
 {
     /* only process IO subchannel and vfio-ccw devices to keep the list sane */
-    if (STRNEQ(def->driver, "io_subchannel") &&
-        STRNEQ(def->driver, "vfio_ccw"))
+    if (!def->driver ||
+        (STRNEQ(def->driver, "io_subchannel") &&
+         STRNEQ(def->driver, "vfio_ccw")))
         return -1;
 
     if (udevGetCCWAddress(def->sysfs_path, &def->caps->data) < 0)
--
2.25.4
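
For readers skimming the thread: libvirt's `STRNEQ(a, b)` is a thin wrapper around `strcmp()`, so a NULL `def->driver` ends up dereferenced inside libc, which matches frame #0 of the backtrace. The following is only a minimal standalone sketch of the guarded check. `css_driver_is_supported` is a hypothetical helper made up for illustration, not a libvirt function, and it uses plain `strcmp()` instead of libvirt's macros.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libvirt): returns true only when
 * `driver` is set and names one of the two subchannel drivers that
 * the patch keeps in the node device list. */
static bool
css_driver_is_supported(const char *driver)
{
    /* An unbound subchannel has no driver name; bail out before
     * strcmp() can dereference the NULL pointer. */
    if (!driver)
        return false;

    return strcmp(driver, "io_subchannel") == 0 ||
           strcmp(driver, "vfio_ccw") == 0;
}
```

With this shape, an unbound subchannel and a subchannel bound to some other driver (for example "chsc_subchannel") are both rejected, which mirrors the `return -1` path in the patched `udevProcessCSS()`.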
On Mon, Sep 21, 2020 at 07:06:32PM +0200, Marc Hartmayer wrote:
> Don't process subchannel devices where `def->driver` is not set. This
> fixes the following segfault:
>
> Thread 21 "nodedev-init" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x3ffb08fc910 (LWP 64303)]
> (gdb) bt
> #0 0x000003fffd1272b4 in __strcmp_vx () at /lib64/libc.so.6
> #1 0x000003ffc260c3a8 in udevProcessCSS (device=0x3ff9018d130, def=0x3ff90194a90)
> #2 0x000003ffc260cb78 in udevGetDeviceDetails (device=0x3ff9018d130, def=0x3ff90194a90)
> #3 0x000003ffc260d126 in udevAddOneDevice (device=0x3ff9018d130)
> #4 0x000003ffc260d414 in udevProcessDeviceListEntry (udev=0x3ffa810d800, list_entry=0x3ff90001990)
> #5 0x000003ffc260d638 in udevEnumerateDevices (udev=0x3ffa810d800)
> #6 0x000003ffc260e08e in nodeStateInitializeEnumerate (opaque=0x3ffa810d800)
> #7 0x000003fffdaa14b6 in virThreadHelper (data=0x3ffa810df00)
> #8 0x000003fffc309ed6 in start_thread ()
> #9 0x000003fffd185e66 in thread_start ()
> (gdb) p *def
> $2 = {
>   name = 0x0,
>   sysfs_path = 0x3ff90198e80 "/sys/devices/css0/0.0.ff40",

Okay, this patch fixes the segfault. However, if ^this generated it because the
driver name is not set, how do we even get to the resulting device tree as
outlined in 05e6cdafa6e0?

    +- css_0_0_003a
        |
        +- ccw_0_0_1a2b
            |
            +- scsi_host0

What kind of CSS device is the one causing the error? If we skip this CSS
device, we don't generate a name for it and don't put it in the list, so I'm
quite puzzled on what I missed in the IBM document and thus in the review
process.

FWIW:
Reviewed-by: Erik Skultety <eskultet@redhat.com>

On 9/22/20 8:26 AM, Erik Skultety wrote:
> On Mon, Sep 21, 2020 at 07:06:32PM +0200, Marc Hartmayer wrote:
>> Don't process subchannel devices where `def->driver` is not set. This
>> fixes the following segfault:
>>
>> Thread 21 "nodedev-init" received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread 0x3ffb08fc910 (LWP 64303)]
>> (gdb) bt
>> #0 0x000003fffd1272b4 in __strcmp_vx () at /lib64/libc.so.6
>> #1 0x000003ffc260c3a8 in udevProcessCSS (device=0x3ff9018d130, def=0x3ff90194a90)
>> #2 0x000003ffc260cb78 in udevGetDeviceDetails (device=0x3ff9018d130, def=0x3ff90194a90)
>> #3 0x000003ffc260d126 in udevAddOneDevice (device=0x3ff9018d130)
>> #4 0x000003ffc260d414 in udevProcessDeviceListEntry (udev=0x3ffa810d800, list_entry=0x3ff90001990)
>> #5 0x000003ffc260d638 in udevEnumerateDevices (udev=0x3ffa810d800)
>> #6 0x000003ffc260e08e in nodeStateInitializeEnumerate (opaque=0x3ffa810d800)
>> #7 0x000003fffdaa14b6 in virThreadHelper (data=0x3ffa810df00)
>> #8 0x000003fffc309ed6 in start_thread ()
>> #9 0x000003fffd185e66 in thread_start ()
>> (gdb) p *def
>> $2 = {
>>   name = 0x0,
>>   sysfs_path = 0x3ff90198e80 "/sys/devices/css0/0.0.ff40",
>
> Okay, this patch fixes the segfault. However, if ^this generated it because the
> driver name is not set, how do we even get to the resulting device tree as
> outlined in 05e6cdafa6e0?
>
>     +- css_0_0_003a
>         |
>         +- ccw_0_0_1a2b
>             |
>             +- scsi_host0
>
> What kind of CSS device is the one causing the error? If we skip this CSS
> device, we don't generate a name for it and don't put it in the list, so I'm
> quite puzzled on what I missed in the IBM document and thus in the review
> process.
>
> FWIW:
> Reviewed-by: Erik Skultety <eskultet@redhat.com>
>

Erik,
for whatever reason Marc's system does not have the subchannel device driver
"chsc_subchannel" loaded. Therefore the subchannel is not bound to any
driver and by the way this would be a filtered device since we want to show
the users only io_subchannel and vfio_ccw subchannels.
The tree above shows a subchannel bound to the io_subchannel device driver.
Maybe it helps you to see a full example of how it looks on my system:

# virsh nodedev-list --tree
computer
  |
  +- css_0_0_000e
  |   |
  |   +- ccw_0_0_f500
  |
  +- css_0_0_000f
  |   |
  |   +- ccw_0_0_f501
  |
  +- css_0_0_0010
  |   |
  |   +- ccw_0_0_f502
  |
  +- css_0_0_0011
  |   |
  |   +- ccw_0_0_bd00
  |
  +- css_0_0_0012
  |   |
  |   +- ccw_0_0_bd01
  |
  +- css_0_0_0013
  |   |
  |   +- ccw_0_0_bd02
  |
  +- css_0_0_002f
  |   |
  |   +- ccw_0_0_19c0
  |       |
  |       +- scsi_host3
  |           |
  |           +- scsi_target3_0_1
  |               |
  |               +- scsi_3_0_1_1078935649
  |                   |
  |                   +- block_sda_36005076307ffc5e3000000000000614f
  |                   +- scsi_generic_sg0
  |
  +- css_0_0_0038
  |   |
  |   +- ccw_0_0_1900
  |       |
  |       +- scsi_host1
  |           |
  |           +- scsi_target1_0_0
  |           |   |
  |           |   +- scsi_1_0_0_1075986530
  |           |   |   |
  |           |   |   +- block_sdd_36005076307ffc5e30000000000006222
  |           |   |   +- scsi_generic_sg3
  |           |   |
  |           |   +- scsi_1_0_0_1078935649
  |           |       |
  |           |       +- block_sdc_36005076307ffc5e3000000000000614f
  |           |       +- scsi_generic_sg2
  |           |
  |           +- scsi_target1_0_2
  |               |
  |               +- scsi_1_0_2_1075986530
  |               |   |
  |               |   +- block_sdf_36005076307ffc5e30000000000006222
  |               |   +- scsi_generic_sg5
  |               |
  |               +- scsi_1_0_2_1078935649
  |                   |
  |                   +- block_sde_36005076307ffc5e3000000000000614f
  |                   +- scsi_generic_sg4
  |
  +- css_0_0_003a
  |   |
  |   +- ccw_0_0_1940
  |       |
  |       +- scsi_host0
  |           |
  |           +- scsi_target0_0_1
  |           |   |
  |           |   +- scsi_0_0_1_1075986530
  |           |   |   |
  |           |   |   +- block_sdi_36005076307ffc5e30000000000006222
  |           |   |   +- scsi_generic_sg8
  |           |   |
  |           |   +- scsi_0_0_1_1078935649
  |           |       |
  |           |       +- block_sdh_36005076307ffc5e3000000000000614f
  |           |       +- scsi_generic_sg7
  |           |
  |           +- scsi_target0_0_3
  |               |
  |               +- scsi_0_0_3_1075986530
  |               |   |
  |               |   +- block_sdg_36005076307ffc5e30000000000006222
  |               |   +- scsi_generic_sg6
  |               |
  |               +- scsi_0_0_3_1078935649
  |                   |
  |                   +- block_sdb_36005076307ffc5e3000000000000614f
  |                   +- scsi_generic_sg1
  |
  +- css_0_0_003c
  |   |
  |   +- ccw_0_0_1980
  |       |
  |       +- scsi_host2
  |
  +- css_0_0_006b
  |   |
  |   +- ccw_0_0_1000
  |       |
  |       +- block_dasdg_IBM_750000000DHVL1_0001_00
  |
  +- css_0_0_006c
  |   |
  |   +- ccw_0_0_1001
  |       |
  |       +- block_dasda_IBM_750000000DHVL1_0001_01
  |
  +- css_0_0_006d
  |   |
  |   +- ccw_0_0_1002
  |       |
  |       +- block_dasdb_IBM_750000000DHVL1_0001_02
  |
  +- css_0_0_006e
  |   |
  |   +- ccw_0_0_1003
  |       |
  |       +- block_dasdc_IBM_750000000DHVL1_0001_03
  |
  +- css_0_0_006f
  |   |
  |   +- ccw_0_0_1004
  |       |
  |       +- block_dasdd_IBM_750000000DHVL1_0001_04
  |
  +- css_0_0_0070
  |   |
  |   +- ccw_0_0_1005
  |       |
  |       +- block_dasdh_IBM_750000000DHVL1_0001_05
  |
  +- css_0_0_0071
  |   |
  |   +- ccw_0_0_1006
  |       |
  |       +- block_dasdf_IBM_750000000DHVL1_0001_06
  |
  +- css_0_0_0072
  |   |
  |   +- mdev_2e7237a7_0445_407e_b880_96f63a3cd17d
  |
  +- net_encbd00_02_ff_e3_80_c4_ef
  +- net_encf500_02_ff_e3_80_76_89
  +- net_lo_00_00_00_00_00_00
  +- pci_0001_00_00_0
      |
      +- net_enP1s8_82_01_2d_0c_bb_b0
      +- net_enP1s8d1_82_01_2d_0c_bb_b1

All subchannels but css_0_0_0072 are bound to io_subchannel.
css_0_0_0072 is bound to vfio-ccw and therefore lists an mdev as child.

Here are the xml dumps:

# virsh nodedev-dumpxml css_0_0_0071
<device>
  <name>css_0_0_0071</name>
  <path>/sys/devices/css0/0.0.0071</path>
  <parent>computer</parent>
  <driver>
    <name>io_subchannel</name>
  </driver>
  <capability type='css'>
    <cssid>0x0</cssid>
    <ssid>0x0</ssid>
    <devno>0x0071</devno>
  </capability>
</device>

# virsh nodedev-dumpxml css_0_0_0072
<device>
  <name>css_0_0_0072</name>
  <path>/sys/devices/css0/0.0.0072</path>
  <parent>computer</parent>
  <driver>
    <name>vfio_ccw</name>
  </driver>
  <capability type='css'>
    <cssid>0x0</cssid>
    <ssid>0x0</ssid>
    <devno>0x0072</devno>
  </capability>
</device>

I hope that helps a bit...

-- 
Kind regards
Boris Fiuczynski

IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Gregor Pillen
Management: Dirk Wittkopp
Registered office: Böblingen
Court of registration: Amtsgericht Stuttgart, HRB 243294

On Tue, Sep 22, 2020 at 10:45:09AM +0200, Boris Fiuczynski wrote:
> On 9/22/20 8:26 AM, Erik Skultety wrote:
> > On Mon, Sep 21, 2020 at 07:06:32PM +0200, Marc Hartmayer wrote:
> > > Don't process subchannel devices where `def->driver` is not set. This
> > > fixes the following segfault:
> > >
> > > Thread 21 "nodedev-init" received signal SIGSEGV, Segmentation fault.
> > > [Switching to Thread 0x3ffb08fc910 (LWP 64303)]
> > > (gdb) bt
> > > #0 0x000003fffd1272b4 in __strcmp_vx () at /lib64/libc.so.6
> > > #1 0x000003ffc260c3a8 in udevProcessCSS (device=0x3ff9018d130, def=0x3ff90194a90)
> > > #2 0x000003ffc260cb78 in udevGetDeviceDetails (device=0x3ff9018d130, def=0x3ff90194a90)
> > > #3 0x000003ffc260d126 in udevAddOneDevice (device=0x3ff9018d130)
> > > #4 0x000003ffc260d414 in udevProcessDeviceListEntry (udev=0x3ffa810d800, list_entry=0x3ff90001990)
> > > #5 0x000003ffc260d638 in udevEnumerateDevices (udev=0x3ffa810d800)
> > > #6 0x000003ffc260e08e in nodeStateInitializeEnumerate (opaque=0x3ffa810d800)
> > > #7 0x000003fffdaa14b6 in virThreadHelper (data=0x3ffa810df00)
> > > #8 0x000003fffc309ed6 in start_thread ()
> > > #9 0x000003fffd185e66 in thread_start ()
> > > (gdb) p *def
> > > $2 = {
> > >   name = 0x0,
> > >   sysfs_path = 0x3ff90198e80 "/sys/devices/css0/0.0.ff40",
> >
> > Okay, this patch fixes the segfault. However, if ^this generated it because the
> > driver name is not set, how do we even get to the resulting device tree as
> > outlined in 05e6cdafa6e0?
> >
> >     +- css_0_0_003a
> >         |
> >         +- ccw_0_0_1a2b
> >             |
> >             +- scsi_host0
> >
> > What kind of CSS device is the one causing the error? If we skip this CSS
> > device, we don't generate a name for it and don't put it in the list, so I'm
> > quite puzzled on what I missed in the IBM document and thus in the review
> > process.
> >
> > FWIW:
> > Reviewed-by: Erik Skultety <eskultet@redhat.com>
> >
>
> Erik,
> for whatever reason Marc's system does not have the subchannel device driver
> "chsc_subchannel" loaded. Therefore the subchannel is not bound to any
> driver and by the way this would be a filtered device since we want to show
> the users only io_subchannel and vfio_ccw subchannels.
> The tree above shows a subchannel bound to the io_subchannel device driver.
> Maybe it helps you to see a full example of how it looks on my system:

Oh, it would have been a different driver anyway - impossible to spot just from
the address :).
Yeah, the tree dump makes it much clearer, thanks, I pushed the patch.

Erik