In order to try to debug hypervisor side breakage from XSA-378 I found
myself urged to finally give PVH Dom0 a try. Sadly things didn't work
quite as expected. In the course of investigating these issues I actually
spotted one piece of PV Dom0 breakage as well, a fix for which is also
included here.

There are two immediate remaining issues (also mentioned in affected
patches):

1) It is not clear to me how PCI device reporting is to work. PV Dom0
   reports devices as they're discovered, including ones the hypervisor
   may not have been able to discover itself (ones on segments other
   than 0 or hotplugged ones). The respective hypercall, however, is
   inaccessible to PVH Dom0. Depending on the answer to this, either
   the hypervisor will need changing (to permit the call) or patch 2
   here will need further refinement.

2) Dom0, unlike in the PV case, cannot access the screen (to use as a
   console) when in a non-default mode (i.e. not 80x25 text), as the
   necessary information (in particular about VESA-based LFB modes) is
   not communicated. On the hypervisor side this looks like deliberate
   behavior, but it is unclear to me what the intentions were towards
   an alternative model. (X may be able to access the screen depending
   on whether it has a suitable driver besides the presently unusable
   /dev/fb<N> based one.)

1: xen/x86: prevent PVH type from getting clobbered
2: xen/x86: allow PVH Dom0 without XEN_PV=y
3: xen/x86: make "earlyprintk=xen" work better for PVH Dom0
4: xen/x86: allow "earlyprintk=xen" to work for PV Dom0
5: xen/x86: make "earlyprintk=xen" work for HVM/PVH DomU
6: xen/x86: generalize preferred console model from PV to PVH Dom0
7: xen/x86: hook up xen_banner() also for PVH
8: x86/PVH: adjust function/data placement
9: xen/x86: adjust data placement

Jan
On Tue, Sep 07, 2021 at 12:04:34PM +0200, Jan Beulich wrote:
> 1) It is not clear to me how PCI device reporting is to work. PV Dom0
>    reports devices as they're discovered, including ones the hypervisor
>    may not have been able to discover itself (ones on segments other
>    than 0 or hotplugged ones). The respective hypercall, however, is
>    inaccessible to PVH Dom0. Depending on the answer to this, either
>    the hypervisor will need changing (to permit the call) or patch 2
>    here will need further refinement.

I would prefer if we could limit the hypercall usage to only reporting
hotplugged segments to Xen. Then Xen would have to scan the segment
when reported and add any devices found.

Such a hypercall must be used before dom0 tries to access any device, as
otherwise the BARs won't be mapped in the second stage translation and
the traps for the MCFG area won't be set up either.

> 2) Dom0, unlike in the PV case, cannot access the screen (to use as a
>    console) when in a non-default mode (i.e. not 80x25 text), as the
>    necessary information (in particular about VESA-based LFB modes) is
>    not communicated. On the hypervisor side this looks like deliberate
>    behavior, but it is unclear to me what the intentions were towards
>    an alternative model. (X may be able to access the screen depending
>    on whether it has a suitable driver besides the presently unusable
>    /dev/fb<N> based one.)

I have to admit most of my boxes are headless servers, albeit I have
one NUC I can use to test gfx stuff, so I don't really use gfx output
with Xen.

As I understand it, such information is fetched from the BIOS and passed
into Xen, which should then hand it over to the dom0 kernel?

I guess the only way for the Linux dom0 kernel to fetch that information
would be to emulate the BIOS or drop into real mode and issue the BIOS
calls?

Is that also an issue on UEFI, or can dom0 fetch the framebuffer info
there using the PV EFI interface?

Thanks, Roger.
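The ordering constraint in this message (a device has to be reported to Xen before dom0 touches its BARs or config space) can be illustrated with a toy model. This is only a sketch: `ToyHypervisor`, `PciAddress`, `report_device()` and `probe()` are made-up names for illustration, not Xen or Linux interfaces, and the real effect of an unreported access would be a missing mapping or trap rather than a Python exception.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PciAddress:
    """Segment/bus/device/function of a PCI device."""
    segment: int
    bus: int
    device: int
    function: int


@dataclass
class ToyHypervisor:
    """Toy stand-in for the hypervisor's list of devices it knows about."""
    known: set = field(default_factory=set)

    def report_device(self, addr: PciAddress) -> None:
        # Stands in for the device-add hypercall: in the real system this
        # is the point where BARs get mapped and traps get set up.
        self.known.add(addr)

    def access_bar(self, addr: PciAddress) -> str:
        # An unreported device has no mapping yet, so the access fails.
        if addr not in self.known:
            raise PermissionError(f"{addr} accessed before being reported")
        return "ok"


def probe(hv: ToyHypervisor, addr: PciAddress) -> str:
    """Report first, then touch the device (the order asked for above)."""
    hv.report_device(addr)
    return hv.access_bar(addr)
```

In this model a device on segment 1, say `PciAddress(1, 0, 2, 0)`, fails in `access_bar()` until `probe()` (or `report_device()`) has been called for it.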
On 14.09.2021 10:32, Roger Pau Monné wrote:
> I would prefer if we could limit the hypercall usage to only reporting
> hotplugged segments to Xen. Then Xen would have to scan the segment
> when reported and add any devices found.
>
> Such a hypercall must be used before dom0 tries to access any device, as
> otherwise the BARs won't be mapped in the second stage translation and
> the traps for the MCFG area won't be set up either.

This might work if hotplugging would only ever be of segments, and not
of individual devices. Yet the latter is, I think, a common case (as
far as hotplugging itself is "common").

Also don't forget about SR-IOV VFs - they would typically not be there
when booting. They would materialize when the PF driver initializes
the device. This is, I think, something that can be dealt with by
intercepting writes to the SR-IOV capability. But I wonder whether
there might be other cases where devices become "visible" only while
the Dom0 kernel is already running.

> As I understand it, such information is fetched from the BIOS and passed
> into Xen, which should then hand it over to the dom0 kernel?

That's how PV Dom0 learns of the information, yes. See
fill_console_start_info(). (I'm in the process of eliminating the
need for some of the "fetch from BIOS" in Xen right now, but that's
not going to get us as far as being able to delete that code, no
matter how much in particular Andrew would like that to happen.)

> I guess the only way for the Linux dom0 kernel to fetch that information
> would be to emulate the BIOS or drop into real mode and issue the BIOS
> calls?

Native Linux gets this information passed from the boot loader, I think
(except in the EFI case, as per below).

> Is that also an issue on UEFI, or can dom0 fetch the framebuffer info
> there using the PV EFI interface?

There it's EFI boot services functions which can be invoked before
leaving boot services (in the native case). Aiui the PVH entry point
lives logically past any EFI boot services interaction, and hence
using them is not an option (if there was EFI firmware present in Dom0
in the first place, which I consider difficult all by itself - this
can't be the physical system's firmware, but I also don't see where
virtual firmware would be taken from).

There is no PV EFI interface to obtain video information. With the
needed information getting passed via start_info, PV has no need for
such, and I would be hesitant to add a fundamentally redundant
interface for PVH. All the more so as the information needed isn't
EFI-specific at all.

Jan
On Tue, Sep 14, 2021 at 11:03:23AM +0200, Jan Beulich wrote:
> This might work if hotplugging would only ever be of segments, and not
> of individual devices. Yet the latter is, I think, a common case (as
> far as hotplugging itself is "common").

Right, I agree to use hypercalls to report either hotplugged segments
or devices. However I would like to avoid mandating usage of the
hypercall for non-hotplug stuff, as then OSes not having hotplug
support don't really need to care about making use of those
hypercalls.

> Also don't forget about SR-IOV VFs - they would typically not be there
> when booting. They would materialize when the PF driver initializes
> the device. This is, I think, something that can be dealt with by
> intercepting writes to the SR-IOV capability.

My plan was to indeed trap SR-IOV capability accesses, see:

https://lore.kernel.org/xen-devel/20180717094830.54806-1-roger.pau@citrix.com/

I just don't have time ATM to continue this work.

> But I wonder whether
> there might be other cases where devices become "visible" only while
> the Dom0 kernel is already running.

I would consider those a kind of hotplug device, and hence they would
require the use of the hypercall in order to notify Xen about them.

> There is no PV EFI interface to obtain video information. With the
> needed information getting passed via start_info, PV has no need for
> such, and I would be hesitant to add a fundamentally redundant
> interface for PVH. All the more so as the information needed isn't
> EFI-specific at all.

I think our only option is to expand the HVM start info to convey that
data from Xen into dom0.

Thanks, Roger.
On 14.09.2021 13:15, Roger Pau Monné wrote:
> Right, I agree to use hypercalls to report either hotplugged segments
> or devices. However I would like to avoid mandating usage of the
> hypercall for non-hotplug stuff, as then OSes not having hotplug
> support don't really need to care about making use of those
> hypercalls.

So what does this mean for the one patch? Should drivers/xen/pci.c
then be built for PVH (and then have logic added to filter boot
time device discovery), or should I restrict this to be PV-only (and
PVH would get some completely different logic added later)?

> I think our only option is to expand the HVM start info to convey that
> data from Xen into dom0.

PVH doesn't use the ordinary start_info, does it?

Jan
On Tue, Sep 14, 2021 at 01:58:29PM +0200, Jan Beulich wrote:
> So what does this mean for the one patch? Should drivers/xen/pci.c
> then be built for PVH (and then have logic added to filter boot
> time device discovery), or should I restrict this to be PV-only (and
> PVH would get some completely different logic added later)?

I think we can reuse the same hypercalls for PVH, and maybe the same
code in Linux. For PVH we just need to be careful to make the
hypercalls before attempting to access the BARs (or the PCI
configuration space for the device), since there won't be any traps
set up, and BARs won't be mapped on the p2m.

It might be easier for Linux to just report every device it finds to
Xen, like it's currently done for PV dom0, instead of filtering on
whether the device has been hotplugged.

> PVH doesn't use the ordinary start_info, does it?

No, it's HVM start info as described in:

xen/include/public/arch-x86/hvm/start_info.h

We have already extended it once to add a memory map; we could extend
it another time to add the video information.

Roger.
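To make the "extend it another time" idea concrete, here is a rough sketch of version-gated parsing of the HVM start info. The version 0 and version 1 field layout and the magic value follow my reading of start_info.h. The version 2 extension, the `video_info_paddr` field and the `parse_start_info()` helper are purely hypothetical; an actual extension may well look different.

```python
import struct

HVM_START_MAGIC = 0x336EC578  # magic value as given in start_info.h

# Version 0 fields: magic, version, flags, nr_modules,
# modlist_paddr, cmdline_paddr, rsdp_paddr.
V0 = struct.Struct("<IIIIQQQ")
# Version 1 appended: memmap_paddr, memmap_entries, reserved.
V1_EXTRA = struct.Struct("<QII")
# HYPOTHETICAL version 2 extension: physical address of a video info
# block. Not part of the real interface; for illustration only.
V2_EXTRA = struct.Struct("<Q")


def parse_start_info(buf: bytes) -> dict:
    """Decode the fields that the advertised version says are present."""
    magic, version, flags, nr_modules, modlist, cmdline, rsdp = V0.unpack_from(buf, 0)
    if magic != HVM_START_MAGIC:
        raise ValueError("not an HVM start info structure")
    info = {
        "version": version,
        "flags": flags,
        "nr_modules": nr_modules,
        "modlist_paddr": modlist,
        "cmdline_paddr": cmdline,
        "rsdp_paddr": rsdp,
    }
    offset = V0.size
    if version >= 1:
        # Memory map fields only exist from version 1 onwards.
        memmap_paddr, memmap_entries, _reserved = V1_EXTRA.unpack_from(buf, offset)
        info["memmap_paddr"] = memmap_paddr
        info["memmap_entries"] = memmap_entries
        offset += V1_EXTRA.size
    if version >= 2:
        # Hypothetical: a consumer would only look here for version >= 2.
        (info["video_info_paddr"],) = V2_EXTRA.unpack_from(buf, offset)
    return info
```

The point of the version check is that an older consumer keeps working unchanged, while a newer one only reads the appended field when the producer says it is there.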
On 14.09.2021 14:41, Roger Pau Monné wrote:
> I think we can reuse the same hypercalls for PVH, and maybe the same
> code in Linux. For PVH we just need to be careful to make the
> hypercalls before attempting to access the BARs (or the PCI
> configuration space for the device), since there won't be any traps
> set up, and BARs won't be mapped on the p2m.
>
> It might be easier for Linux to just report every device it finds to
> Xen, like it's currently done for PV dom0, instead of filtering on
> whether the device has been hotplugged.

Okay. I'll leave the Linux patch as is then and instead make a Xen
patch to actually let through the necessary function(s) in
hvm_physdev_op().

> No, it's HVM start info as described in:
>
> xen/include/public/arch-x86/hvm/start_info.h
>
> We have already extended it once to add a memory map; we could extend
> it another time to add the video information.

Okay, I'll try to make a(nother) patch along these lines. Since there's
a DomU counterpart in PV's start_info - where does that information get
passed for PVH? (I'm mainly wondering whether there's another approach
to consider.)

Jan

On Tue, Sep 14, 2021 at 05:13:52PM +0200, Jan Beulich wrote:
> On 14.09.2021 14:41, Roger Pau Monné wrote:
> > On Tue, Sep 14, 2021 at 01:58:29PM +0200, Jan Beulich wrote:
> >> On 14.09.2021 13:15, Roger Pau Monné wrote:
> >>> On Tue, Sep 14, 2021 at 11:03:23AM +0200, Jan Beulich wrote:
> >>>> On 14.09.2021 10:32, Roger Pau Monné wrote:
> >>>>> On Tue, Sep 07, 2021 at 12:04:34PM +0200, Jan Beulich wrote:
> >>>>>> 2) Dom0, unlike in the PV case, cannot access the screen (to use as a
> >>>>>> console) when in a non-default mode (i.e. not 80x25 text), as the
> >>>>>> necessary information (in particular about VESA-based LFB modes) is
> >>>>>> not communicated. On the hypervisor side this looks like deliberate
> >>>>>> behavior, but it is unclear to me what the intentions were towards
> >>>>>> an alternative model. (X may be able to access the screen depending
> >>>>>> on whether it has a suitable driver besides the presently unusable
> >>>>>> /dev/fb<N> based one.)
> >>>>>
> >>>>> I have to admit most of my boxes are headless servers, albeit I have
> >>>>> one NUC I can use to test gfx stuff, so I don't really use gfx output
> >>>>> with Xen.
> >>>>>
> >>>>> As I understand such information is fetched from the BIOS and passed
> >>>>> into Xen, which should then hand it over to the dom0 kernel?
> >>>>
> >>>> That's how PV Dom0 learns of the information, yes. See
> >>>> fill_console_start_info(). (I'm in the process of eliminating the
> >>>> need for some of the "fetch from BIOS" in Xen right now, but that's
> >>>> not going to get us as far as being able to delete that code, no
> >>>> matter how much in particular Andrew would like that to happen.)
> >>>>
> >>>>> I guess the only way for Linux dom0 kernel to fetch that information
> >>>>> would be to emulate the BIOS or drop into realmode and issue the BIOS
> >>>>> calls?
> >>>>
> >>>> Native Linux gets this information passed from the boot loader, I think
> >>>> (except in the EFI case, as per below).
> >>>>
> >>>>> Is that an issue on UEFI also, or there dom0 can fetch the framebuffer
> >>>>> info using the PV EFI interface?
> >>>>
> >>>> There it's EFI boot services functions which can be invoked before
> >>>> leaving boot services (in the native case). Aiui the PVH entry point
> >>>> lives logically past any EFI boot services interaction, and hence
> >>>> using them is not an option (if there was EFI firmware present in Dom0
> >>>> in the first place, which I consider difficult all by itself - this
> >>>> can't be the physical system's firmware, but I also don't see where
> >>>> virtual firmware would be taken from).
> >>>>
> >>>> There is no PV EFI interface to obtain video information. With the
> >>>> needed information getting passed via start_info, PV has no need for
> >>>> such, and I would be hesitant to add a fundamentally redundant
> >>>> interface for PVH. The more that the information needed isn't
> >>>> EFI-specific at all.
> >>>
> >>> I think our only option is to expand the HVM start info information to
> >>> convey that data from Xen into dom0.
> >>
> >> PVH doesn't use the ordinary start_info, does it?
> >
> > No, it's HVM start info as described in:
> >
> > xen/include/public/arch-x86/hvm/start_info.h
> >
> > We have already extended it once to add a memory map, we could extend
> > it another time to add the video information.
>
> Okay, I'll try to make a(nother) patch along these lines. Since there's
> a DomU counterpart in PV's start_info - where does that information get
> passed for PVH? (I'm mainly wondering whether there's another approach
> to consider.)

We don't pass the video information at all for PVH, neither in domU nor
dom0 modes if that's what you mean. Not sure what video information we
could pass for domU anyway, as that would be a PV framebuffer that
would need setup ATM. Maybe we could at some point provide some kind
of emulated or passed through card.

The information contained in start_info that's needed for PVH is
passed using hvm params, just like it's done for plain HVM guests. We
could pass the video information in a hvm param I guess, but it would
require stealing guest memory to store it (and mark as reserved in
the memory map). Not sure that's better than expanding HVM start info.

Maybe there's another hypercall interface I'm missing we could use to
propagate that information to dom0?

Thanks, Roger.
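To make the "expand HVM start info" option concrete, here is a small sketch of how a consumer could gate access to a new field on the structure version, the same way the memory map fields were gated when they were added. This is not the real header: `struct ex_start_info` only mirrors the layout as I recall it from start_info.h (worth checking against the file), while the version 2 field `video_paddr` and the helper `ex_video_info_paddr()` are purely hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Magic value as I recall it from start_info.h; verify against the header. */
#define EX_START_MAGIC 0x336ec578u

/* Illustrative mirror of the start info layout, NOT the real definition. */
struct ex_start_info {
    uint32_t magic;
    uint32_t version;
    uint32_t flags;
    uint32_t nr_modules;
    uint64_t modlist_paddr;
    uint64_t cmdline_paddr;
    uint64_t rsdp_paddr;
    /* Fields below are only present in version 1 and newer. */
    uint64_t memmap_paddr;
    uint32_t memmap_entries;
    uint32_t reserved;
    /* HYPOTHETICAL: only present in an assumed version 2 and newer. */
    uint64_t video_paddr;
};

/*
 * Return the physical address of the (hypothetical) video information,
 * or 0 if the structure is not valid or too old to carry the field.
 */
static uint64_t ex_video_info_paddr(const struct ex_start_info *si)
{
    assert(si != NULL);

    if (si->magic != EX_START_MAGIC)
        return 0;

    /* Older producers never wrote the field, so do not read it. */
    if (si->version < 2)
        return 0;

    return si->video_paddr;
}
```

The point of the version check is that an old hypervisor paired with a new kernel degrades to "no video information" instead of reading memory past the end of the structure it was actually given. Unlike the hvm param variant, no separate guest page would need to be set aside and marked reserved just to hold a pointer.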

On 14.09.2021 18:27, Roger Pau Monné wrote:
> On Tue, Sep 14, 2021 at 05:13:52PM +0200, Jan Beulich wrote:
>> On 14.09.2021 14:41, Roger Pau Monné wrote:
>>> On Tue, Sep 14, 2021 at 01:58:29PM +0200, Jan Beulich wrote:
>>>> On 14.09.2021 13:15, Roger Pau Monné wrote:
>>>>> On Tue, Sep 14, 2021 at 11:03:23AM +0200, Jan Beulich wrote:
>>>>>> On 14.09.2021 10:32, Roger Pau Monné wrote:
>>>>>>> On Tue, Sep 07, 2021 at 12:04:34PM +0200, Jan Beulich wrote:
>>>>>>>> 2) Dom0, unlike in the PV case, cannot access the screen (to use as a
>>>>>>>> console) when in a non-default mode (i.e. not 80x25 text), as the
>>>>>>>> necessary information (in particular about VESA-based LFB modes) is
>>>>>>>> not communicated. On the hypervisor side this looks like deliberate
>>>>>>>> behavior, but it is unclear to me what the intentions were towards
>>>>>>>> an alternative model. (X may be able to access the screen depending
>>>>>>>> on whether it has a suitable driver besides the presently unusable
>>>>>>>> /dev/fb<N> based one.)
>>>>>>>
>>>>>>> I have to admit most of my boxes are headless servers, albeit I have
>>>>>>> one NUC I can use to test gfx stuff, so I don't really use gfx output
>>>>>>> with Xen.
>>>>>>>
>>>>>>> As I understand such information is fetched from the BIOS and passed
>>>>>>> into Xen, which should then hand it over to the dom0 kernel?
>>>>>>
>>>>>> That's how PV Dom0 learns of the information, yes. See
>>>>>> fill_console_start_info(). (I'm in the process of eliminating the
>>>>>> need for some of the "fetch from BIOS" in Xen right now, but that's
>>>>>> not going to get us as far as being able to delete that code, no
>>>>>> matter how much in particular Andrew would like that to happen.)
>>>>>>
>>>>>>> I guess the only way for Linux dom0 kernel to fetch that information
>>>>>>> would be to emulate the BIOS or drop into realmode and issue the BIOS
>>>>>>> calls?
>>>>>>
>>>>>> Native Linux gets this information passed from the boot loader, I think
>>>>>> (except in the EFI case, as per below).
>>>>>>
>>>>>>> Is that an issue on UEFI also, or there dom0 can fetch the framebuffer
>>>>>>> info using the PV EFI interface?
>>>>>>
>>>>>> There it's EFI boot services functions which can be invoked before
>>>>>> leaving boot services (in the native case). Aiui the PVH entry point
>>>>>> lives logically past any EFI boot services interaction, and hence
>>>>>> using them is not an option (if there was EFI firmware present in Dom0
>>>>>> in the first place, which I consider difficult all by itself - this
>>>>>> can't be the physical system's firmware, but I also don't see where
>>>>>> virtual firmware would be taken from).
>>>>>>
>>>>>> There is no PV EFI interface to obtain video information. With the
>>>>>> needed information getting passed via start_info, PV has no need for
>>>>>> such, and I would be hesitant to add a fundamentally redundant
>>>>>> interface for PVH. The more that the information needed isn't
>>>>>> EFI-specific at all.
>>>>>
>>>>> I think our only option is to expand the HVM start info information to
>>>>> convey that data from Xen into dom0.
>>>>
>>>> PVH doesn't use the ordinary start_info, does it?
>>>
>>> No, it's HVM start info as described in:
>>>
>>> xen/include/public/arch-x86/hvm/start_info.h
>>>
>>> We have already extended it once to add a memory map, we could extend
>>> it another time to add the video information.
>>
>> Okay, I'll try to make a(nother) patch along these lines. Since there's
>> a DomU counterpart in PV's start_info - where does that information get
>> passed for PVH? (I'm mainly wondering whether there's another approach
>> to consider.)
>
> We don't pass the video information at all for PVH, neither in domU nor
> dom0 modes if that's what you mean. Not sure what video information we
> could pass for domU anyway, as that would be a PV framebuffer that
> would need setup ATM.
> Maybe we could at some point provide some kind
> of emulated or passed through card.
>
> The information contained in start_info that's needed for PVH is
> passed using hvm params, just like it's done for plain HVM guests.

This is what I was referring to; I'm sorry for having been unclear. It's
no video _mode_ information, but information on how to get at the console.

> We
> could pass the video information in a hvm param I guess, but it would
> require stealing guest memory to store it (and mark as reserved in
> the memory map). Not sure that's better than expanding HVM start info.

I don't think it would be; a param doesn't seem a good fit here, and I
have to admit I'm not even convinced it's a good fit for xenstore and
console coordinates (that's fine for HVM; the only reason I can see for
PVH to use the same is the expectation of the line between both to
become increasingly blurred).

> Maybe there's another hypercall interface I'm missing we could use to
> propagate that information to dom0?

I don't think there is; if anything we'd have to add something.

Jan
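For readers unfamiliar with how the console "coordinates" mentioned above reach a PVH or HVM guest: they are two scalar values (a page frame and an event channel), each fetched through a parameter query. The sketch below shows that pattern with the hypercall replaced by a callback so it stays self-contained. The parameter indices are as I recall them from Xen's public hvm/params.h and should be treated as assumptions; the getter type, the stub, and all `ex_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Indices as recalled from Xen's public hvm/params.h; treat as assumptions. */
#define EX_HVM_PARAM_CONSOLE_PFN    17
#define EX_HVM_PARAM_CONSOLE_EVTCHN 18

/* Hypothetical stand-in for the real "get HVM parameter" hypercall wrapper. */
typedef int (*ex_get_param_fn)(int idx, uint64_t *value);

struct ex_console_coords {
    uint64_t pfn;    /* guest frame holding the console ring */
    uint32_t evtchn; /* event channel used for notifications */
};

/* Fetch both values; *out is only written if both queries succeed. */
static int ex_read_console_coords(ex_get_param_fn get,
                                  struct ex_console_coords *out)
{
    uint64_t pfn, evtchn;
    int rc;

    assert(get != NULL && out != NULL);

    rc = get(EX_HVM_PARAM_CONSOLE_EVTCHN, &evtchn);
    if (rc)
        return rc;

    rc = get(EX_HVM_PARAM_CONSOLE_PFN, &pfn);
    if (rc)
        return rc;

    out->pfn = pfn;
    out->evtchn = (uint32_t)evtchn;
    return 0;
}

/* Stub getter with made-up values, standing in for the hypervisor. */
static int ex_fake_get_param(int idx, uint64_t *value)
{
    switch (idx) {
    case EX_HVM_PARAM_CONSOLE_PFN:
        *value = 0xfeff0;
        return 0;
    case EX_HVM_PARAM_CONSOLE_EVTCHN:
        *value = 3;
        return 0;
    default:
        return -1;
    }
}
```

Each parameter carries a single 64-bit value, which is why video mode data (dimensions, framebuffer address, color layout) would not fit in a param directly and would instead need a pointer to separately reserved guest memory.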