From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reset the command register when passing through a PCI device:
it is possible that when passing through a PCI device its memory
decoding bits in the command register are already set. Thus, a
guest OS may not write to the command register to update memory
decoding, so guest mappings (guest's view of the BARs) are
left not updated.
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
xen/drivers/vpci/header.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 7416ef1e1e06..dac973368b1e 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -811,6 +811,16 @@ int vpci_bar_add_handlers(const struct domain *d, struct pci_dev *pdev)
gprintk(XENLOG_ERR,
"%pp: failed to add BAR handlers for dom%d\n", &pdev->sbdf,
d->domain_id);
+
+ /*
+ * Reset the command register: it is possible that when passing
+ * through a PCI device its memory decoding bits in the command
+ * register are already set. Thus, a guest OS may not write to the
+ * command register to update memory decoding, so guest mappings
+ * (guest's view of the BARs) are left not updated.
+ */
+ pci_conf_write16(pdev->sbdf, PCI_COMMAND, 0);
+
return rc;
}
--
2.25.1
On 03.09.2021 12:08, Oleksandr Andrushchenko wrote: > --- a/xen/drivers/vpci/header.c > +++ b/xen/drivers/vpci/header.c > @@ -811,6 +811,16 @@ int vpci_bar_add_handlers(const struct domain *d, struct pci_dev *pdev) > gprintk(XENLOG_ERR, > "%pp: failed to add BAR handlers for dom%d\n", &pdev->sbdf, > d->domain_id); > + > + /* > + * Reset the command register: it is possible that when passing > + * through a PCI device its memory decoding bits in the command > + * register are already set. Thus, a guest OS may not write to the > + * command register to update memory decoding, so guest mappings > + * (guest's view of the BARs) are left not updated. > + */ > + pci_conf_write16(pdev->sbdf, PCI_COMMAND, 0); Can you really blindly write 0 here? What about bits that have to be under host control, e.g. INTX_DISABLE? I can see that you may want to hand off with I/O and memory decoding off and bus mastering disabled, but for every other bit (including reserved ones) I'd expect separate justification (in the commit message). Jan
On 06.09.21 17:55, Jan Beulich wrote: > On 03.09.2021 12:08, Oleksandr Andrushchenko wrote: >> --- a/xen/drivers/vpci/header.c >> +++ b/xen/drivers/vpci/header.c >> @@ -811,6 +811,16 @@ int vpci_bar_add_handlers(const struct domain *d, struct pci_dev *pdev) >> gprintk(XENLOG_ERR, >> "%pp: failed to add BAR handlers for dom%d\n", &pdev->sbdf, >> d->domain_id); >> + >> + /* >> + * Reset the command register: it is possible that when passing >> + * through a PCI device its memory decoding bits in the command >> + * register are already set. Thus, a guest OS may not write to the >> + * command register to update memory decoding, so guest mappings >> + * (guest's view of the BARs) are left not updated. >> + */ >> + pci_conf_write16(pdev->sbdf, PCI_COMMAND, 0); > Can you really blindly write 0 here? What about bits that have to be > under host control, e.g. INTX_DISABLE? I can see that you may want to > hand off with I/O and memory decoding off and bus mastering disabled, > but for every other bit (including reserved ones) I'd expect separate > justification (in the commit message). According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0" I have at hand, section "6.2.2 Device Control" says that the reset state of the command register is typically 0, so this is why I chose to write 0 here, e.g. make the command register as if it is after the reset. With respect to host control: we currently do not really emulate command register which probably was ok for x86 PVH Dom0 and this might not be the case now as we add DomU's. That being said: in my implementation guest can alter command register as it wants without restrictions. If we see it does need proper emulation then we would need adding that as well (is not part of this series though). Meanwhile, I agree that we can only reset IO space, memory space and bus master bits and leave the rest untouched. But again, without proper command register emulation guests can still set what they want. Please let me know your opinion on how we can proceed. > > Jan > Thank you, Oleksandr
On 07.09.2021 09:43, Oleksandr Andrushchenko wrote: > > On 06.09.21 17:55, Jan Beulich wrote: >> On 03.09.2021 12:08, Oleksandr Andrushchenko wrote: >>> --- a/xen/drivers/vpci/header.c >>> +++ b/xen/drivers/vpci/header.c >>> @@ -811,6 +811,16 @@ int vpci_bar_add_handlers(const struct domain *d, struct pci_dev *pdev) >>> gprintk(XENLOG_ERR, >>> "%pp: failed to add BAR handlers for dom%d\n", &pdev->sbdf, >>> d->domain_id); >>> + >>> + /* >>> + * Reset the command register: it is possible that when passing >>> + * through a PCI device its memory decoding bits in the command >>> + * register are already set. Thus, a guest OS may not write to the >>> + * command register to update memory decoding, so guest mappings >>> + * (guest's view of the BARs) are left not updated. >>> + */ >>> + pci_conf_write16(pdev->sbdf, PCI_COMMAND, 0); >> Can you really blindly write 0 here? What about bits that have to be >> under host control, e.g. INTX_DISABLE? I can see that you may want to >> hand off with I/O and memory decoding off and bus mastering disabled, >> but for every other bit (including reserved ones) I'd expect separate >> justification (in the commit message). > According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0" I have at hand, > section "6.2.2 Device Control" says that the reset state of the command > register is typically 0, so this is why I chose to write 0 here, e.g. > make the command register as if it is after the reset. > > With respect to host control: we currently do not really emulate command > register which probably was ok for x86 PVH Dom0 and this might not be the > case now as we add DomU's. That being said: in my implementation guest can > alter command register as it wants without restrictions. > If we see it does need proper emulation then we would need adding that as > well (is not part of this series though). > > Meanwhile, I agree that we can only reset IO space, memory space and bus > master bits and leave the rest untouched. But again, without proper command > register emulation guests can still set what they want. Yes, writes to the register will need emulating for DomU. Reporting the emulated register as zero initially is probably also quite fine (to match, as you say, mandated reset state). Jan
On 07.09.21 11:00, Jan Beulich wrote: > On 07.09.2021 09:43, Oleksandr Andrushchenko wrote: >> On 06.09.21 17:55, Jan Beulich wrote: >>> On 03.09.2021 12:08, Oleksandr Andrushchenko wrote: >>>> --- a/xen/drivers/vpci/header.c >>>> +++ b/xen/drivers/vpci/header.c >>>> @@ -811,6 +811,16 @@ int vpci_bar_add_handlers(const struct domain *d, struct pci_dev *pdev) >>>> gprintk(XENLOG_ERR, >>>> "%pp: failed to add BAR handlers for dom%d\n", &pdev->sbdf, >>>> d->domain_id); >>>> + >>>> + /* >>>> + * Reset the command register: it is possible that when passing >>>> + * through a PCI device its memory decoding bits in the command >>>> + * register are already set. Thus, a guest OS may not write to the >>>> + * command register to update memory decoding, so guest mappings >>>> + * (guest's view of the BARs) are left not updated. >>>> + */ >>>> + pci_conf_write16(pdev->sbdf, PCI_COMMAND, 0); >>> Can you really blindly write 0 here? What about bits that have to be >>> under host control, e.g. INTX_DISABLE? I can see that you may want to >>> hand off with I/O and memory decoding off and bus mastering disabled, >>> but for every other bit (including reserved ones) I'd expect separate >>> justification (in the commit message). >> According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0" I have at hand, >> section "6.2.2 Device Control" says that the reset state of the command >> register is typically 0, so this is why I chose to write 0 here, e.g. >> make the command register as if it is after the reset. >> >> With respect to host control: we currently do not really emulate command >> register which probably was ok for x86 PVH Dom0 and this might not be the >> case now as we add DomU's. That being said: in my implementation guest can >> alter command register as it wants without restrictions. >> If we see it does need proper emulation then we would need adding that as >> well (is not part of this series though). >> >> Meanwhile, I agree that we can only reset IO space, memory space and bus >> master bits and leave the rest untouched. But again, without proper command >> register emulation guests can still set what they want. > Yes, writes to the register will need emulating for DomU. But then I am wondering to what extent we need to emulate the command register? We have the following bits in the command register: #define PCI_COMMAND_IO 0x1 /* Enable response in I/O space */ #define PCI_COMMAND_MEMORY 0x2 /* Enable response in Memory space */ #define PCI_COMMAND_MASTER 0x4 /* Enable bus mastering */ #define PCI_COMMAND_SPECIAL 0x8 /* Enable response to special cycles */ #define PCI_COMMAND_INVALIDATE 0x10 /* Use memory write and invalidate */ #define PCI_COMMAND_VGA_PALETTE 0x20 /* Enable palette snooping */ #define PCI_COMMAND_PARITY 0x40 /* Enable parity checking */ #define PCI_COMMAND_WAIT 0x80 /* Enable address/data stepping */ #define PCI_COMMAND_SERR 0x100 /* Enable SERR */ #define PCI_COMMAND_FAST_BACK 0x200 /* Enable back-to-back writes */ #define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */ We want the guest to access directly at least I/O and memory decoding and bus mastering bits, but how do we emulate the rest? Do you mean we can match the rest to what host uses for the device, like PCI_COMMAND_INTX_DISABLE bit? If so, as per my understanding, those bits get set/cleared when a device is enabled, e.g. by Linux kernel/device driver for example. So, if we have a hidden PCI device which can be assigned to a guest and it is literally untouched (not enabled in Dom0) then I think there will be no such reference as "host assigned values" as most probably the command register will remain in its after reset state. Thus, I am not quite sure the command register can easily be emulated. Please correct me if my understanding is wrong here. > Reporting the > emulated register as zero initially is probably also quite fine (to > match, as you say, mandated reset state). > > Jan > Thank you, Oleksandr
On 07.09.2021 10:18, Oleksandr Andrushchenko wrote:
>
> On 07.09.21 11:00, Jan Beulich wrote:
>> On 07.09.2021 09:43, Oleksandr Andrushchenko wrote:
>>> On 06.09.21 17:55, Jan Beulich wrote:
>>>> On 03.09.2021 12:08, Oleksandr Andrushchenko wrote:
>>>>> --- a/xen/drivers/vpci/header.c
>>>>> +++ b/xen/drivers/vpci/header.c
>>>>> @@ -811,6 +811,16 @@ int vpci_bar_add_handlers(const struct domain *d, struct pci_dev *pdev)
>>>>> gprintk(XENLOG_ERR,
>>>>> "%pp: failed to add BAR handlers for dom%d\n", &pdev->sbdf,
>>>>> d->domain_id);
>>>>> +
>>>>> + /*
>>>>> + * Reset the command register: it is possible that when passing
>>>>> + * through a PCI device its memory decoding bits in the command
>>>>> + * register are already set. Thus, a guest OS may not write to the
>>>>> + * command register to update memory decoding, so guest mappings
>>>>> + * (guest's view of the BARs) are left not updated.
>>>>> + */
>>>>> + pci_conf_write16(pdev->sbdf, PCI_COMMAND, 0);
>>>> Can you really blindly write 0 here? What about bits that have to be
>>>> under host control, e.g. INTX_DISABLE? I can see that you may want to
>>>> hand off with I/O and memory decoding off and bus mastering disabled,
>>>> but for every other bit (including reserved ones) I'd expect separate
>>>> justification (in the commit message).
>>> According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0" I have at hand,
>>> section "6.2.2 Device Control" says that the reset state of the command
>>> register is typically 0, so this is why I chose to write 0 here, e.g.
>>> make the command register as if it is after the reset.
>>>
>>> With respect to host control: we currently do not really emulate command
>>> register which probably was ok for x86 PVH Dom0 and this might not be the
>>> case now as we add DomU's. That being said: in my implementation guest can
>>> alter command register as it wants without restrictions.
>>> If we see it does need proper emulation then we would need adding that as
>>> well (is not part of this series though).
>>>
>>> Meanwhile, I agree that we can only reset IO space, memory space and bus
>>> master bits and leave the rest untouched. But again, without proper command
>>> register emulation guests can still set what they want.
>> Yes, writes to the register will need emulating for DomU.
>
> But then I am wondering to what extent we need to emulate the command
>
> register? We have the following bits in the command register:
>
> #define PCI_COMMAND_IO 0x1 /* Enable response in I/O space */
> #define PCI_COMMAND_MEMORY 0x2 /* Enable response in Memory space */
> #define PCI_COMMAND_MASTER 0x4 /* Enable bus mastering */
> #define PCI_COMMAND_SPECIAL 0x8 /* Enable response to special cycles */
> #define PCI_COMMAND_INVALIDATE 0x10 /* Use memory write and invalidate */
> #define PCI_COMMAND_VGA_PALETTE 0x20 /* Enable palette snooping */
> #define PCI_COMMAND_PARITY 0x40 /* Enable parity checking */
> #define PCI_COMMAND_WAIT 0x80 /* Enable address/data stepping */
> #define PCI_COMMAND_SERR 0x100 /* Enable SERR */
> #define PCI_COMMAND_FAST_BACK 0x200 /* Enable back-to-back writes */
> #define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */
>
> We want the guest to access directly at least I/O and memory decoding and bus mastering
> bits, but how do we emulate the rest? Do you mean we can match the rest to what host
> uses for the device, like PCI_COMMAND_INTX_DISABLE bit? If so, as per my understanding,
> those bits get set/cleared when a device is enabled, e.g. by Linux kernel/device driver for example.
I would suggest to take qemu's emulation as a starting point.
> So, if we have a hidden PCI device which can be assigned to a guest and it is literally untouched
> (not enabled in Dom0) then I think there will be no such reference as "host assigned values" as
> most probably the command register will remain in its after reset state.
What meaning of "hidden" do you imply here? Devices passed to
pci_{hide,ro}_device() may not be assigned to guests ...
For any other meaning of "hidden", even if the device is completely
ignored by Dom0, certain of the properties still cannot be allowed
to be DomU-controlled. (I'm therefore not sure in how far Dom0 can
actually legitimately "ignore" devices. It may decide to not enable
them, but that's not "ignoring".)
Jan
On 07.09.21 11:49, Jan Beulich wrote:
> On 07.09.2021 10:18, Oleksandr Andrushchenko wrote:
>> On 07.09.21 11:00, Jan Beulich wrote:
>>> On 07.09.2021 09:43, Oleksandr Andrushchenko wrote:
>>>> On 06.09.21 17:55, Jan Beulich wrote:
>>>>> On 03.09.2021 12:08, Oleksandr Andrushchenko wrote:
>>>>>> --- a/xen/drivers/vpci/header.c
>>>>>> +++ b/xen/drivers/vpci/header.c
>>>>>> @@ -811,6 +811,16 @@ int vpci_bar_add_handlers(const struct domain *d, struct pci_dev *pdev)
>>>>>> gprintk(XENLOG_ERR,
>>>>>> "%pp: failed to add BAR handlers for dom%d\n", &pdev->sbdf,
>>>>>> d->domain_id);
>>>>>> +
>>>>>> + /*
>>>>>> + * Reset the command register: it is possible that when passing
>>>>>> + * through a PCI device its memory decoding bits in the command
>>>>>> + * register are already set. Thus, a guest OS may not write to the
>>>>>> + * command register to update memory decoding, so guest mappings
>>>>>> + * (guest's view of the BARs) are left not updated.
>>>>>> + */
>>>>>> + pci_conf_write16(pdev->sbdf, PCI_COMMAND, 0);
>>>>> Can you really blindly write 0 here? What about bits that have to be
>>>>> under host control, e.g. INTX_DISABLE? I can see that you may want to
>>>>> hand off with I/O and memory decoding off and bus mastering disabled,
>>>>> but for every other bit (including reserved ones) I'd expect separate
>>>>> justification (in the commit message).
>>>> According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0" I have at hand,
>>>> section "6.2.2 Device Control" says that the reset state of the command
>>>> register is typically 0, so this is why I chose to write 0 here, e.g.
>>>> make the command register as if it is after the reset.
>>>>
>>>> With respect to host control: we currently do not really emulate command
>>>> register which probably was ok for x86 PVH Dom0 and this might not be the
>>>> case now as we add DomU's. That being said: in my implementation guest can
>>>> alter command register as it wants without restrictions.
>>>> If we see it does need proper emulation then we would need adding that as
>>>> well (is not part of this series though).
>>>>
>>>> Meanwhile, I agree that we can only reset IO space, memory space and bus
>>>> master bits and leave the rest untouched. But again, without proper command
>>>> register emulation guests can still set what they want.
>>> Yes, writes to the register will need emulating for DomU.
>> But then I am wondering to what extent we need to emulate the command
>>
>> register? We have the following bits in the command register:
>>
>> #define PCI_COMMAND_IO 0x1 /* Enable response in I/O space */
>> #define PCI_COMMAND_MEMORY 0x2 /* Enable response in Memory space */
>> #define PCI_COMMAND_MASTER 0x4 /* Enable bus mastering */
>> #define PCI_COMMAND_SPECIAL 0x8 /* Enable response to special cycles */
>> #define PCI_COMMAND_INVALIDATE 0x10 /* Use memory write and invalidate */
>> #define PCI_COMMAND_VGA_PALETTE 0x20 /* Enable palette snooping */
>> #define PCI_COMMAND_PARITY 0x40 /* Enable parity checking */
>> #define PCI_COMMAND_WAIT 0x80 /* Enable address/data stepping */
>> #define PCI_COMMAND_SERR 0x100 /* Enable SERR */
>> #define PCI_COMMAND_FAST_BACK 0x200 /* Enable back-to-back writes */
>> #define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */
>>
>> We want the guest to access directly at least I/O and memory decoding and bus mastering
>> bits, but how do we emulate the rest? Do you mean we can match the rest to what host
>> uses for the device, like PCI_COMMAND_INTX_DISABLE bit? If so, as per my understanding,
>> those bits get set/cleared when a device is enabled, e.g. by Linux kernel/device driver for example.
> I would suggest to take qemu's emulation as a starting point.
Sure, I'll take a look what QEMU does. But I guess that emulation may depend
on host bridge emulation etc. which may not be applicable for our case without
serious complications.
>
>> So, if we have a hidden PCI device which can be assigned to a guest and it is literally untouched
>> (not enabled in Dom0) then I think there will be no such reference as "host assigned values" as
>> most probably the command register will remain in its after reset state.
> What meaning of "hidden" do you imply here? Devices passed to
> pci_{hide,ro}_device() may not be assigned to guests ...
You are completely right here.
>
> For any other meaning of "hidden", even if the device is completely
> ignored by Dom0,
Dom0less is such a case when a device is assigned to the guest
without Dom0 at all?
> certain of the properties still cannot be allowed
> to be DomU-controlled.
The list is not that big, could you please name a few you think cannot
be controlled by a guest? I can think of PCI_COMMAND_SPECIAL(?),
PCI_COMMAND_INVALIDATE(?), PCI_COMMAND_PARITY, PCI_COMMAND_WAIT,
PCI_COMMAND_SERR, PCI_COMMAND_INTX_DISABLE which we may want to
be aligned with the "host reference" values, e.g. we only allow those bits
to be set as they are in Dom0.
> (I'm therefore not sure in how far Dom0 can
> actually legitimately "ignore" devices. It may decide to not enable
> them, but that's not "ignoring".)
>
> Jan
>
Thank you,
Oleksandr
On 07.09.2021 11:07, Oleksandr Andrushchenko wrote:
> On 07.09.21 11:49, Jan Beulich wrote:
>> On 07.09.2021 10:18, Oleksandr Andrushchenko wrote:
>>> So, if we have a hidden PCI device which can be assigned to a guest and it is literally untouched
>>> (not enabled in Dom0) then I think there will be no such reference as "host assigned values" as
>>> most probably the command register will remain in its after reset state.
>> What meaning of "hidden" do you imply here? Devices passed to
>> pci_{hide,ro}_device() may not be assigned to guests ...
> You are completely right here.
>>
>> For any other meaning of "hidden", even if the device is completely
>> ignored by Dom0,
>
> Dom0less is such a case when a device is assigned to the guest
> without Dom0 at all?
In this case it is entirely unclear to me what entity it is to have
a global view on the PCI subsystem.
>> certain of the properties still cannot be allowed
>> to be DomU-controlled.
>
> The list is not that big, could you please name a few you think cannot
> be controlled by a guest? I can think of PCI_COMMAND_SPECIAL(?),
> PCI_COMMAND_INVALIDATE(?), PCI_COMMAND_PARITY, PCI_COMMAND_WAIT,
> PCI_COMMAND_SERR, PCI_COMMAND_INTX_DISABLE which we may want to
> be aligned with the "host reference" values, e.g. we only allow those bits
> to be set as they are in Dom0.
Well, you've compile a list already, and I did say so before as well:
Everything except I/O and memory decoding as well as bus mastering
needs at least closely looking at. INTX_DISABLE, for example, is
something I don't think a guest should be able to directly control.
It may still be the case that the host permits it control, but then
only indirectly, allowing the host to appropriately adjust its
internals.
Note that even for I/O and memory decoding as well as bus mastering
it may be necessary to limit guest control: In case the host wants
to disable any of these (perhaps transiently) despite the guest
wanting them enabled.
Jan
On 07.09.21 12:19, Jan Beulich wrote:
> On 07.09.2021 11:07, Oleksandr Andrushchenko wrote:
>> On 07.09.21 11:49, Jan Beulich wrote:
>>> On 07.09.2021 10:18, Oleksandr Andrushchenko wrote:
>>>> So, if we have a hidden PCI device which can be assigned to a guest and it is literally untouched
>>>> (not enabled in Dom0) then I think there will be no such reference as "host assigned values" as
>>>> most probably the command register will remain in its after reset state.
>>> What meaning of "hidden" do you imply here? Devices passed to
>>> pci_{hide,ro}_device() may not be assigned to guests ...
>> You are completely right here.
>>> For any other meaning of "hidden", even if the device is completely
>>> ignored by Dom0,
>> Dom0less is such a case when a device is assigned to the guest
>> without Dom0 at all?
> In this case it is entirely unclear to me what entity it is to have
> a global view on the PCI subsystem.
>
>>> certain of the properties still cannot be allowed
>>> to be DomU-controlled.
>> The list is not that big, could you please name a few you think cannot
>> be controlled by a guest? I can think of PCI_COMMAND_SPECIAL(?),
>> PCI_COMMAND_INVALIDATE(?), PCI_COMMAND_PARITY, PCI_COMMAND_WAIT,
>> PCI_COMMAND_SERR, PCI_COMMAND_INTX_DISABLE which we may want to
>> be aligned with the "host reference" values, e.g. we only allow those bits
>> to be set as they are in Dom0.
> Well, you've compile a list already, and I did say so before as well:
> Everything except I/O and memory decoding as well as bus mastering
> needs at least closely looking at. INTX_DISABLE, for example, is
> something I don't think a guest should be able to directly control.
> It may still be the case that the host permits it control, but then
> only indirectly, allowing the host to appropriately adjust its
> internals.
>
> Note that even for I/O and memory decoding as well as bus mastering
> it may be necessary to limit guest control: In case the host wants
> to disable any of these (perhaps transiently) despite the guest
> wanting them enabled.
Ok, so it is now clear that we need a yet another patch to add a proper
command register emulation. What is your preference: drop the current
patch, implement command register emulation and add a "reset patch"
after that or we can have the patch as is now, but I'll only reset IO/mem and bus
master bits, e.g. read the real value, mask the wanted bits and write back?
>
> Jan
>
Thank you,
Oleksandr
On 07.09.2021 11:52, Oleksandr Andrushchenko wrote:
>
> On 07.09.21 12:19, Jan Beulich wrote:
>> On 07.09.2021 11:07, Oleksandr Andrushchenko wrote:
>>> On 07.09.21 11:49, Jan Beulich wrote:
>>>> On 07.09.2021 10:18, Oleksandr Andrushchenko wrote:
>>>>> So, if we have a hidden PCI device which can be assigned to a guest and it is literally untouched
>>>>> (not enabled in Dom0) then I think there will be no such reference as "host assigned values" as
>>>>> most probably the command register will remain in its after reset state.
>>>> What meaning of "hidden" do you imply here? Devices passed to
>>>> pci_{hide,ro}_device() may not be assigned to guests ...
>>> You are completely right here.
>>>> For any other meaning of "hidden", even if the device is completely
>>>> ignored by Dom0,
>>> Dom0less is such a case when a device is assigned to the guest
>>> without Dom0 at all?
>> In this case it is entirely unclear to me what entity it is to have
>> a global view on the PCI subsystem.
>>
>>>> certain of the properties still cannot be allowed
>>>> to be DomU-controlled.
>>> The list is not that big, could you please name a few you think cannot
>>> be controlled by a guest? I can think of PCI_COMMAND_SPECIAL(?),
>>> PCI_COMMAND_INVALIDATE(?), PCI_COMMAND_PARITY, PCI_COMMAND_WAIT,
>>> PCI_COMMAND_SERR, PCI_COMMAND_INTX_DISABLE which we may want to
>>> be aligned with the "host reference" values, e.g. we only allow those bits
>>> to be set as they are in Dom0.
>> Well, you've compile a list already, and I did say so before as well:
>> Everything except I/O and memory decoding as well as bus mastering
>> needs at least closely looking at. INTX_DISABLE, for example, is
>> something I don't think a guest should be able to directly control.
>> It may still be the case that the host permits it control, but then
>> only indirectly, allowing the host to appropriately adjust its
>> internals.
>>
>> Note that even for I/O and memory decoding as well as bus mastering
>> it may be necessary to limit guest control: In case the host wants
>> to disable any of these (perhaps transiently) despite the guest
>> wanting them enabled.
>
> Ok, so it is now clear that we need a yet another patch to add a proper
> command register emulation. What is your preference: drop the current
> patch, implement command register emulation and add a "reset patch"
> after that or we can have the patch as is now, but I'll only reset IO/mem and bus
> master bits, e.g. read the real value, mask the wanted bits and write back?
Either order is fine with me as long as the result will be claimed to
be complete until proper emulation is in place.
Jan
On 07.09.21 13:06, Jan Beulich wrote:
> On 07.09.2021 11:52, Oleksandr Andrushchenko wrote:
>> On 07.09.21 12:19, Jan Beulich wrote:
>>> On 07.09.2021 11:07, Oleksandr Andrushchenko wrote:
>>>> On 07.09.21 11:49, Jan Beulich wrote:
>>>>> On 07.09.2021 10:18, Oleksandr Andrushchenko wrote:
>>>>>> So, if we have a hidden PCI device which can be assigned to a guest and it is literally untouched
>>>>>> (not enabled in Dom0) then I think there will be no such reference as "host assigned values" as
>>>>>> most probably the command register will remain in its after reset state.
>>>>> What meaning of "hidden" do you imply here? Devices passed to
>>>>> pci_{hide,ro}_device() may not be assigned to guests ...
>>>> You are completely right here.
>>>>> For any other meaning of "hidden", even if the device is completely
>>>>> ignored by Dom0,
>>>> Dom0less is such a case when a device is assigned to the guest
>>>> without Dom0 at all?
>>> In this case it is entirely unclear to me what entity it is to have
>>> a global view on the PCI subsystem.
>>>
>>>>> certain of the properties still cannot be allowed
>>>>> to be DomU-controlled.
>>>> The list is not that big, could you please name a few you think cannot
>>>> be controlled by a guest? I can think of PCI_COMMAND_SPECIAL(?),
>>>> PCI_COMMAND_INVALIDATE(?), PCI_COMMAND_PARITY, PCI_COMMAND_WAIT,
>>>> PCI_COMMAND_SERR, PCI_COMMAND_INTX_DISABLE which we may want to
>>>> be aligned with the "host reference" values, e.g. we only allow those bits
>>>> to be set as they are in Dom0.
>>> Well, you've compile a list already, and I did say so before as well:
>>> Everything except I/O and memory decoding as well as bus mastering
>>> needs at least closely looking at. INTX_DISABLE, for example, is
>>> something I don't think a guest should be able to directly control.
>>> It may still be the case that the host permits it control, but then
>>> only indirectly, allowing the host to appropriately adjust its
>>> internals.
>>>
>>> Note that even for I/O and memory decoding as well as bus mastering
>>> it may be necessary to limit guest control: In case the host wants
>>> to disable any of these (perhaps transiently) despite the guest
>>> wanting them enabled.
>> Ok, so it is now clear that we need a yet another patch to add a proper
>> command register emulation. What is your preference: drop the current
>> patch, implement command register emulation and add a "reset patch"
>> after that or we can have the patch as is now, but I'll only reset IO/mem and bus
>> master bits, e.g. read the real value, mask the wanted bits and write back?
> Either order is fine with me as long as the result will be claimed to
> be complete until proper emulation is in place.
I tried to see what others do in order to emulate PCI_COMMAND register
and it seems that at most they care about the only INTX bit (besides
IO/memory enable and bus muster which are write through). Please see
[1] and [2]. Probably I miss something, but it could be because in order
to properly emulate the COMMAND register we need to know about the
whole PCI topology, e.g. if any setting in device's command register
is aligned with the upstream port etc. This makes me think that because
of this complexity others just ignore that. Neither I think this can be
easily done in our case. So I would suggest we just add the following
simple logic to only emulate PCI_COMMAND_INTX_DISABLE: allow guest to
disable the interrupts, but don't allow to enable if host has disabled
them. This is also could be tricky a bit for the devices which are not
enabled and thus not configured in Dom0, e.g. we do not know for sure
if the value in the PCI_COMMAND register (in particular
PCI_COMMAND_INTX_DISABLE bit) can be used as the reference host value or
not. It can be that the value there is just the one after reset or so.
The rest of the command register bits will go directly to the command
register untouched.
So, at the end of the day the question is if PCI_COMMAND_INTX_DISABLE
is enough and how to get its reference host value.
> Jan
Thank you,
Oleksandr
[1] https://github.com/qemu/qemu/blob/master/hw/xen/xen_pt_config_init.c#L310
[2] https://github.com/projectacrn/acrn-hypervisor/blob/master/hypervisor/hw/pci.c#L336
On 09.09.2021 10:39, Oleksandr Andrushchenko wrote:
>
> On 07.09.21 13:06, Jan Beulich wrote:
>> On 07.09.2021 11:52, Oleksandr Andrushchenko wrote:
>>> On 07.09.21 12:19, Jan Beulich wrote:
>>>> On 07.09.2021 11:07, Oleksandr Andrushchenko wrote:
>>>>> On 07.09.21 11:49, Jan Beulich wrote:
>>>>>> On 07.09.2021 10:18, Oleksandr Andrushchenko wrote:
>>>>>>> So, if we have a hidden PCI device which can be assigned to a guest and it is literally untouched
>>>>>>> (not enabled in Dom0) then I think there will be no such reference as "host assigned values" as
>>>>>>> most probably the command register will remain in its after reset state.
>>>>>> What meaning of "hidden" do you imply here? Devices passed to
>>>>>> pci_{hide,ro}_device() may not be assigned to guests ...
>>>>> You are completely right here.
>>>>>> For any other meaning of "hidden", even if the device is completely
>>>>>> ignored by Dom0,
>>>>> Dom0less is such a case when a device is assigned to the guest
>>>>> without Dom0 at all?
>>>> In this case it is entirely unclear to me what entity it is to have
>>>> a global view on the PCI subsystem.
>>>>
>>>>>> certain of the properties still cannot be allowed
>>>>>> to be DomU-controlled.
>>>>> The list is not that big, could you please name a few you think cannot
>>>>> be controlled by a guest? I can think of PCI_COMMAND_SPECIAL(?),
>>>>> PCI_COMMAND_INVALIDATE(?), PCI_COMMAND_PARITY, PCI_COMMAND_WAIT,
>>>>> PCI_COMMAND_SERR, PCI_COMMAND_INTX_DISABLE which we may want to
>>>>> be aligned with the "host reference" values, e.g. we only allow those bits
>>>>> to be set as they are in Dom0.
>>>> Well, you've compile a list already, and I did say so before as well:
>>>> Everything except I/O and memory decoding as well as bus mastering
>>>> needs at least closely looking at. INTX_DISABLE, for example, is
>>>> something I don't think a guest should be able to directly control.
>>>> It may still be the case that the host permits it control, but then
>>>> only indirectly, allowing the host to appropriately adjust its
>>>> internals.
>>>>
>>>> Note that even for I/O and memory decoding as well as bus mastering
>>>> it may be necessary to limit guest control: In case the host wants
>>>> to disable any of these (perhaps transiently) despite the guest
>>>> wanting them enabled.
>>> Ok, so it is now clear that we need a yet another patch to add a proper
>>> command register emulation. What is your preference: drop the current
>>> patch, implement command register emulation and add a "reset patch"
>>> after that or we can have the patch as is now, but I'll only reset IO/mem and bus
>>> master bits, e.g. read the real value, mask the wanted bits and write back?
>> Either order is fine with me as long as the result will be claimed to
>> be complete until proper emulation is in place.
> I tried to see what others do in order to emulate PCI_COMMAND register
> and it seems that at most they care about the only INTX bit (besides
> IO/memory enable and bus muster which are write through). Please see
> [1] and [2]. Probably I miss something, but it could be because in order
> to properly emulate the COMMAND register we need to know about the
> whole PCI topology, e.g. if any setting in device's command register
> is aligned with the upstream port etc. This makes me think that because
> of this complexity others just ignore that. Neither I think this can be
> easily done in our case. So I would suggest we just add the following
> simple logic to only emulate PCI_COMMAND_INTX_DISABLE: allow guest to
> disable the interrupts, but don't allow to enable if host has disabled
> them. This is also could be tricky a bit for the devices which are not
> enabled and thus not configured in Dom0, e.g. we do not know for sure
> if the value in the PCI_COMMAND register (in particular
> PCI_COMMAND_INTX_DISABLE bit) can be used as the reference host value or
> not. It can be that the value there is just the one after reset or so.
> The rest of the command register bits will go directly to the command
> register untouched.
> So, at the end of the day the question is if PCI_COMMAND_INTX_DISABLE
> is enough and how to get its reference host value.
Well, in order for the whole thing to be security supported it needs to
be explained for every bit why it is safe to allow the guest to drive it.
Until you mean vPCI to reach that state, leaving TODO notes in the code
for anything not investigated may indeed be good enough.
Jan
On 09.09.21 11:43, Jan Beulich wrote:
> On 09.09.2021 10:39, Oleksandr Andrushchenko wrote:
>> On 07.09.21 13:06, Jan Beulich wrote:
>>> On 07.09.2021 11:52, Oleksandr Andrushchenko wrote:
>>>> On 07.09.21 12:19, Jan Beulich wrote:
>>>>> On 07.09.2021 11:07, Oleksandr Andrushchenko wrote:
>>>>>> On 07.09.21 11:49, Jan Beulich wrote:
>>>>>>> On 07.09.2021 10:18, Oleksandr Andrushchenko wrote:
>>>>>>>> So, if we have a hidden PCI device which can be assigned to a guest and it is literally untouched
>>>>>>>> (not enabled in Dom0) then I think there will be no such reference as "host assigned values" as
>>>>>>>> most probably the command register will remain in its after reset state.
>>>>>>> What meaning of "hidden" do you imply here? Devices passed to
>>>>>>> pci_{hide,ro}_device() may not be assigned to guests ...
>>>>>> You are completely right here.
>>>>>>> For any other meaning of "hidden", even if the device is completely
>>>>>>> ignored by Dom0,
>>>>>> Dom0less is such a case when a device is assigned to the guest
>>>>>> without Dom0 at all?
>>>>> In this case it is entirely unclear to me what entity it is to have
>>>>> a global view on the PCI subsystem.
>>>>>
>>>>>>> certain of the properties still cannot be allowed
>>>>>>> to be DomU-controlled.
>>>>>> The list is not that big, could you please name a few you think cannot
>>>>>> be controlled by a guest? I can think of PCI_COMMAND_SPECIAL(?),
>>>>>> PCI_COMMAND_INVALIDATE(?), PCI_COMMAND_PARITY, PCI_COMMAND_WAIT,
>>>>>> PCI_COMMAND_SERR, PCI_COMMAND_INTX_DISABLE which we may want to
>>>>>> be aligned with the "host reference" values, e.g. we only allow those bits
>>>>>> to be set as they are in Dom0.
>>>>> Well, you've compile a list already, and I did say so before as well:
>>>>> Everything except I/O and memory decoding as well as bus mastering
>>>>> needs at least closely looking at. INTX_DISABLE, for example, is
>>>>> something I don't think a guest should be able to directly control.
>>>>> It may still be the case that the host permits it control, but then
>>>>> only indirectly, allowing the host to appropriately adjust its
>>>>> internals.
>>>>>
>>>>> Note that even for I/O and memory decoding as well as bus mastering
>>>>> it may be necessary to limit guest control: In case the host wants
>>>>> to disable any of these (perhaps transiently) despite the guest
>>>>> wanting them enabled.
>>>> Ok, so it is now clear that we need a yet another patch to add a proper
>>>> command register emulation. What is your preference: drop the current
>>>> patch, implement command register emulation and add a "reset patch"
>>>> after that or we can have the patch as is now, but I'll only reset IO/mem and bus
>>>> master bits, e.g. read the real value, mask the wanted bits and write back?
>>> Either order is fine with me as long as the result will be claimed to
>>> be complete until proper emulation is in place.
>> I tried to see what others do in order to emulate PCI_COMMAND register
>> and it seems that at most they care about the only INTX bit (besides
>> IO/memory enable and bus muster which are write through). Please see
>> [1] and [2]. Probably I miss something, but it could be because in order
>> to properly emulate the COMMAND register we need to know about the
>> whole PCI topology, e.g. if any setting in device's command register
>> is aligned with the upstream port etc. This makes me think that because
>> of this complexity others just ignore that. Neither I think this can be
>> easily done in our case. So I would suggest we just add the following
>> simple logic to only emulate PCI_COMMAND_INTX_DISABLE: allow guest to
>> disable the interrupts, but don't allow to enable if host has disabled
>> them. This is also could be tricky a bit for the devices which are not
>> enabled and thus not configured in Dom0, e.g. we do not know for sure
>> if the value in the PCI_COMMAND register (in particular
>> PCI_COMMAND_INTX_DISABLE bit) can be used as the reference host value or
>> not. It can be that the value there is just the one after reset or so.
>> The rest of the command register bits will go directly to the command
>> register untouched.
>> So, at the end of the day the question is if PCI_COMMAND_INTX_DISABLE
>> is enough and how to get its reference host value.
> Well, in order for the whole thing to be security supported it needs to
> be explained for every bit why it is safe to allow the guest to drive it.
So, do we want at least PCI_COMMAND_INTX_DISABLE bit aligned
between the host and guest? If so, what do you you think about
the reference value for it (please see above).
> Until you mean vPCI to reach that state, leaving TODO notes in the code
> for anything not investigated may indeed be good enough.
Ok, I'll add TODO then.
>
> Jan
>
Thank you,
Oleksandr
On 09.09.2021 10:50, Oleksandr Andrushchenko wrote:
>
> On 09.09.21 11:43, Jan Beulich wrote:
>> On 09.09.2021 10:39, Oleksandr Andrushchenko wrote:
>>> On 07.09.21 13:06, Jan Beulich wrote:
>>>> On 07.09.2021 11:52, Oleksandr Andrushchenko wrote:
>>>>> On 07.09.21 12:19, Jan Beulich wrote:
>>>>>> On 07.09.2021 11:07, Oleksandr Andrushchenko wrote:
>>>>>>> On 07.09.21 11:49, Jan Beulich wrote:
>>>>>>>> On 07.09.2021 10:18, Oleksandr Andrushchenko wrote:
>>>>>>>>> So, if we have a hidden PCI device which can be assigned to a guest and it is literally untouched
>>>>>>>>> (not enabled in Dom0) then I think there will be no such reference as "host assigned values" as
>>>>>>>>> most probably the command register will remain in its after reset state.
>>>>>>>> What meaning of "hidden" do you imply here? Devices passed to
>>>>>>>> pci_{hide,ro}_device() may not be assigned to guests ...
>>>>>>> You are completely right here.
>>>>>>>> For any other meaning of "hidden", even if the device is completely
>>>>>>>> ignored by Dom0,
>>>>>>> Dom0less is such a case when a device is assigned to the guest
>>>>>>> without Dom0 at all?
>>>>>> In this case it is entirely unclear to me what entity it is to have
>>>>>> a global view on the PCI subsystem.
>>>>>>
>>>>>>>> certain of the properties still cannot be allowed
>>>>>>>> to be DomU-controlled.
>>>>>>> The list is not that big, could you please name a few you think cannot
>>>>>>> be controlled by a guest? I can think of PCI_COMMAND_SPECIAL(?),
>>>>>>> PCI_COMMAND_INVALIDATE(?), PCI_COMMAND_PARITY, PCI_COMMAND_WAIT,
>>>>>>> PCI_COMMAND_SERR, PCI_COMMAND_INTX_DISABLE which we may want to
>>>>>>> be aligned with the "host reference" values, e.g. we only allow those bits
>>>>>>> to be set as they are in Dom0.
>>>>>> Well, you've compile a list already, and I did say so before as well:
>>>>>> Everything except I/O and memory decoding as well as bus mastering
>>>>>> needs at least closely looking at. INTX_DISABLE, for example, is
>>>>>> something I don't think a guest should be able to directly control.
>>>>>> It may still be the case that the host permits it control, but then
>>>>>> only indirectly, allowing the host to appropriately adjust its
>>>>>> internals.
>>>>>>
>>>>>> Note that even for I/O and memory decoding as well as bus mastering
>>>>>> it may be necessary to limit guest control: In case the host wants
>>>>>> to disable any of these (perhaps transiently) despite the guest
>>>>>> wanting them enabled.
>>>>> Ok, so it is now clear that we need a yet another patch to add a proper
>>>>> command register emulation. What is your preference: drop the current
>>>>> patch, implement command register emulation and add a "reset patch"
>>>>> after that or we can have the patch as is now, but I'll only reset IO/mem and bus
>>>>> master bits, e.g. read the real value, mask the wanted bits and write back?
>>>> Either order is fine with me as long as the result will be claimed to
>>>> be complete until proper emulation is in place.
>>> I tried to see what others do in order to emulate PCI_COMMAND register
>>> and it seems that at most they care about the only INTX bit (besides
>>> IO/memory enable and bus muster which are write through). Please see
>>> [1] and [2]. Probably I miss something, but it could be because in order
>>> to properly emulate the COMMAND register we need to know about the
>>> whole PCI topology, e.g. if any setting in device's command register
>>> is aligned with the upstream port etc. This makes me think that because
>>> of this complexity others just ignore that. Neither I think this can be
>>> easily done in our case. So I would suggest we just add the following
>>> simple logic to only emulate PCI_COMMAND_INTX_DISABLE: allow guest to
>>> disable the interrupts, but don't allow to enable if host has disabled
>>> them. This is also could be tricky a bit for the devices which are not
>>> enabled and thus not configured in Dom0, e.g. we do not know for sure
>>> if the value in the PCI_COMMAND register (in particular
>>> PCI_COMMAND_INTX_DISABLE bit) can be used as the reference host value or
>>> not. It can be that the value there is just the one after reset or so.
>>> The rest of the command register bits will go directly to the command
>>> register untouched.
>>> So, at the end of the day the question is if PCI_COMMAND_INTX_DISABLE
>>> is enough and how to get its reference host value.
>> Well, in order for the whole thing to be security supported it needs to
>> be explained for every bit why it is safe to allow the guest to drive it.
>
> So, do we want at least PCI_COMMAND_INTX_DISABLE bit aligned
> between the host and guest? If so, what do you you think about
> the reference value for it (please see above).
Please may I ask that you come up with a proposal? I don't think I've
said you need to emulate this or any of the other bits. All I've asked
for is that for every bit you allow the guest to control directly, you
justify why that's safe and secure. If no justification can be given,
emulation is going to be necessary. How to solve that is first and
foremost part of your undertaking.
For the bit in question, where the goal appears to be to have hardware
hold the OR of guest and host values, an approach similar to that used
for some of the MSI / MSI-X bits might be chosen: Maintain guest and
host bits in software, and update hardware (at least) when the
effective resulting value changes. A complicating fact here is, though,
that unlike for the MSI / MSI-X bits here Dom0 (pciback or its PCI
susbstem) may also have a view on what the setting ought to be.
Jan
On 09.09.21 12:21, Jan Beulich wrote:
> On 09.09.2021 10:50, Oleksandr Andrushchenko wrote:
>> On 09.09.21 11:43, Jan Beulich wrote:
>>> On 09.09.2021 10:39, Oleksandr Andrushchenko wrote:
>>>> On 07.09.21 13:06, Jan Beulich wrote:
>>>>> On 07.09.2021 11:52, Oleksandr Andrushchenko wrote:
>>>>>> On 07.09.21 12:19, Jan Beulich wrote:
>>>>>>> On 07.09.2021 11:07, Oleksandr Andrushchenko wrote:
>>>>>>>> On 07.09.21 11:49, Jan Beulich wrote:
>>>>>>>>> On 07.09.2021 10:18, Oleksandr Andrushchenko wrote:
>>>>>>>>>> So, if we have a hidden PCI device which can be assigned to a guest and it is literally untouched
>>>>>>>>>> (not enabled in Dom0) then I think there will be no such reference as "host assigned values" as
>>>>>>>>>> most probably the command register will remain in its after reset state.
>>>>>>>>> What meaning of "hidden" do you imply here? Devices passed to
>>>>>>>>> pci_{hide,ro}_device() may not be assigned to guests ...
>>>>>>>> You are completely right here.
>>>>>>>>> For any other meaning of "hidden", even if the device is completely
>>>>>>>>> ignored by Dom0,
>>>>>>>> Dom0less is such a case when a device is assigned to the guest
>>>>>>>> without Dom0 at all?
>>>>>>> In this case it is entirely unclear to me what entity it is to have
>>>>>>> a global view on the PCI subsystem.
>>>>>>>
>>>>>>>>> certain of the properties still cannot be allowed
>>>>>>>>> to be DomU-controlled.
>>>>>>>> The list is not that big, could you please name a few you think cannot
>>>>>>>> be controlled by a guest? I can think of PCI_COMMAND_SPECIAL(?),
>>>>>>>> PCI_COMMAND_INVALIDATE(?), PCI_COMMAND_PARITY, PCI_COMMAND_WAIT,
>>>>>>>> PCI_COMMAND_SERR, PCI_COMMAND_INTX_DISABLE which we may want to
>>>>>>>> be aligned with the "host reference" values, e.g. we only allow those bits
>>>>>>>> to be set as they are in Dom0.
>>>>>>> Well, you've compile a list already, and I did say so before as well:
>>>>>>> Everything except I/O and memory decoding as well as bus mastering
>>>>>>> needs at least closely looking at. INTX_DISABLE, for example, is
>>>>>>> something I don't think a guest should be able to directly control.
>>>>>>> It may still be the case that the host permits it control, but then
>>>>>>> only indirectly, allowing the host to appropriately adjust its
>>>>>>> internals.
>>>>>>>
>>>>>>> Note that even for I/O and memory decoding as well as bus mastering
>>>>>>> it may be necessary to limit guest control: In case the host wants
>>>>>>> to disable any of these (perhaps transiently) despite the guest
>>>>>>> wanting them enabled.
>>>>>> Ok, so it is now clear that we need a yet another patch to add a proper
>>>>>> command register emulation. What is your preference: drop the current
>>>>>> patch, implement command register emulation and add a "reset patch"
>>>>>> after that or we can have the patch as is now, but I'll only reset IO/mem and bus
>>>>>> master bits, e.g. read the real value, mask the wanted bits and write back?
>>>>> Either order is fine with me as long as the result will be claimed to
>>>>> be complete until proper emulation is in place.
>>>> I tried to see what others do in order to emulate PCI_COMMAND register
>>>> and it seems that at most they care about the only INTX bit (besides
>>>> IO/memory enable and bus muster which are write through). Please see
>>>> [1] and [2]. Probably I miss something, but it could be because in order
>>>> to properly emulate the COMMAND register we need to know about the
>>>> whole PCI topology, e.g. if any setting in device's command register
>>>> is aligned with the upstream port etc. This makes me think that because
>>>> of this complexity others just ignore that. Neither I think this can be
>>>> easily done in our case. So I would suggest we just add the following
>>>> simple logic to only emulate PCI_COMMAND_INTX_DISABLE: allow guest to
>>>> disable the interrupts, but don't allow to enable if host has disabled
>>>> them. This is also could be tricky a bit for the devices which are not
>>>> enabled and thus not configured in Dom0, e.g. we do not know for sure
>>>> if the value in the PCI_COMMAND register (in particular
>>>> PCI_COMMAND_INTX_DISABLE bit) can be used as the reference host value or
>>>> not. It can be that the value there is just the one after reset or so.
>>>> The rest of the command register bits will go directly to the command
>>>> register untouched.
>>>> So, at the end of the day the question is if PCI_COMMAND_INTX_DISABLE
>>>> is enough and how to get its reference host value.
>>> Well, in order for the whole thing to be security supported it needs to
>>> be explained for every bit why it is safe to allow the guest to drive it.
>> So, do we want at least PCI_COMMAND_INTX_DISABLE bit aligned
>> between the host and guest? If so, what do you you think about
>> the reference value for it (please see above).
> Please may I ask that you come up with a proposal? I don't think I've
> said you need to emulate this or any of the other bits. All I've asked
> for is that for every bit you allow the guest to control directly, you
> justify why that's safe and secure. If no justification can be given,
> emulation is going to be necessary. How to solve that is first and
> foremost part of your undertaking.
The thing here is that we can't truly justify if we can let the guest
control those bits or not as it all may depend on the topology that some
specific setup might have. Not that we technically can't, but not in a
practical and easy way IMO. Taking that into account we come to a
conclusion that we need to emulate those then. But, again we understand
that full emulation, if properly implemented, is going to be a big piece of code
which needs to take into account the physical PCI topology etc.
So, this is my understanding why others do not implement that (QEMU, ARCN)
and let the guest control all bits, but INTxDISABLE (Disclaimer: if I understood their
code correctly)
So, my proposal here is to only emulate PCI_COMMAND_INTX_DISABLE and let
the other bits be controlled by the guest (please also see the note below).
I do understand this is not correct, but I can't tell how to deal with this other way.
>
> For the bit in question, where the goal appears to be to have hardware
> hold the OR of guest and host values, an approach similar to that used
> for some of the MSI / MSI-X bits might be chosen: Maintain guest and
> host bits in software, and update hardware (at least) when the
> effective resulting value changes. A complicating fact here is, though,
> that unlike for the MSI / MSI-X bits here Dom0 (pciback or its PCI
> susbstem) may also have a view on what the setting ought to be.
The bigger question here is what can we take as the reference for INTx
bit, e.g. if Dom0 didn't enable/configured the device being passed through
than its COMMAND register may still be in after reset state and IMO there is
no guarantee it has the values we can say are "as host wants them"
>
> Jan
>
Thank you,
Oleksandr
On 09.09.2021 13:48, Oleksandr Andrushchenko wrote: > On 09.09.21 12:21, Jan Beulich wrote: >> For the bit in question, where the goal appears to be to have hardware >> hold the OR of guest and host values, an approach similar to that used >> for some of the MSI / MSI-X bits might be chosen: Maintain guest and >> host bits in software, and update hardware (at least) when the >> effective resulting value changes. A complicating fact here is, though, >> that unlike for the MSI / MSI-X bits here Dom0 (pciback or its PCI >> susbstem) may also have a view on what the setting ought to be. > > The bigger question here is what can we take as the reference for INTx > bit, e.g. if Dom0 didn't enable/configured the device being passed through > than its COMMAND register may still be in after reset state and IMO there is > no guarantee it has the values we can say are "as host wants them" In the absence of Dom0 controlling the device, I think we ought to take Xen's view as the "host" one. Which will want the bit set at least as long as either MSI or MSI-X is enabled for the device. Jan
On 09.09.21 14:53, Jan Beulich wrote: > On 09.09.2021 13:48, Oleksandr Andrushchenko wrote: >> On 09.09.21 12:21, Jan Beulich wrote: >>> For the bit in question, where the goal appears to be to have hardware >>> hold the OR of guest and host values, an approach similar to that used >>> for some of the MSI / MSI-X bits might be chosen: Maintain guest and >>> host bits in software, and update hardware (at least) when the >>> effective resulting value changes. A complicating fact here is, though, >>> that unlike for the MSI / MSI-X bits here Dom0 (pciback or its PCI >>> susbstem) may also have a view on what the setting ought to be. >> The bigger question here is what can we take as the reference for INTx >> bit, e.g. if Dom0 didn't enable/configured the device being passed through >> than its COMMAND register may still be in after reset state and IMO there is >> no guarantee it has the values we can say are "as host wants them" > In the absence of Dom0 controlling the device, I think we ought to take > Xen's view as the "host" one. Agree > Which will want the bit set at least as > long as either MSI or MSI-X is enabled for the device. But what is the INTx relation to MSI/MSI-X here? > > Jan > Thank you, Oleksandr
On 09.09.2021 14:42, Oleksandr Andrushchenko wrote: > On 09.09.21 14:53, Jan Beulich wrote: >> On 09.09.2021 13:48, Oleksandr Andrushchenko wrote: >>> On 09.09.21 12:21, Jan Beulich wrote: >>>> For the bit in question, where the goal appears to be to have hardware >>>> hold the OR of guest and host values, an approach similar to that used >>>> for some of the MSI / MSI-X bits might be chosen: Maintain guest and >>>> host bits in software, and update hardware (at least) when the >>>> effective resulting value changes. A complicating fact here is, though, >>>> that unlike for the MSI / MSI-X bits here Dom0 (pciback or its PCI >>>> susbstem) may also have a view on what the setting ought to be. >>> The bigger question here is what can we take as the reference for INTx >>> bit, e.g. if Dom0 didn't enable/configured the device being passed through >>> than its COMMAND register may still be in after reset state and IMO there is >>> no guarantee it has the values we can say are "as host wants them" >> In the absence of Dom0 controlling the device, I think we ought to take >> Xen's view as the "host" one. > Agree >> Which will want the bit set at least as >> long as either MSI or MSI-X is enabled for the device. > But what is the INTx relation to MSI/MSI-X here? Devices are not supposed to signal interrupts two different ways at a time. They may enable only one - pin based, MSI, or MSI-X. Jan
On 09.09.21 15:47, Jan Beulich wrote: > On 09.09.2021 14:42, Oleksandr Andrushchenko wrote: >> On 09.09.21 14:53, Jan Beulich wrote: >>> On 09.09.2021 13:48, Oleksandr Andrushchenko wrote: >>>> On 09.09.21 12:21, Jan Beulich wrote: >>>>> For the bit in question, where the goal appears to be to have hardware >>>>> hold the OR of guest and host values, an approach similar to that used >>>>> for some of the MSI / MSI-X bits might be chosen: Maintain guest and >>>>> host bits in software, and update hardware (at least) when the >>>>> effective resulting value changes. A complicating fact here is, though, >>>>> that unlike for the MSI / MSI-X bits here Dom0 (pciback or its PCI >>>>> susbstem) may also have a view on what the setting ought to be. >>>> The bigger question here is what can we take as the reference for INTx >>>> bit, e.g. if Dom0 didn't enable/configured the device being passed through >>>> than its COMMAND register may still be in after reset state and IMO there is >>>> no guarantee it has the values we can say are "as host wants them" >>> In the absence of Dom0 controlling the device, I think we ought to take >>> Xen's view as the "host" one. >> Agree >>> Which will want the bit set at least as >>> long as either MSI or MSI-X is enabled for the device. >> But what is the INTx relation to MSI/MSI-X here? > Devices are not supposed to signal interrupts two different ways at a > time. They may enable only one - pin based, MSI, or MSI-X. Ah, that simple ;) Yes, of course > > Jan > Thank you, Oleksandr
On 09.09.21 15:47, Jan Beulich wrote:
> On 09.09.2021 14:42, Oleksandr Andrushchenko wrote:
>> On 09.09.21 14:53, Jan Beulich wrote:
>>> On 09.09.2021 13:48, Oleksandr Andrushchenko wrote:
>>>> On 09.09.21 12:21, Jan Beulich wrote:
>>>>> For the bit in question, where the goal appears to be to have hardware
>>>>> hold the OR of guest and host values, an approach similar to that used
>>>>> for some of the MSI / MSI-X bits might be chosen: Maintain guest and
>>>>> host bits in software, and update hardware (at least) when the
>>>>> effective resulting value changes. A complicating fact here is, though,
>>>>> that unlike for the MSI / MSI-X bits here Dom0 (pciback or its PCI
>>>>> susbstem) may also have a view on what the setting ought to be.
>>>> The bigger question here is what can we take as the reference for INTx
>>>> bit, e.g. if Dom0 didn't enable/configured the device being passed through
>>>> than its COMMAND register may still be in after reset state and IMO there is
>>>> no guarantee it has the values we can say are "as host wants them"
>>> In the absence of Dom0 controlling the device, I think we ought to take
>>> Xen's view as the "host" one.
>> Agree
>>> Which will want the bit set at least as
>>> long as either MSI or MSI-X is enabled for the device.
>> But what is the INTx relation to MSI/MSI-X here?
> Devices are not supposed to signal interrupts two different ways at a
> time. They may enable only one - pin based, MSI, or MSI-X.
Ok, so I see that we can partially emulate the command register as:
static void guest_cmd_write(const struct pci_dev *pdev, unsigned int reg,
uint32_t cmd, void *data)
{
/* TODO: Add proper emulation for all bits of the command register. */
if ( (cmd & PCI_COMMAND_INTX_DISABLE) == 0 )
{
/*
* Guest wants to enable INTx. It can't be enabled if:
* - host has INTx disabled
* - MSI/MSI-X enabled
*/
if ( pdev->vpci->msi->enabled )
cmd |= PCI_COMMAND_INTX_DISABLE;
else
{
uint16_t current_cmd = pci_conf_read16(pdev->sbdf, reg);
if ( current_cmd & PCI_COMMAND_INTX_DISABLE )
cmd |= PCI_COMMAND_INTX_DISABLE;
}
}
cmd_write(pdev, reg, cmd, data);
}
and of course have grand TODO for the rest.
>
> Jan
>
Thank you,
Oleksandr
On 09.09.21 12:21, Jan Beulich wrote:
> On 09.09.2021 10:50, Oleksandr Andrushchenko wrote:
>> On 09.09.21 11:43, Jan Beulich wrote:
>>> On 09.09.2021 10:39, Oleksandr Andrushchenko wrote:
>>>> On 07.09.21 13:06, Jan Beulich wrote:
>>>>> On 07.09.2021 11:52, Oleksandr Andrushchenko wrote:
>>>>>> On 07.09.21 12:19, Jan Beulich wrote:
>>>>>>> On 07.09.2021 11:07, Oleksandr Andrushchenko wrote:
>>>>>>>> On 07.09.21 11:49, Jan Beulich wrote:
>>>>>>>>> On 07.09.2021 10:18, Oleksandr Andrushchenko wrote:
>>>>>>>>>> So, if we have a hidden PCI device which can be assigned to a guest and it is literally untouched
>>>>>>>>>> (not enabled in Dom0) then I think there will be no such reference as "host assigned values" as
>>>>>>>>>> most probably the command register will remain in its after reset state.
>>>>>>>>> What meaning of "hidden" do you imply here? Devices passed to
>>>>>>>>> pci_{hide,ro}_device() may not be assigned to guests ...
>>>>>>>> You are completely right here.
>>>>>>>>> For any other meaning of "hidden", even if the device is completely
>>>>>>>>> ignored by Dom0,
>>>>>>>> Dom0less is such a case when a device is assigned to the guest
>>>>>>>> without Dom0 at all?
>>>>>>> In this case it is entirely unclear to me what entity it is to have
>>>>>>> a global view on the PCI subsystem.
>>>>>>>
>>>>>>>>> certain of the properties still cannot be allowed
>>>>>>>>> to be DomU-controlled.
>>>>>>>> The list is not that big, could you please name a few you think cannot
>>>>>>>> be controlled by a guest? I can think of PCI_COMMAND_SPECIAL(?),
>>>>>>>> PCI_COMMAND_INVALIDATE(?), PCI_COMMAND_PARITY, PCI_COMMAND_WAIT,
>>>>>>>> PCI_COMMAND_SERR, PCI_COMMAND_INTX_DISABLE which we may want to
>>>>>>>> be aligned with the "host reference" values, e.g. we only allow those bits
>>>>>>>> to be set as they are in Dom0.
>>>>>>> Well, you've compile a list already, and I did say so before as well:
>>>>>>> Everything except I/O and memory decoding as well as bus mastering
>>>>>>> needs at least closely looking at. INTX_DISABLE, for example, is
>>>>>>> something I don't think a guest should be able to directly control.
>>>>>>> It may still be the case that the host permits it control, but then
>>>>>>> only indirectly, allowing the host to appropriately adjust its
>>>>>>> internals.
>>>>>>>
>>>>>>> Note that even for I/O and memory decoding as well as bus mastering
>>>>>>> it may be necessary to limit guest control: In case the host wants
>>>>>>> to disable any of these (perhaps transiently) despite the guest
>>>>>>> wanting them enabled.
>>>>>> Ok, so it is now clear that we need a yet another patch to add a proper
>>>>>> command register emulation. What is your preference: drop the current
>>>>>> patch, implement command register emulation and add a "reset patch"
>>>>>> after that or we can have the patch as is now, but I'll only reset IO/mem and bus
>>>>>> master bits, e.g. read the real value, mask the wanted bits and write back?
>>>>> Either order is fine with me as long as the result will be claimed to
>>>>> be complete until proper emulation is in place.
>>>> I tried to see what others do in order to emulate PCI_COMMAND register
>>>> and it seems that at most they care about the only INTX bit (besides
>>>> IO/memory enable and bus muster which are write through). Please see
>>>> [1] and [2]. Probably I miss something, but it could be because in order
>>>> to properly emulate the COMMAND register we need to know about the
>>>> whole PCI topology, e.g. if any setting in device's command register
>>>> is aligned with the upstream port etc. This makes me think that because
>>>> of this complexity others just ignore that. Neither I think this can be
>>>> easily done in our case. So I would suggest we just add the following
>>>> simple logic to only emulate PCI_COMMAND_INTX_DISABLE: allow guest to
>>>> disable the interrupts, but don't allow to enable if host has disabled
>>>> them. This is also could be tricky a bit for the devices which are not
>>>> enabled and thus not configured in Dom0, e.g. we do not know for sure
>>>> if the value in the PCI_COMMAND register (in particular
>>>> PCI_COMMAND_INTX_DISABLE bit) can be used as the reference host value or
>>>> not. It can be that the value there is just the one after reset or so.
>>>> The rest of the command register bits will go directly to the command
>>>> register untouched.
>>>> So, at the end of the day the question is if PCI_COMMAND_INTX_DISABLE
>>>> is enough and how to get its reference host value.
>>> Well, in order for the whole thing to be security supported it needs to
>>> be explained for every bit why it is safe to allow the guest to drive it.
>> So, do we want at least PCI_COMMAND_INTX_DISABLE bit aligned
>> between the host and guest? If so, what do you you think about
>> the reference value for it (please see above).
> Please may I ask that you come up with a proposal? I don't think I've
> said you need to emulate this or any of the other bits. All I've asked
> for is that for every bit you allow the guest to control directly, you
> justify why that's safe and secure. If no justification can be given,
> emulation is going to be necessary. How to solve that is first and
> foremost part of your undertaking.
The thing here is that we can't truly justify if we can let the guest
control those bits or not as it all may depend on the topology that some
specific setup might have. Not that we technically can't, but not in a
practical and easy way IMO. Taking that into account we come to a
conclusion that we need to emulate those then. But, again we understand
that full emulation, if properly implemented, is going to be a big piece of code
which needs to take into account the physical PCI topology etc.
So, this is my understanding why others do not implement that (QEMU, ARCN)
and let the guest control all bits, but INTxDISABLE (Disclaimer: if I understood their
code correctly)
So, my proposal here is to only emulate PCI_COMMAND_INTX_DISABLE and let
the other bits be controlled by the guest (please also see the note below).
I do understand this is not correct, but I can't tell how to deal with this other way.
>
> For the bit in question, where the goal appears to be to have hardware
> hold the OR of guest and host values, an approach similar to that used
> for some of the MSI / MSI-X bits might be chosen: Maintain guest and
> host bits in software, and update hardware (at least) when the
> effective resulting value changes. A complicating fact here is, though,
> that unlike for the MSI / MSI-X bits here Dom0 (pciback or its PCI
> susbstem) may also have a view on what the setting ought to be.
The bigger question here is what can we take as the reference for INTx
bit, e.g. if Dom0 didn't enable/configured the device being passed through
than its COMMAND register may still be in after reset state and IMO there is
no guarantee it has the values we can say are "as host wants them"
>
> Jan
>
Thank you,
Oleksandr
© 2016 - 2026 Red Hat, Inc.