This gets us closer to what Linux does, which hopefully improves the experience for people running Xen on affected hardware. Ian - I'm also Cc-ing you since this feels like being on the edge between a new feature and a bug fix. 1: generalize and correct "iommu=no-igfx" handling 2: Tylersburg isoch DMAR unit with no TLB space Jan
Jan Beulich writes ("[PATCH 0/2] VT-d: correct / extend workaround(s) leaving an IOMMU disabled"):
> Ian - I'm also Cc-ing you since this feels like being on the edge
> between a new feature and a bug fix.
Thanks.
I think 2/ is a new quirk (or, new behaviour for an existing quirk).
I think I am happy to treat that as a bugfix, assuming we are
reasonably confident that most systems (including in particular all
systems without the quirk) will take unchanged codepaths. Is that
right ?
I don't understand 1/. It looks bugfixish to me but I am really not
qualified. I am inclined to defer to your judgement, but it would
help me if you explicitly addressed the overall risks/benefits.
But when reading the patch I did notice one thing that struck me as
undesriable:
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -750,27 +750,43 @@ static void iommu_enable_translation(str
> if ( force_iommu )
> - panic("BIOS did not enable IGD for VT properly, crash Xen for security purpose\n");
> + panic(crash_fmt, msg);
...
> + if ( force_iommu )
> + panic(crash_fmt, msg);
Does this really mean that every exit path from
iommu_enable_translation that doesn't enable the iommu has to have a
check for force_iommu ?
That seems like a recipe for missing one. And I think a missed one
would be an XSA. Could we not structure the code some way to avoid
this foreseeable human error ?
Ian.
On 11.10.2021 12:56, Ian Jackson wrote:
> Jan Beulich writes ("[PATCH 0/2] VT-d: correct / extend workaround(s) leaving an IOMMU disabled"):
>> Ian - I'm also Cc-ing you since this feels like being on the edge
>> between a new feature and a bug fix.
>
> Thanks.
>
> I think 2/ is a new quirk (or, new behaviour for an existing quirk).
> I think I am happy to treat that as a bugfix, assuming we are
> reasonably confident that most systems (including in particular all
> systems without the quirk) will take unchanged codepaths. Is that
> right ?
Yes. According to Linux there's exactly one BIOS flavor known to
exhibit the issue.
> I don't understand 1/. It looks bugfixish to me but I am really not
> qualified. I am inclined to defer to your judgement, but it would
> help me if you explicitly addressed the overall risks/benefits.
Right now our documentation claims similarity to a Linux workaround
without the similarity actually existing in the general case. A
common case (a single integrated graphics device) is handled, but the
perhaps yet more common case of a single add-in graphics devices is
not. Plus the criteria by which a device is determined to be a
graphics one was completely flawed. Hence people in need of the
workaround may find it non-functional. However, since our doc tells
people to report if they have a need to use the option engaging the
workaround, and since we didn't have any such reports in a number
of years, I guess both benefits and possible risks here are of
purely theoretical nature. Note that I've specifically said "possible"
because I can't really see any beyond me not having properly matched
Linux'es equivalent workaround - that workaround has been in place
unchanged for very many years.
> But when reading the patch I did notice one thing that struck me as
> undesriable:
>
>> --- a/xen/drivers/passthrough/vtd/iommu.c
>> +++ b/xen/drivers/passthrough/vtd/iommu.c
>> @@ -750,27 +750,43 @@ static void iommu_enable_translation(str
>> if ( force_iommu )
>> - panic("BIOS did not enable IGD for VT properly, crash Xen for security purpose\n");
>> + panic(crash_fmt, msg);
> ...
>> + if ( force_iommu )
>> + panic(crash_fmt, msg);
>
> Does this really mean that every exit path from
> iommu_enable_translation that doesn't enable the iommu has to have a
> check for force_iommu ?
>
> That seems like a recipe for missing one. And I think a missed one
> would be an XSA. Could we not structure the code some way to avoid
> this foreseeable human error ?
I'm afraid I don't see a good way to do so, as imo it's desirable to
have separate log messages. IOW something like
if ( ... )
{
msg = "...";
goto dead;
}
doesn't look any better to me. Also leaving individual IOMMUs disabled
should really be the exception anyway.
Jan
Jan Beulich writes ("Re: [PATCH 0/2] VT-d: correct / extend workaround(s) leaving an IOMMU disabled [and 1 more messages]"):
> On 11.10.2021 12:56, Ian Jackson wrote:
> > I think 2/ is a new quirk (or, new behaviour for an existing quirk).
> > I think I am happy to treat that as a bugfix, assuming we are
> > reasonably confident that most systems (including in particular all
> > systems without the quirk) will take unchanged codepaths. Is that
> > right ?
>
> Yes. According to Linux there's exactly one BIOS flavor known to
> exhibit the issue.
>
> > I don't understand 1/. It looks bugfixish to me but I am really not
> > qualified. I am inclined to defer to your judgement, but it would
> > help me if you explicitly addressed the overall risks/benefits.
>
> Right now our documentation claims similarity to a Linux workaround
> without the similarity actually existing in the general case. A
> common case (a single integrated graphics device) is handled, but the
> perhaps yet more common case of a single add-in graphics devices is
> not. Plus the criteria by which a device is determined to be a
> graphics one was completely flawed. Hence people in need of the
> workaround may find it non-functional. However, since our doc tells
> people to report if they have a need to use the option engaging the
> workaround, and since we didn't have any such reports in a number
> of years, I guess both benefits and possible risks here are of
> purely theoretical nature. Note that I've specifically said "possible"
> because I can't really see any beyond me not having properly matched
> Linux'es equivalent workaround - that workaround has been in place
> unchanged for very many years.
OK, great. Thanks for the explanation. For the record,
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
> > But when reading the patch I did notice one thing that struck me as
> > undesriable:
...
> > That seems like a recipe for missing one. And I think a missed one
> > would be an XSA. Could we not structure the code some way to avoid
> > this foreseeable human error ?
>
> I'm afraid I don't see a good way to do so, as imo it's desirable to
> have separate log messages. IOW something like
>
> if ( ... )
> {
> msg = "...";
> goto dead;
> }
>
> doesn't look any better to me. Also leaving individual IOMMUs disabled
> should really be the exception anyway.
C does not make this kind of thing easy. I might be tempted to make
an inner function which returned a const char*, with NULL meaning "it
went OK". Oh for a proper sum type...
Ian.
© 2016 - 2026 Red Hat, Inc.