From: Julien Grall <jgrall@amazon.com>
is_iommu_enabled() will return true even if the IOMMU has not been
initialized (e.g. the ops are not set).
In the case of an early failure in arch_domain_init(), the function
iommu_destroy_domain() will be called even if the IOMMU is initialized.
This will result to dereference the ops which will be NULL and an host
crash.
Fix the issue by checking that ops has been set before accessing it.
Note that we are assuming that arch_iommu_domain_init() will cleanup an
intermediate failure if it failed.
Fixes: 71e617a6b8f6 ("use is_iommu_enabled() where appropriate...")
Signed-off-by: Julien Grall <jgrall@amazon.com>
---
xen/drivers/passthrough/iommu.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 2358b6eb09f4..f976d5a0b0a5 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -226,7 +226,15 @@ static void iommu_teardown(struct domain *d)
void iommu_domain_destroy(struct domain *d)
{
- if ( !is_iommu_enabled(d) )
+ struct domain_iommu *hd = dom_iommu(d);
+
+ /*
+ * In case of failure during the domain construction, it would be
+ * possible to reach this function with the IOMMU enabled but not
+ * yet initialized. We assume that hd->platforms will be non-NULL as
+ * soon as we start to initialize the IOMMU.
+ */
+ if ( !is_iommu_enabled(d) || !hd->platform_ops )
return;
iommu_teardown(d);
--
2.17.1
On 22.12.2020 16:43, Julien Grall wrote:
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -226,7 +226,15 @@ static void iommu_teardown(struct domain *d)
>
> void iommu_domain_destroy(struct domain *d)
> {
> - if ( !is_iommu_enabled(d) )
> + struct domain_iommu *hd = dom_iommu(d);
> +
> + /*
> + * In case of failure during the domain construction, it would be
> + * possible to reach this function with the IOMMU enabled but not
> + * yet initialized. We assume that hd->platforms will be non-NULL as
> + * soon as we start to initialize the IOMMU.
> + */
> + if ( !is_iommu_enabled(d) || !hd->platform_ops )
> return;
From your description I'd rather say is_iommu_enabled() is the
wrong predicate to use here. IOW I'd rather see it be replaced.
A couple of additional nits: "hd" isn't really necessary to
have as a local variable, but if you insist on introducing it
despite being used just once, it should be pointer-to-const.
Plus the comment would better spell correctly the field it
talks about. But I consider this comment excessive anyway, as
the check of ->platform_ops is quite clear by itself.
And finally "we assume" is in at least latent conflict with
->platform_ops getting set only after arch_iommu_domain_init()
was called. Right now neither x86'es nor Arm's do anything
that would need undoing, but I'd still suggest to replace
"assume" here (if you want to keep the comment in the first
place) and move the assignment up a few lines (right after the
is_iommu_enabled() check there).
Jan
Hi Jan,
On 23/12/2020 13:27, Jan Beulich wrote:
> On 22.12.2020 16:43, Julien Grall wrote:
>> --- a/xen/drivers/passthrough/iommu.c
>> +++ b/xen/drivers/passthrough/iommu.c
>> @@ -226,7 +226,15 @@ static void iommu_teardown(struct domain *d)
>>
>> void iommu_domain_destroy(struct domain *d)
>> {
>> - if ( !is_iommu_enabled(d) )
>> + struct domain_iommu *hd = dom_iommu(d);
>> +
>> + /*
>> + * In case of failure during the domain construction, it would be
>> + * possible to reach this function with the IOMMU enabled but not
>> + * yet initialized. We assume that hd->platforms will be non-NULL as
>> + * soon as we start to initialize the IOMMU.
>> + */
>> + if ( !is_iommu_enabled(d) || !hd->platform_ops )
>> return;
>
> From your description I'd rather say is_iommu_enabled() is the
> wrong predicate to use here. IOW I'd rather see it be replaced.
!hd->platform_ops should be sufficient. So, I think we can drop
!is_iommu_enabled(d). Would that be fine with you?
>
> A couple of additional nits: "hd" isn't really necessary to
> have as a local variable, but if you insist on introducing it
> despite being used just once, it should be pointer-to-const.
> Plus the comment would better spell correctly the field it
> talks about. But I consider this comment excessive anyway, as
> the check of ->platform_ops is quite clear by itself.
Well, I added the comment because I think check hd->platform_ops may not
be that obvious (see more below).
> And finally "we assume" is in at least latent conflict with
> ->platform_ops getting set only after arch_iommu_domain_init()
> was called. Right now neither x86'es nor Arm's do anything
> that would need undoing, but I'd still suggest to replace
> "assume" here (if you want to keep the comment in the first
> place) and move the assignment up a few lines (right after the
> is_iommu_enabled() check there).
My initial implementation of this patch moved the initialization of
hd->platform_ops before arch_iommu_domain_init().
However, this is going to lead to some issues with Paul's series which
add an IOMMU page-table pool ([1]).
The function arch_iommu_domain_init() may now fail. If we initialize
hd->platform_ops earlier on, then the ->teardown() will be called before
->init().
This may be an issue if ->teardown() expects some list pointer to be
initialized by ->init(). I am not aware of any today, but this seems
quite risky to me.
So I think it is better if we initialize hd->platform_ops after
arch_iommu_domain_init() and expect the function to clean-up nything
that was allocated before the error.
The alternative is we set hd->platform_ops if both
arch_iommu_domain_init() and ->init() have succeeded. This means they
will both have to clean-up in case of an error.
Any thoughts?
Cheers,
[1] <20201005094905.2929-6-paul@xen.org>
--
Julien Grall
On 23.12.2020 14:50, Julien Grall wrote:
> On 23/12/2020 13:27, Jan Beulich wrote:
>> On 22.12.2020 16:43, Julien Grall wrote:
>>> --- a/xen/drivers/passthrough/iommu.c
>>> +++ b/xen/drivers/passthrough/iommu.c
>>> @@ -226,7 +226,15 @@ static void iommu_teardown(struct domain *d)
>>>
>>> void iommu_domain_destroy(struct domain *d)
>>> {
>>> - if ( !is_iommu_enabled(d) )
>>> + struct domain_iommu *hd = dom_iommu(d);
>>> +
>>> + /*
>>> + * In case of failure during the domain construction, it would be
>>> + * possible to reach this function with the IOMMU enabled but not
>>> + * yet initialized. We assume that hd->platforms will be non-NULL as
>>> + * soon as we start to initialize the IOMMU.
>>> + */
>>> + if ( !is_iommu_enabled(d) || !hd->platform_ops )
>>> return;
>>
>> From your description I'd rather say is_iommu_enabled() is the
>> wrong predicate to use here. IOW I'd rather see it be replaced.
>
> !hd->platform_ops should be sufficient. So, I think we can drop
> !is_iommu_enabled(d). Would that be fine with you?
Well, that's what I was trying to suggest.
>> A couple of additional nits: "hd" isn't really necessary to
>> have as a local variable, but if you insist on introducing it
>> despite being used just once, it should be pointer-to-const.
>> Plus the comment would better spell correctly the field it
>> talks about. But I consider this comment excessive anyway, as
>> the check of ->platform_ops is quite clear by itself.
>
> Well, I added the comment because I think check hd->platform_ops may not
> be that obvious (see more below).
>
>> And finally "we assume" is in at least latent conflict with
>> ->platform_ops getting set only after arch_iommu_domain_init()
>> was called. Right now neither x86'es nor Arm's do anything
>> that would need undoing, but I'd still suggest to replace
>> "assume" here (if you want to keep the comment in the first
>> place) and move the assignment up a few lines (right after the
>> is_iommu_enabled() check there).
> My initial implementation of this patch moved the initialization of
> hd->platform_ops before arch_iommu_domain_init().
>
> However, this is going to lead to some issues with Paul's series which
> add an IOMMU page-table pool ([1]).
>
> The function arch_iommu_domain_init() may now fail. If we initialize
> hd->platform_ops earlier on, then the ->teardown() will be called before
> ->init().
>
> This may be an issue if ->teardown() expects some list pointer to be
> initialized by ->init(). I am not aware of any today, but this seems
> quite risky to me.
In such a case the obvious thing is to make the teardown handler
check whether its init counterpart has run before. This will then
fit our apparently much wider goal of making cleanup functions
idempotent.
Jan
> So I think it is better if we initialize hd->platform_ops after
> arch_iommu_domain_init() and expect the function to clean-up nything
> that was allocated before the error.
>
> The alternative is we set hd->platform_ops if both
> arch_iommu_domain_init() and ->init() have succeeded. This means they
> will both have to clean-up in case of an error.
>
> Any thoughts?
>
> Cheers,
>
> [1] <20201005094905.2929-6-paul@xen.org>
>
Hi Jan,
On 23/12/2020 13:59, Jan Beulich wrote:
> On 23.12.2020 14:50, Julien Grall wrote:
>> On 23/12/2020 13:27, Jan Beulich wrote:
>>> On 22.12.2020 16:43, Julien Grall wrote:
>>>> --- a/xen/drivers/passthrough/iommu.c
>>>> +++ b/xen/drivers/passthrough/iommu.c
>>>> @@ -226,7 +226,15 @@ static void iommu_teardown(struct domain *d)
>>>>
>>>> void iommu_domain_destroy(struct domain *d)
>>>> {
>>>> - if ( !is_iommu_enabled(d) )
>>>> + struct domain_iommu *hd = dom_iommu(d);
>>>> +
>>>> + /*
>>>> + * In case of failure during the domain construction, it would be
>>>> + * possible to reach this function with the IOMMU enabled but not
>>>> + * yet initialized. We assume that hd->platforms will be non-NULL as
>>>> + * soon as we start to initialize the IOMMU.
>>>> + */
>>>> + if ( !is_iommu_enabled(d) || !hd->platform_ops )
>>>> return;
>>>
>>> From your description I'd rather say is_iommu_enabled() is the
>>> wrong predicate to use here. IOW I'd rather see it be replaced.
>>
>> !hd->platform_ops should be sufficient. So, I think we can drop
>> !is_iommu_enabled(d). Would that be fine with you?
>
> Well, that's what I was trying to suggest.
>
>>> A couple of additional nits: "hd" isn't really necessary to
>>> have as a local variable, but if you insist on introducing it
>>> despite being used just once, it should be pointer-to-const.
>>> Plus the comment would better spell correctly the field it
>>> talks about. But I consider this comment excessive anyway, as
>>> the check of ->platform_ops is quite clear by itself.
>>
>> Well, I added the comment because I think check hd->platform_ops may not
>> be that obvious (see more below).
>>
>>> And finally "we assume" is in at least latent conflict with
>>> ->platform_ops getting set only after arch_iommu_domain_init()
>>> was called. Right now neither x86'es nor Arm's do anything
>>> that would need undoing, but I'd still suggest to replace
>>> "assume" here (if you want to keep the comment in the first
>>> place) and move the assignment up a few lines (right after the
>>> is_iommu_enabled() check there).
>> My initial implementation of this patch moved the initialization of
>> hd->platform_ops before arch_iommu_domain_init().
>>
>> However, this is going to lead to some issues with Paul's series which
>> add an IOMMU page-table pool ([1]).
>>
>> The function arch_iommu_domain_init() may now fail. If we initialize
>> hd->platform_ops earlier on, then the ->teardown() will be called before
>> ->init().
>>
>> This may be an issue if ->teardown() expects some list pointer to be
>> initialized by ->init(). I am not aware of any today, but this seems
>> quite risky to me.
>
> In such a case the obvious thing is to make the teardown handler
> check whether its init counterpart has run before. This will then
> fit our apparently much wider goal of making cleanup functions
> idempotent.
I will have a look. This may require another boolean which I wanted to
avoid.
Cheers,
--
Julien Grall
On 23.12.2020 15:51, Julien Grall wrote: > On 23/12/2020 13:59, Jan Beulich wrote: >> On 23.12.2020 14:50, Julien Grall wrote: >>> On 23/12/2020 13:27, Jan Beulich wrote: >>>> And finally "we assume" is in at least latent conflict with >>>> ->platform_ops getting set only after arch_iommu_domain_init() >>>> was called. Right now neither x86'es nor Arm's do anything >>>> that would need undoing, but I'd still suggest to replace >>>> "assume" here (if you want to keep the comment in the first >>>> place) and move the assignment up a few lines (right after the >>>> is_iommu_enabled() check there). >>> My initial implementation of this patch moved the initialization of >>> hd->platform_ops before arch_iommu_domain_init(). >>> >>> However, this is going to lead to some issues with Paul's series which >>> add an IOMMU page-table pool ([1]). >>> >>> The function arch_iommu_domain_init() may now fail. If we initialize >>> hd->platform_ops earlier on, then the ->teardown() will be called before >>> ->init(). >>> >>> This may be an issue if ->teardown() expects some list pointer to be >>> initialized by ->init(). I am not aware of any today, but this seems >>> quite risky to me. >> >> In such a case the obvious thing is to make the teardown handler >> check whether its init counterpart has run before. This will then >> fit our apparently much wider goal of making cleanup functions >> idempotent. > > I will have a look. This may require another boolean which I wanted to > avoid. I could imagine list_head_is_null() to become handy here. Jan
> -----Original Message-----
> From: Julien Grall <julien@xen.org>
> Sent: 22 December 2020 15:44
> To: xen-devel@lists.xenproject.org
> Cc: hongyxia@amazon.co.uk; Julien Grall <jgrall@amazon.com>; Jan Beulich <jbeulich@suse.com>; Paul
> Durrant <paul@xen.org>
> Subject: [PATCH for-4.15 1/4] xen/iommu: Check if the IOMMU was initialized before tearing down
>
> From: Julien Grall <jgrall@amazon.com>
>
> is_iommu_enabled() will return true even if the IOMMU has not been
> initialized (e.g. the ops are not set).
>
> In the case of an early failure in arch_domain_init(), the function
> iommu_destroy_domain() will be called even if the IOMMU is initialized.
>
> This will result to dereference the ops which will be NULL and an host
> crash.
>
> Fix the issue by checking that ops has been set before accessing it.
> Note that we are assuming that arch_iommu_domain_init() will cleanup an
> intermediate failure if it failed.
>
> Fixes: 71e617a6b8f6 ("use is_iommu_enabled() where appropriate...")
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> ---
> xen/drivers/passthrough/iommu.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
> index 2358b6eb09f4..f976d5a0b0a5 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -226,7 +226,15 @@ static void iommu_teardown(struct domain *d)
>
> void iommu_domain_destroy(struct domain *d)
> {
> - if ( !is_iommu_enabled(d) )
> + struct domain_iommu *hd = dom_iommu(d);
> +
> + /*
> + * In case of failure during the domain construction, it would be
> + * possible to reach this function with the IOMMU enabled but not
> + * yet initialized. We assume that hd->platforms will be non-NULL as
> + * soon as we start to initialize the IOMMU.
> + */
> + if ( !is_iommu_enabled(d) || !hd->platform_ops )
> return;
TBH, this seems like the wrong way to fix it. The ops dereference is done in iommu_teardown() so that ought to be doing the check,
but given it is single line function then it could be inlined at the same time. That said, I think arch_iommu_domain_destroy() needs
to be call unconditionally as it arch_iommu_domain_init() is called before the ops pointer is set.
Paul
>
> iommu_teardown(d);
> --
> 2.17.1
Hi Paul,
On 04/01/2021 09:28, Paul Durrant wrote:
>> -----Original Message-----
>> From: Julien Grall <julien@xen.org>
>> Sent: 22 December 2020 15:44
>> To: xen-devel@lists.xenproject.org
>> Cc: hongyxia@amazon.co.uk; Julien Grall <jgrall@amazon.com>; Jan Beulich <jbeulich@suse.com>; Paul
>> Durrant <paul@xen.org>
>> Subject: [PATCH for-4.15 1/4] xen/iommu: Check if the IOMMU was initialized before tearing down
>>
>> From: Julien Grall <jgrall@amazon.com>
>>
>> is_iommu_enabled() will return true even if the IOMMU has not been
>> initialized (e.g. the ops are not set).
>>
>> In the case of an early failure in arch_domain_init(), the function
>> iommu_destroy_domain() will be called even if the IOMMU is initialized.
>>
>> This will result to dereference the ops which will be NULL and an host
>> crash.
>>
>> Fix the issue by checking that ops has been set before accessing it.
>> Note that we are assuming that arch_iommu_domain_init() will cleanup an
>> intermediate failure if it failed.
>>
>> Fixes: 71e617a6b8f6 ("use is_iommu_enabled() where appropriate...")
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>> ---
>> xen/drivers/passthrough/iommu.c | 10 +++++++++-
>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
>> index 2358b6eb09f4..f976d5a0b0a5 100644
>> --- a/xen/drivers/passthrough/iommu.c
>> +++ b/xen/drivers/passthrough/iommu.c
>> @@ -226,7 +226,15 @@ static void iommu_teardown(struct domain *d)
>>
>> void iommu_domain_destroy(struct domain *d)
>> {
>> - if ( !is_iommu_enabled(d) )
>> + struct domain_iommu *hd = dom_iommu(d);
>> +
>> + /*
>> + * In case of failure during the domain construction, it would be
>> + * possible to reach this function with the IOMMU enabled but not
>> + * yet initialized. We assume that hd->platforms will be non-NULL as
>> + * soon as we start to initialize the IOMMU.
>> + */
>> + if ( !is_iommu_enabled(d) || !hd->platform_ops )
>> return;
>
> TBH, this seems like the wrong way to fix it. The ops dereference is done in iommu_teardown() so that ought to be doing the check,
> but given it is single line function then it could be inlined at the same time. That said, I think arch_iommu_domain_destroy() needs
> to be call unconditionally as it arch_iommu_domain_init() is called before the ops pointer is set.
I originally resolved this way to avoid arch_iommu_domain_init() and
iommu_teardown() to be called when the failure may happen before
iommu_domain_init() is called. IOW, the lists/locks would be unitialized.
After discussing with Jan, it turns out that we could check whether the
list was initialized or not. So I have reworked the code to now call
arch_iommu_domain_init() unconditionally and move the ops check in
iommu_teardown().
Cheers,
--
Julien Grall
© 2016 - 2026 Red Hat, Inc.