Commit 49147beb0ccb ("x86/xen: allow nesting of same lazy mode") was
added as a solution for a core-mm code change where
arch_[enter|leave]_lazy_mmu_mode() started to be called in a nested
manner; see commit bcc6cc832573 ("mm: add default definition of
set_ptes()").

However, now that we have fixed the API to avoid nesting, we no longer
need this capability in the x86 implementation.

Additionally, from code review, I don't believe the fix was ever robust
in the case of preemption occurring while in the nested lazy mode. The
implementation usually deals with preemption by calling
arch_leave_lazy_mmu_mode() from xen_start_context_switch() for the
outgoing task if we are in the lazy mmu mode. Then in
xen_end_context_switch(), it restarts the lazy mode by calling
arch_enter_lazy_mmu_mode() for an incoming task that was in the lazy
mode when it was switched out. But arch_leave_lazy_mmu_mode() will only
unwind a single level of nesting. If we are in the double nest, then
it's not fully unwound and per-cpu variables are left in a bad state.

So the correct solution is to remove the possibility of nesting from the
higher level (which has now been done) and remove this x86-specific
solution.
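
The preemption scenario described above can be modelled in userspace. The
following is only a sketch: plain globals stand in for the per-cpu
variables, assert() stands in for BUG_ON(), and old_enter_lazy(),
old_leave_lazy() and mode_after_switch_out() are hypothetical names for
illustration, not kernel functions. It reproduces the pre-patch nesting
logic and applies the single leave that the switch-out path performs.

```c
#include <assert.h>

enum xen_lazy_mode { XEN_LAZY_NONE, XEN_LAZY_MMU, XEN_LAZY_CPU };

/* Assumption: one modelled CPU, so plain globals replace per-cpu data. */
static enum xen_lazy_mode xen_lazy_mode = XEN_LAZY_NONE;
static unsigned int xen_lazy_nesting;

/* Pre-patch behaviour: re-entering the same mode only bumps a counter. */
static void old_enter_lazy(enum xen_lazy_mode mode)
{
	if (mode == xen_lazy_mode) {
		xen_lazy_nesting++;
		return;
	}

	assert(xen_lazy_mode == XEN_LAZY_NONE);	/* models BUG_ON() */
	xen_lazy_mode = mode;
}

/* Pre-patch behaviour: each call unwinds exactly one level. */
static void old_leave_lazy(enum xen_lazy_mode mode)
{
	assert(xen_lazy_mode == mode);		/* models BUG_ON() */

	if (xen_lazy_nesting == 0)
		xen_lazy_mode = XEN_LAZY_NONE;
	else
		xen_lazy_nesting--;
}

/*
 * Enter the lazy MMU mode 'depth' times (depth >= 1), then perform the
 * single leave done on switch-out. Returns the mode left on the CPU.
 */
static enum xen_lazy_mode mode_after_switch_out(unsigned int depth)
{
	unsigned int i;

	xen_lazy_mode = XEN_LAZY_NONE;
	xen_lazy_nesting = 0;

	for (i = 0; i < depth; i++)
		old_enter_lazy(XEN_LAZY_MMU);

	old_leave_lazy(XEN_LAZY_MMU);	/* only one level is unwound */

	return xen_lazy_mode;
}
```

With depth 1 the model ends in XEN_LAZY_NONE, as the switch-out path
expects. With depth 2 it is still in XEN_LAZY_MMU after the single leave,
which is the inconsistent state the commit message refers to.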
Fixes: 49147beb0ccb ("x86/xen: allow nesting of same lazy mode")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/x86/include/asm/xen/hypervisor.h | 15 ++-------------
 arch/x86/xen/enlighten_pv.c           |  1 -
 2 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h
index a9088250770f..bd0fc69a10a7 100644
--- a/arch/x86/include/asm/xen/hypervisor.h
+++ b/arch/x86/include/asm/xen/hypervisor.h
@@ -72,18 +72,10 @@ enum xen_lazy_mode {
 };
 
 DECLARE_PER_CPU(enum xen_lazy_mode, xen_lazy_mode);
-DECLARE_PER_CPU(unsigned int, xen_lazy_nesting);
 
 static inline void enter_lazy(enum xen_lazy_mode mode)
 {
-	enum xen_lazy_mode old_mode = this_cpu_read(xen_lazy_mode);
-
-	if (mode == old_mode) {
-		this_cpu_inc(xen_lazy_nesting);
-		return;
-	}
-
-	BUG_ON(old_mode != XEN_LAZY_NONE);
+	BUG_ON(this_cpu_read(xen_lazy_mode) != XEN_LAZY_NONE);
 
 	this_cpu_write(xen_lazy_mode, mode);
 }
@@ -92,10 +84,7 @@ static inline void leave_lazy(enum xen_lazy_mode mode)
 {
 	BUG_ON(this_cpu_read(xen_lazy_mode) != mode);
 
-	if (this_cpu_read(xen_lazy_nesting) == 0)
-		this_cpu_write(xen_lazy_mode, XEN_LAZY_NONE);
-	else
-		this_cpu_dec(xen_lazy_nesting);
+	this_cpu_write(xen_lazy_mode, XEN_LAZY_NONE);
 }
 
 enum xen_lazy_mode xen_get_lazy_mode(void);
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 5e57835e999d..919e4df9380b 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -99,7 +99,6 @@ struct tls_descs {
 };
 
 DEFINE_PER_CPU(enum xen_lazy_mode, xen_lazy_mode) = XEN_LAZY_NONE;
-DEFINE_PER_CPU(unsigned int, xen_lazy_nesting);
 
 enum xen_lazy_mode xen_get_lazy_mode(void)
 {
--
2.43.0
On 02.03.25 15:55, Ryan Roberts wrote:
> Commit 49147beb0ccb ("x86/xen: allow nesting of same lazy mode") was
> added as a solution for a core-mm code change where
> arch_[enter|leave]_lazy_mmu_mode() started to be called in a nested
> manner; see commit bcc6cc832573 ("mm: add default definition of
> set_ptes()").
>
> However, now that we have fixed the API to avoid nesting, we no longer
> need this capability in the x86 implementation.
>
> Additionally, from code review, I don't believe the fix was ever robust
> in the case of preemption occurring while in the nested lazy mode. The
> implementation usually deals with preemption by calling
> arch_leave_lazy_mmu_mode() from xen_start_context_switch() for the
> outgoing task if we are in the lazy mmu mode. Then in
> xen_end_context_switch(), it restarts the lazy mode by calling
> arch_enter_lazy_mmu_mode() for an incoming task that was in the lazy
> mode when it was switched out. But arch_leave_lazy_mmu_mode() will only
> unwind a single level of nesting. If we are in the double nest, then
> it's not fully unwound and per-cpu variables are left in a bad state.
>
> So the correct solution is to remove the possibility of nesting from the
> higher level (which has now been done) and remove this x86-specific
> solution.
>
> Fixes: 49147beb0ccb ("x86/xen: allow nesting of same lazy mode")
Does this patch here deserve this tag? IIUC, it's rather a cleanup now
that it was properly fixed elsewhere.
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
On 03/03/2025 11:52, David Hildenbrand wrote:
> On 02.03.25 15:55, Ryan Roberts wrote:
>> Commit 49147beb0ccb ("x86/xen: allow nesting of same lazy mode") was
>> added as a solution for a core-mm code change where
>> arch_[enter|leave]_lazy_mmu_mode() started to be called in a nested
>> manner; see commit bcc6cc832573 ("mm: add default definition of
>> set_ptes()").
>>
>> However, now that we have fixed the API to avoid nesting, we no longer
>> need this capability in the x86 implementation.
>>
>> Additionally, from code review, I don't believe the fix was ever robust
>> in the case of preemption occurring while in the nested lazy mode. The
>> implementation usually deals with preemption by calling
>> arch_leave_lazy_mmu_mode() from xen_start_context_switch() for the
>> outgoing task if we are in the lazy mmu mode. Then in
>> xen_end_context_switch(), it restarts the lazy mode by calling
>> arch_enter_lazy_mmu_mode() for an incoming task that was in the lazy
>> mode when it was switched out. But arch_leave_lazy_mmu_mode() will only
>> unwind a single level of nesting. If we are in the double nest, then
>> it's not fully unwound and per-cpu variables are left in a bad state.
>>
>> So the correct solution is to remove the possibility of nesting from the
>> higher level (which has now been done) and remove this x86-specific
>> solution.
>>
>> Fixes: 49147beb0ccb ("x86/xen: allow nesting of same lazy mode")
>
> Does this patch here deserve this tag? IIUC, it's rather a cleanup now that it
> was properly fixed elsewhere.
Now that nesting is not possible, yes it is just a cleanup. But when nesting was
possible, as far as I can tell it was buggy, as per my description. So it's a
bug that won't ever trigger once the other fixes are applied. Happy to
remove the Fixes and then not include it for stable for v2. That's probably
simplest.
>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>
> Acked-by: David Hildenbrand <david@redhat.com>
>
On 03.03.25 13:33, Ryan Roberts wrote:
> On 03/03/2025 11:52, David Hildenbrand wrote:
>> On 02.03.25 15:55, Ryan Roberts wrote:
>>> Commit 49147beb0ccb ("x86/xen: allow nesting of same lazy mode") was
>>> added as a solution for a core-mm code change where
>>> arch_[enter|leave]_lazy_mmu_mode() started to be called in a nested
>>> manner; see commit bcc6cc832573 ("mm: add default definition of
>>> set_ptes()").
>>>
>>> However, now that we have fixed the API to avoid nesting, we no longer
>>> need this capability in the x86 implementation.
>>>
>>> Additionally, from code review, I don't believe the fix was ever robust
>>> in the case of preemption occurring while in the nested lazy mode. The
>>> implementation usually deals with preemption by calling
>>> arch_leave_lazy_mmu_mode() from xen_start_context_switch() for the
>>> outgoing task if we are in the lazy mmu mode. Then in
>>> xen_end_context_switch(), it restarts the lazy mode by calling
>>> arch_enter_lazy_mmu_mode() for an incoming task that was in the lazy
>>> mode when it was switched out. But arch_leave_lazy_mmu_mode() will only
>>> unwind a single level of nesting. If we are in the double nest, then
>>> it's not fully unwound and per-cpu variables are left in a bad state.
>>>
>>> So the correct solution is to remove the possibility of nesting from the
>>> higher level (which has now been done) and remove this x86-specific
>>> solution.
>>>
>>> Fixes: 49147beb0ccb ("x86/xen: allow nesting of same lazy mode")
>>
>> Does this patch here deserve this tag? IIUC, it's rather a cleanup now that it
>> was properly fixed elsewhere.
>
> Now that nesting is not possible, yes it is just a cleanup. But when nesting was
> possible, as far as I can tell it was buggy, as per my description.
Right, I understood that part.
> So it's a
> bug that won't ever trigger once the other fixes are applied. Happy to
> remove the Fixes and then not include it for stable for v2. That's probably
> simplest.
I was just curious, because it sounded like the actual fix was the other
patch. Whatever you think is best :)
--
Cheers,
David / dhildenb