[PATCH v2 for-4.14] x86/vmx: use P2M_ALLOC in vmx_load_pdptrs instead of P2M_UNSHARE

Tamas K Lengyel posted 1 patch 3 years, 10 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/xen tags/patchew/3555e16baa9e1909dbefaaab04330694135c9021.1592410065.git.tamas.lengyel@intel.com
Maintainers: Wei Liu <wl@xen.org>, Jan Beulich <jbeulich@suse.com>, "Roger Pau Monné" <roger.pau@citrix.com>, Andrew Cooper <andrew.cooper3@citrix.com>, Kevin Tian <kevin.tian@intel.com>, Jun Nakajima <jun.nakajima@intel.com>
There is a newer version of this series
xen/arch/x86/hvm/vmx/vmx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH v2 for-4.14] x86/vmx: use P2M_ALLOC in vmx_load_pdptrs instead of P2M_UNSHARE
Posted by Tamas K Lengyel 3 years, 10 months ago
While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
observed due to a mm-lock order violation while copying the HVM CPU context
from the parent. This issue has been identified to be due to
hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
call also creates a shared entry in the fork's memory map for the cr3 gfn. The
function later calls hap_update_cr3 while holding the paging_lock, which
results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
the above entry when it grabs the page with the P2M_UNSHARE flag set.

Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
the p2m is properly populated and to avoid the lock-order violation we
observed.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
---
v2: This patch was previously sent as
     "x86/hap: use get_gfn_type in hap_update_paging_modes"
---
 xen/arch/x86/hvm/vmx/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index ab19d9424e..cc6d4ece22 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1325,7 +1325,7 @@ static void vmx_load_pdptrs(struct vcpu *v)
     if ( (cr3 & 0x1fUL) && !hvm_pcid_enabled(v) )
         goto crash;
 
-    page = get_page_from_gfn(v->domain, cr3 >> PAGE_SHIFT, &p2mt, P2M_UNSHARE);
+    page = get_page_from_gfn(v->domain, cr3 >> PAGE_SHIFT, &p2mt, P2M_ALLOC);
     if ( !page )
     {
         /* Ideally you don't want to crash but rather go into a wait 
-- 
2.25.1


Re: [PATCH v2 for-4.14] x86/vmx: use P2M_ALLOC in vmx_load_pdptrs instead of P2M_UNSHARE
Posted by Jan Beulich 3 years, 10 months ago
On 17.06.2020 18:19, Tamas K Lengyel wrote:
> While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
> observed due to a mm-lock order violation while copying the HVM CPU context
> from the parent. This issue has been identified to be due to
> hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
> call also creates a shared entry in the fork's memory map for the cr3 gfn. The
> function later calls hap_update_cr3 while holding the paging_lock, which
> results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
> the above entry when it grabs the page with the P2M_UNSHARE flag set.
> 
> Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
> unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
> the p2m is properly populated and to avoid the lock-order violation we
> observed.

Using P2M_ALLOC is not going to address the original problem though
afaict: You may hit the mem_sharing_fork_page() path that way, and
via nominate_page() => __grab_shared_page() => mem_sharing_page_lock()
you'd run into a lock order violation again.

The change is an improvement, so I'd be fine with it going in this
way, but only as long as the description mentions that there's still
an open issue here (which may be non-trivial to address). Or perhaps
combining with your v1 change is the way to go (for now or even
permanently)?

Jan

Re: [PATCH v2 for-4.14] x86/vmx: use P2M_ALLOC in vmx_load_pdptrs instead of P2M_UNSHARE
Posted by Roger Pau Monné 3 years, 10 months ago
On Thu, Jun 18, 2020 at 08:30:08AM +0200, Jan Beulich wrote:
> On 17.06.2020 18:19, Tamas K Lengyel wrote:
> > While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
> > observed due to a mm-lock order violation while copying the HVM CPU context
> > from the parent. This issue has been identified to be due to
> > hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
> > call also creates a shared entry in the fork's memory map for the cr3 gfn. The
> > function later calls hap_update_cr3 while holding the paging_lock, which
> > results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
> > the above entry when it grabs the page with the P2M_UNSHARE flag set.
> > 
> > Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
> > unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
> > the p2m is properly populated and to avoid the lock-order violation we
> > observed.
> 
> Using P2M_ALLOC is not going to address the original problem though
> afaict: You may hit the mem_sharing_fork_page() path that way, and
> via nominate_page() => __grab_shared_page() => mem_sharing_page_lock()
> you'd run into a lock order violation again.

Well, I guess Tamas avoids this because of the get_gfn call in
hap_update_paging_modes will have already populated the entry, so it's
never going to hit the p2m_is_hole check in __get_gfn_type_access.

> The change is an improvement, so I'd be fine with it going in this
> way, but only as long as the description mentions that there's still
> an open issue here (which may be non-trivial to address). Or perhaps
> combining with your v1 change is the way to go (for now or even
> permanently)?

If vmx_load_pdptrs only requires P2M_ALLOC then this is already
covered by the call to get_gfn performed in hap_update_paging_modes,
so I don't think there's much point in merging with v1, as forcing
hap_update_paging_modes to unshare the entry won't affect
vmx_load_pdptrs anymore.

I'm however worried about other code paths that can call into
vmx_load_pdptrs with mm locks taken, and I agree it would be safer to
assert that all the higher layers make sure the cr3 loaded is
correctly populated for a query without P2M_ALLOC to succeed.

Thanks, Roger.

Re: [PATCH v2 for-4.14] x86/vmx: use P2M_ALLOC in vmx_load_pdptrs instead of P2M_UNSHARE
Posted by Jan Beulich 3 years, 10 months ago
On 18.06.2020 11:40, Roger Pau Monné wrote:
> On Thu, Jun 18, 2020 at 08:30:08AM +0200, Jan Beulich wrote:
>> On 17.06.2020 18:19, Tamas K Lengyel wrote:
>>> While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
>>> observed due to a mm-lock order violation while copying the HVM CPU context
>>> from the parent. This issue has been identified to be due to
>>> hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
>>> call also creates a shared entry in the fork's memory map for the cr3 gfn. The
>>> function later calls hap_update_cr3 while holding the paging_lock, which
>>> results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
>>> the above entry when it grabs the page with the P2M_UNSHARE flag set.
>>>
>>> Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
>>> unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
>>> the p2m is properly populated and to avoid the lock-order violation we
>>> observed.
>>
>> Using P2M_ALLOC is not going to address the original problem though
>> afaict: You may hit the mem_sharing_fork_page() path that way, and
>> via nominate_page() => __grab_shared_page() => mem_sharing_page_lock()
>> you'd run into a lock order violation again.
> 
> Well, I guess Tamas avoids this because of the get_gfn call in
> hap_update_paging_modes will have already populated the entry, so it's
> never going to hit the p2m_is_hole check in __get_gfn_type_access.
> 
>> The change is an improvement, so I'd be fine with it going in this
>> way, but only as long as the description mentions that there's still
>> an open issue here (which may be non-trivial to address). Or perhaps
>> combining with your v1 change is the way to go (for now or even
>> permanently)?
> 
> If vmx_load_pdptrs only requires P2M_ALLOC then this is already
> covered by the call to get_gfn performed in hap_update_paging_modes,
> so I don't think there's much point in merging with v1, as forcing
> hap_update_paging_modes to unshare the entry won't affect
> vmx_load_pdptrs anymore.

Oh, I simply mis-remembered what v1 did; for some reason I thought
it added a call rather than modifying the query type from alloc to
unshare.

Jan

Re: [PATCH v2 for-4.14] x86/vmx: use P2M_ALLOC in vmx_load_pdptrs instead of P2M_UNSHARE
Posted by Tamas K Lengyel 3 years, 10 months ago
On Thu, Jun 18, 2020 at 3:42 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Thu, Jun 18, 2020 at 08:30:08AM +0200, Jan Beulich wrote:
> > On 17.06.2020 18:19, Tamas K Lengyel wrote:
> > > While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
> > > observed due to a mm-lock order violation while copying the HVM CPU context
> > > from the parent. This issue has been identified to be due to
> > > hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
> > > call also creates a shared entry in the fork's memory map for the cr3 gfn. The
> > > function later calls hap_update_cr3 while holding the paging_lock, which
> > > results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
> > > the above entry when it grabs the page with the P2M_UNSHARE flag set.
> > >
> > > Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
> > > unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
> > > the p2m is properly populated and to avoid the lock-order violation we
> > > observed.
> >
> > Using P2M_ALLOC is not going to address the original problem though
> > afaict: You may hit the mem_sharing_fork_page() path that way, and
> > via nominate_page() => __grab_shared_page() => mem_sharing_page_lock()
> > you'd run into a lock order violation again.
>
> Well, I guess Tamas avoids this because of the get_gfn call in
> hap_update_paging_modes will have already populated the entry, so it's
> never going to hit the p2m_is_hole check in __get_gfn_type_access.
>
> > The change is an improvement, so I'd be fine with it going in this
> > way, but only as long as the description mentions that there's still
> > an open issue here (which may be non-trivial to address). Or perhaps
> > combining with your v1 change is the way to go (for now or even
> > permanently)?
>
> If vmx_load_pdptrs only requires P2M_ALLOC then this is already
> covered by the call to get_gfn performed in hap_update_paging_modes,
> so I don't think there's much point in merging with v1, as forcing
> hap_update_paging_modes to unshare the entry won't affect
> vmx_load_pdptrs anymore.
>
> I'm however worried about other code paths that can call into
> vmx_load_pdptrs with mm locks taken, and I agree it would be safer to
> assert that all the higher layers make sure the cr3 loaded is
> correctly populated for a query without P2M_ALLOC to succeed.

Using P2M_ALLOC is always safe if 1) the entry is already populated
like in this case but also in 2) in case the gfn is a hole and gets
forked. In mem_sharing the paging lock order is only applicable when
an already present entry is getting converted to a shared type or a
shared typed is getting unshared. It does not apply when a hole is
being plugged. So this is safe in other paths as well where the entry
is not yet present.

Tamas

Re: [PATCH v2 for-4.14] x86/vmx: use P2M_ALLOC in vmx_load_pdptrs instead of P2M_UNSHARE
Posted by Roger Pau Monné 3 years, 10 months ago
On Thu, Jun 18, 2020 at 06:21:42AM -0600, Tamas K Lengyel wrote:
> On Thu, Jun 18, 2020 at 3:42 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Thu, Jun 18, 2020 at 08:30:08AM +0200, Jan Beulich wrote:
> > > On 17.06.2020 18:19, Tamas K Lengyel wrote:
> > > > While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
> > > > observed due to a mm-lock order violation while copying the HVM CPU context
> > > > from the parent. This issue has been identified to be due to
> > > > hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
> > > > call also creates a shared entry in the fork's memory map for the cr3 gfn. The
> > > > function later calls hap_update_cr3 while holding the paging_lock, which
> > > > results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
> > > > the above entry when it grabs the page with the P2M_UNSHARE flag set.
> > > >
> > > > Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
> > > > unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
> > > > the p2m is properly populated and to avoid the lock-order violation we
> > > > observed.
> > >
> > > Using P2M_ALLOC is not going to address the original problem though
> > > afaict: You may hit the mem_sharing_fork_page() path that way, and
> > > via nominate_page() => __grab_shared_page() => mem_sharing_page_lock()
> > > you'd run into a lock order violation again.
> >
> > Well, I guess Tamas avoids this because of the get_gfn call in
> > hap_update_paging_modes will have already populated the entry, so it's
> > never going to hit the p2m_is_hole check in __get_gfn_type_access.
> >
> > > The change is an improvement, so I'd be fine with it going in this
> > > way, but only as long as the description mentions that there's still
> > > an open issue here (which may be non-trivial to address). Or perhaps
> > > combining with your v1 change is the way to go (for now or even
> > > permanently)?
> >
> > If vmx_load_pdptrs only requires P2M_ALLOC then this is already
> > covered by the call to get_gfn performed in hap_update_paging_modes,
> > so I don't think there's much point in merging with v1, as forcing
> > hap_update_paging_modes to unshare the entry won't affect
> > vmx_load_pdptrs anymore.
> >
> > I'm however worried about other code paths that can call into
> > vmx_load_pdptrs with mm locks taken, and I agree it would be safer to
> > assert that all the higher layers make sure the cr3 loaded is
> > correctly populated for a query without P2M_ALLOC to succeed.
> 
> Using P2M_ALLOC is always safe if 1) the entry is already populated
> like in this case but also in 2) in case the gfn is a hole and gets
> forked. In mem_sharing the paging lock order is only applicable when
> an already present entry is getting converted to a shared type or a
> shared typed is getting unshared. It does not apply when a hole is
> being plugged.

But a hole being plugged can also imply that a page is being set
shareable by nominate_page, which will take the mem sharing page lock?

That would be the path: get_gfn_type_access -> __get_gfn_type_access
-> (hole found in p2m) -> mem_sharing_fork_page -> nominate_page (with
page not being shareable already).

It's likely I'm missing some bits, this is all quite complex.

Thanks, Roger.

Re: [PATCH v2 for-4.14] x86/vmx: use P2M_ALLOC in vmx_load_pdptrs instead of P2M_UNSHARE
Posted by Tamas K Lengyel 3 years, 10 months ago
On Thu, Jun 18, 2020 at 12:31 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 17.06.2020 18:19, Tamas K Lengyel wrote:
> > While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
> > observed due to a mm-lock order violation while copying the HVM CPU context
> > from the parent. This issue has been identified to be due to
> > hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
> > call also creates a shared entry in the fork's memory map for the cr3 gfn. The
> > function later calls hap_update_cr3 while holding the paging_lock, which
> > results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
> > the above entry when it grabs the page with the P2M_UNSHARE flag set.
> >
> > Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
> > unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
> > the p2m is properly populated and to avoid the lock-order violation we
> > observed.
>
> Using P2M_ALLOC is not going to address the original problem though
> afaict: You may hit the mem_sharing_fork_page() path that way, and
> via nominate_page() => __grab_shared_page() => mem_sharing_page_lock()
> you'd run into a lock order violation again.

Note that the nominate_page you see in that path is for the parent VM.
The paging lock is not taken for the parent VM thus nominate_page
succeeds without any issues any time fork_page is called. There is no
nominate_page called for the client domain as there is nothing to
nominate when plugging a hole.

Tamas

Re: [PATCH v2 for-4.14] x86/vmx: use P2M_ALLOC in vmx_load_pdptrs instead of P2M_UNSHARE
Posted by Jan Beulich 3 years, 10 months ago
On 18.06.2020 14:39, Tamas K Lengyel wrote:
> On Thu, Jun 18, 2020 at 12:31 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 17.06.2020 18:19, Tamas K Lengyel wrote:
>>> While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
>>> observed due to a mm-lock order violation while copying the HVM CPU context
>>> from the parent. This issue has been identified to be due to
>>> hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
>>> call also creates a shared entry in the fork's memory map for the cr3 gfn. The
>>> function later calls hap_update_cr3 while holding the paging_lock, which
>>> results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
>>> the above entry when it grabs the page with the P2M_UNSHARE flag set.
>>>
>>> Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
>>> unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
>>> the p2m is properly populated and to avoid the lock-order violation we
>>> observed.
>>
>> Using P2M_ALLOC is not going to address the original problem though
>> afaict: You may hit the mem_sharing_fork_page() path that way, and
>> via nominate_page() => __grab_shared_page() => mem_sharing_page_lock()
>> you'd run into a lock order violation again.
> 
> Note that the nominate_page you see in that path is for the parent VM.
> The paging lock is not taken for the parent VM thus nominate_page
> succeeds without any issues any time fork_page is called. There is no
> nominate_page called for the client domain as there is nothing to
> nominate when plugging a hole.

But that's still a lock order issue then, isn't it? Just one that
the machinery can't detect / assert upon.

Jan

Re: [PATCH v2 for-4.14] x86/vmx: use P2M_ALLOC in vmx_load_pdptrs instead of P2M_UNSHARE
Posted by Roger Pau Monné 3 years, 10 months ago
On Thu, Jun 18, 2020 at 02:46:24PM +0200, Jan Beulich wrote:
> On 18.06.2020 14:39, Tamas K Lengyel wrote:
> > On Thu, Jun 18, 2020 at 12:31 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 17.06.2020 18:19, Tamas K Lengyel wrote:
> >>> While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
> >>> observed due to a mm-lock order violation while copying the HVM CPU context
> >>> from the parent. This issue has been identified to be due to
> >>> hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
> >>> call also creates a shared entry in the fork's memory map for the cr3 gfn. The
> >>> function later calls hap_update_cr3 while holding the paging_lock, which
> >>> results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
> >>> the above entry when it grabs the page with the P2M_UNSHARE flag set.
> >>>
> >>> Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
> >>> unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
> >>> the p2m is properly populated and to avoid the lock-order violation we
> >>> observed.
> >>
> >> Using P2M_ALLOC is not going to address the original problem though
> >> afaict: You may hit the mem_sharing_fork_page() path that way, and
> >> via nominate_page() => __grab_shared_page() => mem_sharing_page_lock()
> >> you'd run into a lock order violation again.
> > 
> > Note that the nominate_page you see in that path is for the parent VM.
> > The paging lock is not taken for the parent VM thus nominate_page
> > succeeds without any issues any time fork_page is called. There is no
> > nominate_page called for the client domain as there is nothing to
> > nominate when plugging a hole.
> 
> But that's still a lock order issue then, isn't it? Just one that
> the machinery can't detect / assert upon.

Yes, mm lock ordering doesn't differentiate between domains, and the
current lock order on the pCPU is based on the last lock taken
(regardless of the domain it belongs to).

Roger.

Re: [PATCH v2 for-4.14] x86/vmx: use P2M_ALLOC in vmx_load_pdptrs instead of P2M_UNSHARE
Posted by Tamas K Lengyel 3 years, 10 months ago
On Thu, Jun 18, 2020 at 6:52 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Thu, Jun 18, 2020 at 02:46:24PM +0200, Jan Beulich wrote:
> > On 18.06.2020 14:39, Tamas K Lengyel wrote:
> > > On Thu, Jun 18, 2020 at 12:31 AM Jan Beulich <jbeulich@suse.com> wrote:
> > >>
> > >> On 17.06.2020 18:19, Tamas K Lengyel wrote:
> > >>> While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
> > >>> observed due to a mm-lock order violation while copying the HVM CPU context
> > >>> from the parent. This issue has been identified to be due to
> > >>> hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
> > >>> call also creates a shared entry in the fork's memory map for the cr3 gfn. The
> > >>> function later calls hap_update_cr3 while holding the paging_lock, which
> > >>> results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
> > >>> the above entry when it grabs the page with the P2M_UNSHARE flag set.
> > >>>
> > >>> Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
> > >>> unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
> > >>> the p2m is properly populated and to avoid the lock-order violation we
> > >>> observed.
> > >>
> > >> Using P2M_ALLOC is not going to address the original problem though
> > >> afaict: You may hit the mem_sharing_fork_page() path that way, and
> > >> via nominate_page() => __grab_shared_page() => mem_sharing_page_lock()
> > >> you'd run into a lock order violation again.
> > >
> > > Note that the nominate_page you see in that path is for the parent VM.
> > > The paging lock is not taken for the parent VM thus nominate_page
> > > succeeds without any issues any time fork_page is called. There is no
> > > nominate_page called for the client domain as there is nothing to
> > > nominate when plugging a hole.
> >
> > But that's still a lock order issue then, isn't it? Just one that
> > the machinery can't detect / assert upon.
>
> Yes, mm lock ordering doesn't differentiate between domains, and the
> current lock order on the pCPU is based on the last lock taken
> (regardless of the domain it belongs to).

I see, makes sense. In that case the issue is avoided purely due to
get_gfn being called that happens before the paging_lock is taken.
That would have to be the way-to-go on other paths leading to
vmx_load_pdptrs as well but since all other paths leading there do it
without the paging lock being taken there aren't any more adjustments
necessary right now that I can see.

Tamas

Re: [PATCH v2 for-4.14] x86/vmx: use P2M_ALLOC in vmx_load_pdptrs instead of P2M_UNSHARE
Posted by Jan Beulich 3 years, 10 months ago
On 18.06.2020 15:00, Tamas K Lengyel wrote:
> On Thu, Jun 18, 2020 at 6:52 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>>
>> On Thu, Jun 18, 2020 at 02:46:24PM +0200, Jan Beulich wrote:
>>> On 18.06.2020 14:39, Tamas K Lengyel wrote:
>>>> On Thu, Jun 18, 2020 at 12:31 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>
>>>>> On 17.06.2020 18:19, Tamas K Lengyel wrote:
>>>>>> While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
>>>>>> observed due to a mm-lock order violation while copying the HVM CPU context
>>>>>> from the parent. This issue has been identified to be due to
>>>>>> hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
>>>>>> call also creates a shared entry in the fork's memory map for the cr3 gfn. The
>>>>>> function later calls hap_update_cr3 while holding the paging_lock, which
>>>>>> results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
>>>>>> the above entry when it grabs the page with the P2M_UNSHARE flag set.
>>>>>>
>>>>>> Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
>>>>>> unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
>>>>>> the p2m is properly populated and to avoid the lock-order violation we
>>>>>> observed.
>>>>>
>>>>> Using P2M_ALLOC is not going to address the original problem though
>>>>> afaict: You may hit the mem_sharing_fork_page() path that way, and
>>>>> via nominate_page() => __grab_shared_page() => mem_sharing_page_lock()
>>>>> you'd run into a lock order violation again.
>>>>
>>>> Note that the nominate_page you see in that path is for the parent VM.
>>>> The paging lock is not taken for the parent VM thus nominate_page
>>>> succeeds without any issues any time fork_page is called. There is no
>>>> nominate_page called for the client domain as there is nothing to
>>>> nominate when plugging a hole.
>>>
>>> But that's still a lock order issue then, isn't it? Just one that
>>> the machinery can't detect / assert upon.
>>
>> Yes, mm lock ordering doesn't differentiate between domains, and the
>> current lock order on the pCPU is based on the last lock taken
>> (regardless of the domain it belongs to).
> 
> I see, makes sense. In that case the issue is avoided purely due to
> get_gfn being called that happens before the paging_lock is taken.
> That would have to be the way-to-go on other paths leading to
> vmx_load_pdptrs as well but since all other paths leading there do it
> without the paging lock being taken there aren't any more adjustments
> necessary right now that I can see.

If this is indeed the case, then I guess all that's needed is a further
extended / refined commit message in v3.

Jan

Re: [PATCH v2 for-4.14] x86/vmx: use P2M_ALLOC in vmx_load_pdptrs instead of P2M_UNSHARE
Posted by Tamas K Lengyel 3 years, 10 months ago
On Thu, Jun 18, 2020 at 7:26 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 18.06.2020 15:00, Tamas K Lengyel wrote:
> > On Thu, Jun 18, 2020 at 6:52 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >>
> >> On Thu, Jun 18, 2020 at 02:46:24PM +0200, Jan Beulich wrote:
> >>> On 18.06.2020 14:39, Tamas K Lengyel wrote:
> >>>> On Thu, Jun 18, 2020 at 12:31 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>
> >>>>> On 17.06.2020 18:19, Tamas K Lengyel wrote:
> >>>>>> While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
> >>>>>> observed due to a mm-lock order violation while copying the HVM CPU context
> >>>>>> from the parent. This issue has been identified to be due to
> >>>>>> hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
> >>>>>> call also creates a shared entry in the fork's memory map for the cr3 gfn. The
> >>>>>> function later calls hap_update_cr3 while holding the paging_lock, which
> >>>>>> results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
> >>>>>> the above entry when it grabs the page with the P2M_UNSHARE flag set.
> >>>>>>
> >>>>>> Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
> >>>>>> unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
> >>>>>> the p2m is properly populated and to avoid the lock-order violation we
> >>>>>> observed.
> >>>>>
> >>>>> Using P2M_ALLOC is not going to address the original problem though
> >>>>> afaict: You may hit the mem_sharing_fork_page() path that way, and
> >>>>> via nominate_page() => __grab_shared_page() => mem_sharing_page_lock()
> >>>>> you'd run into a lock order violation again.
> >>>>
> >>>> Note that the nominate_page you see in that path is for the parent VM.
> >>>> The paging lock is not taken for the parent VM thus nominate_page
> >>>> succeeds without any issues any time fork_page is called. There is no
> >>>> nominate_page called for the client domain as there is nothing to
> >>>> nominate when plugging a hole.
> >>>
> >>> But that's still a lock order issue then, isn't it? Just one that
> >>> the machinery can't detect / assert upon.
> >>
> >> Yes, mm lock ordering doesn't differentiate between domains, and the
> >> current lock order on the pCPU is based on the last lock taken
> >> (regardless of the domain it belongs to).
> >
> > I see, makes sense. In that case the issue is avoided purely due to
> > get_gfn being called that happens before the paging_lock is taken.
> > That would have to be the way-to-go on other paths leading to
> > vmx_load_pdptrs as well but since all other paths leading there do it
> > without the paging lock being taken there aren't any more adjustments
> > necessary right now that I can see.
>
> If this is indeed the case, then I guess all that's needed is a further
> extended / refined commit message in v3.

Alright.

Thanks,
Tamas