While forking VMs running a small RTOS (Zephyr), a Xen crash has been
observed due to a mm-lock order violation while copying the HVM CPU context
from the parent. The cause is that hap_update_paging_modes first takes a lock
on the gfn using get_gfn, a call that also creates a shared entry in the
fork's memory map for the cr3 gfn. The function later calls hap_update_cr3
while holding the paging_lock, which results in the lock-order violation in
vmx_load_pdptrs when it tries to unshare the above entry by grabbing the page
with the P2M_UNSHARE flag set.

Since vmx_load_pdptrs only reads from the page, its use of P2M_UNSHARE was
unnecessary to start with. P2M_ALLOC is the appropriate flag to ensure
the p2m is properly populated.

Note that the lock-order violation is avoided because, before the paging_lock
is taken, a lookup is performed with P2M_ALLOC that forks the page, so the
second lookup in vmx_load_pdptrs succeeds without having to perform the fork.
We keep P2M_ALLOC in vmx_load_pdptrs because there are code paths leading up
to it that don't take the paging_lock and have no previous lookup. Currently
no other code path leads there with the paging_lock taken, so no further
adjustments are necessary.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
---
v3: expand commit message to explain why there is no lock-order violation
---
xen/arch/x86/hvm/vmx/vmx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index ab19d9424e..cc6d4ece22 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1325,7 +1325,7 @@ static void vmx_load_pdptrs(struct vcpu *v)
     if ( (cr3 & 0x1fUL) && !hvm_pcid_enabled(v) )
         goto crash;
 
-    page = get_page_from_gfn(v->domain, cr3 >> PAGE_SHIFT, &p2mt, P2M_UNSHARE);
+    page = get_page_from_gfn(v->domain, cr3 >> PAGE_SHIFT, &p2mt, P2M_ALLOC);
     if ( !page )
     {
         /* Ideally you don't want to crash but rather go into a wait
--
2.25.1

On Thu, Jun 18, 2020 at 07:39:04AM -0700, Tamas K Lengyel wrote:
> While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
> observed due to a mm-lock order violation while copying the HVM CPU context
> from the parent. This issue has been identified to be due to
> hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
> call also creates a shared entry in the fork's memory map for the cr3 gfn. The
> function later calls hap_update_cr3 while holding the paging_lock, which
> results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
> the above entry when it grabs the page with the P2M_UNSHARE flag set.
>
> Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
> unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
> the p2m is properly populated.
>
> Note that the lock order violation is avoided because before the paging_lock is
> taken a lookup is performed with P2M_ALLOC that forks the page, thus the second
> lookup in vmx_load_pdptrs succeeds without having to perform the fork. We keep
> P2M_ALLOC in vmx_load_pdptrs because there are code-paths leading up to it
> which don't take the paging_lock and that have no previous lookup. Currently no
> other code-path exists leading there with the paging_lock taken, thus no
> further adjustments are necessary.
>
> Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks!

> -----Original Message-----
> From: Roger Pau Monné <roger.pau@citrix.com>
> Sent: 18 June 2020 16:46
> To: Tamas K Lengyel <tamas.lengyel@intel.com>
> Cc: xen-devel@lists.xenproject.org; Jun Nakajima <jun.nakajima@intel.com>; Kevin Tian
> <kevin.tian@intel.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper <andrew.cooper3@citrix.com>;
> Wei Liu <wl@xen.org>; Paul Durrant <paul@xen.org>
> Subject: Re: [PATCH v3 for-4.14] x86/vmx: use P2M_ALLOC in vmx_load_pdptrs instead of P2M_UNSHARE
>
> On Thu, Jun 18, 2020 at 07:39:04AM -0700, Tamas K Lengyel wrote:
> > While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
> > observed due to a mm-lock order violation while copying the HVM CPU context
> > from the parent. This issue has been identified to be due to
> > hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
> > call also creates a shared entry in the fork's memory map for the cr3 gfn. The
> > function later calls hap_update_cr3 while holding the paging_lock, which
> > results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
> > the above entry when it grabs the page with the P2M_UNSHARE flag set.
> >
> > Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
> > unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
> > the p2m is properly populated.
> >
> > Note that the lock order violation is avoided because before the paging_lock is
> > taken a lookup is performed with P2M_ALLOC that forks the page, thus the second
> > lookup in vmx_load_pdptrs succeeds without having to perform the fork. We keep
> > P2M_ALLOC in vmx_load_pdptrs because there are code-paths leading up to it
> > which don't take the paging_lock and that have no previous lookup. Currently no
> > other code-path exists leading there with the paging_lock taken, thus no
> > further adjustments are necessary.
> >
> > Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>

Release-acked-by: Paul Durrant <paul@xen.org>

> Thanks!

> From: Lengyel, Tamas <tamas.lengyel@intel.com>
> Sent: Thursday, June 18, 2020 10:39 PM
>
> While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
> observed due to a mm-lock order violation while copying the HVM CPU context
> from the parent. This issue has been identified to be due to
> hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
> call also creates a shared entry in the fork's memory map for the cr3 gfn. The
> function later calls hap_update_cr3 while holding the paging_lock, which
> results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
> the above entry when it grabs the page with the P2M_UNSHARE flag set.
>
> Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
> unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
> the p2m is properly populated.
>
> Note that the lock order violation is avoided because before the paging_lock is
> taken a lookup is performed with P2M_ALLOC that forks the page, thus the second
> lookup in vmx_load_pdptrs succeeds without having to perform the fork. We keep
> P2M_ALLOC in vmx_load_pdptrs because there are code-paths leading up to it
> which don't take the paging_lock and that have no previous lookup. Currently no
> other code-path exists leading there with the paging_lock taken, thus no
> further adjustments are necessary.
>
> Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> ---
> v3: expand commit message to explain why there is no lock-order violation
> ---
> xen/arch/x86/hvm/vmx/vmx.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index ab19d9424e..cc6d4ece22 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -1325,7 +1325,7 @@ static void vmx_load_pdptrs(struct vcpu *v)
>      if ( (cr3 & 0x1fUL) && !hvm_pcid_enabled(v) )
>          goto crash;
>
> -    page = get_page_from_gfn(v->domain, cr3 >> PAGE_SHIFT, &p2mt, P2M_UNSHARE);
> +    page = get_page_from_gfn(v->domain, cr3 >> PAGE_SHIFT, &p2mt, P2M_ALLOC);
>      if ( !page )
>      {
>          /* Ideally you don't want to crash but rather go into a wait
> --
> 2.25.1