While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
observed due to a mm-lock order violation while copying the HVM CPU context
from the parent. This issue has been identified to be due to
hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
call also creates a shared entry in the fork's memory map for the cr3 gfn. The
function later calls hap_update_cr3 while holding the paging_lock, which
results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
the above entry when it grabs the page with the P2M_UNSHARE flag set.

Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
the p2m is properly populated.

Note that the lock order violation is avoided because before the paging_lock is
taken a lookup is performed with P2M_ALLOC that forks the page, thus the second
lookup in vmx_load_pdptrs succeeds without having to perform the fork. We keep
P2M_ALLOC in vmx_load_pdptrs because there are code-paths leading up to it
which don't take the paging_lock and that have no previous lookup. Currently no
other code-path exists leading there with the paging_lock taken, thus no
further adjustments are necessary.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
---
v3: expand commit message to explain why there is no lock-order violation
---
xen/arch/x86/hvm/vmx/vmx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index ab19d9424e..cc6d4ece22 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1325,7 +1325,7 @@ static void vmx_load_pdptrs(struct vcpu *v)
     if ( (cr3 & 0x1fUL) && !hvm_pcid_enabled(v) )
         goto crash;

-    page = get_page_from_gfn(v->domain, cr3 >> PAGE_SHIFT, &p2mt, P2M_UNSHARE);
+    page = get_page_from_gfn(v->domain, cr3 >> PAGE_SHIFT, &p2mt, P2M_ALLOC);
     if ( !page )
     {
         /* Ideally you don't want to crash but rather go into a wait
--
2.25.1
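
The lock-order reasoning in the commit message can be sketched with a small
stand-alone model. This is only an illustration, not Xen code: the lock levels,
the Q_ALLOC/Q_UNSHARE flags, the entry states and every function name below are
hypothetical stand-ins, and the assumption that populating or unsharing an
entry needs a lock ordered before the paging lock is a simplification of the
behaviour described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock levels: a lower level must be taken before a higher one. */
enum lock_level { LEVEL_NONE = 0, LEVEL_SHARING = 1, LEVEL_PAGING = 2 };

/* Hypothetical query flags, named after the two flags in the patch. */
enum { Q_ALLOC = 1u << 0, Q_UNSHARE = 1u << 1 };

/* State of the fork's entry for the cr3 gfn in this toy model. */
enum entry_state { ENTRY_EMPTY, ENTRY_SHARED, ENTRY_PRIVATE };

struct model {
    enum lock_level held;      /* highest lock level currently held */
    enum entry_state cr3_gfn;  /* fork's entry for the cr3 gfn */
};

/* A lock may only be taken if its level is above everything already held. */
bool may_take(const struct model *m, enum lock_level level)
{
    return level > m->held;
}

/*
 * Toy lookup: returns false on a lock-order violation, true otherwise.
 * Any work that changes the entry needs the lower-level sharing lock.
 */
bool lookup(struct model *m, unsigned int flags)
{
    if ( m->cr3_gfn == ENTRY_EMPTY && (flags & Q_ALLOC) )
    {
        if ( !may_take(m, LEVEL_SHARING) )
            return false;
        m->cr3_gfn = ENTRY_SHARED;    /* populate the fork's entry */
    }

    if ( m->cr3_gfn == ENTRY_SHARED && (flags & Q_UNSHARE) )
    {
        if ( !may_take(m, LEVEL_SHARING) )
            return false;
        m->cr3_gfn = ENTRY_PRIVATE;   /* break sharing */
    }

    return true;
}

/* Replays the sequence from the commit message for a given second lookup. */
bool second_lookup_ok(unsigned int flags)
{
    struct model m = { LEVEL_NONE, ENTRY_EMPTY };

    /* First lookup, made before the paging lock: populates a shared entry. */
    bool first = lookup(&m, Q_ALLOC);
    assert(first);

    m.held = LEVEL_PAGING;            /* paging lock is now held */
    return lookup(&m, flags);         /* lookup made under the paging lock */
}
```

In this model second_lookup_ok(Q_UNSHARE) reports a violation, because
unsharing needs a lower-level lock while the paging lock is held, whereas
second_lookup_ok(Q_ALLOC) succeeds, because the entry was already populated by
the first lookup and no further locking is needed.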
On Thu, Jun 18, 2020 at 07:39:04AM -0700, Tamas K Lengyel wrote:
> While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
> observed due to a mm-lock order violation while copying the HVM CPU context
> from the parent. This issue has been identified to be due to
> hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
> call also creates a shared entry in the fork's memory map for the cr3 gfn. The
> function later calls hap_update_cr3 while holding the paging_lock, which
> results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
> the above entry when it grabs the page with the P2M_UNSHARE flag set.
>
> Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
> unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
> the p2m is properly populated.
>
> Note that the lock order violation is avoided because before the paging_lock is
> taken a lookup is performed with P2M_ALLOC that forks the page, thus the second
> lookup in vmx_load_pdptrs succeeds without having to perform the fork. We keep
> P2M_ALLOC in vmx_load_pdptrs because there are code-paths leading up to it
> which don't take the paging_lock and that have no previous lookup. Currently no
> other code-path exists leading there with the paging_lock taken, thus no
> further adjustments are necessary.
>
> Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks!
> -----Original Message-----
> From: Roger Pau Monné <roger.pau@citrix.com>
> Sent: 18 June 2020 16:46
> To: Tamas K Lengyel <tamas.lengyel@intel.com>
> Cc: xen-devel@lists.xenproject.org; Jun Nakajima <jun.nakajima@intel.com>; Kevin Tian
> <kevin.tian@intel.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper <andrew.cooper3@citrix.com>;
> Wei Liu <wl@xen.org>; Paul Durrant <paul@xen.org>
> Subject: Re: [PATCH v3 for-4.14] x86/vmx: use P2M_ALLOC in vmx_load_pdptrs instead of P2M_UNSHARE
>
> On Thu, Jun 18, 2020 at 07:39:04AM -0700, Tamas K Lengyel wrote:
> > While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
> > observed due to a mm-lock order violation while copying the HVM CPU context
> > from the parent. This issue has been identified to be due to
> > hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
> > call also creates a shared entry in the fork's memory map for the cr3 gfn. The
> > function later calls hap_update_cr3 while holding the paging_lock, which
> > results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
> > the above entry when it grabs the page with the P2M_UNSHARE flag set.
> >
> > Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
> > unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
> > the p2m is properly populated.
> >
> > Note that the lock order violation is avoided because before the paging_lock is
> > taken a lookup is performed with P2M_ALLOC that forks the page, thus the second
> > lookup in vmx_load_pdptrs succeeds without having to perform the fork. We keep
> > P2M_ALLOC in vmx_load_pdptrs because there are code-paths leading up to it
> > which don't take the paging_lock and that have no previous lookup. Currently no
> > other code-path exists leading there with the paging_lock taken, thus no
> > further adjustments are necessary.
> >
> > Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>

Release-acked-by: Paul Durrant <paul@xen.org>

> Thanks!
> From: Lengyel, Tamas <tamas.lengyel@intel.com>
> Sent: Thursday, June 18, 2020 10:39 PM
>
> While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
> observed due to a mm-lock order violation while copying the HVM CPU context
> from the parent. This issue has been identified to be due to
> hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
> call also creates a shared entry in the fork's memory map for the cr3 gfn. The
> function later calls hap_update_cr3 while holding the paging_lock, which
> results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
> the above entry when it grabs the page with the P2M_UNSHARE flag set.
>
> Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
> unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
> the p2m is properly populated.
>
> Note that the lock order violation is avoided because before the paging_lock is
> taken a lookup is performed with P2M_ALLOC that forks the page, thus the second
> lookup in vmx_load_pdptrs succeeds without having to perform the fork. We keep
> P2M_ALLOC in vmx_load_pdptrs because there are code-paths leading up to it
> which don't take the paging_lock and that have no previous lookup. Currently no
> other code-path exists leading there with the paging_lock taken, thus no
> further adjustments are necessary.
>
> Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> ---
> v3: expand commit message to explain why there is no lock-order violation
> ---
>  xen/arch/x86/hvm/vmx/vmx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index ab19d9424e..cc6d4ece22 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -1325,7 +1325,7 @@ static void vmx_load_pdptrs(struct vcpu *v)
>      if ( (cr3 & 0x1fUL) && !hvm_pcid_enabled(v) )
>          goto crash;
>
> -    page = get_page_from_gfn(v->domain, cr3 >> PAGE_SHIFT, &p2mt, P2M_UNSHARE);
> +    page = get_page_from_gfn(v->domain, cr3 >> PAGE_SHIFT, &p2mt, P2M_ALLOC);
>      if ( !page )
>      {
>          /* Ideally you don't want to crash but rather go into a wait
> --
> 2.25.1