[PATCH] xen: fix usage of pmd/pud_populate in mremap for pv guests

Juergen Gross posted 1 patch 2 weeks, 3 days ago
arch/x86/xen/mmu_pv.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

[PATCH] xen: fix usage of pmd/pud_populate in mremap for pv guests

Posted by Juergen Gross 2 weeks, 3 days ago
Commit 0881ace292b662 ("mm/mremap: use pmd/pud_populate to update page
table entries") introduced a regression when running as a Xen PV guest.

Today pmd/pud_populate() for Xen PV assumes that the PFN inserted
references a page table that is not yet in use. In the case of
move_normal_pmd/pud() this is not true, resulting in WARN splats like:

[34321.304270] ------------[ cut here ]------------
[34321.304277] WARNING: CPU: 0 PID: 23628 at arch/x86/xen/multicalls.c:102 xen_mc_flush+0x176/0x1a0
[34321.304288] Modules linked in:
[34321.304291] CPU: 0 PID: 23628 Comm: apt-get Not tainted 5.14.1-20210906-doflr-mac80211debug+ #1
[34321.304294] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
[34321.304296] RIP: e030:xen_mc_flush+0x176/0x1a0
[34321.304300] Code: 89 45 18 48 c1 e9 3f 48 89 ce e9 20 ff ff ff e8 60 03 00 00 66 90 5b 5d 41 5c 41 5d c3 48 c7 45 18 ea ff ff ff be 01 00 00 00 <0f> 0b 8b 55 00 48 c7 c7 10 97 aa 82 31 db 49 c7 c5 38 97 aa 82 65
[34321.304303] RSP: e02b:ffffc90000a97c90 EFLAGS: 00010002
[34321.304305] RAX: ffff88807d416398 RBX: ffff88807d416350 RCX: ffff88807d416398
[34321.304306] RDX: 0000000000000001 RSI: 0000000000000001 RDI: deadbeefdeadf00d
[34321.304308] RBP: ffff88807d416300 R08: aaaaaaaaaaaaaaaa R09: ffff888006160cc0
[34321.304309] R10: deadbeefdeadf00d R11: ffffea000026a600 R12: 0000000000000000
[34321.304310] R13: ffff888012f6b000 R14: 0000000012f6b000 R15: 0000000000000001
[34321.304320] FS:  00007f5071177800(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
[34321.304322] CS:  10000e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[34321.304323] CR2: 00007f506f542000 CR3: 00000000160cc000 CR4: 0000000000000660
[34321.304326] Call Trace:
[34321.304331]  xen_alloc_pte+0x294/0x320
[34321.304334]  move_pgt_entry+0x165/0x4b0
[34321.304339]  move_page_tables+0x6fa/0x8d0
[34321.304342]  move_vma.isra.44+0x138/0x500
[34321.304345]  __x64_sys_mremap+0x296/0x410
[34321.304348]  do_syscall_64+0x3a/0x80
[34321.304352]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[34321.304355] RIP: 0033:0x7f507196301a
[34321.304358] Code: 73 01 c3 48 8b 0d 76 0e 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 49 89 ca b8 19 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 46 0e 0c 00 f7 d8 64 89 01 48
[34321.304360] RSP: 002b:00007ffda1eecd38 EFLAGS: 00000246 ORIG_RAX: 0000000000000019
[34321.304362] RAX: ffffffffffffffda RBX: 000056205f950f30 RCX: 00007f507196301a
[34321.304363] RDX: 0000000001a00000 RSI: 0000000001900000 RDI: 00007f506dc56000
[34321.304364] RBP: 0000000001a00000 R08: 0000000000000010 R09: 0000000000000004
[34321.304365] R10: 0000000000000001 R11: 0000000000000246 R12: 00007f506dc56060
[34321.304367] R13: 00007f506dc56000 R14: 00007f506dc56060 R15: 000056205f950f30
[34321.304368] ---[ end trace a19885b78fe8f33e ]---
[34321.304370] 1 of 2 multicall(s) failed: cpu 0
[34321.304371]   call  2: op=12297829382473034410 arg=[aaaaaaaaaaaaaaaa] result=-22

Fix that by modifying xen_alloc_ptpage() to pin the page table only if
it is not already pinned.

Fixes: 0881ace292b662 ("mm/mremap: use pmd/pud_populate to update page table entries")
Cc: <stable@vger.kernel.org>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 arch/x86/xen/mmu_pv.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index 1df5f01529e5..8d751939c6f3 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -1518,14 +1518,17 @@ static inline void xen_alloc_ptpage(struct mm_struct *mm, unsigned long pfn,
 	if (pinned) {
 		struct page *page = pfn_to_page(pfn);
 
-		if (static_branch_likely(&xen_struct_pages_ready))
+		pinned = false;
+		if (static_branch_likely(&xen_struct_pages_ready)) {
+			pinned = PagePinned(page);
 			SetPagePinned(page);
+		}
 
 		xen_mc_batch();
 
 		__set_pfn_prot(pfn, PAGE_KERNEL_RO);
 
-		if (level == PT_PTE && USE_SPLIT_PTE_PTLOCKS)
+		if (level == PT_PTE && USE_SPLIT_PTE_PTLOCKS && !pinned)
 			__pin_pagetable_pfn(MMUEXT_PIN_L1_TABLE, pfn);
 
 		xen_mc_issue(PARAVIRT_LAZY_MMU);
-- 
2.26.2
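
For readers who want to see the intent of the hunk in isolation, here is a
minimal user-space sketch. It is not the kernel code: struct model_page,
model_hv_pin() and the two model_alloc_*() helpers are hypothetical names
made up for illustration. It assumes the hypervisor rejects a second pin of
the same page with -EINVAL, which matches the result=-22 in the splat above.

```c
#include <stdbool.h>

#define MODEL_EINVAL 22

/* Hypothetical stand-in for a page table page; not the kernel's struct page. */
struct model_page {
	bool page_pinned;	/* models the PagePinned() flag kept by the guest */
	bool hv_pinned;		/* models the hypervisor's view of the page */
};

/* Simulated pin request: assumed to fail if the page is already pinned. */
static int model_hv_pin(struct model_page *pg)
{
	if (pg->hv_pinned)
		return -MODEL_EINVAL;
	pg->hv_pinned = true;
	return 0;
}

/* Behaviour before the patch: set the flag and always request a pin. */
static int model_alloc_old(struct model_page *pg)
{
	pg->page_pinned = true;
	return model_hv_pin(pg);
}

/* Behaviour after the patch: remember the old flag, pin only if it was clear. */
static int model_alloc_new(struct model_page *pg)
{
	bool was_pinned = pg->page_pinned;

	pg->page_pinned = true;
	if (was_pinned)
		return 0;
	return model_hv_pin(pg);
}
```

Calling model_alloc_old() twice on the same page yields an error on the
second call, which corresponds to the mremap case where an existing page
table is linked into a new PMD slot. model_alloc_new() returns 0 both times.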


Re: [PATCH] xen: fix usage of pmd/pud_populate in mremap for pv guests

Posted by Jan Beulich 2 weeks, 3 days ago
On 08.09.2021 09:36, Juergen Gross wrote:
> Commit 0881ace292b662 ("mm/mremap: use pmd/pud_populate to update page
> table entries") introduced a regression when running as a Xen PV guest.

The description of that change starts with "pmd/pud_populate is the
right interface to be used to set the respective page table entries."
If this is deemed true, I don't think pmd_populate() should call
paravirt_alloc_pte(): The latter function, as its name says, is
supposed to be called for newly allocated page tables only (aiui).

> Today pmd/pud_populate() for Xen PV assumes that the PFN inserted
> references a page table that is not yet in use. In the case of
> move_normal_pmd/pud() this is not true, resulting in WARN splats like:

I agree for the PMD part, but is including PUD here really correct?
While I don't know why that is, xen_alloc_ptpage() pins L1 tables
only. Hence a PUD update shouldn't be able to find a pinned L2
table.

Jan


Re: [PATCH] xen: fix usage of pmd/pud_populate in mremap for pv guests

Posted by Juergen Gross 2 weeks, 3 days ago
On 08.09.21 13:07, Jan Beulich wrote:
> On 08.09.2021 09:36, Juergen Gross wrote:
>> Commit 0881ace292b662 ("mm/mremap: use pmd/pud_populate to update page
>> table entries") introduced a regression when running as a Xen PV guest.
> 
> The description of that change starts with "pmd/pud_populate is the
> right interface to be used to set the respective page table entries."
> If this is deemed true, I don't think pmd_populate() should call
> paravirt_alloc_pte(): The latter function, as its name says, is
> supposed to be called for newly allocated page tables only (aiui).

In theory you are correct, but my experience with reality tells me that
another set of macros for this case will not be appreciated.

> 
>> Today pmd/pud_populate() for Xen PV assumes that the PFN inserted
>> references a page table that is not yet in use. In the case of
>> move_normal_pmd/pud() this is not true, resulting in WARN splats like:
> 
> I agree for the PMD part, but is including PUD here really correct?
> While I don't know why that is, xen_alloc_ptpage() pins L1 tables
> only. Hence a PUD update shouldn't be able to find a pinned L2
> table.

I agree that I should drop mentioning PUD here.

I will make that change when committing if no other changes are
required.


Juergen

Re: [PATCH] xen: fix usage of pmd/pud_populate in mremap for pv guests

Posted by Jan Beulich 2 weeks, 3 days ago
On 08.09.2021 15:32, Juergen Gross wrote:
> On 08.09.21 13:07, Jan Beulich wrote:
>> On 08.09.2021 09:36, Juergen Gross wrote:
>>> Commit 0881ace292b662 ("mm/mremap: use pmd/pud_populate to update page
>>> table entries") introduced a regression when running as a Xen PV guest.
>>
>> The description of that change starts with "pmd/pud_populate is the
>> right interface to be used to set the respective page table entries."
>> If this is deemed true, I don't think pmd_populate() should call
>> paravirt_alloc_pte(): The latter function, as its name says, is
>> supposed to be called for newly allocated page tables only (aiui).
> 
> In theory you are correct, but my experience with reality tells me that
> another set of macros for this case will not be appreciated.

Perhaps a new parameter to the macros / inlines identifying fresh
vs moved? Or perhaps the offending change wasn't really correct in
what its description said?

Jan
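
To make the "fresh vs moved" idea concrete, here is a minimal user-space
sketch of what such a parameter could look like. Everything in it is
hypothetical: enum model_pt_origin, struct model_pt_page, struct model_pmd
and model_pmd_populate() are illustration-only names, not existing kernel
interfaces, and the sketch says nothing about how the real macros would
have to change across architectures.

```c
#include <stdbool.h>

/* Hypothetical flag telling the populate helper where the table came from. */
enum model_pt_origin {
	MODEL_PT_FRESH,	/* page table was just allocated */
	MODEL_PT_MOVED,	/* page table is already in use and is being re-linked */
};

/* Hypothetical page table page with a counter for simulated pin requests. */
struct model_pt_page {
	bool pinned;
	unsigned int pin_requests;
};

/* Hypothetical PMD entry that just points at a page table page. */
struct model_pmd {
	struct model_pt_page *table;
};

/* Run the "newly allocated" hook only for fresh tables, then link the table. */
static void model_pmd_populate(struct model_pmd *pmd, struct model_pt_page *pt,
			       enum model_pt_origin origin)
{
	if (origin == MODEL_PT_FRESH) {
		pt->pinned = true;
		pt->pin_requests++;
	}
	pmd->table = pt;
}
```

With this shape, the mremap path would pass MODEL_PT_MOVED and the
allocation hook would not run a second time for a table that is already in
use.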


Re: [PATCH] xen: fix usage of pmd/pud_populate in mremap for pv guests

Posted by Juergen Gross 2 weeks, 3 days ago
On 08.09.21 16:28, Jan Beulich wrote:
> On 08.09.2021 15:32, Juergen Gross wrote:
>> On 08.09.21 13:07, Jan Beulich wrote:
>>> On 08.09.2021 09:36, Juergen Gross wrote:
>>>> Commit 0881ace292b662 ("mm/mremap: use pmd/pud_populate to update page
>>>> table entries") introduced a regression when running as a Xen PV guest.
>>>
>>> The description of that change starts with "pmd/pud_populate is the
>>> right interface to be used to set the respective page table entries."
>>> If this is deemed true, I don't think pmd_populate() should call
>>> paravirt_alloc_pte(): The latter function, as its name says, is
>>> supposed to be called for newly allocated page tables only (aiui).
>>
>> In theory you are correct, but my experience with reality tells me that
>> another set of macros for this case will not be appreciated.
> 
> Perhaps a new parameter to the macros / inlines identifying fresh
> vs moved? Or perhaps the offending change wasn't really correct in
> what its description said?

The problem is that those macros are spread over all architectures, with
each architecture defining them separately. Changing all of them would
not be welcome.

And the change was correct IMO, as the replaced pmd_set() should be used
for leaf entries only (at least in architecture-independent code).
pmd_populate() is the correct one for non-leaf entries.


Juergen