[PATCH] KVM: s390: Fix access to unavailable adapter indicator pages during postcopy

Posted by Thomas Huth 1 month, 1 week ago
From: Thomas Huth <thuth@redhat.com>

When you run a KVM guest with vhost-net and migrate that guest to
another host, and you immediately enable postcopy after starting the
migration, there is a big chance that the network connection of the
guest won't work anymore on the destination side after the migration.

With a debug kernel v6.16.0, there is also a call trace that looks
like this:

 FAULT_FLAG_ALLOW_RETRY missing 881
 CPU: 6 UID: 0 PID: 549 Comm: kworker/6:2 Kdump: loaded Not tainted 6.16.0 #56 NONE
 Hardware name: IBM 3931 LA1 400 (LPAR)
 Workqueue: events irqfd_inject [kvm]
 Call Trace:
  [<00003173cbecc634>] dump_stack_lvl+0x104/0x168
  [<00003173cca69588>] handle_userfault+0xde8/0x1310
  [<00003173cc756f0c>] handle_pte_fault+0x4fc/0x760
  [<00003173cc759212>] __handle_mm_fault+0x452/0xa00
  [<00003173cc7599ba>] handle_mm_fault+0x1fa/0x6a0
  [<00003173cc73409a>] __get_user_pages+0x4aa/0xba0
  [<00003173cc7349e8>] get_user_pages_remote+0x258/0x770
  [<000031734be6f052>] get_map_page+0xe2/0x190 [kvm]
  [<000031734be6f910>] adapter_indicators_set+0x50/0x4a0 [kvm]
  [<000031734be7f674>] set_adapter_int+0xc4/0x170 [kvm]
  [<000031734be2f268>] kvm_set_irq+0x228/0x3f0 [kvm]
  [<000031734be27000>] irqfd_inject+0xd0/0x150 [kvm]
  [<00003173cc00c9ec>] process_one_work+0x87c/0x1490
  [<00003173cc00dda6>] worker_thread+0x7a6/0x1010
  [<00003173cc02dc36>] kthread+0x3b6/0x710
  [<00003173cbed2f0c>] __ret_from_fork+0xdc/0x7f0
  [<00003173cdd737ca>] ret_from_fork+0xa/0x30
 3 locks held by kworker/6:2/549:
  #0: 00000000800bc958 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x7ee/0x1490
  #1: 000030f3d527fbd0 ((work_completion)(&irqfd->inject)){+.+.}-{0:0}, at: process_one_work+0x81c/0x1490
  #2: 00000000f99862b0 (&mm->mmap_lock){++++}-{3:3}, at: get_map_page+0xa8/0x190 [kvm]

The "FAULT_FLAG_ALLOW_RETRY missing" message indicates that
handle_userfault() saw a page fault request without the ALLOW_RETRY flag
set, hence userfaultfd cannot remotely resolve it (because the caller was
asking for an immediate resolution, aka, FAULT_FLAG_NOWAIT, while remote
faults can take time). With that, get_map_page() failed and the irq was
lost.

We should not be strictly in an atomic environment here and the worker
should be sleepable (the call is done during an ioctl from userspace),
so we can allow adapter_indicators_set() to just sleep waiting for the
remote fault instead.

Link: https://issues.redhat.com/browse/RHEL-42486
Signed-off-by: Peter Xu <peterx@redhat.com>
[thuth: Assembled patch description and fixed some cosmetic issues]
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 Note: Instructions for reproducing the bug can be found in the ticket here:
 https://issues.redhat.com/browse/RHEL-42486?focusedId=26661116#comment-26661116

 arch/s390/kvm/interrupt.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 60c360c18690f..dcce826ae9875 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -2777,12 +2777,19 @@ static unsigned long get_ind_bit(__u64 addr, unsigned long bit_nr, bool swap)
 
 static struct page *get_map_page(struct kvm *kvm, u64 uaddr)
 {
+	struct mm_struct *mm = kvm->mm;
 	struct page *page = NULL;
+	int locked = 1;
+
+	if (mmget_not_zero(mm)) {
+		mmap_read_lock(mm);
+		get_user_pages_remote(mm, uaddr, 1, FOLL_WRITE,
+				      &page, &locked);
+		if (locked)
+			mmap_read_unlock(mm);
+		mmput(mm);
+	}
 
-	mmap_read_lock(kvm->mm);
-	get_user_pages_remote(kvm->mm, uaddr, 1, FOLL_WRITE,
-			      &page, NULL);
-	mmap_read_unlock(kvm->mm);
 	return page;
 }
 
-- 
2.50.1
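
The locking change in the patch above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the
kernel implementation: struct toy_mm, toy_read_lock(), toy_read_unlock(),
toy_get_page() and toy_get_map_page() are hypothetical names made up for
this example, and the real retry logic behind get_user_pages_remote() is
more involved and is not reproduced here. What it shows is the contract the
patch relies on: passing a "locked" pointer lets the callee drop the lock
while it waits for a missing page, and the caller then unlocks only if the
callee reports that the lock is still held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an mm with a read-side lock (NOT the kernel's mm_struct). */
struct toy_mm {
	int readers;       /* read-lock holders; must be 0 when we are done */
	bool page_present; /* false models a page not yet received in postcopy */
	bool reacquire;    /* whether the callee retakes the lock after waiting */
};

static void toy_read_lock(struct toy_mm *mm)
{
	mm->readers++;
}

static void toy_read_unlock(struct toy_mm *mm)
{
	assert(mm->readers > 0);
	mm->readers--;
}

/*
 * Hypothetical GUP-like helper (an assumption for illustration only).
 * With locked == NULL it may not drop the lock, so a missing page cannot
 * be waited for and the call fails. With locked != NULL it may drop the
 * lock while waiting, and it reports through *locked whether the lock is
 * still held on return. Returns the number of pages obtained (0 or 1).
 */
static int toy_get_page(struct toy_mm *mm, int *locked)
{
	if (mm->page_present)
		return 1;
	if (!locked)
		return 0;
	toy_read_unlock(mm);     /* drop the lock while "sleeping" */
	mm->page_present = true; /* models the remote fault being resolved */
	if (mm->reacquire)
		toy_read_lock(mm);
	else
		*locked = 0;
	return 1;
}

/* Caller pattern from the patch: unlock only if the callee left us locked. */
static int toy_get_map_page(struct toy_mm *mm, bool allow_retry)
{
	int locked = 1;
	int got;

	toy_read_lock(mm);
	got = toy_get_page(mm, allow_retry ? &locked : NULL);
	if (locked)
		toy_read_unlock(mm);
	return got;
}
```

In this model, calling toy_get_map_page() with allow_retry set to false on
a missing page returns 0, which corresponds to the lost irq described in
the patch. With allow_retry set to true, the page is obtained and the lock
count ends up balanced whether or not the callee retook the lock.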
Re: [PATCH] KVM: s390: Fix access to unavailable adapter indicator pages during postcopy
Posted by Christian Borntraeger 4 weeks, 1 day ago
CC Douglas, since Doug is looking into kvm_arch_set_irq_inatomic and this might have implications.


On 21.08.25 at 17:23, Thomas Huth wrote:
Re: [PATCH] KVM: s390: Fix access to unavailable adapter indicator pages during postcopy
Posted by Janosch Frank 1 month, 1 week ago
On 8/21/25 5:23 PM, Thomas Huth wrote:
> From: Thomas Huth <thuth@redhat.com>
> 
> When you run a KVM guest with vhost-net and migrate that guest to
> another host, and you immediately enable postcopy after starting the
> migration, there is a big chance that the network connection of the
> guest won't work anymore on the destination side after the migration.

Do we want to add this?

Fixes: f65470661f36 ("KVM: s390/interrupt: do not pin adapter interrupt pages")
Re: [PATCH] KVM: s390: Fix access to unavailable adapter indicator pages during postcopy
Posted by Thomas Huth 1 month, 1 week ago
On 26/08/2025 13.43, Janosch Frank wrote:
> On 8/21/25 5:23 PM, Thomas Huth wrote:
>> From: Thomas Huth <thuth@redhat.com>
>>
>> When you run a KVM guest with vhost-net and migrate that guest to
>> another host, and you immediately enable postcopy after starting the
>> migration, there is a big chance that the network connection of the
>> guest won't work anymore on the destination side after the migration.
> 
> Do we want to add this?
> 
> Fixes: f65470661f36 ("KVM: s390/interrupt: do not pin adapter interrupt pages")

Yes, that sounds like a good idea, please add it when picking up the patch!

  Thanks,
   Thomas
Re: [PATCH] KVM: s390: Fix access to unavailable adapter indicator pages during postcopy
Posted by Claudio Imbrenda 1 month, 1 week ago
On Thu, 21 Aug 2025 17:23:09 +0200
Thomas Huth <thuth@redhat.com> wrote:

> From: Thomas Huth <thuth@redhat.com>
> 
> When you run a KVM guest with vhost-net and migrate that guest to
> another host, and you immediately enable postcopy after starting the
> migration, there is a big chance that the network connection of the
> guest won't work anymore on the destination side after the migration.
> 
> With a debug kernel v6.16.0, there is also a call trace that looks
> like this:
> 
>  FAULT_FLAG_ALLOW_RETRY missing 881
>  CPU: 6 UID: 0 PID: 549 Comm: kworker/6:2 Kdump: loaded Not tainted 6.16.0 #56 NONE
>  Hardware name: IBM 3931 LA1 400 (LPAR)
>  Workqueue: events irqfd_inject [kvm]
>  Call Trace:
>   [<00003173cbecc634>] dump_stack_lvl+0x104/0x168
>   [<00003173cca69588>] handle_userfault+0xde8/0x1310
>   [<00003173cc756f0c>] handle_pte_fault+0x4fc/0x760
>   [<00003173cc759212>] __handle_mm_fault+0x452/0xa00
>   [<00003173cc7599ba>] handle_mm_fault+0x1fa/0x6a0
>   [<00003173cc73409a>] __get_user_pages+0x4aa/0xba0
>   [<00003173cc7349e8>] get_user_pages_remote+0x258/0x770
>   [<000031734be6f052>] get_map_page+0xe2/0x190 [kvm]
>   [<000031734be6f910>] adapter_indicators_set+0x50/0x4a0 [kvm]
>   [<000031734be7f674>] set_adapter_int+0xc4/0x170 [kvm]
>   [<000031734be2f268>] kvm_set_irq+0x228/0x3f0 [kvm]
>   [<000031734be27000>] irqfd_inject+0xd0/0x150 [kvm]
>   [<00003173cc00c9ec>] process_one_work+0x87c/0x1490
>   [<00003173cc00dda6>] worker_thread+0x7a6/0x1010
>   [<00003173cc02dc36>] kthread+0x3b6/0x710
>   [<00003173cbed2f0c>] __ret_from_fork+0xdc/0x7f0
>   [<00003173cdd737ca>] ret_from_fork+0xa/0x30
>  3 locks held by kworker/6:2/549:
>   #0: 00000000800bc958 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x7ee/0x1490
>   #1: 000030f3d527fbd0 ((work_completion)(&irqfd->inject)){+.+.}-{0:0}, at: process_one_work+0x81c/0x1490
>   #2: 00000000f99862b0 (&mm->mmap_lock){++++}-{3:3}, at: get_map_page+0xa8/0x190 [kvm]
> 
> The "FAULT_FLAG_ALLOW_RETRY missing" message indicates that
> handle_userfault() saw a page fault request without the ALLOW_RETRY flag
> set, hence userfaultfd cannot remotely resolve it (because the caller was
> asking for an immediate resolution, aka, FAULT_FLAG_NOWAIT, while remote
> faults can take time). With that, get_map_page() failed and the irq was
> lost.
> 
> We should not be strictly in an atomic environment here and the worker
> should be sleepable (the call is done during an ioctl from userspace),
> so we can allow adapter_indicators_set() to just sleep waiting for the
> remote fault instead.
> 
> Link: https://issues.redhat.com/browse/RHEL-42486
> Signed-off-by: Peter Xu <peterx@redhat.com>
> [thuth: Assembled patch description and fixed some cosmetic issues]
> Signed-off-by: Thomas Huth <thuth@redhat.com>

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>

Re: [PATCH] KVM: s390: Fix access to unavailable adapter indicator pages during postcopy
Posted by Janosch Frank 1 month, 1 week ago
On 8/21/25 5:23 PM, Thomas Huth wrote:
> From: Thomas Huth <thuth@redhat.com>
> 
> When you run a KVM guest with vhost-net and migrate that guest to
> another host, and you immediately enable postcopy after starting the
> migration, there is a big chance that the network connection of the
> guest won't work anymore on the destination side after the migration.
> 
> With a debug kernel v6.16.0, there is also a call trace that looks
> like this:
> 
>   FAULT_FLAG_ALLOW_RETRY missing 881
>   CPU: 6 UID: 0 PID: 549 Comm: kworker/6:2 Kdump: loaded Not tainted 6.16.0 #56 NONE
>   Hardware name: IBM 3931 LA1 400 (LPAR)
>   Workqueue: events irqfd_inject [kvm]
>   Call Trace:
>    [<00003173cbecc634>] dump_stack_lvl+0x104/0x168
>    [<00003173cca69588>] handle_userfault+0xde8/0x1310
>    [<00003173cc756f0c>] handle_pte_fault+0x4fc/0x760
>    [<00003173cc759212>] __handle_mm_fault+0x452/0xa00
>    [<00003173cc7599ba>] handle_mm_fault+0x1fa/0x6a0
>    [<00003173cc73409a>] __get_user_pages+0x4aa/0xba0
>    [<00003173cc7349e8>] get_user_pages_remote+0x258/0x770
>    [<000031734be6f052>] get_map_page+0xe2/0x190 [kvm]
>    [<000031734be6f910>] adapter_indicators_set+0x50/0x4a0 [kvm]
>    [<000031734be7f674>] set_adapter_int+0xc4/0x170 [kvm]
>    [<000031734be2f268>] kvm_set_irq+0x228/0x3f0 [kvm]
>    [<000031734be27000>] irqfd_inject+0xd0/0x150 [kvm]
>    [<00003173cc00c9ec>] process_one_work+0x87c/0x1490
>    [<00003173cc00dda6>] worker_thread+0x7a6/0x1010
>    [<00003173cc02dc36>] kthread+0x3b6/0x710
>    [<00003173cbed2f0c>] __ret_from_fork+0xdc/0x7f0
>    [<00003173cdd737ca>] ret_from_fork+0xa/0x30
>   3 locks held by kworker/6:2/549:
>    #0: 00000000800bc958 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x7ee/0x1490
>    #1: 000030f3d527fbd0 ((work_completion)(&irqfd->inject)){+.+.}-{0:0}, at: process_one_work+0x81c/0x1490
>    #2: 00000000f99862b0 (&mm->mmap_lock){++++}-{3:3}, at: get_map_page+0xa8/0x190 [kvm]
> 
> The "FAULT_FLAG_ALLOW_RETRY missing" message indicates that
> handle_userfault() saw a page fault request without the ALLOW_RETRY flag
> set, hence userfaultfd cannot remotely resolve it (because the caller was
> asking for an immediate resolution, aka, FAULT_FLAG_NOWAIT, while remote
> faults can take time). With that, get_map_page() failed and the irq was
> lost.
> 
> We should not be strictly in an atomic environment here and the worker
> should be sleepable (the call is done during an ioctl from userspace),
> so we can allow adapter_indicators_set() to just sleep waiting for the
> remote fault instead.
> 
> Link: https://issues.redhat.com/browse/RHEL-42486
> Signed-off-by: Peter Xu <peterx@redhat.com>
> [thuth: Assembled patch description and fixed some cosmetic issues]
> Signed-off-by: Thomas Huth <thuth@redhat.com>

Acked-by: Janosch Frank <frankja@linux.ibm.com>