[PATCH 0/3] Detect SRCU related deadlocks

Boqun Feng posted 3 patches 2 years, 8 months ago
include/linux/lockdep.h  |  5 +++
include/linux/srcu.h     | 23 +++++++++++--
kernel/locking/lockdep.c | 34 +++++++++++++++++++
kernel/rcu/srcutiny.c    |  2 ++
kernel/rcu/srcutree.c    |  2 ++
lib/locking-selftest.c   | 71 ++++++++++++++++++++++++++++++++++++++++
6 files changed, 135 insertions(+), 2 deletions(-)
[PATCH 0/3] Detect SRCU related deadlocks
Posted by Boqun Feng 2 years, 8 months ago
This is actually a leftover of the recursive read deadlock detection
patchset:

	https://lore.kernel.org/lkml/20180411135647.21496-1-boqun.feng@gmail.com/

I resolve comments then and add more test cases, and hopefully this can
fulfill the request from KVM:

	https://lore.kernel.org/lkml/a14a13a690277d4cc95a4b26aa2d9a4d9b392a74.camel@infradead.org/

;-)

The patch #3 is now WIP for two reasons:

*	It may conflicts with Paul's patchset on removing CONFIG_SRCU

*	I haven't found a proper way to "reinit" srcu_struct when
	lockdep selftest runs: cleanup_srcu_struct() needs workqueue
	however the tests can run before there is one.

Anyway, these selftests prove the detection actually works. And as
always, feedbacks and comments are welcome!

Regards,
Boqun

Boqun Feng (3):
  locking/lockdep: Introduce lock_sync()
  rcu: Equip sleepable RCU with lockdep dependency graph checks
  WIP: locking/lockdep: selftests: Add selftests for SRCU

 include/linux/lockdep.h  |  5 +++
 include/linux/srcu.h     | 23 +++++++++++--
 kernel/locking/lockdep.c | 34 +++++++++++++++++++
 kernel/rcu/srcutiny.c    |  2 ++
 kernel/rcu/srcutree.c    |  2 ++
 lib/locking-selftest.c   | 71 ++++++++++++++++++++++++++++++++++++++++
 6 files changed, 135 insertions(+), 2 deletions(-)

-- 
2.38.1
[PATCH 0/3] KVM: Make use of SRCU deadlock detection support
Posted by David Woodhouse 2 years, 8 months ago
On Thu, 2023-01-12 at 22:59 -0800, Boqun Feng wrote:
> This is actually a leftover of the recursive read deadlock detection
> patchset:
> 
>         https://lore.kernel.org/lkml/20180411135647.21496-1-boqun.feng@gmail.com/
> 
> I resolve comments then and add more test cases, and hopefully this can
> fulfill the request from KVM:
> 
>         https://lore.kernel.org/lkml/a14a13a690277d4cc95a4b26aa2d9a4d9b392a74.camel@infradead.org/
> 
> ;-)

It definitely seems to work; thank you! I can revert some of the recent 
fixes from the KVM tree, apply your patches, and then I can trigger the 
lockdep warnings. To make it reliably trigger, we need to artificially
call synchronize_srcu(&kvm->srcu) under kvm->lock on KVM init, because
the circumstances under which that happens are a bit esoteric and don't
always happen otherwise, so lockdep wouldn't notice.

> The patch #3 is now WIP for two reasons:
> 
> *       It may conflicts with Paul's patchset on removing CONFIG_SRCU
> 
> *       I haven't found a proper way to "reinit" srcu_struct when
>         lockdep selftest runs: cleanup_srcu_struct() needs workqueue
>         however the tests can run before there is one.

Understood. I think the KVM series which follows can stand alone and go 
via the KVM tree separately. As and when your series gets merged, it'll 
serve to protect against regressions.

Thanks again!

David Woodhouse (3):
      KVM: Show lockdep the kvm->mutex vs. kvm->srcu ordering rule
      KVM: selftests: Use enum for test numbers in xen_shinfo_test
      KVM: selftests: Add EVTCHNOP_send slow path test to xen_shinfo_test

 .../testing/selftests/kvm/x86_64/xen_shinfo_test.c | 165 ++++++++++++++-------
 virt/kvm/kvm_main.c                                |  10 ++
 2 files changed, 124 insertions(+), 51 deletions(-)
Re: [PATCH 0/3] KVM: Make use of SRCU deadlock detection support
Posted by Sean Christopherson 2 years, 7 months ago
On Fri, Jan 13, 2023, David Woodhouse wrote:
> David Woodhouse (3):
>       KVM: Show lockdep the kvm->mutex vs. kvm->srcu ordering rule
>       KVM: selftests: Use enum for test numbers in xen_shinfo_test
>       KVM: selftests: Add EVTCHNOP_send slow path test to xen_shinfo_test
> 
>  .../testing/selftests/kvm/x86_64/xen_shinfo_test.c | 165 ++++++++++++++-------
>  virt/kvm/kvm_main.c                                |  10 ++
>  2 files changed, 124 insertions(+), 51 deletions(-)

As mentioned in patch three, I'm going to repost the selftests changes on top
of other cleanups, and will plan on applying them next week if all goes well.

Paolo, do you want to grab the KVM change directly?
[PATCH 1/3] KVM: Show lockdep the kvm->mutex vs. kvm->srcu ordering rule
Posted by David Woodhouse 2 years, 8 months ago
From: David Woodhouse <dwmw@amazon.co.uk>

Lockdep is learning to spot deadlocks with sleepable RCU vs. mutexes,
which can occur where one code path calls synchronize_scru() with a
mutex held, while another code path attempts to obtain the same mutex
while in a read-side section.

Since lockdep isn't very good at reading the English prose in
Documentation/virt/kvm/locking.rst, give it a demonstration by calling
synchronize_scru(&kvm->srcu) while holding kvm->lock in kvm_create_vm().
The cases where this happens naturally are relatively esoteric and may
not happen otherwise.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 virt/kvm/kvm_main.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 13e88297f999..285b3c5a6364 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1173,6 +1173,16 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
 	if (init_srcu_struct(&kvm->irq_srcu))
 		goto out_err_no_irq_srcu;
 
+#ifdef CONFIG_LOCKDEP
+	/*
+	 * Ensure lockdep knows that it's not permitted to lock kvm->lock
+	 * from a SRCU read section on kvm->srcu.
+	 */
+	mutex_lock(&kvm->lock);
+	synchronize_srcu(&kvm->srcu);
+	mutex_unlock(&kvm->lock);
+#endif
+
 	refcount_set(&kvm->users_count, 1);
 	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
 		for (j = 0; j < 2; j++) {
-- 
2.35.3
[PATCH 2/3] KVM: selftests: Use enum for test numbers in xen_shinfo_test
Posted by David Woodhouse 2 years, 8 months ago
From: David Woodhouse <dwmw@amazon.co.uk>

The xen_shinfo_test started off with very few iterations, and the numbers
we used in GUEST_SYNC() were precisely mapped to the RUNSTATE_xxx values
anyway to start with.

It has since grown quite a few more tests, and it's kind of awful to be
handling them all as bare numbers. Especially when I want to add a new
test in the middle. Define an enum for the test stages, and use it both
in the guest code and the host switch statement.

No functional change, if I can count to 24.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 .../selftests/kvm/x86_64/xen_shinfo_test.c    | 134 +++++++++++-------
 1 file changed, 83 insertions(+), 51 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
index 721f6a693799..644d614a9965 100644
--- a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
+++ b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
@@ -44,6 +44,36 @@
 #define EVTCHN_TEST2 66
 #define EVTCHN_TIMER 13
 
+enum {
+	TEST_INJECT_VECTOR = 0,
+	TEST_RUNSTATE_runnable,
+	TEST_RUNSTATE_blocked,
+	TEST_RUNSTATE_offline,
+	TEST_RUNSTATE_ADJUST,
+	TEST_RUNSTATE_DATA,
+	TEST_STEAL_TIME,
+	TEST_EVTCHN_MASKED,
+	TEST_EVTCHN_UNMASKED,
+	TEST_EVTCHN_SLOWPATH,
+	TEST_EVTCHN_SEND_IOCTL,
+	TEST_EVTCHN_HCALL,
+	TEST_EVTCHN_HCALL_EVENTFD,
+	TEST_TIMER_SETUP,
+	TEST_TIMER_WAIT,
+	TEST_TIMER_RESTORE,
+	TEST_POLL_READY,
+	TEST_POLL_TIMEOUT,
+	TEST_POLL_MASKED,
+	TEST_POLL_WAKE,
+	TEST_TIMER_PAST,
+	TEST_LOCKING_SEND_RACE,
+	TEST_LOCKING_POLL_RACE,
+	TEST_LOCKING_POLL_TIMEOUT,
+	TEST_DONE,
+
+	TEST_GUEST_SAW_IRQ,
+};
+
 #define XEN_HYPERCALL_MSR	0x40000000
 
 #define MIN_STEAL_TIME		50000
@@ -147,7 +177,7 @@ static void evtchn_handler(struct ex_regs *regs)
 	vi->evtchn_pending_sel = 0;
 	guest_saw_irq = true;
 
-	GUEST_SYNC(0x20);
+	GUEST_SYNC(TEST_GUEST_SAW_IRQ);
 }
 
 static void guest_wait_for_irq(void)
@@ -168,41 +198,41 @@ static void guest_code(void)
 	);
 
 	/* Trigger an interrupt injection */
-	GUEST_SYNC(0);
+	GUEST_SYNC(TEST_INJECT_VECTOR);
 
 	guest_wait_for_irq();
 
 	/* Test having the host set runstates manually */
-	GUEST_SYNC(RUNSTATE_runnable);
+	GUEST_SYNC(TEST_RUNSTATE_runnable);
 	GUEST_ASSERT(rs->time[RUNSTATE_runnable] != 0);
 	GUEST_ASSERT(rs->state == 0);
 
-	GUEST_SYNC(RUNSTATE_blocked);
+	GUEST_SYNC(TEST_RUNSTATE_blocked);
 	GUEST_ASSERT(rs->time[RUNSTATE_blocked] != 0);
 	GUEST_ASSERT(rs->state == 0);
 
-	GUEST_SYNC(RUNSTATE_offline);
+	GUEST_SYNC(TEST_RUNSTATE_offline);
 	GUEST_ASSERT(rs->time[RUNSTATE_offline] != 0);
 	GUEST_ASSERT(rs->state == 0);
 
 	/* Test runstate time adjust */
-	GUEST_SYNC(4);
+	GUEST_SYNC(TEST_RUNSTATE_ADJUST);
 	GUEST_ASSERT(rs->time[RUNSTATE_blocked] == 0x5a);
 	GUEST_ASSERT(rs->time[RUNSTATE_offline] == 0x6b6b);
 
 	/* Test runstate time set */
-	GUEST_SYNC(5);
+	GUEST_SYNC(TEST_RUNSTATE_DATA);
 	GUEST_ASSERT(rs->state_entry_time >= 0x8000);
 	GUEST_ASSERT(rs->time[RUNSTATE_runnable] == 0);
 	GUEST_ASSERT(rs->time[RUNSTATE_blocked] == 0x6b6b);
 	GUEST_ASSERT(rs->time[RUNSTATE_offline] == 0x5a);
 
 	/* sched_yield() should result in some 'runnable' time */
-	GUEST_SYNC(6);
+	GUEST_SYNC(TEST_STEAL_TIME);
 	GUEST_ASSERT(rs->time[RUNSTATE_runnable] >= MIN_STEAL_TIME);
 
 	/* Attempt to deliver a *masked* interrupt */
-	GUEST_SYNC(7);
+	GUEST_SYNC(TEST_EVTCHN_MASKED);
 
 	/* Wait until we see the bit set */
 	struct shared_info *si = (void *)SHINFO_VADDR;
@@ -210,21 +240,21 @@ static void guest_code(void)
 		__asm__ __volatile__ ("rep nop" : : : "memory");
 
 	/* Now deliver an *unmasked* interrupt */
-	GUEST_SYNC(8);
+	GUEST_SYNC(TEST_EVTCHN_UNMASKED);
 
 	guest_wait_for_irq();
 
 	/* Change memslots and deliver an interrupt */
-	GUEST_SYNC(9);
+	GUEST_SYNC(TEST_EVTCHN_SLOWPATH);
 
 	guest_wait_for_irq();
 
 	/* Deliver event channel with KVM_XEN_HVM_EVTCHN_SEND */
-	GUEST_SYNC(10);
+	GUEST_SYNC(TEST_EVTCHN_SEND_IOCTL);
 
 	guest_wait_for_irq();
 
-	GUEST_SYNC(11);
+	GUEST_SYNC(TEST_EVTCHN_HCALL);
 
 	/* Our turn. Deliver event channel (to ourselves) with
 	 * EVTCHNOP_send hypercall. */
@@ -240,7 +270,7 @@ static void guest_code(void)
 
 	guest_wait_for_irq();
 
-	GUEST_SYNC(12);
+	GUEST_SYNC(TEST_EVTCHN_HCALL_EVENTFD);
 
 	/* Deliver "outbound" event channel to an eventfd which
 	 * happens to be one of our own irqfds. */
@@ -255,7 +285,7 @@ static void guest_code(void)
 
 	guest_wait_for_irq();
 
-	GUEST_SYNC(13);
+	GUEST_SYNC(TEST_TIMER_SETUP);
 
 	/* Set a timer 100ms in the future. */
 	__asm__ __volatile__ ("vmcall" :
@@ -264,17 +294,17 @@ static void guest_code(void)
 			      "D" (rs->state_entry_time + 100000000));
 	GUEST_ASSERT(rax == 0);
 
-	GUEST_SYNC(14);
+	GUEST_SYNC(TEST_TIMER_WAIT);
 
 	/* Now wait for the timer */
 	guest_wait_for_irq();
 
-	GUEST_SYNC(15);
+	GUEST_SYNC(TEST_TIMER_RESTORE);
 
 	/* The host has 'restored' the timer. Just wait for it. */
 	guest_wait_for_irq();
 
-	GUEST_SYNC(16);
+	GUEST_SYNC(TEST_POLL_READY);
 
 	/* Poll for an event channel port which is already set */
 	u32 ports[1] = { EVTCHN_TIMER };
@@ -292,7 +322,7 @@ static void guest_code(void)
 
 	GUEST_ASSERT(rax == 0);
 
-	GUEST_SYNC(17);
+	GUEST_SYNC(TEST_POLL_TIMEOUT);
 
 	/* Poll for an unset port and wait for the timeout. */
 	p.timeout = 100000000;
@@ -304,7 +334,7 @@ static void guest_code(void)
 
 	GUEST_ASSERT(rax == 0);
 
-	GUEST_SYNC(18);
+	GUEST_SYNC(TEST_POLL_MASKED);
 
 	/* A timer will wake the masked port we're waiting on, while we poll */
 	p.timeout = 0;
@@ -316,7 +346,7 @@ static void guest_code(void)
 
 	GUEST_ASSERT(rax == 0);
 
-	GUEST_SYNC(19);
+	GUEST_SYNC(TEST_POLL_WAKE);
 
 	/* A timer wake an *unmasked* port which should wake us with an
 	 * actual interrupt, while we're polling on a different port. */
@@ -332,17 +362,17 @@ static void guest_code(void)
 
 	guest_wait_for_irq();
 
-	GUEST_SYNC(20);
+	GUEST_SYNC(TEST_TIMER_PAST);
 
 	/* Timer should have fired already */
 	guest_wait_for_irq();
 
-	GUEST_SYNC(21);
+	GUEST_SYNC(TEST_LOCKING_SEND_RACE);
 	/* Racing host ioctls */
 
 	guest_wait_for_irq();
 
-	GUEST_SYNC(22);
+	GUEST_SYNC(TEST_LOCKING_POLL_RACE);
 	/* Racing vmcall against host ioctl */
 
 	ports[0] = 0;
@@ -375,12 +405,12 @@ static void guest_code(void)
 	 * expiring while the event channel was invalid.
 	 */
 	if (!guest_saw_irq) {
-		GUEST_SYNC(23);
+		GUEST_SYNC(TEST_LOCKING_POLL_TIMEOUT);
 		goto wait_for_timer;
 	}
 	guest_saw_irq = false;
 
-	GUEST_SYNC(24);
+	GUEST_SYNC(TEST_DONE);
 }
 
 static int cmp_timespec(struct timespec *a, struct timespec *b)
@@ -439,6 +469,7 @@ int main(int argc, char *argv[])
 	bool verbose;
 	int ret;
 
+	printf("TEST_DONE is %d\n", TEST_DONE);
 	verbose = argc > 1 && (!strncmp(argv[1], "-v", 3) ||
 			       !strncmp(argv[1], "--verbose", 10));
 
@@ -649,25 +680,26 @@ int main(int argc, char *argv[])
 					    "runstate times don't add up");
 
 			switch (uc.args[1]) {
-			case 0:
+			case TEST_INJECT_VECTOR:
 				if (verbose)
 					printf("Delivering evtchn upcall\n");
 				evtchn_irq_expected = true;
 				vinfo->evtchn_upcall_pending = 1;
 				break;
 
-			case RUNSTATE_runnable...RUNSTATE_offline:
+			case TEST_RUNSTATE_runnable...TEST_RUNSTATE_offline:
 				TEST_ASSERT(!evtchn_irq_expected, "Event channel IRQ not seen");
 				if (!do_runstate_tests)
 					goto done;
 				if (verbose)
 					printf("Testing runstate %s\n", runstate_names[uc.args[1]]);
 				rst.type = KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT;
-				rst.u.runstate.state = uc.args[1];
+				rst.u.runstate.state = uc.args[1] + RUNSTATE_runnable -
+					TEST_RUNSTATE_runnable;
 				vcpu_ioctl(vcpu, KVM_XEN_VCPU_SET_ATTR, &rst);
 				break;
 
-			case 4:
+			case TEST_RUNSTATE_ADJUST:
 				if (verbose)
 					printf("Testing RUNSTATE_ADJUST\n");
 				rst.type = KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST;
@@ -682,7 +714,7 @@ int main(int argc, char *argv[])
 				vcpu_ioctl(vcpu, KVM_XEN_VCPU_SET_ATTR, &rst);
 				break;
 
-			case 5:
+			case TEST_RUNSTATE_DATA:
 				if (verbose)
 					printf("Testing RUNSTATE_DATA\n");
 				rst.type = KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_DATA;
@@ -694,7 +726,7 @@ int main(int argc, char *argv[])
 				vcpu_ioctl(vcpu, KVM_XEN_VCPU_SET_ATTR, &rst);
 				break;
 
-			case 6:
+			case TEST_STEAL_TIME:
 				if (verbose)
 					printf("Testing steal time\n");
 				/* Yield until scheduler delay exceeds target */
@@ -704,7 +736,7 @@ int main(int argc, char *argv[])
 				} while (get_run_delay() < rundelay);
 				break;
 
-			case 7:
+			case TEST_EVTCHN_MASKED:
 				if (!do_eventfd_tests)
 					goto done;
 				if (verbose)
@@ -714,7 +746,7 @@ int main(int argc, char *argv[])
 				alarm(1);
 				break;
 
-			case 8:
+			case TEST_EVTCHN_UNMASKED:
 				if (verbose)
 					printf("Testing unmasked event channel\n");
 				/* Unmask that, but deliver the other one */
@@ -725,7 +757,7 @@ int main(int argc, char *argv[])
 				alarm(1);
 				break;
 
-			case 9:
+			case TEST_EVTCHN_SLOWPATH:
 				TEST_ASSERT(!evtchn_irq_expected,
 					    "Expected event channel IRQ but it didn't happen");
 				shinfo->evtchn_pending[1] = 0;
@@ -738,7 +770,7 @@ int main(int argc, char *argv[])
 				alarm(1);
 				break;
 
-			case 10:
+			case TEST_EVTCHN_SEND_IOCTL:
 				TEST_ASSERT(!evtchn_irq_expected,
 					    "Expected event channel IRQ but it didn't happen");
 				if (!do_evtchn_tests)
@@ -758,7 +790,7 @@ int main(int argc, char *argv[])
 				alarm(1);
 				break;
 
-			case 11:
+			case TEST_EVTCHN_HCALL:
 				TEST_ASSERT(!evtchn_irq_expected,
 					    "Expected event channel IRQ but it didn't happen");
 				shinfo->evtchn_pending[1] = 0;
@@ -769,7 +801,7 @@ int main(int argc, char *argv[])
 				alarm(1);
 				break;
 
-			case 12:
+			case TEST_EVTCHN_HCALL_EVENTFD:
 				TEST_ASSERT(!evtchn_irq_expected,
 					    "Expected event channel IRQ but it didn't happen");
 				shinfo->evtchn_pending[0] = 0;
@@ -780,7 +812,7 @@ int main(int argc, char *argv[])
 				alarm(1);
 				break;
 
-			case 13:
+			case TEST_TIMER_SETUP:
 				TEST_ASSERT(!evtchn_irq_expected,
 					    "Expected event channel IRQ but it didn't happen");
 				shinfo->evtchn_pending[1] = 0;
@@ -789,7 +821,7 @@ int main(int argc, char *argv[])
 					printf("Testing guest oneshot timer\n");
 				break;
 
-			case 14:
+			case TEST_TIMER_WAIT:
 				memset(&tmr, 0, sizeof(tmr));
 				tmr.type = KVM_XEN_VCPU_ATTR_TYPE_TIMER;
 				vcpu_ioctl(vcpu, KVM_XEN_VCPU_GET_ATTR, &tmr);
@@ -803,7 +835,7 @@ int main(int argc, char *argv[])
 				alarm(1);
 				break;
 
-			case 15:
+			case TEST_TIMER_RESTORE:
 				TEST_ASSERT(!evtchn_irq_expected,
 					    "Expected event channel IRQ but it didn't happen");
 				shinfo->evtchn_pending[0] = 0;
@@ -817,7 +849,7 @@ int main(int argc, char *argv[])
 				alarm(1);
 				break;
 
-			case 16:
+			case TEST_POLL_READY:
 				TEST_ASSERT(!evtchn_irq_expected,
 					    "Expected event channel IRQ but it didn't happen");
 
@@ -827,14 +859,14 @@ int main(int argc, char *argv[])
 				alarm(1);
 				break;
 
-			case 17:
+			case TEST_POLL_TIMEOUT:
 				if (verbose)
 					printf("Testing SCHEDOP_poll timeout\n");
 				shinfo->evtchn_pending[0] = 0;
 				alarm(1);
 				break;
 
-			case 18:
+			case TEST_POLL_MASKED:
 				if (verbose)
 					printf("Testing SCHEDOP_poll wake on masked event\n");
 
@@ -843,7 +875,7 @@ int main(int argc, char *argv[])
 				alarm(1);
 				break;
 
-			case 19:
+			case TEST_POLL_WAKE:
 				shinfo->evtchn_pending[0] = shinfo->evtchn_mask[0] = 0;
 				if (verbose)
 					printf("Testing SCHEDOP_poll wake on unmasked event\n");
@@ -860,7 +892,7 @@ int main(int argc, char *argv[])
 				alarm(1);
 				break;
 
-			case 20:
+			case TEST_TIMER_PAST:
 				TEST_ASSERT(!evtchn_irq_expected,
 					    "Expected event channel IRQ but it didn't happen");
 				/* Read timer and check it is no longer pending */
@@ -877,7 +909,7 @@ int main(int argc, char *argv[])
 				alarm(1);
 				break;
 
-			case 21:
+			case TEST_LOCKING_SEND_RACE:
 				TEST_ASSERT(!evtchn_irq_expected,
 					    "Expected event channel IRQ but it didn't happen");
 				alarm(0);
@@ -899,7 +931,7 @@ int main(int argc, char *argv[])
 					__vm_ioctl(vm, KVM_XEN_HVM_EVTCHN_SEND, &uxe);
 				break;
 
-			case 22:
+			case TEST_LOCKING_POLL_RACE:
 				TEST_ASSERT(!evtchn_irq_expected,
 					    "Expected event channel IRQ but it didn't happen");
 
@@ -914,7 +946,7 @@ int main(int argc, char *argv[])
 				vcpu_ioctl(vcpu, KVM_XEN_VCPU_SET_ATTR, &tmr);
 				break;
 
-			case 23:
+			case TEST_LOCKING_POLL_TIMEOUT:
 				/*
 				 * Optional and possibly repeated sync point.
 				 * Injecting the timer IRQ may fail if the
@@ -936,7 +968,7 @@ int main(int argc, char *argv[])
 							 SHINFO_RACE_TIMEOUT * 1000000000ULL;
 				vcpu_ioctl(vcpu, KVM_XEN_VCPU_SET_ATTR, &tmr);
 				break;
-			case 24:
+			case TEST_DONE:
 				TEST_ASSERT(!evtchn_irq_expected,
 					    "Expected event channel IRQ but it didn't happen");
 
@@ -947,7 +979,7 @@ int main(int argc, char *argv[])
 				TEST_ASSERT(ret == 0, "pthread_join() failed: %s", strerror(ret));
 				goto done;
 
-			case 0x20:
+			case TEST_GUEST_SAW_IRQ:
 				TEST_ASSERT(evtchn_irq_expected, "Unexpected event channel IRQ");
 				evtchn_irq_expected = false;
 				break;
-- 
2.35.3
Re: [PATCH 2/3] KVM: selftests: Use enum for test numbers in xen_shinfo_test
Posted by David Woodhouse 2 years, 8 months ago
On Fri, 2023-01-13 at 12:46 +0000, David Woodhouse wrote:
> @@ -439,6 +469,7 @@ int main(int argc, char *argv[])
>         bool verbose;
>         int ret;
>  
> +       printf("TEST_DONE is %d\n", TEST_DONE);
>         verbose = argc > 1 && (!strncmp(argv[1], "-v", 3) ||
>                                !strncmp(argv[1], "--verbose", 10));
>  

Dammit.
[PATCH 3/3] KVM: selftests: Add EVTCHNOP_send slow path test to xen_shinfo_test
Posted by David Woodhouse 2 years, 8 months ago
From: David Woodhouse <dwmw@amazon.co.uk>

When kvm_xen_evtchn_send() takes the slow path because the shinfo GPC
needs to be revalidated, it used to violate the SRCU vs. kvm->lock
locking rules and potentially cause a deadlock.

Now that lockdep is learning to catch such things, make sure that code
path is exercised by the selftest.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 .../selftests/kvm/x86_64/xen_shinfo_test.c    | 31 +++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
index 644d614a9965..3adc2e11b094 100644
--- a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
+++ b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
@@ -29,6 +29,9 @@
 #define DUMMY_REGION_GPA	(SHINFO_REGION_GPA + (3 * PAGE_SIZE))
 #define DUMMY_REGION_SLOT	11
 
+#define DUMMY_REGION_GPA_2	(SHINFO_REGION_GPA + (4 * PAGE_SIZE))
+#define DUMMY_REGION_SLOT_2	12
+
 #define SHINFO_ADDR	(SHINFO_REGION_GPA)
 #define VCPU_INFO_ADDR	(SHINFO_REGION_GPA + 0x40)
 #define PVTIME_ADDR	(SHINFO_REGION_GPA + PAGE_SIZE)
@@ -57,6 +60,7 @@ enum {
 	TEST_EVTCHN_SLOWPATH,
 	TEST_EVTCHN_SEND_IOCTL,
 	TEST_EVTCHN_HCALL,
+	TEST_EVTCHN_HCALL_SLOWPATH,
 	TEST_EVTCHN_HCALL_EVENTFD,
 	TEST_TIMER_SETUP,
 	TEST_TIMER_WAIT,
@@ -270,6 +274,20 @@ static void guest_code(void)
 
 	guest_wait_for_irq();
 
+	GUEST_SYNC(TEST_EVTCHN_HCALL_SLOWPATH);
+
+	/* Same again, but this time the host has messed with memslots
+	 * so it should take the slow path in kvm_xen_set_evtchn(). */
+	__asm__ __volatile__ ("vmcall" :
+			      "=a" (rax) :
+			      "a" (__HYPERVISOR_event_channel_op),
+			      "D" (EVTCHNOP_send),
+			      "S" (&s));
+
+	GUEST_ASSERT(rax == 0);
+
+	guest_wait_for_irq();
+
 	GUEST_SYNC(TEST_EVTCHN_HCALL_EVENTFD);
 
 	/* Deliver "outbound" event channel to an eventfd which
@@ -801,6 +819,19 @@ int main(int argc, char *argv[])
 				alarm(1);
 				break;
 
+			case TEST_EVTCHN_HCALL_SLOWPATH:
+				TEST_ASSERT(!evtchn_irq_expected,
+					    "Expected event channel IRQ but it didn't happen");
+				shinfo->evtchn_pending[0] = 0;
+
+				if (verbose)
+					printf("Testing guest EVTCHNOP_send direct to evtchn after memslot change\n");
+				vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS,
+							    DUMMY_REGION_GPA_2, DUMMY_REGION_SLOT_2, 1, 0);
+				evtchn_irq_expected = true;
+				alarm(1);
+				break;
+
 			case TEST_EVTCHN_HCALL_EVENTFD:
 				TEST_ASSERT(!evtchn_irq_expected,
 					    "Expected event channel IRQ but it didn't happen");
-- 
2.35.3
Re: [PATCH 3/3] KVM: selftests: Add EVTCHNOP_send slow path test to xen_shinfo_test
Posted by Sean Christopherson 2 years, 7 months ago
On Fri, Jan 13, 2023, David Woodhouse wrote:
> @@ -57,6 +60,7 @@ enum {
>  	TEST_EVTCHN_SLOWPATH,
>  	TEST_EVTCHN_SEND_IOCTL,
>  	TEST_EVTCHN_HCALL,
> +	TEST_EVTCHN_HCALL_SLOWPATH,
>  	TEST_EVTCHN_HCALL_EVENTFD,
>  	TEST_TIMER_SETUP,
>  	TEST_TIMER_WAIT,
> @@ -270,6 +274,20 @@ static void guest_code(void)
>  
>  	guest_wait_for_irq();
>  
> +	GUEST_SYNC(TEST_EVTCHN_HCALL_SLOWPATH);
> +
> +	/* Same again, but this time the host has messed with memslots
> +	 * so it should take the slow path in kvm_xen_set_evtchn(). */

	/*
	 * https://lore.kernel.org/all/CA+55aFyQYJerovMsSoSKS7PessZBr4vNp-3QUUwhqk4A4_jcbg@mail.gmail.com
	 */

> +	__asm__ __volatile__ ("vmcall" :
> +			      "=a" (rax) :
> +			      "a" (__HYPERVISOR_event_channel_op),
> +			      "D" (EVTCHNOP_send),
> +			      "S" (&s));
> +
> +	GUEST_ASSERT(rax == 0);

There's a lot of copy+paste in this file, and we really should do VMMCALL when
running on AMD.  That's easy to do with some changes that are in the queue for
6.3.  I'll repost these selftest patches on top of a few patches to add helpers for
doing hypercalls using the Xen ABI.