From: Maxim Levitsky
To: kvm@vger.kernel.org
Cc: Suzuki K Poulose, Jing Zhang, "H. Peter Anvin", Sebastian Ott, Shusen Li,
    Waiman Long, Thomas Gleixner, linux-arm-kernel@lists.infradead.org,
    Bjorn Helgaas, Borislav Petkov, Anup Patel, Will Deacon, Palmer Dabbelt,
    Alexander Potapenko, kvmarm@lists.linux.dev, Keisuke Nishimura,
    Zenghui Yu, Peter Zijlstra, Atish Patra, Joey Gouly, x86@kernel.org,
    Marc Zyngier, Sean Christopherson, Andre Przywara, Kunkun Jiang,
    linux-riscv@lists.infradead.org, Randy Dunlap, Paolo Bonzini, Boqun Feng,
    Catalin Marinas, Alexandre Ghiti, linux-kernel@vger.kernel.org,
    Dave Hansen, Oliver Upton, kvm-riscv@lists.infradead.org, Maxim Levitsky,
    Ingo Molnar, Paul Walmsley, Albert Ou
Peter Anvin" , Sebastian Ott , Shusen Li , Waiman Long , Thomas Gleixner , linux-arm-kernel@lists.infradead.org, Bjorn Helgaas , Borislav Petkov , Anup Patel , Will Deacon , Palmer Dabbelt , Alexander Potapenko , kvmarm@lists.linux.dev, Keisuke Nishimura , Zenghui Yu , Peter Zijlstra , Atish Patra , Joey Gouly , x86@kernel.org, Marc Zyngier , Sean Christopherson , Andre Przywara , Kunkun Jiang , linux-riscv@lists.infradead.org, Randy Dunlap , Paolo Bonzini , Boqun Feng , Catalin Marinas , Alexandre Ghiti , linux-kernel@vger.kernel.org, Dave Hansen , Oliver Upton , kvm-riscv@lists.infradead.org, Maxim Levitsky , Ingo Molnar , Paul Walmsley , Albert Ou Subject: [PATCH v5 1/6] locking/mutex: implement mutex_trylock_nested Date: Mon, 12 May 2025 14:04:02 -0400 Message-ID: <20250512180407.659015-2-mlevitsk@redhat.com> In-Reply-To: <20250512180407.659015-1-mlevitsk@redhat.com> References: <20250512180407.659015-1-mlevitsk@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4 Content-Type: text/plain; charset="utf-8" Despite the fact that several lockdep-related checks are skipped when calling trylock* versions of the locking primitives, for example mutex_trylock, each time the mutex is acquired, a held_lock is still placed onto the lockdep stack by __lock_acquire() which is called regardless of whether the trylock* or regular locking API was used. This means that if the caller successfully acquires more than MAX_LOCK_DEPTH locks of the same class, even when using mutex_trylock, lockdep will still complain that the maximum depth of the held lock stack has been reached and disable itself. For example, the following error currently occurs in the ARM version of KVM, once the code tries to lock all vCPUs of a VM configured with more than MAX_LOCK_DEPTH vCPUs, a situation that can easily happen on modern systems, where having more than 48 CPUs is common, and it's also common to run VMs that have vCPU counts approaching that number: [ 328.171264] BUG: MAX_LOCK_DEPTH too low! [ 328.175227] turning off the locking correctness validator. [ 328.180726] Please attach the output of /proc/lock_stat to the bug report [ 328.187531] depth: 48 max: 48! [ 328.190678] 48 locks held by qemu-kvm/11664: [ 328.194957] #0: ffff800086de5ba0 (&kvm->lock){+.+.}-{3:3}, at: kvm_ioct= l_create_device+0x174/0x5b0 [ 328.204048] #1: ffff0800e78800b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_a= ll_vcpus+0x16c/0x2a0 [ 328.212521] #2: ffff07ffeee51e98 (&vcpu->mutex){+.+.}-{3:3}, at: lock_a= ll_vcpus+0x16c/0x2a0 [ 328.220991] #3: ffff0800dc7d80b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_a= ll_vcpus+0x16c/0x2a0 [ 328.229463] #4: ffff07ffe0c980b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_a= ll_vcpus+0x16c/0x2a0 [ 328.237934] #5: ffff0800a3883c78 (&vcpu->mutex){+.+.}-{3:3}, at: lock_a= ll_vcpus+0x16c/0x2a0 [ 328.246405] #6: ffff07fffbe480b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_a= ll_vcpus+0x16c/0x2a0 Luckily, in all instances that require locking all vCPUs, the 'kvm->lock' is taken a priori, and that fact makes it possible to use the little known feature of lockdep, called a 'nest_lock', to avoid this warning and subsequent lockdep self-disablement. 
When a 'nest_lock' is passed to lockdep's lock_acquire(), lockdep detects
that the top of the held-lock stack already contains a lock of the same
class and increments that entry's reference counter instead of pushing a
new held_lock item onto the stack. See __lock_acquire() for more
information.

Signed-off-by: Maxim Levitsky
Acked-by: Peter Zijlstra (Intel)
---
 include/linux/mutex.h  | 15 +++++++++++++++
 kernel/locking/mutex.c | 14 +++++++++++---
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 2143d05116be..da4518cfd59c 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -193,7 +193,22 @@ extern void mutex_lock_io(struct mutex *lock);
  *
  * Returns 1 if the mutex has been acquired successfully, and 0 on contention.
  */
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+extern int _mutex_trylock_nest_lock(struct mutex *lock, struct lockdep_map *nest_lock);
+
+#define mutex_trylock_nest_lock(lock, nest_lock)		\
+(								\
+	typecheck(struct lockdep_map *, &(nest_lock)->dep_map),	\
+	_mutex_trylock_nest_lock(lock, &(nest_lock)->dep_map)	\
+)
+
+#define mutex_trylock(lock)	_mutex_trylock_nest_lock(lock, NULL)
+#else
 extern int mutex_trylock(struct mutex *lock);
+#define mutex_trylock_nest_lock(lock, nest_lock)	mutex_trylock(lock)
+#endif
+
 extern void mutex_unlock(struct mutex *lock);
 
 extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 555e2b3a665a..c75a838d3bae 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -1062,6 +1062,7 @@ __ww_mutex_lock_interruptible_slowpath(struct ww_mutex *lock,
 
 #endif
 
+#ifndef CONFIG_DEBUG_LOCK_ALLOC
 /**
  * mutex_trylock - try to acquire the mutex, without waiting
  * @lock: the mutex to be acquired
@@ -1077,18 +1078,25 @@ __ww_mutex_lock_interruptible_slowpath(struct ww_mutex *lock,
  * mutex must be released by the same task that acquired it.
  */
 int __sched mutex_trylock(struct mutex *lock)
+{
+	MUTEX_WARN_ON(lock->magic != lock);
+	return __mutex_trylock(lock);
+}
+EXPORT_SYMBOL(mutex_trylock);
+#else
+int __sched _mutex_trylock_nest_lock(struct mutex *lock, struct lockdep_map *nest_lock)
 {
 	bool locked;
 
 	MUTEX_WARN_ON(lock->magic != lock);
-
 	locked = __mutex_trylock(lock);
 	if (locked)
-		mutex_acquire(&lock->dep_map, 0, 1, _RET_IP_);
+		mutex_acquire_nest(&lock->dep_map, 0, 1, nest_lock, _RET_IP_);
 
 	return locked;
 }
-EXPORT_SYMBOL(mutex_trylock);
+EXPORT_SYMBOL(_mutex_trylock_nest_lock);
+#endif
 
 #ifndef CONFIG_DEBUG_LOCK_ALLOC
 int __sched
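
For reference, a minimal caller sketch (hypothetical, not part of the patch)
of the intended pattern: an outer mutex that is already held is passed as the
nest_lock, so repeated trylock acquisitions of the same class share a single
held_lock entry instead of growing the lockdep stack. The names "outer",
"inner" and try_lock_all() are illustrative only.

#include <linux/errno.h>
#include <linux/mutex.h>

/*
 * Hypothetical caller: try-lock an unbounded number of same-class mutexes
 * while "outer" is already held, without exhausting MAX_LOCK_DEPTH.
 */
static int try_lock_all(struct mutex *outer, struct mutex *inner, int count)
{
	int i;

	lockdep_assert_held(outer);

	for (i = 0; i < count; i++) {
		/*
		 * With a nest_lock, acquisitions of the same class after the
		 * first one only bump the existing held_lock's reference.
		 */
		if (!mutex_trylock_nest_lock(&inner[i], outer)) {
			while (i--)
				mutex_unlock(&inner[i]);
			return -EBUSY;
		}
	}
	return 0;
}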
-- 
2.46.0

From: Maxim Levitsky
To: kvm@vger.kernel.org
Subject: [PATCH v5 2/6] locking/mutex: implement mutex_lock_killable_nest_lock
Date: Mon, 12 May 2025 14:04:03 -0400
Message-ID: <20250512180407.659015-3-mlevitsk@redhat.com>
In-Reply-To: <20250512180407.659015-1-mlevitsk@redhat.com>
References: <20250512180407.659015-1-mlevitsk@redhat.com>

KVM's SEV intra-host migration code needs to lock all vCPUs of both the
source and the target VM before it proceeds with the migration. The number
of vCPUs that belong to each VM is not bounded by anything except a
self-imposed KVM limit of CONFIG_KVM_MAX_NR_VCPUS vCPUs, which is
significantly larger than the depth of lockdep's lock stack.

Luckily, in both cases the vCPU mutexes are taken under the 'kvm->lock' of
each VM, which means that the little-known lockdep feature called a
"nest_lock" can be used to support this use case in a cleaner way than it
is currently done.

Implement and expose 'mutex_lock_killable_nest_lock' for this purpose.
Signed-off-by: Maxim Levitsky
Acked-by: Peter Zijlstra (Intel)
---
 include/linux/mutex.h  | 17 +++++++++++++----
 kernel/locking/mutex.c |  7 ++++---
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index da4518cfd59c..a039fa8c1780 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -156,16 +156,15 @@ static inline int __devm_mutex_init(struct device *dev, struct mutex *lock)
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 extern void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
 extern void _mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest_lock);
-
 extern int __must_check mutex_lock_interruptible_nested(struct mutex *lock,
 					unsigned int subclass);
-extern int __must_check mutex_lock_killable_nested(struct mutex *lock,
-					unsigned int subclass);
+extern int __must_check _mutex_lock_killable(struct mutex *lock,
+		unsigned int subclass, struct lockdep_map *nest_lock);
 extern void mutex_lock_io_nested(struct mutex *lock, unsigned int subclass);
 
 #define mutex_lock(lock) mutex_lock_nested(lock, 0)
 #define mutex_lock_interruptible(lock) mutex_lock_interruptible_nested(lock, 0)
-#define mutex_lock_killable(lock) mutex_lock_killable_nested(lock, 0)
+#define mutex_lock_killable(lock) _mutex_lock_killable(lock, 0, NULL)
 #define mutex_lock_io(lock) mutex_lock_io_nested(lock, 0)
 
 #define mutex_lock_nest_lock(lock, nest_lock)				\
@@ -174,6 +173,15 @@ do {									\
 	_mutex_lock_nest_lock(lock, &(nest_lock)->dep_map);		\
 } while (0)
 
+#define mutex_lock_killable_nest_lock(lock, nest_lock)			\
+(									\
+	typecheck(struct lockdep_map *, &(nest_lock)->dep_map),		\
+	_mutex_lock_killable(lock, 0, &(nest_lock)->dep_map)		\
+)
+
+#define mutex_lock_killable_nested(lock, subclass) \
+	_mutex_lock_killable(lock, subclass, NULL)
+
 #else
 extern void mutex_lock(struct mutex *lock);
 extern int __must_check mutex_lock_interruptible(struct mutex *lock);
@@ -183,6 +191,7 @@ extern void mutex_lock_io(struct mutex *lock);
 # define mutex_lock_nested(lock, subclass) mutex_lock(lock)
 # define mutex_lock_interruptible_nested(lock, subclass) mutex_lock_interruptible(lock)
 # define mutex_lock_killable_nested(lock, subclass) mutex_lock_killable(lock)
+# define mutex_lock_killable_nest_lock(lock, nest_lock) mutex_lock_killable(lock)
 # define mutex_lock_nest_lock(lock, nest_lock) mutex_lock(lock)
 # define mutex_lock_io_nested(lock, subclass) mutex_lock_io(lock)
 #endif
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index c75a838d3bae..234923121ff0 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -808,11 +808,12 @@ _mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest)
 EXPORT_SYMBOL_GPL(_mutex_lock_nest_lock);
 
 int __sched
-mutex_lock_killable_nested(struct mutex *lock, unsigned int subclass)
+_mutex_lock_killable(struct mutex *lock, unsigned int subclass,
+		     struct lockdep_map *nest)
 {
-	return __mutex_lock(lock, TASK_KILLABLE, subclass, NULL, _RET_IP_);
+	return __mutex_lock(lock, TASK_KILLABLE, subclass, nest, _RET_IP_);
}
-EXPORT_SYMBOL_GPL(mutex_lock_killable_nested);
+EXPORT_SYMBOL_GPL(_mutex_lock_killable);
 
 int __sched
 mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
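
As with the trylock variant in the previous patch, a minimal caller sketch
(hypothetical, assuming the same already-held outer mutex convention) of the
new sleeping, killable primitive; lock_all_killable() and its parameters are
illustrative names, not part of the patch.

#include <linux/mutex.h>

/* Hypothetical caller: killably lock "count" same-class mutexes under "outer". */
static int lock_all_killable(struct mutex *outer, struct mutex *inner, int count)
{
	int i, r;

	lockdep_assert_held(outer);

	for (i = 0; i < count; i++) {
		r = mutex_lock_killable_nest_lock(&inner[i], outer);
		if (r) {
			/* A fatal signal is pending: unwind what was taken so far. */
			while (i--)
				mutex_unlock(&inner[i]);
			return r;
		}
	}
	return 0;
}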
-- 
2.46.0

From: Maxim Levitsky
To: kvm@vger.kernel.org
Peter Anvin" , Sebastian Ott , Shusen Li , Waiman Long , Thomas Gleixner , linux-arm-kernel@lists.infradead.org, Bjorn Helgaas , Borislav Petkov , Anup Patel , Will Deacon , Palmer Dabbelt , Alexander Potapenko , kvmarm@lists.linux.dev, Keisuke Nishimura , Zenghui Yu , Peter Zijlstra , Atish Patra , Joey Gouly , x86@kernel.org, Marc Zyngier , Sean Christopherson , Andre Przywara , Kunkun Jiang , linux-riscv@lists.infradead.org, Randy Dunlap , Paolo Bonzini , Boqun Feng , Catalin Marinas , Alexandre Ghiti , linux-kernel@vger.kernel.org, Dave Hansen , Oliver Upton , kvm-riscv@lists.infradead.org, Maxim Levitsky , Ingo Molnar , Paul Walmsley , Albert Ou Subject: [PATCH v5 3/6] KVM: add kvm_lock_all_vcpus and kvm_trylock_all_vcpus Date: Mon, 12 May 2025 14:04:04 -0400 Message-ID: <20250512180407.659015-4-mlevitsk@redhat.com> In-Reply-To: <20250512180407.659015-1-mlevitsk@redhat.com> References: <20250512180407.659015-1-mlevitsk@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4 Content-Type: text/plain; charset="utf-8" In a few cases, usually in the initialization code, KVM locks all vCPUs of a VM to ensure that userspace doesn't do funny things while KVM performs an operation that affects the whole VM. Until now, all these operations were implemented using custom code, and all of them share the same problem: Lockdep can't cope with simultaneous locking of a large number of locks of the same class. However if these locks are taken while another lock is already held, which is luckily the case, it is possible to take advantage of little known _nest_lock feature of lockdep which allows in this case to have an unlimited number of locks of same class to be taken. To implement this, create two functions: kvm_lock_all_vcpus() and kvm_trylock_all_vcpus() Both functions are needed because some code that will be replaced in the subsequent patches, uses mutex_trylock, instead of regular mutex_lock. 
Suggested-by: Paolo Bonzini
Signed-off-by: Maxim Levitsky
Acked-by: Marc Zyngier
Acked-by: Peter Zijlstra (Intel)
---
 include/linux/kvm_host.h |  4 +++
 virt/kvm/kvm_main.c      | 59 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 1dedc421b3e3..a6140415c693 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1015,6 +1015,10 @@ static inline struct kvm_vcpu *kvm_get_vcpu_by_id(struct kvm *kvm, int id)
 
 void kvm_destroy_vcpus(struct kvm *kvm);
 
+int kvm_trylock_all_vcpus(struct kvm *kvm);
+int kvm_lock_all_vcpus(struct kvm *kvm);
+void kvm_unlock_all_vcpus(struct kvm *kvm);
+
 void vcpu_load(struct kvm_vcpu *vcpu);
 void vcpu_put(struct kvm_vcpu *vcpu);
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 69782df3617f..d660a7da3baa 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1368,6 +1368,65 @@ static int kvm_vm_release(struct inode *inode, struct file *filp)
 	return 0;
 }
 
+int kvm_trylock_all_vcpus(struct kvm *kvm)
+{
+	struct kvm_vcpu *vcpu;
+	unsigned long i, j;
+
+	lockdep_assert_held(&kvm->lock);
+
+	kvm_for_each_vcpu(i, vcpu, kvm)
+		if (!mutex_trylock_nest_lock(&vcpu->mutex, &kvm->lock))
+			goto out_unlock;
+	return 0;
+
+out_unlock:
+	kvm_for_each_vcpu(j, vcpu, kvm) {
+		if (i == j)
+			break;
+		mutex_unlock(&vcpu->mutex);
+	}
+	return -EINTR;
+}
+EXPORT_SYMBOL_GPL(kvm_trylock_all_vcpus);
+
+int kvm_lock_all_vcpus(struct kvm *kvm)
+{
+	struct kvm_vcpu *vcpu;
+	unsigned long i, j;
+	int r;
+
+	lockdep_assert_held(&kvm->lock);
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		r = mutex_lock_killable_nest_lock(&vcpu->mutex, &kvm->lock);
+		if (r)
+			goto out_unlock;
+	}
+	return 0;
+
+out_unlock:
+	kvm_for_each_vcpu(j, vcpu, kvm) {
+		if (i == j)
+			break;
+		mutex_unlock(&vcpu->mutex);
+	}
+	return r;
+}
+EXPORT_SYMBOL_GPL(kvm_lock_all_vcpus);
+
+void kvm_unlock_all_vcpus(struct kvm *kvm)
+{
+	struct kvm_vcpu *vcpu;
+	unsigned long i;
+
+	lockdep_assert_held(&kvm->lock);
+
+	kvm_for_each_vcpu(i, vcpu, kvm)
+		mutex_unlock(&vcpu->mutex);
+}
+EXPORT_SYMBOL_GPL(kvm_unlock_all_vcpus);
+
 /*
  * Allocation size is twice as large as the actual dirty bitmap size.
  * See kvm_vm_ioctl_get_dirty_log() why this is needed.
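
A sketch of how an ioctl path is expected to use the new helpers (the handler
below is hypothetical; what the helpers themselves require is the
kvm->lock -> vcpu->mutex ordering they assert via lockdep_assert_held()).

#include <linux/kvm_host.h>

/* Hypothetical VM-wide operation: the helpers must be called under kvm->lock. */
static int kvm_do_vm_wide_op(struct kvm *kvm)
{
	int r;

	mutex_lock(&kvm->lock);

	r = kvm_lock_all_vcpus(kvm);	/* or kvm_trylock_all_vcpus(kvm) */
	if (r)
		goto out;

	/* ... operate on the VM while every vCPU mutex is held ... */

	kvm_unlock_all_vcpus(kvm);
out:
	mutex_unlock(&kvm->lock);
	return r;
}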
-- 
2.46.0

From: Maxim Levitsky
To: kvm@vger.kernel.org
Peter Anvin" , Sebastian Ott , Shusen Li , Waiman Long , Thomas Gleixner , linux-arm-kernel@lists.infradead.org, Bjorn Helgaas , Borislav Petkov , Anup Patel , Will Deacon , Palmer Dabbelt , Alexander Potapenko , kvmarm@lists.linux.dev, Keisuke Nishimura , Zenghui Yu , Peter Zijlstra , Atish Patra , Joey Gouly , x86@kernel.org, Marc Zyngier , Sean Christopherson , Andre Przywara , Kunkun Jiang , linux-riscv@lists.infradead.org, Randy Dunlap , Paolo Bonzini , Boqun Feng , Catalin Marinas , Alexandre Ghiti , linux-kernel@vger.kernel.org, Dave Hansen , Oliver Upton , kvm-riscv@lists.infradead.org, Maxim Levitsky , Ingo Molnar , Paul Walmsley , Albert Ou Subject: [PATCH v5 4/6] x86: KVM: SVM: use kvm_lock_all_vcpus instead of a custom implementation Date: Mon, 12 May 2025 14:04:05 -0400 Message-ID: <20250512180407.659015-5-mlevitsk@redhat.com> In-Reply-To: <20250512180407.659015-1-mlevitsk@redhat.com> References: <20250512180407.659015-1-mlevitsk@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4 Content-Type: text/plain; charset="utf-8" Use kvm_lock_all_vcpus instead of sev's own implementation. Because kvm_lock_all_vcpus uses the _nest_lock feature of lockdep, which ignores subclasses, there is no longer a need to use separate subclasses for source and target VMs. No functional change intended. Suggested-by: Paolo Bonzini Signed-off-by: Maxim Levitsky Acked-by: Peter Zijlstra (Intel) --- arch/x86/kvm/svm/sev.c | 72 +++--------------------------------------- 1 file changed, 4 insertions(+), 68 deletions(-) diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c index 0bc708ee2788..16db6179013d 100644 --- a/arch/x86/kvm/svm/sev.c +++ b/arch/x86/kvm/svm/sev.c @@ -1882,70 +1882,6 @@ static void sev_unlock_two_vms(struct kvm *dst_kvm, = struct kvm *src_kvm) atomic_set_release(&src_sev->migration_in_progress, 0); } =20 -/* vCPU mutex subclasses. */ -enum sev_migration_role { - SEV_MIGRATION_SOURCE =3D 0, - SEV_MIGRATION_TARGET, - SEV_NR_MIGRATION_ROLES, -}; - -static int sev_lock_vcpus_for_migration(struct kvm *kvm, - enum sev_migration_role role) -{ - struct kvm_vcpu *vcpu; - unsigned long i, j; - - kvm_for_each_vcpu(i, vcpu, kvm) { - if (mutex_lock_killable_nested(&vcpu->mutex, role)) - goto out_unlock; - -#ifdef CONFIG_PROVE_LOCKING - if (!i) - /* - * Reset the role to one that avoids colliding with - * the role used for the first vcpu mutex. 
-			 */
-			role = SEV_NR_MIGRATION_ROLES;
-		else
-			mutex_release(&vcpu->mutex.dep_map, _THIS_IP_);
-#endif
-	}
-
-	return 0;
-
-out_unlock:
-
-	kvm_for_each_vcpu(j, vcpu, kvm) {
-		if (i == j)
-			break;
-
-#ifdef CONFIG_PROVE_LOCKING
-		if (j)
-			mutex_acquire(&vcpu->mutex.dep_map, role, 0, _THIS_IP_);
-#endif
-
-		mutex_unlock(&vcpu->mutex);
-	}
-	return -EINTR;
-}
-
-static void sev_unlock_vcpus_for_migration(struct kvm *kvm)
-{
-	struct kvm_vcpu *vcpu;
-	unsigned long i;
-	bool first = true;
-
-	kvm_for_each_vcpu(i, vcpu, kvm) {
-		if (first)
-			first = false;
-		else
-			mutex_acquire(&vcpu->mutex.dep_map,
-				      SEV_NR_MIGRATION_ROLES, 0, _THIS_IP_);
-
-		mutex_unlock(&vcpu->mutex);
-	}
-}
-
 static void sev_migrate_from(struct kvm *dst_kvm, struct kvm *src_kvm)
 {
 	struct kvm_sev_info *dst = to_kvm_sev_info(dst_kvm);
@@ -2083,10 +2019,10 @@ int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
 		charged = true;
 	}
 
-	ret = sev_lock_vcpus_for_migration(kvm, SEV_MIGRATION_SOURCE);
+	ret = kvm_lock_all_vcpus(kvm);
 	if (ret)
 		goto out_dst_cgroup;
-	ret = sev_lock_vcpus_for_migration(source_kvm, SEV_MIGRATION_TARGET);
+	ret = kvm_lock_all_vcpus(source_kvm);
 	if (ret)
 		goto out_dst_vcpu;
 
@@ -2100,9 +2036,9 @@ int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
 	ret = 0;
 
 out_source_vcpu:
-	sev_unlock_vcpus_for_migration(source_kvm);
+	kvm_unlock_all_vcpus(source_kvm);
 out_dst_vcpu:
-	sev_unlock_vcpus_for_migration(kvm);
+	kvm_unlock_all_vcpus(kvm);
 out_dst_cgroup:
 	/* Operates on the source on success, on the destination on failure. */
 	if (charged)
-- 
2.46.0
From: Maxim Levitsky
To: kvm@vger.kernel.org
Subject: [PATCH v5 5/6] KVM: arm64: use kvm_trylock_all_vcpus when locking all vCPUs
Date: Mon, 12 May 2025 14:04:06 -0400
Message-ID: <20250512180407.659015-6-mlevitsk@redhat.com>
In-Reply-To: <20250512180407.659015-1-mlevitsk@redhat.com>
References: <20250512180407.659015-1-mlevitsk@redhat.com>

Use kvm_trylock_all_vcpus instead of a custom implementation when locking
all vCPUs of a VM, to avoid triggering a lockdep warning when the VM is
configured to have more than MAX_LOCK_DEPTH vCPUs.

This fixes the following false lockdep warning:

[ 328.171264] BUG: MAX_LOCK_DEPTH too low!
[ 328.175227] turning off the locking correctness validator.
[ 328.180726] Please attach the output of /proc/lock_stat to the bug report
[ 328.187531] depth: 48 max: 48!
[ 328.190678] 48 locks held by qemu-kvm/11664:
[ 328.194957] #0: ffff800086de5ba0 (&kvm->lock){+.+.}-{3:3}, at: kvm_ioctl_create_device+0x174/0x5b0
[ 328.204048] #1: ffff0800e78800b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
[ 328.212521] #2: ffff07ffeee51e98 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
[ 328.220991] #3: ffff0800dc7d80b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
[ 328.229463] #4: ffff07ffe0c980b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
[ 328.237934] #5: ffff0800a3883c78 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
[ 328.246405] #6: ffff07fffbe480b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0

Suggested-by: Paolo Bonzini
Signed-off-by: Maxim Levitsky
Acked-by: Marc Zyngier
Acked-by: Peter Zijlstra (Intel)
---
 arch/arm64/include/asm/kvm_host.h     |  3 --
 arch/arm64/kvm/arch_timer.c           |  4 +--
 arch/arm64/kvm/arm.c                  | 43 ---------------------------
 arch/arm64/kvm/vgic/vgic-init.c       |  4 +--
 arch/arm64/kvm/vgic/vgic-its.c        |  8 ++---
 arch/arm64/kvm/vgic/vgic-kvm-device.c | 12 ++++----
 6 files changed, 14 insertions(+), 60 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 08ba91e6fb03..e5ddbd1ba2ca 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1263,9 +1263,6 @@ int __init populate_sysreg_config(const struct sys_reg_desc *sr,
 				  unsigned int idx);
 int __init populate_nv_trap_config(void);
 
-bool lock_all_vcpus(struct kvm *kvm);
-void unlock_all_vcpus(struct kvm *kvm);
-
 void kvm_calculate_traps(struct kvm_vcpu *vcpu);
 
 /* MMIO helpers */
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 5133dcbfe9f7..fdbc8beec930 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -1766,7 +1766,7 @@ int kvm_vm_ioctl_set_counter_offset(struct kvm *kvm,
 
 	mutex_lock(&kvm->lock);
 
-	if (lock_all_vcpus(kvm)) {
+	if (!kvm_trylock_all_vcpus(kvm)) {
 		set_bit(KVM_ARCH_FLAG_VM_COUNTER_OFFSET, &kvm->arch.flags);
 
 		/*
@@ -1778,7 +1778,7 @@ int kvm_vm_ioctl_set_counter_offset(struct kvm *kvm,
 		kvm->arch.timer_data.voffset = offset->counter_offset;
 		kvm->arch.timer_data.poffset = offset->counter_offset;
 
-		unlock_all_vcpus(kvm);
+		kvm_unlock_all_vcpus(kvm);
 	} else {
 		ret = -EBUSY;
 	}
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 19ca57def629..4171bd5139c8 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1914,49 +1914,6 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 	}
 }
 
-/* unlocks vcpus from @vcpu_lock_idx and smaller */
-static void unlock_vcpus(struct kvm *kvm, int vcpu_lock_idx)
-{
-	struct kvm_vcpu *tmp_vcpu;
-
-	for (; vcpu_lock_idx >= 0; vcpu_lock_idx--) {
-		tmp_vcpu = kvm_get_vcpu(kvm, vcpu_lock_idx);
-		mutex_unlock(&tmp_vcpu->mutex);
-	}
-}
-
-void unlock_all_vcpus(struct kvm *kvm)
-{
-	lockdep_assert_held(&kvm->lock);
-
-	unlock_vcpus(kvm, atomic_read(&kvm->online_vcpus) - 1);
-}
-
-/* Returns true if all vcpus were locked, false otherwise */
-bool lock_all_vcpus(struct kvm *kvm)
-{
-	struct kvm_vcpu *tmp_vcpu;
-	unsigned long c;
-
-	lockdep_assert_held(&kvm->lock);
-
-	/*
-	 * Any time a vcpu is in an ioctl (including running), the
-	 * core KVM code tries to grab the vcpu->mutex.
-	 *
-	 * By grabbing the vcpu->mutex of all VCPUs we ensure that no
-	 * other VCPUs can fiddle with the state while we access it.
-	 */
-	kvm_for_each_vcpu(c, tmp_vcpu, kvm) {
-		if (!mutex_trylock(&tmp_vcpu->mutex)) {
-			unlock_vcpus(kvm, c - 1);
-			return false;
-		}
-	}
-
-	return true;
-}
-
 static unsigned long nvhe_percpu_size(void)
 {
 	return (unsigned long)CHOOSE_NVHE_SYM(__per_cpu_end) -
diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
index 1f33e71c2a73..6a426d403a6b 100644
--- a/arch/arm64/kvm/vgic/vgic-init.c
+++ b/arch/arm64/kvm/vgic/vgic-init.c
@@ -88,7 +88,7 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
 	lockdep_assert_held(&kvm->lock);
 
 	ret = -EBUSY;
-	if (!lock_all_vcpus(kvm))
+	if (kvm_trylock_all_vcpus(kvm))
 		return ret;
 
 	mutex_lock(&kvm->arch.config_lock);
@@ -142,7 +142,7 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
 
 out_unlock:
 	mutex_unlock(&kvm->arch.config_lock);
-	unlock_all_vcpus(kvm);
+	kvm_unlock_all_vcpus(kvm);
 	return ret;
 }
 
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index fb96802799c6..7454388e3646 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -1999,7 +1999,7 @@ static int vgic_its_attr_regs_access(struct kvm_device *dev,
 
 	mutex_lock(&dev->kvm->lock);
 
-	if (!lock_all_vcpus(dev->kvm)) {
+	if (kvm_trylock_all_vcpus(dev->kvm)) {
 		mutex_unlock(&dev->kvm->lock);
 		return -EBUSY;
 	}
@@ -2034,7 +2034,7 @@ static int vgic_its_attr_regs_access(struct kvm_device *dev,
 	}
 out:
 	mutex_unlock(&dev->kvm->arch.config_lock);
-	unlock_all_vcpus(dev->kvm);
+	kvm_unlock_all_vcpus(dev->kvm);
 	mutex_unlock(&dev->kvm->lock);
 	return ret;
 }
@@ -2704,7 +2704,7 @@ static int vgic_its_ctrl(struct kvm *kvm, struct vgic_its *its, u64 attr)
 
 	mutex_lock(&kvm->lock);
 
-	if (!lock_all_vcpus(kvm)) {
+	if (kvm_trylock_all_vcpus(kvm)) {
 		mutex_unlock(&kvm->lock);
 		return -EBUSY;
 	}
@@ -2726,7 +2726,7 @@ static int vgic_its_ctrl(struct kvm *kvm, struct vgic_its *its, u64 attr)
 
 	mutex_unlock(&its->its_lock);
 	mutex_unlock(&kvm->arch.config_lock);
-	unlock_all_vcpus(kvm);
+	kvm_unlock_all_vcpus(kvm);
 	mutex_unlock(&kvm->lock);
 	return ret;
 }
diff --git a/arch/arm64/kvm/vgic/vgic-kvm-device.c b/arch/arm64/kvm/vgic/vgic-kvm-device.c
index 359094f68c23..f9ae790163fb 100644
--- a/arch/arm64/kvm/vgic/vgic-kvm-device.c
+++ b/arch/arm64/kvm/vgic/vgic-kvm-device.c
@@ -268,7 +268,7 @@ static int vgic_set_common_attr(struct kvm_device *dev,
 			return -ENXIO;
 		mutex_lock(&dev->kvm->lock);
 
-		if (!lock_all_vcpus(dev->kvm)) {
+		if (kvm_trylock_all_vcpus(dev->kvm)) {
 			mutex_unlock(&dev->kvm->lock);
 			return -EBUSY;
 		}
@@ -276,7 +276,7 @@ static int vgic_set_common_attr(struct kvm_device *dev,
 		mutex_lock(&dev->kvm->arch.config_lock);
 		r = vgic_v3_save_pending_tables(dev->kvm);
 		mutex_unlock(&dev->kvm->arch.config_lock);
-		unlock_all_vcpus(dev->kvm);
+		kvm_unlock_all_vcpus(dev->kvm);
 		mutex_unlock(&dev->kvm->lock);
 		return r;
 	}
@@ -390,7 +390,7 @@ static int vgic_v2_attr_regs_access(struct kvm_device *dev,
 
 	mutex_lock(&dev->kvm->lock);
 
-	if (!lock_all_vcpus(dev->kvm)) {
+	if (kvm_trylock_all_vcpus(dev->kvm)) {
 		mutex_unlock(&dev->kvm->lock);
 		return -EBUSY;
 	}
@@ -415,7 +415,7 @@ static int vgic_v2_attr_regs_access(struct kvm_device *dev,
 
 out:
 	mutex_unlock(&dev->kvm->arch.config_lock);
-	unlock_all_vcpus(dev->kvm);
+	kvm_unlock_all_vcpus(dev->kvm);
 	mutex_unlock(&dev->kvm->lock);
 
 	if (!ret && !is_write)
@@ -554,7 +554,7 @@ static int vgic_v3_attr_regs_access(struct kvm_device *dev,
 
 	mutex_lock(&dev->kvm->lock);
 
-	if (!lock_all_vcpus(dev->kvm)) {
+	if (kvm_trylock_all_vcpus(dev->kvm)) {
 		mutex_unlock(&dev->kvm->lock);
 		return -EBUSY;
 	}
@@ -611,7 +611,7 @@ static int vgic_v3_attr_regs_access(struct kvm_device *dev,
 
 out:
 	mutex_unlock(&dev->kvm->arch.config_lock);
-	unlock_all_vcpus(dev->kvm);
+	kvm_unlock_all_vcpus(dev->kvm);
 	mutex_unlock(&dev->kvm->lock);
 
 	if (!ret && uaccess && !is_write) {
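
Note that the conversion inverts the success check at every call site: the
old arm64-specific lock_all_vcpus() returned true when all vCPUs were locked,
while kvm_trylock_all_vcpus() follows the usual kernel convention and returns
0 on success and a negative error code otherwise. A hypothetical wrapper,
only for illustration of that mapping:

#include <linux/errno.h>
#include <linux/kvm_host.h>

/*
 * Illustrative only: old callers treated "could not lock all vCPUs" as
 * -EBUSY, so a non-zero return from kvm_trylock_all_vcpus() maps to -EBUSY.
 */
static int lock_all_vcpus_or_busy(struct kvm *kvm)
{
	lockdep_assert_held(&kvm->lock);

	return kvm_trylock_all_vcpus(kvm) ? -EBUSY : 0;
}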
-- 
2.46.0

From: Maxim Levitsky
To: kvm@vger.kernel.org
Subject: [PATCH v5 6/6] RISC-V: KVM: use kvm_trylock_all_vcpus when locking all vCPUs
Date: Mon, 12 May 2025 14:04:07 -0400
Message-ID: <20250512180407.659015-7-mlevitsk@redhat.com>
In-Reply-To: <20250512180407.659015-1-mlevitsk@redhat.com>
References: <20250512180407.659015-1-mlevitsk@redhat.com>

Use kvm_trylock_all_vcpus instead of a custom implementation when locking
all vCPUs of a VM.

Compile tested only.

Suggested-by: Paolo Bonzini
Signed-off-by: Maxim Levitsky
Acked-by: Peter Zijlstra (Intel)
Reviewed-by: Anup Patel
Tested-by: Anup Patel
---
 arch/riscv/kvm/aia_device.c | 34 ++--------------------------------
 1 file changed, 2 insertions(+), 32 deletions(-)

diff --git a/arch/riscv/kvm/aia_device.c b/arch/riscv/kvm/aia_device.c
index 39cd26af5a69..6315821f0d69 100644
--- a/arch/riscv/kvm/aia_device.c
+++ b/arch/riscv/kvm/aia_device.c
@@ -12,36 +12,6 @@
 #include
 #include
 
-static void unlock_vcpus(struct kvm *kvm, int vcpu_lock_idx)
-{
-	struct kvm_vcpu *tmp_vcpu;
-
-	for (; vcpu_lock_idx >= 0; vcpu_lock_idx--) {
-		tmp_vcpu = kvm_get_vcpu(kvm, vcpu_lock_idx);
-		mutex_unlock(&tmp_vcpu->mutex);
-	}
-}
-
-static void unlock_all_vcpus(struct kvm *kvm)
-{
-	unlock_vcpus(kvm, atomic_read(&kvm->online_vcpus) - 1);
-}
-
-static bool lock_all_vcpus(struct kvm *kvm)
-{
-	struct kvm_vcpu *tmp_vcpu;
-	unsigned long c;
-
-	kvm_for_each_vcpu(c, tmp_vcpu, kvm) {
-		if (!mutex_trylock(&tmp_vcpu->mutex)) {
-			unlock_vcpus(kvm, c - 1);
-			return false;
-		}
-	}
-
-	return true;
-}
-
 static int aia_create(struct kvm_device *dev, u32 type)
 {
 	int ret;
@@ -53,7 +23,7 @@ static int aia_create(struct kvm_device *dev, u32 type)
 		return -EEXIST;
 
 	ret = -EBUSY;
-	if (!lock_all_vcpus(kvm))
+	if (kvm_trylock_all_vcpus(kvm))
 		return ret;
 
 	kvm_for_each_vcpu(i, vcpu, kvm) {
@@ -65,7 +35,7 @@ static int aia_create(struct kvm_device *dev, u32 type)
 	kvm->arch.aia.in_kernel = true;
 
 out_unlock:
-	unlock_all_vcpus(kvm);
+	kvm_unlock_all_vcpus(kvm);
 	return ret;
 }
 
-- 
2.46.0