From nobody Sat May 4 02:13:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1638269033153317.3765654129139; Tue, 30 Nov 2021 02:43:53 -0800 (PST) Received: from localhost ([::1]:35692 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ms0c3-0008F1-Vt for importer@patchew.org; Tue, 30 Nov 2021 05:43:52 -0500 Received: from eggs.gnu.org ([209.51.188.92]:43468) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ms0a8-0005a0-7d for qemu-devel@nongnu.org; Tue, 30 Nov 2021 05:41:53 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:51979) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ms0a5-0000zG-81 for qemu-devel@nongnu.org; Tue, 30 Nov 2021 05:41:50 -0500 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-602-LsE1doLpPT667VmSigKgsA-1; Tue, 30 Nov 2021 05:41:45 -0500 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id E4313801B10; Tue, 30 Nov 2021 10:41:43 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.193.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id CDA3D100AE22; Tue, 30 Nov 2021 10:41:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; 
t=1638268908; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NMOA9kQbOGwFqcPMc+ilA6kpME97Hs3N/G+0b3k1Sx0=; b=bh+Q7ddYJihx7JU4NJJn5ntpjF6ErAXvLuXIkLN4NvGncyeNiAPpWQHaTrd6r8iNOl3133 xSxDlLjTiOkx0ugiGw6gKbgsSlu/ObMcRTLO5j+NTSGZtNeMSQvuqeGXDJCoaeStijzgTu iwgQgp+WBP+E+qHIVvIRulIKIad8GzQ= X-MC-Unique: LsE1doLpPT667VmSigKgsA-1 From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v1 1/8] util/oslib-posix: Let touch_all_pages() return an error Date: Tue, 30 Nov 2021 11:41:29 +0100 Message-Id: <20211130104136.40927-2-david@redhat.com> In-Reply-To: <20211130104136.40927-1-david@redhat.com> References: <20211130104136.40927-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -34 X-Spam_score: -3.5 X-Spam_bar: --- X-Spam_report: (-3.5 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.716, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: 
Pankaj Gupta, Gavin Shan, Daniel P . Berrangé, Eduardo Habkost, "Michael S. Tsirkin", Michal Prívozník, David Hildenbrand, "Dr . David Alan Gilbert", Sebastien Boeuf, Igor Mammedov, Paolo Bonzini, Hui Zhu
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
X-ZM-MESSAGEID: 1638269033777100001

Let's prepare touch_all_pages() for returning differing errors. Return
an error from the thread and report the last processed error.

Translate SIGBUS to -EFAULT, as a SIGBUS can mean all kinds of different
things (memory error, read error, out of memory). When allocating memory
fails via the current SIGBUS-based mechanism, we'll get:
    os_mem_prealloc: preallocating memory failed: Bad address

Reviewed-by: Daniel P. Berrangé
Signed-off-by: David Hildenbrand
Reviewed-by: Michal Privoznik
---
 util/oslib-posix.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index e8bdb02e1d..b146beef78 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -84,7 +84,6 @@ typedef struct MemsetThread MemsetThread;
 
 static MemsetThread *memset_thread;
 static int memset_num_threads;
-static bool memset_thread_failed;
 
 static QemuMutex page_mutex;
 static QemuCond page_cond;
@@ -452,6 +451,7 @@ static void *do_touch_pages(void *arg)
 {
     MemsetThread *memset_args = (MemsetThread *)arg;
     sigset_t set, oldset;
+    int ret = 0;
 
     /*
      * On Linux, the page faults from the loop below can cause mmap_sem
@@ -470,7 +470,7 @@ static void *do_touch_pages(void *arg)
     pthread_sigmask(SIG_UNBLOCK, &set, &oldset);
 
     if (sigsetjmp(memset_args->env, 1)) {
-        memset_thread_failed = true;
+        ret = -EFAULT;
     } else {
         char *addr = memset_args->addr;
         size_t numpages = memset_args->numpages;
@@ -494,7 +494,7 @@ static void *do_touch_pages(void *arg)
         }
     }
     pthread_sigmask(SIG_SETMASK, &oldset, NULL);
-    return NULL;
+    return (void *)(uintptr_t)ret;
 }
 
 static inline int get_memset_num_threads(int smp_cpus)
@@ -509,13 +509,13 @@ static inline int get_memset_num_threads(int smp_cpus)
     return ret;
 }
 
-static bool touch_all_pages(char *area, size_t hpagesize, size_t numpages,
-                            int smp_cpus)
+static int touch_all_pages(char *area, size_t hpagesize, size_t numpages,
+                           int smp_cpus)
 {
     static gsize initialized = 0;
     size_t numpages_per_thread, leftover;
+    int ret = 0, i = 0;
     char *addr = area;
-    int i = 0;
 
     if (g_once_init_enter(&initialized)) {
         qemu_mutex_init(&page_mutex);
@@ -523,7 +523,6 @@ static bool touch_all_pages(char *area, size_t hpagesize, size_t numpages,
         g_once_init_leave(&initialized, 1);
     }
 
-    memset_thread_failed = false;
     threads_created_flag = false;
     memset_num_threads = get_memset_num_threads(smp_cpus);
     memset_thread = g_new0(MemsetThread, memset_num_threads);
@@ -545,12 +544,16 @@ static bool touch_all_pages(char *area, size_t hpagesize, size_t numpages,
     qemu_mutex_unlock(&page_mutex);
 
     for (i = 0; i < memset_num_threads; i++) {
-        qemu_thread_join(&memset_thread[i].pgthread);
+        int tmp = (uintptr_t)qemu_thread_join(&memset_thread[i].pgthread);
+
+        if (tmp) {
+            ret = tmp;
+        }
     }
     g_free(memset_thread);
     memset_thread = NULL;
 
-    return memset_thread_failed;
+    return ret;
 }
 
 void os_mem_prealloc(int fd, char *area, size_t memory, int smp_cpus,
@@ -573,9 +576,10 @@ void os_mem_prealloc(int fd, char *area, size_t memory, int smp_cpus,
     }
 
     /* touch pages simultaneously */
-    if (touch_all_pages(area, hpagesize, numpages, smp_cpus)) {
-        error_setg(errp, "os_mem_prealloc: Insufficient free host memory "
-            "pages available to allocate guest RAM");
+    ret = touch_all_pages(area, hpagesize, numpages, smp_cpus);
+    if (ret) {
+        error_setg_errno(errp, -ret,
+                         "os_mem_prealloc: preallocating memory failed");
     }
 
     ret = sigaction(SIGBUS, &oldact, NULL);
-- 
2.31.1

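Editorial sketch, not part of the series: the patch above replaces a shared
"failed" flag with a negative errno that each worker thread returns and the
joiner collects, keeping the last non-zero value. The standalone example
below shows that pattern with plain pthreads instead of QEMU's thread
wrappers. The names worker(), run_workers() and MAX_WORKERS are hypothetical,
and the rule that odd-numbered workers fail exists only to make the result
observable. It casts through intptr_t rather than the uintptr_t used in the
patch, which avoids converting an out-of-range unsigned value to int.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

#define MAX_WORKERS 16

/* Hypothetical worker: odd-numbered workers pretend their range faulted. */
static void *worker(void *arg)
{
    int id = (int)(intptr_t)arg;
    int ret = (id % 2) ? -EFAULT : 0;

    /* Pass the int back through the void * thread return value. */
    return (void *)(intptr_t)ret;
}

/* Returns 0 on success, or the last negative errno that was reported. */
static int run_workers(int num_workers)
{
    pthread_t threads[MAX_WORKERS];
    int created = 0, ret = 0;

    if (num_workers < 1 || num_workers > MAX_WORKERS) {
        return -EINVAL;
    }

    for (int i = 0; i < num_workers; i++) {
        int err = pthread_create(&threads[i], NULL, worker,
                                 (void *)(intptr_t)i);
        if (err) {
            ret = -err; /* pthread_create() returns a positive errno */
            break;
        }
        created++;
    }

    /* Join every thread that was started, even if a later create failed. */
    for (int i = 0; i < created; i++) {
        void *result = NULL;
        int err = pthread_join(threads[i], &result);
        int tmp = err ? -err : (int)(intptr_t)result;

        if (tmp) {
            ret = tmp; /* keep the last non-zero error */
        }
    }
    return ret;
}
```

A caller can then turn the result into a message with strerror(-ret), which
is how the "Bad address" text quoted in the commit message comes about for
-EFAULT.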
From nobody Sat May 4 02:13:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1638269192550472.0261822307824; Tue, 30 Nov 2021 02:46:32 -0800 (PST) Received: from localhost ([::1]:43462 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ms0ed-00050t-Bs for importer@patchew.org; Tue, 30 Nov 2021 05:46:31 -0500 Received: from eggs.gnu.org ([209.51.188.92]:43484) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ms0aA-0005bL-Id for qemu-devel@nongnu.org; Tue, 30 Nov 2021 05:41:55 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:30666) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ms0a8-00010h-Nr for qemu-devel@nongnu.org; Tue, 30 Nov 2021 05:41:54 -0500 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-49-eAv_ePAIOXGY_UlF_H3x1g-1; Tue, 30 Nov 2021 05:41:48 -0500 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 66578192CC41; Tue, 30 Nov 2021 10:41:47 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.193.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4F30A100AE22; Tue, 30 Nov 2021 10:41:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; 
t=1638268912; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FuqDVk/VmU0BGNDYQ+fzVu3fCHn52LEbJwMF35lPkRA=; b=IVmLPMqt4kicVXkT+Zt33yuUWjlHttdcZGxh02iu7cGls7R8wAKFbr9iT+vx0+GCFDLWkL KFp3wAhO5wE0Z2F7cOKLTvoqM24oTrIwZ7lLB5PAfBjNGF0arS7LH6119+QX9A4oc5PocJ SrVfw65w+joij9Aqyhox1XxFdgS/K30= X-MC-Unique: eAv_ePAIOXGY_UlF_H3x1g-1 From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v1 2/8] util/oslib-posix: Support MADV_POPULATE_WRITE for os_mem_prealloc() Date: Tue, 30 Nov 2021 11:41:30 +0100 Message-Id: <20211130104136.40927-3-david@redhat.com> In-Reply-To: <20211130104136.40927-1-david@redhat.com> References: <20211130104136.40927-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -34 X-Spam_score: -3.5 X-Spam_bar: --- X-Spam_report: (-3.5 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.716, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: 
List-Help:
List-Subscribe: ,
Cc: Pankaj Gupta, Gavin Shan, Daniel P . Berrangé, Eduardo Habkost, "Michael S. Tsirkin", Michal Prívozník, David Hildenbrand, "Dr . David Alan Gilbert", Pankaj Gupta, Sebastien Boeuf, Igor Mammedov, Paolo Bonzini, Hui Zhu
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
X-ZM-MESSAGEID: 1638269194373100001

Let's sense support and use it for preallocation. MADV_POPULATE_WRITE
does not require a SIGBUS handler, doesn't actually touch page content,
and avoids context switches; it is, therefore, faster and easier to handle
than our current approach.

While MADV_POPULATE_WRITE is, in general, faster than manual
prefaulting, and especially faster with 4k pages, there is still value in
prefaulting using multiple threads to speed up preallocation.

More details on MADV_POPULATE_WRITE can be found in the Linux commits
4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault
page tables") and eb2faa513c24 ("mm/madvise: report SIGBUS as -EFAULT for
MADV_POPULATE_(READ|WRITE)"), and in the man page proposal [1].

This resolves the TODO in do_touch_pages().

In the future, we might want to look into using fallocate(), eventually
combined with MADV_POPULATE_READ, when dealing with shared file/fd
mappings and not caring about memory bindings.

[1] https://lkml.kernel.org/r/20210816081922.5155-1-david@redhat.com

Reviewed-by: Pankaj Gupta
Reviewed-by: Daniel P. Berrangé
Signed-off-by: David Hildenbrand
Reviewed-by: Michal Privoznik
---
 include/qemu/osdep.h |  7 ++++
 util/oslib-posix.c   | 83 +++++++++++++++++++++++++++++++++-----------
 2 files changed, 69 insertions(+), 21 deletions(-)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 60718fc342..d1660d67fa 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -471,6 +471,11 @@ static inline void qemu_cleanup_generic_vfree(void *p)
 #else
 #define QEMU_MADV_REMOVE QEMU_MADV_DONTNEED
 #endif
+#ifdef MADV_POPULATE_WRITE
+#define QEMU_MADV_POPULATE_WRITE MADV_POPULATE_WRITE
+#else
+#define QEMU_MADV_POPULATE_WRITE QEMU_MADV_INVALID
+#endif
 
 #elif defined(CONFIG_POSIX_MADVISE)
 
@@ -484,6 +489,7 @@ static inline void qemu_cleanup_generic_vfree(void *p)
 #define QEMU_MADV_HUGEPAGE QEMU_MADV_INVALID
 #define QEMU_MADV_NOHUGEPAGE QEMU_MADV_INVALID
 #define QEMU_MADV_REMOVE QEMU_MADV_DONTNEED
+#define QEMU_MADV_POPULATE_WRITE QEMU_MADV_INVALID
 
 #else /* no-op */
 
@@ -497,6 +503,7 @@ static inline void qemu_cleanup_generic_vfree(void *p)
 #define QEMU_MADV_HUGEPAGE QEMU_MADV_INVALID
 #define QEMU_MADV_NOHUGEPAGE QEMU_MADV_INVALID
 #define QEMU_MADV_REMOVE QEMU_MADV_INVALID
+#define QEMU_MADV_POPULATE_WRITE QEMU_MADV_INVALID
 
 #endif
 
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index b146beef78..cb89e07770 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -484,10 +484,6 @@ static void *do_touch_pages(void *arg)
              *
              * 'volatile' to stop compiler optimizing this away
              * to a no-op
-             *
-             * TODO: get a better solution from kernel so we
-             * don't need to write at all so we don't cause
-             * wear on the storage backing the region...
              */
             *(volatile char *)addr = *addr;
             addr += hpagesize;
@@ -497,6 +493,26 @@ static void *do_touch_pages(void *arg)
     return (void *)(uintptr_t)ret;
 }
 
+static void *do_madv_populate_write_pages(void *arg)
+{
+    MemsetThread *memset_args = (MemsetThread *)arg;
+    const size_t size = memset_args->numpages * memset_args->hpagesize;
+    char * const addr = memset_args->addr;
+    int ret = 0;
+
+    /* See do_touch_pages(). */
+    qemu_mutex_lock(&page_mutex);
+    while (!threads_created_flag) {
+        qemu_cond_wait(&page_cond, &page_mutex);
+    }
+    qemu_mutex_unlock(&page_mutex);
+
+    if (size && qemu_madvise(addr, size, QEMU_MADV_POPULATE_WRITE)) {
+        ret = -errno;
+    }
+    return (void *)(uintptr_t)ret;
+}
+
 static inline int get_memset_num_threads(int smp_cpus)
 {
     long host_procs = sysconf(_SC_NPROCESSORS_ONLN);
@@ -510,10 +526,11 @@ static inline int get_memset_num_threads(int smp_cpus)
 }
 
 static int touch_all_pages(char *area, size_t hpagesize, size_t numpages,
-                           int smp_cpus)
+                           int smp_cpus, bool use_madv_populate_write)
 {
     static gsize initialized = 0;
     size_t numpages_per_thread, leftover;
+    void *(*touch_fn)(void *);
     int ret = 0, i = 0;
     char *addr = area;
 
@@ -523,6 +540,12 @@ static int touch_all_pages(char *area, size_t hpagesize, size_t numpages,
         g_once_init_leave(&initialized, 1);
     }
 
+    if (use_madv_populate_write) {
+        touch_fn = do_madv_populate_write_pages;
+    } else {
+        touch_fn = do_touch_pages;
+    }
+
     threads_created_flag = false;
     memset_num_threads = get_memset_num_threads(smp_cpus);
     memset_thread = g_new0(MemsetThread, memset_num_threads);
@@ -533,7 +556,7 @@ static int touch_all_pages(char *area, size_t hpagesize, size_t numpages,
         memset_thread[i].numpages = numpages_per_thread + (i < leftover);
         memset_thread[i].hpagesize = hpagesize;
         qemu_thread_create(&memset_thread[i].pgthread, "touch_pages",
-                           do_touch_pages, &memset_thread[i],
+                           touch_fn, &memset_thread[i],
                            QEMU_THREAD_JOINABLE);
         addr += memset_thread[i].numpages * hpagesize;
     }
@@ -556,6 +579,12 @@ static int touch_all_pages(char *area, size_t hpagesize, size_t numpages,
     return ret;
 }
 
+static bool madv_populate_write_possible(char *area, size_t pagesize)
+{
+    return !qemu_madvise(area, pagesize, QEMU_MADV_POPULATE_WRITE) ||
+           errno != EINVAL;
+}
+
 void os_mem_prealloc(int fd, char *area, size_t memory, int smp_cpus,
                      Error **errp)
 {
@@ -563,30 +592,42 @@ void os_mem_prealloc(int fd, char *area, size_t memory, int smp_cpus,
     struct sigaction act, oldact;
     size_t hpagesize = qemu_fd_getpagesize(fd);
     size_t numpages = DIV_ROUND_UP(memory, hpagesize);
+    bool use_madv_populate_write;
 
-    memset(&act, 0, sizeof(act));
-    act.sa_handler = &sigbus_handler;
-    act.sa_flags = 0;
-
-    ret = sigaction(SIGBUS, &act, &oldact);
-    if (ret) {
-        error_setg_errno(errp, errno,
-            "os_mem_prealloc: failed to install signal handler");
-        return;
+    /*
+     * Sense on every invocation, as MADV_POPULATE_WRITE cannot be used for
+     * some special mappings, such as mapping /dev/mem.
+     */
+    use_madv_populate_write = madv_populate_write_possible(area, hpagesize);
+
+    if (!use_madv_populate_write) {
+        memset(&act, 0, sizeof(act));
+        act.sa_handler = &sigbus_handler;
+        act.sa_flags = 0;
+
+        ret = sigaction(SIGBUS, &act, &oldact);
+        if (ret) {
+            error_setg_errno(errp, errno,
+                "os_mem_prealloc: failed to install signal handler");
+            return;
+        }
     }
 
     /* touch pages simultaneously */
-    ret = touch_all_pages(area, hpagesize, numpages, smp_cpus);
+    ret = touch_all_pages(area, hpagesize, numpages, smp_cpus,
+                          use_madv_populate_write);
     if (ret) {
         error_setg_errno(errp, -ret,
                          "os_mem_prealloc: preallocating memory failed");
     }
 
-    ret = sigaction(SIGBUS, &oldact, NULL);
-    if (ret) {
-        /* Terminate QEMU since it can't recover from error */
-        perror("os_mem_prealloc: failed to reinstall signal handler");
-        exit(1);
+    if (!use_madv_populate_write) {
+        ret = sigaction(SIGBUS, &oldact, NULL);
+        if (ret) {
+            /* Terminate QEMU since it can't recover from error */
+            perror("os_mem_prealloc: failed to reinstall signal handler");
+            exit(1);
+        }
     }
 }
 
-- 
2.31.1

From nobody Sat May 4 02:13:20 2024
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1638269284519414.51819013447687; Tue, 30 Nov 2021 02:48:04 -0800 (PST)
Received: from localhost ([::1]:47546 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ms0g7-0007qi-JT for importer@patchew.org; Tue, 30 Nov 2021 05:48:03 -0500
Received: from eggs.gnu.org ([209.51.188.92]:43504) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ms0aF-0005eT-Gp for
qemu-devel@nongnu.org; Tue, 30 Nov 2021 05:42:00 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:60815) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ms0aC-00011Q-RO for qemu-devel@nongnu.org; Tue, 30 Nov 2021 05:41:59 -0500 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-431-SG0WxjdeMu-F5UuEfvNdww-1; Tue, 30 Nov 2021 05:41:51 -0500 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id C03CF1018722; Tue, 30 Nov 2021 10:41:50 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.193.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id C3866100AE22; Tue, 30 Nov 2021 10:41:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1638268914; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=E2hPa3OeU2VShLBIWGRBI4Pl+wvAn0MdoP0MtSPhYqc=; b=bBGfyLO/QutWzKYIGKD/+uaF77kF3S6Q/Msd5y9l1gm6CTrV/32wEWdGI3U4xzMxqqEI9J eTy7IU3OziTgoxgmogGq47DuVsvNEBv0ITaC+h8QeCqecVzDWyNN8Kaw1KkV9CI8am6eaW LHUsSFDy7PKQY+FaYnNnP2+KQeWoMdU= X-MC-Unique: SG0WxjdeMu-F5UuEfvNdww-1 From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v1 3/8] util/oslib-posix: Introduce and use MemsetContext for touch_all_pages() Date: Tue, 30 Nov 2021 11:41:31 +0100 Message-Id: <20211130104136.40927-4-david@redhat.com> In-Reply-To: <20211130104136.40927-1-david@redhat.com> References: <20211130104136.40927-1-david@redhat.com> MIME-Version: 1.0 
X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22
Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Received-SPF: pass client-ip=170.10.129.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com
X-Spam_score_int: -34
X-Spam_score: -3.5
X-Spam_bar: ---
X-Spam_report: (-3.5 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.716, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Pankaj Gupta, Gavin Shan, Daniel P . Berrangé, Eduardo Habkost, "Michael S. Tsirkin", Michal Prívozník, David Hildenbrand, "Dr . David Alan Gilbert", Sebastien Boeuf, Igor Mammedov, Paolo Bonzini, Hui Zhu
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
X-ZM-MESSAGEID: 1638269285767100001

Let's minimize the number of global variables to prepare for
os_mem_prealloc() getting called concurrently and make the code a bit
easier to read.

The only consumer that really needs a global variable is the sigbus
handler, which will require protection via a mutex in the future either way
as we cannot concurrently mess with the SIGBUS handler.

Reviewed-by: Daniel P. Berrangé
Signed-off-by: David Hildenbrand
Reviewed-by: Michal Privoznik
---
 util/oslib-posix.c | 73 +++++++++++++++++++++++++++++-----------------
 1 file changed, 47 insertions(+), 26 deletions(-)

diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index cb89e07770..cf2ead54ad 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -73,21 +73,30 @@
 
 #define MAX_MEM_PREALLOC_THREAD_COUNT 16
 
+struct MemsetThread;
+
+typedef struct MemsetContext {
+    bool all_threads_created;
+    bool any_thread_failed;
+    struct MemsetThread *threads;
+    int num_threads;
+} MemsetContext;
+
 struct MemsetThread {
     char *addr;
     size_t numpages;
     size_t hpagesize;
     QemuThread pgthread;
     sigjmp_buf env;
+    MemsetContext *context;
 };
 typedef struct MemsetThread MemsetThread;
 
-static MemsetThread *memset_thread;
-static int memset_num_threads;
+/* used by sigbus_handler() */
+static MemsetContext *sigbus_memset_context;
 
 static QemuMutex page_mutex;
 static QemuCond page_cond;
-static bool threads_created_flag;
 
 int qemu_get_thread_id(void)
 {
@@ -438,10 +447,13 @@ const char *qemu_get_exec_dir(void)
 static void sigbus_handler(int signal)
 {
     int i;
-    if (memset_thread) {
-        for (i = 0; i < memset_num_threads; i++) {
-            if (qemu_thread_is_self(&memset_thread[i].pgthread)) {
-                siglongjmp(memset_thread[i].env, 1);
+
+    if (sigbus_memset_context) {
+        for (i = 0; i < sigbus_memset_context->num_threads; i++) {
+            MemsetThread *thread = &sigbus_memset_context->threads[i];
+
+            if (qemu_thread_is_self(&thread->pgthread)) {
+                siglongjmp(thread->env, 1);
             }
         }
     }
@@ -459,7 +471,7 @@ static void *do_touch_pages(void *arg)
      * clearing until all threads have been created.
      */
     qemu_mutex_lock(&page_mutex);
-    while(!threads_created_flag){
+    while (!memset_args->context->all_threads_created) {
         qemu_cond_wait(&page_cond, &page_mutex);
     }
     qemu_mutex_unlock(&page_mutex);
@@ -502,7 +514,7 @@ static void *do_madv_populate_write_pages(void *arg)
 
     /* See do_touch_pages(). */
     qemu_mutex_lock(&page_mutex);
-    while (!threads_created_flag) {
+    while (!memset_args->context->all_threads_created) {
         qemu_cond_wait(&page_cond, &page_mutex);
     }
     qemu_mutex_unlock(&page_mutex);
@@ -529,6 +541,9 @@ static int touch_all_pages(char *area, size_t hpagesize, size_t numpages,
                            int smp_cpus, bool use_madv_populate_write)
 {
     static gsize initialized = 0;
+    MemsetContext context = {
+        .num_threads = get_memset_num_threads(smp_cpus),
+    };
     size_t numpages_per_thread, leftover;
     void *(*touch_fn)(void *);
     int ret = 0, i = 0;
@@ -546,35 +561,41 @@ static int touch_all_pages(char *area, size_t hpagesize, size_t numpages,
         touch_fn = do_touch_pages;
     }
 
-    threads_created_flag = false;
-    memset_num_threads = get_memset_num_threads(smp_cpus);
-    memset_thread = g_new0(MemsetThread, memset_num_threads);
-    numpages_per_thread = numpages / memset_num_threads;
-    leftover = numpages % memset_num_threads;
-    for (i = 0; i < memset_num_threads; i++) {
-        memset_thread[i].addr = addr;
-        memset_thread[i].numpages = numpages_per_thread + (i < leftover);
-        memset_thread[i].hpagesize = hpagesize;
-        qemu_thread_create(&memset_thread[i].pgthread, "touch_pages",
-                           touch_fn, &memset_thread[i],
+    context.threads = g_new0(MemsetThread, context.num_threads);
+    numpages_per_thread = numpages / context.num_threads;
+    leftover = numpages % context.num_threads;
+    for (i = 0; i < context.num_threads; i++) {
+        context.threads[i].addr = addr;
+        context.threads[i].numpages = numpages_per_thread + (i < leftover);
+        context.threads[i].hpagesize = hpagesize;
+        context.threads[i].context = &context;
+        qemu_thread_create(&context.threads[i].pgthread, "touch_pages",
+                           touch_fn, &context.threads[i],
                            QEMU_THREAD_JOINABLE);
-        addr += memset_thread[i].numpages * hpagesize;
+        addr += context.threads[i].numpages * hpagesize;
+    }
+
+    if (!use_madv_populate_write) {
+        sigbus_memset_context = &context;
     }
 
     qemu_mutex_lock(&page_mutex);
-    threads_created_flag = true;
+    context.all_threads_created = true;
     qemu_cond_broadcast(&page_cond);
     qemu_mutex_unlock(&page_mutex);
 
-    for (i = 0; i < memset_num_threads; i++) {
-        int tmp = (uintptr_t)qemu_thread_join(&memset_thread[i].pgthread);
+    for (i = 0; i < context.num_threads; i++) {
+        int tmp = (uintptr_t)qemu_thread_join(&context.threads[i].pgthread);
 
         if (tmp) {
             ret = tmp;
         }
     }
-    g_free(memset_thread);
-    memset_thread = NULL;
+
+    if (!use_madv_populate_write) {
+        sigbus_memset_context = NULL;
+    }
+    g_free(context.threads);
 
     return ret;
 }
-- 
2.31.1

From nobody Sat May 4 02:13:20 2024
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1638269278037899.1759734361515; Tue, 30 Nov 2021 02:47:58 -0800 (PST)
Received: from localhost ([::1]:47330 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ms0g0-0007hY-V2 for importer@patchew.org; Tue, 30 Nov 2021 05:47:57 -0500
Received: from eggs.gnu.org ([209.51.188.92]:43540) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ms0aH-0005fO-GG for qemu-devel@nongnu.org; Tue, 30 Nov 2021 05:42:02 -0500
Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:45564) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ms0aF-00012X-Hc for qemu-devel@nongnu.org; Tue, 30 Nov 2021 05:42:01 -0500
Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
us-mta-26-nXcuoLjxMMG3KaQ9p_5oRA-1; Tue, 30 Nov 2021 05:41:55 -0500 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 760851018720; Tue, 30 Nov 2021 10:41:54 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.193.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1D464100AE22; Tue, 30 Nov 2021 10:41:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1638268919; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6PRu0OPJVIqoF1U6fV+wR5ie0tm5wZlfI3mKkoAKXfo=; b=LjKIUejhtiVcG86z6FMTxf942VMcpK1s0hdAG9oFXIoY/l6DKoHRHf5opQ7crz9ki6rHHp oPBETC60a6dMFo/478/7mV5UXubIbgkkNq9FY6O1jqNhwf8y8o0IkheawAciPTYZbH54do dzroDBX4GG2iq3abgb1eeVhUXzteNVE= X-MC-Unique: nXcuoLjxMMG3KaQ9p_5oRA-1 From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v1 4/8] util/oslib-posix: Don't create too many threads with small memory or little pages Date: Tue, 30 Nov 2021 11:41:32 +0100 Message-Id: <20211130104136.40927-5-david@redhat.com> In-Reply-To: <20211130104136.40927-1-david@redhat.com> References: <20211130104136.40927-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; 
Received-SPF: pass client-ip=170.10.133.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -34 X-Spam_score: -3.5 X-Spam_bar: --- X-Spam_report: (-3.5 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.716, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Pankaj Gupta , Gavin Shan , =?UTF-8?q?Daniel=20P=20=2E=20Berrang=C3=A9?= , Eduardo Habkost , "Michael S. Tsirkin" , =?UTF-8?q?Michal=20Pr=C3=ADvozn=C3=ADk?= , David Hildenbrand , "Dr . David Alan Gilbert" , Pankaj Gupta , Sebastien Boeuf , Igor Mammedov , Paolo Bonzini , Hui Zhu Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1638269278874100001 Let's limit the number of threads to something sane, especially that - We don't have more threads than the number of pages we have - We don't have threads that initialize small (< 64 MiB) memory Reviewed-by: Pankaj Gupta Reviewed-by: Dr. David Alan Gilbert Reviewed-by: Daniel P. 
Berrang=C3=A9 Signed-off-by: David Hildenbrand Reviewed-by: Michal Privoznik --- util/oslib-posix.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/util/oslib-posix.c b/util/oslib-posix.c index cf2ead54ad..67c08a425e 100644 --- a/util/oslib-posix.c +++ b/util/oslib-posix.c @@ -40,6 +40,7 @@ #include #include "qemu/cutils.h" #include "qemu/compiler.h" +#include "qemu/units.h" =20 #ifdef CONFIG_LINUX #include @@ -525,7 +526,8 @@ static void *do_madv_populate_write_pages(void *arg) return (void *)(uintptr_t)ret; } =20 -static inline int get_memset_num_threads(int smp_cpus) +static inline int get_memset_num_threads(size_t hpagesize, size_t numpages, + int smp_cpus) { long host_procs =3D sysconf(_SC_NPROCESSORS_ONLN); int ret =3D 1; @@ -533,6 +535,12 @@ static inline int get_memset_num_threads(int smp_cpus) if (host_procs > 0) { ret =3D MIN(MIN(host_procs, MAX_MEM_PREALLOC_THREAD_COUNT), smp_cp= us); } + + /* Especially with gigantic pages, don't create more threads than page= s. */ + ret =3D MIN(ret, numpages); + /* Don't start threads to prealloc comparatively little memory. 
*/ + ret =3D MIN(ret, MAX(1, hpagesize * numpages / (64 * MiB))); + /* In case sysconf() fails, we fall back to single threaded */ return ret; } @@ -542,7 +550,7 @@ static int touch_all_pages(char *area, size_t hpagesize= , size_t numpages, { static gsize initialized =3D 0; MemsetContext context =3D { - .num_threads =3D get_memset_num_threads(smp_cpus), + .num_threads =3D get_memset_num_threads(hpagesize, numpages, smp_c= pus), }; size_t numpages_per_thread, leftover; void *(*touch_fn)(void *); --=20 2.31.1 From nobody Sat May 4 02:13:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1638269318458612.8066678982716; Tue, 30 Nov 2021 02:48:38 -0800 (PST) Received: from localhost ([::1]:49712 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ms0gf-0000uF-E8 for importer@patchew.org; Tue, 30 Nov 2021 05:48:37 -0500 Received: from eggs.gnu.org ([209.51.188.92]:43584) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ms0aK-0005gW-DJ for qemu-devel@nongnu.org; Tue, 30 Nov 2021 05:42:04 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:41300) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ms0aI-000136-V5 for qemu-devel@nongnu.org; Tue, 30 Nov 2021 05:42:04 -0500 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-99-MatqU4wDPu-2YxpKyQUcNA-1; Tue, 30 Nov 2021 05:41:59 -0500 Received: from 
smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id EA7ED192CC4A; Tue, 30 Nov 2021 10:41:57 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.193.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id D4208101E592; Tue, 30 Nov 2021 10:41:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1638268922; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Wlt40QBQkDU6CaDYw9NJ1KuuNgpytHW2nxi1LfiTHfw=; b=aIRcbzClfHfWFSd+Eqc24sCwT1TkR2SeSVf72nF4Q1PhHWsvbPCC5aHxahTigsXCIcR9/Y GIDrvDAZzEl4BfLqNHoYPLCSGOeZzfoTrGHlToPMWhqg58BwZQ40ij7cUfRhkC1zv3bxdb gbxDO6pX8tHrdSG+hljFQ+EFcC+kn04= X-MC-Unique: MatqU4wDPu-2YxpKyQUcNA-1 From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v1 5/8] util/oslib-posix: Avoid creating a single thread with MADV_POPULATE_WRITE Date: Tue, 30 Nov 2021 11:41:33 +0100 Message-Id: <20211130104136.40927-6-david@redhat.com> In-Reply-To: <20211130104136.40927-1-david@redhat.com> References: <20211130104136.40927-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=david@redhat.com; 
helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -34 X-Spam_score: -3.5 X-Spam_bar: --- X-Spam_report: (-3.5 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.716, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Pankaj Gupta , Gavin Shan , =?UTF-8?q?Daniel=20P=20=2E=20Berrang=C3=A9?= , Eduardo Habkost , "Michael S. Tsirkin" , =?UTF-8?q?Michal=20Pr=C3=ADvozn=C3=ADk?= , David Hildenbrand , "Dr . David Alan Gilbert" , Pankaj Gupta , Sebastien Boeuf , Igor Mammedov , Paolo Bonzini , Hui Zhu Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1638269319922100001 Let's simplify the case when we only want a single thread and don't have to mess with signal handlers. Reviewed-by: Pankaj Gupta Reviewed-by: Daniel P. 
Berrang=C3=A9 Signed-off-by: David Hildenbrand Reviewed-by: Michal Privoznik --- util/oslib-posix.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/util/oslib-posix.c b/util/oslib-posix.c index 67c08a425e..efa4f96d56 100644 --- a/util/oslib-posix.c +++ b/util/oslib-posix.c @@ -564,6 +564,14 @@ static int touch_all_pages(char *area, size_t hpagesiz= e, size_t numpages, } =20 if (use_madv_populate_write) { + /* Avoid creating a single thread for MADV_POPULATE_WRITE */ + if (context.num_threads =3D=3D 1) { + if (qemu_madvise(area, hpagesize * numpages, + QEMU_MADV_POPULATE_WRITE)) { + return -errno; + } + return 0; + } touch_fn =3D do_madv_populate_write_pages; } else { touch_fn =3D do_touch_pages; --=20 2.31.1 From nobody Sat May 4 02:13:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1638269564037927.9957846487307; Tue, 30 Nov 2021 02:52:44 -0800 (PST) Received: from localhost ([::1]:57400 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ms0kc-0006Ir-Vx for importer@patchew.org; Tue, 30 Nov 2021 05:52:43 -0500 Received: from eggs.gnu.org ([209.51.188.92]:43640) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ms0aQ-000610-MU for qemu-devel@nongnu.org; Tue, 30 Nov 2021 05:42:10 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:36301) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ms0aO-00013p-3K for qemu-devel@nongnu.org; Tue, 30 Nov 2021 05:42:10 -0500 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) 
by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-343-wjfPuAIbOdS7oPvOWtH9Nw-1; Tue, 30 Nov 2021 05:42:02 -0500 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 5490F86A063; Tue, 30 Nov 2021 10:42:01 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.193.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4DCF910114AE; Tue, 30 Nov 2021 10:41:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1638268927; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=e29Gggx2QJ++Sey6LRiwKPCe4BhS91CgAJ0+aKHqpRE=; b=fEdPYT7pqoH6TbPO4CU3yHHxgn7g0sIx9LQP8SjTaDyNEhcQ8HkszqpgXYmrzfOiZDqr3W RlJZxKy7lHzAkDyScxBcg4mpX6k0MRwtom2UG0hpjmOpmEIOwU6YT56wzPkeR5moHp+Cgc bb0VVaarO3Oo/zbct6MX2YCzpLXVI0k= X-MC-Unique: wjfPuAIbOdS7oPvOWtH9Nw-1 From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v1 6/8] util/oslib-posix: Support concurrent os_mem_prealloc() invocation Date: Tue, 30 Nov 2021 11:41:34 +0100 Message-Id: <20211130104136.40927-7-david@redhat.com> In-Reply-To: <20211130104136.40927-1-david@redhat.com> References: <20211130104136.40927-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) 
client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -34 X-Spam_score: -3.5 X-Spam_bar: --- X-Spam_report: (-3.5 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.716, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Pankaj Gupta , Gavin Shan , =?UTF-8?q?Daniel=20P=20=2E=20Berrang=C3=A9?= , Eduardo Habkost , "Michael S. Tsirkin" , =?UTF-8?q?Michal=20Pr=C3=ADvozn=C3=ADk?= , David Hildenbrand , "Dr . David Alan Gilbert" , Pankaj Gupta , Sebastien Boeuf , Igor Mammedov , Paolo Bonzini , Hui Zhu Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1638269565928100001 Add a mutex to protect the SIGBUS case, as we cannot mess concurrently with the sigbus handler and we have to manage the global variable sigbus_memset_context. The MADV_POPULATE_WRITE path can run concurrently. Note that page_mutex and page_cond are shared between concurrent invocations, which shouldn't be a problem. This is a preparation for future virtio-mem prealloc code, which will call os_mem_prealloc() asynchronously from an iothread when handling guest requests. Reviewed-by: Pankaj Gupta Reviewed-by: Daniel P. 
Berrang=C3=A9 Signed-off-by: David Hildenbrand Reviewed-by: Michal Privoznik --- util/oslib-posix.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/util/oslib-posix.c b/util/oslib-posix.c index efa4f96d56..9829149e4b 100644 --- a/util/oslib-posix.c +++ b/util/oslib-posix.c @@ -95,6 +95,7 @@ typedef struct MemsetThread MemsetThread; =20 /* used by sigbus_handler() */ static MemsetContext *sigbus_memset_context; +static QemuMutex sigbus_mutex; =20 static QemuMutex page_mutex; static QemuCond page_cond; @@ -625,6 +626,7 @@ static bool madv_populate_write_possible(char *area, si= ze_t pagesize) void os_mem_prealloc(int fd, char *area, size_t memory, int smp_cpus, Error **errp) { + static gsize initialized; int ret; struct sigaction act, oldact; size_t hpagesize =3D qemu_fd_getpagesize(fd); @@ -638,6 +640,12 @@ void os_mem_prealloc(int fd, char *area, size_t memory= , int smp_cpus, use_madv_populate_write =3D madv_populate_write_possible(area, hpagesi= ze); =20 if (!use_madv_populate_write) { + if (g_once_init_enter(&initialized)) { + qemu_mutex_init(&sigbus_mutex); + g_once_init_leave(&initialized, 1); + } + + qemu_mutex_lock(&sigbus_mutex); memset(&act, 0, sizeof(act)); act.sa_handler =3D &sigbus_handler; act.sa_flags =3D 0; @@ -665,6 +673,7 @@ void os_mem_prealloc(int fd, char *area, size_t memory,= int smp_cpus, perror("os_mem_prealloc: failed to reinstall signal handler"); exit(1); } + qemu_mutex_unlock(&sigbus_mutex); } } =20 --=20 2.31.1 From nobody Sat May 4 02:13:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1638269431793464.4194807484894; Tue, 30 Nov 2021 02:50:31 -0800 (PST) Received: from 
localhost ([::1]:53490 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ms0iU-0003V6-ND for importer@patchew.org; Tue, 30 Nov 2021 05:50:30 -0500 Received: from eggs.gnu.org ([209.51.188.92]:43632) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ms0aP-0005yI-KV for qemu-devel@nongnu.org; Tue, 30 Nov 2021 05:42:09 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:53975) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ms0aN-00013Z-IW for qemu-devel@nongnu.org; Tue, 30 Nov 2021 05:42:09 -0500 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-250-HDutT3M6PDy8lALyxt-X_w-1; Tue, 30 Nov 2021 05:42:05 -0500 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 9F929192CC40; Tue, 30 Nov 2021 10:42:04 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.193.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id BE548100AE22; Tue, 30 Nov 2021 10:42:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1638268926; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5OgDf6BrqOXrYlJwSdM6+2Enyt2bTUEUbFrjj4pWhyM=; b=Uc8wpClQ06vQlvCHLqqMXMCincK0qsCZJp528hvwS1+F7+xTWEuB2Z1L5GOrCLW50/NBF3 Z37d5DswB3KEOQXbq1REcy7/+ylES7ksZRfy4GxkHgENGOgPdARctnNiaM+R/gC/SiPoQA JbPO8TR1g1zq+tHGpRHiUm/sA7JERmg= X-MC-Unique: HDutT3M6PDy8lALyxt-X_w-1 From: David 
Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v1 7/8] util/oslib-posix: Forward SIGBUS to MCE handler under Linux Date: Tue, 30 Nov 2021 11:41:35 +0100 Message-Id: <20211130104136.40927-8-david@redhat.com> In-Reply-To: <20211130104136.40927-1-david@redhat.com> References: <20211130104136.40927-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -34 X-Spam_score: -3.5 X-Spam_bar: --- X-Spam_report: (-3.5 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.716, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Pankaj Gupta , Gavin Shan , =?UTF-8?q?Daniel=20P=20=2E=20Berrang=C3=A9?= , Eduardo Habkost , "Michael S. Tsirkin" , =?UTF-8?q?Michal=20Pr=C3=ADvozn=C3=ADk?= , David Hildenbrand , "Dr . 
David Alan Gilbert" , Sebastien Boeuf , Igor Mammedov , Paolo Bonzini , Hui Zhu Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1638269433516100003 Temporarily modifying the SIGBUS handler is really nasty, as we might be unlucky and receive an MCE SIGBUS while having our handler registered. Unfortunately, there is no way around messing with SIGBUS when MADV_POPULATE_WRITE is not applicable or not around. Let's forward SIGBUS that don't belong to us to the already registered handler and document the situation. Reviewed-by: Daniel P. Berrang=C3=A9 Signed-off-by: David Hildenbrand Reviewed-by: Michal Privoznik --- softmmu/cpus.c | 4 ++++ util/oslib-posix.c | 36 +++++++++++++++++++++++++++++++++--- 2 files changed, 37 insertions(+), 3 deletions(-) diff --git a/softmmu/cpus.c b/softmmu/cpus.c index 071085f840..23bca46b07 100644 --- a/softmmu/cpus.c +++ b/softmmu/cpus.c @@ -352,6 +352,10 @@ static void qemu_init_sigbus(void) { struct sigaction action; =20 + /* + * ALERT: when modifying this, take care that SIGBUS forwarding in + * os_mem_prealloc() will continue working as expected. 
+     */
     memset(&action, 0, sizeof(action));
     action.sa_flags = SA_SIGINFO;
     action.sa_sigaction = sigbus_handler;
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 9829149e4b..5c47aa9cb7 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -95,6 +95,7 @@ typedef struct MemsetThread MemsetThread;
 
 /* used by sigbus_handler() */
 static MemsetContext *sigbus_memset_context;
+struct sigaction sigbus_oldact;
 static QemuMutex sigbus_mutex;
 
 static QemuMutex page_mutex;
@@ -446,7 +447,11 @@ const char *qemu_get_exec_dir(void)
     return exec_dir;
 }
 
+#ifdef CONFIG_LINUX
+static void sigbus_handler(int signal, siginfo_t *siginfo, void *ctx)
+#else /* CONFIG_LINUX */
 static void sigbus_handler(int signal)
+#endif /* CONFIG_LINUX */
 {
     int i;
 
@@ -459,6 +464,26 @@ static void sigbus_handler(int signal)
             }
         }
     }
+
+#ifdef CONFIG_LINUX
+    /*
+     * We assume that the MCE SIGBUS handler could have been registered. We
+     * should never receive BUS_MCEERR_AO on any of our threads, but only on
+     * the main thread registered for PR_MCE_KILL_EARLY. Further, we should not
+     * receive BUS_MCEERR_AR triggered by action of other threads on one of
+     * our threads. So, no need to check for unrelated SIGBUS when seeing one
+     * for our threads.
+     *
+     * We will forward to the MCE handler, which will either handle the SIGBUS
+     * or reinstall the default SIGBUS handler and reraise the SIGBUS. The
+     * default SIGBUS handler will crash the process, so we don't care.
+     */
+    if (sigbus_oldact.sa_flags & SA_SIGINFO) {
+        sigbus_oldact.sa_sigaction(signal, siginfo, ctx);
+        return;
+    }
+#endif /* CONFIG_LINUX */
+    warn_report("os_mem_prealloc: unrelated SIGBUS detected and ignored");
 }
 
 static void *do_touch_pages(void *arg)
@@ -628,10 +653,10 @@ void os_mem_prealloc(int fd, char *area, size_t memory, int smp_cpus,
 {
     static gsize initialized;
     int ret;
-    struct sigaction act, oldact;
     size_t hpagesize = qemu_fd_getpagesize(fd);
     size_t numpages = DIV_ROUND_UP(memory, hpagesize);
     bool use_madv_populate_write;
+    struct sigaction act;
 
     /*
      * Sense on every invocation, as MADV_POPULATE_WRITE cannot be used for
@@ -647,10 +672,15 @@ void os_mem_prealloc(int fd, char *area, size_t memory, int smp_cpus,
 
         qemu_mutex_lock(&sigbus_mutex);
         memset(&act, 0, sizeof(act));
+#ifdef CONFIG_LINUX
+        act.sa_sigaction = &sigbus_handler;
+        act.sa_flags = SA_SIGINFO;
+#else /* CONFIG_LINUX */
         act.sa_handler = &sigbus_handler;
         act.sa_flags = 0;
+#endif /* CONFIG_LINUX */
 
-        ret = sigaction(SIGBUS, &act, &oldact);
+        ret = sigaction(SIGBUS, &act, &sigbus_oldact);
         if (ret) {
             error_setg_errno(errp, errno,
                 "os_mem_prealloc: failed to install signal handler");
@@ -667,7 +697,7 @@ void os_mem_prealloc(int fd, char *area, size_t memory, int smp_cpus,
     }
 
     if (!use_madv_populate_write) {
-        ret = sigaction(SIGBUS, &oldact, NULL);
+        ret = sigaction(SIGBUS, &sigbus_oldact, NULL);
         if (ret) {
             /* Terminate QEMU since it can't recover from error */
             perror("os_mem_prealloc: failed to reinstall signal handler");
-- 
2.31.1

From nobody Sat May 4 02:13:20 2024
From: David Hildenbrand
To: qemu-devel@nongnu.org
Subject: [PATCH v1 8/8] virtio-mem: Support "prealloc=on" option
Date: Tue, 30 Nov 2021 11:41:36 +0100
Message-Id: <20211130104136.40927-9-david@redhat.com>
In-Reply-To: <20211130104136.40927-1-david@redhat.com>
References: <20211130104136.40927-1-david@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Cc: Pankaj Gupta, Gavin Shan, Daniel P. Berrangé, Eduardo Habkost,
 "Michael S. Tsirkin", Michal Prívozník, David Hildenbrand,
 "Dr . David Alan Gilbert", Sebastien Boeuf, Igor Mammedov, Paolo Bonzini,
 Hui Zhu

For scarce memory resources, such as hugetlb, we want to be able to
prealloc such memory resources in order to not crash later on access. On
simple user errors we could otherwise easily run out of memory resources
and crash the VM -- pretty much undesired.

For ordinary memory devices, such as DIMMs, we preallocate memory via the
memory backend for such use cases; however, with virtio-mem we're dealing
with sparse memory backends; preallocating the whole memory backend
destroys the whole purpose of virtio-mem.

Instead, we want to preallocate memory when actually exposing memory to the
VM dynamically, and fail plugging memory gracefully + warn the user in case
preallocation fails.

A common use case for hugetlb will be using "reserve=off,prealloc=off" for
the memory backend and "prealloc=on" for the virtio-mem device. This
way, no huge pages will be reserved for the process, but we can recover
if there are no actual huge pages when plugging memory. Libvirt is
already prepared for this.

Note that preallocation cannot protect from the OOM killer -- which
holds true for any kind of preallocation in QEMU. It's primarily useful
only for scarce memory resources such as hugetlb, or shared file-backed
memory. It's of little use for ordinary anonymous memory that can be
swapped, KSM merged, ... but we won't forbid it.

Signed-off-by: David Hildenbrand
Reviewed-by: Michal Privoznik
---
 hw/virtio/virtio-mem.c         | 39 ++++++++++++++++++++++++++++++----
 include/hw/virtio/virtio-mem.h |  4 ++++
 2 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index a5d26d414f..33c7884aa0 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -450,10 +450,40 @@ static int virtio_mem_set_block_state(VirtIOMEM *vmem, uint64_t start_gpa,
             return -EBUSY;
         }
         virtio_mem_notify_unplug(vmem, offset, size);
-    } else if (virtio_mem_notify_plug(vmem, offset, size)) {
-        /* Could be a mapping attempt resulted in memory getting populated. */
-        ram_block_discard_range(vmem->memdev->mr.ram_block, offset, size);
-        return -EBUSY;
+    } else {
+        int ret = 0;
+
+        if (vmem->prealloc) {
+            void *area = memory_region_get_ram_ptr(&vmem->memdev->mr) + offset;
+            int fd = memory_region_get_fd(&vmem->memdev->mr);
+            Error *local_err = NULL;
+
+            os_mem_prealloc(fd, area, size, 1, &local_err);
+            if (local_err) {
+                static bool warned;
+
+                /*
+                 * Warn only once, we don't want to fill the log with these
+                 * warnings.
+                 */
+                if (!warned) {
+                    warn_report_err(local_err);
+                    warned = true;
+                } else {
+                    error_free(local_err);
+                }
+                ret = -EBUSY;
+            }
+        }
+        if (!ret) {
+            ret = virtio_mem_notify_plug(vmem, offset, size);
+        }
+
+        if (ret) {
+            /* Could be preallocation or a notifier populated memory. */
+            ram_block_discard_range(vmem->memdev->mr.ram_block, offset, size);
+            return -EBUSY;
+        }
     }
     virtio_mem_set_bitmap(vmem, start_gpa, size, plug);
     return 0;
@@ -1165,6 +1195,7 @@ static void virtio_mem_instance_init(Object *obj)
 static Property virtio_mem_properties[] = {
     DEFINE_PROP_UINT64(VIRTIO_MEM_ADDR_PROP, VirtIOMEM, addr, 0),
     DEFINE_PROP_UINT32(VIRTIO_MEM_NODE_PROP, VirtIOMEM, node, 0),
+    DEFINE_PROP_BOOL(VIRTIO_MEM_PREALLOC_PROP, VirtIOMEM, prealloc, false),
     DEFINE_PROP_LINK(VIRTIO_MEM_MEMDEV_PROP, VirtIOMEM, memdev,
                      TYPE_MEMORY_BACKEND, HostMemoryBackend *),
 #if defined(VIRTIO_MEM_HAS_LEGACY_GUESTS)
diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h
index 38c67a89f2..7745cfc1a3 100644
--- a/include/hw/virtio/virtio-mem.h
+++ b/include/hw/virtio/virtio-mem.h
@@ -31,6 +31,7 @@ OBJECT_DECLARE_TYPE(VirtIOMEM, VirtIOMEMClass,
 #define VIRTIO_MEM_BLOCK_SIZE_PROP "block-size"
 #define VIRTIO_MEM_ADDR_PROP "memaddr"
 #define VIRTIO_MEM_UNPLUGGED_INACCESSIBLE_PROP "unplugged-inaccessible"
+#define VIRTIO_MEM_PREALLOC_PROP "prealloc"
 
 struct VirtIOMEM {
     VirtIODevice parent_obj;
@@ -70,6 +71,9 @@ struct VirtIOMEM {
      */
     OnOffAuto unplugged_inaccessible;
 
+    /* whether to prealloc memory when plugging new blocks */
+    bool prealloc;
+
     /* notifiers to notify when "size" changes */
     NotifierList size_change_notifiers;
 
-- 
2.31.1
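
For readers who want to experiment with the thread-count limits introduced in patch 4/8 outside of QEMU, the clamping can be sketched as a standalone function. This is only an illustration under stated assumptions, not the code from the series: clamp_prealloc_threads(), min_sz(), max_sz(), MIB and MAX_THREADS are hypothetical names, MAX_THREADS stands in for MAX_MEM_PREALLOC_THREAD_COUNT (its value of 16 is assumed here), and host_procs is passed in as a parameter instead of being read via sysconf().

```c
#include <stddef.h>

#define MIB ((size_t)1 << 20)
#define MAX_THREADS 16 /* assumed stand-in for MAX_MEM_PREALLOC_THREAD_COUNT */

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }
static size_t max_sz(size_t a, size_t b) { return a > b ? a : b; }

/*
 * Hypothetical helper mirroring the clamping order of patch 4/8.
 * host_procs <= 0 models a failing sysconf(); smp_cpus is assumed positive.
 */
static int clamp_prealloc_threads(long host_procs, int smp_cpus,
                                  size_t hpagesize, size_t numpages)
{
    size_t ret = 1;

    if (host_procs > 0) {
        ret = min_sz(min_sz((size_t)host_procs, MAX_THREADS),
                     (size_t)smp_cpus);
    }
    /* Never more threads than pages (matters with gigantic pages). */
    ret = min_sz(ret, numpages);
    /* At most one thread per full 64 MiB of memory, but at least one. */
    ret = min_sz(ret, max_sz(1, hpagesize * numpages / (64 * MIB)));

    return (int)ret;
}
```

With these assumptions, two 1 GiB pages on an 8-CPU host yield 2 threads (limited by the page count), while 32 MiB of 2 MiB pages yields a single thread (limited by the 64 MiB rule).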