From nobody Thu May 2 11:36:58 2024
From: Murilo Opsfelder Araujo
To: qemu-devel@nongnu.org, qemu-ppc@nongnu.org
Cc: Peter Crosthwaite, Fabiano Rosas, "Michael S . Tsirkin", Greg Kurz,
 Cao jin, mopsfelder@gmail.com, Murilo Opsfelder Araujo, Paolo Bonzini,
 Richard Henderson, David Gibson
Date: Wed, 30 Jan 2019 21:36:04 -0200
Subject: [Qemu-devel] [PATCH 1/2] mmap-alloc: unfold qemu_ram_mmap()
Message-Id: <20190130233605.22163-2-muriloo@linux.ibm.com>
In-Reply-To: <20190130233605.22163-1-muriloo@linux.ibm.com>
References: <20190130233605.22163-1-muriloo@linux.ibm.com>
X-Mailer: git-send-email 2.20.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Unfold parts of qemu_ram_mmap() for the sake of understanding, moving
declarations to the top, and keeping architecture-specifics in the
ifdef-else blocks.  No changes in the function behaviour.

Give ptr and ptr1 meaningful names:

  ptr  -> guardptr : pointer to the PROT_NONE guard region
  ptr1 -> ptr      : pointer to the mapped memory returned to caller

Signed-off-by: Murilo Opsfelder Araujo
Reported-by: Balamuruhan S
Reviewed-by: Greg Kurz
Tested-by: Balamuruhan S
---
 util/mmap-alloc.c | 53 ++++++++++++++++++++++++++++++-----------------
 1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index fd329eccd8..f71ea038c8 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -77,11 +77,19 @@ size_t qemu_mempath_getpagesize(const char *mem_path)
 
 void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
 {
+    int flags;
+    int guardfd;
+    size_t offset;
+    size_t total;
+    void *guardptr;
+    void *ptr;
+
     /*
      * Note: this always allocates at least one extra page of virtual address
      * space, even if size is already aligned.
      */
-    size_t total = size + align;
+    total = size + align;
+
 #if defined(__powerpc64__) && defined(__linux__)
     /* On ppc64 mappings in the same segment (aka slice) must share the same
      * page size. Since we will be re-allocating part of this segment
@@ -91,16 +99,22 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
      * We do this unless we are using the system page size, in which case
      * anonymous memory is OK.
      */
-    int anonfd = fd == -1 || qemu_fd_getpagesize(fd) == getpagesize() ? -1 : fd;
-    int flags = anonfd == -1 ? MAP_ANONYMOUS : MAP_NORESERVE;
-    void *ptr = mmap(0, total, PROT_NONE, flags | MAP_PRIVATE, anonfd, 0);
+    flags = MAP_PRIVATE;
+    if (fd == -1 || qemu_fd_getpagesize(fd) == getpagesize()) {
+        guardfd = -1;
+        flags |= MAP_ANONYMOUS;
+    } else {
+        guardfd = fd;
+        flags |= MAP_NORESERVE;
+    }
 #else
-    void *ptr = mmap(0, total, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+    guardfd = -1;
+    flags = MAP_PRIVATE | MAP_ANONYMOUS;
 #endif
-    size_t offset;
-    void *ptr1;
 
-    if (ptr == MAP_FAILED) {
+    guardptr = mmap(0, total, PROT_NONE, flags, guardfd, 0);
+
+    if (guardptr == MAP_FAILED) {
         return MAP_FAILED;
     }
 
@@ -108,19 +122,20 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
     /* Always align to host page size */
     assert(align >= getpagesize());
 
-    offset = QEMU_ALIGN_UP((uintptr_t)ptr, align) - (uintptr_t)ptr;
-    ptr1 = mmap(ptr + offset, size, PROT_READ | PROT_WRITE,
-                MAP_FIXED |
-                (fd == -1 ? MAP_ANONYMOUS : 0) |
-                (shared ? MAP_SHARED : MAP_PRIVATE),
-                fd, 0);
-    if (ptr1 == MAP_FAILED) {
-        munmap(ptr, total);
+    flags = MAP_FIXED;
+    flags |= fd == -1 ? MAP_ANONYMOUS : 0;
+    flags |= shared ? MAP_SHARED : MAP_PRIVATE;
+    offset = QEMU_ALIGN_UP((uintptr_t)guardptr, align) - (uintptr_t)guardptr;
+
+    ptr = mmap(guardptr + offset, size, PROT_READ | PROT_WRITE, flags, fd, 0);
+
+    if (ptr == MAP_FAILED) {
+        munmap(guardptr, total);
         return MAP_FAILED;
     }
 
     if (offset > 0) {
-        munmap(ptr, offset);
+        munmap(guardptr, offset);
     }
 
     /*
@@ -129,10 +144,10 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
      */
     total -= offset;
     if (total > size + getpagesize()) {
-        munmap(ptr1 + size + getpagesize(), total - size - getpagesize());
+        munmap(ptr + size + getpagesize(), total - size - getpagesize());
     }
 
-    return ptr1;
+    return ptr;
 }
 
 void qemu_ram_munmap(void *ptr, size_t size)
-- 
2.20.1

From nobody Thu May 2 11:36:58 2024
From: Murilo Opsfelder Araujo
To: qemu-devel@nongnu.org, qemu-ppc@nongnu.org
Cc: Peter Crosthwaite, Fabiano Rosas, "Michael S . Tsirkin", Greg Kurz,
 Cao jin, mopsfelder@gmail.com, Murilo Opsfelder Araujo, Paolo Bonzini,
 Richard Henderson, David Gibson
Date: Wed, 30 Jan 2019 21:36:05 -0200
Subject: [Qemu-devel] [PATCH 2/2] mmap-alloc: fix hugetlbfs misaligned length in ppc64
Message-Id: <20190130233605.22163-3-muriloo@linux.ibm.com>
In-Reply-To: <20190130233605.22163-1-muriloo@linux.ibm.com>
References: <20190130233605.22163-1-muriloo@linux.ibm.com>
X-Mailer: git-send-email 2.20.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The commit 7197fb4058bcb68986bae2bb2c04d6370f3e7218 ("util/mmap-alloc:
fix hugetlb support on ppc64") fixed Huge TLB mappings on ppc64.

However, we still need to consider the underlying huge page size
during munmap() because it requires that both address and length be a
multiple of the underlying huge page size for Huge TLB mappings.
Quote from "Huge page (Huge TLB) mappings" paragraph under NOTES
section of the munmap(2) manual:

  "For munmap(), addr and length must both be a multiple of the
  underlying huge page size."

On ppc64, the munmap() in qemu_ram_munmap() does not work for Huge TLB
mappings because the mapped segment can be aligned with the underlying
huge page size, not aligned with the native system page size, as
returned by getpagesize().

This has the side effect of not releasing huge pages back to the pool
after a hugetlbfs file-backed memory device is hot-unplugged.

This patch fixes the situation in qemu_ram_mmap() and
qemu_ram_munmap() by considering the underlying page size on ppc64.

After this patch, memory hot-unplug releases huge pages back to the
pool.

Fixes: 7197fb4058bcb68986bae2bb2c04d6370f3e7218
Signed-off-by: Murilo Opsfelder Araujo
Reported-by: Balamuruhan S
Reviewed-by: Greg Kurz
Tested-by: Balamuruhan S
---
 exec.c                    |  4 ++--
 include/qemu/mmap-alloc.h |  2 +-
 util/mmap-alloc.c         | 22 ++++++++++++++++------
 util/oslib-posix.c        |  2 +-
 4 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/exec.c b/exec.c
index da3e635f91..0db6d8bf34 100644
--- a/exec.c
+++ b/exec.c
@@ -1871,7 +1871,7 @@ static void *file_ram_alloc(RAMBlock *block,
     if (mem_prealloc) {
         os_mem_prealloc(fd, area, memory, smp_cpus, errp);
         if (errp && *errp) {
-            qemu_ram_munmap(area, memory);
+            qemu_ram_munmap(fd, area, memory);
             return NULL;
         }
     }
@@ -2392,7 +2392,7 @@ static void reclaim_ramblock(RAMBlock *block)
         xen_invalidate_map_cache_entry(block->host);
 #ifndef _WIN32
     } else if (block->fd >= 0) {
-        qemu_ram_munmap(block->host, block->max_length);
+        qemu_ram_munmap(block->fd, block->host, block->max_length);
         close(block->fd);
 #endif
     } else {
diff --git a/include/qemu/mmap-alloc.h b/include/qemu/mmap-alloc.h
index 50385e3f81..ef04f0ed5b 100644
--- a/include/qemu/mmap-alloc.h
+++ b/include/qemu/mmap-alloc.h
@@ -9,6 +9,6 @@ size_t qemu_mempath_getpagesize(const char *mem_path);
 
 void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared);
 
-void qemu_ram_munmap(void *ptr, size_t size);
+void qemu_ram_munmap(int fd, void *ptr, size_t size);
 
 #endif
diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index f71ea038c8..8565885420 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -80,6 +80,7 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
     int flags;
     int guardfd;
     size_t offset;
+    size_t pagesize;
     size_t total;
     void *guardptr;
     void *ptr;
@@ -100,7 +101,8 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
      * anonymous memory is OK.
      */
     flags = MAP_PRIVATE;
-    if (fd == -1 || qemu_fd_getpagesize(fd) == getpagesize()) {
+    pagesize = qemu_fd_getpagesize(fd);
+    if (fd == -1 || pagesize == getpagesize()) {
         guardfd = -1;
         flags |= MAP_ANONYMOUS;
     } else {
@@ -109,6 +111,7 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
     }
 #else
     guardfd = -1;
+    pagesize = getpagesize();
     flags = MAP_PRIVATE | MAP_ANONYMOUS;
 #endif
 
@@ -120,7 +123,7 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
 
     assert(is_power_of_2(align));
     /* Always align to host page size */
-    assert(align >= getpagesize());
+    assert(align >= pagesize);
 
     flags = MAP_FIXED;
     flags |= fd == -1 ? MAP_ANONYMOUS : 0;
@@ -143,17 +146,24 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
      * a guard page guarding against potential buffer overflows.
      */
     total -= offset;
-    if (total > size + getpagesize()) {
-        munmap(ptr + size + getpagesize(), total - size - getpagesize());
+    if (total > size + pagesize) {
+        munmap(ptr + size + pagesize, total - size - pagesize);
     }
 
     return ptr;
 }
 
-void qemu_ram_munmap(void *ptr, size_t size)
+void qemu_ram_munmap(int fd, void *ptr, size_t size)
 {
+    size_t pagesize;
+
     if (ptr) {
         /* Unmap both the RAM block and the guard page */
-        munmap(ptr, size + getpagesize());
+#if defined(__powerpc64__) && defined(__linux__)
+        pagesize = qemu_fd_getpagesize(fd);
+#else
+        pagesize = getpagesize();
+#endif
+        munmap(ptr, size + pagesize);
     }
 }
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 4ce1ba9ca4..37c5854b9c 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -226,7 +226,7 @@ void qemu_vfree(void *ptr)
 void qemu_anon_ram_free(void *ptr, size_t size)
 {
     trace_qemu_anon_ram_free(ptr, size);
-    qemu_ram_munmap(ptr, size);
+    qemu_ram_munmap(-1, ptr, size);
 }
 
 void qemu_set_block(int fd)
-- 
2.20.1
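
For readers following the series, here is a minimal standalone sketch of
the reserve-then-remap technique that qemu_ram_mmap() uses, and of why
the unmap length has to be computed with the same page size as the map.
It is not QEMU code: aligned_ram_mmap(), aligned_ram_munmap() and
round_up() are hypothetical names, the sketch handles anonymous memory
only, it assumes size is a multiple of pagesize, and the pagesize
parameter stands in for what the patch obtains from qemu_fd_getpagesize()
on ppc64.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict C modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: round n up to a multiple of align (a power of 2). */
static uintptr_t round_up(uintptr_t n, uintptr_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/*
 * Hypothetical, anonymous-memory-only analogue of qemu_ram_mmap().
 * Assumes: align is a power of 2, align >= pagesize, and size is a
 * multiple of pagesize.
 */
static void *aligned_ram_mmap(size_t size, size_t align, size_t pagesize)
{
    size_t total = size + align;
    size_t offset;
    uint8_t *guardptr;
    uint8_t *ptr;

    /* Reserve address space only; nothing here is accessible yet. */
    guardptr = mmap(NULL, total, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guardptr == MAP_FAILED) {
        return MAP_FAILED;
    }

    /* Place the usable block at the first aligned address inside it. */
    offset = round_up((uintptr_t)guardptr, align) - (uintptr_t)guardptr;
    ptr = mmap(guardptr + offset, size, PROT_READ | PROT_WRITE,
               MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ptr == MAP_FAILED) {
        munmap(guardptr, total);
        return MAP_FAILED;
    }

    /* Drop the unused head of the reservation. */
    if (offset > 0) {
        munmap(guardptr, offset);
    }

    /* Keep exactly one guard page after the block; drop the rest. */
    total -= offset;
    if (total > size + pagesize) {
        munmap(ptr + size + pagesize, total - size - pagesize);
    }

    return ptr;
}

/* Unmap the block plus its guard page, using the same pagesize as above. */
static int aligned_ram_munmap(void *ptr, size_t size, size_t pagesize)
{
    return ptr ? munmap(ptr, size + pagesize) : 0;
}
```

A caller would pass the same pagesize to both functions, for example
getpagesize() for anonymous memory. If the two calls disagreed, the
length handed to munmap() would no longer cover the block plus one guard
page, which is the mismatch patch 2/2 removes for hugetlbfs-backed
mappings on ppc64.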