From: Thomas Weißschuh
Date: Tue, 12 Aug 2025 08:04:04 +0200
Subject: [PATCH 2/3] vdso/datastore: Allow prefaulting by mlockall()
Message-Id: <20250812-vdso-mlockall-v1-2-2f49ba7cf819@linutronix.de>
References: <20250812-vdso-mlockall-v1-0-2f49ba7cf819@linutronix.de>
In-Reply-To: <20250812-vdso-mlockall-v1-0-2f49ba7cf819@linutronix.de>
To: Anna-Maria Behnsen, Frederic Weisbecker, Thomas Gleixner,
 Andy Lutomirski, Vincenzo Frascino
Cc: Nam Cao, linux-kernel@vger.kernel.org, Thomas Weißschuh

Latency-sensitive applications expect not to experience any pagefaults
after calling mlockall(). However, mlockall() ignores VM_PFNMAP and VM_IO
mappings, both of which are used by the generic vDSO datastore.
While the fault handler itself is very fast, going through the full
pagefault exception handling is much slower, on the order of 20us on a
test machine.

Since the memory behind the datastore mappings is always present and
accessible, it is not necessary to use VM_IO for them.
VM_PFNMAP can be removed by mapping the pages through 'struct page'
instead of PFNs. VM_MIXEDMAP is necessary to call vmf_insert_page() in
the timens optimization path.

The data page mapping is now also aligned with the
architecture-specific code pages. Some architecture-specific data pages,
like the x86 VCLOCK pages, continue to use VM_IO as they are not always
mappable.

Regular mlock() would also work, but userspace does not know the
boundaries of the vDSO.

Signed-off-by: Thomas Weißschuh
Tested-by: Nam Cao
---
 lib/vdso/datastore.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/lib/vdso/datastore.c b/lib/vdso/datastore.c
index ed1aa3e27b13f8b48d18dad9488e0798f49cb338..9a1af01f1c4db95255dd67b59129791cc39d37c0 100644
--- a/lib/vdso/datastore.c
+++ b/lib/vdso/datastore.c
@@ -40,8 +40,8 @@ struct vdso_arch_data *vdso_k_arch_data = &vdso_arch_data_store.data;
 static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
 			     struct vm_area_struct *vma, struct vm_fault *vmf)
 {
-	struct page *timens_page;
-	unsigned long addr, pfn;
+	struct page *page, *timens_page;
+	unsigned long addr;
 	vm_fault_t err;
 
 	if (unlikely(vmf->flags & FAULT_FLAG_REMOTE))
@@ -53,17 +53,17 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
 	case VDSO_TIME_PAGE_OFFSET:
 		if (!IS_ENABLED(CONFIG_HAVE_GENERIC_VDSO))
 			return VM_FAULT_SIGBUS;
-		pfn = __phys_to_pfn(__pa_symbol(vdso_k_time_data));
+		page = virt_to_page(vdso_k_time_data);
 		if (timens_page) {
 			/*
 			 * Fault in VVAR page too, since it will be accessed
 			 * to get clock data anyway.
 			 */
 			addr = vmf->address + VDSO_TIMENS_PAGE_OFFSET * PAGE_SIZE;
-			err = vmf_insert_pfn(vma, addr, pfn);
+			err = vmf_insert_page(vma, addr, page);
 			if (unlikely(err & VM_FAULT_ERROR))
 				return err;
-			pfn = page_to_pfn(timens_page);
+			page = timens_page;
 		}
 		break;
 	case VDSO_TIMENS_PAGE_OFFSET:
@@ -76,24 +76,25 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
 		 */
 		if (!IS_ENABLED(CONFIG_TIME_NS) || !timens_page)
 			return VM_FAULT_SIGBUS;
-		pfn = __phys_to_pfn(__pa_symbol(vdso_k_time_data));
+		page = virt_to_page(vdso_k_time_data);
 		break;
 	case VDSO_RNG_PAGE_OFFSET:
 		if (!IS_ENABLED(CONFIG_VDSO_GETRANDOM))
 			return VM_FAULT_SIGBUS;
-		pfn = __phys_to_pfn(__pa_symbol(vdso_k_rng_data));
+		page = virt_to_page(vdso_k_rng_data);
 		break;
 	case VDSO_ARCH_PAGES_START ... VDSO_ARCH_PAGES_END:
 		if (!IS_ENABLED(CONFIG_ARCH_HAS_VDSO_ARCH_DATA))
 			return VM_FAULT_SIGBUS;
-		pfn = __phys_to_pfn(__pa_symbol(vdso_k_arch_data)) +
-			vmf->pgoff - VDSO_ARCH_PAGES_START;
+		page = nth_page(virt_to_page(vdso_k_arch_data), vmf->pgoff - VDSO_ARCH_PAGES_START);
 		break;
 	default:
 		return VM_FAULT_SIGBUS;
 	}
 
-	return vmf_insert_pfn(vma, vmf->address, pfn);
+	get_page(page);
+	vmf->page = page;
+	return 0;
 }
 
 const struct vm_special_mapping vdso_vvar_mapping = {
@@ -104,8 +105,8 @@ const struct vm_special_mapping vdso_vvar_mapping = {
 struct vm_area_struct *vdso_install_vvar_mapping(struct mm_struct *mm, unsigned long addr)
 {
 	return _install_special_mapping(mm, addr, VDSO_NR_PAGES * PAGE_SIZE,
-					VM_READ | VM_MAYREAD | VM_IO | VM_DONTDUMP |
-					VM_PFNMAP | VM_SEALED_SYSMAP,
+					VM_READ | VM_MAYREAD | VM_DONTDUMP |
+					VM_MIXEDMAP | VM_SEALED_SYSMAP,
 					&vdso_vvar_mapping);
 }
 
-- 
2.50.1
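
The behaviour described in the commit message can be probed from
userspace. The following is a minimal sketch, not part of the patch: it
locks all mappings with mlockall() and counts the minor faults taken by
the first clock_gettime() call afterwards. The helper names
minor_faults() and faults_after_mlockall() are hypothetical. It assumes
that clock_gettime() is served through the vDSO on the test system and
that the vvar fault is accounted in ru_minflt; neither is guaranteed on
every architecture or libc.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <time.h>

/* Number of minor page faults taken by this process so far. */
long minor_faults(void)
{
	struct rusage ru;
	int ret = getrusage(RUSAGE_SELF, &ru);

	assert(ret == 0);
	(void)ret;
	return ru.ru_minflt;
}

/*
 * Lock all current and future mappings, then count the minor faults
 * taken by the first clock_gettime() call afterwards.
 *
 * Returns the fault delta, or -1 if mlockall() failed (for example
 * because RLIMIT_MEMLOCK is too low for an unprivileged process).
 */
long faults_after_mlockall(void)
{
	struct timespec ts;
	long before, after;

	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
		perror("mlockall");
		return -1;
	}

	before = minor_faults();
	/* Assumption: this call reads the vDSO time data page. */
	clock_gettime(CLOCK_MONOTONIC, &ts);
	after = minor_faults();

	munlockall();
	return after - before;
}
```

Calling faults_after_mlockall() from a small main() and printing the
result gives a rough indication: a nonzero delta suggests the vDSO data
page was not prefaulted by mlockall(), while zero is the result a
latency-sensitive application would expect. Unrelated faults can also
land in the measured window, so the number is a hint rather than proof.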