From nobody Fri Dec 19 14:23:34 2025 Received: from mail-qt1-f179.google.com (mail-qt1-f179.google.com [209.85.160.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7F03F1EF382; Mon, 19 May 2025 22:33:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747694028; cv=none; b=fd+IDNuY3K8PmylsvEbqz2oSw8tVTdV/siA65dUUGtv1YkFx3Zf0mf+2LJjZEDoXmowGs0cwtyu8KnLrtHB6kCeLu+3Rz1CKFhA0Yo61kNsFKWHnKELhFeiiwaHBAwhR0ED+FZOcUDVqXxUfEkRKkKsDWH14kphnpZDAD6JZiKE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747694028; c=relaxed/simple; bh=PfAY27T09TjnJ/nIpVYoM9+S1fT0+OEtJubTxVt/1WE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=pHjSzDD5jy68i8lnMYWooFiVbD6qA/UzROq67w7m7Rl8QoSxDb12vbZIu5j4t48hFEG7qrEk5U4D7Pzk8/adcpIhPUS6LXCnyHamMwPDSoocd2gUwS7Hb7FUB+b5YbYAkCGSGJesV3oVDBCWjsrrum1GTglhFAegaBXYKzCIc7k= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=e0BZmtx+; arc=none smtp.client-ip=209.85.160.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="e0BZmtx+" Received: by mail-qt1-f179.google.com with SMTP id d75a77b69052e-47ae894e9b7so91042401cf.3; Mon, 19 May 2025 15:33:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1747694025; x=1748298825; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=SZoVm58CBAxFKJ1NXNNUrIOPj0gNvc2sOGcRB/nswhY=; b=e0BZmtx+V8+9LF1CheNx/AwLBGS1pTXSOnSMFejkAytkePnEscForSB6/+w/byxXh7 3YxlXaXHCZ064msksFCrb1V+0QRcBqmaO/Lf5NtCuslschlBRY9s/mMGc6DjJa822Plx gMmfaTwkMVqBRseN3nk1x1Z4pS27bgvLvaSv7wxlmDqWiBx7J3OGM558vFf7hidBJKFS T4OY7DjnaUrGqzvY9hQeSm4p5vy7Yf0cMkIwalnQEFIF6FdtPNUeyPMlNuqXeAAogWFp jT79BJ4xfJnPM9g0e7wGS7C6zfKHPPjE23HX0JiHnDBo1jcGe3sK0fqcyUUMg48VAAaH zGQQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1747694025; x=1748298825; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=SZoVm58CBAxFKJ1NXNNUrIOPj0gNvc2sOGcRB/nswhY=; b=XhhKAPLIXUZ4eyxKg5X6txTKGzPKYfcOjntR//57S78Tuya6Ip5OMu8EiRhCS1zqLS +pqIi3HKApZdW3TB+pi8dWvK1yhzV1t8PJTKLsNEHOPnN5YRlpeFPIb/GjB5ysKBcTvx LSD3qdKqRZRS/Z/7XjSogFi5RuNYKcfJ7+ZooZINSbt/ok6bezzTkJ2EdRTbhqi2qgpE ZAYqIht0ofo24SQcHcptOypnia41iF6HyunY1HMs+jGRSQkfdZ4ZegJE74jooDVjD1B8 nWdGhzdHteDYoflFvAaAu1Ccwox9hoOY7lOQt6loCHXzQFsQfo/EFfg2K6iKftiBiQ2u NnSQ== X-Forwarded-Encrypted: i=1; AJvYcCU5neIVVOn6X4//P7gLRBwwHBYToE/fJmxBp7eQAGXV3gc+ihFvyS6DDp9C6IsQzS1SlSs5fvICzPs=@vger.kernel.org, AJvYcCXBNuwIqnhmXXj0e1P7VSkA9YgHv33gCMkP/zRQ/d7cNaT2yrbT/gT7hwxcamFXbA0ZmjOhpMsmGtTKX+MO@vger.kernel.org X-Gm-Message-State: AOJu0Yyy/I+20yBnQBOzWIAWTW3RDNXkwb4ZuFTEE07ZciFdUpXlGmp/ zqJfIymdkWMSuXbYl939csQVDH7pZ0FZLIDL05PpDr4Bj4k7/GtmTH9I X-Gm-Gg: ASbGnctQScXGjocjIm8PXNlaCUZnraG7/oc5/0dDxr/yD6de95Rf8NFEMUxg+gHVxvp /A1K8eh0aw8mrvqs7fILc79+2XA0KVks5KhGJ8bv67V0Z0cGaiq0EWjE/0L2KOA4xiOtECwdshI bTZBXdVBE98MdqLH+tI+memyx7Y41ciha5YGJu0/uEO7y1aQ4Um30VigirhGiDoZYBjWwgEsMJT oyQId+4V6ADNimCR95VYqibfyvAWU5h4eJyzigYp9XFu0WjYnLev+OTejuubDJODTQmsaJq6ely bTylGhAP9f5mLQwM4jNWChtwD71x5Avm/ovFmvEsNlmmIlVq X-Google-Smtp-Source: 
AGHT+IFSgdGO5NmmqTJLvtd4NIb4WXRuj3aQMKYRB2sVptgzNbInCwcG2zfX8HRipunbeXTsp22ejw== X-Received: by 2002:a05:622a:5519:b0:491:286a:8606 with SMTP id d75a77b69052e-494b00e92e2mr230773771cf.0.1747694025116; Mon, 19 May 2025 15:33:45 -0700 (PDT) Received: from localhost ([2a03:2880:20ff:7::]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-494b344924fsm57559991cf.40.2025.05.19.15.33.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 19 May 2025 15:33:44 -0700 (PDT) From: Usama Arif To: Andrew Morton , david@redhat.com, linux-mm@kvack.org Cc: hannes@cmpxchg.org, shakeel.butt@linux.dev, riel@surriel.com, ziy@nvidia.com, laoar.shao@gmail.com, baolin.wang@linux.alibaba.com, lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com, vbabka@suse.cz, jannh@google.com, Arnd Bergmann , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, kernel-team@meta.com, Usama Arif Subject: [PATCH v3 1/7] mm: khugepaged: extract vm flag setting outside of hugepage_madvise Date: Mon, 19 May 2025 23:29:53 +0100 Message-ID: <20250519223307.3601786-2-usamaarif642@gmail.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250519223307.3601786-1-usamaarif642@gmail.com> References: <20250519223307.3601786-1-usamaarif642@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This is so that the flag setting can be reused later in other functions, to reduce code duplication (including the s390 exception). No functional change intended with this patch. 
Signed-off-by: Usama Arif --- include/linux/huge_mm.h | 1 + mm/khugepaged.c | 26 +++++++++++++++++--------- 2 files changed, 18 insertions(+), 9 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 2f190c90192d..23580a43787c 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -431,6 +431,7 @@ change_huge_pud(struct mmu_gather *tlb, struct vm_area_= struct *vma, __split_huge_pud(__vma, __pud, __address); \ } while (0) =20 +int hugepage_set_vmflags(unsigned long *vm_flags, int advice); int hugepage_madvise(struct vm_area_struct *vma, unsigned long *vm_flags, int advice); int madvise_collapse(struct vm_area_struct *vma, diff --git a/mm/khugepaged.c b/mm/khugepaged.c index b04b6a770afe..ab3427c87422 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -346,8 +346,7 @@ struct attribute_group khugepaged_attr_group =3D { }; #endif /* CONFIG_SYSFS */ =20 -int hugepage_madvise(struct vm_area_struct *vma, - unsigned long *vm_flags, int advice) +int hugepage_set_vmflags(unsigned long *vm_flags, int advice) { switch (advice) { case MADV_HUGEPAGE: @@ -358,16 +357,10 @@ int hugepage_madvise(struct vm_area_struct *vma, * ignore the madvise to prevent qemu from causing a SIGSEGV. */ if (mm_has_pgste(vma->vm_mm)) - return 0; + return -EPERM; #endif *vm_flags &=3D ~VM_NOHUGEPAGE; *vm_flags |=3D VM_HUGEPAGE; - /* - * If the vma become good for khugepaged to scan, - * register it here without waiting a page fault that - * may not happen any time soon. 
- */ - khugepaged_enter_vma(vma, *vm_flags); break; case MADV_NOHUGEPAGE: *vm_flags &=3D ~VM_HUGEPAGE; @@ -383,6 +376,21 @@ int hugepage_madvise(struct vm_area_struct *vma, return 0; } =20 +int hugepage_madvise(struct vm_area_struct *vma, + unsigned long *vm_flags, int advice) +{ + if (advice =3D=3D MADV_HUGEPAGE && !hugepage_set_vmflags(vm_flags, advice= )) { + /* + * If the vma become good for khugepaged to scan, + * register it here without waiting a page fault that + * may not happen any time soon. + */ + khugepaged_enter_vma(vma, *vm_flags); + } + + return 0; +} + int __init khugepaged_init(void) { mm_slot_cache =3D KMEM_CACHE(khugepaged_mm_slot, 0); --=20 2.47.1 From nobody Fri Dec 19 14:23:34 2025 Received: from mail-qv1-f52.google.com (mail-qv1-f52.google.com [209.85.219.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C1B4C2192FC; Mon, 19 May 2025 22:33:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747694029; cv=none; b=N8N/InWaMNohp4rHo5AohlW55mHqBdTHUD/bnsh5iOr7xwumfaFdo99PCTuPVylMYHRlyFA0JJZMzs/w0oESiXsO0QUksCnogmVlkJ7PTmWifB6nyo32DHIOm2cFQhjCR2BsEqA1ScDXCtBBx9gJCuCJx3MLYfxtr/dajG0ASC8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747694029; c=relaxed/simple; bh=r9OsL9Nl6T0Z5HieIAW1BzhpNkNTe6QjRVEQBGBP3us=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=S/M3xpaQ/QJh8xTPjDeyzHVoJc6325rpiIuS2vOQRU2BDbiZSPfkvFGY+W1z/IvR8e9iD5SHsJfdM3KtLIwTvE6dNlA7HjoaPkYzE3KXVnl3ovjXOyaWqqS3yYzHV1+q6RxKKCuaGCIxWNGd6Gs9sfmkyud2esU2tC+EGTCNU6c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=XYtiM1Rq; 
arc=none smtp.client-ip=209.85.219.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="XYtiM1Rq" Received: by mail-qv1-f52.google.com with SMTP id 6a1803df08f44-6f8aabbffaeso46632686d6.0; Mon, 19 May 2025 15:33:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1747694026; x=1748298826; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=GBzrH2GBnquUaHeSO3w8uSRV1aMylGMB1WN2N8Pvfik=; b=XYtiM1RqcdpU8IzwD/G3XN0V7Fvqf2ZoXFgi6SgtijP4xXGCK/qzFDaBkjCoWyR1c4 R2eSgCgdBqzs4k2CLbAiXiHJFisHnBWFrhK0YJ16jQvPuCsaOk2IzSQJhfZy3VfrNhTE ADEhakGh/f7q+BNg/EivW0XYcBYk95wA4LAV25sb/oATsOJ/iWavegMhpPb2tEdQL+IB 4w+1c19wJWO+70zFTbbzzmGpj1hQqlrEh5bFd59VfKD+Qa2yQWBDsh5in0NKNzrJeqAf klxtnntiCdn2QXikR99Y1s/B2udvdJjuBm+QPBGNAOpBbjiPn+80zoR4CzeaaEm1updT 3Q8Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1747694026; x=1748298826; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=GBzrH2GBnquUaHeSO3w8uSRV1aMylGMB1WN2N8Pvfik=; b=jU5E04hr4JR2hsfDuMOTsUFxUxTiF1UFUB8jSZJ5UdmOYd3b24+WbSYJaESTVjJwVX 1/2sF1aVWzG9+VLm6cVUu8brNvLHulG1Utb+2EZIks5+nvmM/Kb8o+hD0rl4f7I6P5Qe 3lmXcZsCP5XVbw8tLWQPoa587DxSK1w4W3zHaOtruYq0gf3+tiennqIk4UyI6tPbodE3 5tHYs5rqSRjiKgZmoblf88ZPTGJdbSZRXf+DbhXtW011yfBcr8KrmvkGe8cntxywwray bCL6Pe3kf7qSAJW4XS4t6n0vtACMCDack10rhnrycv74BQ5o2FNvV8YroWz1ykW37n+d Tx8A== X-Forwarded-Encrypted: i=1; AJvYcCUj+Y2OMY8A82j5pWMGbt/i2glnNNr/WOvP/xHSSnoQ6Gxr53jWPeRgulc9NSLSc/HhQ+Y1h0PI8AeaOEsh@vger.kernel.org, 
AJvYcCW4GteRNSWpPKp2KxAB7l5aACofuha/k+3wXjvUCWQbu+vT+wiH5vIAoUKisHUDByzr/S03Kew5l8Y=@vger.kernel.org X-Gm-Message-State: AOJu0YxhtH2o8J5nR6vthP4uWlHIHQATV0lyaf3l0S8jHj7V0EzXthe+ WPX6lStSSKYmR8GEtF302q69+HJ1iIVKDZRet6EVwQO1Z4baEqCBKhdR X-Gm-Gg: ASbGncvZaIfsAyfgNzP4LKXBxUeazjEFpoEoN2XC0GDf8o6C4w+sE1IbgHRMlKlKN+3 XqKwHRI5F4i2Lyq6LyVCL6w62Gpu5rVMqxxMtPa/d5TGP54FEogu9QGINP92o9A32M3llxUutCl DLveGOUJhO3HOaJlGPl4JgXmt86WOJI+yewfCUveoJiyVfU0thIjqTfnHRMwPjeUpzQvX9h4zHB qloQWxkl+TS5fQPFsx4aPvj+MObjtQVTplw0nSivz304RLEKNtepdnL758SquqyGqYYeCXjLdoa GxNCLHrTyUQN9VQ3h8YRANTbuq9si0mpR0v9g4doVpZhSm8REA== X-Google-Smtp-Source: AGHT+IGoaL3+EMdpTKw62PROOK5NuGok7IoCsJPUMNfh1H9llowMCPFvMdWgbpfAanUPQ6/uxc12eQ== X-Received: by 2002:a05:6214:1c09:b0:6e4:3ddc:5d33 with SMTP id 6a1803df08f44-6f8b2cfe519mr234825206d6.13.1747694026467; Mon, 19 May 2025 15:33:46 -0700 (PDT) Received: from localhost ([2a03:2880:20ff:74::]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6f8b0883ed6sm62381186d6.18.2025.05.19.15.33.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 19 May 2025 15:33:45 -0700 (PDT) From: Usama Arif To: Andrew Morton , david@redhat.com, linux-mm@kvack.org Cc: hannes@cmpxchg.org, shakeel.butt@linux.dev, riel@surriel.com, ziy@nvidia.com, laoar.shao@gmail.com, baolin.wang@linux.alibaba.com, lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com, vbabka@suse.cz, jannh@google.com, Arnd Bergmann , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, kernel-team@meta.com, Usama Arif Subject: [PATCH v3 2/7] prctl: introduce PR_DEFAULT_MADV_HUGEPAGE for the process Date: Mon, 19 May 2025 23:29:54 +0100 Message-ID: <20250519223307.3601786-3-usamaarif642@gmail.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250519223307.3601786-1-usamaarif642@gmail.com> References: <20250519223307.3601786-1-usamaarif642@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 
1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This is set via the new PR_SET_THP_POLICY prctl. It has two effects: - It sets VM_HUGEPAGE and clears VM_NOHUGEPAGE on the default VMA flags (def_flags). This means that every new VMA will be considered for hugepages. - It iterates through every VMA in the process and calls hugepage_madvise on it with the MADV_HUGEPAGE policy. The policy is inherited during fork+exec. This effectively allows setting MADV_HUGEPAGE on the entire process. In an environment where different types of workloads are run on the same machine, this will allow workloads that benefit from always having hugepages to do so, without regressing those that don't. Signed-off-by: Usama Arif --- include/linux/huge_mm.h | 1 + include/linux/mm.h | 2 +- include/linux/mm_types.h | 4 ++- include/uapi/linux/prctl.h | 4 +++ kernel/sys.c | 29 +++++++++++++++++++ mm/huge_memory.c | 13 +++++++++ tools/include/uapi/linux/prctl.h | 4 +++ .../trace/beauty/include/uapi/linux/prctl.h | 4 +++ 8 files changed, 59 insertions(+), 2 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 23580a43787c..b24a2e0ae642 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -431,6 +431,7 @@ change_huge_pud(struct mmu_gather *tlb, struct vm_area_= struct *vma, __split_huge_pud(__vma, __pud, __address); \ } while (0) =20 +void process_default_madv_hugepage(struct mm_struct *mm, int advice); int hugepage_set_vmflags(unsigned long *vm_flags, int advice); int hugepage_madvise(struct vm_area_struct *vma, unsigned long *vm_flags, int advice); diff --git a/include/linux/mm.h b/include/linux/mm.h index 43748c8f3454..436f4588bce8 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -466,7 +466,7 @@ extern unsigned int kobjsize(const void *objp); #define VM_NO_KHUGEPAGED (VM_SPECIAL | VM_HUGETLB) =20 /* This mask defines which mm->def_flags a process can inherit its parent = */ -#define VM_INIT_DEF_MASK 
VM_NOHUGEPAGE +#define VM_INIT_DEF_MASK (VM_HUGEPAGE | VM_NOHUGEPAGE) =20 /* This mask represents all the VMA flag bits used by mlock */ #define VM_LOCKED_MASK (VM_LOCKED | VM_LOCKONFAULT) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index e76bade9ebb1..f1836b7c5704 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -1703,6 +1703,7 @@ enum { /* leave room for more dump flags */ #define MMF_VM_MERGEABLE 16 /* KSM may merge identical pages */ #define MMF_VM_HUGEPAGE 17 /* set when mm is available for khugepaged */ +#define MMF_VM_HUGEPAGE_MASK (1 << MMF_VM_HUGEPAGE) =20 /* * This one-shot flag is dropped due to necessity of changing exe once aga= in @@ -1742,7 +1743,8 @@ enum { =20 #define MMF_INIT_MASK (MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK |\ MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK |\ - MMF_VM_MERGE_ANY_MASK | MMF_TOPDOWN_MASK) + MMF_VM_MERGE_ANY_MASK | MMF_TOPDOWN_MASK |\ + MMF_VM_HUGEPAGE_MASK) =20 static inline unsigned long mmf_init_flags(unsigned long flags) { diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index 15c18ef4eb11..15aaa4db5ff8 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -364,4 +364,8 @@ struct prctl_mm_map { # define PR_TIMER_CREATE_RESTORE_IDS_ON 1 # define PR_TIMER_CREATE_RESTORE_IDS_GET 2 =20 +#define PR_SET_THP_POLICY 78 +#define PR_GET_THP_POLICY 79 +#define PR_DEFAULT_MADV_HUGEPAGE 0 + #endif /* _LINUX_PRCTL_H */ diff --git a/kernel/sys.c b/kernel/sys.c index c434968e9f5d..74397ace62f3 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -2474,6 +2474,7 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, ar= g2, unsigned long, arg3, unsigned long, arg4, unsigned long, arg5) { struct task_struct *me =3D current; + struct mm_struct *mm =3D me->mm; unsigned char comm[sizeof(me->comm)]; long error; =20 @@ -2658,6 +2659,34 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, a= rg2, unsigned long, arg3, clear_bit(MMF_DISABLE_THP, &me->mm->flags); 
mmap_write_unlock(me->mm); break; + case PR_GET_THP_POLICY: + if (arg2 || arg3 || arg4 || arg5) + return -EINVAL; + if (mmap_write_lock_killable(mm)) + return -EINTR; + if (mm->def_flags & VM_HUGEPAGE) + error =3D PR_DEFAULT_MADV_HUGEPAGE; + mmap_write_unlock(mm); + break; + case PR_SET_THP_POLICY: + if (arg3 || arg4 || arg5) + return -EINVAL; + if (mmap_write_lock_killable(mm)) + return -EINTR; + switch (arg2) { + case PR_DEFAULT_MADV_HUGEPAGE: + if (!hugepage_global_enabled()) + error =3D -EPERM; + error =3D hugepage_set_vmflags(&mm->def_flags, MADV_HUGEPAGE); + if (!error) + process_default_madv_hugepage(mm, MADV_HUGEPAGE); + break; + default: + error =3D -EINVAL; + break; + } + mmap_write_unlock(mm); + break; case PR_MPX_ENABLE_MANAGEMENT: case PR_MPX_DISABLE_MANAGEMENT: /* No longer implemented: */ diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 2780a12b25f0..72806fe772b5 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -98,6 +98,19 @@ static inline bool file_thp_enabled(struct vm_area_struc= t *vma) return !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode); } =20 +void process_default_madv_hugepage(struct mm_struct *mm, int advice) +{ + struct vm_area_struct *vma; + unsigned long vm_flags; + + mmap_assert_write_locked(mm); + VMA_ITERATOR(vmi, mm, 0); + for_each_vma(vmi, vma) { + vm_flags =3D vma->vm_flags; + hugepage_madvise(vma, &vm_flags, advice); + } +} + unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma, unsigned long vm_flags, unsigned long tva_flags, diff --git a/tools/include/uapi/linux/prctl.h b/tools/include/uapi/linux/pr= ctl.h index 35791791a879..f5945ebfe3f2 100644 --- a/tools/include/uapi/linux/prctl.h +++ b/tools/include/uapi/linux/prctl.h @@ -328,4 +328,8 @@ struct prctl_mm_map { # define PR_PPC_DEXCR_CTRL_CLEAR_ONEXEC 0x10 /* Clear the aspect on exec */ # define PR_PPC_DEXCR_CTRL_MASK 0x1f =20 +#define PR_SET_THP_POLICY 78 +#define PR_GET_THP_POLICY 79 +#define PR_THP_POLICY_DEFAULT_HUGE 0 + #endif /* 
_LINUX_PRCTL_H */ diff --git a/tools/perf/trace/beauty/include/uapi/linux/prctl.h b/tools/per= f/trace/beauty/include/uapi/linux/prctl.h index 15c18ef4eb11..325c72f40a93 100644 --- a/tools/perf/trace/beauty/include/uapi/linux/prctl.h +++ b/tools/perf/trace/beauty/include/uapi/linux/prctl.h @@ -364,4 +364,8 @@ struct prctl_mm_map { # define PR_TIMER_CREATE_RESTORE_IDS_ON 1 # define PR_TIMER_CREATE_RESTORE_IDS_GET 2 =20 +#define PR_SET_THP_POLICY 78 +#define PR_GET_THP_POLICY 79 +#define PR_THP_POLICY_DEFAULT_HUGE 0 + #endif /* _LINUX_PRCTL_H */ --=20 2.47.1 From nobody Fri Dec 19 14:23:34 2025 Received: from mail-qv1-f42.google.com (mail-qv1-f42.google.com [209.85.219.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C480621CC5A; Mon, 19 May 2025 22:33:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.42 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747694030; cv=none; b=LCg8GDc9zghR6cRWkIoEzoOYVmDL6SFKAISvtT+arJyS74jPmpdFU0M6dPx/CIAS9bGDE6yJguSIRNcwcF3jIXpb6l4ePMighpAu0UtbB/RPpmGr/bnA5KUX0IIuUWl0M5Gd1tP50RZ+UnhCbweRVR52VcCjEYeQGjj2lQTqJvI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747694030; c=relaxed/simple; bh=MC1N+OY0nC6U0Z6rmPXpbtSuHdN8I8xTOcxcOubMIck=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Pr/41q6SWnVNo2VMNQ5SbH4py8TXcP3BRUV+OkwaZBXM5r9Pq1m78UFqjP9+DPa8t0LTHRYHhj3NOWfmvv/zriudrZ4BToGswBxqQJ7r9CpLrDPNXAwgEK/XZCYKi7t2VhNsH1ySRgwBdfSRKzzqYOECZGfzlN/+RnXwtIJ1ODI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=V0WLN4qK; arc=none smtp.client-ip=209.85.219.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none 
dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="V0WLN4qK" Received: by mail-qv1-f42.google.com with SMTP id 6a1803df08f44-6f8c53aeedbso57372156d6.2; Mon, 19 May 2025 15:33:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1747694027; x=1748298827; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=wdThakyVpRWt/pdL0RGS+6LQ/2FW4dqWsg/JIFZjGck=; b=V0WLN4qKg8oJZOYlIEXnAERLIaN6LF08UAHDq8vgLCj8j+iCg34VhDAlry1B9rjVB8 6RX7/xJ6H0+4PVBlbpI2RnHNXzyVDOS1+xZAL3Ft8/WWt3WxOCkRv+59xFgfuj+NheMB v+8Dc65R9I/dMIvqExeygupNF6Y+fwx04/tkB+exo/7uCphuF5A/3Sgg3x7saOVmfMuX ZCj4MgOGqZYN4Uhdm/i2Is//acvMWy5cWWn67BZFZDHXp27Ua5GvvKqwTwDIHH/YYxC/ b71GZMT8cmb6MXZjwJ6wOEZ0JEZyaFpg8Lce0ZZ7bbT5pMaKhRWf8nIhlsIGM3srNne+ fTMA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1747694027; x=1748298827; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=wdThakyVpRWt/pdL0RGS+6LQ/2FW4dqWsg/JIFZjGck=; b=MAM07c/t9YKCysSxKmAlu8gJdsuJkZnSRCdudXkcf3wI+b+n5IM1YuwxSyN+E55Qeh v1vORoHejpznnIoTtd4yshxW86lHe8bTaP7yHR7KE+RfcbuyszHjku5cU/j/EWWP0wV7 9YhNTd5dl2fDTGD0tth6uYV+vZNHHF/JGuw9KZfuGbXCxNaqpyeN8o7DftkvYRrFU5OH aXdAxpiLlf1atAtAuquBefsDeIguRXEBoK+Ye0jh25e6wqzrXxAC3dtvzlYtBXirbzEt gm3kfc5sEERN6oIjzzlThDAHLLTAowAfwMCBX2Af48Aei+bwmvtV9RB6dGhxDmVe+S10 vB0w== X-Forwarded-Encrypted: i=1; AJvYcCU6zKY3v+2ICv6d7HxH4OiT9laneowUR9SUXolfonWzg3m1mK76u+rNVFAIjqtnj05tpAzbgFyvA0XB1r/b@vger.kernel.org, AJvYcCU7dc/vwnwosoWnwFvvooiLFzvjNTkKQSiBYlYQdVA6yc8WDtughoGDX8OvmSCfE5ihtuR9U2gOw+A=@vger.kernel.org X-Gm-Message-State: 
AOJu0YyGvsDjpPldynIP40xWsnRrNHyS4XkAksfdo8/B5yhZlT0J++t5 lZMAoTP5gHZqmHVkLzk1oCwMoyDNqnkZFuACTGg22eyTlekL5arD7ma+ X-Gm-Gg: ASbGncusgqXWuiqnRGky6pC8zzExR42M1tOrGpUZzZMGqY2G+0AYkssqDw9RRpGLk5H onG4DnyBG1zq1Fbup+EU+auQxOB9KAXUoSk3cPhzlEU6CVKXTGUAs2yap8KpRUygFORHhZPRELq 5g1ogMNTqkNNf8azOExkEXyTOO9gsNpK7Ak26fO5xt6t7L/gci1wByPzT+Nt5GphULnNd0bVKT/ vb6I4RUN80vGjq3Me1DgmECMM9e98B4KJqmb2S+A+Nr5u5OinLuEvvXePk16MuTc+XtyDcwktYT zWC7vHDkhlqqdhUFp14/KTEZ1b/gjhBPQpbCyYnIuXOTvarnJQ== X-Google-Smtp-Source: AGHT+IElzVJ+5veykGxkKp/1enaTKr5fwrckzXOASwoCkx8oZ5lGcePSRnYc7Ci5ixpQKVjII7tOfQ== X-Received: by 2002:a05:6214:5096:b0:6f4:f123:a97a with SMTP id 6a1803df08f44-6f8b2c65bd5mr238975326d6.15.1747694027561; Mon, 19 May 2025 15:33:47 -0700 (PDT) Received: from localhost ([2a03:2880:20ff:41::]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6f8b0965b90sm62201576d6.91.2025.05.19.15.33.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 19 May 2025 15:33:47 -0700 (PDT) From: Usama Arif To: Andrew Morton , david@redhat.com, linux-mm@kvack.org Cc: hannes@cmpxchg.org, shakeel.butt@linux.dev, riel@surriel.com, ziy@nvidia.com, laoar.shao@gmail.com, baolin.wang@linux.alibaba.com, lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com, vbabka@suse.cz, jannh@google.com, Arnd Bergmann , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, kernel-team@meta.com, Usama Arif Subject: [PATCH v3 3/7] prctl: introduce PR_DEFAULT_MADV_NOHUGEPAGE for the process Date: Mon, 19 May 2025 23:29:55 +0100 Message-ID: <20250519223307.3601786-4-usamaarif642@gmail.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250519223307.3601786-1-usamaarif642@gmail.com> References: <20250519223307.3601786-1-usamaarif642@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This is set via the new 
PR_SET_THP_POLICY prctl. It has two effects: - It sets VM_NOHUGEPAGE and clears VM_HUGEPAGE on the default VMA flags (def_flags). This means that every new VMA will not be considered for hugepages by default. - It iterates through every VMA in the process and calls hugepage_madvise on it with the MADV_NOHUGEPAGE policy. The policy is inherited during fork+exec. This effectively allows setting MADV_NOHUGEPAGE on the entire process. In an environment where different types of workloads are stacked on the same machine, this will allow workloads that benefit from having hugepages on an madvise basis only to do so, without regressing those that benefit from having hugepages always. Signed-off-by: Usama Arif --- include/uapi/linux/prctl.h | 1 + kernel/sys.c | 7 +++++++ tools/include/uapi/linux/prctl.h | 1 + tools/perf/trace/beauty/include/uapi/linux/prctl.h | 1 + 4 files changed, 10 insertions(+) diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index 15aaa4db5ff8..33a6ef6a5a72 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -367,5 +367,6 @@ struct prctl_mm_map { #define PR_SET_THP_POLICY 78 #define PR_GET_THP_POLICY 79 #define PR_DEFAULT_MADV_HUGEPAGE 0 +#define PR_DEFAULT_MADV_NOHUGEPAGE 1 =20 #endif /* _LINUX_PRCTL_H */ diff --git a/kernel/sys.c b/kernel/sys.c index 74397ace62f3..6bb28b3666f7 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -2666,6 +2666,8 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, ar= g2, unsigned long, arg3, return -EINTR; if (mm->def_flags & VM_HUGEPAGE) error =3D PR_DEFAULT_MADV_HUGEPAGE; + else if (mm->def_flags & VM_NOHUGEPAGE) + error =3D PR_DEFAULT_MADV_NOHUGEPAGE; mmap_write_unlock(mm); break; case PR_SET_THP_POLICY: @@ -2681,6 +2683,11 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, a= rg2, unsigned long, arg3, if (!error) process_default_madv_hugepage(mm, MADV_HUGEPAGE); break; + case PR_DEFAULT_MADV_NOHUGEPAGE: + error =3D hugepage_set_vmflags(&mm->def_flags, MADV_NOHUGEPAGE); + if (!error) 
+ process_default_madv_hugepage(mm, MADV_NOHUGEPAGE); + break; default: error =3D -EINVAL; break; diff --git a/tools/include/uapi/linux/prctl.h b/tools/include/uapi/linux/pr= ctl.h index f5945ebfe3f2..e03d0ed890c5 100644 --- a/tools/include/uapi/linux/prctl.h +++ b/tools/include/uapi/linux/prctl.h @@ -331,5 +331,6 @@ struct prctl_mm_map { #define PR_SET_THP_POLICY 78 #define PR_GET_THP_POLICY 79 #define PR_THP_POLICY_DEFAULT_HUGE 0 +#define PR_THP_POLICY_DEFAULT_NOHUGE 1 =20 #endif /* _LINUX_PRCTL_H */ diff --git a/tools/perf/trace/beauty/include/uapi/linux/prctl.h b/tools/per= f/trace/beauty/include/uapi/linux/prctl.h index 325c72f40a93..d25458f4db9e 100644 --- a/tools/perf/trace/beauty/include/uapi/linux/prctl.h +++ b/tools/perf/trace/beauty/include/uapi/linux/prctl.h @@ -367,5 +367,6 @@ struct prctl_mm_map { #define PR_SET_THP_POLICY 78 #define PR_GET_THP_POLICY 79 #define PR_THP_POLICY_DEFAULT_HUGE 0 +#define PR_THP_POLICY_DEFAULT_NOHUGE 1 =20 #endif /* _LINUX_PRCTL_H */ --=20 2.47.1 From nobody Fri Dec 19 14:23:34 2025 Received: from mail-qv1-f49.google.com (mail-qv1-f49.google.com [209.85.219.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A8BF421D5BF; Mon, 19 May 2025 22:33:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.49 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747694033; cv=none; b=P0i8bmXoiX0E7e4PluWoHdaaL1GoJLoddLxTCTXhScdaJPYG9GTiTpaqz9MIoZL+z4Nr3h4HYeHyQunSEFUy+R24QGOICeZn35ljn3ojnWVSV31eR9Sam4kKHMWL7fUbJdNha96D3P4GU2Dhii3PGg521pw/KhYLkMxQPUb1LMk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747694033; c=relaxed/simple; bh=7DHUAtdxYHed9//cev2Od5qBR1CObbVspUagy5uPlCk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=U9PM0PyJ9m3c3PRZ4ophcJ3c5iKYoEgTtqWY985bQP10q30Lg10rxTnyl/F5HEF7/hRjjXUGqpfp2mh9Pvz4k1wF+IZ8GfrZk3b6BLdmGCBoqXX9mdb6qEsxjrjbdJPH+dtIGCAMbRRRRp9hGttTJfNorBVL5nF7hIlgD/SYff0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=OS/WdA7T; arc=none smtp.client-ip=209.85.219.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="OS/WdA7T" Received: by mail-qv1-f49.google.com with SMTP id 6a1803df08f44-6ecfc2cb1aaso53975666d6.3; Mon, 19 May 2025 15:33:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1747694030; x=1748298830; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=I6GYjk8Dr4mM5akd3epW+BXD17ctmGGIEMIRcpD2XKI=; b=OS/WdA7T0h3w3Fhegdt5v3HUNKSFnd21UmfpgQfLR2/vvifSLcVdr6nqSvFitMeIHq H33LgNxowZuZ1mdXcxPhTzSEmWlU0K0mjgW5ZPubkSUF44EG9nw0mcc5opQeJeIcvpqe UmLoGQFRp1n0A242qVFv6tXA68fZ0Yk8D7Dz4DubvrBJDQdGu6RubyDz0qk4jnvoM0zq L6BrtbWKlSwokpkbXwmrh6rs3FRS6FJuVcutjlXx+44i8eOaONqXWnRHysNA3uIa5tZU 5sKf8Rmed21hKylQ8K8SgXaZ+LoOJWrEVweebArdV4y5giNkHqzyJejAEIrYesZS3qUQ KN/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1747694030; x=1748298830; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=I6GYjk8Dr4mM5akd3epW+BXD17ctmGGIEMIRcpD2XKI=; b=QkgBgKufiDMuF2wX/sDxeO/dbgX1Rd+aj/SwPtqDWlNFxPQWgQdo/3vYoFmcDa0i6u 
a98BmnOOz5w1baG0YNKKHN1GvaldSuUuXuBe+KHjkIuW7H1pYbPfBp62pJBvTq1w9Hz+ aAY1qvyWmKOM2E9AKf1LFfRnSqIkLd5mMNxqb5XeSk/kh6+XApS0B7OK+9PjISFMO7um mVfCO1+cgbSMeBNEFtm8X3dzSKJa6jxyLCLB+szTGfr71/bJLrk8QMeiXoc//Sd4UAtj 6roEdEeAoC6KJdJLVIiMuS04Sz7SArr3+wpVz6qOwrLIVuB2+FnWWED6i1hUTSw62Eac HiIg== X-Forwarded-Encrypted: i=1; AJvYcCU/bx06Bhx0a5T6Dggl9/3+CFWMMffgFCiaTiyd781e5F0HxsbUK5FIXDHIalIozR50/Xxpxuok7JqUAxOf@vger.kernel.org, AJvYcCUR8tpJHckHhDIqjp0t/0WntFZth7ybbQeEtcalAbxyIwe/XEPSVKRRDo64LQURU0dHREhDVxr0fis=@vger.kernel.org X-Gm-Message-State: AOJu0Yy5N9mfXpDvXbLlQjWMKK3ZaDv/zEYZlMluv8Tv96YiO/2DP7KX 1tBBH0X7HO3DNj/cMKtNmkQaHS+a9h+iS8IIL1i6tE9k7uXfetarNkBB X-Gm-Gg: ASbGnct6X6IM7FqtwHaKyEWE/pCzAbCF185VYAS0ifDsrVHZdbl3vptPcZZ/d2BA0IR GqDnus/ZbIKBf3sABHG8erwKG0ylljiVbrMwuqyQDk2VKIwis1+FnHy5bl4VN9QwISBePHFLDF7 SK+UpbuVw8aXjKZuznR/nV03AnNJ2C6UAbdM2PrNTIO2oYmRsI50UjY5UVfp8Se41VakwNXyr4B QM8DXyuoPZBv1O7WkeJsBZcJx3fPuZjRq1+sMJazApB2i089i7JGGCsQpI8qSKsO162lQpUDxur HishFw/z9MojGyazi15vgwTYfaweHsxOTBQxNmGTodv3MgcX X-Google-Smtp-Source: AGHT+IGig5bxPGk+qAd4HDl5M5Hmea1b5/tyBwQX9Q6xIpmS8HDGuKJHgByhih/F1gdkpGOen/gW7w== X-Received: by 2002:a05:6214:2302:b0:6f8:a667:2959 with SMTP id 6a1803df08f44-6f8b08e5416mr251126186d6.36.1747694030472; Mon, 19 May 2025 15:33:50 -0700 (PDT) Received: from localhost ([2a03:2880:20ff:4::]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6f8b097b69csm62531546d6.103.2025.05.19.15.33.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 19 May 2025 15:33:50 -0700 (PDT) From: Usama Arif To: Andrew Morton , david@redhat.com, linux-mm@kvack.org Cc: hannes@cmpxchg.org, shakeel.butt@linux.dev, riel@surriel.com, ziy@nvidia.com, laoar.shao@gmail.com, baolin.wang@linux.alibaba.com, lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com, vbabka@suse.cz, jannh@google.com, Arnd Bergmann , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, kernel-team@meta.com, Usama Arif Subject: [PATCH v3 4/7] prctl: introduce 
PR_THP_POLICY_SYSTEM for the process
Date: Mon, 19 May 2025 23:29:56 +0100
Message-ID: <20250519223307.3601786-5-usamaarif642@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250519223307.3601786-1-usamaarif642@gmail.com>
References: <20250519223307.3601786-1-usamaarif642@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

This is set via the new PR_SET_THP_POLICY prctl. It clears VM_HUGEPAGE
and VM_NOHUGEPAGE in mm->def_flags, resetting the default VMA hugepage
policy to the system-wide setting. The exception is s390 when pgstes are
switched on for the userspace process: in that case only VM_HUGEPAGE is
cleared.

Signed-off-by: Usama Arif
---
 include/uapi/linux/prctl.h                  |  1 +
 kernel/sys.c                                | 17 +++++++++++++++++
 tools/include/uapi/linux/prctl.h            |  1 +
 .../trace/beauty/include/uapi/linux/prctl.h |  1 +
 4 files changed, 20 insertions(+)

diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 33a6ef6a5a72..508d78bc3364 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -368,5 +368,6 @@ struct prctl_mm_map {
 #define PR_GET_THP_POLICY 79
 #define PR_DEFAULT_MADV_HUGEPAGE 0
 #define PR_DEFAULT_MADV_NOHUGEPAGE 1
+#define PR_THP_POLICY_SYSTEM 2
 
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index 6bb28b3666f7..cffb60632d97 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2668,6 +2668,8 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 			error = PR_DEFAULT_MADV_HUGEPAGE;
 		else if (mm->def_flags & VM_NOHUGEPAGE)
 			error = PR_DEFAULT_MADV_NOHUGEPAGE;
+		else
+			error = PR_THP_POLICY_SYSTEM;
 		mmap_write_unlock(mm);
 		break;
 	case PR_SET_THP_POLICY:
@@ -2688,6 +2690,21 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 			if (!error)
 				process_default_madv_hugepage(mm, MADV_NOHUGEPAGE);
 			break;
+		case PR_THP_POLICY_SYSTEM:
+#ifdef CONFIG_S390
+			/*
+			 * When s390 switches on pgstes for its userspace
+			 * process (for kvm), it sets VM_NOHUGEPAGE.
+			 * Do not clear it with system policy.
+			 */
+			if (mm_has_pgste(mm))
+				mm->def_flags &= ~VM_HUGEPAGE;
+			else
+				mm->def_flags &= ~(VM_HUGEPAGE | VM_NOHUGEPAGE);
+#else
+			mm->def_flags &= ~(VM_HUGEPAGE | VM_NOHUGEPAGE);
+#endif
+			break;
 		default:
 			error = -EINVAL;
 			break;
diff --git a/tools/include/uapi/linux/prctl.h b/tools/include/uapi/linux/prctl.h
index e03d0ed890c5..cc209c9a8afb 100644
--- a/tools/include/uapi/linux/prctl.h
+++ b/tools/include/uapi/linux/prctl.h
@@ -332,5 +332,6 @@ struct prctl_mm_map {
 #define PR_GET_THP_POLICY 79
 #define PR_THP_POLICY_DEFAULT_HUGE 0
 #define PR_THP_POLICY_DEFAULT_NOHUGE 1
+#define PR_THP_POLICY_SYSTEM 2
 
 #endif /* _LINUX_PRCTL_H */
diff --git a/tools/perf/trace/beauty/include/uapi/linux/prctl.h b/tools/perf/trace/beauty/include/uapi/linux/prctl.h
index d25458f4db9e..340d5ff769a9 100644
--- a/tools/perf/trace/beauty/include/uapi/linux/prctl.h
+++ b/tools/perf/trace/beauty/include/uapi/linux/prctl.h
@@ -368,5 +368,6 @@ struct prctl_mm_map {
 #define PR_GET_THP_POLICY 79
 #define PR_THP_POLICY_DEFAULT_HUGE 0
 #define PR_THP_POLICY_DEFAULT_NOHUGE 1
+#define PR_THP_POLICY_SYSTEM 2
 
 #endif /* _LINUX_PRCTL_H */
-- 
2.47.1

From nobody Fri Dec 19 14:23:34 2025
Received: from mail-qt1-f174.google.com (mail-qt1-f174.google.com [209.85.160.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B51692192FC; Mon, 19 May 2025 22:33:53 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.174
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747694035; cv=none;
b=Aw7KtTTTF7UBbM03i9sILtnY4WUTARvZT5v/YmtbpaGWpr8TVW45H83xCbZ36qAZNBGXcc2O/VDqFWKZ0La11icjx7P72ZZ0BKhsn/DvR+O2Ii6bB8qtMTRbMFkPKPfU3H17uXRtDdAZ/MDF5QrthxiS1aQxo3myDJuwretJhhk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747694035; c=relaxed/simple; bh=6clWYkJ46bS6af6Y7ODPfkaW96D1yKbSRPHkJFhb7is=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=gbWFMZF24BUkmdJcXonLHvGRhMXfkuw9si6AyUoujG2d8z0oqvB0UAZg7QLuqgtv51zOdAfzutuAY0Xm2XF7HQcWnIMlJLjQ5JbUYVyxP+S9wKfZsfHXnje9nAXD69HQ0tAForg17VA2uvMcTQ1oC7bjPjnAzmDgbCBquopS5HE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=eDxYH/pS; arc=none smtp.client-ip=209.85.160.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="eDxYH/pS" Received: by mail-qt1-f174.google.com with SMTP id d75a77b69052e-476977848c4so54010711cf.1; Mon, 19 May 2025 15:33:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1747694033; x=1748298833; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=GLL6/TM7VHHA7BD7VtemxaiF2BY2WlisCLQAtLpyNFE=; b=eDxYH/pSemxNbUSqcFuW+xByZiYWi5GOBg4qjqgn7O/IvpS+V8a1V3y8B/dxsz/ObA tfli25q2GP32H/YxE46YGrziGeW1ZUDLRC7pQ0IDPjwXKIL7jwXwS3+ezJUwiHcO7F1V 2d3WmyV1K3cw5JQoWiVindRWCfbK6xbN/K+KTjjkNvUceO71JLSGqg+Fcoc54VfIk511 JY80zS34QhM/Xd7stTpW30DIJeZwGzRbxZed3atQv+mASnXDsDLpl9woLtUpEVZ3Hm88 tyz6UieVV5lryVP+F7z8lhXbRKWGwUrqdq4oOy6kl1GHIhS1vFVfmvek+wHNlHDFGWZo IF7A== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1747694033; x=1748298833; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=GLL6/TM7VHHA7BD7VtemxaiF2BY2WlisCLQAtLpyNFE=; b=oz13oB1Q9lwYVe6GXOxHvQnRx7fRJELRSPGdQm2pF16fNf4lqYhJJllKyIedujDVzl iGVraJ4NBj0sKB9lDKTyNWcRP/bXEze6YrDKiL4FHYJyPHO+cQNpB7ilAPT/w4Hx2no7 nH6i12NR9Mpbo1UxXxTUN92lOlNdaQA+Bdc80oFB94unMuhTcozdC2R07nN4D4BE0JOG TwQElaPEjC5WxapNP8fDcal/JPHTphZaZa3mrHPGYFuPsWFUhaURs4V1rZ5fB0IES0Bn 7m+oaL+EUd0cvIrfvJPMkVVxwDSesfDnSMy1ey3viwCFFhhYQOJWJLgvfq7G+Q1WBaNt FB2A== X-Forwarded-Encrypted: i=1; AJvYcCUmDoVtBdmxw2/q1gU5XKxSpsE+vzYG/H97kn4VUhF+wEZ5uul2l4avxcUQ9dTEVBtFyeraZnLvReI=@vger.kernel.org, AJvYcCXUXF1kuzN7+QTWlOD3cQfVYqfST38j9A377v3h6vcyU5cLHwnqO4UnL0ONrZ2Akan0NFPLeLBwLfskLk7R@vger.kernel.org X-Gm-Message-State: AOJu0YyY7234vEbKhfQvDW2mKJZqc6/NYz/Yx7KbsE0nsUQN6CfZunLl 1Kh+LA257gVWISj75GIo6XZIw9VCAcssPMKF012yeCniI9SWaHj8rpSZ X-Gm-Gg: ASbGncsa6o5NoT2mj7GbV430Y+BI5l9Tht8WHCRWSoBpuT4AzMPEE+NllsuMQQhmWVO ulv0G/iwPWy57+Q0nCW4Vsv8fC8bXFoseu9aAKPgvN/4JnJNOjjXHF3lLOoFYQA9A8iveTt6oi6 kH5pqNovU1DwyRSVBAgvA+F+t10bLctN6YevGIu0SMaZ/R6ZC83j2S0hAw1IiG3bCfmoRcQDKXH LOWKuw8yI/EH2dYUGI0iQb5uX+XTdfBG2hE7bNzx5KehdsYLqlUIXzIilYinLvMpxWtJiVq5WOg mf7g0dtMs24wCKVh6/vo3HS6g5REf7q94LHLEkVL+jJCF3qZ X-Google-Smtp-Source: AGHT+IF2bjSrNBKkDGo+548/O/GSqWphCFb2QYFAjC9c4/YfABn2yMXLMwEjhN7yIRg9vuRAbhk/TA== X-Received: by 2002:a05:622a:a1b:b0:476:a74d:f23b with SMTP id d75a77b69052e-494ae4ca3ffmr252876701cf.48.1747694032568; Mon, 19 May 2025 15:33:52 -0700 (PDT) Received: from localhost ([2a03:2880:20ff:1::]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-494ae427fe5sm62947021cf.44.2025.05.19.15.33.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 19 May 2025 15:33:52 -0700 (PDT) From: Usama Arif To: Andrew Morton , david@redhat.com, linux-mm@kvack.org Cc: 
hannes@cmpxchg.org, shakeel.butt@linux.dev, riel@surriel.com, ziy@nvidia.com, laoar.shao@gmail.com, baolin.wang@linux.alibaba.com, lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com, vbabka@suse.cz, jannh@google.com, Arnd Bergmann , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, kernel-team@meta.com, Usama Arif
Subject: [PATCH v3 5/7] selftests: prctl: introduce tests for PR_DEFAULT_MADV_NOHUGEPAGE
Date: Mon, 19 May 2025 23:29:57 +0100
Message-ID: <20250519223307.3601786-6-usamaarif642@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250519223307.3601786-1-usamaarif642@gmail.com>
References: <20250519223307.3601786-1-usamaarif642@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The test is limited to 2M PMD THPs. It does not modify the system
settings, so as not to disturb other processes running on the system.
It runs only if the PMD size is 2M, the 2M policy is set to inherit and
the system global THP policy is set to "always", so that the change in
behaviour due to PR_DEFAULT_MADV_NOHUGEPAGE can be seen.
This tests if:
- the process can successfully set the policy
- the policy is carried over to the new process with fork
- no hugepage is obtained when the process doesn't use MADV_HUGEPAGE
- a hugepage is obtained when the process does use MADV_HUGEPAGE
- the process can successfully reset the policy to PR_DEFAULT_SYSTEM
- a hugepage is obtained after the policy reset

Signed-off-by: Usama Arif
---
 tools/testing/selftests/prctl/Makefile     |   2 +-
 tools/testing/selftests/prctl/thp_policy.c | 214 +++++++++++++++++++++
 2 files changed, 215 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/prctl/thp_policy.c

diff --git a/tools/testing/selftests/prctl/Makefile b/tools/testing/selftests/prctl/Makefile
index 01dc90fbb509..ee8c98e45b53 100644
--- a/tools/testing/selftests/prctl/Makefile
+++ b/tools/testing/selftests/prctl/Makefile
@@ -5,7 +5,7 @@ ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/x86/ -e s/x86_64/x86/)
 
 ifeq ($(ARCH),x86)
 TEST_PROGS := disable-tsc-ctxt-sw-stress-test disable-tsc-on-off-stress-test \
-		disable-tsc-test set-anon-vma-name-test set-process-name
+		disable-tsc-test set-anon-vma-name-test set-process-name thp_policy
 all: $(TEST_PROGS)
 
 include ../lib.mk
diff --git a/tools/testing/selftests/prctl/thp_policy.c b/tools/testing/selftests/prctl/thp_policy.c
new file mode 100644
index 000000000000..7791d282f7c8
--- /dev/null
+++ b/tools/testing/selftests/prctl/thp_policy.c
@@ -0,0 +1,214 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This test covers the PR_GET/SET_THP_POLICY functionality of prctl calls
+ */
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#ifndef PR_SET_THP_POLICY
+#define PR_SET_THP_POLICY 78
+#define PR_GET_THP_POLICY 79
+#define PR_DEFAULT_MADV_HUGEPAGE 0
+#define PR_DEFAULT_MADV_NOHUGEPAGE 1
+#define PR_DEFAULT_SYSTEM 2
+#endif
+
+#define CONTENT_SIZE 256
+#define BUF_SIZE (12 * 2 * 1024 * 1024) // 12 x 2MB pages
+
+enum system_policy {
+	SYSTEM_POLICY_ALWAYS,
+	SYSTEM_POLICY_MADVISE,
+
	SYSTEM_POLICY_NEVER,
+};
+
+int system_thp_policy;
+
+/* check if the sysfs file contains the expected substring */
+static int check_file_content(const char *file_path, const char *expected_substring)
+{
+	FILE *file = fopen(file_path, "r");
+	char buffer[CONTENT_SIZE];
+
+	if (!file) {
+		perror("Failed to open file");
+		return -1;
+	}
+	if (fgets(buffer, CONTENT_SIZE, file) == NULL) {
+		perror("Failed to read file");
+		fclose(file);
+		return -1;
+	}
+	fclose(file);
+	// Remove newline character from the buffer
+	buffer[strcspn(buffer, "\n")] = '\0';
+	if (strstr(buffer, expected_substring))
+		return 0;
+	else
+		return 1;
+}
+
+/*
+ * The test is designed for 2M hugepages only.
+ * Check if hugepage size is 2M, if 2M size inherits from global
+ * setting, and if the global setting is madvise or always.
+ */
+static int sysfs_check(void)
+{
+	int res = 0;
+
+	res = check_file_content("/sys/kernel/mm/transparent_hugepage/hpage_pmd_size", "2097152");
+	if (res) {
+		printf("hpage_pmd_size is not set to 2MB. Skipping test.\n");
+		return -1;
+	}
+	res |= check_file_content("/sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled",
+				  "[inherit]");
+	if (res) {
+		printf("hugepages-2048kB does not inherit global setting. Skipping test.\n");
+		return -1;
+	}
+
+	res = check_file_content("/sys/kernel/mm/transparent_hugepage/enabled", "[madvise]");
+	if (!res) {
+		system_thp_policy = SYSTEM_POLICY_MADVISE;
+		return 0;
+	}
+	res = check_file_content("/sys/kernel/mm/transparent_hugepage/enabled", "[always]");
+	if (!res) {
+		system_thp_policy = SYSTEM_POLICY_ALWAYS;
+		return 0;
+	}
+	printf("Global THP policy not set to madvise or always.
Skipping test.\n");
+	return -1;
+}
+
+static int check_smaps_for_huge(void)
+{
+	FILE *file = fopen("/proc/self/smaps", "r");
+	int is_anonhuge = 0;
+	char line[256];
+
+	if (!file) {
+		perror("fopen");
+		return -1;
+	}
+
+	while (fgets(line, sizeof(line), file)) {
+		if (strstr(line, "AnonHugePages:") && strstr(line, "24576 kB")) {
+			is_anonhuge = 1;
+			break;
+		}
+	}
+	fclose(file);
+	return is_anonhuge;
+}
+
+static int test_mmap_thp(int madvise_buffer)
+{
+	int is_anonhuge;
+
+	char *buffer = (char *)mmap(NULL, BUF_SIZE, PROT_READ | PROT_WRITE,
+				    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	if (buffer == MAP_FAILED) {
+		perror("mmap");
+		return -1;
+	}
+	if (madvise_buffer)
+		madvise(buffer, BUF_SIZE, MADV_HUGEPAGE);
+
+	// set memory to ensure it's allocated
+	memset(buffer, 0, BUF_SIZE);
+	is_anonhuge = check_smaps_for_huge();
+	munmap(buffer, BUF_SIZE);
+	return is_anonhuge;
+}
+
+/* Global policy is always, process is changed to NOHUGE (process becomes madvise) */
+static int test_global_always_process_nohuge(void)
+{
+	int is_anonhuge = 0, res = 0, status = 0;
+	pid_t pid;
+
+	if (prctl(PR_SET_THP_POLICY, PR_DEFAULT_MADV_NOHUGEPAGE, NULL, NULL, NULL) != 0) {
+		perror("prctl failed to set policy to madvise");
+		return -1;
+	}
+
+	/* Make sure prctl changes are carried across fork */
+	pid = fork();
+	if (pid < 0) {
+		perror("fork");
+		exit(EXIT_FAILURE);
+	}
+
+	res = prctl(PR_GET_THP_POLICY, NULL, NULL, NULL, NULL);
+	if (res != PR_DEFAULT_MADV_NOHUGEPAGE) {
+		printf("prctl PR_GET_THP_POLICY returned %d pid %d\n", res, pid);
+		goto err_out;
+	}
+
+	/* global = always, process = madvise, we shouldn't get HPs without madvise */
+	is_anonhuge = test_mmap_thp(0);
+	if (is_anonhuge) {
+		printf(
+		"PR_DEFAULT_MADV_NOHUGEPAGE set but still got hugepages without MADV_HUGEPAGE\n");
+		goto err_out;
+	}
+
+	is_anonhuge = test_mmap_thp(1);
+	if (!is_anonhuge) {
+		printf(
+		"PR_DEFAULT_MADV_NOHUGEPAGE set but didn't get hugepages
with MADV_HUGEPAGE\n");
+		goto err_out;
+	}
+
+	/* Reset to system policy */
+	if (prctl(PR_SET_THP_POLICY, PR_DEFAULT_SYSTEM, NULL, NULL, NULL) != 0) {
+		perror("prctl failed to set policy to system");
+		goto err_out;
+	}
+
+	is_anonhuge = test_mmap_thp(0);
+	if (!is_anonhuge) {
+		printf("global policy is always but didn't get hugepages without MADV_HUGEPAGE\n");
+		goto err_out;
+	}
+
+	is_anonhuge = test_mmap_thp(1);
+	if (!is_anonhuge) {
+		printf("global policy is always but didn't get hugepages with MADV_HUGEPAGE\n");
+		goto err_out;
+	}
+
+	if (pid == 0) {
+		exit(EXIT_SUCCESS);
+	} else {
+		wait(&status);
+		/* A child that exits with EXIT_FAILURE must fail the test too */
+		if (WIFEXITED(status) && WEXITSTATUS(status) == EXIT_SUCCESS)
+			return 0;
+		else
+			return -1;
+	}
+
+err_out:
+	if (pid == 0)
+		exit(EXIT_FAILURE);
+	else
+		return -1;
+}
+
+int main(void)
+{
+	if (sysfs_check())
+		return 0;
+
+	if (system_thp_policy == SYSTEM_POLICY_ALWAYS)
+		return test_global_always_process_nohuge();
+
+}
-- 
2.47.1

From nobody Fri Dec 19 14:23:34 2025
Received: from mail-qt1-f179.google.com (mail-qt1-f179.google.com [209.85.160.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D3A8321FF5A; Mon, 19 May 2025 22:33:54 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.179
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747694036; cv=none; b=I3DZh6xSVNfj0LNJ22YrSd1ELfYdZzjsY1EY3g81cQ4Zq5rRXIHv0wWHew5Dk39ecoJEo7VlosCAMO/1rpj0OlAzPgYiE+Ip9X0gqKzXq8Jbo991IBNbuESZWTiZ5c9VWZZuQw5kIOm7n8f/y4dgQ7xldMKGKXh3ICCb8pM+yqU=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747694036; c=relaxed/simple; bh=mdjyDN1GqLCpsZy/2bMMWG1/g7ATnYg6QL8zmUB4dEg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version;
b=qjF9wRpuinGQ83yGDhlLpEBJYTkaKgvQldOXr3zK9ZlDOluiLa3lxwiG1jADKIlb04EstEMLaYtE+yLKUU+Jwh4EJVR0xgY/IUN2PbQjas4jTqpjMSEV/IahNY3nToa1Qajm1WZ3ymjB6EiBvy6rkPCRDCWZ+Ykjwhrw+dXjLaA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=Gc8zR7UE; arc=none smtp.client-ip=209.85.160.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Gc8zR7UE" Received: by mail-qt1-f179.google.com with SMTP id d75a77b69052e-4766cb762b6so56129361cf.0; Mon, 19 May 2025 15:33:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1747694033; x=1748298833; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=QBedgeD+gExl3gWB2aOy4e3KdH1CGamnabmn5JJeaHM=; b=Gc8zR7UE+5VjGeCNCY6rzh/3SoXhx3T2Ft9EOIaodPWSeYJwvnYt8w8uh+wHtjYXVY GFQ8J5nEaXeAaDLGL4js11CsYAZHtU6cFJ8sOduIY9/ZodpA1dx2BPOa2JfDrzacnI7a wjcRPGdIM1Q6Ft65YFM5QMyvSyS5L06YF1/11lGuWUibINzeypoW3D5YZ+FQCODX5cxy 1bsm8RtL94DBxoWhmXu0ajqKjuvtVgnju3QgDmN3iVEVaQ9ET9d9vId4D5xfKaBrpXwP HGVj7CAhMyqqA2EpahONhOyJFSE5fCbQXpZ1/iPAxxORNFC8fveySKEBWEBKHr9QuBbo 5nXA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1747694033; x=1748298833; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=QBedgeD+gExl3gWB2aOy4e3KdH1CGamnabmn5JJeaHM=; b=K8d4S2t2UiDrkU+em0fSGN6refVux4zawqkvzf/HgqzOcjgLvWt1C9D99oY0RNESBC 
2uGN4DGNRUFZh1t0XPOe7qpdHcxwj+HBgmV+GKIPTKH0NKwpBM7sfAsfXkIn1sjzl3jq m9AVNiuFRHO9Heh6cohI3IsBrYWbo6zXKRW5ad2bpxAUNRRkHbwcg1eCsfOIuFUo2EbD i2Aot1gy4g/II87CUKb+5RC6kViBiFQPSmWQuOnUg+aplw93ERaXm9ktliEUDV/QsxHO iJfNZDdhFaQm2QND0A7X4ZBICTgOB+Nm3QJaJq9HT7kspC++8ol3L/9tFRyd5s0sBcBx 5t/Q== X-Forwarded-Encrypted: i=1; AJvYcCUt22HHGdTmPN3tvOlhjuQrt1KTZVhYfdwp0WErj96P9bXbpXZVnZBvZQZslLVzOiwYkLW12Ubepb8=@vger.kernel.org, AJvYcCX0l2WGb/HnQhP/A3kw3aO2ZwpwXLPkQBbzZsTlcnvr0JptpL0Dyd/vcTF0fYhU7tf1D70C+FbggKNGZBOv@vger.kernel.org X-Gm-Message-State: AOJu0YxFB7PMw046ABuzTkbWFYR3wBFty2pCxHqiZ9k6JSzIH2jcBRXB 32ekhXUiUunnS33En4oLbRweB+mR2LUF1bqUoFFiAKHoEtr96JNEpEL2 X-Gm-Gg: ASbGncv+nSMY5ER5Goz0sGA2y6L3n+NGCBPCvBUyEIDcKMGKqhBU4KPF8dA14PDj5yz EHvo1KyQ9hD003hqaF+ouIkLqhzJuVkTSpcJfguCEglvFW9NPr1KjcZddO1piVkK7ZD7oRGnP4A uT1cZDKBM6cTU54ZpDRz6dG8hR0aQE59UZG71HXdanehtLreEK9xlx78e/iyLvwVbctjHltKA+Q m+2Efl5QTYN8329Ho5xcVxmld0Mg3dNRW+WVJMx0s/ZPVd+G50JBbZvQMBfk9txi3FkYzcR1+W8 Zjsqcz02lm+gR4dIVKY6aWOxeiWZ+J+S5Xayl3xMQ+aqa5Jd X-Google-Smtp-Source: AGHT+IHIMtoPpH0aTMNDuPw4CtAj8gWgNrBxIjYiHH9b7GKsL6dv3EnaDjorXwPudxtkzQKEOIFtfg== X-Received: by 2002:a05:622a:4d4f:b0:494:b8fd:b565 with SMTP id d75a77b69052e-494b8fdb73fmr226540651cf.17.1747694033571; Mon, 19 May 2025 15:33:53 -0700 (PDT) Received: from localhost ([2a03:2880:20ff:4::]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-494ae3ccfd5sm61080091cf.2.2025.05.19.15.33.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 19 May 2025 15:33:53 -0700 (PDT) From: Usama Arif To: Andrew Morton , david@redhat.com, linux-mm@kvack.org Cc: hannes@cmpxchg.org, shakeel.butt@linux.dev, riel@surriel.com, ziy@nvidia.com, laoar.shao@gmail.com, baolin.wang@linux.alibaba.com, lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com, vbabka@suse.cz, jannh@google.com, Arnd Bergmann , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, kernel-team@meta.com, Usama Arif Subject: [PATCH v3 6/7] selftests: prctl: 
introduce tests for PR_THP_POLICY_DEFAULT_HUGE
Date: Mon, 19 May 2025 23:29:58 +0100
Message-ID: <20250519223307.3601786-7-usamaarif642@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250519223307.3601786-1-usamaarif642@gmail.com>
References: <20250519223307.3601786-1-usamaarif642@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The test is limited to 2M PMD THPs. It does not modify the system
settings, so as not to disturb other processes running on the system.
It runs only if the PMD size is 2M, the 2M policy is set to inherit and
the system global THP policy is set to "madvise", so that the change in
behaviour due to PR_THP_POLICY_DEFAULT_HUGE can be seen.

This tests if:
- the process can successfully set the policy
- the policy is carried over to the new process with fork
- a hugepage is obtained both with and without madvise
- the process can successfully reset the policy to PR_DEFAULT_SYSTEM
- a hugepage is obtained after the policy reset only with MADV_HUGEPAGE

Signed-off-by: Usama Arif
---
 tools/testing/selftests/prctl/thp_policy.c | 74 +++++++++++++++++++++-
 1 file changed, 73 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/prctl/thp_policy.c b/tools/testing/selftests/prctl/thp_policy.c
index 7791d282f7c8..62cf1fa6fd28 100644
--- a/tools/testing/selftests/prctl/thp_policy.c
+++ b/tools/testing/selftests/prctl/thp_policy.c
@@ -203,6 +203,77 @@ static int test_global_always_process_nohuge(void)
 		return -1;
 }
 
+/* Global policy is madvise, process is changed to HUGE (process becomes always) */
+static int test_global_madvise_process_huge(void)
+{
+	int is_anonhuge = 0, res = 0, status = 0;
+	pid_t pid;
+
+	if (prctl(PR_SET_THP_POLICY, PR_DEFAULT_MADV_HUGEPAGE, NULL, NULL, NULL) != 0) {
+		perror("prctl failed to set process policy to always");
+		return -1;
+	}
+
+	/* Make sure prctl changes are carried across fork */
+	pid = fork();
+	if (pid < 0) {
+		perror("fork");
+		exit(EXIT_FAILURE);
+	}
+
+	res = prctl(PR_GET_THP_POLICY, NULL, NULL, NULL, NULL);
+	if (res != PR_DEFAULT_MADV_HUGEPAGE) {
+		printf("prctl PR_GET_THP_POLICY returned %d pid %d\n", res, pid);
+		goto err_out;
+	}
+
+	/* global = madvise, process = always, we should get HPs irrespective of MADV_HUGEPAGE */
+	is_anonhuge = test_mmap_thp(0);
+	if (!is_anonhuge) {
+		printf("PR_DEFAULT_MADV_HUGEPAGE set but didn't get hugepages without MADV_HUGEPAGE\n");
+		goto err_out;
+	}
+
+	is_anonhuge = test_mmap_thp(1);
+	if (!is_anonhuge) {
+		printf("PR_DEFAULT_MADV_HUGEPAGE set but didn't get hugepages with MADV_HUGEPAGE\n");
+		goto err_out;
+	}
+
+	/* Reset to system policy */
+	if (prctl(PR_SET_THP_POLICY, PR_DEFAULT_SYSTEM, NULL, NULL, NULL) != 0) {
+		perror("prctl failed to set policy to system");
+		goto err_out;
+	}
+
+	is_anonhuge = test_mmap_thp(0);
+	if (is_anonhuge) {
+		printf("global policy is madvise but got hugepages without MADV_HUGEPAGE\n");
+		goto err_out;
+	}
+
+	is_anonhuge = test_mmap_thp(1);
+	if (!is_anonhuge) {
+		printf("global policy is madvise but didn't get hugepages with MADV_HUGEPAGE\n");
+		goto err_out;
+	}
+
+	if (pid == 0) {
+		exit(EXIT_SUCCESS);
+	} else {
+		wait(&status);
+		if (WIFEXITED(status) && WEXITSTATUS(status) == EXIT_SUCCESS)
+			return 0;
+		else
+			return -1;
+	}
+err_out:
+	if (pid == 0)
+		exit(EXIT_FAILURE);
+	else
+		return -1;
+}
+
 int main(void)
 {
 	if (sysfs_check())
@@ -210,5 +281,6 @@ int main(void)
 
 	if (system_thp_policy == SYSTEM_POLICY_ALWAYS)
 		return test_global_always_process_nohuge();
-
+	else
+		return test_global_madvise_process_huge();
 }
-- 
2.47.1

From nobody Fri Dec 19 14:23:34 2025
Received: from mail-qv1-f43.google.com (mail-qv1-f43.google.com [209.85.219.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0756122069E; Mon, 19 May 2025 22:33:55 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none
smtp.client-ip=209.85.219.43 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747694037; cv=none; b=P//gQ9rITrM/IbPjf/948YaB0oVrtlSvFlEhWfZJkAYRsO0UXzIU7Nd1e5DzQ7WVb/NL2o13fuzbSTOWhpkTiBq1umgtIjFFIuZw6+aEtLUul7dlujsRKuyrIyefsM4SkzbmF1M07JcGdVhB6+vTC4ygSMBdmVIHVGz0dJ/cd2k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747694037; c=relaxed/simple; bh=WfzqJeERjr6pRWJErFYqsjHoy0qW6jTKBvTVI2yMSqA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Qr+k+VGgoSd6oE83WirbUXLdl7bze98Kt5QhNzyiDFWaG2HIcbjrlQXUR7N1n8vrcrLR6ZZXEOo2+MhZbGCdLb+CooPXxeb3X7vBpH/Ho9w2DovxSvPwMrMdLSbq7cI2QXYJZMgzbWK2sfTZVOwySoQK17TqWlIYZFFHaN0gNIM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=nRqswrU2; arc=none smtp.client-ip=209.85.219.43 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="nRqswrU2" Received: by mail-qv1-f43.google.com with SMTP id 6a1803df08f44-6ecfc2cb1aaso53976136d6.3; Mon, 19 May 2025 15:33:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1747694035; x=1748298835; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=LVAm0MhcUxfqy1/Jj47nVqzJXeFttO6qIrZiquv3thQ=; b=nRqswrU2X57g08m29dR7dJZpej40aIcVCNb0AKcDatcZ0vYovlTX/Ajt6l6+LTVRDS qlecqhPXZuX1Edkfupx7J3NIyqp3YstarHQtEJTBmv/fol7vxRkU3D98YTunAKkIIGZH wFKee7rPu4nHsleQ55CimnomjqUmaIDdHT1OYgmlurp6LHiQLtXq1xb1eNSeJ+ZMsRUL 
In+JrPVsi89PNlKjzxuE6sdQCopVZ3U1mbnICvLW4B+BXl4dTOug+XY8u7KCXaBJQmMh LfNaaR/Mv4YLR6j+uyZFofaebo8RTIIAJCGcoDLs0tHT5qX46QO/HRKVAuq87C3MweRm dvYA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1747694035; x=1748298835; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=LVAm0MhcUxfqy1/Jj47nVqzJXeFttO6qIrZiquv3thQ=; b=r8WYP88+YB6zEN77t+4aISzT1EK+9zy7zCciEDq+6PN6L1w9U9IyuSrcpDBA94nQ7z pidrN8XfibA6GdDUxGXFhwJjZTnNzIfgvD2ev7vZfkUO6FFfYa8fMnMbaJeV6WXbAR3N yHhdNoUfb4MGFSSL48UMyKhKKzmAVXg5iY0jgzvlK0ylQyC+heDcgpcpBGf89vGrhsLO uD2BPfjkMKWvT4yBIr6eH27++oHv2qeFSzfyTABzzQcmFYDWSAfIUCC9gcXeH3IPl8UL fCcNm+MhXgrTHMKYosQpTG1z6wT6YlkIrWwbTP9LQWFAAi0koxH3eUZLAe6fTYsMcO6J xbCg== X-Forwarded-Encrypted: i=1; AJvYcCU5lNTC+Q0GlISPOVVsXBNdexcluH7WqSDafl2N1Ov6/Rk8kioMza+VhAgDyNCTQVIW85Ykv/9Ut4g=@vger.kernel.org, AJvYcCXiJDvHcDIVrwvZhvsNp6nFNsnbeU9mnkrXy3p+AjbwtgfcqqTnKtmvC4sfVqyj6iKXbwrdG9meFmyItNzB@vger.kernel.org X-Gm-Message-State: AOJu0Yw+dyjhSjjHxY8LRVsYA1wso/SA9lKcCHovney9CZ7K8CcERyuZ TcRA0nodiJakSrIzaypvNDLMSfWZmNk5flP2BSYlmB4152Hu071KF018 X-Gm-Gg: ASbGncvwpLYd/8RLQRRzVI4eJwXaBbcdnNpmE2ZbdkOLqEEfcWdD1LLM4clLBW/1eYo tmSPWNdGyQEXImqgWLPreNVEF+6blIQkPBHvNOU2XaS5xfXlDzFMjR+4GXI192cJvOVGlTfvPgp q5Au0hA+sOsLvVtBtGAGIiEXkhfFGPboHTA5QQKcoDyRnSQTgZiOvGkId6HVr9J9+gNwr47QQFp QYXXTJv+BSW7tpnrhfEPyr7u5j+Zlkprnv5KCLGsij8tikhruiEXRpiPK03PIupngFzoQ4Hijft rawI2XiX92rdbJU7ZK2oOfDiFVzQNeLccre0KwLwlglvnmZTcs5q4jI8jVA= X-Google-Smtp-Source: AGHT+IEHV6V/9vy/4d8ig5hkz88Eqqp79gFobxGRuxf58uObGKqhfceiqeMRNi1byTKwTMbz4yMfpQ== X-Received: by 2002:ad4:4ee2:0:b0:6f5:106a:271e with SMTP id 6a1803df08f44-6f8b08edabcmr264699826d6.38.1747694034773; Mon, 19 May 2025 15:33:54 -0700 (PDT) Received: from localhost ([2a03:2880:20ff:7::]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-494b4075860sm56213021cf.23.2025.05.19.15.33.54 (version=TLS1_3 
cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 19 May 2025 15:33:54 -0700 (PDT) From: Usama Arif To: Andrew Morton , david@redhat.com, linux-mm@kvack.org Cc: hannes@cmpxchg.org, shakeel.butt@linux.dev, riel@surriel.com, ziy@nvidia.com, laoar.shao@gmail.com, baolin.wang@linux.alibaba.com, lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com, vbabka@suse.cz, jannh@google.com, Arnd Bergmann , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, kernel-team@meta.com, Usama Arif Subject: [PATCH v3 7/7] docs: transhuge: document process level THP controls Date: Mon, 19 May 2025 23:29:59 +0100 Message-ID: <20250519223307.3601786-8-usamaarif642@gmail.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250519223307.3601786-1-usamaarif642@gmail.com> References: <20250519223307.3601786-1-usamaarif642@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This includes the already existing PR_GET/SET_THP_DISABLE policy, as well as the newly introduced PR_GET/SET_THP_POLICY. Signed-off-by: Usama Arif --- Documentation/admin-guide/mm/transhuge.rst | 42 ++++++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/adm= in-guide/mm/transhuge.rst index dff8d5985f0f..79983c20ae48 100644 --- a/Documentation/admin-guide/mm/transhuge.rst +++ b/Documentation/admin-guide/mm/transhuge.rst @@ -218,6 +218,48 @@ to "always" or "madvise"), and it'll be automatically = shutdown when PMD-sized THP is disabled (when both the per-size anon control and the top-level control are "never") =20 +process THP controls +-------------------- + +Transparent Hugepage behaviour of a process can be modified/obtained by +using the prctl system call. 
The following operations are supported:
+
+PR_SET_THP_DISABLE
+	This will set the MMF_DISABLE_THP process flag which will result
+	in no hugepages being faulted in or collapsed by khugepaged,
+	irrespective of global THP controls.
+
+PR_GET_THP_DISABLE
+	This will return the MMF_DISABLE_THP process flag, which will be
+	set if PR_SET_THP_DISABLE was previously called for the process.
+
+PR_SET_THP_POLICY
+	This is used to change the behaviour of existing and future VMAs.
+	It has support for the following policies:
+
+	PR_DEFAULT_MADV_HUGEPAGE
+		This will set VM_HUGEPAGE and clear VM_NOHUGEPAGE for the default
+		VMA flags. It will also iterate through every VMA in the process
+		and call hugepage_madvise on it, with MADV_HUGEPAGE policy.
+		This effectively allows setting MADV_HUGEPAGE on the entire process.
+		The policy is inherited during fork+exec.
+
+	PR_DEFAULT_MADV_NOHUGEPAGE
+		This will set VM_NOHUGEPAGE and clear VM_HUGEPAGE for the default
+		VMA flags. It will also iterate through every VMA in the process
+		and call hugepage_madvise on it, with MADV_NOHUGEPAGE policy.
+		This effectively allows setting MADV_NOHUGEPAGE on the entire process.
+		The policy is inherited during fork+exec.
+
+	PR_THP_POLICY_SYSTEM
+		This will clear both VM_HUGEPAGE and VM_NOHUGEPAGE in the default
+		VMA flags, so that new VMAs follow the system-wide THP policy.
+
+PR_GET_THP_POLICY
+	This will return the current THP policy of the process, i.e.
+	PR_DEFAULT_MADV_HUGEPAGE, PR_DEFAULT_MADV_NOHUGEPAGE or
+	PR_THP_POLICY_SYSTEM.
+
 Khugepaged controls
 -------------------
 
-- 
2.47.1
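
The default-flag semantics documented in this series can be modelled in
userspace roughly as below. This is only a sketch, not the kernel code: the
VM_* bit values, the model_* helper names and the has_pgste parameter are
assumptions made for illustration, the model does not walk existing VMAs,
and the s390 pgste exception is modelled only for PR_THP_POLICY_SYSTEM as
described in patch 4/7.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the kernel's VMA flag bits (assumed values). */
#define VM_HUGEPAGE   0x20000000UL
#define VM_NOHUGEPAGE 0x40000000UL

/* Policy values as proposed in this series' include/uapi/linux/prctl.h. */
#define PR_DEFAULT_MADV_HUGEPAGE   0
#define PR_DEFAULT_MADV_NOHUGEPAGE 1
#define PR_THP_POLICY_SYSTEM       2

/* Hypothetical helper: the value PR_GET_THP_POLICY would derive from def_flags. */
static int model_get_thp_policy(unsigned long def_flags)
{
	if (def_flags & VM_HUGEPAGE)
		return PR_DEFAULT_MADV_HUGEPAGE;
	if (def_flags & VM_NOHUGEPAGE)
		return PR_DEFAULT_MADV_NOHUGEPAGE;
	return PR_THP_POLICY_SYSTEM;
}

/*
 * Hypothetical helper: update the default flags the way PR_SET_THP_POLICY
 * is documented to. Returns 0 on success, -1 for an unknown policy.
 * has_pgste stands in for mm_has_pgste() on s390.
 */
static int model_set_thp_policy(unsigned long *def_flags, int policy,
				bool has_pgste)
{
	switch (policy) {
	case PR_DEFAULT_MADV_HUGEPAGE:
		*def_flags = (*def_flags & ~VM_NOHUGEPAGE) | VM_HUGEPAGE;
		return 0;
	case PR_DEFAULT_MADV_NOHUGEPAGE:
		*def_flags = (*def_flags & ~VM_HUGEPAGE) | VM_NOHUGEPAGE;
		return 0;
	case PR_THP_POLICY_SYSTEM:
		if (has_pgste)
			*def_flags &= ~VM_HUGEPAGE; /* keep VM_NOHUGEPAGE */
		else
			*def_flags &= ~(VM_HUGEPAGE | VM_NOHUGEPAGE);
		return 0;
	default:
		return -1;
	}
}
```

Starting from an empty flags word, setting PR_DEFAULT_MADV_NOHUGEPAGE and
then PR_THP_POLICY_SYSTEM with has_pgste set leaves VM_NOHUGEPAGE in place,
so the model keeps reporting PR_DEFAULT_MADV_NOHUGEPAGE; without pgstes both
bits are cleared and the model reports PR_THP_POLICY_SYSTEM.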