[PATCH v3 0/6] prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised

Usama Arif posted 6 patches 2 months ago
There is a newer version of this series
[PATCH v3 0/6] prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised
Posted by Usama Arif 2 months ago
(Resending this as forgot to include PATCH v2 in subject prefix)

This will allow individual processes to opt out of THP = "always" and
into THP = "madvise", without affecting other workloads on the system.
This has been discussed extensively on the mailing list and is summarized
very well by David in the first patch, which also links to the
alternatives that were considered. Please refer to the first patch's
commit message for the motivation for this series.

Patch 1 adds the PR_THP_DISABLE_EXCEPT_ADVISED flag to implement this, along
with the MMF changes.
Patch 2 is a cleanup patch for tva_flags that will allow the forced collapse
case to be transmitted to vma_thp_disabled (which is done in patch 3).
Patch 4 adds documentation for PR_SET_THP_DISABLE/PR_GET_THP_DISABLE.
Patches 5-6 implement the selftests for PR_SET_THP_DISABLE, covering both
disabling THPs completely (old behaviour) and only providing them when
advised (PR_THP_DISABLE_EXCEPT_ADVISED).
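
For illustration, a minimal userspace sketch of how a process might select
either mode. The helper names are hypothetical, and the fallback value of
PR_THP_DISABLE_EXCEPT_ADVISED is an assumption based on the uapi change in
this series; it is only used when the installed headers do not define the
flag. Kernels without this series are expected to reject the non-zero arg3
with EINVAL, so callers should be prepared for failure.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Assumption: flag value as proposed by this series' uapi change. */
#ifndef PR_THP_DISABLE_EXCEPT_ADVISED
#define PR_THP_DISABLE_EXCEPT_ADVISED (1 << 1)
#endif

/*
 * Ask for THPs only where advised (e.g. MADV_HUGEPAGE, MADV_COLLAPSE).
 * Returns 0 on success, -errno on failure.
 */
int thp_only_when_advised(void)
{
	if (prctl(PR_SET_THP_DISABLE, 1UL,
		  (unsigned long)PR_THP_DISABLE_EXCEPT_ADVISED, 0UL, 0UL) == -1)
		return -errno;
	return 0;
}

/* Old behaviour: disable THPs completely for this process. */
int thp_disable_completely(void)
{
	if (prctl(PR_SET_THP_DISABLE, 1UL, 0UL, 0UL, 0UL) == -1)
		return -errno;
	return 0;
}

/* Undo either of the above and follow the system-wide policy again. */
int thp_reenable(void)
{
	if (prctl(PR_SET_THP_DISABLE, 0UL, 0UL, 0UL, 0UL) == -1)
		return -errno;
	return 0;
}

/* Query the current setting: non-zero if disabled, -errno on failure. */
int thp_disable_state(void)
{
	int ret = prctl(PR_GET_THP_DISABLE, 0UL, 0UL, 0UL, 0UL);

	return ret == -1 ? -errno : ret;
}
```

A workload launcher could call thp_only_when_advised() before exec'ing the
target and fall back to thp_disable_completely() if the kernel rejects the
flag.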

The patches are tested on top of 4ad831303eca6ae518c3b3d86838a2a04b90ec41
from mm-new.

v2 -> v3: https://lore.kernel.org/all/20250731122825.2102184-1-usamaarif642@gmail.com/
- Fixed sign-off and added acks for patch 1 (Lorenzo and Zi Yan)
- Fixed up commit message, comments and variable names in patches 2 and 3 (Lorenzo)
- Added documentation for PR_SET_THP_DISABLE/PR_GET_THP_DISABLE (Lorenzo)
- Removed struct test_results and enum thp_policy for prctl tests (David)

v1 -> v2: https://lore.kernel.org/all/20250725162258.1043176-1-usamaarif642@gmail.com/
- Change thp_push_settings to thp_write_settings (David)
- Add tests for all the system policies for the prctl call (David)
- Small fixes and cleanups

David Hildenbrand (3):
  prctl: extend PR_SET_THP_DISABLE to optionally exclude VM_HUGEPAGE
  mm/huge_memory: convert "tva_flags" to "enum tva_type"
  mm/huge_memory: respect MADV_COLLAPSE with
    PR_THP_DISABLE_EXCEPT_ADVISED

Usama Arif (3):
  docs: transhuge: document process level THP controls
  selftests: prctl: introduce tests for disabling THPs completely
  selftests: prctl: introduce tests for disabling THPs except for
    madvise

 Documentation/admin-guide/mm/transhuge.rst    |  38 +++
 Documentation/filesystems/proc.rst            |   5 +-
 fs/proc/array.c                               |   2 +-
 fs/proc/task_mmu.c                            |   4 +-
 include/linux/huge_mm.h                       |  60 ++--
 include/linux/mm_types.h                      |  13 +-
 include/uapi/linux/prctl.h                    |  10 +
 kernel/sys.c                                  |  59 +++-
 mm/huge_memory.c                              |  11 +-
 mm/khugepaged.c                               |  19 +-
 mm/memory.c                                   |  20 +-
 mm/shmem.c                                    |   2 +-
 tools/testing/selftests/mm/.gitignore         |   1 +
 tools/testing/selftests/mm/Makefile           |   1 +
 .../testing/selftests/mm/prctl_thp_disable.c  | 280 ++++++++++++++++++
 tools/testing/selftests/mm/thp_settings.c     |   9 +-
 tools/testing/selftests/mm/thp_settings.h     |   1 +
 17 files changed, 464 insertions(+), 71 deletions(-)
 create mode 100644 tools/testing/selftests/mm/prctl_thp_disable.c

-- 
2.47.3
Re: [PATCH v3 0/6] prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised
Posted by Usama Arif 2 months ago

On 04/08/2025 16:40, Usama Arif wrote:
> (Resending this as forgot to include PATCH v2 in subject prefix)

Not a resend; this is v3. I just forgot to remove the above line from the
previous cover letter :)
Re: [PATCH v3 0/6] prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised
Posted by Lorenzo Stoakes 1 month, 3 weeks ago
Usama - did we plan another respin here? I ask as not in mm-new.

Also heads up, my mm flags series will break this one, so if you're
respinning, please make sure to use the mm flag helpers described in [0].

It's really simple, you just do:

mm_flags_test(MMF_xxx, mm) instead of test_bit(MMF_xxx, &mm->flags)
mm_flags_set(MMF_xxx, mm) instead of set_bit(MMF_xxx, &mm->flags)
mm_flags_clear(MMF_xxx, mm) instead of clear_bit(MMF_xxx, &mm->flags)

So should be very quick to fixup.

Sorry about that, but should be super simple to sort out.
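
As a rough illustration of the conversion at a call site, here is a
standalone userspace mock. Everything in it is a stand-in: struct toy_mm,
TOY_MMF_EXAMPLE and these non-atomic helper bodies are hypothetical and only
mirror the (flag, mm) argument order listed above; the real helpers are the
ones in [0] and operate on the kernel's mm_struct.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's mm_struct. */
struct toy_mm {
	unsigned long flags;
};

/* Hypothetical flag bit number, for illustration only. */
#define TOY_MMF_EXAMPLE 3

/* Mock helpers: same argument order as above, but plain non-atomic ops. */
bool mm_flags_test(int flag, const struct toy_mm *mm)
{
	return mm->flags & (1UL << flag);
}

void mm_flags_set(int flag, struct toy_mm *mm)
{
	mm->flags |= 1UL << flag;
}

void mm_flags_clear(int flag, struct toy_mm *mm)
{
	mm->flags &= ~(1UL << flag);
}

/*
 * Converted call site. Before the conversion this would have poked at
 * mm->flags directly via test_bit()/set_bit().
 * Returns true if the flag was newly set by this call.
 */
bool toy_mark_example(struct toy_mm *mm)
{
	if (mm_flags_test(TOY_MMF_EXAMPLE, mm))
		return false;
	mm_flags_set(TOY_MMF_EXAMPLE, mm);
	return true;
}
```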

Cheers, Lorenzo

[0]: https://lore.kernel.org/linux-mm/cover.1755012943.git.lorenzo.stoakes@oracle.com/
Re: [PATCH v3 0/6] prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised
Posted by Usama Arif 1 month, 3 weeks ago

On 13/08/2025 07:06, Lorenzo Stoakes wrote:
> Usama - did we plan another respin here? I ask as not in mm-new.
> 

Yes, I have had the changes ready since last week. I was just waiting for a
respin of the selftest cleanup series that David mentioned in [1], but I don't
see it on the mailing list. I will just do the cleanup in my series and send it.


> Also heads up, my mm flags series will break this one, so if you're
> respinning, please make sure to use the mm flag helpers described in [0].

Sounds good, Thanks for the heads up!

I do see this in mm-new, so will just send the next revision today tested
on latest mm-new.


> 
> It's really simple, you just do:
> 
> mm_flags_test(MMF_xxx, mm) instead of test_bit(MMF_xxx, &mm->flags)
> mm_flags_set(MMF_xxx, mm) instead of set_bit(MMF_xxx, &mm->flags)
> mm_flags_clear(MMF_xxx, mm) instead of clear_bit(MMF_xxx, &mm->flags)
> 
> So should be very quick to fixup.
> 
> Sorry about that, but should be super simple to sort out.
> 
> Cheers, Lorenzo
> 
> [0]: https://lore.kernel.org/linux-mm/cover.1755012943.git.lorenzo.stoakes@oracle.com/
[1] https://lore.kernel.org/all/eec7e868-a61f-41ed-a8ef-7ff80548089f@redhat.com/