[PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes

David Hildenbrand posted 3 patches 4 months ago
There is a newer version of this series
Posted by David Hildenbrand 4 months ago
This is v2 of
	"[PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special
	 in vmf_insert_folio_*()"
Now with one additional fix, based on mm/mm-unstable.

While working on improving vm_normal_page() and friends, I stumbled
over this issue: refcounted "normal" pages must not be marked
using pmd_special() / pud_special().

Fortunately, so far there doesn't seem to be serious damage.

I spent too much time trying to get the ndctl tests mentioned by Dan
running (.config tweaks, memmap= setup, ... ), without getting them to
pass even without these patches. Some SKIP, some FAIL, some sometimes
suddenly SKIP on first invocation, ... instructions unclear or the tests
are shaky. This is how far I got:

# meson test -C build --suite ndctl:dax
ninja: Entering directory `/root/ndctl/build'
[1/70] Generating version.h with a custom command
 1/13 ndctl:dax / daxdev-errors.sh          OK              15.08s
 2/13 ndctl:dax / multi-dax.sh              OK               5.80s
 3/13 ndctl:dax / sub-section.sh            SKIP             0.39s   exit status 77
 4/13 ndctl:dax / dax-dev                   OK               1.37s
 5/13 ndctl:dax / dax-ext4.sh               OK              32.70s
 6/13 ndctl:dax / dax-xfs.sh                OK              29.43s
 7/13 ndctl:dax / device-dax                OK              44.50s
 8/13 ndctl:dax / revoke-devmem             OK               0.98s
 9/13 ndctl:dax / device-dax-fio.sh         SKIP             0.10s   exit status 77
10/13 ndctl:dax / daxctl-devices.sh         SKIP             0.16s   exit status 77
11/13 ndctl:dax / daxctl-create.sh          FAIL             2.61s   exit status 1
12/13 ndctl:dax / dm.sh                     FAIL             0.23s   exit status 1
13/13 ndctl:dax / mmap.sh                   OK             437.86s

So, no idea if this series breaks something, because the tests are rather
unreliable. I have plenty of other debug settings on, maybe that's a
problem? I guess if the FS tests and mmap test pass, we're mostly good.
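
For reference, the per-result counts of a run like the one above can be tallied with a few lines of shell. This is only a sketch: tally_results is a made-up helper name, and it assumes the result keyword is the fifth whitespace-separated field of each result line, as in the output above.

```shell
#!/bin/sh
# Tally per-test results from `meson test` output read on stdin.
# Assumption: result lines look like
#   " 3/13 ndctl:dax / sub-section.sh   SKIP   0.39s   exit status 77"
# tally_results is a hypothetical helper name.
tally_results() {
	awk '
		# Only count lines that start with "N/M" followed by "suite / name".
		$1 ~ /^[0-9]+\/[0-9]+$/ && $3 == "/" { count[$5]++ }
		END {
			printf "OK=%d SKIP=%d FAIL=%d\n", count["OK"], count["SKIP"], count["FAIL"]
		}
	'
}
```

Fed the 13 result lines above, it prints OK=8 SKIP=3 FAIL=2.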

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Oscar Salvador <osalvador@suse.de>


v1 -> v2:
* "mm/huge_memory: don't ignore queried cachemode in vmf_insert_pfn_pud()"
 -> Added after stumbling over that
* Modified the other tests to reuse the existing function by passing a
  new struct
* Renamed the patches to talk about "folios" instead of pages and adjusted
  the patch descriptions
* Dropped RB/TB from Dan and Oscar due to the changes

David Hildenbrand (3):
  mm/huge_memory: don't ignore queried cachemode in vmf_insert_pfn_pud()
  mm/huge_memory: don't mark refcounted folios special in
    vmf_insert_folio_pmd()
  mm/huge_memory: don't mark refcounted folios special in
    vmf_insert_folio_pud()

 include/linux/mm.h |  19 +++++++-
 mm/huge_memory.c   | 110 +++++++++++++++++++++++++++------------------
 2 files changed, 85 insertions(+), 44 deletions(-)

-- 
2.49.0
Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes
Posted by Lorenzo Stoakes 4 months ago
FWIW I did a basic build/mm self tests run locally and all looking good!

Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes
Posted by David Hildenbrand 4 months ago
On 12.06.25 18:19, Lorenzo Stoakes wrote:
> FWIW I did a basic build/mm self tests run locally and all looking good!

Thanks! I have another series based on this series coming up ... but 
struggling to get !CONFIG_ARCH_HAS_PTE_SPECIAL tested "easily" :)

-- 
Cheers,

David / dhildenb
Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes
Posted by Lorenzo Stoakes 4 months ago
On Thu, Jun 12, 2025 at 06:22:32PM +0200, David Hildenbrand wrote:
> On 12.06.25 18:19, Lorenzo Stoakes wrote:
> > FWIW I did a basic build/mm self tests run locally and all looking good!
>
> Thanks! I have another series based on this series coming up ... but
> struggling to get !CONFIG_ARCH_HAS_PTE_SPECIAL tested "easily" :)

Hm which arches don't set it?

Filtering through:

arm - If !ARM_LPAE
csky
hexagon
m68k
microblaze
mips - If 32-bit or !CPU_HAS_RIXI
nios2
openrisc
um
xtensa

So the usual suspects of museum pieces and museum pieces on life-support for
some reason but also... usermode linux?

Might that be the easiest to play with?

I got this list from a basic grep for 'select ARCH_HAS_PTE_SPECIAL' so I'm not
sure if um imports some other arch's kconfig or there is some other way to set
it but probably this criterion is accurate...
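
The grep above can be turned around to list the arch directories that never select the symbol. A rough sketch, assuming it runs against a kernel source tree and a grep that supports --include (GNU grep does); arches_without_pte_special is a hypothetical name. It treats a conditional select (such as the arm and mips cases above) as "selects it", so those conditions still need a manual look, and it does not follow Kconfig files sourced from outside the arch directory.

```shell
#!/bin/sh
# List arch/ subdirectories whose Kconfig files never select
# ARCH_HAS_PTE_SPECIAL. Pass the top of a kernel tree (default: ".").
# arches_without_pte_special is a hypothetical helper name.
arches_without_pte_special() {
	root=${1:-.}
	for dir in "$root"/arch/*/; do
		[ -d "$dir" ] || continue
		# -r: recurse below the arch dir; -q: only the exit status matters.
		if ! grep -rqE --include='Kconfig*' \
			'select[[:space:]]+ARCH_HAS_PTE_SPECIAL' "$dir"; then
			basename "$dir"
		fi
	done
}
```

Run from the top of a kernel tree as: arches_without_pte_special .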

IMO: criteria for arch removal (or in case of um - adjustment :) - 32-bit
(kernel), !ARCH_HAS_PTE_SPECIAL, nommu

Of course, pipe dreams...

>
> --
> Cheers,
>
> David / dhildenb
>
Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes
Posted by Alistair Popple 4 months ago
On Wed, Jun 11, 2025 at 02:06:51PM +0200, David Hildenbrand wrote:
> This is v2 of
> 	"[PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special
> 	 in vmf_insert_folio_*()"
> Now with one additional fix, based on mm/mm-unstable.
> 
> While working on improving vm_normal_page() and friends, I stumbled
> over this issues: refcounted "normal" pages must not be marked
> using pmd_special() / pud_special().
> 
> Fortunately, so far there doesn't seem to be serious damage.
> 
> I spent too much time trying to get the ndctl tests mentioned by Dan
> running (.config tweaks, memmap= setup, ... ), without getting them to
> pass even without these patches. Some SKIP, some FAIL, some sometimes
> suddenly SKIP on first invocation, ... instructions unclear or the tests
> are shaky. This is how far I got:

FWIW I had a similar experience, although I eventually got the FAIL cases below
to pass. I forget exactly what I needed to tweak for that though :-/

> # meson test -C build --suite ndctl:dax
> ninja: Entering directory `/root/ndctl/build'
> [1/70] Generating version.h with a custom command
>  1/13 ndctl:dax / daxdev-errors.sh          OK              15.08s
>  2/13 ndctl:dax / multi-dax.sh              OK               5.80s
>  3/13 ndctl:dax / sub-section.sh            SKIP             0.39s   exit status 77
>  4/13 ndctl:dax / dax-dev                   OK               1.37s
>  5/13 ndctl:dax / dax-ext4.sh               OK              32.70s
>  6/13 ndctl:dax / dax-xfs.sh                OK              29.43s
>  7/13 ndctl:dax / device-dax                OK              44.50s
>  8/13 ndctl:dax / revoke-devmem             OK               0.98s
>  9/13 ndctl:dax / device-dax-fio.sh         SKIP             0.10s   exit status 77
> 10/13 ndctl:dax / daxctl-devices.sh         SKIP             0.16s   exit status 77
> 11/13 ndctl:dax / daxctl-create.sh          FAIL             2.61s   exit status 1
> 12/13 ndctl:dax / dm.sh                     FAIL             0.23s   exit status 1
> 13/13 ndctl:dax / mmap.sh                   OK             437.86s
> 
> So, no idea if this series breaks something, because the tests are rather
> unreliable. I have plenty of other debug settings on, maybe that's a
> problem? I guess if the FS tests and mmap test pass, we're mostly good.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Nico Pache <npache@redhat.com>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Cc: Dev Jain <dev.jain@arm.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> 
> 
> v1 -> v2:
> * "mm/huge_memory: don't ignore queried cachemode in vmf_insert_pfn_pud()"
>  -> Added after stumbling over that
> * Modified the other tests to reuse the existing function by passing a
>   new struct
> * Renamed the patches to talk about "folios" instead of pages and adjusted
>   the patch descriptions
> * Dropped RB/TB from Dan and Oscar due to the changes
> 
> David Hildenbrand (3):
>   mm/huge_memory: don't ignore queried cachemode in vmf_insert_pfn_pud()
>   mm/huge_memory: don't mark refcounted folios special in
>     vmf_insert_folio_pmd()
>   mm/huge_memory: don't mark refcounted folios special in
>     vmf_insert_folio_pud()
> 
>  include/linux/mm.h |  19 +++++++-
>  mm/huge_memory.c   | 110 +++++++++++++++++++++++++++------------------
>  2 files changed, 85 insertions(+), 44 deletions(-)
> 
> -- 
> 2.49.0
>
Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes
Posted by Dan Williams 4 months ago
Alistair Popple wrote:
> On Wed, Jun 11, 2025 at 02:06:51PM +0200, David Hildenbrand wrote:
> > This is v2 of
> > 	"[PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special
> > 	 in vmf_insert_folio_*()"
> > Now with one additional fix, based on mm/mm-unstable.
> > 
> > While working on improving vm_normal_page() and friends, I stumbled
> > over this issues: refcounted "normal" pages must not be marked
> > using pmd_special() / pud_special().
> > 
> > Fortunately, so far there doesn't seem to be serious damage.
> > 
> > I spent too much time trying to get the ndctl tests mentioned by Dan
> > running (.config tweaks, memmap= setup, ... ), without getting them to
> > pass even without these patches. Some SKIP, some FAIL, some sometimes
> > suddenly SKIP on first invocation, ... instructions unclear or the tests
> > are shaky. This is how far I got:
> 
> FWIW I had a similar experience, although I eventually got the FAIL cases below
> to pass. I forget exactly what I needed to tweak for that though :-/

Add Marc who has been working to clean the documentation up to solve the
reproducibility problem with standing up new environments to run these
tests.

http://lore.kernel.org/20250521002640.1700283-1-marc.herbert@linux.intel.com

There is also the run_qemu project that automates building an environment for this.

https://github.com/pmem/run_qemu

...but comes with its own set of quirks.

I have the following fixups applied to my environment to get this going on
Fedora 42 with v6.16-rc1:

diff --git a/README.md b/README.md
index 37314db7a155..8e06908d5921 100644
--- a/README.md
+++ b/README.md
@@ -84,6 +84,11 @@ loaded.  To build and install nfit_test.ko:
    CONFIG_TRANSPARENT_HUGEPAGE=y
    ```
 
+1. Install the following packages, (Fedora instructions):
+   ```
+   dnf install e2fsprogs xfsprogs parted jq trace-cmd hostname fio fio-engine-dev-dax
+   ```
+
 1. Build and install the unit test enabled libnvdimm modules in the
    following order.  The unit test modules need to be in place prior to
    the `depmod` that runs during the final `modules_install`  
diff --git a/test/dax.sh b/test/dax.sh
index 3ffbc8079eba..98faaf0eb9b2 100755
--- a/test/dax.sh
+++ b/test/dax.sh
@@ -37,13 +37,14 @@ run_test() {
 	rc=1
 	while read -r p; do
 		[[ $p ]] || continue
+		[[ $p == cpus=* ]] && continue
 		if [ "$count" -lt 10 ]; then
 			if [ "$p" != "0x100" ] && [ "$p" != "NOPAGE" ]; then
 				cleanup "$1"
 			fi
 		fi
 		count=$((count + 1))
-	done < <(trace-cmd report | awk '{ print $21 }')
+	done < <(trace-cmd report | awk '{ print $NF }')
 
 	if [ $count -lt 10 ]; then
 		cleanup "$1"
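
The awk change in the second hunk is easy to try in isolation. A small sketch with made-up input lines (not real trace-cmd output): $NF always names the last field of the current line, whereas a fixed index like $21 only works while the column count stays the same. last_field is a hypothetical helper name.

```shell
#!/bin/sh
# Print the last whitespace-separated field of each input line.
# $NF follows the line's own field count, so it keeps working when extra
# columns appear earlier in the line; a fixed index such as $21 does not.
last_field() {
	awk '{ print $NF }'
}

# Made-up sample lines: same kind of trailing value, different column counts.
printf '%s\n' 'a b c 0x100' 'a b c d e NOPAGE' | last_field
# prints 0x100 and NOPAGE, one per line
```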

In the meantime, do not hesitate to ask me to run these tests.

FWIW with these patches on top of -rc1 I get:

---

[root@host ndctl]# meson test -C build --suite ndctl:dax
ninja: Entering directory `/root/git/ndctl/build'
[168/168] Linking target ndctl/ndctl
 1/13 ndctl:dax / daxdev-errors.sh          OK              12.60s
 2/13 ndctl:dax / multi-dax.sh              OK               2.47s
 3/13 ndctl:dax / sub-section.sh            OK               6.30s
 4/13 ndctl:dax / dax-dev                   OK               0.04s
 5/13 ndctl:dax / dax-ext4.sh               OK               3.04s
 6/13 ndctl:dax / dax-xfs.sh                OK               3.10s
 7/13 ndctl:dax / device-dax                OK               9.66s
 8/13 ndctl:dax / revoke-devmem             OK               0.22s
 9/13 ndctl:dax / device-dax-fio.sh         OK              32.32s
10/13 ndctl:dax / daxctl-devices.sh         OK               2.31s
11/13 ndctl:dax / daxctl-create.sh          SKIP             0.25s   exit status 77
12/13 ndctl:dax / dm.sh                     OK               1.00s
13/13 ndctl:dax / mmap.sh                   OK              62.27s

Ok:                12  
Fail:              0   
Skipped:           1   

Full log written to /root/git/ndctl/build/meson-logs/testlog.txt

---

Note that the daxctl-create.sh skip is a known unrelated v6.16-rc1 regression
fixed with this set:

http://lore.kernel.org/20250607033228.1475625-1-dan.j.williams@intel.com

You can add:

Tested-by: Dan Williams <dan.j.williams@intel.com>
Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes
Posted by David Hildenbrand 4 months ago
On 12.06.25 06:20, Dan Williams wrote:
> Alistair Popple wrote:
>> On Wed, Jun 11, 2025 at 02:06:51PM +0200, David Hildenbrand wrote:
>>> This is v2 of
>>> 	"[PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special
>>> 	 in vmf_insert_folio_*()"
>>> Now with one additional fix, based on mm/mm-unstable.
>>>
>>> While working on improving vm_normal_page() and friends, I stumbled
>>> over this issues: refcounted "normal" pages must not be marked
>>> using pmd_special() / pud_special().
>>>
>>> Fortunately, so far there doesn't seem to be serious damage.
>>>
>>> I spent too much time trying to get the ndctl tests mentioned by Dan
>>> running (.config tweaks, memmap= setup, ... ), without getting them to
>>> pass even without these patches. Some SKIP, some FAIL, some sometimes
>>> suddenly SKIP on first invocation, ... instructions unclear or the tests
>>> are shaky. This is how far I got:
>>
>> FWIW I had a similar experience, although I eventually got the FAIL cases below
>> to pass. I forget exactly what I needed to tweak for that though :-/
> 
> Add Marc who has been working to clean the documentation up to solve the
> reproducibility problem with standing up new environments to run these
> tests.

I was about to send some doc improvements myself, but I didn't manage to 
get the tests running in the first place ... even after trying hard :)

I think there is also one issue with a test that requires you to 
actually install ndctl ... and some tests seem to temporarily fail with 
weird issues regarding "file size problems with /proc/kallsyms", 
whereby, ... there are no such file size problems :)

All a bit shaky. The "memmap=" stuff is not documented anywhere for the 
tests, which is required for some tests I think. Maybe it should be 
added, not sure how big of an area we actually need, though.
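
Whether a memmap= region was actually passed at boot can at least be checked from the kernel command line. A minimal sketch; memmap_args is a hypothetical helper, and the example value in the comment is a placeholder for the syntax, not a recommendation for these tests.

```shell
#!/bin/sh
# Print every memmap=... option found in a kernel command line read on stdin.
# Syntax example with placeholder sizes only: memmap=4G!12G
# memmap_args is a hypothetical helper name.
memmap_args() {
	# One token per line, then keep the memmap= ones.
	# "|| true" keeps the exit status at 0 when nothing matches.
	tr -s ' \t' '\n\n' | grep '^memmap=' || true
}
```

On a running system: memmap_args < /proc/cmdline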

> 
> http://lore.kernel.org/20250521002640.1700283-1-marc.herbert@linux.intel.com
> 

I think I have CONFIG_XFS_FS=m (instead of y) and CONFIG_DAX=y (instead 
of =m), and CONFIG_NFIT_SECURITY_DEBUG not set (instead of =y).

Let me try with these settings adjusted.

> There is also the run_qemu project that automates build an environment for this.
> 
> https://github.com/pmem/run_qemu
> 
> ...but comes with its own set of quirks.
> 
> I have the following fixups applied to my environment to get his going on
> Fedora 42 with v6.16-rc1:
> 
> diff --git a/README.md b/README.md
> index 37314db7a155..8e06908d5921 100644
> --- a/README.md
> +++ b/README.md
> @@ -84,6 +84,11 @@ loaded.  To build and install nfit_test.ko:
>      CONFIG_TRANSPARENT_HUGEPAGE=y
>      ```
>   
> +1. Install the following packages, (Fedora instructions):
> +   ```
> +   dnf install e2fsprogs xfsprogs parted jq trace-cmd hostname fio fio-engine-dev-dax
> +   ```
> +
>   1. Build and install the unit test enabled libnvdimm modules in the
>      following order.  The unit test modules need to be in place prior to
>      the `depmod` that runs during the final `modules_install`
> diff --git a/test/dax.sh b/test/dax.sh
> index 3ffbc8079eba..98faaf0eb9b2 100755
> --- a/test/dax.sh
> +++ b/test/dax.sh
> @@ -37,13 +37,14 @@ run_test() {
>   	rc=1
>   	while read -r p; do
>   		[[ $p ]] || continue
> +		[[ $p == cpus=* ]] && continue
>   		if [ "$count" -lt 10 ]; then
>   			if [ "$p" != "0x100" ] && [ "$p" != "NOPAGE" ]; then
>   				cleanup "$1"
>   			fi
>   		fi
>   		count=$((count + 1))
> -	done < <(trace-cmd report | awk '{ print $21 }')
> +	done < <(trace-cmd report | awk '{ print $NF }')
>   
>   	if [ $count -lt 10 ]; then
>   		cleanup "$1"
> 
> In the meantime, do not hesitate to ask me to run these tests.

Yes, thanks, and thanks for running these tests.

> 
> FWIW with these patches on top of -rc1 I get:
> 
> ---
> 
> [root@host ndctl]# meson test -C build --suite ndctl:dax
> ninja: Entering directory `/root/git/ndctl/build'
> [168/168] Linking target ndctl/ndctl
>   1/13 ndctl:dax / daxdev-errors.sh          OK              12.60s
>   2/13 ndctl:dax / multi-dax.sh              OK               2.47s
>   3/13 ndctl:dax / sub-section.sh            OK               6.30s
>   4/13 ndctl:dax / dax-dev                   OK               0.04s
>   5/13 ndctl:dax / dax-ext4.sh               OK               3.04s
>   6/13 ndctl:dax / dax-xfs.sh                OK               3.10s
>   7/13 ndctl:dax / device-dax                OK               9.66s
>   8/13 ndctl:dax / revoke-devmem             OK               0.22s
>   9/13 ndctl:dax / device-dax-fio.sh         OK              32.32s
> 10/13 ndctl:dax / daxctl-devices.sh         OK               2.31s
> 11/13 ndctl:dax / daxctl-create.sh          SKIP             0.25s   exit status 77
> 12/13 ndctl:dax / dm.sh                     OK               1.00s
> 13/13 ndctl:dax / mmap.sh                   OK              62.27s
> 
> Ok:                12
> Fail:              0
> Skipped:           1
> 
> Full log written to /root/git/ndctl/build/meson-logs/testlog.txt
> 
> ---
> 
> Note that the daxctl-create.sh skip is a known unrelated v6.16-rc1 regression
> fixed with this set:
> 
> http://lore.kernel.org/20250607033228.1475625-1-dan.j.williams@intel.com
> 
> You can add:
> 
> Tested-by: Dan Williams <dan.j.williams@intel.com>
> 

Thanks!

-- 
Cheers,

David / dhildenb
Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes
Posted by David Hildenbrand 4 months ago
On 12.06.25 09:18, David Hildenbrand wrote:
> On 12.06.25 06:20, Dan Williams wrote:
>> Alistair Popple wrote:
>>> On Wed, Jun 11, 2025 at 02:06:51PM +0200, David Hildenbrand wrote:
>>>> This is v2 of
>>>> 	"[PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special
>>>> 	 in vmf_insert_folio_*()"
>>>> Now with one additional fix, based on mm/mm-unstable.
>>>>
>>>> While working on improving vm_normal_page() and friends, I stumbled
>>>> over this issues: refcounted "normal" pages must not be marked
>>>> using pmd_special() / pud_special().
>>>>
>>>> Fortunately, so far there doesn't seem to be serious damage.
>>>>
>>>> I spent too much time trying to get the ndctl tests mentioned by Dan
>>>> running (.config tweaks, memmap= setup, ... ), without getting them to
>>>> pass even without these patches. Some SKIP, some FAIL, some sometimes
>>>> suddenly SKIP on first invocation, ... instructions unclear or the tests
>>>> are shaky. This is how far I got:
>>>
>>> FWIW I had a similar experience, although I eventually got the FAIL cases below
>>> to pass. I forget exactly what I needed to tweak for that though :-/
>>
>> Add Marc who has been working to clean the documentation up to solve the
>> reproducibility problem with standing up new environments to run these
>> tests.
> 
> I was about to send some doc improvements myself, but I didn't manage to
> get the tests running in the first place ... even after trying hard :)
> 
> I think there is also one issue with a test that requires you to
> actually install ndctl ... and some tests seem to temporarily fail with
> weird issues regarding "file size problems with /proc/kallsyms",
> whereby, ... there are no such file size problems :)
> 
> All a bit shaky. The "memmap=" stuff is not documented anywhere for the
> tests, which is required for some tests I think. Maybe it should be
> added, not sure how big of an area we actually need, though.
> 
>>
>> http://lore.kernel.org/20250521002640.1700283-1-marc.herbert@linux.intel.com
>>
> 
> I think I have CONFIG_XFS_FS=m (instead of y) and CONFIG_DAX=y (instead
> of =m), and CONFIG_NFIT_SECURITY_DEBUG not set (instead of =y).
> 
> Let me try with these settings adjusted.

Yeah, no. Unfortunately doesn't make it work with my debug config. Maybe with the
defconfig as raised by Marc it would do ... maybe will try that later.

# meson test -C build --suite ndctl:dax
ninja: Entering directory `/root/ndctl/build'
[1/70] Generating version.h with a custom command
  1/13 ndctl:dax / daxdev-errors.sh          OK              14.60s
  2/13 ndctl:dax / multi-dax.sh              OK               4.28s
  3/13 ndctl:dax / sub-section.sh            SKIP             0.25s   exit status 77
  4/13 ndctl:dax / dax-dev                   OK               1.00s
  5/13 ndctl:dax / dax-ext4.sh               OK              23.60s
  6/13 ndctl:dax / dax-xfs.sh                OK              23.74s
  7/13 ndctl:dax / device-dax                OK              40.61s
  8/13 ndctl:dax / revoke-devmem             OK               0.98s
  9/13 ndctl:dax / device-dax-fio.sh         SKIP             0.10s   exit status 77
10/13 ndctl:dax / daxctl-devices.sh         SKIP             0.16s   exit status 77
11/13 ndctl:dax / daxctl-create.sh          FAIL             2.53s   exit status 1
>>> DAXCTL=/root/ndctl/build/daxctl/daxctl DATA_PATH=/root/ndctl/test MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 MALLOC_PERTURB_=167 LD_LIBRARY_PATH=/root/ndctl/build/cxl/lib:/root/ndctl/build/daxctl/lib:/root/ndctl/build/ndctl/lib TEST_PATH=/root/ndctl/build/test UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 NDCTL=/root/ndctl/build/ndctl/ndctl ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 /root/ndctl/test/daxctl-create.sh

12/13 ndctl:dax / dm.sh                     FAIL             0.24s   exit status 1
>>> DAXCTL=/root/ndctl/build/daxctl/daxctl DATA_PATH=/root/ndctl/test MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 LD_LIBRARY_PATH=/root/ndctl/build/cxl/lib:/root/ndctl/build/daxctl/lib:/root/ndctl/build/ndctl/lib TEST_PATH=/root/ndctl/build/test MALLOC_PERTURB_=27 NDCTL=/root/ndctl/build/ndctl/ndctl ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 /root/ndctl/test/dm.sh

13/13 ndctl:dax / mmap.sh                   OK             343.67s

Ok:                 8
Expected Fail:      0
Fail:               2
Unexpected Pass:    0
Skipped:            3
Timeout:            0

Full log written to /root/ndctl/build/meson-logs/testlog.txt


After compilation, I can see that I again have "CONFIG_DAX=y" in my config.

And for the DAX setting in "make menuconfig" I can see:

Symbol: DAX [=y]
  ...
  Selected by [y]:
  - FS_DAX [=y] && MMU [=y] && (ZONE_DEVICE [=y] || FS_DAX_LIMITED [=n]
  Selected by [m]:
  - BLK_DEV_PMEM [=m] && LIBNVDIMM [=m]
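
Reading the "Selected by" lines with Kconfig's select semantics in mind: a selected symbol cannot end up below the highest value among its active selectors (n < m < y). The toy model below only illustrates that rule; it is not a kernel tool, and effective_min is a hypothetical name. With FS_DAX=y among the selectors, DAX is forced to y no matter what BLK_DEV_PMEM=m asks for.

```shell
#!/bin/sh
# Toy model of how "select" lower-bounds a tristate symbol: the result is
# the highest value among the selectors passed as arguments (n < m < y).
# effective_min is a hypothetical helper, not part of the kernel tooling.
effective_min() {
	best=n
	for v in "$@"; do
		if [ "$v" = y ]; then
			best=y
		elif [ "$v" = m ] && [ "$best" = n ]; then
			best=m
		fi
	done
	printf '%s\n' "$best"
}

# FS_DAX=y and BLK_DEV_PMEM=m both select DAX, as in the menuconfig output.
effective_min y m
# prints y
```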

So I guess, as requested in the doc "CONFIG_FS_DAX=y" combined with
"CONFIG_DAX=m" is impossible to achieve?


===

sub-section.sh complains about

++ /root/ndctl/build/ndctl/ndctl list -R -b ACPI.NFIT
+ json=
++ echo
++ jq -r '[.[] | select(.available_size >= 67108864)][0].dev'
+ region=
++ echo
++ jq -r '[.[] | select(.available_size >= 67108864)][0].available_size'
+ avail=
+ '[' -z ']'
+ exit 77

Not sure what's the problem in my environment. I thought we would be emulating
ACPI.NFIT.

===

device-dax-fio.sh complains about

kernel 6.16.0-rc1-00069-g0ede5baa0b46: missing fio, skipping...

So I guess I just need to install "fio" to make it fly.

Yes, with that the test is passing now.
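
A quick pre-flight check would have caught the missing fio before the run. Sketch only: missing_tools is a hypothetical helper, and the command names passed to it are an assumption about what some of the packages in Dan's list provide.

```shell
#!/bin/sh
# Print the names of the given commands that are not found in PATH.
# missing_tools is a hypothetical helper name.
missing_tools() {
	for tool in "$@"; do
		command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
	done
}

# Assumed command names; adjust to whatever the tests really invoke.
missing_tools fio jq parted trace-cmd mkfs.ext4 mkfs.xfs
```

An empty result means all listed commands were found.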

===

daxctl-devices.sh complains about

++ reset_dev
++ /root/ndctl/build/ndctl/ndctl destroy-namespace -f -b ACPI.NFIT 'Error at linn
e 33'
error destroying namespaces: No such device or address
destroyed 0 namespaces
++ exit 77


No idea.

===

daxctl-create.sh complains about

+ /root/ndctl/build/daxctl/daxctl reconfigure-device -m devdax -f dax1.0
libdaxctl: daxctl_dev_enable: dax1.0: failed to enable
error reconfiguring devices: Invalid argument
reconfigured 0 devices
++ cleanup 54
++ printf 'Error at line %d\n' 54
++ [[ -n dax1.0 ]]
++ reset_dax
++ test -n dax1.0
++ /root/ndctl/build/daxctl/daxctl disable-device -r 1 all
disabled 1 device
++ /root/ndctl/build/daxctl/daxctl destroy-device -r 1 all
destroyed 1 device
++ /root/ndctl/build/daxctl/daxctl reconfigure-device -s '' dax1.0
reconfigured 1 device
++ exit 1


Again, no idea ... :(


-- 
Cheers,

David / dhildenb
Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes
Posted by Marc Herbert 4 months ago

>>>>> I spent too much time trying to get the ndctl tests mentioned by Dan
>>>>> running (.config tweaks, memmap= setup, ... ), without getting them to
>>>>> pass even without these patches. Some SKIP, some FAIL, some sometimes
>>>>> suddenly SKIP on first invocation, ... instructions unclear or the tests
>>>>> are shaky. This is how far I got:
>>>>
>>>> FWIW I had a similar experience, although I eventually got the FAIL cases below
>>>> to pass. I forget exactly what I needed to tweak for that though :-/
>>>
>>> Add Marc who has been working to clean the documentation up to solve the
>>> reproducibility problem with standing up new environments to run these
>>> tests.
>>
>> I was about to send some doc improvements myself, but I didn't manage to
>> get the tests running in the first place ... even after trying hard :)
>>


>>> http://lore.kernel.org/20250521002640.1700283-1-marc.herbert@linux.intel.com
>>>
>>
>> I think I have CONFIG_XFS_FS=m (instead of y) and CONFIG_DAX=y (instead
>> of =m), and CONFIG_NFIT_SECURITY_DEBUG not set (instead of =y).
>>
>> Let me try with these settings adjusted.
> 
> Yeah, no. Unfortunately doesn't make it work with my debug config. Maybe with the
> defconfig as raised by Marc it would do ... maybe will try that later.

After a lot of trial and error to get them right, these fragments have always
worked for me:

make defconfig ARCH=x86_64
./scripts/kconfig/merge_config.sh .config ../run_qemu/.github/workflows/*.cfg

Warning: there is a CONFIG_DRM=n in there to save a lot of compilation
time.  Nothing against DRM specifically; it's just the best "value" for
a single line change :-)
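
After merging, it can be worth confirming that the requested values actually made it into the final .config, since Kconfig can override a fragment (for example via select). A minimal sketch, with unmet_config as a hypothetical helper name:

```shell
#!/bin/sh
# List CONFIG_*=value requests from a fragment that are not reflected in the
# final .config. Arguments: <config file> <fragment file>.
# unmet_config is a hypothetical helper name.
unmet_config() {
	config=$1
	fragment=$2
	# Keep only "CONFIG_FOO=value" lines; comments and blanks are skipped.
	grep -E '^CONFIG_[A-Za-z0-9_]+=' "$fragment" | while IFS= read -r want; do
		case "$want" in
		*=n)
			# "=n" shows up as "# CONFIG_FOO is not set" or as no line at
			# all, so only complain if the symbol ended up as y or m.
			sym=${want%=n}
			if grep -qE "^${sym}=[ym]\$" "$config"; then
				printf '%s\n' "$want"
			fi
			;;
		*)
			# Any other value must appear verbatim as a whole line.
			if ! grep -qxF -e "$want" "$config"; then
				printf '%s\n' "$want"
			fi
			;;
		esac
	done
}
```

With the paths from the commands above, this could be run as:

  for f in ../run_qemu/.github/workflows/*.cfg; do unmet_config .config "$f"; done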


The run_qemu/.github/workflows/*.cfg fragments are mostly duplicated
from ndctl.git/README.md - but unlike the latter, they're
machine-readable and testable. The CXL fragment is actually tested in 
run_qemu's CI (CI = the only way not to bitrot).
https://github.com/pmem/run_qemu/actions

As I wrote in
https://lore.kernel.org/linux-cxl/aed71134-1029-4b88-ab20-8dfa527a7438@linux.intel.com/
these fragments should ideally live in ndctl.git/, not in run_qemu.git/
(the latter could still add tweaks). Then ndctl.git/README.md could just
refer to the testable fragments instead of inlining them. "Send patches"
they say :-)
Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes
Posted by Andrew Morton 4 months ago
On Wed, 11 Jun 2025 14:06:51 +0200 David Hildenbrand <david@redhat.com> wrote:

> While working on improving vm_normal_page() and friends, I stumbled
> over this issues: refcounted "normal" pages must not be marked
> using pmd_special() / pud_special().

Why is this?

>
> ...
>
> I spent too much time trying to get the ndctl tests mentioned by Dan
> running (.config tweaks, memmap= setup, ... ), without getting them to
> pass even without these patches. Some SKIP, some FAIL, some sometimes
> suddenly SKIP on first invocation, ... instructions unclear or the tests
> are shaky. This is how far I got:

I won't include this in the [0/N] - it doesn't seem helpful for future
readers of the patchset.

I'll give the patchset a run in mm-new, but it feels like some more
baking is needed?

The [1/N] has cc:stable but there's nothing in there to explain this
decision.  How do the issues affect userspace?
Re: [PATCH v2 0/3] mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes
Posted by David Hildenbrand 4 months ago
On 12.06.25 01:08, Andrew Morton wrote:
> On Wed, 11 Jun 2025 14:06:51 +0200 David Hildenbrand <david@redhat.com> wrote:
> 
>> While working on improving vm_normal_page() and friends, I stumbled
>> over this issues: refcounted "normal" pages must not be marked
>> using pmd_special() / pud_special().
> 
> Why is this?

The two patches for that refer to the rules documented for 
vm_normal_page(), how it could mislead pmd_special()/pud_special() 
users, and how the harm so far is fortunately still limited.

It's all about how we identify refcounted folios vs. pfn mappings / 
decide what's normal and what's special.

> 
>>
>> ...
>>
>> I spent too much time trying to get the ndctl tests mentioned by Dan
>> running (.config tweaks, memmap= setup, ... ), without getting them to
>> pass even without these patches. Some SKIP, some FAIL, some sometimes
>> suddenly SKIP on first invocation, ... instructions unclear or the tests
>> are shaky. This is how far I got:
> 
> I won't include this in the [0/N] - it doesn't seem helpful for future
> readers of the patchset.

Yes, trim it down to "ran ndctl tests, tests are shaky and hard to run,
but the results indicate that the relevant stuff seems to keep working".

... combined with the Tested-by by Dan.

> 
> I'll give the patchset a run in mm-new, but it feels like some more
> baking is needed?

Fortunately Dan and Alistair managed to get the tests run properly. So I 
don't have to waste another valuable 4 hours of my life on testing some 
simple fixes that only stand in between me and doing the actual work in 
that area I want to get done.

> 
> The [1/N] has cc:stable but there's nothing in there to explain this
> decision.  How does the issues affect userspace?

My reasoning was: Getting cachemodes in page table entries wrong sounds 
... bad? At least to me :)

PAT code is confusing (when/how could we actually mess up the
cachemode?), so it's hard to decide when this actually hits, and what 
the exact results in which scenario would be. I tried to find out, but 
cannot spend another hour digging through that horrible code.

So if someone has a problem with "stable" here, we can drop it. But the 
fix is simple.

-- 
Cheers,

David / dhildenb