Introduce selftests to validate the functionality of memory failure.
These tests help ensure that memory failure handling for anonymous
pages and pagecache pages works correctly, including proper SIGBUS
delivery to user processes, page isolation, and recovery paths.

Currently madvise syscall is used to inject memory failures. And only
anonymous pages and pagecaches are tested. More test scenarios, e.g.
hugetlb, shmem, thp, will be added. Also more memory failure injecting
methods will be supported, e.g. APEI Error INJection, if required.

Thanks!

Miaohe Lin (3):
  selftests/mm: add memory failure anonymous page test
  selftests/mm: add memory failure clean pagecache test
  selftests/mm: add memory failure dirty pagecache test

 MAINTAINERS                                 |   1 +
 tools/testing/selftests/mm/.gitignore       |   1 +
 tools/testing/selftests/mm/Makefile         |   1 +
 tools/testing/selftests/mm/memory-failure.c | 335 ++++++++++++++++++++
 tools/testing/selftests/mm/run_vmtests.sh   |  21 ++
 tools/testing/selftests/mm/vm_util.c        |  41 +++
 tools/testing/selftests/mm/vm_util.h        |   3 +
 7 files changed, 403 insertions(+)
 create mode 100644 tools/testing/selftests/mm/memory-failure.c

-- 
2.33.0
On 1/7/26 10:37, Miaohe Lin wrote:
> Introduce selftests to validate the functionality of memory failure.
> These tests help ensure that memory failure handling for anonymous
> pages and pagecache pages works correctly, including proper SIGBUS
> delivery to user processes, page isolation, and recovery paths.
>
> Currently madvise syscall is used to inject memory failures. And only
> anonymous pages and pagecaches are tested. More test scenarios, e.g.
> hugetlb, shmem, thp, will be added. Also more memory failure injecting
> methods will be supported, e.g. APEI Error INJection, if required.

0day reports that these tests fail:

# # ------------------------
# # running ./memory-failure
# # ------------------------
# # TAP version 13
# # 1..6
# # # Starting 6 tests from 2 test cases.
# # # RUN memory_failure.madv_hard.anon ...
# # # OK memory_failure.madv_hard.anon
# # ok 1 memory_failure.madv_hard.anon
# # # RUN memory_failure.madv_hard.clean_pagecache ...
# # # memory-failure.c:166:clean_pagecache:Expected setjmp (1) == 0 (0)
# # # clean_pagecache: Test terminated by assertion
# # # FAIL memory_failure.madv_hard.clean_pagecache
# # not ok 2 memory_failure.madv_hard.clean_pagecache
# # # RUN memory_failure.madv_hard.dirty_pagecache ...
# # # memory-failure.c:207:dirty_pagecache:Expected unpoison_memory(self->pfn) (-16) == 0 (0)
# # # dirty_pagecache: Test terminated by assertion
# # # FAIL memory_failure.madv_hard.dirty_pagecache
# # not ok 3 memory_failure.madv_hard.dirty_pagecache
# # # RUN memory_failure.madv_soft.anon ...
# # # OK memory_failure.madv_soft.anon
# # ok 4 memory_failure.madv_soft.anon
# # # RUN memory_failure.madv_soft.clean_pagecache ...
# # # memory-failure.c:282:clean_pagecache:Expected variant->inject(self, addr) (-1) == 0 (0)
# # # clean_pagecache: Test terminated by assertion
# # # FAIL memory_failure.madv_soft.clean_pagecache
# # not ok 5 memory_failure.madv_soft.clean_pagecache
# # # RUN memory_failure.madv_soft.dirty_pagecache ...
# # # memory-failure.c:319:dirty_pagecache:Expected variant->inject(self, addr) (-1) == 0 (0)
# # # dirty_pagecache: Test terminated by assertion
# # # FAIL memory_failure.madv_soft.dirty_pagecache
# # not ok 6 memory_failure.madv_soft.dirty_pagecache
# # # FAILED: 2 / 6 tests passed.
# # # Totals: pass:2 fail:4 xfail:0 xpass:0 skip:0 error:0
# # [FAIL]
# not ok 71 memory-failure # exit=1

Can the test maybe not deal with running in certain environments (config options etc)?

-- 
Cheers

David
On 2026/1/9 21:45, David Hildenbrand (Red Hat) wrote:
> On 1/7/26 10:37, Miaohe Lin wrote:
>> Introduce selftests to validate the functionality of memory failure.
>> These tests help ensure that memory failure handling for anonymous
>> pages and pagecache pages works correctly, including proper SIGBUS
>> delivery to user processes, page isolation, and recovery paths.
>>
>> Currently madvise syscall is used to inject memory failures. And only
>> anonymous pages and pagecaches are tested. More test scenarios, e.g.
>> hugetlb, shmem, thp, will be added. Also more memory failure injecting
>> methods will be supported, e.g. APEI Error INJection, if required.
>
Thanks for testing and reporting. :)
> 0day reports that these tests fail:
>
> # # ------------------------
> # # running ./memory-failure
> # # ------------------------
> # # TAP version 13
> # # 1..6
> # # # Starting 6 tests from 2 test cases.
> # # # RUN memory_failure.madv_hard.anon ...
> # # # OK memory_failure.madv_hard.anon
> # # ok 1 memory_failure.madv_hard.anon
> # # # RUN memory_failure.madv_hard.clean_pagecache ...
> # # # memory-failure.c:166:clean_pagecache:Expected setjmp (1) == 0 (0)
> # # # clean_pagecache: Test terminated by assertion
> # # # FAIL memory_failure.madv_hard.clean_pagecache
> # # not ok 2 memory_failure.madv_hard.clean_pagecache
> # # # RUN memory_failure.madv_hard.dirty_pagecache ...
> # # # memory-failure.c:207:dirty_pagecache:Expected unpoison_memory(self->pfn) (-16) == 0 (0)
> # # # dirty_pagecache: Test terminated by assertion
> # # # FAIL memory_failure.madv_hard.dirty_pagecache
> # # not ok 3 memory_failure.madv_hard.dirty_pagecache
> # # # RUN memory_failure.madv_soft.anon ...
> # # # OK memory_failure.madv_soft.anon
> # # ok 4 memory_failure.madv_soft.anon
> # # # RUN memory_failure.madv_soft.clean_pagecache ...
> # # # memory-failure.c:282:clean_pagecache:Expected variant->inject(self, addr) (-1) == 0 (0)
> # # # clean_pagecache: Test terminated by assertion
> # # # FAIL memory_failure.madv_soft.clean_pagecache
> # # not ok 5 memory_failure.madv_soft.clean_pagecache
> # # # RUN memory_failure.madv_soft.dirty_pagecache ...
> # # # memory-failure.c:319:dirty_pagecache:Expected variant->inject(self, addr) (-1) == 0 (0)
> # # # dirty_pagecache: Test terminated by assertion
> # # # FAIL memory_failure.madv_soft.dirty_pagecache
> # # not ok 6 memory_failure.madv_soft.dirty_pagecache
> # # # FAILED: 2 / 6 tests passed.
> # # # Totals: pass:2 fail:4 xfail:0 xpass:0 skip:0 error:0
> # # [FAIL]
> # not ok 71 memory-failure # exit=1
>
>
> Can the test maybe not deal with running in certain environments (config options etc)?
To run the test, I think the following are required:
1. CONFIG_MEMORY_FAILURE and CONFIG_HWPOISON_INJECT must be enabled.
2. Root privilege is required.
3. For the dirty/clean pagecache testcases, the test files "./clean-page-cache-test-file" and
"./dirty-page-cache-test-file" are assumed to be created on non-memory file systems
such as xfs, ext4, etc.
Does your test environment break any of the above rules? Am I expected to add some code to
guard against this?
Thanks.
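
As an illustration of guarding the first requirement at runtime, here is a minimal sketch that checks whether a control file is writable and otherwise asks the caller to skip. It is not taken from memory-failure.c: `probe_writable()`, `enum probe_result`, and `UNPOISON_PFN_PATH` are hypothetical names, and the path assumes that debugfs is mounted at /sys/kernel/debug and that the hwpoison-inject support is available there.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Outcome of probing for one prerequisite. */
enum probe_result { PROBE_OK, PROBE_SKIP };

/*
 * Assumed location of the unpoison control file. This presumes debugfs
 * is mounted at /sys/kernel/debug and hwpoison-inject is available.
 */
#define UNPOISON_PFN_PATH "/sys/kernel/debug/hwpoison/unpoison-pfn"

/*
 * Hypothetical helper: return PROBE_OK if 'path' is writable by the
 * caller, PROBE_SKIP otherwise. A test would skip rather than fail
 * when the interface it depends on is missing.
 */
static enum probe_result probe_writable(const char *path)
{
	assert(path != NULL);

	if (access(path, W_OK) == 0)
		return PROBE_OK;

	fprintf(stderr, "skip: %s is not writable (errno=%d)\n", path, errno);
	return PROBE_SKIP;
}
```

A test setup could call `probe_writable(UNPOISON_PFN_PATH)` before any test that needs to unpoison a page and report a skip when it returns `PROBE_SKIP`.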
On 1/12/26 10:19, Miaohe Lin wrote:
> On 2026/1/9 21:45, David Hildenbrand (Red Hat) wrote:
>> On 1/7/26 10:37, Miaohe Lin wrote:

[...]

>> Can the test maybe not deal with running in certain environments (config options etc)?
>
> To run the test, I think the following are required:
> 1. CONFIG_MEMORY_FAILURE and CONFIG_HWPOISON_INJECT must be enabled.
> 2. Root privilege is required.
> 3. For the dirty/clean pagecache testcases, the test files "./clean-page-cache-test-file" and
> "./dirty-page-cache-test-file" are assumed to be created on non-memory file systems
> such as xfs, ext4, etc.
>
> Does your test environment break any of the above rules?

It is the 0day environment, so very likely yes. I suspect 1).

> Am I expected to add some code to
> guard against this?

Yes, at least some.

Checking for root privileges is not required. The tests are commonly run from non-memory file systems, but, in theory, could be run from nfs etc.

If you require special file systems, take a look at gup_longterm.c where we test for some filesystem types.

Regarding 1): tools/testing/selftests/mm/config includes the config options we expect to be set for running MM tests. Extending that might take a while until environments like 0day would pick up such changes. If you require something else, make your test SKIP tests if the relevant kernel support is not there (e.g., sense support and conditionally skip).

-- 
Cheers

David
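
A filesystem-type check along the lines suggested above could look roughly like the following sketch. It is not the code from gup_longterm.c: `fs_magic_is_memory_backed()` and `path_on_memory_fs()` are hypothetical helpers, and treating tmpfs, ramfs, and hugetlbfs as the "memory file systems" to skip on is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/vfs.h>
#include <linux/magic.h>

/*
 * Hypothetical helper: true if the filesystem magic belongs to a
 * memory-backed filesystem. The set chosen here is an assumption.
 */
static bool fs_magic_is_memory_backed(unsigned long magic)
{
	return magic == TMPFS_MAGIC ||
	       magic == RAMFS_MAGIC ||
	       magic == HUGETLBFS_MAGIC;
}

/*
 * Hypothetical helper: returns 1 if 'path' lives on a memory-backed
 * filesystem, 0 if it does not, and -1 if statfs() fails.
 */
static int path_on_memory_fs(const char *path)
{
	struct statfs fs;

	assert(path != NULL);

	if (statfs(path, &fs) != 0)
		return -1;

	return fs_magic_is_memory_backed((unsigned long)fs.f_type) ? 1 : 0;
}
```

A pagecache test could call `path_on_memory_fs(".")` before creating its test file and skip when the result is 1, since the expected truncation of the error page would not happen there.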
On 2026/1/12 17:40, David Hildenbrand (Red Hat) wrote:
> On 1/12/26 10:19, Miaohe Lin wrote:

[...]

>> Am I expected to add some code to
>> guard against this?
>
> Yes, at least some.
>
> Checking for root privileges is not required. The tests are commonly run from non-memory file systems, but, in theory, could be run from nfs etc.
>
> If you require special file systems, take a look at gup_longterm.c where we test for some filesystem types.
>
> Regarding 1): tools/testing/selftests/mm/config includes the config options we expect to be set for running MM tests. Extending that might take a while until environments like 0day would pick up such changes. If you require something else, make your test SKIP tests if the relevant kernel support is not there (e.g., sense support and conditionally skip).

Thanks for your valuable suggestion. I will take a closer look.

Thanks!
On 2026/1/12 19:33, Miaohe Lin wrote:
> On 2026/1/12 17:40, David Hildenbrand (Red Hat) wrote:
>> On 1/12/26 10:19, Miaohe Lin wrote:

[...]

>>> To run the test, I think the following are required:
>>> 1. CONFIG_MEMORY_FAILURE and CONFIG_HWPOISON_INJECT must be enabled.
>>> 2. Root privilege is required.
>>> 3. For the dirty/clean pagecache testcases, the test files "./clean-page-cache-test-file" and
>>> "./dirty-page-cache-test-file" are assumed to be created on non-memory file systems
>>> such as xfs, ext4, etc.
>>>
>>> Does your test environment break any of the above rules?
>>
>> It is the 0day environment, so very likely yes. I suspect 1).

Hi David,

After taking a closer look, I think CONFIG_MEMORY_FAILURE and CONFIG_HWPOISON_INJECT must have been
enabled in the 0day environment; otherwise the testcase memory_failure.madv_hard.anon would fail.
memory_failure.madv_hard.anon injects a memory failure and expects to see a SIGBUS signal.

>>> Am I expected to add some code to
>>> guard against this?
>>
>> Yes, at least some.
>>
>> Checking for root privileges is not required. The tests are commonly run from non-memory file systems, but, in theory, could be run from nfs etc.
>>
>> If you require special file systems, take a look at gup_longterm.c where we test for some filesystem types.

And I think the testcases memory_failure.madv_hard.clean_pagecache and memory_failure.madv_hard.dirty_pagecache
fail because they are running on memory filesystems. In that case the error pages are kept in the page cache,
while memory_failure.madv_hard.clean_pagecache expects to see the error page truncated.

But I have no idea why memory_failure.madv_soft.dirty_pagecache and memory_failure.madv_soft.clean_pagecache
return -1 (-EPERM?) when trying to inject a memory error through the madvise syscall. It would be really helpful
if more information could be provided.

Thanks!
On 1/12/26 13:44, Miaohe Lin wrote:
> On 2026/1/12 19:33, Miaohe Lin wrote:
>> On 2026/1/12 17:40, David Hildenbrand (Red Hat) wrote:

[...]

> Hi David,
>
> After taking a closer look, I think CONFIG_MEMORY_FAILURE and CONFIG_HWPOISON_INJECT must have been
> enabled in the 0day environment; otherwise the testcase memory_failure.madv_hard.anon would fail.
> memory_failure.madv_hard.anon injects a memory failure and expects to see a SIGBUS signal.

Good point.

[...]

> And I think the testcases memory_failure.madv_hard.clean_pagecache and memory_failure.madv_hard.dirty_pagecache
> fail because they are running on memory filesystems. In that case the error pages are kept in the page cache,
> while memory_failure.madv_hard.clean_pagecache expects to see the error page truncated.

Maybe they are run on shmem? Good question. (@Phil?)

> But I have no idea why memory_failure.madv_soft.dirty_pagecache and memory_failure.madv_soft.clean_pagecache
> return -1 (-EPERM?) when trying to inject a memory error through the madvise syscall. It would be really helpful
> if more information could be provided.

Here is more information:

https://download.01.org/0day-ci/archive/20260110/202601100241.326d7cce-lkp@intel.com

Unfortunately no config yet. (@Phil, could we provide that one as well as part of that bundle?)

-- 
Cheers

David
On Mon, Jan 12, 2026 at 08:38:58PM +0100, David Hildenbrand (Red Hat) wrote:
> On 1/12/26 13:44, Miaohe Lin wrote:
> > On 2026/1/12 19:33, Miaohe Lin wrote:

...

> > > > > > # # # Starting 6 tests from 2 test cases.
> > > > > > # # # RUN memory_failure.madv_hard.anon ...
> > > > > > # # # OK memory_failure.madv_hard.anon
> > > > > > # not ok 71 memory-failure # exit=1
> > > > > >
> > > > > > Can the test maybe not deal with running in certain environments (config options etc)?
> > > > >
> > > > > To run the test, I think the following are required:
> > > > > 1. CONFIG_MEMORY_FAILURE and CONFIG_HWPOISON_INJECT must be enabled.

In the 0day environment, the configs are as follows:

CONFIG_MEMORY_FAILURE=y
CONFIG_HWPOISON_INJECT=m

> > > > > 2. Root privilege is required.

Yes, the tests are run as root.

> > > > > 3. For the dirty/clean pagecache testcases, the test files "./clean-page-cache-test-file" and
> > > > > "./dirty-page-cache-test-file" are assumed to be created on non-memory file systems
> > > > > such as xfs, ext4, etc.

This is a problem in 0day: the test is running in tmpfs. Let me check further
to correct this.

...

> > And I think the testcases memory_failure.madv_hard.clean_pagecache and memory_failure.madv_hard.dirty_pagecache
> > fail because they are running on memory filesystems. In that case the error pages are kept in the page cache,
> > while memory_failure.madv_hard.clean_pagecache expects to see the error page truncated.
>
> Maybe they are run on shmem? Good question. (@Phil?)

Yes, it runs on tmpfs. Let me check further to resolve it.

> > But I have no idea why memory_failure.madv_soft.dirty_pagecache and memory_failure.madv_soft.clean_pagecache
> > return -1 (-EPERM?) when trying to inject a memory error through the madvise syscall. It would be really helpful
> > if more information could be provided.
>
> Here is more information:
>
> https://download.01.org/0day-ci/archive/20260110/202601100241.326d7cce-lkp@intel.com
>
> Unfortunately no config yet. (@Phil, could we provide that one as well as
> part of that bundle?)

Got it, I will update the script to upload the kernel kconfig.

> --
> Cheers
>
> David
On 2026/1/13 22:05, Philip Li wrote:
> On Mon, Jan 12, 2026 at 08:38:58PM +0100, David Hildenbrand (Red Hat) wrote:
>> On 1/12/26 13:44, Miaohe Lin wrote:
>>> On 2026/1/12 19:33, Miaohe Lin wrote:

[...]

>>>>>> 1. CONFIG_MEMORY_FAILURE and CONFIG_HWPOISON_INJECT must be enabled.
>
> In the 0day environment, the configs are as follows:
>
> CONFIG_MEMORY_FAILURE=y
> CONFIG_HWPOISON_INJECT=m
>
>>>>>> 2. Root privilege is required.
>
> Yes, the tests are run as root.
>
>>>>>> 3. For the dirty/clean pagecache testcases, the test files "./clean-page-cache-test-file" and
>>>>>> "./dirty-page-cache-test-file" are assumed to be created on non-memory file systems
>>>>>> such as xfs, ext4, etc.
>
> This is a problem in 0day: the test is running in tmpfs. Let me check further
> to correct this.

[...]

>> Maybe they are run on shmem? Good question. (@Phil?)
>
> Yes, it runs on tmpfs. Let me check further to resolve it.

Thanks both. This information is really helpful. I will add some code to handle this.

Thanks.