mm/khugepaged.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
There is no xas_pause(&xas) in collapse_file()'s main loop, at the points
where it does xas_unlock_irq(&xas) and then continues.
That would explain why, once two weeks ago and twice yesterday, I have
hit the VM_BUG_ON_PAGE(page != xas_load(&xas), page) since "mm/khugepaged:
fix iteration in collapse_file" removed the xas_set(&xas, index) just
before it: xas.xa_node could be left pointing to a stale node, if there
was concurrent activity on the file which transformed its xarray.
I tried inserting xas_pause()s, but then even bootup crashed on that
VM_BUG_ON_PAGE(): there appears to be a subtle "nextness" implicit in
xas_pause().
xas_next() and xas_pause() are good for use in simple loops, but not in
this one: xas_set() worked well until now, so use xas_set(&xas, index)
explicitly at the head of the loop; and change that VM_BUG_ON_PAGE() not
to need its own xas_set(), and not to interfere with the xa_state (which
would probably stop the crashes from xas_pause(), but I trust that less).
Link: https://lore.kernel.org/linux-mm/f18e4b64-3f88-a8ab-56cc-d1f5f9c58d4@google.com/
Fixes: c8a8f3b4a95a ("mm/khugepaged: fix iteration in collapse_file")
Signed-off-by: Hugh Dickins <hughd@google.com>
---
Linus, I'm rushing this directly to you, but not really expecting you
to put it in at this stage, unless you're very comfortable with it,
or perhaps it catches Matthew's eye and gets a quick Ack from him.
The commit being fixed only got in after -rc7, after being held up by my
initial report of this crash; but I had to rescind that when I couldn't
reproduce it at all. Then yesterday morning it hit again on two machines,
and reading XArray Doc reminded me of xas_pause() - seems obvious now.
Patch ran for 14 hours overnight on those two machines without a problem.
mm/khugepaged.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 2d0d58fb4e7f..47b59f2843f6 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1918,9 +1918,9 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
}
} while (1);
- xas_set(&xas, start);
for (index = start; index < end; index++) {
- page = xas_next(&xas);
+ xas_set(&xas, index);
+ page = xas_load(&xas);
VM_BUG_ON(index != xas.xa_index);
if (is_shmem) {
@@ -1935,7 +1935,6 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
result = SCAN_TRUNCATED;
goto xa_locked;
}
- xas_set(&xas, index + 1);
}
if (!shmem_charge(mapping->host, 1)) {
result = SCAN_FAIL;
@@ -2071,7 +2070,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
xas_lock_irq(&xas);
- VM_BUG_ON_PAGE(page != xas_load(&xas), page);
+ VM_BUG_ON_PAGE(page != xa_load(xas.xa, index), page);
/*
* We control three references to the page:
--
2.35.3
On Sun, 25 Jun 2023 at 12:06, Hugh Dickins <hughd@google.com> wrote:
>
> Linus, I'm rushing this directly to you, but not really expecting you
> to put it in at this stage, unless you're very comfortable with it,
> or perhaps it catches Matthew's eye and gets a quick Ack from him.
Yeah, the fix and explanation sound fine to me, but the maple tree
code confuses me enough and has been subtle enough (as exemplified by
this bug and many others) that there's no way I'll apply this just
before the final 6.4 without some "Ack, that's obviously correct" from
Willy or Liam or somebody.
Willy?
Linus
There is no xas_pause(&xas) in collapse_file()'s main loop, at the points
where it does xas_unlock_irq(&xas) and then continues.
That would explain why, once two weeks ago and twice yesterday, I have
hit the VM_BUG_ON_PAGE(page != xas_load(&xas), page) since "mm/khugepaged:
fix iteration in collapse_file" removed the xas_set(&xas, index) just
before it: xas.xa_node could be left pointing to a stale node, if there
was concurrent activity on the file which transformed its xarray.
I tried inserting xas_pause()s, but then even bootup crashed on that
VM_BUG_ON_PAGE(): there appears to be a subtle "nextness" implicit in
xas_pause().
xas_next() and xas_pause() are good for use in simple loops, but not in
this one: xas_set() worked well until now, so use xas_set(&xas, index)
explicitly at the head of the loop; and change that VM_BUG_ON_PAGE() not
to need its own xas_set(), and not to interfere with the xa_state (which
would probably stop the crashes from xas_pause(), but I trust that less).
The user-visible effects of this bug (if VM_BUG_ONs are configured out)
would be data loss and data leak - potentially - though in practice I
expect it is more likely that a subsequent check (e.g. on mapping or on
nr_none) would notice an inconsistency, and just abandon the collapse.
Link: https://lore.kernel.org/linux-mm/f18e4b64-3f88-a8ab-56cc-d1f5f9c58d4@google.com/
Fixes: c8a8f3b4a95a ("mm/khugepaged: fix iteration in collapse_file")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
---
mm/khugepaged.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 3beb4ad2ee5e..78c8d5d8b628 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1937,9 +1937,9 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
}
} while (1);
- xas_set(&xas, start);
for (index = start; index < end; index++) {
- page = xas_next(&xas);
+ xas_set(&xas, index);
+ page = xas_load(&xas);
VM_BUG_ON(index != xas.xa_index);
if (is_shmem) {
@@ -1954,7 +1954,6 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
result = SCAN_TRUNCATED;
goto xa_locked;
}
- xas_set(&xas, index + 1);
}
if (!shmem_charge(mapping->host, 1)) {
result = SCAN_FAIL;
@@ -2090,7 +2089,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
xas_lock_irq(&xas);
- VM_BUG_ON_PAGE(page != xas_load(&xas), page);
+ VM_BUG_ON_PAGE(page != xa_load(xas.xa, index), page);
/*
* We control three references to the page:
--
2.35.3
On Wed, 28 Jun 2023 at 21:31, Hugh Dickins <hughd@google.com> wrote:
>
> There is no xas_pause(&xas) in collapse_file()'s main loop, at the points
> where it does xas_unlock_irq(&xas) and then continues.
Thanks for reminding me. Now applied,
Linus
© 2016 - 2026 Red Hat, Inc.