The current logic allows for up to 1G pages to be scrubbed in place, which
can cause the watchdog to trigger in practice. Reduce the limit for
in-place scrubbed allocations to a newly introduced define:
CONFIG_DIRTY_MAX_ORDER. This currently defaults to CONFIG_DOMU_MAX_ORDER
on all architectures. Also introduce a command line option to set the
value.
Fixes: 74d2e11ccfd2 ("mm: Scrub pages in alloc_heap_pages() if needed")
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
- Split from previous patch.
- Introduce a command line option to set the limit.
---
docs/misc/xen-command-line.pandoc | 9 +++++++++
xen/common/page_alloc.c | 23 ++++++++++++++++++++++-
2 files changed, 31 insertions(+), 1 deletion(-)
diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 50d7edb2488e..65b4dfc826b5 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1822,6 +1822,15 @@ Specify the deepest C-state CPUs are permitted to be placed in, and
optionally the maximum sub C-state to be used used. The latter only applies
to the highest permitted C-state.
+### max-order-dirty
+> `= <integer>`
+
+Specify the maximum allocation order allowed when scrubbing allocated pages
+in-place. The allocation is non-preemptive, and hence the value must be kept
+low enough to avoid hogging the CPU for too long.
+
+Defaults to `CONFIG_DIRTY_MAX_ORDER` or if unset to `CONFIG_DOMU_MAX_ORDER`.
+
### max_gsi_irqs (x86)
> `= <integer>`
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index c9e82fd7ab62..728b4d6c9861 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -267,6 +267,13 @@ static PAGE_LIST_HEAD(page_offlined_list);
/* Broken page list, protected by heap_lock. */
static PAGE_LIST_HEAD(page_broken_list);
+/* Maximum order allowed for allocations with MEMF_no_scrub. */
+#ifndef CONFIG_DIRTY_MAX_ORDER
+# define CONFIG_DIRTY_MAX_ORDER CONFIG_DOMU_MAX_ORDER
+#endif
+static unsigned int __ro_after_init dirty_max_order = CONFIG_DIRTY_MAX_ORDER;
+integer_param("max-order-dirty", dirty_max_order);
+
/*************************
* BOOT-TIME ALLOCATOR
*/
@@ -1008,7 +1015,13 @@ static struct page_info *alloc_heap_pages(
pg = get_free_buddy(zone_lo, zone_hi, order, memflags, d);
/* Try getting a dirty buddy if we couldn't get a clean one. */
- if ( !pg && !(memflags & MEMF_no_scrub) )
+ if ( !pg && !(memflags & MEMF_no_scrub) &&
+ /*
+ * Allow any order unscrubbed allocations during boot time, we
+ * compensate by processing softirqs in the scrubbing loop below once
+ * irqs are enabled.
+ */
+ (order <= dirty_max_order || system_state < SYS_STATE_active) )
pg = get_free_buddy(zone_lo, zone_hi, order,
memflags | MEMF_no_scrub, d);
if ( !pg )
@@ -1117,6 +1130,14 @@ static struct page_info *alloc_heap_pages(
scrub_one_page(&pg[i], cold);
dirty_cnt++;
+
+ /*
+ * Use SYS_STATE_smp_boot explicitly; ahead of that state
+ * interrupts are disabled.
+ */
+ if ( system_state == SYS_STATE_smp_boot &&
+ !(dirty_cnt & 0xff) )
+ process_pending_softirqs();
}
else
check_one_page(&pg[i]);
--
2.51.0
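To put rough numbers on the softirq check added to the scrubbing loop, here is a small standalone sketch. It assumes 4 KiB pages and that every page of the buddy is dirty; `softirq_polls()` is a made-up helper for illustration and is not part of the patch or of Xen.

```c
#include <assert.h>

/*
 * Illustration only: count how often the `!(dirty_cnt & 0xff)` test from the
 * scrubbing loop fires when every page of an order-`order` buddy is dirty.
 * softirq_polls() is a hypothetical helper, not a Xen function.
 */
unsigned long softirq_polls(unsigned int order)
{
    unsigned long i, dirty_cnt = 0, polls = 0;

    /* Keep the shift below well defined in this sketch. */
    assert(order < 8 * sizeof(unsigned long));

    for ( i = 0; i < (1UL << order); i++ )
    {
        dirty_cnt++;                 /* one page scrubbed */
        if ( !(dirty_cnt & 0xff) )   /* true on every 256th scrubbed page */
            polls++;                 /* stands in for process_pending_softirqs() */
    }

    return polls;
}
```

Under those assumptions a 1G buddy is order 18 (262144 pages), so `softirq_polls(18)` comes to 1024, while an order-9 (2M) buddy gives 2 and anything below order 8 never reaches the check.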
On 15.01.2026 12:18, Roger Pau Monne wrote:
> The current logic allows for up to 1G pages to be scrubbed in place, which
> can cause the watchdog to trigger in practice. Reduce the limit for
> in-place scrubbed allocations to a newly introduced define:
> CONFIG_DIRTY_MAX_ORDER. This currently defaults to CONFIG_DOMU_MAX_ORDER
> on all architectures. Also introduce a command line option to set the
> value.
>
> Fixes: 74d2e11ccfd2 ("mm: Scrub pages in alloc_heap_pages() if needed")
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Changes since v1:
> - Split from previous patch.
> - Introduce a command line option to set the limit.
> ---
> docs/misc/xen-command-line.pandoc | 9 +++++++++
> xen/common/page_alloc.c | 23 ++++++++++++++++++++++-
> 2 files changed, 31 insertions(+), 1 deletion(-)
If you confine the change to page_alloc.c, won't this mean that patch 2's
passing of MEMF_no_scrub will then also be bounded (in which case the need
for patch 2 would largely disappear)?
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1822,6 +1822,15 @@ Specify the deepest C-state CPUs are permitted to be placed in, and
> optionally the maximum sub C-state to be used used. The latter only applies
> to the highest permitted C-state.
>
> +### max-order-dirty
> +> `= <integer>`
> +
> +Specify the maximum allocation order allowed when scrubbing allocated pages
> +in-place. The allocation is non-preemptive, and hence the value must be kept
> +low enough to avoid hogging the CPU for too long.
> +
> +Defaults to `CONFIG_DIRTY_MAX_ORDER` or if unset to `CONFIG_DOMU_MAX_ORDER`.
This may end up misleading, as - despite their names - these aren't really
Kconfig settings that people could easily control in their builds.
> ### max_gsi_irqs (x86)
> > `= <integer>`
I also wonder whether your addition wouldn't more naturally go a little
further down, by assuming / implying that the sorting used largely ignores
separator characters (underscore vs dash here).
Jan
On Mon, Jan 19, 2026 at 05:13:25PM +0100, Jan Beulich wrote:
> On 15.01.2026 12:18, Roger Pau Monne wrote:
> > --- a/docs/misc/xen-command-line.pandoc
> > +++ b/docs/misc/xen-command-line.pandoc
> > @@ -1822,6 +1822,15 @@ Specify the deepest C-state CPUs are permitted to be placed in, and
> > optionally the maximum sub C-state to be used used. The latter only applies
> > to the highest permitted C-state.
> >
> > +### max-order-dirty
> > +> `= <integer>`
> > +
> > +Specify the maximum allocation order allowed when scrubbing allocated pages
> > +in-place. The allocation is non-preemptive, and hence the value must be kept
> > +low enough to avoid hogging the CPU for too long.
> > +
> > +Defaults to `CONFIG_DIRTY_MAX_ORDER` or if unset to `CONFIG_DOMU_MAX_ORDER`.
>
> This may end up misleading, as - despite their names - these aren't really
> Kconfig settings that people could easily control in their builds.
But those have different default values depending on the architecture,
hence I didn't know what else to refer to as the default. I'm
open to suggestions, but I think we need to reference some default
value so the user knows where to look.
> > ### max_gsi_irqs (x86)
> > > `= <integer>`
>
> I also wonder whether your addition wouldn't more naturally go a little
> further down, by assuming / implying that the sorting used largely ignores
> separator characters (underscore vs dash here).
My bad, I think I originally named it max-dirty-order and forgot to
move it down when renaming to max-order-dirty.
Thanks, Roger.
On 22.01.2026 13:55, Roger Pau Monné wrote:
> On Mon, Jan 19, 2026 at 05:13:25PM +0100, Jan Beulich wrote:
>> On 15.01.2026 12:18, Roger Pau Monne wrote:
>>> --- a/docs/misc/xen-command-line.pandoc
>>> +++ b/docs/misc/xen-command-line.pandoc
>>> @@ -1822,6 +1822,15 @@ Specify the deepest C-state CPUs are permitted to be placed in, and
>>> optionally the maximum sub C-state to be used used. The latter only applies
>>> to the highest permitted C-state.
>>>
>>> +### max-order-dirty
>>> +> `= <integer>`
>>> +
>>> +Specify the maximum allocation order allowed when scrubbing allocated pages
>>> +in-place. The allocation is non-preemptive, and hence the value must be kept
>>> +low enough to avoid hogging the CPU for too long.
>>> +
>>> +Defaults to `CONFIG_DIRTY_MAX_ORDER` or if unset to `CONFIG_DOMU_MAX_ORDER`.
>>
>> This may end up misleading, as - despite their names - these aren't really
>> Kconfig settings that people could easily control in their builds.
>
> But those have different default values depending on the architecture,
> hence I didn't know what else to refer to as the default. I'm
> open to suggestions, but I think we need to reference some default
> value so the user knows where to look.
I agree something needs saying. In the absence of anything better we may be
able to think of, perhaps simply clarify that these are #define-s in source,
not Kconfig settings?
Jan
On 19.01.2026 17:13, Jan Beulich wrote:
> On 15.01.2026 12:18, Roger Pau Monne wrote:
>> The current logic allows for up to 1G pages to be scrubbed in place, which
>> can cause the watchdog to trigger in practice. Reduce the limit for
>> in-place scrubbed allocations to a newly introduced define:
>> CONFIG_DIRTY_MAX_ORDER. This currently defaults to CONFIG_DOMU_MAX_ORDER
>> on all architectures. Also introduce a command line option to set the
>> value.
>>
>> Fixes: 74d2e11ccfd2 ("mm: Scrub pages in alloc_heap_pages() if needed")
>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>> ---
>> Changes since v1:
>> - Split from previous patch.
>> - Introduce a command line option to set the limit.
>> ---
>> docs/misc/xen-command-line.pandoc | 9 +++++++++
>> xen/common/page_alloc.c | 23 ++++++++++++++++++++++-
>> 2 files changed, 31 insertions(+), 1 deletion(-)
>
> If you confine the change to page_alloc.c, won't this mean that patch 2's
> passing of MEMF_no_scrub will then also be bounded (in which case the need
> for patch 2 would largely disappear)?
This was rubbish, sorry. Besides my being thick-headed I can only attribute
this to the double negation in !(memflags & MEMF_no_scrub).
I have another concern, though: You effectively undermine ptdom_max_order,
which is even more of a problem as that would also affect Dom0's ability to
obtain larger contiguous I/O buffers. Perhaps DIRTY_MAX_ORDER ought to
default to PTDOM_MAX_ORDER (if HAS_PASSTHROUGH)? Yet then command line
options may also need tying together, such that people using
"memop-max-order=" to alter (increase) ptdom_max_order won't need to
additionally use "max-order-dirty="? At which point maybe the new option
shouldn't be a standalone one, but be added to "memop-max-order=" (despite
it being effected in alloc_heap_pages())?
Jan
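The double negation mentioned above is easier to follow with the condition pulled out into a named predicate. This is only a sketch of how the patched `if` reads (minus the `!pg` part): `MEMF_NO_SCRUB` is a stand-in bit rather than Xen's real flag value, and `may_fall_back_to_dirty()` plus the `booting` parameter (standing in for `system_state < SYS_STATE_active`) are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define MEMF_NO_SCRUB (1u << 0)   /* stand-in bit, not Xen's actual value */

/*
 * Hypothetical helper: may an allocation that found no clean buddy retry
 * with a dirty one, which then has to be scrubbed in place?
 */
bool may_fall_back_to_dirty(unsigned int memflags, unsigned int order,
                            unsigned int dirty_max_order, bool booting)
{
    /* Flag clear means the caller expects scrubbed memory back. */
    bool caller_wants_scrubbed = !(memflags & MEMF_NO_SCRUB);

    /* Orders are small exponents; catch nonsense inputs in this sketch. */
    assert(order < 32);

    /* Only in-place scrubbing is bounded; boot time is exempt from the cap. */
    return caller_wants_scrubbed && (order <= dirty_max_order || booting);
}
```

Read this way, a caller that passes the no-scrub flag never reaches the new bound at all, since the retry only exists for callers that want scrubbed pages; the cap limits how much in-place scrubbing a single allocation can trigger.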
On Tue, Jan 20, 2026 at 08:25:49AM +0100, Jan Beulich wrote:
> On 19.01.2026 17:13, Jan Beulich wrote:
> > On 15.01.2026 12:18, Roger Pau Monne wrote:
> >> The current logic allows for up to 1G pages to be scrubbed in place, which
> >> can cause the watchdog to trigger in practice. Reduce the limit for
> >> in-place scrubbed allocations to a newly introduced define:
> >> CONFIG_DIRTY_MAX_ORDER. This currently defaults to CONFIG_DOMU_MAX_ORDER
> >> on all architectures. Also introduce a command line option to set the
> >> value.
> >>
> >> Fixes: 74d2e11ccfd2 ("mm: Scrub pages in alloc_heap_pages() if needed")
> >> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> >> ---
> >> Changes since v1:
> >> - Split from previous patch.
> >> - Introduce a command line option to set the limit.
> >> ---
> >> docs/misc/xen-command-line.pandoc | 9 +++++++++
> >> xen/common/page_alloc.c | 23 ++++++++++++++++++++++-
> >> 2 files changed, 31 insertions(+), 1 deletion(-)
> >
> > If you confine the change to page_alloc.c, won't this mean that patch 2's
> > passing of MEMF_no_scrub will then also be bounded (in which case the need
> > for patch 2 would largely disappear)?
>
> This was rubbish, sorry. Besides my being thick-headed I can only attribute
> this to the double negation in !(memflags & MEMF_no_scrub).
>
> I have another concern, though: You effectively undermine ptdom_max_order,
> which is even more of a problem as that would also affect Dom0's ability to
> obtain larger contiguous I/O buffers. Perhaps DIRTY_MAX_ORDER ought to
> default to PTDOM_MAX_ORDER (if HAS_PASSTHROUGH)?
OK, yes, I can default to PTDOM_MAX_ORDER instead of DOMU_MAX_ORDER.
> Yet then command line
> options may also need tying together, such that people using
> "memop-max-order=" to alter (increase) ptdom_max_order won't need to
> additionally use "max-order-dirty="? At which point maybe the new option
> shouldn't be a standalone one, but be added to "memop-max-order=" (despite
> it being effected in alloc_heap_pages())?
I had concerns about adding it to "memop-max-order=" because its effect
is not limited to "issued by the various kinds of domain", this is an
option that affects all allocations. I could try expanding the option
description to reflect that, but I wasn't sure whether it would lead
to confusion (as all options there are per-domain currently).
Also if added to "memop-max-order=" the parsing function needs to be
adjusted a bit to consume an extra parameter in the !HAS_PASSTHROUGH
case (which is not much of an issue).
Thanks, Roger.
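One way the default agreed on above (PTDOM_MAX_ORDER when passthrough is available, DOMU_MAX_ORDER otherwise) might be spelled is sketched below. The numeric values are placeholders chosen for the example, not the real per-architecture defaults, and both the use of `CONFIG_HAS_PASSTHROUGH` as the guard and the `default_dirty_max_order()` helper are assumptions rather than anything taken from a later revision of the patch.

```c
#include <assert.h>

/* Placeholder values for illustration; the real ones are per-architecture. */
#define CONFIG_DOMU_MAX_ORDER  9
#define CONFIG_PTDOM_MAX_ORDER 14
#define CONFIG_HAS_PASSTHROUGH 1   /* assumed guard; drop to test the other arm */

/* Prefer the passthrough limit so larger contiguous I/O buffers keep working. */
#ifndef CONFIG_DIRTY_MAX_ORDER
# ifdef CONFIG_HAS_PASSTHROUGH
#  define CONFIG_DIRTY_MAX_ORDER CONFIG_PTDOM_MAX_ORDER
# else
#  define CONFIG_DIRTY_MAX_ORDER CONFIG_DOMU_MAX_ORDER
# endif
#endif

/* Compile-time check that the passthrough arm was selected in this sketch. */
static_assert(CONFIG_DIRTY_MAX_ORDER == CONFIG_PTDOM_MAX_ORDER,
              "expected the passthrough default");

/* Hypothetical accessor so the chosen default can be inspected. */
unsigned int default_dirty_max_order(void)
{
    return CONFIG_DIRTY_MAX_ORDER;
}
```

This only covers the build-time default; someone raising ptdom_max_order via "memop-max-order=" would still have to raise "max-order-dirty=" separately, which is the trade-off of keeping the two options independent.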
On 22.01.2026 14:05, Roger Pau Monné wrote:
> On Tue, Jan 20, 2026 at 08:25:49AM +0100, Jan Beulich wrote:
>> On 19.01.2026 17:13, Jan Beulich wrote:
>>> On 15.01.2026 12:18, Roger Pau Monne wrote:
>>>> The current logic allows for up to 1G pages to be scrubbed in place, which
>>>> can cause the watchdog to trigger in practice. Reduce the limit for
>>>> in-place scrubbed allocations to a newly introduced define:
>>>> CONFIG_DIRTY_MAX_ORDER. This currently defaults to CONFIG_DOMU_MAX_ORDER
>>>> on all architectures. Also introduce a command line option to set the
>>>> value.
>>>>
>>>> Fixes: 74d2e11ccfd2 ("mm: Scrub pages in alloc_heap_pages() if needed")
>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>>> ---
>>>> Changes since v1:
>>>> - Split from previous patch.
>>>> - Introduce a command line option to set the limit.
>>>> ---
>>>> docs/misc/xen-command-line.pandoc | 9 +++++++++
>>>> xen/common/page_alloc.c | 23 ++++++++++++++++++++++-
>>>> 2 files changed, 31 insertions(+), 1 deletion(-)
>>>
>>> If you confine the change to page_alloc.c, won't this mean that patch 2's
>>> passing of MEMF_no_scrub will then also be bounded (in which case the need
>>> for patch 2 would largely disappear)?
>>
>> This was rubbish, sorry. Besides my being thick-headed I can only attribute
>> this to the double negation in !(memflags & MEMF_no_scrub).
>>
>> I have another concern, though: You effectively undermine ptdom_max_order,
>> which is even more of a problem as that would also affect Dom0's ability to
>> obtain larger contiguous I/O buffers. Perhaps DIRTY_MAX_ORDER ought to
>> default to PTDOM_MAX_ORDER (if HAS_PASSTHROUGH)?
>
> OK, yes, I can default to PTDOM_MAX_ORDER instead of DOMU_MAX_ORDER.
>
>> Yet then command line
>> options may also need tying together, such that people using
>> "memop-max-order=" to alter (increase) ptdom_max_order won't need to
>> additionally use "max-order-dirty="? At which point maybe the new option
>> shouldn't be a standalone one, but be added to "memop-max-order=" (despite
>> it being effected in alloc_heap_pages())?
>
> I had concerns about adding it to "memop-max-order=" because its effect
> is not limited to "issued by the various kinds of domain", this is an
> option that affects all allocations. I could try expanding the option
> description to reflect that, but I wasn't sure whether it would lead
> to confusion (as all options there are per-domain currently).
Hmm, fair point. Let's keep it separate then.
Jan