This commit introduces the `mm_setup_per_zone_lowmem_reserve` trace
event, which provides detailed insight into the kernel's per-zone lowmem
reserve configuration.
The trace event provides precise timestamps, allowing developers to:

1. Correlate lowmem reserve changes with specific kernel events, and
   diagnose unexpected kswapd or direct reclaim behavior triggered by
   dynamic changes in lowmem reserve.

2. Identify memory allocation failures caused by insufficient lowmem
   reserve, by precisely correlating allocation attempts with reserve
   adjustments.
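
With the event enabled, each recomputation emits one line per
(zone, upper_zone) pair. An illustrative line (format follows the
event's TP_printk; the values shown are made up) looks like:

```
swapper/0-1 [000] .... 1.234567: mm_setup_per_zone_lowmem_reserve: node_id=0 zone name=DMA32 upper_zone name=Normal lowmem_reserve_pages=12345
```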
Signed-off-by: Martin Liu <liumartin@google.com>
---
 include/trace/events/kmem.h | 27 +++++++++++++++++++++++++++
 mm/page_alloc.c             |  2 ++
 2 files changed, 29 insertions(+)
diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
index 5fd392dae503..9623e68d4d26 100644
--- a/include/trace/events/kmem.h
+++ b/include/trace/events/kmem.h
@@ -375,6 +375,33 @@ TRACE_EVENT(mm_setup_per_zone_wmarks,
__entry->watermark_promo)
);
+TRACE_EVENT(mm_setup_per_zone_lowmem_reserve,
+
+ TP_PROTO(struct zone *zone, struct zone *upper_zone, long lowmem_reserve),
+
+ TP_ARGS(zone, upper_zone, lowmem_reserve),
+
+ TP_STRUCT__entry(
+ __field(int, node_id)
+ __string(name, zone->name)
+ __string(upper_name, upper_zone->name)
+ __field(long, lowmem_reserve)
+ ),
+
+ TP_fast_assign(
+ __entry->node_id = zone->zone_pgdat->node_id;
+ __assign_str(name);
+ __assign_str(upper_name);
+ __entry->lowmem_reserve = lowmem_reserve;
+ ),
+
+ TP_printk("node_id=%d zone name=%s upper_zone name=%s lowmem_reserve_pages=%ld",
+ __entry->node_id,
+ __get_str(name),
+ __get_str(upper_name),
+ __entry->lowmem_reserve)
+);
+
/*
* Required for uniquely and securely identifying mm in rss_stat tracepoint.
*/
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 50893061db66..e472b1275166 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5857,6 +5857,8 @@ static void setup_per_zone_lowmem_reserve(void)
zone->lowmem_reserve[j] = 0;
else
zone->lowmem_reserve[j] = managed_pages / ratio;
+ trace_mm_setup_per_zone_lowmem_reserve(zone, upper_zone,
+ zone->lowmem_reserve[j]);
}
}
}
--
2.49.0.rc0.332.g42c0ae87b1-goog
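Once the patch is applied, the event can be exercised from userspace via
tracefs: writing vm.lowmem_reserve_ratio re-runs
setup_per_zone_lowmem_reserve(), so a quick check might look like the
sketch below (assumes tracefs is mounted at /sys/kernel/tracing, root
privileges, and ratio values that are purely illustrative):

```shell
# Enable the new tracepoint.
echo 1 > /sys/kernel/tracing/events/kmem/mm_setup_per_zone_lowmem_reserve/enable

# Trigger a lowmem reserve recomputation by rewriting the ratio sysctl
# (the number of entries depends on the kernel's zone configuration).
sysctl vm.lowmem_reserve_ratio
sysctl -w vm.lowmem_reserve_ratio="256 256 32 0 0"

# Watch the emitted events.
cat /sys/kernel/tracing/trace_pipe
```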
On Sat, 8 Mar 2025 03:46:01 +0000
Martin Liu <liumartin@google.com> wrote:
> ---
> include/trace/events/kmem.h | 27 +++++++++++++++++++++++++++
> mm/page_alloc.c | 2 ++
> 2 files changed, 29 insertions(+)
>
> diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
> index 5fd392dae503..9623e68d4d26 100644
> --- a/include/trace/events/kmem.h
> +++ b/include/trace/events/kmem.h
> @@ -375,6 +375,33 @@ TRACE_EVENT(mm_setup_per_zone_wmarks,
> __entry->watermark_promo)
> );
>
> +TRACE_EVENT(mm_setup_per_zone_lowmem_reserve,
> +
> + TP_PROTO(struct zone *zone, struct zone *upper_zone, long lowmem_reserve),
> +
> + TP_ARGS(zone, upper_zone, lowmem_reserve),
> +
> + TP_STRUCT__entry(
> + __field(int, node_id)
> + __string(name, zone->name)
> + __string(upper_name, upper_zone->name)
> + __field(long, lowmem_reserve)
Nit, but may be useful. If you want to remove "holes" from the trace
event, I would move the lowmem_reserve to the top. The __string() macro
adds a 4 byte meta data into the structure (that defines the size and
offset of where the string is). That means you can think of __string()
as the same as "int".
The above has three int's followed by a long which on 64bit, would
leave a 4 byte hole just before lowmem_reserve.
-- Steve
> + ),
> +
> + TP_fast_assign(
> + __entry->node_id = zone->zone_pgdat->node_id;
> + __assign_str(name);
> + __assign_str(upper_name);
> + __entry->lowmem_reserve = lowmem_reserve;
> + ),
> +
> + TP_printk("node_id=%d zone name=%s upper_zone name=%s lowmem_reserve_pages=%ld",
> + __entry->node_id,
> + __get_str(name),
> + __get_str(upper_name),
> + __entry->lowmem_reserve)
> +);
> +
On Sat, 8 Mar 2025, Martin Liu wrote:

> This commit introduces the `mm_setup_per_zone_lowmem_reserve` trace
> event,which provides detailed insights into the kernel's per-zone lowmem
> reserve configuration.
>
> The trace event provides precise timestamps, allowing developers to
>
> 1. Correlate lowmem reserve changes with specific kernel events and
>    able to diagnose unexpected kswapd or direct reclaim behavior
>    triggered by dynamic changes in lowmem reserve.
>
> 2. know memory allocation failures that occur due to insufficient lowmem
>    reserve, by precisely correlating allocation attempts with reserve
>    adjustments.
>
> Signed-off-by: Martin Liu <liumartin@google.com>

Acked-by: David Rientjes <rientjes@google.com>