From nobody Sun Feb 8 17:04:26 2026
From: Chuyi Zhou
To: hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
	ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	muchun.song@linux.dev
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
	wuyun.abel@bytedance.com, robin.lu@bytedance.com, Chuyi Zhou,
	Michal Hocko
Subject: [RFC PATCH v2 1/5] mm, oom: Introduce bpf_oom_evaluate_task
Date: Thu, 10 Aug 2023 16:13:15 +0800
Message-Id: <20230810081319.65668-2-zhouchuyi@bytedance.com>
In-Reply-To: <20230810081319.65668-1-zhouchuyi@bytedance.com>
References: <20230810081319.65668-1-zhouchuyi@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

This patch adds a new hook bpf_oom_evaluate_task in oom_evaluate_task. It
takes oc and current iterating task as parameters and returns a result
indicating which one should be selected.
We can use it to bypass the current logic of oom_evaluate_task and
implement customized OOM policies in the attached BPF programs.

Suggested-by: Michal Hocko
Signed-off-by: Chuyi Zhou
---
 mm/oom_kill.c | 59 +++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 50 insertions(+), 9 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 612b5597d3af..255c9ef1d808 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -18,6 +18,7 @@
  *  kernel subsystems and hints as to where to find out what things do.
  */
 
+#include
 #include
 #include
 #include
@@ -305,6 +306,27 @@ static enum oom_constraint constrained_alloc(struct oom_control *oc)
 	return CONSTRAINT_NONE;
 }
 
+enum {
+	NO_BPF_POLICY,
+	BPF_EVAL_ABORT,
+	BPF_EVAL_NEXT,
+	BPF_EVAL_SELECT,
+};
+
+__weak noinline int bpf_oom_evaluate_task(struct task_struct *task, struct oom_control *oc)
+{
+	return NO_BPF_POLICY;
+}
+
+BTF_SET8_START(oom_bpf_fmodret_ids)
+BTF_ID_FLAGS(func, bpf_oom_evaluate_task)
+BTF_SET8_END(oom_bpf_fmodret_ids)
+
+static const struct btf_kfunc_id_set oom_bpf_fmodret_set = {
+	.owner = THIS_MODULE,
+	.set = &oom_bpf_fmodret_ids,
+};
+
 static int oom_evaluate_task(struct task_struct *task, void *arg)
 {
 	struct oom_control *oc = arg;
@@ -317,6 +339,26 @@ static int oom_evaluate_task(struct task_struct *task, void *arg)
 	if (!is_memcg_oom(oc) && !oom_cpuset_eligible(task, oc))
 		goto next;
 
+	/*
+	 * If task is allocating a lot of memory and has been marked to be
+	 * killed first if it triggers an oom, then select it.
+	 */
+	if (oom_task_origin(task)) {
+		points = LONG_MAX;
+		goto select;
+	}
+
+	switch (bpf_oom_evaluate_task(task, oc)) {
+	case BPF_EVAL_ABORT:
+		goto abort; /* abort search process */
+	case BPF_EVAL_NEXT:
+		goto next; /* ignore the task */
+	case BPF_EVAL_SELECT:
+		goto select; /* select the task */
+	default:
+		break; /* No BPF policy */
+	}
+
 	/*
 	 * This task already has access to memory reserves and is being killed.
 	 * Don't allow any other task to have access to the reserves unless
@@ -329,15 +371,6 @@ static int oom_evaluate_task(struct task_struct *task, void *arg)
 		goto abort;
 	}
 
-	/*
-	 * If task is allocating a lot of memory and has been marked to be
-	 * killed first if it triggers an oom, then select it.
-	 */
-	if (oom_task_origin(task)) {
-		points = LONG_MAX;
-		goto select;
-	}
-
 	points = oom_badness(task, oc->totalpages);
 	if (points == LONG_MIN || points < oc->chosen_points)
 		goto next;
@@ -732,10 +765,18 @@ static struct ctl_table vm_oom_kill_table[] = {
 
 static int __init oom_init(void)
 {
+	int err;
 	oom_reaper_th = kthread_run(oom_reaper, NULL, "oom_reaper");
 #ifdef CONFIG_SYSCTL
 	register_sysctl_init("vm", vm_oom_kill_table);
 #endif
+
+#ifdef CONFIG_BPF_SYSCALL
+	err = register_btf_fmodret_id_set(&oom_bpf_fmodret_set);
+	if (err)
+		pr_warn("error while registering oom fmodret entrypoints: %d", err);
+#endif
+
 	return 0;
 }
 subsys_initcall(oom_init)
-- 
2.20.1

From nobody Sun Feb 8 17:04:26 2026
From: Chuyi Zhou
To: hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
	ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	muchun.song@linux.dev
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
	wuyun.abel@bytedance.com, robin.lu@bytedance.com, Chuyi Zhou
Subject: [RFC PATCH v2 2/5] mm: Add policy_name to identify OOM policies
Date: Thu, 10 Aug 2023 16:13:16 +0800
Message-Id: <20230810081319.65668-3-zhouchuyi@bytedance.com>
In-Reply-To: <20230810081319.65668-1-zhouchuyi@bytedance.com>
References: <20230810081319.65668-1-zhouchuyi@bytedance.com>

This patch adds a new metadata field, policy_name, to oom_control and
reports it in dump_header(), so that the log shows which selection policy
was used. A BPF program can call the kfunc set_oom_policy_name to set a
user-defined policy name. The in-kernel policy name is "default".

Signed-off-by: Chuyi Zhou
---
 include/linux/oom.h |  7 +++++++
 mm/oom_kill.c       | 42 +++++++++++++++++++++++++++++++++++++++---
 2 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/include/linux/oom.h b/include/linux/oom.h
index 7d0c9c48a0c5..69d0f2ec6ea6 100644
--- a/include/linux/oom.h
+++ b/include/linux/oom.h
@@ -22,6 +22,10 @@ enum oom_constraint {
 	CONSTRAINT_MEMCG,
 };
 
+enum {
+	POLICY_NAME_LEN = 16,
+};
+
 /*
  * Details of the page allocation that triggered the oom killer that are used to
  * determine what should be killed.
@@ -52,6 +56,9 @@ struct oom_control {
 
 	/* Used to print the constraint info. */
 	enum oom_constraint constraint;
+
+	/* Used to report the policy info. */
+	char policy_name[POLICY_NAME_LEN];
 };
 
 extern struct mutex oom_lock;
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 255c9ef1d808..3239dcdba4d7 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -443,6 +443,35 @@ static int dump_task(struct task_struct *p, void *arg)
 	return 0;
 }
 
+__bpf_kfunc void set_oom_policy_name(struct oom_control *oc, const char *src, size_t sz)
+{
+	memset(oc->policy_name, 0, sizeof(oc->policy_name));
+
+	if (sz > POLICY_NAME_LEN - 1)
+		sz = POLICY_NAME_LEN - 1; /* keep room for the trailing NUL */
+
+	memcpy(oc->policy_name, src, sz);
+}
+
+__diag_push();
+__diag_ignore_all("-Wmissing-prototypes",
+		  "kfuncs which will be used in BPF programs");
+
+__weak noinline void bpf_set_policy_name(struct oom_control *oc)
+{
+}
+
+__diag_pop();
+
+BTF_SET8_START(bpf_oom_policy_kfunc_ids)
+BTF_ID_FLAGS(func, set_oom_policy_name)
+BTF_SET8_END(bpf_oom_policy_kfunc_ids)
+
+static const struct btf_kfunc_id_set bpf_oom_policy_kfunc_set = {
+	.owner = THIS_MODULE,
+	.set = &bpf_oom_policy_kfunc_ids,
+};
+
 /**
  * dump_tasks - dump current memory state of all system tasks
  * @oc: pointer to struct oom_control
@@ -484,8 +513,8 @@ static void dump_oom_summary(struct oom_control *oc, struct task_struct *victim)
 
 static void dump_header(struct oom_control *oc, struct task_struct *p)
 {
-	pr_warn("%s invoked oom-killer: gfp_mask=%#x(%pGg), order=%d, oom_score_adj=%hd\n",
-		current->comm, oc->gfp_mask, &oc->gfp_mask, oc->order,
+	pr_warn("%s invoked oom-killer: gfp_mask=%#x(%pGg), order=%d, policy_name=%s, oom_score_adj=%hd\n",
+		current->comm, oc->gfp_mask, &oc->gfp_mask, oc->order, oc->policy_name,
 			current->signal->oom_score_adj);
 	if (!IS_ENABLED(CONFIG_COMPACTION) && oc->order)
 		pr_warn("COMPACTION is disabled!!!\n");
@@ -775,8 +804,11 @@ static int __init oom_init(void)
 	err = register_btf_fmodret_id_set(&oom_bpf_fmodret_set);
 	if (err)
 		pr_warn("error while registering oom fmodret entrypoints: %d", err);
+	err = register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING,
+				&bpf_oom_policy_kfunc_set);
+	if (err)
+		pr_warn("error while registering oom kfunc entrypoints: %d", err);
 #endif
-
 	return 0;
 }
 subsys_initcall(oom_init)
@@ -1196,6 +1228,10 @@ bool out_of_memory(struct oom_control *oc)
 		return true;
 	}
 
+	set_oom_policy_name(oc, "default", sizeof("default"));
+
+	bpf_set_policy_name(oc);
+
 	select_bad_process(oc);
 	/* Found nothing?!?! */
 	if (!oc->chosen) {
-- 
2.20.1

From nobody Sun Feb 8 17:04:26 2026
From: Chuyi Zhou
To: hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
	ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	muchun.song@linux.dev
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
	wuyun.abel@bytedance.com, robin.lu@bytedance.com, Chuyi Zhou,
	Alan Maguire
Subject: [RFC PATCH v2 3/5] mm: Add a tracepoint when OOM victim selection is failed
Date: Thu, 10 Aug 2023 16:13:17 +0800
Message-Id: <20230810081319.65668-4-zhouchuyi@bytedance.com>
In-Reply-To: <20230810081319.65668-1-zhouchuyi@bytedance.com>
References: <20230810081319.65668-1-zhouchuyi@bytedance.com>

This patch adds a tracepoint that marks the case where the OOM killer
could not choose any victim. This allows BPF programs to detect that the
BPF OOM policy failed to select a task.

Suggested-by: Alan Maguire
Signed-off-by: Chuyi Zhou
---
 include/trace/events/oom.h | 18 ++++++++++++++++++
 mm/oom_kill.c              |  1 +
 2 files changed, 19 insertions(+)

diff --git a/include/trace/events/oom.h b/include/trace/events/oom.h
index 26a11e4a2c36..b6ae1134229c 100644
--- a/include/trace/events/oom.h
+++ b/include/trace/events/oom.h
@@ -6,6 +6,7 @@
 #define _TRACE_OOM_H
 #include
 #include
+#include
 
 TRACE_EVENT(oom_score_adj_update,
 
@@ -151,6 +152,23 @@ TRACE_EVENT(skip_task_reaping,
 	TP_printk("pid=%d", __entry->pid)
 );
 
+TRACE_EVENT(select_bad_process_end,
+
+	TP_PROTO(struct oom_control *oc),
+
+	TP_ARGS(oc),
+
+	TP_STRUCT__entry(
+		__array(char, policy_name, POLICY_NAME_LEN)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->policy_name, oc->policy_name, POLICY_NAME_LEN);
+	),
+
+	TP_printk("policy_name=%s", __entry->policy_name)
+);
+
 #ifdef CONFIG_COMPACTION
 TRACE_EVENT(compact_retry,
 
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 3239dcdba4d7..af40a1b750fa 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -1235,6 +1235,7 @@ bool out_of_memory(struct oom_control *oc)
 	select_bad_process(oc);
 	/* Found nothing?!?! */
 	if (!oc->chosen) {
+		trace_select_bad_process_end(oc);
 		dump_header(oc, NULL);
 		pr_warn("Out of memory and no killable processes...\n");
 		/*
-- 
2.20.1

From nobody Sun Feb 8 17:04:26 2026
From: Chuyi Zhou
To: hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
	ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	muchun.song@linux.dev
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
	wuyun.abel@bytedance.com, robin.lu@bytedance.com, Chuyi Zhou
Subject: [RFC PATCH v2 4/5] bpf: Add a OOM policy test
Date: Thu, 10 Aug 2023 16:13:18 +0800
Message-Id: <20230810081319.65668-5-zhouchuyi@bytedance.com>
In-Reply-To: <20230810081319.65668-1-zhouchuyi@bytedance.com>
References: <20230810081319.65668-1-zhouchuyi@bytedance.com>

This patch adds a test which implements a priority-based policy through
bpf_oom_evaluate_task.
The BPF program, oom_policy.c, compares the cgroup priority of two tasks and select the lower one. The userspace program test_oom_policy.c maintains a priority map by using cgroup id as the keys and priority as the values. We could protect certain cgroups from oom-killer by setting higher priority. Signed-off-by: Chuyi Zhou --- .../bpf/prog_tests/test_oom_policy.c | 140 ++++++++++++++++++ .../testing/selftests/bpf/progs/oom_policy.c | 104 +++++++++++++ 2 files changed, 244 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/test_oom_policy.c create mode 100644 tools/testing/selftests/bpf/progs/oom_policy.c diff --git a/tools/testing/selftests/bpf/prog_tests/test_oom_policy.c b/too= ls/testing/selftests/bpf/prog_tests/test_oom_policy.c new file mode 100644 index 000000000000..bea61ff22603 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/test_oom_policy.c @@ -0,0 +1,140 @@ +// SPDX-License-Identifier: GPL-2.0-only +#define _GNU_SOURCE + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "cgroup_helpers.h" +#include "oom_policy.skel.h" + +static int map_fd; +static int cg_nr; +struct { + const char *path; + int fd; + unsigned long long id; +} cgs[] =3D { + { "/cg1" }, + { "/cg2" }, +}; + + +static struct oom_policy *open_load_oom_policy_skel(void) +{ + struct oom_policy *skel; + int err; + + skel =3D oom_policy__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + return NULL; + + err =3D oom_policy__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + return skel; + +cleanup: + oom_policy__destroy(skel); + return NULL; +} + +static void run_memory_consume(unsigned long long consume_size, int idx) +{ + char *buf; + + join_parent_cgroup(cgs[idx].path); + buf =3D malloc(consume_size); + memset(buf, 0, consume_size); + sleep(2); + exit(0); +} + +static int set_cgroup_prio(unsigned long long cg_id, int prio) +{ + int err; + + err =3D bpf_map_update_elem(map_fd, &cg_id, &prio, 
BPF_ANY); + ASSERT_EQ(err, 0, "update_map"); + return err; +} + +static int prepare_cgroup_environment(void) +{ + int err; + + err =3D setup_cgroup_environment(); + if (err) + goto clean_cg_env; + for (int i =3D 0; i < cg_nr; i++) { + err =3D cgs[i].fd =3D create_and_get_cgroup(cgs[i].path); + if (!ASSERT_GE(cgs[i].fd, 0, "cg_create")) + goto clean_cg_env; + cgs[i].id =3D get_cgroup_id(cgs[i].path); + } + return 0; +clean_cg_env: + cleanup_cgroup_environment(); + return err; +} + +void test_oom_policy(void) +{ + struct oom_policy *skel; + struct bpf_link *link; + int err; + int victim_pid; + unsigned long long victim_cg_id; + + link =3D NULL; + cg_nr =3D ARRAY_SIZE(cgs); + + skel =3D open_load_oom_policy_skel(); + err =3D oom_policy__attach(skel); + if (!ASSERT_OK(err, "oom_policy__attach")) + goto cleanup; + + map_fd =3D bpf_object__find_map_fd_by_name(skel->obj, "cg_map"); + if (!ASSERT_GE(map_fd, 0, "find map")) + goto cleanup; + + err =3D prepare_cgroup_environment(); + if (!ASSERT_EQ(err, 0, "prepare cgroup env")) + goto cleanup; + + write_cgroup_file("/", "memory.max", "10M"); + + /* + * Set higher priority to cg2 and lower to cg1, so we would select + * task under cg1 as victim.(see oom_policy.c) + */ + set_cgroup_prio(cgs[0].id, 10); + set_cgroup_prio(cgs[1].id, 50); + + victim_cg_id =3D cgs[0].id; + victim_pid =3D fork(); + + if (victim_pid =3D=3D 0) + run_memory_consume(1024 * 1024 * 4, 0); + + if (fork() =3D=3D 0) + run_memory_consume(1024 * 1024 * 8, 1); + + while (wait(NULL) > 0) + ; + + ASSERT_EQ(skel->bss->victim_pid, victim_pid, "victim_pid"); + ASSERT_EQ(skel->bss->victim_cg_id, victim_cg_id, "victim_cgid"); + ASSERT_EQ(skel->bss->failed_cnt, 1, "failed_cnt"); +cleanup: + bpf_link__destroy(link); + oom_policy__destroy(skel); + cleanup_cgroup_environment(); +} diff --git a/tools/testing/selftests/bpf/progs/oom_policy.c b/tools/testing= /selftests/bpf/progs/oom_policy.c new file mode 100644 index 000000000000..fc9efc93914e --- /dev/null +++ 
b/tools/testing/selftests/bpf/progs/oom_policy.c @@ -0,0 +1,104 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include +#include + +char _license[] SEC("license") =3D "GPL"; + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __type(key, int); + __type(value, int); + __uint(max_entries, 24); +} cg_map SEC(".maps"); + +unsigned int victim_pid; +u64 victim_cg_id; +int failed_cnt; + +#define EOPNOTSUPP 95 + +enum { + NO_BPF_POLICY, + BPF_EVAL_ABORT, + BPF_EVAL_NEXT, + BPF_EVAL_SELECT, +}; + +extern void set_oom_policy_name(struct oom_control *oc, const char *buf, s= ize_t sz) __ksym; + +static __always_inline u64 task_cgroup_id(struct task_struct *task) +{ + struct kernfs_node *node; + struct task_group *tg; + + if (!task) + return 0; + + tg =3D task->sched_task_group; + node =3D tg->css.cgroup->kn; + + return node->id; +} + +SEC("fentry/oom_kill_process") +int BPF_PROG(oom_kill_process_k, struct oom_control *oc, const char *messa= ge) +{ + struct task_struct *victim =3D oc->chosen; + + if (victim) { + victim_cg_id =3D task_cgroup_id(victim); + victim_pid =3D victim->pid; + } + + return 0; +} + +SEC("fentry/bpf_set_policy_name") +int BPF_PROG(set_police_name_k, struct oom_control *oc) +{ + char name[] =3D "cg_prio"; + set_oom_policy_name(oc, name, sizeof(name)); + return 0; +} + +SEC("tp_btf/select_bad_process_end") +int BPF_PROG(record_failed, struct oom_control *oc) +{ + failed_cnt +=3D 1; + return 0; +} + +SEC("fmod_ret/bpf_oom_evaluate_task") +int BPF_PROG(bpf_oom_evaluate_task, struct task_struct *task, struct oom_c= ontrol *oc) +{ + int chosen_cg_prio, task_cg_prio; + u64 chosen_cg_id, task_cg_id; + struct task_struct *chosen; + int *val; + + if (!failed_cnt) + return BPF_EVAL_NEXT; + + chosen =3D oc->chosen; + if (!chosen) + return BPF_EVAL_SELECT; + + chosen_cg_id =3D task_cgroup_id(chosen); + task_cg_id =3D task_cgroup_id(task); + chosen_cg_prio =3D task_cg_prio =3D 0; + val =3D bpf_map_lookup_elem(&cg_map, &chosen_cg_id); + if (val) + chosen_cg_prio 
=3D *val; + val =3D bpf_map_lookup_elem(&cg_map, &task_cg_id); + if (val) + task_cg_prio =3D *val; + + if (chosen_cg_prio > task_cg_prio) + return BPF_EVAL_SELECT; + if (chosen_cg_prio < task_cg_prio) + return BPF_EVAL_NEXT; + + return NO_BPF_POLICY; +} + --=20 2.20.1 From nobody Sun Feb 8 17:04:26 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 259FCC001B0 for ; Thu, 10 Aug 2023 08:14:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234134AbjHJIOT (ORCPT ); Thu, 10 Aug 2023 04:14:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48316 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234178AbjHJIOL (ORCPT ); Thu, 10 Aug 2023 04:14:11 -0400 Received: from mail-pg1-x52e.google.com (mail-pg1-x52e.google.com [IPv6:2607:f8b0:4864:20::52e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5F3F2E7E for ; Thu, 10 Aug 2023 01:13:51 -0700 (PDT) Received: by mail-pg1-x52e.google.com with SMTP id 41be03b00d2f7-564b8e60ce9so432088a12.2 for ; Thu, 10 Aug 2023 01:13:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1691655231; x=1692260031; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=YS6ue8t2dd5iDW6jb431VKpPx5tAdDiMNn4sU9SZ4eM=; b=KNJbzwTxxK3KzgAfGpr2FeHEj85XtDGujL07/8Dd3z3V3jSF8Rh+LYfOCIp5iXK0w7 YcNv7ogCQvXKM3pEP24qCmrjpg/JDP9h1bPZJUTufYhvRvW+AfvjaCJjSAIiDVPy4YlB FvMnZgpH//K7t4jlj1KtZkT8Tjr1dVTBmA31ymCxgtTRo28F/xNhPrvKRxrHZK39ygQZ TPSiOnZgnkDk1zDo7yeScGo2f+KIT1WWFlda15vKkcDADn6i8rF0kiPkSV8N9pjikKK2 NsnxwPD8zv745JOrycqHn98sjc4x3oYJA0Stmu2pcb6Xa6lTRk6/qVJaIsczS2Rfi1mM 78hA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=1e100.net; s=20221208; t=1691655231; x=1692260031; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=YS6ue8t2dd5iDW6jb431VKpPx5tAdDiMNn4sU9SZ4eM=; b=jem38Q03kgOhbuCGqpGyvcnQhoAaP0O132pPuxQJYOFajkt3450WoFFtNqsjve4nR7 kwVDzjMkSCYqq8DbZQy5paEZIGOMrvsilU4sHsWSHAAS3j1PINgZGLe8Y5WQDd42Xdf4 zD/l+UUlQbgQSmAFdX/lst2U66GuPwpsvmf+2hOefn1KJdYqd1YxVyltDjkzoT26Rhqh cjRVF/0CtSpa86pGBVgWAXRmGchKLLsunnRf6N6wwQbyZD/In4B/jdCSSRpjfoh1jLrC 6Ht3vts4AEP0ENvD9aNvo9hAviMH205bmvHeqvcaxeXJtVzXuS5vL8bqsH3VPHXHAMX+ Fngg== X-Gm-Message-State: AOJu0YxweEDp5vnDYOTids1PN30QiTA35aJWZJnb+wEK7q3+C/fVEbuU BfPCCpBZKCVeRP4OBzD8A0wmpA== X-Google-Smtp-Source: AGHT+IHuFmD3Cyae7WIdtzmgxlWYjNuOj281DztPhX4TApif8F0U2Be9/d7VjjGRzVC64hKOa/1OAA== X-Received: by 2002:a17:902:e548:b0:1ac:63ac:10a7 with SMTP id n8-20020a170902e54800b001ac63ac10a7mr1519133plf.68.1691655230885; Thu, 10 Aug 2023 01:13:50 -0700 (PDT) Received: from n37-019-243.byted.org ([180.184.51.40]) by smtp.gmail.com with ESMTPSA id x12-20020a170902ec8c00b001b1a2c14a4asm1019036plg.38.2023.08.10.01.13.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 10 Aug 2023 01:13:50 -0700 (PDT) From: Chuyi Zhou To: hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, muchun.song@linux.dev Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org, wuyun.abel@bytedance.com, robin.lu@bytedance.com, Chuyi Zhou Subject: [RFC PATCH v2 5/5] bpf: Add a BPF OOM policy Doc Date: Thu, 10 Aug 2023 16:13:19 +0800 Message-Id: <20230810081319.65668-6-zhouchuyi@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20230810081319.65668-1-zhouchuyi@bytedance.com> References: <20230810081319.65668-1-zhouchuyi@bytedance.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: 
 text/plain; charset="utf-8"

This patch adds a new doc Documentation/bpf/oom.rst to describe how
BPF OOM policy is supposed to work.

Signed-off-by: Chuyi Zhou
---
 Documentation/bpf/oom.rst | 70 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 Documentation/bpf/oom.rst

diff --git a/Documentation/bpf/oom.rst b/Documentation/bpf/oom.rst
new file mode 100644
index 000000000000..9bad1fd30d4a
--- /dev/null
+++ b/Documentation/bpf/oom.rst
@@ -0,0 +1,70 @@
+==============
+BPF OOM Policy
+==============
+
+The Out Of Memory Killer (aka OOM Killer) is invoked when the system is
+critically low on memory. The in-kernel implementation iterates over all
+tasks in the specific OOM domain (all tasks for a global OOM, all members
+of the memcg tree for a hard-limit OOM) and selects a victim to kill based
+on a heuristic policy.
+
+Specifically:
+
+1. Iterate over the tasks using ``oom_evaluate_task()``. When a valid
+   (killable) victim is found in iteration N, select it.
+
+2. In iterations N + 1, N + 2, ..., compare the current task with the
+   previously selected task and select the current one if it is more suitable.
+
+3. Finally, the selected task is the victim to kill.
+
+However, this does not meet the needs of users in some special scenarios. Using
+eBPF, we can implement customized OOM policies to meet those needs.
+
+Developer API:
+==================
+
+bpf_oom_evaluate_task
+----------------------
+
+``bpf_oom_evaluate_task`` is a new interface hooked into ``oom_evaluate_task()``
+that is used to bypass the in-kernel selection logic. Users can customize
+their victim selection policy through BPF programs attached to it.
+::
+
+    int bpf_oom_evaluate_task(struct task_struct *task,
+                              struct oom_control *oc);
+
+Return value::
+
+    NO_BPF_POLICY      no BPF policy; fall back to the in-kernel selection
+    BPF_EVAL_ABORT     abort the selection (exit from the current selection loop)
+    BPF_EVAL_NEXT      ignore the task
+    BPF_EVAL_SELECT    select the current task
+
+Suppose we want to select a victim with a specific pid when the OOM killer is
+invoked. We can use the following BPF program::
+
+    SEC("fmod_ret/bpf_oom_evaluate_task")
+    int BPF_PROG(bpf_oom_evaluate_task, struct task_struct *task, struct oom_control *oc)
+    {
+        if (task->pid == target_pid)
+            return BPF_EVAL_SELECT;
+        return BPF_EVAL_NEXT;
+    }
+
+bpf_set_policy_name
+---------------------
+
+``bpf_set_policy_name`` is an interface hooked before the start of victim selection. The
+attached program can set the policy's name so that ``dump_header()`` can identify different
+policies when reporting messages. The name is set through the kfunc ``set_oom_policy_name``:
+::
+
+    SEC("fentry/bpf_set_policy_name")
+    int BPF_PROG(set_police_name_k, struct oom_control *oc)
+    {
+        char name[] = "my_policy";
+        set_oom_policy_name(oc, name, sizeof(name));
+        return 0;
+    }
\ No newline at end of file
-- 
2.20.1
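
As a rough illustration of the return-value semantics described in oom.rst
above, the following userspace C model walks a task list and applies a policy
callback to each task. It is only a sketch and not the kernel implementation:
struct model_task, model_select_victim(), the badness field, the numeric
values of the enum, and the handling of BPF_EVAL_ABORT (drop any candidate and
stop iterating) are assumptions made for this example.

```c
#include <stddef.h>

/* Return codes named in the document; the numeric values are assumed. */
enum bpf_eval_ret {
	NO_BPF_POLICY,
	BPF_EVAL_ABORT,
	BPF_EVAL_NEXT,
	BPF_EVAL_SELECT,
};

/* Hypothetical stand-in for struct task_struct. */
struct model_task {
	int pid;
	long badness;	/* stand-in for the in-kernel heuristic score */
};

typedef enum bpf_eval_ret (*policy_fn)(const struct model_task *task);

static int target_pid = 42;

/* Userspace analogue of the pid-based BPF program in the document. */
static enum bpf_eval_ret select_by_pid(const struct model_task *task)
{
	return task->pid == target_pid ? BPF_EVAL_SELECT : BPF_EVAL_NEXT;
}

/* A policy that always defers to the built-in heuristic. */
static enum bpf_eval_ret no_policy(const struct model_task *task)
{
	(void)task;
	return NO_BPF_POLICY;
}

/* Walk all tasks and return the chosen victim, or NULL if there is none. */
static const struct model_task *
model_select_victim(const struct model_task *tasks, size_t n, policy_fn policy)
{
	const struct model_task *chosen = NULL;

	for (size_t i = 0; i < n; i++) {
		switch (policy(&tasks[i])) {
		case BPF_EVAL_ABORT:
			return NULL;		/* assumed: no victim at all */
		case BPF_EVAL_NEXT:
			continue;		/* ignore this task */
		case BPF_EVAL_SELECT:
			chosen = &tasks[i];	/* policy picks this task */
			continue;
		case NO_BPF_POLICY:
			break;			/* fall through to the heuristic */
		}
		/* Modelled fallback: the highest badness wins. */
		if (!chosen || tasks[i].badness > chosen->badness)
			chosen = &tasks[i];
	}
	return chosen;
}
```

Calling model_select_victim(tasks, n, select_by_pid) returns the task whose
pid equals target_pid, if any, while passing no_policy leaves the decision to
the modelled heuristic.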