Background of this patch set can be found in v1:
https://lore.kernel.org/all/20240613133711.2867745-1-zhengyejian1@huawei.com/
Here is a reproduction showing the impact on livepatch:
1. Add the following hack to make livepatch-sample.ko patch do_one_initcall(),
which has an overridden weak function behind it in vmlinux, then print the
actually used __fentry__ location:
diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
index 90408500e5a3..20b75e7b1771 100644
--- a/kernel/livepatch/patch.c
+++ b/kernel/livepatch/patch.c
@@ -173,6 +173,8 @@ static int klp_patch_func(struct klp_func *func)
unsigned long ftrace_loc;
ftrace_loc = ftrace_location((unsigned long)func->old_func);
+ pr_info("%s: old_func=%pS ftrace_loc=%pS (%lx)\n", __func__,
+ func->old_func, (void *)ftrace_loc, ftrace_loc);
if (!ftrace_loc) {
pr_err("failed to find location for function '%s'\n",
func->old_name);
diff --git a/samples/livepatch/livepatch-sample.c b/samples/livepatch/livepatch-sample.c
index cd76d7ebe598..c6e8e78e23b8 100644
--- a/samples/livepatch/livepatch-sample.c
+++ b/samples/livepatch/livepatch-sample.c
@@ -38,7 +38,7 @@ static int livepatch_cmdline_proc_show(struct seq_file *m, void *v)
static struct klp_func funcs[] = {
{
- .old_name = "cmdline_proc_show",
+ .old_name = "do_one_initcall",
.new_func = livepatch_cmdline_proc_show,
}, { }
};
2. We can see the size of do_one_initcall() is 0x27e:
$ nm -nS vmlinux | grep do_one_initcall
ffffffff810021e0 000000000000027e T do_one_initcall
3. Insert the livepatch ko. The expected ftrace_loc is do_one_initcall+0x4/0x27e,
but it is actually do_one_initcall+0x294/0x2c0, which is invalid; the reported
size is inaccurate as well. This livepatch therefore cannot work properly:
# insmod livepatch-sample.ko
[ 17.942451] livepatch: klp_patch_func: old_func=do_one_initcall+0x0/0x2c0 ftrace_loc=do_one_initcall+0x294/0x2c0 (ffffffff81002474)
#
# cat /sys/kernel/tracing/available_filter_functions_addrs | grep ffffffff81002474
ffffffff81002474 __ftrace_invalid_address___660
After this patch, this issue is gone:
# insmod livepatch-sample.ko
[ 42.408632] livepatch: klp_patch_func: old_func=do_one_initcall+0x0/0x27e ftrace_loc=do_one_initcall+0x4/0x27e (ffffffff810021e4)
# cat /sys/kernel/tracing/available_filter_functions_addrs | grep ffffffff810021e4
ffffffff810021e4 do_one_initcall
# cat /sys/kernel/tracing/enabled_functions
do_one_initcall (1) I M tramp: 0xffffffffa0020000 (klp_ftrace_handler+0x0/0x258)->klp_ftrace_handler+0x0/0x258
Changes:
v1->v2:
- Drop the patch titled "kallsyms: Optimize multiple times of realloc() to one time of malloc()"
  as suggested by Masahiro;
Link: https://lore.kernel.org/all/CAK7LNAQkSnZ1nVXiZH9kg52H-A_=urcsv-W7wGXvunMGhGX8Vw@mail.gmail.com/
- Changes in PATCH 1/5:
- Rename may_exist_hole_after_symbol() to has_hole() and make it return bool;
- Use realloc() instead of malloc()ing another table in emit_hole_symbols();
Link: https://lore.kernel.org/all/CAK7LNAQaLc6aDK85qQtPHoCkQSGyL-TxXjpgJTfehe2Q1=jMSA@mail.gmail.com/
- Use an empty symbol type/name as a special case to represent a hole instead of
  using a symbol named "t__hole_symbol_XXXXX";
Link: https://lore.kernel.org/all/CAK7LNARiR5z9hPRG932T7YjRWqkX_qZ7WKmbxx7iTo2w5YJojQ@mail.gmail.com/
- Changes in PATCH 3/5:
- Now the name of a hole symbol is NULL, so if __fentry__ is located in a hole,
  kallsyms_lookup() finds a NULL name and returns 0; drop the now-needless
  kallsyms_is_hole_symbol().
Zheng Yejian (5):
kallsyms: Emit symbol at the holes in the text
module: kallsyms: Determine exact function size
ftrace: Skip invalid __fentry__ in ftrace_process_locs()
ftrace: Fix possible out-of-bound issue in ftrace_process_locs()
ftrace: Revert the FTRACE_MCOUNT_MAX_OFFSET workaround
arch/powerpc/include/asm/ftrace.h | 7 --
arch/x86/include/asm/ftrace.h | 7 --
include/linux/module.h | 14 +++
kernel/module/kallsyms.c | 42 ++++++--
kernel/trace/ftrace.c | 171 ++++++------------------------
scripts/kallsyms.c | 95 ++++++++++++++++-
scripts/link-vmlinux.sh | 4 +-
scripts/mksysmap | 2 +-
8 files changed, 173 insertions(+), 169 deletions(-)
--
2.25.1
On Tue, 2024-07-23 at 14:32 +0800, Zheng Yejian wrote:
> Background of this patch set can be found in v1:
>
> https://lore.kernel.org/all/20240613133711.2867745-1-zhengyejian1@huawei.com/
>
> Here add a reproduction to show the impact to livepatch:
> 1. Add following hack to make livepatch-sample.ko do patch on do_one_initcall()
> which has an overriden weak function behind in vmlinux, then print the
> actually used __fentry__ location:

Hi all, what is the status of this patch series? I'd really like to see it
or some other fix to this issue merged. The underlying bug is a significant
one that can cause ftrace/livepatch/BPF fentry to fail silently. I've
noticed this bug in another context[1] and realized they're the same issue.

I'm happy to help with this patch series to address any issues as needed.

[1] https://lore.kernel.org/bpf/7136605d24de9b1fc62d02a355ef11c950a94153.camel@crowdstrike.com/T/#mb7e6f84ac90fa78989e9e2c3cd8d29f65a78845b
Hi,

On 10/12/2024 at 20:15, Martin Kelly wrote:
> On Tue, 2024-07-23 at 14:32 +0800, Zheng Yejian wrote:
>> Background of this patch set can be found in v1:
>>
>> https://lore.kernel.org/all/20240613133711.2867745-1-zhengyejian1@huawei.com/
>>
>> Here add a reproduction to show the impact to livepatch:
>> 1. Add following hack to make livepatch-sample.ko do patch on do_one_initcall()
>> which has an overriden weak function behind in vmlinux, then print the
>> actually used __fentry__ location:
>
> Hi all, what is the status of this patch series? I'd really like to see
> it or some other fix to this issue merged. The underlying bug is a
> significant one that can cause ftrace/livepatch/BPF fentry to fail
> silently. I've noticed this bug in another context[1] and realized
> they're the same issue.
>
> I'm happy to help with this patch series to address any issues as needed.

As far as I can see there are problems on build with patch 1, see
https://patchwork.kernel.org/project/linux-modules/patch/20240723063258.2240610-2-zhengyejian@huaweicloud.com/

> [1]
> https://lore.kernel.org/bpf/7136605d24de9b1fc62d02a355ef11c950a94153.camel@crowdstrike.com/T/#mb7e6f84ac90fa78989e9e2c3cd8d29f65a78845b

Christophe
On Tue, 2024-12-10 at 21:01 +0100, Christophe Leroy wrote:
>> Hi all, what is the status of this patch series? I'd really like to see
>> it or some other fix to this issue merged. The underlying bug is a
>> significant one that can cause ftrace/livepatch/BPF fentry to fail
>> silently. I've noticed this bug in another context[1] and realized
>> they're the same issue.
>>
>> I'm happy to help with this patch series to address any issues as needed.
>
> As far as I can see there are problems on build with patch 1, see
> https://patchwork.kernel.org/project/linux-modules/patch/20240723063258.2240610-2-zhengyejian@huaweicloud.com/

Yeah, I see those. Additionally, this patch no longer applies cleanly to
current master, though fixing it up to do so is pretty easy. Having done
that, this series appears to resolve the issues I saw in the other linked
thread.

Zheng, do you plan to send a v3? I'd be happy to help out with this patch
series if you'd like, as I'm hoping to get this issue resolved (though I am
not an ftrace expert).
On 2024/12/11 04:49, Martin Kelly wrote:
> On Tue, 2024-12-10 at 21:01 +0100, Christophe Leroy wrote:
>>> Hi all, what is the status of this patch series? I'd really like to see
>>> it or some other fix to this issue merged. The underlying bug is a
>>> significant one that can cause ftrace/livepatch/BPF fentry to fail
>>> silently. I've noticed this bug in another context[1] and realized
>>> they're the same issue.
>>>
>>> I'm happy to help with this patch series to address any issues as needed.
>>
>> As far as I can see there are problems on build with patch 1, see
>> https://patchwork.kernel.org/project/linux-modules/patch/20240723063258.2240610-2-zhengyejian@huaweicloud.com/
>
> Yeah, I see those. Additionally, this patch no longer applies cleanly
> to current master, though fixing it up to do so is pretty easy. Having
> done that, this series appears to resolve the issues I saw in the other
> linked thread.
>
> Zheng, do you plan to send a v3? I'd be happy to help out with this
> patch series if you'd like, as I'm hoping to get this issue resolved
> (though I am not an ftrace expert).

Sorry to reply so late. Thanks for your feedback!

Steve recently started a discussion of the issue in:
https://lore.kernel.org/all/20241014210841.5a4764de@gandalf.local.home/
but there's no conclusion.

I can rebase this patch series and send a new version first, and I'm also
hoping to get more feedback and help to resolve the issue.

--
Thanks,
Zheng Yejian
On Thu, 2024-12-12 at 17:52 +0800, Zheng Yejian wrote:
> On 2024/12/11 04:49, Martin Kelly wrote:
>>
>> Zheng, do you plan to send a v3? I'd be happy to help out with this
>> patch series if you'd like, as I'm hoping to get this issue resolved
>> (though I am not an ftrace expert).
>
> Sorry to reply so late. Thanks for your feedback!
>
> Steve recently started a discussion of the issue in:
> https://lore.kernel.org/all/20241015100111.37adfb28@gandalf.local.home/
> but there's no conclusion.
>
> I can rebase this patch series and send a new version first, and
> I'm also hoping to get more feedback and help to resolve the issue.

Hi Zheng,

I may have misunderstood, but based on the final message from Steven, I got
the sense that the idea listed in that thread didn't work out and we should
proceed with your current approach.

Please consider me an interested party for this patch series, and let me
know if there's anything I can do to help speed it along (co-develop, test,
anything else). And of course, thanks very much for your work thus far!
On 2024/12/14 03:31, Martin Kelly wrote:
> On Thu, 2024-12-12 at 17:52 +0800, Zheng Yejian wrote:
>> On 2024/12/11 04:49, Martin Kelly wrote:
>>>
>>>
>>> Zheng, do you plan to send a v3? I'd be happy to help out with this
>>> patch series if you'd like, as I'm hoping to get this issue
>>> resolved
>>> (though I am not an ftrace expert).
>>
>> Sorry to reply so late. Thanks for your feedback!
>>
>> Steve recently started a discussion of the issue in:
>>
>> https://lore.kernel.org/all/20241015100111.37adfb28@gandalf.local.home/
>> but there's no conclusion.
>>
>> I can rebase this patch series and send a new version first, and
>> I'm also hoping to get more feedbacks and help to resolve the issue.
>>
>
> Hi Zheng,
>
> I may have misunderstood, but based on the final message from Steven, I
> got the sense that the idea listed in that thread didn't work out and
> we should proceed with your current approach.
This patch set adds symbol length information to kallsyms, which can then
ensure that there is only one valid fentry within a function's range.
However, I think this patch set actually has some performance implications,
including:
1. Adding hole symbols may cause the symbol table to grow significantly;
2. When parsing fentry tables, excluding invalid fentries through kallsyms lookups
may also increase boot or module load times slightly.
The direct cause of this issue is that the wrong fentry is found by
ftrace_location(). Following the approach of FTRACE_MCOUNT_MAX_OFFSET,
narrowing down the search range and re-searching may also solve this
problem; a demo patch is below (not fully tested):
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 9b17efb1a87d..7d34320ca9d1 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1678,8 +1678,11 @@ unsigned long ftrace_location(unsigned long ip)
goto out;
/* map sym+0 to __fentry__ */
- if (!offset)
+ if (!offset) {
loc = ftrace_location_range(ip, ip + size - 1);
+ while (loc > ip && loc - ip > FTRACE_MCOUNT_MAX_OFFSET)
+ loc = ftrace_location_range(ip, loc - 1);
+ }
}
Steve, Peter, what do you think?
>
> Please consider me an interested party for this patch series, and let
> me know if there's anything I can do to help speed it along (co-
> develop, test, anything else). And of course, thanks very much for your
> work thus far!
--
Thanks,
Zheng Yejian
Sorry for the late reply. Forgot about this as I was focused on other end-of-year issues.
On Sat, 14 Dec 2024 16:37:59 +0800
Zheng Yejian <zhengyejian1@huawei.com> wrote:
> The direct cause of this issue is that the wrong fentry is found by
> ftrace_location(). Following the approach of FTRACE_MCOUNT_MAX_OFFSET,
> narrowing down the search range and re-searching may also solve this
> problem; a demo patch is below (not fully tested):
>
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 9b17efb1a87d..7d34320ca9d1 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -1678,8 +1678,11 @@ unsigned long ftrace_location(unsigned long ip)
> goto out;
>
> /* map sym+0 to __fentry__ */
> - if (!offset)
> + if (!offset) {
> loc = ftrace_location_range(ip, ip + size - 1);
> + while (loc > ip && loc - ip > FTRACE_MCOUNT_MAX_OFFSET)
> + loc = ftrace_location_range(ip, loc - 1);
> + }
> }
>
> Steve, Peter, what do you think?
Hmm, removing the weak functions from the __mcount_loc location should also
solve this, as the ftrace_location_range() will not return a weak function
if it's not part of the __mcount_loc table.
That is, would this patchset work?
https://lore.kernel.org/all/20250102232609.529842248@goodmis.org/
-- Steve
On 2025/1/22 01:48, Steven Rostedt wrote:
>
> Sorry for the late reply. Forgot about this as I was focused on other end-of-year issues.
>
> On Sat, 14 Dec 2024 16:37:59 +0800
> Zheng Yejian <zhengyejian1@huawei.com> wrote:
>
>> The direct cause of this issue is that the wrong fentry is found by ftrace_location(),
>> following the approach of "FTRACE_MCOUNT_MAX_OFFSET", narrowing down the search range
>> and re-finding may also solve this problem, demo patch like below (not
>> fully tested):
>>
>> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
>> index 9b17efb1a87d..7d34320ca9d1 100644
>> --- a/kernel/trace/ftrace.c
>> +++ b/kernel/trace/ftrace.c
>> @@ -1678,8 +1678,11 @@ unsigned long ftrace_location(unsigned long ip)
>> goto out;
>>
>> /* map sym+0 to __fentry__ */
>> - if (!offset)
>> + if (!offset) {
>> loc = ftrace_location_range(ip, ip + size - 1);
>> + while (loc > ip && loc - ip > FTRACE_MCOUNT_MAX_OFFSET)
>> + loc = ftrace_location_range(ip, loc - 1);
>> + }
>> }
>>
>> Steve, Peter, what do you think?
>
> Hmm, removing the weak functions from the __mcount_loc location should also
> solve this, as the ftrace_location_range() will not return a weak function
> if it's not part of the __mcount_loc table.
>
> That is, would this patchset work?
>
> https://lore.kernel.org/all/20250102232609.529842248@goodmis.org/
I only picked patch 15 and patch 16 into v6.14-rc1, since most of the patches
in that series have already been merged, and the issue seems gone, thanks!
>
> -- Steve
--
Thanks,
Zheng Yejian