kernel/module/kmod.c | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-)
Sometimes when kernel calls request_module to load a module
into kernel space, it doesn't pass the module name appropriately,
and request_module doesn't verify it as well.
As a result, modprobe is invoked anyway and spend a lot of time
searching a nonsense name.
For example reported from a customer, he runs a user space process
to call ioctl(fd, SIOCGIFINDEX, &ifr), the callstack in kernel is
like that:
dev_ioctl(net/core/dev_iovtl.c)
dev_load
request_module("netdev-%s", name);
or request_module("%s", name);
However if name of NIC is empty, neither dev_load nor request_module
checks it at the first place, modprobe will search module "netdev-"
in its default path, env path and path configured in etc for nothing,
increase a lot system overhead.
To address this problem, this patch copies va_list and introduces
a helper is_module_name_valid to verify the parameters validity
one by one, either null or empty. if it fails, no modprobe invoked.
Signed-off-by: Song Chen <chensong_2000@189.cn>
---
kernel/module/kmod.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
diff --git a/kernel/module/kmod.c b/kernel/module/kmod.c
index 0800d9891692..161ad41b864e 100644
--- a/kernel/module/kmod.c
+++ b/kernel/module/kmod.c
@@ -113,6 +113,27 @@ static int call_modprobe(char *orig_module_name, int wait)
return -ENOMEM;
}
+static inline bool is_module_name_valid(const char *fmt, va_list args)
+{
+ va_list args_verify;
+ bool ret = true;
+ const char *p, *arg;
+
+ va_copy(args_verify, args);
+ for (p = fmt; *p; p++) {
+ if (*p == '%' && *(++p) == 's') {
+ arg = va_arg(args_verify, const char *);
+ if (!arg || arg[0] == '\0') {
+ ret = false;
+ break;
+ }
+ }
+ }
+ va_end(args_verify);
+
+ return ret;
+}
+
/**
* __request_module - try to load a kernel module
* @wait: wait (or not) for the operation to complete
@@ -147,7 +168,13 @@ int __request_module(bool wait, const char *fmt, ...)
return -ENOENT;
va_start(args, fmt);
- ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
+ if (is_module_name_valid(fmt, args))
+ ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
+ else {
+ pr_warn_ratelimited("request_module: modprobe cannot be processed due to invalid module name");
+ va_end(args);
+ return -EINVAL;
+ }
va_end(args);
if (ret >= MODULE_NAME_LEN)
return -ENAMETOOLONG;
--
2.25.1
On 11/10/24 12:42, Song Chen wrote: > Sometimes when kernel calls request_module to load a module > into kernel space, it doesn't pass the module name appropriately, > and request_module doesn't verify it as well. > > As a result, modprobe is invoked anyway and spend a lot of time > searching a nonsense name. > > For example reported from a customer, he runs a user space process > to call ioctl(fd, SIOCGIFINDEX, &ifr), the callstack in kernel is > like that: > dev_ioctl(net/core/dev_iovtl.c) > dev_load > request_module("netdev-%s", name); > or request_module("%s", name); > > However if name of NIC is empty, neither dev_load nor request_module > checks it at the first place, modprobe will search module "netdev-" > in its default path, env path and path configured in etc for nothing, > increase a lot system overhead. > > To address this problem, this patch copies va_list and introduces > a helper is_module_name_valid to verify the parameters validity > one by one, either null or empty. if it fails, no modprobe invoked. I'm not sure if I fully follow why this should be addressed at the request_module() level. If the user repeatedly invokes SIOCGIFINDEX with an empty name and this increases their system load, wouldn't it be better to update the userspace to prevent this non-sense request in the first place? Similarly, if something should be done in the kernel, wouldn't it be more straightforward for dev_ioctl()/dev_load() to check this case? I think the same should in principle apply to other places that might invoke request_module() with "%s" and a bogus value. The callers can appropriately decide if their request makes sense and should be fixed/improved. -- Thanks, Petr
Hi Petr, Please see my inline comments, many thanks. BR Song 在 2024/11/12 20:56, Petr Pavlu 写道: > On 11/10/24 12:42, Song Chen wrote: >> Sometimes when kernel calls request_module to load a module >> into kernel space, it doesn't pass the module name appropriately, >> and request_module doesn't verify it as well. >> >> As a result, modprobe is invoked anyway and spend a lot of time >> searching a nonsense name. >> >> For example reported from a customer, he runs a user space process >> to call ioctl(fd, SIOCGIFINDEX, &ifr), the callstack in kernel is >> like that: >> dev_ioctl(net/core/dev_iovtl.c) >> dev_load >> request_module("netdev-%s", name); >> or request_module("%s", name); >> >> However if name of NIC is empty, neither dev_load nor request_module >> checks it at the first place, modprobe will search module "netdev-" >> in its default path, env path and path configured in etc for nothing, >> increase a lot system overhead. >> >> To address this problem, this patch copies va_list and introduces >> a helper is_module_name_valid to verify the parameters validity >> one by one, either null or empty. if it fails, no modprobe invoked. > > I'm not sure if I fully follow why this should be addressed at the > request_module() level. If the user repeatedly invokes SIOCGIFINDEX with > an empty name and this increases their system load, wouldn't it be > better to update the userspace to prevent this non-sense request in the > first place? If the user process knew, it wouldn't make the mistake. moreover, what happened in dev_load was quite confusing, please see the code below: no_module = !dev; if (no_module && capable(CAP_NET_ADMIN)) no_module = request_module("netdev-%s", name); if (no_module && capable(CAP_SYS_MODULE)) request_module("%s", name); Running the same process, sys admin or root user spends more time than normal user, it took a while for us to find the cause, that's why i tried to fix it in kernel. Similarly, if something should be done in the kernel, > wouldn't it be more straightforward for dev_ioctl()/dev_load() to check > this case? I thought about it at the beginning, not only dev_ioctl/dev_load but also other request_module callers should check this case as well, that would be too much effort, then I switched to check it at the beginning of request_module which every caller goes through. > > I think the same should in principle apply to other places that might > invoke request_module() with "%s" and a bogus value. The callers can > appropriately decide if their request makes sense and should be > fixed/improved. > Callees are obliged to do fault tolerance for callers, or at least let them know what is going on inside, what kinds of mistake they are making, there are a lot of such cases in kernel, such as call_modprobe in kernel/module/kmod.c, it checks if orig_module_name is NULL. Song
On 11/13/24 03:15, Song Chen wrote: > 在 2024/11/12 20:56, Petr Pavlu 写道: >> On 11/10/24 12:42, Song Chen wrote: >>> Sometimes when kernel calls request_module to load a module >>> into kernel space, it doesn't pass the module name appropriately, >>> and request_module doesn't verify it as well. >>> >>> As a result, modprobe is invoked anyway and spend a lot of time >>> searching a nonsense name. >>> >>> For example reported from a customer, he runs a user space process >>> to call ioctl(fd, SIOCGIFINDEX, &ifr), the callstack in kernel is >>> like that: >>> dev_ioctl(net/core/dev_iovtl.c) >>> dev_load >>> request_module("netdev-%s", name); >>> or request_module("%s", name); >>> >>> However if name of NIC is empty, neither dev_load nor request_module >>> checks it at the first place, modprobe will search module "netdev-" >>> in its default path, env path and path configured in etc for nothing, >>> increase a lot system overhead. >>> >>> To address this problem, this patch copies va_list and introduces >>> a helper is_module_name_valid to verify the parameters validity >>> one by one, either null or empty. if it fails, no modprobe invoked. >> >> I'm not sure if I fully follow why this should be addressed at the >> request_module() level. If the user repeatedly invokes SIOCGIFINDEX with >> an empty name and this increases their system load, wouldn't it be >> better to update the userspace to prevent this non-sense request in the >> first place? > > If the user process knew, it wouldn't make the mistake. The user process should be able to check that the ifr_name passed to SIOCGIFINDEX is empty and avoid the syscall altogether, or am I missing something? Even if the kernel gets improved in some way to handle this case better, I would still suggest looking at what the application is doing and how it ends up making this call. > moreover, what > happened in dev_load was quite confusing, please see the code below: > > no_module = !dev; > if (no_module && capable(CAP_NET_ADMIN)) > no_module = request_module("netdev-%s", name); > if (no_module && capable(CAP_SYS_MODULE)) > request_module("%s", name); > > Running the same process, sys admin or root user spends more time than > normal user, it took a while for us to find the cause, that's why i > tried to fix it in kernel. > > Similarly, if something should be done in the kernel, >> wouldn't it be more straightforward for dev_ioctl()/dev_load() to check >> this case? > > I thought about it at the beginning, not only dev_ioctl/dev_load but > also other request_module callers should check this case as well, that > would be too much effort, then I switched to check it at the beginning > of request_module which every caller goes through. > >> >> I think the same should in principle apply to other places that might >> invoke request_module() with "%s" and a bogus value. The callers can >> appropriately decide if their request makes sense and should be >> fixed/improved. >> > > Callees are obliged to do fault tolerance for callers, or at least let > them know what is going on inside, what kinds of mistake they are > making, there are a lot of such cases in kernel, such as call_modprobe > in kernel/module/kmod.c, it checks if orig_module_name is NULL. Ok, I see the idea behind checking that a value passed to request_module() to format "%s" is non-NULL. I'm however not sure about rejecting empty strings as is also done by the patch. Consider a call to request_module("mod%s", suffix) where the suffix could be empty to select the default variant, or non-empty to select e.g. some optimized version of the module. Only the caller knows if the suffix being empty is valid or not. I've checked if this pattern is currently used in the kernel and wasn't able to find anything, so that is good. However, I'm not sure if request_module() should flat-out reject this use. -- Thanks, Petr
Hi Petr, 在 2024/11/18 20:54, Petr Pavlu 写道: > On 11/13/24 03:15, Song Chen wrote: >> 在 2024/11/12 20:56, Petr Pavlu 写道: >>> On 11/10/24 12:42, Song Chen wrote: >>>> Sometimes when kernel calls request_module to load a module >>>> into kernel space, it doesn't pass the module name appropriately, >>>> and request_module doesn't verify it as well. >>>> >>>> As a result, modprobe is invoked anyway and spend a lot of time >>>> searching a nonsense name. >>>> >>>> For example reported from a customer, he runs a user space process >>>> to call ioctl(fd, SIOCGIFINDEX, &ifr), the callstack in kernel is >>>> like that: >>>> dev_ioctl(net/core/dev_iovtl.c) >>>> dev_load >>>> request_module("netdev-%s", name); >>>> or request_module("%s", name); >>>> >>>> However if name of NIC is empty, neither dev_load nor request_module >>>> checks it at the first place, modprobe will search module "netdev-" >>>> in its default path, env path and path configured in etc for nothing, >>>> increase a lot system overhead. >>>> >>>> To address this problem, this patch copies va_list and introduces >>>> a helper is_module_name_valid to verify the parameters validity >>>> one by one, either null or empty. if it fails, no modprobe invoked. >>> >>> I'm not sure if I fully follow why this should be addressed at the >>> request_module() level. If the user repeatedly invokes SIOCGIFINDEX with >>> an empty name and this increases their system load, wouldn't it be >>> better to update the userspace to prevent this non-sense request in the >>> first place? >> >> If the user process knew, it wouldn't make the mistake. > > The user process should be able to check that the ifr_name passed to > SIOCGIFINDEX is empty and avoid the syscall altogether, or am I missing > something? Even if the kernel gets improved in some way to handle this > case better, I would still suggest looking at what the application is > doing and how it ends up making this call. > yes, agree, it's the user space process's fault after all. >> moreover, what >> happened in dev_load was quite confusing, please see the code below: >> >> no_module = !dev; >> if (no_module && capable(CAP_NET_ADMIN)) >> no_module = request_module("netdev-%s", name); >> if (no_module && capable(CAP_SYS_MODULE)) >> request_module("%s", name); >> >> Running the same process, sys admin or root user spends more time than >> normal user, it took a while for us to find the cause, that's why i >> tried to fix it in kernel. >> >> Similarly, if something should be done in the kernel, >>> wouldn't it be more straightforward for dev_ioctl()/dev_load() to check >>> this case? >> >> I thought about it at the beginning, not only dev_ioctl/dev_load but >> also other request_module callers should check this case as well, that >> would be too much effort, then I switched to check it at the beginning >> of request_module which every caller goes through. >> >>> >>> I think the same should in principle apply to other places that might >>> invoke request_module() with "%s" and a bogus value. The callers can >>> appropriately decide if their request makes sense and should be >>> fixed/improved. >>> >> >> Callees are obliged to do fault tolerance for callers, or at least let >> them know what is going on inside, what kinds of mistake they are >> making, there are a lot of such cases in kernel, such as call_modprobe >> in kernel/module/kmod.c, it checks if orig_module_name is NULL. > > Ok, I see the idea behind checking that a value passed to > request_module() to format "%s" is non-NULL. > > I'm however not sure about rejecting empty strings as is also done by > the patch. Consider a call to request_module("mod%s", suffix) where the > suffix could be empty to select the default variant, or non-empty to > select e.g. some optimized version of the module. Only the caller knows > if the suffix being empty is valid or not. > > I've checked if this pattern is currently used in the kernel and wasn't > able to find anything, so that is good. However, I'm not sure if > request_module() should flat-out reject this use. > I accidentally found another problem in request_module when i was testing this patch again, if the caller just passes a empty pointer to request_module, like request_module(NULL), the process will be broken: [ 2.336160] ? asm_exc_page_fault+0x2b/0x30 [ 2.336160] ? __pfx_crc64_rocksoft_notify+0x10/0x10 [ 2.336160] ? vsnprintf+0x5a/0x4f0 [ 2.336160] __request_module+0x93/0x2b0 [ 2.336160] ? __pfx_crc64_rocksoft_notify+0x10/0x10 [ 2.336160] ? notifier_call_chain+0x65/0xd0 [ 2.336160] ? __pfx_crc64_rocksoft_notify+0x10/0x10 [ 2.336160] crypto_probing_notify+0x43/0x60 (please ignore the caller, that is a testing code.) I searched kernel code if this patter exists, and found in __trace_bprintk of kernel/trace/trace_printk.c, it checks fmt at the beginning of the function: va_list ap; if (unlikely(!fmt)) return 0; Therefore, i would like to suggest we should at least add this check in request_module too. In that sense, why don't we do a little further to verify every parameter's validity to provide better fault tolerance, besides, it costs almost nothing. If you like this idea, i will send a v2. Many thanks. Song
© 2016 - 2024 Red Hat, Inc.