This patch series introduces a framework for formally specifying kernel APIs, addressing the long-standing challenge of maintaining stable interfaces between the kernel and user-space programs. As outlined in previous discussions about kernel ABI stability, the lack of machine-readable API specifications has led to inadvertent breakages and inconsistent validation across system calls and IOCTLs.

The framework provides three key components: declarative macros for specifying system call and IOCTL interfaces directly in the kernel source, automated extraction tools for generating machine-readable specifications, and a runtime validation infrastructure accessible through debugfs. By embedding specifications alongside implementation code, we ensure they remain synchronized and enable automated detection of API/ABI changes that could break user-space applications.

This implementation demonstrates the approach with specifications for core system calls (epoll, exec, mlock families) and complex IOCTL interfaces (binder, fwctl). The specifications capture parameter types, validation rules, return values, and error conditions in a structured format that enables both documentation generation and runtime verification. Future work will expand coverage to additional subsystems and integrate with existing testing infrastructure to provide API compatibility guarantees.

To complement the framework, we introduce the 'kapi' tool - a utility for extracting and analyzing kernel API specifications from multiple sources. The tool can extract specifications from kernel source code (parsing KAPI macros), from compiled vmlinux binaries (reading the .kapi_specs ELF section), or from a running kernel via debugfs. It supports multiple output formats (plain text, JSON, RST) to facilitate integration with documentation systems and automated testing workflows.
This tool enables developers to easily inspect API specifications, verify changes across kernel versions, and generate documentation without requiring kernel rebuilds.

Sasha Levin (19):
  kernel/api: introduce kernel API specification framework
  eventpoll: add API specification for epoll_create1
  eventpoll: add API specification for epoll_create
  eventpoll: add API specification for epoll_ctl
  eventpoll: add API specification for epoll_wait
  eventpoll: add API specification for epoll_pwait
  eventpoll: add API specification for epoll_pwait2
  exec: add API specification for execve
  exec: add API specification for execveat
  mm/mlock: add API specification for mlock
  mm/mlock: add API specification for mlock2
  mm/mlock: add API specification for mlockall
  mm/mlock: add API specification for munlock
  mm/mlock: add API specification for munlockall
  kernel/api: add debugfs interface for kernel API specifications
  kernel/api: add IOCTL specification infrastructure
  fwctl: add detailed IOCTL API specifications
  binder: add detailed IOCTL API specifications
  tools/kapi: Add kernel API specification extraction tool

 Documentation/admin-guide/kernel-api-spec.rst |  699 +++++++++
 MAINTAINERS                                   |    9 +
 arch/um/kernel/dyn.lds.S                      |    3 +
 arch/um/kernel/uml.lds.S                      |    3 +
 arch/x86/kernel/vmlinux.lds.S                 |    3 +
 drivers/android/binder.c                      |  758 ++++++++++
 drivers/fwctl/main.c                          |  295 +++-
 fs/eventpoll.c                                | 1056 ++++++++++++++
 fs/exec.c                                     |  463 ++++++
 include/asm-generic/vmlinux.lds.h             |   20 +
 include/linux/ioctl_api_spec.h                |  540 +++++++
 include/linux/kernel_api_spec.h               |  942 ++++++++++++
 include/linux/syscall_api_spec.h              |  341 +++++
 include/linux/syscalls.h                      |    1 +
 init/Kconfig                                  |    2 +
 kernel/Makefile                               |    1 +
 kernel/api/Kconfig                            |   55 +
 kernel/api/Makefile                           |   13 +
 kernel/api/ioctl_validation.c                 |  360 +++++
 kernel/api/kapi_debugfs.c                     |  340 +++++
 kernel/api/kernel_api_spec.c                  | 1257 +++++++++++++++++
 mm/mlock.c                                    |  646 +++++++++
 tools/kapi/.gitignore                         |    4 +
 tools/kapi/Cargo.toml                         |   19 +
 tools/kapi/src/extractor/debugfs.rs           |  204 +++
 tools/kapi/src/extractor/mod.rs               |   95 ++
 tools/kapi/src/extractor/source_parser.rs     |  488 +++++++
 .../src/extractor/vmlinux/binary_utils.rs     |  130 ++
 tools/kapi/src/extractor/vmlinux/mod.rs       |  372 +++++
 tools/kapi/src/formatter/json.rs              |  170 +++
 tools/kapi/src/formatter/mod.rs               |   68 +
 tools/kapi/src/formatter/plain.rs             |   99 ++
 tools/kapi/src/formatter/rst.rs               |  144 ++
 tools/kapi/src/main.rs                        |  121 ++
 34 files changed, 9719 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/admin-guide/kernel-api-spec.rst
 create mode 100644 include/linux/ioctl_api_spec.h
 create mode 100644 include/linux/kernel_api_spec.h
 create mode 100644 include/linux/syscall_api_spec.h
 create mode 100644 kernel/api/Kconfig
 create mode 100644 kernel/api/Makefile
 create mode 100644 kernel/api/ioctl_validation.c
 create mode 100644 kernel/api/kapi_debugfs.c
 create mode 100644 kernel/api/kernel_api_spec.c
 create mode 100644 tools/kapi/.gitignore
 create mode 100644 tools/kapi/Cargo.toml
 create mode 100644 tools/kapi/src/extractor/debugfs.rs
 create mode 100644 tools/kapi/src/extractor/mod.rs
 create mode 100644 tools/kapi/src/extractor/source_parser.rs
 create mode 100644 tools/kapi/src/extractor/vmlinux/binary_utils.rs
 create mode 100644 tools/kapi/src/extractor/vmlinux/mod.rs
 create mode 100644 tools/kapi/src/formatter/json.rs
 create mode 100644 tools/kapi/src/formatter/mod.rs
 create mode 100644 tools/kapi/src/formatter/plain.rs
 create mode 100644 tools/kapi/src/formatter/rst.rs
 create mode 100644 tools/kapi/src/main.rs

-- 
2.39.5
On Sat, 14 Jun 2025 09:48:39 -0400 Sasha Levin <sashal@kernel.org> wrote:

> This patch series introduces a framework for formally specifying kernel
> APIs, addressing the long-standing challenge of maintaining stable
> interfaces between the kernel and user-space programs. As outlined in
> previous discussions about kernel ABI stability, the lack of
> machine-readable API specifications has led to inadvertent breakages and
> inconsistent validation across system calls and IOCTLs.

Ugh, this looks horrid. It's going to be worse than things like doxygen for getting out of step with the actual code, and grep searches are going to hit the comment blocks.

David
On Sat, Jun 14, 2025 at 09:48:39AM -0400, Sasha Levin wrote:
> This patch series introduces a framework for formally specifying kernel
> APIs, addressing the long-standing challenge of maintaining stable
> interfaces between the kernel and user-space programs. As outlined in
> previous discussions about kernel ABI stability, the lack of
> machine-readable API specifications has led to inadvertent breakages and
> inconsistent validation across system calls and IOCTLs.

I'd much prefer this be more attached to the code in question; otherwise we've got two things to update when changes happen. (Well, three, since kernel-doc already needs updating too.)

Can't we collect error codes programmatically through control-flow analysis? Argument mapping is already present in the SYSCALL macros, etc. Let's not repeat this info.

-Kees

-- 
Kees Cook
On Wed, Jun 18, 2025 at 02:29:37PM -0700, Kees Cook wrote:
>On Sat, Jun 14, 2025 at 09:48:39AM -0400, Sasha Levin wrote:
>> This patch series introduces a framework for formally specifying kernel
>> APIs, addressing the long-standing challenge of maintaining stable
>> interfaces between the kernel and user-space programs.
>
>I'd much prefer this be more attached to the code in question; otherwise
>we've got two things to update when changes happen. (Well, three, since
>kernel-doc already needs updating too.)
>
>Can't we collect error codes programmatically through control-flow
>analysis? Argument mapping is already present in the SYSCALL macros,

I'm not sure what you meant by the control-flow analysis part: we have code to verify that the return value from the macro matches one of the ones defined in the spec.

>etc. Let's not repeat this info.

I tried to come up with a way to get rid of the SYSCALL_DEFINEx() macro that follows right after the spec. I agree that it's duplication, but my macro-foo is too weak to eliminate that SYSCALL_DEFINE() call. Suggestions are more than welcome here: I suspect this might require a bigger change in the code, but I'm still trying to figure it out.

-- 
Thanks,
Sasha
Nice!

A bag of assorted comments:

1. I share the same concern about duplicating info. If there is a lot of duplication it may lead to failure of the whole effort, since folks won't update these and/or they will get out of sync. If a syscall arg is e.g. umode_t, we already know that it's an integer of that type, and that it's an input arg. In syzkaller we have a Clang tool:
https://github.com/google/syzkaller/blob/master/tools/syz-declextract/clangtool/declextract.cpp
that extracts a bunch of interfaces automatically:
https://raw.githubusercontent.com/google/syzkaller/refs/heads/master/sys/linux/auto.txt
Though, obviously, that won't have human-readable string descriptions, can't be used as a source of truth, and may be challenging to integrate into the kernel build process. Still, extracting some of that info automatically may be nice.

2. Does this framework ensure that the specified info about args is correct? E.g. that the number of syscall args and their types match the actual ones? If such things are not tested/validated during the build, I'm afraid they will be riddled with bugs over time.

3. To reduce duplication we could use more type information; e.g. I was always frustrated that close is just:

SYSCALL_DEFINE1(close, unsigned int, fd)

whereas if we would do:

typedef int fd_t;
SYSCALL_DEFINE1(close, fd_t, fd)

then all semantic info about the arg is already in the code.

4. If we specify e.g. error return values here with descriptions, can that be used as the source of truth to generate man pages? That would eliminate some duplication.

5. We have a long-standing dream that kernel developers add fuzzing descriptions along with new kernel interfaces. So far we have got very few contributions to syzkaller from kernel developers. This framework can serve as the way to do it, which is nice.

6. What's the goal of validation of the input arguments? Kernel code must do this validation anyway, right? Any non-trivial validation is hard: e.g. even for open, the validation function for the file name would need to have access to the flags and check file presence for some flag combinations. That may add a significant amount of non-trivial code that duplicates the main syscall logic, and that logic may also have bugs and memory leaks.

7. One of the most useful uses of this framework that I see is testing kernel behavior correctness. I wonder what properties we can test with these descriptions, and whether we can add more useful info for that purpose. Argument validation does not help here (it catches userspace bugs at best). Return values potentially may be useful: e.g. if we see a return value that's not specified, potentially it's a kernel bug. Side-effect specifications potentially can be used to detect logical kernel bugs, e.g. if a syscall does not claim to change fs state, but it does, it's a bug. Though, a more useful check would be failure/concurrency atomicity. Namely, if a syscall claims to not alter state on failure, it shouldn't do so. Concurrency atomicity means linearizability of concurrent syscalls (side effects match one of the two possible orders of the syscalls). But for these we would need to add additional flags to the descriptions that say that a syscall supports failure/concurrency atomicity.

8. It would be useful to have a mapping of file_operations to actual files in the fs. Otherwise the exposed info is not very actionable, since there is no way to understand what actual file/fd the ioctls can be applied to.

9. I see that syscalls and ioctls say:

KAPI_CONTEXT(KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE)

Can't we make this implicit? Are there any other options? Similarly, an ioctl description says it releases a mutex (.released = true,), but all ioctls/syscalls must release all acquired mutexes, no? Generally, the less verbose the descriptions are, the higher the chances of their survival. +Marco also works on static compiler-enforced lock-checking annotations; I wonder if they can be used to describe this in a more useful way.
On Mon, Jun 23, 2025 at 03:28:03PM +0200, Dmitry Vyukov wrote:
>Nice!
>
>A bag of assorted comments:
>
>1. I share the same concern about duplicating info.
>If there is a lot of duplication it may lead to failure of the whole effort,
>since folks won't update these and/or they will get out of sync.
>If a syscall arg is e.g. umode_t, we already know that it's an integer
>of that type, and that it's an input arg.
>In syzkaller we have a Clang tool:
>https://github.com/google/syzkaller/blob/master/tools/syz-declextract/clangtool/declextract.cpp
>that extracts a bunch of interfaces automatically:
>https://raw.githubusercontent.com/google/syzkaller/refs/heads/master/sys/linux/auto.txt
>Though, obviously, that won't have human-readable string descriptions,
>can't be used as a source of truth, and may be challenging to integrate
>into the kernel build process.
>Still, extracting some of that info automatically may be nice.
>
>2. Does this framework ensure that the specified info about args is correct?
>E.g. that the number of syscall args and their types match the actual ones?
>If such things are not tested/validated during the build, I'm afraid they
>will be riddled with bugs over time.

This is an answer for both (1) and (2): yes! In my mind, whatever we spec out needs to be enforced, because otherwise it will go out of sync. In this RFC, take a look at the code guarded by CONFIG_KAPI_RUNTIME_CHECKS: the idea is that we can enable runtime checks that verify the things you've mentioned above (and more).

>3. To reduce duplication we could use more type information; e.g. I was
>always frustrated that close is just:
>
>SYSCALL_DEFINE1(close, unsigned int, fd)
>
>whereas if we would do:
>
>typedef int fd_t;
>SYSCALL_DEFINE1(close, fd_t, fd)
>
>then all semantic info about the arg is already in the code.

Yup. It would also be great if we could completely drop the SYSCALL_DEFINE() part and have it be automatically generated from the spec itself, but I couldn't wrap my head around doing that in a C macro just yet.

>4. If we specify e.g. error return values here with descriptions,
>can that be used as the source of truth to generate man pages?
>That would eliminate some duplication.

Ideally yes. One of the formatters that the kapi tool has (see the last patch in this series) is the RST formatter, which could be used to generate documentation similar to man pages.

>5. We have a long-standing dream that kernel developers add fuzzing
>descriptions along with new kernel interfaces. So far we have got very few
>contributions to syzkaller from kernel developers. This framework can
>serve as the way to do it, which is nice.

This was one of the main use cases I had in mind. In return, we can get back from syzkaller a body of automatically generated tests that we can embed into our testing CIs.

>6. What's the goal of validation of the input arguments?
>Kernel code must do this validation anyway, right?
>Any non-trivial validation is hard: e.g. even for open, the validation
>function for the file name would need to have access to the flags and
>check file presence for some flag combinations. That may add a significant
>amount of non-trivial code that duplicates the main syscall logic, and
>that logic may also have bugs and memory leaks.

Mostly to catch divergence from the spec: think of a scenario where someone added a new param/flag/etc but forgot to update the spec - this will help catch it. Ideally it would also prevent some of the issues that syzkaller is so good at finding :)

>7. One of the most useful uses of this framework that I see is testing
>kernel behavior correctness. I wonder what properties we can test with
>these descriptions, and whether we can add more useful info for that
>purpose. Argument validation does not help here (it catches userspace
>bugs at best). Return values potentially may be useful: e.g. if we see
>a return value that's not specified, potentially it's a kernel bug.
>Side-effect specifications potentially can be used to detect logical
>kernel bugs, e.g. if a syscall does not claim to change fs state, but it
>does, it's a bug. Though, a more useful check would be failure/concurrency
>atomicity. Namely, if a syscall claims to not alter state on failure, it
>shouldn't do so. Concurrency atomicity means linearizability of concurrent
>syscalls (side effects match one of the two possible orders of the
>syscalls). But for these we would need to add additional flags to the
>descriptions that say that a syscall supports failure/concurrency atomicity.

I agree: being able to fuzz for more than just kernel splats will be great.

>8. It would be useful to have a mapping of file_operations to actual files
>in the fs. Otherwise the exposed info is not very actionable, since there
>is no way to understand what actual file/fd the ioctls can be applied to.

Ack. The ioctl() part is a bit hand-wavy right now, and at the very least we'd need to spec out ioctl() itself. It's more of a demonstration of how it could look rather than being too useful at this point.

>9. I see that syscalls and ioctls say:
>KAPI_CONTEXT(KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE)
>Can't we make this implicit? Are there any other options?

Maybe? I wasn't sure how we'd describe something like getpid(), which isn't supposed to sleep.

>Similarly, an ioctl description says it releases a mutex (.released = true,),
>but all ioctls/syscalls must release all acquired mutexes, no?
>Generally, the less verbose the descriptions are, the higher the chances
>of their survival.
>+Marco also works on static compiler-enforced lock-checking annotations;
>I wonder if they can be used to describe this in a more useful way.

I was thinking about stuff like futex or flock, which can return with a lock back to userspace.

-- 
Thanks,
Sasha
On Tue, 24 Jun 2025 at 22:04, Sasha Levin <sashal@kernel.org> wrote:
> >9. I see that syscalls and ioctls say:
> >KAPI_CONTEXT(KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE)
> >Can't we make this implicit? Are there any other options?
>
> Maybe? I wasn't sure how we'd describe something like getpid() which
> isn't supposed to sleep.
>
> >Similarly, an ioctl description says it releases a mutex (.released = true,),
> >but all ioctls/syscalls must release all acquired mutexes, no?
> >Generally, the less verbose the descriptions are, the higher the chances
> >of their survival.
> >+Marco also works on static compiler-enforced lock-checking annotations;
> >I wonder if they can be used to describe this in a more useful way.
>
> I was thinking about stuff like futex or flock which can return with a
> lock back to userspace.

I see, this makes sense. Then I would go with explicitly specifying the rare, uncommon cases instead, and require that the 99% of common cases be the default that does not require saying anything.

E.g. KAPI_CTX_NON_SLEEPABLE, .not_released = true.

KAPI_CTX_NON_SLEEPABLE looks useful, since it allows easy validation: set a current flag, and BUG on any attempt to sleep when the flag is set (lockdep probably already has the required pieces for this).
On Wed, Jun 25, 2025 at 10:56:04AM +0200, Dmitry Vyukov wrote:
>On Tue, 24 Jun 2025 at 22:04, Sasha Levin <sashal@kernel.org> wrote:
>> >9. I see that syscalls and ioctls say:
>> >KAPI_CONTEXT(KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE)
>> >Can't we make this implicit? Are there any other options?
>>
>> Maybe? I wasn't sure how we'd describe something like getpid() which
>> isn't supposed to sleep.
>>
>> >Similarly, an ioctl description says it releases a mutex (.released = true,),
>> >but all ioctls/syscalls must release all acquired mutexes, no?
>> >Generally, the less verbose the descriptions are, the higher the chances
>> >of their survival.
>>
>> I was thinking about stuff like futex or flock which can return with a
>> lock back to userspace.
>
>I see, this makes sense. Then I would go with explicitly specifying the
>rare, uncommon cases instead, and require that the 99% of common cases be
>the default that does not require saying anything.
>
>E.g. KAPI_CTX_NON_SLEEPABLE, .not_released = true.
>
>KAPI_CTX_NON_SLEEPABLE looks useful, since it allows easy validation:
>set a current flag, and BUG on any attempt to sleep when the flag is set
>(lockdep probably already has the required pieces for this).

Yup, that makes sense. One of the reasons I wrapped all the field assignments in macros is that we can easily customize them based on usage, so instead of:

#define KAPI_LOCK_ACQUIRED \
	.acquired = true,

#define KAPI_LOCK_RELEASED \
	.released = true,

we can add:

#define KAPI_LOCK_USED \
	.acquired = true, \
	.released = true,

-- 
Thanks,
Sasha
On Tue, 24 Jun 2025 at 22:04, Sasha Levin <sashal@kernel.org> wrote:
> >6. What's the goal of validation of the input arguments?
> >Kernel code must do this validation anyway, right?
> >Any non-trivial validation is hard: e.g. even for open, the validation
> >function for the file name would need to have access to the flags and
> >check file presence for some flag combinations. That may add a significant
> >amount of non-trivial code that duplicates the main syscall logic, and
> >that logic may also have bugs and memory leaks.
>
> Mostly to catch divergence from the spec: think of a scenario where
> someone added a new param/flag/etc but forgot to update the spec - this
> will help catch it.

How exactly is this supposed to work? Even if we run with a unit test suite, a test suite may include some incorrect inputs to check for error conditions. The framework will report violations on these incorrect inputs. These are not bugs in the API specifications, nor in the test suite (read: false positives).
On Wed, Jun 25, 2025 at 10:52:46AM +0200, Dmitry Vyukov wrote:
>On Tue, 24 Jun 2025 at 22:04, Sasha Levin <sashal@kernel.org> wrote:
>
>> >6. What's the goal of validation of the input arguments?
>> >Kernel code must do this validation anyway, right?
>>
>> Mostly to catch divergence from the spec: think of a scenario where
>> someone added a new param/flag/etc but forgot to update the spec - this
>> will help catch it.
>
>How exactly is this supposed to work?
>Even if we run with a unit test suite, a test suite may include some
>incorrect inputs to check for error conditions. The framework will
>report violations on these incorrect inputs. These are not bugs in the
>API specifications, nor in the test suite (read: false positives).

Right now it would be something along the lines of a test checking for an expected failure message in dmesg, something like:

https://github.com/linux-test-project/ltp/blob/0c99c7915f029d32de893b15b0a213ff3de210af/testcases/commands/sysctl/sysctl02.sh#L67

I'm not opposed to coming up with a better story...

-- 
Thanks,
Sasha
On Wed, 25 Jun 2025 at 17:55, Sasha Levin <sashal@kernel.org> wrote:
>
> On Wed, Jun 25, 2025 at 10:52:46AM +0200, Dmitry Vyukov wrote:
> >On Tue, 24 Jun 2025 at 22:04, Sasha Levin <sashal@kernel.org> wrote:
> >
> >> >6. What's the goal of validation of the input arguments?
> >> >Kernel code must do this validation anyway, right?
> >>
> >> Mostly to catch divergence from the spec: think of a scenario where
> >> someone added a new param/flag/etc but forgot to update the spec - this
> >> will help catch it.
> >
> >How exactly is this supposed to work?
> >Even if we run with a unit test suite, a test suite may include some
> >incorrect inputs to check for error conditions. The framework will
> >report violations on these incorrect inputs. These are not bugs in the
> >API specifications, nor in the test suite (read: false positives).
>
> Right now it would be something along the lines of a test checking for
> an expected failure message in dmesg, something like:
>
> https://github.com/linux-test-project/ltp/blob/0c99c7915f029d32de893b15b0a213ff3de210af/testcases/commands/sysctl/sysctl02.sh#L67
>
> I'm not opposed to coming up with a better story...

Oh, you mean special tests for this framework (rather than the existing tests). I don't think this is going to work in practice. Besides writing all these specifications, we would also need to write dozens of tests per specification (e.g. for each fd arg one needs at least three tests: -1, a valid fd, an invalid fd; an enum may need five various inputs; let alone netlink specifications).
On Thu, 26 Jun 2025 at 10:32, Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Wed, 25 Jun 2025 at 17:55, Sasha Levin <sashal@kernel.org> wrote:
> >
> > On Wed, Jun 25, 2025 at 10:52:46AM +0200, Dmitry Vyukov wrote:
> > >On Tue, 24 Jun 2025 at 22:04, Sasha Levin <sashal@kernel.org> wrote:
> > >
> > >> >6. What's the goal of validation of the input arguments?
> > >> >Kernel code must do this validation anyway, right?
> > >>
> > >> Mostly to catch divergence from the spec: think of a scenario where
> > >> someone added a new param/flag/etc but forgot to update the spec -
> > >> this will help catch it.
> > >
> > >How exactly is this supposed to work?
> > >Even if we run with a unit test suite, a test suite may include some
> > >incorrect inputs to check for error conditions. The framework will
> > >report violations on these incorrect inputs. These are not bugs in the
> > >API specifications, nor in the test suite (read: false positives).
> >
> > Right now it would be something along the lines of a test checking for
> > an expected failure message in dmesg, something like:
> >
> > https://github.com/linux-test-project/ltp/blob/0c99c7915f029d32de893b15b0a213ff3de210af/testcases/commands/sysctl/sysctl02.sh#L67
> >
> > I'm not opposed to coming up with a better story...

If the goal of validation is just indirectly validating the correctness of the specification itself, then I would look for other ways of validating the spec. Either remove the duplication between the specification and the actual code (i.e. generate one from SYSCALL_DEFINE, or the other way around), so the spec is correct by construction. Or cross-validate it with info automatically extracted from the source (using clang/dwarf/pahole). This would be more scalable (O(1) work, rather than thousands more manually written tests).

> Oh, you mean special tests for this framework (rather than the existing tests).
> I don't think this is going to work in practice. Besides writing all
> these specifications, we would also need to write dozens of tests per
> specification (e.g. for each fd arg one needs at least three tests: -1,
> a valid fd, an invalid fd; an enum may need five various inputs; let
> alone netlink specifications).
On Thu, Jun 26, 2025 at 10:37:33AM +0200, Dmitry Vyukov wrote:
>On Thu, 26 Jun 2025 at 10:32, Dmitry Vyukov <dvyukov@google.com> wrote:
>>
>> On Wed, 25 Jun 2025 at 17:55, Sasha Levin <sashal@kernel.org> wrote:
>> >
>> > On Wed, Jun 25, 2025 at 10:52:46AM +0200, Dmitry Vyukov wrote:
>> > >How exactly is this supposed to work?
>> > >Even if we run with a unit test suite, a test suite may include some
>> > >incorrect inputs to check for error conditions. The framework will
>> > >report violations on these incorrect inputs. These are not bugs in the
>> > >API specifications, nor in the test suite (read: false positives).
>> >
>> > Right now it would be something along the lines of a test checking for
>> > an expected failure message in dmesg, something like:
>> >
>> > https://github.com/linux-test-project/ltp/blob/0c99c7915f029d32de893b15b0a213ff3de210af/testcases/commands/sysctl/sysctl02.sh#L67
>> >
>> > I'm not opposed to coming up with a better story...
>
>If the goal of validation is just indirectly validating the correctness of
>the specification itself, then I would look for other ways of validating
>the spec. Either remove the duplication between the specification and the
>actual code (i.e. generate one from SYSCALL_DEFINE, or the other way
>around), so the spec is correct by construction. Or cross-validate it with
>info automatically extracted from the source (using clang/dwarf/pahole).
>This would be more scalable (O(1) work, rather than thousands more
>manually written tests).
>
>> Oh, you mean special tests for this framework (rather than the existing
>> tests). I don't think this is going to work in practice. Besides writing
>> all these specifications, we would also need to write dozens of tests per
>> specification (e.g. for each fd arg one needs at least three tests: -1,
>> a valid fd, an invalid fd; an enum may need five various inputs; let
>> alone netlink specifications).

I didn't mean tests just for this framework: being able to specify the APIs in a machine-readable format will enable us to automatically generate exhaustive tests for each such API. I've been playing with the kapi tool (see the last patch), which already supports different formatters. Right now it outputs human-readable output, but I have proof-of-concept code that outputs testcases for specced APIs.

The dream here is to be able to automatically generate hundreds or thousands of tests for each API in an automated fashion, and verify the results by:

1. Simply checking the expected return value.
2. Checking that the actual action happened (i.e. we called close(fd); verify that `fd` is really closed).
3. Checking for side effects (i.e. close(fd) isn't supposed to allocate memory; verify that it didn't).
4. Code coverage: our tests are supposed to cover 100% of the code in that API's call chain; is there code that didn't run (missing/incorrect specs)?

-- 
Thanks,
Sasha
On Thu, 26 Jun 2025 at 18:23, Sasha Levin <sashal@kernel.org> wrote:
[...]
> The dream here is to be able to automatically generate
> hundreds/thousands of tests for each API in an automated fashion, and
> verify the results with:
>
> 1. Simply checking expected return value.
>
> 2. Checking that the actual action happened (i.e. we called close(fd),
> verify that `fd` is really closed).
>
> 3. Check for side effects (i.e. close(fd) isn't supposed to allocate
> memory - verify that it didn't allocate memory).
>
> 4. Code coverage: our tests are supposed to cover 100% of the code in
> that APIs call chain, do we have code that didn't run (missing/incorrect
> specs).

This is all good. I was asking about the argument verification part of
the framework. Is it required for any of this? How?
On Fri, Jun 27, 2025 at 08:23:41AM +0200, Dmitry Vyukov wrote:
[...]
>This is all good. I was asking about the argument verification part of
>the framework. Is it required for any of this? How?

Specifications without enforcement are just documentation :)

In my mind, there are a few reasons we want this:

1. For folks coding against the kernel, it's a way for them to know
that the code they're writing fits within the spec of the kernel's API.

2. Enforcement around kernel changes: think of a scenario where a flag
is added to a syscall - the author of that change will have to also
update the spec, because otherwise the verification layer will complain
about the new flag. This helps prevent divergence between the code and
the spec.

3. Extra layer of security: we can choose to enable this as an
additional layer to protect us from missing checks in our
userspace-facing API.

-- 
Thanks,
Sasha
On Mon, 30 Jun 2025 at 16:27, Sasha Levin <sashal@kernel.org> wrote:
[...]
> Specifications without enforcement are just documentation :)
>
> In my mind, there are a few reasons we want this:
>
> 1. For folks coding against the kernel, it's a way for them to know that
> the code they're writing fits within the spec of the kernel's API.

How is this different from just running the kernel normally? Running
the kernel normally is simpler, faster, and more precise.

> 2. Enforcement around kernel changes: think of a scenario where a flag
> is added to a syscall - the author of that change will have to also
> update the spec because otherwise the verification layer will complain
> about the new flag. This helps prevent divergence between the code and
> the spec.

It may be more useful to invoke verification but not return early on
verification errors: instead, memorize the result and still always run
the actual syscall normally. Then, if verification produced an error
but the actual syscall has not returned the same error, WARN loudly.

This should provide the same value, but it does not rely on correctly
marked, manually written tests to test the specification. It will work
automatically with any fuzzing/randomized testing, which I assume will
be more valuable for specification testing.

But then, as Cyril mentioned, this verification layer does not really
need to live in the kernel. Once the kernel has exported the
specification in a machine-usable form, the same verification can be
done in user-space. Which is always a good idea.

> 3. Extra layer of security: we can choose to enable this as an
> additional layer to protect us from missing checks in our userspace
> facing API.

This will have additional risks, and performance overhead. Such
mitigations are usually assessed with the % of past CVEs they could
have prevented. That would allow us to assess cost/benefit.
Intuitively, this does not look worth doing to me.
Hi!

[...]
> How exactly is this supposed to work?
> Even if we run with a unit test suite, a test suite may include some
> incorrect inputs to check for error conditions. The framework will
> report violations on these incorrect inputs. These are not bugs in the
> API specifications, nor in the test suite (read false positives).

This is what I tried to respond to, but I guess that it didn't go well.
Let me try to reiterate.

In my opinion you shouldn't really put this part into the kernel, but
rather include more type and semantic information in the data, so that
tests can be generated and executed in userspace.

I do not see how we can validate that we get proper errors from a
syscall if one of the input parameters is invalid, other than by
generating and running a C test in userspace. For that part the syscall
description does not need to be built into the kernel either; it may
just be a build artifact that gets installed with the kernel image.

-- 
Cyril Hrubis
chrubis@suse.cz
On Tue, 24 Jun 2025 at 22:04, Sasha Levin <sashal@kernel.org> wrote:
> >3. To reduce duplication we could use more type information, e.g. I was always
> >frustrated that close is just:
> >
> >SYSCALL_DEFINE1(close, unsigned int, fd)
> >
> >whereas if we would do:
> >
> >typedef int fd_t;
> >SYSCALL_DEFINE1(close, fd_t, fd)
> >
> >then all semantic info about the arg is already in the code.
>
> Yup. It would also be great if we could completely drop the SYSCALL_DEFINE()
> part and have it be automatically generated by the spec itself, but I
> couldn't wrap my head around doing this in a C macro just yet.

At some point I was looking at the boost.pp library as a source of info
on how to do things. It provides a set of containers and algorithms on
them:

https://www.boost.org/doc/libs/latest/libs/preprocessor/doc/index.html

Sequences may be the most appealing b/c they support a variable number
of elements, and don't need the number of elements specified
explicitly:

https://www.boost.org/doc/libs/latest/libs/preprocessor/doc/data/sequences.html

A sequence then allows generating multiple things from it using a
foreach over its elements.
Hi!

> 6. What's the goal of validation of the input arguments?
> Kernel code must do this validation anyway, right.
> Any non-trivial validation is hard, e.g. even for open the validation
> function for file name would need to have access to flags and check
> file presence for some flags combinations. That may add a significant
> amount of non-trivial code that duplicates main syscall logic, and
> that logic may also have bugs and memory leaks.

I was looking at that part and thinking that we could generate (at
least some) automated conformance tests based on this information. We
could make sure that invalid parameters are properly rejected. For
open(), some combinations would be difficult to model though, e.g. for
O_DIRECTORY the pathname is supposed to be a path to a directory and
also the file descriptor returned has different properties. Also
O_CREAT requires a third parameter and changes which kinds of filepaths
are invalid. Demultiplexing syscalls like this is going to be difficult
to get right.

As for testing purposes, most of the time it would be enough just to
say something like "this parameter is an existing file". If we have
this information in a machine parseable format we can generate
automatic tests for various error conditions, e.g. ELOOP, EACCES,
ENAMETOOLONG, ENOENT, ...

For paths we could have something like:

file:existing
file:nonexisting
file:replaced|nonexisting
file:nonexisting|existing
dir:existing
dir:nonexisting

Then for the open() syscall we can do:

flags=O_DIRECTORY path=dir:existing
flags=O_CREAT path=file:nonexisting|existing
flags=O_CREAT|O_EXCL path=file:nonexisting
...

You may wonder if such tests are useful at all, since quite a few of
these errors are checked for and generated from common functions. There
are at least two cases I can think of. First of all, it makes sure that
errors are stable when a particular function/subsystem is rewritten.
And it can also make sure that errors are consistent across different
implementations of the same functionality, e.g. filesystems. I remember
that some of the less used FUSE filesystems returned puzzling errors in
certain corner cases.

Maybe it would be more useful to steer this towards a system that
better annotates the types for the syscall parameters and return
values. Something that would be an extension to the C types, with a
description of how a particular string or integer is interpreted.

> Side-effects specification potentially can be used to detect logical kernel bugs,
> e.g. if a syscall does not claim to change fs state, but it does, it's a bug.
> Though, a more useful check should be failure/concurrency atomicity.
> Namely, if a syscall claims to not alter state on failure, it shouldn't do so.
> Concurrency atomicity means linearizability of concurrent syscalls
> (side-effects match one of 2 possible orders of syscalls).
> But for these we would need to add additional flags to the descriptions
> that say that a syscall supports failure/concurrency atomicity.
>
> 8. It would be useful to have a mapping of file_operations to actual files in fs.
> Otherwise the exposed info is not very actionable, since there is no way to understand
> what actual file/fd the ioctl's can be applied to.

+1 There are many different kinds of file descriptors and they differ
wildly in what operations they support.

Maybe we would need a subclass for a file descriptor, something like:

fd:file
fd:timerfd
fd:pidfs
...

-- 
Cyril Hrubis
chrubis@suse.cz
On Tue, 24 Jun 2025 at 16:05, Cyril Hrubis <chrubis@suse.cz> wrote:
[...]
> You may wonder if such kind of tests are useful at all, since quite a
> few of these errors are checked for and generated from common
> functions. There are at least two cases I can think of. First of all,
> it makes sure that errors are stable when a particular
> function/subsystem is rewritten. And it can also make sure that errors
> are consistent across different implementations of the same
> functionality, e.g. filesystems. I remember that some of the less used
> FUSE filesystems returned puzzling errors in certain corner cases.

I am not following how this is related to the validation part of the
patch series. Can you elaborate?

Generation of such conformance tests would need info about the
parameter types and their semantic meaning, not the validation part.
The conformance tests should test the actual syscall checking of
arguments, not the validation added by this framework.

> Maybe it would be more useful to steer this towards a system that
> annotates better the types for the syscall parameters and return values.
> Something that would be an extension to a C types with a description on
> how particular string or integer is interpreted.

+1

> > 8. It would be useful to have a mapping of file_operations to actual files in fs.
> > Otherwise the exposed info is not very actionable, since there is no way to understand
> > what actual file/fd the ioctl's can be applied to.
>
> +1 There are many different kinds of file descriptors and they differ
> wildly in what operations they support.
>
> Maybe we would need a subclass for a file descriptor, something as:
>
> fd:file
> fd:timerfd
> fd:pidfs

FWIW syzkaller has this for the purpose of automatic generation of test
inputs.
Hi!

> > You may wonder if such kind of tests are useful at all, since quite a
> > few of these errors are checked for and generated from common
> > functions. There are at least two cases I can think of. First of all,
> > it makes sure that errors are stable when a particular
> > function/subsystem is rewritten. And it can also make sure that errors
> > are consistent across different implementations of the same
> > functionality, e.g. filesystems. I remember that some of the less used
> > FUSE filesystems returned puzzling errors in certain corner cases.
>
> I am not following how this is related to the validation part of the
> patch series. Can you elaborate?

This part is me trying to explain that generated conformance tests
would be useful for development as well.

> Generation of such conformance tests would need info about the
> parameter types and their semantic meaning, not the validation part.
> The conformance tests should test the actual syscall checking of
> arguments, not the validation added by this framework.

Exactly. I do not think that it makes sense to encode the argument
ranges and the functions to generate valid syscall parameters into the
kernel. Rather, the information should be encoded in the extended
types; if we do that well enough, we can generate combinations of
different valid and invalid parameters for the tests based on that.

-- 
Cyril Hrubis
chrubis@suse.cz