This patch series introduces a framework for formally specifying kernel APIs, addressing the long-standing challenge of maintaining stable interfaces between the kernel and user-space programs. As outlined in previous discussions about kernel ABI stability, the lack of machine-readable API specifications has led to inadvertent breakages and inconsistent validation across system calls and IOCTLs.

The framework provides three key components: declarative macros for specifying system call and IOCTL interfaces directly in the kernel source, automated extraction tools for generating machine-readable specifications, and a runtime validation infrastructure accessible through debugfs. By embedding specifications alongside implementation code, we ensure they remain synchronized and enable automated detection of API/ABI changes that could break user-space applications.

This implementation demonstrates the approach with specifications for core system calls (epoll, exec, mlock families) and complex IOCTL interfaces (binder, fwctl). The specifications capture parameter types, validation rules, return values, and error conditions in a structured format that enables both documentation generation and runtime verification. Future work will expand coverage to additional subsystems and integrate with existing testing infrastructure to provide API compatibility guarantees.

To complement the framework, we introduce the 'kapi' tool - a utility for extracting and analyzing kernel API specifications from multiple sources. The tool can extract specifications from kernel source code (parsing KAPI macros), from compiled vmlinux binaries (reading the .kapi_specs ELF section), or from a running kernel via debugfs. It supports multiple output formats (plain text, JSON, RST) to facilitate integration with documentation systems and automated testing workflows.
This tool enables developers to easily inspect API specifications, verify changes across kernel versions, and generate documentation without requiring kernel rebuilds.

Sasha Levin (19):
  kernel/api: introduce kernel API specification framework
  eventpoll: add API specification for epoll_create1
  eventpoll: add API specification for epoll_create
  eventpoll: add API specification for epoll_ctl
  eventpoll: add API specification for epoll_wait
  eventpoll: add API specification for epoll_pwait
  eventpoll: add API specification for epoll_pwait2
  exec: add API specification for execve
  exec: add API specification for execveat
  mm/mlock: add API specification for mlock
  mm/mlock: add API specification for mlock2
  mm/mlock: add API specification for mlockall
  mm/mlock: add API specification for munlock
  mm/mlock: add API specification for munlockall
  kernel/api: add debugfs interface for kernel API specifications
  kernel/api: add IOCTL specification infrastructure
  fwctl: add detailed IOCTL API specifications
  binder: add detailed IOCTL API specifications
  tools/kapi: Add kernel API specification extraction tool

 Documentation/admin-guide/kernel-api-spec.rst |  699 +++++++++
 MAINTAINERS                                   |    9 +
 arch/um/kernel/dyn.lds.S                      |    3 +
 arch/um/kernel/uml.lds.S                      |    3 +
 arch/x86/kernel/vmlinux.lds.S                 |    3 +
 drivers/android/binder.c                      |  758 ++++++++++
 drivers/fwctl/main.c                          |  295 +++-
 fs/eventpoll.c                                | 1056 ++++++++++++++
 fs/exec.c                                     |  463 ++++++
 include/asm-generic/vmlinux.lds.h             |   20 +
 include/linux/ioctl_api_spec.h                |  540 +++++++
 include/linux/kernel_api_spec.h               |  942 ++++++++++++
 include/linux/syscall_api_spec.h              |  341 +++++
 include/linux/syscalls.h                      |    1 +
 init/Kconfig                                  |    2 +
 kernel/Makefile                               |    1 +
 kernel/api/Kconfig                            |   55 +
 kernel/api/Makefile                           |   13 +
 kernel/api/ioctl_validation.c                 |  360 +++++
 kernel/api/kapi_debugfs.c                     |  340 +++++
 kernel/api/kernel_api_spec.c                  | 1257 +++++++++++++++++
 mm/mlock.c                                    |  646 +++++++++
 tools/kapi/.gitignore                         |    4 +
 tools/kapi/Cargo.toml                         |   19 +
 tools/kapi/src/extractor/debugfs.rs           |  204 +++
 tools/kapi/src/extractor/mod.rs               |   95 ++
 tools/kapi/src/extractor/source_parser.rs     |  488 +++++++
 .../src/extractor/vmlinux/binary_utils.rs     |  130 ++
 tools/kapi/src/extractor/vmlinux/mod.rs       |  372 +++++
 tools/kapi/src/formatter/json.rs              |  170 +++
 tools/kapi/src/formatter/mod.rs               |   68 +
 tools/kapi/src/formatter/plain.rs             |   99 ++
 tools/kapi/src/formatter/rst.rs               |  144 ++
 tools/kapi/src/main.rs                        |  121 ++
 34 files changed, 9719 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/admin-guide/kernel-api-spec.rst
 create mode 100644 include/linux/ioctl_api_spec.h
 create mode 100644 include/linux/kernel_api_spec.h
 create mode 100644 include/linux/syscall_api_spec.h
 create mode 100644 kernel/api/Kconfig
 create mode 100644 kernel/api/Makefile
 create mode 100644 kernel/api/ioctl_validation.c
 create mode 100644 kernel/api/kapi_debugfs.c
 create mode 100644 kernel/api/kernel_api_spec.c
 create mode 100644 tools/kapi/.gitignore
 create mode 100644 tools/kapi/Cargo.toml
 create mode 100644 tools/kapi/src/extractor/debugfs.rs
 create mode 100644 tools/kapi/src/extractor/mod.rs
 create mode 100644 tools/kapi/src/extractor/source_parser.rs
 create mode 100644 tools/kapi/src/extractor/vmlinux/binary_utils.rs
 create mode 100644 tools/kapi/src/extractor/vmlinux/mod.rs
 create mode 100644 tools/kapi/src/formatter/json.rs
 create mode 100644 tools/kapi/src/formatter/mod.rs
 create mode 100644 tools/kapi/src/formatter/plain.rs
 create mode 100644 tools/kapi/src/formatter/rst.rs
 create mode 100644 tools/kapi/src/main.rs

-- 
2.39.5
On Sat, 14 Jun 2025 09:48:39 -0400 Sasha Levin <sashal@kernel.org> wrote:

> This patch series introduces a framework for formally specifying kernel
> APIs, addressing the long-standing challenge of maintaining stable
> interfaces between the kernel and user-space programs. As outlined in
> previous discussions about kernel ABI stability, the lack of
> machine-readable API specifications has led to inadvertent breakages and
> inconsistent validation across system calls and IOCTLs.

Ugh, this looks horrid. It's going to be worse than things like doxygen for getting out of step with the actual code, and grep searches are going to hit the comment blocks.

David
On Sat, Jun 14, 2025 at 09:48:39AM -0400, Sasha Levin wrote:
> This patch series introduces a framework for formally specifying kernel
> APIs, addressing the long-standing challenge of maintaining stable
> interfaces between the kernel and user-space programs. As outlined in
> previous discussions about kernel ABI stability, the lack of
> machine-readable API specifications has led to inadvertent breakages and
> inconsistent validation across system calls and IOCTLs.

I'd much prefer this be more attached to the code in question; otherwise we've got two things to update when changes happen. (Well, three, since kernel-doc already needs updating too.)

Can't we collect error codes programmatically through control-flow analysis? Argument mapping is already present in the SYSCALL macros, etc. Let's not repeat this info.

-Kees

-- 
Kees Cook
On Wed, Jun 18, 2025 at 02:29:37PM -0700, Kees Cook wrote:
>On Sat, Jun 14, 2025 at 09:48:39AM -0400, Sasha Levin wrote:
>> This patch series introduces a framework for formally specifying kernel
>> APIs, addressing the long-standing challenge of maintaining stable
>> interfaces between the kernel and user-space programs.
>
>I'd much prefer this be more attached to the code in question; otherwise
>we've got two things to update when changes happen. (Well, three, since
>kernel-doc already needs updating too.)
>
>Can't we collect error codes programmatically through control-flow
>analysis? Argument mapping is already present in the SYSCALL macros,

I'm not sure what you meant by the control-flow analysis part: we have code to verify that the return value from the macro matches one of the ones defined in the spec.

>etc. Let's not repeat this info.

I tried to come up with a way to get rid of the SYSCALL_DEFINEx() macro that follows right after the spec. I agree that it's duplication, but my macro-foo is too weak to eliminate that SYSCALL_DEFINE() call. Suggestions are more than welcome here: I suspect this might require a bigger change in the code, but I'm still trying to figure it out.

-- 
Thanks,
Sasha
Nice!

A bag of assorted comments:

1. I share the same concern about duplicating info. If there is a lot of duplication it may lead to failure of the whole effort, since folks won't update these and/or they will get out of sync. If a syscall arg is e.g. umode_t, we already know that it's an integer of that type, and that it's an input arg. In syzkaller we have a Clang tool:
https://github.com/google/syzkaller/blob/master/tools/syz-declextract/clangtool/declextract.cpp
that extracts a bunch of interfaces automatically:
https://raw.githubusercontent.com/google/syzkaller/refs/heads/master/sys/linux/auto.txt
Though, obviously, that won't have human-readable string descriptions, can't be used as a source of truth, and may be challenging to integrate into the kernel build process. Still, extracting some of that info automatically may be nice.

2. Does this framework ensure that the specified info about args is correct? E.g. that the number of syscall args and their types match the actual ones? If such things are not tested/validated during the build, I'm afraid they will be riddled with bugs over time.

3. To reduce duplication we could use more type information; e.g. I was always frustrated that close is just:

SYSCALL_DEFINE1(close, unsigned int, fd)

whereas if we would do:

typedef int fd_t;
SYSCALL_DEFINE1(close, fd_t, fd)

then all semantic info about the arg is already in the code.

4. If we specify e.g. error return values here with descriptions, can that be used as the source of truth to generate man pages? That would eliminate some duplication.

5. We have a long-standing dream that kernel developers add fuzzing descriptions along with new kernel interfaces. So far we have got very few contributions to syzkaller from kernel developers. This framework can serve as the way to do it, which is nice.

6. What's the goal of validation of the input arguments? Kernel code must do this validation anyway, right? Any non-trivial validation is hard: e.g. even for open, the validation function for the file name would need to have access to the flags and check file presence for some flag combinations. That may add a significant amount of non-trivial code that duplicates the main syscall logic, and that logic may also have bugs and memory leaks.

7. One of the most useful uses of this framework that I see is testing kernel behavior correctness. I wonder what properties we can test with these descriptions, and whether we can add more useful info for that purpose. Argument validation does not help here (it catches userspace bugs at best). Return values potentially may be useful: e.g. if we see a return value that's not specified, potentially it's a kernel bug. Side-effect specifications potentially can be used to detect logical kernel bugs, e.g. if a syscall does not claim to change fs state, but it does, it's a bug. Though, a more useful check would be failure/concurrency atomicity. Namely, if a syscall claims to not alter state on failure, it shouldn't do so. Concurrency atomicity means linearizability of concurrent syscalls (side effects match one of the two possible orders of the syscalls). But for these we would need to add additional flags to the descriptions that say that a syscall supports failure/concurrency atomicity.

8. It would be useful to have a mapping of file_operations to actual files in the fs. Otherwise the exposed info is not very actionable, since there is no way to understand what actual file/fd the ioctls can be applied to.

9. I see that syscalls and ioctls say:

KAPI_CONTEXT(KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE)

Can't we make this implicit? Are there any other options? Similarly, an ioctl description says it releases a mutex (.released = true,), but all ioctls/syscalls must release all acquired mutexes, no? Generally, the less verbose the descriptions are, the higher the chances of their survival. +Marco also works on static compiler-enforced lock-checking annotations; I wonder if they can be used to describe this in a more useful way.
On Mon, Jun 23, 2025 at 03:28:03PM +0200, Dmitry Vyukov wrote:
>Nice!
>
>A bag of assorted comments:
>
>1. I share the same concern about duplicating info.
>If there is a lot of duplication it may lead to failure of the whole effort,
>since folks won't update these and/or they will get out of sync.
>If a syscall arg is e.g. umode_t, we already know that it's an integer
>of that type, and that it's an input arg.
>In syzkaller we have a Clang tool:
>https://github.com/google/syzkaller/blob/master/tools/syz-declextract/clangtool/declextract.cpp
>that extracts a bunch of interfaces automatically:
>https://raw.githubusercontent.com/google/syzkaller/refs/heads/master/sys/linux/auto.txt
>Though, obviously, that won't have human-readable string descriptions,
>can't be used as a source of truth, and may be challenging to integrate
>into the kernel build process.
>Still, extracting some of that info automatically may be nice.
>
>2. Does this framework ensure that the specified info about args is correct?
>E.g. that the number of syscall args and their types match the actual ones?
>If such things are not tested/validated during the build, I'm afraid they
>will be riddled with bugs over time.

This is an answer for both (1) and (2): yes! In my mind, whatever we spec out needs to be enforced, because otherwise it will go out of sync. In this RFC, take a look at the code guarded by CONFIG_KAPI_RUNTIME_CHECKS: the idea is that we can enable runtime checks that verify the things you've mentioned above (and more).

>3. To reduce duplication we could use more type information; e.g. I was
>always frustrated that close is just:
>
>SYSCALL_DEFINE1(close, unsigned int, fd)
>
>whereas if we would do:
>
>typedef int fd_t;
>SYSCALL_DEFINE1(close, fd_t, fd)
>
>then all semantic info about the arg is already in the code.

Yup. It would also be great if we could completely drop the SYSCALL_DEFINE() part and have it be automatically generated from the spec itself, but I couldn't wrap my head around doing that in a C macro just yet.

>4. If we specify e.g. error return values here with descriptions,
>can that be used as the source of truth to generate man pages?
>That would eliminate some duplication.

Ideally yes. One of the formatters that the kapi tool has (see the last patch in this series) is the RST formatter, which could be used to generate documentation similar to man pages.

>5. We have a long-standing dream that kernel developers add fuzzing
>descriptions along with new kernel interfaces. So far we have got very few
>contributions to syzkaller from kernel developers. This framework can
>serve as the way to do it, which is nice.

This was one of the main use cases I had in mind. In return, we can get back from syzkaller a body of automatically generated tests that we can embed into our testing CIs.

>6. What's the goal of validation of the input arguments?
>Kernel code must do this validation anyway, right?
>Any non-trivial validation is hard: e.g. even for open, the validation
>function for the file name would need to have access to the flags and
>check file presence for some flag combinations. That may add a significant
>amount of non-trivial code that duplicates the main syscall logic, and
>that logic may also have bugs and memory leaks.

Mostly to catch divergence from the spec: think of a scenario where someone added a new param/flag/etc but forgot to update the spec - this will help catch it. Ideally it would also prevent some of the issues that syzkaller is so good at finding :)

>7. One of the most useful uses of this framework that I see is testing
>kernel behavior correctness. I wonder what properties we can test with
>these descriptions, and whether we can add more useful info for that
>purpose. Argument validation does not help here (it catches userspace
>bugs at best). Return values potentially may be useful: e.g. if we see
>a return value that's not specified, potentially it's a kernel bug.
>Side-effect specifications potentially can be used to detect logical
>kernel bugs, e.g. if a syscall does not claim to change fs state, but it
>does, it's a bug. Though, a more useful check would be failure/concurrency
>atomicity. Namely, if a syscall claims to not alter state on failure, it
>shouldn't do so. Concurrency atomicity means linearizability of concurrent
>syscalls (side effects match one of the two possible orders of the
>syscalls). But for these we would need to add additional flags to the
>descriptions that say that a syscall supports failure/concurrency atomicity.

I agree: being able to fuzz for more than just kernel splats will be great.

>8. It would be useful to have a mapping of file_operations to actual files
>in the fs. Otherwise the exposed info is not very actionable, since there
>is no way to understand what actual file/fd the ioctls can be applied to.

Ack. The ioctl() part is a bit hand-wavy right now, and at the very least we'd need to spec out ioctl() itself. It's more of a demonstration of how it could look rather than being too useful at this point.

>9. I see that syscalls and ioctls say:
>KAPI_CONTEXT(KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE)
>Can't we make this implicit? Are there any other options?

Maybe? I wasn't sure how we'd describe something like getpid(), which isn't supposed to sleep.

>Similarly, an ioctl description says it releases a mutex (.released = true,),
>but all ioctls/syscalls must release all acquired mutexes, no?
>Generally, the less verbose the descriptions are, the higher the chances
>of their survival.
>+Marco also works on static compiler-enforced lock-checking annotations;
>I wonder if they can be used to describe this in a more useful way.

I was thinking about stuff like futex or flock, which can return with a lock back to userspace.

-- 
Thanks,
Sasha
On Tue, 24 Jun 2025 at 22:04, Sasha Levin <sashal@kernel.org> wrote:
> >9. I see that syscalls and ioctls say:
> >KAPI_CONTEXT(KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE)
> >Can't we make this implicit? Are there any other options?
>
> Maybe? I wasn't sure how we'd describe something like getpid() which
> isn't supposed to sleep.
>
> >Similarly, an ioctl description says it releases a mutex (.released = true,),
> >but all ioctls/syscalls must release all acquired mutexes, no?
> >Generally, the less verbose the descriptions are, the higher the chances
> >of their survival.
> >+Marco also works on static compiler-enforced lock-checking annotations;
> >I wonder if they can be used to describe this in a more useful way.
>
> I was thinking about stuff like futex or flock which can return with a
> lock back to userspace.

I see, this makes sense. Then I would go with explicitly specifying the rare, uncommon cases instead, and require that the 99% of common cases be the default that does not require saying anything.

E.g. KAPI_CTX_NON_SLEEPABLE, .not_released = true.

KAPI_CTX_NON_SLEEPABLE looks useful, since it allows easy validation: set a current flag, and BUG on any attempt to sleep when the flag is set (lockdep probably already has the required pieces for this).
On Wed, Jun 25, 2025 at 10:56:04AM +0200, Dmitry Vyukov wrote:
>On Tue, 24 Jun 2025 at 22:04, Sasha Levin <sashal@kernel.org> wrote:
>> >9. I see that syscalls and ioctls say:
>> >KAPI_CONTEXT(KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE)
>> >Can't we make this implicit? Are there any other options?
>>
>> Maybe? I wasn't sure how we'd describe something like getpid() which
>> isn't supposed to sleep.
>>
>> >Similarly, an ioctl description says it releases a mutex (.released = true,),
>> >but all ioctls/syscalls must release all acquired mutexes, no?
>> >Generally, the less verbose the descriptions are, the higher the chances
>> >of their survival.
>>
>> I was thinking about stuff like futex or flock which can return with a
>> lock back to userspace.
>
>I see, this makes sense. Then I would go with explicitly specifying the
>rare, uncommon cases instead, and require that the 99% of common cases be
>the default that does not require saying anything.
>
>E.g. KAPI_CTX_NON_SLEEPABLE, .not_released = true.
>
>KAPI_CTX_NON_SLEEPABLE looks useful, since it allows easy validation:
>set a current flag, and BUG on any attempt to sleep when the flag is set
>(lockdep probably already has the required pieces for this).

Yup, that makes sense. One of the reasons I wrapped all the field assignments in macros is that we can easily customize them based on usage, so instead of:

#define KAPI_LOCK_ACQUIRED \
	.acquired = true,

#define KAPI_LOCK_RELEASED \
	.released = true,

we can add:

#define KAPI_LOCK_USED \
	.acquired = true, \
	.released = true,

-- 
Thanks,
Sasha
On Tue, 24 Jun 2025 at 22:04, Sasha Levin <sashal@kernel.org> wrote:
> >6. What's the goal of validation of the input arguments?
> >Kernel code must do this validation anyway, right?
> >Any non-trivial validation is hard: e.g. even for open, the validation
> >function for the file name would need to have access to the flags and
> >check file presence for some flag combinations. That may add a significant
> >amount of non-trivial code that duplicates the main syscall logic, and
> >that logic may also have bugs and memory leaks.
>
> Mostly to catch divergence from the spec: think of a scenario where
> someone added a new param/flag/etc but forgot to update the spec - this
> will help catch it.

How exactly is this supposed to work? Even if we run with a unit test suite, a test suite may include some incorrect inputs to check for error conditions. The framework will report violations on these incorrect inputs. These are not bugs in the API specifications, nor in the test suite (read: false positives).
On Wed, Jun 25, 2025 at 10:52:46AM +0200, Dmitry Vyukov wrote:
>On Tue, 24 Jun 2025 at 22:04, Sasha Levin <sashal@kernel.org> wrote:
>
>> >6. What's the goal of validation of the input arguments?
>> >Kernel code must do this validation anyway, right?
>>
>> Mostly to catch divergence from the spec: think of a scenario where
>> someone added a new param/flag/etc but forgot to update the spec - this
>> will help catch it.
>
>How exactly is this supposed to work?
>Even if we run with a unit test suite, a test suite may include some
>incorrect inputs to check for error conditions. The framework will
>report violations on these incorrect inputs. These are not bugs in the
>API specifications, nor in the test suite (read: false positives).

Right now it would be something along the lines of a test checking for an expected failure message in dmesg, something like:

https://github.com/linux-test-project/ltp/blob/0c99c7915f029d32de893b15b0a213ff3de210af/testcases/commands/sysctl/sysctl02.sh#L67

I'm not opposed to coming up with a better story...

-- 
Thanks,
Sasha
On Wed, 25 Jun 2025 at 17:55, Sasha Levin <sashal@kernel.org> wrote:
>
> On Wed, Jun 25, 2025 at 10:52:46AM +0200, Dmitry Vyukov wrote:
> >On Tue, 24 Jun 2025 at 22:04, Sasha Levin <sashal@kernel.org> wrote:
> >
> >> >6. What's the goal of validation of the input arguments?
> >> >Kernel code must do this validation anyway, right?
> >>
> >> Mostly to catch divergence from the spec: think of a scenario where
> >> someone added a new param/flag/etc but forgot to update the spec - this
> >> will help catch it.
> >
> >How exactly is this supposed to work?
> >Even if we run with a unit test suite, a test suite may include some
> >incorrect inputs to check for error conditions. The framework will
> >report violations on these incorrect inputs. These are not bugs in the
> >API specifications, nor in the test suite (read: false positives).
>
> Right now it would be something along the lines of a test checking for
> an expected failure message in dmesg, something like:
>
> https://github.com/linux-test-project/ltp/blob/0c99c7915f029d32de893b15b0a213ff3de210af/testcases/commands/sysctl/sysctl02.sh#L67
>
> I'm not opposed to coming up with a better story...

Oh, you mean special tests for this framework (rather than the existing tests). I don't think this is going to work in practice. Besides writing all these specifications, we would also need to write dozens of tests per specification (e.g. for each fd arg one needs at least three tests: -1, a valid fd, an invalid fd; an enum may need five various inputs; let alone netlink specifications).
On Thu, 26 Jun 2025 at 10:32, Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Wed, 25 Jun 2025 at 17:55, Sasha Levin <sashal@kernel.org> wrote:
> >
> > On Wed, Jun 25, 2025 at 10:52:46AM +0200, Dmitry Vyukov wrote:
> > >On Tue, 24 Jun 2025 at 22:04, Sasha Levin <sashal@kernel.org> wrote:
> > >
> > >> >6. What's the goal of validation of the input arguments?
> > >> >Kernel code must do this validation anyway, right?
> > >>
> > >> Mostly to catch divergence from the spec: think of a scenario where
> > >> someone added a new param/flag/etc but forgot to update the spec -
> > >> this will help catch it.
> > >
> > >How exactly is this supposed to work?
> > >Even if we run with a unit test suite, a test suite may include some
> > >incorrect inputs to check for error conditions. The framework will
> > >report violations on these incorrect inputs. These are not bugs in the
> > >API specifications, nor in the test suite (read: false positives).
> >
> > Right now it would be something along the lines of a test checking for
> > an expected failure message in dmesg, something like:
> >
> > https://github.com/linux-test-project/ltp/blob/0c99c7915f029d32de893b15b0a213ff3de210af/testcases/commands/sysctl/sysctl02.sh#L67
> >
> > I'm not opposed to coming up with a better story...

If the goal of validation is just indirectly validating the correctness of the specification itself, then I would look for other ways of validating the spec. Either remove the duplication between the specification and the actual code (i.e. generate one from SYSCALL_DEFINE, or the other way around), so the spec is correct by construction. Or cross-validate it with info automatically extracted from the source (using clang/dwarf/pahole). This would be more scalable (O(1) work, rather than thousands more manually written tests).

> Oh, you mean special tests for this framework (rather than the existing tests).
> I don't think this is going to work in practice. Besides writing all
> these specifications, we would also need to write dozens of tests per
> specification (e.g. for each fd arg one needs at least three tests: -1,
> a valid fd, an invalid fd; an enum may need five various inputs; let
> alone netlink specifications).
On Thu, Jun 26, 2025 at 10:37:33AM +0200, Dmitry Vyukov wrote:
>On Thu, 26 Jun 2025 at 10:32, Dmitry Vyukov <dvyukov@google.com> wrote:
>>
>> On Wed, 25 Jun 2025 at 17:55, Sasha Levin <sashal@kernel.org> wrote:
>> >
>> > On Wed, Jun 25, 2025 at 10:52:46AM +0200, Dmitry Vyukov wrote:
>> > >How exactly is this supposed to work?
>> > >Even if we run with a unit test suite, a test suite may include some
>> > >incorrect inputs to check for error conditions. The framework will
>> > >report violations on these incorrect inputs. These are not bugs in the
>> > >API specifications, nor in the test suite (read: false positives).
>> >
>> > Right now it would be something along the lines of a test checking for
>> > an expected failure message in dmesg, something like:
>> >
>> > https://github.com/linux-test-project/ltp/blob/0c99c7915f029d32de893b15b0a213ff3de210af/testcases/commands/sysctl/sysctl02.sh#L67
>> >
>> > I'm not opposed to coming up with a better story...
>
>If the goal of validation is just indirectly validating the correctness of
>the specification itself, then I would look for other ways of validating
>the spec. Either remove the duplication between the specification and the
>actual code (i.e. generate one from SYSCALL_DEFINE, or the other way
>around), so the spec is correct by construction. Or cross-validate it with
>info automatically extracted from the source (using clang/dwarf/pahole).
>This would be more scalable (O(1) work, rather than thousands more
>manually written tests).
>
>> Oh, you mean special tests for this framework (rather than the existing
>> tests). I don't think this is going to work in practice. Besides writing
>> all these specifications, we would also need to write dozens of tests per
>> specification (e.g. for each fd arg one needs at least three tests: -1,
>> a valid fd, an invalid fd; an enum may need five various inputs; let
>> alone netlink specifications).

I didn't mean tests just for this framework: being able to specify the APIs in a machine-readable format will enable us to automatically generate exhaustive tests for each such API. I've been playing with the kapi tool (see the last patch), which already supports different formatters. Right now it outputs human-readable output, but I have proof-of-concept code that outputs testcases for specced APIs.

The dream here is to be able to automatically generate hundreds or thousands of tests for each API in an automated fashion, and verify the results by:

1. Simply checking the expected return value.
2. Checking that the actual action happened (i.e. we called close(fd); verify that `fd` is really closed).
3. Checking for side effects (i.e. close(fd) isn't supposed to allocate memory; verify that it didn't).
4. Code coverage: our tests are supposed to cover 100% of the code in that API's call chain; is there code that didn't run (missing/incorrect specs)?

-- 
Thanks,
Sasha
On Thu, 26 Jun 2025 at 18:23, Sasha Levin <sashal@kernel.org> wrote:
[...]
> The dream here is to be able to automatically generate
> hundreds/thousands of tests for each API in an automated fashion, and
> verify the results with:
>
> 1. Simply checking expected return value.
>
> 2. Checking that the actual action happened (i.e. we called close(fd),
> verify that `fd` is really closed).
>
> 3. Check for side effects (i.e. close(fd) isn't supposed to allocate
> memory - verify that it didn't allocate memory).
>
> 4. Code coverage: our tests are supposed to cover 100% of the code in
> that APIs call chain, do we have code that didn't run (missing/incorrect
> specs).

This is all good. I was asking about the argument verification part of
the framework. Is it required for any of this? How?
On Fri, Jun 27, 2025 at 08:23:41AM +0200, Dmitry Vyukov wrote:
[...]
>This is all good. I was asking about the argument verification part of
>the framework. Is it required for any of this? How?

Specifications without enforcement are just documentation :)

In my mind, there are a few reasons we want this:

1. For folks coding against the kernel, it's a way for them to know
that the code they're writing fits within the spec of the kernel's API.

2. Enforcement around kernel changes: think of a scenario where a flag
is added to a syscall - the author of that change will have to also
update the spec, because otherwise the verification layer will complain
about the new flag. This helps prevent divergence between the code and
the spec.

3. Extra layer of security: we can choose to enable this as an
additional layer to protect us from missing checks in our
userspace-facing API.

-- 
Thanks,
Sasha
On Mon, 30 Jun 2025 at 16:27, Sasha Levin <sashal@kernel.org> wrote:
[...]
> Specifications without enforcement are just documentation :)
>
> In my mind, there are a few reasons we want this:
>
> 1. For folks coding against the kernel, it's a way for them to know that
> the code they're writing fits within the spec of the kernel's API.

How is this different from just running the kernel normally? Running
the kernel normally is simpler, faster, and more precise.

> 2. Enforcement around kernel changes: think of a scenario where a flag
> is added to a syscall - the author of that change will have to also
> update the spec because otherwise the verification layer will complain
> about the new flag. This helps prevent divergence between the code and
> the spec.

It may be more useful to invoke verification but not return early on
verification errors: instead, memorize the result and still always run
the actual syscall normally. Then, if verification produced an error
but the actual syscall has not returned the same error, WARN loudly.

This should provide the same value, but it does not rely on correctly
marked, manually written tests to test the specification. It will work
automatically with any fuzzing/randomized testing, which I assume will
be more valuable for specification testing.

But then, as Cyril mentioned, this verification layer does not really
need to live in the kernel. Once the kernel has exported the
specification in a machine-usable form, the same verification can be
done in user-space. Which is always a good idea.

> 3. Extra layer of security: we can choose to enable this as an
> additional layer to protect us from missing checks in our userspace
> facing API.

This will have additional risks, and performance overhead. Such
mitigations are usually assessed with the % of past CVEs they could
have prevented. That would allow us to assess cost/benefit.
Intuitively, this does not look worth doing to me.
Hi!

[...]
> How exactly is this supposed to work?
> Even if we run with a unit test suite, a test suite may include some
> incorrect inputs to check for error conditions. The framework will
> report violations on these incorrect inputs. These are not bugs in the
> API specifications, nor in the test suite (read false positives).

This is what I tried to respond to, but I guess that it didn't go well.
Let me try to reiterate.

In my opinion you shouldn't really put this part into the kernel, but
rather include more type and semantic information in the data, so that
tests can be generated and executed in userspace.

I do not see how we can validate that we get proper errors from a
syscall if one of the input parameters is invalid, other than by
generating and running a C test in userspace. For that part the syscall
description does not need to be built into the kernel either; it may
just be a build artifact that gets installed with the kernel image.

-- 
Cyril Hrubis
chrubis@suse.cz
On Tue, 24 Jun 2025 at 22:04, Sasha Levin <sashal@kernel.org> wrote:
> >3. To reduce duplication we could use more type information, e.g. I was always
> >frustrated that close is just:
> >
> >SYSCALL_DEFINE1(close, unsigned int, fd)
> >
> >whereas if we would do:
> >
> >typedef int fd_t;
> >SYSCALL_DEFINE1(close, fd_t, fd)
> >
> >then all semantic info about the arg is already in the code.
>
> Yup. It would also be great if we could completely drop the SYSCALL_DEFINE()
> part and have it be automatically generated by the spec itself, but I
> couldn't wrap my head around doing this in a C macro just yet.

At some point I was looking at the boost.pp library as a source of info
on how to do things. It provides a set of containers and algorithms on
them:

https://www.boost.org/doc/libs/latest/libs/preprocessor/doc/index.html

Sequences may be the most appealing b/c they support a variable number
of elements, and don't need the number of elements specified
explicitly:

https://www.boost.org/doc/libs/latest/libs/preprocessor/doc/data/sequences.html

A sequence then allows generating multiple things from it using a
foreach over its elements.
Hi!

> 6. What's the goal of validation of the input arguments?
> Kernel code must do this validation anyway, right.
> Any non-trivial validation is hard, e.g. even for open the validation
> function for file name would need to have access to flags and check
> file presence for some flags combinations. That may add a significant
> amount of non-trivial code that duplicates main syscall logic, and
> that logic may also have bugs and memory leaks.

I was looking at that part and thinking that we could generate (at
least some) automated conformance tests based on this information. We
could make sure that invalid parameters are properly rejected. For
open(), some combinations would be difficult to model though, e.g. for
O_DIRECTORY the pathname is supposed to be a path to a directory and
also the file descriptor returned has different properties. Also
O_CREAT requires a third parameter and changes which kinds of filepaths
are invalid. Demultiplexing syscalls like this is going to be difficult
to get right.

As for testing purposes, most of the time it would be enough just to
say something like "this parameter is an existing file". If we have
this information in a machine parseable format we can generate
automatic tests for various error conditions, e.g. ELOOP, EACCES,
ENAMETOOLONG, ENOENT, ...

For paths we could have something like:

file:existing
file:nonexisting
file:replaced|nonexisting
file:nonexisting|existing
dir:existing
dir:nonexisting

Then for the open() syscall we can do:

flags=O_DIRECTORY path=dir:existing
flags=O_CREAT path=file:nonexisting|existing
flags=O_CREAT|O_EXCL path=file:nonexisting
...

You may wonder if such tests are useful at all, since quite a few of
these errors are checked for and generated from common functions. There
are at least two cases I can think of. First of all, it makes sure that
errors are stable when a particular function/subsystem is rewritten.
And it can also make sure that errors are consistent across different
implementations of the same functionality, e.g. filesystems. I remember
that some of the less used FUSE filesystems returned puzzling errors in
certain corner cases.

Maybe it would be more useful to steer this towards a system that
better annotates the types for the syscall parameters and return
values. Something that would be an extension to the C types, with a
description of how a particular string or integer is interpreted.

> Side-effects specification potentially can be used to detect logical kernel bugs,
> e.g. if a syscall does not claim to change fs state, but it does, it's a bug.
> Though, a more useful check should be failure/concurrency atomicity.
> Namely, if a syscall claims to not alter state on failure, it shouldn't do so.
> Concurrency atomicity means linearizability of concurrent syscalls
> (side-effects match one of 2 possible orders of syscalls).
> But for these we would need to add additional flags to the descriptions
> that say that a syscall supports failure/concurrency atomicity.
>
> 8. It would be useful to have a mapping of file_operations to actual files in fs.
> Otherwise the exposed info is not very actionable, since there is no way to understand
> what actual file/fd the ioctl's can be applied to.

+1 There are many different kinds of file descriptors and they differ
wildly in what operations they support.

Maybe we would need a subclass for a file descriptor, something like:

fd:file
fd:timerfd
fd:pidfs
...

-- 
Cyril Hrubis
chrubis@suse.cz
On Tue, 24 Jun 2025 at 16:05, Cyril Hrubis <chrubis@suse.cz> wrote:
[...]
> You may wonder if such kind of tests are useful at all, since quite a
> few of these errors are checked for and generated from common
> functions. There are at least two cases I can think of. First of all,
> it makes sure that errors are stable when a particular
> function/subsystem is rewritten. And it can also make sure that errors
> are consistent across different implementations of the same
> functionality, e.g. filesystems. I remember that some of the less used
> FUSE filesystems returned puzzling errors in certain corner cases.

I am not following how this is related to the validation part of the
patch series. Can you elaborate?

Generation of such conformance tests would need info about the
parameter types and their semantic meaning, not the validation part.
The conformance tests should test the actual syscall checking of
arguments, not the validation added by this framework.

> Maybe it would be more useful to steer this towards a system that
> annotates better the types for the syscall parameters and return values.
> Something that would be an extension to a C types with a description on
> how particular string or integer is interpreted.

+1

> > 8. It would be useful to have a mapping of file_operations to actual files in fs.
> > Otherwise the exposed info is not very actionable, since there is no way to understand
> > what actual file/fd the ioctl's can be applied to.
>
> +1 There are many different kinds of file descriptors and they differ
> wildly in what operations they support.
>
> Maybe we would need a subclass for a file descriptor, something as:
>
> fd:file
> fd:timerfd
> fd:pidfs

FWIW syzkaller has this for the purpose of automatic generation of test
inputs.
Hi!

> > You may wonder if such kind of tests are useful at all, since quite a
> > few of these errors are checked for and generated from common
> > functions. There are at least two cases I can think of. First of all,
> > it makes sure that errors are stable when a particular
> > function/subsystem is rewritten. And it can also make sure that errors
> > are consistent across different implementations of the same
> > functionality, e.g. filesystems. I remember that some of the less used
> > FUSE filesystems returned puzzling errors in certain corner cases.
>
> I am not following how this is related to the validation part of the
> patch series. Can you elaborate?

This part is me trying to explain that generated conformance tests
would be useful for development as well.

> Generation of such conformance tests would need info about the
> parameter types and their semantic meaning, not the validation part.
> The conformance tests should test the actual syscall checking of
> arguments, not the validation added by this framework.

Exactly. I do not think that it makes sense to encode the argument
ranges and the functions to generate valid syscall parameters into the
kernel. Rather, the information should be encoded in the extended
types; if we do that well enough, we can generate combinations of
different valid and invalid parameters for the tests based on that.

-- 
Cyril Hrubis
chrubis@suse.cz