From: Ethan Graham <ethangraham@google.com>

This patch series introduces KFuzzTest, a lightweight framework for
creating in-kernel fuzz targets for internal kernel functions.

The primary motivation for KFuzzTest is to simplify the fuzzing of
low-level, relatively stateless functions (e.g., data parsers, format
converters) that are difficult to exercise effectively from the syscall
boundary. It is intended for in-situ fuzzing of kernel code without
requiring that it be built as a separate userspace library or that its
dependencies be stubbed out. Using a simple macro-based API, developers
can add a new fuzz target with minimal boilerplate code.

The core design consists of three main parts:
1. A `FUZZ_TEST(name, struct_type)` macro that allows developers to
   easily define a fuzz test.
2. A binary input format that allows a userspace fuzzer to serialize
   complex, pointer-rich C structures into a single buffer.
3. Metadata for test targets, constraints, and annotations, which is
   emitted into dedicated ELF sections to allow for discovery and
   inspection by userspace tools. These are found in
   ".kfuzztest_{targets, constraints, annotations}".

To demonstrate this framework's viability, support for KFuzzTest has been
prototyped in a development fork of syzkaller, enabling coverage-guided
fuzzing. To validate its end-to-end effectiveness, we performed an
experiment by manually introducing an off-by-one buffer over-read into
pkcs7_parse_message, like so:

-ret = asn1_ber_decoder(&pkcs7_decoder, ctx, data, datalen);
+ret = asn1_ber_decoder(&pkcs7_decoder, ctx, data, datalen + 1);

A syzkaller instance fuzzing the new test_pkcs7_parse_message target
introduced in patch 7 successfully triggered the bug inside of
asn1_ber_decoder in under 30 seconds from a cold start.

This RFC continues to seek feedback on the overall design of KFuzzTest
and the minor changes made in V2. We are particularly interested in
comments on:
- The ergonomics of the API for defining fuzz targets.
- The overall workflow and usability for a developer adding and running
  a new in-kernel fuzz target.
- The high-level architecture.

The patch series is structured as follows:
- Patch 1 adds and exposes a new KASAN function needed by KFuzzTest.
- Patch 2 introduces the core KFuzzTest API and data structures.
- Patch 3 adds the runtime implementation for the framework.
- Patch 4 adds a tool for sending structured inputs into a fuzz target.
- Patch 5 adds documentation.
- Patch 6 provides example fuzz targets.
- Patch 7 defines fuzz targets for real kernel functions.

Changes in v2:
- Per feedback from Eric Biggers and Ignat Korchagin, move the /crypto
  fuzz target samples into a new /crypto/tests directory to separate
  them from the functional source code.
- Per feedback from David Gow and Marco Elver, add the kfuzztest-bridge
  tool to generate structured inputs for fuzz targets. The tool can
  populate parts of the input structure with data from a file, enabling
  both simple randomized fuzzing (e.g., using /dev/urandom) and
  targeted testing with file-based inputs.

We would like to thank David Gow for his detailed feedback regarding the
potential integration with KUnit. The v1 discussion highlighted three
potential paths: making KFuzzTests a special case of KUnit tests, sharing
implementation details in a common library, or keeping the frameworks
separate while ensuring API familiarity.

Following a productive conversation with David, we are moving forward
with the third option for now. While tighter integration is an
attractive long-term goal, we believe the most practical first step is
to establish KFuzzTest as a valuable, standalone framework. This avoids
premature abstraction (e.g., creating a shared library with only one
user) and allows KFuzzTest's design to stabilize based on its specific
focus: fuzzing with complex, structured inputs.
Ethan Graham (7):
  mm/kasan: implement kasan_poison_range
  kfuzztest: add user-facing API and data structures
  kfuzztest: implement core module and input processing
  tools: add kfuzztest-bridge utility
  kfuzztest: add ReST documentation
  kfuzztest: add KFuzzTest sample fuzz targets
  crypto: implement KFuzzTest targets for PKCS7 and RSA parsing

 Documentation/dev-tools/index.rst             |   1 +
 Documentation/dev-tools/kfuzztest.rst         | 371 +++++++++++++
 arch/x86/kernel/vmlinux.lds.S                 |  22 +
 crypto/asymmetric_keys/Kconfig                |  15 +
 crypto/asymmetric_keys/Makefile               |   2 +
 crypto/asymmetric_keys/tests/Makefile         |   2 +
 crypto/asymmetric_keys/tests/pkcs7_kfuzz.c    |  22 +
 .../asymmetric_keys/tests/rsa_helper_kfuzz.c  |  38 ++
 include/linux/kasan.h                         |  16 +
 include/linux/kfuzztest.h                     | 508 ++++++++++++++++++
 lib/Kconfig.debug                             |   1 +
 lib/Makefile                                  |   2 +
 lib/kfuzztest/Kconfig                         |  20 +
 lib/kfuzztest/Makefile                        |   4 +
 lib/kfuzztest/main.c                          | 163 ++++++
 lib/kfuzztest/parse.c                         | 208 +++++++
 mm/kasan/shadow.c                             |  31 ++
 samples/Kconfig                               |   7 +
 samples/Makefile                              |   1 +
 samples/kfuzztest/Makefile                    |   3 +
 samples/kfuzztest/overflow_on_nested_buffer.c |  52 ++
 samples/kfuzztest/underflow_on_buffer.c       |  41 ++
 tools/Makefile                                |  15 +-
 tools/kfuzztest-bridge/.gitignore             |   2 +
 tools/kfuzztest-bridge/Build                  |   6 +
 tools/kfuzztest-bridge/Makefile               |  48 ++
 tools/kfuzztest-bridge/bridge.c               |  93 ++++
 tools/kfuzztest-bridge/byte_buffer.c          |  87 +++
 tools/kfuzztest-bridge/byte_buffer.h          |  31 ++
 tools/kfuzztest-bridge/encoder.c              | 356 ++++++++++++
 tools/kfuzztest-bridge/encoder.h              |  16 +
 tools/kfuzztest-bridge/input_lexer.c          | 243 +++++++++
 tools/kfuzztest-bridge/input_lexer.h          |  57 ++
 tools/kfuzztest-bridge/input_parser.c         | 373 +++++++++++++
 tools/kfuzztest-bridge/input_parser.h         |  79 +++
 tools/kfuzztest-bridge/rand_stream.c          |  61 +++
 tools/kfuzztest-bridge/rand_stream.h          |  46 ++
 37 files changed, 3037 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/dev-tools/kfuzztest.rst
 create mode 100644 crypto/asymmetric_keys/tests/Makefile
 create mode 100644 crypto/asymmetric_keys/tests/pkcs7_kfuzz.c
 create mode 100644 crypto/asymmetric_keys/tests/rsa_helper_kfuzz.c
 create mode 100644 include/linux/kfuzztest.h
 create mode 100644 lib/kfuzztest/Kconfig
 create mode 100644 lib/kfuzztest/Makefile
 create mode 100644 lib/kfuzztest/main.c
 create mode 100644 lib/kfuzztest/parse.c
 create mode 100644 samples/kfuzztest/Makefile
 create mode 100644 samples/kfuzztest/overflow_on_nested_buffer.c
 create mode 100644 samples/kfuzztest/underflow_on_buffer.c
 create mode 100644 tools/kfuzztest-bridge/.gitignore
 create mode 100644 tools/kfuzztest-bridge/Build
 create mode 100644 tools/kfuzztest-bridge/Makefile
 create mode 100644 tools/kfuzztest-bridge/bridge.c
 create mode 100644 tools/kfuzztest-bridge/byte_buffer.c
 create mode 100644 tools/kfuzztest-bridge/byte_buffer.h
 create mode 100644 tools/kfuzztest-bridge/encoder.c
 create mode 100644 tools/kfuzztest-bridge/encoder.h
 create mode 100644 tools/kfuzztest-bridge/input_lexer.c
 create mode 100644 tools/kfuzztest-bridge/input_lexer.h
 create mode 100644 tools/kfuzztest-bridge/input_parser.c
 create mode 100644 tools/kfuzztest-bridge/input_parser.h
 create mode 100644 tools/kfuzztest-bridge/rand_stream.c
 create mode 100644 tools/kfuzztest-bridge/rand_stream.h

--
2.51.0.318.gd7df087d1a-goog
Hi Ethan,

Since I'm looking at some WiFi fuzzing just now ...

> The primary motivation for KFuzzTest is to simplify the fuzzing of
> low-level, relatively stateless functions (e.g., data parsers, format
> converters)

Could you clarify what you mean by "relatively" here? It seems to me
that if you let this fuzz say something like
cfg80211_inform_bss_frame_data(), which parses a frame and registers it
in the global scan list, you might quickly run into the 1000 limit of
the list, etc. since these functions are not stateless. OTOH, it's
obviously possible to just receive a lot of such frames over the air
even, or over simulated air like in syzbot today already.

> This RFC continues to seek feedback on the overall design of KFuzzTest
> and the minor changes made in V2. We are particularly interested in
> comments on:
> - The ergonomics of the API for defining fuzz targets.
> - The overall workflow and usability for a developer adding and running
>   a new in-kernel fuzz target.
> - The high-level architecture.

As far as the architecture is concerned, I'm reading this as built
around a syzkaller(-like) architecture, in that the fuzzer lives in the
fuzzed kernel's userspace, right?

> We would like to thank David Gow for his detailed feedback regarding the
> potential integration with KUnit. The v1 discussion highlighted three
> potential paths: making KFuzzTests a special case of KUnit tests, sharing
> implementation details in a common library, or keeping the frameworks
> separate while ensuring API familiarity.
>
> Following a productive conversation with David, we are moving forward
> with the third option for now. While tighter integration is an
> attractive long-term goal, we believe the most practical first step is
> to establish KFuzzTest as a valuable, standalone framework.
I have been wondering about this from another perspective - with kunit
often running in ARCH=um, and there the kernel being "just" a userspace
process, we should be able to do a "classic" afl-style fork approach to
fuzzing. That way, state doesn't really (have to) matter at all. This is
of course both an advantage (reproducing any issue found is just the
right test with a single input) and a disadvantage (the fuzzer won't
modify state first and then find an issue on a later round.)

I was just looking at what external state (such as the physical memory
mapped) UML has and that would need to be disentangled, and it's not
_that_ much if we can have specific configurations, and maybe mostly
shut down the userspace that's running inside UML (and/or have kunit
execute before init/pid 1 when builtin.)

Did you consider such a model at all, and have specific reasons for not
going in this direction, or did you simply not consider it because
you're coming from the syzkaller side anyway?

johannes
On Mon, Sep 8, 2025 at 3:11 PM Johannes Berg <johannes@sipsolutions.net> wrote:
>
> Hi Ethan,

Hi Johannes,

> Since I'm looking at some WiFi fuzzing just now ...
>
> > The primary motivation for KFuzzTest is to simplify the fuzzing of
> > low-level, relatively stateless functions (e.g., data parsers, format
> > converters)
>
> Could you clarify what you mean by "relatively" here? It seems to me
> that if you let this fuzz say something like
> cfg80211_inform_bss_frame_data(), which parses a frame and registers it
> in the global scan list, you might quickly run into the 1000 limit of
> the list, etc. since these functions are not stateless. OTOH, it's
> obviously possible to just receive a lot of such frames over the air
> even, or over simulated air like in syzbot today already.

While it would be very useful to be able to test every single function
in the kernel, there are limitations imposed by our approach.
To work around these limitations, some code may need to be refactored
for better testability, so that global state can be mocked out or
easily reset between runs.

I am not very familiar with the code in
cfg80211_inform_bss_frame_data(), but I can imagine that the code doing
the actual frame parsing could be untangled from the code that registers
it in the global list. The upside of doing so would be the ability to
test that parsing logic in modes that real-world syscall invocations may
never exercise.

> As far as the architecture is concerned, I'm reading this is built
> around syzkaller (like) architecture, in that the fuzzer lives in the
> fuzzed kernel's userspace, right?

This is correct.

> > We would like to thank David Gow for his detailed feedback regarding the
> > potential integration with KUnit. The v1 discussion highlighted three
> > potential paths: making KFuzzTests a special case of KUnit tests, sharing
> > implementation details in a common library, or keeping the frameworks
> > separate while ensuring API familiarity.
> >
> > Following a productive conversation with David, we are moving forward
> > with the third option for now. While tighter integration is an
> > attractive long-term goal, we believe the most practical first step is
> > to establish KFuzzTest as a valuable, standalone framework.
>
> I have been wondering about this from another perspective - with kunit
> often running in ARCH=um, and there the kernel being "just" a userspace
> process, we should be able to do a "classic" afl-style fork approach to
> fuzzing.

This approach is quite popular among security researchers, but if I'm
understanding correctly, we are yet to see continuous integration of
UML-based fuzzers with the kernel development process.

> That way, state doesn't really (have to) matter at all. This is
> of course both an advantage (reproducing any issue found is just the
> right test with a single input) and disadvantage (the fuzzer won't
> modify state first and then find an issue on a later round.)

From our experience, accumulated state is more of a disadvantage that
we'd rather eliminate altogether.

syzkaller can chain syscalls and could in theory generate a single
program that is elaborate enough to prepare the state and then find an
issue. However, because resetting the kernel (rebooting machines or
restoring VM snapshots) is costly, we have to run multiple programs on
the same kernel instance, which interfere with each other. As a result,
some bugs that are tricky to trigger become even trickier to reproduce,
because one can't possibly replay all the interleavings of those
programs.

So, yes, assuming we can build the kernel with ARCH=um and run the
function under test in a fork-per-run model, that would speed things up
significantly.
>
> I was just looking at what external state (such as the physical memory
> mapped) UML has and that would need to be disentangled, and it's not
> _that_ much if we can have specific configurations, and maybe mostly
> shut down the userspace that's running inside UML (and/or have kunit
> execute before init/pid 1 when builtin.)

I looked at UML myself around 2023, and back then my impression was
that it didn't quite work with KASAN and KCOV, and adding an AFL
dependency on top of that made every fuzzer a one-of-a-kind setup.

> Did you consider such a model at all, and have specific reasons for not
> going in this direction, or simply didn't consider because you're coming
> from the syzkaller side anyway?

We did consider such a model, but decided against it, with the
maintainability of the fuzzers being the main reason.
We want to be sure that every fuzz target written for the kernel is
still buildable when the code author turns their back on it.
We also want every target to be tested continuously and for the bugs
to be reported automatically.
Coming from the syzkaller side, it was natural to use the existing
infrastructure for that instead of reinventing the wheel :)

That being said, our current approach doesn't rule out UML.
In the future, we could adapt the FUZZ_TEST macro to generate stubs
that link against AFL, libFuzzer, or Centipede in UML builds.
The question of how to run those targets continuously would still be on
the table, though.
Hi,

Thanks for your response!

> > > The primary motivation for KFuzzTest is to simplify the fuzzing of
> > > low-level, relatively stateless functions (e.g., data parsers, format
> > > converters)
> >
> > Could you clarify what you mean by "relatively" here? It seems to me
> > that if you let this fuzz say something like
> > cfg80211_inform_bss_frame_data(), which parses a frame and registers it
> > in the global scan list, you might quickly run into the 1000 limit of
> > the list, etc. since these functions are not stateless. OTOH, it's
> > obviously possible to just receive a lot of such frames over the air
> > even, or over simulated air like in syzbot today already.
>
> While it would be very useful to be able to test every single function
> in the kernel, there are limitations imposed by our approach.
> To work around these limitations, some code may need to be refactored
> for better testability, so that global state can be mocked out or
> easily reset between runs.

Sure, I guess that'd be possible. Perhaps I'm more wondering if it's
actually desirable, but it sounds like at least that's how it was
intended to be used then.

> I am not very familiar with the code in
> cfg80211_inform_bss_frame_data(), but I can imagine that the code
> doing the actual frame parsing could be untangled from the code that
> registers it in the global list.

It could, but I'm actually less worried about the parsing code (it's
relatively simple to review) than about the data model in this code,
and trying to fuzz the data model generally requires the state. See
e.g. https://syzkaller.appspot.com/bug?extid=dc6f4dce0d707900cdea
(which I finally reproduced in a kunit test a few years after this was
originally reported.) I mean ...
I guess now I'm arguing against myself - having the state there is
required to find certain classes of bugs, but not having the state
makes it easier to figure out what's going on :-)

A middle ground would be to have some isolated state for fuzzing any
particular "thing", but not necessarily reset between rounds.

> The upside of doing so would be the ability to test that parsing logic
> in modes that real-world syscall invocations may never exercise.

Sure.

> > > We would like to thank David Gow for his detailed feedback regarding the
> > > potential integration with KUnit. The v1 discussion highlighted three
> > > potential paths: making KFuzzTests a special case of KUnit tests, sharing
> > > implementation details in a common library, or keeping the frameworks
> > > separate while ensuring API familiarity.
> > >
> > > Following a productive conversation with David, we are moving forward
> > > with the third option for now. While tighter integration is an
> > > attractive long-term goal, we believe the most practical first step is
> > > to establish KFuzzTest as a valuable, standalone framework.
> >
> > I have been wondering about this from another perspective - with kunit
> > often running in ARCH=um, and there the kernel being "just" a userspace
> > process, we should be able to do a "classic" afl-style fork approach to
> > fuzzing.
>
> This approach is quite popular among security researchers, but if I'm
> understanding correctly, we are yet to see continuous integration of
> UML-based fuzzers with the kernel development process.

Well, chicken and egg type situation? There are no such fuzzers that
are actually easy to use and/or integrate, as far as I can tell.

I've been looking also at broader fuzzing tools such as nyx-fuzz and
related kafl [1] which are cool in theory (and are intended to address
your "cannot fork VMs quickly enough" issue), but ... while running a
modified host kernel etc.
is sufficient for research, it's practically impossible for deploying
things since you have to stay on top of security etc.

[1] https://intellabs.github.io/kAFL/tutorials/linux/fuzzing_linux_kernel.html

That said, it seems to me that upstream kvm code actually has Intel-PT
support and also dirty page logging (presumably for VM migration), so
I'm not entirely sure what the nyx/kafl host kernel actually really
adds. But I have yet to research this in detail, I've now asked some
folks at Intel who work(ed) on it.

> > That way, state doesn't really (have to) matter at all. This is
> > of course both an advantage (reproducing any issue found is just the
> > right test with a single input) and disadvantage (the fuzzer won't
> > modify state first and then find an issue on a later round.)
>
> From our experience, accumulated state is more of a disadvantage that
> we'd rather eliminate altogether.

Interesting. I mean, I do somewhat see it that way too from the
perspective of someone faced with inscrutable bug reports, but it also
seems that given enough resources/time, accumulated state lets a fuzzer
find more potential issues.

> syzkaller can chain syscalls and could in theory generate a single
> program that is elaborate enough to prepare the state and then find an
> issue.

Right, mostly, the whole "I found a reproducer now" thing, I guess.

> However, because resetting the kernel (rebooting machines or restoring
> VM snapshots) is costly, we have to run multiple programs on the same
> kernel instance, which interfere with each other.

(see above for the nyx/kafl reference)

> As a result, some bugs that are tricky to trigger become even trickier
> to reproduce, because one can't possibly replay all the interleavings
> of those programs.

Right.

> So, yes, assuming we can build the kernel with ARCH=um and run the
> function under test in a fork-per-run model, that would speed things
> up significantly.

Is it really a speed-up vs. resulting in more readable reports?
Possibly even at the expense of coverage?

But anyway, making that possible was indeed what I was thinking about.
It requires some special configuration and "magic" in UML, but it seems
eminently doable. Mapping KCOV to a given fuzzer's feedback might not
be trivial, but it should be possible too. In theory you could even
compile the whole UML kernel with say afl-clang, I suppose.

> > I was just looking at what external state (such as the physical memory
> > mapped) UML has and that would need to be disentangled, and it's not
> > _that_ much if we can have specific configurations, and maybe mostly
> > shut down the userspace that's running inside UML (and/or have kunit
> > execute before init/pid 1 when builtin.)
>
> I looked at UML myself around 2023, and back then my impression was
> that it didn't quite work with KASAN and KCOV, and adding an AFL
> dependency on top of that made every fuzzer a one-of-a-kind setup.

I'm not entirely sure about KCOV right now, but KASAN definitely works
today (not in 2023.) I agree that adding a fuzzer on top makes it a
one-of-a-kind setup, but I guess from my perspective adding
syzbot/syzkaller (inside) is really mostly the same, since we don't run
that ourselves right now.

> > Did you consider such a model at all, and have specific reasons for not
> > going in this direction, or simply didn't consider because you're coming
> > from the syzkaller side anyway?
>
> We did consider such a model, but decided against it, with the
> maintainability of the fuzzers being the main reason.
> We want to be sure that every fuzz target written for the kernel is
> still buildable when the code author turns back on it.
> We also want every target to be tested continuously and for the bugs
> to be reported automatically.
> Coming from the syzkaller side, it was natural to use the existing
> infrastructure for that instead of reinventing the wheel :)

Fair points, though I'd like to point out that really the only reason
this is true is the syzkaller availability: that ensures fuzz tests
would run continuously/automatically, thus ensuring it's buildable
(since you try that) and thus ensuring it'd be maintained. So it all
goes back to syzkaller existing already :-)

Which I'm not arguing is bad, quite the opposite, but I'm also close to
just giving up on the whole UML thing precisely _because_ of it, since
there's no way anyone can compete with Google's deployment, and adding
somewhat competing infrastructure to the kernel will just complicate
matters. Which is maybe unfortunate, because a fork/fuzz model often
seems more usable in practice, and in particular can also be used more
easily for regression tests.

Regression, btw, is perhaps something to consider here in this patch
set? Maybe some side files could be provided with each KFuzzTest that
something (kunit?) would run to ensure that the code didn't regress
when asked to parse those files?

> That being said, our current approach doesn't rule out UML.
> In the future, we could adapt the FUZZ_TEST macro to generate stubs
> that link against AFL, libFuzzer, or Centipede in UML builds.

That's also true, I guess; in some way this infrastructure would be
available for any fuzzer to link to, especially if we do something with
UML as I was thinking about.

Which is also in part why I was asking about the state though, since a
"reset the whole state" approach is maybe a bit more amenable to
actually letting the fuzzer modify state than the current approach.
Then again, given that syzbot always modifies state, maybe I'm changing
my opinion on this and will say that I'm not so sure any more that your
intention of fuzzing "low-level, relatively stateless functions" holds
that much water?
If in practice syzbot is the thing that runs this, then that doesn't
matter very much apart from having to ensure that it doesn't modify
state in a way that is completely invalid - but to some extent that'd
be a bug anyway, and e.g. memory allocations of a function can be freed
by the fuzztest wrapper code.

I guess I'll research the whole nyx thing a bit more, and maybe
reconsider giving up on the UML-based fork/fuzz model, if I can figure
out a way to integrate it with KFuzzTest and run those tests, rather
than my initial intent of integrating it with kunit. Some
infrastructure could be shared, although I had hoped things like kunit
asserts, memory allocations, etc. would be available to fuzz test code
just to be able to share setup/teardown infrastructure - I guess we'll
have to see how that plays out. :)

Thanks!

johannes
Hi again :-)

So I've been spending another day on this, looking at kafl/nyx as
promised, and thinking about afl++ integration.

> I've been looking also at broader fuzzing tools such as nyx-fuzz and
> related kafl [1] which are cool in theory (and are intended to address
> your "cannot fork VMs quickly enough" issue), but ... while running a
> modified host kernel etc. is sufficient for research, it's practically
> impossible for deploying things since you have to stay on top of
> security etc.
>
> [1] https://intellabs.github.io/kAFL/tutorials/linux/fuzzing_linux_kernel.html
>
> That said, it seems to me that upstream kvm code actually has Intel-PT
> support and also dirty page logging (presumably for VM migration), so
> I'm not entirely sure what the nyx/kafl host kernel actually really
> adds. But I have yet to research this in detail, I've now asked some
> folks at Intel who work(ed) on it.

It's actually a bit more nuanced - it can work without Intel-PT, using
instrumentation for feedback and the upstream kvm PML APIs, but then it
requires the "vmware backdoor" enabled. Also, the qemu they have is
based on version 4.2; according to the bug tracker there were two
failed attempts at forward-porting it.

> Which I'm not arguing is bad, quite the opposite, but I'm also close to
> just giving up on the whole UML thing precisely _because_ of it, since
> there's no way anyone can compete with Google's deployment, and adding
> somewhat competing infrastructure to the kernel will just complicate
> matters. Which is maybe unfortunate, because a fork/fuzz model often
> seems more usable in practice, and in particular can also be used more
> easily for regression tests.

Or maybe not, given the state of the kafl/nyx world... :)

I also just spent a bunch of time looking at integrating afl++ with
kcov and it seems ... tricky? There seem to be assumptions on the data
format in afl++, but the kcov data format is entirely different, both
for block and compare tracking.
I think it could be made to work most easily by first supporting
-fsanitize-coverage=trace-pc-guard in kcov (which is clang-only at this
point), and adding a new KCOV_TRACE_ mode for it, one that indexes by
guard pointer and assigns incrementing numbers to those like afl does,
or so?

I'd think it'd be useful to also be able to run afl++ on the kfuzztests
proposed here by forwarding the kcov data. For this though, it seems it
might also be useful to actually wait for remote kcov to finish? Yeah,
there's still the whole state issue, but at least (remote) kcov will
only trace code that's actually relevant to the injected data.

This would be with afl running as a normal userspace process against
the kfuzztest of the kernel it's running in, but with some additional
setup it'd also be possible to apply it to UML with forking to avoid
state issues. (And yes, kcov seems to work fine on UML.)

I guess I'll go play with this some unless someone sees total
show-stoppers.

johannes
On Tue, 2 Sept 2025 at 00:43, Ethan Graham <ethan.w.s.graham@gmail.com> wrote:
>
> From: Ethan Graham <ethangraham@google.com>
>
> This patch series introduces KFuzzTest, a lightweight framework for
> creating in-kernel fuzz targets for internal kernel functions.
>
> The primary motivation for KFuzzTest is to simplify the fuzzing of
> low-level, relatively stateless functions (e.g., data parsers, format
> converters) that are difficult to exercise effectively from the syscall
> boundary. It is intended for in-situ fuzzing of kernel code without
> requiring that it be built as a separate userspace library or that its
> dependencies be stubbed out. Using a simple macro-based API, developers
> can add a new fuzz target with minimal boilerplate code.
>
> The core design consists of three main parts:
> 1. A `FUZZ_TEST(name, struct_type)` macro that allows developers to
>    easily define a fuzz test.
> 2. A binary input format that allows a userspace fuzzer to serialize
>    complex, pointer-rich C structures into a single buffer.
> 3. Metadata for test targets, constraints, and annotations, which is
>    emitted into dedicated ELF sections to allow for discovery and
>    inspection by userspace tools. These are found in
>    ".kfuzztest_{targets, constraints, annotations}".
>
> To demonstrate this framework's viability, support for KFuzzTest has been
> prototyped in a development fork of syzkaller, enabling coverage-guided
> fuzzing. To validate its end-to-end effectiveness, we performed an
> experiment by manually introducing an off-by-one buffer over-read into
> pkcs7_parse_message, like so:
>
> -ret = asn1_ber_decoder(&pkcs7_decoder, ctx, data, datalen);
> +ret = asn1_ber_decoder(&pkcs7_decoder, ctx, data, datalen + 1);
>
> A syzkaller instance fuzzing the new test_pkcs7_parse_message target
> introduced in patch 7 successfully triggered the bug inside of
> asn1_ber_decoder in under 30 seconds from a cold start.
>
> This RFC continues to seek feedback on the overall design of KFuzzTest
> and the minor changes made in v2. We are particularly interested in
> comments on:
> - The ergonomics of the API for defining fuzz targets.
> - The overall workflow and usability for a developer adding and
>   running a new in-kernel fuzz target.
> - The high-level architecture.
>
> The patch series is structured as follows:
> - Patch 1 adds and exposes a new KASAN function needed by KFuzzTest.
> - Patch 2 introduces the core KFuzzTest API and data structures.
> - Patch 3 adds the runtime implementation for the framework.
> - Patch 4 adds a tool for sending structured inputs into a fuzz target.
> - Patch 5 adds documentation.
> - Patch 6 provides example fuzz targets.
> - Patch 7 defines fuzz targets for real kernel functions.
>
> Changes in v2:
> - Per feedback from Eric Biggers and Ignat Korchagin, move the /crypto
>   fuzz target samples into a new /crypto/tests directory to separate
>   them from the functional source code.
> - Per feedback from David Gow and Marco Elver, add the kfuzztest-bridge
>   tool to generate structured inputs for fuzz targets. The tool can
>   populate parts of the input structure with data from a file, enabling
>   both simple randomized fuzzing (e.g., using /dev/urandom) and
>   targeted testing with file-based inputs.
>
> We would like to thank David Gow for his detailed feedback regarding
> the potential integration with KUnit. The v1 discussion highlighted
> three potential paths: making KFuzzTests a special case of KUnit
> tests, sharing implementation details in a common library, or keeping
> the frameworks separate while ensuring API familiarity.
>
> Following a productive conversation with David, we are moving forward
> with the third option for now. While tighter integration is an
> attractive long-term goal, we believe the most practical first step is
> to establish KFuzzTest as a valuable, standalone framework.
> This avoids premature abstraction (e.g., creating a shared library
> with only one user) and allows KFuzzTest's design to stabilize based
> on its specific focus: fuzzing with complex, structured inputs.

Thanks, Ethan. I've had a bit of a play around with the
kfuzztest-bridge tool, and it seems to work pretty well here. I'm
definitely looking forward to trying it out further.

The only real feature I'd find useful would be to have a human-readable
way of describing the data (as well as the structure), which could be
useful when passing around reproducers, and could make it possible to
hand-craft or adapt cases to work cross-architecture, if that's a
future goal. But I don't think that it's worth holding up an initial
version for.

On the subject of architecture support, I don't see anything
particularly x86_64-specific in here (or at least, nothing that
couldn't be relatively easily fixed). While I don't think you need to
support lots of architectures immediately, it'd be nice to use
architecture-independent things (like the shared
include/asm-generic/vmlinux.lds.h) where possible. And even if you're
focusing on x86_64, supporting UML -- which is still x86
under-the-hood, but has its own linker scripts -- would be a nice
bonus if it's easy. Other things, like supporting 32-bit or big-endian
setups, are nice-to-have, but definitely not worth spending too much
time on immediately (though if we start using some of the
formats/features here for KUnit, we'll want to support them).

Finally, while I like the samples and documentation, I think it'd be
nice to include a working example of using kfuzztest-bridge alongside
the samples, even if it's something as simple as including a line like:

./kfuzztest-bridge "some_buffer { ptr[buf] len[buf, u64]}; buf { arr[u8, 128] };" "test_underflow_on_buffer" /dev/urandom

Regardless, this is very neat, and I can't wait (with some
apprehension) to see what it finds!

Cheers,
-- David
On Thu, Sep 4, 2025 at 11:11 AM David Gow <davidgow@google.com> wrote:
> Thanks, Ethan. I've had a bit of a play around with the
> kfuzztest-bridge tool, and it seems to work pretty well here. I'm
> definitely looking forward to trying it out further.
>
> The only real feature I'd find useful would be to have a
> human-readable way of describing the data (as well as the structure),
> which could be useful when passing around reproducers, and could make
> it possible to hand-craft or adapt cases to work cross-architecture,
> if that's a future goal. But I don't think that it's worth holding up
> an initial version for.

That's a great idea for a future iteration.

> On the subject of architecture support, I don't see anything
> particularly x86_64-specific in here (or at least, nothing that
> couldn't be relatively easily fixed). While I don't think you need to
> support lots of architectures immediately, it'd be nice to use
> architecture-independent things (like the shared
> include/asm-generic/vmlinux.lds.h) where possible. And even if you're

You're absolutely right. I made some modifications locally, and there
seems to be no reason not to add all of the required section
definitions to include/asm-generic/vmlinux.lds.h.

> focusing on x86_64, supporting UML -- which is still x86
> under-the-hood, but has its own linker scripts -- would be a nice
> bonus if it's easy. Other things, like supporting 32-bit or big-endian
> setups are nice-to-have, but definitely not worth spending too much
> time on immediately (though if we start using some of the
> formats/features here for KUnit, we'll want to support them).
>
> Finally, while I like the samples and documentation, I think it'd be
> nice to include a working example of using kfuzztest-bridge alongside
> the samples, even if it's something as simple as including a line
> like:
> ./kfuzztest-bridge "some_buffer { ptr[buf] len[buf, u64]}; buf {
> arr[u8, 128] };" "test_underflow_on_buffer" /dev/urandom

Definitely.
I'll be sure to add that to the docs.
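Moving the section definitions into the shared linker script, as
discussed above, would look something like the fragment below. The
section names (.kfuzztest_targets and friends) come from the cover
letter, but the macro and boundary-symbol names here are illustrative
guesses, not the actual patch:

```
/* Hypothetical helper for include/asm-generic/vmlinux.lds.h; to be
 * referenced from RO_DATA() or the per-arch linker scripts. */
#define KFUZZTEST_TABLES						\
	. = ALIGN(8);							\
	__kfuzztest_targets_start = .;					\
	KEEP(*(.kfuzztest_targets))					\
	__kfuzztest_targets_end = .;					\
	__kfuzztest_constraints_start = .;				\
	KEEP(*(.kfuzztest_constraints))					\
	__kfuzztest_constraints_end = .;				\
	__kfuzztest_annotations_start = .;				\
	KEEP(*(.kfuzztest_annotations))					\
	__kfuzztest_annotations_end = .;
```

The KEEP() wrappers matter with CONFIG_LD_DEAD_CODE_DATA_ELIMINATION,
since nothing references the metadata entries directly and the linker
would otherwise be free to discard them; putting the macro in the
generic script is also what lets UML and other architectures pick it up
without per-arch linker-script edits.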