From: Ethan Graham <ethangraham@google.com>

This patch series introduces KFuzzTest, a lightweight framework for
creating in-kernel fuzz targets for internal kernel functions.

The primary motivation for KFuzzTest is to simplify the fuzzing of
low-level, relatively stateless functions (e.g., data parsers, format
converters) that are difficult to exercise effectively from the syscall
boundary. It is intended for in-situ fuzzing of kernel code without
requiring that it be built as a separate userspace library or that its
dependencies be stubbed out. Using a simple macro-based API, developers
can add a new fuzz target with minimal boilerplate code.

The core design consists of three main parts:
1. A `FUZZ_TEST(name, struct_type)` macro that allows developers to
   easily define a fuzz test.
2. A binary input format that allows a userspace fuzzer to serialize
   complex, pointer-rich C structures into a single buffer.
3. Metadata for test targets, constraints, and annotations, which is
   emitted into dedicated ELF sections to allow for discovery and
   inspection by userspace tools. These are found in
   ".kfuzztest_{targets, constraints, annotations}".

To demonstrate this framework's viability, support for KFuzzTest has been
prototyped in a development fork of syzkaller, enabling coverage-guided
fuzzing. To validate its end-to-end effectiveness, we performed an
experiment by manually introducing an off-by-one buffer over-read into
pkcs7_parse_message, like so:

-ret = asn1_ber_decoder(&pkcs7_decoder, ctx, data, datalen);
+ret = asn1_ber_decoder(&pkcs7_decoder, ctx, data, datalen + 1);

A syzkaller instance fuzzing the new test_pkcs7_parse_message target
introduced in patch 7 successfully triggered the bug inside of
asn1_ber_decoder in under 30 seconds from a cold start.

This RFC continues to seek feedback on the overall design of KFuzzTest
and the minor changes made in V2. We are particularly interested in
comments on:
- The ergonomics of the API for defining fuzz targets.
- The overall workflow and usability for a developer adding and running
  a new in-kernel fuzz target.
- The high-level architecture.

The patch series is structured as follows:
- Patch 1 adds and exposes a new KASAN function needed by KFuzzTest.
- Patch 2 introduces the core KFuzzTest API and data structures.
- Patch 3 adds the runtime implementation for the framework.
- Patch 4 adds a tool for sending structured inputs into a fuzz target.
- Patch 5 adds documentation.
- Patch 6 provides example fuzz targets.
- Patch 7 defines fuzz targets for real kernel functions.

Changes in v2:
- Per feedback from Eric Biggers and Ignat Korchagin, move the /crypto
  fuzz target samples into a new /crypto/tests directory to separate
  them from the functional source code.
- Per feedback from David Gow and Marco Elver, add the kfuzztest-bridge
  tool to generate structured inputs for fuzz targets. The tool can
  populate parts of the input structure with data from a file, enabling
  both simple randomized fuzzing (e.g., using /dev/urandom) and
  targeted testing with file-based inputs.

We would like to thank David Gow for his detailed feedback regarding the
potential integration with KUnit. The v1 discussion highlighted three
potential paths: making KFuzzTests a special case of KUnit tests, sharing
implementation details in a common library, or keeping the frameworks
separate while ensuring API familiarity.

Following a productive conversation with David, we are moving forward
with the third option for now. While tighter integration is an
attractive long-term goal, we believe the most practical first step is
to establish KFuzzTest as a valuable, standalone framework. This avoids
premature abstraction (e.g., creating a shared library with only one
user) and allows KFuzzTest's design to stabilize based on its specific
focus: fuzzing with complex, structured inputs.
Ethan Graham (7):
  mm/kasan: implement kasan_poison_range
  kfuzztest: add user-facing API and data structures
  kfuzztest: implement core module and input processing
  tools: add kfuzztest-bridge utility
  kfuzztest: add ReST documentation
  kfuzztest: add KFuzzTest sample fuzz targets
  crypto: implement KFuzzTest targets for PKCS7 and RSA parsing

 Documentation/dev-tools/index.rst             |   1 +
 Documentation/dev-tools/kfuzztest.rst         | 371 +++++++++++++
 arch/x86/kernel/vmlinux.lds.S                 |  22 +
 crypto/asymmetric_keys/Kconfig                |  15 +
 crypto/asymmetric_keys/Makefile               |   2 +
 crypto/asymmetric_keys/tests/Makefile         |   2 +
 crypto/asymmetric_keys/tests/pkcs7_kfuzz.c    |  22 +
 .../asymmetric_keys/tests/rsa_helper_kfuzz.c  |  38 ++
 include/linux/kasan.h                         |  16 +
 include/linux/kfuzztest.h                     | 508 ++++++++++++++++++
 lib/Kconfig.debug                             |   1 +
 lib/Makefile                                  |   2 +
 lib/kfuzztest/Kconfig                         |  20 +
 lib/kfuzztest/Makefile                        |   4 +
 lib/kfuzztest/main.c                          | 163 ++++++
 lib/kfuzztest/parse.c                         | 208 +++++++
 mm/kasan/shadow.c                             |  31 ++
 samples/Kconfig                               |   7 +
 samples/Makefile                              |   1 +
 samples/kfuzztest/Makefile                    |   3 +
 samples/kfuzztest/overflow_on_nested_buffer.c |  52 ++
 samples/kfuzztest/underflow_on_buffer.c       |  41 ++
 tools/Makefile                                |  15 +-
 tools/kfuzztest-bridge/.gitignore             |   2 +
 tools/kfuzztest-bridge/Build                  |   6 +
 tools/kfuzztest-bridge/Makefile               |  48 ++
 tools/kfuzztest-bridge/bridge.c               |  93 ++++
 tools/kfuzztest-bridge/byte_buffer.c          |  87 +++
 tools/kfuzztest-bridge/byte_buffer.h          |  31 ++
 tools/kfuzztest-bridge/encoder.c              | 356 ++++++++++++
 tools/kfuzztest-bridge/encoder.h              |  16 +
 tools/kfuzztest-bridge/input_lexer.c          | 243 +++++++++
 tools/kfuzztest-bridge/input_lexer.h          |  57 ++
 tools/kfuzztest-bridge/input_parser.c         | 373 +++++++++++++
 tools/kfuzztest-bridge/input_parser.h         |  79 +++
 tools/kfuzztest-bridge/rand_stream.c          |  61 +++
 tools/kfuzztest-bridge/rand_stream.h          |  46 ++
 37 files changed, 3037 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/dev-tools/kfuzztest.rst
 create mode 100644 crypto/asymmetric_keys/tests/Makefile
 create mode 100644 crypto/asymmetric_keys/tests/pkcs7_kfuzz.c
 create mode 100644 crypto/asymmetric_keys/tests/rsa_helper_kfuzz.c
 create mode 100644 include/linux/kfuzztest.h
 create mode 100644 lib/kfuzztest/Kconfig
 create mode 100644 lib/kfuzztest/Makefile
 create mode 100644 lib/kfuzztest/main.c
 create mode 100644 lib/kfuzztest/parse.c
 create mode 100644 samples/kfuzztest/Makefile
 create mode 100644 samples/kfuzztest/overflow_on_nested_buffer.c
 create mode 100644 samples/kfuzztest/underflow_on_buffer.c
 create mode 100644 tools/kfuzztest-bridge/.gitignore
 create mode 100644 tools/kfuzztest-bridge/Build
 create mode 100644 tools/kfuzztest-bridge/Makefile
 create mode 100644 tools/kfuzztest-bridge/bridge.c
 create mode 100644 tools/kfuzztest-bridge/byte_buffer.c
 create mode 100644 tools/kfuzztest-bridge/byte_buffer.h
 create mode 100644 tools/kfuzztest-bridge/encoder.c
 create mode 100644 tools/kfuzztest-bridge/encoder.h
 create mode 100644 tools/kfuzztest-bridge/input_lexer.c
 create mode 100644 tools/kfuzztest-bridge/input_lexer.h
 create mode 100644 tools/kfuzztest-bridge/input_parser.c
 create mode 100644 tools/kfuzztest-bridge/input_parser.h
 create mode 100644 tools/kfuzztest-bridge/rand_stream.c
 create mode 100644 tools/kfuzztest-bridge/rand_stream.h

--
2.51.0.318.gd7df087d1a-goog
Hi Ethan,

Since I'm looking at some WiFi fuzzing just now ...

> The primary motivation for KFuzzTest is to simplify the fuzzing of
> low-level, relatively stateless functions (e.g., data parsers, format
> converters)

Could you clarify what you mean by "relatively" here? It seems to me
that if you let this fuzz say something like
cfg80211_inform_bss_frame_data(), which parses a frame and registers it
in the global scan list, you might quickly run into the 1000 limit of
the list, etc. since these functions are not stateless. OTOH, it's
obviously possible to just receive a lot of such frames over the air
even, or over simulated air like in syzbot today already.

> This RFC continues to seek feedback on the overall design of KFuzzTest
> and the minor changes made in V2. We are particularly interested in
> comments on:
> - The ergonomics of the API for defining fuzz targets.
> - The overall workflow and usability for a developer adding and running
>   a new in-kernel fuzz target.
> - The high-level architecture.

As far as the architecture is concerned, I'm reading this as built
around a syzkaller(-like) architecture, in that the fuzzer lives in the
fuzzed kernel's userspace, right?

> We would like to thank David Gow for his detailed feedback regarding the
> potential integration with KUnit. The v1 discussion highlighted three
> potential paths: making KFuzzTests a special case of KUnit tests, sharing
> implementation details in a common library, or keeping the frameworks
> separate while ensuring API familiarity.
>
> Following a productive conversation with David, we are moving forward
> with the third option for now. While tighter integration is an
> attractive long-term goal, we believe the most practical first step is
> to establish KFuzzTest as a valuable, standalone framework.
I have been wondering about this from another perspective - with kunit
often running in ARCH=um, and there the kernel being "just" a userspace
process, we should be able to do a "classic" afl-style fork approach to
fuzzing. That way, state doesn't really (have to) matter at all. This is
of course both an advantage (reproducing any issue found is just the
right test with a single input) and a disadvantage (the fuzzer won't
modify state first and then find an issue on a later round.)

I was just looking at what external state (such as the physical memory
mapped) UML has and that would need to be disentangled, and it's not
_that_ much if we can have specific configurations, and maybe mostly
shut down the userspace that's running inside UML (and/or have kunit
execute before init/pid 1 when builtin.)

Did you consider such a model at all, and have specific reasons for not
going in this direction, or did you simply not consider it because
you're coming from the syzkaller side anyway?

johannes
On Mon, Sep 8, 2025 at 3:11 PM Johannes Berg <johannes@sipsolutions.net> wrote:
>
> Hi Ethan,

Hi Johannes,

> Since I'm looking at some WiFi fuzzing just now ...
>
> > The primary motivation for KFuzzTest is to simplify the fuzzing of
> > low-level, relatively stateless functions (e.g., data parsers, format
> > converters)
>
> Could you clarify what you mean by "relatively" here? It seems to me
> that if you let this fuzz say something like
> cfg80211_inform_bss_frame_data(), which parses a frame and registers it
> in the global scan list, you might quickly run into the 1000 limit of
> the list, etc. since these functions are not stateless. OTOH, it's
> obviously possible to just receive a lot of such frames over the air
> even, or over simulated air like in syzbot today already.

While it would be very useful to be able to test every single function
in the kernel, there are limitations imposed by our approach.
To work around these limitations, some code may need to be refactored
for better testability, so that global state can be mocked out or
easily reset between runs.

I am not very familiar with the code in
cfg80211_inform_bss_frame_data(), but I can imagine that the code doing
the actual frame parsing could be untangled from the code that registers
it in the global list. The upside of doing so would be the ability to
test that parsing logic in modes that real-world syscall invocations may
never exercise.

> As far as the architecture is concerned, I'm reading this is built
> around syzkaller (like) architecture, in that the fuzzer lives in the
> fuzzed kernel's userspace, right?

This is correct.

> > We would like to thank David Gow for his detailed feedback regarding the
> > potential integration with KUnit. The v1 discussion highlighted three
> > potential paths: making KFuzzTests a special case of KUnit tests, sharing
> > implementation details in a common library, or keeping the frameworks
> > separate while ensuring API familiarity.
> >
> > Following a productive conversation with David, we are moving forward
> > with the third option for now. While tighter integration is an
> > attractive long-term goal, we believe the most practical first step is
> > to establish KFuzzTest as a valuable, standalone framework.
>
> I have been wondering about this from another perspective - with kunit
> often running in ARCH=um, and there the kernel being "just" a userspace
> process, we should be able to do a "classic" afl-style fork approach to
> fuzzing.

This approach is quite popular among security researchers, but if I'm
understanding correctly, we are yet to see continuous integration of
UML-based fuzzers with the kernel development process.

> That way, state doesn't really (have to) matter at all. This is
> of course both an advantage (reproducing any issue found is just the
> right test with a single input) and disadvantage (the fuzzer won't
> modify state first and then find an issue on a later round.)

From our experience, accumulated state is more of a disadvantage that
we'd rather eliminate altogether.

syzkaller can chain syscalls and could in theory generate a single
program that is elaborate enough to prepare the state and then find an
issue. However, because resetting the kernel (rebooting machines or
restoring VM snapshots) is costly, we have to run multiple programs on
the same kernel instance, which interfere with each other. As a result,
some bugs that are tricky to trigger become even trickier to reproduce,
because one can't possibly replay all the interleavings of those
programs.

So, yes, assuming we can build the kernel with ARCH=um and run the
function under test in a fork-per-run model, that would speed things up
significantly.
>
> I was just looking at what external state (such as the physical memory
> mapped) UML has and that would need to be disentangled, and it's not
> _that_ much if we can have specific configurations, and maybe mostly
> shut down the userspace that's running inside UML (and/or have kunit
> execute before init/pid 1 when builtin.)

I looked at UML myself around 2023, and back then my impression was
that it didn't quite work with KASAN and KCOV, and adding an AFL
dependency on top of that made every fuzzer a one-of-a-kind setup.

> Did you consider such a model at all, and have specific reasons for not
> going in this direction, or simply didn't consider because you're coming
> from the syzkaller side anyway?

We did consider such a model, but decided against it, with the
maintainability of the fuzzers being the main reason.
We want to be sure that every fuzz target written for the kernel is
still buildable when the code author turns their back on it.
We also want every target to be tested continuously and for the bugs
to be reported automatically.
Coming from the syzkaller side, it was natural to use the existing
infrastructure for that instead of reinventing the wheel :)

That being said, our current approach doesn't rule out UML.
In the future, we could adapt the FUZZ_TEST macro to generate stubs
that link against AFL, libFuzzer, or Centipede in UML builds.
The question of how to run those targets continuously would still be on
the table, though.
Hi,

Thanks for your response!

> > > The primary motivation for KFuzzTest is to simplify the fuzzing of
> > > low-level, relatively stateless functions (e.g., data parsers, format
> > > converters)
> >
> > Could you clarify what you mean by "relatively" here? It seems to me
> > that if you let this fuzz say something like
> > cfg80211_inform_bss_frame_data(), which parses a frame and registers it
> > in the global scan list, you might quickly run into the 1000 limit of
> > the list, etc. since these functions are not stateless. OTOH, it's
> > obviously possible to just receive a lot of such frames over the air
> > even, or over simulated air like in syzbot today already.
>
> While it would be very useful to be able to test every single function
> in the kernel, there are limitations imposed by our approach.
> To work around these limitations, some code may need to be refactored
> for better testability, so that global state can be mocked out or
> easily reset between runs.

Sure, I guess that'd be possible. Perhaps I'm more wondering if it's
actually desirable, but it sounds like at least that's how it was
intended to be used then.

> I am not very familiar with the code in
> cfg80211_inform_bss_frame_data(), but I can imagine that the code
> doing the actual frame parsing could be untangled from the code that
> registers it in the global list.

It could, but I'm actually less worried about the parsing code (it's
relatively simple to review) than about the data model in this code,
and trying to fuzz the data model generally requires the state. See
e.g. https://syzkaller.appspot.com/bug?extid=dc6f4dce0d707900cdea
(which I finally reproduced in a kunit test a few years after this was
originally reported.) I mean ...
I guess now I'm arguing against myself - having the state there is
required to find certain classes of bugs, but not having the state
makes it easier to figure out what's going on :-)

A middle ground would be to have some isolated state for fuzzing any
particular "thing", but not necessarily reset between rounds.

> The upside of doing so would be the ability to test that parsing logic
> in modes that real-world syscall invocations may never exercise.

Sure.

> > > We would like to thank David Gow for his detailed feedback regarding the
> > > potential integration with KUnit. The v1 discussion highlighted three
> > > potential paths: making KFuzzTests a special case of KUnit tests, sharing
> > > implementation details in a common library, or keeping the frameworks
> > > separate while ensuring API familiarity.
> > >
> > > Following a productive conversation with David, we are moving forward
> > > with the third option for now. While tighter integration is an
> > > attractive long-term goal, we believe the most practical first step is
> > > to establish KFuzzTest as a valuable, standalone framework.
> >
> > I have been wondering about this from another perspective - with kunit
> > often running in ARCH=um, and there the kernel being "just" a userspace
> > process, we should be able to do a "classic" afl-style fork approach to
> > fuzzing.
>
> This approach is quite popular among security researchers, but if I'm
> understanding correctly, we are yet to see continuous integration of
> UML-based fuzzers with the kernel development process.

Well, chicken and egg type situation? There are no such fuzzers that
are actually easy to use and/or integrate, as far as I can tell.

I've been looking also at broader fuzzing tools such as nyx-fuzz and
related kafl [1] which are cool in theory (and are intended to address
your "cannot fork VMs quickly enough" issue), but ... while running a
modified host kernel etc.
is sufficient for research, it's practically impossible for deploying
things since you have to stay on top of security etc.

[1] https://intellabs.github.io/kAFL/tutorials/linux/fuzzing_linux_kernel.html

That said, it seems to me that upstream kvm code actually has Intel-PT
support and also dirty page logging (presumably for VM migration), so
I'm not entirely sure what the nyx/kafl host kernel actually really
adds. But I have yet to research this in detail, I've now asked some
folks at Intel who work(ed) on it.

> > That way, state doesn't really (have to) matter at all. This is
> > of course both an advantage (reproducing any issue found is just the
> > right test with a single input) and disadvantage (the fuzzer won't
> > modify state first and then find an issue on a later round.)
>
> From our experience, accumulated state is more of a disadvantage that
> we'd rather eliminate altogether.

Interesting. I mean, I do somewhat see it that way too from the
perspective of someone faced with inscrutable bug reports, but it also
seems that given enough resources/time, accumulated state lets a fuzzer
find more potential issues.

> syzkaller can chain syscalls and could in theory generate a single
> program that is elaborate enough to prepare the state and then find an
> issue.

Right, mostly, the whole "I found a reproducer now" thing, I guess.

> However, because resetting the kernel (rebooting machines or restoring
> VM snapshots) is costly, we have to run multiple programs on the same
> kernel instance, which interfere with each other.

(see above for the nyx/kafl reference)

> As a result, some bugs that are tricky to trigger become even trickier
> to reproduce, because one can't possibly replay all the interleavings
> of those programs.

Right.

> So, yes, assuming we can build the kernel with ARCH=um and run the
> function under test in a fork-per-run model, that would speed things
> up significantly.

Is it really a speed-up vs. resulting in more readable reports?
Possibly even at the expense of coverage?

But anyway, making that possible was indeed what I was thinking about.
It requires some special configuration and "magic" in UML, but it seems
eminently doable. Mapping KCOV to a given fuzzer's feedback might not
be trivial, but it should be possible too. In theory you could even
compile the whole UML kernel with say afl-clang, I suppose.

> > I was just looking at what external state (such as the physical memory
> > mapped) UML has and that would need to be disentangled, and it's not
> > _that_ much if we can have specific configurations, and maybe mostly
> > shut down the userspace that's running inside UML (and/or have kunit
> > execute before init/pid 1 when builtin.)
>
> I looked at UML myself around 2023, and back then my impression was
> that it didn't quite work with KASAN and KCOV, and adding an AFL
> dependency on top of that made every fuzzer a one-of-a-kind setup.

I'm not entirely sure about KCOV right now, but KASAN definitely works
today (not in 2023.) I agree that adding a fuzzer on top makes it a
one-of-a-kind setup, but I guess from my perspective adding
syzbot/syzkaller (inside) is really mostly the same, since we don't run
that ourselves right now.

> > Did you consider such a model at all, and have specific reasons for not
> > going in this direction, or simply didn't consider because you're coming
> > from the syzkaller side anyway?
>
> We did consider such a model, but decided against it, with the
> maintainability of the fuzzers being the main reason.
> We want to be sure that every fuzz target written for the kernel is
> still buildable when the code author turns back on it.
> We also want every target to be tested continuously and for the bugs
> to be reported automatically.
> Coming from the syzkaller side, it was natural to use the existing
> infrastructure for that instead of reinventing the wheel :)

Fair points, though I'd like to point out that really the only reason
this is true is the syzkaller availability: that ensures fuzz tests
would run continuously/automatically, thus ensuring it's buildable
(since you try that) and thus ensuring it'd be maintained. So it all
goes back to syzkaller existing already :-)

Which I'm not arguing is bad, quite the opposite, but I'm also close to
just giving up on the whole UML thing precisely _because_ of it, since
there's no way anyone can compete with Google's deployment, and adding
somewhat competing infrastructure to the kernel will just complicate
matters. Which is maybe unfortunate, because a fork/fuzz model often
seems more usable in practice, and in particular can also be used more
easily for regression tests.

Regression, btw, is perhaps something to consider here in this patch
set? Maybe some side files could be provided with each KFuzzTest that
something (kunit?) would run to ensure that the code didn't regress
when asked to parse those files?

> That being said, our current approach doesn't rule out UML.
> In the future, we could adapt the FUZZ_TEST macro to generate stubs
> that link against AFL, libFuzzer, or Centipede in UML builds.

That's also true, I guess; in some way this infrastructure would be
available for any fuzzer to link to, especially if we do something with
UML as I was thinking about.

Which is also in part why I was asking about the state though, since a
"reset the whole state" approach is maybe a bit more amenable to
actually letting the fuzzer modify state than the current approach.
Then again, given that syzbot always modifies state, maybe I'm changing
my opinion on this and will say that I'm not so sure any more that your
intention of fuzzing "low-level, relatively stateless functions" holds
that much water?
If in practice syzbot is the thing that runs this, then that doesn't
matter very much apart from having to ensure that it doesn't modify
state in a way that is completely invalid - but to some extent that'd
be a bug anyway, and e.g. memory allocations of a function can be freed
by the fuzztest wrapper code.

I guess I'll research the whole nyx thing a bit more, and maybe
reconsider giving up on the UML-based fork/fuzz model, if I can figure
out a way to integrate it with KFuzzTest and run those tests, rather
than my initial intent of integrating it with kunit. Some
infrastructure could be shared, although I had hoped things like kunit
asserts, memory allocations, etc. would be available to fuzz test code
just to be able to share setup/teardown infrastructure - I guess we'll
have to see how that plays out. :)

Thanks!

johannes
Hi again :-)

So I've been spending another day on this, looking at kafl/nyx as
promised, and thinking about afl++ integration.

> I've been looking also at broader fuzzing tools such as nyx-fuzz and
> related kafl [1] which are cool in theory (and are intended to address
> your "cannot fork VMs quickly enough" issue), but ... while running a
> modified host kernel etc. is sufficient for research, it's practically
> impossible for deploying things since you have to stay on top of
> security etc.
>
> [1] https://intellabs.github.io/kAFL/tutorials/linux/fuzzing_linux_kernel.html
>
> That said, it seems to me that upstream kvm code actually has Intel-PT
> support and also dirty page logging (presumably for VM migration), so
> I'm not entirely sure what the nyx/kafl host kernel actually really
> adds. But I have yet to research this in detail, I've now asked some
> folks at Intel who work(ed) on it.

It's actually a bit more nuanced - it can work without Intel-PT, using
instrumentation for feedback and the upstream kvm PML APIs, but then it
requires the "vmware backdoor" enabled. Also, the qemu they have is
based on version 4.2; according to the bug tracker there were two
failed attempts at forward-porting it.

> Which I'm not arguing is bad, quite the opposite, but I'm also close to
> just giving up on the whole UML thing precisely _because_ of it, since
> there's no way anyone can compete with Google's deployment, and adding
> somewhat competing infrastructure to the kernel will just complicate
> matters. Which is maybe unfortunate, because a fork/fuzz model often
> seems more usable in practice, and in particular can also be used more
> easily for regression tests.

Or maybe not, given the state of the kafl/nyx world... :)

I also just spent a bunch of time looking at integrating afl++ with
kcov and it seems ... tricky? There seem to be assumptions on the data
format in afl++, but the kcov data format is entirely different, both
for block and compare tracking.
I think it could be made to work most easily by first supporting
-fsanitize-coverage=trace-pc-guard in kcov (which is clang-only at this
point), and adding a new KCOV_TRACE_ mode for it, one that indexes by
guard pointer and assigns incrementing numbers to those like afl does,
or so?

I'd think it'd be useful to also be able to run afl++ on the kfuzztests
proposed here by forwarding the kcov data. For this though, it seems it
might also be useful to actually wait for remote kcov to finish? Yeah,
there's still the whole state issue, but at least (remote) kcov will
only trace code that's actually relevant to the injected data.

This would be with afl running as a normal userspace process against
the kfuzztest of the kernel it's running in, but with some additional
setup it'd also be possible to apply it to UML with forking to avoid
state issues. (And yes, kcov seems to work fine on UML.)

I guess I'll go play with this some unless someone sees total
show-stoppers.

johannes
On Tue, 2 Sept 2025 at 00:43, Ethan Graham <ethan.w.s.graham@gmail.com> wrote:
>
> From: Ethan Graham <ethangraham@google.com>
>
> This patch series introduces KFuzzTest, a lightweight framework for
> creating in-kernel fuzz targets for internal kernel functions.
>
> The primary motivation for KFuzzTest is to simplify the fuzzing of
> low-level, relatively stateless functions (e.g., data parsers, format
> converters) that are difficult to exercise effectively from the syscall
> boundary. It is intended for in-situ fuzzing of kernel code without
> requiring that it be built as a separate userspace library or that its
> dependencies be stubbed out. Using a simple macro-based API, developers
> can add a new fuzz target with minimal boilerplate code.
>
> The core design consists of three main parts:
> 1. A `FUZZ_TEST(name, struct_type)` macro that allows developers to
>    easily define a fuzz test.
> 2. A binary input format that allows a userspace fuzzer to serialize
>    complex, pointer-rich C structures into a single buffer.
> 3. Metadata for test targets, constraints, and annotations, which is
>    emitted into dedicated ELF sections to allow for discovery and
>    inspection by userspace tools. These are found in
>    ".kfuzztest_{targets, constraints, annotations}".
>
> To demonstrate this framework's viability, support for KFuzzTest has been
> prototyped in a development fork of syzkaller, enabling coverage-guided
> fuzzing. To validate its end-to-end effectiveness, we performed an
> experiment by manually introducing an off-by-one buffer over-read into
> pkcs7_parse_message, like so:
>
> -ret = asn1_ber_decoder(&pkcs7_decoder, ctx, data, datalen);
> +ret = asn1_ber_decoder(&pkcs7_decoder, ctx, data, datalen + 1);
>
> A syzkaller instance fuzzing the new test_pkcs7_parse_message target
> introduced in patch 7 successfully triggered the bug inside of
> asn1_ber_decoder in under 30 seconds from a cold start.
>
> This RFC continues to seek feedback on the overall design of KFuzzTest
> and the minor changes made in v2. We are particularly interested in
> comments on:
> - The ergonomics of the API for defining fuzz targets.
> - The overall workflow and usability for a developer adding and
>   running a new in-kernel fuzz target.
> - The high-level architecture.
>
> The patch series is structured as follows:
> - Patch 1 adds and exposes a new KASAN function needed by KFuzzTest.
> - Patch 2 introduces the core KFuzzTest API and data structures.
> - Patch 3 adds the runtime implementation for the framework.
> - Patch 4 adds a tool for sending structured inputs into a fuzz target.
> - Patch 5 adds documentation.
> - Patch 6 provides example fuzz targets.
> - Patch 7 defines fuzz targets for real kernel functions.
>
> Changes in v2:
> - Per feedback from Eric Biggers and Ignat Korchagin, move the /crypto
>   fuzz target samples into a new /crypto/tests directory to separate
>   them from the functional source code.
> - Per feedback from David Gow and Marco Elver, add the kfuzztest-bridge
>   tool to generate structured inputs for fuzz targets. The tool can
>   populate parts of the input structure with data from a file, enabling
>   both simple randomized fuzzing (e.g., using /dev/urandom) and
>   targeted testing with file-based inputs.
>
> We would like to thank David Gow for his detailed feedback regarding
> the potential integration with KUnit. The v1 discussion highlighted
> three potential paths: making KFuzzTests a special case of KUnit
> tests, sharing implementation details in a common library, or keeping
> the frameworks separate while ensuring API familiarity.
>
> Following a productive conversation with David, we are moving forward
> with the third option for now. While tighter integration is an
> attractive long-term goal, we believe the most practical first step is
> to establish KFuzzTest as a valuable, standalone framework.
> This avoids premature abstraction (e.g., creating a shared library
> with only one user) and allows KFuzzTest's design to stabilize based
> on its specific focus: fuzzing with complex, structured inputs.

Thanks, Ethan. I've had a bit of a play around with the
kfuzztest-bridge tool, and it seems to work pretty well here. I'm
definitely looking forward to trying it out further.

The only real feature I'd find useful would be to have a human-readable
way of describing the data (as well as the structure), which could be
useful when passing around reproducers, and could make it possible to
hand-craft or adapt cases to work cross-architecture, if that's a
future goal. But I don't think that it's worth holding up an initial
version for.

On the subject of architecture support, I don't see anything
particularly x86_64-specific in here (or at least, nothing that
couldn't be relatively easily fixed). While I don't think you need to
support lots of architectures immediately, it'd be nice to use
architecture-independent things (like the shared
include/asm-generic/vmlinux.lds.h) where possible. And even if you're
focusing on x86_64, supporting UML -- which is still x86
under-the-hood, but has its own linker scripts -- would be a nice
bonus if it's easy. Other things, like supporting 32-bit or big-endian
setups, are nice-to-have, but definitely not worth spending too much
time on immediately (though if we start using some of the
formats/features here for KUnit, we'll want to support them).

Finally, while I like the samples and documentation, I think it'd be
nice to include a working example of using kfuzztest-bridge alongside
the samples, even if it's something as simple as including a line like:

./kfuzztest-bridge "some_buffer { ptr[buf] len[buf, u64]}; buf { arr[u8, 128] };" "test_underflow_on_buffer" /dev/urandom

Regardless, this is very neat, and I can't wait (with some
apprehension) to see what it finds!

Cheers,
-- David
On Thu, Sep 4, 2025 at 11:11 AM David Gow <davidgow@google.com> wrote:
> Thanks, Ethan. I've had a bit of a play around with the
> kfuzztest-bridge tool, and it seems to work pretty well here. I'm
> definitely looking forward to trying it out further.
>
> The only real feature I'd find useful would be to have a
> human-readable way of describing the data (as well as the structure),
> which could be useful when passing around reproducers, and could make
> it possible to hand-craft or adapt cases to work cross-architecture,
> if that's a future goal. But I don't think that it's worth holding up
> an initial version for.

That's a great idea for a future iteration.

> On the subject of architecture support, I don't see anything
> particularly x86_64-specific in here (or at least, nothing that
> couldn't be relatively easily fixed). While I don't think you need to
> support lots of architectures immediately, it'd be nice to use
> architecture-independent things (like the shared
> include/asm-generic/vmlinux.lds.h) where possible. And even if you're

You're absolutely right. I made some modifications locally, and there
seems to be no reason not to add all of the required section
definitions to include/asm-generic/vmlinux.lds.h.

> focusing on x86_64, supporting UML -- which is still x86
> under-the-hood, but has its own linker scripts -- would be a nice
> bonus if it's easy. Other things, like supporting 32-bit or big-endian
> setups are nice-to-have, but definitely not worth spending too much
> time on immediately (though if we start using some of the
> formats/features here for KUnit, we'll want to support them).
>
> Finally, while I like the samples and documentation, I think it'd be
> nice to include a working example of using kfuzztest-bridge alongside
> the samples, even if it's something as simple as including a line
> like:
> ./kfuzztest-bridge "some_buffer { ptr[buf] len[buf, u64]}; buf {
> arr[u8, 128] };" "test_underflow_on_buffer" /dev/urandom

Definitely.
I'll be sure to add that to the docs.
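Moving the section definitions into the shared linker script, as
discussed above, would look something like the fragment below. The
section names (.kfuzztest_targets and friends) come from the cover
letter, but the macro and boundary-symbol names here are illustrative
guesses, not the actual patch:

```
/* Hypothetical helper for include/asm-generic/vmlinux.lds.h; to be
 * referenced from RO_DATA() or the per-arch linker scripts. */
#define KFUZZTEST_TABLES						\
	. = ALIGN(8);							\
	__kfuzztest_targets_start = .;					\
	KEEP(*(.kfuzztest_targets))					\
	__kfuzztest_targets_end = .;					\
	__kfuzztest_constraints_start = .;				\
	KEEP(*(.kfuzztest_constraints))					\
	__kfuzztest_constraints_end = .;				\
	__kfuzztest_annotations_start = .;				\
	KEEP(*(.kfuzztest_annotations))					\
	__kfuzztest_annotations_end = .;
```

The KEEP() wrappers matter with CONFIG_LD_DEAD_CODE_DATA_ELIMINATION,
since nothing references the metadata entries directly and the linker
would otherwise be free to discard them; putting the macro in the
generic script is also what lets UML and other architectures pick it up
without per-arch linker-script edits.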