[RFC PATCH] xen: add libafl-qemu fuzzer support

git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20241114224636.1942089-1-volodymyr._5Fbabchuk@epam.com

Posted by Volodymyr Babchuk 6 days, 10 hours ago
LibAFL, which is part of the AFL++ project, is a tool that allows
us to fuzz bare-metal code (the Xen hypervisor in this case)
using QEMU as an emulator. It uses QEMU's ability to create
snapshots to run many tests relatively quickly: system state is saved
right before executing a new test and restored after the test
finishes.

This patch adds all the plumbing necessary to run an aarch64 build of Xen
inside the LibAFL-QEMU fuzzer. From the Xen perspective we need to
do the following things:

1. Be able to communicate with the LibAFL-QEMU fuzzer. This is done by
executing special opcodes that only LibAFL-QEMU can handle.

2. Use the interface from item 1 to tell the fuzzer about the Xen code
section, so the fuzzer knows which part of the code to track when
gathering coverage data.

3. Report crashes to the fuzzer. This is done in the panic() function.

4. Prevent the test harness from shooting itself in the foot.
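
For illustration, items 1-3 can be modelled on a host roughly as
follows. This is only a sketch: the real calls execute a magic opcode
that LibAFL-QEMU intercepts, which is replaced here by a hypothetical
mock_backend() so that the code runs anywhere. The command and status
numbers are copied from the headers added by this patch; struct
mock_fuzzer, mock_backend() and report_crash() are names invented for
the example and do not exist in Xen or LibAFL.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t libafl_word;

/* Values copied from libafl_qemu_defs.h and libafl_qemu.h in this patch. */
enum { CMD_END = 4, CMD_VADDR_FILTER_ALLOW = 8 };
enum { END_UNKNOWN = 0, END_OK = 1, END_CRASH = 2 };

/* Hypothetical record of what the fuzzer side has been told so far. */
struct mock_fuzzer {
    libafl_word trace_start, trace_end;
    int end_status;   /* -1 until an END command arrives */
};

struct mock_fuzzer mock = { 0, 0, -1 };

/* Stand-in for the ".word <opcode>" sequence: x0 = command, x1/x2 = args. */
libafl_word mock_backend(libafl_word cmd, libafl_word a1, libafl_word a2)
{
    switch (cmd) {
    case CMD_VADDR_FILTER_ALLOW:
        mock.trace_start = a1;
        mock.trace_end = a2;
        return 0;
    case CMD_END:
        mock.end_status = (int)a1;
        return 0;
    default:
        return (libafl_word)-1;   /* command not modelled here */
    }
}

/* Item 2: tell the fuzzer which code range to trace. */
void trace_vaddr_range(libafl_word start, libafl_word end)
{
    assert(start < end);
    mock_backend(CMD_VADDR_FILTER_ALLOW, start, end);
}

/* Item 3: what panic() does in a fuzzing build. */
void report_crash(void)
{
    mock_backend(CMD_END, END_CRASH, 0);
}
```

In the patch itself the same two steps are done by init_afl(), which
passes _stext/_etext, and by the hook added to panic().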

Right now the test harness is an external component, because we want to
test external Xen interfaces, but it is possible to fuzz internal code
if we want to.

The test harness is implemented as a Zephyr-based application which is
launched as the Dom0 kernel and performs different tests. As the test
harness can issue a hypercall that shuts itself down, the Kconfig option
CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING was added. It tells the
fuzzer that the test completed successfully if Dom0 tries to shut
itself (or the whole machine) down.
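
The effect of the option can be modelled like this. The sketch is not
taken from the fuzzer sources: classify_run(), struct run and enum
verdict are hypothetical names, and the only facts encoded are the ones
stated above and in the Kconfig help text (a run that never sends an
END report times out and is counted as a crash).

```c
#include <assert.h>
#include <stdbool.h>

enum verdict { VERDICT_OK, VERDICT_CRASH };

/* Hypothetical summary of one fuzzing iteration. */
struct run {
    bool xen_panicked;        /* panic() reported END_CRASH */
    bool harness_reported_ok; /* harness survived and reported END_OK */
    bool dom0_stopped;        /* hypercall shut down or blocked Dom0 */
};

enum verdict classify_run(struct run r, bool pass_blocking)
{
    /* A stopped Dom0 cannot also have reported a result. */
    assert(!(r.harness_reported_ok && r.dom0_stopped));

    if (r.xen_panicked)
        return VERDICT_CRASH;
    if (r.harness_reported_ok)
        return VERDICT_OK;
    /* Dom0 can no longer report; only Xen itself can end the test. */
    if (r.dom0_stopped && pass_blocking)
        return VERDICT_OK;
    /* No END command arrives: the fuzzer times out and records a crash. */
    return VERDICT_CRASH;
}
```

Without the option, a legitimate shutdown request from the harness falls
through to the last return and shows up as a false crash.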

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>

---

I tried to fuzz the vGIC emulator and the hypercall interface. While vGIC
fuzzing didn't yield any interesting results, hypercall fuzzing found a
way to crash the hypervisor from Dom0 on aarch64, using
"XEN_SYSCTL_page_offline_op" with the "sysctl_query_page_offline" sub-op,
because it leads to a page_is_ram_type() call, which is marked
UNREACHABLE on ARM.
---
 docs/hypervisor-guide/fuzzing.rst           | 102 +++++++++++++
 xen/arch/arm/Kconfig.debug                  |  26 ++++
 xen/arch/arm/Makefile                       |   1 +
 xen/arch/arm/include/asm/libafl_qemu.h      |  54 +++++++
 xen/arch/arm/include/asm/libafl_qemu_defs.h |  37 +++++
 xen/arch/arm/libafl_qemu.c                  | 152 ++++++++++++++++++++
 xen/arch/arm/psci.c                         |  13 ++
 xen/common/sched/core.c                     |  17 +++
 xen/common/shutdown.c                       |   7 +
 xen/drivers/char/console.c                  |   8 ++
 10 files changed, 417 insertions(+)
 create mode 100644 docs/hypervisor-guide/fuzzing.rst
 create mode 100644 xen/arch/arm/include/asm/libafl_qemu.h
 create mode 100644 xen/arch/arm/include/asm/libafl_qemu_defs.h
 create mode 100644 xen/arch/arm/libafl_qemu.c

diff --git a/docs/hypervisor-guide/fuzzing.rst b/docs/hypervisor-guide/fuzzing.rst
new file mode 100644
index 0000000000..9570de7670
--- /dev/null
+++ b/docs/hypervisor-guide/fuzzing.rst
@@ -0,0 +1,102 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+
+Fuzzing
+=======
+
+It is possible to use LibAFL-QEMU for fuzzing the hypervisor. Right now
+only aarch64 is supported and only hypercall fuzzing is enabled in the
+test harness, but there are plans to add vGIC interface fuzzing, PSCI
+fuzzing and vPL011 fuzzing as well.
+
+
+Principle of operation
+----------------------
+
+LibAFL-QEMU is a part of the American Fuzzy Lop plus plus (AKA AFL++)
+project. It uses a special build of QEMU that allows fuzzing bare-metal
+software like the Xen hypervisor or the Linux kernel. The basic idea is
+that we have software under test (the Xen hypervisor in our case) and a
+test harness application. The test harness uses a special protocol to
+communicate with LibAFL outside of QEMU to get input data and report the
+test result. LibAFL monitors which branches are taken by Xen and mutates
+the input data in an attempt to discover new code paths that can
+eventually lead to a crash or other unintended behavior.
+
+LibAFL uses QEMU's ``snapshot`` feature to run multiple tests without
+restarting the whole system every time. This speeds up the fuzzing
+process greatly.
+
+So, to try Xen fuzzing we need three components: a LibAFL-based fuzzer,
+a test harness and Xen itself.
+
+Building Xen for fuzzing
+------------------------
+
+The Xen hypervisor should be built with these two options::
+
+ CONFIG_LIBAFL_QEMU_FUZZER=y
+ CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING=y
+
+Building test harness
+---------------------
+
+We need to perform low-level actions, like issuing random hypercalls, so
+the test harness is a special build of a Zephyr application.
+
+You need to prepare an environment for building Zephyr as described in
+the `getting started guide
+<https://docs.zephyrproject.org/latest/develop/getting_started/index.html>`_.
+
+Next, download the test harness application and build it::
+
+  # mkdir zephyr-harness
+  # cd zephyr-harness
+  # west init -m https://github.com/xen-troops/xen-fuzzer-harness
+  # cd xen-fuzzer-harness
+  # west update
+  # west build
+
+Building LibAFL-QEMU based fuzzer
+---------------------------------
+
+The fuzzer is written in Rust, so you need a Rust toolchain and the
+``cargo`` tool on your system. Please refer to your distro documentation
+on how to obtain them.
+
+Once Rust is ready, fetch and build the fuzzer::
+
+  # git clone https://github.com/xen-troops/xen-fuzzer-rs
+  # cd xen-fuzzer-rs
+  # cargo build
+
+Running the fuzzer
+------------------
+
+Run it like this::
+
+  # target/debug/xen_fuzzer  -accel tcg \
+  -machine virt,virtualization=yes,acpi=off \
+  -m 4G \
+  -L  target/debug/qemu-libafl-bridge/pc-bios  \
+  -nographic \
+  -cpu max \
+  -append 'dom0_mem=512 loglvl=none guest_loglvl=none console=dtuart' \
+  -kernel /path/to/xen/xen/xen \
+  -device guest-loader,addr=0x42000000,kernel=/path/to/zephyr-harness/build/zephyr/zephyr.bin \
+  -snapshot
+
+Any inputs that led to crashes will be found in the ``crashes`` directory.
+
+You can use standard QEMU parameters to redirect console output,
+change memory size, HW composition, etc.
+
+
+TODOs
+-----
+
+ - Add x86 support.
+ - Implement fuzzing of other external hypervisor interfaces.
+ - Better integration: the user should not have to build multiple
+   different projects manually.
+ - Add the ability to re-run the fuzzer with a given input to make crash
+   debugging more comfortable.
diff --git a/xen/arch/arm/Kconfig.debug b/xen/arch/arm/Kconfig.debug
index 7660e599c0..9e2c4649ed 100644
--- a/xen/arch/arm/Kconfig.debug
+++ b/xen/arch/arm/Kconfig.debug
@@ -183,3 +183,29 @@ config EARLY_PRINTK_INC
 	default "debug-mvebu.inc" if EARLY_UART_MVEBU
 	default "debug-pl011.inc" if EARLY_UART_PL011
 	default "debug-scif.inc" if EARLY_UART_SCIF
+
+config LIBAFL_QEMU_FUZZER
+	bool "Enable LibAFL-QEMU calls"
+	help
+	  This option enables support for LibAFL-QEMU calls. Enable this
+	  only when you are going to run the hypervisor inside LibAFL-QEMU.
+	  Xen will report its code section to LibAFL and will report a
+	  crash when it panics.
+
+	  Do not try to run Xen built with this option on any real hardware
+	  or plain QEMU, because it will just crash during startup.
+
+config LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+	depends on LIBAFL_QEMU_FUZZER
+	bool "LibAFL: Report any attempt to suspend/destroy a domain as a success"
+	help
+	  When fuzzing hypercalls, the fuzzer will sometimes issue a hypercall
+	  that leads to a domain shutdown, or machine shutdown, or a vCPU being
+	  blocked, or something similar. In this case the test harness will not
+	  be able to report a successfully handled call to the fuzzer. The
+	  fuzzer will report a timeout and mark this as a crash, which is not
+	  true. So, in such cases we need to report a successful test case from
+	  the hypervisor itself.
+
+	  Enable this option only if a fuzzing attempt can lead to a legitimate
+	  stoppage, like when fuzzing hypercalls or PSCI.
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index e4ad1ce851..51b5e15b6a 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -10,6 +10,7 @@ obj-$(CONFIG_TEE) += tee/
 obj-$(CONFIG_HAS_VPCI) += vpci.o
 
 obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
+obj-$(CONFIG_LIBAFL_QEMU_FUZZER) += libafl_qemu.o
 obj-y += cpuerrata.o
 obj-y += cpufeature.o
 obj-y += decode.o
diff --git a/xen/arch/arm/include/asm/libafl_qemu.h b/xen/arch/arm/include/asm/libafl_qemu.h
new file mode 100644
index 0000000000..b90cf48b9a
--- /dev/null
+++ b/xen/arch/arm/include/asm/libafl_qemu.h
@@ -0,0 +1,54 @@
+#ifndef LIBAFL_QEMU_H
+#define LIBAFL_QEMU_H
+
+#include <xen/stdint.h>
+#include "libafl_qemu_defs.h"
+#define LIBAFL_QEMU_PRINTF_MAX_SIZE 4096
+
+typedef uint64_t libafl_word;
+
+/**
+ * LibAFL QEMU header file.
+ *
+ * This file is a portable header file used to build target harnesses more
+ * conveniently. Its main purpose is to generate ready-to-use calls to
+ * communicate with the fuzzer. The list of commands is available at the bottom
+ * of this file. The rest mostly consists of macros generating the code used by
+ * the commands.
+ */
+
+enum LibaflQemuEndStatus {
+  LIBAFL_QEMU_END_UNKNOWN = 0,
+  LIBAFL_QEMU_END_OK = 1,
+  LIBAFL_QEMU_END_CRASH = 2,
+};
+
+libafl_word libafl_qemu_start_virt(void *buf_vaddr, libafl_word max_len);
+
+libafl_word libafl_qemu_start_phys(void *buf_paddr, libafl_word max_len);
+
+libafl_word libafl_qemu_input_virt(void *buf_vaddr, libafl_word max_len);
+
+libafl_word libafl_qemu_input_phys(void *buf_paddr, libafl_word max_len);
+
+void libafl_qemu_end(enum LibaflQemuEndStatus status);
+
+void libafl_qemu_save(void);
+
+void libafl_qemu_load(void);
+
+libafl_word libafl_qemu_version(void);
+
+void libafl_qemu_page_current_allow(void);
+
+void libafl_qemu_internal_error(void);
+
+void __attribute__((format(printf, 1, 2))) lqprintf(const char *fmt, ...);
+
+void libafl_qemu_test(void);
+
+void libafl_qemu_trace_vaddr_range(libafl_word start, libafl_word end);
+
+void libafl_qemu_trace_vaddr_size(libafl_word start, libafl_word size);
+
+#endif
diff --git a/xen/arch/arm/include/asm/libafl_qemu_defs.h b/xen/arch/arm/include/asm/libafl_qemu_defs.h
new file mode 100644
index 0000000000..2866cadaac
--- /dev/null
+++ b/xen/arch/arm/include/asm/libafl_qemu_defs.h
@@ -0,0 +1,37 @@
+#ifndef LIBAFL_QEMU_DEFS
+#define LIBAFL_QEMU_DEFS
+
+#define LIBAFL_STRINGIFY(s) #s
+#define XSTRINGIFY(s) LIBAFL_STRINGIFY(s)
+
+#if __STDC_VERSION__ >= 201112L
+  #define STATIC_CHECKS                                   \
+    _Static_assert(sizeof(void *) <= sizeof(libafl_word), \
+                   "pointer type should not be larger than libafl_word");
+#else
+  #define STATIC_CHECKS
+#endif
+
+#define LIBAFL_SYNC_EXIT_OPCODE 0x66f23a0f
+#define LIBAFL_BACKDOOR_OPCODE 0x44f23a0f
+
+#define LIBAFL_QEMU_TEST_VALUE 0xcafebabe
+
+#define LIBAFL_QEMU_HDR_VERSION_NUMBER 0111  // TODO: find a nice way to set it.
+
+typedef enum LibaflQemuCommand {
+  LIBAFL_QEMU_COMMAND_START_VIRT = 0,
+  LIBAFL_QEMU_COMMAND_START_PHYS = 1,
+  LIBAFL_QEMU_COMMAND_INPUT_VIRT = 2,
+  LIBAFL_QEMU_COMMAND_INPUT_PHYS = 3,
+  LIBAFL_QEMU_COMMAND_END = 4,
+  LIBAFL_QEMU_COMMAND_SAVE = 5,
+  LIBAFL_QEMU_COMMAND_LOAD = 6,
+  LIBAFL_QEMU_COMMAND_VERSION = 7,
+  LIBAFL_QEMU_COMMAND_VADDR_FILTER_ALLOW = 8,
+  LIBAFL_QEMU_COMMAND_INTERNAL_ERROR = 9,
+  LIBAFL_QEMU_COMMAND_LQPRINTF = 10,
+  LIBAFL_QEMU_COMMAND_TEST = 11,
+} LibaflExit;
+
+#endif
diff --git a/xen/arch/arm/libafl_qemu.c b/xen/arch/arm/libafl_qemu.c
new file mode 100644
index 0000000000..58924ce6c6
--- /dev/null
+++ b/xen/arch/arm/libafl_qemu.c
@@ -0,0 +1,152 @@
+/* SPDX-License-Identifier: Apache-2.0 */
+/*
+   This file is based on libafl_qemu_impl.h and libafl_qemu_qemu_arch.h
+   from LibAFL project.
+*/
+#include <xen/lib.h>
+#include <xen/init.h>
+#include <xen/kernel.h>
+#include <asm/libafl_qemu.h>
+
+#define LIBAFL_DEFINE_FUNCTIONS(name, opcode)				\
+	libafl_word _libafl_##name##_call0(	\
+		libafl_word action) {					\
+		libafl_word ret;					\
+		__asm__ volatile (					\
+			"mov x0, %1\n"					\
+			".word " XSTRINGIFY(opcode) "\n"		\
+			"mov %0, x0\n"					\
+			: "=r"(ret)					\
+			: "r"(action)					\
+			: "x0"						\
+			);						\
+		return ret;						\
+	}								\
+									\
+	libafl_word _libafl_##name##_call1(	\
+		libafl_word action, libafl_word arg1) {			\
+		libafl_word ret;					\
+		__asm__ volatile (					\
+			"mov x0, %1\n"					\
+			"mov x1, %2\n"					\
+			".word " XSTRINGIFY(opcode) "\n"		\
+			"mov %0, x0\n"					\
+			: "=r"(ret)					\
+			: "r"(action), "r"(arg1)			\
+			: "x0", "x1"					\
+			);						\
+		return ret;						\
+	}								\
+									\
+	libafl_word _libafl_##name##_call2(	\
+		libafl_word action, libafl_word arg1, libafl_word arg2) { \
+		libafl_word ret;					\
+		__asm__ volatile (					\
+			"mov x0, %1\n"					\
+			"mov x1, %2\n"					\
+			"mov x2, %3\n"					\
+			".word " XSTRINGIFY(opcode) "\n"		\
+			"mov %0, x0\n"					\
+			: "=r"(ret)					\
+			: "r"(action), "r"(arg1), "r"(arg2)		\
+			: "x0", "x1", "x2"				\
+			);						\
+		return ret;						\
+	}
+
+// Generates sync exit functions
+LIBAFL_DEFINE_FUNCTIONS(sync_exit, LIBAFL_SYNC_EXIT_OPCODE)
+
+// Generates backdoor functions
+LIBAFL_DEFINE_FUNCTIONS(backdoor, LIBAFL_BACKDOOR_OPCODE)
+
+static char _lqprintf_buffer[LIBAFL_QEMU_PRINTF_MAX_SIZE] = {0};
+
+libafl_word libafl_qemu_start_virt(void       *buf_vaddr,
+                                            libafl_word max_len) {
+  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_START_VIRT,
+                                 (libafl_word)buf_vaddr, max_len);
+}
+
+libafl_word libafl_qemu_start_phys(void       *buf_paddr,
+                                            libafl_word max_len) {
+  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_START_PHYS,
+                                 (libafl_word)buf_paddr, max_len);
+}
+
+libafl_word libafl_qemu_input_virt(void       *buf_vaddr,
+                                            libafl_word max_len) {
+  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_INPUT_VIRT,
+                                 (libafl_word)buf_vaddr, max_len);
+}
+
+libafl_word libafl_qemu_input_phys(void       *buf_paddr,
+                                            libafl_word max_len) {
+  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_INPUT_PHYS,
+                                 (libafl_word)buf_paddr, max_len);
+}
+
+void libafl_qemu_end(enum LibaflQemuEndStatus status) {
+  _libafl_sync_exit_call1(LIBAFL_QEMU_COMMAND_END, status);
+}
+
+void libafl_qemu_save(void) {
+  _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_SAVE);
+}
+
+void libafl_qemu_load(void) {
+  _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_LOAD);
+}
+
+libafl_word libafl_qemu_version(void) {
+  return _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_VERSION);
+}
+
+void libafl_qemu_internal_error(void) {
+  _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_INTERNAL_ERROR);
+}
+
+void lqprintf(const char *fmt, ...) {
+  va_list args;
+  int res;
+  va_start(args, fmt);
+  res = vsnprintf(_lqprintf_buffer, LIBAFL_QEMU_PRINTF_MAX_SIZE, fmt, args);
+  va_end(args);
+
+  if (res >= LIBAFL_QEMU_PRINTF_MAX_SIZE) {
+    // buffer is not big enough, either recompile the target with more
+    // space or print less things
+    libafl_qemu_internal_error();
+  }
+
+  _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_LQPRINTF,
+                          (libafl_word)_lqprintf_buffer, res);
+}
+
+void libafl_qemu_test(void) {
+  _libafl_sync_exit_call1(LIBAFL_QEMU_COMMAND_TEST, LIBAFL_QEMU_TEST_VALUE);
+}
+
+void libafl_qemu_trace_vaddr_range(libafl_word start,
+                                            libafl_word end) {
+  _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_VADDR_FILTER_ALLOW, start, end);
+}
+
+void libafl_qemu_trace_vaddr_size(libafl_word start,
+                                           libafl_word size) {
+  libafl_qemu_trace_vaddr_range(start, start + size);
+}
+
+static int init_afl(void)
+{
+	vaddr_t xen_text_start = (vaddr_t)_stext;
+	vaddr_t xen_text_end = (vaddr_t)_etext;
+
+	lqprintf("Telling AFL about code section: %lx - %lx\n", xen_text_start, xen_text_end);
+
+	libafl_qemu_trace_vaddr_range(xen_text_start, xen_text_end);
+
+	return 0;
+}
+
+__initcall(init_afl);
diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
index b6860a7760..c7a51a1144 100644
--- a/xen/arch/arm/psci.c
+++ b/xen/arch/arm/psci.c
@@ -17,6 +17,7 @@
 #include <asm/cpufeature.h>
 #include <asm/psci.h>
 #include <asm/acpi.h>
+#include <asm/libafl_qemu.h>
 
 /*
  * While a 64-bit OS can make calls with SMC32 calling conventions, for
@@ -49,6 +50,10 @@ int call_psci_cpu_on(int cpu)
 
 void call_psci_cpu_off(void)
 {
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+    libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
+
     if ( psci_ver > PSCI_VERSION(0, 1) )
     {
         struct arm_smccc_res res;
@@ -62,12 +67,20 @@ void call_psci_cpu_off(void)
 
 void call_psci_system_off(void)
 {
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+    libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
+
     if ( psci_ver > PSCI_VERSION(0, 1) )
         arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_OFF, NULL);
 }
 
 void call_psci_system_reset(void)
 {
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+    libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
+
     if ( psci_ver > PSCI_VERSION(0, 1) )
         arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_RESET, NULL);
 }
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index d6296d99fd..fd722e0231 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -47,6 +47,10 @@
 #define pv_shim false
 #endif
 
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER
+#include <asm/libafl_qemu.h>
+#endif
+
 /* opt_sched: scheduler - default to configured value */
 static char __initdata opt_sched[10] = CONFIG_SCHED_DEFAULT;
 string_param("sched", opt_sched);
@@ -1452,6 +1456,10 @@ static long do_poll(const struct sched_poll *sched_poll)
     if ( !guest_handle_okay(sched_poll->ports, sched_poll->nr_ports) )
         return -EFAULT;
 
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+    libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
+
     set_bit(_VPF_blocked, &v->pause_flags);
     v->poll_evtchn = -1;
     set_bit(v->vcpu_id, d->poll_mask);
@@ -1887,12 +1895,18 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
     {
     case SCHEDOP_yield:
     {
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+        libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
         ret = vcpu_yield();
         break;
     }
 
     case SCHEDOP_block:
     {
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+        libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
         vcpu_block_enable_events();
         break;
     }
@@ -1907,6 +1921,9 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
         TRACE_TIME(TRC_SCHED_SHUTDOWN, current->domain->domain_id,
                    current->vcpu_id, sched_shutdown.reason);
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+        libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
         ret = domain_shutdown(current->domain, (u8)sched_shutdown.reason);
 
         break;
diff --git a/xen/common/shutdown.c b/xen/common/shutdown.c
index c47341b977..1340f4b606 100644
--- a/xen/common/shutdown.c
+++ b/xen/common/shutdown.c
@@ -11,6 +11,10 @@
 #include <xen/kexec.h>
 #include <public/sched.h>
 
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER
+#include <asm/libafl_qemu.h>
+#endif
+
 /* opt_noreboot: If true, machine will need manual reset on error. */
 bool __ro_after_init opt_noreboot;
 boolean_param("noreboot", opt_noreboot);
@@ -32,6 +36,9 @@ static void noreturn reboot_or_halt(void)
 
 void hwdom_shutdown(unsigned char reason)
 {
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+    libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
     switch ( reason )
     {
     case SHUTDOWN_poweroff:
diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
index 7da8c5296f..1262515e70 100644
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -41,6 +41,9 @@
 #ifdef CONFIG_SBSA_VUART_CONSOLE
 #include <asm/vpl011.h>
 #endif
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER
+#include <asm/libafl_qemu.h>
+#endif
 
 /* console: comma-separated list of console outputs. */
 static char __initdata opt_console[30] = OPT_CONSOLE_STR;
@@ -1299,6 +1302,11 @@ void panic(const char *fmt, ...)
 
     kexec_crash(CRASHREASON_PANIC);
 
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER
+    /* Tell the fuzzer that we crashed */
+    libafl_qemu_end(LIBAFL_QEMU_END_CRASH);
+#endif
+
     if ( opt_noreboot )
         machine_halt();
     else
-- 
2.47.0
Re: [RFC PATCH] xen: add libafl-qemu fuzzer support
Posted by Stefano Stabellini 2 days, 7 hours ago
On Thu, 14 Nov 2024, Volodymyr Babchuk wrote:
> LibAFL, which is a part of AFL++ project is a instrument that allows
> us to perform fuzzing on beremetal code (Xen hypervisor in this case)
> using QEMU as an emulator. It employs QEMU's ability to create
> snapshots to run many tests relatively quickly: system state is saved
> right before executing a new test and restored after the test is
> finished.
> 
> This patch adds all necessary plumbing to run aarch64 build of Xen
> inside that LibAFL-QEMU fuzzer. From the Xen perspective we need to
> do following things:
> 
> 1. Able to communicate with LibAFL-QEMU fuzzer. This is done by
> executing special opcodes, that only LibAFL-QEMU can handle.
> 
> 2. Use interface from p.1 to tell the fuzzer about code Xen section,
> so fuzzer know which part of code to track and gather coverage data.
> 
> 3. Report fuzzer about crash. This is done in panic() function.
> 
> 4. Prevent test harness from shooting itself in knee.
> 
> Right now test harness is an external component, because we want to
> test external Xen interfaces, but it is possible to fuzz internal code
> if we want to.
> 
> Test harness is implemented as a Zephyr-based application which launches
> as Dom0 kernel and performs different tests. As test harness can issue
> hypercall that shuts itself down, KConfig option
> CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING was added. It basically tells
> fuzzer that test was completed successfully if Dom0 tries to shut
> itself (or the whole machine) down.
> 
> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> 
> ---
> 
> I tried to fuzz the vGIC emulator and hypercall interface. While vGIC
> fuzzing didn't yield any interesting results, hypercall fuzzing found a
> way to crash the hypervisor from Dom0 on aarch64, using
> "XEN_SYSCTL_page_offline_op" with "sysctl_query_page_offline" sub-op,
> because it leads to page_is_ram_type() call which is marked
> UNREACHABLE on ARM.
> ---
>  docs/hypervisor-guide/fuzzing.rst           | 102 +++++++++++++
>  xen/arch/arm/Kconfig.debug                  |  26 ++++
>  xen/arch/arm/Makefile                       |   1 +
>  xen/arch/arm/include/asm/libafl_qemu.h      |  54 +++++++
>  xen/arch/arm/include/asm/libafl_qemu_defs.h |  37 +++++
>  xen/arch/arm/libafl_qemu.c                  | 152 ++++++++++++++++++++
>  xen/arch/arm/psci.c                         |  13 ++
>  xen/common/sched/core.c                     |  17 +++
>  xen/common/shutdown.c                       |   7 +
>  xen/drivers/char/console.c                  |   8 ++
>  10 files changed, 417 insertions(+)
>  create mode 100644 docs/hypervisor-guide/fuzzing.rst
>  create mode 100644 xen/arch/arm/include/asm/libafl_qemu.h
>  create mode 100644 xen/arch/arm/include/asm/libafl_qemu_defs.h
>  create mode 100644 xen/arch/arm/libafl_qemu.c
> 
> diff --git a/docs/hypervisor-guide/fuzzing.rst b/docs/hypervisor-guide/fuzzing.rst
> new file mode 100644
> index 0000000000..9570de7670
> --- /dev/null
> +++ b/docs/hypervisor-guide/fuzzing.rst
> @@ -0,0 +1,102 @@
> +.. SPDX-License-Identifier: CC-BY-4.0
> +
> +Fuzzing
> +=======
> +
> +It is possible to use LibAFL-QEMU for fuzzing hypervisor. Right now
> +only aarch64 is supported and only hypercall fuzzing is enabled in the
> +test harness, but there are plans to add vGIC interface fuzzing, PSCI
> +fuzzing and vPL011 fuzzing as well.
> +
> +
> +Principle of operation
> +----------------------
> +
> +LibAFL-QEMU is a part of American Fuzzy lop plus plus (AKA AFL++)
> +project. It uses special build of QEMU, that allows to fuzz baremetal
> +software like Xen hypervisor or Linux kernel. Basic idea is that we
> +have software under test (Xen hypervisor in our case) and a test
> +harness application. Test harness uses special protocol to communicate
> +with LibAFL outside of QEMU to get input data and report test
> +result. LibAFL monitors which branches are taken by Xen and mutates
> +input data in attempt to discover new code paths that eventually can
> +lead to a crash or other unintended behavior.
> +
> +LibAFL uses QEMU's `snapshot` feature to run multiple test without
> +restarting the whole system every time. This speeds up fuzzing process
> +greatly.
> +
> +So, to try Xen fuzzing we need three components: LibAFL-based fuzzer,
> +test harness and Xen itself.
> +
> +Building Xen for fuzzing
> +------------------------
> +
> +Xen hypervisor should be built with these two options::
> +
> + CONFIG_LIBAFL_QEMU_FUZZER=y
> + CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING=y
> +
> +Building test harness
> +---------------------
> +
> +We need to make low-level actions, like issuing random hypercalls, so
> +for test harness we use special build of Zephyr application.
> +
> +You need to prepare environment for building Zephyr as described in
> +getting `started guide
> +<https://docs.zephyrproject.org/latest/develop/getting_started/index.html>`_.
> +
> +Next, download test harness application and built it::
> +
> +  # mkdir zephyr-harness
> +  # cd zephyr-harness
> +  # west init -m https://github.com/xen-troops/xen-fuzzer-harness
> +  # cd xen-fuzzer-harness
> +  # west update
> +  # west build
> +
> +Building LibAFL-QEMU based fuzzer
> +---------------------------------
> +
> +Fuzzer is written in Rust, so you need Rust toolchain and `cargo` tool
> +in your system. Please refer to your distro documentation on how to
> +obtain them.
> +
> +Once Rust is ready, fetch and build the fuzzer::
> +
> +  # git clone https://github.com/xen-troops/xen-fuzzer-rs
> +  # cd xen-fuzzer-rs
> +  # cargo build

Is this the only way to trigger the fuzzer? Are there other ways (e.g.
other languages or scripts)? If this is the only way, do we expect it to
grow much over time, or is it just a minimal shim only to invoke the
fuzzer (so basically we need an x86 version of it but that's pretty much
it in terms of growth)?


> +Running the fuzzer
> +------------------
> +
> +Run it like this::
> +
> +  # target/debug/xen_fuzzer  -accel tcg \
> +  -machine virt,virtualization=yes,acpi=off \
> +  -m 4G \
> +  -L  target/debug/qemu-libafl-bridge/pc-bios  \
> +  -nographic \
> +  -cpu max \
> +  -append 'dom0_mem=512 loglvl=none guest_loglvl=none console=dtuart' \
> +  -kernel /path/to/xen/xen/xen \
> +  -device guest-loader,addr=0x42000000,kernel=/path/to/zephyr-harness/build/zephyr/zephyr.bin \
> +  -snapshot
> +
> +Any inputs that led to crashes will be found in `crashes` directory.
> +
> +You can use standard QEMU parameters to redirect console output,
> +change memory size, HW compisition, etc.
> +
> +
> +TODOs
> +-----
> +
> + - Add x86 support.
> + - Implement fuzzing of other external hypervisor interfaces.
> + - Better integration: user should not build manually multiple
> +   different projects
> + - Add ability to re-run fuzzer with a given input to make crash
> +   debugging more comfortable
> diff --git a/xen/arch/arm/Kconfig.debug b/xen/arch/arm/Kconfig.debug
> index 7660e599c0..9e2c4649ed 100644
> --- a/xen/arch/arm/Kconfig.debug
> +++ b/xen/arch/arm/Kconfig.debug
> @@ -183,3 +183,29 @@ config EARLY_PRINTK_INC
>  	default "debug-mvebu.inc" if EARLY_UART_MVEBU
>  	default "debug-pl011.inc" if EARLY_UART_PL011
>  	default "debug-scif.inc" if EARLY_UART_SCIF
> +
> +config LIBAFL_QEMU_FUZZER
> +	bool "Enable LibAFL-QEMU calls"
> +	help
> +	  This option enables support for LibAFL-QEMU calls. Enable this
> +	  only when you are going to run hypervisor inside LibAFL-QEMU.
> +	  Xen will report code section to LibAFL and will report about
> +	  crash when it panics.
> +
> +	  Do not try to run Xen built on this option on any real hardware
> +	  or plain QEMU, because it will just crash during startup.
> +
> +config LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +	depends on LIBAFL_QEMU_FUZZER
> +	bool "LibAFL: Report any attempt to suspend/destroy a domain as a success"
> +	help
> +	  When fuzzing hypercalls, fuzzer sometimes will issue an hypercall that
> +	  leads to a domain shutdown, or machine shutdown, or vCPU being
> +	  blocked, or something similar. In this case test harness will not be
> +	  able to report about successfully handled call to the fuzzer. Fuzzer
> +	  will report timeout and mark this as a crash, which is not true. So,
> +	  in such cases we need to report about successfully test case from the
> +	  hypervisor itself.
> +
> +          Enable this option only if fuzzing attempt can lead to a correct
> +	  stoppage, like when fuzzing hypercalls or PSCI.
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index e4ad1ce851..51b5e15b6a 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -10,6 +10,7 @@ obj-$(CONFIG_TEE) += tee/
>  obj-$(CONFIG_HAS_VPCI) += vpci.o
>  
>  obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
> +obj-${CONFIG_LIBAFL_QEMU_FUZZER} += libafl_qemu.o
>  obj-y += cpuerrata.o
>  obj-y += cpufeature.o
>  obj-y += decode.o
> diff --git a/xen/arch/arm/include/asm/libafl_qemu.h b/xen/arch/arm/include/asm/libafl_qemu.h
> new file mode 100644
> index 0000000000..b90cf48b9a
> --- /dev/null
> +++ b/xen/arch/arm/include/asm/libafl_qemu.h
> @@ -0,0 +1,54 @@
> +#ifndef LIBAFL_QEMU_H
> +#define LIBAFL_QEMU_H
> +
> +#include <xen/stdint.h>
> +#include "libafl_qemu_defs.h"
> +#define LIBAFL_QEMU_PRINTF_MAX_SIZE 4096
> +
> +typedef uint64_t libafl_word;
> +
> +/**
> + * LibAFL QEMU header file.
> + *
> + * This file is a portable header file used to build target harnesses more
> + * conveniently. Its main purpose is to generate ready-to-use calls to
> + * communicate with the fuzzer. The list of commands is available at the bottom
> + * of this file. The rest mostly consists of macros generating the code used by
> + * the commands.
> + */
> +
> +enum LibaflQemuEndStatus {
> +  LIBAFL_QEMU_END_UNKNOWN = 0,
> +  LIBAFL_QEMU_END_OK = 1,
> +  LIBAFL_QEMU_END_CRASH = 2,
> +};
> +
> +libafl_word libafl_qemu_start_virt(void *buf_vaddr, libafl_word max_len);
> +
> +libafl_word libafl_qemu_start_phys(void *buf_paddr, libafl_word max_len);
> +
> +libafl_word libafl_qemu_input_virt(void *buf_vaddr, libafl_word max_len);
> +
> +libafl_word libafl_qemu_input_phys(void *buf_paddr, libafl_word max_len);
> +
> +void libafl_qemu_end(enum LibaflQemuEndStatus status);
> +
> +void libafl_qemu_save(void);
> +
> +void libafl_qemu_load(void);
> +
> +libafl_word libafl_qemu_version(void);
> +
> +void libafl_qemu_page_current_allow(void);
> +
> +void libafl_qemu_internal_error(void);
> +
> +void __attribute__((format(printf, 1, 2))) lqprintf(const char *fmt, ...);
> +
> +void libafl_qemu_test(void);
> +
> +void libafl_qemu_trace_vaddr_range(libafl_word start, libafl_word end);
> +
> +void libafl_qemu_trace_vaddr_size(libafl_word start, libafl_word size);
> +
> +#endif
> diff --git a/xen/arch/arm/include/asm/libafl_qemu_defs.h b/xen/arch/arm/include/asm/libafl_qemu_defs.h
> new file mode 100644
> index 0000000000..2866cadaac
> --- /dev/null
> +++ b/xen/arch/arm/include/asm/libafl_qemu_defs.h
> @@ -0,0 +1,37 @@
> +#ifndef LIBAFL_QEMU_DEFS
> +#define LIBAFL_QEMU_DEFS
> +
> +#define LIBAFL_STRINGIFY(s) #s
> +#define XSTRINGIFY(s) LIBAFL_STRINGIFY(s)
> +
> +#if __STDC_VERSION__ >= 201112L
> +  #define STATIC_CHECKS                                   \
> +    _Static_assert(sizeof(void *) <= sizeof(libafl_word), \
> +                   "pointer type should not be larger than libafl_word");
> +#else
> +  #define STATIC_CHECKS
> +#endif
> +
> +#define LIBAFL_SYNC_EXIT_OPCODE 0x66f23a0f
> +#define LIBAFL_BACKDOOR_OPCODE 0x44f23a0f
> +
> +#define LIBAFL_QEMU_TEST_VALUE 0xcafebabe
> +
> +#define LIBAFL_QEMU_HDR_VERSION_NUMBER 0111  // TODO: find a nice way to set it.
> +
> +typedef enum LibaflQemuCommand {
> +  LIBAFL_QEMU_COMMAND_START_VIRT = 0,
> +  LIBAFL_QEMU_COMMAND_START_PHYS = 1,
> +  LIBAFL_QEMU_COMMAND_INPUT_VIRT = 2,
> +  LIBAFL_QEMU_COMMAND_INPUT_PHYS = 3,
> +  LIBAFL_QEMU_COMMAND_END = 4,
> +  LIBAFL_QEMU_COMMAND_SAVE = 5,
> +  LIBAFL_QEMU_COMMAND_LOAD = 6,
> +  LIBAFL_QEMU_COMMAND_VERSION = 7,
> +  LIBAFL_QEMU_COMMAND_VADDR_FILTER_ALLOW = 8,
> +  LIBAFL_QEMU_COMMAND_INTERNAL_ERROR = 9,
> +  LIBAFL_QEMU_COMMAND_LQPRINTF = 10,
> +  LIBAFL_QEMU_COMMAND_TEST = 11,
> +} LibaflExit;
> +
> +#endif
> diff --git a/xen/arch/arm/libafl_qemu.c b/xen/arch/arm/libafl_qemu.c
> new file mode 100644
> index 0000000000..58924ce6c6
> --- /dev/null
> +++ b/xen/arch/arm/libafl_qemu.c
> @@ -0,0 +1,152 @@
> +/* SPDX-License-Identifier: Apache-2.0 */
> +/*
> +   This file is based on libafl_qemu_impl.h and libafl_qemu_qemu_arch.h
> +   from LibAFL project.
> +*/
> +#include <xen/lib.h>
> +#include <xen/init.h>
> +#include <xen/kernel.h>
> +#include <asm/libafl_qemu.h>
> +
> +#define LIBAFL_DEFINE_FUNCTIONS(name, opcode)				\
> +	libafl_word _libafl_##name##_call0(	\
> +		libafl_word action) {					\
> +		libafl_word ret;					\
> +		__asm__ volatile (					\
> +			"mov x0, %1\n"					\
> +			".word " XSTRINGIFY(opcode) "\n"		\
> +			"mov %0, x0\n"					\
> +			: "=r"(ret)					\
> +			: "r"(action)					\
> +			: "x0"						\
> +			);						\
> +		return ret;						\
> +	}								\
> +									\
> +	libafl_word _libafl_##name##_call1(	\
> +		libafl_word action, libafl_word arg1) {			\
> +		libafl_word ret;					\
> +		__asm__ volatile (					\
> +			"mov x0, %1\n"					\
> +			"mov x1, %2\n"					\
> +			".word " XSTRINGIFY(opcode) "\n"		\
> +			"mov %0, x0\n"					\
> +			: "=r"(ret)					\
> +			: "r"(action), "r"(arg1)			\
> +			: "x0", "x1"					\
> +			);						\
> +		return ret;						\
> +	}								\
> +									\
> +	libafl_word _libafl_##name##_call2(	\
> +		libafl_word action, libafl_word arg1, libafl_word arg2) { \
> +		libafl_word ret;					\
> +		__asm__ volatile (					\
> +			"mov x0, %1\n"					\
> +			"mov x1, %2\n"					\
> +			"mov x2, %3\n"					\
> +			".word " XSTRINGIFY(opcode) "\n"		\
> +			"mov %0, x0\n"					\
> +			: "=r"(ret)					\
> +			: "r"(action), "r"(arg1), "r"(arg2)		\
> +			: "x0", "x1", "x2"				\
> +			);						\
> +		return ret;						\
> +	}
> +
> +// Generates sync exit functions
> +LIBAFL_DEFINE_FUNCTIONS(sync_exit, LIBAFL_SYNC_EXIT_OPCODE)
> +
> +// Generates backdoor functions
> +LIBAFL_DEFINE_FUNCTIONS(backdoor, LIBAFL_BACKDOOR_OPCODE)
> +
> +static char _lqprintf_buffer[LIBAFL_QEMU_PRINTF_MAX_SIZE] = {0};
> +
> +libafl_word libafl_qemu_start_virt(void       *buf_vaddr,
> +                                            libafl_word max_len) {
> +  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_START_VIRT,
> +                                 (libafl_word)buf_vaddr, max_len);
> +}
> +
> +libafl_word libafl_qemu_start_phys(void       *buf_paddr,
> +                                            libafl_word max_len) {
> +  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_START_PHYS,
> +                                 (libafl_word)buf_paddr, max_len);
> +}
> +
> +libafl_word libafl_qemu_input_virt(void       *buf_vaddr,
> +                                            libafl_word max_len) {
> +  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_INPUT_VIRT,
> +                                 (libafl_word)buf_vaddr, max_len);
> +}
> +
> +libafl_word libafl_qemu_input_phys(void       *buf_paddr,
> +                                            libafl_word max_len) {
> +  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_INPUT_PHYS,
> +                                 (libafl_word)buf_paddr, max_len);
> +}
> +
> +void libafl_qemu_end(enum LibaflQemuEndStatus status) {
> +  _libafl_sync_exit_call1(LIBAFL_QEMU_COMMAND_END, status);
> +}
> +
> +void libafl_qemu_save(void) {
> +  _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_SAVE);
> +}
> +
> +void libafl_qemu_load(void) {
> +  _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_LOAD);
> +}
> +
> +libafl_word libafl_qemu_version(void) {
> +  return _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_VERSION);
> +}
> +
> +void libafl_qemu_internal_error(void) {
> +  _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_INTERNAL_ERROR);
> +}
> +
> +void lqprintf(const char *fmt, ...) {
> +  va_list args;
> +  int res;
> +  va_start(args, fmt);
> +  res = vsnprintf(_lqprintf_buffer, LIBAFL_QEMU_PRINTF_MAX_SIZE, fmt, args);
> +  va_end(args);
> +
> +  if (res >= LIBAFL_QEMU_PRINTF_MAX_SIZE) {
> +    // buffer is not big enough, either recompile the target with more
> +    // space or print less things
> +    libafl_qemu_internal_error();
> +  }
> +
> +  _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_LQPRINTF,
> +                          (libafl_word)_lqprintf_buffer, res);
> +}
> +
> +void libafl_qemu_test(void) {
> +  _libafl_sync_exit_call1(LIBAFL_QEMU_COMMAND_TEST, LIBAFL_QEMU_TEST_VALUE);
> +}
> +
> +void libafl_qemu_trace_vaddr_range(libafl_word start,
> +                                            libafl_word end) {
> +  _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_VADDR_FILTER_ALLOW, start, end);
> +}
> +
> +void libafl_qemu_trace_vaddr_size(libafl_word start,
> +                                           libafl_word size) {
> +  libafl_qemu_trace_vaddr_range(start, start + size);
> +}
> +
> +static int init_afl(void)
> +{
> +	vaddr_t xen_text_start = (vaddr_t)_stext;
> +	vaddr_t xen_text_end = (vaddr_t)_etext;
> +
> +	lqprintf("Telling AFL about code section: %lx - %lx\n", xen_text_start, xen_text_end);
> +
> +	libafl_qemu_trace_vaddr_range(xen_text_start, xen_text_end);
> +
> +	return 0;
> +}
> +
> +__initcall(init_afl);
> diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
> index b6860a7760..c7a51a1144 100644
> --- a/xen/arch/arm/psci.c
> +++ b/xen/arch/arm/psci.c
> @@ -17,6 +17,7 @@
>  #include <asm/cpufeature.h>
>  #include <asm/psci.h>
>  #include <asm/acpi.h>
> +#include <asm/libafl_qemu.h>
>  
>  /*
>   * While a 64-bit OS can make calls with SMC32 calling conventions, for
> @@ -49,6 +50,10 @@ int call_psci_cpu_on(int cpu)
>  
>  void call_psci_cpu_off(void)
>  {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif

I think we should add a wrapper with an empty implementation in the
regular case and the call to libafl_qemu_end when the fuzzer is enabled.
So that here it becomes just something like:

  fuzzer_success();


>      if ( psci_ver > PSCI_VERSION(0, 1) )
>      {
>          struct arm_smccc_res res;
> @@ -62,12 +67,20 @@ void call_psci_cpu_off(void)
>  
>  void call_psci_system_off(void)
>  {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
> +
>      if ( psci_ver > PSCI_VERSION(0, 1) )
>          arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_OFF, NULL);
>  }
>  
>  void call_psci_system_reset(void)
>  {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
> +
>      if ( psci_ver > PSCI_VERSION(0, 1) )
>          arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_RESET, NULL);
>  }
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index d6296d99fd..fd722e0231 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -47,6 +47,10 @@
>  #define pv_shim false
>  #endif
>  
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
> +
>  /* opt_sched: scheduler - default to configured value */
>  static char __initdata opt_sched[10] = CONFIG_SCHED_DEFAULT;
>  string_param("sched", opt_sched);
> @@ -1452,6 +1456,10 @@ static long do_poll(const struct sched_poll *sched_poll)
>      if ( !guest_handle_okay(sched_poll->ports, sched_poll->nr_ports) )
>          return -EFAULT;
>  
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif

I am not sure about this one, why is this a success?

Honestly, aside from these two comments, this looks quite good. I would
suggest adding a GitLab CI job to exercise this, if nothing else, to
serve as an integration point since multiple components are required for
this to work.


>      set_bit(_VPF_blocked, &v->pause_flags);
>      v->poll_evtchn = -1;
>      set_bit(v->vcpu_id, d->poll_mask);
> @@ -1887,12 +1895,18 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>      {
>      case SCHEDOP_yield:
>      {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>          ret = vcpu_yield();
>          break;
>      }
>  
>      case SCHEDOP_block:
>      {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>          vcpu_block_enable_events();
>          break;
>      }
> @@ -1907,6 +1921,9 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  
>          TRACE_TIME(TRC_SCHED_SHUTDOWN, current->domain->domain_id,
>                     current->vcpu_id, sched_shutdown.reason);
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>          ret = domain_shutdown(current->domain, (u8)sched_shutdown.reason);
>  
>          break;
> diff --git a/xen/common/shutdown.c b/xen/common/shutdown.c
> index c47341b977..1340f4b606 100644
> --- a/xen/common/shutdown.c
> +++ b/xen/common/shutdown.c
> @@ -11,6 +11,10 @@
>  #include <xen/kexec.h>
>  #include <public/sched.h>
>  
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
> +
>  /* opt_noreboot: If true, machine will need manual reset on error. */
>  bool __ro_after_init opt_noreboot;
>  boolean_param("noreboot", opt_noreboot);
> @@ -32,6 +36,9 @@ static void noreturn reboot_or_halt(void)
>  
>  void hwdom_shutdown(unsigned char reason)
>  {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>      switch ( reason )
>      {
>      case SHUTDOWN_poweroff:
> diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
> index 7da8c5296f..1262515e70 100644
> --- a/xen/drivers/char/console.c
> +++ b/xen/drivers/char/console.c
> @@ -41,6 +41,9 @@
>  #ifdef CONFIG_SBSA_VUART_CONSOLE
>  #include <asm/vpl011.h>
>  #endif
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
>  
>  /* console: comma-separated list of console outputs. */
>  static char __initdata opt_console[30] = OPT_CONSOLE_STR;
> @@ -1299,6 +1302,11 @@ void panic(const char *fmt, ...)
>  
>      kexec_crash(CRASHREASON_PANIC);
>  
> +    #ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +    /* Tell the fuzzer that we crashed */
> +    libafl_qemu_end(LIBAFL_QEMU_END_CRASH);
> +    #endif
> +
>      if ( opt_noreboot )
>          machine_halt();
>      else
> -- 
> 2.47.0
>
Re: [RFC PATCH] xen: add libafl-qemu fuzzer support
Posted by Andrew Cooper 1 day, 14 hours ago
On 19/11/2024 1:46 am, Stefano Stabellini wrote:
> On Thu, 14 Nov 2024, Volodymyr Babchuk wrote:
>> diff --git a/xen/arch/arm/libafl_qemu.c b/xen/arch/arm/libafl_qemu.c
>> new file mode 100644
>> index 0000000000..58924ce6c6
>> --- /dev/null
>> +++ b/xen/arch/arm/libafl_qemu.c
>> @@ -0,0 +1,152 @@
>> +/* SPDX-License-Identifier: Apache-2.0 */

I am afraid that we cannot accept this submission.

While the Apache-2.0 license is compatible with GPLv3, it is
incompatible with GPLv2, and therefore with Xen.


Where precisely did this come from?

The LibAFL project says it is explicitly dual-licensed Apache-2.0 and
MIT, and MIT is compatible with GPLv2, so this likely can be made to work.

Assuming the source really is both Apache-2.0 and MIT, then the SPDX
header needs to state both, but this needs to be checked carefully.
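
For illustration only, and assuming the audit does confirm that the
imported files are available under both licenses, the header would look
something like this (`Apache-2.0 OR MIT` being the SPDX expression for a
choice between the two):

```c
/* SPDX-License-Identifier: Apache-2.0 OR MIT */
```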

~Andrew
Re: [RFC PATCH] xen: add libafl-qemu fuzzer support
Posted by Volodymyr Babchuk 1 day, 17 hours ago
Hi Stefano,

Stefano Stabellini <sstabellini@kernel.org> writes:

> On Thu, 14 Nov 2024, Volodymyr Babchuk wrote:

[...]

>> +Building LibAFL-QEMU based fuzzer
>> +---------------------------------
>> +
>> +Fuzzer is written in Rust, so you need Rust toolchain and `cargo` tool
>> +in your system. Please refer to your distro documentation on how to
>> +obtain them.
>> +
>> +Once Rust is ready, fetch and build the fuzzer::
>> +
>> +  # git clone https://github.com/xen-troops/xen-fuzzer-rs
>> +  # cd xen-fuzzer-rs
>> +  # cargo build
>
> Is this the only way to trigger the fuzzer? Are there other ways (e.g.
> other languages or scripts)? If this is the only way, do we expect it to
> grow much over time, or is it just a minimal shim only to invoke the
> fuzzer (so basically we need an x86 version of it but that's pretty much
> it in terms of growth)?

Well, the original AFL++ is written in C, and I initially planned to use
it. I wanted to write a QEMU plugin that does basically the same thing
LibAFL does: use QEMU to emulate the target platform, create a snapshot
before running a test, and restore it afterwards.

But then I found LibAFL. It is a framework for creating fuzzers that
implements the same algorithms as the original AFL++ but is more
flexible, and it already has QEMU support. It also seems quite easy to
reconfigure for x86. I haven't tested this yet, but it looks like I only
need to change one option in Cargo.toml.

This particular fuzzer is based on a LibAFL example, but I am going to
tailor it to Xen Project-specific needs, like the CI integration you
mentioned below.

As for the test harness, I am currently using Zephyr. My first intention
was to use XTF, but it is x86-only... I am still considering using XTF
for x86 runs.

Zephyr was just the easiest and fastest way to trigger hypercalls. At
first I tried to use the Linux kernel, but it was hard to cover all
possible failure paths. Zephyr is much simpler in this regard. MiniOS or
XTF would be even better, but ARM support in MiniOS is in a sorry state
and XTF does not work on ARM at all.

[...]

>>  void call_psci_cpu_off(void)
>>  {
>> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
>> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
>> +#endif
>
> I think we should add a wrapper with an empty implementation in the
> regular case and the call to libafl_qemu_end when the fuzzer is enabled.
> So that here it becomes just something like:
>
>   fuzzer_success();

I considered this. In the next version I'll add fuzzer.h with inline wrappers.
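
To make the idea concrete, here is a rough sketch of what such a header
might look like. It is not the actual next-version code: fuzzer_crash()
and the recording stub for libafl_qemu_end() are hypothetical, and the
CONFIG_ options are defined by hand only so that the snippet builds on
its own (in Xen they would come from Kconfig).

```c
/* Hypothetical fuzzer.h sketch; not the code from the next patch version. */

/* Defined by hand so the sketch compiles stand-alone (normally from Kconfig). */
#define CONFIG_LIBAFL_QEMU_FUZZER 1
#define CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING 1

/* Mirrors the enum from libafl_qemu.h in the patch. */
enum LibaflQemuEndStatus {
    LIBAFL_QEMU_END_UNKNOWN = 0,
    LIBAFL_QEMU_END_OK = 1,
    LIBAFL_QEMU_END_CRASH = 2,
};

/*
 * Stand-in for the real libafl_qemu_end(): it only records the status
 * instead of executing the special LibAFL-QEMU opcode.
 */
static int last_status = -1;
static int end_calls;

static void libafl_qemu_end(enum LibaflQemuEndStatus status)
{
    last_status = status;
    end_calls++;
}

/* Report a crash to the fuzzer; compiles to nothing without fuzzer support. */
#ifdef CONFIG_LIBAFL_QEMU_FUZZER
static inline void fuzzer_crash(void)
{
    libafl_qemu_end(LIBAFL_QEMU_END_CRASH);
}
#else
static inline void fuzzer_crash(void) {}
#endif

/* Report a "correct stoppage"; only active when PASS_BLOCKING is enabled. */
#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
static inline void fuzzer_success(void)
{
    libafl_qemu_end(LIBAFL_QEMU_END_OK);
}
#else
static inline void fuzzer_success(void) {}
#endif
```

With something like this in place, call sites such as call_psci_cpu_off()
would contain a bare fuzzer_success(); and no #ifdef blocks.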


[...]

>> @@ -1452,6 +1456,10 @@ static long do_poll(const struct sched_poll *sched_poll)
>>      if ( !guest_handle_okay(sched_poll->ports, sched_poll->nr_ports) )
>>          return -EFAULT;
>>
>> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
>> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
>> +#endif
>
> I am not sure about this one, why is this a success?

The vCPU gets blocked here basically forever, so the test harness can't
call libafl_qemu_end(LIBAFL_QEMU_END_OK) from its side: it is never
scheduled again after this point.

> Honestly, aside from these two comments, this looks quite good. I would
> suggest adding a GitLab CI job to exercise this, if nothing else, to
> serve as an integration point since multiple components are required for
> this to work.

I was considering this as well. The problem is that fuzzing should run
for a prolonged period of time. There is no clear consensus on "how
long", but the most widely accepted period is 24 hours. So it looks
like it should be something like a "nightly build" task. The fuzzer code
needs to be extended to support some runtime limit, because right
now it runs indefinitely, until the user stops it.

I am certainly going to implement this, but this is a separate topic,
because it requires changes in the fuzzer app. Speaking of which... Right
now both the fuzzer and the test harness reside in our GitHub repo, as
you noticed. I believe it would be better to host them on xenbits as an
official part of the Xen Project.

--
WBR, Volodymyr
Re: [RFC PATCH] xen: add libafl-qemu fuzzer support
Posted by Marek Marczykowski-Górecki 1 day, 7 hours ago
On Tue, Nov 19, 2024 at 03:16:56PM +0000, Volodymyr Babchuk wrote:
> > Honestly, aside from these two comments, this looks quite good. I would
> > suggest adding a GitLab CI job to exercise this, if nothing else, to
> > serve as an integration point since multiple components are required for
> > this to work.
> 
> I was considering this as well. Problem is that fuzzing should be
> running for a prolonged periods of time. There is no clear consensus on
> "how long", but most widely accepted time period is 24 hours. So looks
> like it should be something like "nightly build" task. Fuzzer code
> needs to be extended to support some runtime restriction, because right
> now it runs indefinitely, until user stops it.

Regardless of the actual fuzzing (which takes time), I'd suggest adding
a GitLab job that does a sanity test, checks that stuff still builds,
etc. It can probably be limited to 1 min of fuzzing or so.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
Re: [RFC PATCH] xen: add libafl-qemu fuzzer support
Posted by Stefano Stabellini 10 hours ago
On Wed, 20 Nov 2024, Marek Marczykowski-Górecki wrote:
> On Tue, Nov 19, 2024 at 03:16:56PM +0000, Volodymyr Babchuk wrote:
> > > Honestly, aside from these two comments, this looks quite good. I would
> > > suggest adding a GitLab CI job to exercise this, if nothing else, to
> > > serve as an integration point since multiple components are required for
> > > this to work.
> > 
> > I was considering this as well. Problem is that fuzzing should be
> > running for a prolonged periods of time. There is no clear consensus on
> > "how long", but most widely accepted time period is 24 hours. So looks
> > like it should be something like "nightly build" task. Fuzzer code
> > needs to be extended to support some runtime restriction, because right
> > now it runs indefinitely, until user stops it.
> 
> Regardless of the actual fuzzing (which takes time), I'd suggest to add
> a gitlab job that does sanity test, checks if stuff still builds etc. It
> can probably be limited to 1min fuzzing or such.

+1
Re: [RFC PATCH] xen: add libafl-qemu fuzzer support
Posted by Stefano Stabellini 1 day, 9 hours ago
On Tue, 19 Nov 2024, Volodymyr Babchuk wrote:
> Hi Stefano,
> 
> Stefano Stabellini <sstabellini@kernel.org> writes:
> 
> > On Thu, 14 Nov 2024, Volodymyr Babchuk wrote:
> 
> [...]
> 
> >> +Building LibAFL-QEMU based fuzzer
> >> +---------------------------------
> >> +
> >> +Fuzzer is written in Rust, so you need Rust toolchain and `cargo` tool
> >> +in your system. Please refer to your distro documentation on how to
> >> +obtain them.
> >> +
> >> +Once Rust is ready, fetch and build the fuzzer::
> >> +
> >> +  # git clone https://github.com/xen-troops/xen-fuzzer-rs
> >> +  # cd xen-fuzzer-rs
> >> +  # cargo build
> >
> > Is this the only way to trigger the fuzzer? Are there other ways (e.g.
> > other languages or scripts)? If this is the only way, do we expect it to
> > grow much over time, or is it just a minimal shim only to invoke the
> > fuzzer (so basically we need an x86 version of it but that's pretty much
> > it in terms of growth)?
> 
> Well, original AFL++ is written in C. And I planned to use it
> initially. I wanted to make plugin for QEMU to do the basically same
> thing that LibAFL does - use QEMU to emulate target platform, create
> snapshot before running a test, restore it afterwards.
> 
> But then I found LibAFL. It is a framework for creating fuzzers, it
> implements the same algorithms as original AFL++ but it is more
> flexible. And it already had QEMU support. Also, it seems it is quite
> easy to reconfigure it for x86 support. I didn't tried tested this yet,
> but looks like I need only to change one option in Cargo.toml.
> 
> This particular fuzzer is based on LibAFL example, but I am going to
> tailor it for Xen Project-specific needs, like CI integration you
> mentioned below.

Is my understanding correct that we only need to invoke LibAFL as you
are doing already, and that's pretty much it? We need a better
configuration specific for Xen, and we need one more way to invoke it to
cover x86 but that's it? So, the expectation is that the code currently
under https://github.com/xen-troops/xen-fuzzer-rs will not grow much?


> As for test harness, I am using Zephyr currently. My first intention was
> to use XTF, but it is x86-only... I am still considering using XTF for
> x86 runs.
> 
> Zephyr was just the easiest and fastest way to trigger hypercalls. At
> first I tried to use Linux kernel, but it was hard to cover all possible
> failure paths. Zephyr is much simpler in this regard. Even better is to
> use MiniOS or XTF. But ARM support in MiniOS is in sorry state and XTF
> does not work on ARM at all.

There is a not-yet-upstream XTF branch that works on ARM here:
https://gitlab.com/xen-project/fusa/xtf/-/tree/xtf-arm?ref_type=heads


> [...]
> 
> >>  void call_psci_cpu_off(void)
> >>  {
> >> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> >> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> >> +#endif
> >
> > I think we should add a wrapper with an empty implementation in the
> > regular case and the call to libafl_qemu_end when the fuzzer is enabled.
> > So that here it becomes just something like:
> >
> >   fuzzer_success();
> 
> I considered this. In the next version I'll add fuzzer.h with inline wrappers.
> 
> 
> [...]
> 
> >> @@ -1452,6 +1456,10 @@ static long do_poll(const struct sched_poll *sched_poll)
> >>      if ( !guest_handle_okay(sched_poll->ports, sched_poll->nr_ports) )
> >>          return -EFAULT;
> >>
> >> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> >> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> >> +#endif
> >
> > I am not sure about this one, why is this a success?
> 
> vCPU get blocked here basically forever. So test harness can't call
> libafl_qemu_end(LIBAFL_QEMU_END_OK) from it's side because it is never
> scheduled after this point.
> 
> > Honestly, aside from these two comments, this looks quite good. I would
> > suggest adding a GitLab CI job to exercise this, if nothing else, to
> > serve as an integration point since multiple components are required for
> > this to work.
> 
> I was considering this as well. Problem is that fuzzing should be
> running for a prolonged periods of time. There is no clear consensus on
> "how long", but most widely accepted time period is 24 hours. So looks
> like it should be something like "nightly build" task. Fuzzer code
> needs to be extended to support some runtime restriction, because right
> now it runs indefinitely, until user stops it.

We can let it run for 48 hours continuously every weekend using the
GitLab runners.


> I am certainly going to implement this, but this is a separate topic,
> because it quires changes in the fuzzer app. Speaking on which... Right
> now both fuzzer and test harness reside in our github repo, as you
> noticed. I believe it is better to host it on xenbits as an official
> part of the Xen Project.

Yes we can create repos under gitlab.com/xen-project for this, maybe a
new subgroup gitlab.com/xen-project/fuzzer
Re: [RFC PATCH] xen: add libafl-qemu fuzzer support
Posted by Volodymyr Babchuk 1 day, 8 hours ago
Hi Stefano,

(sorry, hit wrong Reply-To option, re-sending for wider audience)

Stefano Stabellini <sstabellini@kernel.org> writes:

> On Tue, 19 Nov 2024, Volodymyr Babchuk wrote:
>> Hi Stefano,
>>
>> Stefano Stabellini <sstabellini@kernel.org> writes:
>>
>> > On Thu, 14 Nov 2024, Volodymyr Babchuk wrote:
>>
>> [...]
>>
>> >> +Building LibAFL-QEMU based fuzzer
>> >> +---------------------------------
>> >> +
>> >> +Fuzzer is written in Rust, so you need Rust toolchain and `cargo` tool
>> >> +in your system. Please refer to your distro documentation on how to
>> >> +obtain them.
>> >> +
>> >> +Once Rust is ready, fetch and build the fuzzer::
>> >> +
>> >> +  # git clone https://github.com/xen-troops/xen-fuzzer-rs
>> >> +  # cd xen-fuzzer-rs
>> >> +  # cargo build
>> >
>> > Is this the only way to trigger the fuzzer? Are there other ways (e.g.
>> > other languages or scripts)? If this is the only way, do we expect it to
>> > grow much over time, or is it just a minimal shim only to invoke the
>> > fuzzer (so basically we need an x86 version of it but that's pretty much
>> > it in terms of growth)?
>>
>> Well, original AFL++ is written in C. And I planned to use it
>> initially. I wanted to make plugin for QEMU to do the basically same
>> thing that LibAFL does - use QEMU to emulate target platform, create
>> snapshot before running a test, restore it afterwards.
>>
>> But then I found LibAFL. It is a framework for creating fuzzers, it
>> implements the same algorithms as original AFL++ but it is more
>> flexible. And it already had QEMU support. Also, it seems it is quite
>> easy to reconfigure it for x86 support. I didn't tried tested this yet,
>> but looks like I need only to change one option in Cargo.toml.
>>
>> This particular fuzzer is based on LibAFL example, but I am going to
>> tailor it for Xen Project-specific needs, like CI integration you
>> mentioned below.
>
> Is my understanding correct that we only need to invoke LibAFL as you
> are doing already, and that's pretty much it? We need a better
> configuration specific for Xen, and we need one more way to invoke it to
> cover x86 but that's it? So, the expectation is that the code currently
> under
> https://github.com/xen-troops/xen-fuzzer-rs
> will not grow much?
>

Yes, it basically configures different bits of LibAFL and integrates
them together. So yes, it will not grow much. I am planning to add some
QoL things, like the ability to re-run a specific input so that it is
easier to debug discovered issues. Or maybe tune some settings of the
fuzzing algorithms... But nothing big.


>
>> As for test harness, I am using Zephyr currently. My first intention was
>> to use XTF, but it is x86-only... I am still considering using XTF for
>> x86 runs.
>>
>> Zephyr was just the easiest and fastest way to trigger hypercalls. At
>> first I tried to use Linux kernel, but it was hard to cover all possible
>> failure paths. Zephyr is much simpler in this regard. Even better is to
>> use MiniOS or XTF. But ARM support in MiniOS is in sorry state and XTF
>> does not work on ARM at all.
>
> There is a not-yet-upstream XTF branch that works on ARM here:
> https://gitlab.com/xen-project/fusa/xtf/-/tree/xtf-arm?ref_type=heads

Ah, thanks. I'll try to use it as a harness.

[...]

>
>>
>> I was considering this as well. Problem is that fuzzing should be
>> running for a prolonged periods of time. There is no clear consensus on
>> "how long", but most widely accepted time period is 24 hours. So looks
>> like it should be something like "nightly build" task. Fuzzer code
>> needs to be extended to support some runtime restriction, because right
>> now it runs indefinitely, until user stops it.
>
> We can let it run for 48 hours continuously every weekend using the
> Gitlab runners

Great idea. Anyway, I first need to add an option to the fuzzer to limit
its runtime, and come up with a way of reporting discovered crashes to
the CI.

>
>> I am certainly going to implement this, but this is a separate topic,
>> because it quires changes in the fuzzer app. Speaking on which... Right
>> now both fuzzer and test harness reside in our github repo, as you
>> noticed. I believe it is better to host it on xenbits as an official
>> part of the Xen Project.
>
> Yes we can create repos under gitlab.com/xen-project for this, maybe a
> new subgroup gitlab.com/xen-project/fuzzer

Good. Whom should I ask to do this?

--
WBR, Volodymyr
Re: [RFC PATCH] xen: add libafl-qemu fuzzer support
Posted by Stefano Stabellini 10 hours ago
On Wed, 20 Nov 2024, Volodymyr Babchuk wrote:
> Hi Stefano,
> 
> (sorry, hit wrong Reply-To option, re-sending for wider audience)
> 
> Stefano Stabellini <sstabellini@kernel.org> writes:
> 
> > On Tue, 19 Nov 2024, Volodymyr Babchuk wrote:
> >> Hi Stefano,
> >>
> >> Stefano Stabellini <sstabellini@kernel.org> writes:
> >>
> >> > On Thu, 14 Nov 2024, Volodymyr Babchuk wrote:
> >>
> >> [...]
> >>
> >> >> +Building LibAFL-QEMU based fuzzer
> >> >> +---------------------------------
> >> >> +
> >> >> +Fuzzer is written in Rust, so you need Rust toolchain and `cargo` tool
> >> >> +in your system. Please refer to your distro documentation on how to
> >> >> +obtain them.
> >> >> +
> >> >> +Once Rust is ready, fetch and build the fuzzer::
> >> >> +
> >> >> +  # git clone https://github.com/xen-troops/xen-fuzzer-rs
> >> >> +  # cd xen-fuzzer-rs
> >> >> +  # cargo build
> >> >
> >> > Is this the only way to trigger the fuzzer? Are there other ways (e.g.
> >> > other languages or scripts)? If this is the only way, do we expect it to
> >> > grow much over time, or is it just a minimal shim only to invoke the
> >> > fuzzer (so basically we need an x86 version of it but that's pretty much
> >> > it in terms of growth)?
> >>
> >> Well, original AFL++ is written in C. And I planned to use it
> >> initially. I wanted to make plugin for QEMU to do the basically same
> >> thing that LibAFL does - use QEMU to emulate target platform, create
> >> snapshot before running a test, restore it afterwards.
> >>
> >> But then I found LibAFL. It is a framework for creating fuzzers, it
> >> implements the same algorithms as original AFL++ but it is more
> >> flexible. And it already had QEMU support. Also, it seems it is quite
> >> easy to reconfigure it for x86 support. I didn't tried tested this yet,
> >> but looks like I need only to change one option in Cargo.toml.
> >>
> >> This particular fuzzer is based on LibAFL example, but I am going to
> >> tailor it for Xen Project-specific needs, like CI integration you
> >> mentioned below.
> >
> > Is my understanding correct that we only need to invoke LibAFL as you
> > are doing already, and that's pretty much it? We need a better
> > configuration specific for Xen, and we need one more way to invoke it to
> > cover x86 but that's it? So, the expectation is that the code currently
> > under
> > https://github.com/xen-troops/xen-fuzzer-rs
> > will not grow much?
> >
> 
> Yes, it basically configures different bits of LibAFL and integrates
> them together. So yes, it will not grow much. I am planning to add some
> QoL things like ability to re-run specific input so it will be easier to
> debug discovered issues. Or maybe tune some fuzzing algorithms
> settings... But nothing big.
 
OK then
 

> >> As for test harness, I am using Zephyr currently. My first intention was
> >> to use XTF, but it is x86-only... I am still considering using XTF for
> >> x86 runs.
> >>
> >> Zephyr was just the easiest and fastest way to trigger hypercalls. At
> >> first I tried to use the Linux kernel, but it was hard to cover all
> >> possible failure paths. Zephyr is much simpler in this regard. Using
> >> MiniOS or XTF would be even better, but ARM support in MiniOS is in a
> >> sorry state and XTF does not work on ARM at all.
> >
> > There is a not-yet-upstream XTF branch that works on ARM here:
> > https://gitlab.com/xen-project/fusa/xtf/-/tree/xtf-arm?ref_type=heads
> 
> Ah, thanks. I'll try to use it as a harness.
> 
> [...]
> 
> >
> >>
> >> I was considering this as well. The problem is that fuzzing should
> >> run for prolonged periods of time. There is no clear consensus on
> >> "how long", but the most widely accepted time period is 24 hours. So it
> >> looks like it should be something like a "nightly build" task. The
> >> fuzzer code needs to be extended to support some runtime restriction,
> >> because right now it runs indefinitely, until the user stops it.
> >
> > We can let it run for 48 hours continuously every weekend using the
> > Gitlab runners
> 
> Great idea. Anyway, I first need to add an option to the fuzzer to limit
> its runtime and invent some method for reporting discovered crashes to the CI.
> 
> >
> >> I am certainly going to implement this, but this is a separate topic,
> >> because it requires changes in the fuzzer app. Speaking of which... Right
> >> now both the fuzzer and the test harness reside in our GitHub repo, as you
> >> noticed. I believe it is better to host them on xenbits as an official
> >> part of the Xen Project.
> >
> > Yes, we can create repos under gitlab.com/xen-project for this, maybe a
> > new subgroup gitlab.com/xen-project/fuzzer
> 
> Good. Whom should I ask to do this?

I created gitlab.com/xen-project/fuzzer as an empty group. What
repositories do you need under it?
Re: [RFC PATCH] xen: add libafl-qemu fuzzer support
Posted by Andrew Cooper 1 day, 14 hours ago
On 19/11/2024 3:16 pm, Volodymyr Babchuk wrote:
>> On Thu, 14 Nov 2024, Volodymyr Babchuk wrote:
> As for the test harness, I am using Zephyr currently. My first intention was
> to use XTF, but it is x86-only... I am still considering using XTF for
> x86 runs.

I need to get back to fixing this.

My in-progress ARM (and RISC-V) branch can make a printk() (console IO
hypercall) and clean shutdown (schedop).

~Andrew
Re: [RFC PATCH] xen: add libafl-qemu fuzzer support
Posted by Volodymyr Babchuk 1 day, 12 hours ago
Hi Andrew,

Andrew Cooper <andrew.cooper3@citrix.com> writes:

> On 19/11/2024 3:16 pm, Volodymyr Babchuk wrote:
>>> On Thu, 14 Nov 2024, Volodymyr Babchuk wrote:
>> As for the test harness, I am using Zephyr currently. My first intention was
>> to use XTF, but it is x86-only... I am still considering using XTF for
>> x86 runs.
>
> I need to get back to fixing this.
>
> My in-progress ARM (and RISC-V) branch can make a printk() (console IO
> hypercall) and clean shutdown (schedop).

If you can share your branch, I'll try to use it as a test
harness. Also, it came to my attention that there is an XTF branch with
ARM support, hosted on GitLab ([1]).

As for the licensing, you are right: LibAFL is dual-licensed, so we can
use MIT. I re-checked the header files ([2]) which I used as a base. They
have no SPDX identifier, so I believe it is safe to use the clause from the
main README.md file ([3]).

[1] https://gitlab.com/xen-project/fusa/xtf/-/commits/xtf-arm?ref_type=heads
[2] https://github.com/AFLplusplus/LibAFL/tree/main/libafl_qemu/runtime
[3] https://github.com/AFLplusplus/LibAFL/blob/main/README.md
-- 
WBR, Volodymyr