[v1] Add support for Control-Flow Integrity

[PATCH 0/2] Add support for Control-Flow Integrity
Posted by Daniele Buono 5 years, 7 months ago
LLVM/Clang, starting from v3.9, supports runtime checks for forward-edge
Control-Flow Integrity (CFI).

CFI on indirect function calls can have a huge impact in enhancing QEMU
security, by significantly limiting one of the most used attack vectors
for VM Escape. Attacks demonstrated in [1],[2] and [3] will, at some
point, change a function pointer in a QEMU data structure.

At a high level, LLVM's implementation relies on compile-time information
to create a range of consecutive trampolines for "compatible functions".
At runtime, if the pointer is not in the valid range, it is assumed that
the control flow was hijacked, and the process is terminated with an
"Illegal Instruction" exception.
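
As a rough illustration of the check described above, the following C
sketch models it in portable code. It is not what LLVM emits: the real
check is a range test on a compiler-generated table of trampolines and
traps on failure, while this model uses an explicit table lookup and
returns false. All names here (io_handler_fn, checked_call, and so on)
are hypothetical and do not come from QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type, standing in for a device-emulation hook. */
typedef int (*io_handler_fn)(int value);

static int handler_double(int value) { return value * 2; }
static int handler_negate(int value) { return -value; }

/* Same signature, but never registered: models a pointer an attacker forged. */
static int rogue_target(int value) { return value; }

/*
 * Stand-in for the set of "compatible functions" the compiler would
 * collect at link time for io_handler_fn.
 */
static const io_handler_fn valid_io_handlers[] = {
    handler_double,
    handler_negate,
};

static bool is_valid_io_handler(io_handler_fn fn)
{
    size_t count = sizeof(valid_io_handlers) / sizeof(valid_io_handlers[0]);

    for (size_t i = 0; i < count; i++) {
        if (valid_io_handlers[i] == fn) {
            return true;
        }
    }
    return false;
}

/*
 * Performs the indirect call only if the target is in the valid set.
 * Returns false (and leaves *result untouched) otherwise; a real CFI
 * build would terminate the process at this point instead.
 */
static bool checked_call(io_handler_fn fn, int arg, int *result)
{
    assert(result != NULL);

    if (!is_valid_io_handler(fn)) {
        return false;
    }
    *result = fn(arg);
    return true;
}
```

In this model, checked_call(handler_double, 21, &r) performs the call,
while checked_call(rogue_target, 1, &r) is refused even though the
signature matches, because the target is not in the valid set.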

CAVEATS:

1) For this CFI check to work, the code must always respect the function
signature when using function pointers. While this is generally true
in QEMU, there are a few instances where pointers are handled as
generic void* from the caller. Since this is a common approach, Clang
offers a flag to relax pointer checks and consider all pointer types
to be compatible.

2) Since CFI relies on compile-time information, it requires using
link-time optimization (LTO) to support CFI across compilation units.
This adds a requirement for the gold linker and for LLVM's versions of
the static-library tools (ar, ranlib, nm).

3) CFI checks cannot be performed on shared libraries (given that their
functions are not known at compile time). This means that indirect function
calls will fail if the function pointer belongs to a shared library.
This does not seem to be a big issue for a standard QEMU deployment today,
but QEMU modules won't be able to work when CFI is enabled.
There is a way to allow shared library pointers, but it is experimental
in LLVM, requires some work and reduces performance and security. For
these reasons, I think it's best to start with this version, and discuss
an extension for modules later.

4) CFI cannot be fully applied to TCG. The purpose of TCG is to transform
code on-the-fly from one ISA to another. In doing so, binary blobs of
executable code are created and called with function pointers.
Since the code did not exist at compile time, runtime CFI checks find such
functions illegal. To allow the code to keep running, CFI checks are not
performed in the core function of TCG/TCI, and in the code that
implements TCG plugins.
This does not affect QEMU when executed with KVM, and all the device
emulation code is always protected, even when using TCG.

5) Most of the logic to enable CFI goes in the configure script, since it's
just a matter of checking for dependencies and incompatible options.
However, I had to disable CFI checks for a few TCG functions.
This can only be done through a blacklist file. I added a file in the
root of QEMU, called cfi-blacklist.txt, for this purpose. I am open to
suggestions on where the file should go, and I am willing to become the
maintainer of it, if deemed necessary.
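
To make caveat 1 above more concrete, here is a minimal C sketch of the
two callback styles involved. The types and function names are
hypothetical, not taken from QEMU. The flag I have in mind is, to my
understanding of the Clang documentation,
-fsanitize-cfi-icall-generalize-pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device state and generic callback type. */
typedef struct Counter {
    int hits;
} Counter;

typedef void (*generic_cb)(void *opaque);

/*
 * Signature matches generic_cb exactly, so an indirect call through a
 * generic_cb pointer satisfies the strict check.
 */
static void counter_bump(void *opaque)
{
    Counter *c = opaque;

    c->hits++;
}

/*
 * Typed signature. Registering it as (generic_cb)counter_bump_typed is
 * the pattern that the strict check rejects, because void (*)(Counter *)
 * and void (*)(void *) are different function types. With pointer types
 * treated as compatible, such a call would be accepted.
 */
static void counter_bump_typed(Counter *c)
{
    c->hits++;
}

/* The indirect call site that CFI would instrument. */
static void run_callback(generic_cb cb, void *opaque)
{
    assert(cb != NULL);
    cb(opaque);
}
```

Here run_callback(counter_bump, &counter) is fine in either mode, while
the cast variant is deliberately left uncalled, since it only works
when pointer types are considered compatible.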

PERFORMANCE:

Enabling CFI creates a larger binary, which may be an issue in some use
cases. However, the increase is not exceptionally large. On my Ubuntu
system, with default options, I see an increase of stripped size from
14MiB to 15.3MiB (about 9%) when enabling CFI with Clang v9.

There is also a possible performance issue, since for every indirect
function call, an additional address check is performed, followed by
an additional indirect call to the trampoline function.
However, especially in the case of KVM-based virtualization, the impact
should be minimal, since indirect function pointers should be used mostly
for device emulation.

I used Kata Containers' metrics tests, since they are a simple,
reproducible set of tests that stress storage and network between VMs
and run a Lifecycle test to measure VM startup times under a specific
workload. A full report is available here [4].

The difference between LLVM with and without CFI is generally low.
Sometimes CFI actually offers better performance, which may be
explained by having a different binary layout because of LTO.
Lifecycle and network do not seem to be affected much. With storage,
the situation is a bit more variable, but the oscillations seem to be
more related to the benchmark variability than the CFI overhead.

I also ran a quick check-acceptance on full system VMs with and without CFI.
The results are at [4] and are comparable, with CFI slightly
outperforming the default binary produced by LLVM.

----

[1] Mehdi Talbi and Paul Fariello. VM escape - QEMU Case Study
[2] Nelson Elhage. Virtunoid: Breaking out of KVM
[3] Marco Grassi and Kira. Vulnerability Discovery and Exploitation
of Virtualization Solutions for Cloud Computing and Desktops
[4] https://github.com/dbuono/QEMU-CFI-Performance

Daniele Buono (2):
  check-block: enable iotests with cfi-icall
  configure: add support for Control-Flow Integrity

 cfi-blacklist.txt    |  27 +++++++
 configure            | 177 +++++++++++++++++++++++++++++++++++++++++++
 tests/check-block.sh |  18 +++--
 3 files changed, 215 insertions(+), 7 deletions(-)
 create mode 100644 cfi-blacklist.txt

-- 
2.26.2