[PATCH 0/2] NFSv4/pNFS: fix client kernel panic from malformed GETDEVICEINFO

Michael Bommarito posted 2 patches 1 day, 17 hours ago
A malicious or compromised NFSv4.1+ pNFS metadata server can panic any
pNFS-flexfile client by returning a GETDEVICEINFO body with a
multipath-DS count of >= 3 and exactly one valid (netid, uaddr) pair.
The unbounded inner loop in nfs4_ff_alloc_deviceid_node() (and the
parallel site in nfs4_fl_alloc_deviceid_node() for the legacy file
layout) keeps iterating after the first netaddr is decoded, consuming
the trailing version_count / version / minor words of the body as
opaque netid + uaddr pairs.  Both decode as zero-length strings, for
which xdr_stream_decode_string_dup() sets *str = NULL and returns 0.
The call site in nfs4_decode_mp_ds_addr() only checks for "< 0", so it
treats the result as success and immediately calls strrchr(NULL, '.').
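
For illustration only, the difference between the "< 0" and "<= 0"
checks can be sketched in userspace C.  This is not kernel code:
mock_decode_string_dup() and check_uaddr() are hypothetical stand-ins
that assume only the behaviour described above (a zero-length string
yields *str = NULL and a return value of 0).  The sketch reports the
NULL case through a flag rather than actually calling strrchr(NULL, '.').

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for xdr_stream_decode_string_dup():
 * returns the decoded length and sets *str to a heap copy.  A
 * zero-length string yields *str = NULL and a return value of 0;
 * a negative return value signals a decode error.
 */
static long mock_decode_string_dup(const char *src, long len, char **str)
{
	assert(str != NULL);
	*str = NULL;
	if (len < 0)
		return -1;		/* decode error */
	if (len == 0)
		return 0;		/* empty string: *str stays NULL */
	*str = malloc((size_t)len + 1);
	if (*str == NULL)
		return -1;
	memcpy(*str, src, (size_t)len);
	(*str)[len] = '\0';
	return len;
}

/*
 * Returns 1 if the address was accepted, 0 if it was rejected.
 * With strict == 0 the check mirrors the "< 0" test; a NULL buffer is
 * then reported via *would_deref_null instead of being handed to
 * strrchr(), so the sketch itself never crashes.
 */
static int check_uaddr(const char *src, long len, int strict,
		       int *would_deref_null)
{
	char *buf;
	long n = mock_decode_string_dup(src, len, &buf);
	int ok;

	*would_deref_null = 0;
	if (strict ? n <= 0 : n < 0)
		return 0;		/* rejected before any use of buf */
	if (buf == NULL) {
		*would_deref_null = 1;	/* strrchr(NULL, '.') would run here */
		return 0;
	}
	ok = strrchr(buf, '.') != NULL;
	free(buf);
	return ok;
}
```

Calling check_uaddr("", 0, 0, &flag) sets the flag, while the same call
with strict = 1 rejects the empty string before the buffer is touched.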

A QEMU/KASAN reproducer is described in the second patch.  The
shortest crashing GETDEVICEINFO body is 56 bytes, the panic reproduces
in 5 of 5 runs at multipath_count = 10, and it fires before any
user-level read can complete on the first pNFS file the client
touches.

Patch 1 closes the NULL dereference itself by changing the two
xdr_stream_decode_string_dup() return-value checks in
nfs4_decode_mp_ds_addr() from "< 0" to "<= 0".  Patch 2 promotes
NFS4_PNFS_MAX_MULTI_CNT to include/linux/nfs4.h so flexfile and the
legacy file layout can share it, bounds the inner mp_count loop in
both drivers against that cap, and breaks the loop on the first NULL
return from nfs4_decode_mp_ds_addr() so a hostile server cannot drive
the decoder past a single malformed entry.  Either patch alone closes
the panic; both together close the latent unbounded-decode class.
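
A minimal userspace sketch of the loop shape that patch 2 describes
follows.  It is an assumption-laden illustration, not the patch itself:
SKETCH_MAX_MULTI_CNT is a placeholder cap (not the value of
NFS4_PNFS_MAX_MULTI_CNT), mock_decode_addr() is a hypothetical stand-in
for nfs4_decode_mp_ds_addr(), and the sketch chooses to reject an
oversized count outright, which may differ from how the drivers enforce
the bound.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_MULTI_CNT 4	/* placeholder cap, not the kernel value */

/*
 * Hypothetical stand-in for nfs4_decode_mp_ds_addr(): entry i decodes
 * successfully only if valid[i] is non-zero; a zero models a NULL
 * return from the real decoder.
 */
static int mock_decode_addr(const int *valid, unsigned int i)
{
	return valid[i] != 0;
}

/*
 * Returns the number of addresses collected, or -1 if mp_count exceeds
 * the cap.  Decoding stops at the first entry that fails to decode, so
 * trailing words are never reinterpreted as further addresses.
 */
static int collect_addrs(const int *valid, unsigned int mp_count)
{
	unsigned int i;
	int collected = 0;

	assert(valid != NULL);
	if (mp_count > SKETCH_MAX_MULTI_CNT)
		return -1;		/* server-supplied count is out of range */
	for (i = 0; i < mp_count; i++) {
		if (!mock_decode_addr(valid, i))
			break;		/* first malformed entry ends the loop */
		collected++;
	}
	return collected;
}
```

With one valid entry followed by malformed ones and mp_count = 3, the
sketch collects a single address and stops; with mp_count = 10 it
refuses the body before decoding anything.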

The missing bound on mp_count predates the flexfile driver: the same
loop has existed in the legacy file layout since 35124a0994fc ("Cleanup
XDR parsing for LAYOUTGET, GETDEVICEINFO", 2011) and was carried into
flexfile by d67ae825a59d ("pnfs/flexfiles: Add the FlexFile Layout
Driver", 2014).  The NULL-deref site was introduced by 6b7f3cf96364
("nfs41: pull decode_ds_addr from file layout to generic pnfs") when
the netaddr decode was unified.  A stable backport is wanted for all
three.

Cc: stable@vger.kernel.org

Michael Bommarito (2):
  NFSv4/pNFS: reject zero-length r_addr in nfs4_decode_mp_ds_addr
  NFSv4/flexfile,filelayout: bound multipath DS count in GETDEVICEINFO

 fs/nfs/filelayout/filelayout.h            |  2 +-
 fs/nfs/filelayout/filelayoutdev.c         |  7 +++++--
 fs/nfs/flexfilelayout/flexfilelayoutdev.c | 10 ++++++++--
 fs/nfs/pnfs_nfs.c                         |  4 ++--
 include/linux/nfs4.h                      |  3 +++
 5 files changed, 19 insertions(+), 7 deletions(-)

--
2.47.3