[libvirt PATCH] logging: lockdown the systemd service configuration

Daniel P. Berrangé posted 1 patch 7 months, 1 week ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20230926161144.1049779-1-berrange@redhat.com
src/logging/virtlogd.service.in | 94 +++++++++++++++++++++++++++++++++
1 file changed, 94 insertions(+)
[libvirt PATCH] logging: lockdown the systemd service configuration
Posted by Daniel P. Berrangé 7 months, 1 week ago
The 'systemd-analyze security' command looks at the unit file
configuration and reports on any settings which increase the
attack surface for the daemon. Since most systemd units are
fairly minimalist, this is generally informing us about settings
that we never put any thought into using before.

In its current configuration it reports

  # systemd-analyze security virtlogd.service
  ...snip...
  → Overall exposure level for virtlogd.service: 9.6 UNSAFE 😨

which is pretty terrible as a score.

If we apply all of the recommendations that appear possible
without (knowingly) breaking functionality it reports:

  # systemd-analyze security virtlogd.service
  ...snip...
  → Overall exposure level for virtlogd.service: 2.2 OK 🙂

which is a pretty decent improvement.

Some of the settings we would like to enable require a systemd
version that is newer than that available in our oldest distro
target - RHEL-8 at v239.

NB, RestrictSUIDSGID is technically newer than 239, but RHEL-8
backported it, and other distros we target have it by default.

Remaining recommendations are

✗ CapabilityBoundingSet=~CAP_(DAC_*|FOWNER|IPC_OWNER)

  We block FOWNER/IPC_OWNER, but can't block the two DAC
  capabilities. Historically apps/users might point QEMU
  to log files in $HOME, pre-created with their own user
  ID.

✗ IPAddressDeny=

  Not required since RestrictAddressFamilies blocks IP
  usage. Ignoring this avoids the overhead of creating
  a traffic filter than will never be used.

✗ NoNewPrivileges=

  Highly desirable, but cannot enable it yet, because it
  will block the ability to transition to the virtlogd_t
  SELinux domain during execve. The SELinux policy needs
  fixing to permit this transition under NNP first.

✗ PrivateTmp=

  There is a decent chance people have VMs configured
  with a serial port logfile pointing at /tmp. We would
  cause a regression to use private /tmp for logging

✗ PrivateUsers=

  This would put virtlogd inside a user namespace where
  its root is in fact unprivileged. Same problem as the
  User= setting below

✗ ProcSubset=

  Libraries we link to might read certain non-PID related
  files from /proc

✗ ProtectClock=

  Requires v245

✗ ProtectHome=

  Same problem as PrivateTmp=. There's a decent chance
  that someone has a VM configured to write a logfile
  to /home

✗ ProtectHostname=

  Requires v241

✗ ProtectKernelLogs

  Requires v244

✗ ProtectProc

  Requires v247

✗ ProtectSystem=

  We only set it to 'full', as 'strict' is not viable for
  our required usage

✗ RootDirectory=/RootImage=

  We are not capable of running inside a custom chroot
  given needs to write log files to arbitrary places

✗ RestrictAddressFamilies=~AF_UNIX

  We need AF_UNIX to communicate with other libvirt daemons

✗ SystemCallFilter=~@resources

  We link to libvirt.so which links to libnuma.so which has
  a constructor that calls set_mempolicy. This is highly
  undesirable todo during a constructor.

✗ User=/DynamicUser=

  This is highly desirable, but we currently read/write
  logs as root, and directories we're told to write into
  could be anywhere. So using a non-root user would have
  a major risk of regressions for applications and also
  have upgrade implications

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 src/logging/virtlogd.service.in | 94 +++++++++++++++++++++++++++++++++
 1 file changed, 94 insertions(+)

diff --git a/src/logging/virtlogd.service.in b/src/logging/virtlogd.service.in
index 8e245ddb43..9e3838ff34 100644
--- a/src/logging/virtlogd.service.in
+++ b/src/logging/virtlogd.service.in
@@ -20,5 +20,99 @@ OOMScoreAdjust=-900
 # per systemd recommendations
 LimitNOFILE=1024:524288
 
+CapabilityBoundingSet=~CAP_AUDIT_CONTROL
+CapabilityBoundingSet=~CAP_AUDIT_READ
+CapabilityBoundingSet=~CAP_AUDIT_WRITE
+CapabilityBoundingSet=~CAP_BLOCK_SUSPEND
+CapabilityBoundingSet=~CAP_CHOWN
+# Mgmt app/user might have pre-created log files that we're
+# told to open and write to, or be storing them in otherwise
+# inaccessible locations like $HOME. So we need to ignore
+# DAC permission checks.
+#CapabilityBoundingSet=~CAP_DAC_OVERRIDE
+#CapabilityBoundingSet=~CAP_DAC_READ_SEARCH
+CapabilityBoundingSet=~CAP_FOWNER
+CapabilityBoundingSet=~CAP_FSETID
+CapabilityBoundingSet=~CAP_IPC_LOCK
+CapabilityBoundingSet=~CAP_IPC_OWNER
+CapabilityBoundingSet=~CAP_KILL
+CapabilityBoundingSet=~CAP_LEASE
+CapabilityBoundingSet=~CAP_LINUX_IMMUTABLE
+CapabilityBoundingSet=~CAP_MAC_ADMIN
+CapabilityBoundingSet=~CAP_MAC_OVERRIDE
+CapabilityBoundingSet=~CAP_MKNOD
+CapabilityBoundingSet=~CAP_NET_ADMIN
+CapabilityBoundingSet=~CAP_NET_BIND_SERVICE
+CapabilityBoundingSet=~CAP_NET_BROADCAST
+CapabilityBoundingSet=~CAP_NET_RAW
+CapabilityBoundingSet=~CAP_SETFCAP
+CapabilityBoundingSet=~CAP_SETPCAP
+CapabilityBoundingSet=~CAP_SETGID
+CapabilityBoundingSet=~CAP_SETUID
+CapabilityBoundingSet=~CAP_SYSLOG
+CapabilityBoundingSet=~CAP_SYS_ADMIN
+CapabilityBoundingSet=~CAP_SYS_BOOT
+CapabilityBoundingSet=~CAP_SYS_CHROOT
+CapabilityBoundingSet=~CAP_SYS_MODULE
+CapabilityBoundingSet=~CAP_SYS_NICE
+CapabilityBoundingSet=~CAP_SYS_PACCT
+CapabilityBoundingSet=~CAP_SYS_PTRACE
+CapabilityBoundingSet=~CAP_SYS_RAWIO
+CapabilityBoundingSet=~CAP_SYS_RESOURCE
+CapabilityBoundingSet=~CAP_SYS_TIME
+CapabilityBoundingSet=~CAP_SYS_TTY_CONFIG
+CapabilityBoundingSet=~CAP_WAKE_ALARM
+
+LockPersonality=true
+MemoryDenyWriteExecute=true
+# Cannot enable this as it prevents transitioning to
+# the confined SELinux virtlogd_t domain on execve
+# unless we modify the policy to allow this.
+#NoNewPrivileges=true
+PrivateDevices=true
+PrivateMounts=true
+PrivateNetwork=true
+# XXX someone could configure QEMU to log a serial port to an
+# arbitrary directory, including /tmp, even if this is ill-advised
+#PrivateTmp=true
+# Not until oldest build target has systemd >= v245
+#ProtectClock=true
+ProtectControlGroups=true
+# Not until oldest build target has systemd >= v241
+#ProtectHostname=true
+# Not until oldest build target has systemd >= v244
+#ProtectKernelLogs=true
+ProtectKernelModules=true
+ProtectKernelTunables=true
+# Not until oldest build target has systemd >= v247
+#ProtectProc=invisible
+ProtectSystem=full
+RestrictAddressFamilies=AF_UNIX
+RestrictNamespaces=~cgroup
+RestrictNamespaces=~ipc
+RestrictNamespaces=~mnt
+RestrictNamespaces=~net
+RestrictNamespaces=~pid
+RestrictNamespaces=~user
+RestrictNamespaces=~uts
+RestrictRealtime=true
+RestrictSUIDSGID=true
+SystemCallArchitectures=native
+SystemCallFilter=~@clock
+SystemCallFilter=~@debug
+SystemCallFilter=~@module
+SystemCallFilter=~@mount
+SystemCallFilter=~@raw-io
+SystemCallFilter=~@reboot
+SystemCallFilter=~@swap
+SystemCallFilter=~@privileged
+# Unfortunately we link to libnuma via libvirt.so which
+# has a constructor that runs unconditionally that invokes
+# set_mempolicy()
+#SystemCallFilter=~@resources
+SystemCallFilter=~@cpu-emulation
+SystemCallFilter=~@obsolete
+UMask=077
+
 [Install]
 Also=virtlogd.socket
-- 
2.41.0

Re: [libvirt PATCH] logging: lockdown the systemd service configuration
Posted by Michal Prívozník 6 months, 2 weeks ago
On 9/26/23 18:11, Daniel P. Berrangé wrote:
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  src/logging/virtlogd.service.in | 94 +++++++++++++++++++++++++++++++++
>  1 file changed, 94 insertions(+)

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>

Michal