[libvirt] [PATCH v2 0/8] Don't hold both monitor and agent jobs at the same time

Jonathon Jongsma posted 8 patches 4 years, 3 months ago
Test syntax-check failed
Failed in applying to current master (apply log)
src/libvirt-domain.c                |  12 +-
src/qemu/THREADS.txt                |  58 +-----
src/qemu/qemu_agent.c               | 268 ++++------------------------
src/qemu/qemu_agent.h               |  33 +++-
src/qemu/qemu_domain.c              |  56 +-----
src/qemu/qemu_domain.h              |   7 -
src/qemu/qemu_driver.c              | 246 +++++++++++++++++++++++--
src/remote/remote_daemon_dispatch.c |   2 +-
tests/qemuagenttest.c               | 196 ++++----------------
9 files changed, 336 insertions(+), 542 deletions(-)
[libvirt] [PATCH v2 0/8] Don't hold both monitor and agent jobs at the same time
Posted by Jonathon Jongsma 4 years, 3 months ago
We have to assume that the guest agent may be malicious, so we don't want to
allow any agent queries to block any other libvirt API. By holding a monitor
job and an agent job while we're querying the agent, any other threads will be
blocked from using the monitor while the agent is unresponsive. Because libvirt
waits forever for an agent response, this makes us vulnerable to a denial of
service from a malicious (or simply buggy) guest agent.

Most of the patches in the first series were already reviewed and pushed, but a
couple remain: the filesystem info functions. The problem with these functions
is that the agent functions access the vm definition (owned by the domain). If
a monitor job is not held while this is done, the vm definition could change
while we are looking up the disk alias, leading to a potentional crash.

This series tries to fix this by moving the disk alias searching up a level
from qemu_agent.c to qemu_driver.c. The code in qemu_agent.c will only return
the raw data returned from the agent command response. After the agent response
is returned and the agent job is ended, we can then look up the disk alias from
the vm definition while the domain object is locked.

In addition, a few nearby cleanups are included in this series, notably
changing to glib allocation API in a couple of places.

Jonathon Jongsma (8):
  qemu: rename qemuAgentGetFSInfoInternalDisk()
  qemu: store complete agent filesystem information
  qemu: Don't store disk alias in qemuAgentDiskInfo
  qemu: don't access vmdef within qemu_agent.c
  qemu: remove qemuDomainObjBegin/EndJobWithAgent()
  qemu: use glib alloc in qemuAgentGetFSInfoFillDisks()
  qemu: use glib allocation apis for qemuAgentFSInfo
  Use glib alloc API for virDomainFSInfo

 src/libvirt-domain.c                |  12 +-
 src/qemu/THREADS.txt                |  58 +-----
 src/qemu/qemu_agent.c               | 268 ++++------------------------
 src/qemu/qemu_agent.h               |  33 +++-
 src/qemu/qemu_domain.c              |  56 +-----
 src/qemu/qemu_domain.h              |   7 -
 src/qemu/qemu_driver.c              | 246 +++++++++++++++++++++++--
 src/remote/remote_daemon_dispatch.c |   2 +-
 tests/qemuagenttest.c               | 196 ++++----------------
 9 files changed, 336 insertions(+), 542 deletions(-)

-- 
2.21.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [PATCH v2 0/8] Don't hold both monitor and agent jobs at the same time
Posted by Michal Privoznik 4 years, 3 months ago
On 1/11/20 12:32 AM, Jonathon Jongsma wrote:
> We have to assume that the guest agent may be malicious, so we don't want to
> allow any agent queries to block any other libvirt API. By holding a monitor
> job and an agent job while we're querying the agent, any other threads will be
> blocked from using the monitor while the agent is unresponsive. Because libvirt
> waits forever for an agent response, this makes us vulnerable to a denial of
> service from a malicious (or simply buggy) guest agent.
> 
> Most of the patches in the first series were already reviewed and pushed, but a
> couple remain: the filesystem info functions. The problem with these functions
> is that the agent functions access the vm definition (owned by the domain). If
> a monitor job is not held while this is done, the vm definition could change
> while we are looking up the disk alias, leading to a potentional crash.
> 
> This series tries to fix this by moving the disk alias searching up a level
> from qemu_agent.c to qemu_driver.c. The code in qemu_agent.c will only return
> the raw data returned from the agent command response. After the agent response
> is returned and the agent job is ended, we can then look up the disk alias from
> the vm definition while the domain object is locked.
> 
> In addition, a few nearby cleanups are included in this series, notably
> changing to glib allocation API in a couple of places.
> 
> Jonathon Jongsma (8):
>    qemu: rename qemuAgentGetFSInfoInternalDisk()
>    qemu: store complete agent filesystem information
>    qemu: Don't store disk alias in qemuAgentDiskInfo
>    qemu: don't access vmdef within qemu_agent.c
>    qemu: remove qemuDomainObjBegin/EndJobWithAgent()
>    qemu: use glib alloc in qemuAgentGetFSInfoFillDisks()
>    qemu: use glib allocation apis for qemuAgentFSInfo
>    Use glib alloc API for virDomainFSInfo
> 
>   src/libvirt-domain.c                |  12 +-
>   src/qemu/THREADS.txt                |  58 +-----
>   src/qemu/qemu_agent.c               | 268 ++++------------------------
>   src/qemu/qemu_agent.h               |  33 +++-
>   src/qemu/qemu_domain.c              |  56 +-----
>   src/qemu/qemu_domain.h              |   7 -
>   src/qemu/qemu_driver.c              | 246 +++++++++++++++++++++++--
>   src/remote/remote_daemon_dispatch.c |   2 +-
>   tests/qemuagenttest.c               | 196 ++++----------------
>   9 files changed, 336 insertions(+), 542 deletions(-)
> 

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>

and pushed. Thanks for fixing this.

Michal

Re: [libvirt] [PATCH v2 0/8] Don't hold both monitor and agent jobs at the same time
Posted by Eric Blake 4 years, 3 months ago
On 1/10/20 5:32 PM, Jonathon Jongsma wrote:
> We have to assume that the guest agent may be malicious, so we don't want to
> allow any agent queries to block any other libvirt API. By holding a monitor
> job and an agent job while we're querying the agent, any other threads will be
> blocked from using the monitor while the agent is unresponsive. Because libvirt
> waits forever for an agent response, this makes us vulnerable to a denial of
> service from a malicious (or simply buggy) guest agent.
> 
> Most of the patches in the first series were already reviewed and pushed, but a
> couple remain: the filesystem info functions. The problem with these functions
> is that the agent functions access the vm definition (owned by the domain). If
> a monitor job is not held while this is done, the vm definition could change
> while we are looking up the disk alias, leading to a potentional crash.

Did we ever hear back on a CVE assignment for the first series?  And do 
any of the patches in this series also fall under the CVE umbrella?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [libvirt] [PATCH v2 0/8] Don't hold both monitor and agent jobs at the same time
Posted by Jonathon Jongsma 4 years, 3 months ago
On Thu, 2020-01-16 at 09:46 -0600, Eric Blake wrote:
> On 1/10/20 5:32 PM, Jonathon Jongsma wrote:
> > We have to assume that the guest agent may be malicious, so we
> > don't want to
> > allow any agent queries to block any other libvirt API. By holding
> > a monitor
> > job and an agent job while we're querying the agent, any other
> > threads will be
> > blocked from using the monitor while the agent is unresponsive.
> > Because libvirt
> > waits forever for an agent response, this makes us vulnerable to a
> > denial of
> > service from a malicious (or simply buggy) guest agent.
> > 
> > Most of the patches in the first series were already reviewed and
> > pushed, but a
> > couple remain: the filesystem info functions. The problem with
> > these functions
> > is that the agent functions access the vm definition (owned by the
> > domain). If
> > a monitor job is not held while this is done, the vm definition
> > could change
> > while we are looking up the disk alias, leading to a potentional
> > crash.
> 
> Did we ever hear back on a CVE assignment for the first series?  And
> do 
> any of the patches in this series also fall under the CVE umbrella?


Good question. I never did hear back about a CVE assignment. This
series is just a revision (and refactoring) of a couple of the patches
that were NACKed from the first series. So they are relevant to the
(potential) CVE. 

Jonathon