[PATCH-for-5.0] qga-posix: Avoid crashing process when failing to allocate memory

Philippe Mathieu-Daudé posted 1 patch 2 weeks ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20200324194836.21539-1-philmd@redhat.com
Maintainers: Michael Roth <mdroth@linux.vnet.ibm.com>

[PATCH-for-5.0] qga-posix: Avoid crashing process when failing to allocate memory

Posted by Philippe Mathieu-Daudé 2 weeks ago
Similarly to commit 807e2b6fce0 for Windows, kindly return a
QMP error message instead of crashing the whole process.

Cc: qemu-stable@nongnu.org
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1594054
Reported-by: Fakhri Zulkifli <mohdfakhrizulkifli@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 qga/commands-posix.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index 93474ff770..8f127788e6 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -493,7 +493,13 @@ struct GuestFileRead *qmp_guest_file_read(int64_t handle, bool has_count,
         gfh->state = RW_STATE_NEW;
     }
 
-    buf = g_malloc0(count+1);
+    buf = g_try_malloc0(count + 1);
+    if (!buf) {
+        error_setg(errp,
+                   "failed to allocate sufficient memory "
+                   "to complete the requested service");
+        return NULL;
+    }
     read_count = fread(buf, 1, count, fh);
     if (ferror(fh)) {
         error_setg_errno(errp, errno, "failed to read file");
-- 
2.21.1
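For readers following along outside QEMU, the pattern the patch introduces can be
sketched in plain C. Here try_malloc0() is a hypothetical stand-in for GLib's
g_try_malloc0() (which returns NULL on failure, where g_malloc0() aborts the
process), and the string out-parameter stands in for the real code's Error **errp:

```c
#include <stdlib.h>

/* Hypothetical stand-in for GLib's g_try_malloc0(): a zero-filled
 * allocation that returns NULL on failure instead of aborting the
 * process the way g_malloc0() does. */
static void *try_malloc0(size_t size)
{
    return calloc(1, size);
}

/* Sketch of the patched read path: allocate count + 1 bytes and, on
 * failure, report an error through an out-parameter rather than
 * crashing the whole agent. */
static char *alloc_read_buffer(size_t count, const char **err)
{
    char *buf = try_malloc0(count + 1);
    if (!buf) {
        *err = "failed to allocate sufficient memory "
               "to complete the requested service";
        return NULL;
    }
    *err = NULL;
    return buf;
}
```

An impossibly large count (approaching SIZE_MAX) makes the allocation fail and
yields the error string; a small one returns a usable, zeroed buffer.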


Re: [PATCH-for-5.0] qga-posix: Avoid crashing process when failing to allocate memory

Posted by Daniel P. Berrangé 1 week ago
On Tue, Mar 24, 2020 at 08:48:36PM +0100, Philippe Mathieu-Daudé wrote:
> Similarly to commit 807e2b6fce0 for Windows, kindly return a
> QMP error message instead of crashing the whole process.
> 
> Cc: qemu-stable@nongnu.org
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1594054

I find that bug report pretty dubious:

   "The QEMU Guest Agent in QEMU is vulnerable to an integer 
    overflow in the qmp_guest_file_read(). An attacker could 
    exploit this by sending a crafted QMP command (including 
    guest-file-read with a large count value) to the agent via 
    the listening socket to trigger a g_malloc() call with a 
    large memory chunk resulting in a segmentation fault."

"the attacker" in this case has to have access to the QEMU
chardev associated with the guest agent. IOW, in any sensible
configuration of the chardev, "the attacker" is the same person
who launched QEMU in the first place.  There's no elevation of
privilege here and any denial of service is inflicted against
themselves. Finally it doesn't cause a segmentation fault,
it causes an abort.

This is *NOT* a security bug.

In fact I'd call this NOTABUG entirely. It is just user error
to set the "count" to such a huge value that it hits OOM.

> Reported-by: Fakhri Zulkifli <mohdfakhrizulkifli@gmail.com>
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
>  qga/commands-posix.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> index 93474ff770..8f127788e6 100644
> --- a/qga/commands-posix.c
> +++ b/qga/commands-posix.c
> @@ -493,7 +493,13 @@ struct GuestFileRead *qmp_guest_file_read(int64_t handle, bool has_count,
>          gfh->state = RW_STATE_NEW;
>      }
>  
> -    buf = g_malloc0(count+1);
> +    buf = g_try_malloc0(count + 1);
> +    if (!buf) {
> +        error_setg(errp,
> +                   "failed to allocate sufficient memory "
> +                   "to complete the requested service");
> +        return NULL;
> +    }

So you've prevented an OOM when we call fread() on the file.

The contents of 'buf' now need to be turned into JSON, which
means the buffer needs to be base64 encoded. This will consume
even more memory than the original read.  So checking malloc
here has achieved nothing AFAICT.
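Some back-of-the-envelope arithmetic makes this point concrete (an illustration,
not QEMU code): base64 emits four output characters for every three input bytes,
so the encoded copy is a third larger than the raw buffer the patch guards.

```c
#include <stdint.h>

/* Exact length of the base64 text for n input bytes: four output
 * characters per three-byte input group, rounded up (padding included). */
static uint64_t base64_len(uint64_t n)
{
    return 4 * ((n + 2) / 3);
}

/* Rough lower bound on peak heap use for a guest-file-read of 'count'
 * bytes: the raw buffer plus its base64 copy, ignoring the further
 * copies made by the JSON serializer and monitor code. */
static uint64_t peak_heap_lower_bound(uint64_t count)
{
    return count + base64_len(count);
}
```

A 3 GiB read thus needs at least 7 GiB at peak (3 GiB raw plus 4 GiB encoded),
before the serializer's copies are counted.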

>      read_count = fread(buf, 1, count, fh);
>      if (ferror(fh)) {
>          error_setg_errno(errp, errno, "failed to read file");

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH-for-5.0] qga-posix: Avoid crashing process when failing to allocate memory

Posted by Dietmar Maurer 2 weeks ago
but error_setg() also calls malloc, so this does not help at all?

> On March 24, 2020 8:48 PM Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> 
>  
> Similarly to commit 807e2b6fce0 for Windows, kindly return a
> QMP error message instead of crashing the whole process.
> 
> Cc: qemu-stable@nongnu.org
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1594054
> Reported-by: Fakhri Zulkifli <mohdfakhrizulkifli@gmail.com>
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
>  qga/commands-posix.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> index 93474ff770..8f127788e6 100644
> --- a/qga/commands-posix.c
> +++ b/qga/commands-posix.c
> @@ -493,7 +493,13 @@ struct GuestFileRead *qmp_guest_file_read(int64_t handle, bool has_count,
>          gfh->state = RW_STATE_NEW;
>      }
>  
> -    buf = g_malloc0(count+1);
> +    buf = g_try_malloc0(count + 1);
> +    if (!buf) {
> +        error_setg(errp,
> +                   "failed to allocate sufficient memory "
> +                   "to complete the requested service");
> +        return NULL;
> +    }
>      read_count = fread(buf, 1, count, fh);
>      if (ferror(fh)) {
>          error_setg_errno(errp, errno, "failed to read file");
> -- 
> 2.21.1


Re: [PATCH-for-5.0] qga-posix: Avoid crashing process when failing to allocate memory

Posted by Philippe Mathieu-Daudé 2 weeks ago
On 3/25/20 7:19 AM, Dietmar Maurer wrote:
> but error_setg() also calls malloc, so this does not help at all?

IIUC the problem, you can send a QMP command asking to read, let's say,
3GB of a file, and QEMU crashes. But this doesn't mean the heap
is empty; there are probably a few bytes still available, enough to respond
with an error message.
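That claim is easy to sanity-check with a stand-alone snippet (an illustration,
not QEMU code): a huge allocation request fails cleanly, yet a small follow-up
allocation for an error message still succeeds. It only covers the case where the
big allocation fails up front, not the overcommit behaviour discussed later in
the thread.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if, after a huge allocation request fails, a small
 * allocation (say, for an error message) still succeeds -- the
 * scenario the error-return path in the patch relies on. */
static int small_alloc_after_huge_failure(void)
{
    /* Request roughly half the address space; on any realistic
     * system this returns NULL. */
    void *huge = malloc((size_t)-1 / 2);
    if (huge) {
        free(huge);   /* just in case it somehow succeeded */
    }

    /* The failed request does not poison the heap: a small buffer
     * for an error string is still available. */
    char *msg = malloc(128);
    if (!msg) {
        return 0;
    }
    strcpy(msg, "failed to allocate sufficient memory");
    int ok = (msg[0] == 'f');
    free(msg);
    return ok;
}
```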

> 
>> On March 24, 2020 8:48 PM Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>>
>>   
>> Similarly to commit 807e2b6fce0 for Windows, kindly return a
>> QMP error message instead of crashing the whole process.
>>
>> Cc: qemu-stable@nongnu.org
>> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1594054
>> Reported-by: Fakhri Zulkifli <mohdfakhrizulkifli@gmail.com>
>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>> ---
>>   qga/commands-posix.c | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
>> index 93474ff770..8f127788e6 100644
>> --- a/qga/commands-posix.c
>> +++ b/qga/commands-posix.c
>> @@ -493,7 +493,13 @@ struct GuestFileRead *qmp_guest_file_read(int64_t handle, bool has_count,
>>           gfh->state = RW_STATE_NEW;
>>       }
>>   
>> -    buf = g_malloc0(count+1);
>> +    buf = g_try_malloc0(count + 1);
>> +    if (!buf) {
>> +        error_setg(errp,
>> +                   "failed to allocate sufficient memory "
>> +                   "to complete the requested service");
>> +        return NULL;
>> +    }
>>       read_count = fread(buf, 1, count, fh);
>>       if (ferror(fh)) {
>>           error_setg_errno(errp, errno, "failed to read file");
>> -- 
>> 2.21.1
> 


Re: [PATCH-for-5.0] qga-posix: Avoid crashing process when failing to allocate memory

Posted by Markus Armbruster 1 week ago
Philippe Mathieu-Daudé <philmd@redhat.com> writes:

> On 3/25/20 7:19 AM, Dietmar Maurer wrote:
>> but error_setg() also calls malloc, so this does not help at all?
>
> IIUC the problem, you can send a QMP command asking to read, let's say,
> 3GB of a file, and QEMU crashes. But this doesn't mean the heap
> is empty; there are probably a few bytes still available, enough to
> respond with an error message.

We've discussed how to handle out-of-memory conditions many times.
Here's one instance:

    Subject: When it's okay to treat OOM as fatal?
    Message-ID: <87efcqniza.fsf@dusky.pond.sub.org>
    https://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg03212.html

No improvement since then; there's no guidance on when to check for OOM.
Actual code tends to check only "large" allocations (for subjective
values of "large").

I reiterate my opinion that whatever OOM handling we have is too
unreliable to be worth much, since it can only help when (1) allocations
actually fail (they generally don't[*]), and (2) the allocation that
fails is actually handled (they generally aren't), and (3) the handling
actually works (we don't test OOM, so it generally doesn't).


[*] Linux overcommits memory, which means malloc() pretty much always
succeeds, but when you try to use "too much" of the memory you
supposedly allocated, a lethal signal is coming your way.  Read the
thread I quoted for examples.


Re: [PATCH-for-5.0] qga-posix: Avoid crashing process when failing to allocate memory

Posted by Philippe Mathieu-Daudé 1 week ago
Cc'ing the people who responded to the thread you quoted.

On 3/30/20 4:11 PM, Markus Armbruster wrote:
> Philippe Mathieu-Daudé <philmd@redhat.com> writes:
 > ---
 >   qga/commands-posix.c | 8 +++++++-
 >   1 file changed, 7 insertions(+), 1 deletion(-)
 >
 > diff --git a/qga/commands-posix.c b/qga/commands-posix.c
 > index 93474ff770..8f127788e6 100644
 > --- a/qga/commands-posix.c
 > +++ b/qga/commands-posix.c
 > @@ -493,7 +493,13 @@ struct GuestFileRead *qmp_guest_file_read(int64_t handle, bool has_count,
 >           gfh->state = RW_STATE_NEW;
 >       }
 >
 > -    buf = g_malloc0(count+1);
 > +    buf = g_try_malloc0(count + 1);
 > +    if (!buf) {
 > +        error_setg(errp,
 > +                   "failed to allocate sufficient memory "
 > +                   "to complete the requested service");
 > +        return NULL;
 > +    }
 >       read_count = fread(buf, 1, count, fh);
 >       if (ferror(fh)) {
 >           error_setg_errno(errp, errno, "failed to read file");
 >

>> On 3/25/20 7:19 AM, Dietmar Maurer wrote:
>>> but error_setg() also calls malloc, so this does not help at all?
>>
>> IIUC the problem, you can send a QMP command asking to read, let's say,
>> 3GB of a file, and QEMU crashes. But this doesn't mean the heap
>> is empty; there are probably a few bytes still available, enough to
>> respond with an error message.
> 
> We've discussed how to handle out-of-memory conditions many times.
> Here's one instance:
> 
>      Subject: When it's okay to treat OOM as fatal?
>      Message-ID: <87efcqniza.fsf@dusky.pond.sub.org>
>      https://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg03212.html
> 
> No improvement since then; there's no guidance on when to check for OOM.
> Actual code tends to check only "large" allocations (for subjective
> values of "large").
> 
> I reiterate my opinion that whatever OOM handling we have is too
> unreliable to be worth much, since it can only help when (1) allocations
> actually fail (they generally don't[*]), and (2) the allocation that
> fails is actually handled (they generally aren't), and (3) the handling
> actually works (we don't test OOM, so it generally doesn't).
> 
> 
> [*] Linux overcommits memory, which means malloc() pretty much always
> succeeds, but when you try to use "too much" of the memory you
> supposedly allocated, a lethal signal is coming your way.  Read the
> thread I quoted for examples.

So this patch follows Stefan's reasoning:
https://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg03525.html

   My thinking has been to use g_new() for small QEMU-internal structures
   and g_try_new() for large amounts of memory allocated in response to
   untrusted inputs.  (Untrusted inputs must never be used for unbounded
   allocation sizes but those bounded sizes can still be large.)

In either case (g_malloc/g_try_malloc) we have a denial of service 
(https://www.openwall.com/lists/oss-security/2018/10/17/4) and the 
service is restarted.

Daniel suggests such behavior should be caught by an external firewall 
guard (either on the process or on the network). This seems out of the 
scope of QEMU and hard to fix.

So, can we improve something? Or should we leave this code as it is?


Re: [PATCH-for-5.0] qga-posix: Avoid crashing process when failing to allocate memory

Posted by Daniel P. Berrangé 1 week ago
On Mon, Mar 30, 2020 at 06:04:49PM +0200, Philippe Mathieu-Daudé wrote:
> Cc'ing the people who responded to the thread you quoted.
> 
> On 3/30/20 4:11 PM, Markus Armbruster wrote:
> > Philippe Mathieu-Daudé <philmd@redhat.com> writes:
> > ---
> >   qga/commands-posix.c | 8 +++++++-
> >   1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> > index 93474ff770..8f127788e6 100644
> > --- a/qga/commands-posix.c
> > +++ b/qga/commands-posix.c
> > @@ -493,7 +493,13 @@ struct GuestFileRead *qmp_guest_file_read(int64_t handle, bool has_count,
> >           gfh->state = RW_STATE_NEW;
> >       }
> >
> > -    buf = g_malloc0(count+1);
> > +    buf = g_try_malloc0(count + 1);
> > +    if (!buf) {
> > +        error_setg(errp,
> > +                   "failed to allocate sufficient memory "
> > +                   "to complete the requested service");
> > +        return NULL;
> > +    }
> >       read_count = fread(buf, 1, count, fh);
> >       if (ferror(fh)) {
> >           error_setg_errno(errp, errno, "failed to read file");
> >
> 
> > > On 3/25/20 7:19 AM, Dietmar Maurer wrote:
> > > > but error_setg() also calls malloc, so this does not help at all?
> > > 
> > > IIUC the problem, you can send a QMP command asking to read, let's say,
> > > 3GB of a file, and QEMU crashes. But this doesn't mean the heap
> > > is empty; there are probably a few bytes still available, enough to
> > > respond with an error message.
> > 
> > We've discussed how to handle out-of-memory conditions many times.
> > Here's one instance:
> > 
> >      Subject: When it's okay to treat OOM as fatal?
> >      Message-ID: <87efcqniza.fsf@dusky.pond.sub.org>
> >      https://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg03212.html
> > 
> > No improvement since then; there's no guidance on when to check for OOM.
> > Actual code tends to check only "large" allocations (for subjective
> > values of "large").
> > 
> > I reiterate my opinion that whatever OOM handling we have is too
> > unreliable to be worth much, since it can only help when (1) allocations
> > actually fail (they generally don't[*]), and (2) the allocation that
> > fails is actually handled (they generally aren't), and (3) the handling
> > actually works (we don't test OOM, so it generally doesn't).
> > 
> > 
> > [*] Linux overcommits memory, which means malloc() pretty much always
> > succeeds, but when you try to use "too much" of the memory you
> > supposedly allocated, a lethal signal is coming your way.  Read the
> > thread I quoted for examples.
> 
> So this patch follows Stefan's reasoning:
> https://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg03525.html
> 
>   My thinking has been to use g_new() for small QEMU-internal structures
>   and g_try_new() for large amounts of memory allocated in response to
>   untrusted inputs.  (Untrusted inputs must never be used for unbounded
>   allocation sizes but those bounded sizes can still be large.)
> 
> In either case (g_malloc/g_try_malloc) we have a denial of service
> (https://www.openwall.com/lists/oss-security/2018/10/17/4) and the service
> is restarted.
> 
> Daniel suggests such behavior should be caught by an external firewall guard
> (either on the process or on the network). This seems out of the scope of QEMU
> and hard to fix.

Did I?

IMHO if the guest agent is connected to a chardev that allows access to
a user that isn't the one running QEMU, that is a serious misconfiguration.
IOW, it is user error, not a QEMU guest agent bug.  Firewalls are not
required - QEMU should be reconfigured to not make it accessible over TCP.
This is the same situation documented in 4f24430821c568936aeda417bbb00e989a9e1984,
when it was claimed that QEMU had a security flaw because the QMP monitor
could be exposed over TCP.

The guest agent is a privileged interface to interact with a guest OS, and
as such it should only be made available to users who are trustworthy on
the host. 

> So, can we improve something? Or should we leave this code as it is?

Personally I would have just put a low, hard limit on "count" in the
guest agent spec & impl, e.g. don't allow count to be larger than 10 MB,
and document that this command is just for limited, ad-hoc debugging,
such as log file access. In extremis we could make the hard limit be
a configuration parameter for the guest agent.
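That suggestion boils down to a one-line validity check before anything is
allocated. A sketch (the constant name is made up, and the 10 MB value is only
the figure floated above, not something in the actual guest agent spec):

```c
#include <stdint.h>

/* Hypothetical hard cap on guest-file-read; neither the name nor the
 * exact value comes from the actual guest agent spec. */
#define GUEST_FILE_READ_COUNT_MAX (10 * 1024 * 1024)   /* 10 MB */

/* Reject out-of-range counts up front, before any buffer exists. */
static int guest_file_read_count_ok(int64_t count)
{
    return count >= 0 && count <= GUEST_FILE_READ_COUNT_MAX;
}
```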

The QEMU guest agent protocol is not a sensible way to access huge files
inside the guest. It requires the inefficient process of reading the
entire data into memory, then duplicating it again in base64 format, and
then copying it again in the JSON serializer / monitor code.

For arbitrary general purpose file access, especially for large files,
use a real file transfer program or use a network block device, not the
QEMU guest agent.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH-for-5.0] qga-posix: Avoid crashing process when failing to allocate memory

Posted by Philippe Mathieu-Daudé 1 week ago
On 3/30/20 7:06 PM, Daniel P. Berrangé wrote:
> On Mon, Mar 30, 2020 at 06:04:49PM +0200, Philippe Mathieu-Daudé wrote:
>> Cc'ing the people who responded to the thread you quoted.
>>
>> On 3/30/20 4:11 PM, Markus Armbruster wrote:
>>> Philippe Mathieu-Daudé <philmd@redhat.com> writes:
>>> ---
>>>    qga/commands-posix.c | 8 +++++++-
>>>    1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
>>> index 93474ff770..8f127788e6 100644
>>> --- a/qga/commands-posix.c
>>> +++ b/qga/commands-posix.c
>>> @@ -493,7 +493,13 @@ struct GuestFileRead *qmp_guest_file_read(int64_t handle, bool has_count,
>>>            gfh->state = RW_STATE_NEW;
>>>        }
>>>
>>> -    buf = g_malloc0(count+1);
>>> +    buf = g_try_malloc0(count + 1);
>>> +    if (!buf) {
>>> +        error_setg(errp,
>>> +                   "failed to allocate sufficient memory "
>>> +                   "to complete the requested service");
>>> +        return NULL;
>>> +    }
>>>        read_count = fread(buf, 1, count, fh);
>>>        if (ferror(fh)) {
>>>            error_setg_errno(errp, errno, "failed to read file");
>>>
>>
>>>> On 3/25/20 7:19 AM, Dietmar Maurer wrote:
>>>>> but error_setg() also calls malloc, so this does not help at all?
>>>>
>>>> IIUC the problem, you can send a QMP command asking to read, let's say,
>>>> 3GB of a file, and QEMU crashes. But this doesn't mean the heap
>>>> is empty; there are probably a few bytes still available, enough to
>>>> respond with an error message.
>>>
>>> We've discussed how to handle out-of-memory conditions many times.
>>> Here's one instance:
>>>
>>>       Subject: When it's okay to treat OOM as fatal?
>>>       Message-ID: <87efcqniza.fsf@dusky.pond.sub.org>
>>>       https://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg03212.html
>>>
>>> No improvement since then; there's no guidance on when to check for OOM.
>>> Actual code tends to check only "large" allocations (for subjective
>>> values of "large").
>>>
>>> I reiterate my opinion that whatever OOM handling we have is too
>>> unreliable to be worth much, since it can only help when (1) allocations
>>> actually fail (they generally don't[*]), and (2) the allocation that
>>> fails is actually handled (they generally aren't), and (3) the handling
>>> actually works (we don't test OOM, so it generally doesn't).
>>>
>>>
>>> [*] Linux overcommits memory, which means malloc() pretty much always
>>> succeeds, but when you try to use "too much" of the memory you
>>> supposedly allocated, a lethal signal is coming your way.  Read the
>>> thread I quoted for examples.
>>
>> So this patch follows Stefan's reasoning:
>> https://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg03525.html
>>
>>    My thinking has been to use g_new() for small QEMU-internal structures
>>    and g_try_new() for large amounts of memory allocated in response to
>>    untrusted inputs.  (Untrusted inputs must never be used for unbounded
>>    allocation sizes but those bounded sizes can still be large.)
>>
>> In either case (g_malloc/g_try_malloc) we have a denial of service
>> (https://www.openwall.com/lists/oss-security/2018/10/17/4) and the service
>> is restarted.
>>
>> Daniel suggests such behavior should be caught by an external firewall guard
>> (either on the process or on the network). This seems out of the scope of QEMU
>> and hard to fix.
> 
> Did I ?
> 
> IMHO if the guest agent is connected to a chardev that allows access to
> a user that isn't the one running QEMU, that is a serious misconfiguration.
> IOW, it is user error, not a QEMU guest agent bug.  Firewalls are not
> required - QEMU should be reconfigured to not make it accessible over TCP.
> This is the same situation documented in 4f24430821c568936aeda417bbb00e989a9e1984,
> when it was claimed that QEMU had a security flaw because the QMP monitor
> could be exposed over TCP.
> 
> The guest agent is a privileged interface to interact with a guest OS, and
> as such it should only be made available to users who are trustworthy on
> the host.
> 
>> So, can we improve something? Or should we leave this code as it is?
> 
> Personally I would have just put a low, hard limit on "count" in the
> guest agent spec & impl, e.g. don't allow count to be larger than 10 MB,
> and document that this command is just for limited, ad-hoc debugging,
> such as log file access. In extremis we could make the hard limit be
> a configuration parameter for the guest agent.

This is clearly a smarter/cleaner fix, thanks!

> The QEMU guest agent protocol is not a sensible way to access huge files
> inside the guest. It requires the inefficient process of reading the
> entire data into memory, then duplicating it again in base64 format, and
> then copying it again in the JSON serializer / monitor code.
> 
> For arbitrary general purpose file access, especially for large files,
> use a real file transfer program or use a network block device, not the
> QEMU guest agent.
> 
> 
> Regards,
> Daniel
> 


Re: [PATCH-for-5.0] qga-posix: Avoid crashing process when failing to allocate memory

Posted by Dr. David Alan Gilbert 1 week ago
* Philippe Mathieu-Daudé (philmd@redhat.com) wrote:
> Cc'ing the people who responded to the thread you quoted.
> 
> On 3/30/20 4:11 PM, Markus Armbruster wrote:
> > Philippe Mathieu-Daudé <philmd@redhat.com> writes:
> > ---
> >   qga/commands-posix.c | 8 +++++++-
> >   1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> > index 93474ff770..8f127788e6 100644
> > --- a/qga/commands-posix.c
> > +++ b/qga/commands-posix.c
> > @@ -493,7 +493,13 @@ struct GuestFileRead *qmp_guest_file_read(int64_t handle, bool has_count,
> >           gfh->state = RW_STATE_NEW;
> >       }
> >
> > -    buf = g_malloc0(count+1);
> > +    buf = g_try_malloc0(count + 1);
> > +    if (!buf) {
> > +        error_setg(errp,
> > +                   "failed to allocate sufficient memory "
> > +                   "to complete the requested service");
> > +        return NULL;
> > +    }
> >       read_count = fread(buf, 1, count, fh);
> >       if (ferror(fh)) {
> >           error_setg_errno(errp, errno, "failed to read file");
> >
> 
> > > On 3/25/20 7:19 AM, Dietmar Maurer wrote:
> > > > but error_setg() also calls malloc, so this does not help at all?
> > > 
> > > IIUC the problem, you can send a QMP command asking to read, let's say,
> > > 3GB of a file, and QEMU crashes. But this doesn't mean the heap
> > > is empty; there are probably a few bytes still available, enough to
> > > respond with an error message.
> > 
> > We've discussed how to handle out-of-memory conditions many times.
> > Here's one instance:
> > 
> >      Subject: When it's okay to treat OOM as fatal?
> >      Message-ID: <87efcqniza.fsf@dusky.pond.sub.org>
> >      https://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg03212.html
> > 
> > No improvement since then; there's no guidance on when to check for OOM.
> > Actual code tends to check only "large" allocations (for subjective
> > values of "large").
> > 
> > I reiterate my opinion that whatever OOM handling we have is too
> > unreliable to be worth much, since it can only help when (1) allocations
> > actually fail (they generally don't[*]), and (2) the allocation that
> > fails is actually handled (they generally aren't), and (3) the handling
> > actually works (we don't test OOM, so it generally doesn't).
> > 
> > 
> > [*] Linux overcommits memory, which means malloc() pretty much always
> > succeeds, but when you try to use "too much" of the memory you
> > supposedly allocated, a lethal signal is coming your way.  Read the
> > thread I quoted for examples.
> 
> So this patch follows Stefan's reasoning:
> https://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg03525.html
> 
>   My thinking has been to use g_new() for small QEMU-internal structures
>   and g_try_new() for large amounts of memory allocated in response to
>   untrusted inputs.  (Untrusted inputs must never be used for unbounded
>   allocation sizes but those bounded sizes can still be large.)
> 
> In either case (g_malloc/g_try_malloc) we have a denial of service
> (https://www.openwall.com/lists/oss-security/2018/10/17/4) and the service
> is restarted.
> 
> Daniel suggests such behavior should be caught by an external firewall guard
> (either on the process or on the network). This seems out of the scope of QEMU
> and hard to fix.
> 
> So, can we improve something? Or should we leave this code as it is?

I'll agree with Stefan's description; we should use 'try' for anything
'large' (badly defined) or user-controlled.

So I think this should switch to using the 'try' variant.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [PATCH-for-5.0] qga-posix: Avoid crashing process when failing to allocate memory

Posted by Daniel P. Berrangé 1 week ago
On Mon, Mar 30, 2020 at 05:38:05PM +0100, Dr. David Alan Gilbert wrote:
> * Philippe Mathieu-Daudé (philmd@redhat.com) wrote:
> > Cc'ing the people who responded to the thread you quoted.
> > 
> > On 3/30/20 4:11 PM, Markus Armbruster wrote:
> > > Philippe Mathieu-Daudé <philmd@redhat.com> writes:
> > > ---
> > >   qga/commands-posix.c | 8 +++++++-
> > >   1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> > > index 93474ff770..8f127788e6 100644
> > > --- a/qga/commands-posix.c
> > > +++ b/qga/commands-posix.c
> > > @@ -493,7 +493,13 @@ struct GuestFileRead *qmp_guest_file_read(int64_t handle, bool has_count,
> > >           gfh->state = RW_STATE_NEW;
> > >       }
> > >
> > > -    buf = g_malloc0(count+1);
> > > +    buf = g_try_malloc0(count + 1);
> > > +    if (!buf) {
> > > +        error_setg(errp,
> > > +                   "failed to allocate sufficient memory "
> > > +                   "to complete the requested service");
> > > +        return NULL;
> > > +    }
> > >       read_count = fread(buf, 1, count, fh);
> > >       if (ferror(fh)) {
> > >           error_setg_errno(errp, errno, "failed to read file");
> > >
> > 
> > > > On 3/25/20 7:19 AM, Dietmar Maurer wrote:
> > > > > but error_setg() also calls malloc, so this does not help at all?
> > > > 
> > > > IIUC the problem, you can send a QMP command asking to read, let's say,
> > > > 3GB of a file, and QEMU crashes. But this doesn't mean the heap
> > > > is empty; there are probably a few bytes still available, enough to
> > > > respond with an error message.
> > > 
> > > We've discussed how to handle out-of-memory conditions many times.
> > > Here's one instance:
> > > 
> > >      Subject: When it's okay to treat OOM as fatal?
> > >      Message-ID: <87efcqniza.fsf@dusky.pond.sub.org>
> > >      https://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg03212.html
> > > 
> > > No improvement since then; there's no guidance on when to check for OOM.
> > > Actual code tends to check only "large" allocations (for subjective
> > > values of "large").
> > > 
> > > I reiterate my opinion that whatever OOM handling we have is too
> > > unreliable to be worth much, since it can only help when (1) allocations
> > > actually fail (they generally don't[*]), and (2) the allocation that
> > > fails is actually handled (they generally aren't), and (3) the handling
> > > actually works (we don't test OOM, so it generally doesn't).
> > > 
> > > 
> > > [*] Linux overcommits memory, which means malloc() pretty much always
> > > succeeds, but when you try to use "too much" of the memory you
> > > supposedly allocated, a lethal signal is coming your way.  Read the
> > > thread I quoted for examples.
> > 
> > So this patch follows Stefan's reasoning:
> > https://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg03525.html
> > 
> >   My thinking has been to use g_new() for small QEMU-internal structures
> >   and g_try_new() for large amounts of memory allocated in response to
> >   untrusted inputs.  (Untrusted inputs must never be used for unbounded
> >   allocation sizes but those bounded sizes can still be large.)
> > 
> > In either case (g_malloc/g_try_malloc) we have a denial of service
> > (https://www.openwall.com/lists/oss-security/2018/10/17/4) and the service
> > is restarted.
> > 
> > Daniel suggests such behavior should be caught by an external firewall guard
> > (either on the process or on the network). This seems out of the scope of QEMU
> > and hard to fix.
> > 
> > So, can we improve something? Or should we leave this code as it is?
> 
> I'll agree with Stefan's description; we should use 'try' for anything
> 'large' (badly defined) or user controlled.
> 
> So I think this should switch to having the try.

IMHO it is pointless to use try here as it's not really solving the
problem. This is just the first of several mallocs in the process
of running this command. Consider the size is large, but still small
enough for the first g_try_malloc() to succeed. Soon after, GLib is
going to g_malloc an even larger block of ((count / 3 + 1) * 4 + 1) bytes
in order to base64 encode this data, and that can push us over into OOM
again.
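The arithmetic checks out. A self-contained sketch (not GLib code;
base64_buffer_size is the formula quoted above, base64_exact_len the canonical
encoded length) confirms the quoted size always covers the encoding, and that
for a 3 GiB input the base64 buffer alone exceeds 4 GiB:

```c
#include <stdint.h>

/* Exact length of the base64 encoding of n bytes (with '=' padding):
 * four output characters per three-byte group, rounded up. */
static uint64_t base64_exact_len(uint64_t n)
{
    return 4 * ((n + 2) / 3);
}

/* The worst-case buffer size quoted above for the encoding step. */
static uint64_t base64_buffer_size(uint64_t n)
{
    return (n / 3 + 1) * 4 + 1;
}
```

So even when the first g_try_malloc0() succeeds, the encoding step asks for
roughly a third more memory again, exactly the follow-up allocation that can
still abort.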

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH-for-5.0] qga-posix: Avoid crashing process when failing to allocate memory

Posted by Philippe Mathieu-Daudé 1 week ago
On 3/30/20 6:04 PM, Philippe Mathieu-Daudé wrote:
> Cc'ing the people who responded to the thread you quoted.
> 
> On 3/30/20 4:11 PM, Markus Armbruster wrote:
>> Philippe Mathieu-Daudé <philmd@redhat.com> writes:
>  > ---
>  >   qga/commands-posix.c | 8 +++++++-
>  >   1 file changed, 7 insertions(+), 1 deletion(-)
>  >
>  > diff --git a/qga/commands-posix.c b/qga/commands-posix.c
>  > index 93474ff770..8f127788e6 100644
>  > --- a/qga/commands-posix.c
>  > +++ b/qga/commands-posix.c
>  > @@ -493,7 +493,13 @@ struct GuestFileRead 
> *qmp_guest_file_read(int64_t handle, bool has_count,
>  >           gfh->state = RW_STATE_NEW;
>  >       }
>  >
>  > -    buf = g_malloc0(count+1);
>  > +    buf = g_try_malloc0(count + 1);
>  > +    if (!buf) {
>  > +        error_setg(errp,
>  > +                   "failed to allocate sufficient memory "
>  > +                   "to complete the requested service");
>  > +        return NULL;
>  > +    }
>  >       read_count = fread(buf, 1, count, fh);
>  >       if (ferror(fh)) {
>  >           error_setg_errno(errp, errno, "failed to read file");
>  >
> 
>>> On 3/25/20 7:19 AM, Dietmar Maurer wrote:
>>>> but error_setg() also calls malloc, so this does not help at all?
>>>
>>> IIUC the problem, you can send a QMP command asking to read, let's say,
>>> 3GB of a file, and QEMU crashes. But this doesn't mean the heap
>>> is empty; there are probably a few bytes still available, enough to
>>> respond with an error message.
>>
>> We've discussed how to handle out-of-memory conditions many times.
>> Here's one instance:
>>
>>      Subject: When it's okay to treat OOM as fatal?
>>      Message-ID: <87efcqniza.fsf@dusky.pond.sub.org>
>>      https://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg03212.html
>>
>> No improvement since then; there's no guidance on when to check for OOM.
>> Actual code tends to check only "large" allocations (for subjective
>> values of "large").
>>
>> I reiterate my opinion that whatever OOM handling we have is too
>> unreliable to be worth much, since it can only help when (1) allocations
>> actually fail (they generally don't[*]), and (2) the allocation that
>> fails is actually handled (they generally aren't), and (3) the handling
>> actually works (we don't test OOM, so it generally doesn't).
>>
>>
>> [*] Linux overcommits memory, which means malloc() pretty much always
>> succeeds, but when you try to use "too much" of the memory you
>> supposedly allocated, a lethal signal is coming your way.  Read the
>> thread I quoted for examples.
> 
> So this patch follows Stefan's reasoning:
> https://lists.nongnu.org/archive/html/qemu-devel/2018-10/msg03525.html
> 
>    My thinking has been to use g_new() for small QEMU-internal structures
>    and g_try_new() for large amounts of memory allocated in response to
>    untrusted inputs.  (Untrusted inputs must never be used for unbounded
>    allocation sizes but those bounded sizes can still be large.)
> 
> In either case (g_malloc/g_try_malloc) we have a denial of service 
> (https://www.openwall.com/lists/oss-security/2018/10/17/4) and the 
> service is restarted.
> 
> Daniel suggests such behavior should be caught by an external firewall 
> guard (either on the process or on the network). This seems out of the 
> scope of QEMU and hard to fix.
> 
> So, can we improve something? Or should we leave this code as it is?

But if we don't want to modify this code, we should document somewhere 
that mainstream QEMU doesn't care about this DoS and that it is the user's 
responsibility to protect it with something else...