[PATCH 0/4] xen: domain-tracked allocations, and fault injection

Andrew Cooper posted 4 patches 3 years, 3 months ago
Test gitlab-ci failed
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20201223163442.8840-1-andrew.cooper3@citrix.com
[PATCH 0/4] xen: domain-tracked allocations, and fault injection
Posted by Andrew Cooper 3 years, 3 months ago
This was not the Christmas hacking project that I was planning to do, but it
has had some exciting results.

After some discussion on an earlier thread, Tamas has successfully got fuzzing
of Xen working via kfx, and this series is a prototype for providing better
testing infrastructure.

And to prove a point, this series has already found a memory leak in ARM's
dom0less smoke test.

From https://gitlab.com/xen-project/people/andyhhp/xen/-/jobs/929518792

  (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
  (XEN) Freed 328kB init memory.
  (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER4
  (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER8
  (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER12
  (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER16
  (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER20
  (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER24
  (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER28
  (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER32
  (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER0
  (XEN) physdev.c:16:d0v0 PHYSDEVOP cmd=25: not implemented
  (XEN) physdev.c:16:d0v0 PHYSDEVOP cmd=15: not implemented
  (XEN) physdev.c:16:d0v0 PHYSDEVOP cmd=15: not implemented
  (XEN)
  (XEN) ****************************************
  (XEN) Panic on CPU 0:
  (XEN) d1 has 2 outstanding heap allocations
  (XEN) ****************************************
  (XEN)
  (XEN) Reboot in five seconds...

For some reason, neither of the evtchn default memory allocations is freed,
but it's not clear why d1 shut down to begin with.  Stefano - any ideas?
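
For readers who have not opened the patches, the panic above comes from
counting allocations per domain and checking the count at teardown.  The
following is only a userspace sketch of that idea, not the code from
xen/common/dmalloc.c: struct domain here is a stand-in, and tracked_alloc(),
tracked_free() and leaks_at_destroy() are hypothetical names built on
malloc()/free().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for Xen's struct domain; only the counter matters. */
struct domain {
    unsigned int id;
    unsigned long outstanding; /* allocations made for this domain, not yet freed */
};

/* Allocate on behalf of d, counting the allocation only if it succeeded. */
static void *tracked_alloc(struct domain *d, size_t size)
{
    void *p = malloc(size);

    if ( p )
        d->outstanding++;

    return p;
}

/* Free an allocation made for d.  Freeing NULL is a no-op, as with free(). */
static void tracked_free(struct domain *d, void *p)
{
    if ( !p )
        return;

    assert(d->outstanding > 0);
    d->outstanding--;
    free(p);
}

/*
 * Number of allocations still outstanding at teardown.  A non-zero value
 * corresponds to the "dN has M outstanding heap allocations" condition in
 * the log above.
 */
static unsigned long leaks_at_destroy(const struct domain *d)
{
    return d->outstanding;
}
```

Two allocations that are never passed to tracked_free() leave the counter at
2, which is the shape of the failure reported for d1.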

Andrew Cooper (4):
  xen/dmalloc: Introduce dmalloc() APIs
  xen/evtchn: Switch to dmalloc
  xen/domctl: Introduce fault_ttl
  tools/misc: Test for fault injection

 tools/misc/.gitignore       |  1 +
 tools/misc/Makefile         |  5 ++++
 tools/misc/xen-fault-ttl.c  | 56 +++++++++++++++++++++++++++++++++++++++++++++
 xen/common/Makefile         |  1 +
 xen/common/dmalloc.c        | 25 ++++++++++++++++++++
 xen/common/domain.c         | 14 ++++++++++--
 xen/common/event_channel.c  | 14 ++++++------
 xen/include/public/domctl.h |  1 +
 xen/include/xen/dmalloc.h   | 29 +++++++++++++++++++++++
 xen/include/xen/sched.h     |  3 +++
 10 files changed, 140 insertions(+), 9 deletions(-)
 create mode 100644 tools/misc/xen-fault-ttl.c
 create mode 100644 xen/common/dmalloc.c
 create mode 100644 xen/include/xen/dmalloc.h

-- 
2.11.0


Re: [PATCH 0/4] xen: domain-tracked allocations, and fault injection
Posted by Julien Grall 1 year, 3 months ago
Hi Andrew,

On 23/12/2020 16:34, Andrew Cooper wrote:
> This was not the Christmas hacking project that I was planning to do, but it
> has had some exciting results.
> 
> After some discussion on an earlier thread, Tamas has successfully got fuzzing
> of Xen working via kfx, and this series is a prototype for providing better
> testing infrastructure.
> 
> And to prove a point, this series has already found a memory leak in ARM's
> dom0less smoke test.

You mentioned this series recently on the ML, so I decided to give it a try
and managed to reproduce your "memory leak".

I put it in quotes because the problem is not in Arm but in your code.
If you look at the implementation of _dzalloc(), you are using
_xmalloc(), so the memory is not guaranteed to be zeroed after being
allocated.

This breaks the expectations of the callers. What you want is to use
_xzalloc().
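
To illustrate the difference, here is a small userspace sketch.  It does not
reproduce the Xen allocators: plain_alloc() and zeroing_alloc() are
hypothetical stand-ins for a non-zeroing and a zeroing allocator, and the
poison byte only makes the stale contents deterministic for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical non-zeroing allocator.  The 0xa5 fill mimics whatever stale
 * data the heap happens to hold; a real allocator gives no guarantee at all.
 */
static void *plain_alloc(size_t size)
{
    void *p = malloc(size);

    if ( p )
        memset(p, 0xa5, size);

    return p;
}

/* Hypothetical zeroing allocator: same memory, explicitly cleared. */
static void *zeroing_alloc(size_t size)
{
    void *p = plain_alloc(size);

    if ( p )
        memset(p, 0, size);

    return p;
}

/* Returns 1 if every byte of the buffer is zero, 0 otherwise. */
static int is_all_zero(const void *p, size_t size)
{
    const unsigned char *b = p;
    size_t i;

    for ( i = 0; i < size; i++ )
        if ( b[i] )
            return 0;

    return 1;
}
```

A caller that tests a pointer or counter field of a freshly "zeroed" object
sees garbage with the first allocator and zero with the second.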

Cheers,

-- 
Julien Grall
Re: [PATCH 0/4] xen: domain-tracked allocations, and fault injection
Posted by Stefano Stabellini 3 years, 2 months ago
On Wed, 23 Dec 2020, Andrew Cooper wrote:
> This was not the Christmas hacking project that I was planning to do, but it
> has had some exciting results.
> 
> After some discussion on an earlier thread, Tamas has successfully got fuzzing
> of Xen working via kfx, and this series is a prototype for providing better
> testing infrastructure.
> 
> And to prove a point, this series has already found a memory leak in ARM's
> dom0less smoke test.
> 
> From https://gitlab.com/xen-project/people/andyhhp/xen/-/jobs/929518792
> 
>   (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
>   (XEN) Freed 328kB init memory.
>   (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER4
>   (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER8
>   (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER12
>   (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER16
>   (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER20
>   (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER24
>   (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER28
>   (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER32
>   (XEN) d0v0: vGICD: unhandled word write 0x000000ffffffff to ICACTIVER0
>   (XEN) physdev.c:16:d0v0 PHYSDEVOP cmd=25: not implemented
>   (XEN) physdev.c:16:d0v0 PHYSDEVOP cmd=15: not implemented
>   (XEN) physdev.c:16:d0v0 PHYSDEVOP cmd=15: not implemented
>   (XEN)
>   (XEN) ****************************************
>   (XEN) Panic on CPU 0:
>   (XEN) d1 has 2 outstanding heap allocations
>   (XEN) ****************************************
>   (XEN)
>   (XEN) Reboot in five seconds...
> 
> For some reason, neither of the evtchn default memory allocations is freed,
> but it's not clear why d1 shut down to begin with.  Stefano - any ideas?

Right, this is confusing. It is not hard to believe that memory leaks
exist on the dom0less shutdown path because dom0less domains are not
really shut down today. I imagine there could be issues.

But I don't understand why _domain_destroy gets called in the first
place for d1. Maybe a domain_create failure leads to goto fail and
_domain_destroy(). I wanted to run a test but ran out of time.
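
The hypothesis above can be sketched in userspace C.  Everything here is an
assumption made for illustration: the struct layout, the helper names, and in
particular the meaning given to fault_ttl (the allocation that counts it down
to zero fails); none of it is taken from the actual patches.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct domain. */
struct domain {
    unsigned long outstanding; /* tracked allocations not yet freed */
    unsigned int fault_ttl;    /* assumed: 0 disables injection */
    void *a, *b;               /* two per-domain allocations */
};

/* Tracked zeroing allocation with assumed fault-injection semantics. */
static void *alloc_for(struct domain *d, size_t size)
{
    void *p;

    /* The allocation that decrements the TTL to zero is forced to fail. */
    if ( d->fault_ttl && --d->fault_ttl == 0 )
        return NULL;

    p = calloc(1, size);
    if ( p )
        d->outstanding++;

    return p;
}

static void free_for(struct domain *d, void *p)
{
    if ( !p )
        return;

    d->outstanding--;
    free(p);
}

/* Teardown: release whatever was allocated; safe on a half-built domain. */
static void destroy(struct domain *d)
{
    free_for(d, d->a);
    d->a = NULL;
    free_for(d, d->b);
    d->b = NULL;
}

/* Returns 0 on success, -1 after cleaning up on failure. */
static int create(struct domain *d, unsigned int fault_ttl)
{
    d->outstanding = 0;
    d->fault_ttl = fault_ttl;
    d->a = d->b = NULL;

    d->a = alloc_for(d, 64);
    if ( !d->a )
        goto fail;

    d->b = alloc_for(d, 64);
    if ( !d->b )
        goto fail;

    return 0;

 fail:
    destroy(d); /* the error path reaches teardown without any shutdown */
    return -1;
}
```

With a TTL of 2 the second allocation fails, create() takes the fail path, and
destroy() runs on a domain that never started.  Any allocation that the
teardown path forgets would then show up as outstanding.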