[RFC PATCH] xen/sched/null: avoid crash after failed domU creation

Stewart Hildebrand posted 1 patch 1 year ago
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20230424210048.786436-1-stewart.hildebrand@amd.com
There is a newer version of this series
xen/common/sched/null.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
[RFC PATCH] xen/sched/null: avoid crash after failed domU creation
Posted by Stewart Hildebrand 1 year ago
When creating a domU, but the creation fails, we may end up in a state
where a vcpu has not yet been added to the null scheduler, but
unit_deassign() is invoked. In this case, when running a debug build of
Xen, we will hit an ASSERT and crash Xen:

(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'npc->unit == unit' failed at common/sched/null.c:379
(XEN) ****************************************

To work around this, remove the ASSERT and introduce a check for the
case where npc->unit is NULL and simply return false from
unit_deassign().

Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
---

I'm not entirely sure whether this is the correct fix (hence the RFC
tag), but at least it avoids crashing the system for me.

Here are the steps to reproduce on an aarch64 system with 4 cpus:

sudo xl cpupool-cpu-remove Pool-0 3
sudo xl cpupool-create 'name="nullpool"' 'sched="null"' 'cpus=["3"]'

cat > stew.cfg <<EOF
name = "stew"
kernel = "Image"
extra = "console=hvc0"
memory = 768
vcpus = 1
pool= "nullpool"
pci = [ "01:00.0" ]
EOF

sudo xl create stew.cfg

The PCI device 01:00.0 is not assignable, so the domain creation will
fail.

Here is a more detailed crash log:

stew@ubuntu:~$ sudo xl create stew.cfg
Parsing config from stew.cfg
libxl: error: libxl_pci.c:1677:libxl__device_pci_add: Domain 1:PCI device 0:1:0.0 is not assignable
libxl: error: libxl_pci.c:1809:device_pci_add_done: Domain 1:libxl__device_pci_add failed for PCI device 0:1:0.0 (rc -3)
libxl: error: libxl_create.c:1923:domcreate_attach_devices: Domain 1:unable to add pci devices
(XEN) Assertion 'npc->unit == unit' failed at common/sched/null.c:379
(XEN) ----[ Xen-4.18-unstable  arm64  debug=y  Tainted:   C    ]----
(XEN) CPU:    0
(XEN) PC:     000002000023ec30 null.c#unit_deassign+0xdc/0x1cc
(XEN) LR:     000002000023fa1c
(XEN) SP:     00008000fff67bb0
(XEN) CPSR:   00000000800002c9 MODE:64-bit EL2h (Hypervisor, handler)
(XEN)      X0: 0000000000000000  X1: 00008000fff43d70  X2: 0000000000000000
(XEN)      X3: 0000000000000003  X4: 0000000000000000  X5: 00008000ffff84b0
(XEN)      X6: 0000000000000000  X7: 0000000000000000  X8: 00008000fff43fd0
(XEN)      X9: 0000000000000000 X10: f00ff00ff00ff00f X11: 00000000ccccccc0
(XEN)     X12: 00000000ccccccc0 X13: 0000000000000000 X14: 0000000000000000
(XEN)     X15: 0000007fde58e6d0 X16: 0000000000000024 X17: 0000000000000000
(XEN)     X18: 0000000000000000 X19: 00008000fff43cd0 X20: 00008000fffdc120
(XEN)     X21: 00008000fff69a30 X22: 00008000fff69a30 X23: 0000000000000003
(XEN)     X24: 00008000fffecb20 X25: 0000000000000001 X26: 00008000fff69920
(XEN)     X27: 00008000fff43c70 X28: ffffff880188eec0  FP: 00008000fff67bb0
(XEN)
(XEN)   VTCR_EL2: 0000000080023558
(XEN)  VTTBR_EL2: 000100087ff94000
(XEN)
(XEN)  SCTLR_EL2: 0000000030cd183d
(XEN)    HCR_EL2: 00000000807c063f
(XEN)  TTBR0_EL2: 000000000034b000
(XEN)
(XEN)    ESR_EL2: 00000000f2000001
(XEN)  HPFAR_EL2: 0000000000f90100
(XEN)    FAR_EL2: ffffffc0097b0f00
(XEN)
(XEN) Xen stack trace from sp=00008000fff67bb0:
(XEN)    00008000fff67c10 000002000023fa1c 00008000fff43cd0 00008000fff43d70
(XEN)    00008000fff69a30 00008000fffec068 00008000fff68000 00008000fffecb20
(XEN)    0000000000000001 00008000fff43d70 00008000fff69a30 00008000fffec068
(XEN)    00008000fff67c40 0000020000243948 0000000000000003 00008000fff43e10
(XEN)    00008000fff43cd0 00008000fffec068 00008000fff67cb0 000002000023069c
(XEN)    00008000fffecb20 00008000fff68000 00008000fffecb20 00008000fff68000
(XEN)    000000005a000ea1 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 ffffff880188eec0 00008000fffecb20 00008000fff43ea0
(XEN)    00008000fff67cd0 0000020000231cc0 00000200002bb7e0 00008000fff68000
(XEN)    00008000fff67d00 00000200002099f0 00008000fff68000 0000000000000000
(XEN)    0000007fb5381010 000000005a000ea1 00008000fff67d30 000002000022ec8c
(XEN)    0000000000000001 000000005a000ea1 00008000fff67d30 000002000022e744
(XEN)    00008000fff67e40 000002000027121c 00008000fff67ea0 000000005a000ea1
(XEN)    00008000fff67f20 00008000fff52000 00008000fff67da0 0000020000244650
(XEN)    00008000fff67d90 0000020000224f54 00008000fff67da0 0000020000224f54
(XEN)    00008000fff67dc0 0000020000225088 00008000fff52e16 00008000fff52e00
(XEN)    0000001500000002 0000000000000001 000a752520646a25 742064656c696166
(XEN)    c00cc00cc00cc000 0000000000000000 ff0000ff000000ff 0000000000000000
(XEN)    00000000f00f000f 0000000000000000 f00ff00ff00ff00f f00ff00ff00ff00f
(XEN)    00000000ccccccc0 00000000ccccccc0 0000000000000000 0000000000000000
(XEN)    0000007fb5310018 0000007fb529d8d0 00008000fff67e70 0000020000272134
(XEN)    00008000fff67ea0 000000005a000ea1 00008000fff67fa8 0000000020000005
(XEN)    ffffffc009c9bce0 0000020000255c60 0000000000000000 ffffff8809182a00
(XEN)    ffffffc009c9bce0 0000020000255c54 0000007fb5381010 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    ffffffc009c9bdc8 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000007fde58e6d0 0000000000000024 0000000000000000
(XEN)    0000000000000000 0000007fde58e6d0 ffffff8809182a00 ffffff8809182a00
(XEN)    0000000000305000 0000007fde58e6d0 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 ffffff880188eec0 ffffffc009c9bce0
(XEN)    ffffffc008648cb8 ffffffffffffffff ffffffc00803799c 0000000020000005
(XEN)    000000005a000ea1 0000000000000000 0000000020000005 0000000000000000
(XEN)    0000000000000000 ffffff880188eec0 ffffffc009c9bce0 ffffffc008037998
(XEN)    0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN)    [<000002000023ec30>] null.c#unit_deassign+0xdc/0x1cc (PC)
(XEN)    [<000002000023fa1c>] null.c#null_unit_remove+0x6c/0xe4 (LR)
(XEN)    [<000002000023fa1c>] null.c#null_unit_remove+0x6c/0xe4
(XEN)    [<0000020000243948>] sched_move_domain+0x194/0x39c
(XEN)    [<000002000023069c>] cpupool.c#cpupool_move_domain_locked+0x38/0x70
(XEN)    [<0000020000231cc0>] cpupool_move_domain+0x34/0x54
(XEN)    [<00000200002099f0>] domain_kill+0xc4/0x144
(XEN)    [<000002000022ec8c>] do_domctl+0x6d0/0xfa4
(XEN)    [<000002000027121c>] traps.c#do_trap_hypercall+0x280/0x34c
(XEN)    [<0000020000272134>] do_trap_guest_sync+0x448/0x5c4
(XEN)    [<0000020000255c60>] entry.o#guest_sync_slowpath+0xa4/0xd4
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'npc->unit == unit' failed at common/sched/null.c:379
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
---
 xen/common/sched/null.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/xen/common/sched/null.c b/xen/common/sched/null.c
index 65a0a6c5312d..71a83c4fb1ad 100644
--- a/xen/common/sched/null.c
+++ b/xen/common/sched/null.c
@@ -376,7 +376,14 @@ static bool unit_deassign(struct null_private *prv, const struct sched_unit *uni
     struct null_pcpu *npc = get_sched_res(cpu)->sched_priv;
 
     ASSERT(list_empty(&null_unit(unit)->waitq_elem));
-    ASSERT(npc->unit == unit);
+
+    if ( !npc->unit )
+    {
+        dprintk(XENLOG_G_INFO, "%d <-- NULL (%pdv%d)\n", cpu, unit->domain,
+                unit->unit_id);
+        return false;
+    }
+
     ASSERT(!cpumask_test_cpu(cpu, &prv->cpus_free));
 
     npc->unit = NULL;
-- 
2.40.0
Re: [RFC PATCH] xen/sched/null: avoid crash after failed domU creation
Posted by Juergen Gross 1 year ago
On 24.04.23 23:00, Stewart Hildebrand wrote:
> When creating a domU, but the creation fails, we may end up in a state
> where a vcpu has not yet been added to the null scheduler, but
> unit_deassign() is invoked.

This is not really true. The vcpu has been added, but it was offline at
that time. This resulted in null_unit_insert() returning early and not
calling unit_assign().

Later the vcpu was onlined during XEN_DOMCTL_setvcpucontext handling,
resulting in null_unit_remove() calling unit_deassign().

> In this case, when running a debug build of
> Xen, we will hit an ASSERT and crash Xen:
> 
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Assertion 'npc->unit == unit' failed at common/sched/null.c:379
> (XEN) ****************************************
> 
> To work around this, remove the ASSERT and introduce a check for the
> case where npc->unit is NULL and simply return false from
> unit_deassign().

I think the correct fix would be to call unit_deassign() from
null_unit_remove() only, if npc->unit isn't NULL. Dario might have a
different opinion, though. :-)


Juergen

> 
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> ---
> 
> I'm not entirely sure whether this is the correct fix (hence the RFC
> tag), but at least it avoids crashing the system for me.
> 
> Here are the steps to reproduce on an aarch64 system with 4 cpus:
> 
> sudo xl cpupool-cpu-remove Pool-0 3
> sudo xl cpupool-create 'name="nullpool"' 'sched="null"' 'cpus=["3"]'
> 
> cat > stew.cfg <<EOF
> name = "stew"
> kernel = "Image"
> extra = "console=hvc0"
> memory = 768
> vcpus = 1
> pool= "nullpool"
> pci = [ "01:00.0" ]
> EOF
> 
> sudo xl create stew.cfg
> 
> The PCI device 01:00.0 is not assignable, so the domain creation will
> fail.
> 
> Here is a more detailed crash log:
> 
> stew@ubuntu:~$ sudo xl create stew.cfg
> Parsing config from stew.cfg
> libxl: error: libxl_pci.c:1677:libxl__device_pci_add: Domain 1:PCI device 0:1:0.0 is not assignable
> libxl: error: libxl_pci.c:1809:device_pci_add_done: Domain 1:libxl__device_pci_add failed for PCI device 0:1:0.0 (rc -3)
> libxl: error: libxl_create.c:1923:domcreate_attach_devices: Domain 1:unable to add pci devices
> (XEN) Assertion 'npc->unit == unit' failed at common/sched/null.c:379
> (XEN) ----[ Xen-4.18-unstable  arm64  debug=y  Tainted:   C    ]----
> (XEN) CPU:    0
> (XEN) PC:     000002000023ec30 null.c#unit_deassign+0xdc/0x1cc
> (XEN) LR:     000002000023fa1c
> (XEN) SP:     00008000fff67bb0
> (XEN) CPSR:   00000000800002c9 MODE:64-bit EL2h (Hypervisor, handler)
> (XEN)      X0: 0000000000000000  X1: 00008000fff43d70  X2: 0000000000000000
> (XEN)      X3: 0000000000000003  X4: 0000000000000000  X5: 00008000ffff84b0
> (XEN)      X6: 0000000000000000  X7: 0000000000000000  X8: 00008000fff43fd0
> (XEN)      X9: 0000000000000000 X10: f00ff00ff00ff00f X11: 00000000ccccccc0
> (XEN)     X12: 00000000ccccccc0 X13: 0000000000000000 X14: 0000000000000000
> (XEN)     X15: 0000007fde58e6d0 X16: 0000000000000024 X17: 0000000000000000
> (XEN)     X18: 0000000000000000 X19: 00008000fff43cd0 X20: 00008000fffdc120
> (XEN)     X21: 00008000fff69a30 X22: 00008000fff69a30 X23: 0000000000000003
> (XEN)     X24: 00008000fffecb20 X25: 0000000000000001 X26: 00008000fff69920
> (XEN)     X27: 00008000fff43c70 X28: ffffff880188eec0  FP: 00008000fff67bb0
> (XEN)
> (XEN)   VTCR_EL2: 0000000080023558
> (XEN)  VTTBR_EL2: 000100087ff94000
> (XEN)
> (XEN)  SCTLR_EL2: 0000000030cd183d
> (XEN)    HCR_EL2: 00000000807c063f
> (XEN)  TTBR0_EL2: 000000000034b000
> (XEN)
> (XEN)    ESR_EL2: 00000000f2000001
> (XEN)  HPFAR_EL2: 0000000000f90100
> (XEN)    FAR_EL2: ffffffc0097b0f00
> (XEN)
> (XEN) Xen stack trace from sp=00008000fff67bb0:
> (XEN)    00008000fff67c10 000002000023fa1c 00008000fff43cd0 00008000fff43d70
> (XEN)    00008000fff69a30 00008000fffec068 00008000fff68000 00008000fffecb20
> (XEN)    0000000000000001 00008000fff43d70 00008000fff69a30 00008000fffec068
> (XEN)    00008000fff67c40 0000020000243948 0000000000000003 00008000fff43e10
> (XEN)    00008000fff43cd0 00008000fffec068 00008000fff67cb0 000002000023069c
> (XEN)    00008000fffecb20 00008000fff68000 00008000fffecb20 00008000fff68000
> (XEN)    000000005a000ea1 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 ffffff880188eec0 00008000fffecb20 00008000fff43ea0
> (XEN)    00008000fff67cd0 0000020000231cc0 00000200002bb7e0 00008000fff68000
> (XEN)    00008000fff67d00 00000200002099f0 00008000fff68000 0000000000000000
> (XEN)    0000007fb5381010 000000005a000ea1 00008000fff67d30 000002000022ec8c
> (XEN)    0000000000000001 000000005a000ea1 00008000fff67d30 000002000022e744
> (XEN)    00008000fff67e40 000002000027121c 00008000fff67ea0 000000005a000ea1
> (XEN)    00008000fff67f20 00008000fff52000 00008000fff67da0 0000020000244650
> (XEN)    00008000fff67d90 0000020000224f54 00008000fff67da0 0000020000224f54
> (XEN)    00008000fff67dc0 0000020000225088 00008000fff52e16 00008000fff52e00
> (XEN)    0000001500000002 0000000000000001 000a752520646a25 742064656c696166
> (XEN)    c00cc00cc00cc000 0000000000000000 ff0000ff000000ff 0000000000000000
> (XEN)    00000000f00f000f 0000000000000000 f00ff00ff00ff00f f00ff00ff00ff00f
> (XEN)    00000000ccccccc0 00000000ccccccc0 0000000000000000 0000000000000000
> (XEN)    0000007fb5310018 0000007fb529d8d0 00008000fff67e70 0000020000272134
> (XEN)    00008000fff67ea0 000000005a000ea1 00008000fff67fa8 0000000020000005
> (XEN)    ffffffc009c9bce0 0000020000255c60 0000000000000000 ffffff8809182a00
> (XEN)    ffffffc009c9bce0 0000020000255c54 0000007fb5381010 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    ffffffc009c9bdc8 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000007fde58e6d0 0000000000000024 0000000000000000
> (XEN)    0000000000000000 0000007fde58e6d0 ffffff8809182a00 ffffff8809182a00
> (XEN)    0000000000305000 0000007fde58e6d0 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 ffffff880188eec0 ffffffc009c9bce0
> (XEN)    ffffffc008648cb8 ffffffffffffffff ffffffc00803799c 0000000020000005
> (XEN)    000000005a000ea1 0000000000000000 0000000020000005 0000000000000000
> (XEN)    0000000000000000 ffffff880188eec0 ffffffc009c9bce0 ffffffc008037998
> (XEN)    0000000000000000 0000000000000000
> (XEN) Xen call trace:
> (XEN)    [<000002000023ec30>] null.c#unit_deassign+0xdc/0x1cc (PC)
> (XEN)    [<000002000023fa1c>] null.c#null_unit_remove+0x6c/0xe4 (LR)
> (XEN)    [<000002000023fa1c>] null.c#null_unit_remove+0x6c/0xe4
> (XEN)    [<0000020000243948>] sched_move_domain+0x194/0x39c
> (XEN)    [<000002000023069c>] cpupool.c#cpupool_move_domain_locked+0x38/0x70
> (XEN)    [<0000020000231cc0>] cpupool_move_domain+0x34/0x54
> (XEN)    [<00000200002099f0>] domain_kill+0xc4/0x144
> (XEN)    [<000002000022ec8c>] do_domctl+0x6d0/0xfa4
> (XEN)    [<000002000027121c>] traps.c#do_trap_hypercall+0x280/0x34c
> (XEN)    [<0000020000272134>] do_trap_guest_sync+0x448/0x5c4
> (XEN)    [<0000020000255c60>] entry.o#guest_sync_slowpath+0xa4/0xd4
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Assertion 'npc->unit == unit' failed at common/sched/null.c:379
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
> ---
>   xen/common/sched/null.c | 9 ++++++++-
>   1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/common/sched/null.c b/xen/common/sched/null.c
> index 65a0a6c5312d..71a83c4fb1ad 100644
> --- a/xen/common/sched/null.c
> +++ b/xen/common/sched/null.c
> @@ -376,7 +376,14 @@ static bool unit_deassign(struct null_private *prv, const struct sched_unit *uni
>       struct null_pcpu *npc = get_sched_res(cpu)->sched_priv;
>   
>       ASSERT(list_empty(&null_unit(unit)->waitq_elem));
> -    ASSERT(npc->unit == unit);
> +
> +    if ( !npc->unit )
> +    {
> +        dprintk(XENLOG_G_INFO, "%d <-- NULL (%pdv%d)\n", cpu, unit->domain,
> +                unit->unit_id);
> +        return false;
> +    }
> +
>       ASSERT(!cpumask_test_cpu(cpu, &prv->cpus_free));
>   
>       npc->unit = NULL;

Re: [RFC PATCH] xen/sched/null: avoid crash after failed domU creation
Posted by Jan Beulich 1 year ago
On 25.04.2023 08:36, Juergen Gross wrote:
> On 24.04.23 23:00, Stewart Hildebrand wrote:
>> When creating a domU, but the creation fails, we may end up in a state
>> where a vcpu has not yet been added to the null scheduler, but
>> unit_deassign() is invoked.
> 
> This is not really true. The vcpu has been added, but it was offline at
> that time. This resulted in null_unit_insert() returning early and not
> calling unit_assign().
> 
> Later the vcpu was onlined during XEN_DOMCTL_setvcpucontext handling,
> resulting in null_unit_remove() calling unit_deassign().
> 
>> In this case, when running a debug build of
>> Xen, we will hit an ASSERT and crash Xen:
>>
>> (XEN) ****************************************
>> (XEN) Panic on CPU 0:
>> (XEN) Assertion 'npc->unit == unit' failed at common/sched/null.c:379
>> (XEN) ****************************************
>>
>> To work around this, remove the ASSERT and introduce a check for the
>> case where npc->unit is NULL and simply return false from
>> unit_deassign().
> 
> I think the correct fix would be to call unit_deassign() from
> null_unit_remove() only, if npc->unit isn't NULL. Dario might have a
> different opinion, though. :-)

Furthermore, even if the proposed solution was (roughly) followed, ...

>> --- a/xen/common/sched/null.c
>> +++ b/xen/common/sched/null.c
>> @@ -376,7 +376,14 @@ static bool unit_deassign(struct null_private *prv, const struct sched_unit *uni
>>       struct null_pcpu *npc = get_sched_res(cpu)->sched_priv;
>>   
>>       ASSERT(list_empty(&null_unit(unit)->waitq_elem));
>> -    ASSERT(npc->unit == unit);
>> +
>> +    if ( !npc->unit )
>> +    {
>> +        dprintk(XENLOG_G_INFO, "%d <-- NULL (%pdv%d)\n", cpu, unit->domain,
>> +                unit->unit_id);
>> +        return false;
>> +    }
>> +

... shouldn't the assertion be kept, with the new if() inserted ahead of
it? Plus the log message probably better wouldn't print a unit ID like a
vCPU one, but instead use e.g. %pdu%u?

Jan
Re: [RFC PATCH] xen/sched/null: avoid crash after failed domU creation
Posted by Stewart Hildebrand 1 year ago
On 4/25/23 03:42, Jan Beulich wrote:
> On 25.04.2023 08:36, Juergen Gross wrote:
>> On 24.04.23 23:00, Stewart Hildebrand wrote:
>>> When creating a domU, but the creation fails, we may end up in a state
>>> where a vcpu has not yet been added to the null scheduler, but
>>> unit_deassign() is invoked.
>>
>> This is not really true. The vcpu has been added, but it was offline at
>> that time. This resulted in null_unit_insert() returning early and not
>> calling unit_assign().
>>
>> Later the vcpu was onlined during XEN_DOMCTL_setvcpucontext handling,
>> resulting in null_unit_remove() calling unit_deassign().

Makes sense. I'll reword the message in the next revision.

>>> In this case, when running a debug build of
>>> Xen, we will hit an ASSERT and crash Xen:
>>>
>>> (XEN) ****************************************
>>> (XEN) Panic on CPU 0:
>>> (XEN) Assertion 'npc->unit == unit' failed at common/sched/null.c:379
>>> (XEN) ****************************************
>>>
>>> To work around this, remove the ASSERT and introduce a check for the
>>> case where npc->unit is NULL and simply return false from
>>> unit_deassign().
>>
>> I think the correct fix would be to call unit_deassign() from
>> null_unit_remove() only, if npc->unit isn't NULL. Dario might have a
>> different opinion, though. :-)

Yes, this seems cleaner to me, thanks for the suggestion. I did a quick test, and this approach works to avoid the crash too. I'll wait a few days in case anyone else wants to chime in, and if there aren't any more comments I'll send out a new patch following this suggestion.

> Furthermore, even if the proposed solution was (roughly) followed, ...
> 
>>> --- a/xen/common/sched/null.c
>>> +++ b/xen/common/sched/null.c
>>> @@ -376,7 +376,14 @@ static bool unit_deassign(struct null_private *prv, const struct sched_unit *uni
>>>       struct null_pcpu *npc = get_sched_res(cpu)->sched_priv;
>>>
>>>       ASSERT(list_empty(&null_unit(unit)->waitq_elem));
>>> -    ASSERT(npc->unit == unit);
>>> +
>>> +    if ( !npc->unit )
>>> +    {
>>> +        dprintk(XENLOG_G_INFO, "%d <-- NULL (%pdv%d)\n", cpu, unit->domain,
>>> +                unit->unit_id);
>>> +        return false;
>>> +    }
>>> +
> 
> ... shouldn't the assertion be kept, with the new if() inserted ahead of
> it? Plus the log message probably better wouldn't print a unit ID like a
> vCPU one, but instead use e.g. %pdu%u?

Sure, although, with Juergen's suggested fix in null_unit_remove(), I think we could simply drop this snippet and leave unit_deassign() unmodified.

Your suggested print format is an improvement, but perhaps it would be better suited for a separate patch since there are several more instances throughout null.c that would also want to be changed.

Example with %pdv%d:
# xl create ...
(XEN) common/sched/null.c:355: 3 <-- d1v0
# xl destroy ...
(XEN) common/sched/null.c:385: 3 <-- NULL (d1v0)

Example with %pdu%u:
# xl create ...
(XEN) common/sched/null.c:355: 3 <-- d1u0
# xl destroy ...
(XEN) common/sched/null.c:385: 3 <-- NULL (d1u0)