From: Vikram Garhwal <fnu.vikram@xilinx.com>
Currently, routing/removing physical interrupts is only allowed at
domain creation/destruction time. For use cases such as dynamic device
tree overlay addition/removal, routing/removing physical IRQs to/from
running domains should be allowed.

Remove the above-mentioned domain creation/dying checks. Since this
would leave the interrupt state out of sync when the interrupt is
active or pending in the guest, simply reject the operation in those
cases. Do this for both the new and old vGIC implementations.
Signed-off-by: Vikram Garhwal <fnu.vikram@xilinx.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Signed-off-by: Henry Wang <xin.wang2@amd.com>
---
v3:
- Update in-code comments.
- Correct the if conditions.
- Add taking/releasing the vgic lock of the vcpu.
v2:
- Reject the case where the IRQ is active or pending in guest.
---
xen/arch/arm/gic-vgic.c | 15 ++++++++++++---
xen/arch/arm/gic.c | 15 ---------------
xen/arch/arm/vgic/vgic.c | 10 +++++++---
3 files changed, 19 insertions(+), 21 deletions(-)
diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
index 56490dbc43..956c11ba13 100644
--- a/xen/arch/arm/gic-vgic.c
+++ b/xen/arch/arm/gic-vgic.c
@@ -439,24 +439,33 @@ int vgic_connect_hw_irq(struct domain *d, struct vcpu *v, unsigned int virq,
/* We are taking to rank lock to prevent parallel connections. */
vgic_lock_rank(v_target, rank, flags);
+ spin_lock(&v_target->arch.vgic.lock);
if ( connect )
{
- /* The VIRQ should not be already enabled by the guest */
+ /*
+ * The VIRQ should not be already enabled by the guest nor
+ * active/pending in the guest.
+ */
if ( !p->desc &&
- !test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) )
+ !test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) &&
+ !test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) &&
+ !test_bit(GIC_IRQ_GUEST_ACTIVE, &p->status) )
p->desc = desc;
else
ret = -EBUSY;
}
else
{
- if ( desc && p->desc != desc )
+ if ( (desc && p->desc != desc) ||
+ test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) ||
+ test_bit(GIC_IRQ_GUEST_ACTIVE, &p->status) )
ret = -EINVAL;
else
p->desc = NULL;
}
+ spin_unlock(&v_target->arch.vgic.lock);
vgic_unlock_rank(v_target, rank, flags);
return ret;
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 44c40e86de..3ebd89940a 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -135,14 +135,6 @@ int gic_route_irq_to_guest(struct domain *d, unsigned int virq,
ASSERT(virq < vgic_num_irqs(d));
ASSERT(!is_lpi(virq));
- /*
- * When routing an IRQ to guest, the virtual state is not synced
- * back to the physical IRQ. To prevent get unsync, restrict the
- * routing to when the Domain is been created.
- */
- if ( d->creation_finished )
- return -EBUSY;
-
ret = vgic_connect_hw_irq(d, NULL, virq, desc, true);
if ( ret )
return ret;
@@ -167,13 +159,6 @@ int gic_remove_irq_from_guest(struct domain *d, unsigned int virq,
ASSERT(test_bit(_IRQ_GUEST, &desc->status));
ASSERT(!is_lpi(virq));
- /*
- * Removing an interrupt while the domain is running may have
- * undesirable effect on the vGIC emulation.
- */
- if ( !d->is_dying )
- return -EBUSY;
-
desc->handler->shutdown(desc);
/* EOI the IRQ if it has not been done by the guest */
diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
index b9463a5f27..78554c11e2 100644
--- a/xen/arch/arm/vgic/vgic.c
+++ b/xen/arch/arm/vgic/vgic.c
@@ -876,8 +876,11 @@ int vgic_connect_hw_irq(struct domain *d, struct vcpu *vcpu,
if ( connect ) /* assign a mapped IRQ */
{
- /* The VIRQ should not be already enabled by the guest */
- if ( !irq->hw && !irq->enabled )
+ /*
+ * The VIRQ should not be already enabled by the guest nor
+ * active/pending in the guest
+ */
+ if ( !irq->hw && !irq->enabled && !irq->active && !irq->pending_latch )
{
irq->hw = true;
irq->hwintid = desc->irq;
@@ -887,7 +890,8 @@ int vgic_connect_hw_irq(struct domain *d, struct vcpu *vcpu,
}
else /* remove a mapped IRQ */
{
- if ( desc && irq->hwintid != desc->irq )
+ if ( (desc && irq->hwintid != desc->irq) ||
+ irq->active || irq->pending_latch )
{
ret = -EINVAL;
}
--
2.34.1
Hi,

On 21/05/2024 05:35, Henry Wang wrote:
> [...]
>      /* We are taking to rank lock to prevent parallel connections. */
>      vgic_lock_rank(v_target, rank, flags);
> +    spin_lock(&v_target->arch.vgic.lock);

I know this is what Stefano suggested, but v_target would point to the
current affinity whereas the interrupt may be pending/active on the
"previous" vCPU. So it is a little unclear whether v_target is the
correct lock. Do you have more pointers to show this is correct?
Also, while looking at the locking, I noticed that we are not doing
anything with GIC_IRQ_GUEST_MIGRATING. In gic_update_one_lr(), we seem
to assume that if the flag is set, then p->desc cannot be NULL.

Can we reach vgic_connect_hw_irq() with the flag set? What about the
other flags? Is this going to be a concern if we don't reset them?

Cheers,

-- 
Julien Grall
Hi Julien,

On 5/21/2024 8:30 PM, Julien Grall wrote:
> On 21/05/2024 05:35, Henry Wang wrote:
>> +    spin_lock(&v_target->arch.vgic.lock);
>
> I know this is what Stefano suggested, but v_target would point to the
> current affinity whereas the interrupt may be pending/active on the
> "previous" vCPU. So it is a little unclear whether v_target is the
> correct lock. Do you have more pointers to show this is correct?

No, I think you are correct; we discussed this in the initial version of
this patch. Sorry.

Following that discussion, I note down the vCPU ID at injection time and
retrieve it here. Below is the diff, would this make sense to you?

diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
index 956c11ba13..134ed4e107 100644
--- a/xen/arch/arm/gic-vgic.c
+++ b/xen/arch/arm/gic-vgic.c
@@ -439,7 +439,7 @@ int vgic_connect_hw_irq(struct domain *d, struct vcpu *v, unsigned int virq,
 
     /* We are taking to rank lock to prevent parallel connections. */
     vgic_lock_rank(v_target, rank, flags);
-    spin_lock(&v_target->arch.vgic.lock);
+    spin_lock(&d->vcpu[p->spi_vcpu_id]->arch.vgic.lock);
 
     if ( connect )
     {
@@ -465,7 +465,7 @@ int vgic_connect_hw_irq(struct domain *d, struct vcpu *v, unsigned int virq,
             p->desc = NULL;
     }
 
-    spin_unlock(&v_target->arch.vgic.lock);
+    spin_unlock(&d->vcpu[p->spi_vcpu_id]->arch.vgic.lock);
     vgic_unlock_rank(v_target, rank, flags);
 
     return ret;
diff --git a/xen/arch/arm/include/asm/vgic.h b/xen/arch/arm/include/asm/vgic.h
index 79b73a0dbb..f4075d3e75 100644
--- a/xen/arch/arm/include/asm/vgic.h
+++ b/xen/arch/arm/include/asm/vgic.h
@@ -85,6 +85,7 @@ struct pending_irq
     uint8_t priority;
     uint8_t lpi_priority;       /* Caches the priority if this is an LPI. */
     uint8_t lpi_vcpu_id;        /* The VCPU for an LPI. */
+    uint8_t spi_vcpu_id;        /* The VCPU for an SPI. */
     /* inflight is used to append instances of pending_irq to
      * vgic.inflight_irqs */
     struct list_head inflight;
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index c04fc4f83f..e852479f13 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -632,6 +632,7 @@ void vgic_inject_irq(struct domain *d, struct vcpu *v, unsigned int virq,
     }
     list_add_tail(&n->inflight, &v->arch.vgic.inflight_irqs);
 out:
+    n->spi_vcpu_id = v->vcpu_id;
     spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
 
     /* we have a new higher priority irq, inject it into the guest */
     vcpu_kick(v);

> Also, while looking at the locking, I noticed that we are not doing
> anything with GIC_IRQ_GUEST_MIGRATING. In gic_update_one_lr(), we seem
> to assume that if the flag is set, then p->desc cannot be NULL.
>
> Can we reach vgic_connect_hw_irq() with the flag set?

I think even from the perspective of making the code extra safe, we
should also check GIC_IRQ_GUEST_MIGRATING, as the LR is allocated for
this case. I will also add the check of GIC_IRQ_GUEST_MIGRATING here.

> What about the other flags? Is this going to be a concern if we don't
> reset them?

I don't think so; if I am not mistaken, no LR will be allocated with
other flags set.

Kind regards,
Henry
On Wed, 22 May 2024, Henry Wang wrote:
> On 5/21/2024 8:30 PM, Julien Grall wrote:
>> Can we reach vgic_connect_hw_irq() with the flag set?
>
> I think even from the perspective of making the code extra safe, we
> should also check GIC_IRQ_GUEST_MIGRATING, as the LR is allocated for
> this case. I will also add the check of GIC_IRQ_GUEST_MIGRATING here.

Yes. I think it might be easier to check for GIC_IRQ_GUEST_MIGRATING
early and return an error immediately in that case. Otherwise, we can
continue and take spin_lock(&v_target->arch.vgic.lock) because no
migration is in progress.
Hi Stefano,

On 5/22/2024 9:16 AM, Stefano Stabellini wrote:
> Yes. I think it might be easier to check for GIC_IRQ_GUEST_MIGRATING
> early and return an error immediately in that case. Otherwise, we can
> continue and take spin_lock(&v_target->arch.vgic.lock) because no
> migration is in progress.

Ok, this makes sense to me. I will add

    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
    {
        vgic_unlock_rank(v_target, rank, flags);
        return -EBUSY;
    }

right after taking the vgic rank lock.

Kind regards,
Henry
Hi Henry,

On 22/05/2024 02:22, Henry Wang wrote:
> Ok, this makes sense to me. I will add
>
>     if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
>     {
>         vgic_unlock_rank(v_target, rank, flags);
>         return -EBUSY;
>     }
>
> right after taking the vgic rank lock.

I think that would be ok. I have to admit, I am still a bit wary about
allowing interrupts to be removed while the domain is running.

I am less concerned about the add part. Do you need the remove part now?
If not, I would suggest splitting this in two so we can get most of this
series merged for 4.19 and continue to deal with the remove path in the
background.

I will answer here to the other reply:

> I don't think so; if I am not mistaken, no LR will be allocated with
> other flags set.

I wasn't necessarily thinking about the LR allocation. I was more
thinking about whether there are any flags that could still be set. IOW,
will the vIRQ look like new once vgic_connect_hw_irq() is successful?

Also, while looking at the flags, I noticed we clear _IRQ_INPROGRESS
before vgic_connect_hw_irq(). Shouldn't we only clear it *after*?

This brings up another question. You don't special-case a dying domain.
If the domain is crashing, wouldn't this mean it wouldn't be possible to
destroy it?

Cheers,

-- 
Julien Grall
Hi Julien, Stefano,

Summary of our yesterday's discussion on Matrix below.

On 5/22/2024 9:03 PM, Julien Grall wrote:
> I am less concerned about the add part. Do you need the remove part
> now? If not, I would suggest splitting this in two so we can get most
> of this series merged for 4.19 and continue to deal with the remove
> path in the background.

For the split of the patch mentioned here, I will do that in the next
version.

> Also, while looking at the flags, I noticed we clear _IRQ_INPROGRESS
> before vgic_connect_hw_irq(). Shouldn't we only clear it *after*?

This is a good catch. With the logic of vgic_connect_hw_irq() extended
to reject the invalid cases, it is indeed safer to clear _IRQ_INPROGRESS
after a successful vgic_connect_hw_irq(). I will move it after.

> This brings up another question. You don't special-case a dying
> domain. If the domain is crashing, wouldn't this mean it wouldn't be
> possible to destroy it?

Another good point, thanks. I will try to make a special case for the
dying domain.

Kind regards,
Henry