[Xen-devel,v5,4/6] xen/arm: inflight irqs during migration

Message ID	1402504032-13267-4-git-send-email-stefano.stabellini@eu.citrix.com
State	New
Headers	show Return-Path: <patchwork-forward+bncBDWMDO7X3ICBB24H4KOAKGQEEMP6SDA@linaro.org> Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.128.176 as permitted sender) client-ip=209.85.128.176; Received-SPF: none (google.com: xen-devel-bounces@lists.xen.org does not designate permitted sender hosts) client-ip=50.57.142.19; From: Stefano Stabellini <stefano.stabellini@eu.citrix.com> To: <xen-devel@lists.xensource.com> Date: Wed, 11 Jun 2014 17:27:10 +0100 Message-ID: <1402504032-13267-4-git-send-email-stefano.stabellini@eu.citrix.com> In-Reply-To: <alpine.DEB.2.02.1406111607450.13771@kaball.uk.xensource.com> References: <alpine.DEB.2.02.1406111607450.13771@kaball.uk.xensource.com> MIME-Version: 1.0 Cc: julien.grall@citrix.com, Ian.Campbell@citrix.com, Stefano Stabellini <stefano.stabellini@eu.citrix.com> Subject: [Xen-devel] [PATCH v5 4/6] xen/arm: inflight irqs during migration Precedence: list Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit

Message ID

1402504032-13267-4-git-send-email-stefano.stabellini@eu.citrix.com

State

New

Headers

Received-SPF: pass (google.com: domain of
	patch+caf_=patchwork-forward=linaro.org@linaro.org designates
	209.85.128.176 as permitted sender) client-ip=209.85.128.176; 
Received-SPF: none (google.com: xen-devel-bounces@lists.xen.org does not
	designate permitted sender hosts) client-ip=50.57.142.19; 
From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
To: <xen-devel@lists.xensource.com>
Date: Wed, 11 Jun 2014 17:27:10 +0100
Message-ID: <1402504032-13267-4-git-send-email-stefano.stabellini@eu.citrix.com>
In-Reply-To: <alpine.DEB.2.02.1406111607450.13771@kaball.uk.xensource.com>
References: <alpine.DEB.2.02.1406111607450.13771@kaball.uk.xensource.com>
MIME-Version: 1.0
Cc: julien.grall@citrix.com, Ian.Campbell@citrix.com,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Subject: [Xen-devel] [PATCH v5 4/6] xen/arm: inflight irqs during migration
Precedence: list
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
Mailing-list: list patchwork-forward@linaro.org;
	contact patchwork-forward+owners@linaro.org
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Commit Message

Stefano Stabellini June 11, 2014, 4:27 p.m. UTC

We need to take special care when migrating irqs that are already
inflight from one vcpu to another. See "The effect of changes to an
GICD_ITARGETSR", part of chapter 4.3.12 of the ARM Generic Interrupt
Controller Architecture Specification.

The main issue from the Xen point of view is that the lr_pending and
inflight lists are per-vcpu. The lock we take to protect them is also
per-vcpu.

In order to avoid issues, we set a new flag GIC_IRQ_GUEST_MIGRATING, so
that we can recognize when we receive an irq while the previous one is
still inflight (given that we are only dealing with hardware interrupts
here, it just means that its LR hasn't been cleared yet on the old vcpu).

If GIC_IRQ_GUEST_MIGRATING is set, we only set GIC_IRQ_GUEST_QUEUED and
interrupt the old vcpu. When clearing the LR on the old vcpu, we take
special care of injecting the interrupt into the new vcpu. To do that we
need to release the old vcpu lock before taking the new vcpu lock.

Using barriers and test_bit on GIC_IRQ_GUEST_MIGRATING, make sure that
vgic_migrate_irq can run at the same time as gic_update_one_lr on
different cpus with consistent results.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

---

Changes in v5:
- pass unsigned long to find_next_bit for alignment on aarch64;
- call vgic_get_target_vcpu instead of gic_add_to_lr_pending to add the
irq in the right inflight queue;
- add barrier and bit tests to make sure that vgic_migrate_irq and
gic_update_one_lr can run simultaneously on different cpus without
issues;
- rework the loop to identify the new vcpu when ITARGETSR is written;
- use find_first_bit instead of find_next_bit.
---
 xen/arch/arm/gic.c           |   25 ++++++++++++++--
 xen/arch/arm/vgic.c          |   66 +++++++++++++++++++++++++++++++++++++-----
 xen/include/asm-arm/domain.h |    4 +++
 3 files changed, 85 insertions(+), 10 deletions(-)

Comments

Ian Campbell June 18, 2014, 11:04 a.m. UTC | #1

On Wed, 2014-06-11 at 17:27 +0100, Stefano Stabellini wrote:
> We need to take special care when migrating irqs that are already
> inflight from one vcpu to another. See "The effect of changes to an
> GICD_ITARGETSR", part of chapter 4.3.12 of the ARM Generic Interrupt
> Controller Architecture Specification.
> 
> The main issue from the Xen point of view is that the lr_pending and
> inflight lists are per-vcpu. The lock we take to protect them is also
> per-vcpu.
> 
> In order to avoid issues, we set a new flag GIC_IRQ_GUEST_MIGRATING, so
> that we can recognize when we receive an irq while the previous one is
> still inflight (given that we are only dealing with hardware interrupts
> here, it just means that its LR hasn't been cleared yet on the old vcpu).
> 
> If GIC_IRQ_GUEST_MIGRATING is set, we only set GIC_IRQ_GUEST_QUEUED and
> interrupt the old vcpu. When clearing the LR on the old vcpu, we take
> special care of injecting the interrupt into the new vcpu. To do that we
> need to release the old vcpu lock before taking the new vcpu lock.
> 
> Using barriers and test_bit on GIC_IRQ_GUEST_MIGRATING, make sure that
> vgic_migrate_irq can run at the same time as gic_update_one_lr on
> different cpus with consistent results.
> 
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> 
> ---
> 
> Changes in v5:
> - pass unsigned long to find_next_bit for alignment on aarch64;
> - call vgic_get_target_vcpu instead of gic_add_to_lr_pending to add the
> irq in the right inflight queue;
> - add barrier and bit tests to make sure that vgic_migrate_irq and
> gic_update_one_lr can run simultaneously on different cpus without
> issues;
> - rework the loop to identify the new vcpu when ITARGETSR is written;
> - use find_first_bit instead of find_next_bit.
> ---
>  xen/arch/arm/gic.c           |   25 ++++++++++++++--
>  xen/arch/arm/vgic.c          |   66 +++++++++++++++++++++++++++++++++++++-----
>  xen/include/asm-arm/domain.h |    4 +++
>  3 files changed, 85 insertions(+), 10 deletions(-)
> 
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index 5e502df..8ed242e 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -677,10 +677,31 @@ static void gic_update_one_lr(struct vcpu *v, int i)
>          clear_bit(GIC_IRQ_GUEST_ACTIVE, &p->status);
>          p->lr = GIC_INVALID_LR;
>          if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) &&
> -             test_bit(GIC_IRQ_GUEST_QUEUED, &p->status) )
> +             test_bit(GIC_IRQ_GUEST_QUEUED, &p->status) &&
> +             !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
>              gic_raise_guest_irq(v, irq, p->priority);
> -        else
> +        else {
>              list_del_init(&p->inflight);
> +
> +            /* inflight is cleared before clearing
> +             * GIC_IRQ_GUEST_MIGRATING */
> +            dsb(sy);

Is sy really necessary? THis is all about state stored in RAM not the
actual GIC, right? I think "ish" is probably sufficient. POssibly even a
dmb of some sort.

> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 4d655af..e640de9 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -543,6 +562,8 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
>      int offset = (int)(info->gpa - v->domain->arch.vgic.dbase);
>      int gicd_reg = REG(offset);
>      uint32_t tr;
> +    unsigned long trl;
> +    int i;
>  
>      switch ( gicd_reg )
>      {
> @@ -626,22 +647,45 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
>          rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR);
>          if ( rank == NULL) goto write_ignore;
>          /* 8-bit vcpu mask for this domain */
> -        tr = (1 << v->domain->max_vcpus) - 1;
> -        tr = tr | (tr << 8) | (tr << 16) | (tr << 24);
> -        tr &= *r;
> +        trl = (1 << v->domain->max_vcpus) - 1;
> +        if ( dabt.size == 2 )
> +            trl = trl | (trl << 8) | (trl << 16) | (trl << 24);
> +        else
> +            trl = (trl << (8 * (offset & 0x3)));
> +        trl &= *r;

What does trl stand for? Why is it an unsigned long when r and tr are
uint32_t?

>          /* ignore zero writes */
> -        if ( !tr )
> +        if ( !trl )
>              goto write_ignore;
>          if ( dabt.size == 2 &&
> -            !((tr & 0xff) && (tr & (0xff << 8)) &&
> -             (tr & (0xff << 16)) && (tr & (0xff << 24))))
> +            !((trl & 0xff) && (trl & (0xff << 8)) &&
> +             (trl & (0xff << 16)) && (trl & (0xff << 24))))
>              goto write_ignore;

I'm still not sure what this is ;-)

>          vgic_lock_rank(v, rank);
> +        i = 0;
> +        while ( (i = find_next_bit((const unsigned long *)&trl, 32, i)) < 32 )

Unnecessary cast (if trl really should be an unsigned long).

> +        {
> +            unsigned int irq, target, old_target;
> +            unsigned long old_target_mask;
> +            struct vcpu *v_target, *v_old;
> +
> +            target = i % 8;
> +            old_target_mask = byte_read(rank->itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR)], 0, i/8);
> +            old_target = find_first_bit((const unsigned long *) &old_target_mask, 8);

Another unnecessary cast.

> +
> +            if ( target != old_target )

Given the implementation only supports a single target I'm wondering if
maybe we should just store the configured target.

If we weren't required to read back the actual value then we could even
dispense with rank->itargets completely and just fabricate it.

> +            {
> +                irq = offset + (i / 8);
> +                v_target = v->domain->vcpu[target];
> +                v_old = v->domain->vcpu[old_target];
> +                vgic_migrate_irq(v_old, v_target, irq);
> +            }
> +            i += 8 - target;

I think this whole loop would be less subtle as:

	for ( byte = 0 ; byte < 4 ; byte++ )
		target = find_first_bit((char *)(&tr)+byte, 8)
		if ( target > 8 ) continue;
		old_target  =...
	}

The use of find_next_bit is not really semantically what is happening
here.

Ian.

Stefano Stabellini June 19, 2014, 6:32 p.m. UTC | #2

On Wed, 18 Jun 2014, Ian Campbell wrote:
> On Wed, 2014-06-11 at 17:27 +0100, Stefano Stabellini wrote:
> > We need to take special care when migrating irqs that are already
> > inflight from one vcpu to another. See "The effect of changes to an
> > GICD_ITARGETSR", part of chapter 4.3.12 of the ARM Generic Interrupt
> > Controller Architecture Specification.
> > 
> > The main issue from the Xen point of view is that the lr_pending and
> > inflight lists are per-vcpu. The lock we take to protect them is also
> > per-vcpu.
> > 
> > In order to avoid issues, we set a new flag GIC_IRQ_GUEST_MIGRATING, so
> > that we can recognize when we receive an irq while the previous one is
> > still inflight (given that we are only dealing with hardware interrupts
> > here, it just means that its LR hasn't been cleared yet on the old vcpu).
> > 
> > If GIC_IRQ_GUEST_MIGRATING is set, we only set GIC_IRQ_GUEST_QUEUED and
> > interrupt the old vcpu. When clearing the LR on the old vcpu, we take
> > special care of injecting the interrupt into the new vcpu. To do that we
> > need to release the old vcpu lock before taking the new vcpu lock.
> > 
> > Using barriers and test_bit on GIC_IRQ_GUEST_MIGRATING, make sure that
> > vgic_migrate_irq can run at the same time as gic_update_one_lr on
> > different cpus with consistent results.
> > 
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > 
> > ---
> > 
> > Changes in v5:
> > - pass unsigned long to find_next_bit for alignment on aarch64;
> > - call vgic_get_target_vcpu instead of gic_add_to_lr_pending to add the
> > irq in the right inflight queue;
> > - add barrier and bit tests to make sure that vgic_migrate_irq and
> > gic_update_one_lr can run simultaneously on different cpus without
> > issues;
> > - rework the loop to identify the new vcpu when ITARGETSR is written;
> > - use find_first_bit instead of find_next_bit.
> > ---
> >  xen/arch/arm/gic.c           |   25 ++++++++++++++--
> >  xen/arch/arm/vgic.c          |   66 +++++++++++++++++++++++++++++++++++++-----
> >  xen/include/asm-arm/domain.h |    4 +++
> >  3 files changed, 85 insertions(+), 10 deletions(-)
> > 
> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> > index 5e502df..8ed242e 100644
> > --- a/xen/arch/arm/gic.c
> > +++ b/xen/arch/arm/gic.c
> > @@ -677,10 +677,31 @@ static void gic_update_one_lr(struct vcpu *v, int i)
> >          clear_bit(GIC_IRQ_GUEST_ACTIVE, &p->status);
> >          p->lr = GIC_INVALID_LR;
> >          if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) &&
> > -             test_bit(GIC_IRQ_GUEST_QUEUED, &p->status) )
> > +             test_bit(GIC_IRQ_GUEST_QUEUED, &p->status) &&
> > +             !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
> >              gic_raise_guest_irq(v, irq, p->priority);
> > -        else
> > +        else {
> >              list_del_init(&p->inflight);
> > +
> > +            /* inflight is cleared before clearing
> > +             * GIC_IRQ_GUEST_MIGRATING */
> > +            dsb(sy);
> 
> Is sy really necessary? THis is all about state stored in RAM not the
> actual GIC, right? I think "ish" is probably sufficient. POssibly even a
> dmb of some sort.

Yeah, it is only about the state in RAM. It only needs to be visible
across all the cores, not the devices. So I guess the right function to
call would be smp_wmb() that translates to dmb(ishst)?
That makes me think that I should add smp_rmb() to the corresponding
critical section: vgic_migrate_irq.


> > diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> > index 4d655af..e640de9 100644
> > --- a/xen/arch/arm/vgic.c
> > +++ b/xen/arch/arm/vgic.c
> > @@ -543,6 +562,8 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
> >      int offset = (int)(info->gpa - v->domain->arch.vgic.dbase);
> >      int gicd_reg = REG(offset);
> >      uint32_t tr;
> > +    unsigned long trl;
> > +    int i;
> >  
> >      switch ( gicd_reg )
> >      {
> > @@ -626,22 +647,45 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
> >          rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR);
> >          if ( rank == NULL) goto write_ignore;
> >          /* 8-bit vcpu mask for this domain */
> > -        tr = (1 << v->domain->max_vcpus) - 1;
> > -        tr = tr | (tr << 8) | (tr << 16) | (tr << 24);
> > -        tr &= *r;
> > +        trl = (1 << v->domain->max_vcpus) - 1;
> > +        if ( dabt.size == 2 )
> > +            trl = trl | (trl << 8) | (trl << 16) | (trl << 24);
> > +        else
> > +            trl = (trl << (8 * (offset & 0x3)));
> > +        trl &= *r;
> 
> What does trl stand for? Why is it an unsigned long when r and tr are
> uint32_t?

It is an unfortunate name. trl is exactly like tr but unsigned long
instead of uint32_t.
It is unsigned long so that it can be used as an argument of
find_next_bit below. uint32_t cannot be used because it doesn't have the
right alignment on armv8.


> >          /* ignore zero writes */
> > -        if ( !tr )
> > +        if ( !trl )
> >              goto write_ignore;
> >          if ( dabt.size == 2 &&
> > -            !((tr & 0xff) && (tr & (0xff << 8)) &&
> > -             (tr & (0xff << 16)) && (tr & (0xff << 24))))
> > +            !((trl & 0xff) && (trl & (0xff << 8)) &&
> > +             (trl & (0xff << 16)) && (trl & (0xff << 24))))
> >              goto write_ignore;
> 
> I'm still not sure what this is ;-)

All the bytes in trl needs to be non-zero for the write to be valid
(considering a 0 write to any of the byte fields as invalid).


> >          vgic_lock_rank(v, rank);
> > +        i = 0;
> > +        while ( (i = find_next_bit((const unsigned long *)&trl, 32, i)) < 32 )
> 
> Unnecessary cast (if trl really should be an unsigned long).

The argument to find_next_bit really needs to be unsigned long. I'll
remove the cast.


> > +        {
> > +            unsigned int irq, target, old_target;
> > +            unsigned long old_target_mask;
> > +            struct vcpu *v_target, *v_old;
> > +
> > +            target = i % 8;
> > +            old_target_mask = byte_read(rank->itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR)], 0, i/8);
> > +            old_target = find_first_bit((const unsigned long *) &old_target_mask, 8);
> 
> Another unnecessary cast.

Right.


> > +
> > +            if ( target != old_target )
> 
> Given the implementation only supports a single target I'm wondering if
> maybe we should just store the configured target.
> 
> If we weren't required to read back the actual value then we could even
> dispense with rank->itargets completely and just fabricate it.

Yeah, that's a good point, I am not sure.
The spec says that it is valid to statically configure GICD_ITARGETSR
and ignore any writes to it, but it doesn't say that we can ignore
/some/ writes to it.


> > +            {
> > +                irq = offset + (i / 8);
> > +                v_target = v->domain->vcpu[target];
> > +                v_old = v->domain->vcpu[old_target];
> > +                vgic_migrate_irq(v_old, v_target, irq);
> > +            }
> > +            i += 8 - target;
> 
> I think this whole loop would be less subtle as:
> 
> 	for ( byte = 0 ; byte < 4 ; byte++ )
> 		target = find_first_bit((char *)(&tr)+byte, 8)

That would have alignment issues, wouldn't it?


> 		if ( target > 8 ) continue;
> 		old_target  =...
> 	}
> 
> The use of find_next_bit is not really semantically what is happening
> here.

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 5e502df..8ed242e 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -677,10 +677,31 @@  static void gic_update_one_lr(struct vcpu *v, int i)
         clear_bit(GIC_IRQ_GUEST_ACTIVE, &p->status);
         p->lr = GIC_INVALID_LR;
         if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) &&
-             test_bit(GIC_IRQ_GUEST_QUEUED, &p->status) )
+             test_bit(GIC_IRQ_GUEST_QUEUED, &p->status) &&
+             !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
             gic_raise_guest_irq(v, irq, p->priority);
-        else
+        else {
             list_del_init(&p->inflight);
+
+            /* inflight is cleared before clearing
+             * GIC_IRQ_GUEST_MIGRATING */
+            dsb(sy);
+            if ( test_and_clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) &&
+                 test_bit(GIC_IRQ_GUEST_QUEUED, &p->status) )
+            {
+                struct vcpu *v_target;
+
+                /* It is safe to temporarily drop the lock because we
+                 * are finished dealing with this LR. We'll take the
+                 * lock before reading the next. */
+                spin_unlock(&v->arch.vgic.lock);
+                /* vgic_get_target_vcpu takes the rank lock, ensuring
+                 * consistency with other itarget changes. */
+                v_target = vgic_get_target_vcpu(v, irq);
+                vgic_vcpu_inject_irq(v_target, irq);
+                spin_lock(&v->arch.vgic.lock);
+            }
+        }
     }
 }
 
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 4d655af..e640de9 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -395,6 +395,25 @@  static struct vcpu *_vgic_get_target_vcpu(struct vcpu *v, unsigned int irq)
     return v_target;
 }
 
+static void vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq)
+{
+    unsigned long flags;
+    struct pending_irq *p = irq_to_pending(old, irq); 
+
+    /* nothing to do for virtual interrupts */
+    if ( p->desc == NULL )
+        return;
+
+    /* migration already in progress, no need to do anything */
+    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
+        return;
+
+    spin_lock_irqsave(&old->arch.vgic.lock, flags);
+    if ( !list_empty(&p->inflight) )
+        set_bit(GIC_IRQ_GUEST_MIGRATING, &p->status);
+    spin_unlock_irqrestore(&old->arch.vgic.lock, flags);
+}
+
 struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int irq)
 {
     struct vcpu *v_target;
@@ -543,6 +562,8 @@  static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
     int offset = (int)(info->gpa - v->domain->arch.vgic.dbase);
     int gicd_reg = REG(offset);
     uint32_t tr;
+    unsigned long trl;
+    int i;
 
     switch ( gicd_reg )
     {
@@ -626,22 +647,45 @@  static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
         rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR);
         if ( rank == NULL) goto write_ignore;
         /* 8-bit vcpu mask for this domain */
-        tr = (1 << v->domain->max_vcpus) - 1;
-        tr = tr | (tr << 8) | (tr << 16) | (tr << 24);
-        tr &= *r;
+        trl = (1 << v->domain->max_vcpus) - 1;
+        if ( dabt.size == 2 )
+            trl = trl | (trl << 8) | (trl << 16) | (trl << 24);
+        else
+            trl = (trl << (8 * (offset & 0x3)));
+        trl &= *r;
         /* ignore zero writes */
-        if ( !tr )
+        if ( !trl )
             goto write_ignore;
         if ( dabt.size == 2 &&
-            !((tr & 0xff) && (tr & (0xff << 8)) &&
-             (tr & (0xff << 16)) && (tr & (0xff << 24))))
+            !((trl & 0xff) && (trl & (0xff << 8)) &&
+             (trl & (0xff << 16)) && (trl & (0xff << 24))))
             goto write_ignore;
         vgic_lock_rank(v, rank);
+        i = 0;
+        while ( (i = find_next_bit((const unsigned long *)&trl, 32, i)) < 32 )
+        {
+            unsigned int irq, target, old_target;
+            unsigned long old_target_mask;
+            struct vcpu *v_target, *v_old;
+
+            target = i % 8;
+            old_target_mask = byte_read(rank->itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR)], 0, i/8);
+            old_target = find_first_bit((const unsigned long *) &old_target_mask, 8);
+
+            if ( target != old_target )
+            {
+                irq = offset + (i / 8);
+                v_target = v->domain->vcpu[target];
+                v_old = v->domain->vcpu[old_target];
+                vgic_migrate_irq(v_old, v_target, irq);
+            }
+            i += 8 - target;
+        }
         if ( dabt.size == 2 )
-            rank->itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR)] = tr;
+            rank->itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR)] = trl;
         else
             byte_write(&rank->itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR)],
-                       tr, offset);
+                       trl, offset);
         vgic_unlock_rank(v, rank);
         return 1;
 
@@ -779,6 +823,12 @@  void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq)
 
     spin_lock_irqsave(&v->arch.vgic.lock, flags);
 
+    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &n->status) )
+    {
+        set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
+        goto out;
+    }
+
     if ( !list_empty(&n->inflight) )
     {
         set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index d689675..743c020 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -54,11 +54,15 @@  struct pending_irq
      * GIC_IRQ_GUEST_ENABLED: the guest IRQ is enabled at the VGICD
      * level (GICD_ICENABLER/GICD_ISENABLER).
      *
+     * GIC_IRQ_GUEST_MIGRATING: the irq is being migrated to a different
+     * vcpu.
+     *
      */
 #define GIC_IRQ_GUEST_QUEUED   0
 #define GIC_IRQ_GUEST_ACTIVE   1
 #define GIC_IRQ_GUEST_VISIBLE  2
 #define GIC_IRQ_GUEST_ENABLED  3
+#define GIC_IRQ_GUEST_MIGRATING   4
     unsigned long status;
     struct irq_desc *desc; /* only set it the irq corresponds to a physical irq */
     int irq;

[Xen-devel,v5,4/6] xen/arm: inflight irqs during migration

Commit Message

Comments

Patch