Message ID | 1400761950-25035-13-git-send-email-stefano.stabellini@eu.citrix.com |
---|---|
State | New |
Headers | show |
Hi Stefano, On 22/05/14 13:32, Stefano Stabellini wrote: > At the moment gic_remove_from_queues doesn't handle the case where the > guest kernel disables an irq on a different vcpu compared to the one > currently receiving the interrupt. > Make sure to take the right vcpu lock before removing the irq from > lr_queue. I see the same issue with vgic_enable_irqs. We may inject to the wrong VCPU (i.e other than 0). I think we should have the same case in vgic_enable_irqs. Cheers,
On Thu, 22 May 2014, Julien Grall wrote: > Hi Stefano, > > On 22/05/14 13:32, Stefano Stabellini wrote: > > At the moment gic_remove_from_queues doesn't handle the case where the > > guest kernel disables an irq on a different vcpu compared to the one > > currently receiving the interrupt. > > Make sure to take the right vcpu lock before removing the irq from > > lr_queue. > > I see the same issue with vgic_enable_irqs. We may inject to the wrong VCPU > (i.e other than 0). > > I think we should have the same case in vgic_enable_irqs. I think it would make more sense to print a warning in vgic_distr_mmio_write GICD_ITARGETSR rather than vgic_enable_irqs.
On 22/05/14 18:45, Stefano Stabellini wrote: > On Thu, 22 May 2014, Julien Grall wrote: >> Hi Stefano, >> >> On 22/05/14 13:32, Stefano Stabellini wrote: >>> At the moment gic_remove_from_queues doesn't handle the case where the >>> guest kernel disables an irq on a different vcpu compared to the one >>> currently receiving the interrupt. >>> Make sure to take the right vcpu lock before removing the irq from >>> lr_queue. >> >> I see the same issue with vgic_enable_irqs. We may inject to the wrong VCPU >> (i.e other than 0). >> >> I think we should have the same case in vgic_enable_irqs. > > I think it would make more sense to print a warning in > vgic_distr_mmio_write GICD_ITARGETSR rather than vgic_enable_irqs. IHMO the warning is not enougth. We may screw your state machine. BTW, for your todo: > + /* TODO: evict the irq from LRs */ We should not evict the IRQ from LRs. The guest may disable the IRQ while he is in the IRQ context (and before the IRQ has been EOI). If you drop the IRQs from the LRs, this can result to a maintenance interrupt: "If the specified Interrupt does not exist in the List registers, the GICH_HCR.EOIcount field is incremented, potentially generating a maintenance interrupt." Regards,
On Thu, 22 May 2014, Julien Grall wrote: > On 22/05/14 18:45, Stefano Stabellini wrote: > > On Thu, 22 May 2014, Julien Grall wrote: > > > Hi Stefano, > > > > > > On 22/05/14 13:32, Stefano Stabellini wrote: > > > > At the moment gic_remove_from_queues doesn't handle the case where the > > > > guest kernel disables an irq on a different vcpu compared to the one > > > > currently receiving the interrupt. > > > > Make sure to take the right vcpu lock before removing the irq from > > > > lr_queue. > > > > > > I see the same issue with vgic_enable_irqs. We may inject to the wrong > > > VCPU > > > (i.e other than 0). > > > > > > I think we should have the same case in vgic_enable_irqs. > > > > I think it would make more sense to print a warning in > > vgic_distr_mmio_write GICD_ITARGETSR rather than vgic_enable_irqs. > > IHMO the warning is not enougth. We may screw your state machine. That cannot happen: rank->itargets is actually unused at the moment. > BTW, for your todo: > > > + /* TODO: evict the irq from LRs */ > > We should not evict the IRQ from LRs. The guest may disable the IRQ while he > is in the IRQ context (and before the IRQ has been EOI). If you drop the IRQs > from the LRs, this can result to a maintenance interrupt: > > "If the specified Interrupt does not exist in the > List registers, the GICH_HCR.EOIcount field is incremented, potentially > generating a maintenance interrupt." It is still better than the alternative: having an LR busy for no reason. A maintenance interrupt would be harmless.
On 05/23/2014 06:33 PM, Stefano Stabellini wrote: > On Thu, 22 May 2014, Julien Grall wrote: >> On 22/05/14 18:45, Stefano Stabellini wrote: >>> On Thu, 22 May 2014, Julien Grall wrote: >>>> Hi Stefano, >>>> >>>> On 22/05/14 13:32, Stefano Stabellini wrote: >>>>> At the moment gic_remove_from_queues doesn't handle the case where the >>>>> guest kernel disables an irq on a different vcpu compared to the one >>>>> currently receiving the interrupt. >>>>> Make sure to take the right vcpu lock before removing the irq from >>>>> lr_queue. >>>> >>>> I see the same issue with vgic_enable_irqs. We may inject to the wrong >>>> VCPU >>>> (i.e other than 0). >>>> >>>> I think we should have the same case in vgic_enable_irqs. >>> >>> I think it would make more sense to print a warning in >>> vgic_distr_mmio_write GICD_ITARGETSR rather than vgic_enable_irqs. >> >> IHMO the warning is not enougth. We may screw your state machine. > > That cannot happen: rank->itargets is actually unused at the moment. itargets is not used, but nothing prevent a guest to enabled an IRQ on VCPU1. This can inject the IRQ in VCPU1 while it's already injected in VCPU0 (assuming the IRQ what disable for a little time). > >> BTW, for your todo: >> >>> + /* TODO: evict the irq from LRs */ >> >> We should not evict the IRQ from LRs. The guest may disable the IRQ while he >> is in the IRQ context (and before the IRQ has been EOI). If you drop the IRQs >> from the LRs, this can result to a maintenance interrupt: >> >> "If the specified Interrupt does not exist in the >> List registers, the GICH_HCR.EOIcount field is incremented, potentially >> generating a maintenance interrupt." > > It is still better than the alternative: having an LR busy for no reason. > A maintenance interrupt would be harmless. Our internal representation (in the status field, still inflight) won't be update-to-date for IRQ. We either inject a spurious IRQ (if it's a virtual IRQ), other set active & pending is physical IRQ (which is invalid from the GIC specification). Regards,
On Fri, 23 May 2014, Julien Grall wrote: > On 05/23/2014 06:33 PM, Stefano Stabellini wrote: > > On Thu, 22 May 2014, Julien Grall wrote: > >> On 22/05/14 18:45, Stefano Stabellini wrote: > >>> On Thu, 22 May 2014, Julien Grall wrote: > >>>> Hi Stefano, > >>>> > >>>> On 22/05/14 13:32, Stefano Stabellini wrote: > >>>>> At the moment gic_remove_from_queues doesn't handle the case where the > >>>>> guest kernel disables an irq on a different vcpu compared to the one > >>>>> currently receiving the interrupt. > >>>>> Make sure to take the right vcpu lock before removing the irq from > >>>>> lr_queue. > >>>> > >>>> I see the same issue with vgic_enable_irqs. We may inject to the wrong > >>>> VCPU > >>>> (i.e other than 0). > >>>> > >>>> I think we should have the same case in vgic_enable_irqs. > >>> > >>> I think it would make more sense to print a warning in > >>> vgic_distr_mmio_write GICD_ITARGETSR rather than vgic_enable_irqs. > >> > >> IHMO the warning is not enougth. We may screw your state machine. > > > > That cannot happen: rank->itargets is actually unused at the moment. > > itargets is not used, but nothing prevent a guest to enabled an IRQ on > VCPU1. That is actually the problem: if vcpu1 is the one enabling an SPI, the target vcpu should still be the one specified by itarget. > This can inject the IRQ in VCPU1 while it's already injected in > VCPU0 (assuming the IRQ what disable for a little time). > > > > >> BTW, for your todo: > >> > >>> + /* TODO: evict the irq from LRs */ > >> > >> We should not evict the IRQ from LRs. The guest may disable the IRQ while he > >> is in the IRQ context (and before the IRQ has been EOI). If you drop the IRQs > >> from the LRs, this can result to a maintenance interrupt: > >> > >> "If the specified Interrupt does not exist in the > >> List registers, the GICH_HCR.EOIcount field is incremented, potentially > >> generating a maintenance interrupt." > > > > It is still better than the alternative: having an LR busy for no reason. > > A maintenance interrupt would be harmless. > > Our internal representation (in the status field, still inflight) won't > be update-to-date for IRQ. We either inject a spurious IRQ (if it's a > virtual IRQ), other set active & pending is physical IRQ (which is > invalid from the GIC specification). I think that the best behaviour would be to evict the irq from LRs if the irq is not active.
On 25/05/14 16:39, Stefano Stabellini wrote: > That is actually the problem: if vcpu1 is the one enabling an SPI, the > target vcpu should still be the one specified by itarget. Yes, but by enabling I mean writing in ISENABLER* on VCPU1. In this case, we may inject this IRQ on this CPUs (see vgic_enable_irqs). > >> This can inject the IRQ in VCPU1 while it's already injected in >> VCPU0 (assuming the IRQ what disable for a little time). >> >>> >>>> BTW, for your todo: >>>> >>>>> + /* TODO: evict the irq from LRs */ >>>> >>>> We should not evict the IRQ from LRs. The guest may disable the IRQ while he >>>> is in the IRQ context (and before the IRQ has been EOI). If you drop the IRQs >>>> from the LRs, this can result to a maintenance interrupt: >>>> >>>> "If the specified Interrupt does not exist in the >>>> List registers, the GICH_HCR.EOIcount field is incremented, potentially >>>> generating a maintenance interrupt." >>> >>> It is still better than the alternative: having an LR busy for no reason. >>> A maintenance interrupt would be harmless. >> >> Our internal representation (in the status field, still inflight) won't >> be update-to-date for IRQ. We either inject a spurious IRQ (if it's a >> virtual IRQ), other set active & pending is physical IRQ (which is >> invalid from the GIC specification). > > I think that the best behaviour would be to evict the irq from LRs if > the irq is not active. Right. Regards,
On Sun, 25 May 2014, Julien Grall wrote: > On 25/05/14 16:39, Stefano Stabellini wrote: > > That is actually the problem: if vcpu1 is the one enabling an SPI, the > > target vcpu should still be the one specified by itarget. > > Yes, but by enabling I mean writing in ISENABLER* on VCPU1. In this case, we > may inject this IRQ on this CPUs (see vgic_enable_irqs). Yes, that is completely wrong, I have a patch for that. I'll send it out separately from this series. > > > This can inject the IRQ in VCPU1 while it's already injected in > > > VCPU0 (assuming the IRQ what disable for a little time). > > > > > > > > > > > > BTW, for your todo: > > > > > > > > > > > + /* TODO: evict the irq from LRs */ > > > > > > > > > > We should not evict the IRQ from LRs. The guest may disable the IRQ > > > > > while he > > > > > is in the IRQ context (and before the IRQ has been EOI). If you drop > > > > > the IRQs > > > > > from the LRs, this can result to a maintenance interrupt: > > > > > > > > > > "If the specified Interrupt does not exist in the > > > > > List registers, the GICH_HCR.EOIcount field is incremented, > > > > > potentially > > > > > generating a maintenance interrupt." > > > > > > > > It is still better than the alternative: having an LR busy for no > > > > reason. > > > > A maintenance interrupt would be harmless. > > > > > > Our internal representation (in the status field, still inflight) won't > > > be update-to-date for IRQ. We either inject a spurious IRQ (if it's a > > > virtual IRQ), other set active & pending is physical IRQ (which is > > > invalid from the GIC specification). > > > > I think that the best behaviour would be to evict the irq from LRs if > > the irq is not active. > > Right.
On Sun, 25 May 2014, Stefano Stabellini wrote: > On Sun, 25 May 2014, Julien Grall wrote: > > On 25/05/14 16:39, Stefano Stabellini wrote: > > > That is actually the problem: if vcpu1 is the one enabling an SPI, the > > > target vcpu should still be the one specified by itarget. > > > > Yes, but by enabling I mean writing in ISENABLER* on VCPU1. In this case, we > > may inject this IRQ on this CPUs (see vgic_enable_irqs). > > Yes, that is completely wrong, I have a patch for that. I'll send it out > separately from this series. To be clear: I'll drop this patch from this series and send out two separate patches. > > > > This can inject the IRQ in VCPU1 while it's already injected in > > > > VCPU0 (assuming the IRQ what disable for a little time). > > > > > > > > > > > > > > > BTW, for your todo: > > > > > > > > > > > > > + /* TODO: evict the irq from LRs */ > > > > > > > > > > > > We should not evict the IRQ from LRs. The guest may disable the IRQ > > > > > > while he > > > > > > is in the IRQ context (and before the IRQ has been EOI). If you drop > > > > > > the IRQs > > > > > > from the LRs, this can result to a maintenance interrupt: > > > > > > > > > > > > "If the specified Interrupt does not exist in the > > > > > > List registers, the GICH_HCR.EOIcount field is incremented, > > > > > > potentially > > > > > > generating a maintenance interrupt." > > > > > > > > > > It is still better than the alternative: having an LR busy for no > > > > > reason. > > > > > A maintenance interrupt would be harmless. > > > > > > > > Our internal representation (in the status field, still inflight) won't > > > > be update-to-date for IRQ. We either inject a spurious IRQ (if it's a > > > > virtual IRQ), other set active & pending is physical IRQ (which is > > > > invalid from the GIC specification). > > > > > > I think that the best behaviour would be to evict the irq from LRs if > > > the irq is not active. > > > > Right. >
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c index 2bfaba9..bb598eb 100644 --- a/xen/arch/arm/gic.c +++ b/xen/arch/arm/gic.c @@ -589,9 +589,16 @@ static inline void gic_add_to_lr_pending(struct vcpu *v, struct pending_irq *n) void gic_remove_from_queues(struct vcpu *v, unsigned int virtual_irq) { - struct pending_irq *p = irq_to_pending(v, virtual_irq); + struct pending_irq *p; unsigned long flags; + /* TODO: do not assume SPI delivery on vcpu0 */ + if ( virtual_irq >= 32 && v->vcpu_id != 0 ) + v = v->domain->vcpu[0]; + + p = irq_to_pending(v, virtual_irq); + + /* TODO: evict the irq from LRs */ spin_lock_irqsave(&v->arch.vgic.lock, flags); if ( !list_empty(&p->lr_queue) ) list_del_init(&p->lr_queue);
At the moment gic_remove_from_queues doesn't handle the case where the guest kernel disables an irq on a different vcpu compared to the one currently receiving the interrupt. Make sure to take the right vcpu lock before removing the irq from lr_queue. Document that the function should remove irqs from LR registers too. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen/arch/arm/gic.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)