[Xen-devel] Xen 4.5 random freeze question

Message ID 546CA3C6.6010406@linaro.org
State New

Commit Message

Julien Grall Nov. 19, 2014, 2:05 p.m. UTC
On 11/19/2014 01:30 PM, Andrii Tseglytskyi wrote:
> On Wed, Nov 19, 2014 at 3:26 PM, Julien Grall <julien.grall@linaro.org> wrote:
>> On 11/19/2014 12:40 PM, Andrii Tseglytskyi wrote:
>>> Hi Julien,
>>>
>>> On Wed, Nov 19, 2014 at 2:23 PM, Julien Grall <julien.grall@linaro.org> wrote:
>>>> On 11/19/2014 12:17 PM, Stefano Stabellini wrote:
>>>>> On Wed, 19 Nov 2014, Ian Campbell wrote:
>>>>>> On Wed, 2014-11-19 at 11:42 +0000, Stefano Stabellini wrote:
>>>>>>> So it looks like there is not actually anything wrong, it's just that you
>>>>>>> have too many inflight irqs? It shouldn't cause problems because in that
>>>>>>> case GICH_HCR_UIE should be set and you should get a maintenance
>>>>>>> interrupt when LRs become available (actually when "none, or only one,
>>>>>>> of the List register entries is marked as a valid interrupt").
>>>>>>>
>>>>>>> Maybe GICH_HCR_UIE is the one that doesn't work properly.
>>>>>>
>>>>>> How much testing did this aspect get when the no-maint-irq series
>>>>>> originally went in? Did you manage to find a workload which filled all
>>>>>> the LRs or try artificially limiting the number of LRs somehow in order
>>>>>> to provoke it?
>>>>>>
>>>>>> I ask because my intuition is that this won't happen very much, meaning
>>>>>> those code paths may not be as well tested...
>>>>>
>>>>> I did test it by artificially limiting the number of LRs to 1.
>>>>> However there have been many iterations of that series and I didn't run
>>>>> this test at every iteration.
>>>>
>>>> Am I the only one to think this may not be related to this bug? All the LRs
>>>> are full with IRQs of the same priority. So it's valid.
>>>>
>>>> As gic_restore_pending_irqs is called every time we return to the
>>>> guest, it could be anything else.
>>>>
>>>> It would be interesting to see why we are trapping all the time in Xen.
>>>>
>>>
>>> I can run any test if you have a specific scenario.
>>
>> I have no specific scenario in mind :/.
>>
>> It looks like I'm able to reproduce it on my ARM board by restricting
>> the number of LRs to 1.
>>
> 
> Do you mean that you got a hang with the current xen/master branch?

Yes but I forgot to update another part of the code.

With the patch below to restrict the number of LRs I'm still able to boot,
and I don't see any maintenance interrupt.

Stefano, is it valid?
Patch

diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index faad1ff..c1c0f7ff 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -327,6 +327,7 @@  static void __cpuinit gicv2_hyp_init(void)
     vtr = readl_gich(GICH_VTR);
     nr_lrs  = (vtr & GICH_V2_VTR_NRLRGS) + 1;
     gicv2_info.nr_lrs = nr_lrs;
+    gicv2_info.nr_lrs = 1;
 
     writel_gich(GICH_MISR_EOI, GICH_MISR);
 }
@@ -488,6 +489,16 @@  static void gicv2_write_lr(int lr, const struct gic_lr *lr_reg)
 
 static void gicv2_hcr_status(uint32_t flag, bool_t status)
 {
+    uint32_t lr = readl_gich(GICH_LR + 0);
+
+    if ( status )
+        lr |= GICH_V2_LR_MAINTENANCE_IRQ;
+    else
+        lr &= ~GICH_V2_LR_MAINTENANCE_IRQ;
+
+    writel_gich(lr, GICH_LR + 0);
+
+#if 0
     uint32_t hcr = readl_gich(GICH_HCR);
 
     if ( status )
@@ -496,6 +507,7 @@  static void gicv2_hcr_status(uint32_t flag, bool_t status)
         hcr &= (~flag);
 
     writel_gich(hcr, GICH_HCR);
+#endif
 }
 
 static unsigned int gicv2_read_vmcr_priority(void)
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 70d10d6..c726d7a 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -599,6 +599,7 @@  static void maintenance_interrupt(int irq, void *dev_id, struct cpu_user_regs *r
      * on return to guest that is going to clear the old LRs and inject
      * new interrupts.
      */
+    gdprintk(XENLOG_DEBUG, "\n");
 }
 
 void gic_dump_info(struct vcpu *v)
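
For reference, the two hardware behaviours the thread relies on can be modelled in plain C outside of Xen: the LR-count decode done in gicv2_hyp_init(), and the underflow condition quoted above ("none, or only one, of the List register entries is marked as a valid interrupt"). This is only a sketch. The macro and function names below are hypothetical and are not Xen identifiers, and the field positions (GICH_VTR[5:0] for the LR count, GICH_LR[29:28] for the state) are assumed from the GICv2 architecture specification rather than taken from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field layout (GICv2 spec); names are illustrative, not Xen's. */
#define MODEL_VTR_LISTREGS_MASK 0x3fu   /* GICH_VTR[5:0]: number of LRs minus one */
#define MODEL_LR_STATE_SHIFT    28      /* GICH_LR[29:28]: state, 0 means invalid */
#define MODEL_LR_STATE_MASK     (0x3u << MODEL_LR_STATE_SHIFT)

/* Decode the number of implemented list registers from a GICH_VTR value. */
static unsigned int model_nr_lrs(uint32_t vtr)
{
    return (vtr & MODEL_VTR_LISTREGS_MASK) + 1;
}

/* Count the list registers whose state field is not "invalid". */
static unsigned int model_valid_lrs(const uint32_t *lrs, unsigned int nr_lrs)
{
    unsigned int i, valid = 0;

    for ( i = 0; i < nr_lrs; i++ )
        if ( lrs[i] & MODEL_LR_STATE_MASK )
            valid++;

    return valid;
}

/* Underflow condition: UIE is enabled and at most one LR holds a valid entry. */
static bool model_underflow(bool uie, const uint32_t *lrs, unsigned int nr_lrs)
{
    return uie && model_valid_lrs(lrs, nr_lrs) <= 1;
}
```

Under this model, once nr_lrs is forced to 1 the "at most one valid entry" test is true whenever UIE is set, regardless of what the single LR contains.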