[Xen-devel] Hit ASSERT in kill_timer function

Message ID 53610B51.8030701@linaro.org
State Not Applicable, archived

Commit Message

Julien Grall April 30, 2014, 2:40 p.m. UTC
Hi,

I played a bit with the function vcpu_initialise on ARM.
If it fails, it will likely crash Xen with the following stack trace:

(XEN) Xen BUG at /local/home/julien/works/arndale/xen/xen/include/xen/list.h:175
(XEN) CPU0: Unexpected Trap: Undefined Instruction
(XEN) ----[ Xen-4.5-unstable  arm32  debug=y  Tainted:    C ]----
(XEN) CPU:    0
(XEN) PC:     002457e0 __bug+0x2c/0x44
(XEN) CPSR:   200001da MODE:Hypervisor
(XEN)      R0: 0026b6d4 R1: 00000005 R2: 00000000 R3: 200001da
(XEN)      R4: 000000af R5: 00263274 R6: 002ec020 R7: 002ee380
(XEN)      R8: 002ee380 R9: 8000015a R10:7ffc1040 R11:7ffdfd6c R12:00000006
(XEN) HYP: SP: 7ffdfd64 LR: 002457e0
(XEN) 
(XEN)   VTCR_EL2: 80003558
(XEN)  VTTBR_EL2: 00010002f9ffc000
(XEN) 
(XEN)  SCTLR_EL2: 30cd187f
(XEN)    HCR_EL2: 0000000000382437
(XEN)  TTBR0_EL2: 00000000ff6e7000
(XEN) 
(XEN)    ESR_EL2: 00000000
(XEN)  HPFAR_EL2: 0000000000fff110
(XEN)      HDFAR: a0800f00
(XEN)      HIFAR: 00000000
(XEN) 
(XEN) Xen stack trace from sp=7ffdfd64:
(XEN)    00000000 7ffdfd94 00231cb0 00000003 7ffc1000 00000fff 7ffc1000 40022000
(XEN)    00000000 00000003 0026bb80 7ffdfda4 002296e4 00000000 00000fff 7ffdfdc4
(XEN)    002081d8 00000000 00000080 00000003 40025c30 002ec020 7ffdfdf8 7ffdfedc
(XEN)    00206a10 76f6a004 00000000 76f6c004 002ef69c 002ef69c fffffffc 40022000
(XEN)    002ee298 002ee298 002ef69c 4000f9b8 00000001 0000000f 00000000 00000000
(XEN)    00000000 00000000 00000000 00000000 00000000 00000000 c437bb18 75464f92
(XEN)    83f21e90 0000000f 0000000a 00000003 00031008 00000001 76f43cec 76ea5000
(XEN)    00000000 7e8a319c 76eea484 76f49ec0 7e8a31bc 7e8a32d4 76f84000 76ef7000
(XEN)    000586f8 76efad4c 76f844c0 00000000 00000001 7e8a327c 76f77857 00000000
(XEN)    00000001 00000001 00000000 76f49ec0 76ed43b0 76f49e90 7e8a3500 00031030
(XEN)    00032290 00000003 00038828 7e8a3540 76f7bf2c 40024a60 7ffdff58 8000db88
(XEN)    00000005 00305000 00000ea1 9d7fc000 7e8a3120 7ffdff54 00254698 ffffffff
(XEN)    002ef280 002ef294 4000f068 00000019 7ffdff58 002ef280 002ef294 7ffdff2c
(XEN)    7ffdff2c 7ffdff3c 00000019 40023954 7ffdff58 00000000 76f6c000 9f782040
(XEN)    00000005 00305000 00000005 9d7fc000 ffffffff 9f782040 00000005 00305000
(XEN)    00000005 9d7fc000 7e8a3120 7ffdff58 00257110 76f6c004 76f89578 00000000
(XEN)    76ed43b0 ffffffff 9f782040 00000005 00305000 00000005 9d7fc000 7e8a3120
(XEN)    9f119010 00000024 ffffffff 76eead54 8000db88 60000013 00000000 7e8a30ec
(XEN)    8056e740 80011ec0 9d7fdeb4 80200a60 8056e74c 80012000 8056e758 800120a0
(XEN)    00000000 00000000 00000000 00000000 00000000 00000000 00000000 60000010
(XEN) Xen call trace:
(XEN)    [<002457e0>] __bug+0x2c/0x44 (PC)
(XEN)    [<002457e0>] __bug+0x2c/0x44 (LR)
(XEN)    [<00231cb0>] kill_timer+0x1bc/0x364
(XEN)    [<002296e4>] sched_destroy_vcpu+0x1c/0x14c
(XEN)    [<002081d8>] alloc_vcpu+0x17c/0x270
(XEN)    [<00206a10>] do_domctl+0xa74/0x11f4
(XEN)    [<00254698>] do_trap_hypervisor+0x7f0/0xb44
(XEN)    [<00257110>] return_from_trap+0/0x4
(XEN) 

It's easily reproducible on ARM with this small patch:

I guess we forgot to take a lock or something like that, but I don't
know this code well enough.
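
For readers unfamiliar with this kind of failure: the BUG is reported from list.h and is reached through kill_timer. One common way for a list consistency check to fire is unlinking an entry that is not properly linked (never added, or already removed). The thread does not establish that this is the cause here. The sketch below only illustrates that kind of check. All names in it are hypothetical and it is not a copy of Xen's list.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal doubly linked list node in the usual kernel style (hypothetical). */
struct list_head {
    struct list_head *next, *prev;
};

/* An empty list is a head that points at itself. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    assert(entry != NULL && head != NULL);
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* The invariant a debug build might check before unlinking an entry. */
static bool list_entry_is_consistent(const struct list_head *entry)
{
    return entry->next != NULL && entry->prev != NULL &&
           entry->next->prev == entry && entry->prev->next == entry;
}

/*
 * Unlink entry. Returns false (instead of hitting a BUG) when the entry
 * is not linked consistently, e.g. never added or already removed.
 */
static bool list_del_checked(struct list_head *entry)
{
    if ( !list_entry_is_consistent(entry) )
        return false;

    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = NULL;
    entry->prev = NULL;
    return true;
}
```

Deleting an entry once succeeds. Deleting it a second time, or deleting an entry that was never added, fails the check. A debug build that asserts on the same condition would stop at that point.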

Regards,

Comments

Jan Beulich April 30, 2014, 3:49 p.m. UTC | #1
>>> On 30.04.14 at 16:40, <julien.grall@linaro.org> wrote:
> I played a bit with the function vcpu_initialise on ARM.
> If it fails, it will likely crash Xen with the following stack trace:
> 
> (XEN) Xen BUG at /local/home/julien/works/arndale/xen/xen/include/xen/list.h:175
> ...
> (XEN) Xen call trace:
> (XEN)    [<002457e0>] __bug+0x2c/0x44 (PC)
> (XEN)    [<002457e0>] __bug+0x2c/0x44 (LR)
> (XEN)    [<00231cb0>] kill_timer+0x1bc/0x364
> (XEN)    [<002296e4>] sched_destroy_vcpu+0x1c/0x14c
> (XEN)    [<002081d8>] alloc_vcpu+0x17c/0x270
> (XEN)    [<00206a10>] do_domctl+0xa74/0x11f4
> (XEN)    [<00254698>] do_trap_hypervisor+0x7f0/0xb44
> (XEN)    [<00257110>] return_from_trap+0/0x4
> 
> It's easily reproducible on ARM with this small patch:
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index ccccb77..7ada03f 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -473,6 +473,9 @@ int vcpu_initialise(struct vcpu *v)
>      if ( (rc = vcpu_vtimer_init(v)) != 0 )
>          return rc;
>  
> +    if ( v->domain->domain_id != 0 )
> +        return -EFAULT;
> +
>      return rc;
>  }
> 
> I guess we forgot to take a lock or something like that, but I don't
> know this code well enough.

I definitely can't reproduce this on x86 - I tried three different
variations of which vCPU(s) to fail this function on. Are you sure
you didn't corrupt something with your experiments?

Jan
Julien Grall April 30, 2014, 4:09 p.m. UTC | #2
On 04/30/2014 04:49 PM, Jan Beulich wrote:
>>>> On 30.04.14 at 16:40, <julien.grall@linaro.org> wrote:
>> I played a bit with the function vcpu_initialise on ARM.
>> If it fails, it will likely crash Xen with the following stack trace:
>>
>> (XEN) Xen BUG at /local/home/julien/works/arndale/xen/xen/include/xen/list.h:175
>> ...
>> (XEN) Xen call trace:
>> (XEN)    [<002457e0>] __bug+0x2c/0x44 (PC)
>> (XEN)    [<002457e0>] __bug+0x2c/0x44 (LR)
>> (XEN)    [<00231cb0>] kill_timer+0x1bc/0x364
>> (XEN)    [<002296e4>] sched_destroy_vcpu+0x1c/0x14c
>> (XEN)    [<002081d8>] alloc_vcpu+0x17c/0x270
>> (XEN)    [<00206a10>] do_domctl+0xa74/0x11f4
>> (XEN)    [<00254698>] do_trap_hypervisor+0x7f0/0xb44
>> (XEN)    [<00257110>] return_from_trap+0/0x4
>>
>> It's easily reproducible on ARM with this small patch:
>>
>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index ccccb77..7ada03f 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -473,6 +473,9 @@ int vcpu_initialise(struct vcpu *v)
>>      if ( (rc = vcpu_vtimer_init(v)) != 0 )
>>          return rc;
>>  
>> +    if ( v->domain->domain_id != 0 )
>> +        return -EFAULT;
>> +
>>      return rc;
>>  }
>>
>> I guess we forgot to take a lock or something like that, but I don't
>> know this code well enough.
> 
> I definitely can't reproduce this on x86 - I tried three different
> variations of which vCPU(s) to fail this function on. Are you sure
> you didn't corrupt something with your experiments?

Yes, I've checked out xengit/staging (commit 9f2f129) and applied the tiny patch above.

Usually, on the first iteration, I get:

Parsing config from dom1.xl
libxl: error: libxl_dom.c:239:libxl__build_pre: Couldn't set max vcpu count
libxl: error: libxl_create.c:1036:domcreate_rebuild_done: cannot (re-)build domain: -3

And then Xen will crash as soon as I try to create another domain.

I will investigate on the ARM side to see what happens.

Regards,
Patch

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index ccccb77..7ada03f 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -473,6 +473,9 @@  int vcpu_initialise(struct vcpu *v)
     if ( (rc = vcpu_vtimer_init(v)) != 0 )
         return rc;
 
+    if ( v->domain->domain_id != 0 )
+        return -EFAULT;
+
     return rc;
 }
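
The patch above is a form of fault injection: it forces the late stage of vCPU initialisation to fail for every domain other than domain 0, so that the caller's error path gets exercised. As a rough, self-contained illustration of that technique, here is a toy model in which a caller sets up one stage, hits the injected failure in the next, and unwinds. Every name in it (toy_vcpu, toy_alloc_vcpu, ...) is hypothetical. It does not reproduce Xen's alloc_vcpu or sched_destroy_vcpu, and it says nothing about where the actual bug is.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a vCPU; only tracks which stages are set up. */
struct toy_vcpu {
    int domain_id;
    bool sched_ready;   /* scheduler-side state initialised */
    bool arch_ready;    /* arch-side state initialised */
};

static int toy_sched_init(struct toy_vcpu *v)
{
    v->sched_ready = true;
    return 0;
}

static void toy_sched_destroy(struct toy_vcpu *v)
{
    v->sched_ready = false;
}

/* Arch init with an injected failure, mirroring the patch's condition. */
static int toy_arch_init(struct toy_vcpu *v)
{
    if ( v->domain_id != 0 )
        return -EFAULT;

    v->arch_ready = true;
    return 0;
}

/* Set up both stages; on a late failure, undo the earlier stage. */
static int toy_alloc_vcpu(struct toy_vcpu *v, int domain_id)
{
    int rc;

    assert(v != NULL);
    v->domain_id = domain_id;
    v->sched_ready = false;
    v->arch_ready = false;

    if ( (rc = toy_sched_init(v)) != 0 )
        return rc;

    if ( (rc = toy_arch_init(v)) != 0 )
    {
        toy_sched_destroy(v);   /* error path exercised by the injection */
        return rc;
    }

    return 0;
}
```

With domain_id 0 the toy allocation succeeds. With any other id it returns -EFAULT after tearing down the scheduler-side state, which is the path the injected failure is meant to exercise.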