ARM/ARM64: arch-timer: fix arch_timer_probed logic

Message ID 1413385580-23317-1-git-send-email-marc.zyngier@arm.com
State Accepted
Commit 59aa896db80479dec29f471a7ca2b9eeeeb7d38e

Commit Message

Marc Zyngier Oct. 15, 2014, 3:06 p.m. UTC
Commit c387f07e6205 (clocksource: arm_arch_timer: Discard unavailable
timers correctly) changed the way the driver makes sure both the memory
and system-register timers have been probed before finalizing the probing.

There is an interesting flaw in this logic that leads to this final step
never being executed. Things seem to work pretty well until something
actually needs the data that is produced during this final stage.

For example, KVM explodes on the first run of a guest when executed on
a platform that has both memory and sysreg nodes (Juno, for example).

Just fix the damned logic, and enjoy booting VMs again.

Tested on a Juno system.

Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Reported-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 drivers/clocksource/arm_arch_timer.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Mark Rutland Oct. 15, 2014, 3:28 p.m. UTC | #1
On Wed, Oct 15, 2014 at 04:06:20PM +0100, Marc Zyngier wrote:
> Commit c387f07e6205 (clocksource: arm_arch_timer: Discard unavailable
> timers correctly) changed the way the driver makes sure both the memory
> and system-register timers have been probed before finalizing the probing.
> 
> There is an interesting flaw in this logic that leads to this final step
> never being executed. Things seem to work pretty well until something
> actually needs the data that is produced during this final stage.
> 
> For example, KVM explodes on the first run of a guest when executed on
> a platform that has both memory and sysreg nodes (Juno, for example).

As far as I can tell, the logic is flawed for all cases except two
functional nodes that we manage to probe.

> 
> Just fix the damned logic, and enjoy booting VMs again.
> 
> Tested on a Juno system.
> 
> Cc: Sudeep Holla <sudeep.holla@arm.com>
> Cc: Stephen Boyd <sboyd@codeaurora.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> Reported-by: Riku Voipio <riku.voipio@linaro.org>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

The new logic makes sense to me, so:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Thanks,
Mark.

> ---
>  drivers/clocksource/arm_arch_timer.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index 2133f9d..43005d4 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -660,11 +660,11 @@ static bool __init
>  arch_timer_probed(int type, const struct of_device_id *matches)
>  {
>  	struct device_node *dn;
> -	bool probed = false;
> +	bool probed = true;
>  
>  	dn = of_find_matching_node(NULL, matches);
> -	if (dn && of_device_is_available(dn) && (arch_timers_present & type))
> -		probed = true;
> +	if (dn && of_device_is_available(dn) && !(arch_timers_present & type))
> +		probed = false;
>  	of_node_put(dn);
>  
>  	return probed;
> -- 
> 2.0.4
> 
>
Sudeep Holla Oct. 15, 2014, 3:49 p.m. UTC | #2
On 15/10/14 16:28, Mark Rutland wrote:
> On Wed, Oct 15, 2014 at 04:06:20PM +0100, Marc Zyngier wrote:
>> Commit c387f07e6205 (clocksource: arm_arch_timer: Discard unavailable
>> timers correctly) changed the way the driver makes sure both the memory
>> and system-register timers have been probed before finalizing the probing.
>>
>> There is an interesting flaw in this logic that leads to this final step
>> never being executed. Things seem to work pretty well until something
>> actually needs the data that is produced during this final stage.
>>
>> For example, KVM explodes on the first run of a guest when executed on
>> a platform that has both memory and sysreg nodes (Juno, for example).
>
> As far as I can tell, the logic is flawed for all cases except two
> functional nodes that we manage to probe.
>

Agreed, it's my mistake. I inverted the logic incorrectly when I moved
it to a function while adding of_node_put in v2 of the patch.

I think the wrong DTB got picked up when I tested this. I am sorry for that.

>>
>> Just fix the damned logic, and enjoy booting VMs again.
>>
>> Tested on a Juno system.
>>
>> Cc: Sudeep Holla <sudeep.holla@arm.com>
>> Cc: Stephen Boyd <sboyd@codeaurora.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>> Reported-by: Riku Voipio <riku.voipio@linaro.org>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>
> The new logic makes sense to me, so:
>
> Acked-by: Mark Rutland <mark.rutland@arm.com>

Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
(This time I tested all possible cases)

Regards,
Sudeep
Daniel Lezcano Oct. 21, 2014, 10:58 a.m. UTC | #3
On 10/15/2014 05:06 PM, Marc Zyngier wrote:
> Commit c387f07e6205 (clocksource: arm_arch_timer: Discard unavailable
> timers correctly) changed the way the driver makes sure both the memory
> and system-register timers have been probed before finalizing the probing.
>
> There is an interesting flaw in this logic that leads to this final step
> never being executed. Things seem to work pretty well until something
> actually needs the data that is produced during this final stage.
>
> For example, KVM explodes on the first run of a guest when executed on
> a platform that has both memory and sysreg nodes (Juno, for example).
>
> Just fix the damned logic, and enjoy booting VMs again.
>
> Tested on a Juno system.
>
> Cc: Sudeep Holla <sudeep.holla@arm.com>
> Cc: Stephen Boyd <sboyd@codeaurora.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> Reported-by: Riku Voipio <riku.voipio@linaro.org>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---

Applied to my tree. Also for -next.

Thanks !

   -- Daniel

>   drivers/clocksource/arm_arch_timer.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index 2133f9d..43005d4 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -660,11 +660,11 @@ static bool __init
>   arch_timer_probed(int type, const struct of_device_id *matches)
>   {
>   	struct device_node *dn;
> -	bool probed = false;
> +	bool probed = true;
>
>   	dn = of_find_matching_node(NULL, matches);
> -	if (dn && of_device_is_available(dn) && (arch_timers_present & type))
> -		probed = true;
> +	if (dn && of_device_is_available(dn) && !(arch_timers_present & type))
> +		probed = false;
>   	of_node_put(dn);
>
>   	return probed;
>
Mark Rutland Oct. 21, 2014, 11:07 a.m. UTC | #4
Hi Daniel,

On Tue, Oct 21, 2014 at 11:58:29AM +0100, Daniel Lezcano wrote:
> On 10/15/2014 05:06 PM, Marc Zyngier wrote:
> > Commit c387f07e6205 (clocksource: arm_arch_timer: Discard unavailable
> > timers correctly) changed the way the driver makes sure both the memory
> > and system-register timers have been probed before finalizing the probing.
> >
> > There is an interesting flaw in this logic that leads to this final step
> > never being executed. Things seem to work pretty well until something
> > actually needs the data that is produced during this final stage.
> >
> > For example, KVM explodes on the first run of a guest when executed on
> > a platform that has both memory and sysreg nodes (Juno, for example).
> >
> > Just fix the damned logic, and enjoy booting VMs again.
> >
> > Tested on a Juno system.
> >
> > Cc: Sudeep Holla <sudeep.holla@arm.com>
> > Cc: Stephen Boyd <sboyd@codeaurora.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> > Cc: Christoffer Dall <christoffer.dall@linaro.org>
> > Reported-by: Riku Voipio <riku.voipio@linaro.org>
> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > ---
> 
> Applied to my tree. Also for -next.

Is this going to hit -rc2? This was a regression introduced in -rc1.

Without this fix we've also lost our high precision sched_clock on arm64
platforms.

Thanks,
Mark.

> 
> Thanks !
> 
>    -- Daniel
> 
> >   drivers/clocksource/arm_arch_timer.c | 6 +++---
> >   1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> > index 2133f9d..43005d4 100644
> > --- a/drivers/clocksource/arm_arch_timer.c
> > +++ b/drivers/clocksource/arm_arch_timer.c
> > @@ -660,11 +660,11 @@ static bool __init
> >   arch_timer_probed(int type, const struct of_device_id *matches)
> >   {
> >   	struct device_node *dn;
> > -	bool probed = false;
> > +	bool probed = true;
> >
> >   	dn = of_find_matching_node(NULL, matches);
> > -	if (dn && of_device_is_available(dn) && (arch_timers_present & type))
> > -		probed = true;
> > +	if (dn && of_device_is_available(dn) && !(arch_timers_present & type))
> > +		probed = false;
> >   	of_node_put(dn);
> >
> >   	return probed;
> >
Daniel Lezcano Oct. 21, 2014, 11:12 a.m. UTC | #5
On 10/21/2014 01:07 PM, Mark Rutland wrote:
> Hi Daniel,
>
> On Tue, Oct 21, 2014 at 11:58:29AM +0100, Daniel Lezcano wrote:
>> On 10/15/2014 05:06 PM, Marc Zyngier wrote:
>>> Commit c387f07e6205 (clocksource: arm_arch_timer: Discard unavailable
>>> timers correctly) changed the way the driver makes sure both the memory
>>> and system-register timers have been probed before finalizing the probing.
>>>
>>> There is an interesting flaw in this logic that leads to this final step
>>> never being executed. Things seem to work pretty well until something
>>> actually needs the data that is produced during this final stage.
>>>
>>> For example, KVM explodes on the first run of a guest when executed on
>>> a platform that has both memory and sysreg nodes (Juno, for example).
>>>
>>> Just fix the damned logic, and enjoy booting VMs again.
>>>
>>> Tested on a Juno system.
>>>
>>> Cc: Sudeep Holla <sudeep.holla@arm.com>
>>> Cc: Stephen Boyd <sboyd@codeaurora.org>
>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
>>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>>> Reported-by: Riku Voipio <riku.voipio@linaro.org>
>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>> ---
>>
>> Applied to my tree. Also for -next.
>
> Is this going to hit -rc2? This was a regression introduced in -rc1.
>
> Without this fix we've also lost our high precision sched_clock on arm64
> platforms.
>

Sure.

Thomas or Ingo,

is it possible to update the tip/urgent branch, so I can send the fixes 
against 3.18-rc1 ?

Thanks in advance

   -- Daniel


>>>    drivers/clocksource/arm_arch_timer.c | 6 +++---
>>>    1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
>>> index 2133f9d..43005d4 100644
>>> --- a/drivers/clocksource/arm_arch_timer.c
>>> +++ b/drivers/clocksource/arm_arch_timer.c
>>> @@ -660,11 +660,11 @@ static bool __init
>>>    arch_timer_probed(int type, const struct of_device_id *matches)
>>>    {
>>>    	struct device_node *dn;
>>> -	bool probed = false;
>>> +	bool probed = true;
>>>
>>>    	dn = of_find_matching_node(NULL, matches);
>>> -	if (dn && of_device_is_available(dn) && (arch_timers_present & type))
>>> -		probed = true;
>>> +	if (dn && of_device_is_available(dn) && !(arch_timers_present & type))
>>> +		probed = false;
>>>    	of_node_put(dn);
>>>
>>>    	return probed;
>>>
Marc Zyngier Oct. 27, 2014, 10:33 a.m. UTC | #6
Hi Daniel,

On 21/10/14 12:12, Daniel Lezcano wrote:
> On 10/21/2014 01:07 PM, Mark Rutland wrote:
>> Hi Daniel,
>>
>> On Tue, Oct 21, 2014 at 11:58:29AM +0100, Daniel Lezcano wrote:
>>> On 10/15/2014 05:06 PM, Marc Zyngier wrote:
>>>> Commit c387f07e6205 (clocksource: arm_arch_timer: Discard unavailable
>>>> timers correctly) changed the way the driver makes sure both the memory
>>>> and system-register timers have been probed before finalizing the probing.
>>>>
>>>> There is an interesting flaw in this logic that leads to this final step
>>>> never being executed. Things seem to work pretty well until something
>>>> actually needs the data that is produced during this final stage.
>>>>
>>>> For example, KVM explodes on the first run of a guest when executed on
>>>> a platform that has both memory and sysreg nodes (Juno, for example).
>>>>
>>>> Just fix the damned logic, and enjoy booting VMs again.
>>>>
>>>> Tested on a Juno system.
>>>>
>>>> Cc: Sudeep Holla <sudeep.holla@arm.com>
>>>> Cc: Stephen Boyd <sboyd@codeaurora.org>
>>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>>> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
>>>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>>>> Reported-by: Riku Voipio <riku.voipio@linaro.org>
>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>> ---
>>>
>>> Applied to my tree. Also for -next.
>>
>> Is this going to hit -rc2? This was a regression introduced in -rc1.
>>
>> Without this fix we've also lost our high precision sched_clock on arm64
>> platforms.
>>
> 
> Sure.
> 
> Thomas or Ingo,
> 
> is it possible to update the tip/urgent branch, so I can send the fixes 
> against 3.18-rc1 ?

Any update on this? -rc2 has landed, but arm/arm64 timers are still in
rather bad shape. Can this please be merged as an urgent fix?

Thanks,

	M.
Thomas Gleixner Oct. 27, 2014, 8:30 p.m. UTC | #7
On Mon, 27 Oct 2014, Marc Zyngier wrote:
> Hi Daniel,
> 
> On 21/10/14 12:12, Daniel Lezcano wrote:
> > On 10/21/2014 01:07 PM, Mark Rutland wrote:
> >> Hi Daniel,
> >>
> >> On Tue, Oct 21, 2014 at 11:58:29AM +0100, Daniel Lezcano wrote:
> >>> On 10/15/2014 05:06 PM, Marc Zyngier wrote:
> >>>> Commit c387f07e6205 (clocksource: arm_arch_timer: Discard unavailable
> >>>> timers correctly) changed the way the driver makes sure both the memory
> >>>> and system-register timers have been probed before finalizing the probing.
> >>>>
> >>>> There is an interesting flaw in this logic that leads to this final step
> >>>> never being executed. Things seem to work pretty well until something
> >>>> actually needs the data that is produced during this final stage.
> >>>>
> >>>> For example, KVM explodes on the first run of a guest when executed on
> >>>> a platform that has both memory and sysreg nodes (Juno, for example).
> >>>>
> >>>> Just fix the damned logic, and enjoy booting VMs again.
> >>>>
> >>>> Tested on a Juno system.
> >>>>
> >>>> Cc: Sudeep Holla <sudeep.holla@arm.com>
> >>>> Cc: Stephen Boyd <sboyd@codeaurora.org>
> >>>> Cc: Mark Rutland <mark.rutland@arm.com>
> >>>> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> >>>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> >>>> Reported-by: Riku Voipio <riku.voipio@linaro.org>
> >>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >>>> ---
> >>>
> >>> Applied to my tree. Also for -next.
> >>
> >> Is this going to hit -rc2? This was a regression introduced in -rc1.
> >>
> >> Without this fix we've also lost our high precision sched_clock on arm64
> >> platforms.
> >>
> > 
> > Sure.
> > 
> > Thomas or Ingo,
> > 
> > is it possible to update the tip/urgent branch, so I can send the fixes 
> > against 3.18-rc1 ?
> 
> Any update on this? -rc2 has landed, but arm/arm64 timers are still in
> rather bad shape. Can this please be merged as an urgent fix?

Daniel, timers/urgent is on rc1 already. Please send your pull request.

Thanks,

	tglx

Patch

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 2133f9d..43005d4 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -660,11 +660,11 @@  static bool __init
 arch_timer_probed(int type, const struct of_device_id *matches)
 {
 	struct device_node *dn;
-	bool probed = false;
+	bool probed = true;
 
 	dn = of_find_matching_node(NULL, matches);
-	if (dn && of_device_is_available(dn) && (arch_timers_present & type))
-		probed = true;
+	if (dn && of_device_is_available(dn) && !(arch_timers_present & type))
+		probed = false;
 	of_node_put(dn);
 
 	return probed;