Message ID | 1531151136-18297-2-git-send-email-sudeep.holla@arm.com |
---|---|
State | Accepted |
Commit | 5e18e412973d6bb1804de1d4d30a891c774b006e |
Headers | show |
Series | [1/2] Revert "tick: Prefer a lower rating device only if it's CPU local device" | expand |
On Mon, 9 Jul 2018, Sudeep Holla wrote: > Currently, arch_mem_timer cpumask is set to cpu_all_mask which should be > fine. However, cpu_possible_mask is more accurate and if there are other > clockevent source in the system which are set to cpu_possible_mask, then > having cpu_all_mask may result in issue. > > E.g. on a platform with arm,sp804 timer with rating 300 and > cpu_possible_mask and this arch_mem_timer timer with rating 400 and > cpu_all_mask, tick_check_preferred may choose both preferred as the > cpumasks are not equal though they must be. > > This issue was root caused incorrectly initially and a fix was merged as > commit 1332a9055801 ("tick: Prefer a lower rating device only if it's CPU > local device"). To avoid that in the future we really should fix the decision logic to mask out the non possible CPUs from the supplied masks. Thanks, tgkx
On 09/07/18 23:09, Thomas Gleixner wrote: > On Mon, 9 Jul 2018, Sudeep Holla wrote: > >> Currently, arch_mem_timer cpumask is set to cpu_all_mask which should be >> fine. However, cpu_possible_mask is more accurate and if there are other >> clockevent source in the system which are set to cpu_possible_mask, then >> having cpu_all_mask may result in issue. >> >> E.g. on a platform with arm,sp804 timer with rating 300 and >> cpu_possible_mask and this arch_mem_timer timer with rating 400 and >> cpu_all_mask, tick_check_preferred may choose both preferred as the >> cpumasks are not equal though they must be. >> >> This issue was root caused incorrectly initially and a fix was merged as >> commit 1332a9055801 ("tick: Prefer a lower rating device only if it's CPU >> local device"). > > To avoid that in the future we really should fix the decision logic to mask > out the non possible CPUs from the supplied masks. Sure, I can do that. But do you want that as a fix for v4.18 ? I think we may need this check at another place in tick_setup_device 213 /* 214 * When the device is not per cpu, pin the interrupt to the 215 * current cpu: 216 */ 217 if (!cpumask_equal(newdev->cpumask, cpumask)) 218 irq_set_affinity(newdev->irq, cpumask); Does it make sense trim dev->cpumask when registering the clockevents device itself instead of adding check at place where this cpumask can be used ? So that any future user of those masks need not have to take care of that. Also only few ARM clocksource drivers use cpu_all_mask which could be result of copy-paste, we can even fix them too. arm_arch_timer.c: clk->cpumask = cpu_all_mask; tegra20_timer.c: tegra_clockevent.cpumask = cpu_all_mask; timer-atcpit100.c: .cpumask = cpu_all_mask, timer-keystone.c: event_dev->cpumask = cpu_all_mask; zevio-timer.c: timer->clkevt.cpumask = cpu_all_mask; -- Regards, Sudeep
On Tue, 10 Jul 2018, Sudeep Holla wrote: > > > On 09/07/18 23:09, Thomas Gleixner wrote: > > On Mon, 9 Jul 2018, Sudeep Holla wrote: > > > >> Currently, arch_mem_timer cpumask is set to cpu_all_mask which should be > >> fine. However, cpu_possible_mask is more accurate and if there are other > >> clockevent source in the system which are set to cpu_possible_mask, then > >> having cpu_all_mask may result in issue. > >> > >> E.g. on a platform with arm,sp804 timer with rating 300 and > >> cpu_possible_mask and this arch_mem_timer timer with rating 400 and > >> cpu_all_mask, tick_check_preferred may choose both preferred as the > >> cpumasks are not equal though they must be. > >> > >> This issue was root caused incorrectly initially and a fix was merged as > >> commit 1332a9055801 ("tick: Prefer a lower rating device only if it's CPU > >> local device"). > > > > To avoid that in the future we really should fix the decision logic to mask > > out the non possible CPUs from the supplied masks. > > Sure, I can do that. But do you want that as a fix for v4.18 ? No, that's for 4.19 > I think we may need this check at another place in tick_setup_device > > 213 /* > 214 * When the device is not per cpu, pin the interrupt to the > 215 * current cpu: > 216 */ > 217 if (!cpumask_equal(newdev->cpumask, cpumask)) > 218 irq_set_affinity(newdev->irq, cpumask); > > Does it make sense trim dev->cpumask when registering the clockevents > device itself instead of adding check at place where this cpumask > can be used ? So that any future user of those masks need not have to > take care of that. The problem is you cannot trim it because cpu_all_mask is global and const. > Also only few ARM clocksource drivers use cpu_all_mask which could be > result of copy-paste, we can even fix them too. > > arm_arch_timer.c: clk->cpumask = cpu_all_mask; > tegra20_timer.c: tegra_clockevent.cpumask = cpu_all_mask; > timer-atcpit100.c: .cpumask = cpu_all_mask, > timer-keystone.c: event_dev->cpumask = cpu_all_mask; > zevio-timer.c: timer->clkevt.cpumask = cpu_all_mask; Yes, that makes sense. What we could do is warn, when cpu_all_mask is set at registration time and replace the pointer with cpu_possible_mask. Thanks, tglx
On 10/07/18 13:21, Thomas Gleixner wrote: > On Tue, 10 Jul 2018, Sudeep Holla wrote: > >> >> >> On 09/07/18 23:09, Thomas Gleixner wrote: >>> On Mon, 9 Jul 2018, Sudeep Holla wrote: >>> >>>> Currently, arch_mem_timer cpumask is set to cpu_all_mask which should be >>>> fine. However, cpu_possible_mask is more accurate and if there are other >>>> clockevent source in the system which are set to cpu_possible_mask, then >>>> having cpu_all_mask may result in issue. >>>> >>>> E.g. on a platform with arm,sp804 timer with rating 300 and >>>> cpu_possible_mask and this arch_mem_timer timer with rating 400 and >>>> cpu_all_mask, tick_check_preferred may choose both preferred as the >>>> cpumasks are not equal though they must be. >>>> >>>> This issue was root caused incorrectly initially and a fix was merged as >>>> commit 1332a9055801 ("tick: Prefer a lower rating device only if it's CPU >>>> local device"). >>> >>> To avoid that in the future we really should fix the decision logic to mask >>> out the non possible CPUs from the supplied masks. >> >> Sure, I can do that. But do you want that as a fix for v4.18 ? > > No, that's for 4.19 > Ah OK, I assume you are fine with this patch as fix for v4.18. >> I think we may need this check at another place in tick_setup_device >> >> 213 /* >> 214 * When the device is not per cpu, pin the interrupt to the >> 215 * current cpu: >> 216 */ >> 217 if (!cpumask_equal(newdev->cpumask, cpumask)) >> 218 irq_set_affinity(newdev->irq, cpumask); >> >> Does it make sense trim dev->cpumask when registering the clockevents >> device itself instead of adding check at place where this cpumask >> can be used ? So that any future user of those masks need not have to >> take care of that. > > The problem is you cannot trim it because cpu_all_mask is global and const. > Indeed, again forgot it just a pointer to the global one, duh! >> Also only few ARM clocksource drivers use cpu_all_mask which could be >> result of copy-paste, we can even fix them too. >> >> arm_arch_timer.c: clk->cpumask = cpu_all_mask; >> tegra20_timer.c: tegra_clockevent.cpumask = cpu_all_mask; >> timer-atcpit100.c: .cpumask = cpu_all_mask, >> timer-keystone.c: event_dev->cpumask = cpu_all_mask; >> zevio-timer.c: timer->clkevt.cpumask = cpu_all_mask; > > Yes, that makes sense. What we could do is warn, when cpu_all_mask is set > at registration time and replace the pointer with cpu_possible_mask. > I like this approach than having to bitwise and with cpu_possible_mask at all the necessary place. I will cook up a patch. -- Regards, Sudeep
On Tue, 10 Jul 2018, Sudeep Holla wrote: > On 10/07/18 13:21, Thomas Gleixner wrote: > >> Also only few ARM clocksource drivers use cpu_all_mask which could be > >> result of copy-paste, we can even fix them too. > >> > >> arm_arch_timer.c: clk->cpumask = cpu_all_mask; > >> tegra20_timer.c: tegra_clockevent.cpumask = cpu_all_mask; > >> timer-atcpit100.c: .cpumask = cpu_all_mask, > >> timer-keystone.c: event_dev->cpumask = cpu_all_mask; > >> zevio-timer.c: timer->clkevt.cpumask = cpu_all_mask; > > > > Yes, that makes sense. What we could do is warn, when cpu_all_mask is set > > at registration time and replace the pointer with cpu_possible_mask. > > > I like this approach than having to bitwise and with cpu_possible_mask > at all the necessary place. I will cook up a patch. Appreciated. I'm inclined to take it for 4.18 even. Please send it along with the fixes for the above obvious failure spots. Thanks, tglx
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c index 57cb2f00fc07..d8c7f5750cdb 100644 --- a/drivers/clocksource/arm_arch_timer.c +++ b/drivers/clocksource/arm_arch_timer.c @@ -735,7 +735,7 @@ static void __arch_timer_setup(unsigned type, clk->features |= CLOCK_EVT_FEAT_DYNIRQ; clk->name = "arch_mem_timer"; clk->rating = 400; - clk->cpumask = cpu_all_mask; + clk->cpumask = cpu_possible_mask; if (arch_timer_mem_use_virtual) { clk->set_state_shutdown = arch_timer_shutdown_virt_mem; clk->set_state_oneshot_stopped = arch_timer_shutdown_virt_mem;
Currently, arch_mem_timer cpumask is set to cpu_all_mask which should be fine. However, cpu_possible_mask is more accurate and if there are other clockevent source in the system which are set to cpu_possible_mask, then having cpu_all_mask may result in issue. E.g. on a platform with arm,sp804 timer with rating 300 and cpu_possible_mask and this arch_mem_timer timer with rating 400 and cpu_all_mask, tick_check_preferred may choose both preferred as the cpumasks are not equal though they must be. This issue was root caused incorrectly initially and a fix was merged as commit 1332a9055801 ("tick: Prefer a lower rating device only if it's CPU local device"). Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> --- drivers/clocksource/arm_arch_timer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Hi, There are few more drivers that set cpu_all_mask, should they also be changed ? It should be fine as long as it's not compared against someother non per-CPU timer with shorter cpumask. Regards, Sudeep -- 2.7.4