Message ID | 20141121123559.GF23177@e105550-lin.cambridge.arm.com |
---|---|
State | New |
On 21 November 2014 at 13:35, Morten Rasmussen <morten.rasmussen@arm.com> wrote:
> On Mon, Nov 03, 2014 at 04:54:42PM +0000, Vincent Guittot wrote:
>
> [snip]
>
>> The average running time of RT tasks is used to estimate the remaining compute
>> @@ -5801,19 +5801,12 @@ static unsigned long scale_rt_capacity(int cpu)
>>
>>  	total = sched_avg_period() + delta;
>>
>> -	if (unlikely(total < avg)) {
>> -		/* Ensures that capacity won't end up being negative */
>> -		available = 0;
>> -	} else {
>> -		available = total - avg;
>> -	}
>> +	used = div_u64(avg, total);
>
> I haven't looked through all the details of the rt avg tracking, but if
> 'used' is in the range [0..SCHED_CAPACITY_SCALE], I believe it should
> work. Is it guaranteed that total > 0 so we don't get division by zero?

static inline u64 sched_avg_period(void)
{
	return (u64)sysctl_sched_time_avg * NSEC_PER_MSEC / 2;
}

>
> It does get slightly more complicated if we want to figure out the
> available capacity at the current frequency (current < max) later. Say,
> rt eats 25% of the compute capacity, but the current frequency is only
> 50%. In that case we get:
>
> curr_avail_capacity = (arch_scale_cpu_capacity() *
>   (arch_scale_freq_capacity() - (SCHED_CAPACITY_SCALE - scale_rt_capacity())))
>   >> SCHED_CAPACITY_SHIFT

You don't have to be so complicated but simply need to do:
curr_avail_capacity for CFS = (capacity_of(CPU) *
arch_scale_freq_capacity()) >> SCHED_CAPACITY_SHIFT

capacity_of(CPU) = 600 is the max available capacity for CFS tasks
once we have removed the 25% of capacity that is used by RT tasks
arch_scale_freq_capacity = 512 because we currently run at 50% of max freq

so curr_avail_capacity for CFS = 300

Vincent

>
> With numbers assuming arch_scale_cpu_capacity() = 800:
>
> curr_avail_capacity = 800 * (512 - (1024 - 768)) >> 10 = 200
>
> Which isn't actually that bad. Anyway, it isn't needed until we start
> involving energy models.
>
>>
>> -	if (unlikely((s64)total < SCHED_CAPACITY_SCALE))
>> -		total = SCHED_CAPACITY_SCALE;
>> +	if (likely(used < SCHED_CAPACITY_SCALE))
>> +		return SCHED_CAPACITY_SCALE - used;
>>
>> -	total >>= SCHED_CAPACITY_SHIFT;
>> -
>> -	return div_u64(available, total);
>> +	return 1;
>>  }
>> --
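For readers who want to try the arithmetic of the patched scale_rt_capacity() tail outside the kernel, here is a minimal userspace sketch. It is not the kernel code: the function names ending in _sketch are hypothetical, plain 64-bit division stands in for div_u64(), the sysctl value is passed as a parameter, and avg is assumed to be already scaled by SCHED_CAPACITY_SCALE (which is what Morten's "range [0..SCHED_CAPACITY_SCALE]" remark presupposes). The pasted sched_avg_period() helper suggests that total stays positive as long as sysctl_sched_time_avg is non-zero.

```c
#include <stdint.h>

#define NSEC_PER_MSEC        1000000ULL
#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)

/*
 * Mirrors the sched_avg_period() helper pasted in the thread.
 * Assumption: the sysctl is passed in as a parameter (milliseconds)
 * instead of being read from a kernel global.
 */
static uint64_t sched_avg_period_sketch(unsigned int time_avg_ms)
{
	return (uint64_t)time_avg_ms * NSEC_PER_MSEC / 2;
}

/*
 * Hypothetical stand-in for the patched tail of scale_rt_capacity().
 * Assumption: avg is pre-scaled by SCHED_CAPACITY_SCALE, so avg / total
 * is the RT share expressed in capacity units. total must be non-zero.
 */
static unsigned long rt_capacity_sketch(uint64_t avg, uint64_t total)
{
	uint64_t used = avg / total;	/* div_u64(avg, total) in the patch */

	if (used < SCHED_CAPACITY_SCALE)
		return SCHED_CAPACITY_SCALE - (unsigned long)used;

	/* RT consumed everything: report 1 rather than 0, as the patch does */
	return 1;
}
```

With an illustrative period of sched_avg_period_sketch(1000) = 500000000 ns and avg = 256 * 500000000, used is 256 and rt_capacity_sketch() returns 768, i.e. 75% of SCHED_CAPACITY_SCALE.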
On 24 November 2014 at 18:05, Morten Rasmussen <morten.rasmussen@arm.com> wrote:
> On Mon, Nov 24, 2014 at 02:24:00PM +0000, Vincent Guittot wrote:
>> On 21 November 2014 at 13:35, Morten Rasmussen <morten.rasmussen@arm.com> wrote:
>> > On Mon, Nov 03, 2014 at 04:54:42PM +0000, Vincent Guittot wrote:
>>
>> [snip]
>>
>> >> The average running time of RT tasks is used to estimate the remaining compute
>> >> @@ -5801,19 +5801,12 @@ static unsigned long scale_rt_capacity(int cpu)
>> >>
>> >>  	total = sched_avg_period() + delta;
>> >>
>> >> -	if (unlikely(total < avg)) {
>> >> -		/* Ensures that capacity won't end up being negative */
>> >> -		available = 0;
>> >> -	} else {
>> >> -		available = total - avg;
>> >> -	}
>> >> +	used = div_u64(avg, total);
>> >
>> > I haven't looked through all the details of the rt avg tracking, but if
>> > 'used' is in the range [0..SCHED_CAPACITY_SCALE], I believe it should
>> > work. Is it guaranteed that total > 0 so we don't get division by zero?
>>
>> static inline u64 sched_avg_period(void)
>> {
>> 	return (u64)sysctl_sched_time_avg * NSEC_PER_MSEC / 2;
>> }
>>
>
> I see.
>
>> >
>> > It does get slightly more complicated if we want to figure out the
>> > available capacity at the current frequency (current < max) later. Say,
>> > rt eats 25% of the compute capacity, but the current frequency is only
>> > 50%. In that case we get:
>> >
>> > curr_avail_capacity = (arch_scale_cpu_capacity() *
>> >   (arch_scale_freq_capacity() - (SCHED_CAPACITY_SCALE - scale_rt_capacity())))
>> >   >> SCHED_CAPACITY_SHIFT
>>
>> You don't have to be so complicated but simply need to do:
>> curr_avail_capacity for CFS = (capacity_of(CPU) *
>> arch_scale_freq_capacity()) >> SCHED_CAPACITY_SHIFT
>>
>> capacity_of(CPU) = 600 is the max available capacity for CFS tasks
>> once we have removed the 25% of capacity that is used by RT tasks
>> arch_scale_freq_capacity = 512 because we currently run at 50% of max freq
>>
>> so curr_avail_capacity for CFS = 300
>
> I don't think that is correct. It is at least not what I had in mind.
>
> capacity_orig_of(cpu) = 800, we run at 50% frequency which means:
>
> curr_capacity = capacity_orig_of(cpu) * arch_scale_freq_capacity()
>                 >> SCHED_CAPACITY_SHIFT
>               = 400
>
> So the total capacity at the current frequency (50%) is 400, without
> considering RT. scale_rt_capacity() is frequency invariant, so it takes
> away capacity_orig_of(cpu) - capacity_of(cpu) = 200 worth of capacity
> for RT. We need to subtract that from the current capacity to get the
> available capacity at the current frequency.
>
> curr_available_capacity = curr_capacity - (capacity_orig_of(cpu) -
>                 capacity_of(cpu)) = 200

You're right, this one looks good to me too.

>
> In other words, 800 is the max capacity, we are currently running at 50%
> frequency, which gives us 400. RT takes away 25% of 800
> (frequency-invariant) from the 400, which leaves us with 200 left for
> CFS tasks at the current frequency.
>
> In your calculations you subtract the RT load before computing the
> current capacity using arch_scale_freq_capacity(), where I think it
> should be done after. You find the amount of spare capacity you would have
> at the maximum frequency when RT has been subtracted and then scale the
> result by frequency, which means indirectly scaling the RT load
> contribution again (the rt avg has already been scaled). So instead of
> taking away 200 of the 400 (current capacity @ 50% frequency), it only
> takes away 100, which isn't right.
>
> scale_rt_capacity() is frequency-invariant, so if the RT load is 50% and
> the frequency is 50%, there are no spare cycles left.
> curr_avail_capacity should be 0. But using your expression above you
> would get capacity_of(cpu) = 400 after removing RT,
> arch_scale_freq_capacity = 512 and you get 200. I don't think that is
> right.
>
> Morten
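The two orderings argued over in this exchange can be compared numerically with a small userspace sketch. The function names below are hypothetical and exist only for illustration. The inputs correspond to capacity_orig_of(cpu), capacity_of(cpu) and arch_scale_freq_capacity() as used in the thread. Clamping the result at zero is an added assumption to keep the unsigned arithmetic from wrapping; the thread does not discuss it.

```c
#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)

/*
 * Ordering agreed on at the end of the thread: scale the original
 * capacity by the current frequency first, then subtract the
 * frequency-invariant RT share (cap_orig - cap).
 * Assumption: clamp at 0 so the unsigned subtraction cannot wrap.
 */
static unsigned long avail_subtract_rt_after(unsigned long cap_orig,
					     unsigned long cap,
					     unsigned long freq_scale)
{
	unsigned long curr = (cap_orig * freq_scale) >> SCHED_CAPACITY_SHIFT;
	unsigned long rt = cap_orig - cap;

	return curr > rt ? curr - rt : 0;
}

/*
 * Earlier suggestion: start from the capacity with RT already removed
 * and scale that by frequency, which scales the RT share a second time.
 */
static unsigned long avail_scale_after_rt(unsigned long cap,
					  unsigned long freq_scale)
{
	return (cap * freq_scale) >> SCHED_CAPACITY_SHIFT;
}
```

With the thread's numbers (original capacity 800, frequency scale 512), 25% RT load gives capacity_of = 600, so avail_subtract_rt_after(800, 600, 512) yields 200 while avail_scale_after_rt(600, 512) yields 300. With 50% RT load (capacity_of = 400) the results are 0 and 200 respectively, matching the figures in Morten's reply.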
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6fd5ac6..921b174 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2277,8 +2277,6 @@ static u32 __compute_runnable_contrib(u64 n)
 	return contrib + runnable_avg_yN_sum[n];
 }
 
-unsigned long __weak arch_scale_freq_capacity(struct sched_domain *sd, int cpu);
-
 /*
  * We can represent the historical contribution to runnable average as