Message ID | CAK+oA4WFAaT4DmR=8nKqTsyKe9imDYCu5p6DG4kqbhoBPD=LXw@mail.gmail.com |
---|---|
State | New |
Headers | show |
On 14 November 2016 at 04:50, Piotr Gregor <piotrgregor@rsyncme.org> wrote: > There was an error in previous patch. Please find attached corrected patch. > > cheers, > Piotr Piort, I am fairly new to this mailing list but if it is anything like the kernel's, you will want to send your patch using git send-email. It is probably a good idea to run checkpatch.pl on it too - that way you'll avoid all sort of style problems like lines over 80 characters long. Mathieu -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Mathieu, I am newer most likely - many thanks for your advise. cheers, Piotr -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 14 Nov 2016 16:11:58 +0000 Piotr Gregor <piotrgregor@rsyncme.org> wrote: > Hi Mathieu, > > I am newer most likely - many thanks for your advise. > > cheers, > Piotr > -- Luckily this list is *much* lower traffic than LKML and we're not quite as strict as they are, since we don't have the volume of patches to deal with. That being said, very good advice and I'll make a note to follow it :) Piotr, John Kacur or I will look at adding in your patch to cyclictest shortly. Thanks for the submission. -- United States Coast Guard Ruining Natural Selection since 1790
Please inline the patches There are 1000000 microseconds / second, the default interval is 1000us, but let's say we did it more frequently at 100us 1000000 us / 100 intervals = 10000 samples / second * 60 sec/min * 60 min/hr * 24 hrs/day * 365 days / year * 1000 (very pessimistic diff) = an accumulated diff of 3.1536×10^14 With a max double of 9x10^15/3.15*10^14 we could run for 30 years before we move into normal numbers. So, the problem with your patch, is not just the problem which it proposes to fix, is one I see as mostly theoretical, but there is the potential for accumulating small inaccuracies. Of course, there is always the chance that I made some kind of error in my analysis here, please feel free to point it out! What might be more interesting is how long would it take for the cycles to overflow? Thanks John On Mon, 14 Nov 2016, Piotr Gregor wrote: > There was an error in previous patch. Please find attached corrected patch. > > cheers, > Piotr >
Hi John, Clark, Recently I observed jitter of > 100 000 during tests of virtualized system. In that case 30 years would change into 0.3 year which is not as much. The jitter depends on the hardware/software you test, what if someone deliberately wishes to measure jitter of more than this or he is just out of luck and gets huge values? Also, the limit will change if the type of average changes, but if it is calculated in the way I proposed, it will always work. Apart from this, since the benign possibility exists - why not to fix this if the fix is easy. cheers, Piotr -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Piotr, please explain what you believe to be the root cause? > On Nov 19, 2016, at 5:27 PM, Piotr Gregor <piotrgregor@rsyncme.org> wrote: > > Hi John, Clark, > > Recently I observed jitter of > 100 000 during tests of virtualized system. > In that case 30 years would change into 0.3 year which is not as much. > The jitter depends on the hardware/software you test, what if someone > deliberately wishes to measure jitter of more than this or he is just > out of luck and gets huge values? > Also, the limit will change if the type of average changes, but if it > is calculated in the way I proposed, it will always work. > Apart from this, since the benign possibility exists - why not to fix this > if the fix is easy. > > cheers, > Piotr > -- > To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, 19 Nov 2016, Piotr Gregor wrote: > Hi John, Clark, > > Recently I observed jitter of > 100 000 during tests of virtualized system. > In that case 30 years would change into 0.3 year which is not as much. > The jitter depends on the hardware/software you test, what if someone > deliberately wishes to measure jitter of more than this or he is just > out of luck and gets huge values? > Also, the limit will change if the type of average changes, but if it > is calculated in the way I proposed, it will always work. > Apart from this, since the benign possibility exists - why not to fix this > if the fix is easy. > I already explained why. Everytime you do a floating point long division you are potentially accumulating small errors. Something is terribly wrong if you are getting a diff of greater than 100,000, if that really is true you need to fix it right away instead of run cyclictest for a long period of time. John> -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
----- Original Message ----- > John, > > Latency of 100,000 seems to me to be a bug when doing floating point long > division, which potentially accumulates small errors. > > Is the issue in the cyclictest calculation or in the kernel? Or, the FPU? > > If in the kernel, where do we need to trace to check for increased latency > when doing floating point long division? > > Piotr, needs to trace where the latency is occurring and fix. Any > recommendations on a possible fix at this stage? > > Thx, Tracy > Sure, you make some good points, it could be a cyclictest bug, it could be in the kernel, he also said he was doing this in a virtualized system, it could be somewhere in that stack. The first step of course is to see if you can come up with a way to reliably reproduce it. Thanks John -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From 5610e2b01e5302a5ecff7ca368198863a6ce9f78 Mon Sep 17 00:00:00 2001 From: Piotr Gregor <piotrgregor@rsyncme.org> Date: Fri, 11 Nov 2016 11:28:09 +0000 Subject: [PATCH] rt-tests: cyclictest: Calculate average latencies on the fly Prevents stat->avg variable from overflow. --- src/cyclictest/cyclictest.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/src/cyclictest/cyclictest.c b/src/cyclictest/cyclictest.c index 3f1bef1..0d2ddbc 100644 --- a/src/cyclictest/cyclictest.c +++ b/src/cyclictest/cyclictest.c @@ -1178,7 +1178,11 @@ static void *timerthread(void *param) if (refresh_on_max) pthread_cond_signal(&refresh_on_max_cond); } - stat->avg += (double) diff; + if (stat->cycles == 0) { + stat->avg = (double) diff; + } else { + stat->avg = (stat->avg + (double) diff / (double) stat->cycles) * ((double) stat->cycles / (double) (stat->cycles + 1)); + } if (trigger && (diff > trigger)) { trigger_update(par, diff, calctime(now)); @@ -2005,7 +2009,7 @@ static void print_hist(struct thread_param *par[], int nthreads) fprintf(fd, "# Avg Latencies:"); for (j = 0; j < nthreads; j++) fprintf(fd, " %05lu", par[j]->stats->cycles ? - (long)(par[j]->stats->avg/par[j]->stats->cycles) : 0); + (long)(par[j]->stats->avg) : 0); fprintf(fd, "\n"); fprintf(fd, "# Max Latencies:"); maxmax = 0; @@ -2059,7 +2063,7 @@ static void print_stat(FILE *fp, struct thread_param *par, int index, int verbos fprintf(fp, fmt, index, stat->tid, par->prio, par->interval, stat->cycles, stat->min, stat->act, stat->cycles ? - (long)(stat->avg/stat->cycles) : 0, stat->max); + (long)(stat->avg) : 0, stat->max); if (smi) fprintf(fp," SMI:%8ld", stat->smi_count); -- 2.7.4