Message ID | 1441228253-21225-2-git-send-email-ivan.khoronzhuk@linaro.org |
---|---|
State | New |
There are many example apps that actually want to measure execution time in CPU cycles, not in nsec. Time API and CPU (cycle) API updates are linked together. Real CPU cycle cases must not converted to use time API, but to use a new odp_cpu_cycle() count API first. Then what is left should be actual time API use cases. > -----Original Message----- > From: lng-odp [mailto:lng-odp-bounces@lists.linaro.org] On Behalf Of > ext Ivan Khoronzhuk > Sent: Thursday, September 03, 2015 12:11 AM > To: lng-odp@lists.linaro.org > Subject: [lng-odp] [odp-lng] [Patch v3 1/3] api: time: unbind CPU > cycles from time API > > Current time API supposes that frequency of counter is equal > to CPU frequency. But that's not always true, for instance, > in case if no access to CPU cycle counter, another hi-resolution > timer can be used, and it`s rate can be different from CPU > rate. There is no big difference in which cycles to measure > time, the better hi-resolution timer the better measurements. > So, unbind CPU cycle counter from time API by eliminating word > "cycle" as it's believed to be used with CPU. > > Also add new opaque type for time odp_time_t, as it asks user to use > API and abstracts time from units. New odp_time_t requires several > additional API functions to be added: > > odp_time_t odp_time_sum(odp_time_t t1, odp_time_t t2); > int odp_time_cmp(odp_time_t t1, odp_time_t t2); > uint64_t odp_time_to_u64(odp_time_t hdl); > > Also added new definition that represents 0 ticks for time - > ODP_TIME_NULL. It can be used instead of odp_time_from_ns(0) for > comparison and initialization. > > This patch only changes used time API, it doesn't change used var > names for simplicity. > > Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> > --- > > /* > diff --git a/test/performance/odp_scheduling.c > b/test/performance/odp_scheduling.c > index 2a7e531..5859460 100644 > --- a/test/performance/odp_scheduling.c > +++ b/test/performance/odp_scheduling.c > @@ -183,9 +183,10 @@ static int test_alloc_single(int thr, odp_pool_t > pool) > { > int i; > odp_buffer_t temp_buf; > - uint64_t t1, t2, cycles, ns; > + odp_time_t t1, t2, cycles; > + uint64_t ns; > > - t1 = odp_time_cycles(); > + t1 = odp_time(); > > for (i = 0; i < ALLOC_ROUNDS; i++) { > temp_buf = odp_buffer_alloc(pool); > @@ -198,12 +199,12 @@ static int test_alloc_single(int thr, odp_pool_t > pool) > odp_buffer_free(temp_buf); > } > > - t2 = odp_time_cycles(); > - cycles = odp_time_diff_cycles(t1, t2); > - ns = odp_time_cycles_to_ns(cycles); > + t2 = odp_time(); > + cycles = odp_time_diff(t1, t2); > + ns = odp_time_to_ns(cycles); > > printf(" [%i] alloc_sng alloc+free %"PRIu64" cycles, %"PRIu64" > ns\n", > - thr, cycles/ALLOC_ROUNDS, ns/ALLOC_ROUNDS); > + thr, odp_time_to_u64(cycles) / ALLOC_ROUNDS, ns / > ALLOC_ROUNDS); > > return 0; For example, this is really measuring average CPU cycles (with or without freq scaling). The ns conversion can be removed. -Petri
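Concretely, for the test_alloc_single() hunk quoted above, Petri's suggestion would drop the ns conversion and report cycles only. With the cycle-count API he refers to (odp_cpu_cycle() is just the name used in this thread, not an existing function), the body of that test would reduce to roughly:

	uint64_t c1, c2, cycles;

	c1 = odp_cpu_cycle();	/* proposed cycle-count API; name taken from this thread only */

	for (i = 0; i < ALLOC_ROUNDS; i++) {
		temp_buf = odp_buffer_alloc(pool);
		/* validity check of the original test elided */
		odp_buffer_free(temp_buf);
	}

	c2 = odp_cpu_cycle();
	cycles = c2 - c1;	/* counter wrap handling ignored in this sketch */

	printf(" [%i] alloc_sng alloc+free   %" PRIu64 " cycles\n",
	       thr, cycles / ALLOC_ROUNDS);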
On 03.09.15 15:29, Savolainen, Petri (Nokia - FI/Espoo) wrote: > > There are many example apps that actually want to measure execution time in CPU cycles, not in nsec. > Time API and CPU (cycle) API updates are linked together. Real CPU cycle cases must not converted to use time API, but to use a new odp_cpu_cycle() count API first. > Then what is left should be actual time API use cases. Let me look at CPU API first. I hesitate to answer because of freq scaling. I have filling that there is not so very well. And yes, probably I should move part of this changes in preparation series. > > >> -----Original Message----- >> From: lng-odp [mailto:lng-odp-bounces@lists.linaro.org] On Behalf Of >> ext Ivan Khoronzhuk >> Sent: Thursday, September 03, 2015 12:11 AM >> To: lng-odp@lists.linaro.org >> Subject: [lng-odp] [odp-lng] [Patch v3 1/3] api: time: unbind CPU >> cycles from time API >> >> Current time API supposes that frequency of counter is equal >> to CPU frequency. But that's not always true, for instance, >> in case if no access to CPU cycle counter, another hi-resolution >> timer can be used, and it`s rate can be different from CPU >> rate. There is no big difference in which cycles to measure >> time, the better hi-resolution timer the better measurements. >> So, unbind CPU cycle counter from time API by eliminating word >> "cycle" as it's believed to be used with CPU. >> >> Also add new opaque type for time odp_time_t, as it asks user to use >> API and abstracts time from units. New odp_time_t requires several >> additional API functions to be added: >> >> odp_time_t odp_time_sum(odp_time_t t1, odp_time_t t2); >> int odp_time_cmp(odp_time_t t1, odp_time_t t2); >> uint64_t odp_time_to_u64(odp_time_t hdl); >> >> Also added new definition that represents 0 ticks for time - >> ODP_TIME_NULL. It can be used instead of odp_time_from_ns(0) for >> comparison and initialization. >> >> This patch only changes used time API, it doesn't change used var >> names for simplicity. >> >> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> >> --- > > > > >> >> /* >> diff --git a/test/performance/odp_scheduling.c >> b/test/performance/odp_scheduling.c >> index 2a7e531..5859460 100644 >> --- a/test/performance/odp_scheduling.c >> +++ b/test/performance/odp_scheduling.c >> @@ -183,9 +183,10 @@ static int test_alloc_single(int thr, odp_pool_t >> pool) >> { >> int i; >> odp_buffer_t temp_buf; >> - uint64_t t1, t2, cycles, ns; >> + odp_time_t t1, t2, cycles; >> + uint64_t ns; >> >> - t1 = odp_time_cycles(); >> + t1 = odp_time(); >> >> for (i = 0; i < ALLOC_ROUNDS; i++) { >> temp_buf = odp_buffer_alloc(pool); >> @@ -198,12 +199,12 @@ static int test_alloc_single(int thr, odp_pool_t >> pool) >> odp_buffer_free(temp_buf); >> } >> >> - t2 = odp_time_cycles(); >> - cycles = odp_time_diff_cycles(t1, t2); >> - ns = odp_time_cycles_to_ns(cycles); >> + t2 = odp_time(); >> + cycles = odp_time_diff(t1, t2); >> + ns = odp_time_to_ns(cycles); >> >> printf(" [%i] alloc_sng alloc+free %"PRIu64" cycles, %"PRIu64" >> ns\n", >> - thr, cycles/ALLOC_ROUNDS, ns/ALLOC_ROUNDS); >> + thr, odp_time_to_u64(cycles) / ALLOC_ROUNDS, ns / >> ALLOC_ROUNDS); >> >> return 0; > > For example, this is really measuring average CPU cycles (with or without freq scaling). The ns conversion can be removed. > > -Petri > > > >
Petri, On 03.09.15 16:13, Ivan Khoronzhuk wrote: > > > On 03.09.15 15:29, Savolainen, Petri (Nokia - FI/Espoo) wrote: >> >> There are many example apps that actually want to measure execution time in CPU cycles, not in nsec. >> Time API and CPU (cycle) API updates are linked together. Real CPU cycle cases must not converted to use time API, but to use a new odp_cpu_cycle() count API first. >> Then what is left should be actual time API use cases. > > Let me look at CPU API first. > I hesitate to answer because of freq scaling. I have filling that there is not so very well. > And yes, probably I should move part of this changes in preparation series.

Where did you see an odp_cpu_cycle() count API? Even if you are going to add one, think it over once more; it seems better to replace these cases with odp_time() here. Imagine a system whose cpufreq governor regulates the CPU frequency according to packet rate, and hence system load, in order to reduce power consumption. In that case the CPU rate changes quickly and often, so when you read the frequency with odp_cpu_hz() there is no guarantee it was the same 1 ms earlier or will be 1 ms later. Or suppose the frequency spectrum is F1, F2, F3, Fmax, where Fmax is switched on only in the rare cases when the traffic rate is extremely high. That period is very short, and most of the time (99%) the CPU runs at F3, because the processor could otherwise be damaged by overheating. At some point odp_cpu_hz() can return Fmax, but you want to count cycles over a period spent mostly at F3; if you use Fmax, your calculation will be wrong. In this regard the current CPU frequency API looks very poor, and by and large the frequency would have to be tracked together with the governor, which may be absent or may even live in the operating system. It seems we have no choice but to use odp_time() for this: a CPU cycle count can only be estimated as odp_time() * odp_cpu_hz() / odp_time_hz(). But I worry how we can count CPU cycles at all if we are not sure of the CPU frequency :-| It's not informative for the systems in question. So it is better to evaluate it with odp_time(). No choice.

> >> >> >>> -----Original Message----- >>> From: lng-odp [mailto:lng-odp-bounces@lists.linaro.org] On Behalf Of >>> ext Ivan Khoronzhuk >>> Sent: Thursday, September 03, 2015 12:11 AM >>> To: lng-odp@lists.linaro.org >>> Subject: [lng-odp] [odp-lng] [Patch v3 1/3] api: time: unbind CPU >>> cycles from time API >>> >>> Current time API supposes that frequency of counter is equal >>> to CPU frequency. But that's not always true, for instance, >>> in case if no access to CPU cycle counter, another hi-resolution >>> timer can be used, and it`s rate can be different from CPU >>> rate. There is no big difference in which cycles to measure >>> time, the better hi-resolution timer the better measurements. >>> So, unbind CPU cycle counter from time API by eliminating word >>> "cycle" as it's believed to be used with CPU. >>> >>> Also add new opaque type for time odp_time_t, as it asks user to use >>> API and abstracts time from units. New odp_time_t requires several >>> additional API functions to be added: >>> >>> odp_time_t odp_time_sum(odp_time_t t1, odp_time_t t2); >>> int odp_time_cmp(odp_time_t t1, odp_time_t t2); >>> uint64_t odp_time_to_u64(odp_time_t hdl); >>> >>> Also added new definition that represents 0 ticks for time - >>> ODP_TIME_NULL. It can be used instead of odp_time_from_ns(0) for >>> comparison and initialization. >>> >>> This patch only changes used time API, it doesn't change used var >>> names for simplicity.
>>> >>> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> >>> --- >> >> >> >> >>> >>> /* >>> diff --git a/test/performance/odp_scheduling.c >>> b/test/performance/odp_scheduling.c >>> index 2a7e531..5859460 100644 >>> --- a/test/performance/odp_scheduling.c >>> +++ b/test/performance/odp_scheduling.c >>> @@ -183,9 +183,10 @@ static int test_alloc_single(int thr, odp_pool_t >>> pool) >>> { >>> int i; >>> odp_buffer_t temp_buf; >>> - uint64_t t1, t2, cycles, ns; >>> + odp_time_t t1, t2, cycles; >>> + uint64_t ns; >>> >>> - t1 = odp_time_cycles(); >>> + t1 = odp_time(); >>> >>> for (i = 0; i < ALLOC_ROUNDS; i++) { >>> temp_buf = odp_buffer_alloc(pool); >>> @@ -198,12 +199,12 @@ static int test_alloc_single(int thr, odp_pool_t >>> pool) >>> odp_buffer_free(temp_buf); >>> } >>> >>> - t2 = odp_time_cycles(); >>> - cycles = odp_time_diff_cycles(t1, t2); >>> - ns = odp_time_cycles_to_ns(cycles); >>> + t2 = odp_time(); >>> + cycles = odp_time_diff(t1, t2); >>> + ns = odp_time_to_ns(cycles); >>> >>> printf(" [%i] alloc_sng alloc+free %"PRIu64" cycles, %"PRIu64" >>> ns\n", >>> - thr, cycles/ALLOC_ROUNDS, ns/ALLOC_ROUNDS); >>> + thr, odp_time_to_u64(cycles) / ALLOC_ROUNDS, ns / >>> ALLOC_ROUNDS); >>> >>> return 0; >> >> For example, this is really measuring average CPU cycles (with or without freq scaling). The ns conversion can be removed. >> >> -Petri >> >> >> >> >
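As an illustration of the estimate Ivan describes above, a helper along these lines could derive a cycle count from the new time API plus the CPU frequency query (spelled odp_cpu_hz() in his mail and odp_sys_cpu_hz() in the timer example further down). The helper name is invented here, and the result is only as good as the assumption that the frequency stayed constant over the interval, which is exactly the assumption frequency scaling breaks:

	/* Rough CPU-cycle estimate from a time-stamp pair:
	 * elapsed_ns * cpu_hz / 10^9. Assumes a constant CPU frequency over
	 * the interval and may overflow for very long intervals;
	 * illustration only, not part of the patch. */
	static uint64_t estimate_cpu_cycles(odp_time_t t1, odp_time_t t2)
	{
		uint64_t ns = odp_time_to_ns(odp_time_diff(t1, t2));

		return ns * odp_cpu_hz() / ODP_TIME_SEC;
	}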
> -----Original Message----- > From: ext Ivan Khoronzhuk [mailto:ivan.khoronzhuk@linaro.org] > Sent: Thursday, September 03, 2015 6:26 PM > To: Savolainen, Petri (Nokia - FI/Espoo); lng-odp@lists.linaro.org > Subject: Re: [lng-odp] [odp-lng] [Patch v3 1/3] api: time: unbind CPU > cycles from time API > > Petri, > > On 03.09.15 16:13, Ivan Khoronzhuk wrote: > > > > > > On 03.09.15 15:29, Savolainen, Petri (Nokia - FI/Espoo) wrote: > >> > >> There are many example apps that actually want to measure execution > time in CPU cycles, not in nsec. > >> Time API and CPU (cycle) API updates are linked together. Real CPU > cycle cases must not converted to use time API, but to use a new > odp_cpu_cycle() count API first. > >> Then what is left should be actual time API use cases. > > > > Let me look at CPU API first. > > I hesitate to answer because of freq scaling. I have filling that > there is not so very well. > > And yes, probably I should move part of this changes in preparation > series. > > where did you see, odp_cycle_count? > Even if you are going to add. Think once again. > It seems it's better replace it here on odp_time. > > Imagine the system which has cpufreq governor that dependently on > packet rate, > and hence system load, regulate CPU frequency in order to reduce power > consumption. > > In this case you CPU rate is changing very fast and frequently. > When you get frequency with odp_cpu_hz() it doesn't mean that this freq > was > the same a 1ms ago or 1ms after. > > Or > What if we have frequency spectrum: F1 F2 F3 Fmax > Fmax can be switched on in rarely cases when traffic rate is extremely > hi. > This period is very short and most time (99%) CPU is working at F3, as > processor can > be damaged by temperature. In some point in time odp_cpu_hz() can > return Fmax. > But you want to count some number of cycles in period of time, mostly > with freq = F3. > But you got Fmax, your calculation will be wrong. In this regard > current cpu frequency API looks very pure. > And by a big account it should be calculated in combine with governor, > which can be absent > or even in operating system. > > Seems we don't have a choice, and should use odp_time for that. > cpu cycles count can be evaluated only by > odp_time()*odp_cpu_hz()/odp_time_hz(). > > But I'm worry about how we can count some cpu cycles if we are not sure > in cpu frequency :-|? > It's not informative for systems in question. > > So better to evaluate it with odp_time. No choice. CPU cycle measurement is different from time (latency) measurement. CPU cycle count tells you exact CPU overhead. These tests measure CPU overhead, which is many times more relevant for performance measurement than latency (nsec). User can choose the governor policy and interpret the measurement results accordingly. I'm going to add cpu cycle functions into cpu.h. -Petri
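A sketch of the split Petri describes, CPU cycles for overhead and odp_time() for latency, might look as follows once the planned cpu.h functions exist. The names odp_cpu_cycles() and odp_cpu_cycles_diff() are assumptions here, since the cpu.h additions are not part of this patch:

	#include <stdio.h>
	#include <inttypes.h>
	#include <odp.h>	/* assumed application-level ODP include */

	extern void work_under_test(void);	/* placeholder for the measured code */

	static void measure(void)
	{
		uint64_t c1, c2;
		odp_time_t t1, t2;

		c1 = odp_cpu_cycles();	/* assumed name for the planned cpu.h call */
		t1 = odp_time();

		work_under_test();

		t2 = odp_time();
		c2 = odp_cpu_cycles();

		/* CPU overhead in cycles vs wall-clock latency in ns */
		printf("%" PRIu64 " cycles, %" PRIu64 " ns\n",
		       odp_cpu_cycles_diff(c1, c2),	/* assumed: wrap-safe c2 - c1 */
		       odp_time_to_ns(odp_time_diff(t1, t2)));
	}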
Petri, On 04.09.15 12:14, Savolainen, Petri (Nokia - FI/Espoo) wrote: > > >> -----Original Message----- >> From: ext Ivan Khoronzhuk [mailto:ivan.khoronzhuk@linaro.org] >> Sent: Thursday, September 03, 2015 6:26 PM >> To: Savolainen, Petri (Nokia - FI/Espoo); lng-odp@lists.linaro.org >> Subject: Re: [lng-odp] [odp-lng] [Patch v3 1/3] api: time: unbind CPU >> cycles from time API >> >> Petri, >> >> On 03.09.15 16:13, Ivan Khoronzhuk wrote: >>> >>> >>> On 03.09.15 15:29, Savolainen, Petri (Nokia - FI/Espoo) wrote: >>>> >>>> There are many example apps that actually want to measure execution >> time in CPU cycles, not in nsec. >>>> Time API and CPU (cycle) API updates are linked together. Real CPU >> cycle cases must not converted to use time API, but to use a new >> odp_cpu_cycle() count API first. >>>> Then what is left should be actual time API use cases. >>> >>> Let me look at CPU API first. >>> I hesitate to answer because of freq scaling. I have filling that >> there is not so very well. >>> And yes, probably I should move part of this changes in preparation >> series. >> >> where did you see, odp_cycle_count? >> Even if you are going to add. Think once again. >> It seems it's better replace it here on odp_time. >> >> Imagine the system which has cpufreq governor that dependently on >> packet rate, >> and hence system load, regulate CPU frequency in order to reduce power >> consumption. >> >> In this case you CPU rate is changing very fast and frequently. >> When you get frequency with odp_cpu_hz() it doesn't mean that this freq >> was >> the same a 1ms ago or 1ms after. >> >> Or >> What if we have frequency spectrum: F1 F2 F3 Fmax >> Fmax can be switched on in rarely cases when traffic rate is extremely >> hi. >> This period is very short and most time (99%) CPU is working at F3, as >> processor can >> be damaged by temperature. In some point in time odp_cpu_hz() can >> return Fmax. >> But you want to count some number of cycles in period of time, mostly >> with freq = F3. >> But you got Fmax, your calculation will be wrong. In this regard >> current cpu frequency API looks very pure. >> And by a big account it should be calculated in combine with governor, >> which can be absent >> or even in operating system. >> >> Seems we don't have a choice, and should use odp_time for that. >> cpu cycles count can be evaluated only by >> odp_time()*odp_cpu_hz()/odp_time_hz(). >> >> But I'm worry about how we can count some cpu cycles if we are not sure >> in cpu frequency :-|? >> It's not informative for systems in question. >> >> So better to evaluate it with odp_time. No choice. > > > CPU cycle measurement is different from time (latency) measurement. CPU cycle count tells you exact CPU overhead. These tests measure CPU overhead, which is many times more relevant for performance measurement than latency (nsec). > > User can choose the governor policy and interpret the measurement results accordingly. > > I'm going to add cpu cycle functions into cpu.h. No objection. It be good you add it in near future. And don't forget to mention that it's valid only in debug purposes. Also it can be 32-bit counter and it can wrap (4s), so it should be close to time api. And create maybe some ticket I can point that it blocks me to add time API. > > > -Petri > >
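On the wrap concern: a free-running 32-bit counter clocked at 1 GHz wraps about every 2^32 / 10^9, roughly 4.3 s, which is presumably where Ivan's "(4s)" figure comes from. As long as at most one wrap can occur between two samples, plain unsigned subtraction already yields the right difference, as in this sketch:

	#include <stdint.h>

	/* Wrap-safe difference for a 32-bit cycle counter, valid when the
	 * measured interval spans at most one wrap (about 4.3 s at 1 GHz). */
	static uint64_t cycles_diff_u32(uint32_t c1, uint32_t c2)
	{
		return (uint32_t)(c2 - c1);	/* modulo-2^32 arithmetic handles the wrap */
	}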
diff --git a/example/timer/odp_timer_test.c b/example/timer/odp_timer_test.c index 49630b0..23c5a9a 100644 --- a/example/timer/odp_timer_test.c +++ b/example/timer/odp_timer_test.c @@ -320,9 +320,10 @@ static void parse_args(int argc, char *argv[], test_args_t *args) int main(int argc, char *argv[]) { odph_linux_pthread_t thread_tbl[MAX_WORKERS]; + uint64_t ns; int num_workers; odp_queue_t queue; - uint64_t cycles, ns; + odp_time_t cycles; odp_queue_param_t param; odp_pool_param_t params; odp_timer_pool_param_t tparams; @@ -449,19 +450,20 @@ int main(int argc, char *argv[]) printf("CPU freq %"PRIu64" Hz\n", odp_sys_cpu_hz()); printf("Cycles vs nanoseconds:\n"); ns = 0; - cycles = odp_time_ns_to_cycles(ns); + cycles = odp_time_from_ns(ns); - printf(" %12"PRIu64" ns -> %12"PRIu64" cycles\n", ns, cycles); - printf(" %12"PRIu64" cycles -> %12"PRIu64" ns\n", cycles, - odp_time_cycles_to_ns(cycles)); + printf(" %12" PRIu64 " ns -> %12" PRIu64 " cycles\n", ns, + odp_time_to_u64(cycles)); + printf(" %12" PRIu64 " cycles -> %12" PRIu64 " ns\n", + odp_time_to_u64(cycles), odp_time_to_ns(cycles)); for (ns = 1; ns <= 100*ODP_TIME_SEC; ns *= 10) { - cycles = odp_time_ns_to_cycles(ns); + cycles = odp_time_from_ns(ns); - printf(" %12"PRIu64" ns -> %12"PRIu64" cycles\n", ns, - cycles); - printf(" %12"PRIu64" cycles -> %12"PRIu64" ns\n", cycles, - odp_time_cycles_to_ns(cycles)); + printf(" %12" PRIu64 " ns -> %12" PRIu64 " cycles\n", ns, + odp_time_to_u64(cycles)); + printf(" %12" PRIu64 " cycles -> %12" PRIu64 " ns\n", + odp_time_to_u64(cycles), odp_time_to_ns(cycles)); } printf("\n"); diff --git a/include/odp/api/time.h b/include/odp/api/time.h index b0072fc..60800ba 100644 --- a/include/odp/api/time.h +++ b/include/odp/api/time.h @@ -28,14 +28,22 @@ extern "C" { #define ODP_TIME_MSEC 1000000ULL /**< Millisecond in nsec */ #define ODP_TIME_SEC 1000000000ULL /**< Second in nsec */ +/** + * @typedef odp_time_t + * ODP time stamp. Time stamp is global and can be shared between threads. + */ /** - * Current time in CPU cycles - * - * @return Current time in CPU cycles + * @def ODP_TIME_NULL + * Zero time stamp */ -uint64_t odp_time_cycles(void); +/** + * Current global time stamp. + * + * @return Time stamp. It should be hi-resolution time. 
+ */ +odp_time_t odp_time(void); /** * Time difference @@ -43,29 +51,60 @@ uint64_t odp_time_cycles(void); * @param t1 First time stamp * @param t2 Second time stamp * - * @return Difference of time stamps in CPU cycles + * @return Difference of time stamps */ -uint64_t odp_time_diff_cycles(uint64_t t1, uint64_t t2); +odp_time_t odp_time_diff(odp_time_t t1, odp_time_t t2); +/** + * Time sum + * + * @param t1 time stamp + * @param t2 time stamp + * + * @return Sum of time stamps + */ +odp_time_t odp_time_sum(odp_time_t t1, odp_time_t t2); /** - * Convert CPU cycles to nanoseconds + * Convert time to nanoseconds * - * @param cycles Time in CPU cycles + * @param time Time * * @return Time in nanoseconds */ -uint64_t odp_time_cycles_to_ns(uint64_t cycles); - +uint64_t odp_time_to_ns(odp_time_t time); /** - * Convert nanoseconds to CPU cycles + * Convert nanoseconds to time * * @param ns Time in nanoseconds * - * @return Time in CPU cycles + * @return Time stamp + */ +odp_time_t odp_time_from_ns(uint64_t ns); + +/** + * Compare two times as absolute ranges + * + * @param t1 First time + * @param t2 Second time + * + * @retval -1 if t2 < t1, 0 if t1 = t2, 1 if t2 > t1 + */ +int odp_time_cmp(odp_time_t t1, odp_time_t t2); + +/** + * Get printable value for an odp_time_t + * + * @param time time to be printed + * @return uint64_t value that can be used to print/display this + * time + * + * @note This routine is intended to be used for diagnostic purposes + * to enable applications to generate a printable value that represents + * an odp_time_t time. */ -uint64_t odp_time_ns_to_cycles(uint64_t ns); +uint64_t odp_time_to_u64(odp_time_t time); /** * @} diff --git a/test/performance/odp_pktio_perf.c b/test/performance/odp_pktio_perf.c index 85ef2bc..b27efc8 100644 --- a/test/performance/odp_pktio_perf.c +++ b/test/performance/odp_pktio_perf.c @@ -106,7 +106,7 @@ struct tx_stats_s { uint64_t tx_cnt; /* Packets transmitted */ uint64_t alloc_failures;/* Packet allocation failures */ uint64_t enq_failures; /* Enqueue failures */ - uint64_t idle_cycles; /* Idle cycle count in TX loop */ + odp_time_t idle_cycles; /* Idle cycle count in TX loop */ }; typedef union tx_stats_u { @@ -303,12 +303,12 @@ static void *run_thread_tx(void *arg) int thr_id; odp_queue_t outq; pkt_tx_stats_t *stats; - uint64_t burst_start_cycles, start_cycles, cur_cycles, send_duration; - uint64_t burst_gap_cycles; + odp_time_t start_cycles, cur_cycles, send_duration; + odp_time_t burst_start_cycles, burst_gap_cycles; uint32_t batch_len; int unsent_pkts = 0; odp_event_t tx_event[BATCH_LEN_MAX]; - uint64_t idle_start = 0; + odp_time_t idle_start = ODP_TIME_NULL; thread_args_t *targs = arg; @@ -326,30 +326,33 @@ static void *run_thread_tx(void *arg) if (outq == ODP_QUEUE_INVALID) LOG_ABORT("Failed to get output queue for thread %d\n", thr_id); - burst_gap_cycles = odp_time_ns_to_cycles( + burst_gap_cycles = odp_time_from_ns( ODP_TIME_SEC / (targs->pps / targs->batch_len)); - send_duration = odp_time_ns_to_cycles(targs->duration * ODP_TIME_SEC); + send_duration = odp_time_from_ns(targs->duration * ODP_TIME_SEC); odp_barrier_wait(&globals->tx_barrier); - cur_cycles = odp_time_cycles(); + cur_cycles = odp_time(); start_cycles = cur_cycles; - burst_start_cycles = odp_time_diff_cycles(cur_cycles, burst_gap_cycles); - while (odp_time_diff_cycles(start_cycles, cur_cycles) < send_duration) { + burst_start_cycles = odp_time_diff(cur_cycles, burst_gap_cycles); + while (odp_time_diff(start_cycles, cur_cycles) < send_duration) { unsigned alloc_cnt = 
0, tx_cnt; - if (odp_time_diff_cycles(burst_start_cycles, cur_cycles) + if (odp_time_diff(burst_start_cycles, cur_cycles) < burst_gap_cycles) { - cur_cycles = odp_time_cycles(); - if (idle_start == 0) + cur_cycles = odp_time(); + if (!odp_time_cmp(ODP_TIME_NULL, idle_start)) idle_start = cur_cycles; continue; } - if (idle_start) { - stats->s.idle_cycles += odp_time_diff_cycles( - idle_start, cur_cycles); - idle_start = 0; + if (odp_time_cmp(ODP_TIME_NULL, idle_start)) { + odp_time_t diff = odp_time_diff(idle_start, cur_cycles); + + stats->s.idle_cycles = + odp_time_sum(diff, stats->s.idle_cycles); + + idle_start = ODP_TIME_NULL; } burst_start_cycles += burst_gap_cycles; @@ -363,14 +366,14 @@ static void *run_thread_tx(void *arg) stats->s.enq_failures += unsent_pkts; stats->s.tx_cnt += tx_cnt; - cur_cycles = odp_time_cycles(); + cur_cycles = odp_time(); } VPRINT(" %02d: TxPkts %-8"PRIu64" EnqFail %-6"PRIu64 " AllocFail %-6"PRIu64" Idle %"PRIu64"ms\n", thr_id, stats->s.tx_cnt, stats->s.enq_failures, stats->s.alloc_failures, - odp_time_cycles_to_ns(stats->s.idle_cycles)/1000/1000); + odp_time_to_ns(stats->s.idle_cycles) / (uint64_t)ODP_TIME_MSEC); return NULL; } @@ -588,9 +591,11 @@ static int setup_txrx_masks(odp_cpumask_t *thd_mask_tx, */ static void busy_loop_ns(uint64_t wait_ns) { - uint64_t end = odp_time_cycles() + odp_time_ns_to_cycles(wait_ns); - while (odp_time_cycles() < end) - ; + odp_time_t start_time = odp_time(); + odp_time_t wait = odp_time_from_ns(wait_ns); + + while (odp_time_cmp(diff, wait) > 0) + diff = odp_time_diff(start_time, odp_time()); } /* diff --git a/test/performance/odp_scheduling.c b/test/performance/odp_scheduling.c index 2a7e531..5859460 100644 --- a/test/performance/odp_scheduling.c +++ b/test/performance/odp_scheduling.c @@ -183,9 +183,10 @@ static int test_alloc_single(int thr, odp_pool_t pool) { int i; odp_buffer_t temp_buf; - uint64_t t1, t2, cycles, ns; + odp_time_t t1, t2, cycles; + uint64_t ns; - t1 = odp_time_cycles(); + t1 = odp_time(); for (i = 0; i < ALLOC_ROUNDS; i++) { temp_buf = odp_buffer_alloc(pool); @@ -198,12 +199,12 @@ static int test_alloc_single(int thr, odp_pool_t pool) odp_buffer_free(temp_buf); } - t2 = odp_time_cycles(); - cycles = odp_time_diff_cycles(t1, t2); - ns = odp_time_cycles_to_ns(cycles); + t2 = odp_time(); + cycles = odp_time_diff(t1, t2); + ns = odp_time_to_ns(cycles); printf(" [%i] alloc_sng alloc+free %"PRIu64" cycles, %"PRIu64" ns\n", - thr, cycles/ALLOC_ROUNDS, ns/ALLOC_ROUNDS); + thr, odp_time_to_u64(cycles) / ALLOC_ROUNDS, ns / ALLOC_ROUNDS); return 0; } @@ -220,9 +221,10 @@ static int test_alloc_multi(int thr, odp_pool_t pool) { int i, j; odp_buffer_t temp_buf[MAX_ALLOCS]; - uint64_t t1, t2, cycles, ns; + odp_time_t t1, t2, cycles; + uint64_t ns; - t1 = odp_time_cycles(); + t1 = odp_time(); for (i = 0; i < ALLOC_ROUNDS; i++) { for (j = 0; j < MAX_ALLOCS; j++) { @@ -238,12 +240,12 @@ static int test_alloc_multi(int thr, odp_pool_t pool) odp_buffer_free(temp_buf[j-1]); } - t2 = odp_time_cycles(); - cycles = odp_time_diff_cycles(t1, t2); - ns = odp_time_cycles_to_ns(cycles); + t2 = odp_time(); + cycles = odp_time_diff(t1, t2); + ns = odp_time_to_ns(cycles); printf(" [%i] alloc_multi alloc+free %"PRIu64" cycles, %"PRIu64" ns\n", - thr, cycles/(ALLOC_ROUNDS*MAX_ALLOCS), + thr, odp_time_to_u64(cycles) / (ALLOC_ROUNDS * MAX_ALLOCS), ns/(ALLOC_ROUNDS*MAX_ALLOCS)); return 0; @@ -265,7 +267,8 @@ static int test_poll_queue(int thr, odp_pool_t msg_pool) odp_buffer_t buf; test_message_t *t_msg; odp_queue_t queue; - uint64_t t1, 
t2, cycles, ns; + odp_time_t t1, t2, cycles; + uint64_t ns; int i; /* Alloc test message */ @@ -289,7 +292,7 @@ static int test_poll_queue(int thr, odp_pool_t msg_pool) return -1; } - t1 = odp_time_cycles(); + t1 = odp_time(); for (i = 0; i < QUEUE_ROUNDS; i++) { ev = odp_buffer_to_event(buf); @@ -310,12 +313,12 @@ static int test_poll_queue(int thr, odp_pool_t msg_pool) } } - t2 = odp_time_cycles(); - cycles = odp_time_diff_cycles(t1, t2); - ns = odp_time_cycles_to_ns(cycles); + t2 = odp_time(); + cycles = odp_time_diff(t1, t2); + ns = odp_time_to_ns(cycles); printf(" [%i] poll_queue enq+deq %"PRIu64" cycles, %"PRIu64" ns\n", - thr, cycles/QUEUE_ROUNDS, ns/QUEUE_ROUNDS); + thr, odp_time_to_u64(cycles) / QUEUE_ROUNDS, ns / QUEUE_ROUNDS); odp_buffer_free(buf); return 0; @@ -341,14 +344,15 @@ static int test_schedule_single(const char *str, int thr, { odp_event_t ev; odp_queue_t queue; - uint64_t t1, t2, cycles, ns; + odp_time_t t1, t2, cycles; + uint64_t ns; uint32_t i; uint32_t tot; if (create_queue(thr, msg_pool, prio)) return -1; - t1 = odp_time_cycles(); + t1 = odp_time(); for (i = 0; i < QUEUE_ROUNDS; i++) { ev = odp_schedule(&queue, ODP_SCHED_WAIT); @@ -382,18 +386,15 @@ static int test_schedule_single(const char *str, int thr, odp_schedule_resume(); - t2 = odp_time_cycles(); - cycles = odp_time_diff_cycles(t1, t2); - ns = odp_time_cycles_to_ns(cycles); + t2 = odp_time(); + cycles = odp_time_diff(t1, t2); + ns = odp_time_to_ns(cycles); odp_barrier_wait(barrier); clear_sched_queues(); - cycles = cycles/tot; - ns = ns/tot; - printf(" [%i] %s enq+deq %"PRIu64" cycles, %"PRIu64" ns\n", - thr, str, cycles, ns); + thr, str, odp_time_to_u64(cycles) / tot, ns / tot); return 0; } @@ -419,9 +420,8 @@ static int test_schedule_many(const char *str, int thr, { odp_event_t ev; odp_queue_t queue; - uint64_t t1; - uint64_t t2; - uint64_t cycles, ns; + odp_time_t t1, t2, cycles; + uint64_t ns; uint32_t i; uint32_t tot; @@ -429,7 +429,7 @@ static int test_schedule_many(const char *str, int thr, return -1; /* Start sched-enq loop */ - t1 = odp_time_cycles(); + t1 = odp_time(); for (i = 0; i < QUEUE_ROUNDS; i++) { ev = odp_schedule(&queue, ODP_SCHED_WAIT); @@ -463,18 +463,15 @@ static int test_schedule_many(const char *str, int thr, odp_schedule_resume(); - t2 = odp_time_cycles(); - cycles = odp_time_diff_cycles(t1, t2); - ns = odp_time_cycles_to_ns(cycles); + t2 = odp_time(); + cycles = odp_time_diff(t1, t2); + ns = odp_time_to_ns(cycles); odp_barrier_wait(barrier); clear_sched_queues(); - cycles = cycles/tot; - ns = ns/tot; - printf(" [%i] %s enq+deq %"PRIu64" cycles, %"PRIu64" ns\n", - thr, str, cycles, ns); + thr, str, odp_time_to_u64(cycles) / tot, ns / tot); return 0; } @@ -496,9 +493,8 @@ static int test_schedule_multi(const char *str, int thr, { odp_event_t ev[MULTI_BUFS_MAX]; odp_queue_t queue; - uint64_t t1; - uint64_t t2; - uint64_t cycles, ns; + odp_time_t t1, t2, cycles; + uint64_t ns, cycles_pr; int i, j; int num; uint32_t tot = 0; @@ -547,7 +543,7 @@ static int test_schedule_multi(const char *str, int thr, } /* Start sched-enq loop */ - t1 = odp_time_cycles(); + t1 = odp_time(); for (i = 0; i < QUEUE_ROUNDS; i++) { num = odp_schedule_multi(&queue, ODP_SCHED_WAIT, ev, @@ -584,23 +580,23 @@ static int test_schedule_multi(const char *str, int thr, odp_schedule_resume(); - t2 = odp_time_cycles(); - cycles = odp_time_diff_cycles(t1, t2); - ns = odp_time_cycles_to_ns(cycles); + t2 = odp_time(); + cycles = odp_time_diff(t1, t2); + ns = odp_time_to_ns(cycles); odp_barrier_wait(barrier); 
clear_sched_queues(); if (tot) { - cycles = cycles/tot; + cycles_pr = odp_time_to_u64(cycles) / tot; ns = ns/tot; } else { - cycles = 0; + cycles_pr = 0; ns = 0; } printf(" [%i] %s enq+deq %"PRIu64" cycles, %"PRIu64" ns\n", - thr, str, cycles, ns); + thr, str, cycles_pr, ns); return 0; } @@ -719,8 +715,8 @@ static void *run_thread(void *arg) static void test_time(void) { struct timespec tp1, tp2; - uint64_t t1, t2; - uint64_t ns1, ns2, cycles; + odp_time_t t1, t2, cycles; + uint64_t ns1, ns2; double err; if (clock_gettime(CLOCK_MONOTONIC, &tp2)) { @@ -738,7 +734,7 @@ static void test_time(void) } while (tp1.tv_sec == tp2.tv_sec); - t1 = odp_time_cycles(); + t1 = odp_time(); do { if (clock_gettime(CLOCK_MONOTONIC, &tp2)) { @@ -748,7 +744,7 @@ static void test_time(void) } while ((tp2.tv_sec - tp1.tv_sec) < TEST_SEC); - t2 = odp_time_cycles(); + t2 = odp_time(); ns1 = (tp2.tv_sec - tp1.tv_sec)*1000000000; @@ -757,14 +753,15 @@ static void test_time(void) else ns1 -= tp1.tv_nsec - tp2.tv_nsec; - cycles = odp_time_diff_cycles(t1, t2); - ns2 = odp_time_cycles_to_ns(cycles); + cycles = odp_time_diff(t1, t2); + ns2 = odp_time_to_ns(cycles); err = ((double)(ns2) - (double)ns1) / (double)ns1; printf("clock_gettime %"PRIu64" ns\n", ns1); - printf("odp_time_cycles %"PRIu64" cycles\n", cycles); - printf("odp_time_cycles_to_ns %"PRIu64" ns\n", ns2); + printf("odp_time %" PRIu64 " cycles\n", + odp_time_to_u64(cycles)); + printf("odp_time_to_ns %" PRIu64 " ns\n", ns2); printf("odp get cycle error %f%%\n", err*100.0); printf("\n"); diff --git a/test/validation/pktio/pktio.c b/test/validation/pktio/pktio.c index b136419..723db81 100644 --- a/test/validation/pktio/pktio.c +++ b/test/validation/pktio/pktio.c @@ -332,18 +332,18 @@ static int destroy_inq(odp_pktio_t pktio) static odp_event_t queue_deq_wait_time(odp_queue_t queue, uint64_t ns) { - uint64_t start, now, diff; + odp_time_t start, now, diff; odp_event_t ev; - start = odp_time_cycles(); + start = odp_time(); do { ev = odp_queue_deq(queue); if (ev != ODP_EVENT_INVALID) return ev; - now = odp_time_cycles(); - diff = odp_time_diff_cycles(start, now); - } while (odp_time_cycles_to_ns(diff) < ns); + now = odp_time(); + diff = odp_time_diff(start, now); + } while (odp_time_to_ns(diff) < ns); return ODP_EVENT_INVALID; } @@ -351,12 +351,12 @@ static odp_event_t queue_deq_wait_time(odp_queue_t queue, uint64_t ns) static odp_packet_t wait_for_packet(odp_queue_t queue, uint32_t seq, uint64_t ns) { - uint64_t start, now, diff; + odp_time_t start, now, diff; odp_event_t ev; odp_packet_t pkt = ODP_PACKET_INVALID; uint64_t wait; - start = odp_time_cycles(); + start = odp_time(); wait = odp_schedule_wait_time(ns); do { @@ -377,9 +377,9 @@ static odp_packet_t wait_for_packet(odp_queue_t queue, odp_event_free(ev); } - now = odp_time_cycles(); - diff = odp_time_diff_cycles(start, now); - } while (odp_time_cycles_to_ns(diff) < ns); + now = odp_time(); + diff = odp_time_diff(start, now); + } while (odp_time_to_ns(diff) < ns); CU_FAIL("failed to receive transmitted packet"); diff --git a/test/validation/scheduler/scheduler.c b/test/validation/scheduler/scheduler.c index 04cd166..f94f0d0 100644 --- a/test/validation/scheduler/scheduler.c +++ b/test/validation/scheduler/scheduler.c @@ -212,10 +212,11 @@ static void *schedule_common_(void *arg) CU_ASSERT(from != ODP_QUEUE_INVALID); if (locked) { int cnt; - uint64_t cycles = 0; + odp_time_t cycles = ODP_TIME_NULL; /* Do some work here to keep the thread busy */ for (cnt = 0; cnt < 1000; cnt++) - cycles += odp_time_cycles(); 
+ cycles = odp_time_sum(cycles, + odp_time()); odp_spinlock_unlock(&globals->atomic_lock); } diff --git a/test/validation/time/time.c b/test/validation/time/time.c index 4b81c2c..f8dba8e 100644 --- a/test/validation/time/time.c +++ b/test/validation/time/time.c @@ -16,43 +16,44 @@ void time_test_odp_cycles_diff(void) { /* volatile to stop optimization of busy loop */ volatile int count = 0; - uint64_t diff, cycles1, cycles2; + odp_time_t diff, cycles1, cycles2; - cycles1 = odp_time_cycles(); + cycles1 = odp_time(); while (count < BUSY_LOOP_CNT) { count++; }; - cycles2 = odp_time_cycles(); - CU_ASSERT(cycles2 > cycles1); + cycles2 = odp_time(); + CU_ASSERT((odp_time_cmp(cycles1, cycles2) > 0); - diff = odp_time_diff_cycles(cycles1, cycles2); - CU_ASSERT(diff > 0); + diff = odp_time_diff(cycles1, cycles2); + CU_ASSERT(odp_time_cmp(ODP_TIME_NULL, diff) > 0); } /* check that a negative cycles difference gives a reasonable result */ void time_test_odp_cycles_negative_diff(void) { - uint64_t diff, cycles1, cycles2; + odp_time_t diff, cycles1, cycles2; cycles1 = 10; cycles2 = 5; - diff = odp_time_diff_cycles(cycles1, cycles2); - CU_ASSERT(diff > 0); + diff = odp_time_diff(cycles1, cycles2); + CU_ASSERT(odp_time_cmp(ODP_TIME_NULL, diff) > 0); } /* check that related conversions come back to the same value */ void time_test_odp_time_conversion(void) { - uint64_t ns1, ns2, cycles; + uint64_t ns1, ns2; + odp_time_t cycles; uint64_t upper_limit, lower_limit; ns1 = 100; - cycles = odp_time_ns_to_cycles(ns1); - CU_ASSERT(cycles > 0); + cycles = odp_time_from_ns(ns1); + CU_ASSERT(odp_time_cmp(ODP_TIME_NULL, cycles) > 0); - ns2 = odp_time_cycles_to_ns(cycles); + ns2 = odp_time_to_ns(cycles); /* need to check within arithmetic tolerance that the same * value in ns is returned after conversions */
The current time API supposes that the frequency of the counter is equal to the CPU frequency. That is not always true: for instance, when there is no access to a CPU cycle counter, another hi-resolution timer can be used, and its rate can differ from the CPU rate. There is no big difference in which cycles are used to measure time; the better the hi-resolution timer, the better the measurements. So, unbind the CPU cycle counter from the time API by eliminating the word "cycle", which is commonly associated with the CPU.

Also add a new opaque type for time, odp_time_t, which makes the user go through the API and abstracts time away from concrete units. The new odp_time_t requires several additional API functions to be added:

odp_time_t odp_time_sum(odp_time_t t1, odp_time_t t2);
int odp_time_cmp(odp_time_t t1, odp_time_t t2);
uint64_t odp_time_to_u64(odp_time_t hdl);

Also add a new definition, ODP_TIME_NULL, that represents 0 ticks of time. It can be used instead of odp_time_from_ns(0) for comparison and initialization.

This patch only converts the code to the new time API; for simplicity it does not rename the existing variables.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 example/timer/odp_timer_test.c        |  22 +++----
 include/odp/api/time.h                |  65 ++++++++++++++++----
 test/performance/odp_pktio_perf.c     |  47 ++++++++-------
 test/performance/odp_scheduling.c     | 109 +++++++++++++++++-----------------
 test/validation/pktio/pktio.c         |  20 +++----
 test/validation/scheduler/scheduler.c |   5 +-
 test/validation/time/time.c           |  27 +++++----
 7 files changed, 170 insertions(+), 125 deletions(-)
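For reference, a minimal usage sketch of the API introduced by this patch. The do_work() function and the ROUNDS constant are placeholders, and <odp.h> is assumed to be the application-level include; this is an illustration, not code from the patch:

	#include <stdio.h>
	#include <inttypes.h>
	#include <odp.h>			/* assumed application-level ODP include */

	#define ROUNDS 1000			/* placeholder amount of work */

	extern void do_work(void);		/* placeholder for the code under test */

	static void measure_rounds(void)
	{
		odp_time_t t1, t2, diff;
		odp_time_t total = ODP_TIME_NULL;	/* replaces the old "= 0" idiom */
		int i;

		t1 = odp_time();

		for (i = 0; i < ROUNDS; i++)
			do_work();

		t2 = odp_time();

		diff = odp_time_diff(t1, t2);		/* elapsed time, opaque units */
		total = odp_time_sum(total, diff);	/* running total via odp_time_sum() */

		/* odp_time_cmp(t1, t2) returns 1 when t2 > t1 */
		if (odp_time_cmp(ODP_TIME_NULL, total) > 0)
			printf("%" PRIu64 " ticks, %" PRIu64 " ns per round\n",
			       odp_time_to_u64(diff),
			       odp_time_to_ns(diff) / ROUNDS);
	}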