test:linux-generic: run odp_scheduling in process mode

Message ID 1470316694-17100-1-git-send-email-mike.holmes@linaro.org
State Accepted
Commit 5d4f6cbd12a7f079917bc5ad18b710b3e34485b1

Commit Message

Mike Holmes Aug. 4, 2016, 1:18 p.m. UTC
Set up the environment to allow calling the performance tests in process
mode as part of make check when enabled.

To run the tests use --enable-test-perf-proc

Initial patch using odp_scheduling as a proof

Signed-off-by: Mike Holmes <mike.holmes@linaro.org>

---
 configure.ac                                       |  2 ++
 test/linux-generic/Makefile.am                     |  2 +-
 test/linux-generic/m4/configure.m4                 |  5 ++++-
 test/linux-generic/m4/performance.m4               |  9 ++++++++
 test/linux-generic/performance/.gitignore          |  2 ++
 test/linux-generic/performance/Makefile.am         | 13 +++++++++++
 .../performance/odp_scheduling_run_proc.sh         | 26 ++++++++++++++++++++++
 7 files changed, 57 insertions(+), 2 deletions(-)
 create mode 100644 test/linux-generic/m4/performance.m4
 create mode 100644 test/linux-generic/performance/.gitignore
 create mode 100644 test/linux-generic/performance/Makefile.am
 create mode 100755 test/linux-generic/performance/odp_scheduling_run_proc.sh

-- 
2.7.4

Comments

Brian Brooks Aug. 4, 2016, 3:26 p.m. UTC | #1
Reviewed-by: Brian Brooks <brian.brooks@linaro.org>


On 08/04 09:18:14, Mike Holmes wrote:
> +ret=0
> +
> +run()
> +{
> +	echo odp_scheduling_run_proc starts with $1 worker threads
> +	echo =====================================================
> +
> +	$PERFORMANCE/odp_scheduling${EXEEXT} --odph_proc -c $1 || ret=1
> +}
> +
> +run 1
> +run 8
> +
> +exit $ret


Seeing this randomly in both multithread and multiprocess modes:

../../../odp/platform/linux-generic/odp_queue.c:328:odp_queue_destroy():queue "sched_00_07" not empty
../../../odp/platform/linux-generic/odp_schedule.c:271:schedule_term_global():Queue not empty
../../../odp/platform/linux-generic/odp_schedule.c:294:schedule_term_global():Pool destroy fail.
../../../odp/platform/linux-generic/odp_init.c:188:_odp_term_global():ODP schedule term failed.
../../../odp/platform/linux-generic/odp_queue.c:170:odp_queue_term_global():Not destroyed queue: sched_00_07
../../../odp/platform/linux-generic/odp_init.c:195:_odp_term_global():ODP queue term failed.
../../../odp/platform/linux-generic/odp_pool.c:149:odp_pool_term_global():Not destroyed pool: odp_sched_pool
../../../odp/platform/linux-generic/odp_pool.c:149:odp_pool_term_global():Not destroyed pool: msg_pool
../../../odp/platform/linux-generic/odp_init.c:202:_odp_term_global():ODP buffer pool term failed.
~/odp_incoming/odp_build/test/common_plat/performance$ echo $?
0

Potentially two items: one for correctly returning the failure code, and
another related to teardown. Both beyond the scope of this patch which LGTM.
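
For the first item, a minimal sketch of how a wrapper script might flag the run as failed even though the binary exits 0. The helper name run_checked is hypothetical and not part of this patch; the "term failed" string it matches is taken from the log lines above, and matching on log text is only an assumption about what a stopgap could look like.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): treat a run as failed if the
# command exits non-zero OR its combined output contains "term failed".
run_checked()
{
	log=$(mktemp) || return 1
	"$@" >"$log" 2>&1
	status=$?
	cat "$log"			# still show the output to the caller
	if grep -q 'term failed' "$log"; then
		status=1		# teardown error seen despite exit code
	fi
	rm -f "$log"
	return $status
}
```

In the script above this could be used as `run_checked $PERFORMANCE/odp_scheduling${EXEEXT} --odph_proc -c $1 || ret=1`. The proper fix would presumably be for the application itself to return the failure code.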
Bill Fischofer Aug. 4, 2016, 3:33 p.m. UTC | #2
On Thu, Aug 4, 2016 at 10:26 AM, Brian Brooks <brian.brooks@linaro.org>
wrote:

> Seeing this randomly in both multithread and multiprocess modes:


Before or after you apply this patch? What environment are you seeing these
errors in? They should definitely not be happening.


Mike Holmes Aug. 4, 2016, 3:36 p.m. UTC | #3
On my vanilla x86 I don't get any issues. I'm keen to get this in and have CI
run it on lots of HW to see what happens. Many of the other tests
completely fail in process mode, so I think we will expose a lot as we add
them.


-- 
Mike Holmes
Technical Manager - Linaro Networking Group
Linaro.org <http://www.linaro.org/> *│ *Open source software for ARM SoCs
"Work should be fun and collaborative, the rest follows"
Bill Fischofer Aug. 4, 2016, 3:47 p.m. UTC | #4
Looks like we have a real issue that somehow crept into master. I can
sporadically reproduce these same errors on my x86 system. It looks like
this is also present in the monarch_lts branch.


Bill Fischofer Aug. 4, 2016, 3:58 p.m. UTC | #5
Quick update: I can repro this in v1.10.0.1 as well; however, v1.10.0.0
seems good.

Mike Holmes Aug. 4, 2016, 3:59 p.m. UTC | #6
On 4 August 2016 at 11:47, Bill Fischofer <bill.fischofer@linaro.org> wrote:

> Looks like we have a real issue that somehow creeped into master. I can
> sporadically reproduce these same errors on my x86 system.  It looks like
> this is also present in the monarch_lts branch.

I think that we agreed that Monarch would not support Process mode because
we never tested for it, but for TgrM we need to start fixing it.

Mike


Bill Fischofer Aug. 4, 2016, 4:01 p.m. UTC | #7
On Thu, Aug 4, 2016 at 10:59 AM, Mike Holmes <mike.holmes@linaro.org> wrote:

> I think that we agreed that Monarch would not support Process mode becasue
> we never tested for it, but for TgrM we need to start fixing it.

Unfortunately the issue Brian identified has nothing to do with process
mode. This happens in regular pthread mode on all levels past v1.10.0.0 as
far as I can see.


Maxim Uvarov Aug. 4, 2016, 4:03 p.m. UTC | #8
btw, if ipc pktio is enabled and the termination function returns an error,
then there will be a pool file in /dev/shm/.
In Docker, shm space is limited to 64MB, so other tests can begin to fail
randomly on pool allocation or pool usage.

Maxim.
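
A minimal sketch of how one might check for leftovers between test runs. The helper name shm_leftovers is hypothetical; it only counts entries in the directory and assumes nothing about how ODP names its pool files.

```shell
#!/bin/sh
# Hypothetical helper: print the number of entries left in a shared-memory
# directory (defaults to /dev/shm). A non-zero count after a test run may
# indicate that a termination function failed to clean up.
shm_leftovers()
{
	dir=${1:-/dev/shm}
	ls -A "$dir" 2>/dev/null | wc -l | tr -d ' '
}
```

Running `shm_leftovers` before and after a test and comparing the two counts, together with `df -h /dev/shm`, would show whether the 64MB limit is being approached.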
Bill Fischofer Aug. 4, 2016, 4:05 p.m. UTC | #9
On Thu, Aug 4, 2016 at 11:03 AM, Maxim Uvarov <maxim.uvarov@linaro.org>
wrote:

> btw, if ipc pktio enable and termination function return error, than there
> will be pool file in /dev/shm/.
> In docker shm space limited to 64MB. So other test can began randomly fail
> on pool allocation or pool usage.

I've not enabled ipc pktio and this test doesn't do any pktio operations so
I don't think that's the issue here.


Brian Brooks Aug. 4, 2016, 4:17 p.m. UTC | #10
On 08/04 11:01:09, Bill Fischofer wrote:
> Unfortunately the issue Brian identified has nothing to do with process
> mode. This happens in regular pthread mode on all levels past v1.10.0.0 as
> far as I can see.

The issue seems to emerge only under high event rates. The application asks
for more work, but none will be scheduled. However, there actually will be
work in the queue. So, the teardown will fail because the queue is not empty.
There may be a disconnect between the scheduling and the queueing or some
other synchronization related bug. I think I've seen something similar on
an ARM platform, so it may be architecture independent.
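
Since the failure is sporadic, a minimal sketch of a repro loop may help when bisecting. The helper name count_teardown_failures is hypothetical; it matches on the "not empty" text from the log above because the exit code is currently 0 even when teardown fails.

```shell
#!/bin/sh
# Hypothetical repro loop: run a command N times and print how many runs
# produced the "not empty" teardown message on stdout or stderr.
count_teardown_failures()
{
	runs=$1
	shift
	failures=0
	i=0
	while [ "$i" -lt "$runs" ]; do
		if "$@" 2>&1 | grep -q 'not empty'; then
			failures=$((failures + 1))
		fi
		i=$((i + 1))
	done
	echo "$failures"
}
```

For example, `count_teardown_failures 100 ./odp_scheduling -c 8` would report how many of 100 runs hit the teardown error, which could then be compared across versions.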
Mike Holmes Aug. 5, 2016, 3:59 p.m. UTC | #11
On 4 August 2016 at 11:26, Brian Brooks <brian.brooks@linaro.org> wrote:

> Reviewed-by: Brian Brooks <brian.brooks@linaro.org>

>


Just wanted to follow up on this. This patch has also highlighted a bug in
the original code, which the thread below documents. A bug has been created
for it: https://bugs.linaro.org/show_bug.cgi?id=2457

However, this patch is orthogonal to that problem, which existed before this
patch was created, so I think we need to take it so that TgrM testing can
continue.


>
> On 08/04 09:18:14, Mike Holmes wrote:
> > +ret=0
> > +
> > +run()
> > +{
> > +     echo odp_scheduling_run_proc starts with $1 worker threads
> > +     echo =====================================================
> > +
> > +     $PERFORMANCE/odp_scheduling${EXEEXT} --odph_proc -c $1 || ret=1
> > +}
> > +
> > +run 1
> > +run 8
> > +
> > +exit $ret
>
> Seeing this randomly in both multithread and multiprocess modes:
>
> ../../../odp/platform/linux-generic/odp_queue.c:328:odp_queue_destroy():queue "sched_00_07" not empty
> ../../../odp/platform/linux-generic/odp_schedule.c:271:schedule_term_global():Queue not empty
> ../../../odp/platform/linux-generic/odp_schedule.c:294:schedule_term_global():Pool destroy fail.
> ../../../odp/platform/linux-generic/odp_init.c:188:_odp_term_global():ODP schedule term failed.
> ../../../odp/platform/linux-generic/odp_queue.c:170:odp_queue_term_global():Not destroyed queue: sched_00_07
> ../../../odp/platform/linux-generic/odp_init.c:195:_odp_term_global():ODP queue term failed.
> ../../../odp/platform/linux-generic/odp_pool.c:149:odp_pool_term_global():Not destroyed pool: odp_sched_pool
> ../../../odp/platform/linux-generic/odp_pool.c:149:odp_pool_term_global():Not destroyed pool: msg_pool
> ../../../odp/platform/linux-generic/odp_init.c:202:_odp_term_global():ODP buffer pool term failed.
> ~/odp_incoming/odp_build/test/common_plat/performance$ echo $?
> 0
>
> Potentially two items: one for correctly returning the failure code, and
> another related to teardown. Both beyond the scope of this patch which
> LGTM.

Bill Fischofer Aug. 5, 2016, 4:01 p.m. UTC | #12
On Fri, Aug 5, 2016 at 10:59 AM, Mike Holmes <mike.holmes@linaro.org> wrote:

> Just wanted to follow up on this, this patch has also highlighted a bug in
> the original code and the thread below documents it, also a bug for it has
> been created https://bugs.linaro.org/show_bug.cgi?id=2457
>
> However this patch is orthogonal to that problem which existed before this
> patch was created so I think we need to  take it so that TgrM testing can
> continue.

Agreed. This one should be merged to permit parallel activity, but we'd
still like to track down the other bug.


Maxim Uvarov Aug. 5, 2016, 4:27 p.m. UTC | #13
clear_sched_queues();

is missing just before destroying the queues, plus the fix from Bill to
account for all term errors.

Maxim.
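
The exit-status point can be illustrated with a minimal shell sketch. It
assumes the same `|| ret=1` wrapper as the run script in this patch; the
three functions are hypothetical stand-ins for the test binary, not ODP
code. The wrapper only notices a failure that the binary reports through
its exit status, so a run that logs teardown errors and still exits 0 is
counted as a pass.

```shell
#!/bin/sh
# Hypothetical stand-ins for the test binary (assumptions, not ODP code).
clean_run() { return 0; }
# Logs a teardown error but still exits 0, like the run quoted in this thread.
silent_failure() { echo "ODP queue term failed." >&2; return 0; }
# What the binary would look like once term errors feed into its exit status.
reported_failure() { echo "ODP queue term failed." >&2; return 1; }

ret=0

# Same pattern as the run script: remember any non-zero exit status.
run()
{
	"$@" || ret=1
}

run clean_run
run silent_failure
after_silent=$ret      # still 0: the wrapper cannot see the logged error

run reported_failure
after_reported=$ret    # now 1: the failure reached the wrapper

echo "after silent failure: $after_silent, after reported failure: $after_reported"
```

This prints "after silent failure: 0, after reported failure: 1", which is
why the term errors have to be reflected in the test's return value before
make check can catch them.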


Mike Holmes Aug. 5, 2016, 4:29 p.m. UTC | #14
On 5 August 2016 at 12:27, Maxim Uvarov <maxim.uvarov@linaro.org> wrote:

> clear_sched_queues();

>

> is missing just before destroying queues + fix form Bill with account all

> term errors.



Should this go on the bug?

This thread should be about this patch to enable process mode, or it
will get hijacked.





-- 
Mike Holmes
Technical Manager - Linaro Networking Group
Linaro.org <http://www.linaro.org/> │ Open source software for ARM SoCs
"Work should be fun and collaborative, the rest follows"
Maxim Uvarov Aug. 5, 2016, 4:31 p.m. UTC | #15
On 08/05/16 19:29, Mike Holmes wrote:
>

>

> On 5 August 2016 at 12:27, Maxim Uvarov <maxim.uvarov@linaro.org 

> <mailto:maxim.uvarov@linaro.org>> wrote:

>

>     clear_sched_queues();

>

>     is missing just before destroying queues + fix form Bill with

>     account all term errors.

>

>

> Should this go on the bug ?

>

> This  this thread should be about this patch to enable process mode or 

> it will get hijacked


Yes, that is a test bug; the fix has to go into monarch as well.

Bill, I think you can add this to your patch and send v2.

Maxim.


Bill Fischofer Aug. 5, 2016, 4:41 p.m. UTC | #16
On Fri, Aug 5, 2016 at 11:31 AM, Maxim Uvarov <maxim.uvarov@linaro.org>
wrote:

> On 08/05/16 19:29, Mike Holmes wrote:

>

>>

>>

>> On 5 August 2016 at 12:27, Maxim Uvarov <maxim.uvarov@linaro.org <mailto:

>> maxim.uvarov@linaro.org>> wrote:

>>

>>     clear_sched_queues();

>>

>>     is missing just before destroying queues + fix form Bill with

>>     account all term errors.

>>

>>

>> Should this go on the bug ?

>>

>> This  this thread should be about this patch to enable process mode or it

>> will get hijacked

>>

>

> yes, that is test bug, fix has to be in monarch also.

>

> Bill, I think you will add this to your patch and send v2.

>


Sorry for the hijack, but I don't understand this analysis or suggestion.
Are you saying the cause of the issue Brian identified is a missing
clear_sched_queues() call somewhere in odp_scheduling.c? I don't see that,
as each of the scheduled routines -- test_schedule_single(),
test_schedule_multi(), test_schedule_many() -- already contains that call.


Maxim Uvarov Aug. 5, 2016, 4:45 p.m. UTC | #17
On 5 August 2016 at 19:41, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>

>

> On Fri, Aug 5, 2016 at 11:31 AM, Maxim Uvarov <maxim.uvarov@linaro.org>

> wrote:

>

>> On 08/05/16 19:29, Mike Holmes wrote:

>>

>>>

>>>

>>> On 5 August 2016 at 12:27, Maxim Uvarov <maxim.uvarov@linaro.org

>>> <mailto:maxim.uvarov@linaro.org>> wrote:

>>>

>>>     clear_sched_queues();

>>>

>>>     is missing just before destroying queues + fix form Bill with

>>>     account all term errors.

>>>

>>>

>>> Should this go on the bug ?

>>>

>>> This  this thread should be about this patch to enable process mode or

>>> it will get hijacked

>>>

>>

>> yes, that is test bug, fix has to be in monarch also.

>>

>> Bill, I think you will add this to your patch and send v2.

>>

>

> Sorry for the hijack, but I don't understand this analysis or suggestion.

> Are you saying the cause of the issue Brian identified is a missing

> clear_sched_queues() call somewhere in odp_scheduling.c?  I don't see that

> as each of the scheduled routines -- test_schedule_single(),

> test_schedule_multi(), test_schedul_many() -- already contain that call.

>



Right, but the latest test does not have it. Put it at line 568.







Maxim Uvarov Aug. 5, 2016, 4:47 p.m. UTC | #18
Ah, it does have it, but if you also add it after the second barrier, the test works...

On 5 August 2016 at 19:45, Maxim Uvarov <maxim.uvarov@linaro.org> wrote:

>

>

> On 5 August 2016 at 19:41, Bill Fischofer <bill.fischofer@linaro.org>

> wrote:

>

>>

>>

>> On Fri, Aug 5, 2016 at 11:31 AM, Maxim Uvarov <maxim.uvarov@linaro.org>

>> wrote:

>>

>>> On 08/05/16 19:29, Mike Holmes wrote:

>>>

>>>>

>>>>

>>>> On 5 August 2016 at 12:27, Maxim Uvarov <maxim.uvarov@linaro.org

>>>> <mailto:maxim.uvarov@linaro.org>> wrote:

>>>>

>>>>     clear_sched_queues();

>>>>

>>>>     is missing just before destroying queues + fix form Bill with

>>>>     account all term errors.

>>>>

>>>>

>>>> Should this go on the bug ?

>>>>

>>>> This  this thread should be about this patch to enable process mode or

>>>> it will get hijacked

>>>>

>>>

>>> yes, that is test bug, fix has to be in monarch also.

>>>

>>> Bill, I think you will add this to your patch and send v2.

>>>

>>

>> Sorry for the hijack, but I don't understand this analysis or suggestion.

>> Are you saying the cause of the issue Brian identified is a missing

>> clear_sched_queues() call somewhere in odp_scheduling.c?  I don't see that

>> as each of the scheduled routines -- test_schedule_single(),

>> test_schedule_multi(), test_schedul_many() -- already contain that call.

>>

>

>

> right, but the latest test does not have it. put it to line 568.

Bill Fischofer Aug. 5, 2016, 4:50 p.m. UTC | #19
On Fri, Aug 5, 2016 at 11:45 AM, Maxim Uvarov <maxim.uvarov@linaro.org>
wrote:

> On 5 August 2016 at 19:41, Bill Fischofer <bill.fischofer@linaro.org>
> wrote:
>
>> On Fri, Aug 5, 2016 at 11:31 AM, Maxim Uvarov <maxim.uvarov@linaro.org>
>> wrote:
>>
>>> On 08/05/16 19:29, Mike Holmes wrote:
>>>
>>>> On 5 August 2016 at 12:27, Maxim Uvarov <maxim.uvarov@linaro.org
>>>> <mailto:maxim.uvarov@linaro.org>> wrote:
>>>>
>>>>     clear_sched_queues();
>>>>
>>>>     is missing just before destroying queues + fix from Bill to
>>>>     account for all term errors.
>>>>
>>>> Should this go on the bug?
>>>>
>>>> This thread should be about this patch to enable process mode or
>>>> it will get hijacked
>>>
>>> Yes, that is a test bug; the fix has to be in monarch also.
>>>
>>> Bill, I think you will add this to your patch and send v2.
>>
>> Sorry for the hijack, but I don't understand this analysis or suggestion.
>> Are you saying the cause of the issue Brian identified is a missing
>> clear_sched_queues() call somewhere in odp_scheduling.c?  I don't see that,
>> as each of the scheduled routines -- test_schedule_single(),
>> test_schedule_multi(), test_schedule_many() -- already contains that call.
>
> Right, but the latest test does not have it. Put it at line 568.

That call is on line 554. No scheduler calls are made after that before
the function exits, so I'm not sure what good moving that line would do.

>>> Maxim.
>>>
>>>>     Maxim.
>>>>
>>>>     On 08/05/16 19:01, Bill Fischofer wrote:
>>>>
>>>>         On Fri, Aug 5, 2016 at 10:59 AM, Mike Holmes
>>>>         <mike.holmes@linaro.org <mailto:mike.holmes@linaro.org>> wrote:
>>>>
>>>>             On 4 August 2016 at 11:26, Brian Brooks
>>>>             <brian.brooks@linaro.org <mailto:brian.brooks@linaro.org>>
>>>>             wrote:
>>>>
>>>>                 Reviewed-by: Brian Brooks <brian.brooks@linaro.org
>>>>                 <mailto:brian.brooks@linaro.org>>
>>>>
>>>>             Just wanted to follow up on this, this patch has also
>>>>             highlighted a bug in the original code and the thread below
>>>>             documents it, also a bug for it has been created
>>>>             https://bugs.linaro.org/show_bug.cgi?id=2457
>>>>             <https://bugs.linaro.org/show_bug.cgi?id=2457>
>>>>
>>>>             However this patch is orthogonal to that problem which
>>>>             existed before this patch was created so I think we need to
>>>>             take it so that TgrM testing can continue.
>>>>
>>>>         Agreed. This one should be merged to permit parallel activity,
>>>>         but we'd still like to track down the other bug.
>>>>
>>>>                 On 08/04 09:18:14, Mike Holmes wrote:
>>>>
>>>>                     +ret=0
>>>>                     +
>>>>                     +run()
>>>>                     +{
>>>>                     +     echo odp_scheduling_run_proc starts with $1 worker threads
>>>>                     +     echo =====================================================
>>>>                     +
>>>>                     +     $PERFORMANCE/odp_scheduling${EXEEXT} --odph_proc -c $1 || ret=1
>>>>                     +}
>>>>                     +
>>>>                     +run 1
>>>>                     +run 8
>>>>                     +
>>>>                     +exit $ret
>>>>
>>>>                 Seeing this randomly in both multithread and
>>>>                 multiprocess modes:
>>>>
>>>>                 ../../../odp/platform/linux-generic/odp_queue.c:328:odp_queue_destroy():queue "sched_00_07" not empty
>>>>                 ../../../odp/platform/linux-generic/odp_schedule.c:271:schedule_term_global():Queue not empty
>>>>                 ../../../odp/platform/linux-generic/odp_schedule.c:294:schedule_term_global():Pool destroy fail.
>>>>                 ../../../odp/platform/linux-generic/odp_init.c:188:_odp_term_global():ODP schedule term failed.
>>>>                 ../../../odp/platform/linux-generic/odp_queue.c:170:odp_queue_term_global():Not destroyed queue: sched_00_07
>>>>                 ../../../odp/platform/linux-generic/odp_init.c:195:_odp_term_global():ODP queue term failed.
>>>>                 ../../../odp/platform/linux-generic/odp_pool.c:149:odp_pool_term_global():Not destroyed pool: odp_sched_pool
>>>>                 ../../../odp/platform/linux-generic/odp_pool.c:149:odp_pool_term_global():Not destroyed pool: msg_pool
>>>>                 ../../../odp/platform/linux-generic/odp_init.c:202:_odp_term_global():ODP buffer pool term failed.
>>>>                 ~/odp_incoming/odp_build/test/common_plat/performance$ echo $?
>>>>                 0
>>>>
>>>>                 Potentially two items: one for correctly returning the
>>>>                 failure code, and another related to teardown. Both beyond
>>>>                 the scope of this patch which LGTM.

Brian Brooks Aug. 9, 2016, 5:33 p.m. UTC | #20
On 08/04 10:58:20, Bill Fischofer wrote:
> Quick update. I can repro this in v1.10.0.1 as well, however v1.10.0.0
> seems good.


It is possible to reproduce in v1.10.0.0.

A patch is needed to return the failure code. Bill's patch or something
like returning EXIT_FAILURE if odp_queue_destroy(q) fails will do.

Then, this should help with reproducing:

  $ while ./odp_scheduling -c 2 ; do :; done
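
The loop above only stops at the first non-zero exit status, so it spins
forever on a build that never fails. A bounded variant could look like the
sketch below. The name repro_loop and its arguments are illustrative and not
part of ODP, and it assumes the failure-code fix is already applied so that a
bad teardown actually produces a non-zero exit.

```shell
#!/bin/sh
# repro_loop MAX CMD [ARGS...]   (hypothetical helper, not part of ODP)
# Runs CMD up to MAX times. Returns 1 and reports the run number at the
# first non-zero exit status, or returns 0 if every run passed.
repro_loop()
{
	max=$1
	shift
	i=0
	while [ "$i" -lt "$max" ]; do
		i=$((i + 1))
		if ! "$@"; then
			echo "failed on run $i"
			return 1
		fi
	done
	echo "no failure in $max runs"
	return 0
}

# Stand-in demonstration; 'true' always succeeds.
repro_loop 3 true

# Intended use (assumes the binary is in the current directory):
#   repro_loop 1000 ./odp_scheduling -c 2
```

Passing the command as separate arguments ("$@") keeps the worker count easy
to vary between attempts.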
Mike Holmes Aug. 9, 2016, 5:41 p.m. UTC | #21
Thanks Brian

We have https://bugs.linaro.org/show_bug.cgi?id=2457 for this issue to
continue the discussion.

This patch is reviewed and should go in so that CI can help find the next
thing we missed!

Mike

On 9 August 2016 at 13:33, Brian Brooks <brian.brooks@linaro.org> wrote:

> On 08/04 10:58:20, Bill Fischofer wrote:
> > Quick update. I can repro this in v1.10.0.1 as well, however v1.10.0.0
> > seems good.
>
> It is possible to reproduce in v1.10.0.0.
>
> A patch is needed to return the failure code. Bill's patch or something
> like returning EXIT_FAILURE if odp_queue_destroy(q) fails will do.
>
> Then, this should help with reproducing:
>
>   $ while ./odp_scheduling -c 2 ; do :; done




-- 
Mike Holmes
Technical Manager - Linaro Networking Group
Linaro.org <http://www.linaro.org/> *│ *Open source software for ARM SoCs
"Work should be fun and collaborative, the rest follows"
Mike Holmes Aug. 10, 2016, 1:39 p.m. UTC | #22
Maxim, can we apply this reviewed patch before it is buried :)

On 10 August 2016 at 08:56, Savolainen, Petri (Nokia - FI/Espoo) <
petri.savolainen@nokia-bell-labs.com> wrote:

> I got it replicated with ./odp_scheduling -c 10. Eight or less cores
> didn't expose it (often at least).
>
> It seems that the pool cache patch (ec0d570b8f76... jul 22) introduces it.
>
> -Petri
>
> > -----Original Message-----
> > From: Brian Brooks [mailto:brian.brooks@linaro.org]
> > Sent: Tuesday, August 09, 2016 8:33 PM
> > To: Bill Fischofer <bill.fischofer@linaro.org>
> > Cc: Savolainen, Petri (Nokia - FI/Espoo)
> > <petri.savolainen@nokia-bell-labs.com>; Mike Holmes
> > <mike.holmes@linaro.org>; LNG ODP Mailman List <lng-odp@lists.linaro.org>
> > Subject: Re: [lng-odp] [PATCH] test:linux-generic: run odp_scheduling in
> > process mode
> >
> > On 08/04 10:58:20, Bill Fischofer wrote:
> > > Quick update. I can repro this in v1.10.0.1 as well, however v1.10.0.0
> > > seems good.
> >
> > It is possible to reproduce in v1.10.0.0.
> >
> > A patch is needed to return the failure code. Bill's patch or something
> > like returning EXIT_FAILURE if odp_queue_destroy(q) fails will do.
> >
> > Then, this should help with reproducing:
> >
> >   $ while ./odp_scheduling -c 2 ; do :; done




Maxim Uvarov Aug. 10, 2016, 2 p.m. UTC | #23
Merged,
Maxim.


Patch

diff --git a/configure.ac b/configure.ac
index c0f0f21..6551287 100644
--- a/configure.ac
+++ b/configure.ac
@@ -169,6 +169,7 @@  AM_CONDITIONAL([test_installdir], [test "$testdir" != ""])
 AM_CONDITIONAL([cunit_support], [test x$cunit_support = xyes ])
 AM_CONDITIONAL([test_vald], [test x$test_vald = xyes ])
 AM_CONDITIONAL([test_perf], [test x$test_perf = xyes ])
+AM_CONDITIONAL([test_perf_proc], [test x$test_perf_proc = xyes ])
 AM_CONDITIONAL([test_cpp], [test x$test_cpp = xyes ])
 AM_CONDITIONAL([test_helper], [test x$test_helper = xyes ])
 AM_CONDITIONAL([test_example], [test x$test_example = xyes ])
@@ -302,6 +303,7 @@  AC_MSG_RESULT([
 	cunit:			${cunit_support}
 	test_vald:		${test_vald}
 	test_perf:		${test_perf}
+	test_perf_proc:		${test_perf_proc}
 	test_cpp:		${test_cpp}
 	test_helper:		${test_helper}
 	test_example:		${test_example}
diff --git a/test/linux-generic/Makefile.am b/test/linux-generic/Makefile.am
index f5cc52d..4660cf0 100644
--- a/test/linux-generic/Makefile.am
+++ b/test/linux-generic/Makefile.am
@@ -3,7 +3,7 @@  TESTS_ENVIRONMENT += TEST_DIR=${top_builddir}/test/common_plat/validation
 
 ALL_API_VALIDATION_DIR = ${top_builddir}/test/common_plat/validation/api
 
-SUBDIRS =
+SUBDIRS = performance
 
 if test_vald
 TESTS = validation/api/pktio/pktio_run.sh \
diff --git a/test/linux-generic/m4/configure.m4 b/test/linux-generic/m4/configure.m4
index 9eec545..6b92201 100644
--- a/test/linux-generic/m4/configure.m4
+++ b/test/linux-generic/m4/configure.m4
@@ -1,5 +1,8 @@ 
+m4_include([test/linux-generic/m4/performance.m4])
+
 AC_CONFIG_FILES([test/linux-generic/Makefile
 		 test/linux-generic/validation/api/shmem/Makefile
 		 test/linux-generic/validation/api/pktio/Makefile
 		 test/linux-generic/pktio_ipc/Makefile
-		 test/linux-generic/ring/Makefile])
+		 test/linux-generic/ring/Makefile
+		 test/linux-generic/performance/Makefile])
diff --git a/test/linux-generic/m4/performance.m4 b/test/linux-generic/m4/performance.m4
new file mode 100644
index 0000000..7f54b96
--- /dev/null
+++ b/test/linux-generic/m4/performance.m4
@@ -0,0 +1,9 @@ 
+##########################################################################
+# Enable/disable test-perf-proc
+##########################################################################
+test_perf_proc=no
+AC_ARG_ENABLE([test-perf-proc],
+    [  --enable-test-perf-proc      run test in test/performance in process mode],
+    [if test "x$enableval" = "xyes"; then
+        test_perf_proc=yes
+    fi])
diff --git a/test/linux-generic/performance/.gitignore b/test/linux-generic/performance/.gitignore
new file mode 100644
index 0000000..7e563b8
--- /dev/null
+++ b/test/linux-generic/performance/.gitignore
@@ -0,0 +1,2 @@ 
+*.log
+*.trs
diff --git a/test/linux-generic/performance/Makefile.am b/test/linux-generic/performance/Makefile.am
new file mode 100644
index 0000000..cb72fce
--- /dev/null
+++ b/test/linux-generic/performance/Makefile.am
@@ -0,0 +1,13 @@ 
+include $(top_srcdir)/test/Makefile.inc
+
+TESTS_ENVIRONMENT += TEST_DIR=${builddir}
+
+TESTSCRIPTS = odp_scheduling_run_proc.sh
+
+TEST_EXTENSIONS = .sh
+
+if test_perf_proc
+TESTS = $(TESTSCRIPTS)
+endif
+
+EXTRA_DIST = $(TESTSCRIPTS)
diff --git a/test/linux-generic/performance/odp_scheduling_run_proc.sh b/test/linux-generic/performance/odp_scheduling_run_proc.sh
new file mode 100755
index 0000000..b3ef26f
--- /dev/null
+++ b/test/linux-generic/performance/odp_scheduling_run_proc.sh
@@ -0,0 +1,26 @@ 
+#!/bin/sh
+#
+# Copyright (c) 2016, Linaro Limited
+# All rights reserved.
+#
+# SPDX-License-Identifier:	BSD-3-Clause
+#
+# Script that passes command line arguments to odp_scheduling test when
+# launched by 'make check'
+
+TEST_DIR="${TEST_DIR:-$(dirname $0)}"
+PERFORMANCE="$TEST_DIR/../../common_plat/performance"
+ret=0
+
+run()
+{
+	echo odp_scheduling_run_proc starts with $1 worker threads
+	echo =====================================================
+
+	$PERFORMANCE/odp_scheduling${EXEEXT} --odph_proc -c $1 || ret=1
+}
+
+run 1
+run 8
+
+exit $ret
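
The run() helper in odp_scheduling_run_proc.sh records a failing run in ret
without stopping the runs that follow, so one bad worker count still fails
the script as a whole. A minimal self-contained sketch of that pattern is
below. fake_test is a hypothetical stand-in for the odp_scheduling binary,
chosen here to fail only for 8 workers, and the final exit is replaced by an
echo so the snippet can be pasted into an interactive shell.

```shell
#!/bin/sh
# fake_test N: hypothetical stand-in for "odp_scheduling --odph_proc -c N".
# It succeeds for every worker count except 8.
fake_test()
{
	[ "$1" -ne 8 ]
}

ret=0

run()
{
	echo "starts with $1 worker threads"
	# A failing run sets ret=1 but does not stop later runs.
	fake_test "$1" || ret=1
}

run 1
run 8
run 2

# A real test script would finish with: exit "$ret"
echo "overall status: $ret"
```

Because ret is only ever set to 1 and never reset, the order of the runs does
not affect the final status.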