Message ID | yddlgxh6bsj.fsf@CeBiTec.Uni-Bielefeld.DE |
---|---|
State | New |
> Ok for mainline (and eventually for 5 and 6 branches given the small
> size and low risk of the patch)?

I'm not familiar with lang_checks_parallelized, but that's OK with me on principle.

Arno
On Fri, Oct 21, 2016 at 04:01:48PM +0200, Rainer Orth wrote:
> I happened to notice that the gnat.dg testsuite run is slow even on a
> reasonably fast SPARC machine (3.6 GHz SPARC T5) and together with the
> libgomp testsuite (PR libgomp/66005) dominates bootstrap time: within a
> make -j96 -k check, it takes 1h 18m 37s.  For unknown reasons,
> check-gnat isn't parallelized though it is trivial to do and buys quite
> a bit:

check-gnat dominates anything?  That's just really weird, it has only

# of expected passes            2544
# of unexpected failures        2
# of expected failures          24
# of unsupported tests          3

compared to the 100000+ tests in gcc/g++ or 40000+ in gfortran testsuites
it is just nothing.

libgomp is a known problem, sure, the problem with parallelizing it is that
many tests just use all available cores/threads.  Perhaps we should do some
small (at most 2 or 3 concurrent libgomp tests) parallelization of the
libgomp testsuite unless disallowed through some env var option, but in that
case bound OMP_NUM_THREADS if `getconf _NPROCESSORS_ONLN` > 32 to
`getconf _NPROCESSORS_ONLN` / 2 or something similar.

I'm not strongly against your patch, I'm just very surprised it is really
needed (acats is much larger, check-gnat is small).

> 2016-10-21  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
>
>       * gcc-interface/Make-lang.in (lang_checks_parallelized): New target.
>       (check_gnat_parallelize): Likewise.

        Jakub
> I'm not strongly against your patch, I'm just very surprised it is really
> needed (acats is much larger, check-gnat is small).

In what unit do you count?  ACATS has fewer tests than gnat.dg nowadays.

-- 
Eric Botcazou
On Oct 21, 2016, at 9:54 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>
>> I'm not strongly against your patch, I'm just very surprised it is really
>> needed (acats is much larger, check-gnat is small).
>
> In what unit do you count?  ACATS has fewer tests than gnat.dg nowadays.

The only unit that matters, wall seconds.
On Oct 21, 2016, at 7:01 AM, Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> wrote:
>
> I happened to notice that the gnat.dg testsuite run is slow
> 2.6 GHz AMD Opteron 8435, -j24       43m 24s => 33m 4s
> 2.93 GHz Intel Xeon X7350, -j16      30m 7s  => 9m 8s
> 2.67 GHz Intel Xeon X7542, -j48      14m 56s => 5m 50s
>
> Seems like a worthwhile speedup to me.
> Ok for mainline

I like the change as well (if it shortens bootstrap and/or check).
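To put the three "before => after" timings quoted above on a common scale, here is a small Python sketch that converts them to seconds and computes the speedup factor. The helper names `to_seconds` and `speedup` are made up for this illustration; the timing strings are copied from the quoted message.

```python
import re

# Seconds per unit for durations written like "1h 18m 37s" or "43m 24s".
UNITS = {"h": 3600, "m": 60, "s": 1}


def to_seconds(text: str) -> int:
    """Convert a duration such as '43m 24s' to a number of seconds."""
    return sum(int(n) * UNITS[u] for n, u in re.findall(r"(\d+)\s*([hms])", text))


def speedup(before: str, after: str) -> float:
    """Ratio of the old wall-clock time to the new one."""
    return to_seconds(before) / to_seconds(after)


# (machine, before, after) as quoted in the message.
TIMINGS = [
    ("2.6 GHz AMD Opteron 8435, -j24", "43m 24s", "33m 4s"),
    ("2.93 GHz Intel Xeon X7350, -j16", "30m 7s", "9m 8s"),
    ("2.67 GHz Intel Xeon X7542, -j48", "14m 56s", "5m 50s"),
]

if __name__ == "__main__":
    for machine, before, after in TIMINGS:
        print(f"{machine}: {speedup(before, after):.2f}x")
```

The factors come out at roughly 1.3x, 3.3x and 2.6x respectively, so the gain varies considerably between the three machines.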
Hi Jakub,

> On Fri, Oct 21, 2016 at 04:01:48PM +0200, Rainer Orth wrote:
>> I happened to notice that the gnat.dg testsuite run is slow even on a
>> reasonably fast SPARC machine (3.6 GHz SPARC T5) and together with the
>> libgomp testsuite (PR libgomp/66005) dominates bootstrap time: within a
>> make -j96 -k check, it takes 1h 18m 37s.  For unknown reasons,
>> check-gnat isn't parallelized though it is trivial to do and buys quite
>> a bit:
>
> check-gnat dominates anything?  That's just really weird,
> it has only
> # of expected passes            2544
> # of unexpected failures        2
> # of expected failures          24
> # of unsupported tests          3
>
> compared to the 100000+ tests in gcc/g++ or 40000+ in gfortran testsuites
> it is just nothing.

That's comparing apples and oranges: the gnat.dg (and acats) tests are
all compile or even run tests, while within the gcc or g++ testsuites
you're also counting dg-error, dg-warning and some such, which are much
cheaper.

What ultimately matters is wall clock time, though (from a make -j96 run
before my patch):

                start           end             #tests  #partitions

acats           12:28:51        13:24:25        2320    19
g++             12:28:57        14:00:13        210885  48
gcc             12:28:57        14:10:41        197266  90
gfortran        12:28:57        13:47:40        86959   32
gnat            12:28:52        14:17:16        5100    1
go              12:28:57        13:17:02        14636   11
obj-c++         12:28:53        12:48:47        3074    1
objc            12:28:57        13:14:38        5742    6

Here you can see what I mean by dominate: even on this relatively fast
system (3.6 GHz SPARC T5), the gnat testsuite runs for several minutes
beyond everything else in gcc/testsuite, thus determining the end of
the bootstrap.  The effect becomes much more pronounced on slower boxes
(UltraSPARC T2 for example) where the machine is almost idle, running
just a single instance of runtest for half an hour or more.

> libgomp is a known problem, sure, the problem with parallelizing it is that
> many tests just use all available cores/threads.  Perhaps we should do some

Right, the same holds for the Cilk+ tests as well: I'm including my
libcilkrts-on-sparc patch in my bootstraps and often see one or two
tests failing because they time out, grabbing all 96 strands within a
make -j96 check...

> small (at most 2 or 3 concurrent libgomp tests) parallelization of the
> libgomp testsuite unless disallowed through some env var option, but in that
> case bound OMP_NUM_THREADS if `getconf _NPROCESSORS_ONLN` > 32 to
> `getconf _NPROCESSORS_ONLN` / 2 or something similar.

That would certainly be a start, even though _NPROCESSORS_ONLN/2 can
still be a bit much on larger systems, especially if they are already
running make -j_NPROCESSORS_ONLN check (or with even more parallelism).
Besides, there's no reason to limit the parallel number of compile tests
in this way.  But certainly, every single bit helps: the libgomp
testsuite right now is what really dominates make check time, check-gnat
was just a low-hanging fruit.

> I'm not strongly against your patch, I'm just very surprised it is really
> needed (acats is much larger, check-gnat is small).

Not really: on that SPARC T5 system, I have (sequential gnat.dg
vs. acats with 19 partitions), all within a -j96 bootstrap:

                wall clock              #tests

gnat.dg         6505s = 108m 25s        5100
acats           3334s =  55m 34s        2320

compared to (one week later) parallel gnat.dg (5 partitions):

gnat.dg         2458s =  40m 58s        5104

Right now, gnat.dg is larger since it's run for all multilibs (two in
this case) while acats is for the default multilib only (until I finish
my `convert acats to dg' patch).

        Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University
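The start/end table in the message above can be checked mechanically. The following Python sketch (names such as `RUNS` and `elapsed` are chosen for this illustration only) computes each suite's wall-clock duration from the quoted timestamps and finds which suite finishes last and by how much. Since the timestamps have one-second granularity, a derived duration can differ by a second from the figures quoted elsewhere in the thread.

```python
from datetime import datetime

# (suite, start, end) copied from the make -j96 table in the message.
RUNS = [
    ("acats", "12:28:51", "13:24:25"),
    ("g++", "12:28:57", "14:00:13"),
    ("gcc", "12:28:57", "14:10:41"),
    ("gfortran", "12:28:57", "13:47:40"),
    ("gnat", "12:28:52", "14:17:16"),
    ("go", "12:28:57", "13:17:02"),
    ("obj-c++", "12:28:53", "12:48:47"),
    ("objc", "12:28:57", "13:14:38"),
]


def elapsed(start: str, end: str) -> int:
    """Seconds between two same-day HH:MM:SS timestamps."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


durations = {name: elapsed(start, end) for name, start, end in RUNS}

# Zero-padded HH:MM:SS strings sort chronologically within one day.
by_end = sorted(RUNS, key=lambda run: run[2])
last, runner_up = by_end[-1], by_end[-2]
gap = elapsed(runner_up[2], last[2])

if __name__ == "__main__":
    for name, secs in sorted(durations.items(), key=lambda kv: kv[1]):
        print(f"{name:10s} {secs:5d}s")
    print(f"{last[0]} finishes {gap}s after {runner_up[0]}")
```

With these timestamps the single-partition gnat run is the last to finish, about six and a half minutes after gcc, which is the sense in which it determines the end of the whole check run.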
On Mon, Oct 24, 2016 at 11:12:20AM +0200, Rainer Orth wrote:
> Not really: on that SPARC T5 system, I have (sequential gnat.dg
> vs. acats with 19 partitions), all within a -j96 bootstrap:
>
>               wall clock              #tests
>
> gnat.dg       6505s = 108m 25s        5100

gnat.dg takes for me 8m (x86_64 Haswell-E, 4GHz, 2 parallel -j16
bootstraps/regtests on 8c/16ht), which is why I've been
so surprised it runs so much slower on SPARC.
Anyway, the patch is ok if it is so much slower on other systems.

        Jakub
Hi Jakub,

> On Mon, Oct 24, 2016 at 11:12:20AM +0200, Rainer Orth wrote:
>> Not really: on that SPARC T5 system, I have (sequential gnat.dg
>> vs. acats with 19 partitions), all within a -j96 bootstrap:
>>
>>              wall clock              #tests
>>
>> gnat.dg      6505s = 108m 25s        5100
>
> gnat.dg takes for me 8m (x86_64 Haswell-E, 4GHz, 2 parallel -j16
> bootstraps/regtests on 8c/16ht), which is why I've been
> so surprised it runs so much slower on SPARC.

It's 15m (-m32 and -m64) on a 4-socket Intel Xeon X7542 (2.67 GHz, -j48
bootstrap on 4 x 6c/12ht) for me, but even other x86 systems (like AMD
Opteron 8435) take about thrice as long.

        Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University
# HG changeset patch
# Parent  13db0c5f22f787b7a09b81e1173677a02afa240d
Parallelize check-gnat

diff --git a/gcc/ada/gcc-interface/Make-lang.in b/gcc/ada/gcc-interface/Make-lang.in
--- a/gcc/ada/gcc-interface/Make-lang.in
+++ b/gcc/ada/gcc-interface/Make-lang.in
@@ -863,6 +863,9 @@ ada.stagefeedback: stagefeedback-start
 	-$(MV) ada/stamp-* stagefeedback/ada
 
 lang_checks += check-gnat
+lang_checks_parallelized += check-gnat
+# For description see the check_$lang_parallelize comment in gcc/Makefile.in.
+check_gnat_parallelize = 1000
 
 check-ada: check-acats check-gnat
 check-ada-subtargets: check-acats-subtargets check-gnat-subtargets