| Message ID | 20170102145637.GA8760@linaro.org |
| ---------- | -------------------------------- |
| State      | New                              |
On 01/02, Vincent Guittot wrote:
>Hi Xiaolong,
>
>Le Monday 19 Dec 2016 à 08:14:53 (+0800), kernel test robot a écrit :
>>
>> Greeting,
>>
>> FYI, we noticed a -4.5% regression of unixbench.score due to commit:
>
>I have been able to restore performance on my platform with the patch below.
>Could you test it ?
>
>---
> kernel/sched/core.c | 1 +
> 1 file changed, 1 insertion(+)
>
>diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>index 393759b..6e7d45c 100644
>--- a/kernel/sched/core.c
>+++ b/kernel/sched/core.c
>@@ -2578,6 +2578,7 @@ void wake_up_new_task(struct task_struct *p)
> 	__set_task_cpu(p, select_task_rq(p, task_cpu(p), SD_BALANCE_FORK, 0));
> #endif
> 	rq = __task_rq_lock(p, &rf);
>+	update_rq_clock(rq);
> 	post_init_entity_util_avg(&p->se);
>
> 	activate_task(rq, p, 0);
>--
>2.7.4
>
>Vincent

Hi, Vincent,

I applied your fix patch on top of 6b94780 ("sched/core: Use load_avg for
selecting idlest group"), and here is the comparison
(60df283834fd4def3c11ad2de3 is the fix commit id). It seems the performance
hasn't been restored.
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
  gcc-6/performance/x86_64-rhel-7.2/100%/debian-x86_64-2016-08-31.cgz/300s/lkp-wsm-ep1/shell1/unixbench

commit:
  f519a3f1c6b7a990e5aed37a8f853c6ecfdee945
  6b94780e45c17b83e3e75f8aaca5a328db583c74
  60df283834fd4def3c11ad2de3e6fc9e81b7dff1

f519a3f1c6b7a990 6b94780e45c17b83e3e75f8aac 60df283834fd4def3c11ad2de3
---------------- -------------------------- --------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
     25565 ±  0%      -4.5%      24414 ±  0%      -4.5%      24421 ±  0%  unixbench.score
  13223805 ±  2%     -19.6%   10628072 ±  0%     -21.3%   10412818 ±  1%  unixbench.time.involuntary_context_switches
 9.232e+08 ±  0%      -4.3%  8.831e+08 ±  0%      -4.3%  8.838e+08 ±  0%  unixbench.time.minor_page_faults
      1807 ±  0%      -5.4%       1709 ±  0%      -5.6%       1705 ±  0%  unixbench.time.percent_of_cpu_this_job_got
      5656 ±  0%      -6.8%       5271 ±  0%      -7.3%       5243 ±  0%  unixbench.time.system_time
      5743 ±  0%      -4.0%       5514 ±  0%      -3.9%       5516 ±  0%  unixbench.time.user_time
  29557557 ±  0%      -2.6%   28781098 ±  0%      -2.2%   28919280 ±  0%  unixbench.time.voluntary_context_switches
    741766 ±  2%     -62.4%     279054 ±  1%     -61.8%     283034 ±  1%  interrupts.CAL:Function_call_interrupts
   2912823 ±  0%      -9.7%    2630010 ±  0%      -8.7%    2660077 ±  0%  softirqs.RCU
  13223805 ±  2%     -19.6%   10628072 ±  0%     -21.3%   10412818 ±  1%  time.involuntary_context_switches
    126250 ±  0%     -12.2%     110890 ±  0%     -11.5%     111739 ±  0%  vmstat.system.cs
     31060 ±  1%      -9.2%      28214 ±  0%      -9.6%      28078 ±  0%  vmstat.system.in
    454.50 ±150%    +164.7%       1203 ±166%    +792.3%       4055 ± 18%  numa-numastat.node0.numa_foreign
    454.50 ±150%    +164.7%       1203 ±166%    +792.3%       4055 ± 18%  numa-numastat.node0.numa_miss
      4297 ± 15%     -18.1%       3520 ± 57%     -84.5%     666.40 ±113%  numa-numastat.node1.numa_foreign
      4297 ± 15%     -18.1%       3520 ± 57%     -84.5%     666.40 ±113%  numa-numastat.node1.numa_miss
     78.58 ±  0%      -5.6%      74.20 ±  0%      -6.0%      73.90 ±  0%  turbostat.%Busy
      2507 ±  0%      -5.6%       2366 ±  0%      -6.0%       2356 ±  0%  turbostat.Avg_MHz
      3.01 ±  2%    +100.4%       6.03 ±  2%    +100.1%       6.02 ±  0%  turbostat.CPU%c3
      2.35 ±  1%      +6.8%       2.51 ±  4%     +12.1%       2.64 ±  1%  turbostat.CPU%c6
      1.25 ±  5%     -17.1%       1.04 ± 22%     -32.3%       0.85 ±  5%  perf-profile.children.cycles-pp.__irqentry_text_start

Thanks,
Xiaolong

>
>>
>>
>> commit: 6b94780e45c17b83e3e75f8aaca5a328db583c74 ("sched/core: Use load_avg for selecting idlest group")
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>>
>> in testcase: unixbench
>> on test machine: 24 threads Nehalem-EP with 24G memory
>> with following parameters:
>>
>> 	runtime: 300s
>> 	nr_task: 100%
>> 	test: shell1
>> 	cpufreq_governor: performance
>>
>> test-description: UnixBench is the original BYTE UNIX benchmark suite that aims to test the performance of Unix-like systems.
>> test-url: https://github.com/kdlucas/byte-unixbench
>>
>> In addition to that, the commit also has significant impact on the following tests:
>>
>> +------------------+-----------------------------------------------------------------------+
>> | testcase: change | unixbench: unixbench.score -2.9% regression                           |
>> | test machine     | 8 threads Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz with 6G memory       |
>> | test parameters  | nr_task=1                                                             |
>> |                  | runtime=300s                                                          |
>> |                  | test=shell8                                                           |
>> +------------------+-----------------------------------------------------------------------+
>>
>> Details are as below:
>> -------------------------------------------------------------------------------------------------->
>>
>> To reproduce:
>>
>>         git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>>         cd lkp-tests
>>         bin/lkp install job.yaml  # job file is attached in this email
>>         bin/lkp run     job.yaml
>>
>> testcase/path_params/tbox_group/run: unixbench/300s-100%-shell1-performance/lkp-wsm-ep1
>>
>> f519a3f1c6b7a990  6b94780e45c17b83e3e75f8aac
>> ----------------  --------------------------
>>          25565          -5%      24414  unixbench.score
>>       29557557               28781098  unixbench.time.voluntary_context_switches
>>           5743          -4%       5514  unixbench.time.user_time
>>      9.232e+08          -4%  8.831e+08  unixbench.time.minor_page_faults
>>           1807          -5%       1709  unixbench.time.percent_of_cpu_this_job_got
>>           5656          -7%       5271  unixbench.time.system_time
>>       13223805         -20%   10628072  unixbench.time.involuntary_context_switches
>>         741766         -62%     279054  interrupts.CAL:Function_call_interrupts
>>          31060          -9%      28214  vmstat.system.in
>>         126250         -12%     110890  vmstat.system.cs
>>          78.58          -6%      74.20  turbostat.%Busy
>>           2507          -6%       2366  turbostat.Avg_MHz
>>           9134 ± 47%  -6e+03       2973 ± 36%  latency_stats.max.pipe_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
>>         380879 ± 10%   5e+05     887692 ± 49%  latency_stats.sum.wait_on_page_bit_killable.__lock_page_or_retry.filemap_fault.__do_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>>          31710 ± 15%  -2e+04      10583 ± 14%  latency_stats.sum.call_rwsem_down_write_failed.__vma_adjust.__split_vma.do_munmap.vm_munmap.elf_map.load_elf_binary.search_binary_handler.do_execveat_common.SyS_execve.do_syscall_64.return_from_SYSCALL_64
>>          51796 ±  4%  -4e+04      15457 ± 10%  latency_stats.sum.call_rwsem_down_write_failed.unlink_file_vma.free_pgtables.unmap_region.do_munmap.vm_munmap.elf_map.load_elf_binary.search_binary_handler.do_execveat_common.SyS_execve.do_syscall_64
>>         111998 ± 18%  -7e+04      37074 ± 14%  latency_stats.sum.call_rwsem_down_write_failed.__vma_adjust.__split_vma.do_munmap.mmap_region.do_mmap.vm_mmap_pgoff.SyS_mmap_pgoff.SyS_mmap.entry_SYSCALL_64_fastpath
>>         275087 ± 15%  -2e+05      81973 ±  3%  latency_stats.sum.call_rwsem_down_write_failed.unlink_file_vma.free_pgtables.unmap_region.do_munmap.mmap_region.do_mmap.vm_mmap_pgoff.SyS_mmap_pgoff.SyS_mmap.entry_SYSCALL_64_fastpath
>>         930993 ± 12%  -6e+05     320520 ±  4%  latency_stats.sum.call_rwsem_down_write_failed.vma_link.mmap_region.do_mmap.vm_mmap_pgoff.vm_mmap.elf_map.load_elf_binary.search_binary_handler.do_execveat_common.SyS_execve.do_syscall_64
>>        4755783 ±  9%  -3e+06    1619348 ±  4%  latency_stats.sum.call_rwsem_down_write_failed.__vma_adjust.__split_vma.split_vma.mprotect_fixup.do_mprotect_pkey.SyS_mprotect.entry_SYSCALL_64_fastpath
>>        5536067 ± 10%  -4e+06    1929338 ±  3%  latency_stats.sum.call_rwsem_down_write_failed.copy_process._do_fork.SyS_clone.do_syscall_64.return_from_SYSCALL_64
>>      9.032e+08          -4%   8.64e+08  perf-stat.page-faults
>>      9.032e+08          -4%   8.64e+08  perf-stat.minor-faults
>>      2.329e+09              2.269e+09  perf-stat.node-load-misses
>>        2.2e+09          -9%  2.011e+09 ±  5%  perf-stat.dTLB-store-misses
>>      3.278e+10          -9%  2.987e+10 ±  6%  perf-stat.dTLB-load-misses
>>       19484819          13%   21974129  perf-stat.cpu-migrations
>>      3.755e+13          -6%   3.54e+13  perf-stat.cpu-cycles
>>           3244           4%       3379  perf-stat.instructions-per-iTLB-miss
>>      4.536e+12          -4%  4.332e+12  perf-stat.branch-instructions
>>      2.303e+13          -4%  2.208e+13  perf-stat.instructions
>>      5.768e+12          -4%  5.517e+12  perf-stat.dTLB-loads
>>      3.567e+11          -4%  3.414e+11  perf-stat.cache-references
>>           2.97                    2.93  perf-stat.branch-miss-rate%
>>      2.768e+10              2.699e+10  perf-stat.node-stores
>>      5.446e+10          -3%  5.275e+10  perf-stat.cache-misses
>>           0.03          -4%       0.03  perf-stat.iTLB-load-miss-rate%
>>      9.673e+09          -4%  9.294e+09  perf-stat.node-loads
>>      3.596e+12          -4%  3.442e+12  perf-stat.dTLB-stores
>>           0.61                    0.62  perf-stat.ipc
>>      1.347e+11          -6%   1.27e+11  perf-stat.branch-misses
>>      7.098e+09          -8%  6.533e+09  perf-stat.iTLB-load-misses
>>      2.309e+13          -4%  2.206e+13  perf-stat.iTLB-loads
>>       79911173         -12%   70187035  perf-stat.context-switches
>>
>>
>> turbostat._Busy
>>
>> [ASCII trend plot: turbostat.%Busy over samples; * = bisect-good, O = bisect-bad]
>>
>> unixbench.time.percent_of_cpu_this_job_got
>>
>> [ASCII trend plot: unixbench.time.percent_of_cpu_this_job_got over samples; * = bisect-good, O = bisect-bad]
>>
>> vmstat.system.in
>>
>> [ASCII trend plot: vmstat.system.in over samples; * = bisect-good, O = bisect-bad]
>>
>> [*] bisect-good sample
>> [O] bisect-bad sample
>>
>>
>> Disclaimer:
>> Results have been estimated based on internal Intel analysis and are provided
>> for informational purposes only. Any difference in system hardware or software
>> design or configuration may affect actual performance.
>>
>>
>> Thanks,
>> Xiaolong
Hi Xiaolong,

Thanks for testing. I'm going to look for another root cause.

A -2.9% regression was also mentioned with an 8 threads Intel(R) Core(TM)
i7 CPU 870 @ 2.93GHz with 6G memory. Have you checked this platform too ?

Regards,
Vincent

On 3 January 2017 at 08:13, Ye Xiaolong <xiaolong.ye@intel.com> wrote:
> On 01/02, Vincent Guittot wrote:
>> Hi Xiaolong,
>>
>> [...]
>
> Hi, Vincent,
>
> I applied your fix patch on top of 6b94780 ("sched/core: Use load_avg for selecting idlest group"),
> and here is the comparison. (60df283834fd4def3c11ad2de3 is the fix commit id).
> Seems the performance hasn't been restored.

Thanks for testing.

> [...]
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 393759b..6e7d45c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2578,6 +2578,7 @@ void wake_up_new_task(struct task_struct *p)
 	__set_task_cpu(p, select_task_rq(p, task_cpu(p), SD_BALANCE_FORK, 0));
 #endif
 	rq = __task_rq_lock(p, &rf);
+	update_rq_clock(rq);
 	post_init_entity_util_avg(&p->se);
 
 	activate_task(rq, p, 0);