Message ID | 20240814071424.2655666-1-jeffxu@chromium.org |
---|---|
Series | mremap refactor: check src address for vma boundaries first. |
Hi Oliver,

On Tue, Aug 20, 2024 at 11:19 PM Oliver Sang <oliver.sang@intel.com> wrote:
>
> hi, Jeff,
>
> here is an update per your test request.
>
> we extended the runtime to 600 seconds, and ran 10 times for each commit.
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
>   gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/***600s***
>
> commit:
>   ff388fe5c4 ("mseal: wire up mseal syscall")
>   8be7258aad ("mseal: add mseal syscall")
>   2a78ece39f  <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
>
> ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66
> ---------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \
>  1.886e+08 ± 0%      -5.0%  1.792e+08 ± 0%      -3.4%  1.821e+08 ± 0%  stress-ng.pagemove.ops
>     314345 ± 0%      -5.0%     298656 ± 0%      -3.4%     303565 ± 0%  stress-ng.pagemove.ops_per_sec
>

Thanks for testing with more samples.
The result is reasonable and consistent with the 60-second result.
The -3.4% reflects the impact from munmap, which isn't covered by this patch.

>
> the score of stress-ng.pagemove.ops_per_sec has some difference with 60s
> run (list as below for comparison). but the trend is similar.
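
For reference, the %change columns in the 600s summary can be reproduced from the ops_per_sec means. This is only a minimal sketch of the arithmetic, assuming %change is the relative difference against the first (ff388fe5c4) column; the helper name `pct_change` is illustrative and is not part of the LKP tooling.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


# stress-ng.pagemove.ops_per_sec means taken from the 600s summary table
base = 314345     # ff388fe5c4 ("mseal: wire up mseal syscall")
mseal = 298656    # 8be7258aad ("mseal: add mseal syscall")
patched = 303565  # 2a78ece39f (mremap refactor patch)

print(f"8be7258aad vs base: {pct_change(base, mseal):+.1f}%")
print(f"2a78ece39f vs base: {pct_change(base, patched):+.1f}%")
```

With these inputs the two lines come out at -5.0% and -3.4%, matching the table.
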
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
>   gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/***60s***
>
> commit:
>   ff388fe5c4 ("mseal: wire up mseal syscall")
>   8be7258aad ("mseal: add mseal syscall")
>   2a78ece39f  <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
>
> ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66
> ---------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \
>   18386219 ± 0%      -5.0%   17474214 ± 0%      -2.9%   17850959 ± 0%  stress-ng.pagemove.ops
>     306421 ± 0%      -5.0%     291207 ± 0%      -2.9%     297490 ± 0%  stress-ng.pagemove.ops_per_sec
>
>
> since the data is stable, %stddev shows as "± 0%" in both above tables.
> let me give out the detail data for 600s runs.
>
> for
>   ff388fe5c4 ("mseal: wire up mseal syscall")
>
>   "stress-ng.pagemove.ops": [
>     188545955,
>     188681834,
>     188907282,
>     188345009,
>     188729465,
>     188312187,
>     188897283,
>     188209713,
>     188425965,
>     189026136
>   ],
>   "stress-ng.pagemove.ops_per_sec": [
>     314242.1,
>     314467.13,
>     314841.5,
>     313907.19,
>     314548.11,
>     313852.5,
>     314827.84,
>     313680.74,
>     314042.14,
>     315042.79
>   ],
>
> for
>   8be7258aad ("mseal: add mseal syscall")
>
>   "stress-ng.pagemove.ops": [
>     179127848,
>     179401350,
>     179350278,
>     179023817,
>     179106624,
>     179535213,
>     178936504,
>     178870141,
>     179462171,
>     179136065
>   ],
>   "stress-ng.pagemove.ops_per_sec": [
>     298545.54,
>     299000.95,
>     298915.62,
>     298371.45,
>     298509.15,
>     299223.65,
>     298226.74,
>     298115.08,
>     299101.23,
>     298558.74
>   ],
>
> for
>   2a78ece39f  <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
>
>   "stress-ng.pagemove.ops": [
>     182188207,
>     182288813,
>     182483678,
>     181980233,
>     182249440,
>     181837961,
>     182155893,
>     181699445,
>     182347580,
182174597 > ], > "stress-ng.pagemove.ops_per_sec": [ > 303643.28, > 303814.05, > 304138.38, > 303298.9, > 303747.33, > 303060.84, > 303592.48, > 302831.56, > 303909.81, > 303622.07 > ], > > > for 600s run, below is the full comparion. > > ========================================================================================= > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: > gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/***600s*** > > commit: > ff388fe5c4 ("mseal: wire up mseal syscall") > 8be7258aad ("mseal: add mseal syscall") > 2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm" > > ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66 > ---------------- --------------------------- --------------------------- > %stddev %change %stddev %change %stddev > \ | \ | \ > 4667 ą 0% -2.4% 4553 ą 0% -1.6% 4593 ą 0% vmstat.system.cs > 4.192e+08 ą 0% -4.3% 4.012e+08 ą 0% -2.8% 4.075e+08 ą 0% proc-vmstat.numa_hit > 4.192e+08 ą 0% -4.3% 4.011e+08 ą 0% -2.8% 4.074e+08 ą 0% proc-vmstat.numa_local > 7.843e+08 ą 0% -4.3% 7.504e+08 ą 0% -2.8% 7.623e+08 ą 0% proc-vmstat.pgalloc_normal > 7.836e+08 ą 0% -4.3% 7.498e+08 ą 0% -2.8% 7.616e+08 ą 0% proc-vmstat.pgfree > 1174825 ą 0% -2.6% 1143891 ą 0% -1.7% 1155336 ą 0% time.involuntary_context_switches > 5082 ą 0% +1.3% 5147 ą 0% +0.9% 5126 ą 0% time.percent_of_cpu_this_job_got > 29840 ą 0% +1.4% 30267 ą 0% +1.0% 30133 ą 0% time.system_time > 663.58 ą 1% -5.7% 625.54 ą 1% -4.3% 635.17 ą 0% time.user_time > 1.886e+08 ą 0% -5.0% 1.792e+08 ą 0% -3.4% 1.821e+08 ą 0% stress-ng.pagemove.ops > 314345 ą 0% -5.0% 298656 ą 0% -3.4% 303565 ą 0% stress-ng.pagemove.ops_per_sec > 212508 ą 0% -4.3% 203280 ą 0% -3.1% 205831 ą 0% stress-ng.pagemove.page_remaps_per_sec > 1174825 ą 0% -2.6% 1143891 ą 0% -1.7% 1155336 ą 0% stress-ng.time.involuntary_context_switches > 5082 ą 0% +1.3% 5147 ą 0% +0.9% 5126 ą 0% 
stress-ng.time.percent_of_cpu_this_job_got > 29840 ą 0% +1.4% 30267 ą 0% +1.0% 30133 ą 0% stress-ng.time.system_time > 663.58 ą 1% -5.7% 625.54 ą 1% -4.3% 635.17 ą 0% stress-ng.time.user_time > 1.00 ą 0% -7.1% 0.93 ą 0% -4.9% 0.95 ą 0% perf-stat.i.MPKI > 3.487e+10 ą 0% +3.5% 3.607e+10 ą 0% +2.4% 3.57e+10 ą 0% perf-stat.i.branch-instructions > 0.21 ą 0% -0.0 0.19 ą 3% -0.0 0.20 ą 0% perf-stat.i.branch-miss-rate% > 1.763e+08 ą 0% -5.0% 1.675e+08 ą 0% -3.4% 1.704e+08 ą 0% perf-stat.i.cache-misses > 2.342e+08 ą 0% -4.9% 2.228e+08 ą 0% -3.3% 2.264e+08 ą 0% perf-stat.i.cache-references > 4650 ą 0% -2.4% 4537 ą 0% -1.5% 4578 ą 0% perf-stat.i.context-switches > 1.11 ą 0% -2.2% 1.09 ą 0% -1.6% 1.10 ą 0% perf-stat.i.cpi > 172.66 ą 0% -2.8% 167.77 ą 0% -1.8% 169.52 ą 0% perf-stat.i.cpu-migrations > 1121 ą 0% +5.2% 1180 ą 0% +3.5% 1160 ą 0% perf-stat.i.cycles-between-cache-misses > 1.772e+11 ą 0% +2.2% 1.812e+11 ą 0% +1.6% 1.801e+11 ą 0% perf-stat.i.instructions > 0.90 ą 0% +2.3% 0.92 ą 0% +1.6% 0.91 ą 0% perf-stat.i.ipc > 0.99 ą 0% -7.1% 0.92 ą 0% -4.9% 0.95 ą 0% perf-stat.overall.MPKI > 0.21 ą 0% -0.0 0.19 ą 3% -0.0 0.20 ą 0% perf-stat.overall.branch-miss-rate% > 1.11 ą 0% -2.2% 1.09 ą 0% -1.6% 1.10 ą 0% perf-stat.overall.cpi > 1120 ą 0% +5.2% 1179 ą 0% +3.5% 1159 ą 0% perf-stat.overall.cycles-between-cache-misses > 0.90 ą 0% +2.3% 0.92 ą 0% +1.6% 0.91 ą 0% perf-stat.overall.ipc > 3.48e+10 ą 0% +3.5% 3.6e+10 ą 0% +2.4% 3.563e+10 ą 0% perf-stat.ps.branch-instructions > 1.759e+08 ą 0% -5.0% 1.672e+08 ą 0% -3.4% 1.7e+08 ą 0% perf-stat.ps.cache-misses > 2.338e+08 ą 0% -4.9% 2.224e+08 ą 0% -3.3% 2.26e+08 ą 0% perf-stat.ps.cache-references > 4642 ą 0% -2.4% 4529 ą 0% -1.5% 4570 ą 0% perf-stat.ps.context-switches > 172.30 ą 0% -2.8% 167.43 ą 0% -1.8% 169.17 ą 0% perf-stat.ps.cpu-migrations > 1.769e+11 ą 0% +2.3% 1.808e+11 ą 0% +1.6% 1.797e+11 ą 0% perf-stat.ps.instructions > 1.063e+14 ą 0% +2.3% 1.087e+14 ą 0% +1.7% 1.081e+14 ą 0% perf-stat.total.instructions > 74.86 ą 0% -2.1 
72.76 ą 0% -0.8 74.06 ą 0% perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 36.72 ą 0% -1.7 35.04 ą 0% -1.2 35.54 ą 0% perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > 24.93 ą 0% -1.4 23.54 ą 0% -0.8 24.12 ą 0% perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 19.91 ą 0% -1.1 18.79 ą 0% -0.7 19.17 ą 0% perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 14.71 ą 0% -0.8 13.90 ą 0% -0.4 14.30 ą 0% perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 10.82 ą 2% -0.6 10.22 ą 2% -0.6 10.25 ą 2% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm > 10.81 ą 2% -0.6 10.21 ą 2% -0.6 10.24 ą 2% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm > 10.81 ą 2% -0.6 10.21 ą 2% -0.6 10.24 ą 2% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork > 10.80 ą 2% -0.6 10.21 ą 2% -0.6 10.23 ą 2% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread > 10.85 ą 2% -0.6 10.26 ą 2% -0.6 10.28 ą 2% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm > 10.85 ą 2% -0.6 10.26 ą 2% -0.6 10.28 ą 2% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm > 10.85 ą 2% -0.6 10.26 ą 2% -0.6 10.28 ą 2% perf-profile.calltrace.cycles-pp.ret_from_fork_asm > 10.76 ą 2% -0.6 10.17 ą 2% -0.6 10.20 ą 2% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn > 1.49 ą 1% -0.5 0.98 ą 0% -0.5 1.00 ą 0% perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > 7.86 ą 0% -0.4 7.48 ą 0% -0.3 7.59 ą 0% 
perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 6.72 ą 0% -0.4 6.37 ą 0% -0.2 6.49 ą 0% perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 6.06 ą 2% -0.3 5.71 ą 2% -0.3 5.73 ą 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd > 6.11 ą 0% -0.3 5.77 ą 0% -0.2 5.90 ą 0% perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 6.11 ą 0% -0.3 5.78 ą 1% -0.2 5.90 ą 0% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap > 5.50 ą 0% -0.3 5.19 ą 0% -0.2 5.31 ą 0% perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap > 5.52 ą 0% -0.3 5.22 ą 0% -0.2 5.35 ą 0% perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap > 5.15 ą 0% -0.3 4.86 ą 0% -0.2 4.97 ą 0% perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap > 5.77 ą 0% -0.3 5.48 ą 0% -0.2 5.58 ą 0% perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 > 5.16 ą 0% -0.3 4.88 ą 0% -0.1 5.01 ą 0% perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma > 4.72 ą 2% -0.3 4.44 ą 2% -0.3 4.45 ą 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs > 4.64 ą 0% -0.3 4.38 ą 0% -0.1 4.51 ą 1% perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma > 4.07 ą 0% -0.2 3.84 ą 0% -0.2 3.92 ą 0% perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 3.96 ą 1% -0.2 3.76 ą 1% -0.1 3.88 ą 1% 
perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma > 3.54 ą 0% -0.2 3.34 ą 0% -0.1 3.41 ą 1% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap > 38.68 ą 0% -0.2 38.49 ą 0% +0.4 39.05 ą 0% perf-profile.calltrace.cycles-pp.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.55 ą 1% -0.2 0.36 ą 65% -0.0 0.52 ą 1% perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap > 3.41 ą 0% -0.2 3.22 ą 0% -0.1 3.28 ą 0% perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap > 1.35 ą 0% -0.2 1.17 ą 0% -0.1 1.23 ą 0% perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap > 2.22 ą 0% -0.2 2.05 ą 0% -0.1 2.12 ą 0% perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 2.27 ą 0% -0.2 2.10 ą 0% -0.1 2.15 ą 0% perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 3.25 ą 0% -0.2 3.08 ą 0% -0.1 3.14 ą 0% perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 3.12 ą 2% -0.2 2.97 ą 2% -0.1 3.04 ą 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap > 0.96 ą 0% -0.1 0.82 ą 1% -0.1 0.87 ą 1% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to > 2.98 ą 1% -0.1 2.84 ą 1% -0.1 2.89 ą 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 8.19 ą 0% -0.1 8.05 ą 0% -0.1 8.04 ą 0% perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 3.13 ą 0% -0.1 3.00 ą 0% -0.1 3.06 ą 0% perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma > 
0.53 ą 1% -0.1 0.41 ą 50% -0.2 0.30 ą 81% perf-profile.calltrace.cycles-pp.arch_get_unmapped_area_topdown_vmflags.thp_get_unmapped_area_vmflags.__get_unmapped_area.mremap_to.__do_sys_mremap > 1.73 ą 2% -0.1 1.61 ą 2% -0.0 1.70 ą 3% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap > 2.14 ą 2% -0.1 2.02 ą 2% -0.0 2.09 ą 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap > 2.46 ą 0% -0.1 2.34 ą 0% -0.1 2.38 ą 0% perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma > 2.04 ą 0% -0.1 1.93 ą 0% -0.1 1.96 ą 0% perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap > 1.85 ą 0% -0.1 1.74 ą 0% -0.1 1.78 ą 0% perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 2.22 ą 0% -0.1 2.12 ą 0% -0.1 2.15 ą 0% perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables > 1.40 ą 0% -0.1 1.30 ą 0% -0.1 1.33 ą 0% perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap > 0.56 ą 1% -0.1 0.46 ą 33% -0.0 0.54 ą 2% perf-profile.calltrace.cycles-pp.mas_walk.mas_prev_setup.mas_prev.vma_merge.copy_vma > 1.80 ą 2% -0.1 1.70 ą 2% -0.1 1.74 ą 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma > 2.43 ą 0% -0.1 2.33 ą 0% -0.1 2.37 ą 0% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap > 1.25 ą 0% -0.1 1.15 ą 1% -0.1 1.19 ą 0% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap > 0.94 ą 1% -0.1 0.86 ą 0% -0.1 0.87 ą 0% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap > 1.38 ą 0% -0.1 1.30 ą 0% -0.1 1.33 ą 1% 
perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma > 1.22 ą 0% -0.1 1.14 ą 0% -0.1 1.17 ą 1% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma > 1.28 ą 0% -0.1 1.21 ą 0% -0.0 1.23 ą 0% perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma > 1.54 ą 1% -0.1 1.46 ą 0% -0.0 1.49 ą 0% perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap > 1.15 ą 0% -0.1 1.08 ą 1% -0.1 1.09 ą 0% perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap > 0.73 ą 1% -0.1 0.67 ą 1% -0.0 0.69 ą 1% perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap > 0.72 ą 0% -0.1 0.66 ą 1% -0.0 0.69 ą 1% perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap > 1.64 ą 1% -0.1 1.58 ą 0% -0.1 1.58 ą 0% perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.78 ą 1% -0.1 0.72 ą 1% -0.0 0.75 ą 1% perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma > 0.63 ą 1% -0.1 0.57 ą 1% -0.0 0.60 ą 1% perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma > 0.69 ą 2% -0.1 0.63 ą 4% -0.0 0.66 ą 2% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma > 0.60 ą 1% -0.1 0.54 ą 1% -0.0 0.58 ą 1% perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 > 0.79 ą 2% -0.1 0.74 ą 3% -0.0 0.75 ą 2% perf-profile.calltrace.cycles-pp.__call_rcu_common.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge > 1.12 ą 0% -0.0 1.08 ą 0% -0.0 1.09 ą 1% perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap > 0.67 ą 1% 
-0.0 0.62 ą 1% -0.0 0.63 ą 1% perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 0.77 ą 1% -0.0 0.72 ą 1% -0.0 0.73 ą 1% perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge > 1.01 ą 1% -0.0 0.96 ą 0% -0.0 0.98 ą 0% perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region > 0.86 ą 0% -0.0 0.81 ą 1% -0.0 0.83 ą 1% perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64 > 0.82 ą 1% -0.0 0.78 ą 1% -0.0 0.79 ą 1% perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 1.01 ą 0% -0.0 0.97 ą 0% -0.0 0.98 ą 0% perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.98 ą 1% -0.0 0.94 ą 0% -0.0 0.94 ą 1% perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 0.78 ą 0% -0.0 0.74 ą 1% -0.0 0.75 ą 1% perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap > 0.68 ą 0% -0.0 0.64 ą 1% -0.0 0.65 ą 0% perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma > 0.68 ą 1% -0.0 0.64 ą 1% -0.0 0.64 ą 1% perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap > 0.89 ą 1% -0.0 0.85 ą 1% -0.0 0.86 ą 1% perf-profile.calltrace.cycles-pp.mtree_load.vma_merge.copy_vma.move_vma.__do_sys_mremap > 0.62 ą 1% -0.0 0.58 ą 2% -0.0 0.59 ą 1% perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 > 0.62 ą 1% -0.0 0.58 ą 1% -0.0 0.59 ą 1% perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 0.76 ą 1% -0.0 0.72 ą 1% -0.0 0.73 ą 1% 
perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma > 1.01 ą 0% -0.0 0.97 ą 1% -0.0 0.98 ą 1% perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap > 0.64 ą 1% -0.0 0.60 ą 1% -0.0 0.61 ą 1% perf-profile.calltrace.cycles-pp.mas_update_gap.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma > 0.88 ą 1% -0.0 0.85 ą 0% -0.0 0.85 ą 0% perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 0.69 ą 1% -0.0 0.66 ą 1% -0.0 0.67 ą 0% perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 0.59 ą 1% -0.0 0.56 ą 1% -0.0 0.56 ą 0% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap > 0.82 ą 1% -0.0 0.82 ą 1% -0.0 0.79 ą 1% perf-profile.calltrace.cycles-pp.thp_get_unmapped_area_vmflags.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 > 0.76 ą 1% +0.1 0.83 ą 0% +0.1 0.84 ą 0% perf-profile.calltrace.cycles-pp.__madvise > 0.67 ą 1% +0.1 0.73 ą 1% +0.1 0.75 ą 1% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise > 0.63 ą 1% +0.1 0.70 ą 1% +0.1 0.71 ą 0% perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > 0.62 ą 1% +0.1 0.69 ą 1% +0.1 0.71 ą 0% perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > 0.66 ą 1% +0.1 0.73 ą 1% +0.1 0.74 ą 0% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > 87.57 ą 0% +0.6 88.14 ą 0% +0.5 88.09 ą 0% perf-profile.calltrace.cycles-pp.mremap > 84.74 ą 0% +0.7 85.47 ą 0% +0.6 85.37 ą 0% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap > 84.58 ą 0% +0.7 85.32 ą 0% +0.6 85.22 ą 0% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 83.64 ą 0% +0.8 84.41 ą 0% +0.7 84.30 ą 0% 
perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 0.00 ą -1% +0.9 0.86 ą 0% +0.9 0.92 ą 0% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap > 0.00 ą -1% +0.9 0.87 ą 0% +0.0 0.00 ą -1% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap > 0.00 ą -1% +0.9 0.91 ą 2% +0.9 0.92 ą 1% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma > 0.00 ą -1% +1.1 1.09 ą 0% +0.0 0.00 ą -1% perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64 > 0.00 ą -1% +1.2 1.21 ą 0% +1.3 1.29 ą 0% perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to > 2.10 ą 0% +1.5 3.61 ą 0% +1.7 3.79 ą 0% perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.00 ą -1% +1.5 1.51 ą 1% +1.5 1.52 ą 0% perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap > 1.60 ą 0% +1.5 3.13 ą 0% +1.7 3.31 ą 0% perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64 > 0.00 ą -1% +1.6 1.60 ą 0% +0.0 0.00 ą -1% perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.00 ą -1% +1.7 1.73 ą 0% +1.8 1.84 ą 0% perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap > 0.00 ą -1% +2.0 2.00 ą 1% +2.0 2.04 ą 0% perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > 5.35 ą 0% +3.0 8.37 ą 0% +1.6 6.92 ą 0% perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 75.03 ą 0% -2.1 72.92 ą 0% -0.8 74.22 ą 0% perf-profile.children.cycles-pp.move_vma > 36.94 ą 0% -1.7 35.25 ą 0% -1.2 35.75 ą 0% perf-profile.children.cycles-pp.do_vmi_align_munmap > 25.01 ą 0% 
-1.4 23.61 ą 0% -0.8 24.19 ą 0% perf-profile.children.cycles-pp.copy_vma > 20.00 ą 0% -1.1 18.88 ą 0% -0.7 19.26 ą 0% perf-profile.children.cycles-pp.__split_vma > 19.92 ą 0% -1.1 18.84 ą 0% -0.8 19.14 ą 0% perf-profile.children.cycles-pp.handle_softirqs > 19.90 ą 0% -1.1 18.82 ą 0% -0.8 19.12 ą 0% perf-profile.children.cycles-pp.rcu_core > 19.88 ą 0% -1.1 18.80 ą 0% -0.8 19.10 ą 0% perf-profile.children.cycles-pp.rcu_do_batch > 17.57 ą 0% -0.9 16.66 ą 0% -0.6 16.94 ą 0% perf-profile.children.cycles-pp.kmem_cache_free > 15.29 ą 0% -0.9 14.43 ą 0% -0.5 14.75 ą 0% perf-profile.children.cycles-pp.kmem_cache_alloc_noprof > 15.11 ą 0% -0.8 14.27 ą 0% -0.4 14.68 ą 0% perf-profile.children.cycles-pp.vma_merge > 12.15 ą 0% -0.7 11.46 ą 0% -0.5 11.65 ą 0% perf-profile.children.cycles-pp.__slab_free > 12.11 ą 0% -0.7 11.43 ą 0% -0.4 11.71 ą 0% perf-profile.children.cycles-pp.mas_wr_store_entry > 11.90 ą 0% -0.7 11.24 ą 0% -0.4 11.50 ą 0% perf-profile.children.cycles-pp.mas_store_prealloc > 10.82 ą 2% -0.6 10.22 ą 2% -0.6 10.25 ą 2% perf-profile.children.cycles-pp.smpboot_thread_fn > 10.81 ą 2% -0.6 10.21 ą 2% -0.6 10.24 ą 2% perf-profile.children.cycles-pp.run_ksoftirqd > 10.85 ą 2% -0.6 10.26 ą 2% -0.6 10.28 ą 2% perf-profile.children.cycles-pp.kthread > 10.85 ą 2% -0.6 10.26 ą 2% -0.6 10.28 ą 2% perf-profile.children.cycles-pp.ret_from_fork > 10.85 ą 2% -0.6 10.26 ą 2% -0.6 10.28 ą 2% perf-profile.children.cycles-pp.ret_from_fork_asm > 10.85 ą 0% -0.6 10.26 ą 0% -0.4 10.47 ą 0% perf-profile.children.cycles-pp.vm_area_dup > 9.81 ą 0% -0.5 9.28 ą 0% -0.3 9.52 ą 0% perf-profile.children.cycles-pp.mas_wr_node_store > 8.38 ą 1% -0.5 7.90 ą 1% -0.2 8.13 ą 1% perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook > 7.98 ą 0% -0.4 7.58 ą 0% -0.3 7.70 ą 0% perf-profile.children.cycles-pp.move_page_tables > 6.66 ą 0% -0.4 6.29 ą 0% -0.2 6.43 ą 0% perf-profile.children.cycles-pp.vma_complete > 5.12 ą 0% -0.3 4.79 ą 0% -0.2 4.88 ą 0% 
perf-profile.children.cycles-pp.mas_preallocate > 6.05 ą 0% -0.3 5.72 ą 0% -0.2 5.82 ą 0% perf-profile.children.cycles-pp.vm_area_free_rcu_cb > 5.85 ą 0% -0.3 5.56 ą 0% -0.2 5.66 ą 0% perf-profile.children.cycles-pp.move_ptes > 3.51 ą 1% -0.2 3.28 ą 2% -0.1 3.37 ą 1% perf-profile.children.cycles-pp.mod_objcg_state > 3.45 ą 0% -0.2 3.24 ą 0% -0.2 3.30 ą 0% perf-profile.children.cycles-pp.___slab_alloc > 2.91 ą 0% -0.2 2.71 ą 0% -0.1 2.78 ą 0% perf-profile.children.cycles-pp.mas_alloc_nodes > 3.47 ą 0% -0.2 3.27 ą 0% -0.1 3.34 ą 0% perf-profile.children.cycles-pp.flush_tlb_mm_range > 3.43 ą 1% -0.2 3.24 ą 1% -0.1 3.35 ą 2% perf-profile.children.cycles-pp.down_write > 2.44 ą 0% -0.2 2.25 ą 0% -0.1 2.32 ą 0% perf-profile.children.cycles-pp.find_vma_prev > 4.24 ą 1% -0.2 4.06 ą 1% -0.1 4.11 ą 1% perf-profile.children.cycles-pp.anon_vma_clone > 3.35 ą 0% -0.2 3.18 ą 0% -0.1 3.24 ą 0% perf-profile.children.cycles-pp.mas_store_gfp > 2.21 ą 1% -0.2 2.05 ą 0% -0.1 2.10 ą 0% perf-profile.children.cycles-pp.__cond_resched > 3.32 ą 0% -0.2 3.17 ą 1% -0.1 3.24 ą 0% perf-profile.children.cycles-pp.__memcg_slab_free_hook > 8.26 ą 0% -0.1 8.12 ą 0% -0.1 8.11 ą 0% perf-profile.children.cycles-pp.unmap_region > 2.22 ą 1% -0.1 2.08 ą 1% -0.1 2.16 ą 3% perf-profile.children.cycles-pp.vma_prepare > 2.67 ą 0% -0.1 2.54 ą 0% -0.1 2.58 ą 0% perf-profile.children.cycles-pp.mtree_load > 3.18 ą 0% -0.1 3.05 ą 0% -0.1 3.11 ą 0% perf-profile.children.cycles-pp.unmap_vmas > 2.46 ą 0% -0.1 2.34 ą 0% -0.1 2.38 ą 0% perf-profile.children.cycles-pp.rcu_cblist_dequeue > 2.50 ą 0% -0.1 2.39 ą 0% -0.1 2.43 ą 0% perf-profile.children.cycles-pp.flush_tlb_func > 2.11 ą 1% -0.1 2.00 ą 1% -0.1 2.02 ą 1% perf-profile.children.cycles-pp.__call_rcu_common > 2.04 ą 1% -0.1 1.93 ą 1% -0.1 1.95 ą 1% perf-profile.children.cycles-pp.allocate_slab > 1.77 ą 1% -0.1 1.66 ą 0% -0.1 1.69 ą 1% perf-profile.children.cycles-pp.mas_wr_walk > 1.87 ą 0% -0.1 1.77 ą 0% -0.1 1.80 ą 0% perf-profile.children.cycles-pp.vma_link > 
2.24 ą 0% -0.1 2.13 ą 0% -0.1 2.17 ą 0% perf-profile.children.cycles-pp.native_flush_tlb_one_user > 1.85 ą 1% -0.1 1.74 ą 0% -0.1 1.79 ą 2% perf-profile.children.cycles-pp.up_write > 2.48 ą 0% -0.1 2.38 ą 0% -0.1 2.42 ą 0% perf-profile.children.cycles-pp.unmap_page_range > 0.97 ą 2% -0.1 0.88 ą 1% -0.1 0.90 ą 1% perf-profile.children.cycles-pp.rcu_all_qs > 1.04 ą 0% -0.1 0.95 ą 1% -0.0 0.99 ą 1% perf-profile.children.cycles-pp.mas_prev > 1.24 ą 0% -0.1 1.16 ą 0% -0.1 1.19 ą 0% perf-profile.children.cycles-pp.mas_prev_slot > 0.93 ą 0% -0.1 0.85 ą 1% -0.0 0.88 ą 1% perf-profile.children.cycles-pp.mas_prev_setup > 1.39 ą 1% -0.1 1.31 ą 1% -0.1 1.33 ą 1% perf-profile.children.cycles-pp.shuffle_freelist > 1.52 ą 0% -0.1 1.45 ą 0% -0.0 1.48 ą 0% perf-profile.children.cycles-pp.mas_update_gap > 1.58 ą 1% -0.1 1.50 ą 0% -0.0 1.53 ą 0% perf-profile.children.cycles-pp.zap_pmd_range > 0.87 ą 1% -0.1 0.80 ą 0% -0.1 0.82 ą 1% perf-profile.children.cycles-pp._raw_spin_lock_irqsave > 1.68 ą 1% -0.1 1.62 ą 0% -0.1 1.62 ą 0% perf-profile.children.cycles-pp.__get_unmapped_area > 0.90 ą 1% -0.1 0.84 ą 0% -0.0 0.86 ą 1% perf-profile.children.cycles-pp.percpu_counter_add_batch > 0.62 ą 1% -0.1 0.56 ą 1% -0.0 0.60 ą 1% perf-profile.children.cycles-pp.security_mmap_addr > 0.49 ą 1% -0.1 0.44 ą 1% -0.1 0.44 ą 1% perf-profile.children.cycles-pp.setup_object > 1.02 ą 0% -0.1 0.97 ą 1% -0.0 0.99 ą 0% perf-profile.children.cycles-pp.mas_leaf_max_gap > 0.98 ą 1% -0.0 0.93 ą 1% -0.0 0.94 ą 1% perf-profile.children.cycles-pp.mas_pop_node > 1.22 ą 1% -0.0 1.18 ą 1% -0.0 1.19 ą 1% perf-profile.children.cycles-pp.__pte_offset_map_lock > 0.45 ą 2% -0.0 0.40 ą 2% -0.0 0.41 ą 1% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath > 1.18 ą 0% -0.0 1.13 ą 0% -0.0 1.15 ą 1% perf-profile.children.cycles-pp.clear_bhb_loop > 1.08 ą 1% -0.0 1.03 ą 0% -0.0 1.05 ą 0% perf-profile.children.cycles-pp.zap_pte_range > 1.04 ą 0% -0.0 1.00 ą 0% -0.0 1.01 ą 0% 
perf-profile.children.cycles-pp.vma_to_resize > 0.58 ą 1% -0.0 0.53 ą 1% -0.0 0.54 ą 1% perf-profile.children.cycles-pp.mas_wr_end_piv > 0.34 ą 2% -0.0 0.30 ą 5% -0.0 0.31 ą 4% perf-profile.children.cycles-pp.get_partial_node > 0.64 ą 1% -0.0 0.61 ą 2% -0.0 0.61 ą 1% perf-profile.children.cycles-pp.get_old_pud > 0.62 ą 0% -0.0 0.59 ą 0% -0.0 0.59 ą 1% perf-profile.children.cycles-pp.__put_partials > 1.14 ą 0% -0.0 1.10 ą 1% -0.0 1.12 ą 1% perf-profile.children.cycles-pp.mt_find > 0.90 ą 0% -0.0 0.87 ą 0% -0.0 0.87 ą 0% perf-profile.children.cycles-pp.userfaultfd_unmap_complete > 0.61 ą 1% -0.0 0.58 ą 1% -0.0 0.59 ą 0% perf-profile.children.cycles-pp.entry_SYSCALL_64 > 0.32 ą 2% -0.0 0.29 ą 3% -0.0 0.30 ą 4% perf-profile.children.cycles-pp.security_vm_enough_memory_mm > 0.54 ą 1% -0.0 0.52 ą 1% -0.0 0.52 ą 1% perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown_vmflags > 0.55 ą 1% -0.0 0.52 ą 1% -0.0 0.54 ą 1% perf-profile.children.cycles-pp.refill_obj_stock > 0.45 ą 1% -0.0 0.43 ą 2% -0.0 0.43 ą 2% perf-profile.children.cycles-pp.__alloc_pages_noprof > 0.43 ą 1% -0.0 0.41 ą 2% -0.0 0.41 ą 2% perf-profile.children.cycles-pp.get_page_from_freelist > 0.17 ą 1% -0.0 0.15 ą 3% -0.0 0.16 ą 1% perf-profile.children.cycles-pp.get_any_partial > 0.32 ą 1% -0.0 0.30 ą 1% -0.0 0.30 ą 1% perf-profile.children.cycles-pp.pte_offset_map_nolock > 0.40 ą 0% -0.0 0.38 ą 1% -0.0 0.39 ą 1% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack > 0.28 ą 2% -0.0 0.26 ą 2% -0.0 0.27 ą 1% perf-profile.children.cycles-pp.khugepaged_enter_vma > 0.32 ą 1% -0.0 0.30 ą 1% -0.0 0.30 ą 2% perf-profile.children.cycles-pp.mas_wr_store_setup > 0.19 ą 4% -0.0 0.17 ą 4% -0.0 0.18 ą 6% perf-profile.children.cycles-pp.cap_vm_enough_memory > 0.29 ą 1% -0.0 0.27 ą 2% -0.0 0.28 ą 3% perf-profile.children.cycles-pp.tlb_gather_mmu > 0.09 ą 4% -0.0 0.07 ą 6% -0.0 0.08 ą 5% perf-profile.children.cycles-pp.vma_dup_policy > 0.16 ą 3% -0.0 0.14 ą 2% -0.0 0.14 ą 2% 
perf-profile.children.cycles-pp.mas_wr_append > 0.22 ą 2% -0.0 0.20 ą 3% -0.0 0.20 ą 3% perf-profile.children.cycles-pp.__rmqueue_pcplist > 0.20 ą 2% -0.0 0.18 ą 2% -0.0 0.19 ą 3% perf-profile.children.cycles-pp.__thp_vma_allowable_orders > 0.24 ą 2% -0.0 0.23 ą 2% -0.0 0.23 ą 2% perf-profile.children.cycles-pp.free_pcppages_bulk > 0.44 ą 1% +0.0 0.45 ą 1% +0.0 0.46 ą 1% perf-profile.children.cycles-pp.mremap_userfaultfd_prep > 0.85 ą 1% +0.0 0.85 ą 1% -0.0 0.81 ą 1% perf-profile.children.cycles-pp.thp_get_unmapped_area_vmflags > 0.13 ą 3% +0.0 0.14 ą 3% +0.0 0.15 ą 2% perf-profile.children.cycles-pp.free_pgd_range > 0.08 ą 8% +0.0 0.10 ą 3% -0.0 0.08 ą 6% perf-profile.children.cycles-pp.mm_get_unmapped_area_vmflags > 0.78 ą 1% +0.1 0.84 ą 0% +0.1 0.86 ą 0% perf-profile.children.cycles-pp.__madvise > 0.63 ą 1% +0.1 0.70 ą 1% +0.1 0.72 ą 0% perf-profile.children.cycles-pp.__x64_sys_madvise > 0.63 ą 1% +0.1 0.70 ą 0% +0.1 0.71 ą 0% perf-profile.children.cycles-pp.do_madvise > 0.00 ą -1% +0.1 0.09 ą 0% +0.1 0.09 ą 5% perf-profile.children.cycles-pp.can_modify_mm_madv > 1.32 ą 1% +0.1 1.46 ą 0% +0.2 1.50 ą 0% perf-profile.children.cycles-pp.mas_next_slot > 87.96 ą 0% +0.6 88.52 ą 0% +0.5 88.48 ą 0% perf-profile.children.cycles-pp.mremap > 85.91 ą 0% +0.8 86.69 ą 0% +0.7 86.61 ą 0% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe > 83.74 ą 0% +0.8 84.52 ą 0% +0.7 84.40 ą 0% perf-profile.children.cycles-pp.__do_sys_mremap > 85.42 ą 0% +0.8 86.23 ą 0% +0.7 86.14 ą 0% perf-profile.children.cycles-pp.do_syscall_64 > 40.36 ą 0% +1.4 41.74 ą 0% +2.1 42.49 ą 0% perf-profile.children.cycles-pp.do_vmi_munmap > 2.12 ą 0% +1.5 3.63 ą 0% +1.7 3.81 ą 0% perf-profile.children.cycles-pp.do_munmap > 3.62 ą 0% +2.3 5.97 ą 0% +1.7 5.29 ą 0% perf-profile.children.cycles-pp.mas_walk > 5.41 ą 0% +3.0 8.44 ą 0% +1.6 6.98 ą 0% perf-profile.children.cycles-pp.mremap_to > 5.28 ą 0% +3.2 8.48 ą 0% +2.3 7.56 ą 0% perf-profile.children.cycles-pp.mas_find > 0.00 ą -1% +5.4 5.45 ą 0% 
+3.9 3.94 ą 0% perf-profile.children.cycles-pp.can_modify_mm > 11.51 ą 0% -0.6 10.86 ą 0% -0.5 11.04 ą 0% perf-profile.self.cycles-pp.__slab_free > 4.23 ą 2% -0.2 4.00 ą 2% -0.1 4.13 ą 2% perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook > 2.34 ą 1% -0.1 2.21 ą 1% -0.0 2.30 ą 3% perf-profile.self.cycles-pp.down_write > 2.43 ą 0% -0.1 2.31 ą 0% -0.1 2.34 ą 0% perf-profile.self.cycles-pp.rcu_cblist_dequeue > 2.34 ą 0% -0.1 2.24 ą 0% -0.1 2.27 ą 0% perf-profile.self.cycles-pp.mtree_load > 2.21 ą 0% -0.1 2.11 ą 0% -0.1 2.14 ą 0% perf-profile.self.cycles-pp.native_flush_tlb_one_user > 1.75 ą 0% -0.1 1.67 ą 0% -0.0 1.70 ą 0% perf-profile.self.cycles-pp.mod_objcg_state > 1.54 ą 1% -0.1 1.46 ą 0% -0.0 1.50 ą 1% perf-profile.self.cycles-pp.up_write > 1.52 ą 0% -0.1 1.44 ą 0% -0.1 1.46 ą 0% perf-profile.self.cycles-pp.mas_wr_walk > 0.70 ą 3% -0.1 0.63 ą 1% -0.1 0.64 ą 1% perf-profile.self.cycles-pp.rcu_all_qs > 1.43 ą 1% -0.1 1.36 ą 1% -0.1 1.36 ą 1% perf-profile.self.cycles-pp.__call_rcu_common > 1.01 ą 0% -0.1 0.95 ą 0% -0.0 0.96 ą 0% perf-profile.self.cycles-pp.mas_preallocate > 1.40 ą 1% -0.1 1.33 ą 1% -0.0 1.35 ą 0% perf-profile.self.cycles-pp.do_vmi_align_munmap > 1.00 ą 0% -0.1 0.94 ą 0% -0.0 0.96 ą 0% perf-profile.self.cycles-pp.mas_prev_slot > 1.14 ą 1% -0.1 1.08 ą 1% -0.0 1.10 ą 1% perf-profile.self.cycles-pp.shuffle_freelist > 1.18 ą 0% -0.1 1.13 ą 0% -0.0 1.16 ą 0% perf-profile.self.cycles-pp.vma_merge > 0.94 ą 1% -0.1 0.89 ą 2% -0.0 0.91 ą 1% perf-profile.self.cycles-pp.vm_area_free_rcu_cb > 0.88 ą 0% -0.1 0.83 ą 1% -0.0 0.84 ą 0% perf-profile.self.cycles-pp.___slab_alloc > 0.50 ą 1% -0.0 0.45 ą 2% -0.0 0.50 ą 1% perf-profile.self.cycles-pp.security_mmap_addr > 0.77 ą 1% -0.0 0.72 ą 1% -0.0 0.74 ą 1% perf-profile.self.cycles-pp.percpu_counter_add_batch > 0.45 ą 2% -0.0 0.40 ą 2% -0.0 0.41 ą 1% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath > 1.17 ą 0% -0.0 1.12 ą 0% -0.0 1.14 ą 1% perf-profile.self.cycles-pp.clear_bhb_loop > 1.08 ą 1% -0.0 
1.04 ą 1% -0.0 1.06 ą 1% perf-profile.self.cycles-pp.__cond_resched > 1.50 ą 2% -0.0 1.46 ą 0% -0.0 1.48 ą 0% perf-profile.self.cycles-pp.kmem_cache_free > 1.23 ą 0% -0.0 1.18 ą 0% -0.1 1.18 ą 0% perf-profile.self.cycles-pp.move_vma > 0.68 ą 1% -0.0 0.64 ą 0% -0.0 0.65 ą 1% perf-profile.self.cycles-pp.__split_vma > 0.80 ą 0% -0.0 0.76 ą 1% -0.0 0.77 ą 0% perf-profile.self.cycles-pp.mas_wr_store_entry > 0.61 ą 2% -0.0 0.57 ą 2% -0.0 0.57 ą 6% perf-profile.self.cycles-pp.mremap > 0.85 ą 1% -0.0 0.80 ą 1% -0.0 0.81 ą 1% perf-profile.self.cycles-pp.mas_pop_node > 0.44 ą 0% -0.0 0.40 ą 1% -0.0 0.40 ą 1% perf-profile.self.cycles-pp.do_munmap > 0.98 ą 0% -0.0 0.94 ą 1% -0.0 0.95 ą 0% perf-profile.self.cycles-pp.move_ptes > 0.89 ą 0% -0.0 0.86 ą 0% -0.0 0.87 ą 0% perf-profile.self.cycles-pp.mas_leaf_max_gap > 0.46 ą 1% -0.0 0.42 ą 1% -0.0 0.43 ą 1% perf-profile.self.cycles-pp.mas_wr_end_piv > 0.89 ą 0% -0.0 0.86 ą 0% -0.0 0.87 ą 0% perf-profile.self.cycles-pp.mas_store_gfp > 0.79 ą 0% -0.0 0.76 ą 1% -0.0 0.76 ą 0% perf-profile.self.cycles-pp.userfaultfd_unmap_complete > 0.99 ą 0% -0.0 0.97 ą 0% -0.0 0.98 ą 0% perf-profile.self.cycles-pp.mt_find > 0.87 ą 0% -0.0 0.84 ą 0% -0.0 0.84 ą 0% perf-profile.self.cycles-pp.move_page_tables > 0.55 ą 2% -0.0 0.52 ą 1% -0.0 0.52 ą 1% perf-profile.self.cycles-pp.get_old_pud > 0.50 ą 0% -0.0 0.47 ą 1% -0.0 0.48 ą 0% perf-profile.self.cycles-pp.find_vma_prev > 0.61 ą 0% -0.0 0.58 ą 1% -0.0 0.59 ą 0% perf-profile.self.cycles-pp.unmap_region > 0.66 ą 0% -0.0 0.63 ą 1% -0.0 0.64 ą 0% perf-profile.self.cycles-pp.mas_store_prealloc > 0.27 ą 1% -0.0 0.25 ą 1% -0.0 0.26 ą 1% perf-profile.self.cycles-pp.mas_prev_setup > 0.61 ą 1% -0.0 0.59 ą 1% -0.0 0.60 ą 1% perf-profile.self.cycles-pp.copy_vma > 0.48 ą 0% -0.0 0.45 ą 1% -0.0 0.46 ą 1% perf-profile.self.cycles-pp.flush_tlb_mm_range > 0.41 ą 1% -0.0 0.39 ą 1% -0.0 0.40 ą 1% perf-profile.self.cycles-pp._raw_spin_lock_irqsave > 0.48 ą 1% -0.0 0.46 ą 1% -0.0 0.47 ą 0% 
perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe > 0.50 ą 1% -0.0 0.48 ą 1% -0.0 0.48 ą 1% perf-profile.self.cycles-pp.refill_obj_stock > 0.47 ą 1% -0.0 0.46 ą 1% -0.0 0.45 ą 1% perf-profile.self.cycles-pp.arch_get_unmapped_area_topdown_vmflags > 0.71 ą 0% -0.0 0.69 ą 1% -0.0 0.69 ą 1% perf-profile.self.cycles-pp.unmap_page_range > 0.17 ą 4% -0.0 0.15 ą 4% -0.0 0.16 ą 3% perf-profile.self.cycles-pp.get_partial_node > 0.24 ą 1% -0.0 0.22 ą 1% -0.0 0.23 ą 0% perf-profile.self.cycles-pp.mas_prev > 0.45 ą 1% -0.0 0.43 ą 0% -0.0 0.44 ą 1% perf-profile.self.cycles-pp.mas_update_gap > 0.53 ą 1% -0.0 0.51 ą 0% -0.0 0.51 ą 1% perf-profile.self.cycles-pp.mremap_to > 0.21 ą 2% -0.0 0.19 ą 2% -0.0 0.19 ą 2% perf-profile.self.cycles-pp.__get_unmapped_area > 0.27 ą 1% -0.0 0.26 ą 1% -0.0 0.25 ą 1% perf-profile.self.cycles-pp.tlb_finish_mmu > 0.18 ą 2% -0.0 0.17 ą 2% -0.0 0.18 ą 2% perf-profile.self.cycles-pp.rcu_do_batch > 0.06 ą 0% -0.0 0.05 ą 0% -0.0 0.05 ą 0% perf-profile.self.cycles-pp.vma_dup_policy > 0.12 ą 0% -0.0 0.11 ą 0% -0.0 0.11 ą 3% perf-profile.self.cycles-pp.mas_wr_append > 0.14 ą 3% -0.0 0.13 ą 3% -0.0 0.12 ą 3% perf-profile.self.cycles-pp.x64_sys_call > 0.11 ą 0% +0.0 0.12 ą 0% +0.0 0.12 ą 3% perf-profile.self.cycles-pp.free_pgd_range > 0.06 ą 5% +0.0 0.07 ą 0% +0.0 0.06 ą 5% perf-profile.self.cycles-pp.mm_get_unmapped_area_vmflags > 0.21 ą 0% +0.0 0.22 ą 2% -0.0 0.21 ą 2% perf-profile.self.cycles-pp.thp_get_unmapped_area_vmflags > 0.45 ą 1% +0.0 0.48 ą 2% +0.0 0.50 ą 1% perf-profile.self.cycles-pp.do_vmi_munmap > 0.27 ą 1% +0.0 0.32 ą 2% -0.0 0.26 ą 1% perf-profile.self.cycles-pp.free_pgtables > 0.36 ą 2% +0.1 0.44 ą 1% -0.0 0.35 ą 4% perf-profile.self.cycles-pp.unlink_anon_vmas > 1.07 ą 1% +0.1 1.19 ą 0% +0.1 1.22 ą 0% perf-profile.self.cycles-pp.mas_next_slot > 1.50 ą 0% +0.5 2.02 ą 0% +0.4 1.85 ą 0% perf-profile.self.cycles-pp.mas_find > 0.00 ą -1% +1.4 1.38 ą 0% +0.9 0.92 ą 0% perf-profile.self.cycles-pp.can_modify_mm > 3.15 ą 0% +2.1 5.26 ą 0% 
+1.5 4.62 ± 0% perf-profile.self.cycles-pp.mas_walk
>
>
> On Mon, Aug 19, 2024 at 02:35:40PM +0800, Oliver Sang wrote:
> > hi, Jeff,
> >
> > On Mon, Aug 19, 2024 at 09:38:19AM +0800, Oliver Sang wrote:
> > > hi, Jeff,
> > >
> > > On Sun, Aug 18, 2024 at 05:28:41PM +0800, Oliver Sang wrote:
> > > > hi, Jeff,
> > > >
> > > > On Thu, Aug 15, 2024 at 07:58:57PM -0700, Jeff Xu wrote:
> > > > > Hi Oliver
> > > >
> > > > [...]
> > > >
> > > > > > could you explicitly point to two commit-id?
> > > > > sure
> > > > >
> > > > > this patch
> > > > > 8be7258a: mseal: add mseal syscall
> > > > > ff388fe5c: mseal: wire up mseal syscall
> > > >
> > > > I failed to apply this patch set to "8be7258a: mseal: add mseal syscall"
> > >
> > > looking at your patch set again,
> > > [PATCH v1 1/2] mseal:selftest mremap across VMA boundaries
> > > is just for kselftests
> > >
> > > and I can apply
> > > [PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm
> > > upon "8be7258a: mseal: add mseal syscall" cleanly
> > >
> > > so I will start testing this [PATCH v1 2/2]
> > >
> > > BTW, I will first use our default setting - "60s testtime; reboot between each
> > > run; run 10 times", since we already have the data for 8be7258a and ff388fe5c,
> > > so we can give you an update fairly quickly.
> > >
> > > as discussed in private mail, you want a special run method; could you
> > > elaborate on it here? thanks
> >
> > here is a quick update before you give us more details about the special run method.
> >
> > by our default run method (60s testtime; reboot between each run; run 10 times),
> > your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm" could
> > resolve the regression partially.
> > > > ========================================================================================= > > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: > > gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s > > > > commit: > > ff388fe5c4 ("mseal: wire up mseal syscall") > > 8be7258aad ("mseal: add mseal syscall") > > 2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm" > > > > ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66 > > ---------------- --------------------------- --------------------------- > > %stddev %change %stddev %change %stddev > > \ | \ | \ > > 4957 +1.3% 5023 +1.0% 5008 time.percent_of_cpu_this_job_got > > 2915 +1.5% 2959 +1.2% 2949 time.system_time > > 65.96 -7.3% 61.16 -5.5% 62.30 time.user_time > > 41535878 -4.0% 39873501 -2.6% 40452264 proc-vmstat.numa_hit > > 41466104 -4.0% 39806121 -2.6% 40384854 proc-vmstat.numa_local > > 77297398 -4.1% 74165258 -2.6% 75286134 proc-vmstat.pgalloc_normal > > 77016866 -4.1% 73886027 -2.6% 75012630 proc-vmstat.pgfree > > 18386219 -5.0% 17474214 -2.9% 17850959 stress-ng.pagemove.ops > > 306421 -5.0% 291207 -2.9% 297490 stress-ng.pagemove.ops_per_sec > > 4957 +1.3% 5023 +1.0% 5008 stress-ng.time.percent_of_cpu_this_job_got > > 2915 +1.5% 2959 +1.2% 2949 stress-ng.time.system_time > > 3.349e+10 ą 4% +3.0% 3.447e+10 ą 2% +4.1% 3.484e+10 perf-stat.i.branch-instructions > > 1.13 -2.1% 1.10 -2.2% 1.10 perf-stat.i.cpi > > 0.89 +2.2% 0.91 +2.0% 0.91 perf-stat.i.ipc > > 1.04 -6.9% 0.97 -4.9% 0.99 perf-stat.overall.MPKI > > 1.13 -2.3% 1.10 -2.0% 1.10 perf-stat.overall.cpi > > 1081 +5.0% 1136 +3.0% 1114 perf-stat.overall.cycles-between-cache-misses > > 0.89 +2.3% 0.91 +2.0% 0.91 perf-stat.overall.ipc > > 3.295e+10 ą 3% +2.9% 3.392e+10 ą 2% +4.0% 3.427e+10 perf-stat.ps.branch-instructions > > 1.674e+11 ą 3% +1.8% 1.704e+11 ą 2% +3.3% 1.73e+11 perf-stat.ps.instructions > > 
1.046e+13 +2.7% 1.074e+13 +1.7% 1.064e+13 perf-stat.total.instructions
> >      75.05           -2.0       73.02           -0.9       74.18       perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> >      36.83           -1.6       35.19           -1.2       35.62       perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
> >      25.02           -1.4       23.65           -0.9       24.12       perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >      19.94           -1.1       18.87           -0.8       19.19       perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> >      14.78           -0.8       14.01           -0.5       14.28       perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> >       1.48           -0.5        0.99           -0.5        1.00       perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
> >       7.88           -0.4        7.47           -0.3        7.62       perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       6.73           -0.4        6.37           -0.2        6.51       perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> >       6.16           -0.3        5.82           -0.3        5.90       perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> >       6.12           -0.3        5.79           -0.2        5.93       perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> >       5.79           -0.3        5.48           -0.2        5.59       perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
> >       5.54           -0.3        5.25           -0.2        5.32       perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> >       5.56           -0.3        5.28           -0.2        5.36       perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap
> >       5.19           -0.3        4.92           -0.2        4.98       perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap
> >       5.21           -0.3        4.95           -0.2        5.02
perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma > > 4.09 -0.2 3.85 -0.2 3.93 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > > 4.69 -0.2 4.46 -0.2 4.51 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma > > 3.56 -0.2 3.36 -0.1 3.43 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap > > 3.40 -0.2 3.22 -0.1 3.29 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap > > 1.35 -0.2 1.16 -0.1 1.24 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap > > 4.00 -0.2 3.82 -0.1 3.86 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma > > 2.23 -0.2 2.05 -0.1 2.12 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > > 8.26 -0.2 8.10 -0.2 8.06 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > > 1.97 ą 3% -0.2 1.81 ą 3% -0.1 1.88 ą 4% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma > > 3.11 ą 2% -0.2 2.96 -0.1 3.05 perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap > > 0.97 -0.2 0.81 -0.1 0.87 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to > > 2.27 -0.2 2.11 -0.1 2.16 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > > 3.25 -0.1 3.10 -0.1 3.17 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > > 3.14 -0.1 3.00 -0.1 3.06 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma > > 2.98 
-0.1 2.85 -0.1 2.87 ą 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > > 1.27 ą 2% -0.1 1.15 ą 4% -0.1 1.19 ą 6% perf-profile.calltrace.cycles-pp.__memcpy.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge > > 2.45 -0.1 2.34 -0.1 2.38 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma > > 2.05 -0.1 1.94 -0.1 1.97 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap > > 2.44 -0.1 2.33 -0.1 2.38 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap > > 2.22 -0.1 2.11 -0.1 2.15 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables > > 1.76 ą 2% -0.1 1.65 ą 2% -0.1 1.66 ą 4% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap > > 1.86 -0.1 1.75 -0.1 1.78 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > > 1.40 -0.1 1.30 -0.1 1.34 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap > > 1.39 -0.1 1.30 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma > > 0.55 -0.1 0.46 ą 30% -0.0 0.52 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap > > 1.25 -0.1 1.16 -0.1 1.20 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap > > 0.94 -0.1 0.86 -0.1 0.87 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap > > 1.23 -0.1 1.15 -0.1 1.17 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma > > 1.54 -0.1 1.47 -0.0 1.49 
perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap > > 0.73 -0.1 0.66 -0.0 0.69 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap > > 1.15 -0.1 1.09 -0.1 1.10 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap > > 0.60 ą 2% -0.1 0.54 -0.0 0.58 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 > > 1.27 -0.1 1.21 -0.0 1.24 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma > > 0.80 ą 2% -0.1 0.74 ą 2% -0.0 0.76 ą 2% perf-profile.calltrace.cycles-pp.__call_rcu_common.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge > > 0.72 -0.1 0.66 -0.0 0.69 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap > > 0.78 -0.1 0.73 -0.0 0.75 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma > > 0.69 ą 2% -0.1 0.64 ą 3% -0.0 0.66 ą 4% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma > > 1.63 -0.1 1.58 -0.1 1.57 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 1.02 -0.1 0.97 -0.0 0.98 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region > > 0.77 -0.0 0.72 -0.0 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge > > 0.62 -0.0 0.57 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma > > 0.67 -0.0 0.62 -0.0 0.64 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > > 0.86 -0.0 0.81 -0.0 0.83 
perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64 > > 1.12 -0.0 1.08 -0.0 1.09 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap > > 0.56 -0.0 0.51 -0.0 0.53 perf-profile.calltrace.cycles-pp.mas_walk.mas_prev_setup.mas_prev.vma_merge.copy_vma > > 0.68 ą 2% -0.0 0.63 -0.0 0.65 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap > > 0.81 -0.0 0.77 -0.0 0.80 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > > 1.02 -0.0 0.97 -0.0 0.98 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 0.95 ą 2% -0.0 0.90 ą 2% -0.0 0.93 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region > > 0.98 -0.0 0.94 -0.0 0.95 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > > 0.78 -0.0 0.74 -0.0 0.75 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap > > 0.70 -0.0 0.66 -0.0 0.67 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > > 0.69 -0.0 0.65 -0.0 0.66 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma > > 0.69 -0.0 0.65 -0.0 0.65 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap > > 0.62 -0.0 0.59 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > > 1.16 -0.0 1.12 -0.0 1.13 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > > 0.76 ą 2% -0.0 0.72 -0.0 0.72 ą 2% perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma > > 1.01 -0.0 0.97 -0.0 0.99 
perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap > > 0.60 -0.0 0.57 -0.0 0.58 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas > > 0.88 -0.0 0.85 -0.0 0.85 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > > 0.62 ą 2% -0.0 0.59 ą 2% -0.0 0.60 perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 > > 0.59 -0.0 0.56 -0.0 0.56 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap > > 0.65 -0.0 0.62 ą 2% -0.0 0.63 perf-profile.calltrace.cycles-pp.mas_update_gap.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma > > 0.81 +0.0 0.82 -0.0 0.79 perf-profile.calltrace.cycles-pp.thp_get_unmapped_area_vmflags.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 > > 2.76 +0.0 2.78 ą 2% -0.1 2.67 perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap > > 3.47 +0.0 3.51 -0.1 3.37 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma > > 0.76 +0.1 0.83 +0.1 0.85 perf-profile.calltrace.cycles-pp.__madvise > > 0.66 +0.1 0.73 +0.1 0.75 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > > 0.67 +0.1 0.74 +0.1 0.76 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise > > 0.63 +0.1 0.70 +0.1 0.72 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > > 0.62 +0.1 0.70 +0.1 0.71 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > > 0.00 +0.9 0.86 +0.9 0.92 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap > > 0.00 +0.9 0.88 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap > > 83.81 
+0.9 84.69 +0.6 84.44 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > > 0.00 +0.9 0.90 ą 2% +0.9 0.91 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma > > 0.00 +1.1 1.10 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64 > > 0.00 +1.2 1.21 +1.3 1.28 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to > > 2.10 +1.5 3.60 +1.7 3.79 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 0.00 +1.5 1.52 +1.5 1.52 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap > > 1.59 +1.5 3.12 +1.7 3.31 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64 > > 0.00 +1.6 1.61 +0.0 0.00 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 0.00 +1.7 1.73 +1.8 1.83 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap > > 0.00 +2.0 2.01 +2.0 2.04 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > > 5.34 +3.0 8.38 +1.6 6.92 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > > 75.22 -2.0 73.18 -0.9 74.34 perf-profile.children.cycles-pp.move_vma > > 37.04 -1.6 35.40 -1.2 35.83 perf-profile.children.cycles-pp.do_vmi_align_munmap > > 25.09 -1.4 23.72 -0.9 24.20 perf-profile.children.cycles-pp.copy_vma > > 20.04 -1.1 18.96 -0.8 19.28 perf-profile.children.cycles-pp.__split_vma > > 19.87 -1.0 18.84 -0.6 19.24 perf-profile.children.cycles-pp.rcu_core > > 19.85 -1.0 18.82 -0.6 19.22 perf-profile.children.cycles-pp.rcu_do_batch > > 19.89 -1.0 18.86 -0.6 19.26 perf-profile.children.cycles-pp.handle_softirqs > > 17.55 -0.9 16.67 -0.5 17.02 
perf-profile.children.cycles-pp.kmem_cache_free > > 15.32 -0.8 14.49 -0.5 14.78 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof > > 15.17 -0.8 14.39 -0.5 14.66 perf-profile.children.cycles-pp.vma_merge > > 12.12 -0.6 11.48 -0.4 11.70 perf-profile.children.cycles-pp.__slab_free > > 12.19 -0.6 11.56 -0.5 11.73 perf-profile.children.cycles-pp.mas_wr_store_entry > > 11.99 -0.6 11.36 -0.5 11.53 perf-profile.children.cycles-pp.mas_store_prealloc > > 10.88 -0.6 10.28 -0.4 10.50 perf-profile.children.cycles-pp.vm_area_dup > > 9.90 -0.5 9.41 -0.4 9.53 perf-profile.children.cycles-pp.mas_wr_node_store > > 8.39 -0.5 7.92 -0.3 8.13 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook > > 7.99 -0.4 7.58 -0.3 7.73 perf-profile.children.cycles-pp.move_page_tables > > 6.70 -0.4 6.33 -0.3 6.43 perf-profile.children.cycles-pp.vma_complete > > 5.87 -0.3 5.55 -0.2 5.66 perf-profile.children.cycles-pp.move_ptes > > 5.12 -0.3 4.81 -0.2 4.90 perf-profile.children.cycles-pp.mas_preallocate > > 6.05 -0.3 5.74 -0.2 5.85 perf-profile.children.cycles-pp.vm_area_free_rcu_cb > > 2.98 -0.3 2.69 ą 4% -0.2 2.80 ą 6% perf-profile.children.cycles-pp.__memcpy > > 3.46 ą 2% -0.2 3.25 -0.1 3.36 ą 3% perf-profile.children.cycles-pp.mod_objcg_state > > 3.47 -0.2 3.26 -0.2 3.32 perf-profile.children.cycles-pp.___slab_alloc > > 2.44 -0.2 2.25 -0.1 2.33 perf-profile.children.cycles-pp.find_vma_prev > > 2.92 -0.2 2.73 -0.1 2.79 perf-profile.children.cycles-pp.mas_alloc_nodes > > 3.46 -0.2 3.27 -0.1 3.34 perf-profile.children.cycles-pp.flush_tlb_mm_range > > 3.47 -0.2 3.29 -0.2 3.32 ą 2% perf-profile.children.cycles-pp.down_write > > 3.33 -0.2 3.16 -0.1 3.25 perf-profile.children.cycles-pp.__memcg_slab_free_hook > > 4.23 -0.2 4.07 -0.1 4.08 ą 2% perf-profile.children.cycles-pp.anon_vma_clone > > 8.33 -0.2 8.17 -0.2 8.13 perf-profile.children.cycles-pp.unmap_region > > 3.35 -0.1 3.20 -0.1 3.26 perf-profile.children.cycles-pp.mas_store_gfp > > 2.21 -0.1 2.07 -0.1 2.10 
perf-profile.children.cycles-pp.__cond_resched > > 3.19 -0.1 3.05 -0.1 3.11 perf-profile.children.cycles-pp.unmap_vmas > > 2.12 -0.1 1.99 -0.1 2.04 perf-profile.children.cycles-pp.__call_rcu_common > > 2.66 -0.1 2.54 -0.1 2.60 perf-profile.children.cycles-pp.mtree_load > > 2.24 -0.1 2.12 ą 2% -0.1 2.13 ą 3% perf-profile.children.cycles-pp.vma_prepare > > 2.50 -0.1 2.38 -0.1 2.42 perf-profile.children.cycles-pp.flush_tlb_func > > 2.04 ą 2% -0.1 1.93 -0.1 1.96 ą 2% perf-profile.children.cycles-pp.allocate_slab > > 2.46 -0.1 2.35 -0.1 2.41 perf-profile.children.cycles-pp.rcu_cblist_dequeue > > 2.48 -0.1 2.38 -0.1 2.42 perf-profile.children.cycles-pp.unmap_page_range > > 2.23 -0.1 2.12 -0.1 2.16 perf-profile.children.cycles-pp.native_flush_tlb_one_user > > 1.77 -0.1 1.67 -0.1 1.70 perf-profile.children.cycles-pp.mas_wr_walk > > 1.88 -0.1 1.78 -0.1 1.80 perf-profile.children.cycles-pp.vma_link > > 1.84 -0.1 1.75 -0.1 1.77 perf-profile.children.cycles-pp.up_write > > 0.97 ą 2% -0.1 0.88 -0.1 0.89 perf-profile.children.cycles-pp.rcu_all_qs > > 1.40 -0.1 1.32 -0.1 1.34 ą 2% perf-profile.children.cycles-pp.shuffle_freelist > > 1.03 -0.1 0.95 -0.0 0.99 perf-profile.children.cycles-pp.mas_prev > > 0.92 -0.1 0.85 -0.0 0.88 perf-profile.children.cycles-pp.mas_prev_setup > > 1.58 -0.1 1.51 -0.1 1.53 perf-profile.children.cycles-pp.zap_pmd_range > > 1.24 -0.1 1.17 -0.0 1.20 perf-profile.children.cycles-pp.mas_prev_slot > > 1.57 -0.1 1.49 -0.1 1.49 perf-profile.children.cycles-pp.mas_update_gap > > 0.62 -0.1 0.56 -0.0 0.60 perf-profile.children.cycles-pp.security_mmap_addr > > 0.90 -0.1 0.84 -0.0 0.86 perf-profile.children.cycles-pp.percpu_counter_add_batch > > 0.86 -0.1 0.80 -0.0 0.81 perf-profile.children.cycles-pp._raw_spin_lock_irqsave > > 0.98 -0.1 0.92 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node > > 1.68 -0.1 1.62 -0.1 1.62 perf-profile.children.cycles-pp.__get_unmapped_area > > 1.23 -0.1 1.18 -0.0 1.20 perf-profile.children.cycles-pp.__pte_offset_map_lock > > 
0.49 ą 2% -0.1 0.43 -0.1 0.43 ą 2% perf-profile.children.cycles-pp.setup_object > > 1.09 -0.1 1.03 -0.0 1.05 perf-profile.children.cycles-pp.zap_pte_range > > 1.07 ą 2% -0.1 1.02 ą 2% -0.1 1.00 perf-profile.children.cycles-pp.mas_leaf_max_gap > > 0.70 ą 2% -0.0 0.65 -0.0 0.67 perf-profile.children.cycles-pp.syscall_return_via_sysret > > 1.18 -0.0 1.14 -0.0 1.15 perf-profile.children.cycles-pp.clear_bhb_loop > > 0.51 ą 3% -0.0 0.47 -0.0 0.49 ą 3% perf-profile.children.cycles-pp.anon_vma_interval_tree_insert > > 1.04 -0.0 1.00 -0.0 1.01 perf-profile.children.cycles-pp.vma_to_resize > > 0.57 -0.0 0.53 -0.0 0.54 perf-profile.children.cycles-pp.mas_wr_end_piv > > 0.44 ą 2% -0.0 0.40 ą 2% -0.0 0.40 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath > > 1.14 -0.0 1.10 -0.0 1.12 perf-profile.children.cycles-pp.mt_find > > 0.90 -0.0 0.87 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete > > 0.62 -0.0 0.59 -0.0 0.60 perf-profile.children.cycles-pp.__put_partials > > 0.45 ą 6% -0.0 0.42 -0.0 0.43 perf-profile.children.cycles-pp._raw_spin_lock > > 0.48 -0.0 0.45 ą 2% -0.0 0.46 perf-profile.children.cycles-pp.mas_prev_range > > 0.61 -0.0 0.58 -0.0 0.59 perf-profile.children.cycles-pp.entry_SYSCALL_64 > > 0.31 ą 3% -0.0 0.28 ą 3% -0.0 0.31 perf-profile.children.cycles-pp.security_vm_enough_memory_mm > > 0.33 ą 3% -0.0 0.30 ą 2% -0.0 0.31 ą 4% perf-profile.children.cycles-pp.mas_put_in_tree > > 0.32 ą 2% -0.0 0.29 ą 2% -0.0 0.30 perf-profile.children.cycles-pp.tlb_finish_mmu > > 0.46 -0.0 0.44 ą 2% -0.0 0.46 perf-profile.children.cycles-pp.rcu_segcblist_enqueue > > 0.33 -0.0 0.31 -0.0 0.32 perf-profile.children.cycles-pp.mas_destroy > > 0.36 -0.0 0.34 -0.0 0.34 perf-profile.children.cycles-pp.__rb_insert_augmented > > 0.39 -0.0 0.37 -0.0 0.38 ą 2% perf-profile.children.cycles-pp.down_write_killable > > 0.29 -0.0 0.27 ą 2% -0.0 0.28 perf-profile.children.cycles-pp.tlb_gather_mmu > > 0.26 -0.0 0.24 ą 2% -0.0 0.25 ą 2% 
perf-profile.children.cycles-pp.syscall_exit_to_user_mode
> >   0.16 ±  2%      -0.0        0.14 ±  3%      -0.0        0.14 ±  3%  perf-profile.children.cycles-pp.mas_wr_append
> >   0.30 ±  2%      -0.0        0.28 ±  2%      -0.0        0.29 ±  2%  perf-profile.children.cycles-pp.__vm_enough_memory
> >   0.32            -0.0        0.30 ±  2%      -0.0        0.31        perf-profile.children.cycles-pp.pte_offset_map_nolock
> >   2.83            +0.0        2.85 ±  2%      -0.1        2.74        perf-profile.children.cycles-pp.unlink_anon_vmas
> >   0.84            +0.0        0.86            -0.0        0.81        perf-profile.children.cycles-pp.thp_get_unmapped_area_vmflags
> >   0.08 ±  5%      +0.0        0.10 ±  3%      -0.0        0.08 ±  6%  perf-profile.children.cycles-pp.mm_get_unmapped_area_vmflags
> >   3.52            +0.0        3.56            -0.1        3.42        perf-profile.children.cycles-pp.free_pgtables
> >   0.78            +0.1        0.85            +0.1        0.86        perf-profile.children.cycles-pp.__madvise
> >   0.63            +0.1        0.70            +0.1        0.72        perf-profile.children.cycles-pp.__x64_sys_madvise
> >   0.63            +0.1        0.70            +0.1        0.71        perf-profile.children.cycles-pp.do_madvise
> >   0.00            +0.1        0.09 ±  3%      +0.1        0.10 ±  5%  perf-profile.children.cycles-pp.can_modify_mm_madv
> >   1.31            +0.2        1.46            +0.2        1.50        perf-profile.children.cycles-pp.mas_next_slot
> >  83.90            +0.9       84.79            +0.6       84.53        perf-profile.children.cycles-pp.__do_sys_mremap
> >  40.45            +1.4       41.90            +2.1       42.57        perf-profile.children.cycles-pp.do_vmi_munmap
> >   2.12            +1.5        3.62            +1.7        3.82        perf-profile.children.cycles-pp.do_munmap
> >   3.63            +2.4        5.98            +1.7        5.29        perf-profile.children.cycles-pp.mas_walk
> >   5.40            +3.0        8.44            +1.6        6.97        perf-profile.children.cycles-pp.mremap_to
> >   5.26            +3.2        8.48            +2.3        7.58        perf-profile.children.cycles-pp.mas_find
> >   0.00            +5.5        5.46            +3.9        3.93        perf-profile.children.cycles-pp.can_modify_mm
> >  11.49            -0.6       10.89            -0.4       11.10        perf-profile.self.cycles-pp.__slab_free
> >   4.32            -0.3        4.06            -0.2        4.16        perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
> >   1.96            -0.2        1.77 ±  4%      -0.1        1.84 ±  6%  perf-profile.self.cycles-pp.__memcpy
> >   2.36            -0.1        2.25 ±  2%      -0.1        2.25 ±  3%  perf-profile.self.cycles-pp.down_write
> >   2.42            -0.1        2.31            -0.0        2.38        perf-profile.self.cycles-pp.rcu_cblist_dequeue
> >   2.33            -0.1        2.23            -0.1        2.28        perf-profile.self.cycles-pp.mtree_load
> >   2.21            -0.1        2.10            -0.1        2.14        perf-profile.self.cycles-pp.native_flush_tlb_one_user
> >   1.62            -0.1        1.54            -0.0        1.57        perf-profile.self.cycles-pp.__memcg_slab_free_hook
> >   1.52            -0.1        1.44            -0.1        1.46        perf-profile.self.cycles-pp.mas_wr_walk
> >   1.44            -0.1        1.36            -0.1        1.38 ±  2%  perf-profile.self.cycles-pp.__call_rcu_common
> >   1.53            -0.1        1.45            -0.0        1.48        perf-profile.self.cycles-pp.up_write
> >   1.72            -0.1        1.65            -0.0        1.70        perf-profile.self.cycles-pp.mod_objcg_state
> >   0.69 ±  2%      -0.1        0.63            -0.1        0.63        perf-profile.self.cycles-pp.rcu_all_qs
> >   1.14 ±  2%      -0.1        1.08            -0.0        1.09 ±  2%  perf-profile.self.cycles-pp.shuffle_freelist
> >   1.18            -0.1        1.12            -0.0        1.17        perf-profile.self.cycles-pp.vma_merge
> >   1.38            -0.1        1.33            -0.0        1.35        perf-profile.self.cycles-pp.do_vmi_align_munmap
> >   0.51 ±  2%      -0.1        0.45            -0.0        0.49        perf-profile.self.cycles-pp.security_mmap_addr
> >   0.62            -0.1        0.56 ±  2%      -0.1        0.56        perf-profile.self.cycles-pp.mremap
> >   0.89            -0.1        0.83            -0.0        0.85        perf-profile.self.cycles-pp.___slab_alloc
> >   0.99            -0.1        0.94            -0.0        0.96        perf-profile.self.cycles-pp.mas_prev_slot
> >   1.00            -0.0        0.95            -0.0        0.96        perf-profile.self.cycles-pp.mas_preallocate
> >   0.98            -0.0        0.93            -0.0        0.95        perf-profile.self.cycles-pp.move_ptes
> >   0.85            -0.0        0.80            -0.0        0.82        perf-profile.self.cycles-pp.mas_pop_node
> >   0.94            -0.0        0.90            -0.0        0.91 ±  2%  perf-profile.self.cycles-pp.vm_area_free_rcu_cb
> >   1.09            -0.0        1.04            -0.0        1.06        perf-profile.self.cycles-pp.__cond_resched
> >   0.77            -0.0        0.72            -0.0        0.74        perf-profile.self.cycles-pp.percpu_counter_add_batch
> >   0.94 ±  2%      -0.0        0.89 ±  2%      -0.1        0.87        perf-profile.self.cycles-pp.mas_leaf_max_gap
> >   1.17            -0.0        1.12            -0.0        1.14        perf-profile.self.cycles-pp.clear_bhb_loop
> >   0.68            -0.0        0.63            -0.0        0.65        perf-profile.self.cycles-pp.__split_vma
> >   0.79            -0.0        0.75            -0.0        0.77        perf-profile.self.cycles-pp.mas_wr_store_entry
> >   1.22            -0.0        1.18            -0.0        1.18        perf-profile.self.cycles-pp.move_vma
> >   0.43 ±  2%      -0.0        0.40 ±  2%      -0.0        0.40        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
> >   1.49            -0.0        1.45            +0.0        1.49        perf-profile.self.cycles-pp.kmem_cache_free
> >   0.44            -0.0        0.40            -0.0        0.40        perf-profile.self.cycles-pp.do_munmap
> >   0.45            -0.0        0.42            -0.0        0.43        perf-profile.self.cycles-pp.mas_wr_end_piv
> >   0.89            -0.0        0.86            -0.0        0.88        perf-profile.self.cycles-pp.mas_store_gfp
> >   0.78            -0.0        0.75            -0.0        0.76        perf-profile.self.cycles-pp.userfaultfd_unmap_complete
> >   0.66            -0.0        0.62            -0.0        0.64        perf-profile.self.cycles-pp.mas_store_prealloc
> >   0.60            -0.0        0.58            -0.0        0.59        perf-profile.self.cycles-pp.unmap_region
> >   0.36 ±  4%      -0.0        0.33 ±  3%      -0.0        0.34 ±  2%  perf-profile.self.cycles-pp.syscall_return_via_sysret
> >   0.55            -0.0        0.52            -0.0        0.53        perf-profile.self.cycles-pp.get_old_pud
> >   0.99            -0.0        0.97            -0.0        0.98        perf-profile.self.cycles-pp.mt_find
> >   0.61            -0.0        0.58            -0.0        0.60        perf-profile.self.cycles-pp.copy_vma
> >   0.43 ±  3%      -0.0        0.40            -0.0        0.41 ±  4%  perf-profile.self.cycles-pp.anon_vma_interval_tree_insert
> >   0.49            -0.0        0.47            -0.0        0.48        perf-profile.self.cycles-pp.find_vma_prev
> >   0.71            -0.0        0.68            -0.0        0.70        perf-profile.self.cycles-pp.unmap_page_range
> >   0.27            -0.0        0.25            -0.0        0.26        perf-profile.self.cycles-pp.mas_prev_setup
> >   0.47            -0.0        0.45            -0.0        0.46 ±  2%  perf-profile.self.cycles-pp.flush_tlb_mm_range
> >   0.37 ±  6%      -0.0        0.35            -0.0        0.35        perf-profile.self.cycles-pp._raw_spin_lock
> >   0.41            -0.0        0.39            -0.0        0.40        perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> >   0.40            -0.0        0.37            -0.0        0.38        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> >   0.27            -0.0        0.25 ±  2%      -0.0        0.25 ±  3%  perf-profile.self.cycles-pp.mas_put_in_tree
> >   0.49            -0.0        0.47            -0.0        0.49        perf-profile.self.cycles-pp.refill_obj_stock
> >   0.48            -0.0        0.46            -0.0        0.47        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
> >   0.27 ±  2%      -0.0        0.25            -0.0        0.26        perf-profile.self.cycles-pp.tlb_finish_mmu
> >   0.24 ±  2%      -0.0        0.22            -0.0        0.23        perf-profile.self.cycles-pp.mas_prev
> >   0.28            -0.0        0.26            -0.0        0.27 ±  2%  perf-profile.self.cycles-pp.mas_alloc_nodes
> >   0.40            -0.0        0.39            -0.0        0.40        perf-profile.self.cycles-pp.__pte_offset_map_lock
> >   0.14 ±  3%      -0.0        0.12 ±  2%      -0.0        0.13 ±  3%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
> >   0.26            -0.0        0.24 ±  2%      -0.0        0.25        perf-profile.self.cycles-pp.__rb_insert_augmented
> >   0.28            -0.0        0.26            -0.0        0.27        perf-profile.self.cycles-pp.alloc_new_pud
> >   0.28            -0.0        0.26            -0.0        0.27 ±  2%  perf-profile.self.cycles-pp.flush_tlb_func
> >   0.20 ±  2%      -0.0        0.19            -0.0        0.19 ±  2%  perf-profile.self.cycles-pp.__get_unmapped_area
> >   0.47            -0.0        0.46            -0.0        0.45        perf-profile.self.cycles-pp.arch_get_unmapped_area_topdown_vmflags
> >   0.06            -0.0        0.05 ±  5%      -0.0        0.05        perf-profile.self.cycles-pp.vma_dup_policy
> >   0.06 ±  6%      +0.0        0.07            -0.0        0.06 ±  8%  perf-profile.self.cycles-pp.mm_get_unmapped_area_vmflags
> >   0.11 ±  4%      +0.0        0.12 ±  4%      +0.0        0.12 ±  4%  perf-profile.self.cycles-pp.free_pgd_range
> >   0.21            +0.0        0.22 ±  2%      -0.0        0.20 ±  2%  perf-profile.self.cycles-pp.thp_get_unmapped_area_vmflags
> >   0.45            +0.0        0.48            +0.0        0.50        perf-profile.self.cycles-pp.do_vmi_munmap
> >   0.27            +0.0        0.32            -0.0        0.26        perf-profile.self.cycles-pp.free_pgtables
> >   0.36 ±  2%      +0.1        0.44            -0.0        0.35        perf-profile.self.cycles-pp.unlink_anon_vmas
> >   1.07            +0.1        1.19            +0.2        1.22        perf-profile.self.cycles-pp.mas_next_slot
> >   1.49            +0.5        2.01            +0.4        1.86        perf-profile.self.cycles-pp.mas_find
> >   0.00            +1.4        1.37            +0.9        0.93        perf-profile.self.cycles-pp.can_modify_mm
> >   3.14            +2.1        5.23            +1.5        4.60        perf-profile.self.cycles-pp.mas_walk
> >
> > > >
> > > > to avoid the impact of other changes, better to apply the patch upon 8be7258a
> > > > directly.
> > > >
> > > > if you prefer other base for this patch, please let us know. then we will
> > > > supply the results for 4 commits in fact:
> > > >
> > > > this patch
> > > > the base of this patch
> > > > 8be7258a: mseal: add mseal syscall
> > > > ff388fe5c: mseal: wire up mseal syscall
> > > >
> > > > >
> > > > > > >
> > > > > > > Thank you for your time and assistance in helping me on understanding
> > > > > > > this issue.
> > > > > > >
> > > > > > due to resource constraint, please expect that we need several days to finish
> > > > > > this test request.
> > > > > No problem.
> > > > >
> > > > > Thanks for your help!
> > > > > -Jeff
> > > > > >
> > > > > > >
> > > > > > > Best regards,
> > > > > > > -Jeff
> > > > > > >
> > > > > > > > -Jeff
> > > > > > > >
> > > > > > > > > [1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/
> > > > > > > > > [2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Jeff Xu (2):
> > > > > > > > > mseal:selftest mremap across VMA boundaries.
> > > > > > > > > mseal: refactor mremap to remove can_modify_mm
> > > > > > > > >
> > > > > > > > > mm/internal.h | 24 ++
> > > > > > > > > mm/mremap.c | 77 +++----
> > > > > > > > > mm/mseal.c | 17 --
> > > > > > > > > tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
> > > > > > > > > 4 files changed, 353 insertions(+), 58 deletions(-)
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.46.0.76.ge559c4bf1a-goog
> > > > > > > > >
From: Jeff Xu <jeffxu@chromium.org>

mremap doesn't allow relocating, expanding, or shrinking across VMA
boundaries. Refactor the code to check the src address range before doing
anything on the destination, i.e. the destination won't be unmapped if
the src address fails the boundary check.

This also allows us to remove can_modify_mm from mremap.c: since the src
address range must lie within a single VMA, can_modify_vma is used instead.

It is likely that this will improve mremap performance. Previously the
code did the sealing check on the src address range with can_modify_mm;
the new code removes the loop (used by can_modify_mm).

To verify that this patch doesn't regress mremap, I added tests to
mseal_test. The test patch can be applied before the mremap refactor
patch or checked in independently.

Also, this patch doesn't change mseal's existing semantics: if sealing
fails, the user can expect that the src/dst address isn't updated. So this
patch can be applied regardless of whether we decide to go with the
current out-of-loop approach or the in-loop approach currently under
discussion.

Regarding the perf test report by stress-ng [1], titled:
  8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression

The test uses the following command:
  stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64

I can't reproduce this on ChromeOS. The pagemove test shows large stddev
and stderr values, and can't reasonably reflect the performance impact.

For example: I wrote a C program [2] that runs the above pagemove test 10
times and calculates the stddev and stderr, for 3 commits:

1> before mseal feature is added:
Ops/sec:
  Mean     : 3564.40
  Std Dev  : 2737.35 (76.80% of Mean)
  Std Err  : 865.63 (24.29% of Mean)

2> after mseal feature is added:
Ops/sec:
  Mean     : 2703.84
  Std Dev  : 2085.13 (77.12% of Mean)
  Std Err  : 659.38 (24.39% of Mean)

3> after current patch (mremap refactor):
Ops/sec:
  Mean     : 3603.67
  Std Dev  : 2422.22 (67.22% of Mean)
  Std Err  : 765.97 (21.26% of Mean)

The result shows 21%-24% stderr, which means that whatever perf
improvement/impact there might be won't be measured correctly by this test.

The test machine has 32G memory and an Intel(R) Celeron(R) 7305 with 5 CPUs.
I rebooted the machine before each test and took the first 10 runs with
run_stress_ng 10.

(I will run for a longer duration to see if the test still shows large
stddev/stderr.)

[1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/
[2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c

Jeff Xu (2):
  mseal:selftest mremap across VMA boundaries.
  mseal: refactor mremap to remove can_modify_mm

 mm/internal.h                           |  24 ++
 mm/mremap.c                             |  77 +++----
 mm/mseal.c                              |  17 --
 tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
 4 files changed, 353 insertions(+), 58 deletions(-)