[v2] plugins: optimize cpu_index code generation

Message ID 20241128213843.1023080-1-pierrick.bouvier@linaro.org
State New

Commit Message

Pierrick Bouvier Nov. 28, 2024, 9:38 p.m. UTC
When running with a single vcpu, we can return a constant instead of a
load when accessing cpu_index.
A side effect is that all tcg operations using it are optimized, most
notably scoreboard access.
When running a simple loop in user-mode, the speedup is around 20%.

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

---

v2:
- no need to do a flush, as user-mode already does it when spawning a
  second cpu (to honor CF_PARALLEL flags).
- change condition detection to use CF_PARALLEL instead
---
 accel/tcg/plugin-gen.c | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Richard Henderson Nov. 29, 2024, 1:29 p.m. UTC | #1
On 11/28/24 15:38, Pierrick Bouvier wrote:
> When running with a single vcpu, we can return a constant instead of a
> load when accessing cpu_index.
> A side effect is that all tcg operations using it are optimized, most
> notably scoreboard access.
> When running a simple loop in user-mode, the speedup is around 20%.
> 
> Signed-off-by: Pierrick Bouvier<pierrick.bouvier@linaro.org>
> 
> ---
> 
> v2:
> - no need to do a flush, as user-mode already does it when spawning a
>    second cpu (to honor CF_PARALLEL flags).
> - change condition detection to use CF_PARALLEL instead
> ---
>   accel/tcg/plugin-gen.c | 9 +++++++++
>   1 file changed, 9 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~

Pierrick Bouvier Dec. 16, 2024, 5:54 p.m. UTC | #2
Hi Richard,

On 11/29/24 05:29, Richard Henderson wrote:
> On 11/28/24 15:38, Pierrick Bouvier wrote:
>> When running with a single vcpu, we can return a constant instead of a
>> load when accessing cpu_index.
>> A side effect is that all tcg operations using it are optimized, most
>> notably scoreboard access.
>> When running a simple loop in user-mode, the speedup is around 20%.
>>
>> Signed-off-by: Pierrick Bouvier<pierrick.bouvier@linaro.org>
>>
>> ---
>>
>> v2:
>> - no need to do a flush, as user-mode already does it when spawning a
>>     second cpu (to honor CF_PARALLEL flags).
>> - change condition detection to use CF_PARALLEL instead
>> ---
>>    accel/tcg/plugin-gen.c | 9 +++++++++
>>    1 file changed, 9 insertions(+)
> 
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> 
> r~

Any chance you could pull this in as part of one of your series?

Thanks,
Pierrick

Patch

diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 0f47bfbb489..961d9305913 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -102,6 +102,15 @@  static void gen_disable_mem_helper(void)
 
 static TCGv_i32 gen_cpu_index(void)
 {
+    /*
+     * Optimize when we run with a single vcpu. All values using cpu_index,
+     * including scoreboard index, will be optimized out.
+     * User-mode calls tb_flush when setting this flag. In system-mode, all
+     * vcpus are created before generating code.
+     */
+    if (!tcg_cflags_has(current_cpu, CF_PARALLEL)) {
+        return tcg_constant_i32(current_cpu->cpu_index);
+    }
     TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
     tcg_gen_ld_i32(cpu_index, tcg_env,
                    -offsetof(ArchCPU, env) + offsetof(CPUState, cpu_index));