Message ID: 20230503070656.1746170-15-richard.henderson@linaro.org
State: Superseded
Series: tcg: Improve atomicity support
On Wed, 3 May 2023 at 08:10, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Notice when Intel or AMD have guaranteed that vmovdqa is atomic.
> The new variable will also be used in generated code.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  include/qemu/cpuid.h      | 18 ++++++++++++++++++
>  tcg/i386/tcg-target.h     |  1 +
>  tcg/i386/tcg-target.c.inc | 27 +++++++++++++++++++++++++++
>  3 files changed, 46 insertions(+)
>
> diff --git a/include/qemu/cpuid.h b/include/qemu/cpuid.h
> index 1451e8ef2f..35325f1995 100644
> --- a/include/qemu/cpuid.h
> +++ b/include/qemu/cpuid.h
> @@ -71,6 +71,24 @@
>  #define bit_LZCNT       (1 << 5)
>  #endif
>
> +/*
> + * Signatures for different CPU implementations as returned from Leaf 0.
> + */
> +
> +#ifndef signature_INTEL_ecx
> +/* "Genu" "ineI" "ntel" */
> +#define signature_INTEL_ebx  0x756e6547
> +#define signature_INTEL_edx  0x49656e69
> +#define signature_INTEL_ecx  0x6c65746e
> +#endif
> +
> +#ifndef signature_AMD_ecx
> +/* "Auth" "enti" "cAMD" */
> +#define signature_AMD_ebx  0x68747541
> +#define signature_AMD_edx  0x69746e65
> +#define signature_AMD_ecx  0x444d4163
> +#endif

> @@ -4024,6 +4025,32 @@ static void tcg_target_init(TCGContext *s)
>                      have_avx512dq = (b7 & bit_AVX512DQ) != 0;
>                      have_avx512vbmi2 = (c7 & bit_AVX512VBMI2) != 0;
>                  }
> +
> +                /*
> +                 * The Intel SDM has added:
> +                 *   Processors that enumerate support for Intel® AVX
> +                 *   (by setting the feature flag CPUID.01H:ECX.AVX[bit 28])
> +                 *   guarantee that the 16-byte memory operations performed
> +                 *   by the following instructions will always be carried
> +                 *   out atomically:
> +                 *   - MOVAPD, MOVAPS, and MOVDQA.
> +                 *   - VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128.
> +                 *   - VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded
> +                 *     with EVEX.128 and k0 (masking disabled).
> +                 *   Note that these instructions require the linear addresses
> +                 *   of their memory operands to be 16-byte aligned.
> +                 *
> +                 * AMD has provided an even stronger guarantee that processors
> +                 * with AVX provide 16-byte atomicity for all cachable,
> +                 * naturally aligned single loads and stores, e.g. MOVDQU.
> +                 *
> +                 * See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688
> +                 */
> +                if (have_avx1) {
> +                    __cpuid(0, a, b, c, d);
> +                    have_atomic16 = (c == signature_INTEL_ecx ||
> +                                     c == signature_AMD_ecx);
> +                }

If the signature is 3 words why are we only checking one here ?

thanks
-- PMM
On 5/5/23 11:34, Peter Maydell wrote:
> On Wed, 3 May 2023 at 08:10, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> Notice when Intel or AMD have guaranteed that vmovdqa is atomic.
>> The new variable will also be used in generated code.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>  include/qemu/cpuid.h      | 18 ++++++++++++++++++
>>  tcg/i386/tcg-target.h     |  1 +
>>  tcg/i386/tcg-target.c.inc | 27 +++++++++++++++++++++++++++
>>  3 files changed, 46 insertions(+)
>>
>> diff --git a/include/qemu/cpuid.h b/include/qemu/cpuid.h
>> index 1451e8ef2f..35325f1995 100644
>> --- a/include/qemu/cpuid.h
>> +++ b/include/qemu/cpuid.h
>> @@ -71,6 +71,24 @@
>>  #define bit_LZCNT       (1 << 5)
>>  #endif
>>
>> +/*
>> + * Signatures for different CPU implementations as returned from Leaf 0.
>> + */
>> +
>> +#ifndef signature_INTEL_ecx
>> +/* "Genu" "ineI" "ntel" */
>> +#define signature_INTEL_ebx  0x756e6547
>> +#define signature_INTEL_edx  0x49656e69
>> +#define signature_INTEL_ecx  0x6c65746e
>> +#endif
>> +
>> +#ifndef signature_AMD_ecx
>> +/* "Auth" "enti" "cAMD" */
>> +#define signature_AMD_ebx  0x68747541
>> +#define signature_AMD_edx  0x69746e65
>> +#define signature_AMD_ecx  0x444d4163
>> +#endif
>
>> @@ -4024,6 +4025,32 @@ static void tcg_target_init(TCGContext *s)
>>                      have_avx512dq = (b7 & bit_AVX512DQ) != 0;
>>                      have_avx512vbmi2 = (c7 & bit_AVX512VBMI2) != 0;
>>                  }
>> +
>> +                /*
>> +                 * The Intel SDM has added:
>> +                 *   Processors that enumerate support for Intel® AVX
>> +                 *   (by setting the feature flag CPUID.01H:ECX.AVX[bit 28])
>> +                 *   guarantee that the 16-byte memory operations performed
>> +                 *   by the following instructions will always be carried
>> +                 *   out atomically:
>> +                 *   - MOVAPD, MOVAPS, and MOVDQA.
>> +                 *   - VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128.
>> +                 *   - VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded
>> +                 *     with EVEX.128 and k0 (masking disabled).
>> +                 *   Note that these instructions require the linear addresses
>> +                 *   of their memory operands to be 16-byte aligned.
>> +                 *
>> +                 * AMD has provided an even stronger guarantee that processors
>> +                 * with AVX provide 16-byte atomicity for all cachable,
>> +                 * naturally aligned single loads and stores, e.g. MOVDQU.
>> +                 *
>> +                 * See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688
>> +                 */
>> +                if (have_avx1) {
>> +                    __cpuid(0, a, b, c, d);
>> +                    have_atomic16 = (c == signature_INTEL_ecx ||
>> +                                     c == signature_AMD_ecx);
>> +                }
>
> If the signature is 3 words why are we only checking one here ?

Because one is sufficient.  I don't know why the signature is 3 words
and not 1.

r~
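For context: CPUID leaf 0 returns the 12-character vendor string split across
EBX, EDX and ECX in that order ("Genu"/"ineI"/"ntel", "Auth"/"enti"/"cAMD"),
so the final ECX word is already different for the two vendors the patch cares
about. A standalone sketch of that check follows; it is illustrative only and
not part of the patch, and assumes GCC/Clang's <cpuid.h> on an x86 host:

    #include <cpuid.h>
    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        unsigned a, b, c, d;
        char vendor[13];

        /* Leaf 0: the vendor string is spread over EBX, EDX, ECX. */
        __cpuid(0, a, b, c, d);
        memcpy(vendor + 0, &b, 4);
        memcpy(vendor + 4, &d, 4);
        memcpy(vendor + 8, &c, 4);
        vendor[12] = '\0';

        /* The last word (ECX) alone distinguishes the two vendors. */
        bool is_intel = (c == 0x6c65746e);   /* "ntel" */
        bool is_amd   = (c == 0x444d4163);   /* "cAMD" */

        printf("vendor=%s intel=%d amd=%d\n", vendor, is_intel, is_amd);
        return 0;
    }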
diff --git a/include/qemu/cpuid.h b/include/qemu/cpuid.h
index 1451e8ef2f..35325f1995 100644
--- a/include/qemu/cpuid.h
+++ b/include/qemu/cpuid.h
@@ -71,6 +71,24 @@
 #define bit_LZCNT       (1 << 5)
 #endif
 
+/*
+ * Signatures for different CPU implementations as returned from Leaf 0.
+ */
+
+#ifndef signature_INTEL_ecx
+/* "Genu" "ineI" "ntel" */
+#define signature_INTEL_ebx  0x756e6547
+#define signature_INTEL_edx  0x49656e69
+#define signature_INTEL_ecx  0x6c65746e
+#endif
+
+#ifndef signature_AMD_ecx
+/* "Auth" "enti" "cAMD" */
+#define signature_AMD_ebx  0x68747541
+#define signature_AMD_edx  0x69746e65
+#define signature_AMD_ecx  0x444d4163
+#endif
+
 static inline unsigned xgetbv_low(unsigned c)
 {
     unsigned a, d;
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index d4f2a6f8c2..0421776cb8 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -120,6 +120,7 @@ extern bool have_avx512dq;
 extern bool have_avx512vbmi2;
 extern bool have_avx512vl;
 extern bool have_movbe;
+extern bool have_atomic16;
 
 /* optional instructions */
 #define TCG_TARGET_HAS_div2_i32 1
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index bb603e7968..f838683fc3 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -185,6 +185,7 @@ bool have_avx512dq;
 bool have_avx512vbmi2;
 bool have_avx512vl;
 bool have_movbe;
+bool have_atomic16;
 
 #ifdef CONFIG_CPUID_H
 static bool have_bmi2;
@@ -4024,6 +4025,32 @@ static void tcg_target_init(TCGContext *s)
                     have_avx512dq = (b7 & bit_AVX512DQ) != 0;
                     have_avx512vbmi2 = (c7 & bit_AVX512VBMI2) != 0;
                 }
+
+                /*
+                 * The Intel SDM has added:
+                 *   Processors that enumerate support for Intel® AVX
+                 *   (by setting the feature flag CPUID.01H:ECX.AVX[bit 28])
+                 *   guarantee that the 16-byte memory operations performed
+                 *   by the following instructions will always be carried
+                 *   out atomically:
+                 *   - MOVAPD, MOVAPS, and MOVDQA.
+                 *   - VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128.
+                 *   - VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded
+                 *     with EVEX.128 and k0 (masking disabled).
+                 *   Note that these instructions require the linear addresses
+                 *   of their memory operands to be 16-byte aligned.
+                 *
+                 * AMD has provided an even stronger guarantee that processors
+                 * with AVX provide 16-byte atomicity for all cachable,
+                 * naturally aligned single loads and stores, e.g. MOVDQU.
+                 *
+                 * See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688
+                 */
+                if (have_avx1) {
+                    __cpuid(0, a, b, c, d);
+                    have_atomic16 = (c == signature_INTEL_ecx ||
+                                     c == signature_AMD_ecx);
+                }
             }
         }
     }
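The new have_atomic16 flag records that a single aligned 16-byte vector access
is atomic on this host, so the TCG backend may emit one (v)movdqa instead of a
slower fallback for 16-byte guest memory operations. As a rough illustration of
the kind of host access the quoted Intel/AMD statements cover, here is a small
sketch of my own (not QEMU code): the intrinsics below typically compile to
MOVDQA/VMOVDQA, and the access is only atomic when the pointer is 16-byte
aligned and the CPU gives the guarantee described above.

    #include <emmintrin.h>   /* _mm_load_si128 / _mm_store_si128 */
    #include <stdint.h>

    /* 16-byte aligned storage; alignment is required, since MOVDQA and
     * VMOVDQA fault on misaligned addresses. */
    static _Alignas(16) uint8_t slot[16];

    /* Single 16-byte load; typically compiles to (v)movdqa. */
    static inline __m128i load16(const void *p)
    {
        return _mm_load_si128((const __m128i *)p);
    }

    /* Single 16-byte store; typically compiles to (v)movdqa. */
    static inline void store16(void *p, __m128i v)
    {
        _mm_store_si128((__m128i *)p, v);
    }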
Notice when Intel or AMD have guaranteed that vmovdqa is atomic.
The new variable will also be used in generated code.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/qemu/cpuid.h      | 18 ++++++++++++++++++
 tcg/i386/tcg-target.h     |  1 +
 tcg/i386/tcg-target.c.inc | 27 +++++++++++++++++++++++++++
 3 files changed, 46 insertions(+)