Message ID | 20230213161352.17199-2-pierrick.bouvier@linaro.org |
---|---|
State | New |
Headers | show |
Series | Adds support for running QEMU natively on windows-arm64 | expand |
On Mon, 13 Feb 2023 at 20:50, Pierrick Bouvier <pierrick.bouvier@linaro.org> wrote: > > mrs instruction fails as an illegal instruction. > For now, no cache information is retrieved for this platform. > It could be specialized later, using Windows API. Unless I'm misreading the code, there's a sys_cache_info() implementation that's only guarded by if defined(_WIN32), so presumably we're using that on AArch64 also. Does it return sensible values ? > Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> > --- > util/cacheflush.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/util/cacheflush.c b/util/cacheflush.c > index 2c2c73e085..149f103d32 100644 > --- a/util/cacheflush.c > +++ b/util/cacheflush.c > @@ -121,8 +121,9 @@ static void sys_cache_info(int *isize, int *dsize) > static bool have_coherent_icache; > #endif > > -#if defined(__aarch64__) && !defined(CONFIG_DARWIN) > +#if defined(__aarch64__) && !defined(CONFIG_DARWIN) && !defined(CONFIG_WIN32) > /* Apple does not expose CTR_EL0, so we must use system interfaces. */ > +/* Does not work on windows-arm64, illegal instruction using mrs */ This comment should be better integrated with the previous, because the reason for the illegal instruction exception on Windows is the same as for macos -- the OS doesn't expose CTR_EL0 to userspace. > static uint64_t save_ctr_el0; > static void arch_cache_info(int *isize, int *dsize) > { > @@ -225,7 +226,7 @@ static void __attribute__((constructor)) init_cache_info(void) > > /* Caches are coherent and do not require flushing; symbol inline. */ > > -#elif defined(__aarch64__) > +#elif defined(__aarch64__) && !defined(CONFIG_WIN32) This will cause us to not use the generic aarch64 flush_idcache_range(), which uses DC CVAU and IC IVAU. Does that not work on Windows? If it doesn't then I think the ifdeffery would be more clearly structured as #elif defined(__aarch64__) ifdef CONFIG_DARWIN [macos implementation of flush_idcache_range] #elif defined(CONFIG_WIN32) /* Explanation here of why the generic version doesn't work */ #else /* generic version */ #endif #elif defined(__mips__) etc thanks -- PMM
On 2/14/23 06:44, Peter Maydell wrote: > This will cause us to not use the generic aarch64 flush_idcache_range(), > which uses DC CVAU and IC IVAU. Does that not work on Windows? > > If it doesn't then I think the ifdeffery would be more clearly > structured as > > #elif defined(__aarch64__) > > ifdef CONFIG_DARWIN > [macos implementation of flush_idcache_range] > #elif defined(CONFIG_WIN32) > /* Explanation here of why the generic version doesn't work */ > #else > /* generic version */ > #endif > > #elif defined(__mips__) More specifically, there *must* be a replacement, or TCG will not work. It appears as if FlushInstructionCache is the right syscall on windows. I cannot find documentation for a data cache flush. But until there is also a way to allocate split w^x memory regions (tcg/region.c, alloc_code_gen_buffer_splitwx), you need not worry about that. You could reasonably assert(rx == rw) there. r~
On 2/14/23 17:44, Peter Maydell wrote: > On Mon, 13 Feb 2023 at 20:50, Pierrick Bouvier > <pierrick.bouvier@linaro.org> wrote: >> >> mrs instruction fails as an illegal instruction. >> For now, no cache information is retrieved for this platform. >> It could be specialized later, using Windows API. > > Unless I'm misreading the code, there's a sys_cache_info() > implementation that's only guarded by if defined(_WIN32), so > presumably we're using that on AArch64 also. Does it return > sensible values ? > Yes, it works on arm64, and I checked it was returned by the expected "windows version" function sys_cache_info. On my machine, having a snapdragon 8cx gen3 processor, it returns (64, 64). It's on par with what I can see from a WSL1 environment (Linux, without VM) on the same machine: $ getconf LEVEL1_DCACHE_LINESIZE 64 $ getconf LEVEL1_ICACHE_LINESIZE 64 >> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> >> --- >> util/cacheflush.c | 5 +++-- >> 1 file changed, 3 insertions(+), 2 deletions(-) >> >> diff --git a/util/cacheflush.c b/util/cacheflush.c >> index 2c2c73e085..149f103d32 100644 >> --- a/util/cacheflush.c >> +++ b/util/cacheflush.c >> @@ -121,8 +121,9 @@ static void sys_cache_info(int *isize, int *dsize) >> static bool have_coherent_icache; >> #endif >> >> -#if defined(__aarch64__) && !defined(CONFIG_DARWIN) >> +#if defined(__aarch64__) && !defined(CONFIG_DARWIN) && !defined(CONFIG_WIN32) >> /* Apple does not expose CTR_EL0, so we must use system interfaces. */ >> +/* Does not work on windows-arm64, illegal instruction using mrs */ > > This comment should be better integrated with the previous, because > the reason for the illegal instruction exception on Windows is the > same as for macos -- the OS doesn't expose CTR_EL0 to userspace. > I'll update the comment 👍 >> static uint64_t save_ctr_el0; >> static void arch_cache_info(int *isize, int *dsize) >> { >> @@ -225,7 +226,7 @@ static void __attribute__((constructor)) init_cache_info(void) >> >> /* Caches are coherent and do not require flushing; symbol inline. */ >> >> -#elif defined(__aarch64__) >> +#elif defined(__aarch64__) && !defined(CONFIG_WIN32) > > This will cause us to not use the generic aarch64 flush_idcache_range(), > which uses DC CVAU and IC IVAU. Does that not work on Windows? > > If it doesn't then I think the ifdeffery would be more clearly > structured as > > #elif defined(__aarch64__) > > ifdef CONFIG_DARWIN > [macos implementation of flush_idcache_range] > #elif defined(CONFIG_WIN32) > /* Explanation here of why the generic version doesn't work */ > #else > /* generic version */ > #endif > > #elif defined(__mips__) > > etc > For now, the generic flush_idcache_range, using __builtin___clear_cache is used. Richard mentioned 'there *must* be a replacement, or TCG will not work'. As said in the cover letter, I could successfully install and boot an arm64 and x64 vm. I'm not an expert on this area, but I imagine that booting a full VM will force TCG to emit code at the same address twice (after having generated enough translated blocks), which shows that generic flush_idcache_range works. Is that reasoning correct? Do you think we still need a specialized version for windows-arm64? > thanks > -- PMM
On 2/15/23 02:49, Pierrick Bouvier wrote: > I'm not an expert on this area, but I imagine that booting a full VM will force TCG to > emit code at the same address twice (after having generated enough translated blocks), > which shows that generic flush_idcache_range works. Is that reasoning correct? > > Do you think we still need a specialized version for windows-arm64? No, I don't think so. It would be ideal if we were able to read CTR_EL0, because we make use of the IDC field, but 0 is a safe default. r~
After some investigation on this, I found that even faking ctr_el0 content does not work. Indeed, "dc cvau" and "ic ivau" both are privileged too (and fail with illegal instruction). I started looking how other projects are handling this. In the case of firefox js engine, they simply perform a FlushInstructionCache [1], and nothing for data cache. From what I found on various websites, there is no way to perform any data cache flush from userspace under Windows. Out of curiosity, I looked in llvm how __builtin___clear_cache was implemented, and for windows-arm64, it's simply a call to the same function, FlushInstructionCache [2]. This explains why the generic implementation of flush_idcache_range does the correct thing for this platform, and why tests I ran worked. Thus, it's probably the best approach we can have for now. [1] https://hg.mozilla.org/mozilla-central/file/tip/js/src/jit/arm64/vixl/MozCpu-vixl.cpp#l110 [2] https://github.com/llvm/llvm-project/blob/574e417460cdfebb8157fbe61b6a015e44856f64/compiler-rt/lib/builtins/clear_cache.c#L66 On 2/15/23 19:22, Richard Henderson wrote: > On 2/15/23 02:49, Pierrick Bouvier wrote: >> I'm not an expert on this area, but I imagine that booting a full VM will force TCG to >> emit code at the same address twice (after having generated enough translated blocks), >> which shows that generic flush_idcache_range works. Is that reasoning correct? >> >> Do you think we still need a specialized version for windows-arm64? > > No, I don't think so. It would be ideal if we were able to read CTR_EL0, because we make > use of the IDC field, but 0 is a safe default. > > > r~
diff --git a/util/cacheflush.c b/util/cacheflush.c index 2c2c73e085..149f103d32 100644 --- a/util/cacheflush.c +++ b/util/cacheflush.c @@ -121,8 +121,9 @@ static void sys_cache_info(int *isize, int *dsize) static bool have_coherent_icache; #endif -#if defined(__aarch64__) && !defined(CONFIG_DARWIN) +#if defined(__aarch64__) && !defined(CONFIG_DARWIN) && !defined(CONFIG_WIN32) /* Apple does not expose CTR_EL0, so we must use system interfaces. */ +/* Does not work on windows-arm64, illegal instruction using mrs */ static uint64_t save_ctr_el0; static void arch_cache_info(int *isize, int *dsize) { @@ -225,7 +226,7 @@ static void __attribute__((constructor)) init_cache_info(void) /* Caches are coherent and do not require flushing; symbol inline. */ -#elif defined(__aarch64__) +#elif defined(__aarch64__) && !defined(CONFIG_WIN32) #ifdef CONFIG_DARWIN /* Apple does not expose CTR_EL0, so we must use system interfaces. */
mrs instruction fails as an illegal instruction. For now, no cache information is retrieved for this platform. It could be specialized later, using Windows API. Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> --- util/cacheflush.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)