Message ID: 20201001170752.82063-1-richard.henderson@linaro.org
Series: target/arm: Fix tlb flush page vs tbi
I can confirm this resolves the issue we were seeing. Thank you Richard!

Best,
Jordan

On 10/1/20, 10:08 AM, "Richard Henderson" <richard.henderson@linaro.org> wrote:

> Since the FAR_ELx fix at 38d931687fa1, it is reported that page
> granularity flushing is broken. This makes sense, since TCG records
> the entire virtual address in its TLB, not simply the 56 significant
> bits. With no other TCG support, the ARM backend would require 256
> different page flushes to clear the virtual address of any possible tag.
>
> So I added a new tcg interface that allows passing the number of
> significant bits in the virtual address. I thought a simple bit count
> was a cleaner interface than passing in a mask, since it means that we
> cannot be passed a nonsensical mask like 0xdeadbeef. It also makes it
> easy to re-direct special cases.
>
> I don't have a test case that triggers the bug. All I can say is that
> (1) this still boots a normal kernel, and (2) the new code paths are
> exercised, since the kernel enables tbi for EL0.
>
> Jordan, can you test this please?
>
> r~
>
> Richard Henderson (2):
>   accel/tcg: Add tlb_flush_page_bits_by_mmuidx*
>   target/arm: Use tlb_flush_page_bits_by_mmuidx*
>
>  include/exec/exec-all.h |  36 ++++++
>  accel/tcg/cputlb.c      | 259 ++++++++++++++++++++++++++++++++++++++--
>  target/arm/helper.c     |  46 +++++--
>  3 files changed, 325 insertions(+), 16 deletions(-)
>
> --
> 2.25.1
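[The aliasing the cover letter describes can be seen in a minimal standalone sketch; this is plain C, not QEMU code, and the example address and constant names are invented for illustration. It shows that the 256 top-byte tags of one address are all distinct 64-bit values yet name the same 56-bit page, which is why a flush keyed on the full address misses the tagged aliases.

/* Minimal sketch (not QEMU code; address and names invented).  With TBI,
 * only the low 56 bits of a virtual address are significant, so all 256
 * top-byte tags are distinct 64-bit values naming the same page. */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SIGNIFICANT_BITS 56   /* address bits that matter under TBI */

int main(void)
{
    uint64_t va = 0x0000ffff12345000ull;               /* hypothetical page */
    uint64_t mask = ((uint64_t)1 << SIGNIFICANT_BITS) - 1;
    unsigned aliases = 0;

    for (unsigned tag = 0; tag < 256; tag++) {
        uint64_t tagged = (va & mask) | ((uint64_t)tag << SIGNIFICANT_BITS);
        /* A TLB keyed on the full 64-bit address sees each tag as a
         * separate entry... */
        aliases += (tagged != va);
        /* ...even though every tag names the same 56-bit page. */
        assert((tagged & mask) == (va & mask));
    }
    printf("%u tagged aliases of one page; 256 page flushes would be needed\n",
           aliases);
    return 0;
}
]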
On Thu, 1 Oct 2020 at 18:07, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On ARM, the Top Byte Ignore feature means that only 56 bits of
> the address are significant in the virtual address. We are
> required to give the entire 64-bit address to FAR_ELx on fault,
> which means that we do not "clean" the top byte early in TCG.
>
> This new interface allows us to flush all 256 possible aliases
> for a given page, currently missed by tlb_flush_page*.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

> +static void tlb_flush_page_bits_by_mmuidx_async_1(CPUState *cpu,
> +                                                  run_on_cpu_data data)
> +{
> +    target_ulong addr_map_bits = (target_ulong) data.target_ptr;
> +    target_ulong addr = addr_map_bits & TARGET_PAGE_MASK;
> +    uint16_t idxmap = (addr_map_bits & ~TARGET_PAGE_MASK) >> 6;
> +    unsigned bits = addr_map_bits & 0x3f;

So this is unpacking...

> +    } else if (idxmap <= MAKE_64BIT_MASK(0, TARGET_PAGE_BITS - 6)) {
> +        run_on_cpu_data data
> +            = RUN_ON_CPU_TARGET_PTR(addr | (idxmap << 6) | bits);

...the value that we packed into an integer here...

> +        run_on_cpu_data data
> +            = RUN_ON_CPU_TARGET_PTR(addr | (idxmap << 6) | bits);

...here...

> +    if (idxmap <= MAKE_64BIT_MASK(0, TARGET_PAGE_BITS - 6)) {
> +        run_on_cpu_data data
> +            = RUN_ON_CPU_TARGET_PTR(addr | (idxmap << 6) | bits);

...and here.

Could we do something to avoid all these hard-coded 6s, and maybe make
it a bit clearer that these two operations are the inverse of each
other?

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM
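[For illustration, a hypothetical pack/unpack pair along the lines Peter suggests; this is not the actual patch, PAGE_BITS stands in for TARGET_PAGE_BITS, and every identifier is invented. Naming the shift and mask once makes the two operations visibly inverse, and the assert mirrors the patch's idxmap <= MAKE_64BIT_MASK(0, TARGET_PAGE_BITS - 6) fit check.

/* Hypothetical sketch, not the actual patch: define the field layout
 * once so encode and decode are visibly inverse.  The page-aligned
 * address keeps its high bits; idxmap and the bit count share the
 * page-offset bits. */
#include <assert.h>
#include <stdint.h>

#define PAGE_BITS          12        /* stand-in for TARGET_PAGE_BITS */
#define PAGE_MASK          (~(((uint64_t)1 << PAGE_BITS) - 1))
#define FLUSH_BITS_MASK    0x3full   /* low 6 bits hold the bit count */
#define FLUSH_IDXMAP_SHIFT 6         /* idxmap sits above the bit count */

static uint64_t encode_flush(uint64_t addr, uint16_t idxmap, unsigned bits)
{
    /* Mirrors the patch's guard: idxmap must fit between the bit count
     * and the page offset. */
    assert(idxmap < (1u << (PAGE_BITS - FLUSH_IDXMAP_SHIFT)));
    assert(bits <= FLUSH_BITS_MASK);
    return (addr & PAGE_MASK) | ((uint64_t)idxmap << FLUSH_IDXMAP_SHIFT) | bits;
}

static void decode_flush(uint64_t data, uint64_t *addr,
                         uint16_t *idxmap, unsigned *bits)
{
    *addr = data & PAGE_MASK;
    *idxmap = (uint16_t)((data & ~PAGE_MASK) >> FLUSH_IDXMAP_SHIFT);
    *bits = (unsigned)(data & FLUSH_BITS_MASK);
}

int main(void)
{
    uint64_t addr;
    uint16_t idxmap;
    unsigned bits;

    decode_flush(encode_flush((uint64_t)0x7f0000 << PAGE_BITS, 0x3, 56),
                 &addr, &idxmap, &bits);
    assert(addr == (uint64_t)0x7f0000 << PAGE_BITS);
    assert(idxmap == 0x3 && bits == 56);   /* round-trips by construction */
    return 0;
}
]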