[RFC,0/2] target/arm: Fix tlb flush page vs tbi

Message ID 20201001170752.82063-1-richard.henderson@linaro.org

Message

Richard Henderson Oct. 1, 2020, 5:07 p.m. UTC
Since the FAR_ELx fix in commit 38d931687fa1, it has been reported
that page-granularity flushing is broken.

This makes sense, since TCG records the entire virtual address
in its TLB, not just the 56 significant bits.  With no other
TCG support, the ARM backend would require 256 separate page
flushes to clear the virtual address for every possible
top-byte tag.
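
Concretely, without new support that would mean something like
this hypothetical loop (variable names invented for illustration):

    for (unsigned tag = 0; tag < 256; tag++) {
        /* Deposit the tag into bits [63:56] of the page address.  */
        uint64_t alias = deposit64(pageaddr, 56, 8, tag);
        tlb_flush_page_by_mmuidx(cs, alias, idxmap);
    }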

So I added a new TCG interface that allows passing the number of
significant bits in the virtual address.  A simple bit count seemed
like a cleaner interface than passing in a mask, since it means we
cannot be handed a nonsensical mask like 0xdeadbeef.  It also makes
it easy to redirect special cases.
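
Roughly, the new entry point looks like this (a sketch; see the
first patch for the real declarations):

    /*
     * Flush one page from the TLBs of the given MMU indexes, where
     * only the low @bits bits of the address are significant.
     */
    void tlb_flush_page_bits_by_mmuidx(CPUState *cpu, target_ulong addr,
                                       uint16_t idxmap, unsigned bits);

so that e.g. target/arm can pass bits = 56 when TBI is enabled and
have one call cover all 256 tag aliases.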

I don't have a test case that triggers the bug.  All I can say
is that (1) this still boots a normal kernel, and (2) the new
code paths are exercised, since the kernel enables TBI for EL0.

Jordan, can you test this please?


r~


Richard Henderson (2):
  accel/tcg: Add tlb_flush_page_bits_by_mmuidx*
  target/arm: Use tlb_flush_page_bits_by_mmuidx*

 include/exec/exec-all.h |  36 ++++++
 accel/tcg/cputlb.c      | 259 ++++++++++++++++++++++++++++++++++++++--
 target/arm/helper.c     |  46 +++++--
 3 files changed, 325 insertions(+), 16 deletions(-)

Comments

Jordan Frank Oct. 2, 2020, 6:19 p.m. UTC | #1
I can confirm this resolves the issue we were seeing. Thank you, Richard!

Best,
Jordan

Peter Maydell Oct. 8, 2020, 12:53 p.m. UTC | #2
On Thu, 1 Oct 2020 at 18:07, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On ARM, the Top Byte Ignore feature means that only 56 bits of
> the address are significant in the virtual address.  We are
> required to give the entire 64-bit address to FAR_ELx on fault,
> which means that we do not "clean" the top byte early in TCG.
>
> This new interface allows us to flush all 256 possible aliases
> for a given page, currently missed by tlb_flush_page*.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> +static void tlb_flush_page_bits_by_mmuidx_async_1(CPUState *cpu,
> +                                                  run_on_cpu_data data)
> +{
> +    target_ulong addr_map_bits = (target_ulong) data.target_ptr;
> +    target_ulong addr = addr_map_bits & TARGET_PAGE_MASK;
> +    uint16_t idxmap = (addr_map_bits & ~TARGET_PAGE_MASK) >> 6;
> +    unsigned bits = addr_map_bits & 0x3f;

So this is unpacking...

> +    } else if (idxmap <= MAKE_64BIT_MASK(0, TARGET_PAGE_BITS - 6)) {
> +        run_on_cpu_data data
> +            = RUN_ON_CPU_TARGET_PTR(addr | (idxmap << 6) | bits);

...the value that we packed into an integer here...

> +        run_on_cpu_data data
> +            = RUN_ON_CPU_TARGET_PTR(addr | (idxmap << 6) | bits);

...here...

> +    if (idxmap <= MAKE_64BIT_MASK(0, TARGET_PAGE_BITS - 6)) {
> +        run_on_cpu_data data
> +            = RUN_ON_CPU_TARGET_PTR(addr | (idxmap << 6) | bits);

...and here.

Could we do something to avoid all these hard-coded 6s and
maybe make it a bit clearer that these two operations
are the inverse of each other?
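
For instance, something along these lines (a rough sketch, with
invented names), so the shift count lives in one place and the
encode/decode pair is visibly symmetric:

/* Low 6 bits of the packed word carry the significant-bit count.  */
#define PBM_BITS_SHIFT  6

static run_on_cpu_data encode_addr_idx_bits(target_ulong addr,
                                            uint16_t idxmap,
                                            unsigned bits)
{
    return RUN_ON_CPU_TARGET_PTR(addr | (idxmap << PBM_BITS_SHIFT) | bits);
}

static void decode_addr_idx_bits(run_on_cpu_data data, target_ulong *addr,
                                 uint16_t *idxmap, unsigned *bits)
{
    target_ulong v = data.target_ptr;

    *addr = v & TARGET_PAGE_MASK;
    *idxmap = (v & ~TARGET_PAGE_MASK) >> PBM_BITS_SHIFT;
    *bits = v & MAKE_64BIT_MASK(0, PBM_BITS_SHIFT);
}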

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM