diff mbox series

target/arm: Fix handling of LPAE block descriptors

Message ID 20220304165628.2345765-1-peter.maydell@linaro.org
State Superseded
Headers show
Series target/arm: Fix handling of LPAE block descriptors | expand

Commit Message

Peter Maydell March 4, 2022, 4:56 p.m. UTC
LPAE descriptors come in three forms:

 * table descriptors, giving the address of the next level page table
 * page descriptors, which occur only at level 3 and describe the
   mapping of one page (which might be 4K, 16K or 64K)
 * block descriptors, which occur at higher page table levels, and
   describe the mapping of huge pages

QEMU's page-table-walk code treats block and page entries
identically, simply ORing in a number of bits from the input virtual
address that depends on the level of the page table that we stopped
at; we depend on the previous masking of descaddr with descaddrmask
to have already cleared out the low bits of the descriptor word.

This is not quite right: the address field in a block descriptor is
smaller, and so there are bits which are valid address bits in a page
descriptor or a table descriptor but which are not supposed to be
part of the address in a block descriptor, and descaddrmask does not
clear them.  We previously mostly got away with this because those
descriptor bits are RES0; however with FEAT_BBM (part of Armv8.4)
block descriptor bit 16 is defined to be the nT bit.  No emulated
QEMU CPU has FEAT_BBM yet, but if the host CPU has it then we might
see it when using KVM or hvf.

Explicitly zero out all the descaddr bits we're about to OR vaddr
bits into.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/790
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Comments

Peter Maydell March 14, 2022, 6:12 p.m. UTC | #1
Ping for review, please?

thanks
-- PMM

On Fri, 4 Mar 2022 at 16:56, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> LPAE descriptors come in three forms:
>
>  * table descriptors, giving the address of the next level page table
>  * page descriptors, which occur only at level 3 and describe the
>    mapping of one page (which might be 4K, 16K or 64K)
>  * block descriptors, which occur at higher page table levels, and
>    describe the mapping of huge pages
>
> QEMU's page-table-walk code treats block and page entries
> identically, simply ORing in a number of bits from the input virtual
> address that depends on the level of the page table that we stopped
> at; we depend on the previous masking of descaddr with descaddrmask
> to have already cleared out the low bits of the descriptor word.
>
> This is not quite right: the address field in a block descriptor is
> smaller, and so there are bits which are valid address bits in a page
> descriptor or a table descriptor but which are not supposed to be
> part of the address in a block descriptor, and descaddrmask does not
> clear them.  We previously mostly got away with this because those
> descriptor bits are RES0; however with FEAT_BBM (part of Armv8.4)
> block descriptor bit 16 is defined to be the nT bit.  No emulated
> QEMU CPU has FEAT_BBM yet, but if the host CPU has it then we might
> see it when using KVM or hvf.
>
> Explicitly zero out all the descaddr bits we're about to OR vaddr
> bits into.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/790
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/helper.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 088956eecf0..b5c8caafe84 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11706,11 +11706,17 @@ static bool get_phys_addr_lpae(CPUARMState *env, uint64_t address,
>              indexmask = indexmask_grainsize;
>              continue;
>          }
> -        /* Block entry at level 1 or 2, or page entry at level 3.
> +        /*
> +         * Block entry at level 1 or 2, or page entry at level 3.
>           * These are basically the same thing, although the number
> -         * of bits we pull in from the vaddr varies.
> +         * of bits we pull in from the vaddr varies. Note that although
> +         * descaddrmask masks enough of the low bits of the descriptor
> +         * to give a correct page or table address, the address field
> +         * in a block descriptor is smaller; so we need to explicitly
> +         * clear the lower bits here before ORing in the low vaddr bits.
>           */
>          page_size = (1ULL << ((stride * (4 - level)) + 3));
> +        descaddr &= ~(page_size - 1);
>          descaddr |= (address & (page_size - 1));
>          /* Extract attributes from the descriptor */
>          attrs = extract64(descriptor, 2, 10)
> --
> 2.25.1
Richard Henderson March 14, 2022, 7:29 p.m. UTC | #2
On 3/4/22 08:56, Peter Maydell wrote:
> LPAE descriptors come in three forms:
> 
>   * table descriptors, giving the address of the next level page table
>   * page descriptors, which occur only at level 3 and describe the
>     mapping of one page (which might be 4K, 16K or 64K)
>   * block descriptors, which occur at higher page table levels, and
>     describe the mapping of huge pages
> 
> QEMU's page-table-walk code treats block and page entries
> identically, simply ORing in a number of bits from the input virtual
> address that depends on the level of the page table that we stopped
> at; we depend on the previous masking of descaddr with descaddrmask
> to have already cleared out the low bits of the descriptor word.
> 
> This is not quite right: the address field in a block descriptor is
> smaller, and so there are bits which are valid address bits in a page
> descriptor or a table descriptor but which are not supposed to be
> part of the address in a block descriptor, and descaddrmask does not
> clear them.  We previously mostly got away with this because those
> descriptor bits are RES0; however with FEAT_BBM (part of Armv8.4)
> block descriptor bit 16 is defined to be the nT bit.  No emulated
> QEMU CPU has FEAT_BBM yet, but if the host CPU has it then we might
> see it when using KVM or hvf.
> 
> Explicitly zero out all the descaddr bits we're about to OR vaddr
> bits into.
> 
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/790
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>   target/arm/helper.c | 10 ++++++++--
>   1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 088956eecf0..b5c8caafe84 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11706,11 +11706,17 @@ static bool get_phys_addr_lpae(CPUARMState *env, uint64_t address,
>               indexmask = indexmask_grainsize;
>               continue;
>           }
> -        /* Block entry at level 1 or 2, or page entry at level 3.
> +        /*
> +         * Block entry at level 1 or 2, or page entry at level 3.
>            * These are basically the same thing, although the number
> -         * of bits we pull in from the vaddr varies.
> +         * of bits we pull in from the vaddr varies. Note that although
> +         * descaddrmask masks enough of the low bits of the descriptor
> +         * to give a correct page or table address, the address field
> +         * in a block descriptor is smaller; so we need to explicitly
> +         * clear the lower bits here before ORing in the low vaddr bits.
>            */
>           page_size = (1ULL << ((stride * (4 - level)) + 3));
> +        descaddr &= ~(page_size - 1);
>           descaddr |= (address & (page_size - 1));

As is,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

though I wonder if it reads better as

     page_bits = stride * (4 - level) + 3;
     descaddr = deposit64(descaddr, 0, page_bits, address);


r~
Peter Maydell March 14, 2022, 7:33 p.m. UTC | #3
On Mon, 14 Mar 2022 at 19:29, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 3/4/22 08:56, Peter Maydell wrote:
> >           page_size = (1ULL << ((stride * (4 - level)) + 3));
> > +        descaddr &= ~(page_size - 1);
> >           descaddr |= (address & (page_size - 1));
>
> As is,
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
>
> though I wonder if it reads better as
>
>      page_bits = stride * (4 - level) + 3;
>      descaddr = deposit64(descaddr, 0, page_bits, address);

I thought about that, but we specifically need page_size
anyway to return it later:

    *page_size_ptr = page_size;

and calculating both page_bits and page_size felt a bit awkward.

-- PMM
diff mbox series

Patch

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 088956eecf0..b5c8caafe84 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11706,11 +11706,17 @@  static bool get_phys_addr_lpae(CPUARMState *env, uint64_t address,
             indexmask = indexmask_grainsize;
             continue;
         }
-        /* Block entry at level 1 or 2, or page entry at level 3.
+        /*
+         * Block entry at level 1 or 2, or page entry at level 3.
          * These are basically the same thing, although the number
-         * of bits we pull in from the vaddr varies.
+         * of bits we pull in from the vaddr varies. Note that although
+         * descaddrmask masks enough of the low bits of the descriptor
+         * to give a correct page or table address, the address field
+         * in a block descriptor is smaller; so we need to explicitly
+         * clear the lower bits here before ORing in the low vaddr bits.
          */
         page_size = (1ULL << ((stride * (4 - level)) + 3));
+        descaddr &= ~(page_size - 1);
         descaddr |= (address & (page_size - 1));
         /* Extract attributes from the descriptor */
         attrs = extract64(descriptor, 2, 10)