Message ID | 1413910544-20150-6-git-send-email-greg.bellows@linaro.org |
---|---|
State | New |
On 21 October 2014 at 17:55, Greg Bellows <greg.bellows@linaro.org> wrote:
> From: Fabian Aggeler <aggelerf@ethz.ch>
>
> Make arm_current_el() return EL3 for secure PL1 and monitor mode.
> Increase MMU modes since mmu_index is directly inferred from
> arm_current_el(). Change assertion in arm_el_is_aa64() to allow EL3.

> -#define NB_MMU_MODES 2
> +#define NB_MMU_MODES 4

So this turns out not to quite be what we want.

A QEMU MMU mode index basically defines a (vaddr -> paddr,permissions)
mapping. This is similar to the ARM ARM concept of a "translation
regime", with the differences that:
 * the ARM ARM translation regimes may have split permissions,
   for user and privileged code, so we need two mmu_idx values
   for a translation regime that applies to both EL0 and EL1
 * stage 1 and stage 2 translations for a VA->IPA->PA lookup
   for an EL1/EL0 hypervisor guest are two different translation
   regimes, but for QEMU we can just cache the whole VA->PA
   and use a single mmu_idx. [We only need to separately do
   VA->IPA and IPA->PA for the "do this address translation"
   system instructions, which don't need to touch the TLB;
   a combined stage1+stage2 TLB is permitted by the architecture.]

The translation regimes are:

If EL3 is 64-bit:
 * Secure EL3
 * Secure EL1 & EL0
 * NonSecure EL2
 * NonSecure EL1 & 0 stage 1
 * NonSecure EL1 & 0 stage 2
If EL3 is 32-bit:
 * Secure PL0 & PL1
 * NonSecure PL2
 * NonSecure PL1 & 0 stage 1
 * NonSecure PL1 & 0 stage 2
(reminder: for 32 bit EL3, Secure PL1 is *EL3*, not EL1.)

which we can give the following mmu indexes:

64 bit EL3:
 0 : NS EL0 stage 1+2
 1 : NS EL1 stage 1+2
 2 : NS EL2
 3 : S EL3
 4 : S EL0
 5 : S EL1

32 bit EL3:
 0 : NS EL0 (aka NS PL0) stage 1+2
 1 : NS EL1 (aka NS PL1) stage 1+2
 2 : NS EL2 (aka NS PL2)
 3 : S EL3 (aka S PL1)
 4 : S EL0 (aka S PL0)

Notice how they end up being the same, except that with a
64 bit EL3 we need an extra mmu index that 32 bit doesn't have.
They aren't simply "what is our current EL?", though as you
can see I've put them in an order that comes close.

So the right answer for NB_MMU_MODES is 6 :-)

-- PMM
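The index assignment in the table could be written down roughly as follows. This is only a sketch, not code from the tree: the enum names and the helper sketch_mmu_index() are hypothetical, and it assumes the caller already knows the security state and the value arm_current_el() would return.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the numbering follows the table in the mail. */
enum sketch_mmu_idx {
    SK_MMU_NS_EL0 = 0,   /* NS EL0, combined stage 1+2 */
    SK_MMU_NS_EL1 = 1,   /* NS EL1, combined stage 1+2 */
    SK_MMU_NS_EL2 = 2,
    SK_MMU_S_EL3  = 3,   /* with a 32-bit EL3 this is also Secure PL1 */
    SK_MMU_S_EL0  = 4,
    SK_MMU_S_EL1  = 5,   /* only used when EL3 is 64-bit */
    SK_NB_MMU_MODES = 6
};

/* Map (security state, current EL) to an mmu index.
 * With a 32-bit EL3, secure privileged code is expected to arrive
 * here with el == 3, so index 5 is never produced in that case.
 */
static inline int sketch_mmu_index(bool secure, int el)
{
    assert(el >= 0 && el <= 3);
    if (el == 3) {
        return SK_MMU_S_EL3;     /* EL3 only exists in the secure state */
    }
    if (el == 2) {
        return SK_MMU_NS_EL2;    /* the table lists EL2 as non-secure only */
    }
    if (secure) {
        return el == 0 ? SK_MMU_S_EL0 : SK_MMU_S_EL1;
    }
    return el;                   /* NS EL0 -> 0, NS EL1 -> 1 */
}
```

The point of the sketch is that the index is a function of both the security state and the EL, which is why it cannot simply be the result of arm_current_el().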
On 16 January 2015 at 18:36, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 21 October 2014 at 17:55, Greg Bellows <greg.bellows@linaro.org> wrote:
>> -#define NB_MMU_MODES 2
>> +#define NB_MMU_MODES 4
>
> So this turns out not to quite be what we want.
>
> A QEMU MMU mode index basically defines a (vaddr -> paddr,permissions)
> mapping. This is similar to the ARM ARM concept of a "translation
> regime", with the differences that:
>  * the ARM ARM translation regimes may have split permissions,
>    for user and privileged code, so we need two mmu_idx values
>    for a translation regime that applies to both EL0 and EL1
>  * stage 1 and stage 2 translations for a VA->IPA->PA lookup
>    for an EL1/EL0 hypervisor guest are two different translation
>    regimes, but for QEMU we can just cache the whole VA->PA
>    and use a single mmu_idx. [We only need to separately do
>    VA->IPA and IPA->PA for the "do this address translation"
>    system instructions, which don't need to touch the TLB;
>    a combined stage1+stage2 TLB is permitted by the architecture.]
>
> The translation regimes are:
>
> If EL3 is 64-bit:
>  * Secure EL3
>  * Secure EL1 & EL0
>  * NonSecure EL2
>  * NonSecure EL1 & 0 stage 1
>  * NonSecure EL1 & 0 stage 2
> If EL3 is 32-bit:
>  * Secure PL0 & PL1
>  * NonSecure PL2
>  * NonSecure PL1 & 0 stage 1
>  * NonSecure PL1 & 0 stage 2
> (reminder: for 32 bit EL3, Secure PL1 is *EL3*, not EL1.)
>
> which we can give the following mmu indexes:
>
> 64 bit EL3:
>  0 : NS EL0 stage 1+2
>  1 : NS EL1 stage 1+2
>  2 : NS EL2
>  3 : S EL3
>  4 : S EL0
>  5 : S EL1
>
> 32 bit EL3:
>  0 : NS EL0 (aka NS PL0) stage 1+2
>  1 : NS EL1 (aka NS PL1) stage 1+2
>  2 : NS EL2 (aka NS PL2)
>  3 : S EL3 (aka S PL1)
>  4 : S EL0 (aka S PL0)
>
> Notice how they end up being the same, except that with a
> 64 bit EL3 we need an extra mmu index that 32 bit doesn't have.
> They aren't simply "what is our current EL?", though as you
> can see I've put them in an order that comes close.
>
> So the right answer for NB_MMU_MODES is 6 :-)

...except we would also kind of like to be able to cache
NS stage 2 lookups, because otherwise every access we make
to a stage 1 page table word (accessed by IPA) is going to
require a full stage 2 page table walk. That would mean 7
MMU modes.

Richard: do you have a feel for how expensive it is to
have lots and lots of mmu modes? I might be able to
merge "S EL1" with "NS EL1 stage 1+2" and ditto "S EL0"
with "NS EL0 stage 1+2" but we'd need to do more TLB
flushing and it's not clear to me currently exactly
where the extra flushes would have to go...

-- PMM
On 19 January 2015 at 17:44, Richard Henderson <rth@twiddle.net> wrote:
> On 01/19/2015 05:22 AM, Peter Maydell wrote:
>> Richard: do you have a feel for how expensive it is to
>> have lots and lots of mmu modes? I might be able to
>> merge "S EL1" with "NS EL1 stage 1+2" and ditto "S EL0"
>> with "NS EL0 stage1 + 2" but we'd need to do more TLB
>> flushing and it's not clear to me currently exactly
>> where the extra flushes would have to go...
>
> It's 10k per mmu mode, more or less. That's what you've
> got to memset (to -1) whenever a flush occurs.

Hmm. If the tlb flush memset is the main perf issue, we could
let the target tell the generic code how many MMU modes it
was using at runtime. We might need 7 modes in the general
case, but we could avoid burdening "no TZ" or "no virtualization"
CPUs with the overhead of clearing TLB entries that we never
actually use.

Alternatively (better!), for a lot of the tlb_flush()es
triggered by target-arm code we could be more precise about
the affected mmu_idx values, since the common case is going
to be "NS EL1 did something that needs a TLB flush", and by
definition that can't affect TLB entries for EL2, EL3 or
S-EL1/EL0.

So I think my preference would be to use 7 mmu indexes, and
add a tlb_flush_mmuidx() function. (Assuming I'm not missing
anything that makes that not workable...)

-- PMM
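To make the cost argument concrete, here is a toy model of a flush restricted to selected mmu indexes. It is a sketch under stated assumptions, not the QEMU implementation: the types, the table size and the bitmask signature of sketch_tlb_flush_mmuidx() are all hypothetical (the mail only proposes the name tlb_flush_mmuidx()). The one detail taken from the thread is that a flush is a memset to -1 of each mode's table.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SK_NB_MMU_MODES 7     /* the count proposed in the mail */
#define SK_TLB_SIZE     256   /* assumed entries per mode, illustration only */

/* Hypothetical TLB entry layout; an all-ones entry is treated as invalid. */
typedef struct {
    uint64_t addr_read;
    uint64_t addr_write;
    uint64_t addr_code;
    uint64_t addend;
} SketchTLBEntry;

typedef struct {
    SketchTLBEntry tlb[SK_NB_MMU_MODES][SK_TLB_SIZE];
} SketchCPU;

/* Full flush: the cost grows linearly with the number of mmu modes. */
static void sketch_tlb_flush_all(SketchCPU *cpu)
{
    memset(cpu->tlb, -1, sizeof(cpu->tlb));
}

/* Partial flush: only the modes whose bit is set in idx_mask are cleared. */
static void sketch_tlb_flush_mmuidx(SketchCPU *cpu, unsigned idx_mask)
{
    for (int i = 0; i < SK_NB_MMU_MODES; i++) {
        if (idx_mask & (1u << i)) {
            memset(cpu->tlb[i], -1, sizeof(cpu->tlb[i]));
        }
    }
}
```

With the index numbering from earlier in the thread, the common "NS EL1 did something that needs a TLB flush" case would pass a mask covering only indexes 0 and 1, leaving the EL2, EL3 and secure tables untouched.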
==========

v6 -> v7
- Fix commit message

v5 -> v6
- Rework arm_current_el() logic to properly return EL3 for secure PL1 when EL3
  is 32-bit.
- Replace direct access of env->aarch64 with is_a64()

Signed-off-by: Greg Bellows <greg.bellows@linaro.org>
---
 target-arm/cpu.h | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 1138539..cb6ec5c 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -100,7 +100,7 @@ typedef uint32_t ARMReadCPFunc(void *opaque, int cp_info,
 
 struct arm_boot_info;
 
-#define NB_MMU_MODES 2
+#define NB_MMU_MODES 4
 
 /* We currently assume float and double are IEEE single and double
    precision respectively.
@@ -803,11 +803,12 @@ static inline bool arm_is_secure(CPUARMState *env)
 /* Return true if the specified exception level is running in AArch64 state. */
 static inline bool arm_el_is_aa64(CPUARMState *env, int el)
 {
-    /* We don't currently support EL2 or EL3, and this isn't valid for EL0
+    /* We don't currently support EL2, and this isn't valid for EL0
      * (if we're in EL0, is_a64() is what you want, and if we're not in EL0
      * then the state of EL0 isn't well defined.)
      */
-    assert(el == 1);
+    assert(el == 1 || el == 3);
+
     /* AArch64-capable CPUs always run with EL1 in AArch64 mode. This
      * is a QEMU-imposed simplification which we may wish to change later.
      * If we in future support EL2 and/or EL3, then the state of lower
@@ -996,17 +997,27 @@ static inline bool cptype_valid(int cptype)
  */
 static inline int arm_current_el(CPUARMState *env)
 {
-    if (env->aarch64) {
+    if (is_a64(env)) {
         return extract32(env->pstate, 2, 2);
     }
 
-    if ((env->uncached_cpsr & 0x1f) == ARM_CPU_MODE_USR) {
+    switch (env->uncached_cpsr & 0x1f) {
+    case ARM_CPU_MODE_USR:
         return 0;
+    case ARM_CPU_MODE_HYP:
+        return 2;
+    case ARM_CPU_MODE_MON:
+        return 3;
+    default:
+        if (arm_is_secure(env) && !arm_el_is_aa64(env, 3)) {
+            /* If EL3 is 32-bit then all secure privileged modes run in
+             * EL3
+             */
+            return 3;
+        }
+
+        return 1;
     }
-    /* We don't currently implement the Virtualization or TrustZone
-     * extensions, so EL2 and EL3 don't exist for us.
-     */
-    return 1;
 }
 
 typedef struct ARMCPRegInfo ARMCPRegInfo;
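
The decision logic of the reworked arm_current_el() can be checked outside the tree with a small standalone model. This is a sketch only: SketchEnv and sketch_current_el() are hypothetical stand-ins for CPUARMState and the real function, and the three boolean fields simply hold what is_a64(), arm_is_secure() and arm_el_is_aa64(env, 3) would report. The mode numbers are the architectural AArch32 CPSR.M encodings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* AArch32 CPSR.M mode encodings. */
#define SK_MODE_USR 0x10
#define SK_MODE_SVC 0x13
#define SK_MODE_MON 0x16
#define SK_MODE_HYP 0x1a

/* Hypothetical stand-in for the CPU state the patch consults. */
typedef struct {
    bool aarch64;       /* what is_a64(env) would report */
    bool secure;        /* what arm_is_secure(env) would report */
    bool el3_is_aa64;   /* what arm_el_is_aa64(env, 3) would report */
    uint32_t pstate;    /* AArch64 PSTATE, EL in bits [3:2] */
    uint32_t cpsr;      /* AArch32 CPSR, mode in bits [4:0] */
} SketchEnv;

static int sketch_current_el(const SketchEnv *env)
{
    if (env->aarch64) {
        return (env->pstate >> 2) & 3;   /* same bits as extract32(pstate, 2, 2) */
    }

    switch (env->cpsr & 0x1f) {
    case SK_MODE_USR:
        return 0;
    case SK_MODE_HYP:
        return 2;
    case SK_MODE_MON:
        return 3;
    default:
        /* With a 32-bit EL3, all secure privileged modes run in EL3. */
        if (env->secure && !env->el3_is_aa64) {
            return 3;
        }
        return 1;
    }
}
```

For example, secure SVC mode yields 3 when EL3 is 32-bit and 1 when EL3 is 64-bit, which is the v5 -> v6 change described in the changelog.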