Message ID | 20180209143937.28866-49-andre.przywara@linaro.org |
---|---|
State | Superseded |
Headers | show |
Series | New VGIC(-v2) implementation | expand |
Hi, On 09/02/18 14:39, Andre Przywara wrote: > At the moment we allocate exactly one page for struct vcpu on ARM, also > have a check in place to prevent it growing beyond 4KB. > As the struct includes the state of all 32 private (per-VCPU) interrupts, > we are at 3840 bytes on arm64 at the moment already. Growing the per-IRQ > VGIC structure even slightly makes the VCPU quickly exceed the 4K limit. > The new VGIC will need more space per virtual IRQ. I spent a few hours > trying to trim this down, but couldn't get it below 4KB, even with the > nasty hacks piling up to save some bytes here and there. > It turns out that beyond efficiency, maybe, there is no real technical > reason this struct has to fit in one page, so lifting the limit to two > pages seems like the most pragmatic solution. I looked briefly: sizeof(struct vcpu): arm32: 3809 arm64: 5248 Clearly, bumping to 8K for 32-bit as well is a no go for me. For arm64: struct vgic_irq { spinlock_t irq_lock; /* 0 8 */ struct list_head lpi_list; /* 8 16 */ struct list_head ap_list; /* 24 16 */ struct vcpu * vcpu; /* 40 8 */ struct vcpu * target_vcpu; /* 48 8 */ u32 intid; /* 56 4 */ _Bool line_level; /* 60 1 */ _Bool pending_latch;/* 61 1 */ _Bool active; /* 62 1 */ _Bool enabled; /* 63 1 */ /* --- cacheline 1 boundary (64 bytes) --- */ _Bool hw; /* 64 1 */ /* XXX 3 bytes hole, try to pack */ atomic_t refcount; /* 68 4 */ u32 hwintid; /* 72 4 */ union { u8 targets; /* 1 */ u32 mpidr; /* 4 */ }; /* 76 4 */ u8 source; /* 80 1 */ u8 priority; /* 81 1 */ /* XXX 2 bytes hole, try to pack */ enum vgic_irq_config config; /* 84 4 */ /* size: 88, cachelines: 2, members: 17 */ /* sum members: 83, holes: 2, sum holes: 5 */ /* last cacheline: 24 bytes */ }; - There are 2 holes, with a total waste of 5 bytes per IRQ. - The bool fields could be turned into bool :1. - config is 4 bytes just for 2 values! Probably worth considered a different type (like a bool). You multiple that saving per 32 and it will free a bit of space. Even if it is not going to reach the 4K, those small optimizing are not invasive. Looking at the other structures, struct vgic_v{2,3}_cpu_if are duplicating struct gic_v{2,3}. The state is restored using the latter, but it looks like LRs will be saved in both structure... You probably want to look at streamlining that. I might be convinced we need to bump the size of vCPU allocated, but for that I need to see that effort have been made to minimize the size of th new structures. Cheers, > > Signed-off-by: Andre Przywara <andre.przywara@linaro.org> > --- > xen/arch/arm/domain.c | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > index 87bd493924..4dd34393f1 100644 > --- a/xen/arch/arm/domain.c > +++ b/xen/arch/arm/domain.c > @@ -502,10 +502,13 @@ void dump_pageframe_info(struct domain *d) > struct vcpu *alloc_vcpu_struct(void) > { > struct vcpu *v; > - BUILD_BUG_ON(sizeof(*v) > PAGE_SIZE); > - v = alloc_xenheap_pages(0, 0); > - if ( v != NULL ) > + > + BUILD_BUG_ON(sizeof(*v) > 2 * PAGE_SIZE); > + v = alloc_xenheap_pages(1, 0); > + if ( v != NULL ) { > clear_page(v); > + clear_page((void *)v + PAGE_SIZE); > + } > return v; > } > >
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index 87bd493924..4dd34393f1 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -502,10 +502,13 @@ void dump_pageframe_info(struct domain *d) struct vcpu *alloc_vcpu_struct(void) { struct vcpu *v; - BUILD_BUG_ON(sizeof(*v) > PAGE_SIZE); - v = alloc_xenheap_pages(0, 0); - if ( v != NULL ) + + BUILD_BUG_ON(sizeof(*v) > 2 * PAGE_SIZE); + v = alloc_xenheap_pages(1, 0); + if ( v != NULL ) { clear_page(v); + clear_page((void *)v + PAGE_SIZE); + } return v; }
At the moment we allocate exactly one page for struct vcpu on ARM, also have a check in place to prevent it growing beyond 4KB. As the struct includes the state of all 32 private (per-VCPU) interrupts, we are at 3840 bytes on arm64 at the moment already. Growing the per-IRQ VGIC structure even slightly makes the VCPU quickly exceed the 4K limit. The new VGIC will need more space per virtual IRQ. I spent a few hours trying to trim this down, but couldn't get it below 4KB, even with the nasty hacks piling up to save some bytes here and there. It turns out that beyond efficiency, maybe, there is no real technical reason this struct has to fit in one page, so lifting the limit to two pages seems like the most pragmatic solution. Signed-off-by: Andre Przywara <andre.przywara@linaro.org> --- xen/arch/arm/domain.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)