Message ID | 1466535515-18092-1-git-send-email-drjones@redhat.com |
---|---|
State | New |
Headers | show |
On 21 June 2016 at 19:58, Andrew Jones <drjones@redhat.com> wrote: > Signed-off-by: Andrew Jones <drjones@redhat.com> I think this commit message could be improved...it's both very short and a bit off the mark. > --- > hw/arm/virt.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c > index c5c125e9204a0..53f545921003c 100644 > --- a/hw/arm/virt.c > +++ b/hw/arm/virt.c > @@ -1271,6 +1271,16 @@ static void machvirt_init(MachineState *machine) > } > cpuobj = object_new(object_class_get_name(oc)); > > + /* Adjust MPIDR per the GIC's target-list size. */ > + if (gic_version == 3) { > + CPUState *cs = CPU(cpuobj); > + uint8_t Aff1 = cs->cpu_index / 16; > + uint8_t Aff0 = cs->cpu_index % 16; > + > + object_property_set_int(cpuobj, (Aff1 << ARM_AFF1_SHIFT) | Aff0, > + "mp-affinity", NULL); > + } We still don't have support in KVM for telling the CPU what affinity to use, so these may get overridden later if KVM's idea of affinity and ours differ. I guess that's no different to what we have today, though. I think it would be better to: * use the loop index 'n' rather than fishing the cpu_index out of the CPUState. * do this regardless of GIC version (if it's GICv2 we only have 8 CPUs max anyway) * comment it as "Create our CPUs in clusters of 16; this suits the GICv3's target list limitations, and matches how KVM assigns them" * for 32-bit, set the mp-affinity in the same arrangement the kernel does for KVM, which is clusters of 4 CPUs * note also in the comment that for KVM these will be overridden by the hard-coded topology in the kernel when the CPU is realized [is changing mp-affinity a migration compat break?] thanks -- PMM
On Thu, Jun 23, 2016 at 12:15:59PM +0100, Peter Maydell wrote: > On 21 June 2016 at 19:58, Andrew Jones <drjones@redhat.com> wrote: > > Signed-off-by: Andrew Jones <drjones@redhat.com> > > I think this commit message could be improved...it's both > very short and a bit off the mark. > > > --- > > hw/arm/virt.c | 10 ++++++++++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c > > index c5c125e9204a0..53f545921003c 100644 > > --- a/hw/arm/virt.c > > +++ b/hw/arm/virt.c > > @@ -1271,6 +1271,16 @@ static void machvirt_init(MachineState *machine) > > } > > cpuobj = object_new(object_class_get_name(oc)); > > > > + /* Adjust MPIDR per the GIC's target-list size. */ > > + if (gic_version == 3) { > > + CPUState *cs = CPU(cpuobj); > > + uint8_t Aff1 = cs->cpu_index / 16; > > + uint8_t Aff0 = cs->cpu_index % 16; > > + > > + object_property_set_int(cpuobj, (Aff1 << ARM_AFF1_SHIFT) | Aff0, > > + "mp-affinity", NULL); > > + } > > We still don't have support in KVM for telling the CPU what > affinity to use, so these may get overridden later if KVM's Informing KVM of MPIDR is still on my TODO, and I'm getting closer to having time to work on it. Spoiler: I'm thinking about changing vcpu-id to the compressed mpidr. We won't need it compressed if vcpu-id gets expanded to 64bits, which I think Igor would like to do. > idea of affinity and ours differ. I guess that's no different > to what we have today, though. Right, that was my thinking as well, the benefit is purely for TCG. > > I think it would be better to: > * use the loop index 'n' rather than fishing the cpu_index > out of the CPUState. > * do this regardless of GIC version (if it's GICv2 we only > have 8 CPUs max anyway) > * comment it as "Create our CPUs in clusters of 16; this suits > the GICv3's target list limitations, and matches how KVM > assigns them" > * for 32-bit, set the mp-affinity in the same arrangement the > kernel does for KVM, which is clusters of 4 CPUs > * note also in the comment that for KVM these will be overridden > by the hard-coded topology in the kernel when the CPU is > realized Will do all the above. Thanks for the suggestions. > > [is changing mp-affinity a migration compat break?] Indeed this would introduce a guest visible change with a migration from old to new. It seems we need to add a machine property every time we add a guest visible change, which isn't off by default and enabled with the cmdline, and then turn it off for all older versions. Thanks, drew
On Thu, Jun 23, 2016 at 12:15:59PM +0100, Peter Maydell wrote: > On 21 June 2016 at 19:58, Andrew Jones <drjones@redhat.com> wrote: > > Signed-off-by: Andrew Jones <drjones@redhat.com> > > I think this commit message could be improved...it's both > very short and a bit off the mark. > > > --- > > hw/arm/virt.c | 10 ++++++++++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c > > index c5c125e9204a0..53f545921003c 100644 > > --- a/hw/arm/virt.c > > +++ b/hw/arm/virt.c > > @@ -1271,6 +1271,16 @@ static void machvirt_init(MachineState *machine) > > } > > cpuobj = object_new(object_class_get_name(oc)); > > > > + /* Adjust MPIDR per the GIC's target-list size. */ > > + if (gic_version == 3) { > > + CPUState *cs = CPU(cpuobj); > > + uint8_t Aff1 = cs->cpu_index / 16; > > + uint8_t Aff0 = cs->cpu_index % 16; > > + > > + object_property_set_int(cpuobj, (Aff1 << ARM_AFF1_SHIFT) | Aff0, > > + "mp-affinity", NULL); > > + } > > We still don't have support in KVM for telling the CPU what > affinity to use, so these may get overridden later if KVM's > idea of affinity and ours differ. I guess that's no different > to what we have today, though. > > I think it would be better to: > * use the loop index 'n' rather than fishing the cpu_index > out of the CPUState. > * do this regardless of GIC version (if it's GICv2 we only > have 8 CPUs max anyway) > * comment it as "Create our CPUs in clusters of 16; this suits > the GICv3's target list limitations, and matches how KVM > assigns them" > * for 32-bit, set the mp-affinity in the same arrangement the > kernel does for KVM, which is clusters of 4 CPUs Actually this depends on the host. KVM will use all 8 tlist bits with ARM guests, when running on an AArch64 host. As a guest doesn't really have to worry about clusters for shared cache, then ignoring the 4 per cluster requirement is likely fine (is that a hard requirement anyway?), but it could confuse a guest. So we can either a) play it safe and always use clusters of 4 for ARM guests, and KVM will get "fixed" when we start managing the guest's MPIDR from userspace, or b) use 8 here, as TCG always has, and KVM does for AArch32 guests. This might be less safe, but also improves SGI efficiency. Thanks, drew > * note also in the comment that for KVM these will be overridden > by the hard-coded topology in the kernel when the CPU is > realized > > [is changing mp-affinity a migration compat break?] > > thanks > -- PMM >
On Fri, Jun 24, 2016 at 06:03:21PM +0200, Andrew Jones wrote: > On Thu, Jun 23, 2016 at 12:15:59PM +0100, Peter Maydell wrote: > > On 21 June 2016 at 19:58, Andrew Jones <drjones@redhat.com> wrote: > > > Signed-off-by: Andrew Jones <drjones@redhat.com> > > > > I think this commit message could be improved...it's both > > very short and a bit off the mark. > > > > > --- > > > hw/arm/virt.c | 10 ++++++++++ > > > 1 file changed, 10 insertions(+) > > > > > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c > > > index c5c125e9204a0..53f545921003c 100644 > > > --- a/hw/arm/virt.c > > > +++ b/hw/arm/virt.c > > > @@ -1271,6 +1271,16 @@ static void machvirt_init(MachineState *machine) > > > } > > > cpuobj = object_new(object_class_get_name(oc)); > > > > > > + /* Adjust MPIDR per the GIC's target-list size. */ > > > + if (gic_version == 3) { > > > + CPUState *cs = CPU(cpuobj); > > > + uint8_t Aff1 = cs->cpu_index / 16; > > > + uint8_t Aff0 = cs->cpu_index % 16; > > > + > > > + object_property_set_int(cpuobj, (Aff1 << ARM_AFF1_SHIFT) | Aff0, > > > + "mp-affinity", NULL); > > > + } > > > > We still don't have support in KVM for telling the CPU what > > affinity to use, so these may get overridden later if KVM's > > idea of affinity and ours differ. I guess that's no different > > to what we have today, though. > > > > I think it would be better to: > > * use the loop index 'n' rather than fishing the cpu_index > > out of the CPUState. > > * do this regardless of GIC version (if it's GICv2 we only > > have 8 CPUs max anyway) > > * comment it as "Create our CPUs in clusters of 16; this suits > > the GICv3's target list limitations, and matches how KVM > > assigns them" > > * for 32-bit, set the mp-affinity in the same arrangement the > > kernel does for KVM, which is clusters of 4 CPUs > > Actually this depends on the host. KVM will use all 8 tlist bits > with ARM guests, when running on an AArch64 host. As a guest doesn't > really have to worry about clusters for shared cache, then ignoring > the 4 per cluster requirement is likely fine (is that a hard > requirement anyway?), but it could confuse a guest. > > So we can either > a) play it safe and always use clusters of 4 for ARM guests, and > KVM will get "fixed" when we start managing the guest's MPIDR > from userspace, or > b) use 8 here, as TCG always has, and KVM does for AArch32 guests. > This might be less safe, but also improves SGI efficiency. Actually AArch32 guests would even use all 16 tlist bits on gicv3, if there was a KVM host available to try it. So the (b) option shouldn't be "use 8" it should be "don't treat 32-bit guests differently" Thanks, drew > > Thanks, > drew > > > > * note also in the comment that for KVM these will be overridden > > by the hard-coded topology in the kernel when the CPU is > > realized > > > > [is changing mp-affinity a migration compat break?] > > > > thanks > > -- PMM > >
On 24 June 2016 at 17:15, Andrew Jones <drjones@redhat.com> wrote: > On Fri, Jun 24, 2016 at 06:03:21PM +0200, Andrew Jones wrote: >> So we can either >> a) play it safe and always use clusters of 4 for ARM guests, and >> KVM will get "fixed" when we start managing the guest's MPIDR >> from userspace, or >> b) use 8 here, as TCG always has, and KVM does for AArch32 guests. >> This might be less safe, but also improves SGI efficiency. > > Actually AArch32 guests would even use all 16 tlist bits on gicv3, if > there was a KVM host available to try it. So the (b) option shouldn't > be "use 8" it should be "don't treat 32-bit guests differently" KVM AArch32 is 4 CPUs per cluster: http://lxr.free-electrons.com/source/arch/arm/kvm/coproc.c#L109 thanks -- PMM
On Fri, Jun 24, 2016 at 05:41:55PM +0100, Peter Maydell wrote: > On 24 June 2016 at 17:15, Andrew Jones <drjones@redhat.com> wrote: > > On Fri, Jun 24, 2016 at 06:03:21PM +0200, Andrew Jones wrote: > >> So we can either > >> a) play it safe and always use clusters of 4 for ARM guests, and > >> KVM will get "fixed" when we start managing the guest's MPIDR > >> from userspace, or > >> b) use 8 here, as TCG always has, and KVM does for AArch32 guests. > >> This might be less safe, but also improves SGI efficiency. > > > > Actually AArch32 guests would even use all 16 tlist bits on gicv3, if > > there was a KVM host available to try it. So the (b) option shouldn't > > be "use 8" it should be "don't treat 32-bit guests differently" > > KVM AArch32 is 4 CPUs per cluster: > http://lxr.free-electrons.com/source/arch/arm/kvm/coproc.c#L109 Hmm... yes, it should use coproc.c, but here's what I get when I test qemu-system-aarch64 \ -machine virt,gic-version=2,accel=kvm \ -cpu host,aarch64=off \ -device virtio-serial-device \ -device virtconsole,chardev=ctd \ -chardev testdev,id=ctd \ -display none -serial stdio \ -kernel arm/selftest.flat \ -append smp -smp 8 PSCI version 0.2 PASS: selftest: smp: PSCI version PASS: selftest: smp: CPU( 1) mpidr=80000001 PASS: selftest: smp: CPU( 2) mpidr=80000002 PASS: selftest: smp: CPU( 3) mpidr=80000003 PASS: selftest: smp: CPU( 4) mpidr=80000004 PASS: selftest: smp: CPU( 5) mpidr=80000005 PASS: selftest: smp: CPU( 6) mpidr=80000006 PASS: selftest: smp: CPU( 7) mpidr=80000007 PASS: selftest: smp: CPU( 0) mpidr=80000000 SUMMARY: 9 tests (arm/selftest.flat built from https://github.com/rhdrjones/kvm-unit-tests/tree/arm/gic) Other configurations give the expected mpidrs. I'll look into it more next week. Thanks, drew
On 24 June 2016 at 18:22, Andrew Jones <drjones@redhat.com> wrote: > On Fri, Jun 24, 2016 at 05:41:55PM +0100, Peter Maydell wrote: >> KVM AArch32 is 4 CPUs per cluster: >> http://lxr.free-electrons.com/source/arch/arm/kvm/coproc.c#L109 > > Hmm... yes, it should use coproc.c, but here's what I get when I > test > > qemu-system-aarch64 \ > -machine virt,gic-version=2,accel=kvm \ > -cpu host,aarch64=off \ > -device virtio-serial-device \ > -device virtconsole,chardev=ctd \ > -chardev testdev,id=ctd \ > -display none -serial stdio \ > -kernel arm/selftest.flat \ > -append smp -smp 8 This suggests that 32-bit-guest-on-64-bit-host and 32-bit-guest-on-32-bit-host differ... thanks -- PMM
On Fri, Jun 24, 2016 at 06:27:20PM +0100, Peter Maydell wrote: > On 24 June 2016 at 18:22, Andrew Jones <drjones@redhat.com> wrote: > > On Fri, Jun 24, 2016 at 05:41:55PM +0100, Peter Maydell wrote: > >> KVM AArch32 is 4 CPUs per cluster: > >> http://lxr.free-electrons.com/source/arch/arm/kvm/coproc.c#L109 > > > > Hmm... yes, it should use coproc.c, but here's what I get when I > > test > > > > qemu-system-aarch64 \ > > -machine virt,gic-version=2,accel=kvm \ > > -cpu host,aarch64=off \ > > -device virtio-serial-device \ > > -device virtconsole,chardev=ctd \ > > -chardev testdev,id=ctd \ > > -display none -serial stdio \ > > -kernel arm/selftest.flat \ > > -append smp -smp 8 > > This suggests that 32-bit-guest-on-64-bit-host and > 32-bit-guest-on-32-bit-host differ... Yes, this is the case. I just looked at KVM and, it shouldn't use coproc.c (that's not one of the shared files between 32 and 64 bit hosts), and there's no special handing in reset_mpidr for KVM_ARM_VCPU_EL1_32BIT. The only special handing is in handlers for trapped coproc accesses, which MPIDR is not. I think it makes sense that the 32bit guest view be consistent. This means we need one of two patches in KVM. Either a) decide we don't need to emulate clusters of 4, and just use the max the gic supports, or b) modify arm64's reset_mpidr to change behavior based on KVM_ARM_VCPU_EL1_32BIT. If the clusters of 4 thing is a hard requirement, then we should go that way. If not, as it doesn't seem to break guests today (aarch64=off and tcg guests have never done it) then I say we stop doing it on 32bit hosts too, as it will increase SGI efficiency. (I've added kvmarm and Chistoffer and Marc to CC) Thanks, drew
On Mon, Jun 27, 2016 at 08:51:43AM +0100, Marc Zyngier wrote: > On 27/06/16 07:41, Andrew Jones wrote: > > On Fri, Jun 24, 2016 at 06:27:20PM +0100, Peter Maydell wrote: > >> On 24 June 2016 at 18:22, Andrew Jones <drjones@redhat.com> wrote: > >>> On Fri, Jun 24, 2016 at 05:41:55PM +0100, Peter Maydell wrote: > >>>> KVM AArch32 is 4 CPUs per cluster: > >>>> http://lxr.free-electrons.com/source/arch/arm/kvm/coproc.c#L109 > >>> > >>> Hmm... yes, it should use coproc.c, but here's what I get when I > >>> test > >>> > >>> qemu-system-aarch64 \ > >>> -machine virt,gic-version=2,accel=kvm \ > >>> -cpu host,aarch64=off \ > >>> -device virtio-serial-device \ > >>> -device virtconsole,chardev=ctd \ > >>> -chardev testdev,id=ctd \ > >>> -display none -serial stdio \ > >>> -kernel arm/selftest.flat \ > >>> -append smp -smp 8 > >> > >> This suggests that 32-bit-guest-on-64-bit-host and > >> 32-bit-guest-on-32-bit-host differ... > > > > Yes, this is the case. I just looked at KVM and, it shouldn't use coproc.c > > (that's not one of the shared files between 32 and 64 bit hosts), and > > there's no special handing in reset_mpidr for KVM_ARM_VCPU_EL1_32BIT. > > The only special handing is in handlers for trapped coproc accesses, > > which MPIDR is not. > > > > I think it makes sense that the 32bit guest view be consistent. This > > means we need one of two patches in KVM. Either > > > > a) decide we don't need to emulate clusters of 4, and just use the > > max the gic supports, or > > b) modify arm64's reset_mpidr to change behavior based on > > KVM_ARM_VCPU_EL1_32BIT. > > > > If the clusters of 4 thing is a hard requirement, then we should go > > that way. If not, as it doesn't seem to break guests today (aarch64=off > > and tcg guests have never done it) then I say we stop doing it on 32bit > > hosts too, as it will increase SGI efficiency. > > I've never been fond of the 32bit behaviour, to be honest, and I'd > rather stick to the default being to max Aff0 on 64bit hosts (and before > anyone asks, yes, GICv3 support is coming to 32bit as well). OK > > What I think we should have though is a way for userspace to override > the defaults presented by KVM. That way, 64bit userspace can enforce 4 > CPU clusters if it sees fit. OK, so it sounds like you'd rather not get any KVM patches now, but rather just wait for userspace to take control. That means, in the case of non- userspace MPIDR controlled systems, we'll still have a difference between 32-on-64 and 32-on-32. Anyway, userspace controlled MPIDR is definitely the goal. I'm sneaking up on it. In order to justify it, I also need to add cpu topology support to mach-virt. In order to do that, I need to 1) cleanup cpu topology parameters in QEMU (RFC posted), 2) write the DT and ACPI generation patches (written, will post soon), 3) determine how to inform KVM of the desired MPIDR (set-one-reg or maybe change the vcpu-id to be the MPIDR?) 4) adjust the MPIDR. The patch that spurred this immediate discussion is loosely related to (4), as it starts adjusting MPIDR (but, atm, it's just to be consistent with KVM) Thanks, drew
diff --git a/hw/arm/virt.c b/hw/arm/virt.c index c5c125e9204a0..53f545921003c 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -1271,6 +1271,16 @@ static void machvirt_init(MachineState *machine) } cpuobj = object_new(object_class_get_name(oc)); + /* Adjust MPIDR per the GIC's target-list size. */ + if (gic_version == 3) { + CPUState *cs = CPU(cpuobj); + uint8_t Aff1 = cs->cpu_index / 16; + uint8_t Aff0 = cs->cpu_index % 16; + + object_property_set_int(cpuobj, (Aff1 << ARM_AFF1_SHIFT) | Aff0, + "mp-affinity", NULL); + } + /* Handle any CPU options specified by the user */ cc->parse_features(CPU(cpuobj), cpuopts, &err); g_free(cpuopts);
Signed-off-by: Andrew Jones <drjones@redhat.com> --- hw/arm/virt.c | 10 ++++++++++ 1 file changed, 10 insertions(+) -- 2.4.11