mbox series

[v3,0/7] pseries NUMA distance rework

Message ID 20200903220639.563090-1-danielhb413@gmail.com
Headers show
Series pseries NUMA distance rework | expand

Message

Daniel Henrique Barboza Sept. 3, 2020, 10:06 p.m. UTC
Hi,

This spin attempts to cover all suggestions and concerns pointed
out by David Gibson in the v2 review.

The patches were rebased with David's ppc-for-5.2 at 615ae3763144.
They can also be cloned from
https://github.com/danielhb/qemu/tree/spapr_numa_v3.

Changes from v2:
- patches 1 and 2 from v2 -> already pushed to ppc-for-5.2
- patch 1 (former 3):
    * numa_assoc_array moved to SpaprMachineState
- patch 3 (former 5):
    * use memcpy
    * fix index increment to use MAX_DISTANCE_REF_POINTS
- patch 4 (former 6):
    * revamped. NVLink2 associativity is now calculated in
spapr_numa_associativity_init(). GPU code will use the same
helper everyone but vcpus uses to write the associativity DT
- patch 5 - new
- patch 6 - new
- patch 7:
    * no more brazilian portuguese notes in the commit message
    * change the code to handle an arbitrary vcpu associativity
array, retrieved with a new helper added by patch 6.

v2 link: https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg00261.html


Daniel Henrique Barboza (7):
  spapr: introduce SpaprMachineState::numa_assoc_array
  spapr, spapr_numa: handle vcpu ibm,associativity
  spapr, spapr_numa: move lookup-arrays handling to spapr_numa.c
  spapr_numa: move NVLink2 associativity handling to spapr_numa.c
  spapr: move h_home_node_associativity to spapr_numa.c
  spapr_numa: create a vcpu associativity helper
  spapr_numa: use spapr_numa_get_vcpu_assoc() in home_node hcall

 hw/ppc/spapr.c                |  65 +++---------
 hw/ppc/spapr_hcall.c          |  37 +------
 hw/ppc/spapr_numa.c           | 184 ++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_nvdimm.c         |  19 ++--
 hw/ppc/spapr_pci.c            |   9 +-
 hw/ppc/spapr_pci_nvlink2.c    |  20 +---
 include/hw/ppc/spapr.h        |  12 +++
 include/hw/ppc/spapr_numa.h   |  19 ++++
 include/hw/ppc/spapr_nvdimm.h |   2 +-
 9 files changed, 242 insertions(+), 125 deletions(-)

Comments

Daniel Henrique Barboza Sept. 3, 2020, 10:06 p.m. UTC | #1
The current implementation of h_home_node_associativity hard codes
the values of associativity domains of the vcpus. Let's make
it consider the values already initialized in spapr->numa_assoc_array,
via the spapr_numa_get_vcpu_assoc() helper.

We want to set it and forget it, and for that we also need to
assert that we don't overflow the registers of the hypercall.
David Gibson Sept. 3, 2020, 11:23 p.m. UTC | #2
On Thu, Sep 03, 2020 at 07:06:34PM -0300, Daniel Henrique Barboza wrote:
> Vcpus have an additional paramenter to be appended, vcpu_id. This
> also changes the size of the of property itself, which is being
> represented in index 0 of numa_assoc_array[cpu->node_id],
> and defaults to MAX_DISTANCE_REF_POINTS for all cases but
> vcpus.
> 
> All this logic makes more sense in spapr_numa.c, where we handle
> everything NUMA and associativity. A new helper spapr_numa_fixup_cpu_dt()
> was added, and spapr.c uses it the same way as it was using the former
> spapr_fixup_cpu_numa_dt().
> 
> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> ---
>  hw/ppc/spapr.c              | 17 +----------------
>  hw/ppc/spapr_numa.c         | 27 +++++++++++++++++++++++++++
>  include/hw/ppc/spapr_numa.h |  2 ++
>  3 files changed, 30 insertions(+), 16 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 1ad6f59863..badfa86319 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -202,21 +202,6 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
>      return ret;
>  }
>  
> -static int spapr_fixup_cpu_numa_dt(void *fdt, int offset, PowerPCCPU *cpu)
> -{
> -    int index = spapr_get_vcpu_id(cpu);
> -    uint32_t associativity[] = {cpu_to_be32(0x5),
> -                                cpu_to_be32(0x0),
> -                                cpu_to_be32(0x0),
> -                                cpu_to_be32(0x0),
> -                                cpu_to_be32(cpu->node_id),
> -                                cpu_to_be32(index)};
> -
> -    /* Advertise NUMA via ibm,associativity */
> -    return fdt_setprop(fdt, offset, "ibm,associativity", associativity,
> -                          sizeof(associativity));
> -}
> -
>  static void spapr_dt_pa_features(SpaprMachineState *spapr,
>                                   PowerPCCPU *cpu,
>                                   void *fdt, int offset)
> @@ -785,7 +770,7 @@ static void spapr_dt_cpu(CPUState *cs, void *fdt, int offset,
>                        pft_size_prop, sizeof(pft_size_prop))));
>  
>      if (ms->numa_state->num_nodes > 1) {
> -        _FDT(spapr_fixup_cpu_numa_dt(fdt, offset, cpu));
> +        _FDT(spapr_numa_fixup_cpu_dt(spapr, fdt, offset, cpu));
>      }
>  
>      _FDT(spapr_fixup_cpu_smt_dt(fdt, offset, cpu, compat_smt));
> diff --git a/hw/ppc/spapr_numa.c b/hw/ppc/spapr_numa.c
> index f6b6fe648f..1a1ec8bcff 100644
> --- a/hw/ppc/spapr_numa.c
> +++ b/hw/ppc/spapr_numa.c
> @@ -45,6 +45,33 @@ void spapr_numa_write_associativity_dt(SpaprMachineState *spapr, void *fdt,
>                        sizeof(spapr->numa_assoc_array[nodeid]))));
>  }
>  
> +int spapr_numa_fixup_cpu_dt(SpaprMachineState *spapr, void *fdt,
> +                            int offset, PowerPCCPU *cpu)
> +{
> +    uint vcpu_assoc_size = NUMA_ASSOC_SIZE + 1;
> +    uint32_t vcpu_assoc[vcpu_assoc_size];
> +    int index = spapr_get_vcpu_id(cpu);
> +    int i;
> +
> +    /*
> +     * VCPUs have an extra 'cpu_id' value in ibm,associativity
> +     * compared to other resources. Increment the size at index
> +     * 0, copy all associativity domains already set, then put
> +     * cpu_id last.
> +     */
> +    vcpu_assoc[0] = cpu_to_be32(MAX_DISTANCE_REF_POINTS + 1);
> +
> +    for (i = 1; i <= MAX_DISTANCE_REF_POINTS; i++) {
> +        vcpu_assoc[i] = spapr->numa_assoc_array[cpu->node_id][i];
> +    }

You could use a single memcpy() here as well.

> +
> +    vcpu_assoc[vcpu_assoc_size - 1] = cpu_to_be32(index);
> +
> +    /* Advertise NUMA via ibm,associativity */
> +    return fdt_setprop(fdt, offset, "ibm,associativity",
> +                       vcpu_assoc, sizeof(vcpu_assoc));
> +}
> +
>  /*
>   * Helper that writes ibm,associativity-reference-points and
>   * max-associativity-domains in the RTAS pointed by @rtas
> diff --git a/include/hw/ppc/spapr_numa.h b/include/hw/ppc/spapr_numa.h
> index a2a4df55f7..43c6a16fe3 100644
> --- a/include/hw/ppc/spapr_numa.h
> +++ b/include/hw/ppc/spapr_numa.h
> @@ -27,5 +27,7 @@ void spapr_numa_associativity_init(SpaprMachineState *spapr,
>  void spapr_numa_write_rtas_dt(SpaprMachineState *spapr, void *fdt, int rtas);
>  void spapr_numa_write_associativity_dt(SpaprMachineState *spapr, void *fdt,
>                                         int offset, int nodeid);
> +int spapr_numa_fixup_cpu_dt(SpaprMachineState *spapr, void *fdt,
> +                            int offset, PowerPCCPU *cpu);
>  
>  #endif /* HW_SPAPR_NUMA_H */
David Gibson Sept. 3, 2020, 11:25 p.m. UTC | #3
On Thu, Sep 03, 2020 at 07:06:37PM -0300, Daniel Henrique Barboza wrote:
> The implementation of this hypercall will be modified to use
> spapr->numa_assoc_arrays input. Moving it to spapr_numa.c makes
> make more sense.
> 
> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> ---
>  hw/ppc/spapr_hcall.c        | 37 +------------------------------------
>  hw/ppc/spapr_numa.c         | 36 ++++++++++++++++++++++++++++++++++++
>  include/hw/ppc/spapr_numa.h |  4 ++++
>  3 files changed, 41 insertions(+), 36 deletions(-)
> 
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index c1d01228c6..9e9b959bbd 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -18,6 +18,7 @@
>  #include "kvm_ppc.h"
>  #include "hw/ppc/fdt.h"
>  #include "hw/ppc/spapr_ovec.h"
> +#include "hw/ppc/spapr_numa.h"
>  #include "mmu-book3s-v3.h"
>  #include "hw/mem/memory-device.h"
>  
> @@ -1873,42 +1874,6 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
>      return ret;
>  }
>  
> -static target_ulong h_home_node_associativity(PowerPCCPU *cpu,
> -                                              SpaprMachineState *spapr,
> -                                              target_ulong opcode,
> -                                              target_ulong *args)
> -{
> -    target_ulong flags = args[0];
> -    target_ulong procno = args[1];
> -    PowerPCCPU *tcpu;
> -    int idx;
> -
> -    /* only support procno from H_REGISTER_VPA */
> -    if (flags != 0x1) {
> -        return H_FUNCTION;
> -    }
> -
> -    tcpu = spapr_find_cpu(procno);
> -    if (tcpu == NULL) {
> -        return H_P2;
> -    }
> -
> -    /* sequence is the same as in the "ibm,associativity" property */
> -
> -    idx = 0;
> -#define ASSOCIATIVITY(a, b) (((uint64_t)(a) << 32) | \
> -                             ((uint64_t)(b) & 0xffffffff))
> -    args[idx++] = ASSOCIATIVITY(0, 0);
> -    args[idx++] = ASSOCIATIVITY(0, tcpu->node_id);
> -    args[idx++] = ASSOCIATIVITY(procno, -1);
> -    for ( ; idx < 6; idx++) {
> -        args[idx] = -1;
> -    }
> -#undef ASSOCIATIVITY
> -
> -    return H_SUCCESS;
> -}
> -
>  static target_ulong h_get_cpu_characteristics(PowerPCCPU *cpu,
>                                                SpaprMachineState *spapr,
>                                                target_ulong opcode,
> diff --git a/hw/ppc/spapr_numa.c b/hw/ppc/spapr_numa.c
> index 93a000b729..d4dca57321 100644
> --- a/hw/ppc/spapr_numa.c
> +++ b/hw/ppc/spapr_numa.c
> @@ -165,3 +165,39 @@ void spapr_numa_write_rtas_dt(SpaprMachineState *spapr, void *fdt, int rtas)
>      _FDT(fdt_setprop(fdt, rtas, "ibm,max-associativity-domains",
>                       maxdomains, sizeof(maxdomains)));
>  }
> +
> +target_ulong h_home_node_associativity(PowerPCCPU *cpu,
> +                                       SpaprMachineState *spapr,
> +                                       target_ulong opcode,
> +                                       target_ulong *args)

So, if you're moving this function to this file, you should also move
the corresponding spapr_register_hypercall() to this file.  That will
let you keep this function static.

> +{
> +    target_ulong flags = args[0];
> +    target_ulong procno = args[1];
> +    PowerPCCPU *tcpu;
> +    int idx;
> +
> +    /* only support procno from H_REGISTER_VPA */
> +    if (flags != 0x1) {
> +        return H_FUNCTION;
> +    }
> +
> +    tcpu = spapr_find_cpu(procno);
> +    if (tcpu == NULL) {
> +        return H_P2;
> +    }
> +
> +    /* sequence is the same as in the "ibm,associativity" property */
> +
> +    idx = 0;
> +#define ASSOCIATIVITY(a, b) (((uint64_t)(a) << 32) | \
> +                             ((uint64_t)(b) & 0xffffffff))
> +    args[idx++] = ASSOCIATIVITY(0, 0);
> +    args[idx++] = ASSOCIATIVITY(0, tcpu->node_id);
> +    args[idx++] = ASSOCIATIVITY(procno, -1);
> +    for ( ; idx < 6; idx++) {
> +        args[idx] = -1;
> +    }
> +#undef ASSOCIATIVITY
> +
> +    return H_SUCCESS;
> +}
> diff --git a/include/hw/ppc/spapr_numa.h b/include/hw/ppc/spapr_numa.h
> index b3fd950634..5b4d165c06 100644
> --- a/include/hw/ppc/spapr_numa.h
> +++ b/include/hw/ppc/spapr_numa.h
> @@ -31,5 +31,9 @@ int spapr_numa_fixup_cpu_dt(SpaprMachineState *spapr, void *fdt,
>                              int offset, PowerPCCPU *cpu);
>  int spapr_numa_write_assoc_lookup_arrays(SpaprMachineState *spapr, void *fdt,
>                                           int offset);
> +target_ulong h_home_node_associativity(PowerPCCPU *cpu,
> +                                       SpaprMachineState *spapr,
> +                                       target_ulong opcode,
> +                                       target_ulong *args);
>  
>  #endif /* HW_SPAPR_NUMA_H */