[v2,0/7] pseries NUMA distance rework

Message ID 20200901125645.118026-1-danielhb413@gmail.com

Message

Daniel Henrique Barboza Sept. 1, 2020, 12:56 p.m. UTC
Hi,

Following the reviews of the first version [1], especially this
reply from David [2], I decided to take a step back and refactor
all the code in hw/ppc/spapr* that operates with ibm,associativity,
ibm,associativity-reference-points and ibm,max-associativity-domains.

A new file named 'spapr_numa.c' was created to gather all the
associativity related code into helpers that write NUMA/associativity
related info to the FDT. These helpers are then used in other
spapr_* files. This allows us to change NUMA related code in a
single location, instead of searching every file to see where and
how associativity is being written, and all the soon-to-be more
complex logic can be contained in spapr_numa.c. I consider
the end result to be better than what I ended up doing in v1.
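
As an illustration of the shape of the new API (a sketch, not lifted
verbatim from the patches), a caller that used to open-code the
ibm,associativity property like this:

    uint32_t associativity[] = { cpu_to_be32(0x4),
                                 cpu_to_be32(0x0),
                                 cpu_to_be32(0x0),
                                 cpu_to_be32(0x0),
                                 cpu_to_be32(nodeid) };
    _FDT(fdt_setprop(fdt, offset, "ibm,associativity",
                     associativity, sizeof(associativity)));

now does a single call into spapr_numa.c instead:

    spapr_numa_write_associativity_dt(spapr, fdt, offset, nodeid);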

Unlike v1, there is no NUMA distance change being done in this series.
Later on, the hub of the new NUMA distance calculation will be
spapr_numa_associativity_init(), where we'll take into consideration
user input from numa_state, adjust sizes to what the PAPR kernel
understands, and establish associativity domains between the NUMA nodes.


Changes from v1:
- all the patches that made guest-visible changes were removed. They
will be re-submitted in a follow-up series after this one.
- patch 02 from v1 will be reworked and reposted in the follow-up
series as well.
- version 1 link:
https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg03169.html



These patches were rebased using David's ppc-for-5.2 tree. Github
repo with the patches applied:

https://github.com/danielhb/qemu/tree/spapr_numa_v2


[1] https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg03169.html
[2] https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg04661.html
 
Daniel Henrique Barboza (7):
  ppc: introducing spapr_numa.c NUMA code helper
  ppc/spapr_nvdimm: turn spapr_dt_nvdimm() static
  spapr: introduce SpaprMachineClass::numa_assoc_array
  spapr, spapr_numa: handle vcpu ibm,associativity
  spapr, spapr_numa: move lookup-arrays handling to spapr_numa.c
  spapr_numa: move NVLink2 associativity handling to spapr_numa.c
  spapr_hcall: h_home_node_associativity now reads numa_assoc_array

 hw/ppc/meson.build            |   3 +-
 hw/ppc/spapr.c                |  91 +++---------------
 hw/ppc/spapr_hcall.c          |  16 +++-
 hw/ppc/spapr_numa.c           | 172 ++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_nvdimm.c         |  37 ++++----
 hw/ppc/spapr_pci.c            |   9 +-
 hw/ppc/spapr_pci_nvlink2.c    |  19 +---
 include/hw/ppc/spapr.h        |  13 ++-
 include/hw/ppc/spapr_numa.h   |  32 +++++++
 include/hw/ppc/spapr_nvdimm.h |   3 +-
 10 files changed, 268 insertions(+), 127 deletions(-)
 create mode 100644 hw/ppc/spapr_numa.c
 create mode 100644 include/hw/ppc/spapr_numa.h

Comments

David Gibson Sept. 3, 2020, 1:34 a.m. UTC | #1
On Tue, Sep 01, 2020 at 09:56:43AM -0300, Daniel Henrique Barboza wrote:
> In a similar fashion as the previous patch, let's move the
> handling of ibm,associativity-lookup-arrays from spapr.c to
> spapr_numa.c. A spapr_numa_write_assoc_lookup_arrays() helper was
> created, and spapr_dt_dynamic_reconfiguration_memory() can now
> use it to advertise the lookup-arrays.
> 
> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> ---
>  hw/ppc/spapr.c              | 25 ++----------------------
>  hw/ppc/spapr_numa.c         | 39 +++++++++++++++++++++++++++++++++++++
>  include/hw/ppc/spapr_numa.h |  2 ++
>  3 files changed, 43 insertions(+), 23 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 172f965fe0..65d2ccd578 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -535,13 +535,10 @@ static int spapr_dt_dynamic_reconfiguration_memory(SpaprMachineState *spapr,
>                                                     void *fdt)
>  {
>      MachineState *machine = MACHINE(spapr);
> -    int nb_numa_nodes = machine->numa_state->num_nodes;
> -    int ret, i, offset;
> +    int ret, offset;
>      uint64_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE;
>      uint32_t prop_lmb_size[] = {cpu_to_be32(lmb_size >> 32),
>                                  cpu_to_be32(lmb_size & 0xffffffff)};
> -    uint32_t *int_buf, *cur_index, buf_len;
> -    int nr_nodes = nb_numa_nodes ? nb_numa_nodes : 1;
>      MemoryDeviceInfoList *dimms = NULL;
>  
>      /*
> @@ -582,25 +579,7 @@ static int spapr_dt_dynamic_reconfiguration_memory(SpaprMachineState *spapr,
>          return ret;
>      }
>  
> -    /* ibm,associativity-lookup-arrays */
> -    buf_len = (nr_nodes * 4 + 2) * sizeof(uint32_t);
> -    cur_index = int_buf = g_malloc0(buf_len);
> -    int_buf[0] = cpu_to_be32(nr_nodes);
> -    int_buf[1] = cpu_to_be32(4); /* Number of entries per associativity list */
> -    cur_index += 2;
> -    for (i = 0; i < nr_nodes; i++) {
> -        uint32_t associativity[] = {
> -            cpu_to_be32(0x0),
> -            cpu_to_be32(0x0),
> -            cpu_to_be32(0x0),
> -            cpu_to_be32(i)
> -        };
> -        memcpy(cur_index, associativity, sizeof(associativity));
> -        cur_index += 4;
> -    }
> -    ret = fdt_setprop(fdt, offset, "ibm,associativity-lookup-arrays", int_buf,
> -            (cur_index - int_buf) * sizeof(uint32_t));
> -    g_free(int_buf);
> +    ret = spapr_numa_write_assoc_lookup_arrays(spapr, fdt, offset);
>  
>      return ret;
>  }
> diff --git a/hw/ppc/spapr_numa.c b/hw/ppc/spapr_numa.c
> index b8882d209e..9eb4bdbe80 100644
> --- a/hw/ppc/spapr_numa.c
> +++ b/hw/ppc/spapr_numa.c
> @@ -75,6 +75,45 @@ int spapr_numa_fixup_cpu_dt(SpaprMachineState *spapr, void *fdt,
>                         vcpu_assoc, sizeof(vcpu_assoc));
>  }
>  
> +
> +int spapr_numa_write_assoc_lookup_arrays(SpaprMachineState *spapr, void *fdt,
> +                                         int offset)
> +{
> +    MachineState *machine = MACHINE(spapr);
> +    SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> +    int nb_numa_nodes = machine->numa_state->num_nodes;
> +    int nr_nodes = nb_numa_nodes ? nb_numa_nodes : 1;
> +    uint32_t *int_buf, *cur_index, buf_len;
> +    int ret, i, j;
> +
> +    /* ibm,associativity-lookup-arrays */
> +    buf_len = (nr_nodes * MAX_DISTANCE_REF_POINTS + 2) * sizeof(uint32_t);
> +    cur_index = int_buf = g_malloc0(buf_len);
> +    int_buf[0] = cpu_to_be32(nr_nodes);
> +     /* Number of entries per associativity list */
> +    int_buf[1] = cpu_to_be32(MAX_DISTANCE_REF_POINTS);
> +    cur_index += 2;
> +    for (i = 0; i < nr_nodes; i++) {
> +        /*
> +         * For the lookup-array we use the ibm,associativity array,
> +         * from numa_assoc_array. without the first element (size).
> +         */
> +        uint32_t associativity[MAX_DISTANCE_REF_POINTS];
> +
> +        for (j = 0; j < MAX_DISTANCE_REF_POINTS; j++) {
> +            associativity[j] = smc->numa_assoc_array[i][j + 1];
> +        }
> +
> +        memcpy(cur_index, associativity, sizeof(associativity));

AFAICT, you could just use a single memcpy() to copy from the
numa_assoc_array() into the property here, rather than using a loop
and temporary array.
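
Something like this, I'd think (untested sketch; the rows of
numa_assoc_array are already stored big-endian by the init code, so
they can be copied as-is, skipping the leading size element):

        memcpy(cur_index, &smc->numa_assoc_array[i][1],
               MAX_DISTANCE_REF_POINTS * sizeof(uint32_t));
        cur_index += MAX_DISTANCE_REF_POINTS;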

> +        cur_index += 4;

Shouldn't this be  += MAX_DISTANCE_REF_POINTS?

> +    }
> +    ret = fdt_setprop(fdt, offset, "ibm,associativity-lookup-arrays", int_buf,
> +                      (cur_index - int_buf) * sizeof(uint32_t));
> +    g_free(int_buf);
> +
> +    return ret;
> +}
> +
>  /*
>   * Helper that writes ibm,associativity-reference-points and
>   * max-associativity-domains in the RTAS pointed by @rtas
> diff --git a/include/hw/ppc/spapr_numa.h b/include/hw/ppc/spapr_numa.h
> index f92fb4f28a..f6127501a6 100644
> --- a/include/hw/ppc/spapr_numa.h
> +++ b/include/hw/ppc/spapr_numa.h
> @@ -22,6 +22,8 @@ void spapr_numa_write_associativity_dt(SpaprMachineState *spapr, void *fdt,
>                                         int offset, int nodeid);
>  int spapr_numa_fixup_cpu_dt(SpaprMachineState *spapr, void *fdt,
>                              int offset, PowerPCCPU *cpu);
> +int spapr_numa_write_assoc_lookup_arrays(SpaprMachineState *spapr, void *fdt,
> +                                         int offset);
>  
>  
>  #endif /* HW_SPAPR_NUMA_H */
David Gibson Sept. 3, 2020, 1:35 a.m. UTC | #2
On Tue, Sep 01, 2020 at 09:56:38AM -0300, Daniel Henrique Barboza wrote:
> Hi,
> 
> Following the reviews of the first version [1], especially this
> reply from David [2], I decided to take a step back and refactor
> all the code in hw/ppc/spapr* that operates with ibm,associativity,
> ibm,associativity-reference-points and ibm,max-associativity-domains.
> 
> A new file named 'spapr_numa.c' was created to gather all the
> associativity related code into helpers that write NUMA/associativity
> related info to the FDT. These helpers are then used in other
> spapr_* files. This allows us to change NUMA related code in a
> single location, instead of searching every file to see where and
> how associativity is being written, and all the soon-to-be more
> complex logic can be contained in spapr_numa.c. I consider
> the end result to be better than what I ended up doing in v1.
> 
> Unlike v1, there is no NUMA distance change being done in this series.
> Later on, the hub of the new NUMA distance calculation will be
> spapr_numa_associativity_init(), where we'll take into consideration
> user input from numa_state, adjust sizes to what the PAPR kernel
> understands, and establish associativity domains between the NUMA
> nodes.

Patches 1..4 applied to ppc-for-5.2.  Patch 5 has some nits I've
commented on.

> 
> 
> Changes from v1:
> - all the patches that made guest-visible changes were removed. They
> will be re-submitted in a follow-up series after this one.
> - patch 02 from v1 will be reworked and reposted in the follow-up
> series as well.
> - version 1 link:
> https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg03169.html
> 
> 
> 
> These patches were rebased using David's ppc-for-5.2 tree. Github
> repo with the patches applied:
> 
> https://github.com/danielhb/qemu/tree/spapr_numa_v2
> 
> 
> [1] https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg03169.html
> [2] https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg04661.html
>  
> Daniel Henrique Barboza (7):
>   ppc: introducing spapr_numa.c NUMA code helper
>   ppc/spapr_nvdimm: turn spapr_dt_nvdimm() static
>   spapr: introduce SpaprMachineClass::numa_assoc_array
>   spapr, spapr_numa: handle vcpu ibm,associativity
>   spapr, spapr_numa: move lookup-arrays handling to spapr_numa.c
>   spapr_numa: move NVLink2 associativity handling to spapr_numa.c
>   spapr_hcall: h_home_node_associativity now reads numa_assoc_array
> 
>  hw/ppc/meson.build            |   3 +-
>  hw/ppc/spapr.c                |  91 +++---------------
>  hw/ppc/spapr_hcall.c          |  16 +++-
>  hw/ppc/spapr_numa.c           | 172 ++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr_nvdimm.c         |  37 ++++----
>  hw/ppc/spapr_pci.c            |   9 +-
>  hw/ppc/spapr_pci_nvlink2.c    |  19 +---
>  include/hw/ppc/spapr.h        |  13 ++-
>  include/hw/ppc/spapr_numa.h   |  32 +++++++
>  include/hw/ppc/spapr_nvdimm.h |   3 +-
>  10 files changed, 268 insertions(+), 127 deletions(-)
>  create mode 100644 hw/ppc/spapr_numa.c
>  create mode 100644 include/hw/ppc/spapr_numa.h
>
David Gibson Sept. 3, 2020, 1:49 a.m. UTC | #3
On Thu, Sep 03, 2020 at 11:35:39AM +1000, David Gibson wrote:
> On Tue, Sep 01, 2020 at 09:56:38AM -0300, Daniel Henrique Barboza wrote:
> > Hi,
> > 
> > Following the reviews of the first version [1], especially this
> > reply from David [2], I decided to take a step back and refactor
> > all the code in hw/ppc/spapr* that operates with ibm,associativity,
> > ibm,associativity-reference-points and ibm,max-associativity-domains.
> > 
> > A new file named 'spapr_numa.c' was created to gather all the
> > associativity related code into helpers that write NUMA/associativity
> > related info to the FDT. These helpers are then used in other
> > spapr_* files. This allows us to change NUMA related code in a
> > single location, instead of searching every file to see where and
> > how associativity is being written, and all the soon-to-be more
> > complex logic can be contained in spapr_numa.c. I consider
> > the end result to be better than what I ended up doing in v1.
> > 
> > Unlike v1, there is no NUMA distance change being done in this series.
> > Later on, the hub of the new NUMA distance calculation will be
> > spapr_numa_associativity_init(), where we'll take into consideration
> > user input from numa_state, adjust sizes to what the PAPR kernel
> > understands, and establish associativity domains between the NUMA
> > nodes.
> 
> Patches 1..4 applied to ppc-for-5.2.  Patch 5 has some nits I've
> commented on.

Ah, sorry, I realised I missed something.  Patches 1..2 are still
applied, but patch 3 has a nit large enough to call for a respin.
Daniel Henrique Barboza Sept. 3, 2020, 11:17 a.m. UTC | #4
On 9/2/20 10:46 PM, David Gibson wrote:
> On Tue, Sep 01, 2020 at 09:56:45AM -0300, Daniel Henrique Barboza wrote:
>> home_node_associativity reply now uses the associativity
>> values for tcpu->node_id provided by numa_assoc_array.
>>
>> This will avoid further changes in this code when numa_assoc_array
>> changes values, but it won't be enough to prevent further changes
>> if (note to self: mention here that changing the size of the array
>> means touching this function too, and that the ASSOCIATIVITY() macro
>> makes automating all of this very unreadable)
> 
> Uh.. I'm guessing that was a note to yourself you intended to
> edit out before publishing?

Oops! Forgot to edit it :|

> 
>>
>> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
>> ---
>>   hw/ppc/spapr_hcall.c | 16 ++++++++++++++--
>>   1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
>> index c1d01228c6..2ec30efdcb 100644
>> --- a/hw/ppc/spapr_hcall.c
>> +++ b/hw/ppc/spapr_hcall.c
>> @@ -1878,9 +1878,13 @@ static target_ulong h_home_node_associativity(PowerPCCPU *cpu,
>>                                                 target_ulong opcode,
>>                                                 target_ulong *args)
> 
> You could move this function to spapr_numa.c as well (the name's a bit
> misleading, but spapr_hcall.c isn't really intended to hold *all*
> hcall implementations, just the ones that don't have somewhere better
> to live).

Ok!


> 
>>   {
>> +    SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
>>       target_ulong flags = args[0];
>>       target_ulong procno = args[1];
>>       PowerPCCPU *tcpu;
>> +    uint32_t assoc_domain1;
>> +    uint32_t assoc_domain2;
>> +    uint32_t assoc_domain3;
>>       int idx;
>>   
>>       /* only support procno from H_REGISTER_VPA */
>> @@ -1893,13 +1897,21 @@ static target_ulong h_home_node_associativity(PowerPCCPU *cpu,
>>           return H_P2;
>>       }
>>   
>> +    /*
>> +     * Index 0 is the ibm,associativity size of the node,
>> +     * which isn't relevant here.
>> +     */
>> +    assoc_domain1 = smc->numa_assoc_array[tcpu->node_id][1];
>> +    assoc_domain2 = smc->numa_assoc_array[tcpu->node_id][2];
>> +    assoc_domain3 = smc->numa_assoc_array[tcpu->node_id][3];
> 
> Using all these temporaries is a little ugly.  Maybe do something like:
> 	uint32_t *assoc = smc->numa_assoc_array[tcpu->node_id];
> 
> Then just use assoc[1], assoc[2] etc. below.
> 
>> +
>>       /* sequence is the same as in the "ibm,associativity" property */
>>   
>>       idx = 0;
>>   #define ASSOCIATIVITY(a, b) (((uint64_t)(a) << 32) | \
>>                                ((uint64_t)(b) & 0xffffffff))
>> -    args[idx++] = ASSOCIATIVITY(0, 0);
>> -    args[idx++] = ASSOCIATIVITY(0, tcpu->node_id);
>> +    args[idx++] = ASSOCIATIVITY(assoc_domain1, assoc_domain2);
>> +    args[idx++] = ASSOCIATIVITY(assoc_domain3, tcpu->node_id);
>>       args[idx++] = ASSOCIATIVITY(procno, -1);
>>       for ( ; idx < 6; idx++) {
>>           args[idx] = -1;
> 
> Better yet would be to make this handle an arbitrary length of assoc
> array, further isolating this from the specifics of how we construct
> the arrays.  Ideally, you'd call a common path with
> spapr_numa_fixup_cpu_dt() to get the assoc array for a cpu, then here
> just translate it into the in-register format the hcall expects.
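> 
> Roughly along these lines, perhaps (sketch only; vcpu_assoc and
> vcpu_assoc_size are hypothetical names for whatever the common
> helper shared with spapr_numa_fixup_cpu_dt() ends up returning --
> the associativity entries without the leading size element, taken
> here to be in host byte order and to fit in the 6 return registers):
> 
>     uint64_t vals[12] = { 0 };   /* 6 returned args, 2 values each */
>     int n = 0, i;
> 
>     for (i = 0; i < vcpu_assoc_size; i++) {
>         vals[n++] = vcpu_assoc[i];
>     }
>     vals[n++] = procno;
>     while (n < 12) {
>         vals[n++] = -1;
>     }
> 
>     for (idx = 0; idx < 6; idx++) {
>         args[idx] = ASSOCIATIVITY(vals[2 * idx], vals[2 * idx + 1]);
>     }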


Since we're moving this to spapr_numa.c, I guess it's ok to add
more juggling to handle an arbitrary size. I got a bit nervous about
adding too much stuff here in spapr_hcall.c.


Thanks,


DHB

>
Daniel Henrique Barboza Sept. 3, 2020, 11:22 a.m. UTC | #5
On 9/2/20 10:34 PM, David Gibson wrote:
> On Tue, Sep 01, 2020 at 09:56:43AM -0300, Daniel Henrique Barboza wrote:
>> In a similar fashion as the previous patch, let's move the
>> handling of ibm,associativity-lookup-arrays from spapr.c to
>> spapr_numa.c. A spapr_numa_write_assoc_lookup_arrays() helper was
>> created, and spapr_dt_dynamic_reconfiguration_memory() can now
>> use it to advertise the lookup-arrays.
>>
>> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
>> ---
>>   hw/ppc/spapr.c              | 25 ++----------------------
>>   hw/ppc/spapr_numa.c         | 39 +++++++++++++++++++++++++++++++++++++
>>   include/hw/ppc/spapr_numa.h |  2 ++
>>   3 files changed, 43 insertions(+), 23 deletions(-)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 172f965fe0..65d2ccd578 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -535,13 +535,10 @@ static int spapr_dt_dynamic_reconfiguration_memory(SpaprMachineState *spapr,
>>                                                      void *fdt)
>>   {
>>       MachineState *machine = MACHINE(spapr);
>> -    int nb_numa_nodes = machine->numa_state->num_nodes;
>> -    int ret, i, offset;
>> +    int ret, offset;
>>       uint64_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE;
>>       uint32_t prop_lmb_size[] = {cpu_to_be32(lmb_size >> 32),
>>                                   cpu_to_be32(lmb_size & 0xffffffff)};
>> -    uint32_t *int_buf, *cur_index, buf_len;
>> -    int nr_nodes = nb_numa_nodes ? nb_numa_nodes : 1;
>>       MemoryDeviceInfoList *dimms = NULL;
>>   
>>       /*
>> @@ -582,25 +579,7 @@ static int spapr_dt_dynamic_reconfiguration_memory(SpaprMachineState *spapr,
>>           return ret;
>>       }
>>   
>> -    /* ibm,associativity-lookup-arrays */
>> -    buf_len = (nr_nodes * 4 + 2) * sizeof(uint32_t);
>> -    cur_index = int_buf = g_malloc0(buf_len);
>> -    int_buf[0] = cpu_to_be32(nr_nodes);
>> -    int_buf[1] = cpu_to_be32(4); /* Number of entries per associativity list */
>> -    cur_index += 2;
>> -    for (i = 0; i < nr_nodes; i++) {
>> -        uint32_t associativity[] = {
>> -            cpu_to_be32(0x0),
>> -            cpu_to_be32(0x0),
>> -            cpu_to_be32(0x0),
>> -            cpu_to_be32(i)
>> -        };
>> -        memcpy(cur_index, associativity, sizeof(associativity));
>> -        cur_index += 4;
>> -    }
>> -    ret = fdt_setprop(fdt, offset, "ibm,associativity-lookup-arrays", int_buf,
>> -            (cur_index - int_buf) * sizeof(uint32_t));
>> -    g_free(int_buf);
>> +    ret = spapr_numa_write_assoc_lookup_arrays(spapr, fdt, offset);
>>   
>>       return ret;
>>   }
>> diff --git a/hw/ppc/spapr_numa.c b/hw/ppc/spapr_numa.c
>> index b8882d209e..9eb4bdbe80 100644
>> --- a/hw/ppc/spapr_numa.c
>> +++ b/hw/ppc/spapr_numa.c
>> @@ -75,6 +75,45 @@ int spapr_numa_fixup_cpu_dt(SpaprMachineState *spapr, void *fdt,
>>                          vcpu_assoc, sizeof(vcpu_assoc));
>>   }
>>   
>> +
>> +int spapr_numa_write_assoc_lookup_arrays(SpaprMachineState *spapr, void *fdt,
>> +                                         int offset)
>> +{
>> +    MachineState *machine = MACHINE(spapr);
>> +    SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
>> +    int nb_numa_nodes = machine->numa_state->num_nodes;
>> +    int nr_nodes = nb_numa_nodes ? nb_numa_nodes : 1;
>> +    uint32_t *int_buf, *cur_index, buf_len;
>> +    int ret, i, j;
>> +
>> +    /* ibm,associativity-lookup-arrays */
>> +    buf_len = (nr_nodes * MAX_DISTANCE_REF_POINTS + 2) * sizeof(uint32_t);
>> +    cur_index = int_buf = g_malloc0(buf_len);
>> +    int_buf[0] = cpu_to_be32(nr_nodes);
>> +     /* Number of entries per associativity list */
>> +    int_buf[1] = cpu_to_be32(MAX_DISTANCE_REF_POINTS);
>> +    cur_index += 2;
>> +    for (i = 0; i < nr_nodes; i++) {
>> +        /*
>> +         * For the lookup-array we use the ibm,associativity array,
>> +         * from numa_assoc_array. without the first element (size).
>> +         */
>> +        uint32_t associativity[MAX_DISTANCE_REF_POINTS];
>> +
>> +        for (j = 0; j < MAX_DISTANCE_REF_POINTS; j++) {
>> +            associativity[j] = smc->numa_assoc_array[i][j + 1];
>> +        }
>> +
>> +        memcpy(cur_index, associativity, sizeof(associativity));
> 
> AFAICT, you could just use a single memcpy() to copy from the
> numa_assoc_array() into the property here, rather than using a loop
> and temporary array.

I remember that I was having some weird problems with memcpy() and
numa_assoc_array and this is how I got around it. I'll try to sort it
out again.

> 
>> +        cur_index += 4;
> 
> Shouldn't this be  += MAX_DISTANCE_REF_POINTS?


Yeah it should. Good catch.

> 
>> +    }
>> +    ret = fdt_setprop(fdt, offset, "ibm,associativity-lookup-arrays", int_buf,
>> +                      (cur_index - int_buf) * sizeof(uint32_t));
>> +    g_free(int_buf);
>> +
>> +    return ret;
>> +}
>> +
>>   /*
>>    * Helper that writes ibm,associativity-reference-points and
>>    * max-associativity-domains in the RTAS pointed by @rtas
>> diff --git a/include/hw/ppc/spapr_numa.h b/include/hw/ppc/spapr_numa.h
>> index f92fb4f28a..f6127501a6 100644
>> --- a/include/hw/ppc/spapr_numa.h
>> +++ b/include/hw/ppc/spapr_numa.h
>> @@ -22,6 +22,8 @@ void spapr_numa_write_associativity_dt(SpaprMachineState *spapr, void *fdt,
>>                                          int offset, int nodeid);
>>   int spapr_numa_fixup_cpu_dt(SpaprMachineState *spapr, void *fdt,
>>                               int offset, PowerPCCPU *cpu);
>> +int spapr_numa_write_assoc_lookup_arrays(SpaprMachineState *spapr, void *fdt,
>> +                                         int offset);
>>   
>>   
>>   #endif /* HW_SPAPR_NUMA_H */
>
Daniel Henrique Barboza Sept. 3, 2020, 11:28 a.m. UTC | #6
On 9/2/20 10:51 PM, David Gibson wrote:
> On Tue, Sep 01, 2020 at 09:56:41AM -0300, Daniel Henrique Barboza wrote:
>> The next step to centralize all NUMA/associativity handling in
>> the spapr machine is to create a 'one stop place' for all
>> things ibm,associativity.
>>
>> This patch introduces numa_assoc_array, a 2 dimensional array
>> that will store all ibm,associativity arrays of all NUMA nodes.
>> This array is initialized in a new spapr_numa_associativity_init()
>> function, called in spapr_machine_init(). It is being initialized
>> with the same values used in other ibm,associativity properties
>> around spapr files (i.e. all zeros, last value is node_id).
>> The idea is to remove all hardcoded definitions and FDT writes
>> of ibm,associativity arrays, instead calling the new
>> spapr_numa_write_associativity_dt() helper, which will
>> be able to write the DT with the correct values.
>>
>> We'll start small, handling the trivial cases first. The
>> remaining instances of ibm,associativity will be handled
>> next.
>>
>> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> 
> The idea is great, but there's one small but significant problem here:
> 
>> +void spapr_numa_associativity_init(MachineState *machine)
>> +{
>> +    SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(machine);
>> +    int nb_numa_nodes = machine->numa_state->num_nodes;
>> +    int i;
>> +
>> +    /*
>> +     * For all associativity arrays: first position is the size,
>> +     * position MAX_DISTANCE_REF_POINTS is always the numa_id,
>> +     * represented by the index 'i'.
>> +     *
>> +     * This will break on sparse NUMA setups, when/if QEMU starts
>> +     * to support it, because there will be no more guarantee that
>> +     * 'i' will be a valid node_id set by the user.
>> +     */
>> +    for (i = 0; i < nb_numa_nodes; i++) {
>> +        smc->numa_assoc_array[i][0] = cpu_to_be32(MAX_DISTANCE_REF_POINTS);
>> +        smc->numa_assoc_array[i][MAX_DISTANCE_REF_POINTS] = cpu_to_be32(i);
> 
> This initialization is called on a machine *instance*, which means it
> should treat the machine class as read-only.  i.e. the
> numa_assoc_array should be in the SpaprMachineState, rather than the
> class.
> 
> I mean, we'd get away with it in practice, since there's only ever
> likely to be a single machine instance, but still we should correct
> this.

Got it. I'll move it to SpaprMachineState. This will also spare a handful of lines
everywhere else, since I was fetching the class just to manipulate the matrix
(and now, in hindsight, I see that was a warning about the weirdness
of what I was doing).
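
Something like this in include/hw/ppc/spapr.h is what I have in mind
(sketch only; NUMA_ASSOC_SIZE is just shorthand I'm using here for
MAX_DISTANCE_REF_POINTS + 1, i.e. the size element plus the
associativity domains, and MAX_NODES is the generic NUMA node limit):

    #define NUMA_ASSOC_SIZE (MAX_DISTANCE_REF_POINTS + 1)

    struct SpaprMachineState {
        /* ... */
        uint32_t numa_assoc_array[MAX_NODES][NUMA_ASSOC_SIZE];
    };

spapr_numa_associativity_init() then writes into
spapr->numa_assoc_array instead of going through the class.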


Thanks,


DHB

>