mbox series

[v3,00/29] hw, target: Prefer fast cpu_env() over slower CPU QOM cast macro

Message ID 20240129164514.73104-1-philmd@linaro.org
Headers show
Series hw, target: Prefer fast cpu_env() over slower CPU QOM cast macro | expand

Message

Philippe Mathieu-Daudé Jan. 29, 2024, 4:44 p.m. UTC
Patches missing review: 1, 2, 5, 6, 8, 11, 14, 15, 29

It will be simpler if I get the whole series via my hw-cpus
tree once fully reviewed.

Since v2:
- Rebased
- bsd/linux-user
- Preliminary clean cpu_reset_hold
- Add R-b

Since v1:
- Avoid CPU() cast (Paolo)
- Split per targets (Thomas)

Use cpu_env() -- which is fast path -- when possible.
Bulk conversion using Coccinelle spatch (script included).

Philippe Mathieu-Daudé (29):
  bulk: Access existing variables initialized to &S->F when available
  hw/core: Declare CPUArchId::cpu as CPUState instead of Object
  hw/acpi/cpu: Use CPUState typedef
  bulk: Call in place single use cpu_env()
  scripts/coccinelle: Add cpu_env.cocci script
  target: Replace CPU_GET_CLASS(cpu -> obj) in cpu_reset_hold() handler
  target/alpha: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/arm: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/avr: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/cris: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/hexagon: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/hppa: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/i386/hvf: Use CPUState typedef
  target/i386: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/loongarch: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/m68k: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/microblaze: Prefer fast cpu_env() over slower CPU QOM cast
    macro
  target/mips: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/nios2: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/openrisc: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/ppc: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/riscv: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/rx: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/s390x: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/sh4: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/sparc: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/tricore: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/xtensa: Prefer fast cpu_env() over slower CPU QOM cast macro
  user: Prefer fast cpu_env() over slower CPU QOM cast macro

 MAINTAINERS                             |   1 +
 scripts/coccinelle/cpu_env.cocci        | 100 ++++++++++++++++++++++++
 include/hw/acpi/cpu.h                   |   2 +-
 include/hw/boards.h                     |   2 +-
 target/i386/hvf/vmx.h                   |  13 +--
 target/i386/hvf/x86.h                   |  26 +++---
 target/i386/hvf/x86_descr.h             |  14 ++--
 target/i386/hvf/x86_emu.h               |   4 +-
 target/i386/hvf/x86_mmu.h               |   6 +-
 accel/tcg/cpu-exec.c                    |   3 +-
 bsd-user/signal.c                       |   3 +-
 hw/core/machine.c                       |   4 +-
 hw/display/ati.c                        |   2 +-
 hw/i386/fw_cfg.c                        |   3 +-
 hw/i386/vmmouse.c                       |   6 +-
 hw/i386/x86.c                           |   2 +-
 hw/i386/xen/xen-hvm.c                   |   3 +-
 hw/intc/arm_gicv3_cpuif.c               |   7 +-
 hw/intc/arm_gicv3_cpuif_common.c        |   5 +-
 hw/loongarch/virt.c                     |   2 +-
 hw/misc/macio/pmu.c                     |   2 +-
 hw/misc/pvpanic-pci.c                   |   2 +-
 hw/pci-bridge/cxl_root_port.c           |   2 +-
 hw/ppc/mpc8544_guts.c                   |   3 +-
 hw/ppc/pnv.c                            |  23 +++---
 hw/ppc/pnv_xscom.c                      |   5 +-
 hw/ppc/ppce500_spin.c                   |   3 +-
 hw/ppc/spapr.c                          |   8 +-
 hw/ppc/spapr_caps.c                     |   7 +-
 hw/s390x/s390-virtio-ccw.c              |   2 +-
 hw/virtio/vhost-user-gpio.c             |   8 +-
 hw/virtio/vhost-user-scmi.c             |   6 +-
 hw/virtio/virtio-pci.c                  |   2 +-
 hw/xen/xen_pt.c                         |   6 +-
 linux-user/i386/cpu_loop.c              |   4 +-
 linux-user/signal.c                     |   6 +-
 migration/multifd-zlib.c                |   2 +-
 target/alpha/cpu.c                      |  31 ++------
 target/alpha/gdbstub.c                  |   6 +-
 target/alpha/helper.c                   |  15 ++--
 target/alpha/mem_helper.c               |  11 +--
 target/alpha/translate.c                |   4 +-
 target/arm/cpu.c                        |  37 ++++-----
 target/arm/debug_helper.c               |   8 +-
 target/arm/gdbstub.c                    |   6 +-
 target/arm/gdbstub64.c                  |   6 +-
 target/arm/helper.c                     |   9 +--
 target/arm/hvf/hvf.c                    |  12 +--
 target/arm/kvm.c                        |   5 +-
 target/arm/machine.c                    |   6 +-
 target/arm/ptw.c                        |   3 +-
 target/arm/tcg/cpu32.c                  |   4 +-
 target/arm/tcg/translate.c              |   3 +-
 target/avr/cpu.c                        |  29 ++-----
 target/avr/gdbstub.c                    |   6 +-
 target/avr/helper.c                     |  10 +--
 target/avr/translate.c                  |   3 +-
 target/cris/cpu.c                       |  12 +--
 target/cris/gdbstub.c                   |   9 +--
 target/cris/helper.c                    |  12 +--
 target/cris/translate.c                 |   6 +-
 target/hexagon/cpu.c                    |  27 ++-----
 target/hexagon/gdbstub.c                |   6 +-
 target/hppa/cpu.c                       |   8 +-
 target/hppa/int_helper.c                |   8 +-
 target/hppa/mem_helper.c                |   6 +-
 target/hppa/translate.c                 |   3 +-
 target/i386/arch_dump.c                 |  11 +--
 target/i386/arch_memory_mapping.c       |   3 +-
 target/i386/cpu-dump.c                  |   3 +-
 target/i386/cpu.c                       |  51 +++++-------
 target/i386/helper.c                    |  42 +++-------
 target/i386/hvf/hvf.c                   |   8 +-
 target/i386/hvf/x86.c                   |  30 ++++---
 target/i386/hvf/x86_descr.c             |   8 +-
 target/i386/hvf/x86_emu.c               |   6 +-
 target/i386/hvf/x86_mmu.c               |  14 ++--
 target/i386/hvf/x86_task.c              |  10 +--
 target/i386/hvf/x86hvf.c                |  11 +--
 target/i386/kvm/kvm.c                   |   6 +-
 target/i386/kvm/xen-emu.c               |  32 +++-----
 target/i386/nvmm/nvmm-all.c             |   6 +-
 target/i386/tcg/sysemu/bpt_helper.c     |   3 +-
 target/i386/tcg/sysemu/excp_helper.c    |   3 +-
 target/i386/tcg/tcg-cpu.c               |  14 +---
 target/i386/tcg/user/excp_helper.c      |   6 +-
 target/i386/tcg/user/seg_helper.c       |   3 +-
 target/i386/whpx/whpx-all.c             |  18 ++---
 target/loongarch/cpu.c                  |  41 +++-------
 target/loongarch/gdbstub.c              |   6 +-
 target/loongarch/kvm/kvm.c              |  41 +++-------
 target/loongarch/tcg/tlb_helper.c       |   6 +-
 target/loongarch/tcg/translate.c        |   3 +-
 target/m68k/cpu.c                       |  37 +++------
 target/m68k/gdbstub.c                   |   6 +-
 target/m68k/helper.c                    |   8 +-
 target/m68k/m68k-semi.c                 |   6 +-
 target/m68k/op_helper.c                 |  11 +--
 target/m68k/translate.c                 |   3 +-
 target/microblaze/cpu.c                 |   6 +-
 target/microblaze/gdbstub.c             |   3 +-
 target/microblaze/helper.c              |   3 +-
 target/microblaze/translate.c           |   6 +-
 target/mips/cpu.c                       |  17 ++--
 target/mips/gdbstub.c                   |   6 +-
 target/mips/kvm.c                       |  27 +++----
 target/mips/sysemu/physaddr.c           |   3 +-
 target/mips/tcg/exception.c             |   3 +-
 target/mips/tcg/op_helper.c             |   8 +-
 target/mips/tcg/sysemu/special_helper.c |   3 +-
 target/mips/tcg/sysemu/tlb_helper.c     |   6 +-
 target/mips/tcg/translate.c             |   3 +-
 target/nios2/cpu.c                      |  17 +---
 target/nios2/helper.c                   |   3 +-
 target/nios2/nios2-semi.c               |   6 +-
 target/nios2/translate.c                |   3 +-
 target/openrisc/cpu.c                   |   8 +-
 target/openrisc/gdbstub.c               |   6 +-
 target/openrisc/interrupt.c             |   6 +-
 target/openrisc/translate.c             |   6 +-
 target/ppc/cpu_init.c                   |  23 +++---
 target/ppc/excp_helper.c                |   3 +-
 target/ppc/gdbstub.c                    |  12 +--
 target/ppc/kvm.c                        |  20 ++---
 target/ppc/ppc-qmp-cmds.c               |   3 +-
 target/ppc/user_only_helper.c           |   3 +-
 target/riscv/arch_dump.c                |   6 +-
 target/riscv/cpu.c                      |  19 ++---
 target/riscv/cpu_helper.c               |  19 ++---
 target/riscv/debug.c                    |   9 +--
 target/riscv/gdbstub.c                  |   6 +-
 target/riscv/kvm/kvm-cpu.c              |  11 +--
 target/riscv/tcg/tcg-cpu.c              |  10 +--
 target/riscv/translate.c                |   6 +-
 target/rx/cpu.c                         |   6 +-
 target/rx/gdbstub.c                     |   6 +-
 target/rx/helper.c                      |   6 +-
 target/rx/translate.c                   |   6 +-
 target/s390x/cpu-dump.c                 |   3 +-
 target/s390x/gdbstub.c                  |   6 +-
 target/s390x/helper.c                   |   3 +-
 target/s390x/kvm/kvm.c                  |   6 +-
 target/s390x/tcg/excp_helper.c          |  11 +--
 target/s390x/tcg/misc_helper.c          |   4 +-
 target/s390x/tcg/translate.c            |   3 +-
 target/sh4/cpu.c                        |  22 ++----
 target/sh4/gdbstub.c                    |   6 +-
 target/sh4/helper.c                     |  14 +---
 target/sh4/op_helper.c                  |   4 +-
 target/sh4/translate.c                  |   6 +-
 target/sparc/cpu.c                      |  21 ++---
 target/sparc/gdbstub.c                  |   3 +-
 target/sparc/int32_helper.c             |   3 +-
 target/sparc/int64_helper.c             |   3 +-
 target/sparc/ldst_helper.c              |   6 +-
 target/sparc/mmu_helper.c               |  15 ++--
 target/sparc/translate.c                |   9 +--
 target/tricore/cpu.c                    |  28 ++-----
 target/tricore/gdbstub.c                |   6 +-
 target/tricore/helper.c                 |   3 +-
 target/tricore/translate.c              |   3 +-
 target/xtensa/cpu.c                     |   9 +--
 target/xtensa/dbg_helper.c              |   3 +-
 target/xtensa/exc_helper.c              |   3 +-
 target/xtensa/gdbstub.c                 |   6 +-
 target/xtensa/helper.c                  |   9 +--
 target/xtensa/translate.c               |   6 +-
 167 files changed, 615 insertions(+), 998 deletions(-)
 create mode 100644 scripts/coccinelle/cpu_env.cocci

Comments

Philippe Mathieu-Daudé Jan. 29, 2024, 4:50 p.m. UTC | #1
On 29/1/24 17:44, Philippe Mathieu-Daudé wrote:
> Mechanical patch produced running the command documented
> in scripts/coccinelle/cpu_env.cocci_template header.
> 
> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> ---
>   target/i386/hvf/vmx.h                | 13 ++-------
>   hw/i386/fw_cfg.c                     |  3 +-
>   hw/i386/vmmouse.c                    |  6 ++--
>   hw/i386/xen/xen-hvm.c                |  3 +-
>   target/i386/arch_dump.c              | 11 ++------
>   target/i386/arch_memory_mapping.c    |  3 +-
>   target/i386/cpu-dump.c               |  3 +-
>   target/i386/cpu.c                    | 37 ++++++++----------------
>   target/i386/helper.c                 | 42 ++++++++--------------------
>   target/i386/hvf/hvf.c                |  8 ++----
>   target/i386/hvf/x86.c                |  4 +--
>   target/i386/hvf/x86_emu.c            |  6 ++--
>   target/i386/hvf/x86_task.c           | 10 ++-----
>   target/i386/hvf/x86hvf.c             |  9 ++----
>   target/i386/kvm/kvm.c                |  6 ++--
>   target/i386/kvm/xen-emu.c            | 32 +++++++--------------
>   target/i386/tcg/sysemu/bpt_helper.c  |  3 +-
>   target/i386/tcg/sysemu/excp_helper.c |  3 +-
>   target/i386/tcg/tcg-cpu.c            | 14 +++-------
>   target/i386/tcg/user/excp_helper.c   |  6 ++--
>   target/i386/tcg/user/seg_helper.c    |  3 +-
>   21 files changed, 67 insertions(+), 158 deletions(-)

Actually this one had:

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>

But since I addressed Zhao's suggestion in patch 1
("bulk: Access existing variables initialized to &S->F when available")
which added more changes to this patch, I dropped the tags.
BALATON Zoltan Jan. 29, 2024, 7:46 p.m. UTC | #2
On Mon, 29 Jan 2024, Philippe Mathieu-Daudé wrote:
> When a variable is initialized to &struct->field, use it
> in place. Rationale: while this makes the code more concise,
> this also helps static analyzers.
>
> Mechanical change using the following Coccinelle spatch script:
>
> @@
> type S, F;
> identifier s, m, v;
> @@
>      S *s;
>      ...
>      F *v = &s->m;
>      <+...
> -    &s->m
> +    v
>      ...+>
>
> Inspired-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> ---
> hw/display/ati.c              |  2 +-
> hw/misc/macio/pmu.c           |  2 +-
> hw/misc/pvpanic-pci.c         |  2 +-
> hw/pci-bridge/cxl_root_port.c |  2 +-
> hw/ppc/pnv.c                  | 20 ++++++++++----------
> hw/virtio/vhost-user-gpio.c   |  8 ++++----
> hw/virtio/vhost-user-scmi.c   |  6 +++---
> hw/virtio/virtio-pci.c        |  2 +-
> hw/xen/xen_pt.c               |  6 +++---
> migration/multifd-zlib.c      |  2 +-
> target/arm/cpu.c              |  4 ++--
> target/arm/kvm.c              |  2 +-
> target/arm/machine.c          |  6 +++---
> target/i386/hvf/x86hvf.c      |  2 +-
> target/m68k/helper.c          |  2 +-
> target/ppc/kvm.c              |  8 ++++----
> target/riscv/cpu_helper.c     |  2 +-
> 17 files changed, 39 insertions(+), 39 deletions(-)
>
> diff --git a/hw/display/ati.c b/hw/display/ati.c
> index 569b8f6165..8d2501bd82 100644
> --- a/hw/display/ati.c
> +++ b/hw/display/ati.c
> @@ -991,7 +991,7 @@ static void ati_vga_realize(PCIDevice *dev, Error **errp)
>     }
>     vga_init(vga, OBJECT(s), pci_address_space(dev),
>              pci_address_space_io(dev), true);
> -    vga->con = graphic_console_init(DEVICE(s), 0, s->vga.hw_ops, &s->vga);
> +    vga->con = graphic_console_init(DEVICE(s), 0, s->vga.hw_ops, vga);
>     if (s->cursor_guest_mode) {
>         vga->cursor_invalidate = ati_cursor_invalidate;
>         vga->cursor_draw_line = ati_cursor_draw_line;
> diff --git a/hw/misc/macio/pmu.c b/hw/misc/macio/pmu.c
> index e9a90da88f..7fe1c4e517 100644
> --- a/hw/misc/macio/pmu.c
> +++ b/hw/misc/macio/pmu.c
> @@ -737,7 +737,7 @@ static void pmu_realize(DeviceState *dev, Error **errp)
>     timer_mod(s->one_sec_timer, s->one_sec_target);
>
>     if (s->has_adb) {
> -        qbus_init(&s->adb_bus, sizeof(s->adb_bus), TYPE_ADB_BUS,
> +        qbus_init(adb_bus, sizeof(s->adb_bus), TYPE_ADB_BUS,

Probably should change to use sizeof(*adb_bus) too. Although the next line 
is the only place the adb_bus local is used so maybe can drop the local 
and chnage next line to use &s->adb_bus instead.

Regards,
BALATON Zoltan

>                   dev, "adb.0");
>         adb_register_autopoll_callback(adb_bus, pmu_adb_poll, s);
>     }
Richard Henderson Jan. 30, 2024, 7:56 a.m. UTC | #3
On 1/30/24 02:44, Philippe Mathieu-Daudé wrote:
> When a variable is initialized to &struct->field, use it
> in place. Rationale: while this makes the code more concise,
> this also helps static analyzers.
> 
> Mechanical change using the following Coccinelle spatch script:
> 
>   @@
>   type S, F;
>   identifier s, m, v;
>   @@
>        S *s;
>        ...
>        F *v = &s->m;
>        <+...
>   -    &s->m
>   +    v
>        ...+>
> 
> Inspired-by: Zhao Liu<zhao1.liu@intel.com>
> Signed-off-by: Philippe Mathieu-Daudé<philmd@linaro.org>
> ---

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~
Richard Henderson Jan. 30, 2024, 10:10 a.m. UTC | #4
On 1/30/24 02:44, Philippe Mathieu-Daudé wrote:
> Mechanical patch produced running the command documented
> in scripts/coccinelle/cpu_env.cocci_template header.
> 
> Signed-off-by: Philippe Mathieu-Daudé<philmd@linaro.org>
> ---
>   target/i386/hvf/vmx.h                | 13 ++-------
>   hw/i386/fw_cfg.c                     |  3 +-
>   hw/i386/vmmouse.c                    |  6 ++--
>   hw/i386/xen/xen-hvm.c                |  3 +-
>   target/i386/arch_dump.c              | 11 ++------
>   target/i386/arch_memory_mapping.c    |  3 +-
>   target/i386/cpu-dump.c               |  3 +-
>   target/i386/cpu.c                    | 37 ++++++++----------------
>   target/i386/helper.c                 | 42 ++++++++--------------------
>   target/i386/hvf/hvf.c                |  8 ++----
>   target/i386/hvf/x86.c                |  4 +--
>   target/i386/hvf/x86_emu.c            |  6 ++--
>   target/i386/hvf/x86_task.c           | 10 ++-----
>   target/i386/hvf/x86hvf.c             |  9 ++----
>   target/i386/kvm/kvm.c                |  6 ++--
>   target/i386/kvm/xen-emu.c            | 32 +++++++--------------
>   target/i386/tcg/sysemu/bpt_helper.c  |  3 +-
>   target/i386/tcg/sysemu/excp_helper.c |  3 +-
>   target/i386/tcg/tcg-cpu.c            | 14 +++-------
>   target/i386/tcg/user/excp_helper.c   |  6 ++--
>   target/i386/tcg/user/seg_helper.c    |  3 +-
>   21 files changed, 67 insertions(+), 158 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~
Anthony PERARD March 8, 2024, 5:35 p.m. UTC | #5
On Mon, Jan 29, 2024 at 05:44:43PM +0100, Philippe Mathieu-Daudé wrote:
> When a variable is initialized to &struct->field, use it
> in place. Rationale: while this makes the code more concise,
> this also helps static analyzers.
> 
> Mechanical change using the following Coccinelle spatch script:
> 
>  @@
>  type S, F;
>  identifier s, m, v;
>  @@
>       S *s;
>       ...
>       F *v = &s->m;
>       <+...
>  -    &s->m
>  +    v
>       ...+>
> 
> Inspired-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> ---
> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index 36e6f93c37..10ddf6bc91 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -710,7 +710,7 @@ static void xen_pt_destroy(PCIDevice *d) {
>      uint8_t intx;
>      int rc;
>  
> -    if (machine_irq && !xen_host_pci_device_closed(&s->real_device)) {
> +    if (machine_irq && !xen_host_pci_device_closed(host_dev)) {
>          intx = xen_pt_pci_intx(s);
>          rc = xc_domain_unbind_pt_irq(xen_xc, xen_domid, machine_irq,
>                                       PT_IRQ_TYPE_PCI,
> @@ -759,8 +759,8 @@ static void xen_pt_destroy(PCIDevice *d) {
>          memory_listener_unregister(&s->io_listener);
>          s->listener_set = false;
>      }
> -    if (!xen_host_pci_device_closed(&s->real_device)) {
> -        xen_host_pci_device_put(&s->real_device);
> +    if (!xen_host_pci_device_closed(host_dev)) {
> +        xen_host_pci_device_put(host_dev);

For the Xen part:
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>

Thanks,
Thomas Huth March 12, 2024, 10:58 a.m. UTC | #6
On 29/01/2024 17.44, Philippe Mathieu-Daudé wrote:
> Patches missing review: 1, 2, 5, 6, 8, 11, 14, 15, 29
> 
> It will be simpler if I get the whole series via my hw-cpus
> tree once fully reviewed.
> 
> Since v2:
> - Rebased
> - bsd/linux-user
> - Preliminary clean cpu_reset_hold
> - Add R-b
> 
> Since v1:
> - Avoid CPU() cast (Paolo)
> - Split per targets (Thomas)
> 
> Use cpu_env() -- which is fast path -- when possible.
> Bulk conversion using Coccinelle spatch (script included).
> 
> Philippe Mathieu-Daudé (29):
>    bulk: Access existing variables initialized to &S->F when available
>    hw/core: Declare CPUArchId::cpu as CPUState instead of Object
>    hw/acpi/cpu: Use CPUState typedef
>    bulk: Call in place single use cpu_env()
>    scripts/coccinelle: Add cpu_env.cocci script
>    target: Replace CPU_GET_CLASS(cpu -> obj) in cpu_reset_hold() handler
>    target/alpha: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/arm: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/avr: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/cris: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/hexagon: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/hppa: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/i386/hvf: Use CPUState typedef
>    target/i386: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/loongarch: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/m68k: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/microblaze: Prefer fast cpu_env() over slower CPU QOM cast
>      macro
>    target/mips: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/nios2: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/openrisc: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/ppc: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/riscv: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/rx: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/s390x: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/sh4: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/sparc: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/tricore: Prefer fast cpu_env() over slower CPU QOM cast macro
>    target/xtensa: Prefer fast cpu_env() over slower CPU QOM cast macro
>    user: Prefer fast cpu_env() over slower CPU QOM cast macro

FYI, I'll try to queue those for my PR today except for:

  scripts/coccinelle: Add cpu_env.cocci script
  --> Still needs review and you mentioned a pending change

  target/arm: Prefer fast cpu_env() over slower CPU QOM cast macro
  --> Needs a rebase and review

  target/hppa: Prefer fast cpu_env() over slower CPU QOM cast macro
  --> Needs a rebase

  target/i386: Prefer fast cpu_env() over slower CPU QOM cast macro
  --> There were unaddressed review comments from Igor

  target/riscv: Prefer fast cpu_env() over slower CPU QOM cast macro
  --> Needs a rebase

  Thomas
Thomas Huth March 12, 2024, 11:24 a.m. UTC | #7
On 30/01/2024 14.01, Igor Mammedov wrote:
> On Mon, 29 Jan 2024 17:44:56 +0100
> Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
> 
>> Mechanical patch produced running the command documented
>> in scripts/coccinelle/cpu_env.cocci_template header.
> 
> 
> commenting here since, I'm not expert on coccinelle scripts.
> 
> On negative side we are permanently loosing type checking in this area.

Not really that much. Have a look at cpu_env(), it has a comment saying:

  "We validate that CPUArchState follows CPUState in cpu-all.h"

So instead of run-time checking, the check should have already been done 
during compile time, i.e. when you have a valid CPUState pointer, it should 
be possible to derive a valid CPUArchState pointer from it without much 
further checking during runtime.

> Is it worth it, what gains do we get with this series?

It's a small optimization, but why not?

> Side note,
> QOM cast expenses you are replacing could be negated by disabling
> CONFIG_QOM_CAST_DEBUG without killing type check code when it's enabled.
> That way you will speed up not only cpuenv access but also all other casts
> across the board.

Yes, but that checking is enabled by default and does not have such 
compile-time checks that could be used instead, so I think Philippe's series 
here is still a good idea.

>> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>> ---
> ...
>>   static inline void vmx_clear_nmi_blocking(CPUState *cpu)
>>   {
>> -    X86CPU *x86_cpu = X86_CPU(cpu);
>> -    CPUX86State *env = &x86_cpu->env;
>> -
>> -    env->hflags2 &= ~HF2_NMI_MASK;
> 
>> +    cpu_env(cpu)->hflags2 &= ~HF2_NMI_MASK;
> 
> this style of de-referencing return value of macro/function
> was discouraged in past and preferred way was 'Foo f = CAST(me); f->some_access
> 
> (it's just imprint speaking, I don't recall where it comes from)

I agree, though the new code is perfectly valid, it looks nicer if we'd use 
a variable here instead.

  Thomas