[00/15] accel/tcg: Fix for #360 and other i/o alignment issues

Message ID 20210619172626.875885-1-richard.henderson@linaro.org

Message

Richard Henderson June 19, 2021, 5:26 p.m. UTC
Short story is that the first two patches resolve the observed
problem, by completely bypassing quite a lot of code in memory.c.

Longer story is that we should either use that code in memory.c,
or we should bypass it to an even lower level, so that we don't
have multiple locations doing the partial-read assembly thing.

Patch 13 exposes a number of obvious device bugs via make check.
I'm sure there are more in devices that are less well tested.

Patch 15 has an obvious drawback: it breaks the original #360.
But it starts the conversation as to whether the check in memory.c
is in fact broken.
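For anyone skimming the thread, the "partial-read assembly thing" above refers to building an unaligned N-byte load out of single-byte accesses. A minimal standalone sketch of the pattern (not QEMU's actual code; load_byte() here is a stand-in for the per-byte slow-path access that cputlb would issue):

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-in for the per-byte slow-path access; the real code would go
 * through the MMU/io dispatch path rather than plain memory. */
static uint8_t load_byte(const uint8_t *mem, size_t addr)
{
    return mem[addr];
}

/* Assemble an unaligned little-endian load of 'size' bytes from
 * individual byte reads, as done when an access crosses a page
 * boundary or lands on an i/o region. */
uint64_t load_le_bytes(const uint8_t *mem, size_t addr, unsigned size)
{
    uint64_t val = 0;
    for (unsigned i = 0; i < size; i++) {
        /* byte i lands at bit position 8*i: little-endian assembly */
        val |= (uint64_t)load_byte(mem, addr + i) << (8 * i);
    }
    return val;
}
```

The point of the series is that this assembly currently happens in more than one place (cputlb and memory.c), and it should live in exactly one.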


r~


Mark Cave-Ayland (2):
  NOTFORMERGE q800: test case for do_unaligned_access issue
  accel/tcg: Use byte ops for unaligned loads

Philippe Mathieu-Daudé (1):
  accel/tcg: Extract load_helper_unaligned from load_helper

Richard Henderson (12):
  accel/tcg: Don't test for watchpoints for code read
  accel/tcg: Handle page span access before i/o access
  softmmu/memory: Inline memory_region_dispatch_read1
  softmmu/memory: Simplify access_with_adjusted_size interface
  hw/net/e1000e: Fix size of io operations
  hw/net/e1000e: Fix impl.min_access_size
  hw/pci-host/q35: Improve blackhole_ops
  hw/scsi/megasas: Fix megasas_mmio_ops sizes
  hw/scsi/megasas: Improve megasas_queue_ops min_access_size
  softmmu/memory: Disallow short writes
  softmmu/memory: Support some unaligned access
  RFC accel/tcg: Defer some unaligned accesses to memory subsystem

 accel/tcg/cputlb.c | 147 +++++++++++++----------------
 hw/m68k/q800.c     | 131 ++------------------------
 hw/net/e1000e.c    |   8 +-
 hw/pci-host/q35.c  |   9 +-
 hw/scsi/megasas.c  |   6 +-
 softmmu/memory.c   | 226 +++++++++++++++++++++++++++++++++------------
 6 files changed, 251 insertions(+), 276 deletions(-)

-- 
2.25.1

Comments

Mark Cave-Ayland June 20, 2021, 1:08 p.m. UTC | #1
On 19/06/2021 18:26, Richard Henderson wrote:

> Short story is that the first two patches resolve the observed
> problem, by completely bypassing quite a lot of code in memory.c.
>
> Longer story is that we should either use that code in memory.c,
> or we should bypass it to an even lower level, so that we don't
> have multiple locations doing the partial-read assembly thing.
>
> Patch 13 exposes a number of obvious device bugs via make check.
> I'm sure there are more in devices that are less well tested.
>
> Patch 15 has an obvious drawback: it breaks the original #360.
> But it starts the conversation as to whether the check in memory.c
> is in fact broken.
>
> r~
>
> Mark Cave-Ayland (2):
>   NOTFORMERGE q800: test case for do_unaligned_access issue
>   accel/tcg: Use byte ops for unaligned loads
>
> Philippe Mathieu-Daudé (1):
>   accel/tcg: Extract load_helper_unaligned from load_helper
>
> Richard Henderson (12):
>   accel/tcg: Don't test for watchpoints for code read
>   accel/tcg: Handle page span access before i/o access
>   softmmu/memory: Inline memory_region_dispatch_read1
>   softmmu/memory: Simplify access_with_adjusted_size interface
>   hw/net/e1000e: Fix size of io operations
>   hw/net/e1000e: Fix impl.min_access_size
>   hw/pci-host/q35: Improve blackhole_ops
>   hw/scsi/megasas: Fix megasas_mmio_ops sizes
>   hw/scsi/megasas: Improve megasas_queue_ops min_access_size
>   softmmu/memory: Disallow short writes
>   softmmu/memory: Support some unaligned access
>   RFC accel/tcg: Defer some unaligned accesses to memory subsystem
>
>  accel/tcg/cputlb.c | 147 +++++++++++++----------------
>  hw/m68k/q800.c     | 131 ++------------------------
>  hw/net/e1000e.c    |   8 +-
>  hw/pci-host/q35.c  |   9 +-
>  hw/scsi/megasas.c  |   6 +-
>  softmmu/memory.c   | 226 +++++++++++++++++++++++++++++++++------------
>  6 files changed, 251 insertions(+), 276 deletions(-)


Hi Richard,

I've had a look over this patchset, and based upon my work leading up to #360 this 
does feel like an improvement: in particular, separating out page-spanning accesses 
and removing the duplicate cputlb code for splitting/combining unaligned accesses 
seem like wins to me.

As mentioned in the report itself, cputlb has effectively been bypassing 
.min_access_size, but I'm not sure about the consequences of this: does it mean that 
the access size check should also be bypassed for the head/tail of unaligned 
accesses? I'm also not completely sure how this behaviour can change based upon the 
target CPU.

The comment in patch 15 re: mr->ops->valid.unaligned is interesting: if the memory 
subsystem can split the accesses beforehand so that they are accepted by the device, 
does this check then become obsolete?
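To make the question concrete, this is roughly the splitting I have in mind. It's a simplified, hypothetical model (device_readN() and split_read() are made-up names, not QEMU's access_with_adjusted_size()): the memory subsystem issues the access piecewise at a size the device declares it supports, then combines the chunks little-endian, so the device never sees an access it would have to reject:

```c
#include <stdint.h>

/* Hypothetical device callback that honours accesses of exactly 'n'
 * bytes, reading registers little-endian. */
static uint64_t device_readN(const uint8_t *regs, uint64_t addr, unsigned n)
{
    uint64_t v = 0;
    for (unsigned i = 0; i < n; i++) {
        v |= (uint64_t)regs[addr + i] << (8 * i);
    }
    return v;
}

/* Split a 'size'-byte read into chunks of 'access_size' (the size the
 * device accepts) and combine them little-endian, as the memory
 * subsystem could do before the access reaches the device. */
uint64_t split_read(const uint8_t *regs, uint64_t addr, unsigned size,
                    unsigned access_size)
{
    uint64_t val = 0;
    for (unsigned i = 0; i < size; i += access_size) {
        /* each chunk lands at its byte offset in the result */
        val |= device_readN(regs, addr + i, access_size) << (8 * i);
    }
    return val;
}
```

If the subsystem always reduced accesses like this before dispatch, then as far as I can see the device-side valid.unaligned rejection would indeed have nothing left to reject.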

My general feeling is that there are still some discussions to be had around defining 
how unaligned accesses should work, and including a qtest to cover the various 
unaligned/page-span cases would help here. Other than that I do feel the patchset is 
worthwhile, and there may be a situation similar to the 
memory_region_access_valid() changes in 5d971f9e67 whereby we accept that the change 
in behaviour will have some fallout but allow plenty of time for regressions to be fixed.


ATB,

Mark.
Peter Maydell June 20, 2021, 2:33 p.m. UTC | #2
On Sat, 19 Jun 2021 at 18:28, Richard Henderson
<richard.henderson@linaro.org> wrote:
>

> Short story is that the first two patches resolve the observed
> problem, by completely bypassing quite a lot of code in memory.c.
>
> Longer story is that we should either use that code in memory.c,
> or we should bypass it to an even lower level, so that we don't
> have multiple locations doing the partial-read assembly thing.


I haven't read the patchset yet, but my initial reaction is that
we ought to be handling this stuff in memory.c, because that
code is shared across all accelerators -- we would want the
same behaviour for accesses to a device that doesn't itself
handle a misalignment or a small access, whether we are using
TCG or KVM or HVF or whatever...

thanks
-- PMM