Message ID | 20220629011638.21783-3-schmitzmic@gmail.com
State      | New
Series     | Converting m68k WD33C93 drivers to DMA API
On Wed, Jun 29, 2022 at 3:16 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
>
> Use dma_map_single() for a2091 driver (leave bounce buffer
> logic unchanged).
>
> Use dma_set_mask_and_coherent() to avoid explicit cache
> flushes.
>
> Compile-tested only.
>
> CC: linux-scsi@vger.kernel.org
> Link: https://lore.kernel.org/r/6d1d88ee-1cf6-c735-1e6d-bafd2096e322@gmail.com
> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>

Reviewed-by: Arnd Bergmann <arnd@arndb.de>

> +
> +	addr = dma_map_single(hdata->dev, scsi_pointer->ptr,
> +			      len, DMA_DIR(dir_in));
> +	if (dma_mapping_error(hdata->dev, addr)) {
> +		dev_warn(hdata->dev, "cannot map SCSI data block %p\n",
> +			 scsi_pointer->ptr);
> +		return 1;
> +	}
> +	scsi_pointer->dma_handle = addr;
>
>	/* don't allow DMA if the physical address is bad */
>	if (addr & A2091_XFER_MASK) {
> +		/* drop useless mapping */
> +		dma_unmap_single(hdata->dev, scsi_pointer->dma_handle,
> +				 scsi_pointer->this_residual,
> +				 DMA_DIR(dir_in));

I think you could save the extra map/unmap here if you wanted, but that
would risk introducing bugs since it requires a larger rework.

> +		scsi_pointer->dma_handle = (dma_addr_t) NULL;
> +
>		wh->dma_bounce_len = (scsi_pointer->this_residual + 511) & ~0x1ff;
>		wh->dma_bounce_buffer = kmalloc(wh->dma_bounce_len,
>						GFP_KERNEL);

Not your bug, but if there is memory above the A2091_XFER_MASK limit,
this would need to use GFP_DMA instead of GFP_KERNEL.

        Arnd
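As a rough illustration of the GFP_DMA point above (not part of the posted patch, and only effective once the kernel actually has allocatable memory below the 16 MB limit), the bounce-buffer allocation in dma_setup() might look like this:

	/* hypothetical variant of the allocation quoted above: GFP_DMA
	 * asks for memory the 24-bit DMA engine can reach, so the
	 * A2091_XFER_MASK re-check should not trip on machines with
	 * RAM above 16 MB
	 */
	wh->dma_bounce_len = (scsi_pointer->this_residual + 511) & ~0x1ff;
	wh->dma_bounce_buffer = kmalloc(wh->dma_bounce_len,
					GFP_KERNEL | GFP_DMA);
	if (!wh->dma_bounce_buffer) {
		wh->dma_bounce_len = 0;
		return 1;
	}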
Hi Arnd,

thanks for your review!

On 29.06.2022 at 18:06, Arnd Bergmann wrote:
> On Wed, Jun 29, 2022 at 3:16 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
>>
>> Use dma_map_single() for a2091 driver (leave bounce buffer
>> logic unchanged).
>>
>> Use dma_set_mask_and_coherent() to avoid explicit cache
>> flushes.
>>
>> Compile-tested only.
>>
>> CC: linux-scsi@vger.kernel.org
>> Link: https://lore.kernel.org/r/6d1d88ee-1cf6-c735-1e6d-bafd2096e322@gmail.com
>> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
>
> Reviewed-by: Arnd Bergmann <arnd@arndb.de>
>
>> +
>> +	addr = dma_map_single(hdata->dev, scsi_pointer->ptr,
>> +			      len, DMA_DIR(dir_in));
>> +	if (dma_mapping_error(hdata->dev, addr)) {
>> +		dev_warn(hdata->dev, "cannot map SCSI data block %p\n",
>> +			 scsi_pointer->ptr);
>> +		return 1;
>> +	}
>> +	scsi_pointer->dma_handle = addr;
>>
>>	/* don't allow DMA if the physical address is bad */
>>	if (addr & A2091_XFER_MASK) {
>> +		/* drop useless mapping */
>> +		dma_unmap_single(hdata->dev, scsi_pointer->dma_handle,
>> +				 scsi_pointer->this_residual,
>> +				 DMA_DIR(dir_in));
>
> I think you could save the extra map/unmap here if you wanted, but that
> would risk introducing bugs since it requires a larger rework.

Not sure how to do that ...

>> +		scsi_pointer->dma_handle = (dma_addr_t) NULL;
>> +
>>		wh->dma_bounce_len = (scsi_pointer->this_residual + 511) & ~0x1ff;
>>		wh->dma_bounce_buffer = kmalloc(wh->dma_bounce_len,
>>						GFP_KERNEL);
>
> Not your bug, but if there is memory above the A2091_XFER_MASK limit,
> this would need to use GFP_DMA instead of GFP_KERNEL.

Same argument goes for gvp11 - though last time I checked (and certainly
at the time these drivers were written), there really was no difference
between using GFP_DMA and GFP_KERNEL on m68k, hence the need to check
that the allocated buffer is indeed suitable for DMA, and perhaps revert
to chip RAM (slow, useless for most other purposes but definitely below
16 MB) if that fails (as the gvp11 driver does).

Most users will opt for loading the kernel to a RAM chunk at a higher
physical address, making the lower chunk unusable (except as chip RAM),
meaning allocating a bounce buffer within the first 16 MB will most
likely fail anyway AIUI (but Geert would know that for sure).

Cheers,

	Michael

>
> Arnd
>
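For what it's worth, one possible shape of the rework Arnd hints at, sketched here purely for illustration: the helper name is made up, and it assumes (as the posted patch's A2091_XFER_MASK check already does) that the dma_addr_t returned by dma_map_single() is the buffer's physical address on Amiga.

#include <linux/dma-mapping.h>
#include <linux/io.h>		/* virt_to_phys() */

/* hypothetical helper for drivers/scsi/a2091.c: map the transfer
 * buffer only when its physical address is reachable by the 24-bit
 * DMA engine, so no throwaway mapping has to be undone; a non-zero
 * return tells the caller to fall back to the bounce-buffer path
 */
static int a2091_try_map_direct(struct a2091_hostdata *hdata,
				struct scsi_pointer *scsi_pointer,
				int dir_in)
{
	dma_addr_t addr;

	if (virt_to_phys(scsi_pointer->ptr) & A2091_XFER_MASK)
		return 1;

	addr = dma_map_single(hdata->dev, scsi_pointer->ptr,
			      scsi_pointer->this_residual,
			      DMA_DIR(dir_in));
	if (dma_mapping_error(hdata->dev, addr))
		return 1;

	scsi_pointer->dma_handle = addr;
	return 0;
}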
Hi Geert,

On 30/06/22 03:55, Geert Uytterhoeven wrote:
> Hi Michael,
>
> On Wed, Jun 29, 2022 at 9:27 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
>> Most users will opt for loading the kernel to a RAM chunk at a higher
>> physical address, making the lower chunk unusable (except as chip RAM),
>> meaning allocating a bounce buffer within the first 16 MB will most
>> likely fail anyway AIUI (but Geert would know that for sure).
> Exactly. On Zorro III-capable machines, Zorro II RAM is not mapped
> for general use, but can be used by the z2ram block device.

Is there any DMA capable memory other than chip and ZorroII RAM on these
machines?

> Note that you can use it as a bounce buffer, by looking for and marking
> free chunks in the zorro_unused_z2ram bitmap.

That would be similar to what amiga_chip_alloc() does, right? Any
advantages over using chip RAM (faster, perhaps)?

Cheers,

	Michael

>
> Gr{oetje,eeting}s,
>
>                         Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
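The chip-RAM fallback Michael refers to (as done in the gvp11 driver) would look roughly like this in a2091's dma_setup(). A sketch only: it reuses the a2091 names, needs <asm/amigahw.h> for amiga_chip_alloc(), and omits the bookkeeping that would later tell dma_stop() to free the buffer with amiga_chip_free() rather than kfree().

		/* sketch: if the kmalloc()ed bounce buffer sits above the
		 * 24-bit limit, retry from chip RAM, which is always
		 * DMA-able (modelled on the gvp11 driver)
		 */
		if (virt_to_phys(wh->dma_bounce_buffer) & A2091_XFER_MASK) {
			kfree(wh->dma_bounce_buffer);
			wh->dma_bounce_buffer =
				amiga_chip_alloc(wh->dma_bounce_len,
						 "a2091 bounce buffer");
			if (!wh->dma_bounce_buffer) {
				wh->dma_bounce_len = 0;
				return 1;
			}
			/* must later be released via amiga_chip_free() */
		}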
Hi Michael,

On Wed, Jun 29, 2022 at 10:48 PM Michael Schmitz <schmitzmic@gmail.com> wrote:
> On 30/06/22 03:55, Geert Uytterhoeven wrote:
> > On Wed, Jun 29, 2022 at 9:27 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
> >> Most users will opt for loading the kernel to a RAM chunk at a higher
> >> physical address, making the lower chunk unusable (except as chip RAM),
> >> meaning allocating a bounce buffer within the first 16 MB will most
> >> likely fail anyway AIUI (but Geert would know that for sure).
> > Exactly. On Zorro III-capable machines, Zorro II RAM is not mapped
> > for general use, but can be used by the z2ram block device.
> Is there any DMA capable memory other than chip and ZorroII RAM on these
> machines?
> > Note that you can use it as a bounce buffer, by looking for and marking
> > free chunks in the zorro_unused_z2ram bitmap.
>
> That would be similar to what amiga_chip_alloc() does, right? Any
> advantages over using chip RAM (faster, perhaps)?

Yes, Z2RAM is faster, as Amiga custom chip DMA does not steal cycles
from Z2RAM.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Hi Geert,

On 30/06/22 21:27, Geert Uytterhoeven wrote:
>
>>> Note that you can use it as a bounce buffer, by looking for and marking
>>> free chunks in the zorro_unused_z2ram bitmap.
>> That would be similar to what amiga_chip_alloc() does, right? Any
>> advantages over using chip RAM (faster, perhaps)?
> Yes, Z2RAM is faster, as Amiga custom chip DMA does not steal cycles
> from Z2RAM.

OK, it would make sense to use that, but it's well outside the scope of
this series.

Cheers,

	Michael

>
> Gr{oetje,eeting}s,
>
>                         Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
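A rough sketch of the zorro_unused_z2ram idea, noted as out of scope above: it borrows the chunk macros and bitmap that drivers/block/z2ram.c uses (assumed here, not verified against this series), and leaves out locking, chunk release and any integration with the bounce-buffer code.

#include <linux/bitops.h>
#include <linux/zorro.h>	/* Z2RAM_* macros, zorro_unused_z2ram */
#include <asm/amigahw.h>	/* ZTWO_VADDR() */

/* claim one free 64 KiB Zorro II RAM chunk for use as a bounce buffer;
 * returns its kernel virtual address, or NULL if none is free
 */
static void *claim_z2_bounce_chunk(void)
{
	int i;

	for (i = 0; i < Z2RAM_SIZE / Z2RAM_CHUNKSIZE; i++) {
		if (test_bit(i, zorro_unused_z2ram)) {
			clear_bit(i, zorro_unused_z2ram);	/* mark in use */
			return (void *)((unsigned long)ZTWO_VADDR(Z2RAM_START) +
					(i << Z2RAM_CHUNKSHIFT));
		}
	}
	return NULL;
}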
diff --git a/drivers/scsi/a2091.c b/drivers/scsi/a2091.c
index cf703a1ecdda..2338c75c34ed 100644
--- a/drivers/scsi/a2091.c
+++ b/drivers/scsi/a2091.c
@@ -24,8 +24,11 @@
 struct a2091_hostdata {
 	struct WD33C93_hostdata wh;
 	struct a2091_scsiregs *regs;
+	struct device *dev;
 };
 
+#define DMA_DIR(d) ((d == DATA_OUT_DIR) ? DMA_TO_DEVICE : DMA_FROM_DEVICE)
+
 static irqreturn_t a2091_intr(int irq, void *data)
 {
 	struct Scsi_Host *instance = data;
@@ -45,15 +48,31 @@ static irqreturn_t a2091_intr(int irq, void *data)
 static int dma_setup(struct scsi_cmnd *cmd, int dir_in)
 {
 	struct scsi_pointer *scsi_pointer = WD33C93_scsi_pointer(cmd);
+	unsigned long len = scsi_pointer->this_residual;
 	struct Scsi_Host *instance = cmd->device->host;
 	struct a2091_hostdata *hdata = shost_priv(instance);
 	struct WD33C93_hostdata *wh = &hdata->wh;
 	struct a2091_scsiregs *regs = hdata->regs;
 	unsigned short cntr = CNTR_PDMD | CNTR_INTEN;
-	unsigned long addr = virt_to_bus(scsi_pointer->ptr);
+	dma_addr_t addr;
+
+	addr = dma_map_single(hdata->dev, scsi_pointer->ptr,
+			      len, DMA_DIR(dir_in));
+	if (dma_mapping_error(hdata->dev, addr)) {
+		dev_warn(hdata->dev, "cannot map SCSI data block %p\n",
+			 scsi_pointer->ptr);
+		return 1;
+	}
+	scsi_pointer->dma_handle = addr;
 
 	/* don't allow DMA if the physical address is bad */
 	if (addr & A2091_XFER_MASK) {
+		/* drop useless mapping */
+		dma_unmap_single(hdata->dev, scsi_pointer->dma_handle,
+				 scsi_pointer->this_residual,
+				 DMA_DIR(dir_in));
+		scsi_pointer->dma_handle = (dma_addr_t) NULL;
+
 		wh->dma_bounce_len = (scsi_pointer->this_residual + 511) & ~0x1ff;
 		wh->dma_bounce_buffer = kmalloc(wh->dma_bounce_len,
 						GFP_KERNEL);
@@ -64,9 +83,6 @@ static int dma_setup(struct scsi_cmnd *cmd, int dir_in)
 			return 1;
 		}
 
-		/* get the physical address of the bounce buffer */
-		addr = virt_to_bus(wh->dma_bounce_buffer);
-
 		/* the bounce buffer may not be in the first 16M of physmem */
 		if (addr & A2091_XFER_MASK) {
 			/* we could use chipmem... maybe later */
@@ -81,6 +97,17 @@ static int dma_setup(struct scsi_cmnd *cmd, int dir_in)
 			memcpy(wh->dma_bounce_buffer, scsi_pointer->ptr,
 			       scsi_pointer->this_residual);
 		}
+
+		/* may flush/invalidate cache for us */
+		addr = dma_map_single(hdata->dev, wh->dma_bounce_buffer,
+				      wh->dma_bounce_len, DMA_DIR(dir_in));
+		/* can't map buffer; use PIO */
+		if (dma_mapping_error(hdata->dev, addr)) {
+			dev_warn(hdata->dev, "cannot map bounce buffer %p\n",
+				 wh->dma_bounce_buffer);
+			return 1;
+		}
+		scsi_pointer->dma_handle = addr;
 	}
 
 	/* setup dma direction */
@@ -95,13 +122,8 @@ static int dma_setup(struct scsi_cmnd *cmd, int dir_in)
 	/* setup DMA *physical* address */
 	regs->ACR = addr;
 
-	if (dir_in) {
-		/* invalidate any cache */
-		cache_clear(addr, scsi_pointer->this_residual);
-	} else {
-		/* push any dirty cache */
-		cache_push(addr, scsi_pointer->this_residual);
-	}
+	/* no more cache flush here - using coherent DMA alloc */
+
 	/* start DMA */
 	regs->ST_DMA = 1;
 
@@ -151,6 +173,9 @@ static void dma_stop(struct Scsi_Host *instance, struct scsi_cmnd *SCpnt,
 		wh->dma_bounce_buffer = NULL;
 		wh->dma_bounce_len = 0;
 	}
+	dma_unmap_single(hdata->dev, scsi_pointer->dma_handle,
+			 scsi_pointer->this_residual,
+			 DMA_DIR(wh->dma_dir));
 }
 
 static struct scsi_host_template a2091_scsi_template = {
@@ -178,6 +203,11 @@ static int a2091_probe(struct zorro_dev *z, const struct zorro_device_id *ent)
 	wd33c93_regs wdregs;
 	struct a2091_hostdata *hdata;
 
+	if (dma_set_mask_and_coherent(&z->dev, DMA_BIT_MASK(24))) {
+		dev_warn(&z->dev, "cannot use 24 bit DMA\n");
+		return -ENODEV;
+	}
+
 	if (!request_mem_region(z->resource.start, 256, "wd33c93"))
 		return -EBUSY;
 
@@ -198,6 +228,7 @@ static int a2091_probe(struct zorro_dev *z, const struct zorro_device_id *ent)
 	wdregs.SCMD = &regs->SCMD;
 
 	hdata = shost_priv(instance);
+	hdata->dev = &z->dev;
 	hdata->wh.no_sync = 0xff;
 	hdata->wh.fast = 0;
 	hdata->wh.dma_mode = CTRL_DMA;
Use dma_map_single() for a2091 driver (leave bounce buffer
logic unchanged).

Use dma_set_mask_and_coherent() to avoid explicit cache
flushes.

Compile-tested only.

CC: linux-scsi@vger.kernel.org
Link: https://lore.kernel.org/r/6d1d88ee-1cf6-c735-1e6d-bafd2096e322@gmail.com
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
---
 drivers/scsi/a2091.c | 53 +++++++++++++++++++++++++++++++++++---------
 1 file changed, 42 insertions(+), 11 deletions(-)