Message ID | 1607087853-6570-1-git-send-email-yoshihiro.shimoda.uh@renesas.com |
---|---|
State | Superseded |
Headers | show |
Series | mmc: host: renesas_internal_dmac: add pre_req and post_req support | expand |
Hi Shimoda-san, On Fri, Dec 04, 2020 at 10:17:33PM +0900, Yoshihiro Shimoda wrote: > Add pre_req and post_req support to improve performance. > > Inspired by a patch in the BSP by Masaharu Hayakawa. Thank you for upporting this! > /* > * Specification of this driver: > * - host->chan_{rx,tx} will be used as a flag of enabling/disabling the dma > @@ -172,6 +178,47 @@ renesas_sdhi_internal_dmac_dataend_dma(struct tmio_mmc_host *host) { > tasklet_schedule(&priv->dma_priv.dma_complete); > } > > +/* Should not use host->sg_ptr/sg_len in the following function */ Maybe a short explanation why we shouldn't use the functions? > +static void > +renesas_sdhi_internal_dmac_unmap(struct tmio_mmc_host *host, > + struct mmc_data *data, > + enum renesas_sdhi_dma_cookie cookie, > + bool expected_unmatch) Can we maybe skip "expected_unmatch"? It is always true for COOKIE_UNMAPPED and always false for the COOKIE_*MAPPED values, or? > +{ > + bool unmap = expected_unmatch ? (data->host_cookie != cookie) : > + (data->host_cookie == cookie); Then, we could do: + bool unmap = cookie == COOKIE_UNMAPPED ? (data->host_cookie != cookie) : + (data->host_cookie == cookie); > + > + if (unmap) { > + dma_unmap_sg(&host->pdev->dev, data->sg, data->sg_len, > + mmc_get_dma_dir(data)); > + data->host_cookie = COOKIE_UNMAPPED; > + } Is it maybe worth a warning if the expected condition was not found? Rest looks good! All the best, Wolfram
Hi Wolfram-san, Thank you for your review! > From: Wolfram Sang, Sent: Tuesday, December 15, 2020 12:50 AM > > Hi Shimoda-san, > > On Fri, Dec 04, 2020 at 10:17:33PM +0900, Yoshihiro Shimoda wrote: > > Add pre_req and post_req support to improve performance. > > > > Inspired by a patch in the BSP by Masaharu Hayakawa. > > Thank you for upporting this! > > > /* > > * Specification of this driver: > > * - host->chan_{rx,tx} will be used as a flag of enabling/disabling the dma > > @@ -172,6 +178,47 @@ renesas_sdhi_internal_dmac_dataend_dma(struct tmio_mmc_host *host) { > > tasklet_schedule(&priv->dma_priv.dma_complete); > > } > > > > +/* Should not use host->sg_ptr/sg_len in the following function */ > > Maybe a short explanation why we shouldn't use the functions? I tried to update the comment as below: /* * tmio_mmc_request() only sets host->sg_{ptr,len} and * renesas_sdhi_internal_dmac_pre_req() doesn't set host->sg_{ptr,len} so that * we should not use the values in the following function. */ Hmm... Perhaps, I should modify the code to use host->sg_{ptr,len} in both paths (.request() and .pre_req()) and remove this comments. So, I'll try to modify. I guess tmio_mmc_init_sg() is called in pre_req(), we can use host->sg_{ptr,len}. > > +static void > > +renesas_sdhi_internal_dmac_unmap(struct tmio_mmc_host *host, > > + struct mmc_data *data, > > + enum renesas_sdhi_dma_cookie cookie, > > + bool expected_unmatch) > > Can we maybe skip "expected_unmatch"? It is always true for > COOKIE_UNMAPPED and always false for the COOKIE_*MAPPED values, or? > > > +{ > > + bool unmap = expected_unmatch ? (data->host_cookie != cookie) : > > + (data->host_cookie == cookie); > > Then, we could do: > + bool unmap = cookie == COOKIE_UNMAPPED ? (data->host_cookie != cookie) : > + (data->host_cookie == cookie); Thank you for your suggestion! You're correct. I'll fix this. > > + > > + if (unmap) { > > + dma_unmap_sg(&host->pdev->dev, data->sg, data->sg_len, > > + mmc_get_dma_dir(data)); > > + data->host_cookie = COOKIE_UNMAPPED; > > + } > > Is it maybe worth a warning if the expected condition was not found? If we could add such a warning, it's helpful. However, I have no idea how to implement a condition for it because we cannot use "else" here. For example, when this driver mapped as PRE_MAPPED and then renesas_sdhi_internal_dmac_complete() is called, the "unmap" value of renesas_sdhi_internal_dmac_unmap() from renesas_sdhi_internal_dmac_complete() is false: data->host_cookie = PRE_MAPPED, and cookie = MAPPED This is because this driver should unmap in .post_req() so that this is the expected condition in such a case. But, what do you think? > Rest looks good! Thanks! Best regards, Yoshihiro Shimoda
Hi Wolfram-san again, > From: Yoshihiro Shimoda, Sent: Tuesday, December 15, 2020 10:32 AM > > From: Wolfram Sang, Sent: Tuesday, December 15, 2020 12:50 AM > > On Fri, Dec 04, 2020 at 10:17:33PM +0900, Yoshihiro Shimoda wrote: > > > /* > > > * Specification of this driver: > > > * - host->chan_{rx,tx} will be used as a flag of enabling/disabling the dma > > > @@ -172,6 +178,47 @@ renesas_sdhi_internal_dmac_dataend_dma(struct tmio_mmc_host *host) { > > > tasklet_schedule(&priv->dma_priv.dma_complete); > > > } > > > > > > +/* Should not use host->sg_ptr/sg_len in the following function */ > > > > Maybe a short explanation why we shouldn't use the functions? > > I tried to update the comment as below: > /* > * tmio_mmc_request() only sets host->sg_{ptr,len} and > * renesas_sdhi_internal_dmac_pre_req() doesn't set host->sg_{ptr,len} so that > * we should not use the values in the following function. > */ > > Hmm... Perhaps, I should modify the code to use host->sg_{ptr,len} > in both paths (.request() and .pre_req()) and remove this comments. > So, I'll try to modify. I guess tmio_mmc_init_sg() is called in pre_req(), > we can use host->sg_{ptr,len}. I'm sorry. I was completely wrong. If we use {pre,post}_req, the MMC core will call pre_req() twice with each mmc_data before pre_req() is called. So that, second pre_req() will overwrite the host->sg_ptr. So, we should not use host->sg_ptr here. So, I'll update the comments like below. /* * Since pre_req() will be called twice before post_req() is called, * host->sg_ptr will be overwritten by second pre_req(). So, to use * suitable sg pointer, should use data->sg/sg_len instead of * host->sg_ptr/sg_len. */ Best regards, Yoshihiro Shimoda
Hi Wolfram-san again, > From: Yoshihiro Shimoda, Sent: Tuesday, December 15, 2020 12:00 PM > > From: Yoshihiro Shimoda, Sent: Tuesday, December 15, 2020 10:32 AM > > > From: Wolfram Sang, Sent: Tuesday, December 15, 2020 12:50 AM > > > On Fri, Dec 04, 2020 at 10:17:33PM +0900, Yoshihiro Shimoda wrote: > > > > /* > > > > * Specification of this driver: > > > > * - host->chan_{rx,tx} will be used as a flag of enabling/disabling the dma > > > > @@ -172,6 +178,47 @@ renesas_sdhi_internal_dmac_dataend_dma(struct tmio_mmc_host *host) { > > > > tasklet_schedule(&priv->dma_priv.dma_complete); > > > > } > > > > > > > > +/* Should not use host->sg_ptr/sg_len in the following function */ > > > > > > Maybe a short explanation why we shouldn't use the functions? > > > > I tried to update the comment as below: > > /* > > * tmio_mmc_request() only sets host->sg_{ptr,len} and > > * renesas_sdhi_internal_dmac_pre_req() doesn't set host->sg_{ptr,len} so that > > * we should not use the values in the following function. > > */ > > > > Hmm... Perhaps, I should modify the code to use host->sg_{ptr,len} > > in both paths (.request() and .pre_req()) and remove this comments. > > So, I'll try to modify. I guess tmio_mmc_init_sg() is called in pre_req(), > > we can use host->sg_{ptr,len}. > > I'm sorry. I was completely wrong. If we use {pre,post}_req, > the MMC core will call pre_req() twice with each mmc_data before > pre_req() is called. So that, second pre_req() will overwrite > the host->sg_ptr. So, we should not use host->sg_ptr here. > So, I'll update the comments like below. I'm sorry again and again. But, I realized the current patch breaks "force_pio" mode because tmio_mmc_pio_irq() doesn't take care of {pre,post}_req. So, I'll try to refactor tmio core to support {pre,post}_req(). Best regards, Yoshihiro Shimoda
> I'm sorry again and again. But, I realized the current patch breaks > "force_pio" mode because tmio_mmc_pio_irq() doesn't take care of {pre,post}_req. > So, I'll try to refactor tmio core to support {pre,post}_req(). How did you test this BTW? Just checksumming a large file doesn't show improvements here on 5.10 + your patch with H3 ES2.0 and M3-N for me. Did I miss something?
Hi Wolfram-san, > From: Yoshihiro Shimoda, Sent: Tuesday, December 15, 2020 2:03 PM <snip> > I'm sorry again and again. But, I realized the current patch breaks > "force_pio" mode because tmio_mmc_pio_irq() doesn't take care of {pre,post}_req. > So, I'll try to refactor tmio core to support {pre,post}_req(). I'm sorry. My fault injection code is bad. In other words, we don't need to refactor tmio core. I had added "goto force_pio;" in renesas_sdhi_internal_dmac_start_dma() yesterday. However, I should have added "goto force_pio_with_unmap;" instead. Otherwise, the driver unmaps the buffer after pio finished so that reading data by pio was invalidated by dma_unmap_sg(). So, I'll submit v2 patch later. Best regards, Yoshihiro Shimoda
diff --git a/drivers/mmc/host/renesas_sdhi_internal_dmac.c b/drivers/mmc/host/renesas_sdhi_internal_dmac.c index d8b811c..ce3c447 100644 --- a/drivers/mmc/host/renesas_sdhi_internal_dmac.c +++ b/drivers/mmc/host/renesas_sdhi_internal_dmac.c @@ -56,6 +56,12 @@ #define INFO2_DTRANERR1 BIT(17) #define INFO2_DTRANERR0 BIT(16) +enum renesas_sdhi_dma_cookie { + COOKIE_UNMAPPED, + COOKIE_PRE_MAPPED, + COOKIE_MAPPED, +}; + /* * Specification of this driver: * - host->chan_{rx,tx} will be used as a flag of enabling/disabling the dma @@ -172,6 +178,47 @@ renesas_sdhi_internal_dmac_dataend_dma(struct tmio_mmc_host *host) { tasklet_schedule(&priv->dma_priv.dma_complete); } +/* Should not use host->sg_ptr/sg_len in the following function */ +static void +renesas_sdhi_internal_dmac_unmap(struct tmio_mmc_host *host, + struct mmc_data *data, + enum renesas_sdhi_dma_cookie cookie, + bool expected_unmatch) +{ + bool unmap = expected_unmatch ? (data->host_cookie != cookie) : + (data->host_cookie == cookie); + + if (unmap) { + dma_unmap_sg(&host->pdev->dev, data->sg, data->sg_len, + mmc_get_dma_dir(data)); + data->host_cookie = COOKIE_UNMAPPED; + } +} + +/* Should not use host->sg_ptr/sg_len in the following function */ +static bool +renesas_sdhi_internal_dmac_map(struct tmio_mmc_host *host, + struct mmc_data *data, + enum renesas_sdhi_dma_cookie cookie) +{ + if (data->host_cookie == COOKIE_PRE_MAPPED) + return true; + + if (!dma_map_sg(&host->pdev->dev, data->sg, data->sg_len, + mmc_get_dma_dir(data))) + return false; + + data->host_cookie = cookie; + + /* This DMAC cannot handle if buffer is not 8-bytes alignment */ + if (!IS_ALIGNED(sg_dma_address(data->sg), 8)) { + renesas_sdhi_internal_dmac_unmap(host, data, cookie, false); + return false; + } + + return true; +} + static void renesas_sdhi_internal_dmac_start_dma(struct tmio_mmc_host *host, struct mmc_data *data) @@ -182,14 +229,9 @@ renesas_sdhi_internal_dmac_start_dma(struct tmio_mmc_host *host, if (!test_bit(SDHI_INTERNAL_DMAC_ADDR_MODE_FIXED_ONLY, &global_flags)) dtran_mode |= DTRAN_MODE_ADDR_MODE; - if (!dma_map_sg(&host->pdev->dev, sg, host->sg_len, - mmc_get_dma_dir(data))) + if (!renesas_sdhi_internal_dmac_map(host, data, COOKIE_MAPPED)) goto force_pio; - /* This DMAC cannot handle if buffer is not 8-bytes alignment */ - if (!IS_ALIGNED(sg_dma_address(sg), 8)) - goto force_pio_with_unmap; - if (data->flags & MMC_DATA_READ) { dtran_mode |= DTRAN_MODE_CH_NUM_CH1; if (test_bit(SDHI_INTERNAL_DMAC_ONE_RX_ONLY, &global_flags) && @@ -212,7 +254,7 @@ renesas_sdhi_internal_dmac_start_dma(struct tmio_mmc_host *host, return; force_pio_with_unmap: - dma_unmap_sg(&host->pdev->dev, sg, host->sg_len, mmc_get_dma_dir(data)); + renesas_sdhi_internal_dmac_unmap(host, data, COOKIE_UNMAPPED, true); force_pio: renesas_sdhi_internal_dmac_enable_dma(host, false); @@ -245,7 +287,7 @@ static bool renesas_sdhi_internal_dmac_complete(struct tmio_mmc_host *host) dir = DMA_TO_DEVICE; renesas_sdhi_internal_dmac_enable_dma(host, false); - dma_unmap_sg(&host->pdev->dev, host->sg_ptr, host->sg_len, dir); + renesas_sdhi_internal_dmac_unmap(host, host->data, COOKIE_MAPPED, false); if (dir == DMA_FROM_DEVICE) clear_bit(SDHI_INTERNAL_DMAC_RX_IN_USE, &global_flags); @@ -274,6 +316,32 @@ static void renesas_sdhi_internal_dmac_end_dma(struct tmio_mmc_host *host) renesas_sdhi_internal_dmac_complete(host); } +static void renesas_sdhi_internal_dmac_post_req(struct mmc_host *mmc, + struct mmc_request *req, + int err) +{ + struct tmio_mmc_host *host = mmc_priv(mmc); + struct mmc_data *data = req->data; + + if (!data) + return; + + renesas_sdhi_internal_dmac_unmap(host, data, COOKIE_UNMAPPED, true); +} + +static void renesas_sdhi_internal_dmac_pre_req(struct mmc_host *mmc, + struct mmc_request *req) +{ + struct tmio_mmc_host *host = mmc_priv(mmc); + struct mmc_data *data = req->data; + + if (!data) + return; + + data->host_cookie = COOKIE_UNMAPPED; + renesas_sdhi_internal_dmac_map(host, data, COOKIE_PRE_MAPPED); +} + static void renesas_sdhi_internal_dmac_request_dma(struct tmio_mmc_host *host, struct tmio_mmc_data *pdata) @@ -295,6 +363,10 @@ renesas_sdhi_internal_dmac_request_dma(struct tmio_mmc_host *host, tasklet_init(&host->dma_issue, renesas_sdhi_internal_dmac_issue_tasklet_fn, (unsigned long)host); + + /* Add pre_req and post_req */ + host->ops.pre_req = renesas_sdhi_internal_dmac_pre_req; + host->ops.post_req = renesas_sdhi_internal_dmac_post_req; } static void
Add pre_req and post_req support to improve performance. Inspired by a patch in the BSP by Masaharu Hayakawa. Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> --- When I applied this patch, read performance on R-Car H3 with HS400 mode was greatly improved: - without the patch : about 200MB/sec - with the patch : about 250MB/sec drivers/mmc/host/renesas_sdhi_internal_dmac.c | 88 ++++++++++++++++++++++++--- 1 file changed, 80 insertions(+), 8 deletions(-)