Message ID | 20221019145246.v2.2.I29f6a2189e84e35ad89c1833793dca9e36c64297@changeid |
---|---|
State | New |
Series | mmc: sdhci controllers: Fix SDHCI_RESET_ALL for CQHCI |
On 10/19/22 14:54, Brian Norris wrote:
> SDHCI_RESET_ALL resets will reset the hardware CQE state, but we aren't
> tracking that properly in software. When out of sync, we may trigger
> various timeouts.
>
> It's not typical to perform resets while CQE is enabled, but one
> particular case I hit commonly enough: mmc_suspend() -> mmc_power_off().
> Typically we will eventually deactivate CQE (cqhci_suspend() ->
> cqhci_deactivate()), but that's not guaranteed -- in particular, if
> we perform a partial (e.g., interrupted) system suspend.
>
> The same bug was already found and fixed for two other drivers, in v5.7
> and v5.9:
>
> 5cf583f1fb9c mmc: sdhci-msm: Deactivate CQE during SDHC reset
> df57d73276b8 mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers
>
> The latter is especially prescient, saying "other drivers using CQHCI
> might benefit from a similar change, if they also have CQHCI reset by
> SDHCI_RESET_ALL."
>
> So like these other patches, deactivate CQHCI when resetting the
> controller.
>
> Fixes: 84362d79f436 ("mmc: sdhci-of-arasan: Add CQHCI support for arasan,sdhci-5.1")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Brian Norris <briannorris@chromium.org>
> ---
>
> Changes in v2:
> - Rely on cqhci_deactivate() to safely handle (ignore)
>   not-yet-initialized CQE support
>
> drivers/mmc/host/sdhci-of-arasan.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/mmc/host/sdhci-of-arasan.c b/drivers/mmc/host/sdhci-of-arasan.c
> index 3997cad1f793..b30f0d6baf5b 100644
> --- a/drivers/mmc/host/sdhci-of-arasan.c
> +++ b/drivers/mmc/host/sdhci-of-arasan.c
> @@ -366,6 +366,9 @@ static void sdhci_arasan_reset(struct sdhci_host *host, u8 mask)
>  	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
>  	struct sdhci_arasan_data *sdhci_arasan = sdhci_pltfm_priv(pltfm_host);
>
> +	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL))
> +		cqhci_deactivate(host->mmc);
> +
>  	sdhci_reset(host, mask);

Cannot this be absorbed by sdhci_reset() that all of these drivers appear to
be utilizing since you have access to the host and the mask to make that
decision?
On Wed, Oct 19, 2022 at 02:59:39PM -0700, Florian Fainelli wrote:
> On 10/19/22 14:54, Brian Norris wrote:
> > The same bug was already found and fixed for two other drivers, in v5.7
> > and v5.9:
> >
> > 5cf583f1fb9c mmc: sdhci-msm: Deactivate CQE during SDHC reset
> > df57d73276b8 mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers
> >
> > The latter is especially prescient, saying "other drivers using CQHCI
> > might benefit from a similar change, if they also have CQHCI reset by
> > SDHCI_RESET_ALL."
> >
> > --- a/drivers/mmc/host/sdhci-of-arasan.c
> > +++ b/drivers/mmc/host/sdhci-of-arasan.c
> > @@ -366,6 +366,9 @@ static void sdhci_arasan_reset(struct sdhci_host *host, u8 mask)
> >  	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
> >  	struct sdhci_arasan_data *sdhci_arasan = sdhci_pltfm_priv(pltfm_host);
> > +	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL))
> > +		cqhci_deactivate(host->mmc);
> > +
> >  	sdhci_reset(host, mask);
>
> Cannot this be absorbed by sdhci_reset() that all of these drivers appear to
> be utilizing since you have access to the host and the mask to make that
> decision?

It potentially could.

I don't know if this is a specified SDHCI behavior that really belongs
in the common helper, or if this is just a commonly-shared behavior. Per
the comments I quote above ("if they also have CQHCI reset by
SDHCI_RESET_ALL"), I chose to leave that as an implementation-specific
behavior.

I suppose it's not all that harmful to do this even if some SDHCI
controller doesn't have the same behavior/quirk.

I guess I also don't know if any SDHCI controllers will support command
queueing (MMC_CAP2_CQE) via something *besides* CQHCI. I see
CQE support in sdhci-sprd.c without CQHCI, although that driver doesn't
set MMC_CAP2_CQE.

Brian
On 20/10/22 01:19, Brian Norris wrote:
> On Wed, Oct 19, 2022 at 02:59:39PM -0700, Florian Fainelli wrote:
>> On 10/19/22 14:54, Brian Norris wrote:
>>> The same bug was already found and fixed for two other drivers, in v5.7
>>> and v5.9:
>>>
>>> 5cf583f1fb9c mmc: sdhci-msm: Deactivate CQE during SDHC reset
>>> df57d73276b8 mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers
>>>
>>> The latter is especially prescient, saying "other drivers using CQHCI
>>> might benefit from a similar change, if they also have CQHCI reset by
>>> SDHCI_RESET_ALL."
>
>>> --- a/drivers/mmc/host/sdhci-of-arasan.c
>>> +++ b/drivers/mmc/host/sdhci-of-arasan.c
>>> @@ -366,6 +366,9 @@ static void sdhci_arasan_reset(struct sdhci_host *host, u8 mask)
>>>  	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
>>>  	struct sdhci_arasan_data *sdhci_arasan = sdhci_pltfm_priv(pltfm_host);
>>> +	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL))
>>> +		cqhci_deactivate(host->mmc);
>>> +
>>>  	sdhci_reset(host, mask);
>>
>> Cannot this be absorbed by sdhci_reset() that all of these drivers appear to
>> be utilizing since you have access to the host and the mask to make that
>> decision?
>
> It potentially could.
>
> I don't know if this is a specified SDHCI behavior that really belongs
> in the common helper, or if this is just a commonly-shared behavior. Per
> the comments I quote above ("if they also have CQHCI reset by
> SDHCI_RESET_ALL"), I chose to leave that as an implementation-specific
> behavior.
>
> I suppose it's not all that harmful to do this even if some SDHCI
> controller doesn't have the same behavior/quirk.
>
> I guess I also don't know if any SDHCI controllers will support command
> queueing (MMC_CAP2_CQE) via something *besides* CQHCI. I see
> CQE support in sdhci-sprd.c without CQHCI, although that driver doesn't
> set MMC_CAP2_CQE.

SDHCI and CQHCI are separate modules and are not dependent, so they cannot
call into each other directly (and should not). A new CQE API would be
needed in mmc_cqe_ops e.g. (*cqe_notify_reset)(struct mmc_host *host),
and wrapped in mmc/host.h:

static inline void mmc_cqe_notify_reset(struct mmc_host *host)
{
	if (host->cqe_ops->cqe_notify_reset)
		host->cqe_ops->cqe_notify_reset(host);
}

Alternatively, you could make a new module for SDHCI/CQHCI helper functions,
although in this case there is so little code it could be static inline and
added in a new include file instead, say sdhci-cqhci.h e.g.

#include "cqhci.h"
#include "sdhci.h"

static inline void sdhci_cqhci_reset(struct sdhci_host *host, u8 mask)
{
	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL) &&
	    host->mmc->cqe_private)
		cqhci_deactivate(host->mmc);
	sdhci_reset(host, mask);
}
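The first option above (an optional hook in mmc_cqe_ops) can be tried out in userspace. What follows is only a minimal sketch under stated assumptions, not kernel code: the struct definitions are cut-down stand-ins for the real ones, `cqe_active`, `demo_cqe_notify_reset()` and `demo_cqe_ops` are hypothetical names made up for the illustration, and the extra NULL check on `cqe_ops` is an assumption added here for hosts that have no CQE ops at all.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct mmc_host;

struct mmc_cqe_ops {
	/* Hook proposed in the thread; optional, may be left NULL. */
	void (*cqe_notify_reset)(struct mmc_host *host);
};

struct mmc_host {
	const struct mmc_cqe_ops *cqe_ops;
	bool cqe_active;	/* hypothetical: the CQE driver's software state */
};

/*
 * Wrapper in the style suggested above. The check on cqe_ops itself is an
 * assumption of this sketch, so hosts without any CQE ops are skipped.
 */
static inline void mmc_cqe_notify_reset(struct mmc_host *host)
{
	if (host->cqe_ops && host->cqe_ops->cqe_notify_reset)
		host->cqe_ops->cqe_notify_reset(host);
}

/* Hypothetical implementation: drop the software state on reset. */
static void demo_cqe_notify_reset(struct mmc_host *host)
{
	host->cqe_active = false;
}

static const struct mmc_cqe_ops demo_cqe_ops = {
	.cqe_notify_reset = demo_cqe_notify_reset,
};
```

The point of the indirection is that the caller only needs the ops table, so the SDHCI side would not have to link against CQHCI symbols.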
On 10/19/22 23:29, Adrian Hunter wrote:
> On 20/10/22 01:19, Brian Norris wrote:
>> On Wed, Oct 19, 2022 at 02:59:39PM -0700, Florian Fainelli wrote:
>>> On 10/19/22 14:54, Brian Norris wrote:
>>>> The same bug was already found and fixed for two other drivers, in v5.7
>>>> and v5.9:
>>>>
>>>> 5cf583f1fb9c mmc: sdhci-msm: Deactivate CQE during SDHC reset
>>>> df57d73276b8 mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers
>>>>
>>>> The latter is especially prescient, saying "other drivers using CQHCI
>>>> might benefit from a similar change, if they also have CQHCI reset by
>>>> SDHCI_RESET_ALL."
>>
>>>> --- a/drivers/mmc/host/sdhci-of-arasan.c
>>>> +++ b/drivers/mmc/host/sdhci-of-arasan.c
>>>> @@ -366,6 +366,9 @@ static void sdhci_arasan_reset(struct sdhci_host *host, u8 mask)
>>>>  	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
>>>>  	struct sdhci_arasan_data *sdhci_arasan = sdhci_pltfm_priv(pltfm_host);
>>>> +	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL))
>>>> +		cqhci_deactivate(host->mmc);
>>>> +
>>>>  	sdhci_reset(host, mask);
>>>
>>> Cannot this be absorbed by sdhci_reset() that all of these drivers appear to
>>> be utilizing since you have access to the host and the mask to make that
>>> decision?
>>
>> It potentially could.
>>
>> I don't know if this is a specified SDHCI behavior that really belongs
>> in the common helper, or if this is just a commonly-shared behavior. Per
>> the comments I quote above ("if they also have CQHCI reset by
>> SDHCI_RESET_ALL"), I chose to leave that as an implementation-specific
>> behavior.
>>
>> I suppose it's not all that harmful to do this even if some SDHCI
>> controller doesn't have the same behavior/quirk.
>>
>> I guess I also don't know if any SDHCI controllers will support command
>> queueing (MMC_CAP2_CQE) via something *besides* CQHCI. I see
>> CQE support in sdhci-sprd.c without CQHCI, although that driver doesn't
>> set MMC_CAP2_CQE.
>
> SDHCI and CQHCI are separate modules and are not dependent, so they cannot
> call into each other directly (and should not). A new CQE API would be
> needed in mmc_cqe_ops e.g. (*cqe_notify_reset)(struct mmc_host *host),
> and wrapped in mmc/host.h:
>
> static inline void mmc_cqe_notify_reset(struct mmc_host *host)
> {
> 	if (host->cqe_ops->cqe_notify_reset)
> 		host->cqe_ops->cqe_notify_reset(host);
> }
>
> Alternatively, you could make a new module for SDHCI/CQHCI helper functions,
> although in this case there is so little code it could be static inline and
> added in a new include file instead, say sdhci-cqhci.h e.g.
>
> #include "cqhci.h"
> #include "sdhci.h"
>
> static inline void sdhci_cqhci_reset(struct sdhci_host *host, u8 mask)
> {
> 	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL) &&
> 	    host->mmc->cqe_private)
> 		cqhci_deactivate(host->mmc);
> 	sdhci_reset(host, mask);
> }
>

I like the simplicity of the inline helper, especially towards backports.
May I suggest naming it sdhci_and_cqhci_reset() to illustrate that it does
both, and does not apply specifically to a CQHCI that would be "embedded"
into SDHCI, but your call here.
On 21/10/22 20:45, Florian Fainelli wrote:
> On 10/19/22 23:29, Adrian Hunter wrote:
>> On 20/10/22 01:19, Brian Norris wrote:
>>> On Wed, Oct 19, 2022 at 02:59:39PM -0700, Florian Fainelli wrote:
>>>> On 10/19/22 14:54, Brian Norris wrote:
>>>>> The same bug was already found and fixed for two other drivers, in v5.7
>>>>> and v5.9:
>>>>>
>>>>> 5cf583f1fb9c mmc: sdhci-msm: Deactivate CQE during SDHC reset
>>>>> df57d73276b8 mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers
>>>>>
>>>>> The latter is especially prescient, saying "other drivers using CQHCI
>>>>> might benefit from a similar change, if they also have CQHCI reset by
>>>>> SDHCI_RESET_ALL."
>>>
>>>>> --- a/drivers/mmc/host/sdhci-of-arasan.c
>>>>> +++ b/drivers/mmc/host/sdhci-of-arasan.c
>>>>> @@ -366,6 +366,9 @@ static void sdhci_arasan_reset(struct sdhci_host *host, u8 mask)
>>>>>  	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
>>>>>  	struct sdhci_arasan_data *sdhci_arasan = sdhci_pltfm_priv(pltfm_host);
>>>>> +	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL))
>>>>> +		cqhci_deactivate(host->mmc);
>>>>> +
>>>>>  	sdhci_reset(host, mask);
>>>>
>>>> Cannot this be absorbed by sdhci_reset() that all of these drivers appear to
>>>> be utilizing since you have access to the host and the mask to make that
>>>> decision?
>>>
>>> It potentially could.
>>>
>>> I don't know if this is a specified SDHCI behavior that really belongs
>>> in the common helper, or if this is just a commonly-shared behavior. Per
>>> the comments I quote above ("if they also have CQHCI reset by
>>> SDHCI_RESET_ALL"), I chose to leave that as an implementation-specific
>>> behavior.
>>>
>>> I suppose it's not all that harmful to do this even if some SDHCI
>>> controller doesn't have the same behavior/quirk.
>>>
>>> I guess I also don't know if any SDHCI controllers will support command
>>> queueing (MMC_CAP2_CQE) via something *besides* CQHCI. I see
>>> CQE support in sdhci-sprd.c without CQHCI, although that driver doesn't
>>> set MMC_CAP2_CQE.
>>
>> SDHCI and CQHCI are separate modules and are not dependent, so they cannot
>> call into each other directly (and should not). A new CQE API would be
>> needed in mmc_cqe_ops e.g. (*cqe_notify_reset)(struct mmc_host *host),
>> and wrapped in mmc/host.h:
>>
>> static inline void mmc_cqe_notify_reset(struct mmc_host *host)
>> {
>> 	if (host->cqe_ops->cqe_notify_reset)
>> 		host->cqe_ops->cqe_notify_reset(host);
>> }
>>
>> Alternatively, you could make a new module for SDHCI/CQHCI helper functions,
>> although in this case there is so little code it could be static inline and
>> added in a new include file instead, say sdhci-cqhci.h e.g.
>>
>> #include "cqhci.h"
>> #include "sdhci.h"
>>
>> static inline void sdhci_cqhci_reset(struct sdhci_host *host, u8 mask)
>> {
>> 	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL) &&
>> 	    host->mmc->cqe_private)
>> 		cqhci_deactivate(host->mmc);
>> 	sdhci_reset(host, mask);
>> }
>>
>
> I like the simplicity of the inline helper, especially towards backports.
> May I suggest naming it sdhci_and_cqhci_reset() to illustrate that it does
> both, and does not apply specifically to a CQHCI that would be "embedded"
> into SDHCI, but your call here.

sdhci_and_cqhci_reset() is fine by me
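To see how the helper behaves under the agreed name, here is a rough userspace sketch. It is not the kernel implementation: the structs are reduced stand-ins, `cqhci_deactivate()` and `sdhci_reset()` are stubs, `cqe_activated` and `resets` are hypothetical fields added only so the effect can be observed, and the macro values are simply chosen for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;

/* Values chosen for this sketch; the real ones come from kernel headers. */
#define MMC_CAP2_CQE	(1u << 23)
#define SDHCI_RESET_ALL	0x01

struct mmc_host {
	unsigned int caps2;
	void *cqe_private;	/* non-NULL once CQE support is initialized */
	bool cqe_activated;	/* hypothetical: CQHCI's software state */
};

struct sdhci_host {
	struct mmc_host *mmc;
	unsigned int resets;	/* hypothetical: counts stub reset calls */
};

/* Stub standing in for the real cqhci_deactivate(). */
static void cqhci_deactivate(struct mmc_host *mmc)
{
	mmc->cqe_activated = false;
}

/* Stub standing in for the real sdhci_reset(). */
static void sdhci_reset(struct sdhci_host *host, u8 mask)
{
	(void)mask;
	host->resets++;
}

/* Helper body as proposed in the thread, under the agreed name. */
static inline void sdhci_and_cqhci_reset(struct sdhci_host *host, u8 mask)
{
	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL) &&
	    host->mmc->cqe_private)
		cqhci_deactivate(host->mmc);

	sdhci_reset(host, mask);
}
```

In this model a reset without SDHCI_RESET_ALL leaves the CQE software state alone, while a full reset clears it before the controller reset runs; a driver's own reset callback would presumably call the helper in place of sdhci_reset().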
diff --git a/drivers/mmc/host/sdhci-of-arasan.c b/drivers/mmc/host/sdhci-of-arasan.c
index 3997cad1f793..b30f0d6baf5b 100644
--- a/drivers/mmc/host/sdhci-of-arasan.c
+++ b/drivers/mmc/host/sdhci-of-arasan.c
@@ -366,6 +366,9 @@ static void sdhci_arasan_reset(struct sdhci_host *host, u8 mask)
 	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
 	struct sdhci_arasan_data *sdhci_arasan = sdhci_pltfm_priv(pltfm_host);
 
+	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL))
+		cqhci_deactivate(host->mmc);
+
 	sdhci_reset(host, mask);
 
 	if (sdhci_arasan->quirks & SDHCI_ARASAN_QUIRK_FORCE_CDTEST) {
SDHCI_RESET_ALL resets will reset the hardware CQE state, but we aren't
tracking that properly in software. When out of sync, we may trigger
various timeouts.

It's not typical to perform resets while CQE is enabled, but one
particular case I hit commonly enough: mmc_suspend() -> mmc_power_off().
Typically we will eventually deactivate CQE (cqhci_suspend() ->
cqhci_deactivate()), but that's not guaranteed -- in particular, if
we perform a partial (e.g., interrupted) system suspend.

The same bug was already found and fixed for two other drivers, in v5.7
and v5.9:

5cf583f1fb9c mmc: sdhci-msm: Deactivate CQE during SDHC reset
df57d73276b8 mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers

The latter is especially prescient, saying "other drivers using CQHCI
might benefit from a similar change, if they also have CQHCI reset by
SDHCI_RESET_ALL."

So like these other patches, deactivate CQHCI when resetting the
controller.

Fixes: 84362d79f436 ("mmc: sdhci-of-arasan: Add CQHCI support for arasan,sdhci-5.1")
Cc: <stable@vger.kernel.org>
Signed-off-by: Brian Norris <briannorris@chromium.org>
---

Changes in v2:
- Rely on cqhci_deactivate() to safely handle (ignore)
  not-yet-initialized CQE support

 drivers/mmc/host/sdhci-of-arasan.c | 3 +++
 1 file changed, 3 insertions(+)
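The out-of-sync condition described in the commit message can be pictured with a toy model. This is purely illustrative and deliberately simplified: `struct cqe_model` and every `model_*()` function are hypothetical names invented here, and none of this mirrors the actual CQHCI data structures.

```c
#include <stdbool.h>

/* Toy model: one flag for the hardware CQE, one for the driver's view. */
struct cqe_model {
	bool hw_enabled;	/* what the controller is actually doing */
	bool sw_activated;	/* what the software believes */
};

/* Enabling CQE sets both sides. */
static void model_activate(struct cqe_model *m)
{
	m->hw_enabled = true;
	m->sw_activated = true;
}

/* A full controller reset clears the hardware side only. */
static void model_reset_all(struct cqe_model *m)
{
	m->hw_enabled = false;
}

/* The approach taken by the patch: drop the software state, then reset. */
static void model_deactivate_then_reset(struct cqe_model *m)
{
	m->sw_activated = false;
	model_reset_all(m);
}

/* True when software and hardware agree about the CQE state. */
static bool model_in_sync(const struct cqe_model *m)
{
	return m->hw_enabled == m->sw_activated;
}
```

In the model, a bare model_reset_all() after model_activate() leaves the two flags disagreeing, which corresponds to the state in which the commit message says timeouts may occur; deactivating first keeps them in agreement.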