Message ID | 20230928092040.9420-7-brgl@bgdev.pl |
---|---|
State | New |
Headers | show |
Series | arm64: qcom: add and enable SHM Bridge support | expand |
On Fri, Sep 29, 2023 at 12:22:16PM -0700, Bartosz Golaszewski wrote: > On Fri, 29 Sep 2023 21:16:51 +0200, Andrew Halaney <ahalaney@redhat.com> said: > > On Thu, Sep 28, 2023 at 11:20:35AM +0200, Bartosz Golaszewski wrote: > >> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > >> > >> Let's use the new SCM memory allocator to obtain a buffer for this call > >> instead of using dma_alloc_coherent(). > >> > >> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > >> --- > >> drivers/firmware/qcom/qcom_scm.c | 16 +++++----------- > >> 1 file changed, 5 insertions(+), 11 deletions(-) > >> > >> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c > >> index 02a773ba1383..c0eb81069847 100644 > >> --- a/drivers/firmware/qcom/qcom_scm.c > >> +++ b/drivers/firmware/qcom/qcom_scm.c > >> @@ -532,7 +532,7 @@ static void qcom_scm_set_download_mode(bool enable) > >> int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, > >> struct qcom_scm_pas_metadata *ctx) > >> { > >> - dma_addr_t mdata_phys; > >> + phys_addr_t mdata_phys; > > > >> void *mdata_buf; > >> int ret; > >> struct qcom_scm_desc desc = { > >> @@ -544,13 +544,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, > >> }; > >> struct qcom_scm_res res; > >> > >> - /* > >> - * During the scm call memory protection will be enabled for the meta > >> - * data blob, so make sure it's physically contiguous, 4K aligned and > >> - * non-cachable to avoid XPU violations. > >> - */ > >> - mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys, > >> - GFP_KERNEL); > >> + mdata_buf = qcom_scm_mem_alloc(size, GFP_KERNEL); > > > > mdata_phys is never initialized now, and its what's being shoved into > > desc.args[1] later, which I believe is what triggered the -EINVAL > > with qcom_scm_call() that I reported in my cover letter reply this > > morning. > > > > Prior with the DMA API that would have been the device view of the buffer. > > > > Gah! Thanks for finding this. > > Can you try the following diff? > > diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c > index 794388c3212f..b0d4ea237034 100644 > --- a/drivers/firmware/qcom/qcom_scm.c > +++ b/drivers/firmware/qcom/qcom_scm.c > @@ -556,6 +556,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const > void *metadata, size_t size, > dev_err(__scm->dev, "Allocation of metadata buffer failed.\n"); > return -ENOMEM; > } > + mdata_phys = qcom_scm_mem_to_phys(mdata_buf); > memcpy(mdata_buf, metadata, size); > > ret = qcom_scm_clk_enable(); > @@ -578,7 +579,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const > void *metadata, size_t size, > qcom_scm_mem_free(mdata_buf); > } else if (ctx) { > ctx->ptr = mdata_buf; > - ctx->phys = qcom_scm_mem_to_phys(mdata_buf); > + ctx->phys = mdata_phys; > ctx->size = size; > } > > Bart > For some reason that I can't explain that is still not working. It seems the SMC call is returning !0 and then we return -EINVAL from there with qcom_scm_remap_error(). Here's a really crummy diff of what I hacked in during lunch to debug (don't judge my primitive debug skills): diff --git a/drivers/firmware/qcom/qcom_scm-smc.c b/drivers/firmware/qcom/qcom_scm-smc.c index 0d5554df1321..56eab0ae5f3a 100644 --- a/drivers/firmware/qcom/qcom_scm-smc.c +++ b/drivers/firmware/qcom/qcom_scm-smc.c @@ -162,6 +162,8 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, struct arm_smccc_res smc_res; struct arm_smccc_args smc = {0}; + dev_err(dev, "%s: %d: We are in this function\n", __func__, __LINE__); + smc.args[0] = ARM_SMCCC_CALL_VAL( smccc_call_type, qcom_smccc_convention, @@ -174,6 +176,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, if (unlikely(arglen > SCM_SMC_N_REG_ARGS)) { alloc_len = SCM_SMC_N_EXT_ARGS * sizeof(u64); args_virt = qcom_scm_mem_alloc(PAGE_ALIGN(alloc_len), flag); + dev_err(dev, "%s: %d: Hit the unlikely case!\n", __func__, __LINE__); if (!args_virt) return -ENOMEM; @@ -197,6 +200,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, /* ret error check follows after args_virt cleanup*/ ret = __scm_smc_do(dev, &smc, &smc_res, atomic); + dev_err(dev, "%s: %d: ret: %d\n", __func__, __LINE__, ret); if (ret) return ret; @@ -205,8 +209,10 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, res->result[0] = smc_res.a1; res->result[1] = smc_res.a2; res->result[2] = smc_res.a3; + dev_err(dev, "%s: %d: 0: %llu, 1: %llu: 2: %llu\n", __func__, __LINE__, res->result[0], res->result[1], res->result[2]); } + dev_err(dev, "%s: %d: smc_res.a0: %lu\n", __func__, __LINE__, smc_res.a0); return (long)smc_res.a0 ? qcom_scm_remap_error(smc_res.a0) : 0; And that all spams dmesg successfully for most cases, but the pas_init_image calls log this out: [ 16.362965] remoteproc remoteproc1: powering up 1b300000.remoteproc [ 16.364897] remoteproc remoteproc1: Booting fw image qcom/sc8280xp/LENOVO/21BX/qccdsp8280.mbn, size 3575808 [ 16.365009] qcom_scm firmware:scm: __scm_smc_call: 165: We are in this function [ 16.365251] qcom_scm firmware:scm: __scm_smc_call: 203: ret: 0 [ 16.365256] qcom_scm firmware:scm: __scm_smc_call: 212: 0: 0, 1: 0: 2: 0 [ 16.365261] qcom_scm firmware:scm: __scm_smc_call: 215: smc_res.a0: 4291821558 At the moment I am unsure why...
Hi Andrew, On 9/29/2023 1:44 PM, Andrew Halaney wrote: > On Fri, Sep 29, 2023 at 12:22:16PM -0700, Bartosz Golaszewski wrote: >> On Fri, 29 Sep 2023 21:16:51 +0200, Andrew Halaney <ahalaney@redhat.com> said: >>> On Thu, Sep 28, 2023 at 11:20:35AM +0200, Bartosz Golaszewski wrote: >>>> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> >>>> >>>> Let's use the new SCM memory allocator to obtain a buffer for this call >>>> instead of using dma_alloc_coherent(). >>>> >>>> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> >>>> --- >>>> drivers/firmware/qcom/qcom_scm.c | 16 +++++----------- >>>> 1 file changed, 5 insertions(+), 11 deletions(-) >>>> >>>> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c >>>> index 02a773ba1383..c0eb81069847 100644 >>>> --- a/drivers/firmware/qcom/qcom_scm.c >>>> +++ b/drivers/firmware/qcom/qcom_scm.c >>>> @@ -532,7 +532,7 @@ static void qcom_scm_set_download_mode(bool enable) >>>> int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, >>>> struct qcom_scm_pas_metadata *ctx) >>>> { >>>> - dma_addr_t mdata_phys; >>>> + phys_addr_t mdata_phys; >>> >>>> void *mdata_buf; >>>> int ret; >>>> struct qcom_scm_desc desc = { >>>> @@ -544,13 +544,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, >>>> }; >>>> struct qcom_scm_res res; >>>> >>>> - /* >>>> - * During the scm call memory protection will be enabled for the meta >>>> - * data blob, so make sure it's physically contiguous, 4K aligned and >>>> - * non-cachable to avoid XPU violations. >>>> - */ >>>> - mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys, >>>> - GFP_KERNEL); >>>> + mdata_buf = qcom_scm_mem_alloc(size, GFP_KERNEL); >>> >>> mdata_phys is never initialized now, and its what's being shoved into >>> desc.args[1] later, which I believe is what triggered the -EINVAL >>> with qcom_scm_call() that I reported in my cover letter reply this >>> morning. >>> >>> Prior with the DMA API that would have been the device view of the buffer. >>> >> >> Gah! Thanks for finding this. >> >> Can you try the following diff? >> >> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c >> index 794388c3212f..b0d4ea237034 100644 >> --- a/drivers/firmware/qcom/qcom_scm.c >> +++ b/drivers/firmware/qcom/qcom_scm.c >> @@ -556,6 +556,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const >> void *metadata, size_t size, >> dev_err(__scm->dev, "Allocation of metadata buffer failed.\n"); >> return -ENOMEM; >> } >> + mdata_phys = qcom_scm_mem_to_phys(mdata_buf); >> memcpy(mdata_buf, metadata, size); >> >> ret = qcom_scm_clk_enable(); >> @@ -578,7 +579,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const >> void *metadata, size_t size, >> qcom_scm_mem_free(mdata_buf); >> } else if (ctx) { >> ctx->ptr = mdata_buf; >> - ctx->phys = qcom_scm_mem_to_phys(mdata_buf); >> + ctx->phys = mdata_phys; >> ctx->size = size; >> } >> >> Bart >> > > For some reason that I can't explain that is still not working. It seems > the SMC call is returning !0 and then we return -EINVAL from there > with qcom_scm_remap_error(). > > Here's a really crummy diff of what I hacked in during lunch to debug (don't > judge my primitive debug skills): > I don't know what you're talking about :-) > diff --git a/drivers/firmware/qcom/qcom_scm-smc.c b/drivers/firmware/qcom/qcom_scm-smc.c > index 0d5554df1321..56eab0ae5f3a 100644 > --- a/drivers/firmware/qcom/qcom_scm-smc.c > +++ b/drivers/firmware/qcom/qcom_scm-smc.c > @@ -162,6 +162,8 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, > struct arm_smccc_res smc_res; > struct arm_smccc_args smc = {0}; > > + dev_err(dev, "%s: %d: We are in this function\n", __func__, __LINE__); > + > smc.args[0] = ARM_SMCCC_CALL_VAL( > smccc_call_type, > qcom_smccc_convention, > @@ -174,6 +176,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, > if (unlikely(arglen > SCM_SMC_N_REG_ARGS)) { > alloc_len = SCM_SMC_N_EXT_ARGS * sizeof(u64); > args_virt = qcom_scm_mem_alloc(PAGE_ALIGN(alloc_len), flag); > + dev_err(dev, "%s: %d: Hit the unlikely case!\n", __func__, __LINE__); > > if (!args_virt) > return -ENOMEM; > @@ -197,6 +200,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, > > /* ret error check follows after args_virt cleanup*/ > ret = __scm_smc_do(dev, &smc, &smc_res, atomic); > + dev_err(dev, "%s: %d: ret: %d\n", __func__, __LINE__, ret); > > if (ret) > return ret; > @@ -205,8 +209,10 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, > res->result[0] = smc_res.a1; > res->result[1] = smc_res.a2; > res->result[2] = smc_res.a3; > + dev_err(dev, "%s: %d: 0: %llu, 1: %llu: 2: %llu\n", __func__, __LINE__, res->result[0], res->result[1], res->result[2]); > } > > + dev_err(dev, "%s: %d: smc_res.a0: %lu\n", __func__, __LINE__, smc_res.a0); > return (long)smc_res.a0 ? qcom_scm_remap_error(smc_res.a0) : 0; > > > And that all spams dmesg successfully for most cases, but the > pas_init_image calls log this out: > > [ 16.362965] remoteproc remoteproc1: powering up 1b300000.remoteproc > [ 16.364897] remoteproc remoteproc1: Booting fw image qcom/sc8280xp/LENOVO/21BX/qccdsp8280.mbn, size 3575808 > [ 16.365009] qcom_scm firmware:scm: __scm_smc_call: 165: We are in this function > [ 16.365251] qcom_scm firmware:scm: __scm_smc_call: 203: ret: 0 > [ 16.365256] qcom_scm firmware:scm: __scm_smc_call: 212: 0: 0, 1: 0: 2: 0 > [ 16.365261] qcom_scm firmware:scm: __scm_smc_call: 215: smc_res.a0: 4291821558 > > At the moment I am unsure why... > Does the issue appear right after taking patch 6 or does it only appear after taking the whole series? If it's just to this patch, then maybe something wrong with the refactor: shm bridge isn't enabled at this point in the series.
On Mon, Oct 02, 2023 at 08:24:09AM -0500, Andrew Halaney wrote: > On Fri, Sep 29, 2023 at 03:48:37PM -0700, Elliot Berman wrote: > > Hi Andrew, > > > > On 9/29/2023 1:44 PM, Andrew Halaney wrote: > > > On Fri, Sep 29, 2023 at 12:22:16PM -0700, Bartosz Golaszewski wrote: > > >> On Fri, 29 Sep 2023 21:16:51 +0200, Andrew Halaney <ahalaney@redhat.com> said: > > >>> On Thu, Sep 28, 2023 at 11:20:35AM +0200, Bartosz Golaszewski wrote: > > >>>> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > >>>> > > >>>> Let's use the new SCM memory allocator to obtain a buffer for this call > > >>>> instead of using dma_alloc_coherent(). > > >>>> > > >>>> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > >>>> --- > > >>>> drivers/firmware/qcom/qcom_scm.c | 16 +++++----------- > > >>>> 1 file changed, 5 insertions(+), 11 deletions(-) > > >>>> > > >>>> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c > > >>>> index 02a773ba1383..c0eb81069847 100644 > > >>>> --- a/drivers/firmware/qcom/qcom_scm.c > > >>>> +++ b/drivers/firmware/qcom/qcom_scm.c > > >>>> @@ -532,7 +532,7 @@ static void qcom_scm_set_download_mode(bool enable) > > >>>> int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, > > >>>> struct qcom_scm_pas_metadata *ctx) > > >>>> { > > >>>> - dma_addr_t mdata_phys; > > >>>> + phys_addr_t mdata_phys; > > >>> > > >>>> void *mdata_buf; > > >>>> int ret; > > >>>> struct qcom_scm_desc desc = { > > >>>> @@ -544,13 +544,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, > > >>>> }; > > >>>> struct qcom_scm_res res; > > >>>> > > >>>> - /* > > >>>> - * During the scm call memory protection will be enabled for the meta > > >>>> - * data blob, so make sure it's physically contiguous, 4K aligned and > > >>>> - * non-cachable to avoid XPU violations. > > >>>> - */ > > >>>> - mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys, > > >>>> - GFP_KERNEL); > > >>>> + mdata_buf = qcom_scm_mem_alloc(size, GFP_KERNEL); > > >>> > > >>> mdata_phys is never initialized now, and its what's being shoved into > > >>> desc.args[1] later, which I believe is what triggered the -EINVAL > > >>> with qcom_scm_call() that I reported in my cover letter reply this > > >>> morning. > > >>> > > >>> Prior with the DMA API that would have been the device view of the buffer. > > >>> > > >> > > >> Gah! Thanks for finding this. > > >> > > >> Can you try the following diff? > > >> > > >> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c > > >> index 794388c3212f..b0d4ea237034 100644 > > >> --- a/drivers/firmware/qcom/qcom_scm.c > > >> +++ b/drivers/firmware/qcom/qcom_scm.c > > >> @@ -556,6 +556,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const > > >> void *metadata, size_t size, > > >> dev_err(__scm->dev, "Allocation of metadata buffer failed.\n"); > > >> return -ENOMEM; > > >> } > > >> + mdata_phys = qcom_scm_mem_to_phys(mdata_buf); > > >> memcpy(mdata_buf, metadata, size); > > >> > > >> ret = qcom_scm_clk_enable(); > > >> @@ -578,7 +579,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const > > >> void *metadata, size_t size, > > >> qcom_scm_mem_free(mdata_buf); > > >> } else if (ctx) { > > >> ctx->ptr = mdata_buf; > > >> - ctx->phys = qcom_scm_mem_to_phys(mdata_buf); > > >> + ctx->phys = mdata_phys; > > >> ctx->size = size; > > >> } > > >> > > >> Bart > > >> > > > > > > For some reason that I can't explain that is still not working. It seems > > > the SMC call is returning !0 and then we return -EINVAL from there > > > with qcom_scm_remap_error(). > > > > > > Here's a really crummy diff of what I hacked in during lunch to debug (don't > > > judge my primitive debug skills): > > > > > > > I don't know what you're talking about :-) > > > > > diff --git a/drivers/firmware/qcom/qcom_scm-smc.c b/drivers/firmware/qcom/qcom_scm-smc.c > > > index 0d5554df1321..56eab0ae5f3a 100644 > > > --- a/drivers/firmware/qcom/qcom_scm-smc.c > > > +++ b/drivers/firmware/qcom/qcom_scm-smc.c > > > @@ -162,6 +162,8 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, > > > struct arm_smccc_res smc_res; > > > struct arm_smccc_args smc = {0}; > > > > > > + dev_err(dev, "%s: %d: We are in this function\n", __func__, __LINE__); > > > + > > > smc.args[0] = ARM_SMCCC_CALL_VAL( > > > smccc_call_type, > > > qcom_smccc_convention, > > > @@ -174,6 +176,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, > > > if (unlikely(arglen > SCM_SMC_N_REG_ARGS)) { > > > alloc_len = SCM_SMC_N_EXT_ARGS * sizeof(u64); > > > args_virt = qcom_scm_mem_alloc(PAGE_ALIGN(alloc_len), flag); > > > + dev_err(dev, "%s: %d: Hit the unlikely case!\n", __func__, __LINE__); > > > > > > if (!args_virt) > > > return -ENOMEM; > > > @@ -197,6 +200,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, > > > > > > /* ret error check follows after args_virt cleanup*/ > > > ret = __scm_smc_do(dev, &smc, &smc_res, atomic); > > > + dev_err(dev, "%s: %d: ret: %d\n", __func__, __LINE__, ret); > > > > > > if (ret) > > > return ret; > > > @@ -205,8 +209,10 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, > > > res->result[0] = smc_res.a1; > > > res->result[1] = smc_res.a2; > > > res->result[2] = smc_res.a3; > > > + dev_err(dev, "%s: %d: 0: %llu, 1: %llu: 2: %llu\n", __func__, __LINE__, res->result[0], res->result[1], res->result[2]); > > > } > > > > > > + dev_err(dev, "%s: %d: smc_res.a0: %lu\n", __func__, __LINE__, smc_res.a0); > > > return (long)smc_res.a0 ? qcom_scm_remap_error(smc_res.a0) : 0; > > > > > > > > > And that all spams dmesg successfully for most cases, but the > > > pas_init_image calls log this out: > > > > > > [ 16.362965] remoteproc remoteproc1: powering up 1b300000.remoteproc > > > [ 16.364897] remoteproc remoteproc1: Booting fw image qcom/sc8280xp/LENOVO/21BX/qccdsp8280.mbn, size 3575808 > > > [ 16.365009] qcom_scm firmware:scm: __scm_smc_call: 165: We are in this function > > > [ 16.365251] qcom_scm firmware:scm: __scm_smc_call: 203: ret: 0 > > > [ 16.365256] qcom_scm firmware:scm: __scm_smc_call: 212: 0: 0, 1: 0: 2: 0 > > > [ 16.365261] qcom_scm firmware:scm: __scm_smc_call: 215: smc_res.a0: 4291821558 > > > > > > At the moment I am unsure why... > > > > > Does the issue appear right after taking patch 6 or does it only appear after taking > > the whole series? If it's just to this patch, then maybe something wrong with > > the refactor: shm bridge isn't enabled at this point in the series. > > > > I've only been testing the series as a whole on top of a 6.6 based > branch, I'm going to try and test some more today to see if just the > allocator bits (but not the SHM bridge enablement) works alright for > me. > After testing a little more with the fix Bart sent above, the allocator refactor seems to work well. With "firmware: qcom: scm: enable SHM bridge" applied is when I see the errors I pointed out above. All prior patches cause no problems on boot for me. For patches 1-9 (with the fix sent here for patch 6) please feel free to add: Tested-by: Andrew Halaney <ahalaney@redhat.com> # sc8280xp-lenovo-thinkpad-x13s If anyone has suggestions for debugging why the switch to using SHM bridge is breaking firmware loading for me, I'm willing to give that a try. Thanks, Andrew
On Mon, Oct 2, 2023 at 4:15 PM Andrew Halaney <ahalaney@redhat.com> wrote: > > On Mon, Oct 02, 2023 at 08:24:09AM -0500, Andrew Halaney wrote: > > On Fri, Sep 29, 2023 at 03:48:37PM -0700, Elliot Berman wrote: > > > Hi Andrew, > > > > > > On 9/29/2023 1:44 PM, Andrew Halaney wrote: > > > > On Fri, Sep 29, 2023 at 12:22:16PM -0700, Bartosz Golaszewski wrote: > > > >> On Fri, 29 Sep 2023 21:16:51 +0200, Andrew Halaney <ahalaney@redhat.com> said: > > > >>> On Thu, Sep 28, 2023 at 11:20:35AM +0200, Bartosz Golaszewski wrote: > > > >>>> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > > >>>> > > > >>>> Let's use the new SCM memory allocator to obtain a buffer for this call > > > >>>> instead of using dma_alloc_coherent(). > > > >>>> > > > >>>> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > > >>>> --- > > > >>>> drivers/firmware/qcom/qcom_scm.c | 16 +++++----------- > > > >>>> 1 file changed, 5 insertions(+), 11 deletions(-) > > > >>>> > > > >>>> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c > > > >>>> index 02a773ba1383..c0eb81069847 100644 > > > >>>> --- a/drivers/firmware/qcom/qcom_scm.c > > > >>>> +++ b/drivers/firmware/qcom/qcom_scm.c > > > >>>> @@ -532,7 +532,7 @@ static void qcom_scm_set_download_mode(bool enable) > > > >>>> int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, > > > >>>> struct qcom_scm_pas_metadata *ctx) > > > >>>> { > > > >>>> - dma_addr_t mdata_phys; > > > >>>> + phys_addr_t mdata_phys; > > > >>> > > > >>>> void *mdata_buf; > > > >>>> int ret; > > > >>>> struct qcom_scm_desc desc = { > > > >>>> @@ -544,13 +544,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, > > > >>>> }; > > > >>>> struct qcom_scm_res res; > > > >>>> > > > >>>> - /* > > > >>>> - * During the scm call memory protection will be enabled for the meta > > > >>>> - * data blob, so make sure it's physically contiguous, 4K aligned and > > > >>>> - * non-cachable to avoid XPU violations. > > > >>>> - */ > > > >>>> - mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys, > > > >>>> - GFP_KERNEL); > > > >>>> + mdata_buf = qcom_scm_mem_alloc(size, GFP_KERNEL); > > > >>> > > > >>> mdata_phys is never initialized now, and its what's being shoved into > > > >>> desc.args[1] later, which I believe is what triggered the -EINVAL > > > >>> with qcom_scm_call() that I reported in my cover letter reply this > > > >>> morning. > > > >>> > > > >>> Prior with the DMA API that would have been the device view of the buffer. > > > >>> > > > >> > > > >> Gah! Thanks for finding this. > > > >> > > > >> Can you try the following diff? > > > >> > > > >> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c > > > >> index 794388c3212f..b0d4ea237034 100644 > > > >> --- a/drivers/firmware/qcom/qcom_scm.c > > > >> +++ b/drivers/firmware/qcom/qcom_scm.c > > > >> @@ -556,6 +556,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const > > > >> void *metadata, size_t size, > > > >> dev_err(__scm->dev, "Allocation of metadata buffer failed.\n"); > > > >> return -ENOMEM; > > > >> } > > > >> + mdata_phys = qcom_scm_mem_to_phys(mdata_buf); > > > >> memcpy(mdata_buf, metadata, size); > > > >> > > > >> ret = qcom_scm_clk_enable(); > > > >> @@ -578,7 +579,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const > > > >> void *metadata, size_t size, > > > >> qcom_scm_mem_free(mdata_buf); > > > >> } else if (ctx) { > > > >> ctx->ptr = mdata_buf; > > > >> - ctx->phys = qcom_scm_mem_to_phys(mdata_buf); > > > >> + ctx->phys = mdata_phys; > > > >> ctx->size = size; > > > >> } > > > >> > > > >> Bart > > > >> > > > > > > > > For some reason that I can't explain that is still not working. It seems > > > > the SMC call is returning !0 and then we return -EINVAL from there > > > > with qcom_scm_remap_error(). > > > > > > > > Here's a really crummy diff of what I hacked in during lunch to debug (don't > > > > judge my primitive debug skills): > > > > > > > > > > I don't know what you're talking about :-) > > > > > > > diff --git a/drivers/firmware/qcom/qcom_scm-smc.c b/drivers/firmware/qcom/qcom_scm-smc.c > > > > index 0d5554df1321..56eab0ae5f3a 100644 > > > > --- a/drivers/firmware/qcom/qcom_scm-smc.c > > > > +++ b/drivers/firmware/qcom/qcom_scm-smc.c > > > > @@ -162,6 +162,8 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, > > > > struct arm_smccc_res smc_res; > > > > struct arm_smccc_args smc = {0}; > > > > > > > > + dev_err(dev, "%s: %d: We are in this function\n", __func__, __LINE__); > > > > + > > > > smc.args[0] = ARM_SMCCC_CALL_VAL( > > > > smccc_call_type, > > > > qcom_smccc_convention, > > > > @@ -174,6 +176,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, > > > > if (unlikely(arglen > SCM_SMC_N_REG_ARGS)) { > > > > alloc_len = SCM_SMC_N_EXT_ARGS * sizeof(u64); > > > > args_virt = qcom_scm_mem_alloc(PAGE_ALIGN(alloc_len), flag); > > > > + dev_err(dev, "%s: %d: Hit the unlikely case!\n", __func__, __LINE__); > > > > > > > > if (!args_virt) > > > > return -ENOMEM; > > > > @@ -197,6 +200,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, > > > > > > > > /* ret error check follows after args_virt cleanup*/ > > > > ret = __scm_smc_do(dev, &smc, &smc_res, atomic); > > > > + dev_err(dev, "%s: %d: ret: %d\n", __func__, __LINE__, ret); > > > > > > > > if (ret) > > > > return ret; > > > > @@ -205,8 +209,10 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc, > > > > res->result[0] = smc_res.a1; > > > > res->result[1] = smc_res.a2; > > > > res->result[2] = smc_res.a3; > > > > + dev_err(dev, "%s: %d: 0: %llu, 1: %llu: 2: %llu\n", __func__, __LINE__, res->result[0], res->result[1], res->result[2]); > > > > } > > > > > > > > + dev_err(dev, "%s: %d: smc_res.a0: %lu\n", __func__, __LINE__, smc_res.a0); > > > > return (long)smc_res.a0 ? qcom_scm_remap_error(smc_res.a0) : 0; > > > > > > > > > > > > And that all spams dmesg successfully for most cases, but the > > > > pas_init_image calls log this out: > > > > > > > > [ 16.362965] remoteproc remoteproc1: powering up 1b300000.remoteproc > > > > [ 16.364897] remoteproc remoteproc1: Booting fw image qcom/sc8280xp/LENOVO/21BX/qccdsp8280.mbn, size 3575808 > > > > [ 16.365009] qcom_scm firmware:scm: __scm_smc_call: 165: We are in this function > > > > [ 16.365251] qcom_scm firmware:scm: __scm_smc_call: 203: ret: 0 > > > > [ 16.365256] qcom_scm firmware:scm: __scm_smc_call: 212: 0: 0, 1: 0: 2: 0 > > > > [ 16.365261] qcom_scm firmware:scm: __scm_smc_call: 215: smc_res.a0: 4291821558 > > > > > > > > At the moment I am unsure why... > > > > > > > Does the issue appear right after taking patch 6 or does it only appear after taking > > > the whole series? If it's just to this patch, then maybe something wrong with > > > the refactor: shm bridge isn't enabled at this point in the series. > > > > > > > I've only been testing the series as a whole on top of a 6.6 based > > branch, I'm going to try and test some more today to see if just the > > allocator bits (but not the SHM bridge enablement) works alright for > > me. > > > > After testing a little more with the fix Bart sent above, > the allocator refactor seems to work well. > With "firmware: qcom: scm: enable SHM bridge" applied is when I see the > errors I pointed out above. All prior patches cause no problems on boot > for me. > > For patches 1-9 (with the fix sent here for patch 6) please feel free > to add: > > Tested-by: Andrew Halaney <ahalaney@redhat.com> # sc8280xp-lenovo-thinkpad-x13s > > If anyone has suggestions for debugging why the switch to using > SHM bridge is breaking firmware loading for me, I'm willing to give > that a try. > Is it possible that sc8280xp doesn't support SHM bridge but for some reason __qcom_scm_is_call_available() returns true when checking if it does? That's the only thing that comes to my mind ATM. Bart
diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c index 02a773ba1383..c0eb81069847 100644 --- a/drivers/firmware/qcom/qcom_scm.c +++ b/drivers/firmware/qcom/qcom_scm.c @@ -532,7 +532,7 @@ static void qcom_scm_set_download_mode(bool enable) int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, struct qcom_scm_pas_metadata *ctx) { - dma_addr_t mdata_phys; + phys_addr_t mdata_phys; void *mdata_buf; int ret; struct qcom_scm_desc desc = { @@ -544,13 +544,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, }; struct qcom_scm_res res; - /* - * During the scm call memory protection will be enabled for the meta - * data blob, so make sure it's physically contiguous, 4K aligned and - * non-cachable to avoid XPU violations. - */ - mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys, - GFP_KERNEL); + mdata_buf = qcom_scm_mem_alloc(size, GFP_KERNEL); if (!mdata_buf) { dev_err(__scm->dev, "Allocation of metadata buffer failed.\n"); return -ENOMEM; @@ -574,10 +568,10 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, out: if (ret < 0 || !ctx) { - dma_free_coherent(__scm->dev, size, mdata_buf, mdata_phys); + qcom_scm_mem_free(mdata_buf); } else if (ctx) { ctx->ptr = mdata_buf; - ctx->phys = mdata_phys; + ctx->phys = qcom_scm_mem_to_phys(mdata_buf); ctx->size = size; } @@ -594,7 +588,7 @@ void qcom_scm_pas_metadata_release(struct qcom_scm_pas_metadata *ctx) if (!ctx->ptr) return; - dma_free_coherent(__scm->dev, ctx->size, ctx->ptr, ctx->phys); + qcom_scm_mem_free(ctx->ptr); ctx->ptr = NULL; ctx->phys = 0;