Message ID | 5cdad89c-282a-4df5-a286-b8404bc4dd81@freebox.fr |
---|---|
State | Superseded |
Headers | show |
Series | Work around missing MSA_READY indicator for msm8998 devices | expand |
On 30.03.2024 7:25 PM, Krzysztof Kozlowski wrote: > On 28/03/2024 18:39, Marc Gonzalez wrote: >> The ath10k driver waits for an "MSA_READY" indicator >> to complete initialization. If the indicator is not >> received, then the device remains unusable. >> >> cf. ath10k_qmi_driver_event_work() >> >> Several msm8998-based devices are affected by this issue. >> Oddly, it seems safe to NOT wait for the indicator, and >> proceed immediately when QMI_EVENT_SERVER_ARRIVE. >> >> Jeff Johnson wrote: >> >> The feedback I received was "it might be ok to change all ath10k qmi >> to skip waiting for msa_ready", and it was pointed out that ath11k >> (and ath12k) do not wait for it. >> >> However with so many deployed devices, "might be ok" isn't a strong >> argument for changing the default behavior. >> > > I think you got pretty clear comments: > > "This sounds more like a firmware feature, not a hardware feature." > > "This is why having this property in DT does not look right > place for this." Translating from dt maintainer speak to English, a functionally-equivalent resolution of adding an of_machine_is_compatible("qcom,msm8998") is more in line with the guidelines of not sprinkling firmware specifics in DTs Konrad
On 02/04/2024 16:34, Konrad Dybcio wrote: > On 30.03.2024 7:25 PM, Krzysztof Kozlowski wrote: > >> On 28/03/2024 18:39, Marc Gonzalez wrote: >> >>> The ath10k driver waits for an "MSA_READY" indicator >>> to complete initialization. If the indicator is not >>> received, then the device remains unusable. >>> >>> cf. ath10k_qmi_driver_event_work() >>> >>> Several msm8998-based devices are affected by this issue. >>> Oddly, it seems safe to NOT wait for the indicator, and >>> proceed immediately when QMI_EVENT_SERVER_ARRIVE. >>> >>> Jeff Johnson wrote: >>> >>> The feedback I received was "it might be ok to change all ath10k qmi >>> to skip waiting for msa_ready", and it was pointed out that ath11k >>> (and ath12k) do not wait for it. >>> >>> However with so many deployed devices, "might be ok" isn't a strong >>> argument for changing the default behavior. >> >> I think you got pretty clear comments: >> >> "This sounds more like a firmware feature, not a hardware feature." >> >> "This is why having this property in DT does not look right >> place for this." > > Translating from dt maintainer speak to English, a functionally-equivalent > resolution of adding an of_machine_is_compatible("qcom,msm8998") is more > in line with the guidelines of not sprinkling firmware specifics in DTs I'm not so sure about that, as I had proposed + if (of_device_is_compatible(of_root, "qcom,msm8998") + qmi->no_point_in_waiting_for_msa_ready_indicator = true; + To which Conor replied: > How come the root node comes into this, don't you have a soc-specific > compatible for the integration on this SoC? > (I am assuming that this is not the SDIO variant, given then it'd not be > fixed to this particular implementation) Then added: > A SoC-specific compatible sounds like it would be suitable in that case > then, to deal with integration quirks for that specific SoC? I usually > leave the ins and outs of these qcom SoCs to Krzysztof, but I can't help > but wanna know what the justification is here for not using one. Then Krzysztof added: > The WiFi+BT chips are separate products, so they are not usually > considered part of the SoC, even though they can be integrated into the > SoC like here. I guess correct approach would be to add SoC-specific > compatible for them. So, if I understand correctly, I take this to mean that I should: 1) DELETE the qcom,no-msa-ready-indicator boolean property, 2) ADD a "qcom,msm8998-wifi" (name OK?) compatible, 3) ADD that compatible to the wifi node in msm8998.dtsi compatible = "qcom,wcn3990-wifi", "qcom,msm8998-wifi"; 4) In the driver, set qmi->fake_msa_ready_indicator to true if we detect "qcom,msm8998-wifi" And this approach would be acceptable to both ath10k & DT maintainers? Bjarne, Konrad: is it OK to apply the work-around for all msm8998 boards? Regards
On Tue, 2 Apr 2024 at 18:31, Marc Gonzalez <mgonzalez@freebox.fr> wrote: > > On 02/04/2024 16:34, Konrad Dybcio wrote: > > > On 30.03.2024 7:25 PM, Krzysztof Kozlowski wrote: > > > >> On 28/03/2024 18:39, Marc Gonzalez wrote: > >> > >>> The ath10k driver waits for an "MSA_READY" indicator > >>> to complete initialization. If the indicator is not > >>> received, then the device remains unusable. > >>> > >>> cf. ath10k_qmi_driver_event_work() > >>> > >>> Several msm8998-based devices are affected by this issue. > >>> Oddly, it seems safe to NOT wait for the indicator, and > >>> proceed immediately when QMI_EVENT_SERVER_ARRIVE. > >>> > >>> Jeff Johnson wrote: > >>> > >>> The feedback I received was "it might be ok to change all ath10k qmi > >>> to skip waiting for msa_ready", and it was pointed out that ath11k > >>> (and ath12k) do not wait for it. > >>> > >>> However with so many deployed devices, "might be ok" isn't a strong > >>> argument for changing the default behavior. > >> > >> I think you got pretty clear comments: > >> > >> "This sounds more like a firmware feature, not a hardware feature." > >> > >> "This is why having this property in DT does not look right > >> place for this." > > > > Translating from dt maintainer speak to English, a functionally-equivalent > > resolution of adding an of_machine_is_compatible("qcom,msm8998") is more > > in line with the guidelines of not sprinkling firmware specifics in DTs > > I'm not so sure about that, as I had proposed > > + if (of_device_is_compatible(of_root, "qcom,msm8998") > + qmi->no_point_in_waiting_for_msa_ready_indicator = true; > + > > To which Conor replied: > > > How come the root node comes into this, don't you have a soc-specific > > compatible for the integration on this SoC? > > (I am assuming that this is not the SDIO variant, given then it'd not be > > fixed to this particular implementation) > > > Then added: > > > A SoC-specific compatible sounds like it would be suitable in that case > > then, to deal with integration quirks for that specific SoC? I usually > > leave the ins and outs of these qcom SoCs to Krzysztof, but I can't help > > but wanna know what the justification is here for not using one. > > > Then Krzysztof added: > > > The WiFi+BT chips are separate products, so they are not usually > > considered part of the SoC, even though they can be integrated into the > > SoC like here. I guess correct approach would be to add SoC-specific > > compatible for them. > > > So, if I understand correctly, I take this to mean that I should: > > 1) DELETE the qcom,no-msa-ready-indicator boolean property, > 2) ADD a "qcom,msm8998-wifi" (name OK?) compatible, I'd say, this is not correct. There is no "msm8998-wifi". > 3) ADD that compatible to the wifi node in msm8998.dtsi > compatible = "qcom,wcn3990-wifi", "qcom,msm8998-wifi"; > 4) In the driver, set qmi->fake_msa_ready_indicator to true if we detect "qcom,msm8998-wifi" > > And this approach would be acceptable to both ath10k & DT maintainers? I'd say, we should take a step back and actually verify how this was handled in the vendor kernel. > > Bjarne, Konrad: is it OK to apply the work-around for all msm8998 boards?
On 4/2/2024 8:55 AM, Dmitry Baryshkov wrote: > I'd say, we should take a step back and actually verify how this was > handled in the vendor kernel. (error handling and other non-QMI code removed from the following for readability) In ath10k we unconditionally call the following in ath10k_qmi_event_server_arrive(): ret = ath10k_qmi_host_cap_send_sync(qmi); ret = ath10k_qmi_msa_mem_info_send_sync_msg(qmi); ret = ath10k_qmi_setup_msa_permissions(qmi); ret = ath10k_qmi_msa_ready_send_sync_msg(qmi); ret = ath10k_qmi_cap_send_sync_msg(qmi); In vendor icnss2 there is conditional logic in icnss_driver_event_server_arrive(): if (priv->device_id == WCN6750_DEVICE_ID || priv->device_id == WCN6450_DEVICE_ID) { ret = wlfw_host_cap_send_sync(priv); } if (priv->device_id == ADRASTEA_DEVICE_ID) { ret = wlfw_msa_mem_info_send_sync_msg(priv); ret = wlfw_msa_ready_send_sync_msg(priv); } ret = wlfw_cap_send_sync_msg(priv); reference: https://git.codelinaro.org/clo/la/platform/vendor/qcom-opensource/wlan/platform/-/blob/wlan-platform.lnx.1.0.r1-rel/icnss2/main.c?ref_type=heads#L890 /jeff
On 02.04.2024 18:55, Dmitry Baryshkov wrote: > I'd say, we should take a step back and actually verify how this was > handled in the vendor kernel. AFAIK there is no such thing in vendor kernel driver for this, as this startup procedure is likely handled entirely in userspace in cnss_daemon. By the way this workaround is needed also for Wi-Fi in sdm630/660, so no not only msm8998 suffers from this. > This sounds more like a firmware feature, not a hardware feature > having this property in DT does not look right I agree with these 2 points above. This can be handled more nicely as firmware feature encoded in firmware-5.bin using ath10k-fwencoder and not involve any new DT compatibles or properties.
On Tue, 2 Apr 2024 at 21:22, Jeff Johnson <quic_jjohnson@quicinc.com> wrote: > > On 4/2/2024 8:55 AM, Dmitry Baryshkov wrote: > > I'd say, we should take a step back and actually verify how this was > > handled in the vendor kernel. > > (error handling and other non-QMI code removed from the following for readability) > > In ath10k we unconditionally call the following in > ath10k_qmi_event_server_arrive(): > ret = ath10k_qmi_host_cap_send_sync(qmi); > ret = ath10k_qmi_msa_mem_info_send_sync_msg(qmi); > ret = ath10k_qmi_setup_msa_permissions(qmi); > ret = ath10k_qmi_msa_ready_send_sync_msg(qmi); > ret = ath10k_qmi_cap_send_sync_msg(qmi); > > In vendor icnss2 there is conditional logic in icnss_driver_event_server_arrive(): Note, wcn3990 is icnss, not icnss2 > if (priv->device_id == WCN6750_DEVICE_ID || > priv->device_id == WCN6450_DEVICE_ID) { > ret = wlfw_host_cap_send_sync(priv); > } > > if (priv->device_id == ADRASTEA_DEVICE_ID) { > ret = wlfw_msa_mem_info_send_sync_msg(priv); > ret = wlfw_msa_ready_send_sync_msg(priv); > } The problem with applying this approach is that here the discriminator is the WiFi device ID. WCN6750, WCN6450 and this ADRASTEA are different WiFi/BT chips. However for msm8998 and e.g. sdm845 there is no easy way to distinguish the WiFi chips. Both platforms use wcn3990 device. > > ret = wlfw_cap_send_sync_msg(priv); > > reference: > https://git.codelinaro.org/clo/la/platform/vendor/qcom-opensource/wlan/platform/-/blob/wlan-platform.lnx.1.0.r1-rel/icnss2/main.c?ref_type=heads#L890 > > /jeff
On Tue, 2 Apr 2024 at 21:25, Alexey Minnekhanov <alexeymin@postmarketos.org> wrote: > > > > On 02.04.2024 18:55, Dmitry Baryshkov wrote: > > I'd say, we should take a step back and actually verify how this was > > handled in the vendor kernel. > > > AFAIK there is no such thing in vendor kernel driver for this, as > this startup procedure is likely handled entirely in userspace in > cnss_daemon. > > By the way this workaround is needed also for Wi-Fi in sdm630/660, > so no not only msm8998 suffers from this. Interesting. I have an sdm660 platform. I think I should be able to check these workarounds then. > > > This sounds more like a firmware feature, not a hardware feature > > > having this property in DT does not look right > > I agree with these 2 points above. This can be handled more nicely > as firmware feature encoded in firmware-5.bin using ath10k-fwencoder > and not involve any new DT compatibles or properties. I think Marc has already tried this. The firmware-N.bin, so-called "boot firmware" (because for normal devices it also contains the actual firmware), is loaded much later. See https://lore.kernel.org/ath10k/243a97b7-c298-4307-9f06-8b3a7c3e24fd@freebox.fr/ Probably we have an option of loading the firmware earlier, so that at this stage we already know the quirks set in the firmware-5.bin. But note, I haven't checked if at this point the driver has all the information to select correct firmware blob.
On 4/2/2024 11:25 AM, Alexey Minnekhanov wrote: > > > On 02.04.2024 18:55, Dmitry Baryshkov wrote: >> I'd say, we should take a step back and actually verify how this was >> handled in the vendor kernel. > > > AFAIK there is no such thing in vendor kernel driver for this, as > this startup procedure is likely handled entirely in userspace in > cnss_daemon. > So now I'm looking at the cnss-daemon, and there it appears that MSA READY is always expected to be received. There is target-specific logic to set the flags sent to firmware: if (gdata->instance_id == ADRASTEA_ID) { req.msa_ready_enable_valid = 1; req.msa_ready_enable = 1; } else { /* All targets other than Adrastea */ req.fw_mem_ready_enable_valid = 1; req.fw_mem_ready_enable = 1; } Logic to set an internal flag if the message is received: case QMI_WLFW_MSA_READY_IND_V01: gdata->state |= CNSS_MSA_READY; And Adrastea-specific logic to set that flag if it is set in a separate status indicator: static int wlfw_adrastea_init(struct wlfw_client_data *gdata) [...] if (fw_status & QMI_WLFW_MSA_READY_V01) { wsvc_printf_dbg("MSA is ready"); gdata->state |= CNSS_MSA_READY; } So that flag is only set by receiving the QMI_WLFW_MSA_READY_IND_V01 message or, only for Adrastea, if the response to wlfw_send_ind_register_req() indicates MSA_READY Later there is a wait for MSA_READY that has no conditions: while (!CNSS_IS_MSA_READY(gdata->state)) Truthfully this is code I've never had to deal with before, and I've struggled to find developers who have the necessary background. The least disruptive paths seem to either be the DT item or adding a new item for this in struct ath10k_hw_params. Kalle, do you have any different guidance?
On 02/04/2024 17:55, Dmitry Baryshkov wrote: > On Tue, 2 Apr 2024 at 18:31, Marc Gonzalez wrote: > >> So, if I understand correctly, I take this to mean that I should: >> >> 1) DELETE the qcom,no-msa-ready-indicator boolean property, >> 2) ADD a "qcom,msm8998-wifi" (name OK?) compatible, > > I'd say, this is not correct. There is no "msm8998-wifi". Can you explain what you mean by: 'There is no "msm8998-wifi".' ? Do you mean that: this compatible string does not exist? (I am proposing that it be created.) Or do you mean that: "msm8998-wifi" is a bad name? I meant to mimic these strings for various sub-blocks: arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-rpm-proc", "qcom,rpm-proc"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-rpmpd"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-qfprom", "qcom,qfprom"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-tsens", "qcom,tsens-v2"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-tsens", "qcom,tsens-v2"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-smmu-v2", "qcom,smmu-v2"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-smmu-v2", "qcom,smmu-v2"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-qmp-pcie-phy"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-ufshc", "qcom,ufshc", "jedec,ufs-2.0"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-qmp-ufs-phy"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-tcsr", "syscon"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-tcsr", "syscon"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-pinctrl"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-mss-pil"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-smmu-v2", "qcom,smmu-v2", arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-gpucc"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-slpi-pas"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-dwc3", "qcom,dwc3"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-qmp-usb3-phy"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-qusb2-phy"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-sdhci", "qcom,sdhci-msm-v4"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-mdss"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-dpu"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-dsi-ctrl", "qcom,mdss-dsi-ctrl"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-dsi-ctrl", "qcom,mdss-dsi-ctrl"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-venus"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-smmu-v2", "qcom,smmu-v2"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-adsp-pas"; arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-apcs-hmss-global", And these strings in ath11k: Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml: - qcom,ipq8074-wifi Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml: - qcom,ipq6018-wifi Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml: - qcom,wcn6750-wifi Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml: - qcom,ipq5018-wifi > I'd say, we should take a step back and actually verify how this was > handled in the vendor kernel. In our commercial product, we use the ath10k driver in the vendor kernel (v4.4 r38-rel). It looks like Jeff has already performed the code analysis wrt vendor vs mainline (including user-space tools). Regards
On Wed, 3 Apr 2024 at 16:05, Marc Gonzalez <mgonzalez@freebox.fr> wrote: > > On 02/04/2024 17:55, Dmitry Baryshkov wrote: > > > On Tue, 2 Apr 2024 at 18:31, Marc Gonzalez wrote: > > > >> So, if I understand correctly, I take this to mean that I should: > >> > >> 1) DELETE the qcom,no-msa-ready-indicator boolean property, > >> 2) ADD a "qcom,msm8998-wifi" (name OK?) compatible, > > > > I'd say, this is not correct. There is no "msm8998-wifi". > > Can you explain what you mean by: > 'There is no "msm8998-wifi".' ? > > Do you mean that: this compatible string does not exist? > (I am proposing that it be created.) > > Or do you mean that: "msm8998-wifi" is a bad name? I mean, it is qcom,wcn3990-wifi, because the chip is wcn3990. > And these strings in ath11k: > > Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml: - qcom,ipq8074-wifi > Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml: - qcom,ipq6018-wifi > Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml: - qcom,wcn6750-wifi > Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml: - qcom,ipq5018-wifi I must admit, I don't know the IPQ product naming (it well might be that it is both the name of the SoC and the WiFI SoC). > > > I'd say, we should take a step back and actually verify how this was > > handled in the vendor kernel. > > In our commercial product, we use the ath10k driver in the vendor kernel (v4.4 r38-rel). I see. > It looks like Jeff has already performed the code analysis > wrt vendor vs mainline (including user-space tools).
On 03/04/2024 15:05, Marc Gonzalez wrote: > On 02/04/2024 17:55, Dmitry Baryshkov wrote: > >> On Tue, 2 Apr 2024 at 18:31, Marc Gonzalez wrote: >> >>> So, if I understand correctly, I take this to mean that I should: >>> >>> 1) DELETE the qcom,no-msa-ready-indicator boolean property, >>> 2) ADD a "qcom,msm8998-wifi" (name OK?) compatible, >> >> I'd say, this is not correct. There is no "msm8998-wifi". > > Can you explain what you mean by: > 'There is no "msm8998-wifi".' ? > > Do you mean that: this compatible string does not exist? > (I am proposing that it be created.) > > Or do you mean that: "msm8998-wifi" is a bad name? msm8998 SoC does not have WiFi. > > > I meant to mimic these strings for various sub-blocks: > > arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-rpm-proc", "qcom,rpm-proc"; > arch/arm64/boot/dts/qcom/msm8998.dtsi: compatible = "qcom,msm8998-rpmpd"; These are all parts of SoC. What you are adding is not part of original SoC, I think. Best regards, Krzysztof
On 03/04/2024 16:12, Dmitry Baryshkov wrote:
> From [Jeff's] message it looks like we are expected to get MSA READY even on msm8998.
This is the code we're using:
https://git.codelinaro.org/clo/la/kernel/msm-4.4/-/blob/caf_migration/kernel.lnx.4.4.r38-rel/drivers/net/wireless/ath/ath10k/qmi.c
When ATH10K_SNOC_DRIVER_EVENT_SERVER_ARRIVE,
driver registers an "indicator handler"
ath10k_snoc_qmi_wlfw_clnt_ind()
It handles QMI_WLFW_FW_READY_IND_V01 by posting
ATH10K_SNOC_DRIVER_EVENT_FW_READY_IND
which is handled in the
ath10k_snoc_driver_event_work() work queue.
But QMI_WLFW_MSA_READY_IND_V01 only triggers
a debug log and setting qmi_cfg->msa_ready = true;
$ git grep '\<msa_ready\>'
drivers/net/wireless/ath/ath10k/qmi.c: qmi_cfg->msa_ready = true;
drivers/net/wireless/ath/ath10k/qmi.c: qmi_cfg->msa_ready = false;
drivers/net/wireless/ath/ath10k/qmi.h: * msa_ready: wlan firmware msa memory ready for board data download
drivers/net/wireless/ath/ath10k/qmi.h: bool msa_ready;
So basically, the vendor ath10k driver ignores QMI_WLFW_MSA_READY_IND_V01.
I will test the following patch which aligns the behavior
of mainline driver to that of vendor driver:
diff --git a/drivers/net/wireless/ath/ath10k/qmi.c b/drivers/net/wireless/ath/ath10k/qmi.c
index 38e939f572a9e..0e1ab5aca663b 100644
--- a/drivers/net/wireless/ath/ath10k/qmi.c
+++ b/drivers/net/wireless/ath/ath10k/qmi.c
@@ -1040,6 +1040,7 @@ static void ath10k_qmi_driver_event_work(struct work_struct *work)
switch (event->type) {
case ATH10K_QMI_EVENT_SERVER_ARRIVE:
ath10k_qmi_event_server_arrive(qmi);
+ ath10k_qmi_event_msa_ready(qmi);
break;
case ATH10K_QMI_EVENT_SERVER_EXIT:
ath10k_qmi_event_server_exit(qmi);
@@ -1048,7 +1049,7 @@ static void ath10k_qmi_driver_event_work(struct work_struct *work)
ath10k_qmi_event_fw_ready_ind(qmi);
break;
case ATH10K_QMI_EVENT_MSA_READY_IND:
- ath10k_qmi_event_msa_ready(qmi);
+ printk(KERN_WARNING "IGNORING MSA_READY INDICATOR");
break;
default:
ath10k_warn(ar, "invalid event type: %d", event->type);
Dmitry Baryshkov reported:
Works on sm8150, sdm845, qrb2210
Regards
Dmitry Baryshkov <dmitry.baryshkov@linaro.org> writes: >> 3) ADD that compatible to the wifi node in msm8998.dtsi >> compatible = "qcom,wcn3990-wifi", "qcom,msm8998-wifi"; >> 4) In the driver, set qmi->fake_msa_ready_indicator to true if we >> detect "qcom,msm8998-wifi" >> >> And this approach would be acceptable to both ath10k & DT maintainers? > > I'd say, we should take a step back and actually verify how this was > handled in the vendor kernel. One comment related to this: usually vendor driver and firmware branches go "hand in hand", meaning that a version of driver supports only one specific firmware branch. And there can be a lot of branches. So even if one branch might have a check for something specific, there are no guarantees what the other N+1 branches do :/
On 04/04/2024 13:57, Kalle Valo wrote: > Dmitry Baryshkov wrote: > >> I'd say, we should take a step back and actually verify how this was >> handled in the vendor kernel. > > One comment related to this: usually vendor driver and firmware branches > go "hand in hand", meaning that a version of driver supports only one > specific firmware branch. And there can be a lot of branches. So even if > one branch might have a check for something specific, there are no > guarantees what the other N+1 branches do :/ The consequences and ramifications of the above comment are not clear to me. Does this mean: "It is pointless to analyze a given version (or even several versions) of the vendor driver downstream, because there are exist a large number of variations of the code." ? And thus, "it is nonsensical to try to "align" the mainline driver to "the" vendor driver, as there is no single "vendor driver"" ? Thus, the following patch (or one functionally-equivalent) is not acceptable? diff --git a/drivers/net/wireless/ath/ath10k/qmi.c b/drivers/net/wireless/ath/ath10k/qmi.c index 38e939f572a9e..fd9ac9717488a 100644 --- a/drivers/net/wireless/ath/ath10k/qmi.c +++ b/drivers/net/wireless/ath/ath10k/qmi.c @@ -1040,6 +1040,8 @@ static void ath10k_qmi_driver_event_work(struct work_struct *work) switch (event->type) { case ATH10K_QMI_EVENT_SERVER_ARRIVE: ath10k_qmi_event_server_arrive(qmi); + printk(KERN_NOTICE "NOT WAITING FOR MSA_READY INDICATOR"); + ath10k_qmi_event_msa_ready(qmi); break; case ATH10K_QMI_EVENT_SERVER_EXIT: ath10k_qmi_event_server_exit(qmi); @@ -1048,7 +1050,7 @@ static void ath10k_qmi_driver_event_work(struct work_struct *work) ath10k_qmi_event_fw_ready_ind(qmi); break; case ATH10K_QMI_EVENT_MSA_READY_IND: - ath10k_qmi_event_msa_ready(qmi); + printk(KERN_NOTICE "IGNORING ACTUAL MSA_READY INDICATOR"); break; default: ath10k_warn(ar, "invalid event type: %d", event->type); Regards
On Thu, 4 Apr 2024 at 15:30, Marc Gonzalez <mgonzalez@freebox.fr> wrote: > > On 04/04/2024 13:57, Kalle Valo wrote: > > > Dmitry Baryshkov wrote: > > > >> I'd say, we should take a step back and actually verify how this was > >> handled in the vendor kernel. > > > > One comment related to this: usually vendor driver and firmware branches > > go "hand in hand", meaning that a version of driver supports only one > > specific firmware branch. And there can be a lot of branches. So even if > > one branch might have a check for something specific, there are no > > guarantees what the other N+1 branches do :/ > > The consequences and ramifications of the above comment are not clear to me. > > Does this mean: > "It is pointless to analyze a given version (or even several versions) > of the vendor driver downstream, because there are exist a large number > of variations of the code." ? > > And thus, "it is nonsensical to try to "align" the mainline driver to > "the" vendor driver, as there is no single "vendor driver"" ? > > Thus, the following patch (or one functionally-equivalent) is not acceptable? For reference, I tested this patch on sdm845 (db845c), qcm2290 aka qrb2210 (rb1), sm6115 aka qrb4210 (rb2) and sm8150 platforms. I was not able to fully test it on sda660, modem crashes without this patch (there is no MSA_READY indication) and with the patch applied the device hangs, most likely because of the IOMMU or clocking issue. > > diff --git a/drivers/net/wireless/ath/ath10k/qmi.c b/drivers/net/wireless/ath/ath10k/qmi.c > index 38e939f572a9e..fd9ac9717488a 100644 > --- a/drivers/net/wireless/ath/ath10k/qmi.c > +++ b/drivers/net/wireless/ath/ath10k/qmi.c > @@ -1040,6 +1040,8 @@ static void ath10k_qmi_driver_event_work(struct work_struct *work) > switch (event->type) { > case ATH10K_QMI_EVENT_SERVER_ARRIVE: > ath10k_qmi_event_server_arrive(qmi); > + printk(KERN_NOTICE "NOT WAITING FOR MSA_READY INDICATOR"); > + ath10k_qmi_event_msa_ready(qmi); > break; > case ATH10K_QMI_EVENT_SERVER_EXIT: > ath10k_qmi_event_server_exit(qmi); > @@ -1048,7 +1050,7 @@ static void ath10k_qmi_driver_event_work(struct work_struct *work) > ath10k_qmi_event_fw_ready_ind(qmi); > break; > case ATH10K_QMI_EVENT_MSA_READY_IND: > - ath10k_qmi_event_msa_ready(qmi); > + printk(KERN_NOTICE "IGNORING ACTUAL MSA_READY INDICATOR"); > break; > default: > ath10k_warn(ar, "invalid event type: %d", event->type); > > > > Regards >
Marc Gonzalez <mgonzalez@freebox.fr> writes: > On 04/04/2024 13:57, Kalle Valo wrote: > >> Dmitry Baryshkov wrote: >> >>> I'd say, we should take a step back and actually verify how this was >>> handled in the vendor kernel. >> >> One comment related to this: usually vendor driver and firmware branches >> go "hand in hand", meaning that a version of driver supports only one >> specific firmware branch. And there can be a lot of branches. So even if >> one branch might have a check for something specific, there are no >> guarantees what the other N+1 branches do :/ > > The consequences and ramifications of the above comment are not clear to me. > > Does this mean: > "It is pointless to analyze a given version (or even several versions) > of the vendor driver downstream, because there are exist a large number > of variations of the code." ? I was trying to say that because the design philosophy between vendor drivers and upstream drivers is very different, we can't 100% trust vendor drivers. It's a very good idea to check what vendor drivers do but we just need to be careful before making any conclusions. Testing real hardware (and corresponding firmware) is the most reliable way to know how different products/firmware work, unfortunately. > And thus, "it is nonsensical to try to "align" the mainline driver to > "the" vendor driver, as there is no single "vendor driver"" ? No no, I'm not saying that. I have suffered this "N+1 different firmware branches behaving slighly differently" problem since ath6kl days so for me this is business as usual, sadly. I'm sure we can find a solution for ath10k.
On 04/04/2024 17:28, Kalle Valo wrote: > Marc Gonzalez wrote: > >> On 04/04/2024 13:57, Kalle Valo wrote: >> >>> Dmitry Baryshkov wrote: >>> >>>> I'd say, we should take a step back and actually verify how this was >>>> handled in the vendor kernel. >>> >>> One comment related to this: usually vendor driver and firmware branches >>> go "hand in hand", meaning that a version of driver supports only one >>> specific firmware branch. And there can be a lot of branches. So even if >>> one branch might have a check for something specific, there are no >>> guarantees what the other N+1 branches do :/ >> >> The consequences and ramifications of the above comment are not clear to me. >> >> Does this mean: >> "It is pointless to analyze a given version (or even several versions) >> of the vendor driver downstream, because there are exist a large number >> of variations of the code." ? > > I was trying to say that because the design philosophy between vendor > drivers and upstream drivers is very different, we can't 100% trust > vendor drivers. It's a very good idea to check what vendor drivers do > but we just need to be careful before making any conclusions. Testing > real hardware (and corresponding firmware) is the most reliable way to > know how different products/firmware work, unfortunately. > >> And thus, "it is nonsensical to try to "align" the mainline driver to >> "the" vendor driver, as there is no single "vendor driver"" ? > > No no, I'm not saying that. I have suffered this "N+1 different firmware > branches behaving slighly differently" problem since ath6kl days so for > me this is business as usual, sadly. I'm sure we can find a solution for > ath10k. Hello Kalle, I can spin a v3, no problem. Do you prefer: Option A = never waiting for the MSA_READY indicator for ANYONE Option B = not waiting for the MSA_READY indicator when qcom,no-msa-ready-indicator is defined Option C = not waiting for the MSA_READY indicator for certain platforms (based on root compatible) Option D = some other solution not yet discussed Dmitry has tested Option A on 5 platforms, where it does not induce regressions. I worked on msm8998, where Option A (or any equivalent) unbreaks WiFi. Please provide guidance :) Regards
Marc Gonzalez <mgonzalez@freebox.fr> writes: > On 04/04/2024 17:28, Kalle Valo wrote: > >> Marc Gonzalez wrote: >> >>> On 04/04/2024 13:57, Kalle Valo wrote: >>> >>>> Dmitry Baryshkov wrote: >>>> >>>>> I'd say, we should take a step back and actually verify how this was >>>>> handled in the vendor kernel. >>>> >>>> One comment related to this: usually vendor driver and firmware branches >>>> go "hand in hand", meaning that a version of driver supports only one >>>> specific firmware branch. And there can be a lot of branches. So even if >>>> one branch might have a check for something specific, there are no >>>> guarantees what the other N+1 branches do :/ >>> >>> The consequences and ramifications of the above comment are not clear to me. >>> >>> Does this mean: >>> "It is pointless to analyze a given version (or even several versions) >>> of the vendor driver downstream, because there are exist a large number >>> of variations of the code." ? >> >> I was trying to say that because the design philosophy between vendor >> drivers and upstream drivers is very different, we can't 100% trust >> vendor drivers. It's a very good idea to check what vendor drivers do >> but we just need to be careful before making any conclusions. Testing >> real hardware (and corresponding firmware) is the most reliable way to >> know how different products/firmware work, unfortunately. >> >>> And thus, "it is nonsensical to try to "align" the mainline driver to >>> "the" vendor driver, as there is no single "vendor driver"" ? >> >> No no, I'm not saying that. I have suffered this "N+1 different firmware >> branches behaving slighly differently" problem since ath6kl days so for >> me this is business as usual, sadly. I'm sure we can find a solution for >> ath10k. > > Hello Kalle, > > I can spin a v3, no problem. > > Do you prefer: > > Option A = never waiting for the MSA_READY indicator for ANYONE > Option B = not waiting for the MSA_READY indicator when > qcom,no-msa-ready-indicator is defined > Option C = not waiting for the MSA_READY indicator for certain > platforms (based on root compatible) > Option D = some other solution not yet discussed After firmware-N.bin solution didn't work (sorry about that!) my prerence is option B. > Dmitry has tested Option A on 5 platforms, where it does not induce regressions. > I worked on msm8998, where Option A (or any equivalent) unbreaks WiFi. What do you mean here? Are you saying that option A works on all devices? I'm guessing I'm misunderstanding something.
On 25/04/2024 11:42, Kalle Valo wrote: > Marc Gonzalez wrote: > >> Do you prefer: >> >> Option A = never waiting for the MSA_READY indicator for ANYONE >> Option B = not waiting for the MSA_READY indicator when >> qcom,no-msa-ready-indicator is defined >> Option C = not waiting for the MSA_READY indicator for certain >> platforms (based on root compatible) >> Option D = some other solution not yet discussed > > After firmware-N.bin solution didn't work (sorry about that!) my > preference is option B. Actually, Option B is this patch series. Could you formally review it? Perhaps one thing I could do slightly differently is to NOT call ath10k_qmi_event_msa_ready() a second time if we DO receive the indicator later. >> Dmitry has tested Option A on 5 platforms, where it does not induce regressions. >> I worked on msm8998, where Option A (or any equivalent) unbreaks WiFi. > > What do you mean here? Are you saying that option A works on all > devices? I'm guessing I'm misunderstanding something. No one serious would ever claim "this works on all devices". Dmitry and I tested the following patch: diff --git a/drivers/net/wireless/ath/ath10k/qmi.c b/drivers/net/wireless/ath/ath10k/qmi.c index 38e939f572a9e..fd9ac9717488a 100644 --- a/drivers/net/wireless/ath/ath10k/qmi.c +++ b/drivers/net/wireless/ath/ath10k/qmi.c @@ -1040,6 +1040,8 @@ static void ath10k_qmi_driver_event_work(struct work_struct *work) switch (event->type) { case ATH10K_QMI_EVENT_SERVER_ARRIVE: ath10k_qmi_event_server_arrive(qmi); + printk(KERN_NOTICE "NOT WAITING FOR MSA_READY INDICATOR"); + ath10k_qmi_event_msa_ready(qmi); break; case ATH10K_QMI_EVENT_SERVER_EXIT: ath10k_qmi_event_server_exit(qmi); @@ -1048,7 +1050,7 @@ static void ath10k_qmi_driver_event_work(struct work_struct *work) ath10k_qmi_event_fw_ready_ind(qmi); break; case ATH10K_QMI_EVENT_MSA_READY_IND: - ath10k_qmi_event_msa_ready(qmi); + printk(KERN_NOTICE "IGNORING ACTUAL MSA_READY INDICATOR"); break; default: ath10k_warn(ar, "invalid event type: %d", event->type); Dmitry tested several platforms: > For reference, I tested this patch on sdm845 (db845c), qcm2290 aka > qrb2210 (rb1), sm6115 aka qrb4210 (rb2) and sm8150 platforms. > I was not able to fully test it on sda660, modem crashes without this > patch (there is no MSA_READY indication) and with the patch applied > the device hangs, most likely because of the IOMMU or clocking issue. I tested on apq8098 (msm8998 sibling). Patch makes adapter work on my msm8998 platform. Regards
Marc Gonzalez <mgonzalez@freebox.fr> writes: > On 25/04/2024 11:42, Kalle Valo wrote: > >> Marc Gonzalez wrote: >> >>> Do you prefer: >>> >>> Option A = never waiting for the MSA_READY indicator for ANYONE >>> Option B = not waiting for the MSA_READY indicator when >>> qcom,no-msa-ready-indicator is defined >>> Option C = not waiting for the MSA_READY indicator for certain >>> platforms (based on root compatible) >>> Option D = some other solution not yet discussed >> >> After firmware-N.bin solution didn't work (sorry about that!) my >> preference is option B. > > Actually, Option B is this patch series. > Could you formally review it? I'm happy with this series and would take it to ath.git, just need an ack from DT maintainers: https://patchwork.kernel.org/project/linux-wireless/patch/84f20fb5-5d48-419c-8eff-d7044afb81c0@freebox.fr/ > Perhaps one thing I could do slightly differently is to NOT call > ath10k_qmi_event_msa_ready() a second time if we DO receive the > indicator later. Good point. And maybe add an ath10k_warn() message so that we notice if there's a mismatch.
On Thu, Apr 25, 2024 at 06:42:16PM +0300, Kalle Valo wrote: > Marc Gonzalez <mgonzalez@freebox.fr> writes: > > > On 25/04/2024 11:42, Kalle Valo wrote: > > > >> Marc Gonzalez wrote: > >> > >>> Do you prefer: > >>> > >>> Option A = never waiting for the MSA_READY indicator for ANYONE > >>> Option B = not waiting for the MSA_READY indicator when > >>> qcom,no-msa-ready-indicator is defined > >>> Option C = not waiting for the MSA_READY indicator for certain > >>> platforms (based on root compatible) > >>> Option D = some other solution not yet discussed > >> > >> After firmware-N.bin solution didn't work (sorry about that!) my > >> preference is option B. > > > > Actually, Option B is this patch series. > > Could you formally review it? > > I'm happy with this series and would take it to ath.git, just need an > ack from DT maintainers: As far as I can tell, Krzysztof wanted a commit message update for 1/3. Usually this dts patch would go via the qcom dts tree, so presumably there's a need for an Ack from Bjorn or Konrad? > > https://patchwork.kernel.org/project/linux-wireless/patch/84f20fb5-5d48-419c-8eff-d7044afb81c0@freebox.fr/ > > > Perhaps one thing I could do slightly differently is to NOT call > > ath10k_qmi_event_msa_ready() a second time if we DO receive the > > indicator later. > > Good point. And maybe add an ath10k_warn() message so that we notice if > there's a mismatch. > > -- > https://patchwork.kernel.org/project/linux-wireless/list/ > > https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
Conor Dooley <conor@kernel.org> writes: > On Thu, Apr 25, 2024 at 06:42:16PM +0300, Kalle Valo wrote: > >> Marc Gonzalez <mgonzalez@freebox.fr> writes: >> >> > On 25/04/2024 11:42, Kalle Valo wrote: >> > >> >> Marc Gonzalez wrote: >> >> >> >>> Do you prefer: >> >>> >> >>> Option A = never waiting for the MSA_READY indicator for ANYONE >> >>> Option B = not waiting for the MSA_READY indicator when >> >>> qcom,no-msa-ready-indicator is defined >> >>> Option C = not waiting for the MSA_READY indicator for certain >> >>> platforms (based on root compatible) >> >>> Option D = some other solution not yet discussed >> >> >> >> After firmware-N.bin solution didn't work (sorry about that!) my >> >> preference is option B. >> > >> > Actually, Option B is this patch series. >> > Could you formally review it? >> >> I'm happy with this series and would take it to ath.git, just need an >> ack from DT maintainers: > > As far as I can tell, Krzysztof wanted a commit message update for > 1/3. That's my understanding as well, I assume Marc will submit v3. I marked this patchset as 'Changes Requested' in patchwork. > Usually this dts patch would go via the qcom dts tree, so presumably > there's a need for an Ack from Bjorn or Konrad? Thanks pointing this out. What I meant to say earlier that I'm happy to take patches 1-2 to ath.git but I prefer that patch 3 goes via qcom dts tree.
diff --git a/arch/arm64/boot/dts/qcom/msm8998.dtsi b/arch/arm64/boot/dts/qcom/msm8998.dtsi index 026b6b97785b5..681a80f4dc9db 100644 --- a/arch/arm64/boot/dts/qcom/msm8998.dtsi +++ b/arch/arm64/boot/dts/qcom/msm8998.dtsi @@ -3185,6 +3185,7 @@ wifi: wifi@18800000 { iommus = <&anoc2_smmu 0x1900>, <&anoc2_smmu 0x1901>; qcom,snoc-host-cap-8bit-quirk; + qcom,no-msa-ready-indicator; }; }; };
The ath10k driver waits for an "MSA_READY" indicator to complete initialization. If the indicator is not received, then the device remains unusable. cf. ath10k_qmi_driver_event_work() Several msm8998-based devices are affected by this issue. Oddly, it seems safe to NOT wait for the indicator, and proceed immediately when QMI_EVENT_SERVER_ARRIVE. Jeff Johnson wrote: The feedback I received was "it might be ok to change all ath10k qmi to skip waiting for msa_ready", and it was pointed out that ath11k (and ath12k) do not wait for it. However with so many deployed devices, "might be ok" isn't a strong argument for changing the default behavior. cf. also https://wiki.postmarketos.org/wiki/Qualcomm_Snapdragon_835_(MSM8998)#WLAN Signed-off-by: Marc Gonzalez <mgonzalez@freebox.fr> --- arch/arm64/boot/dts/qcom/msm8998.dtsi | 1 + 1 file changed, 1 insertion(+)