mbox series

[v2,0/5] Misc MHI fixes

Message ID 1586278230-29565-1-git-send-email-jhugo@codeaurora.org
Headers show
Series Misc MHI fixes | expand

Message

Jeffrey Hugo April 7, 2020, 4:50 p.m. UTC
A few (independent) fixes to the MHI bus for issues that I have come across
while developing against the mainline code.

v2:
-fix syserr reset log message
-fix power up error check code style
-add change to remove pci assumptions for register accesses
-add comment typo fix

Jeffrey Hugo (5):
  bus: mhi: core: Handle syserr during power_up
  bus: mhi: core: Make sure to powerdown if mhi_sync_power_up fails
  bus: mhi: core: Remove link_status() callback
  bus: mhi: core: Offload register accesses to the controller
  bus: mhi: core: Fix typo in comment

 drivers/bus/mhi/core/init.c     |  7 +++----
 drivers/bus/mhi/core/internal.h |  3 ---
 drivers/bus/mhi/core/main.c     | 13 ++-----------
 drivers/bus/mhi/core/pm.c       | 26 +++++++++++++++++++++++++-
 include/linux/mhi.h             | 10 +++++++---
 5 files changed, 37 insertions(+), 22 deletions(-)

Comments

Jeffrey Hugo April 10, 2020, 3:03 p.m. UTC | #1
On 4/9/2020 6:55 PM, Hemant Kumar wrote:
> 
> On 4/7/20 9:50 AM, Jeffrey Hugo wrote:
>> The MHI device may be in the syserr state when we attempt to init it in
>> power_up().  Since we have no local state, the handling is simple -
>> reset the device and wait for it to transition out of the reset state.
>>
>> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
>> ---
>>   drivers/bus/mhi/core/pm.c | 20 ++++++++++++++++++++
>>   1 file changed, 20 insertions(+)
>>
>> diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
>> index 52690cb..3285c9e 100644
>> --- a/drivers/bus/mhi/core/pm.c
>> +++ b/drivers/bus/mhi/core/pm.c
>> @@ -9,6 +9,7 @@
>>   #include <linux/dma-direction.h>
>>   #include <linux/dma-mapping.h>
>>   #include <linux/interrupt.h>
>> +#include <linux/iopoll.h>
>>   #include <linux/list.h>
>>   #include <linux/mhi.h>
>>   #include <linux/module.h>
>> @@ -760,6 +761,7 @@ static void mhi_deassert_dev_wake(struct 
>> mhi_controller *mhi_cntrl,
>>   int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
>>   {
>> +    enum mhi_state state;
>>       enum mhi_ee_type current_ee;
>>       enum dev_st_transition next_state;
>>       struct device *dev = &mhi_cntrl->mhi_dev->dev;
>> @@ -829,6 +831,24 @@ int mhi_async_power_up(struct mhi_controller 
>> *mhi_cntrl)
>>           goto error_bhi_offset;
>>       }
>> +    state = mhi_get_mhi_state(mhi_cntrl);
>> +    if (state == MHI_STATE_SYS_ERR) {
>> +        mhi_set_mhi_state(mhi_cntrl, MHI_STATE_RESET);
>> +        ret = readl_poll_timeout(mhi_cntrl->regs + MHICTRL, val,
>> +                     !(val & MHICTRL_RESET_MASK), 1000,
>> +                     mhi_cntrl->timeout_ms * 1000);
> can we use this instead of polling because MSI is configures and int_vec 
> handler is registered
> 
>      wait_event_timeout(mhi_cntrl->state_event,
>                 MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state) ||
>                mhi_read_reg_field(mhi_cntrl, base, MHICTRL,
>                            MHICTRL_RESET_MASK,
>                            MHICTRL_RESET_SHIFT, &reset) || !reset ,
>                 msecs_to_jiffies(mhi_cntrl->timeout_ms));
> 
> 1) In case of MHI_PM_IN_FATAL_STATE we would not be accessing MHI reg
> 2) Consistent with current MHI driver code.

I'm not sure this works in the way you intend.

state_event is linked to the intvec, which is the BHI interrupt.  I 
don't see that the state_event is triggered in the MHI interrupt path 
(mhi_irq_handler).  So, if we are in the PBL EE, we would expect to see 
the BHI interrupt, but if we are in the AMSS EE, we would expect to see 
a MHI interrupt.

Now, for my concerned usecase, those two interrupts happen to be the 
same interrupt, so both will get triggered, but I don't expect that to 
be the same for all usecases.

So, with the solution I propose, we exit the wait (poll loop) as soon as 
we see the register change values.

With the solution you propose, if we only get the MHI interrupt, we'll 
have to wait out the entire timeout value, and then check the register. 
In this scenario, we are almost guaranteed to wait for longer than 
necessary.

Did I miss something?

>> +        if (ret) {
>> +            dev_info(dev, "Failed to reset MHI due to syserr state\n");
>> +            goto error_bhi_offset;
>> +        }
>> +
>> +        /*
>> +         * device cleares INTVEC as part of RESET processing,
>> +         * re-program it
>> +         */
>> +        mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
>> +    }
>> +
>>       /* Transition to next state */
>>       next_state = MHI_IN_PBL(current_ee) ?
>>           DEV_ST_TRANSITION_PBL : DEV_ST_TRANSITION_READY;
>
Jeffrey Hugo April 10, 2020, 9:39 p.m. UTC | #2
On 4/10/2020 2:37 PM, Bhaumik Vasav Bhatt wrote:
> Hi Jeff,
> 
> We will always have the mhi_intvec_handler registered and trigger your 
> wake_up state event when you write MHI RESET. BHI INTVEC IRQ using 
> mhi_cntrl->irq[0] is _not_ unregistered once you enter AMSS EE.

I understand it is not unregistered.  However mhi_cntrl->irq[0] may be 
reserved for BHI, and thus only exercised by PBL EE.  Where as, 
mhi_cntrl->irq[1..N] may be only exercised by AMSS EE. 
mhi_intvec_handler is not called in response to mhi_cntrl->irq[1..N].

Additionally, I re-reviewed the MHI spec, and I don't see where the spec 
requires the device to issue an interrupt upon completion of the RESET 
request.

Under section 3.5, step 11 states -

"The host must poll for the value of the RESET bit to detect the 
completion of the reset procedure by the device (RESET is set to 0)."

So, by this, my proposed solution appears to be spec compliant, where as 
what Hemant proposed is not.

> 
> So, your below assumption is not true:
>  >>>So, if we are in the PBL EE, we would expect to see the BHI 
> interrupt, but if we are in the AMSS EE, we would expect to see a MHI 
> interrupt.
> 
> At the start of mhi_async_power_up(), you've already registered for the 
> BHI interrupt as we do setup for IRQ and it is only unregistered from 
> power down if power up on the same cycle was a success.

You seem to have misunderstood my point.  If the BHI irq is only for BHI 
activity, which is activity restricted to the PBL EE, and the MHI 
interrupt(s) are restricted to MHI activity, which for the purposes of 
this discussion only occur in the AMSS EE, then my assumption is 
correct.  When the device is in PBL EE, we should only observe BHI irqs, 
and when the device is in AMSS EE, we should only observe MHI irqs.

This is a statement of what IRQ lines the device is raising, and not a 
statement of what handlers the host has, or has not registered.

Again, if the BHI irq is only generated in the PBL EE, and we rely on 
the BHI irq for "sensing" the state_event - we will never see the 
state_event in the AMSS EE, unless the same IRQ line is used for both 
MHI and BHI (which is only a select set of usecases, and not universal).

> 
> On 4/10/20 8:03 AM, Jeffrey Hugo wrote:
>> On 4/9/2020 6:55 PM, Hemant Kumar wrote:
>>>
>>> On 4/7/20 9:50 AM, Jeffrey Hugo wrote:
>>>> The MHI device may be in the syserr state when we attempt to init it in
>>>> power_up().  Since we have no local state, the handling is simple -
>>>> reset the device and wait for it to transition out of the reset state.
>>>>
>>>> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
>>>> ---
>>>>   drivers/bus/mhi/core/pm.c | 20 ++++++++++++++++++++
>>>>   1 file changed, 20 insertions(+)
>>>>
>>>> diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
>>>> index 52690cb..3285c9e 100644
>>>> --- a/drivers/bus/mhi/core/pm.c
>>>> +++ b/drivers/bus/mhi/core/pm.c
>>>> @@ -9,6 +9,7 @@
>>>>   #include <linux/dma-direction.h>
>>>>   #include <linux/dma-mapping.h>
>>>>   #include <linux/interrupt.h>
>>>> +#include <linux/iopoll.h>
>>>>   #include <linux/list.h>
>>>>   #include <linux/mhi.h>
>>>>   #include <linux/module.h>
>>>> @@ -760,6 +761,7 @@ static void mhi_deassert_dev_wake(struct 
>>>> mhi_controller *mhi_cntrl,
>>>>   int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
>>>>   {
>>>> +    enum mhi_state state;
>>>>       enum mhi_ee_type current_ee;
>>>>       enum dev_st_transition next_state;
>>>>       struct device *dev = &mhi_cntrl->mhi_dev->dev;
>>>> @@ -829,6 +831,24 @@ int mhi_async_power_up(struct mhi_controller 
>>>> *mhi_cntrl)
>>>>           goto error_bhi_offset;
>>>>       }
>>>> +    state = mhi_get_mhi_state(mhi_cntrl);
>>>> +    if (state == MHI_STATE_SYS_ERR) {
>>>> +        mhi_set_mhi_state(mhi_cntrl, MHI_STATE_RESET);
>>>> +        ret = readl_poll_timeout(mhi_cntrl->regs + MHICTRL, val,
>>>> +                     !(val & MHICTRL_RESET_MASK), 1000,
>>>> +                     mhi_cntrl->timeout_ms * 1000);
>>> can we use this instead of polling because MSI is configures and 
>>> int_vec handler is registered
>>>
>>>      wait_event_timeout(mhi_cntrl->state_event,
>>>                 MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state) ||
>>>                mhi_read_reg_field(mhi_cntrl, base, MHICTRL,
>>>                            MHICTRL_RESET_MASK,
>>>                            MHICTRL_RESET_SHIFT, &reset) || !reset ,
>>>                 msecs_to_jiffies(mhi_cntrl->timeout_ms));
>>>
>>> 1) In case of MHI_PM_IN_FATAL_STATE we would not be accessing MHI reg
>>> 2) Consistent with current MHI driver code.
>>
>> I'm not sure this works in the way you intend.
>>
>> state_event is linked to the intvec, which is the BHI interrupt. I 
>> don't see that the state_event is triggered in the MHI interrupt path 
>> (mhi_irq_handler).  So, if we are in the PBL EE, we would expect to 
>> see the BHI interrupt, but if we are in the AMSS EE, we would expect 
>> to see a MHI interrupt.
>>
>> Now, for my concerned usecase, those two interrupts happen to be the 
>> same interrupt, so both will get triggered, but I don't expect that to 
>> be the same for all usecases.
>>
>> So, with the solution I propose, we exit the wait (poll loop) as soon 
>> as we see the register change values.
>>
>> With the solution you propose, if we only get the MHI interrupt, we'll 
>> have to wait out the entire timeout value, and then check the 
>> register. In this scenario, we are almost guaranteed to wait for 
>> longer than necessary.
>>
>> Did I miss something?
>>
>>>> +        if (ret) {
>>>> +            dev_info(dev, "Failed to reset MHI due to syserr 
>>>> state\n");
>>>> +            goto error_bhi_offset;
>>>> +        }
>>>> +
>>>> +        /*
>>>> +         * device cleares INTVEC as part of RESET processing,
>>>> +         * re-program it
>>>> +         */
>>>> +        mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
>>>> +    }
>>>> +
>>>>       /* Transition to next state */
>>>>       next_state = MHI_IN_PBL(current_ee) ?
>>>>           DEV_ST_TRANSITION_PBL : DEV_ST_TRANSITION_READY;
>>>
>>
>>
Jeffrey Hugo April 13, 2020, 2:01 p.m. UTC | #3
On 4/13/2020 7:34 AM, Manivannan Sadhasivam wrote:
> On Fri, Apr 10, 2020 at 03:39:57PM -0600, Jeffrey Hugo wrote:
>> On 4/10/2020 2:37 PM, Bhaumik Vasav Bhatt wrote:
>>> Hi Jeff,
>>>
>>> We will always have the mhi_intvec_handler registered and trigger your
>>> wake_up state event when you write MHI RESET. BHI INTVEC IRQ using
>>> mhi_cntrl->irq[0] is _not_ unregistered once you enter AMSS EE.
>>
>> I understand it is not unregistered.  However mhi_cntrl->irq[0] may be
>> reserved for BHI, and thus only exercised by PBL EE.  Where as,
>> mhi_cntrl->irq[1..N] may be only exercised by AMSS EE. mhi_intvec_handler is
>> not called in response to mhi_cntrl->irq[1..N].
>>
>> Additionally, I re-reviewed the MHI spec, and I don't see where the spec
>> requires the device to issue an interrupt upon completion of the RESET
>> request.
>>
>> Under section 3.5, step 11 states -
>>
>> "The host must poll for the value of the RESET bit to detect the completion
>> of the reset procedure by the device (RESET is set to 0)."
>>
> 
> If this is the scenario then we need to change all of the wait_event_timeout()
> implementation for MHI RESET in current stack to polling.
> 
> Or the interrupt generation is not defined in spec (sheet) but part of the
> existing implementation?

It probably could be considered part of the existing implementation, but 
I'd like to hear from Hemant/Bhaumik.  Wherever we end up, I'd like to 
have the spec match.
Manivannan Sadhasivam April 21, 2020, 6:08 a.m. UTC | #4
On Mon, Apr 13, 2020 at 08:01:36AM -0600, Jeffrey Hugo wrote:
> On 4/13/2020 7:34 AM, Manivannan Sadhasivam wrote:
> > On Fri, Apr 10, 2020 at 03:39:57PM -0600, Jeffrey Hugo wrote:
> > > On 4/10/2020 2:37 PM, Bhaumik Vasav Bhatt wrote:
> > > > Hi Jeff,
> > > > 
> > > > We will always have the mhi_intvec_handler registered and trigger your
> > > > wake_up state event when you write MHI RESET. BHI INTVEC IRQ using
> > > > mhi_cntrl->irq[0] is _not_ unregistered once you enter AMSS EE.
> > > 
> > > I understand it is not unregistered.  However mhi_cntrl->irq[0] may be
> > > reserved for BHI, and thus only exercised by PBL EE.  Where as,
> > > mhi_cntrl->irq[1..N] may be only exercised by AMSS EE. mhi_intvec_handler is
> > > not called in response to mhi_cntrl->irq[1..N].
> > > 
> > > Additionally, I re-reviewed the MHI spec, and I don't see where the spec
> > > requires the device to issue an interrupt upon completion of the RESET
> > > request.
> > > 
> > > Under section 3.5, step 11 states -
> > > 
> > > "The host must poll for the value of the RESET bit to detect the completion
> > > of the reset procedure by the device (RESET is set to 0)."
> > > 
> > 
> > If this is the scenario then we need to change all of the wait_event_timeout()
> > implementation for MHI RESET in current stack to polling.
> > 
> > Or the interrupt generation is not defined in spec (sheet) but part of the
> > existing implementation?
> 
> It probably could be considered part of the existing implementation, but I'd
> like to hear from Hemant/Bhaumik.  Wherever we end up, I'd like to have the
> spec match.

Hemant/Bhaumik, can you please share your thoughts?

Thanks,
Mani

> 
> -- 
> Jeffrey Hugo
> Qualcomm Technologies, Inc. is a member of the
> Code Aurora Forum, a Linux Foundation Collaborative Project.
Jeffrey Hugo April 22, 2020, 8:16 p.m. UTC | #5
On 4/21/2020 12:08 AM, Manivannan Sadhasivam wrote:
> On Mon, Apr 13, 2020 at 08:01:36AM -0600, Jeffrey Hugo wrote:
>> On 4/13/2020 7:34 AM, Manivannan Sadhasivam wrote:
>>> On Fri, Apr 10, 2020 at 03:39:57PM -0600, Jeffrey Hugo wrote:
>>>> On 4/10/2020 2:37 PM, Bhaumik Vasav Bhatt wrote:
>>>>> Hi Jeff,
>>>>>
>>>>> We will always have the mhi_intvec_handler registered and trigger your
>>>>> wake_up state event when you write MHI RESET. BHI INTVEC IRQ using
>>>>> mhi_cntrl->irq[0] is _not_ unregistered once you enter AMSS EE.
>>>>
>>>> I understand it is not unregistered.  However mhi_cntrl->irq[0] may be
>>>> reserved for BHI, and thus only exercised by PBL EE.  Where as,
>>>> mhi_cntrl->irq[1..N] may be only exercised by AMSS EE. mhi_intvec_handler is
>>>> not called in response to mhi_cntrl->irq[1..N].
>>>>
>>>> Additionally, I re-reviewed the MHI spec, and I don't see where the spec
>>>> requires the device to issue an interrupt upon completion of the RESET
>>>> request.
>>>>
>>>> Under section 3.5, step 11 states -
>>>>
>>>> "The host must poll for the value of the RESET bit to detect the completion
>>>> of the reset procedure by the device (RESET is set to 0)."
>>>>
>>>
>>> If this is the scenario then we need to change all of the wait_event_timeout()
>>> implementation for MHI RESET in current stack to polling.
>>>
>>> Or the interrupt generation is not defined in spec (sheet) but part of the
>>> existing implementation?
>>
>> It probably could be considered part of the existing implementation, but I'd
>> like to hear from Hemant/Bhaumik.  Wherever we end up, I'd like to have the
>> spec match.
> 
> Hemant/Bhaumik, can you please share your thoughts?

Sorry.  Hemant, Bhaumik, and I have had a few calls discussing this.  We 
are trying to come to a consensus on the expectation of the device 
behavior, so that we can propose the best solution.  Probably a few more 
days yet while we await for a bit of clarification.