[RESEND,v2,0/8] Bug fixes and improved logging in MHI

Message ID 1588042766-17496-1-git-send-email-bbhatt@codeaurora.org

Message

Bhaumik Bhatt April 28, 2020, 2:59 a.m. UTC
A set of patches for bug fixes and improved logging in mhi/core/boot.c.
Verified on x86 and arm64 platforms.

v2:
-Fix channel ID range check potential infinite loop
-Add appropriate signed-off-by tags

Bhaumik Bhatt (5):
  bus: mhi: core: Handle firmware load using state worker
  bus: mhi: core: WARN_ON for malformed vector table
  bus: mhi: core: Return appropriate error codes for AMSS load failure
  bus: mhi: core: Improve debug logs for loading firmware
  bus: mhi: core: Ensure non-zero session or sequence ID values

Hemant Kumar (3):
  bus: mhi: core: Cache intmod from mhi event to mhi channel
  bus: mhi: core: Add range check for channel id received in event ring
  bus: mhi: core: Read transfer length from an event properly

 drivers/bus/mhi/core/boot.c     | 74 +++++++++++++++++++++++++----------------
 drivers/bus/mhi/core/init.c     |  5 ++-
 drivers/bus/mhi/core/internal.h |  1 +
 drivers/bus/mhi/core/main.c     | 18 +++++++---
 drivers/bus/mhi/core/pm.c       |  6 +---
 include/linux/mhi.h             |  2 --
 6 files changed, 65 insertions(+), 41 deletions(-)

Comments

Jeffrey Hugo April 28, 2020, 2:39 p.m. UTC | #1
On 4/27/2020 8:59 PM, Bhaumik Bhatt wrote:
> From: Hemant Kumar <hemantk@codeaurora.org>
> 
> Driver is using zero initialized intmod value from mhi channel when
> configuring TRE for bei field. This prevents interrupt moderation to
> take effect in case it is supported by an event ring. Fix this by
> copying intmod value from associated event ring to mhi channel upon
> registering mhi controller.
> 
> Signed-off-by: Hemant Kumar <hemantk@codeaurora.org>
> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
> ---
>   drivers/bus/mhi/core/init.c | 4 ++++
>   drivers/bus/mhi/core/main.c | 9 ++++++---
>   2 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c
> index b38359c..4dc7f22 100644
> --- a/drivers/bus/mhi/core/init.c
> +++ b/drivers/bus/mhi/core/init.c
> @@ -864,6 +864,10 @@ int mhi_register_controller(struct mhi_controller *mhi_cntrl,
>   		mutex_init(&mhi_chan->mutex);
>   		init_completion(&mhi_chan->completion);
>   		rwlock_init(&mhi_chan->lock);
> +
> +		/* used in setting bei field of TRE */
> +		mhi_event = &mhi_cntrl->mhi_event[mhi_chan->er_index];
> +		mhi_chan->intmod = mhi_event->intmod;
>   	}
>   
>   	if (mhi_cntrl->bounce_buf) {
> diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
> index eb4256b..23154f1 100644
> --- a/drivers/bus/mhi/core/main.c
> +++ b/drivers/bus/mhi/core/main.c
> @@ -929,7 +929,7 @@ int mhi_queue_skb(struct mhi_device *mhi_dev, enum dma_data_direction dir,
>   	struct mhi_ring *buf_ring = &mhi_chan->buf_ring;
>   	struct mhi_buf_info *buf_info;
>   	struct mhi_tre *mhi_tre;
> -	int ret;
> +	int ret, bei;
>   
>   	/* If MHI host pre-allocates buffers then client drivers cannot queue */
>   	if (mhi_chan->pre_alloc)
> @@ -966,10 +966,11 @@ int mhi_queue_skb(struct mhi_device *mhi_dev, enum dma_data_direction dir,
>   		goto map_error;
>   
>   	mhi_tre = tre_ring->wp;
> +	bei = !!(mhi_chan->intmod);
>   
>   	mhi_tre->ptr = MHI_TRE_DATA_PTR(buf_info->p_addr);
>   	mhi_tre->dword[0] = MHI_TRE_DATA_DWORD0(buf_info->len);
> -	mhi_tre->dword[1] = MHI_TRE_DATA_DWORD1(1, 1, 0, 0);
> +	mhi_tre->dword[1] = MHI_TRE_DATA_DWORD1(bei, 1, 0, 0);

Why are we not using mhi_gen_tre() here?  It's used in mhi_queue_buf(),
which seems to be why the issue doesn't appear there.  It looks like we
have 3 instances of the same code, which all need to be kept in sync.
It seems this bug exists because they were not in sync.

Simplifying the code by removing the duplication seems to be the better
approach: it not only addresses this issue but also prevents future ones.
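
As a rough illustration of the de-duplication suggested above, the sketch below puts the TRE setup in one helper that every queue path could call, so the bei decision lives in a single place. This is not mhi_gen_tre() itself: the struct names, the helper name and the bit positions are simplified, hypothetical stand-ins chosen for a user-space example, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the driver's types; not the kernel definitions. */
struct tre_sketch {
	uint64_t ptr;
	uint32_t dword[2];
};

struct chan_sketch {
	uint32_t intmod;	/* assumed to be copied from the event ring */
};

/* Illustrative bit positions only; the real layout is in the driver. */
#define SKETCH_BEI_SHIFT	10
#define SKETCH_IEOT_SHIFT	9

/* Hypothetical shared helper: one place that decides how bei is set. */
static void sketch_fill_data_tre(struct tre_sketch *tre,
				 const struct chan_sketch *chan,
				 uint64_t dma_addr, uint16_t len)
{
	uint32_t bei;

	assert(tre != NULL && chan != NULL);

	/* Block event interrupts only when moderation is configured. */
	bei = chan->intmod ? 1u : 0u;

	tre->ptr = dma_addr;
	tre->dword[0] = len;
	tre->dword[1] = (bei << SKETCH_BEI_SHIFT) | (1u << SKETCH_IEOT_SHIFT);
}
```

With a helper of this shape, the skb, dma and buf queue paths would differ only in how they obtain the DMA address and length.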

>   
>   	/* increment WP */
>   	mhi_add_ring_element(mhi_cntrl, tre_ring);
> @@ -1006,6 +1007,7 @@ int mhi_queue_dma(struct mhi_device *mhi_dev, enum dma_data_direction dir,
>   	struct mhi_ring *buf_ring = &mhi_chan->buf_ring;
>   	struct mhi_buf_info *buf_info;
>   	struct mhi_tre *mhi_tre;
> +	int bei;
>   
>   	/* If MHI host pre-allocates buffers then client drivers cannot queue */
>   	if (mhi_chan->pre_alloc)
> @@ -1043,10 +1045,11 @@ int mhi_queue_dma(struct mhi_device *mhi_dev, enum dma_data_direction dir,
>   	buf_info->len = len;
>   
>   	mhi_tre = tre_ring->wp;
> +	bei = !!(mhi_chan->intmod);
>   
>   	mhi_tre->ptr = MHI_TRE_DATA_PTR(buf_info->p_addr);
>   	mhi_tre->dword[0] = MHI_TRE_DATA_DWORD0(buf_info->len);
> -	mhi_tre->dword[1] = MHI_TRE_DATA_DWORD1(1, 1, 0, 0);
> +	mhi_tre->dword[1] = MHI_TRE_DATA_DWORD1(bei, 1, 0, 0);
>   
>   	/* increment WP */
>   	mhi_add_ring_element(mhi_cntrl, tre_ring);
>
Jeffrey Hugo April 28, 2020, 2:50 p.m. UTC | #2
On 4/27/2020 8:59 PM, Bhaumik Bhatt wrote:
> From: Hemant Kumar <hemantk@codeaurora.org>
> 
> When MHI Driver receives an EOT event, it reads xfer_len from the
> event in the last TRE. The value is under control of the MHI device
> and never validated by Host MHI driver. The value should never be
> larger than the real size of the buffer but a malicious device can
> set the value 0xFFFF as maximum. This causes device to memory

The device will overflow, or the driver?

> overflow (both read or write). Fix this issue by reading minimum of
> transfer length from event and the buffer length provided.
> 
> Signed-off-by: Hemant Kumar <hemantk@codeaurora.org>
> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
> ---
>   drivers/bus/mhi/core/main.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
> index 1ccd4cc..3d468d9 100644
> --- a/drivers/bus/mhi/core/main.c
> +++ b/drivers/bus/mhi/core/main.c
> @@ -521,7 +521,10 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>   				mhi_cntrl->unmap_single(mhi_cntrl, buf_info);
>   
>   			result.buf_addr = buf_info->cb_buf;
> -			result.bytes_xferd = xfer_len;
> +
> +			/* truncate to buf len if xfer_len is larger */
> +			result.bytes_xferd =
> +				min_t(u16, xfer_len, buf_info->len);
>   			mhi_del_ring_element(mhi_cntrl, buf_ring);
>   			mhi_del_ring_element(mhi_cntrl, tre_ring);
>   			local_rp = tre_ring->rp;
>
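
For reference, the truncation in the hunk above can be modelled in user space as below. This is only a sketch: sketch_bytes_xferd() is a hypothetical name, and it compares in size_t instead of narrowing both operands to u16 the way min_t(u16, ...) does, on the assumption that the buffer length may be wider than 16 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: never report more bytes than the buffer holds.
 * xfer_len comes from the device and is not trusted; buf_len is the
 * length the host recorded when it queued the buffer.
 */
static size_t sketch_bytes_xferd(uint16_t xfer_len, size_t buf_len)
{
	return xfer_len < buf_len ? (size_t)xfer_len : buf_len;
}

/* Usage example: a 0xFFFF report against a 512-byte buffer is truncated. */
void sketch_clamp_example(void)
{
	assert(sketch_bytes_xferd(0xFFFF, 512) == 512);
	assert(sketch_bytes_xferd(100, 512) == 100);
}
```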
Hemant Kumar April 29, 2020, 5:29 p.m. UTC | #3
Hi Jeff

On 4/28/20 7:44 AM, Jeffrey Hugo wrote:
> On 4/27/2020 8:59 PM, Bhaumik Bhatt wrote:
>> From: Hemant Kumar <hemantk@codeaurora.org>
>>
>> MHI data completion handler function reads channel id from event
>> ring element. Value is under the control of MHI devices and can be
>> any value between 0 and 255. In order to prevent out of bound access
>> add a bound check against the max channel supported by controller
>> and skip processing of that event ring element.
>>
>> Signed-off-by: Hemant Kumar <hemantk@codeaurora.org>
>> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
>> ---
>>   drivers/bus/mhi/core/main.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
>> index 23154f1..1ccd4cc 100644
>> --- a/drivers/bus/mhi/core/main.c
>> +++ b/drivers/bus/mhi/core/main.c
>> @@ -827,6 +827,9 @@ int mhi_process_data_event_ring(struct mhi_controller *mhi_cntrl,
>>           enum mhi_pkt_type type = MHI_TRE_GET_EV_TYPE(local_rp);
>>           chan = MHI_TRE_GET_EV_CHID(local_rp);
>> +        if (WARN_ON(chan >= mhi_cntrl->max_chan))
>> +            goto next_event;
>> +
>>           mhi_chan = &mhi_cntrl->mhi_chan[chan];
>>           if (likely(type == MHI_PKT_TYPE_TX_EVENT)) {
>> @@ -837,6 +840,7 @@ int mhi_process_data_event_ring(struct mhi_controller *mhi_cntrl,
>>               event_quota--;
>>           }
>> +next_event:
>>           mhi_recycle_ev_ring_element(mhi_cntrl, ev_ring);
>>           local_rp = ev_ring->rp;
>>           dev_rp = mhi_to_virtual(ev_ring, er_ctxt->rp);
>>
> 
> It looks like the same issue exists in mhi_process_ctrl_ev_ring(), and 
> thus some form of this solution needs to be applied there as well. Would 
> you please fix that too?
> 
As discussed with you offline, the spec allows a single event ring to
be used for both data and control. We will update this in v3.
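
A minimal user-space sketch of the range check discussed above, assuming a flat array of events instead of the real ring: out-of-range channel IDs are skipped, but the read position still advances, so a bad element cannot stall the loop. The same pattern would apply to whichever handler walks the ring, data or control. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified event: only the device-supplied channel ID matters here. */
struct event_sketch {
	uint8_t chan;
};

/*
 * Walk n events and count one hit per valid channel in per_chan[], which
 * must hold at least max_chan entries. Returns the number of events that
 * passed the range check.
 */
static size_t sketch_process_events(const struct event_sketch *ev, size_t n,
				    uint32_t max_chan, uint32_t *per_chan)
{
	size_t handled = 0;
	size_t rp = 0;

	assert(ev != NULL && per_chan != NULL);

	while (rp < n) {
		uint32_t chan = ev[rp].chan;

		/* Device-controlled value: reject before indexing. */
		if (chan >= max_chan)
			goto next_event;

		per_chan[chan]++;
		handled++;
next_event:
		/* Always consume the element, valid or not. */
		rp++;
	}

	return handled;
}
```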