diff mbox series

drm/msm/dp: do not complete dp_aux_cmd_fifo_tx() if irq is not for aux transfer

Message ID 1671052890-11627-1-git-send-email-quic_khsieh@quicinc.com
State Superseded
Headers show
Series drm/msm/dp: do not complete dp_aux_cmd_fifo_tx() if irq is not for aux transfer | expand

Commit Message

Kuogee Hsieh Dec. 14, 2022, 9:21 p.m. UTC
There are 3 possible interrupt sources are handled by DP controller,
HPDstatus, Controller state changes and Aux read/write transaction.
At every irq, DP controller have to check isr status of every interrupt
sources and service the interrupt if its isr status bits shows interrupts
are pending. There is potential race condition may happen at current aux
isr handler implementation since it is always complete dp_aux_cmd_fifo_tx()
even irq is not for aux read or write transaction. This may cause aux read
transaction return premature if host aux data read is in the middle of
waiting for sink to complete transferring data to host while irq happen.
This will cause host's receiving buffer contains unexpected data. This
patch fixes this problem by checking aux isr and return immediately at
aux isr handler if there are no any isr status bits set.

Follows are the signature at kernel logs when problem happen,
EDID has corrupt header
panel-simple-dp-aux aux-aea0000.edp: Couldn't identify panel via EDID
panel-simple-dp-aux aux-aea0000.edp: error -EIO: Couldn't detect panel nor find a fallback

Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
---
 drivers/gpu/drm/msm/dp/dp_aux.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Doug Anderson Dec. 14, 2022, 10:29 p.m. UTC | #1
Hi,

On Wed, Dec 14, 2022 at 1:21 PM Kuogee Hsieh <quic_khsieh@quicinc.com> wrote:
>
> There are 3 possible interrupt sources are handled by DP controller,
> HPDstatus, Controller state changes and Aux read/write transaction.
> At every irq, DP controller have to check isr status of every interrupt
> sources and service the interrupt if its isr status bits shows interrupts
> are pending. There is potential race condition may happen at current aux
> isr handler implementation since it is always complete dp_aux_cmd_fifo_tx()
> even irq is not for aux read or write transaction. This may cause aux read
> transaction return premature if host aux data read is in the middle of
> waiting for sink to complete transferring data to host while irq happen.
> This will cause host's receiving buffer contains unexpected data. This
> patch fixes this problem by checking aux isr and return immediately at
> aux isr handler if there are no any isr status bits set.
>
> Follows are the signature at kernel logs when problem happen,
> EDID has corrupt header
> panel-simple-dp-aux aux-aea0000.edp: Couldn't identify panel via EDID
> panel-simple-dp-aux aux-aea0000.edp: error -EIO: Couldn't detect panel nor find a fallback
>
> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
> ---
>  drivers/gpu/drm/msm/dp/dp_aux.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
> index d030a93..8f8b12a 100644
> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> @@ -423,6 +423,13 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
>
>         isr = dp_catalog_aux_get_irq(aux->catalog);
>
> +       /*
> +        * if this irq is not for aux transfer,
> +        * then return immediately
> +        */

Why do you need 4 lines for a comment that fits on one line?

> +       if (!isr)
> +               return;

I can confirm that this works for me. I could reproduce the EDID
problems in the past and I can't after this patch. ...so I could give
a:

Tested-by: Douglas Anderson <dianders@chromium.org>

I'm not an expert on this part of the code, so feel free to ignore my
other comments if everyone else thinks this patch is fine as-is, but
to me something here feels a little fragile. It feels a little weird
that we'll "complete" for _any_ interrupt that comes through now
rather than relying on dp_aux_native_handler() / dp_aux_i2c_handler()
to specifically identify interrupts that caused the end of the
transfer. I guess that idea is that every possible interrupt we get
causes the end of the transfer?

-Doug
Abhinav Kumar Dec. 14, 2022, 11:46 p.m. UTC | #2
Hi Doug

On 12/14/2022 2:29 PM, Doug Anderson wrote:
> Hi,
> 
> On Wed, Dec 14, 2022 at 1:21 PM Kuogee Hsieh <quic_khsieh@quicinc.com> wrote:
>>
>> There are 3 possible interrupt sources are handled by DP controller,
>> HPDstatus, Controller state changes and Aux read/write transaction.
>> At every irq, DP controller have to check isr status of every interrupt
>> sources and service the interrupt if its isr status bits shows interrupts
>> are pending. There is potential race condition may happen at current aux
>> isr handler implementation since it is always complete dp_aux_cmd_fifo_tx()
>> even irq is not for aux read or write transaction. This may cause aux read
>> transaction return premature if host aux data read is in the middle of
>> waiting for sink to complete transferring data to host while irq happen.
>> This will cause host's receiving buffer contains unexpected data. This
>> patch fixes this problem by checking aux isr and return immediately at
>> aux isr handler if there are no any isr status bits set.
>>
>> Follows are the signature at kernel logs when problem happen,
>> EDID has corrupt header
>> panel-simple-dp-aux aux-aea0000.edp: Couldn't identify panel via EDID
>> panel-simple-dp-aux aux-aea0000.edp: error -EIO: Couldn't detect panel nor find a fallback
>>
>> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
>> ---
>>   drivers/gpu/drm/msm/dp/dp_aux.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
>> index d030a93..8f8b12a 100644
>> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
>> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
>> @@ -423,6 +423,13 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
>>
>>          isr = dp_catalog_aux_get_irq(aux->catalog);
>>
>> +       /*
>> +        * if this irq is not for aux transfer,
>> +        * then return immediately
>> +        */
> 
> Why do you need 4 lines for a comment that fits on one line?
Yes, we can fit this to one line.
> 
>> +       if (!isr)
>> +               return;
> 
> I can confirm that this works for me. I could reproduce the EDID
> problems in the past and I can't after this patch. ...so I could give
> a:
> 
> Tested-by: Douglas Anderson <dianders@chromium.org>
> 
> I'm not an expert on this part of the code, so feel free to ignore my
> other comments if everyone else thinks this patch is fine as-is, but
> to me something here feels a little fragile. It feels a little weird
> that we'll "complete" for _any_ interrupt that comes through now
> rather than relying on dp_aux_native_handler() / dp_aux_i2c_handler()
> to specifically identify interrupts that caused the end of the
> transfer. I guess that idea is that every possible interrupt we get
> causes the end of the transfer?
> 
> -Doug

So this turned out to be more tricky and was a good finding from kuogee.

In the bad EDID case, it was technically not bad EDID.

What was happening was, the VIDEO_READY interrupt was continuously 
firing. Ideally, this should fire only once but due to some error 
condition it kept firing. We dont exactly know why yet what was the 
error condition making it continuously fire.

In the DP ISR, the dp_aux_isr() gets called even if it was not an aux 
interrupt which fired (so the call flow in this case was 
dp_display_irq_handler (triggered for VIDEO_READY) ---> dp_aux_isr()
So we should certainly have some protection to return early from this 
routine if there was no aux interrupt which fired.

Which is what this fix is doing.

Its not completing any interrupt, its just returning early if no aux 
interrupt fired.

So based on the logs I have seen and given that its fixing this 
stability issue.

Fixes: c943b4948b58 ("drm/msm/dp: add displayPort driver support")

Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Doug Anderson Dec. 15, 2022, 12:14 a.m. UTC | #3
Hi,

On Wed, Dec 14, 2022 at 3:46 PM Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
>
> Hi Doug
>
> On 12/14/2022 2:29 PM, Doug Anderson wrote:
> > Hi,
> >
> > On Wed, Dec 14, 2022 at 1:21 PM Kuogee Hsieh <quic_khsieh@quicinc.com> wrote:
> >>
> >> There are 3 possible interrupt sources are handled by DP controller,
> >> HPDstatus, Controller state changes and Aux read/write transaction.
> >> At every irq, DP controller have to check isr status of every interrupt
> >> sources and service the interrupt if its isr status bits shows interrupts
> >> are pending. There is potential race condition may happen at current aux
> >> isr handler implementation since it is always complete dp_aux_cmd_fifo_tx()
> >> even irq is not for aux read or write transaction. This may cause aux read
> >> transaction return premature if host aux data read is in the middle of
> >> waiting for sink to complete transferring data to host while irq happen.
> >> This will cause host's receiving buffer contains unexpected data. This
> >> patch fixes this problem by checking aux isr and return immediately at
> >> aux isr handler if there are no any isr status bits set.
> >>
> >> Follows are the signature at kernel logs when problem happen,
> >> EDID has corrupt header
> >> panel-simple-dp-aux aux-aea0000.edp: Couldn't identify panel via EDID
> >> panel-simple-dp-aux aux-aea0000.edp: error -EIO: Couldn't detect panel nor find a fallback
> >>
> >> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
> >> ---
> >>   drivers/gpu/drm/msm/dp/dp_aux.c | 7 +++++++
> >>   1 file changed, 7 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
> >> index d030a93..8f8b12a 100644
> >> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> >> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> >> @@ -423,6 +423,13 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
> >>
> >>          isr = dp_catalog_aux_get_irq(aux->catalog);
> >>
> >> +       /*
> >> +        * if this irq is not for aux transfer,
> >> +        * then return immediately
> >> +        */
> >
> > Why do you need 4 lines for a comment that fits on one line?
> Yes, we can fit this to one line.
> >
> >> +       if (!isr)
> >> +               return;
> >
> > I can confirm that this works for me. I could reproduce the EDID
> > problems in the past and I can't after this patch. ...so I could give
> > a:
> >
> > Tested-by: Douglas Anderson <dianders@chromium.org>
> >
> > I'm not an expert on this part of the code, so feel free to ignore my
> > other comments if everyone else thinks this patch is fine as-is, but
> > to me something here feels a little fragile. It feels a little weird
> > that we'll "complete" for _any_ interrupt that comes through now
> > rather than relying on dp_aux_native_handler() / dp_aux_i2c_handler()
> > to specifically identify interrupts that caused the end of the
> > transfer. I guess that idea is that every possible interrupt we get
> > causes the end of the transfer?
> >
> > -Doug
>
> So this turned out to be more tricky and was a good finding from kuogee.
>
> In the bad EDID case, it was technically not bad EDID.
>
> What was happening was, the VIDEO_READY interrupt was continuously
> firing. Ideally, this should fire only once but due to some error
> condition it kept firing. We dont exactly know why yet what was the
> error condition making it continuously fire.
>
> In the DP ISR, the dp_aux_isr() gets called even if it was not an aux
> interrupt which fired (so the call flow in this case was
> dp_display_irq_handler (triggered for VIDEO_READY) ---> dp_aux_isr()
> So we should certainly have some protection to return early from this
> routine if there was no aux interrupt which fired.
>
> Which is what this fix is doing.
>
> Its not completing any interrupt, its just returning early if no aux
> interrupt fired.

...but the whole problem was that it was doing the complete() at the
end, right? Kuogee even mentioned that in the commit message.
Specifically, I checked dp_aux_native_handler() and
dp_aux_i2c_handler(), both of which are passed the "isr". Unless I
messed up, both functions already were no-ops if the ISR was 0, even
before Kuogee's patch. That means that the only thing Kuogee's patch
does is to prevent the call to "complete(&aux->comp)" at the end of
"dp_aux_isr()".

...and it makes sense not to call the complete() if no "isr" is 0.
...but what I'm saying is that _any_ non-zero value of ISR will still
cause the complete() to be called after Kuogee's patch. That means
that if any of the 32-bits in the "isr" variable are set, that we will
call complete(). I'm asking if you're sure that every single bit of
the "isr" means that we're ready to call complete(). It feels like it
would be less fragile if dp_aux_native_handler() and
dp_aux_i2c_handler() (which both already look at the ISR) returned
some value saying whether the "isr" contained a bit that meant that
complete() should be called.

-Doug
Stephen Boyd Dec. 15, 2022, 12:29 a.m. UTC | #4
Quoting Doug Anderson (2022-12-14 16:14:42)
> Hi,
>
> On Wed, Dec 14, 2022 at 3:46 PM Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
> >
> > Hi Doug
> >
> > On 12/14/2022 2:29 PM, Doug Anderson wrote:
> > > Hi,
> > >
> > > On Wed, Dec 14, 2022 at 1:21 PM Kuogee Hsieh <quic_khsieh@quicinc.com> wrote:
> > >>
> > >> There are 3 possible interrupt sources are handled by DP controller,
> > >> HPDstatus, Controller state changes and Aux read/write transaction.
> > >> At every irq, DP controller have to check isr status of every interrupt
> > >> sources and service the interrupt if its isr status bits shows interrupts
> > >> are pending. There is potential race condition may happen at current aux
> > >> isr handler implementation since it is always complete dp_aux_cmd_fifo_tx()
> > >> even irq is not for aux read or write transaction. This may cause aux read
> > >> transaction return premature if host aux data read is in the middle of
> > >> waiting for sink to complete transferring data to host while irq happen.
> > >> This will cause host's receiving buffer contains unexpected data. This
> > >> patch fixes this problem by checking aux isr and return immediately at
> > >> aux isr handler if there are no any isr status bits set.
> > >>
> > >> Follows are the signature at kernel logs when problem happen,
> > >> EDID has corrupt header
> > >> panel-simple-dp-aux aux-aea0000.edp: Couldn't identify panel via EDID
> > >> panel-simple-dp-aux aux-aea0000.edp: error -EIO: Couldn't detect panel nor find a fallback
> > >>
> > >> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
> > >> ---
> > >>   drivers/gpu/drm/msm/dp/dp_aux.c | 7 +++++++
> > >>   1 file changed, 7 insertions(+)
> > >>
> > >> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
> > >> index d030a93..8f8b12a 100644
> > >> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> > >> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> > >> @@ -423,6 +423,13 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
> > >>
> > >>          isr = dp_catalog_aux_get_irq(aux->catalog);
> > >>
> > >> +       /*
> > >> +        * if this irq is not for aux transfer,
> > >> +        * then return immediately
> > >> +        */
> > >
> > > Why do you need 4 lines for a comment that fits on one line?
> > Yes, we can fit this to one line.
> > >
> > >> +       if (!isr)
> > >> +               return;
> > >
> > > I can confirm that this works for me. I could reproduce the EDID
> > > problems in the past and I can't after this patch. ...so I could give
> > > a:
> > >
> > > Tested-by: Douglas Anderson <dianders@chromium.org>
> > >
> > > I'm not an expert on this part of the code, so feel free to ignore my
> > > other comments if everyone else thinks this patch is fine as-is, but
> > > to me something here feels a little fragile. It feels a little weird
> > > that we'll "complete" for _any_ interrupt that comes through now
> > > rather than relying on dp_aux_native_handler() / dp_aux_i2c_handler()
> > > to specifically identify interrupts that caused the end of the
> > > transfer. I guess that idea is that every possible interrupt we get
> > > causes the end of the transfer?
> > >
> > > -Doug
> >
> > So this turned out to be more tricky and was a good finding from kuogee.
> >
> > In the bad EDID case, it was technically not bad EDID.
> >
> > What was happening was, the VIDEO_READY interrupt was continuously
> > firing. Ideally, this should fire only once but due to some error
> > condition it kept firing. We dont exactly know why yet what was the
> > error condition making it continuously fire.

This is a great detail that is missing from the commit text.

> >
> > In the DP ISR, the dp_aux_isr() gets called even if it was not an aux
> > interrupt which fired (so the call flow in this case was
> > dp_display_irq_handler (triggered for VIDEO_READY) ---> dp_aux_isr()
> > So we should certainly have some protection to return early from this
> > routine if there was no aux interrupt which fired.

I'm not sure that's a race condition though, more like a problem where
the completion is called unconditionally?

> >
> > Which is what this fix is doing.
> >
> > Its not completing any interrupt, its just returning early if no aux
> > interrupt fired.
>
> ...but the whole problem was that it was doing the complete() at the
> end, right? Kuogee even mentioned that in the commit message.
> Specifically, I checked dp_aux_native_handler() and
> dp_aux_i2c_handler(), both of which are passed the "isr". Unless I
> messed up, both functions already were no-ops if the ISR was 0, even
> before Kuogee's patch. That means that the only thing Kuogee's patch
> does is to prevent the call to "complete(&aux->comp)" at the end of
> "dp_aux_isr()".
>
> ...and it makes sense not to call the complete() if no "isr" is 0.
> ...but what I'm saying is that _any_ non-zero value of ISR will still
> cause the complete() to be called after Kuogee's patch. That means
> that if any of the 32-bits in the "isr" variable are set, that we will
> call complete(). I'm asking if you're sure that every single bit of
> the "isr" means that we're ready to call complete(). It feels like it
> would be less fragile if dp_aux_native_handler() and
> dp_aux_i2c_handler() (which both already look at the ISR) returned
> some value saying whether the "isr" contained a bit that meant that
> complete() should be called.
>

I'm almost certain I've asked for this before, but I can't find it
anymore. Can we also simplify the aux handlers to be a big pile of
if-else-if conditions that don't overwrite the 'aux_error_num'? That
would simplify the patch below.

---8<---
diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
index d030a93a08c3..ff79cad90d21 100644
--- a/drivers/gpu/drm/msm/dp/dp_aux.c
+++ b/drivers/gpu/drm/msm/dp/dp_aux.c
@@ -162,45 +162,73 @@ static ssize_t dp_aux_cmd_fifo_rx(struct
dp_aux_private *aux,
 	return i;
 }

-static void dp_aux_native_handler(struct dp_aux_private *aux, u32 isr)
+static irqreturn_t dp_aux_native_handler(struct dp_aux_private *aux, u32 isr)
 {
-	if (isr & DP_INTR_AUX_I2C_DONE)
+	irqreturn_t ret = IRQ_NONE;
+
+	if (isr & DP_INTR_AUX_I2C_DONE) {
 		aux->aux_error_num = DP_AUX_ERR_NONE;
-	else if (isr & DP_INTR_WRONG_ADDR)
+		ret = IRQ_HANDLED;
+	} else if (isr & DP_INTR_WRONG_ADDR) {
 		aux->aux_error_num = DP_AUX_ERR_ADDR;
-	else if (isr & DP_INTR_TIMEOUT)
+		ret = IRQ_HANDLED;
+	} else if (isr & DP_INTR_TIMEOUT) {
 		aux->aux_error_num = DP_AUX_ERR_TOUT;
-	if (isr & DP_INTR_NACK_DEFER)
+		ret = IRQ_HANDLED;
+	}
+
+	if (isr & DP_INTR_NACK_DEFER) {
 		aux->aux_error_num = DP_AUX_ERR_NACK;
+		ret = IRQ_HANDLED;
+	}
 	if (isr & DP_INTR_AUX_ERROR) {
 		aux->aux_error_num = DP_AUX_ERR_PHY;
 		dp_catalog_aux_clear_hw_interrupts(aux->catalog);
+		ret = IRQ_HANDLED;
 	}
+
+	return ret;
 }

-static void dp_aux_i2c_handler(struct dp_aux_private *aux, u32 isr)
+static irqreturn_t dp_aux_i2c_handler(struct dp_aux_private *aux, u32 isr)
 {
+	irqreturn_t ret = IRQ_NONE;
+
 	if (isr & DP_INTR_AUX_I2C_DONE) {
 		if (isr & (DP_INTR_I2C_NACK | DP_INTR_I2C_DEFER))
 			aux->aux_error_num = DP_AUX_ERR_NACK;
 		else
 			aux->aux_error_num = DP_AUX_ERR_NONE;
-	} else {
-		if (isr & DP_INTR_WRONG_ADDR)
-			aux->aux_error_num = DP_AUX_ERR_ADDR;
-		else if (isr & DP_INTR_TIMEOUT)
-			aux->aux_error_num = DP_AUX_ERR_TOUT;
-		if (isr & DP_INTR_NACK_DEFER)
-			aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
-		if (isr & DP_INTR_I2C_NACK)
-			aux->aux_error_num = DP_AUX_ERR_NACK;
-		if (isr & DP_INTR_I2C_DEFER)
-			aux->aux_error_num = DP_AUX_ERR_DEFER;
-		if (isr & DP_INTR_AUX_ERROR) {
-			aux->aux_error_num = DP_AUX_ERR_PHY;
-			dp_catalog_aux_clear_hw_interrupts(aux->catalog);
-		}
+		return IRQ_HANDLED;
+	}
+
+	if (isr & DP_INTR_WRONG_ADDR) {
+		aux->aux_error_num = DP_AUX_ERR_ADDR;
+		ret = IRQ_HANDLED;
+	} else if (isr & DP_INTR_TIMEOUT) {
+		aux->aux_error_num = DP_AUX_ERR_TOUT;
+		ret = IRQ_HANDLED;
+	}
+
+	if (isr & DP_INTR_NACK_DEFER) {
+		aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
+		ret = IRQ_HANDLED;
+	}
+	if (isr & DP_INTR_I2C_NACK) {
+		aux->aux_error_num = DP_AUX_ERR_NACK;
+		ret = IRQ_HANDLED;
+	}
+	if (isr & DP_INTR_I2C_DEFER) {
+		aux->aux_error_num = DP_AUX_ERR_DEFER;
+		ret = IRQ_HANDLED;
+	}
+	if (isr & DP_INTR_AUX_ERROR) {
+		aux->aux_error_num = DP_AUX_ERR_PHY;
+		dp_catalog_aux_clear_hw_interrupts(aux->catalog);
+		ret = IRQ_HANDLED;
 	}
+
+	return ret;
 }

 static void dp_aux_update_offset_and_segment(struct dp_aux_private *aux,
@@ -409,14 +437,15 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux *dp_aux,
 	return ret;
 }

-void dp_aux_isr(struct drm_dp_aux *dp_aux)
+irqreturn_t dp_aux_isr(struct drm_dp_aux *dp_aux)
 {
 	u32 isr;
 	struct dp_aux_private *aux;
+	irqreturn_t ret = IRQ_NONE;

 	if (!dp_aux) {
 		DRM_ERROR("invalid input\n");
-		return;
+		return ret;
 	}

 	aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
@@ -424,14 +453,17 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
 	isr = dp_catalog_aux_get_irq(aux->catalog);

 	if (!aux->cmd_busy)
-		return;
+		return ret;

 	if (aux->native)
-		dp_aux_native_handler(aux, isr);
+		ret |= dp_aux_native_handler(aux, isr);
 	else
-		dp_aux_i2c_handler(aux, isr);
+		ret |= dp_aux_i2c_handler(aux, isr);

-	complete(&aux->comp);
+	if (ret == IRQ_HANDLED)
+		complete(&aux->comp);
+
+	return ret;
 }

 void dp_aux_reconfig(struct drm_dp_aux *dp_aux)
diff --git a/drivers/gpu/drm/msm/dp/dp_aux.h b/drivers/gpu/drm/msm/dp/dp_aux.h
index e930974bcb5b..511305da4f66 100644
--- a/drivers/gpu/drm/msm/dp/dp_aux.h
+++ b/drivers/gpu/drm/msm/dp/dp_aux.h
@@ -11,7 +11,7 @@

 int dp_aux_register(struct drm_dp_aux *dp_aux);
 void dp_aux_unregister(struct drm_dp_aux *dp_aux);
-void dp_aux_isr(struct drm_dp_aux *dp_aux);
+irqreturn_t dp_aux_isr(struct drm_dp_aux *dp_aux);
 void dp_aux_init(struct drm_dp_aux *dp_aux);
 void dp_aux_deinit(struct drm_dp_aux *dp_aux);
 void dp_aux_reconfig(struct drm_dp_aux *dp_aux);
diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c b/drivers/gpu/drm/msm/dp/dp_ctrl.c
index dd26ca651a05..10c6d6847163 100644
--- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
+++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
@@ -1979,13 +1979,11 @@ int dp_ctrl_off(struct dp_ctrl *dp_ctrl)
 	return ret;
 }

-void dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
+irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
 {
 	struct dp_ctrl_private *ctrl;
 	u32 isr;
-
-	if (!dp_ctrl)
-		return;
+	irqreturn_t ret = IRQ_NONE;

 	ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);

@@ -1994,12 +1992,16 @@ void dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
 	if (isr & DP_CTRL_INTR_READY_FOR_VIDEO) {
 		drm_dbg_dp(ctrl->drm_dev, "dp_video_ready\n");
 		complete(&ctrl->video_comp);
+		ret = IRQ_HANDLED;
 	}

 	if (isr & DP_CTRL_INTR_IDLE_PATTERN_SENT) {
 		drm_dbg_dp(ctrl->drm_dev, "idle_patterns_sent\n");
 		complete(&ctrl->idle_comp);
+		ret = IRQ_HANDLED;
 	}
+
+	return ret;
 }

 struct dp_ctrl *dp_ctrl_get(struct device *dev, struct dp_link *link,
diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.h b/drivers/gpu/drm/msm/dp/dp_ctrl.h
index 9f29734af81c..c3af06dc87b1 100644
--- a/drivers/gpu/drm/msm/dp/dp_ctrl.h
+++ b/drivers/gpu/drm/msm/dp/dp_ctrl.h
@@ -25,7 +25,7 @@ int dp_ctrl_off_link_stream(struct dp_ctrl *dp_ctrl);
 int dp_ctrl_off_link(struct dp_ctrl *dp_ctrl);
 int dp_ctrl_off(struct dp_ctrl *dp_ctrl);
 void dp_ctrl_push_idle(struct dp_ctrl *dp_ctrl);
-void dp_ctrl_isr(struct dp_ctrl *dp_ctrl);
+irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl);
 void dp_ctrl_handle_sink_request(struct dp_ctrl *dp_ctrl);
 struct dp_ctrl *dp_ctrl_get(struct device *dev, struct dp_link *link,
 			struct dp_panel *panel,	struct drm_dp_aux *aux,
diff --git a/drivers/gpu/drm/msm/dp/dp_display.c
b/drivers/gpu/drm/msm/dp/dp_display.c
index a49f6dbbe888..559d9ab7954d 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -1192,7 +1192,7 @@ static int dp_hpd_event_thread_start(struct
dp_display_private *dp_priv)
 static irqreturn_t dp_display_irq_handler(int irq, void *dev_id)
 {
 	struct dp_display_private *dp = dev_id;
-	irqreturn_t ret = IRQ_HANDLED;
+	irqreturn_t ret = IRQ_NONE;
 	u32 hpd_isr_status;

 	if (!dp) {
@@ -1206,27 +1206,33 @@ static irqreturn_t dp_display_irq_handler(int
irq, void *dev_id)
 		drm_dbg_dp(dp->drm_dev, "type=%d isr=0x%x\n",
 			dp->dp_display.connector_type, hpd_isr_status);
 		/* hpd related interrupts */
-		if (hpd_isr_status & DP_DP_HPD_PLUG_INT_MASK)
+		if (hpd_isr_status & DP_DP_HPD_PLUG_INT_MASK) {
 			dp_add_event(dp, EV_HPD_PLUG_INT, 0, 0);
+			ret = IRQ_HANDLED;
+		}

 		if (hpd_isr_status & DP_DP_IRQ_HPD_INT_MASK) {
 			dp_add_event(dp, EV_IRQ_HPD_INT, 0, 0);
+			ret = IRQ_HANDLED;
 		}

 		if (hpd_isr_status & DP_DP_HPD_REPLUG_INT_MASK) {
 			dp_add_event(dp, EV_HPD_UNPLUG_INT, 0, 0);
 			dp_add_event(dp, EV_HPD_PLUG_INT, 0, 3);
+			ret = IRQ_HANDLED;
 		}

-		if (hpd_isr_status & DP_DP_HPD_UNPLUG_INT_MASK)
+		if (hpd_isr_status & DP_DP_HPD_UNPLUG_INT_MASK) {
 			dp_add_event(dp, EV_HPD_UNPLUG_INT, 0, 0);
+			ret = IRQ_HANDLED;
+		}
 	}

 	/* DP controller isr */
-	dp_ctrl_isr(dp->ctrl);
+	ret |= dp_ctrl_isr(dp->ctrl);

 	/* DP aux isr */
-	dp_aux_isr(dp->aux);
+	ret |= dp_aux_isr(dp->aux);

 	return ret;
 }
Abhinav Kumar Dec. 15, 2022, 12:37 a.m. UTC | #5
Hi Doug

On 12/14/2022 4:14 PM, Doug Anderson wrote:
> Hi,
> 
> On Wed, Dec 14, 2022 at 3:46 PM Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
>>
>> Hi Doug
>>
>> On 12/14/2022 2:29 PM, Doug Anderson wrote:
>>> Hi,
>>>
>>> On Wed, Dec 14, 2022 at 1:21 PM Kuogee Hsieh <quic_khsieh@quicinc.com> wrote:
>>>>
>>>> There are 3 possible interrupt sources are handled by DP controller,
>>>> HPDstatus, Controller state changes and Aux read/write transaction.
>>>> At every irq, DP controller have to check isr status of every interrupt
>>>> sources and service the interrupt if its isr status bits shows interrupts
>>>> are pending. There is potential race condition may happen at current aux
>>>> isr handler implementation since it is always complete dp_aux_cmd_fifo_tx()
>>>> even irq is not for aux read or write transaction. This may cause aux read
>>>> transaction return premature if host aux data read is in the middle of
>>>> waiting for sink to complete transferring data to host while irq happen.
>>>> This will cause host's receiving buffer contains unexpected data. This
>>>> patch fixes this problem by checking aux isr and return immediately at
>>>> aux isr handler if there are no any isr status bits set.
>>>>
>>>> Follows are the signature at kernel logs when problem happen,
>>>> EDID has corrupt header
>>>> panel-simple-dp-aux aux-aea0000.edp: Couldn't identify panel via EDID
>>>> panel-simple-dp-aux aux-aea0000.edp: error -EIO: Couldn't detect panel nor find a fallback
>>>>
>>>> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
>>>> ---
>>>>    drivers/gpu/drm/msm/dp/dp_aux.c | 7 +++++++
>>>>    1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
>>>> index d030a93..8f8b12a 100644
>>>> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
>>>> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
>>>> @@ -423,6 +423,13 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
>>>>
>>>>           isr = dp_catalog_aux_get_irq(aux->catalog);
>>>>
>>>> +       /*
>>>> +        * if this irq is not for aux transfer,
>>>> +        * then return immediately
>>>> +        */
>>>
>>> Why do you need 4 lines for a comment that fits on one line?
>> Yes, we can fit this to one line.
>>>
>>>> +       if (!isr)
>>>> +               return;
>>>
>>> I can confirm that this works for me. I could reproduce the EDID
>>> problems in the past and I can't after this patch. ...so I could give
>>> a:
>>>
>>> Tested-by: Douglas Anderson <dianders@chromium.org>
>>>
>>> I'm not an expert on this part of the code, so feel free to ignore my
>>> other comments if everyone else thinks this patch is fine as-is, but
>>> to me something here feels a little fragile. It feels a little weird
>>> that we'll "complete" for _any_ interrupt that comes through now
>>> rather than relying on dp_aux_native_handler() / dp_aux_i2c_handler()
>>> to specifically identify interrupts that caused the end of the
>>> transfer. I guess that idea is that every possible interrupt we get
>>> causes the end of the transfer?
>>>
>>> -Doug
>>
>> So this turned out to be more tricky and was a good finding from kuogee.
>>
>> In the bad EDID case, it was technically not bad EDID.
>>
>> What was happening was, the VIDEO_READY interrupt was continuously
>> firing. Ideally, this should fire only once but due to some error
>> condition it kept firing. We dont exactly know why yet what was the
>> error condition making it continuously fire.
>>
>> In the DP ISR, the dp_aux_isr() gets called even if it was not an aux
>> interrupt which fired (so the call flow in this case was
>> dp_display_irq_handler (triggered for VIDEO_READY) ---> dp_aux_isr()
>> So we should certainly have some protection to return early from this
>> routine if there was no aux interrupt which fired.
>>
>> Which is what this fix is doing.
>>
>> Its not completing any interrupt, its just returning early if no aux
>> interrupt fired.
> 
> ...but the whole problem was that it was doing the complete() at the
> end, right? Kuogee even mentioned that in the commit message.
> Specifically, I checked dp_aux_native_handler() and
> dp_aux_i2c_handler(), both of which are passed the "isr". Unless I
> messed up, both functions already were no-ops if the ISR was 0, even
> before Kuogee's patch. That means that the only thing Kuogee's patch
> does is to prevent the call to "complete(&aux->comp)" at the end of
> "dp_aux_isr()".
> 
> ...and it makes sense not to call the complete() if no "isr" is 0.
> ...but what I'm saying is that _any_ non-zero value of ISR will still
> cause the complete() to be called after Kuogee's patch. That means
> that if any of the 32-bits in the "isr" variable are set, that we will
> call complete(). I'm asking if you're sure that every single bit of
> the "isr" means that we're ready to call complete(). It feels like it
> would be less fragile if dp_aux_native_handler() and
> dp_aux_i2c_handler() (which both already look at the ISR) returned
> some value saying whether the "isr" contained a bit that meant that
> complete() should be called.
> 

Yes, so other than the "transfer done" bits, the other bits we listen to 
are below:

29 #define DP_INTERRUPT_STATUS1 \
30 	(DP_INTR_AUX_I2C_DONE| \
31 	DP_INTR_WRONG_ADDR | DP_INTR_TIMEOUT | \
32 	DP_INTR_NACK_DEFER | DP_INTR_WRONG_DATA_CNT | \
33 	DP_INTR_I2C_NACK | DP_INTR_I2C_DEFER | \
34 	DP_INTR_PLL_UNLOCKED | DP_INTR_AUX_ERROR

All of these, if they fire, will be handled in dp_aux_i2c_handler() and 
the aux_error_num will be assigned.

And only if aux_error_num is DP_AUX_ERR_NONE, we go further and read the 
data from the fifo.

So we should complete even if there is any bit set as they are error 
bits which will need to be handled.

> -Doug
Abhinav Kumar Dec. 15, 2022, 2:59 a.m. UTC | #6
Hi Stephen

On 12/14/2022 4:29 PM, Stephen Boyd wrote:
> Quoting Doug Anderson (2022-12-14 16:14:42)
>> Hi,
>>
>> On Wed, Dec 14, 2022 at 3:46 PM Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
>>>
>>> Hi Doug
>>>
>>> On 12/14/2022 2:29 PM, Doug Anderson wrote:
>>>> Hi,
>>>>
>>>> On Wed, Dec 14, 2022 at 1:21 PM Kuogee Hsieh <quic_khsieh@quicinc.com> wrote:
>>>>>
>>>>> There are 3 possible interrupt sources are handled by DP controller,
>>>>> HPDstatus, Controller state changes and Aux read/write transaction.
>>>>> At every irq, DP controller have to check isr status of every interrupt
>>>>> sources and service the interrupt if its isr status bits shows interrupts
>>>>> are pending. There is potential race condition may happen at current aux
>>>>> isr handler implementation since it is always complete dp_aux_cmd_fifo_tx()
>>>>> even irq is not for aux read or write transaction. This may cause aux read
>>>>> transaction return premature if host aux data read is in the middle of
>>>>> waiting for sink to complete transferring data to host while irq happen.
>>>>> This will cause host's receiving buffer contains unexpected data. This
>>>>> patch fixes this problem by checking aux isr and return immediately at
>>>>> aux isr handler if there are no any isr status bits set.
>>>>>
>>>>> Follows are the signature at kernel logs when problem happen,
>>>>> EDID has corrupt header
>>>>> panel-simple-dp-aux aux-aea0000.edp: Couldn't identify panel via EDID
>>>>> panel-simple-dp-aux aux-aea0000.edp: error -EIO: Couldn't detect panel nor find a fallback
>>>>>
>>>>> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
>>>>> ---
>>>>>    drivers/gpu/drm/msm/dp/dp_aux.c | 7 +++++++
>>>>>    1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
>>>>> index d030a93..8f8b12a 100644
>>>>> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
>>>>> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
>>>>> @@ -423,6 +423,13 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
>>>>>
>>>>>           isr = dp_catalog_aux_get_irq(aux->catalog);
>>>>>
>>>>> +       /*
>>>>> +        * if this irq is not for aux transfer,
>>>>> +        * then return immediately
>>>>> +        */
>>>>
>>>> Why do you need 4 lines for a comment that fits on one line?
>>> Yes, we can fit this to one line.
>>>>
>>>>> +       if (!isr)
>>>>> +               return;
>>>>
>>>> I can confirm that this works for me. I could reproduce the EDID
>>>> problems in the past and I can't after this patch. ...so I could give
>>>> a:
>>>>
>>>> Tested-by: Douglas Anderson <dianders@chromium.org>
>>>>
>>>> I'm not an expert on this part of the code, so feel free to ignore my
>>>> other comments if everyone else thinks this patch is fine as-is, but
>>>> to me something here feels a little fragile. It feels a little weird
>>>> that we'll "complete" for _any_ interrupt that comes through now
>>>> rather than relying on dp_aux_native_handler() / dp_aux_i2c_handler()
>>>> to specifically identify interrupts that caused the end of the
>>>> transfer. I guess that idea is that every possible interrupt we get
>>>> causes the end of the transfer?
>>>>
>>>> -Doug
>>>
>>> So this turned out to be more tricky and was a good finding from kuogee.
>>>
>>> In the bad EDID case, it was technically not bad EDID.
>>>
>>> What was happening was, the VIDEO_READY interrupt was continuously
>>> firing. Ideally, this should fire only once but due to some error
>>> condition it kept firing. We dont exactly know why yet what was the
>>> error condition making it continuously fire.
> 
> This is a great detail that is missing from the commit text.
> 
Yup, we should update this.
>>>
>>> In the DP ISR, the dp_aux_isr() gets called even if it was not an aux
>>> interrupt which fired (so the call flow in this case was
>>> dp_display_irq_handler (triggered for VIDEO_READY) ---> dp_aux_isr()
>>> So we should certainly have some protection to return early from this
>>> routine if there was no aux interrupt which fired.
> 
> I'm not sure that's a race condition though, more like a problem where
> the completion is called unconditionally?
> 

hmm ... True.

>>>
>>> Which is what this fix is doing.
>>>
>>> Its not completing any interrupt, its just returning early if no aux
>>> interrupt fired.
>>
>> ...but the whole problem was that it was doing the complete() at the
>> end, right? Kuogee even mentioned that in the commit message.
>> Specifically, I checked dp_aux_native_handler() and
>> dp_aux_i2c_handler(), both of which are passed the "isr". Unless I
>> messed up, both functions already were no-ops if the ISR was 0, even
>> before Kuogee's patch. That means that the only thing Kuogee's patch
>> does is to prevent the call to "complete(&aux->comp)" at the end of
>> "dp_aux_isr()".
>>
>> ...and it makes sense not to call the complete() if no "isr" is 0.
>> ...but what I'm saying is that _any_ non-zero value of ISR will still
>> cause the complete() to be called after Kuogee's patch. That means
>> that if any of the 32-bits in the "isr" variable are set, that we will
>> call complete(). I'm asking if you're sure that every single bit of
>> the "isr" means that we're ready to call complete(). It feels like it
>> would be less fragile if dp_aux_native_handler() and
>> dp_aux_i2c_handler() (which both already look at the ISR) returned
>> some value saying whether the "isr" contained a bit that meant that
>> complete() should be called.
>>
> 
> I'm almost certain I've asked for this before, but I can't find it
> anymore. Can we also simplify the aux handlers to be a big pile of
> if-else-if conditions that don't overwrite the 'aux_error_num'? That
> would simplify the patch below.
> 

Okay, this makes it much clear about what Doug was trying to explain.

This certainly improves the irq return code better (handled Vs none).

In terms of the functionality of it, although it looks too simple, the 
current change was okay.

But, I agree that this would be more robust in terms of the irq return 
codes.

Kuogee, any concerns with this?

> ---8<---
> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
> index d030a93a08c3..ff79cad90d21 100644
> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> @@ -162,45 +162,73 @@ static ssize_t dp_aux_cmd_fifo_rx(struct
> dp_aux_private *aux,
>   	return i;
>   }
> 
> -static void dp_aux_native_handler(struct dp_aux_private *aux, u32 isr)
> +static irqreturn_t dp_aux_native_handler(struct dp_aux_private *aux, u32 isr)
>   {
> -	if (isr & DP_INTR_AUX_I2C_DONE)
> +	irqreturn_t ret = IRQ_NONE;
> +
> +	if (isr & DP_INTR_AUX_I2C_DONE) {
>   		aux->aux_error_num = DP_AUX_ERR_NONE;
> -	else if (isr & DP_INTR_WRONG_ADDR)
> +		ret = IRQ_HANDLED;
> +	} else if (isr & DP_INTR_WRONG_ADDR) {
>   		aux->aux_error_num = DP_AUX_ERR_ADDR;
> -	else if (isr & DP_INTR_TIMEOUT)
> +		ret = IRQ_HANDLED;
> +	} else if (isr & DP_INTR_TIMEOUT) {
>   		aux->aux_error_num = DP_AUX_ERR_TOUT;
> -	if (isr & DP_INTR_NACK_DEFER)
> +		ret = IRQ_HANDLED;
> +	}
> +
> +	if (isr & DP_INTR_NACK_DEFER) {
>   		aux->aux_error_num = DP_AUX_ERR_NACK;
> +		ret = IRQ_HANDLED;
> +	}
>   	if (isr & DP_INTR_AUX_ERROR) {
>   		aux->aux_error_num = DP_AUX_ERR_PHY;
>   		dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> +		ret = IRQ_HANDLED;
>   	}
> +
> +	return ret;
>   }
> 
> -static void dp_aux_i2c_handler(struct dp_aux_private *aux, u32 isr)
> +static irqreturn_t dp_aux_i2c_handler(struct dp_aux_private *aux, u32 isr)
>   {
> +	irqreturn_t ret = IRQ_NONE;
> +
>   	if (isr & DP_INTR_AUX_I2C_DONE) {
>   		if (isr & (DP_INTR_I2C_NACK | DP_INTR_I2C_DEFER))
>   			aux->aux_error_num = DP_AUX_ERR_NACK;
>   		else
>   			aux->aux_error_num = DP_AUX_ERR_NONE;
> -	} else {
> -		if (isr & DP_INTR_WRONG_ADDR)
> -			aux->aux_error_num = DP_AUX_ERR_ADDR;
> -		else if (isr & DP_INTR_TIMEOUT)
> -			aux->aux_error_num = DP_AUX_ERR_TOUT;
> -		if (isr & DP_INTR_NACK_DEFER)
> -			aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
> -		if (isr & DP_INTR_I2C_NACK)
> -			aux->aux_error_num = DP_AUX_ERR_NACK;
> -		if (isr & DP_INTR_I2C_DEFER)
> -			aux->aux_error_num = DP_AUX_ERR_DEFER;
> -		if (isr & DP_INTR_AUX_ERROR) {
> -			aux->aux_error_num = DP_AUX_ERR_PHY;
> -			dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> -		}
> +		return IRQ_HANDLED;
> +	}
> +
> +	if (isr & DP_INTR_WRONG_ADDR) {
> +		aux->aux_error_num = DP_AUX_ERR_ADDR;
> +		ret = IRQ_HANDLED;
> +	} else if (isr & DP_INTR_TIMEOUT) {
> +		aux->aux_error_num = DP_AUX_ERR_TOUT;
> +		ret = IRQ_HANDLED;
> +	}
> +
> +	if (isr & DP_INTR_NACK_DEFER) {
> +		aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
> +		ret = IRQ_HANDLED;
> +	}
> +	if (isr & DP_INTR_I2C_NACK) {
> +		aux->aux_error_num = DP_AUX_ERR_NACK;
> +		ret = IRQ_HANDLED;
> +	}
> +	if (isr & DP_INTR_I2C_DEFER) {
> +		aux->aux_error_num = DP_AUX_ERR_DEFER;
> +		ret = IRQ_HANDLED;
> +	}
> +	if (isr & DP_INTR_AUX_ERROR) {
> +		aux->aux_error_num = DP_AUX_ERR_PHY;
> +		dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> +		ret = IRQ_HANDLED;
>   	}
> +
> +	return ret;
>   }
> 
>   static void dp_aux_update_offset_and_segment(struct dp_aux_private *aux,
> @@ -409,14 +437,15 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux *dp_aux,
>   	return ret;
>   }
> 
> -void dp_aux_isr(struct drm_dp_aux *dp_aux)
> +irqreturn_t dp_aux_isr(struct drm_dp_aux *dp_aux)
>   {
>   	u32 isr;
>   	struct dp_aux_private *aux;
> +	irqreturn_t ret = IRQ_NONE;
> 
>   	if (!dp_aux) {
>   		DRM_ERROR("invalid input\n");
> -		return;
> +		return ret;
>   	}
> 
>   	aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
> @@ -424,14 +453,17 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
>   	isr = dp_catalog_aux_get_irq(aux->catalog);
> 
>   	if (!aux->cmd_busy)
> -		return;
> +		return ret;
> 
>   	if (aux->native)
> -		dp_aux_native_handler(aux, isr);
> +		ret |= dp_aux_native_handler(aux, isr);
>   	else
> -		dp_aux_i2c_handler(aux, isr);
> +		ret |= dp_aux_i2c_handler(aux, isr);
> 
> -	complete(&aux->comp);
> +	if (ret == IRQ_HANDLED)
> +		complete(&aux->comp);
> +
> +	return ret;
>   }
> 
>   void dp_aux_reconfig(struct drm_dp_aux *dp_aux)
> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.h b/drivers/gpu/drm/msm/dp/dp_aux.h
> index e930974bcb5b..511305da4f66 100644
> --- a/drivers/gpu/drm/msm/dp/dp_aux.h
> +++ b/drivers/gpu/drm/msm/dp/dp_aux.h
> @@ -11,7 +11,7 @@
> 
>   int dp_aux_register(struct drm_dp_aux *dp_aux);
>   void dp_aux_unregister(struct drm_dp_aux *dp_aux);
> -void dp_aux_isr(struct drm_dp_aux *dp_aux);
> +irqreturn_t dp_aux_isr(struct drm_dp_aux *dp_aux);
>   void dp_aux_init(struct drm_dp_aux *dp_aux);
>   void dp_aux_deinit(struct drm_dp_aux *dp_aux);
>   void dp_aux_reconfig(struct drm_dp_aux *dp_aux);
> diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c b/drivers/gpu/drm/msm/dp/dp_ctrl.c
> index dd26ca651a05..10c6d6847163 100644
> --- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
> +++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
> @@ -1979,13 +1979,11 @@ int dp_ctrl_off(struct dp_ctrl *dp_ctrl)
>   	return ret;
>   }
> 
> -void dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
> +irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
>   {
>   	struct dp_ctrl_private *ctrl;
>   	u32 isr;
> -
> -	if (!dp_ctrl)
> -		return;
> +	irqreturn_t ret = IRQ_NONE;
> 
>   	ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
> 
> @@ -1994,12 +1992,16 @@ void dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
>   	if (isr & DP_CTRL_INTR_READY_FOR_VIDEO) {
>   		drm_dbg_dp(ctrl->drm_dev, "dp_video_ready\n");
>   		complete(&ctrl->video_comp);
> +		ret = IRQ_HANDLED;
>   	}
> 
>   	if (isr & DP_CTRL_INTR_IDLE_PATTERN_SENT) {
>   		drm_dbg_dp(ctrl->drm_dev, "idle_patterns_sent\n");
>   		complete(&ctrl->idle_comp);
> +		ret = IRQ_HANDLED;
>   	}
> +
> +	return ret;
>   }
> 
>   struct dp_ctrl *dp_ctrl_get(struct device *dev, struct dp_link *link,
> diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.h b/drivers/gpu/drm/msm/dp/dp_ctrl.h
> index 9f29734af81c..c3af06dc87b1 100644
> --- a/drivers/gpu/drm/msm/dp/dp_ctrl.h
> +++ b/drivers/gpu/drm/msm/dp/dp_ctrl.h
> @@ -25,7 +25,7 @@ int dp_ctrl_off_link_stream(struct dp_ctrl *dp_ctrl);
>   int dp_ctrl_off_link(struct dp_ctrl *dp_ctrl);
>   int dp_ctrl_off(struct dp_ctrl *dp_ctrl);
>   void dp_ctrl_push_idle(struct dp_ctrl *dp_ctrl);
> -void dp_ctrl_isr(struct dp_ctrl *dp_ctrl);
> +irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl);
>   void dp_ctrl_handle_sink_request(struct dp_ctrl *dp_ctrl);
>   struct dp_ctrl *dp_ctrl_get(struct device *dev, struct dp_link *link,
>   			struct dp_panel *panel,	struct drm_dp_aux *aux,
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c
> b/drivers/gpu/drm/msm/dp/dp_display.c
> index a49f6dbbe888..559d9ab7954d 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -1192,7 +1192,7 @@ static int dp_hpd_event_thread_start(struct
> dp_display_private *dp_priv)
>   static irqreturn_t dp_display_irq_handler(int irq, void *dev_id)
>   {
>   	struct dp_display_private *dp = dev_id;
> -	irqreturn_t ret = IRQ_HANDLED;
> +	irqreturn_t ret = IRQ_NONE;
>   	u32 hpd_isr_status;
> 
>   	if (!dp) {
> @@ -1206,27 +1206,33 @@ static irqreturn_t dp_display_irq_handler(int
> irq, void *dev_id)
>   		drm_dbg_dp(dp->drm_dev, "type=%d isr=0x%x\n",
>   			dp->dp_display.connector_type, hpd_isr_status);
>   		/* hpd related interrupts */
> -		if (hpd_isr_status & DP_DP_HPD_PLUG_INT_MASK)
> +		if (hpd_isr_status & DP_DP_HPD_PLUG_INT_MASK) {
>   			dp_add_event(dp, EV_HPD_PLUG_INT, 0, 0);
> +			ret = IRQ_HANDLED;
> +		}
> 
>   		if (hpd_isr_status & DP_DP_IRQ_HPD_INT_MASK) {
>   			dp_add_event(dp, EV_IRQ_HPD_INT, 0, 0);
> +			ret = IRQ_HANDLED;
>   		}
> 
>   		if (hpd_isr_status & DP_DP_HPD_REPLUG_INT_MASK) {
>   			dp_add_event(dp, EV_HPD_UNPLUG_INT, 0, 0);
>   			dp_add_event(dp, EV_HPD_PLUG_INT, 0, 3);
> +			ret = IRQ_HANDLED;
>   		}
> 
> -		if (hpd_isr_status & DP_DP_HPD_UNPLUG_INT_MASK)
> +		if (hpd_isr_status & DP_DP_HPD_UNPLUG_INT_MASK) {
>   			dp_add_event(dp, EV_HPD_UNPLUG_INT, 0, 0);
> +			ret = IRQ_HANDLED;
> +		}
>   	}
> 
>   	/* DP controller isr */
> -	dp_ctrl_isr(dp->ctrl);
> +	ret |= dp_ctrl_isr(dp->ctrl);
> 
>   	/* DP aux isr */
> -	dp_aux_isr(dp->aux);
> +	ret |= dp_aux_isr(dp->aux);
> 
>   	return ret;
>   }
Kuogee Hsieh Dec. 15, 2022, 5:12 p.m. UTC | #7
On 12/14/2022 6:59 PM, Abhinav Kumar wrote:
> Hi Stephen
>
> On 12/14/2022 4:29 PM, Stephen Boyd wrote:
>> Quoting Doug Anderson (2022-12-14 16:14:42)
>>> Hi,
>>>
>>> On Wed, Dec 14, 2022 at 3:46 PM Abhinav Kumar 
>>> <quic_abhinavk@quicinc.com> wrote:
>>>>
>>>> Hi Doug
>>>>
>>>> On 12/14/2022 2:29 PM, Doug Anderson wrote:
>>>>> Hi,
>>>>>
>>>>> On Wed, Dec 14, 2022 at 1:21 PM Kuogee Hsieh 
>>>>> <quic_khsieh@quicinc.com> wrote:
>>>>>>
>>>>>> There are 3 possible interrupt sources are handled by DP controller,
>>>>>> HPDstatus, Controller state changes and Aux read/write transaction.
>>>>>> At every irq, DP controller have to check isr status of every 
>>>>>> interrupt
>>>>>> sources and service the interrupt if its isr status bits shows 
>>>>>> interrupts
>>>>>> are pending. There is potential race condition may happen at 
>>>>>> current aux
>>>>>> isr handler implementation since it is always complete 
>>>>>> dp_aux_cmd_fifo_tx()
>>>>>> even irq is not for aux read or write transaction. This may cause 
>>>>>> aux read
>>>>>> transaction return premature if host aux data read is in the 
>>>>>> middle of
>>>>>> waiting for sink to complete transferring data to host while irq 
>>>>>> happen.
>>>>>> This will cause host's receiving buffer contains unexpected data. 
>>>>>> This
>>>>>> patch fixes this problem by checking aux isr and return 
>>>>>> immediately at
>>>>>> aux isr handler if there are no any isr status bits set.
>>>>>>
>>>>>> Follows are the signature at kernel logs when problem happen,
>>>>>> EDID has corrupt header
>>>>>> panel-simple-dp-aux aux-aea0000.edp: Couldn't identify panel via 
>>>>>> EDID
>>>>>> panel-simple-dp-aux aux-aea0000.edp: error -EIO: Couldn't detect 
>>>>>> panel nor find a fallback
>>>>>>
>>>>>> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
>>>>>> ---
>>>>>>    drivers/gpu/drm/msm/dp/dp_aux.c | 7 +++++++
>>>>>>    1 file changed, 7 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c 
>>>>>> b/drivers/gpu/drm/msm/dp/dp_aux.c
>>>>>> index d030a93..8f8b12a 100644
>>>>>> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
>>>>>> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
>>>>>> @@ -423,6 +423,13 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
>>>>>>
>>>>>>           isr = dp_catalog_aux_get_irq(aux->catalog);
>>>>>>
>>>>>> +       /*
>>>>>> +        * if this irq is not for aux transfer,
>>>>>> +        * then return immediately
>>>>>> +        */
>>>>>
>>>>> Why do you need 4 lines for a comment that fits on one line?
>>>> Yes, we can fit this to one line.
>>>>>
>>>>>> +       if (!isr)
>>>>>> +               return;
>>>>>
>>>>> I can confirm that this works for me. I could reproduce the EDID
>>>>> problems in the past and I can't after this patch. ...so I could give
>>>>> a:
>>>>>
>>>>> Tested-by: Douglas Anderson <dianders@chromium.org>
>>>>>
>>>>> I'm not an expert on this part of the code, so feel free to ignore my
>>>>> other comments if everyone else thinks this patch is fine as-is, but
>>>>> to me something here feels a little fragile. It feels a little weird
>>>>> that we'll "complete" for _any_ interrupt that comes through now
>>>>> rather than relying on dp_aux_native_handler() / dp_aux_i2c_handler()
>>>>> to specifically identify interrupts that caused the end of the
>>>>> transfer. I guess that idea is that every possible interrupt we get
>>>>> causes the end of the transfer?
>>>>>
>>>>> -Doug
>>>>
>>>> So this turned out to be more tricky and was a good finding from 
>>>> kuogee.
>>>>
>>>> In the bad EDID case, it was technically not bad EDID.
>>>>
>>>> What was happening was, the VIDEO_READY interrupt was continuously
>>>> firing. Ideally, this should fire only once but due to some error
>>>> condition it kept firing. We dont exactly know why yet what was the
>>>> error condition making it continuously fire.
>>
>> This is a great detail that is missing from the commit text.
>>
> Yup, we should update this.
>>>>
>>>> In the DP ISR, the dp_aux_isr() gets called even if it was not an aux
>>>> interrupt which fired (so the call flow in this case was
>>>> dp_display_irq_handler (triggered for VIDEO_READY) ---> dp_aux_isr()
>>>> So we should certainly have some protection to return early from this
>>>> routine if there was no aux interrupt which fired.
>>
>> I'm not sure that's a race condition though, more like a problem where
>> the completion is called unconditionally?
>>
>
> hmm ... True.
>
>>>>
>>>> Which is what this fix is doing.
>>>>
>>>> Its not completing any interrupt, its just returning early if no aux
>>>> interrupt fired.
>>>
>>> ...but the whole problem was that it was doing the complete() at the
>>> end, right? Kuogee even mentioned that in the commit message.
>>> Specifically, I checked dp_aux_native_handler() and
>>> dp_aux_i2c_handler(), both of which are passed the "isr". Unless I
>>> messed up, both functions already were no-ops if the ISR was 0, even
>>> before Kuogee's patch. That means that the only thing Kuogee's patch
>>> does is to prevent the call to "complete(&aux->comp)" at the end of
>>> "dp_aux_isr()".
>>>
>>> ...and it makes sense not to call the complete() if no "isr" is 0.
>>> ...but what I'm saying is that _any_ non-zero value of ISR will still
>>> cause the complete() to be called after Kuogee's patch. That means
>>> that if any of the 32-bits in the "isr" variable are set, that we will
>>> call complete(). I'm asking if you're sure that every single bit of
>>> the "isr" means that we're ready to call complete(). It feels like it
>>> would be less fragile if dp_aux_native_handler() and
>>> dp_aux_i2c_handler() (which both already look at the ISR) returned
>>> some value saying whether the "isr" contained a bit that meant that
>>> complete() should be called.
>>>
>>
>> I'm almost certain I've asked for this before, but I can't find it
>> anymore. Can we also simplify the aux handlers to be a big pile of
>> if-else-if conditions that don't overwrite the 'aux_error_num'? That
>> would simplify the patch below.
>>
>
> Okay, this makes it much clear about what Doug was trying to explain.
>
> This certainly improves the irq return code better (handled Vs none).
>
> In terms of the functionality of it, although it looks too simple, the 
> current change was okay.
>
> But, I agree that this would be more robust in terms of the irq return 
> codes.
>
> Kuogee, any concerns with this?
no, let me improve the aux is handler.
>
>> ---8<---
>> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c 
>> b/drivers/gpu/drm/msm/dp/dp_aux.c
>> index d030a93a08c3..ff79cad90d21 100644
>> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
>> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
>> @@ -162,45 +162,73 @@ static ssize_t dp_aux_cmd_fifo_rx(struct
>> dp_aux_private *aux,
>>       return i;
>>   }
>>
>> -static void dp_aux_native_handler(struct dp_aux_private *aux, u32 isr)
>> +static irqreturn_t dp_aux_native_handler(struct dp_aux_private *aux, 
>> u32 isr)
>>   {
>> -    if (isr & DP_INTR_AUX_I2C_DONE)
>> +    irqreturn_t ret = IRQ_NONE;
>> +
>> +    if (isr & DP_INTR_AUX_I2C_DONE) {
>>           aux->aux_error_num = DP_AUX_ERR_NONE;
>> -    else if (isr & DP_INTR_WRONG_ADDR)
>> +        ret = IRQ_HANDLED;
>> +    } else if (isr & DP_INTR_WRONG_ADDR) {
>>           aux->aux_error_num = DP_AUX_ERR_ADDR;
>> -    else if (isr & DP_INTR_TIMEOUT)
>> +        ret = IRQ_HANDLED;
>> +    } else if (isr & DP_INTR_TIMEOUT) {
>>           aux->aux_error_num = DP_AUX_ERR_TOUT;
>> -    if (isr & DP_INTR_NACK_DEFER)
>> +        ret = IRQ_HANDLED;
>> +    }
>> +
>> +    if (isr & DP_INTR_NACK_DEFER) {
>>           aux->aux_error_num = DP_AUX_ERR_NACK;
>> +        ret = IRQ_HANDLED;
>> +    }
>>       if (isr & DP_INTR_AUX_ERROR) {
>>           aux->aux_error_num = DP_AUX_ERR_PHY;
>>           dp_catalog_aux_clear_hw_interrupts(aux->catalog);
>> +        ret = IRQ_HANDLED;
>>       }
>> +
>> +    return ret;
>>   }
>>
>> -static void dp_aux_i2c_handler(struct dp_aux_private *aux, u32 isr)
>> +static irqreturn_t dp_aux_i2c_handler(struct dp_aux_private *aux, 
>> u32 isr)
>>   {
>> +    irqreturn_t ret = IRQ_NONE;
>> +
>>       if (isr & DP_INTR_AUX_I2C_DONE) {
>>           if (isr & (DP_INTR_I2C_NACK | DP_INTR_I2C_DEFER))
>>               aux->aux_error_num = DP_AUX_ERR_NACK;
>>           else
>>               aux->aux_error_num = DP_AUX_ERR_NONE;
>> -    } else {
>> -        if (isr & DP_INTR_WRONG_ADDR)
>> -            aux->aux_error_num = DP_AUX_ERR_ADDR;
>> -        else if (isr & DP_INTR_TIMEOUT)
>> -            aux->aux_error_num = DP_AUX_ERR_TOUT;
>> -        if (isr & DP_INTR_NACK_DEFER)
>> -            aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
>> -        if (isr & DP_INTR_I2C_NACK)
>> -            aux->aux_error_num = DP_AUX_ERR_NACK;
>> -        if (isr & DP_INTR_I2C_DEFER)
>> -            aux->aux_error_num = DP_AUX_ERR_DEFER;
>> -        if (isr & DP_INTR_AUX_ERROR) {
>> -            aux->aux_error_num = DP_AUX_ERR_PHY;
>> - dp_catalog_aux_clear_hw_interrupts(aux->catalog);
>> -        }
>> +        return IRQ_HANDLED;
>> +    }
>> +
>> +    if (isr & DP_INTR_WRONG_ADDR) {
>> +        aux->aux_error_num = DP_AUX_ERR_ADDR;
>> +        ret = IRQ_HANDLED;
>> +    } else if (isr & DP_INTR_TIMEOUT) {
>> +        aux->aux_error_num = DP_AUX_ERR_TOUT;
>> +        ret = IRQ_HANDLED;
>> +    }
>> +
>> +    if (isr & DP_INTR_NACK_DEFER) {
>> +        aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
>> +        ret = IRQ_HANDLED;
>> +    }
>> +    if (isr & DP_INTR_I2C_NACK) {
>> +        aux->aux_error_num = DP_AUX_ERR_NACK;
>> +        ret = IRQ_HANDLED;
>> +    }
>> +    if (isr & DP_INTR_I2C_DEFER) {
>> +        aux->aux_error_num = DP_AUX_ERR_DEFER;
>> +        ret = IRQ_HANDLED;
>> +    }
>> +    if (isr & DP_INTR_AUX_ERROR) {
>> +        aux->aux_error_num = DP_AUX_ERR_PHY;
>> +        dp_catalog_aux_clear_hw_interrupts(aux->catalog);
>> +        ret = IRQ_HANDLED;
>>       }
>> +
>> +    return ret;
>>   }
>>
>>   static void dp_aux_update_offset_and_segment(struct dp_aux_private 
>> *aux,
>> @@ -409,14 +437,15 @@ static ssize_t dp_aux_transfer(struct 
>> drm_dp_aux *dp_aux,
>>       return ret;
>>   }
>>
>> -void dp_aux_isr(struct drm_dp_aux *dp_aux)
>> +irqreturn_t dp_aux_isr(struct drm_dp_aux *dp_aux)
>>   {
>>       u32 isr;
>>       struct dp_aux_private *aux;
>> +    irqreturn_t ret = IRQ_NONE;
>>
>>       if (!dp_aux) {
>>           DRM_ERROR("invalid input\n");
>> -        return;
>> +        return ret;
>>       }
>>
>>       aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
>> @@ -424,14 +453,17 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
>>       isr = dp_catalog_aux_get_irq(aux->catalog);
>>
>>       if (!aux->cmd_busy)
>> -        return;
>> +        return ret;
>>
>>       if (aux->native)
>> -        dp_aux_native_handler(aux, isr);
>> +        ret |= dp_aux_native_handler(aux, isr);
>>       else
>> -        dp_aux_i2c_handler(aux, isr);
>> +        ret |= dp_aux_i2c_handler(aux, isr);
>>
>> -    complete(&aux->comp);
>> +    if (ret == IRQ_HANDLED)
>> +        complete(&aux->comp);
>> +
>> +    return ret;
>>   }
>>
>>   void dp_aux_reconfig(struct drm_dp_aux *dp_aux)
>> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.h 
>> b/drivers/gpu/drm/msm/dp/dp_aux.h
>> index e930974bcb5b..511305da4f66 100644
>> --- a/drivers/gpu/drm/msm/dp/dp_aux.h
>> +++ b/drivers/gpu/drm/msm/dp/dp_aux.h
>> @@ -11,7 +11,7 @@
>>
>>   int dp_aux_register(struct drm_dp_aux *dp_aux);
>>   void dp_aux_unregister(struct drm_dp_aux *dp_aux);
>> -void dp_aux_isr(struct drm_dp_aux *dp_aux);
>> +irqreturn_t dp_aux_isr(struct drm_dp_aux *dp_aux);
>>   void dp_aux_init(struct drm_dp_aux *dp_aux);
>>   void dp_aux_deinit(struct drm_dp_aux *dp_aux);
>>   void dp_aux_reconfig(struct drm_dp_aux *dp_aux);
>> diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c 
>> b/drivers/gpu/drm/msm/dp/dp_ctrl.c
>> index dd26ca651a05..10c6d6847163 100644
>> --- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
>> +++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
>> @@ -1979,13 +1979,11 @@ int dp_ctrl_off(struct dp_ctrl *dp_ctrl)
>>       return ret;
>>   }
>>
>> -void dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
>> +irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
>>   {
>>       struct dp_ctrl_private *ctrl;
>>       u32 isr;
>> -
>> -    if (!dp_ctrl)
>> -        return;
>> +    irqreturn_t ret = IRQ_NONE;
>>
>>       ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
>>
>> @@ -1994,12 +1992,16 @@ void dp_ctrl_isr(struct dp_ctrl *dp_ctrl)
>>       if (isr & DP_CTRL_INTR_READY_FOR_VIDEO) {
>>           drm_dbg_dp(ctrl->drm_dev, "dp_video_ready\n");
>>           complete(&ctrl->video_comp);
>> +        ret = IRQ_HANDLED;
>>       }
>>
>>       if (isr & DP_CTRL_INTR_IDLE_PATTERN_SENT) {
>>           drm_dbg_dp(ctrl->drm_dev, "idle_patterns_sent\n");
>>           complete(&ctrl->idle_comp);
>> +        ret = IRQ_HANDLED;
>>       }
>> +
>> +    return ret;
>>   }
>>
>>   struct dp_ctrl *dp_ctrl_get(struct device *dev, struct dp_link *link,
>> diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.h 
>> b/drivers/gpu/drm/msm/dp/dp_ctrl.h
>> index 9f29734af81c..c3af06dc87b1 100644
>> --- a/drivers/gpu/drm/msm/dp/dp_ctrl.h
>> +++ b/drivers/gpu/drm/msm/dp/dp_ctrl.h
>> @@ -25,7 +25,7 @@ int dp_ctrl_off_link_stream(struct dp_ctrl *dp_ctrl);
>>   int dp_ctrl_off_link(struct dp_ctrl *dp_ctrl);
>>   int dp_ctrl_off(struct dp_ctrl *dp_ctrl);
>>   void dp_ctrl_push_idle(struct dp_ctrl *dp_ctrl);
>> -void dp_ctrl_isr(struct dp_ctrl *dp_ctrl);
>> +irqreturn_t dp_ctrl_isr(struct dp_ctrl *dp_ctrl);
>>   void dp_ctrl_handle_sink_request(struct dp_ctrl *dp_ctrl);
>>   struct dp_ctrl *dp_ctrl_get(struct device *dev, struct dp_link *link,
>>               struct dp_panel *panel,    struct drm_dp_aux *aux,
>> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c
>> b/drivers/gpu/drm/msm/dp/dp_display.c
>> index a49f6dbbe888..559d9ab7954d 100644
>> --- a/drivers/gpu/drm/msm/dp/dp_display.c
>> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
>> @@ -1192,7 +1192,7 @@ static int dp_hpd_event_thread_start(struct
>> dp_display_private *dp_priv)
>>   static irqreturn_t dp_display_irq_handler(int irq, void *dev_id)
>>   {
>>       struct dp_display_private *dp = dev_id;
>> -    irqreturn_t ret = IRQ_HANDLED;
>> +    irqreturn_t ret = IRQ_NONE;
>>       u32 hpd_isr_status;
>>
>>       if (!dp) {
>> @@ -1206,27 +1206,33 @@ static irqreturn_t dp_display_irq_handler(int
>> irq, void *dev_id)
>>           drm_dbg_dp(dp->drm_dev, "type=%d isr=0x%x\n",
>>               dp->dp_display.connector_type, hpd_isr_status);
>>           /* hpd related interrupts */
>> -        if (hpd_isr_status & DP_DP_HPD_PLUG_INT_MASK)
>> +        if (hpd_isr_status & DP_DP_HPD_PLUG_INT_MASK) {
>>               dp_add_event(dp, EV_HPD_PLUG_INT, 0, 0);
>> +            ret = IRQ_HANDLED;
>> +        }
>>
>>           if (hpd_isr_status & DP_DP_IRQ_HPD_INT_MASK) {
>>               dp_add_event(dp, EV_IRQ_HPD_INT, 0, 0);
>> +            ret = IRQ_HANDLED;
>>           }
>>
>>           if (hpd_isr_status & DP_DP_HPD_REPLUG_INT_MASK) {
>>               dp_add_event(dp, EV_HPD_UNPLUG_INT, 0, 0);
>>               dp_add_event(dp, EV_HPD_PLUG_INT, 0, 3);
>> +            ret = IRQ_HANDLED;
>>           }
>>
>> -        if (hpd_isr_status & DP_DP_HPD_UNPLUG_INT_MASK)
>> +        if (hpd_isr_status & DP_DP_HPD_UNPLUG_INT_MASK) {
>>               dp_add_event(dp, EV_HPD_UNPLUG_INT, 0, 0);
>> +            ret = IRQ_HANDLED;
>> +        }
>>       }
>>
>>       /* DP controller isr */
>> -    dp_ctrl_isr(dp->ctrl);
>> +    ret |= dp_ctrl_isr(dp->ctrl);
>>
>>       /* DP aux isr */
>> -    dp_aux_isr(dp->aux);
>> +    ret |= dp_aux_isr(dp->aux);
>>
>>       return ret;
>>   }
diff mbox series

Patch

diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
index d030a93..8f8b12a 100644
--- a/drivers/gpu/drm/msm/dp/dp_aux.c
+++ b/drivers/gpu/drm/msm/dp/dp_aux.c
@@ -423,6 +423,13 @@  void dp_aux_isr(struct drm_dp_aux *dp_aux)
 
 	isr = dp_catalog_aux_get_irq(aux->catalog);
 
+	/*
+	 * if this irq is not for aux transfer,
+	 * then return immediately
+	 */
+	if (!isr)
+		return;
+
 	if (!aux->cmd_busy)
 		return;