Message ID | 20241211105953.1205343-1-xu.yang_2@nxp.com |
---|---|
State | Superseded |
Headers | show |
Series | usb: typec: tcpci: fix NULL pointer issue on shared irq case | expand |
Hello, On Wed, Dec 11, 2024 at 06:59:53PM +0800, Xu Yang wrote: > The tcpci_irq() may meet below NULL pointer dereference issue: > > [ 2.641851] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 > [ 2.641951] status 0x1, 0x37f > [ 2.650659] Mem abort info: > [ 2.656490] ESR = 0x0000000096000004 > [ 2.660230] EC = 0x25: DABT (current EL), IL = 32 bits > [ 2.665532] SET = 0, FnV = 0 > [ 2.668579] EA = 0, S1PTW = 0 > [ 2.671715] FSC = 0x04: level 0 translation fault > [ 2.676584] Data abort info: > [ 2.679459] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 > [ 2.684936] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 > [ 2.689980] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 > [ 2.695284] [0000000000000010] user address but active_mm is swapper > [ 2.701632] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP > [ 2.707883] Modules linked in: > [ 2.710936] CPU: 1 UID: 0 PID: 87 Comm: irq/111-2-0051 Not tainted 6.12.0-rc6-06316-g7f63786ad3d1-dirty #4 > [ 2.720570] Hardware name: NXP i.MX93 11X11 EVK board (DT) > [ 2.726040] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) > [ 2.732989] pc : tcpci_irq+0x38/0x318 > [ 2.736647] lr : _tcpci_irq+0x14/0x20 > [ 2.740295] sp : ffff80008324bd30 > [ 2.743597] x29: ffff80008324bd70 x28: ffff800080107894 x27: ffff800082198f70 > [ 2.750721] x26: ffff0000050e6680 x25: ffff000004d172ac x24: ffff0000050f0000 > [ 2.757845] x23: ffff000004d17200 x22: 0000000000000001 x21: ffff0000050f0000 > [ 2.764969] x20: ffff000004d17200 x19: 0000000000000000 x18: 0000000000000001 > [ 2.772093] x17: 0000000000000000 x16: ffff80008183d8a0 x15: ffff00007fbab040 > [ 2.779217] x14: ffff00007fb918c0 x13: 0000000000000000 x12: 000000000000017a > [ 2.786341] x11: 0000000000000001 x10: 0000000000000a90 x9 : ffff80008324bd00 > [ 2.793465] x8 : ffff0000050f0af0 x7 : ffff00007fbaa840 x6 : 0000000000000031 > [ 2.800589] x5 : 000000000000017a x4 : 0000000000000002 x3 : 0000000000000002 > [ 2.807713] x2 : ffff80008324bd3a x1 : 0000000000000010 x0 : 0000000000000000 > [ 2.814838] Call trace: > [ 2.817273] tcpci_irq+0x38/0x318 > [ 2.820583] _tcpci_irq+0x14/0x20 > [ 2.823885] irq_thread_fn+0x2c/0xa8 > [ 2.827456] irq_thread+0x16c/0x2f4 > [ 2.830940] kthread+0x110/0x114 > [ 2.834164] ret_from_fork+0x10/0x20 > [ 2.837738] Code: f9426420 f9001fe0 d2800000 52800201 (f9400a60) > > This may happen on shared irq case. Such as two Type-C ports share one > irq. After the first port finished tcpci_register_port(), it may trigger > interrupt. However, if the interrupt comes by chance the 2nd port finishes > devm_request_threaded_irq(), the 2nd port interrupt handler may be run at > first. Then the above issue may happen. > > devm_request_threaded_irq() > <-- port1 irq comes > disable_irq(client->irq); > tcpci_register_port() > > This will restore the logic to the state before commit (77e85107a771 "usb: > typec: tcpci: support edge irq"). > > Fixes: 77e85107a771 ("usb: typec: tcpci: support edge irq") > Cc: stable@vger.kernel.org > Signed-off-by: Xu Yang <xu.yang_2@nxp.com> > --- > drivers/usb/typec/tcpm/tcpci.c | 18 ++++++++---------- > 1 file changed, 8 insertions(+), 10 deletions(-) > > diff --git a/drivers/usb/typec/tcpm/tcpci.c b/drivers/usb/typec/tcpm/tcpci.c > index 2f15734a5043..db42f4bf3632 100644 > --- a/drivers/usb/typec/tcpm/tcpci.c > +++ b/drivers/usb/typec/tcpm/tcpci.c > @@ -923,22 +923,20 @@ static int tcpci_probe(struct i2c_client *client) > > chip->data.set_orientation = err; > > + chip->tcpci = tcpci_register_port(&client->dev, &chip->data); > + if (IS_ERR(chip->tcpci)) > + return PTR_ERR(chip->tcpci); > + > err = devm_request_threaded_irq(&client->dev, client->irq, NULL, > _tcpci_irq, > IRQF_SHARED | IRQF_ONESHOT, > dev_name(&client->dev), chip); I do not think this is the correct fix, what about using IRQF_NO_AUTOEN ? Would it solve the issue? You need to adjust also the disable/enable irq accordingly. Francesco
On Wed, Dec 11, 2024 at 07:40:51PM +0800, Xu Yang wrote: > On Wed, Dec 11, 2024 at 12:09:28PM +0100, Francesco Dolcini wrote: > > Hello, > > > > On Wed, Dec 11, 2024 at 06:59:53PM +0800, Xu Yang wrote: > > > The tcpci_irq() may meet below NULL pointer dereference issue: > > > > > > [ 2.641851] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 > > > [ 2.641951] status 0x1, 0x37f > > > [ 2.650659] Mem abort info: > > > [ 2.656490] ESR = 0x0000000096000004 > > > [ 2.660230] EC = 0x25: DABT (current EL), IL = 32 bits > > > [ 2.665532] SET = 0, FnV = 0 > > > [ 2.668579] EA = 0, S1PTW = 0 > > > [ 2.671715] FSC = 0x04: level 0 translation fault > > > [ 2.676584] Data abort info: > > > [ 2.679459] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 > > > [ 2.684936] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 > > > [ 2.689980] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 > > > [ 2.695284] [0000000000000010] user address but active_mm is swapper > > > [ 2.701632] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP > > > [ 2.707883] Modules linked in: > > > [ 2.710936] CPU: 1 UID: 0 PID: 87 Comm: irq/111-2-0051 Not tainted 6.12.0-rc6-06316-g7f63786ad3d1-dirty #4 > > > [ 2.720570] Hardware name: NXP i.MX93 11X11 EVK board (DT) > > > [ 2.726040] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) > > > [ 2.732989] pc : tcpci_irq+0x38/0x318 > > > [ 2.736647] lr : _tcpci_irq+0x14/0x20 > > > [ 2.740295] sp : ffff80008324bd30 > > > [ 2.743597] x29: ffff80008324bd70 x28: ffff800080107894 x27: ffff800082198f70 > > > [ 2.750721] x26: ffff0000050e6680 x25: ffff000004d172ac x24: ffff0000050f0000 > > > [ 2.757845] x23: ffff000004d17200 x22: 0000000000000001 x21: ffff0000050f0000 > > > [ 2.764969] x20: ffff000004d17200 x19: 0000000000000000 x18: 0000000000000001 > > > [ 2.772093] x17: 0000000000000000 x16: ffff80008183d8a0 x15: ffff00007fbab040 > > > [ 2.779217] x14: ffff00007fb918c0 x13: 0000000000000000 x12: 000000000000017a > > > [ 2.786341] x11: 0000000000000001 x10: 0000000000000a90 x9 : ffff80008324bd00 > > > [ 2.793465] x8 : ffff0000050f0af0 x7 : ffff00007fbaa840 x6 : 0000000000000031 > > > [ 2.800589] x5 : 000000000000017a x4 : 0000000000000002 x3 : 0000000000000002 > > > [ 2.807713] x2 : ffff80008324bd3a x1 : 0000000000000010 x0 : 0000000000000000 > > > [ 2.814838] Call trace: > > > [ 2.817273] tcpci_irq+0x38/0x318 > > > [ 2.820583] _tcpci_irq+0x14/0x20 > > > [ 2.823885] irq_thread_fn+0x2c/0xa8 > > > [ 2.827456] irq_thread+0x16c/0x2f4 > > > [ 2.830940] kthread+0x110/0x114 > > > [ 2.834164] ret_from_fork+0x10/0x20 > > > [ 2.837738] Code: f9426420 f9001fe0 d2800000 52800201 (f9400a60) > > > > > > This may happen on shared irq case. Such as two Type-C ports share one > > > irq. After the first port finished tcpci_register_port(), it may trigger > > > interrupt. However, if the interrupt comes by chance the 2nd port finishes > > > devm_request_threaded_irq(), the 2nd port interrupt handler may be run at > > > first. Then the above issue may happen. > > > > > > devm_request_threaded_irq() > > > <-- port1 irq comes > > > disable_irq(client->irq); > > > tcpci_register_port() > > > > > > This will restore the logic to the state before commit (77e85107a771 "usb: > > > typec: tcpci: support edge irq"). > > > > > > Fixes: 77e85107a771 ("usb: typec: tcpci: support edge irq") > > > Cc: stable@vger.kernel.org > > > Signed-off-by: Xu Yang <xu.yang_2@nxp.com> > > > --- > > > drivers/usb/typec/tcpm/tcpci.c | 18 ++++++++---------- > > > 1 file changed, 8 insertions(+), 10 deletions(-) > > > > > > diff --git a/drivers/usb/typec/tcpm/tcpci.c b/drivers/usb/typec/tcpm/tcpci.c > > > index 2f15734a5043..db42f4bf3632 100644 > > > --- a/drivers/usb/typec/tcpm/tcpci.c > > > +++ b/drivers/usb/typec/tcpm/tcpci.c > > > @@ -923,22 +923,20 @@ static int tcpci_probe(struct i2c_client *client) > > > > > > chip->data.set_orientation = err; > > > > > > + chip->tcpci = tcpci_register_port(&client->dev, &chip->data); > > > + if (IS_ERR(chip->tcpci)) > > > + return PTR_ERR(chip->tcpci); > > > + > > > err = devm_request_threaded_irq(&client->dev, client->irq, NULL, > > > _tcpci_irq, > > > IRQF_SHARED | IRQF_ONESHOT, > > > dev_name(&client->dev), chip); > > > > I do not think this is the correct fix, what about using > > IRQF_NO_AUTOEN ? Would it solve the issue? You need to adjust also the > > disable/enable irq accordingly. > > Not work. Probe failed directly. > > [ 2.646391] tcpci 2-0050: probe with driver tcpci failed with error -22 > [ 2.680086] tcpci 2-0051: probe with driver tcpci failed with error -22 Ok, clear. The reason is the shared IRQ. I think your change will break the support for edge IRQ, because we have interrupt generated calling tcpci_register_port and they will just be lost if the interrupt is not already requested. I think a better solution is needed. Francesco
On Wed, Dec 11, 2024 at 12:51:06PM +0100, Francesco Dolcini wrote: > On Wed, Dec 11, 2024 at 07:40:51PM +0800, Xu Yang wrote: > > On Wed, Dec 11, 2024 at 12:09:28PM +0100, Francesco Dolcini wrote: > > > Hello, > > > > > > On Wed, Dec 11, 2024 at 06:59:53PM +0800, Xu Yang wrote: > > > > The tcpci_irq() may meet below NULL pointer dereference issue: > > > > > > > > [ 2.641851] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 > > > > [ 2.641951] status 0x1, 0x37f > > > > [ 2.650659] Mem abort info: > > > > [ 2.656490] ESR = 0x0000000096000004 > > > > [ 2.660230] EC = 0x25: DABT (current EL), IL = 32 bits [...] > > > > > > I do not think this is the correct fix, what about using > > > IRQF_NO_AUTOEN ? Would it solve the issue? You need to adjust also the > > > disable/enable irq accordingly. > > > > Not work. Probe failed directly. > > > > [ 2.646391] tcpci 2-0050: probe with driver tcpci failed with error -22 > > [ 2.680086] tcpci 2-0051: probe with driver tcpci failed with error -22 > > Ok, clear. The reason is the shared IRQ. > > I think your change will break the support for edge IRQ, because we have > interrupt generated calling tcpci_register_port and they will just be lost if > the interrupt is not already requested. Do you really meet issue where interrupts got lost? I can't find the case because ALERT_MASK is set to 0 firstly, and ALERT_MASK is assigned some value in tcpci_init() which should be the last step of tcpm_register_port(). So the interrupts are masked before tcpci_register_port() completed. That's say, typec chip can't pull down ALERT line during this stage even though you don't call disable_irq(). This behavior is suit for both level and edge type interrupt.
On Wed, Dec 11, 2024 at 10:07:55PM +0800, Xu Yang wrote: > On Wed, Dec 11, 2024 at 12:51:06PM +0100, Francesco Dolcini wrote: > > On Wed, Dec 11, 2024 at 07:40:51PM +0800, Xu Yang wrote: > > > On Wed, Dec 11, 2024 at 12:09:28PM +0100, Francesco Dolcini wrote: > > > > Hello, > > > > > > > > On Wed, Dec 11, 2024 at 06:59:53PM +0800, Xu Yang wrote: > > > > > The tcpci_irq() may meet below NULL pointer dereference issue: > > > > > > > > > > [ 2.641851] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 > > > > > [ 2.641951] status 0x1, 0x37f > > > > > [ 2.650659] Mem abort info: > > > > > [ 2.656490] ESR = 0x0000000096000004 > > > > > [ 2.660230] EC = 0x25: DABT (current EL), IL = 32 bits > > [...] > > > > > > > > > I do not think this is the correct fix, what about using > > > > IRQF_NO_AUTOEN ? Would it solve the issue? You need to adjust also the > > > > disable/enable irq accordingly. > > > > > > Not work. Probe failed directly. > > > > > > [ 2.646391] tcpci 2-0050: probe with driver tcpci failed with error -22 > > > [ 2.680086] tcpci 2-0051: probe with driver tcpci failed with error -22 > > > > Ok, clear. The reason is the shared IRQ. > > > > I think your change will break the support for edge IRQ, because we have > > interrupt generated calling tcpci_register_port and they will just be lost if > > the interrupt is not already requested. > > Do you really meet issue where interrupts got lost? > > I can't find the case because ALERT_MASK is set to 0 firstly, and > ALERT_MASK is assigned some value in tcpci_init() which should be > the last step of tcpm_register_port(). So the interrupts are masked > before tcpci_register_port() completed. That's say, typec chip can't > pull down ALERT line during this stage even though you don't call > disable_irq(). This behavior is suit for both level and edge type > interrupt. Well, above assumption is based on commit 77e85107a771 ("usb: typec: tcpci: support edge irq"). I think the irq may be missed with this patch. A better way may be to lookup ALERT and handle event before probe() finishes. Thanks, Xu Yang
diff --git a/drivers/usb/typec/tcpm/tcpci.c b/drivers/usb/typec/tcpm/tcpci.c index 2f15734a5043..db42f4bf3632 100644 --- a/drivers/usb/typec/tcpm/tcpci.c +++ b/drivers/usb/typec/tcpm/tcpci.c @@ -923,22 +923,20 @@ static int tcpci_probe(struct i2c_client *client) chip->data.set_orientation = err; + chip->tcpci = tcpci_register_port(&client->dev, &chip->data); + if (IS_ERR(chip->tcpci)) + return PTR_ERR(chip->tcpci); + err = devm_request_threaded_irq(&client->dev, client->irq, NULL, _tcpci_irq, IRQF_SHARED | IRQF_ONESHOT, dev_name(&client->dev), chip); - if (err < 0) + if (err < 0) { + tcpci_unregister_port(chip->tcpci); return err; + } - /* - * Disable irq while registering port. If irq is configured as an edge - * irq this allow to keep track and process the irq as soon as it is enabled. - */ - disable_irq(client->irq); - chip->tcpci = tcpci_register_port(&client->dev, &chip->data); - enable_irq(client->irq); - - return PTR_ERR_OR_ZERO(chip->tcpci); + return 0; } static void tcpci_remove(struct i2c_client *client)
The tcpci_irq() may meet below NULL pointer dereference issue: [ 2.641851] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 [ 2.641951] status 0x1, 0x37f [ 2.650659] Mem abort info: [ 2.656490] ESR = 0x0000000096000004 [ 2.660230] EC = 0x25: DABT (current EL), IL = 32 bits [ 2.665532] SET = 0, FnV = 0 [ 2.668579] EA = 0, S1PTW = 0 [ 2.671715] FSC = 0x04: level 0 translation fault [ 2.676584] Data abort info: [ 2.679459] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 2.684936] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 2.689980] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 2.695284] [0000000000000010] user address but active_mm is swapper [ 2.701632] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 2.707883] Modules linked in: [ 2.710936] CPU: 1 UID: 0 PID: 87 Comm: irq/111-2-0051 Not tainted 6.12.0-rc6-06316-g7f63786ad3d1-dirty #4 [ 2.720570] Hardware name: NXP i.MX93 11X11 EVK board (DT) [ 2.726040] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 2.732989] pc : tcpci_irq+0x38/0x318 [ 2.736647] lr : _tcpci_irq+0x14/0x20 [ 2.740295] sp : ffff80008324bd30 [ 2.743597] x29: ffff80008324bd70 x28: ffff800080107894 x27: ffff800082198f70 [ 2.750721] x26: ffff0000050e6680 x25: ffff000004d172ac x24: ffff0000050f0000 [ 2.757845] x23: ffff000004d17200 x22: 0000000000000001 x21: ffff0000050f0000 [ 2.764969] x20: ffff000004d17200 x19: 0000000000000000 x18: 0000000000000001 [ 2.772093] x17: 0000000000000000 x16: ffff80008183d8a0 x15: ffff00007fbab040 [ 2.779217] x14: ffff00007fb918c0 x13: 0000000000000000 x12: 000000000000017a [ 2.786341] x11: 0000000000000001 x10: 0000000000000a90 x9 : ffff80008324bd00 [ 2.793465] x8 : ffff0000050f0af0 x7 : ffff00007fbaa840 x6 : 0000000000000031 [ 2.800589] x5 : 000000000000017a x4 : 0000000000000002 x3 : 0000000000000002 [ 2.807713] x2 : ffff80008324bd3a x1 : 0000000000000010 x0 : 0000000000000000 [ 2.814838] Call trace: [ 2.817273] tcpci_irq+0x38/0x318 [ 2.820583] _tcpci_irq+0x14/0x20 [ 2.823885] irq_thread_fn+0x2c/0xa8 [ 2.827456] irq_thread+0x16c/0x2f4 [ 2.830940] kthread+0x110/0x114 [ 2.834164] ret_from_fork+0x10/0x20 [ 2.837738] Code: f9426420 f9001fe0 d2800000 52800201 (f9400a60) This may happen on shared irq case. Such as two Type-C ports share one irq. After the first port finished tcpci_register_port(), it may trigger interrupt. However, if the interrupt comes by chance the 2nd port finishes devm_request_threaded_irq(), the 2nd port interrupt handler may be run at first. Then the above issue may happen. devm_request_threaded_irq() <-- port1 irq comes disable_irq(client->irq); tcpci_register_port() This will restore the logic to the state before commit (77e85107a771 "usb: typec: tcpci: support edge irq"). Fixes: 77e85107a771 ("usb: typec: tcpci: support edge irq") Cc: stable@vger.kernel.org Signed-off-by: Xu Yang <xu.yang_2@nxp.com> --- drivers/usb/typec/tcpm/tcpci.c | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-)