Message ID | 20231124122258.1050-1-xuewen.yan@unisoc.com |
---|---|
State | New |
Headers | show |
Series | [RFC] serial: core: Use pm_runtime_get_sync() in uart_start() | expand |
Hi Tony On Sat, Nov 25, 2023 at 3:48 PM Tony Lindgren <tony@atomide.com> wrote: > > * Xuewen Yan <xuewen.yan@unisoc.com> [231124 12:25]: > > The commit 84a9582fd203("serial: core: Start managing serial controllers to enable runtime PM") > > use the pm_runtime_get() after uart_port_lock() which would close the irq and disable preement. > > At this time, pm_runtime_get may cause the following two problems: > > > > (1) deadlock in try_to_wake_up: > > > > uart_write() > > uart_port_lock() <<< get lock > > __uart_start > > __pm_runtime_resume > > rpm_resume > > queue_work_on > > try_to_wake_up > > _printk > > uart_console_write > > ... > > uart_port_lock() <<< wait forever > > > > (2) scheduling while atomic: > > uart_write() > > uart_port_lock() <<< get lock > > __uart_start > > __pm_runtime_resume > > rpm_resume > > chedule() << sleep > > Are these spinlock vs raw spinlock nesting warnings from lockdep? If so > can you please post the full warnings somewhere? Or if some extra steps > are needed to reproduce please describe that too. Indeed, we use pr_info in scheduler in our own kernel tree, and this deadlock happended. I would try to use printk_deferred and re-test. And I also notice the warning was reported by syzbot: https://lore.kernel.org/all/0000000000006f01f00608a16cea@google.com/ https://lore.kernel.org/all/000000000000e7765006072e9591@google.com/ These are also because the pm_runtime_put(). Thanks! > > Chances are very high that your serial port is already runtime active at > this point unless you manually idle it so that's why I'm wondering as > all that likely is happening here is a check on the runtime PM usage count. > > > So let us use pm_runtime_get_sync() to prevent these. > > We need to fix this some other way as we can't use pm_runtime_get_sync() > here. The sync call variants require setting pm_runtime_irq_safe() for the > device. And this is what we really want to avoid as it takes a permanent > usage count on the parent device. > > What we want to do here is to get runtime PM to wake-up the device if idle > and try tx again after runtime PM resume as needed. > > Just guessing at this point.. To me it sounds like the fix might be to > use a raw spinlock for uart_port_lock() and uart_port_unlock()? > > Regards, > > Tony > > > > Fixes: 84a9582fd203 ("serial: core: Start managing serial controllers to enable runtime PM") > > Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com> > > --- > > drivers/tty/serial/serial_core.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c > > index f1348a509552..902f7ed35f4d 100644 > > --- a/drivers/tty/serial/serial_core.c > > +++ b/drivers/tty/serial/serial_core.c > > @@ -145,7 +145,7 @@ static void __uart_start(struct uart_state *state) > > port_dev = port->port_dev; > > > > /* Increment the runtime PM usage count for the active check below */ > > - err = pm_runtime_get(&port_dev->dev); > > + err = pm_runtime_get_sync(&port_dev->dev); > > if (err < 0 && err != -EINPROGRESS) { > > pm_runtime_put_noidle(&port_dev->dev); > > return; > > @@ -159,7 +159,7 @@ static void __uart_start(struct uart_state *state) > > if (!pm_runtime_enabled(port->dev) || pm_runtime_active(port->dev)) > > port->ops->start_tx(port); > > pm_runtime_mark_last_busy(&port_dev->dev); > > - pm_runtime_put_autosuspend(&port_dev->dev); > > + pm_runtime_put_sync_autosuspend(&port_dev->dev); > > } > > > > static void uart_start(struct tty_struct *tty) > > -- > > 2.25.1 > >
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c index f1348a509552..902f7ed35f4d 100644 --- a/drivers/tty/serial/serial_core.c +++ b/drivers/tty/serial/serial_core.c @@ -145,7 +145,7 @@ static void __uart_start(struct uart_state *state) port_dev = port->port_dev; /* Increment the runtime PM usage count for the active check below */ - err = pm_runtime_get(&port_dev->dev); + err = pm_runtime_get_sync(&port_dev->dev); if (err < 0 && err != -EINPROGRESS) { pm_runtime_put_noidle(&port_dev->dev); return; @@ -159,7 +159,7 @@ static void __uart_start(struct uart_state *state) if (!pm_runtime_enabled(port->dev) || pm_runtime_active(port->dev)) port->ops->start_tx(port); pm_runtime_mark_last_busy(&port_dev->dev); - pm_runtime_put_autosuspend(&port_dev->dev); + pm_runtime_put_sync_autosuspend(&port_dev->dev); } static void uart_start(struct tty_struct *tty)
The commit 84a9582fd203("serial: core: Start managing serial controllers to enable runtime PM") use the pm_runtime_get() after uart_port_lock() which would close the irq and disable preement. At this time, pm_runtime_get may cause the following two problems: (1) deadlock in try_to_wake_up: uart_write() uart_port_lock() <<< get lock __uart_start __pm_runtime_resume rpm_resume queue_work_on try_to_wake_up _printk uart_console_write ... uart_port_lock() <<< wait forever (2) scheduling while atomic: uart_write() uart_port_lock() <<< get lock __uart_start __pm_runtime_resume rpm_resume chedule() << sleep So let us use pm_runtime_get_sync() to prevent these. Fixes: 84a9582fd203 ("serial: core: Start managing serial controllers to enable runtime PM") Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com> --- drivers/tty/serial/serial_core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)