Message ID | 9ca3ab0b40c875b6019f32f031c68a1ae80dd73a.1649310812.git.duoming@zju.edu.cn |
---|---|
State | New |
Series | Fix deadlocks caused by del_timer_sync() |
Hi Duoming,

On Wed, Apr 6, 2022 at 11:38 PM Duoming Zhou <duoming@zju.edu.cn> wrote:
>
> There is a deadlock in rs_close(), which is shown
> below:
>
>    (Thread 1)              |      (Thread 2)
>                            | rs_open()
> rs_close()                 |  mod_timer()
>  spin_lock_bh() //(1)      |  (wait a time)
>  ...                       | rs_poll()
>  del_timer_sync()          |  spin_lock() //(2)
>  (wait timer to stop)      |  ...
>
> We hold timer_lock in position (1) of thread 1 and
> use del_timer_sync() to wait timer to stop, but timer handler
> also need timer_lock in position (2) of thread 2.
> As a result, rs_close() will block forever.

I agree with this.

> This patch extracts del_timer_sync() from the protection of
> spin_lock_bh(), which could let timer handler to obtain
> the needed lock.

Looking at the timer_lock I don't really understand what it protects.
It looks like it is not needed at all.

Also, I see that rs_poll rewinds the timer regardless of whether del_timer_sync
was called or not, which violates del_timer_sync requirements.

> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
> ---
>  arch/xtensa/platforms/iss/console.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/xtensa/platforms/iss/console.c b/arch/xtensa/platforms/iss/console.c
> index 81d7c7e8f7e..d431b61ae3c 100644
> --- a/arch/xtensa/platforms/iss/console.c
> +++ b/arch/xtensa/platforms/iss/console.c
> @@ -51,8 +51,10 @@ static int rs_open(struct tty_struct *tty, struct file * filp)
>  static void rs_close(struct tty_struct *tty, struct file * filp)
>  {
>  	spin_lock_bh(&timer_lock);
> -	if (tty->count == 1)
> +	if (tty->count == 1) {
> +		spin_unlock_bh(&timer_lock);
>  		del_timer_sync(&serial_timer);
> +	}
>  	spin_unlock_bh(&timer_lock);

Now in case tty->count == 1 the timer_lock would be unlocked twice.
Hello,

On Thu, 7 Apr 2022 00:21:58 -0700 Max Filippov wrote:

> > There is a deadlock in rs_close(), which is shown
> > below:
> >
> >    (Thread 1)              |      (Thread 2)
> >                            | rs_open()
> > rs_close()                 |  mod_timer()
> >  spin_lock_bh() //(1)      |  (wait a time)
> >  ...                       | rs_poll()
> >  del_timer_sync()          |  spin_lock() //(2)
> >  (wait timer to stop)      |  ...
> >
> > We hold timer_lock in position (1) of thread 1 and
> > use del_timer_sync() to wait timer to stop, but timer handler
> > also need timer_lock in position (2) of thread 2.
> > As a result, rs_close() will block forever.
>
> I agree with this.
>
> > This patch extracts del_timer_sync() from the protection of
> > spin_lock_bh(), which could let timer handler to obtain
> > the needed lock.
>
> Looking at the timer_lock I don't really understand what it protects.
> It looks like it is not needed at all.

There is no race condition between rs_close() and rs_poll() (the timer
handler), so I think we could remove the timer_lock in rs_close(),
rs_open() and rs_poll().

> Also, I see that rs_poll rewinds the timer regardless of whether del_timer_sync
> was called or not, which violates del_timer_sync requirements.

I wrote a kernel module to test whether del_timer_sync() could finish
a timer handler that uses mod_timer() to rewind itself. The following
is the result.

# insmod del_timer_sync.ko
[ 929.374405] my_timer will be create.
[ 929.374738] the jiffies is :4295595572
[ 930.411581] In my_timer_function
[ 930.411956] the jiffies is 4295596609
[ 935.466643] In my_timer_function
[ 935.467505] the jiffies is 4295601665
[ 940.586538] In my_timer_function
[ 940.586916] the jiffies is 4295606784
[ 945.706579] In my_timer_function
[ 945.706885] the jiffies is 4295611904
#
# rmmod del_timer_sync.ko
[ 948.507692] the del_timer_sync is :1
[ 948.507692]
#
#

The result of the experiment shows that the timer handler could
be killed after we execute del_timer_sync().

> > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
> > ---
> >  arch/xtensa/platforms/iss/console.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/xtensa/platforms/iss/console.c b/arch/xtensa/platforms/iss/console.c
> > index 81d7c7e8f7e..d431b61ae3c 100644
> > --- a/arch/xtensa/platforms/iss/console.c
> > +++ b/arch/xtensa/platforms/iss/console.c
> > @@ -51,8 +51,10 @@ static int rs_open(struct tty_struct *tty, struct file * filp)
> >  static void rs_close(struct tty_struct *tty, struct file * filp)
> >  {
> >  	spin_lock_bh(&timer_lock);
> > -	if (tty->count == 1)
> > +	if (tty->count == 1) {
> > +		spin_unlock_bh(&timer_lock);
> >  		del_timer_sync(&serial_timer);
> > +	}
> >  	spin_unlock_bh(&timer_lock);
>
> Now in case tty->count == 1 the timer_lock would be unlocked twice.

I will remove the timer_lock in rs_close(), rs_open() and rs_poll().

Thanks a lot for your time and advice!

Best regards,
Duoming Zhou
Hello,

On Thu, 7 Apr 2022 12:42:31 +0300 Sergey Shtylyov wrote:

> > There is a deadlock in rs_close(), which is shown
> > below:
> >
> >    (Thread 1)              |      (Thread 2)
> >                            | rs_open()
> > rs_close()                 |  mod_timer()
> >  spin_lock_bh() //(1)      |  (wait a time)
> >  ...                       | rs_poll()
> >  del_timer_sync()          |  spin_lock() //(2)
> >  (wait timer to stop)      |  ...
> >
> > We hold timer_lock in position (1) of thread 1 and
> > use del_timer_sync() to wait timer to stop, but timer handler
> > also need timer_lock in position (2) of thread 2.
> > As a result, rs_close() will block forever.
> >
> > This patch extracts del_timer_sync() from the protection of
> > spin_lock_bh(), which could let timer handler to obtain
> > the needed lock.
> >
> > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
> > ---
> >  arch/xtensa/platforms/iss/console.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/xtensa/platforms/iss/console.c b/arch/xtensa/platforms/iss/console.c
> > index 81d7c7e8f7e..d431b61ae3c 100644
> > --- a/arch/xtensa/platforms/iss/console.c
> > +++ b/arch/xtensa/platforms/iss/console.c
> > @@ -51,8 +51,10 @@ static int rs_open(struct tty_struct *tty, struct file * filp)
> >  static void rs_close(struct tty_struct *tty, struct file * filp)
> >  {
> >  	spin_lock_bh(&timer_lock);
> > -	if (tty->count == 1)
> > +	if (tty->count == 1) {
> > +		spin_unlock_bh(&timer_lock);
> >  		del_timer_sync(&serial_timer);
> > +	}
> >  	spin_unlock_bh(&timer_lock);
>
> Double unlock iff tty->count == 1?

Yes. Thanks a lot for your time and advice.

I found that there is no race condition between rs_close() and rs_poll()
(the timer handler), so I think we could remove the timer_lock in
rs_close(), rs_open() and rs_poll().

Best regards,
Duoming Zhou
diff --git a/arch/xtensa/platforms/iss/console.c b/arch/xtensa/platforms/iss/console.c
index 81d7c7e8f7e..d431b61ae3c 100644
--- a/arch/xtensa/platforms/iss/console.c
+++ b/arch/xtensa/platforms/iss/console.c
@@ -51,8 +51,10 @@ static int rs_open(struct tty_struct *tty, struct file * filp)
 static void rs_close(struct tty_struct *tty, struct file * filp)
 {
 	spin_lock_bh(&timer_lock);
-	if (tty->count == 1)
+	if (tty->count == 1) {
+		spin_unlock_bh(&timer_lock);
 		del_timer_sync(&serial_timer);
+	}
 	spin_unlock_bh(&timer_lock);
 }
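The double unlock raised in review can be checked with a small userspace model of the control flow. This is only a sketch, not kernel code: `struct model_lock`, `model_lock_acquire()`, `model_lock_release()`, `close_as_posted()` and `close_balanced()` are hypothetical names invented here, and the "lock" is a plain counter that records unlocks issued while it is not held. The balanced variant is one possible shape; the thread itself settles on removing timer_lock instead.

```c
#include <assert.h>

/* Hypothetical stand-in for a spinlock; it only tracks balance. */
struct model_lock {
	int held;         /* 1 while the lock is held, 0 otherwise */
	int bad_unlocks;  /* unlocks issued while the lock was not held */
};

static void model_lock_acquire(struct model_lock *l)
{
	assert(l->held == 0);  /* this model does not support nesting */
	l->held = 1;
}

static void model_lock_release(struct model_lock *l)
{
	if (l->held)
		l->held = 0;
	else
		l->bad_unlocks++;
}

/* Control flow of rs_close() as posted; returns the unbalanced unlock count. */
static int close_as_posted(int count)
{
	struct model_lock lock = { 0, 0 };

	model_lock_acquire(&lock);
	if (count == 1) {
		model_lock_release(&lock);
		/* del_timer_sync() would be called here, outside the lock */
	}
	model_lock_release(&lock);  /* second release when count == 1 */
	return lock.bad_unlocks;
}

/* One possible balanced variant: decide under the lock, wait after it. */
static int close_balanced(int count)
{
	struct model_lock lock = { 0, 0 };
	int last_close;

	model_lock_acquire(&lock);
	last_close = (count == 1);
	model_lock_release(&lock);
	if (last_close) {
		/* del_timer_sync() would be called here, outside the lock */
	}
	return lock.bad_unlocks;
}
```

Calling `close_as_posted(1)` from a test `main()` reports one unbalanced unlock, while any other count, and both paths of `close_balanced()`, report none.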
There is a deadlock in rs_close(), which is shown
below:

   (Thread 1)              |      (Thread 2)
                           | rs_open()
rs_close()                 |  mod_timer()
 spin_lock_bh() //(1)      |  (wait a time)
 ...                       | rs_poll()
 del_timer_sync()          |  spin_lock() //(2)
 (wait timer to stop)      |  ...

We hold timer_lock in position (1) of thread 1 and
use del_timer_sync() to wait timer to stop, but timer handler
also need timer_lock in position (2) of thread 2.
As a result, rs_close() will block forever.

This patch extracts del_timer_sync() from the protection of
spin_lock_bh(), which could let timer handler to obtain
the needed lock.

Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
---
 arch/xtensa/platforms/iss/console.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
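The rule behind the commit message (do not wait for a handler while holding a lock that the handler needs) can be illustrated in userspace with POSIX threads. This is an analogy under stated assumptions, not the driver code: a thread stands in for the timer handler, `pthread_join()` stands in for del_timer_sync(), and `poll_handler()`, `close_and_wait()` and `handler_runs` are hypothetical names. The mutex reuses the name timer_lock only for readability.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t timer_lock = PTHREAD_MUTEX_INITIALIZER;
static int handler_runs;  /* protected by timer_lock */

/* Stand-in for the timer handler: it needs timer_lock to make progress. */
static void *poll_handler(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&timer_lock);
	handler_runs++;
	pthread_mutex_unlock(&timer_lock);
	return NULL;
}

/*
 * Stand-in for the close path. The lock is dropped before waiting, so the
 * handler can always take it and finish. Joining while still holding
 * timer_lock could hang whenever the handler is blocked on that lock.
 * Returns the number of handler runs observed after the wait.
 */
static int close_and_wait(void)
{
	pthread_t handler;
	int rc, runs;

	rc = pthread_create(&handler, NULL, poll_handler, NULL);
	assert(rc == 0);
	(void)rc;

	pthread_mutex_lock(&timer_lock);
	/* bookkeeping that really needs the lock would go here */
	pthread_mutex_unlock(&timer_lock);

	rc = pthread_join(handler, NULL);  /* wait outside the lock */
	assert(rc == 0);
	(void)rc;

	pthread_mutex_lock(&timer_lock);
	runs = handler_runs;
	pthread_mutex_unlock(&timer_lock);
	return runs;
}
```

Depending on the toolchain, the program may need to be built with `-pthread`. The first call to `close_and_wait()` returns 1, because the handler has completed exactly once by the time the join returns.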