diff mbox series

[RFC,v2,20/31] timers: usb: Use del_timer_shutdown() before freeing timer

Message ID 20221027150928.983388020@goodmis.org
State New
Headers show
Series None | expand

Commit Message

Steven Rostedt Oct. 27, 2022, 3:05 p.m. UTC
From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

Before a timer is freed, del_timer_shutdown() must be called.

Link: https://lore.kernel.org/all/20220407161745.7d6754b3@gandalf.local.home/

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Johan Hovold <johan@kernel.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Mathias Nyman <mathias.nyman@linux.intel.com>
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Michael Grzeschik <m.grzeschik@pengutronix.de>
Cc: Bhuvanesh Surachari <Bhuvanesh_Surachari@mentor.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: linux-usb@vger.kernel.org
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 drivers/usb/core/hub.c              | 3 +++
 drivers/usb/gadget/udc/m66592-udc.c | 2 +-
 drivers/usb/serial/garmin_gps.c     | 2 +-
 drivers/usb/serial/mos7840.c        | 2 +-
 4 files changed, 6 insertions(+), 3 deletions(-)

Comments

Guenter Roeck Oct. 28, 2022, 5:23 a.m. UTC | #1
On 10/27/22 08:05, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> 
> Before a timer is freed, del_timer_shutdown() must be called.
> 
> Link: https://lore.kernel.org/all/20220407161745.7d6754b3@gandalf.local.home/
> 
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Felipe Balbi <balbi@kernel.org>
> Cc: Johan Hovold <johan@kernel.org>
> Cc: Alan Stern <stern@rowland.harvard.edu>
> Cc: Mathias Nyman <mathias.nyman@linux.intel.com>
> Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
> Cc: Matthias Kaehlcke <mka@chromium.org>
> Cc: Michael Grzeschik <m.grzeschik@pengutronix.de>
> Cc: Bhuvanesh Surachari <Bhuvanesh_Surachari@mentor.com>
> Cc: Dan Carpenter <dan.carpenter@oracle.com>
> Cc: linux-usb@vger.kernel.org
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
>   drivers/usb/core/hub.c              | 3 +++
>   drivers/usb/gadget/udc/m66592-udc.c | 2 +-
>   drivers/usb/serial/garmin_gps.c     | 2 +-
>   drivers/usb/serial/mos7840.c        | 2 +-
>   4 files changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
> index bbab424b0d55..397f263ab7da 100644
> --- a/drivers/usb/core/hub.c
> +++ b/drivers/usb/core/hub.c
> @@ -1261,6 +1261,9 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
>   
>   		/* Don't do a long sleep inside a workqueue routine */
>   		if (type == HUB_INIT2) {
> +			/* Timers must be shutdown before they are re-initialized */
> +			if (hub->init_work.work.func)
> +				del_timer_shutdown(&hub->init_work.timer);
>   			INIT_DELAYED_WORK(&hub->init_work, hub_init_func3);

A similar call to INIT_DELAYED_WORK() around line 1085 needs the same change.

It would be great if that can somehow be hidden in INIT_DELAYED_WORK().

Thanks,
Guenter

>   			queue_delayed_work(system_power_efficient_wq,
>   					&hub->init_work,
> diff --git a/drivers/usb/gadget/udc/m66592-udc.c b/drivers/usb/gadget/udc/m66592-udc.c
> index 931e6362a13d..a6e2f8358adf 100644
> --- a/drivers/usb/gadget/udc/m66592-udc.c
> +++ b/drivers/usb/gadget/udc/m66592-udc.c
> @@ -1519,7 +1519,7 @@ static int m66592_remove(struct platform_device *pdev)
>   
>   	usb_del_gadget_udc(&m66592->gadget);
>   
> -	del_timer_sync(&m66592->timer);
> +	del_timer_shutdown(&m66592->timer);
>   	iounmap(m66592->reg);
>   	free_irq(platform_get_irq(pdev, 0), m66592);
>   	m66592_free_request(&m66592->ep[0].ep, m66592->ep0_req);
> diff --git a/drivers/usb/serial/garmin_gps.c b/drivers/usb/serial/garmin_gps.c
> index f1a8d8343623..2a53f26468bd 100644
> --- a/drivers/usb/serial/garmin_gps.c
> +++ b/drivers/usb/serial/garmin_gps.c
> @@ -1405,7 +1405,7 @@ static void garmin_port_remove(struct usb_serial_port *port)
>   
>   	usb_kill_anchored_urbs(&garmin_data_p->write_urbs);
>   	usb_kill_urb(port->interrupt_in_urb);
> -	del_timer_sync(&garmin_data_p->timer);
> +	del_timer_shutdown(&garmin_data_p->timer);
>   	kfree(garmin_data_p);
>   }
>   
> diff --git a/drivers/usb/serial/mos7840.c b/drivers/usb/serial/mos7840.c
> index 6b12bb4648b8..a90a706d27de 100644
> --- a/drivers/usb/serial/mos7840.c
> +++ b/drivers/usb/serial/mos7840.c
> @@ -1726,7 +1726,7 @@ static void mos7840_port_remove(struct usb_serial_port *port)
>   		mos7840_set_led_sync(port, MODEM_CONTROL_REGISTER, 0x0300);
>   
>   		del_timer_sync(&mos7840_port->led_timer1);
> -		del_timer_sync(&mos7840_port->led_timer2);
> +		del_timer_shutdown(&mos7840_port->led_timer2);
>   
>   		usb_kill_urb(mos7840_port->led_urb);
>   		usb_free_urb(mos7840_port->led_urb);
Steven Rostedt Oct. 28, 2022, 10:14 a.m. UTC | #2
On Thu, 27 Oct 2022 22:23:06 -0700
Guenter Roeck <linux@roeck-us.net> wrote:

> > diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
> > index bbab424b0d55..397f263ab7da 100644
> > --- a/drivers/usb/core/hub.c
> > +++ b/drivers/usb/core/hub.c
> > @@ -1261,6 +1261,9 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
> >   
> >   		/* Don't do a long sleep inside a workqueue routine */
> >   		if (type == HUB_INIT2) {
> > +			/* Timers must be shutdown before they are re-initialized */
> > +			if (hub->init_work.work.func)
> > +				del_timer_shutdown(&hub->init_work.timer);
> >   			INIT_DELAYED_WORK(&hub->init_work, hub_init_func3);  
> 
> A similar call to INIT_DELAYED_WORK() around line 1085 needs the same change.
> 
> It would be great if that can somehow be hidden in INIT_DELAYED_WORK().

I agree, but the delayed work is such a special case, I'm struggling to
find something that works sensibly. :-/

-- Steve
Steven Rostedt Oct. 28, 2022, 2 p.m. UTC | #3
On Fri, 28 Oct 2022 06:14:22 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Thu, 27 Oct 2022 22:23:06 -0700
> Guenter Roeck <linux@roeck-us.net> wrote:
 > 
> > A similar call to INIT_DELAYED_WORK() around line 1085 needs the same change.
> > 
> > It would be great if that can somehow be hidden in INIT_DELAYED_WORK().  
> 
> I agree, but the delayed work is such a special case, I'm struggling to
> find something that works sensibly. :-/
>

OK, I diagnosed the issue here. The problem is that delayed work also has no
"shutdown" method when it's done. Which means there's no generic way to
call the work->timer shutdown method. So we have two options to handle
delayed work timers:

  1) Add special initialization for delayed work so that it can just go back
     to the old checking (activating on arming, deactivating by any
     del_timer*).

  2) Implement a shutdown state for the work queues as well. There could
     definitely be the same types of bugs as with timers, where a delayed
     work could be pending on something that's been freed. That's probably
     why there's a DEBUG_OBJECTS_WORK too.

Ideally, I would like to have #2, but realistically, I'm going for #1 for
now. We could always add the work queue shutdown state later if we want.

The problem with timers with respect to delayed work queues, is that there's
no place to add the "shutdown" before its no longer in use. Worse yet,
there's code that caches descriptors that have delayed work instead of
freeing them. (See block/blk-mq.c and drivers/scsi/scsi_lib.c with the queuelist).
Where it just calls del_timer(), and then sends it back to the free store
for reuse later. Perhaps we should add DEBUG_OBJECTS checking to these too?

Anyway, I'll make it where the INIT_DELAYED_WORK will call
__timer_init_work() that will set the debug state in the timer to
TIMER_DEBUG_WORK, were it will activate and deactivate the debug object on
add_timer() and del_timer() and hope that it's not one of the bugs we are
hitting :-/

-- Steve
Steven Rostedt Oct. 28, 2022, 6:01 p.m. UTC | #4
On Thu, 27 Oct 2022 22:23:06 -0700
Guenter Roeck <linux@roeck-us.net> wrote:

> > index bbab424b0d55..397f263ab7da 100644
> > --- a/drivers/usb/core/hub.c
> > +++ b/drivers/usb/core/hub.c
> > @@ -1261,6 +1261,9 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
> >   
> >   		/* Don't do a long sleep inside a workqueue routine */
> >   		if (type == HUB_INIT2) {
> > +			/* Timers must be shutdown before they are re-initialized */
> > +			if (hub->init_work.work.func)
> > +				del_timer_shutdown(&hub->init_work.timer);
> >   			INIT_DELAYED_WORK(&hub->init_work, hub_init_func3);  
> 
> A similar call to INIT_DELAYED_WORK() around line 1085 needs the same change.
> 
> It would be great if that can somehow be hidden in INIT_DELAYED_WORK().

I've decided to treat INIT_DELAYED_WORK() like it was before. It only
checks from the time the timer is added to the time it is removed without
needing a shutdown call. That's because there's no API in the workqueue
code that allows for us to require a shutdown on the INIT_DELAYED_WORK's
timer.

Guenter,

Can you remove all the extra patches that touched the timer.h and timer.c
code, and replace the last patch with this, and then try again?

-- Steve

 include/linux/timer.h     | 38 +++++++++++++++++++++++++++--
 include/linux/workqueue.h |  4 ++--
 kernel/time/timer.c       | 50 ++++++++++++++++++++++++++++++++++-----
 kernel/workqueue.c        | 12 ++++++++++
 4 files changed, 94 insertions(+), 10 deletions(-)

diff --git a/include/linux/timer.h b/include/linux/timer.h
index 45392b0ac2e1..27e3a8676ff8 100644
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -8,6 +8,12 @@
 #include <linux/debugobjects.h>
 #include <linux/stringify.h>
 
+enum timer_debug_state {
+	TIMER_DEBUG_DISABLED,
+	TIMER_DEBUG_ENABLED,
+	TIMER_DEBUG_WORK,
+};
+
 struct timer_list {
 	/*
 	 * All fields that change during normal runtime grouped to the
@@ -18,6 +24,9 @@ struct timer_list {
 	void			(*function)(struct timer_list *);
 	u32			flags;
 
+#ifdef CONFIG_DEBUG_OBJECTS_TIMERS
+	enum timer_debug_state	enabled;
+#endif
 #ifdef CONFIG_LOCKDEP
 	struct lockdep_map	lockdep_map;
 #endif
@@ -128,6 +137,31 @@ static inline void init_timer_on_stack_key(struct timer_list *timer,
 	init_timer_on_stack_key((_timer), (_fn), (_flags), NULL, NULL)
 #endif
 
+#ifdef CONFIG_DEBUG_OBJECTS_TIMERS
+#define __init_timer_debug(_timer, _fn, _flags)				\
+	do {								\
+		(_timer)->enabled = TIMER_DEBUG_DISABLED;		\
+		__init_timer((_timer), (_fn), (_flags));		\
+	} while (0)
+#define __init_timer_work(_timer, _fn, _flags)				\
+	do {								\
+		(_timer)->enabled = TIMER_DEBUG_WORK;			\
+		__init_timer((_timer), (_fn), (_flags));		\
+	} while (0)
+#define __init_timer_work_on_stack(_timer, _fn, _flags)				\
+	do {								\
+		(_timer)->enabled = TIMER_DEBUG_WORK;			\
+		__init_timer_on_stack((_timer), (_fn), (_flags));	\
+	} while (0)
+#else
+#define __init_timer_debug(_timer, _fn, _flags)				\
+	__init_timer((_timer), (_fn), (_flags))
+#define __init_timer_work(_timer, _fn, _flags)				\
+	__init_timer((_timer), (_fn), (_flags))
+#define __init_timer_work_on_stack(_timer, _fn, _flags)			\
+	__init_timer_on_stack((_timer), (_fn), (_flags))
+#endif
+
 /**
  * timer_setup - prepare a timer for first use
  * @timer: the timer in question
@@ -139,7 +173,7 @@ static inline void init_timer_on_stack_key(struct timer_list *timer,
  * be used and must be balanced with a call to destroy_timer_on_stack().
  */
 #define timer_setup(timer, callback, flags)			\
-	__init_timer((timer), (callback), (flags))
+	__init_timer_debug((timer), (callback), (flags))
 
 #define timer_setup_on_stack(timer, callback, flags)		\
 	__init_timer_on_stack((timer), (callback), (flags))
@@ -243,7 +277,7 @@ static inline int del_timer_shutdown(struct timer_list *timer)
 	return __del_timer_sync(timer, true);
 }
 
-#define del_singleshot_timer_sync(t) del_timer_sync(t)
+#define del_singleshot_timer_sync(t) del_timer_shutdown(t)
 
 extern void init_timers(void);
 struct hrtimer;
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index a0143dd24430..290c96429ce1 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -250,7 +250,7 @@ static inline unsigned int work_static(struct work_struct *work) { return 0; }
 #define __INIT_DELAYED_WORK(_work, _func, _tflags)			\
 	do {								\
 		INIT_WORK(&(_work)->work, (_func));			\
-		__init_timer(&(_work)->timer,				\
+		__init_timer_work(&(_work)->timer,			\
 			     delayed_work_timer_fn,			\
 			     (_tflags) | TIMER_IRQSAFE);		\
 	} while (0)
@@ -258,7 +258,7 @@ static inline unsigned int work_static(struct work_struct *work) { return 0; }
 #define __INIT_DELAYED_WORK_ONSTACK(_work, _func, _tflags)		\
 	do {								\
 		INIT_WORK_ONSTACK(&(_work)->work, (_func));		\
-		__init_timer_on_stack(&(_work)->timer,			\
+		__init_timer_work_on_stack(&(_work)->timer,		\
 				      delayed_work_timer_fn,		\
 				      (_tflags) | TIMER_IRQSAFE);	\
 	} while (0)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 5179ac2335a0..9a921843cc4f 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -691,7 +691,11 @@ static bool timer_fixup_init(void *addr, enum debug_obj_state state)
 
 	switch (state) {
 	case ODEBUG_STATE_ACTIVE:
-		del_timer_sync(timer);
+		if (timer->enabled != TIMER_DEBUG_WORK)
+			timer->enabled = TIMER_DEBUG_ENABLED;
+		del_timer_shutdown(timer);
+		if (timer->enabled != TIMER_DEBUG_WORK)
+			timer->enabled = TIMER_DEBUG_DISABLED;
 		debug_object_init(timer, &timer_debug_descr);
 		return true;
 	default:
@@ -737,8 +741,10 @@ static bool timer_fixup_free(void *addr, enum debug_obj_state state)
 
 	switch (state) {
 	case ODEBUG_STATE_ACTIVE:
-		del_timer_sync(timer);
+		del_timer_shutdown(timer);
 		debug_object_free(timer, &timer_debug_descr);
+		if (timer->enabled != TIMER_DEBUG_WORK)
+			timer->enabled = TIMER_DEBUG_DISABLED;
 		return true;
 	default:
 		return false;
@@ -774,16 +780,36 @@ static const struct debug_obj_descr timer_debug_descr = {
 
 static inline void debug_timer_init(struct timer_list *timer)
 {
+	if (timer->enabled == TIMER_DEBUG_ENABLED)
+		return;
+
 	debug_object_init(timer, &timer_debug_descr);
 }
 
 static inline void debug_timer_activate(struct timer_list *timer)
 {
+	if (timer->enabled == TIMER_DEBUG_ENABLED)
+		return;
+
+	if (timer->enabled == TIMER_DEBUG_DISABLED)
+		timer->enabled = TIMER_DEBUG_ENABLED;
+
 	debug_object_activate(timer, &timer_debug_descr);
 }
 
-static inline void debug_timer_deactivate(struct timer_list *timer)
+static inline void debug_timer_deactivate(struct timer_list *timer, bool free)
 {
+	switch (timer->enabled) {
+	case TIMER_DEBUG_DISABLED:
+		return;
+	case TIMER_DEBUG_ENABLED:
+		if (!free)
+			return;
+		timer->enabled = TIMER_DEBUG_DISABLED;
+		break;
+	case TIMER_DEBUG_WORK:
+		break;
+	}
 	debug_object_deactivate(timer, &timer_debug_descr);
 }
 
@@ -813,6 +839,14 @@ void destroy_timer_on_stack(struct timer_list *timer)
 }
 EXPORT_SYMBOL_GPL(destroy_timer_on_stack);
 
+static struct timer_base *lock_timer_base(struct timer_list *timer,
+					  unsigned long *flags);
+
+void __timer_reinit_debug_objects(struct timer_list *timer)
+{
+	return;
+}
+
 #else
 static inline void debug_timer_init(struct timer_list *timer) { }
 static inline void debug_timer_activate(struct timer_list *timer) { }
@@ -828,7 +862,7 @@ static inline void debug_init(struct timer_list *timer)
 
 static inline void debug_deactivate(struct timer_list *timer)
 {
-	debug_timer_deactivate(timer);
+	debug_timer_deactivate(timer, false);
 	trace_timer_cancel(timer);
 }
 
@@ -1251,8 +1285,10 @@ int __del_timer(struct timer_list *timer, bool free)
 	if (timer_pending(timer)) {
 		base = lock_timer_base(timer, &flags);
 		ret = detach_if_pending(timer, base, true);
-		if (free && ret)
+		if (free) {
 			timer->function = NULL;
+			debug_timer_deactivate(timer, true);
+		}
 		raw_spin_unlock_irqrestore(&base->lock, flags);
 	}
 
@@ -1272,8 +1308,10 @@ static int __try_to_del_timer_sync(struct timer_list *timer, bool free)
 
 	if (base->running_timer != timer)
 		ret = detach_if_pending(timer, base, true);
-	if (free)
+	if (free) {
 		timer->function = NULL;
+		debug_timer_deactivate(timer, true);
+	}
 
 	raw_spin_unlock_irqrestore(&base->lock, flags);
 
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 47a7124bbea4..9a48213fc4e4 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1225,6 +1225,16 @@ static void pwq_dec_nr_in_flight(struct pool_workqueue *pwq, unsigned long work_
 	put_pwq(pwq);
 }
 
+static void deactivate_timer(struct work_struct *work, bool is_dwork)
+{
+	struct delayed_work *dwork;
+
+	if (!is_dwork)
+		return;
+
+	dwork = to_delayed_work(work);
+}
+
 /**
  * try_to_grab_pending - steal work item from worklist and disable irq
  * @work: work item to steal
@@ -3148,6 +3158,8 @@ static bool __cancel_work_timer(struct work_struct *work, bool is_dwork)
 		}
 	} while (unlikely(ret < 0));
 
+	deactivate_timer(work, is_dwork);
+
 	/* tell other tasks trying to grab @work to back off */
 	mark_work_canceling(work);
 	local_irq_restore(flags);
Steven Rostedt Oct. 28, 2022, 6:10 p.m. UTC | #5
On Fri, 28 Oct 2022 14:01:29 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> @@ -813,6 +839,14 @@ void destroy_timer_on_stack(struct timer_list *timer)
>  }
>  EXPORT_SYMBOL_GPL(destroy_timer_on_stack);
>  
> +static struct timer_base *lock_timer_base(struct timer_list *timer,
> +					  unsigned long *flags);
> +
> +void __timer_reinit_debug_objects(struct timer_list *timer)
> +{
> +	return;
> +}
> +
>  #else
>  static inline void debug_timer_init(struct timer_list *timer) { }
>  static inline void debug_timer_activate(struct timer_list *timer) { }

Bah, the above chunk was leftover from some debugging.

Updated patch:

-- Steve

 include/linux/timer.h     | 38 +++++++++++++++++++++++++++++++++--
 include/linux/workqueue.h |  4 ++--
 kernel/time/timer.c       | 42 +++++++++++++++++++++++++++++++++------
 kernel/workqueue.c        | 12 +++++++++++
 4 files changed, 86 insertions(+), 10 deletions(-)

diff --git a/include/linux/timer.h b/include/linux/timer.h
index 45392b0ac2e1..27e3a8676ff8 100644
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -8,6 +8,12 @@
 #include <linux/debugobjects.h>
 #include <linux/stringify.h>
 
+enum timer_debug_state {
+	TIMER_DEBUG_DISABLED,
+	TIMER_DEBUG_ENABLED,
+	TIMER_DEBUG_WORK,
+};
+
 struct timer_list {
 	/*
 	 * All fields that change during normal runtime grouped to the
@@ -18,6 +24,9 @@ struct timer_list {
 	void			(*function)(struct timer_list *);
 	u32			flags;
 
+#ifdef CONFIG_DEBUG_OBJECTS_TIMERS
+	enum timer_debug_state	enabled;
+#endif
 #ifdef CONFIG_LOCKDEP
 	struct lockdep_map	lockdep_map;
 #endif
@@ -128,6 +137,31 @@ static inline void init_timer_on_stack_key(struct timer_list *timer,
 	init_timer_on_stack_key((_timer), (_fn), (_flags), NULL, NULL)
 #endif
 
+#ifdef CONFIG_DEBUG_OBJECTS_TIMERS
+#define __init_timer_debug(_timer, _fn, _flags)				\
+	do {								\
+		(_timer)->enabled = TIMER_DEBUG_DISABLED;		\
+		__init_timer((_timer), (_fn), (_flags));		\
+	} while (0)
+#define __init_timer_work(_timer, _fn, _flags)				\
+	do {								\
+		(_timer)->enabled = TIMER_DEBUG_WORK;			\
+		__init_timer((_timer), (_fn), (_flags));		\
+	} while (0)
+#define __init_timer_work_on_stack(_timer, _fn, _flags)				\
+	do {								\
+		(_timer)->enabled = TIMER_DEBUG_WORK;			\
+		__init_timer_on_stack((_timer), (_fn), (_flags));	\
+	} while (0)
+#else
+#define __init_timer_debug(_timer, _fn, _flags)				\
+	__init_timer((_timer), (_fn), (_flags))
+#define __init_timer_work(_timer, _fn, _flags)				\
+	__init_timer((_timer), (_fn), (_flags))
+#define __init_timer_work_on_stack(_timer, _fn, _flags)			\
+	__init_timer_on_stack((_timer), (_fn), (_flags))
+#endif
+
 /**
  * timer_setup - prepare a timer for first use
  * @timer: the timer in question
@@ -139,7 +173,7 @@ static inline void init_timer_on_stack_key(struct timer_list *timer,
  * be used and must be balanced with a call to destroy_timer_on_stack().
  */
 #define timer_setup(timer, callback, flags)			\
-	__init_timer((timer), (callback), (flags))
+	__init_timer_debug((timer), (callback), (flags))
 
 #define timer_setup_on_stack(timer, callback, flags)		\
 	__init_timer_on_stack((timer), (callback), (flags))
@@ -243,7 +277,7 @@ static inline int del_timer_shutdown(struct timer_list *timer)
 	return __del_timer_sync(timer, true);
 }
 
-#define del_singleshot_timer_sync(t) del_timer_sync(t)
+#define del_singleshot_timer_sync(t) del_timer_shutdown(t)
 
 extern void init_timers(void);
 struct hrtimer;
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index a0143dd24430..290c96429ce1 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -250,7 +250,7 @@ static inline unsigned int work_static(struct work_struct *work) { return 0; }
 #define __INIT_DELAYED_WORK(_work, _func, _tflags)			\
 	do {								\
 		INIT_WORK(&(_work)->work, (_func));			\
-		__init_timer(&(_work)->timer,				\
+		__init_timer_work(&(_work)->timer,			\
 			     delayed_work_timer_fn,			\
 			     (_tflags) | TIMER_IRQSAFE);		\
 	} while (0)
@@ -258,7 +258,7 @@ static inline unsigned int work_static(struct work_struct *work) { return 0; }
 #define __INIT_DELAYED_WORK_ONSTACK(_work, _func, _tflags)		\
 	do {								\
 		INIT_WORK_ONSTACK(&(_work)->work, (_func));		\
-		__init_timer_on_stack(&(_work)->timer,			\
+		__init_timer_work_on_stack(&(_work)->timer,		\
 				      delayed_work_timer_fn,		\
 				      (_tflags) | TIMER_IRQSAFE);	\
 	} while (0)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 5179ac2335a0..ac2e8beb4235 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -691,7 +691,11 @@ static bool timer_fixup_init(void *addr, enum debug_obj_state state)
 
 	switch (state) {
 	case ODEBUG_STATE_ACTIVE:
-		del_timer_sync(timer);
+		if (timer->enabled != TIMER_DEBUG_WORK)
+			timer->enabled = TIMER_DEBUG_ENABLED;
+		del_timer_shutdown(timer);
+		if (timer->enabled != TIMER_DEBUG_WORK)
+			timer->enabled = TIMER_DEBUG_DISABLED;
 		debug_object_init(timer, &timer_debug_descr);
 		return true;
 	default:
@@ -737,8 +741,10 @@ static bool timer_fixup_free(void *addr, enum debug_obj_state state)
 
 	switch (state) {
 	case ODEBUG_STATE_ACTIVE:
-		del_timer_sync(timer);
+		del_timer_shutdown(timer);
 		debug_object_free(timer, &timer_debug_descr);
+		if (timer->enabled != TIMER_DEBUG_WORK)
+			timer->enabled = TIMER_DEBUG_DISABLED;
 		return true;
 	default:
 		return false;
@@ -774,16 +780,36 @@ static const struct debug_obj_descr timer_debug_descr = {
 
 static inline void debug_timer_init(struct timer_list *timer)
 {
+	if (timer->enabled == TIMER_DEBUG_ENABLED)
+		return;
+
 	debug_object_init(timer, &timer_debug_descr);
 }
 
 static inline void debug_timer_activate(struct timer_list *timer)
 {
+	if (timer->enabled == TIMER_DEBUG_ENABLED)
+		return;
+
+	if (timer->enabled == TIMER_DEBUG_DISABLED)
+		timer->enabled = TIMER_DEBUG_ENABLED;
+
 	debug_object_activate(timer, &timer_debug_descr);
 }
 
-static inline void debug_timer_deactivate(struct timer_list *timer)
+static inline void debug_timer_deactivate(struct timer_list *timer, bool free)
 {
+	switch (timer->enabled) {
+	case TIMER_DEBUG_DISABLED:
+		return;
+	case TIMER_DEBUG_ENABLED:
+		if (!free)
+			return;
+		timer->enabled = TIMER_DEBUG_DISABLED;
+		break;
+	case TIMER_DEBUG_WORK:
+		break;
+	}
 	debug_object_deactivate(timer, &timer_debug_descr);
 }
 
@@ -828,7 +854,7 @@ static inline void debug_init(struct timer_list *timer)
 
 static inline void debug_deactivate(struct timer_list *timer)
 {
-	debug_timer_deactivate(timer);
+	debug_timer_deactivate(timer, false);
 	trace_timer_cancel(timer);
 }
 
@@ -1251,8 +1277,10 @@ int __del_timer(struct timer_list *timer, bool free)
 	if (timer_pending(timer)) {
 		base = lock_timer_base(timer, &flags);
 		ret = detach_if_pending(timer, base, true);
-		if (free && ret)
+		if (free) {
 			timer->function = NULL;
+			debug_timer_deactivate(timer, true);
+		}
 		raw_spin_unlock_irqrestore(&base->lock, flags);
 	}
 
@@ -1272,8 +1300,10 @@ static int __try_to_del_timer_sync(struct timer_list *timer, bool free)
 
 	if (base->running_timer != timer)
 		ret = detach_if_pending(timer, base, true);
-	if (free)
+	if (free) {
 		timer->function = NULL;
+		debug_timer_deactivate(timer, true);
+	}
 
 	raw_spin_unlock_irqrestore(&base->lock, flags);
 
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 47a7124bbea4..9a48213fc4e4 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1225,6 +1225,16 @@ static void pwq_dec_nr_in_flight(struct pool_workqueue *pwq, unsigned long work_
 	put_pwq(pwq);
 }
 
+static void deactivate_timer(struct work_struct *work, bool is_dwork)
+{
+	struct delayed_work *dwork;
+
+	if (!is_dwork)
+		return;
+
+	dwork = to_delayed_work(work);
+}
+
 /**
  * try_to_grab_pending - steal work item from worklist and disable irq
  * @work: work item to steal
@@ -3148,6 +3158,8 @@ static bool __cancel_work_timer(struct work_struct *work, bool is_dwork)
 		}
 	} while (unlikely(ret < 0));
 
+	deactivate_timer(work, is_dwork);
+
 	/* tell other tasks trying to grab @work to back off */
 	mark_work_canceling(work);
 	local_irq_restore(flags);
Guenter Roeck Oct. 28, 2022, 7:59 p.m. UTC | #6
On Fri, Oct 28, 2022 at 02:10:07PM -0400, Steven Rostedt wrote:
> On Fri, 28 Oct 2022 14:01:29 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > @@ -813,6 +839,14 @@ void destroy_timer_on_stack(struct timer_list *timer)
> >  }
> >  EXPORT_SYMBOL_GPL(destroy_timer_on_stack);
> >  
> > +static struct timer_base *lock_timer_base(struct timer_list *timer,
> > +					  unsigned long *flags);
> > +
> > +void __timer_reinit_debug_objects(struct timer_list *timer)
> > +{
> > +	return;
> > +}
> > +
> >  #else
> >  static inline void debug_timer_init(struct timer_list *timer) { }
> >  static inline void debug_timer_activate(struct timer_list *timer) { }
> 
> Bah, the above chunk was leftover from some debugging.
> 

I'll test again with the following changes on top of your published
patch series. I hope this is the current status, but I may have lost
something.

Looking into it ... deactivate_timer() doesn't do anything 
and seems wrong. Did I miss something ?

Thanks,
Guenter

---
diff --git a/block/blk-core.c b/block/blk-core.c
index 17667159482e..69b1daa2e91a 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -227,7 +227,7 @@ const char *blk_status_to_str(blk_status_t status)
  */
 void blk_sync_queue(struct request_queue *q)
 {
-	del_timer_sync(&q->timeout);
+	del_timer_shutdown(&q->timeout);
 	cancel_work_sync(&q->timeout_work);
 }
 EXPORT_SYMBOL(blk_sync_queue);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index e71b3b43927c..12a1e46536ed 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -769,6 +769,8 @@ static void blk_release_queue(struct kobject *kobj)
 
 	percpu_ref_exit(&q->q_usage_counter);
 
+	blk_sync_queue(q);
+
 	if (q->poll_stat)
 		blk_stat_remove_callback(q, q->poll_cb);
 	blk_stat_free_callback(q->poll_cb);
diff --git a/drivers/net/ethernet/dec/tulip/tulip_core.c b/drivers/net/ethernet/dec/tulip/tulip_core.c
index ecfad43df45a..0c86066929d3 100644
--- a/drivers/net/ethernet/dec/tulip/tulip_core.c
+++ b/drivers/net/ethernet/dec/tulip/tulip_core.c
@@ -770,8 +770,6 @@ static void tulip_down (struct net_device *dev)
 
 	spin_unlock_irqrestore (&tp->lock, flags);
 
-	timer_setup(&tp->timer, tulip_tbl[tp->chip_id].media_timer, 0);
-
 	dev->if_port = tp->saved_if_port;
 
 	/* Leave the driver in snooze, not sleep, mode. */
@@ -1869,10 +1867,14 @@ static int __maybe_unused tulip_resume(struct device *dev_d)
 static void tulip_remove_one(struct pci_dev *pdev)
 {
 	struct net_device *dev = pci_get_drvdata (pdev);
+	struct tulip_private *tp;
 
 	if (!dev)
 		return;
 
+	tp = netdev_priv(dev);
+	del_timer_shutdown(&tp->timer);
+
 	unregister_netdev(dev);
 }
 
diff --git a/drivers/parport/ieee1284.c b/drivers/parport/ieee1284.c
index 4547ac44c8d4..50dbd2ea23fc 100644
--- a/drivers/parport/ieee1284.c
+++ b/drivers/parport/ieee1284.c
@@ -73,7 +73,7 @@ int parport_wait_event (struct parport *port, signed long timeout)
 	timer_setup(&port->timer, timeout_waiting_on_port, 0);
 	mod_timer(&port->timer, jiffies + timeout);
 	ret = down_interruptible (&port->physport->ieee1284.irq);
-	if (!del_timer_sync(&port->timer) && !ret)
+	if (!del_timer_shutdown(&port->timer) && !ret)
 		/* Timed out. */
 		ret = 1;
 
diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 63f32f843e75..b91b27c398ae 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -789,7 +789,7 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code)
 	while (!list_empty(&hostdata->sent)) {
 		evt = list_first_entry(&hostdata->sent, struct srp_event_struct, list);
 		list_del(&evt->list);
-		del_timer(&evt->timer);
+		del_timer_try_shutdown(&evt->timer);
 
 		spin_unlock_irqrestore(hostdata->host->host_lock, flags);
 		if (evt->cmnd) {
@@ -944,7 +944,7 @@ static int ibmvscsi_send_srp_event(struct srp_event_struct *evt_struct,
 			       be64_to_cpu(crq_as_u64[1]));
 	if (rc != 0) {
 		list_del(&evt_struct->list);
-		del_timer(&evt_struct->timer);
+		del_timer_shutdown(&evt_struct->timer);
 
 		/* If send_crq returns H_CLOSED, return SCSI_MLQUEUE_HOST_BUSY.
 		 * Firmware will send a CRQ with a transport event (0xFF) to
@@ -1840,7 +1840,7 @@ static void ibmvscsi_handle_crq(struct viosrp_crq *crq,
 		atomic_add(be32_to_cpu(evt_struct->xfer_iu->srp.rsp.req_lim_delta),
 			   &hostdata->request_limit);
 
-	del_timer(&evt_struct->timer);
+	del_timer_shutdown(&evt_struct->timer);
 
 	if ((crq->status != VIOSRP_OK && crq->status != VIOSRP_OK2) && evt_struct->cmnd)
 		evt_struct->cmnd->result = DID_ERROR << 16;
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 397f263ab7da..7d1f7a89a5ea 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -1082,6 +1082,9 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
 			delay = hub_power_on_good_delay(hub);
 
 			hub_power_on(hub, false);
+			/* Timers must be shutdown before they are re-initialized */
+			if (hub->init_work.work.func)
+				del_timer_shutdown(&hub->init_work.timer);
 			INIT_DELAYED_WORK(&hub->init_work, hub_init_func2);
 			queue_delayed_work(system_power_efficient_wq,
 					&hub->init_work,
diff --git a/include/linux/timer.h b/include/linux/timer.h
index d4d90149d015..4dfb3913bb69 100644
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -8,6 +8,12 @@
 #include <linux/debugobjects.h>
 #include <linux/stringify.h>
 
+enum timer_debug_state {
+	TIMER_DEBUG_DISABLED,
+	TIMER_DEBUG_ENABLED,
+	TIMER_DEBUG_WORK,
+};
+
 struct timer_list {
 	/*
 	 * All fields that change during normal runtime grouped to the
@@ -19,7 +25,7 @@ struct timer_list {
 	u32			flags;
 
 #ifdef CONFIG_DEBUG_OBJECTS_TIMERS
-	u32			enabled;
+	enum timer_debug_state	enabled;
 #endif
 #ifdef CONFIG_LOCKDEP
 	struct lockdep_map	lockdep_map;
@@ -134,14 +140,26 @@ static inline void init_timer_on_stack_key(struct timer_list *timer,
 #ifdef CONFIG_DEBUG_OBJECTS_TIMERS
 #define __init_timer_debug(_timer, _fn, _flags)				\
 	do {								\
-		(_timer)->enabled = 0;					\
+		(_timer)->enabled = TIMER_DEBUG_DISABLED;		\
 		__init_timer((_timer), (_fn), (_flags));		\
 	} while (0)
-#else
-#define __init_timer_debug(_timer, _fn, _flags)				\
+#define __init_timer_work(_timer, _fn, _flags)				\
 	do {								\
+		(_timer)->enabled = TIMER_DEBUG_WORK;			\
 		__init_timer((_timer), (_fn), (_flags));		\
 	} while (0)
+#define __init_timer_work_on_stack(_timer, _fn, _flags)				\
+	do {								\
+		(_timer)->enabled = TIMER_DEBUG_WORK;			\
+		__init_timer_on_stack((_timer), (_fn), (_flags));	\
+	} while (0)
+#else
+#define __init_timer_debug(_timer, _fn, _flags)				\
+	__init_timer((_timer), (_fn), (_flags))
+#define __init_timer_work(_timer, _fn, _flags)				\
+	__init_timer((_timer), (_fn), (_flags))
+#define __init_timer_work_on_stack(_timer, _fn, _flags)			\
+	__init_timer_on_stack((_timer), (_fn), (_flags))
 #endif
 
 /**
@@ -184,12 +202,23 @@ static inline int timer_pending(const struct timer_list * timer)
 	return !hlist_unhashed_lockless(&timer->entry);
 }
 
+extern int __del_timer(struct timer_list * timer, bool free);
+
 extern void add_timer_on(struct timer_list *timer, int cpu);
-extern int del_timer(struct timer_list * timer);
 extern int mod_timer(struct timer_list *timer, unsigned long expires);
 extern int mod_timer_pending(struct timer_list *timer, unsigned long expires);
 extern int timer_reduce(struct timer_list *timer, unsigned long expires);
 
+static inline int del_timer_try_shutdown(struct timer_list *timer)
+{
+	return __del_timer(timer, true);
+}
+
+static inline int del_timer(struct timer_list *timer)
+{
+	return __del_timer(timer, false);
+}
+
 /*
  * The jiffies value which is added to now, when there is no timer
  * in the timer wheel:
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index a0143dd24430..290c96429ce1 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -250,7 +250,7 @@ static inline unsigned int work_static(struct work_struct *work) { return 0; }
 #define __INIT_DELAYED_WORK(_work, _func, _tflags)			\
 	do {								\
 		INIT_WORK(&(_work)->work, (_func));			\
-		__init_timer(&(_work)->timer,				\
+		__init_timer_work(&(_work)->timer,			\
 			     delayed_work_timer_fn,			\
 			     (_tflags) | TIMER_IRQSAFE);		\
 	} while (0)
@@ -258,7 +258,7 @@ static inline unsigned int work_static(struct work_struct *work) { return 0; }
 #define __INIT_DELAYED_WORK_ONSTACK(_work, _func, _tflags)		\
 	do {								\
 		INIT_WORK_ONSTACK(&(_work)->work, (_func));		\
-		__init_timer_on_stack(&(_work)->timer,			\
+		__init_timer_work_on_stack(&(_work)->timer,		\
 				      delayed_work_timer_fn,		\
 				      (_tflags) | TIMER_IRQSAFE);	\
 	} while (0)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 1d17552b3ede..3c47652aeccf 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -691,7 +691,11 @@ static bool timer_fixup_init(void *addr, enum debug_obj_state state)
 
 	switch (state) {
 	case ODEBUG_STATE_ACTIVE:
-		del_timer_sync(timer);
+		if (timer->enabled != TIMER_DEBUG_WORK)
+			timer->enabled = TIMER_DEBUG_ENABLED;
+		del_timer_shutdown(timer);
+		if (timer->enabled != TIMER_DEBUG_WORK)
+			timer->enabled = TIMER_DEBUG_DISABLED;
 		debug_object_init(timer, &timer_debug_descr);
 		return true;
 	default:
@@ -737,8 +741,10 @@ static bool timer_fixup_free(void *addr, enum debug_obj_state state)
 
 	switch (state) {
 	case ODEBUG_STATE_ACTIVE:
-		del_timer_sync(timer);
+		del_timer_shutdown(timer);
 		debug_object_free(timer, &timer_debug_descr);
+		if (timer->enabled != TIMER_DEBUG_WORK)
+			timer->enabled = TIMER_DEBUG_DISABLED;
 		return true;
 	default:
 		return false;
@@ -774,22 +780,37 @@ static const struct debug_obj_descr timer_debug_descr = {
 
 static inline void debug_timer_init(struct timer_list *timer)
 {
-	if (!timer->enabled)
-		debug_object_init(timer, &timer_debug_descr);
+	if (timer->enabled == TIMER_DEBUG_ENABLED)
+		return;
+
+	debug_object_init(timer, &timer_debug_descr);
 }
 
 static inline void debug_timer_activate(struct timer_list *timer)
 {
-	if (!timer->enabled) {
-		timer->enabled = 1;
-		debug_object_activate(timer, &timer_debug_descr);
-	}
+	if (timer->enabled == TIMER_DEBUG_ENABLED)
+		return;
+
+	if (timer->enabled == TIMER_DEBUG_DISABLED)
+		timer->enabled = TIMER_DEBUG_ENABLED;
+
+	debug_object_activate(timer, &timer_debug_descr);
 }
 
-static inline void debug_timer_deactivate(struct timer_list *timer)
+static inline void debug_timer_deactivate(struct timer_list *timer, bool free)
 {
-	if (timer->enabled)
-		debug_object_deactivate(timer, &timer_debug_descr);
+	switch (timer->enabled) {
+	case TIMER_DEBUG_DISABLED:
+		return;
+	case TIMER_DEBUG_ENABLED:
+		if (!free)
+			return;
+		timer->enabled = TIMER_DEBUG_DISABLED;
+		break;
+	case TIMER_DEBUG_WORK:
+		break;
+	}
+	debug_object_deactivate(timer, &timer_debug_descr);
 }
 
 static inline void debug_timer_assert_init(struct timer_list *timer)
@@ -833,6 +854,7 @@ static inline void debug_init(struct timer_list *timer)
 
 static inline void debug_deactivate(struct timer_list *timer)
 {
+	debug_timer_deactivate(timer, false);
 	trace_timer_cancel(timer);
 }
 
@@ -1255,7 +1277,7 @@ EXPORT_SYMBOL_GPL(add_timer_on);
  * (ie. del_timer() of an inactive timer returns 0, del_timer() of an
  * active timer returns 1.)
  */
-int del_timer(struct timer_list *timer)
+int __del_timer(struct timer_list *timer, bool free)
 {
 	struct timer_base *base;
 	unsigned long flags;
@@ -1266,12 +1288,16 @@ int del_timer(struct timer_list *timer)
 	if (timer_pending(timer)) {
 		base = lock_timer_base(timer, &flags);
 		ret = detach_if_pending(timer, base, true);
+		if (free) {
+			timer->function = NULL;
+			debug_timer_deactivate(timer);
+		}
 		raw_spin_unlock_irqrestore(&base->lock, flags);
 	}
 
 	return ret;
 }
-EXPORT_SYMBOL(del_timer);
+EXPORT_SYMBOL(__del_timer);
 
 static int __try_to_del_timer_sync(struct timer_list *timer, bool free)
 {
@@ -1287,7 +1313,7 @@ static int __try_to_del_timer_sync(struct timer_list *timer, bool free)
 		ret = detach_if_pending(timer, base, true);
 	if (free) {
 		timer->function = NULL;
-		debug_timer_deactivate(timer);
+		debug_timer_deactivate(timer, true);
 	}
 
 	raw_spin_unlock_irqrestore(&base->lock, flags);
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 47a7124bbea4..9a48213fc4e4 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1225,6 +1225,16 @@ static void pwq_dec_nr_in_flight(struct pool_workqueue *pwq, unsigned long work_
 	put_pwq(pwq);
 }
 
+static void deactivate_timer(struct work_struct *work, bool is_dwork)
+{
+	struct delayed_work *dwork;
+
+	if (!is_dwork)
+		return;
+
+	dwork = to_delayed_work(work);
+}
+
 /**
  * try_to_grab_pending - steal work item from worklist and disable irq
  * @work: work item to steal
@@ -3148,6 +3158,8 @@ static bool __cancel_work_timer(struct work_struct *work, bool is_dwork)
 		}
 	} while (unlikely(ret < 0));
 
+	deactivate_timer(work, is_dwork);
+
 	/* tell other tasks trying to grab @work to back off */
 	mark_work_canceling(work);
 	local_irq_restore(flags);
diff --git a/net/core/sock.c b/net/core/sock.c
index 10cc84379d75..23a97442a0a6 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3345,7 +3345,7 @@ EXPORT_SYMBOL(sk_reset_timer);
 
 void sk_stop_timer(struct sock *sk, struct timer_list* timer)
 {
-	if (del_timer(timer))
+	if (del_timer_try_shutdown(timer))
 		__sock_put(sk);
 }
 EXPORT_SYMBOL(sk_stop_timer);
Steven Rostedt Oct. 28, 2022, 8:40 p.m. UTC | #7
On Fri, 28 Oct 2022 12:59:59 -0700
Guenter Roeck <linux@roeck-us.net> wrote:
> 
> I'll test again with the following changes on top of your published
> patch series. I hope this is the current status, but I may have lost
> something.
> 
> Looking into it ... deactivate_timer() doesn't do anything 
> and seems wrong. Did I miss something ?

You mean debug_deactivate_timer() or debug_deactivate?


> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c


>  
> -static inline void debug_timer_deactivate(struct timer_list *timer)
> +static inline void debug_timer_deactivate(struct timer_list *timer, bool free)
>  {
> -	if (timer->enabled)
> -		debug_object_deactivate(timer, &timer_debug_descr);
> +	switch (timer->enabled) {
> +	case TIMER_DEBUG_DISABLED:

DISABLE is set before an activate happens (before it is ever armed).

> +		return;
> +	case TIMER_DEBUG_ENABLED:
> +		if (!free)
> +			return;

This is called by del_timer{,_sync}() where free is false, or
del_timer_shutdown() where free is true.

We only want to deactivate when free is true.

> +		timer->enabled = TIMER_DEBUG_DISABLED;

And we allow for initialization of a "freed" timer again.

> +		break;
> +	case TIMER_DEBUG_WORK:

This is part of the delayed_work timers, were we keep the old behavior
(del_timer() and del_timer_sync() both deactivate the timer.

> +		break;
> +	}
> +	debug_object_deactivate(timer, &timer_debug_descr);

Here we call the debug object code to deactivate it.

>  }
>  
>  static inline void debug_timer_assert_init(struct timer_list *timer)
> @@ -833,6 +854,7 @@ static inline void debug_init(struct timer_list *timer)
>  
>  static inline void debug_deactivate(struct timer_list *timer)
>  {
> +	debug_timer_deactivate(timer, false);

This calls the above code.

>  	trace_timer_cancel(timer);
>  }


Or am I confused and you meant something else?

-- Steve
Guenter Roeck Oct. 28, 2022, 11:25 p.m. UTC | #8
On 10/28/22 13:40, Steven Rostedt wrote:
> On Fri, 28 Oct 2022 12:59:59 -0700
> Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> I'll test again with the following changes on top of your published
>> patch series. I hope this is the current status, but I may have lost
>> something.
>>
>> Looking into it ... deactivate_timer() doesn't do anything
>> and seems wrong. Did I miss something ?
> 
> You mean debug_deactivate_timer() or debug_deactivate?
> 

This:

+static void deactivate_timer(struct work_struct *work, bool is_dwork)
+{
+       struct delayed_work *dwork;
+
+       if (!is_dwork)
+               return;
+
+       dwork = to_delayed_work(work);
+}

Guenter
Steven Rostedt Oct. 28, 2022, 11:29 p.m. UTC | #9
On Fri, 28 Oct 2022 16:25:32 -0700
Guenter Roeck <linux@roeck-us.net> wrote:

> On 10/28/22 13:40, Steven Rostedt wrote:
> > On Fri, 28 Oct 2022 12:59:59 -0700
> > Guenter Roeck <linux@roeck-us.net> wrote:  
> >>
> >> I'll test again with the following changes on top of your published
> >> patch series. I hope this is the current status, but I may have lost
> >> something.
> >>
> >> Looking into it ... deactivate_timer() doesn't do anything
> >> and seems wrong. Did I miss something ?  
> > 
> > You mean debug_deactivate_timer() or debug_deactivate?
> >   
> 
> This:
> 
> +static void deactivate_timer(struct work_struct *work, bool is_dwork)
> +{
> +       struct delayed_work *dwork;
> +
> +       if (!is_dwork)
> +               return;
> +
> +       dwork = to_delayed_work(work);
> +}

Oh, that was part of my trying to figure out WTF delayed work was doing
with its timers. You can delete it's existence.

Thanks (and I'll go remove it from my tree).

-- Steve
Guenter Roeck Oct. 29, 2022, 2:52 p.m. UTC | #10
On Fri, Oct 28, 2022 at 01:00:02PM -0700, Guenter Roeck wrote:
> On Fri, Oct 28, 2022 at 02:10:07PM -0400, Steven Rostedt wrote:
> > On Fri, 28 Oct 2022 14:01:29 -0400
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > @@ -813,6 +839,14 @@ void destroy_timer_on_stack(struct timer_list *timer)
> > >  }
> > >  EXPORT_SYMBOL_GPL(destroy_timer_on_stack);
> > >  
> > > +static struct timer_base *lock_timer_base(struct timer_list *timer,
> > > +					  unsigned long *flags);
> > > +
> > > +void __timer_reinit_debug_objects(struct timer_list *timer)
> > > +{
> > > +	return;
> > > +}
> > > +
> > >  #else
> > >  static inline void debug_timer_init(struct timer_list *timer) { }
> > >  static inline void debug_timer_activate(struct timer_list *timer) { }
> > 
> > Bah, the above chunk was leftover from some debugging.
> > 
> 
> I'll test again with the following changes on top of your published
> patch series. I hope this is the current status, but I may have lost
> something.
> 

With the diffs I sent earlier applied, the warning still seen is

WARNING: CPU: 0 PID: 9 at lib/debugobjects.c:502 debug_print_object+0xd0/0x100
ODEBUG: free active (active state 0) object type: timer_list hint: neigh_timer_handler+0x0/0x480

That happens with almost every test, so I may have missed some others
in the noise.

Guenter
Steven Rostedt Oct. 29, 2022, 7:19 p.m. UTC | #11
On Sat, 29 Oct 2022 07:52:41 -0700
Guenter Roeck <linux@roeck-us.net> wrote:

> With the diffs I sent earlier applied, the warning still seen is
> 
> WARNING: CPU: 0 PID: 9 at lib/debugobjects.c:502 debug_print_object+0xd0/0x100
> ODEBUG: free active (active state 0) object type: timer_list hint: neigh_timer_handler+0x0/0x480
> 
> That happens with almost every test, so I may have missed some others
> in the noise.

Can you add this?

-- Steve

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 3c4786b99907..3e2586c72c7e 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -895,6 +895,8 @@ void neigh_destroy(struct neighbour *neigh)
 	if (neigh_del_timer(neigh))
 		pr_warn("Impossible event\n");
 
+	del_timer_try_shutdown(&neigh->timer);
+
 	write_lock_bh(&neigh->lock);
 	__skb_queue_purge(&neigh->arp_queue);
 	write_unlock_bh(&neigh->lock);
Guenter Roeck Oct. 29, 2022, 10:56 p.m. UTC | #12
On 10/29/22 12:19, Steven Rostedt wrote:
> On Sat, 29 Oct 2022 07:52:41 -0700
> Guenter Roeck <linux@roeck-us.net> wrote:
> 
>> With the diffs I sent earlier applied, the warning still seen is
>>
>> WARNING: CPU: 0 PID: 9 at lib/debugobjects.c:502 debug_print_object+0xd0/0x100
>> ODEBUG: free active (active state 0) object type: timer_list hint: neigh_timer_handler+0x0/0x480
>>
>> That happens with almost every test, so I may have missed some others
>> in the noise.
> 
> Can you add this?
> 

It doesn't make a difference.

Guenter

> -- Steve
> 
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 3c4786b99907..3e2586c72c7e 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -895,6 +895,8 @@ void neigh_destroy(struct neighbour *neigh)
>   	if (neigh_del_timer(neigh))
>   		pr_warn("Impossible event\n");
>   
> +	del_timer_try_shutdown(&neigh->timer);
> +
>   	write_lock_bh(&neigh->lock);
>   	__skb_queue_purge(&neigh->arp_queue);
>   	write_unlock_bh(&neigh->lock);
Steven Rostedt Oct. 30, 2022, 3:48 p.m. UTC | #13
On Sat, 29 Oct 2022 15:56:25 -0700
Guenter Roeck <linux@roeck-us.net> wrote:

> >> WARNING: CPU: 0 PID: 9 at lib/debugobjects.c:502 debug_print_object+0xd0/0x100
> >> ODEBUG: free active (active state 0) object type: timer_list hint: neigh_timer_handler+0x0/0x480
> >>
> >> That happens with almost every test, so I may have missed some others
> >> in the noise.  
> > 
> > Can you add this?
> >   
> 
> It doesn't make a difference.

Ah, it also requires this (I have other debugging in that file, so it may
only apply with some fuzzing):

-- Steve


diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index ac2e8beb4235..f2ccf24a8448 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1282,6 +1296,11 @@ int __del_timer(struct timer_list *timer, bool free)
 			debug_timer_deactivate(timer, true);
 		}
 		raw_spin_unlock_irqrestore(&base->lock, flags);
+	} else if (free) {
+		base = lock_timer_base(timer, &flags);
+		timer->function = NULL;
+		debug_timer_deactivate(timer, true);
+		raw_spin_unlock_irqrestore(&base->lock, flags);
 	}
 
 	return ret;
Guenter Roeck Oct. 31, 2022, 3:50 p.m. UTC | #14
On Sun, Oct 30, 2022 at 11:48:28AM -0400, Steven Rostedt wrote:
> On Sat, 29 Oct 2022 15:56:25 -0700
> Guenter Roeck <linux@roeck-us.net> wrote:
> 
> > >> WARNING: CPU: 0 PID: 9 at lib/debugobjects.c:502 debug_print_object+0xd0/0x100
> > >> ODEBUG: free active (active state 0) object type: timer_list hint: neigh_timer_handler+0x0/0x480
> > >>
> > >> That happens with almost every test, so I may have missed some others
> > >> in the noise.  
> > > 
> > > Can you add this?
> > >   
> > 
> > It doesn't make a difference.
> 
> Ah, it also requires this (I have other debugging in that file, so it may
> only apply with some fuzzing):
> 

Almost good, except for the attached backtrace. That seems to happen
on shutdown after bootting from a usb drive, but not on all platforms.

The warning is in __mod_timer():

        if (WARN_ON_ONCE(!timer->function))
                return -EINVAL;

This may be due to the change in blk_sync_queue() which I suspect may
be called prior to the last mod_timer() call. I'll add some debug code
to verify.

Guenter

------------[ cut here ]------------
WARNING: CPU: 0 PID: 283 at kernel/time/timer.c:1046 mod_timer+0x294/0x34c
Modules linked in:
CPU: 0 PID: 283 Comm: init Tainted: G                 N 6.1.0-rc2-00397-g18ccc9f8a778 #1
Hardware name: Freescale i.MX25 (Device Tree Support)
 unwind_backtrace from show_stack+0x10/0x18
 show_stack from dump_stack_lvl+0x34/0x54
 dump_stack_lvl from __warn+0xc0/0x1f0
 __warn from warn_slowpath_fmt+0x5c/0xc4
 warn_slowpath_fmt from mod_timer+0x294/0x34c
 mod_timer from blk_add_timer+0xa4/0xb4
 blk_add_timer from blk_mq_start_request+0x84/0x1f4
 blk_mq_start_request from scsi_queue_rq+0x4a8/0xb84
 scsi_queue_rq from blk_mq_dispatch_rq_list+0x320/0x9d0
 blk_mq_dispatch_rq_list from __blk_mq_sched_dispatch_requests+0xb0/0x158
 __blk_mq_sched_dispatch_requests from blk_mq_sched_dispatch_requests+0x34/0x64
 blk_mq_sched_dispatch_requests from __blk_mq_run_hw_queue+0x8c/0x234
 __blk_mq_run_hw_queue from blk_mq_sched_insert_request+0xe8/0x15c
 blk_mq_sched_insert_request from blk_execute_rq+0xa4/0x1d0
 blk_execute_rq from __scsi_execute+0xb4/0x19c
 __scsi_execute from sd_sync_cache+0xac/0x1ec
 sd_sync_cache from sd_shutdown+0x5c/0xc8
 sd_shutdown from sd_remove+0x30/0x44
 sd_remove from device_release_driver_internal+0xd0/0x16c
 device_release_driver_internal from bus_remove_device+0xd0/0x100
 bus_remove_device from device_del+0x190/0x464
 device_del from __scsi_remove_device+0x130/0x184
 __scsi_remove_device from scsi_forget_host+0x60/0x64
 scsi_forget_host from scsi_remove_host+0x6c/0x188
 scsi_remove_host from usb_stor_disconnect+0x40/0xf4
 usb_stor_disconnect from usb_unbind_interface+0x68/0x230
 usb_unbind_interface from device_release_driver_internal+0xd0/0x16c
 device_release_driver_internal from bus_remove_device+0xd0/0x100
 bus_remove_device from device_del+0x190/0x464
 device_del from usb_disable_device+0x88/0x130
 usb_disable_device from usb_disconnect+0xb4/0x234
 usb_disconnect from usb_disconnect+0x9c/0x234
 usb_disconnect from usb_remove_hcd+0xd0/0x16c
 usb_remove_hcd from host_stop+0x38/0xa8
 host_stop from ci_hdrc_remove+0x40/0x11c
 ci_hdrc_remove from platform_remove+0x24/0x54
 platform_remove from device_release_driver_internal+0xd0/0x16c
 device_release_driver_internal from bus_remove_device+0xd0/0x100
 bus_remove_device from device_del+0x190/0x464
 device_del from platform_device_del.part.0+0x10/0x78
 platform_device_del.part.0 from platform_device_unregister+0x18/0x28
 platform_device_unregister from ci_hdrc_remove_device+0xc/0x24
 ci_hdrc_remove_device from ci_hdrc_imx_remove+0x28/0xfc
 ci_hdrc_imx_remove from device_shutdown+0x178/0x230
 device_shutdown from kernel_restart_prepare+0x2c/0x3c
 kernel_restart_prepare from kernel_restart+0xc/0x68
 kernel_restart from __do_sys_reboot+0xc0/0x204
 __do_sys_reboot from ret_fast_syscall+0x0/0x1c
Exception stack(0xc8ca1fa8 to 0xc8ca1ff0)
1fa0:                   01234567 0000000f fee1dead 28121969 01234567 00000000
1fc0: 01234567 0000000f 00000001 00000058 000e05c0 00000000 00000000 00000000
1fe0: 000e0298 bea82de4 000994bc b6f6d2c0
irq event stamp: 3443
hardirqs last  enabled at (3451): [<c0074590>] __up_console_sem+0x64/0x88
hardirqs last disabled at (3458): [<c007457c>] __up_console_sem+0x50/0x88
softirqs last  enabled at (3438): [<c000988c>] __do_softirq+0x2fc/0x5d0
softirqs last disabled at (3433): [<c0022518>] __irq_exit_rcu+0x170/0x1ec
---[ end trace 0000000000000000 ]---
sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
ci_hdrc ci_hdrc.0: USB bus 1 deregistered
reboot: Restarting system
------------
Guenter Roeck Oct. 31, 2022, 8:14 p.m. UTC | #15
On Mon, Oct 31, 2022 at 08:50:58AM -0700, Guenter Roeck wrote:
> On Sun, Oct 30, 2022 at 11:48:28AM -0400, Steven Rostedt wrote:
> > On Sat, 29 Oct 2022 15:56:25 -0700
> > Guenter Roeck <linux@roeck-us.net> wrote:
> > 
> > > >> WARNING: CPU: 0 PID: 9 at lib/debugobjects.c:502 debug_print_object+0xd0/0x100
> > > >> ODEBUG: free active (active state 0) object type: timer_list hint: neigh_timer_handler+0x0/0x480
> > > >>
> > > >> That happens with almost every test, so I may have missed some others
> > > >> in the noise.  
> > > > 
> > > > Can you add this?
> > > >   
> > > 
> > > It doesn't make a difference.
> > 
> > Ah, it also requires this (I have other debugging in that file, so it may
> > only apply with some fuzzing):
> > 
> 
> Almost good, except for the attached backtrace. That seems to happen
> on shutdown after bootting from a usb drive, but not on all platforms.
> 
> The warning is in __mod_timer():
> 
>         if (WARN_ON_ONCE(!timer->function))
>                 return -EINVAL;
> 
> This may be due to the change in blk_sync_queue() which I suspect may
> be called prior to the last mod_timer() call. I'll add some debug code
> to verify.
> 

I see that additional requests are sent to the scsi device after the call
to blk_sync_queue(). The description of this function suggests that this
may happen. Overall it does not seem to be appropriate to call
del_timer_shutdown() from blk_sync_queue(). I'll change my test code
accordingly.

Guenter
diff mbox series

Patch

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index bbab424b0d55..397f263ab7da 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -1261,6 +1261,9 @@  static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
 
 		/* Don't do a long sleep inside a workqueue routine */
 		if (type == HUB_INIT2) {
+			/* Timers must be shutdown before they are re-initialized */
+			if (hub->init_work.work.func)
+				del_timer_shutdown(&hub->init_work.timer);
 			INIT_DELAYED_WORK(&hub->init_work, hub_init_func3);
 			queue_delayed_work(system_power_efficient_wq,
 					&hub->init_work,
diff --git a/drivers/usb/gadget/udc/m66592-udc.c b/drivers/usb/gadget/udc/m66592-udc.c
index 931e6362a13d..a6e2f8358adf 100644
--- a/drivers/usb/gadget/udc/m66592-udc.c
+++ b/drivers/usb/gadget/udc/m66592-udc.c
@@ -1519,7 +1519,7 @@  static int m66592_remove(struct platform_device *pdev)
 
 	usb_del_gadget_udc(&m66592->gadget);
 
-	del_timer_sync(&m66592->timer);
+	del_timer_shutdown(&m66592->timer);
 	iounmap(m66592->reg);
 	free_irq(platform_get_irq(pdev, 0), m66592);
 	m66592_free_request(&m66592->ep[0].ep, m66592->ep0_req);
diff --git a/drivers/usb/serial/garmin_gps.c b/drivers/usb/serial/garmin_gps.c
index f1a8d8343623..2a53f26468bd 100644
--- a/drivers/usb/serial/garmin_gps.c
+++ b/drivers/usb/serial/garmin_gps.c
@@ -1405,7 +1405,7 @@  static void garmin_port_remove(struct usb_serial_port *port)
 
 	usb_kill_anchored_urbs(&garmin_data_p->write_urbs);
 	usb_kill_urb(port->interrupt_in_urb);
-	del_timer_sync(&garmin_data_p->timer);
+	del_timer_shutdown(&garmin_data_p->timer);
 	kfree(garmin_data_p);
 }
 
diff --git a/drivers/usb/serial/mos7840.c b/drivers/usb/serial/mos7840.c
index 6b12bb4648b8..a90a706d27de 100644
--- a/drivers/usb/serial/mos7840.c
+++ b/drivers/usb/serial/mos7840.c
@@ -1726,7 +1726,7 @@  static void mos7840_port_remove(struct usb_serial_port *port)
 		mos7840_set_led_sync(port, MODEM_CONTROL_REGISTER, 0x0300);
 
 		del_timer_sync(&mos7840_port->led_timer1);
-		del_timer_sync(&mos7840_port->led_timer2);
+		del_timer_shutdown(&mos7840_port->led_timer2);
 
 		usb_kill_urb(mos7840_port->led_urb);
 		usb_free_urb(mos7840_port->led_urb);