diff mbox series

[09/10] scsi/ibmvscsi: Replace srp tasklet with work

Message ID 20220530231512.9729-10-dave@stgolabs.net
State New
Series scsi: Replace tasklets as BH | expand

Commit Message

Davidlohr Bueso May 30, 2022, 11:15 p.m. UTC
Tasklets have long been deprecated as being too heavy on the system
by running in irq context - and this is not a performance critical
path. If a higher priority process wants to run, it must wait for
the tasklet to finish before doing so.

Process srps asynchronously in process context in a dedicated
single threaded workqueue.

Cc: Tyrel Datwyler <tyreld@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 38 ++++++++++++++++++++++----------
 drivers/scsi/ibmvscsi/ibmvscsi.h |  3 ++-
 2 files changed, 28 insertions(+), 13 deletions(-)

Comments

Sebastian Andrzej Siewior June 9, 2022, 3:02 p.m. UTC | #1
On 2022-05-30 16:15:11 [-0700], Davidlohr Bueso wrote:
> Tasklets have long been deprecated as being too heavy on the system
> by running in irq context - and this is not a performance critical
> path. If a higher priority process wants to run, it must wait for
> the tasklet to finish before doing so.
> 
> Process srps asynchronously in process context in a dedicated
> single threaded workqueue.

I would suggest threaded interrupts instead. The pattern here is the
same as in the previous driver, except there is less locking here.

Sebastian
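To make the suggestion concrete: a hedged sketch of what a threaded-interrupt
conversion of this driver might look like. request_threaded_irq() is the real
genirq API; the handler names and body here are illustrative only and not part
of the posted patch.

```c
/* Illustrative sketch only; names are hypothetical, not from the patch. */
static irqreturn_t ibmvscsi_hard_irq(int irq, void *dev_instance)
{
	struct ibmvscsi_host_data *hostdata = dev_instance;

	/* Quiesce the interrupt source, then defer to the irq thread. */
	vio_disable_interrupts(to_vio_dev(hostdata->dev));
	return IRQ_WAKE_THREAD;
}

static irqreturn_t ibmvscsi_irq_thread(int irq, void *dev_instance)
{
	struct ibmvscsi_host_data *hostdata = dev_instance;

	/* Drain the CRQ in sleepable process context, running at the
	 * irq thread's default priority (SCHED_FIFO 50), doing the same
	 * work the patch below moves into ibmvscsi_workfn(). */
	return IRQ_HANDLED;
}

/* In ibmvscsi_init_crq_queue(), instead of request_irq() + INIT_WORK(): */
rc = request_threaded_irq(vdev->irq, ibmvscsi_hard_irq,
			  ibmvscsi_irq_thread, 0,
			  "ibmvscsi", (void *)hostdata);
```

This would also remove the need for the module-level workqueue and the
cancel_work_sync() calls, since free_irq() already waits for the irq thread.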
David Laight June 9, 2022, 3:46 p.m. UTC | #2
From: Sebastian Andrzej Siewior
> Sent: 09 June 2022 16:03
> 
> On 2022-05-30 16:15:11 [-0700], Davidlohr Bueso wrote:
> > Tasklets have long been deprecated as being too heavy on the system
> > by running in irq context - and this is not a performance critical
> > path. If a higher priority process wants to run, it must wait for
> > the tasklet to finish before doing so.
> >
> > Process srps asynchronously in process context in a dedicated
> > single threaded workqueue.
> 
> I would suggest threaded interrupts instead. The pattern here is the
> same as in the previous driver, except there is less locking here.

How long do these actions run for, and what is waiting for
them to finish?

These changes seem to drop the priority from above that of the
highest priority RT process down to that of a default priority
user process.
There is no real guarantee that the latter will run 'any time soon'.

Consider some workloads I'm setting up where most of the cpu are
likely to spend 90%+ of the time running processes under the RT
scheduler that are processing audio.

It is quite likely that a non-RT thread (especially one bound
to a specific cpu) won't run for several milliseconds.
(We have to go through 'hoops' to avoid dropping ethernet frames.)

I'd have thought that some of these kernel threads really
need to run at a 'middling' RT priority.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Sebastian Andrzej Siewior June 14, 2022, 1:25 p.m. UTC | #3
On 2022-06-09 15:46:04 [+0000], David Laight wrote:
> From: Sebastian Andrzej Siewior
> > Sent: 09 June 2022 16:03
> > 
> > On 2022-05-30 16:15:11 [-0700], Davidlohr Bueso wrote:
> > > Tasklets have long been deprecated as being too heavy on the system
> > > by running in irq context - and this is not a performance critical
> > > path. If a higher priority process wants to run, it must wait for
> > > the tasklet to finish before doing so.
> > >
> > > Process srps asynchronously in process context in a dedicated
> > > single threaded workqueue.
> > 
> > I would suggest threaded interrupts instead. The pattern here is the
> > same as in the previous driver, except there is less locking here.
> 
> How long do these actions run for, and what is waiting for
> them to finish?

That is something that one with hardware and workload can answer.

> These changes seem to drop the priority from above that of the
> highest priority RT process down to that of a default priority
> user process.
> There is no real guarantee that the latter will run 'any time soon'.

Not sure I can follow. Using threaded interrupts will run at FIFO-50 by
default. Workqueue however is SCHED_OTHER. But then it is not bound to
any CPU so it will run on an available CPU.

> Consider some workloads I'm setting up where most of the cpu are
> likely to spend 90%+ of the time running processes under the RT
> scheduler that are processing audio.
> 
> It is quite likely that a non-RT thread (especially one bound
> to a specific cpu) won't run for several milliseconds.
> (We have to go through 'hoops' to avoid dropping ethernet frames.)
> 
> I'd have thought that some of these kernel threads really
> need to run at a 'middling' RT priority.

The threaded interrupts do this by default. If you run your own RT
threads you need to decide if they are more or less important than the
interrupts.

> 	David

Sebastian
David Laight June 14, 2022, 1:34 p.m. UTC | #4
From: Sebastian Andrzej Siewior
> Sent: 14 June 2022 14:26
...
> > These changes seem to drop the priority from above that of the
> > highest priority RT process down to that of a default priority
> > user process.
> > There is no real guarantee that the latter will run 'any time soon'.
> 
> Not sure I can follow. Using threaded interrupts will run at FIFO-50 by
> default. Workqueue however is SCHED_OTHER. But then it is not bound to
> any CPU so it will run on an available CPU.

Ok, I'd only looked at normal workqueues, softints and napi.
They are all SCHED_OTHER.

Unbound FIFO is moderately ok - they are sticky but can move.
The only problem is that they won't move if a process is
spinning in the kernel on the cpu they last ran on.

	David


Patch

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 63f32f843e75..37cbea8bb0af 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -86,6 +86,8 @@  static DEFINE_SPINLOCK(ibmvscsi_driver_lock);
 
 static struct scsi_transport_template *ibmvscsi_transport_template;
 
+static struct workqueue_struct *ibmvscsi_wq;
+
 #define IBMVSCSI_VERSION "1.5.9"
 
 MODULE_DESCRIPTION("IBM Virtual SCSI");
@@ -117,7 +119,7 @@  static void ibmvscsi_handle_crq(struct viosrp_crq *crq,
  * @irq:	number of irq to handle, not used
  * @dev_instance: ibmvscsi_host_data of host that received interrupt
  *
- * Disables interrupts and schedules srp_task
+ * Disables interrupts and schedules srp_work
  * Always returns IRQ_HANDLED
  */
 static irqreturn_t ibmvscsi_handle_event(int irq, void *dev_instance)
@@ -125,7 +127,7 @@  static irqreturn_t ibmvscsi_handle_event(int irq, void *dev_instance)
 	struct ibmvscsi_host_data *hostdata =
 	    (struct ibmvscsi_host_data *)dev_instance;
 	vio_disable_interrupts(to_vio_dev(hostdata->dev));
-	tasklet_schedule(&hostdata->srp_task);
+	queue_work(ibmvscsi_wq, &hostdata->srp_work);
 	return IRQ_HANDLED;
 }
 
@@ -145,7 +147,7 @@  static void ibmvscsi_release_crq_queue(struct crq_queue *queue,
 	long rc = 0;
 	struct vio_dev *vdev = to_vio_dev(hostdata->dev);
 	free_irq(vdev->irq, (void *)hostdata);
-	tasklet_kill(&hostdata->srp_task);
+	cancel_work_sync(&hostdata->srp_work);
 	do {
 		if (rc)
 			msleep(100);
@@ -206,16 +208,19 @@  static int ibmvscsi_send_crq(struct ibmvscsi_host_data *hostdata,
 }
 
 /**
- * ibmvscsi_task: - Process srps asynchronously
+ * ibmvscsi_workfn: - Process srps asynchronously
  * @data:	ibmvscsi_host_data of host
  */
-static void ibmvscsi_task(void *data)
+static void ibmvscsi_workfn(struct work_struct *work)
 {
-	struct ibmvscsi_host_data *hostdata = (struct ibmvscsi_host_data *)data;
-	struct vio_dev *vdev = to_vio_dev(hostdata->dev);
+	struct ibmvscsi_host_data *hostdata;
+	struct vio_dev *vdev;
 	struct viosrp_crq *crq;
 	int done = 0;
 
+	hostdata = container_of(work, struct ibmvscsi_host_data, srp_work);
+	vdev = to_vio_dev(hostdata->dev);
+
 	while (!done) {
 		/* Pull all the valid messages off the CRQ */
 		while ((crq = crq_queue_next_crq(&hostdata->queue)) != NULL) {
@@ -367,8 +372,7 @@  static int ibmvscsi_init_crq_queue(struct crq_queue *queue,
 	queue->cur = 0;
 	spin_lock_init(&queue->lock);
 
-	tasklet_init(&hostdata->srp_task, (void *)ibmvscsi_task,
-		     (unsigned long)hostdata);
+	INIT_WORK(&hostdata->srp_work, ibmvscsi_workfn);
 
 	if (request_irq(vdev->irq,
 			ibmvscsi_handle_event,
@@ -387,7 +391,7 @@  static int ibmvscsi_init_crq_queue(struct crq_queue *queue,
 	return retrc;
 
       req_irq_failed:
-	tasklet_kill(&hostdata->srp_task);
+	cancel_work_sync(&hostdata->srp_work);
 	rc = 0;
 	do {
 		if (rc)
@@ -2371,7 +2375,7 @@  static int ibmvscsi_resume(struct device *dev)
 {
 	struct ibmvscsi_host_data *hostdata = dev_get_drvdata(dev);
 	vio_disable_interrupts(to_vio_dev(hostdata->dev));
-	tasklet_schedule(&hostdata->srp_task);
+	queue_work(ibmvscsi_wq, &hostdata->srp_work);
 
 	return 0;
 }
@@ -2418,15 +2422,25 @@  static int __init ibmvscsi_module_init(void)
 	if (!ibmvscsi_transport_template)
 		return -ENOMEM;
 
+	ibmvscsi_wq = alloc_ordered_workqueue("ibmvscsi_wq", 0);
+	if (!ibmvscsi_wq) {
+		srp_release_transport(ibmvscsi_transport_template);
+		return -ENOMEM;
+	}
+
 	ret = vio_register_driver(&ibmvscsi_driver);
-	if (ret)
+	if (ret) {
+		destroy_workqueue(ibmvscsi_wq);
 		srp_release_transport(ibmvscsi_transport_template);
+	}
+
 	return ret;
 }
 
 static void __exit ibmvscsi_module_exit(void)
 {
 	vio_unregister_driver(&ibmvscsi_driver);
+	destroy_workqueue(ibmvscsi_wq);
 	srp_release_transport(ibmvscsi_transport_template);
 }
 
diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.h b/drivers/scsi/ibmvscsi/ibmvscsi.h
index e60916ef7a49..f7c52744a206 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.h
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.h
@@ -18,6 +18,7 @@ 
 #include <linux/types.h>
 #include <linux/list.h>
 #include <linux/completion.h>
+#include <linux/workqueue.h>
 #include <linux/interrupt.h>
 #include <scsi/viosrp.h>
 
@@ -90,7 +91,7 @@  struct ibmvscsi_host_data {
 	struct device *dev;
 	struct event_pool pool;
 	struct crq_queue queue;
-	struct tasklet_struct srp_task;
+	struct work_struct srp_work;
 	struct list_head sent;
 	struct Scsi_Host *host;
 	struct task_struct *work_thread;