diff mbox series

wifi: mwifiex: Replace RX workqueues with kthreads

Message ID 20230609103651.313194-1-treapking@chromium.org
State New
Headers show
Series wifi: mwifiex: Replace RX workqueues with kthreads | expand

Commit Message

Pin-yen Lin June 9, 2023, 10:35 a.m. UTC
This improves the RX throughput likely by better locality for the work
loads.

We tested this patch on Mediatek MT8173 Chromebooks, and it shows ~80%
Rx throughput improvement on high data rate test cases.

Signed-off-by: Pin-yen Lin <treapking@chromium.org>
---

 .../wireless/marvell/mwifiex/11n_rxreorder.c  |  2 +-
 .../net/wireless/marvell/mwifiex/cfg80211.c   |  2 +-
 drivers/net/wireless/marvell/mwifiex/main.c   | 29 ++++++++-----------
 drivers/net/wireless/marvell/mwifiex/main.h   |  5 ++--
 4 files changed, 17 insertions(+), 21 deletions(-)

Comments

Brian Norris June 12, 2023, 11:47 p.m. UTC | #1
Hi,

Thanks Pin-yen for most of the investigation here and for pushing the
patch. With some additional information though, I might suggest *not*
landing this patch at the moment. More details appended:

On Sat, Jun 10, 2023 at 01:41:51AM +0800, Pin-yen Lin wrote:
> I realized that I might have over-simplified the background and the
> impact of this patch...
> 
> The short answer to the question is that the throughput improved from
> 100 mbps to 180 mbps. The test was run on ChromeOS's v5.15 kernel
> fork. More detailed test setting is mentioned in [1].
> 
> However, the throughput of the same test case on our v4.19 kernel is
> 320 mbps. That is, we observed a 320 mbps --> 100 mbps regression when
> we tried to update the kernel version. This patch is more like a
> mitigation of the regression. It improves the throughput, even though
> it is still not as good as the older kernel.
> 
> That being said, this patch does improve the throughput, so we think
> this patch can be landed into the mainline kernel.
> 
> Best regards,
> Pin-yen
> 
> [1]: https://lore.kernel.org/all/ZFvpJb9Dh0FCkLQA@google.com/

I (we?) was optimistic this would be an improvement (or at least, no
worse) due to some of the reasoning at [1]. And, the work here is just a
single work item, queued repeatedly to the same unbound workqueue. So
conceptually, it shouldn't be much different than a kthread_worker,
except for scheduler details -- where again, we'd think this should be
an improvement, as the scheduler would now better track load for the
task (mwifiex RX) in question.

But additional testing on other mwifiex-based systems (RK3399 + PCIE
8997) showed the inverse: some throughput drops on similar benchmarks,
from 110 Mbps to 80 Mbps. (Frankly, both numbers are significantly below
where we might like...)

Considering we still don't have a full explanation for all the
performance differences we've been seeing (on either test platform), and
that at least one of our platforms showed a (smaller) regression, I
think we might want to do more research before committing to this.

Brian
Kalle Valo June 13, 2023, 5:12 a.m. UTC | #2
Brian Norris <briannorris@chromium.org> writes:

> Hi,
>
> Thanks Pin-yen for most of the investigation here and for pushing the
> patch. With some additional information though, I might suggest *not*
> landing this patch at the moment. More details appended:
>
> On Sat, Jun 10, 2023 at 01:41:51AM +0800, Pin-yen Lin wrote:
>> I realized that I might have over-simplified the background and the
>> impact of this patch...
>> 
>> The short answer to the question is that the throughput improved from
>> 100 mbps to 180 mbps. The test was run on ChromeOS's v5.15 kernel
>> fork. More detailed test setting is mentioned in [1].
>> 
>> However, the throughput of the same test case on our v4.19 kernel is
>> 320 mbps. That is, we observed a 320 mbps --> 100 mbps regression when
>> we tried to update the kernel version. This patch is more like a
>> mitigation of the regression. It improves the throughput, even though
>> it is still not as good as the older kernel.
>> 
>> That being said, this patch does improve the throughput, so we think
>> this patch can be landed into the mainline kernel.
>> 
>> Best regards,
>> Pin-yen
>> 
>> [1]: https://lore.kernel.org/all/ZFvpJb9Dh0FCkLQA@google.com/
>
> I (we?) was optimistic this would be an improvement (or at least, no
> worse) due to some of the reasoning at [1]. And, the work here is just a
> single work item, queued repeatedly to the same unbound workqueue. So
> conceptually, it shouldn't be much different than a kthread_worker,
> except for scheduler details -- where again, we'd think this should be
> an improvement, as the scheduler would now better track load for the
> task (mwifiex RX) in question.
>
> But additional testing on other mwifiex-based systems (RK3399 + PCIE
> 8997) showed the inverse: some throughput drops on similar benchmarks,
> from 110 Mbps to 80 Mbps. (Frankly, both numbers are significantly below
> where we might like...)
>
> Considering we still don't have a full explanation for all the
> performance differences we've been seeing (on either test platform), and
> that at least one of our platforms showed a (smaller) regression, I
> think we might want to do more research before committing to this.

Yeah, I agree and I'll drop this. This is a really weird problem, I hope
you can get to the bottom of it.
diff mbox series

Patch

diff --git a/drivers/net/wireless/marvell/mwifiex/11n_rxreorder.c b/drivers/net/wireless/marvell/mwifiex/11n_rxreorder.c
index 391793a16adc..431f6dc61f5b 100644
--- a/drivers/net/wireless/marvell/mwifiex/11n_rxreorder.c
+++ b/drivers/net/wireless/marvell/mwifiex/11n_rxreorder.c
@@ -198,7 +198,7 @@  mwifiex_del_rx_reorder_entry(struct mwifiex_private *priv,
 	priv->adapter->rx_locked = true;
 	if (priv->adapter->rx_processing) {
 		spin_unlock_bh(&priv->adapter->rx_proc_lock);
-		flush_workqueue(priv->adapter->rx_workqueue);
+		kthread_flush_worker(priv->adapter->rx_thread);
 	} else {
 		spin_unlock_bh(&priv->adapter->rx_proc_lock);
 	}
diff --git a/drivers/net/wireless/marvell/mwifiex/cfg80211.c b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
index bcd564dc3554..f985bff4e52c 100644
--- a/drivers/net/wireless/marvell/mwifiex/cfg80211.c
+++ b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
@@ -872,7 +872,7 @@  static int mwifiex_deinit_priv_params(struct mwifiex_private *priv)
 	adapter->rx_locked = true;
 	if (adapter->rx_processing) {
 		spin_unlock_bh(&adapter->rx_proc_lock);
-		flush_workqueue(adapter->rx_workqueue);
+		kthread_flush_worker(adapter->rx_thread);
 	} else {
 	spin_unlock_bh(&adapter->rx_proc_lock);
 	}
diff --git a/drivers/net/wireless/marvell/mwifiex/main.c b/drivers/net/wireless/marvell/mwifiex/main.c
index ea22a08e6c08..bab963f3db83 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.c
+++ b/drivers/net/wireless/marvell/mwifiex/main.c
@@ -168,7 +168,7 @@  static void mwifiex_queue_rx_work(struct mwifiex_adapter *adapter)
 		spin_unlock_bh(&adapter->rx_proc_lock);
 	} else {
 		spin_unlock_bh(&adapter->rx_proc_lock);
-		queue_work(adapter->rx_workqueue, &adapter->rx_work);
+		kthread_queue_work(adapter->rx_thread, &adapter->rx_work);
 	}
 }
 
@@ -526,9 +526,10 @@  static void mwifiex_terminate_workqueue(struct mwifiex_adapter *adapter)
 		adapter->workqueue = NULL;
 	}
 
-	if (adapter->rx_workqueue) {
-		destroy_workqueue(adapter->rx_workqueue);
-		adapter->rx_workqueue = NULL;
+	if (adapter->rx_thread) {
+		kthread_flush_worker(adapter->rx_thread);
+		kthread_destroy_worker(adapter->rx_thread);
+		adapter->rx_thread = NULL;
 	}
 }
 
@@ -1394,7 +1395,7 @@  int is_command_pending(struct mwifiex_adapter *adapter)
  *
  * It handles the RX operations.
  */
-static void mwifiex_rx_work_queue(struct work_struct *work)
+static void mwifiex_rx_work_queue(struct kthread_work *work)
 {
 	struct mwifiex_adapter *adapter =
 		container_of(work, struct mwifiex_adapter, rx_work);
@@ -1554,13 +1555,10 @@  mwifiex_reinit_sw(struct mwifiex_adapter *adapter)
 	INIT_WORK(&adapter->main_work, mwifiex_main_work_queue);
 
 	if (adapter->rx_work_enabled) {
-		adapter->rx_workqueue = alloc_workqueue("MWIFIEX_RX_WORK_QUEUE",
-							WQ_HIGHPRI |
-							WQ_MEM_RECLAIM |
-							WQ_UNBOUND, 1);
-		if (!adapter->rx_workqueue)
+		adapter->rx_thread = kthread_create_worker(0, "MWIFIEX_RX");
+		if (IS_ERR(adapter->rx_thread))
 			goto err_kmalloc;
-		INIT_WORK(&adapter->rx_work, mwifiex_rx_work_queue);
+		kthread_init_work(&adapter->rx_work, mwifiex_rx_work_queue);
 	}
 
 	/* Register the device. Fill up the private data structure with
@@ -1709,14 +1707,11 @@  mwifiex_add_card(void *card, struct completion *fw_done,
 	INIT_WORK(&adapter->main_work, mwifiex_main_work_queue);
 
 	if (adapter->rx_work_enabled) {
-		adapter->rx_workqueue = alloc_workqueue("MWIFIEX_RX_WORK_QUEUE",
-							WQ_HIGHPRI |
-							WQ_MEM_RECLAIM |
-							WQ_UNBOUND, 1);
-		if (!adapter->rx_workqueue)
+		adapter->rx_thread = kthread_create_worker(0, "MWIFIEX_RX");
+		if (!adapter->rx_thread)
 			goto err_kmalloc;
 
-		INIT_WORK(&adapter->rx_work, mwifiex_rx_work_queue);
+		kthread_init_work(&adapter->rx_work, mwifiex_rx_work_queue);
 	}
 
 	/* Register the device. Fill up the private data structure with relevant
diff --git a/drivers/net/wireless/marvell/mwifiex/main.h b/drivers/net/wireless/marvell/mwifiex/main.h
index b95886e1413e..3255f9a5c2d4 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.h
+++ b/drivers/net/wireless/marvell/mwifiex/main.h
@@ -32,6 +32,7 @@ 
 #include <linux/gfp.h>
 #include <linux/interrupt.h>
 #include <linux/io.h>
+#include <linux/kthread.h>
 #include <linux/of_gpio.h>
 #include <linux/of_platform.h>
 #include <linux/platform_device.h>
@@ -886,8 +887,8 @@  struct mwifiex_adapter {
 	atomic_t tx_hw_pending;
 	struct workqueue_struct *workqueue;
 	struct work_struct main_work;
-	struct workqueue_struct *rx_workqueue;
-	struct work_struct rx_work;
+	struct kthread_worker *rx_thread;
+	struct kthread_work rx_work;
 	struct workqueue_struct *dfs_workqueue;
 	struct work_struct dfs_work;
 	bool rx_work_enabled;