crypto: qat - change SLAs cleanup flow at shutdown

Message ID 20240209124237.44530-1-damian.muszynski@intel.com
State Accepted
Commit c2304e1a0b8051a60d4eb9c99a1c509d90380ae5
Series crypto: qat - change SLAs cleanup flow at shutdown

Commit Message

Damian Muszynski Feb. 9, 2024, 12:42 p.m. UTC
The implementation of the Rate Limiting (RL) feature includes the cleanup
of all SLAs during device shutdown. For each SLA, the firmware is notified
of the removal through an admin message, the data structures that take
into account the budgets are updated and the memory is freed.
However, this explicit cleanup is not necessary as (1) the device is
reset, and the firmware state is lost and (2) all RL data structures
are freed anyway.

In addition, if the device is unresponsive, for example after a PCI
AER error is detected, the admin interface might not be available.
This might slow down the shutdown sequence and cause a timeout in
the recovery flows which in turn makes the driver believe that the
device is not recoverable.

Fix by replacing the explicit SLAs removal with just a free of the
SLA data structures.

Fixes: d9fb8408376e ("crypto: qat - add rate limiting feature to qat_4xxx")
Cc: <stable@vger.kernel.org>
Signed-off-by: Damian Muszynski <damian.muszynski@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
---
 drivers/crypto/intel/qat/qat_common/adf_rl.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)


base-commit: 84c2f23ad68a847e36dc9cae44f21c5e321a321c
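
The difference between the two teardown flows described in the commit message can be illustrated with a small userspace sketch. It is not the driver code: `struct sla_table`, `SLA_SLOTS`, `sla_add()`, `remove_all_sla_notify()` and `free_all_sla_local()` are hypothetical names, the firmware admin message is modelled as a plain counter, and the locking the driver performs around the table is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SLA_SLOTS 8 /* hypothetical size; stands in for the driver's table size */

struct sla_entry {
	int id;
};

struct sla_table {
	struct sla_entry *slots[SLA_SLOTS];
	int fw_messages; /* counts simulated admin messages sent to firmware */
};

static void sla_table_init(struct sla_table *t)
{
	assert(t != NULL);
	for (int i = 0; i < SLA_SLOTS; i++)
		t->slots[i] = NULL;
	t->fw_messages = 0;
}

/* Allocate an entry in the given slot; returns 0 on success, -1 on error. */
static int sla_add(struct sla_table *t, int slot)
{
	if (slot < 0 || slot >= SLA_SLOTS || t->slots[slot])
		return -1;

	t->slots[slot] = malloc(sizeof(*t->slots[slot]));
	if (!t->slots[slot])
		return -1;

	t->slots[slot]->id = slot;
	return 0;
}

/* Old-style flow: one (simulated) firmware message per entry, then free. */
static int remove_all_sla_notify(struct sla_table *t)
{
	int removed = 0;

	for (int i = 0; i < SLA_SLOTS; i++) {
		if (!t->slots[i])
			continue;

		t->fw_messages++; /* would block if the device were unresponsive */
		free(t->slots[i]);
		t->slots[i] = NULL;
		removed++;
	}
	return removed;
}

/* New-style flow: release the memory only, never touch the firmware. */
static int free_all_sla_local(struct sla_table *t)
{
	int freed = 0;

	for (int i = 0; i < SLA_SLOTS; i++) {
		if (!t->slots[i])
			continue;

		free(t->slots[i]);
		t->slots[i] = NULL;
		freed++;
	}
	return freed;
}
```

In this model, `free_all_sla_local()` leaves `fw_messages` at zero, so its cost does not depend on whether the device can still answer admin messages. That is the property the shutdown path relies on.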

Comments

Herbert Xu Feb. 17, 2024, 1:14 a.m. UTC | #1
On Fri, Feb 09, 2024 at 01:42:07PM +0100, Damian Muszynski wrote:
> The implementation of the Rate Limiting (RL) feature includes the cleanup
> of all SLAs during device shutdown. For each SLA, the firmware is notified
> of the removal through an admin message, the data structures that take
> into account the budgets are updated and the memory is freed.
> However, this explicit cleanup is not necessary as (1) the device is
> reset, and the firmware state is lost and (2) all RL data structures
> are freed anyway.
> 
> In addition, if the device is unresponsive, for example after a PCI
> AER error is detected, the admin interface might not be available.
> This might slow down the shutdown sequence and cause a timeout in
> the recovery flows which in turn makes the driver believe that the
> device is not recoverable.
> 
> Fix by replacing the explicit SLAs removal with just a free of the
> SLA data structures.
> 
> Fixes: d9fb8408376e ("crypto: qat - add rate limiting feature to qat_4xxx")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Damian Muszynski <damian.muszynski@intel.com>
> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
> ---
>  drivers/crypto/intel/qat/qat_common/adf_rl.c | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)

Patch applied.  Thanks.

Patch

diff --git a/drivers/crypto/intel/qat/qat_common/adf_rl.c b/drivers/crypto/intel/qat/qat_common/adf_rl.c
index de1b214dba1f..d4f2db3c53d8 100644
--- a/drivers/crypto/intel/qat/qat_common/adf_rl.c
+++ b/drivers/crypto/intel/qat/qat_common/adf_rl.c
@@ -788,6 +788,24 @@  static void clear_sla(struct adf_rl *rl_data, struct rl_sla *sla)
 	sla_type_arr[node_id] = NULL;
 }
 
+static void free_all_sla(struct adf_accel_dev *accel_dev)
+{
+	struct adf_rl *rl_data = accel_dev->rate_limiting;
+	int sla_id;
+
+	mutex_lock(&rl_data->rl_lock);
+
+	for (sla_id = 0; sla_id < RL_NODES_CNT_MAX; sla_id++) {
+		if (!rl_data->sla[sla_id])
+			continue;
+
+		kfree(rl_data->sla[sla_id]);
+		rl_data->sla[sla_id] = NULL;
+	}
+
+	mutex_unlock(&rl_data->rl_lock);
+}
+
 /**
  * add_update_sla() - handles the creation and the update of an SLA
  * @accel_dev: pointer to acceleration device structure
@@ -1155,7 +1173,7 @@  void adf_rl_stop(struct adf_accel_dev *accel_dev)
 		return;
 
 	adf_sysfs_rl_rm(accel_dev);
-	adf_rl_remove_sla_all(accel_dev, true);
+	free_all_sla(accel_dev);
 }
 
 void adf_rl_exit(struct adf_accel_dev *accel_dev)