Message ID | 20230118171411.8088-1-giovanni.cabiddu@intel.com |
---|---|
State | New |
Headers | show |
Series | [RFC] crypto: qat - enable polling for compression | expand |
Giovanni Cabiddu <giovanni.cabiddu@intel.com> wrote: > When a request is synchronous, it is more efficient to submit it and > poll for a response without going through the interrupt path. Can you quantify how polling is more efficient? Does it actually make an observable difference? > This patch adds logic in the transport layer to poll the response ring > and enables polling for compression in the QAT driver. > > This is an initial and not complete implementation. The reason why it > has been sent as RFC is to discuss about ways to mark a request as > synchronous from the acomp APIs. What existing (or future) users would benefit from this? You could poll based on a request flag, e.g., the existing MAY_SLEEP flag. Thanks,
Thanks for the feedback Herbert! On Fri, Jan 20, 2023 at 04:29:19PM +0800, Herbert Xu wrote: > Giovanni Cabiddu <giovanni.cabiddu@intel.com> wrote: > > When a request is synchronous, it is more efficient to submit it and > > poll for a response without going through the interrupt path. > > Can you quantify how polling is more efficient? Does it actually make > an observable difference? Yes it does. In average 2x faster for c62x and 3x faster for 4xxx. > > > This patch adds logic in the transport layer to poll the response ring > > and enables polling for compression in the QAT driver. > > > > This is an initial and not complete implementation. The reason why it > > has been sent as RFC is to discuss about ways to mark a request as > > synchronous from the acomp APIs. > > What existing (or future) users would benefit from this? At the moment only zswap as it uses acomp synchronously. > > You could poll based on a request flag, e.g., the existing MAY_SLEEP > flag. Thanks. I might need to do it per tfm as it requires re-configuring the ring pair and has an impact on all requests sharing the same. BTW, I found an issue in the previous version of the patch. Below a new version in case someone wants to try it. Regards, ---8<--- When a request is synchronous, it is more efficient to submit it and poll for a response without going through the interrupt path. This patch adds logic in the transport layer to poll the response ring and enables polling for compression in the QAT driver. This is an initial and not complete implementation. The reason why it has been sent as RFC is to discuss about ways to mark a request as synchronous from the acomp APIs. Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> --- drivers/crypto/qat/qat_common/adf_transport.c | 28 +++++++++++++++++++ drivers/crypto/qat/qat_common/adf_transport.h | 1 + drivers/crypto/qat/qat_common/qat_comp_algs.c | 15 ++++++++-- .../crypto/qat/qat_common/qat_compression.c | 2 +- 4 files changed, 43 insertions(+), 3 deletions(-) diff --git a/drivers/crypto/qat/qat_common/adf_transport.c b/drivers/crypto/qat/qat_common/adf_transport.c index 630d0483c4e0..af613ee84cdc 100644 --- a/drivers/crypto/qat/qat_common/adf_transport.c +++ b/drivers/crypto/qat/qat_common/adf_transport.c @@ -133,6 +133,34 @@ static int adf_handle_response(struct adf_etr_ring_data *ring) return 0; } +int adf_poll_message(struct adf_etr_ring_data *ring) +{ + struct adf_hw_csr_ops *csr_ops = GET_CSR_OPS(ring->bank->accel_dev); + u32 msg_counter = 0; + u32 *msg = (u32 *)((uintptr_t)ring->base_addr + ring->head); + + if (atomic_read(ring->inflights) == 0) + return 0; + + while (*msg != ADF_RING_EMPTY_SIG) { + ring->callback((u32 *)msg); + atomic_dec(ring->inflights); + *msg = ADF_RING_EMPTY_SIG; + ring->head = adf_modulo(ring->head + + ADF_MSG_SIZE_TO_BYTES(ring->msg_size), + ADF_RING_SIZE_MODULO(ring->ring_size)); + msg_counter++; + msg = (u32 *)((uintptr_t)ring->base_addr + ring->head); + } + if (msg_counter > 0) { + csr_ops->write_csr_ring_head(ring->bank->csr_addr, + ring->bank->bank_number, + ring->ring_number, ring->head); + return 0; + } + return -EAGAIN; +} + static void adf_configure_tx_ring(struct adf_etr_ring_data *ring) { struct adf_hw_csr_ops *csr_ops = GET_CSR_OPS(ring->bank->accel_dev); diff --git a/drivers/crypto/qat/qat_common/adf_transport.h b/drivers/crypto/qat/qat_common/adf_transport.h index e6ef6f9b7691..d549081172f8 100644 --- a/drivers/crypto/qat/qat_common/adf_transport.h +++ b/drivers/crypto/qat/qat_common/adf_transport.h @@ -16,5 +16,6 @@ int adf_create_ring(struct adf_accel_dev *accel_dev, const char *section, bool adf_ring_nearly_full(struct adf_etr_ring_data *ring); int adf_send_message(struct adf_etr_ring_data *ring, u32 *msg); +int adf_poll_message(struct adf_etr_ring_data *ring); void adf_remove_ring(struct adf_etr_ring_data *ring); #endif diff --git a/drivers/crypto/qat/qat_common/qat_comp_algs.c b/drivers/crypto/qat/qat_common/qat_comp_algs.c index b533984906ec..a55a1a66080a 100644 --- a/drivers/crypto/qat/qat_common/qat_comp_algs.c +++ b/drivers/crypto/qat/qat_common/qat_comp_algs.c @@ -53,6 +53,7 @@ struct qat_compression_req { struct qat_alg_req alg_req; struct work_struct resubmit; struct qat_dst dst; + bool in_progress; }; static int qat_alg_send_dc_message(struct qat_compression_req *qat_req, @@ -246,6 +247,7 @@ static void qat_comp_generic_callback(struct qat_compression_req *qat_req, res = ctx->qat_comp_callback(qat_req, resp); end: + qat_req->in_progress = false; qat_bl_free_bufl(accel_dev, &qat_req->buf); acomp_request_complete(areq, res); } @@ -333,6 +335,7 @@ static int qat_comp_alg_compress_decompress(struct acomp_req *areq, enum directi return -EINVAL; qat_req->dst.is_null = false; + qat_req->in_progress = true; /* Handle acomp requests that require the allocation of a destination * buffer. The size of the destination buffer is double the source @@ -384,10 +387,18 @@ static int qat_comp_alg_compress_decompress(struct acomp_req *areq, enum directi } ret = qat_alg_send_dc_message(qat_req, inst, &areq->base); - if (ret == -ENOSPC) + if (ret == -ENOSPC) { qat_bl_free_bufl(inst->accel_dev, &qat_req->buf); + return ret; + } - return ret; + do { + schedule(); + + adf_poll_message(inst->dc_rx); + } while (qat_req->in_progress); + + return 0; } static int qat_comp_alg_compress(struct acomp_req *req) diff --git a/drivers/crypto/qat/qat_common/qat_compression.c b/drivers/crypto/qat/qat_common/qat_compression.c index 3f1f35283266..d7420de65dc7 100644 --- a/drivers/crypto/qat/qat_common/qat_compression.c +++ b/drivers/crypto/qat/qat_common/qat_compression.c @@ -174,7 +174,7 @@ static int qat_compression_create_instances(struct adf_accel_dev *accel_dev) msg_size = ICP_QAT_FW_RESP_DEFAULT_SZ; snprintf(key, sizeof(key), ADF_DC "%d" ADF_RING_DC_RX, i); ret = adf_create_ring(accel_dev, SEC, bank, num_msg_dc, - msg_size, key, qat_comp_alg_callback, 0, + msg_size, key, qat_comp_alg_callback, 1, &inst->dc_rx); if (ret) return ret; -- 2.39.1
diff --git a/drivers/crypto/qat/qat_common/adf_transport.c b/drivers/crypto/qat/qat_common/adf_transport.c index 630d0483c4e0..af613ee84cdc 100644 --- a/drivers/crypto/qat/qat_common/adf_transport.c +++ b/drivers/crypto/qat/qat_common/adf_transport.c @@ -133,6 +133,34 @@ static int adf_handle_response(struct adf_etr_ring_data *ring) return 0; } +int adf_poll_message(struct adf_etr_ring_data *ring) +{ + struct adf_hw_csr_ops *csr_ops = GET_CSR_OPS(ring->bank->accel_dev); + u32 msg_counter = 0; + u32 *msg = (u32 *)((uintptr_t)ring->base_addr + ring->head); + + if (atomic_read(ring->inflights) == 0) + return 0; + + while (*msg != ADF_RING_EMPTY_SIG) { + ring->callback((u32 *)msg); + atomic_dec(ring->inflights); + *msg = ADF_RING_EMPTY_SIG; + ring->head = adf_modulo(ring->head + + ADF_MSG_SIZE_TO_BYTES(ring->msg_size), + ADF_RING_SIZE_MODULO(ring->ring_size)); + msg_counter++; + msg = (u32 *)((uintptr_t)ring->base_addr + ring->head); + } + if (msg_counter > 0) { + csr_ops->write_csr_ring_head(ring->bank->csr_addr, + ring->bank->bank_number, + ring->ring_number, ring->head); + return 0; + } + return -EAGAIN; +} + static void adf_configure_tx_ring(struct adf_etr_ring_data *ring) { struct adf_hw_csr_ops *csr_ops = GET_CSR_OPS(ring->bank->accel_dev); diff --git a/drivers/crypto/qat/qat_common/adf_transport.h b/drivers/crypto/qat/qat_common/adf_transport.h index e6ef6f9b7691..d549081172f8 100644 --- a/drivers/crypto/qat/qat_common/adf_transport.h +++ b/drivers/crypto/qat/qat_common/adf_transport.h @@ -16,5 +16,6 @@ int adf_create_ring(struct adf_accel_dev *accel_dev, const char *section, bool adf_ring_nearly_full(struct adf_etr_ring_data *ring); int adf_send_message(struct adf_etr_ring_data *ring, u32 *msg); +int adf_poll_message(struct adf_etr_ring_data *ring); void adf_remove_ring(struct adf_etr_ring_data *ring); #endif diff --git a/drivers/crypto/qat/qat_common/qat_comp_algs.c b/drivers/crypto/qat/qat_common/qat_comp_algs.c index 1480d36a8d2b..07378e9fe8fa 100644 --- a/drivers/crypto/qat/qat_common/qat_comp_algs.c +++ b/drivers/crypto/qat/qat_common/qat_comp_algs.c @@ -291,8 +291,15 @@ static int qat_comp_alg_compress_decompress(struct acomp_req *areq, } ret = qat_alg_send_dc_message(qat_req, inst, &areq->base); - if (ret == -ENOSPC) + if (ret == -ENOSPC) { qat_bl_free_bufl(inst->accel_dev, &qat_req->buf); + return ret; + } + + do { + ret = adf_poll_message(inst->dc_rx); + schedule(); + } while (ret); return ret; } diff --git a/drivers/crypto/qat/qat_common/qat_compression.c b/drivers/crypto/qat/qat_common/qat_compression.c index 9fd10f4242f8..49b34c5ec595 100644 --- a/drivers/crypto/qat/qat_common/qat_compression.c +++ b/drivers/crypto/qat/qat_common/qat_compression.c @@ -174,7 +174,7 @@ static int qat_compression_create_instances(struct adf_accel_dev *accel_dev) msg_size = ICP_QAT_FW_RESP_DEFAULT_SZ; snprintf(key, sizeof(key), ADF_DC "%d" ADF_RING_DC_RX, i); ret = adf_create_ring(accel_dev, SEC, bank, num_msg_dc, - msg_size, key, qat_comp_alg_callback, 0, + msg_size, key, qat_comp_alg_callback, 1, &inst->dc_rx); if (ret) return ret;
When a request is synchronous, it is more efficient to submit it and poll for a response without going through the interrupt path. This patch adds logic in the transport layer to poll the response ring and enables polling for compression in the QAT driver. This is an initial and not complete implementation. The reason why it has been sent as RFC is to discuss about ways to mark a request as synchronous from the acomp APIs. Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> --- drivers/crypto/qat/qat_common/adf_transport.c | 28 +++++++++++++++++++ drivers/crypto/qat/qat_common/adf_transport.h | 1 + drivers/crypto/qat/qat_common/qat_comp_algs.c | 9 +++++- .../crypto/qat/qat_common/qat_compression.c | 2 +- 4 files changed, 38 insertions(+), 2 deletions(-)