Message ID | af9581e2-58f9-cc19-428f-6f18f1f83d54@redhat.com |
---|---|
State | New |
Headers | show |
Series | qat: fix deadlock in backlog processing | expand |
Hi Mikulas, many thanks for reporting this issue and finding a solution. On Thu, Sep 21, 2023 at 10:53:55PM +0200, Mikulas Patocka wrote: > I was evaluating whether it is feasible to use QAT with dm-crypt (the > answer is that it is not - QAT is slower than AES-NI for this type of > workload; QAT starts to be benefical for encryption requests longer than > 64k). Correct. Is there anything that we can do to batch requests in a single call? Sometime ago there was some work done to build a geniv template cipher and optimize dm-crypt to encrypt larger block sizes in a single call, see [1][2]. Don't know if that work was completed. >And I got some deadlocks. Ouch! > The reason for the deadlocks is this: suppose that one of the "if" > conditions in "qat_alg_send_message_maybacklog" is true and we jump to the > "enqueue" label. At this point, an interrupt comes in and clears all > pending messages. Now, the interrupt returns, we grab backlog->lock, add > the message to the backlog, drop backlog->lock - and there is no one to > remove the backlogged message out of the list and submit it. Makes sense. In my testing I wasn't able to reproduce this condition. > I fixed it with this patch - with this patch, the test passes and there > are no longer any deadlocks. I didn't want to add a spinlock to the hot > path, so I take it only if some of the condition suggests that queuing may > be required. > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > Cc: stable@vger.kernel.org The commit message requires a bit of rework to describe the change. Also, deserves a fixes tag. > > --- > drivers/crypto/intel/qat/qat_common/qat_algs_send.c | 31 ++++++++++++-------- > 1 file changed, 20 insertions(+), 11 deletions(-) > > Index: linux-2.6/drivers/crypto/intel/qat/qat_common/qat_algs_send.c > =================================================================== > --- linux-2.6.orig/drivers/crypto/intel/qat/qat_common/qat_algs_send.c > +++ linux-2.6/drivers/crypto/intel/qat/qat_common/qat_algs_send.c > @@ -40,16 +40,6 @@ void qat_alg_send_backlog(struct qat_ins > spin_unlock_bh(&backlog->lock); > } > > -static void qat_alg_backlog_req(struct qat_alg_req *req, > - struct qat_instance_backlog *backlog) > -{ > - INIT_LIST_HEAD(&req->list); Is the initialization of an element no longer needed? > - > - spin_lock_bh(&backlog->lock); > - list_add_tail(&req->list, &backlog->list); > - spin_unlock_bh(&backlog->lock); > -} > - > static int qat_alg_send_message_maybacklog(struct qat_alg_req *req) > { > struct qat_instance_backlog *backlog = req->backlog; > @@ -71,8 +61,27 @@ static int qat_alg_send_message_maybackl > return -EINPROGRESS; > > enqueue: > - qat_alg_backlog_req(req, backlog); > + spin_lock_bh(&backlog->lock); > + > + /* If any request is already backlogged, then add to backlog list */ > + if (!list_empty(&backlog->list)) > + goto enqueue2; > > + /* If ring is nearly full, then add to backlog list */ > + if (adf_ring_nearly_full(tx_ring)) > + goto enqueue2; > + > + /* If adding request to HW ring fails, then add to backlog list */ > + if (adf_send_message(tx_ring, fw_req)) > + goto enqueue2; In a nutshell, you are re-doing the same steps taking the backlog lock. It should be possible to re-write it so that there is a function that attempts enqueuing and if it fails, then the same is called again taking the lock. If you want I can rework it and resubmit. > + > + spin_unlock_bh(&backlog->lock); > + return -EINPROGRESS; > + > +enqueue2: > + list_add_tail(&req->list, &backlog->list); > + > + spin_unlock_bh(&backlog->lock); > return -EBUSY; > } [1] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1276510.html [2] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1428293.html Regards,
Index: linux-2.6/drivers/crypto/intel/qat/qat_common/qat_algs_send.c =================================================================== --- linux-2.6.orig/drivers/crypto/intel/qat/qat_common/qat_algs_send.c +++ linux-2.6/drivers/crypto/intel/qat/qat_common/qat_algs_send.c @@ -40,16 +40,6 @@ void qat_alg_send_backlog(struct qat_ins spin_unlock_bh(&backlog->lock); } -static void qat_alg_backlog_req(struct qat_alg_req *req, - struct qat_instance_backlog *backlog) -{ - INIT_LIST_HEAD(&req->list); - - spin_lock_bh(&backlog->lock); - list_add_tail(&req->list, &backlog->list); - spin_unlock_bh(&backlog->lock); -} - static int qat_alg_send_message_maybacklog(struct qat_alg_req *req) { struct qat_instance_backlog *backlog = req->backlog; @@ -71,8 +61,27 @@ static int qat_alg_send_message_maybackl return -EINPROGRESS; enqueue: - qat_alg_backlog_req(req, backlog); + spin_lock_bh(&backlog->lock); + + /* If any request is already backlogged, then add to backlog list */ + if (!list_empty(&backlog->list)) + goto enqueue2; + /* If ring is nearly full, then add to backlog list */ + if (adf_ring_nearly_full(tx_ring)) + goto enqueue2; + + /* If adding request to HW ring fails, then add to backlog list */ + if (adf_send_message(tx_ring, fw_req)) + goto enqueue2; + + spin_unlock_bh(&backlog->lock); + return -EINPROGRESS; + +enqueue2: + list_add_tail(&req->list, &backlog->list); + + spin_unlock_bh(&backlog->lock); return -EBUSY; }
Hi I was evaluating whether it is feasible to use QAT with dm-crypt (the answer is that it is not - QAT is slower than AES-NI for this type of workload; QAT starts to be benefical for encryption requests longer than 64k). And I got some deadlocks. The reason for the deadlocks is this: suppose that one of the "if" conditions in "qat_alg_send_message_maybacklog" is true and we jump to the "enqueue" label. At this point, an interrupt comes in and clears all pending messages. Now, the interrupt returns, we grab backlog->lock, add the message to the backlog, drop backlog->lock - and there is no one to remove the backlogged message out of the list and submit it. I fixed it with this patch - with this patch, the test passes and there are no longer any deadlocks. I didn't want to add a spinlock to the hot path, so I take it only if some of the condition suggests that queuing may be required. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org --- drivers/crypto/intel/qat/qat_common/qat_algs_send.c | 31 ++++++++++++-------- 1 file changed, 20 insertions(+), 11 deletions(-)