From patchwork Fri Dec 4 14:02:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 338388 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CCDEDC433FE for ; Fri, 4 Dec 2020 14:04:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A29E522A99 for ; Fri, 4 Dec 2020 14:04:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729113AbgLDOED (ORCPT ); Fri, 4 Dec 2020 09:04:03 -0500 Received: from mail.fireflyinternet.com ([77.68.26.236]:56586 "EHLO fireflyinternet.com" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1727682AbgLDOED (ORCPT ); Fri, 4 Dec 2020 09:04:03 -0500 X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 23215309-1500050 for multiple; Fri, 04 Dec 2020 14:03:18 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Cc: tvrtko.ursulin@intel.com, Chris Wilson , stable@vger.kernel.org Subject: [PATCH 02/24] drm/i915/gt: Ignore repeated attempts to suspend request flow across reset Date: Fri, 4 Dec 2020 14:02:53 +0000 Message-Id: <20201204140315.24341-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20201204140315.24341-1-chris@chris-wilson.co.uk> References: <20201204140315.24341-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org Before reseting the engine, we suspend the execution of the guilty request, so that we can continue execution with a new context while we slowly compress the captured error state for the guilty context. However, if the reset fails, we will promptly attempt to reset the same request again, and discover the ongoing capture. Ignore the second attempt to suspend and capture the same request. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168 Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests") Signed-off-by: Chris Wilson Cc: # v5.7+ Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/gt/intel_lrc.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 43703efb36d1..1d209a8a95e8 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -2823,6 +2823,9 @@ static void __execlists_hold(struct i915_request *rq) static bool execlists_hold(struct intel_engine_cs *engine, struct i915_request *rq) { + if (i915_request_on_hold(rq)) + return false; + spin_lock_irq(&engine->active.lock); if (i915_request_completed(rq)) { /* too late! */ From patchwork Fri Dec 4 14:02:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 338805 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id ECD4FC4361B for ; Fri, 4 Dec 2020 14:04:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BB81A22AAF for ; Fri, 4 Dec 2020 14:04:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727682AbgLDOEE (ORCPT ); Fri, 4 Dec 2020 09:04:04 -0500 Received: from mail.fireflyinternet.com ([77.68.26.236]:56592 "EHLO fireflyinternet.com" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1729039AbgLDOED (ORCPT ); Fri, 4 Dec 2020 09:04:03 -0500 X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 23215310-1500050 for multiple; Fri, 04 Dec 2020 14:03:18 +0000 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Cc: tvrtko.ursulin@intel.com, Chris Wilson , stable@vger.kernel.org Subject: [PATCH 03/24] drm/i915/gt: Cancel the preemption timeout on responding to it Date: Fri, 4 Dec 2020 14:02:54 +0000 Message-Id: <20201204140315.24341-3-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20201204140315.24341-1-chris@chris-wilson.co.uk> References: <20201204140315.24341-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org We currently presume that the engine reset is successful, cancelling the expired preemption timer in the process. However, engine resets can fail, leaving the timeout still pending and we will then respond to the timeout again next time the tasklet fires. What we want is for the failed engine reset to be promoted to a full device reset, which is kicked by the heartbeat once the engine stops processing events. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168 Fixes: 3a7a92aba8fb ("drm/i915/execlists: Force preemption") Signed-off-by: Chris Wilson Cc: # v5.5+ --- drivers/gpu/drm/i915/gt/intel_lrc.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 1d209a8a95e8..7f25894e41d5 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -3209,8 +3209,10 @@ static void execlists_submission_tasklet(unsigned long data) spin_unlock_irqrestore(&engine->active.lock, flags); /* Recheck after serialising with direct-submission */ - if (unlikely(timeout && preempt_timeout(engine))) + if (unlikely(timeout && preempt_timeout(engine))) { + cancel_timer(&engine->execlists.preempt); execlists_reset(engine, "preemption time out"); + } } }