From patchwork Thu Oct 1 14:01:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Brezillon X-Patchwork-Id: 262847 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C234C4727C for ; Thu, 1 Oct 2020 14:02:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3CFDC221F0 for ; Thu, 1 Oct 2020 14:02:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732712AbgJAOCs (ORCPT ); Thu, 1 Oct 2020 10:02:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53682 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732711AbgJAOBz (ORCPT ); Thu, 1 Oct 2020 10:01:55 -0400 Received: from bhuna.collabora.co.uk (bhuna.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e3e3]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CC3A0C0613D0 for ; Thu, 1 Oct 2020 07:01:54 -0700 (PDT) Received: from localhost.localdomain (unknown [IPv6:2a01:e0a:2c:6930:5cf4:84a1:2763:fe0d]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) (Authenticated sender: bbrezillon) by bhuna.collabora.co.uk (Postfix) with ESMTPSA id 211C529D5DD; Thu, 1 Oct 2020 15:01:52 +0100 (BST) From: Boris Brezillon To: Rob Herring , Tomeu Vizoso , Alyssa Rosenzweig , Steven Price , Robin Murphy Cc: dri-devel@lists.freedesktop.org, Boris Brezillon , stable@vger.kernel.org Subject: [PATCH] panfrost: Fix job timeout handling Date: Thu, 1 Oct 2020 16:01:43 +0200 Message-Id: <20201001140143.1058669-1-boris.brezillon@collabora.com> X-Mailer: git-send-email 2.26.2 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org If more than two or more jobs end up timeout-ing concurrently, only one of them (the one attached to the scheduler acquiring the lock) is fully handled. The other one remains in a dangling state where it's no longer part of the scheduling queue, but still blocks something in scheduler thus leading to repetitive timeouts when new jobs are queued. Let's make sure all bad jobs are properly handled by the thread acquiring the lock. Signed-off-by: Boris Brezillon Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver") Cc: --- drivers/gpu/drm/panfrost/panfrost_job.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c index 30e7b7196dab..e87edca51d84 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.c +++ b/drivers/gpu/drm/panfrost/panfrost_job.c @@ -25,7 +25,7 @@ struct panfrost_queue_state { struct drm_gpu_scheduler sched; - + struct drm_sched_job *bad; u64 fence_context; u64 emit_seqno; }; @@ -392,19 +392,29 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job) job_read(pfdev, JS_TAIL_LO(js)), sched_job); + /* + * Collect the bad job here so it can be processed by the thread + * acquiring the reset lock. + */ + pfdev->js->queue[js].bad = sched_job; + if (!mutex_trylock(&pfdev->reset_lock)) return; for (i = 0; i < NUM_JOB_SLOTS; i++) { struct drm_gpu_scheduler *sched = &pfdev->js->queue[i].sched; - drm_sched_stop(sched, sched_job); if (js != i) /* Ensure any timeouts on other slots have finished */ cancel_delayed_work_sync(&sched->work_tdr); - } - drm_sched_increase_karma(sched_job); + drm_sched_stop(sched, pfdev->js->queue[i].bad); + + if (pfdev->js->queue[i].bad) + drm_sched_increase_karma(pfdev->js->queue[i].bad); + + pfdev->js->queue[i].bad = NULL; + } spin_lock_irqsave(&pfdev->js->job_lock, flags); for (i = 0; i < NUM_JOB_SLOTS; i++) {