From patchwork Wed Mar 2 17:27:26 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Akhil P Oommen
X-Patchwork-Id: 548223
From: Akhil P Oommen
To: freedreno, dri-devel@lists.freedesktop.org,
 linux-arm-msm@vger.kernel.org, Rob Clark, Dmitry Baryshkov,
 Bjorn Andersson
Cc: Abhinav Kumar, AngeloGioacchino Del Regno, Christian König,
 Dan Carpenter, Daniel Vetter, David Airlie, Dmitry Osipenko,
 Douglas Anderson, Emma Anholt, Jonathan Marek, Jordan Crouse,
 Sean Paul, Stephen Boyd, Viresh Kumar, Vladimir Lypak, Wang Qing,
 Yangtao Li, linux-kernel@vger.kernel.org
Subject: [PATCH v1 00/10] Support for GMU coredump and some related
 improvements
Date: Wed, 2 Mar 2022 22:57:26 +0530
Message-Id: <1646242056-2456-1-git-send-email-quic_akhilpo@quicinc.com>
X-Mailer: git-send-email 2.7.4
Precedence: bulk
X-Mailing-List: linux-arm-msm@vger.kernel.org

The major enhancement in this series is support for a minimal GMU
coredump that can be captured inline instead of through our usual
recover worker. This is helpful for GMU errors in the GPU
wake-up/suspend path, where it lets us capture a snapshot of the GMU
before we suspend. I had to introduce a lock to synchronize the
crashstate because runtime suspend can happen from an asynchronous RPM
thread.

Apart from this, there are some improvements to handle GMU errors
gracefully, either by propagating the error back to the parent or by
retrying. There are also a few patches that fix some trivial bugs in
the related code.

Akhil P Oommen (10):
  drm/msm/a6xx: Add helper to check smmu is stalled
  drm/msm/a6xx: Send NMI to gmu when it is hung
  drm/msm/a6xx: Avoid gmu lock in pm ops
  drm/msm/a6xx: Enhance debugging of gmu faults
  drm/msm: Do recovery on hw_init failure
  drm/msm/a6xx: Propagate OOB set error
  drm/msm/adreno: Retry on gpu resume failure
  drm/msm/a6xx: Remove clk votes on failure
  drm/msm: Remove pm_runtime_get() from msm_job_run()
  drm/msm/a6xx: Free gmu_debug crashstate bo

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c       | 89 +++++++++++++++++++++++------
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h       |  1 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c       | 31 +++++++---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h       |  4 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 79 +++++++++++++++++++++----
 drivers/gpu/drm/msm/adreno/adreno_device.c  | 10 +++-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c     | 10 +++-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h     |  2 +
 drivers/gpu/drm/msm/msm_gpu.c               | 28 ++++++++-
 drivers/gpu/drm/msm/msm_gpu.h               | 11 ++--
 drivers/gpu/drm/msm/msm_ringbuffer.c        |  4 --
 11 files changed, 218 insertions(+), 51 deletions(-)