From patchwork Mon Apr 26 07:29:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Greg KH X-Patchwork-Id: 427809 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, URIBL_RED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C1B4DC433ED for ; Mon, 26 Apr 2021 07:45:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A394661004 for ; Mon, 26 Apr 2021 07:45:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233308AbhDZHpy (ORCPT ); Mon, 26 Apr 2021 03:45:54 -0400 Received: from mail.kernel.org ([198.145.29.99]:34612 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234034AbhDZHon (ORCPT ); Mon, 26 Apr 2021 03:44:43 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 43CA4613DA; Mon, 26 Apr 2021 07:41:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1619422874; bh=hjSYZSPWPq1fnBFqrqtgESksCLg0BUKDwHBIfJIdR54=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=GN+DNl0vSmz7aoRCjhb53oc3jX+LAZOb1fHPfEqpCoUWBqbeOyJYEM78Qk7DRVJWC 3YH3FfJpJzAU/yH7pmyKQYzB58mAo1fCHB06yGD+KtGPxaupPFvF1pB/iYez9SrgBt UdiZ6UIwjBoWEXs1ROQVxNlyqx1cY/Qer5j2Dhp0= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Philip Yang , Felix Kuehling , =?utf-8?q?Christian_K?= =?utf-8?b?w7ZuaWc=?= , Alex Deucher Subject: [PATCH 5.11 05/41] drm/amdgpu: reserve fence slot to update page table Date: Mon, 26 Apr 2021 09:29:52 +0200 Message-Id: <20210426072819.866751115@linuxfoundation.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210426072819.666570770@linuxfoundation.org> References: <20210426072819.666570770@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Philip Yang commit d42a5b639d15622ece5b9dd12dafd9776efa2593 upstream. Forgot to reserve a fence slot to use sdma to update page table, cause below kernel BUG backtrace to handle vm retry fault while application is exiting. [ 133.048143] kernel BUG at /home/yangp/git/compute_staging/kernel/drivers/dma-buf/dma-resv.c:281! [ 133.048487] Workqueue: events amdgpu_irq_handle_ih1 [amdgpu] [ 133.048506] RIP: 0010:dma_resv_add_shared_fence+0x204/0x280 [ 133.048672] amdgpu_vm_sdma_commit+0x134/0x220 [amdgpu] [ 133.048788] amdgpu_vm_bo_update_range+0x220/0x250 [amdgpu] [ 133.048905] amdgpu_vm_handle_fault+0x202/0x370 [amdgpu] [ 133.049031] gmc_v9_0_process_interrupt+0x1ab/0x310 [amdgpu] [ 133.049165] ? kgd2kfd_interrupt+0x9a/0x180 [amdgpu] [ 133.049289] ? amdgpu_irq_dispatch+0xb6/0x240 [amdgpu] [ 133.049408] amdgpu_irq_dispatch+0xb6/0x240 [amdgpu] [ 133.049534] amdgpu_ih_process+0x9b/0x1c0 [amdgpu] [ 133.049657] amdgpu_irq_handle_ih1+0x21/0x60 [amdgpu] [ 133.049669] process_one_work+0x29f/0x640 [ 133.049678] worker_thread+0x39/0x3f0 [ 133.049685] ? process_one_work+0x640/0x640 Signed-off-by: Philip Yang Signed-off-by: Felix Kuehling Reviewed-by: Christian König Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org # 5.11.x Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -3298,7 +3298,7 @@ bool amdgpu_vm_handle_fault(struct amdgp struct amdgpu_bo *root; uint64_t value, flags; struct amdgpu_vm *vm; - long r; + int r; spin_lock(&adev->vm_manager.pasid_lock); vm = idr_find(&adev->vm_manager.pasid_idr, pasid); @@ -3347,6 +3347,12 @@ bool amdgpu_vm_handle_fault(struct amdgp value = 0; } + r = dma_resv_reserve_shared(root->tbo.base.resv, 1); + if (r) { + pr_debug("failed %d to reserve fence slot\n", r); + goto error_unlock; + } + r = amdgpu_vm_bo_update_mapping(adev, adev, vm, true, false, NULL, addr, addr, flags, value, NULL, NULL, NULL); @@ -3358,7 +3364,7 @@ bool amdgpu_vm_handle_fault(struct amdgp error_unlock: amdgpu_bo_unreserve(root); if (r < 0) - DRM_ERROR("Can't handle page fault (%ld)\n", r); + DRM_ERROR("Can't handle page fault (%d)\n", r); error_unref: amdgpu_bo_unref(&root);