From patchwork Mon Apr 28 20:54:33 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rob Clark X-Patchwork-Id: 885632 Received: from mail-pl1-f174.google.com (mail-pl1-f174.google.com [209.85.214.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B57B021B9F8; Mon, 28 Apr 2025 20:57:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745873869; cv=none; b=gfvpbI8GrFNTJnV7jQYsym64YRS4UNz+BO9br7A4ESDjE4WuKYeu3bQHEWsGVwfqid5Q3f1BuzTIzjzeGsMxU3MQlORpnySLFyMMaWSjZtsWU3qCg95LaGgrcSDL8R2vJGZju8FVbdR1qtIKIJj7UEp+sEOYuq0tRohxyxfvF4o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745873869; c=relaxed/simple; bh=hHoBUjjH0TQB5XKwaHIhu8Sgu1TevEne6tIKm+mXodA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=uLd6SqqodGn8EfS9HshhV3R3mR0KGIR+fpO1aqus+Njr4lCV2SpBOPTlTNz6yJcxxVI1WBJo63UH52CaPLeyyhehWwOli95fZFghWhx0VHYFkFT+rIRU9jdrG4NDtnps/fLDLFmcTqE91NWVC/1jBFp97/nCCKfJEdAKl6I3tww= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=hyrrSPta; arc=none smtp.client-ip=209.85.214.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="hyrrSPta" Received: by mail-pl1-f174.google.com with SMTP id d9443c01a7336-2279915e06eso60159295ad.1; Mon, 28 Apr 2025 13:57:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1745873867; x=1746478667; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Udy1qROHiLm0PiqzV909sHOl83XGRYgTsp73bE9AlF8=; b=hyrrSPtaIFZhIZrDofzog6iTdcVrNddEQQOrYC0xZmiIni1ZfKuPwYXY6VNAidXoIM DeTMXWPgCgHW8q5t8MLS2QBT7IN32dHLN85Fhz+4CAPNaIBdHGGFQd8PuyvJK3oc1TuQ 2IwPYM2mbXK/XXYzpKDgm+gB1UqinTdeJNl1AAE8O6cq4+K6Yx1OA9dvJGpX8+kLp/fG 9/Gt0JaIv2xZMvzGRC3DhgRQqVIgmzdv1Ct4mfq259eJgTS9tURW0Fxj66V3IHZvowWj q3mPkLAFckt96R2821JfGA+wjnQTIrqcPTxpQfsBWi9i4nN4szqk2R+dilvnwYBUHVmr dXbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1745873867; x=1746478667; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Udy1qROHiLm0PiqzV909sHOl83XGRYgTsp73bE9AlF8=; b=CAd4gCmSejsJlkDoqF64Qdu03YGjGuQOBGCXU273aNYeoGFMXBQFZN93ROQPMgbRVR e2XzDxXCKh/x5Njye4848kH0NYfoKPFAoCZUTHsWthRrYr+m9ibWMrcdyOY6jikYzQur u2LTbr150LrGFUo/yDIb6tJcpd/hDYHTLHLaq6DSx7TIncNTZnv3IbLYPlF8quM0A+/U V2kUeI5YLrDSNDbpVENvNud0OmyiIkFzpx1sr9/qTpr8Jr0YUox92CAd39a/Jp3+ncDw dDPBUZ958OrNkLq3FmmItb0uWGXj8pdVrIS5snUwhT9DOe4I+GB4Bo1w5ze6Z50Sz//9 m1+g== X-Forwarded-Encrypted: i=1; AJvYcCUWjKHDGb6SRFRTgdjKyy/dXPhymiOeR+v/45u7EnPPbJjgFItUpzADK9xMeKE5c4tkIfam2vAey/lRnJY=@vger.kernel.org, AJvYcCUe1DsQjSD95Hk12G9vBizk7TsRg6O8Swn9fc4nTOLwKKojL49shYrjU5tc7IASgT3PUFhvKyuliJPW0i2X@vger.kernel.org, AJvYcCX4adFhjaIzek3ppF7GU5LNleX8R83XD5MJpnVM+nM1iS5b/2bbwvXMu50G/TYH7VVr/UUW6z+YqyzK+JX4@vger.kernel.org X-Gm-Message-State: AOJu0YxmBvtVnKoSirCYVDTyE8mk+Qm0hx9iWeOUdp+RyJVHu+9d0rPF rjaPS7fqSb7cRS+WIups6iswpICscoYJmkLIw0RvDUnPp0USddLP X-Gm-Gg: ASbGncu4KOYQu2iP0Ma2FIQL4uXj2AnjCXeBhRpFGHFgK3Av2GHwQ7ykd6L+Sw2TInA Icc6Y2HVP90SceZ2FrcRfiefPSke4NPGVHK5JxbG+/XUqBWGktPGpcs1E1kakh0VFNH0QjVH/Lb I2RGHWYcK/60GN8ugBmRaI7eS94m7PfW+1kA22sFMaSa2ULSatetZX7EN+qPP6JCoWvjN5tOVJQ +NZYt9d14g/OBvuITtFugSdWSdN/bj/yK4lF2sggHU8RvvAWsiVe2av5BMNEtB7uQRghgFu6dxy TJ2BACQj8ShIy6lH7O1VqzQlTp8U2ju9TqF/g1lAahoWbeJVEsltVHp3oghDebPu4ltuklXxZ/3 1v5khdXTyaKN6TmJzUGy+y0tEiA== X-Google-Smtp-Source: AGHT+IHf1Ez5QiB/sRmmYzewN8HuuUfl9wXoGRwoecAv8VAlwtVWIlr7vg10G/OkJw71vZneTDG0uA== X-Received: by 2002:a17:903:22cc:b0:21f:5cd8:c67 with SMTP id d9443c01a7336-22dc6a0f443mr149140805ad.31.1745873866882; Mon, 28 Apr 2025 13:57:46 -0700 (PDT) Received: from localhost ([2a00:79e0:3e00:2601:3afc:446b:f0df:eadc]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-22db5216adbsm87704825ad.234.2025.04.28.13.57.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Apr 2025 13:57:46 -0700 (PDT) From: Rob Clark To: dri-devel@lists.freedesktop.org Cc: freedreno@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, Connor Abbott , Rob Clark , Rob Clark , Abhinav Kumar , Dmitry Baryshkov , Sean Paul , Marijn Suijten , David Airlie , Simona Vetter , Konrad Dybcio , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , Sumit Semwal , =?utf-8?q?Christian_K=C3=B6nig?= , linux-kernel@vger.kernel.org (open list), linux-media@vger.kernel.org (open list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b), linaro-mm-sig@lists.linaro.org (moderated list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b) Subject: [PATCH v3 26/33] drm/msm: Extract out syncobj helpers Date: Mon, 28 Apr 2025 13:54:33 -0700 Message-ID: <20250428205619.227835-27-robdclark@gmail.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250428205619.227835-1-robdclark@gmail.com> References: <20250428205619.227835-1-robdclark@gmail.com> Precedence: bulk X-Mailing-List: linux-media@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Rob Clark We'll be re-using these for the VM_BIND ioctl. Also, rename a few things in the uapi header to reflect that syncobj use is not specific to the submit ioctl. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/Makefile | 1 + drivers/gpu/drm/msm/msm_gem_submit.c | 192 ++------------------------- drivers/gpu/drm/msm/msm_syncobj.c | 172 ++++++++++++++++++++++++ drivers/gpu/drm/msm/msm_syncobj.h | 37 ++++++ include/uapi/drm/msm_drm.h | 26 ++-- 5 files changed, 235 insertions(+), 193 deletions(-) create mode 100644 drivers/gpu/drm/msm/msm_syncobj.c create mode 100644 drivers/gpu/drm/msm/msm_syncobj.h diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile index 5df20cbeafb8..8af34f87e0c8 100644 --- a/drivers/gpu/drm/msm/Makefile +++ b/drivers/gpu/drm/msm/Makefile @@ -128,6 +128,7 @@ msm-y += \ msm_rd.o \ msm_ringbuffer.o \ msm_submitqueue.o \ + msm_syncobj.o \ msm_gpu_tracepoints.o \ msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 375d89f23cd1..552ff2d57294 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -16,6 +16,7 @@ #include "msm_gpu.h" #include "msm_gem.h" #include "msm_gpu_trace.h" +#include "msm_syncobj.h" /* For userspace errors, use DRM_UT_DRIVER.. so that userspace can enable * error msgs for debugging, but we don't spam dmesg by default @@ -477,173 +478,6 @@ void msm_submit_retire(struct msm_gem_submit *submit) } } -struct msm_submit_post_dep { - struct drm_syncobj *syncobj; - uint64_t point; - struct dma_fence_chain *chain; -}; - -static struct drm_syncobj **msm_parse_deps(struct msm_gem_submit *submit, - struct drm_file *file, - uint64_t in_syncobjs_addr, - uint32_t nr_in_syncobjs, - size_t syncobj_stride) -{ - struct drm_syncobj **syncobjs = NULL; - struct drm_msm_gem_submit_syncobj syncobj_desc = {0}; - int ret = 0; - uint32_t i, j; - - syncobjs = kcalloc(nr_in_syncobjs, sizeof(*syncobjs), - GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY); - if (!syncobjs) - return ERR_PTR(-ENOMEM); - - for (i = 0; i < nr_in_syncobjs; ++i) { - uint64_t address = in_syncobjs_addr + i * syncobj_stride; - - if (copy_from_user(&syncobj_desc, - u64_to_user_ptr(address), - min(syncobj_stride, sizeof(syncobj_desc)))) { - ret = -EFAULT; - break; - } - - if (syncobj_desc.point && - !drm_core_check_feature(submit->dev, DRIVER_SYNCOBJ_TIMELINE)) { - ret = SUBMIT_ERROR(EOPNOTSUPP, submit, "syncobj timeline unsupported"); - break; - } - - if (syncobj_desc.flags & ~MSM_SUBMIT_SYNCOBJ_FLAGS) { - ret = SUBMIT_ERROR(EINVAL, submit, "invalid syncobj flags: %x", syncobj_desc.flags); - break; - } - - ret = drm_sched_job_add_syncobj_dependency(&submit->base, file, - syncobj_desc.handle, syncobj_desc.point); - if (ret) - break; - - if (syncobj_desc.flags & MSM_SUBMIT_SYNCOBJ_RESET) { - syncobjs[i] = - drm_syncobj_find(file, syncobj_desc.handle); - if (!syncobjs[i]) { - ret = SUBMIT_ERROR(EINVAL, submit, "invalid syncobj handle: %u", i); - break; - } - } - } - - if (ret) { - for (j = 0; j <= i; ++j) { - if (syncobjs[j]) - drm_syncobj_put(syncobjs[j]); - } - kfree(syncobjs); - return ERR_PTR(ret); - } - return syncobjs; -} - -static void msm_reset_syncobjs(struct drm_syncobj **syncobjs, - uint32_t nr_syncobjs) -{ - uint32_t i; - - for (i = 0; syncobjs && i < nr_syncobjs; ++i) { - if (syncobjs[i]) - drm_syncobj_replace_fence(syncobjs[i], NULL); - } -} - -static struct msm_submit_post_dep *msm_parse_post_deps(struct drm_device *dev, - struct drm_file *file, - uint64_t syncobjs_addr, - uint32_t nr_syncobjs, - size_t syncobj_stride) -{ - struct msm_submit_post_dep *post_deps; - struct drm_msm_gem_submit_syncobj syncobj_desc = {0}; - int ret = 0; - uint32_t i, j; - - post_deps = kcalloc(nr_syncobjs, sizeof(*post_deps), - GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY); - if (!post_deps) - return ERR_PTR(-ENOMEM); - - for (i = 0; i < nr_syncobjs; ++i) { - uint64_t address = syncobjs_addr + i * syncobj_stride; - - if (copy_from_user(&syncobj_desc, - u64_to_user_ptr(address), - min(syncobj_stride, sizeof(syncobj_desc)))) { - ret = -EFAULT; - break; - } - - post_deps[i].point = syncobj_desc.point; - - if (syncobj_desc.flags) { - ret = UERR(EINVAL, dev, "invalid syncobj flags"); - break; - } - - if (syncobj_desc.point) { - if (!drm_core_check_feature(dev, - DRIVER_SYNCOBJ_TIMELINE)) { - ret = UERR(EOPNOTSUPP, dev, "syncobj timeline unsupported"); - break; - } - - post_deps[i].chain = dma_fence_chain_alloc(); - if (!post_deps[i].chain) { - ret = -ENOMEM; - break; - } - } - - post_deps[i].syncobj = - drm_syncobj_find(file, syncobj_desc.handle); - if (!post_deps[i].syncobj) { - ret = UERR(EINVAL, dev, "invalid syncobj handle"); - break; - } - } - - if (ret) { - for (j = 0; j <= i; ++j) { - dma_fence_chain_free(post_deps[j].chain); - if (post_deps[j].syncobj) - drm_syncobj_put(post_deps[j].syncobj); - } - - kfree(post_deps); - return ERR_PTR(ret); - } - - return post_deps; -} - -static void msm_process_post_deps(struct msm_submit_post_dep *post_deps, - uint32_t count, struct dma_fence *fence) -{ - uint32_t i; - - for (i = 0; post_deps && i < count; ++i) { - if (post_deps[i].chain) { - drm_syncobj_add_point(post_deps[i].syncobj, - post_deps[i].chain, - fence, post_deps[i].point); - post_deps[i].chain = NULL; - } else { - drm_syncobj_replace_fence(post_deps[i].syncobj, - fence); - } - } -} - int msm_ioctl_gem_submit(struct drm_device *dev, void *data, struct drm_file *file) { @@ -654,7 +488,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, struct msm_gpu *gpu = priv->gpu; struct msm_gpu_submitqueue *queue; struct msm_ringbuffer *ring; - struct msm_submit_post_dep *post_deps = NULL; + struct msm_syncobj_post_dep *post_deps = NULL; struct drm_syncobj **syncobjs_to_reset = NULL; int out_fence_fd = -1; unsigned i; @@ -730,10 +564,10 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, } if (args->flags & MSM_SUBMIT_SYNCOBJ_IN) { - syncobjs_to_reset = msm_parse_deps(submit, file, - args->in_syncobjs, - args->nr_in_syncobjs, - args->syncobj_stride); + syncobjs_to_reset = msm_syncobj_parse_deps(dev, &submit->base, + file, args->in_syncobjs, + args->nr_in_syncobjs, + args->syncobj_stride); if (IS_ERR(syncobjs_to_reset)) { ret = PTR_ERR(syncobjs_to_reset); goto out_unlock; @@ -741,10 +575,10 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, } if (args->flags & MSM_SUBMIT_SYNCOBJ_OUT) { - post_deps = msm_parse_post_deps(dev, file, - args->out_syncobjs, - args->nr_out_syncobjs, - args->syncobj_stride); + post_deps = msm_syncobj_parse_post_deps(dev, file, + args->out_syncobjs, + args->nr_out_syncobjs, + args->syncobj_stride); if (IS_ERR(post_deps)) { ret = PTR_ERR(post_deps); goto out_unlock; @@ -887,10 +721,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, args->fence = submit->fence_id; queue->last_fence = submit->fence_id; - msm_reset_syncobjs(syncobjs_to_reset, args->nr_in_syncobjs); - msm_process_post_deps(post_deps, args->nr_out_syncobjs, - submit->user_fence); - + msm_syncobj_reset(syncobjs_to_reset, args->nr_in_syncobjs); + msm_syncobj_process_post_deps(post_deps, args->nr_out_syncobjs, submit->user_fence); out: submit_cleanup(submit, !!ret); diff --git a/drivers/gpu/drm/msm/msm_syncobj.c b/drivers/gpu/drm/msm/msm_syncobj.c new file mode 100644 index 000000000000..4baa9f522c54 --- /dev/null +++ b/drivers/gpu/drm/msm/msm_syncobj.c @@ -0,0 +1,172 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright (C) 2020 Google, Inc */ + +#include "drm/drm_drv.h" + +#include "msm_drv.h" +#include "msm_syncobj.h" + +struct drm_syncobj ** +msm_syncobj_parse_deps(struct drm_device *dev, + struct drm_sched_job *job, + struct drm_file *file, + uint64_t in_syncobjs_addr, + uint32_t nr_in_syncobjs, + size_t syncobj_stride) +{ + struct drm_syncobj **syncobjs = NULL; + struct drm_msm_syncobj syncobj_desc = {0}; + int ret = 0; + uint32_t i, j; + + syncobjs = kcalloc(nr_in_syncobjs, sizeof(*syncobjs), + GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY); + if (!syncobjs) + return ERR_PTR(-ENOMEM); + + for (i = 0; i < nr_in_syncobjs; ++i) { + uint64_t address = in_syncobjs_addr + i * syncobj_stride; + + if (copy_from_user(&syncobj_desc, + u64_to_user_ptr(address), + min(syncobj_stride, sizeof(syncobj_desc)))) { + ret = -EFAULT; + break; + } + + if (syncobj_desc.point && + !drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) { + ret = UERR(EOPNOTSUPP, dev, "syncobj timeline unsupported"); + break; + } + + if (syncobj_desc.flags & ~MSM_SYNCOBJ_FLAGS) { + ret = UERR(EINVAL, dev, "invalid syncobj flags: %x", syncobj_desc.flags); + break; + } + + ret = drm_sched_job_add_syncobj_dependency(job, file, + syncobj_desc.handle, + syncobj_desc.point); + if (ret) + break; + + if (syncobj_desc.flags & MSM_SYNCOBJ_RESET) { + syncobjs[i] = drm_syncobj_find(file, syncobj_desc.handle); + if (!syncobjs[i]) { + ret = UERR(EINVAL, dev, "invalid syncobj handle: %u", i); + break; + } + } + } + + if (ret) { + for (j = 0; j <= i; ++j) { + if (syncobjs[j]) + drm_syncobj_put(syncobjs[j]); + } + kfree(syncobjs); + return ERR_PTR(ret); + } + return syncobjs; +} + +void +msm_syncobj_reset(struct drm_syncobj **syncobjs, uint32_t nr_syncobjs) +{ + uint32_t i; + + for (i = 0; syncobjs && i < nr_syncobjs; ++i) { + if (syncobjs[i]) + drm_syncobj_replace_fence(syncobjs[i], NULL); + } +} + +struct msm_syncobj_post_dep * +msm_syncobj_parse_post_deps(struct drm_device *dev, + struct drm_file *file, + uint64_t syncobjs_addr, + uint32_t nr_syncobjs, + size_t syncobj_stride) +{ + struct msm_syncobj_post_dep *post_deps; + struct drm_msm_syncobj syncobj_desc = {0}; + int ret = 0; + uint32_t i, j; + + post_deps = kcalloc(nr_syncobjs, sizeof(*post_deps), + GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY); + if (!post_deps) + return ERR_PTR(-ENOMEM); + + for (i = 0; i < nr_syncobjs; ++i) { + uint64_t address = syncobjs_addr + i * syncobj_stride; + + if (copy_from_user(&syncobj_desc, + u64_to_user_ptr(address), + min(syncobj_stride, sizeof(syncobj_desc)))) { + ret = -EFAULT; + break; + } + + post_deps[i].point = syncobj_desc.point; + + if (syncobj_desc.flags) { + ret = UERR(EINVAL, dev, "invalid syncobj flags"); + break; + } + + if (syncobj_desc.point) { + if (!drm_core_check_feature(dev, + DRIVER_SYNCOBJ_TIMELINE)) { + ret = UERR(EOPNOTSUPP, dev, "syncobj timeline unsupported"); + break; + } + + post_deps[i].chain = dma_fence_chain_alloc(); + if (!post_deps[i].chain) { + ret = -ENOMEM; + break; + } + } + + post_deps[i].syncobj = + drm_syncobj_find(file, syncobj_desc.handle); + if (!post_deps[i].syncobj) { + ret = UERR(EINVAL, dev, "invalid syncobj handle"); + break; + } + } + + if (ret) { + for (j = 0; j <= i; ++j) { + dma_fence_chain_free(post_deps[j].chain); + if (post_deps[j].syncobj) + drm_syncobj_put(post_deps[j].syncobj); + } + + kfree(post_deps); + return ERR_PTR(ret); + } + + return post_deps; +} + +void +msm_syncobj_process_post_deps(struct msm_syncobj_post_dep *post_deps, + uint32_t count, struct dma_fence *fence) +{ + uint32_t i; + + for (i = 0; post_deps && i < count; ++i) { + if (post_deps[i].chain) { + drm_syncobj_add_point(post_deps[i].syncobj, + post_deps[i].chain, + fence, post_deps[i].point); + post_deps[i].chain = NULL; + } else { + drm_syncobj_replace_fence(post_deps[i].syncobj, + fence); + } + } +} diff --git a/drivers/gpu/drm/msm/msm_syncobj.h b/drivers/gpu/drm/msm/msm_syncobj.h new file mode 100644 index 000000000000..bcaa15d01da0 --- /dev/null +++ b/drivers/gpu/drm/msm/msm_syncobj.h @@ -0,0 +1,37 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright (C) 2020 Google, Inc */ + +#ifndef __MSM_GEM_SYNCOBJ_H__ +#define __MSM_GEM_SYNCOBJ_H__ + +#include "drm/drm_device.h" +#include "drm/drm_syncobj.h" +#include "drm/gpu_scheduler.h" + +struct msm_syncobj_post_dep { + struct drm_syncobj *syncobj; + uint64_t point; + struct dma_fence_chain *chain; +}; + +struct drm_syncobj ** +msm_syncobj_parse_deps(struct drm_device *dev, + struct drm_sched_job *job, + struct drm_file *file, + uint64_t in_syncobjs_addr, + uint32_t nr_in_syncobjs, + size_t syncobj_stride); + +void msm_syncobj_reset(struct drm_syncobj **syncobjs, uint32_t nr_syncobjs); + +struct msm_syncobj_post_dep * +msm_syncobj_parse_post_deps(struct drm_device *dev, + struct drm_file *file, + uint64_t syncobjs_addr, + uint32_t nr_syncobjs, + size_t syncobj_stride); + +void msm_syncobj_process_post_deps(struct msm_syncobj_post_dep *post_deps, + uint32_t count, struct dma_fence *fence); + +#endif /* __MSM_GEM_SYNCOBJ_H__ */ diff --git a/include/uapi/drm/msm_drm.h b/include/uapi/drm/msm_drm.h index 1bccc347945c..2c2fc4b284d0 100644 --- a/include/uapi/drm/msm_drm.h +++ b/include/uapi/drm/msm_drm.h @@ -220,6 +220,17 @@ struct drm_msm_gem_cpu_fini { * Cmdstream Submission: */ +#define MSM_SYNCOBJ_RESET 0x00000001 /* Reset syncobj after wait. */ +#define MSM_SYNCOBJ_FLAGS ( \ + MSM_SYNCOBJ_RESET | \ + 0) + +struct drm_msm_syncobj { + __u32 handle; /* in, syncobj handle. */ + __u32 flags; /* in, from MSM_SUBMIT_SYNCOBJ_FLAGS */ + __u64 point; /* in, timepoint for timeline syncobjs. */ +}; + /* The value written into the cmdstream is logically: * * ((relocbuf->gpuaddr + reloc_offset) << shift) | or @@ -309,17 +320,6 @@ struct drm_msm_gem_submit_bo { MSM_SUBMIT_FENCE_SN_IN | \ 0) -#define MSM_SUBMIT_SYNCOBJ_RESET 0x00000001 /* Reset syncobj after wait. */ -#define MSM_SUBMIT_SYNCOBJ_FLAGS ( \ - MSM_SUBMIT_SYNCOBJ_RESET | \ - 0) - -struct drm_msm_gem_submit_syncobj { - __u32 handle; /* in, syncobj handle. */ - __u32 flags; /* in, from MSM_SUBMIT_SYNCOBJ_FLAGS */ - __u64 point; /* in, timepoint for timeline syncobjs. */ -}; - /* Each cmdstream submit consists of a table of buffers involved, and * one or more cmdstream buffers. This allows for conditional execution * (context-restore), and IB buffers needed for per tile/bin draw cmds. @@ -333,8 +333,8 @@ struct drm_msm_gem_submit { __u64 cmds; /* in, ptr to array of submit_cmd's */ __s32 fence_fd; /* in/out fence fd (see MSM_SUBMIT_FENCE_FD_IN/OUT) */ __u32 queueid; /* in, submitqueue id */ - __u64 in_syncobjs; /* in, ptr to array of drm_msm_gem_submit_syncobj */ - __u64 out_syncobjs; /* in, ptr to array of drm_msm_gem_submit_syncobj */ + __u64 in_syncobjs; /* in, ptr to array of drm_msm_syncobj */ + __u64 out_syncobjs; /* in, ptr to array of drm_msm_syncobj */ __u32 nr_in_syncobjs; /* in, number of entries in in_syncobj */ __u32 nr_out_syncobjs; /* in, number of entries in out_syncobj. */ __u32 syncobj_stride; /* in, stride of syncobj arrays. */ From patchwork Mon Apr 28 20:54:39 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rob Clark X-Patchwork-Id: 885631 Received: from mail-pf1-f170.google.com (mail-pf1-f170.google.com [209.85.210.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 366A822170A; Mon, 28 Apr 2025 20:58:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745873884; cv=none; b=CvCikJVmmSCmcw3Z2RyUk0i5tvSMecoePtpJN6ShWCtaI8LvRWXle8L14QMuY0Eiaa8kCJUI9FN3MiAToKeYNu2bDMPWCqpTFqNizK++ehH8SSD1T1uFgw3kNDNhA2fihly7IwylooHCZeRMVloWvTruOatZxOt1cM3ft4h3KMo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745873884; c=relaxed/simple; bh=pSUudMHlaAJJiN3iuOh/6BguviGsLFM4vwJRPFQVLl4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Zy9rnghOUpEgI6TBNWFRu4ZDVKTiDkCv1alztzsjrEkWJW/zsAlO9MXv40Vebstehh7QSMbnIz0XaYSnZIckIM2RStGnMAzDqR1bOksDJ6vY7D54Jls6fj8PjH8r7bjX9ehYv1hz4C87PDmG1wcjfhtnVv6Z2icJTRlRm4YMGu4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=cODkVRy6; arc=none smtp.client-ip=209.85.210.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="cODkVRy6" Received: by mail-pf1-f170.google.com with SMTP id d2e1a72fcca58-736bfa487c3so4406694b3a.1; Mon, 28 Apr 2025 13:58:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1745873880; x=1746478680; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=AqbCKUfESrDdXM9YTu1Ypf1jKLBYkJfDcNHUu79S7+0=; b=cODkVRy6+/VMgIUqobZMi3aWb2RzrBkRRfVh2U88tga8RPSlc3r/6LWgUvnmmsP1md 5SBiZ5Skm4Ffc/UBAGtoXkJ5F5UbpN1U7oKU6T4XjeQLH2g6DT4Qaabn+Y1pEtS21FrA 89KUrVnmezHAwAY+XBVD6MMDbHK87/vYaP4BZZo+NbV7ucYxO7hx75+WUBDkBlQ+xKfv 9zJaz6pReMdPiHDzvwLgxunj3lj3K07Cl4Wjime1uo6RoHUfjGqNp4DZh90HaFQdpGRt 1Ip1CyBG6CV2vMwfQkvMgJ0PzO6PJ2DJriSdNF3pTQWD5y/KEiKZWGjx5M45dNwpdyW3 zN2w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1745873880; x=1746478680; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=AqbCKUfESrDdXM9YTu1Ypf1jKLBYkJfDcNHUu79S7+0=; b=NNAFBImOeihSzM7uaZ2+Sv/71ln7YgxrfrQpVMRj6zUSyY12rH4m5hrkX5nS6AV1/r LFJbKGdPx3gKjP3xXagqmGwF9Wy+HvknCo6391iYW2ZJhDuYiYpKuRoUsM5/5F2dA2Qj SHR0q3Lt0z9WguNSnRWk+6jabpIxtgkQk2Pzmk0GyDjn7Oo2sXGfgbrQJLW2gQbaDfGL L4spZ4qR+3i7fVQcUwdO1IuBstnsFPboRSlNukZljzXWaAhNcZKHswNU7ejS23EBu6tA klu+r13cgvG7YyX7f7PLAXjRlf17GPUxnj9OXbogRl3y18gTUDYckFgv0ffNQhIMXTmn nY1g== X-Forwarded-Encrypted: i=1; AJvYcCUobrzD7QR3WudfmfNNZhoBIzfv5wX3crM2/kdYG5vj4Horx7z3WII2viyq8HrfOheGL4DNIHVUKpdd6I4=@vger.kernel.org, AJvYcCVKaVfLrx85JD77UJuBpKsrZfTyCgvJUSHfTjK47HyLRN8AsxZEX4ZYhf+AtcIlksvoLvruhNzLE8TsE2Yf@vger.kernel.org, AJvYcCXMMxwSe/e7xCDPr5FiZCvD6eUxZMVNHPE35wSYp5Hjb1Kld3XYLl5zeFnak57Pu5OxWpESzQ1J9ggS/oir@vger.kernel.org X-Gm-Message-State: AOJu0Ywi49kli9ZQ7tD+zLnf8SXldtmlfA7uhxG/Jjzxbn4vKMS8qObf DXTLCieVYl6KlpB3I+cn4yl4j9qvh3OnelXFXmtcPP2tYBLW1Noy X-Gm-Gg: ASbGncuIZVqlpSsZPW/eT5lJe4xtaxMhZ9AzjHx62OaqqgEZ+ZqC+sO1boklPdLm2LQ omvqoVP1mmrv8fqlnca+8cs+FnuT41rdXkysIQdvz6CLtWy8OvlvwsVho8dhegeqdpoN7nznJIC Z7vFpFjaJyj3b5x629DlhL3//Q3ZolkVGLd6b7WTpnnjaQkmotVOHi7hdqKOn9GXXufLfihF1NR E8KoMkfp/vv1RfCV8ynXjmcCRyeuSP7vBbmFLPbmI+5R+oiyLIjWKiHdo80kEX7s5XQhZDCCfTG iGuZY2tTUI6fOuqcClYexAMhyKmN2keDeMDz0V/nPoI4smiAJpU25i9twuf/me8BqppBgz9K+o7 FpTTs8i69NML27OM= X-Google-Smtp-Source: AGHT+IECQGP1wmsoTq8MndQ0s+V21w4LPMD67LVK/wkAVjQ4uIC7u6csrd+LuNStWU+R+MrtjbsVmw== X-Received: by 2002:a05:6a20:c990:b0:201:8a13:f392 with SMTP id adf61e73a8af0-2046a57d7a4mr13815014637.20.1745873880163; Mon, 28 Apr 2025 13:58:00 -0700 (PDT) Received: from localhost ([2a00:79e0:3e00:2601:3afc:446b:f0df:eadc]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-73e25967766sm8495867b3a.82.2025.04.28.13.57.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Apr 2025 13:57:59 -0700 (PDT) From: Rob Clark To: dri-devel@lists.freedesktop.org Cc: freedreno@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, Connor Abbott , Rob Clark , Rob Clark , Abhinav Kumar , Dmitry Baryshkov , Sean Paul , Marijn Suijten , David Airlie , Simona Vetter , Konrad Dybcio , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , Sumit Semwal , =?utf-8?q?Christian_K=C3=B6nig?= , linux-kernel@vger.kernel.org (open list), linux-media@vger.kernel.org (open list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b), linaro-mm-sig@lists.linaro.org (moderated list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b) Subject: [PATCH v3 32/33] drm/msm: Add VM_BIND ioctl Date: Mon, 28 Apr 2025 13:54:39 -0700 Message-ID: <20250428205619.227835-33-robdclark@gmail.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250428205619.227835-1-robdclark@gmail.com> References: <20250428205619.227835-1-robdclark@gmail.com> Precedence: bulk X-Mailing-List: linux-media@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Rob Clark Add a VM_BIND ioctl for binding/unbinding buffers into a VM. This is only supported if userspace has opted in to MSM_PARAM_EN_VM_BIND. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/msm_drv.c | 1 + drivers/gpu/drm/msm/msm_drv.h | 4 +- drivers/gpu/drm/msm/msm_gem.c | 40 +- drivers/gpu/drm/msm/msm_gem.h | 4 + drivers/gpu/drm/msm/msm_gem_submit.c | 22 +- drivers/gpu/drm/msm/msm_gem_vma.c | 987 ++++++++++++++++++++++++++- include/uapi/drm/msm_drm.h | 74 +- 7 files changed, 1097 insertions(+), 35 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c index 49e4425c3caf..a63442039e22 100644 --- a/drivers/gpu/drm/msm/msm_drv.c +++ b/drivers/gpu/drm/msm/msm_drv.c @@ -790,6 +790,7 @@ static const struct drm_ioctl_desc msm_ioctls[] = { DRM_IOCTL_DEF_DRV(MSM_SUBMITQUEUE_NEW, msm_ioctl_submitqueue_new, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(MSM_SUBMITQUEUE_CLOSE, msm_ioctl_submitqueue_close, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(MSM_SUBMITQUEUE_QUERY, msm_ioctl_submitqueue_query, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(MSM_VM_BIND, msm_ioctl_vm_bind, DRM_RENDER_ALLOW), }; static void msm_show_fdinfo(struct drm_printer *p, struct drm_file *file) diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h index b0add236cbb3..33240afc6365 100644 --- a/drivers/gpu/drm/msm/msm_drv.h +++ b/drivers/gpu/drm/msm/msm_drv.h @@ -232,7 +232,9 @@ struct drm_gpuvm *msm_kms_init_vm(struct drm_device *dev); bool msm_use_mmu(struct drm_device *dev); int msm_ioctl_gem_submit(struct drm_device *dev, void *data, - struct drm_file *file); + struct drm_file *file); +int msm_ioctl_vm_bind(struct drm_device *dev, void *data, + struct drm_file *file); #ifdef CONFIG_DEBUG_FS unsigned long msm_gem_shrinker_shrink(struct drm_device *dev, unsigned long nr_to_scan); diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index b4b299e3f3d3..9d60b38ba58a 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -233,8 +233,7 @@ static void put_pages(struct drm_gem_object *obj) } } -static struct page **msm_gem_get_pages_locked(struct drm_gem_object *obj, - unsigned madv) +struct page **msm_gem_get_pages_locked(struct drm_gem_object *obj, unsigned madv) { struct msm_gem_object *msm_obj = to_msm_bo(obj); @@ -1036,18 +1035,37 @@ static void msm_gem_free_object(struct drm_gem_object *obj) /* * We need to lock any VMs the object is still attached to, but not * the object itself (see explaination in msm_gem_assert_locked()), - * so just open-code this special case: + * so just open-code this special case. + * + * Note that we skip the dance if we aren't attached to any VM. This + * is load bearing. The driver needs to support two usage models: + * + * 1. Legacy kernel managed VM: Userspace expects the VMA's to be + * implicitly torn down when the object is freed, the VMA's do + * not hold a hard reference to the BO. + * + * 2. VM_BIND, userspace managed VM: The VMA holds a reference to the + * BO. This can be dropped when the VM is closed and it's associated + * VMAs are torn down. (See msm_gem_vm_close()). + * + * In the latter case the last reference to a BO can be dropped while + * we already have the VM locked. It would have already been removed + * from the gpuva list, but lockdep doesn't know that. Or understand + * the differences between the two usage models. */ - drm_exec_init(&exec, 0, 0); - drm_exec_until_all_locked (&exec) { - struct drm_gpuvm_bo *vm_bo; - drm_gem_for_each_gpuvm_bo (vm_bo, obj) { - drm_exec_lock_obj(&exec, drm_gpuvm_resv_obj(vm_bo->vm)); - drm_exec_retry_on_contention(&exec); + if (!list_empty(&obj->gpuva.list)) { + drm_exec_init(&exec, 0, 0); + drm_exec_until_all_locked (&exec) { + struct drm_gpuvm_bo *vm_bo; + drm_gem_for_each_gpuvm_bo (vm_bo, obj) { + drm_exec_lock_obj(&exec, + drm_gpuvm_resv_obj(vm_bo->vm)); + drm_exec_retry_on_contention(&exec); + } } + put_iova_spaces(obj, NULL, true); + drm_exec_fini(&exec); /* drop locks */ } - put_iova_spaces(obj, NULL, true); - drm_exec_fini(&exec); /* drop locks */ if (obj->import_attach) { GEM_WARN_ON(msm_obj->vaddr); diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h index 8ad25927c604..bfeb0f584ae5 100644 --- a/drivers/gpu/drm/msm/msm_gem.h +++ b/drivers/gpu/drm/msm/msm_gem.h @@ -73,6 +73,9 @@ struct msm_gem_vm { /** @mmu: The mmu object which manages the pgtables */ struct msm_mmu *mmu; + /** @mmu_lock: Protects access to the mmu */ + struct mutex mmu_lock; + /** * @pid: For address spaces associated with a specific process, this * will be non-NULL: @@ -205,6 +208,7 @@ int msm_gem_get_and_pin_iova(struct drm_gem_object *obj, struct drm_gpuvm *vm, uint64_t *iova); void msm_gem_unpin_iova(struct drm_gem_object *obj, struct drm_gpuvm *vm); void msm_gem_pin_obj_locked(struct drm_gem_object *obj); +struct page **msm_gem_get_pages_locked(struct drm_gem_object *obj, unsigned madv); struct page **msm_gem_pin_pages_locked(struct drm_gem_object *obj); void msm_gem_unpin_pages_locked(struct drm_gem_object *obj); int msm_gem_dumb_create(struct drm_file *file, struct drm_device *dev, diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 2e7f8352fa40..7cea8743ff68 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -184,6 +184,7 @@ static int submit_lookup_objects(struct msm_gem_submit *submit, static int submit_lookup_cmds(struct msm_gem_submit *submit, struct drm_msm_gem_submit *args, struct drm_file *file) { + struct msm_context *ctx = file->driver_priv; unsigned i; size_t sz; int ret = 0; @@ -215,6 +216,20 @@ static int submit_lookup_cmds(struct msm_gem_submit *submit, goto out; } + if (msm_context_is_vmbind(ctx)) { + if (submit_cmd.nr_relocs) { + ret = SUBMIT_ERROR(EINVAL, submit, "nr_relocs must be zero"); + goto out; + } + + if (submit_cmd.submit_idx || submit_cmd.submit_offset) { + ret = SUBMIT_ERROR(EINVAL, submit, "submit_idx/offset must be zero"); + goto out; + } + + submit->cmd[i].iova = submit_cmd.iova; + } + submit->cmd[i].type = submit_cmd.type; submit->cmd[i].size = submit_cmd.size / 4; submit->cmd[i].offset = submit_cmd.submit_offset / 4; @@ -517,6 +532,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, struct msm_ringbuffer *ring; struct msm_syncobj_post_dep *post_deps = NULL; struct drm_syncobj **syncobjs_to_reset = NULL; + unsigned cmds_to_parse; int out_fence_fd = -1; unsigned i; int ret; @@ -652,7 +668,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, goto out; } - for (i = 0; i < args->nr_cmds; i++) { + cmds_to_parse = msm_context_is_vmbind(ctx) ? 0 : args->nr_cmds; + + for (i = 0; i < cmds_to_parse; i++) { struct drm_gem_object *obj; uint64_t iova; @@ -683,7 +701,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, goto out; } - submit->nr_cmds = i; + submit->nr_cmds = args->nr_cmds; idr_preload(GFP_KERNEL); diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c index f3903825e0b6..69d0d423900e 100644 --- a/drivers/gpu/drm/msm/msm_gem_vma.c +++ b/drivers/gpu/drm/msm/msm_gem_vma.c @@ -4,9 +4,16 @@ * Author: Rob Clark */ +#include "drm/drm_file.h" +#include "drm/msm_drm.h" +#include "linux/file.h" +#include "linux/sync_file.h" + #include "msm_drv.h" #include "msm_gem.h" +#include "msm_gpu.h" #include "msm_mmu.h" +#include "msm_syncobj.h" #define vm_dbg(fmt, ...) pr_debug("%s:%d: "fmt"\n", __func__, __LINE__, ##__VA_ARGS__) @@ -34,8 +41,95 @@ struct msm_vm_unmap_op { uint64_t iova; /** @range: size of region to unmap */ uint64_t range; + /** + * @obj: backing object for pages to be unmapped + * + * Async unmap ops must hold a reference to the original GEM object + * backing the mapping that will be unmapped. This ensures that the + * pages backing the mapping are not freed before the mapping is torn + * down. + */ + struct drm_gem_object *obj; +}; + +/** + * struct msm_vma_op - A MAP or UNMAP operation + */ +struct msm_vm_op { + /** @op: The operation type */ + enum { + MSM_VM_OP_MAP, + MSM_VM_OP_UNMAP, + } op; + union { + /** @map: Parameters used if op == MSM_VMA_OP_MAP */ + struct msm_vm_map_op map; + /** @unmap: Parameters used if op == MSM_VMA_OP_UNMAP */ + struct msm_vm_unmap_op unmap; + }; + /** @node: list head in msm_vm_bind_job::vm_ops */ + struct list_head node; }; +/** + * struct msm_vm_bind_job - Tracking for a VM_BIND ioctl + * + * A table of userspace requested VM updates (MSM_VM_BIND_OP_UNMAP/MAP/MAP_NULL) + * gets applied to the vm, generating a list of VM ops (MSM_VM_OP_MAP/UNMAP) + * which are applied to the pgtables asynchronously. For example a userspace + * requested MSM_VM_BIND_OP_MAP could end up generating both an MSM_VM_OP_UNMAP + * to unmap an existing mapping, and a MSM_VM_OP_MAP to apply the new mapping. + */ +struct msm_vm_bind_job { + /** @base: base class for drm_sched jobs */ + struct drm_sched_job base; + /** @vm: The VM being operated on */ + struct drm_gpuvm *vm; + /** @fence: The fence that is signaled when job completes */ + struct dma_fence *fence; + /** @queue: The queue that the job runs on */ + struct msm_gpu_submitqueue *queue; + /** @prealloc: Tracking for pre-allocated MMU pgtable pages */ + struct msm_mmu_prealloc prealloc; + /** @vm_ops: a list of struct msm_vm_op */ + struct list_head vm_ops; + /** @bos_pinned: are the GEM objects being bound pinned? */ + bool bos_pinned; + /** @nr_ops: the number of userspace requested ops */ + unsigned int nr_ops; + /** + * @ops: the userspace requested ops + * + * The userspace requested ops are copied/parsed and validated + * before we start applying the updates to try to do as much up- + * front error checking as possible, to avoid the VM being in an + * undefined state due to partially executed VM_BIND. + * + * This table also serves to hold a reference to the backing GEM + * objects. + */ + struct msm_vm_bind_op { + uint32_t op; + uint32_t flags; + union { + struct drm_gem_object *obj; + uint32_t handle; + }; + uint64_t obj_offset; + uint64_t iova; + uint64_t range; + } ops[]; +}; + +#define job_foreach_bo(obj, _job) \ + for (unsigned i = 0; i < (_job)->nr_ops; i++) \ + if ((obj = (_job)->ops[i].obj)) + +static inline struct msm_vm_bind_job *to_msm_vm_bind_job(struct drm_sched_job *job) +{ + return container_of(job, struct msm_vm_bind_job, base); +} + static void msm_gem_vm_free(struct drm_gpuvm *gpuvm) { @@ -49,34 +143,52 @@ msm_gem_vm_free(struct drm_gpuvm *gpuvm) kfree(vm); } -static void +static int vm_unmap_op(struct msm_gem_vm *vm, const struct msm_vm_unmap_op *op) { + if (!vm->managed) + lockdep_assert_held(&vm->mmu_lock); + vm_dbg("%p: %016llx %016llx", vm, op->iova, op->iova + op->range); - vm->mmu->funcs->unmap(vm->mmu, op->iova, op->range); + return vm->mmu->funcs->unmap(vm->mmu, op->iova, op->range); } /* Actually unmap memory for the vma */ void msm_gem_vma_unmap(struct drm_gpuva *vma) { + struct msm_gem_vm *vm = to_msm_vm(vma->vm); struct msm_gem_vma *msm_vma = to_msm_vma(vma); /* Don't do anything if the memory isn't mapped */ if (!msm_vma->mapped) return; - vm_unmap_op(to_msm_vm(vma->vm), &(struct msm_vm_unmap_op){ + /* + * The mmu_lock is only needed when preallocation is used. But + * in that case we don't need to worry about recursion into + * shrinker + */ + if (!vm->managed) + mutex_lock(&vm->mmu_lock); + + vm_unmap_op(vm, &(struct msm_vm_unmap_op){ .iova = vma->va.addr, .range = vma->va.range, }); + if (!vm->managed) + mutex_unlock(&vm->mmu_lock); + msm_vma->mapped = false; } static int vm_map_op(struct msm_gem_vm *vm, const struct msm_vm_map_op *op) { + if (!vm->managed) + lockdep_assert_held(&vm->mmu_lock); + vm_dbg("%p: %016llx %016llx", vm, op->iova, op->iova + op->range); return vm->mmu->funcs->map(vm->mmu, op->iova, op->sgt, op->offset, @@ -87,6 +199,7 @@ vm_map_op(struct msm_gem_vm *vm, const struct msm_vm_map_op *op) int msm_gem_vma_map(struct drm_gpuva *vma, int prot, struct sg_table *sgt) { + struct msm_gem_vm *vm = to_msm_vm(vma->vm); struct msm_gem_vma *msm_vma = to_msm_vma(vma); int ret; @@ -98,6 +211,14 @@ msm_gem_vma_map(struct drm_gpuva *vma, int prot, struct sg_table *sgt) msm_vma->mapped = true; + /* + * The mmu_lock is only needed when preallocation is used. But + * in that case we don't need to worry about recursion into + * shrinker + */ + if (!vm->managed) + mutex_lock(&vm->mmu_lock); + /* * NOTE: iommu/io-pgtable can allocate pages, so we cannot hold * a lock across map/unmap which is also used in the job_run() @@ -107,16 +228,19 @@ msm_gem_vma_map(struct drm_gpuva *vma, int prot, struct sg_table *sgt) * Revisit this if we can come up with a scheme to pre-alloc pages * for the pgtable in map/unmap ops. */ - ret = vm_map_op(to_msm_vm(vma->vm), &(struct msm_vm_map_op){ + ret = vm_map_op(vm, &(struct msm_vm_map_op){ .iova = vma->va.addr, .range = vma->va.range, .offset = vma->gem.offset, .sgt = sgt, .prot = prot, }); - if (ret) { + + if (!vm->managed) + mutex_unlock(&vm->mmu_lock); + + if (ret) msm_vma->mapped = false; - } return ret; } @@ -131,6 +255,9 @@ void msm_gem_vma_close(struct drm_gpuva *vma) drm_gpuvm_resv_assert_held(&vm->base); + if (vma->gem.obj) + msm_gem_assert_locked(vma->gem.obj); + if (vma->va.addr && vm->managed) drm_mm_remove_node(&msm_vma->node); @@ -158,6 +285,7 @@ msm_gem_vma_new(struct drm_gpuvm *_vm, struct drm_gem_object *obj, if (vm->managed) { BUG_ON(offset != 0); + BUG_ON(!obj); /* NULL mappings not valid for kernel managed VM */ ret = drm_mm_insert_node_in_range(&vm->mm, &vma->node, obj->size, PAGE_SIZE, 0, range_start, range_end, 0); @@ -169,7 +297,8 @@ msm_gem_vma_new(struct drm_gpuvm *_vm, struct drm_gem_object *obj, range_end = range_start + obj->size; } - GEM_WARN_ON((range_end - range_start) > obj->size); + if (obj) + GEM_WARN_ON((range_end - range_start) > obj->size); drm_gpuva_init(&vma->base, range_start, range_end - range_start, obj, offset); vma->mapped = false; @@ -178,6 +307,9 @@ msm_gem_vma_new(struct drm_gpuvm *_vm, struct drm_gem_object *obj, if (ret) goto err_free_range; + if (!obj) + return &vma->base; + vm_bo = drm_gpuvm_bo_obtain(&vm->base, obj); if (IS_ERR(vm_bo)) { ret = PTR_ERR(vm_bo); @@ -199,11 +331,290 @@ msm_gem_vma_new(struct drm_gpuvm *_vm, struct drm_gem_object *obj, return ERR_PTR(ret); } +static int +msm_gem_vm_bo_validate(struct drm_gpuvm_bo *vm_bo, struct drm_exec *exec) +{ + struct drm_gem_object *obj = vm_bo->obj; + struct drm_gpuva *vma; + int ret; + + msm_gem_assert_locked(obj); + + drm_gpuvm_bo_for_each_va (vma, vm_bo) { + ret = msm_gem_pin_vma_locked(obj, vma); + if (ret) + return ret; + } + + return 0; +} + +struct op_arg { + unsigned flags; + struct msm_vm_bind_job *job; +}; + +static void +vm_op_enqueue(struct op_arg *arg, struct msm_vm_op _op) +{ + struct msm_vm_op *op = kmalloc(sizeof(*op), GFP_KERNEL); + *op = _op; + list_add_tail(&op->node, &arg->job->vm_ops); +} + +static struct drm_gpuva * +vma_from_op(struct op_arg *arg, struct drm_gpuva_op_map *op) +{ + return msm_gem_vma_new(arg->job->vm, op->gem.obj, op->gem.offset, + op->va.addr, op->va.addr + op->va.range); +} + +static void +map_vma(struct drm_gpuva *vma, void *arg) +{ + struct drm_gem_object *obj = vma->gem.obj; + struct sg_table *sgt; + unsigned prot; + + if (obj) { + sgt = to_msm_bo(obj)->sgt; + prot = msm_gem_prot(obj); + } else { + sgt = NULL; + prot = IOMMU_READ | IOMMU_WRITE; + } + + vm_op_enqueue(arg, (struct msm_vm_op){ + .op = MSM_VM_OP_MAP, + .map = { + .sgt = sgt, + .iova = vma->va.addr, + .range = vma->va.range, + .offset = vma->gem.offset, + .prot = prot, + }, + }); + + to_msm_vma(vma)->mapped = true; +} + +static int +msm_gem_vm_sm_step_map(struct drm_gpuva_op *op, void *arg) +{ + struct drm_gpuva *vma; + + vma = vma_from_op(arg, &op->map); + if (WARN_ON(IS_ERR(vma))) + return PTR_ERR(vma); + + vm_dbg("%p:%p: %016llx %016llx", vma->vm, vma, vma->va.addr, vma->va.range); + + vma->flags = ((struct op_arg *)arg)->flags; + + map_vma(vma, arg); + + return 0; +} + +static int +msm_gem_vm_sm_step_remap(struct drm_gpuva_op *op, void *arg) +{ + struct drm_gpuvm *vm = ((struct op_arg *)arg)->job->vm; + struct drm_gpuva *orig_vma = op->remap.unmap->va; + struct drm_gpuva *prev_vma = NULL, *next_vma = NULL; + uint64_t unmap_start, unmap_range; + unsigned flags; + + vm_dbg("orig_vma: %p:%p: %016llx %016llx", vm, orig_vma, orig_vma->va.addr, orig_vma->va.range); + + /* + * We are going to need at least some of the orig_vma mapped, the + * easier thing is to just map the whole thing and then unmap the + * unmap_range, instead of trying to optimize for this case + */ + if (!to_msm_vma(orig_vma)->mapped) + map_vma(orig_vma, arg); + + drm_gpuva_op_remap_to_unmap_range(&op->remap, &unmap_start, &unmap_range); + + if (orig_vma->gem.obj) + drm_gem_object_get(orig_vma->gem.obj); + + vm_op_enqueue(arg, (struct msm_vm_op){ + .op = MSM_VM_OP_UNMAP, + .unmap = { + .iova = unmap_start, + .range = unmap_range, + .obj = orig_vma->gem.obj, + }, + }); + + /* + * Part of this GEM obj is still mapped, but we're going to kill the + * existing VMA and replace it with one or two new ones (ie. two if + * the unmapped range is in the middle of the existing (unmap) VMA). + * So just set the state to unmapped: + */ + to_msm_vma(orig_vma)->mapped = false; + + /* + * The prev_vma and/or next_vma are replacing the unmapped vma, and + * therefore should preserve it's flags: + */ + flags = orig_vma->flags; + + msm_gem_vma_close(orig_vma); + + if (op->remap.prev) { + prev_vma = vma_from_op(arg, op->remap.prev); + if (WARN_ON(IS_ERR(prev_vma))) + return PTR_ERR(prev_vma); + + vm_dbg("prev_vma: %p:%p: %016llx %016llx", vm, prev_vma, prev_vma->va.addr, prev_vma->va.range); + to_msm_vma(prev_vma)->mapped = true; + prev_vma->flags = flags; + } + + if (op->remap.next) { + next_vma = vma_from_op(arg, op->remap.next); + if (WARN_ON(IS_ERR(next_vma))) + return PTR_ERR(next_vma); + + vm_dbg("next_vma: %p:%p: %016llx %016llx", vm, next_vma, next_vma->va.addr, next_vma->va.range); + to_msm_vma(next_vma)->mapped = true; + next_vma->flags = flags; + } + + return 0; +} + +static int +msm_gem_vm_sm_step_unmap(struct drm_gpuva_op *op, void *arg) +{ + struct drm_gpuva *vma = op->unmap.va; + struct msm_gem_vma *msm_vma = to_msm_vma(vma); + + vm_dbg("%p:%p: %016llx %016llx", vma->vm, vma, vma->va.addr, vma->va.range); + + if (!msm_vma->mapped) + goto out_close; + + if (vma->gem.obj) + drm_gem_object_get(vma->gem.obj); + + vm_op_enqueue(arg, (struct msm_vm_op){ + .op = MSM_VM_OP_UNMAP, + .unmap = { + .iova = vma->va.addr, + .range = vma->va.range, + .obj = vma->gem.obj, + }, + }); + + msm_vma->mapped = false; + +out_close: + msm_gem_vma_close(vma); + + return 0; +} + static const struct drm_gpuvm_ops msm_gpuvm_ops = { .vm_free = msm_gem_vm_free, + .vm_bo_validate = msm_gem_vm_bo_validate, + .sm_step_map = msm_gem_vm_sm_step_map, + .sm_step_remap = msm_gem_vm_sm_step_remap, + .sm_step_unmap = msm_gem_vm_sm_step_unmap, }; +static struct dma_fence * +msm_vma_job_run(struct drm_sched_job *_job) +{ + struct msm_vm_bind_job *job = to_msm_vm_bind_job(_job); + struct msm_gem_vm *vm = to_msm_vm(job->vm); + struct drm_gem_object *obj; + int ret = 0; + + vm_dbg(""); + + mutex_lock(&vm->mmu_lock); + vm->mmu->prealloc = &job->prealloc; + + while (!list_empty(&job->vm_ops)) { + struct msm_vm_op *op = + list_first_entry(&job->vm_ops, struct msm_vm_op, node); + + switch (op->op) { + case MSM_VM_OP_MAP: + /* + * On error, stop trying to map new things.. but we + * still want to process the unmaps (or in particular, + * the drm_gem_object_put()s) + */ + if (!ret) + ret = vm_map_op(vm, &op->map); + break; + case MSM_VM_OP_UNMAP: + if (!ret) + ret = vm_unmap_op(vm, &op->unmap); + drm_gem_object_put(op->unmap.obj); + break; + } + list_del(&op->node); + kfree(op); + } + + /* + * We failed to perform at least _some_ of the pgtable updates, so + * now the VM is in an undefined state. Game over! + */ + if (ret) + vm->unusable = true; + + vm->mmu->prealloc = NULL; + mutex_unlock(&vm->mmu_lock); + + job_foreach_bo (obj, job) { + msm_gem_lock(obj); + msm_gem_unpin_locked(obj); + msm_gem_unlock(obj); + } + + /* VM_BIND ops are synchronous, so no fence to wait on: */ + return NULL; +} + +static void +msm_vma_job_free(struct drm_sched_job *_job) +{ + struct msm_vm_bind_job *job = to_msm_vm_bind_job(_job); + struct msm_mmu *mmu = to_msm_vm(job->vm)->mmu; + struct drm_gem_object *obj; + + mmu->funcs->prealloc_cleanup(mmu, &job->prealloc); + + drm_sched_job_cleanup(_job); + + job_foreach_bo (obj, job) + drm_gem_object_put(obj); + + msm_submitqueue_put(job->queue); + dma_fence_put(job->fence); + + /* In error paths, we could have unexecuted ops: */ + while (!list_empty(&job->vm_ops)) { + struct msm_vm_op *op = + list_first_entry(&job->vm_ops, struct msm_vm_op, node); + list_del(&op->node); + kfree(op); + } + + kfree(job); +} + static const struct drm_sched_backend_ops msm_vm_bind_ops = { + .run_job = msm_vma_job_run, + .free_job = msm_vma_job_free }; /** @@ -264,6 +675,7 @@ msm_gem_vm_create(struct drm_device *drm, struct msm_mmu *mmu, const char *name, drm_gem_object_put(dummy_gem); vm->mmu = mmu; + mutex_init(&vm->mmu_lock); vm->managed = managed; drm_mm_init(&vm->mm, va_start, va_size); @@ -276,7 +688,6 @@ msm_gem_vm_create(struct drm_device *drm, struct msm_mmu *mmu, const char *name, err_free_vm: kfree(vm); return ERR_PTR(ret); - } /** @@ -292,6 +703,7 @@ msm_gem_vm_close(struct drm_gpuvm *gpuvm) { struct msm_gem_vm *vm = to_msm_vm(gpuvm); struct drm_gpuva *vma, *tmp; + struct drm_exec exec; /* * For kernel managed VMs, the VMAs are torn down when the handle is @@ -308,22 +720,557 @@ msm_gem_vm_close(struct drm_gpuvm *gpuvm) drm_sched_fini(&vm->sched); /* Tear down any remaining mappings: */ - dma_resv_lock(drm_gpuvm_resv(gpuvm), NULL); - drm_gpuvm_for_each_va_safe (vma, tmp, gpuvm) { - struct drm_gem_object *obj = vma->gem.obj; + drm_exec_init(&exec, 0, 2); + drm_exec_until_all_locked (&exec) { + drm_exec_lock_obj(&exec, drm_gpuvm_resv_obj(gpuvm)); + drm_exec_retry_on_contention(&exec); + + drm_gpuvm_for_each_va_safe (vma, tmp, gpuvm) { + struct drm_gem_object *obj = vma->gem.obj; + + /* + * MSM_BO_NO_SHARE objects share the same resv as the + * VM, in which case the obj is already locked: + */ + if (obj && (obj->resv == drm_gpuvm_resv(gpuvm))) + obj = NULL; + + if (obj) { + drm_exec_lock_obj(&exec, obj); + drm_exec_retry_on_contention(&exec); + } + + msm_gem_vma_unmap(vma); + msm_gem_vma_close(vma); + + if (obj) { + drm_exec_unlock_obj(&exec, obj); + } + } + } + drm_exec_fini(&exec); +} + + +static struct msm_vm_bind_job * +vm_bind_job_create(struct drm_device *dev, struct msm_gpu *gpu, + struct msm_gpu_submitqueue *queue, uint32_t nr_ops) +{ + struct msm_vm_bind_job *job; + uint64_t sz; + int ret; + + sz = struct_size(job, ops, nr_ops); + + if (sz > SIZE_MAX) + return ERR_PTR(-ENOMEM); + + job = kzalloc(sz, GFP_KERNEL | __GFP_NOWARN); + if (!job) + return ERR_PTR(-ENOMEM); + + ret = drm_sched_job_init(&job->base, queue->entity, 1, queue); + if (ret) { + kfree(job); + return ERR_PTR(ret); + } + + job->vm = msm_context_vm(dev, queue->ctx); + job->queue = queue; + INIT_LIST_HEAD(&job->vm_ops); + + return job; +} + +static bool invalid_alignment(uint64_t addr) +{ + /* + * Technically this is about GPU alignment, not CPU alignment. But + * I've not seen any qcom SoC where the SMMU does not support the + * CPU's smallest page size. + */ + return !PAGE_ALIGNED(addr); +} - if (obj && obj->resv != drm_gpuvm_resv(gpuvm)) { - drm_gem_object_get(obj); - msm_gem_lock(obj); +static int +lookup_op(struct msm_vm_bind_job *job, const struct drm_msm_vm_bind_op *op) +{ + struct drm_device *dev = job->vm->drm; + struct msm_mmu *mmu = to_msm_vm(job->vm)->mmu; + int i = job->nr_ops++; + int ret = 0; + + job->ops[i].op = op->op; + job->ops[i].handle = op->handle; + job->ops[i].obj_offset = op->obj_offset; + job->ops[i].iova = op->iova; + job->ops[i].range = op->range; + job->ops[i].flags = op->flags; + + if (op->flags & ~MSM_VM_BIND_OP_FLAGS) + ret = UERR(EINVAL, dev, "invalid flags: %x\n", op->flags); + + if (invalid_alignment(op->iova)) + ret = UERR(EINVAL, dev, "invalid address: %016llx\n", op->iova); + + if (invalid_alignment(op->obj_offset)) + ret = UERR(EINVAL, dev, "invalid bo_offset: %016llx\n", op->obj_offset); + + if (invalid_alignment(op->range)) + ret = UERR(EINVAL, dev, "invalid range: %016llx\n", op->range); + + + /* + * MAP must specify a valid handle. But the handle MBZ for + * UNMAP or MAP_NULL. + */ + if (op->op == MSM_VM_BIND_OP_MAP) { + if (!op->handle) + ret = UERR(EINVAL, dev, "invalid handle\n"); + } else if (op->handle) { + ret = UERR(EINVAL, dev, "handle must be zero\n"); + } + + switch (op->op) { + case MSM_VM_BIND_OP_MAP: + case MSM_VM_BIND_OP_MAP_NULL: + mmu->funcs->prealloc_count(mmu, &job->prealloc, op->iova, op->range); + break; + case MSM_VM_BIND_OP_UNMAP: + break; + default: + ret = UERR(EINVAL, dev, "invalid op: %u\n", op->op); + break; + } + + return ret; +} + +/* + * ioctl parsing, parameter validation, and GEM handle lookup + */ +static int +vm_bind_job_lookup_ops(struct msm_vm_bind_job *job, struct drm_msm_vm_bind *args, + struct drm_file *file, int *nr_bos) +{ + struct drm_device *dev = job->vm->drm; + int ret = 0; + int cnt = 0; + + if (args->nr_ops == 1) { + /* Single op case, the op is inlined: */ + ret = lookup_op(job, &args->op); + } else { + for (unsigned i = 0; i < args->nr_ops; i++) { + struct drm_msm_vm_bind_op op; + void __user *userptr = + u64_to_user_ptr(args->ops + (i * sizeof(op))); + + /* make sure we don't have garbage flags, in case we hit + * error path before flags is initialized: + */ + job->ops[i].flags = 0; + + if (copy_from_user(&op, userptr, sizeof(op))) { + ret = -EFAULT; + break; + } + + ret = lookup_op(job, &op); + if (ret) + break; } + } + + if (ret) { + job->nr_ops = 0; + goto out; + } + + spin_lock(&file->table_lock); + + for (unsigned i = 0; i < args->nr_ops; i++) { + struct drm_gem_object *obj; - msm_gem_vma_unmap(vma); - msm_gem_vma_close(vma); + if (!job->ops[i].handle) { + job->ops[i].obj = NULL; + continue; + } - if (obj && obj->resv != drm_gpuvm_resv(gpuvm)) { - msm_gem_unlock(obj); - drm_gem_object_put(obj); + /* + * normally use drm_gem_object_lookup(), but for bulk lookup + * all under single table_lock just hit object_idr directly: + */ + obj = idr_find(&file->object_idr, job->ops[i].handle); + if (!obj) { + ret = UERR(EINVAL, dev, "invalid handle %u at index %u\n", job->ops[i].handle, i); + goto out_unlock; } + + drm_gem_object_get(obj); + + job->ops[i].obj = obj; + cnt++; } - dma_resv_unlock(drm_gpuvm_resv(gpuvm)); + + *nr_bos = cnt; + +out_unlock: + spin_unlock(&file->table_lock); + +out: + return ret; +} + +/* + * Lock VM and GEM objects + */ +static int +vm_bind_job_lock_objects(struct msm_vm_bind_job *job, struct drm_exec *exec) +{ + struct drm_gem_object *obj; + int ret; + + /* Lock VM and objects: */ + drm_exec_until_all_locked(exec) { + ret = drm_exec_lock_obj(exec, drm_gpuvm_resv_obj(job->vm)); + drm_exec_retry_on_contention(exec); + if (ret) + return ret; + + job_foreach_bo (obj, job) { + ret = drm_exec_prepare_obj(exec, obj, 1); + drm_exec_retry_on_contention(exec); + if (ret) + return ret; + } + } + + return 0; +} + +/* + * Pin GEM objects, ensuring that we have backing pages. Pinning will move + * the object to the pinned LRU so that the shrinker knows to first consider + * other objects for evicting. + */ +static int +vm_bind_job_pin_objects(struct msm_vm_bind_job *job) +{ + struct drm_gem_object *obj; + + /* + * First loop, before holding the LRU lock, avoids holding the + * LRU lock while calling msm_gem_pin_vma_locked (which could + * trigger get_pages()) + */ + job_foreach_bo (obj, job) { + struct page **pages; + + pages = msm_gem_get_pages_locked(obj, MSM_MADV_WILLNEED); + if (IS_ERR(pages)) + return PTR_ERR(pages); + } + + struct msm_drm_private *priv = job->vm->drm->dev_private; + + /* + * A second loop while holding the LRU lock (a) avoids acquiring/dropping + * the LRU lock for each individual bo, while (b) avoiding holding the + * LRU lock while calling msm_gem_pin_vma_locked() (which could trigger + * get_pages() which could trigger reclaim.. and if we held the LRU lock + * could trigger deadlock with the shrinker). + */ + mutex_lock(&priv->lru.lock); + job_foreach_bo (obj, job) + msm_gem_pin_obj_locked(obj); + mutex_unlock(&priv->lru.lock); + + job->bos_pinned = true; + + return 0; +} + +/* + * Unpin GEM objects. Normally this is done after the bind job is run. + */ +static void +vm_bind_job_unpin_objects(struct msm_vm_bind_job *job) +{ + struct drm_gem_object *obj; + + if (!job->bos_pinned) + return; + + job_foreach_bo (obj, job) + msm_gem_unpin_locked(obj); + + job->bos_pinned = false; +} + +/* + * Pre-allocate pgtable memory, and translate the VM bind requests into a + * sequence of pgtable updates to be applied asynchronously. + */ +static int +vm_bind_job_prepare(struct msm_vm_bind_job *job) +{ + struct msm_gem_vm *vm = to_msm_vm(job->vm); + struct msm_mmu *mmu = vm->mmu; + int ret; + + mmu->funcs->prealloc_allocate(mmu, &job->prealloc); + + for (unsigned i = 0; i < job->nr_ops; i++) { + const struct msm_vm_bind_op *op = &job->ops[i]; + struct op_arg arg = { + .job = job, + }; + + switch (op->op) { + case MSM_VM_BIND_OP_UNMAP: + ret = drm_gpuvm_sm_unmap(job->vm, &arg, op->iova, + op->obj_offset); + break; + case MSM_VM_BIND_OP_MAP: + if (op->flags & MSM_VM_BIND_OP_DUMP) + arg.flags |= MSM_VMA_DUMP; + fallthrough; + case MSM_VM_BIND_OP_MAP_NULL: + ret = drm_gpuvm_sm_map(job->vm, &arg, op->iova, + op->range, op->obj, op->obj_offset); + break; + default: + /* + * lookup_op() should have already thrown an error for + * invalid ops + */ + BUG_ON("unreachable"); + } + + if (ret) { + /* + * If we've already started modifying the vm, we can't + * adequetly describe to userspace the intermediate + * state the vm is in. So throw up our hands! + */ + if (i > 0) + vm->unusable = true; + return ret; + } + } + + return 0; +} + +/* + * Attach fences to the GEM objects being bound. This will signify to + * the shrinker that they are busy even after dropping the locks (ie. + * drm_exec_fini()) + */ +static void +vm_bind_job_attach_fences(struct msm_vm_bind_job *job) +{ + for (unsigned i = 0; i < job->nr_ops; i++) { + struct drm_gem_object *obj = job->ops[i].obj; + + if (!obj) + continue; + + dma_resv_add_fence(obj->resv, job->fence, + DMA_RESV_USAGE_KERNEL); + } +} + +int +msm_ioctl_vm_bind(struct drm_device *dev, void *data, struct drm_file *file) +{ + struct msm_drm_private *priv = dev->dev_private; + struct drm_msm_vm_bind *args = data; + struct msm_context *ctx = file->driver_priv; + struct msm_vm_bind_job *job = NULL; + struct msm_gpu *gpu = priv->gpu; + struct msm_gpu_submitqueue *queue; + struct msm_syncobj_post_dep *post_deps = NULL; + struct drm_syncobj **syncobjs_to_reset = NULL; + struct dma_fence *fence; + int out_fence_fd = -1; + int ret, nr_bos = 0; + unsigned i; + + if (!gpu) + return -ENXIO; + + /* + * Maybe we could allow just UNMAP ops? OTOH userspace should just + * immediately close the device file and all will be torn down. + */ + if (to_msm_vm(ctx->vm)->unusable) + return UERR(EPIPE, dev, "context is unusable"); + + /* + * Technically, you cannot create a VM_BIND submitqueue in the first + * place, if you haven't opted in to VM_BIND context. But it is + * cleaner / less confusing, to check this case directly. + */ + if (!msm_context_is_vmbind(ctx)) + return UERR(EINVAL, dev, "context does not support vmbind"); + + if (args->flags & ~MSM_VM_BIND_FLAGS) + return UERR(EINVAL, dev, "invalid flags"); + + queue = msm_submitqueue_get(ctx, args->queue_id); + if (!queue) + return -ENOENT; + + if (!(queue->flags & MSM_SUBMITQUEUE_VM_BIND)) { + ret = UERR(EINVAL, dev, "Invalid queue type"); + goto out_post_unlock; + } + + if (args->flags & MSM_VM_BIND_FENCE_FD_OUT) { + out_fence_fd = get_unused_fd_flags(O_CLOEXEC); + if (out_fence_fd < 0) { + ret = out_fence_fd; + goto out_post_unlock; + } + } + + job = vm_bind_job_create(dev, gpu, queue, args->nr_ops); + if (IS_ERR(job)) { + ret = PTR_ERR(job); + goto out_post_unlock; + } + + ret = mutex_lock_interruptible(&queue->lock); + if (ret) + goto out_post_unlock; + + if (args->flags & MSM_VM_BIND_FENCE_FD_IN) { + struct dma_fence *in_fence; + + in_fence = sync_file_get_fence(args->fence_fd); + + if (!in_fence) { + ret = UERR(EINVAL, dev, "invalid in-fence"); + goto out_unlock; + } + + ret = drm_sched_job_add_dependency(&job->base, in_fence); + if (ret) + goto out_unlock; + } + + if (args->in_syncobjs > 0) { + syncobjs_to_reset = msm_syncobj_parse_deps(dev, &job->base, + file, args->in_syncobjs, + args->nr_in_syncobjs, + args->syncobj_stride); + if (IS_ERR(syncobjs_to_reset)) { + ret = PTR_ERR(syncobjs_to_reset); + goto out_unlock; + } + } + + if (args->out_syncobjs > 0) { + post_deps = msm_syncobj_parse_post_deps(dev, file, + args->out_syncobjs, + args->nr_out_syncobjs, + args->syncobj_stride); + if (IS_ERR(post_deps)) { + ret = PTR_ERR(post_deps); + goto out_unlock; + } + } + + ret = vm_bind_job_lookup_ops(job, args, file, &nr_bos); + if (ret) + goto out_unlock; + + struct drm_exec exec; + unsigned flags = DRM_EXEC_IGNORE_DUPLICATES | DRM_EXEC_INTERRUPTIBLE_WAIT; + drm_exec_init(&exec, flags, nr_bos + 1); + + ret = vm_bind_job_lock_objects(job, &exec); + if (ret) + goto out; + + ret = vm_bind_job_pin_objects(job); + if (ret) + goto out; + + ret = vm_bind_job_prepare(job); + if (ret) + goto out; + + drm_sched_job_arm(&job->base); + + job->fence = dma_fence_get(&job->base.s_fence->finished); + + if (ret == 0 && args->flags & MSM_VM_BIND_FENCE_FD_OUT) { + struct sync_file *sync_file = sync_file_create(job->fence); + if (!sync_file) { + ret = -ENOMEM; + } else { + fd_install(out_fence_fd, sync_file->file); + args->fence_fd = out_fence_fd; + } + } + + if (ret) + goto out; + + vm_bind_job_attach_fences(job); + + /* + * The job can be free'd (and fence unref'd) at any point after + * drm_sched_entity_push_job(), so we need to hold our own ref + */ + fence = dma_fence_get(job->fence); + + drm_sched_entity_push_job(&job->base); + + msm_syncobj_reset(syncobjs_to_reset, args->nr_in_syncobjs); + msm_syncobj_process_post_deps(post_deps, args->nr_out_syncobjs, fence); + + dma_fence_put(fence); + +out: + if (ret) + vm_bind_job_unpin_objects(job); + drm_exec_fini(&exec); +out_unlock: + mutex_unlock(&queue->lock); +out_post_unlock: + if (ret && (out_fence_fd >= 0)) + put_unused_fd(out_fence_fd); + + if (!IS_ERR_OR_NULL(job)) { + if (ret) + msm_vma_job_free(&job->base); + } else { + /* + * If the submit hasn't yet taken ownership of the queue + * then we need to drop the reference ourself: + */ + msm_submitqueue_put(queue); + } + + if (!IS_ERR_OR_NULL(post_deps)) { + for (i = 0; i < args->nr_out_syncobjs; ++i) { + kfree(post_deps[i].chain); + drm_syncobj_put(post_deps[i].syncobj); + } + kfree(post_deps); + } + + if (!IS_ERR_OR_NULL(syncobjs_to_reset)) { + for (i = 0; i < args->nr_in_syncobjs; ++i) { + if (syncobjs_to_reset[i]) + drm_syncobj_put(syncobjs_to_reset[i]); + } + kfree(syncobjs_to_reset); + } + + return ret; } diff --git a/include/uapi/drm/msm_drm.h b/include/uapi/drm/msm_drm.h index 6d6cd1219926..5c67294edc95 100644 --- a/include/uapi/drm/msm_drm.h +++ b/include/uapi/drm/msm_drm.h @@ -272,7 +272,10 @@ struct drm_msm_gem_submit_cmd { __u32 size; /* in, cmdstream size */ __u32 pad; __u32 nr_relocs; /* in, number of submit_reloc's */ - __u64 relocs; /* in, ptr to array of submit_reloc's */ + union { + __u64 relocs; /* in, ptr to array of submit_reloc's */ + __u64 iova; /* cmdstream address (for VM_BIND contexts) */ + }; }; /* Each buffer referenced elsewhere in the cmdstream submit (ie. the @@ -339,7 +342,74 @@ struct drm_msm_gem_submit { __u32 nr_out_syncobjs; /* in, number of entries in out_syncobj. */ __u32 syncobj_stride; /* in, stride of syncobj arrays. */ __u32 pad; /*in, reserved for future use, always 0. */ +}; + +#define MSM_VM_BIND_OP_UNMAP 0 +#define MSM_VM_BIND_OP_MAP 1 +#define MSM_VM_BIND_OP_MAP_NULL 2 + +#define MSM_VM_BIND_OP_DUMP 1 +#define MSM_VM_BIND_OP_FLAGS ( \ + MSM_VM_BIND_OP_DUMP | \ + 0) +/** + * struct drm_msm_vm_bind_op - bind/unbind op to run + */ +struct drm_msm_vm_bind_op { + /** @op: one of MSM_VM_BIND_OP_x */ + __u32 op; + /** @handle: GEM object handle, MBZ for UNMAP or MAP_NULL */ + __u32 handle; + /** @obj_offset: Offset into GEM object, MBZ for UNMAP or MAP_NULL */ + __u64 obj_offset; + /** @iova: Address to operate on */ + __u64 iova; + /** @range: Number of bites to to map/unmap */ + __u64 range; + /** @flags: Bitmask of MSM_VM_BIND_OP_FLAG_x */ + __u32 flags; + /** @pad: MBZ */ + __u32 pad; +}; + +#define MSM_VM_BIND_FENCE_FD_IN 0x00000001 +#define MSM_VM_BIND_FENCE_FD_OUT 0x00000002 +#define MSM_VM_BIND_FLAGS ( \ + MSM_VM_BIND_FENCE_FD_IN | \ + MSM_VM_BIND_FENCE_FD_OUT | \ + 0) + +/** + * struct drm_msm_vm_bind - Input of &DRM_IOCTL_MSM_VM_BIND + */ +struct drm_msm_vm_bind { + /** @flags: in, bitmask of MSM_VM_BIND_x */ + __u32 flags; + /** @nr_ops: the number of bind ops in this ioctl */ + __u32 nr_ops; + /** @fence_fd: in/out fence fd (see MSM_VM_BIND_FENCE_FD_IN/OUT) */ + __s32 fence_fd; + /** @queue_id: in, submitqueue id */ + __u32 queue_id; + /** @in_syncobjs: in, ptr to array of drm_msm_gem_syncobj */ + __u64 in_syncobjs; + /** @out_syncobjs: in, ptr to array of drm_msm_gem_syncobj */ + __u64 out_syncobjs; + /** @nr_in_syncobjs: in, number of entries in in_syncobj */ + __u32 nr_in_syncobjs; + /** @nr_out_syncobjs: in, number of entries in out_syncobj */ + __u32 nr_out_syncobjs; + /** @syncobj_stride: in, stride of syncobj arrays */ + __u32 syncobj_stride; + /** @op_stride: sizeof each struct drm_msm_vm_bind_op in @ops */ + __u32 op_stride; + union { + /** @op: used if num_ops == 1 */ + struct drm_msm_vm_bind_op op; + /** @ops: userptr to array of drm_msm_vm_bind_op if num_ops > 1 */ + __u64 ops; + }; }; #define MSM_WAIT_FENCE_BOOST 0x00000001 @@ -435,6 +505,7 @@ struct drm_msm_submitqueue_query { #define DRM_MSM_SUBMITQUEUE_NEW 0x0A #define DRM_MSM_SUBMITQUEUE_CLOSE 0x0B #define DRM_MSM_SUBMITQUEUE_QUERY 0x0C +#define DRM_MSM_VM_BIND 0x0D #define DRM_IOCTL_MSM_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GET_PARAM, struct drm_msm_param) #define DRM_IOCTL_MSM_SET_PARAM DRM_IOW (DRM_COMMAND_BASE + DRM_MSM_SET_PARAM, struct drm_msm_param) @@ -448,6 +519,7 @@ struct drm_msm_submitqueue_query { #define DRM_IOCTL_MSM_SUBMITQUEUE_NEW DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_NEW, struct drm_msm_submitqueue) #define DRM_IOCTL_MSM_SUBMITQUEUE_CLOSE DRM_IOW (DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_CLOSE, __u32) #define DRM_IOCTL_MSM_SUBMITQUEUE_QUERY DRM_IOW (DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_QUERY, struct drm_msm_submitqueue_query) +#define DRM_IOCTL_MSM_VM_BIND DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_VM_BIND, struct drm_msm_vm_bind) #if defined(__cplusplus) }