From patchwork Mon May 19 17:57:11 2025
X-Patchwork-Submitter: Rob Clark
X-Patchwork-Id: 891213
From: Rob Clark
To: dri-devel@lists.freedesktop.org
Cc: freedreno@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, Connor Abbott, Rob Clark, Rob Clark, Abhinav Kumar, Dmitry Baryshkov, Sean Paul, Marijn Suijten, David Airlie, Simona Vetter, Sumit Semwal, Christian König, linux-kernel@vger.kernel.org (open list), linux-media@vger.kernel.org (open list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b), linaro-mm-sig@lists.linaro.org (moderated list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b)
Subject: [PATCH v5 14/40] drm/msm: Convert vm locking
Date: Mon, 19 May 2025 10:57:11 -0700
Message-ID: <20250519175755.13037-2-robdclark@gmail.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250519175755.13037-1-robdclark@gmail.com>
References: <20250519175348.11924-1-robdclark@gmail.com> <20250519175755.13037-1-robdclark@gmail.com>

From: Rob Clark

Convert to using the gpuvm's r_obj for serializing access to the VM.  This way we can use the drm_exec helper for dealing with deadlock detection and backoff.

This will let us deal with upcoming locking order conflicts with the VM_BIND implementation (ie. in some scenarios we need to acquire the obj lock first, for example to iterate all the VMs an obj is bound in, and in other scenarios we need to acquire the VM lock first).
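For reference, the calling convention this conversion introduces boils down to the sketch below. It condenses the msm_gem_lock_vm_and_obj() helper added in msm_gem.h and its callers in the diff that follows; it is an illustration of the pattern, not additional code from the patch:

#include <drm/drm_exec.h>

#include "msm_gem.h"	/* msm_gem_lock_vm_and_obj() and struct msm_gem_vm, as added/updated below */

/* Illustrative only: how a VM-modifying path now takes both locks. */
static int vm_update_sketch(struct drm_gem_object *obj, struct msm_gem_vm *vm)
{
	struct drm_exec exec;
	int ret;

	/*
	 * Locks the VM's resv object and, if it is distinct, the BO's resv,
	 * using drm_exec for ww-mutex backoff/retry on contention.
	 */
	ret = msm_gem_lock_vm_and_obj(&exec, obj, vm);
	if (!ret) {
		/* ... create/remove VMAs, update the drm_gpuvm, etc ... */
	}
	drm_exec_fini(&exec);	/* drop both locks */

	return ret;
}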
Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/msm_gem.c | 35 ++++++++--- drivers/gpu/drm/msm/msm_gem.h | 37 ++++++++++-- drivers/gpu/drm/msm/msm_gem_shrinker.c | 80 +++++++++++++++++++++++--- drivers/gpu/drm/msm/msm_gem_submit.c | 9 ++- drivers/gpu/drm/msm/msm_gem_vma.c | 27 ++++----- 5 files changed, 150 insertions(+), 38 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index 3b7db3b3f763..b7055805a5dd 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -52,6 +52,7 @@ static void put_iova_spaces(struct drm_gem_object *obj, struct drm_gpuvm *vm, bo static void msm_gem_close(struct drm_gem_object *obj, struct drm_file *file) { struct msm_context *ctx = file->driver_priv; + struct drm_exec exec; update_ctx_mem(file, -obj->size); @@ -70,9 +71,9 @@ static void msm_gem_close(struct drm_gem_object *obj, struct drm_file *file) dma_resv_wait_timeout(obj->resv, DMA_RESV_USAGE_READ, false, msecs_to_jiffies(1000)); - msm_gem_lock(obj); + msm_gem_lock_vm_and_obj(&exec, obj, ctx->vm); put_iova_spaces(obj, &ctx->vm->base, true); - msm_gem_unlock(obj); + drm_exec_fini(&exec); /* drop locks */ } /* @@ -538,11 +539,12 @@ int msm_gem_get_and_pin_iova_range(struct drm_gem_object *obj, struct msm_gem_vm *vm, uint64_t *iova, u64 range_start, u64 range_end) { + struct drm_exec exec; int ret; - msm_gem_lock(obj); + msm_gem_lock_vm_and_obj(&exec, obj, vm); ret = get_and_pin_iova_range_locked(obj, vm, iova, range_start, range_end); - msm_gem_unlock(obj); + drm_exec_fini(&exec); /* drop locks */ return ret; } @@ -562,16 +564,17 @@ int msm_gem_get_iova(struct drm_gem_object *obj, struct msm_gem_vm *vm, uint64_t *iova) { struct msm_gem_vma *vma; + struct drm_exec exec; int ret = 0; - msm_gem_lock(obj); + msm_gem_lock_vm_and_obj(&exec, obj, vm); vma = get_vma_locked(obj, vm, 0, U64_MAX); if (IS_ERR(vma)) { ret = PTR_ERR(vma); } else { *iova = vma->base.va.addr; } - msm_gem_unlock(obj); + drm_exec_fini(&exec); /* drop locks */ return ret; } @@ -600,9 +603,10 @@ static int clear_iova(struct drm_gem_object *obj, int msm_gem_set_iova(struct drm_gem_object *obj, struct msm_gem_vm *vm, uint64_t iova) { + struct drm_exec exec; int ret = 0; - msm_gem_lock(obj); + msm_gem_lock_vm_and_obj(&exec, obj, vm); if (!iova) { ret = clear_iova(obj, vm); } else { @@ -615,7 +619,7 @@ int msm_gem_set_iova(struct drm_gem_object *obj, ret = -EBUSY; } } - msm_gem_unlock(obj); + drm_exec_fini(&exec); /* drop locks */ return ret; } @@ -1007,12 +1011,27 @@ static void msm_gem_free_object(struct drm_gem_object *obj) struct msm_gem_object *msm_obj = to_msm_bo(obj); struct drm_device *dev = obj->dev; struct msm_drm_private *priv = dev->dev_private; + struct drm_exec exec; mutex_lock(&priv->obj_lock); list_del(&msm_obj->node); mutex_unlock(&priv->obj_lock); + /* + * We need to lock any VMs the object is still attached to, but not + * the object itself (see explaination in msm_gem_assert_locked()), + * so just open-code this special case: + */ + drm_exec_init(&exec, 0, 0); + drm_exec_until_all_locked (&exec) { + struct drm_gpuvm_bo *vm_bo; + drm_gem_for_each_gpuvm_bo (vm_bo, obj) { + drm_exec_lock_obj(&exec, drm_gpuvm_resv_obj(vm_bo->vm)); + drm_exec_retry_on_contention(&exec); + } + } put_iova_spaces(obj, NULL, true); + drm_exec_fini(&exec); /* drop locks */ if (obj->import_attach) { GEM_WARN_ON(msm_obj->vaddr); diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h index f7f7e7910754..36a846e9b943 100644 --- a/drivers/gpu/drm/msm/msm_gem.h +++ 
b/drivers/gpu/drm/msm/msm_gem.h @@ -62,12 +62,6 @@ struct msm_gem_vm { */ struct drm_mm mm; - /** @mm_lock: protects @mm node allocation/removal */ - struct spinlock mm_lock; - - /** @vm_lock: protects gpuvm insert/remove/traverse */ - struct mutex vm_lock; - /** @mmu: The mmu object which manages the pgtables */ struct msm_mmu *mmu; @@ -246,6 +240,37 @@ msm_gem_unlock(struct drm_gem_object *obj) dma_resv_unlock(obj->resv); } +/** + * msm_gem_lock_vm_and_obj() - Helper to lock an obj + VM + * @exec: the exec context helper which will be initalized + * @obj: the GEM object to lock + * @vm: the VM to lock + * + * Operations which modify a VM frequently need to lock both the VM and + * the object being mapped/unmapped/etc. This helper uses drm_exec to + * acquire both locks, dealing with potential deadlock/backoff scenarios + * which arise when multiple locks are involved. + */ +static inline int +msm_gem_lock_vm_and_obj(struct drm_exec *exec, + struct drm_gem_object *obj, + struct msm_gem_vm *vm) +{ + int ret = 0; + + drm_exec_init(exec, 0, 2); + drm_exec_until_all_locked (exec) { + ret = drm_exec_lock_obj(exec, drm_gpuvm_resv_obj(&vm->base)); + if (!ret && (obj->resv != drm_gpuvm_resv(&vm->base))) + ret = drm_exec_lock_obj(exec, obj); + drm_exec_retry_on_contention(exec); + if (GEM_WARN_ON(ret)) + break; + } + + return ret; +} + static inline void msm_gem_assert_locked(struct drm_gem_object *obj) { diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c index de185fc34084..5faf6227584a 100644 --- a/drivers/gpu/drm/msm/msm_gem_shrinker.c +++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c @@ -43,6 +43,75 @@ msm_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc) return count; } +static bool +with_vm_locks(struct ww_acquire_ctx *ticket, + void (*fn)(struct drm_gem_object *obj), + struct drm_gem_object *obj) +{ + /* + * Track last locked entry for for unwinding locks in error and + * success paths + */ + struct drm_gpuvm_bo *vm_bo, *last_locked = NULL; + int ret = 0; + + drm_gem_for_each_gpuvm_bo (vm_bo, obj) { + struct dma_resv *resv = drm_gpuvm_resv(vm_bo->vm); + + if (resv == obj->resv) + continue; + + ret = dma_resv_lock(resv, ticket); + + /* + * Since we already skip the case when the VM and obj + * share a resv (ie. _NO_SHARE objs), we don't expect + * to hit a double-locking scenario... which the lock + * unwinding cannot really cope with. + */ + WARN_ON(ret == -EALREADY); + + /* + * Don't bother with slow-lock / backoff / retry sequence, + * if we can't get the lock just give up and move on to + * the next object. 
+ */ + if (ret) + goto out_unlock; + + /* + * Hold a ref to prevent the vm_bo from being freed + * and removed from the obj's gpuva list, as that would + * would result in missing the unlock below + */ + drm_gpuvm_bo_get(vm_bo); + + last_locked = vm_bo; + } + + fn(obj); + +out_unlock: + if (last_locked) { + drm_gem_for_each_gpuvm_bo (vm_bo, obj) { + struct dma_resv *resv = drm_gpuvm_resv(vm_bo->vm); + + if (resv == obj->resv) + continue; + + dma_resv_unlock(resv); + + /* Drop the ref taken while locking: */ + drm_gpuvm_bo_put(vm_bo); + + if (last_locked == vm_bo) + break; + } + } + + return ret == 0; +} + static bool purge(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket) { @@ -52,9 +121,7 @@ purge(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket) if (msm_gem_active(obj)) return false; - msm_gem_purge(obj); - - return true; + return with_vm_locks(ticket, msm_gem_purge, obj); } static bool @@ -66,9 +133,7 @@ evict(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket) if (msm_gem_active(obj)) return false; - msm_gem_evict(obj); - - return true; + return with_vm_locks(ticket, msm_gem_evict, obj); } static bool @@ -100,6 +165,7 @@ static unsigned long msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc) { struct msm_drm_private *priv = shrinker->private_data; + struct ww_acquire_ctx ticket; struct { struct drm_gem_lru *lru; bool (*shrink)(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket); @@ -124,7 +190,7 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc) drm_gem_lru_scan(stages[i].lru, nr, &stages[i].remaining, stages[i].shrink, - NULL); + &ticket); nr -= stages[i].freed; freed += stages[i].freed; remaining += stages[i].remaining; diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 86791a854c42..6924d03026ba 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -256,11 +256,18 @@ static int submit_lookup_cmds(struct msm_gem_submit *submit, /* This is where we make sure all the bo's are reserved and pin'd: */ static int submit_lock_objects(struct msm_gem_submit *submit) { + unsigned flags = DRM_EXEC_IGNORE_DUPLICATES | DRM_EXEC_INTERRUPTIBLE_WAIT; int ret; - drm_exec_init(&submit->exec, DRM_EXEC_INTERRUPTIBLE_WAIT, submit->nr_bos); +// TODO need to add vm_bind path which locks vm resv + external objs + drm_exec_init(&submit->exec, flags, submit->nr_bos); drm_exec_until_all_locked (&submit->exec) { + ret = drm_exec_lock_obj(&submit->exec, + drm_gpuvm_resv_obj(&submit->vm->base)); + drm_exec_retry_on_contention(&submit->exec); + if (ret) + goto error; for (unsigned i = 0; i < submit->nr_bos; i++) { struct drm_gem_object *obj = submit->bos[i].obj; ret = drm_exec_prepare_obj(&submit->exec, obj, 1); diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c index d1621761ef36..e294e7f6e723 100644 --- a/drivers/gpu/drm/msm/msm_gem_vma.c +++ b/drivers/gpu/drm/msm/msm_gem_vma.c @@ -92,15 +92,13 @@ void msm_gem_vma_close(struct msm_gem_vma *vma) GEM_WARN_ON(vma->mapped); - spin_lock(&vm->mm_lock); + drm_gpuvm_resv_assert_held(&vm->base); + if (vma->base.va.addr) drm_mm_remove_node(&vma->node); - spin_unlock(&vm->mm_lock); - mutex_lock(&vm->vm_lock); drm_gpuva_remove(&vma->base); drm_gpuva_unlink(&vma->base); - mutex_unlock(&vm->vm_lock); kfree(vma); } @@ -114,16 +112,16 @@ msm_gem_vma_new(struct msm_gem_vm *vm, struct drm_gem_object *obj, struct msm_gem_vma *vma; int ret; + drm_gpuvm_resv_assert_held(&vm->base); + 
vma = kzalloc(sizeof(*vma), GFP_KERNEL); if (!vma) return ERR_PTR(-ENOMEM); if (vm->managed) { - spin_lock(&vm->mm_lock); ret = drm_mm_insert_node_in_range(&vm->mm, &vma->node, obj->size, PAGE_SIZE, 0, range_start, range_end, 0); - spin_unlock(&vm->mm_lock); if (ret) goto err_free_vma; @@ -137,9 +135,7 @@ msm_gem_vma_new(struct msm_gem_vm *vm, struct drm_gem_object *obj, drm_gpuva_init(&vma->base, range_start, range_end - range_start, obj, 0); vma->mapped = false; - mutex_lock(&vm->vm_lock); ret = drm_gpuva_insert(&vm->base, &vma->base); - mutex_unlock(&vm->vm_lock); if (ret) goto err_free_range; @@ -149,18 +145,14 @@ msm_gem_vma_new(struct msm_gem_vm *vm, struct drm_gem_object *obj, goto err_va_remove; } - mutex_lock(&vm->vm_lock); drm_gpuvm_bo_extobj_add(vm_bo); drm_gpuva_link(&vma->base, vm_bo); - mutex_unlock(&vm->vm_lock); GEM_WARN_ON(drm_gpuvm_bo_put(vm_bo)); return vma; err_va_remove: - mutex_lock(&vm->vm_lock); drm_gpuva_remove(&vma->base); - mutex_unlock(&vm->vm_lock); err_free_range: if (vm->managed) drm_mm_remove_node(&vma->node); @@ -191,7 +183,13 @@ struct msm_gem_vm * msm_gem_vm_create(struct drm_device *drm, struct msm_mmu *mmu, const char *name, u64 va_start, u64 va_size, bool managed) { - enum drm_gpuvm_flags flags = managed ? DRM_GPUVM_VA_WEAK_REF : 0; + /* + * We mostly want to use DRM_GPUVM_RESV_PROTECTED, except that + * makes drm_gpuvm_bo_evict() a no-op for extobjs (ie. we lose + * tracking that an extobj is evicted) :facepalm: + */ + enum drm_gpuvm_flags flags = + (managed ? DRM_GPUVM_VA_WEAK_REF : 0); struct msm_gem_vm *vm; struct drm_gem_object *dummy_gem; int ret = 0; @@ -213,9 +211,6 @@ msm_gem_vm_create(struct drm_device *drm, struct msm_mmu *mmu, const char *name, va_start, va_size, 0, 0, &msm_gpuvm_ops); drm_gem_object_put(dummy_gem); - spin_lock_init(&vm->mm_lock); - mutex_init(&vm->vm_lock); - vm->mmu = mmu; vm->managed = managed;
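The opposite lock ordering mentioned in the commit message (object first, then every VM it is bound in) shows up in the msm_gem_free_object() hunk above. Condensed to its core, and again purely as an illustration rather than code from the patch, the shape is:

#include <drm/drm_exec.h>
#include <drm/drm_gem.h>
#include <drm/drm_gpuvm.h>

/* Illustrative only: take the resv of every VM an object is still bound in. */
static void lock_all_vms_sketch(struct drm_exec *exec, struct drm_gem_object *obj)
{
	drm_exec_init(exec, 0, 0);
	drm_exec_until_all_locked(exec) {
		struct drm_gpuvm_bo *vm_bo;

		/* Walk the object's gpuvm_bo list; restart from scratch on contention */
		drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
			drm_exec_lock_obj(exec, drm_gpuvm_resv_obj(vm_bo->vm));
			drm_exec_retry_on_contention(exec);
		}
	}
	/* ... tear down the iova spaces ... then drm_exec_fini(exec) to unlock */
}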
From patchwork Mon May 19 17:57:26 2025
X-Patchwork-Submitter: Rob Clark
X-Patchwork-Id: 891212
From: Rob Clark
To: dri-devel@lists.freedesktop.org
Cc: freedreno@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, Connor Abbott, Rob Clark, Rob Clark, Abhinav Kumar, Dmitry Baryshkov, Sean Paul, Marijn Suijten, David Airlie, Simona Vetter, Konrad Dybcio, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Sumit Semwal, Christian König, linux-kernel@vger.kernel.org (open list), linux-media@vger.kernel.org (open list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b), linaro-mm-sig@lists.linaro.org (moderated list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b)
Subject: [PATCH v5 29/40] drm/msm: Extract out syncobj helpers
Date: Mon, 19 May 2025 10:57:26 -0700
Message-ID: <20250519175755.13037-17-robdclark@gmail.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250519175755.13037-1-robdclark@gmail.com>
References: <20250519175348.11924-1-robdclark@gmail.com> <20250519175755.13037-1-robdclark@gmail.com>

From: Rob Clark

We'll be re-using these for the VM_BIND ioctl.  Also, rename a few things in the uapi header to reflect that syncobj use is not specific to the submit ioctl.

Signed-off-by: Rob Clark
--- drivers/gpu/drm/msm/Makefile | 1 + drivers/gpu/drm/msm/msm_gem_submit.c | 192 ++------------------------- drivers/gpu/drm/msm/msm_syncobj.c | 172 ++++++++++++++++++++++++ drivers/gpu/drm/msm/msm_syncobj.h | 37 ++++++ include/uapi/drm/msm_drm.h | 26 ++-- 5 files changed, 235 insertions(+), 193 deletions(-) create mode 100644 drivers/gpu/drm/msm/msm_syncobj.c create mode 100644 drivers/gpu/drm/msm/msm_syncobj.h diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile index 5df20cbeafb8..8af34f87e0c8 100644 --- a/drivers/gpu/drm/msm/Makefile +++ b/drivers/gpu/drm/msm/Makefile @@ -128,6 +128,7 @@ msm-y += \ msm_rd.o \ msm_ringbuffer.o \ msm_submitqueue.o \ + msm_syncobj.o \ msm_gpu_tracepoints.o \ msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index f282d691087f..bfb8c5ac1f1e 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -16,6 +16,7 @@ #include "msm_gpu.h" #include "msm_gem.h" #include "msm_gpu_trace.h" +#include "msm_syncobj.h" /* For userspace errors, use DRM_UT_DRIVER..
so that userspace can enable * error msgs for debugging, but we don't spam dmesg by default @@ -486,173 +487,6 @@ void msm_submit_retire(struct msm_gem_submit *submit) } } -struct msm_submit_post_dep { - struct drm_syncobj *syncobj; - uint64_t point; - struct dma_fence_chain *chain; -}; - -static struct drm_syncobj **msm_parse_deps(struct msm_gem_submit *submit, - struct drm_file *file, - uint64_t in_syncobjs_addr, - uint32_t nr_in_syncobjs, - size_t syncobj_stride) -{ - struct drm_syncobj **syncobjs = NULL; - struct drm_msm_gem_submit_syncobj syncobj_desc = {0}; - int ret = 0; - uint32_t i, j; - - syncobjs = kcalloc(nr_in_syncobjs, sizeof(*syncobjs), - GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY); - if (!syncobjs) - return ERR_PTR(-ENOMEM); - - for (i = 0; i < nr_in_syncobjs; ++i) { - uint64_t address = in_syncobjs_addr + i * syncobj_stride; - - if (copy_from_user(&syncobj_desc, - u64_to_user_ptr(address), - min(syncobj_stride, sizeof(syncobj_desc)))) { - ret = -EFAULT; - break; - } - - if (syncobj_desc.point && - !drm_core_check_feature(submit->dev, DRIVER_SYNCOBJ_TIMELINE)) { - ret = SUBMIT_ERROR(EOPNOTSUPP, submit, "syncobj timeline unsupported"); - break; - } - - if (syncobj_desc.flags & ~MSM_SUBMIT_SYNCOBJ_FLAGS) { - ret = SUBMIT_ERROR(EINVAL, submit, "invalid syncobj flags: %x", syncobj_desc.flags); - break; - } - - ret = drm_sched_job_add_syncobj_dependency(&submit->base, file, - syncobj_desc.handle, syncobj_desc.point); - if (ret) - break; - - if (syncobj_desc.flags & MSM_SUBMIT_SYNCOBJ_RESET) { - syncobjs[i] = - drm_syncobj_find(file, syncobj_desc.handle); - if (!syncobjs[i]) { - ret = SUBMIT_ERROR(EINVAL, submit, "invalid syncobj handle: %u", i); - break; - } - } - } - - if (ret) { - for (j = 0; j <= i; ++j) { - if (syncobjs[j]) - drm_syncobj_put(syncobjs[j]); - } - kfree(syncobjs); - return ERR_PTR(ret); - } - return syncobjs; -} - -static void msm_reset_syncobjs(struct drm_syncobj **syncobjs, - uint32_t nr_syncobjs) -{ - uint32_t i; - - for (i = 0; syncobjs && i < nr_syncobjs; ++i) { - if (syncobjs[i]) - drm_syncobj_replace_fence(syncobjs[i], NULL); - } -} - -static struct msm_submit_post_dep *msm_parse_post_deps(struct drm_device *dev, - struct drm_file *file, - uint64_t syncobjs_addr, - uint32_t nr_syncobjs, - size_t syncobj_stride) -{ - struct msm_submit_post_dep *post_deps; - struct drm_msm_gem_submit_syncobj syncobj_desc = {0}; - int ret = 0; - uint32_t i, j; - - post_deps = kcalloc(nr_syncobjs, sizeof(*post_deps), - GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY); - if (!post_deps) - return ERR_PTR(-ENOMEM); - - for (i = 0; i < nr_syncobjs; ++i) { - uint64_t address = syncobjs_addr + i * syncobj_stride; - - if (copy_from_user(&syncobj_desc, - u64_to_user_ptr(address), - min(syncobj_stride, sizeof(syncobj_desc)))) { - ret = -EFAULT; - break; - } - - post_deps[i].point = syncobj_desc.point; - - if (syncobj_desc.flags) { - ret = UERR(EINVAL, dev, "invalid syncobj flags"); - break; - } - - if (syncobj_desc.point) { - if (!drm_core_check_feature(dev, - DRIVER_SYNCOBJ_TIMELINE)) { - ret = UERR(EOPNOTSUPP, dev, "syncobj timeline unsupported"); - break; - } - - post_deps[i].chain = dma_fence_chain_alloc(); - if (!post_deps[i].chain) { - ret = -ENOMEM; - break; - } - } - - post_deps[i].syncobj = - drm_syncobj_find(file, syncobj_desc.handle); - if (!post_deps[i].syncobj) { - ret = UERR(EINVAL, dev, "invalid syncobj handle"); - break; - } - } - - if (ret) { - for (j = 0; j <= i; ++j) { - dma_fence_chain_free(post_deps[j].chain); - if (post_deps[j].syncobj) - 
drm_syncobj_put(post_deps[j].syncobj); - } - - kfree(post_deps); - return ERR_PTR(ret); - } - - return post_deps; -} - -static void msm_process_post_deps(struct msm_submit_post_dep *post_deps, - uint32_t count, struct dma_fence *fence) -{ - uint32_t i; - - for (i = 0; post_deps && i < count; ++i) { - if (post_deps[i].chain) { - drm_syncobj_add_point(post_deps[i].syncobj, - post_deps[i].chain, - fence, post_deps[i].point); - post_deps[i].chain = NULL; - } else { - drm_syncobj_replace_fence(post_deps[i].syncobj, - fence); - } - } -} - int msm_ioctl_gem_submit(struct drm_device *dev, void *data, struct drm_file *file) { @@ -663,7 +497,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, struct msm_gpu *gpu = priv->gpu; struct msm_gpu_submitqueue *queue; struct msm_ringbuffer *ring; - struct msm_submit_post_dep *post_deps = NULL; + struct msm_syncobj_post_dep *post_deps = NULL; struct drm_syncobj **syncobjs_to_reset = NULL; struct sync_file *sync_file = NULL; int out_fence_fd = -1; @@ -740,10 +574,10 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, } if (args->flags & MSM_SUBMIT_SYNCOBJ_IN) { - syncobjs_to_reset = msm_parse_deps(submit, file, - args->in_syncobjs, - args->nr_in_syncobjs, - args->syncobj_stride); + syncobjs_to_reset = msm_syncobj_parse_deps(dev, &submit->base, + file, args->in_syncobjs, + args->nr_in_syncobjs, + args->syncobj_stride); if (IS_ERR(syncobjs_to_reset)) { ret = PTR_ERR(syncobjs_to_reset); goto out_unlock; @@ -751,10 +585,10 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, } if (args->flags & MSM_SUBMIT_SYNCOBJ_OUT) { - post_deps = msm_parse_post_deps(dev, file, - args->out_syncobjs, - args->nr_out_syncobjs, - args->syncobj_stride); + post_deps = msm_syncobj_parse_post_deps(dev, file, + args->out_syncobjs, + args->nr_out_syncobjs, + args->syncobj_stride); if (IS_ERR(post_deps)) { ret = PTR_ERR(post_deps); goto out_unlock; @@ -897,10 +731,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, args->fence = submit->fence_id; queue->last_fence = submit->fence_id; - msm_reset_syncobjs(syncobjs_to_reset, args->nr_in_syncobjs); - msm_process_post_deps(post_deps, args->nr_out_syncobjs, - submit->user_fence); - + msm_syncobj_reset(syncobjs_to_reset, args->nr_in_syncobjs); + msm_syncobj_process_post_deps(post_deps, args->nr_out_syncobjs, submit->user_fence); out: submit_cleanup(submit, !!ret); diff --git a/drivers/gpu/drm/msm/msm_syncobj.c b/drivers/gpu/drm/msm/msm_syncobj.c new file mode 100644 index 000000000000..4baa9f522c54 --- /dev/null +++ b/drivers/gpu/drm/msm/msm_syncobj.c @@ -0,0 +1,172 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright (C) 2020 Google, Inc */ + +#include "drm/drm_drv.h" + +#include "msm_drv.h" +#include "msm_syncobj.h" + +struct drm_syncobj ** +msm_syncobj_parse_deps(struct drm_device *dev, + struct drm_sched_job *job, + struct drm_file *file, + uint64_t in_syncobjs_addr, + uint32_t nr_in_syncobjs, + size_t syncobj_stride) +{ + struct drm_syncobj **syncobjs = NULL; + struct drm_msm_syncobj syncobj_desc = {0}; + int ret = 0; + uint32_t i, j; + + syncobjs = kcalloc(nr_in_syncobjs, sizeof(*syncobjs), + GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY); + if (!syncobjs) + return ERR_PTR(-ENOMEM); + + for (i = 0; i < nr_in_syncobjs; ++i) { + uint64_t address = in_syncobjs_addr + i * syncobj_stride; + + if (copy_from_user(&syncobj_desc, + u64_to_user_ptr(address), + min(syncobj_stride, sizeof(syncobj_desc)))) { + ret = -EFAULT; + break; + } + + if (syncobj_desc.point && + 
!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) { + ret = UERR(EOPNOTSUPP, dev, "syncobj timeline unsupported"); + break; + } + + if (syncobj_desc.flags & ~MSM_SYNCOBJ_FLAGS) { + ret = UERR(EINVAL, dev, "invalid syncobj flags: %x", syncobj_desc.flags); + break; + } + + ret = drm_sched_job_add_syncobj_dependency(job, file, + syncobj_desc.handle, + syncobj_desc.point); + if (ret) + break; + + if (syncobj_desc.flags & MSM_SYNCOBJ_RESET) { + syncobjs[i] = drm_syncobj_find(file, syncobj_desc.handle); + if (!syncobjs[i]) { + ret = UERR(EINVAL, dev, "invalid syncobj handle: %u", i); + break; + } + } + } + + if (ret) { + for (j = 0; j <= i; ++j) { + if (syncobjs[j]) + drm_syncobj_put(syncobjs[j]); + } + kfree(syncobjs); + return ERR_PTR(ret); + } + return syncobjs; +} + +void +msm_syncobj_reset(struct drm_syncobj **syncobjs, uint32_t nr_syncobjs) +{ + uint32_t i; + + for (i = 0; syncobjs && i < nr_syncobjs; ++i) { + if (syncobjs[i]) + drm_syncobj_replace_fence(syncobjs[i], NULL); + } +} + +struct msm_syncobj_post_dep * +msm_syncobj_parse_post_deps(struct drm_device *dev, + struct drm_file *file, + uint64_t syncobjs_addr, + uint32_t nr_syncobjs, + size_t syncobj_stride) +{ + struct msm_syncobj_post_dep *post_deps; + struct drm_msm_syncobj syncobj_desc = {0}; + int ret = 0; + uint32_t i, j; + + post_deps = kcalloc(nr_syncobjs, sizeof(*post_deps), + GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY); + if (!post_deps) + return ERR_PTR(-ENOMEM); + + for (i = 0; i < nr_syncobjs; ++i) { + uint64_t address = syncobjs_addr + i * syncobj_stride; + + if (copy_from_user(&syncobj_desc, + u64_to_user_ptr(address), + min(syncobj_stride, sizeof(syncobj_desc)))) { + ret = -EFAULT; + break; + } + + post_deps[i].point = syncobj_desc.point; + + if (syncobj_desc.flags) { + ret = UERR(EINVAL, dev, "invalid syncobj flags"); + break; + } + + if (syncobj_desc.point) { + if (!drm_core_check_feature(dev, + DRIVER_SYNCOBJ_TIMELINE)) { + ret = UERR(EOPNOTSUPP, dev, "syncobj timeline unsupported"); + break; + } + + post_deps[i].chain = dma_fence_chain_alloc(); + if (!post_deps[i].chain) { + ret = -ENOMEM; + break; + } + } + + post_deps[i].syncobj = + drm_syncobj_find(file, syncobj_desc.handle); + if (!post_deps[i].syncobj) { + ret = UERR(EINVAL, dev, "invalid syncobj handle"); + break; + } + } + + if (ret) { + for (j = 0; j <= i; ++j) { + dma_fence_chain_free(post_deps[j].chain); + if (post_deps[j].syncobj) + drm_syncobj_put(post_deps[j].syncobj); + } + + kfree(post_deps); + return ERR_PTR(ret); + } + + return post_deps; +} + +void +msm_syncobj_process_post_deps(struct msm_syncobj_post_dep *post_deps, + uint32_t count, struct dma_fence *fence) +{ + uint32_t i; + + for (i = 0; post_deps && i < count; ++i) { + if (post_deps[i].chain) { + drm_syncobj_add_point(post_deps[i].syncobj, + post_deps[i].chain, + fence, post_deps[i].point); + post_deps[i].chain = NULL; + } else { + drm_syncobj_replace_fence(post_deps[i].syncobj, + fence); + } + } +} diff --git a/drivers/gpu/drm/msm/msm_syncobj.h b/drivers/gpu/drm/msm/msm_syncobj.h new file mode 100644 index 000000000000..bcaa15d01da0 --- /dev/null +++ b/drivers/gpu/drm/msm/msm_syncobj.h @@ -0,0 +1,37 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright (C) 2020 Google, Inc */ + +#ifndef __MSM_GEM_SYNCOBJ_H__ +#define __MSM_GEM_SYNCOBJ_H__ + +#include "drm/drm_device.h" +#include "drm/drm_syncobj.h" +#include "drm/gpu_scheduler.h" + +struct msm_syncobj_post_dep { + struct drm_syncobj *syncobj; + uint64_t point; + struct dma_fence_chain *chain; +}; + +struct drm_syncobj ** 
+msm_syncobj_parse_deps(struct drm_device *dev, + struct drm_sched_job *job, + struct drm_file *file, + uint64_t in_syncobjs_addr, + uint32_t nr_in_syncobjs, + size_t syncobj_stride); + +void msm_syncobj_reset(struct drm_syncobj **syncobjs, uint32_t nr_syncobjs); + +struct msm_syncobj_post_dep * +msm_syncobj_parse_post_deps(struct drm_device *dev, + struct drm_file *file, + uint64_t syncobjs_addr, + uint32_t nr_syncobjs, + size_t syncobj_stride); + +void msm_syncobj_process_post_deps(struct msm_syncobj_post_dep *post_deps, + uint32_t count, struct dma_fence *fence); + +#endif /* __MSM_GEM_SYNCOBJ_H__ */ diff --git a/include/uapi/drm/msm_drm.h b/include/uapi/drm/msm_drm.h index 1bccc347945c..2c2fc4b284d0 100644 --- a/include/uapi/drm/msm_drm.h +++ b/include/uapi/drm/msm_drm.h @@ -220,6 +220,17 @@ struct drm_msm_gem_cpu_fini { * Cmdstream Submission: */ +#define MSM_SYNCOBJ_RESET 0x00000001 /* Reset syncobj after wait. */ +#define MSM_SYNCOBJ_FLAGS ( \ + MSM_SYNCOBJ_RESET | \ + 0) + +struct drm_msm_syncobj { + __u32 handle; /* in, syncobj handle. */ + __u32 flags; /* in, from MSM_SUBMIT_SYNCOBJ_FLAGS */ + __u64 point; /* in, timepoint for timeline syncobjs. */ +}; + /* The value written into the cmdstream is logically: * * ((relocbuf->gpuaddr + reloc_offset) << shift) | or @@ -309,17 +320,6 @@ struct drm_msm_gem_submit_bo { MSM_SUBMIT_FENCE_SN_IN | \ 0) -#define MSM_SUBMIT_SYNCOBJ_RESET 0x00000001 /* Reset syncobj after wait. */ -#define MSM_SUBMIT_SYNCOBJ_FLAGS ( \ - MSM_SUBMIT_SYNCOBJ_RESET | \ - 0) - -struct drm_msm_gem_submit_syncobj { - __u32 handle; /* in, syncobj handle. */ - __u32 flags; /* in, from MSM_SUBMIT_SYNCOBJ_FLAGS */ - __u64 point; /* in, timepoint for timeline syncobjs. */ -}; - /* Each cmdstream submit consists of a table of buffers involved, and * one or more cmdstream buffers. This allows for conditional execution * (context-restore), and IB buffers needed for per tile/bin draw cmds. @@ -333,8 +333,8 @@ struct drm_msm_gem_submit { __u64 cmds; /* in, ptr to array of submit_cmd's */ __s32 fence_fd; /* in/out fence fd (see MSM_SUBMIT_FENCE_FD_IN/OUT) */ __u32 queueid; /* in, submitqueue id */ - __u64 in_syncobjs; /* in, ptr to array of drm_msm_gem_submit_syncobj */ - __u64 out_syncobjs; /* in, ptr to array of drm_msm_gem_submit_syncobj */ + __u64 in_syncobjs; /* in, ptr to array of drm_msm_syncobj */ + __u64 out_syncobjs; /* in, ptr to array of drm_msm_syncobj */ __u32 nr_in_syncobjs; /* in, number of entries in in_syncobj */ __u32 nr_out_syncobjs; /* in, number of entries in out_syncobj. */ __u32 syncobj_stride; /* in, stride of syncobj arrays. 
*/
From patchwork Mon May 19 17:57:32 2025
X-Patchwork-Submitter: Rob Clark
X-Patchwork-Id: 891211
From: Rob Clark
To: dri-devel@lists.freedesktop.org
Cc: freedreno@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, Connor Abbott, Rob Clark, Rob Clark, Abhinav Kumar, Dmitry Baryshkov, Sean Paul, Marijn Suijten, David Airlie, Simona Vetter, Konrad Dybcio, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Sumit Semwal, Christian König, linux-kernel@vger.kernel.org (open list), linux-media@vger.kernel.org (open list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b), linaro-mm-sig@lists.linaro.org (moderated list:DMA BUFFER SHARING FRAMEWORK:Keyword:\bdma_(?:buf|fence|resv)\b)
Subject: [PATCH v5 35/40] drm/msm: Add VM_BIND ioctl
Date: Mon, 19 May 2025 10:57:32 -0700
Message-ID: <20250519175755.13037-23-robdclark@gmail.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250519175755.13037-1-robdclark@gmail.com>
References: <20250519175348.11924-1-robdclark@gmail.com> <20250519175755.13037-1-robdclark@gmail.com>

From: Rob Clark

Add a VM_BIND ioctl for binding/unbinding buffers into a VM.  This is only supported if userspace has opted in to MSM_PARAM_EN_VM_BIND.
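The implementation below does not edit pgtables from the ioctl itself: each requested bind/unbind is run through drm_gpuvm's split/merge helpers, which call back into the new msm_gem_vm_sm_step_map/remap/unmap hooks, and those hooks only queue MSM_VM_OP_MAP/UNMAP entries that the scheduler job later applies under vm->mmu_lock. A rough sketch of that dispatch, assuming the drm_gpuvm_sm_map()/drm_gpuvm_sm_unmap() signatures from <drm/drm_gpuvm.h> at the time of this series (the struct and names here are placeholders, not the uapi this patch adds):

#include <drm/drm_gem.h>
#include <drm/drm_gpuvm.h>

/* Placeholder op description, not the real VM_BIND uapi added below. */
struct bind_op_sketch {
	bool map;			/* true = bind, false = unbind */
	struct drm_gem_object *obj;	/* backing BO (NULL for an unmap) */
	u64 obj_offset;			/* offset into the BO */
	u64 iova;			/* requested GPU VA */
	u64 range;			/* size of the mapping */
};

/*
 * Illustrative only: feed one op through drm_gpuvm.  drm_gpuvm works out
 * which existing VMAs get split/merged/removed and calls the driver's
 * sm_step_map/sm_step_remap/sm_step_unmap hooks with 'step_arg'.
 */
static int apply_bind_op_sketch(struct drm_gpuvm *gpuvm, void *step_arg,
				const struct bind_op_sketch *op)
{
	if (op->map)
		/* May also generate remap/unmap steps for overlapping VMAs */
		return drm_gpuvm_sm_map(gpuvm, step_arg, op->iova, op->range,
					op->obj, op->obj_offset);

	return drm_gpuvm_sm_unmap(gpuvm, step_arg, op->iova, op->range);
}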
Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/msm_drv.c | 1 + drivers/gpu/drm/msm/msm_drv.h | 4 +- drivers/gpu/drm/msm/msm_gem.c | 40 +- drivers/gpu/drm/msm/msm_gem.h | 4 + drivers/gpu/drm/msm/msm_gem_submit.c | 22 +- drivers/gpu/drm/msm/msm_gem_vma.c | 1073 +++++++++++++++++++++++++- include/uapi/drm/msm_drm.h | 74 +- 7 files changed, 1185 insertions(+), 33 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c index 89cb7820064f..bdf775897de8 100644 --- a/drivers/gpu/drm/msm/msm_drv.c +++ b/drivers/gpu/drm/msm/msm_drv.c @@ -791,6 +791,7 @@ static const struct drm_ioctl_desc msm_ioctls[] = { DRM_IOCTL_DEF_DRV(MSM_SUBMITQUEUE_NEW, msm_ioctl_submitqueue_new, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(MSM_SUBMITQUEUE_CLOSE, msm_ioctl_submitqueue_close, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(MSM_SUBMITQUEUE_QUERY, msm_ioctl_submitqueue_query, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(MSM_VM_BIND, msm_ioctl_vm_bind, DRM_RENDER_ALLOW), }; static void msm_show_fdinfo(struct drm_printer *p, struct drm_file *file) diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h index b0add236cbb3..33240afc6365 100644 --- a/drivers/gpu/drm/msm/msm_drv.h +++ b/drivers/gpu/drm/msm/msm_drv.h @@ -232,7 +232,9 @@ struct drm_gpuvm *msm_kms_init_vm(struct drm_device *dev); bool msm_use_mmu(struct drm_device *dev); int msm_ioctl_gem_submit(struct drm_device *dev, void *data, - struct drm_file *file); + struct drm_file *file); +int msm_ioctl_vm_bind(struct drm_device *dev, void *data, + struct drm_file *file); #ifdef CONFIG_DEBUG_FS unsigned long msm_gem_shrinker_shrink(struct drm_device *dev, unsigned long nr_to_scan); diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index cf509ca42da0..040f0539baa5 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -233,8 +233,7 @@ static void put_pages(struct drm_gem_object *obj) } } -static struct page **msm_gem_get_pages_locked(struct drm_gem_object *obj, - unsigned madv) +struct page **msm_gem_get_pages_locked(struct drm_gem_object *obj, unsigned madv) { struct msm_gem_object *msm_obj = to_msm_bo(obj); @@ -1036,18 +1035,37 @@ static void msm_gem_free_object(struct drm_gem_object *obj) /* * We need to lock any VMs the object is still attached to, but not * the object itself (see explaination in msm_gem_assert_locked()), - * so just open-code this special case: + * so just open-code this special case. + * + * Note that we skip the dance if we aren't attached to any VM. This + * is load bearing. The driver needs to support two usage models: + * + * 1. Legacy kernel managed VM: Userspace expects the VMA's to be + * implicitly torn down when the object is freed, the VMA's do + * not hold a hard reference to the BO. + * + * 2. VM_BIND, userspace managed VM: The VMA holds a reference to the + * BO. This can be dropped when the VM is closed and it's associated + * VMAs are torn down. (See msm_gem_vm_close()). + * + * In the latter case the last reference to a BO can be dropped while + * we already have the VM locked. It would have already been removed + * from the gpuva list, but lockdep doesn't know that. Or understand + * the differences between the two usage models. 
*/ - drm_exec_init(&exec, 0, 0); - drm_exec_until_all_locked (&exec) { - struct drm_gpuvm_bo *vm_bo; - drm_gem_for_each_gpuvm_bo (vm_bo, obj) { - drm_exec_lock_obj(&exec, drm_gpuvm_resv_obj(vm_bo->vm)); - drm_exec_retry_on_contention(&exec); + if (!list_empty(&obj->gpuva.list)) { + drm_exec_init(&exec, 0, 0); + drm_exec_until_all_locked (&exec) { + struct drm_gpuvm_bo *vm_bo; + drm_gem_for_each_gpuvm_bo (vm_bo, obj) { + drm_exec_lock_obj(&exec, + drm_gpuvm_resv_obj(vm_bo->vm)); + drm_exec_retry_on_contention(&exec); + } } + put_iova_spaces(obj, NULL, true); + drm_exec_fini(&exec); /* drop locks */ } - put_iova_spaces(obj, NULL, true); - drm_exec_fini(&exec); /* drop locks */ if (obj->import_attach) { GEM_WARN_ON(msm_obj->vaddr); diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h index 8ad25927c604..bfeb0f584ae5 100644 --- a/drivers/gpu/drm/msm/msm_gem.h +++ b/drivers/gpu/drm/msm/msm_gem.h @@ -73,6 +73,9 @@ struct msm_gem_vm { /** @mmu: The mmu object which manages the pgtables */ struct msm_mmu *mmu; + /** @mmu_lock: Protects access to the mmu */ + struct mutex mmu_lock; + /** * @pid: For address spaces associated with a specific process, this * will be non-NULL: @@ -205,6 +208,7 @@ int msm_gem_get_and_pin_iova(struct drm_gem_object *obj, struct drm_gpuvm *vm, uint64_t *iova); void msm_gem_unpin_iova(struct drm_gem_object *obj, struct drm_gpuvm *vm); void msm_gem_pin_obj_locked(struct drm_gem_object *obj); +struct page **msm_gem_get_pages_locked(struct drm_gem_object *obj, unsigned madv); struct page **msm_gem_pin_pages_locked(struct drm_gem_object *obj); void msm_gem_unpin_pages_locked(struct drm_gem_object *obj); int msm_gem_dumb_create(struct drm_file *file, struct drm_device *dev, diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 053e6c65780f..9809918d8eb4 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -193,6 +193,7 @@ static int submit_lookup_objects(struct msm_gem_submit *submit, static int submit_lookup_cmds(struct msm_gem_submit *submit, struct drm_msm_gem_submit *args, struct drm_file *file) { + struct msm_context *ctx = file->driver_priv; unsigned i; size_t sz; int ret = 0; @@ -224,6 +225,20 @@ static int submit_lookup_cmds(struct msm_gem_submit *submit, goto out; } + if (msm_context_is_vmbind(ctx)) { + if (submit_cmd.nr_relocs) { + ret = SUBMIT_ERROR(EINVAL, submit, "nr_relocs must be zero"); + goto out; + } + + if (submit_cmd.submit_idx || submit_cmd.submit_offset) { + ret = SUBMIT_ERROR(EINVAL, submit, "submit_idx/offset must be zero"); + goto out; + } + + submit->cmd[i].iova = submit_cmd.iova; + } + submit->cmd[i].type = submit_cmd.type; submit->cmd[i].size = submit_cmd.size / 4; submit->cmd[i].offset = submit_cmd.submit_offset / 4; @@ -527,6 +542,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, struct msm_syncobj_post_dep *post_deps = NULL; struct drm_syncobj **syncobjs_to_reset = NULL; struct sync_file *sync_file = NULL; + unsigned cmds_to_parse; int out_fence_fd = -1; unsigned i; int ret; @@ -650,7 +666,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, if (ret) goto out; - for (i = 0; i < args->nr_cmds; i++) { + cmds_to_parse = msm_context_is_vmbind(ctx) ? 
0 : args->nr_cmds; + + for (i = 0; i < cmds_to_parse; i++) { struct drm_gem_object *obj; uint64_t iova; @@ -681,7 +699,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, goto out; } - submit->nr_cmds = i; + submit->nr_cmds = args->nr_cmds; idr_preload(GFP_KERNEL); diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c index a105aed82cae..fe41b7a042c3 100644 --- a/drivers/gpu/drm/msm/msm_gem_vma.c +++ b/drivers/gpu/drm/msm/msm_gem_vma.c @@ -4,9 +4,16 @@ * Author: Rob Clark */ +#include "drm/drm_file.h" +#include "drm/msm_drm.h" +#include "linux/file.h" +#include "linux/sync_file.h" + #include "msm_drv.h" #include "msm_gem.h" +#include "msm_gpu.h" #include "msm_mmu.h" +#include "msm_syncobj.h" #define vm_dbg(fmt, ...) pr_debug("%s:%d: "fmt"\n", __func__, __LINE__, ##__VA_ARGS__) @@ -36,6 +43,97 @@ struct msm_vm_unmap_op { uint64_t range; }; +/** + * struct msm_vma_op - A MAP or UNMAP operation + */ +struct msm_vm_op { + /** @op: The operation type */ + enum { + MSM_VM_OP_MAP = 1, + MSM_VM_OP_UNMAP, + } op; + union { + /** @map: Parameters used if op == MSM_VMA_OP_MAP */ + struct msm_vm_map_op map; + /** @unmap: Parameters used if op == MSM_VMA_OP_UNMAP */ + struct msm_vm_unmap_op unmap; + }; + /** @node: list head in msm_vm_bind_job::vm_ops */ + struct list_head node; + + /** + * @obj: backing object for pages to be mapped/unmapped + * + * Async unmap ops, in particular, must hold a reference to the + * original GEM object backing the mapping that will be unmapped. + * But the same can be required in the map path, for example if + * there is not a corresponding unmap op, such as process exit. + * + * This ensures that the pages backing the mapping are not freed + * before the mapping is torn down. + */ + struct drm_gem_object *obj; +}; + +/** + * struct msm_vm_bind_job - Tracking for a VM_BIND ioctl + * + * A table of userspace requested VM updates (MSM_VM_BIND_OP_UNMAP/MAP/MAP_NULL) + * gets applied to the vm, generating a list of VM ops (MSM_VM_OP_MAP/UNMAP) + * which are applied to the pgtables asynchronously. For example a userspace + * requested MSM_VM_BIND_OP_MAP could end up generating both an MSM_VM_OP_UNMAP + * to unmap an existing mapping, and a MSM_VM_OP_MAP to apply the new mapping. + */ +struct msm_vm_bind_job { + /** @base: base class for drm_sched jobs */ + struct drm_sched_job base; + /** @vm: The VM being operated on */ + struct drm_gpuvm *vm; + /** @fence: The fence that is signaled when job completes */ + struct dma_fence *fence; + /** @queue: The queue that the job runs on */ + struct msm_gpu_submitqueue *queue; + /** @prealloc: Tracking for pre-allocated MMU pgtable pages */ + struct msm_mmu_prealloc prealloc; + /** @vm_ops: a list of struct msm_vm_op */ + struct list_head vm_ops; + /** @bos_pinned: are the GEM objects being bound pinned? */ + bool bos_pinned; + /** @nr_ops: the number of userspace requested ops */ + unsigned int nr_ops; + /** + * @ops: the userspace requested ops + * + * The userspace requested ops are copied/parsed and validated + * before we start applying the updates to try to do as much up- + * front error checking as possible, to avoid the VM being in an + * undefined state due to partially executed VM_BIND. + * + * This table also serves to hold a reference to the backing GEM + * objects. 
+ */ + struct msm_vm_bind_op { + uint32_t op; + uint32_t flags; + union { + struct drm_gem_object *obj; + uint32_t handle; + }; + uint64_t obj_offset; + uint64_t iova; + uint64_t range; + } ops[]; +}; + +#define job_foreach_bo(obj, _job) \ + for (unsigned i = 0; i < (_job)->nr_ops; i++) \ + if ((obj = (_job)->ops[i].obj)) + +static inline struct msm_vm_bind_job *to_msm_vm_bind_job(struct drm_sched_job *job) +{ + return container_of(job, struct msm_vm_bind_job, base); +} + static void msm_gem_vm_free(struct drm_gpuvm *gpuvm) { @@ -52,6 +150,9 @@ msm_gem_vm_free(struct drm_gpuvm *gpuvm) static void vm_unmap_op(struct msm_gem_vm *vm, const struct msm_vm_unmap_op *op) { + if (!vm->managed) + lockdep_assert_held(&vm->mmu_lock); + vm_dbg("%p: %016llx %016llx", vm, op->iova, op->iova + op->range); vm->mmu->funcs->unmap(vm->mmu, op->iova, op->range); @@ -60,6 +161,9 @@ vm_unmap_op(struct msm_gem_vm *vm, const struct msm_vm_unmap_op *op) static int vm_map_op(struct msm_gem_vm *vm, const struct msm_vm_map_op *op) { + if (!vm->managed) + lockdep_assert_held(&vm->mmu_lock); + vm_dbg("%p: %016llx %016llx", vm, op->iova, op->iova + op->range); return vm->mmu->funcs->map(vm->mmu, op->iova, op->sgt, op->offset, @@ -69,17 +173,29 @@ vm_map_op(struct msm_gem_vm *vm, const struct msm_vm_map_op *op) /* Actually unmap memory for the vma */ void msm_gem_vma_unmap(struct drm_gpuva *vma) { + struct msm_gem_vm *vm = to_msm_vm(vma->vm); struct msm_gem_vma *msm_vma = to_msm_vma(vma); /* Don't do anything if the memory isn't mapped */ if (!msm_vma->mapped) return; - vm_unmap_op(to_msm_vm(vma->vm), &(struct msm_vm_unmap_op){ + /* + * The mmu_lock is only needed when preallocation is used. But + * in that case we don't need to worry about recursion into + * shrinker + */ + if (!vm->managed) + mutex_lock(&vm->mmu_lock); + + vm_unmap_op(vm, &(struct msm_vm_unmap_op){ .iova = vma->va.addr, .range = vma->va.range, }); + if (!vm->managed) + mutex_unlock(&vm->mmu_lock); + msm_vma->mapped = false; } @@ -87,6 +203,7 @@ void msm_gem_vma_unmap(struct drm_gpuva *vma) int msm_gem_vma_map(struct drm_gpuva *vma, int prot, struct sg_table *sgt) { + struct msm_gem_vm *vm = to_msm_vm(vma->vm); struct msm_gem_vma *msm_vma = to_msm_vma(vma); int ret; @@ -98,6 +215,14 @@ msm_gem_vma_map(struct drm_gpuva *vma, int prot, struct sg_table *sgt) msm_vma->mapped = true; + /* + * The mmu_lock is only needed when preallocation is used. But + * in that case we don't need to worry about recursion into + * shrinker + */ + if (!vm->managed) + mutex_lock(&vm->mmu_lock); + /* * NOTE: iommu/io-pgtable can allocate pages, so we cannot hold * a lock across map/unmap which is also used in the job_run() @@ -107,16 +232,19 @@ msm_gem_vma_map(struct drm_gpuva *vma, int prot, struct sg_table *sgt) * Revisit this if we can come up with a scheme to pre-alloc pages * for the pgtable in map/unmap ops. 
*/ - ret = vm_map_op(to_msm_vm(vma->vm), &(struct msm_vm_map_op){ + ret = vm_map_op(vm, &(struct msm_vm_map_op){ .iova = vma->va.addr, .range = vma->va.range, .offset = vma->gem.offset, .sgt = sgt, .prot = prot, }); - if (ret) { + + if (!vm->managed) + mutex_unlock(&vm->mmu_lock); + + if (ret) msm_vma->mapped = false; - } return ret; } @@ -131,6 +259,9 @@ void msm_gem_vma_close(struct drm_gpuva *vma) drm_gpuvm_resv_assert_held(&vm->base); + if (vma->gem.obj) + msm_gem_assert_locked(vma->gem.obj); + if (vma->va.addr && vm->managed) drm_mm_remove_node(&msm_vma->node); @@ -158,6 +289,7 @@ msm_gem_vma_new(struct drm_gpuvm *gpuvm, struct drm_gem_object *obj, if (vm->managed) { BUG_ON(offset != 0); + BUG_ON(!obj); /* NULL mappings not valid for kernel managed VM */ ret = drm_mm_insert_node_in_range(&vm->mm, &vma->node, obj->size, PAGE_SIZE, 0, range_start, range_end, 0); @@ -169,7 +301,8 @@ msm_gem_vma_new(struct drm_gpuvm *gpuvm, struct drm_gem_object *obj, range_end = range_start + obj->size; } - GEM_WARN_ON((range_end - range_start) > obj->size); + if (obj) + GEM_WARN_ON((range_end - range_start) > obj->size); drm_gpuva_init(&vma->base, range_start, range_end - range_start, obj, offset); vma->mapped = false; @@ -178,6 +311,9 @@ msm_gem_vma_new(struct drm_gpuvm *gpuvm, struct drm_gem_object *obj, if (ret) goto err_free_range; + if (!obj) + return &vma->base; + vm_bo = drm_gpuvm_bo_obtain(&vm->base, obj); if (IS_ERR(vm_bo)) { ret = PTR_ERR(vm_bo); @@ -200,11 +336,297 @@ msm_gem_vma_new(struct drm_gpuvm *gpuvm, struct drm_gem_object *obj, return ERR_PTR(ret); } +static int +msm_gem_vm_bo_validate(struct drm_gpuvm_bo *vm_bo, struct drm_exec *exec) +{ + struct drm_gem_object *obj = vm_bo->obj; + struct drm_gpuva *vma; + int ret; + + vm_dbg("validate: %p", obj); + + msm_gem_assert_locked(obj); + + drm_gpuvm_bo_for_each_va (vma, vm_bo) { + ret = msm_gem_pin_vma_locked(obj, vma); + if (ret) + return ret; + } + + return 0; +} + +struct op_arg { + unsigned flags; + struct msm_vm_bind_job *job; +}; + +static void +vm_op_enqueue(struct op_arg *arg, struct msm_vm_op _op) +{ + struct msm_vm_op *op = kmalloc(sizeof(*op), GFP_KERNEL); + *op = _op; + list_add_tail(&op->node, &arg->job->vm_ops); + + if (op->obj) + drm_gem_object_get(op->obj); +} + +static struct drm_gpuva * +vma_from_op(struct op_arg *arg, struct drm_gpuva_op_map *op) +{ + return msm_gem_vma_new(arg->job->vm, op->gem.obj, op->gem.offset, + op->va.addr, op->va.addr + op->va.range); +} + +static int +msm_gem_vm_sm_step_map(struct drm_gpuva_op *op, void *arg) +{ + struct drm_gem_object *obj = op->map.gem.obj; + struct drm_gpuva *vma; + struct sg_table *sgt; + unsigned prot; + + vma = vma_from_op(arg, &op->map); + if (WARN_ON(IS_ERR(vma))) + return PTR_ERR(vma); + + vm_dbg("%p:%p:%p: %016llx %016llx", vma->vm, vma, vma->gem.obj, + vma->va.addr, vma->va.range); + + vma->flags = ((struct op_arg *)arg)->flags; + + if (obj) { + sgt = to_msm_bo(obj)->sgt; + prot = msm_gem_prot(obj); + } else { + sgt = NULL; + prot = IOMMU_READ | IOMMU_WRITE; + } + + vm_op_enqueue(arg, (struct msm_vm_op){ + .op = MSM_VM_OP_MAP, + .map = { + .sgt = sgt, + .iova = vma->va.addr, + .range = vma->va.range, + .offset = vma->gem.offset, + .prot = prot, + }, + .obj = vma->gem.obj, + }); + + to_msm_vma(vma)->mapped = true; + + return 0; +} + +static int +msm_gem_vm_sm_step_remap(struct drm_gpuva_op *op, void *arg) +{ + struct msm_vm_bind_job *job = ((struct op_arg *)arg)->job; + struct drm_gpuvm *vm = job->vm; + struct drm_gpuva *orig_vma = op->remap.unmap->va; + struct 
drm_gpuva *prev_vma = NULL, *next_vma = NULL; + struct drm_gpuvm_bo *vm_bo = orig_vma->vm_bo; + bool mapped = to_msm_vma(orig_vma)->mapped; + unsigned flags; + + vm_dbg("orig_vma: %p:%p:%p: %016llx %016llx", vm, orig_vma, + orig_vma->gem.obj, orig_vma->va.addr, orig_vma->va.range); + + if (mapped) { + uint64_t unmap_start, unmap_range; + + drm_gpuva_op_remap_to_unmap_range(&op->remap, &unmap_start, &unmap_range); + + vm_op_enqueue(arg, (struct msm_vm_op){ + .op = MSM_VM_OP_UNMAP, + .unmap = { + .iova = unmap_start, + .range = unmap_range, + }, + .obj = orig_vma->gem.obj, + }); + + /* + * Part of this GEM obj is still mapped, but we're going to kill the + * existing VMA and replace it with one or two new ones (ie. two if + * the unmapped range is in the middle of the existing (unmap) VMA). + * So just set the state to unmapped: + */ + to_msm_vma(orig_vma)->mapped = false; + } + + /* + * Hold a ref to the vm_bo between the msm_gem_vma_close() and the + * creation of the new prev/next vma's, in case the vm_bo is tracked + * in the VM's evict list: + */ + if (vm_bo) + drm_gpuvm_bo_get(vm_bo); + + /* + * The prev_vma and/or next_vma are replacing the unmapped vma, and + * therefore should preserve it's flags: + */ + flags = orig_vma->flags; + + msm_gem_vma_close(orig_vma); + + if (op->remap.prev) { + prev_vma = vma_from_op(arg, op->remap.prev); + if (WARN_ON(IS_ERR(prev_vma))) + return PTR_ERR(prev_vma); + + vm_dbg("prev_vma: %p:%p: %016llx %016llx", vm, prev_vma, prev_vma->va.addr, prev_vma->va.range); + to_msm_vma(prev_vma)->mapped = mapped; + prev_vma->flags = flags; + } + + if (op->remap.next) { + next_vma = vma_from_op(arg, op->remap.next); + if (WARN_ON(IS_ERR(next_vma))) + return PTR_ERR(next_vma); + + vm_dbg("next_vma: %p:%p: %016llx %016llx", vm, next_vma, next_vma->va.addr, next_vma->va.range); + to_msm_vma(next_vma)->mapped = mapped; + next_vma->flags = flags; + } + + if (!mapped) + drm_gpuvm_bo_evict(vm_bo, true); + + /* Drop the previous ref: */ + drm_gpuvm_bo_put(vm_bo); + + return 0; +} + +static int +msm_gem_vm_sm_step_unmap(struct drm_gpuva_op *op, void *arg) +{ + struct drm_gpuva *vma = op->unmap.va; + struct msm_gem_vma *msm_vma = to_msm_vma(vma); + + vm_dbg("%p:%p:%p: %016llx %016llx", vma->vm, vma, vma->gem.obj, + vma->va.addr, vma->va.range); + + if (!msm_vma->mapped) + goto out_close; + + vm_op_enqueue(arg, (struct msm_vm_op){ + .op = MSM_VM_OP_UNMAP, + .unmap = { + .iova = vma->va.addr, + .range = vma->va.range, + }, + .obj = vma->gem.obj, + }); + + msm_vma->mapped = false; + +out_close: + msm_gem_vma_close(vma); + + return 0; +} + static const struct drm_gpuvm_ops msm_gpuvm_ops = { .vm_free = msm_gem_vm_free, + .vm_bo_validate = msm_gem_vm_bo_validate, + .sm_step_map = msm_gem_vm_sm_step_map, + .sm_step_remap = msm_gem_vm_sm_step_remap, + .sm_step_unmap = msm_gem_vm_sm_step_unmap, }; +static struct dma_fence * +msm_vma_job_run(struct drm_sched_job *_job) +{ + struct msm_vm_bind_job *job = to_msm_vm_bind_job(_job); + struct msm_gem_vm *vm = to_msm_vm(job->vm); + struct drm_gem_object *obj; + int ret = vm->unusable ? -EINVAL : 0; + + vm_dbg(""); + + mutex_lock(&vm->mmu_lock); + vm->mmu->prealloc = &job->prealloc; + + while (!list_empty(&job->vm_ops)) { + struct msm_vm_op *op = + list_first_entry(&job->vm_ops, struct msm_vm_op, node); + + switch (op->op) { + case MSM_VM_OP_MAP: + /* + * On error, stop trying to map new things.. 
but we + * still want to process the unmaps (or in particular, + * the drm_gem_object_put()s) + */ + if (!ret) + ret = vm_map_op(vm, &op->map); + break; + case MSM_VM_OP_UNMAP: + vm_unmap_op(vm, &op->unmap); + break; + } + drm_gem_object_put(op->obj); + list_del(&op->node); + kfree(op); + } + + vm->mmu->prealloc = NULL; + mutex_unlock(&vm->mmu_lock); + + /* + * We failed to perform at least _some_ of the pgtable updates, so + * now the VM is in an undefined state. Game over! + */ + if (ret) + vm->unusable = true; + + job_foreach_bo (obj, job) { + msm_gem_lock(obj); + msm_gem_unpin_locked(obj); + msm_gem_unlock(obj); + } + + /* VM_BIND ops are synchronous, so no fence to wait on: */ + return NULL; +} + +static void +msm_vma_job_free(struct drm_sched_job *_job) +{ + struct msm_vm_bind_job *job = to_msm_vm_bind_job(_job); + struct msm_mmu *mmu = to_msm_vm(job->vm)->mmu; + struct drm_gem_object *obj; + + mmu->funcs->prealloc_cleanup(mmu, &job->prealloc); + + drm_sched_job_cleanup(_job); + + job_foreach_bo (obj, job) + drm_gem_object_put(obj); + + msm_submitqueue_put(job->queue); + dma_fence_put(job->fence); + + /* In error paths, we could have unexecuted ops: */ + while (!list_empty(&job->vm_ops)) { + struct msm_vm_op *op = + list_first_entry(&job->vm_ops, struct msm_vm_op, node); + list_del(&op->node); + kfree(op); + } + + kfree(job); +} + static const struct drm_sched_backend_ops msm_vm_bind_ops = { + .run_job = msm_vma_job_run, + .free_job = msm_vma_job_free }; /** @@ -254,6 +676,7 @@ msm_gem_vm_create(struct drm_device *drm, struct msm_mmu *mmu, const char *name, .ops = &msm_vm_bind_ops, .num_rqs = 1, .credit_limit = 1, + .enqueue_credit_limit = 1024, .timeout = MAX_SCHEDULE_TIMEOUT, .name = "msm-vm-bind", .dev = drm->dev, @@ -269,6 +692,7 @@ msm_gem_vm_create(struct drm_device *drm, struct msm_mmu *mmu, const char *name, drm_gem_object_put(dummy_gem); vm->mmu = mmu; + mutex_init(&vm->mmu_lock); vm->managed = managed; drm_mm_init(&vm->mm, va_start, va_size); @@ -281,7 +705,6 @@ msm_gem_vm_create(struct drm_device *drm, struct msm_mmu *mmu, const char *name, err_free_vm: kfree(vm); return ERR_PTR(ret); - } /** @@ -297,6 +720,7 @@ msm_gem_vm_close(struct drm_gpuvm *gpuvm) { struct msm_gem_vm *vm = to_msm_vm(gpuvm); struct drm_gpuva *vma, *tmp; + struct drm_exec exec; /* * For kernel managed VMs, the VMAs are torn down when the handle is @@ -313,22 +737,635 @@ msm_gem_vm_close(struct drm_gpuvm *gpuvm) drm_sched_fini(&vm->sched); /* Tear down any remaining mappings: */ - dma_resv_lock(drm_gpuvm_resv(gpuvm), NULL); - drm_gpuvm_for_each_va_safe (vma, tmp, gpuvm) { - struct drm_gem_object *obj = vma->gem.obj; + drm_exec_init(&exec, 0, 2); + drm_exec_until_all_locked (&exec) { + drm_exec_lock_obj(&exec, drm_gpuvm_resv_obj(gpuvm)); + drm_exec_retry_on_contention(&exec); + + drm_gpuvm_for_each_va_safe (vma, tmp, gpuvm) { + struct drm_gem_object *obj = vma->gem.obj; + + /* + * MSM_BO_NO_SHARE objects share the same resv as the + * VM, in which case the obj is already locked: + */ + if (obj && (obj->resv == drm_gpuvm_resv(gpuvm))) + obj = NULL; + + if (obj) { + drm_exec_lock_obj(&exec, obj); + drm_exec_retry_on_contention(&exec); + } + + msm_gem_vma_unmap(vma); + msm_gem_vma_close(vma); + + if (obj) { + drm_exec_unlock_obj(&exec, obj); + } + } + } + drm_exec_fini(&exec); +} + + +static struct msm_vm_bind_job * +vm_bind_job_create(struct drm_device *dev, struct msm_gpu *gpu, + struct msm_gpu_submitqueue *queue, uint32_t nr_ops) +{ + struct msm_vm_bind_job *job; + uint64_t sz; + int ret; + + sz = 
struct_size(job, ops, nr_ops); + + if (sz > SIZE_MAX) + return ERR_PTR(-ENOMEM); + + job = kzalloc(sz, GFP_KERNEL | __GFP_NOWARN); + if (!job) + return ERR_PTR(-ENOMEM); + + ret = drm_sched_job_init(&job->base, queue->entity, 1, queue); + if (ret) { + kfree(job); + return ERR_PTR(ret); + } + + job->vm = msm_context_vm(dev, queue->ctx); + job->queue = queue; + INIT_LIST_HEAD(&job->vm_ops); + + return job; +} + +static bool invalid_alignment(uint64_t addr) +{ + /* + * Technically this is about GPU alignment, not CPU alignment. But + * I've not seen any qcom SoC where the SMMU does not support the + * CPU's smallest page size. + */ + return !PAGE_ALIGNED(addr); +} + +static int +lookup_op(struct msm_vm_bind_job *job, const struct drm_msm_vm_bind_op *op) +{ + struct drm_device *dev = job->vm->drm; + int i = job->nr_ops++; + int ret = 0; + + job->ops[i].op = op->op; + job->ops[i].handle = op->handle; + job->ops[i].obj_offset = op->obj_offset; + job->ops[i].iova = op->iova; + job->ops[i].range = op->range; + job->ops[i].flags = op->flags; + + if (op->flags & ~MSM_VM_BIND_OP_FLAGS) + ret = UERR(EINVAL, dev, "invalid flags: %x\n", op->flags); + + if (invalid_alignment(op->iova)) + ret = UERR(EINVAL, dev, "invalid address: %016llx\n", op->iova); + + if (invalid_alignment(op->obj_offset)) + ret = UERR(EINVAL, dev, "invalid bo_offset: %016llx\n", op->obj_offset); + + if (invalid_alignment(op->range)) + ret = UERR(EINVAL, dev, "invalid range: %016llx\n", op->range); + + + /* + * MAP must specify a valid handle. But the handle MBZ for + * UNMAP or MAP_NULL. + */ + if (op->op == MSM_VM_BIND_OP_MAP) { + if (!op->handle) + ret = UERR(EINVAL, dev, "invalid handle\n"); + } else if (op->handle) { + ret = UERR(EINVAL, dev, "handle must be zero\n"); + } + + switch (op->op) { + case MSM_VM_BIND_OP_MAP: + case MSM_VM_BIND_OP_MAP_NULL: + case MSM_VM_BIND_OP_UNMAP: + break; + default: + ret = UERR(EINVAL, dev, "invalid op: %u\n", op->op); + break; + } + + return ret; +} + +/* + * ioctl parsing, parameter validation, and GEM handle lookup + */ +static int +vm_bind_job_lookup_ops(struct msm_vm_bind_job *job, struct drm_msm_vm_bind *args, + struct drm_file *file, int *nr_bos) +{ + struct drm_device *dev = job->vm->drm; + int ret = 0; + int cnt = 0; + + if (args->nr_ops == 1) { + /* Single op case, the op is inlined: */ + ret = lookup_op(job, &args->op); + } else { + for (unsigned i = 0; i < args->nr_ops; i++) { + struct drm_msm_vm_bind_op op; + void __user *userptr = + u64_to_user_ptr(args->ops + (i * sizeof(op))); + + /* make sure we don't have garbage flags, in case we hit + * error path before flags is initialized: + */ + job->ops[i].flags = 0; + + if (copy_from_user(&op, userptr, sizeof(op))) { + ret = -EFAULT; + break; + } + + ret = lookup_op(job, &op); + if (ret) + break; + } + } + + if (ret) { + job->nr_ops = 0; + goto out; + } + + spin_lock(&file->table_lock); + + for (unsigned i = 0; i < args->nr_ops; i++) { + struct drm_gem_object *obj; + + if (!job->ops[i].handle) { + job->ops[i].obj = NULL; + continue; + } + + /* + * normally use drm_gem_object_lookup(), but for bulk lookup + * all under single table_lock just hit object_idr directly: + */ + obj = idr_find(&file->object_idr, job->ops[i].handle); + if (!obj) { + ret = UERR(EINVAL, dev, "invalid handle %u at index %u\n", job->ops[i].handle, i); + goto out_unlock; + } + + drm_gem_object_get(obj); + + job->ops[i].obj = obj; + cnt++; + } + + *nr_bos = cnt; + +out_unlock: + spin_unlock(&file->table_lock); + +out: + return ret; +} + +static void 
+prealloc_count(struct msm_vm_bind_job *job, + struct msm_vm_bind_op *first, + struct msm_vm_bind_op *last) +{ + struct msm_mmu *mmu = to_msm_vm(job->vm)->mmu; + + if (!first) + return; + + uint64_t start_iova = first->iova; + uint64_t end_iova = last->iova + last->range; + + mmu->funcs->prealloc_count(mmu, &job->prealloc, start_iova, end_iova - start_iova); +} + +static bool +ops_are_same_pte(struct msm_vm_bind_op *first, struct msm_vm_bind_op *next) +{ + /* + * Last level pte covers 2MB.. so we should merge two ops, from + * the PoV of figuring out how much pgtable pages to pre-allocate + * if they land in the same 2MB range: + */ + uint64_t pte_mask = ~(SZ_2M - 1); + return ((first->iova + first->range) & pte_mask) == (next->iova & pte_mask); +} + +/* + * Determine the amount of memory to prealloc for pgtables. For sparse images, + * in particular, userspace plays some tricks with the order of page mappings + * to get the desired swizzle pattern, resulting in a large # of tiny MAP ops. + * So detect when multiple MAP operations are physically contiguous, and count + * them as a single mapping. Otherwise the prealloc_count() will not realize + * they can share pagetable pages and vastly overcount. + */ +static void +vm_bind_prealloc_count(struct msm_vm_bind_job *job) +{ + struct msm_vm_bind_op *first = NULL, *last = NULL; + + for (int i = 0; i < job->nr_ops; i++) { + struct msm_vm_bind_op *op = &job->ops[i]; + + /* We only care about MAP/MAP_NULL: */ + if (op->op == MSM_VM_BIND_OP_UNMAP) + continue; + + /* + * If op is contiguous with last in the current range, then + * it becomes the new last in the range and we continue + * looping: + */ + if (last && ops_are_same_pte(last, op)) { + last = op; + continue; + } + + /* + * If op is not contiguous with the current range, flush + * the current range and start anew: + */ + prealloc_count(job, first, last); + first = last = op; + } + + /* Flush the remaining range: */ + prealloc_count(job, first, last); + + job->base.enqueue_credits = job->prealloc.count; +} + +/* + * Lock VM and GEM objects + */ +static int +vm_bind_job_lock_objects(struct msm_vm_bind_job *job, struct drm_exec *exec) +{ + struct drm_gem_object *obj; + int ret; + + /* Lock VM and objects: */ + drm_exec_until_all_locked(exec) { + ret = drm_exec_lock_obj(exec, drm_gpuvm_resv_obj(job->vm)); + drm_exec_retry_on_contention(exec); + if (ret) + return ret; + + job_foreach_bo (obj, job) { + ret = drm_exec_prepare_obj(exec, obj, 1); + drm_exec_retry_on_contention(exec); + if (ret) + return ret; + } + } + + return 0; +} + +/* + * Pin GEM objects, ensuring that we have backing pages. Pinning will move + * the object to the pinned LRU so that the shrinker knows to first consider + * other objects for evicting. + */ +static int +vm_bind_job_pin_objects(struct msm_vm_bind_job *job) +{ + struct drm_gem_object *obj; + + /* + * First loop, before holding the LRU lock, avoids holding the + * LRU lock while calling msm_gem_pin_vma_locked (which could + * trigger get_pages()) + */ + job_foreach_bo (obj, job) { + struct page **pages; + + pages = msm_gem_get_pages_locked(obj, MSM_MADV_WILLNEED); + if (IS_ERR(pages)) + return PTR_ERR(pages); + } + + struct msm_drm_private *priv = job->vm->drm->dev_private; + + /* + * A second loop while holding the LRU lock (a) avoids acquiring/dropping + * the LRU lock for each individual bo, while (b) avoiding holding the + * LRU lock while calling msm_gem_pin_vma_locked() (which could trigger + * get_pages() which could trigger reclaim.. 
and if we held the LRU lock + * could trigger deadlock with the shrinker). + */ + mutex_lock(&priv->lru.lock); + job_foreach_bo (obj, job) + msm_gem_pin_obj_locked(obj); + mutex_unlock(&priv->lru.lock); + + job->bos_pinned = true; + + return 0; +} + +/* + * Unpin GEM objects. Normally this is done after the bind job is run. + */ +static void +vm_bind_job_unpin_objects(struct msm_vm_bind_job *job) +{ + struct drm_gem_object *obj; + + if (!job->bos_pinned) + return; + + job_foreach_bo (obj, job) + msm_gem_unpin_locked(obj); - if (obj && obj->resv != drm_gpuvm_resv(gpuvm)) { - drm_gem_object_get(obj); - msm_gem_lock(obj); + job->bos_pinned = false; +} + +/* + * Pre-allocate pgtable memory, and translate the VM bind requests into a + * sequence of pgtable updates to be applied asynchronously. + */ +static int +vm_bind_job_prepare(struct msm_vm_bind_job *job) +{ + struct msm_gem_vm *vm = to_msm_vm(job->vm); + struct msm_mmu *mmu = vm->mmu; + int ret; + + ret = mmu->funcs->prealloc_allocate(mmu, &job->prealloc); + if (ret) + return ret; + + for (unsigned i = 0; i < job->nr_ops; i++) { + const struct msm_vm_bind_op *op = &job->ops[i]; + struct op_arg arg = { + .job = job, + }; + + switch (op->op) { + case MSM_VM_BIND_OP_UNMAP: + ret = drm_gpuvm_sm_unmap(job->vm, &arg, op->iova, + op->obj_offset); + break; + case MSM_VM_BIND_OP_MAP: + if (op->flags & MSM_VM_BIND_OP_DUMP) + arg.flags |= MSM_VMA_DUMP; + fallthrough; + case MSM_VM_BIND_OP_MAP_NULL: + ret = drm_gpuvm_sm_map(job->vm, &arg, op->iova, + op->range, op->obj, op->obj_offset); + break; + default: + /* + * lookup_op() should have already thrown an error for + * invalid ops + */ + BUG_ON("unreachable"); } - msm_gem_vma_unmap(vma); - msm_gem_vma_close(vma); + if (ret) { + /* + * If we've already started modifying the vm, we can't + * adequetly describe to userspace the intermediate + * state the vm is in. So throw up our hands! + */ + if (i > 0) + vm->unusable = true; + return ret; + } + } + + return 0; +} + +/* + * Attach fences to the GEM objects being bound. This will signify to + * the shrinker that they are busy even after dropping the locks (ie. + * drm_exec_fini()) + */ +static void +vm_bind_job_attach_fences(struct msm_vm_bind_job *job) +{ + for (unsigned i = 0; i < job->nr_ops; i++) { + struct drm_gem_object *obj = job->ops[i].obj; + + if (!obj) + continue; + + dma_resv_add_fence(obj->resv, job->fence, + DMA_RESV_USAGE_KERNEL); + } +} + +int +msm_ioctl_vm_bind(struct drm_device *dev, void *data, struct drm_file *file) +{ + struct msm_drm_private *priv = dev->dev_private; + struct drm_msm_vm_bind *args = data; + struct msm_context *ctx = file->driver_priv; + struct msm_vm_bind_job *job = NULL; + struct msm_gpu *gpu = priv->gpu; + struct msm_gpu_submitqueue *queue; + struct msm_syncobj_post_dep *post_deps = NULL; + struct drm_syncobj **syncobjs_to_reset = NULL; + struct sync_file *sync_file = NULL; + struct dma_fence *fence; + int out_fence_fd = -1; + int ret, nr_bos = 0; + unsigned i; + + if (!gpu) + return -ENXIO; + + /* + * Maybe we could allow just UNMAP ops? OTOH userspace should just + * immediately close the device file and all will be torn down. + */ + if (to_msm_vm(ctx->vm)->unusable) + return UERR(EPIPE, dev, "context is unusable"); + + /* + * Technically, you cannot create a VM_BIND submitqueue in the first + * place, if you haven't opted in to VM_BIND context. But it is + * cleaner / less confusing, to check this case directly. 
+ */ + if (!msm_context_is_vmbind(ctx)) + return UERR(EINVAL, dev, "context does not support vmbind"); + + if (args->flags & ~MSM_VM_BIND_FLAGS) + return UERR(EINVAL, dev, "invalid flags"); - if (obj && obj->resv != drm_gpuvm_resv(gpuvm)) { - msm_gem_unlock(obj); - drm_gem_object_put(obj); + queue = msm_submitqueue_get(ctx, args->queue_id); + if (!queue) + return -ENOENT; + + if (!(queue->flags & MSM_SUBMITQUEUE_VM_BIND)) { + ret = UERR(EINVAL, dev, "Invalid queue type"); + goto out_post_unlock; + } + + if (args->flags & MSM_VM_BIND_FENCE_FD_OUT) { + out_fence_fd = get_unused_fd_flags(O_CLOEXEC); + if (out_fence_fd < 0) { + ret = out_fence_fd; + goto out_post_unlock; } } - dma_resv_unlock(drm_gpuvm_resv(gpuvm)); + + job = vm_bind_job_create(dev, gpu, queue, args->nr_ops); + if (IS_ERR(job)) { + ret = PTR_ERR(job); + goto out_post_unlock; + } + + ret = mutex_lock_interruptible(&queue->lock); + if (ret) + goto out_post_unlock; + + if (args->flags & MSM_VM_BIND_FENCE_FD_IN) { + struct dma_fence *in_fence; + + in_fence = sync_file_get_fence(args->fence_fd); + + if (!in_fence) { + ret = UERR(EINVAL, dev, "invalid in-fence"); + goto out_unlock; + } + + ret = drm_sched_job_add_dependency(&job->base, in_fence); + if (ret) + goto out_unlock; + } + + if (args->in_syncobjs > 0) { + syncobjs_to_reset = msm_syncobj_parse_deps(dev, &job->base, + file, args->in_syncobjs, + args->nr_in_syncobjs, + args->syncobj_stride); + if (IS_ERR(syncobjs_to_reset)) { + ret = PTR_ERR(syncobjs_to_reset); + goto out_unlock; + } + } + + if (args->out_syncobjs > 0) { + post_deps = msm_syncobj_parse_post_deps(dev, file, + args->out_syncobjs, + args->nr_out_syncobjs, + args->syncobj_stride); + if (IS_ERR(post_deps)) { + ret = PTR_ERR(post_deps); + goto out_unlock; + } + } + + ret = vm_bind_job_lookup_ops(job, args, file, &nr_bos); + if (ret) + goto out_unlock; + + vm_bind_prealloc_count(job); + + struct drm_exec exec; + unsigned flags = DRM_EXEC_IGNORE_DUPLICATES | DRM_EXEC_INTERRUPTIBLE_WAIT; + drm_exec_init(&exec, flags, nr_bos + 1); + + ret = vm_bind_job_lock_objects(job, &exec); + if (ret) + goto out; + + ret = vm_bind_job_pin_objects(job); + if (ret) + goto out; + + ret = vm_bind_job_prepare(job); + if (ret) + goto out; + + drm_sched_job_arm(&job->base); + + job->fence = dma_fence_get(&job->base.s_fence->finished); + + if (args->flags & MSM_VM_BIND_FENCE_FD_OUT) { + sync_file = sync_file_create(job->fence); + if (!sync_file) { + ret = -ENOMEM; + } else { + fd_install(out_fence_fd, sync_file->file); + args->fence_fd = out_fence_fd; + } + } + + if (ret) + goto out; + + vm_bind_job_attach_fences(job); + + /* + * The job can be free'd (and fence unref'd) at any point after + * drm_sched_entity_push_job(), so we need to hold our own ref + */ + fence = dma_fence_get(job->fence); + + ret = drm_sched_entity_push_job(&job->base); + + msm_syncobj_reset(syncobjs_to_reset, args->nr_in_syncobjs); + msm_syncobj_process_post_deps(post_deps, args->nr_out_syncobjs, fence); + + dma_fence_put(fence); + +out: + if (ret) + vm_bind_job_unpin_objects(job); + + drm_exec_fini(&exec); +out_unlock: + mutex_unlock(&queue->lock); +out_post_unlock: + if (ret && (out_fence_fd >= 0)) { + put_unused_fd(out_fence_fd); + if (sync_file) + fput(sync_file->file); + } + + if (!IS_ERR_OR_NULL(job)) { + if (ret) + msm_vma_job_free(&job->base); + } else { + /* + * If the submit hasn't yet taken ownership of the queue + * then we need to drop the reference ourself: + */ + msm_submitqueue_put(queue); + } + + if (!IS_ERR_OR_NULL(post_deps)) { + for (i = 0; i < 
args->nr_out_syncobjs; ++i) { + kfree(post_deps[i].chain); + drm_syncobj_put(post_deps[i].syncobj); + } + kfree(post_deps); + } + + if (!IS_ERR_OR_NULL(syncobjs_to_reset)) { + for (i = 0; i < args->nr_in_syncobjs; ++i) { + if (syncobjs_to_reset[i]) + drm_syncobj_put(syncobjs_to_reset[i]); + } + kfree(syncobjs_to_reset); + } + + return ret; } diff --git a/include/uapi/drm/msm_drm.h b/include/uapi/drm/msm_drm.h index 6d6cd1219926..5c67294edc95 100644 --- a/include/uapi/drm/msm_drm.h +++ b/include/uapi/drm/msm_drm.h @@ -272,7 +272,10 @@ struct drm_msm_gem_submit_cmd { __u32 size; /* in, cmdstream size */ __u32 pad; __u32 nr_relocs; /* in, number of submit_reloc's */ - __u64 relocs; /* in, ptr to array of submit_reloc's */ + union { + __u64 relocs; /* in, ptr to array of submit_reloc's */ + __u64 iova; /* cmdstream address (for VM_BIND contexts) */ + }; }; /* Each buffer referenced elsewhere in the cmdstream submit (ie. the @@ -339,7 +342,74 @@ struct drm_msm_gem_submit { __u32 nr_out_syncobjs; /* in, number of entries in out_syncobj. */ __u32 syncobj_stride; /* in, stride of syncobj arrays. */ __u32 pad; /*in, reserved for future use, always 0. */ +}; + +#define MSM_VM_BIND_OP_UNMAP 0 +#define MSM_VM_BIND_OP_MAP 1 +#define MSM_VM_BIND_OP_MAP_NULL 2 + +#define MSM_VM_BIND_OP_DUMP 1 +#define MSM_VM_BIND_OP_FLAGS ( \ + MSM_VM_BIND_OP_DUMP | \ + 0) +/** + * struct drm_msm_vm_bind_op - bind/unbind op to run + */ +struct drm_msm_vm_bind_op { + /** @op: one of MSM_VM_BIND_OP_x */ + __u32 op; + /** @handle: GEM object handle, MBZ for UNMAP or MAP_NULL */ + __u32 handle; + /** @obj_offset: Offset into GEM object, MBZ for UNMAP or MAP_NULL */ + __u64 obj_offset; + /** @iova: Address to operate on */ + __u64 iova; + /** @range: Number of bytes to map/unmap */ + __u64 range; + /** @flags: Bitmask of MSM_VM_BIND_OP_FLAG_x */ + __u32 flags; + /** @pad: MBZ */ + __u32 pad; +}; + +#define MSM_VM_BIND_FENCE_FD_IN 0x00000001 +#define MSM_VM_BIND_FENCE_FD_OUT 0x00000002 +#define MSM_VM_BIND_FLAGS ( \ + MSM_VM_BIND_FENCE_FD_IN | \ + MSM_VM_BIND_FENCE_FD_OUT | \ + 0) + +/** + * struct drm_msm_vm_bind - Input of &DRM_IOCTL_MSM_VM_BIND + */ +struct drm_msm_vm_bind { + /** @flags: in, bitmask of MSM_VM_BIND_x */ + __u32 flags; + /** @nr_ops: the number of bind ops in this ioctl */ + __u32 nr_ops; + /** @fence_fd: in/out fence fd (see MSM_VM_BIND_FENCE_FD_IN/OUT) */ + __s32 fence_fd; + /** @queue_id: in, submitqueue id */ + __u32 queue_id; + /** @in_syncobjs: in, ptr to array of drm_msm_gem_syncobj */ + __u64 in_syncobjs; + /** @out_syncobjs: in, ptr to array of drm_msm_gem_syncobj */ + __u64 out_syncobjs; + /** @nr_in_syncobjs: in, number of entries in in_syncobj */ + __u32 nr_in_syncobjs; + /** @nr_out_syncobjs: in, number of entries in out_syncobj */ + __u32 nr_out_syncobjs; + /** @syncobj_stride: in, stride of syncobj arrays */ + __u32 syncobj_stride; + /** @op_stride: sizeof each struct drm_msm_vm_bind_op in @ops */ + __u32 op_stride; + union { + /** @op: used if nr_ops == 1 */ + struct drm_msm_vm_bind_op op; + /** @ops: userptr to array of drm_msm_vm_bind_op if nr_ops > 1 */ + __u64 ops; + }; }; #define MSM_WAIT_FENCE_BOOST 0x00000001 @@ -435,6 +505,7 @@ struct drm_msm_submitqueue_query { #define DRM_MSM_SUBMITQUEUE_NEW 0x0A #define DRM_MSM_SUBMITQUEUE_CLOSE 0x0B #define DRM_MSM_SUBMITQUEUE_QUERY 0x0C +#define DRM_MSM_VM_BIND 0x0D #define DRM_IOCTL_MSM_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GET_PARAM, struct drm_msm_param) #define DRM_IOCTL_MSM_SET_PARAM DRM_IOW (DRM_COMMAND_BASE +
DRM_MSM_SET_PARAM, struct drm_msm_param) @@ -448,6 +519,7 @@ struct drm_msm_submitqueue_query { #define DRM_IOCTL_MSM_SUBMITQUEUE_NEW DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_NEW, struct drm_msm_submitqueue) #define DRM_IOCTL_MSM_SUBMITQUEUE_CLOSE DRM_IOW (DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_CLOSE, __u32) #define DRM_IOCTL_MSM_SUBMITQUEUE_QUERY DRM_IOW (DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_QUERY, struct drm_msm_submitqueue_query) +#define DRM_IOCTL_MSM_VM_BIND DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_VM_BIND, struct drm_msm_vm_bind) #if defined(__cplusplus) }
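
[Editor's illustration, not part of the patch.] To make the new uapi concrete: assuming this series is applied and drm/msm_drm.h therefore provides DRM_IOCTL_MSM_VM_BIND, and assuming queue_id names a submitqueue created with MSM_SUBMITQUEUE_VM_BIND on a context that has opted in to VM_BIND, a minimal userspace sketch of a single-op bind could look like the following. The function name, handle, iova and size values are placeholders.

	/*
	 * Hypothetical userspace sketch: map one GEM buffer into a
	 * VM_BIND-enabled context via DRM_IOCTL_MSM_VM_BIND.
	 */
	#include <stdint.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/ioctl.h>

	#include <drm/msm_drm.h>	/* new uapi from this series */

	int bind_one_buffer(int drm_fd, uint32_t queue_id, uint32_t bo_handle,
			    uint64_t gpu_iova, uint64_t size)
	{
		struct drm_msm_vm_bind req;

		memset(&req, 0, sizeof(req));
		req.flags = 0;			/* no in/out fence fds */
		req.queue_id = queue_id;	/* MSM_SUBMITQUEUE_VM_BIND queue */
		req.nr_ops = 1;			/* single op: use the inline 'op' */
		req.op.op = MSM_VM_BIND_OP_MAP;
		req.op.handle = bo_handle;
		req.op.obj_offset = 0;
		req.op.iova = gpu_iova;		/* must be PAGE_SIZE aligned */
		req.op.range = size;		/* bytes, must be PAGE_SIZE aligned */

		if (ioctl(drm_fd, DRM_IOCTL_MSM_VM_BIND, &req)) {
			perror("DRM_IOCTL_MSM_VM_BIND");
			return -1;
		}
		return 0;
	}

For nr_ops > 1, req.ops would instead point at a userspace array of struct drm_msm_vm_bind_op, with req.op_stride describing the per-entry size; real code would also typically use libdrm's drmIoctl() to get EINTR restart behaviour.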
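
[Editor's illustration, not part of the patch.] The merge heuristic in vm_bind_prealloc_count()/ops_are_same_pte() above can also be shown in a self-contained, simplified form: adjacent MAP ops whose boundaries land in the same 2MB last-level page table are folded into one range before prealloc counting. The struct, constant and sample addresses below are stand-ins, not driver code.

	#include <stdint.h>
	#include <stdio.h>

	#define SZ_2M (2ull * 1024 * 1024)

	struct bind_op { uint64_t iova, range; };

	/* Do two ops end/start in the same 2MB last-level table? */
	static int same_2m_table(const struct bind_op *prev, const struct bind_op *next)
	{
		uint64_t mask = ~(SZ_2M - 1);

		return ((prev->iova + prev->range) & mask) == (next->iova & mask);
	}

	int main(void)
	{
		/* e.g. a sparse-image style stream of small, mostly contiguous maps */
		struct bind_op ops[] = {
			{ 0x100000, 0x1000 }, { 0x101000, 0x1000 },	/* same 2MB table */
			{ 0x40000000, 0x4000 },				/* new range */
		};
		const struct bind_op *first = NULL, *last = NULL;
		unsigned ranges = 0;

		for (unsigned i = 0; i < sizeof(ops) / sizeof(ops[0]); i++) {
			if (last && same_2m_table(last, &ops[i])) {
				last = &ops[i];		/* extend current range */
				continue;
			}
			if (first)
				ranges++;		/* flush the current range */
			first = last = &ops[i];
		}
		if (first)
			ranges++;			/* flush the trailing range */

		printf("%u prealloc range(s) for %zu ops\n", ranges,
		       sizeof(ops) / sizeof(ops[0]));
		return 0;
	}

This prints "2 prealloc range(s) for 3 ops": the first two maps share a 2MB table and are counted once, which is what keeps the pgtable preallocation from vastly overcounting for sparse-image style workloads.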