mbox series

[v11,00/12] iommu/arm-smmu: Add Adreno SMMU specific implementation

Message ID 20200803193547.305660-1-jcrouse@codeaurora.org
Headers show
Series iommu/arm-smmu: Add Adreno SMMU specific implementation | expand

Message

Jordan Crouse Aug. 3, 2020, 7:35 p.m. UTC
This series adds an Adreno SMMU implementation to arm-smmu to allow GPU hardware
pagetable switching.

The Adreno GPU has built in capabilities to switch the TTBR0 pagetable during
runtime to allow each individual instance or application to have its own
pagetable.  In order to take advantage of the HW capabilities there are certain
requirements needed of the SMMU hardware.

This series adds support for an Adreno specific arm-smmu implementation. The new
implementation 1) ensures that the GPU domain is always assigned context bank 0,
2) enables split pagetable support (TTBR1) so that the instance specific
pagetable can be swapped while the global memory remains in place and 3) shares
the current pagetable configuration with the GPU driver to allow it to create
its own io-pgtable instances.

The series then adds the drm/msm code to enable these features. For targets that
support it allocate new pagetables using the io-pgtable configuration shared by
the arm-smmu driver and swap them in during runtime.

This version of the series merges the previous patchset(s) [1] and [2]
with the following improvements:

v11:
  - Add implementation specific get_attr/set_attr functions (per Rob Clark)
  - Fix context bank allocation (per Bjorn Andersson)
v10:
  - arm-smmu: add implementation hook to allocate context banks
  - arm-smmu: Match the GPU domain by stream ID instead of compatible string
  - arm-smmu: Make DOMAIN_ATTR_PGTABLE_CFG bi-directional. The leaf driver
    queries the configuration to create a pagetable and then sends the newly
    created configuration back to the smmu-driver to enable TTBR0
  - drm/msm: Add context reference counting for submissions
  - drm/msm: Use dummy functions to skip TLB operations on per-instance
    pagetables

[1] https://lists.linuxfoundation.org/pipermail/iommu/2020-June/045653.html
[2] https://lists.linuxfoundation.org/pipermail/iommu/2020-June/045659.html


Jordan Crouse (12):
  iommu/arm-smmu: Pass io-pgtable config to implementation specific
    function
  iommu/arm-smmu: Add support for split pagetables
  iommu/arm-smmu: Prepare for the adreno-smmu implementation
  iommu: Add a domain attribute to get/set a pagetable configuration
  iommu/arm-smmu-qcom: Add implementation for the adreno GPU SMMU
  dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU
  drm/msm: Add a context pointer to the submitqueue
  drm/msm: Set the global virtual address range from the IOMMU domain
  drm/msm: Add support to create a local pagetable
  drm/msm: Add support for private address space instances
  drm/msm/a6xx: Add support for per-instance pagetables
  arm: dts: qcom: sm845: Set the compatible string for the GPU SMMU

 .../devicetree/bindings/iommu/arm,smmu.yaml   |   4 +
 arch/arm64/boot/dts/qcom/sdm845.dtsi          |   2 +-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c         |  12 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c         |  58 ++++-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h         |   1 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c       |  18 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h       |   3 +-
 drivers/gpu/drm/msm/msm_drv.c                 |  16 +-
 drivers/gpu/drm/msm/msm_drv.h                 |  13 ++
 drivers/gpu/drm/msm/msm_gem.h                 |   1 +
 drivers/gpu/drm/msm/msm_gem_submit.c          |   8 +-
 drivers/gpu/drm/msm/msm_gem_vma.c             |   9 +
 drivers/gpu/drm/msm/msm_gpu.c                 |  26 ++-
 drivers/gpu/drm/msm/msm_gpu.h                 |  12 +-
 drivers/gpu/drm/msm/msm_gpummu.c              |   2 +-
 drivers/gpu/drm/msm/msm_iommu.c               | 198 +++++++++++++++++-
 drivers/gpu/drm/msm/msm_mmu.h                 |  16 +-
 drivers/gpu/drm/msm/msm_ringbuffer.h          |   1 +
 drivers/gpu/drm/msm/msm_submitqueue.c         |   8 +-
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c    |   6 +-
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c    | 156 +++++++++++++-
 drivers/iommu/arm/arm-smmu/arm-smmu.c         | 113 +++++-----
 drivers/iommu/arm/arm-smmu/arm-smmu.h         |  82 +++++++-
 include/linux/iommu.h                         |   1 +
 24 files changed, 653 insertions(+), 113 deletions(-)

Comments

Rob Clark Aug. 7, 2020, 8:03 p.m. UTC | #1
On Mon, Aug 3, 2020 at 12:36 PM Jordan Crouse <jcrouse@codeaurora.org> wrote:
>
> Add support for allocating private address space instances. Targets that
> support per-context pagetables should implement their own function to
> allocate private address spaces.
>
> The default will return a pointer to the global address space.
>
> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---
>
>  drivers/gpu/drm/msm/msm_drv.c     | 13 +++++++------
>  drivers/gpu/drm/msm/msm_drv.h     |  5 +++++
>  drivers/gpu/drm/msm/msm_gem_vma.c |  9 +++++++++
>  drivers/gpu/drm/msm/msm_gpu.c     | 17 +++++++++++++++++
>  drivers/gpu/drm/msm/msm_gpu.h     |  5 +++++
>  5 files changed, 43 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index 108b663c3ef2..f072306f1260 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -597,7 +597,7 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
>         kref_init(&ctx->ref);
>         msm_submitqueue_init(dev, ctx);
>
> -       ctx->aspace = priv->gpu ? priv->gpu->aspace : NULL;
> +       ctx->aspace = msm_gpu_create_private_address_space(priv->gpu);
>         file->driver_priv = ctx;
>
>         return 0;
> @@ -780,18 +780,19 @@ static int msm_ioctl_gem_cpu_fini(struct drm_device *dev, void *data,
>  }
>
>  static int msm_ioctl_gem_info_iova(struct drm_device *dev,
> -               struct drm_gem_object *obj, uint64_t *iova)
> +               struct drm_file *file, struct drm_gem_object *obj,
> +               uint64_t *iova)
>  {
> -       struct msm_drm_private *priv = dev->dev_private;
> +       struct msm_file_private *ctx = file->driver_priv;
>
> -       if (!priv->gpu)
> +       if (!ctx->aspace)
>                 return -EINVAL;
>
>         /*
>          * Don't pin the memory here - just get an address so that userspace can
>          * be productive
>          */
> -       return msm_gem_get_iova(obj, priv->gpu->aspace, iova);
> +       return msm_gem_get_iova(obj, ctx->aspace, iova);
>  }
>
>  static int msm_ioctl_gem_info(struct drm_device *dev, void *data,
> @@ -830,7 +831,7 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void *data,
>                 args->value = msm_gem_mmap_offset(obj);
>                 break;
>         case MSM_INFO_GET_IOVA:
> -               ret = msm_ioctl_gem_info_iova(dev, obj, &args->value);
> +               ret = msm_ioctl_gem_info_iova(dev, file, obj, &args->value);
>                 break;
>         case MSM_INFO_SET_NAME:
>                 /* length check should leave room for terminating null: */
> diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
> index f69c6d62584d..51a5c9083e13 100644
> --- a/drivers/gpu/drm/msm/msm_drv.h
> +++ b/drivers/gpu/drm/msm/msm_drv.h
> @@ -249,6 +249,10 @@ int msm_gem_map_vma(struct msm_gem_address_space *aspace,
>  void msm_gem_close_vma(struct msm_gem_address_space *aspace,
>                 struct msm_gem_vma *vma);
>
> +
> +struct msm_gem_address_space *
> +msm_gem_address_space_get(struct msm_gem_address_space *aspace);
> +
>  void msm_gem_address_space_put(struct msm_gem_address_space *aspace);
>
>  struct msm_gem_address_space *
> @@ -434,6 +438,7 @@ static inline void msm_file_private_destroy(struct kref *kref)
>         struct msm_file_private *ctx = container_of(kref,
>                 struct msm_file_private, ref);
>
> +       msm_gem_address_space_put(ctx->aspace);
>         kfree(ctx);
>  }
>
> diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c
> index 5f6a11211b64..29cc1305cf37 100644
> --- a/drivers/gpu/drm/msm/msm_gem_vma.c
> +++ b/drivers/gpu/drm/msm/msm_gem_vma.c
> @@ -27,6 +27,15 @@ void msm_gem_address_space_put(struct msm_gem_address_space *aspace)
>                 kref_put(&aspace->kref, msm_gem_address_space_destroy);
>  }
>
> +struct msm_gem_address_space *
> +msm_gem_address_space_get(struct msm_gem_address_space *aspace)
> +{
> +       if (!IS_ERR_OR_NULL(aspace))
> +               kref_get(&aspace->kref);
> +
> +       return aspace;
> +}
> +
>  /* Actually unmap memory for the vma */
>  void msm_gem_purge_vma(struct msm_gem_address_space *aspace,
>                 struct msm_gem_vma *vma)
> diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> index a1f3da6550e5..aabbd7908ee5 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.c
> +++ b/drivers/gpu/drm/msm/msm_gpu.c
> @@ -823,6 +823,23 @@ static int get_clocks(struct platform_device *pdev, struct msm_gpu *gpu)
>         return 0;
>  }
>
> +/* Return a new address space for a msm_drm_private instance */
> +struct msm_gem_address_space *
> +msm_gpu_create_private_address_space(struct msm_gpu *gpu)
> +{
> +       if (!gpu)
> +               return NULL;
> +
> +       /*
> +        * If the target doesn't support private address spaces then return
> +        * the global one
> +        */
> +       if (!gpu->funcs->create_private_address_space)
> +               return msm_gem_address_space_get(gpu->aspace);
> +
> +       return gpu->funcs->create_private_address_space(gpu);

so if you flip the logic around here, you can get rid of the
msm_gem_address_space_get() in per-gen backend and consolidate the
error handling here, ie.

  struct msm_gem_address_space *aspace = NULL;

  if (gpu->funcs->create_private_address_space)
     aspace = gpu->funcs->create_private_address_space(...);

  if (IS_ERR_OR_NULL(aspace))
    aspace = msm_gem_address_space_get(...);

(and that also simplifies a bit the patch I have on top to show comm in debugfs)

BR,
-R

> +}
> +
>  int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
>                 struct msm_gpu *gpu, const struct msm_gpu_funcs *funcs,
>                 const char *name, struct msm_gpu_config *config)
> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> index d496d488222c..d298657b4730 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.h
> +++ b/drivers/gpu/drm/msm/msm_gpu.h
> @@ -64,6 +64,8 @@ struct msm_gpu_funcs {
>         void (*gpu_set_freq)(struct msm_gpu *gpu, struct dev_pm_opp *opp);
>         struct msm_gem_address_space *(*create_address_space)
>                 (struct msm_gpu *gpu, struct platform_device *pdev);
> +       struct msm_gem_address_space *(*create_private_address_space)
> +               (struct msm_gpu *gpu);
>  };
>
>  struct msm_gpu {
> @@ -286,6 +288,9 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
>                 struct msm_gpu *gpu, const struct msm_gpu_funcs *funcs,
>                 const char *name, struct msm_gpu_config *config);
>
> +struct msm_gem_address_space *
> +msm_gpu_create_private_address_space(struct msm_gpu *gpu);
> +
>  void msm_gpu_cleanup(struct msm_gpu *gpu);
>
>  struct msm_gpu *adreno_load_gpu(struct drm_device *dev);
> --
> 2.25.1
>
> _______________________________________________
> Freedreno mailing list
> Freedreno@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/freedreno