[v2,0/2] drm/msm: Add tracking for faults associated with an address space

Message ID 20220201161618.778455-1-robdclark@gmail.com

Message

Rob Clark Feb. 1, 2022, 4:16 p.m. UTC
From: Rob Clark <robdclark@chromium.org>

Currently, for GL_EXT_robustness userspace uses the global and per-
submitqueue fault counters to determine GUILTY_CONTEXT_RESET_EXT vs
INNOCENT_CONTEXT_RESET_EXT.  But that is a bit overly paranoid, in
that a fault in a different process's context (when it has its own
isolated address space) should not hurt anything.
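
As an illustration of the userspace logic described above, here is a
minimal sketch. The names fault_snapshot, classify_reset and the enum
values are hypothetical, not the actual Mesa or kernel definitions; it
only assumes that userspace compares two snapshots of the per-submitqueue
and global fault counters.

```c
#include <stdint.h>

/* Hypothetical names for illustration only. */
enum reset_status {
    NO_RESET,
    GUILTY_CONTEXT_RESET,    /* stands in for GUILTY_CONTEXT_RESET_EXT */
    INNOCENT_CONTEXT_RESET   /* stands in for INNOCENT_CONTEXT_RESET_EXT */
};

struct fault_snapshot {
    uint64_t global_faults;  /* faults reported for the GPU as a whole */
    uint64_t queue_faults;   /* faults attributed to this submitqueue */
};

/* Compare the counters sampled earlier (prev) with the current ones (now). */
static enum reset_status classify_reset(const struct fault_snapshot *prev,
                                        const struct fault_snapshot *now)
{
    /* Our own queue faulted: this context is to blame. */
    if (now->queue_faults != prev->queue_faults)
        return GUILTY_CONTEXT_RESET;

    /* Somebody else faulted: reported as a reset we did not cause. */
    if (now->global_faults != prev->global_faults)
        return INNOCENT_CONTEXT_RESET;

    return NO_RESET;
}
```

With a purely global counter, the second branch fires for a fault in any
process, which is the overly paranoid behavior in question.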

This is particularly annoying with CrOS and chrome's exit_on_context_lost quirk,
while running deqp in the android container, as the deqp-egl suite has
tests that intentionally trigger gpu hangs (for the purpose of testing
the robustness extension), which triggers chrome to restart, which
restarts the android container!

But chrome doesn't need to know about these faults, thanks to address
space isolation.
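
A rough sketch of what per-address-space tracking could look like is
below. This is not the code from the patches (the diffs are not shown in
full here); the struct and function names are simplified, hypothetical
stand-ins, and it assumes that a fault which cannot be tied to one
address space is still counted globally.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct address_space {
    uint64_t faults;              /* faults attributed to this address space */
};

struct gpu {
    uint64_t global_faults;       /* faults not tied to one address space */
};

struct context {
    struct address_space *aspace; /* NULL if the context has no private one */
};

/* Count a fault against its address space if known, else globally. */
static void record_fault(struct gpu *gpu, struct address_space *aspace)
{
    if (aspace)
        aspace->faults++;
    else
        gpu->global_faults++;
}

/* Fault count a given context would be told about. */
static uint64_t visible_faults(const struct gpu *gpu,
                               const struct context *ctx)
{
    uint64_t n = gpu->global_faults;

    if (ctx && ctx->aspace)
        n += ctx->aspace->faults;
    return n;
}
```

Under these assumptions, a hang triggered inside one isolated address
space does not change the count seen by a context in another one.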

Applies on top of https://patchwork.freedesktop.org/series/98907/

Rob Clark (2):
  drm/msm/gpu: Add ctx to get_param()
  drm/msm/gpu: Track global faults per address-space

 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 5 +++--
 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 3 ++-
 drivers/gpu/drm/msm/msm_drv.c           | 3 ++-
 drivers/gpu/drm/msm/msm_gem.h           | 3 +++
 drivers/gpu/drm/msm/msm_gpu.c           | 8 +++++++-
 drivers/gpu/drm/msm/msm_gpu.h           | 8 ++++++--
 drivers/gpu/drm/msm/msm_rd.c            | 6 ++++--
 7 files changed, 27 insertions(+), 9 deletions(-)

Comments

Emma Anholt Feb. 2, 2022, 7:27 p.m. UTC | #1
On Tue, Feb 1, 2022 at 8:16 AM Rob Clark <robdclark@gmail.com> wrote:
>
> From: Rob Clark <robdclark@chromium.org>
>
> Prep work for next patch.
>
> Signed-off-by: Rob Clark <robdclark@chromium.org>
> ---

> diff --git a/drivers/gpu/drm/msm/msm_rd.c b/drivers/gpu/drm/msm/msm_rd.c
> index 7e4d6460719e..dd3605b46264 100644
> --- a/drivers/gpu/drm/msm/msm_rd.c
> +++ b/drivers/gpu/drm/msm/msm_rd.c
> @@ -197,13 +197,15 @@ static int rd_open(struct inode *inode, struct file *file)
>
>         /* the parsing tools need to know gpu-id to know which
>          * register database to load.
> +        *
> +        * Note: These particular param does not require a context
>          */

Minor typo fix, "param does not" -> "params do not"

Other than that, series
Reviewed-by: Emma Anholt <emma@anholt.net>

and I love that we're catching non-address-space associated faults now, too!