Message ID | 20220730150952.v3.2.Ifee853f6d8217a0fdacc459092bbc9e81a8a7ac7@changeid |
---|---|
State | Superseded |
Headers | show |
Series | Improve GPU Recovery | expand |
On Sun, Jul 31, 2022 at 9:33 AM Akhil P Oommen <quic_akhilpo@quicinc.com> wrote: > > On 7/31/2022 9:26 PM, Rob Clark wrote: > > On Sat, Jul 30, 2022 at 2:41 AM Akhil P Oommen <quic_akhilpo@quicinc.com> wrote: > >> Instead of separate refcount for each submit, take single rpm refcount > >> on behalf of all the submits. This makes it easier to drop the rpm > >> refcount during recovery in an upcoming patch. > >> > >> Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> > >> --- > >> > >> (no changes since v1) > > I see no earlier version of this patch? > > > >> drivers/gpu/drm/msm/msm_gpu.c | 12 ++++++++---- > >> 1 file changed, 8 insertions(+), 4 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c > >> index c8cd9bf..e1dd3cc 100644 > >> --- a/drivers/gpu/drm/msm/msm_gpu.c > >> +++ b/drivers/gpu/drm/msm/msm_gpu.c > >> @@ -663,11 +663,12 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring, > >> mutex_lock(&gpu->active_lock); > >> gpu->active_submits--; > >> WARN_ON(gpu->active_submits < 0); > >> - if (!gpu->active_submits) > >> + if (!gpu->active_submits) { > >> msm_devfreq_idle(gpu); > >> - mutex_unlock(&gpu->active_lock); > >> + pm_runtime_put_autosuspend(&gpu->pdev->dev); > >> + } > >> > >> - pm_runtime_put_autosuspend(&gpu->pdev->dev); > >> + mutex_unlock(&gpu->active_lock); > >> > >> msm_gem_submit_put(submit); > >> } > >> @@ -756,14 +757,17 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) > >> > >> /* Update devfreq on transition from idle->active: */ > >> mutex_lock(&gpu->active_lock); > >> - if (!gpu->active_submits) > >> + if (!gpu->active_submits) { > >> + pm_runtime_get(&gpu->pdev->dev); > >> msm_devfreq_active(gpu); > >> + } > >> gpu->active_submits++; > >> mutex_unlock(&gpu->active_lock); > >> > >> gpu->funcs->submit(gpu, submit); > >> gpu->cur_ctx_seqno = submit->queue->ctx->seqno; > >> > >> + pm_runtime_put(&gpu->pdev->dev); > > this looks unbalanced? > There is another pm_runtime_get_sync at the top of this function. Just > before hw_init(). > https://elixir.bootlin.com/linux/v5.19-rc8/source/drivers/gpu/drm/msm/msm_gpu.c#L737 oh, right.. sorry, I was looking at my local stack of WIP patches which went the opposite direction and moved the runpm into just msm_job_run().. I'll drop that one BR, -R > > -Akhil. > > > > BR, > > -R > > > >> hangcheck_timer_reset(gpu); > >> } > >> > >> -- > >> 2.7.4 > >> >
On 8/1/2022 3:45 AM, Rob Clark wrote: > On Sun, Jul 31, 2022 at 9:33 AM Akhil P Oommen <quic_akhilpo@quicinc.com> wrote: >> On 7/31/2022 9:26 PM, Rob Clark wrote: >>> On Sat, Jul 30, 2022 at 2:41 AM Akhil P Oommen <quic_akhilpo@quicinc.com> wrote: >>>> Instead of separate refcount for each submit, take single rpm refcount >>>> on behalf of all the submits. This makes it easier to drop the rpm >>>> refcount during recovery in an upcoming patch. >>>> >>>> Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> >>>> --- >>>> >>>> (no changes since v1) >>> I see no earlier version of this patch? My bad, that is incorrect. This is a new patch included in the current series. -Akhil. >>> >>>> drivers/gpu/drm/msm/msm_gpu.c | 12 ++++++++---- >>>> 1 file changed, 8 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c >>>> index c8cd9bf..e1dd3cc 100644 >>>> --- a/drivers/gpu/drm/msm/msm_gpu.c >>>> +++ b/drivers/gpu/drm/msm/msm_gpu.c >>>> @@ -663,11 +663,12 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring, >>>> mutex_lock(&gpu->active_lock); >>>> gpu->active_submits--; >>>> WARN_ON(gpu->active_submits < 0); >>>> - if (!gpu->active_submits) >>>> + if (!gpu->active_submits) { >>>> msm_devfreq_idle(gpu); >>>> - mutex_unlock(&gpu->active_lock); >>>> + pm_runtime_put_autosuspend(&gpu->pdev->dev); >>>> + } >>>> >>>> - pm_runtime_put_autosuspend(&gpu->pdev->dev); >>>> + mutex_unlock(&gpu->active_lock); >>>> >>>> msm_gem_submit_put(submit); >>>> } >>>> @@ -756,14 +757,17 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) >>>> >>>> /* Update devfreq on transition from idle->active: */ >>>> mutex_lock(&gpu->active_lock); >>>> - if (!gpu->active_submits) >>>> + if (!gpu->active_submits) { >>>> + pm_runtime_get(&gpu->pdev->dev); >>>> msm_devfreq_active(gpu); >>>> + } >>>> gpu->active_submits++; >>>> mutex_unlock(&gpu->active_lock); >>>> >>>> gpu->funcs->submit(gpu, submit); >>>> gpu->cur_ctx_seqno = submit->queue->ctx->seqno; >>>> >>>> + pm_runtime_put(&gpu->pdev->dev); >>> this looks unbalanced? >> There is another pm_runtime_get_sync at the top of this function. Just >> before hw_init(). >> https://elixir.bootlin.com/linux/v5.19-rc8/source/drivers/gpu/drm/msm/msm_gpu.c#L737 > oh, right.. sorry, I was looking at my local stack of WIP patches > which went the opposite direction and moved the runpm into just > msm_job_run().. I'll drop that one > > BR, > -R > >> -Akhil. >>> BR, >>> -R >>> >>>> hangcheck_timer_reset(gpu); >>>> } >>>> >>>> -- >>>> 2.7.4 >>>>
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index c8cd9bf..e1dd3cc 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -663,11 +663,12 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring, mutex_lock(&gpu->active_lock); gpu->active_submits--; WARN_ON(gpu->active_submits < 0); - if (!gpu->active_submits) + if (!gpu->active_submits) { msm_devfreq_idle(gpu); - mutex_unlock(&gpu->active_lock); + pm_runtime_put_autosuspend(&gpu->pdev->dev); + } - pm_runtime_put_autosuspend(&gpu->pdev->dev); + mutex_unlock(&gpu->active_lock); msm_gem_submit_put(submit); } @@ -756,14 +757,17 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) /* Update devfreq on transition from idle->active: */ mutex_lock(&gpu->active_lock); - if (!gpu->active_submits) + if (!gpu->active_submits) { + pm_runtime_get(&gpu->pdev->dev); msm_devfreq_active(gpu); + } gpu->active_submits++; mutex_unlock(&gpu->active_lock); gpu->funcs->submit(gpu, submit); gpu->cur_ctx_seqno = submit->queue->ctx->seqno; + pm_runtime_put(&gpu->pdev->dev); hangcheck_timer_reset(gpu); }
Instead of separate refcount for each submit, take single rpm refcount on behalf of all the submits. This makes it easier to drop the rpm refcount during recovery in an upcoming patch. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> --- (no changes since v1) drivers/gpu/drm/msm/msm_gpu.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)