diff mbox series

[v9,15/15] drm/i915: Add deadline based boost support

Message ID 20230302235356.3148279-16-robdclark@gmail.com
State Superseded
Headers show
Series dma-fence: Deadline awareness | expand

Commit Message

Rob Clark March 2, 2023, 11:53 p.m. UTC
From: Rob Clark <robdclark@chromium.org>

v2: rebase

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Comments

Tvrtko Ursulin March 3, 2023, 3:07 p.m. UTC | #1
On 03/03/2023 14:48, Rob Clark wrote:
> On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> On 03/03/2023 03:21, Rodrigo Vivi wrote:
>>> On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
>>>> From: Rob Clark <robdclark@chromium.org>
>>>>
>>>
>>> missing some wording here...
>>>
>>>> v2: rebase
>>>>
>>>> Signed-off-by: Rob Clark <robdclark@chromium.org>
>>>> ---
>>>>    drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++
>>>>    1 file changed, 20 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>>> index 7503dcb9043b..44491e7e214c 100644
>>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>>> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence)
>>>>       return i915_request_enable_breadcrumb(to_request(fence));
>>>>    }
>>>>
>>>> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
>>>> +{
>>>> +    struct i915_request *rq = to_request(fence);
>>>> +
>>>> +    if (i915_request_completed(rq))
>>>> +            return;
>>>> +
>>>> +    if (i915_request_started(rq))
>>>> +            return;
>>>
>>> why do we skip the boost if already started?
>>> don't we want to boost the freq anyway?
>>
>> I'd wager Rob is just copying the current i915 wait boost logic.
> 
> Yup, and probably incorrectly.. Matt reported fewer boosts/sec
> compared to your RFC, this could be the bug

Hm, there I have preserved this same !i915_request_started logic.

Presumably it's not just fewer boosts but lower performance. How is he 
setting the deadline? Somehow from clFlush or so?

Regards,

Tvrtko

P.S. Take note that I did not post the latest version of my RFC. The one 
where I fix the fence chain and array misses you pointed out. I did not 
think it would be worthwhile given no universal love for it, but if 
people are testing with it more widely that I was aware perhaps I should.

>>>> +
>>>> +    /*
>>>> +     * TODO something more clever for deadlines that are in the
>>>> +     * future.  I think probably track the nearest deadline in
>>>> +     * rq->timeline and set timer to trigger boost accordingly?
>>>> +     */
>>>
>>> I'm afraid it will be very hard to find some heuristics of what's
>>> late enough for the boost no?
>>> I mean, how early to boost the freq on an upcoming deadline for the
>>> timer?
>>
>> We can off load this patch from Rob and deal with it separately, or
>> after the fact?
> 
> That is completely my intention, I expect you to replace my i915 patch ;-)
> 
> Rough idea when everyone is happy with the core bits is to setup an
> immutable branch without the driver specific patches, which could be
> merged into drm-next and $driver-next and then each driver team can
> add there own driver patches on top
> 
> BR,
> -R
> 
>> It's a half solution without a smarter scheduler too. Like
>> https://lore.kernel.org/all/20210208105236.28498-10-chris@chris-wilson.co.uk/,
>> or if GuC plans to do something like that at any point.
>>
>> Or bump the priority too if deadline is looming?
>>
>> IMO it is not very effective to fiddle with the heuristic on an ad-hoc
>> basis. For instance I have a new heuristics which improves the
>> problematic OpenCL cases for further 5% (relative to the current
>> waitboost improvement from adding missing syncobj waitboost). But I
>> can't really test properly for regressions over platforms, stacks,
>> workloads.. :(
>>
>> Regards,
>>
>> Tvrtko
>>
>>>
>>>> +
>>>> +    intel_rps_boost(rq);
>>>> +}
>>>> +
>>>>    static signed long i915_fence_wait(struct dma_fence *fence,
>>>>                                  bool interruptible,
>>>>                                  signed long timeout)
>>>> @@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = {
>>>>       .signaled = i915_fence_signaled,
>>>>       .wait = i915_fence_wait,
>>>>       .release = i915_fence_release,
>>>> +    .set_deadline = i915_fence_set_deadline,
>>>>    };
>>>>
>>>>    static void irq_execute_cb(struct irq_work *wrk)
>>>> --
>>>> 2.39.1
>>>>
Rob Clark March 3, 2023, 3:41 p.m. UTC | #2
On Fri, Mar 3, 2023 at 7:08 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 03/03/2023 14:48, Rob Clark wrote:
> > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >>
> >>
> >> On 03/03/2023 03:21, Rodrigo Vivi wrote:
> >>> On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
> >>>> From: Rob Clark <robdclark@chromium.org>
> >>>>
> >>>
> >>> missing some wording here...
> >>>
> >>>> v2: rebase
> >>>>
> >>>> Signed-off-by: Rob Clark <robdclark@chromium.org>
> >>>> ---
> >>>>    drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++
> >>>>    1 file changed, 20 insertions(+)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> >>>> index 7503dcb9043b..44491e7e214c 100644
> >>>> --- a/drivers/gpu/drm/i915/i915_request.c
> >>>> +++ b/drivers/gpu/drm/i915/i915_request.c
> >>>> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence)
> >>>>       return i915_request_enable_breadcrumb(to_request(fence));
> >>>>    }
> >>>>
> >>>> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> >>>> +{
> >>>> +    struct i915_request *rq = to_request(fence);
> >>>> +
> >>>> +    if (i915_request_completed(rq))
> >>>> +            return;
> >>>> +
> >>>> +    if (i915_request_started(rq))
> >>>> +            return;
> >>>
> >>> why do we skip the boost if already started?
> >>> don't we want to boost the freq anyway?
> >>
> >> I'd wager Rob is just copying the current i915 wait boost logic.
> >
> > Yup, and probably incorrectly.. Matt reported fewer boosts/sec
> > compared to your RFC, this could be the bug
>
> Hm, there I have preserved this same !i915_request_started logic.
>
> Presumably it's not just fewer boosts but lower performance. How is he
> setting the deadline? Somehow from clFlush or so?

Yeah, fewer boosts, lower freq/perf.. I cobbled together a quick mesa
hack to set the DEADLINE flag on syncobj waits, but it seems likely
that I missed something somewhere

BR,
-R

> Regards,
>
> Tvrtko
>
> P.S. Take note that I did not post the latest version of my RFC. The one
> where I fix the fence chain and array misses you pointed out. I did not
> think it would be worthwhile given no universal love for it, but if
> people are testing with it more widely that I was aware perhaps I should.
>
> >>>> +
> >>>> +    /*
> >>>> +     * TODO something more clever for deadlines that are in the
> >>>> +     * future.  I think probably track the nearest deadline in
> >>>> +     * rq->timeline and set timer to trigger boost accordingly?
> >>>> +     */
> >>>
> >>> I'm afraid it will be very hard to find some heuristics of what's
> >>> late enough for the boost no?
> >>> I mean, how early to boost the freq on an upcoming deadline for the
> >>> timer?
> >>
> >> We can off load this patch from Rob and deal with it separately, or
> >> after the fact?
> >
> > That is completely my intention, I expect you to replace my i915 patch ;-)
> >
> > Rough idea when everyone is happy with the core bits is to setup an
> > immutable branch without the driver specific patches, which could be
> > merged into drm-next and $driver-next and then each driver team can
> > add there own driver patches on top
> >
> > BR,
> > -R
> >
> >> It's a half solution without a smarter scheduler too. Like
> >> https://lore.kernel.org/all/20210208105236.28498-10-chris@chris-wilson.co.uk/,
> >> or if GuC plans to do something like that at any point.
> >>
> >> Or bump the priority too if deadline is looming?
> >>
> >> IMO it is not very effective to fiddle with the heuristic on an ad-hoc
> >> basis. For instance I have a new heuristics which improves the
> >> problematic OpenCL cases for further 5% (relative to the current
> >> waitboost improvement from adding missing syncobj waitboost). But I
> >> can't really test properly for regressions over platforms, stacks,
> >> workloads.. :(
> >>
> >> Regards,
> >>
> >> Tvrtko
> >>
> >>>
> >>>> +
> >>>> +    intel_rps_boost(rq);
> >>>> +}
> >>>> +
> >>>>    static signed long i915_fence_wait(struct dma_fence *fence,
> >>>>                                  bool interruptible,
> >>>>                                  signed long timeout)
> >>>> @@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = {
> >>>>       .signaled = i915_fence_signaled,
> >>>>       .wait = i915_fence_wait,
> >>>>       .release = i915_fence_release,
> >>>> +    .set_deadline = i915_fence_set_deadline,
> >>>>    };
> >>>>
> >>>>    static void irq_execute_cb(struct irq_work *wrk)
> >>>> --
> >>>> 2.39.1
> >>>>
Rob Clark March 3, 2023, 3:43 p.m. UTC | #3
On Fri, Mar 3, 2023 at 7:20 AM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
>
> On Fri, Mar 03, 2023 at 05:00:03PM +0200, Ville Syrjälä wrote:
> > On Fri, Mar 03, 2023 at 06:48:43AM -0800, Rob Clark wrote:
> > > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
> > > <tvrtko.ursulin@linux.intel.com> wrote:
> > > >
> > > >
> > > > On 03/03/2023 03:21, Rodrigo Vivi wrote:
> > > > > On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
> > > > >> From: Rob Clark <robdclark@chromium.org>
> > > > >>
> > > > >
> > > > > missing some wording here...
> > > > >
> > > > >> v2: rebase
> > > > >>
> > > > >> Signed-off-by: Rob Clark <robdclark@chromium.org>
> > > > >> ---
> > > > >>   drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++
> > > > >>   1 file changed, 20 insertions(+)
> > > > >>
> > > > >> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > > > >> index 7503dcb9043b..44491e7e214c 100644
> > > > >> --- a/drivers/gpu/drm/i915/i915_request.c
> > > > >> +++ b/drivers/gpu/drm/i915/i915_request.c
> > > > >> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence)
> > > > >>      return i915_request_enable_breadcrumb(to_request(fence));
> > > > >>   }
> > > > >>
> > > > >> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> > > > >> +{
> > > > >> +    struct i915_request *rq = to_request(fence);
> > > > >> +
> > > > >> +    if (i915_request_completed(rq))
> > > > >> +            return;
> > > > >> +
> > > > >> +    if (i915_request_started(rq))
> > > > >> +            return;
> > > > >
> > > > > why do we skip the boost if already started?
> > > > > don't we want to boost the freq anyway?
> > > >
> > > > I'd wager Rob is just copying the current i915 wait boost logic.
> > >
> > > Yup, and probably incorrectly.. Matt reported fewer boosts/sec
> > > compared to your RFC, this could be the bug
> >
> > I don't think i915 calls drm_atomic_helper_wait_for_fences()
> > so that could explain something.
>
> Oh, I guess this wasn't even supposed to take over the current
> display boost stuff since you didn't remove the old one.

Right, I didn't try to replace the current thing.. but hopefully at
least make it possible for i915 to use more of the atomic helpers in
the future

BR,
-R

> The current one just boosts after a missed vblank. The deadline
> could use your timer approach I suppose and boost already a bit
> earlier in the hopes of not missing the vblank.
>
> --
> Ville Syrjälä
> Intel
Matt Turner March 3, 2023, 6:20 p.m. UTC | #4
On Fri, Mar 3, 2023 at 10:08 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 03/03/2023 14:48, Rob Clark wrote:
> > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >>
> >>
> >> On 03/03/2023 03:21, Rodrigo Vivi wrote:
> >>> On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
> >>>> From: Rob Clark <robdclark@chromium.org>
> >>>>
> >>>
> >>> missing some wording here...
> >>>
> >>>> v2: rebase
> >>>>
> >>>> Signed-off-by: Rob Clark <robdclark@chromium.org>
> >>>> ---
> >>>>    drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++
> >>>>    1 file changed, 20 insertions(+)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> >>>> index 7503dcb9043b..44491e7e214c 100644
> >>>> --- a/drivers/gpu/drm/i915/i915_request.c
> >>>> +++ b/drivers/gpu/drm/i915/i915_request.c
> >>>> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence)
> >>>>       return i915_request_enable_breadcrumb(to_request(fence));
> >>>>    }
> >>>>
> >>>> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> >>>> +{
> >>>> +    struct i915_request *rq = to_request(fence);
> >>>> +
> >>>> +    if (i915_request_completed(rq))
> >>>> +            return;
> >>>> +
> >>>> +    if (i915_request_started(rq))
> >>>> +            return;
> >>>
> >>> why do we skip the boost if already started?
> >>> don't we want to boost the freq anyway?
> >>
> >> I'd wager Rob is just copying the current i915 wait boost logic.
> >
> > Yup, and probably incorrectly.. Matt reported fewer boosts/sec
> > compared to your RFC, this could be the bug
>
> Hm, there I have preserved this same !i915_request_started logic.
>
> Presumably it's not just fewer boosts but lower performance. How is he
> setting the deadline? Somehow from clFlush or so?
>
> Regards,
>
> Tvrtko
>
> P.S. Take note that I did not post the latest version of my RFC. The one
> where I fix the fence chain and array misses you pointed out. I did not
> think it would be worthwhile given no universal love for it, but if
> people are testing with it more widely that I was aware perhaps I should.

Yep, that would be great. We're interested in it for ChromeOS. Please
Cc me on the series when you send it.
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 7503dcb9043b..44491e7e214c 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -97,6 +97,25 @@  static bool i915_fence_enable_signaling(struct dma_fence *fence)
 	return i915_request_enable_breadcrumb(to_request(fence));
 }
 
+static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
+{
+	struct i915_request *rq = to_request(fence);
+
+	if (i915_request_completed(rq))
+		return;
+
+	if (i915_request_started(rq))
+		return;
+
+	/*
+	 * TODO something more clever for deadlines that are in the
+	 * future.  I think probably track the nearest deadline in
+	 * rq->timeline and set timer to trigger boost accordingly?
+	 */
+
+	intel_rps_boost(rq);
+}
+
 static signed long i915_fence_wait(struct dma_fence *fence,
 				   bool interruptible,
 				   signed long timeout)
@@ -182,6 +201,7 @@  const struct dma_fence_ops i915_fence_ops = {
 	.signaled = i915_fence_signaled,
 	.wait = i915_fence_wait,
 	.release = i915_fence_release,
+	.set_deadline = i915_fence_set_deadline,
 };
 
 static void irq_execute_cb(struct irq_work *wrk)