diff mbox series

[1/2] media: video-i2c: frame delay based on last frame's end time

Message ID 20210605115456.14440-2-euphoriccatface@gmail.com
State New
Headers show
Series media: video-i2c: additional support for Melexis MLX90640 | expand

Commit Message

Seongyong Park June 5, 2021, 11:54 a.m. UTC
Current implementation calculates frame delay based on the time of
start of the loop. This inevitably causes the loop delay to be
slightly longer than the actual measurement period, thus skipping a frame
every now and then.

However, MLX90640 should ideally be working without a frame skip.
Each measurement step updates half of the pixels in the frame
(every other pixel in default "chess mode", and every other row
in "interleave mode"), while additional coefficient data (25th & 26th row)
updates every step. The compensational coefficient data only corresponds
with the pixels updated in the same step.

In short, if a frame is skipped, then half of a frame loses correction
information and becomes garbage data.

Signed-off-by: Seongyong Park <euphoriccatface@gmail.com>
---
 drivers/media/i2c/video-i2c.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

Comments

Mauro Carvalho Chehab June 5, 2021, 2:53 p.m. UTC | #1
Em Sat, 5 Jun 2021 16:00:28 +0200
Mauro Carvalho Chehab <mchehab@kernel.org> escreveu:

> Em Sat,  5 Jun 2021 20:54:56 +0900
> Seongyong Park <euphoriccatface@gmail.com> escreveu:
> 
> > Current implementation calculates frame delay based on the time of
> > start of the loop. This inevitably causes the loop delay to be
> > slightly longer than the actual measurement period, thus skipping a frame
> > every now and then.
> > 
> > However, MLX90640 should ideally be working without a frame skip.
> > Each measurement step updates half of the pixels in the frame
> > (every other pixel in default "chess mode", and every other row
> > in "interleave mode"), while additional coefficient data (25th & 26th row)
> > updates every step. The compensational coefficient data only corresponds
> > with the pixels updated in the same step.
> > 
> > In short, if a frame is skipped, then half of a frame loses correction
> > information and becomes garbage data.
> > 
> > Signed-off-by: Seongyong Park <euphoriccatface@gmail.com>
> > ---
> >  drivers/media/i2c/video-i2c.c | 11 ++++++-----
> >  1 file changed, 6 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/media/i2c/video-i2c.c b/drivers/media/i2c/video-i2c.c
> > index 0465832a4..2ccb08335 100644
> > --- a/drivers/media/i2c/video-i2c.c
> > +++ b/drivers/media/i2c/video-i2c.c
> > @@ -445,14 +445,16 @@ static int video_i2c_thread_vid_cap(void *priv)
> >  	struct video_i2c_data *data = priv;
> >  	unsigned int delay = mult_frac(HZ, data->frame_interval.numerator,
> >  				       data->frame_interval.denominator);
> > +	unsigned long end_jiffies = jiffies;
> >  
> >  	set_freezable();
> >  
> >  	do {
> > -		unsigned long start_jiffies = jiffies;
> >  		struct video_i2c_buffer *vid_cap_buf = NULL;
> >  		int schedule_delay;
> >  
> > +		end_jiffies += delay;
> > +
> >  		try_to_freeze();
> >  
> >  		spin_lock(&data->slock);
> > @@ -477,10 +479,9 @@ static int video_i2c_thread_vid_cap(void *priv)
> >  				VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
> >  		}
> >  
> > -		schedule_delay = delay - (jiffies - start_jiffies);
> > -
> > -		if (time_after(jiffies, start_jiffies + delay))
> > -			schedule_delay = delay;  
> 
> > +		schedule_delay = end_jiffies - jiffies;
> > +		while (schedule_delay <= 0)
> > +			schedule_delay += delay;  
> 
> Huh? Why do you need a loop for that? If you want to make it positive,
> you could just do:
> 
> 	if (schedule_delay < 0)
> 		schedule_delay = delay; /* or something else */
> 
> but this won't solve the issue, as this is basically equivalent
> to the logic you removed.
> 
> >  
> >  		schedule_timeout_interruptible(schedule_delay);  
> 
> This is probably what's causing troubles, as this only ensures
> that it will sleep for at least schedule_delay (if not
> interrupted).
> 
> If you want to give a dead line to schedule, you should
> likely be doing, instead:
> 
> 		usleep_range(delay, delay + delta);
> 
> this would tell the realtime clock that there's a dead line of
> (schedule_delay + delta) microseconds.
> 
> You'll need to change the delay to be in microseconds too,
> e. g., I guess that something similar to this:
> 
> 
>     static int video_i2c_thread_vid_cap(void *priv)
>     {
>         struct video_i2c_data *data = priv;
> 	u64 delay;
> 
> 	delay = div64_u64(1000000ULL * data->frame_interval.numerator,
> 			  data->frame_interval.denominator);
> 
> 	set_freezable();
> 
>         do {
> ...
> 		usleep_range(delay * 3 / 4, delay);
> 	} while (!kthread_should_stop());
> 
> 	return 0;
>     }
> 
> would give you what you want.

The actual the logic should be more complex, as it needs to
dynamically adjust the delay based on how much frames were streamed
and how much time it was elapsed, but the basically idea is that
you would need to use:

	usleep_range(min_delay_us, max_delay_us);

instead of:

	schedule_timeout_interruptible(schedule_delay);

in order to tell the realtime clock about a dead line for
sleeping.

Thanks,
Mauro
Seongyong Park June 6, 2021, 7:20 a.m. UTC | #2
2021년 6월 5일 (토) 오후 11:53, Mauro Carvalho Chehab <mchehab@kernel.org>님이 작성:
> you would need to use:

>

>         usleep_range(min_delay_us, max_delay_us);

>

> instead of:

>

>         schedule_timeout_interruptible(schedule_delay);

>

> in order to tell the realtime clock about a dead line for

> sleeping.

>

> Thanks,

> Mauro


Okay, I have tried `usleep_range()` instead, and it indeed shows
improvement in the frame rate.
Now it's basically the same as before my patch, except for
`jiffies_to_usecs()` and then `usleep_range()`.

...
         int schedule_delay;
+        uint64_t schedule_delay_us;

        try_to_freeze();
...
        if (time_after(jiffies, start_jiffies + delay))
            schedule_delay = delay;

-        schedule_timeout_interruptible(schedule_delay);
+        schedule_delay_us = jiffies_to_usecs(schedule_delay);
+        usleep_range(schedule_delay_us * 3/4, schedule_delay_us);
     } while (!kthread_should_stop());

     return 0;
...

I decided to keep the `if (...) schedule_delay = delay;` part.
The concern was that my RPi Zero was having quite a bit of constant
drift, like showing 3FPS when set to 4FPS, 6FPS when 8FPS, 10FPS when
16FPS, and so on.
Now that I've confirmed the timing's good enough (usually ~0.5 FPS
faster than the frame rate given), there's no need for me to bother
anymore.

I'll send another patchset if it doesn't look too bad.

Thank you very much.
Seongyong Park
Mauro Carvalho Chehab June 6, 2021, 11 a.m. UTC | #3
Em Sun, 6 Jun 2021 16:20:53 +0900
Seongyong Park <euphoriccatface@gmail.com> escreveu:

> 2021년 6월 5일 (토) 오후 11:53, Mauro Carvalho Chehab <mchehab@kernel.org>님이 작성:

> > you would need to use:

> >

> >         usleep_range(min_delay_us, max_delay_us);

> >

> > instead of:

> >

> >         schedule_timeout_interruptible(schedule_delay);

> >

> > in order to tell the realtime clock about a dead line for

> > sleeping.

> >

> > Thanks,

> > Mauro  

> 

> Okay, I have tried `usleep_range()` instead, and it indeed shows

> improvement in the frame rate.

> Now it's basically the same as before my patch, except for

> `jiffies_to_usecs()` and then `usleep_range()`.

> 

> ...

>          int schedule_delay;

> +        uint64_t schedule_delay_us;

> 

>         try_to_freeze();

> ...

>         if (time_after(jiffies, start_jiffies + delay))

>             schedule_delay = delay;

> 

> -        schedule_timeout_interruptible(schedule_delay);

> +        schedule_delay_us = jiffies_to_usecs(schedule_delay);

> +        usleep_range(schedule_delay_us * 3/4, schedule_delay_us);

>      } while (!kthread_should_stop());

> 

>      return 0;

> ...

> 

> I decided to keep the `if (...) schedule_delay = delay;` part.


Yeah, you would need something like that.

> The concern was that my RPi Zero was having quite a bit of constant

> drift, like showing 3FPS when set to 4FPS, 6FPS when 8FPS, 10FPS when

> 16FPS, and so on.

> Now that I've confirmed the timing's good enough (usually ~0.5 FPS

> faster than the frame rate given), there's no need for me to bother

> anymore.


In other to avoid the drift, the logic needs to calculate the delay based
on something like this (pseudo-C) code:

	start_jiffies = jiffies;
	frame_count = 0;
	i = 0;
	do {
		i++;
		delay = jiffies - (start_jiffies * i * <interval>);
...
	}

The actual logic should probably avoid multiplying stuff there, as this
could go sideways easily due to overflows.

Perhaps something like this would work better, keeping a more precise
average fps rate:

	next_jiffies = jiffies + delay;
	do {
...
		schedule_delay_us = jiffies_to_usecs(next_jiffies - jiffies);
		usleep_range(schedule_delay_us * 3/4, schedule_delay_us);		...
		next_jiffies += delay;
	}

> 

> I'll send another patchset if it doesn't look too bad.

> 

> Thank you very much.

> Seongyong Park




Thanks,
Mauro
Seongyong Park June 6, 2021, 3:06 p.m. UTC | #4
`2021년 6월 6일 (일) 오후 8:00, Mauro Carvalho Chehab <mchehab@kernel.org>님이 작성:
>

>

> Perhaps something like this would work better, keeping a more precise

> average fps rate:

>

>         next_jiffies = jiffies + delay;

>         do {

> ...

>                 schedule_delay_us = jiffies_to_usecs(next_jiffies - jiffies);

>                 usleep_range(schedule_delay_us * 3/4, schedule_delay_us);               ...

>                 next_jiffies += delay;

>         }

>

> >

> > I'll send another patchset if it doesn't look too bad.

> >

> > Thank you very much.

> > Seongyong Park

>

>

>

> Thanks,

> Mauro


For a few hours, I couldn't get the module to make precise timing.
It would always be a few tenths of FPS higher than I set, regardless
of how I construct the calculation.
And then it hit me, maybe jiffies is not granular enough - and of
course, resolution of jiffies turns out to be 100Hz :P

e.g. If you set it 16FPS, the count of jiffies to sleep results 100 / 16 = 6
The sleep interval ends up being 60ms, resulting in 16.666... FPS.
This discrepancy gets quite big if you set it to 64 FPS, resulting in
a single jiffies count, i.e. 100Hz.
(Though you will need to either skip some data, or set I2C baud rate
beyond the sensor's spec
in order to realistically even achieve 64 FPS, at least on an RPi Zero
it seems.)

So, I basically changed the time source from `jiffies` to
`ktime_to_us(ktime_get())`.
The timing is definitely better now :)

One last question please, before creating another patchset.
>                 usleep_range(schedule_delay_us * 3/4, schedule_delay_us);

Is it okay to make it essentially 0 microseconds sleep here, by
setting `schedule_delay_us` to 0?
It's going to be like,
```
if (current_us > end_us) {
    end_us = current_us;
}
schedule_delay_us = end_us - current_us;
```

Regards,
Seongyong Park
Mauro Carvalho Chehab June 6, 2021, 7:18 p.m. UTC | #5
Em Mon, 7 Jun 2021 00:06:36 +0900
Seongyong Park <euphoriccatface@gmail.com> escreveu:

> `2021년 6월 6일 (일) 오후 8:00, Mauro Carvalho Chehab <mchehab@kernel.org>님이 작성:

> >

> >

> > Perhaps something like this would work better, keeping a more precise

> > average fps rate:

> >

> >         next_jiffies = jiffies + delay;

> >         do {

> > ...

> >                 schedule_delay_us = jiffies_to_usecs(next_jiffies - jiffies);

> >                 usleep_range(schedule_delay_us * 3/4, schedule_delay_us);               ...

> >                 next_jiffies += delay;

> >         }

> >  

> > >

> > > I'll send another patchset if it doesn't look too bad.

> > >

> > > Thank you very much.

> > > Seongyong Park  

> >

> >

> >

> > Thanks,

> > Mauro  

> 

> For a few hours, I couldn't get the module to make precise timing.

> It would always be a few tenths of FPS higher than I set, regardless

> of how I construct the calculation.

> And then it hit me, maybe jiffies is not granular enough - and of

> course, resolution of jiffies turns out to be 100Hz :P


It is actually worse than that, as it depends on CONFIG_HZ, which is
set by the one who built the Kernel. So, on other machines, it could
be higher (like 1200) or lower (24 is one of the options on mips) ;-)

> e.g. If you set it 16FPS, the count of jiffies to sleep results 100 / 16 = 6

> The sleep interval ends up being 60ms, resulting in 16.666... FPS.


That's basically why I suggested you to count the time from the start
of streaming, instead of just waiting for a fixed delay every time.

> This discrepancy gets quite big if you set it to 64 FPS, resulting in

> a single jiffies count, i.e. 100Hz.

> (Though you will need to either skip some data, or set I2C baud rate

> beyond the sensor's spec

> in order to realistically even achieve 64 FPS, at least on an RPi Zero

> it seems.)


The issue is probably not RPi zero, but the low speeds of I2C bus,
and the speed of the sensor. 

> So, I basically changed the time source from `jiffies` to

> `ktime_to_us(ktime_get())`.

> The timing is definitely better now :)


This helps but it probably use a lot more CPU cycles than reading
jiffies.

> One last question please, before creating another patchset.

> >                 usleep_range(schedule_delay_us * 3/4, schedule_delay_us);  

> Is it okay to make it essentially 0 microseconds sleep here, by

> setting `schedule_delay_us` to 0?

> It's going to be like,

> ```

> if (current_us > end_us) {

>     end_us = current_us;

> }

> schedule_delay_us = end_us - current_us;

> ```


Not sure, but you could instead revert the logic, like:

if (current_us <= end_us)
	usleep_range();

Thanks,
Mauro
diff mbox series

Patch

diff --git a/drivers/media/i2c/video-i2c.c b/drivers/media/i2c/video-i2c.c
index 0465832a4..2ccb08335 100644
--- a/drivers/media/i2c/video-i2c.c
+++ b/drivers/media/i2c/video-i2c.c
@@ -445,14 +445,16 @@  static int video_i2c_thread_vid_cap(void *priv)
 	struct video_i2c_data *data = priv;
 	unsigned int delay = mult_frac(HZ, data->frame_interval.numerator,
 				       data->frame_interval.denominator);
+	unsigned long end_jiffies = jiffies;
 
 	set_freezable();
 
 	do {
-		unsigned long start_jiffies = jiffies;
 		struct video_i2c_buffer *vid_cap_buf = NULL;
 		int schedule_delay;
 
+		end_jiffies += delay;
+
 		try_to_freeze();
 
 		spin_lock(&data->slock);
@@ -477,10 +479,9 @@  static int video_i2c_thread_vid_cap(void *priv)
 				VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
 		}
 
-		schedule_delay = delay - (jiffies - start_jiffies);
-
-		if (time_after(jiffies, start_jiffies + delay))
-			schedule_delay = delay;
+		schedule_delay = end_jiffies - jiffies;
+		while (schedule_delay <= 0)
+			schedule_delay += delay;
 
 		schedule_timeout_interruptible(schedule_delay);
 	} while (!kthread_should_stop());