[v3,00/11] Add DELETE_BUF ioctl

Message ID 20230622131349.144160-1-benjamin.gaignard@collabora.com

Message

Benjamin Gaignard June 22, 2023, 1:13 p.m. UTC
Unlike a resolution change on a keyframe, a dynamic resolution change
on an inter frame doesn't allow a stream off/on sequence because
all previous references must be kept alive to decode inter frames.
This constraint has two main problems:
- more memory consumption.
- more buffers in use.
To solve these issues this series introduces the DELETE_BUF ioctl and
removes the 32-buffer limit per queue.

VP9 conformance tests using fluster give a score of 210/305.
The 25 resize inter tests (vp90-2-21-resize_inter_* files) pass
but require the postprocessor.

Kernel branch is available here:
https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/commits/remove_vb2_queue_limit_v3

GStreamer branch that uses the DELETE_BUF ioctl and tests dynamic
resolution change is here:
https://gitlab.freedesktop.org/benjamin.gaignard1/gstreamer/-/commits/VP9_drc

changes in version 3:
- Use Xarray API to store allocated video buffers.
- No module parameter to limit the number of buffers per queue.
- Use Xarray inside Verisilicon driver to store postprocessor buffers
  and remove VB2_MAX_FRAME limit.
- Allow Verisilicon driver to change resolution while streaming.
- Various fixes to the Verisilicon VP9 code to improve fluster score.
 
changes in version 2:
- Use a dynamic array and not a list to keep track of allocated buffers.
  Do not use the IDR interface because it is marked as deprecated in
  kernel documentation.
- Add a module parameter to limit the number of buffers per queue.
- Add DELETE_BUF ioctl and m2m helpers.

Benjamin Gaignard (11):
  media: videobuf2: Access vb2_queue bufs array through helper functions
  media: videobuf2: Use Xarray instead of static buffers array
  media: videobuf2: Remove VB2_MAX_FRAME limit on buffer storage
  media: videobuf2: Stop define VB2_MAX_FRAME as global
  media: verisilicon: Refactor postprocessor to store more buffers
  media: verisilicon: Store chroma and motion vectors offset
  media: verisilicon: vp9: Use destination buffer height to compute
    chroma offset
  media: verisilicon: postproc: Fix down scale test
  media: verisilicon: vp9: Allow to change resolution while streaming
  media: v4l2: Add DELETE_BUF ioctl
  media: v4l2: Add mem2mem helpers for DELETE_BUF ioctl

 .../userspace-api/media/v4l/user-func.rst     |   1 +
 .../media/v4l/vidioc-delete-buf.rst           |  51 ++++
 .../media/common/videobuf2/videobuf2-core.c   | 275 ++++++++++++++----
 .../media/common/videobuf2/videobuf2-v4l2.c   |  34 ++-
 drivers/media/platform/amphion/vdec.c         |   1 +
 drivers/media/platform/amphion/vpu_dbg.c      |  22 +-
 .../platform/mediatek/jpeg/mtk_jpeg_core.c    |   6 +-
 .../vcodec/vdec/vdec_vp9_req_lat_if.c         |   4 +-
 drivers/media/platform/qcom/venus/hfi.h       |   2 +
 drivers/media/platform/st/sti/hva/hva-v4l2.c  |   4 +
 drivers/media/platform/verisilicon/hantro.h   |   8 +-
 .../platform/verisilicon/hantro_g2_vp9_dec.c  |  10 +-
 .../media/platform/verisilicon/hantro_hw.h    |   4 +-
 .../platform/verisilicon/hantro_postproc.c    | 114 +++++---
 .../media/platform/verisilicon/hantro_v4l2.c  |  37 +--
 drivers/media/test-drivers/vim2m.c            |   1 +
 drivers/media/test-drivers/visl/visl-dec.c    |  28 +-
 drivers/media/v4l2-core/v4l2-dev.c            |   1 +
 drivers/media/v4l2-core/v4l2-ioctl.c          |  10 +
 drivers/media/v4l2-core/v4l2-mem2mem.c        |  20 ++
 .../staging/media/atomisp/pci/atomisp_ioctl.c |   2 +-
 drivers/staging/media/ipu3/ipu3-v4l2.c        |   2 +
 include/media/v4l2-ioctl.h                    |   4 +
 include/media/v4l2-mem2mem.h                  |  12 +
 include/media/videobuf2-core.h                |  16 +-
 include/media/videobuf2-v4l2.h                |  15 +-
 include/uapi/linux/videodev2.h                |   2 +
 27 files changed, 523 insertions(+), 163 deletions(-)
 create mode 100644 Documentation/userspace-api/media/v4l/vidioc-delete-buf.rst

Comments

Dan Carpenter June 22, 2023, 1:56 p.m. UTC | #1
On Thu, Jun 22, 2023 at 03:13:41PM +0200, Benjamin Gaignard wrote:
> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> index f1ff7af34a9f..86e1e926fa45 100644
> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> @@ -455,9 +455,9 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
>  	struct vb2_buffer *vb;
>  	int ret;
>  
> -	/* Ensure that q->num_buffers+num_buffers is below VB2_MAX_FRAME */
> +	/* Ensure that q->num_buffers + num_buffers is UINT_MAX */
>  	num_buffers = min_t(unsigned int, num_buffers,
> -			    VB2_MAX_FRAME - q->num_buffers);
> +			    UINT_MAX - q->num_buffers);

The UINT_MAX limit adds a level of danger.  It would be safer to do what
the vfs layer does for MAX_RW_COUNT and use "INT_MAX - PAGE_SIZE".  That
way you can take size + sizeof() and it's only very rarely going to turn
negative.  Or at least just INT_MAX.  I would keep the VB2_MAX_FRAME and
define it as:

#define VB2_MAX_FRAME (INT_MAX & PAGE_MASK)  /* The mask prevents 85% of integer overflows */
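A minimal userspace sketch of the headroom argument above. It assumes 4 KiB pages and a 32-bit int; the SKETCH_ names and the helper are hypothetical and only illustrate why a cap of (INT_MAX & PAGE_MASK) leaves room for a small addition before a signed int overflows.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SIZE depends on the architecture. */
#define SKETCH_PAGE_SIZE 4096
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical cap mirroring the suggested (INT_MAX & PAGE_MASK). */
#define SKETCH_MAX (INT_MAX & SKETCH_PAGE_MASK)

/*
 * Return true if value + extra still fits in an int.
 * The sum is computed in 64-bit arithmetic so the check itself cannot overflow.
 */
static bool sketch_sum_fits_int(int value, int extra)
{
	assert(value >= 0 && extra >= 0);
	return (long long)value + extra <= INT_MAX;
}
```

With this cap, adding anything smaller than a page to the maximum still fits in an int, whereas adding 1 to INT_MAX does not.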

>  
>  	for (buffer = 0; buffer < num_buffers; ++buffer) {
>  		/* Allocate vb2 buffer structures */
> @@ -858,9 +858,9 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
>  	/*
>  	 * Make sure the requested values and current defaults are sane.
>  	 */
> -	WARN_ON(q->min_buffers_needed > VB2_MAX_FRAME);
> +	WARN_ON(q->min_buffers_needed > UINT_MAX);

This will trigger a static checker warning because the condition is
impossible.
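A small sketch of why the condition can never be true, assuming min_buffers_needed is an unsigned 32-bit value. The helper name is hypothetical, and the limit is passed as a 64-bit value only so that both cases can be compared side by side.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: does an unsigned int value exceed the given limit? */
static bool sketch_exceeds(unsigned int value, unsigned long long limit)
{
	return value > limit;
}
```

sketch_exceeds(x, UINT_MAX) is false for every possible x, including UINT_MAX itself, while a real cap such as the old limit of 32 can be exceeded, for example sketch_exceeds(33, 32).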

regards,
dan carpenter
Dan Carpenter June 22, 2023, 2:11 p.m. UTC | #2
On Thu, Jun 22, 2023 at 03:13:41PM +0200, Benjamin Gaignard wrote:
> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> index f1ff7af34a9f..86e1e926fa45 100644
> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> @@ -455,9 +455,9 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
>  	struct vb2_buffer *vb;
>  	int ret;
>  
> -	/* Ensure that q->num_buffers+num_buffers is below VB2_MAX_FRAME */
> +	/* Ensure that q->num_buffers + num_buffers is UINT_MAX */
>  	num_buffers = min_t(unsigned int, num_buffers,
> -			    VB2_MAX_FRAME - q->num_buffers);
> +			    UINT_MAX - q->num_buffers);
>  
>  	for (buffer = 0; buffer < num_buffers; ++buffer) {
>  		/* Allocate vb2 buffer structures */

Ah...  Here's one of the integer overflow bugs I was talking about.  The
__vb2_queue_alloc() function returns an int so if num_buffers goes over
INT_MAX we are hosed.
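A userspace sketch of the problem, not the vb2 code itself: both helpers are hypothetical. An unsigned count above INT_MAX has no valid representation in the int return value (the conversion is implementation-defined and typically comes out negative), so the count needs to be clamped before it is reported.

```c
#include <limits.h>

/* Hypothetical clamp: never allow more than cap buffers. */
static unsigned int sketch_clamp(unsigned int requested, unsigned int cap)
{
	return requested < cap ? requested : cap;
}

/*
 * Hypothetical stand-in for a function that reports the number of
 * allocated buffers through an int return value. The result is only
 * meaningful while allocated <= INT_MAX.
 */
static int sketch_report_allocated(unsigned int allocated)
{
	return (int)allocated;
}
```

Clamping the request to INT_MAX (or to any smaller cap) before reporting keeps the return value non-negative.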

regards,
dan carpenter
Hans Verkuil June 23, 2023, 7:02 a.m. UTC | #3
On 22/06/2023 16:13, Benjamin Gaignard wrote:
> 
> Le 22/06/2023 à 16:11, Dan Carpenter a écrit :
>> On Thu, Jun 22, 2023 at 03:13:41PM +0200, Benjamin Gaignard wrote:
>>> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
>>> index f1ff7af34a9f..86e1e926fa45 100644
>>> --- a/drivers/media/common/videobuf2/videobuf2-core.c
>>> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
>>> @@ -455,9 +455,9 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
>>>       struct vb2_buffer *vb;
>>>       int ret;
>>>   -    /* Ensure that q->num_buffers+num_buffers is below VB2_MAX_FRAME */
>>> +    /* Ensure that q->num_buffers + num_buffers is UINT_MAX */
>>>       num_buffers = min_t(unsigned int, num_buffers,
>>> -                VB2_MAX_FRAME - q->num_buffers);
>>> +                UINT_MAX - q->num_buffers);
>>>         for (buffer = 0; buffer < num_buffers; ++buffer) {
>>>           /* Allocate vb2 buffer structures */
>> Ah...  Here's one of the integer overflow bugs I was talking about.  The
>> __vb2_queue_alloc() function returns an int so if num_buffers goes over
>> INT_MAX we are hosed.
> 
> I will limit it to:
> #define VB2_QUEUE_MAX_BUFFERS  (INT_MAX & PAGE_MASK)  /* The mask prevents 85% of integer overflows */
> as you suggested.

IMHO INT_MAX is way overkill. How about (1U << 20)? I would like some sort of
sanity check here. 1048576 buffers of 640x480 and 4 bytes per pixel is 1.2 TB.

Since a TB of memory is doable these days, I think this is a reasonable
value for MAX_BUFFERS without allowing just anything.
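The figure above can be checked with a short calculation. This is only a sketch: the macro and function names are hypothetical, and it assumes a single plane with no padding or alignment.

```c
#include <stdint.h>

/* Hypothetical name for the proposed cap of (1U << 20) buffers. */
#define SKETCH_MAX_BUFFERS (1U << 20)

/* Total bytes for count buffers of width x height at bytes_per_pixel each. */
static uint64_t sketch_total_bytes(uint32_t width, uint32_t height,
				   uint32_t bytes_per_pixel, uint32_t count)
{
	/* Widen first so the multiplication is done in 64 bits. */
	return (uint64_t)width * height * bytes_per_pixel * count;
}
```

For 640x480 at 4 bytes per pixel this gives 1228800 bytes per buffer and 1288490188800 bytes for 1048576 buffers, which is 1200 GiB.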

An alternative is to make this a kernel config.

Regards,

	Hans

> 
> That will be in version 4.
> 
> Thanks,
> Benjamin
> 
>>
>> regards,
>> dan carpenter
>>
Benjamin Gaignard June 23, 2023, 7:51 a.m. UTC | #4
Le 23/06/2023 à 09:02, Hans Verkuil a écrit :
> On 22/06/2023 16:13, Benjamin Gaignard wrote:
>> Le 22/06/2023 à 16:11, Dan Carpenter a écrit :
>>> On Thu, Jun 22, 2023 at 03:13:41PM +0200, Benjamin Gaignard wrote:
>>>> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
>>>> index f1ff7af34a9f..86e1e926fa45 100644
>>>> --- a/drivers/media/common/videobuf2/videobuf2-core.c
>>>> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
>>>> @@ -455,9 +455,9 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
>>>>        struct vb2_buffer *vb;
>>>>        int ret;
>>>>    -    /* Ensure that q->num_buffers+num_buffers is below VB2_MAX_FRAME */
>>>> +    /* Ensure that q->num_buffers + num_buffers is UINT_MAX */
>>>>        num_buffers = min_t(unsigned int, num_buffers,
>>>> -                VB2_MAX_FRAME - q->num_buffers);
>>>> +                UINT_MAX - q->num_buffers);
>>>>          for (buffer = 0; buffer < num_buffers; ++buffer) {
>>>>            /* Allocate vb2 buffer structures */
>>> Ah...  Here's one of the integer overflow bugs I was talking about.  The
>>> __vb2_queue_alloc() function returns an int so if num_buffers goes over
>>> INT_MAX we are hosed.
>> I will limit it to:
>> #define VB2_QUEUE_MAX_BUFFERS  (INT_MAX & PAGE_MASK)  /* The mask prevents 85% of integer overflows */
>> as you suggested.
> IMHO INT_MAX is way overkill. How about (1U << 20)? I would like some sort of
> sanity check here. 1048576 buffers of 640x480 and 4 bytes per pixel is 1.2 TB.

I will go for (1U << 20) in next version.

Regards,
Benjamin

>
> Since a TB of memory is doable these days, I think this is a reasonable
> value for MAX_BUFFERS without allowing just anything.
>
> An alternative is to make this a kernel config.
>
> Regards,
>
> 	Hans
>
>> That will be in version 4.
>>
>> Thanks,
>> Benjamin
>>
>>> regards,
>>> dan carpenter
>>>
Hsia-Jun Li June 27, 2023, 7:40 a.m. UTC | #5
On 6/22/23 21:13, Benjamin Gaignard wrote:
> Unlike a resolution change on a keyframe, a dynamic resolution change
> on an inter frame doesn't allow a stream off/on sequence because
> all previous references must be kept alive to decode inter frames.
> This constraint has two main problems:
> - more memory consumption.
> - more buffers in use.
> To solve these issues this series introduces the DELETE_BUF ioctl and
> removes the 32-buffer limit per queue.

I know that VIDIOC_CREATE_BUFS allows creating a buffer with a different
size than the one the driver suggests in G_FMT.

But vb2_ops->queue_setup() could check whether the sizeimage values meet
the minimal requirement of the current format.

This creates a problem: the driver needs to check the buffer size
before it lets the hardware use a buffer from the rdy_queue.
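As a rough illustration of the kind of size check meant here, a userspace sketch follows. The function name and parameters are hypothetical, not taken from any driver; it only shows comparing each plane size against the minimum the current format requires and rejecting buffers that are too small.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical check: every plane must be at least as large as the
 * minimum size required by the current format.
 * Returns 0 if the buffer is usable, -EINVAL otherwise.
 */
static int sketch_check_plane_sizes(const unsigned int *sizes,
				    const unsigned int *min_sizes,
				    size_t num_planes)
{
	assert(sizes && min_sizes);

	for (size_t i = 0; i < num_planes; i++) {
		if (sizes[i] < min_sizes[i])
			return -EINVAL;
	}
	return 0;
}
```

The same comparison could be repeated when a buffer is picked from the ready queue, since the format may have changed after the buffer was created.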


Consider this case: an AV1 sequence (VP9 and VP8 don't have a
sequence header) would need a much larger buffer for the alternative
reference frame.

If one special buffer is then created for the altref, the driver needs
the hardware to pick it from the rdy_queue first, or it would be wasted
as a regular frame buffer.

Also, missing such a step would not solve the memory allocation problem.
