mbox series

[v2,0/3] media: videobuf2-core: attach once if multiple planes share the same dbuf

Message ID 20240403091306.1308878-1-yunkec@chromium.org
Headers show
Series media: videobuf2-core: attach once if multiple planes share the same dbuf | expand

Message

Yunke Cao April 3, 2024, 9:13 a.m. UTC
Hi,

This patch set avoids attaching and mapping a dma_buf multiple times when
several planes in one vb2 buffer share the same dma_buf.

1/3 and 2/3 refactors __prepare_dmabuf()
3/3 adds a flag to avoid duplicated attaching

Changelog since v1:
- Add patch 1/3 to refactor __prepare_dmabuf()
- Add patch 2/3 to resolve Tomasz's comment on __vb2_buf_dmabuf_put()'s
iteration order
- Fix mem_priv in patch 3/3
- Rename duplicated_dbuf to dbuf_duplicated

Yunke Cao (3):
  media: videobuf2-core: release all planes first in __prepare_dmabuf()
  media: videobuf2-core: reverse the iteration order in
    __vb2_buf_dmabuf_put
  media: videobuf2-core: attach once if multiple planes share the same
    dbuf

 .../media/common/videobuf2/videobuf2-core.c   | 97 +++++++++++--------
 include/media/videobuf2-core.h                |  3 +
 2 files changed, 62 insertions(+), 38 deletions(-)

Comments

Hans Verkuil April 24, 2024, 10:25 a.m. UTC | #1
Tomasz,

I would appreciate it if you can review this series.

Regards,

	Hans

On 03/04/2024 11:13, Yunke Cao wrote:
> Hi,
> 
> This patch set avoids attaching and mapping a dma_buf multiple times when
> several planes in one vb2 buffer share the same dma_buf.
> 
> 1/3 and 2/3 refactors __prepare_dmabuf()
> 3/3 adds a flag to avoid duplicated attaching
> 
> Changelog since v1:
> - Add patch 1/3 to refactor __prepare_dmabuf()
> - Add patch 2/3 to resolve Tomasz's comment on __vb2_buf_dmabuf_put()'s
> iteration order
> - Fix mem_priv in patch 3/3
> - Rename duplicated_dbuf to dbuf_duplicated
> 
> Yunke Cao (3):
>   media: videobuf2-core: release all planes first in __prepare_dmabuf()
>   media: videobuf2-core: reverse the iteration order in
>     __vb2_buf_dmabuf_put
>   media: videobuf2-core: attach once if multiple planes share the same
>     dbuf
> 
>  .../media/common/videobuf2/videobuf2-core.c   | 97 +++++++++++--------
>  include/media/videobuf2-core.h                |  3 +
>  2 files changed, 62 insertions(+), 38 deletions(-)
>
Tomasz Figa May 8, 2024, 4:02 p.m. UTC | #2
On Wed, Apr 24, 2024 at 7:25 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>
> Tomasz,
>
> I would appreciate it if you can review this series.

Sorry, I've been out for the last 2 weeks. Catching up now and should
get to this shortly.

Best,
Tomasz

>
> Regards,
>
>         Hans
>
> On 03/04/2024 11:13, Yunke Cao wrote:
> > Hi,
> >
> > This patch set avoids attaching and mapping a dma_buf multiple times when
> > several planes in one vb2 buffer share the same dma_buf.
> >
> > 1/3 and 2/3 refactors __prepare_dmabuf()
> > 3/3 adds a flag to avoid duplicated attaching
> >
> > Changelog since v1:
> > - Add patch 1/3 to refactor __prepare_dmabuf()
> > - Add patch 2/3 to resolve Tomasz's comment on __vb2_buf_dmabuf_put()'s
> > iteration order
> > - Fix mem_priv in patch 3/3
> > - Rename duplicated_dbuf to dbuf_duplicated
> >
> > Yunke Cao (3):
> >   media: videobuf2-core: release all planes first in __prepare_dmabuf()
> >   media: videobuf2-core: reverse the iteration order in
> >     __vb2_buf_dmabuf_put
> >   media: videobuf2-core: attach once if multiple planes share the same
> >     dbuf
> >
> >  .../media/common/videobuf2/videobuf2-core.c   | 97 +++++++++++--------
> >  include/media/videobuf2-core.h                |  3 +
> >  2 files changed, 62 insertions(+), 38 deletions(-)
> >
>
Tomasz Figa May 17, 2024, 11:11 a.m. UTC | #3
Hi Yunke,

On Wed, Apr 03, 2024 at 06:13:04PM +0900, Yunke Cao wrote:
> The existing implementation, validating planes, checking if the planes
> changed, releasing previous planes and reaquiring new planes all happens in
> the same for loop.
> 
> Split the for loop into 3 parts
> 1. In the first for loop, validate planes and check if planes changed.
> 2. Call __vb2_buf_dmabuf_put() to release all planes.
> 3. In the second for loop, reaquire new planes.
> 
> Signed-off-by: Yunke Cao <yunkec@chromium.org>
> ---
>  .../media/common/videobuf2/videobuf2-core.c   | 64 ++++++++++---------
>  1 file changed, 34 insertions(+), 30 deletions(-)
> 

Thanks for the second revision and sorry for the delay. Please check my
comments inline.

> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> index b6bf8f232f48..702f7b6f783a 100644
> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> @@ -1341,11 +1341,13 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
>  	for (plane = 0; plane < vb->num_planes; ++plane) {
>  		struct dma_buf *dbuf = dma_buf_get(planes[plane].m.fd);
>  
> +		planes[plane].dbuf = dbuf;
> +
>  		if (IS_ERR_OR_NULL(dbuf)) {
>  			dprintk(q, 1, "invalid dmabuf fd for plane %d\n",
>  				plane);
>  			ret = -EINVAL;
> -			goto err;
> +			goto err_put_dbuf;

nit: Maybe err_put_planes, since we're cleaning up the planes[] array?

>  		}
>  
>  		/* use DMABUF size if length is not provided */
> @@ -1356,17 +1358,14 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
>  			dprintk(q, 1, "invalid dmabuf length %u for plane %d, minimum length %u\n",
>  				planes[plane].length, plane,
>  				vb->planes[plane].min_length);
> -			dma_buf_put(dbuf);
>  			ret = -EINVAL;
> -			goto err;
> +			goto err_put_dbuf;
>  		}
>  
>  		/* Skip the plane if already verified */
>  		if (dbuf == vb->planes[plane].dbuf &&
> -			vb->planes[plane].length == planes[plane].length) {
> -			dma_buf_put(dbuf);
> +		    vb->planes[plane].length == planes[plane].length)
>  			continue;
> -		}
>  
>  		dprintk(q, 3, "buffer for plane %d changed\n", plane);
>  
> @@ -1375,29 +1374,30 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
>  			vb->copied_timestamp = 0;
>  			call_void_vb_qop(vb, buf_cleanup, vb);

Would it make sense to also move these two to the if (reacquired) part
below, since they are done once for the entire vb?

>  		}
> +	}
>  
> -		/* Release previously acquired memory if present */
> -		__vb2_plane_dmabuf_put(vb, &vb->planes[plane]);
> -		vb->planes[plane].bytesused = 0;
> -		vb->planes[plane].length = 0;
> -		vb->planes[plane].m.fd = 0;
> -		vb->planes[plane].data_offset = 0;

I don't see the code below setting the 4 fields above to zero. Is it
intended?

> +	if (reacquired) {
> +		__vb2_buf_dmabuf_put(vb);
> +
> +		for (plane = 0; plane < vb->num_planes; ++plane) {
> +			/* Acquire each plane's memory */
> +			mem_priv = call_ptr_memop(attach_dmabuf,
> +						  vb,
> +						  q->alloc_devs[plane] ? : q->dev,
> +						  planes[plane].dbuf,
> +						  planes[plane].length);
> +			if (IS_ERR(mem_priv)) {
> +				dprintk(q, 1, "failed to attach dmabuf\n");
> +				ret = PTR_ERR(mem_priv);
> +				goto err_put_dbuf;

Hmm, I think in this case we need to also clean up the partially acquired
planes of vb.

> +			}
>  
> -		/* Acquire each plane's memory */
> -		mem_priv = call_ptr_memop(attach_dmabuf,
> -					  vb,
> -					  q->alloc_devs[plane] ? : q->dev,
> -					  dbuf,
> -					  planes[plane].length);
> -		if (IS_ERR(mem_priv)) {
> -			dprintk(q, 1, "failed to attach dmabuf\n");
> -			ret = PTR_ERR(mem_priv);
> -			dma_buf_put(dbuf);
> -			goto err;
> +			vb->planes[plane].dbuf = planes[plane].dbuf;
> +			vb->planes[plane].mem_priv = mem_priv;
>  		}
> -
> -		vb->planes[plane].dbuf = dbuf;
> -		vb->planes[plane].mem_priv = mem_priv;
> +	} else {
> +		for (plane = 0; plane < vb->num_planes; ++plane)
> +			dma_buf_put(planes[plane].dbuf);
>  	}
>  
>  	/*
> @@ -1413,7 +1413,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
>  		if (ret) {
>  			dprintk(q, 1, "failed to map dmabuf for plane %d\n",
>  				plane);
> -			goto err;
> +			goto err_put_vb2_buf;
>  		}
>  		vb->planes[plane].dbuf_mapped = 1;
>  	}

I think this entire loop can also go under the (reacquired) case, since
(!reacquired) means that all the planes were identical (and thus are
alreday mapped). Given that now we release all the planes in one go, we
could even simplify it by dropping the dbuf_mapped check from the loop.

> @@ -1437,7 +1437,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
>  		ret = call_vb_qop(vb, buf_init, vb);
>  		if (ret) {
>  			dprintk(q, 1, "buffer initialization failed\n");
> -			goto err;
> +			goto err_put_vb2_buf;
>  		}
>  	}

Same for this block.

>  
> @@ -1445,11 +1445,15 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
>  	if (ret) {
>  		dprintk(q, 1, "buffer preparation failed\n");
>  		call_void_vb_qop(vb, buf_cleanup, vb);
> -		goto err;
> +		goto err_put_vb2_buf;
>  	}
>  
>  	return 0;
> -err:
> +
> +err_put_dbuf:
> +	for (plane = 0; plane < vb->num_planes; ++plane)

dma_buf_put() will throw a warning if the dmabuf pointer is NULL and just
plain crash if IS_ERR(), so we shouldn't call it on array elements that we
didn't succeed for.

> +		dma_buf_put(planes[plane].dbuf);
> +err_put_vb2_buf:
>  	/* In case of errors, release planes that were already acquired */
>  	__vb2_buf_dmabuf_put(vb);

Actually, would it make sense to invert the order of clean-up steps here?
In case if only the first loop fails, we don't really need to do anything with
vb. Or am I missing something?

Best regards,
Tomasz
Tomasz Figa May 17, 2024, 11:23 a.m. UTC | #4
On Wed, Apr 03, 2024 at 06:13:06PM +0900, Yunke Cao wrote:
> When multiple planes use the same dma buf, each plane will have its own dma
> buf attachment and mapping. It is a waste of IOVA space.
> 
> This patch adds a dbuf_duplicated boolean in vb2_plane. If a plane's dbuf
> is the same as an existing plane, do not create another attachment and
> mapping.
> 
> Signed-off-by: Yunke Cao <yunkec@chromium.org>
> ---
>  .../media/common/videobuf2/videobuf2-core.c   | 27 +++++++++++++++----
>  include/media/videobuf2-core.h                |  3 +++
>  2 files changed, 25 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> index a5368cef73bb..64fe3801b802 100644
> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> @@ -304,10 +304,13 @@ static void __vb2_plane_dmabuf_put(struct vb2_buffer *vb, struct vb2_plane *p)
>  	if (!p->mem_priv)
>  		return;
>  
> -	if (p->dbuf_mapped)
> -		call_void_memop(vb, unmap_dmabuf, p->mem_priv);
> +	if (!p->dbuf_duplicated) {
> +		if (p->dbuf_mapped)

Side note: Now when I'm reading this code I'm starting to wonder if
dbuf_mapped really add any value here. Can we even have a situation when we
have p->dbuf != NULL, but p->dbuf_mapped == false?

> +			call_void_memop(vb, unmap_dmabuf, p->mem_priv);
> +
> +		call_void_memop(vb, detach_dmabuf, p->mem_priv);
> +	}
>  
> -	call_void_memop(vb, detach_dmabuf, p->mem_priv);
>  	dma_buf_put(p->dbuf);
>  	p->mem_priv = NULL;
>  	p->dbuf = NULL;
> @@ -1327,7 +1330,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
>  	struct vb2_plane planes[VB2_MAX_PLANES];
>  	struct vb2_queue *q = vb->vb2_queue;
>  	void *mem_priv;
> -	unsigned int plane;
> +	unsigned int plane, i;
>  	int ret = 0;
>  	bool reacquired = vb->planes[0].mem_priv == NULL;
>  
> @@ -1380,6 +1383,19 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
>  		__vb2_buf_dmabuf_put(vb);
>  
>  		for (plane = 0; plane < vb->num_planes; ++plane) {

Can we add a short comment here explaining that this is an optimization for
using the same DMA-buf for many planes?

Best regards,
Tomasz

> +			for (i = 0; i < plane; ++i) {
> +				if (planes[plane].dbuf == vb->planes[i].dbuf) {
> +					vb->planes[plane].dbuf_duplicated = true;
> +					vb->planes[plane].dbuf = vb->planes[i].dbuf;
> +					vb->planes[plane].mem_priv = vb->planes[i].mem_priv;
> +					break;
> +				}
> +			}
> +
> +			/* There's no need to attach a duplicated dbuf. */
> +			if (vb->planes[plane].dbuf_duplicated)
> +				continue;
> +
>  			/* Acquire each plane's memory */
>  			mem_priv = call_ptr_memop(attach_dmabuf,
>  						  vb,
> @@ -1392,6 +1408,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
>  				goto err_put_dbuf;
>  			}
>  
> +			vb->planes[plane].dbuf_duplicated = false;
>  			vb->planes[plane].dbuf = planes[plane].dbuf;
>  			vb->planes[plane].mem_priv = mem_priv;
>  		}
> @@ -1406,7 +1423,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
>  	 * userspace knows sooner rather than later if the dma-buf map fails.
>  	 */
>  	for (plane = 0; plane < vb->num_planes; ++plane) {
> -		if (vb->planes[plane].dbuf_mapped)
> +		if (vb->planes[plane].dbuf_mapped || vb->planes[plane].dbuf_duplicated)
>  			continue;
>  
>  		ret = call_memop(vb, map_dmabuf, vb->planes[plane].mem_priv);
> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> index 8b86996b2719..2484e7d2881d 100644
> --- a/include/media/videobuf2-core.h
> +++ b/include/media/videobuf2-core.h
> @@ -154,6 +154,8 @@ struct vb2_mem_ops {
>   * @mem_priv:	private data with this plane.
>   * @dbuf:	dma_buf - shared buffer object.
>   * @dbuf_mapped:	flag to show whether dbuf is mapped or not
> + * @duplicated_dbuf:	boolean to show whether dbuf is duplicated with a
> + *		previous plane of the buffer.
>   * @bytesused:	number of bytes occupied by data in the plane (payload).
>   * @length:	size of this plane (NOT the payload) in bytes. The maximum
>   *		valid size is MAX_UINT - PAGE_SIZE.
> @@ -179,6 +181,7 @@ struct vb2_plane {
>  	void			*mem_priv;
>  	struct dma_buf		*dbuf;
>  	unsigned int		dbuf_mapped;
> +	bool			dbuf_duplicated;
>  	unsigned int		bytesused;
>  	unsigned int		length;
>  	unsigned int		min_length;
> -- 
> 2.44.0.478.gd926399ef9-goog
>
Yunke Cao May 20, 2024, 2:19 a.m. UTC | #5
Hi Tomasz,

Thanks for the review.

On Fri, May 17, 2024 at 8:23 PM Tomasz Figa <tfiga@chromium.org> wrote:
>
> On Wed, Apr 03, 2024 at 06:13:06PM +0900, Yunke Cao wrote:
> > When multiple planes use the same dma buf, each plane will have its own dma
> > buf attachment and mapping. It is a waste of IOVA space.
> >
> > This patch adds a dbuf_duplicated boolean in vb2_plane. If a plane's dbuf
> > is the same as an existing plane, do not create another attachment and
> > mapping.
> >
> > Signed-off-by: Yunke Cao <yunkec@chromium.org>
> > ---
> >  .../media/common/videobuf2/videobuf2-core.c   | 27 +++++++++++++++----
> >  include/media/videobuf2-core.h                |  3 +++
> >  2 files changed, 25 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> > index a5368cef73bb..64fe3801b802 100644
> > --- a/drivers/media/common/videobuf2/videobuf2-core.c
> > +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> > @@ -304,10 +304,13 @@ static void __vb2_plane_dmabuf_put(struct vb2_buffer *vb, struct vb2_plane *p)
> >       if (!p->mem_priv)
> >               return;
> >
> > -     if (p->dbuf_mapped)
> > -             call_void_memop(vb, unmap_dmabuf, p->mem_priv);
> > +     if (!p->dbuf_duplicated) {
> > +             if (p->dbuf_mapped)
>
> Side note: Now when I'm reading this code I'm starting to wonder if
> dbuf_mapped really add any value here. Can we even have a situation when we
> have p->dbuf != NULL, but p->dbuf_mapped == false?
>

Hmm, I think you are right. It seems we always do map_dmabuf after
attach_dma_buf.

> > +                     call_void_memop(vb, unmap_dmabuf, p->mem_priv);
> > +
> > +             call_void_memop(vb, detach_dmabuf, p->mem_priv);
> > +     }
> >
> > -     call_void_memop(vb, detach_dmabuf, p->mem_priv);
> >       dma_buf_put(p->dbuf);
> >       p->mem_priv = NULL;
> >       p->dbuf = NULL;
> > @@ -1327,7 +1330,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> >       struct vb2_plane planes[VB2_MAX_PLANES];
> >       struct vb2_queue *q = vb->vb2_queue;
> >       void *mem_priv;
> > -     unsigned int plane;
> > +     unsigned int plane, i;
> >       int ret = 0;
> >       bool reacquired = vb->planes[0].mem_priv == NULL;
> >
> > @@ -1380,6 +1383,19 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> >               __vb2_buf_dmabuf_put(vb);
> >
> >               for (plane = 0; plane < vb->num_planes; ++plane) {
>
> Can we add a short comment here explaining that this is an optimization for
> using the same DMA-buf for many planes?
>

Sure, I will add it in v3.

Best,
Yunke

> Best regards,
> Tomasz
>
> > +                     for (i = 0; i < plane; ++i) {
> > +                             if (planes[plane].dbuf == vb->planes[i].dbuf) {
> > +                                     vb->planes[plane].dbuf_duplicated = true;
> > +                                     vb->planes[plane].dbuf = vb->planes[i].dbuf;
> > +                                     vb->planes[plane].mem_priv = vb->planes[i].mem_priv;
> > +                                     break;
> > +                             }
> > +                     }
> > +
> > +                     /* There's no need to attach a duplicated dbuf. */
> > +                     if (vb->planes[plane].dbuf_duplicated)
> > +                             continue;
> > +
> >                       /* Acquire each plane's memory */
> >                       mem_priv = call_ptr_memop(attach_dmabuf,
> >                                                 vb,
> > @@ -1392,6 +1408,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> >                               goto err_put_dbuf;
> >                       }
> >
> > +                     vb->planes[plane].dbuf_duplicated = false;
> >                       vb->planes[plane].dbuf = planes[plane].dbuf;
> >                       vb->planes[plane].mem_priv = mem_priv;
> >               }
> > @@ -1406,7 +1423,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> >        * userspace knows sooner rather than later if the dma-buf map fails.
> >        */
> >       for (plane = 0; plane < vb->num_planes; ++plane) {
> > -             if (vb->planes[plane].dbuf_mapped)
> > +             if (vb->planes[plane].dbuf_mapped || vb->planes[plane].dbuf_duplicated)
> >                       continue;
> >
> >               ret = call_memop(vb, map_dmabuf, vb->planes[plane].mem_priv);
> > diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> > index 8b86996b2719..2484e7d2881d 100644
> > --- a/include/media/videobuf2-core.h
> > +++ b/include/media/videobuf2-core.h
> > @@ -154,6 +154,8 @@ struct vb2_mem_ops {
> >   * @mem_priv:        private data with this plane.
> >   * @dbuf:    dma_buf - shared buffer object.
> >   * @dbuf_mapped:     flag to show whether dbuf is mapped or not
> > + * @duplicated_dbuf: boolean to show whether dbuf is duplicated with a
> > + *           previous plane of the buffer.
> >   * @bytesused:       number of bytes occupied by data in the plane (payload).
> >   * @length:  size of this plane (NOT the payload) in bytes. The maximum
> >   *           valid size is MAX_UINT - PAGE_SIZE.
> > @@ -179,6 +181,7 @@ struct vb2_plane {
> >       void                    *mem_priv;
> >       struct dma_buf          *dbuf;
> >       unsigned int            dbuf_mapped;
> > +     bool                    dbuf_duplicated;
> >       unsigned int            bytesused;
> >       unsigned int            length;
> >       unsigned int            min_length;
> > --
> > 2.44.0.478.gd926399ef9-goog
> >
Yunke Cao May 30, 2024, 4:13 a.m. UTC | #6
Hi Tomasz,

Thanks for the review.

On Fri, May 17, 2024 at 8:11 PM Tomasz Figa <tfiga@chromium.org> wrote:
>
> Hi Yunke,
>
> On Wed, Apr 03, 2024 at 06:13:04PM +0900, Yunke Cao wrote:
> > The existing implementation, validating planes, checking if the planes
> > changed, releasing previous planes and reaquiring new planes all happens in
> > the same for loop.
> >
> > Split the for loop into 3 parts
> > 1. In the first for loop, validate planes and check if planes changed.
> > 2. Call __vb2_buf_dmabuf_put() to release all planes.
> > 3. In the second for loop, reaquire new planes.
> >
> > Signed-off-by: Yunke Cao <yunkec@chromium.org>
> > ---
> >  .../media/common/videobuf2/videobuf2-core.c   | 64 ++++++++++---------
> >  1 file changed, 34 insertions(+), 30 deletions(-)
> >
>
> Thanks for the second revision and sorry for the delay. Please check my
> comments inline.
>
> > diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> > index b6bf8f232f48..702f7b6f783a 100644
> > --- a/drivers/media/common/videobuf2/videobuf2-core.c
> > +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> > @@ -1341,11 +1341,13 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> >       for (plane = 0; plane < vb->num_planes; ++plane) {
> >               struct dma_buf *dbuf = dma_buf_get(planes[plane].m.fd);
> >
> > +             planes[plane].dbuf = dbuf;
> > +
> >               if (IS_ERR_OR_NULL(dbuf)) {
> >                       dprintk(q, 1, "invalid dmabuf fd for plane %d\n",
> >                               plane);
> >                       ret = -EINVAL;
> > -                     goto err;
> > +                     goto err_put_dbuf;
>
> nit: Maybe err_put_planes, since we're cleaning up the planes[] array?
>

err_put_planes sounds good to me.

> >               }
> >
> >               /* use DMABUF size if length is not provided */
> > @@ -1356,17 +1358,14 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> >                       dprintk(q, 1, "invalid dmabuf length %u for plane %d, minimum length %u\n",
> >                               planes[plane].length, plane,
> >                               vb->planes[plane].min_length);
> > -                     dma_buf_put(dbuf);
> >                       ret = -EINVAL;
> > -                     goto err;
> > +                     goto err_put_dbuf;
> >               }
> >
> >               /* Skip the plane if already verified */
> >               if (dbuf == vb->planes[plane].dbuf &&
> > -                     vb->planes[plane].length == planes[plane].length) {
> > -                     dma_buf_put(dbuf);
> > +                 vb->planes[plane].length == planes[plane].length)
> >                       continue;
> > -             }
> >
> >               dprintk(q, 3, "buffer for plane %d changed\n", plane);
> >
> > @@ -1375,29 +1374,30 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> >                       vb->copied_timestamp = 0;
> >                       call_void_vb_qop(vb, buf_cleanup, vb);
>
> Would it make sense to also move these two to the if (reacquired) part
> below, since they are done once for the entire vb?
>

Yes, Will do in the next version.

> >               }
> > +     }
> >
> > -             /* Release previously acquired memory if present */
> > -             __vb2_plane_dmabuf_put(vb, &vb->planes[plane]);
> > -             vb->planes[plane].bytesused = 0;
> > -             vb->planes[plane].length = 0;
> > -             vb->planes[plane].m.fd = 0;
> > -             vb->planes[plane].data_offset = 0;
>
> I don't see the code below setting the 4 fields above to zero. Is it
> intended?
>

I thought these were not necessary anymore.
But now that I look more carefully, it is useful when there is an error below.
I will add them back in the next version. Thanks for catching this!

> > +     if (reacquired) {
> > +             __vb2_buf_dmabuf_put(vb);
> > +
> > +             for (plane = 0; plane < vb->num_planes; ++plane) {
> > +                     /* Acquire each plane's memory */
> > +                     mem_priv = call_ptr_memop(attach_dmabuf,
> > +                                               vb,
> > +                                               q->alloc_devs[plane] ? : q->dev,
> > +                                               planes[plane].dbuf,
> > +                                               planes[plane].length);
> > +                     if (IS_ERR(mem_priv)) {
> > +                             dprintk(q, 1, "failed to attach dmabuf\n");
> > +                             ret = PTR_ERR(mem_priv);
> > +                             goto err_put_dbuf;
>
> Hmm, I think in this case we need to also clean up the partially acquired
> planes of vb.
>
> > +                     }
> >
> > -             /* Acquire each plane's memory */
> > -             mem_priv = call_ptr_memop(attach_dmabuf,
> > -                                       vb,
> > -                                       q->alloc_devs[plane] ? : q->dev,
> > -                                       dbuf,
> > -                                       planes[plane].length);
> > -             if (IS_ERR(mem_priv)) {
> > -                     dprintk(q, 1, "failed to attach dmabuf\n");
> > -                     ret = PTR_ERR(mem_priv);
> > -                     dma_buf_put(dbuf);
> > -                     goto err;
> > +                     vb->planes[plane].dbuf = planes[plane].dbuf;
> > +                     vb->planes[plane].mem_priv = mem_priv;
> >               }
> > -
> > -             vb->planes[plane].dbuf = dbuf;
> > -             vb->planes[plane].mem_priv = mem_priv;
> > +     } else {
> > +             for (plane = 0; plane < vb->num_planes; ++plane)
> > +                     dma_buf_put(planes[plane].dbuf);
> >       }
> >
> >       /*
> > @@ -1413,7 +1413,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> >               if (ret) {
> >                       dprintk(q, 1, "failed to map dmabuf for plane %d\n",
> >                               plane);
> > -                     goto err;
> > +                     goto err_put_vb2_buf;
> >               }
> >               vb->planes[plane].dbuf_mapped = 1;
> >       }
>
> I think this entire loop can also go under the (reacquired) case, since
> (!reacquired) means that all the planes were identical (and thus are
> alreday mapped). Given that now we release all the planes in one go, we
> could even simplify it by dropping the dbuf_mapped check from the loop.
>
> > @@ -1437,7 +1437,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> >               ret = call_vb_qop(vb, buf_init, vb);
> >               if (ret) {
> >                       dprintk(q, 1, "buffer initialization failed\n");
> > -                     goto err;
> > +                     goto err_put_vb2_buf;
> >               }
> >       }
>
> Same for this block.
>
> >
> > @@ -1445,11 +1445,15 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> >       if (ret) {
> >               dprintk(q, 1, "buffer preparation failed\n");
> >               call_void_vb_qop(vb, buf_cleanup, vb);
> > -             goto err;
> > +             goto err_put_vb2_buf;
> >       }
> >
> >       return 0;
> > -err:
> > +
> > +err_put_dbuf:
> > +     for (plane = 0; plane < vb->num_planes; ++plane)
>
> dma_buf_put() will throw a warning if the dmabuf pointer is NULL and just
> plain crash if IS_ERR(), so we shouldn't call it on array elements that we
> didn't succeed for.
>

I see. Will do in the next version.

> > +             dma_buf_put(planes[plane].dbuf);
> > +err_put_vb2_buf:
> >       /* In case of errors, release planes that were already acquired */
> >       __vb2_buf_dmabuf_put(vb);
>
> Actually, would it make sense to invert the order of clean-up steps here?
> In case if only the first loop fails, we don't really need to do anything with
> vb. Or am I missing something?
>

It seems the original implementation will call __vb2_buf_dmabuf_put(vb)
whenever dma_buf_get() returns err or length < min_length. I was trying
to keep the same behavior here. Do you have any preference?

Also, if "call_vb_qop(vb, buf_prepare, vb);" fails, I think we only need
__vb2_buf_dmabuf_put(), but not dma_buf_put().

Best,
Yunke

> Best regards,
> Tomasz
Tomasz Figa June 4, 2024, 3:56 a.m. UTC | #7
On Thu, May 30, 2024 at 01:13:13PM +0900, Yunke Cao wrote:
> Hi Tomasz,
> 
> Thanks for the review.
> 
> On Fri, May 17, 2024 at 8:11 PM Tomasz Figa <tfiga@chromium.org> wrote:
> >
> > Hi Yunke,
> >
> > On Wed, Apr 03, 2024 at 06:13:04PM +0900, Yunke Cao wrote:
> > > The existing implementation, validating planes, checking if the planes
> > > changed, releasing previous planes and reaquiring new planes all happens in
> > > the same for loop.
> > >
> > > Split the for loop into 3 parts
> > > 1. In the first for loop, validate planes and check if planes changed.
> > > 2. Call __vb2_buf_dmabuf_put() to release all planes.
> > > 3. In the second for loop, reaquire new planes.
> > >
> > > Signed-off-by: Yunke Cao <yunkec@chromium.org>
> > > ---
> > >  .../media/common/videobuf2/videobuf2-core.c   | 64 ++++++++++---------
> > >  1 file changed, 34 insertions(+), 30 deletions(-)
> > >
> >
> > Thanks for the second revision and sorry for the delay. Please check my
> > comments inline.
> >
> > > diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> > > index b6bf8f232f48..702f7b6f783a 100644
> > > --- a/drivers/media/common/videobuf2/videobuf2-core.c
> > > +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> > > @@ -1341,11 +1341,13 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> > >       for (plane = 0; plane < vb->num_planes; ++plane) {
> > >               struct dma_buf *dbuf = dma_buf_get(planes[plane].m.fd);
> > >
> > > +             planes[plane].dbuf = dbuf;
> > > +
> > >               if (IS_ERR_OR_NULL(dbuf)) {
> > >                       dprintk(q, 1, "invalid dmabuf fd for plane %d\n",
> > >                               plane);
> > >                       ret = -EINVAL;
> > > -                     goto err;
> > > +                     goto err_put_dbuf;
> >
> > nit: Maybe err_put_planes, since we're cleaning up the planes[] array?
> >
> 
> err_put_planes sounds good to me.
> 
> > >               }
> > >
> > >               /* use DMABUF size if length is not provided */
> > > @@ -1356,17 +1358,14 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> > >                       dprintk(q, 1, "invalid dmabuf length %u for plane %d, minimum length %u\n",
> > >                               planes[plane].length, plane,
> > >                               vb->planes[plane].min_length);
> > > -                     dma_buf_put(dbuf);
> > >                       ret = -EINVAL;
> > > -                     goto err;
> > > +                     goto err_put_dbuf;
> > >               }
> > >
> > >               /* Skip the plane if already verified */
> > >               if (dbuf == vb->planes[plane].dbuf &&
> > > -                     vb->planes[plane].length == planes[plane].length) {
> > > -                     dma_buf_put(dbuf);
> > > +                 vb->planes[plane].length == planes[plane].length)
> > >                       continue;
> > > -             }
> > >
> > >               dprintk(q, 3, "buffer for plane %d changed\n", plane);
> > >
> > > @@ -1375,29 +1374,30 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> > >                       vb->copied_timestamp = 0;
> > >                       call_void_vb_qop(vb, buf_cleanup, vb);
> >
> > Would it make sense to also move these two to the if (reacquired) part
> > below, since they are done once for the entire vb?
> >
> 
> Yes, Will do in the next version.
> 
> > >               }
> > > +     }
> > >
> > > -             /* Release previously acquired memory if present */
> > > -             __vb2_plane_dmabuf_put(vb, &vb->planes[plane]);
> > > -             vb->planes[plane].bytesused = 0;
> > > -             vb->planes[plane].length = 0;
> > > -             vb->planes[plane].m.fd = 0;
> > > -             vb->planes[plane].data_offset = 0;
> >
> > I don't see the code below setting the 4 fields above to zero. Is it
> > intended?
> >
> 
> I thought these were not necessary anymore.
> But now that I look more carefully, it is useful when there is an error below.
> I will add them back in the next version. Thanks for catching this!
> 

Actually, original code would leave the fields in a weird state in case of
an error too. Maybe we could set all the fields to 0 in
__vb2_plane_dmabuf_put(), so that we always have the vb2 struct in a clean
state regardless of where we fail? (Could be a separate patch.)

> > > +     if (reacquired) {
> > > +             __vb2_buf_dmabuf_put(vb);
> > > +
> > > +             for (plane = 0; plane < vb->num_planes; ++plane) {
> > > +                     /* Acquire each plane's memory */
> > > +                     mem_priv = call_ptr_memop(attach_dmabuf,
> > > +                                               vb,
> > > +                                               q->alloc_devs[plane] ? : q->dev,
> > > +                                               planes[plane].dbuf,
> > > +                                               planes[plane].length);
> > > +                     if (IS_ERR(mem_priv)) {
> > > +                             dprintk(q, 1, "failed to attach dmabuf\n");
> > > +                             ret = PTR_ERR(mem_priv);
> > > +                             goto err_put_dbuf;
> >
> > Hmm, I think in this case we need to also clean up the partially acquired
> > planes of vb.
> >
> > > +                     }
> > >
> > > -             /* Acquire each plane's memory */
> > > -             mem_priv = call_ptr_memop(attach_dmabuf,
> > > -                                       vb,
> > > -                                       q->alloc_devs[plane] ? : q->dev,
> > > -                                       dbuf,
> > > -                                       planes[plane].length);
> > > -             if (IS_ERR(mem_priv)) {
> > > -                     dprintk(q, 1, "failed to attach dmabuf\n");
> > > -                     ret = PTR_ERR(mem_priv);
> > > -                     dma_buf_put(dbuf);
> > > -                     goto err;
> > > +                     vb->planes[plane].dbuf = planes[plane].dbuf;
> > > +                     vb->planes[plane].mem_priv = mem_priv;
> > >               }
> > > -
> > > -             vb->planes[plane].dbuf = dbuf;
> > > -             vb->planes[plane].mem_priv = mem_priv;
> > > +     } else {
> > > +             for (plane = 0; plane < vb->num_planes; ++plane)
> > > +                     dma_buf_put(planes[plane].dbuf);
> > >       }
> > >
> > >       /*
> > > @@ -1413,7 +1413,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> > >               if (ret) {
> > >                       dprintk(q, 1, "failed to map dmabuf for plane %d\n",
> > >                               plane);
> > > -                     goto err;
> > > +                     goto err_put_vb2_buf;
> > >               }
> > >               vb->planes[plane].dbuf_mapped = 1;
> > >       }
> >
> > I think this entire loop can also go under the (reacquired) case, since
> > (!reacquired) means that all the planes were identical (and thus are
> > alreday mapped). Given that now we release all the planes in one go, we
> > could even simplify it by dropping the dbuf_mapped check from the loop.
> >
> > > @@ -1437,7 +1437,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> > >               ret = call_vb_qop(vb, buf_init, vb);
> > >               if (ret) {
> > >                       dprintk(q, 1, "buffer initialization failed\n");
> > > -                     goto err;
> > > +                     goto err_put_vb2_buf;
> > >               }
> > >       }
> >
> > Same for this block.
> >
> > >
> > > @@ -1445,11 +1445,15 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
> > >       if (ret) {
> > >               dprintk(q, 1, "buffer preparation failed\n");
> > >               call_void_vb_qop(vb, buf_cleanup, vb);
> > > -             goto err;
> > > +             goto err_put_vb2_buf;
> > >       }
> > >
> > >       return 0;
> > > -err:
> > > +
> > > +err_put_dbuf:
> > > +     for (plane = 0; plane < vb->num_planes; ++plane)
> >
> > dma_buf_put() will throw a warning if the dmabuf pointer is NULL and just
> > plain crash if IS_ERR(), so we shouldn't call it on array elements that we
> > didn't succeed for.
> >
> 
> I see. Will do in the next version.
> 
> > > +             dma_buf_put(planes[plane].dbuf);
> > > +err_put_vb2_buf:
> > >       /* In case of errors, release planes that were already acquired */
> > >       __vb2_buf_dmabuf_put(vb);
> >
> > Actually, would it make sense to invert the order of clean-up steps here?
> > In case if only the first loop fails, we don't really need to do anything with
> > vb. Or am I missing something?
> >
> 
> It seems the original implementation will call __vb2_buf_dmabuf_put(vb)
> whenever dma_buf_get() returns err or length < min_length. I was trying
> to keep the same behavior here. Do you have any preference?
> 
> Also, if "call_vb_qop(vb, buf_prepare, vb);" fails, I think we only need
> __vb2_buf_dmabuf_put(), but not dma_buf_put().

I see, fair enough. I also reread the function to double check that we're
not double cleaning anything and it looks good to me.

Best regards,
Tomasz