mbox series

[v1,0/6] Move all drivers to a common dma-buf locking convention

Message ID 20220715005244.42198-1-dmitry.osipenko@collabora.com
Headers show
Series Move all drivers to a common dma-buf locking convention | expand

Message

Dmitry Osipenko July 15, 2022, 12:52 a.m. UTC
Hello,

This series moves all drivers to a dynamic dma-buf locking specification.

Comments

Dmitry Osipenko July 15, 2022, 6:59 a.m. UTC | #1
On 7/15/22 09:50, Christian König wrote:
> Am 15.07.22 um 02:52 schrieb Dmitry Osipenko:
>> Intel i915 GPU driver uses wait-wound mutex to lock multiple GEMs on the
>> attachment to the i915 dma-buf. In order to let all drivers utilize
>> shared
>> wait-wound context during attachment in a general way, make dma-buf
>> core to
>> acquire the ww context internally for the attachment operation and update
>> i915 driver to use the importer's ww context instead of the internal one.
>>
>>  From now on all dma-buf exporters shall use the importer's ww context
>> for
>> the attachment operation.
>>
>> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
>> ---
>>   drivers/dma-buf/dma-buf.c                     |  8 +++++-
>>   drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |  2 +-
>>   .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  2 +-
>>   drivers/gpu/drm/i915/gem/i915_gem_object.h    |  6 ++---
>>   drivers/gpu/drm/i915/i915_gem_evict.c         |  2 +-
>>   drivers/gpu/drm/i915/i915_gem_ww.c            | 26 +++++++++++++++----
>>   drivers/gpu/drm/i915/i915_gem_ww.h            | 15 +++++++++--
>>   7 files changed, 47 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>> index 0ee588276534..37545ecb845a 100644
>> --- a/drivers/dma-buf/dma-buf.c
>> +++ b/drivers/dma-buf/dma-buf.c
>> @@ -807,6 +807,8 @@ static struct sg_table * __map_dma_buf(struct
>> dma_buf_attachment *attach,
>>    * Optionally this calls &dma_buf_ops.attach to allow
>> device-specific attach
>>    * functionality.
>>    *
>> + * Exporters shall use ww_ctx acquired by this function.
>> + *
>>    * Returns:
>>    *
>>    * A pointer to newly created &dma_buf_attachment on success, or a
>> negative
>> @@ -822,6 +824,7 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf
>> *dmabuf, struct device *dev,
>>                   void *importer_priv)
>>   {
>>       struct dma_buf_attachment *attach;
>> +    struct ww_acquire_ctx ww_ctx;
>>       int ret;
>>         if (WARN_ON(!dmabuf || !dev))
>> @@ -841,7 +844,8 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf
>> *dmabuf, struct device *dev,
>>       attach->importer_ops = importer_ops;
>>       attach->importer_priv = importer_priv;
>>   -    dma_resv_lock(dmabuf->resv, NULL);
>> +    ww_acquire_init(&ww_ctx, &reservation_ww_class);
>> +    dma_resv_lock(dmabuf->resv, &ww_ctx);
> 
> That won't work like this. The core property of a WW context is that you
> need to unwind all the locks and re-quire them with the contended one
> first.
> 
> When you statically lock the imported one here you can't do that any more.

You're right. I felt that something is missing here, but couldn't
notice. I'll think more about this and enable
CONFIG_DEBUG_WW_MUTEX_SLOWPATH. Thank you!
Dmitry Osipenko July 15, 2022, 9:31 a.m. UTC | #2
On 7/15/22 10:19, Christian König wrote:
>> -struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment
>> *attach,
>> -                    enum dma_data_direction direction)
>> +struct sg_table *
>> +dma_buf_map_attachment_unlocked(struct dma_buf_attachment *attach,
>> +                enum dma_data_direction direction)
> 
> The locking state of mapping and unmapping operations depend on if the
> attachment is dynamic or not.
> 
> So this here is not a good idea at all since it suggests that the
> function is always called without holding the lock.

I had the same thought while was working on this patch and initially was
thinking about adding an "unlocked" alias to dma_buf_map_attachment().
In the end I decided that it will create even more confusion and it's
simpler just to rename this func here since there are only two drivers
using the dynamic mapping.

Do you have suggestions how to improve it?
Christian König July 20, 2022, 8:29 a.m. UTC | #3
Am 19.07.22 um 22:05 schrieb Dmitry Osipenko:
> On 7/15/22 09:59, Dmitry Osipenko wrote:
>> On 7/15/22 09:50, Christian König wrote:
>>> Am 15.07.22 um 02:52 schrieb Dmitry Osipenko:
>>>> Intel i915 GPU driver uses wait-wound mutex to lock multiple GEMs on the
>>>> attachment to the i915 dma-buf. In order to let all drivers utilize
>>>> shared
>>>> wait-wound context during attachment in a general way, make dma-buf
>>>> core to
>>>> acquire the ww context internally for the attachment operation and update
>>>> i915 driver to use the importer's ww context instead of the internal one.
>>>>
>>>>   From now on all dma-buf exporters shall use the importer's ww context
>>>> for
>>>> the attachment operation.
>>>>
>>>> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
>>>> ---
>>>>    drivers/dma-buf/dma-buf.c                     |  8 +++++-
>>>>    drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |  2 +-
>>>>    .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  2 +-
>>>>    drivers/gpu/drm/i915/gem/i915_gem_object.h    |  6 ++---
>>>>    drivers/gpu/drm/i915/i915_gem_evict.c         |  2 +-
>>>>    drivers/gpu/drm/i915/i915_gem_ww.c            | 26 +++++++++++++++----
>>>>    drivers/gpu/drm/i915/i915_gem_ww.h            | 15 +++++++++--
>>>>    7 files changed, 47 insertions(+), 14 deletions(-)
>>>>
>>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>>> index 0ee588276534..37545ecb845a 100644
>>>> --- a/drivers/dma-buf/dma-buf.c
>>>> +++ b/drivers/dma-buf/dma-buf.c
>>>> @@ -807,6 +807,8 @@ static struct sg_table * __map_dma_buf(struct
>>>> dma_buf_attachment *attach,
>>>>     * Optionally this calls &dma_buf_ops.attach to allow
>>>> device-specific attach
>>>>     * functionality.
>>>>     *
>>>> + * Exporters shall use ww_ctx acquired by this function.
>>>> + *
>>>>     * Returns:
>>>>     *
>>>>     * A pointer to newly created &dma_buf_attachment on success, or a
>>>> negative
>>>> @@ -822,6 +824,7 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf
>>>> *dmabuf, struct device *dev,
>>>>                    void *importer_priv)
>>>>    {
>>>>        struct dma_buf_attachment *attach;
>>>> +    struct ww_acquire_ctx ww_ctx;
>>>>        int ret;
>>>>          if (WARN_ON(!dmabuf || !dev))
>>>> @@ -841,7 +844,8 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf
>>>> *dmabuf, struct device *dev,
>>>>        attach->importer_ops = importer_ops;
>>>>        attach->importer_priv = importer_priv;
>>>>    -    dma_resv_lock(dmabuf->resv, NULL);
>>>> +    ww_acquire_init(&ww_ctx, &reservation_ww_class);
>>>> +    dma_resv_lock(dmabuf->resv, &ww_ctx);
>>> That won't work like this. The core property of a WW context is that you
>>> need to unwind all the locks and re-quire them with the contended one
>>> first.
>>>
>>> When you statically lock the imported one here you can't do that any more.
>> You're right. I felt that something is missing here, but couldn't
>> notice. I'll think more about this and enable
>> CONFIG_DEBUG_WW_MUTEX_SLOWPATH. Thank you!
>>
> Christian, do you think we could make an excuse for the attach()
> callback and make the exporter responsible for taking the resv lock? It
> will be inconsistent with the rest of the callbacks, where importer
> takes the lock, but it will be the simplest and least invasive solution.
> It's very messy to do a cross-driver ww locking, I don't think it's the
> right approach.

So to summarize the following calls will require that the caller hold 
the resv lock:
1. dma_buf_pin()/dma_buf_unpin()
2. dma_buf_map_attachment()/dma_buf_unmap_attachment()
3. dma_buf_vmap()/dma_buf_vunmap()
4. dma_buf_move_notify()

The following calls require that caller does not held the resv lock:
1. dma_buf_attach()/dma_buf_dynamic_attach()/dma_buf_detach()
2. dma_buf_export()/dma_buf_fd()
3. dma_buf_get()/dma_buf_put()
4. dma_buf_begin_cpu_access()/dma_buf_end_cpu_access()

If that's correct than that would work for me as well, but we should 
probably document this.

Or let me ask the other way around: What calls exactly do you need to 
change to solve your original issue? That was vmap/vunmap, wasn't it? If 
yes then let's concentrate on those for the moment.

Regards,
Christian.
Dmitry Osipenko July 20, 2022, 12:18 p.m. UTC | #4
On 7/20/22 11:29, Christian König wrote:
> Am 19.07.22 um 22:05 schrieb Dmitry Osipenko:
>> On 7/15/22 09:59, Dmitry Osipenko wrote:
>>> On 7/15/22 09:50, Christian König wrote:
>>>> Am 15.07.22 um 02:52 schrieb Dmitry Osipenko:
>>>>> Intel i915 GPU driver uses wait-wound mutex to lock multiple GEMs
>>>>> on the
>>>>> attachment to the i915 dma-buf. In order to let all drivers utilize
>>>>> shared
>>>>> wait-wound context during attachment in a general way, make dma-buf
>>>>> core to
>>>>> acquire the ww context internally for the attachment operation and
>>>>> update
>>>>> i915 driver to use the importer's ww context instead of the
>>>>> internal one.
>>>>>
>>>>>   From now on all dma-buf exporters shall use the importer's ww
>>>>> context
>>>>> for
>>>>> the attachment operation.
>>>>>
>>>>> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
>>>>> ---
>>>>>    drivers/dma-buf/dma-buf.c                     |  8 +++++-
>>>>>    drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |  2 +-
>>>>>    .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  2 +-
>>>>>    drivers/gpu/drm/i915/gem/i915_gem_object.h    |  6 ++---
>>>>>    drivers/gpu/drm/i915/i915_gem_evict.c         |  2 +-
>>>>>    drivers/gpu/drm/i915/i915_gem_ww.c            | 26
>>>>> +++++++++++++++----
>>>>>    drivers/gpu/drm/i915/i915_gem_ww.h            | 15 +++++++++--
>>>>>    7 files changed, 47 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>>>> index 0ee588276534..37545ecb845a 100644
>>>>> --- a/drivers/dma-buf/dma-buf.c
>>>>> +++ b/drivers/dma-buf/dma-buf.c
>>>>> @@ -807,6 +807,8 @@ static struct sg_table * __map_dma_buf(struct
>>>>> dma_buf_attachment *attach,
>>>>>     * Optionally this calls &dma_buf_ops.attach to allow
>>>>> device-specific attach
>>>>>     * functionality.
>>>>>     *
>>>>> + * Exporters shall use ww_ctx acquired by this function.
>>>>> + *
>>>>>     * Returns:
>>>>>     *
>>>>>     * A pointer to newly created &dma_buf_attachment on success, or a
>>>>> negative
>>>>> @@ -822,6 +824,7 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf
>>>>> *dmabuf, struct device *dev,
>>>>>                    void *importer_priv)
>>>>>    {
>>>>>        struct dma_buf_attachment *attach;
>>>>> +    struct ww_acquire_ctx ww_ctx;
>>>>>        int ret;
>>>>>          if (WARN_ON(!dmabuf || !dev))
>>>>> @@ -841,7 +844,8 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf
>>>>> *dmabuf, struct device *dev,
>>>>>        attach->importer_ops = importer_ops;
>>>>>        attach->importer_priv = importer_priv;
>>>>>    -    dma_resv_lock(dmabuf->resv, NULL);
>>>>> +    ww_acquire_init(&ww_ctx, &reservation_ww_class);
>>>>> +    dma_resv_lock(dmabuf->resv, &ww_ctx);
>>>> That won't work like this. The core property of a WW context is that
>>>> you
>>>> need to unwind all the locks and re-quire them with the contended one
>>>> first.
>>>>
>>>> When you statically lock the imported one here you can't do that any
>>>> more.
>>> You're right. I felt that something is missing here, but couldn't
>>> notice. I'll think more about this and enable
>>> CONFIG_DEBUG_WW_MUTEX_SLOWPATH. Thank you!
>>>
>> Christian, do you think we could make an excuse for the attach()
>> callback and make the exporter responsible for taking the resv lock? It
>> will be inconsistent with the rest of the callbacks, where importer
>> takes the lock, but it will be the simplest and least invasive solution.
>> It's very messy to do a cross-driver ww locking, I don't think it's the
>> right approach.
> 
> So to summarize the following calls will require that the caller hold
> the resv lock:
> 1. dma_buf_pin()/dma_buf_unpin()
> 2. dma_buf_map_attachment()/dma_buf_unmap_attachment()
> 3. dma_buf_vmap()/dma_buf_vunmap()
> 4. dma_buf_move_notify()
> 
> The following calls require that caller does not held the resv lock:
> 1. dma_buf_attach()/dma_buf_dynamic_attach()/dma_buf_detach()
> 2. dma_buf_export()/dma_buf_fd()
> 3. dma_buf_get()/dma_buf_put()
> 4. dma_buf_begin_cpu_access()/dma_buf_end_cpu_access()
> 
> If that's correct than that would work for me as well, but we should
> probably document this.

Looks good, thank you. I'll try this variant.

> Or let me ask the other way around: What calls exactly do you need to
> change to solve your original issue? That was vmap/vunmap, wasn't it? If
> yes then let's concentrate on those for the moment.

Originally, Daniel Vetter asked to sort out the dma-buf lockings across
all drivers, so we could replace custom locks in DRM-SHMEM with the resv
lock, otherwise there were no guarantees that we won't have deadlocks in
the dma-buf code paths.

The vmap/vunmap is one of the paths that needs to be sorted out, there
is no particular issue with it, just need to specify the convention. The
mmaping was the other questionable path and we concluded that it's
better to prohibit dma-buf mappings for DRM entirely. Lastly, there is
i915 attach() that uses the ww locking.