diff mbox series

[RFC,16/19] mm/frame-vector: remove FOLL_FORCE usage

Message ID 20221107161740.144456-17-david@redhat.com
State Accepted
Commit cb78a634f3f7ff743e19fbffcb72d794e4bd7f73
Series mm/gup: remove FOLL_FORCE usage from drivers (reliable R/O long-term pinning)

Commit Message

David Hildenbrand Nov. 7, 2022, 4:17 p.m. UTC
FOLL_FORCE is really only for debugger access. According to commit
707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always
writable"), the pinned pages are always writable.

FOLL_FORCE in this case seems to be a legacy leftover. Let's just remove
it.

Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/media/common/videobuf2/frame_vector.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Tomasz Figa Nov. 8, 2022, 4:45 a.m. UTC | #1
Hi David,

On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand <david@redhat.com> wrote:
>
> FOLL_FORCE is really only for debugger access. According to commit
> 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always
> writable"), the pinned pages are always writable.

Actually that patch is only a workaround to temporarily disable
support for read-only pages as they seemed to suffer from some
corruption issues in the retrieved user pages. We expect to support
read-only pages as hardware input after. That said, FOLL_FORCE doesn't
sound like the right thing even in that case, but I don't know the
background behind it being added here in the first place. +Hans
Verkuil +Marek Szyprowski do you happen to remember anything about it?

Best regards,
Tomasz

David Hildenbrand Nov. 8, 2022, 9:40 a.m. UTC | #2
On 08.11.22 05:45, Tomasz Figa wrote:
> Hi David,

Hi Tomasz,

thanks for looking into this!

> 
> On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand <david@redhat.com> wrote:
>>
>> FOLL_FORCE is really only for debugger access. According to commit
>> 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always
>> writable"), the pinned pages are always writable.
> 

As reference, the cover letter of the series:

https://lkml.kernel.org/r/20221107161740.144456-1-david@redhat.com

> Actually that patch is only a workaround to temporarily disable
> support for read-only pages as they seemed to suffer from some
> corruption issues in the retrieved user pages. We expect to support
> read-only pages as hardware input after. That said, FOLL_FORCE doesn't
> sound like the right thing even in that case, but I don't know the
> background behind it being added here in the first place. +Hans
> Verkuil +Marek Szyprowski do you happen to remember anything about it?

Maybe I misinterpreted 707947247e95; re-reading it, I am not
quite sure what the actual problem is and how it relates to GUP
FOLL_WRITE handling. FOLL_FORCE was already in place before 707947247e95
and should be removed.

What I understood is "Just always call vb2_create_framevec() with 
FOLL_WRITE to always allow writing to the buffers.".


If the pinned page is never written to via the obtained GUP reference:
* FOLL_WRITE should not be set
* FOLL_FORCE should not be set
* We should not dirty the page when unpinning

If the pinned page may be written to via the obtained GUP reference:
* FOLL_WRITE should be set
* FOLL_FORCE should not be set
* We should dirty the page when unpinning


If the function is called for both, we should think about making it
conditional on a "write" variable, like the code before 707947247e95 did.
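
A minimal user-space sketch of the two cases listed above, for illustration
only. The DEMO_* flag names and values, struct pin_policy and
pin_policy_for() are hypothetical stand-ins, not the kernel's definitions;
real code would pass the kernel's own FOLL_* flags to pin_user_pages_fast().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's FOLL_* bits; values are arbitrary. */
#define DEMO_FOLL_WRITE    0x1u
#define DEMO_FOLL_FORCE    0x2u
#define DEMO_FOLL_LONGTERM 0x4u

/* What a caller should do for one pin/unpin cycle (hypothetical type). */
struct pin_policy {
	unsigned int gup_flags;  /* flags to request when pinning */
	bool dirty_on_unpin;     /* mark the pages dirty when unpinning? */
};

/*
 * write == true:  the pages may be written via the obtained GUP reference.
 * write == false: the pages are only ever read via it.
 * FOLL_FORCE is requested in neither case.
 */
static struct pin_policy pin_policy_for(bool write)
{
	struct pin_policy p = {
		.gup_flags = DEMO_FOLL_LONGTERM,
		.dirty_on_unpin = write,
	};

	if (write)
		p.gup_flags |= DEMO_FOLL_WRITE;
	return p;
}
```

Deriving both the pin flags and the dirty-on-unpin decision from the same
"write" value keeps the two from drifting apart.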

@Hans, any insight?
Hans Verkuil Nov. 22, 2022, 12:25 p.m. UTC | #3
Hi Tomasz, David,

On 11/8/22 05:45, Tomasz Figa wrote:
> Hi David,
> 
> On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand <david@redhat.com> wrote:
>>
>> FOLL_FORCE is really only for debugger access. According to commit
>> 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always
>> writable"), the pinned pages are always writable.
> 
> Actually that patch is only a workaround to temporarily disable
> support for read-only pages as they seemed to suffer from some
> corruption issues in the retrieved user pages. We expect to support
> read-only pages as hardware input after. That said, FOLL_FORCE doesn't
> sound like the right thing even in that case, but I don't know the
> background behind it being added here in the first place. +Hans
> Verkuil +Marek Szyprowski do you happen to remember anything about it?

I tracked the use of 'force' all the way back to the first git commit
(2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the
reason is lost in the mists of time.

I'm not sure if the 'force' argument of get_user_pages() at that time
even meant the same as FOLL_FORCE today. From what I can tell it has just
been faithfully used ever since, but I have my doubts that anyone understands
the reason behind it since it was never explained.

Looking at this old LWN article https://lwn.net/Articles/28548/ suggests
that it might be related to calling get_user_pages for write buffers
(non-zero write argument) where you also want to be able to read from the
buffer. That is certainly something that some drivers need in order to do
post-capture fixups.

But 'force' was also always set for read buffers, and I don't know if that
was something that was actually needed, or just laziness.

I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still
allow drivers to read from the buffer?

Regards,

	Hans

David Hildenbrand Nov. 22, 2022, 12:38 p.m. UTC | #4
On 22.11.22 13:25, Hans Verkuil wrote:
> I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still
> allow drivers to read from the buffer?

Yes. The only problematic corner case I can imagine is if someone has a 
VMA without write permissions (no PROT_WRITE/VM_WRITE) and wants to pin 
user space pages as a read buffer. We'd now specify FOLL_WRITE without
FOLL_FORCE and GUP would reject that: write access without write 
permissions is invalid.
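
The corner case above can be sketched as a deliberately simplified
user-space model. demo_pin_allowed() and the DEMO_* flags are hypothetical
names invented for this illustration; the real GUP permission checks look at
more state than a single "writable" bit, and treating FOLL_FORCE as an
unconditional override is an oversimplification.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's FOLL_* bits; values are arbitrary. */
#define DEMO_FOLL_WRITE 0x1u
#define DEMO_FOLL_FORCE 0x2u

/*
 * Simplified model: returns true if a pin request would be accepted.
 * Only the rule discussed in this thread is modelled.
 */
static bool demo_pin_allowed(bool vma_writable, unsigned int gup_flags)
{
	if (!(gup_flags & DEMO_FOLL_WRITE))
		return true;  /* read-only pin: write permission is not needed */
	if (vma_writable)
		return true;  /* write pin into a writable mapping */
	/* write pin into a non-writable mapping: only FOLL_FORCE overrides */
	return (gup_flags & DEMO_FOLL_FORCE) != 0;
}
```

In this model, a non-writable mapping pinned with FOLL_WRITE alone is the
only combination that gets rejected, which is the case that would change
behaviour once FOLL_FORCE is dropped.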

There would be no way around "fixing" this implementation to not specify 
FOLL_WRITE when only reading from user-space pages. Not sure what the 
implications are regarding that corruption that was mentioned in 
707947247e95.

Having said that, I assume such a scenario is unlikely -- but you might 
know better how user space usually uses this interface. There would be 
three options:

1) Leave the FOLL_FORCE hack in for now, which I *really* want to avoid.
2) Remove FOLL_FORCE and see if anybody even notices (this patch) and
    leave the implementation as is for now.
3) Remove FOLL_FORCE and fixup the implementation to only specify
    FOLL_WRITE if the pages will actually get written to.

3) would most probably be ideal; however, I am no expert on that code and
can't do it (707947247e95 confuses me). So naive me would go with 2) first.
Hans Verkuil Nov. 22, 2022, 2:07 p.m. UTC | #5
On 11/22/22 13:38, David Hildenbrand wrote:
> Yes. The only problematic corner case I can imagine is if someone has a 
> VMA without write permissions (no PROT_WRITE/VM_WRITE) and wants to pin 
> user space pages as a read buffer. We'd specify now FOLL_WRITE without 
> FOLL_FORCE and GUP would reject that: write access without write 
> permissions is invalid.

I do not believe this will be an issue.

> 
> There would be no way around "fixing" this implementation to not specify 
> FOLL_WRITE when only reading from user-space pages. Not sure what the 
> implications are regarding that corruption that was mentioned in 
> 707947247e95.

Before 707947247e95 the FOLL_WRITE flag was only set for write buffers
(i.e. video capture, DMA_FROM_DEVICE), not for read buffers (video output,
DMA_TO_DEVICE). In the video output case there should never be any need
for drivers to write to the buffer to the best of my knowledge.
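
As a rough sketch of the pre-707947247e95 behaviour described above: the
enum and function below are hypothetical user-space stand-ins written for
this illustration (the kernel has its own DMA direction type), and only the
two directions named here are modelled.

```c
#include <stdbool.h>

/* Hypothetical mirror of the two directions mentioned above. */
enum demo_dma_dir {
	DEMO_DMA_TO_DEVICE,    /* video output: the device reads the buffer */
	DEMO_DMA_FROM_DEVICE,  /* video capture: the device writes the buffer */
};

/* Request write access only when the device will write into the buffer. */
static bool demo_needs_write(enum demo_dma_dir dir)
{
	return dir == DEMO_DMA_FROM_DEVICE;
}
```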

But I have had some complaints that this commit causes problems
in some scenarios, and it has been on my todo list for quite some time now
to dig deeper into this. I probably should prioritize this for this or
next week.

> 
> Having said that, I assume such a scenario is unlikely -- but you might 
> know better how user space usually uses this interface. There would be 
> three options:
> 
> 1) Leave the FOLL_FORCE hack in for now, which I *really* want to avoid.
> 2) Remove FOLL_FORCE and see if anybody even notices (this patch) and
>     leave the implementation as is for now.
> 3) Remove FOLL_FORCE and fixup the implementation to only specify
>     FOLL_WRITE if the pages will actually get written to.
> 
> 3) would most probably be ideal; however, I am no expert on that code and
> can't do it (707947247e95 confuses me). So naive me would go with 2) first.
> 

Option 3 would be best. And 707947247e95 confuses me as well, and I actually
wrote it :-) I am wondering whether it was addressed at the right level, but
as I said, I need to dig a bit deeper into this.

Regards,

	Hans
David Hildenbrand Nov. 22, 2022, 3:03 p.m. UTC | #6
On 22.11.22 15:07, Hans Verkuil wrote:
> Option 3 would be best. And 707947247e95 confuses me as well, and I actually
> wrote it :-) I am wondering whether it was addressed at the right level, but
> as I said, I need to dig a bit deeper into this.

Cool, let me know if I can help!
Linus Torvalds Nov. 22, 2022, 5:33 p.m. UTC | #7
On Tue, Nov 22, 2022 at 4:25 AM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>
> I tracked the use of 'force' all the way back to the first git commit
> (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the
> reason is lost in the mists of time.

Well, not entirely.

For archaeology reasons, I went back to the old BK history, which
exists as a git conversion in

    https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/

and there you can actually find it.

Not with a lot of explanations, though - it's commit b7649ef789
("[PATCH] videobuf update"):

    This updates the video-buf.c module (helper module for video buffer
    management).  Some memory management fixes, also some adaptions to the
    final v4l2 api.

but it went from

         err = get_user_pages(current,current->mm,
-                            data, dma->nr_pages,
-                            rw == READ, 0, /* don't force */
+                            data & PAGE_MASK, dma->nr_pages,
+                            rw == READ, 1, /* force */
                             dma->pages, NULL);

in that commit.

So it goes back to October 2002.

> Looking at this old LWN article https://lwn.net/Articles/28548/ suggests
> that it might be related to calling get_user_pages for write buffers

The timing roughly matches.

> I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still
> allow drivers to read from the buffer?

The issue with some of the driver hacks has been that

 - they only want to read, and the buffer may be read-only

 - they then used FOLL_WRITE despite that, because they want to break
COW (due to the issue that David is now fixing with his series)

 - but that means that the VM layer says "nope, you can't write to
this read-only user mapping"

 - ... and then they use FOLL_FORCE to say "yes, I can".

IOW, the FOLL_FORCE may be entirely due to an (incorrect, but
historically needed) FOLL_WRITE.

             Linus

Patch

diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c
index 542dde9d2609..062e98148c53 100644
--- a/drivers/media/common/videobuf2/frame_vector.c
+++ b/drivers/media/common/videobuf2/frame_vector.c
@@ -50,7 +50,7 @@  int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
 	start = untagged_addr(start);
 
 	ret = pin_user_pages_fast(start, nr_frames,
-				  FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
+				  FOLL_WRITE | FOLL_LONGTERM,
 				  (struct page **)(vec->ptrs));
 	if (ret > 0) {
 		vec->got_ref = true;