mbox series

[v6,00/17] follow_pfn and other iomap races

Message ID 20201119144146.1045202-1-daniel.vetter@ffwll.ch
Headers show
Series follow_pfn and other iomap races | expand

Message

Daniel Vetter Nov. 19, 2020, 2:41 p.m. UTC
Hi all

Another update of my patch series to clamp down a bunch of races and gaps
around follow_pfn and other access to iomem mmaps. Previous version:

v1: https://lore.kernel.org/dri-devel/20201007164426.1812530-1-daniel.vetter@ffwll.ch/
v2: https://lore.kernel.org/dri-devel/20201009075934.3509076-1-daniel.vetter@ffwll.ch
v3: https://lore.kernel.org/dri-devel/20201021085655.1192025-1-daniel.vetter@ffwll.ch/
v4: https://lore.kernel.org/dri-devel/20201026105818.2585306-1-daniel.vetter@ffwll.ch/
v5: https://lore.kernel.org/dri-devel/20201030100815.2269-1-daniel.vetter@ffwll.ch/

And the discussion that sparked this journey:

https://lore.kernel.org/dri-devel/20201007164426.1812530-1-daniel.vetter@ffwll.ch/

Unfortunately took way longer to update than I hoped, I got sidetracked
with a few things.

Changes in v6:
- Tested v4l userptr as Tomasz suggested. No boom observed
- Added RFC for locking down follow_pfn, per discussion with Christoph and
  Jason.
- Explain why pup_fast is safe in relevant patches, there was a bit a
  confusion when discussing v5.
- Fix up the resource patch, with CONFIG_IO_STRICT_DEVMEM it crashed on
  boot due to an unintended change (reported by John)

Changes in v5:
- Tomasz found some issues in the media patches
- Polish suggested by Christoph for the unsafe_follow_pfn patch

Changes in v4:
- Drop the s390 patch, that was very stand-alone and now queued up to land
  through s390 trees.
- Comment polish per Dan's review.

Changes in v3:
- Bunch of polish all over, no functional changes aside from one barrier
  in the resource code, for consistency.
- A few more r-b tags.

Changes in v2:
- tons of small polish&fixes all over, thanks to all the reviewers who
  spotted issues
- I managed to test at least the generic_access_phys and pci mmap revoke
  stuff with a few gdb sessions using our i915 debug tools (hence now also
  the drm/i915 patch to properly request all the pci bar regions)
- reworked approach for the pci mmap revoke: Infrastructure moved into
  kernel/resource.c, address_space mapping is now set up at open time for
  everyone (which required some sysfs changes). Does indeed look a lot
  cleaner and a lot less invasive than I feared at first.

I feel like this is ready for some wider soaking. Since the remaining bits
are all kinda connnected probably simplest if it all goes through -mm.

Cheers, Daniel

Daniel Vetter (17):
  drm/exynos: Stop using frame_vector helpers
  drm/exynos: Use FOLL_LONGTERM for g2d cmdlists
  misc/habana: Stop using frame_vector helpers
  misc/habana: Use FOLL_LONGTERM for userptr
  mm/frame-vector: Use FOLL_LONGTERM
  media: videobuf2: Move frame_vector into media subsystem
  mm: Close race in generic_access_phys
  mm: Add unsafe_follow_pfn
  media/videbuf1|2: Mark follow_pfn usage as unsafe
  vfio/type1: Mark follow_pfn as unsafe
  PCI: Obey iomem restrictions for procfs mmap
  /dev/mem: Only set filp->f_mapping
  resource: Move devmem revoke code to resource framework
  sysfs: Support zapping of binary attr mmaps
  PCI: Revoke mappings like devmem
  RFC: kvm: pass kvm argument to follow_pfn callsites
  RFC: mm: add mmu_notifier argument to follow_pfn

 arch/powerpc/kvm/book3s_64_mmu_hv.c           |   2 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c        |   2 +-
 arch/powerpc/kvm/e500_mmu_host.c              |   2 +-
 arch/x86/kvm/mmu/mmu.c                        |   8 +-
 drivers/char/mem.c                            |  86 +------------
 drivers/gpu/drm/exynos/Kconfig                |   1 -
 drivers/gpu/drm/exynos/exynos_drm_g2d.c       |  48 ++++----
 drivers/media/common/videobuf2/Kconfig        |   1 -
 drivers/media/common/videobuf2/Makefile       |   1 +
 .../media/common/videobuf2}/frame_vector.c    |  57 +++------
 .../media/common/videobuf2/videobuf2-memops.c |   3 +-
 drivers/media/platform/omap/Kconfig           |   1 -
 drivers/media/v4l2-core/videobuf-dma-contig.c |   2 +-
 drivers/misc/habanalabs/Kconfig               |   1 -
 drivers/misc/habanalabs/common/habanalabs.h   |   6 +-
 drivers/misc/habanalabs/common/memory.c       |  50 +++-----
 drivers/pci/pci-sysfs.c                       |   4 +
 drivers/pci/proc.c                            |   6 +
 drivers/vfio/vfio_iommu_type1.c               |   4 +-
 fs/sysfs/file.c                               |  11 ++
 include/linux/ioport.h                        |   6 +-
 include/linux/kvm_host.h                      |   9 +-
 include/linux/mm.h                            |  50 +-------
 include/linux/sysfs.h                         |   2 +
 include/media/frame_vector.h                  |  47 +++++++
 include/media/videobuf2-core.h                |   1 +
 kernel/resource.c                             |  98 ++++++++++++++-
 mm/Kconfig                                    |   3 -
 mm/Makefile                                   |   1 -
 mm/memory.c                                   | 115 +++++++++++++++---
 mm/nommu.c                                    |  48 +++++++-
 security/Kconfig                              |  13 ++
 virt/kvm/kvm_main.c                           |  56 +++++----
 33 files changed, 447 insertions(+), 298 deletions(-)
 rename {mm => drivers/media/common/videobuf2}/frame_vector.c (84%)
 create mode 100644 include/media/frame_vector.h

Comments

Tomasz Figa Nov. 20, 2020, 8:32 a.m. UTC | #1
On Fri, Nov 20, 2020 at 5:28 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>
> On 20/11/2020 09:06, Hans Verkuil wrote:
> > On 19/11/2020 15:41, Daniel Vetter wrote:
> >> The media model assumes that buffers are all preallocated, so that
> >> when a media pipeline is running we never miss a deadline because the
> >> buffers aren't allocated or available.
> >>
> >> This means we cannot fix the v4l follow_pfn usage through
> >> mmu_notifier, without breaking how this all works. The only real fix
> >> is to deprecate userptr support for VM_IO | VM_PFNMAP mappings and
> >> tell everyone to cut over to dma-buf memory sharing for zerocopy.
> >>
> >> userptr for normal memory will keep working as-is, this only affects
> >> the zerocopy userptr usage enabled in 50ac952d2263 ("[media]
> >> videobuf2-dma-sg: Support io userptr operations on io memory").
> >>
> >> Acked-by: Tomasz Figa <tfiga@chromium.org>
> >
> > Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>
> Actually, cancel this Acked-by.
>
> So let me see if I understand this right: VM_IO | VM_PFNMAP mappings can
> move around. There is a mmu_notifier that can be used to be notified when
> that happens, but that can't be used with media buffers since those buffers
> must always be available and in the same place.
>
> So follow_pfn is replaced by unsafe_follow_pfn to signal that what is attempted
> is unsafe and unreliable.
>
> If CONFIG_STRICT_FOLLOW_PFN is set, then unsafe_follow_pfn will fail, if it
> is unset, then it writes a warning to the kernel log but just continues while
> still unsafe.
>
> I am very much inclined to just drop VM_IO | VM_PFNMAP support in the media
> subsystem. For vb2 there is a working alternative in the form of dmabuf, and
> frankly for vb1 I don't care. If someone really needs this for a vb1 driver,
> then they can do the work to convert that driver to vb2.
>
> I've added Mauro to the CC list and I'll ping a few more people to see what
> they think, but in my opinion support for USERPTR + VM_IO | VM_PFNMAP
> should just be killed off.
>
> If others would like to keep it, then frame_vector.c needs a comment before
> the 'while' explaining why the unsafe_follow_pfn is there and that using
> dmabuf is the proper alternative to use. That will make it easier for
> developers to figure out why they see a kernel warning and what to do to
> fix it, rather than having to dig through the git history for the reason.

I'm all for dropping that code.

Best regards,
Tomasz

>
> Regards,
>
>         Hans
>
> >
> > Thanks!
> >
> >       Hans
> >
> >> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> >> Cc: Kees Cook <keescook@chromium.org>
> >> Cc: Dan Williams <dan.j.williams@intel.com>
> >> Cc: Andrew Morton <akpm@linux-foundation.org>
> >> Cc: John Hubbard <jhubbard@nvidia.com>
> >> Cc: Jérôme Glisse <jglisse@redhat.com>
> >> Cc: Jan Kara <jack@suse.cz>
> >> Cc: Dan Williams <dan.j.williams@intel.com>
> >> Cc: linux-mm@kvack.org
> >> Cc: linux-arm-kernel@lists.infradead.org
> >> Cc: linux-samsung-soc@vger.kernel.org
> >> Cc: linux-media@vger.kernel.org
> >> Cc: Pawel Osciak <pawel@osciak.com>
> >> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> >> Cc: Kyungmin Park <kyungmin.park@samsung.com>
> >> Cc: Tomasz Figa <tfiga@chromium.org>
> >> Cc: Laurent Dufour <ldufour@linux.ibm.com>
> >> Cc: Vlastimil Babka <vbabka@suse.cz>
> >> Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
> >> Cc: Michel Lespinasse <walken@google.com>
> >> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >> --
> >> v3:
> >> - Reference the commit that enabled the zerocopy userptr use case to
> >>   make it abundandtly clear that this patch only affects that, and not
> >>   normal memory userptr. The old commit message already explained that
> >>   normal memory userptr is unaffected, but I guess that was not clear
> >>   enough.
> >> ---
> >>  drivers/media/common/videobuf2/frame_vector.c | 2 +-
> >>  drivers/media/v4l2-core/videobuf-dma-contig.c | 2 +-
> >>  2 files changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c
> >> index a0e65481a201..1a82ec13ea00 100644
> >> --- a/drivers/media/common/videobuf2/frame_vector.c
> >> +++ b/drivers/media/common/videobuf2/frame_vector.c
> >> @@ -70,7 +70,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
> >>                      break;
> >>
> >>              while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) {
> >> -                    err = follow_pfn(vma, start, &nums[ret]);
> >> +                    err = unsafe_follow_pfn(vma, start, &nums[ret]);
> >>                      if (err) {
> >>                              if (ret == 0)
> >>                                      ret = err;
> >> diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c b/drivers/media/v4l2-core/videobuf-dma-contig.c
> >> index 52312ce2ba05..821c4a76ab96 100644
> >> --- a/drivers/media/v4l2-core/videobuf-dma-contig.c
> >> +++ b/drivers/media/v4l2-core/videobuf-dma-contig.c
> >> @@ -183,7 +183,7 @@ static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem,
> >>      user_address = untagged_baddr;
> >>
> >>      while (pages_done < (mem->size >> PAGE_SHIFT)) {
> >> -            ret = follow_pfn(vma, user_address, &this_pfn);
> >> +            ret = unsafe_follow_pfn(vma, user_address, &this_pfn);
> >>              if (ret)
> >>                      break;
> >>
> >>
> >
>
Daniel Vetter Nov. 20, 2020, 9:18 a.m. UTC | #2
On Fri, Nov 20, 2020 at 9:28 AM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>

> On 20/11/2020 09:06, Hans Verkuil wrote:

> > On 19/11/2020 15:41, Daniel Vetter wrote:

> >> The media model assumes that buffers are all preallocated, so that

> >> when a media pipeline is running we never miss a deadline because the

> >> buffers aren't allocated or available.

> >>

> >> This means we cannot fix the v4l follow_pfn usage through

> >> mmu_notifier, without breaking how this all works. The only real fix

> >> is to deprecate userptr support for VM_IO | VM_PFNMAP mappings and

> >> tell everyone to cut over to dma-buf memory sharing for zerocopy.

> >>

> >> userptr for normal memory will keep working as-is, this only affects

> >> the zerocopy userptr usage enabled in 50ac952d2263 ("[media]

> >> videobuf2-dma-sg: Support io userptr operations on io memory").

> >>

> >> Acked-by: Tomasz Figa <tfiga@chromium.org>

> >

> > Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>

>

> Actually, cancel this Acked-by.

>

> So let me see if I understand this right: VM_IO | VM_PFNMAP mappings can

> move around. There is a mmu_notifier that can be used to be notified when

> that happens, but that can't be used with media buffers since those buffers

> must always be available and in the same place.

>

> So follow_pfn is replaced by unsafe_follow_pfn to signal that what is attempted

> is unsafe and unreliable.

>

> If CONFIG_STRICT_FOLLOW_PFN is set, then unsafe_follow_pfn will fail, if it

> is unset, then it writes a warning to the kernel log but just continues while

> still unsafe.

>

> I am very much inclined to just drop VM_IO | VM_PFNMAP support in the media

> subsystem. For vb2 there is a working alternative in the form of dmabuf, and

> frankly for vb1 I don't care. If someone really needs this for a vb1 driver,

> then they can do the work to convert that driver to vb2.

>

> I've added Mauro to the CC list and I'll ping a few more people to see what

> they think, but in my opinion support for USERPTR + VM_IO | VM_PFNMAP

> should just be killed off.

>

> If others would like to keep it, then frame_vector.c needs a comment before

> the 'while' explaining why the unsafe_follow_pfn is there and that using

> dmabuf is the proper alternative to use. That will make it easier for

> developers to figure out why they see a kernel warning and what to do to

> fix it, rather than having to dig through the git history for the reason.


I'm happy to add a comment, but otherwise if you all want to ditch
this, can we do this as a follow up on top? There's quite a bit of
code that can be deleted and I'd like to not hold up this patch set
here on that - it's already a fairly sprawling pain touching about 7
different subsystems (ok only 6-ish now since the s390 patch landed).
For the comment, is the explanation next to unsafe_follow_pfn not good
enough?

So ... can I get you to un-cancel your ack?

Thanks, Daniel

>

> Regards,

>

>         Hans

>

> >

> > Thanks!

> >

> >       Hans

> >

> >> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

> >> Cc: Jason Gunthorpe <jgg@ziepe.ca>

> >> Cc: Kees Cook <keescook@chromium.org>

> >> Cc: Dan Williams <dan.j.williams@intel.com>

> >> Cc: Andrew Morton <akpm@linux-foundation.org>

> >> Cc: John Hubbard <jhubbard@nvidia.com>

> >> Cc: Jérôme Glisse <jglisse@redhat.com>

> >> Cc: Jan Kara <jack@suse.cz>

> >> Cc: Dan Williams <dan.j.williams@intel.com>

> >> Cc: linux-mm@kvack.org

> >> Cc: linux-arm-kernel@lists.infradead.org

> >> Cc: linux-samsung-soc@vger.kernel.org

> >> Cc: linux-media@vger.kernel.org

> >> Cc: Pawel Osciak <pawel@osciak.com>

> >> Cc: Marek Szyprowski <m.szyprowski@samsung.com>

> >> Cc: Kyungmin Park <kyungmin.park@samsung.com>

> >> Cc: Tomasz Figa <tfiga@chromium.org>

> >> Cc: Laurent Dufour <ldufour@linux.ibm.com>

> >> Cc: Vlastimil Babka <vbabka@suse.cz>

> >> Cc: Daniel Jordan <daniel.m.jordan@oracle.com>

> >> Cc: Michel Lespinasse <walken@google.com>

> >> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> >> --

> >> v3:

> >> - Reference the commit that enabled the zerocopy userptr use case to

> >>   make it abundandtly clear that this patch only affects that, and not

> >>   normal memory userptr. The old commit message already explained that

> >>   normal memory userptr is unaffected, but I guess that was not clear

> >>   enough.

> >> ---

> >>  drivers/media/common/videobuf2/frame_vector.c | 2 +-

> >>  drivers/media/v4l2-core/videobuf-dma-contig.c | 2 +-

> >>  2 files changed, 2 insertions(+), 2 deletions(-)

> >>

> >> diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c

> >> index a0e65481a201..1a82ec13ea00 100644

> >> --- a/drivers/media/common/videobuf2/frame_vector.c

> >> +++ b/drivers/media/common/videobuf2/frame_vector.c

> >> @@ -70,7 +70,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,

> >>                      break;

> >>

> >>              while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) {

> >> -                    err = follow_pfn(vma, start, &nums[ret]);

> >> +                    err = unsafe_follow_pfn(vma, start, &nums[ret]);

> >>                      if (err) {

> >>                              if (ret == 0)

> >>                                      ret = err;

> >> diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c b/drivers/media/v4l2-core/videobuf-dma-contig.c

> >> index 52312ce2ba05..821c4a76ab96 100644

> >> --- a/drivers/media/v4l2-core/videobuf-dma-contig.c

> >> +++ b/drivers/media/v4l2-core/videobuf-dma-contig.c

> >> @@ -183,7 +183,7 @@ static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem,

> >>      user_address = untagged_baddr;

> >>

> >>      while (pages_done < (mem->size >> PAGE_SHIFT)) {

> >> -            ret = follow_pfn(vma, user_address, &this_pfn);

> >> +            ret = unsafe_follow_pfn(vma, user_address, &this_pfn);

> >>              if (ret)

> >>                      break;

> >>

> >>

> >

>



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Daniel Vetter Nov. 20, 2020, 10:51 a.m. UTC | #3
On Fri, Nov 20, 2020 at 11:39 AM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>
> On 20/11/2020 10:18, Daniel Vetter wrote:
> > On Fri, Nov 20, 2020 at 9:28 AM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> >>
> >> On 20/11/2020 09:06, Hans Verkuil wrote:
> >>> On 19/11/2020 15:41, Daniel Vetter wrote:
> >>>> The media model assumes that buffers are all preallocated, so that
> >>>> when a media pipeline is running we never miss a deadline because the
> >>>> buffers aren't allocated or available.
> >>>>
> >>>> This means we cannot fix the v4l follow_pfn usage through
> >>>> mmu_notifier, without breaking how this all works. The only real fix
> >>>> is to deprecate userptr support for VM_IO | VM_PFNMAP mappings and
> >>>> tell everyone to cut over to dma-buf memory sharing for zerocopy.
> >>>>
> >>>> userptr for normal memory will keep working as-is, this only affects
> >>>> the zerocopy userptr usage enabled in 50ac952d2263 ("[media]
> >>>> videobuf2-dma-sg: Support io userptr operations on io memory").
> >>>>
> >>>> Acked-by: Tomasz Figa <tfiga@chromium.org>
> >>>
> >>> Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> >>
> >> Actually, cancel this Acked-by.
> >>
> >> So let me see if I understand this right: VM_IO | VM_PFNMAP mappings can
> >> move around. There is a mmu_notifier that can be used to be notified when
> >> that happens, but that can't be used with media buffers since those buffers
> >> must always be available and in the same place.
> >>
> >> So follow_pfn is replaced by unsafe_follow_pfn to signal that what is attempted
> >> is unsafe and unreliable.
> >>
> >> If CONFIG_STRICT_FOLLOW_PFN is set, then unsafe_follow_pfn will fail, if it
> >> is unset, then it writes a warning to the kernel log but just continues while
> >> still unsafe.
> >>
> >> I am very much inclined to just drop VM_IO | VM_PFNMAP support in the media
> >> subsystem. For vb2 there is a working alternative in the form of dmabuf, and
> >> frankly for vb1 I don't care. If someone really needs this for a vb1 driver,
> >> then they can do the work to convert that driver to vb2.
> >>
> >> I've added Mauro to the CC list and I'll ping a few more people to see what
> >> they think, but in my opinion support for USERPTR + VM_IO | VM_PFNMAP
> >> should just be killed off.
> >>
> >> If others would like to keep it, then frame_vector.c needs a comment before
> >> the 'while' explaining why the unsafe_follow_pfn is there and that using
> >> dmabuf is the proper alternative to use. That will make it easier for
> >> developers to figure out why they see a kernel warning and what to do to
> >> fix it, rather than having to dig through the git history for the reason.
> >
> > I'm happy to add a comment, but otherwise if you all want to ditch
> > this, can we do this as a follow up on top? There's quite a bit of
> > code that can be deleted and I'd like to not hold up this patch set
> > here on that - it's already a fairly sprawling pain touching about 7
> > different subsystems (ok only 6-ish now since the s390 patch landed).
> > For the comment, is the explanation next to unsafe_follow_pfn not good
> > enough?
>
> No, because that doesn't mention that you should use dma-buf as a replacement.
> That's really the critical piece of information I'd like to see. That doesn't
> belong in unsafe_follow_pfn, it needs to be in frame_vector.c since it's
> vb2 specific.

Ah makes sense, I'll add that.

> >
> > So ... can I get you to un-cancel your ack?
>
> Hmm, I really would like to see support for this to be dropped completely.
>
> How about this: just replace follow_pfn() by -EINVAL instead of unsafe_follow_pfn().
>
> Add a TODO comment that this code now can be cleaned up a lot. Such a clean up patch
> can be added on top later, and actually that is something that I can do once this
> series has landed.
>
> Regardless, frame_vector.c should mention dma-buf as a replacement in a comment
> since I don't want users who hit this issue to have to dig through git logs
> to find that dma-buf is the right approach.
>
> BTW, nitpick: the subject line of this patch says 'videbuf' instead of 'videobuf'.

Will fix to, and next round will have the additional -EINVAL on top.
Iirc Mauro was pretty clear that we can't just delete this, so I kinda
don't want to get stuck in this discussion with my patches :-)
-Daniel

>
> Regards,
>
>         Hans
>
> >
> > Thanks, Daniel
> >
> >>
> >> Regards,
> >>
> >>         Hans
> >>
> >>>
> >>> Thanks!
> >>>
> >>>       Hans
> >>>
> >>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>>> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> >>>> Cc: Kees Cook <keescook@chromium.org>
> >>>> Cc: Dan Williams <dan.j.williams@intel.com>
> >>>> Cc: Andrew Morton <akpm@linux-foundation.org>
> >>>> Cc: John Hubbard <jhubbard@nvidia.com>
> >>>> Cc: Jérôme Glisse <jglisse@redhat.com>
> >>>> Cc: Jan Kara <jack@suse.cz>
> >>>> Cc: Dan Williams <dan.j.williams@intel.com>
> >>>> Cc: linux-mm@kvack.org
> >>>> Cc: linux-arm-kernel@lists.infradead.org
> >>>> Cc: linux-samsung-soc@vger.kernel.org
> >>>> Cc: linux-media@vger.kernel.org
> >>>> Cc: Pawel Osciak <pawel@osciak.com>
> >>>> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> >>>> Cc: Kyungmin Park <kyungmin.park@samsung.com>
> >>>> Cc: Tomasz Figa <tfiga@chromium.org>
> >>>> Cc: Laurent Dufour <ldufour@linux.ibm.com>
> >>>> Cc: Vlastimil Babka <vbabka@suse.cz>
> >>>> Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
> >>>> Cc: Michel Lespinasse <walken@google.com>
> >>>> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>>> --
> >>>> v3:
> >>>> - Reference the commit that enabled the zerocopy userptr use case to
> >>>>   make it abundandtly clear that this patch only affects that, and not
> >>>>   normal memory userptr. The old commit message already explained that
> >>>>   normal memory userptr is unaffected, but I guess that was not clear
> >>>>   enough.
> >>>> ---
> >>>>  drivers/media/common/videobuf2/frame_vector.c | 2 +-
> >>>>  drivers/media/v4l2-core/videobuf-dma-contig.c | 2 +-
> >>>>  2 files changed, 2 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c
> >>>> index a0e65481a201..1a82ec13ea00 100644
> >>>> --- a/drivers/media/common/videobuf2/frame_vector.c
> >>>> +++ b/drivers/media/common/videobuf2/frame_vector.c
> >>>> @@ -70,7 +70,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
> >>>>                      break;
> >>>>
> >>>>              while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) {
> >>>> -                    err = follow_pfn(vma, start, &nums[ret]);
> >>>> +                    err = unsafe_follow_pfn(vma, start, &nums[ret]);
> >>>>                      if (err) {
> >>>>                              if (ret == 0)
> >>>>                                      ret = err;
> >>>> diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c b/drivers/media/v4l2-core/videobuf-dma-contig.c
> >>>> index 52312ce2ba05..821c4a76ab96 100644
> >>>> --- a/drivers/media/v4l2-core/videobuf-dma-contig.c
> >>>> +++ b/drivers/media/v4l2-core/videobuf-dma-contig.c
> >>>> @@ -183,7 +183,7 @@ static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem,
> >>>>      user_address = untagged_baddr;
> >>>>
> >>>>      while (pages_done < (mem->size >> PAGE_SHIFT)) {
> >>>> -            ret = follow_pfn(vma, user_address, &this_pfn);
> >>>> +            ret = unsafe_follow_pfn(vma, user_address, &this_pfn);
> >>>>              if (ret)
> >>>>                      break;
> >>>>
> >>>>
> >>>
> >>
> >
> >
>
Hans Verkuil Nov. 20, 2020, 12:08 p.m. UTC | #4
On 20/11/2020 11:51, Daniel Vetter wrote:
> On Fri, Nov 20, 2020 at 11:39 AM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>>
>> On 20/11/2020 10:18, Daniel Vetter wrote:
>>> On Fri, Nov 20, 2020 at 9:28 AM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>>>>
>>>> On 20/11/2020 09:06, Hans Verkuil wrote:
>>>>> On 19/11/2020 15:41, Daniel Vetter wrote:
>>>>>> The media model assumes that buffers are all preallocated, so that
>>>>>> when a media pipeline is running we never miss a deadline because the
>>>>>> buffers aren't allocated or available.
>>>>>>
>>>>>> This means we cannot fix the v4l follow_pfn usage through
>>>>>> mmu_notifier, without breaking how this all works. The only real fix
>>>>>> is to deprecate userptr support for VM_IO | VM_PFNMAP mappings and
>>>>>> tell everyone to cut over to dma-buf memory sharing for zerocopy.
>>>>>>
>>>>>> userptr for normal memory will keep working as-is, this only affects
>>>>>> the zerocopy userptr usage enabled in 50ac952d2263 ("[media]
>>>>>> videobuf2-dma-sg: Support io userptr operations on io memory").
>>>>>>
>>>>>> Acked-by: Tomasz Figa <tfiga@chromium.org>
>>>>>
>>>>> Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
>>>>
>>>> Actually, cancel this Acked-by.
>>>>
>>>> So let me see if I understand this right: VM_IO | VM_PFNMAP mappings can
>>>> move around. There is a mmu_notifier that can be used to be notified when
>>>> that happens, but that can't be used with media buffers since those buffers
>>>> must always be available and in the same place.
>>>>
>>>> So follow_pfn is replaced by unsafe_follow_pfn to signal that what is attempted
>>>> is unsafe and unreliable.
>>>>
>>>> If CONFIG_STRICT_FOLLOW_PFN is set, then unsafe_follow_pfn will fail, if it
>>>> is unset, then it writes a warning to the kernel log but just continues while
>>>> still unsafe.
>>>>
>>>> I am very much inclined to just drop VM_IO | VM_PFNMAP support in the media
>>>> subsystem. For vb2 there is a working alternative in the form of dmabuf, and
>>>> frankly for vb1 I don't care. If someone really needs this for a vb1 driver,
>>>> then they can do the work to convert that driver to vb2.
>>>>
>>>> I've added Mauro to the CC list and I'll ping a few more people to see what
>>>> they think, but in my opinion support for USERPTR + VM_IO | VM_PFNMAP
>>>> should just be killed off.
>>>>
>>>> If others would like to keep it, then frame_vector.c needs a comment before
>>>> the 'while' explaining why the unsafe_follow_pfn is there and that using
>>>> dmabuf is the proper alternative to use. That will make it easier for
>>>> developers to figure out why they see a kernel warning and what to do to
>>>> fix it, rather than having to dig through the git history for the reason.
>>>
>>> I'm happy to add a comment, but otherwise if you all want to ditch
>>> this, can we do this as a follow up on top? There's quite a bit of
>>> code that can be deleted and I'd like to not hold up this patch set
>>> here on that - it's already a fairly sprawling pain touching about 7
>>> different subsystems (ok only 6-ish now since the s390 patch landed).
>>> For the comment, is the explanation next to unsafe_follow_pfn not good
>>> enough?
>>
>> No, because that doesn't mention that you should use dma-buf as a replacement.
>> That's really the critical piece of information I'd like to see. That doesn't
>> belong in unsafe_follow_pfn, it needs to be in frame_vector.c since it's
>> vb2 specific.
> 
> Ah makes sense, I'll add that.
> 
>>>
>>> So ... can I get you to un-cancel your ack?
>>
>> Hmm, I really would like to see support for this to be dropped completely.
>>
>> How about this: just replace follow_pfn() by -EINVAL instead of unsafe_follow_pfn().
>>
>> Add a TODO comment that this code now can be cleaned up a lot. Such a clean up patch
>> can be added on top later, and actually that is something that I can do once this
>> series has landed.
>>
>> Regardless, frame_vector.c should mention dma-buf as a replacement in a comment
>> since I don't want users who hit this issue to have to dig through git logs
>> to find that dma-buf is the right approach.
>>
>> BTW, nitpick: the subject line of this patch says 'videbuf' instead of 'videobuf'.
> 
> Will fix to, and next round will have the additional -EINVAL on top.
> Iirc Mauro was pretty clear that we can't just delete this, so I kinda
> don't want to get stuck in this discussion with my patches :-)

Ah, I found that discussion for the v2 of this series.

Yes, add that on top and we can discuss whether to Ack that -EINVAL patch or
not.

I don't see why we would want to continue supporting a broken model that is
also a security risk, as I understand it.

Tomasz, can you look at the discussion for this old RFC patch of mine:

https://patchwork.linuxtv.org/project/linux-media/patch/20200221084531.576156-9-hverkuil-cisco@xs4all.nl/

Specifically, if we just drop support for follow_pfn(), would that cause
problems for Chromium since that is apparently still using USERPTR for encoders?

Regards,

	Hans

> -Daniel
> 
>>
>> Regards,
>>
>>         Hans
>>
>>>
>>> Thanks, Daniel
>>>
>>>>
>>>> Regards,
>>>>
>>>>         Hans
>>>>
>>>>>
>>>>> Thanks!
>>>>>
>>>>>       Hans
>>>>>
>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>>> Cc: Jason Gunthorpe <jgg@ziepe.ca>
>>>>>> Cc: Kees Cook <keescook@chromium.org>
>>>>>> Cc: Dan Williams <dan.j.williams@intel.com>
>>>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>>>>> Cc: John Hubbard <jhubbard@nvidia.com>
>>>>>> Cc: Jérôme Glisse <jglisse@redhat.com>
>>>>>> Cc: Jan Kara <jack@suse.cz>
>>>>>> Cc: Dan Williams <dan.j.williams@intel.com>
>>>>>> Cc: linux-mm@kvack.org
>>>>>> Cc: linux-arm-kernel@lists.infradead.org
>>>>>> Cc: linux-samsung-soc@vger.kernel.org
>>>>>> Cc: linux-media@vger.kernel.org
>>>>>> Cc: Pawel Osciak <pawel@osciak.com>
>>>>>> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
>>>>>> Cc: Kyungmin Park <kyungmin.park@samsung.com>
>>>>>> Cc: Tomasz Figa <tfiga@chromium.org>
>>>>>> Cc: Laurent Dufour <ldufour@linux.ibm.com>
>>>>>> Cc: Vlastimil Babka <vbabka@suse.cz>
>>>>>> Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
>>>>>> Cc: Michel Lespinasse <walken@google.com>
>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>>> --
>>>>>> v3:
>>>>>> - Reference the commit that enabled the zerocopy userptr use case to
>>>>>>   make it abundandtly clear that this patch only affects that, and not
>>>>>>   normal memory userptr. The old commit message already explained that
>>>>>>   normal memory userptr is unaffected, but I guess that was not clear
>>>>>>   enough.
>>>>>> ---
>>>>>>  drivers/media/common/videobuf2/frame_vector.c | 2 +-
>>>>>>  drivers/media/v4l2-core/videobuf-dma-contig.c | 2 +-
>>>>>>  2 files changed, 2 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c
>>>>>> index a0e65481a201..1a82ec13ea00 100644
>>>>>> --- a/drivers/media/common/videobuf2/frame_vector.c
>>>>>> +++ b/drivers/media/common/videobuf2/frame_vector.c
>>>>>> @@ -70,7 +70,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
>>>>>>                      break;
>>>>>>
>>>>>>              while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) {
>>>>>> -                    err = follow_pfn(vma, start, &nums[ret]);
>>>>>> +                    err = unsafe_follow_pfn(vma, start, &nums[ret]);
>>>>>>                      if (err) {
>>>>>>                              if (ret == 0)
>>>>>>                                      ret = err;
>>>>>> diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c b/drivers/media/v4l2-core/videobuf-dma-contig.c
>>>>>> index 52312ce2ba05..821c4a76ab96 100644
>>>>>> --- a/drivers/media/v4l2-core/videobuf-dma-contig.c
>>>>>> +++ b/drivers/media/v4l2-core/videobuf-dma-contig.c
>>>>>> @@ -183,7 +183,7 @@ static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem,
>>>>>>      user_address = untagged_baddr;
>>>>>>
>>>>>>      while (pages_done < (mem->size >> PAGE_SHIFT)) {
>>>>>> -            ret = follow_pfn(vma, user_address, &this_pfn);
>>>>>> +            ret = unsafe_follow_pfn(vma, user_address, &this_pfn);
>>>>>>              if (ret)
>>>>>>                      break;
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
> 
>
Jason Gunthorpe Nov. 20, 2020, 6:30 p.m. UTC | #5
On Thu, Nov 19, 2020 at 03:41:46PM +0100, Daniel Vetter wrote:
> @@ -4805,21 +4824,15 @@ EXPORT_SYMBOL(follow_pte_pmd);
>   * Return: zero and the pfn at @pfn on success, -ve otherwise.
>   */
>  int follow_pfn(struct vm_area_struct *vma, unsigned long address,
> -	unsigned long *pfn)
> +	unsigned long *pfn, struct mmu_notifier *subscription)
>  {
> -	int ret = -EINVAL;
> -	spinlock_t *ptl;
> -	pte_t *ptep;
> +	if (WARN_ON(!subscription->mm))
> +		return -EINVAL;
>  
> +	if (WARN_ON(subscription->mm != vma->vm_mm))
> +		return -EINVAL;

These two things are redundant right? vma->vm_mm != NULL?

BTW, why do we even have this for nommu? If the only caller is kvm,
can you even compile kvm on nommu??

Jason
Oded Gabbay Nov. 21, 2020, 10:15 a.m. UTC | #6
On Thu, Nov 19, 2020 at 4:41 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> These are persistent, not just for the duration of a dma operation.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: John Hubbard <jhubbard@nvidia.com>
> Cc: Jérôme Glisse <jglisse@redhat.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: linux-mm@kvack.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-samsung-soc@vger.kernel.org
> Cc: linux-media@vger.kernel.org
> Cc: Oded Gabbay <oded.gabbay@gmail.com>
> Cc: Omer Shpigelman <oshpigelman@habana.ai>
> Cc: Ofir Bitton <obitton@habana.ai>
> Cc: Tomer Tayar <ttayar@habana.ai>
> Cc: Moti Haimovski <mhaimovski@habana.ai>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Pawel Piskorski <ppiskorski@habana.ai>
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/misc/habanalabs/common/memory.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/misc/habanalabs/common/memory.c b/drivers/misc/habanalabs/common/memory.c
> index 0b220221873d..d08c41b90fec 100644
> --- a/drivers/misc/habanalabs/common/memory.c
> +++ b/drivers/misc/habanalabs/common/memory.c
> @@ -1298,7 +1298,8 @@ static int get_user_memory(struct hl_device *hdev, u64 addr, u64 size,
>                 return -ENOMEM;
>         }
>
> -       rc = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE,
> +       rc = pin_user_pages_fast(start, npages,
> +                                FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
>                                  userptr->pages);
>
>         if (rc != npages) {
> --
> 2.29.2
>

This patch and the previous one (03/17 misc/habana: Stop using
frame_vector helpers) are both:
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>

Thanks,
Oded
Daniel Vetter Nov. 24, 2020, 2:16 p.m. UTC | #7
On Fri, Nov 20, 2020 at 09:23:12PM +0900, Tomasz Figa wrote:
> On Fri, Nov 20, 2020 at 9:08 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> >
> > On 20/11/2020 11:51, Daniel Vetter wrote:
> > > On Fri, Nov 20, 2020 at 11:39 AM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> > >>
> > >> On 20/11/2020 10:18, Daniel Vetter wrote:
> > >>> On Fri, Nov 20, 2020 at 9:28 AM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> > >>>>
> > >>>> On 20/11/2020 09:06, Hans Verkuil wrote:
> > >>>>> On 19/11/2020 15:41, Daniel Vetter wrote:
> > >>>>>> The media model assumes that buffers are all preallocated, so that
> > >>>>>> when a media pipeline is running we never miss a deadline because the
> > >>>>>> buffers aren't allocated or available.
> > >>>>>>
> > >>>>>> This means we cannot fix the v4l follow_pfn usage through
> > >>>>>> mmu_notifier, without breaking how this all works. The only real fix
> > >>>>>> is to deprecate userptr support for VM_IO | VM_PFNMAP mappings and
> > >>>>>> tell everyone to cut over to dma-buf memory sharing for zerocopy.
> > >>>>>>
> > >>>>>> userptr for normal memory will keep working as-is, this only affects
> > >>>>>> the zerocopy userptr usage enabled in 50ac952d2263 ("[media]
> > >>>>>> videobuf2-dma-sg: Support io userptr operations on io memory").
> > >>>>>>
> > >>>>>> Acked-by: Tomasz Figa <tfiga@chromium.org>
> > >>>>>
> > >>>>> Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> > >>>>
> > >>>> Actually, cancel this Acked-by.
> > >>>>
> > >>>> So let me see if I understand this right: VM_IO | VM_PFNMAP mappings can
> > >>>> move around. There is a mmu_notifier that can be used to be notified when
> > >>>> that happens, but that can't be used with media buffers since those buffers
> > >>>> must always be available and in the same place.
> > >>>>
> > >>>> So follow_pfn is replaced by unsafe_follow_pfn to signal that what is attempted
> > >>>> is unsafe and unreliable.
> > >>>>
> > >>>> If CONFIG_STRICT_FOLLOW_PFN is set, then unsafe_follow_pfn will fail, if it
> > >>>> is unset, then it writes a warning to the kernel log but just continues while
> > >>>> still unsafe.
> > >>>>
> > >>>> I am very much inclined to just drop VM_IO | VM_PFNMAP support in the media
> > >>>> subsystem. For vb2 there is a working alternative in the form of dmabuf, and
> > >>>> frankly for vb1 I don't care. If someone really needs this for a vb1 driver,
> > >>>> then they can do the work to convert that driver to vb2.
> > >>>>
> > >>>> I've added Mauro to the CC list and I'll ping a few more people to see what
> > >>>> they think, but in my opinion support for USERPTR + VM_IO | VM_PFNMAP
> > >>>> should just be killed off.
> > >>>>
> > >>>> If others would like to keep it, then frame_vector.c needs a comment before
> > >>>> the 'while' explaining why the unsafe_follow_pfn is there and that using
> > >>>> dmabuf is the proper alternative to use. That will make it easier for
> > >>>> developers to figure out why they see a kernel warning and what to do to
> > >>>> fix it, rather than having to dig through the git history for the reason.
> > >>>
> > >>> I'm happy to add a comment, but otherwise if you all want to ditch
> > >>> this, can we do this as a follow up on top? There's quite a bit of
> > >>> code that can be deleted and I'd like to not hold up this patch set
> > >>> here on that - it's already a fairly sprawling pain touching about 7
> > >>> different subsystems (ok only 6-ish now since the s390 patch landed).
> > >>> For the comment, is the explanation next to unsafe_follow_pfn not good
> > >>> enough?
> > >>
> > >> No, because that doesn't mention that you should use dma-buf as a replacement.
> > >> That's really the critical piece of information I'd like to see. That doesn't
> > >> belong in unsafe_follow_pfn, it needs to be in frame_vector.c since it's
> > >> vb2 specific.
> > >
> > > Ah makes sense, I'll add that.
> > >
> > >>>
> > >>> So ... can I get you to un-cancel your ack?
> > >>
> > >> Hmm, I really would like to see support for this to be dropped completely.
> > >>
> > >> How about this: just replace follow_pfn() by -EINVAL instead of unsafe_follow_pfn().
> > >>
> > >> Add a TODO comment that this code now can be cleaned up a lot. Such a clean up patch
> > >> can be added on top later, and actually that is something that I can do once this
> > >> series has landed.
> > >>
> > >> Regardless, frame_vector.c should mention dma-buf as a replacement in a comment
> > >> since I don't want users who hit this issue to have to dig through git logs
> > >> to find that dma-buf is the right approach.
> > >>
> > >> BTW, nitpick: the subject line of this patch says 'videbuf' instead of 'videobuf'.
> > >
> > > Will fix to, and next round will have the additional -EINVAL on top.
> > > Iirc Mauro was pretty clear that we can't just delete this, so I kinda
> > > don't want to get stuck in this discussion with my patches :-)
> >
> > Ah, I found that discussion for the v2 of this series.
> >
> > Yes, add that on top and we can discuss whether to Ack that -EINVAL patch or
> > not.
> >
> > I don't see why we would want to continue supporting a broken model that is
> > also a security risk, as I understand it.
> >
> > Tomasz, can you look at the discussion for this old RFC patch of mine:
> >
> > https://patchwork.linuxtv.org/project/linux-media/patch/20200221084531.576156-9-hverkuil-cisco@xs4all.nl/
> >
> > Specifically, if we just drop support for follow_pfn(), would that cause
> > problems for Chromium since that is apparently still using USERPTR for encoders?
> >
> 
> Nope, we use regular page-backed user pointers and not IO/PFNMAP ones.
> 
> By the way, for any inter-device sharing we're using DMABUF. USERPTR
> is left only in case of the data coming from the CPU, e.g. network.

Yeah Mauro wasn't too enthusiastic even about this patch here, so I think
I'll just leave it as-is. I fixed the typo in the commit message subject.
-Daniel

> 
> > Regards,
> >
> >         Hans
> >
> > > -Daniel
> > >
> > >>
> > >> Regards,
> > >>
> > >>         Hans
> > >>
> > >>>
> > >>> Thanks, Daniel
> > >>>
> > >>>>
> > >>>> Regards,
> > >>>>
> > >>>>         Hans
> > >>>>
> > >>>>>
> > >>>>> Thanks!
> > >>>>>
> > >>>>>       Hans
> > >>>>>
> > >>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > >>>>>> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> > >>>>>> Cc: Kees Cook <keescook@chromium.org>
> > >>>>>> Cc: Dan Williams <dan.j.williams@intel.com>
> > >>>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
> > >>>>>> Cc: John Hubbard <jhubbard@nvidia.com>
> > >>>>>> Cc: Jérôme Glisse <jglisse@redhat.com>
> > >>>>>> Cc: Jan Kara <jack@suse.cz>
> > >>>>>> Cc: Dan Williams <dan.j.williams@intel.com>
> > >>>>>> Cc: linux-mm@kvack.org
> > >>>>>> Cc: linux-arm-kernel@lists.infradead.org
> > >>>>>> Cc: linux-samsung-soc@vger.kernel.org
> > >>>>>> Cc: linux-media@vger.kernel.org
> > >>>>>> Cc: Pawel Osciak <pawel@osciak.com>
> > >>>>>> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> > >>>>>> Cc: Kyungmin Park <kyungmin.park@samsung.com>
> > >>>>>> Cc: Tomasz Figa <tfiga@chromium.org>
> > >>>>>> Cc: Laurent Dufour <ldufour@linux.ibm.com>
> > >>>>>> Cc: Vlastimil Babka <vbabka@suse.cz>
> > >>>>>> Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
> > >>>>>> Cc: Michel Lespinasse <walken@google.com>
> > >>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > >>>>>> --
> > >>>>>> v3:
> > >>>>>> - Reference the commit that enabled the zerocopy userptr use case to
> > >>>>>>   make it abundandtly clear that this patch only affects that, and not
> > >>>>>>   normal memory userptr. The old commit message already explained that
> > >>>>>>   normal memory userptr is unaffected, but I guess that was not clear
> > >>>>>>   enough.
> > >>>>>> ---
> > >>>>>>  drivers/media/common/videobuf2/frame_vector.c | 2 +-
> > >>>>>>  drivers/media/v4l2-core/videobuf-dma-contig.c | 2 +-
> > >>>>>>  2 files changed, 2 insertions(+), 2 deletions(-)
> > >>>>>>
> > >>>>>> diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c
> > >>>>>> index a0e65481a201..1a82ec13ea00 100644
> > >>>>>> --- a/drivers/media/common/videobuf2/frame_vector.c
> > >>>>>> +++ b/drivers/media/common/videobuf2/frame_vector.c
> > >>>>>> @@ -70,7 +70,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
> > >>>>>>                      break;
> > >>>>>>
> > >>>>>>              while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) {
> > >>>>>> -                    err = follow_pfn(vma, start, &nums[ret]);
> > >>>>>> +                    err = unsafe_follow_pfn(vma, start, &nums[ret]);
> > >>>>>>                      if (err) {
> > >>>>>>                              if (ret == 0)
> > >>>>>>                                      ret = err;
> > >>>>>> diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c b/drivers/media/v4l2-core/videobuf-dma-contig.c
> > >>>>>> index 52312ce2ba05..821c4a76ab96 100644
> > >>>>>> --- a/drivers/media/v4l2-core/videobuf-dma-contig.c
> > >>>>>> +++ b/drivers/media/v4l2-core/videobuf-dma-contig.c
> > >>>>>> @@ -183,7 +183,7 @@ static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem,
> > >>>>>>      user_address = untagged_baddr;
> > >>>>>>
> > >>>>>>      while (pages_done < (mem->size >> PAGE_SHIFT)) {
> > >>>>>> -            ret = follow_pfn(vma, user_address, &this_pfn);
> > >>>>>> +            ret = unsafe_follow_pfn(vma, user_address, &this_pfn);
> > >>>>>>              if (ret)
> > >>>>>>                      break;
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>
> > >
> > >
> >
Daniel Vetter Nov. 24, 2020, 2:28 p.m. UTC | #8
On Fri, Nov 20, 2020 at 02:30:29PM -0400, Jason Gunthorpe wrote:
> On Thu, Nov 19, 2020 at 03:41:46PM +0100, Daniel Vetter wrote:
> > @@ -4805,21 +4824,15 @@ EXPORT_SYMBOL(follow_pte_pmd);
> >   * Return: zero and the pfn at @pfn on success, -ve otherwise.
> >   */
> >  int follow_pfn(struct vm_area_struct *vma, unsigned long address,
> > -	unsigned long *pfn)
> > +	unsigned long *pfn, struct mmu_notifier *subscription)
> >  {
> > -	int ret = -EINVAL;
> > -	spinlock_t *ptl;
> > -	pte_t *ptep;
> > +	if (WARN_ON(!subscription->mm))
> > +		return -EINVAL;
> >  
> > +	if (WARN_ON(subscription->mm != vma->vm_mm))
> > +		return -EINVAL;
> 
> These two things are redundant right? vma->vm_mm != NULL?

Yup, will remove.

> BTW, why do we even have this for nommu? If the only caller is kvm,
> can you even compile kvm on nommu??

Kinda makes sense, but I have no idea how to make sure with compile
testing this is really the case. And I didn't see any hard evidence in
Kconfig or Makefile that mmu notifiers requires CONFIG_MMU. So not sure
what to do here.

Should I just remove the nommu version of follow_pfn and see what happens?
We can't remove it earlier since it's still used by other subsystems.
-Daniel
Jason Gunthorpe Nov. 24, 2020, 3:55 p.m. UTC | #9
On Tue, Nov 24, 2020 at 03:28:14PM +0100, Daniel Vetter wrote:
> On Fri, Nov 20, 2020 at 02:30:29PM -0400, Jason Gunthorpe wrote:

> > On Thu, Nov 19, 2020 at 03:41:46PM +0100, Daniel Vetter wrote:

> > > @@ -4805,21 +4824,15 @@ EXPORT_SYMBOL(follow_pte_pmd);

> > >   * Return: zero and the pfn at @pfn on success, -ve otherwise.

> > >   */

> > >  int follow_pfn(struct vm_area_struct *vma, unsigned long address,

> > > -	unsigned long *pfn)

> > > +	unsigned long *pfn, struct mmu_notifier *subscription)

> > >  {

> > > -	int ret = -EINVAL;

> > > -	spinlock_t *ptl;

> > > -	pte_t *ptep;

> > > +	if (WARN_ON(!subscription->mm))

> > > +		return -EINVAL;

> > >  

> > > +	if (WARN_ON(subscription->mm != vma->vm_mm))

> > > +		return -EINVAL;

> > 

> > These two things are redundant right? vma->vm_mm != NULL?

> 

> Yup, will remove.

> 

> > BTW, why do we even have this for nommu? If the only caller is kvm,

> > can you even compile kvm on nommu??

> 

> Kinda makes sense, but I have no idea how to make sure with compile

> testing this is really the case. And I didn't see any hard evidence in

> Kconfig or Makefile that mmu notifiers requires CONFIG_MMU. So not sure

> what to do here.


It looks like only some arches have selectable CONFIG_MMU: arm,
m68k, microblaze, riscv, sh

If we look at arches that work with HAVE_KVM, I only see: arm64, mips,
powerpc, s390, x86

So my conclusion is there is no intersection between !MMU and HAVE_KVM?

> Should I just remove the nommu version of follow_pfn and see what happens?

> We can't remove it earlier since it's still used by other

> subsystems.


This is what I was thinking might work

Jason
Daniel Vetter Nov. 25, 2020, 9 a.m. UTC | #10
On Wed, Nov 25, 2020 at 9:13 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Tue, Nov 24, 2020 at 03:28:14PM +0100, Daniel Vetter wrote:
> > On Fri, Nov 20, 2020 at 02:30:29PM -0400, Jason Gunthorpe wrote:
> > > On Thu, Nov 19, 2020 at 03:41:46PM +0100, Daniel Vetter wrote:
> > > > @@ -4805,21 +4824,15 @@ EXPORT_SYMBOL(follow_pte_pmd);
> > > >   * Return: zero and the pfn at @pfn on success, -ve otherwise.
> > > >   */
> > > >  int follow_pfn(struct vm_area_struct *vma, unsigned long address,
> > > > - unsigned long *pfn)
> > > > + unsigned long *pfn, struct mmu_notifier *subscription)
> > > >  {
> > > > - int ret = -EINVAL;
> > > > - spinlock_t *ptl;
> > > > - pte_t *ptep;
> > > > + if (WARN_ON(!subscription->mm))
> > > > +         return -EINVAL;
> > > >
> > > > + if (WARN_ON(subscription->mm != vma->vm_mm))
> > > > +         return -EINVAL;
> > >
> > > These two things are redundant right? vma->vm_mm != NULL?
> >
> > Yup, will remove.
> >
> > > BTW, why do we even have this for nommu? If the only caller is kvm,
> > > can you even compile kvm on nommu??
> >
> > Kinda makes sense, but I have no idea how to make sure with compile
> > testing this is really the case. And I didn't see any hard evidence in
> > Kconfig or Makefile that mmu notifiers requires CONFIG_MMU. So not sure
> > what to do here.
>
> It looks like only some arches have selectable CONFIG_MMU: arm,
> m68k, microblaze, riscv, sh
>
> If we look at arches that work with HAVE_KVM, I only see: arm64, mips,
> powerpc, s390, x86
>
> So my conclusion is there is no intersection between !MMU and HAVE_KVM?
>
> > Should I just remove the nommu version of follow_pfn and see what happens?
> > We can't remove it earlier since it's still used by other
> > subsystems.
>
> This is what I was thinking might work

Makes sense, I'll do that for the next round.
-Daniel
Jason Gunthorpe Nov. 27, 2020, 1:12 p.m. UTC | #11
On Thu, Nov 19, 2020 at 03:41:29PM +0100, Daniel Vetter wrote:
> I feel like this is ready for some wider soaking. Since the remaining bits

> are all kinda connnected probably simplest if it all goes through -mm.


Did you figure out a sumbission plan for this stuff?

> Daniel Vetter (17):

>   drm/exynos: Stop using frame_vector helpers

>   drm/exynos: Use FOLL_LONGTERM for g2d cmdlists

>   misc/habana: Stop using frame_vector helpers

>   misc/habana: Use FOLL_LONGTERM for userptr

>   mm/frame-vector: Use FOLL_LONGTERM

>   media: videobuf2: Move frame_vector into media subsystem


At the very least it would be good to get those in right away.

>   mm: Add unsafe_follow_pfn

>   media/videbuf1|2: Mark follow_pfn usage as unsafe

>   vfio/type1: Mark follow_pfn as unsafe


I'm surprised nobody from VFIO has remarked on this, I think thety
won't like it

>   mm: Close race in generic_access_phys

>   PCI: Obey iomem restrictions for procfs mmap

>   /dev/mem: Only set filp->f_mapping

>   resource: Move devmem revoke code to resource framework

>   sysfs: Support zapping of binary attr mmaps

>   PCI: Revoke mappings like devmem


This sequence seems fairly stand alone, and in good shape as well

My advice is to put the done things on a branch and get Stephen to put
them in linux-next. You can send a PR to Lins. There is very little mm
stuff in here, and cross subsystem stuff works better in git land,
IMHO.

Jason
Daniel Vetter Nov. 27, 2020, 3:36 p.m. UTC | #12
On Fri, Nov 27, 2020 at 09:12:25AM -0400, Jason Gunthorpe wrote:
> On Thu, Nov 19, 2020 at 03:41:29PM +0100, Daniel Vetter wrote:
> > I feel like this is ready for some wider soaking. Since the remaining bits
> > are all kinda connnected probably simplest if it all goes through -mm.
> 
> Did you figure out a sumbission plan for this stuff?

I was kinda hoping Andrew would pick it all up.

> > Daniel Vetter (17):
> >   drm/exynos: Stop using frame_vector helpers
> >   drm/exynos: Use FOLL_LONGTERM for g2d cmdlists
> >   misc/habana: Stop using frame_vector helpers
> >   misc/habana: Use FOLL_LONGTERM for userptr
> >   mm/frame-vector: Use FOLL_LONGTERM
> >   media: videobuf2: Move frame_vector into media subsystem
> 
> At the very least it would be good to get those in right away.
> 
> >   mm: Add unsafe_follow_pfn
> >   media/videbuf1|2: Mark follow_pfn usage as unsafe
> >   vfio/type1: Mark follow_pfn as unsafe
> 
> I'm surprised nobody from VFIO has remarked on this, I think thety
> won't like it

Same here tbh :-)

> >   mm: Close race in generic_access_phys
> >   PCI: Obey iomem restrictions for procfs mmap
> >   /dev/mem: Only set filp->f_mapping
> >   resource: Move devmem revoke code to resource framework
> >   sysfs: Support zapping of binary attr mmaps
> >   PCI: Revoke mappings like devmem
> 
> This sequence seems fairly stand alone, and in good shape as well

Yeah your split makes sense. I'll reorder them for the next round (which
I'm prepping right now).
> 
> My advice is to put the done things on a branch and get Stephen to put
> them in linux-next. You can send a PR to Lins. There is very little mm
> stuff in here, and cross subsystem stuff works better in git land,
> IMHO.

Yeah could do. Andrew, any preferences?
-Daniel