diff mbox series

[4.14,1/3] KVM: do not assume PTE is writable after follow_pfn

Message ID 20210803135521.2603575-1-ovidiu.panait@windriver.com
State New
Headers show
Series [4.14,1/3] KVM: do not assume PTE is writable after follow_pfn | expand

Commit Message

Ovidiu Panait Aug. 3, 2021, 1:55 p.m. UTC
From: Paolo Bonzini <pbonzini@redhat.com>

commit bd2fae8da794b55bf2ac02632da3a151b10e664c upstream.

In order to convert an HVA to a PFN, KVM usually tries to use
the get_user_pages family of functions.  This however is not
possible for VM_IO vmas; in that case, KVM instead uses follow_pfn.

In doing this however KVM loses the information on whether the
PFN is writable.  That is usually not a problem because the main
use of VM_IO vmas with KVM is for BARs in PCI device assignment,
however it is a bug.  To fix it, use follow_pte and check pte_write
while under the protection of the PTE lock.  The information can
be used to fail hva_to_pfn_remapped or passed back to the
caller via *writable.

Usage of follow_pfn was introduced in commit add6a0cd1c5b ("KVM: MMU: try to fix
up page faults before giving up", 2016-07-05); however, even older versions
have the same issue, all the way back to commit 2e2e3738af33 ("KVM:
Handle vma regions with no backing page", 2008-07-20), as they also did
not check whether the PFN was writable.

Fixes: 2e2e3738af33 ("KVM: Handle vma regions with no backing page")
Reported-by: David Stevens <stevensd@google.com>
Cc: 3pvd@google.com
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[OP: backport to 4.14, adjust follow_pte() -> follow_pte_pmd()]
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
---
 virt/kvm/kvm_main.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

Comments

Paolo Bonzini Aug. 4, 2021, 8:51 a.m. UTC | #1
On 03/08/21 15:55, Ovidiu Panait wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
> 
> commit bd2fae8da794b55bf2ac02632da3a151b10e664c upstream.
> 
> In order to convert an HVA to a PFN, KVM usually tries to use
> the get_user_pages family of functions.  This however is not
> possible for VM_IO vmas; in that case, KVM instead uses follow_pfn.
> 
> In doing this however KVM loses the information on whether the
> PFN is writable.  That is usually not a problem because the main
> use of VM_IO vmas with KVM is for BARs in PCI device assignment,
> however it is a bug.  To fix it, use follow_pte and check pte_write
> while under the protection of the PTE lock.  The information can
> be used to fail hva_to_pfn_remapped or passed back to the
> caller via *writable.
> 
> Usage of follow_pfn was introduced in commit add6a0cd1c5b ("KVM: MMU: try to fix
> up page faults before giving up", 2016-07-05); however, even older versions
> have the same issue, all the way back to commit 2e2e3738af33 ("KVM:
> Handle vma regions with no backing page", 2008-07-20), as they also did
> not check whether the PFN was writable.
> 
> Fixes: 2e2e3738af33 ("KVM: Handle vma regions with no backing page")
> Reported-by: David Stevens <stevensd@google.com>
> Cc: 3pvd@google.com
> Cc: Jann Horn <jannh@google.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> [OP: backport to 4.14, adjust follow_pte() -> follow_pte_pmd()]
> Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
> ---
>   virt/kvm/kvm_main.c | 15 ++++++++++++---
>   1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 547ae59199db..4e23d0b4b810 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1491,9 +1491,11 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
>   			       kvm_pfn_t *p_pfn)
>   {
>   	unsigned long pfn;
> +	pte_t *ptep;
> +	spinlock_t *ptl;
>   	int r;
>   
> -	r = follow_pfn(vma, addr, &pfn);
> +	r = follow_pte_pmd(vma->vm_mm, addr, NULL, NULL, &ptep, NULL, &ptl);
>   	if (r) {
>   		/*
>   		 * get_user_pages fails for VM_IO and VM_PFNMAP vmas and does
> @@ -1508,14 +1510,19 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
>   		if (r)
>   			return r;
>   
> -		r = follow_pfn(vma, addr, &pfn);
> +		r = follow_pte_pmd(vma->vm_mm, addr, NULL, NULL, &ptep, NULL, &ptl);
>   		if (r)
>   			return r;
> +	}
>   
> +	if (write_fault && !pte_write(*ptep)) {
> +		pfn = KVM_PFN_ERR_RO_FAULT;
> +		goto out;
>   	}
>   
>   	if (writable)
> -		*writable = true;
> +		*writable = pte_write(*ptep);
> +	pfn = pte_pfn(*ptep);
>   
>   	/*
>   	 * Get a reference here because callers of *hva_to_pfn* and
> @@ -1530,6 +1537,8 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
>   	 */
>   	kvm_get_pfn(pfn);
>   
> +out:
> +	pte_unmap_unlock(ptep, ptl);
>   	*p_pfn = pfn;
>   	return 0;
>   }
> 

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Greg KH Aug. 6, 2021, 6:26 a.m. UTC | #2
On Tue, Aug 03, 2021 at 04:55:19PM +0300, Ovidiu Panait wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
> 
> commit bd2fae8da794b55bf2ac02632da3a151b10e664c upstream.
> 
> In order to convert an HVA to a PFN, KVM usually tries to use
> the get_user_pages family of functions.  This however is not
> possible for VM_IO vmas; in that case, KVM instead uses follow_pfn.
> 
> In doing this however KVM loses the information on whether the
> PFN is writable.  That is usually not a problem because the main
> use of VM_IO vmas with KVM is for BARs in PCI device assignment,
> however it is a bug.  To fix it, use follow_pte and check pte_write
> while under the protection of the PTE lock.  The information can
> be used to fail hva_to_pfn_remapped or passed back to the
> caller via *writable.
> 
> Usage of follow_pfn was introduced in commit add6a0cd1c5b ("KVM: MMU: try to fix
> up page faults before giving up", 2016-07-05); however, even older versions
> have the same issue, all the way back to commit 2e2e3738af33 ("KVM:
> Handle vma regions with no backing page", 2008-07-20), as they also did
> not check whether the PFN was writable.
> 
> Fixes: 2e2e3738af33 ("KVM: Handle vma regions with no backing page")
> Reported-by: David Stevens <stevensd@google.com>
> Cc: 3pvd@google.com
> Cc: Jann Horn <jannh@google.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> [OP: backport to 4.14, adjust follow_pte() -> follow_pte_pmd()]
> Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
> ---
>  virt/kvm/kvm_main.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)

All now queued up, thanks.

greg k-h

Patch

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 547ae59199db..4e23d0b4b810 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1491,9 +1491,11 @@  static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 			       kvm_pfn_t *p_pfn)
 {
 	unsigned long pfn;
+	pte_t *ptep;
+	spinlock_t *ptl;
 	int r;
 
-	r = follow_pfn(vma, addr, &pfn);
+	r = follow_pte_pmd(vma->vm_mm, addr, NULL, NULL, &ptep, NULL, &ptl);
 	if (r) {
 		/*
 		 * get_user_pages fails for VM_IO and VM_PFNMAP vmas and does
@@ -1508,14 +1510,19 @@  static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 		if (r)
 			return r;
 
-		r = follow_pfn(vma, addr, &pfn);
+		r = follow_pte_pmd(vma->vm_mm, addr, NULL, NULL, &ptep, NULL, &ptl);
 		if (r)
 			return r;
+	}
 
+	if (write_fault && !pte_write(*ptep)) {
+		pfn = KVM_PFN_ERR_RO_FAULT;
+		goto out;
 	}
 
 	if (writable)
-		*writable = true;
+		*writable = pte_write(*ptep);
+	pfn = pte_pfn(*ptep);
 
 	/*
 	 * Get a reference here because callers of *hva_to_pfn* and
@@ -1530,6 +1537,8 @@  static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 	 */
 	kvm_get_pfn(pfn);
 
+out:
+	pte_unmap_unlock(ptep, ptl);
 	*p_pfn = pfn;
 	return 0;
 }