diff mbox series

[v2,4/9] KVM: x86: forcibly leave nested mode on vCPU reset

Message ID 20221103141351.50662-5-mlevitsk@redhat.com
State Superseded
Headers show
Series nSVM: Security and correctness fixes | expand

Commit Message

Maxim Levitsky Nov. 3, 2022, 2:13 p.m. UTC
While not obivous, kvm_vcpu_reset() leaves the nested mode by clearing
'vcpu->arch.hflags' but it does so without all the required housekeeping.

On SVM,	it is possible to have a vCPU reset while in guest mode because
unlike VMX, on SVM, INIT's are not latched in SVM non root mode and in
addition to that L1 doesn't have to intercept triple fault, which should
also trigger L1's reset if happens in L2 while L1 didn't intercept it.

If one of the above conditions happen, KVM will	continue to use vmcb02
while not having in the guest mode.

Later the IA32_EFER will be cleared which will lead to freeing of the
nested guest state which will (correctly) free the vmcb02, but since
KVM still uses it (incorrectly) this will lead to a use after free
and kernel crash.

This issue is assigned CVE-2022-3344

Cc: stable@vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/x86.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Liam Merwick Nov. 21, 2022, 4:31 p.m. UTC | #1
On 03/11/2022 14:13, Maxim Levitsky wrote:
> While not obivous, kvm_vcpu_reset() leaves the nested mode by clearing
> 'vcpu->arch.hflags' but it does so without all the required housekeeping.
> 
> On SVM,	it is possible to have a vCPU reset while in guest mode because
> unlike VMX, on SVM, INIT's are not latched in SVM non root mode and in
> addition to that L1 doesn't have to intercept triple fault, which should
> also trigger L1's reset if happens in L2 while L1 didn't intercept it.
> 
> If one of the above conditions happen, KVM will	continue to use vmcb02
> while not having in the guest mode.

"having" is the wrong word here - maybe "not having in the" -> "not 
being in" ?

> 
> Later the IA32_EFER will be cleared which will lead to freeing of the
> nested guest state which will (correctly) free the vmcb02, but since
> KVM still uses it (incorrectly) this will lead to a use after free
> and kernel crash.
> 
> This issue is assigned CVE-2022-3344
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>

Reviewed-by: Liam Merwick <liam.merwick@oracle.com>


> ---
>   arch/x86/kvm/x86.c | 10 ++++++++++
>   1 file changed, 10 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 316ab1d5317f92..3fd900504e683b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -11694,8 +11694,18 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>   	WARN_ON_ONCE(!init_event &&
>   		     (old_cr0 || kvm_read_cr3(vcpu) || kvm_read_cr4(vcpu)));
>   
> +	/*
> +	 * SVM doesn't unconditionally VM-Exit on INIT and SHUTDOWN, thus it's
> +	 * possible to INIT the vCPU while L2 is active.  Force the vCPU back
> +	 * into L1 as EFER.SVME is cleared on INIT (along with all other EFER
> +	 * bits), i.e. virtualization is disabled.
> +	 */
> +	if (is_guest_mode(vcpu))
> +		kvm_leave_nested(vcpu);
> +
>   	kvm_lapic_reset(vcpu, init_event);
>   
> +	WARN_ON_ONCE(is_guest_mode(vcpu) || is_smm(vcpu));
>   	vcpu->arch.hflags = 0;
>   
>   	vcpu->arch.smi_pending = 0;
diff mbox series

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 316ab1d5317f92..3fd900504e683b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11694,8 +11694,18 @@  void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	WARN_ON_ONCE(!init_event &&
 		     (old_cr0 || kvm_read_cr3(vcpu) || kvm_read_cr4(vcpu)));
 
+	/*
+	 * SVM doesn't unconditionally VM-Exit on INIT and SHUTDOWN, thus it's
+	 * possible to INIT the vCPU while L2 is active.  Force the vCPU back
+	 * into L1 as EFER.SVME is cleared on INIT (along with all other EFER
+	 * bits), i.e. virtualization is disabled.
+	 */
+	if (is_guest_mode(vcpu))
+		kvm_leave_nested(vcpu);
+
 	kvm_lapic_reset(vcpu, init_event);
 
+	WARN_ON_ONCE(is_guest_mode(vcpu) || is_smm(vcpu));
 	vcpu->arch.hflags = 0;
 
 	vcpu->arch.smi_pending = 0;