diff mbox series

[v1,13/23] KVM: VMX: Handle VMX nested exception for FRED

Message ID 20231108183003.5981-14-xin3.li@intel.com
State New
Headers show
Series Enable FRED with KVM VMX | expand

Commit Message

Li, Xin3 Nov. 8, 2023, 6:29 p.m. UTC
Set VMX nested exception bit in the VM-entry interruption information
VMCS field when injecting a nested exception using FRED event delivery
to ensure:
  1) The nested exception is injected on a correct stack level.
  2) The nested bit defined in FRED stack frame is set.

The event stack level used by FRED event delivery depends on whether the
event was a nested exception encountered during delivery of another event,
because a nested exception is "regarded" as happening on ring 0.  E.g.,
when #PF is configured to use stack level 1 in IA32_FRED_STKLVLS MSR:
  - nested #PF will be delivered on stack level 1 when triggered from
    user level.
  - normal #PF will be delivered on stack level 0 when triggered from
    user level.

The VMX nested-exception support ensures the correct event stack level is
chosen when a VM entry injects a nested exception.

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  6 ++++--
 arch/x86/include/asm/vmx.h      |  4 +++-
 arch/x86/kvm/svm/svm.c          |  4 ++--
 arch/x86/kvm/vmx/vmx.c          | 26 +++++++++++++++++++++-----
 arch/x86/kvm/x86.c              | 22 +++++++++++++---------
 arch/x86/kvm/x86.h              |  1 +
 6 files changed, 44 insertions(+), 19 deletions(-)

Comments

Chao Gao Nov. 14, 2023, 7:40 a.m. UTC | #1
> 	/* Require Write-Back (WB) memory type for VMCS accesses. */
>@@ -7313,11 +7328,12 @@ static void __vmx_complete_interrupts(struct kvm_vcpu *vcpu,
> 			}
> 		}
> 
>-		if (idt_vectoring_info & VECTORING_INFO_DELIVER_CODE_MASK) {
>-			u32 err = vmcs_read32(error_code_field);
>-			kvm_requeue_exception_e(vcpu, vector, err);
>-		} else
>-			kvm_requeue_exception(vcpu, vector);
>+		if (idt_vectoring_info & VECTORING_INFO_DELIVER_CODE_MASK)
>+			kvm_requeue_exception_e(vcpu, vector, vmcs_read32(error_code_field),
>+						idt_vectoring_info & INTR_INFO_NESTED_EXCEPTION_MASK);
>+		else
>+			kvm_requeue_exception(vcpu, vector,
>+					      idt_vectoring_info & INTR_INFO_NESTED_EXCEPTION_MASK);

Exiting-event identification can also have bit 13 set, indicating a nested
exception encountered and caused VM-exit. when reinjecting the exception to
guests, kvm needs to set the "nested" bit, right? I suspect some changes
to e.g., handle_exception_nmi() are needed.
Li, Xin3 Dec. 6, 2023, 8:37 a.m. UTC | #2
> Subject: RE: [PATCH v1 13/23] KVM: VMX: Handle VMX nested exception for FRED
> 
> > >+		if (idt_vectoring_info &
> VECTORING_INFO_DELIVER_CODE_MASK)
> > >+			kvm_requeue_exception_e(vcpu, vector,
> > vmcs_read32(error_code_field),
> > >+						idt_vectoring_info &
> > INTR_INFO_NESTED_EXCEPTION_MASK);
> > >+		else
> > >+			kvm_requeue_exception(vcpu, vector,
> > >+					      idt_vectoring_info &
> > INTR_INFO_NESTED_EXCEPTION_MASK);
> >
> > Exiting-event identification can also have bit 13 set, indicating a
> > nested exception encountered and caused VM-exit. when reinjecting the
> > exception to guests, kvm needs to set the "nested" bit, right? I
> > suspect some changes to e.g., handle_exception_nmi() are needed.
> 
> The current patch relies on kvm_multiple_exception() to do that.  But TBH, I'm
> not sure it can recognize all nested cases.  I probably should revisit it.

So the conclusion is that kvm_multiple_exception() is smart enough, and
a VMM doesn't have to check bit 13 of the Exiting-event identification.

In FRED spec 5.0, section 9.2 - New VMX Feature: VMX Nested-Exception
Support, there is a statement at the end of Exiting-event identification:

(The value of this bit is always identical to that of the valid bit of
the original-event identification field.)

It means that even w/o VMX Nested-Exception support, a VMM already knows
if an exception is a nested exception encountered during delivery of
another event in an exception caused VM exit (exit reason 0).  This is
done in KVM through reading IDT_VECTORING_INFO_FIELD and calling
vmx_complete_interrupts() immediately after VM exits.

vmx_complete_interrupts() simply queues the original exception if there is
one, and later the nested exception causing the VM exit could be cancelled
if it is a shadow page fault.  However if the shadow page fault is caused
by a guest page fault, KVM injects it as a nested exception to have guest
fix its page table.

I will add comments about this background in the next iteration.
Chao Gao Dec. 7, 2023, 8:42 a.m. UTC | #3
On Wed, Dec 06, 2023 at 04:37:39PM +0800, Li, Xin3 wrote:
>> Subject: RE: [PATCH v1 13/23] KVM: VMX: Handle VMX nested exception for FRED
>> 
>> > >+		if (idt_vectoring_info &
>> VECTORING_INFO_DELIVER_CODE_MASK)
>> > >+			kvm_requeue_exception_e(vcpu, vector,
>> > vmcs_read32(error_code_field),
>> > >+						idt_vectoring_info &
>> > INTR_INFO_NESTED_EXCEPTION_MASK);
>> > >+		else
>> > >+			kvm_requeue_exception(vcpu, vector,
>> > >+					      idt_vectoring_info &
>> > INTR_INFO_NESTED_EXCEPTION_MASK);
>> >
>> > Exiting-event identification can also have bit 13 set, indicating a
>> > nested exception encountered and caused VM-exit. when reinjecting the
>> > exception to guests, kvm needs to set the "nested" bit, right? I
>> > suspect some changes to e.g., handle_exception_nmi() are needed.
>> 
>> The current patch relies on kvm_multiple_exception() to do that.  But TBH, I'm
>> not sure it can recognize all nested cases.  I probably should revisit it.
>
>So the conclusion is that kvm_multiple_exception() is smart enough, and
>a VMM doesn't have to check bit 13 of the Exiting-event identification.
>
>In FRED spec 5.0, section 9.2 - New VMX Feature: VMX Nested-Exception
>Support, there is a statement at the end of Exiting-event identification:
>
>(The value of this bit is always identical to that of the valid bit of
>the original-event identification field.)
>
>It means that even w/o VMX Nested-Exception support, a VMM already knows
>if an exception is a nested exception encountered during delivery of
>another event in an exception caused VM exit (exit reason 0).  This is
>done in KVM through reading IDT_VECTORING_INFO_FIELD and calling
>vmx_complete_interrupts() immediately after VM exits.
>
>vmx_complete_interrupts() simply queues the original exception if there is
>one, and later the nested exception causing the VM exit could be cancelled
>if it is a shadow page fault.  However if the shadow page fault is caused
>by a guest page fault, KVM injects it as a nested exception to have guest
>fix its page table.
>
>I will add comments about this background in the next iteration.

is it possible that the CPU encounters an exception and causes VM-exit during
injecting an __interrupt__? in this case, no __exception__ will be (re-)queued
by vmx_complete_interrupts().
Chao Gao Dec. 8, 2023, 1:56 a.m. UTC | #4
On Thu, Dec 07, 2023 at 06:09:46PM +0800, Li, Xin3 wrote:
>> >> > Exiting-event identification can also have bit 13 set, indicating a
>> >> > nested exception encountered and caused VM-exit. when reinjecting the
>> >> > exception to guests, kvm needs to set the "nested" bit, right? I
>> >> > suspect some changes to e.g., handle_exception_nmi() are needed.
>> >>
>> >> The current patch relies on kvm_multiple_exception() to do that.  But TBH, I'm
>> >> not sure it can recognize all nested cases.  I probably should revisit it.
>> >
>> >So the conclusion is that kvm_multiple_exception() is smart enough, and
>> >a VMM doesn't have to check bit 13 of the Exiting-event identification.
>> >
>> >In FRED spec 5.0, section 9.2 - New VMX Feature: VMX Nested-Exception
>> >Support, there is a statement at the end of Exiting-event identification:
>> >
>> >(The value of this bit is always identical to that of the valid bit of
>> >the original-event identification field.)
>> >
>> >It means that even w/o VMX Nested-Exception support, a VMM already knows
>> >if an exception is a nested exception encountered during delivery of
>> >another event in an exception caused VM exit (exit reason 0).  This is
>> >done in KVM through reading IDT_VECTORING_INFO_FIELD and calling
>> >vmx_complete_interrupts() immediately after VM exits.
>> >
>> >vmx_complete_interrupts() simply queues the original exception if there is
>> >one, and later the nested exception causing the VM exit could be cancelled
>> >if it is a shadow page fault.  However if the shadow page fault is caused
>> >by a guest page fault, KVM injects it as a nested exception to have guest
>> >fix its page table.
>> >
>> >I will add comments about this background in the next iteration.
>> 
>> is it possible that the CPU encounters an exception and causes VM-exit during
>> injecting an __interrupt__? in this case, no __exception__ will be (re-)queued
>> by vmx_complete_interrupts().
>
>I guess the following case is what you're suggesting:
>KVM injects an external interrupt after shadow page tables are nuked.
>
>vmx_complete_interrupts() are called after each VM exit to clear both
>interrupt and exception queues, which means it always pushes the
>deepest event if there is an original event.  In the above case, the
>original event is the external interrupt KVM just tried to inject.

in my understanding, your point is:
1. if bit 13 of the Exiting-event identification is set. the original-event
identification field should be valid.
2. vmx_complete_interrupts() is done immediately after VM exits and reads
original-event identification and reinjects the event there.
3. if KVM injects the exception in exiting-event identification
to guest, KVM doesn't need to read the bit 13 because kvm_multiple_exception()
is "smart enough" and recognize the exception as nested-exception because if
bit 13 is 1, one exception must has been queued in #2.

my question is:
what if the event in original-event identification is an interrupt e.g.,
external interrupt or NMI, rather than exception. vmx_complete_interrupts()
won't queue an exception, then how can KVM or kvm_multiple_exception() know the
exception that caused VM-exit is an nested exception w/o reading bit 13 of the
Exiting-event identification?
Li, Xin3 Dec. 8, 2023, 11:48 p.m. UTC | #5
> >> >> > Exiting-event identification can also have bit 13 set, indicating a
> >> >> > nested exception encountered and caused VM-exit. when reinjecting the
> >> >> > exception to guests, kvm needs to set the "nested" bit, right? I
> >> >> > suspect some changes to e.g., handle_exception_nmi() are needed.
> >> >>
> >> >> The current patch relies on kvm_multiple_exception() to do that.  But TBH,
> I'm
> >> >> not sure it can recognize all nested cases.  I probably should revisit it.
> >> >
> >> >So the conclusion is that kvm_multiple_exception() is smart enough, and
> >> >a VMM doesn't have to check bit 13 of the Exiting-event identification.
> >> >
> >> >In FRED spec 5.0, section 9.2 - New VMX Feature: VMX Nested-Exception
> >> >Support, there is a statement at the end of Exiting-event identification:
> >> >
> >> >(The value of this bit is always identical to that of the valid bit of
> >> >the original-event identification field.)
> >> >
> >> >It means that even w/o VMX Nested-Exception support, a VMM already
> knows
> >> >if an exception is a nested exception encountered during delivery of
> >> >another event in an exception caused VM exit (exit reason 0).  This is
> >> >done in KVM through reading IDT_VECTORING_INFO_FIELD and calling
> >> >vmx_complete_interrupts() immediately after VM exits.
> >> >
> >> >vmx_complete_interrupts() simply queues the original exception if there is
> >> >one, and later the nested exception causing the VM exit could be cancelled
> >> >if it is a shadow page fault.  However if the shadow page fault is caused
> >> >by a guest page fault, KVM injects it as a nested exception to have guest
> >> >fix its page table.
> >> >
> >> >I will add comments about this background in the next iteration.
> >>
> >> is it possible that the CPU encounters an exception and causes VM-exit during
> >> injecting an __interrupt__? in this case, no __exception__ will be (re-)queued
> >> by vmx_complete_interrupts().
> >
> >I guess the following case is what you're suggesting:
> >KVM injects an external interrupt after shadow page tables are nuked.
> >
> >vmx_complete_interrupts() are called after each VM exit to clear both
> >interrupt and exception queues, which means it always pushes the
> >deepest event if there is an original event.  In the above case, the
> >original event is the external interrupt KVM just tried to inject.
> 
> in my understanding, your point is:
> 1. if bit 13 of the Exiting-event identification is set. the original-event
> identification field should be valid.
> 2. vmx_complete_interrupts() is done immediately after VM exits and reads
> original-event identification and reinjects the event there.
> 3. if KVM injects the exception in exiting-event identification
> to guest, KVM doesn't need to read the bit 13 because kvm_multiple_exception()
> is "smart enough" and recognize the exception as nested-exception because if
> bit 13 is 1, one exception must has been queued in #2.
> 
> my question is:
> what if the event in original-event identification is an interrupt e.g.,
> external interrupt or NMI, rather than exception.  vmx_complete_interrupts()
> won't queue an exception, then how can KVM or kvm_multiple_exception()
> know the
> exception that caused VM-exit is an nested exception w/o reading bit 13 of the
> Exiting-event identification?

The good news is that vmx_complete_interrupts() still queues the event
even it's not a hardware exception.  It's just that kvm_multiple_exception()
doesn't check if there is an original interrupt or NMI because IDT event
delivery doesn't care such a case.

I think your point is more of that we should check it when FRED is enabled
for a guest.  Yes, architecturally we should do it.

What I want to emphasize is that bit 13 of the exiting-event identification
is set to the valid bit of the original-event identification, they are
logically the same thing when FRED is enabled.  It doens't matter which one
a VMM reads and uses.  But a VMM doesn't need to differentiate FRED and IDT
if it reads the info from original-event identification.
diff mbox series

Patch

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1e5a6d9439f8..2ae8cc83dbb3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -721,6 +721,7 @@  struct kvm_queued_exception {
 	u32 error_code;
 	unsigned long payload;
 	bool has_payload;
+	bool nested;
 };
 
 struct kvm_vcpu_arch {
@@ -2015,8 +2016,9 @@  int kvm_emulate_rdpmc(struct kvm_vcpu *vcpu);
 void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr);
 void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
 void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr, unsigned long payload);
-void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr);
-void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
+void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr, bool nested);
+void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr,
+			     u32 error_code, bool nested);
 void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault);
 void kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu,
 				    struct x86_exception *fault);
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 97729248e844..020dfd3f6b44 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -132,6 +132,7 @@ 
 /* VMX_BASIC bits and bitmasks */
 #define VMX_BASIC_32BIT_PHYS_ADDR_ONLY		BIT_ULL(48)
 #define VMX_BASIC_INOUT				BIT_ULL(54)
+#define VMX_BASIC_NESTED_EXCEPTION		BIT_ULL(58)
 
 /* VMX_MISC bits and bitmasks */
 #define VMX_MISC_INTEL_PT			BIT_ULL(14)
@@ -404,8 +405,9 @@  enum vmcs_field {
 #define INTR_INFO_INTR_TYPE_MASK        0x700           /* 10:8 */
 #define INTR_INFO_DELIVER_CODE_MASK     0x800           /* 11 */
 #define INTR_INFO_UNBLOCK_NMI		0x1000		/* 12 */
+#define INTR_INFO_NESTED_EXCEPTION_MASK	0x2000		/* 13 */
 #define INTR_INFO_VALID_MASK            0x80000000      /* 31 */
-#define INTR_INFO_RESVD_BITS_MASK       0x7ffff000
+#define INTR_INFO_RESVD_BITS_MASK       0x7fffd000
 
 #define VECTORING_INFO_VECTOR_MASK           	INTR_INFO_VECTOR_MASK
 #define VECTORING_INFO_TYPE_MASK        	INTR_INFO_INTR_TYPE_MASK
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 712146312358..78a9ff5cfcad 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4047,10 +4047,10 @@  static void svm_complete_interrupts(struct kvm_vcpu *vcpu)
 
 		if (exitintinfo & SVM_EXITINTINFO_VALID_ERR) {
 			u32 err = svm->vmcb->control.exit_int_info_err;
-			kvm_requeue_exception_e(vcpu, vector, err);
+			kvm_requeue_exception_e(vcpu, vector, err, false);
 
 		} else
-			kvm_requeue_exception(vcpu, vector);
+			kvm_requeue_exception(vcpu, vector, false);
 		break;
 	case SVM_EXITINTINFO_TYPE_INTR:
 		kvm_queue_interrupt(vcpu, vector, false);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 67fd4a56d031..518e68ee5a0d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1901,6 +1901,8 @@  static void vmx_inject_exception(struct kvm_vcpu *vcpu)
 				event_data = vcpu->arch.guest_fpu.xfd_err;
 
 			vmcs_write64(INJECTED_EVENT_DATA, event_data);
+
+			intr_info |= ex->nested ? INTR_INFO_NESTED_EXCEPTION_MASK : 0;
 		}
 	}
 
@@ -2851,6 +2853,19 @@  static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	/* IA-32 SDM Vol 3B: 64-bit CPUs always have VMX_BASIC_MSR[48]==0. */
 	if (basic_msr & VMX_BASIC_32BIT_PHYS_ADDR_ONLY)
 		return -EIO;
+
+	/*
+	 * FRED draft Spec 5.0 Section 9.2:
+	 *
+	 * Any processor that enumerates support for FRED transitions
+	 * will also enumerate VMX nested-exception support.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_FRED) &&
+	    !(basic_msr & VMX_BASIC_NESTED_EXCEPTION)) {
+		pr_warn_once("FRED enabled but no VMX nested-exception support\n");
+		if (error_on_inconsistent_vmcs_config)
+			return -EIO;
+	}
 #endif
 
 	/* Require Write-Back (WB) memory type for VMCS accesses. */
@@ -7313,11 +7328,12 @@  static void __vmx_complete_interrupts(struct kvm_vcpu *vcpu,
 			}
 		}
 
-		if (idt_vectoring_info & VECTORING_INFO_DELIVER_CODE_MASK) {
-			u32 err = vmcs_read32(error_code_field);
-			kvm_requeue_exception_e(vcpu, vector, err);
-		} else
-			kvm_requeue_exception(vcpu, vector);
+		if (idt_vectoring_info & VECTORING_INFO_DELIVER_CODE_MASK)
+			kvm_requeue_exception_e(vcpu, vector, vmcs_read32(error_code_field),
+						idt_vectoring_info & INTR_INFO_NESTED_EXCEPTION_MASK);
+		else
+			kvm_requeue_exception(vcpu, vector,
+					      idt_vectoring_info & INTR_INFO_NESTED_EXCEPTION_MASK);
 		break;
 	case INTR_TYPE_SOFT_INTR:
 		vcpu->arch.event_exit_inst_len = vmcs_read32(instr_len_field);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d190bfc63fc4..51c07730f1b6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -645,7 +645,8 @@  static void kvm_leave_nested(struct kvm_vcpu *vcpu)
 
 static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
 		unsigned nr, bool has_error, u32 error_code,
-	        bool has_payload, unsigned long payload, bool reinject)
+	        bool has_payload, unsigned long payload,
+		bool reinject, bool nested)
 {
 	u32 prev_nr;
 	int class1, class2;
@@ -678,6 +679,7 @@  static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
 			 */
 			WARN_ON_ONCE(kvm_is_exception_pending(vcpu));
 			vcpu->arch.exception.injected = true;
+			vcpu->arch.exception.nested = nested;
 			if (WARN_ON_ONCE(has_payload)) {
 				/*
 				 * For a reinjected event, KVM delivers its
@@ -727,6 +729,8 @@  static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
 
 		kvm_queue_exception_e(vcpu, DF_VECTOR, 0);
 	} else {
+		vcpu->arch.exception.nested = true;
+
 		/* replace previous exception with a new one in a hope
 		   that instruction re-execution will regenerate lost
 		   exception */
@@ -736,20 +740,20 @@  static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
 
 void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr)
 {
-	kvm_multiple_exception(vcpu, nr, false, 0, false, 0, false);
+	kvm_multiple_exception(vcpu, nr, false, 0, false, 0, false, false);
 }
 EXPORT_SYMBOL_GPL(kvm_queue_exception);
 
-void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr)
+void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr, bool nested)
 {
-	kvm_multiple_exception(vcpu, nr, false, 0, false, 0, true);
+	kvm_multiple_exception(vcpu, nr, false, 0, false, 0, true, nested);
 }
 EXPORT_SYMBOL_GPL(kvm_requeue_exception);
 
 void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr,
 			   unsigned long payload)
 {
-	kvm_multiple_exception(vcpu, nr, false, 0, true, payload, false);
+	kvm_multiple_exception(vcpu, nr, false, 0, true, payload, false, false);
 }
 EXPORT_SYMBOL_GPL(kvm_queue_exception_p);
 
@@ -757,7 +761,7 @@  static void kvm_queue_exception_e_p(struct kvm_vcpu *vcpu, unsigned nr,
 				    u32 error_code, unsigned long payload)
 {
 	kvm_multiple_exception(vcpu, nr, true, error_code,
-			       true, payload, false);
+			       true, payload, false, false);
 }
 
 int kvm_complete_insn_gp(struct kvm_vcpu *vcpu, int err)
@@ -829,13 +833,13 @@  void kvm_inject_nmi(struct kvm_vcpu *vcpu)
 
 void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code)
 {
-	kvm_multiple_exception(vcpu, nr, true, error_code, false, 0, false);
+	kvm_multiple_exception(vcpu, nr, true, error_code, false, 0, false, false);
 }
 EXPORT_SYMBOL_GPL(kvm_queue_exception_e);
 
-void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code)
+void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code, bool nested)
 {
-	kvm_multiple_exception(vcpu, nr, true, error_code, false, 0, true);
+	kvm_multiple_exception(vcpu, nr, true, error_code, false, 0, true, nested);
 }
 EXPORT_SYMBOL_GPL(kvm_requeue_exception_e);
 
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 60da8cbe6759..63e543c6834b 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -108,6 +108,7 @@  static inline void kvm_clear_exception_queue(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.exception.pending = false;
 	vcpu->arch.exception.injected = false;
+	vcpu->arch.exception.nested = false;
 	vcpu->arch.exception_vmexit.pending = false;
 }