diff mbox series

[4.19,08/57] KVM: x86: Fix split-irqchip vs interrupt injection window request

Message ID 20201201084648.690944071@linuxfoundation.org
State Superseded
Headers show
Series None | expand

Commit Message

Greg Kroah-Hartman Dec. 1, 2020, 8:53 a.m. UTC
From: Paolo Bonzini <pbonzini@redhat.com>

commit 71cc849b7093bb83af966c0e60cb11b7f35cd746 upstream.

kvm_cpu_accept_dm_intr and kvm_vcpu_ready_for_interrupt_injection are
a hodge-podge of conditions, hacked together to get something that
more or less works.  But what is actually needed is much simpler;
in both cases the fundamental question is, do we have a place to stash
an interrupt if userspace does KVM_INTERRUPT?

In userspace irqchip mode, that is !vcpu->arch.interrupt.injected.
Currently kvm_event_needs_reinjection(vcpu) covers it, but it is
unnecessarily restrictive.

In split irqchip mode it's a bit more complicated, we need to check
kvm_apic_accept_pic_intr(vcpu) (the IRQ window exit is basically an INTACK
cycle and thus requires ExtINTs not to be masked) as well as
!pending_userspace_extint(vcpu).  However, there is no need to
check kvm_event_needs_reinjection(vcpu), since split irqchip keeps
pending ExtINT state separate from event injection state, and checking
kvm_cpu_has_interrupt(vcpu) is wrong too since ExtINT has higher
priority than APIC interrupts.  In fact the latter fixes a bug:
when userspace requests an IRQ window vmexit, an interrupt in the
local APIC can cause kvm_cpu_has_interrupt() to be true and thus
kvm_vcpu_ready_for_interrupt_injection() to return false.  When this
happens, vcpu_run does not exit to userspace but the interrupt window
vmexits keep occurring.  The VM loops without any hope of making progress.

Once we try to fix these with something like

     return kvm_arch_interrupt_allowed(vcpu) &&
-        !kvm_cpu_has_interrupt(vcpu) &&
-        !kvm_event_needs_reinjection(vcpu) &&
-        kvm_cpu_accept_dm_intr(vcpu);
+        (!lapic_in_kernel(vcpu)
+         ? !vcpu->arch.interrupt.injected
+         : (kvm_apic_accept_pic_intr(vcpu)
+            && !pending_userspace_extint(v)));

we realize two things.  First, thanks to the previous patch the complex
conditional can reuse !kvm_cpu_has_extint(vcpu).  Second, the interrupt
window request in vcpu_enter_guest()

        bool req_int_win =
                dm_request_for_irq_injection(vcpu) &&
                kvm_cpu_accept_dm_intr(vcpu);

should be kept in sync with kvm_vcpu_ready_for_interrupt_injection():
it is unnecessary to ask the processor for an interrupt window
if we would not be able to return to userspace.  Therefore,
kvm_cpu_accept_dm_intr(vcpu) is basically !kvm_cpu_has_extint(vcpu)
ANDed with the existing check for masked ExtINT.  It all makes sense:

- we can accept an interrupt from userspace if there is a place
  to stash it (and, for irqchip split, ExtINTs are not masked).
  Interrupts from userspace _can_ be accepted even if right now
  EFLAGS.IF=0.

- in order to tell userspace we will inject its interrupt ("IRQ
  window open" i.e. kvm_vcpu_ready_for_interrupt_injection), both
  KVM and the vCPU need to be ready to accept the interrupt.

... and this is what the patch implements.

Reported-by: David Woodhouse <dwmw@amazon.co.uk>
Analyzed-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Nikos Tsironis <ntsironis@arrikto.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Tested-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/irq.c              |    2 +-
 arch/x86/kvm/x86.c              |   18 ++++++++++--------
 3 files changed, 12 insertions(+), 9 deletions(-)

Comments

Paolo Bonzini Dec. 1, 2020, 9:06 a.m. UTC | #1
On 01/12/20 09:53, Greg Kroah-Hartman wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
> 
> commit 71cc849b7093bb83af966c0e60cb11b7f35cd746 upstream.

Since you did not apply "KVM: x86: handle !lapic_in_kernel case in 
kvm_cpu_*_extint" to 4.19 or earlier, please leave this one out as well.

Thanks,

Paolo

> kvm_cpu_accept_dm_intr and kvm_vcpu_ready_for_interrupt_injection are
> a hodge-podge of conditions, hacked together to get something that
> more or less works.  But what is actually needed is much simpler;
> in both cases the fundamental question is, do we have a place to stash
> an interrupt if userspace does KVM_INTERRUPT?
> 
> In userspace irqchip mode, that is !vcpu->arch.interrupt.injected.
> Currently kvm_event_needs_reinjection(vcpu) covers it, but it is
> unnecessarily restrictive.
> 
> In split irqchip mode it's a bit more complicated, we need to check
> kvm_apic_accept_pic_intr(vcpu) (the IRQ window exit is basically an INTACK
> cycle and thus requires ExtINTs not to be masked) as well as
> !pending_userspace_extint(vcpu).  However, there is no need to
> check kvm_event_needs_reinjection(vcpu), since split irqchip keeps
> pending ExtINT state separate from event injection state, and checking
> kvm_cpu_has_interrupt(vcpu) is wrong too since ExtINT has higher
> priority than APIC interrupts.  In fact the latter fixes a bug:
> when userspace requests an IRQ window vmexit, an interrupt in the
> local APIC can cause kvm_cpu_has_interrupt() to be true and thus
> kvm_vcpu_ready_for_interrupt_injection() to return false.  When this
> happens, vcpu_run does not exit to userspace but the interrupt window
> vmexits keep occurring.  The VM loops without any hope of making progress.
> 
> Once we try to fix these with something like
> 
>       return kvm_arch_interrupt_allowed(vcpu) &&
> -        !kvm_cpu_has_interrupt(vcpu) &&
> -        !kvm_event_needs_reinjection(vcpu) &&
> -        kvm_cpu_accept_dm_intr(vcpu);
> +        (!lapic_in_kernel(vcpu)
> +         ? !vcpu->arch.interrupt.injected
> +         : (kvm_apic_accept_pic_intr(vcpu)
> +            && !pending_userspace_extint(v)));
> 
> we realize two things.  First, thanks to the previous patch the complex
> conditional can reuse !kvm_cpu_has_extint(vcpu).  Second, the interrupt
> window request in vcpu_enter_guest()
> 
>          bool req_int_win =
>                  dm_request_for_irq_injection(vcpu) &&
>                  kvm_cpu_accept_dm_intr(vcpu);
> 
> should be kept in sync with kvm_vcpu_ready_for_interrupt_injection():
> it is unnecessary to ask the processor for an interrupt window
> if we would not be able to return to userspace.  Therefore,
> kvm_cpu_accept_dm_intr(vcpu) is basically !kvm_cpu_has_extint(vcpu)
> ANDed with the existing check for masked ExtINT.  It all makes sense:
> 
> - we can accept an interrupt from userspace if there is a place
>    to stash it (and, for irqchip split, ExtINTs are not masked).
>    Interrupts from userspace _can_ be accepted even if right now
>    EFLAGS.IF=0.
> 
> - in order to tell userspace we will inject its interrupt ("IRQ
>    window open" i.e. kvm_vcpu_ready_for_interrupt_injection), both
>    KVM and the vCPU need to be ready to accept the interrupt.
> 
> ... and this is what the patch implements.
> 
> Reported-by: David Woodhouse <dwmw@amazon.co.uk>
> Analyzed-by: David Woodhouse <dwmw@amazon.co.uk>
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Reviewed-by: Nikos Tsironis <ntsironis@arrikto.com>
> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
> Tested-by: David Woodhouse <dwmw@amazon.co.uk>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> ---
>   arch/x86/include/asm/kvm_host.h |    1 +
>   arch/x86/kvm/irq.c              |    2 +-
>   arch/x86/kvm/x86.c              |   18 ++++++++++--------
>   3 files changed, 12 insertions(+), 9 deletions(-)
> 
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1472,6 +1472,7 @@ int kvm_test_age_hva(struct kvm *kvm, un
>   void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte);
>   int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v);
>   int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
> +int kvm_cpu_has_extint(struct kvm_vcpu *v);
>   int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu);
>   int kvm_cpu_get_interrupt(struct kvm_vcpu *v);
>   void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);
> --- a/arch/x86/kvm/irq.c
> +++ b/arch/x86/kvm/irq.c
> @@ -52,7 +52,7 @@ static int pending_userspace_extint(stru
>    * check if there is pending interrupt from
>    * non-APIC source without intack.
>    */
> -static int kvm_cpu_has_extint(struct kvm_vcpu *v)
> +int kvm_cpu_has_extint(struct kvm_vcpu *v)
>   {
>   	/*
>   	 * FIXME: interrupt.injected represents an interrupt whose
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3351,21 +3351,23 @@ static int kvm_vcpu_ioctl_set_lapic(stru
>   
>   static int kvm_cpu_accept_dm_intr(struct kvm_vcpu *vcpu)
>   {
> +	/*
> +	 * We can accept userspace's request for interrupt injection
> +	 * as long as we have a place to store the interrupt number.
> +	 * The actual injection will happen when the CPU is able to
> +	 * deliver the interrupt.
> +	 */
> +	if (kvm_cpu_has_extint(vcpu))
> +		return false;
> +
> +	/* Acknowledging ExtINT does not happen if LINT0 is masked.  */
>   	return (!lapic_in_kernel(vcpu) ||
>   		kvm_apic_accept_pic_intr(vcpu));
>   }
>   
> -/*
> - * if userspace requested an interrupt window, check that the
> - * interrupt window is open.
> - *
> - * No need to exit to userspace if we already have an interrupt queued.
> - */
>   static int kvm_vcpu_ready_for_interrupt_injection(struct kvm_vcpu *vcpu)
>   {
>   	return kvm_arch_interrupt_allowed(vcpu) &&
> -		!kvm_cpu_has_interrupt(vcpu) &&
> -		!kvm_event_needs_reinjection(vcpu) &&
>   		kvm_cpu_accept_dm_intr(vcpu);
>   }
>   
> 
>
Greg Kroah-Hartman Dec. 1, 2020, 9:57 a.m. UTC | #2
On Tue, Dec 01, 2020 at 10:06:54AM +0100, Paolo Bonzini wrote:
> On 01/12/20 09:53, Greg Kroah-Hartman wrote:
> > From: Paolo Bonzini <pbonzini@redhat.com>
> > 
> > commit 71cc849b7093bb83af966c0e60cb11b7f35cd746 upstream.
> 
> Since you did not apply "KVM: x86: handle !lapic_in_kernel case in
> kvm_cpu_*_extint" to 4.19 or earlier, please leave this one out as well.

Isn't it patch 07 of this series?

confused,

greg k-h
Greg Kroah-Hartman Dec. 1, 2020, 10:20 a.m. UTC | #3
On Tue, Dec 01, 2020 at 11:03:36AM +0100, Paolo Bonzini wrote:
> On 01/12/20 10:57, Greg Kroah-Hartman wrote:
> > > Since you did not apply "KVM: x86: handle !lapic_in_kernel case in
> > > kvm_cpu_*_extint" to 4.19 or earlier, please leave this one out as well.
> > Isn't it patch 07 of this series?
> 
> Yes, it arrived a few minutes after I hit send for whatever reason. Though
> it still applies to 4.14 and earlier:
> 
> ` [PATCH 4.14 04/50] wireless: Use linux/stddef.h instead of stddef.h
> ` [PATCH 4.14 07/50] btrfs: adjust return values of btrfs_inode_by_name
> ` [PATCH 4.14 08/50] btrfs: inode: Verify inode mode to avoid NULL pointer
> dereference
> ` [PATCH 4.14 09/50] KVM: x86: Fix split-irqchip vs interrupt injection
> window request
> 
> ` [PATCH 4.9 05/42] btrfs: tree-checker: Enhance chunk checker to validate
> chunk profile
> ` [PATCH 4.9 06/42] btrfs: inode: Verify inode mode to avoid NULL pointer
> dereference
> ` [PATCH 4.9 07/42] KVM: x86: Fix split-irqchip vs interrupt injection
> window request
> 
> ` [PATCH 4.4 01/24] btrfs: tree-checker: Enhance chunk checker to validate
> chunk profile
> ` [PATCH 4.4 02/24] btrfs: inode: Verify inode mode to avoid NULL pointer
> dereference
> ` [PATCH 4.4 03/24] KVM: x86: Fix split-irqchip vs interrupt injection
> window request

Ok, I will go drop this patch from 4.14, 4.9, and 4.4.  Or, should the
needed pre-requisite patch be properly backported there instead?

And was it marked somewhere that this patch depended on that one and I
just missed it?

thanks,

greg k-h
Greg Kroah-Hartman Dec. 1, 2020, 11:13 a.m. UTC | #4
On Tue, Dec 01, 2020 at 11:55:55AM +0100, Paolo Bonzini wrote:
> On 01/12/20 11:20, Greg Kroah-Hartman wrote:
> > Ok, I will go drop this patch from 4.14, 4.9, and 4.4.  Or, should the
> > needed pre-requisite patch be properly backported there instead?
> 
> I would just drop it.  It was not reported in five years so it's quite
> unlikely that people will see the bug.

Ok, will go drop them.

> > And was it marked somewhere that this patch depended on that one and I
> > just missed it?
> 
> I don't see anything in stable-kernel-rules.rst about how to mark such
> semantic conflicts, so no, it wasn't marked.  (The commit message does say
> "thanks to the previous patch", but I don't expect you or your scripts to
> notice that!).

If you look at the section on https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
that starts with "Additionally, some patches..." it will show that you
can add "#" comments on the cc: stable line to let me know pre-requsite
commits if you know them, and want to do that in the future.

thanks,

greg k-h
Pavel Machek Dec. 1, 2020, 3:33 p.m. UTC | #5
Hi!

> - in order to tell userspace we will inject its interrupt ("IRQ
>   window open" i.e. kvm_vcpu_ready_for_interrupt_injection), both
>   KVM and the vCPU need to be ready to accept the interrupt.
> 
> ... and this is what the patch implements.
> 
> Reported-by: David Woodhouse <dwmw@amazon.co.uk>
> Analyzed-by: David Woodhouse <dwmw@amazon.co.uk>
> Cc: stable@vger.kernel.org

This makes no difference for -stable, but the patch is confused about
types:

> +++ b/arch/x86/kvm/x86.c
> @@ -3351,21 +3351,23 @@ static int kvm_vcpu_ioctl_set_lapic(stru
>  
>  static int kvm_cpu_accept_dm_intr(struct kvm_vcpu *vcpu)
>  {
> +	/*
> +	 * We can accept userspace's request for interrupt injection
> +	 * as long as we have a place to store the interrupt number.
> +	 * The actual injection will happen when the CPU is able to
> +	 * deliver the interrupt.
> +	 */
> +	if (kvm_cpu_has_extint(vcpu))
> +		return false;

Since function is "static int" it should probably return 0.

Best regards,
								Pavel
--
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
diff mbox series

Patch

--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1472,6 +1472,7 @@  int kvm_test_age_hva(struct kvm *kvm, un
 void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte);
 int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v);
 int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
+int kvm_cpu_has_extint(struct kvm_vcpu *v);
 int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu);
 int kvm_cpu_get_interrupt(struct kvm_vcpu *v);
 void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -52,7 +52,7 @@  static int pending_userspace_extint(stru
  * check if there is pending interrupt from
  * non-APIC source without intack.
  */
-static int kvm_cpu_has_extint(struct kvm_vcpu *v)
+int kvm_cpu_has_extint(struct kvm_vcpu *v)
 {
 	/*
 	 * FIXME: interrupt.injected represents an interrupt whose
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3351,21 +3351,23 @@  static int kvm_vcpu_ioctl_set_lapic(stru
 
 static int kvm_cpu_accept_dm_intr(struct kvm_vcpu *vcpu)
 {
+	/*
+	 * We can accept userspace's request for interrupt injection
+	 * as long as we have a place to store the interrupt number.
+	 * The actual injection will happen when the CPU is able to
+	 * deliver the interrupt.
+	 */
+	if (kvm_cpu_has_extint(vcpu))
+		return false;
+
+	/* Acknowledging ExtINT does not happen if LINT0 is masked.  */
 	return (!lapic_in_kernel(vcpu) ||
 		kvm_apic_accept_pic_intr(vcpu));
 }
 
-/*
- * if userspace requested an interrupt window, check that the
- * interrupt window is open.
- *
- * No need to exit to userspace if we already have an interrupt queued.
- */
 static int kvm_vcpu_ready_for_interrupt_injection(struct kvm_vcpu *vcpu)
 {
 	return kvm_arch_interrupt_allowed(vcpu) &&
-		!kvm_cpu_has_interrupt(vcpu) &&
-		!kvm_event_needs_reinjection(vcpu) &&
 		kvm_cpu_accept_dm_intr(vcpu);
 }