[PART1,RFC,v2,10/10] svm: Manage vcpu load/unload when enable AVIC

Message ID 1457124368-2025-11-git-send-email-Suravee.Suthikulpanit@amd.com
State New

Commit Message

Suthikulpanit, Suravee March 4, 2016, 8:46 p.m. UTC
From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>


When a vcpu is loaded onto or unloaded from a physical core, we need to
update information in the Physical APIC-ID table accordingly.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

---
 arch/x86/kvm/svm.c | 146 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 146 insertions(+)

-- 
1.9.1

Comments

Suthikulpanit, Suravee March 14, 2016, 11:48 a.m. UTC | #1
Hi,

On 03/10/2016 04:46 AM, Radim Krčmář wrote:
> 2016-03-04 14:46-0600, Suravee Suthikulpanit:
>> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>
>> When a vcpu is loaded onto or unloaded from a physical core, we need to
>> update information in the Physical APIC-ID table accordingly.
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> ---
>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> @@ -1508,6 +1510,146 @@ static int avic_vcpu_init(struct kvm *kvm, struct vcpu_svm *svm, int id)
>> +static inline int
>> +avic_update_iommu(struct kvm_vcpu *vcpu, int cpu, phys_addr_t pa, bool r)
>> +{
>> +	if (!kvm_arch_has_assigned_device(vcpu->kvm))
>> +		return 0;
>> +
>> +	/* TODO: We will hook up with IOMMU API at later time */
>
> (It'd be best to drop avic_update_iommu from this series and introduce
>   it when merging iommu support.)
>

I just kept it there to simplify merging between part 1 and part 2, which
I have been testing together. I didn't think it would cause much confusion,
but if you think it might, I can drop it for now.

>> +	return 0;
>> +}
>> +
>> +static int avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu, bool is_load)
>
> This function does a lot and there is only one thing that must be done
> in svm_vcpu_load:  change host physical APIC ID if the CPU changed.
> The rest can be done elsewhere:
>   - is_running when blocking.

I added the logic here to track whether is_running is set when unloading,
since I noticed the case where a vCPU is busy doing some work for the
guest, then gets unloaded and later loaded again without
blocking/unblocking. So the running state should be preserved across the
unloaded period, and the flag should be restored when the vCPU is loaded
again.

>   - kb_pg_ptr when the pointer changes = only on initialization?

The reason I put this here is mainly that it is a convenient place to set
the vAPIC backing page address, since we already have to set up the host
physical APIC ID. I guess I could try setting this separately during vcpu
create, but I don't think it would make much difference.

>   - valid when the kb_pg_ptr is valid = always for existing VCPUs?

According to the programming manual, the valid bit is set when we set the
host physical APIC ID. However, in theory, the vAPIC backing page address
is what the AVIC hardware needs in order to set bits in the IRR register,
while the host physical APIC ID is needed for sending a doorbell to the
target physical core. So I would actually interpret the valid bit as
something that should be set once the vAPIC backing page address is set.

In the current implementation, the valid bit is set during vcpu load, but
is not unset during unload. This reflects the interpretation described
above.

If we decide to move the setting of the vAPIC backing page address to the
vcpu create phase, we would set the valid bit at that point as well.

Please let me know if you think differently.
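For example, something along these lines could set both at vcpu create
time (a minimal sketch, not part of this patch; the helper name
avic_init_bk_page_entry and its placement in the create path are my
assumptions, while avic_get_phy_ait_entry() and the entry layout come
from this series):

static int avic_init_bk_page_entry(struct kvm_vcpu *vcpu)
{
	struct vcpu_svm *svm = to_svm(vcpu);
	struct svm_avic_phy_ait_entry *entry, new_entry;
	phys_addr_t pa = PFN_PHYS(page_to_pfn(svm->avic_bk_page));

	if (vcpu->vcpu_id >= AVIC_PHY_APIC_ID_MAX)
		return -EINVAL;

	entry = avic_get_phy_ait_entry(vcpu, vcpu->vcpu_id);
	if (!entry)
		return -EINVAL;

	new_entry = READ_ONCE(*entry);
	new_entry.bk_pg_ptr = (pa >> 12) & 0xffffffffff;
	/* Valid as soon as the backing page address is known. */
	new_entry.valid = 1;
	WRITE_ONCE(*entry, new_entry);

	return 0;
}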

>> +static int avic_set_running(struct kvm_vcpu *vcpu, bool is_run)
>
> avic_set_running should get address of the entry and write is_run to it.
> (No need to touch avic_bk_page, h_phy_apic_id or do bound checks.)
>
> I think you can cache the physical apic id table entry in *vcpu, so both
> functions are going to be very simple.
>

I was actually thinking about doing this. Let me try to come up with the
logic to cache this.
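Roughly, I imagine something like this (a sketch of the caching idea,
untested; the field name avic_phy_ait_entry is my assumption):

struct vcpu_svm {
	/* ... existing fields ... */
	/* Cached pointer to this vcpu's Physical APIC-ID table entry,
	 * resolved and bound-checked once at vcpu create time. */
	struct svm_avic_phy_ait_entry *avic_phy_ait_entry;
};

static int avic_set_running(struct kvm_vcpu *vcpu, bool is_run)
{
	struct vcpu_svm *svm = to_svm(vcpu);
	struct svm_avic_phy_ait_entry entry;

	if (!avic || !svm->avic_phy_ait_entry)
		return 0;

	entry = READ_ONCE(*svm->avic_phy_ait_entry);
	entry.is_running = is_run;
	WRITE_ONCE(*svm->avic_phy_ait_entry, entry);

	return 0;
}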

Thanks,
Suravee
Suthikulpanit, Suravee March 14, 2016, 11:58 a.m. UTC | #2
Hi,

On 03/10/2016 09:01 PM, Radim Krčmář wrote:
> Well, we haven't reached an agreement on is_running yet.  The situation:
> if we don't unset vcpu1.is_running when vcpu1 is scheduled out and vcpu2
> gets scheduled on vcpu1's physical core, then vcpu2 would receive a
> doorbell intended for vcpu1.

That's why, in v2, I added the logic to check whether the is_running bit
is set for the current vcpu (e.g. vcpu1) when it is unloaded, and then to
restore the bit during a later load if it was set during the previous
unload. This way, when we load a vcpu (e.g. vcpu2), its is_running bit is
set as it was before that vcpu was unloaded.

> We'd like to keep is_running set when there is no reason to vmexit, but
> not if a guest can negatively affect other guests.

I'm not sure how this can affect other guests.

> How does receiving a stray doorbell affect the performance?

As far as I know, the doorbell only affects the CPU during vmrun. Once
received, it will check the IRR in the vAPIC backing page. So, if the IRR
bit is not set, the effect should be rather minimal.

Thanks,
Suravee

Patch

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 5142861..ebcade0 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -35,6 +35,7 @@ 
 #include <linux/trace_events.h>
 #include <linux/slab.h>
 
+#include <asm/apic.h>
 #include <asm/perf_event.h>
 #include <asm/tlbflush.h>
 #include <asm/desc.h>
@@ -175,6 +176,7 @@  struct vcpu_svm {
 
 	struct page *avic_bk_page;
 	void *in_kernel_lapic_regs;
+	bool avic_was_running;
 };
 
 struct __attribute__ ((__packed__))
@@ -1508,6 +1510,146 @@  static int avic_vcpu_init(struct kvm *kvm, struct vcpu_svm *svm, int id)
 	return 0;
 }
 
+static inline int
+avic_update_iommu(struct kvm_vcpu *vcpu, int cpu, phys_addr_t pa, bool r)
+{
+	if (!kvm_arch_has_assigned_device(vcpu->kvm))
+		return 0;
+
+	/* TODO: We will hook up with IOMMU API at later time */
+	return 0;
+}
+
+static int avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu, bool is_load)
+{
+	int g_phy_apic_id, h_phy_apic_id;
+	struct svm_avic_phy_ait_entry *entry, new_entry;
+	struct vcpu_svm *svm = to_svm(vcpu);
+	int ret = 0;
+
+	if (!avic)
+		return 0;
+
+	if (!svm)
+		return -EINVAL;
+
+	/* Note: APIC ID = 0xff is used for broadcast.
+	 *       APIC ID > 0xff is reserved.
+	 */
+	g_phy_apic_id = vcpu->vcpu_id;
+	h_phy_apic_id = __default_cpu_present_to_apicid(cpu);
+
+	if ((g_phy_apic_id >= AVIC_PHY_APIC_ID_MAX) ||
+	    (h_phy_apic_id >= AVIC_PHY_APIC_ID_MAX))
+		return -EINVAL;
+
+	entry = avic_get_phy_ait_entry(vcpu, g_phy_apic_id);
+	if (!entry)
+		return -EINVAL;
+
+	if (is_load) {
+		/* Handle vcpu load */
+		phys_addr_t pa = PFN_PHYS(page_to_pfn(svm->avic_bk_page));
+
+		new_entry = READ_ONCE(*entry);
+
+		BUG_ON(new_entry.is_running);
+
+		new_entry.bk_pg_ptr = (pa >> 12) & 0xffffffffff;
+		new_entry.valid = 1;
+		new_entry.host_phy_apic_id = h_phy_apic_id;
+
+		if (svm->avic_was_running) {
+			/**
+			 * Restore AVIC running flag if it was set during
+			 * vcpu unload.
+			 */
+			new_entry.is_running = 1;
+		}
+		ret = avic_update_iommu(vcpu, h_phy_apic_id, pa,
+					   svm->avic_was_running);
+		WRITE_ONCE(*entry, new_entry);
+
+	} else {
+		/* Handle vcpu unload */
+		new_entry = READ_ONCE(*entry);
+		if (new_entry.is_running) {
+			phys_addr_t pa = PFN_PHYS(page_to_pfn(svm->avic_bk_page));
+
+			/**
+			 * This handles the case when the vcpu is scheduled out
+			 * and has not yet called blocking. We save the
+			 * AVIC running flag so that we can restore it later.
+			 */
+			svm->avic_was_running = true;
+
+			/**
+			 * We need to also clear the AVIC running flag
+			 * and communicate the changes to IOMMU.
+			 */
+			ret = avic_update_iommu(vcpu, h_phy_apic_id, pa, 0);
+
+			new_entry.is_running = 0;
+			WRITE_ONCE(*entry, new_entry);
+		} else {
+			svm->avic_was_running = false;
+		}
+	}
+
+	return ret;
+}
+
+/**
+ * This function is called during VCPU halt/unhalt.
+ */
+static int avic_set_running(struct kvm_vcpu *vcpu, bool is_run)
+{
+	int ret = 0;
+	int g_phy_apic_id, h_phy_apic_id;
+	struct svm_avic_phy_ait_entry *entry, new_entry;
+	struct vcpu_svm *svm = to_svm(vcpu);
+	phys_addr_t pa = PFN_PHYS(page_to_pfn(svm->avic_bk_page));
+
+	if (!avic)
+		return 0;
+
+	/* Note: APIC ID = 0xff is used for broadcast.
+	 *       APIC ID > 0xff is reserved.
+	 */
+	g_phy_apic_id = vcpu->vcpu_id;
+	h_phy_apic_id = __default_cpu_present_to_apicid(vcpu->cpu);
+
+	if ((g_phy_apic_id >= AVIC_PHY_APIC_ID_MAX) ||
+	    (h_phy_apic_id >= AVIC_PHY_APIC_ID_MAX))
+		return -EINVAL;
+
+	entry = avic_get_phy_ait_entry(vcpu, g_phy_apic_id);
+	if (!entry)
+		return -EINVAL;
+
+	if (is_run) {
+		/**
+		 * Handle vcpu unblocking after HLT
+		 */
+		new_entry = READ_ONCE(*entry);
+		new_entry.is_running = is_run;
+		WRITE_ONCE(*entry, new_entry);
+
+		ret = avic_update_iommu(vcpu, h_phy_apic_id, pa, is_run);
+	} else {
+		/**
+		 * Handle vcpu blocking due to HLT
+		 */
+		ret = avic_update_iommu(vcpu, h_phy_apic_id, pa, is_run);
+
+		new_entry = READ_ONCE(*entry);
+		new_entry.is_running = is_run;
+		WRITE_ONCE(*entry, new_entry);
+	}
+
+	return ret;
+}
+
 static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -1648,6 +1790,8 @@  static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	/* This assumes that the kernel never uses MSR_TSC_AUX */
 	if (static_cpu_has(X86_FEATURE_RDTSCP))
 		wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
+
+	avic_vcpu_load(vcpu, cpu, true);
 }
 
 static void svm_vcpu_put(struct kvm_vcpu *vcpu)
@@ -1655,6 +1799,8 @@  static void svm_vcpu_put(struct kvm_vcpu *vcpu)
 	struct vcpu_svm *svm = to_svm(vcpu);
 	int i;
 
+	avic_vcpu_load(vcpu, 0, false);
+
 	++vcpu->stat.host_state_reload;
 	kvm_load_ldt(svm->host.ldt);
 #ifdef CONFIG_X86_64