Message ID | 20240106083346.29180-1-dongli.zhang@oracle.com |
---|---|
State | New |
Headers | show |
Series | [1/1] KVM: selftests: add kvmclock drift test | expand |
On Sat, 2024-01-06 at 00:33 -0800, Dongli Zhang wrote: > There is kvmclock drift issue during the vCPU hotplug. It has been fixed by > the commit c52ffadc65e2 ("KVM: x86: Don't unnecessarily force masterclock > update on vCPU hotplug"). > > This is to add the test to verify if the master clock is updated when we > write 0 to MSR_IA32_TSC from the host side. > > Here is the usage example on the KVM with the bugfix reverted. > > $ ./kvm_clock_drift -v -p 5 > kvmclock based on old pvclock_vcpu_time_info: 5012221999 > version: 2 > tsc_timestamp: 3277968 > system_time: 11849519 > tsc_to_system_mul: 2152530255 > tsc_shift: 0 > flags: 1 > > kvmclock based on new pvclock_vcpu_time_info: 5012222411 > version: 4 > tsc_timestamp: 9980576184 > system_time: 5012222411 > tsc_to_system_mul: 2152530255 > tsc_shift: 0 > flags: 1 > > ==== Test Assertion Failure ==== > x86_64/kvm_clock_drift.c:216: clock_old == clock_new > pid=14257 tid=14257 errno=4 - Interrupted system call > 1 0x000000000040277b: main at kvm_clock_drift.c:216 > 2 0x00007f7766fa7e44: ?? ??:0 > 3 0x000000000040286d: _start at ??:? > kvmclock drift detected, old=5012221999, new=5012222411 > > Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> We should extend this to cover live update — it should create a *second* KVM, migrate the guest including its clock information to that, and validate that the kvmclock information still doesn't change. Ideally during a leap second.
On Sat, Jan 06, 2024, Dongli Zhang wrote: > diff --git a/tools/testing/selftests/kvm/x86_64/kvm_clock_drift.c b/tools/testing/selftests/kvm/x86_64/kvm_clock_drift.c > new file mode 100644 > index 000000000000..324f0dbc5762 > --- /dev/null > +++ b/tools/testing/selftests/kvm/x86_64/kvm_clock_drift.c For better or worse, our selftests naming adds a _test at the end. However, should we call this kvm_clock_hotplug_test.c? That way the file level comment isn't necessary (those these always become stale), and David's live update suggestion won't conflict. Or if the live update code fits nicely in this test, then kvm_clock_drift_test.c is probably a good name. > @@ -0,0 +1,223 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * The kvmclock drift test. Emulate vCPU hotplug and online to verify if > + * there is kvmclock drift. > + * > + * Adapted from steal_time.c This is really uninteresting. > + * Copyright (C) 2020, Red Hat, Inc. > + * Copyright (C) 2024 Oracle and/or its affiliates. > + */ > + > +#include <asm/kvm_para.h> > +#include <asm/pvclock.h> > +#include <asm/pvclock-abi.h> > +#include <sys/stat.h> > + > +#include "kvm_util.h" > +#include "processor.h" > + > +#define NR_VCPUS 2 > +#define NR_SLOTS 2 > +#define KVMCLOCK_SIZE sizeof(struct pvclock_vcpu_time_info) > +/* > + * KVMCLOCK_GPA is identity mapped > + */ > +#define KVMCLOCK_GPA (1 << 30) > + > +static uint64_t kvmclock_gpa = KVMCLOCK_GPA; Why hardcode an address? Can't the test have a global "struct pvclock_vcpu_time_info" whose address is then queried by the host by reading the MSR? Actually, can't the pvclock reads and asserts be done in the guest? Which is arguably better since the guest observing drift (or not) is what we really care about. Then the host side of the test never needs to resolve a host virtual address for the pvclock_vcpu_time_info structure. > +static void guest_code(int cpu) > +{ > + struct pvclock_vcpu_time_info *kvmclock; > + > + /* > + * vCPU#0 is to detect the change of pvclock_vcpu_time_info Just vCPU0, which is the usually terminology in KVM land. > + */ /* Single line comments should look like this. */ > + if (cpu == 0) { Rather than switch inside the guest code, just use vcpu_arch_set_entry_point() to point vCPU1 at a different function entirely. Then most of the commetns go away, and the code is generally much easier to read. > + GUEST_SYNC(0); > + > + kvmclock = (struct pvclock_vcpu_time_info *) kvmclock_gpa; > + wrmsr(MSR_KVM_SYSTEM_TIME_NEW, kvmclock_gpa | KVM_MSR_ENABLED); > + > + /* > + * Backup the pvclock_vcpu_time_info before vCPU#1 hotplug > + */ > + kvmclock[1] = kvmclock[0]; > + > + GUEST_SYNC(2); > + /* > + * Enter the guest to update pvclock_vcpu_time_info > + */ > + GUEST_SYNC(4); > + } > + > + /* > + * vCPU#1 is to emulate the vCPU hotplug > + */ > + if (cpu == 1) { > + GUEST_SYNC(1); > + /* > + * This is after the host side MSR_IA32_TSC > + */ Rather than add comments, use an enum to describe the stages: enum { SYNC_VCPU0_??? SYNC_VCPU1_??? SYNC_VCPU0_??? SYNC_VCPU1_??? } That said, why does this test "emulate" hotplug? Why not literally hotplug a vCPU? It's probably _less_ code, and the annoying NR_VCPUS #define goes away too, because you can do: vm = vm_create_with_one_vcpu(&vcpu0, vcpu0_code); <setup kvmclock> vcpu1 = vm_vcpu_add(vm, 1, vcpu1_code); Then you probably only need one sync, from vCPU to alert the host that kvmclock is setup. > + GUEST_SYNC(3); > + } > +} > +#define CLOCKSOURCE_PATH "/sys/devices/system/clocksource/clocksource0/current_clocksource" > + > +static void check_clocksource(void) > +{ > + char *clk_name; > + struct stat st; > + FILE *fp; > + > + fp = fopen(CLOCKSOURCE_PATH, "r"); > + if (!fp) { > + pr_info("failed to open clocksource file: %d; assuming TSC.\n", > + errno); > + return; > + } > + > + if (fstat(fileno(fp), &st)) { > + pr_info("failed to stat clocksource file: %d; assuming TSC.\n", > + errno); > + goto out; > + } > + > + clk_name = malloc(st.st_size); > + TEST_ASSERT(clk_name, "failed to allocate buffer to read file\n"); > + > + if (!fgets(clk_name, st.st_size, fp)) { > + pr_info("failed to read clocksource file: %d; assuming TSC.\n", > + ferror(fp)); > + goto out; > + } > + > + TEST_ASSERT(!strncmp(clk_name, "tsc\n", st.st_size), TEST_REQUIRE, not TEST_ASSERT, i.e. skip the test if the clocksource isn't supported, don't cause a failure. > + "clocksource not supported: %s", clk_name); > +out: > + fclose(fp); > +} This can all be replaced with: TEST_REQUIRE(sys_clocksource_is_based_on_tsc());
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 4412b42d95de..c665d0d8d348 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -84,6 +84,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/hyperv_features TEST_GEN_PROGS_x86_64 += x86_64/hyperv_ipi TEST_GEN_PROGS_x86_64 += x86_64/hyperv_svm_test TEST_GEN_PROGS_x86_64 += x86_64/hyperv_tlb_flush +TEST_GEN_PROGS_x86_64 += x86_64/kvm_clock_drift TEST_GEN_PROGS_x86_64 += x86_64/kvm_clock_test TEST_GEN_PROGS_x86_64 += x86_64/kvm_pv_test TEST_GEN_PROGS_x86_64 += x86_64/monitor_mwait_test diff --git a/tools/testing/selftests/kvm/x86_64/kvm_clock_drift.c b/tools/testing/selftests/kvm/x86_64/kvm_clock_drift.c new file mode 100644 index 000000000000..324f0dbc5762 --- /dev/null +++ b/tools/testing/selftests/kvm/x86_64/kvm_clock_drift.c @@ -0,0 +1,223 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * The kvmclock drift test. Emulate vCPU hotplug and online to verify if + * there is kvmclock drift. + * + * Adapted from steal_time.c + * + * Copyright (C) 2020, Red Hat, Inc. + * Copyright (C) 2024 Oracle and/or its affiliates. + */ + +#include <asm/kvm_para.h> +#include <asm/pvclock.h> +#include <asm/pvclock-abi.h> +#include <sys/stat.h> + +#include "kvm_util.h" +#include "processor.h" + +#define NR_VCPUS 2 +#define NR_SLOTS 2 +#define KVMCLOCK_SIZE sizeof(struct pvclock_vcpu_time_info) +/* + * KVMCLOCK_GPA is identity mapped + */ +#define KVMCLOCK_GPA (1 << 30) + +static uint64_t kvmclock_gpa = KVMCLOCK_GPA; + +static void guest_code(int cpu) +{ + struct pvclock_vcpu_time_info *kvmclock; + + /* + * vCPU#0 is to detect the change of pvclock_vcpu_time_info + */ + if (cpu == 0) { + GUEST_SYNC(0); + + kvmclock = (struct pvclock_vcpu_time_info *) kvmclock_gpa; + wrmsr(MSR_KVM_SYSTEM_TIME_NEW, kvmclock_gpa | KVM_MSR_ENABLED); + + /* + * Backup the pvclock_vcpu_time_info before vCPU#1 hotplug + */ + kvmclock[1] = kvmclock[0]; + + GUEST_SYNC(2); + /* + * Enter the guest to update pvclock_vcpu_time_info + */ + GUEST_SYNC(4); + } + + /* + * vCPU#1 is to emulate the vCPU hotplug + */ + if (cpu == 1) { + GUEST_SYNC(1); + /* + * This is after the host side MSR_IA32_TSC + */ + GUEST_SYNC(3); + } +} + +static void run_vcpu(struct kvm_vcpu *vcpu) +{ + struct ucall uc; + + vcpu_run(vcpu); + + switch (get_ucall(vcpu, &uc)) { + case UCALL_SYNC: + case UCALL_DONE: + break; + case UCALL_ABORT: + REPORT_GUEST_ASSERT(uc); + default: + TEST_ASSERT(false, "Unexpected exit: %s", + exit_reason_str(vcpu->run->exit_reason)); + } +} + +static void kvmclock_dump(struct pvclock_vcpu_time_info *kvmclock) +{ + pr_info(" version: %u\n", kvmclock->version); + pr_info(" tsc_timestamp: %lu\n", kvmclock->tsc_timestamp); + pr_info(" system_time: %lu\n", kvmclock->system_time); + pr_info(" tsc_to_system_mul: %u\n", kvmclock->tsc_to_system_mul); + pr_info(" tsc_shift: %d\n", kvmclock->tsc_shift); + pr_info(" flags: %u\n", kvmclock->flags); + pr_info("\n"); +} + +#define CLOCKSOURCE_PATH "/sys/devices/system/clocksource/clocksource0/current_clocksource" + +static void check_clocksource(void) +{ + char *clk_name; + struct stat st; + FILE *fp; + + fp = fopen(CLOCKSOURCE_PATH, "r"); + if (!fp) { + pr_info("failed to open clocksource file: %d; assuming TSC.\n", + errno); + return; + } + + if (fstat(fileno(fp), &st)) { + pr_info("failed to stat clocksource file: %d; assuming TSC.\n", + errno); + goto out; + } + + clk_name = malloc(st.st_size); + TEST_ASSERT(clk_name, "failed to allocate buffer to read file\n"); + + if (!fgets(clk_name, st.st_size, fp)) { + pr_info("failed to read clocksource file: %d; assuming TSC.\n", + ferror(fp)); + goto out; + } + + TEST_ASSERT(!strncmp(clk_name, "tsc\n", st.st_size), + "clocksource not supported: %s", clk_name); +out: + fclose(fp); +} + +int main(int argc, char *argv[]) +{ + struct pvclock_vcpu_time_info *kvmclock; + struct kvm_vcpu *vcpus[NR_VCPUS]; + uint64_t clock_old, clock_new; + bool verbose = false; + unsigned int gpages; + struct kvm_vm *vm; + int period = 2; + uint64_t tsc; + int opt; + + check_clocksource(); + + while ((opt = getopt(argc, argv, "p:vh")) != -1) { + switch (opt) { + case 'p': + period = atoi_positive("The period (seconds) between vCPU hotplug", + optarg); + break; + case 'v': + verbose = true; + break; + case 'h': + default: + pr_info("usage: %s [-p period (seconds)] [-v]\n", argv[0]); + exit(1); + } + } + + vm = vm_create_with_vcpus(NR_VCPUS, guest_code, vcpus); + gpages = vm_calc_num_guest_pages(VM_MODE_DEFAULT, + KVMCLOCK_SIZE * NR_SLOTS); + vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, + KVMCLOCK_GPA, 1, gpages, 0); + virt_map(vm, KVMCLOCK_GPA, KVMCLOCK_GPA, gpages); + + vcpu_args_set(vcpus[0], 1, 0); + vcpu_args_set(vcpus[1], 1, 1); + + /* + * Run vCPU#0 and vCPU#1 to update both pvclock_vcpu_time_info and + * master clock + */ + run_vcpu(vcpus[0]); + run_vcpu(vcpus[1]); + + /* + * Run vCPU#0 to backup the current pvclock_vcpu_time_info + */ + run_vcpu(vcpus[0]); + + sleep(period); + + /* + * Emulate the hotplug of vCPU#1 + */ + vcpu_set_msr(vcpus[1], MSR_IA32_TSC, 0); + + /* + * Emulate the online of vCPU#1 + */ + run_vcpu(vcpus[1]); + + /* + * Run vCPU#0 to backup the new pvclock_vcpu_time_info to detect + * if there is any change or kvmclock drift + */ + run_vcpu(vcpus[0]); + + kvmclock = addr_gva2hva(vm, kvmclock_gpa); + tsc = kvmclock[0].tsc_timestamp; + clock_old = __pvclock_read_cycles(&kvmclock[1], tsc); + clock_new = __pvclock_read_cycles(&kvmclock[0], tsc); + + if (verbose) { + pr_info("kvmclock based on old pvclock_vcpu_time_info: %lu\n", + clock_old); + kvmclock_dump(&kvmclock[1]); + pr_info("kvmclock based on new pvclock_vcpu_time_info: %lu\n", + clock_new); + kvmclock_dump(&kvmclock[0]); + } + + TEST_ASSERT(clock_old == clock_new, + "kvmclock drift detected, old=%lu, new=%lu", + clock_old, clock_new); + + kvm_vm_free(vm); + + return 0; +}
There is kvmclock drift issue during the vCPU hotplug. It has been fixed by the commit c52ffadc65e2 ("KVM: x86: Don't unnecessarily force masterclock update on vCPU hotplug"). This is to add the test to verify if the master clock is updated when we write 0 to MSR_IA32_TSC from the host side. Here is the usage example on the KVM with the bugfix reverted. $ ./kvm_clock_drift -v -p 5 kvmclock based on old pvclock_vcpu_time_info: 5012221999 version: 2 tsc_timestamp: 3277968 system_time: 11849519 tsc_to_system_mul: 2152530255 tsc_shift: 0 flags: 1 kvmclock based on new pvclock_vcpu_time_info: 5012222411 version: 4 tsc_timestamp: 9980576184 system_time: 5012222411 tsc_to_system_mul: 2152530255 tsc_shift: 0 flags: 1 ==== Test Assertion Failure ==== x86_64/kvm_clock_drift.c:216: clock_old == clock_new pid=14257 tid=14257 errno=4 - Interrupted system call 1 0x000000000040277b: main at kvm_clock_drift.c:216 2 0x00007f7766fa7e44: ?? ??:0 3 0x000000000040286d: _start at ??:? kvmclock drift detected, old=5012221999, new=5012222411 Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> --- tools/testing/selftests/kvm/Makefile | 1 + .../selftests/kvm/x86_64/kvm_clock_drift.c | 223 ++++++++++++++++++ 2 files changed, 224 insertions(+) create mode 100644 tools/testing/selftests/kvm/x86_64/kvm_clock_drift.c base-commit: f2a3fb7234e52f72ff4a38364dbf639cf4c7d6c6