From patchwork Mon Sep 25 14:48:54 2023
X-Patchwork-Submitter: Alex Bennée
X-Patchwork-Id: 726043
From: Alex Bennée <alex.bennee@linaro.org>
To: qemu-devel@nongnu.org
Cc: Alistair Francis, Cédric Le Goater, Marcin Juszkiewicz, John Snow,
 libvir-list@redhat.com, Marc-André Lureau, qemu-s390x@nongnu.org,
 Song Gao, Daniel Henrique Barboza, Marcel Apfelbaum, Bastian Koppelmann,
 Liu Zhiwei, Weiwei Li, Nicholas Piggin, Radoslaw Biernacki,
 Daniel P. Berrangé, Eduardo Habkost, Cleber Rosa, Paolo Bonzini,
 Mahmoud Mandour, Philippe Mathieu-Daudé, Thomas Huth,
 Wainer dos Santos Moschetta, Richard Henderson, Bin Meng,
 Alexandre Iooss, Xiaojuan Yang, qemu-ppc@nongnu.org, David Hildenbrand,
 Alex Bennée, Yanan Wang, Peter Maydell, qemu-riscv@nongnu.org,
 qemu-arm@nongnu.org, Palmer Dabbelt, Ilya Leoshkevich, Laurent Vivier,
 Yoshinori Sato, Leif Lindholm, Beraldo Leal
Subject: [RFC PATCH 31/31] contrib/plugins: add iops plugin example for cost modelling
Date: Mon, 25 Sep 2023 15:48:54 +0100
Message-Id: <20230925144854.1872513-32-alex.bennee@linaro.org>
In-Reply-To: <20230925144854.1872513-1-alex.bennee@linaro.org>
References: <20230925144854.1872513-1-alex.bennee@linaro.org>

This plugin uses the new time control interface to make decisions
about the state of time during the emulation. The algorithm is
currently very simple.
The user specifies an iops rate which applies per core. If the core
runs ahead of its allocated execution time the plugin sleeps for a bit
to let real time catch up. Either way time is updated for the
emulation as a function of total executed instructions with some
adjustments for cores that idle.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230519170454.2353945-9-alex.bennee@linaro.org>

---
v2
  - fix various style issues
---
 contrib/plugins/iops.c   | 261 +++++++++++++++++++++++++++++++++++++++
 contrib/plugins/Makefile |   1 +
 2 files changed, 262 insertions(+)
 create mode 100644 contrib/plugins/iops.c

diff --git a/contrib/plugins/iops.c b/contrib/plugins/iops.c
new file mode 100644
index 0000000000..6f8baca6f7
--- /dev/null
+++ b/contrib/plugins/iops.c
@@ -0,0 +1,261 @@
+/*
+ * iops rate limiting plugin.
+ *
+ * This plugin can be used to restrict the execution of a system to a
+ * particular number of Instructions Per Second (IOPS). This controls
+ * time as seen by the guest so while wall-clock time may be longer
+ * from the guest's point of view time will pass at the normal rate.
+ *
+ * This uses the new plugin API which allows the plugin to control
+ * system time.
+ *
+ * Copyright (c) 2023 Linaro Ltd
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include <stdio.h>
+#include <glib.h>
+#include <qemu-plugin.h>
+
+QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;
+
+#define SLICES 10 /* the number of slices per second we compute delay */
+
+static GMutex global_state_lock;
+
+static uint64_t iops = 1000000; /* iops rate, per core, per second */
+static uint64_t current_ticks; /* current global ticks */
+static uint64_t next_check; /* the next checkpoint for time */
+static bool precise_execution; /* count every instruction */
+
+static int64_t systime_at_start; /* time we started the first vCPU */
+
+static const uint64_t nsec_per_sec = 1000000000;
+static const void *time_handle;
+
+/*
+ * We need to track the number of instructions each vCPU has executed
+ * as well as what its current state is. We need to account for time
+ * passing while a vCPU is idle.
+ */
+
+typedef enum {
+    UNKNOWN = 0,
+    CREATED,
+    EXECUTING,
+    IDLE,
+    FINISHED
+} vCPUState;
+
+typedef struct {
+    /* pointer to vcpu counter entry */
+    uint64_t *counter;
+    vCPUState state;
+    /* timestamp when vCPU entered state */
+    uint64_t state_time;
+    /* number of ns vCPU was idle */
+    uint64_t total_idle;
+} vCPUTime;
+
+GArray *vcpus;
+uint64_t *vcpu_counters;
+
+/*
+ * Get the vcpu structure for this vCPU. We don't do any locking here
+ * as only one vCPU will ever access its own structure.
+ */
+static vCPUTime *get_vcpu(int cpu_index)
+{
+    return &g_array_index(vcpus, vCPUTime, cpu_index);
+}
+
+/*
+ * When emulation is running faster than real time this is the point
+ * we can throttle the execution of a given vCPU. Either way we can
+ * now tell the system to move time forward.
+ */
+static void update_system_time(int64_t vcpu_ticks)
+{
+    int64_t now = g_get_real_time();
+    int64_t real_runtime_ns = (now - systime_at_start) * 1000; /* us -> ns */
+
+    g_mutex_lock(&global_state_lock);
+    /* now we have the lock double check we are fastest */
+    if (vcpu_ticks > next_check) {
+
+        int64_t tick_runtime_ns = (vcpu_ticks / iops) * nsec_per_sec;
+        if (tick_runtime_ns > real_runtime_ns) {
+            int64_t sleep_us = (tick_runtime_ns - real_runtime_ns) / 1000;
+            g_usleep(sleep_us);
+        }
+
+        /* Having slept we can now move the clocks forward */
+        qemu_plugin_update_ns(time_handle, vcpu_ticks);
+        current_ticks = vcpu_ticks;
+        next_check += iops / SLICES;
+    }
+    g_mutex_unlock(&global_state_lock);
+}
+
+/*
+ * State tracking
+ */
+static void vcpu_init(qemu_plugin_id_t id, unsigned int cpu_index)
+{
+    vCPUTime *vcpu = get_vcpu(cpu_index);
+    vcpu->state = CREATED;
+    vcpu->state_time = *vcpu->counter;
+
+    g_mutex_lock(&global_state_lock);
+    if (!systime_at_start) {
+        systime_at_start = g_get_real_time();
+    }
+    g_mutex_unlock(&global_state_lock);
+}
+
+static void vcpu_idle(qemu_plugin_id_t id, unsigned int cpu_index)
+{
+    vCPUTime *vcpu = get_vcpu(cpu_index);
+    vcpu->state = IDLE;
+    vcpu->state_time = *vcpu->counter;
+
+    /* handle when we are the last vcpu to sleep here */
+}
+
+static void vcpu_resume(qemu_plugin_id_t id, unsigned int cpu_index)
+{
+    vCPUTime *vcpu = get_vcpu(cpu_index);
+
+    /*
+     * Now we need to reset counter to something approximating the
+     * current time, however we only update current_ticks when a block
+     * exceeds next_check. If the vCPU has been asleep for a while this
+     * will probably do, otherwise let's pick somewhere between
+     * current_ticks and the next_check value.
+     */
+    if (vcpu->state_time < current_ticks) {
+        *vcpu->counter = current_ticks;
+    } else {
+        int64_t window = next_check - vcpu->state_time;
+        *vcpu->counter = next_check - (window / 2);
+    }
+
+    vcpu->state = EXECUTING;
+    vcpu->state_time = *vcpu->counter;
+}
+
+static void vcpu_exit(qemu_plugin_id_t id, unsigned int cpu_index)
+{
+    vCPUTime *vcpu = get_vcpu(cpu_index);
+    vcpu->state = FINISHED;
+    vcpu->state_time = *vcpu->counter;
+}
+
+/*
+ * tb exec
+ */
+static void vcpu_tb_exec(unsigned int cpu_index, void *udata)
+{
+    vCPUTime *vcpu = get_vcpu(cpu_index);
+    uint64_t count = *vcpu->counter;
+
+    count += GPOINTER_TO_UINT(udata);
+    *vcpu->counter = count; /* store the running total back */
+    if (count >= next_check) {
+        update_system_time(count);
+    }
+}
+
+/*
+ * We have two choices at translation time. In imprecise mode we just
+ * install a tb execution callback with the total number of
+ * instructions in the block. This ignores any partial execution
+ * effects but is reasonably fast. In precise mode we increment a
+ * per-vCPU counter for every execution.
+ */
+
+static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
+{
+    size_t n_insns = qemu_plugin_tb_n_insns(tb);
+    qemu_plugin_register_vcpu_tb_exec_cb(tb, vcpu_tb_exec,
+                                         QEMU_PLUGIN_CB_NO_REGS,
+                                         GUINT_TO_POINTER(n_insns));
+}
+
+/**
+ * Install the plugin
+ */
+QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
+                                           const qemu_info_t *info, int argc,
+                                           char **argv)
+{
+    /* This plugin only makes sense for system emulation */
+    if (!info->system_emulation) {
+        fprintf(stderr, "iops plugin only works with system emulation\n");
+        return -1;
+    }
+
+    for (int i = 0; i < argc; i++) {
+        char *opt = argv[i];
+        g_auto(GStrv) tokens = g_strsplit(opt, "=", 2);
+        if (g_strcmp0(tokens[0], "iops") == 0) {
+            iops = g_ascii_strtoull(tokens[1], NULL, 10);
+            if (!iops && errno) {
+                fprintf(stderr, "%s: couldn't parse %s (%s)\n",
+                        __func__, tokens[1], g_strerror(errno));
+                return -1;
+            }
+
+        } else if (g_strcmp0(tokens[0], "precise") == 0) {
+            if (!qemu_plugin_bool_parse(tokens[0], tokens[1],
+                                        &precise_execution)) {
+                fprintf(stderr, "boolean argument parsing failed: %s\n", opt);
+                return -1;
+            }
+        } else {
+            fprintf(stderr, "option parsing failed: %s\n", opt);
+            return -1;
+        }
+    }
+
+    /*
+     * Setup the tracking information we need to run.
+     */
+    vcpus = g_array_new(true, true, sizeof(vCPUTime));
+    g_array_set_size(vcpus, info->system.max_vcpus);
+    vcpu_counters = g_malloc0_n(info->system.max_vcpus, sizeof(uint64_t));
+    for (int i = 0; i < info->system.max_vcpus; i++) {
+        vCPUTime *vcpu = get_vcpu(i);
+        vcpu->counter = &vcpu_counters[i];
+    }
+
+    /*
+     * We are going to check the state of time every slice so set the
+     * first check at t0 + iops/SLICES
+     */
+    next_check = iops / SLICES;
+
+    /*
+     * Only one plugin can request time control, if we don't get the
+     * handle there isn't much we can do.
+     */
+    time_handle = qemu_plugin_request_time_control();
+    if (!time_handle) {
+        fprintf(stderr, "%s: not given permission to control time\n", __func__);
+        return -1;
+    }
+
+    /*
+     * To track time we need to measure how many instructions each
+     * core is executing as well as when each vcpu enters/leaves the
+     */
+    qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
+
+    qemu_plugin_register_vcpu_init_cb(id, vcpu_init);
+    qemu_plugin_register_vcpu_idle_cb(id, vcpu_idle);
+    qemu_plugin_register_vcpu_resume_cb(id, vcpu_resume);
+    qemu_plugin_register_vcpu_exit_cb(id, vcpu_exit);
+
+    return 0;
+}
diff --git a/contrib/plugins/Makefile b/contrib/plugins/Makefile
index 8ba78c7a32..3f45a46a03 100644
--- a/contrib/plugins/Makefile
+++ b/contrib/plugins/Makefile
@@ -21,6 +21,7 @@ NAMES += lockstep
 NAMES += hwprofile
 NAMES += cache
 NAMES += drcov
+NAMES += iops
 
 SONAMES := $(addsuffix .so,$(addprefix lib,$(NAMES)))
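
Note on the throttle arithmetic in update_system_time(): the following is
a minimal standalone sketch, not part of the patch, of how an executed
instruction count could be turned into guest nanoseconds and then into a
sleep interval. The helper names ticks_to_ns() and throttle_sleep_us()
are hypothetical and exist only for illustration. The sketch assumes the
wall-clock delta is supplied in microseconds, which is the unit GLib's
g_get_real_time() reports, and it keeps sub-second resolution by
splitting the division into whole seconds plus a remainder.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC UINT64_C(1000000000)

/*
 * Hypothetical helper: convert an instruction count into guest
 * nanoseconds at a rate of `iops` instructions per second.
 * Handling the whole seconds and the remainder separately keeps
 * sub-second resolution. The remainder is always < iops, so the
 * multiplication stays within 64 bits for rates below about 1.8e10.
 */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t iops)
{
    assert(iops > 0);
    uint64_t secs = ticks / iops;
    uint64_t rem = ticks % iops;
    return secs * NSEC_PER_SEC + (rem * NSEC_PER_SEC) / iops;
}

/*
 * Hypothetical helper: microseconds to sleep so that wall-clock time
 * catches up with guest time. `real_elapsed_us` is the wall-clock
 * time since the first vCPU started, in microseconds. Returns 0 when
 * the guest is not ahead of real time.
 */
static uint64_t throttle_sleep_us(uint64_t ticks, uint64_t iops,
                                  int64_t real_elapsed_us)
{
    uint64_t guest_ns = ticks_to_ns(ticks, iops);
    uint64_t real_ns = real_elapsed_us > 0
                       ? (uint64_t)real_elapsed_us * 1000
                       : 0;
    return guest_ns > real_ns ? (guest_ns - real_ns) / 1000 : 0;
}
```

With the default rate of 1000000 and SLICES of 10, the first checkpoint
falls at 100000 instructions, which this sketch maps to 100 ms of guest
time. If only 40 ms of wall-clock time has passed at that point, the
sleep interval comes out at 60 ms.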