From patchwork Wed Jul 1 16:11:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Alex_Benn=C3=A9e?= X-Patchwork-Id: 192175 Delivered-To: patch@linaro.org Received: by 2002:a54:3249:0:0:0:0:0 with SMTP id g9csp564318ecs; Wed, 1 Jul 2020 09:14:42 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwvnMhJTeob6StGkQtMQBxCgoryKaYFWE5b7ZBYyRJt1fRWggknVA+0Fn8Jcq9GShT16HYV X-Received: by 2002:a25:2313:: with SMTP id j19mr46159535ybj.144.1593620081925; Wed, 01 Jul 2020 09:14:41 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1593620081; cv=none; d=google.com; s=arc-20160816; b=bW0pUS4b0PRM2b8jTkOoid0rGZZoYKAJgRu6nHGfZjke3vRoT/NokQ5DhAtuGsBEYo 8NYZm0zt8FyB3razZ8dGdqP71qaZ6BzIEqUUcRnN/theNlkdfmq8Uf2aQeafglNYH3NH 2j1iOOLKgGhRxuPxqAyTfyJxEiGBBnx7nBoNAd6fgslNonZa/ArTus0p741CuBWo15iX aqPrAManJ9NGc3bh1Asn+r3qinP1mFI8FAlz+mmHIQTOs2xLQCzC/oZNTd9qSopUncsx MWvBxCpv+wfaGFV7X/4m+uB5VnKCzrgVLkF3FYABqB+4394hnAUD9VHH1N8QFM9Yungu OxSw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=cOgbqI8IByJK4hKbNOF3vXqsSmzYFMqTxbljVvHqKbI=; b=fnkoxy6wU3ldUXRe4Qzd8Jvn3OlKiut9Mq5KdNRh4BIHV8QGHjNYsaUjzKRVzE4S5G p021H1rA8HL4mHsCKdPpD/b0l4mjYqjKbh2CzcN8xbi64DLLVqUyFmJSRbOGPFa5g/N0 igroJeainQJGQGxAU5WvRTVmshsY0m4vjvNbCFeQ6TfuXARRgKzfbjZydUZuqdL13uH0 ITbroAoMBILFW26Uj8GwiYm8ynhHIJ4Bprtuc0WoNSk3RRkv2cHIVLE1ND/3J13ubOEQ Ql3xSZRoVa6aKDyDC8KXrVuojkATummSKzzN3ZpRKOBe+n6/KOwnA7U1vNTZLYrulMYH putA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=A9hRZQvs; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) 
smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id a16si5969360ybs.181.2020.07.01.09.14.41 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Wed, 01 Jul 2020 09:14:41 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=A9hRZQvs; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:46694 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jqfNh-0002Na-94 for patch@linaro.org; Wed, 01 Jul 2020 12:14:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40132) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jqfL9-0007GY-8a for qemu-devel@nongnu.org; Wed, 01 Jul 2020 12:12:03 -0400 Received: from mail-wm1-x342.google.com ([2a00:1450:4864:20::342]:54670) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jqfL6-0006Gu-Gk for qemu-devel@nongnu.org; Wed, 01 Jul 2020 12:12:02 -0400 Received: by mail-wm1-x342.google.com with SMTP id o8so23020231wmh.4 for ; Wed, 01 Jul 2020 09:11:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=cOgbqI8IByJK4hKbNOF3vXqsSmzYFMqTxbljVvHqKbI=; b=A9hRZQvsvnWolLV/1g+zGO5b8ovhFkRVxmmpl4IvSiFyB/LGpj8/oD2KBVcRfiJ7Mn 
Sy9ZEGPjjNd52wQwAL73n3fqvA7EP3SWvsXDL00V+5NSSLMHQBZ7tSHzHxtrZLrIzXDK c+2EQEicc6VS9g/YlvVRSLUq0D/XqhHd9Oyu1u1v54uoeEEjeLXSsxDCTSQo5QBJDbD7 VpGTJw7SG5JpD7tR11NhuXtNaT45BiEPqok4jHGc/YtZ0w9MUHD+Jl9vb03zd5XEmsp0 kM8bFf4nfX+3Rs+8rocEMPry5mPnURFaurg0AX6DQv+LmOCES5NsNajEOd29gDxxcRG5 ohEA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=cOgbqI8IByJK4hKbNOF3vXqsSmzYFMqTxbljVvHqKbI=; b=KUAZ3ZmM+xFziwtAhNYQqe6ve746/0J1vLqele9t/WI3xfdSqoHvf9C6jmUKMZYknQ WYYxxHjYrKcDufHDCIaCEJFLSs75vEOME/PlHWWX2XVxp9q5F5ugvyx5OqQ0A4HQCBeL wBLYBVghc0rRGcFmjDFgfzMFn5R0I2t7JwturYItrqbV6VTmC5M3nEwssWkiRSbu3+LN 9RI4isq9NHOu0B6c/XUP/F+joFAXY9YeeaV3SB/cv5tCWJ0UbI1x+C6/n6c2cBwlV9bX ZTq8Jg9b9oNv8ta79nH3t//It3HiY0bINjYtMmyABr6CrQFKivfzEF3+CbO+AGeu4irA fhNw== X-Gm-Message-State: AOAM531kdzUDSe97mAWK1JeCxxIRxjlubnySSjDAar5FetKC6wzyvWbg dA14gHjaJ9cjPYhrAcUr7FQGuw== X-Received: by 2002:a1c:398b:: with SMTP id g133mr26420177wma.76.1593619917789; Wed, 01 Jul 2020 09:11:57 -0700 (PDT) Received: from zen.linaroharston ([51.148.130.216]) by smtp.gmail.com with ESMTPSA id s18sm8655356wra.85.2020.07.01.09.11.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Jul 2020 09:11:54 -0700 (PDT) Received: from zen.lan (localhost [127.0.0.1]) by zen.linaroharston (Postfix) with ESMTP id 685B01FF87; Wed, 1 Jul 2020 17:11:53 +0100 (BST) From: =?utf-8?q?Alex_Benn=C3=A9e?= To: qemu-devel@nongnu.org Subject: [PATCH v2 1/3] docs/booting.rst: start documenting the boot process Date: Wed, 1 Jul 2020 17:11:51 +0100 Message-Id: <20200701161153.30988-2-alex.bennee@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200701161153.30988-1-alex.bennee@linaro.org> References: <20200701161153.30988-1-alex.bennee@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::342; envelope-from=alex.bennee@linaro.org; 
helo=mail-wm1-x342.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Alex_Benn=C3=A9e?= Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" While working on some test cases I realised there was quite a lot of assumed knowledge about how things boot up. I thought it would be worth gathering this together in a user facing document where we could pour in the details and background to the boot process. As it's quite wordy I thought it should be a separate document to the manual (which can obviously reference this). The document follows the socratic method and leaves the reader to ask themselves some questions in an effort to elucidate them about any problems they may be having. Signed-off-by: Alex Bennée Message-Id: <20190308211557.22589-1-alex.bennee@linaro.org> --- v2 - fix a lot of it's/its - mention microvm style booting - add some questions to the end --- docs/interop/booting.rst | 159 +++++++++++++++++++++++++++++++++++++++ docs/interop/index.rst | 1 + 2 files changed, 160 insertions(+) create mode 100644 docs/interop/booting.rst -- 2.20.1 Reviewed-by: Richard Henderson diff --git a/docs/interop/booting.rst b/docs/interop/booting.rst new file mode 100644 index 00000000000..8579a775d04 --- /dev/null +++ b/docs/interop/booting.rst @@ -0,0 +1,159 @@ +.. + Copyright (c) 2019-2020 Linaro Ltd. + + This work is licensed under the terms of the GNU GPL, version 2 or + later. See the COPYING file in the top-level directory. 
+ +===================================== +Anatomy of a Boot, a QEMU perspective +===================================== + +This document attempts to give an overview of how machines boot and +how this matters to QEMU. We will discuss firmware and BIOSes and the +things they do before the OS kernel is loaded and your usable system +is finally ready. + +Firmware +======== + +When a CPU is powered up it knows nothing about its environment. The +internal state, including the program counter (PC), will be reset to a +defined set of values and it will attempt to fetch the first +instruction and execute it. It is the job of the firmware to bring a +CPU up from the initial few instructions to running in a relatively +sane execution environment. Firmware tends to be specific to the +hardware in question and is stored in non-volatile memory (memory that +survives a power off), usually a ROM or flash device on the computer's +main board. + +Some examples of what firmware does include: + +Early Hardware Setup +-------------------- + +Modern hardware often requires configuring before it is usable. For +example, most modern systems won't have working RAM until the memory +controller has been programmed with the correct timings for whatever +memory is installed on the system. Processors may boot with a very +restricted view of the memory map until RAM and other key peripherals +have been configured to appear in their address space. Some hardware +may not even appear until some sort of blob has been loaded into it so +it can start responding to the CPU. + +Fortunately for QEMU we don't have to worry too much about this very +low level configuration. The device model we present to the CPU at +start-up will generally respond to IO accesses from the processor +straight away. 
+ +BIOS or Firmware Services +------------------------- + +In the early days of the PC era the BIOS or Basic Input/Output System +provided an abstraction interface to the operating system which +allowed it to do basic IO operations without having to directly +drive the hardware. Since then the scope of these firmware services +has grown as systems have become more and more complex. + +Modern firmware often follows the Unified Extensible Firmware +Interface (UEFI) which provides services like secure boot, persistent +variables and external time-keeping. + +There can often be multiple levels of firmware service functions. For +example, systems which support secure execution enclaves generally have +a firmware component that executes in this secure mode which the +operating system can call in a defined secure manner to undertake +security sensitive tasks on its behalf. + +Hardware Enumeration +-------------------- + +It is easy to assume that modern hardware is built to be discoverable +and all the operating system needs to do is enumerate the various +buses on the system to find out what hardware exists. While buses like +PCI and USB do support discovery there is usually much more on a +modern system than just these two things. + +This process of discovery can take some time as devices usually need +to be probed and some time allowed for the buses to settle and the +probe to complete. For purely virtual machines operating in on-demand +cloud environments you may operate with stripped down kernels that +only support a fixed expected environment so they can boot as fast as +possible. + +In the embedded world it used to be acceptable to have a similar +custom compiled kernel which knew where everything was meant to be. +However this was a brittle approach and not very flexible. For example +a general purpose distribution would have to ship a special kernel for +each variant of hardware you wanted to run on. 
If you try to use a +kernel compiled for one platform on another platform that nominally +uses the same processor, the result will rarely work, given that a +processor rarely works in isolation. + +The more modern approach is to have a "generic" kernel that has a +number of different drivers compiled in which are then enabled based +on a hardware description provided by the firmware. This allows +flexibility on both sides. The software distribution is less concerned +about managing lots of different kernels for different pieces of +hardware. The hardware manufacturer is also able to make small changes +to the board over time to fix bugs or change minor components. + +The two main methods for this are the Advanced Configuration and Power +Interface (ACPI) and Device Trees. ACPI originated in the PC world +although it is becoming increasingly common for "enterprise" hardware +like servers. Device Trees of various forms have existed for a while +with perhaps the most common being Flattened Device Trees (FDT). + +Boot Code +========= + +The line between firmware and boot code is a very blurry one. However +from a functionality point of view we have moved from ensuring the +hardware is usable as a computing device to finding and loading a +kernel which is then going to take over control of the system. Modern +firmware often has the ability to boot a kernel directly and in some +systems you might chain through several boot loaders before the final +kernel takes control. + +The boot loader needs to do three things: + + - find a kernel and load it into RAM + - ensure the CPU is in the correct mode for the kernel to boot + - pass any information the kernel may need to boot and can't find itself + +Once it has done these things it can jump to the kernel and let it get +on with things. + +Kernel +====== + +The kernel now takes over and will be in charge of the system from now +on. It will enumerate all the devices on the system (again) and load +drivers that can control them. 
It will then locate some sort of +file-system and eventually start running programs that actually do +work. + +Questions to ask yourself +========================= + +Having given this overview of booting, here are some questions you +should ask when diagnosing boot problems. + +Hardware +~~~~~~~~ + + - is the platform fixed or dynamic? + - is the platform enumerable (e.g. PCI/USB)? + +Firmware +~~~~~~~~ + + - is the firmware built for the platform you are booting? + - does the firmware need storage for variables (boot index etc)? + - does the firmware provide a service to kernels (e.g. ACPI/EFI)? + +Kernel +~~~~~~ + + - is the kernel platform specific or generic? + - how will the kernel enumerate the platform? + - can the kernel interface talk to the firmware? diff --git a/docs/interop/index.rst b/docs/interop/index.rst index 049387ac6de..58d587444b3 100644 --- a/docs/interop/index.rst +++ b/docs/interop/index.rst @@ -12,6 +12,7 @@ Contents: .. toctree:: :maxdepth: 2 + booting bitmaps dbus dbus-vmstate From patchwork Wed Jul 1 16:11:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Alex_Benn=C3=A9e?= X-Patchwork-Id: 192176 Delivered-To: patch@linaro.org Received: by 2002:a54:3249:0:0:0:0:0 with SMTP id g9csp564983ecs; Wed, 1 Jul 2020 09:15:35 -0700 (PDT) X-Google-Smtp-Source: ABdhPJz2A3aoDgO1Pb21gjunAti+6GCIQlHtJPnoyaMcbcBF58H3sZpJjZm7UcG75Mqlg4aom84k X-Received: by 2002:a25:50cc:: with SMTP id e195mr45972185ybb.483.1593620135271; Wed, 01 Jul 2020 09:15:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1593620135; cv=none; d=google.com; s=arc-20160816; b=QIVxqCZVOpm0Wtm1Ozq4mO/7d0qG9D6g3KvtjVr7xIOA9g/fu7AeKYGDzH0+IqOVvV a7mN84NXwAAOwY6ngkrBlFKtWw+9j7Ada/NMH04JucG98Sv4I9/IR/dHYPD0mJqiDe4M dZoth6Ft2gwI13ombrVWDX/rbQR405zBq0djCDpND/lNQP5b+hG5hoLZjYpeguPw20xH 7oo2x7DISM3A+kBa35qiDXsH54FXOQJupCQGlwqgF2aBNjkvT2rGayZR32hJKL1lLW50 
r9fvTG46vHqxiQ13dZktYKmyqfuAFnqkVkQc7s6iFTGeAwtzhjPCln8dNIlJKgk8tksw IWfQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=plqX1uLjHS1JRtUyTbOdw9sOOjMvjpQ/src1XYAUVDM=; b=rL604wMrRJQGI8yB4tJiIOSpDP0Wh30ZuB+XSYNPnr7NXBUC931BTngCTQyHT9I7qh bLy1vUXB44amYC5sqY62cnHx8sRVs4bqPLqMQiwdueXLrCHkJ8vfHsAlzlqk3vaJ8wzr ztIW11W9NDFQ5aRoIurAuwaFFa70VsXrUOn551cD8Ni5hiPLzzCrPG7+jkA3HbN3pWXw tV5uacicqsmg1DIz1qFYSz/O6OV4zQBLbth7s9FfIG1zoEBLorOFZfbbx+9iEXa0nt9L jZSz9WPQ2mavWLC6c4XJpgClpfdyhh8ukM6yKx8Tg4V/2guSKIS2N57DFk8rsMBUKWLR LzVg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b="qT/4txte"; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. 
[209.51.188.17]) by mx.google.com with ESMTPS id t5si6150299ybp.39.2020.07.01.09.15.35 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Wed, 01 Jul 2020 09:15:35 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b="qT/4txte"; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:46954 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jqfOY-0002Ty-Ht for patch@linaro.org; Wed, 01 Jul 2020 12:15:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40108) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jqfL8-0007Ds-5m for qemu-devel@nongnu.org; Wed, 01 Jul 2020 12:12:02 -0400 Received: from mail-wr1-x443.google.com ([2a00:1450:4864:20::443]:45830) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jqfL4-0006GM-Fc for qemu-devel@nongnu.org; Wed, 01 Jul 2020 12:12:01 -0400 Received: by mail-wr1-x443.google.com with SMTP id s10so24588250wrw.12 for ; Wed, 01 Jul 2020 09:11:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=plqX1uLjHS1JRtUyTbOdw9sOOjMvjpQ/src1XYAUVDM=; b=qT/4txteqaajJk5C+0hMAnHFYeTIx/JBSqvSnfmpWrKhkPM+MCh0ieWDsBE3riGg5S pwaFghfibw4x5VvpkTeBLf9mq7lbr//vb1L6lKK/P05QmrqtyDHtalxFKHI6MMSjcsy3 of8Wecu4iGPV7kjHnrvB7tMOX7eFtSvwWFRVPVs/Mz+kWAk6lZl3aW9p3EfT6+ZKpLtQ 
W3KWGhl5dy+coKb12u7szq9y2A3Mf6eka05F/hmsRSH+md4i/wCSLYJQY4thgCT7lhZS aGpNfZOjYgL4MnIReIqfuVEa6MrldM42YV5rTo60Vcpn+tYl3MteisEICZt7KyQ1Vpyi fjkQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=plqX1uLjHS1JRtUyTbOdw9sOOjMvjpQ/src1XYAUVDM=; b=LGL7oZerFZhLbGt0oARIYOle/J4gYP9ej7EG/Fd+xHfq50z8jVhQ2Oqv8o6NH0M7Uh ReABiCRnzud/K7FCTBSdSYizVFjJVCRFaxhMbU/RRAgoIXMtfF3oRUpRFN7wYRN7Ezmk gZD3vNqnNI9tGfu34JcxlLCGcHZ5TmEnFDr6wl3p7DbtUn4Zv+d/gF62SHy+25cbHhnK qjiGeT2EuAczZaHa8dHwNTVUySLgesiX4e7leYMsZqqr3VXlsucy34OTftWAf2z2ypuV lpURVOL/6BLzEKszCB4isKthYyDoFdYDM5FA4o0Fndjz+Gzoz87QFpq4+ZlpgulNj2Va KqNg== X-Gm-Message-State: AOAM533U9qI1FBu4DNH1c9s5XCiz9rlndc/zZ0wE3ZR3H3hPq9Cs8EOx VlpaAr5oZRpkiU+I2u2s3S221Q== X-Received: by 2002:adf:f60c:: with SMTP id t12mr29586194wrp.198.1593619916565; Wed, 01 Jul 2020 09:11:56 -0700 (PDT) Received: from zen.linaroharston ([51.148.130.216]) by smtp.gmail.com with ESMTPSA id z17sm8248687wmc.3.2020.07.01.09.11.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Jul 2020 09:11:54 -0700 (PDT) Received: from zen.lan (localhost [127.0.0.1]) by zen.linaroharston (Postfix) with ESMTP id 7EA2E1FF8C; Wed, 1 Jul 2020 17:11:53 +0100 (BST) From: =?utf-8?q?Alex_Benn=C3=A9e?= To: qemu-devel@nongnu.org Subject: [PATCH v2 2/3] docs/devel: convert and update MTTCG design document Date: Wed, 1 Jul 2020 17:11:52 +0100 Message-Id: <20200701161153.30988-3-alex.bennee@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200701161153.30988-1-alex.bennee@linaro.org> References: <20200701161153.30988-1-alex.bennee@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::443; envelope-from=alex.bennee@linaro.org; helo=mail-wr1-x443.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. 
X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Alex_Benn=C3=A9e?= Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Do a light conversion to .rst and clean-up some of the language at the start now MTTCG has been merged for a while. Signed-off-by: Alex Bennée --- docs/devel/index.rst | 1 + ...ti-thread-tcg.txt => multi-thread-tcg.rst} | 52 ++++++++++++------- 2 files changed, 34 insertions(+), 19 deletions(-) rename docs/devel/{multi-thread-tcg.txt => multi-thread-tcg.rst} (90%) -- 2.20.1 Reviewed-by: Richard Henderson diff --git a/docs/devel/index.rst b/docs/devel/index.rst index bb8238c5d6d..4ecaea3643f 100644 --- a/docs/devel/index.rst +++ b/docs/devel/index.rst @@ -23,6 +23,7 @@ Contents: decodetree secure-coding-practices tcg + multi-thread-tcg tcg-plugins bitops reset diff --git a/docs/devel/multi-thread-tcg.txt b/docs/devel/multi-thread-tcg.rst similarity index 90% rename from docs/devel/multi-thread-tcg.txt rename to docs/devel/multi-thread-tcg.rst index 3c85ac0eab9..42158b77c70 100644 --- a/docs/devel/multi-thread-tcg.txt +++ b/docs/devel/multi-thread-tcg.rst @@ -1,15 +1,17 @@ -Copyright (c) 2015-2016 Linaro Ltd. +.. + Copyright (c) 2015-2020 Linaro Ltd. -This work is licensed under the terms of the GNU GPL, version 2 or -later. See the COPYING file in the top-level directory. + This work is licensed under the terms of the GNU GPL, version 2 or + later. See the COPYING file in the top-level directory. Introduction ============ -This document outlines the design for multi-threaded TCG system-mode -emulation. 
The current user-mode emulation mirrors the thread -structure of the translated executable. Some of the work will be -applicable to both system and linux-user emulation. +This document outlines the design for multi-threaded TCG (a.k.a. MTTCG) +system-mode emulation. User-mode emulation has always mirrored the +thread structure of the translated executable, although some of the +changes made for MTTCG system emulation have improved the stability of +linux-user emulation. The original system-mode TCG implementation was single threaded and dealt with multiple CPUs with simple round-robin scheduling. This @@ -21,9 +23,18 @@ vCPU Scheduling =============== We introduce a new running mode where each vCPU will run on its own -user-space thread. This will be enabled by default for all FE/BE -combinations that have had the required work done to support this -safely. +user-space thread. This is enabled by default for all FE/BE +combinations where the host memory model is able to accommodate the +guest (TCG_GUEST_DEFAULT_MO & ~TCG_TARGET_DEFAULT_MO is zero) and the +guest has had the required work done to support this safely +(TARGET_SUPPORTS_MTTCG). + +System emulation will fall back to the original round robin approach +if: + +* it is forced by --accel tcg,thread=single +* --icount mode is enabled +* a 64 bit guest runs on a 32 bit host (TCG_OVERSIZED_GUEST) In the general case of running translated code there should be no inter-vCPU dependencies and all vCPUs should be able to run at full @@ -61,7 +72,9 @@ have their block-to-block jumps patched. Global TCG State ---------------- -### User-mode emulation +User-mode emulation +~~~~~~~~~~~~~~~~~~~ + We need to protect the entire code generation cycle including any post generation patching of the translated code. This also implies a shared translation buffer which contains code running on all cores. Any @@ -78,9 +91,11 @@ patching. Code generation is serialised with mmap_lock(). 
-### !User-mode emulation +!User-mode emulation +~~~~~~~~~~~~~~~~~~~~ + Each vCPU has its own TCG context and associated TCG region, thereby -requiring no locking. +requiring no locking during translation. Translation Blocks ------------------ @@ -92,6 +107,7 @@ including: - debugging operations (breakpoint insertion/removal) - some CPU helper functions + - linux-user spawning its first thread This is done with the async_safe_run_on_cpu() mechanism to ensure all vCPUs are quiescent when changes are being made to shared global @@ -250,8 +266,10 @@ to enforce a particular ordering of memory operations from the point of view of external observers (e.g. another processor core). They can apply to any memory operations as well as just loads or stores. -The Linux kernel has an excellent write-up on the various forms of -memory barrier and the guarantees they can provide [1]. +The Linux kernel has an excellent `write-up +<https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/Documentation/memory-barriers.txt>`_ +on the various forms of memory barrier and the guarantees they can +provide. Barriers are often wrapped around synchronisation primitives to provide explicit memory ordering semantics. However they can be used @@ -352,7 +370,3 @@ an exclusive lock which ensures all emulation is serialised. While the atomic helpers look good enough for now there may be a need to look at solutions that can more closely model the guest architectures semantics. 
- -========== - -[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/Documentation/memory-barriers.txt From patchwork Wed Jul 1 16:11:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Alex_Benn=C3=A9e?= X-Patchwork-Id: 192177 Delivered-To: patch@linaro.org Received: by 2002:a54:3249:0:0:0:0:0 with SMTP id g9csp566608ecs; Wed, 1 Jul 2020 09:17:29 -0700 (PDT) X-Google-Smtp-Source: ABdhPJw6b2xqcdbWZRrQjyR470rs7bsTS9kV+xkBnvjLP6mywtjPs5NMFzldA4X31lPeMlaRls8t X-Received: by 2002:a25:f204:: with SMTP id i4mr41549432ybe.486.1593620249793; Wed, 01 Jul 2020 09:17:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1593620249; cv=none; d=google.com; s=arc-20160816; b=ZUk52DmnfB5d5po9ABHs40sohkuPGbOvb8oa+m90dsSVN3wi30szk+gu2tt5NIKs3N tnT40MzsklITZun+VZYV3GlNja/gDAvTeI7dsx5Aboj0Q+EzDePUqpbWUuUAT17wb/x3 YeO3T9qdopskVtZls+/U1ReLA8oriQASX1q79GJc89m+BPvQky+bxbIf02K7j80/5klx +kAqhCZujhpCcZzG+w3A8DD/i2UQxk9DS+GwnyWqc2PW4iNsF4MYa/McDMhZYOW2Y27q 4KDwW1659ZlEmZJU+NWme62SzVLlpw35b/JYeNxcDaLWiazaa9En3sWrw/jYwl34zmc5 FnZg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=RP9yAwjU/wG4S48AsoUmDxtOkxGjYVoWMIdIcMqKXrw=; b=HtbJDJGUPf3UocevTWfbCRWvXpTkYwWxeq2U5C56GtLjWrconyu8MyufXr6j5hgPHx 7EZDUCiInyoKZzHQtKKzBZ6UPLqyxYVhmTrFsptD/+811MYpzNvXq66+8DLsbAZ3bHLB CbBZkZocbO79VtuLILUengQTfxQNKSeKp+kWT2ueTBE6PKxaNU4hcpWmFn4ZFWp/Dvgx 1DlVy9UDUeqZakIlGr8oMbIEIKhqL1/rtksAUfVAdOgjpDNw8tj/6SixDxRJZfr+fFKP g62h9s9hUGNkHmWx3oMgkI1YRpj9OQ1wt3AmqqsuqsW5yNFfkwYpa5ZSPmHq990qVK2Z /YZQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=BDPYtGkJ; spf=pass (google.com: domain of 
qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id a17si5897539ybg.298.2020.07.01.09.17.29 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Wed, 01 Jul 2020 09:17:29 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=BDPYtGkJ; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:55390 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jqfQP-00064F-2i for patch@linaro.org; Wed, 01 Jul 2020 12:17:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40138) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jqfL9-0007HS-IU for qemu-devel@nongnu.org; Wed, 01 Jul 2020 12:12:03 -0400 Received: from mail-wr1-x444.google.com ([2a00:1450:4864:20::444]:40290) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jqfL6-0006Hs-HJ for qemu-devel@nongnu.org; Wed, 01 Jul 2020 12:12:03 -0400 Received: by mail-wr1-x444.google.com with SMTP id h5so24620913wrc.7 for ; Wed, 01 Jul 2020 09:12:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; 
bh=RP9yAwjU/wG4S48AsoUmDxtOkxGjYVoWMIdIcMqKXrw=; b=BDPYtGkJ1j5mbkJIMlqSJQs8INrq7ckC4EEAbzOwcw8UTXZqNV/vLn4i4MSSUL8T+N VnMc0wQFL9H4Z/SqzN1WBrD9jVPuKxhhRvOQX8weYo+rCVLZt8nqPX9lHoRusvu1HJzx C63a49vk9bA/6SN0bOyQRiVgmigarKnlqGfodt0qtuWEkBDqK8vcYaExQuZEUZi0t3hT fUE7Lvj8BNNLSqeI7ZfGh117z0bcO9QY6e14iSEKO9SmJvbjwNXnOV69Sce++MSmDcmo uHUaR7FOIj+/J/aHYTX0ZJk+AdIAehr0WErlF0h9hzTvNk/PXRQBb0cqyQHvQ+J8A0ln e3EQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=RP9yAwjU/wG4S48AsoUmDxtOkxGjYVoWMIdIcMqKXrw=; b=A0Ntd7Z2JRz6wJzBkQQgqRtWde6m92n6tbx4oWYXL6dAR/UC2gLT3jJ0ze/30RYD+9 E2FjdB+ca5xb5OLb0smHTK8jobxAvC2qDlAFxLpftXO2opEFyKAupAUCPVp5n+6i0j4v viNxS2F1pdy+BtagOI3f9ADh0hS+L5ELfJ0dKJ914+Q+EPjLcaO9iu09hBqs1lWKZ2n0 MjXu35iRa5IE/QuAMpzEIN4C2ikytUjoqUHHNlD3s8MTZ3T7d6iEVI4fjTqvYfQMaWjj 2aEbFmm8m4vLyRNdgybkZvQuMc1DASvekWi1RIdugciaPw8AQWxgyFHChepASdYlmUrk zKyA== X-Gm-Message-State: AOAM531cOpcEEdaYM0q6kVlka2zxKlNrQdqCWwx5pE28Ri6lh0EWgUOR OP7MkRNrPXQNCNsB/hAQUVJakA== X-Received: by 2002:adf:ce90:: with SMTP id r16mr27050958wrn.408.1593619918874; Wed, 01 Jul 2020 09:11:58 -0700 (PDT) Received: from zen.linaroharston ([51.148.130.216]) by smtp.gmail.com with ESMTPSA id d18sm8261774wrj.8.2020.07.01.09.11.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Jul 2020 09:11:54 -0700 (PDT) Received: from zen.lan (localhost [127.0.0.1]) by zen.linaroharston (Postfix) with ESMTP id 9404E1FF8F; Wed, 1 Jul 2020 17:11:53 +0100 (BST) From: =?utf-8?q?Alex_Benn=C3=A9e?= To: qemu-devel@nongnu.org Subject: [PATCH v2 3/3] docs/devel: add some notes on tcg-icount for developers Date: Wed, 1 Jul 2020 17:11:53 +0100 Message-Id: <20200701161153.30988-4-alex.bennee@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200701161153.30988-1-alex.bennee@linaro.org> References: <20200701161153.30988-1-alex.bennee@linaro.org> 
MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::444; envelope-from=alex.bennee@linaro.org; helo=mail-wr1-x444.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Richard Henderson , =?utf-8?q?Alex_Benn=C3=A9e?= , Pavel Dovgalyuk , Paolo Bonzini Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" This attempts to bring together my understanding of the requirements for icount behaviour into one reference document for our developer notes. Signed-off-by: Alex Bennée Reviewed-by: Richard Henderson Cc: Paolo Bonzini Cc: Pavel Dovgalyuk Cc: Peter Maydell Message-Id: <20200619135844.23307-1-alex.bennee@linaro.org> --- v2 - fix copyright date - it's -> its - drop mentioned of gen_io_end() - remove and correct original conjecture v3 - include link in index --- docs/devel/index.rst | 1 + docs/devel/tcg-icount.rst | 89 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 90 insertions(+) create mode 100644 docs/devel/tcg-icount.rst -- 2.20.1 diff --git a/docs/devel/index.rst b/docs/devel/index.rst index 4ecaea3643f..ae6eac7c9c6 100644 --- a/docs/devel/index.rst +++ b/docs/devel/index.rst @@ -23,6 +23,7 @@ Contents: decodetree secure-coding-practices tcg + tcg-icount multi-thread-tcg tcg-plugins bitops diff --git a/docs/devel/tcg-icount.rst b/docs/devel/tcg-icount.rst new file mode 100644 index 00000000000..cb51cb34dde --- /dev/null +++ b/docs/devel/tcg-icount.rst @@ -0,0 +1,89 @@ +.. 
+  Copyright (c) 2020, Linaro Limited
+  Written by Alex Bennée
+
+
+========================
+TCG Instruction Counting
+========================
+
+TCG has long supported a feature known as icount which allows for
+instruction counting during execution. This should not be confused with
+cycle-accurate emulation - QEMU does not attempt to emulate how long
+an instruction would take on real hardware. That is a job for other
+more detailed (and slower) tools that simulate the rest of a
+micro-architecture.
+
+This feature is only available for system emulation and is
+incompatible with multi-threaded TCG. It can be used to better align
+execution time with wall-clock time so a "slow" device doesn't run too
+fast on modern hardware. It can also provide a degree of
+deterministic execution and is an essential part of the record/replay
+support in QEMU.
+
+Core Concepts
+=============
+
+At its heart icount is simply a count of executed instructions which
+is stored in the TimersState of QEMU's timer sub-system. The number of
+executed instructions can then be used to calculate QEMU_CLOCK_VIRTUAL
+which represents the amount of elapsed time in the system since
+execution started. Depending on the icount mode this may either be a
+fixed number of ns per instruction or adjusted as execution continues
+to keep wall-clock time and virtual time in sync.
+
+To be able to calculate the number of executed instructions the
+translator starts by allocating a budget of instructions to be
+executed. The budget of instructions is limited by how long it will be
+until the next timer will expire. We store this budget as part of a
+vCPU icount_decr field which is shared with the machinery for handling
+cpu_exit(). The whole field is checked at the start of every
+translated block and will cause a return to the outer loop to deal
+with whatever caused the exit.
+
+In the case of icount, before the flag is checked we subtract the
+number of instructions the translation block would execute. If this
+would cause the instruction budget to go negative we exit the main
+loop and regenerate a new translation block with exactly the right
+number of instructions to take the budget to 0, meaning whatever timer
+was due to expire will expire exactly when we exit the main run loop.
+
+Dealing with MMIO
+-----------------
+
+While we can adjust the instruction budget for known events like timer
+expiry we cannot do the same for MMIO. Every load/store we execute
+might potentially trigger an I/O event at which point we will need an
+up-to-date and accurate reading of the icount number.
+
+To deal with this case, when an I/O access is made we:
+
+  - restore un-executed instructions to the icount budget
+  - re-compile a single [1]_ instruction block for the current PC
+  - exit the cpu loop and execute the re-compiled block
+
+The new block is created with the CF_LAST_IO compile flag which
+ensures the final instruction translation starts with a call to
+gen_io_start() so we don't enter a perpetual loop constantly
+recompiling a single instruction block. For translators using the
+common translator_loop this is done automatically.
+
+.. [1] sometimes two instructions if dealing with delay slots
+
+Other I/O operations
+--------------------
+
+MMIO isn't the only type of operation for which we might need a
+correct and accurate clock. IO port instructions and accesses to
+system registers are the common examples here. These instructions have
+to be handled by the individual translators which have the knowledge
+of which operations are I/O operations.
+
+.. warning:: Any instruction that eventually causes an access to
+             QEMU_CLOCK_VIRTUAL needs to be preceded by a
+             gen_io_start() and must also be the last instruction
+             translated in the block.
+
+
+
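As a rough illustration of the budget mechanism described under Core
Concepts, the following toy Python model may help. It is only a sketch
and not QEMU code: run_until_timer, the block sizes and the trace
labels are hypothetical names invented for this example. It shows the
budget being decremented at the start of each block, and a shorter
block being "regenerated" when a full block would take the budget
negative.

```python
# Toy model of the icount budget. This is NOT QEMU code: the function
# name, the block sizes and the trace labels are hypothetical and exist
# only to illustrate the idea.

def run_until_timer(block_sizes, budget):
    """Run translation blocks until the instruction budget reaches 0.

    block_sizes: instruction count of each translation block, in order.
    budget: instructions allowed before the next timer expires.
    Returns (executed, trace), where trace records what was run.
    """
    executed = 0
    trace = []
    for size in block_sizes:
        if budget == 0:
            break
        # At the start of each block, subtract its instruction count.
        remaining = budget - size
        if remaining < 0:
            # The budget would go negative: leave the loop and run a
            # regenerated block with exactly `budget` instructions.
            trace.append(("retranslated", budget))
            executed += budget
            budget = 0
            break
        trace.append(("full", size))
        executed += size
        budget = remaining
    return executed, trace


executed, trace = run_until_timer([4, 4, 4], budget=10)
print(executed, trace)
# → 10 [('full', 4), ('full', 4), ('retranslated', 2)]
```

With a budget of 10 and three 4-instruction blocks, the third block is
cut down to 2 instructions so that execution stops exactly when the
modelled timer is due.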