
[6/7] docs: add a glossary

Message ID 20241118172357.475281-7-pierrick.bouvier@linaro.org
State New
Series Enhance documentation for new developers

Commit Message

Pierrick Bouvier Nov. 18, 2024, 5:23 p.m. UTC
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 docs/devel/control-flow-integrity.rst |   2 +
 docs/devel/multi-thread-tcg.rst       |   2 +
 docs/glossary/index.rst               | 238 ++++++++++++++++++++++++++
 docs/index.rst                        |   1 +
 docs/system/arm/virt.rst              |   2 +
 docs/system/images.rst                |   2 +
 docs/tools/qemu-nbd.rst               |   2 +
 7 files changed, 249 insertions(+)
 create mode 100644 docs/glossary/index.rst

Comments

Peter Maydell Dec. 3, 2024, 5:37 p.m. UTC | #1
On Mon, 18 Nov 2024 at 17:24, Pierrick Bouvier
<pierrick.bouvier@linaro.org> wrote:
>
> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> ---
>  docs/devel/control-flow-integrity.rst |   2 +
>  docs/devel/multi-thread-tcg.rst       |   2 +
>  docs/glossary/index.rst               | 238 ++++++++++++++++++++++++++
>  docs/index.rst                        |   1 +
>  docs/system/arm/virt.rst              |   2 +
>  docs/system/images.rst                |   2 +
>  docs/tools/qemu-nbd.rst               |   2 +
>  7 files changed, 249 insertions(+)
>  create mode 100644 docs/glossary/index.rst

I think this is a good idea; we've had at least one bug
report from a user pointing out that we had a term in
our docs which we didn't define ("block driver"):
https://gitlab.com/qemu-project/qemu/-/issues/2611
I have some comments on specific entries below.

> diff --git a/docs/devel/control-flow-integrity.rst b/docs/devel/control-flow-integrity.rst
> index e6b73a4fe1a..3d5702fa4cc 100644
> --- a/docs/devel/control-flow-integrity.rst
> +++ b/docs/devel/control-flow-integrity.rst
> @@ -1,3 +1,5 @@
> +.. _cfi:
> +
>  ============================
>  Control-Flow Integrity (CFI)
>  ============================
> diff --git a/docs/devel/multi-thread-tcg.rst b/docs/devel/multi-thread-tcg.rst
> index d706c27ea74..7fd0a07633d 100644
> --- a/docs/devel/multi-thread-tcg.rst
> +++ b/docs/devel/multi-thread-tcg.rst
> @@ -4,6 +4,8 @@
>    This work is licensed under the terms of the GNU GPL, version 2 or
>    later. See the COPYING file in the top-level directory.
>
> +.. _mttcg:
> +
>  ==================
>  Multi-threaded TCG
>  ==================
> diff --git a/docs/glossary/index.rst b/docs/glossary/index.rst
> new file mode 100644
> index 00000000000..a2d4f3eae16
> --- /dev/null
> +++ b/docs/glossary/index.rst

I guess it makes sense to give this its own subdir, since we want
it to come at the end of the manual. The other option would be
to put it directly into docs/.

> @@ -0,0 +1,238 @@
> +.. _Glossary:
> +
> +--------
> +Glossary
> +--------
> +
> +This section of the manual presents *simply* acronyms and terms QEMU developers
> +use.

What's "simply" intended to mean here?

> +
> +Accelerator
> +-----------
> +
> +A specific API used to accelerate execution of guest instructions. It can be
> +hardware-based, through a virtualization API provided by the host OS (kvm, hvf,
> +whpx, ...) or software-based (tcg). See this description of `supported

Comma after ')'.

> +accelerators<Accelerators>`.
> +
> +Board
> +-----

I think the correct term here is "machine" -- that's what the
command line option is named, it's what the QOM class is, etc.
So the major glossary entry should be "Machine". Some people
(including me!) and some of the documentation use "board" as a
synonym for "machine", so we should have a glossary entry for
"board", but it should just say "Another name for 'machine'" and
xref to the "machine" entry.

> +
> +QEMU system defines board models for various architectures. It's a description
> +of a SoC (system-on-chip) with various devices pre-configured, and can be
> +selected with the option ``-machine`` of qemu-system.

SoCs are not the same as boards.

We could say something like:

QEMU's system emulation models many different types of hardware.
A machine model (sometimes called a board model) is the model
of a complete virtual system with RAM, one or more CPUs, and
various devices.

We could also put in a link to
https://www.qemu.org/docs/master/system/targets.html
which is where we document what our machine types are.
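
As a rough sketch only, the two entries could look something like this
in docs/glossary/index.rst. The heading style follows the patch; the
`Machine`_ cross-reference relies on the implicit target that
reStructuredText creates for a section title in the same file, which is
an assumption about how the final glossary will be laid out:

```rst
Board
-----

Another name for `Machine`_.

Machine
-------

QEMU's system emulation models many different types of hardware. A machine
model (sometimes called a board model) is the model of a complete virtual
system with RAM, one or more CPUs, and various devices. It can be selected with
the option ``-machine`` of qemu-system.
```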

> +For virtual machines, you'll use ``virt`` board model, designed for this use
> +case. As an example, for Arm architecture, you can find the `model code
> +<https://gitlab.com/qemu-project/qemu/-/blob/master/hw/arm/virt.c>`_ and
> +associated `documentation <arm-virt>`.

I think I would delete this paragraph. 'virt' is only the
board type for virtual machines for some architectures; on
x86 it doesn't exist, for example. Our user-facing
docs (that link above) are where we should suggest what
the best machine type to use is. And the codebase-guide
page is where we would say where machine type source code is.

> +
> +Block
> +-----
> +
> +Block drivers are the available `disk formats <block-drivers>` available, and
> +block devices `(see Block device section on options page)<sec_005finvocation>`
> +are using them to implement disks for a virtual machine.

Block drivers aren't just disk formats; there are some filter
drivers too. Somebody on the block side could probably
provide a better definition here.

> +
> +CFI
> +---
> +
> +Control Flow Integrity is a hardening technique used to prevent exploits
> +targeting QEMU by detecting unexpected branches during execution. QEMU `actively
> +supports<cfi>` being compiled with CFI enabled.
> +
> +Device
> +------
> +
> +QEMU is able to emulate a CPU, and all the hardware interacting with it,
> +including many devices. When QEMU runs a virtual machine using a hardware-based
> +accelerator, it is responsible for emulating, using software, all devices.

This definition doesn't actually define what a device is :-)

> +
> +EDK2
> +----
> +
> +EDK2, as known as `TianoCore <https://www.tianocore.org/>`_, is an open source
> +implementation of UEFI standard. It's ran by QEMU to support UEFI for virtual
> +machines.

Replace last sentence with
"QEMU virtual machines that boot a UEFI BIOS usually use EDK2."
?

> +
> +gdbstub
> +-------
> +
> +QEMU implements a `gdb server <GDB usage>`, allowing gdb to attach to it and
> +debug a running virtual machine, or a program in user-mode. This allows to debug
> +a given architecture without having access to hardware.

"allows debugging the guest code that is running inside QEMU."

> +
> +glib2
> +-----
> +
> +`GLib2 <https://docs.gtk.org/glib/>`_ is one of the most important library we

"libraries"

> +are using through the codebase. It provides many data structures, macros, string
> +and thread utilities and portable functions across different OS. It's required
> +to build QEMU.
> +
> +Guest agent
> +-----------
> +
> +`QEMU Guest agent <qemu-ga>` is a daemon intended to be executed by guest

"The QEMU Guest Agent"

"intended to be run within virtual machines. It provides various services"

> +virtual machines and providing various services to help QEMU to interact with
> +it.
> +
> +Guest/Host
> +----------

Make these two separate glossary entries, which cross reference each other.

> +
> +Guest is the architecture of the virtual machine, which is emulated.

"Sometimes this is called the 'target' architecture, but that term
can be ambiguous."

> +Host is the architecture on which QEMU is running on, which is native.


We could also have an entry for Target

 The term "target" can be ambiguous. In most places in QEMU it is used
 as a synonym for "guest"; for example the code for emulating Arm CPUs
 is in ``target/arm/``. However in the TCG subsystem "target" refers
 to the architecture which QEMU is running on, i.e. the "host".


> +
> +Hypervisor
> +----------
> +
> +The formal definition of an hypervisor is a program than can be used to manage a
> +virtual machine. QEMU itself is an hypervisor.

"a hypervisor". QEMU isn't really a hypervisor, though...


> +
> +In the context of QEMU, an hypervisor is an API, provided by the Host OS,
> +allowing to execute virtual machines. Linux implementation is KVM (and supports
> +Xen as well). For MacOS, it's HVF. Windows defines WHPX. And NetBSD provides
> +NVMM.
> +
> +Migration
> +---------
> +
> +QEMU can save and restore the execution of a virtual machine, including across
> +different machines. This is provided by the `Migration framework<migration>`.

"between different host systems".

> +
> +NBD
> +---
> +
> +`QEMU Network Block Device server <qemu-nbd>` is a tool that can be used to

"The QEMU ..."

> +mount and access QEMU images, providing functionality similar to a loop device.
> +
> +Mailing List
> +------------
> +
> +This is `where <https://wiki.qemu.org/Contribute/MailingLists>`_ all the
> +development happens! Changes are posted as series, that all developers can
> +review and share feedback for.
> +
> +For reporting issues, our `GitLab
> +<https://gitlab.com/qemu-project/qemu/-/issues>`_ tracker is the best place.
> +
> +MMU / softmmu
> +-------------
> +
> +The Memory Management Unit is responsible for translating virtual addresses to
> +physical addresses and managing memory protection. QEMU system mode is named
> +"softmmu" precisely because it implements this in software, including a TLB
> +(Translation lookaside buffer), for the guest virtual machine.
> +
> +QEMU user-mode does not implement a full software MMU, but "simply" translates
> +virtual addresses by adding a specific offset, and relying on host MMU/OS
> +instead.
> +
> +Monitor / QMP / HMP
> +-------------------
> +
> +`QEMU Monitor <QEMU monitor>` is a text interface which can be used to interact

"The QEMU Monitor"

> +with a running virtual machine.
> +
> +QMP stands for QEMU Monitor Protocol and is a json based interface.
> +HMP stands for Human Monitor Protocol and is a set of text commands available
> +for users who prefer natural language to json.
> +
> +MTTCG
> +-----
> +
> +Multiple cpus support was first implemented using a round-robin algorithm

"Multiple CPU support"

> +running on a single thread. Later on, `Multi-threaded TCG <mttcg>` was developed
> +to benefit from multiple cores to speed up execution.
> +
> +Plugins
> +-------
> +
> +`TCG Plugins <TCG Plugins>` is an API used to instrument guest code, in system
> +and user mode. The end goal is to have a similar set of functionality compared
> +to `DynamoRIO <https://dynamorio.org/>`_ or `valgrind <https://valgrind.org/>`_.
> +
> +One key advantage of QEMU plugins is that they can be used to perform
> +architecture agnostic instrumentation.
> +
> +Patchwork
> +---------
> +
> +`Patchwork <https://patchew.org/QEMU/>`_ is a website that tracks
> +patches on the Mailing List.

Patchwork and patchew are different systems. Patchew's URL is
https://patchew.org/QEMU/

(There is a patchwork instance that tracks qemu-devel patches,
at https://patchwork.kernel.org/project/qemu-devel/list/ , but
I'm not aware of any developers that are actively using it, so
I don't think it merits being mentioned in the glossary.)
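
A possible corrected entry, purely as a sketch: renaming the heading to
"Patchew" is an assumption on my part, and the description is kept as
it is in the patch.

```rst
Patchew
-------

`Patchew <https://patchew.org/QEMU/>`_ is a website that tracks
patches on the Mailing List.
```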

> +
> +PR
> +--
> +
> +Once a series is reviewed and accepted by a subsystem maintainer, it will be
> +included in a PR (Pull Request) that the project maintainer will merge into QEMU
> +main branch, after running tests.

I think we could probably also usefully say

"The QEMU project doesn't currently expect most developers to
directly submit pull requests."

just to flag up that our development model isn't like the
currently-popular github/gitlab one where a PR is how you
send contributions.

> +
> +QCOW
> +----
> +
> +QEMU Copy On Write is a disk format developed by QEMU. It provides transparent
> +compression, automatic extension, and many other advantages over a raw image.

We want to be a bit careful here, because the "qcow" format
is not something we recommend for new use -- "qcow2" is what
you actually want.

https://www.qemu.org/docs/master/system/qemu-block-drivers.html#cmdoption-image-formats-arg-qcow2

> +
> +QEMU
> +----
> +
> +`QEMU (Quick Emulator) <https://www.qemu.org/>`_ is a generic and open source
> +machine emulator and virtualizer.
> +
> +QOM
> +---
> +
> +`QEMU Object Model <qom>` is an object oriented API used to define various
> +devices and hardware in the QEMU codebase.
> +
> +Record/replay
> +-------------
> +
> +`Record/replay <replay>` is a feature of QEMU allowing to have a deterministic
> +and reproducible execution of a virtual machine.
> +
> +Rust
> +----
> +
> +`A new programming language <https://www.rust-lang.org/>`_, memory safe by
> +default. We didn't see a more efficient way to create debates and tensions in
> +a community of C programmers since the birth of C++.

:-)  but I think we should probably avoid the joke in our docs.

> +
> +System mode
> +-----------
> +
> +QEMU System mode emulates a full machine, including its cpu, memory and devices.
> +It can be accelerated to hardware speed by using one of the hypervisors QEMU
> +supports. It is referenced as softmmu as well.

https://www.qemu.org/docs/master/about/index.html already has
text defining system emulation and user emulation, so we don't
really need to re-invent new phrasing for those here.

> +
> +TCG
> +---
> +
> +`Tiny Code Generator <tcg>` is an intermediate representation (IR) used to run
> +guest instructions on host cpu, with both architectures possibly being
> +different.

I would say

  TCG is the QEMU Tiny Code Generator; it is the JIT system we use
  to emulate a guest CPU in software.

That's enough for users to understand what it means (I hope); if
they want to know more specifics like about the intermediate
representation they can follow the link.

> +
> +It is one of the accelerator supported by QEMU, and supports a lot of

"accelerators"

> +guest/host architectures.
> +
> +User mode
> +---------
> +
> +QEMU User mode allows to run programs for a guest architecture, on a host
> +architecture, by translating system calls and using TCG. It is available for
> +Linux and BSD.
> +
> +VirtIO
> +------
> +
> +VirtIO is an open standard used to define and implement virtual devices with a
> +minimal overhead, defining a set of data structures and hypercalls (similar to
> +system calls, but targeting an hypervisor, which happens to be QEMU in our
> +case). It's designed to be more efficient than emulating a real device, by
> +minimizing the amount of interactions between a guest VM and its hypervisor.
> +
> +vhost-user
> +----------
> +
> +`Vhost-user <vhost_user>` is an interface used to implement VirtIO devices
> +outside of QEMU itself.

thanks
-- PMM
Alex Bennée Dec. 3, 2024, 6:10 p.m. UTC | #2
Peter Maydell <peter.maydell@linaro.org> writes:

> On Mon, 18 Nov 2024 at 17:24, Pierrick Bouvier
> <pierrick.bouvier@linaro.org> wrote:
>>
>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>> ---
>>  docs/devel/control-flow-integrity.rst |   2 +
>>  docs/devel/multi-thread-tcg.rst       |   2 +
>>  docs/glossary/index.rst               | 238 ++++++++++++++++++++++++++
>>  docs/index.rst                        |   1 +
>>  docs/system/arm/virt.rst              |   2 +
>>  docs/system/images.rst                |   2 +
>>  docs/tools/qemu-nbd.rst               |   2 +
>>  7 files changed, 249 insertions(+)
>>  create mode 100644 docs/glossary/index.rst
>
> I think this is a good idea; we've had at least one bug
> report from a user pointing out that we had a term in
> our docs which we didn't define ("block driver"):
> https://gitlab.com/qemu-project/qemu/-/issues/2611
> I have some comments on specific entries below.
>
>> diff --git a/docs/devel/control-flow-integrity.rst b/docs/devel/control-flow-integrity.rst
>> index e6b73a4fe1a..3d5702fa4cc 100644
>> --- a/docs/devel/control-flow-integrity.rst
<snip>
>> +
>> +Device
>> +------
>> +
>> +QEMU is able to emulate a CPU, and all the hardware interacting with it,
>> +including many devices. When QEMU runs a virtual machine using a hardware-based
>> +accelerator, it is responsible for emulating, using software, all devices.
>
> This definition doesn't actually define what a device is :-)

Also we can xref to:

  https://qemu.readthedocs.io/en/v9.1.0/system/device-emulation.html

where we go into a bit more detail about what a device, bus, frontend
and backend are.

>
>> +
>> +EDK2
>> +----
>> +
>> +EDK2, as known as `TianoCore <https://www.tianocore.org/>`_, is an open source
>> +implementation of UEFI standard. It's ran by QEMU to support UEFI for virtual
>> +machines.
>
> Replace last sentence with
> "QEMU virtual machines that boot a UEFI BIOS usually use EDK2."
> ?
>
<snip>
Pierrick Bouvier Dec. 3, 2024, 7:37 p.m. UTC | #3
On 12/3/24 10:10, Alex Bennée wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:
> 
>> On Mon, 18 Nov 2024 at 17:24, Pierrick Bouvier
>> <pierrick.bouvier@linaro.org> wrote:
>>>
>>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>>> ---
>>>   docs/devel/control-flow-integrity.rst |   2 +
>>>   docs/devel/multi-thread-tcg.rst       |   2 +
>>>   docs/glossary/index.rst               | 238 ++++++++++++++++++++++++++
>>>   docs/index.rst                        |   1 +
>>>   docs/system/arm/virt.rst              |   2 +
>>>   docs/system/images.rst                |   2 +
>>>   docs/tools/qemu-nbd.rst               |   2 +
>>>   7 files changed, 249 insertions(+)
>>>   create mode 100644 docs/glossary/index.rst
>>
>> I think this is a good idea; we've had at least one bug
>> report from a user pointing out that we had a term in
>> our docs which we didn't define ("block driver"):
>> https://gitlab.com/qemu-project/qemu/-/issues/2611
>> I have some comments on specific entries below.
>>
>>> diff --git a/docs/devel/control-flow-integrity.rst b/docs/devel/control-flow-integrity.rst
>>> index e6b73a4fe1a..3d5702fa4cc 100644
>>> --- a/docs/devel/control-flow-integrity.rst
> <snip>
>>> +
>>> +Device
>>> +------
>>> +
>>> +QEMU is able to emulate a CPU, and all the hardware interacting with it,
>>> +including many devices. When QEMU runs a virtual machine using a hardware-based
>>> +accelerator, it is responsible for emulating, using software, all devices.
>>
>> This definition doesn't actually define what a device is :-)
> 
> Also we can xref to:
> 
>    https://qemu.readthedocs.io/en/v9.1.0/system/device-emulation.html
> 
> where we go into a bit more detail about what a device, bus, frontend
> and backend are.
> 

Good point, I'll add it.

>>
>>> +
>>> +EDK2
>>> +----
>>> +
>>> +EDK2, as known as `TianoCore <https://www.tianocore.org/>`_, is an open source
>>> +implementation of UEFI standard. It's ran by QEMU to support UEFI for virtual
>>> +machines.
>>
>> Replace last sentence with
>> "QEMU virtual machines that boot a UEFI BIOS usually use EDK2."
>> ?
>>
> <snip>
>
Pierrick Bouvier Dec. 3, 2024, 8:32 p.m. UTC | #4
On 12/3/24 09:37, Peter Maydell wrote:
> On Mon, 18 Nov 2024 at 17:24, Pierrick Bouvier
> <pierrick.bouvier@linaro.org> wrote:
>>
>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>> ---
>>   docs/devel/control-flow-integrity.rst |   2 +
>>   docs/devel/multi-thread-tcg.rst       |   2 +
>>   docs/glossary/index.rst               | 238 ++++++++++++++++++++++++++
>>   docs/index.rst                        |   1 +
>>   docs/system/arm/virt.rst              |   2 +
>>   docs/system/images.rst                |   2 +
>>   docs/tools/qemu-nbd.rst               |   2 +
>>   7 files changed, 249 insertions(+)
>>   create mode 100644 docs/glossary/index.rst
> 
> I think this is a good idea; we've had at least one bug
> report from a user pointing out that we had a term in
> our docs which we didn't define ("block driver"):
> https://gitlab.com/qemu-project/qemu/-/issues/2611
> I have some comments on specific entries below.
> 

And people can be free to add new entries later. However, we should 
resist the temptation to add too many details. It should stay simple and 
understandable, even if not all technical nuances are represented.

>> diff --git a/docs/devel/control-flow-integrity.rst b/docs/devel/control-flow-integrity.rst
>> index e6b73a4fe1a..3d5702fa4cc 100644
>> --- a/docs/devel/control-flow-integrity.rst
>> +++ b/docs/devel/control-flow-integrity.rst
>> @@ -1,3 +1,5 @@
>> +.. _cfi:
>> +
>>   ============================
>>   Control-Flow Integrity (CFI)
>>   ============================
>> diff --git a/docs/devel/multi-thread-tcg.rst b/docs/devel/multi-thread-tcg.rst
>> index d706c27ea74..7fd0a07633d 100644
>> --- a/docs/devel/multi-thread-tcg.rst
>> +++ b/docs/devel/multi-thread-tcg.rst
>> @@ -4,6 +4,8 @@
>>     This work is licensed under the terms of the GNU GPL, version 2 or
>>     later. See the COPYING file in the top-level directory.
>>
>> +.. _mttcg:
>> +
>>   ==================
>>   Multi-threaded TCG
>>   ==================
>> diff --git a/docs/glossary/index.rst b/docs/glossary/index.rst
>> new file mode 100644
>> index 00000000000..a2d4f3eae16
>> --- /dev/null
>> +++ b/docs/glossary/index.rst
> 
> I guess it makes sense to give this its own subdir, since we want
> it to come at the end of the manual. The other option would be
> to put it directly into docs/.
> 

From your comment, it's not clear to me whether it's ok as it is, or if you
want a change.
Can you elaborate on that?

>> @@ -0,0 +1,238 @@
>> +.. _Glossary:
>> +
>> +--------
>> +Glossary
>> +--------
>> +
>> +This section of the manual presents *simply* acronyms and terms QEMU developers
>> +use.
> 
> What's "simply" intended to mean here?
> 

"in a straightforward or plain manner".
I can remove this word if you think it does not serve any purpose.

>> +
>> +Accelerator
>> +-----------
>> +
>> +A specific API used to accelerate execution of guest instructions. It can be
>> +hardware-based, through a virtualization API provided by the host OS (kvm, hvf,
>> +whpx, ...) or software-based (tcg). See this description of `supported
> 
> Comma after ')'.
> 

Thanks.

>> +accelerators<Accelerators>`.
>> +
>> +Board
>> +-----
> 
> I think the correct term here is "machine" -- that's what the
> command line option is named, it's what the QOM class is, etc.
> So the major glossary entry should be "Machine". Some people
> (including me!) and some of the documentation use "board" as a
> synonym for "machine", so we should have a glossary entry for
> "board", but it should just say "Another name for 'machine'" and
> xref to the "machine" entry.
> 

It's a good point. I thought the same when I wrote it (and finally chose 
Board). I'll rename it to machine and add the board entry to point to it.

>> +
>> +QEMU system defines board models for various architectures. It's a description
>> +of a SoC (system-on-chip) with various devices pre-configured, and can be
>> +selected with the option ``-machine`` of qemu-system.
> 
> SoCs are not the same as boards.
> 
> We could say something like:
> 
> QEMU's system emulation models many different types of hardware.
> A machine model (sometimes called a board model) is the model
> of a complete virtual system with RAM, one or more CPUs, and
> various devices.
> 
> We could also put in a link to
> https://www.qemu.org/docs/master/system/targets.html
> which is where we document what our machine types are.
> 

How do you distinguish a SoC from a CPU? Is a SoC a CPU + devices?
Isn't a board/machine a SoC plus attached devices?

The definition does not say a board is a SoC, but maybe the wording is 
confusing.

>> +For virtual machines, you'll use ``virt`` board model, designed for this use
>> +case. As an example, for Arm architecture, you can find the `model code
>> +<https://gitlab.com/qemu-project/qemu/-/blob/master/hw/arm/virt.c>`_ and
>> +associated `documentation <arm-virt>`.
> 
> I think I would delete this paragraph. 'virt' is only the
> board type for virtual machines for some architectures; on
> x86 it doesn't exist, for example. Our user-facing
> docs (that link above) are where we should suggest what
> the best machine type to use is. And the codebase-guide
> page is where we would say where machine type source code is.
> 

Ok.

>> +
>> +Block
>> +-----
>> +
>> +Block drivers are the available `disk formats <block-drivers>` available, and
>> +block devices `(see Block device section on options page)<sec_005finvocation>`
>> +are using them to implement disks for a virtual machine.
> 
> Block drivers aren't just disk formats; there are some filter
> drivers too. Somebody on the block side could probably
> provide a better definition here.
> 

I'm open to a more exact definition. The two terms (drivers and devices) 
seem to overlap on some parts, so I came up with this trivial definition.

>> +
>> +CFI
>> +---
>> +
>> +Control Flow Integrity is a hardening technique used to prevent exploits
>> +targeting QEMU by detecting unexpected branches during execution. QEMU `actively
>> +supports<cfi>` being compiled with CFI enabled.
>> +
>> +Device
>> +------
>> +
>> +QEMU is able to emulate a CPU, and all the hardware interacting with it,
>> +including many devices. When QEMU runs a virtual machine using a hardware-based
>> +accelerator, it is responsible for emulating, using software, all devices.
> 
> This definition doesn't actually define what a device is :-)
> 

Indeed :)
Should we explain what a computer device is?
The goal was just to say that QEMU can emulate hardware interacting with
the cpu, which could be a possible definition. That way, people can see
that QEMU devices are nothing other than "normal" computer devices.

>> +
>> +EDK2
>> +----
>> +
>> +EDK2, as known as `TianoCore <https://www.tianocore.org/>`_, is an open source
>> +implementation of UEFI standard. It's ran by QEMU to support UEFI for virtual
>> +machines.
> 
> Replace last sentence with
> "QEMU virtual machines that boot a UEFI BIOS usually use EDK2."
> ?
> 
>> +
>> +gdbstub
>> +-------
>> +
>> +QEMU implements a `gdb server <GDB usage>`, allowing gdb to attach to it and
>> +debug a running virtual machine, or a program in user-mode. This allows to debug
>> +a given architecture without having access to hardware.
> 
> "allows debugging the guest code that is running inside QEMU."
> 
>> +
>> +glib2
>> +-----
>> +
>> +`GLib2 <https://docs.gtk.org/glib/>`_ is one of the most important library we
> 
> "libraries"
> 
>> +are using through the codebase. It provides many data structures, macros, string
>> +and thread utilities and portable functions across different OS. It's required
>> +to build QEMU.
>> +
>> +Guest agent
>> +-----------
>> +
>> +`QEMU Guest agent <qemu-ga>` is a daemon intended to be executed by guest
> 
> "The QEMU Guest Agent"
> 
> "intended to be run within virtual machines. It provides various services"
> 
>> +virtual machines and providing various services to help QEMU to interact with
>> +it.
>> +
>> +Guest/Host
>> +----------
> 
> Make these two separate glossary entries, which cross reference each other.
> 
>> +
>> +Guest is the architecture of the virtual machine, which is emulated.
> 
> "Sometimes this is called the 'target' architecture, but that term
> can be ambiguous."
> 
>> +Host is the architecture on which QEMU is running on, which is native.
> 
> 
> We could also have an entry for Target
> 
>   The term "target" can be ambiguous. In most places in QEMU it is used
>   as a synonym for "guest"; for example the code for emulating Arm CPUs
>   is in ``target/arm/``. However in the TCG subsystem "target" refers
>   to the architecture which QEMU is running on, i.e. the "host".
> 

It's a good point, and a very confusing one.
I'll add it and a link to docs/devel/tcg-ops.rst, that clarifies this 
for TCG.

> 
>> +
>> +Hypervisor
>> +----------
>> +
>> +The formal definition of an hypervisor is a program than can be used to manage a
>> +virtual machine. QEMU itself is an hypervisor.
> 
> "a hypervisor". QEMU isn't really a hypervisor, though...
> 

It's a shortcut, and I'm open to change it. It brings an interesting 
question though.

Technically, QEMU interacts with hypervisor APIs built in various OSes. 
On the other hand, when we use TCG, it's an emulator instead.

But as you can't use KVM/hvf/whpx by itself, what do you call the program
interacting with it, and emulating the rest of the VM?

The correct word is probably "virtualizer", but from searching the
Internet, it seems that "vmm" and "virtualizer" are considered the same
as a "hypervisor". The difference is subtle, and maybe we have an
opportunity here to clarify it.

> 
>> +
>> +In the context of QEMU, an hypervisor is an API, provided by the Host OS,
>> +allowing to execute virtual machines. Linux implementation is KVM (and supports
>> +Xen as well). For MacOS, it's HVF. Windows defines WHPX. And NetBSD provides
>> +NVMM.
>> +
>> +Migration
>> +---------
>> +
>> +QEMU can save and restore the execution of a virtual machine, including across
>> +different machines. This is provided by the `Migration framework<migration>`.
> 
> "between different host systems".
> 
>> +
>> +NBD
>> +---
>> +
>> +`QEMU Network Block Device server <qemu-nbd>` is a tool that can be used to
> 
> "The QEMU ..."
> 
>> +mount and access QEMU images, providing functionality similar to a loop device.
>> +
>> +Mailing List
>> +------------
>> +
>> +This is `where <https://wiki.qemu.org/Contribute/MailingLists>`_ all the
>> +development happens! Changes are posted as series, that all developers can
>> +review and share feedback for.
>> +
>> +For reporting issues, our `GitLab
>> +<https://gitlab.com/qemu-project/qemu/-/issues>`_ tracker is the best place.
>> +
>> +MMU / softmmu
>> +-------------
>> +
>> +The Memory Management Unit is responsible for translating virtual addresses to
>> +physical addresses and managing memory protection. QEMU system mode is named
>> +"softmmu" precisely because it implements this in software, including a TLB
>> +(Translation lookaside buffer), for the guest virtual machine.
>> +
>> +QEMU user-mode does not implement a full software MMU, but "simply" translates
>> +virtual addresses by adding a specific offset, and relying on host MMU/OS
>> +instead.
>> +
>> +Monitor / QMP / HMP
>> +-------------------
>> +
>> +`QEMU Monitor <QEMU monitor>` is a text interface which can be used to interact
> 
> "The QEMU Monitor"
> 
>> +with a running virtual machine.
>> +
>> +QMP stands for QEMU Monitor Protocol and is a json based interface.
>> +HMP stands for Human Monitor Protocol and is a set of text commands available
>> +for users who prefer natural language to json.
>> +
>> +MTTCG
>> +-----
>> +
>> +Multiple cpus support was first implemented using a round-robin algorithm
> 
> "Multiple CPU support"
> 
>> +running on a single thread. Later on, `Multi-threaded TCG <mttcg>` was developed
>> +to benefit from multiple cores to speed up execution.
>> +
>> +Plugins
>> +-------
>> +
>> +`TCG Plugins <TCG Plugins>` is an API used to instrument guest code, in system
>> +and user mode. The end goal is to have a similar set of functionality compared
>> +to `DynamoRIO <https://dynamorio.org/>`_ or `valgrind <https://valgrind.org/>`_.
>> +
>> +One key advantage of QEMU plugins is that they can be used to perform
>> +architecture agnostic instrumentation.
>> +
>> +Patchwork
>> +---------
>> +
>> +`Patchwork <https://patchew.org/QEMU/>`_ is a website that tracks
>> +patches on the Mailing List.
> 
> Patchwork and patchew are different systems. Patchew's URL is
> https://patchew.org/QEMU/
> 
> (There is a patchwork instance that tracks qemu-devel patches,
> at https://patchwork.kernel.org/project/qemu-devel/list/ , but
> I'm not aware of any developers that are actively using it, so
> I don't think it merits being mentioned in the glossary.)
> 

I was confused by that, and thought they were just two different
instances (fork me if you can) of the "same" thing.
How would you define patchew?
When we say patchwork, do we implicitly mean patchew?

If I understand correctly, patchew is what we want to mention in our
doc (and mention that it's not associated with patchwork)?

>> +
>> +PR
>> +--
>> +
>> +Once a series is reviewed and accepted by a subsystem maintainer, it will be
>> +included in a PR (Pull Request) that the project maintainer will merge into QEMU
>> +main branch, after running tests.
> 
> I think we could probably also usefully say
> 
> "The QEMU project doesn't currently expect most developers to
> directly submit pull requests."
> 
> just to flag up that our development model isn't like the
> currently-popular github/gitlab one where a PR is how you
> send contributions.
> 

This is interesting.

For the majority of developers nowadays, a PR is a GitHub/GitLab PR.
Despite the fact that we use the original PR meaning (in git terms), it's 
probably confusing when newcomers hear "pull request".

>> +
>> +QCOW
>> +----
>> +
>> +QEMU Copy On Write is a disk format developed by QEMU. It provides transparent
>> +compression, automatic extension, and many other advantages over a raw image.
> 
> We want to be a bit careful here, because the "qcow" format
> is not something we recommend for new use -- "qcow2" is what
> you actually want.
> 
> https://www.qemu.org/docs/master/system/qemu-block-drivers.html#cmdoption-image-formats-arg-qcow2
> 

Sounds good.

For my personal knowledge: during this work I discovered that we had 
qcow3. From what I understood, it seems to be included in what we call 
qcow2 today. Is that correct?

>> +
>> +QEMU
>> +----
>> +
>> +`QEMU (Quick Emulator) <https://www.qemu.org/>`_ is a generic and open source
>> +machine emulator and virtualizer.
>> +
>> +QOM
>> +---
>> +
>> +`QEMU Object Model <qom>` is an object oriented API used to define various
>> +devices and hardware in the QEMU codebase.
>> +
>> +Record/replay
>> +-------------
>> +
>> +`Record/replay <replay>` is a feature of QEMU allowing to have a deterministic
>> +and reproducible execution of a virtual machine.
>> +
>> +Rust
>> +----
>> +
>> +`A new programming language <https://www.rust-lang.org/>`_, memory safe by
>> +default. We didn't see a more efficient way to create debates and tensions in
>> +a community of C programmers since the birth of C++.
> 
> :-)  but I think we should probably avoid the joke in our docs.
> 

I had one smile, I'm happy to remove it now :).

More seriously, after "memory safe by default" I can add: "The QEMU 
community is currently working to integrate Rust into the codebase for 
various subsystems".

>> +
>> +System mode
>> +-----------
>> +
>> +QEMU System mode emulates a full machine, including its cpu, memory and devices.
>> +It can be accelerated to hardware speed by using one of the hypervisors QEMU
>> +supports. It is referenced as softmmu as well.
> 
> https://www.qemu.org/docs/master/about/index.html already has
> text defining system emulation and user emulation, so we don't
> really need to re-invent new phrasing for those here.
> 

I can repeat the definition we have there:
"System Emulation provides a virtual model of an entire machine (CPU, 
memory and emulated devices) to run a guest OS. In this mode the CPU may 
be fully emulated, or it may work with a hypervisor such as KVM, Xen or 
Hypervisor.Framework to allow the guest to run directly on the host CPU."

However, I think mentioning softmmu is important, as it's a commonly 
confusing name (for the newcomer) that comes from the target list.

>> +
>> +TCG
>> +---
>> +
>> +`Tiny Code Generator <tcg>` is an intermediate representation (IR) used to run
>> +guest instructions on host cpu, with both architectures possibly being
>> +different.
> 
> I would say
> 
>    TCG is the QEMU Tiny Code Generator; it is the JIT system we use
>    to emulate a guest CPU in software.
> 
> That's enough for users to understand what it means (I hope); if
> they want to know more specifics like about the intermediate
> representation they can follow the link.
> 

I'm OK with your definition; TCG is broader than the IR we use.

>> +
>> +It is one of the accelerator supported by QEMU, and supports a lot of
> 
> "accelerators"
> 
>> +guest/host architectures.
>> +
>> +User mode
>> +---------
>> +
>> +QEMU User mode allows to run programs for a guest architecture, on a host
>> +architecture, by translating system calls and using TCG. It is available for
>> +Linux and BSD.
>> +
>> +VirtIO
>> +------
>> +
>> +VirtIO is an open standard used to define and implement virtual devices with a
>> +minimal overhead, defining a set of data structures and hypercalls (similar to
>> +system calls, but targeting an hypervisor, which happens to be QEMU in our
>> +case). It's designed to be more efficient than emulating a real device, by
>> +minimizing the amount of interactions between a guest VM and its hypervisor.
>> +
>> +vhost-user
>> +----------
>> +
>> +`Vhost-user <vhost_user>` is an interface used to implement VirtIO devices
>> +outside of QEMU itself.
> 
> thanks
> -- PMM

Thanks for your review Peter!

Pierrick
Peter Maydell Dec. 5, 2024, 3:23 p.m. UTC | #5
On Tue, 3 Dec 2024 at 20:32, Pierrick Bouvier
<pierrick.bouvier@linaro.org> wrote:
>
> On 12/3/24 09:37, Peter Maydell wrote:
> > On Mon, 18 Nov 2024 at 17:24, Pierrick Bouvier
> > <pierrick.bouvier@linaro.org> wrote:
> >>
> >> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> >> ---
> >>   docs/devel/control-flow-integrity.rst |   2 +
> >>   docs/devel/multi-thread-tcg.rst       |   2 +
> >>   docs/glossary/index.rst               | 238 ++++++++++++++++++++++++++
> >>   docs/index.rst                        |   1 +
> >>   docs/system/arm/virt.rst              |   2 +
> >>   docs/system/images.rst                |   2 +
> >>   docs/tools/qemu-nbd.rst               |   2 +
> >>   7 files changed, 249 insertions(+)
> >>   create mode 100644 docs/glossary/index.rst
> >
> > I think this is a good idea; we've had at least one bug
> > report from a user pointing out that we had a term in
> > our docs which we didn't define ("block driver"):
> > https://gitlab.com/qemu-project/qemu/-/issues/2611
> > I have some comments on specific entries below.
> >
>
> And people can be free to add new entries later. However, we should
> resist the temptation to add too many details. It should stay simple and
> understandable, even if not all technical nuances are represented.
>
> >> diff --git a/docs/devel/control-flow-integrity.rst b/docs/devel/control-flow-integrity.rst
> >> index e6b73a4fe1a..3d5702fa4cc 100644
> >> --- a/docs/devel/control-flow-integrity.rst
> >> +++ b/docs/devel/control-flow-integrity.rst
> >> @@ -1,3 +1,5 @@
> >> +.. _cfi:
> >> +
> >>   ============================
> >>   Control-Flow Integrity (CFI)
> >>   ============================
> >> diff --git a/docs/devel/multi-thread-tcg.rst b/docs/devel/multi-thread-tcg.rst
> >> index d706c27ea74..7fd0a07633d 100644
> >> --- a/docs/devel/multi-thread-tcg.rst
> >> +++ b/docs/devel/multi-thread-tcg.rst
> >> @@ -4,6 +4,8 @@
> >>     This work is licensed under the terms of the GNU GPL, version 2 or
> >>     later. See the COPYING file in the top-level directory.
> >>
> >> +.. _mttcg:
> >> +
> >>   ==================
> >>   Multi-threaded TCG
> >>   ==================
> >> diff --git a/docs/glossary/index.rst b/docs/glossary/index.rst
> >> new file mode 100644
> >> index 00000000000..a2d4f3eae16
> >> --- /dev/null
> >> +++ b/docs/glossary/index.rst
> >
> > I guess it makes sense to give this its own subdir, since we want
> > it to come at the end of the manual. The other option would be
> > to put it directly into docs/.
> >
>
> From your comment, it's not clear to me if it's OK as it is, or if you
> want a change.
> Can you elaborate on that?

It means I'm not sure. We end up with a subdirectory with only
one file in it and where there's no expectation we'd ever want
to add any more files to it. On the other hand it does keep it
out of the docs/ top level directory, which currently has a
fair amount of cruft awaiting cleanup.

I guess on balance I would make this docs/glossary.rst,
unless you anticipate wanting to split this into multiple
files or have something else in docs/glossary/ later.

> >> @@ -0,0 +1,238 @@
> >> +.. _Glossary:
> >> +
> >> +--------
> >> +Glossary
> >> +--------
> >> +
> >> +This section of the manual presents *simply* acronyms and terms QEMU developers
> >> +use.
> >
> > What's "simply" intended to mean here?
> >
>
> "in a straightforward or plain manner".
> I can remove this word if you think it does not serve any purpose.

You could phrase it as "presents brief definitions of acronyms
and terms", I think.

> >> +QEMU system defines board models for various architectures. It's a description
> >> +of a SoC (system-on-chip) with various devices pre-configured, and can be
> >> +selected with the option ``-machine`` of qemu-system.
> >
> > SoCs are not the same as boards.
> >
> > We could say something like:
> >
> > QEMU's system emulation models many different types of hardware.
> > A machine model (sometimes called a board model) is the model
> > of a complete virtual system with RAM, one or more CPUs, and
> > various devices.
> >
> > We could also put in a link to
> > https://www.qemu.org/docs/master/system/targets.html
> > which is where we document what our machine types are.
> >
>
> How do you distinguish a SoC and cpu? Is a SoC cpu + devices?
> Isn't a board/machine a set of SoC + devices attached?

An SoC is a "system on chip". To quote wikipedia's definition:

"A system on a chip or system-on-chip is an integrated circuit that
integrates most or all components of a computer or electronic system.
These components usually include an on-chip central processing unit
(CPU), memory interfaces, input/output devices and interfaces, and
secondary storage interfaces, often alongside other components such
as radio modems and a graphics processing unit (GPU) – all on a
single substrate or microchip."

An SoC always contains a CPU, but it will have a lot more
than that built into it too. And the SoC only has "most"
of the system components, so the whole machine will be
an SoC plus some other things.

Generally a board/machine that uses an SoC will have on it:
 * an SoC
 * the actual memory
 * perhaps one or two extra devices external to the SoC
 * connectors for things like serial ports, SD cards, etc
   (which generally wire up to SoC pins)
 * a crystal or similar to act as the main clock source

So if you look at a photo of a development board that uses
an SoC, there will be one large chip which is the SoC,
some RAM chips, a bunch of connectors and one or two smaller
chips. Not every device will be inside the SoC, but
generally almost all of them are.

QEMU's machine models for this kind of board match the
organization of the hardware; looking at hw/arm/sabrelite.c
which is a machine model you can see that it has:
 * an instance of the fsl-imx6 SoC device object
 * the main memory
 * some NOR flash memory
 * some configuration and wiring up of things

And the SoC itself is in hw/arm/fsl-imx6.c, and is a
QOM device object that creates and wires up the CPUs,
UARTs, USB controller, and various other devices that
this particular SoC includes. In this case we only have
one board model using this SoC, but for some SoCs we
have several board models that all use the same SoC
but wire up different external devices to it.

Some of our machine models are models of systems that
don't use an SoC at all. This is rare in the Arm world,
but for instance the SPARC machines like the SS-5 are
like that -- the real hardware had a discrete CPU chip
and a bunch of devices in their own chips on the
motherboard, and QEMU's model of that hardware has
a machine model which directly creates the CPU and
the various devices. (And some of our older machine
models are models of hardware that *does* have an SoC
but where we didn't model that level of abstraction,
so they directly create devices in the machine model
that really ought to be inside an SoC object.
hw/arm/stellaris.c is one example of that.)
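
As a rough illustration of that layering, here is a minimal sketch in Python. It is not QEMU's actual code (the real models are C/QOM objects); every class and device name below is hypothetical, and it only shows the containment relationship: a machine may hold an SoC, the SoC creates its own CPUs and on-chip devices, and a machine without an SoC creates everything directly.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Device:
    """Hypothetical stand-in for a piece of hardware visible to the guest."""
    name: str


@dataclass
class SoC:
    """Container device: creates and wires up its own CPUs and on-chip devices."""
    name: str
    num_cpus: int
    cpus: List[Device] = field(init=False)
    devices: List[Device] = field(init=False)

    def __post_init__(self) -> None:
        self.cpus = [Device(f"{self.name}-cpu{i}") for i in range(self.num_cpus)]
        # The choice of "uart" and "usb" is illustrative only.
        self.devices = [Device(f"{self.name}-{d}") for d in ("uart", "usb")]


@dataclass
class Machine:
    """Board model: an optional SoC plus whatever sits outside it."""
    name: str
    ram_mb: int
    soc: Optional[SoC] = None
    board_devices: List[Device] = field(default_factory=list)

    def all_devices(self) -> List[str]:
        inside = self.soc.cpus + self.soc.devices if self.soc else []
        return [d.name for d in inside + self.board_devices]


# SoC-based board, loosely shaped like the sabrelite description above.
sabrelite_like = Machine(
    "sabrelite-like",
    ram_mb=1024,
    soc=SoC("imx6", num_cpus=4),
    board_devices=[Device("nor-flash")],
)

# Board without an SoC: the machine creates the CPU and devices directly.
ss5_like = Machine(
    "ss5-like",
    ram_mb=256,
    board_devices=[Device("cpu0"), Device("serial")],
)

print(sabrelite_like.all_devices())
print(ss5_like.all_devices())
```

Several board models sharing one SoC would simply be several `Machine` instances built around the same `SoC` type with different `board_devices`.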

> >> +
> >> +Block
> >> +-----
> >> +
> >> +Block drivers are the available `disk formats <block-drivers>` available, and
> >> +block devices `(see Block device section on options page)<sec_005finvocation>`
> >> +are using them to implement disks for a virtual machine.
> >
> > Block drivers aren't just disk formats; there are some filter
> > drivers too. Somebody on the block side could probably
> > provide a better definition here.
> >
>
> I'm open to a more exact definition. The two terms (drivers and devices)
> seem to overlap on some parts, so I came up with this trivial definition.

Yeah, the driver vs device split is a good one (basically
the device is the front-end visible to the guest, and the
driver is the back-end that provides it with storage
via an abstracted API). The nit I'm picking here is that
not every block driver is there to provide support for
an on-host disk format.
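
To make the split concrete, here is a small Python sketch under stated assumptions: the class names and the read/write interface are invented for illustration and are not QEMU's block-layer API. It shows a guest-facing device that only talks to an abstract back-end, one back-end that provides storage, and one filter back-end that implements no disk format at all and merely wraps another driver.

```python
class RawFileDriver:
    """Back-end providing storage; a bytearray stands in for a host file."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self.data[offset:offset + length])

    def write(self, offset: int, buf: bytes) -> None:
        self.data[offset:offset + len(buf)] = buf


class ReadOnlyFilter:
    """Hypothetical filter driver: no disk format, it only wraps a child driver."""

    def __init__(self, child) -> None:
        self.child = child

    def read(self, offset: int, length: int) -> bytes:
        return self.child.read(offset, length)

    def write(self, offset: int, buf: bytes) -> None:
        raise PermissionError("write blocked by filter")


class DiskDevice:
    """Front-end the guest sees; it only uses the abstract read/write calls."""

    SECTOR = 512

    def __init__(self, backend) -> None:
        self.backend = backend

    def read_sector(self, n: int) -> bytes:
        return self.backend.read(n * self.SECTOR, self.SECTOR)

    def write_sector(self, n: int, buf: bytes) -> None:
        self.backend.write(n * self.SECTOR, buf)


raw = RawFileDriver(4 * DiskDevice.SECTOR)
disk = DiskDevice(raw)
disk.write_sector(1, b"\xab" * DiskDevice.SECTOR)

# Same storage, but reached through a filter that implements no format.
protected = DiskDevice(ReadOnlyFilter(raw))
print(protected.read_sector(1)[:4])
```

The device class never needs to know whether its back-end is a format driver or a filter, which is the reason "block driver" cannot be defined as "disk format".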

> >> +
> >> +CFI
> >> +---
> >> +
> >> +Control Flow Integrity is a hardening technique used to prevent exploits
> >> +targeting QEMU by detecting unexpected branches during execution. QEMU `actively
> >> +supports<cfi>` being compiled with CFI enabled.
> >> +
> >> +Device
> >> +------
> >> +
> >> +QEMU is able to emulate a CPU, and all the hardware interacting with it,
> >> +including many devices. When QEMU runs a virtual machine using a hardware-based
> >> +accelerator, it is responsible for emulating, using software, all devices.
> >
> > This definition doesn't actually define what a device is :-)
> >
>
> Indeed :)
> Should we explain what is a computer device?
> The goal was just to say that QEMU can emulate hardware interacting with
> the cpu, which could be a possible definition. So people can associate
> that QEMU devices are nothing else than a "normal" computer device.

We could say, perhaps:

In QEMU, a device is a piece of hardware visible to the guest.
Examples include UARTs, PCI controllers, PCI cards, VGA controllers,
and many more.



> >> +
> >> +Hypervisor
> >> +----------
> >> +
> >> +The formal definition of an hypervisor is a program than can be used to manage a
> >> +virtual machine. QEMU itself is an hypervisor.
> >
> > "a hypervisor". QEMU isn't really a hypervisor, though...
> >
>
> It's a shortcut, and I'm open to change it. It brings an interesting
> question though.
>
> Technically, QEMU interacts with hypervisor APIs built in various OSes.
> On the other hand, when we use TCG, it's an emulator instead.
>
> But as you can't use KVM/hvf/whpx by itself, how do you name the program
> interacting with it, and emulating the rest of the VM?
>
> The correct word is probably "virtualizer", but from searching on
> Internet, it seems that "vmm" and "virtualizer" are considered the same
> as an "hypervisor". The difference is subtle, and maybe we have an
> opportunity here to clarify it.


> >> +Patchwork
> >> +---------
> >> +
> >> +`Patchwork <https://patchew.org/QEMU/>`_ is a website that tracks
> >> +patches on the Mailing List.
> >
> > Patchwork and patchew are different systems. Patchew's URL is
> > https://patchew.org/QEMU/
> >
> > (There is a patchwork instance that tracks qemu-devel patches,
> > at https://patchwork.kernel.org/project/qemu-devel/list/ , but
> > I'm not aware of any developers that are actively using it, so
> > I don't think it merits being mentioned in the glossary.)
> >
>
> I've been confused by that, and just thought it was two different
> instances (fork me if you can) of the "same" thing.
> How would you define patchew?
> When we say patchwork, do we implicitly mean patchew?

No. patchwork is patchwork, and patchew is patchew -- these
are entirely different pieces of software that happen to do
similar jobs.

> If I understand correctly, patchew is what we want to mention in our
> doc? (and mention it's not associated with patchwork).

We don't use patchwork, so we don't need to mention it anywhere.

> >> +Once a series is reviewed and accepted by a subsystem maintainer, it will be
> >> +included in a PR (Pull Request) that the project maintainer will merge into QEMU
> >> +main branch, after running tests.
> >
> > I think we could probably also usefully say
> >
> > "The QEMU project doesn't currently expect most developers to
> > directly submit pull requests."
> >
> > just to flag up that our development model isn't like the
> > currently-popular github/gitlab one where a PR is how you
> > send contributions.
> >
>
> This is interesting.
>
> For the majority of developers nowadays, a PR is a GitHub/GitLab PR.
> Despite the fact that we use the original PR meaning (in git terms), it's
> probably confusing when newcomers hear "pull request".
>
> >> +
> >> +QCOW
> >> +----
> >> +
> >> +QEMU Copy On Write is a disk format developed by QEMU. It provides transparent
> >> +compression, automatic extension, and many other advantages over a raw image.
> >
> > We want to be a bit careful here, because the "qcow" format
> > is not something we recommend for new use -- "qcow2" is what
> > you actually want.
> >
> > https://www.qemu.org/docs/master/system/qemu-block-drivers.html#cmdoption-image-formats-arg-qcow2
> >
>
> Sounds good.
>
> For my personal knowledge: during this work I discovered that we had
> qcow3. From what I understood, it seems to be included in what we call
> qcow2 today. Is that correct?

I have no idea -- you'd need to ask somebody who works on the
block layer.

thanks
-- PMM
Pierrick Bouvier Dec. 5, 2024, 7:21 p.m. UTC | #6
On 12/5/24 07:23, Peter Maydell wrote:
> On Tue, 3 Dec 2024 at 20:32, Pierrick Bouvier
> <pierrick.bouvier@linaro.org> wrote:
>>
>> On 12/3/24 09:37, Peter Maydell wrote:
>>> On Mon, 18 Nov 2024 at 17:24, Pierrick Bouvier
>>> <pierrick.bouvier@linaro.org> wrote:
>>>>
>>>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>>>> ---
>>>>    docs/devel/control-flow-integrity.rst |   2 +
>>>>    docs/devel/multi-thread-tcg.rst       |   2 +
>>>>    docs/glossary/index.rst               | 238 ++++++++++++++++++++++++++
>>>>    docs/index.rst                        |   1 +
>>>>    docs/system/arm/virt.rst              |   2 +
>>>>    docs/system/images.rst                |   2 +
>>>>    docs/tools/qemu-nbd.rst               |   2 +
>>>>    7 files changed, 249 insertions(+)
>>>>    create mode 100644 docs/glossary/index.rst
>>>
>>> I think this is a good idea; we've had at least one bug
>>> report from a user pointing out that we had a term in
>>> our docs which we didn't define ("block driver"):
>>> https://gitlab.com/qemu-project/qemu/-/issues/2611
>>> I have some comments on specific entries below.
>>>
>>
>> And people can be free to add new entries later. However, we should
>> resist the temptation to add too many details. It should stay simple and
>> understandable, even if not all technical nuances are represented.
>>
>>>> diff --git a/docs/devel/control-flow-integrity.rst b/docs/devel/control-flow-integrity.rst
>>>> index e6b73a4fe1a..3d5702fa4cc 100644
>>>> --- a/docs/devel/control-flow-integrity.rst
>>>> +++ b/docs/devel/control-flow-integrity.rst
>>>> @@ -1,3 +1,5 @@
>>>> +.. _cfi:
>>>> +
>>>>    ============================
>>>>    Control-Flow Integrity (CFI)
>>>>    ============================
>>>> diff --git a/docs/devel/multi-thread-tcg.rst b/docs/devel/multi-thread-tcg.rst
>>>> index d706c27ea74..7fd0a07633d 100644
>>>> --- a/docs/devel/multi-thread-tcg.rst
>>>> +++ b/docs/devel/multi-thread-tcg.rst
>>>> @@ -4,6 +4,8 @@
>>>>      This work is licensed under the terms of the GNU GPL, version 2 or
>>>>      later. See the COPYING file in the top-level directory.
>>>>
>>>> +.. _mttcg:
>>>> +
>>>>    ==================
>>>>    Multi-threaded TCG
>>>>    ==================
>>>> diff --git a/docs/glossary/index.rst b/docs/glossary/index.rst
>>>> new file mode 100644
>>>> index 00000000000..a2d4f3eae16
>>>> --- /dev/null
>>>> +++ b/docs/glossary/index.rst
>>>
>>> I guess it makes sense to give this its own subdir, since we want
>>> it to come at the end of the manual. The other option would be
>>> to put it directly into docs/.
>>>
>>
>> From your comment, it's not clear to me if it's OK as it is, or if you
>> want a change.
>> Can you elaborate on that?
> 
> It means I'm not sure. We end up with a subdirectory with only
> one file in it and where there's no expectation we'd ever want
> to add any more files to it. On the other hand it does keep it
> out of the docs/ top level directory, which currently has a
> fair amount of cruft awaiting cleanup.
> 
> I guess on balance I would make this docs/glossary.rst,
> unless you anticipate wanting to split this into multiple
> files or have something else in docs/glossary/ later.
> 
>>>> @@ -0,0 +1,238 @@
>>>> +.. _Glossary:
>>>> +
>>>> +--------
>>>> +Glossary
>>>> +--------
>>>> +
>>>> +This section of the manual presents *simply* acronyms and terms QEMU developers
>>>> +use.
>>>
>>> What's "simply" intended to mean here?
>>>
>>
>> "in a straightforward or plain manner".
>> I can remove this word if you think it does not serve any purpose.
> 
> You could phrase it as "presents brief definitions of acronyms
> and terms", I think.
> 
>>>> +QEMU system defines board models for various architectures. It's a description
>>>> +of a SoC (system-on-chip) with various devices pre-configured, and can be
>>>> +selected with the option ``-machine`` of qemu-system.
>>>
>>> SoCs are not the same as boards.
>>>
>>> We could say something like:
>>>
>>> QEMU's system emulation models many different types of hardware.
>>> A machine model (sometimes called a board model) is the model
>>> of a complete virtual system with RAM, one or more CPUs, and
>>> various devices.
>>>
>>> We could also put in a link to
>>> https://www.qemu.org/docs/master/system/targets.html
>>> which is where we document what our machine types are.
>>>
>>
>> How do you distinguish a SoC and cpu? Is a SoC cpu + devices?
>> Isn't a board/machine a set of SoC + devices attached?
> 
> An SoC is a "system on chip". To quote wikipedia's definition:
> 
> "A system on a chip or system-on-chip is an integrated circuit that
> integrates most or all components of a computer or electronic system.
> These components usually include an on-chip central processing unit
> (CPU), memory interfaces, input/output devices and interfaces, and
> secondary storage interfaces, often alongside other components such
> as radio modems and a graphics processing unit (GPU) – all on a
> single substrate or microchip."
> 
> An SoC always contains a CPU, but it will have a lot more
> than that built into it too. And the SoC only has "most"
> of the system components, so the whole machine will be
> an SoC plus some other things.
> 
> Generally a board/machine that uses an SoC will have on it:
>   * an SoC
>   * the actual memory
>   * perhaps one or two extra devices external to the SoC
>   * connectors for things like serial ports, SD cards, etc
>     (which generally wire up to SoC pins)
>   * a crystal or similar to act as the main clock source
> 
> So if you look at a photo of a development board that uses
> an SoC, there will be one large chip which is the SoC,
> some RAM chips, a bunch of connectors and one or two smaller
> chips. Not every device will be inside the SoC, but
> generally almost all of them are.
> 
> QEMU's machine models for this kind of board match the
> organization of the hardware; looking at hw/arm/sabrelite.c
> which is a machine model you can see that it has:
>   * an instance of the fsl-imx6 SoC device object
>   * the main memory
>   * some NOR flash memory
>   * some configuration and wiring up of things
> 
> And the SoC itself is in hw/arm/fsl-imx6.c, and is a
> QOM device object that creates and wires up the CPUs,
> UARTs, USB controller, and various other devices that
> this particular SoC includes. In this case we only have
> one board model using this SoC, but for some SoCs we
> have several board models that all use the same SoC
> but wire up different external devices to it.
> 
> Some of our machine models are models of systems that
> don't use an SoC at all. This is rare in the Arm world,
> but for instance the SPARC machines like the SS-5 are
> like that -- the real hardware had a discrete CPU chip
> and a bunch of devices in their own chips on the
> motherboard, and QEMU's model of that hardware has
> a machine model which directly creates the CPU and
> the various devices. (And some of our older machine
> models are models of hardware that *does* have an SoC
> but where we didn't model that level of abstraction,
> so they directly create devices in the machine model
> that really ought to be inside an SoC object.
> hw/arm/stellaris.c is one example of that.)
> 

Thanks for the answer on this.
My specific implicit question was: do we have boards with only a CPU (and 
not an SoC)? You answered that.

>>>> +
>>>> +Block
>>>> +-----
>>>> +
>>>> +Block drivers are the available `disk formats <block-drivers>` available, and
>>>> +block devices `(see Block device section on options page)<sec_005finvocation>`
>>>> +are using them to implement disks for a virtual machine.
>>>
>>> Block drivers aren't just disk formats; there are some filter
>>> drivers too. Somebody on the block side could probably
>>> provide a better definition here.
>>>
>>
>> I'm open to a more exact definition. The two terms (drivers and devices)
>> seem to overlap on some parts, so I came up with this trivial definition.
> 
> Yeah, the driver vs device split is a good one (basically
> the device is the front-end visible to the guest, and the
> driver is the back-end that provides it with storage
> via an abstracted API). The nit I'm picking here is that
> not every block driver is there to provide support for
> an on-host disk format.
> 
>>>> +
>>>> +CFI
>>>> +---
>>>> +
>>>> +Control Flow Integrity is a hardening technique used to prevent exploits
>>>> +targeting QEMU by detecting unexpected branches during execution. QEMU `actively
>>>> +supports<cfi>` being compiled with CFI enabled.
>>>> +
>>>> +Device
>>>> +------
>>>> +
>>>> +QEMU is able to emulate a CPU, and all the hardware interacting with it,
>>>> +including many devices. When QEMU runs a virtual machine using a hardware-based
>>>> +accelerator, it is responsible for emulating, using software, all devices.
>>>
>>> This definition doesn't actually define what a device is :-)
>>>
>>
>> Indeed :)
>> Should we explain what is a computer device?
>> The goal was just to say that QEMU can emulate hardware interacting with
>> the cpu, which could be a possible definition. So people can associate
>> that QEMU devices are nothing else than a "normal" computer device.
> 
> We could say, perhaps:
> 
> In QEMU, a device is a piece of hardware visible to the guest.
> Examples include UARTs, PCI controllers, PCI cards, VGA controllers,
> and many more.
> 
> 
> 
>>>> +
>>>> +Hypervisor
>>>> +----------
>>>> +
>>>> +The formal definition of an hypervisor is a program than can be used to manage a
>>>> +virtual machine. QEMU itself is an hypervisor.
>>>
>>> "a hypervisor". QEMU isn't really a hypervisor, though...
>>>
>>
>> It's a shortcut, and I'm open to change it. It brings an interesting
>> question though.
>>
>> Technically, QEMU interacts with hypervisor APIs built in various OSes.
>> On the other hand, when we use TCG, it's an emulator instead.
>>
>> But as you can't use KVM/hvf/whpx by itself, how do you name the program
>> interacting with it, and emulating the rest of the VM?
>>
>> The correct word is probably "virtualizer", but from searching on
>> Internet, it seems that "vmm" and "virtualizer" are considered the same
>> as an "hypervisor". The difference is subtle, and maybe we have an
>> opportunity here to clarify it.
> 
> 
>>>> +Patchwork
>>>> +---------
>>>> +
>>>> +`Patchwork <https://patchew.org/QEMU/>`_ is a website that tracks
>>>> +patches on the Mailing List.
>>>
>>> Patchwork and patchew are different systems. Patchew's URL is
>>> https://patchew.org/QEMU/
>>>
>>> (There is a patchwork instance that tracks qemu-devel patches,
>>> at https://patchwork.kernel.org/project/qemu-devel/list/ , but
>>> I'm not aware of any developers that are actively using it, so
>>> I don't think it merits being mentioned in the glossary.)
>>>
>>
>> I've been confused by that, and just thought it was two different
>> instances (fork me if you can) of the "same" thing.
>> How would you define patchew?
>> When we say patchwork, do we implicitly mean patchew?
> 
> No. patchwork is patchwork, and patchew is patchew -- these
> are entirely different pieces of software that happen to do
> similar jobs.
> 
>> If I understand correctly, patchew is what we want to mention in our
>> doc? (and mention it's not associated with patchwork).
> 
> We don't use patchwork, so we don't need to mention it anywhere.
> 
>>>> +Once a series is reviewed and accepted by a subsystem maintainer, it will be
>>>> +included in a PR (Pull Request) that the project maintainer will merge into QEMU
>>>> +main branch, after running tests.
>>>
>>> I think we could probably also usefully say
>>>
>>> "The QEMU project doesn't currently expect most developers to
>>> directly submit pull requests."
>>>
>>> just to flag up that our development model isn't like the
>>> currently-popular github/gitlab one where a PR is how you
>>> send contributions.
>>>
>>
>> This is interesting.
>>
>> For the majority of developers nowadays, a PR is a GitHub/GitLab PR.
>> Despite the fact that we use the original PR meaning (in git terms), it's
>> probably confusing when newcomers hear "pull request".
>>
>>>> +
>>>> +QCOW
>>>> +----
>>>> +
>>>> +QEMU Copy On Write is a disk format developed by QEMU. It provides transparent
>>>> +compression, automatic extension, and many other advantages over a raw image.
>>>
>>> We want to be a bit careful here, because the "qcow" format
>>> is not something we recommend for new use -- "qcow2" is what
>>> you actually want.
>>>
>>> https://www.qemu.org/docs/master/system/qemu-block-drivers.html#cmdoption-image-formats-arg-qcow2
>>>
>>
>> Sounds good.
>>
>> For my personal knowledge: during this work I discovered that we had
>> qcow3. From what I understood, it seems to be included in what we call
>> qcow2 today. Is that correct?
> 
> I have no idea -- you'd need to ask somebody who works on the
> block layer.
> 
> thanks
> -- PMM

Thanks for all the suggestions, I'll integrate them in the next version.

Patch

diff --git a/docs/devel/control-flow-integrity.rst b/docs/devel/control-flow-integrity.rst
index e6b73a4fe1a..3d5702fa4cc 100644
--- a/docs/devel/control-flow-integrity.rst
+++ b/docs/devel/control-flow-integrity.rst
@@ -1,3 +1,5 @@ 
+.. _cfi:
+
 ============================
 Control-Flow Integrity (CFI)
 ============================
diff --git a/docs/devel/multi-thread-tcg.rst b/docs/devel/multi-thread-tcg.rst
index d706c27ea74..7fd0a07633d 100644
--- a/docs/devel/multi-thread-tcg.rst
+++ b/docs/devel/multi-thread-tcg.rst
@@ -4,6 +4,8 @@ 
   This work is licensed under the terms of the GNU GPL, version 2 or
   later. See the COPYING file in the top-level directory.
 
+.. _mttcg:
+
 ==================
 Multi-threaded TCG
 ==================
diff --git a/docs/glossary/index.rst b/docs/glossary/index.rst
new file mode 100644
index 00000000000..a2d4f3eae16
--- /dev/null
+++ b/docs/glossary/index.rst
@@ -0,0 +1,238 @@ 
+.. _Glossary:
+
+--------
+Glossary
+--------
+
+This section of the manual presents *simply* acronyms and terms QEMU developers
+use.
+
+Accelerator
+-----------
+
+A specific API used to accelerate execution of guest instructions. It can be
+hardware-based, through a virtualization API provided by the host OS (kvm, hvf,
+whpx, ...) or software-based (tcg). See this description of `supported
+accelerators<Accelerators>`.
+
+Board
+-----
+
+QEMU system emulation defines board models for various architectures. A board
+model is a description of a SoC (system-on-chip) with various devices
+pre-configured, and can be selected with the ``-machine`` option of qemu-system.
+For virtual machines, you'll use the ``virt`` board model, designed for this use
+case. As an example, for the Arm architecture, you can find the `model code
+<https://gitlab.com/qemu-project/qemu/-/blob/master/hw/arm/virt.c>`_ and
+associated `documentation <arm-virt>`.
+
+Block
+-----
+
+Block drivers are the available `disk formats <block-drivers>`, and block
+devices `(see Block device section on options page)<sec_005finvocation>` use
+them to implement disks for a virtual machine.
+
+CFI
+---
+
+Control Flow Integrity is a hardening technique used to prevent exploits
+targeting QEMU by detecting unexpected branches during execution. QEMU `actively
+supports<cfi>` being compiled with CFI enabled.
+
+Device
+------
+
+QEMU is able to emulate a CPU, and all the hardware interacting with it,
+including many devices. When QEMU runs a virtual machine using a hardware-based
+accelerator, it is responsible for emulating all devices in software.
+
+EDK2
+----
+
+EDK2, also known as `TianoCore <https://www.tianocore.org/>`_, is an open source
+implementation of the UEFI standard. It's run by QEMU to support UEFI for
+virtual machines.
+
+gdbstub
+-------
+
+QEMU implements a `gdb server <GDB usage>`, allowing gdb to attach to it and
+debug a running virtual machine, or a program in user-mode. This makes it
+possible to debug code for a given architecture without access to hardware.
+
+glib2
+-----
+
+`GLib2 <https://docs.gtk.org/glib/>`_ is one of the most important libraries we
+use throughout the codebase. It provides many data structures, macros, string
+and thread utilities, and functions that are portable across different OSes.
+It's required to build QEMU.
+
+Guest agent
+-----------
+
+`QEMU Guest agent <qemu-ga>` is a daemon intended to be executed inside guest
+virtual machines, providing various services that help QEMU interact with
+them.
+
+Guest/Host
+----------
+
+Guest is the architecture of the virtual machine, which is emulated.
+Host is the architecture on which QEMU is running, which is native.
+
+Hypervisor
+----------
+
+The formal definition of a hypervisor is a program that can be used to manage a
+virtual machine. QEMU itself is a hypervisor.
+
+In the context of QEMU, a hypervisor is an API, provided by the Host OS,
+that allows executing virtual machines. The Linux implementation is KVM (and
+supports Xen as well). For macOS, it's HVF. Windows defines WHPX. And NetBSD
+provides NVMM.
+
+Migration
+---------
+
+QEMU can save and restore the execution of a virtual machine, including across
+different machines. This is provided by the `Migration framework<migration>`.
+
+NBD
+---
+
+`QEMU Network Block Device server <qemu-nbd>` is a tool that can be used to
+mount and access QEMU images, providing functionality similar to a loop device.
+
+Mailing List
+------------
+
+This is `where <https://wiki.qemu.org/Contribute/MailingLists>`_ all the
+development happens! Changes are posted as patch series, which all developers
+can review and give feedback on.
+
+For reporting issues, our `GitLab
+<https://gitlab.com/qemu-project/qemu/-/issues>`_ tracker is the best place.
+
+MMU / softmmu
+-------------
+
+The Memory Management Unit is responsible for translating virtual addresses to
+physical addresses and managing memory protection. QEMU system mode is named
+"softmmu" precisely because it implements this in software, including a TLB
+(Translation lookaside buffer), for the guest virtual machine.
+
+QEMU user-mode does not implement a full software MMU, but "simply" translates
+virtual addresses by adding a specific offset, relying on the host MMU/OS
+instead.
+
+Monitor / QMP / HMP
+-------------------
+
+`QEMU Monitor <QEMU monitor>` is a text interface which can be used to interact
+with a running virtual machine.
+
+QMP stands for QEMU Monitor Protocol and is a JSON-based interface.
+HMP stands for Human Monitor Protocol and is a set of text commands available
+for users who prefer natural language to JSON.
+
+MTTCG
+-----
+
+Support for multiple CPUs was first implemented using a round-robin algorithm
+running on a single thread. Later on, `Multi-threaded TCG <mttcg>` was developed
+to benefit from multiple cores to speed up execution.
+
+Plugins
+-------
+
+`TCG Plugins <TCG Plugins>` is an API used to instrument guest code, in system
+and user mode. The end goal is to have a similar set of functionality compared
+to `DynamoRIO <https://dynamorio.org/>`_ or `valgrind <https://valgrind.org/>`_.
+
+One key advantage of QEMU plugins is that they can be used to perform
+architecture agnostic instrumentation.
+
+Patchew
+-------
+
+`Patchew <https://patchew.org/QEMU/>`_ is a website that tracks
+patches on the Mailing List.
+
+PR
+--
+
+Once a series is reviewed and accepted by a subsystem maintainer, it will be
+included in a PR (Pull Request) that the project maintainer will merge into the
+QEMU main branch, after running tests.
+
+QCOW
+----
+
+QEMU Copy On Write is a disk format developed by QEMU. It provides transparent
+compression, automatic extension, and many other advantages over a raw image.
+
+QEMU
+----
+
+`QEMU (Quick Emulator) <https://www.qemu.org/>`_ is a generic and open source
+machine emulator and virtualizer.
+
+QOM
+---
+
+`QEMU Object Model <qom>` is an object oriented API used to define various
+devices and hardware in the QEMU codebase.
+
+Record/replay
+-------------
+
+`Record/replay <replay>` is a QEMU feature that allows deterministic and
+reproducible execution of a virtual machine.
+
+Rust
+----
+
+`A new programming language <https://www.rust-lang.org/>`_, memory safe by
+default. We didn't see a more efficient way to create debates and tensions in
+a community of C programmers since the birth of C++.
+
+System mode
+-----------
+
+QEMU System mode emulates a full machine, including its CPU, memory and devices.
+It can be accelerated to hardware speed by using one of the hypervisors QEMU
+supports. It is also referred to as softmmu.
+
+TCG
+---
+
+`Tiny Code Generator <tcg>` is an intermediate representation (IR) used to run
+guest instructions on the host CPU, with both architectures possibly being
+different.
+
+It is one of the accelerators supported by QEMU, and supports a lot of
+guest/host architectures.
+
+User mode
+---------
+
+QEMU User mode allows running programs built for a guest architecture on a host
+architecture, by translating system calls and using TCG. It is available for
+Linux and BSD.
+
+VirtIO
+------
+
+VirtIO is an open standard used to define and implement virtual devices with a
+minimal overhead, defining a set of data structures and hypercalls (similar to
+system calls, but targeting a hypervisor, which happens to be QEMU in our
+case). It's designed to be more efficient than emulating a real device, by
+minimizing the number of interactions between a guest VM and its hypervisor.
+
+vhost-user
+----------
+
+`Vhost-user <vhost_user>` is an interface used to implement VirtIO devices
+outside of QEMU itself.
diff --git a/docs/index.rst b/docs/index.rst
index cb5e5098b65..2cad84cd77c 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -21,3 +21,4 @@  Welcome to QEMU's documentation!
    specs/index
    devel/index
    codebase/index
+   glossary/index
diff --git a/docs/system/arm/virt.rst b/docs/system/arm/virt.rst
index e67e7f0f7c5..11ceb898264 100644
--- a/docs/system/arm/virt.rst
+++ b/docs/system/arm/virt.rst
@@ -1,3 +1,5 @@ 
+.. _arm-virt:
+
 'virt' generic virtual platform (``virt``)
 ==========================================
 
diff --git a/docs/system/images.rst b/docs/system/images.rst
index d000bd6b6f1..a5551173c97 100644
--- a/docs/system/images.rst
+++ b/docs/system/images.rst
@@ -82,4 +82,6 @@  VM snapshots currently have the following known limitations:
 -  A few device drivers still have incomplete snapshot support so their
    state is not saved or restored properly (in particular USB).
 
+.. _block-drivers:
+
 .. include:: qemu-block-drivers.rst.inc
diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst
index 329f44d9895..4f21b7904ac 100644
--- a/docs/tools/qemu-nbd.rst
+++ b/docs/tools/qemu-nbd.rst
@@ -1,3 +1,5 @@ 
+.. _qemu-nbd:
+
 =====================================
 QEMU Disk Network Block Device Server
 =====================================