diff mbox series

system: Try hardware accelerators (KVM, HVF) before software one (TCG)

Message ID 20250103150558.1473-1-philmd@linaro.org
State New
Headers show
Series system: Try hardware accelerators (KVM, HVF) before software one (TCG) | expand

Commit Message

Philippe Mathieu-Daudé Jan. 3, 2025, 3:05 p.m. UTC
As Daniel suggested [*]:

> We should consider to rank HVF above TCG, on the basis
> that HW acceleration is faster and should provide a
> host<->guest security boundary that we don't claim for TCG

[*] https://lore.kernel.org/qemu-devel/Z07YASl2Pd3CPtjE@redhat.com/
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
 system/vl.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Comments

Daniel P. Berrangé Jan. 3, 2025, 3:15 p.m. UTC | #1
On Fri, Jan 03, 2025 at 04:05:58PM +0100, Philippe Mathieu-Daudé wrote:
> As Daniel suggested [*]:
> 
> > We should consider to rank HVF above TCG, on the basis
> > that HW acceleration is faster and should provide a
> > host<->guest security boundary that we don't claim for TCG
> 
> [*] https://lore.kernel.org/qemu-devel/Z07YASl2Pd3CPtjE@redhat.com/

Note, my statement above was on the basis that HVF passes all our
functional tests, thus indicating a decent level of confidence
in the correctness of the HVF impl.

If anyone knows any show stopper problems with HVF that would
justify blocking its promotion ahead of TCG.... say now.

> Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> ---
>  system/vl.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


With regards,
Daniel
Philippe Mathieu-Daudé Jan. 3, 2025, 5:16 p.m. UTC | #2
On 3/1/25 16:15, Daniel P. Berrangé wrote:
> On Fri, Jan 03, 2025 at 04:05:58PM +0100, Philippe Mathieu-Daudé wrote:
>> As Daniel suggested [*]:
>>
>>> We should consider to rank HVF above TCG, on the basis
>>> that HW acceleration is faster and should provide a
>>> host<->guest security boundary that we don't claim for TCG
>>
>> [*] https://lore.kernel.org/qemu-devel/Z07YASl2Pd3CPtjE@redhat.com/
> 
> Note, my statement above was on the basis that HVF passes all our
> functional tests, thus indicating a decent level of confidence
> in the correctness of the HVF impl.

Indeed, I forgot about that, and only tested in my 'HVF-only'
directory before posting, but ...

> If anyone knows any show stopper problems with HVF that would
> justify blocking its promotion ahead of TCG.... say now.

... here we go:

  3/15 qemu:qtest+qtest-aarch64 / qtest-aarch64/migration-test 
       ERROR            0.88s   killed by signal 11 SIGSEGV

>> Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
>> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>> ---
>>   system/vl.c | 5 ++---
>>   1 file changed, 2 insertions(+), 3 deletions(-)
> 
> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
> 
> 
> With regards,
> Daniel
Peter Xu Jan. 3, 2025, 5:48 p.m. UTC | #3
On Fri, Jan 03, 2025 at 06:16:38PM +0100, Philippe Mathieu-Daudé wrote:
> On 3/1/25 16:15, Daniel P. Berrangé wrote:
> > On Fri, Jan 03, 2025 at 04:05:58PM +0100, Philippe Mathieu-Daudé wrote:
> > > As Daniel suggested [*]:
> > > 
> > > > We should consider to rank HVF above TCG, on the basis
> > > > that HW acceleration is faster and should provide a
> > > > host<->guest security boundary that we don't claim for TCG
> > > 
> > > [*] https://lore.kernel.org/qemu-devel/Z07YASl2Pd3CPtjE@redhat.com/
> > 
> > Note, my statement above was on the basis that HVF passes all our
> > functional tests, thus indicating a decent level of confidence
> > in the correctness of the HVF impl.
> 
> Indeed, I forgot about that, and only tested in my 'HVF-only'
> directory before posting, but ...
> 
> > If anyone knows any show stopper problems with HVF that would
> > justify blocking its promotion ahead of TCG.... say now.
> 
> ... here we go:
> 
>  3/15 qemu:qtest+qtest-aarch64 / qtest-aarch64/migration-test       ERROR
> 0.88s   killed by signal 11 SIGSEGV

Hmm.. I think migration-test specifies either kvm or tcg in all its tests,
so I don't yet know why this patch can affect it..

migrate_start():
    cmd_source = g_strdup_printf("-accel kvm%s -accel tcg "
                                 "-machine %s,%s "
                                 ...);

May need a closer look on the crashed stack.
Phil Dennis-Jordan Jan. 4, 2025, 11:28 a.m. UTC | #4
On Fri, 3 Jan 2025 at 16:16, Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Fri, Jan 03, 2025 at 04:05:58PM +0100, Philippe Mathieu-Daudé wrote:
> > As Daniel suggested [*]:
> >
> > > We should consider to rank HVF above TCG, on the basis
> > > that HW acceleration is faster and should provide a
> > > host<->guest security boundary that we don't claim for TCG
> >
> > [*] https://lore.kernel.org/qemu-devel/Z07YASl2Pd3CPtjE@redhat.com/
>
> Note, my statement above was on the basis that HVF passes all our
> functional tests, thus indicating a decent level of confidence
> in the correctness of the HVF impl.
>
> If anyone knows any show stopper problems with HVF that would
> justify blocking its promotion ahead of TCG.... say now.
>

I don't know about showstoppers, but:

1. As far as I'm aware there are/were problems with the virtual IOMMU
devices in HVF. It's been a while (~half a year?) since I tried them, but I
had problems getting guests booted with intel_iommu etc.

2. I think there might also be a few remaining edge cases where the x86
instruction emulation on fault/trap is incomplete. Most notably, MMIO using
SSE/AVX/etc. instructions will, I think, fail. In practice this is a fairly
obscure use case - I'm not aware of any guest OS that actually performs
MMIO using these instructions. I have a patch kicking around that adds
support for missing 64-bit variants of common scalar arithmetic
instructions with memory operands. I can dig that up and post it - do we
have a good way of adding tests for this kind of thing?

3. As far as I'm aware, there's no CI happening on HVF? Or has that
changed? macOS is notoriously a pain in the rear in terms of CI thanks to
its licensing, and the big cloud CI platforms tend to run it in a VM which
in turn typically doesn't support nested HVF. I've been working on an
on-prem solution to provisioning bare-metal Macs to run clean-slate OS
images for CI. This has been a side project though and I haven't had the
resources to focus on that project to see it through. It might be possible
to do this in the cloud on Amazon's EC2 Mac Minis as well, but those aren't
exactly cheap.



> > Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
> > Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> > ---
> >  system/vl.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
>
> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
>
>
> With regards,
> Daniel
> --
> |: https://berrange.com      -o-
> https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-
> https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-
> https://www.instagram.com/dberrange :|
>
>
>
diff mbox series

Patch

diff --git a/system/vl.c b/system/vl.c
index 0843b7ab49b..e20391281f8 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -2379,7 +2379,6 @@  static void configure_accelerators(const char *progname)
             /* Select the default accelerator */
             bool have_tcg = accel_find("tcg");
             bool have_kvm = accel_find("kvm");
-            bool have_hvf = accel_find("hvf");
 
             if (have_tcg && have_kvm) {
                 if (g_str_has_suffix(progname, "kvm")) {
@@ -2390,10 +2389,10 @@  static void configure_accelerators(const char *progname)
                 }
             } else if (have_kvm) {
                 accelerators = "kvm";
+            } else if (accel_find("hvf")) {
+                accelerators = "hvf";
             } else if (have_tcg) {
                 accelerators = "tcg";
-            } else if (have_hvf) {
-                accelerators = "hvf";
             } else {
                 error_report("No accelerator selected and"
                              " no default accelerator available");