Message ID | 1384204922-8250-1-git-send-email-peter.maydell@linaro.org |
---|---|
State | Accepted |
Headers | show |
On 11.11.13 22:22, Peter Maydell wrote: > Fix build failures with clang when KVM is not enabled by > providing a stub version of kvm_arch_get_supported_cpuid(). > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > --- > I wouldn't be surprised if this also affected debug gcc > builds with KVM disabled, but I haven't checked. I can confirm the patch below fixes the clang link issue here. Also, the gcc debug build does work. Thanks a lot! Andreas > > Incidentally, since this is an x86 specific function its > prototype should be moved into target-i386/kvm_i386.h, but > that's a separate patch. > > target-i386/kvm-stub.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/target-i386/kvm-stub.c b/target-i386/kvm-stub.c > index 11429c4..18fe938 100644 > --- a/target-i386/kvm-stub.c > +++ b/target-i386/kvm-stub.c > @@ -16,3 +16,9 @@ bool kvm_allows_irq0_override(void) > { > return 1; > } > + > +uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function, > + uint32_t index, int reg) > +{ > + return 0; > +} >
Il 11/11/2013 22:22, Peter Maydell ha scritto: > Fix build failures with clang when KVM is not enabled by > providing a stub version of kvm_arch_get_supported_cpuid(). > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> No, please don't. We are already relying on dead code elimination for KVM code (I didn't introduce the idiom), even in very similar code like this one: if (kvm_enabled() && cpu->enable_pmu) { KVMState *s = cs->kvm_state; *eax = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EAX); *ebx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EBX); *ecx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_ECX); *edx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EDX); } else { *eax = 0; *ebx = 0; *ecx = 0; *edx = 0; } This is the first time that it breaks, and only on clang. So this really points at the very least at a compiler quirk (though perhaps not a bug in the formal sense). The point of kvm-stub.c is not to work around compiler quirks, it is to avoid peppering the code with kvm_enabled(). Adding a stub is wrong if the return code makes no sense (as is the case here). > I wouldn't be surprised if this also affected debug gcc > builds with KVM disabled, but I haven't checked. No, it doesn't affect GCC. See Andreas's bug report. Is it a bug or a feature? Having some kind of -O0 dead-code elimination is definitely a feature (http://gcc.gnu.org/ml/gcc-patches/2003-03/msg02443.html). That first implementation already handled "||" and "&&" just fine, though it didn't handle "break"/"continue"/"goto" in the clauses of the "if" statement. I think it would be considered a regression if this ceased to work, but of course I may be wrong. I am okay with Andreas's patch of course, but it would also be fine with me to split the "if" in two, each with its own separate break statement. Since it only affects debug builds, there is no hurry to fix this in 1.7 if the approach cannot be agreed with. > Incidentally, since this is an x86 specific function its > prototype should be moved into target-i386/kvm_i386.h, but > that's a separate patch. Yes, this I obviously agree with. Paolo > target-i386/kvm-stub.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/target-i386/kvm-stub.c b/target-i386/kvm-stub.c > index 11429c4..18fe938 100644 > --- a/target-i386/kvm-stub.c > +++ b/target-i386/kvm-stub.c > @@ -16,3 +16,9 @@ bool kvm_allows_irq0_override(void) > { > return 1; > } > + > +uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function, > + uint32_t index, int reg) > +{ > + return 0; > +} >
On 11 November 2013 22:19, Paolo Bonzini <pbonzini@redhat.com> wrote: > Il 11/11/2013 22:22, Peter Maydell ha scritto: >> Fix build failures with clang when KVM is not enabled by >> providing a stub version of kvm_arch_get_supported_cpuid(). >> >> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > > No, please don't. We are already relying on dead code elimination for > KVM code (I didn't introduce the idiom), even in very similar code like > this one: > > if (kvm_enabled() && cpu->enable_pmu) { > KVMState *s = cs->kvm_state; > > *eax = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EAX); > *ebx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EBX); > *ecx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_ECX); > *edx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EDX); That's also a bug, and it's also fixed by this patch (since it's the same function). If we have other places where we're relying on dead code elimination to not provide a function definition, please point them out, because they're bugs we need to fix, ideally before they cause compilation failures. > } else { > *eax = 0; > *ebx = 0; > *ecx = 0; > *edx = 0; > } > > This is the first time that it breaks, and only on clang. So this > really points at the very least at a compiler quirk (though perhaps not > a bug in the formal sense). The point of kvm-stub.c is not to work > around compiler quirks, it is to avoid peppering the code with > kvm_enabled(). Adding a stub is wrong if the return code makes no sense > (as is the case here). Huh? The point of stub functions is to provide versions of functions which either need to return an "always fails" code, or which will never be called, but in either case this is so we can avoid peppering the code with #ifdefs. The latter category is why we have stubs which do nothing but call abort(). (If you think this stub should call abort() I'm happy to roll a new patch which does that.) >> I wouldn't be surprised if this also affected debug gcc >> builds with KVM disabled, but I haven't checked. > > No, it doesn't affect GCC. See Andreas's bug report. Is it a bug or a > feature? Having some kind of -O0 dead-code elimination is definitely a > feature (http://gcc.gnu.org/ml/gcc-patches/2003-03/msg02443.html). That patch says it is to "speed up these RTL optimizers and by allocating less memory, reduce the compiler footprint and possible memory fragmentation". So they might investigate it as a performance regression, but it's only a "make compilation faster" feature, not correctness. Code which relies on dead-code-elimination is broken. > I am okay with Andreas's patch of course, but it would also be fine with > me to split the "if" in two, each with its own separate break statement. I think Andreas's patch is a bad idea and am against it being applied. It's very obviously a random tweak aimed at a specific compiler's implementation of dead-code elimination, and it's the wrong way to fix the problem. > Since it only affects debug builds, there is no hurry to fix this in 1.7 > if the approach cannot be agreed with. ?? Debug builds should absolutely work out of the box -- if debug build fails that is IMHO a release critical bug. -- PMM
Il 11/11/2013 23:38, Peter Maydell ha scritto: > If we have other places where we're relying on dead code elimination > to not provide a function definition, please point them out, because > they're bugs we need to fix, ideally before they cause compilation > failures. I'm not sure, there are probably a few others. Linux also relies on the idiom (at least KVM does on x86). > Huh? The point of stub functions is to provide versions of functions > which either need to return an "always fails" code, or which will never > be called, but in either case this is so we can avoid peppering the > code with #ifdefs. The latter category is why we have stubs which > do nothing but call abort(). There are very few stubs that call abort(): int kvm_cpu_exec(CPUState *cpu) { abort(); } int kvm_set_signal_mask(CPUState *cpu, const sigset_t *sigset) { abort(); } Calling abort() would be marginally better than returning 0, but why defer checks to runtime when you can let the linker do them? >>> I wouldn't be surprised if this also affected debug gcc >>> builds with KVM disabled, but I haven't checked. >> >> No, it doesn't affect GCC. See Andreas's bug report. Is it a bug or a >> feature? Having some kind of -O0 dead-code elimination is definitely a >> feature (http://gcc.gnu.org/ml/gcc-patches/2003-03/msg02443.html). > > That patch says it is to "speed up these RTL optimizers and by allocating > less memory, reduce the compiler footprint and possible memory > fragmentation". So they might investigate it as a performance > regression, but it's only a "make compilation faster" feature, not > correctness. Code which relies on dead-code-elimination is broken. There's plenty of tests in the GCC testsuite that rely on DCE to test that an optimization happened; some of them at -O0 too. So it's become a GCC feature in the end. Code which relies on dead-code-elimination is not broken, it's relying on the full power of the toolchain to ensure bugs are detected as soon as possible, i.e. at build time. >> I am okay with Andreas's patch of course, but it would also be fine with >> me to split the "if" in two, each with its own separate break statement. > > I think Andreas's patch is a bad idea and am against it being > applied. It's very obviously a random tweak aimed at a specific > compiler's implementation of dead-code elimination, and it's the > wrong way to fix the problem. It's very obviously a random tweak aimed at a specific compiler's bug in dead-code elimination, I'm not denying that. But the same compiler feature is being exploited elsewhere. >> Since it only affects debug builds, there is no hurry to fix this in 1.7 >> if the approach cannot be agreed with. > > ?? Debug builds should absolutely work out of the box -- if > debug build fails that is IMHO a release critical bug. Debug builds for qemu-system-{i386,x86_64} with clang on systems other than x86/Linux. Paolo
On Mon, Nov 11, 2013 at 3:11 PM, Paolo Bonzini <pbonzini@redhat.com> wrote: > Il 11/11/2013 23:38, Peter Maydell ha scritto: >> If we have other places where we're relying on dead code elimination >> to not provide a function definition, please point them out, because >> they're bugs we need to fix, ideally before they cause compilation >> failures. > > I'm not sure, there are probably a few others. Linux also relies on the > idiom (at least KVM does on x86). And they are there because it's a useful tool. >> Huh? The point of stub functions is to provide versions of functions >> which either need to return an "always fails" code, or which will never >> be called, but in either case this is so we can avoid peppering the >> code with #ifdefs. The latter category is why we have stubs which >> do nothing but call abort(). > > There are very few stubs that call abort(): > > int kvm_cpu_exec(CPUState *cpu) > { > abort(); > } > > int kvm_set_signal_mask(CPUState *cpu, const sigset_t *sigset) > { > abort(); > } > > Calling abort() would be marginally better than returning 0, but why > defer checks to runtime when you can let the linker do them? Exactly. >>>> I wouldn't be surprised if this also affected debug gcc >>>> builds with KVM disabled, but I haven't checked. >>> >>> No, it doesn't affect GCC. See Andreas's bug report. Is it a bug or a >>> feature? Having some kind of -O0 dead-code elimination is definitely a >>> feature (http://gcc.gnu.org/ml/gcc-patches/2003-03/msg02443.html). >> >> That patch says it is to "speed up these RTL optimizers and by allocating >> less memory, reduce the compiler footprint and possible memory >> fragmentation". So they might investigate it as a performance >> regression, but it's only a "make compilation faster" feature, not >> correctness. Code which relies on dead-code-elimination is broken. > > There's plenty of tests in the GCC testsuite that rely on DCE to test > that an optimization happened; some of them at -O0 too. So it's become > a GCC feature in the end. > > Code which relies on dead-code-elimination is not broken, it's relying > on the full power of the toolchain to ensure bugs are detected as soon > as possible, i.e. at build time. > >>> I am okay with Andreas's patch of course, but it would also be fine with >>> me to split the "if" in two, each with its own separate break statement. >> >> I think Andreas's patch is a bad idea and am against it being >> applied. It's very obviously a random tweak aimed at a specific >> compiler's implementation of dead-code elimination, and it's the >> wrong way to fix the problem. > > It's very obviously a random tweak aimed at a specific compiler's bug in > dead-code elimination, I'm not denying that. But the same compiler > feature is being exploited elsewhere. We're not talking about something obscure here. It's eliminating an if(0) block. There's no reason to leave an if (0) block around. The code is never reachable. >>> Since it only affects debug builds, there is no hurry to fix this in 1.7 >>> if the approach cannot be agreed with. >> >> ?? Debug builds should absolutely work out of the box -- if >> debug build fails that is IMHO a release critical bug. > > Debug builds for qemu-system-{i386,x86_64} with clang on systems other > than x86/Linux. Honestly, it's hard to treat clang as a first class target. We don't have much infrastructure around so it's not getting that much testing. We really need to figure out how we're going to do CI. FWIW, I'd rather just add -O1 for debug builds than add more stub functions. Regards, Anthony Liguori > > Paolo > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11 November 2013 23:11, Paolo Bonzini <pbonzini@redhat.com> wrote: > Il 11/11/2013 23:38, Peter Maydell ha scritto: >> If we have other places where we're relying on dead code elimination >> to not provide a function definition, please point them out, because >> they're bugs we need to fix, ideally before they cause compilation >> failures. > > I'm not sure, there are probably a few others. Linux also relies on the > idiom (at least KVM does on x86). Linux is notoriously tied to a particular compiler. I don't think QEMU should be. >> Huh? The point of stub functions is to provide versions of functions >> which either need to return an "always fails" code, or which will never >> be called, but in either case this is so we can avoid peppering the >> code with #ifdefs. The latter category is why we have stubs which >> do nothing but call abort(). > > There are very few stubs that call abort(): You missed some more in stubs/, as it happens. > Calling abort() would be marginally better than returning 0, but why > defer checks to runtime when you can let the linker do them? Because you can't guarantee that the compiler will always throw the code out. As we can see here. > Code which relies on dead-code-elimination is not broken, it's relying > on the full power of the toolchain to ensure bugs are detected as soon > as possible, i.e. at build time. If you can point me at gcc and clang documentation that says "we guarantee that we will dead-code eliminate these classes of conditional at all levels of optimization" then I'm happy that we can continue doing this and we have a clear level of what we require that we're aiming at. If you can't then we're relying on undocumented compiler behaviour and we should stop. > It's very obviously a random tweak aimed at a specific compiler's bug in > dead-code elimination, I'm not denying that. It's not a compiler bug. It's just not applying an optimization in all the cases it possibly could. "Not as optimal as theoretically possible" happens all the time for compilers. I really don't see why anybody wants to rely on undocumented compiler behaviour when the fix which means we have entirely well defined behaviour is a single function in an already existing stub file. What are the downsides here? -- PMM
Il 12/11/2013 00:21, Anthony Liguori ha scritto:
> FWIW, I'd rather just add -O1 for debug builds than add more stub functions.
That can do, too. clang works fine with -O1.
Paolo
On 11 November 2013 23:21, Anthony Liguori <anthony@codemonkey.ws> wrote: > We're not talking about something obscure here. It's eliminating an > if(0) block. No, we're not talking about a simple "if (0)" expression. What we had in this case was if (!(env->features[FEAT_1_ECX] & CPUID_EXT_XSAVE) || !kvm_enabled()) { break; } [stuff using kvm_arch_get_supported_cpuid()] [where kvm_enabled() is #defined to constant-0]. For the compiler to eliminate this we are relying on: * dead-code elimination of code following a 'break' statement in a case block * constant-folding of "something || 1" to 1 * the compiler having done enough reasoning to be sure that env is not NULL Self-evidently, not all compilers will provide all of the above. Andreas' patch swaps the order of the if() conditionals, which seems to work for a wider set of compilers, but there's no guarantee that will always work. So exactly how much do we require our compiler's constant-folding to handle? For example, unsigned char a,b,c; [...] if (a != 0 && b != 0 && c != 0 && a * a * a + b * b * b == c * c * c) { is compile-time provable to be always false; can we rely on the branch being eliminated? My position here is straightforward: the only thing we can rely on is what the compilers document that they will guarantee to do, which is nothing. I don't see that relying on dead-code elimination gains us anything significant at all, and adding an extra stub or two (that won't even be linked in if your compiler does happen to optimise) is totally trivial overhead. You and Paolo have both mentioned "doing checks at compile time" but why is this particular detail of our code worth checking in that way at the expense of reliably being able to compile? As it happens, having a stub function that returns 0 would simplify several bits of code that currently do: if (kvm_enabled() && cpu->enable_pmu) { KVMState *s = cs->kvm_state; *eax = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EAX); *ebx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EBX); *ecx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_ECX); *edx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EDX); } else { *eax = 0; *ebx = 0; *ecx = 0; *edx = 0; } because you could get rid of the else block. Or you could #ifdef CONFIG_KVM this code section, as we already do for some of the more complicated bits of code that call this function. That would work too. -- PMM
Il 12/11/2013 12:07, Peter Maydell ha scritto: > On 11 November 2013 23:21, Anthony Liguori <anthony@codemonkey.ws> wrote: >> We're not talking about something obscure here. It's eliminating an >> if(0) block. > > No, we're not talking about a simple "if (0)" expression. > What we had in this case was > if (!(env->features[FEAT_1_ECX] & CPUID_EXT_XSAVE) || !kvm_enabled()) { > break; > } > [stuff using kvm_arch_get_supported_cpuid()] > > [where kvm_enabled() is #defined to constant-0]. > > For the compiler to eliminate this we are relying on: > * dead-code elimination of code following a 'break' > statement in a case block > * constant-folding of "something || 1" to 1 > * the compiler having done enough reasoning to be > sure that env is not NULL Yes, it's not trivial, but there are simpler ways to do it. For example there is no need to make sure that env is non-NULL, only to see that "something || 1" is never zero and thus "if (x) y;" is just "(void)x; y;". This seems easier to me than DCE after "break" which clang is able to do. Indeed, NULL checks (except perhaps very simple checks like &x != NULL) are not something I'd expect the compiler to optimize out at -O0. > You and Paolo have both mentioned "doing checks > at compile time" but why is this particular > detail of our code worth checking in that way > at the expense of reliably being able to compile? > > As it happens, having a stub function that returns > 0 would simplify several bits of code that currently > do: > > if (kvm_enabled() && cpu->enable_pmu) { > KVMState *s = cs->kvm_state; > > *eax = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EAX); > *ebx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EBX); > *ecx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_ECX); > *edx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EDX); > } else { > *eax = 0; > *ebx = 0; > *ecx = 0; > *edx = 0; > } > > because you could get rid of the else block. No, you couldn't because the "else" branch runs for "kvm_enabled() && !cpu->enable_pmu" too. Relying on kvm_arch_get_supported_cpuid to return 0 would only make code harder to review, and the above example shows this fact very well. kvm_arch_get_supported_cpuid abort()-ing would be fine, but it would also make code harder to review, because you cannot rely anymore on the compiler-linker combo to filter out the most obvious cases of forgetting about TCG's existence. > Or you could #ifdef CONFIG_KVM this code section, > as we already do for some of the more complicated > bits of code that call this function. That would > work too. Yes, but it wouldn't be _right_. Most of the code below the "break" is potentially applicable to TCG, even if it is not used now. Paolo
On 12 November 2013 12:09, Paolo Bonzini <pbonzini@redhat.com> wrote: > Il 12/11/2013 12:07, Peter Maydell ha scritto: >> For the compiler to eliminate this we are relying on: >> * dead-code elimination of code following a 'break' >> statement in a case block >> * constant-folding of "something || 1" to 1 >> * the compiler having done enough reasoning to be >> sure that env is not NULL > > Yes, it's not trivial, but there are simpler ways to do it. > > For example there is no need to make sure that env is non-NULL, only to > see that "something || 1" is never zero and thus "if (x) y;" is just > "(void)x; y;". This seems easier to me than DCE after "break" which > clang is able to do. You seem to be trying to reason about what the compiler might choose to do or how it might be implemented internally. I think this is fundamentally misguided. "-O0" means "reduce compile time and make debugging produce expected results", not "reduce compile time, make debugging produce expected results and also run these two optimization passes which my codebase implicitly relies on happening". gcc currently happens to do DCE and constant-folding even at -O0 because it turns out to be faster to do that than not to; if in future the compilation-speed tradeoff swings the other way they're free to decide to not do those passes, or to do cut-down versions that fold less or eliminate less. I find this argument confusing because to me it's a completely simple choice with one "obviously right" and one "obviously wrong" approach :-( -- PMM
Il 12/11/2013 13:16, Peter Maydell ha scritto: > On 12 November 2013 12:09, Paolo Bonzini <pbonzini@redhat.com> wrote: >> Il 12/11/2013 12:07, Peter Maydell ha scritto: >>> For the compiler to eliminate this we are relying on: >>> * dead-code elimination of code following a 'break' >>> statement in a case block >>> * constant-folding of "something || 1" to 1 >>> * the compiler having done enough reasoning to be >>> sure that env is not NULL >> >> Yes, it's not trivial, but there are simpler ways to do it. >> >> For example there is no need to make sure that env is non-NULL, only to >> see that "something || 1" is never zero and thus "if (x) y;" is just >> "(void)x; y;". This seems easier to me than DCE after "break" which >> clang is able to do. > > You seem to be trying to reason about what the compiler > might choose to do or how it might be implemented internally. I'm not reasoning about that in general (I was in the context of the message you quoted). I'm saying it's *reasonable* to expect that "-O0" means "reduce compile time, make debugging produce expected results, and try (not too hard) to not break what works at -O2". It's a simple QoI argument based on the fact that people *will* switch back and forth between -O2 and -O0. Of course not everything can be kept to work, since the compilers do pretty surprising optimizations (not counting the ones that break your code of course...). But I think a limited amount of dead code elimination *should* be expected because most people are now preferring "if" to "#ifdef" for compiling out code. If -O0 does not do that, let's move debug builds to -O1. Paolo
On 12 November 2013 13:12, Paolo Bonzini <pbonzini@redhat.com> wrote: > I'm saying it's *reasonable* to expect that "-O0" means "reduce compile > time, make debugging produce expected results, and try (not too hard) to > not break what works at -O2". And that's what we've got. There's no requirement, even at -O2, to eliminate all dead code. It's just a QoI issue for optimization. If your code is busted it's busted and that might show up only at -O0 or only at -O2 (the latter in fact is pretty common because of the tendency to only do dataflow analysis at higher levels). > It's a simple QoI argument based on the > fact that people *will* switch back and forth between -O2 and -O0. Of > course not everything can be kept to work, since the compilers do pretty > surprising optimizations (not counting the ones that break your code of > course...). But I think a limited amount of dead code elimination > *should* be expected because most people are now preferring "if" to > "#ifdef" for compiling out code. If your code isn't wrong then "if (0)" works just fine for compiling out code that isn't used. In an optimized build it gets eliminated; in an non-optimized build you don't care if it gets eliminated or not, it's no big deal. The reason for using "if (0)" rather than #ifdefs is "an optimizing compiler will make this no less efficient for our release builds"; that doesn't mean you can rely on it being optimal when you've specifically asked for a non-optimizing build, and it doesn't mean you can put code in the "if (0)" that's broken. (Similarly, you can put code that's a syntax error inside #if 0, but that won't work inside an "if (0)". The solution is not to do that.) -- PMM
On Tue, Nov 12, 2013 at 02:12:56PM +0100, Paolo Bonzini wrote: > Il 12/11/2013 13:16, Peter Maydell ha scritto: > > On 12 November 2013 12:09, Paolo Bonzini <pbonzini@redhat.com> wrote: > >> Il 12/11/2013 12:07, Peter Maydell ha scritto: > >>> For the compiler to eliminate this we are relying on: > >>> * dead-code elimination of code following a 'break' > >>> statement in a case block > >>> * constant-folding of "something || 1" to 1 > >>> * the compiler having done enough reasoning to be > >>> sure that env is not NULL > >> > >> Yes, it's not trivial, but there are simpler ways to do it. > >> > >> For example there is no need to make sure that env is non-NULL, only to > >> see that "something || 1" is never zero and thus "if (x) y;" is just > >> "(void)x; y;". This seems easier to me than DCE after "break" which > >> clang is able to do. > > > > You seem to be trying to reason about what the compiler > > might choose to do or how it might be implemented internally. > > I'm not reasoning about that in general (I was in the context of the > message you quoted). > > I'm saying it's *reasonable* to expect that "-O0" means "reduce compile > time, make debugging produce expected results, and try (not too hard) to > not break what works at -O2". It's a simple QoI argument based on the > fact that people *will* switch back and forth between -O2 and -O0. Of > course not everything can be kept to work, since the compilers do pretty > surprising optimizations (not counting the ones that break your code of > course...). But I think a limited amount of dead code elimination > *should* be expected because most people are now preferring "if" to > "#ifdef" for compiling out code. > > If -O0 does not do that, let's move debug builds to -O1. > Why not enable dce with -fdce? -- Gleb.
On Tue, Nov 12, 2013 at 01:21:51PM +0000, Peter Maydell wrote: > (Similarly, > you can put code that's a syntax error inside #if 0, > but that won't work inside an "if (0)". The solution > is not to do that.) > That's the advantage of using "if (0)" instead of #if 0. You do not need to enable insane amount of options to check that your change does not break complication for any of them. -- Gleb.
Il 12/11/2013 14:23, Gleb Natapov ha scritto: >> If -O0 does not do that, let's move debug builds to -O1. > > Why not enable dce with -fdce? First, because clang doesn't have fine-tuned optimization options (at least I couldn't find them and -fdce doesn't work). Second, because most optimization options are no-ops at -O0 (try "-fdce -fdump-tree-all" with GCC. Paolo
On 12 November 2013 13:12, Paolo Bonzini <pbonzini@redhat.com> wrote:
> If -O0 does not do that, let's move debug builds to -O1.
Isn't this going to sacrifice debuggability? That also seems
like the wrong tradeoff to me.
-- PMM
On Tue, Nov 12, 2013 at 02:57:49PM +0100, Paolo Bonzini wrote: > Il 12/11/2013 14:23, Gleb Natapov ha scritto: > >> If -O0 does not do that, let's move debug builds to -O1. > > > > Why not enable dce with -fdce? > > First, because clang doesn't have fine-tuned optimization options (at > least I couldn't find them and -fdce doesn't work). > -O1 then for clang. > Second, because most optimization options are no-ops at -O0 (try "-fdce > -fdump-tree-all" with GCC. > Strange. Is this by design? We can do -O1 and bunch of "-fno-" to disable most of optimizations -O1 enables, but this is ugly. -- Gleb.
On 12 November 2013 14:09, Gleb Natapov <gleb@redhat.com> wrote: > On Tue, Nov 12, 2013 at 02:57:49PM +0100, Paolo Bonzini wrote: >> Il 12/11/2013 14:23, Gleb Natapov ha scritto: >> >> If -O0 does not do that, let's move debug builds to -O1. >> > >> > Why not enable dce with -fdce? >> >> First, because clang doesn't have fine-tuned optimization options (at >> least I couldn't find them and -fdce doesn't work). >> > -O1 then for clang. No thanks. --enable-debug should give the maximum debuggability. That means -O0. -- PMM
Il 12/11/2013 15:09, Gleb Natapov ha scritto: > On Tue, Nov 12, 2013 at 02:57:49PM +0100, Paolo Bonzini wrote: >> Il 12/11/2013 14:23, Gleb Natapov ha scritto: >>>> If -O0 does not do that, let's move debug builds to -O1. >>> >>> Why not enable dce with -fdce? >> >> First, because clang doesn't have fine-tuned optimization options (at >> least I couldn't find them and -fdce doesn't work). > > -O1 then for clang. We can even test in configure for the exact optimizations we want, in fact. But I think -O1 doesn't sacrifice debuggability that much: http://www.redhat.com/magazine/011sep05/features/gcc/ -O0 Barely any transformations are done to the code, just code generation. At this level, the target code can be debugged with no loss of information. -O1 Some transformations that preserve execution ordering. Debuggability of the generated code is hardly affected. User variables should not disappear and function inlining is not done. -O2 More aggressive transformations that may affect execution ordering and usually provide faster code. Debuggability may be somewhat compromised by disappearing user variables and function bodies. Not very recent, but things have remained roughly the same and gdb also has improved. Hmm... I just found out that GCC has a shiny new "-Og" option to optimize for debuggability and still producing good code. Using "-Og" if it is present, and -O1 otherwise, seems like a good idea to me for 1.8. For 1.7 it can just be -O1. >> Second, because most optimization options are no-ops at -O0 (try "-fdce >> -fdump-tree-all" with GCC. >> > Strange. Is this by design? We can do -O1 and bunch of "-fno-" to > disable most of optimizations -O1 enables, but this is ugly. Yes, some data structures (I'm not up to date as to which exactly) are not even built at -O0 to make compilation faster, and they're required for most optimizations. Paolo
On 12 November 2013 14:57, Paolo Bonzini <pbonzini@redhat.com> wrote: > Il 12/11/2013 15:09, Gleb Natapov ha scritto: >> On Tue, Nov 12, 2013 at 02:57:49PM +0100, Paolo Bonzini wrote: >>> Il 12/11/2013 14:23, Gleb Natapov ha scritto: >>>>> If -O0 does not do that, let's move debug builds to -O1. >>>> >>>> Why not enable dce with -fdce? >>> >>> First, because clang doesn't have fine-tuned optimization options (at >>> least I couldn't find them and -fdce doesn't work). >> >> -O1 then for clang. > > We can even test in configure for the exact optimizations we want, in > fact. But I think -O1 doesn't sacrifice debuggability that much: I'm afraid I still don't see why you'd want to sacrifice it at all, when the alternative is "provide a three line stub function in a file we already have all the build machinery to compile in the config where it's needed". I just don't see why you'd worry about the fact that there's no longer a compile error if you try to call this obviously kvm specific function in a non-kvm-enabled code path, when we already have large numbers of kvm-specific functions that have stubs (and when in general, eg QOM APIs, we seem to be entirely happy to have things be runtime errors rather than compile time). To me the benefit seems entirely on the side of "have standard compliant code" rather than "attempt to find various odd workarounds for the fact that some compilers will correctly complain about our code". -- PMM
Il 12/11/2013 16:13, Peter Maydell ha scritto: >>> >> >>> >> -O1 then for clang. >> > >> > We can even test in configure for the exact optimizations we want, in >> > fact. But I think -O1 doesn't sacrifice debuggability that much: > I'm afraid I still don't see why you'd want to sacrifice it > at all, Is this FUD or do you have examples of bad debuggability of -O1 code? Personally, I've not even used -O0 for several years. -O2 debuggability is still awful but has improved a lot. If it's not enough, 99% of the time it means that tracing or printf are a better tool for the bug. > when the alternative is "provide a three line stub > function in a file we already have all the build machinery > to compile in the config where it's needed". I just don't > see why you'd worry about the fact that there's no longer > a compile error if you try to call this obviously kvm > specific function in a non-kvm-enabled code path, when > we already have large numbers of kvm-specific functions > that have stubs Most of these stubs do _not_ abort(), they return a sensible error code or should be no-ops in the non-KVM case. > (and when in general, eg QOM APIs, we seem > to be entirely happy to have things be runtime errors rather > than compile time). We're far from having consensus on that, indeed. Paolo
On 12 November 2013 15:21, Paolo Bonzini <pbonzini@redhat.com> wrote: > Il 12/11/2013 16:13, Peter Maydell ha scritto: >>>> >> >>>> >> -O1 then for clang. >>> > >>> > We can even test in configure for the exact optimizations we want, in >>> > fact. But I think -O1 doesn't sacrifice debuggability that much: >> I'm afraid I still don't see why you'd want to sacrifice it >> at all, > > Is this FUD or do you have examples of bad debuggability of -O1 code? The clang manpage says specifically "Note that Clang debug information works best at -O0. ", and I see no reason to disbelieve it. In particular, they don't say "we definitely will never add an optimization to -O1 that makes the debug info much worse". -- PMM
Il 12/11/2013 16:32, Peter Maydell ha scritto: > > Is this FUD or do you have examples of bad debuggability of -O1 code? > > The clang manpage says specifically "Note that Clang debug > information works best at -O0. ", and I see no reason to > disbelieve it. In particular, they don't say "we definitely > will never add an optimization to -O1 that makes the debug > info much worse". This doesn't quite answer my question. It looks like another bug that should be reported to clang. "-O1 is somewhere between -O0 and -O2" (quoted from the man page) is a joke, it's not documentation. Every time I look at clang, it seems to me that they are still relying on the "buzz" from their "better syntax errors" blog posts (undeserved these days), and from clang-analyzer (deserved). I don't really see a reason why QEMU should give clang more weight than Windows or Mac OS X. Paolo
On 12 November 2013 15:58, Paolo Bonzini <pbonzini@redhat.com> wrote: > I don't really see a reason why QEMU should give clang more weight than > Windows or Mac OS X. I'm not asking for more weight (and actually my main reason for caring about clang is exactly MacOSX). I'm just asking that when a bug is reported whose underlying cause is "we don't work on clang because we're relying on undocumented behaviour of gcc" with an attached patch that fixes this by not relying on the undocumented behaviour, that we apply the patch rather than saying "why do we care about clang"... This seems to me to be a win-win situation: * we improve our code by not relying on undocumented implentation specifics * we work on a platform that, while not a primary platform, is at least supported in the codebase and has people who fix it when it breaks -- PMM
On Tue, Nov 12, 2013 at 8:08 AM, Peter Maydell <peter.maydell@linaro.org> wrote: > On 12 November 2013 15:58, Paolo Bonzini <pbonzini@redhat.com> wrote: >> I don't really see a reason why QEMU should give clang more weight than >> Windows or Mac OS X. > > I'm not asking for more weight (and actually my main > reason for caring about clang is exactly MacOSX). I'm > just asking that when a bug is reported whose underlying > cause is "we don't work on clang because we're relying on > undocumented behaviour of gcc" with an attached patch that > fixes this by not relying on the undocumented behaviour, > that we apply the patch rather than saying "why do we > care about clang"... QEMU has always been intimately tied to GCC. Heck, it all started as a giant GCC hack relying on entirely undocumented behavior (dyngen's disassembly of functions). There's nothing intrinsically bad about being tied to GCC. If you were making argument that we could do it a different way and the result would be as nice or nicer, then it wouldn't be a discussion. But if supporting clang means we have to remove useful things, then it's always going to be an uphill battle. In this case, the whole discussion is a bit silly. Have you actually tried -O1 under a debugger with clang? Is it noticably worse than -O0? I find QEMU extremely difficult to use an interactive debugger on anyway. I doubt the difference between -O0 and -O1 is even close to the breaking point between usability under a debugger... Regards, Anthony Liguori > This seems to me to be a win-win situation: > * we improve our code by not relying on undocumented > implentation specifics > * we work on a platform that, while not a primary > platform, is at least supported in the codebase and > has people who fix it when it breaks > > -- PMM
On 12 November 2013 17:04, Anthony Liguori <anthony@codemonkey.ws> wrote: > QEMU has always been intimately tied to GCC. Heck, it all started as > a giant GCC hack relying on entirely undocumented behavior (dyngen's > disassembly of functions). It has historically. Blue Swirl put in a lot of work to remove those dependencies. I'd rather we didn't let them drift back in again, especially for really small reasons. > There's nothing intrinsically bad about being tied to GCC. If you > were making argument that we could do it a different way and the > result would be as nice or nicer, then it wouldn't be a discussion. I really think this patch is fundamentally nicer than the current code base, even if we didn't care about clang. I think relying on dead-code-elimination happening for us to compile is ugly. > But if supporting clang means we have to remove useful things, then > it's always going to be an uphill battle. I think the fundamental disagreement here is that I don't see this patch as removing anything useful. > In this case, the whole discussion is a bit silly. I'd agree with that :-) -- PMM
On 11/13/2013 03:04 AM, Anthony Liguori wrote: > On Tue, Nov 12, 2013 at 8:08 AM, Peter Maydell <peter.maydell@linaro.org> wrote: >> On 12 November 2013 15:58, Paolo Bonzini <pbonzini@redhat.com> wrote: >>> I don't really see a reason why QEMU should give clang more weight than >>> Windows or Mac OS X. >> >> I'm not asking for more weight (and actually my main >> reason for caring about clang is exactly MacOSX). I'm >> just asking that when a bug is reported whose underlying >> cause is "we don't work on clang because we're relying on >> undocumented behaviour of gcc" with an attached patch that >> fixes this by not relying on the undocumented behaviour, >> that we apply the patch rather than saying "why do we >> care about clang"... > > QEMU has always been intimately tied to GCC. Heck, it all started as > a giant GCC hack relying on entirely undocumented behavior (dyngen's > disassembly of functions). > > There's nothing intrinsically bad about being tied to GCC. If you > were making argument that we could do it a different way and the > result would be as nice or nicer, then it wouldn't be a discussion. > > But if supporting clang means we have to remove useful things, then > it's always going to be an uphill battle. > > In this case, the whole discussion is a bit silly. Have you actually For what it's worth, I think BOTH of the patches that have been posted should be applied. That is, the patch that does (X || 1) -> (1 || X), and the patch that adds the stub. Frankly I'd have thought this was obvious and I'm a bit dismayed about how long this thread has continued. As far as GCC is concerned, we consider trivial dead code elimination like this to be a quality of implementation issue. We would never remove it, even from -O0. We can't guarantee how successful we can be, but that's what bug reports and regression tests are for. r~
On 12 November 2013 18:54, Richard Henderson <rth@twiddle.net> wrote: > For what it's worth, I think BOTH of the patches that have been posted > should be applied. That is, the patch that does (X || 1) -> (1 || X), > and the patch that adds the stub. I think that makes sense and would be happy with that as a resolution. thanks -- PMM
Am 12.11.2013 19:57, schrieb Peter Maydell: > On 12 November 2013 18:54, Richard Henderson <rth@twiddle.net> wrote: >> For what it's worth, I think BOTH of the patches that have been posted >> should be applied. That is, the patch that does (X || 1) -> (1 || X), >> and the patch that adds the stub. > I think that makes sense and would be happy with that as a resolution. > > thanks > -- PMM > +1. By the way: I added a stub for kvm_arch_get_supported_cpuid() in my kvm patch, too (see http://patchwork.ozlabs.org/patch/260512/). It called g_assert_not_reached()instead of returning 0.Maybe this can be added in a later patch. Regards, Stefan
Il 12/11/2013 19:54, Richard Henderson ha scritto: > For what it's worth, I think BOTH of the patches that have been posted > should be applied. That is, the patch that does (X || 1) -> (1 || X), > and the patch that adds the stub. > > Frankly I'd have thought this was obvious It's not that obvious to me. If you add the stub, the patch that reorders operands is not necessary. If you reorder operands, the stub is not necessary. The patch that does (X || 1) -> (1 || X) is unnecessary as a microoptimization, since this code basically runs once at startup. The code is also a little bit less clear with the reordered operands, but perhaps that's just me because I wrote the code that way. (Splitting the if in two would also make sense, and would not affect clarity). Why should both be applied? Paolo
On 11/13/2013 08:53 AM, Paolo Bonzini wrote: > Il 12/11/2013 19:54, Richard Henderson ha scritto: >> For what it's worth, I think BOTH of the patches that have been posted >> should be applied. That is, the patch that does (X || 1) -> (1 || X), >> and the patch that adds the stub. >> >> Frankly I'd have thought this was obvious > > It's not that obvious to me. > > If you add the stub, the patch that reorders operands is not necessary. > If you reorder operands, the stub is not necessary. > > The patch that does (X || 1) -> (1 || X) is unnecessary as a > microoptimization, since this code basically runs once at startup. The > code is also a little bit less clear with the reordered operands, but > perhaps that's just me because I wrote the code that way. (Splitting > the if in two would also make sense, and would not affect clarity). > > Why should both be applied? It's worth working around the clang missed optimization, if for nothing else than avoiding the noise of the bugs that would otherwise be filed against the release. I think it's also worthwhile to implement the kvm api in kvm-stub.c, unnecessary or not. If you really want compile-time feedback on those that ought to have been removed by optimization, you could elide them from the stub file depending on ifndef __OPTIMIZE__. r~
Il 13/11/2013 03:27, Richard Henderson ha scritto: > I think it's also worthwhile to implement the kvm api in kvm-stub.c, > unnecessary or not. If you really want compile-time feedback on those that > ought to have been removed by optimization, you could elide them from the stub > file depending on ifndef __OPTIMIZE__. Good idea. Peter, can you do that? Paolo
On Wed, Nov 13, 2013 at 12:27:10PM +1000, Richard Henderson wrote: > On 11/13/2013 08:53 AM, Paolo Bonzini wrote: > > Il 12/11/2013 19:54, Richard Henderson ha scritto: > >> For what it's worth, I think BOTH of the patches that have been posted > >> should be applied. That is, the patch that does (X || 1) -> (1 || X), > >> and the patch that adds the stub. > >> > >> Frankly I'd have thought this was obvious > > > > It's not that obvious to me. > > > > If you add the stub, the patch that reorders operands is not necessary. > > If you reorder operands, the stub is not necessary. > > > > The patch that does (X || 1) -> (1 || X) is unnecessary as a > > microoptimization, since this code basically runs once at startup. The > > code is also a little bit less clear with the reordered operands, but > > perhaps that's just me because I wrote the code that way. (Splitting > > the if in two would also make sense, and would not affect clarity). > > > > Why should both be applied? > > It's worth working around the clang missed optimization, if for nothing else > than avoiding the noise of the bugs that would otherwise be filed against the > release. > > I think it's also worthwhile to implement the kvm api in kvm-stub.c, > unnecessary or not. If you really want compile-time feedback on those that > ought to have been removed by optimization, you could elide them from the stub > file depending on ifndef __OPTIMIZE__. > Sounds like a nice compromise. -- Gleb.
On 13 November 2013 07:25, Paolo Bonzini <pbonzini@redhat.com> wrote: > Il 13/11/2013 03:27, Richard Henderson ha scritto: >> I think it's also worthwhile to implement the kvm api in kvm-stub.c, >> unnecessary or not. If you really want compile-time feedback on those that >> ought to have been removed by optimization, you could elide them from the stub >> file depending on ifndef __OPTIMIZE__. > > Good idea. Peter, can you do that? I still think this serves no useful purpose and we'd be better off without such an ifndef, but I'd rather have a guarded stub than no stub, so I'll send a patch in a moment. thanks -- PMM
diff --git a/target-i386/kvm-stub.c b/target-i386/kvm-stub.c index 11429c4..18fe938 100644 --- a/target-i386/kvm-stub.c +++ b/target-i386/kvm-stub.c @@ -16,3 +16,9 @@ bool kvm_allows_irq0_override(void) { return 1; } + +uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function, + uint32_t index, int reg) +{ + return 0; +}
Fix build failures with clang when KVM is not enabled by providing a stub version of kvm_arch_get_supported_cpuid(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> --- I wouldn't be surprised if this also affected debug gcc builds with KVM disabled, but I haven't checked. Incidentally, since this is an x86 specific function its prototype should be moved into target-i386/kvm_i386.h, but that's a separate patch. target-i386/kvm-stub.c | 6 ++++++ 1 file changed, 6 insertions(+)