| Message ID | 20211123205729.2205806-3-alex.bennee@linaro.org |
|---|---|
| State | New |
| Series | more tcg, plugin, test and build fixes |
On 11/23/21 9:57 PM, Alex Bennée wrote:
> Generally when we set cpu->cflags_next_tb it is because we want to
> carefully control the execution of the next TB. Currently there is a
> race that causes cflags_next_tb to get ignored if an IRQ is processed
> before we execute any actual instructions.
>
> To avoid this we introduce a new compiler flag: CF_NOIRQ to suppress
> this check in the generated code so we know we will definitely execute
> the next block.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Cc: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru>
> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/245
> ---
>  include/exec/exec-all.h   |  1 +
>  include/exec/gen-icount.h | 21 +++++++++++++++++----
>  accel/tcg/cpu-exec.c      | 14 ++++++++++++++
>  3 files changed, 32 insertions(+), 4 deletions(-)
>
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 6bb2a0f7ec..35d8e93976 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -503,6 +503,7 @@ struct TranslationBlock {
>  #define CF_USE_ICOUNT    0x00020000
>  #define CF_INVALID       0x00040000 /* TB is stale. Set with @jmp_lock held */
>  #define CF_PARALLEL      0x00080000 /* Generate code for a parallel context */
> +#define CF_NOIRQ         0x00100000 /* Generate an uninterruptible TB */
>  #define CF_CLUSTER_MASK  0xff000000 /* Top 8 bits are cluster ID */
>  #define CF_CLUSTER_SHIFT 24
>
> diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
> index 610cba58fe..c57204ddad 100644
> --- a/include/exec/gen-icount.h
> +++ b/include/exec/gen-icount.h
> @@ -21,7 +21,6 @@ static inline void gen_tb_start(const TranslationBlock *tb)
>  {
>      TCGv_i32 count;
>
> -    tcg_ctx->exitreq_label = gen_new_label();
>      if (tb_cflags(tb) & CF_USE_ICOUNT) {
>          count = tcg_temp_local_new_i32();
>      } else {
> @@ -42,7 +41,19 @@ static inline void gen_tb_start(const TranslationBlock *tb)
>          icount_start_insn = tcg_last_op();
>      }
>
> -    tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, tcg_ctx->exitreq_label);
> +    /*
> +     * Emit the check against icount_decr.u32 to see if we should exit
> +     * unless we suppress the check with CF_NOIRQ. If we are using
> +     * icount and have suppressed interruption the higher level code
> +     * should have ensured we don't run more instructions than the
> +     * budget.
> +     */
> +    if (tb_cflags(tb) & CF_NOIRQ) {
> +        tcg_ctx->exitreq_label = NULL;
> +    } else {
> +        tcg_ctx->exitreq_label = gen_new_label();
> +        tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, tcg_ctx->exitreq_label);
> +    }
>
>      if (tb_cflags(tb) & CF_USE_ICOUNT) {
>          tcg_gen_st16_i32(count, cpu_env,
> @@ -74,8 +85,10 @@ static inline void gen_tb_end(const TranslationBlock *tb, int num_insns)
>                            tcgv_i32_arg(tcg_constant_i32(num_insns)));
>      }
>
> -    gen_set_label(tcg_ctx->exitreq_label);
> -    tcg_gen_exit_tb(tb, TB_EXIT_REQUESTED);
> +    if (tcg_ctx->exitreq_label) {
> +        gen_set_label(tcg_ctx->exitreq_label);
> +        tcg_gen_exit_tb(tb, TB_EXIT_REQUESTED);
> +    }
>  }
>
>  #endif

Split patch here, I think.
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 9cb892e326..9e3ed42ceb 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -721,6 +721,15 @@ static inline bool need_replay_interrupt(int interrupt_request)
>  static inline bool cpu_handle_interrupt(CPUState *cpu,
>                                          TranslationBlock **last_tb)
>  {
> +    /*
> +     * If we have special cflags lets not get distracted with IRQs. We
> +     * shall exit the loop as soon as the next TB completes what it
> +     * needs to do.
> +     */

We will *probably* exit the loop, I think.

With watchpoints, we expect to perform the same memory operation, which
is expected to hit the watchpoint_hit check in cpu_check_watchpoint,
which will raise CPU_INTERRUPT_DEBUG.

With SMC, we flush all TBs from the current page, re-execute one insn,
and then (probably) have to exit to generate the next tb.

With icount, in cpu_loop_exec_tb, we have no idea what's coming; we only
know that we want no more than N insns to execute.

None of which directly exit the loop -- we need the IRQ check at the
beginning of the *next* TB to exit the loop.

If we want to force an exit from the loop, we need CF_NO_GOTO_TB |
CF_NO_GOTO_PTR. Which is probably a good idea, at least for watchpoints;
exit_tb is the fastest way out of the TB to recognize the debug
interrupt.

The icount usage I find a bit odd. I don't think that we should suppress
an IRQ in that case -- we can have up to 510 insns outstanding on
icount_budget, which is clearly far too many to have IRQs disabled. I
think we should not use cflags_next_tb for this at all, but should apply
the icount limit later somehow, because an IRQ *could* be recognized
immediately, with the first TB of the interrupt handler running with
limited budget, and the icount tick being recognized at that point.

> +     * As we don't want this special TB being interrupted by
> +     * some sort of asynchronous event we apply CF_NOIRQ to
> +     * disable the usual event checking.
> +     */
>      cflags = cpu->cflags_next_tb;
>      if (cflags == -1) {
>          cflags = curr_cflags(cpu);
>      } else {
> +        cflags |= CF_NOIRQ;
>          cpu->cflags_next_tb = -1;
>      }

Is it clearer to add NOIRQ here, or back where we set cflags_next_tb in
the first place?

r~
Richard Henderson <richard.henderson@linaro.org> writes:

> On 11/23/21 9:57 PM, Alex Bennée wrote:
>> Generally when we set cpu->cflags_next_tb it is because we want to
>> carefully control the execution of the next TB. Currently there is a
>> race that causes cflags_next_tb to get ignored if an IRQ is processed
>> before we execute any actual instructions.
>> To avoid this we introduce a new compiler flag: CF_NOIRQ to suppress
>> this check in the generated code so we know we will definitely execute
>> the next block.

<snip>

>> -    gen_set_label(tcg_ctx->exitreq_label);
>> -    tcg_gen_exit_tb(tb, TB_EXIT_REQUESTED);
>> +    if (tcg_ctx->exitreq_label) {
>> +        gen_set_label(tcg_ctx->exitreq_label);
>> +        tcg_gen_exit_tb(tb, TB_EXIT_REQUESTED);
>> +    }
>>  }
>>  #endif
>
> Split patch here, I think.

Not including the header to cpu_handle_interrupt?
>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>> index 9cb892e326..9e3ed42ceb 100644
>> --- a/accel/tcg/cpu-exec.c
>> +++ b/accel/tcg/cpu-exec.c
>> @@ -721,6 +721,15 @@ static inline bool need_replay_interrupt(int interrupt_request)
>>  static inline bool cpu_handle_interrupt(CPUState *cpu,
>>                                          TranslationBlock **last_tb)
>>  {
>> +    /*
>> +     * If we have special cflags lets not get distracted with IRQs. We
>> +     * shall exit the loop as soon as the next TB completes what it
>> +     * needs to do.
>> +     */
>
> We will *probably* exit the loop, I think.
>
> With watchpoints, we expect to perform the same memory operation,
> which is expected to hit the watchpoint_hit check in
> cpu_check_watchpoint, which will raise CPU_INTERRUPT_DEBUG.
>
> With SMC, we flush all TBs from the current page, re-execute one insn,
> and then (probably) have to exit to generate the next tb.
>
> With icount, in cpu_loop_exec_tb, we have no idea what's coming; we
> only know that we want no more than N insns to execute.

I think technically we do, because all asynchronous interrupts are tied
to the icount (which is part of the budget calculation -
icount_get_limit).

In theory we could drop the interrupt check altogether in icount mode
because we should always end and exit to the outer loop when a timer is
going to expire.

> None of which directly exit the loop -- we need the IRQ check at the
> beginning of the *next* TB to exit the loop.
>
> If we want to force an exit from the loop, we need CF_NO_GOTO_TB |
> CF_NO_GOTO_PTR. Which is probably a good idea, at least for
> watchpoints; exit_tb is the fastest way out of the TB to recognize the
> debug interrupt.
>
> The icount usage I find a bit odd. I don't think that we should
> suppress an IRQ in that case -- we can have up to 510 insns
> outstanding on icount_budget, which is clearly far too many to have
> IRQs disabled. I think we should not use cflags_next_tb for this at
> all, but should apply the icount limit later somehow, because an IRQ
> *could* be recognized immediately, with the first TB of the interrupt
> handler running with limited budget, and the icount tick being
> recognized at that point.

I wonder what would happen if we checked u16.high in icount mode? No
timer should ever set it - although I guess it could get set during an
IO operation.

Perhaps really all icount exit checks should be done at the end of
blocks? I suspect that breaks too many assumptions in the rest of the
code.

>> +     * As we don't want this special TB being interrupted by
>> +     * some sort of asynchronous event we apply CF_NOIRQ to
>> +     * disable the usual event checking.
>> +     */
>>      cflags = cpu->cflags_next_tb;
>>      if (cflags == -1) {
>>          cflags = curr_cflags(cpu);
>>      } else {
>> +        cflags |= CF_NOIRQ;
>>          cpu->cflags_next_tb = -1;
>>      }
>
> Is it clearer to add NOIRQ here, or back where we set cflags_next_tb
> in the first place?

Are there cases of setting cpu->cflags_next_tb which we are happy to get
preempted by asynchronous events? I guess in the SMC case it wouldn't
matter because when we get back from the IRQ things get reset?

> r~
On 11/24/21 11:24 AM, Alex Bennée wrote:
> Richard Henderson <richard.henderson@linaro.org> writes:
>
>> On 11/23/21 9:57 PM, Alex Bennée wrote:
>>> Generally when we set cpu->cflags_next_tb it is because we want to
>>> carefully control the execution of the next TB. Currently there is a
>>> race that causes cflags_next_tb to get ignored if an IRQ is processed
>>> before we execute any actual instructions.
>>> To avoid this we introduce a new compiler flag: CF_NOIRQ to suppress
>>> this check in the generated code so we know we will definitely execute
>>> the next block.

<snip>

>> Split patch here, I think.
>
> Not including the header to cpu_handle_interrupt?

Correct. Introduce CF_NOIRQ without using it yet.

>>> With icount, in cpu_loop_exec_tb, we have no idea what's coming; we
>>> only know that we want no more than N insns to execute.
>
> I think technically we do because all asynchronous interrupts are tied
> to the icount (which is part of the budget calculation -
> icount_get_limit).

Are you sure that's plain icount and not replay?

In icount_get_limit we talk about timers, not any other asynchronous
interrupt, like a keyboard press.

> In theory we could drop the interrupt check altogether in icount mode
> because we should always end and exit to the outer loop when a timer is
> going to expire.

But we know nothing about synchronous exceptions or anything else.

> I wonder what would happen if we checked u16.high in icount mode? No
> timer should ever set it - although I guess it could get set during an
> IO operation.

Uh, we do, via u32? I'm not sure what you're saying here.

> Perhaps really all icount exit checks should be done at the end of
> blocks? I suspect that breaks too many assumptions in the rest of the
> code.

There are multiple exits at the end of the block, which is why we do the
check at the beginning of the next block. Besides, we have to check that
the block we're about to execute is within budget.

> Are there cases of setting cpu->cflags_next_tb which we are happy to get
> preempted by asynchronous events?

Well, icount.

> I guess in the SMC case it wouldn't
> matter because when we get back from the IRQ things get reset?

SMC probably would work with an interrupt, but we'd wind up having to
repeat the process of flushing all TBs on the page, so we might as well
perform the one store and get it over with.

r~
Richard Henderson <richard.henderson@linaro.org> writes:

> On 11/24/21 11:24 AM, Alex Bennée wrote:
>> Richard Henderson <richard.henderson@linaro.org> writes:
>>
>>> On 11/23/21 9:57 PM, Alex Bennée wrote:
>>>> Generally when we set cpu->cflags_next_tb it is because we want to
>>>> carefully control the execution of the next TB. Currently there is a
>>>> race that causes cflags_next_tb to get ignored if an IRQ is processed
>>>> before we execute any actual instructions.
>>>> To avoid this we introduce a new compiler flag: CF_NOIRQ to suppress
>>>> this check in the generated code so we know we will definitely execute
>>>> the next block.

<snip>

>>> Split patch here, I think.
>> Not including the header to cpu_handle_interrupt?
>
> Correct. Introduce CF_NOIRQ without using it yet.
>
>>> With icount, in cpu_loop_exec_tb, we have no idea what's coming; we
>>> only know that we want no more than N insns to execute.
>>
>> I think technically we do because all asynchronous interrupts are
>> tied to the icount (which is part of the budget calculation -
>> icount_get_limit).
>
> Are you sure that's plain icount and not replay?
> In icount_get_limit we talk about timers, not any other asynchronous
> interrupt, like a keyboard press.

Hmm right - and I guess other I/O during record of icount. I guess it's
only fully deterministic (outside of replay) if it's totally "isolated"
from external system events.

>> In theory we could drop the interrupt check altogether in icount mode
>> because we should always end and exit to the outer loop when a timer
>> is going to expire.
>
> But we know nothing about synchronous exceptions or anything else.

Hmm I didn't think we needed to care about synchronous events but now
you have me wondering what happens in icount mode when an exception
happens mid-block?

>> I wonder what would happen if we checked u16.high in icount mode? No
>> timer should ever set it - although I guess it could get set during an
>> IO operation.
>
> Uh, we do, via u32? I'm not sure what you're saying here.

I mean should we detect if something has called cpu_exit() mid block
rather than just icount expiring.

<snip>

>> Are there cases of setting cpu->cflags_next_tb which we are happy to
>> get preempted by asynchronous events?
>
> Well, icount.
>
>> I guess in the SMC case it wouldn't
>> matter because when we get back from the IRQ things get reset?
>
> SMC probably would work with an interrupt, but we'd wind up having to
> repeat the process of flushing all TBs on the page, so we might as
> well perform the one store and get it over with.

I guess. Makes the patch a bit noisier though...
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 6bb2a0f7ec..35d8e93976 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -503,6 +503,7 @@ struct TranslationBlock {
 #define CF_USE_ICOUNT    0x00020000
 #define CF_INVALID       0x00040000 /* TB is stale. Set with @jmp_lock held */
 #define CF_PARALLEL      0x00080000 /* Generate code for a parallel context */
+#define CF_NOIRQ         0x00100000 /* Generate an uninterruptible TB */
 #define CF_CLUSTER_MASK  0xff000000 /* Top 8 bits are cluster ID */
 #define CF_CLUSTER_SHIFT 24
 
diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
index 610cba58fe..c57204ddad 100644
--- a/include/exec/gen-icount.h
+++ b/include/exec/gen-icount.h
@@ -21,7 +21,6 @@ static inline void gen_tb_start(const TranslationBlock *tb)
 {
     TCGv_i32 count;
 
-    tcg_ctx->exitreq_label = gen_new_label();
     if (tb_cflags(tb) & CF_USE_ICOUNT) {
         count = tcg_temp_local_new_i32();
     } else {
@@ -42,7 +41,19 @@ static inline void gen_tb_start(const TranslationBlock *tb)
         icount_start_insn = tcg_last_op();
     }
 
-    tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, tcg_ctx->exitreq_label);
+    /*
+     * Emit the check against icount_decr.u32 to see if we should exit
+     * unless we suppress the check with CF_NOIRQ. If we are using
+     * icount and have suppressed interruption the higher level code
+     * should have ensured we don't run more instructions than the
+     * budget.
+     */
+    if (tb_cflags(tb) & CF_NOIRQ) {
+        tcg_ctx->exitreq_label = NULL;
+    } else {
+        tcg_ctx->exitreq_label = gen_new_label();
+        tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, tcg_ctx->exitreq_label);
+    }
 
     if (tb_cflags(tb) & CF_USE_ICOUNT) {
         tcg_gen_st16_i32(count, cpu_env,
@@ -74,8 +85,10 @@ static inline void gen_tb_end(const TranslationBlock *tb, int num_insns)
                           tcgv_i32_arg(tcg_constant_i32(num_insns)));
     }
 
-    gen_set_label(tcg_ctx->exitreq_label);
-    tcg_gen_exit_tb(tb, TB_EXIT_REQUESTED);
+    if (tcg_ctx->exitreq_label) {
+        gen_set_label(tcg_ctx->exitreq_label);
+        tcg_gen_exit_tb(tb, TB_EXIT_REQUESTED);
+    }
 }
 
 #endif
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 9cb892e326..9e3ed42ceb 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -721,6 +721,15 @@ static inline bool need_replay_interrupt(int interrupt_request)
 static inline bool cpu_handle_interrupt(CPUState *cpu,
                                         TranslationBlock **last_tb)
 {
+    /*
+     * If we have special cflags lets not get distracted with IRQs. We
+     * shall exit the loop as soon as the next TB completes what it
+     * needs to do.
+     */
+    if (cpu->cflags_next_tb != -1) {
+        return false;
+    }
+
     /* Clear the interrupt flag now since we're processing
      * cpu->interrupt_request and cpu->exit_request.
      * Ensure zeroing happens before reading cpu->exit_request or
@@ -954,11 +963,16 @@ int cpu_exec(CPUState *cpu)
          * after-access watchpoints. Since this request should never
          * have CF_INVALID set, -1 is a convenient invalid value that
          * does not require tcg headers for cpu_common_reset.
+         *
+         * As we don't want this special TB being interrupted by
+         * some sort of asynchronous event we apply CF_NOIRQ to
+         * disable the usual event checking.
          */
         cflags = cpu->cflags_next_tb;
         if (cflags == -1) {
             cflags = curr_cflags(cpu);
         } else {
+            cflags |= CF_NOIRQ;
             cpu->cflags_next_tb = -1;
         }
Generally when we set cpu->cflags_next_tb it is because we want to
carefully control the execution of the next TB. Currently there is a
race that causes cflags_next_tb to get ignored if an IRQ is processed
before we execute any actual instructions.

To avoid this we introduce a new compiler flag: CF_NOIRQ to suppress
this check in the generated code so we know we will definitely execute
the next block.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru>
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/245
---
 include/exec/exec-all.h   |  1 +
 include/exec/gen-icount.h | 21 +++++++++++++++++----
 accel/tcg/cpu-exec.c      | 14 ++++++++++++++
 3 files changed, 32 insertions(+), 4 deletions(-)