Message ID | 20230826232415.80233-3-richard.henderson@linaro.org |
---|---|
State | New |
Series | softmmu: Use async_run_on_cpu in tcg_commit |
Richard Henderson <richard.henderson@linaro.org> writes:

> After system startup, run the update to memory_dispatch
> and the tlb_flush on the cpu. This eliminates a race,
> wherein a running cpu sees the memory_dispatch change
> but has not yet seen the tlb_flush.
>
> Since the update now happens on the cpu, we need not use
> qatomic_rcu_read to protect the read of memory_dispatch.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1826
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1834
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1846
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
On 8/26/23 16:24, Richard Henderson wrote:
> +static void tcg_commit_cpu(CPUState *cpu, run_on_cpu_data data)
> +{
> +    CPUAddressSpace *cpuas = data.host_ptr;
> +
> +    cpuas->memory_dispatch = address_space_to_dispatch(cpuas->as);
> +    tlb_flush(cpu);
> +}

Question: do I need to take the iothread lock here, while
re-generating the address space dispatch?

r~
Richard Henderson <richard.henderson@linaro.org> writes:

> On 8/26/23 16:24, Richard Henderson wrote:
>> +static void tcg_commit_cpu(CPUState *cpu, run_on_cpu_data data)
>> +{
>> +    CPUAddressSpace *cpuas = data.host_ptr;
>> +
>> +    cpuas->memory_dispatch = address_space_to_dispatch(cpuas->as);
>> +    tlb_flush(cpu);
>> +}
>
> Question: do I need to take the iothread lock here, while
> re-generating the address space dispatch?

Does it regenerate or just collect a current live version under RCU?
On 8/27/23 13:17, Alex Bennée wrote:
>
> Richard Henderson <richard.henderson@linaro.org> writes:
>
>> On 8/26/23 16:24, Richard Henderson wrote:
>>> +static void tcg_commit_cpu(CPUState *cpu, run_on_cpu_data data)
>>> +{
>>> +    CPUAddressSpace *cpuas = data.host_ptr;
>>> +
>>> +    cpuas->memory_dispatch = address_space_to_dispatch(cpuas->as);
>>> +    tlb_flush(cpu);
>>> +}
>>
>> Question: do I need to take the iothread lock here, while
>> re-generating the address space dispatch?
>
> Does it regenerate or just collect a current live version under RCU?

Quite right, just reads.

r~
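The distinction drawn in this exchange, between rebuilding a table and merely picking up the pointer to the one already published, can be pictured with a small C11 sketch. It is a toy under stated assumptions and not QEMU code: `Dispatch`, `current_dispatch`, `publish_dispatch` and `read_dispatch` are hypothetical names, and plain C11 acquire/release atomics stand in for whatever primitives the real code uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical toy type; not QEMU's AddressSpaceDispatch. */
typedef struct Dispatch {
    int generation;
} Dispatch;

/* The currently published table; starts out NULL. */
static _Atomic(Dispatch *) current_dispatch;

/* Writer: the expensive part, building a table and then publishing it. */
static void publish_dispatch(int generation)
{
    Dispatch *d = malloc(sizeof(*d));

    assert(d != NULL);
    d->generation = generation;
    /* Release: the table contents become visible before the pointer does. */
    atomic_store_explicit(&current_dispatch, d, memory_order_release);
    /* Old tables are deliberately leaked; real RCU would reclaim them later. */
}

/* Reader: builds nothing, only fetches the pointer that is current now. */
static Dispatch *read_dispatch(void)
{
    return atomic_load_explicit(&current_dispatch, memory_order_acquire);
}
```

In this model two consecutive calls to `read_dispatch()` with no intervening `publish_dispatch()` return the same pointer, which is the sense in which a reader "just reads".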
On Sat, 26 Aug 2023 16:24:14 -0700
Richard Henderson <richard.henderson@linaro.org> wrote:

> After system startup, run the update to memory_dispatch
> and the tlb_flush on the cpu. This eliminates a race,
> wherein a running cpu sees the memory_dispatch change
> but has not yet seen the tlb_flush.
>
> Since the update now happens on the cpu, we need not use
> qatomic_rcu_read to protect the read of memory_dispatch.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1826
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1834
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1846
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

I'm not pretending I've understood the change though, just that it makes
the crashes I saw go away.

Jonathan
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 7597dc1c39..18277ddd67 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -680,8 +680,7 @@ address_space_translate_for_iotlb(CPUState *cpu, int asidx, hwaddr orig_addr,
     IOMMUTLBEntry iotlb;
     int iommu_idx;
     hwaddr addr = orig_addr;
-    AddressSpaceDispatch *d =
-        qatomic_rcu_read(&cpu->cpu_ases[asidx].memory_dispatch);
+    AddressSpaceDispatch *d = cpu->cpu_ases[asidx].memory_dispatch;
 
     for (;;) {
         section = address_space_translate_internal(d, addr, &addr, plen, false);
@@ -2412,7 +2411,7 @@ MemoryRegionSection *iotlb_to_section(CPUState *cpu,
 {
     int asidx = cpu_asidx_from_attrs(cpu, attrs);
     CPUAddressSpace *cpuas = &cpu->cpu_ases[asidx];
-    AddressSpaceDispatch *d = qatomic_rcu_read(&cpuas->memory_dispatch);
+    AddressSpaceDispatch *d = cpuas->memory_dispatch;
     int section_index = index & ~TARGET_PAGE_MASK;
     MemoryRegionSection *ret;
 
@@ -2487,23 +2486,42 @@ static void tcg_log_global_after_sync(MemoryListener *listener)
     }
 }
 
+static void tcg_commit_cpu(CPUState *cpu, run_on_cpu_data data)
+{
+    CPUAddressSpace *cpuas = data.host_ptr;
+
+    cpuas->memory_dispatch = address_space_to_dispatch(cpuas->as);
+    tlb_flush(cpu);
+}
+
 static void tcg_commit(MemoryListener *listener)
 {
     CPUAddressSpace *cpuas;
-    AddressSpaceDispatch *d;
+    CPUState *cpu;
 
     assert(tcg_enabled());
     /* since each CPU stores ram addresses in its TLB cache, we must
        reset the modified entries */
     cpuas = container_of(listener, CPUAddressSpace, tcg_as_listener);
-    cpu_reloading_memory_map();
-    /* The CPU and TLB are protected by the iothread lock.
-     * We reload the dispatch pointer now because cpu_reloading_memory_map()
-     * may have split the RCU critical section.
+    cpu = cpuas->cpu;
+
+    /*
+     * Defer changes to as->memory_dispatch until the cpu is quiescent.
+     * Otherwise we race between (1) other cpu threads and (2) ongoing
+     * i/o for the current cpu thread, with data cached by mmu_lookup().
+     *
+     * In addition, queueing the work function will kick the cpu back to
+     * the main loop, which will end the RCU critical section and reclaim
+     * the memory data structures.
+     *
+     * That said, the listener is also called during realize, before
+     * all of the tcg machinery for run-on is initialized: thus halt_cond.
      */
-    d = address_space_to_dispatch(cpuas->as);
-    qatomic_rcu_set(&cpuas->memory_dispatch, d);
-    tlb_flush(cpuas->cpu);
+    if (cpu->halt_cond) {
+        async_run_on_cpu(cpu, tcg_commit_cpu, RUN_ON_CPU_HOST_PTR(cpuas));
+    } else {
+        tcg_commit_cpu(cpu, RUN_ON_CPU_HOST_PTR(cpuas));
+    }
 }
 
 static void memory_map_init(void)
After system startup, run the update to memory_dispatch
and the tlb_flush on the cpu. This eliminates a race,
wherein a running cpu sees the memory_dispatch change
but has not yet seen the tlb_flush.

Since the update now happens on the cpu, we need not use
qatomic_rcu_read to protect the read of memory_dispatch.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1826
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1834
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1846
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 softmmu/physmem.c | 40 +++++++++++++++++++++++++++++-----------
 1 file changed, 29 insertions(+), 11 deletions(-)
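
The pattern the commit message describes (apply the pointer update and the flush together, on the thread that owns the state, and fall back to a direct call before that thread exists) can be sketched in a self-contained pthread toy. Nothing here is QEMU code. `ToyCpu`, `toy_commit`, `toy_commit_cpu`, `toy_cpu_thread` and `toy_demo` are hypothetical names, the `started` flag stands in for the `cpu->halt_cond` test, the small fixed-size queue stands in for the work list behind `async_run_on_cpu`, and an integer generation stands in for the dispatch pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define QUEUE_MAX 8

/* Hypothetical toy type: none of this is QEMU code. */
typedef struct ToyCpu {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool started;             /* stand-in for the cpu->halt_cond test */
    bool stop;                /* guarded by lock */
    int queue[QUEUE_MAX];     /* pending generations, guarded by lock */
    int pending;              /* guarded by lock */
    int dispatch_gen;         /* once started, touched only by the cpu thread */
    int tlb_flushes;          /* likewise */
} ToyCpu;

/* The update and the flush always happen together, on one thread. */
static void toy_commit_cpu(ToyCpu *cpu, int gen)
{
    cpu->dispatch_gen = gen;
    cpu->tlb_flushes++;
}

/* The "cpu" thread: drain queued work, exit once asked to stop and idle. */
static void *toy_cpu_thread(void *opaque)
{
    ToyCpu *cpu = opaque;

    pthread_mutex_lock(&cpu->lock);
    for (;;) {
        while (cpu->pending == 0 && !cpu->stop) {
            pthread_cond_wait(&cpu->cond, &cpu->lock);
        }
        if (cpu->pending == 0) {
            break;                      /* stop requested, queue drained */
        }
        int gen = cpu->queue[0];        /* FIFO pop */
        cpu->pending--;
        for (int i = 0; i < cpu->pending; i++) {
            cpu->queue[i] = cpu->queue[i + 1];
        }
        pthread_mutex_unlock(&cpu->lock);
        toy_commit_cpu(cpu, gen);       /* runs outside the queue lock */
        pthread_mutex_lock(&cpu->lock);
    }
    pthread_mutex_unlock(&cpu->lock);
    return NULL;
}

/* Listener side: run inline before the thread exists, otherwise defer. */
static bool toy_commit(ToyCpu *cpu, int gen)
{
    bool ok = true;

    if (!cpu->started) {
        toy_commit_cpu(cpu, gen);
        return ok;
    }
    pthread_mutex_lock(&cpu->lock);
    if (cpu->pending < QUEUE_MAX) {
        cpu->queue[cpu->pending++] = gen;
        pthread_cond_signal(&cpu->cond);
    } else {
        ok = false;                     /* toy queue is full */
    }
    pthread_mutex_unlock(&cpu->lock);
    return ok;
}

/* Drive one early commit and one deferred commit; return the flush count. */
static int toy_demo(int *final_gen)
{
    ToyCpu cpu = { .started = false };
    int rc;

    pthread_mutex_init(&cpu.lock, NULL);
    pthread_cond_init(&cpu.cond, NULL);

    toy_commit(&cpu, 1);                /* "realize": applied immediately */
    assert(cpu.tlb_flushes == 1);

    cpu.started = true;
    rc = pthread_create(&cpu.thread, NULL, toy_cpu_thread, &cpu);
    assert(rc == 0);
    toy_commit(&cpu, 2);                /* running: queued for the cpu thread */

    pthread_mutex_lock(&cpu.lock);
    cpu.stop = true;
    pthread_cond_signal(&cpu.cond);
    pthread_mutex_unlock(&cpu.lock);
    pthread_join(cpu.thread, NULL);

    *final_gen = cpu.dispatch_gen;
    return cpu.tlb_flushes;
}
```

Because only the cpu thread writes `dispatch_gen` and `tlb_flushes` once it is running, code on that thread can never observe a new generation without its matching flush, and reads of `dispatch_gen` on that thread need no atomic access. That is the toy analogue of dropping `qatomic_rcu_read` in the patch.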