Message ID | 20220816122621.2066292-2-alex.bennee@linaro.org |
---|---|
State | Accepted |
Commit | 52f0c1607671293afcdb2acc2f83e9bccbfa74bb |
Headers | show |
Series | [PULL,1/3] linux-user: un-parent OBJECT(cpu) when closing thread | expand |
On 8/16/22 05:26, Alex Bennée wrote: > While forcing the CPU to unrealize by hand does trigger the clean-up > code we never fully free resources because refcount never reaches > zero. This is because QOM automatically added objects without an > explicit parent to /unattached/, incrementing the refcount. > > Instead of manually triggering unrealization just unparent the object > and let the device machinery deal with that for us. > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/866 > Signed-off-by: Alex Bennée <alex.bennee@linaro.org> > Reviewed-by: Laurent Vivier <laurent@vivier.eu> > Message-Id: <20220811151413.3350684-2-alex.bennee@linaro.org> > > diff --git a/linux-user/syscall.c b/linux-user/syscall.c > index f409121202..bfdd60136b 100644 > --- a/linux-user/syscall.c > +++ b/linux-user/syscall.c > @@ -8594,7 +8594,13 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1, > if (CPU_NEXT(first_cpu)) { > TaskState *ts = cpu->opaque; > > - object_property_set_bool(OBJECT(cpu), "realized", false, NULL); > + if (ts->child_tidptr) { > + put_user_u32(0, ts->child_tidptr); > + do_sys_futex(g2h(cpu, ts->child_tidptr), > + FUTEX_WAKE, INT_MAX, NULL, NULL, 0); > + } > + > + object_unparent(OBJECT(cpu)); This has caused a regression in arm/aarch64. We hard-code ARMCPRegInfo pointers into TranslationBlocks, for calling into helper_{get,set}cp_reg{,64}. So we have a race condition between whichever cpu thread translates the code first (encoding the pointer), and that cpu thread exiting, so that the next execution of the TB references a freed data structure. We shouldn't have N copies of these pointers in the first place. This seems like something that ought to be placed on the ARMCPUClass, so that it could be shared by each cpu. But that's going to be a complex fix, so I'm reverting this for rc4. r~
Richard Henderson <richard.henderson@linaro.org> writes: > On 8/16/22 05:26, Alex Bennée wrote: >> While forcing the CPU to unrealize by hand does trigger the clean-up >> code we never fully free resources because refcount never reaches >> zero. This is because QOM automatically added objects without an >> explicit parent to /unattached/, incrementing the refcount. >> Instead of manually triggering unrealization just unparent the >> object >> and let the device machinery deal with that for us. >> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/866 >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> >> Reviewed-by: Laurent Vivier <laurent@vivier.eu> >> Message-Id: <20220811151413.3350684-2-alex.bennee@linaro.org> >> diff --git a/linux-user/syscall.c b/linux-user/syscall.c >> index f409121202..bfdd60136b 100644 >> --- a/linux-user/syscall.c >> +++ b/linux-user/syscall.c >> @@ -8594,7 +8594,13 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1, >> if (CPU_NEXT(first_cpu)) { >> TaskState *ts = cpu->opaque; >> - object_property_set_bool(OBJECT(cpu), "realized", >> false, NULL); >> + if (ts->child_tidptr) { >> + put_user_u32(0, ts->child_tidptr); >> + do_sys_futex(g2h(cpu, ts->child_tidptr), >> + FUTEX_WAKE, INT_MAX, NULL, NULL, 0); >> + } >> + >> + object_unparent(OBJECT(cpu)); > > This has caused a regression in arm/aarch64. > > We hard-code ARMCPRegInfo pointers into TranslationBlocks, for calling > into helper_{get,set}cp_reg{,64}. So we have a race condition between > whichever cpu thread translates the code first (encoding the pointer), > and that cpu thread exiting, so that the next execution of the TB > references a freed data structure. What is the test case that breaks this? I guess a multi-threaded sysregs.c would trigger it? > We shouldn't have N copies of these pointers in the first place. This > seems like something that ought to be placed on the ARMCPUClass, so > that it could be shared by each cpu. > > But that's going to be a complex fix, so I'm reverting this for rc4. > > > r~
On 8/19/22 01:37, Alex Bennée wrote: >> This has caused a regression in arm/aarch64. >> >> We hard-code ARMCPRegInfo pointers into TranslationBlocks, for calling >> into helper_{get,set}cp_reg{,64}. So we have a race condition between >> whichever cpu thread translates the code first (encoding the pointer), >> and that cpu thread exiting, so that the next execution of the TB >> references a freed data structure. > > What is the test case that breaks this? I guess a multi-threaded > sysregs.c would trigger it? E.g. tests/tcg/aarch64-linux-user/signals. r~
diff --git a/linux-user/syscall.c b/linux-user/syscall.c index f409121202..bfdd60136b 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -8594,7 +8594,13 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1, if (CPU_NEXT(first_cpu)) { TaskState *ts = cpu->opaque; - object_property_set_bool(OBJECT(cpu), "realized", false, NULL); + if (ts->child_tidptr) { + put_user_u32(0, ts->child_tidptr); + do_sys_futex(g2h(cpu, ts->child_tidptr), + FUTEX_WAKE, INT_MAX, NULL, NULL, 0); + } + + object_unparent(OBJECT(cpu)); object_unref(OBJECT(cpu)); /* * At this point the CPU should be unrealized and removed @@ -8604,11 +8610,6 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1, pthread_mutex_unlock(&clone_lock); - if (ts->child_tidptr) { - put_user_u32(0, ts->child_tidptr); - do_sys_futex(g2h(cpu, ts->child_tidptr), - FUTEX_WAKE, INT_MAX, NULL, NULL, 0); - } thread_cpu = NULL; g_free(ts); rcu_unregister_thread();